What's New

Five new services expand IBM Watson capabilities to images, speech, and more

Republished from the Watson Blog

Since its launch, you’ve made the IBM Watson Developer Cloud one of IBM’s most vibrant and innovative communities on Bluemix. Today more than 5,000 partners, developers, data hobbyists, entrepreneurs, students and others have contributed to building 6,000+ apps infused with Watson’s cognitive computing capabilities.

A few months ago we released eight beta Watson services so that this community can test drive them, think of new ways to apply and tap into Watson’s capabilities, and harden each service as we prepare them for general availability. The services—which range from Language Identification and Machine Translation to Visualization Rendering and User Modeling—are being embedded into a new class of cognitive apps.

One example is Red Ant’s Sell Smart mobile app, a retail sales trainer that lets employees easily identify unique customer buying preferences by analyzing demographics, purchase history, wish lists, pricing and other product information. Another is eyeQ’s eyeQinsights, which helps retailers understand how consumers make purchasing decisions while standing in the store.

Today, we are excited to announce the arrival of five additional beta services in the Watson Developer Cloud. The following free beta services are available now on Bluemix:

  • Speech to Text
  • Text to Speech
  • Visual Recognition
  • Concept Insights
  • Tradeoff Analytics

We’ve included an overview of each service below. Our team will continue to add more services in the Watson Developer Cloud as they become available. Stay tuned.

New services

Speech to Text

Speech to Text is a cloud-based, real-time service that uses low-latency speech recognition to convert speech into text for voice-controlled mobile applications, transcription services, and more. Transcriptions are continuously sent back to the client and retroactively corrected as more speech is heard, helping the system learn.

The service is based on more than 50 years of speech research at IBM. It uses state-of-the-art algorithms based on convolutional neural networks, a form of “deep learning”. Using these algorithms, the Watson team has published the best accuracy results on the popular Switchboard Hub5-2000 benchmark (a 10.4% word error rate versus 12.5% for the next-best system as of today), and provided technology that has been deployed on more than 500 million smartphones. This is the first time in 10 years that the IBM team is delivering speech technology broadly to developers. While the base algorithms are solid, the service will keep getting better as it gets more usage and training data.

Use Cases:

  • Enable voice control over apps, embedded devices or accessories
  • Provide real-time transcription of meetings and conference calls
  • Critical building block for “Speech-to-Speech” translation
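To make the “retroactively corrected” behavior concrete, here is a minimal sketch of how a client might replay a stream of interim results, keeping only the latest hypothesis for each utterance. The JSON field names (`results`, `result_index`, `final`, `alternatives`, `transcript`) are assumptions for illustration, not the service’s documented schema.

```python
import json

# Illustrative stream: the service first sends an interim hypothesis,
# then a corrected, finalized version of the same utterance.
messages = [
    '{"results": [{"final": false, "alternatives": [{"transcript": "hello wor"}]}], "result_index": 0}',
    '{"results": [{"final": true, "alternatives": [{"transcript": "hello world"}]}], "result_index": 0}',
]

def latest_transcripts(raw_messages):
    """Replay interim messages, keeping the most recent hypothesis per result index."""
    hypotheses = {}
    for raw in raw_messages:
        msg = json.loads(raw)
        for offset, result in enumerate(msg["results"]):
            idx = msg["result_index"] + offset
            # A later message for the same index overwrites the earlier guess.
            hypotheses[idx] = result["alternatives"][0]["transcript"]
    return [hypotheses[i] for i in sorted(hypotheses)]

print(latest_transcripts(messages))  # ['hello world']
```

The point is that a streaming client should treat each transcript as provisional until the service marks it final.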

Text to Speech

Text to Speech converts textual input into speech, and provides the option of three voices in English or Spanish, including the American English voice used by Watson in the 2011 Jeopardy! match. Text to Speech generates synthesized audio output complete with appropriate cadence and intonation. The user can input any English or Spanish text to generate speech output, with potential applications for the vision-impaired, reading-based education tools, and mobile apps.

Use Cases:

  • Assistance for the vision-impaired, reading and language education
  • Enable the audio reading of texts and emails to drivers
  • Critical building block for “Speech-to-Speech” translation
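As a sketch of what a synthesis request might look like, the snippet below builds (but does not send) a request URL carrying the text, a voice selection, and the desired audio format. The base URL, parameter names, and voice identifier are hypothetical; the real service’s endpoint, voices, and auth scheme should be taken from its documentation.

```python
from urllib.parse import urlencode

# Hypothetical endpoint -- stand-in for the service's real synthesize URL.
BASE_URL = "https://example.com/text-to-speech/api/v1/synthesize"

def synthesize_url(text, voice="en-US_default", accept="audio/wav"):
    """Build the GET URL for a text-to-speech request (nothing is sent here)."""
    params = urlencode({"text": text, "voice": voice, "accept": accept})
    return BASE_URL + "?" + params

url = synthesize_url("Hello, driver: you have two new emails.")
print(url)
```

A real client would fetch this URL with HTTP basic credentials and stream the returned audio to a player.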

Visual Recognition

Visual Recognition analyzes the visual appearance of images or video frames to understand what is happening in a scene.

The Visual Recognition service includes an unmatched number of preset classifiers and trained labels (2,000+), a taxonomy that recognizes 150+ different sports, and can ingest 1,000+ batch images with the ability to recognize multiple labels in a picture. Like the Speech to Text service, Visual Recognition relies on deep learning. Convolutional neural networks are used as semantic classifiers that recognize many visual entities such as settings, objects, and events. Input JPEG images into the service and you will receive a set of labels and probability scores such as “soccer, 0.7” or “baseball, 0.3”.

Use Cases:

  • Organize and ingest large collections of digital images
  • Build semantic association between images from multiple users
  • Understand consumer shopping preferences based on image queries
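Since each image can carry multiple labels with probability scores, a client typically filters the response by a confidence threshold. The snippet below sketches that step over a made-up response; the field names (`images`, `labels`, `label_name`, `label_score`) are illustrative, not the service’s actual schema, though the label/score pairs mirror the “soccer, 0.7” example above.

```python
# Made-up response in the spirit of the labels-and-scores output described above.
response = {
    "images": [
        {"labels": [
            {"label_name": "soccer",   "label_score": 0.7},
            {"label_name": "baseball", "label_score": 0.3},
            {"label_name": "stadium",  "label_score": 0.65},
        ]}
    ]
}

def confident_labels(resp, threshold=0.5):
    """Collect labels whose probability score clears a confidence threshold."""
    out = []
    for image in resp["images"]:
        for lab in image["labels"]:
            if lab["label_score"] >= threshold:
                out.append(lab["label_name"])
    return out

print(confident_labels(response))  # ['soccer', 'stadium']
```

The right threshold depends on the application: cataloging can tolerate loose matches, while automated decisions usually need a higher bar.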

Concept Insights

Concept Insights handles text conceptually, delivering a search capability that surfaces insights a traditional keyword search would miss.

Concept Insights links user-provided documents with a pre-existing graph of concepts based on Wikipedia (e.g. ‘The New York Times’, ‘Machine learning’, etc.). Two types of links are identified: explicit links when a document directly mentions a concept, and implicit links which connect the user’s documents to relevant concepts that are not directly mentioned. Users of this service can also search for documents that are relevant to a concept or collection of concepts by exploring the explicit and implicit links.

Use Cases:

  • Improve search queries with results that are more conceptually related
  • Locate sources of expertise across large or complex organizations
  • Deepen customer engagement by returning more relevant information
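The explicit/implicit distinction can be pictured as a set of scored links between documents and Wikipedia concepts. The sketch below searches such links for documents relevant to a concept even when the concept is never mentioned directly; the record fields (`doc`, `concept`, `explicit`, `score`) are hypothetical names for illustration, not the service’s schema.

```python
# Made-up concept links: "explicit" means the document mentions the
# concept directly; implicit links are inferred associations.
links = [
    {"doc": "report-1", "concept": "Machine learning",          "explicit": True,  "score": 0.92},
    {"doc": "report-1", "concept": "Artificial intelligence",   "explicit": False, "score": 0.74},
    {"doc": "report-2", "concept": "The New York Times",        "explicit": True,  "score": 0.88},
]

def docs_for_concept(all_links, concept, min_score=0.5):
    """Return documents linked to a concept, whether explicitly or implicitly."""
    return sorted({link["doc"] for link in all_links
                   if link["concept"] == concept and link["score"] >= min_score})

# "report-1" never mentions "Artificial intelligence" directly, yet the
# implicit link still makes it searchable under that concept.
print(docs_for_concept(links, "Artificial intelligence"))  # ['report-1']
```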

Tradeoff Analytics

Tradeoff Analytics enables dynamic real-time ‘tradeoff’ decisions across static or changing parameters, all delivered in an interactive visual display. Tradeoff Analytics enables better decision-making by dynamically weighing multiple, often conflicting, goals. This service uses Pareto filtering techniques to identify the optimal alternatives across multiple criteria. It then uses various analytical and visual approaches to help the decision maker explore tradeoffs and alternatives.

Tradeoff Analytics can be used to help make complex decisions such as which mortgage to take, which treatment option to follow, or which car to purchase.

Use Cases:

  • Enable retailers and manufacturers to determine product mix
  • Allow consumers to compare and contrast competitive products or services
  • Help physicians select optimal treatment options based on multiple criteria
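Pareto filtering, the technique named above, is simple to state: discard any option that another option beats on every criterion. Here is a minimal, self-contained sketch over a made-up car-purchase example where lower price and lower fuel consumption are both better; the data and criteria are illustrative, not from the service.

```python
# Each option maps to (price in USD, fuel use in litres per 100 km);
# lower is better on both criteria. The figures are invented for illustration.
options = {
    "hatchback": (18000, 6.1),
    "sedan":     (22000, 5.5),
    "hybrid":    (26000, 4.2),
    "old_suv":   (27000, 9.5),  # worse than "hatchback" on both criteria
}

def dominates(a, b):
    """a dominates b if a is no worse on every criterion and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(opts):
    """Keep only options that no other option dominates."""
    return sorted(name for name, vals in opts.items()
                  if not any(dominates(other, vals)
                             for o, other in opts.items() if o != name))

print(pareto_front(options))  # ['hatchback', 'hybrid', 'sedan']
```

The surviving options are the genuine tradeoffs (cheaper versus more fuel-efficient); dominated options like the SUV can be dropped before any visual exploration of alternatives.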
