Watson Developer Cloud

Overview of the Visual Recognition service

The IBM Watson™ Visual Recognition service uses deep learning algorithms to analyze images (.jpg or .png) for scenes, objects, faces, and other content, and to return keywords that provide information about that content. You can also create custom collections of your own images and then upload an image to search the collection for similar images.

Flow of the service

The Visual Recognition service comes with a set of built-in classes, so you can analyze images with high accuracy right out of the box. You can also train custom classifiers to create specialized classes, and create custom collections to search for similar images.
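As a sketch of that flow, classifying an image with the built-in classes is a single REST call. The following Python snippet builds (but does not send) a GET /v3/classify request; the endpoint host, API key, and version date shown are illustrative placeholders, so substitute the values for your own service instance:

```python
from urllib.parse import urlencode

# NOTE: the endpoint host and version date below are assumptions for
# illustration; check your service credentials and the API reference
# for the actual values.
API_URL = "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify"

def build_classify_url(api_key, image_url, version="2016-05-20"):
    """Build the URL for classifying a publicly hosted image with built-in classes."""
    query = urlencode({"api_key": api_key, "url": image_url, "version": version})
    return f"{API_URL}?{query}"

url = build_classify_url("YOUR_API_KEY", "https://example.com/fruit.jpg")
# To send the request and read the JSON response:
#   import urllib.request, json
#   body = json.load(urllib.request.urlopen(url))
```

The same endpoint accepts a POST with an image file when the image is not hosted at a public URL.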

Use cases

The Visual Recognition service can be used for diverse applications and industries, such as:

  • Manufacturing: Use images from a manufacturing setting to make sure products are being positioned correctly on an assembly line

  • Visual auditing: Look for visual compliance or deterioration in a fleet of trucks, planes, or windmills out in the field; train custom classifiers to understand what defects look like

  • Insurance: Rapidly process claims by using images to classify them into different categories

  • Social listening: Use images from your product line or your logo to track buzz about your company on social media

  • Social commerce: Use an image of a plated dish to find out which restaurant serves it and read reviews; use a travel photo to find vacation suggestions based on similar experiences; use a house image to find similar homes that are for sale

  • Retail: Take a photo of a favorite outfit to find stores with those clothes in stock or on sale; use a travel image to find retail suggestions in that area

  • Education: Create image-based applications to teach taxonomies; use pictures to find educational material on similar subjects

To see the Visual Recognition service in action, try the Visual Recognition demo app. With the demo, you can analyze images for subject matter and faces, and train a custom classifier.

Supported languages

The Visual Recognition GET /v3/classify and POST /v3/classify methods support English (en), Spanish (es), and Arabic (ar) for default classes. Custom classifiers returned by the /v3/classify methods support English only. Collections methods are language agnostic.

All other methods support English only.
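One common REST convention for selecting the response language is an Accept-Language request header; the sketch below assumes that mechanism, so confirm the exact parameter in the API reference. The endpoint host, version date, and credentials are placeholders:

```python
import urllib.request
from urllib.parse import urlencode

# Endpoint host, version date, and API key are placeholders for illustration.
API_URL = "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify"

def classify_request(api_key, image_url, language="es", version="2016-05-20"):
    """Prepare a classify request asking for default-class names in `language`."""
    query = urlencode({"api_key": api_key, "url": image_url, "version": version})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        # en, es, or ar are the supported values for default classes
        headers={"Accept-Language": language},
    )

req = classify_request("YOUR_API_KEY", "https://example.com/fruta.jpg", language="es")
# Send with: urllib.request.urlopen(req)
```

With language="es", the default-class keywords in the response would come back in Spanish; omitting the header falls back to English.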

SDK Overview

The following SDKs are available for the Watson Developer Cloud services, including the Visual Recognition API:

  • SDK for Java: Using this beta SDK only requires that you add the appropriate commands to integrate the Java SDK into your build process, as explained in the README file for the Java SDK, and that you understand the simplified functions that it provides.

  • SDK for Node.js: Using this SDK only requires that you execute the npm install watson-developer-cloud command to install the SDK locally, that your code includes a var watson = require('watson-developer-cloud'); statement, and that you understand the simplified functions that the SDK provides. See the README file for the Node SDK for more information.

  • SDK for Python: Using this beta SDK only requires that you execute the appropriate pip or easy_install commands to install the Python SDK on your system and that you understand the simplified functions that it provides. See the README for the Python SDK for more information.

  • SDK for iOS (Swift): This beta SDK has third-party dependencies, such as ObjectMapper and Alamofire, which can be satisfied with the Carthage dependency management tool. Use the Carthage installer or install Carthage with the Homebrew package manager.

For documentation that provides sample code for the Node.js and Java SDKs, along with the equivalent REST API calls, see the API reference.

Additional resources

See the following for more information about the Visual Recognition service:

Your questions and feedback

Find answers to your questions about Watson in our developer communities: