AI for the Enterprise

The power of Visual Recognition: How to use Watson to identify a hand of cards


Welcome to the fifth article in our blog series! If you are just joining us, we are a team of six interns working on utilizing Watson services to program two robots to play poker. Be sure to check out our introduction blog to get a better feel for what we’re up to.

In the previous article, we went over how we use Watson’s Natural Language Classifier API to distinguish between poker and casual intent in conversation. In this blog post, we will discuss how we recognize a hand of cards.

Recognizing Playing Cards

To get our robots playing a proper game of poker, they must understand the values of the cards they have in their hands. To do this, we use an image taken from cameras on the robot and split the problem into two parts – determining the suit of the card (spades, clubs, diamonds, hearts) and determining the rank of the card (Ace, 2, 3, etc.).

We took a multi-step approach because it cuts down on the number of comparisons we have to make: instead of 52 comparisons (one per card), we make at most 17 (4 for suit and 13 for rank). We use Watson's Visual Recognition to identify the suit and a hybrid solution to identify the rank. In this post, we will focus on how we recognize suits.
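As a quick sanity check on that arithmetic, a few lines of Python (the suit and rank lists below are just the standard deck, not code from our project):

```python
# Back-of-the-envelope comparison counts behind the two-stage approach.
SUITS = ["spades", "clubs", "diamonds", "hearts"]
RANKS = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King"]

# One classifier over every individual card:
flat_comparisons = len(SUITS) * len(RANKS)    # 4 * 13 = 52

# Two classifiers, one for suit and one for rank:
staged_comparisons = len(SUITS) + len(RANKS)  # 4 + 13 = 17

print(flat_comparisons, staged_comparisons)  # -> 52 17
```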

The Visual Recognition service


Using Watson’s Visual Recognition Service to classify suits

IBM Watson’s Visual Recognition is a service that allows users to understand the content of images and classify images into logical categories. In addition to classifying images, Visual Recognition also offers facial detection and text recognition services. We will focus on classifying images in this post. The Visual Recognition service does come with a pre-trained classifier for general classification, but since we want to classify our images specifically for suits, we will train a custom Visual Recognition classifier on the four suits.

In this post, I’ll show you how to train your own custom suits classifier using the Watson Developer Cloud Python SDK.

Getting Access

As in our previous posts, we will first have to create a Visual Recognition service in Bluemix to interact with.

Go ahead and create an instance of the Visual Recognition service from your Bluemix console and take note of the api_key in your service credentials. This step is the same as in our previous posts.

Training a Classifier

In our project, our current suits classifier uses over 300 images for each suit. Our training data only uses images taken from the camera on our NAO robots since we will only be using these photos during our game of poker. In terms of training data, we use images of each card in different angles and lighting. Over several iterations, we have improved upon the accuracy of our classifier by adding more images that the classifier struggles with, such as dimly lit photos and face cards.

To use the Visual Recognition service, you will need at least 10 images for each class you want inside your classifier, in JPG or PNG format. In our case, we want 4 zipped folders of images – one for each suit. In total, I will be using 40 images for this tutorial. There is also an option to pass in explicit negative training data, but when a classifier contains multiple classes, the service implicitly uses each class’s training data as negative examples for the other classes.
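If you want to build the zip files yourself rather than download ours, here is a minimal sketch using the standard library. The folder names are an assumption: each one is expected to hold your training images for that suit before zipping.

```python
import os
import shutil

# Zip one folder of training images per suit. The folders are assumed
# to already contain the JPG/PNG images for that suit.
for suit in ["clubs", "diamonds", "hearts", "spades"]:
    os.makedirs(suit, exist_ok=True)  # no-op if the folder already exists
    shutil.make_archive(suit, "zip", root_dir=suit)  # -> clubs.zip, etc.
```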

You can download the files I used below:
clubs.zip
diamonds.zip
hearts.zip
spades.zip
test1.jpg

All of my code and all of the images I used can be found in my GitHub repository here.

The code below creates a classifier for us to use. When passing the zip files into the service, the identifier key for each class takes the form classname_positive_examples. For example, for our spades class, the key would be spades_positive_examples. Note that the file uploads may take a bit of time.

import json
from os.path import join, dirname
from watson_developer_cloud import VisualRecognitionV3

visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key='{YOUR_API_KEY_HERE}')

with open(join(dirname(__file__), 'hearts.zip'), 'rb') as hearts, \
        open(join(dirname(__file__), 'diamonds.zip'), 'rb') as diamonds, \
        open(join(dirname(__file__), 'clubs.zip'), 'rb') as clubs, \
        open(join(dirname(__file__), 'spades.zip'), 'rb') as spades:
    print("Uploading files...")
    print(json.dumps(visual_recognition.create_classifier('suits_tutorial',
        hearts_positive_examples=hearts,
        diamonds_positive_examples=diamonds,
        clubs_positive_examples=clubs,
        spades_positive_examples=spades), indent=2))

If you run this code, your response should look something like this:

{
  "status": "training", 
  "name": "suits_tutorial",
  "created": "2016-07-18T19:27:22.429Z", 
  "classes": [
    {
      "class": "spades"
    }, 
    {
      "class": "hearts"
    }, 
    {
      "class": "diamonds"
    }, 
    {
      "class": "clubs"
    }
  ], 
  "owner": "{YOUR_OWNER_ID_HERE}", 
  "classifier_id": "{YOUR_CLASSIFIER_ID_HERE}"
}

Be sure to take note of your classifier_id, as we will use that ID to refer to our classifier from now on.

Your classifier will take a few minutes to finish training. Larger sets of training data will take more time to train. To check the status of your classifier, you can run the code below.

import json
from watson_developer_cloud import VisualRecognitionV3

visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key='{YOUR_API_KEY_HERE}')

print(json.dumps(visual_recognition.get_classifier('{YOUR_CLASSIFIER_ID_HERE}'), indent=2))

If the status is "ready", the classifier has finished training. Now we are ready to classify an image with our classifier. Run the code below to classify an image (test1.jpg).

import json
from os.path import join, dirname
from watson_developer_cloud import VisualRecognitionV3

visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key='{YOUR_API_KEY_HERE}')

with open(join(dirname(__file__), './test1.jpg'), 'rb') as image_file:
    print(json.dumps(visual_recognition.classify(images_file=image_file, threshold=0, classifier_ids=['{YOUR_CLASSIFIER_ID_HERE}']), indent=2))

Your result should look something like this:

{
  "images": [
    {
      "image": "./test1.jpg", 
      "classifiers": [
        {
          "classes": [
            {
              "score": 0.0713362, 
              "class": "clubs"
            }, 
            {
              "score": 0.0823247, 
              "class": "diamonds"
            }, 
            {
              "score": 0.0638997, 
              "class": "hearts"
            }, 
            {
              "score": 0.840459, 
              "class": "spades"
            }
          ], 
          "classifier_id": "{YOUR_SUITS_ID_HERE}", 
          "name": "suits_tutorial"
        }
      ]
    }
  ], 
  "custom_classes": 4, 
  "images_processed": 1
}

We can see that spades had the highest confidence of the four suits, with a score of 0.840459 out of 1. Now you can use the code you’ve written to train a custom classifier on any kind of image you want!
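If you want to act on that result programmatically, a small helper (not from our project, just an illustration) can pull the top-scoring class out of a response shaped like the one above; `response` here is a hand-copied excerpt of that JSON.

```python
# `response` mirrors the classify result shown above (hand-copied excerpt).
response = {
    "images": [{
        "classifiers": [{
            "classes": [
                {"score": 0.0713362, "class": "clubs"},
                {"score": 0.0823247, "class": "diamonds"},
                {"score": 0.0638997, "class": "hearts"},
                {"score": 0.840459, "class": "spades"},
            ],
        }]
    }]
}

def top_class(response):
    """Return the (class, score) pair with the highest confidence."""
    classes = response["images"][0]["classifiers"][0]["classes"]
    best = max(classes, key=lambda c: c["score"])
    return best["class"], best["score"]

print(top_class(response))  # -> ('spades', 0.840459)
```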

Take a look at a video we took during development where we combine Speech to Text and Visual Recognition to recognize a poker hand of cards.

IBM Watson: Robots Identify a Hand of Cards

In addition to the functions we have walked through above, we can also delete our classifier, list all existing classifiers, detect faces, and recognize text in a similar fashion. Check out the Watson Developer Cloud GitHub for more examples.

Further Reading

Learn more about Visual Recognition
