AI for the Enterprise

The power of Visual Recognition: How to use Watson to identify a hand of cards


Welcome to the fifth article in our blog series! If you are just joining us, we are a team of six interns using Watson services to program two robots to play poker. Be sure to check out our introduction blog to get a better feel for what we’re up to.

In the previous article, we went over how we use Watson’s Natural Language Classifier API to distinguish between poker and casual intent in conversation. In this blog post, we will discuss how we recognize a hand of cards.

Recognizing Playing Cards

For our robots to play a proper game of poker, they must understand the values of the cards in their hands. To do this, we take an image from the camera on each robot and split the problem into two parts: determining the suit of the card (spades, clubs, diamonds, hearts) and determining its rank (Ace, 2, 3, etc.).

We tackled the problem with a multi-step approach because it cuts down on the number of comparisons we have to make: instead of 52 comparisons (one per card), we only need 17 (4 for suit and 13 for rank). We use Watson’s Visual Recognition to identify suit and a hybrid solution to identify rank. In this post, we will be focusing on how we recognize suits.
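To make the two-step idea concrete, here is a minimal sketch of how the two results could be combined into a single card value. The helpers classify_suit and classify_rank are hypothetical placeholders for the suit classifier described below and our hybrid rank solution, not functions from an SDK.

# A minimal sketch of the two-step approach: identify suit and rank
# separately, then combine them into one card label.
# classify_suit and classify_rank are hypothetical placeholders.
def identify_card(image_path):
    suit = classify_suit(image_path)   # one of 4 classes
    rank = classify_rank(image_path)   # one of 13 classes
    return rank + ' of ' + suit        # e.g. "Ace of spades"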

The Visual Recognition service


Using Watson’s Visual Recognition Service to classify suits

IBM Watson’s Visual Recognition is a service that allows users to understand the content of images and classify images into logical categories. In addition to classifying images, Visual Recognition also offers facial detection and text recognition services. We will focus on classifying images in this post. The Visual Recognition service does come with a pre-trained classifier for general classification, but since we want to classify our images specifically for suits, we will train a custom Visual Recognition classifier on the four suits.

In this post, I’ll show you how to train your own custom suits classifier using the Watson Developer Cloud Python SDK.

Getting Access

As in our previous posts, we will first have to create a Visual Recognition service in Bluemix to interact with.

Go ahead and create an instance of the Visual Recognition service from your Bluemix console and take note of the api_key in your service credentials. This step is the same as in our previous posts.
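If you prefer to keep the key out of your source code, here is a small sketch that reads it from an environment variable before constructing the client; the variable name VISUAL_RECOGNITION_API_KEY is just an assumption for illustration.

from os import environ
from watson_developer_cloud import VisualRecognitionV3

# Read the api_key from an environment variable (the variable name is an
# arbitrary choice for this sketch) and construct the client.
api_key = environ['VISUAL_RECOGNITION_API_KEY']
visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key=api_key)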

Training a Classifier

In our project, our current suits classifier uses over 300 images for each suit. The training data consists only of images taken from the camera on our NAO robots, since those are the photos the classifier will see during a game of poker, and it includes each card at different angles and under different lighting. Over several iterations, we have improved the classifier’s accuracy by adding more of the images it struggles with, such as dimly lit photos and face cards.
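If you want to fold new problem images into an existing classifier rather than starting over, recent versions of the SDK expose an update_classifier call. Here is a rough sketch, assuming that call is available in your SDK version and that more_spades.zip is a hypothetical zip of the additional images.

from watson_developer_cloud import VisualRecognitionV3

visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key='{YOUR_API_KEY_HERE}')

# A rough sketch of retraining an existing classifier with extra positive
# examples (for example, dimly lit spades). Assumes update_classifier is
# available in your version of the SDK.
with open('more_spades.zip', 'rb') as more_spades:
    visual_recognition.update_classifier('{YOUR_CLASSIFIER_ID_HERE}',
        spades_positive_examples=more_spades)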

To use the Visual Recognition service, you will need at least 10 images, as JPGs or PNGs, for each class you want inside of your classifier. In our case of suits, we want 4 zipped folders of images, one for each suit. In total I will be using 40 images for this tutorial. There is also an option to pass in negative training data, but when a classifier contains multiple classes, the Visual Recognition service implicitly treats each class’s training images as negative examples for the other classes.
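If you want to build the zip files from your own photos instead of using ours, here is a small sketch using Python’s standard library; it assumes a hypothetical folder layout with one directory of images per suit, such as training/hearts.

import shutil

# A sketch of packaging training images into the zip files the service expects.
# Assumes a hypothetical layout of training/hearts, training/diamonds,
# training/clubs and training/spades, each holding that suit's JPG or PNG images.
for suit in ['hearts', 'diamonds', 'clubs', 'spades']:
    shutil.make_archive(suit, 'zip', 'training/' + suit)  # writes hearts.zip, etc.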

You can download the files I used below:
clubs.zip
diamonds.zip
hearts.zip
spades.zip
test1.jpg

All of my code and all of the images I used can be found in my GitHub repository here.

The code below creates a classifier for us to use. When passing the zip folders into the service, the keyword argument for each class takes the form classname_positive_examples. For example, for our spades class, the key would be spades_positive_examples. Note that the file uploads may take a bit of time.

import json
from os.path import join, dirname
from watson_developer_cloud import VisualRecognitionV3

visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key='{YOUR_API_KEY_HERE}')

# Open the four zip files of positive examples, one per suit, and create the
# custom classifier. The keyword argument for each class must follow the
# classname_positive_examples pattern.
with open(join(dirname(__file__), 'hearts.zip'), 'rb') as hearts, \
     open(join(dirname(__file__), 'diamonds.zip'), 'rb') as diamonds, \
     open(join(dirname(__file__), 'clubs.zip'), 'rb') as clubs, \
     open(join(dirname(__file__), 'spades.zip'), 'rb') as spades:
    print("Uploading files...")
    print(json.dumps(visual_recognition.create_classifier('suits_tutorial',
        hearts_positive_examples=hearts,
        diamonds_positive_examples=diamonds,
        clubs_positive_examples=clubs,
        spades_positive_examples=spades), indent=2))

If you run this code, your response should look something like this:

{
  "status": "training", 
  "name": "suits_tutorial",
  "created": "2016-07-18T19:27:22.429Z", 
"classes": [
    {
      "class": "spades"
    }, 
    {
      "class": "hearts"
    }, 
    {
      "class": "diamonds"
    }, 
    {
      "class": "clubs"
    }
  ], 
  "owner": "{YOUR_OWNER_ID_HERE}", 
  "classifier_id": "{YOUR_CLASSIFIER_ID_HERE}"
}

Be sure to take note of your classifier_id, as that is the ID we will use whenever we call our classifier.

Your classifier will take a few minutes to finish training. Larger sets of training data will take more time to train. To check the status of your classifier, you can run the code below.

import json
from watson_developer_cloud import VisualRecognitionV3

visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key='{YOUR_API_KEY_HERE}')

# Retrieve the classifier's details to check whether training has finished.
print(json.dumps(visual_recognition.get_classifier('{YOUR_CLASSIFIER_ID_HERE}'), indent=2))

If the status is ready, the classifier is done training and we can start classifying images with it. Run the code below to classify an example image (test1.jpg).

import json
from os.path import join, dirname
from watson_developer_cloud import VisualRecognitionV3

visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key='{YOUR_API_KEY_HERE}')

# Classify the test image against our custom suits classifier. A threshold of 0
# returns a score for every class instead of only the confident ones.
with open(join(dirname(__file__), './test1.jpg'), 'rb') as image_file:
    print(json.dumps(visual_recognition.classify(images_file=image_file, threshold=0,
        classifier_ids=['{YOUR_CLASSIFIER_ID_HERE}']), indent=2))

Your result should look something like this:

{
  "images": [
    {
      "image": "./test1.jpg", 
      "classifiers": [
        {
          "classes": [
            {
              "score": 0.0713362, 
              "class": "clubs"
            }, 
            {
              "score": 0.0823247, 
              "class": "diamonds"
            }, 
            {
              "score": 0.0638997, 
              "class": "hearts"
            }, 
            {
              "score": 0.840459, 
              "class": "spades"
            }
          ], 
          "classifier_id": "{YOUR_SUITS_ID_HERE}", 
          "name": "suits_tutorial"
        }
      ]
    }
  ], 
  "custom_classes": 4, 
  "images_processed": 1
}

We can see that spades had the highest confidence of the four suits, with a score of 0.840459 out of 1. You can now use the same approach to train your own custom classifier on any kind of image you want!
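If you want to pick the winning suit in code rather than by eye, here is a small sketch that walks the response shown above; it assumes result is the dictionary returned by the classify call (i.e. the return value stored in a variable instead of printed directly).

# A sketch of extracting the top-scoring suit from the classify response.
# 'result' is assumed to hold the dictionary returned by visual_recognition.classify.
classes = result['images'][0]['classifiers'][0]['classes']
best = max(classes, key=lambda c: c['score'])
print(best['class'], best['score'])  # e.g. spades 0.840459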

Take a look at a video we took during development where we combine Speech to Text and Visual Recognition to recognize a poker hand of cards.

IBM Watson: Robots Identify a Hand of Cards

In addition to the functions we have walked through above, we can also delete our classifier, list all existing classifiers, detect faces, and recognize text in a similar fashion. Check out the Watson Developer Cloud GitHub for more examples.
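For reference, here is a short sketch of two of those housekeeping operations using the same SDK; as always, swap in your own classifier ID.

import json
from watson_developer_cloud import VisualRecognitionV3

visual_recognition = VisualRecognitionV3(VisualRecognitionV3.latest_version, api_key='{YOUR_API_KEY_HERE}')

# List all of your custom classifiers, then delete the suits classifier.
print(json.dumps(visual_recognition.list_classifiers(), indent=2))
visual_recognition.delete_classifier('{YOUR_CLASSIFIER_ID_HERE}')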

Further Reading

Learn more about Visual Recognition
