
Cognitive AR to give you superhero vision

Ethan Hadar giving a live demonstration of our cognitive operations guidance technology at the recent Tel Aviv Watson Summit

The other day my son asked me to explain what I do at work. I told him that our team builds technology that can answer your questions when you point your phone at whatever you’re trying to fix or understand. The augmented reality we’re developing will know what it’s looking at and what to do – and tell you.

His answer? “Oh, I get it – like Jarvis helps Ironman!” I hadn’t thought of that, but yes!

Our team at IBM Research – Haifa is using augmented reality to provide visual guidance by superimposing digital information on real objects viewed through phones, tablets, or smart glasses. What we call ‘cognitive operations guidance’ can recognize what’s in the device’s field of view, answer questions, and even understand and ask about what you’re gesturing toward – say, the leaky pipe under the sink on the left. This technology has the potential to simplify and clarify many kinds of interactions in daily life, both for consumers and in industry.

What we can do with the help of visual guidance

We are using augmented reality and cognitive conversation to help field technicians find solutions for real-life industry problems in maintenance, support, and manufacturing. We can also help people learn how to use a device, or get instructions for home repairs.

Instead of overwhelming someone with detailed technical information, our cognitive AR gives simple step-by-step instructions. Cognitive operations guidance can help repair a kitchen appliance or assemble a DIY dinner table (no more paper manuals folded like origami). Using voice and gestures, it can help even a novice technician solve a complex technical problem.

Instructions superimposed on objects

Simply point your device’s camera at the object you want to fix, and digital instructions appear on the video screen, anchored to the object’s parts even when you move the camera. Tapping through the next steps advances the repair instructions one at a time, or brings up links to external information and even IoT telemetry.
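To make the anchoring idea concrete, here is a minimal sketch in Python with OpenCV – an illustrative approximation, not our actual implementation. It matches ORB features against a reference photo of the object (the image file, step texts, and anchor coordinates are all hypothetical) and re-projects each step’s anchor point through a homography, so the label stays pinned to the object as the camera moves; pressing ‘n’ advances to the next step.

import cv2
import numpy as np

# Hypothetical reference photo of the object; the anchor points below
# are expressed in this image's pixel coordinates.
reference = cv2.imread("appliance_reference.jpg", cv2.IMREAD_GRAYSCALE)
assert reference is not None, "reference photo not found"

orb = cv2.ORB_create(nfeatures=1000)
ref_kp, ref_desc = orb.detectAndCompute(reference, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

# Each repair step pins its text to a point on the reference image.
steps = [((120, 80), "1. Unplug the unit"),
         ((300, 200), "2. Remove the side panel")]
current = 0

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    kp, desc = orb.detectAndCompute(gray, None)
    if desc is not None and len(kp) > 0:
        matches = matcher.match(ref_desc, desc)
        if len(matches) >= 10:
            src = np.float32([ref_kp[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
            dst = np.float32([kp[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
            H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
            if H is not None:
                # Re-project the current step's anchor so the label
                # stays glued to the object as the camera moves.
                pt = np.float32([[steps[current][0]]])
                x, y = cv2.perspectiveTransform(pt, H)[0][0]
                cv2.putText(frame, steps[current][1], (int(x), int(y)),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
    cv2.imshow("guidance", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("n"):  # advance to the next repair step
        current = min(current + 1, len(steps) - 1)
    elif key == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()

A production system would track in 3D and handle occlusion, but this same match-project-draw loop is what keeps an instruction attached to a physical part.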

By selecting the type of object, the user can load a repair procedure and start interacting with it. Our object recognition system can also be configured to automatically identify the object the user is pointing the camera at and offer several repair options. In some cases, a human expert at a remote location can chat with the end user, work out what the repair task is, and send over the correct set of instructions using AR-guided operations.
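The dispatch step implied here is simple to sketch; the catalog, labels, and function below are hypothetical illustrations rather than our product’s API. A recognized object label selects the available repair procedures, and an unrecognized one falls back to the remote-expert path.

from dataclasses import dataclass

@dataclass
class Procedure:
    name: str
    steps: list

# Hypothetical catalog keyed by the recognizer's output label.
CATALOG = {
    "dishwasher": [
        Procedure("Clear the drain filter", ["Remove lower rack", "Twist filter out"]),
        Procedure("Replace the door seal", ["Unplug unit", "Pull old seal from channel"]),
    ],
}

def options_for(label):
    # None signals that no local procedures exist and the session
    # should escalate to a remote human expert.
    return CATALOG.get(label)

procedures = options_for("dishwasher")
if procedures is None:
    print("Escalating to remote expert chat...")
else:
    for p in procedures:
        print(p.name)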

Cognitive visual guidance also includes tracking technology that superimposes information on top of the field of view – like where to find a valve that needs to be turned off. The displayed information adjusts as the device’s perspective, location, and orientation relative to the object change (helpful when you’re under the sink repairing a disposal).
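That perspective adjustment can be illustrated with a standard pose-estimation step, again only as a sketch under assumed geometry: given the detected 2D positions of known 3D reference points on the object and rough camera intrinsics (all the numbers below are made up), OpenCV’s solvePnP recovers the camera pose, and projectPoints re-draws the annotation’s fixed 3D location from the new viewpoint.

import cv2
import numpy as np

# Known 3D coordinates (cm) of four reference corners on the object, and
# their detected 2D pixel locations in the current frame (assumed to come
# from a detector upstream).
object_pts = np.float32([[0, 0, 0], [20, 0, 0], [20, 15, 0], [0, 15, 0]])
image_pts = np.float32([[210, 310], [430, 305], [435, 470], [205, 475]])

# Hypothetical intrinsics for a 640x480 camera; real values come from calibration.
K = np.float32([[600, 0, 320], [0, 600, 240], [0, 0, 1]])
dist = np.zeros(4)

# Recover where the camera is relative to the object.
ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)

# The valve sits at a fixed 3D point on the object; projecting it into
# the current view puts the overlay on the right pixel no matter how the
# camera has moved.
valve_3d = np.float32([[10.0, 7.5, -5.0]])
valve_2d, _ = cv2.projectPoints(valve_3d, rvec, tvec, K, dist)
x, y = valve_2d[0][0]
print(f"Draw the 'turn off this valve' marker at pixel ({x:.0f}, {y:.0f})")

Re-running the pose estimate every frame is what makes the overlay stay put when you crawl under the sink and the viewing angle changes.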

Think of our cognitive AR assistance as DIY anywhere, for just about anything. No more hunting for online help. It can guide machine maintenance work, making it safer and more efficient. If a technician in the field is caught in a thunderstorm or high winds and can’t practically consult a manual, our technology can offer guidance from an experienced technician or a cognitive computer assistant.

Another use case we’re exploring helps consumers with a variety of tasks via cognitive visual guidance on the screen of a mobile device: operating features on your car console, checking your vehicle’s water and oil levels, or simply finding the USB outlet in a new rental car.

Simplicity is the challenge

At IBM Research – Haifa, we’re working on a full-cycle technology. Cognitive AR for operations guidance can recognize language, objects, gestures, and intentions, and then put it all together to understand what the next steps should be. Our goal is to make life easier with meaningful, real-time interaction – or in my son’s words, “turn us into superheroes!”

We’ll be demoing this technology at CVPR 2017 in Honolulu from July 21 – 26.

Read more about what IBM is up to at CVPR 2017 here.

This technology became a reality because of the team: Benjamin Cohen, Leonid Karlinsky, Ran Nissim, Joseph Shtok, Uzi Shvadron, Yochay Tzur, and Boris Vigman.



Dr. Ethan Hadar, Manager of Augmented Reality and Computer Vision at IBM Research - Haifa