Who doesn’t love the challenge of solving a puzzle? Jigsaw puzzles are a popular hobby for all ages and have long been an important tool in a child’s physical, cognitive and emotional development. By completing puzzles, children develop physical skills such as hand-eye coordination; cognitive skills such as shape recognition, memory and problem solving; and emotional skills such as goal setting and patience. But what if you are a visually impaired or blind parent? How can you help your child accomplish this important developmental task?

Puzzle Solving Toolkit

At IBM Research – Ireland, our team built a 3D computer-vision-driven task-completion prototype called the Puzzle Solving Toolkit, which interacts with visually impaired users to guide them through solving a jigsaw puzzle in a natural and intuitive way. We achieved this by combining computer vision algorithms with Watson services.

Our toolkit recognizes the puzzle pieces from a camera input and computes the best strategy for solving the puzzle. The system interacts with the user through the Watson Speech to Text and Text to Speech services, communicating which piece to pick up next and adapting to the user’s feedback.
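To make this interaction loop concrete, here is a minimal sketch of how the two Watson speech services can be wired together in Python. The calls come from the publicly documented ibm-watson SDK; the API keys, file names and the example instruction are placeholders, not part of our toolkit’s actual code.

```python
# Minimal sketch: speak one instruction and transcribe the user's reply with
# Watson Text to Speech / Speech to Text (ibm-watson Python SDK).
# API keys and audio file names below are placeholders.
from ibm_watson import SpeechToTextV1, TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

tts = TextToSpeechV1(authenticator=IAMAuthenticator("TTS_API_KEY"))
stt = SpeechToTextV1(authenticator=IAMAuthenticator("STT_API_KEY"))

def speak(instruction: str) -> None:
    """Synthesize an instruction and write it to a WAV file for playback."""
    response = tts.synthesize(instruction, accept="audio/wav",
                              voice="en-US_AllisonV3Voice").get_result()
    with open("instruction.wav", "wb") as audio_file:
        audio_file.write(response.content)

def listen(recording_path: str) -> str:
    """Transcribe the user's spoken feedback from a recorded WAV file."""
    with open(recording_path, "rb") as audio_file:
        result = stt.recognize(audio=audio_file,
                               content_type="audio/wav").get_result()
    if result["results"] and result["results"][0]["alternatives"]:
        return result["results"][0]["alternatives"][0]["transcript"]
    return ""

speak("Pick up the corner piece closest to your left hand.")
print(listen("user_reply.wav"))
```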

Watch our video where we explain how our Puzzle Solving Toolkit prototype works. In the video, we also share additional research projects our team is developing on Indoor Positioning Systems and Visual Recognition Systems.

https://youtu.be/WeGNTB098xc

Components of the toolkit

We use a head-mounted Intel 3D camera for video input and a microphone and speaker for natural human-computer interaction. The video feed goes through a deep learning pipeline that identifies the puzzle pieces in a given scene and matches them to the original image of the puzzle. Once the pipeline completes, the core puzzle solving algorithm computes the solution of the puzzle and coordinates the instructions that must be communicated to the user to complete it successfully.
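While our actual pipeline relies on deep learning for detection and matching, the underlying idea of locating a piece within the reference image can be sketched with classical feature matching. The snippet below uses OpenCV ORB features purely as an illustration; the image file names are placeholders.

```python
# Illustrative sketch (not our deep learning pipeline): estimate where a
# segmented puzzle piece belongs in the reference image via ORB feature matching.
import cv2
import numpy as np

reference = cv2.imread("puzzle_box_image.png", cv2.IMREAD_GRAYSCALE)
piece = cv2.imread("detected_piece.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_piece, des_piece = orb.detectAndCompute(piece, None)

# Match piece descriptors against the reference image and keep the best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_piece, des_ref), key=lambda m: m.distance)[:30]

# The centroid of the matched reference keypoints approximates the piece's
# target location, which can then be turned into a spoken instruction.
pts = np.float32([kp_ref[m.trainIdx].pt for m in matches])
target_x, target_y = pts.mean(axis=0)
print(f"Piece belongs near ({target_x:.0f}, {target_y:.0f}) in the reference image.")
```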

The core puzzle solving algorithm and the deep learning pipeline run on an NVIDIA TK1 board and leverage the computational speed of its GPU. This demo shows what we can provide in assisting visually impaired people with simple everyday tasks by bringing sophisticated AI algorithms together with advanced hardware.
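As a rough illustration of what “computing the solution and coordinating the instructions” can look like, the sketch below orders pieces in the common corners-then-edges-then-interior strategy and emits one spoken instruction per step. The Piece data structure and the strategy are assumptions made for illustration, not our production algorithm.

```python
# Hypothetical sketch of an instruction planner: place the most constrained
# pieces first (corners, then edges, then interior) and emit one instruction
# per step. The Piece fields are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Piece:
    piece_id: int
    row: int            # target row in the solved puzzle
    col: int            # target column in the solved puzzle
    flat_sides: int     # 2 = corner, 1 = edge, 0 = interior

def plan_instructions(pieces, rows, cols):
    # Pieces with more flat sides are easier to orient and place, so do them first.
    ordered = sorted(pieces, key=lambda p: (-p.flat_sides, p.row, p.col))
    for step, p in enumerate(ordered, start=1):
        yield (f"Step {step}: place piece {p.piece_id} at "
               f"row {p.row + 1} of {rows}, column {p.col + 1} of {cols}.")

pieces = [Piece(0, 0, 0, 2), Piece(1, 0, 1, 1), Piece(2, 1, 1, 0), Piece(3, 1, 0, 1)]
for instruction in plan_instructions(pieces, rows=2, cols=2):
    print(instruction)
```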

Real-world applications

With our toolkit, we envision helping, for example, visually impaired parents guide their children through collaborative learning tasks. Our research can bring real-time, context-specific cognitive analytics to the edge, in tandem with deep device specialization and sensory augmentation, to transform the human experience.

Performing an everyday task such as cooking, navigating an indoor or outdoor environment, or attending a meeting at work is quite simple for most people. However, approximately 285 million people worldwide are visually impaired, 39 million of whom are blind, and they face enormous challenges in accomplishing these daily tasks [1]. With the rapid convergence of hardware equipped with powerful sensors and 3D computer vision algorithms, we can now repurpose IoT devices and create AI-powered assistive technologies that interact naturally, using cognitive Watson services, to give blind and visually impaired people more independence. The spectrum of possible applications is vast, and we are working to enable people to accomplish ever more complex and challenging everyday tasks.

The goal of our research is to help people reach their full potential, accomplish more, and enjoy a better quality of life. We will continue to enhance our prototypes and combine them with the cognitive capabilities of Watson to bring visually impaired people closer to full autonomy.

[1] World Health Organization, Fact Sheet No. 282, August 2014.
