Reasoning is required for a range of cognitive tasks, from applying basic common sense and adapting to changing situations to simple planning and complex professional decision-making. “Humans know that if you put an object on the table, it’s likely to stay on the table unless the table’s tilted. But nobody writes that in a book—it’s something implicit. Systems don’t have this common-sense capability,” explains Aya Soffer, IBM Director of AI and Cognitive Analytics Research.
“In compliance, if a legislature passes a law, for example, it is possible that a newly elected legislature will rescind portions of that law. So you have to build into your system the very fundamental idea of human existence that anything can change,” says Vijay Saraswat, IBM Research Chief Scientist for Compliance.
There’s a real need for symbolic reasoning and alternative routes to intelligence that are going to be necessary to make more robust AI tools.
— Kevin Kelly, Wired co-founder and author of the best-selling book The Inevitable
AI experts agree that we’re still in the early days of teaching systems to deeply reason, with a few examples of progress in narrow applications such as self-driving cars and select professions. Much work remains to reach a level of efficiency that allows for scaling reasoning capabilities across a broader swath of applications.
“We have now reached the stage where, after expending significant effort in labeling text, we can map a natural language sentence to a logical form in some areas. Then we can use formalized reasoning mechanisms to work with these extracted formulas,” says IBM’s Saraswat. “The key challenge is to substantially reduce the effort needed to get to these formulas for a variety of areas.”
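To make the sentence-to-logic step concrete, here is a minimal, hypothetical sketch in Python. The toy grammar, regular expression and predicate names are illustrative assumptions, not IBM’s actual pipeline, which learns such mappings from labeled text.

```python
# Minimal sketch of mapping a natural-language sentence to a logical form.
# The pattern and predicate names are hypothetical illustrations only.
import re

def to_logical_form(sentence: str) -> str:
    """Map a narrow class of 'Every X must Y a Z' sentences to first-order logic."""
    match = re.match(r"Every (\w+) must (\w+) a (\w+)\.?", sentence, re.IGNORECASE)
    if not match:
        raise ValueError(f"Sentence outside the toy grammar: {sentence!r}")
    subject, verb, obj = (w.lower() for w in match.groups())
    # forall x: subject(x) -> exists y: obj(y) and verb(x, y)
    return f"∀x. {subject}(x) → ∃y. {obj}(y) ∧ {verb}(x, y)"

print(to_logical_form("Every employee must file a report."))
# ∀x. employee(x) → ∃y. report(y) ∧ file(x, y)
```

Once sentences are in a form like this, off-the-shelf theorem provers or rule engines can operate on them, which is the “formalized reasoning mechanisms” half of Saraswat’s description.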
Some AI technologists are optimistic that we’ll solve the reasoning challenge in the next five to ten years, and they point out that deep learning might actually be part of the solution.
“The opportunities for developing a massive capability for reasoning are within our reach now. We have large quantities of data, and while they’re not necessarily in exactly the form you want, there are very strong signs that machine learning techniques can transform data into a form required for automated reasoning,” says IBM Cognitive Research Manager Michael Witbrock. What’s needed, explains Witbrock, is a way to vastly scale up reasoning computations—a boost that would be akin to what GPUs provided for neural networks.
Deep learning gets a makeover
While deep learning is here to stay, it will likely look different in the next wave of AI breakthroughs. Experts stress the need to become much more efficient at training deep learning models in order to apply them at scale across increasingly complex and diverse tasks. The path to this efficiency will be led in part by “small data” and by greater use of unsupervised learning.
The advent of small data
The neural networks of deep learning models require exposure to huge amounts of data to learn a task. Training a neural network to recognize an object, for example, could require feeding it as many as 15 million images. Acquiring relevant datasets of this size can be costly and time-consuming, which slows the pace of training, testing and refining AI systems.
And sometimes there simply isn’t enough data available in a particular domain to feed a hungry deep learning model. “In health, how many subjects do you have in clinical studies? Thousands, if you’re lucky, but even that takes years to get. Patients don’t have the time to wait for that,” says IBM Cognitive Solutions Research Manager Costas Bekas.
Researchers are pushing to figure out ways to train systems on less data and are confident they’ll find a viable solution. As a result, AI experts expect the “data” variable in the AI growth equation to be turned on its head, with small datasets overtaking big data as drivers of new AI innovation.
En route to unsupervised learning
Current deep learning models require datasets that are not only massive, but also labeled so that the system knows what each piece of data represents. Supervised learning largely relies on humans to do the labeling, a laborious task that further slows the innovation process, piles on expense and could introduce human bias into systems.
Even with labels in place, systems often require additional human hand-holding to learn. “You’ve got a subject matter expert telling the system everything they know. It makes for a really accurate system, but it can be a really painful process for the SME,” says IBM Vice President of Cognitive Computing Michael Karasick.
On the other end of the spectrum, unsupervised learning allows raw, unlabeled data to be used to train a system with little to no human effort. “In unsupervised learning, the system simply interacts with the world. It just gets to see what’s out there and learns from that,” says IBM’s Campbell.
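As a minimal illustration of what “learning from what’s out there” can look like, the sketch below clusters raw, unlabeled points with k-means. It assumes scikit-learn is installed, and the data is synthetic; the algorithm is never told what the points represent.

```python
# Minimal sketch of unsupervised learning: the system sees only raw,
# unlabeled points and discovers structure (clusters) on its own.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Unlabeled data: two blobs, but the algorithm is never told that.
data = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=3.0, scale=0.5, size=(100, 2)),
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print(model.cluster_centers_)  # centers near (0, 0) and (3, 3)
```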
Humans are very good at unsupervised learning, and we need to make substantial progress in that direction to approach human-level AI.
—Yoshua Bengio, deep learning researcher and University of Montreal professor
However, most AI visionaries cast pure unsupervised learning as the holy grail of deep learning and admit we’re a long way off from figuring out how to use it to train practical applications of AI. The next wave of AI innovation will likely be fueled by deep learning models trained using a method that lies somewhere between supervised and unsupervised learning.
Computer scientists and engineers are exploring a number of such learning methods, some of which offer a triple threat—less labeled data, less data volume and less human intervention. Among them, “one-shot learning” is closest to unsupervised learning. It’s based on the premise that most human learning takes place upon receiving just one or two examples.
James DiCarlo, head of MIT’s brain and cognitive sciences department, says a great example of this is how we learn to recognize objects. “Imagine you pick up a cup and you rotate it in front of you. From your visual system’s point of view, that’s unlabeled data. That can be used to start training the visual system, even without a label for the word ‘cup.’ So you can imagine that you might build a machine’s deep network in a largely unsupervised way. And then you just have a thin layer of labeling on the back end. That’s the model many people imagine.”
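A rough sketch of the model DiCarlo describes, assuming a pretrained embedding (stood in for here by a fixed random projection, a hypothetical placeholder rather than a real network) plus a “thin layer of labeling”: one labeled example per class, classified by nearest neighbor.

```python
# Sketch of one-shot classification: a pretrained embedding does the heavy
# lifting, and a single labeled example per class is enough to classify.
# `embed` stands in for a real pretrained network; the random projection
# below is a hypothetical placeholder, not a trained model.
import numpy as np

rng = np.random.default_rng(42)
PROJECTION = rng.normal(size=(64, 16))  # stand-in for learned features

def embed(x: np.ndarray) -> np.ndarray:
    v = x @ PROJECTION
    return v / np.linalg.norm(v)

# One labeled example per class: the "thin layer of labeling".
support = {label: embed(rng.normal(size=64)) for label in ["cup", "bowl"]}

def classify(x: np.ndarray) -> str:
    q = embed(x)
    return max(support, key=lambda label: float(q @ support[label]))

print(classify(rng.normal(size=64)))  # nearest class by cosine similarity
```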
Other promising methods require more supervision, but would still help speed and scale applications of deep learning. These include:
- Reinforcement learning: “As the system takes action in an environment, it’s given rewards if it does good things and penalties if it does bad things. This is much easier to provide in terms of supervision, but it still requires supervision,” explains IBM’s Campbell.
- Transfer learning: “You take a trained model and then, to apply it to a completely new problem, you use just a little bit of training and a little bit of labeled data,” says IBM’s Smith. (A minimal sketch of this approach follows this list.)
- Active learning: The system requests more labeled data only when it needs it. “It’s definitely a baby step toward unsupervised learning in the sense that the computer is initiating the labeling versus humans just feeding the computer lots and lots of labeled data,” says IBM’s Soffer.
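As a concrete illustration of the transfer learning item above, here is a minimal sketch assuming PyTorch and torchvision are available: a backbone pretrained on ImageNet is frozen, and only a small new classification head is trained on a handful of labeled examples (random tensors stand in for real images here).

```python
# Minimal sketch of transfer learning: reuse a pretrained network,
# freeze its features, and train only a small new head.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")  # pretrained backbone
for param in model.parameters():
    param.requires_grad = False                   # freeze learned features

model.fc = nn.Linear(model.fc.in_features, 2)     # new 2-class head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# A little labeled data (random tensors standing in for real images).
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()
```

Because only the final layer is updated, training needs far fewer examples and far less compute than training the full network from scratch.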
Efficient algorithms and new AI hardware
Simplifying the learning process will also help relieve the power crunch that slows both innovation and AI application performance. While GPUs have accelerated the training and running of deep learning models, they’re not enough. “It takes a lot of computational power to train the model and then use the model after you’ve trained it. We can do a test on a model and it will take two to three weeks to train it, and if you’re trying to iterate fast, that is kind of painful,” says Jason Toy, CEO of AI creativity application startup Somatic. “So we spend a lot of time researching and testing to make the model architectures smaller and run faster.”
IBM’s Bekas explains that we simply can’t scale enough hardware to solve this. “Ultimately, hardware can’t beat computational complexity. You need to have a combination of algorithmic improvement and hardware development,” says Bekas.
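A back-of-envelope calculation makes Bekas’s point: a tenfold hardware speedup is a constant factor, while a better algorithm changes how cost grows with problem size. The operation counts below are illustrative assumptions, not measurements.

```python
# Illustration: a 10x hardware speedup is a constant factor, while an
# algorithmic improvement (n^2 -> n log n) changes the growth curve.
import math

for n in (10**4, 10**6, 10**8):
    quadratic = n**2               # e.g., an all-pairs comparison
    faster_hw = n**2 / 10          # same algorithm on 10x faster hardware
    better_alg = n * math.log2(n)  # an n log n alternative
    print(f"n={n:>11,}: n^2={quadratic:.1e}  "
          f"n^2/10={faster_hw:.1e}  n log n={better_alg:.1e}")
```

At n of one hundred million, the tenfold hardware speedup still leaves roughly a million times more work than the better algorithm, which is why hardware alone can’t beat computational complexity.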
With model improvements, experts contend that GPUs will pick up speed and remain an important part of the “computational power” variable in the formula that drives the next AI leaps. However, some AI hardware under development, such as neuromorphic chips or even quantum computing systems, could factor into the new equation for AI innovation.