Reinforcement learning

IBM’s AI learns to navigate around a virtual home using common sense

In a recent paper presented at the 2021 AAAI Conference on Artificial Intelligence (AAAI), we describe an AI that trades off ‘exploration’ of the world against ‘exploitation’ of its action strategy to maximize rewards. In reinforcement learning, an AI receives a reward – such as a bag of gold behind a locked door in a video game – every time it reaches specific desirable states. We show that adding commonsense knowledge, in the form of crowdsourced text, substantially improves how this exploration-versus-exploitation tradeoff is handled. Our work could lead to better mapping and navigation applications, and to a new generation of interactive assistive agents able to reason like humans.
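
To make the exploration-versus-exploitation tradeoff concrete, here is a minimal, generic sketch of the classic epsilon-greedy rule for a single state: with probability epsilon the agent explores a random action, otherwise it exploits the action with the highest estimated value. The names (`epsilon`, `q_values`) are illustrative, and the commonsense-guided strategy from the paper is not shown here.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Pick an action index: explore at random with probability epsilon,
    otherwise exploit the action with the highest estimated value."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                       # explore
    return max(range(len(q_values)), key=q_values.__getitem__)       # exploit

# Toy usage: estimated values for three actions (e.g., "go left",
# "open door", "go right"). The reward behind the locked door would,
# over time, raise the estimate of the action that leads to it.
print(epsilon_greedy([0.2, 0.9, 0.1]))
```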

Distributed Software-Defined Networking Control by Deep Reinforcement Learning for 5G and Beyond

This IEEE ICC 2019 “Best Paper” details a novel deep reinforcement learning approach to maximizing the overall performance of software-defined networking in support of 5G.

Dialog-Based Interactive Image Retrieval

A natural language-based system for interactive image retrieval that is more expressive than conventional systems based on binary or fixed-form feedback.

‘Show and Tell’ Helps AI Agent Align with Societal Values

As AI-powered autonomous agents play an increasingly large role in society, we must ensure that their behavior aligns with societal values. To this end, we developed a novel technique for training an AI agent to operate optimally in a given environment while following implicit constraints on its behavior. Our strategy incorporates a bottom-up (or demonstration-based) […]
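
As a generic illustration of constraint-aware reward design – not the specific technique described above – the sketch below penalizes a task reward whenever a behavioral constraint is violated, discouraging the agent from constraint-breaking actions. The function name and penalty weight are hypothetical.

```python
def shaped_reward(task_reward, violates_constraint, penalty=1.0):
    """Combine a task reward with a penalty for violating a behavioral
    constraint, so constraint-breaking behavior earns less overall."""
    return task_reward - (penalty if violates_constraint else 0.0)

# Toy usage: reaching the goal is worth 1.0, but doing so by breaking
# an implicit rule (e.g., entering a forbidden room) costs 0.8.
print(shaped_reward(1.0, violates_constraint=True, penalty=0.8))   # 0.2
print(shaped_reward(1.0, violates_constraint=False, penalty=0.8))  # 1.0
```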

End-to-End Open-Domain QA via Multi-Passage Reading Comprehension

Impressive progress has recently been made in neural question answering (QA) systems that can analyze a passage to answer a question. These systems work by matching a representation of the question to the text to find the relevant answer phrase. But what if the text is potentially all of Wikipedia? And what if the […]
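
The sketch below is a minimal, generic illustration of the retrieve-then-read idea behind open-domain QA, not the model from the paper: a crude word-overlap retriever scores each passage against the question, and the top passages are handed to a placeholder reader that stands in for a multi-passage reading-comprehension model. All names (`retrieve`, `read`, `word_overlap`) are ours.

```python
def word_overlap(question, passage):
    """Crude retriever score: number of lowercase word types shared
    between the question and the passage."""
    return len(set(question.lower().split()) & set(passage.lower().split()))

def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    return sorted(passages, key=lambda p: word_overlap(question, p),
                  reverse=True)[:k]

def read(question, passage):
    """Placeholder reader: a real reading-comprehension model would
    extract the answer span; here we simply return the passage."""
    return passage

question = "Who wrote the paper?"
passages = [
    "The capital of France is Paris.",
    "The paper was written by the IBM Research team.",
]
for p in retrieve(question, passages, k=1):
    print(read(question, p))
```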
