IBM’s AI learns to navigate around a virtual home using common sense

In a recent paper presented at the 2021 AAAI Conference on Artificial Intelligence, we describe an AI that balances ‘exploration’ of the world against ‘exploitation’ of its current action strategy to maximize rewards. In reinforcement learning, an AI receives a reward, such as a bag of gold behind a locked door in a video game, every time it reaches a specific desirable state. We have greatly improved this exploration-versus-exploitation tradeoff by giving the AI additional commonsense knowledge in the form of crowdsourced text. Our work could lead to better mapping and navigation applications, and to a new generation of interactive assistive agents able to reason like humans.
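To make the exploration-versus-exploitation tradeoff concrete, here is a minimal sketch of epsilon-greedy action selection on a toy problem with a few fixed actions. This is a standard textbook strategy, not the method from the paper, and it uses no commonsense knowledge. The names `epsilon_greedy` and `run_bandit`, the reward values, and the noise level are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Return an action index: random with probability epsilon, else the best-looking one."""
    if rng.random() < epsilon:
        # Explore: try any action, even one that currently looks poor.
        return rng.randrange(len(estimates))
    # Exploit: pick the action with the highest estimated reward so far.
    return max(range(len(estimates)), key=lambda a: estimates[a])


def run_bandit(true_rewards, epsilon=0.1, steps=2000, seed=0):
    """Simulate an agent that learns reward estimates for a few fixed actions.

    true_rewards holds the hypothetical average reward of each action;
    the agent never sees these values directly, only noisy samples.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(true_rewards)
    counts = [0] * len(true_rewards)
    for _ in range(steps):
        action = epsilon_greedy(estimates, epsilon, rng)
        # Noisy reward sample for the chosen action (noise level is an assumption).
        reward = true_rewards[action] + rng.gauss(0, 0.05)
        counts[action] += 1
        # Incremental running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts
```

Calling `estimates, counts = run_bandit([0.2, 0.8, 0.5])` runs the toy agent. With a small `epsilon` it spends most steps on the action it currently believes is best, while the occasional random choice lets it discover a better action it would otherwise never try. Setting `epsilon` to 0 removes exploration entirely, and setting it to 1 makes every choice random.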

