
Improving the Scalability of AI Planning when Memory is Limited


Planning is the process of deciding what actions to take in order to achieve a specific goal. As humans, we can make plans involving many steps and decisions.

AI planning is an active research area focused on teaching machines how to build a plan. It has been applied successfully to challenges such as cybersecurity and journey planning. Our article here goes into more detail on when AI planning is useful.

We report new research results relevant to AI planning in our paper, “Depth-First Memory-Limited AND/OR Search and Unsolvability in Cyclic Search Spaces,” presented at the International Joint Conference on Artificial Intelligence, IJCAI-19.

Planning problems are often computationally demanding, requiring large amounts of memory and long processing times. AI planning techniques include both algorithms that can limit their memory usage while searching for a plan and algorithms whose memory usage grows steadily as the search progresses. Algorithms that cannot limit their memory usage often run out of memory and fail to solve the problem at hand. This is a major limitation of algorithms popular among researchers and practitioners, such as A* and AO*.

It is desirable but challenging to design a planning solver that is fast and uses only a limited amount of memory. Search algorithms work by exploring many states; the set of such states is called the search space of the problem. A state corresponds to a specific configuration of the problem we are attempting to solve. For example, for a person attempting to reach a goal in a maze, the person's location corresponds to a state.
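To make the notion of a state concrete, here is a minimal Python sketch; the maze layout and the names `State`, `successors`, and `is_goal` are our own illustrative choices, not taken from the paper. A state is simply the person's position in the maze, and the successor function enumerates the positions reachable in one move.

```python
# Illustrative sketch: in a maze, a state is the person's (row, col) position.
MAZE = [
    "S..#",
    ".#.#",
    "...G",
]  # 'S' = start, 'G' = goal, '#' = wall

State = tuple  # a state is a (row, col) pair

def successors(state):
    """Yield the states reachable by one move (up, down, left, right)."""
    row, col = state
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
            yield (r, c)

def is_goal(state):
    row, col = state
    return MAZE[row][col] == "G"
```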

To avoid duplicate search effort (i.e., exploring the same part of the search space more than once), algorithms attempt to remember as much as possible of the search space already explored. For example, a transposition table caches as many already-explored states as its size permits. When the transposition table is full, algorithms discard part of its contents to make room for new states.
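As a rough sketch of this mechanism (an illustration under our own assumptions, not the exact data structure from the paper), a fixed-capacity transposition table can be modeled as a dictionary that evicts its oldest entry when full; the FIFO eviction policy here is just one of several possibilities:

```python
from collections import OrderedDict

class TranspositionTable:
    """A fixed-capacity cache of explored states (illustrative sketch).

    When the table is full, the oldest entry is discarded to make room,
    mirroring how memory-limited algorithms drop cached states.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # state -> cached information

    def lookup(self, state):
        """Return the cached information for `state`, or None if absent."""
        return self._entries.get(state)

    def store(self, state, info):
        if state in self._entries:
            self._entries.move_to_end(state)
        elif len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)  # evict the oldest entry
        self._entries[state] = info
```

Looking a state up before expanding it avoids re-exploring cached parts of the search space; once an entry is evicted, that protection is lost, which sets up the side effect discussed next.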

Removing states from the transposition table can lead to subtle, undesired side effects that can result in a suboptimal plan (e.g., a plan that is longer than necessary).

The reason is related to the evaluation function, an important ingredient used when searching for a plan. Put simply, the evaluation function of a state heuristically estimates how far we are from achieving the goal when continuing from that state (e.g., how many additional actions would be needed). This is important because it guides the search, indicating which part of the search space looks more promising in terms of reaching the goal.

A heuristic function h that can provide an evaluation for any given state is presumed to be available. For example, the straight-line distance between two points on a map is a heuristic estimate of the actual length of a trip on the road. When we store a state in the transposition table, we can also store a more accurate heuristic evaluation of that state (compared to the value given by the heuristic function h), obtained as a result of searching in the vicinity of the state at hand.
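Continuing the same illustrative sketch, `h` below is a straight-line (Euclidean) distance between two points on a map, and the cached entry keeps the best evaluation seen so far. The max-based update assumes an admissible heuristic, where a larger estimate is a more accurate one; none of the function names come from the paper.

```python
import math

def h(state, goal):
    """Straight-line distance: a heuristic estimate of the remaining cost."""
    (x1, y1), (x2, y2) = state, goal
    return math.hypot(x2 - x1, y2 - y1)

def evaluate(state, goal, table):
    """Prefer a cached, more accurate evaluation over the raw heuristic."""
    cached = table.lookup(state)
    return cached if cached is not None else h(state, goal)

def cache_improved_value(state, backed_up_value, goal, table):
    """Store an improved evaluation obtained by searching near `state`."""
    current = evaluate(state, goal, table)
    table.store(state, max(current, backed_up_value))
```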

The side effect is that some states will have an improved heuristic evaluation, whereas other states will rely on the original heuristic function h. Unless handled properly, such a situation can result in losing the so-called consistency property of the heuristic evaluations. Informally, inconsistency occurs when the accuracy of the h estimates decreases as we get closer to the goal. Some algorithms from the literature that we analyze in our paper require the consistency property in order to guarantee the optimality of solutions. We extend them so that this requirement is no longer necessary.
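For reference, here is the textbook definition of consistency, stated in our own notation rather than quoted from the paper: the heuristic value may not drop by more than the cost of a single step.

```latex
% Consistency (monotonicity): for every state s and every successor s'
% reached by an action of cost c(s, s'),
h(s) \le c(s, s') + h(s'), \qquad h(g) = 0 \text{ for every goal state } g.
```

Inconsistency can then arise after an eviction: if a state s keeps an improved (larger) cached value while its successor s' has been evicted and falls back to the smaller raw value h(s'), the inequality above can be violated along the edge from s to s'.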

We analyze in depth how this suboptimal behavior can occur and how existing algorithms can be extended to safely ensure optimality. We mathematically prove the soundness of our contributions.

We apply our ideas to algorithms that can work even with a very limited amount of memory. When a solution exists, our approach will find one, given sufficient time to complete the search. On the other hand, when no solution exists (e.g., looking for a road-trip plan to travel from Europe to Canada by car), it can prove the absence of a solution only for certain subclasses of problems. Making this completeness analysis more general is an important open problem, and thus an interesting topic for future work.

Remarkably, in practice, our ideas improve the ability to prove that certain problem instances have no solution, as shown in our empirical analysis in AI planning. In addition, on problem instances that do have solutions, our approach can compute provably optimal solutions without empirically degrading the performance of existing memory-limited search algorithms.

Adi Botea
Research Staff Member, IBM Research-Ireland

Radu Marinescu
Research Staff Member, IBM Research-Ireland
