Beyond token efficiency, ReWOO offers another benefit: robustness to tool failure. If a tool fails under ReAct, the system can get caught in an infinite loop, with the LLM repeatedly querying a broken database for the weather in Chicago, for example.
ReWOO is nimbler. Even if a tool fails to return a given piece of evidence, the initial overarching plan is still in place: the Worker module can progress, and the Solver module can deliver at least a partial answer. In the weather example, instead of getting stuck repeatedly querying a database for Chicago’s weather, the Solver module would at least return New York’s and Milwaukee’s weather (assuming the Worker module was able to retrieve those pieces of evidence), which might be sufficiently helpful for the user’s planning needs.
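The behavior described above can be illustrated with a small toy sketch. It is not the ReWOO reference implementation: the names `Step`, `plan`, `work`, `solve`, `weather_tool` and the `FAKE_WEATHER` data are all hypothetical, and the Planner and Solver are hard-coded stand-ins for what would be LLM calls in a real system. The point is only to show that a failed tool call leaves a gap in the evidence rather than stalling the run.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Step:
    evidence_id: str  # placeholder label such as "#E1"
    tool: str         # name of the tool to call
    argument: str     # input passed to the tool


# Hypothetical data: Chicago is deliberately missing to simulate a broken tool.
FAKE_WEATHER = {"New York": "sunny, 72F", "Milwaukee": "cloudy, 65F"}


def weather_tool(city: str) -> str:
    if city not in FAKE_WEATHER:
        raise ConnectionError(f"weather database unavailable for {city}")
    return FAKE_WEATHER[city]


TOOLS = {"weather": weather_tool}


def plan() -> List[Step]:
    # Planner stand-in: a real system would have an LLM write this plan once, up front.
    return [
        Step("#E1", "weather", "New York"),
        Step("#E2", "weather", "Chicago"),
        Step("#E3", "weather", "Milwaukee"),
    ]


def work(steps: List[Step]) -> Dict[str, Optional[str]]:
    # Worker stand-in: run every planned step exactly once.
    evidence: Dict[str, Optional[str]] = {}
    for step in steps:
        try:
            evidence[step.evidence_id] = TOOLS[step.tool](step.argument)
        except Exception:
            # Record the failure and move on; there is no retry loop.
            evidence[step.evidence_id] = None
    return evidence


def solve(steps: List[Step], evidence: Dict[str, Optional[str]]) -> str:
    # Solver stand-in: a real system would hand the plan and evidence to an LLM.
    lines = []
    for step in steps:
        result = evidence[step.evidence_id]
        lines.append(f"{step.argument}: {result if result is not None else 'unavailable'}")
    return "\n".join(lines)


steps = plan()
evidence = work(steps)
answer = solve(steps, evidence)
print(answer)
```

In this sketch the Chicago lookup fails once and is marked as unavailable, while the New York and Milwaukee results still reach the Solver, so the user gets a partial answer after exactly three tool calls.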
Despite these benefits, ReWOO is not a universally superior framework; it is simply better for certain kinds of jobs, particularly when the types and quantities of evidence needed are regular and predictable. Where ReWOO falls short, however, is with less predictable or less structured problems that may require creativity, exploration or improvisation. With known unknowns, ReWOO excels, but with unknown unknowns, it flounders.
For instance, ReWOO would not be an optimal choice for debugging Python code, an exploratory and iterative process where each fix might yield new errors and clues, quickly rendering even the best-laid plans obsolete. A more adaptable framework like ReAct, while less token-efficient in the abstract, would ultimately be a better match for such a problem.