What is tree-of-thoughts?

Contributors: Vrunda Gadesha, Eda Kavlakoglu

Tree-of-thoughts (ToT) is a ground-breaking framework designed to enhance the reasoning capabilities of large language models (LLMs). This approach simulates human cognitive strategies for problem-solving, enabling LLMs to explore multiple potential solutions in a structured manner, akin to a tree's branching paths[1].

Difference between chain-of-thought (CoT) and tree-of-thoughts (ToT)

The tree-of-thoughts (ToT) and chain-of-thought (CoT) frameworks serve as conceptual algorithms for understanding the organization and progression of text generation in language models (LMs), including generative pretrained transformers such as GPT-3 and GPT-4. These prompting techniques are a part of prompt engineering, which involves crafting inputs (prompts) to effectively guide LMs in generating preferred outputs.

Tree-of-thoughts prompting: This framework operates on the model’s ability to generate text hierarchically, with a central topic or idea leading to branching subtopics and details. This approach mirrors how a model can expand on a specific prompt by generating increasingly specific and related text, similar to a tree structure. It allows for lookahead and tree search strategies, where the model can explore multiple branches before committing to a path, making it suitable for general problem solving and scenarios requiring complex decision-making. This method incorporates common sense reasoning and heuristics to evaluate the quality of each branch. The self-consistency mechanism is employed to provide reliable evaluations by prompting the model multiple times.

Chain-of-thought prompting: In contrast, this concept corresponds to the model's capacity to generate text in a linear, left-to-right fashion, where each subsequent token is directly influenced by the preceding tokens. This sequential progression reflects a simpler, more straightforward approach to text generation. CoT is effective for tasks that require a clear, step-by-step logical flow. Few-shot learning, where the model is provided with a few examples to learn from, can enhance this method by providing contextual understanding. CoT serves as a baseline technique in prompt engineering, offering a foundational method that is simpler to implement but may lack the depth and complexity of ToT.

Comparison and applications: While ToT prompting represents a more intricate and interconnected approach to text generation, by using tree search and lookahead strategies, CoT reflects a simpler, sequential progression. ToT's hierarchical nature makes it suitable for tasks requiring detailed exploration of multiple solutions, such as reinforcement learning scenarios, where backtracking and alternative strategies are crucial. However, CoT's linear progression is ideal for tasks that need a clear, logical sequence of thoughts.

In practical applications, prompting techniques such as ToT and CoT are applied to LMs including GPT-3 and GPT-4, typically through their APIs, to improve performance in diverse tasks, from creative writing to complex problem solving[2]. Prompt engineering continues to evolve, providing powerful tools for harnessing the capabilities of advanced transformers in language models.

How does tree-of-thoughts work?

The tree-of-thoughts (ToT) framework guides large language models (LLMs) through a series of reasoning steps, where each step can branch into multiple paths, allowing the model to backtrack or explore alternative strategies as needed. For example, in solving a sudoku puzzle, ToT might guide the model to explore different number placements in a trial-and-error fashion. The model backtracks when a number leads to a contradiction and tries a different number until the puzzle is solved. This mimics the human approach to problem-solving, where multiple solutions are considered and discarded if found incorrect[1][3].
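The trial-and-error loop described above can be sketched on a 4x4 sudoku. This is a minimal illustration under stated assumptions: a rule-based checker (`is_valid`) stands in for the LLM's judgment, and the puzzle, the grid size and all function names are hypothetical choices made for this example rather than part of any ToT library.

```python
from typing import List

Grid = List[List[int]]  # 0 marks an empty cell


def is_valid(grid: Grid, row: int, col: int, digit: int) -> bool:
    """Check that `digit` does not clash with its row, column, or 2x2 box."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(4)):
        return False
    box_r, box_c = 2 * (row // 2), 2 * (col // 2)
    return all(
        grid[r][c] != digit
        for r in range(box_r, box_r + 2)
        for c in range(box_c, box_c + 2)
    )


def solve(grid: Grid) -> bool:
    """Fill `grid` in place; return False when a branch hits a contradiction."""
    for row in range(4):
        for col in range(4):
            if grid[row][col] != 0:
                continue
            for digit in range(1, 5):        # each candidate digit is one branch
                if is_valid(grid, row, col, digit):
                    grid[row][col] = digit   # tentatively commit to this branch
                    if solve(grid):
                        return True
                    grid[row][col] = 0       # contradiction further down: backtrack
            return False                     # no digit fits this cell
    return True                              # no empty cells remain


# Hypothetical example puzzle.
puzzle = [
    [1, 0, 0, 0],
    [0, 0, 3, 0],
    [0, 4, 0, 0],
    [0, 0, 0, 2],
]
solved = solve(puzzle)
```

In a ToT setting, the candidate digits would be proposed by the model and the contradiction check could be a separate checker component; the backtracking structure stays the same.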

Framework for tree-of-thoughts (ToT)

Tree-of-thoughts (ToT) enhances the problem-solving capabilities of large language models (LLMs) by structuring their reasoning in a manner analogous to human cognitive processes. The framework is composed of four key components:

  1. Thought decomposition: The ToT framework explicitly breaks a problem into smaller, manageable steps, called thoughts, which are pieced together to form a solution. Each thought should be the right size: not too large to handle and not too small to be useful. For example, if you’re planning a trip, a thought might involve deciding on a travel destination first, then choosing the best mode of transportation and finally picking a place to stay. In a mathematical problem, by contrast, a thought might be a single equation line or a concise explanation of a concept. This way, the problem is broken down into key steps that are easy to tackle and evaluate individually. The decomposition depends on the nature of the problem, making sure that thoughts are both significant and feasible for evaluation.
  2. Thought generation: After defining what constitutes a thought, the next step is to determine how these thoughts are generated. The framework proposes two primary techniques[4]:
    • Sampling: This technique involves generating several thoughts independently by using the same prompt. It works best when the thought space is rich and diverse, as independently generated thoughts are less likely to be duplicated. For example, in creative writing, multiple independent plot ideas might be generated.
    • Proposing: This technique generates thoughts sequentially by using a "propose prompt." Each thought is built upon the previous one, which helps avoid duplication in more constrained thought spaces. For example, in logical problem-solving, each step builds on the previous one to ensure consistency and progress.
  3. State evaluation: Once thoughts are generated, they need to be evaluated to ensure progress toward a solution. The framework employs two strategies for this purpose:
    • Value: This strategy involves assigning a scalar value (for example, a rating from 1-10) or a classification (for example, sure, likely or impossible) to each state. This helps indicate the state's quality or likelihood of leading to a solution. This method allows for a quantitative assessment of each thought's potential.
    • Vote: This strategy compares different solutions and selects the most promising one. Voting is particularly useful for tasks where the quality of a solution is subjective or hard to quantify, such as in creative writing or strategic planning. Multiple evaluations combine to determine the best path forward.
  4. Search algorithm: The final component involves the search algorithm used to navigate through the solution space. The framework typically employs two fundamental algorithms:
    • Breadth-first search (BFS): This algorithm explores all possible branches at each level before moving deeper into the tree. It makes sure that all potential solutions are considered equally, making it useful for problems where the shortest path or shallowest solution is preferred. For example, in a puzzle game, BFS would check all immediate moves before considering subsequent ones.
    • Depth-first search (DFS): This algorithm explores one branch deeply before backtracking to explore other branches. It allows for a thorough examination of each potential solution path, making it useful for problems requiring detailed exploration of each option. For example, in solving a complex logic problem, DFS would follow a single hypothesis deeply, checking its validity before considering alternatives.
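The four components can be wired together in a few lines. The sketch below is a hedged illustration, not the authors' implementation: `propose` and `value` are plain Python stand-ins for LLM calls, the `beam_width` cutoff is an assumption added to keep the frontier small (a large enough value keeps every branch), and the letter-spelling task is a hypothetical toy problem.

```python
from typing import Callable, List, Sequence

State = str  # a partial solution: the thoughts chosen so far, concatenated


def tot_bfs(
    root: State,
    propose: Callable[[State], Sequence[State]],
    value: Callable[[State], float],
    depth: int,
    beam_width: int,
) -> List[State]:
    """Expand the tree level by level, keeping the best `beam_width` states."""
    frontier: List[State] = [root]
    for _ in range(depth):
        # Thought generation: every surviving state proposes its next thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        # State evaluation: score each candidate, highest value first.
        candidates.sort(key=value, reverse=True)
        # Search: only the most promising states move on to the next level.
        frontier = candidates[:beam_width]
    return frontier


# Toy problem (hypothetical): spell TARGET one letter at a time.
TARGET = "cat"
ALPHABET = "act"


def propose_letters(state: State) -> List[State]:
    """Each thought appends one letter to the partial word."""
    return [state + letter for letter in ALPHABET]


def count_matches(state: State) -> float:
    """Value heuristic: number of positions that already agree with TARGET."""
    return float(sum(a == b for a, b in zip(state, TARGET)))


best = tot_bfs("", propose_letters, count_matches, depth=len(TARGET), beam_width=2)
```

Here thought decomposition is the choice of "one letter per step", generation is `propose_letters`, evaluation is `count_matches` and the search is breadth-first. Swapping the two callables for prompts to a model would leave `tot_bfs` unchanged.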

By integrating these components, the tree-of-thoughts (ToT) framework mimics human problem-solving by systematically considering multiple solutions and discarding the ones that are found incorrect.


The operational dynamics of the tree-of-thoughts (ToT) framework involve an iterative, tree-structured exploration of possible solutions. Starting with the initial prompt, the model generates a range of thoughts or answers, each leading to subsequent queries or expansions. These branches develop as the model explores different reasoning paths. The model tracks its progress through the solution space and uses LLM-powered self-evaluation to check the validity of each step. If a particular line of reasoning reaches a contradiction or dead end, the system can backtrack to a previous node to explore alternative possibilities.
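The explore, evaluate and backtrack cycle can be sketched as a depth-first search with pruning. This is an illustrative sketch under assumptions: the evaluator is a simple feasibility rule rather than an LLM self-evaluation, the `threshold` parameter and the `visited` log are additions made for this example, and the subset-sum task is a hypothetical toy problem.

```python
from typing import Callable, List, Optional, Sequence, Tuple

State = Tuple[int, ...]  # the numbers chosen so far


def tot_dfs(
    state: State,
    propose: Callable[[State], Sequence[State]],
    value: Callable[[State], float],
    is_goal: Callable[[State], bool],
    threshold: float,
    visited: List[State],
) -> Optional[State]:
    """Follow the most promising branch first; return None on a dead end."""
    visited.append(state)  # record the path taken, for inspection only
    if is_goal(state):
        return state
    for child in sorted(propose(state), key=value, reverse=True):
        if value(child) < threshold:
            continue  # evaluator judges this branch hopeless: prune it
        found = tot_dfs(child, propose, value, is_goal, threshold, visited)
        if found is not None:
            return found
    return None  # every branch failed: the caller backtracks


# Toy problem (hypothetical): choose numbers, in order, that sum to TARGET.
NUMBERS = (4, 6, 5, 2)
TARGET = 7


def propose_next(state: State) -> List[State]:
    """Extend the state with any number that comes after the last one chosen."""
    start = NUMBERS.index(state[-1]) + 1 if state else 0
    return [state + (n,) for n in NUMBERS[start:]]


def feasibility(state: State) -> float:
    """1.0 while the running sum can still reach TARGET, 0.0 once it overshoots."""
    return 1.0 if sum(state) <= TARGET else 0.0


def reached_target(state: State) -> bool:
    return sum(state) == TARGET


visited: List[State] = []
solution = tot_dfs((), propose_next, feasibility, reached_target, 0.5, visited)
```

Inspecting `visited` after the run shows the search entering the branch that starts with 4, exhausting it, and returning to the root before it finds a solution elsewhere.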

This structured yet flexible approach allows large language models (LLMs) to handle complex, multistep reasoning tasks more effectively. It resembles the human ability to navigate through a maze of thoughts and options, reassessing and adjusting strategies as needed.

In essence, the tree-of-thoughts (ToT) framework equips large language models with a more human-like ability to reason and solve problems, enhancing their effectiveness in tasks that require deep, strategic thinking and decision-making.

Advantages and limitations of tree-of-thoughts (ToT)

The tree-of-thoughts (ToT) framework represents a significant advancement in the capabilities of large language models (LLMs) for complex problem-solving. However, there are tradeoffs involving the added complexity inherent in the implementation of this framework.


The framework offers benefits to the field of artificial intelligence including:

Enhanced problem-solving abilities

ToT significantly improves the problem-solving skills of LLMs by enabling them to explore multiple reasoning paths simultaneously. This mirrors human cognitive processes where several potential solutions are considered and the most viable one is selected. For instance, in tasks requiring strategic thinking or planning, such as solving word puzzles or generating creative writing, ToT has demonstrated superior performance, achieving higher success rates compared to traditional methods. This increased capacity for complex reasoning by decomposing the intermediate steps is especially evident in challenging tasks where initial decisions greatly influence outcomes[4].

Handling of uncertainty

The tree-of-uncertain-thoughts (TouT), an extension of ToT, specifically addresses the inherent uncertainties present in the decision-making processes of LLMs. By quantifying and managing these uncertainties, TouT allows for more accurate and reliable outcomes. It uses techniques such as Monte Carlo dropout, which is used in machine learning, particularly in deep learning models, to estimate uncertainty in predictions. The technique involves randomly dropping out neurons during both training and inference, which creates multiple different "paths" through the network. Averaging the predictions from these paths gives a more stable prediction, and the spread between them gives an estimate of its uncertainty. This technique is valuable in applications where precise and trustworthy predictions are essential, such as medical diagnosis or financial forecasting[5].
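To make the idea of averaging over dropout "paths" concrete, here is a minimal sketch of Monte Carlo dropout at inference time on a tiny hand-written network. Everything in it is an assumption for illustration: the weights, the network shape and the function names are hypothetical, training is omitted, and it does not reproduce how TouT applies the technique to an LLM.

```python
import random
import statistics
from typing import List, Tuple

# Hypothetical fixed weights: 2 inputs, 4 hidden ReLU units, 1 output.
W_HIDDEN = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4], [0.7, 0.6]]
W_OUT = [0.9, -0.5, 0.3, 0.2]


def forward(x: List[float], drop_rate: float, rng: random.Random) -> float:
    """One stochastic forward pass with dropout applied to the hidden layer."""
    keep = 1.0 - drop_rate  # drop_rate must be below 1.0
    output = 0.0
    for weights, w_out in zip(W_HIDDEN, W_OUT):
        activation = max(0.0, sum(w * xi for w, xi in zip(weights, x)))  # ReLU
        if rng.random() < drop_rate:
            continue  # this hidden unit is dropped on this pass
        output += w_out * activation / keep  # inverted-dropout rescaling
    return output


def mc_dropout_predict(
    x: List[float], drop_rate: float = 0.5, passes: int = 200, seed: int = 0
) -> Tuple[float, float]:
    """Return (mean prediction, standard deviation) over many dropout passes."""
    rng = random.Random(seed)
    samples = [forward(x, drop_rate, rng) for _ in range(passes)]
    return statistics.mean(samples), statistics.pstdev(samples)


mean, spread = mc_dropout_predict([1.0, 2.0])
```

The mean serves as the prediction and the standard deviation as a rough uncertainty signal. With `drop_rate=0.0` every pass is identical, so the spread collapses to zero.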


Along with the benefits, there are some inherent limitations that must be considered.

Computational overhead

The tree-of-thoughts (ToT) framework involves complex operations such as maintaining multiple decision paths, backtracking and exploring alternative solutions. These processes are computationally intensive, often requiring significant resources in terms of processing power and memory. The need for resources can limit the scalability of tree-of-thoughts (ToT), especially in environments where computational resources are constrained or in real-time applications where rapid response times are critical.
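A rough way to see where the overhead comes from is to count how many states must be generated and evaluated. The functions below are a back-of-the-envelope sketch: they assume a fixed number of thoughts per state (`branching`), a fixed `depth`, and one model call per generated state, none of which is specified by the framework itself.

```python
def full_tree_states(branching: int, depth: int) -> int:
    """States generated when every branch is expanded at every level."""
    return sum(branching ** level for level in range(1, depth + 1))


def beam_search_states(branching: int, depth: int, beam_width: int) -> int:
    """States generated when only `beam_width` states survive each level."""
    total, frontier = 0, 1
    for _ in range(depth):
        generated = frontier * branching
        total += generated
        frontier = min(beam_width, generated)
    return total


def chain_states(depth: int) -> int:
    """A single linear chain generates one state per step."""
    return depth


exhaustive = full_tree_states(branching=5, depth=3)
pruned = beam_search_states(branching=5, depth=3, beam_width=5)
linear = chain_states(depth=3)
```

Under these assumptions, exhaustive expansion grows exponentially with depth, pruning to a fixed number of survivors grows linearly, and both remain a multiple of the cost of a single chain.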

Implementation complexity

Setting up a tree-of-thoughts (ToT) system involves integrating various components such as the prompter agent, checker module, memory module and tree-of-thoughts (ToT) controller[1]. Each component must be finely tuned to work in harmony, which can be a complex and time-consuming process. Moreover, the system’s efficacy heavily depends on the quality of its implementation. Poor configuration of any component can reduce the effectiveness of the entire system, making it less reliable or leading to incorrect problem-solving pathways.

Case studies

The tree-of-thoughts (ToT) framework has demonstrated its efficacy across various applications, showcasing its robustness and adaptability. Here, we explore four compelling case studies where ToT has significantly enhanced problem-solving capabilities:

Sudoku puzzle solving

The application of tree-of-thoughts (ToT) to sudoku puzzle solving exemplifies its capacity to navigate complex logical challenges. By guiding the model through various number placements and enabling it to backtrack upon encountering contradictions, ToT streamlines the path to correct solutions. This ability to dynamically reassess decisions improves problem-solving accuracy and efficiency, highlighting ToT's advantage over more static problem-solving approaches[1].

Game of 24

In the strategic arithmetic game of 24, tree-of-thoughts (ToT) significantly improved success rates by enabling the model to explore multiple calculation paths. This adaptive reasoning process allowed the model to solve puzzles more creatively and effectively, demonstrating ToT's capacity for enhancing cognitive flexibility in numerical problem-solving[4].
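The tree structure of this task is easy to see in code: each thought combines two of the remaining numbers into one, so four numbers yield a tree three levels deep. The sketch below is an assumption-laden stand-in: it enumerates every possible thought exhaustively instead of asking a model to propose and rate a few, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Optional, Sequence, Tuple

# Each remaining number is paired with the expression that produced it.
State = Tuple[Tuple[Fraction, str], ...]


def propose(state: State) -> List[State]:
    """All one-step thoughts: combine two remaining numbers with one operation."""
    children: List[State] = []
    for i, j in combinations(range(len(state)), 2):
        (a, ea), (b, eb) = state[i], state[j]
        rest = tuple(state[k] for k in range(len(state)) if k not in (i, j))
        options = [
            (a + b, f"({ea} + {eb})"),
            (a * b, f"({ea} * {eb})"),
            (a - b, f"({ea} - {eb})"),
            (b - a, f"({eb} - {ea})"),
        ]
        if b != 0:
            options.append((a / b, f"({ea} / {eb})"))
        if a != 0:
            options.append((b / a, f"({eb} / {ea})"))
        children.extend(rest + (option,) for option in options)
    return children


def solve_24(numbers: Sequence[int], target: int = 24) -> Optional[str]:
    """Depth-first search over thoughts; returns an expression or None."""

    def dfs(state: State) -> Optional[str]:
        if len(state) == 1:
            value, expression = state[0]
            return expression if value == target else None
        for child in propose(state):
            found = dfs(child)
            if found is not None:
                return found
        return None  # dead end: backtrack to the previous combination

    return dfs(tuple((Fraction(n), str(n)) for n in numbers))


expression = solve_24([4, 9, 10, 13])
```

Exact `Fraction` arithmetic avoids floating-point errors in intermediate divisions. A ToT-style solver would replace the exhaustive `propose` with a handful of model-suggested steps and prune states the evaluator rates as unlikely to reach 24.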

Creative writing

Tree-of-thoughts (ToT) has also been applied to creative writing tasks, where it aids large language models (LLMs) in generating more coherent and contextually appropriate narratives. By structuring the thought process into a branching tree, the model can explore different plot developments or stylistic choices, then select or revise based on the most promising outcomes. This method has led to improvements in the quality and originality of text generated by large language models (LLMs), providing a more nuanced approach to automated storytelling[4].

5x5 crossword solving

Another application of tree-of-thoughts (ToT) is in solving 5x5 mini crossword puzzles. The framework enables the model to consider multiple word options for each crossword clue, evaluating them not just in isolation but also in terms of how they interact with already placed words. This iterative, holistic assessment approach ensures higher accuracy in puzzle completion and demonstrates ToT's ability to apply logical and contextual reasoning in linguistically complex tasks. The use of ToT in this context highlights its versatility and effectiveness in tasks that require integration of multiple types of knowledge and reasoning strategies[4].
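The interaction check between a candidate word and the words already on the grid can be sketched as a simple constraint filter. This is a hypothetical illustration: the candidate lists are hard-coded stand-ins for words a model might propose for a clue, and the helper names are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) on the 5x5 grid


def cells_for(direction: str, index: int) -> List[Cell]:
    """Cells covered by a word: row `index` for 'across', column `index` for 'down'."""
    if direction == "across":
        return [(index, col) for col in range(5)]
    return [(row, index) for row in range(5)]


def fits(word: str, cells: List[Cell], placed: Dict[Cell, str]) -> bool:
    """A word fits if it agrees with every letter already placed in its cells."""
    return all(placed.get(cell, letter) == letter for cell, letter in zip(cells, word))


def place(word: str, cells: List[Cell], placed: Dict[Cell, str]) -> Dict[Cell, str]:
    """Return a new grid state with `word` written into `cells`."""
    updated = dict(placed)
    updated.update(zip(cells, word))
    return updated


# Hypothetical walk-through: one across word, then candidates that cross it.
grid = place("apple", cells_for("across", 0), {})
down_candidates = ["angel", "bread", "amber"]
down_fits = [w for w in down_candidates if fits(w, cells_for("down", 0), grid)]

grid = place("angel", cells_for("down", 0), grid)
across_candidates = ["night", "north", "light"]
across_fits = [w for w in across_candidates if fits(w, cells_for("across", 1), grid)]
```

Because `place` returns a new dictionary instead of mutating the old one, a search procedure can keep earlier grid states around and return to them when a later clue has no fitting candidate.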

These case studies illustrate the diverse capabilities of the tree-of-thoughts framework, from enhancing logical and numerical reasoning to boosting creativity and contextual understanding in language-based tasks. Each example underscores ToT's potential to improve problem-solving across disciplines.

Recent advancements

Recent advancements in tree-of-thoughts (ToT) research have focused on expanding its capabilities and addressing inherent challenges in its application. Key developments include:

  1. Uncertainty quantification: The introduction of the tree-of-uncertain-thoughts (TouT) marks a significant advancement in tree-of-thoughts (ToT) research. TouT enhances ToT by integrating uncertainty quantification mechanisms that assess the reliability of each decision path. This development is crucial for applications where decisions must be made under conditions of uncertainty and where the cost of mistakes can be high[5].
  2. Global decision-making: Further research has focused on enhancing the global decision-making abilities of large language models (LLMs) when using tree-of-thoughts (ToT). Recent studies have introduced feedback loops into the framework, allowing models to learn from past decisions and adjust their reasoning processes in real-time. This iterative feedback mechanism helps refine the decision-making process, making it more dynamic and responsive to the evolving context of the problem. Such enhancements aim to bring the reasoning capabilities of large language models (LLMs) closer to human cognitive processes, where learning from past experiences plays a crucial role in shaping future decisions[4].

These recent developments underscore the ongoing efforts to refine and expand the tree-of-thoughts framework, ensuring its applicability and effectiveness in increasingly complex problem-solving scenarios. These advancements not only enhance the capabilities of LLMs but also open up new avenues for research and application in artificial intelligence.

[1] Long, J. (May 2023). Large Language Model Guided Tree-of-Thought.

[2] Karthik Narasimhan, S. Y. (July 2023). Official Repository of Tree of Thoughts (ToT). https://github.com/princeton-nlp/tree-of-thought-llm

[3] Pengfei Liu, W. Y. (2021). Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Computing Surveys.

[4] Shunyu Yao, D. Y. (2023). Tree of Thoughts: Deliberate Problem Solving with Large Language Models. ArXiv, abs/2305.10601. https://arxiv.org/abs/2305.10601

[5] Shentong Mo, M. X. (September 2023). Tree of Uncertain Thoughts Reasoning for Large Language Models. ArXiv, abs/2309.07694. https://arxiv.org/abs/2309.07694