Date: 2025-04-16

Step 1. Install dependencies

Next, install the Python package dependencies for this notebook. Granite utils provide some helpful functions for recipes.

! pip install git+https://github.com/ibm-granite-community/utils.git \
    langchain_community \
    transformers \
    replicate

Step 2. Select your model

Select a Granite model from the ibm-granite org on Replicate. A smaller model (Granite-3.2:2b-instruct) is also available, but this tutorial uses Granite-3.3:8b-instruct as the default. Model size matters here: the ability to handle tasks such as logic and math without being explicitly trained for them, also referred to as emergent reasoning, tends to appear naturally as models scale.

Here we use the Replicate LangChain client to connect to the model.

To get set up with Replicate, see Getting Started with Replicate.

To connect to a model on a provider other than Replicate, substitute this code cell with one from the LLM component recipe.

from ibm_granite_community.notebook_utils import get_env_var
from langchain_community.llms import Replicate
from transformers import AutoTokenizer

model_path = "ibm-granite/granite-3.3-8b-instruct"
model = Replicate(
    model=model_path,
    replicate_api_token=get_env_var("REPLICATE_API_TOKEN"),
    model_kwargs={
        "max_tokens": 4000,  # Set the maximum number of tokens to generate as output.
        "min_tokens": 200,  # Set the minimum number of tokens to generate as output.
        "temperature": 0.0,  # Lower the temperature
    },
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

Step 3. Set up two prompts

Next, create two prompt chains. The first chain uses the model's normal response mode, without chain of thought reasoning; this is the default prompt mode for Granite. The second chain uses the chain of thought reasoning response mode, which is enabled by passing thinking=True to the chat template. Doing so adds specific instructions to the system prompt that activate the model's internal reasoning process, so the response contains the reasoning steps. By exploring variants of chain of thought prompting, you can experiment with how the models approach decision-making, making them more adaptable to a wide range of tasks.

from langchain.prompts import PromptTemplate

# Create a Granite prompt without chain of thought reasoning
prompt = tokenizer.apply_chat_template(
    conversation=[{
        "role": "user",
        "content": "{input}",
    }],
    add_generation_prompt=True,
    tokenize=False,
)
prompt_template = PromptTemplate.from_template(template=prompt)
chain = prompt_template | model

# Create a Granite prompt by using chain of thought reasoning
reasoning_prompt = tokenizer.apply_chat_template(
    conversation=[{
        "role": "user",
        "content": "{input}",
    }],
    thinking=True,  # Use chain-of-thought reasoning
    add_generation_prompt=True,
    tokenize=False,
)
reasoning_prompt_template = PromptTemplate.from_template(template=reasoning_prompt)
reasoning_chain = reasoning_prompt_template | model
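
To make the role of the {input} placeholder concrete, here is a minimal sketch that uses Python's built-in str.format in place of PromptTemplate. The template string and the fill_prompt helper are hypothetical stand-ins for illustration; the real Granite template text comes from tokenizer.apply_chat_template and will differ.

```python
# Hypothetical stand-in for a rendered chat template; the real Granite
# template text is produced by tokenizer.apply_chat_template and differs.
rendered_template = "[user] {input} [assistant] "


def fill_prompt(template: str, user_input: str) -> str:
    """Substitute the user's question for the {input} placeholder."""
    return template.format(input=user_input)


filled = fill_prompt(rendered_template, "Which one is greater, 9.11 or 9.9?")
print(filled)
```

Because the placeholder passes through the chat template as literal text, the same rendered string can be reused for every question.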

Now that the prompts have been created, compare them to see the additional text in the reasoning prompt that activates the Granite model's internal reasoning process.

NOTE: This additional prompt text is specific to the chat template for the version of Granite used and can change in future releases of Granite.

import difflib
from ibm_granite_community.notebook_utils import wrap_text

print("==== System prompt instructions for chain-of-thought reasoning ====")
diff = difflib.ndiff(prompt, reasoning_prompt)
print(wrap_text("".join(d[-1] for d in diff if d[0] == "+"), indent=" "))
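
The character-level diff used above can be tried on its own, without a model or tokenizer. The following is a small sketch with two made-up prompt strings (both hypothetical, not the real Granite prompts); the added_characters helper is also introduced here only for illustration.

```python
import difflib


def added_characters(before: str, after: str) -> str:
    """Return the characters that ndiff marks as present only in `after`."""
    # When given two strings, ndiff compares them character by character.
    # Each entry looks like "+ X" (only in `after`), "- X" (only in
    # `before`) or "  X" (present in both).
    return "".join(
        entry[2:]
        for entry in difflib.ndiff(before, after)
        if entry.startswith("+ ")
    )


# Hypothetical prompts: the second one has extra instructions in front.
plain = "user: {input}"
extended = "system: think. user: {input}"
added = added_characters(plain, extended)
print(repr(added))
```

For long inputs, difflib applies matching heuristics, so the extracted text is a valid diff but is not guaranteed to be the shortest possible one.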

Step 4. Compare the responses of the two prompts

First, define a helper function that takes a question and answers it with both prompts. The function displays the question, then the response from the normal prompt (without chain of thought), followed by the step-by-step response from the chain of thought reasoning prompt.

def question(question: str) -> None:
    print("==== Question ====")
    print(wrap_text(question, indent=" "))

    print("==== Normal prompt response ====")
    output = chain.invoke({"input": question})
    print(wrap_text(output, indent=" "))

    print("\n==== Reasoning prompt response ====")
    reasoning_output = reasoning_chain.invoke({"input": question})
    print(wrap_text(reasoning_output, indent=" "))
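
If you want to handle the reasoning steps and the final answer separately, a small parser can split them. The sketch below assumes the reasoning output wraps its thoughts in <think> tags and its answer in <response> tags. This is an assumption about the output format that the tutorial does not state, so check it against the actual output of your model version. The split_reasoning helper and the sample string are hypothetical.

```python
import re


def split_reasoning(output: str) -> tuple[str, str]:
    """Split model output into (thoughts, answer).

    Assumes (hypothetically) that thoughts are wrapped in <think> tags and
    the answer in <response> tags. If no <response> tag is found, the
    whole output is returned as the answer.
    """
    think = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    response = re.search(r"<response>(.*?)</response>", output, flags=re.DOTALL)
    thoughts = think.group(1).strip() if think else ""
    answer = response.group(1).strip() if response else output.strip()
    return thoughts, answer


# Made-up sample output, used only to exercise the parser.
sample = "<think>The brothers share the same sisters.</think><response>1</response>"
thoughts, answer = split_reasoning(sample)
print("Thoughts:", thoughts)
print("Answer:", answer)
```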

Step 5. Chain of thought reasoning use cases

In this example, chain of thought prompting supports logical problem-solving by having the model summarize the given relationships before analyzing them in detail. This helps ensure that each part of the problem is clearly understood and leads to an accurate conclusion.

question("""\
Sally is a girl and has 3 brothers.
Each brother has 2 sisters.
How many sisters does Sally have?\
""")
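
For checking the model's response, the expected answer can be worked out explicitly. This short sketch is not part of the original notebook and assumes all the siblings share the same parents.

```python
# Reference answer for the puzzle, assuming all siblings share parents.
sisters_per_brother = 2  # given: each brother has 2 sisters

# The girls in the family are exactly the sisters each brother counts.
girls_in_family = sisters_per_brother

# Sally is one of those girls and is not her own sister.
sallys_sisters = girls_in_family - 1
print(sallys_sisters)  # 1
```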

The following example demonstrates how chain of thought prompting helps large language models handle basic decision-making and comparison-based problem-solving. Spelling out the reasoning path makes the model's answer more transparent and easier to verify, turning a simple question into a short exercise in decision-making.

question("""\
Which of the following items weigh more: a pound of water, two pounds of bricks, a pound of feathers, or three pounds of air?\
""")
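
As a reference for judging the two responses, the comparison reduces to picking the largest stated weight. The dictionary below simply restates the quantities from the question.

```python
# Weights exactly as stated in the question, in pounds.
weights_in_pounds = {"water": 1, "bricks": 2, "feathers": 1, "air": 3}

# The material does not matter; only the stated weight does.
heaviest = max(weights_in_pounds, key=weights_in_pounds.get)
print(heaviest)  # air
```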

This next example highlights how chain of thought prompting allows large language models to work through basic numerical comparisons with greater clarity. By encouraging step-by-step reasoning, even simple math-based questions become transparent exercises in evaluating magnitude and numerical relationships.

question("""\
Which one is greater, 9.11 or 9.9?\
""")
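
The expected answer can be confirmed with Python's decimal module, which compares the values numerically (9.11 against 9.90). This check is an addition for reference, not part of the original notebook.

```python
from decimal import Decimal

# Compare the two values as decimal numbers: 9.11 versus 9.90.
first, second = Decimal("9.11"), Decimal("9.9")
greater_number = max(first, second)
print(greater_number)  # 9.9
```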

Building on the previous example of comparing decimal numbers, this question explores how the context of versioning can change the interpretation of similar-looking values. Chain of thought prompting helps clarify the subtle difference between numerical and version-based comparisons, guiding the model to apply reasoning that's sensitive to real-world conventions.

question("""\
Which version number is greater, 9.11 or 9.9?\
""")
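
Under the common convention that each dot-separated component of a version is an integer, the comparison flips. The sketch below illustrates that convention with a hypothetical parse_version helper; it handles only purely numeric versions.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn "9.11" into (9, 11) so components compare as integers."""
    return tuple(int(part) for part in version.split("."))


# As versions, minor component 11 is later than minor component 9.
greater_version = max("9.11", "9.9", key=parse_version)
print(greater_version)  # 9.11
```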

Continuing the exploration of version comparisons, this example introduces Maven versioning and the impact of prerelease identifiers such as -rc1 (release candidate). Chain of thought prompting allows the model to navigate domain-specific rules, such as the precedence of prerelease qualifiers, making it easier to reason about which version is considered "greater" in practical software versioning contexts.

question("""\
Which Maven version number is greater, 9.9-rc1 or 9.9?\
""")
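
The rule the model needs here is that a release candidate sorts before the release it leads up to. The following is a simplified sketch of that ordering, not Maven's full comparison algorithm: it handles only versions of the form NUMBERS[-qualifierN] with a handful of qualifiers, and the sort_key helper is hypothetical.

```python
import re

# Simplified qualifier ordering; an empty qualifier means a final release.
QUALIFIER_RANK = {"alpha": 0, "beta": 1, "milestone": 2, "rc": 3, "snapshot": 4, "": 5}


def sort_key(version: str) -> tuple[tuple[int, ...], int, int]:
    """Build a sortable key for versions such as "9.9" or "9.9-rc1"."""
    numbers, _, qualifier = version.lower().partition("-")
    # Split a qualifier such as "rc1" into its name and its number.
    name, digits = re.fullmatch(r"([a-z]*)(\d*)", qualifier).groups()
    return (
        tuple(int(part) for part in numbers.split(".")),
        QUALIFIER_RANK[name],  # raises KeyError for unsupported qualifiers
        int(digits or 0),
    )


greater_maven = max("9.9-rc1", "9.9", key=sort_key)
print(greater_maven)  # 9.9
```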

Chain of thought prompting helps models solve math word problems by breaking them down into clear, step-by-step reasoning. Instead of jumping to the final answer, the model explains how quantities and percentages relate, much as a student might work through a mixture problem.

question("""\
You have 10 liters of a 30% acid solution.
How many liters of a 70% acid solution must be added to achieve a 50% acid mixture?\
""")
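
The expected answer follows from balancing the amount of acid before and after mixing: 0.30 * 10 + 0.70 * x = 0.50 * (10 + x). The sketch below solves that equation with exact fractions; the liters_to_add helper is hypothetical and is included only as a reference for checking the model's response.

```python
from fractions import Fraction


def liters_to_add(volume, start, added, target):
    """Liters of the `added` concentration needed to reach `target`.

    Solves start * volume + added * x = target * (volume + x) for x.
    """
    return volume * (target - start) / (added - target)


liters = liters_to_add(
    Fraction(10),        # liters of the starting solution
    Fraction(30, 100),   # starting concentration
    Fraction(70, 100),   # concentration of the solution being added
    Fraction(50, 100),   # target concentration
)
print(liters)  # 10
```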

The final example demonstrates how chain of thought prompting can support geometric reasoning by breaking down shape properties and applying fundamental rules, such as angle sums in triangles. It shows how a model can translate a brief problem statement into a structured logical process, leading to a clear and correct conclusion.

question("""\
In an isosceles triangle, the vertex angle measures 40 degrees.
What is the measure of each base angle?\
""")
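
For reference, the expected answer comes from the angle sum of a triangle: the two equal base angles share whatever remains of 180 degrees after the vertex angle. The base_angle helper below is a hypothetical illustration of that rule.

```python
def base_angle(vertex_angle_degrees: float) -> float:
    """Each base angle of an isosceles triangle, in degrees."""
    # The three angles sum to 180 and the two base angles are equal.
    return (180 - vertex_angle_degrees) / 2


print(base_angle(40))  # 70.0
```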