28 March, 2025
Tool calling in Large Language Models (LLMs) is the ability of the LLM to interact with external tools, services, or APIs to perform tasks. This allows LLMs to extend their functionality, enhancing their ability to handle real-world tasks that may require access to external data, real-time information, or specific applications. When an LLM uses a web search tool, it can call the web to fetch real-time data that isn't available in the model's training data. Other types of tools might include Python for calculations, data analysis, or visualization, or calling a service endpoint for data. Tool calling can make a chatbot more dynamic and adaptable, allowing it to provide more accurate, relevant, and detailed responses based on live data or specialized tasks outside its immediate knowledge base. Popular frameworks for tool calling include LangChain and now ollama.
Ollama is a platform that offers open-source, local AI models for use on personal devices so that users can run LLMs directly on their computers. Unlike a service like the OpenAI API, there’s no need for an account since the model is on your local machine. Ollama focuses on privacy, performance, and ease of use, enabling users to access and interact with AI models without sending data to external servers. This can be particularly appealing for those concerned about data privacy or who want to avoid the reliance on external APIs. Ollama’s platform is designed to be easy to set up and use, and it supports various models, giving users a range of tools for natural language processing, code generation, and other AI tasks directly on their own hardware. It is well suited to a tool calling architecture because it can access all the capabilities of a local environment including data, programs, and custom software.
In this tutorial you'll learn how to set up tool calling by using ollama to look through a local filesystem, a task that would be difficult to do with a remote LLM. Many ollama models support tool calling and building AI agents, including Mistral and Llama 3.2; a full list is available on the ollama website. In this case we'll use IBM Granite 3.2 Dense, which has tool support. The 2B and 8B models are text-only dense LLMs designed to support tool-based use cases and retrieval augmented generation (RAG), streamlining code generation, translation, and bug fixing.
The notebook for this tutorial can be downloaded from GitHub here.
First, download ollama from https://ollama.com/download and install it for your operating system. On macOS this is done via a .dmg file, on Linux via a single shell command, and on Windows with an installer. You may need admin access on your machine to run the installer.
You can test that ollama is correctly installed by opening a terminal or command prompt and entering:
ollama -v
Next, you'll add the initial imports. This demo will use the ollama python library to communicate with ollama and the pymupdf library to read PDF files in the file system.
!pip install ollama pymupdf
import ollama
import os
import pymupdf
Next you'll pull the models that you'll use throughout this tutorial. This downloads the model weights from ollama to your local computer and stores them for use without needing to make any remote API calls later on.
!ollama pull granite3.2
!ollama pull granite3.2-vision
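If you'd like to confirm that both models downloaded successfully, you can list the models stored locally:
!ollama list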
Now you'll define the tools that the ollama tools instance will have access to. Since the intent of the tools is to read files and look through images in the local file system, you'll create one python function for each of those tools. The first is called search_text_files and it takes a keyword to search for in the local files. For the purposes of this demo, the code only searches for files in a specific folder, but it could be extended to take a second parameter that sets which folder the tool will search in.
You could use simple string matching to see whether the keyword is in the document, but because ollama makes calling local LLMs easy, search_text_files will instead use Granite 3.2 to determine whether the keyword describes the document text. This is done by reading the document into a string called document_text. The function then calls ollama.chat and prompts the model with the following:
"Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + document_text
If the model responds 'yes', then the function returns the name of the file that contains the keyword that the user indicated in the prompt. If none of the files seem to contain the information, then the function returns 'None' as a string.
This function may run slowly the first time because ollama will download Granite 3.2 Dense.
def search_text_files(keyword: str) -> str:
    directory = os.listdir("./files/")
    for fname in directory:
        # look through all the files in our directory that aren't hidden files
        if os.path.isfile("./files/" + fname) and not fname.startswith('.'):
            if fname.endswith(".pdf"):
                document_text = ""
                doc = pymupdf.open("./files/" + fname)
                for page in doc:  # iterate the document pages
                    document_text += page.get_text()  # get plain text (is in UTF-8)
                doc.close()
                prompt = "Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + document_text
                res = ollama.chat(
                    model="granite3.2:8b",
                    messages=[{'role': 'user', 'content': prompt}]
                )
                # accept 'yes' in any capitalization
                if 'yes' in res['message']['content'].lower():
                    return "./files/" + fname
            elif fname.endswith(".txt"):
                # read the plaintext file and close it when done
                with open("./files/" + fname, 'r') as f:
                    file_content = f.read()
                prompt = "Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + file_content
                res = ollama.chat(
                    model="granite3.2:8b",
                    messages=[{'role': 'user', 'content': prompt}]
                )
                if 'yes' in res['message']['content'].lower():
                    return "./files/" + fname
    return "None"
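Before wiring search_text_files into ollama's tool calling, you can optionally test it on its own. This assumes you've created a ./files/ folder next to the notebook containing the sample documents used later in this tutorial:
# optional manual test of the text search tool (assumes a ./files/ folder with sample documents)
print(search_text_files("dogs"))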
The second tool is called search_image_files and it takes a keyword to search for in the local photos. The search is done by using the Granite 3.2 Vision image description model via ollama. This model returns a text description of each image file in the folder, and the function searches for the keyword in that description. One of the strengths of using ollama is that multi-agent systems can easily be built in which one model calls another.
The function returns a string, which is the name of the file whose description contains the keyword that the user indicated in the prompt.
def search_image_files(keyword: str) -> str:
    directory = os.listdir("./files/")
    image_file_types = ("jpg", "png", "jpeg")
    for fname in directory:
        # only look at non-hidden image files in the directory
        if os.path.isfile("./files/" + fname) and not fname.startswith('.') and fname.endswith(image_file_types):
            # ask the vision model to describe the image
            res = ollama.chat(
                model="granite3.2-vision",
                messages=[
                    {
                        'role': 'user',
                        'content': 'Describe this image in short sentences. Use simple phrases first and then describe it more fully.',
                        'images': ["./files/" + fname]
                    }
                ]
            )
            # case-insensitive match of the keyword against the description
            if keyword.lower() in res['message']['content'].lower():
                return "./files/" + fname
    return "None"
Now that the functions for ollama to call have been defined, you'll configure the tool information for ollama itself. The first step is to create a dictionary that maps the name of each tool to its function for ollama function calling:
available_functions = {
    'Search inside text files': search_text_files,
    'Search inside image files': search_image_files
}
Next, configure a tools array to tell ollama what tools it will have access to and what those tools require. This is an array with one object schema per tool that tells the ollama tool calling framework how to call the tool and what it returns.
Both of the tools that you created earlier are functions that require a keyword parameter. Currently only functions are supported, although this may change in the future. The descriptions of the function and of the parameter help the model call the tool correctly. The description field for the function of each tool is passed to the LLM when it selects which tool to use. The description of the keyword is passed to the model when it generates the parameters that will be passed to the tool. Both of these are places you may look to fine-tune prompts when you create your own tool calling applications with ollama.
# tools don't need to be defined as an object but this helps pass the correct parameters
# to the tool call itself by giving the model a prompt of how the tool is to be used
ollama_tools = [
    {
        'type': 'function',
        'function': {
            'name': 'Search inside text files',
            'description': 'This tool searches in PDF or plaintext or text files in the local file system for descriptions or mentions of the keyword.',
            'parameters': {
                'type': 'object',
                'properties': {
                    'keyword': {
                        'type': 'string',
                        'description': 'Generate one keyword from the user request to search for in text files',
                    },
                },
                'required': ['keyword'],
            },
        },
    },
    {
        'type': 'function',
        'function': {
            'name': 'Search inside image files',
            'description': 'This tool searches for photos or image files in the local file system for the keyword.',
            'parameters': {
                'type': 'object',
                'properties': {
                    'keyword': {
                        'type': 'string',
                        'description': 'Generate one keyword from the user request to search for in image files',
                    },
                },
                'required': ['keyword'],
            },
        },
    },
]
You'll use this tools definition when you call ollama with user input.
Now it's time to pass user input to ollama and have it return the results of the tool calls. First, make sure that ollama is running on your system:
# if ollama is not currently running, start it
import subprocess
subprocess.Popen(["ollama","serve"], stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)
Running this cell returns a Popen object for the background process:
<Popen: returncode: None args: ['ollama', 'serve']>
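If you want to verify that the server is reachable before continuing, one quick check (a minimal sketch using the ollama library's list function, which asks the local server for the models it has stored) looks like this:
import time

# give the server a moment to start, then ask it for the locally stored models
time.sleep(2)
try:
    print(ollama.list())
except Exception as e:
    print("Could not reach the ollama server:", e)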
Now ask the user for input. You can also hardcode the input or retrieve it from a chat interface, depending on how you configure your application. The input function will wait for user input before continuing on.
# input
user_input = input("What would you like to search for?")
print(user_input)
As an example, if the user enters "Information about dogs" this cell will print:
Information about dogs
Now the user query is passed to ollama itself. The messages need a role for the user and the content that the user entered. This is passed to ollama using the chat function. The first parameter is the model you want to use, in this case Granite 3.2 Dense, then the message with the user input, and finally the tools array that you configured earlier.
The chat function will generate an output selecting which tool to use and what parameters should be passed to it in the subsequent tool calls.
messages = [{'role': 'user', 'content': user_input}]

response: ollama.ChatResponse = ollama.chat(
    # set which model we're using
    'granite3.2:8b',
    # use the message from the user
    messages=messages,
    tools=ollama_tools
)
Now that the model has generated tool calls in the output, run all of the tool calls with the parameters that the model generated and check the output. In this application Granite 3.2 Dense is used to generate the final output as well, so the results of the tool calls are added to the initial user input and then passed to the model.
Multiple tool calls may return file matches, so the responses are collected in an array which is then passed to Granite 3.2 to generate a response. The prompt that precedes the data instructs the model how to respond:
If the tool output contains one or more file names, then give the user only the filename found. Do not add additional details.
If the tool output is empty ask the user to try again. Here is the tool output:
The final output is then generated using either the returned file names or, if no files matched, a message asking the user to try again.
# collect the file names returned by the tool calls so we can see whether the tools found anything
output = []

if response.message.tool_calls:
    # There may be multiple tool calls in the response
    for tool_call in response.message.tool_calls:
        # Ensure the function is available, and then call it
        if function_to_call := available_functions.get(tool_call.function.name):
            print('Calling tool: ', tool_call.function.name, ' \n with arguments: ', tool_call.function.arguments)
            tool_res = function_to_call(**tool_call.function.arguments)
            print(" Tool response is " + str(tool_res))
            if str(tool_res) != "None":
                output.append(str(tool_res))
                print(tool_call.function.name, ' has output: ', output)
        else:
            print('Could not find ', tool_call.function.name)

    # Now chat with the model using the tool call results
    # Add the function response to messages for the model to use
    messages.append(response.message)
    prompt = '''
If the tool output contains one or more file names,
then give the user only the filename found. Do not add additional details.
If the tool output is empty ask the user to try again. Here is the tool output:
'''
    messages.append({'role': 'tool', 'content': prompt + " " + ", ".join(str(x) for x in output)})

    # Get a response from model with function outputs
    final_response = ollama.chat('granite3.2:8b', messages=messages)
    print('Final response:', final_response.message.content)
else:
    # the model wasn't able to pick the correct tool from the prompt
    print('No tool calls returned from model')
Using the provided files for this tutorial, the prompt "Information about dogs" will return:
Calling tool: Search inside text files
with arguments: {'keyword': 'dogs'}
Tool response is ./files/File4.pdf
Search inside text files has output: ['./files/File4.pdf']
Calling tool: Search inside image files
with arguments: {'keyword': 'dogs'}
Tool response is None
Final response: The keyword "dogs" was found in File4.pdf.
You can see that Granite 3.2 picked the correct keyword from the input, 'dogs', and searched through the files in the folder, finding the keyword in a PDF file. Since LLM results are not purely deterministic, you may get slightly different results with the same prompt or very similar prompts.
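If you'd like the tool selection and final answer to be more repeatable between runs, one approach (a sketch; the options parameter of ollama.chat accepts model settings such as temperature, and a temperature of 0 makes decoding close to deterministic) is to lower the sampling temperature when you call the model:
# request low-temperature decoding to reduce run-to-run variation in tool selection
response = ollama.chat(
    'granite3.2:8b',
    messages=messages,
    tools=ollama_tools,
    options={'temperature': 0}
)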