Tool calling with Ollama

Author

Data Scientist

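Tool calling in large language models (LLMs) is the ability of an LLM to interact with external tools, services, or APIs to perform tasks. This extends what the model can do and improves its handling of real-world tasks that require external data, real-time information, or specific applications. With a web search tool, for example, an LLM can query the web for real-time data that is absent from its training data. Other tools might include Python for calculations, data analysis, or visualization, or calls to a service endpoint to fetch data. Tool calling makes a chatbot more dynamic and adaptable, letting it give more accurate, relevant, and detailed responses based on real-time data or on specialized tasks outside its immediate knowledge base. Popular frameworks for tool calling include LangChain and now Ollama.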

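Ollama is a platform that provides open source, local AI models for use on personal devices, so that users can run LLMs directly on their own computers. Unlike a service such as the OpenAI API, no account is needed because the model runs on your local machine. Ollama focuses on privacy, performance, and ease of use, letting users access and interact with AI models without sending data to external servers. This is particularly appealing to anyone concerned about data privacy or wanting to avoid dependence on external APIs. The platform is easy to set up and use, supports a variety of models, and gives users a range of tools for natural language processing, code generation, and other AI tasks directly on their own hardware. It is well suited to a tool calling architecture because it can access all the capabilities of the local environment, including data, programs, and custom software.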

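In this tutorial, you'll learn how to set up tool calling with Ollama to look through a local file system, a task that is difficult for a remote LLM. Many Ollama models support tool calling and building AI agents, such as Mistral and Llama 3.2; a full list is available on the Ollama website. Here we'll use IBM Granite 3.2 Dense, which has tool support. The 2B and 8B models are text-only dense LLMs trained to support tool-based use cases and retrieval augmented generation (RAG), streamlining code generation, translation, and bug fixing.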

The notebook for this tutorial can be downloaded from GitHub.

Step 1: Install Ollama

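First, download Ollama from https://ollama.com/download and install it for your operating system. On macOS, installation uses a .dmg file; on Linux, a single shell command; and on Windows, an installer. You may need admin access on your machine to run the installer.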

To test that Ollama is installed correctly, open a terminal or command prompt and enter:

ollama -v
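The same check can be scripted from Python. The following is a minimal sketch, assuming only that the command is named ollama and accepts the -v flag shown above; the helper name ollama_version is hypothetical and not part of any library.

```python
import shutil
import subprocess
from typing import Optional


def ollama_version(binary: str = "ollama") -> Optional[str]:
    """Return the output of `<binary> -v`, or None if the binary is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True)
    # Some tools print their version to stderr, so fall back to it
    return result.stdout.strip() or result.stderr.strip()


# Report the version, or a hint when the command cannot be found
print(ollama_version() or "Ollama was not found on PATH")
```

A return value of None suggests that the installation did not add Ollama to your PATH, or that the terminal needs to be restarted.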

Step 2: Install libraries

Next, you'll add the initial imports. This demo uses the ollama Python library to communicate with Ollama and the pymupdf library to read PDF files in the file system.

!pip install ollama pymupdf

import ollama
import os
import pymupdf

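Next, you'll pull the models used in this tutorial. This downloads the model weights from Ollama to your local computer and stores them for later use, so that no remote API calls are needed afterwards.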

!ollama pull granite3.2:8b
!ollama pull granite3.2-vision

Step 3: Define the tools

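Now you'll define the tools that the Ollama tools instance will have access to. Since the tools are meant to read files and look through images in the local file system, you'll create two Python functions, one for each tool. The first is called search_text_files and takes a keyword to search for in the local files. For the purposes of this demo, the code only searches for files in one specific folder, but it could be extended with a second parameter that sets the folder the tool searches.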

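You could use simple string matching to see whether the keyword is in the document, but because Ollama makes calling local LLMs easy, search_text_files uses Granite 3.2 to determine whether the keyword describes the document text. The function reads the document into a string called document_text, then calls ollama.chat and sends the model the following prompt: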

"Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + document_text

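If the model responds 'yes', the function returns the name of the file that contains the keyword the user indicated in the prompt. If none of the files appear to contain the information, the function returns the string 'None'.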

This function may be slow the first time it runs, because Ollama has to load Granite 3.2 Dense before it can answer.
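Model replies do not always match the requested format exactly (for example "Yes." rather than "yes"), so the check on the reply can be made more tolerant. The following is a minimal sketch; the helper name is_yes is hypothetical, and it looks only at the first word of the reply so that a word such as "yesterday" later in the text does not count as a match.

```python
def is_yes(reply: str) -> bool:
    """Return True when a model reply starts with 'yes', ignoring case and punctuation."""
    words = reply.strip().lower().split()
    if not words:
        return False
    # Drop quotes and trailing punctuation around the first word before comparing
    return words[0].strip(".,!:;'\"") == "yes"


print(is_yes("Yes."))               # True
print(is_yes("No, see yesterday"))  # False
```

In search_text_files, such a helper could stand in for the direct string comparison on res['message']['content'].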

def search_text_files(keyword: str) -> str:

    directory = os.listdir("./files/")
    for fname in directory:

        # look through all the files in our directory that aren't hidden files
        if os.path.isfile("./files/" + fname) and not fname.startswith('.'):

            if fname.endswith(".pdf"):

                document_text = ""
                doc = pymupdf.open("./files/" + fname)

                for page in doc:  # iterate the document pages
                    document_text += page.get_text()  # get plain text (is in UTF-8)

                doc.close()

                prompt = "Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + document_text

                res = ollama.chat(
                    model="granite3.2:8b",
                    messages=[{'role': 'user', 'content': prompt}]
                )

                # compare in lowercase: the prompt asks for 'yes' but the model may answer 'Yes'
                if 'yes' in res['message']['content'].lower():
                    return "./files/" + fname

            elif fname.endswith(".txt"):

                # the with block closes the file whether or not it matches
                with open("./files/" + fname, 'r') as f:
                    file_content = f.read()

                prompt = "Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + file_content

                res = ollama.chat(
                    model="granite3.2:8b",
                    messages=[{'role': 'user', 'content': prompt}]
                )

                if 'yes' in res['message']['content'].lower():
                    return "./files/" + fname

    # no file matched the keyword
    return "None"
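The folder parameter mentioned earlier in this step could be sketched as follows. This is only an illustration under stated assumptions: the helper name list_searchable_files and its parameters are hypothetical, and it covers just the file selection, leaving the model call as it is in search_text_files.

```python
import os
from typing import List, Tuple


def list_searchable_files(directory: str = "./files/",
                          extensions: Tuple[str, ...] = (".pdf", ".txt")) -> List[str]:
    """Return paths of the non-hidden files in `directory` that have one of `extensions`."""
    matches = []
    for fname in sorted(os.listdir(directory)):
        path = os.path.join(directory, fname)
        # skip folders and hidden files, and compare extensions case-insensitively
        if (os.path.isfile(path)
                and not fname.startswith(".")
                and fname.lower().endswith(extensions)):
            matches.append(path)
    return matches
```

A tool function could then accept a directory argument and loop over list_searchable_files(directory). If you expose that argument to the model, it would also need an entry in the tool definition.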

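The second tool is called search_image_files and takes a keyword to search for in the local photos. The search uses the Granite 3.2 Vision image description model through Ollama. The model returns a text description of each image file in the folder, and the function then searches that description for the keyword. One strength of Ollama is that multi-agent systems are easy to build, with one model calling another.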

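The function returns a string: the name of the file whose description contains the keyword the user indicated in the prompt, or 'None' if no description matches.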

def search_image_files(keyword: str) -> str:

    directory = os.listdir("./files/")
    image_file_types = ("jpg", "png", "jpeg")

    for fname in directory:

        if os.path.isfile("./files/" + fname) and not fname.startswith('.') and fname.endswith(image_file_types):
            # ask the vision model for a text description of the image
            res = ollama.chat(
                model="granite3.2-vision",
                messages=[
                    {
                        'role': 'user',
                        'content': 'Describe this image in short sentences. Use simple phrases first and then describe it more fully.',
                        'images': ["./files/" + fname]
                    }
                ]
            )

            # compare in lowercase so that 'Dogs' in the description matches the keyword 'dogs'
            if keyword.lower() in res['message']['content'].lower():
                return "./files/" + fname

    return "None"
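A plain substring test on the description can miss or over-match: the keyword "dogs" does not occur in "A dog sits on the grass", while "cat" occurs in "caterpillar". The following is a rough sketch of a whole-word match that also tolerates a trailing "s". The helper name description_mentions is hypothetical, and the plural handling is a crude heuristic rather than real stemming.

```python
import re


def description_mentions(keyword: str, description: str) -> bool:
    """Case-insensitive whole-word match that accepts the keyword with or without a final 's'."""
    base = keyword.strip().lower()
    if not base:
        return False
    # reduce a simple plural such as 'dogs' to 'dog' before building the pattern
    singular = base[:-1] if base.endswith("s") and len(base) > 1 else base
    pattern = r"\b" + re.escape(singular) + r"s?\b"
    return re.search(pattern, description.lower()) is not None


print(description_mentions("dogs", "A dog sits on the grass."))  # True
print(description_mentions("cat", "A caterpillar on a leaf."))   # False
```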

Step 4: Define the tools for Ollama

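Now that you've defined the functions for Ollama to call, you'll configure the tool information for Ollama itself. The first step is to create an object that maps the name of each tool to the function that Ollama function calling will invoke: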

available_functions = {
    'Search inside text files': search_text_files,
    'Search inside image files': search_image_files
}

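Next, configure a tools array that tells Ollama which tools it has access to and what those tools require. The array holds one object schema per tool, which instructs the Ollama tool calling framework how to call the tool and what it returns.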

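Both of the tools you created earlier are functions that require a keyword parameter. Currently only functions are supported as tools, although this may change in the future. The descriptions of the function and of the parameter help the model call the tool correctly. The description field of each tool's function is passed to the LLM when it selects which tool to use. The description of the keyword is passed to the model when it generates the parameters that will be passed to the tool. These are the two places where you can adjust prompts to fine-tune the model's behavior when you build your own tool calling applications with Ollama.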

# tools don't need to be defined as an object but this helps pass the correct parameters
# to the tool call itself by giving the model a prompt of how the tool is to be used
ollama_tools = [
    {
        'type': 'function',
        'function': {
            'name': 'Search inside text files',
            'description': 'This tool searches in PDF or plaintext or text files in the local file system for descriptions or mentions of the keyword.',
            'parameters': {
                'type': 'object',
                'properties': {
                    'keyword': {
                        'type': 'string',
                        'description': 'Generate one keyword from the user request to search for in text files',
                    },
                },
                'required': ['keyword'],
            },
        },
    },
    {
        'type': 'function',
        'function': {
            'name': 'Search inside image files',
            'description': 'This tool searches for photos or image files in the local file system for the keyword.',
            'parameters': {
                'type': 'object',
                'properties': {
                    'keyword': {
                        'type': 'string',
                        'description': 'Generate one keyword from the user request to search for in image files',
                    },
                },
                'required': ['keyword'],
            },
        },
    },
]

You'll use this tool definition when you call Ollama with the user input.
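Because both tool schemas have the same shape and differ only in their name and description strings, they could be generated by a small helper. It is also worth checking that every tool name has a matching entry in the function map. The following is a sketch only: make_keyword_tool and unmapped_tool_names are hypothetical helpers, and the dictionary layout simply mirrors the tools array shown in this step.

```python
from typing import Callable, Dict, List


def make_keyword_tool(name: str, description: str, keyword_description: str) -> dict:
    """Build one tool schema for a function that takes a single 'keyword' string."""
    return {
        'type': 'function',
        'function': {
            'name': name,
            'description': description,
            'parameters': {
                'type': 'object',
                'properties': {
                    'keyword': {'type': 'string', 'description': keyword_description},
                },
                'required': ['keyword'],
            },
        },
    }


def unmapped_tool_names(tools: List[dict], functions: Dict[str, Callable]) -> List[str]:
    """Return the names of tools that have no entry in the function map."""
    return [t['function']['name'] for t in tools if t['function']['name'] not in functions]
```

An empty list from unmapped_tool_names(ollama_tools, available_functions) would indicate that every tool the model can select is backed by a Python function.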

Step 5: Pass user input to Ollama

Now it's time to pass the user input to Ollama and have it return the results of the tool calls. First, make sure that Ollama is running on your system:

# if ollama is not currently running, start it
import subprocess
subprocess.Popen(["ollama","serve"], stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)

The call returns a Popen object for the server process, which a notebook displays as:

<Popen: returncode: None args: ['ollama', 'serve']>
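The Popen object only shows that a process was started, not that the server is ready to answer. One way to confirm readiness is to request the server's root URL. The following is a minimal sketch, assuming Ollama is listening on its default local address http://localhost:11434 (adjust the URL if your configuration differs); the helper name ollama_is_running is hypothetical.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # connection refused, timeout, or another network error
        return False


print("Ollama server reachable:", ollama_is_running())
```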

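Now ask the user for input. You can also hardcode the input or retrieve it from a chat interface, depending on how you configure your application. The input function waits for the user to type something before continuing.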

# input
user_input = input("What would you like to search for?")
print(user_input)

For instance, if the user enters "Information about dogs", this cell prints:

Information about dogs

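Now the user query is passed to Ollama itself. The message needs a role for the user and the content that the user entered. These are passed to Ollama with the chat function. The first parameter is the model you want to use, in this case Granite 3.2 Dense; next comes the message with the user input; and last is the tools array that you configured earlier.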

The chat function then generates output that selects which tool to use and which parameters to pass to it in the subsequent tool calls.

messages = [{'role': 'user', 'content': user_input}]

response: ollama.ChatResponse = ollama.chat(

    # set which model we're using
    'granite3.2:8b',

    # use the message from the user
    messages=messages,

    # tell the model which tools it can call
    tools=ollama_tools
)

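Now that the model has generated tool calls in its output, run all of the tool calls with the parameters that the model generated and check the output. In this application Granite 3.2 Dense also generates the final output, so the results of the tool calls are added to the initial user input and then passed to the model.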

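Multiple tool calls may return file matches, so the responses are collected in an array which is then passed to Granite 3.2 to generate a response. The prompt that precedes the data instructs the model how to respond: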

If the tool output contains one or more file names, then give the user only the filename found. Do not add additional details.
If the tool output is empty ask the user to try again. Here is the tool output:

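The final output is then generated from either the returned file names or, when the tool output is empty, a request that the user try again.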

# this is a placeholder used to see whether the tools return anything
output = []

if response.message.tool_calls:

    # There may be multiple tool calls in the response
    for tool_call in response.message.tool_calls:

        # Ensure the function is available, and then call it
        if function_to_call := available_functions.get(tool_call.function.name):
            print('Calling tool: ', tool_call.function.name, ' \n with arguments: ', tool_call.function.arguments)
            tool_res = function_to_call(**tool_call.function.arguments)

            print(" Tool response is " + str(tool_res))

            if str(tool_res) != "None":
                output.append(str(tool_res))
                print(tool_call.function.name, ' has output: ', output)
        else:
            print('Could not find ', tool_call.function.name)

    # Now chat with the model using the tool call results
    # Add the function response to messages for the model to use
    messages.append(response.message)

    prompt = '''
    If the tool output contains one or more file names,
    then give the user only the filename found. Do not add additional details.
    If the tool output is empty ask the user to try again. Here is the tool output:
    '''

    messages.append({'role': 'tool', 'content': prompt + " " + ", ".join(str(x) for x in output)})

    # Get a response from model with function outputs
    final_response = ollama.chat('granite3.2:8b', messages=messages)
    print('Final response:', final_response.message.content)

else:

    # the model wasn't able to pick the correct tool from the prompt
    print('No tool calls returned from model')

Using the files provided for this tutorial, the prompt "Information about dogs" returns:

Calling tool: Search inside text files
with arguments: {'keyword': 'dogs'}
Tool response is ./files/File4.pdf
Search inside text files has output: ['./files/File4.pdf']
Calling tool: Search inside image files
with arguments: {'keyword': 'dogs'}
Tool response is None
Final response: The keyword "dogs" was found in File4.pdf.

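You can see that Granite 3.2 picked the correct keyword, "dogs", from the input and searched through the files in the folder, finding the keyword in a PDF file. Because LLM results are not purely deterministic, you may get slightly different results with the same or a very similar prompt.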
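Since the model's choice of tools can vary between runs, it can help to exercise the dispatch logic from Step 5 without calling a model at all. The following is a sketch under stated assumptions: run_tool_calls is a hypothetical refactoring of the loop in Step 5, and the SimpleNamespace objects are stand-ins that only imitate the name and arguments attributes that loop reads from each tool call.

```python
from types import SimpleNamespace
from typing import Callable, Dict, Iterable, List


def run_tool_calls(tool_calls: Iterable, functions: Dict[str, Callable[..., str]]) -> List[str]:
    """Call each requested tool and collect every result other than "None"."""
    output = []
    for tool_call in tool_calls:
        function_to_call = functions.get(tool_call.function.name)
        if function_to_call is None:
            print('Could not find', tool_call.function.name)
            continue
        tool_res = str(function_to_call(**tool_call.function.arguments))
        if tool_res != "None":
            output.append(tool_res)
    return output


# Hypothetical stand-ins for model-generated tool calls and for the two tools
fake_calls = [
    SimpleNamespace(function=SimpleNamespace(name='text', arguments={'keyword': 'dogs'})),
    SimpleNamespace(function=SimpleNamespace(name='image', arguments={'keyword': 'dogs'})),
    SimpleNamespace(function=SimpleNamespace(name='missing', arguments={})),
]
fake_functions = {
    'text': lambda keyword: "./files/" + keyword + ".txt",
    'image': lambda keyword: "None",
}

print(run_tool_calls(fake_calls, fake_functions))
```

With the real objects, the equivalent call would be run_tool_calls(response.message.tool_calls, available_functions), and the returned list plays the role of output in Step 5.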