Ollamaを使用したツール呼び出し

著者

Data Scientist

Ollamaを使用したツール呼び出し

大規模言語モデル（LLM）のツール呼び出しは、LLMが外部ツール、サービス、またはAPIと対話してタスクを実行する機能です。これにより、LLMは機能を拡張し、外部データ、リアルタイム情報、または特定のアプリケーションへのアクセスが必要になる現実世界のタスクの処理能力を強化できます。LLMがWeb検索ツールを使用する場合、Webを呼び出して、モデルのトレーニング・データにはないリアルタイムのデータを取得できます。他のタイプのツールには、計算、データ分析、視覚化のためのPythonや、データのサービス・エンドポイントの呼び出しなどのためのPythonが含まれる場合があります。ツール呼び出しにより、チャットボットはより動的かつ適応的になり、ライブ・データや直接のナレッジ・ベース外の特殊なタスクに基づいて、より正確で関連性が高く詳細な応答を提供できるようになります。ツール呼び出し用の一般的なフレームワークには、LangchainとOllamaがあります。

Ollamaは、個人のデバイスで使用するためのオープンソースのローカルAIモデルを提供するプラットフォームであり、ユーザーは自分のコンピューター上でLLMを直接実行できます。OpenAI APIなどのサービスとは異なり、モデルはローカル・マシン上にあるため、アカウントは不要です。Ollamaはプライバシー、パフォーマンス、使いやすさに重点を置いており、ユーザーは外部サーバーにデータを送信することなくAIモデルにアクセスして対話できます。これは、データ・プライバシーに懸念がある人や、外部APIへの依存を回避したい人にとって、特に魅力的です。Ollamaのプラットフォームは、セットアップと使用が簡単になるように設計されており、さまざまなモデルをサポートしているため、自然言語処理、コード生成、その他のAIタスクのためのさまざまなツールを自分のハードウェア上で直接利用できます。データ、プログラム、カスタム・ソフトウェアを含めてローカル環境のすべての機能にアクセスできるため、ツール呼び出しアーキテクチャーに適しています。

このチュートリアルでは、Olllamaを使用してローカル・ファイルシステムを調べ、ツール呼び出しをセットアップする方法について説明します。このタスクは、リモートLLMでは実行が困難です。ツール呼び出しやMistralやLlama 3.2などのAIエージェントの構築では、多数のOllamaモデルを利用できます。完全なリストはOllamaのWebサイトで確認できます。今回は、ツール・サポートを備えたIBM Granite 3.2 Denseを使用します。2Bモデルと8Bモデルは、ツールベースのユースケースをサポートし、検索拡張生成（RAG）、コード生成、翻訳、バグ修正の合理化のために設計された、テキストのみの高密度LLMでトレーニングされています。

このチュートリアルのノートブックは、こちらのGithubからダウンロードできます。

ステップ1： Ollamaをインストールする

まず、 https://ollama.com/downloadからollamaをダウンロードし、オペレーティングシステム用にインストールします。これは、macOSでは.dmgファイルを介して行われ、Linuxでは単一のシェル・コマンド、Windowsではインストーラーで実行されます。インストーラーを実行するには、マシンの管理者アクセス権が必要になる場合があります。

ターミナルまたはコマンド・プロンプトを開いて、次の入力を実行することで、ollamaが正しくインストールされていることを確認できます：

ollama -v

ステップ2：ライブラリーをインストールする

次に、初期インポートを追加します。このデモでは、olllama Python libraryを使用してollamaと通信し、pymupdfライブラリーを使用してファイル・システム内のPDFファイルを読み取ります。

!pip install pymupdf

import ollama
import os
import pymupdf

次に、このチュートリアル全体で使用するモデルを取得します。これにより、ollamaからローカル・コンピューターにモデルの重みがダウンロードされ、後でリモートAPI呼び出しを行うことなく使用できるように保管します。

!ollama pull granite3.2
!ollama pull granite3.2-vision

ステップ3：ツールを定義する

次に、ollamaのツール・インスタンスがアクセスできるツールを定義します。このツールの目的はローカル・ファイル・システムにあるファイルを読み取って画像を調べることであるため、それらのツールごとに2つのPython関数を作成します。1つ目にsearch_text_files を呼び出し、ローカル・ファイルで検索するキーワードが必要となります。このデモの目的上、コードは特定のフォルダー内のファイルのみを検索しますが、ツールを検索するフォルダーを設定する2番目のパラメーターを含めるように拡張することもできます。

単純な文字列のマッチングを使用すれば、キーワードが文書内にあるかどうかを確認できますが、ollamaではローカルLLMの呼び出しが簡単なため、search_text_files は、Granite 3.2を使用して、キーワードが文書テキストを説明しているかどうかを判断します。これは、文書を次の文字列に読み込むことで行われます。document_text その後、この関数はolllama.chatを呼び出し、モデルに次のプロンプトを送信します。

"Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + document_text

モデルが「はい」と応答すると、関数はユーザーがプロンプトで指定したキーワードを含むファイル名を返します。どのファイルにも情報が含まれていない場合、この関数は文字列として「None」を返します。

olllamaはGranite 3.2 Denceをダウンロードするため、最初は関数の実行が遅くなる可能性があります。

def search_text_files(keyword: str) -> str:

directory = os.listdir("./files/")
for fname in directory:

# look through all the files in our directory that aren't hidden files
if os.path.isfile("./files/" + fname) and not fname.startswith('.'):

if(fname.endswith(".pdf")):

document_text = ""
doc = pymupdf.open("./files/" + fname)

for page in doc: # iterate the document pages
document_text += page.get_text() # get plain text (is in UTF-8)

doc.close()

prompt = "Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + document_text

res = ollama.chat(
model="granite3.2:8b",
messages=[{'role': 'user', 'content': prompt}]
)

if 'Yes' in res['message']['content']:
return "./files/" + fname

elif(fname.endswith(".txt")):

f = open("./files/" + fname, 'r')
file_content = f.read()

prompt = "Respond only 'yes' or 'no', do not add any additional information. Is the following text about " + keyword + "? " + file_content

res = ollama.chat(
model="granite3.2:8b",
messages=[{'role': 'user', 'content': prompt}]
)

if 'Yes' in res['message']['content']:
f.close()
return "./files/" + fname

return "None"

2つ目のツールはsearch_image_files を呼び出し、ローカル写真を検索するキーワードを必要とします。検索は、ollamaを経由した画像説明モデルであるGranite 3.2 Visionを使用して行われます。このモデルは、フォルダー内の各画像ファイルのテキスト説明を返し、説明内のキーワードを検索します。ollamaを使う強みの一つは、あるモデルを別のモデルと合わせて呼び出すマルチ・エージェント・システムを簡単に構築できることです。

この関数は、ユーザーがプロンプトで指定したキーワードを含む説明を持つファイルの名前と一致する文字列を返します。

def search_image_files(keyword:str) -> str:

directory = os.listdir("./files/")
image_file_types = ("jpg", "png", "jpeg")

for fname in directory:

if os.path.isfile("./files/" + fname) and not fname.startswith('.') and fname.endswith(image_file_types):
res = ollama.chat(
model="granite3.2-vision",
messages=[
{
'role': 'user',
'content': 'Describe this image in short sentences. Use simple phrases first and then describe it more fully.',
'images': ["./files/" + fname]
}
]
)

if keyword in res['message']['content']:
return "./files/" + fname

return "None"

ステップ4： Ollamaで使用するツールを定義する

ollamaが呼び出す関数が定義されたので、olllama自体のツール情報を構成します。最初のステップは、次のollama関数を呼び出すため、ツールの名前を関数にマッピングするオブジェクトを作成することです：

available_functions = {
'Search inside text files':search_text_files,
'Search inside image files':search_image_files
}

次に、olllamaにどのツールにアクセスできるか、またそれらのツールが何を必要とするかを指示するツール配列を構成します。これは、ツールごとに1つのオブジェクト・スキーマを持つ配列であり、olllamaツールの呼び出しフレームワークにツールの呼び出し方法と返されるものを指示します。

先ほど作成した両方のツールの場合、それらはキーワード・パラメータを必要とする関数です。現在は関数のみがサポートされていますが、将来的には変更される可能性があります。関数とパラメーターの説明は、モデルがツールを正しく呼び出すために役立ちます。説明 LLMが使用するツールを選択する際に、各ツールの関数のフィールドがLLMに渡されます。説明ツールに渡されるパラメーターを生成する際に、キーワードのがモデルに渡されます。どちらも、ollamaを使ってアプリケーションを呼び出す独自のツールを作成する際に、プロンプトをファイン・チューニングする領域です。

# tools don't need to be defined as an object but this helps pass the correct parameters
# to the tool call itself by giving the model a prompt of how the tool is to be used
ollama_tools=[
{
'type': 'function',
'function': {
'name': 'Search inside text files',
'description': 'This tool searches in PDF or plaintext or text files in the local file system for descriptions or mentions of the keyword.',
'parameters': {
'type': 'object',
'properties': {
'keyword': {
'type': 'string',
'description': 'Generate one keyword from the user request to search for in text files',
},
},
'required': ['keyword'],
},
},
},
{
'type': 'function',
'function': {
'name': 'Search inside image files',
'description': 'This tool searches for photos or image files in the local file system for the keyword.',
'parameters': {
'type': 'object',
'properties': {
'keyword': {
'type': 'string',
'description': 'Generate one keyword from the user request to search for in image files',
},
},
'required': ['keyword'],
},
},
},
]

ユーザー・インプットでollamaを呼び出すときに、このツール定義を使用します。

ステップ5：ユーザー・インプットをOllamaに渡す

次に、ユーザー・インプットをollamaに渡して、ツール呼び出しの結果を返します。まず、llamaがシステム上で実行されていることを確認します：

# if ollama is not currently running, start it
import subprocess
subprocess.Popen(["ollama","serve"], stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)

Ollamaが実行されている場合は、次が返されます：

<Popen: returncode: None args: ['ollama', 'serve']>

次に、ユーザーにインプットを求めます。また、アプリケーションの構成に応じて、インプットをハードコードしたり、チャット・インターフェースから取得したりすることもできます。入力関数は、続行する前にユーザーの入力を待ちます。

# input
user_input = input("What would you like to search for?")
print(user_input)

例として、ユーザーが「犬に関する情報」と入力すると、このセルは以下を出力します：

Information about dogs

これで、ユーザー・クエリがollama自体に渡されます。メッセージには、ユーザーの役割とそのユーザーがインプットした内容が必要です。これは、チャット機能を利用して ollamaに渡されます。最初のパラメーターは、使用するモデル（この場合は Granite 3.2 Dense）、次にユーザーインプットを含むメッセージ、そして最後に、以前に設定したツール配列です。

チャット機能を利用して関数は、使用するツールと後続ツールの呼び出しでどのパラメーターを渡す必要があるかを選択するアウトプットを生成します。

messages = [{'role': 'user', 'content':user_input}]

response: ollama.ChatResponse = ollama.chat(

# set which model we're using
'granite3.2:8b',

# use the message from the user
messages=messages,

tools=ollama_tools
)

モデルがツール呼び出しをアウトプットに生成したので、モデルが生成したパラメーターを使用してすべてのツール呼び出しを実行し、アウトプットを確認します。このアプリケーションでは、Granite 3.2 Denseは最終的なアウトプットの生成にも使用されるため、ツール呼び出しの成果は最初のユーザーインプットに追加され、その後モデルに渡されます。

複数のツール呼び出しによってファイルの一致が返される場合、応答は配列に収集され、Granite 3.2に渡されて応答が生成されます。データの前にあるプロンプトは、モデルに応答方法を指示します：

If the tool output contains one or more file names, then give the user only the filename found. Do not add additional details.
If the tool output is empty ask the user to try again. Here is the tool output:

最終的なアウトプットは、返されたファイル名または

# this is a place holder that to use to see whether the tools return anything
output = []

if response.message.tool_calls:

# There may be multiple tool calls in the response
for tool_call in response.message.tool_calls:

# Ensure the function is available, and then call it
if function_to_call := available_functions.get(tool_call.function.name):
print('Calling tool: ', tool_call.function.name, ' \n with arguments: ', tool_call.function.arguments)
tool_res = function_to_call(**tool_call.function.arguments)

print(" Tool response is " + str(tool_res))

if(str(tool_res) != "None"):
output.append(str(tool_res))
print(tool_call.function.name, ' has output: ', output)
else:
print('Could not find ', tool_call.function.name)

# Now chat with the model using the tool call results
# Add the function response to messages for the model to use
messages.append(response.message)

prompt = '''
If the tool output contains one or more file names,
then give the user only the filename found. Do not add additional details.
If the tool output is empty ask the user to try again. Here is the tool output:
'''

messages.append({'role': 'tool', 'content': prompt + " " + ", ".join(str(x) for x in output)})

# Get a response from model with function outputs
final_response = ollama.chat('granite3.2:8b', messages=messages)
print('Final response:', final_response.message.content)

else:

# the model wasn't able to pick the correct tool from the prompt
print('No tool calls returned from model')

このチュートリアルで提供されているファイルを使用すると、「犬に関する情報」というプロンプトは以下を返します：

Calling tool: Search inside text files
with arguments: {'keyword': 'dogs'}
Tool response is ./files/File4.pdf
Search inside text files has output: ['./files/File4.pdf']
Calling tool: Search inside image files
with arguments: {'keyword': 'dogs'}
Tool response is None
Final response: The keyword "dogs" was found in File4.pdf.

Granite 3.2がインプットから正しいキーワード「犬」を選択し、フォルダー内のファイルを検索して、PDFファイル内のキーワードを見つけたことがわかります。LLMの結果は完全に確定的ではないため、同じプロンプトまたは非常に類似したプロンプトでも、わずかに異なる結果が得られる可能性があります。