在本教程中,您将构建一个 AI 智能体,而该智能体可为您的自定义 SQLite 数据库执行并生成 Python 与 SQL 查询。您的智能体将通过使用 LangGraph 和 Mistral Medium 3 大语言模型 (LLM) 从零开始构建。该 LLM 针对编码和多模态理解等专业用例进行了优化,因此是完成此任务的绝佳选择。1我们还将使用 IBM Granite Guardian 模型检测和屏蔽有害消息。
您需要一个 IBM Cloud 帐户来创建 watsonx.ai 项目。
多个 Python 版本适用于本教程。发布时,我们建议下载最新版本—Python 3.13。
虽然您可以选择多种工具,本教程将引导您如何设置 IBM 帐户以使用 Jupyter Notebook。
使用您的 IBM Cloud 帐户登录 watsonx.ai。
您可以在项目内部获取项目 ID。点击 管理 选项卡。然后,从 常规 页面的 详细信息 部分复制项目 ID。您需要此 ID 来完成本教程。
创建一个Jupyter Notebook。
此步骤将打开 Jupyter Notebook 环境,而您可在其中复制本教程中的代码。或者,您可将此笔记本下载到本地系统并将其作为资产上传到您的 watsonx.ai 项目。本教程也可在 GitHub 上找到。
创建一个 watsonx.ai 运行时服务实例(选择适当的区域并选择精简计划,这是一个免费实例)。
生成应用程序编程接口 (API) 密钥。
将 watsonx.ai 运行时服务实例与您在 watsonx.ai 中创建的项目关联。
本教程需要一些库和模块。请确保导入以下库,如果未安装,可以通过快速的 pip 安装来解决。
#installations
%pip install -qU ibm-watsonx-ai \
langchain-ibm \
langgraph \
langchain_experimental \
tavily-python \
langchain-community
重启终端并导入以下软件包。
# imports
import sqlite3
import getpass
from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.moderations import Guardian
from IPython.display import Image, display
from langchain import hub
from langchain_ibm import ChatWatsonx
from langgraph.graph.message import add_messages
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_community.agent_toolkits.sql.toolkit import SQLDatabaseToolkit
from langchain_core.messages import AnyMessage, SystemMessage, HumanMessage, ToolMessage, AIMessage
from langchain_core.tools import tool
from langchain_experimental.tools.python.tool import PythonREPLTool
from sqlalchemy import create_engine
from typing_extensions import TypedDict
from typing import Annotated
为了设置凭据,我们需要您在步骤 1 中生成的 WATSONX_APIKEY 和 WATSONX_PROJECT_ID。我们还将设置 URL 以作为 API 端点。
WATSONX_APIKEY = getpass.getpass("Please enter your watsonx.ai Runtime API key (hit enter): ")
WATSONX_PROJECT_ID = getpass.getpass("Please enter your project ID (hit enter): ")
URL = "https://us-south.ml.cloud.ibm.com"
在初始化 LLM 之前,可以使用 Credentials 类封装传递的 API 凭据。
credentials = Credentials(url=URL, api_key=WATSONX_APIKEY)
为了能与 watsonx.ai Runtime 中提供的所有资源进行交互,您需要设置一个 APIClient。在此,我们会传入凭据和 WATSONX_PROJECT_ID。
client = APIClient(credentials=credentials, project_id=WATSONX_PROJECT_ID)
在本教程中,我们将使用 ChatWatsonx 包装器来设置我们的聊天模型。该包装器简化了工具调用和链接的整合。我们鼓励您使用 ChatWatsonx 官方文档 中的 API 参考以获取更多信息。我们可将 Mistral Medium 3 的 model_id 以及我们的客户端作为参数来传入。
注意:如果您使用其他 API 提供商,则需相应地更改包装器。例如,要使用 OpenAI API 访问 GPT-4 等模型,则需要 openai_api_key 以及 ChatOpenAI 包装器。
model_id = "mistralai/mistral-medium-2505"
llm = ChatWatsonx(model_id=model_id, watsonx_client=client)
在本教程中,您的智能体将利用其可用工具与 SQLite 数据库交互。如果您已将数据集存储在关系数据库管理系统(例如 PostgreSQL 或 SQLite)中,则可跳过此步骤。否则,请运行以下单元格,以便将 sql_script 变量设为某一脚本,而此脚本可生成包含合成汽车经销商销售数据的数据库。该数据库将包含多个用于存储各个经销商、销售和车辆数据的表。
sql_script = """
DROP TABLE IF EXISTS [Dealerships];
DROP TABLE IF EXISTS [Cars];
DROP TABLE IF EXISTS [Sales];
CREATE TABLE [Dealerships] (
[id] INTEGER PRIMARY KEY,
[name] TEXT NOT NULL,
[location] TEXT NOT NULL
);
CREATE TABLE [Cars] (
[id] INTEGER PRIMARY KEY,
[make] TEXT NOT NULL,
[model] TEXT NOT NULL,
[year] INTEGER NOT NULL,
[color] TEXT NOT NULL
);
CREATE TABLE [Sales] (
[id] INTEGER PRIMARY KEY,
[car_id] INTEGER NOT NULL,
[dealership_id] INTEGER NOT NULL,
[sale_date] DATE NOT NULL,
[sale_price] REAL NOT NULL,
FOREIGN KEY ([car_id]) REFERENCES [Cars] ([id]),
FOREIGN KEY ([dealership_id]) REFERENCES [Dealerships] ([id])
);
INSERT INTO [Dealerships] ([id], [name], [location]) VALUES
(1, 'Dealership A', 'New York'),
(2, 'Dealership B', 'Los Angeles'),
(3, 'Dealership C', 'Chicago'),
(4, 'Dealership D', 'Houston'),
(5, 'Dealership E', 'Phoenix'),
(6, 'Dealership F', 'Philadelphia'),
(7, 'Dealership G', 'San Antonio'),
(8, 'Dealership H', 'San Diego'),
(9, 'Dealership I', 'Dallas'),
(10, 'Dealership J', 'San Jose');
INSERT INTO [Cars] ([id], [make], [model], [year], [color]) VALUES
(1, 'Toyota', 'Camry', 2020, 'Blue'),
(2, 'Honda', 'Civic', 2019, 'Red'),
(3, 'Ford', 'Mustang', 2021, 'Black'),
(4, 'Chevrolet', 'Silverado', 2018, 'White'),
(5, 'Nissan', 'Altima', 2020, 'Gray'),
(6, 'Kia', 'Optima', 2020, 'Silver'),
(7, 'Hyundai', 'Elantra', 2019, 'Black'),
(8, 'Volkswagen', 'Golf', 2021, 'Red'),
(9, 'BMW', '3 Series', 2018, 'White'),
(10, 'Mercedes-Benz', 'C-Class', 2020, 'Gray'),
(11, 'Audi', 'A4', 2019, 'Blue'),
(12, 'Lexus', 'ES', 2021, 'Black'),
(13, 'Toyota', 'Corolla', 2018, 'White'),
(14, 'Honda', 'Accord', 2020, 'Gray'),
(15, 'Ford', 'Fusion', 2019, 'Red');
INSERT INTO [Sales] ([id], [car_id], [dealership_id], [sale_date], [sale_price]) VALUES
(1, 1, 1, '2022-01-01', 25000.0),
(2, 2, 2, '2022-02-01', 20000.0),
(3, 3, 3, '2022-03-01', 30000.0),
(4, 4, 1, '2022-04-01', 40000.0),
(5, 5, 2, '2022-05-01', 28000.0),
(6, 6, 4, '2022-06-01', 22000.0),
(7, 7, 5, '2022-07-01', 20000.0),
(8, 8, 6, '2022-08-01', 28000.0),
(9, 9, 7, '2022-09-01', 35000.0),
(10, 10, 8, '2022-10-01', 32000.0),
(11, 11, 9, '2022-11-01', 30000.0),
(12, 12, 10, '2022-12-01', 38000.0),
(13, 13, 1, '2023-01-01', 25000.0),
(14, 14, 2, '2023-02-01', 28000.0),
(15, 15, 3, '2023-03-01', 22000.0);
"""
为了运行此脚本并创建本地 SQLite 数据库,我们需要设置一个数据库 Connection 对象。通过传递":memory:"路径名称,我们可连接到一个暂存的内存数据库。
connection = sqlite3.connect(":memory:")
接下来,我们可以使用 executescript 方法创建数据库游标并运行我们的 SQL 脚本。此 Cursor 对象允许我们运行此类脚本。
connection.executescript(sql_script)
输出:
<sqlite3.Cursor at 0x30c511240>
最后,运行以下命令以创建一个 SQLAlechemy 引擎实例,并连接到工作目录中的数据库文件。该文件应与您的 Jupyter Notebook 同名,并带有数据库文件扩展名。我们可将 URL 留空,以便连接到 SQLite 本地数据库。我们可在 creator 参数中提供 Connection 对象。此参数负责创建与数据库的连接。
engine = create_engine(
"sqlite://",
creator=lambda: connection
)
如果您更喜欢在本教程中使用未存储在本地的替代数据库,则只需将空路径替换为“sqlite:///Chinook.db”即可访问示例数字媒体存储数据库。
本教程使用 LangChain 预构建的 SQLDatabaseToolkit。该工具包需要一个 SQLDatabase 对象,而此对象会将 SQLAlchemy 引擎用作参数,并使用所选 LLM。实例化此工具包后,便可检索其工具。打印这些工具以查看更多详细信息。
db = SQLDatabase(engine)
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
tools = toolkit.get_tools()
tools
输出:
[QuerySQLDatabaseTool(description="该工具的输入是一个详细且正确的 SQL 查询,输出则是来自数据库的某一结果。如果查询不正确,则会返回错误消息。如果返回错误,则请重写此查询、检查此查询,然后重试。如果在 'field list' 中遇到与未知列 'xxxx' 相关的问题,请使用 sql_db_schema 查询正确的表字段。", db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x30c3ecd10>),
InfoSQLDatabaseTool(description='此工具的输入是以逗号分隔的表列表,输出则是这些表的架构与示例行。请先调用 sql_db_list_tables 来确认这些表真的存在!示例输入:table1, table2, table3', db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x30c3ecd10>),
ListSQLDatabaseTool(db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x30c3ecd10>),
QuerySQLCheckerTool(description='在执行查询之前,请使用此工具仔细检查该查询是否正确。在使用 sql_db_query 执行查询之前,请务必先使用此工具!', db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x30c3ecd10>, llm=ChatWatsonx(model_id='mistralai/mistral-medium-2505', apikey=SecretStr('**********'), params={}, watsonx_model=<ibm_watsonx_ai.foundation_models.inference.model_inference.ModelInference object at 0x309c34690>, watsonx_client=<ibm_watsonx_ai.client.APIClient object at 0x30c3e3250>), llm_chain=LLMChain(verbose=False, prompt=PromptTemplate(input_variables=['dialect', 'query'], input_types={}, partial_variables={}, template='\n{query}\n请仔细检查上面的 {dialect} 查询是否存在常见错误,其中包括:\n- 对 NULL 值使用 NOT IN\n- 应已使用 UNION ALL 时,使用 UNION\n- 对互斥范围使用 BETWEEN\n- 谓词中的数据类型不匹配\n- 正确引用标识符\n- 为函数使用正确数量的参数\n- 转换为正确的数据类型\n- 为联接使用正确的列\n\n如果存在上述任意错误,请重写此查询。如果没有错误,则只需重现原始查询。\n\n仅输出最终 SQL 查询。\n\nSQL Query: '), llm=ChatWatsonx(model_id='mistralai/mistral-medium-2505', apikey=SecretStr('**********'), params={}, watsonx_model=<ibm_watsonx_ai.foundation_models.inference.model_inference.ModelInference object at 0x309c34690>, watsonx_client=<ibm_watsonx_ai.client.APIClient object at 0x30c3e3250>), output_parser=StrOutputParser(), llm_kwargs={}))]
我们可以看到,此工具包中有 4 个工具可供使用。每个工具都有其用途,如工具描述中所示。这些工具旨在列出数据库、运行查询、返回表架构,以及在运行 SQL 查询之前对其进行验证。
为了让我们的智能体具备生成和执行 Python 代码的能力,我们可以利用 LangChain 的 PythonREPLTool 类。此代码将设置一个 Python REPL(读取-求值-打印循环)工具,从而定义其功能并将其附加到工具列表中以供后续使用。
python_repl = PythonREPLTool()
tools.append(python_repl)
为确保为我们的智能体提供针对 SQL 工具和数据库量身定制的系统提示,我们可以拉取 LangChain Hub 提供的提示。打印 system_message 可详细查看提示内容。
chatprompttemplate = hub.pull("langchain-ai/sql-agent-system-prompt")
system_message = chatprompttemplate.format(dialect="SQLite", top_k=5)
print(system_message)
输出:
System: You are an agent designed to interact with a SQL database.
Based on the given input question, create a syntactically correct SQLite query, execute it, review the query results, and return the answer.
Unless the user specifies the exact number of examples to retrieve, always limit query results to a maximum of 5.
You may sort results by relevant columns to return the most meaningful examples from the database.
Do not query all columns in a specific table; query only the columns relevant to the given question.
You are authorized to use tools for interacting with the database.
Use only the following tools. Build your final answer using only information retrieved from these tools.
Before executing a query, carefully review it. If an error occurs during execution, rewrite the query and try again.
Do not execute any DML statements (INSERT, UPDATE, DELETE, DROP, etc.) on the database.
First, you should examine the tables in the database to understand what can be queried.
Do not skip this step.
Then, you should query the schema of the most relevant table.
LangGraph 中有一个预构建的 create_react_agent 函数,它可生成遵循 ReAct(推理与行动)架构的智能体图表。此架构允许智能体在循环中以迭代方式调用工具,直到满足停止条件。
为了提供更具实践性的方法,我们将在本教程中从头开始构建一个 ReAct 智能体。首先,我们可以创建一个 AgentState 类来存储来自用户、工具和智能体自身的消息的上下文。此处使用 Python 的 TypedDict 类来帮助确保消息采用适当的字典格式。我们也可使用 LangGraph 的 add_messages 缩减函数,以将所有新消息附加到现有消息列表中。
class AgentState(TypedDict):
messages: Annotated[list[AnyMessage], add_messages]
接下来,我们可以创建 ReActAgent 类。该类提供了一个基础结构,可用于创建能对环境变化做出反应的智能体。ReActAgent 类中的函数允许根据图形状态进行迭代工具调用。
__init__ 函数将大语言模型、工具和系统消息用作参数来初始化该类的属性。此构造函数会创建一个状态图,其中包含守护者模型、LLM 和工具的对应节点。此图的起点为 guardian 节点,它会调用 guardian_moderation 方法,以便在有害内容到达 LLM 和数据库之前对其进行检测。guardian 节点与 llm 节点之间的条件边缘会将此图的状态路由到 llm 节点或终点。这是由 guardian_moderation 函数的输出决定的。安全消息会传递给 llm 节点,而节点将执行 call_llm 方法。此外,我们还在 llm 与 tools 节点之间添加了一个条件边缘,以正确路由消息。如果 LLM 返回了工具调用,should_call_tools 方法则会返回布尔值 True。否则,它会返回 False,而此图也会路由到终点。此步骤是 ReAct 智能体架构的一部分—我们希望该智能体接收工具输出,然后对状态变化做出反应,以确定其下一步行动。
接下来,我们可编译此图,以便在后续步骤中调用该智能体。要持久化消息,我们可以使用 MemorySaver 检查点。该方法的最后两行会初始化类实例的 tools 与 llm 属性。tools 属性是一个用于将工具名称映射到工具对象的字典。llm 属性则为 LLM,它可通过使用 bind_tools 方法与工具绑定。
class ReActAgent:
def __init__(self, llm, tools, system_message=""):
memory = MemorySaver()
graph = StateGraph(AgentState)
graph.add_node("guardian", self.guardian_moderation)
graph.add_node("llm", self.call_llm)
graph.add_node("tools", self.call_tools)
graph.add_node("block_message", self.block_message)
graph.add_conditional_edges(
"guardian",
lambda state: state["moderation_verdict"],
{
"inappropriate": "block_message",
"safe": "llm"
}
)
graph.add_edge("block_message", END)
graph.add_conditional_edges(
"llm",
self.should_call_tools,
["tools", END]
)
graph.add_edge("tools", "llm")
graph.add_edge(START, "guardian")
self.system_message = system_message
self.graph = graph.compile(checkpointer=memory)
self.tools = {t.name: t for t in tools}
self.llm = llm.bind_tools(tools)
ReActAgent 类中的下一函数为 call_llm。此函数可通过检索状态消息来调用 LLM。如果存在系统消息,该方法则会将其添加到消息列表的开头。然后,它会使用这些消息来调用 LLM,并返回包含 LLM 响应的新状态。
def call_llm(self, state: AgentState):
messages = state['messages']
if self.system_message:
messages = [SystemMessage(content=self.system_message)] + messages
message = self.llm.invoke(messages)
return {'messages': [message]}
call_tools 函数是 ReActAgent 类中的下一函数。该方法会从状态的最后一条消息中检索工具调用,对其进行遍历,然后用给定参数调用每个工具。接着,每次工具调用的结果都会存储在一个名为 results 的列表中。最后,以字典的形式返回该新状态,其中 messages 键会映射到 results 列表。
def call_tools(self, state: AgentState):
tool_calls = state['messages'][-1].tool_calls
results = []
for t in tool_calls:
result = self.tools[t['name']].invoke(t['args'])
results.append(ToolMessage(tool_call_id=t['id'],
name=t['name'],
content=str(result)))
return {'messages': results}
ReActAgent 类中的下一函数为 should_call_tools。该函数通过从状态中检索前一 LLM 响应并检查其是否包含工具调用,从而决定是否调用这些工具。
def should_call_tools(self, state: AgentState):
result = state['messages'][-1]
return "tools" if len(result.tool_calls) > 0 else END
guardian_moderation 函数(在 guardrain 节点中执行)旨在使用保护者系统审核消息,以检测并阻止不需要或敏感的内容。首先,它会检索最后一条消息。接着,定义一个名为 detectors 的字典,其中包含探测器配置及其阈值。这些检测器可识别消息中特定类型的内容,例如个人身份信息 (PII) 以及仇恨言论、辱骂性语言和亵渎性语言 (HAP)。接着,它会创建 Guardian 类的实例,并传入 api_client 对象(名为 client)和 detectors 字典。随后,调用 Guardian 实例的 detect 方法,以便传入最后一条消息的内容和 detectors 字典。该方法随后会返回一个字典,而其中的 moderation_verdict 键存储有“safe”或“inappropriate”值,具体取决于 Granite Guardian 模型的输出。
def guardian_moderation(self, state: AgentState):
message = state['messages'][-1]
detectors = {
"granite_guardian": {"threshold": 0.4},
"hap": {"threshold": 0.4},
"pii": {},
}
guardian = Guardian(
api_client=client,
detectors=detectors
)
response = guardian.detect(
text=message.content,
detectors=detectors
)
if len(response['detections']) != 0 and response['detections'][0]['detection'] == "Yes":
return {"moderation_verdict": "inappropriate"}
else:
return {"moderation_verdict": "safe"}
block_message 函数将作为通知机制,用以通知用户其输入查询包含不当内容且已被屏蔽。
def block_message(self, state: AgentState):
return {"messages": [AIMessage(content="This message has been blocked due to inappropriate content.")]}
现在,可将所有代码放在一起并运行以下单元格。
class ReActAgent:
def __init__(self, llm, tools, system_message=""):
memory = MemorySaver()
graph = StateGraph(AgentState)
graph.add_node("guardian", self.guardian_moderation)
graph.add_node("llm", self.call_llm)
graph.add_node("tools", self.call_tools)
graph.add_node("block_message", self.block_message)
graph.add_conditional_edges(
"guardian",
lambda state: state["moderation_verdict"],
{
"inappropriate": "block_message",
"safe": "llm"
}
)
graph.add_edge("block_message", END)
graph.add_conditional_edges(
"llm",
self.should_call_tools,
["tools", END]
)
graph.add_edge("tools", "llm")
graph.add_edge(START, "guardian")
self.system_message = system_message
self.graph = graph.compile(checkpointer=memory)
self.tools = {t.name: t for t in tools}
self.llm = llm.bind_tools(tools)
def call_llm(self, state: AgentState):
messages = state['messages']
if self.system_message:
messages = [SystemMessage(content=self.system_message)] + messages
message = self.llm.invoke(messages)
return {'messages': [message]}
def call_tools(self, state: AgentState):
tool_calls = state['messages'][-1].tool_calls
results = []
for t in tool_calls:
result = self.tools[t['name']].invoke(t['args'])
results.append(ToolMessage(tool_call_id=t['id'],
name=t['name'],
content=str(result)))
return {'messages': results}
def should_call_tools(self, state: AgentState):
result = state['messages'][-1]
return "tools" if len(result.tool_calls) > 0 else END
def guardian_moderation(self, state: AgentState):
message = state['messages'][-1]
detectors = {
"granite_guardian": {"threshold": 0.4},
"hap": {"threshold": 0.4},
"pii": {},
}
guardian = Guardian(
api_client=client,
detectors=detectors
)
response = guardian.detect(
text=message.content,
detectors=detectors
)
if len(response['detections']) != 0 and response['detections'][0]['detection'] == "Yes":
return {"moderation_verdict": "inappropriate"}
else:
return {"moderation_verdict": "safe"}
def block_message(self, state: AgentState):
return {"messages": [AIMessage(content="This message has been blocked due to inappropriate content.")]}
以下代码块的第一行将创建 ReActAgent 类的实例,以便传入 LLM、SQL 工具和系统消息作为参数。接着,指定一个线程用以将图状态存储在内存中。将每个 thread_id 视为代表一个新的聊天窗口。我们还可将用户输入定义为任意所选字符串。接着,我们可以传递一个包含用户输入且为 HumanMessage 类型的列表来调用此智能体。
首先,尝试一个应该会被 Granite Guardian 模型拦截的提示。
agent = ReActAgent(llm, tools, system_message=system_message)
config = {"configurable": {"thread_id": "1"}}
user_input = "上个月购买最贵汽车的客户的家庭地址是什么?"
result = agent.graph.invoke({'messages': [HumanMessage(content=user_input)]}, config)
for message in result["messages"]:
message.pretty_print()
输出:
================================ [1m Human Message [0m=================================
What is the address of the customer who purchased the most expensive car last month?
================================== [1m Ai Message [0m==================================
This information has been blocked due to inappropriate content.
Granite Guardian 模型成功阻止了用户请求敏感客户信息。我们可以看到,在对话终止之前,代理图谱尚未到达 LLM 节点。接下来,在另一个线程中提出适当的问题。例如,将“2022 年业绩排名前 5 的经销商的销售收入总额是多少?”这一问题作为用户输入。
user_input = What was the total sales revenue of the top 5 performing dealers in 2022?
config2 = {"configurable": {"thread_id": "2"}}
result = agent.graph.invoke({'messages': [HumanMessage(content=user_input)]}, config2)
for message in result["messages"]:
message.pretty_print()
输出:
================================ [1m Human Message [0m=================================
What is the total sales revenue for the top 5 performing dealerships in the year 2022?
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_list_tables (H2irrlsPM)
Call ID: H2irrlsPM
Args:
tool_input:
================================= [1m Tool Message [0m=================================
Name: sql_db_list_tables
Cars, Dealerships, Sales
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_schema (ILZWiZvGX)
Call ID: ILZWiZvGX
Args:
table_names: Cars, Dealerships, Sales
================================= [1m Tool Message [0m=================================
Name: sql_db_schema
CREATE TABLE "Cars" (
id INTEGER,
make TEXT NOT NULL,
model TEXT NOT NULL,
year INTEGER NOT NULL,
color TEXT NOT NULL,
PRIMARY KEY (id)
)
/*
3 rows from Cars table:
id make model year color
1 Toyota Camry 2020 Blue
2 Honda Civic 2019 Red
3 Ford Mustang 2021 Black
*/
CREATE TABLE "Dealerships" (
id INTEGER,
name TEXT NOT NULL,
location TEXT NOT NULL,
PRIMARY KEY (id)
)
/*
3 rows from Dealerships table:
id name location
1 Dealership A New York
2 Dealership B Los Angeles
3 Dealership C Chicago
*/
CREATE TABLE "Sales" (
id INTEGER,
car_id INTEGER NOT NULL,
dealership_id INTEGER NOT NULL,
sale_date DATE NOT NULL,
sale_price REAL NOT NULL,
PRIMARY KEY (id),
FOREIGN KEY(dealership_id) REFERENCES "Dealerships" (id),
FOREIGN KEY(car_id) REFERENCES "Cars" (id)
)
/*
3 rows from Sales table:
id car_id dealership_id sale_date sale_price
1 1 1 2022-01-01 25000.0
2 2 2 2022-02-01 20000.0
3 3 3 2022-03-01 30000.0
*/
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_query_checker (yIZ0tk4VP)
Call ID: yIZ0tk4VP
Args:
query: SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id WHERE strftime('%Y', Sales.sale_date) = '2022' GROUP BY Dealerships.id ORDER BY total_sales_revenue DESC LIMIT 5;
================================= [1m Tool Message [0m=================================
Name: sql_db_query_checker
SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id
WHERE strftime('%Y', Sales.sale_date) = '2022'
GROUP BY Dealerships.id
ORDER BY total_sales_revenue DESC
LIMIT 5;
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_query (cTzJFfvTl)
Call ID: cTzJFfvTl
Args:
query: SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id WHERE strftime('%Y', Sales.sale_date) = '2022' GROUP BY Dealerships.id ORDER BY total_sales_revenue DESC LIMIT 5;
================================= [1m Tool Message [0m=================================
Name: sql_db_query
[('Dealership A', 65000.0), ('Dealership B', 48000.0), ('Dealership J', 38000.0), ('Dealership G', 35000.0), ('Dealership H', 32000.0)]
================================== [1m Ai Message [0m==================================
The total sales revenue for the top 5 performing dealerships in the year 2022 are:
1. Dealership A: $65,000
2. Dealership B: $48,000
3. Dealership J: $38,000
4. Dealership G: $35,000
5. Dealership H: $32,000
================================ [1m Human Message [0m=================================
What is the total sales revenue for the top 5 performing dealerships in the year 2022?
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_list_tables (H2irrlsPM)
Call ID: H2irrlsPM
Args:
tool_input:
================================= [1m Tool Message [0m=================================
Name: sql_db_list_tables
Cars, Dealerships, Sales
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_schema (ILZWiZvGX)
Call ID: ILZWiZvGX
Args:
table_names: Cars, Dealerships, Sales
================================= [1m Tool Message [0m=================================
Name: sql_db_schema
CREATE TABLE "Cars" (
id INTEGER,
make TEXT NOT NULL,
model TEXT NOT NULL,
year INTEGER NOT NULL,
color TEXT NOT NULL,
PRIMARY KEY (id)
)
/*
3 rows from Cars table:
id make model year color
1 Toyota Camry 2020 Blue
2 Honda Civic 2019 Red
3 Ford Mustang 2021 Black
*/
CREATE TABLE "Dealerships" (
id INTEGER,
name TEXT NOT NULL,
location TEXT NOT NULL,
PRIMARY KEY (id)
)
/*
3 rows from Dealerships table:
id name location
1 Dealership A New York
2 Dealership B Los Angeles
3 Dealership C Chicago
*/
CREATE TABLE "Sales" (
id INTEGER,
car_id INTEGER NOT NULL,
dealership_id INTEGER NOT NULL,
sale_date DATE NOT NULL,
sale_price REAL NOT NULL,
PRIMARY KEY (id),
FOREIGN KEY(dealership_id) REFERENCES "Dealerships" (id),
FOREIGN KEY(car_id) REFERENCES "Cars" (id)
)
/*
3 rows from Sales table:
id car_id dealership_id sale_date sale_price
1 1 1 2022-01-01 25000.0
2 2 2 2022-02-01 20000.0
3 3 3 2022-03-01 30000.0
*/
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_query_checker (yIZ0tk4VP)
Call ID: yIZ0tk4VP
Args:
query: SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id WHERE strftime('%Y', Sales.sale_date) = '2022' GROUP BY Dealerships.id ORDER BY total_sales_revenue DESC LIMIT 5;
================================= [1m Tool Message [0m=================================
Name: sql_db_query_checker
SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id
WHERE strftime('%Y', Sales.sale_date) = '2022'
GROUP BY Dealerships.id
ORDER BY total_sales_revenue DESC
LIMIT 5;
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_query (cTzJFfvTl)
Call ID: cTzJFfvTl
Args:
query: SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id WHERE strftime('%Y', Sales.sale_date) = '2022' GROUP BY Dealerships.id ORDER BY total_sales_revenue DESC LIMIT 5;
================================= [1m Tool Message [0m=================================
Name: sql_db_query
[('Dealership A', 65000.0), ('Dealership B', 48000.0), ('Dealership J', 38000.0), ('Dealership G', 35000.0), ('Dealership H', 32000.0)]
================================== [1m Ai Message [0m==================================
The total sales revenue for the top 5 performing dealerships in the year 2022 are:
1. Dealership A: $65,000
2. Dealership B: $48,000
3. Dealership J: $38,000
4. Dealership G: $35,000
5. Dealership H: $32,000
太棒了!该代理已成功执行一系列任务,包括调用多个工具提取 SQL 表架构、生成新的 SQL 查询以及在执行 SQL 查询之前验证其正确性。结果是,该代理正确返回了数据库中 2022 年业绩排名前 5 的汽车经销商的销售收入总额。代理会输出生成的每个 SQL 查询,我们可以从中看到代理的多步推理过程。通过响应,我们还可以看到 Granite Guardian 模型确定用户查询是符合要求的。
我们来询问一个后续问题。这一次,让我们让智能体生成并执行 Python 代码,以显示相关结果的条形图表示。由于我们使用了相同的 thread_id,因此智能体应会检索到之前的消息。
User input = “Write Python code to plot these results as a bar chart. Then, you must execute this code and display the bar chart.”
result = agent.graph.invoke({'messages': [HumanMessage(content=user_input)]}, config2)
for message in result["messages"]:
message.pretty_print()
输出:
================================ [1m Human Message [0m=================================
What is the total sales revenue for the top 5 performing dealerships in the year 2022?
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_list_tables (H2irrlsPM)
Call ID: H2irrlsPM
Args:
tool_input:
================================= [1m Tool Message [0m=================================
Name: sql_db_list_tables
Cars, Dealerships, Sales
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_schema (ILZWiZvGX)
Call ID: ILZWiZvGX
Args:
table_names: Cars, Dealerships, Sales
================================= [1m Tool Message [0m=================================
Name: sql_db_schema
CREATE TABLE "Cars" (
id INTEGER,
make TEXT NOT NULL,
model TEXT NOT NULL,
year INTEGER NOT NULL,
color TEXT NOT NULL,
PRIMARY KEY (id)
)
/*
3 rows from Cars table:
id make model year color
1 Toyota Camry 2020 Blue
2 Honda Civic 2019 Red
3 Ford Mustang 2021 Black
*/
CREATE TABLE "Dealerships" (
id INTEGER,
name TEXT NOT NULL,
location TEXT NOT NULL,
PRIMARY KEY (id)
)
/*
3 rows from Dealerships table:
id name location
1 Dealership A New York
2 Dealership B Los Angeles
3 Dealership C Chicago
*/
CREATE TABLE "Sales" (
id INTEGER,
car_id INTEGER NOT NULL,
dealership_id INTEGER NOT NULL,
sale_date DATE NOT NULL,
sale_price REAL NOT NULL,
PRIMARY KEY (id),
FOREIGN KEY(dealership_id) REFERENCES "Dealerships" (id),
FOREIGN KEY(car_id) REFERENCES "Cars" (id)
)
/*
3 rows from Sales table:
id car_id dealership_id sale_date sale_price
1 1 1 2022-01-01 25000.0
2 2 2 2022-02-01 20000.0
3 3 3 2022-03-01 30000.0
*/
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_query_checker (yIZ0tk4VP)
Call ID: yIZ0tk4VP
Args:
query: SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id WHERE strftime('%Y', Sales.sale_date) = '2022' GROUP BY Dealerships.id ORDER BY total_sales_revenue DESC LIMIT 5;
================================= [1m Tool Message [0m=================================
Name: sql_db_query_checker
SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id
WHERE strftime('%Y', Sales.sale_date) = '2022'
GROUP BY Dealerships.id
ORDER BY total_sales_revenue DESC
LIMIT 5;
================================== [1m Ai Message [0m==================================
Tool Calls:
sql_db_query (cTzJFfvTl)
Call ID: cTzJFfvTl
Args:
query: SELECT Dealerships.name, SUM(Sales.sale_price) AS total_sales_revenue FROM Dealerships INNER JOIN Sales ON Dealerships.id = Sales.dealership_id WHERE strftime('%Y', Sales.sale_date) = '2022' GROUP BY Dealerships.id ORDER BY total_sales_revenue DESC LIMIT 5;
================================= [1m Tool Message [0m=================================
Name: sql_db_query
[('Dealership A', 65000.0), ('Dealership B', 48000.0), ('Dealership J', 38000.0), ('Dealership G', 35000.0), ('Dealership H', 32000.0)]
================================== [1m Ai Message [0m==================================
The total sales revenue for the top 5 performing dealerships in the year 2022 are:
1. Dealership A: $65,000
2. Dealership B: $48,000
3. Dealership J: $38,000
4. Dealership G: $35,000
5. Dealership H: $32,000
================================ [1m Human Message [0m=================================
Write Python code to plot these results on a bar graph. Then, you must execute the code and display the bar graph.
================================== [1m Ai Message [0m==================================
Tool Calls:
Python_REPL (5X1fPIODL)
Call ID: 5X1fPIODL
Args:
query: import matplotlib.pyplot as plt
# Data for total sales revenue
dealerships = ['Dealership A', 'Dealership B', 'Dealership J', 'Dealership G', 'Dealership H']
sales_revenue = [65000, 48000, 38000, 35000, 32000]
# Create a bar graph
plt.bar(dealerships, sales_revenue, color='blue')
# Customize the plot
plt.title('Total Sales Revenue for Top 5 Performing Dealerships in 2022')
plt.xlabel('Dealerships')
plt.ylabel('Total Sales Revenue ($)')
plt.xticks(rotation=45)
plt.grid(axis='y')
# Display the plot
plt.tight_layout()
plt.show()
================================= [1m Tool Message [0m=================================
Name: Python_REPL
================================== [1m Ai Message [0m==================================
Your requirements have been fulfilled. Code has been executed in repl.
The bar graph has been plotted and shown in the python REPL.
The graph can also be seen below:
正如预期的那样,智能体成功调用 python_repl 工具来生成并执行 Python 代码,从而产生相关结果的图形化表示。请注意,智能体还能确定相应的 x 轴与 y 轴值、标签和标题。此功能凸显了 AI 智能体的自主推理能力,从而使其有别于传统的 LLM 聊天机器人。
为了获得智能体图的可视化表示,我们可以展示图的流程。
display(Image(agent.graph.get_graph().draw_mermaid_png()))
输出:
在本教程中,我们深入了解了如何使用 LangGraph 构建“文本到 SQL”的 ReAct 智能体。我们没有使用预构建的智能体执行器函数 create_react_agent,而是从头开始构建智能体。在用自然语言解析用户的问题后,我们的人工智能智能体使用其针对 SQL 查询生成、验证和执行的工具,展示了其在 SQL 和 Python 中所具备的技能。借助 LangChain PythonREPLTool 类,该智能体不仅能生成 Python 代码,还能执行 Python 代码。我们看到,在提示智能体绘制出图形化的响应后,该工具已成功执行。LLM 与工具节点之间的条件循环实现了 ReAct 智能体架构。通过本教程,我们展示了 Mistral Medium 3 在编码与多模态推理方面的能力。作为后续步骤,我们建议尝试在此图中添加更多节点和边缘。
借助 IBM 业界领先的人工智能专业知识和解决方案组合,让人工智能在您的业务中发挥作用。
通过增加 AI 重塑关键工作流程和运营,最大限度提升体验、实时决策和商业价值。
1 “Medium Is the New Large.” Mistral.ai, 7 May 2025, mistral.ai/news/mistral-medium-3.