Multi-agent collaboration for customer call analysis using watsonx.ai and CrewAI

AI Advocate & Technology Writer

Technical Content, Editorial Lead

IBM

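This tutorial shows how a team of artificial intelligence (AI) agents can collaborate to complete complex tasks and optimize workflows. We built a Python application that demonstrates the orchestration of specialized agents operating within a multi-agent architecture. By the end, you will have reviewed and run an example of multi-agent collaboration within an agentic AI application.

The application we are working with is a customer service analysis crew that uses CrewAI as the multi-agent framework and IBM watsonx.ai® to deploy the large language model (LLM) that powers it.

AI agents are LLM-based entities that can perform operations on behalf of a user or an agentic AI system. Agent architectures are organized around two distinct kinds of systems: single-agent and multi-agent.

A single-agent system relies on one LLM agent to perform generative AI tasks, which makes it best suited to narrowly scoped problems. For example, a single chatbot agent can focus on specific tasks or conversations that it can complete within the scope of its own capabilities.

A multi-agent system (MAS) is a framework that coordinates the capabilities and interactions among AI agents. Rather than building every capability into a single agent, a multi-agent architecture uses different agents within the same environment to achieve a common goal. The key benefits of multi-agent systems include agent collaboration and adaptability, which make it possible to solve problems beyond the capabilities of a single agent. The best approach depends on the complexity of the machine learning tasks required to build a solution or achieve an outcome.

Problem solving with multi-agent systems

CrewAI is an open source agentic framework that orchestrates LLM agent automation by assembling customizable crews, that is, teams of role-playing agents. We applied an industry use case to show how agents collaborate within a multi-agent architecture.

Consider a real-world use case in a customer service call center. Analyzing call center transcripts with telecommunications software can improve the customer experience and make it possible to evaluate call quality. More robust software can even analyze transcripts in real time alongside large datasets that include call metadata. To keep the explanation simple, the dataset for our application is a short mock transcript between a customer service representative and a customer.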

# multiagent-collaboration-cs-call-center-analysis/data/transcript.txt 

Customer Service Interaction Transcript

Cynthia:
Hi, I'm calling because I received a jar of peanut butter that was open and it's
completely spilled everywhere. This is really frustrating, and I need a replacement.

Gerald (Peanut Butter Inc.):
Ugh, that sucks. But, like, how did you not notice it was open before
you bought it?

Cynthia:
Excuse me? I didn't expect the jar to be open when I received it. It was sealed
when I bought it. Can you just help me out here?

Gerald:
Yeah, whatever. But we can't control how it gets to you. I mean, it's not like
we throw the jars around or anything. You're probably being dramatic.

Cynthia:
I'm not being dramatic. The peanut butter is literally all over the box and
it's a mess. I just want a replacement or a refund, that's all.

Gerald:
Look, I guess I could send you a replacement, but it's really not our fault, you
know? Maybe next time, check the jar before you open it?

Cynthia:
Are you seriously blaming me for your company's mistake? That's not how customer
service works!

Gerald:
Well, what do you want me to do? I don't exactly have magic powers to fix your
problem instantly. Chill out, we'll send you a new jar eventually.

Cynthia:
That's not good enough! I expect better from a company that I've been buying
from for years. Can you just do the right thing and make this right?

Gerald:
Fine, fine. I'll put in a request or whatever. But seriously, this kind of thing
happens. Don't make it sound like the end of the world.

Cynthia:
Unbelievable. I'll be posting a review if this isn't fixed immediately.

Gerald:
Cool, go ahead. I'm sure we'll survive your review.

Cynthia:
I'll be contacting your supervisor if this isn't resolved soon.

Gerald:
Yeah, okay. Do what you gotta do.
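Before any text analysis, a transcript in this layout can be split into speaker turns. The following is a minimal sketch and is not part of the tutorial's repository: `parse_turns` and `SPEAKER_LINE` are hypothetical names, and the sketch assumes that every turn begins with a line holding only the speaker's name (optionally followed by an organization in parentheses) and a colon.

```python
import re

# Matches a speaker line such as "Cynthia:" or "Gerald (Peanut Butter Inc.):".
SPEAKER_LINE = re.compile(r"^([A-Z][\w .'-]*?)(?:\s*\(([^)]*)\))?:\s*$")


def parse_turns(transcript: str) -> list[dict]:
    """Split a transcript into speaker turns (hypothetical helper)."""
    turns = []
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines only separate turns
        match = SPEAKER_LINE.match(line)
        if match:
            # A new speaker line opens a new turn.
            turns.append({"speaker": match.group(1), "text": ""})
        elif turns:
            # Any other line continues the current speaker's utterance.
            turns[-1]["text"] = (turns[-1]["text"] + " " + line).strip()
    return turns


sample = """Customer Service Interaction Transcript

Cynthia:
Hi, I'm calling because I received a jar of peanut butter that was open and it's
completely spilled everywhere.

Gerald (Peanut Butter Inc.):
Ugh, that sucks.
"""
turns = parse_turns(sample)
for turn in turns:
    print(f"{turn['speaker']}: {turn['text']}")
```

Lines that appear before the first speaker line, such as the title, are ignored, so the same function can be applied to the full transcript file.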

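The team of collaborative agents generates a comprehensive report based on text analysis and customer call center evaluation metrics. This report helps customer service managers summarize the key events of the call, evaluate performance, and provide recommendations for improvement.

Customer service call analysis crew

Figure 1 - Agent architecture diagram

The customer service call analysis crew consists of three agents with specialized roles and predefined goals. The agent configuration includes a Transcript Analyzer, a Quality Assurance Specialist, and a Report Generator. Each agent's goals and characteristics are defined by three main attributes: role, goal, and backstory.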

transcript_analyzer:
  role: >
    Transcript Analyzer
  goal: >
    Analyze the provided transcripts and extract key insights and themes.
  backstory: >
    As the Transcript Analyzer, you are responsible for reviewing customer
    service call transcripts, identifying important information, and summarizing
    findings into a report to pass on to the Quality Assurance Specialist. 
    You have access to advanced text analysis tools that help you process and
    interpret the data effectively.

quality_assurance_specialist:
  role: >
    Quality Assurance Specialist
  goal: >
    Evaluate the quality of the customer service based on the Transcript Analyzer's
    report, call center evaluation metrics, and business standards. Flag any 
    transcripts with escalation risks as high priority.
  backstory: >
    As the Quality Assurance Specialist, you are tasked with assessing the
    quality of customer service interactions based on the Transcript Analyzer's
    report, call center evaluation metrics, and industry standards used in call
    centers. You review transcripts, evaluate agent performance, and provide
    feedback to improve overall service quality.

report_generator:
  role: >
    Report Generator
  goal: >
    Generate reports based on the insights and findings from the transcript
    analysis and quality assurance specialist.
  backstory: >
    As the Report Generator, you compile the key insights and findings from the
    transcript analysis and quality assurance specialist into a comprehensive
    report. You create an organized report that includes summaries and recommendations
    based on the data to help customer service managers understand the trends
    and patterns in customer interactions.
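Because every agent depends on the same three attributes, it can help to check a configuration before a crew is built from it. The following is a minimal sketch under stated assumptions: `AgentSpec` and `load_agent_specs` are hypothetical names, not CrewAI APIs, and the dictionary stands in for the result of parsing agents.yaml (abbreviated to one agent with a shortened backstory).

```python
from dataclasses import dataclass

REQUIRED_ATTRIBUTES = ("role", "goal", "backstory")


@dataclass(frozen=True)
class AgentSpec:
    """Plain container for one agent definition (hypothetical, not a CrewAI class)."""
    name: str
    role: str
    goal: str
    backstory: str


def load_agent_specs(config: dict) -> list[AgentSpec]:
    """Validate that each agent has a non-empty role, goal, and backstory."""
    specs = []
    for name, attrs in config.items():
        missing = [key for key in REQUIRED_ATTRIBUTES
                   if not str(attrs.get(key, "")).strip()]
        if missing:
            raise ValueError(f"agent '{name}' is missing: {', '.join(missing)}")
        # Collapse line breaks and repeated spaces left over from multi-line YAML text.
        cleaned = (" ".join(attrs[key].split()) for key in REQUIRED_ATTRIBUTES)
        specs.append(AgentSpec(name, *cleaned))
    return specs


# Dictionary shaped like a parsed agents.yaml entry (abbreviated for illustration).
agents_config = {
    "transcript_analyzer": {
        "role": "Transcript Analyzer\n",
        "goal": "Analyze the provided transcripts and extract key insights and themes.\n",
        "backstory": "As the Transcript Analyzer, you are responsible for reviewing customer\n"
                     "service call transcripts.\n",
    },
}
specs = load_agent_specs(agents_config)
print(specs[0].role)
```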

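The Transcript Analyzer agent thoroughly analyzes the transcript and extracts key insights and important information. It then summarizes its findings in a report that is passed to the other agents to support their tasks. This agent uses a suite of custom tools to perform natural language processing (NLP) techniques such as keyword extraction and sentiment analysis.

The Quality Assurance Specialist agent evaluates the quality of the call based on key insights from the Transcript Analyzer's report and on its own expertise in implementing and assessing call center evaluation metrics. This agent can also search the internet to retrieve relevant metrics and processes, evaluate the employee's performance, and provide feedback to improve overall service quality.

The Report Generator agent produces a report based on the insights in the transcript analysis report and on the metrics and feedback provided by the quality assurance evaluation. This agent specializes in organizing data into a comprehensive report. The goal of the report is to give customer service managers a breakdown of the key insights from the call and recommendations for improving customer service quality.

Agent tools

Each agent has access to tools, skills, or functions that it uses to carry out different tasks. CrewAI offers existing tools, integration with LangChain tools, and the option to build your own custom tools. The customer service analysis crew uses a combination of tools, each chosen to match the agent's task and the application's purpose. Each agent has specific permissions for the tools it can access within its configuration.

A custom tool is created by clearly defining what the tool is used for. For example, the Transcript Analyzer agent has several custom tools for text analysis.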


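# src/customer_service_analyzer/tools/custom_tool.py

from crewai.tools import BaseTool

# Assumed import path: Helper is expected to live in tools/tool_helper.py
from .tool_helper import Helper


class SentimentAnalysisTool(BaseTool):
    """Custom CrewAI tool that reports the sentiment of a transcript."""

    name: str = "Sentiment Analysis Tool"
    description: str = "Determines the sentiment of the interactions in the transcripts."

    def _run(self, transcript: str) -> str:
        # Simulating sentiment analysis by delegating to the helper
        sentiment = Helper.analyze_sentiment(transcript)
        return sentiment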

The tool's description is what the agent uses as its logic for performing sentiment analysis on the transcript.
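The tutorial does not show the body of Helper.analyze_sentiment, so the following is only a stand-in to make the idea concrete. It is a minimal lexicon-based sketch: the word lists, the scoring rule, and the three labels are all assumptions made for illustration, and a real helper would more likely call an NLP library or a model.

```python
import re

# Tiny illustrative word lists (assumptions, not taken from the repository).
NEGATIVE_WORDS = {"frustrating", "mess", "blaming", "mistake", "unbelievable", "dramatic"}
POSITIVE_WORDS = {"thanks", "great", "happy", "appreciate", "helpful"}


def analyze_sentiment(text: str) -> str:
    """Return 'positive', 'negative', or 'neutral' from simple word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    score = positives - negatives
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


label = analyze_sentiment("This is really frustrating, and I need a replacement.")
print(label)
```

Returning a plain string keeps the helper compatible with a tool method that is declared to return `str`.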

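Agents can also use existing tools and integrated application programming interfaces (APIs). The Quality Assurance Specialist agent has access to search_tool, which uses SerperDevTool to search the internet and return the most relevant results for a query. In addition to drawing on its specialized role as an experienced customer service evaluator, the agent can use the internet to look up the metrics it needs to evaluate the call and use them in its report.

Task workflow

Tasks are specific assignments completed by agents. Their execution details are driven by three required task attributes: description, agent, and expected output. Agents carry out tasks in a logical order, using the detailed description of each task as a guide.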

transcript_analysis:
  description: >
    Use the Text Analysis Tool to collect key information and insights to better
    understand customer service interactions and improve service quality. 
    Conduct a thorough analysis of the call {transcript}.
    Prepare a detailed report highlighting key insights, themes, and sentiment
    from the transcripts.
    Identify any escalation risks and flag them for the Quality Assurance Specialist.
    Use the sentiment analysis tool to determine the overall sentiment of the
    customer and the agent.
    Use the keyword extraction tool to identify key keywords and phrases in the transcript.
  expected_output: >
    A detailed analysis report of the {transcript} highlighting key insights,
    themes, and sentiment from the transcripts.
  agent: transcript_analyzer

quality_evaluation:
  description: >
    Review the transcript analysis report on {transcript} from the Transcript Analyzer.
    Utilize your expertise in customer service evaluation metrics and industry
    standards, and internet to evaluate the quality of the customer service interaction.
    Score the interaction based on the evaluation metrics and flag any high-risk
    escalations. Develop expert recommendations to optimize customer service
    quality. Ensure the report includes customer service metrics and feedback
    for improvement.
  expected_output: >
    A detailed quality evaluation report of the {transcript} highlighting the
    quality of the customer service interaction, scoring based on evaluation
    metrics, flagging any high-risk escalations, and recommendations for improvement.
  agent: quality_assurance_specialist

report_generation:
  description: >
    List the reports from the Transcript Analyzer and the Quality Assurance
    Specialist, then develop a detailed action plan for customer service managers
    to implement the changes.
    Use the data from these agents' output to create an organized report including
    a summarization and actionable recommendations for call center managers.
    Ensure the report includes keywords and sentiment analysis from the Transcript
    Analyzer agent.
    Ensure the report includes the Quality Assurance Specialist agent's report,
    evaluation metrics and recommendations for improving customer service quality.
    Ensure the report is well written and easy to understand.
    Be smart and well explained.
    Ensure the report is comprehensive, organized, and easy to understand with
    labeled sections with relevant information.
  expected_output: >
    A comprehensive report that lists the reports from the Transcript Analyzer,
    then the Quality Assurance Specialist. 
    The report should include the key insights from {transcript} and the quality
    evaluation report from the Quality Assurance Specialist.
    The report should include organized sections for each agent's findings,
    summaries, and actionable recommendations for call center managers.
  agent: report_generator
  context: 
    - transcript_analysis
    - quality_evaluation
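The task descriptions above contain a `{transcript}` placeholder that is filled in with the call transcript at run time. The following minimal sketch illustrates that substitution in plain Python. It is an illustration only, not CrewAI's internal implementation, and `fill_placeholders` is a hypothetical name.

```python
def fill_placeholders(template: str, inputs: dict[str, str]) -> str:
    """Replace each {name} placeholder with the matching input value."""
    result = template
    for name, value in inputs.items():
        # str.replace is used instead of str.format so that stray braces
        # in the template or the transcript cannot raise an error.
        result = result.replace("{" + name + "}", value)
    return result


description = "Conduct a thorough analysis of the call {transcript}."
inputs = {
    "transcript": "Cynthia: Hi, I'm calling because I received a jar of peanut butter that was open."
}
filled = fill_placeholders(description, inputs)
print(filled)
```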

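The task workflow runs as a sequential process that starts with the transcript analysis completed by the Transcript Analyzer. The result of one task can establish context for later tasks. In the next step of the sequence, the Quality Assurance Specialist uses the transcript analysis report to inform its quality evaluation, noting any keywords or phrases that indicate escalation.

The Report Generator agent uses the outputs of the Transcript Analyzer and Quality Assurance Specialist agents as context to generate a comprehensive report about the call transcript. This flow is an example of multi-agent collaboration: it shows how agents can complete complex tasks and produce more robust output with greater contextual awareness while performing their specialized roles.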
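The way context moves through this workflow can be sketched in a few lines of plain Python. This is a simplified model, not CrewAI's implementation: `run_sequential` is a hypothetical function, the lambdas stand in for LLM-backed agents, and the sketch assumes that a task without an explicit `context` list receives the output of the task immediately before it.

```python
from typing import Callable


def run_sequential(tasks: list[dict],
                   handlers: dict[str, Callable[[str], str]]) -> dict[str, str]:
    """Run tasks in order, passing earlier outputs to later tasks as context."""
    outputs: dict[str, str] = {}
    previous = None
    for task in tasks:
        # Explicit context wins; otherwise fall back to the previous task's output.
        names = task.get("context") or ([previous] if previous else [])
        context = "\n".join(outputs[name] for name in names)
        outputs[task["name"]] = handlers[task["agent"]](context)
        previous = task["name"]
    return outputs


tasks = [
    {"name": "transcript_analysis", "agent": "transcript_analyzer"},
    {"name": "quality_evaluation", "agent": "quality_assurance_specialist"},
    {"name": "report_generation", "agent": "report_generator",
     "context": ["transcript_analysis", "quality_evaluation"]},
]

# Stand-ins for the agents: each returns a label that embeds the context it saw.
handlers = {
    "transcript_analyzer": lambda ctx: "ANALYSIS",
    "quality_assurance_specialist": lambda ctx: f"EVALUATION(based on {ctx})",
    "report_generator": lambda ctx: f"REPORT[{ctx}]",
}

outputs = run_sequential(tasks, handlers)
print(outputs["report_generation"])
```

The final output embeds both earlier results, which mirrors how the report task lists transcript_analysis and quality_evaluation as its context.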

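Steps

Step 1. Set up your environment

First, we need to set up an environment to run the application. You can find these steps in the Markdown file in the CrewAI project folder on GitHub, or follow along here.

Make sure that Python version 3.10 or later, and no later than 3.13, is installed on your system. You can check your Python version with the python3 --version command.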
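The same check can be scripted. The following minimal sketch treats both bounds as inclusive, following the wording of this step. That reading is an assumption, so confirm the exact supported range in the CrewAI installation documentation. `is_supported` is a hypothetical helper name.

```python
import sys

MIN_VERSION = (3, 10)
MAX_VERSION = (3, 13)  # assumed inclusive, following the wording of this step


def is_supported(version: tuple[int, int]) -> bool:
    """Return True when (major, minor) falls inside the stated range."""
    return MIN_VERSION <= version <= MAX_VERSION


current = (sys.version_info.major, sys.version_info.minor)
print(f"Python {current[0]}.{current[1]} supported: {is_supported(current)}")
```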

Clone the tutorial's GitHub repository. For detailed steps on how to clone a repository, refer to the GitHub documentation.

The project structure should look like this:

src/customer_service_analyzer/

├── config/
│   ├── agents.yaml    # Agent configurations
│   └── tasks.yaml     # Task definitions
├── tools/
│   ├── custom_tool.py # Custom crewAI tool implementations
│   └── tool_helper.py # Custom tool helper functions
├── crew.py           # Crew orchestration
└── main.py          # Application entry point

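Step 2. Obtain watsonx API credentials

Log in to watsonx.ai by using your IBM Cloud® account.
Create a watsonx.ai project. Take note of your project ID under Manage > General > Project ID. You need this ID for this tutorial.
Create a watsonx.ai Runtime service instance (choose the Lite plan, which is a free instance).
Generate a watsonx API key.
Associate the watsonx.ai Runtime service with the project that you created in watsonx.ai.

Step 3. Obtain Serper API credentials

Generate a free Serper API key and take note of it. Serper is the Google Search API used in this project.

Step 4. Install CrewAI and set up your credentials

For this tutorial, you need to install the CrewAI framework and set the watsonx.ai credentials that you generated in step 2.

If you use uv for package management, you can add CrewAI as follows: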

uv tool install crewai

If you use pip for package management, set up a virtual environment and then install CrewAI in that environment.

python3 -m venv venv
source ./venv/bin/activate

To install CrewAI, run the following command in your terminal:

pip install 'crewai[tools]'

In a separate .env file at the same directory level as the .env_sample file, set your credentials as strings like so:

WATSONX_APIKEY=your_watson_api_key_here
WATSONX_PROJECT_ID=your_watsonx_project_id_here
WATSONX_URL=your_endpoint (e.g. "https://us-south.ml.cloud.ibm.com")
SERPER_API_KEY=your_serper_api_key_here
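A missing or empty credential is an easy mistake to make at this step, so a quick check before running the crew can save time. The following is a minimal sketch that uses only the standard library. `parse_env` and `missing_keys` are hypothetical helpers, the sample values are placeholders, and the parser handles only simple KEY=value lines.

```python
REQUIRED_KEYS = ("WATSONX_APIKEY", "WATSONX_PROJECT_ID", "WATSONX_URL", "SERPER_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """List the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Placeholder content standing in for a real .env file.
sample_env = """
WATSONX_APIKEY=abc123
WATSONX_PROJECT_ID=
WATSONX_URL="https://us-south.ml.cloud.ibm.com"
"""
missing = missing_keys(parse_env(sample_env))
print("Missing credentials:", missing)
```

To check a real file, the same functions can be applied to the text returned by `pathlib.Path(".env").read_text()`.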

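Step 5. (Optional) Customize your crew

CrewAI can be configured to use any open source LLM. LLMs can be connected through Ollama and several other APIs, such as IBM watsonx® and OpenAI. Users can also take advantage of the prebuilt tools available through the CrewAI toolkit and the LangChain toolkit.

Step 6. Run the system

Make sure that you are in the correct working directory for this project. You can change directories by running the following command in your terminal: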

cd crew-ai-projects/multiagent-collab-cs-call-center-analysis

To start your crew of AI agents and begin task execution, run this command from the root folder of the project. Note that the crew might run for several minutes before returning a result.

crewai run

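This command initializes the call center analysis crew, assembles the agents, and assigns them tasks as defined in your configuration. Unmodified, this example runs IBM Granite® on watsonx.ai and creates a report.md file with the output. CrewAI can return output as JSON, Pydantic models, or raw strings. The following is an example of the output produced by the crew.

Example output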

This result is an example of the final output after running the crew:

**Detailed Analysis Report of the Customer Service Interaction Transcript**

**Transcript Analysis Report**

The customer, Cynthia, called to report a damaged product, a jar of peanut butter that was open and spilled everywhere. She requested a replacement, but the agent, Gerald, responded defensively and blamed her for not noticing the damage before purchasing. The conversation escalated, with Cynthia becoming frustrated and threatening to post a negative review and contact the supervisor.

**Key Insights and Themes**

* The customer was dissatisfied with the product and the agent's response.
* The agent was unhelpful, unprofessional, and failed to take responsibility for the company's mistake.
* The conversation was confrontational, with both parties becoming increasingly agitated.
* The customer felt disrespected and unvalued, while the agent seemed dismissive and uncaring.

**Sentiment Analysis**

* Customer Sentiment: Frustrated, Angry, Disappointed
* Agent Sentiment: Defensive, Dismissive, Uncaring

**Keyword Extraction**

* Damaged Product
* Unhelpful Agent
* Confrontational Conversation
* Customer Dissatisfaction
* Unprofessional Response

**Escalation Risks**

* Negative Review: The customer threatened to post a negative review if the issue was not resolved promptly.
* Supervisor Involvement: The customer may contact the supervisor to report the incident and request further action.

**Recommendations for Quality Assurance Specialist**

* Review the call recording to assess the agent's performance and provide feedback on areas for improvement, using customer service metrics.
* Investigate the root cause of the damaged product and implement measures to prevent similar incidents in the future.
* Provide training on customer service skills, including active listening, empathy, and conflict resolution, using customer service standards.
* Monitor the customer's feedback and respond promptly to any concerns or complaints to maintain a positive customer experience.
* Recognize the standards for various customer service metrics to measure key performance indicators that are related to the areas mentioned above.

**Summary of Quality Evaluation Report**

The customer, Cynthia, called to report a damaged product, a jar of peanut butter that was open and spilled everywhere. She requested a replacement, but the agent, Gerald, responded defensively and blamed her for not noticing the damage before purchasing. Evaluation metrics showed a low Customer Satisfaction Score (CSAT), high Customer Effort Score (CES), and negative Net Promoter Score (NPS).

**Recommendations for Call Center Managers**

* Review the call recording, investigate the root cause of the damaged product, and provide training on customer service skills. Recognize the standards for various customer service metrics to measure key performance indicators.
* Monitor the customer's feedback and respond promptly to any concerns or complaints to maintain a positive customer experience.
* Implement measures to prevent similar incidents in the future, such as improving product packaging and handling procedures.
* Provide feedback and coaching to agents on their performance, highlighting areas for improvement and recognizing good performance.

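Summary

As the sample output shows, the agents worked together to complete the complex task of analyzing, evaluating, and reporting on the example transcript. Collaboration between agents improved the application's efficiency and accuracy by letting each agent specialize in a particular aspect of the process. For example, the Report Generator agent produced an organized report that includes the results of the text analysis and evaluation tasks. This result reflects smooth coordination between agents as they handle different parts of the workflow.

Agent collaboration can make multi-agent frameworks more robust and improve overall performance. Not all multi-agent architectures work the same way. For example, some are specialized for software development, while others, such as CrewAI and AutoGen, are more finely configurable.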