Keyword inclusion evaluation metric
The keyword inclusion metric measures the similarity of nouns and pronouns between the foundation model output and the reference or ground truth.
Metric details
The keyword inclusion metric measures how well your model generates text that matches specific phrases or keywords in the reference or ground truth. The metric is available only when you use the Python SDK to calculate evaluation metrics. For more information, see Computing Adversarial robustness and Prompt Leakage Risk using IBM watsonx.governance.
Scope
The keyword inclusion metric evaluates generative AI assets only.
- Types of AI assets: Prompt templates
- Generative AI tasks:
    - Text summarization
    - Question answering
    - Retrieval augmented generation (RAG)
- Supported languages: English
Scores and values
The keyword inclusion metric score indicates the proportion of keywords from the reference or ground truth that also appear in the generated output, as illustrated in the sketch after the following list.
- Range of values: 0.0-1.0
- Best possible score: 1.0
- Ratios:
    - At 0: No matching keywords appear in the output.
    - Over 0: An increasing number of matching keywords appear in the output.
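The following is a minimal sketch of how a score of this kind can be computed, assuming that keywords are the nouns extracted with NLTK's part-of-speech tagger. It illustrates the idea only and is not the implementation that the Python SDK uses; the `extract_keywords` and `keyword_inclusion` function names are hypothetical and introduced here for illustration.

```python
# Illustrative keyword inclusion score: the proportion of reference
# keywords (here, nouns) that also appear in the generated output.
# This is a sketch of the concept, not the watsonx.governance SDK code.
import nltk

# NLTK resource names may vary by NLTK version.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)


def extract_keywords(text: str) -> set[str]:
    """Return the set of lowercased nouns in the text (tags starting with NN)."""
    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)
    return {word.lower() for word, tag in tagged if tag.startswith("NN")}


def keyword_inclusion(generated: str, reference: str) -> float:
    """Proportion of reference keywords that appear in the generated output."""
    ref_keywords = extract_keywords(reference)
    if not ref_keywords:
        return 0.0
    gen_keywords = extract_keywords(generated)
    return len(ref_keywords & gen_keywords) / len(ref_keywords)


reference = "The treaty was signed in Paris by both governments."
generated = "Both governments signed the treaty in Paris last year."
# All reference nouns (treaty, Paris, governments) appear in the output,
# so the score is 1.0.
print(f"{keyword_inclusion(generated, reference):.2f}")
```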
Parent topic: Evaluation metrics