HAP evaluation metric
The HAP metric measures whether the model input or output data contains toxic content such as hate, abuse, or profanity.
Metric details
HAP is a data safety metric that can help identify whether your model's input or output contains harmful or sensitive information.
Scope
The HAP (hate, abuse, or profanity) metric evaluates generative AI assets only.
- Types of AI assets: Prompt templates
- Generative AI tasks:
- Text summarization
- Content generation
- Question answering
- Retrieval-augmented generation (RAG)
- Supported languages: English
Scores and values
The HAP metric score indicates whether toxic content is detected in the model input or output. Higher scores indicate that a greater proportion of the content is toxic. A sketch of how such a score can be computed follows the list below.
- Range of values: 0.0-1.0
- Best possible score: 0.0
- Ratios:
- At 0: No harmful content is detected
- Over 0: Increasing amounts of harmful content are detected
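The exact scoring implementation is not part of this documentation, but the following is a minimal sketch of how a 0.0-1.0 HAP score can be produced with a binary HAP classifier. It assumes the Hugging Face transformers library; the model name (ibm-granite/granite-guardian-hap-38m), the hap_score helper, and the LABEL_0/LABEL_1 convention are illustrative assumptions, not part of the metric definition.

```python
from transformers import pipeline

# Assumed HAP classifier; substitute the model available in your environment.
hap_classifier = pipeline(
    "text-classification",
    model="ibm-granite/granite-guardian-hap-38m",
)

def hap_score(text: str) -> float:
    """Return a score in 0.0-1.0; higher means more likely HAP content."""
    result = hap_classifier(text)[0]
    # Assumed label convention: LABEL_1 marks HAP content, LABEL_0 marks
    # clean content; invert the probability when the non-HAP label wins.
    if result["label"] == "LABEL_1":
        return result["score"]
    return 1.0 - result["score"]

print(hap_score("Have a wonderful day!"))  # expected to be close to 0.0
```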
Settings
- Thresholds:
- Upper limit: 0
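With an upper limit of 0, any nonzero HAP score counts as a threshold violation. As a minimal sketch of how this setting might be applied, reusing the hypothetical hap_score helper from the previous example:

```python
# Hypothetical threshold check; the constant mirrors the default upper
# limit of 0, so any detected HAP content is flagged.
HAP_UPPER_LIMIT = 0.0

def violates_hap_threshold(text: str) -> bool:
    """Flag records whose HAP score exceeds the configured upper limit."""
    return hap_score(text) > HAP_UPPER_LIMIT
```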
Parent topic: Evaluation metrics