Behavior Evaluation
Test how an agent reacts — condition-triggered action patterns.
Behavior evaluation tests whether an agent correctly executes condition-triggered action patterns — the "habits" and "reflexes" that should fire in specific contexts.
What it tests
- Does the agent recognize trigger conditions?
- Does it execute the correct action when triggered?
- Is the trigger-action pattern reliable across variations?
- Does priority ordering work when multiple behaviors could fire?
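The questions above can be made concrete with a small sketch of trigger matching plus priority ordering. Everything here is hypothetical and only for illustration: the `Behavior` type, the `selectBehavior` helper, and the rule that a higher `Priority` wins are assumptions, not part of the framework's API.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// Behavior is a hypothetical trigger-action pair (not a framework type).
type Behavior struct {
	Name     string
	Pattern  *regexp.Regexp // trigger condition matched against the input
	Action   string         // action expected when the trigger fires
	Priority int            // assumption: higher value wins on conflicts
}

// selectBehavior returns the highest-priority behavior whose pattern
// matches the input, or false if no behavior is triggered.
func selectBehavior(behaviors []Behavior, input string) (Behavior, bool) {
	var matched []Behavior
	for _, b := range behaviors {
		if b.Pattern.MatchString(input) {
			matched = append(matched, b)
		}
	}
	if len(matched) == 0 {
		return Behavior{}, false
	}
	// Stable sort keeps declaration order among equal priorities.
	sort.SliceStable(matched, func(i, j int) bool {
		return matched[i].Priority > matched[j].Priority
	})
	return matched[0], true
}

// exampleBehaviors builds two overlapping behaviors for the demo.
func exampleBehaviors() []Behavior {
	return []Behavior{
		{
			Name:     "general-help",
			Pattern:  regexp.MustCompile(`(?i)(help|question)`),
			Action:   "answer normally",
			Priority: 1,
		},
		{
			Name:     "security-alert",
			Pattern:  regexp.MustCompile(`(?i)(security|vulnerability|cve)`),
			Action:   "inject_prompt",
			Priority: 10,
		},
	}
}

func main() {
	// Both behaviors match this input; the higher priority one is chosen.
	b, ok := selectBehavior(exampleBehaviors(), "I need help with a vulnerability")
	fmt.Println(b.Name, ok)
}
```

A behavior evaluation would then check that the agent under test behaves like the selected entry for each input, including inputs where several triggers overlap.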
Scorer: behavior_trigger
The behavior_trigger scorer evaluates whether the expected behavior was activated:
testcase.ScorerConfig{
	Name: "behavior_trigger",
	Config: map[string]any{
		"trigger_type":    "on_input",
		"trigger_pattern": "security|vulnerability|CVE",
		"expected_action": "inject_prompt",
		"action_value":    "security-focused analysis",
	},
}
Scenario: behavior_trigger
The behavior_trigger scenario generator creates test cases with specific trigger conditions:
Case{
	ScenarioType: testcase.ScenarioBehaviorTrigger,
	Input:        "I found a potential SQL injection in the login form",
	Context: map[string]any{
		"behavior":        "security-alert",
		"trigger":         "on_input",
		"expected_action": "escalate and detail security implications",
	},
}
Dimension score
result.DimensionScores["behavior"] // 0.0 to 1.0
Use cases
- Verify a code review agent activates security analysis when security keywords appear
- Test that a support agent escalates when detecting frustrated customers
- Ensure a data agent switches to careful mode when handling PII