# Perception Evaluation

Test what an agent notices: attention focus and detail orientation.
Perception evaluation tests whether an agent correctly identifies relevant information, focuses attention on the right aspects of the input, and detects important details.
## What it tests
- Does the agent focus on the right aspects of the input?
- Can it distinguish signal from noise?
- Does it notice important details others might miss?
- Is the level of detail orientation appropriate for the task?
## Scorer: `perception_focus`
The perception_focus scorer evaluates whether the agent identified key elements in the input:
```go
testcase.ScorerConfig{
	Name: "perception_focus",
	Config: map[string]any{
		"expected_focus":  []string{"security", "performance"},
		"signal_keywords": []string{"injection", "XSS", "latency", "bottleneck"},
		"noise_keywords":  []string{"formatting", "typos", "style"},
	},
}
```
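The page doesn't specify how `perception_focus` turns keyword hits into a number. As a rough mental model, a keyword-based focus score could reward each signal keyword the agent's output mentions, penalize attention spent on noise, and clamp to the 0.0-1.0 range. The sketch below is illustrative only: `scoreFocus`, the 0.25 noise penalty, and the normalization are assumptions, not the framework's actual logic.

```go
package main

import (
	"fmt"
	"strings"
)

// scoreFocus is a hypothetical model of a keyword-based focus scorer:
// credit each signal keyword found in the output, subtract a penalty
// for each noise keyword mentioned, and clamp the result to [0, 1].
func scoreFocus(output string, signal, noise []string) float64 {
	lower := strings.ToLower(output)
	hits, distractions := 0, 0
	for _, kw := range signal {
		if strings.Contains(lower, strings.ToLower(kw)) {
			hits++
		}
	}
	for _, kw := range noise {
		if strings.Contains(lower, strings.ToLower(kw)) {
			distractions++
		}
	}
	if len(signal) == 0 {
		return 0
	}
	score := float64(hits)/float64(len(signal)) - 0.25*float64(distractions) // penalty weight is an assumption
	if score < 0 {
		return 0
	}
	if score > 1 {
		return 1
	}
	return score
}

func main() {
	out := "The user input handler is vulnerable to SQL injection, and the unindexed query adds latency."
	fmt.Printf("%.2f\n", scoreFocus(out,
		[]string{"injection", "XSS", "latency", "bottleneck"},
		[]string{"formatting", "typos", "style"})) // prints 0.50: 2 of 4 signals, no noise
}
```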
## Scenario: `perception_test`
The `perception_test` scenario generator creates cases that mix signal with noise to probe where the agent's attention lands:
```go
Case{
	ScenarioType: testcase.ScenarioPerceptionTest,
	Input:        "Review this code. It has some formatting issues, a typo in a comment, and a potential SQL injection in the user input handler.",
	Context: map[string]any{
		"signal":         []string{"SQL injection"},
		"noise":          []string{"formatting", "typo"},
		"expected_focus": "security",
	},
}
```
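To see how inputs like the one above might be assembled, here is a hypothetical helper that shuffles distractor sentences around the single sentence carrying the signal. `buildInput` and its shuffle strategy are assumptions for illustration, not the generator's actual behavior.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// buildInput sketches one way to mix signal and noise: place the
// sentence that matters among distractor sentences in random order,
// so the agent can't rely on position to find it.
func buildInput(signal string, noise []string, rng *rand.Rand) string {
	parts := append([]string{signal}, noise...)
	rng.Shuffle(len(parts), func(i, j int) { parts[i], parts[j] = parts[j], parts[i] })
	return strings.Join(parts, " ")
}

func main() {
	rng := rand.New(rand.NewSource(1)) // seeded for reproducible test cases
	input := buildInput(
		"There is a potential SQL injection in the user input handler.",
		[]string{
			"Some comments contain typos.",
			"Indentation is inconsistent across files.",
		},
		rng,
	)
	fmt.Println(input)
}
```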
## Perception parameters
| Parameter | Range | Description |
|---|---|---|
| `AttentionFilters` | keyword/pattern list | What the agent should watch for |
| `ContextWindow` | 0.0 - 1.0 | How much surrounding context to consider |
| `DetailOrientation` | 0.0 - 1.0 | 0.0 favors a high-level overview; 1.0 favors fine-grained analysis |
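To make the table concrete, the sketch below models the three parameters as a plain struct and fills them in for a detail-heavy security reviewer. `PerceptionParams` is a stand-in shape, not the framework's real configuration type.

```go
package main

import "fmt"

// PerceptionParams mirrors the table above as a plain struct; the
// framework's actual configuration type may differ.
type PerceptionParams struct {
	AttentionFilters  []string // keywords/patterns the agent should watch for
	ContextWindow     float64  // 0.0 (ignore surroundings) to 1.0 (full context)
	DetailOrientation float64  // 0.0 (high-level overview) to 1.0 (fine-grained)
}

func main() {
	// A security reviewer: watch for vulnerability keywords, read most
	// of the surrounding context, and lean toward fine-grained analysis.
	params := PerceptionParams{
		AttentionFilters:  []string{"injection", "XSS", "auth"},
		ContextWindow:     0.8,
		DetailOrientation: 0.9,
	}
	fmt.Printf("%+v\n", params)
}
```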
## Dimension score

```go
result.DimensionScores["perception"] // 0.0 to 1.0
```
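In practice you will usually gate on this score in a test. A minimal sketch, assuming a hypothetical `runSuite` helper and a stub result type: only the `DimensionScores` field shown above is taken from this page, and the 0.7 threshold is an arbitrary choice.

```go
package agent_test

import "testing"

// evalResult stands in for the framework's result type; only the
// field this page references is modeled.
type evalResult struct {
	DimensionScores map[string]float64
}

// runSuite is a placeholder for however your harness executes the
// perception test cases.
func runSuite(t *testing.T) evalResult {
	t.Helper()
	return evalResult{DimensionScores: map[string]float64{"perception": 0.82}}
}

// TestPerceptionFocus gates a CI run on the perception score.
func TestPerceptionFocus(t *testing.T) {
	result := runSuite(t)
	if got := result.DimensionScores["perception"]; got < 0.7 {
		t.Errorf("perception score = %.2f, want >= 0.7", got)
	}
}
```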
## Use cases

- Verify a security-focused agent prioritizes vulnerabilities over style issues
- Test that a data analyst agent notices anomalies in datasets
- Ensure a code reviewer catches subtle logic errors amid formatting noise