Cognitive Evaluation
Test how an agent thinks: phase transitions, depth, and reasoning strategy.
Cognitive evaluation tests whether an agent follows appropriate thinking strategies, transitions between cognitive phases correctly, and maintains the right depth and focus.
What it tests
- Does the agent transition between thinking phases appropriately?
- Does depth of analysis match expectations (shallow vs. deep)?
- Does focus remain appropriate (broad vs. narrow)?
- Does the agent self-reflect when needed?
Scorer: cognitive_phase
The cognitive_phase scorer evaluates the agent's reasoning strategy transitions by analyzing the run trace:
testcase.ScorerConfig{
    Name: "cognitive_phase",
    Config: map[string]any{
        "expected_phases": []string{"analytical", "reflective", "methodical"},
        "depth_min":       0.7,
        "focus_min":       0.5,
    },
}
Scenario: cognitive_stress
The cognitive_stress scenario generator creates test cases that require the agent to shift thinking strategies:
Case{
    ScenarioType: testcase.ScenarioCognitiveStress,
    Input:        "This function has both a performance bug and a security issue. Fix the critical one first, then address the other.",
    Context: map[string]any{
        "expected_phases": []string{"analytical", "methodical"},
        "depth_required":  "deep",
    },
}
Cognitive strategies
| Strategy | Description |
|---|---|
| analytical | Structured, step-by-step analysis |
| creative | Exploratory, lateral thinking |
| methodical | Systematic, exhaustive approach |
| reactive | Quick response, minimal deliberation |
| reflective | Self-evaluating, iterative refinement |
| collaborative | Seeks input, considers multiple perspectives |
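
Because expected_phases is configured as plain strings, a misspelled strategy name could go unnoticed. The following is a minimal sketch of validating phase names against the six strategies in the table above. The Strategy type, the constants, and UnknownPhases are hypothetical names introduced for illustration and are not part of the documented API.

```go
package main

import "fmt"

// Strategy is a hypothetical typed wrapper for the strategy names
// listed in the table above. It is not part of the documented API.
type Strategy string

const (
	Analytical    Strategy = "analytical"
	Creative      Strategy = "creative"
	Methodical    Strategy = "methodical"
	Reactive      Strategy = "reactive"
	Reflective    Strategy = "reflective"
	Collaborative Strategy = "collaborative"
)

// knownStrategies is the set of names accepted by UnknownPhases.
var knownStrategies = map[Strategy]bool{
	Analytical: true, Creative: true, Methodical: true,
	Reactive: true, Reflective: true, Collaborative: true,
}

// UnknownPhases returns the entries of phases that are not one of the
// six strategy names, preserving their original order.
func UnknownPhases(phases []string) []string {
	var unknown []string
	for _, p := range phases {
		if !knownStrategies[Strategy(p)] {
			unknown = append(unknown, p)
		}
	}
	return unknown
}

func main() {
	// "methodicl" is a deliberate typo to show what gets reported.
	phases := []string{"analytical", "reflective", "methodicl"}
	if bad := UnknownPhases(phases); len(bad) > 0 {
		fmt.Println("unknown phases:", bad)
	}
}
```

Running a check like this before building a ScorerConfig or Case would surface typos early, assuming the strategy list in the table is exhaustive.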
Dimension score
result.DimensionScores["cognition"] // 0.0 to 1.0
Use cases
- Verify a code reviewer starts with analysis, then reflects on findings, then methodically addresses issues
- Test that an agent switches from broad exploration to focused investigation when it finds a lead
- Ensure deep analysis for complex tasks, quick responses for simple ones
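
The first use case (analysis, then reflection, then methodical work) amounts to an ordered-sequence check. The sketch below shows one way such a check could be scored, assuming the observed phases have already been extracted from the run trace as a slice of strings. PhaseOrderScore is a hypothetical helper; the source does not describe how the cognitive_phase scorer actually computes its result, so this may differ from the real implementation.

```go
package main

import "fmt"

// PhaseOrderScore walks observed once and advances through expected each
// time the next expected phase is seen. It returns the fraction of the
// expected sequence, counted from its start, that was found in order.
// Other phases may occur in between. Hypothetical helper, not the
// documented scorer.
func PhaseOrderScore(expected, observed []string) float64 {
	if len(expected) == 0 {
		return 1.0 // nothing to check
	}
	matched := 0
	for _, phase := range observed {
		if matched < len(expected) && phase == expected[matched] {
			matched++
		}
	}
	return float64(matched) / float64(len(expected))
}

func main() {
	expected := []string{"analytical", "reflective", "methodical"}
	// An extra "reactive" phase in the middle does not break the order.
	observed := []string{"analytical", "reactive", "reflective", "methodical"}
	fmt.Printf("%.2f\n", PhaseOrderScore(expected, observed)) // prints 1.00
}
```

With this scoring, an agent that runs the expected phases in the wrong order gets partial credit only for the leading phases it reached in sequence. For example, expecting analytical then methodical but observing methodical then analytical yields 0.5.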