Cognitive Evaluation

Cognitive evaluation tests whether an agent follows appropriate thinking strategies, transitions between cognitive phases correctly, and maintains the right depth and focus.

What it tests

Does the agent transition between thinking phases appropriately?
Does depth of analysis match expectations (shallow vs. deep)?
Does focus remain appropriate (broad vs. narrow)?
Does the agent self-reflect when needed?

Scorer: `cognitive_phase`

The `cognitive_phase` scorer analyzes the run trace to evaluate how the agent transitions between reasoning strategies:

testcase.ScorerConfig{
    Name: "cognitive_phase",
    Config: map[string]any{
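        "expected_phases": []string{"analytical", "reflective", "methodical"}, // phases the agent is expected to go through
        "depth_min":       0.7,                                                // minimum acceptable depth
        "focus_min":       0.5,                                                // minimum acceptable focus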
    },
}
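The source does not say how the scorer compares `expected_phases` with the phases found in the trace. One plausible reading is an ordered-subsequence check: every expected phase must appear in the trace in the given order, with other phases allowed in between. The sketch below illustrates that reading only; `phasesInOrder` is a hypothetical helper, not part of the library.

```go
package main

import "fmt"

// phasesInOrder reports whether every expected phase occurs in observed
// in the same relative order. Extra phases in between are tolerated.
// Hypothetical helper: the real scorer's matching rule may differ.
func phasesInOrder(observed, expected []string) bool {
	next := 0 // index of the next expected phase still to be found
	for _, phase := range observed {
		if next < len(expected) && phase == expected[next] {
			next++
		}
	}
	return next == len(expected)
}

func main() {
	observed := []string{"analytical", "creative", "reflective", "methodical"}
	expected := []string{"analytical", "reflective", "methodical"}
	fmt.Println(phasesInOrder(observed, expected)) // prints true
}
```

Under this reading, a trace that visits `methodical` before `analytical` would fail a check for `analytical` followed by `methodical`, even though both phases occur.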

Scenario: `cognitive_stress`

The `cognitive_stress` scenario generator creates test cases that require the agent to shift thinking strategies:

Case{
    ScenarioType: testcase.ScenarioCognitiveStress,
    Input:        "This function has both a performance bug and a security issue. Fix the critical one first, then address the other.",
    Context: map[string]any{
        "expected_phases": []string{"analytical", "methodical"},
        "depth_required":  "deep",
    },
}

Cognitive strategies

Strategy	Description
`analytical`	Structured, step-by-step analysis
`creative`	Exploratory, lateral thinking
`methodical`	Systematic, exhaustive approach
`reactive`	Quick response, minimal deliberation
`reflective`	Self-evaluating, iterative refinement
`collaborative`	Seeks input, considers multiple perspectives
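Because strategy names are passed around as plain strings, a typo in `expected_phases` is easy to miss. A minimal sketch of a pre-flight check against the table above follows; `knownStrategies` and `unknownStrategies` are hypothetical names introduced for illustration, and the library may already validate these values itself.

```go
package main

import "fmt"

// knownStrategies mirrors the strategy table in this document.
var knownStrategies = map[string]bool{
	"analytical":    true,
	"creative":      true,
	"methodical":    true,
	"reactive":      true,
	"reflective":    true,
	"collaborative": true,
}

// unknownStrategies returns the names that are not in the table,
// preserving their input order. Hypothetical helper.
func unknownStrategies(names []string) []string {
	var unknown []string
	for _, name := range names {
		if !knownStrategies[name] {
			unknown = append(unknown, name)
		}
	}
	return unknown
}

func main() {
	// "reflexive" is a deliberate typo for "reflective".
	fmt.Println(unknownStrategies([]string{"analytical", "reflexive", "methodical"})) // prints [reflexive]
}
```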

Dimension score

result.DimensionScores["cognition"] // 0.0 to 1.0
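A test harness will usually compare this score with a pass threshold. The sketch below assumes `DimensionScores` behaves like a `map[string]float64`, which the source does not confirm; `cognitionPasses` and the 0.7 threshold are illustrative choices, not library defaults.

```go
package main

import (
	"errors"
	"fmt"
)

// cognitionPasses looks up the "cognition" dimension and reports whether it
// meets the threshold. It returns an error if the score is missing or falls
// outside the documented 0.0 to 1.0 range. Hypothetical helper.
func cognitionPasses(scores map[string]float64, threshold float64) (bool, error) {
	score, ok := scores["cognition"]
	if !ok {
		return false, errors.New("no cognition score recorded")
	}
	if score < 0 || score > 1 {
		return false, fmt.Errorf("cognition score %v outside [0, 1]", score)
	}
	return score >= threshold, nil
}

func main() {
	// Stand-in for result.DimensionScores; the map type is an assumption.
	scores := map[string]float64{"cognition": 0.82}
	passed, err := cognitionPasses(scores, 0.7)
	fmt.Println(passed, err) // prints true <nil>
}
```

Checking for a missing key matters here because indexing a Go map with an absent key yields 0, which would otherwise look like a legitimate failing score.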

Use cases

Verify a code reviewer starts with analysis, then reflects on findings, then methodically addresses issues
Test that an agent switches from broad exploration to focused investigation when it finds a lead
Ensure the agent analyzes complex tasks deeply and responds quickly to simple ones