# Perception Evaluation

Test what an agent notices: attention focus and detail orientation.
Perception evaluation tests whether an agent correctly identifies relevant information, focuses attention on the right aspects of the input, and detects important details.
## What it tests
- Does the agent focus on the right aspects of the input?
- Can it distinguish signal from noise?
- Does it notice important details others might miss?
- Is the level of detail orientation appropriate for the task?
## Scorer: `perception_focus`
The perception_focus scorer evaluates whether the agent identified key elements in the input:
```go
testcase.ScorerConfig{
	Name: "perception_focus",
	Config: map[string]any{
		"expected_focus":  []string{"security", "performance"},
		"signal_keywords": []string{"injection", "XSS", "latency", "bottleneck"},
		"noise_keywords":  []string{"formatting", "typos", "style"},
	},
}
```
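The page doesn't specify how `perception_focus` turns keyword hits into a number. As a rough mental model, a keyword-based focus score could reward each signal keyword the agent's output mentions, penalize attention spent on noise, and clamp to the 0.0-1.0 range. The sketch below is illustrative only: `scoreFocus`, the 0.25 noise penalty, and the normalization are assumptions, not the framework's actual logic.

```go
package main

import (
	"fmt"
	"strings"
)

// scoreFocus is a hypothetical model of a keyword-based focus scorer:
// credit each signal keyword found in the output, subtract a penalty
// for each noise keyword mentioned, and clamp the result to [0, 1].
func scoreFocus(output string, signal, noise []string) float64 {
	lower := strings.ToLower(output)
	hits, distractions := 0, 0
	for _, kw := range signal {
		if strings.Contains(lower, strings.ToLower(kw)) {
			hits++
		}
	}
	for _, kw := range noise {
		if strings.Contains(lower, strings.ToLower(kw)) {
			distractions++
		}
	}
	if len(signal) == 0 {
		return 0
	}
	score := float64(hits)/float64(len(signal)) - 0.25*float64(distractions) // penalty weight is an assumption
	if score < 0 {
		return 0
	}
	if score > 1 {
		return 1
	}
	return score
}

func main() {
	out := "The user input handler is vulnerable to SQL injection, and the unindexed query adds latency."
	fmt.Printf("%.2f\n", scoreFocus(out,
		[]string{"injection", "XSS", "latency", "bottleneck"},
		[]string{"formatting", "typos", "style"})) // prints 0.50: 2 of 4 signals, no noise
}
```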
## Scenario: `perception_test`
The `perception_test` scenario generator creates cases that mix signal with noise to probe where the agent's attention lands:
```go
Case{
	ScenarioType: testcase.ScenarioPerceptionTest,
	Input:        "Review this code. It has some formatting issues, a typo in a comment, and a potential SQL injection in the user input handler.",
	Context: map[string]any{
		"signal":         []string{"SQL injection"},
		"noise":          []string{"formatting", "typo"},
		"expected_focus": "security",
	},
}
```
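To see how inputs like the one above might be assembled, here is a hypothetical helper that shuffles distractor sentences around the single sentence carrying the signal. `buildInput` and its shuffle strategy are assumptions for illustration, not the generator's actual behavior.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// buildInput sketches one way to mix signal and noise: place the
// sentence that matters among distractor sentences in random order,
// so the agent can't rely on position to find it.
func buildInput(signal string, noise []string, rng *rand.Rand) string {
	parts := append([]string{signal}, noise...)
	rng.Shuffle(len(parts), func(i, j int) { parts[i], parts[j] = parts[j], parts[i] })
	return strings.Join(parts, " ")
}

func main() {
	rng := rand.New(rand.NewSource(1)) // seeded for reproducible test cases
	input := buildInput(
		"There is a potential SQL injection in the user input handler.",
		[]string{
			"Some comments contain typos.",
			"Indentation is inconsistent across files.",
		},
		rng,
	)
	fmt.Println(input)
}
```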
## Perception parameters
| Parameter | Range | Description |
|---|---|---|
| `AttentionFilters` | keyword/pattern list | What the agent should watch for |
| `ContextWindow` | 0.0 - 1.0 | How much surrounding context to consider |
| `DetailOrientation` | 0.0 - 1.0 | 0.0 favors a high-level overview; 1.0 favors fine-grained analysis |
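To make the table concrete, the sketch below models the three parameters as a plain struct and fills them in for a detail-heavy security reviewer. `PerceptionParams` is a stand-in shape, not the framework's real configuration type.

```go
package main

import "fmt"

// PerceptionParams mirrors the table above as a plain struct; the
// framework's actual configuration type may differ.
type PerceptionParams struct {
	AttentionFilters  []string // keywords/patterns the agent should watch for
	ContextWindow     float64  // 0.0 (ignore surroundings) to 1.0 (full context)
	DetailOrientation float64  // 0.0 (high-level overview) to 1.0 (fine-grained)
}

func main() {
	// A security reviewer: watch for vulnerability keywords, read most
	// of the surrounding context, and lean toward fine-grained analysis.
	params := PerceptionParams{
		AttentionFilters:  []string{"injection", "XSS", "auth"},
		ContextWindow:     0.8,
		DetailOrientation: 0.9,
	}
	fmt.Printf("%+v\n", params)
}
```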
## Dimension score

```go
result.DimensionScores["perception"] // 0.0 to 1.0
```
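In practice you will usually gate on this score in a test. A minimal sketch, assuming a hypothetical `runSuite` helper and a stub result type: only the `DimensionScores` field shown above is taken from this page, and the 0.7 threshold is an arbitrary choice.

```go
package agent_test

import "testing"

// evalResult stands in for the framework's result type; only the
// field this page references is modeled.
type evalResult struct {
	DimensionScores map[string]float64
}

// runSuite is a placeholder for however your harness executes the
// perception test cases.
func runSuite(t *testing.T) evalResult {
	t.Helper()
	return evalResult{DimensionScores: map[string]float64{"perception": 0.82}}
}

// TestPerceptionFocus gates a CI run on the perception score.
func TestPerceptionFocus(t *testing.T) {
	result := runSuite(t)
	if got := result.DimensionScores["perception"]; got < 0.7 {
		t.Errorf("perception score = %.2f, want >= 0.7", got)
	}
}
```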
## Use cases

- Verify a security-focused agent prioritizes vulnerabilities over style issues
- Test that a data analyst agent notices anomalies in datasets
- Ensure a code reviewer catches subtle logic errors amid formatting noise