Trait Evaluation

Trait evaluation tests whether an agent maintains consistent personality characteristics across different interactions and contexts.

What it tests

Does the agent maintain the same personality across conversations?
Are declared traits (e.g., thoroughness, patience) reflected in behavior?
Does personality remain consistent under different phrasings of the same question?

Scorer: `trait_consistency`

The trait_consistency scorer evaluates whether the agent's responses reflect its declared personality traits:

testcase.ScorerConfig{
    Name: "trait_consistency",
    Config: map[string]any{
        "trait_name":     "thoroughness",
        "dimension":      "depth",
        "expected_value": 0.85,
        "tolerance":      0.15,
    },
}
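How `expected_value` and `tolerance` combine is not spelled out here. One plausible reading is that a response passes when its observed trait score lies within `tolerance` of `expected_value`. The sketch below illustrates that reading only; `withinTolerance` is a hypothetical helper, not part of the library's API.

```go
package main

import (
	"fmt"
	"math"
)

// withinTolerance reports whether an observed trait score lies within
// tolerance of the expected value. Hypothetical helper for illustration;
// the actual scorer may apply a different rule.
func withinTolerance(observed, expected, tolerance float64) bool {
	return math.Abs(observed-expected) <= tolerance
}

func main() {
	// Expected value and tolerance mirror the config above (0.85 and 0.15).
	// The observed scores are made-up sample inputs.
	for _, observed := range []float64{0.72, 0.95, 0.60} {
		pass := withinTolerance(observed, 0.85, 0.15)
		fmt.Printf("observed=%.2f pass=%t\n", observed, pass)
	}
}
```

Under this assumption, a tolerance of 0.15 around 0.85 accepts scores from roughly 0.70 up to the 1.0 ceiling.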

Scenario: `trait_probe`

The trait_probe scenario generator creates multiple phrasings of the same question to test personality consistency:

Case{
    ScenarioType: testcase.ScenarioTraitProbe,
    Input:        "How would you approach reviewing this code?",
    Context: map[string]any{
        "trait":     "thoroughness",
        "dimension": "depth",
        "expected":  "exhaustive analysis",
    },
}
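Once each phrasing has been scored, consistency can be summarized by how far the per-phrasing scores drift apart. The following is a minimal sketch of one such summary, the spread between the highest and lowest score. The `scoreSpread` function and the sample scores are assumptions for illustration and do not describe how `trait_probe` aggregates results internally.

```go
package main

import "fmt"

// scoreSpread returns the difference between the highest and lowest score.
// A small spread suggests the trait shows up consistently across phrasings.
// Hypothetical helper; an empty input yields 0.
func scoreSpread(scores []float64) float64 {
	if len(scores) == 0 {
		return 0
	}
	lo, hi := scores[0], scores[0]
	for _, s := range scores[1:] {
		if s < lo {
			lo = s
		}
		if s > hi {
			hi = s
		}
	}
	return hi - lo
}

func main() {
	// Made-up trait scores for three phrasings of the same question.
	scores := []float64{0.75, 0.875, 0.5}
	fmt.Printf("spread=%.3f\n", scoreSpread(scores))
}
```

A spread is easier to read than a variance when only a handful of phrasings are generated, but it is sensitive to a single outlier.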

Dimension score

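The trait score is available under the `trait` key of the result's dimension scores, as a value between 0.0 and 1.0:

result.DimensionScores["trait"] // 0.0 to 1.0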
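If `DimensionScores` is a Go map keyed by dimension name (an assumption; its type is not shown here), a plain index returns 0 for a missing key, which is indistinguishable from a genuine score of 0.0. The sketch below uses the comma-ok form to tell the two apart. The `Result` type is a hypothetical stand-in, and only the `DimensionScores` field name comes from the text above.

```go
package main

import "fmt"

// Result is a hypothetical stand-in for the evaluation result type.
// The map type is an assumption for this sketch.
type Result struct {
	DimensionScores map[string]float64
}

// traitScore returns the "trait" dimension score and whether it was present.
func traitScore(r Result) (float64, bool) {
	score, ok := r.DimensionScores["trait"]
	return score, ok
}

func main() {
	r := Result{DimensionScores: map[string]float64{"trait": 0.9}}
	if score, ok := traitScore(r); ok {
		fmt.Printf("trait score: %.2f\n", score)
	} else {
		fmt.Println("trait dimension was not scored")
	}
}
```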

Use cases

Verify a thorough agent provides detailed analysis consistently
Test that a patient agent doesn't rush through complex queries
Ensure a cautious agent consistently identifies risks
