Sentinel

Trait Evaluation

Test who an agent is: whether its personality stays consistent across interactions.

Trait evaluation tests whether an agent maintains consistent personality characteristics across different interactions and contexts.

What it tests

  • Does the agent maintain the same personality across conversations?
  • Are declared traits (e.g., thoroughness, patience) reflected in behavior?
  • Does personality remain consistent under different phrasings of the same question?

Scorer: trait_consistency

The trait_consistency scorer evaluates whether the agent's responses reflect its declared personality traits:

testcase.ScorerConfig{
    Name: "trait_consistency",
    Config: map[string]any{
        "trait_name":     "thoroughness",
        "dimension":      "depth",
        "expected_value": 0.85,
        "tolerance":      0.15,
    },
}
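The page does not spell out how expected_value and tolerance combine. One plausible reading is that a response passes when its measured trait score falls within tolerance of the expected value. The sketch below illustrates that reading only; withinTolerance is a hypothetical helper, not part of Sentinel, and the observed scores are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// withinTolerance reports whether an observed trait score lies within
// tolerance of the expected value. This is an assumed interpretation of
// the "expected_value" and "tolerance" config keys, not documented
// Sentinel behavior.
func withinTolerance(observed, expected, tolerance float64) bool {
	return math.Abs(observed-expected) <= tolerance
}

func main() {
	// expected_value 0.85 and tolerance 0.15 come from the config above;
	// the observed scores are illustrative.
	for _, observed := range []float64{0.65, 0.75, 0.95} {
		fmt.Printf("observed=%.2f pass=%t\n",
			observed, withinTolerance(observed, 0.85, 0.15))
	}
}
```

Under this reading, the config above would accept scores in a band of roughly 0.70 to 1.00. Scores sitting exactly on the edge of the band are subject to floating-point rounding, so a real check may want a small epsilon.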

Scenario: trait_probe

The trait_probe scenario generator creates multiple phrasings of the same question to test personality consistency:

Case{
    ScenarioType: testcase.ScenarioTraitProbe,
    Input:        "How would you approach reviewing this code?",
    Context: map[string]any{
        "trait":     "thoroughness",
        "dimension": "depth",
        "expected":  "exhaustive analysis",
    },
}
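To make the idea of consistency across phrasings concrete, the sketch below measures how far trait scores drift between rephrasings of one question. It is an illustration under stated assumptions: scoreSpread and the probe type are hypothetical, the rephrasings and scores are invented, and Sentinel's own generator and aggregation may work differently.

```go
package main

import "fmt"

// probe pairs one phrasing of a question with the trait score its
// response received. Hypothetical type, used only for this sketch.
type probe struct {
	Phrasing string
	Score    float64
}

// scoreSpread returns the gap between the highest and lowest score
// across all probes. A small spread suggests the trait is expressed
// consistently regardless of phrasing. It returns 0 for no probes.
func scoreSpread(probes []probe) float64 {
	if len(probes) == 0 {
		return 0
	}
	lo, hi := probes[0].Score, probes[0].Score
	for _, p := range probes[1:] {
		if p.Score < lo {
			lo = p.Score
		}
		if p.Score > hi {
			hi = p.Score
		}
	}
	return hi - lo
}

func main() {
	// The first phrasing is the Input from the case above; the other two
	// phrasings and all scores are invented for illustration.
	probes := []probe{
		{"How would you approach reviewing this code?", 0.875},
		{"What steps would you take to review this code?", 0.75},
		{"Walk me through your review of this code.", 0.8125},
	}
	fmt.Printf("spread across %d phrasings: %.3f\n", len(probes), scoreSpread(probes))
}
```

A max-minus-min spread is the simplest possible measure; standard deviation would be less sensitive to a single outlying phrasing.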

Dimension score

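The scorer reports its result under the "trait" key of the result's dimension scores, as a value from 0.0 to 1.0:

result.DimensionScores["trait"] // 0.0 to 1.0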
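When reading the score in a test, it can help to distinguish a missing "trait" entry from a genuine score of 0.0. The sketch below does that with Go's comma-ok map lookup. Result here is a stand-in type that models only the DimensionScores field, and the map[string]float64 type is an assumption; traitScore is a hypothetical helper.

```go
package main

import "fmt"

// Result is a stand-in for Sentinel's result type. Only the
// DimensionScores field is modeled, and its map type is assumed.
type Result struct {
	DimensionScores map[string]float64
}

// traitScore returns the "trait" dimension score. The boolean is false
// when the key is absent or the value lies outside the 0.0 to 1.0 range.
func traitScore(r Result) (float64, bool) {
	score, ok := r.DimensionScores["trait"]
	if !ok || score < 0 || score > 1 {
		return 0, false
	}
	return score, true
}

func main() {
	// Illustrative result; the score value is made up.
	r := Result{DimensionScores: map[string]float64{"trait": 0.9}}
	if score, ok := traitScore(r); ok {
		fmt.Printf("trait score: %.2f\n", score)
	} else {
		fmt.Println("no valid trait score recorded")
	}
}
```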

Use cases

  • Verify a thorough agent provides detailed analysis consistently
  • Test that a patient agent doesn't rush through complex queries
  • Ensure a cautious agent consistently identifies risks
