Sentinel

Behavior Evaluation

Test how an agent reacts: condition-triggered action patterns.

Behavior evaluation tests whether an agent correctly executes condition-triggered action patterns: the "habits" and "reflexes" that should fire in specific contexts.

What it tests

  • Does the agent recognize trigger conditions?
  • Does it execute the correct action when triggered?
  • Is the trigger-action pattern reliable across variations?
  • Does priority ordering work when multiple behaviors could fire?

Scorer: behavior_trigger

The behavior_trigger scorer evaluates whether the expected behavior was activated:

testcase.ScorerConfig{
    Name: "behavior_trigger",
    Config: map[string]any{
        "trigger_type":    "on_input",
        "trigger_pattern": "security|vulnerability|CVE",
        "expected_action": "inject_prompt",
        "action_value":    "security-focused analysis",
    },
}
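The `trigger_pattern` value above looks like a regular-expression alternation. Assuming it is matched against the input as a regex (this page does not confirm that), the following sketch shows which inputs would count as triggering. `matchesTrigger` is a hypothetical helper built on Go's standard `regexp` package, not part of Sentinel.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesTrigger reports whether input matches the trigger pattern.
// Hypothetical helper: it assumes the pattern is a Go (RE2) regular
// expression applied to the raw input text.
func matchesTrigger(pattern, input string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(input), nil
}

func main() {
	pattern := "security|vulnerability|CVE"
	inputs := []string{
		"Is this a known CVE?",
		"Please review the security of this handler",
		"Security review needed", // capital S: no match without (?i)
		"Rename this variable",
	}
	for _, in := range inputs {
		ok, err := matchesTrigger(pattern, in)
		if err != nil {
			fmt.Println("bad pattern:", err)
			return
		}
		fmt.Printf("%q -> %v\n", in, ok)
	}
}
```

Under this assumption the match is case-sensitive, so "Security" at the start of a sentence would not trigger unless the pattern is prefixed with `(?i)`. This is worth covering when testing reliability across input variations.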

Scenario: behavior_trigger

The behavior_trigger scenario generator creates test cases with specific trigger conditions:

Case{
    ScenarioType: testcase.ScenarioBehaviorTrigger,
    Input:        "I found a potential SQL injection in the login form",
    Context: map[string]any{
        "behavior":        "security-alert",
        "trigger":         "on_input",
        "expected_action": "escalate and detail security implications",
    },
}

Dimension score

result.DimensionScores["behavior"] // 0.0 to 1.0
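One way to consume this score is to gate a test run on a minimum value. The sketch below assumes `DimensionScores` is a `map[string]float64`. The `result` struct, the `behaviorPassed` helper, and the 0.8 threshold are illustrative and not part of Sentinel.

```go
package main

import "fmt"

// result is a hypothetical stand-in for an evaluation result.
// Only the field used on this page is modeled.
type result struct {
	DimensionScores map[string]float64
}

// behaviorPassed reports whether the "behavior" dimension is present
// and meets the threshold. A missing score counts as a failure rather
// than being treated as 0.0 silently.
func behaviorPassed(r result, threshold float64) bool {
	score, ok := r.DimensionScores["behavior"]
	return ok && score >= threshold
}

func main() {
	r := result{DimensionScores: map[string]float64{"behavior": 0.9}}
	fmt.Println(behaviorPassed(r, 0.8))
}
```

Checking for the key explicitly keeps a run that never produced a behavior score distinct from one that scored zero.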

Use cases

  • Verify a code review agent activates security analysis when security keywords appear
  • Test that a support agent escalates when detecting frustrated customers
  • Ensure a data agent switches to careful mode when handling PII
