behavior

Runtime behavioural enforcement. Monitors every tool call, detects violations, injects corrections - fully YAML-driven, no hardcoded logic. Two enforcement layers: a declarative rule engine (14 built-in rules + custom) and an optional semantic classifier (small LLM that injects directives before the main agent acts).

| Property | Value |
| --- | --- |
| Module id | behavior |
| LLM-exposed actions | 0 - operates as a hook on the agent loop |
| Type | per-app instance, per-session state |

No agent-callable actions. The module is wired into the agent loop and called at three points: classify_turn at turn 0, pre_tool_check before each tool, and post_tool_check after each tool. All configuration lives under security.behavior: in the app YAML.

Where it sits in the YAML

security.behavior - under the canonical security: block:

security:
  behavior:
    profile: dev
    classify_turns: true
    rules: { ... }              # legacy boolean / threshold overrides
    rule_definitions: [ ... ]   # full declarative rules
    state_tracking: { ... }     # custom sets / counters / flags
    classifier: { ... }
    brain: { ... }

The full schema, all 14 built-in rules, the 13 condition primitives, and the classifier vocabulary are documented in Behavior Engine. A quick recap follows.

Built-in profiles

6 presets shipped:

| Profile | Description |
| --- | --- |
| dev | Senior-developer discipline. All sequence + delegation + lint rules ON. Injects 5 KB DEV_PROMPT_SECTION into the system prompt. |
| coding | dev minus web_search_when_unknown, slightly looser thresholds, autonomy high. |
| research | Read-heavy. Disables read_before_edit / test_after_changes / verify_after_edit / lint_check. Enables web search + delegation. |
| data | Data analysis / ETL. Strict reads + tests + lint, web search ON. |
| creative | Writing / content. Low autonomy, verbosity: detailed, planning required. |
| assistant | General chatbot. Minimal enforcement. |

security:
  behavior:
    profile: dev

Custom profiles (bundle directory)

my-app/
  app.yaml
  behavior/
    strict_dev.yaml

# behavior/strict_dev.yaml
name: strict_dev
description: "Production-grade enforcement"
extends: dev
rules:
  max_blind_reads: 1
  changes_before_test_reminder: 1

Reference via the {{behavior.X}} namespace:

security:
  behavior:
    profile: "{{behavior.strict_dev}}"
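To make the effect of extends: dev concrete, here is a minimal Python sketch of override resolution. It assumes a child profile's rules are shallow-merged over its parent's; the page does not state the engine's actual merge semantics. The PROFILES registry and resolve_profile function are hypothetical names, and the dev values are the documented default thresholds, assumed here to apply to dev.

```python
from typing import Any

# Hypothetical in-memory registry; the real engine loads profiles from YAML.
PROFILES: dict[str, dict[str, Any]] = {
    "dev": {
        "rules": {
            "max_blind_reads": 3,
            "changes_before_test_reminder": 3,
            "max_sequential_same_tool": 8,
        },
    },
    "strict_dev": {
        "extends": "dev",
        "rules": {"max_blind_reads": 1, "changes_before_test_reminder": 1},
    },
}


def resolve_profile(name: str) -> dict[str, Any]:
    """Return the effective rules for a profile, applying child overrides last."""
    profile = PROFILES[name]
    parent = profile.get("extends")
    # Start from the parent's effective rules (assumed shallow merge).
    rules = dict(resolve_profile(parent)) if parent else {}
    rules.update(profile.get("rules", {}))
    return rules


effective = resolve_profile("strict_dev")
print(effective)
```

Under this assumption, strict_dev tightens the two thresholds it names and inherits max_sequential_same_tool unchanged.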

The 14 built-in rules

Override any by adding a rule_definitions entry with the same id.

Sequence (5)

| ID | Trigger | When | Action | Effect |
| --- | --- | --- | --- | --- |
| read_before_edit | edit | pre | warn | File must be read first. |
| read_before_write_existing | write | pre | warn | Existing file must be read before overwrite. |
| search_before_read | read | pre | warn | After 3+ blind reads, suggest Grep / Glob. |
| verify_after_edit | edit | post | remind | Re-read the modified section. |
| test_after_changes | edit, write | post | remind | Run tests after 3+ changes. |
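A sequence rule such as read_before_edit comes down to remembering which files the agent has already read. The Python sketch below shows that idea with a plain set; the function names and the warning text are illustrative assumptions, not the engine's implementation.

```python
from typing import Optional

# Paths the agent has read so far in this session (illustrative state only).
files_read: set[str] = set()


def on_read(path: str) -> None:
    """Record a successful read so later edits to this path are allowed."""
    files_read.add(path)


def read_before_edit(path: str) -> Optional[str]:
    """Return a warning if the file is edited before being read, else None."""
    if path not in files_read:
        return f"[BEHAVIOR WARNING] Read {path} before editing it."
    return None


first_attempt = read_before_edit("src/app.py")   # not read yet: warning
on_read("src/app.py")
second_attempt = read_before_edit("src/app.py")  # read first: no warning
```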

Prohibition (3)

| ID | Trigger | When | Action | Effect |
| --- | --- | --- | --- | --- |
| no_bash_for_files | bash | pre | warn | Detects cat, head, tail, less, more, bat, sed, awk, perl. |
| no_blind_exploration | bash | pre | warn | Detects find ., ls -lRa, tree, dir /s. |
| confirm_destructive | bash | pre | block | Blocks rm -rf, git reset --hard, git push --force, git clean -fd, drop table, drop database, truncate table. |

Cognitive (6)

| ID | Trigger | When | Action | Effect |
| --- | --- | --- | --- | --- |
| plan_before_execute | * | pre | warn | Agent must produce text before the first tool call. |
| web_search_when_unknown | * | on_text | warn | Detects not sure / unsure / don't know / uncertain AND no web search yet. |
| delegate_complex | * | post | remind | Fires at exactly 8 tool calls in the turn. |
| delegate_large_reads | read | post | remind | After 5+ sequential reads. |
| max_sequential_same_tool | * | pre | warn | Same tool 8 times in a row. |
| always_lint_check | edit, write | post | warn | Tool result has lint errors with severity: error. |

Numeric thresholds

| Param | Default | Used by |
| --- | --- | --- |
| max_blind_reads | 3 | search_before_read |
| changes_before_test_reminder | 3 | test_after_changes |
| max_sequential_same_tool | 8 | max_sequential_same_tool |

Action levels

| Action | Effect | Tool runs? |
| --- | --- | --- |
| block | [BEHAVIOR BLOCKED] injected, tool prevented. | no |
| warn | [BEHAVIOR WARNING] injected, tool runs. | yes |
| remind | [BEHAVIOR REMINDER] injected after the tool returns. | yes |
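The difference between block and warn can be sketched as a pre-tool check that returns both a message and a flag saying whether the tool may run. The Verdict type, the pre_check_bash function, the regular expressions (which cover only a subset of the listed commands), and the message wording are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Subset of the confirm_destructive patterns; the regex itself is illustrative.
DESTRUCTIVE = re.compile(
    r"rm\s+-rf|git\s+reset\s+--hard|git\s+push\s+--force|drop\s+table",
    re.IGNORECASE,
)
# Subset of the no_bash_for_files commands.
FILE_VIEWERS = re.compile(r"\s*(cat|head|tail|less|more)\b")


@dataclass
class Verdict:
    action: str      # "block", "warn", or "allow"
    message: str     # text injected into the conversation, if any
    tool_runs: bool  # False only when the action is "block"


def pre_check_bash(command: str) -> Verdict:
    """Classify a bash command before it runs."""
    if DESTRUCTIVE.search(command):
        return Verdict("block", "[BEHAVIOR BLOCKED] Destructive command.", False)
    if FILE_VIEWERS.match(command):
        return Verdict("warn", "[BEHAVIOR WARNING] Use the file tools instead.", True)
    return Verdict("allow", "", True)


print(pre_check_bash("rm -rf build"))
print(pre_check_bash("cat notes.txt"))
```

A remind action would follow the same shape but be produced by the post-tool check, after the tool result is available.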

Custom rule definitions

security:
  behavior:
    rule_definitions:
      - id: backup_before_modify
        trigger: [database.execute]
        when: pre_tool          # pre_tool | post_tool | on_text
        action: block           # block | warn | remind
        condition:
          all:
            - param_matches: { param: query, pattern: "(UPDATE|DELETE|DROP)" }
            - flag_is: { name: backup_created, value: false }
        message: "Create a backup before running '{param:query}'."

    state_tracking:
      flags:
        backup_created: { set_on: [database.backup] }
      counters:
        queries_run: { increment_on: [database.execute] }
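The state_tracking block can be read as a mapping from tool names to state updates. The following Python sketch mirrors the set_on and increment_on keys shown above; the StateTracker class is hypothetical, and it assumes flags start as false and counters start at 0, which the page does not state explicitly.

```python
from typing import Any


class StateTracker:
    """Hypothetical tracker driven by a state_tracking-style config dict."""

    def __init__(self, config: dict[str, Any]) -> None:
        self._flag_triggers = {
            name: set(spec["set_on"]) for name, spec in config.get("flags", {}).items()
        }
        self._counter_triggers = {
            name: set(spec["increment_on"])
            for name, spec in config.get("counters", {}).items()
        }
        # Assumed defaults: flags unset, counters at zero.
        self.flags = {name: False for name in self._flag_triggers}
        self.counters = {name: 0 for name in self._counter_triggers}

    def record(self, tool: str) -> None:
        """Update flags and counters after the named tool has been called."""
        for name, tools in self._flag_triggers.items():
            if tool in tools:
                self.flags[name] = True
        for name, tools in self._counter_triggers.items():
            if tool in tools:
                self.counters[name] += 1


tracker = StateTracker({
    "flags": {"backup_created": {"set_on": ["database.backup"]}},
    "counters": {"queries_run": {"increment_on": ["database.execute"]}},
})
tracker.record("database.execute")
tracker.record("database.backup")
tracker.record("database.execute")
print(tracker.flags, tracker.counters)
```

With this state, backup_before_modify would block the first database.execute if its query matched the pattern, and stop firing once database.backup has set the flag.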

Condition primitives

13 primitives plus all / any / not composites.

| Primitive | Behaviour |
| --- | --- |
| target_not_in_set: <set> / target_in_set: <set> | Target param value membership in a tracked set. |
| counter_gte: {name, value} | state.counters[name] >= value. |
| flag_is: {name, value} | state.flags[name] == value. |
| param_matches: {param, pattern} | Regex on params[param], case-insensitive. |
| param_contains: {param, value} | Case-insensitive substring. |
| no_text_before_tools: true | Agent didn't produce text before the first tool. |
| first_tool_this_turn: true | state.tool_calls_this_turn == 0. |
| consecutive_gte: <N> | Same tool called N+ times in a row. |
| tool_calls_this_turn_eq: <N> | Exactly N tool calls this turn. |
| target_exists_on_disk: true | os.path.exists(target). |
| text_matches: <pattern> | Regex on agent text (with when: on_text). |
| result_has_lint_errors: true | Tool result lint has any item with severity: error. |
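The all / any / not composites suggest a recursive evaluator over nested single-key mappings. The sketch below implements four of the primitives under that assumption; the evaluate function and the shape of the state dict are hypothetical, and missing flags are assumed to read as false.

```python
import re
from typing import Any


def evaluate(cond: dict[str, Any], state: dict[str, Any], params: dict[str, Any]) -> bool:
    """Evaluate one condition node; each node is assumed to be a single-key mapping."""
    ((key, arg),) = cond.items()
    if key == "all":
        return all(evaluate(c, state, params) for c in arg)
    if key == "any":
        return any(evaluate(c, state, params) for c in arg)
    if key == "not":
        return not evaluate(arg, state, params)
    if key == "counter_gte":
        return state["counters"].get(arg["name"], 0) >= arg["value"]
    if key == "flag_is":
        return state["flags"].get(arg["name"], False) == arg["value"]
    if key == "param_matches":
        value = str(params.get(arg["param"], ""))
        return re.search(arg["pattern"], value, re.IGNORECASE) is not None
    if key == "param_contains":
        return arg["value"].lower() in str(params.get(arg["param"], "")).lower()
    raise ValueError(f"unsupported primitive: {key}")


condition = {
    "all": [
        {"param_matches": {"param": "query", "pattern": "(UPDATE|DELETE|DROP)"}},
        {"not": {"flag_is": {"name": "backup_created", "value": True}}},
    ]
}
state = {"counters": {}, "flags": {}}
fires = evaluate(condition, state, {"query": "drop table users"})
print(fires)
```

Because param_matches is case-insensitive, the lowercase drop in the example still matches the uppercase pattern.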

Semantic classifier

ClassifierConfig. Optional small LLM that runs before each main agent turn and emits a [BEHAVIOR DIRECTIVE] block injected into the conversation.

security:
  behavior:
    classify_turns: true
    classifier:
      frequency: every_turn     # every_turn | first_turn | every_n_turns | on_new_message
      frequency_n: 3            # for every_n_turns
      skip_followups: true      # skip "ok", "yes", "continue", ...
      timeout: 15
      complexity_levels: [trivial, simple, moderate, complex, critical]
      approaches: [direct, explore_first, plan_and_confirm, delegate, research_first]
      risk_levels: [none, low, medium, high]
      max_directives: 5
      directive_prefix: "[BEHAVIOR DIRECTIVE - {complexity} complexity, {risk} risk]"
      high_risk_threshold: medium
      directive_footer: "Follow these directives. ..."
    brain:
      provider: deepseek
      model: deepseek-chat
      backend: openai_compat
      config: { api_key: "{{secret.DEEPSEEK_API_KEY}}" }
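As a rough illustration of how these settings might combine, the Python sketch below renders a directive block from directive_prefix, max_directives, and directive_footer, and compares a risk level against high_risk_threshold. Both functions are hypothetical, and treating "at or above the threshold" as high risk is an assumption; the footer string is kept truncated exactly as in the config above.

```python
RISK_LEVELS = ["none", "low", "medium", "high"]


def is_high_risk(risk: str, threshold: str = "medium") -> bool:
    """Assumed semantics: a risk at or above the threshold counts as high."""
    return RISK_LEVELS.index(risk) >= RISK_LEVELS.index(threshold)


def build_directive(
    complexity: str,
    risk: str,
    directives: list[str],
    prefix: str,
    footer: str,
    max_directives: int = 5,
) -> str:
    """Render the block injected into the conversation, capped at max_directives."""
    header = prefix.format(complexity=complexity, risk=risk)
    body = [f"- {d}" for d in directives[:max_directives]]
    return "\n".join([header, *body, footer])


text = build_directive(
    "complex",
    "high",
    ["a"] * 7,  # seven candidate directives; only five are kept
    prefix="[BEHAVIOR DIRECTIVE - {complexity} complexity, {risk} risk]",
    footer="Follow these directives. ...",
)
print(text)
```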

Per-session state

Each session gets its own BhvSessionState: sets, counters, flags, recent tool history, turn number, total tool calls, violations, and a consecutive-tool counter. State is never shared between sessions.

cleanup_session(session_id) runs on session end; called by the daemon.
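Session isolation of this kind is commonly implemented as a dictionary keyed by session id. The sketch below uses a hypothetical SessionState dataclass as a stand-in for BhvSessionState, with field names inferred from the list above; only cleanup_session(session_id) is a name taken from this page, and get_state is an assumed helper.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical stand-in for BhvSessionState; field names are assumptions."""
    sets: dict[str, set[str]] = field(default_factory=dict)
    counters: dict[str, int] = field(default_factory=dict)
    flags: dict[str, bool] = field(default_factory=dict)
    recent_tools: list[str] = field(default_factory=list)
    violations: list[str] = field(default_factory=list)
    turn_number: int = 0
    total_tool_calls: int = 0
    consecutive_same_tool: int = 0


_sessions: dict[str, SessionState] = {}


def get_state(session_id: str) -> SessionState:
    """Return the state for a session, creating a fresh one on first use."""
    return _sessions.setdefault(session_id, SessionState())


def cleanup_session(session_id: str) -> None:
    """Drop all state for a finished session."""
    _sessions.pop(session_id, None)


get_state("a").flags["backup_created"] = True
print(get_state("b").flags)  # session "b" is unaffected by session "a"
cleanup_session("a")
```

Using default_factory for every mutable field is what keeps two sessions from accidentally sharing the same dict or list.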

Integration points

| Hook | When | Purpose |
| --- | --- | --- |
| classify_turn(...) | Before turn 0 LLM call | Semantic task classification, inject directive. |
| on_turn_start(session_id) | Each turn start | Reset per-turn counters. |
| pre_tool_check(...) | Before each tool | Rule evaluation; block if needed. |
| post_tool_check(...) | After each tool | State update, reminders. |
| get_prompt_sections | System prompt build | Inject active rules list into the prompt. |
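The order in which an agent loop might invoke these hooks within one turn can be sketched with a recording stub. The hook names come from the table above, but their argument lists are elided in the source, so the signatures below are assumptions, as are the relative order of on_turn_start and classify_turn and the run_turn driver itself.

```python
class RecordingEngine:
    """Stub engine that records hook calls; all signatures are assumed."""

    def __init__(self, blocked_tools=()):
        self.calls = []
        self.blocked_tools = set(blocked_tools)

    def on_turn_start(self, session_id):
        self.calls.append("on_turn_start")

    def classify_turn(self, session_id, message):
        self.calls.append("classify_turn")

    def pre_tool_check(self, session_id, tool, params):
        self.calls.append(f"pre:{tool}")
        return tool not in self.blocked_tools  # False means the tool is blocked

    def post_tool_check(self, session_id, tool, result):
        self.calls.append(f"post:{tool}")


def run_turn(engine, session_id, message, planned_tools):
    """Drive one turn: reset counters, classify, then wrap each tool call."""
    engine.on_turn_start(session_id)
    engine.classify_turn(session_id, message)
    for tool, params in planned_tools:
        if not engine.pre_tool_check(session_id, tool, params):
            continue  # blocked: the tool never runs, so there is no post check
        result = f"{tool} ok"  # stand-in for real tool execution
        engine.post_tool_check(session_id, tool, result)


engine = RecordingEngine(blocked_tools={"bash"})
run_turn(engine, "s1", "fix the bug", [("read", {}), ("bash", {}), ("edit", {})])
print(engine.calls)
```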

Cross-references