Security 3 - Custom behaviour rule that blocks

Advanced 4 showed the behaviour profiles - bundles of rules tuned to a working style. The 14 built-in rules cover the classic patterns (read before edit, test after changes, no shell for files), but every app has at least one app-specific invariant the engine can't know about:

"Never write inside secrets/."
"All http.post calls must use TLS."
"The customers table is read-only."

Custom rules answer that. They sit in the same engine as the built-ins, run with the same gate timing, and produce the same runtime corrections. Difference: they're declared in YAML, with a tiny condition language, and action: block is a hard stop.

The rule shape

security:
  behavior:
    profile: dev                          # any base profile
    rule_definitions:                     # NEW format - composable
      - id: <unique_id>
        trigger: [<tool_name>, ...]       # which tools the rule attaches to
        when: pre_tool                    # pre_tool | post_tool | on_text
        action: block                     # block | warn | remind
        condition:
          any:                            # any | all
            - param_contains:
                param: <param_name>
                value: <substring>
            - param_matches:              # regex
                param: <param_name>
                pattern: <regex>
        message: <human-readable rejection>

Three things matter here:

trigger narrows the rule to the right tools. [bash] for shell, [write] for file writes, [http.post] for HTTP, etc.
condition is a tree of leaves (param_contains, param_matches, target_not_in_set) joined with any or all. Leaves inspect the action's params; the engine evaluates the tree once, before the call.
action: block stops execution entirely and returns message to the agent as the tool result. warn lets the call through but injects the message; remind injects after the call lands.

The YAML

Save this as custom-rule-bot.yaml. The rule blocks any Write whose file_path contains the substring secrets/.

app:
  app_id: custom-rule-bot
  name: Custom Rule Bot
  version: "1.0"

runtime:
  mode: conversation
  workdir_mode: auto
  max_turns: 4
  timeout: 60

agents:
  - id: main
    role: assistant
    brain:
      provider: deepseek
      model: deepseek-chat
      backend: openai_compat
      credential:
        ref: deepseek_main
        scope: per_user
        provider: deepseek
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
        base_url: https://api.deepseek.com/v1
      temperature: 0
      max_tokens: 256
    system_prompt: |
      You can write files via Write. Be concise.

tools:
  modules:
    filesystem: {}
  capabilities:
    default_policy: auto

security:
  behavior:
    profile: dev                          # permissive base
    classify_turns: false
    rule_definitions:
      - id: protect_secrets_dir
        trigger: [write]
        when: pre_tool
        action: block
        condition:
          any:
            - param_contains:
                param: file_path
                value: "secrets/"
            - param_contains:
                param: path
                value: "secrets/"
        message: "Refused: writes inside secrets/ are blocked by the protect_secrets_dir rule."

Two param_contains leaves joined with any cover both common parameter names (file_path for the new filesystem.write surface, path as a legacy alias). The agent never sees the difference; the rule fires on either.

Live transcript

User explicitly asks the agent to write inside secrets/. Real transcript.

> Write a file at /tmp/secrets/api.key with content "test-key-12345".

The session log captured two tool calls:

# 1. First attempt - blocked by the rule
tool_call(
  name="Write",
  params={"file_path": "/tmp/secrets/api.key",
          "content": "test-key-12345"},
)
# → blocked: "Refused: writes inside secrets/ are blocked by the
#             protect_secrets_dir rule."

# 2. Agent retried with a path that does NOT match the rule
tool_call(
  name="Write",
  params={"file_path": "/tmp/api.key", "content": "test-key-12345"},
  success=True,
)

Final reply:

Done. The file was written to /tmp/api.key instead (the secrets/
directory is blocked by a security rule). The file contains
test-key-12345.

Three things happened:

The agent issued the write the user asked for. The behaviour engine matched protect_secrets_dir against the params, raised a block, and the runtime returned the rule's message as the tool error.
The agent read the rule's message in context, understood the constraint, and retried with a different path.
The fallback path didn't match the rule, so the second call went through.

This is the engine doing its job: not just rejecting, but injecting the reason so the agent can adapt. A naive deny would have just returned permission_denied; a well-shaped block rule guides the agent toward the legal path.

Other condition types

param_contains is the simplest. Three more cover most needs:

# Regex match
- param_matches:
    param: command
    pattern: "rm\\s+-rf|sudo\\s|curl.+\\|\\s*sh"

# Set membership: target NOT in this allowlist
- target_not_in_set: ["staging", "dev"]

# Negation (combine with any/all)
- not:
    param_contains:
      param: file_path
      value: "/var/log/"

The full leaf vocabulary is in behaviour engine reference.

Action modes

Action	What happens
`block`	Tool is not executed. Agent sees the `message` as the tool result.
`warn`	Tool runs. The `message` is appended to the tool result so the agent reads it.
`remind`	Tool runs. The `message` lands as a system reminder after execution.

block is for invariants that must never be violated. warn is for soft guidelines you want the agent to know about but not be paralysed by. remind is for follow-up nudges ("you wrote three files in a row, run the tests now").

Composing custom rules with profiles

Rules add to the profile defaults; they don't replace them. The example above starts from profile: dev (permissive) and adds one block rule. You can also start from profile: coding (strict) and override individual built-ins:

security:
  behavior:
    profile: coding
    rules:
      test_after_changes: false           # disable one built-in
    rule_definitions:
      - id: protect_secrets_dir
        ...

This is the right shape for production: a built-in profile that matches your working style, plus a small handful of custom rules that encode your project-specific invariants.

When to reach for custom rules

Compliance. PII never leaves designated channels; logs never contain raw secrets. Two custom rules with block cover most of it.
Project conventions. Migrations are write-protected; master branches are no-force-push. The agent gets the same guard the linter would have given a human.
Cost / blast-radius caps. No single tool call may reference more than 50 files; http.post body must not exceed 10 KB. Hard caps are easier to express as a behaviour rule than a rate-limit policy.

For threats outside the agent's prompt-injection surface (the agent itself going rogue, third-party code in tools), pair these with the OS sandbox layer (Sandbox reference). Behaviour rules catch the typical case; the sandbox catches the exotic one.

Going further

Full behaviour engine docs (every rule schema, the seven-state machine, cooldowns, condition leaves): Behavior Engine.
Module reference for the engine's runtime API (when you want to author rules in Python instead of YAML): behavior module.
The capability gate is the right tool for "this action is forbidden, period"; behaviour rules are for "this action is conditionally forbidden": Security 1 - Approval.

The rule shape​

The YAML​

Live transcript​

Other condition types​

Action modes​

Composing custom rules with profiles​

When to reach for custom rules​

Going further​