# Agents

An agent is an LLM with a brain (provider configuration), a system prompt, and a role. Every app has at least one agent defined in the `agents:` list.
## Agent Definition

```yaml
agents:
  - id: assistant
    role: assistant
    brain:
      provider: deepseek
      model: deepseek-chat
      backend: openai_compat
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
    system_prompt: |
      You are a helpful assistant.
      Workspace: {{workspace}}
```
## Agent Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `id` | string | required | Unique agent identifier within the app |
| `role` | string | `"worker"` | Agent role: `coordinator`, `worker`, `assistant`, or custom |
| `brain` | AgentBrain | required | LLM provider configuration |
| `system_prompt` | string | `""` | System prompt injected at conversation start |
| `plan_first` | bool | `true` | Guide the agent to explain its plan before executing tools |
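Only `id` and `brain` are required. A minimal sketch relying on the defaults above (`role` falls back to `"worker"`, `plan_first` to `true`):

```yaml
# Minimal agent definition: only the required fields are set,
# everything else uses the defaults from the table above.
agents:
  - id: worker
    brain:
      provider: deepseek
      model: deepseek-chat
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
```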
## Plan First
When enabled, communication guidelines are injected into the system prompt to encourage the LLM to explain what it's about to do before calling tools. This helps the user understand what's happening — especially since tool parameters and raw results are not shown directly.
This is prompt-level guidance only — the runtime never blocks or intercepts tool calls. The LLM remains free to work as it sees fit. How well the model follows these guidelines depends on the model itself (some models like DeepSeek never produce text alongside tool calls).
```yaml
agents:
  # Default: guidelines injected to encourage explanation
  - id: assistant
    plan_first: true
    brain: ...

  # No guidelines: agent works silently
  - id: background-worker
    plan_first: false
    brain: ...
```
Set `plan_first: false` for agents where explanation is unnecessary (background workers, pipelines, automated tasks).
Note: Regardless of `plan_first`, the CLI always shows real-time tool activity (`> Listing .`, `> Reading file.py`, etc.), so the user is never completely in the dark.
## Brain Configuration

The `brain:` block configures the LLM provider and model. Two modes are supported.
### Inline Mode (recommended)
Full provider config embedded in the agent:
```yaml
brain:
  provider: deepseek       # Provider hint (for base URL resolution)
  model: deepseek-chat     # Model identifier
  backend: openai_compat   # Backend: 'openai_compat' (default) or 'anthropic'
  config:                  # Provider-specific config
    api_key: "{{env.DEEPSEEK_API_KEY}}"
    base_url: "https://api.deepseek.com/v1"  # Optional if provider hint is set
  temperature: 0.2         # Sampling temperature
  max_tokens: 8192         # Max output tokens
  top_p: 1.0               # Nucleus sampling
  context:                 # Per-brain context management (optional)
    max_tokens: 131072
    strategy: summarize
```
### Reference Mode

Points to a named provider in `modules.llm_provider.config.providers`:
```yaml
brain:
  provider_id: my_deepseek_provider
  temperature: 0.2
```
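For context, here is a sketch of the named provider this could point to, assuming providers are declared by id under `modules.llm_provider.config.providers` (the exact shape of the module config may differ; see its own documentation):

```yaml
# Hypothetical named-provider declaration referenced by provider_id above
modules:
  llm_provider:
    config:
      providers:
        my_deepseek_provider:
          provider: deepseek
          model: deepseek-chat
          backend: openai_compat
          config:
            api_key: "{{env.DEEPSEEK_API_KEY}}"
```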
### Brain Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `provider_id` | string | null | Reference to a named provider (reference mode) |
| `provider` | string | null | Provider hint for base URL resolution (inline mode) |
| `model` | string | null | Model identifier |
| `backend` | string | `"openai_compat"` | Backend: `openai_compat` or `anthropic` |
| `config` | dict | `{}` | Provider-specific config (`api_key`, `base_url`, etc.) |
| `temperature` | float | null | Sampling temperature (0-2) |
| `max_tokens` | int | null | Max tokens to generate |
| `top_p` | float | null | Nucleus sampling threshold (0-1) |
| `timeout` | float | null | Request timeout in seconds |
| `context` | ContextConfig | null | Per-brain context management (overrides `execution.context`) |
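The generation and context fields live on the brain itself, so they combine with either mode. A sketch mixing reference mode with per-agent overrides (the provider name and values are illustrative):

```yaml
# Reference-mode brain with local overrides drawn from the field table above
brain:
  provider_id: my_deepseek_provider   # hypothetical named provider
  temperature: 0.2
  timeout: 60.0
  context:
    max_tokens: 32000
    strategy: truncate
```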
## Supported Providers

All providers use the OpenAI-compatible API format (`openai_compat` backend) unless noted.
| Provider | Provider Hint | Default Base URL | Native Tool Use |
|---|---|---|---|
| OpenAI | `openai` | `https://api.openai.com/v1` | Yes |
| Anthropic | `anthropic` | `https://api.anthropic.com/v1` | Yes |
| DeepSeek | `deepseek` | `https://api.deepseek.com/v1` | Yes |
| Groq | `groq` | `https://api.groq.com/openai/v1` | Yes |
| Mistral | `mistral` | `https://api.mistral.ai/v1` | Yes |
| Together | `together` | `https://api.together.xyz/v1` | Yes |
| Ollama | `ollama` | `http://localhost:11434/v1` | No (text-based) |
| LM Studio | `lm_studio` | `http://localhost:1234/v1` | No (text-based) |
| vLLM | `vllm` | `http://localhost:8000/v1` | No (text-based) |
When the provider hint is set, the `base_url` is auto-resolved. You can always override it in `config.base_url`.
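For example, a sketch that keeps the `openai` hint but routes requests through a self-hosted OpenAI-compatible gateway (the URL and model are hypothetical):

```yaml
brain:
  provider: openai   # the hint alone would resolve to https://api.openai.com/v1
  model: gpt-4o-mini
  config:
    api_key: "{{env.OPENAI_API_KEY}}"
    base_url: "https://llm-gateway.example.com/v1"  # explicit override wins
```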
### Native vs Text-Based Tool Calling

Digitorn automatically detects whether a provider supports native tool calling:

- **Native** (OpenAI, DeepSeek, Groq, Mistral, Together): Tools are passed via the API `tools` parameter. The LLM generates structured tool calls natively.
- **Text-based** (Ollama, LM Studio, vLLM): Tool schemas are injected into the system prompt. The LLM generates tool calls as text (e.g., `<tool_call>{"name": "...", "arguments": {...}}</tool_call>`), and Digitorn parses them.
This is fully automatic — you don't need to configure anything. The same YAML works with any provider.
### Tool Call Recovery

Even with native tool calling, LLMs sometimes generate malformed tool calls. Digitorn handles this robustly:

- **Llama native format**: `<function=name>{...}</function>` — parsed via regex
- **XML format**: `<tool_call>{...}</tool_call>` — parsed via regex
- **Raw JSON**: `{"name": "...", "arguments": {...}}` — extracted via brace matching
- **Markdown JSON**: `` ```json {...} ``` `` — extracted from code blocks
- **Smart quotes**: Unicode curly quotes (`“”‘’`) normalized to ASCII before parsing
- **API errors**: Groq `tool_use_failed` errors with `failed_generation` are recovered
### Provider Examples
```yaml
# DeepSeek (cloud, native tool use)
brain:
  provider: deepseek
  model: deepseek-chat
  backend: openai_compat
  config:
    api_key: "{{env.DEEPSEEK_API_KEY}}"
```

```yaml
# Groq (cloud, fast inference, native tool use)
brain:
  provider: groq
  model: llama-3.3-70b-versatile
  backend: openai_compat
  config:
    api_key: "{{env.GROQ_API_KEY}}"
    base_url: "https://api.groq.com/openai/v1"
```

```yaml
# Ollama (local, text-based tool use)
brain:
  provider: ollama
  model: qwen2.5:14b-instruct-q4_K_M
  backend: openai_compat
  config:
    base_url: "http://localhost:11434/v1"
  context:
    max_tokens: 8000
    output_reserved: 1000
    strategy: truncate
    keep_recent: 6
    compression_trigger: 0.60
    auto_compact: true
```

```yaml
# Any OpenAI-compatible endpoint
brain:
  provider: custom
  model: my-fine-tuned-model
  backend: openai_compat
  config:
    api_key: "{{env.CUSTOM_API_KEY}}"
    base_url: "https://my-api.example.com/v1"
```
## System Prompt

The `system_prompt` field defines the agent's behavior and instructions. It supports template expressions.
```yaml
agents:
  - id: assistant
    system_prompt: |
      You are a helpful coding assistant.
      Workspace: {{workspace}}
      You have access to tools via a discovery system.
      Use list_categories to see available modules,
      then browse_category and execute_tool to use them.
```
Note: The system prompt is automatically enriched by the runtime with:
- Agent identity header
- Tool discovery instructions (native mode) or full tool schemas (text-based mode)
Your system prompt is appended after these sections.
### Best Practices

- **Be specific** — Define the agent's role, capabilities, and constraints
- **Use variables** — Inject dynamic context with `{{variable_name}}`
- **Guide tool usage** — Explain the workflow (list, browse, execute)
- **Set limits** — "Limit yourself to 3-5 tool calls per question"
- **Handle errors** — "If a tool fails, explain the error instead of retrying in a loop"
## Context Configuration (Per-Brain)
Each brain can override the execution-level context management:
```yaml
brain:
  provider: ollama
  model: mistral-nemo
  context:
    max_tokens: 8000            # Context window size (0 = auto-detect)
    output_reserved: 1000       # Tokens reserved for output
    strategy: truncate          # 'truncate' or 'summarize'
    keep_recent: 6              # Messages to keep during compaction
    compression_trigger: 0.60   # Compact at 60% usage
    summary_max_tokens: 512     # Max tokens for summary (summarize strategy)
    auto_compact: true          # Auto-inject compaction hook
    summary_brain:              # Optional: separate model for compaction
      provider: ollama
      model: qwen2.5:3b
      backend: openai_compat
```
The `summary_brain` field accepts the same fields as the main brain (`provider`, `model`, `backend`, `config`, `temperature`, `timeout`, etc.). If not set, the agent's main brain is used for summarization.
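Since `summary_brain` takes the same fields, it can carry its own generation settings. A sketch (values are illustrative):

```yaml
context:
  strategy: summarize
  summary_brain:
    provider: ollama
    model: qwen2.5:3b
    backend: openai_compat
    temperature: 0.1   # keep summaries deterministic (illustrative value)
    timeout: 30.0
```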
See Context Management for full details.
## Multi-Agent

Multiple agents can be defined in the `agents:` list:
```yaml
agents:
  - id: coordinator
    role: coordinator
    brain:
      provider: deepseek
      model: deepseek-chat
      backend: openai_compat
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
    system_prompt: "You orchestrate tasks."

  - id: worker
    role: worker
    brain:
      provider: groq
      model: llama-3.3-70b-versatile
      backend: openai_compat
      config:
        api_key: "{{env.GROQ_API_KEY}}"
        base_url: "https://api.groq.com/openai/v1"
    system_prompt: "You execute tasks."
```
The `execution.entry_agent` field controls which agent starts. If not set, the first agent in the list is used.
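For example, to start with the coordinator from the snippet above regardless of list order:

```yaml
execution:
  entry_agent: coordinator
```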
## Complete Agent Example
This example shows all agent fields: brain with inline provider, generation params, per-brain context management with a separate summary brain, and a detailed system prompt.
```yaml
agents:
  - id: analyst
    role: assistant
    brain:
      provider: deepseek
      model: deepseek-chat
      backend: openai_compat
      temperature: 0.2
      max_tokens: 8192
      top_p: 0.95
      timeout: 60.0
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
      context:
        max_tokens: 80000
        output_reserved: 4096
        strategy: summarize
        keep_recent: 10
        compression_trigger: 0.75
        summary_max_tokens: 1024
        auto_compact: true
        summary_brain:          # Cheap local model for compaction
          provider: ollama
          model: qwen2.5:3b
          backend: openai_compat
    system_prompt: |
      You are a data analyst assistant. You respond in French.
      You have access to tools via a discovery system.
      EFFICIENT WORKFLOW:
      1. list_categories -> see available modules
      2. browse_category(category="name") -> see module tools
      3. execute_tool(name="module.action", params={...}) -> execute
      IMPORTANT:
      - Go directly to execute_tool once you know the tool name.
      - Limit yourself to 3-5 tool calls per question.
      - If a tool fails, explain the error instead of retrying.
```