# Agents

An agent is an LLM with a brain (provider configuration), a system prompt, and a role. Every app has at least one agent defined in the `agents:` list.
## Agent Definition

```yaml
agents:
  - id: assistant
    role: assistant
    brain:
      provider: deepseek
      model: deepseek-chat
      backend: openai_compat
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
    system_prompt: |
      You are a helpful assistant.
      Workspace: {{workspace}}
```
## Agent Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `id` | string | required | Unique agent identifier within the app |
| `role` | string | `"worker"` | Agent role: `coordinator`, `worker`, `assistant`, or custom |
| `brain` | AgentBrain | required | LLM provider configuration |
| `system_prompt` | string | `""` | System prompt injected at conversation start |
| `plan_first` | bool | `true` | Guide the agent to explain its plan before executing tools |
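Only `id` and `brain` are required. A minimal sketch relying on the defaults above (`role` falls back to `"worker"`, `plan_first` to `true`):

```yaml
# Minimal agent definition: only the required fields are set,
# everything else uses the defaults from the table above.
agents:
  - id: worker
    brain:
      provider: deepseek
      model: deepseek-chat
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
```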
## Plan First
When enabled, communication guidelines are injected into the system prompt to encourage the LLM to explain what it's about to do before calling tools. This helps the user understand what's happening — especially since tool parameters and raw results are not shown directly.
This is prompt-level guidance only — the runtime never blocks or intercepts tool calls. The LLM remains free to work as it sees fit. How well the model follows these guidelines depends on the model itself (some models like DeepSeek never produce text alongside tool calls).
```yaml
agents:
  # Default: guidelines injected to encourage explanation
  - id: assistant
    plan_first: true
    brain: ...

  # No guidelines: agent works silently
  - id: background-worker
    plan_first: false
    brain: ...
```
Set `plan_first: false` for agents where explanation is unnecessary (background workers, pipelines, automated tasks).
Note: Regardless of `plan_first`, the CLI always shows real-time tool activity (`> Listing .`, `> Reading file.py`, etc.), so the user is never completely in the dark.
## Brain Configuration

The `brain:` block configures the LLM provider and model. Two modes are supported.
### Inline Mode (recommended)
Full provider config embedded in the agent:
```yaml
brain:
  provider: deepseek       # Provider hint (for base URL resolution)
  model: deepseek-chat     # Model identifier
  backend: openai_compat   # Backend: 'openai_compat' (default) or 'anthropic'
  config:                  # Provider-specific config
    api_key: "{{env.DEEPSEEK_API_KEY}}"
    base_url: "https://api.deepseek.com/v1"  # Optional if provider hint is set
  temperature: 0.2         # Sampling temperature
  max_tokens: 8192         # Max output tokens
  top_p: 1.0               # Nucleus sampling
  context:                 # Per-brain context management (optional)
    max_tokens: 131072
    strategy: summarize
```
### Reference Mode

Points to a named provider in `modules.llm_provider.config.providers`:
```yaml
brain:
  provider_id: my_deepseek_provider
  temperature: 0.2
```
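For context, here is a sketch of the named provider this could point to, assuming providers are declared by id under `modules.llm_provider.config.providers` (the exact shape of the module config may differ; see its own documentation):

```yaml
# Hypothetical named-provider declaration referenced by provider_id above
modules:
  llm_provider:
    config:
      providers:
        my_deepseek_provider:
          provider: deepseek
          model: deepseek-chat
          backend: openai_compat
          config:
            api_key: "{{env.DEEPSEEK_API_KEY}}"
```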
### Brain Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `provider_id` | string | null | Reference to a named provider (reference mode) |
| `provider` | string | null | Provider hint for base URL resolution (inline mode) |
| `model` | string | null | Model identifier |
| `backend` | string | `"openai_compat"` | Backend: `openai_compat` or `anthropic` |
| `config` | dict | `{}` | Provider-specific config (`api_key`, `base_url`, etc.) |
| `temperature` | float | null | Sampling temperature (0-2) |
| `max_tokens` | int | null | Max tokens to generate |
| `top_p` | float | null | Nucleus sampling threshold (0-1) |
| `timeout` | float | null | Request timeout in seconds |
| `context` | ContextConfig | null | Per-brain context management (overrides `execution.context`) |
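The generation and context fields live on the brain itself, so they combine with either mode. A sketch mixing reference mode with per-agent overrides (the provider name and values are illustrative):

```yaml
# Reference-mode brain with local overrides drawn from the field table above
brain:
  provider_id: my_deepseek_provider   # hypothetical named provider
  temperature: 0.2
  timeout: 60.0
  context:
    max_tokens: 32000
    strategy: truncate
```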
## Supported Providers

All providers use the OpenAI-compatible API format (`openai_compat` backend) unless noted.
| Provider | Provider Hint | Default Base URL | Native Tool Use |
|---|---|---|---|
| OpenAI | `openai` | `https://api.openai.com/v1` | Yes |
| Anthropic | `anthropic` | `https://api.anthropic.com/v1` | Yes |
| DeepSeek | `deepseek` | `https://api.deepseek.com/v1` | Yes |
| Groq | `groq` | `https://api.groq.com/openai/v1` | Yes |
| Mistral | `mistral` | `https://api.mistral.ai/v1` | Yes |
| Together | `together` | `https://api.together.xyz/v1` | Yes |
| Ollama | `ollama` | `http://localhost:11434/v1` | No (text-based) |
| LM Studio | `lm_studio` | `http://localhost:1234/v1` | No (text-based) |
| vLLM | `vllm` | `http://localhost:8000/v1` | No (text-based) |
When the provider hint is set, the `base_url` is auto-resolved. You can always override it in `config.base_url`.
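For example, a sketch that keeps the `openai` hint but routes requests through a self-hosted OpenAI-compatible gateway (the URL and model are hypothetical):

```yaml
brain:
  provider: openai   # the hint alone would resolve to https://api.openai.com/v1
  model: gpt-4o-mini
  config:
    api_key: "{{env.OPENAI_API_KEY}}"
    base_url: "https://llm-gateway.example.com/v1"  # explicit override wins
```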
### Native vs Text-Based Tool Calling

Digitorn automatically detects whether a provider supports native tool calling:

- **Native** (OpenAI, DeepSeek, Groq, Mistral, Together): Tools are passed via the API `tools` parameter. The LLM generates structured tool calls natively.
- **Text-based** (Ollama, LM Studio, vLLM): Tool schemas are injected into the system prompt. The LLM generates tool calls as text (e.g., `<tool_call>{"name": "...", "arguments": {...}}</tool_call>`), and Digitorn parses them.
This is fully automatic — you don't need to configure anything. The same YAML works with any provider.
### Tool Call Recovery

Even with native tool calling, LLMs sometimes generate malformed tool calls. Digitorn handles this robustly:

- **Llama native format**: `<function=name>{...}</function>` — parsed via regex
- **XML format**: `<tool_call>{...}</tool_call>` — parsed via regex
- **Raw JSON**: `{"name": "...", "arguments": {...}}` — extracted via brace matching
- **Markdown JSON**: `` ```json {...} ``` `` — extracted from code blocks
- **Smart quotes**: Unicode curly quotes (`“”‘’`) normalized to ASCII before parsing
- **API errors**: Groq `tool_use_failed` errors with `failed_generation` are recovered
### Provider Examples
```yaml
# DeepSeek (cloud, native tool use)
brain:
  provider: deepseek
  model: deepseek-chat
  backend: openai_compat
  config:
    api_key: "{{env.DEEPSEEK_API_KEY}}"
```

```yaml
# Groq (cloud, fast inference, native tool use)
brain:
  provider: groq
  model: llama-3.3-70b-versatile
  backend: openai_compat
  config:
    api_key: "{{env.GROQ_API_KEY}}"
    base_url: "https://api.groq.com/openai/v1"
```

```yaml
# Ollama (local, text-based tool use)
brain:
  provider: ollama
  model: qwen2.5:14b-instruct-q4_K_M
  backend: openai_compat
  config:
    base_url: "http://localhost:11434/v1"
  context:
    max_tokens: 8000
    output_reserved: 1000
    strategy: truncate
    keep_recent: 6
    compression_trigger: 0.60
    auto_compact: true
```

```yaml
# Any OpenAI-compatible endpoint
brain:
  provider: custom
  model: my-fine-tuned-model
  backend: openai_compat
  config:
    api_key: "{{env.CUSTOM_API_KEY}}"
    base_url: "https://my-api.example.com/v1"
```
## System Prompt

The `system_prompt` field defines the agent's behavior and instructions. It supports template expressions.
```yaml
agents:
  - id: assistant
    system_prompt: |
      You are a helpful coding assistant.
      Workspace: {{workspace}}
      You have access to tools via a discovery system.
      Use list_categories to see available modules,
      then browse_category and execute_tool to use them.
```
Note: The system prompt is automatically enriched by the runtime with:
- Agent identity header
- Tool discovery instructions (native mode) or full tool schemas (text-based mode)
Your system prompt is appended after these sections.
### Best Practices

- **Be specific** — Define the agent's role, capabilities, and constraints
- **Use variables** — Inject dynamic context with `{{variable_name}}`
- **Guide tool usage** — Explain the workflow (list, browse, execute)
- **Set limits** — "Limit yourself to 3-5 tool calls per question"
- **Handle errors** — "If a tool fails, explain the error instead of retrying in a loop"
## Context Configuration (Per-Brain)
Each brain can override the execution-level context management:
```yaml
brain:
  provider: ollama
  model: mistral-nemo
  context:
    max_tokens: 8000            # Context window size (0 = auto-detect)
    output_reserved: 1000       # Tokens reserved for output
    strategy: truncate          # 'truncate' or 'summarize'
    keep_recent: 6              # Messages to keep during compaction
    compression_trigger: 0.60   # Compact at 60% usage
    summary_max_tokens: 512     # Max tokens for summary (summarize strategy)
    auto_compact: true          # Auto-inject compaction hook
    summary_brain:              # Optional: separate model for compaction
      provider: ollama
      model: qwen2.5:3b
      backend: openai_compat
```
The `summary_brain` field accepts the same fields as the main brain (`provider`, `model`, `backend`, `config`, `temperature`, `timeout`, etc.). If not set, the agent's main brain is used for summarization.
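Since `summary_brain` takes the same fields, it can carry its own generation settings. A sketch (values are illustrative):

```yaml
context:
  strategy: summarize
  summary_brain:
    provider: ollama
    model: qwen2.5:3b
    backend: openai_compat
    temperature: 0.1   # keep summaries deterministic (illustrative value)
    timeout: 30.0
```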
See Context Management for full details.
## Multi-Agent

Multiple agents can be defined in the `agents:` list:
```yaml
agents:
  - id: coordinator
    role: coordinator
    brain:
      provider: deepseek
      model: deepseek-chat
      backend: openai_compat
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
    system_prompt: "You orchestrate tasks."

  - id: worker
    role: worker
    brain:
      provider: groq
      model: llama-3.3-70b-versatile
      backend: openai_compat
      config:
        api_key: "{{env.GROQ_API_KEY}}"
        base_url: "https://api.groq.com/openai/v1"
    system_prompt: "You execute tasks."
```
The `execution.entry_agent` field controls which agent starts. If not set, the first agent in the list is used.
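For example, to start with the coordinator from the snippet above regardless of list order:

```yaml
execution:
  entry_agent: coordinator
```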
## Complete Agent Example
This example shows all agent fields: brain with inline provider, generation params, per-brain context management with a separate summary brain, and a detailed system prompt.
```yaml
agents:
  - id: analyst
    role: assistant
    brain:
      provider: deepseek
      model: deepseek-chat
      backend: openai_compat
      temperature: 0.2
      max_tokens: 8192
      top_p: 0.95
      timeout: 60.0
      config:
        api_key: "{{env.DEEPSEEK_API_KEY}}"
      context:
        max_tokens: 80000
        output_reserved: 4096
        strategy: summarize
        keep_recent: 10
        compression_trigger: 0.75
        summary_max_tokens: 1024
        auto_compact: true
        summary_brain:          # Cheap local model for compaction
          provider: ollama
          model: qwen2.5:3b
          backend: openai_compat
    system_prompt: |
      You are a data analyst assistant. You respond in French.
      You have access to tools via a discovery system.
      EFFICIENT WORKFLOW:
      1. list_categories -> see available modules
      2. browse_category(category="name") -> see module tools
      3. execute_tool(name="module.action", params={...}) -> execute
      IMPORTANT:
      - Go directly to execute_tool once you know the tool name.
      - Limit yourself to 3-5 tool calls per question.
      - If a tool fails, explain the error instead of retrying.
```