Tools

Digitorn uses a tool discovery architecture. Instead of exposing all tools directly to the LLM (which wastes context tokens), agents discover and execute tools through meta-tools.

Adaptive Tool Injection

Digitorn automatically chooses between two injection modes based on the toolset size:

Direct Mode (small toolsets)

When the total number of tools is small enough to fit in ~20% of the context window (estimated at 200 tokens per tool), all tools are passed directly to the LLM as OpenAI function schemas. No meta-tools are needed — the agent calls tools by name.

This is simpler and more efficient for apps with just a few modules.

Example: an app with hello (1 action) + filesystem with 4 allowed actions = 5 tools × 200 = 1000 tokens. With a 131k context window, 20% = 26,200 tokens. 1000 < 26,200, so direct mode is used. The agent sees:

tools: [hello__greet, filesystem__read, filesystem__ls, filesystem__find, filesystem__grep]

Discovery Mode (large toolsets)

When the toolset is too large for direct injection, the agent discovers and executes tools through 5 meta-tools. This avoids wasting context tokens on tool schemas the agent may never use.

Example: an app with filesystem (all actions) + database (all actions) + many custom modules = 30+ tools. 30 × 200 = 6000 tokens > 20% of an 8k context window. Discovery mode is used. The agent sees only the 5 meta-tools.

The mode is chosen automatically at bootstrap and stored in AgentContext.tool_injection ("direct" or "discovery").
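The selection heuristic described above can be sketched in a few lines. The constant and function names here are illustrative, not Digitorn's actual API:

```python
TOKENS_PER_TOOL = 200        # rough per-tool schema cost, per the estimate above
CONTEXT_BUDGET_RATIO = 0.20  # fraction of the context window reserved for tool schemas

def choose_injection_mode(tool_count: int, context_window: int) -> str:
    """Return "direct" if all tool schemas fit in the budget, else "discovery"."""
    estimated_cost = tool_count * TOKENS_PER_TOOL
    budget = int(context_window * CONTEXT_BUDGET_RATIO)
    return "direct" if estimated_cost <= budget else "discovery"

# The two examples from this page:
print(choose_injection_mode(5, 131_000))  # 1000 <= 26200 -> "direct"
print(choose_injection_mode(30, 8_000))   # 6000 > 1600   -> "discovery"
```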

How Discovery Mode Works

The context_builder module indexes all tools from loaded modules and exposes them through meta-tools. The agent never sees the full tool list — it searches, browses, and executes tools on demand.

1. Agent calls list_categories()
-> Returns: ["hello", "filesystem", "database"]

2. Agent calls browse_category(category="filesystem")
-> Returns: [{ name: "filesystem.read", description: "Read a file", ... }, ...]

3. Agent calls get_tool(name="filesystem.read")
-> Returns: { full JSON schema, examples, side effects }

4. Agent calls execute_tool(name="filesystem.read", params={"path": "/tmp/file.txt"})
-> Returns: { success: true, data: "file contents..." }

Meta-Tools

The meta-tools are defined via @action decorators in the context_builder module. They are generated dynamically from the registry — adding a new @action makes it available everywhere automatically.

Current meta-tools:

| Meta-Tool | Description | Key Params |
| --- | --- | --- |
| search_tools | Keyword search over all visible tools | query (str), max_results (int, 1-20) |
| get_tool | Full schema and metadata for one tool | name (str, "module.action" format) |
| execute_tool | Execute a tool with parameters | name (str), params (dict) |
| list_categories | List all available tool domains | (none) |
| browse_category | Browse tools in a domain (paginated) | category (str), page (int), page_size (int) |

Auto-Routing

If an LLM calls a tool directly (e.g., filesystem.read instead of execute_tool(name="filesystem.read")), the agent loop auto-routes it through execute_tool. This happens transparently for better LLM compatibility.
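A minimal sketch of this auto-routing step, assuming a router function and meta-tool set named here for illustration only:

```python
# The five meta-tools listed above; anything else is a direct tool call.
META_TOOLS = {"search_tools", "get_tool", "execute_tool",
              "list_categories", "browse_category"}

def route_tool_call(name: str, arguments: dict) -> tuple[str, dict]:
    """Pass meta-tool calls through; wrap direct calls in execute_tool."""
    if name in META_TOOLS:
        return name, arguments
    # e.g. filesystem.read -> execute_tool(name="filesystem.read", params={...})
    return "execute_tool", {"name": name, "params": arguments}

print(route_tool_call("filesystem.read", {"path": "/tmp/file.txt"}))
```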

Module Tools

Tools come from modules declared in the modules: block. Each module exposes actions via @action decorators.

Declaring Modules

modules:
  # Load with defaults
  hello: {}

  # Load with constraints (restrict available actions)
  filesystem:
    constraints:
      allowed_actions: [read, ls, find, grep]

  # Load with config and setup
  database:
    config:
      timeout_seconds: 10
    setup:
      - action: connect
        params:
          driver: sqlite
          database: "{{workspace}}/data.db"
    constraints:
      allowed_actions: [fetch_results, list_tables]

Currently Implemented Modules

| Module | Actions | Description |
| --- | --- | --- |
| hello | say_hello, greet_many, status | Simple greeting (test/demo) |
| filesystem | read, ls, find, grep, write, mkdir, ... | File operations |
| database | connect, query, fetch_results, list_tables, upsert, batch_execute, ... | Database operations |
| http | get, post, json_api, fetch_page, head, download, ... | HTTP client with async downloads |
| shell | run, script, which, env, background_run, task_status, ... | Shell command execution |
| mcp | connect, disconnect, list_servers, call_tool, list_resources, ... | MCP server integration (auto-indexes external tools) |

Use digitorn app schema {module_id} to see all actions and their parameter schemas.

MCP tools from connected servers appear as virtual modules (mcp_slack, mcp_github, etc.) — see MCP Servers for details.

Tool Constraints

Constrain what tools an agent can access:

modules:
  filesystem:
    constraints:
      # Only these actions are visible to the agent
      allowed_actions: [read, ls, find, grep]

  database:
    constraints:
      # These actions are blocked
      blocked_actions: [execute_query, drop_table]

The context_builder applies these constraints when building the tool index — blocked actions are invisible to the agent.

Native vs Text-Based Tool Use

Digitorn supports two modes of tool interaction, selected automatically based on the provider:

Native Tool Use (OpenAI, DeepSeek, Groq, Mistral, Together)

  • Meta-tools are passed via the API tools= parameter
  • The LLM generates structured tool calls natively
  • The system prompt contains workflow instructions only

Text-Based Tool Use (Ollama, LM Studio, vLLM)

  • Full tool schemas are injected into the system prompt
  • The LLM generates tool calls as text:
    <tool_call>{"name": "list_categories", "arguments": {}}</tool_call>
  • Digitorn parses tool calls from the LLM's text output using multiple strategies
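One of those parsing strategies can be sketched as a regex pass over the model's output; the pattern and function name below are illustrative, not Digitorn's actual parser:

```python
import json
import re

# Match the <tool_call>...</tool_call> format shown above; DOTALL lets the
# JSON payload span multiple lines, and the lazy quantifier backtracks until
# the closing tag is found.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def parse_tool_calls(text: str) -> list[dict]:
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            continue  # malformed JSON: leave it for another strategy
    return calls

reply = 'Let me check. <tool_call>{"name": "list_categories", "arguments": {}}</tool_call>'
print(parse_tool_calls(reply))
# -> [{'name': 'list_categories', 'arguments': {}}]
```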

How the System Prompt is Built

For native mode:

You are agent "assistant" (role: assistant).

You have access to N tools across M domains.

To find and use tools, you have these 5 meta-tools:
- search_tools: Keyword search over all visible tools
- get_tool: Full schema, metadata, and examples for one tool
- execute_tool: Execute a tool with parameters
- list_categories: List all available tool domains
- browse_category: Browse all tools in a specific domain

Workflow:
1. Discover what's available (list or search)
2. Get the exact parameter schema before calling
3. Execute the tool with the correct parameters

[Your system_prompt from YAML]

For text-based mode:

You are agent "assistant" (role: assistant).

You have access to N tools across M domains.

# AVAILABLE TOOLS

To call a tool, output EXACTLY this XML format:
<tool_call>{"name": "tool_name", "arguments": {"param": "value"}}</tool_call>

## Tools
### search_tools
Keyword search over all visible tools
Parameters: { "query": ..., "max_results": ... }

### execute_tool
Execute a tool with parameters
Parameters: { "name": ..., "params": ... }
[... all meta-tools with full schemas ...]

[Your system_prompt from YAML]

Tool Name Sanitization

OpenAI-compatible APIs require function names to match ^[a-zA-Z0-9_-]+$. Since Digitorn uses dotted FQNs (e.g., filesystem.read), the runtime automatically sanitizes names:

  • Outbound (to API): filesystem.read → filesystem__read (dots replaced with double underscores)
  • Inbound (from API): filesystem__read → filesystem.read (reverse conversion before dispatch)

This is transparent — YAML authors and module developers always use the module.action format.
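Both directions are a single string replacement. A minimal sketch (function names assumed for illustration; the round trip relies on action names never containing double underscores):

```python
def sanitize(fqn: str) -> str:
    """Outbound: filesystem.read -> filesystem__read."""
    return fqn.replace(".", "__")

def desanitize(name: str) -> str:
    """Inbound: filesystem__read -> filesystem.read."""
    return name.replace("__", ".")

print(sanitize("filesystem.read"))          # filesystem__read
print(desanitize("database__fetch_results"))  # database.fetch_results
```

Single underscores inside action names (e.g. `fetch_results`) survive the round trip untouched, since only the doubled separator is rewritten.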

Hybrid Search

Tool discovery in discovery mode uses hybrid search combining:

  • Semantic search (FastEmbed + Qdrant): Multilingual embeddings (paraphrase-multilingual-MiniLM-L12-v2, 384 dims) for meaning-based matching. Supports ~50 languages.
  • Keyword search: Inverted index with prefix matching for exact term lookup.
  • Hybrid scoring: Semantic score (×10 weight) + keyword boost (+2-3) for optimal ranking.

The semantic index is built at bootstrap from a rich corpus: FQN + description + tags + param names + side effects + aliases + synonym expansion.
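As a toy illustration of the ranking formula described above (semantic score weighted ×10, plus a keyword boost of roughly +2 to +3) — real scores would come from Qdrant and the inverted index, and this function name is an assumption:

```python
def hybrid_score(semantic_score: float, keyword_boost: float) -> float:
    """Combine a 0..1 semantic similarity with a keyword-match boost."""
    return semantic_score * 10 + keyword_boost

# A strong semantic match with no keyword hit...
print(hybrid_score(0.75, 0))  # 7.5
# ...can be outranked by a weaker semantic match with an exact keyword hit.
print(hybrid_score(0.5, 3))   # 8.0
```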

Module Aliases

Modules can declare aliases for their actions using the @action(aliases=[...]) decorator. Aliases are indexed in both keyword and semantic indexes, improving discoverability in multiple languages.

Example: filesystem.read has aliases like "lire", "lire fichier", "read file" — so a French-speaking agent searching for "lire un fichier" will find it.
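The effect on the keyword side can be sketched with a tiny inverted index; the structure and helper name here are assumptions, not Digitorn internals:

```python
# term -> set of tool FQNs that match it
index: dict[str, set[str]] = {}

def index_tool(fqn: str, description: str, aliases: list[str]) -> None:
    """Index the FQN, each description word, and each alias verbatim."""
    for term in [fqn, *description.lower().split(), *aliases]:
        index.setdefault(term.lower(), set()).add(fqn)

index_tool("filesystem.read", "Read a file", ["lire", "lire fichier", "read file"])
print(index["lire"])  # {'filesystem.read'}
```

A French query containing "lire" now hits `filesystem.read` even though its canonical name and description are English.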

Dynamic Architecture

The entire tool system is built from a single source of truth: the context_builder module's @action registry.

Adding a new meta-tool requires only:

  1. Add an @action method to context_builder/module.py
  2. Define a Pydantic params model

Everything else updates automatically:

  • bootstrap.py generates the JSON schema via action_entry_to_json_schema()
  • prompt.py generates system prompt instructions from the tools list
  • agent_loop.py routes calls via the _action_registry
  • openai_compat.py extracts tool names dynamically for recovery
  • ui.py falls back to "Calling {name}" for any unknown tool

No hardcoded tool names anywhere in the pipeline.

Execution Primitives

In addition to meta-tools, the context_builder provides execution primitives — capabilities for parallel execution, background tasks, persistent monitoring, and time-based scheduling:

| Category | Primitives | Requires |
| --- | --- | --- |
| Parallel | run_parallel | Always available |
| Background | background_run, background_status, background_result, background_cancel, background_list, background_wait | Always available |
| Watchers | watch_start, watch_stop, watch_pause, watch_resume, watch_status, watch_list, watch_history | execution.watchers: true |
| Scheduler | schedule_once, schedule_cron, schedule_cancel, schedule_list, schedule_status, remember | execution.scheduler: true |

Parallel and background primitives are always injected. Watchers and scheduler primitives require opt-in via the execution: block. All primitives work with any module action and respect security policies.

See Execution Primitives for full documentation.