Skip to main content

Advanced 10 - Parallel tool execution with run_parallel

Tutorial 4 showed how to spawn parallel sub-agents. That's the right pattern when the work needs distinct system prompts or specialised brains. For the simpler case - fire N independent tool calls and wait for all of them - there's a lighter primitive: run_parallel.

The agent passes a list of {name, params} tuples; the runtime fires every call concurrently with asyncio.gather, collects the results, and returns them as a single bundle. One tool call in the agent's view, N parallel calls internally.

When to use it

  • Fan-out reads: search 5 different sources for the same query, merge the results.
  • Independent shell commands: run git status, git diff --stat, git log -3 in parallel and present a composite summary.
  • Bulk fetch: http.get against 10 URLs at once instead of serializing them.
  • Validation across files: lint 20 files concurrently, collect errors.

The constraint: the calls must be independent. If call B needs the result of call A, you want sequential chaining (a normal agent loop) or a hook pipe (Advanced 7), not parallel fan-out.

The YAML

Save as parallel-bot.yaml. The agent gets shell.bash plus the run_parallel primitive.

app:
app_id: parallel-bot
name: Parallel Bot
version: "1.0"

runtime:
mode: conversation
workdir_mode: auto
max_turns: 4
timeout: 60

agents:
- id: main
role: assistant
brain:
provider: deepseek
model: deepseek-chat
backend: openai_compat
credential:
ref: deepseek_main
scope: per_user
provider: deepseek
config:
api_key: "{{env.DEEPSEEK_API_KEY}}"
base_url: https://api.deepseek.com/v1
temperature: 0
max_tokens: 400
system_prompt: |
You can execute multiple tool calls in parallel using
run_parallel. Pass `actions` as a list of {name, params}
dicts. The runtime fires them concurrently and returns
every result. Use it when several independent calls can
run at once. Reply with one short summary line.

tools:
modules:
shell: {}
capabilities:
default_policy: auto
max_risk_level: high
grant:
- module: shell
actions: [bash]
- module: context_builder
actions: [run_parallel]

run_parallel is a meta-action exposed by context_builder (auto-loaded). The explicit grant is required because the default tool index filters meta-actions out unless they're named.

Live transcript

The user asks the agent to fire three Bash echos in parallel. Sample transcript:

> Use run_parallel to fire THREE Bash calls at once:
(1) echo ONE; (2) echo TWO; (3) echo THREE.
Pass them as the calls list. Then list each stdout.

All three ran in parallel (~158ms). Here are the stdout results:

1. echo ONE -> ONE
2. echo TWO -> TWO
3. echo THREE -> THREE

All exited with code 0, no errors.

tool_calls_count: 1 from the agent's perspective. The single tool call was:

run_parallel(
actions=[
{"name": "bash", "params": {"command": "echo ONE"}},
{"name": "bash", "params": {"command": "echo TWO"}},
{"name": "bash", "params": {"command": "echo THREE"}},
]
)
# returned:
# {
# "results": [
# {"name": "shell.bash", "success": true, "result": {...}},
# {"name": "shell.bash", "success": true, "result": {...}},
# {"name": "shell.bash", "success": true, "result": {...}},
# ],
# "total": 3,
# "succeeded": 3,
# "failed": 0,
# "elapsed_ms": 157.8,
# }

Three Bash invocations, 158 ms total for the parallel batch. Sequential Bash → Bash → Bash on Windows takes roughly 1.5 s for the same workload (Git Bash startup amortises to ~500 ms per call). The 10x speedup comes from the calls actually running concurrently in the asyncio event loop.

Anatomy of the result

The results list mirrors the input actions list in order. Each entry has:

  • name - the resolved FQN (shell.bash, not the short alias the agent used)
  • success - true / false
  • result - the tool's data dict on success
  • error - the error message on failure

The aggregate fields (total, succeeded, failed, elapsed_ms) let the agent quickly check the batch outcome without iterating. A useful pattern is "fire 10 things, only report the failed ones":

result = run_parallel(actions=[...])
if result.failed > 0:
failed = [r for r in result.results if not r.success]
summarize_for_user(failed)

Failure mode

run_parallel does not abort on the first failure - all N calls are awaited. A single failure shows up in results[i] with success: false and the rest of the batch keeps running. This matches the typical fan-out mental model: the failed fetches are reported alongside the successful ones, the agent decides what to do with the partial result.

To abort siblings on first failure, fall back to spawning a sub-agent (Tutorial 4) - sub-agents have isolated lifecycles and can be cancelled cleanly. run_parallel is for "best-effort batch", sub-agent spawn is for "ordered operation with cleanup semantics".

Comparison with the other parallelism primitives

PrimitiveConcurrencyLifecycleBest for
run_parallelTrue parallel via asyncio.gatherOne turn, await allBatch fan-out of N tool calls, no per-call agent context
background_runTrue parallel; agent gets task_idSurvives turns, cancellableLong-running tools the agent should poll, not block on
Agent (spawn)True parallel; full agent loopIndependent context, isolated brainSpecialists with different prompts / brains
Sequential callsSerialOne turn, await eachWhen call B depends on call A's result

The first three all use asyncio.gather internally; the difference is what each call carries. run_parallel carries just a tool call. background_run carries a tool call plus a handle so it can outlive the turn. Agent carries an entire agent loop with its own brain config.

Pick the smallest primitive that does the job. run_parallel is the smallest and the cheapest.

Going further