web

Web search, fetch, and content extraction. Supports multiple search backends with automatic fallback. DuckDuckGo is the free default — no API key required.

| Property | Value |
|---|---|
| Module ID | web |
| Version | 1.0.0 |
| Type | user |
| Dependencies | aiohttp, beautifulsoup4, html2text |

Design Philosophy

  • Free by default — DuckDuckGo works out of the box with no API key. Upgrade to Brave/Tavily/Google when needed.
  • Clean content — HTML is converted to readable markdown-like text. Scripts, ads, navigation, and cookie banners are stripped.
  • Cached fetches — pages are cached for 5 minutes. Same URL fetched twice costs one HTTP request, not two.
  • Fallback resilience — if the primary search backend fails, automatically retries with the configured fallback.
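The cached-fetch behavior in the list above can be pictured as a small time-to-live cache. The sketch below is only an illustration, not the module's actual implementation: the `TTLCache` class, the `fetch_cached` helper, and the injected `fetcher` callable are hypothetical names, and only the 5-minute (300-second) lifetime comes from the text.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical time-to-live cache keyed by URL (illustration only)."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> Optional[str]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, content = entry
        if self.clock() - stored_at >= self.ttl:
            # Entry is older than the TTL: drop it and report a miss.
            del self._entries[url]
            return None
        return content

    def put(self, url: str, content: str) -> None:
        self._entries[url] = (self.clock(), content)


def fetch_cached(url: str, cache: TTLCache, fetcher: Callable[[str], str]) -> str:
    """Return fresh cached content, or call `fetcher` once and store the result."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    content = fetcher(url)
    cache.put(url, content)
    return content
```

With this shape, two calls for the same URL inside the TTL window invoke `fetcher` once; a call after the window has elapsed triggers a new request.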

Configuration

modules:
  web:
    config:
      search:
        primary: duckduckgo
        fallback: brave
        api_keys:
          brave: "{{env.BRAVE_API_KEY}}"
          tavily: "{{env.TAVILY_API_KEY}}"
      max_content_length: 50000
      cache_ttl: 300
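
The `{{env.NAME}}` placeholders suggest that API keys are read from environment variables rather than stored in the file. How the module resolves them is not stated here; the following is a minimal sketch of one way it could work. The helper name `resolve_env_placeholders` is hypothetical, and mapping unset variables to an empty string is an assumption.

```python
import os
import re
from typing import Mapping, Optional

# Matches placeholders of the form {{env.SOME_NAME}}, as used in the config above.
_ENV_PLACEHOLDER = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def resolve_env_placeholders(value: str,
                             environ: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: replace each {{env.NAME}} with the value of NAME.

    Unset variables become an empty string here; that choice is an assumption.
    """
    env = os.environ if environ is None else environ
    return _ENV_PLACEHOLDER.sub(lambda m: env.get(m.group(1), ""), value)
```

For example, `resolve_env_placeholders("{{env.BRAVE_API_KEY}}", {"BRAVE_API_KEY": "abc"})` yields `"abc"`, while strings without a placeholder pass through unchanged.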

Search Backends

| Backend | API Key Required | Cost | Best For |
|---|---|---|---|
| duckduckgo | No | Free | Development, testing |
| brave | Yes | ~$0.01/query | Production, affordable |
| tavily | Yes | ~$0.01/query | AI agents (structured results) |
| searxng | No (self-hosted) | Free | Meta-search (aggregates engines) |
| google | Yes + CX | 100 free/day | Highest quality results |
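
The primary/fallback arrangement can be sketched as a single retry against a second backend. This is an illustrative sketch under stated assumptions, not the module's code: `search_with_fallback`, the `Backend` callable type, and the idea that any exception counts as a backend failure are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

SearchResult = Dict[str, str]  # assumed keys: "title", "url", "snippet"
Backend = Callable[[str, int], List[SearchResult]]


def search_with_fallback(
    query: str,
    limit: int,
    backends: Dict[str, Backend],
    primary: str,
    fallback: Optional[str] = None,
) -> List[SearchResult]:
    """Try the primary backend; if it raises, retry once with the fallback."""
    primary_fn = backends[primary]  # a misconfigured name fails loudly here
    try:
        return primary_fn(query, limit)
    except Exception:
        if fallback is None or fallback not in backends:
            raise  # nothing to fall back to: surface the original error
        return backends[fallback](query, limit)
```

In this sketch a failure of the fallback itself propagates to the caller; whether the real module chains further backends is not specified in this document.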

Actions (4)

search

Search the web. Returns a title, URL, and snippet for each result. Parameters: query, limit. Risk: low

fetch

Fetch a page and convert HTML to clean readable text. Parameters: url, max_length, raw. Risk: low
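
To make the HTML-to-text step concrete, here is a rough sketch of stripping non-content tags and truncating the result. The module lists beautifulsoup4 and html2text as dependencies; this example deliberately uses the standard library's `html.parser` as a stand-in so it runs with no extra packages. The `html_to_text` function, the `SKIPPED_TAGS` set, and the reuse of 50000 as the default `max_length` are assumptions for illustration.

```python
from html.parser import HTMLParser
from typing import List

# Tags whose contents are dropped entirely (an assumed list for illustration).
SKIPPED_TAGS = {"script", "style", "nav", "noscript"}


class _TextExtractor(HTMLParser):
    """Collects visible text while ignoring anything inside SKIPPED_TAGS."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def html_to_text(html: str, max_length: int = 50000) -> str:
    """Return visible text, one chunk per line, cut to `max_length` characters."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)[:max_length]
```

This produces plain text rather than the markdown-like output described earlier, and it does not model the `raw` parameter; it only illustrates the strip-then-truncate idea.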

extract

Extract content using CSS selectors. Parameters: url, selector, max_length. Risk: low

download

Download a file to a local path. Parameters: url, path. Risk: medium