ℏεsam · 2026-01-30 · 10m

ClawdBot Architecture Explained: How It Actually Works

Original: Everyone talks about ClawdBot, but here's how it works Author: ℏεsam (@Hesamation)

Everyone talks about ClawdBot, but here's how it works:

What Is ClawdBot, Technically?

ClawdBot is a TypeScript CLI application. At its core, it exposes Claude's language capabilities through an interactive shell, paired with a suite of tools that give the model read/write access to your system.

The Architecture

Channel Adapter – Accepts input from Telegram, WhatsApp, CLI, etc. Normalizes messages into an internal format.
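To make the normalization step concrete, here is a minimal sketch that maps two inbound shapes onto one internal message type. Everything named here (`InboundMessage`, `fromTelegram`, `fromCli`, the `sessionKey` format) is an assumption for illustration, not ClawdBot's actual code, and the Telegram payload is a simplified subset of a real update.

```typescript
// Hypothetical internal format; ClawdBot's real type may differ.
interface InboundMessage {
  channel: "telegram" | "cli";
  sessionKey: string; // assumed key a gateway could use to pick a lane
  senderId: string;
  text: string;
  receivedAt: Date;
}

// Simplified subset of a Telegram message update.
interface TelegramUpdate {
  message: {
    chat: { id: number };
    from: { id: number };
    text?: string;
    date: number; // Unix time in seconds
  };
}

function fromTelegram(update: TelegramUpdate): InboundMessage {
  const { chat, from, text, date } = update.message;
  return {
    channel: "telegram",
    sessionKey: `telegram:${chat.id}`,
    senderId: String(from.id),
    text: text ?? "",
    receivedAt: new Date(date * 1000), // seconds to milliseconds
  };
}

function fromCli(line: string, user: string): InboundMessage {
  return {
    channel: "cli",
    sessionKey: `cli:${user}`,
    senderId: user,
    text: line.trim(),
    receivedAt: new Date(),
  };
}
```

Once every channel produces the same shape, nothing downstream needs to know where a message came from.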

Gateway Server – Central router. Uses lanes (key-based queues) to serialize commands per user/session.
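One common way to build key-based queues like this is to chain promises per key. The sketch below assumes that approach; `LaneScheduler` and `enqueue` are hypothetical names, and ClawdBot's real lanes may work differently. Tasks that share a lane key run one at a time in arrival order, while different lanes stay independent.

```typescript
type Task<T> = () => Promise<T>;

class LaneScheduler {
  // Tail of each lane: a promise that settles when the lane's last task is done.
  private tails = new Map<string, Promise<unknown>>();

  enqueue<T>(laneKey: string, task: Task<T>): Promise<T> {
    const tail = this.tails.get(laneKey) ?? Promise.resolve();
    // Start this task only after the previous one in the same lane has settled.
    const result = tail.then(() => task());
    // The stored tail never rejects, so one failed task cannot block the lane.
    const settled = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(laneKey, settled);
    // Drop the entry once the lane is idle so the map does not grow forever.
    void settled.then(() => {
      if (this.tails.get(laneKey) === settled) this.tails.delete(laneKey);
    });
    return result;
  }
}
```

With a key such as a session id, two messages from the same user never interleave, but a slow task for one user does not hold up anyone else.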

Agent Runner – Executes the actual loop: prompt → Claude → tool-use → repeat until done.

LLM API Call – The runner calls Claude's Messages API with the system prompt + tool definitions.

Agentic Loop – If the response contains tool_use, the runner executes it, appends results, and loops.

Response Path – When done, the final reply goes back through the Gateway → Adapter → user.
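The prompt → Claude → tool-use → repeat cycle above can be sketched as follows. The content-block types are a simplified version of the shapes used by Claude's Messages API (`tool_use` blocks answered by `tool_result` blocks). `runAgentLoop`, `CallModel`, and the `maxTurns` cap are hypothetical names, and the model call is injected as a plain synchronous function so the sketch runs without network access; a real call would be async.

```typescript
// Simplified content blocks, modeled on the shape of Claude's Messages API.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Assumption: the model call is passed in so it can be stubbed.
type CallModel = (messages: Message[]) => ContentBlock[];
type ToolHandler = (input: Record<string, unknown>) => string;

function runAgentLoop(
  prompt: string,
  callModel: CallModel,
  tools: Record<string, ToolHandler>,
  maxTurns = 10,
): string {
  const messages: Message[] = [
    { role: "user", content: [{ type: "text", text: prompt }] },
  ];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = callModel(messages);
    messages.push({ role: "assistant", content: reply });

    // Execute every requested tool and collect the results.
    const results: ContentBlock[] = [];
    for (const block of reply) {
      if (block.type !== "tool_use") continue;
      const handler = tools[block.name];
      const output = handler ? handler(block.input) : `unknown tool: ${block.name}`;
      results.push({ type: "tool_result", tool_use_id: block.id, content: output });
    }

    if (results.length === 0) {
      // No tool calls: the text blocks are the final reply.
      return reply.map((b) => (b.type === "text" ? b.text : "")).join("");
    }
    // Feed the results back as the next user turn and loop.
    messages.push({ role: "user", content: results });
  }
  throw new Error(`agent loop did not finish within ${maxTurns} turns`);
}
```

The turn cap is a design choice of this sketch: without some bound, a model that keeps requesting tools would loop forever.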

How ClawdBot Remembers

Session transcripts – Every conversation is persisted as JSONL. Allows resumption and summary.
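JSONL means one JSON object per line, which makes a transcript cheap to append to and easy to replay. A minimal sketch, assuming a hypothetical `TranscriptEntry` shape (ClawdBot's real fields are not shown in this article) and working on strings rather than files to stay self-contained:

```typescript
// Hypothetical entry shape for illustration.
interface TranscriptEntry {
  ts: string; // ISO 8601 timestamp
  role: "user" | "assistant" | "tool";
  content: string;
}

// Append one entry as a single line. JSON.stringify escapes newlines inside
// strings, so multi-line content still occupies exactly one line.
function appendEntry(jsonl: string, entry: TranscriptEntry): string {
  return jsonl + JSON.stringify(entry) + "\n";
}

// Rebuild the conversation by parsing each non-empty line.
function parseTranscript(jsonl: string): TranscriptEntry[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TranscriptEntry);
}
```

Resuming a session is then just parsing the lines and feeding the entries back in as message history.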

Memory files – Long-term memory stored as markdown in ~/.clawdbot/memory/. Indexed via hybrid search (vector + keyword) backed by SQLite with FTS5.
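Hybrid search means each memory chunk gets two scores, one semantic and one lexical, which are then merged into a single ranking. The sketch below shows only that merge step, under loud assumptions: the keyword score is a naive term-overlap stand-in for FTS5 ranking, the weighted sum and its `0.7` default are arbitrary, and all names are hypothetical. It is not how ClawdBot is known to combine scores.

```typescript
interface MemoryChunk {
  id: string;
  text: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Stand-in for a real full-text rank: fraction of query terms found in the text.
function keywordScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  return terms.filter((t) => haystack.includes(t)).length / terms.length;
}

// Weighted sum of both scores, highest first. The weight is an assumption.
function hybridSearch(
  query: string,
  queryEmbedding: number[],
  chunks: MemoryChunk[],
  vectorWeight = 0.7,
): { id: string; score: number }[] {
  return chunks
    .map((c) => ({
      id: c.id,
      score:
        vectorWeight * cosine(queryEmbedding, c.embedding) +
        (1 - vectorWeight) * keywordScore(query, c.text),
    }))
    .sort((x, y) => y.score - x.score);
}
```

The point of combining the two: vector similarity catches paraphrases, while keyword matching catches exact names and identifiers that embeddings tend to blur.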

ClawdBot's Computer Use

Here's where things get interesting:

exec – Runs shell commands. Optionally sandboxed via macOS Seatbelt or Linux namespaces.

Filesystem – Read/write/glob/grep. Respects allowlist and denylist patterns.

Browser – Playwright-powered. Instead of visual screenshots, it captures a semantic snapshot (ARIA tree pruned for relevance).

Process management – Can spawn background tasks, kill jobs, or query their status.

Safety

Commands run in a restricted environment unless explicitly allowed. You can define an allowlist of patterns. A command that matches no pattern either prompts the user for approval or is blocked.
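A rough sketch of what an allowlist check could look like. The pattern syntax here (`*` as a wildcard), the `decide` function, and the rule that any shell chaining character forces approval are all assumptions of this example, not ClawdBot's documented behavior.

```typescript
type Decision = "allow" | "ask";

// Characters that can chain or redirect commands; anything containing them
// is never auto-approved in this sketch, whatever pattern it matches.
const SHELL_METACHARACTERS = /[;&|`$<>\n]/;

// Turn a simple pattern into an anchored RegExp where `*` matches any text.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function decide(command: string, allowlist: string[]): Decision {
  const trimmed = command.trim();
  if (SHELL_METACHARACTERS.test(trimmed)) return "ask";
  return allowlist.some((p) => patternToRegExp(p).test(trimmed)) ? "allow" : "ask";
}
```

The metacharacter check matters because a wildcard pattern like `ls *` would otherwise also match `ls; rm -rf /`.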

Browser Semantic Snapshots

Why not screenshots? Semantic snapshots are:

  • Smaller (KB vs. MB)
  • Structured (DOM-like refs for each element)
  • Actionable (model can issue click/type/scroll by ref)
  • Faster for the model to interpret

The snapshot is an accessibility tree: roles, names, states. Cleaner than raw HTML, better than pixels.
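To show why this is so much lighter than pixels, here is a small sketch that flattens an accessibility-style tree into text with a ref per element. The node shape, the set of roles kept, and the `[ref=eN]` format are assumptions for illustration; the real snapshot format and pruning rules may differ.

```typescript
interface AxNode {
  role: string;
  name?: string;
  children?: AxNode[];
}

// Assumed pruning rule: keep only roles the model is likely to read or act on.
const KEPT_ROLES = new Set(["button", "link", "textbox", "checkbox", "heading"]);

function renderSnapshot(root: AxNode): { text: string; refs: Map<string, AxNode> } {
  const refs = new Map<string, AxNode>();
  const lines: string[] = [];

  const visit = (node: AxNode, depth: number): void => {
    let childDepth = depth;
    if (KEPT_ROLES.has(node.role)) {
      const ref = `e${refs.size + 1}`;
      refs.set(ref, node);
      lines.push(`${"  ".repeat(depth)}- ${node.role} "${node.name ?? ""}" [ref=${ref}]`);
      childDepth = depth + 1;
    }
    // Pruned nodes are skipped, but their children are still visited.
    for (const child of node.children ?? []) visit(child, childDepth);
  };

  visit(root, 0);
  return { text: lines.join("\n"), refs };
}
```

The model sees a few short lines, and an action such as "click e3" can be resolved back to a concrete element through the `refs` map.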

That's ClawdBot's internals. Questions?

Tags

Architecture, Technical Deep Dive, How It Works