Agent Loop

The agent loop is the core engine that drives multi-turn conversations. It coordinates the LLM provider, tool execution, and the TUI, running as an asynchronous state machine on a background task.

                 user message (sent to LLM)
           Idle ─────────────────▶ Streaming ◄──────────────┐
            ▲                          │ │                  │
            │      no tool calls       │ │                  │
            └──────────────────────────┘ │                  │
                                         │ tool call        │
                                         ▼ received         │
                                  ExecutingTools            │
                                         │                  │
                          needs approval?│                  │
                  yes ◄──────────────────┤ no               │
                   │                     │                  │
            PendingApproval              │                  │
                   │ approved            │                  │
                   └────────────────────▶│                  │
                                         ▼                  │
                                tool results ready          │
                                         │                  │
                                send results to LLM ────────┘

           (Idle = waiting for the next user message)

The loop continues cycling through Streaming and ExecutingTools until the model produces a final text response with no tool calls, at which point it returns to Idle.
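The states and transitions above can be sketched as a small state machine. This is an illustrative sketch, not Caboose's actual types; the enum and event names are assumptions:

```rust
// Sketch of the agent loop's state machine. All names are illustrative.
#[derive(Debug, PartialEq)]
enum AgentState {
    Idle,
    Streaming,
    ExecutingTools,
    PendingApproval,
}

#[derive(Debug)]
enum Event {
    UserMessage,
    ToolCallReceived { needs_approval: bool },
    Approved,
    ToolResultsReady,
    FinalResponse,
}

fn transition(state: AgentState, event: Event) -> AgentState {
    use AgentState::*;
    use Event::*;
    match (state, event) {
        (Idle, UserMessage) => Streaming,
        (Streaming, ToolCallReceived { needs_approval: true }) => PendingApproval,
        (Streaming, ToolCallReceived { needs_approval: false }) => ExecutingTools,
        (PendingApproval, Approved) => ExecutingTools,
        (ExecutingTools, ToolResultsReady) => Streaming, // results go back to the LLM
        (Streaming, FinalResponse) => Idle,              // no tool calls: turn is done
        (s, _) => s, // ignore events that don't apply in the current state
    }
}

fn main() {
    let s = transition(AgentState::Idle, Event::UserMessage);
    let s = transition(s, Event::ToolCallReceived { needs_approval: false });
    let s = transition(s, Event::ToolResultsReady);
    let s = transition(s, Event::FinalResponse);
    println!("{:?}", s); // Idle
}
```

Encoding the transitions as one exhaustive match makes illegal transitions (e.g. a tool result arriving while Idle) impossible to mishandle silently.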

The agent loop runs on a dedicated background task, separate from the TUI rendering thread. Communication uses async message-passing channels:

  1. The TUI sends user messages to the agent loop.
  2. The agent loop forwards them to the provider’s stream method.
  3. StreamEvent values arrive asynchronously and are forwarded to the TUI channel for incremental rendering.
  4. When a ToolCall event arrives, the loop transitions to ExecutingTools (or PendingApproval if the current permission mode requires consent).
  5. Tool results are appended to the conversation and the loop sends the updated context back to the provider for the next turn.
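The channel topology in the steps above can be sketched minimally. The sketch below uses std threads and std::sync::mpsc in place of an async runtime, and the channel and event names are assumptions:

```rust
// Sketch of the two channels between the TUI and the agent loop.
// std::sync::mpsc stands in for the async channels described above.
use std::sync::mpsc;
use std::thread;

#[derive(Debug)]
enum StreamEvent {
    TextDelta(String),
    Done,
}

// Stand-in for forwarding a message to the provider's stream method.
fn agent_reply(msg: &str) -> String {
    format!("echo: {}", msg)
}

fn main() {
    // TUI -> agent loop: user messages
    let (user_tx, user_rx) = mpsc::channel::<String>();
    // agent loop -> TUI: stream events for incremental rendering
    let (event_tx, event_rx) = mpsc::channel::<StreamEvent>();

    let agent = thread::spawn(move || {
        while let Ok(msg) = user_rx.recv() {
            event_tx.send(StreamEvent::TextDelta(agent_reply(&msg))).unwrap();
            event_tx.send(StreamEvent::Done).unwrap();
        }
        // event_tx is dropped here, which ends the TUI's receive loop.
    });

    user_tx.send("hello".to_string()).unwrap();
    drop(user_tx); // close the channel so the agent thread exits

    for ev in event_rx {
        println!("{:?}", ev); // TextDelta("echo: hello"), then Done
    }
    agent.join().unwrap();
}
```

Because each side only owns channel endpoints, the TUI never blocks on the agent loop and vice versa.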

The conversation is maintained as an ordered list of messages, each containing one or more content blocks (text, tool calls, tool results, thinking blocks). This structure maps directly to the wire format expected by LLM providers.
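One way to model that structure in code; the field and variant names here are illustrative, not Caboose's wire format:

```rust
// Illustrative sketch of the conversation data model described above.
#[derive(Debug, Clone)]
enum ContentBlock {
    Text(String),
    ToolCall { id: String, name: String, arguments: String },
    ToolResult { call_id: String, output: String },
    Thinking(String),
}

#[derive(Debug, Clone)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone)]
struct Message {
    role: Role,
    content: Vec<ContentBlock>, // one or more blocks per message
}

fn main() {
    let conversation = vec![
        Message {
            role: Role::User,
            content: vec![ContentBlock::Text("list files".to_string())],
        },
        Message {
            role: Role::Assistant,
            content: vec![ContentBlock::ToolCall {
                id: "call_1".to_string(),
                name: "ls".to_string(),
                arguments: "{\"path\": \".\"}".to_string(),
            }],
        },
    ];
    println!("{} messages", conversation.len()); // 2 messages
}
```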

When the conversation exceeds the model’s context window, Caboose compacts it rather than truncating blindly. The compaction strategy sends the full conversation to the LLM with a summarization prompt, producing a condensed version that preserves key decisions, file paths, and outstanding tasks. The compacted summary replaces older messages while recent turns are kept verbatim.
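The keep-recent-verbatim strategy can be sketched as follows, with a closure standing in for the LLM summarization call and messages reduced to plain strings for brevity:

```rust
// Sketch of compaction: older messages collapse into one summary message,
// while the most recent `keep_recent` turns are kept verbatim.
fn compact(
    messages: Vec<String>,
    keep_recent: usize,
    summarize: impl Fn(&[String]) -> String, // stand-in for the LLM call
) -> Vec<String> {
    if messages.len() <= keep_recent {
        return messages; // nothing to compact
    }
    let split = messages.len() - keep_recent;
    let (older, recent) = messages.split_at(split);
    let mut out = vec![summarize(older)];
    out.extend_from_slice(recent);
    out
}

fn main() {
    let msgs: Vec<String> = (1..=5).map(|i| format!("turn {}", i)).collect();
    let compacted = compact(msgs, 2, |older| {
        format!("[summary of {} messages]", older.len())
    });
    println!("{:?}", compacted);
    // ["[summary of 3 messages]", "turn 4", "turn 5"]
}
```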

For sessions that run long enough to exceed even compacted context limits, Caboose offloads older conversation segments to SQLite. These segments can be recalled on demand (via the recall tool) if the agent needs to reference earlier context. This allows effectively unbounded session length without degrading response quality.

When the model emits a tool call, the agent loop looks up the tool name in the tool registry and invokes the corresponding handler. Each tool receives structured arguments (parsed from the model’s JSON output) and returns a result containing text output and optional metadata. Tool dispatch is synchronous from the state machine’s perspective — the loop awaits the result before deciding whether to continue streaming or return to Idle.
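A dispatch sketch under those assumptions: the registry maps tool names to handlers, with arguments passed as a raw string (rather than parsed JSON) to keep the example dependency-free. Names are illustrative:

```rust
// Sketch of tool dispatch: look up the handler by name and invoke it.
use std::collections::HashMap;

type ToolHandler = fn(&str) -> Result<String, String>;

// A trivial example handler; real tools would parse `args` as JSON.
fn echo_tool(args: &str) -> Result<String, String> {
    Ok(format!("echo: {}", args))
}

fn dispatch(
    registry: &HashMap<&str, ToolHandler>,
    name: &str,
    args: &str,
) -> Result<String, String> {
    match registry.get(name) {
        Some(handler) => handler(args),
        None => Err(format!("unknown tool: {}", name)),
    }
}

fn main() {
    let mut registry: HashMap<&str, ToolHandler> = HashMap::new();
    registry.insert("echo", echo_tool);

    println!("{:?}", dispatch(&registry, "echo", "hi"));    // Ok("echo: hi")
    println!("{:?}", dispatch(&registry, "missing", "{}")); // Err("unknown tool: missing")
}
```

An unknown tool name becomes an error result rather than a panic, so the loop can report the failure back to the model as a tool result and continue.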