Agent System

Caboose’s agent loop is a state machine that manages the full lifecycle of a conversation turn — from sending a prompt to the LLM, through tool execution, to presenting the final response.

The AgentLoop transitions through four states:

Idle → Streaming → ExecutingTools → PendingApproval → Idle
  • Idle — waiting for user input.
  • Streaming — receiving tokens from the LLM via SSE. Events flow from a background task to the UI in real time.
  • ExecutingTools — the LLM requested one or more tool calls. Caboose executes them and feeds results back.
  • PendingApproval — a tool requires user confirmation (based on the current permission mode). Execution pauses until the user approves or denies.

The loop repeats Streaming and ExecutingTools until the LLM produces a final response with no further tool calls.
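The loop above can be sketched as a small state machine. This is an illustrative Python sketch, not Caboose's actual implementation: the names (`AgentState`, `run_turn`) and the callback-based approval hook are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class AgentState(Enum):
    IDLE = auto()
    STREAMING = auto()
    EXECUTING_TOOLS = auto()
    PENDING_APPROVAL = auto()

@dataclass
class ToolCall:
    name: str
    args: dict

@dataclass
class LLMResponse:
    text: str = ""
    tool_calls: list = field(default_factory=list)

def run_turn(llm, tools, prompt,
             needs_approval=lambda call: False,
             approve=lambda call: True):
    """Drive one turn: Streaming -> ExecutingTools (-> PendingApproval) -> ... -> Idle."""
    state = AgentState.STREAMING
    messages = [("user", prompt)]
    while True:
        response = llm(messages)                  # Streaming: collect the model's output
        if not response.tool_calls:
            state = AgentState.IDLE               # final answer, no further tool calls
            return response.text
        state = AgentState.EXECUTING_TOOLS
        for call in response.tool_calls:
            if needs_approval(call):              # permission mode may require confirmation
                state = AgentState.PENDING_APPROVAL
                if not approve(call):             # user denied: record it and move on
                    messages.append(("tool", call.name, "denied"))
                    continue
            messages.append(("tool", call.name, tools[call.name](call.args)))
        state = AgentState.STREAMING              # feed results back to the LLM
```

The key property is the Streaming/ExecutingTools cycle: the loop only returns to Idle when a response arrives with no tool calls.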

| Tool | Description |
| --- | --- |
| read_file | Read file contents with line range support |
| write_file | Create or overwrite a file |
| edit_file | Apply targeted edits to an existing file |
| glob | Find files by pattern |
| grep | Search file contents with regex |
| bash | Execute a shell command |
| list_directory | List directory contents |
| fetch | Make an HTTP request |
| web_search | Search the web |
| todo_write | Create or update a task list |
| todo_read | Read the current task list |
| explore | Broad codebase exploration |
| agent | Spawn a subagent for parallel work |
| ask_user | Prompt the user for clarification |

Tool output is capped at 2,000 lines or 50 KB, whichever is reached first. Output beyond the cap is truncated with a notice so the LLM knows data was omitted.

The agent tool spawns an independent agent loop that runs a focused subtask with its own context. Subagents are useful for parallelizing work — for example, researching an API while the main agent continues writing code. Subagent results are returned to the parent as tool output.
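A sketch of the parallelism this enables, assuming each subagent is an independent loop with its own context. The helper names and the thread-based execution are illustrative; `run_agent` stands in for a full agent loop.

```python
from concurrent.futures import ThreadPoolExecutor

def spawn_subagent(task: str, run_agent) -> str:
    """Run an independent agent loop on a focused subtask.

    Each subagent starts with a fresh context containing only its task,
    so it does not consume the parent's context window while it works."""
    return run_agent(task)

def run_parallel_subagents(tasks, run_agent):
    # Subagents are independent, so they can execute concurrently;
    # their results return to the parent as ordinary tool output.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(lambda t: spawn_subagent(t, run_agent), tasks))
```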

Long sessions inevitably exceed a model’s context window. Caboose handles this automatically:

  • Compaction — when the context limit approaches, Caboose asks the LLM to summarize the conversation so far. The summary replaces the full history, freeing space for continued work.
  • Cold storage — older conversation segments are serialized to SQLite. They remain accessible for reference but no longer consume context tokens.

Both mechanisms are automatic and require no intervention; the only visible sign is a brief status message when compaction occurs.
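The compaction trigger can be sketched like this. The 4-characters-per-token heuristic, the 80% threshold, and the function names are illustrative assumptions, not Caboose's actual values; `summarize` stands in for the LLM call that writes the summary.

```python
def estimate_tokens(messages) -> int:
    # rough heuristic: roughly 4 characters per token
    return sum(len(m) for m in messages) // 4

def maybe_compact(messages, summarize, context_limit: int,
                  threshold: float = 0.8):
    """When usage approaches the context limit, replace the full
    history with an LLM-written summary, freeing space for continued work."""
    if estimate_tokens(messages) < threshold * context_limit:
        return messages  # plenty of room left; keep the full history
    summary = summarize(messages)
    return [f"[conversation summary]\n{summary}"]
```

Cold storage is the complementary path: instead of summarizing, older segments are serialized out (to SQLite, per the text above) so they stay retrievable without occupying context tokens.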