Agent System

Caboose’s agent loop is a state machine that manages the full lifecycle of a conversation turn — from sending a prompt to the LLM, through tool execution, to presenting the final response.

The AgentLoop transitions through four states:

Idle → Streaming → ExecutingTools → PendingApproval → Idle
  • Idle — waiting for user input.
  • Streaming — receiving tokens from the LLM via SSE. Events flow from a background task to the UI in real time.
  • ExecutingTools — the LLM requested one or more tool calls. Caboose executes them and feeds results back.
  • PendingApproval — a tool requires user confirmation (based on the current permission mode). Execution pauses until the user approves or denies.

The loop repeats Streaming and ExecutingTools until the LLM produces a final response with no further tool calls.
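The loop above can be sketched as a small state machine. This is an illustrative Python sketch, not Caboose's actual implementation: the names (`AgentState`, `run_turn`) and the callback-based approval hook are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class AgentState(Enum):
    IDLE = auto()
    STREAMING = auto()
    EXECUTING_TOOLS = auto()
    PENDING_APPROVAL = auto()

@dataclass
class ToolCall:
    name: str
    args: dict

@dataclass
class LLMResponse:
    text: str = ""
    tool_calls: list = field(default_factory=list)

def run_turn(llm, tools, prompt,
             needs_approval=lambda call: False,
             approve=lambda call: True):
    """Drive one turn: Streaming -> ExecutingTools (-> PendingApproval) -> ... -> Idle."""
    state = AgentState.STREAMING
    messages = [("user", prompt)]
    while True:
        response = llm(messages)                  # Streaming: collect the model's output
        if not response.tool_calls:
            state = AgentState.IDLE               # final answer, no further tool calls
            return response.text
        state = AgentState.EXECUTING_TOOLS
        for call in response.tool_calls:
            if needs_approval(call):              # permission mode may require confirmation
                state = AgentState.PENDING_APPROVAL
                if not approve(call):             # user denied: record it and move on
                    messages.append(("tool", call.name, "denied"))
                    continue
            messages.append(("tool", call.name, tools[call.name](call.args)))
        state = AgentState.STREAMING              # feed results back to the LLM
```

The key property is the Streaming/ExecutingTools cycle: the loop only returns to Idle when a response arrives with no tool calls.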

| Tool | Description |
| --- | --- |
| read_file | Read file contents with line range support |
| write_file | Create or overwrite a file |
| edit_file | Apply targeted edits to an existing file |
| glob | Find files by pattern |
| grep | Search file contents with regex |
| bash | Execute a shell command |
| list_directory | List directory contents |
| fetch | Make an HTTP request |
| web_search | Search the web |
| todo_write | Create or update a task list |
| todo_read | Read the current task list |
| explore | Broad codebase exploration |
| agent | Spawn a subagent for parallel work |
| ask_user | Prompt the user for clarification |

Tool output is capped at 2,000 lines or 50 KB, whichever is reached first. Output beyond the cap is truncated with a notice so the LLM knows data was omitted.

The agent tool spawns an independent agent loop that runs a focused subtask with its own context. Subagents are useful for parallelizing work — for example, researching an API while the main agent continues writing code. Subagent results are returned to the parent as tool output.
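A sketch of the parallelism this enables, assuming each subagent is an independent loop with its own context. The helper names and the thread-based execution are illustrative; `run_agent` stands in for a full agent loop.

```python
from concurrent.futures import ThreadPoolExecutor

def spawn_subagent(task: str, run_agent) -> str:
    """Run an independent agent loop on a focused subtask.

    Each subagent starts with a fresh context containing only its task,
    so it does not consume the parent's context window while it works."""
    return run_agent(task)

def run_parallel_subagents(tasks, run_agent):
    # Subagents are independent, so they can execute concurrently;
    # their results return to the parent as ordinary tool output.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(lambda t: spawn_subagent(t, run_agent), tasks))
```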

Long sessions inevitably exceed a model’s context window. Caboose handles this automatically:

  • Compaction — when the context limit approaches, Caboose asks the LLM to summarize the conversation so far. The summary replaces the full history, freeing space for continued work.
  • Cold storage — older conversation segments are serialized to SQLite. They remain accessible for reference but no longer consume context tokens.

Both mechanisms are automatic and require no intervention; the only visible sign is a brief status message when compaction occurs.
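The compaction trigger can be sketched like this. The 4-characters-per-token heuristic, the 80% threshold, and the function names are illustrative assumptions, not Caboose's actual values; `summarize` stands in for the LLM call that writes the summary.

```python
def estimate_tokens(messages) -> int:
    # rough heuristic: roughly 4 characters per token
    return sum(len(m) for m in messages) // 4

def maybe_compact(messages, summarize, context_limit: int,
                  threshold: float = 0.8):
    """When usage approaches the context limit, replace the full
    history with an LLM-written summary, freeing space for continued work."""
    if estimate_tokens(messages) < threshold * context_limit:
        return messages  # plenty of room left; keep the full history
    summary = summarize(messages)
    return [f"[conversation summary]\n{summary}"]
```

Cold storage is the complementary path: instead of summarizing, older segments are serialized out (to SQLite, per the text above) so they stay retrievable without occupying context tokens.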