Safety Model
Caboose runs arbitrary code on behalf of an LLM, so safety is enforced at multiple layers. No single mechanism is sufficient on its own — the defenses are designed to overlap.
Permission Modes
Every session runs in one of three permission modes, which gate tool execution:
| Mode | Behavior |
|---|---|
| Plan | Read-only. The agent can inspect files and search the codebase but cannot write files or execute shell commands. Useful for exploration and planning. |
| Create | The agent can read and write files, but every destructive action (shell commands, file writes) requires explicit user approval via the TUI approval dialog. |
| Chug | Full autonomy. All tool invocations are auto-approved. Intended for trusted, well-scoped tasks. |
Users can switch modes mid-session. The mode is displayed in the TUI status bar so it is always visible.
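The table above can be read as a small decision function. The following is a minimal sketch, not Caboose's actual implementation: the tool names, the `Decision` enum, and the fail-closed handling of unknown tools are all assumptions made for illustration. It also leaves out the command policy, which is a separate check.

```python
from enum import Enum


class Mode(Enum):
    PLAN = "plan"
    CREATE = "create"
    CHUG = "chug"


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"    # show the approval dialog
    DENY = "deny"


# Hypothetical tool names; the real tool set is not listed in this document.
READ_ONLY_TOOLS = {"read_file", "search"}
MUTATING_TOOLS = {"write_file", "shell"}


def gate(mode: Mode, tool: str) -> Decision:
    """Map a (mode, tool) pair to a decision, following the mode table."""
    if tool in READ_ONLY_TOOLS:
        return Decision.ALLOW      # reads are permitted in every mode
    if tool not in MUTATING_TOOLS:
        return Decision.DENY       # assumption: unknown tools fail closed
    if mode is Mode.PLAN:
        return Decision.DENY       # Plan is read-only
    if mode is Mode.CREATE:
        return Decision.ASK        # Create requires explicit approval
    return Decision.ALLOW          # Chug auto-approves
```

Because the mode is passed in on every call, a mid-session mode switch takes effect on the next tool invocation without any extra bookkeeping.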
Command Policy
The shell tool passes every command through a command policy check before execution. This system maintains allow and deny lists and performs shell-segment analysis to catch dangerous patterns:
- Deny list: commands that are never permitted regardless of permission mode (e.g., `rm -rf /`, `mkfs`, `dd` targeting block devices).
- Allow list: common safe commands that skip the approval prompt even in Create mode (e.g., `ls`, `cat`, `grep`, `git status`).
- Segment analysis: the command string is parsed into segments so that pipes, subshells, and command substitution cannot smuggle denied commands past the policy check.
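To make the interaction between the three pieces concrete, here is a rough sketch of a policy check. It is an illustration under stated assumptions, not Caboose's parser: the regex-based splitter ignores quoting, the function names are hypothetical, and the deny and allow rules cover only the examples listed above.

```python
import re
import shlex

# Simplified separators: command substitution, subshell parentheses, backticks,
# logical operators, pipes, semicolons, background "&", and newlines.
# Assumption: quoting is ignored, so this splitter can over-split.
_SEPARATORS = re.compile(r"\$\(|[`()]|\|\||&&|[|;&\n]")

ALLOWED_PREFIXES = [["ls"], ["cat"], ["grep"], ["git", "status"]]


def split_segments(command: str) -> list[str]:
    """Break a command string into the individual commands it contains."""
    return [s.strip() for s in _SEPARATORS.split(command) if s.strip()]


def is_denied(argv: list[str]) -> bool:
    """Match the deny-list examples: rm -rf /, mkfs*, dd onto a device."""
    name, args = argv[0], argv[1:]
    if name.startswith("mkfs"):
        return True
    if name == "dd":
        return any(a.startswith("of=/dev/") for a in args)
    if name == "rm":
        recursive = any(
            a == "--recursive"
            or (a.startswith("-") and not a.startswith("--") and "r" in a.lower())
            for a in args
        )
        return recursive and "/" in args
    return False


def is_allowed(argv: list[str]) -> bool:
    return any(argv[: len(p)] == p for p in ALLOWED_PREFIXES)


def classify(command: str) -> str:
    """Return "deny", "allow" (skip the prompt), or "ask" (prompt the user)."""
    verdicts = []
    for segment in split_segments(command):
        try:
            argv = shlex.split(segment)
        except ValueError:
            verdicts.append("ask")  # unparseable segment: never auto-approve
            continue
        if not argv:
            continue
        if is_denied(argv):
            return "deny"           # one denied segment rejects the whole command
        verdicts.append("allow" if is_allowed(argv) else "ask")
    if verdicts and all(v == "allow" for v in verdicts):
        return "allow"
    return "ask"
```

With this sketch, `classify("cat notes.txt | grep TODO")` returns `"allow"`, while `classify("echo $(rm -rf /)")` returns `"deny"` because the substituted command is checked as its own segment. Over-splitting a quoted string errs toward asking the user rather than auto-approving.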
Environment Filtering
Before any shell command executes, Caboose strips sensitive environment
variables from the child process environment. API keys, tokens, and credentials
that are present in the parent process are not leaked to commands the agent
runs. The filter matches against known variable name patterns (e.g.,
`*_API_KEY`, `*_SECRET`, `*_TOKEN`).
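A filter of this kind can be sketched with glob matching on variable names. This is a minimal example that uses only the three patterns quoted above; the function name and the case-insensitive comparison are assumptions, and the real pattern list may be longer.

```python
import fnmatch

# Only the patterns named in the text; the full list is an unknown here.
SENSITIVE_PATTERNS = ("*_API_KEY", "*_SECRET", "*_TOKEN")


def filtered_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env without variables whose names look sensitive."""
    return {
        name: value
        for name, value in env.items()
        # Assumption: names are compared case-insensitively by upper-casing.
        if not any(fnmatch.fnmatchcase(name.upper(), p) for p in SENSITIVE_PATTERNS)
    }
```

The result can be passed as the `env=` argument of `subprocess.run`, so the child process sees only the filtered variables while the parent environment stays untouched.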
Output Caps
Tool output is capped at 2,000 lines or 50 KB, whichever is hit first. This prevents a runaway command from flooding the context window (and burning tokens). Truncated output includes a notice so the agent knows the result was clipped.
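The "whichever is hit first" rule can be sketched as a single pass over the output lines. The example below makes several assumptions: 50 KB is taken as 50 × 1024 bytes of UTF-8, truncation happens at whole-line boundaries, and the notice wording is invented for illustration.

```python
MAX_LINES = 2_000
MAX_BYTES = 50 * 1024  # assumption: "50 KB" means 51,200 bytes


def cap_output(text: str, max_lines: int = MAX_LINES, max_bytes: int = MAX_BYTES) -> str:
    """Keep leading lines until either cap is reached, then append a notice."""
    lines = text.splitlines(keepends=True)
    kept = []
    used = 0
    for line in lines:
        size = len(line.encode("utf-8"))
        if len(kept) >= max_lines or used + size > max_bytes:
            break
        kept.append(line)
        used += size
    if len(kept) == len(lines):
        return text  # under both caps: return the output unchanged
    body = "".join(kept)
    if body and not body.endswith("\n"):
        body += "\n"
    omitted = len(lines) - len(kept)
    # Hypothetical notice text; the real wording is not given in this document.
    return body + f"[output truncated: {omitted} more line(s) omitted]\n"
```

One consequence of cutting at line boundaries is that a single line larger than the byte cap is dropped entirely, leaving only the notice. An implementation that needs partial lines would have to cut mid-line instead.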
Session Budgets
Each session can have a configurable maximum cost. Caboose tracks token usage and estimated cost per turn using the pricing data from the model catalog. When the budget is exceeded, the agent loop halts and notifies the user rather than silently continuing to spend.
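As a sketch of the accounting involved, the example below accumulates an estimated cost per turn and raises once the configured maximum is passed. The class names, the per-million-token pricing fields, and the use of an exception to halt the loop are all assumptions; the catalog's real schema is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    """Hypothetical catalog entry: USD per one million tokens."""
    input_per_mtok: float
    output_per_mtok: float


class BudgetExceeded(Exception):
    """Raised so the agent loop can stop and notify the user."""


@dataclass
class SessionBudget:
    max_cost_usd: Optional[float]  # None means no budget is configured
    spent_usd: float = 0.0

    def record_turn(self, pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
        """Add one turn's estimated cost and enforce the session maximum."""
        cost = (
            input_tokens * pricing.input_per_mtok
            + output_tokens * pricing.output_per_mtok
        ) / 1_000_000
        self.spent_usd += cost
        if self.max_cost_usd is not None and self.spent_usd > self.max_cost_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.4f} of ${self.max_cost_usd:.4f} budget"
            )
        return cost
```

In this sketch the check runs after each turn is recorded, so the session can overshoot the limit by at most one turn's cost. Checking a projected cost before sending the request would be a stricter alternative.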