SECURITY

Two-layer security model. Approval system, SSRF, prompt-injection, sensitive paths. Sandbox.

reference · v0.2.81

alpi is published by Satoshi Ltd., whose three load-bearing principles for this document are:

Security First: threat-modeled from initial development; no surveillance disguised as telemetry.
Privacy by Design: privacy is the foundation, not a feature.
Zero Knowledge: what we don't know can't be subpoenaed, leaked, or sold.

Those principles frame every decision below: the guard is mandatory and local, the sandbox is an opt-in second wall, the LLM is treated as an adversary with user credentials, and we keep as little state off your machine as we can.

alpi runs LLM-decided tool calls on your machine. The security posture is layered — application-level guards that always run, plus an optional OS-level sandbox for shell commands.

Layer 1 — application guards (always on)

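These guards live inside the Python process and can't be disabled without editing the source. They cover the attack vectors that an OS sandbox around terminal doesn't reach: the approval system, SSRF checks, prompt-injection defences, and sensitive-path protection.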

The approval system replaces the previous hard denylist. See docs/CONFIG.md for the allowlist format and surface-specific behaviour.
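To illustrate the kind of check that has to live in the Python process rather than around the terminal, here is a minimal sketch of a sensitive-path guard. The function name and the SENSITIVE_DIRS list are hypothetical, not alpi's actual rule set. The sketch only shows why a path should be resolved before it is compared: `..` segments and symlinks must not be able to sidestep the check.

```python
from pathlib import Path

# Hypothetical examples of directories a file tool should refuse to touch.
SENSITIVE_DIRS = ("~/.ssh", "~/.aws", "~/.gnupg")


def is_sensitive(path: str, sensitive_dirs=SENSITIVE_DIRS) -> bool:
    """Return True if `path` resolves to a location inside a sensitive directory."""
    # Expand "~" and collapse ".." / symlinks before comparing.
    resolved = Path(path).expanduser().resolve()
    for entry in sensitive_dirs:
        root = Path(entry).expanduser().resolve()
        if resolved == root or root in resolved.parents:
            return True
    return False
```

A caller would run this before every file read or write requested by the model and route a positive result to the approval flow instead of executing it.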

Layer 2 — OS sandbox (opt-in, per profile)

Wraps terminal subprocess calls in a native OS sandbox so the kernel refuses the syscalls, not just the regex above. Read/write access is limited to workspace + ~/.alpi/ + /tmp; network is denied by default.

Status: stable, opt-in. Defaults to off because real-world dev workflows vary too much to pick a profile that never breaks: git push over SSH relies on ~/.ssh, Apple Silicon Homebrew lives in /opt/homebrew, docker needs /var/run/docker.sock, npm wants ~/.npm. For interactive chat where you approve every command, the Layer 1 guards are already sufficient.

Where it really earns its keep: unattended profiles. Telegram gateway, schedule daemon, research / delegate sub-agents — these run without a human approving each command. A prompt-injected email or a hallucinating sub-agent can issue rm -rf ~/anything with no veto. Layer 2 is the kernel-level veto you want there.

alpi's multi-profile CLI makes this ergonomic:

Each profile has its own ~/.alpi/profiles/<name>/config.yaml, so the sandbox flag is set independently.

Enabling

Interactive: alpi setup → Sandbox → toggle on/off + network.

YAML (direct): set in ~/.alpi/profiles/<name>/config.yaml:

tools:
  terminal:
    sandbox: true
    allow_network: false   # flip to true if the profile needs git push / npm install
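A minimal sketch of how the two keys above could be read once the file has been parsed, for example with `yaml.safe_load` from pyyaml. The helper name `sandbox_settings` is hypothetical; the fallbacks mirror the defaults described on this page (sandbox off, network denied).

```python
def sandbox_settings(config: dict) -> tuple[bool, bool]:
    """Return (sandbox_enabled, network_allowed) for a parsed profile config."""
    # Tolerate missing or empty sections so a bare config.yaml still works.
    terminal = (config.get("tools") or {}).get("terminal") or {}
    sandbox = bool(terminal.get("sandbox", False))              # off unless opted in
    allow_network = bool(terminal.get("allow_network", False))  # denied by default
    return sandbox, allow_network


profile = {"tools": {"terminal": {"sandbox": True, "allow_network": False}}}
print(sandbox_settings(profile))  # (True, False)
```

Treating a missing key as "off" keeps an unconfigured profile on the documented default rather than failing at startup.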

TUI feedback

The top bar shows the current profile's sandbox state next to the workspace: sandbox on in green when active, sandbox off in muted grey when not. Quick visual confirmation you're in the posture you think you're in.

Platform support

macOS — uses native sandbox-exec (ships with the OS at /usr/bin/sandbox-exec). No install step.

Linux — uses bubblewrap. Install once:

Requires user namespaces enabled in the kernel (default on modern distros; some hardened configs disable them).
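For orientation, a bubblewrap invocation matching the limits described above (write access to the workspace, ~/.alpi/ and /tmp; no network unless allowed) could be assembled roughly as follows. This is a sketch under assumptions, not alpi's actual command line: the function name is hypothetical, and the read-only binds of /usr, /bin, /lib, /lib64 and /etc are an assumption made so that ordinary executables can start inside the sandbox.

```python
from pathlib import Path


def bwrap_argv(command: str, workspace: str, allow_network: bool = False) -> list[str]:
    """Build a bwrap command line that runs `command` through /bin/sh."""
    state_dir = str(Path("~/.alpi").expanduser())
    argv = [
        "bwrap",
        "--die-with-parent",                  # kill the sandbox if the parent exits
        "--ro-bind", "/usr", "/usr",          # assumed: system files, read-only
        "--ro-bind-try", "/bin", "/bin",      # -try variants skip missing paths
        "--ro-bind-try", "/lib", "/lib",
        "--ro-bind-try", "/lib64", "/lib64",
        "--ro-bind-try", "/etc", "/etc",
        "--proc", "/proc",
        "--dev", "/dev",
        "--bind", workspace, workspace,       # read/write: workspace
        "--bind", state_dir, state_dir,       # read/write: ~/.alpi
        "--bind", "/tmp", "/tmp",             # read/write: /tmp
        "--chdir", workspace,
    ]
    if not allow_network:
        argv.append("--unshare-net")          # new network namespace, loopback only
    return argv + ["/bin/sh", "-c", command]
```

The resulting list can be passed straight to `subprocess.run`; because it is an argument list rather than a shell string, the model-supplied command is only ever interpreted by the shell inside the sandbox.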

Windows — no native sandbox path. Two options:

  1. WSL2 (recommended): wsl --install, then run alpi inside Ubuntu as if it were Linux native. bubblewrap works there.
  2. Native Windows: leave tools.terminal.sandbox: false. Layer 1 stays active; you lose the kernel-level guarantee for shell commands.
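The platform matrix above boils down to a small dispatch. The sketch below is illustrative only: `pick_sandbox_backend` is a hypothetical name, and returning None stands for "no kernel-level sandbox available, Layer 1 only".

```python
import shutil
import sys
from typing import Callable, Optional


def pick_sandbox_backend(
    platform: str = sys.platform,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the sandbox executable to use on this platform, or None."""
    if platform == "darwin":
        return "sandbox-exec" if which("sandbox-exec") else None
    if platform.startswith("linux"):  # also what Python reports inside WSL2
        return "bwrap" if which("bwrap") else None
    return None  # native Windows and anything else
```

Passing `platform` and `which` as parameters keeps the function testable on a single machine, which is useful when the Linux path is exercised from macOS.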

What happens when the sandbox is on

Testing the Linux path from macOS

A minimal Docker image covers the Linux code path. See docs/sandbox-linux-test.md.

Threat model

alpi's realistic attacker:

Layer 1 covers the common-case attacks (known patterns, known sensitive paths, known SSRF targets). Layer 2 adds defense-in-depth so a creative prompt that bypasses the regex still can't touch the FS or the network.
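As an example of a "known SSRF targets" check, the sketch below rejects URLs whose host resolves to a loopback, private, link-local or otherwise non-public address. It uses only the standard library, the function names are hypothetical, and it is not a description of alpi's actual guard.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_blocked_address(ip: str) -> bool:
    """True for addresses a fetch tool should never reach on the user's behalf."""
    addr = ipaddress.ip_address(ip.split("%")[0])  # drop any IPv6 scope id
    return (
        addr.is_private or addr.is_loopback or addr.is_link_local
        or addr.is_reserved or addr.is_multicast or addr.is_unspecified
    )


def url_is_safe(url: str) -> bool:
    """Allow only http(s) URLs whose every resolved address is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected, not retried
    return all(not is_blocked_address(info[4][0]) for info in infos)
```

A check like this is only sound if the request is then made to the address that was validated; resolving once for the check and again for the fetch leaves room for DNS rebinding.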

Third-party code

Every runtime dependency is an attack surface. We keep the list tight (see ARCHITECTURE.md → Dependencies for why each one earns its place) and audit it before each release. The CVE pass is a single command:

uv run --with pip-audit pip-audit
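If the audit is wired into a release script, a thin wrapper is enough. The sketch below assumes only that pip-audit exits with a non-zero status when it finds known vulnerabilities; the wrapper name and the injectable `run` parameter are hypothetical conveniences for testing.

```python
import subprocess

# The same command as above, split into an argument list.
AUDIT_CMD = ["uv", "run", "--with", "pip-audit", "pip-audit"]


def audit_passed(run=subprocess.run) -> bool:
    """Run the CVE pass; True only if pip-audit exits with status 0."""
    result = run(AUDIT_CMD, check=False)
    return result.returncode == 0
```

A release script would call `audit_passed()` and abort on False, so a newly published CVE blocks the release instead of being noticed afterwards.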

Risk profile of the runtime set:

| Dep | Risk | Notes |
|---|---|---|
| litellm | Medium | Large surface (100+ providers). Ships with telemetry=True by default; alpi flips it off in llm.py::_silence_litellm() so no request phones home. Regression test: tests/test_llm_privacy.py. |
| playwright | Medium-high | Runs a full Chromium (~230 MB) that loads arbitrary web content. Chromium's own sandbox is the line of defence at that layer; alpi adds nothing on top. Used only by the browser tool. |
| playwright-stealth | Low | Small patch set on navigator.webdriver and friends. Reverse-engineered detection bypass; breaks occasionally when detection vendors tighten. |
| pillow | Medium | Image parsers have a long history of CVEs. Keep on the latest minor; pip-audit catches known issues. |
| faster-whisper | Low | Bundles CTranslate2 native code. Models are downloaded from HuggingFace on first use; inspect the model hash if paranoia calls for it. |
| edge-tts | Low | Reverse-engineered unofficial Microsoft Edge TTS endpoint. Small code, but the endpoint can change; have a plan B (say on macOS, espeak on Linux) ready. |
| textual | Low | Pure Python, active, stable API surface we pin to. |
| litellm's transitive tree (openai SDK, anthropic SDK, etc.) | Low-medium | Flows through. pip-audit covers. |
| httpx, rich, click, pyyaml, python-dotenv, prompt_toolkit, croniter, html2text, ddgs | Low | Small or stable or both. Rarely updated, rarely break. |
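For the faster-whisper row, "inspect the model hash" can be as simple as computing a SHA-256 over the downloaded file and comparing it with the value published alongside the model. The helper below is a generic sketch; where the model file lives on disk is not specified here, so the path is left to the caller.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks matters because model files are large; loading one whole just to hash it would waste memory.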

Policy

Known gaps