ARCHITECTURE

Code structure, turn loop, memory, sessions, gateway, scheduler, MCP, logging.

08 / 16·reference·v0.9.26

Living technical reference for alpi at HEAD. Describes only what currently ships — historical decisions live in commit messages, planned work lives in ROADMAP.md.

Audience: any developer (or LLM) reading this codebase from cold.

What alpi is

alpi is a local-first personal AI agent. It has a Textual TUI in the terminal, a Tauri desktop app (and a planned mobile client) that talk to the daemon over the host plane (Unix socket locally, WebSocket remotely), Telegram/IMAP/Gmail/Matrix gateways hosted by the alpi daemon, inline-learning memory, scanner-gated live skills, multi-provider LLM support via LiteLLM, read-only research, write-capable delegation, scheduling, MCP integration, and ALP for private agent-to-agent links.

The architectural constraint is sovereignty: state is local, identities are per-profile, network trust is explicit, and operational surfaces stay small enough to audit. The product is intentionally not a generic agent suite, marketplace, or hosted router.

Principles

alpi is published by Satoshi Ltd. and inherits the company's six operating principles (Privacy by Design, User Sovereignty, Security First, Open Source, Zero Knowledge, Digital Sovereignty). See the Why alpi is built like this section in README.md for the mapping between principle and code. The conventions below are the engineering expression of those principles — not separate from them.

Code conventions

No human-facing comments in alpi/ source. The reader is an LLM. Narrative prose, banner dividers, section labels, restatement docstrings — token tax. See feedback_no_human_comments.md in agent memory for the full rule. Tests, docs, and tool description strings are out of scope (those serve other audiences).

English only. All text inside alpi/ (code, docstrings, prompts, tool descriptions, error messages, seed comments) is English. The LLM reads these every turn; embedding Spanish nudges replies toward Spanish. User-facing runtime output follows the user's language.

No comments without "why". A comment survives only if removing it would mislead a future reader into a wrong edit or waste their time re-deriving an external fact. or-chains and try/except blocks are self-evidently intentional; documenting them is fluff.

CLI surface

Stable verbs shared across groups so a user doesn't relearn per feature.

alpi                           launch the TUI
alpi -c / --continue           resume the last session in the TUI
alpi -p <name>                 profile flag, combinable with any command

alpi chat                      alias for `alpi`
alpi chat --once "<text>"      one-shot turn to stdout (pipe-friendly)
alpi chat --once ... -c | --session <id>   continue the last / a specific session (one-shot)
alpi chat --once ... --emit-events     INTERNAL — gateway subprocess contract
alpi chat --once ... --no-save         INTERNAL — do not write a session file

alpi setup                     interactive menu: model / gateways / voice / MCPs /
                               peers / workgroups / sandbox / service /
                               health check / cleanup /
                               delete profile (non-default only)

alpi doctor                    live health check (Telegram getMe, IMAP login,
                               Gmail token refresh, MCP handshake, service PID);
                               exits 1 on any failure, 0 otherwise

alpi logs                      tail every subsystem log merged by timestamp
  --source {service|gateway|schedule|agent|approval}  restrict to one subsystem
  -n N                                         last N lines (default 100)
  -f                                           follow mode (poll every 1s)

alpi profile list              list profiles, mark the active one
alpi profile create <name>     bootstrap a new profile tree
alpi profile remove <name>     delete after safety checks + confirm

alpi daemon install|uninstall                  register / unregister the launchd plist / systemd unit
alpi daemon start|stop|restart|status          lifecycle of the single per-machine daemon
alpi schedule run-once|fire <id>                manual cron tick / ad-hoc job fire (operational, not lifecycle)

alpi peers list                list pinned ALP peers for this profile
alpi peers key                 print this profile's ALP public key
alpi peers add <id> <pubkey>   pin a peer (prefer the wizard for capability selection)
alpi peers remove <id>         unpin a peer
alpi peers ping <id>           live probe via link.ping

alpi workgroup list                                list workgroups (hub-of + member-of)
alpi workgroup show <wg_id>                        detail + decrypted transcript
alpi workgroup create <name> --member <id|pubkey>  hub-side create (auto-grants verbs to invited peers)
alpi workgroup join <hub_peer_id> <wg_id>          subscribe to a peer-hosted workgroup
alpi workgroup post <wg_id> <text>                 encrypt + post; cost is auto-declared in PR 5
alpi workgroup pull <wg_id>                        fetch new posts and decrypt; cursor advances
alpi workgroup pause|resume|leave <wg_id>          membership ops
alpi workgroup kick <wg_id> <member-id|pubkey>     hub-only; rotates the group key

Shape rules: containers (profile, peers, workgroups) get list/create/remove (or add/remove). The daemon gets start/stop/restart/status/install/uninstall under alpi daemon; the same lifecycle is also reachable from alpi setup → Services → Daemon (default profile only) so users have one canonical place. The first alpi setup auto-installs the daemon — no opt-in step. Per-profile services (gateway, schedule, alp, workgroups, host) toggle from alpi setup → Services → Subsystems or directly via the service: block in each profile's config.yaml. Interactive wizards live exclusively under alpi setup; never add a per-feature wizard command.

Command ordering in --help is frequency-first, not alphabetical: chat → setup → doctor → logs → profile → peers → workgroup → schedule → daemon. See _OrderedGroup in cli.py.

alpi/ui.py is the shared interactive layer. Raw questionary.* is forbidden outside it. Helpers: banner, menu, text, password, confirm, row, ok/fail/warn/dim/saved/cancelled. The close item is added automatically with value None (callers treat None as "out").

Menu close wording: top-level (alpi setup) → Exit. Sub-menus (Gateways:, MCP servers:, Manage saved keys) → ← Back. Wizard aborted mid-flow → cancelled. Mixing Exit/Back/Cancel in one context is a bug.

File layout

alpi/
├── __init__.py             __version__
├── cli.py                  entry point, --continue, --profile resolution
├── engine.py               turn runner, interrupt flag, tool loop
├── llm.py                  litellm stream() / complete() wrappers
├── session.py              Turn / ToolLog dataclasses, save/load
├── memory.py               MemoryStore (3 files, two-tier dedup, .bak)
├── home.py                 profile path resolution
├── config.py               YAML load/save, defaults, deep merge
├── ui.py                   shared wizard/menu primitives
├── service.py              unified orchestrator — runs every enabled subsystem on one asyncio loop; install/uninstall launchd / systemd unit per profile
├── ledger.py               daily spend ledger (logs/ledger.json: live counters + 30-day per-day history) + profile cap gate
├── outputs.py              persistent inbox JSONL store (notify / send_message + schedule failures)
├── status.py               canonical /status rows (TUI + Telegram share this)
├── prompts/
│   ├── default_agent.md
│   └── system_prompt.md
├── providers/              metadata for the model picker
│   └── {anthropic,openai,google,groq,openrouter,custom}.py
├── tools/
│   ├── base.py             Tool ABC + ToolResult
│   ├── _state.py           ContextVar-backed emit / interrupt / usage (per-thread isolated for batch sub-agents)
│   ├── _paths.py           resolve_path + sensitive-path denylist
│   ├── _guards.py          terminal denylist, SSRF, prompt-injection scan
│   ├── _budget.py          per-result char cap for LLM context (100K default, per-tool override)
│   ├── _osv.py             OSV malware query for PyPI/npm names before skill/MCP install
│   ├── _sandbox.py         OS-level sandbox wrapper (opt-in)
│   ├── skill.py            create/edit/patch/add_file/remove_file/delete/list/view + scanner + quota
│   ├── search.py           content + filename search (rg + stdlib fallback)
│   ├── research.py         read-only sub-agent (depth: quick/normal/deep)
│   ├── terminal.py         run/background/status/output/kill
│   ├── notify.py           native push to the owner's apps (delegates to send_message channel=alpi)
│   └── … (read_file, write_file, edit_file, todo, web_*, schedule,
│         memory, session_search, send_message, email, config)
├── tui/                    Textual app, widgets, screens, theme
├── gateway/                inbound platforms (Telegram / IMAP / Gmail / Matrix), hosted by the alpi daemon
├── scheduler/              cron + once jobs, hosted by the alpi daemon
├── mail/                   mail backends (imap.py — IMAP+SMTP; gmail.py coming in T)
├── mcp/                    MCP client (stdio JSON-RPC) + registry
├── alp/                    Alpi Link Protocol (spec: docs/ALP.md)
│   ├── keys.py            Ed25519 identity at {home}/alp/secrets/alp_key.{pem,pub}
│   ├── envelope.py        build/sign/verify JSON-RPC envelope + replay cache
│   ├── peers.py           {home}/alp/peers.yaml load/save + capability check
│   ├── server.py          Unix-socket listener, fail-closed dispatch
│   ├── client.py          one-shot call with typed errors (TargetOffline, RemoteError)
│   ├── handlers.py        link.ask / link.cancel — engine integration
│   ├── mention.py         @peer parser + executor (shared by TUI + gateway)
│   ├── pending.py         pending invites store (unpinned-sender capture)
│   └── setup.py           `alpi setup → Peers` wizard
├── host/                   control plane for desktop / mobile clients (default profile only)
│   ├── server.py          Unix-socket JSON-RPC server (no envelope, no Noise — fs perms = trust)
│   ├── handlers.py        read verbs (host.workgroup.transcript, host.sessions.*)
│   ├── chat.py            host.chat.send (streaming) + host.chat.cancel
│   ├── config.py          mutation verbs (host.providers.*, host.peers.*, host.profile.*, host.mcp.*, host.gateway.*, host.sandbox.*, host.voice.*)
│   ├── devices.py         host.devices.* pairing-token lifecycle
│   ├── attachments_rpc.py host.attachments.{stage,fetch} — stage uploads in, fetch serves a tool-produced output attachment's bytes out (scoped to the profile's workspace/home/temp) so rich clients render images inline + other files as a metadata chip; text surfaces get a shared listing
│   ├── network_rpc.py     host.network.{status,set_advertised,restart_host_server} — pairing endpoint query + override (parity with `alpi setup → devices → network`); scope classified by host character via network.classify_scope (tailscale / lan / custom / docker) so clients don't surface the "configured" resolution-path detail
│   ├── probes.py          host.gateway.probe, host.peers.ping, host.model.ctx_window
│   ├── schedule.py        host.schedule.{list,remove,set_paused,fire}
│   ├── outputs.py         host.outputs.{list,read,mark_read,mark_all_read,delete}
│   ├── daemon.py          host.daemon.{restart,update}
│   ├── device_state.py    device-facing profile state (profiles, summaries, storage, gateways, skills, workgroups)
│   ├── events.py          host.events.subscribe + thread-safe emit() for daemon-pushed updates
│   ├── workgroup.py       transcript decryption (hub + member shapes)
│   └── sessions.py        plaintext session list / read
└── knowledge/              `alpi_knowledge` answer packs — Markdown the tool reads (see docs/SKILLS.md)

Runtime state (skills, sessions, memories, logs, ALP peers, keys) does not ship with the package — it's generated per profile under ~/.alpi/. The alpi/knowledge/references/ directory holds the answer packs the alpi_knowledge tool serves; there is no bundled skill namespace. See Profile home layout immediately below. The skill tool (alpi/tools/skill.py) manages user-created skills that live at {home}/skills/<category>/<name>/.

Profile home layout (~/.alpi/ or ~/.alpi/profiles/<name>/)

~/.alpi/                     default profile root
├── .env                    API keys, gateway tokens, allowlists
├── config.yaml             model + tools + tui + mcp + gateway
├── memories/               USER.md, MEMORY.md, AGENT.md (+ .bak)
├── skills/<category>/<name>/    SKILL.md + scripts/ + references/ +
│                                 assets/ + secrets/ (0700) + state/ +
│                                 .gitignore
├── sessions/<id>.json      turn-based session log (TUI / desktop / `--once`)
├── rag/                    local RAG over the workspace (BA)
│   └── store.sqlite        sqlite-vec index — workspace_files / _chunks / _vec
├── mentions/<sender>.json  per-sender @-mention threads (cap 20 turns), receiving side
├── gateway/                inbound transport state + chat sessions
│   ├── telegram-state.json, imap-state.json, …   per-platform offsets, last-uid, etc.
│   └── sessions/<id>.json  Telegram / email / webhook chat logs (hidden from local listings)
│       └── _map.json       chat_id → session_id pointer
├── run/                    background process registry, gateway/schedule pids
├── alp/                    ALP state — keypair, peer list, socket, pid
│   ├── peers.yaml         pinned peers (pubkey + allow + optional address)
│   ├── alp.sock           Unix-domain socket, 0600, only while listener runs
│   ├── alp.pid            listener pid
│   └── secrets/alp_key.{pem,pub}   Ed25519 identity (private 0600, public 0644)
├── host/                   control-plane state (default profile only)
│   └── host.sock          Unix socket the local desktop connects to (mobile uses the WebSocket)
├── outputs/                persistent inbox for proactive agent messages + schedule failures
│   └── outputs.jsonl       JSONL store (≤500 rows, atomic compaction)
└── logs/                   service.log (daemon-wide; lives only at the root, NOT
                            duplicated per profile), agent.log + approval.log
                            (per profile — only the default profile's pair is at
                            this level), ledger.json, compaction.jsonl, runs.jsonl

~/.alpi/profiles/<name>/     same layout MINUS service.log; agent.log + approval.log
                             are emitted under each profile's own logs/

Core systems

Engine loop (alpi/engine.py)

Per turn: append user message → loop {LLM stream → emit deltas → exec tool calls → append tool results} until the LLM stops emitting tool calls OR the effective step ceiling is hit — max_steps_per_turn (default 40), raised to 1000 for free (zero-priced) or local/ollama models when left at the default; an explicit value is always respected. interrupt_requested is polled at three checkpoints (between iterations, mid-stream, between tool calls). A turn lock serializes concurrent runs so a delayed research tool from the previous turn can't bleed into the next.

Events emitted to the UI sink: user, reasoning_delta, assistant_delta, assistant_done, tool_start, tool_state, tool_end, usage, error, done, interrupted. The TUI consumes them; the gateway subprocess consumes a subset via JSON-lines.

Cross-turn resume. A chat is not a long-lived object: each turn spins up a fresh Engine and rehydrates the session from disk (_hydrate_from_path in cli.py, shared by TUI --continue, the host chat, and the gateway; the desktop "edit message" rewrite path mirrors it in host/chat.py). The model context is rebuilt from the prior replayable turns — those that ended in a final reply or produced a file; a turn aborted before its reply (no assistant text, no output files) is dropped, so a resumed session never re-answers a dangling request. Each replayed turn contributes its user text (plus an input-attachment marker [attached: name (mime)]) and assistant text (plus a produced-file marker [produced this turn — reuse the absolute path…: name → /abs/path]). Tool calls and tool results are deliberately not replayed — they would blow the context budget — so an agent does not remember what it searched, read, or analyzed last turn, only its final reply and the absolute paths of the files it produced. A multi-turn edit ("now relight it at sunset") reuses the produced path surfaced by the marker, not a remembered tool output; an agent that needs an earlier tool's result across turns must re-run the tool or rely on a produced file.

The system prompt for each turn is built from: AGENT.md (agent profile — voice, style, identity) → base prompt → environment block (workspace, profile home, path rule) → platform hint (_platform_hint() — injects per-surface guidance when ALPI_PLATFORM is set by the caller: cron, telegram, email, gmail, matrix; empty for TUI) → skills index (auto-injected by alpi.tools.skill.skills_index_block) → USER.mdMEMORY.md.

The gateway (alpi/gateway/run.py) sets ALPI_PLATFORM=<msg.platform> on every spawned subprocess so Telegram replies arrive Markdown-aware and email replies arrive plain-text-only. The scheduler (alpi/scheduler/run.py) sets ALPI_PLATFORM=cron so scheduled jobs run knowing no user is present and they cannot ask for clarification. Each fire runs as a subprocess capped at job_run_timeout(job) seconds — job.timeout if set, else DEFAULT_RUN_TIMEOUT_SECONDS (600), clamped to [30, MAX_RUN_TIMEOUT_SECONDS] (3600). The cap is a runaway/cost guardrail for unattended runs, not a hint that jobs must be short; heavy jobs (deep research, multi-step publishing) opt into a longer budget via schedule(add|update, timeout=…).

Cron jobs with no_agent: true skip the LLM entirely. The prompt is shlex-tokenized and exec'd directly (shell=False); ${ALPI_HOME} expands to the profile home and the profile's .env overrides inherited env keys so skills find their declared requires_env. A form-based allowlist enforces that the command is python[3] [flags] <script> or <script> invoked directly, where <script> resolves to <home>/skills/<category>/<name>/scripts/…; non-python executables and -c/-m inline-code flags are rejected at both schedule(add) time and inside the scheduler before exec. Use this for deterministic skills (sync, file processors) — saves both tokens and the agent boot latency per fire.

LLM transport (alpi/llm.py)

Thin wrapper over litellm.completion. stream() is an async generator yielding {text_delta, reasoning_delta, tool_calls_delta, finish_reason} per chunk plus a final {final, tool_calls, input_tokens, output_tokens, cost_usd}. complete() is the non-streaming variant used by research. _silence_litellm() runs at import time to mute LiteLLM's startup banner via FD-level redirect (Textual is sensitive to stdout pollution).

Memory (alpi/memory.py)

Three files: USER.md (facts about the user), MEMORY.md (env quirks, commands, incidents), AGENT.md (the agent's own profile — tone, style, identity, language). § entry delimiter, char limits USER_CHAR_LIMIT = 3000 / MEMORY_CHAR_LIMIT = 5000 (see alpi/memory.py; AGENT.md is free-form prose with no cap). Accent+case+punctuation-insensitive dedup, plus token-Jaccard dedup at 70% max-containment to catch paraphrases. .bak snapshot before every mutating write. Approach C: every mutating call returns the full current state of the target file so the agent sees its own write in the same turn.

v2 quality metadata. Each entry carries a trailing <!-- alpi-meta conf=... captured=... reinforced=... --> comment that is stripped before the entry reaches the system prompt. conf is low / normal / high (default normal). Near-duplicate writes reinforce the existing entry (bump reinforced, upgrade low → normal at ≥ 2) instead of appending a paraphrase. Low-confidence entries with zero reinforcements expire after LOW_CONFIDENCE_MAX_AGE_DAYS = 30 (constant in alpi/memory.py; keep it fixed unless operational traces justify tuning). The memory tool's safety scanner reuses the skill scanner patterns and adds invisible/bidi unicode detection (U+200B–200F, 202A–202E, 2060, 2066–2069, FEFF) to block Trojan-Source vectors; _operational_warning surfaces non-blocking warnings when a write looks like session state (chat_id, session_id, ISO timestamps).

Batch writes. memory(action="add", entries=[...]) accepts a list of entries for the same target in a single call. Each entry runs through cross-file and same-target dup checks independently; entries that collide are skipped with a per-line note, the rest land in one write. Replaces the pathological pattern of one add call per fact (16 calls in a single turn observed in real sessions).

Post-turn reviewer. When memory.review_interval > 0 (default 0 = off), alpi/review.py spawns a daemon thread after each turn that snapshots the user/assistant text and asks the LLM whether anything durable should be added. The reviewer is constrained to memory(action="add", ...) — never replace/remove — to prevent it from deleting unrelated entries on a bad pass.

Promotion queue (alpi/promotion.py). Auto-compaction never writes to USER.md / MEMORY.md / AGENT.md directly. After every fired compaction the engine runs a second short LLM call against the summary (system prompt CANDIDATE_PROMPT) and pushes any durable facts as candidates into <home>/memories/promotion_queue.jsonl. On enqueue, each candidate is annotated with the same preview warnings the memory tool computes at write time — operational-state heuristic, cross-file duplicate check, safety scan. The queue is bounded (MAX_PENDING = 200 per profile) and pending entries expire after MAX_AGE_DAYS = 30. Per-record fields in the JSONL: id (8-char hex), created_at (unix ts), source (compaction | reviewer | manual), session_id, model, target (USER.md | MEMORY.md | AGENT.md), text, confidence (low | normal | high), warnings (list of strings).

Two memory tool actions surface the queue, both safe for the agent to call: promotion_list (read-only) and promotion_discard(id) (drops a candidate without writing). There is no agent-callable apply. The only path that writes to durable memory from the queue is the CLI alpi memory promote — interactive review with [a]pply / [d]iscard / [s]kip / [q]uit per candidate, plus --apply-all / --discard-all for unattended sweeps. This keeps the human-in-the-loop gate genuine: the agent cannot promote facts on its own regardless of how the prompt is framed. If the underlying memory add fails (safety scan, duplicate), the candidate stays in the queue so the operator can fix and retry.

Path resolution (alpi/tools/_paths.py)

Single entry point resolve_path(path):

  1. expanduser().
  2. Relative paths root at the active workspace (cfg.workspace or cwd fallback).
  3. Resolve symlinks.
  4. Reject if the resulting path matches any sensitive-path entry (denylist below) — ValueError.

Denylist: /etc/, /boot/, /sys/, /proc/, /usr/lib/systemd/, /System/, /private/etc/, the docker sockets, ~/.ssh/id_*, ~/.ssh/authorized_keys, *_key, *_ed25519, *.pem/.p12/.pfx, ~/.aws/{credentials,config}, ~/.gnupg/, ~/.netrc, ~/.npmrc, ~/.pypirc, ~/.pgpass, ~/.config/{gh,gcloud}/, shell rc/login files (.bashrc/.zshrc/.zprofile/…), ~/Library/Launch{Agents,Daemons}/, profile .env/config.yaml, and skill secrets/ dirs. Both pre-resolve and post-resolve forms are checked (macOS /var/private/var symlink case).

suggest_similar_paths(target) lists the parent directory and fuzzy-matches siblings by basename substring/prefix. Used by read_file, edit_file, and search to turn dead-end errors into actionable suggestions.

alpi/tools/_lint.py::lint_content(path, content) runs a parser-based syntax check before every write_file / edit_file lands on disk. Parsers by suffix: .pyast.parse (stdlib), .jsonjson.loads (stdlib), .yaml/.ymlyaml.safe_load (PyYAML, already a dep), .tomltomllib.loads (stdlib on 3.11+, with tomli declared as a conditional dep for 3.10). Other suffixes pass through. Failures return a one-line error with the source line/col and the write is refused — the original file (if any) is untouched. Catches the class of bug where a malformed jobs.json, config.yaml, or skill script silently breaks a downstream consumer.

alpi/secrets_io.py::safe_write_secret(path, content, mode=0o600) is the canonical write path for any credential file. It uses tempfile.mkstemp (O_EXCL + 0o600 at creation, random unique name in the target dir), then os.replace onto the target — no TOCTOU window where the file exists at umask perms, and a stale <target>.tmp lingering at looser perms cannot compromise the write because the helper picks a fresh random name. Used by model_selector._atomic_write_env (.env writes), mail/gmail_auth._save (gmail token), alp/pending.save (pending-peers yaml), and alp/keys.create (ALP private key).

Tool registry (alpi/tools/__init__.py)

register(cls) adds a Tool subclass to the dict, schemas() emits the OpenAI function-calling shape, execute(name, args) runs by name with full error capture. The registry is assembled from the sibling tool modules in alpi/tools/__init__.py, including the Playwright-backed browser tool. search_workspace and index_workspace register first so they appear at the top of the schema list (semantic recall is the right default for "what does my file say about X" questions).

Local recall (alpi/core/ + alpi/tools/workspace.py)

Per-profile semantic search over the user's local files (BA). Two agent tools:

index_files(home, files, *, ocr) is the shared per-file indexer that backs learn_file: same readers/chunker/embedder/tables as index_workspace, incremental (mtime/size skip), purges passed files that no longer exist, no global orphan sweep. .alpi/documents/ is the one .alpi subtree the index walk does NOT skip, so a full index_workspace re-discovers learned docs and keeps them in sync instead of purging them as orphans.

Supported formats: markdown / text / source / configs (stdlib read), HTML (html2text), PDF (pypdf for text-layer, RapidOCR fallback when ocr=true and pypdf extracts < 50 chars), DOCX (python-docx), EPUB (ebooklib), images (PIL + RapidOCR — only with ocr=true). OCR backend is rapidocr-onnxruntime (ONNX port of PaddleOCR, no torch dependency).

Shared store primitive (alpi/core/store.py). open_store(home) returns a sqlite3.Connection with the sqlite-vec extension loaded. Designed to host other shapes later (workgroup search, future entity memory) — they bring their own table schemas.

Embedder (alpi/core/embed.py). Embedder Protocol; default FastembedEmbedder wraps the ONNX export of sentence-transformers/all-MiniLM-L6-v2 (384-dim, ~90 MB, no torch). Numerically equivalent to the original sentence-transformers checkpoint but ~10× lighter at runtime. Lazy-loaded under a threading.Lock so concurrent first-touch calls serialize on a single model instance instead of racing.

Session recall (alpi/tools/recall.py)

Recall over past conversations, the conversational-memory peer of the workspace RAG, in three layers: lexical find (session_search, term counts over sessions/*.json), exact browse (session_read, no model call), and opt-in semantic search (index_sessions / recall_sessions) for fuzzy "when did we discuss X / what did we decide about Y".

Forgettable. Recall is a derived view, so forgetting is real: deleting a session (host.sessions.deletehost/sessions.py::delete_session) purges its rows via recall.forget_session, and index_sessions orphan-sweeps any tracked session whose file is gone. No auto per-turn injection — retrieval is explicit, like the workspace tools.

Workgroup transcript search (alpi/tools/workgroup_search.py)

The third retrieval surface on the same store: semantic search over hub-owned workgroup transcripts. Workgroups are hub-owned by design, so this is profile-local and hub-only — the hub decrypts its own transcript and indexes it; there is no cross-peer / federated search and no global "search all my peers' workgroups". Two tools:

Forgettable. Removing a workgroup purges its index in both delete paths — the host RPC (host/workgroup_admin.py::_remove) and the CLI (alpi workgroup remove) call workgroup_search.forget_workgroup; index_workgroups orphan-sweeps any tracked workgroup whose directory is gone. No auto-injection into workgroup turns. ALP encryption/transcript behaviour is untouched — this only reads through the existing decrypt path.

Asset prefetch (service.py::_prefetch_assets). Scheduled by _main_all at boot+600 s — deliberately past the client-reconnection rush (at boot+5 s the Chromium unzip + ONNX load starved small Docker hosts, which read as "the machine is blocked"). Gated by service.prefetch on the root profile: auto (default) fetches the fastembed weights only when some profile has a non-empty rag/ index, and Chromium only when some profile leaves the browser tool un-denied; all forces both; off — the default under ALPI_PLATFORM=docker — skips prefetch entirely. Every asset still fetches lazily on first use, so off costs latency, never functionality. ensure_weights_cached() downloads through a throwaway embedder and releases the ONNX session instead of leaving ~150 MB resident in every daemon; the first real embed() lazy-loads from the disk cache. ensure_chromium() warns and stays retryable when the install fails, and after a successful install prunes stale chromium* builds (each playwright bump orphans ~520 MB; firefox/webkit are never touched, and nothing is pruned unless the wanted build exists on disk). RapidOCR remains first-use. Concurrent loaders keep the double-checked locking (_load, _ocr_reader, ensure_chromium).

Skills

Live under <home>/skills/<category>/<name>/. Required SKILL.md plus optional scripts/, references/, assets/, secrets/ (mode 0700, gitignored, scanner skipped), state/ (gitignored, scanner skipped, runtime persistence). .gitignore auto-written on create with secrets/\nstate/\n.

Live by default — no _pending/ approval stage (was tried in v0.1, removed in v0.2 as friction-without-benefit).

Frontmatter (auto-populated on create): name, description, category, version, origin: agent|user, created_at, requires_env, tools, keywords, optional output_schema. 13 fixed categories including miscellaneous as the fallback. secrets/ is filesystem state, not frontmatter: it is created lazily when a skill writes a secret file. output_schema is one-line JSON and uses a deliberately small subset (type, properties, required, items, enum) so the runtime stays dependency-light.

Security scanner (~50 patterns, _DANGER_PATTERNS in alpi/scan.py — the shared scanner library used by skills, memory writes, and the recalled-memory guard): destructive shell, credential exfiltration, prompt injection, persistence (cron/launchd/systemd/authorized_keys/sudoers/shell rc), reverse shells, tunneling, obfuscation (base64/eval/exec/compile), process exec, hardcoded credentials (API keys, OpenAI sk-, GitHub ghp_, AWS AKIA), system-password-file paths, deep traversal. Runs on every create/add_file/patch for files NOT in secrets/ or state/.

Atomic writes everywhere (tmp sibling + os.replace). .bak next to SKILL.md on every edit/patch. Quota: max 40 agent-owned skills, enforced at create.

Auto-injected into the system prompt (skills_index_block(home)): every session start, all installed skills are listed by category as name: description entries, prefixed by a directive that says "check this list before reaching for general tools". Without this nudge, mimo-class models routinely went straight to web_search/terminal even when a perfect skill existed.

TUI integration: when a terminal command's path matches .alpi/(profiles/<p>/)?skills/<cat>/<name>/..., arg_hint rewrites the ToolCard label as skill: <name> (or skill: <name> · <script> when the script is the full path). Tool name stays terminal; the rewrite is display-only.

Execution: skill(action="run", name=...). Single canonical ad-hoc path. If scripts/run.py exists the action validates the skill, then spawns the script via subprocess.run with cwd = skill dir, env += {ALPI_HOME, ALPI_SKILL_NAME, ALPI_SKILL_DIR}, 600s timeout, and the skill's requires_env checked up-front. If the skill declares output_schema, stdout must be JSON and is validated before the call succeeds. Scripts are normal Python; built-in tools and MCP methods are not importable Python APIs. No script → SKILL.md is returned with a [skill X has no scripts/run.py — follow these instructions] prefix so the agent follows the prose and calls the real tools. Scheduled prompts should call this action instead of reimplementing the skill by hand; the scheduler still enters through alpi chat --once --emit-events --no-save.

Structured composition: skill(action="invoke", name=...). Same subprocess/runtime path as run, but stricter: the callee must ship scripts/run.py, must declare output_schema, and stdout must satisfy it. This keeps skill-to-skill composition machine-readable and prevents prose-only skills from pretending to be callable subroutines.

Scripted harness: skill(action="test", name=...). Thin validation layer over the same runtime path. It exists so chat/scheduler/desktop can exercise a scripted skill and verify its declared output_schema without inventing a second testing runtime. If a CLI wrapper lands later, it should call this action instead of duplicating logic.

Research (read-only sub-agent, alpi/tools/research.py)

Spawns a sub-agent with a read-only toolset (web_search, web_fetch, web_extract, read_file, search). Returns a single synthesised report; the main agent never sees the intermediate tool trace.

Depth tiers instead of a numeric max_steps: depth="quick"|"normal"|"deep". The integer per tier comes from tools.research.{quick,normal,deep}_steps in config.yaml (defaults 8 / 15 / 30). Locks the model to three buckets (quick = single-answer, normal = comparative, deep = exhaustive) while letting the user re-tune all three from one place.

Synthesis fallback: when the budget runs out, research forces one final no-tools llm.complete() with "stop investigating, report now". Avoids the "[research gave up]" footgun where the main agent retries the whole thing.

Interrupt: polls tool_state.is_interrupted() between iterations and between tools; returns [research: interrupted] on the first hit. State label during execution: <depth> · step N/M; while an inner tool runs its own emit_state label gets auto-prefixed with step N/M · … via a wrapped _emit installed for the duration of each tool-call batch (restored in a finally).

Batch mode (v0.2.18): tasks: [{brief, depth}] up to 3 runs concurrently — see the Delegate section below for the shared ThreadPoolExecutor design (same pattern applies here).

Attachments (alpi/attachments.py)

host.chat.send accepts attachments: [{path, mime?, name?}]. The engine validates them (att.validate — magic-byte sniff for image/PDF, NUL/control-ratio guard for binary-as-text, per-type size caps, allowlist: images png/jpeg/webp, PDF, and text/source incl. py/js/ts/tsx/go/rs/sh/sql) and turns them into OpenAI content-parts (build_content_parts): images → base64 image_url data parts, text/source → inline text parts, text-layer PDFs → extracted text, scanned PDFs → rendered page images for vision-capable models. A guidance text-part tells the model the files are inline so it doesn't reflexively search_workspace/index_workspace to "find" them.

Per-turn only. Bytes live only in the in-memory message. session_metadata is itself bytes- and path-free ({name, mime, size}), but the engine re-adds a best-effort local path to each persisted chat-turn attachment so clients can thumbnail history — the path may be unfetchable from another client (outside host.attachments.fetch roots) or after a staged file's TTL, so this is preview replay, not durable storage. The validated turn attachments ({name, path, mime}) are also published to a runtime-only ContextVar (tools/_state.set_turn_attachments) so a tool can resolve a turn's files. Remote clients (mobile, or desktop pointed at a remote daemon) can't hand the daemon a local path, so they upload bytes via the host.attachments.stage RPC (type-aware caps, content validated 1:1 with send) which writes to a TTL-swept temp dir and returns a daemon-side path.

Durable. learn_file (see Local recall) is the bridge from per-turn to permanent: the document is copied into the user's workspace (<workspace>/.alpi/documents/, the source of truth), while the derived RAG index lives in the profile home (rag/store.sqlite). The manifest.jsonl beside the documents is metadata only — not authoritative; the files and the index are. There is no auto-learn: attachments stay one-turn unless the user explicitly asks to learn/remember/save one.

Vision (alpi/tools/read_image.py)

read_image(path, question) runs the current (or override) model in multimodal mode on an image and returns a text answer. path can be a local file OR an http(s) URL — URLs go through check_url() for SSRF (metadata hosts + private IPs blocked, redirects re-validated via httpx event_hooks).

Magic-bytes sniff accepts PNG / JPEG / GIF / WebP / BMP plus SVG (text-sniff for <svg); rejects bytes that don't match a known header even if the extension agrees. 20 MB cap on file and on download payload.

No pre-flight vision-capability check — LiteLLM's supports_vision() is wrong for openrouter/... prefixes and would bounce real vision models. If the call fails we surface the error with a hint pointing at /model when the message mentions image / vision / multimodal.

Model override via tools.read_image.model in config (same pattern as web_extract). When set, the tool tries the override first; on failure it retries with the main model and prefixes the answer with [fallback: <override> unavailable, used main model]. Useful for "main agent on a cheap text model, keep an expensive vision model just for images".

Same usage / cost plumbing as research and delegate (record_usage). Auto-resize to cut tokens is tracked in ROADMAP §S for v0.3.

Delegate (write-capable sub-agent, alpi/tools/delegate.py)

Sibling to research, but can mutate: spawn a focused sub-agent with a chosen toolset, get back a summary. Used when a task would otherwise flood the parent context (multi-file refactors, fetch+parse+write pipelines, skills that generate several output files, iterative debug loops).

Toolsets (callable presets via the toolsets param, default ["file", "web"]):

Blocked for sub-agents: delegate (no recursion), memory, skill, schedule, notify, send_message, email, session_search, session_read, todo (shared global state). research is not in any preset either — if you need deep investigation inside a delegate task today, do it in the main agent first and pass findings via context.

Budget: hardcoded MAX_STEPS = 30. No config knob — it's a ceiling, not a target (sub-agent stops when done). If a real case needs more, bump the constant.

System prompt is built from a single template plus the workspace root (when set): relative paths resolve under workspace, absolute paths go where the goal says, and the sub-agent is explicitly warned not to invent /workspace/... style roots.

Batch parallel mode (v0.2.18). Both research and delegate accept tasks: [...] (up to 3) and run them concurrently via ThreadPoolExecutor(max_workers=3). Isolation is provided by _state.py: _emit, _interrupt_getter, _usage_sink are contextvars.ContextVar, so each worker thread sees its own values without racing on module globals. Workers re-seed interrupt_getter + usage_sink from the parent context (Python's ThreadPoolExecutor doesn't propagate ContextVars automatically) and install a per-task prefixed emit so TUI progress lines read [i/N] <tag> · <msg>. Results aggregate into one markdown report with per-task sections; per-task failures are captured inline as [failed: <error>] instead of aborting the batch. Cap is hardcoded at 3 — bumping would need a config knob and would multiply LLM cost linearly; not a default worth moving.

TUI (alpi/tui/)

Textual 8.2.x. Layout: AlpiTopBar (identity) + chat scroll (VerticalScroll.anchor() auto-follows new content) + AlpiHeader (status: model · ctx · cost) + #chat-input (flat slab, accent-tinted bg on focus).

Theme (themes.py): build_theme(accent, dark) factory returns a Textual Theme from a single accent hex + dark/light flag. Registered in AlpiApp.__init__ (not on_mount — child widgets read theme_variables during their own mount). Widgets read self.app.theme_variables at render time instead of taking colors as params, so tui.accent or tui.theme changes propagate without rewiring.

Live tool cards (ToolCard in widgets.py): single line, spinner + elapsed at 6 Hz, tool_state labels while running, switches to result line on completion. uses $accent-darken-1 for non-error, $error for failures.

Assistant streaming: AssistantMessage uses Textual's native Markdown.get_stream() — async queue that coalesces fragments when deltas arrive faster than the widget can render. Parser runs on new fragments only, not the full buffer.

Reasoning surface:

Persistence contract (cross-surface). The engine consolidates the whole turn's reasoning — reasoning_delta thinking + the inter-tool prose — into Turn.reasoning (str), and records Turn.reasoned_s (float) = the reasoning span from turn start to the first tool boundary, or to the first final-answer text token when there are no tools; it excludes both tool execution and final-answer streaming so the duration isn't inflated by a long-running tool or a long reply. Desktop/mobile render a collapsible "Reasoned for Ns" block from Turn.reasoning, falling back to joining ToolLog.reasoning for turns logged before the field existed; the TUI renders the per-tool ToolLog.reasoning interleaved. ToolLog.reasoning (first tool of each batch) remains the legacy per-tool fallback.

Slash commands: /help, /memory, /tools, /mcps, /status, /skills, /clear, /new, /compact, /model, /exit. All surface-panels are FloatingPanels on the overlay layer docked above the input strip, dismissed by Esc or click-outside. Header ($surface-lighten-1 tint) shows the command name; body scrolls with max-height: 18. The info panels (screens.py) are read-only; /help and /model (model_panel.py) are interactive — subclasses focus an OptionList / Input in on_mount via call_after_refresh so selection and navigation work while the panel floats. Configuration verbs (workspace, gateways, sandbox, …) live exclusively in alpi setup — the TUI is for chat and inspection, not for editing the profile.

Interrupt on new input: typing while a turn runs cancels it. engine.interrupt_requested polled at 3 points; long-running tools (research) poll tool_state.is_interrupted(). Skipped tool calls get a [skipped — user interrupted] tool message to preserve OpenAI's pairing invariant.

Ctrl+Y copies last assistant reply (pbcopy/wl-copy/xclip/xsel/OSC-52 fallback chain). Ctrl+L clears.

Daemon (alpi/service.py)

One alpi daemon per machine, every profile inside. A single launchd plist (com.alpi.daemon) on macOS or systemd-user unit (alpi-daemon.service) on Linux supervises one Python process that hosts every profile under ~/.alpi/ (default plus each profiles/<name>/) on the same asyncio loop. Per-(profile, service) tasks are independently supervised — a crash in one profile's gateway leaves siblings untouched. Tasks are named <profile>/<service> (e.g. doc/gateway, builder/alp) so logs

Per-profile services (service.{gateway, schedule, alp, workgroups, host} in each profile's config.yaml):

Default if a key is missing: every service on (so the desktop "just works" after install).

alpi.service.serve_all(root) is the foreground entry point called from alpi daemon start and from the supervising unit's ExecStart. It:

  1. Walks ~/.alpi/ (default + every profiles/<name>/) to discover profiles.
  2. For each profile reads service.* toggles; missing block → every service on.
  3. Configures the root logger at ~/.alpi/logs/service.log (stderr only when it's a TTY, to avoid double-writes under launchd).
  4. Sets the process title to alpi (daemon, N profiles) via setproctitle.
  5. Writes ~/.alpi/service.pid.
  6. Spawns one supervised asyncio task per (profile, service) and waits. _supervise wraps each one so a crash leaves siblings running.
  7. SIGTERM / SIGINT cancels every task cooperatively; PID file removed on exit.

Operational invariants of serve_all (each one is the root cause of a real production incident; do not regress):

Active home isolation. Because N profiles share one process, tools that resolve their home via home.get_home() would all see the same env vars and write to default's home. The engine wraps each run_turn in a home.set_active_home(self.home) contextvar binding (per-thread); get_home() consults this binding before the env. Without it, another profile's memory tool would write to default's USER.md. See tests/core/test_home.py for the isolation tests.

daemon_status(root) is the snapshot used by alpi daemon status and by alpi setup → Services → Daemon: PID, uptime (via ps -o etime), install backend (launchd / systemd / none), and the per-profile services map.

Host plane (alpi/host/)

Control-plane for the desktop / mobile client. Not ALP — the two share a profile but live on different sockets, with different auth models. ALP is peer-to-peer (Noise on TCP, envelope-signed, peers pinned in peers.yaml); host is client-to-daemon. JSON-RPC-shaped over ~/.alpi/host/host.sock with filesystem permissions as the trust boundary; no peer identity, no envelope, no Noise handshake. Desktop and future mobile clients talk to this API; they do not read profile files directly.

Only the default profile hosts this subsystem — the client always targets default's socket and reaches sibling profiles via the profile parameter on each verb. _run_host refuses to bind on any other profile even if the toggle leaks via manual config edit.

host.device_state owns the device-facing profile state contract: profile lists/summaries, bounded profile file reads, storage stats, gateway status/config previews, skill lists, workgroup lists, workgroup member rosters, config field edits, and local Ollama model discovery. The desktop Tauri layer keeps its existing invoke(...) command names for UI stability, but those commands proxy to host.* verbs instead of parsing ~/.alpi themselves. Mobile should use the same verb shapes rather than inventing a separate state API.

Two transports, one dispatcher:

  1. Unix socket (~/.alpi/host/host.sock, mode 0600). Local trust = filesystem perms. Used by desktop on the same machine. No token required.
  2. WebSocket (ws://<bind>:49200 by default). Used by mobile and any remote desktop. network.host is the advertised address; the bind is derived from it (see config / security): empty → auto-detected Tailscale CGNAT (100.64.0.0/10) then private RFC1918 LAN; a private/Tailscale IP → that IP; a hostname or an opted-in public IP → 0.0.0.0 (all interfaces); a public IP without host.allow_public_bind → refused (no TCP); Docker → 0.0.0.0. Loopback is never a bind target. A 0.0.0.0 bind leans on the pairing token (and a firewall/NAT) for access control, so alpi doctor warns whenever the listener binds 0.0.0.0. Per-device pairing token required in every request's params.auth_token. permessage-deflate is negotiated by default (ws_serve(compression="deflate")); JSON-RPC payloads drop 50–80% on the wire. Clients that don't negotiate fall back to raw. Mobile and desktop keep a persistent multiplexed WS pool per (ip, port, token) so RPCs don't pay a TCP+WS handshake every call — the dominant cost of "remote alpi feels slow" on Tailscale. Streams (host.chat.send, host.events.subscribe) open their own dedicated socket.

Bind and advertised endpoint are intentionally separate concerns. The daemon chooses where the host-plane server listens; Devices → Network chooses what the paired client should dial. On a normal Mac or Linux install those often collapse to the same Tailscale or LAN address. In Docker they do not: the daemon binds 0.0.0.0 inside the container while the QR advertises ALPI_NETWORK_HOST — a LAN IP, a Tailscale 100.x address, or a MagicDNS hostname that resolves to the host machine outside the container.

Wire shape (both transports):

{"id": "<reqid>", "method": "host.<noun>.<verb>", "params": {…, "auth_token": "<token>"}}

Unix socket payload omits auth_token — the local transport is sovereign and bypasses token validation entirely. WS always requires a valid token; an empty or missing devices.yaml rejects every WS request (fail-closed). The first device is minted locally over the Unix socket; there is no remote bootstrap path.

The daemon writes either a single response line or, for streaming verbs (host.chat.send, host.events.subscribe), multiple frames followed by a done frame and connection close.

This is distinct from ALP peer transport. Devices / host-plane remote access configures how paired desktop and mobile clients reach their own daemon (host.*). Peer TCP listener configures the optional ALP TCP listener other alpis use for link.* and workgroup.*.

Pairing tokens (alpi/host/devices.py)

Each remote device holds its own opaque token. The store lives at ~/.alpi/host/devices.yaml (mode 0600) as a list of {token, label, created, last_seen, role}. role is admin or member; missing or unknown values collapse to member (least privilege). The daemon validates the token in _check_token_role (alpi/host/server.py); a hit also bumps last_seen so the user sees who's active and returns the role to the dispatcher.

Three trust tiers gate every WS call:

The admin set lives in _ADMIN_METHODS; the strictly-local set in _LOCAL_ONLY_METHODS (network admin only — no role unlocks those over WS).

Token lifecycle:

host.devices.list redacts the full token to a token_id (last 8 chars) so the TUI can show paired devices without leaking secrets. The full token only escapes the daemon once, at generate time, into the one-shot QR.

Verb namespaces in current shape:

Contract. host.events. is transport, not durable history. The replay window (HISTORY_MAX = 500) is sized for reconnect catch-up within a session of activity — it can drop old rows under load and must never be the source of truth for anything a user can browse. Durable user-visible state lives in the per-profile stores that host.outputs. / host.sessions.* / workgroup transcripts read from. If a UI needs history older than the replay window, it queries those stores, not host.events.history.

Adding a new verb: create the handler in the matching host/*.py module, register on host_server.Server.register (or register_stream for multi-frame), and call from the desktop / mobile client via the platform's host-client helper. Never expose a verb outside host.* — the namespace check in register enforces it.

Gateway (alpi/gateway/)

Inbound platform listeners (Telegram long-poll, IMAP polling, Gmail OAuth, webhook stub) hosted by the alpi daemon. Each platform iterates async for msg in platform.listen(); per incoming message the gateway spawns alpi chat --once --emit-events, keeps the typing indicator on while the subprocess works, and sends only the final reply back to the gateway.

Allowlist: TELEGRAM_ALLOWED_CHAT_IDS and IMAP_ALLOWED_SENDERS in .env, fail-closed if unset. Optional per-sender gate {PLATFORM}_ALLOWED_USER_IDS (e.g. TELEGRAM_ALLOWED_USER_IDS): unset → the chat allowlist governs (any member of an allowed group can drive the agent); set → the sender's id must also be listed. Inbound text reaches the model behind an untrusted banner + injection scan. Per-platform user config under gateway.* in config.yaml: Telegram/Matrix intentionally expose no UX knobs; IMAP/Gmail expose poll_interval and mark_as_read. Typing indicators are hardcoded by platform (chat on, email off — email has no typing concept). Gateways never send intermediate tool traces; use TUI, desktop, or mobile when you want live execution UI.

Disable for a profile via alpi setup → Services → Daemon → Gateway · off (writes service.gateway: false).

One bot per profile (hard rule). Telegram long-polling allows a single concurrent getUpdates per bot token; two profiles polling the same token deadlock each other on 409. The contract is enforced at write time: alpi setup → telegram and the host RPC host.providers.set_key reject a TELEGRAM_BOT_TOKEN that is already configured in another profile, with an error naming the owner. The daemon trusts the invariant and runs each profile's listener with its own self._token (read from <home>/.env at construction — no shared os.environ). Multi-profile inbound must use one bot per profile; single bot with internal routing is not supported (would force shared offsets / allowlists / session state).

Per-profile env snapshot. Every Platform captures alpi.home.effective_profile_env(home) at __init__ into self.envos.environ (process-level vars: PATH, HOME, TZ, ALPI_PLATFORM…) overlaid with <home>/.env (per-profile secrets). Quotes in the .env file are stripped. self.env is the source of truth for all credentials and allowlist checks: Telegram token, IMAP_*, MATRIX_*, GMAIL_*, and delivery.is_allowed(..., env= platform.env). Matrix _build_client and IMAP's ImapClient. from_env_map(self.env) both read from this snapshot — no platform adapter touches os.environ directly any more. As of v0.4.52 the same contract extends to the agent toolchain: tools/email, the LLM-override paths in tools/web_extract / tools/read_image, alpi/identity.py, the model selector / TUI provider gating, and the gateway child agent (gateway/run._run_agent injects via effective_profile_env(home, extra={...})). The snapshot is frozen at construction; credential edits via the host plane write the file atomically but live listeners pick up the change only on next daemon/gateway restart.

Schedule (alpi/scheduler/)

Tick loop (default 30s) hosted inside the alpi daemon. add schedules a job (kind: cron|once, expression or after_hours). run-once ticks manually for testing. LLM time grounding: when the agent calls schedule(action='add', kind='once', after_hours=N), the engine resolves now from a single source so the agent doesn't drift.

Duplicate guard + in-place edits. add rejects a job whose (kind + cron / run_at / after_hours) matches an existing one AND whose prompt fingerprint (lowercase + whitespace-collapsed first 80 chars) collides. Pass force=true to bypass when the second job is genuinely intentional. Use update to change prompt, cron, notify, or pause state without remove/recreate churn. A job carries a single delivery axis, notify: bool (default false = silent): true pushes the reply to the owner's apps. Legacy jobs with a platform field are migrated to notify on load (platform set → notify: true). Reaching a THIRD PARTY is an explicit send_message in the prompt — that's now allowed (the old auto-delivery guard that rejected such prompts is gone).

Scheduled jobs execute through alpi chat --once --emit-events --no-save with ALPI_PLATFORM=cron. The scheduler consumes stdout events to detect tool traces, final reply text, delivery, and failure. It does not write sessions/<id>.json: cron output belongs to schedule delivery/logging, not to local TUI / desktop chat history.

Loop isolation. serve() runs tick() in a dedicated ThreadPoolExecutor(max_workers=2), and host.schedule.fire wraps fire_by_id in run_in_executor before awaiting. Both paths ultimately call subprocess.run(timeout=job_run_timeout(job)) (default 600s, per-job up to 3600s); running them inline would block every other coroutine on the daemon's asyncio loop — gateway listeners, ALP responders, and host.chat.send streams in sibling profiles all stall for the duration of the scheduled job. The dedicated executor also means the scheduler can't starve chat's default-executor turns. A regression test in tests/core/test_schedule.py::test_serve_runs_tick_off_loop_so_chat_can_progress pins the contract.

Timezone. Cron expressions evaluate against the machine's system timezone (datetime.now().astimezone() in scheduler/run.py). Jobs are stored with UTC last_run_at but fire according to local wall-clock time. Practical consequence: if you specify 10 12 * * * because you want a 12:10 reminder in Bangkok, the Mac must be set to Asia/Bangkok. Move the machine to a different timezone and the cron fires at 12:10 there, not in Bangkok. No in-job timezone override today — add it via TZ=… in the launchd plist / systemd unit if cross-timezone stability is required.

MCP client (alpi/mcp/)

Spawns user-configured MCP servers (stdio JSON-RPC, SSE planned). Their tools are wrapped and registered as alpi tools. Servers configured in config.yaml under mcp.servers.<name> (command, args, env). Management lives in alpi setup → MCPs; alpi mcp itself is not exposed on the CLI surface.

External orchestration frameworks. Alpi does not embed LangGraph, CrewAI, AutoGen, or similar graph/supervisor runtimes in core. They overlap with Alpi's own agent loop and bring a heavier dependency, state, and observability model than the local-first runtime needs. Interop belongs at the edge: expose the external workflow as an MCP server and let Alpi call it as a tool, or wrap a local workflow in a scripted skill. ALP is not the adapter layer for these frameworks; ALP is reserved for sovereign profile-to-profile collaboration across machines, while MCP is the interop layer for external runtimes and tools.

Logging (alpi/_log.py, alpi/logs.py)

Every subsystem writes to a single flat folder: ~/.alpi/logs/<subsystem>.log, rotated at 1 MB with 3 backups (MAX_BYTES / BACKUP_COUNT in _log.py). Same format everywhere (%(asctime)s %(levelname)s %(name)s %(message)s) so alpi logs can merge them by timestamp prefix. The source tag on display comes from the filename.

Three sources today (file on disk + the writer that produces it):

The alpi logs --source CLI choice list also accepts gateway and schedule. Inside the unified daemon, gateway and scheduler events route through the root logger and land in service.log — those filter values are kept so that any standalone or legacy gateway.log / schedule.log (e.g. from an older scheduler.run.ensure_running() invocation that ran out-of-process) stays selectable.

Why logs are NOT inside sessions/: sessions/ is a structured store (one JSON per conversation, indexed by id, consumed by session_search and the resume flow). Mixing freeform logs would break the glob pattern and the cleanup semantics. Logs are the index and audit trail; sessions are the content. Peers, not nested.

Why one flat folder (logs/) instead of per-subsystem dirs: tiny <subsystem>/logs/ folders with a single file each is pure noise. The service keeps non-log state in its own places (schedule/jobs.json, alp/alp.sock, service.pid at the profile root) — only the .log files consolidate.

Adding a new source is two lines: from alpi._log import get_subsystem_logger; logger = get_subsystem_logger(home, "my-sub"). alpi logs picks it up without changes; add the tag to the --source choice list in cli.py::logs_cmd if you want it filterable.

Doctor (alpi/doctor.py)

alpi doctor — live health check. Verifies each subsystem actually responds, not just that it's configured. Same entry point from the CLI and from alpi setup → Health check; the status in the setup menu row (all green / N warning(s) / N failing) runs the full check too.

Checks:

Parallelism: the four network-bound tasks (Telegram/IMAP/Gmail/MCPs) submit to a ThreadPoolExecutor(max_workers=8). Sync checks (model, workspace, services, security) run on the main thread while the pool works. Total wall time ≈ slowest single task, not sum — ~5-10 s on a healthy profile.

Progressive rendering: run_and_render() uses rich.live.Live — every row appears immediately with a cyan spinner, each resolves to //! as its future completes. Animation at 10 fps via a manual frame cycler (rich's Spinner objects can't be appended to Text). Layout is stable (same rows, same column widths) so the eye doesn't jump.

Exit codes: 1 if any check returns fail, 0 for warn/info/ok. Warnings don't break cron. The wizard entry ignores the exit code — it press-enter-waits so the user can read.

Ops digest (alpi/ops_digest.py)

alpi digest [--since 7d] is the read-only evidence rollup for operator decisions. It deliberately does not own new state: each section reads the primitive owned by another subsystem.

The command has two renderers: a compact Rich view for humans and --json for scripts. The JSON is a dataclass dump of the report shape. It is not an observability daemon, dashboard, recommendation engine, or telemetry channel. Tests pin the read-only contract by snapshotting the profile tree before and after a digest run.

Sessions (alpi/session.py, alpi/session_map.py)

Turn-based JSON: turns: [{at, user, tools[], assistant}] plus cumulative metrics. ToolLog carries at, name, args, result (truncated hint), ok, duration_s, reasoning (non-empty only on first tool of a batch). Empty sessions (no user message) are NOT saved.

sessions/ is local human chat history: TUI, desktop, and manual alpi chat --once runs that should be resumable. --continue, tui.auto_resume, host latest_session, and desktop profile opening all treat only kind == "chat" as resumable local history. Historical files whose first user message starts with [SCHEDULED:], [INBOUND ...], [workgroup-poller], or another system bracket are ignored by resume/profile history.

TUI resume. Bare alpi resumes the most recent session when tui.auto_resume: true; -c / --continue is the manual override.

Gateway per-chat threading. Each inbound message carries external_chat_id (a Telegram chat id, or the sender email for IMAP/Gmail). alpi/session_map.py holds a pointer map at ~/.alpi/<profile>/gateway/sessions/_map.json: {chat_id: session_id}. When the gateway spawns alpi chat --once --resume-chat <chat_id>, the CLI sets engine.session.subdir = "gateway/sessions" and consults the map — if there's a pointer, that session is loaded and continued; otherwise a fresh session starts and the pointer gets bound after save. Same mechanism across every platform; the natural semantics fall out of what each puts in chat_id: per-chat threading for Telegram, per-sender threading for IMAP / Gmail.

Gateway sessions live in their own subdir (gateway/sessions/) so they don't pollute the local TUI/desktop session list (which scans sessions/ only) and so the Cleanup → Gateway category never collides with transport state files in gateway/ itself (Telegram offsets, IMAP last-uid, …).

Scheduled jobs do not persist session files. The scheduler uses --no-save because it only needs emitted final reply/tool events for delivery and audit; keeping a resumable transcript would make background jobs appear as user chats.

/new (wired up in AK) calls session_map.forget(chat_id) — the pointer drops but the underlying session file stays on disk. Historical threads remain searchable via session_search against the local sessions/ dir; gateway transcripts are intentionally excluded from local search.

@-mention threads (alpi/alp/mention_thread.py). When peer A @-mentions peer B over ALP (link.ask), the receiving side runs a fresh Engine per turn — but B persists a small per-sender thread at <B-home>/mentions/<A>.json, capped at 20 turns. Successive mentions from the same A→B pair carry conversational memory ("what I said before" resolves) without polluting B's local --continue (which only reads sessions/). Threads are isolated per remitente. Wipe via setup → Cleanup → Mentions.

Security model

Two layers:

Threat model: prompt injection via email/web content, LLM-issued tool calls on the user's machine, direct user input (trusted), and network adversaries for ALP links. Full discussion in SECURITY.md.

Cross-cutting concerns

Profiles

alpi -p <name> resolves home to ~/.alpi/profiles/<name>/. ALPI_PROFILE env var is the same. No sticky "current profile" file — resolution is fully explicit. The single daemon (com.alpi.daemon / alpi-daemon.service) supervises every profile from one process; tasks are namespaced <profile>/<service> so they stay distinguishable in logs and asyncio.all_tasks(). Inside a turn, home.set_active_home(home) binds the per-thread contextvar consulted by home.get_home() so tools resolve to the right profile even though every concurrent turn shares the daemon's env.

Workspace

cfg.workspace (or cwd fallback if unset) is the default root for relative paths — not a wall. File tools and terminal can reach absolute paths anywhere except the sensitive denylist. Real workspace-only isolation is the opt-in OS sandbox (Layer 2). Configure it via alpi setup → Workspace; the TUI top bar read-outs the resolved path but does not edit it.

Dependencies

Hard runtime deps are kept tight — every line in pyproject.toml's dependencies is actually imported by alpi/. The audited set, with one-liner for why each earns its place:

Optional dev extra: pytest + pytest-asyncio for the test suite, ruff for lint, pip-audit for CVE scans.

No gateway extra. Prior to v0.2.66 there was one bundling python-telegram-bot, fastapi, uvicorn for an HTTP webhook server that never materialised. A dependency audit confirmed zero imports from the codebase; dropped. If a FastAPI webhook ever lands, the extra comes back.

Security posture: uv run --with pip-audit pip-audit ran clean against the full lockfile at the time of the v0.2.66 audit. Re-run before each release. Known-CVE deps are not allowed to accumulate — drop or upgrade.

Testing

Run via uv run pytest tests/. The --llm flag enables real-LLM integration tests (a few cents on free models).

Key fixtures (tests/conftest.py):

Non-obvious things to know

theme