# helexa-acp ACP (Agent Client Protocol) bridge for editors like [Zed](https://zed.dev). Lets you point your editor's agent panel at **any combination** of OpenAI-compatible, OpenAI Responses, and Anthropic Messages endpoints — public APIs, private LAN deployments, local Ollama / LM Studio — and switch between them per session via a model dropdown. The "missing ACP binary" for users who don't want to be locked into one vendor's agent client. ``` ┌───────────────────────────────────┐ │ Zed (or any ACP editor client) │ └────────────┬──────────────────────┘ │ stdio JSON-RPC (ACP) ▼ ┌─────────────────┐ │ helexa-acp │ ← one binary, multi-endpoint └─────┬───────────┘ │ HTTP / SSE ┌────────┼─────────────┬──────────────┬──────────────┐ ▼ ▼ ▼ ▼ ▼ cortex/ OpenAI Anthropic OpenRouter LM Studio neuron Responses Messages (self- (gpt-5,…) (Claude) hosted) ``` ## What it does - **Speaks ACP** over stdio to editor clients (Zed today; any future ACP client tomorrow). - **Multi-endpoint** — one config file lists every LLM endpoint you want available; pick one per session via the model dropdown (`endpoint:model` selector). - **Three wire formats**: `openai-chat` (the broadly compatible default), `openai-responses` (newer OpenAI surface), and `anthropic-messages` (Claude). Each is a separate provider impl in `src/provider/`; adding a fourth (Gemini, Ollama native, …) is one file plus a `WireApi` enum variant. - **Built-in tools**: `read_file`, `write_file`, `edit_file`, `list_dir`, `bash`. Permission-gated by default; the editor user approves writes/shell per-call. - **Three session modes**: Default (gated), Bypass Permissions (auto-allow), and Plan (write-only-to-plan-dir, no shell). - **Vision** — drag-drop images into the agent panel against any vision-capable model. - **Session resume** — multi-day conversations survive editor restarts via on-disk transcript persistence. - **Context compaction** — rolling history stays inside the model's context window automatically so long sessions on small-context local models don't fall over. ## Install ### From source ```sh git clone https://git.lair.cafe/helexa/cortex.git cd cortex cargo install --path crates/helexa-acp # Binary lands at ~/.cargo/bin/helexa-acp ``` ### Pre-built RPM (Fedora 43) ```sh dnf copr enable helexa/helexa dnf install helexa-acp ``` The COPR project bundles helexa-acp alongside the cortex gateway and helexa-neuron flavours; install only the package(s) you need. ## Quick start The fastest path: env-var single-endpoint config. ```sh export HELEXA_ACP_BASE_URL=http://hanzalova.internal:31313/v1 export HELEXA_ACP_MODEL=Qwen/Qwen3.6-27B helexa-acp # speaks ACP over stdin/stdout; not interactive ``` Then in Zed (`~/.config/zed/settings.json`): ```jsonc { "agent_servers": { "helexa": { "command": "helexa-acp", "args": [] } } } ``` Restart Zed → open the agent panel → pick "helexa" → start chatting. Tool calls (file reads, writes, bash) prompt for permission per-call in Default mode. That's the minimum. The full config story below is what unlocks the multi-endpoint dropdown. ## Multi-endpoint config Copy `helexa-acp.example.toml` from this repo to `$XDG_CONFIG_HOME/helexa-acp/config.toml` (typically `~/.config/helexa-acp/config.toml`) and edit: ```toml default_endpoint = "helexa" [[endpoints]] name = "helexa" base_url = "http://hanzalova.internal:31313/v1" wire_api = "openai-chat" default_model = "Qwen/Qwen3.6-27B" max_tokens = 8192 context_window = 32768 [[endpoints]] name = "openrouter" base_url = "https://openrouter.ai/api/v1" wire_api = "openai-chat" api_key_env = "OPENROUTER_API_KEY" default_model = "anthropic/claude-opus-4" [[endpoints]] name = "anthropic" base_url = "https://api.anthropic.com/v1" wire_api = "anthropic-messages" api_key_env = "ANTHROPIC_API_KEY" default_model = "claude-opus-4" ``` Restart Zed. The model dropdown lists every model from every configured endpoint with the `endpoint:model` selector (`helexa:Qwen/Qwen3.6-27B`, `openrouter:anthropic/claude-opus-4`, …). Switch mid-session; the next prompt routes to the new endpoint. When only one endpoint is configured the prefix is dropped (model ids appear bare). ### Selector syntax The `model` field on every internal request is parsed as `:`: - `openrouter:gpt-4o` → routes to the `openrouter` endpoint, model `gpt-4o`. - `helexa/large` → no colon → falls through to whichever endpoint is named in `default_endpoint`, model `helexa/large`. - `:gpt-5` → leading colon → also falls through to default. ## Endpoint cookbook Copy-pasteable blocks. Mix and match. ### cortex / neuron (self-hosted) ```toml [[endpoints]] name = "helexa" base_url = "http://hanzalova.internal:31313/v1" wire_api = "openai-chat" default_model = "Qwen/Qwen3.6-27B" max_tokens = 8192 context_window = 32768 ``` Use `openai-responses` instead of `openai-chat` once cortex 0.1.16+ is deployed and you want the Responses API surface (vision item shape, structured reasoning items, etc.). ### OpenAI directly ```toml [[endpoints]] name = "openai" base_url = "https://api.openai.com/v1" wire_api = "openai-responses" api_key_env = "OPENAI_API_KEY" default_model = "gpt-5" ``` `openai-responses` is the right choice for current OpenAI models; `openai-chat` works against legacy GPT-3.5/4 deployments and anything labelled "chat completions". ### Anthropic directly ```toml [[endpoints]] name = "anthropic" base_url = "https://api.anthropic.com/v1" wire_api = "anthropic-messages" api_key_env = "ANTHROPIC_API_KEY" default_model = "claude-opus-4" ``` helexa-acp sends `x-api-key` + `anthropic-version: 2023-06-01` automatically. The `api_key_env` indirection keeps your key out of the config file. ### OpenRouter (multi-vendor proxy) ```toml [[endpoints]] name = "openrouter" base_url = "https://openrouter.ai/api/v1" wire_api = "openai-chat" api_key_env = "OPENROUTER_API_KEY" default_model = "anthropic/claude-opus-4" ``` OpenRouter speaks OpenAI-compat for every model it fronts, so `openai-chat` is the right wire format regardless of the underlying vendor. ### LM Studio (local) ```toml [[endpoints]] name = "lmstudio" base_url = "http://localhost:1234/v1" wire_api = "openai-chat" default_model = "auto" ``` LM Studio's "auto" model id picks whatever's loaded. Same shape works for Ollama in compat mode (`http://localhost:11434/v1`) and vLLM. ### Multiple cortex deployments ```toml [[endpoints]] name = "lan" base_url = "http://hanzalova.internal:31313/v1" wire_api = "openai-chat" default_model = "Qwen/Qwen3.6-27B" [[endpoints]] name = "cloud" base_url = "https://cortex.example.com/v1" wire_api = "openai-chat" api_key_env = "CLOUD_CORTEX_KEY" default_model = "Qwen/Qwen3-VL-8B" ``` Use the `endpoint:model` selector to switch between them mid-session. ## Zed setup `~/.config/zed/settings.json`: ```jsonc { "agent_servers": { "helexa": { "command": "helexa-acp" } } } ``` Optional environment overrides for the binary: ```jsonc { "agent_servers": { "helexa": { "command": "helexa-acp", "env": { "HELEXA_ACP_LOG_FILE": "/tmp/helexa-acp.log", "RUST_LOG": "helexa_acp=debug" } } } } ``` `HELEXA_ACP_LOG_FILE` is the one you actually want — Zed doesn't surface the agent's stderr, so without that env var debug output is invisible. Point it at a file you can `tail -f`. After restarting Zed: ⌘+? (or wherever your "Open Agent Panel" binding is) → select "helexa" → the model dropdown populates from your config → start prompting. ## Modes Three session modes ship; the user picks via Zed's mode dropdown on the agent panel. | Mode | Reads | Writes | Bash | Permission prompts | |------|-------|--------|------|--------------------| | **Default** | ✓ | with prompt | with prompt | per call | | **Bypass Permissions** | ✓ | ✓ | ✓ | never | | **Plan** | ✓ | only into plan dir | disabled | never (plan-dir writes auto-allow) | ### Default Reads are always allowed (`read_file`, `list_dir` are unrestricted). Writes and shell commands prompt the user before running. The intended baseline for any session where the agent might do something you'd rather review first. ### Bypass Permissions Auto-allow every tool call. Use for agentic loops you trust — bulk edits across many files, scripted workflows, prepared session templates. Never for code the agent hasn't seen before. ### Plan The "draft an implementation plan before you write code" mode. Available tools: - `read_file`, `list_dir`: unrestricted (read the codebase). - `write_file`, `edit_file`: allowed *only* under `$XDG_DATA_HOME/helexa-acp/plans//`. Any path outside that returns "plan mode: writes are restricted to …" back to the model so it self-corrects. - `bash`: disabled outright. Returns "plan mode: shell execution is disabled" if attempted. When the plan is complete, the model presents a 3-option menu: 1. **Bypass Permissions** — implement the plan now, no prompts. 2. **Default** — implement now with per-tool prompts. 3. **Plan** (stay here) — refine the plan with more guidance. Switch the mode dropdown to your preference and reply to proceed. ## Tools Five tools, defined in `src/tools.rs`: | Tool | Args | Gated in Default? | |------|------|-------------------| | `read_file` | `path`, `line?`, `limit?` | no | | `list_dir` | `path` | no | | `write_file` | `path`, `content` | yes | | `edit_file` | `path`, `old_text`, `new_text` | yes | | `bash` | `command`, `cwd?` | yes | ### Path handling `~`, `~/`, `$HOME`, and `$HOME/` are expanded server-side before the path reaches ACP or local fs. Lets the model emit `~/git/repo/file.rs` and have it Just Work. `read_file` first tries the editor's filesystem (ACP's `fs/read_text_file` — respects open buffers, workspace overlays, etc.). If that fails — typically because the path is outside Zed's workspace boundary — it falls back to `std::fs::read_to_string`. This lets the agent pull in shared material like `~/git/architecture/generic.md` from a different project's session. The fallback is logged at warn level so you can see when it kicks in. ### Tool dispatch Tool descriptions reach the model through a Qwen3 Hermes-format `# Tools` block injected into the system prompt — cortex/neuron pass the OpenAI `tools` request field through to the encoder unread, so we work the model into emitting `{json}` markers it then parses out of the content stream. This applies to the helexa wire format; OpenAI / Anthropic endpoints with native tool support would use their own paths once they're wired in. The parser is tolerant: malformed JSON (trailing braces, missing `name`, name nested in `arguments`) gets a repair pass; if that fails the call surfaces as a "Malformed tool call" card in Zed and the model gets a synthetic error result so it can self-correct. ## Session resume helexa-acp persists every session to `$XDG_DATA_HOME/helexa-acp/sessions/.json`. Zed's `session/list` RPC asks helexa-acp to enumerate them on workspace open; `session/load` rehydrates and replays the transcript as `session/update` notifications so the agent panel renders the prior conversation. Behaviour: - Persisted per-round, so a mid-turn agent stall (long bash, wedged ACP roundtrip) doesn't lose earlier rounds. - Survives editor restart and the helexa-acp binary upgrading between versions. - Project-scoped: only sessions whose `cwd` matches the workspace are listed. To wipe history: `rm -rf $XDG_DATA_HOME/helexa-acp/sessions/`. ## Context compaction When an endpoint sets `context_window`, helexa-acp projects the rolling history into a token budget before each request — old `ToolResult` content (read_file payloads are the worst offenders) gets elided to one-line markers, preserving `tool_call_id` pairing so the wire schema stays valid. System prompts, user turns, and the most recent ~4 messages are never elided. The full history stays on disk; compaction is a per-request projection, not a destructive edit. Set `context_window = 32768` for a 32 K Qwen3, `131072` for a modern Claude, etc. With `max_tokens` also set, the budget is `context_window - max_tokens - 512_safety`. ## Troubleshooting ### "default endpoint 'helexa' has no usable provider — check config" The named default endpoint failed to construct. Usually: - `api_key_env` references a variable that isn't set in the env Zed launched helexa-acp with. - The TOML's `wire_api` is misspelled (only `openai-chat`, `openai-responses`, `anthropic-messages` are accepted). Test by running `helexa-acp` directly from a shell — startup errors land on stderr. ### Model dropdown is empty Each provider's `list_models` failed at startup. Look at `HELEXA_ACP_LOG_FILE` for "list_models failed; this endpoint's models won't appear in the picker". Likely the endpoint URL is wrong, the API key is invalid, or the upstream `/v1/models` endpoint isn't responding. The agent still works against `default_model` even when the dropdown is empty — list-models is for picking, not routing. ### "prompt_too_long" / agent stalls mid-conversation You hit the model's context window. Set `context_window` on the endpoint and helexa-acp will compact before sending. The log line `context compaction applied` confirms it's running; if it fires but the upstream still rejects, the compaction heuristic under-counted and the budget needs tuning down. ### Reading files outside the workspace returns "not found" Zed's `fs/read_text_file` is workspace-scoped. helexa-acp falls back to local `std::fs` automatically when that fails — look for `fs/read_text_file failed; falling back to local std::fs` in the log. If even local read fails, the file genuinely doesn't exist or the user process lacks permissions. ### Tool calls render as text instead of structured cards The model is emitting `` markers that the parser can't decode. Two common causes: 1. The system prompt isn't reaching the model (cortex/neuron's tool-block injection didn't fire). Confirm with `RUST_LOG=helexa_acp=debug` and look at the outgoing `POST /chat/completions` body. 2. The model itself is too small / undertrained to follow the Hermes format reliably. helexa-acp has shape-based name inference and JSON repair, but there's a floor below which nothing helps. ### Plan-mode writes refused even inside the plan dir The path comparison is byte-for-byte. If the model emits a path with `~` and the plan_dir has the expanded form, expansion runs *before* the comparison — but resolved-vs-symlinked-path mismatches can still bite. The error message names the attempted path and the expected prefix so you can compare directly. ## Architecture Source layout under `crates/helexa-acp/src/`: | File | Responsibility | |------|----------------| | `main.rs` | tokio + Stdio transport. Builds providers, hands off to `agent::Agent` | | `config.rs` | TOML + env-fallback config, endpoint resolver | | `agent.rs` | ACP handlers (initialize, session/new, session/prompt, session/cancel, session/set_mode, session/set_model, session/load, session/list), prompt loop with tool-call recursion | | `session.rs` | Per-session state map (Arc>>) | | `store.rs` | On-disk session persistence, plan-dir resolution | | `prompt.rs` | System-prompt assembly, plan-mode addendum | | `tools.rs` | Tool schemas + shape-based name inference | | `tool_runner.rs` | Dispatch a single tool call through ACP client RPCs; permission gate | | `qwen3.rs` | Qwen3 Hermes tool-format parser (`` / `` markers) | | `compaction.rs` | Token-budget compaction for the rolling history | | `path_util.rs` | `~` / `$HOME` expansion shared across every path-taking tool | | `provider/openai_chat.rs` | OpenAI chat completions provider | | `provider/openai_responses.rs` | OpenAI Responses API provider | | `provider/anthropic_messages.rs` | Anthropic Messages API provider | ### Adding a new wire format 1. New file under `src/provider/` implementing the `Provider` trait (encoder + SSE decoder). 2. Add a `WireApi` variant in `config.rs`. 3. Wire it into `build_provider` in `main.rs`. 4. Done — every other module is wire-format-agnostic. ### Concurrency - `Arc>>>>` — per-session mutex so concurrent requests across sessions don't contend; the map's RwLock is read-mostly. - Every tool call dispatched serially within a session (parallel dispatch would require Zed to handle interleaved permission prompts). - Provider streams are back-pressured by the consumer (bounded mpsc channels). ### Self-contained The crate has no workspace-internal dependencies (no `cortex-core`, no `cortex-gateway`). Migration to a dedicated GitHub repo for cross-platform CI / cargo-dist binaries is Cargo.toml-only. ## Status - Stages 1–6 shipped: scaffold, agent loop, tools, modes, session resume, image input, model picker, three wire formats. - Stage 8 (RPM + multi-platform CI) tracked in the canonical plan; Linux x86_64 RPM ships today via the cortex monorepo's Gitea Actions. ## Contributing Repository: https://git.lair.cafe/helexa/cortex (`crates/helexa-acp/`). Issues / PRs welcome. The canonical staged plan is in `~/.claude/plans/plan-the-per-device-worker-abstract-micali.md` on the maintainer's machine; the substages 3a–3e and 6a/6b that the canonical plan didn't anticipate are documented in commit messages. CI: `cargo fmt --check --all`, `cargo clippy --workspace -- -D warnings`, `cargo test --workspace` must all pass before merge.