8fa1d1962ef53a9ed9309f8de03630eaf51c20e5
8 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
8fa1d1962e
|
feat(helexa-acp): anthropic-messages provider
Some checks failed
CI / CUDA type-check (push) Failing after 18s
CI / Format (push) Successful in 32s
build-prerelease / Resolve version stamps (push) Successful in 35s
CI / Test (push) Failing after 59s
CI / Clippy (push) Successful in 2m28s
CI / Build cortex SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Build cortex binary (push) Successful in 4m17s
build-prerelease / Build neuron-blackwell (push) Successful in 5m32s
build-prerelease / Package cortex RPM (push) Successful in 1m21s
build-prerelease / Build neuron-ampere (push) Successful in 7m50s
build-prerelease / Build neuron-ada (push) Successful in 5m55s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 2m55s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 3m2s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m52s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m4s
Stage 6b. Third provider impl, completing the wire-format trio
(openai-chat, openai-responses, anthropic-messages). Lets a
helexa-acp endpoint configured with `wire_api = "anthropic-messages"`
drive Claude models — either against Anthropic directly or via
cortex's /v1/messages translation surface.
## Encoder (CompletionRequest → Anthropic body)
- System messages flatten to the top-level `system` field
(concatenated with blank lines when there are multiple).
- User text → `{role:"user", content:"..."}`.
- User MultiPart (text + images) → `content` array with Anthropic's
distinct image shape: `{type:"image", source:{type:"base64",
media_type, data}}` — structurally different from OpenAI's
`image_url` data URI.
- Assistant text → `{role:"assistant", content:"..."}`.
- Assistant tool_calls → `content` array with optional `{type:"text"}`
block plus one `{type:"tool_use", id, name, input:<parsed json>}`
per call. The internal arguments JSON string is parsed back to a
Value before encoding (Anthropic requires the parsed form);
malformed JSON falls back to a String input so the request body
still serialises.
- Tool result → `{role:"user", content:[{type:"tool_result",
tool_use_id, content}]}` per Anthropic's convention (no separate
`tool` role).
- `max_tokens` is required by Anthropic; defaults to 8192 when the
request doesn't specify.
## Decoder (Anthropic SSE → CompletionEvent)
Named SSE events:
- `message_start` → captures input_tokens from `usage` for the
eventual UsageStats.
- `content_block_start` (type=text) → TextDelta (initial text, if any).
- `content_block_start` (type=tool_use) → ToolCallStart; if a
pre-buffered `input` is present, also emits a single
ToolCallArgsDelta.
- `content_block_start` (type=thinking, for extended-thinking
models) → ReasoningDelta.
- `content_block_delta` (text_delta) → TextDelta.
- `content_block_delta` (input_json_delta) → ToolCallArgsDelta,
correlated by block index.
- `content_block_delta` (thinking_delta) → ReasoningDelta.
- `message_delta` → Usage (final output_tokens) + Finish with
stop_reason mapped: end_turn/stop_sequence → "stop", max_tokens
→ "length", tool_use → "tool_calls".
- `message_stop` → stream terminates.
- `ping` ignored (Anthropic's keep-alive).
- `error` → yields Err and ends the stream.
## Wiring
- Authentication: `x-api-key` + `anthropic-version: 2023-06-01`
headers (not Bearer). Both ship when api_key is configured;
servers that don't care (cortex) ignore them.
- `WireApi::AnthropicMessages` in build_provider now constructs
the provider instead of erroring "reserved for future".
- `provider::mod.rs` registers the new module.
18 new unit tests: encoder (system collapse, multi-system concat,
default max_tokens, multipart with image, tool_use blocks, tool
results, malformed JSON arg fallback), decoder (text streaming,
tool_use lifecycle, max_tokens→length mapping, empty deltas, ping
events, error events, cancellation, malformed payload skip,
thinking blocks).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
|||
|
1818dfb337
|
feat(helexa-acp): openai-responses provider
Some checks failed
CI / Format (push) Successful in 38s
build-prerelease / Resolve version stamps (push) Successful in 45s
CI / Clippy (push) Successful in 2m35s
CI / CUDA type-check (push) Failing after 12s
CI / Test (push) Successful in 5m54s
build-prerelease / Build cortex binary (push) Successful in 5m9s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Package cortex RPM (push) Successful in 1m20s
build-prerelease / Build neuron-blackwell (push) Successful in 4m36s
build-prerelease / Build neuron-ampere (push) Successful in 7m11s
build-prerelease / Build neuron-ada (push) Successful in 6m33s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 2m55s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 2m56s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m45s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 59s
Stage 6a. Implements the `Provider` trait for OpenAI's Responses
API surface, parallel to the existing `OpenAIChatProvider`. Lets a
helexa-acp endpoint configured with `wire_api = "openai-responses"`
drive a `/v1/responses` server (today: neuron through cortex; later:
OpenAI directly) using the same agent-loop machinery the chat
provider already supports.
## Encoder (CompletionRequest → Responses body)
- System messages collapse into a single top-level `instructions`
string. Multiple system messages concatenate with blank lines so
ordering is preserved.
- User messages become `{type:"message", role:"user", content:…}`
input items. Text content stays a bare string; MultiPart content
(text + images, post-Stage 5) becomes a
`[{type:"input_text"}, {type:"input_image"}]` array with images
encoded as `data:{mime};base64,{data}` URIs — exactly the shape
neuron's `wire::openai_responses::request_to_chat` accepts.
- Assistant text turns become an `output_text` content part inside
a `message` item.
- Assistant tool-call turns become `function_call` input items.
- Tool result turns become `function_call_output` input items.
- `max_tokens` translates to `max_output_tokens`.
## Decoder (Responses SSE → CompletionEvent)
Reads named events on the SSE `event:` line:
- `response.output_text.delta` → `CompletionEvent::TextDelta`
- `response.output_item.added` with `type:"function_call"` →
`CompletionEvent::ToolCallStart` (and, when the upstream
pre-buffers fully, a single `ToolCallArgsDelta`)
- `response.function_call_arguments.delta` →
`CompletionEvent::ToolCallArgsDelta`, correlated back to the
tool-call slot by output_index.
- `response.completed` → `CompletionEvent::Usage` (if present) +
`CompletionEvent::Finish` with reason mapped from `status`:
`"completed"` → `"stop"`, `"incomplete"` → `"length"`.
- Bookkeeping events (`response.created`, `response.in_progress`,
`*.content_part.*`, `*.output_text.done`, `*.output_item.done`,
`*.function_call_arguments.done`, reasoning_*) are skipped.
## Wiring
- `EndpointConfig::responses_url()` joins `{base_url}/responses`.
- `WireApi::OpenAiResponses` in `build_provider` constructs the new
provider (was previously a "reserved for future" error).
- `provider::mod.rs` registers the new module.
## Cuts (carried over from neuron-side issues)
- The decoder's `ToolCall*` handling fires correctly when the
upstream emits `function_call` items, but the neuron candle
harness doesn't yet (Refs #6). Real tool-call testing against
cortex+neuron stays on the chat path until #6 lands.
- Reasoning events (`response.reasoning_*`) are deliberately
dropped today; once neuron emits `InferenceEvent::ReasoningDelta`
(Refs #5) the projector on the neuron side will start firing the
reasoning event family and this decoder will need a matching
case to route them to `CompletionEvent::ReasoningDelta`.
13 new unit tests cover encoder (system collapse, multipart user
input, assistant output_text encoding, tool-call round-trip via
function_call items) and decoder (text streaming, empty deltas
dropped, length finish, function_call lifecycle, inline-arguments
shape, cancellation, malformed payload skip).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
|||
|
df0abfe4d4
|
feat(helexa-acp): image input for vision-capable models
All checks were successful
build-prerelease / Resolve version stamps (push) Successful in 34s
CI / Format (push) Successful in 37s
CI / Clippy (push) Successful in 2m33s
CI / Test (push) Successful in 5m4s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Build neuron-blackwell (push) Successful in 6m2s
build-prerelease / Build neuron-ampere (push) Successful in 7m49s
build-prerelease / Build neuron-ada (push) Successful in 5m27s
build-prerelease / Build cortex binary (push) Successful in 4m16s
build-prerelease / Package cortex RPM (push) Successful in 1m19s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 3m2s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 3m10s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m47s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m2s
Stage 5. Zed clipboard/DnD images get forwarded as OpenAI
content-array messages on user turns.
- New MessageContent::MultiPart variant + MessagePart (Text|Image)
+ ImageData struct (mime_type, base64 data, optional uri).
- flatten_prompt now produces structured content: collapses to
Text when every block is text (some upstreams treat array-form
as vision-only and refuse on text-only models), otherwise
produces MultiPart preserving block order.
- OpenAI encoder emits `[{type:"text",text:…}, {type:"image_url",
image_url:{url:"data:{mime};base64,{data}"}}]` for MultiPart user
messages. Data URIs are used over remote `uri` because they
round-trip through every upstream we care about.
- prompt_capabilities.image = true at initialize so Zed actually
sends image blocks.
- compaction estimates ~512 tokens per image (the middle of the
Qwen3-VL / OpenAI detail range) so the budget tracker doesn't
pretend images are free.
- session/load replays image-bearing user turns by surfacing the
text parts verbatim and rendering each image as a "[image: {mime}
({n} bytes)]" placeholder chunk — Zed can show the prior text
context even though re-uploading the bytes through ACP isn't
meaningful for resume.
- 4 new tests: flatten produces MultiPart in block order, image-only
prompts still flatten to MultiPart, encoder emits the correct
array shape, text-only encoding stays as the string form.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
|||
|
5aac1ffc59
|
feat(helexa-acp): session resume via session/load
All checks were successful
CI / Format (push) Successful in 31s
build-prerelease / Resolve version stamps (push) Successful in 40s
CI / Clippy (push) Successful in 2m37s
CI / Test (push) Successful in 4m59s
CI / Build cortex SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Build cortex binary (push) Successful in 4m35s
build-prerelease / Package cortex RPM (push) Successful in 1m19s
build-prerelease / Build neuron-blackwell (push) Successful in 6m4s
build-prerelease / Build neuron-ampere (push) Successful in 7m45s
build-prerelease / Build neuron-ada (push) Successful in 5m31s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 2m53s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 3m0s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m43s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m1s
Zed restarts (frequent during helexa-acp dogfooding) used to lose
every conversation because we'd ignore the load_session capability
and treat every project-reopen as a fresh session/new. Persist
sessions to disk and honour session/load so the agent panel comes
back where it left off.
Storage layout:
$XDG_DATA_HOME/helexa-acp/sessions/{session_id}.json
Each file holds session_id, cwd, model_id, mode_id, full Message
history, plus created/updated timestamps. Atomic save via
tempfile+rename so a crash mid-write can't corrupt the store.
Touch points:
- src/store.rs (new) — sessions_dir() resolution, save/load via
default and explicit-dir entry points (so unit tests don't have
to race on XDG_DATA_HOME). 5 unit tests cover round-trip,
not-found errors, atomic overwrite, tool-call/result preservation,
and the filename sanitiser's path-traversal handling.
- src/provider/mod.rs — Serialize/Deserialize on Role, Message,
MessageContent, ToolCall. MessageContent::Text turned into a
struct variant ({text: ...}) so internally-tagged JSON works.
- src/agent.rs — initialize_response advertises load_session: true;
handle_load_session reads the file, snapshots in-memory state,
returns LoadSessionResponse with the persisted mode preselected;
drive_prompt persists at the end of every prompt round under the
session lock with the I/O outside the lock.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
|
a494c8d43c
|
feat(helexa-acp): repair malformed tool calls and render failures as cards
Some checks failed
build-prerelease / Package helexa-neuron-blackwell RPM (push) Blocked by required conditions
build-prerelease / Resolve version stamps (push) Successful in 28s
CI / Format (push) Successful in 4m7s
CI / Test (push) Failing after 1m2s
build-prerelease / Build neuron-blackwell (push) Successful in 6m10s
CI / Clippy (push) Successful in 2m37s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Build cortex binary (push) Successful in 4m24s
build-prerelease / Build neuron-ampere (push) Successful in 8m18s
build-prerelease / Package cortex RPM (push) Successful in 1m22s
build-prerelease / Build neuron-ada (push) Successful in 5m23s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 2m54s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 2m56s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Has been cancelled
Two related fixes for cases where Qwen3 sometimes emits slightly-off JSON inside <tool_call> blocks: 1. JSON repair pass in qwen3::parse_tool_call_body — strip up to three trailing extra `}` characters (model overshoots its closing braces), and hoist `name` out of `arguments` when it lands nested instead of as a sibling. Both observed in the field; both trivially repairable; both now dispatch as normal tool calls instead of falling back to the malformed path. 2. New CompletionEvent::MalformedToolCall variant for the cases repair can't fix. decode_stream now emits it instead of wrapping the raw body in a TextDelta, and agent.rs surfaces each one as a Failed SessionUpdate::ToolCall card (so Zed renders it as a structured failure UI element rather than dumping the body inline) plus a synthetic tool-call/tool-result history pair so the model gets clear feedback for self-correction on the next round. Empty <tool_call></tool_call> blocks are now a no-op too (no Malformed event), matching the existing empty-<think> behaviour. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
|||
|
0121a1930f
|
feat(helexa-acp): inject and parse Qwen3 Hermes tool format
Some checks failed
CI / Format (push) Successful in 38s
build-prerelease / Resolve version stamps (push) Successful in 42s
CI / Clippy (push) Successful in 2m33s
CI / Test (push) Successful in 5m45s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Build cortex binary (push) Successful in 5m13s
build-prerelease / Build neuron-blackwell (push) Successful in 6m0s
build-prerelease / Package cortex RPM (push) Successful in 1m27s
build-prerelease / Build neuron-ampere (push) Successful in 7m55s
build-prerelease / Package helexa-neuron-ada RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-ampere RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-blackwell RPM (push) Has been cancelled
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Has been cancelled
build-prerelease / Build neuron-ada (push) Has been cancelled
The OpenAI `tools` API field isn't load-bearing in this stack —
neuron's chat template renders only message.content, so tool
definitions sent that way never reach the model. Move both sides
of the tool conversation into the Qwen3 Hermes wire format the
model is actually trained on:
- Append a `# Tools` block to the system prompt describing every
available function (qwen3::render_tool_block).
- Parse `<tool_call>{json}</tool_call>` markers out of the streamed
content via a chunk-boundary-safe state machine (qwen3::ToolCallParser),
surfacing them as the existing CompletionEvent::ToolCall* events
so the agent loop doesn't change.
- Re-serialise assistant turns that called tools with inline
`<tool_call>` blocks and tool results as user turns wrapped in
`<tool_response>` (qwen3::render_assistant_with_tool_calls,
render_tool_response).
Verified against cortex+Qwen3.6-27B: the model produces a
well-formed `<tool_call>{"name":"list_dir","arguments":{"path":"/tmp"}}</tool_call>`
in response to a Hermes-formatted prompt.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
|||
|
96fc379893
|
feat(helexa-acp): wire ACP agent loop for text-only conversations
Some checks failed
build-prerelease / Package helexa-neuron-ada RPM (push) Blocked by required conditions
build-prerelease / Package helexa-neuron-ampere RPM (push) Blocked by required conditions
build-prerelease / Package helexa-neuron-blackwell RPM (push) Blocked by required conditions
build-prerelease / Resolve version stamps (push) Successful in 41s
CI / Format (push) Successful in 38s
CI / Clippy (push) Successful in 2m35s
build-prerelease / Build cortex binary (push) Successful in 5m26s
CI / Test (push) Successful in 5m43s
build-prerelease / Build neuron-blackwell (push) Successful in 5m47s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Package cortex RPM (push) Successful in 1m23s
build-prerelease / Build neuron-ampere (push) Successful in 8m13s
build-prerelease / Build neuron-ada (push) Successful in 5m28s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Has been cancelled
Stage 2 lands the agent loop on top of the Stage 1 scaffold: session state with per-session cancellation, a system-prompt builder honouring HELEXA_ACP_SYSTEM_PROMPT_PATH / system_prompt_path TOML, and handlers for initialize / session/new / session/prompt / session/cancel that stream provider output back as session/update notifications. Verified end-to-end against cortex from Zed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
|||
|
e23d5011d0
|
feat(helexa-acp): scaffold ACP bridge with provider trait + OpenAI chat
Adds a new workspace crate `helexa-acp` (binary, Apache-2.0) — the
start of "the missing ACP binary" for multi-endpoint LLM setups
mixing public APIs, private LAN deployments, and various wire
formats. Today it speaks OpenAI /v1/chat/completions; the
Provider trait is the seam that lets OpenAI Responses, Anthropic
/v1/messages, and other wire formats slot in later without touching
the agent loop.
The crate is intentionally self-contained — no dependencies on the
other workspace crates (cortex-core, cortex-gateway, neuron) — so a
future migration to a dedicated GitHub repo is a Cargo.toml-only
change. All deps come from crates.io.
This commit lands:
* `config.rs` — TOML config at $XDG_CONFIG_HOME/helexa-acp/config.toml
with multi-endpoint support (each `[[endpoints]]` declares its
name, base_url, wire_api, default_model, optional API key /
api_key_env). Falls back to env-only single-endpoint config when
no TOML exists (HELEXA_ACP_BASE_URL, HELEXA_ACP_MODEL, etc.). The
`endpoint:model` selector syntax is validated and tested.
* `provider/mod.rs` — `Provider` trait + provider-agnostic types
(`CompletionRequest`, `CompletionEvent`, `Message`, `ToolCall`,
`ToolSpec`, `Role`, `UsageStats`). Agent loop consumes these
without knowing the wire format on the other side.
* `provider/openai_chat.rs` — `OpenAIChatProvider` impl. Compatible
with cortex, LM Studio, Ollama (compat mode), OpenRouter, OpenAI
itself. Streams via reqwest + eventsource-stream + async-stream.
Surfaces text deltas, reasoning deltas (for models that emit
`reasoning_content`), tool-call lifecycle (start, args-delta,
completion), usage, finish reason. Cancellation-token aware.
* `main.rs` — tokio + stderr-only tracing-subscriber + Stdio
transport. Builds a provider per configured endpoint at startup,
surfacing config mistakes before the editor even initializes.
Currently responds to `initialize`; everything else stubs to
`not implemented yet` until the agent loop lands in the next
commit.
12 unit tests pass — encoder shape, decoder shape (text-only,
tool-call progressive, cancellation, malformed-chunk recovery),
config parsing (multi-endpoint TOML, env fallback, validation).
The `#![allow(dead_code)]` on `provider/mod.rs` is temporary — the
agent loop in the next commit reads every field. It's noted in the
module-level docstring so the next reader knows.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|