Commit Graph

2 Commits

Author SHA1 Message Date
df0abfe4d4 feat(helexa-acp): image input for vision-capable models
All checks were successful
build-prerelease / Resolve version stamps (push) Successful in 34s
CI / Format (push) Successful in 37s
CI / Clippy (push) Successful in 2m33s
CI / Test (push) Successful in 5m4s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Build neuron-blackwell (push) Successful in 6m2s
build-prerelease / Build neuron-ampere (push) Successful in 7m49s
build-prerelease / Build neuron-ada (push) Successful in 5m27s
build-prerelease / Build cortex binary (push) Successful in 4m16s
build-prerelease / Package cortex RPM (push) Successful in 1m19s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 3m2s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 3m10s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m47s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m2s
Stage 5. Zed clipboard/DnD images get forwarded as OpenAI
content-array messages on user turns.

- New MessageContent::MultiPart variant + MessagePart (Text|Image)
  + ImageData struct (mime_type, base64 data, optional uri).
- flatten_prompt now produces structured content: collapses to
  Text when every block is text (some upstreams treat array-form
  as vision-only and refuse on text-only models), otherwise
  produces MultiPart preserving block order.
- OpenAI encoder emits `[{type:"text",text:…}, {type:"image_url",
  image_url:{url:"data:{mime};base64,{data}"}}]` for MultiPart user
  messages. Data URIs are used over remote `uri` because they
  round-trip through every upstream we care about.
- prompt_capabilities.image = true at initialize so Zed actually
  sends image blocks.
- compaction estimates ~512 tokens per image (the middle of the
  Qwen3-VL / OpenAI detail range) so the budget tracker doesn't
  pretend images are free.
- session/load replays image-bearing user turns by surfacing the
  text parts verbatim and rendering each image as a "[image: {mime}
  ({n} bytes)]" placeholder chunk — Zed can show the prior text
  context even though re-uploading the bytes through ACP isn't
  meaningful for resume.
- 4 new tests: flatten produces MultiPart in block order, image-only
  prompts still flatten to MultiPart, encoder emits the correct
  array shape, text-only encoding stays as the string form.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 09:43:00 +03:00
537a0fe7f2 feat(helexa-acp): context compaction for small-context local models
All checks were successful
build-prerelease / Resolve version stamps (push) Successful in 26s
CI / Format (push) Successful in 29s
CI / Clippy (push) Successful in 2m26s
build-prerelease / Build cortex binary (push) Successful in 5m17s
build-prerelease / Build neuron-blackwell (push) Successful in 5m51s
CI / Test (push) Successful in 5m53s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Package cortex RPM (push) Successful in 1m21s
build-prerelease / Build neuron-ampere (push) Successful in 7m58s
build-prerelease / Build neuron-ada (push) Successful in 5m30s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 2m57s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 3m7s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m40s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m0s
A new src/compaction.rs module projects rolling conversation history
into a token budget before each completion. Older tool results and
assistant prose get elided to one-line markers; system prompts, user
turns, and the last KEEP_TAIL=4 messages stay verbatim. tool_call_id
pairing is preserved so OpenAI strict-schema providers keep working.

Driven by a new per-endpoint `context_window` config field (also
HELEXA_ACP_CONTEXT_WINDOW for the env-only single-endpoint case).
When set, prompt budget = context_window - max_tokens - 512_safety;
when unset, behaviour is unchanged.

Without this, a 32 K Qwen3 dies with `prompt_too_long` after the
first few read_file results pile up in history — the symptom seen
in plan-mode dogfooding on beat.

10 new unit tests cover the compaction strategy and the prompt
budget arithmetic.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 08:22:01 +03:00