feat(helexa-acp): context compaction for small-context local models · 537a0fe7f2 - cortex

feat(helexa-acp): context compaction for small-context local models

All checks were successful

build-prerelease / Resolve version stamps (push) Successful in 26s

Details

CI / Format (push) Successful in 29s

Details

CI / Clippy (push) Successful in 2m26s

Details

build-prerelease / Build cortex binary (push) Successful in 5m17s

Details

build-prerelease / Build neuron-blackwell (push) Successful in 5m51s

Details

CI / Test (push) Successful in 5m53s

Details

CI / Build cortex SRPM (push) Has been skipped

Details

CI / Build neuron SRPM (push) Has been skipped

Details

CI / Publish cortex to COPR (push) Has been skipped

Details

CI / Publish neuron to COPR (push) Has been skipped

Details

CI / Bump version in source (push) Has been skipped

Details

build-prerelease / Package cortex RPM (push) Successful in 1m21s

Details

build-prerelease / Build neuron-ampere (push) Successful in 7m58s

Details

build-prerelease / Build neuron-ada (push) Successful in 5m30s

Details

build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 2m57s

Details

build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 3m7s

Details

build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m40s

Details

build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m0s

Details

A new src/compaction.rs module projects rolling conversation history
into a token budget before each completion. Older tool results and
assistant prose get elided to one-line markers; system prompts, user
turns, and the last KEEP_TAIL=4 messages stay verbatim. tool_call_id
pairing is preserved so OpenAI strict-schema providers keep working.

Driven by a new per-endpoint `context_window` config field (also
HELEXA_ACP_CONTEXT_WINDOW for the env-only single-endpoint case).
When set, prompt budget = context_window - max_tokens - 512_safety;
when unset, behaviour is unchanged.

Without this, a 32 K Qwen3 dies with `prompt_too_long` after the
first few read_file results pile up in history — the symptom seen
in plan-mode dogfooding on beat.

10 new unit tests cover the compaction strategy and the prompt
budget arithmetic.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

This commit is contained in:

rob thijssen

2026-05-29 08:22:01 +03:00

parent cbadfcf112

commit 537a0fe7f2

5 changed files with 501 additions and 2 deletions

									
										1

crates/helexa-acp/src/main.rs
									
												View File
												
				@@ -16,6 +16,7 @@ use agent_client_protocol::{Result, Stdio};

				use std::sync::Arc;

				mod agent;

				mod compaction;

				mod config;

				mod prompt;

				mod provider;

1 crates/helexa-acp/src/main.rs Unescape Escape View File

1

crates/helexa-acp/src/main.rs

View File