feat(helexa-acp): context compaction for small-context local models
All checks were successful
build-prerelease / Resolve version stamps (push) Successful in 26s
CI / Format (push) Successful in 29s
CI / Clippy (push) Successful in 2m26s
build-prerelease / Build cortex binary (push) Successful in 5m17s
build-prerelease / Build neuron-blackwell (push) Successful in 5m51s
CI / Test (push) Successful in 5m53s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Package cortex RPM (push) Successful in 1m21s
build-prerelease / Build neuron-ampere (push) Successful in 7m58s
build-prerelease / Build neuron-ada (push) Successful in 5m30s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 2m57s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 3m7s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m40s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m0s
A new src/compaction.rs module projects rolling conversation history into a token budget before each completion. Older tool results and assistant prose get elided to one-line markers; system prompts, user turns, and the last KEEP_TAIL=4 messages stay verbatim. tool_call_id pairing is preserved so OpenAI strict-schema providers keep working.

Driven by a new per-endpoint `context_window` config field (also HELEXA_ACP_CONTEXT_WINDOW for the env-only single-endpoint case). When set, prompt budget = context_window - max_tokens - 512_safety; when unset, behaviour is unchanged.

Without this, a 32 K Qwen3 dies with `prompt_too_long` after the first few read_file results pile up in history, the symptom seen in plan-mode dogfooding on beat.

10 new unit tests cover the compaction strategy and the prompt budget arithmetic.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
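
The budget arithmetic from the commit message can be sketched as below. This is an illustration only, not the code in src/compaction.rs: the names `prompt_budget` and `SAFETY_MARGIN` are hypothetical, and both the fallback for an unset `max_tokens` and the saturating subtraction are assumptions.

```rust
/// Safety margin in tokens, taken from the "512_safety" term in the commit
/// message. The constant name is hypothetical.
const SAFETY_MARGIN: usize = 512;

/// Hypothetical helper: returns the prompt budget in tokens, or `None` when
/// `context_window` is unset, which the commit describes as "behaviour is
/// unchanged" (no compaction).
fn prompt_budget(context_window: Option<usize>, max_tokens: Option<u64>) -> Option<usize> {
    let window = context_window?;
    // Assumption: an unset max_tokens reserves nothing for the response.
    let reserved = usize::try_from(max_tokens.unwrap_or(0)).unwrap_or(usize::MAX);
    // Assumption: saturate at zero instead of underflowing when the window
    // is smaller than the reserved response plus the safety margin.
    Some(window.saturating_sub(reserved).saturating_sub(SAFETY_MARGIN))
}
```

Under these assumptions, a 32 K window (32768 tokens) with `max_tokens` of 4096 leaves 32768 - 4096 - 512 = 28160 tokens for the prompt.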
@@ -98,6 +98,14 @@ pub struct EndpointConfig {
     /// request field.
     #[serde(default)]
     pub max_tokens: Option<u64>,
+    /// Model context window in tokens (prompt + response). When set,
+    /// the agent compacts conversation history before each completion
+    /// so the prompt fits within `context_window - max_tokens - safety`
+    /// tokens — long sessions on small-context local models (Qwen3 at
+    /// 32 K) survive past the first few tool-call rounds rather than
+    /// dying with `prompt_too_long`. `None` disables compaction.
+    #[serde(default)]
+    pub context_window: Option<usize>,
 }

 #[derive(Debug, Clone, Copy, PartialEq, Eq, Default, Serialize, Deserialize)]
@@ -193,6 +201,15 @@ impl Config {
                 })
             })
             .transpose()?;
+        let context_window = std::env::var("HELEXA_ACP_CONTEXT_WINDOW")
+            .ok()
+            .filter(|s| !s.is_empty())
+            .map(|s| {
+                s.parse::<usize>().with_context(|| {
+                    format!("HELEXA_ACP_CONTEXT_WINDOW is not a positive integer ({s})")
+                })
+            })
+            .transpose()?;
         Ok(Self {
             default_endpoint: Some(DEFAULT_ENDPOINT_NAME.into()),
             endpoints: vec![EndpointConfig {
@@ -203,6 +220,7 @@ impl Config {
                 api_key,
                 api_key_env: None,
                 max_tokens,
+                context_window,
             }],
             system_prompt_path,
         })
@@ -316,6 +334,7 @@ mod tests {
             api_key: None,
             api_key_env: None,
             max_tokens: None,
+            context_window: None,
         };
         assert_eq!(
             ep.chat_completions_url().as_str(),
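
The hunks shown here cover only the config plumbing; the compaction pass in src/compaction.rs is not shown. Going by the commit message alone, the strategy might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Message` and `Role` types, the `compact` and `estimate_tokens` names, the marker strings, the oldest-first elision order, and the 4-characters-per-token estimate. Only `KEEP_TAIL = 4` and the rule that system prompts, user turns, and the tail stay verbatim come from the commit message.

```rust
/// Number of trailing messages kept verbatim (KEEP_TAIL=4 in the commit message).
const KEEP_TAIL: usize = 4;

// Hypothetical message model; the real types are not shown in this diff.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
    Tool,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
    /// Set on tool results; it must keep matching the assistant's tool call.
    tool_call_id: Option<String>,
}

/// Assumption: a crude 4-characters-per-token estimate stands in for a real tokenizer.
fn estimate_tokens(messages: &[Message]) -> usize {
    messages.iter().map(|m| m.content.len() / 4 + 1).sum()
}

/// Elides older tool results and assistant prose, oldest first, until the
/// estimated size fits `budget` or only protected messages remain.
fn compact(history: &[Message], budget: usize) -> Vec<Message> {
    let mut out = history.to_vec();
    let tail_start = out.len().saturating_sub(KEEP_TAIL);
    for i in 0..tail_start {
        if estimate_tokens(&out) <= budget {
            break;
        }
        let marker = match out[i].role {
            Role::Tool => "[tool result elided]",
            Role::Assistant => "[assistant message elided]",
            // System prompts and user turns stay verbatim.
            Role::System | Role::User => continue,
        };
        // Replace only the content, and only when that actually shrinks it.
        // The message itself and its tool_call_id stay in place, so every
        // tool result still pairs with the call that produced it.
        if out[i].content.len() > marker.len() {
            out[i].content = marker.to_string();
        }
    }
    out
}
```

Messages are never dropped in this sketch, only shortened, which is one simple way to keep tool_call_id pairing intact for strict-schema providers. If the protected messages alone exceed the budget, the result can still be too large, and the caller would have to handle that case.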