feat(helexa-acp): context compaction for small-context local models

A new src/compaction.rs module projects rolling conversation history into a token budget before each completion. Older tool results and assistant prose get elided to one-line markers; system prompts, user turns, and the last KEEP_TAIL=4 messages stay verbatim. tool_call_id pairing is preserved so OpenAI strict-schema providers keep working.

Driven by a new per-endpoint `context_window` config field (also HELEXA_ACP_CONTEXT_WINDOW for the env-only single-endpoint case). When set, prompt budget = context_window - max_tokens - 512_safety; when unset, behaviour is unchanged.

Without this, a 32 K Qwen3 dies with `prompt_too_long` after the first few read_file results pile up in history. This is the symptom seen in plan-mode dogfooding on beat.

10 new unit tests cover the compaction strategy and the prompt budget arithmetic.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
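
For readers without src/compaction.rs open, the elision strategy described in the message can be sketched roughly as below. This is an illustration only, not the module's code: the `Role` and `Message` types, the `estimate_tokens` heuristic (about four bytes per token), the marker strings, and the oldest-first order are all assumptions. Only KEEP_TAIL=4, the verbatim system/user rule, and the preserved tool_call_id come from the commit message.

```rust
/// Number of trailing messages kept verbatim (KEEP_TAIL=4 in the commit message).
const KEEP_TAIL: usize = 4;

/// Hypothetical role type; the real module's types are not shown here.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
    Tool,
}

/// Hypothetical message type carrying only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
    /// Set on tool results; must keep matching the assistant's tool call.
    tool_call_id: Option<String>,
}

/// Assumption: roughly four bytes per token. A real tokenizer would differ.
fn estimate_tokens(m: &Message) -> usize {
    m.content.len() / 4 + 1
}

/// Elide older tool results and assistant prose, oldest first, until the
/// estimated size fits `budget` or nothing elidable remains.
fn compact(history: &[Message], budget: usize) -> Vec<Message> {
    let mut out = history.to_vec();
    let mut total: usize = out.iter().map(estimate_tokens).sum();
    // Everything from this index onward is the protected tail.
    let protected_from = out.len().saturating_sub(KEEP_TAIL);

    for msg in out.iter_mut().take(protected_from) {
        if total <= budget {
            break;
        }
        let marker = match msg.role {
            Role::Tool => "[tool result elided]",
            Role::Assistant => "[assistant message elided]",
            // System prompts and user turns stay verbatim.
            Role::System | Role::User => continue,
        };
        // Skip messages that the marker would not make any shorter.
        if msg.content.len() <= marker.len() {
            continue;
        }
        let before = estimate_tokens(msg);
        // Only the content is replaced; role and tool_call_id are untouched,
        // so every tool result still pairs with its originating call.
        msg.content = marker.to_string();
        total = total - before + estimate_tokens(msg);
    }
    out
}
```

The sketch never removes a message, so the output always has the same length and ordering as the input. That is one simple way to keep tool_call_id pairing intact; the actual module may do it differently.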
2026-05-29 08:22:01 +03:00
parent cbadfcf112
commit 537a0fe7f2
5 changed files with 501 additions and 2 deletions
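
The budget formula from the commit message (context_window - max_tokens - 512 safety) can be illustrated with a small sketch. The helper name `prompt_budget`, the constant name `SAFETY_MARGIN`, the saturating subtraction, and the treatment of an unset `max_tokens` as reserving zero tokens are assumptions made for illustration; the commit's own arithmetic may handle these cases differently.

```rust
/// Safety margin in tokens, taken from the commit message ("512_safety").
/// The constant name is hypothetical.
const SAFETY_MARGIN: usize = 512;

/// Hypothetical helper: returns the prompt budget in tokens, or `None`
/// when `context_window` is unset and compaction is therefore disabled.
fn prompt_budget(context_window: Option<usize>, max_tokens: Option<u64>) -> Option<usize> {
    let window = context_window?;
    // Assumption: an unset max_tokens reserves nothing for the response.
    let reserved = max_tokens.map_or(0, |m| usize::try_from(m).unwrap_or(usize::MAX));
    // Saturating subtraction keeps a window smaller than the reservation
    // from underflowing; the budget bottoms out at zero instead.
    Some(
        window
            .saturating_sub(reserved)
            .saturating_sub(SAFETY_MARGIN),
    )
}
```

With a 32 K window (32768 tokens) and `max_tokens` of 4096, this gives 32768 - 4096 - 512 = 28160 prompt tokens. The types mirror the config fields in the diff below (`Option<usize>` for `context_window`, `Option<u64>` for `max_tokens`), which is why the sketch converts between the two.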
--- a/crates/helexa-acp/src/config.rs
+++ b/crates/helexa-acp/src/config.rs
@@ -98,6 +98,14 @@ pub struct EndpointConfig {
    /// request field.
    #[serde(default)]
    pub max_tokens: Option<u64>,
+    /// Model context window in tokens (prompt + response). When set,
+    /// the agent compacts conversation history before each completion
+    /// so the prompt fits within `context_window - max_tokens - safety`
+    /// tokens — long sessions on small-context local models (Qwen3 at
+    /// 32 K) survive past the first few tool-call rounds rather than
+    /// dying with `prompt_too_long`. `None` disables compaction.
+    #[serde(default)]
+    pub context_window: Option<usize>,
 }

 #[derive(Debug, Clone, Copy, PartialEq, Eq, Default, Serialize, Deserialize)]
@@ -193,6 +201,15 @@ impl Config {
                })
            })
            .transpose()?;
+        let context_window = std::env::var("HELEXA_ACP_CONTEXT_WINDOW")
+            .ok()
+            .filter(|s| !s.is_empty())
+            .map(|s| {
+                s.parse::<usize>().with_context(|| {
+                    format!("HELEXA_ACP_CONTEXT_WINDOW is not a positive integer ({s})")
+                })
+            })
+            .transpose()?;
        Ok(Self {
            default_endpoint: Some(DEFAULT_ENDPOINT_NAME.into()),
            endpoints: vec![EndpointConfig {
@@ -203,6 +220,7 @@ impl Config {
                api_key,
                api_key_env: None,
                max_tokens,
+                context_window,
            }],
            system_prompt_path,
        })
@@ -316,6 +334,7 @@ mod tests {
            api_key: None,
            api_key_env: None,
            max_tokens: None,
+            context_window: None,
        };
        assert_eq!(
            ep.chat_completions_url().as_str(),