cortex/crates/helexa-acp/src/config.rs
rob thijssen 1818dfb337
feat(helexa-acp): openai-responses provider
Stage 6a. Implements the `Provider` trait for OpenAI's Responses
API surface, parallel to the existing `OpenAIChatProvider`. Lets a
helexa-acp endpoint configured with `wire_api = "openai-responses"`
drive a `/v1/responses` server (today: neuron through cortex; later:
OpenAI directly) using the same agent-loop machinery the chat
provider already supports.

## Encoder (CompletionRequest → Responses body)

- System messages collapse into a single top-level `instructions`
  string. Multiple system messages concatenate with blank lines so
  ordering is preserved.
- User messages become `{type:"message", role:"user", content:…}`
  input items. Text content stays a bare string; MultiPart content
  (text + images, post-Stage 5) becomes a
  `[{type:"input_text"}, {type:"input_image"}]` array with images
  encoded as `data:{mime};base64,{data}` URIs — exactly the shape
  neuron's `wire::openai_responses::request_to_chat` accepts.
- Assistant text turns become an `output_text` content part inside
  a `message` item.
- Assistant tool-call turns become `function_call` input items.
- Tool result turns become `function_call_output` input items.
- `max_tokens` translates to `max_output_tokens`.
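
The encoder rules above can be sketched with plain Rust types. This is a minimal illustration, not the crate's code: `Turn`, `InputItem`, `Content`, `ResponsesBody`, and `encode` are hypothetical names, the `call_id`/`name`/`arguments`/`output` fields are assumed from the public Responses API shape, and multipart (image) content is left out for brevity.

```rust
// Hypothetical, simplified stand-ins for the crate's request types.
#[derive(Debug, Clone, PartialEq)]
enum Turn {
    System(String),
    User(String),
    AssistantText(String),
    ToolCall { call_id: String, name: String, arguments: String },
    ToolResult { call_id: String, output: String },
}

#[derive(Debug, Clone, PartialEq)]
enum Content {
    /// Serialized as a bare string.
    Text(String),
    /// Serialized as a one-element array holding an `output_text` part.
    OutputText(String),
}

#[derive(Debug, Clone, PartialEq)]
enum InputItem {
    Message { role: &'static str, content: Content },
    FunctionCall { call_id: String, name: String, arguments: String },
    FunctionCallOutput { call_id: String, output: String },
}

#[derive(Debug, PartialEq)]
struct ResponsesBody {
    instructions: Option<String>,
    input: Vec<InputItem>,
    max_output_tokens: Option<u64>,
}

fn encode(turns: &[Turn], max_tokens: Option<u64>) -> ResponsesBody {
    let mut system: Vec<&str> = Vec::new();
    let mut input = Vec::new();
    for turn in turns {
        match turn {
            // System turns never become input items; they are collected
            // and joined into the top-level `instructions` string.
            Turn::System(text) => system.push(text.as_str()),
            Turn::User(text) => input.push(InputItem::Message {
                role: "user",
                content: Content::Text(text.clone()),
            }),
            Turn::AssistantText(text) => input.push(InputItem::Message {
                role: "assistant",
                content: Content::OutputText(text.clone()),
            }),
            Turn::ToolCall { call_id, name, arguments } => {
                input.push(InputItem::FunctionCall {
                    call_id: call_id.clone(),
                    name: name.clone(),
                    arguments: arguments.clone(),
                })
            }
            Turn::ToolResult { call_id, output } => {
                input.push(InputItem::FunctionCallOutput {
                    call_id: call_id.clone(),
                    output: output.clone(),
                })
            }
        }
    }
    ResponsesBody {
        // Blank-line separator keeps the original ordering readable.
        instructions: if system.is_empty() {
            None
        } else {
            Some(system.join("\n\n"))
        },
        input,
        // `max_tokens` on the request maps to `max_output_tokens` on the wire.
        max_output_tokens: max_tokens,
    }
}
```

In this sketch, two system turns `"a"` and `"b"` around a user turn yield `instructions = "a\n\nb"` and a single `message` input item.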

## Decoder (Responses SSE → CompletionEvent)

Reads named events on the SSE `event:` line:

- `response.output_text.delta` → `CompletionEvent::TextDelta`
- `response.output_item.added` with `type:"function_call"` →
  `CompletionEvent::ToolCallStart` (and, when the upstream
  pre-buffers fully, a single `ToolCallArgsDelta`)
- `response.function_call_arguments.delta` →
  `CompletionEvent::ToolCallArgsDelta`, correlated back to the
  tool-call slot by `output_index`.
- `response.completed` → `CompletionEvent::Usage` (if present) +
  `CompletionEvent::Finish` with reason mapped from `status`:
  `"completed"` → `"stop"`, `"incomplete"` → `"length"`.
- Bookkeeping events (`response.created`, `response.in_progress`,
  `*.content_part.*`, `*.output_text.done`, `*.output_item.done`,
  `*.function_call_arguments.done`, `response.reasoning_*`) are skipped.
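
The event mapping above can be sketched as a single dispatch on the SSE event name. This is a rough illustration under stated assumptions: `Frame` is a hypothetical pre-parsed view of one SSE frame (the real decoder reads JSON payloads), the `CompletionEvent` variants carry made-up field layouts, and the `Usage` event is omitted. Returning nothing for a `status` other than `"completed"` or `"incomplete"` is an assumption of this sketch, as is dropping empty argument deltas.

```rust
#[derive(Debug, Clone, PartialEq)]
enum CompletionEvent {
    TextDelta(String),
    ToolCallStart { index: usize, name: String },
    ToolCallArgsDelta { index: usize, delta: String },
    Finish { reason: String },
}

/// Hypothetical pre-parsed SSE frame: the `event:` name plus the few
/// payload fields this sketch looks at.
#[derive(Debug, Default)]
struct Frame<'a> {
    event: &'a str,
    delta: Option<&'a str>,
    item_type: Option<&'a str>,
    name: Option<&'a str>,
    arguments: Option<&'a str>,
    output_index: Option<usize>,
    status: Option<&'a str>,
}

/// Map the Responses `status` to a chat-style finish reason.
fn finish_reason(status: &str) -> Option<&'static str> {
    match status {
        "completed" => Some("stop"),
        "incomplete" => Some("length"),
        _ => None, // assumption: other statuses are not mapped here
    }
}

fn decode(frame: &Frame) -> Vec<CompletionEvent> {
    // `output_index` correlates argument deltas with their tool-call slot.
    let index = frame.output_index.unwrap_or(0);
    match frame.event {
        "response.output_text.delta" => match frame.delta {
            // Empty deltas are dropped rather than forwarded.
            Some(d) if !d.is_empty() => vec![CompletionEvent::TextDelta(d.to_string())],
            _ => Vec::new(),
        },
        "response.output_item.added" if frame.item_type == Some("function_call") => {
            let mut out = vec![CompletionEvent::ToolCallStart {
                index,
                name: frame.name.unwrap_or("").to_string(),
            }];
            // Pre-buffered upstreams ship the full arguments inline.
            if let Some(args) = frame.arguments.filter(|a| !a.is_empty()) {
                out.push(CompletionEvent::ToolCallArgsDelta {
                    index,
                    delta: args.to_string(),
                });
            }
            out
        }
        "response.function_call_arguments.delta" => match frame.delta {
            Some(d) if !d.is_empty() => vec![CompletionEvent::ToolCallArgsDelta {
                index,
                delta: d.to_string(),
            }],
            _ => Vec::new(),
        },
        "response.completed" => frame
            .status
            .and_then(finish_reason)
            .map(|reason| vec![CompletionEvent::Finish { reason: reason.to_string() }])
            .unwrap_or_default(),
        // Bookkeeping and reasoning events fall through and are skipped.
        _ => Vec::new(),
    }
}
```

Keeping the skip case as the catch-all arm means a new upstream event family is ignored by default instead of breaking the stream.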

## Wiring

- `EndpointConfig::responses_url()` joins `{base_url}/responses`.
- `WireApi::OpenAiResponses` in `build_provider` constructs the new
  provider (was previously a "reserved for future" error).
- `provider::mod.rs` registers the new module.
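
As a rough sketch of the wiring, the request URL depends only on the base URL and the wire API. The code below is a string-based stand-in, not the crate's code: `join_url` and `request_url` are hypothetical helpers (the real `join_segments` works on `url::Url`), and returning `None` for the Anthropic variant is an assumption that mirrors its "reserved" status.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WireApi {
    OpenAiChat,
    OpenAiResponses,
    AnthropicMessages,
}

/// Append path segments to a base URL, tolerating a trailing slash.
fn join_url(base: &str, segments: &[&str]) -> String {
    let mut out = base.trim_end_matches('/').to_string();
    for segment in segments {
        out.push('/');
        out.push_str(segment);
    }
    out
}

/// Pick the request URL for a wire API; `None` means no provider yet.
fn request_url(base: &str, wire_api: WireApi) -> Option<String> {
    match wire_api {
        WireApi::OpenAiChat => Some(join_url(base, &["chat", "completions"])),
        WireApi::OpenAiResponses => Some(join_url(base, &["responses"])),
        // Assumption: still reserved, so there is nothing to call.
        WireApi::AnthropicMessages => None,
    }
}
```

With a base of `http://h.internal:31313/v1` (with or without a trailing slash), the Responses variant resolves to `http://h.internal:31313/v1/responses`.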

## Cuts (carried over from neuron-side issues)

- The decoder's `ToolCall*` handling fires correctly when the
  upstream emits `function_call` items, but the neuron candle
  harness doesn't yet (Refs #6). Real tool-call testing against
  cortex+neuron stays on the chat path until #6 lands.
- Reasoning events (`response.reasoning_*`) are deliberately
  dropped today; once neuron emits `InferenceEvent::ReasoningDelta`
  (Refs #5) the projector on the neuron side will start firing the
  reasoning event family and this decoder will need a matching
  case to route them to `CompletionEvent::ReasoningDelta`.

13 new unit tests cover encoder (system collapse, multipart user
input, assistant output_text encoding, tool-call round-trip via
function_call items) and decoder (text streaming, empty deltas
dropped, length finish, function_call lifecycle, inline-arguments
shape, cancellation, malformed payload skip).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-31 11:30:25 +03:00


//! Configuration for the helexa-acp bridge.
//!
//! Loaded from the path in `HELEXA_ACP_CONFIG_PATH` when that variable
//! is set, otherwise from `$XDG_CONFIG_HOME/helexa-acp/config.toml` (or
//! `~/.config/helexa-acp/config.toml` as a fallback). If no config file
//! exists, falls back to building a single endpoint named `default`
//! from env vars, which keeps "just point at one cortex" frictionless
//! without requiring a config file on disk.
//!
//! The design goal is "the missing ACP binary for users with multiple
//! API endpoints (possibly on a private LAN, possibly mixing wire
//! types)". Hence: every endpoint is named, has its own wire API, and
//! has its own default model. The agent's selected model id can be
//! prefixed `endpoint:model` to route across endpoints; a bare
//! `model` falls through to the configured `default_endpoint`.
//!
//! ### Example TOML
//!
//! ```toml
//! default_endpoint = "helexa"
//!
//! [[endpoints]]
//! name = "helexa"
//! base_url = "http://hanzalova.internal:31313/v1"
//! wire_api = "openai-chat"
//! default_model = "helexa/large"
//!
//! [[endpoints]]
//! name = "openrouter"
//! base_url = "https://openrouter.ai/api/v1"
//! wire_api = "openai-chat"
//! api_key_env = "OPENROUTER_API_KEY"
//! default_model = "anthropic/claude-opus-4"
//!
//! [[endpoints]]
//! name = "lmstudio"
//! base_url = "http://localhost:1234/v1"
//! wire_api = "openai-chat"
//! default_model = "auto"
//! ```
use anyhow::{Context, anyhow};
use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};
use url::Url;
const DEFAULT_BASE_URL: &str = "http://hanzalova.internal:31313/v1";
const DEFAULT_MODEL: &str = "helexa/large";
const DEFAULT_ENDPOINT_NAME: &str = "default";
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Config {
/// Name of the endpoint used when a request doesn't pick one
/// explicitly. Must reference an entry in `endpoints`. Defaults to
/// the first endpoint declared if unset.
#[serde(default)]
pub default_endpoint: Option<String>,
/// Per-endpoint configuration. At least one entry is required.
#[serde(default)]
pub endpoints: Vec<EndpointConfig>,
/// Optional path to a system-prompt file. When unset, the built-in
/// default prompt from `prompt.rs` is used.
#[serde(default)]
pub system_prompt_path: Option<PathBuf>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct EndpointConfig {
/// Short identifier used in `endpoint:model` routing and in logs.
pub name: String,
/// Base URL of the OpenAI-compatible API. Must include the `/v1`
/// (or equivalent) suffix — paths like `chat/completions` and
/// `models` are joined onto this.
pub base_url: Url,
/// Wire protocol the endpoint speaks. [`WireApi::OpenAiChat`] and
/// [`WireApi::OpenAiResponses`] have providers; `anthropic-messages`
/// is reserved until its provider lands. Defaults to `openai-chat`.
#[serde(default)]
pub wire_api: WireApi,
/// Model to use when the client hasn't picked one via
/// `session/set_model`.
#[serde(default)]
pub default_model: Option<String>,
/// Static API key to send as `Authorization: Bearer …`. Prefer
/// `api_key_env` for anything sensitive — keys in plain TOML are a
/// liability.
#[serde(default)]
pub api_key: Option<String>,
/// Env var name to read for the API key. Resolved at startup so a
/// missing env var yields a clear error rather than silent
/// unauthenticated calls.
#[serde(default)]
pub api_key_env: Option<String>,
/// Cap on the model's output tokens per turn. `None` lets the
/// upstream pick its own default (cortex/neuron's default is
/// often small enough to trip Zed's "Output Limit Reached" on
/// long responses). Set to e.g. `32768` to let the model
/// produce longer turns. Sent as `max_tokens` on the chat
/// wire API and as `max_output_tokens` on the Responses API.
#[serde(default)]
pub max_tokens: Option<u64>,
/// Model context window in tokens (prompt + response). When set,
/// the agent compacts conversation history before each completion
/// so the prompt fits within `context_window - max_tokens - safety`
/// tokens — long sessions on small-context local models (Qwen3 at
/// 32 K) survive past the first few tool-call rounds rather than
/// dying with `prompt_too_long`. `None` disables compaction.
#[serde(default)]
pub context_window: Option<usize>,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default, Serialize, Deserialize)]
pub enum WireApi {
/// `POST {base}/chat/completions` returning OpenAI-format SSE.
/// Compatible with cortex, LM Studio, Ollama (compat mode),
/// OpenRouter, OpenAI itself.
#[default]
#[serde(rename = "openai-chat")]
OpenAiChat,
/// `POST {base}/responses`: OpenAI's newer Responses API, served
/// by the openai-responses provider.
#[serde(rename = "openai-responses")]
OpenAiResponses,
/// `POST {base}/messages` — Anthropic format. Reserved.
#[serde(rename = "anthropic-messages")]
AnthropicMessages,
}
impl EndpointConfig {
/// Resolve the API key from `api_key` (literal) or `api_key_env`
/// (env-var lookup). Returns `Ok(None)` when neither is set;
/// `Err` when `api_key_env` references a missing variable.
pub fn resolve_api_key(&self) -> anyhow::Result<Option<String>> {
if let Some(literal) = &self.api_key {
return Ok(Some(literal.clone()));
}
if let Some(var) = &self.api_key_env {
return Ok(Some(std::env::var(var).with_context(|| {
format!(
"endpoint '{}' references missing env var {}",
self.name, var
)
})?));
}
Ok(None)
}
/// `{base_url}/chat/completions`.
pub fn chat_completions_url(&self) -> Url {
join_segments(&self.base_url, &["chat", "completions"])
}
/// `{base_url}/responses` — OpenAI Responses API endpoint.
pub fn responses_url(&self) -> Url {
join_segments(&self.base_url, &["responses"])
}
/// `{base_url}/models`. Called from `Provider::list_models`, which
/// Stage 4 wires into the model-picker dropdown; until then it's
/// reachable code with no in-tree callers.
#[allow(dead_code)]
pub fn models_url(&self) -> Url {
join_segments(&self.base_url, &["models"])
}
}
impl Config {
/// Load from TOML at the standard config path, or build from env
/// vars if no file exists. Env-fallback yields a single endpoint
/// named `"default"`.
pub fn load() -> anyhow::Result<Self> {
let path = config_path();
if let Some(path) = &path
&& path.exists()
{
return Self::from_file(path);
}
Self::from_env()
}
/// Single-endpoint config constructed from `HELEXA_ACP_BASE_URL`,
/// `HELEXA_ACP_MODEL`, `HELEXA_ACP_API_KEY`,
/// `HELEXA_ACP_SYSTEM_PROMPT_PATH`, `HELEXA_ACP_MAX_TOKENS`,
/// `HELEXA_ACP_CONTEXT_WINDOW`.
pub fn from_env() -> anyhow::Result<Self> {
let base_url = std::env::var("HELEXA_ACP_BASE_URL")
.ok()
.unwrap_or_else(|| DEFAULT_BASE_URL.into());
let base_url = Url::parse(&base_url)
.with_context(|| format!("HELEXA_ACP_BASE_URL is not a valid URL ({base_url})"))?;
let default_model = std::env::var("HELEXA_ACP_MODEL")
.ok()
.unwrap_or_else(|| DEFAULT_MODEL.into());
let api_key = std::env::var("HELEXA_ACP_API_KEY")
.ok()
.filter(|s| !s.is_empty());
let system_prompt_path = std::env::var("HELEXA_ACP_SYSTEM_PROMPT_PATH")
.ok()
.filter(|s| !s.is_empty())
.map(PathBuf::from);
let max_tokens = std::env::var("HELEXA_ACP_MAX_TOKENS")
.ok()
.filter(|s| !s.is_empty())
.map(|s| {
s.parse::<u64>().with_context(|| {
format!("HELEXA_ACP_MAX_TOKENS is not a valid non-negative integer ({s})")
})
})
.transpose()?;
let context_window = std::env::var("HELEXA_ACP_CONTEXT_WINDOW")
.ok()
.filter(|s| !s.is_empty())
.map(|s| {
s.parse::<usize>().with_context(|| {
format!("HELEXA_ACP_CONTEXT_WINDOW is not a valid non-negative integer ({s})")
})
})
.transpose()?;
Ok(Self {
default_endpoint: Some(DEFAULT_ENDPOINT_NAME.into()),
endpoints: vec![EndpointConfig {
name: DEFAULT_ENDPOINT_NAME.into(),
base_url,
wire_api: WireApi::OpenAiChat,
default_model: Some(default_model),
api_key,
api_key_env: None,
max_tokens,
context_window,
}],
system_prompt_path,
})
}
pub fn from_file(path: &Path) -> anyhow::Result<Self> {
let text = std::fs::read_to_string(path)
.with_context(|| format!("read config {}", path.display()))?;
let mut cfg: Self =
toml::from_str(&text).with_context(|| format!("parse config {}", path.display()))?;
cfg.validate()?;
Ok(cfg)
}
fn validate(&mut self) -> anyhow::Result<()> {
if self.endpoints.is_empty() {
return Err(anyhow!("config has no [[endpoints]] entries"));
}
for (i, ep) in self.endpoints.iter().enumerate() {
if ep.name.is_empty() {
return Err(anyhow!("endpoints[{i}] has empty name"));
}
if ep.name.contains(':') {
return Err(anyhow!(
"endpoints[{i}].name '{}' contains ':' which would clash \
with the endpoint:model selector syntax",
ep.name
));
}
}
// Pick a default endpoint if none was named.
if self.default_endpoint.is_none() {
self.default_endpoint = Some(self.endpoints[0].name.clone());
}
let default_name = self.default_endpoint.as_deref().unwrap();
if !self.endpoints.iter().any(|e| e.name == default_name) {
return Err(anyhow!(
"default_endpoint '{default_name}' is not declared in [[endpoints]]"
));
}
Ok(())
}
/// Look up an endpoint by name. Returns `None` if not configured.
pub fn endpoint(&self, name: &str) -> Option<&EndpointConfig> {
self.endpoints.iter().find(|e| e.name == name)
}
/// The default endpoint (guaranteed to exist after `validate`).
pub fn default_endpoint(&self) -> &EndpointConfig {
let name = self
.default_endpoint
.as_deref()
.expect("default_endpoint set by validate");
self.endpoint(name)
.expect("default_endpoint resolves after validate")
}
}
/// Parse an ACP-side `model` field into (endpoint name, raw model id).
///
/// `helexa:helexa/large` → (`Some("helexa")`, `"helexa/large"`).
/// `helexa/large` → (`None`, `"helexa/large"`).
///
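/// The split happens at the FIRST colon. Model ids commonly contain
/// `/` (HuggingFace style) but rarely `:`. A model id that does contain
/// `:` (for example an Ollama-style `name:tag`) must be written with an
/// explicit endpoint prefix, as in `endpoint:name:tag`, so that the
/// first colon still separates the endpoint from the model.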
pub fn parse_model_selector(input: &str) -> (Option<&str>, &str) {
match input.split_once(':') {
Some((endpoint, model)) if !endpoint.is_empty() && !model.is_empty() => {
(Some(endpoint), model)
}
_ => (None, input),
}
}
fn config_path() -> Option<PathBuf> {
if let Ok(override_path) = std::env::var("HELEXA_ACP_CONFIG_PATH") {
return Some(PathBuf::from(override_path));
}
let xdg = std::env::var("XDG_CONFIG_HOME")
.ok()
.filter(|s| !s.is_empty());
let base = xdg.map(PathBuf::from).or_else(|| {
std::env::var("HOME")
.ok()
.map(|h| PathBuf::from(h).join(".config"))
})?;
Some(base.join("helexa-acp").join("config.toml"))
}
fn join_segments(base: &Url, segments: &[&str]) -> Url {
let mut out = base.clone();
if let Ok(mut path) = out.path_segments_mut() {
path.pop_if_empty().extend(segments.iter().copied());
}
out
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn url_join_handles_trailing_slash() {
let ep = EndpointConfig {
name: "x".into(),
base_url: Url::parse("http://h.internal:31313/v1").unwrap(),
wire_api: WireApi::OpenAiChat,
default_model: None,
api_key: None,
api_key_env: None,
max_tokens: None,
context_window: None,
};
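assert_eq!(
ep.chat_completions_url().as_str(),
"http://h.internal:31313/v1/chat/completions"
);
assert_eq!(
ep.responses_url().as_str(),
"http://h.internal:31313/v1/responses"
);
assert_eq!(
ep.models_url().as_str(),
"http://h.internal:31313/v1/models"
);
// A trailing slash on the base URL must not produce `//` in the path.
let slashed = EndpointConfig {
base_url: Url::parse("http://h.internal:31313/v1/").unwrap(),
..ep.clone()
};
assert_eq!(
slashed.responses_url().as_str(),
"http://h.internal:31313/v1/responses"
);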
}
#[test]
fn parses_model_selector() {
assert_eq!(
parse_model_selector("helexa:helexa/large"),
(Some("helexa"), "helexa/large")
);
assert_eq!(parse_model_selector("helexa/large"), (None, "helexa/large"));
assert_eq!(parse_model_selector("gpt-5"), (None, "gpt-5"));
// Edge case: a leading colon → no endpoint.
assert_eq!(parse_model_selector(":gpt-5"), (None, ":gpt-5"));
}
#[test]
fn env_fallback_builds_single_endpoint() {
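// Clear the variables `from_env` reads so the defaults apply. No
// values are set here, which limits (but does not eliminate) races
// with other tests that touch the process environment.
unsafe {
std::env::remove_var("HELEXA_ACP_BASE_URL");
std::env::remove_var("HELEXA_ACP_MODEL");
std::env::remove_var("HELEXA_ACP_API_KEY");
std::env::remove_var("HELEXA_ACP_MAX_TOKENS");
std::env::remove_var("HELEXA_ACP_CONTEXT_WINDOW");
}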
let cfg = Config::from_env().unwrap();
assert_eq!(cfg.endpoints.len(), 1);
assert_eq!(cfg.endpoints[0].name, "default");
assert_eq!(cfg.endpoints[0].base_url.as_str(), DEFAULT_BASE_URL);
assert_eq!(
cfg.endpoints[0].default_model.as_deref(),
Some(DEFAULT_MODEL)
);
}
#[test]
fn toml_parses_multi_endpoint() {
let toml_text = r#"
default_endpoint = "helexa"
[[endpoints]]
name = "helexa"
base_url = "http://hanzalova.internal:31313/v1"
default_model = "helexa/large"
[[endpoints]]
name = "openrouter"
base_url = "https://openrouter.ai/api/v1"
wire_api = "openai-chat"
api_key_env = "OPENROUTER_API_KEY"
default_model = "anthropic/claude-opus-4"
"#;
let mut cfg: Config = toml::from_str(toml_text).unwrap();
cfg.validate().unwrap();
assert_eq!(cfg.endpoints.len(), 2);
assert_eq!(cfg.default_endpoint().name, "helexa");
assert_eq!(cfg.endpoints[0].wire_api, WireApi::OpenAiChat);
assert_eq!(
cfg.endpoints[1].api_key_env.as_deref(),
Some("OPENROUTER_API_KEY")
);
}
#[test]
fn validate_rejects_colon_in_endpoint_name() {
let toml_text = r#"
[[endpoints]]
name = "bad:name"
base_url = "http://x/v1"
"#;
let mut cfg: Config = toml::from_str(toml_text).unwrap();
let err = cfg.validate().unwrap_err();
assert!(format!("{err}").contains("clash"));
}
}
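
The `endpoint:model` routing described in the module docs can be sketched end to end as follows. This is a minimal illustration under assumptions: `Endpoint` is a trimmed-down, hypothetical stand-in for `EndpointConfig`, `route` is a hypothetical helper that is not part of the file, and returning `None` for an unknown endpoint name is one possible policy rather than the crate's confirmed behavior. `parse_model_selector` is copied from the source above.

```rust
#[derive(Debug, PartialEq)]
struct Endpoint {
    name: String,
    default_model: Option<String>,
}

/// Split `endpoint:model` at the first colon; a bare model has no endpoint.
fn parse_model_selector(input: &str) -> (Option<&str>, &str) {
    match input.split_once(':') {
        Some((endpoint, model)) if !endpoint.is_empty() && !model.is_empty() => {
            (Some(endpoint), model)
        }
        _ => (None, input),
    }
}

/// Resolve a selector to (endpoint, model id), falling through to the
/// default endpoint when the selector carries no endpoint prefix.
fn route<'a>(
    endpoints: &'a [Endpoint],
    default_endpoint: &str,
    selector: &'a str,
) -> Option<(&'a Endpoint, &'a str)> {
    let (name, model) = parse_model_selector(selector);
    let name = name.unwrap_or(default_endpoint);
    let endpoint = endpoints.iter().find(|e| e.name == name)?;
    Some((endpoint, model))
}
```

Because the split is on the first colon, a selector such as `missing:gpt-5` names the endpoint `missing`; in this sketch that yields `None` when no such endpoint is configured.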