cortex

Author	SHA1	Message	Date
rob thijssen	3cccc2c56b	refactor(neuron): cut mistralrs/llamacpp, scaffold candle harness Stage 1 of the candle-native pivot. Replaces the external-process harness model (mistralrs over HTTP, llamacpp placeholder) with an in-process Harness trait whose sole implementation is candle. The trait keeps its shape so future engines slot in additively, but start/stop default to no-ops and HarnessConfig drops endpoint and systemd_unit since no harness needs external supervision. Behaviour is unchanged on the wire: load_model returns a "not implemented yet (Stage 2)" error and list_models is empty. The gateway-side proxy, poller, and router are untouched. CLAUDE.md Phase 11 (llama.cpp) and Phase 12 (mistral.rs COPR) are marked superseded; the staged plan lives in ~/.claude/plans/create-a-more-aggressive-calm-naur.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 15:53:04 +03:00
rob thijssen	3f94c50817	chore: move default ports out of common-collision ranges Previous defaults collided with well-trodden infra services and with the Linux ephemeral port range: - cortex API 8000 — common dev-server default (Django, minio UI) - cortex metrics 9100 — Prometheus node_exporter default - neuron API 9090 — Cockpit default on Fedora, Prometheus self Move to helexa-themed palindromic ports, all below Linux's 32768-60999 ephemeral range and not registered to any well-known service: - cortex API 31313 - cortex metrics 31314 - neuron API 13131 Updated places: - cortex.example.toml, neuron.example.toml defaults - default impls in cortex-core and neuron config - cortex-cli --endpoint default for the status subcommand - doc comments citing example URLs - README.md and CLAUDE.md snippets Consumers already on the old ports need a one-line edit in their /etc/cortex/cortex.toml or /etc/neuron/neuron.toml to match; firewall rules and prometheus scrape configs will also need updating. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 17:45:25 +03:00
rob thijssen	abe4ff7ccc	ci: publish both packages to a single helexa/helexa COPR project All checks were successful CI / Format, lint, build, test (push) Successful in 9m50s Details CI / Build neuron SRPM (push) Successful in 43s Details CI / Build cortex SRPM (push) Successful in 48s Details CI / Publish neuron to COPR (push) Successful in 6m14s Details CI / Publish cortex to COPR (push) Successful in 7m53s Details CI / Bump version in source (push) Successful in 31s Details Consolidates the previous helexa/cortex and helexa/helexa-neuron COPR projects into one shared project. Hosts enable a single repo and get access to both packages — cortex for gateway hosts and helexa-neuron for GPU nodes. Reduces the "which copr do I enable on this host" friction, and makes it clear the two packages are parts of the same helexa project suite. CI keeps two independent publish jobs (copr-cortex and copr-neuron) running in parallel; they now both target helexa/helexa with their respective SRPMs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 16:37:47 +03:00
rob thijssen	7c3390a4e1	fix(rpm): rename neuron package to helexa-neuron Fedora's official repos ship a package named `neuron` — the NEURON neural-simulation environment from Yale (see https://src.fedoraproject.org/rpms/neuron). Having our own `neuron` in the helexa COPR caused dnf5 to silently no-op `dnf install neuron` because of the name collision, even with the COPR repo enabled and keys imported. The only workarounds were full NEVRA (`dnf install neuron-0.1.12-1.fc43.x86_64`) or a local file install — neither acceptable for end-users. Rename the RPM package to `helexa-neuron`. Keep binary (/usr/bin/neuron), systemd unit (neuron.service), system user (neuron), and config dir (/etc/neuron) unchanged — those are project-local contexts where the short name is unambiguous. Follows Fedora subpackage-style naming except with a vendor prefix rather than a parent-package prefix, because neuron is an independent package from cortex (installed on different hosts) and neither depends on the other. Changes: - neuron.spec -> helexa-neuron.spec (git rename) - Name: neuron -> helexa-neuron (with comment explaining why) - CI: srpm-neuron job now builds helexa-neuron-VERSION.tar.gz with the matching top-level dir prefix, publishes to helexa/helexa-neuron COPR - CI: bump-version job references helexa-neuron.spec - CLAUDE.md: install instructions updated Old helexa/neuron COPR project can be deleted after the first helexa/helexa-neuron build lands. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 16:37:47 +03:00
rob thijssen	c85d50066e	ci: add RPM packaging for cortex and neuron - cortex.spec: gateway binary, cortex.service systemd unit, cortex.toml + models.toml config files - neuron.spec: neuron binary, neuron.service systemd unit, neuron.toml config file - Parallel CI: srpm-cortex and srpm-neuron jobs build SRPMs concurrently, then publish to separate COPR repos (helexa/cortex and helexa/neuron) - Shared cortex user/group across both packages - Example configs: cortex.example.toml, neuron.example.toml, models.example.toml Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 16:09:04 +03:00
rob thijssen	6c238f4557	refactor: rename cortex-neuron binary and crate to neuron All checks were successful CI / Format, lint, build, test (push) Successful in 2m28s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details Package name, lib name, and binary all now just "neuron" without the cortex- prefix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 15:51:15 +03:00
rob thijssen	e42e8ee81f	refactor: cortex talks to neurons instead of mistral.rs directly All checks were successful CI / Format, lint, build, test (push) Successful in 2m46s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details Replace NodeConfig (static vram_mb, pinned) with NeuronEndpoint. Hardware discovery and model pinning now come from neuron API and models.toml catalogue respectively. - config.rs: nodes -> neurons, add models_config path - catalogue.rs: ModelProfile with pinned_on, ModelCatalogue - poller.rs: poll neuron GET /models (ModelInfo format) - router.rs: resolve inference endpoint via neuron GET /models/{id}/endpoint - evictor.rs: call neuron POST /models/unload - node.rs: remove vram_mb, pinned fields (come from discovery/catalogue) - All 22 gateway tests updated to mock neuron API - Remove MistralModelsResponse, ModelLifecycleRequest (no longer needed) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:42:52 +03:00
rob thijssen	26e5e7ead8	feat: implement mistral.rs harness and neuron model API All checks were successful CI / Format, lint, build, test (push) Successful in 2m30s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details - MistralRsHarness: Harness trait impl wrapping mistral.rs HTTP API (list/load/unload models, health check, start/stop via systemd) - HarnessRegistry: maps harness name -> Box<dyn Harness>, built from neuron.toml config - Neuron API endpoints: GET /models, POST /models/load, POST /models/unload, GET /models/:id/endpoint - NeuronConfig: figment-based config loading from neuron.toml - Integration test: full model lifecycle through mock mistral.rs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:29:42 +03:00
rob thijssen	6dc717ebcd	feat: add neuron daemon with GPU discovery and health endpoints All checks were successful CI / Format, lint, build, test (push) Successful in 2m29s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details Replace cortex-agent stub with neuron (cortex-neuron binary). cortex-core additions: - discovery.rs: DeviceInfo, DiscoveryResponse, DeviceHealth, HealthResponse - harness.rs: Harness async trait, HarnessConfig, ModelSpec, ModelInfo neuron crate (crates/neuron/): - discovery.rs: nvidia-smi CSV parsing (pure functions) + system discovery via uname/nvidia-smi/nvcc - health.rs: cached GPU health polling every 5s - api.rs: GET /discovery and GET /health axum handlers - main.rs: CLI entrypoint with --port flag (default 9090) - harness stubs for mistralrs (Phase 8) and llamacpp (Phase 11) 12 new tests (9 unit + 3 integration), 35 total. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:23:42 +03:00
rob thijssen	67b9b044d3	feat: add per-request Prometheus metrics instrumentation All checks were successful CI / Format, lint, build, test (push) Successful in 2m26s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details Emit cortex_requests_total, cortex_request_duration_seconds, cortex_request_errors_total, and cortex_cold_starts_total with model and node labels on every proxied request. Add install_test_recorder() for testing metrics without HTTP listener. Integration test verifies counters and histograms appear after proxy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 19:42:09 +03:00
rob thijssen	29c8f10761	feat: implement non-streaming Anthropic response translation Wire up openai_to_anthropic in the /v1/messages handler: buffer upstream OpenAI response, parse, translate to Anthropic format (stop_reason mapping, usage field names, content blocks). 5 integration tests covering round-trip translation, system prompt, content blocks, and error cases. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 19:36:16 +03:00
rob thijssen	24c5e1e361	feat: add LRU eviction tests and last_accessed tracking All checks were successful CI / Format, lint, build, test (push) Successful in 2m37s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details - Add touch_model() in handlers to update last_accessed timestamp on every proxied request, driving LRU eviction ordering - 5 integration tests: LRU eviction, pinned model protection, nothing-to-evict case, lifecycle_cycles increment, and last_accessed update verification Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 19:34:08 +03:00
rob thijssen	d5f19b9ff2	test: add Phase 3 poller integration tests All checks were successful CI / Format, lint, build, test (push) Successful in 2m31s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details Extract public poll_once() from poll_loop() for testability. 4 tests proving the poller correctly discovers models, updates gateway state, marks unreachable nodes unhealthy, and prunes stale models. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 19:31:17 +03:00
rob thijssen	c2118aa81c	test: add Phase 2 streaming SSE passthrough tests All checks were successful CI / Format, lint, build, test (push) Successful in 2m36s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details Confirms the existing proxy streams SSE chunks incrementally: - 5-chunk test with 50ms delays verifies time spread between first and last chunk arrival (not buffered) - Verifies data: [DONE] terminator is forwarded No src/ changes needed — Body::from_stream(bytes_stream()) already handles SSE correctly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 19:28:33 +03:00
rob thijssen	1b339b1426	test: add Phase 1 integration tests for basic proxy Some checks failed CI / Build SRPM (push) Has been cancelled Details CI / Publish to COPR (push) Has been cancelled Details CI / Format, lint, build, test (push) Has been cancelled Details 6 tests proving the scaffold works end-to-end: - chat completion proxied through gateway to mock backend - /health endpoint with healthy node - /v1/models returns seeded model list - 404 for unknown model - 404 when no healthy nodes available - 400 when request body missing model field Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 19:26:12 +03:00
rob thijssen	3ad8c72276	docs: add CI expectations to CLAUDE.md and README.md All checks were successful CI / Format, lint, build, test (push) Successful in 2m6s Details CI / Build SRPM (push) Has been skipped Details CI / Publish to COPR (push) Has been skipped Details Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:27:17 +03:00
rob thijssen	0da68833af	feat: scaffold cortex workspace Rust reverse-proxy for multi-node mistral.rs inference clusters. Includes crate structure (cortex-core, cortex-gateway, cortex-agent, cortex-cli), config loading, OpenAI/Anthropic translation stubs, model routing, eviction, polling, and streaming proxy scaffolding. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:13:30 +03:00

17 Commits