Commit Graph

14 Commits

Author SHA1 Message Date
7c3390a4e1 fix(rpm): rename neuron package to helexa-neuron
Fedora's official repos ship a package named `neuron` — the NEURON
neural-simulation environment from Yale (see
https://src.fedoraproject.org/rpms/neuron). Having our own `neuron`
in the helexa COPR caused dnf5 to silently no-op `dnf install neuron`
because of the name collision, even with the COPR repo enabled and
keys imported. The only workarounds were a full NEVRA (`dnf install
neuron-0.1.12-1.fc43.x86_64`) or a local file install, neither of
which is acceptable for end users.

Rename the RPM package to `helexa-neuron`. Keep the binary
(/usr/bin/neuron), systemd unit (neuron.service), system user (neuron),
and config dir (/etc/neuron) unchanged, since those are project-local
contexts where the short name is unambiguous. The new name follows
Fedora subpackage-style naming, but with a vendor prefix rather than a
parent-package prefix, because neuron is independent of cortex: the two
are installed on different hosts and neither depends on the other.

Changes:
- neuron.spec -> helexa-neuron.spec (git rename)
- Name: neuron -> helexa-neuron (with comment explaining why)
- CI: srpm-neuron job now builds helexa-neuron-VERSION.tar.gz with the
  matching top-level dir prefix, publishes to helexa/helexa-neuron COPR
- CI: bump-version job references helexa-neuron.spec
- CLAUDE.md: install instructions updated

Old helexa/neuron COPR project can be deleted after the first
helexa/helexa-neuron build lands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 16:37:47 +03:00
c85d50066e ci: add RPM packaging for cortex and neuron
- cortex.spec: gateway binary, cortex.service systemd unit,
  cortex.toml + models.toml config files
- neuron.spec: neuron binary, neuron.service systemd unit,
  neuron.toml config file
- Parallel CI: srpm-cortex and srpm-neuron jobs build SRPMs
  concurrently, then publish to separate COPR repos
  (helexa/cortex and helexa/neuron)
- Shared cortex user/group across both packages
- Example configs: cortex.example.toml, neuron.example.toml,
  models.example.toml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:09:04 +03:00
6c238f4557 refactor: rename cortex-neuron binary and crate to neuron
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m28s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
Package name, lib name, and binary all now just "neuron" without
the cortex- prefix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 15:51:15 +03:00
e42e8ee81f refactor: cortex talks to neurons instead of mistral.rs directly
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m46s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
Replace NodeConfig (static vram_mb, pinned) with NeuronEndpoint.
Hardware discovery and model pinning now come from neuron API and
models.toml catalogue respectively.

- config.rs: nodes -> neurons, add models_config path
- catalogue.rs: ModelProfile with pinned_on, ModelCatalogue
- poller.rs: poll neuron GET /models (ModelInfo format)
- router.rs: resolve inference endpoint via neuron GET /models/{id}/endpoint
- evictor.rs: call neuron POST /models/unload
- node.rs: remove vram_mb, pinned fields (come from discovery/catalogue)
- All 22 gateway tests updated to mock neuron API
- Remove MistralModelsResponse, ModelLifecycleRequest (no longer needed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:42:52 +03:00
26e5e7ead8 feat: implement mistral.rs harness and neuron model API
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m30s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
- MistralRsHarness: Harness trait impl wrapping mistral.rs HTTP API
  (list/load/unload models, health check, start/stop via systemd)
- HarnessRegistry: maps harness name -> Box<dyn Harness>, built from
  neuron.toml config
- Neuron API endpoints: GET /models, POST /models/load,
  POST /models/unload, GET /models/:id/endpoint
- NeuronConfig: figment-based config loading from neuron.toml
- Integration test: full model lifecycle through mock mistral.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:29:42 +03:00
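
The name-to-harness mapping described in 26e5e7ead8 could look roughly like the sketch below. It is a simplified illustration, not the project's code: the commit describes an async `Harness` trait, while this version is synchronous, and the `name()` and `list_models()` methods, the `register()`/`get()` helpers, and the `base_url` field are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Simplified, synchronous stand-in for the Harness trait.
/// The real trait is async; these two methods are assumptions.
pub trait Harness {
    fn name(&self) -> &str;
    fn list_models(&self) -> Vec<String>;
}

/// Hypothetical harness standing in for the mistral.rs HTTP wrapper.
pub struct MistralRsHarness {
    pub base_url: String,
}

impl Harness for MistralRsHarness {
    fn name(&self) -> &str {
        "mistralrs"
    }

    fn list_models(&self) -> Vec<String> {
        // A real implementation would query the HTTP API at `base_url`.
        // This sketch performs no I/O and reports no models.
        Vec::new()
    }
}

/// Maps a harness name to a boxed trait object.
#[derive(Default)]
pub struct HarnessRegistry {
    harnesses: HashMap<String, Box<dyn Harness>>,
}

impl HarnessRegistry {
    /// Store the harness under the name it reports for itself.
    pub fn register(&mut self, harness: Box<dyn Harness>) {
        self.harnesses.insert(harness.name().to_string(), harness);
    }

    /// Look a harness up by name, borrowing it from the registry.
    pub fn get(&self, name: &str) -> Option<&dyn Harness> {
        self.harnesses.get(name).map(|h| &**h)
    }
}
```

Keying the map by the harness's own name keeps the config side simple: a config entry only needs to carry a string such as `mistralrs`, and an unknown name surfaces as `None` rather than a panic.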
6dc717ebcd feat: add neuron daemon with GPU discovery and health endpoints
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m29s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
Replace cortex-agent stub with neuron (cortex-neuron binary).

cortex-core additions:
- discovery.rs: DeviceInfo, DiscoveryResponse, DeviceHealth, HealthResponse
- harness.rs: Harness async trait, HarnessConfig, ModelSpec, ModelInfo

neuron crate (crates/neuron/):
- discovery.rs: nvidia-smi CSV parsing (pure functions) + system
  discovery via uname/nvidia-smi/nvcc
- health.rs: cached GPU health polling every 5s
- api.rs: GET /discovery and GET /health axum handlers
- main.rs: CLI entrypoint with --port flag (default 9090)
- harness stubs for mistralrs (Phase 8) and llamacpp (Phase 11)

12 new tests (9 unit + 3 integration), 35 total.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:23:42 +03:00
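
The "nvidia-smi CSV parsing (pure functions)" item in 6dc717ebcd can be illustrated with a small sketch. It assumes the output of `nvidia-smi --query-gpu=index,name,memory.total,memory.used --format=csv,noheader,nounits`; the actual field selection, the `GpuRow` struct, and both function names are assumptions, since the commit does not list them.

```rust
/// One GPU as reported by a single CSV row.
/// The choice of fields is an assumption made for this sketch.
#[derive(Debug, PartialEq)]
pub struct GpuRow {
    pub index: u32,
    pub name: String,
    pub memory_total_mib: u64,
    pub memory_used_mib: u64,
}

/// Parse one row of `index, name, memory.total, memory.used`.
/// Returns None for blank or malformed rows instead of panicking.
pub fn parse_gpu_row(line: &str) -> Option<GpuRow> {
    let fields: Vec<&str> = line.split(',').map(str::trim).collect();
    if fields.len() != 4 {
        return None;
    }
    Some(GpuRow {
        index: fields[0].parse().ok()?,
        name: fields[1].to_string(),
        memory_total_mib: fields[2].parse().ok()?,
        memory_used_mib: fields[3].parse().ok()?,
    })
}

/// Parse the whole command output, skipping rows that do not parse.
pub fn parse_gpu_csv(output: &str) -> Vec<GpuRow> {
    output.lines().filter_map(parse_gpu_row).collect()
}
```

Because the parser only takes a `&str`, it can be unit-tested against captured output on machines that have no GPU or driver installed, which is presumably the point of keeping it pure.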
67b9b044d3 feat: add per-request Prometheus metrics instrumentation
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m26s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
Emit cortex_requests_total, cortex_request_duration_seconds,
cortex_request_errors_total, and cortex_cold_starts_total with
model and node labels on every proxied request.

Add install_test_recorder() for testing metrics without HTTP listener.
Integration test verifies counters and histograms appear after proxy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:42:09 +03:00
29c8f10761 feat: implement non-streaming Anthropic response translation
Wire up openai_to_anthropic in the /v1/messages handler: buffer
upstream OpenAI response, parse, translate to Anthropic format
(stop_reason mapping, usage field names, content blocks).

5 integration tests covering round-trip translation, system prompt,
content blocks, and error cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:36:16 +03:00
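
The stop_reason and usage mapping mentioned in 29c8f10761 might be sketched as follows. The value pairs reflect the public OpenAI and Anthropic response formats; the function and struct names are hypothetical, and the fallback for unrecognised finish reasons is an assumption, since the commit does not say how the real `openai_to_anthropic` handles them.

```rust
/// Map an OpenAI `finish_reason` to an Anthropic `stop_reason`.
pub fn map_stop_reason(finish_reason: &str) -> &'static str {
    match finish_reason {
        "stop" => "end_turn",
        "length" => "max_tokens",
        "tool_calls" => "tool_use",
        // Assumption: treat anything unrecognised as a normal end of turn.
        _ => "end_turn",
    }
}

/// Token usage as named in an OpenAI chat completion response.
pub struct OpenAiUsage {
    pub prompt_tokens: u64,
    pub completion_tokens: u64,
}

/// Token usage as named in an Anthropic messages response.
pub struct AnthropicUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
}

/// Rename the usage fields; the counts themselves pass through unchanged.
pub fn map_usage(usage: &OpenAiUsage) -> AnthropicUsage {
    AnthropicUsage {
        input_tokens: usage.prompt_tokens,
        output_tokens: usage.completion_tokens,
    }
}
```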
24c5e1e361 feat: add LRU eviction tests and last_accessed tracking
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m37s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
- Add touch_model() in handlers to update last_accessed timestamp
  on every proxied request, driving LRU eviction ordering
- 5 integration tests: LRU eviction, pinned model protection,
  nothing-to-evict case, lifecycle_cycles increment, and
  last_accessed update verification

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:34:08 +03:00
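
A minimal sketch of how `touch_model()` could drive LRU ordering as described in 24c5e1e361, with pinned models excluded from eviction. Only the `touch_model` name comes from the commit; the `ModelTable` and `LoadedModel` types, the `eviction_candidate()` helper, and passing the timestamp in explicitly are assumptions made to keep the example self-contained and testable.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Per-model bookkeeping needed for LRU eviction.
pub struct LoadedModel {
    pub pinned: bool,
    pub last_accessed: Instant,
}

/// Hypothetical table of currently loaded models, keyed by model id.
#[derive(Default)]
pub struct ModelTable {
    models: HashMap<String, LoadedModel>,
}

impl ModelTable {
    pub fn insert(&mut self, id: &str, pinned: bool, now: Instant) {
        self.models
            .insert(id.to_string(), LoadedModel { pinned, last_accessed: now });
    }

    /// Record an access. Returns false if the model is not loaded.
    pub fn touch_model(&mut self, id: &str, now: Instant) -> bool {
        match self.models.get_mut(id) {
            Some(model) => {
                model.last_accessed = now;
                true
            }
            None => false,
        }
    }

    /// The least recently used unpinned model, or None when every
    /// loaded model is pinned (the nothing-to-evict case).
    pub fn eviction_candidate(&self) -> Option<&str> {
        self.models
            .iter()
            .filter(|(_, m)| !m.pinned)
            .min_by_key(|(_, m)| m.last_accessed)
            .map(|(id, _)| id.as_str())
    }
}
```

Taking `now` as a parameter rather than calling `Instant::now()` inside the methods lets tests control ordering without sleeping.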
d5f19b9ff2 test: add Phase 3 poller integration tests
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m31s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
Extract public poll_once() from poll_loop() for testability.
4 tests proving the poller correctly discovers models, updates
gateway state, marks unreachable nodes unhealthy, and prunes
stale models.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:31:17 +03:00
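
The shape of a testable `poll_once()` as described in d5f19b9ff2 might resemble the sketch below. Only the function name comes from the commit; the state types and the closure-based `fetch` parameter are assumptions, used here in place of a real HTTP client so that a single pass can be exercised without a network.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical per-node state tracked by the gateway.
#[derive(Default)]
pub struct NodeState {
    pub healthy: bool,
    pub models: HashSet<String>,
}

/// Hypothetical gateway state: node name mapped to its last known status.
#[derive(Default)]
pub struct GatewayState {
    pub nodes: HashMap<String, NodeState>,
}

/// One polling pass over all known nodes.
/// `fetch` stands in for the HTTP call that lists a node's models.
pub fn poll_once<F>(state: &mut GatewayState, mut fetch: F)
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    for (name, node) in state.nodes.iter_mut() {
        match fetch(name.as_str()) {
            Ok(models) => {
                node.healthy = true;
                // Replacing the set adds newly discovered models and
                // prunes stale ones in a single step.
                node.models = models.into_iter().collect();
            }
            Err(_) => {
                // Unreachable node: mark it unhealthy. Keeping its last
                // known model set is an assumption of this sketch.
                node.healthy = false;
            }
        }
    }
}
```

A `poll_loop()` would then reduce to calling `poll_once()` on a timer, which is what makes the extraction useful for tests.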
c2118aa81c test: add Phase 2 streaming SSE passthrough tests
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m36s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
Confirms the existing proxy streams SSE chunks incrementally:
- 5-chunk test with 50ms delays verifies time spread between first
  and last chunk arrival (not buffered)
- Verifies data: [DONE] terminator is forwarded

No src/ changes needed — Body::from_stream(bytes_stream()) already
handles SSE correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:28:33 +03:00
1b339b1426 test: add Phase 1 integration tests for basic proxy
Some checks failed
CI / Build SRPM (push) Has been cancelled
CI / Publish to COPR (push) Has been cancelled
CI / Format, lint, build, test (push) Has been cancelled
6 tests proving the scaffold works end-to-end:
- chat completion proxied through gateway to mock backend
- /health endpoint with healthy node
- /v1/models returns seeded model list
- 404 for unknown model
- 404 when no healthy nodes available
- 400 when request body missing model field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:26:12 +03:00
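
The three error cases covered by the tests in 1b339b1426 suggest a routing decision along the lines of the sketch below. The `Node` struct and `route()` function are hypothetical; only the status codes (400 for a missing model field, 404 for an unknown model, 404 when no healthy node is available) come from the commit message.

```rust
/// Hypothetical view of a backend node as seen by the router.
pub struct Node {
    pub healthy: bool,
    pub models: Vec<String>,
}

/// Pick a node for the requested model.
/// Returns the index of the chosen node, or the HTTP status to send back.
pub fn route(model: Option<&str>, nodes: &[Node]) -> Result<usize, u16> {
    // 400: the request body did not contain a `model` field.
    let model = model.ok_or(400u16)?;

    let serves = |n: &Node| n.models.iter().any(|m| m.as_str() == model);

    // 404: no node lists this model at all.
    if !nodes.iter().any(|n| serves(n)) {
        return Err(404);
    }

    // 404: the model is known, but no node serving it is healthy.
    nodes
        .iter()
        .position(|n| n.healthy && serves(n))
        .ok_or(404)
}
```

For example, with one unhealthy and one healthy node both serving `m1`, `route(Some("m1"), &nodes)` selects the healthy node, while `route(None, &nodes)` yields `Err(400)`.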
3ad8c72276 docs: add CI expectations to CLAUDE.md and README.md
All checks were successful
CI / Format, lint, build, test (push) Successful in 2m6s
CI / Build SRPM (push) Has been skipped
CI / Publish to COPR (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:27:17 +03:00
0da68833af feat: scaffold cortex workspace
Rust reverse-proxy for multi-node mistral.rs inference clusters.
Includes crate structure (cortex-core, cortex-gateway, cortex-agent,
cortex-cli), config loading, OpenAI/Anthropic translation stubs,
model routing, eviction, polling, and streaming proxy scaffolding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:13:30 +03:00