cortex

Author	SHA1	Message	Date
rob thijssen	aad314cdfa	feat(neuron): graceful unload-on-shutdown via SIGTERM/SIGINT Stage 6 of the candle-native pivot. Adds first-class deactivation: neuron now drains in-flight requests on SIGTERM (systemd stop) or SIGINT (Ctrl-C), then unloads every loaded model before the process exits, releasing CUDA contexts and VRAM cleanly rather than leaving the OS to reclaim them. Mechanism: - startup::shutdown_signal() resolves on either ctrl_c() or a SIGTERM listener. - axum::serve(...).with_graceful_shutdown(shutdown_signal()) stops accepting new connections, lets active requests finish, then returns control to main. - startup::unload_all_models(&registry) iterates list_all_models() and calls unload per entry. Per-model failures are logged warnings; cleanup continues. Empty registry is a fast no-op. - main holds an Arc<NeuronState> reference past axum's lifetime so the registry is still reachable for the unload sweep. data/neuron.service: - TimeoutStopSec=120s: generous bound for big-model unloads before systemd escalates to SIGKILL. - KillSignal=SIGTERM: explicit, matches the handler. Two non-gated tests cover the empty-registry no-op and the no-models-loaded path. Real load-then-unload-on-shutdown is exercised by the cuda-integration test from Stage 2 (which calls unload_model directly) and observable on a real GPU host by stopping the service and watching nvidia-smi. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 17:58:07 +03:00
rob thijssen	6779b7526a	feat(neuron): load default_models on service activation Stage 5 of the candle-native pivot. Adds first-class support for auto-loading a configured set of models when the neuron service activates. Config: - NeuronConfig.default_models: Vec<ModelSpec> (defaults to []). - neuron.example.toml ships a commented [[default_models]] example. Activation flow (crates/neuron/src/startup.rs::load_default_models): - Sequential: VRAM contention makes parallel loads risky. - Per-entry timing logged at info level on success. - Failures logged as warnings; the next entry is still attempted. - An empty list short-circuits without log noise. Called from main.rs after the registry is built and before the axum listener binds, so /models reflects the loaded state from the very first request. data/neuron.service gains TimeoutStartSec=1800s. With activation blocked on potentially slow first-time HF downloads + GGUF materialisation, systemd's default 90s would kill larger model loads mid-flight. Two non-gated tests in tests/activation.rs cover the continues-past-failure and empty-list paths using a synthetically unknown harness name to fail loads fast without touching the network. The cuda-integration test from earlier stages still exercises the real load/unload lifecycle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 17:56:08 +03:00
rob thijssen	472c0e8737	fix(rpm): ship firewalld service definitions with correct ports cortex: opens 31313/tcp (API) and 31314/tcp (metrics) neuron: opens 13131/tcp Installs to /usr/lib/firewalld/services/ so firewall-cmd --add-service=cortex / --add-service=helexa-neuron works out of the box. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-11 12:52:20 +03:00
rob thijssen	9697fbae73	fix(neuron): run service as neuron user, not cortex neuron and cortex are independent packages installable on different hosts. Having neuron run under a 'cortex' system user implied a shared identity that doesn't exist. Give neuron its own user/group. - New data/neuron-sysusers.conf declares the neuron user/group with home /var/lib/neuron. - systemd unit User/Group changed to neuron. - Spec file attrs, explicit Provides, and %sysusers_create_compat updated to reference the neuron user. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:32:36 +03:00
rob thijssen	142e91c3f7	fix(neuron): install config at /etc/neuron/, not /etc/cortex/ The neuron package was shipping its config at /etc/cortex/neuron.toml, which implied a shared config directory between two independent packages. Move to /etc/neuron/neuron.toml; neuron owns its own etc dir, consistent with its own /usr/lib/sysusers.d/neuron.conf and /usr/lib/systemd/system/neuron.service. Updated the systemd unit's ExecStart path and the example toml header to match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:07:06 +03:00
rob thijssen	209150771e	fix(rpm): use sysusers.d for cortex user/group creation Both packages set %attr(...,cortex) on their config files, which caused RPM's auto-dep-generator to emit Requires: group(cortex) / user(cortex). The %pre scriptlets that actually created the group ran too late — dnf rejected neuron installation on hosts without cortex because nothing Provided group(cortex). Switch to systemd-sysusers declarative user creation: each package ships its own named sysusers.d file (cortex-gateway.conf and cortex-neuron.conf — different names so both packages can coinstall) with identical content defining the cortex user/group. RPM's user/group dep generator now emits Provides: user(cortex) and Provides: group(cortex) automatically from the sysusers.d files, satisfying the auto-generated Requires. Either package installs standalone; both can coinstall on the gateway host if desired. Also added Requires: systemd since %sysusers_create_compat depends on systemd-sysusers being present on the target. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 11:18:37 +03:00
rob thijssen	c85d50066e	ci: add RPM packaging for cortex and neuron - cortex.spec: gateway binary, cortex.service systemd unit, cortex.toml + models.toml config files - neuron.spec: neuron binary, neuron.service systemd unit, neuron.toml config file - Parallel CI: srpm-cortex and srpm-neuron jobs build SRPMs concurrently, then publish to separate COPR repos (helexa/cortex and helexa/neuron) - Shared cortex user/group across both packages - Example configs: cortex.example.toml, neuron.example.toml, models.example.toml Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 16:09:04 +03:00

7 Commits
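
The activation and shutdown behaviour described in commits 6779b7526a and aad314cdfa (sequential loads that continue past failures, an unload sweep that logs per-model errors and keeps going, and fast no-ops for empty input) can be illustrated with a minimal std-only sketch. This is not the code from crates/neuron: the `Registry` type, its `load` and `unload` methods, and the rule that names starting with "unknown" fail to load are hypothetical stand-ins chosen for illustration. Only the function names `load_default_models`, `unload_all_models` and `list_all_models` are taken from the commit messages, and their signatures here are assumed.

```rust
use std::collections::BTreeSet;
use std::time::Instant;

/// Hypothetical stand-in for neuron's model registry (not the real API).
#[derive(Default)]
struct Registry {
    loaded: BTreeSet<String>,
}

impl Registry {
    /// Assumption for the sketch: names starting with "unknown" fail fast,
    /// mimicking an unknown harness name that never touches the network.
    fn load(&mut self, name: &str) -> Result<(), String> {
        if name.starts_with("unknown") {
            return Err(format!("no harness for {name}"));
        }
        self.loaded.insert(name.to_string());
        Ok(())
    }

    fn unload(&mut self, name: &str) -> Result<(), String> {
        if self.loaded.remove(name) {
            Ok(())
        } else {
            Err(format!("{name} is not loaded"))
        }
    }

    fn list_all_models(&self) -> Vec<String> {
        self.loaded.iter().cloned().collect()
    }
}

/// Load the configured defaults one at a time. A failure is logged as a
/// warning and the next entry is still attempted. Returns the number loaded.
fn load_default_models(reg: &mut Registry, defaults: &[&str]) -> usize {
    if defaults.is_empty() {
        return 0; // empty list: short-circuit without log noise
    }
    let mut loaded = 0;
    for name in defaults {
        let started = Instant::now();
        match reg.load(name) {
            Ok(()) => {
                eprintln!("info: loaded {name} in {:?}", started.elapsed());
                loaded += 1;
            }
            Err(err) => eprintln!("warn: failed to load {name}: {err}"),
        }
    }
    loaded
}

/// Unload every model currently in the registry. Per-model failures are
/// logged and the sweep continues. Returns the number of failures.
fn unload_all_models(reg: &mut Registry) -> usize {
    let mut failures = 0;
    for name in reg.list_all_models() {
        if let Err(err) = reg.unload(&name) {
            eprintln!("warn: failed to unload {name}: {err}");
            failures += 1;
        }
    }
    failures
}

fn main() {
    let mut reg = Registry::default();
    let loaded = load_default_models(&mut reg, &["model-a", "unknown-harness", "model-b"]);
    println!("loaded {loaded} of 3");
    let failures = unload_all_models(&mut reg);
    println!("unload failures: {failures}, remaining: {}", reg.list_all_models().len());
}
```

In the real service the unload sweep would run after the HTTP server's graceful shutdown returns; that signal handling is left out of the sketch because it depends on the async runtime.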