Two fixes uncovered by the live validation against beast/benjy/quadbrat:
1. api.rs swallowed everything beyond the outermost anyhow context.
The validation script reported '{"error":"fetch GGUF ...gguf"}' but
the actual underlying hf-hub failure (cache dir creation, network,
auth, etc.) was hidden. Switching every error response to
format!("{e:#}") expands the full cause chain via anyhow's
alternate Display format.
2. The neuron systemd unit declared the service user but never ensured
/var/lib/neuron (its $HOME) existed. hf-hub defaults its cache to
~/.cache/huggingface/hub — when $HOME is absent the cache dir
creation fails and the download aborts. Adding `StateDirectory=neuron`
makes systemd create and chown that directory at activation; no spec
change needed.
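To illustrate fix 1, here is a std-only sketch of what expanding a cause chain means. It is not the code in api.rs: `Layered`, `full_chain`, `sample_error` and the messages are hypothetical stand-ins. anyhow's `{e:#}` joins the outer context and each cause with ": " in the same way, whereas plain `{e}` prints only the outermost layer.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical error type: one message plus an optional underlying cause,
// standing in for an anyhow error with layered context.
#[derive(Debug)]
struct Layered {
    msg: String,
    source: Option<Box<dyn Error + 'static>>,
}

impl fmt::Display for Layered {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only this layer's message, like plain "{e}" on the outer error.
        write!(f, "{}", self.msg)
    }
}

impl Error for Layered {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref()
    }
}

// Walk source() links and join every layer with ": ".
fn full_chain(err: &(dyn Error + 'static)) -> String {
    let mut out = err.to_string();
    let mut cause = err.source();
    while let Some(inner) = cause {
        out.push_str(": ");
        out.push_str(&inner.to_string());
        cause = inner.source();
    }
    out
}

// Illustrative three-layer error; the messages are made up for this sketch.
fn sample_error() -> Layered {
    let innermost = Layered {
        msg: "home directory missing".into(),
        source: None,
    };
    let middle = Layered {
        msg: "create cache dir".into(),
        source: Some(Box::new(innermost)),
    };
    Layered {
        msg: "fetch model".into(),
        source: Some(Box::new(middle)),
    }
}
```

Calling `sample_error().to_string()` yields only the outer message, which is the situation the validation script ran into; `full_chain(&sample_error())` includes every cause.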
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stage 6 of the candle-native pivot. Adds first-class deactivation:
neuron now drains in-flight requests on SIGTERM (systemd stop) or
SIGINT (Ctrl-C), then unloads every loaded model before the process
exits — releasing CUDA contexts and VRAM cleanly rather than leaving
the OS to reclaim them.
Mechanism:
- startup::shutdown_signal() resolves on either ctrl_c() or a
SIGTERM listener.
- axum::serve(...).with_graceful_shutdown(shutdown_signal()) stops
accepting new connections, lets active requests finish, then
returns control to main.
- startup::unload_all_models(&registry) iterates list_all_models()
and calls unload per entry. Per-model failures are logged warnings;
cleanup continues. Empty registry is a fast no-op.
- main holds an Arc<NeuronState> reference past axum's lifetime so
the registry is still reachable for the unload sweep.
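The unload sweep described above can be sketched with std only. The `Registry` type below is a hypothetical stand-in (the real registry and its method signatures are not shown in this message), and returning the failed ids is an assumption made so the behaviour is easy to check.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the model registry.
struct Registry {
    // model id -> whether unloading it should fail (demo knob only)
    models: BTreeMap<String, bool>,
}

impl Registry {
    fn list_all_models(&self) -> Vec<String> {
        self.models.keys().cloned().collect()
    }

    fn unload(&mut self, id: &str) -> Result<(), String> {
        match self.models.get(id).copied() {
            Some(true) => Err(format!("unload of {id} failed")),
            Some(false) => {
                self.models.remove(id);
                Ok(())
            }
            None => Err(format!("{id} is not loaded")),
        }
    }
}

// Try to unload every model. A failure is logged as a warning and the
// sweep moves on to the next entry. Returns the ids that failed.
fn unload_all_models(registry: &mut Registry) -> Vec<String> {
    let mut failed = Vec::new();
    // An empty registry yields an empty list, so the loop body never runs.
    for id in registry.list_all_models() {
        if let Err(err) = registry.unload(&id) {
            eprintln!("warning: {err}; continuing cleanup");
            failed.push(id);
        }
    }
    failed
}
```

Snapshotting the ids before the loop keeps the sweep independent of registry mutation during unload, which is one reasonable way to get the "cleanup continues" behaviour.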
data/neuron.service:
- TimeoutStopSec=120s — generous bound for big-model unloads before
systemd escalates to SIGKILL.
- KillSignal=SIGTERM — explicit, matches the handler.
Two non-gated tests cover the empty-registry no-op and the no-models-
loaded path. Real load-then-unload-on-shutdown is exercised by the
cuda-integration test from Stage 2 (which calls unload_model directly)
and observable on a real GPU host by stopping the service and
watching nvidia-smi.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stage 5 of the candle-native pivot. Adds first-class support for
auto-loading a configured set of models when the neuron service
activates.
Config:
- NeuronConfig.default_models: Vec<ModelSpec> (defaults to []).
- neuron.example.toml ships a commented [[default_models]] example.
Activation flow (crates/neuron/src/startup.rs::load_default_models):
- Sequential — VRAM contention makes parallel loads risky.
- Per-entry timing logged at info level on success.
- Failures logged as warnings; the next entry is still attempted.
- An empty list short-circuits without log noise.
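A minimal sketch of the activation flow listed above, using std only. The two-field `ModelSpec`, the `fake_load` loader and the "candle" harness name are assumptions for illustration; the real startup.rs signature is not reproduced here.

```rust
use std::time::Instant;

// Hypothetical minimal spec; the real ModelSpec may carry other fields.
struct ModelSpec {
    id: String,
    harness: String,
}

// Hypothetical loader: fails fast for any harness it does not recognise.
fn fake_load(spec: &ModelSpec) -> Result<(), String> {
    if spec.harness == "candle" {
        Ok(())
    } else {
        Err(format!("unknown harness {}", spec.harness))
    }
}

// Load each spec in order, one at a time, and return the ids that loaded.
fn load_default_models<F>(specs: &[ModelSpec], mut load: F) -> Vec<String>
where
    F: FnMut(&ModelSpec) -> Result<(), String>,
{
    let mut loaded = Vec::new();
    if specs.is_empty() {
        return loaded; // nothing configured: return without logging
    }
    for spec in specs {
        let started = Instant::now();
        match load(spec) {
            Ok(()) => {
                // Per-entry timing on success.
                println!("info: loaded {} in {:?}", spec.id, started.elapsed());
                loaded.push(spec.id.clone());
            }
            // A failure is a warning; the next entry is still attempted.
            Err(err) => eprintln!("warning: failed to load {}: {err}", spec.id),
        }
    }
    loaded
}
```

Passing the loader as a closure is a choice made for this sketch so that a test can substitute a loader that fails without touching the network.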
Called from main.rs after the registry is built and before the axum
listener binds, so /models reflects the loaded state from the very
first request.
data/neuron.service gains TimeoutStartSec=1800s. Activation now blocks
on potentially slow first-time HF downloads and GGUF materialisation,
so systemd's default 90s start timeout would kill larger model loads
mid-flight.
Two non-gated tests in tests/activation.rs cover the
continues-past-failure and empty-list paths, using a deliberately
unknown harness name so loads fail fast without touching the network.
The cuda-integration test from earlier stages still exercises the
real load/unload lifecycle.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
neuron and cortex are independent packages installable on different
hosts. Having neuron run under a 'cortex' system user implied a
shared identity that doesn't exist. Give neuron its own user/group.
- New data/neuron-sysusers.conf declares the neuron user/group with
home /var/lib/neuron.
- systemd unit User/Group changed to neuron.
- Spec file attrs, explicit Provides, and %sysusers_create_compat
updated to reference the neuron user.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The neuron package was shipping its config at /etc/cortex/neuron.toml,
which implied a shared config directory between two independent
packages. Move to /etc/neuron/neuron.toml — neuron owns its own etc
dir, consistent with its own /usr/lib/sysusers.d/neuron.conf and
/usr/lib/systemd/system/neuron.service. Updated the systemd unit's
ExecStart path and the example toml header to match.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- cortex.spec: gateway binary, cortex.service systemd unit,
cortex.toml + models.toml config files
- neuron.spec: neuron binary, neuron.service systemd unit,
neuron.toml config file
- Parallel CI: srpm-cortex and srpm-neuron jobs build SRPMs
concurrently, then publish to separate COPR repos
(helexa/cortex and helexa/neuron)
- Shared cortex user/group across both packages
- Example configs: cortex.example.toml, neuron.example.toml,
models.example.toml
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>