feat(neuron): load default_models on service activation · 6779b7526a - cortex

All checks were successful

CI / Format (push) Successful in 34s
CI / Clippy (push) Successful in 2m13s
CI / Test (push) Successful in 4m6s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
Stage 5 of the candle-native pivot. Adds first-class support for
auto-loading a configured set of models when the neuron service
activates.

Config:
- NeuronConfig.default_models: Vec<ModelSpec> (defaults to []).
- neuron.example.toml ships a commented [[default_models]] example.

Activation flow (crates/neuron/src/startup.rs::load_default_models):
- Sequential — VRAM contention makes parallel loads risky.
- Per-entry timing logged at info level on success.
- Failures logged as warnings; the next entry is still attempted.
- An empty list short-circuits without log noise.
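
The activation flow above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in crates/neuron/src/startup.rs: the `ModelSpec` fields, the `ActivationReport` type, the closure-based loader, and the "candle" harness name are all assumptions made for the example, and `eprintln!` stands in for whatever logging the real service uses.

```rust
use std::time::Instant;

/// Hypothetical stand-in for the real ModelSpec; the field names are assumed.
#[derive(Debug, Clone)]
pub struct ModelSpec {
    pub harness: String,
    pub name: String,
}

/// Hypothetical summary of one activation pass.
#[derive(Debug, Default, PartialEq)]
pub struct ActivationReport {
    pub loaded: Vec<String>,
    pub failed: Vec<String>,
}

/// Loads each spec in order, one at a time. A failed load is reported as a
/// warning and the next entry is still attempted. An empty list returns
/// immediately without logging anything.
pub fn load_default_models<F>(specs: &[ModelSpec], mut load: F) -> ActivationReport
where
    F: FnMut(&ModelSpec) -> Result<(), String>,
{
    let mut report = ActivationReport::default();
    if specs.is_empty() {
        return report; // short-circuit: no log noise
    }
    for spec in specs {
        let started = Instant::now();
        match load(spec) {
            Ok(()) => {
                // Per-entry timing, reported only on success.
                eprintln!("info: loaded {} in {:?}", spec.name, started.elapsed());
                report.loaded.push(spec.name.clone());
            }
            Err(err) => {
                eprintln!("warn: failed to load {}: {}", spec.name, err);
                report.failed.push(spec.name.clone());
            }
        }
    }
    report
}

/// Hypothetical loader for the example: accepts only the "candle" harness,
/// so any other harness name fails immediately without touching the network.
pub fn demo_loader(spec: &ModelSpec) -> Result<(), String> {
    if spec.harness == "candle" {
        Ok(())
    } else {
        Err(format!("unknown harness {:?}", spec.harness))
    }
}
```

Calling `load_default_models(&specs, demo_loader)` with one unknown-harness entry followed by one valid entry would put the first name in `failed` and the second in `loaded`, which is the continues-past-failure behaviour the list describes.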

Called from main.rs after the registry is built and before the axum
listener binds, so /models reflects the loaded state from the very
first request.
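
The ordering matters more than the details, so here is a small sketch of it. It assumes a simplified registry (a shared list of model names) and uses `std::net::TcpListener` as a stand-in for the axum listener; the `start` function and `Registry` alias are hypothetical names, not taken from main.rs.

```rust
use std::net::TcpListener;
use std::sync::{Arc, Mutex};

/// Hypothetical registry: just the names of the loaded models.
pub type Registry = Arc<Mutex<Vec<String>>>;

/// Runs the startup steps in the order described above and returns the
/// bound listener together with the already-populated registry.
pub fn start(default_models: &[&str]) -> std::io::Result<(TcpListener, Registry)> {
    // 1. Build the registry.
    let registry: Registry = Arc::new(Mutex::new(Vec::new()));

    // 2. Load the default models, blocking until every entry has been tried.
    //    Stand-in for the real load: record each name.
    for name in default_models {
        registry.lock().unwrap().push((*name).to_string());
    }

    // 3. Only now bind the listener. Port 0 asks the OS for any free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    Ok((listener, registry))
}
```

Because no socket exists until step 3, a client cannot connect while loads are still in progress, so any handler that reads the registry sees the final loaded state.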

data/neuron.service gains TimeoutStartSec=1800s. With activation
blocked on potentially slow first-time HF downloads + GGUF
materialisation, systemd's default 90s would kill larger model loads
mid-flight.

Two non-gated tests in tests/activation.rs cover the
continues-past-failure and empty-list paths. They use a deliberately
unknown harness name so that loads fail fast without touching the network.
The cuda-integration test from earlier stages still exercises the
real load/unload lifecycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rob thijssen

2026-05-18 17:56:08 +03:00

parent 84f5662df1

commit 6779b7526a

7 changed files with 131 additions and 2 deletions

									
data/neuron.service

@@ -10,6 +10,11 @@ Restart=on-failure
 RestartSec=5
 User=neuron
 Group=neuron
+# Loading default_models from neuron.toml happens before the HTTP
+# listener binds; large models can take many minutes to download and
+# materialise on first activation. systemd's default TimeoutStartSec
+# (90s) is far too short; allow 30 minutes.
+TimeoutStartSec=1800s
 
 [Install]
 WantedBy=multi-user.target
