Stage 5 of the candle-native pivot. Adds first-class support for auto-loading a configured set of models when the neuron service activates.

Config:
- NeuronConfig.default_models: Vec<ModelSpec> (defaults to []).
- neuron.example.toml ships a commented [[default_models]] example.

Activation flow (crates/neuron/src/startup.rs::load_default_models):
- Sequential: VRAM contention makes parallel loads risky.
- Per-entry timing logged at info level on success.
- Failures logged as warnings; the next entry is still attempted.
- An empty list short-circuits without log noise.

Called from main.rs after the registry is built and before the axum listener binds, so /models reflects the loaded state from the very first request.

data/neuron.service gains TimeoutStartSec=1800s. With activation blocked on potentially slow first-time HF downloads + GGUF materialisation, systemd's default 90s would kill larger model loads mid-flight.

Two non-gated tests in tests/activation.rs cover the continues-past-failure and empty-list paths using a synthetically unknown harness name to fail loads fast without touching the network. The cuda-integration test from earlier stages still exercises the real load/unload lifecycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
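
The activation flow described in the commit message can be sketched as follows. This is a minimal, synchronous illustration using only the standard library, not the code in startup.rs: the real load_default_models is async and takes a HarnessRegistry. The names Spec, Outcome and load_all are hypothetical, and the println!/eprintln! calls stand in for whatever logging the crate actually uses.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for ModelSpec, reduced to the fields this sketch needs.
#[derive(Debug, Clone)]
pub struct Spec {
    pub model_id: String,
    pub harness: String,
}

/// Result of attempting one entry, kept so a caller or test can inspect it.
#[derive(Debug)]
pub enum Outcome {
    Loaded { model_id: String, elapsed: Duration },
    Failed { model_id: String, reason: String },
}

/// Load entries strictly one after another; a failure is recorded and the
/// loop moves on to the next entry instead of returning early.
pub fn load_all<F>(specs: &[Spec], mut load: F) -> Vec<Outcome>
where
    F: FnMut(&Spec) -> Result<(), String>,
{
    // Empty list: return before anything is logged.
    if specs.is_empty() {
        return Vec::new();
    }

    let mut outcomes = Vec::with_capacity(specs.len());
    for spec in specs {
        let started = Instant::now();
        match load(spec) {
            Ok(()) => {
                let elapsed = started.elapsed();
                // Stand-in for an info-level log line with per-entry timing.
                println!("info: loaded {} in {:?}", spec.model_id, elapsed);
                outcomes.push(Outcome::Loaded {
                    model_id: spec.model_id.clone(),
                    elapsed,
                });
            }
            Err(reason) => {
                // Stand-in for a warning; the next entry is still attempted.
                eprintln!("warn: failed to load {}: {}", spec.model_id, reason);
                outcomes.push(Outcome::Failed {
                    model_id: spec.model_id.clone(),
                    reason,
                });
            }
        }
    }
    outcomes
}
```

Returning a Vec<Outcome> is a choice made for this sketch so the behaviour is easy to assert on; the function in the commit appears to return nothing and reports only through logs.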
57 lines
1.7 KiB
Rust
//! Activation-time behaviour: load_default_models continues past
//! individual failures so a single broken catalogue entry doesn't
//! prevent the rest of the fleet from starting.

use cortex_core::harness::{HarnessConfig, ModelSpec};
use neuron::config::HarnessSettings;
use neuron::harness::HarnessRegistry;
use neuron::startup;

#[tokio::test]
async fn test_load_default_models_skips_unknown_harness() {
    let registry = HarnessRegistry::from_configs(
        &[HarnessConfig {
            name: "candle".into(),
        }],
        "http://localhost:0",
        &HarnessSettings::default(),
    );

    // Both entries fail synchronously inside the registry: no network
    // call escapes (the harness lookup mismatches before hf-hub is
    // touched). The function should still return cleanly.
    let specs = vec![
        ModelSpec {
            model_id: "model-a".into(),
            harness: "no-such-harness".into(),
            quant: None,
            tensor_parallel: None,
            devices: None,
        },
        ModelSpec {
            model_id: "model-b".into(),
            harness: "no-such-harness".into(),
            quant: None,
            tensor_parallel: None,
            devices: None,
        },
    ];

    startup::load_default_models(&registry, &specs).await;

    let listed = registry
        .list_all_models()
        .await
        .expect("list_all_models should succeed");
    assert!(
        listed.is_empty(),
        "no models should be loaded after failed entries"
    );
}

#[tokio::test]
async fn test_load_default_models_empty_is_noop() {
    let registry = HarnessRegistry::new();
    startup::load_default_models(&registry, &[]).await;
}
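
The first test relies on the harness lookup failing before any download is attempted. A rough, self-contained sketch of that ordering is shown below. It assumes a registry keyed by harness name; Harness, CountingHarness and Registry are hypothetical stand-ins for illustration, not the types in the neuron crate.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical harness interface; the real trait is not shown in the source.
pub trait Harness {
    fn load(&self, model_id: &str) -> Result<(), String>;
}

/// Test double that counts how many loads actually reach a harness.
pub struct CountingHarness {
    pub calls: Cell<usize>,
}

impl Harness for CountingHarness {
    fn load(&self, _model_id: &str) -> Result<(), String> {
        // A real harness might hit the network here; this one only counts.
        self.calls.set(self.calls.get() + 1);
        Ok(())
    }
}

/// Hypothetical registry mapping harness names to implementations.
pub struct Registry<'a> {
    harnesses: HashMap<String, &'a dyn Harness>,
}

impl<'a> Registry<'a> {
    pub fn new() -> Self {
        Registry {
            harnesses: HashMap::new(),
        }
    }

    pub fn register(&mut self, name: &str, harness: &'a dyn Harness) {
        self.harnesses.insert(name.to_string(), harness);
    }

    /// Resolve the harness first; an unknown name returns an error
    /// before any harness code (and so any download) runs.
    pub fn load(&self, harness: &str, model_id: &str) -> Result<(), String> {
        let found = self
            .harnesses
            .get(harness)
            .ok_or_else(|| format!("unknown harness: {}", harness))?;
        found.load(model_id)
    }
}
```

With this shape, a spec naming "no-such-harness" fails at the map lookup and the counter stays at zero, which is the property that lets a test of this kind run without network access.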