Stage 5 of the candle-native pivot. Adds first-class support for auto-loading a configured set of models when the neuron service activates.

Config:
- NeuronConfig.default_models: Vec<ModelSpec> (defaults to []).
- neuron.example.toml ships a commented [[default_models]] example.

Activation flow (crates/neuron/src/startup.rs::load_default_models):
- Sequential: VRAM contention makes parallel loads risky.
- Per-entry timing logged at info level on success.
- Failures logged as warnings; the next entry is still attempted.
- An empty list short-circuits without log noise.

Called from main.rs after the registry is built and before the axum listener binds, so /models reflects the loaded state from the very first request.

data/neuron.service gains TimeoutStartSec=1800s. With activation blocked on potentially slow first-time HF downloads + GGUF materialisation, systemd's default 90s would kill larger model loads mid-flight.

Two non-gated tests in tests/activation.rs cover the continues-past-failure and empty-list paths using a synthetically unknown harness name to fail loads fast without touching the network. The cuda-integration test from earlier stages still exercises the real load/unload lifecycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
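The activation flow described above can be sketched in std-only Rust. This is a minimal illustration, not the code in startup.rs: the `ModelSpec` fields, the `load_model` helper, and the `usize` return value are all hypothetical stand-ins, and `println!`/`eprintln!` stand in for whatever logging the service actually uses.

```rust
use std::time::Instant;

/// Hypothetical stand-in for `cortex_core::harness::ModelSpec`;
/// the real type's fields are not shown in this commit.
#[derive(Debug, Clone)]
struct ModelSpec {
    harness: String,
    model: String,
}

/// Hypothetical loader: only the `candle` harness is recognised, so any
/// other harness name fails immediately without touching the network.
fn load_model(spec: &ModelSpec) -> Result<(), String> {
    if spec.harness == "candle" {
        Ok(())
    } else {
        Err(format!("unknown harness `{}`", spec.harness))
    }
}

/// Loads every entry in order and returns how many succeeded.
/// A failed entry is reported and skipped; later entries still run.
fn load_default_models(specs: &[ModelSpec]) -> usize {
    // Empty list: return early without emitting any log lines.
    if specs.is_empty() {
        return 0;
    }
    let mut loaded = 0;
    // Strictly sequential: one load finishes before the next begins.
    for spec in specs {
        let started = Instant::now();
        match load_model(spec) {
            Ok(()) => {
                loaded += 1;
                println!("info: loaded {} in {:?}", spec.model, started.elapsed());
            }
            Err(err) => eprintln!("warn: failed to load {}: {err}", spec.model),
        }
    }
    loaded
}
```

Passing a spec with an unrecognised harness name ahead of a valid one mirrors the continues-past-failure test the commit message mentions: the first entry produces a warning and the second is still loaded.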
68 lines
2.0 KiB
Rust
//! Neuron configuration loaded from neuron.toml.

use cortex_core::harness::{HarnessConfig, ModelSpec};
use figment::{
    Figment,
    providers::{Env, Format, Toml},
};
use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NeuronConfig {
    #[serde(default = "default_port")]
    pub port: u16,
    #[serde(default)]
    pub harnesses: Vec<HarnessConfig>,
    /// Per-harness configuration. Currently only `candle` is recognised.
    #[serde(default)]
    pub harness: HarnessSettings,
    /// Models to auto-load when the neuron service activates. Each entry
    /// is loaded sequentially before the HTTP listener binds. A failure
    /// on any single entry logs a warning and proceeds, so broken entries
    /// don't prevent the rest of the fleet from starting.
    #[serde(default)]
    pub default_models: Vec<ModelSpec>,
}

/// Settings for individual harness implementations. Each harness owns
/// its own sub-table so users only configure the harnesses they enable.
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct HarnessSettings {
    #[serde(default)]
    pub candle: CandleHarnessConfig,
}

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct CandleHarnessConfig {
    /// HuggingFace cache directory for model weights.
    /// When unset, defers to hf-hub's default (~/.cache/huggingface).
    #[serde(default)]
    pub hf_cache: Option<PathBuf>,
}

fn default_port() -> u16 {
    13131
}

impl NeuronConfig {
    pub fn load(path: impl AsRef<Path>) -> Result<Self, Box<figment::Error>> {
        Figment::new()
            .merge(Toml::file(path))
            .merge(Env::prefixed("NEURON_").split("__"))
            .extract()
            .map_err(Box::new)
    }
}

impl Default for NeuronConfig {
    fn default() -> Self {
        Self {
            port: default_port(),
            harnesses: vec![],
            harness: HarnessSettings::default(),
            default_models: vec![],
        }
    }
}
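
`NeuronConfig::load` merges environment variables on top of the TOML file via `Env::prefixed("NEURON_").split("__")`. The std-only sketch below illustrates how a variable name is expected to map onto a nested config key under that setup. It assumes figment's documented behaviour (prefix stripped, key lowercased, `__` treated as a nesting separator); `env_key_to_path` is a hypothetical helper written for illustration and is not part of figment or this crate.

```rust
/// Hypothetical helper: maps an environment variable name to the dotted
/// config path it would override, or `None` if the `NEURON_` prefix is
/// missing. Mirrors the assumed behaviour of
/// `Env::prefixed("NEURON_").split("__")`; it does not call figment.
fn env_key_to_path(name: &str) -> Option<String> {
    const PREFIX: &str = "NEURON_";
    // Names shorter than the prefix cannot match.
    let head = name.get(..PREFIX.len())?;
    // Prefix comparison is assumed to be case-insensitive.
    if !head.eq_ignore_ascii_case(PREFIX) {
        return None;
    }
    let rest = &name[PREFIX.len()..];
    // Lowercase the remainder and turn each `__` into a nesting level.
    Some(rest.to_lowercase().replace("__", "."))
}
```

Under these assumptions, `NEURON_PORT` would override `port`, and `NEURON_HARNESS__CANDLE__HF_CACHE` would override `harness.candle.hf_cache`. A single underscore stays part of the field name, which is why the separator is a double underscore.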