feat(neuron): bind listener before pre-warm, surface activation in /health

Two coupled changes addressing the 2026-05-26 validate-neuron failure
where a fresh deploy of beast had /health unreachable for ~5 minutes
while Qwen3.6-27B q5k materialised, even though systemd reported the
unit as active.

1. main.rs no longer awaits load_default_models before binding axum.
   The listener binds first; pre-warm runs in a spawned background
   task that holds a read lock on the harness registry for the
   duration of its sequential load loop. Concurrent on-demand
   /models/load and /v1/chat/completions traffic still flows.

2. /health gains an `activation` field carrying:

     state        pre_warming | ready
     pending      model ids queued but not started
     in_progress  model id currently loading (Option)
     completed    model ids loaded successfully this activation
     failed       [{model_id, error}] for failed entries

   The field is `#[serde(default)]` so a pre-change cortex polling a
   new neuron, or vice versa, keeps working.

`ActivationTracker` (new module `neuron::activation`) owns the
RwLock-wrapped state; load_default_models takes a tracker reference
and updates it per-model. NeuronState holds an Arc clone for the
/health handler.

Tests updated to construct trackers and assert state transitions
(empty noop, two failures → ready with both in `failed`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 15:18:04 +03:00
parent d3f2d50749
commit 800498f530
9 changed files with 267 additions and 25 deletions
--- a/crates/neuron/src/activation.rs
+++ b/crates/neuron/src/activation.rs
@@ -0,0 +1,93 @@
+//! Activation-time pre-warm progress tracking.
+//!
+//! Wraps the [`ActivationStatus`] snapshot in an async RwLock so the
+//! background pre-warm task can update it per-model while the
+//! `/health` handler reads coherent snapshots. The tracker exists
+//! because `default_models` loading moved from synchronous-before-bind
+//! to background-after-bind on 2026-05-26: the listener is up
+//! immediately, but `/health` now needs to tell callers which of the
+//! configured defaults are still warming.
+
+use cortex_core::discovery::{ActivationState, ActivationStatus, PreWarmFailure};
+use cortex_core::harness::ModelSpec;
+use tokio::sync::RwLock;
+
+/// Shared, async-safe handle to the daemon's activation progress.
+///
+/// Construct once in `main` with the configured `default_models` so
+/// the initial `pending` list matches the spec; clone the `Arc` into
+/// the `NeuronState` for HTTP handlers and into the spawned pre-warm
+/// task for updates.
+pub struct ActivationTracker {
+    inner: RwLock<ActivationStatus>,
+}
+
+impl ActivationTracker {
+    /// Build a tracker primed with one entry per spec. An empty spec
+    /// list yields a `Ready` tracker — no point reporting PreWarming
+    /// when there's nothing queued.
+    pub fn new(default_models: &[ModelSpec]) -> Self {
+        let pending: Vec<String> = default_models.iter().map(|s| s.model_id.clone()).collect();
+        let state = if pending.is_empty() {
+            ActivationState::Ready
+        } else {
+            ActivationState::PreWarming
+        };
+        Self {
+            inner: RwLock::new(ActivationStatus {
+                state,
+                pending,
+                in_progress: None,
+                completed: vec![],
+                failed: vec![],
+            }),
+        }
+    }
+
+    /// Mark a model as in-progress: remove it from `pending`, set as
+    /// `in_progress`. Called immediately before `registry.load_model`.
+    pub async fn start_loading(&self, model_id: &str) {
+        let mut s = self.inner.write().await;
+        s.pending.retain(|m| m != model_id);
+        s.in_progress = Some(model_id.to_string());
+    }
+
+    /// Mark a model as completed: clear `in_progress` (if it matches),
+    /// append to `completed`.
+    pub async fn complete_loading(&self, model_id: &str) {
+        let mut s = self.inner.write().await;
+        if s.in_progress.as_deref() == Some(model_id) {
+            s.in_progress = None;
+        }
+        s.completed.push(model_id.to_string());
+    }
+
+    /// Mark a model as failed: clear `in_progress` (if it matches),
+    /// append a `PreWarmFailure` carrying the rendered error chain.
+    pub async fn fail_loading(&self, model_id: &str, error: &str) {
+        let mut s = self.inner.write().await;
+        if s.in_progress.as_deref() == Some(model_id) {
+            s.in_progress = None;
+        }
+        s.failed.push(PreWarmFailure {
+            model_id: model_id.to_string(),
+            error: error.to_string(),
+        });
+    }
+
+    /// Flip the high-level `state` to `Ready` once the pre-warm task
+    /// is done iterating. Pending should be empty by this point; if a
+    /// caller bails early it's a stuck activation and the operator
+    /// will see entries in `pending` even with `state=ready` — that's
+    /// a useful diagnostic, not an inconsistency to scrub.
+    pub async fn mark_ready(&self) {
+        let mut s = self.inner.write().await;
+        s.state = ActivationState::Ready;
+        s.in_progress = None;
+    }
+
+    /// Cheap clone of the current state for the `/health` handler.
+    pub async fn snapshot(&self) -> ActivationStatus {
+        self.inner.read().await.clone()
+    }
+}
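
The lifecycle the tracker above encodes (pending, then in_progress, then completed or failed, then ready) can be traced with a small standalone sketch. This is not the code from this commit. It assumes a synchronous stand-in (`SketchTracker`, using `std::sync::RwLock` rather than tokio's), hypothetical local copies of the `cortex_core::discovery` types whose real definitions are not shown in this diff, and a hypothetical `prewarm` loop that takes a closure in place of `registry.load_model`.

```rust
use std::sync::RwLock;

// Hypothetical stand-ins for the cortex_core::discovery types; the real
// definitions are not part of this diff and may differ.
#[derive(Clone, Debug, PartialEq)]
pub enum ActivationState {
    PreWarming,
    Ready,
}

#[derive(Clone, Debug, PartialEq)]
pub struct PreWarmFailure {
    pub model_id: String,
    pub error: String,
}

#[derive(Clone, Debug, PartialEq)]
pub struct ActivationStatus {
    pub state: ActivationState,
    pub pending: Vec<String>,
    pub in_progress: Option<String>,
    pub completed: Vec<String>,
    pub failed: Vec<PreWarmFailure>,
}

/// Synchronous sketch of the tracker (std RwLock, not tokio's async one).
pub struct SketchTracker {
    inner: RwLock<ActivationStatus>,
}

impl SketchTracker {
    /// An empty list starts out `Ready`, mirroring `ActivationTracker::new`.
    pub fn new(model_ids: &[&str]) -> Self {
        let pending: Vec<String> = model_ids.iter().map(|m| m.to_string()).collect();
        let state = if pending.is_empty() {
            ActivationState::Ready
        } else {
            ActivationState::PreWarming
        };
        let status = ActivationStatus {
            state,
            pending,
            in_progress: None,
            completed: vec![],
            failed: vec![],
        };
        Self { inner: RwLock::new(status) }
    }

    /// Move a model from `pending` to `in_progress`.
    pub fn start_loading(&self, model_id: &str) {
        let mut s = self.inner.write().unwrap();
        s.pending.retain(|m| m != model_id);
        s.in_progress = Some(model_id.to_string());
    }

    /// Record the outcome of one load in `completed` or `failed`.
    pub fn finish(&self, model_id: &str, result: Result<(), String>) {
        let mut s = self.inner.write().unwrap();
        if s.in_progress.as_deref() == Some(model_id) {
            s.in_progress = None;
        }
        match result {
            Ok(()) => s.completed.push(model_id.to_string()),
            Err(error) => s.failed.push(PreWarmFailure {
                model_id: model_id.to_string(),
                error,
            }),
        }
    }

    /// Flip the high-level state once the loop is done.
    pub fn mark_ready(&self) {
        let mut s = self.inner.write().unwrap();
        s.state = ActivationState::Ready;
        s.in_progress = None;
    }

    /// Clone of the current status, as a health handler would read it.
    pub fn snapshot(&self) -> ActivationStatus {
        self.inner.read().unwrap().clone()
    }
}

/// Hypothetical sequential pre-warm loop. Returns the snapshot taken while
/// each model is loading, followed by the final snapshot after `mark_ready`.
pub fn prewarm<F>(tracker: &SketchTracker, model_ids: &[&str], mut load: F) -> Vec<ActivationStatus>
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut seen = Vec::new();
    for &id in model_ids {
        tracker.start_loading(id);
        seen.push(tracker.snapshot());
        let result = load(id);
        tracker.finish(id, result);
    }
    tracker.mark_ready();
    seen.push(tracker.snapshot());
    seen
}
```

Calling `prewarm(&tracker, &["a", "b"], load)` with a closure that fails only for `"b"` yields three snapshots: `"a"` in progress with `"b"` pending, `"b"` in progress with `"a"` completed, and a final `Ready` status with `"b"` listed in `failed`. A failed load does not stop the loop here, which matches the "two failures → ready" case named in the commit message.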