cortex

helexa/cortex

Fork 0

Commit Graph

Author	SHA1	Message	Date
rob thijssen	ed4d71db09	fix(validate-neuron): default to unsloth GGUF + capture curl errors Two reasons the previous run silently bailed after POST /models/load: 1. Default model was Qwen/Qwen3-0.6B-GGUF (official). That repo ships ONLY Q8_0 — no Q4_K_M, no Q4_0, nothing else. The GGUF filename matcher in CandleHarness::resolve_files returned "no GGUF file matching quant Q4_K_M" and the load endpoint returned an error, but the script used `curl --silent --fail` and swallowed it. 2. /models/load is synchronous (it awaits the full HF download + GGUF parse). curl --max-time 30 was way too short for a 400 MB fresh download. Fixes: - Default model is now unsloth/Qwen3-0.6B-GGUF, which mirrors the full Q-spectrum (Q2_K through Q8_0 plus BF16) so Q4_K_M actually exists. - trigger_load / run_probe now use --write-out to capture HTTP code and emit the response body on non-2xx, so failures surface a real diagnostic instead of an opaque set -e abort. - LOAD_TIMEOUT bumped to 600s; INFER_TIMEOUT to 120s. - Probe payload built via `yq -n` so JSON quoting is reliable regardless of the prompt text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 08:14:31 +03:00
rob thijssen	39010c779f	add script/validate-neuron.sh — end-to-end candle harness smoke test Loads a small public Qwen3 GGUF on a target neuron host, fires a deterministic reasoning probe ("What is the capital of France?"), and asserts the response contains 'Paris'. Used to validate the candle harness on a real GPU host before the Stage 7 TP work begins, and as a regression check after future neuron builds. Defaults to beast.hanzalova.internal + Qwen/Qwen3-1.7B-GGUF + Q4_K_M; all three are positional args so the same script tests any node / model combination. Polls /models after triggering the load since /models/load returns once the materialisation is queued, not finished. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 07:58:05 +03:00

Author

SHA1

Message

Date

rob thijssen

ed4d71db09

fix(validate-neuron): default to unsloth GGUF + capture curl errors

Two reasons the previous run silently bailed after POST /models/load:

1. Default model was Qwen/Qwen3-0.6B-GGUF (official). That repo ships
   ONLY Q8_0 — no Q4_K_M, no Q4_0, nothing else. The GGUF filename
   matcher in CandleHarness::resolve_files returned "no GGUF file
   matching quant Q4_K_M" and the load endpoint returned an error,
   but the script used `curl --silent --fail` and swallowed it.

2. /models/load is synchronous (it awaits the full HF download + GGUF
   parse). curl --max-time 30 was way too short for a 400 MB fresh
   download.

Fixes:
- Default model is now unsloth/Qwen3-0.6B-GGUF, which mirrors the
  full Q-spectrum (Q2_K through Q8_0 plus BF16) so Q4_K_M actually
  exists.
- trigger_load / run_probe now use --write-out to capture HTTP code
  and emit the response body on non-2xx, so failures surface a real
  diagnostic instead of an opaque set -e abort.
- LOAD_TIMEOUT bumped to 600s; INFER_TIMEOUT to 120s.
- Probe payload built via `yq -n` so JSON quoting is reliable
  regardless of the prompt text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-19 08:14:31 +03:00

rob thijssen

39010c779f

add script/validate-neuron.sh — end-to-end candle harness smoke test

Loads a small public Qwen3 GGUF on a target neuron host, fires a
deterministic reasoning probe ("What is the capital of France?"),
and asserts the response contains 'Paris'. Used to validate the
candle harness on a real GPU host before the Stage 7 TP work begins,
and as a regression check after future neuron builds.

Defaults to beast.hanzalova.internal + Qwen/Qwen3-1.7B-GGUF + Q4_K_M;
all three are positional args so the same script tests any node /
model combination. Polls /models after triggering the load since
/models/load returns once the materialisation is *queued*, not
finished.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-19 07:58:05 +03:00

2 Commits