q5k produced NaN logits on Qwen/Qwen3.6-27B under candle TP=2 (sampler fell over with "logits unhealthy nan: 248320/248320"). q6k is the quant that worked well in production under mistral.rs on the same hardware, so it's the right baseline for verifying the mempool-trim fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
25 lines
654 B
TOML
# neuron.toml for beast.hanzalova.internal
#
# 2x RTX 5090 (32 GB each) — TP-2 capable. Pre-warms Qwen3.6-27B with
# q6k ISQ across both GPUs at activation, matching the validate-neuron
# invocation: `validate-neuron.sh beast.hanzalova.internal
# Qwen/Qwen3.6-27B q6k 2`.
#
# Synced by script/deploy.sh from asset/neuron/<short-host>.toml. Edits
# take effect on the next deploy.sh run (which stops + restarts the
# service so default_models is re-read at activation).
port = 13131
[[harnesses]]
name = "candle"
[harness.candle]
[[default_models]]
model_id = "Qwen/Qwen3.6-27B"
harness = "candle"
quant = "q6k"
tensor_parallel = 2
devices = [0, 1]