cortex/asset/neuron/benjy.toml
rob thijssen ea1fdf8aa6 chore(deploy): drop deploy.sh and manifest.yml now that workflow runs
First end-to-end run of the deploy workflow succeeded (gitea run #289),
so the operator-run rolling-deploy script and its YAML manifest are no
longer the source of truth — fleet topology lives in
.gitea/workflows/deploy.yml and per-host config in script/infra-setup.sh.

Per-host neuron config comments updated to point at the new sync path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 16:41:04 +03:00


# neuron.toml for benjy.hanzalova.internal
#
# 1x RTX 4090 (24 GB) — largest single-GPU host on the fleet. Pre-warms
# Qwen3-8B (bf16, ~18 GB), leaving ~6 GB for KV cache + activations on
# moderate-length contexts.
#
# Synced to /etc/neuron/neuron.toml by script/infra-setup.sh.
port = 13131
[[harnesses]]
name = "candle"
[harness.candle]
[[default_models]]
model_id = "Qwen/Qwen3-8B"
harness = "candle"
devices = [0]