helexa/asset/sudoers.d/neuron-host.conf
rob thijssen 6088830e7d
feat(deploy): manage NEURON_MAX_PROMPT_TOKENS per host via model.conf drop-in
Roll the per-model context cap into deploy.yml so it is deterministic per
host and rolled out (with a restart) alongside the rest of the service
config, rather than hand-edited in local.conf. The deploy now writes
/etc/systemd/system/neuron.service.d/model.conf from a new per-host
`max_prompt_tokens` matrix field, and restarts a neuron when the package
OR the drop-in changes — so a cap change applies even with no new RPM.
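
A minimal sketch of the render-and-restart logic described above, not the actual deploy.yml steps. The dict layout, the function names, and the `[Service]` / `Environment=` drop-in body are assumptions for illustration; only the host names, cap values, and the "package OR drop-in" rule come from this commit message.

```python
from typing import Optional

# Hypothetical stand-in for the per-host matrix field `max_prompt_tokens`.
HOST_MAX_PROMPT_TOKENS = {"beast": 131072, "benjy": 16384, "quadbrat": 16384}


def render_model_conf(max_prompt_tokens: int) -> str:
    """Render the drop-in body (assumed layout: one Environment= line)."""
    return (
        "[Service]\n"
        f"Environment=NEURON_MAX_PROMPT_TOKENS={max_prompt_tokens}\n"
    )


def needs_restart(package_changed: bool,
                  installed_conf: Optional[str],
                  desired_conf: str) -> bool:
    """Restart when the package changed OR the drop-in content differs.

    installed_conf is None when no drop-in exists on the host yet.
    """
    return package_changed or installed_conf != desired_conf
```

With this shape, a cap change alone (same package, different drop-in text) is enough to trigger a restart, while an unchanged host is left running.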

beast (Qwen3.6-27B, hybrid linear, 2x 32GB) -> 131072 (~128k); benjy and
quadbrat (dense, VRAM-bound) stay at 16384 but become deploy-managed.

Adds the scoped sudoers grant for the root-owned drop-in install, and
doc/context-limits.md documenting the knob relationships and KV/VRAM math
(refs #62 for the eventual /models-advertised source of truth, #65 for
the length-aware text VRAM guard that gates pushing beyond 128k).
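
For orientation, the KV/VRAM relationship can be sketched with the standard per-token KV cache estimate for full-attention layers. This is not taken from doc/context-limits.md; the layer counts, head counts, and fp16 element size below are hypothetical round numbers, not the real dimensions of any deployed model.

```python
def kv_cache_bytes(tokens: int,
                   attn_layers: int,
                   kv_heads: int,
                   head_dim: int,
                   bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size for one sequence.

    The factor 2 covers the K and V tensors. Only layers that keep a
    full-attention KV cache are counted, so a hybrid linear model
    passes fewer attn_layers than a dense one of the same depth.
    """
    return 2 * attn_layers * kv_heads * head_dim * bytes_per_elem * tokens


GIB = 1024 ** 3

# Hypothetical dense model: all 64 layers hold KV.
dense_16k = kv_cache_bytes(16384, attn_layers=64, kv_heads=8, head_dim=128)
dense_128k = kv_cache_bytes(131072, attn_layers=64, kv_heads=8, head_dim=128)

# Hypothetical hybrid model: only 16 of 64 layers hold KV.
hybrid_128k = kv_cache_bytes(131072, attn_layers=16, kv_heads=8, head_dim=128)

print(dense_16k / GIB, dense_128k / GIB, hybrid_128k / GIB)  # 4.0 32.0 8.0
```

Under these assumed dimensions the cache grows linearly with the token cap, which is why a dense, VRAM-bound host stays at a small cap while a hybrid model can afford a much larger one.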

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 18:48:19 +03:00
