helexa/asset/sudoers.d/neuron-host.conf
rob thijssen 6088830e7d
feat(deploy): manage NEURON_MAX_PROMPT_TOKENS per host via model.conf drop-in
Roll the per-model context cap into deploy.yml so it is deterministic per
host and rolled out (with a restart) alongside the rest of the service
config, rather than hand-edited in local.conf. The deploy now writes
/etc/systemd/system/neuron.service.d/model.conf from a new per-host
`max_prompt_tokens` matrix field, and restarts a neuron when the package
OR the drop-in changes — so a cap change applies even with no new RPM.
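
The drop-in half of that restart rule can be sketched as below. This is a minimal sketch, not the contents of deploy.yml: the function names are hypothetical, and the `[Service]` / `Environment=` layout of model.conf is an assumption. The two paths and the sudo command forms are the ones pinned in the sudoers file. The "package changed" half of the condition is not covered here.

```shell
#!/bin/sh
# Paths pinned by the sudoers grant for the drop-in install.
STAGED=/var/lib/gitea_ci/model.conf
LIVE=/etc/systemd/system/neuron.service.d/model.conf

# Render the drop-in body for a given cap.
# ASSUMPTION: the drop-in sets the variable via [Service] Environment=.
render_model_conf() {
  printf '[Service]\nEnvironment=NEURON_MAX_PROMPT_TOKENS=%s\n' "$1"
}

# Exit 0 when the two files differ or the second does not exist yet,
# i.e. when an install and restart would be needed.
dropin_changed() {
  ! cmp -s "$1" "$2"
}

# Hypothetical apply step, meant to run as gitea_ci on a neuron host.
# Each sudo call uses a command form granted in the sudoers file.
apply_dropin() {
  render_model_conf "$1" > "$STAGED"
  if dropin_changed "$STAGED" "$LIVE"; then
    sudo /usr/bin/install -o root -g root -m 0644 -D "$STAGED" "$LIVE"
    sudo /usr/bin/systemctl daemon-reload
    sudo /usr/bin/systemctl stop neuron.service
    sudo /usr/bin/systemctl start neuron.service
  fi
}
```

Calling `apply_dropin 131072` on a host whose live drop-in already carries that value would leave the service untouched, which is what makes the rollout deterministic per host.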

beast (Qwen3.6-27B, hybrid linear, 2x 32GB) -> 131072 (~128k); benjy and
quadbrat (dense, VRAM-bound) stay at 16384 but become deploy-managed.

Adds the scoped sudoers grant for the root-owned drop-in install, and
doc/context-limits.md documenting the knob relationships and KV/VRAM math
(refs #62 for the eventual /models-advertised source of truth, #65 for
the length-aware text VRAM guard that gates pushing beyond 128k).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 18:48:19 +03:00


# Install on every neuron host as /etc/sudoers.d/helexa_gitea_ci
# (owner root:root, mode 0440). Required by .gitea/workflows/deploy.yml,
# which SSHes as gitea_ci@<neuron-host> to roll out helexa-neuron-<flavour>
# package upgrades and config changes.
#
# Filename convention `helexa_gitea_ci` (vs bare `gitea_ci`) so other
# helexa-org apps can drop their own sudoers files on the same host
# without overwriting this one.
#
# All three CUDA flavours are listed in the dnf install/upgrade rules
# below because a host's flavour can change (e.g. GPU swap) and we don't
# want the sudoers file to need to change in lockstep. Only one flavour
# can be installed at a time (the packages Conflict: with each other), so
# the attack surface of those rules is bounded to "wrong flavour
# installed": vandalism, not privilege escalation.
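# Config rollout: rsync into /etc/neuron/neuron.toml. In sudoers, `*`
# matches any argument string, spaces included, so this rule accepts any
# rsync options and sources as long as the command line ends in the
# destination path below. It is broader than the exact-path rules.
gitea_ci ALL=(root) NOPASSWD: /usr/bin/rsync * /etc/neuron/neuron.toml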
# deploy.yml writes the per-model systemd drop-in carrying
# NEURON_MAX_PROMPT_TOKENS: gitea_ci stages it in its own dir, then
# installs it root-owned. Exact source/dest paths; see doc/context-limits.md.
gitea_ci ALL=(root) NOPASSWD: /usr/bin/install -o root -g root -m 0644 -D /var/lib/gitea_ci/model.conf /etc/systemd/system/neuron.service.d/model.conf
gitea_ci ALL=(root) NOPASSWD: /usr/bin/systemctl start neuron.service
gitea_ci ALL=(root) NOPASSWD: /usr/bin/systemctl stop neuron.service
gitea_ci ALL=(root) NOPASSWD: /usr/bin/systemctl enable --now neuron.service
gitea_ci ALL=(root) NOPASSWD: /usr/bin/systemctl daemon-reload
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf install --refresh --allowerasing -y helexa-neuron-ampere
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf upgrade --refresh --allowerasing -y helexa-neuron-ampere
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf install --refresh --allowerasing -y helexa-neuron-ada
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf upgrade --refresh --allowerasing -y helexa-neuron-ada
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf install --refresh --allowerasing -y helexa-neuron-blackwell
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf upgrade --refresh --allowerasing -y helexa-neuron-blackwell
# The sudoers manual lists `,`, `:`, `=` and `\` as characters that must
# be `\`-escaped inside command arguments. Unescaped, visudo errors at
# the first `:` in `https://`.
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf config-manager addrepo --from-repofile\=https\://rpm.lair.cafe/lair-cafe-unstable.repo
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf config-manager setopt lair-cafe-unstable.enabled\=1
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf config-manager addrepo --from-repofile\=https\://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf install -y libcudnn9-cuda-13
gitea_ci ALL=(root) NOPASSWD: /usr/bin/firewall-cmd --add-service\=helexa-neuron --permanent
gitea_ci ALL=(root) NOPASSWD: /usr/bin/firewall-cmd --reload
# deploy-dev.yml fast path: install a freshly-built dev binary over the
# packaged one. Exact source path + args; the workflow must use this
# command form verbatim. The next deploy.yml run reconciles the host
# back to the RPM-owned binary.
gitea_ci ALL=(root) NOPASSWD: /usr/bin/install -o root -g root -m 0755 /var/lib/gitea_ci/neuron-dev /usr/bin/neuron