Roll the per-model context cap into deploy.yml so it is deterministic per host and rolled out (with a restart) alongside the rest of the service config, rather than hand-edited in local.conf.

The deploy now writes /etc/systemd/system/neuron.service.d/model.conf from a new per-host `max_prompt_tokens` matrix field, and restarts a neuron when the package OR the drop-in changes, so a cap change applies even with no new RPM. beast (Qwen3.6-27B, hybrid linear, 2x 32GB) -> 131072 (~128k); benjy and quadbrat (dense, VRAM-bound) stay at 16384 but become deploy-managed.

Adds the scoped sudoers grant for the root-owned drop-in install, and doc/context-limits.md documenting the knob relationships and KV/VRAM math (refs #62 for the eventual /models-advertised source of truth, #65 for the length-aware text VRAM guard that gates pushing beyond 128k).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
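The restart rule described above (restart when the package OR the drop-in changes) can be sketched roughly as follows. This is an illustrative sketch, not the logic in deploy.yml: the function names `render_dropin` and `needs_restart` are hypothetical, and the `[Service]` / `Environment=` layout of model.conf is an assumption. Only the per-host values and the `NEURON_MAX_PROMPT_TOKENS` variable name come from this commit.

```python
from typing import Optional

# Per-host caps as stated in the commit message.
MAX_PROMPT_TOKENS = {"beast": 131072, "benjy": 16384, "quadbrat": 16384}


def render_dropin(max_prompt_tokens: int) -> str:
    """Render the drop-in body. The Environment= layout is an assumption."""
    return (
        "[Service]\n"
        f"Environment=NEURON_MAX_PROMPT_TOKENS={max_prompt_tokens}\n"
    )


def needs_restart(package_changed: bool, installed: Optional[str], desired: str) -> bool:
    """Restart when the package changed OR the drop-in content differs.

    `installed` is the current file content on the host, or None if the
    drop-in does not exist yet.
    """
    return package_changed or installed != desired


desired = render_dropin(MAX_PROMPT_TOKENS["beast"])
# No new RPM, but the drop-in is missing: the cap change still triggers a restart.
print(needs_restart(False, None, desired))
# No new RPM and an identical drop-in already installed: nothing to do.
print(needs_restart(False, desired, desired))
```

Comparing rendered content rather than timestamps keeps the decision deterministic per host: re-running the deploy with an unchanged matrix value leaves the service alone.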
44 lines
3.1 KiB
Plaintext
# Install on every neuron host as /etc/sudoers.d/helexa_gitea_ci
# (owner root:root, mode 0440). Required by .gitea/workflows/deploy.yml,
# which SSHes as gitea_ci@<neuron-host> to roll out helexa-neuron-<flavour>
# package upgrades and config changes.
#
# Filename convention `helexa_gitea_ci` (vs bare `gitea_ci`) so other
# helexa-org apps can drop their own sudoers files on the same host
# without overwriting this one.
#
# All three CUDA flavours are listed because a host's flavour can change
# (e.g. GPU swap) and we don't want the sudoers file to need to change
# in lockstep. Only one flavour can be installed at a time (the packages
# Conflict: with each other), so the attack surface is bounded to "wrong
# flavour installed" — vandalism, not privilege escalation.
gitea_ci ALL=(root) NOPASSWD: /usr/bin/rsync * /etc/neuron/neuron.toml
# deploy.yml writes the per-model systemd drop-in carrying
# NEURON_MAX_PROMPT_TOKENS: gitea_ci stages it in its own dir, then
# installs it root-owned. Exact source/dest paths; see doc/context-limits.md.
gitea_ci ALL=(root) NOPASSWD: /usr/bin/install -o root -g root -m 0644 -D /var/lib/gitea_ci/model.conf /etc/systemd/system/neuron.service.d/model.conf
gitea_ci ALL=(root) NOPASSWD: /usr/bin/systemctl start neuron.service
gitea_ci ALL=(root) NOPASSWD: /usr/bin/systemctl stop neuron.service
gitea_ci ALL=(root) NOPASSWD: /usr/bin/systemctl enable --now neuron.service
gitea_ci ALL=(root) NOPASSWD: /usr/bin/systemctl daemon-reload
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf install --refresh --allowerasing -y helexa-neuron-ampere
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf upgrade --refresh --allowerasing -y helexa-neuron-ampere
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf install --refresh --allowerasing -y helexa-neuron-ada
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf upgrade --refresh --allowerasing -y helexa-neuron-ada
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf install --refresh --allowerasing -y helexa-neuron-blackwell
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf upgrade --refresh --allowerasing -y helexa-neuron-blackwell
# sudoers reserves `:` and `=` and requires `\` escaping inside command
# arguments — without it visudo errors at the first `:` in `https://`.
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf config-manager addrepo --from-repofile\=https\://rpm.lair.cafe/lair-cafe-unstable.repo
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf config-manager setopt lair-cafe-unstable.enabled\=1
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf config-manager addrepo --from-repofile\=https\://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
gitea_ci ALL=(root) NOPASSWD: /usr/bin/dnf install -y libcudnn9-cuda-13
gitea_ci ALL=(root) NOPASSWD: /usr/bin/firewall-cmd --add-service=helexa-neuron --permanent
gitea_ci ALL=(root) NOPASSWD: /usr/bin/firewall-cmd --reload
# deploy-dev.yml fast path: install a freshly-built dev binary over the
# packaged one. Exact source path + args; the workflow must use this
# command form verbatim. The next deploy.yml run reconciles the host
# back to the RPM-owned binary.
gitea_ci ALL=(root) NOPASSWD: /usr/bin/install -o root -g root -m 0755 /var/lib/gitea_ci/neuron-dev /usr/bin/neuron