add asset/manifest.yml describing fleet hosts and neuron flavours

Adds a single source of truth for which hosts run cortex vs neuron and
which CUDA compute-capability flavour each neuron host needs:

  cortex  : hanzalova.internal
  neurons :
    beast    → helexa-neuron-blackwell (2x RTX 5090, sm_120)
    benjy    → helexa-neuron-ada       (RTX 4090, sm_89)
    quadbrat → helexa-neuron-ampere    (RTX 3060, sm_86)

script/deploy.sh (gitignored, local-only) is updated locally to read
hosts and flavours from this manifest and dnf install the correct
helexa-neuron-<flavour> package per host. Using
'dnf install --refresh --allowerasing' lets it swap out the previous
bare helexa-neuron RPM or a different flavour without manual
intervention; the spec Conflicts: clauses keep at most one flavour
resident.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ci(prerelease): add ampere flavour alongside ada and blackwell
2026-05-18 17:37:14 +03:00 · 2026-05-18 17:28:19 +03:00 · 2026-05-18 17:26:29 +03:00
3 changed files with 45 additions and 0 deletions
--- a/.gitea/workflows/build-prerelease.yml
+++ b/.gitea/workflows/build-prerelease.yml
@@ -94,6 +94,13 @@ jobs:
      fail-fast: false
      matrix:
        include:
          - flavour: ampere
            compute_cap: "86"
            runner: cuda-13.0
            cuda_home: /usr/local/cuda-13.0
            build_jobs: 8
            nvcc_threads: 4
            cargo_features: "cuda cudnn flash-attn"
          - flavour: ada
            compute_cap: "89"
            runner: cuda-13.0
@@ -193,6 +200,7 @@ jobs:
      fail-fast: false
      matrix:
        include:
          - flavour: ampere
          - flavour: ada
          - flavour: blackwell
    steps:
--- a/.gitea/workflows/ci.yml
+++ b/.gitea/workflows/ci.yml
@@ -16,6 +16,13 @@ env:
  SCCACHE_S3_USE_SSL: "false"
  AWS_ACCESS_KEY_ID: ${{ secrets.SCCACHE_S3_ACCESS_KEY }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.SCCACHE_S3_SECRET_KEY }}
  # fmt, clippy, and test all run in parallel on the same `rust` runner
  # and would otherwise share /root/.cache/act/<hash>/hostexecutor/target/,
  # racing each other's cargo temp files (.tmpXXXXXX) and failing builds
  # mid-compile. Give each job its own target directory so the invocations
  # don't collide. sccache still backs the actual rustc cache, so the
  # rebuild penalty is small.
  CARGO_TARGET_DIR: target-${{ github.job }}
 jobs:
  fmt:
--- a/asset/manifest.yml
+++ b/asset/manifest.yml
@@ -0,0 +1,30 @@
 # Helexa fleet manifest.
 #
 # Drives rolling deploys via script/deploy.sh and serves as the source
 # of truth for which hosts run cortex vs neuron, and which CUDA
 # compute-capability flavour each neuron host needs.
 #
 # Flavour ↔ NVIDIA generation ↔ compute cap:
 #   ampere    sm_86   (RTX 30 series — e.g. 3060)
 #   ada       sm_89   (RTX 40 series — e.g. 4090)
 #   blackwell sm_120  (RTX 50 series — e.g. 5090)
 #
 # The flavour determines which RPM is installed on a given neuron host:
 # helexa-neuron-<flavour>. Only one flavour may be installed at a time
 # (the packages Conflict: with each other).
 cortex:
  host: hanzalova.internal
 neurons:
  - host: beast.hanzalova.internal
    flavour: blackwell
    gpu: "2x RTX 5090"
  - host: benjy.hanzalova.internal
    flavour: ada
    gpu: "RTX 4090"
  - host: quadbrat.hanzalova.internal
    flavour: ampere
    gpu: "RTX 3060"
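
The commit message says script/deploy.sh reads hosts and flavours from this manifest and installs helexa-neuron-<flavour> on each neuron host. Since that script is gitignored, the following is only a rough sketch of the mapping it describes, not the actual script. It assumes the manifest has already been parsed into a dict (inlined here to avoid a YAML dependency); the function names, the ssh transport, and the `sudo` and `-y` arguments are hypothetical choices made for illustration.

```python
import shlex

# Mirrors asset/manifest.yml as a plain dict (assumption: a real script
# would load the YAML file instead of inlining it).
MANIFEST = {
    "cortex": {"host": "hanzalova.internal"},
    "neurons": [
        {"host": "beast.hanzalova.internal", "flavour": "blackwell", "gpu": "2x RTX 5090"},
        {"host": "benjy.hanzalova.internal", "flavour": "ada", "gpu": "RTX 4090"},
        {"host": "quadbrat.hanzalova.internal", "flavour": "ampere", "gpu": "RTX 3060"},
    ],
}

# Flavour -> compute capability, taken from the manifest header comment.
COMPUTE_CAP = {"ampere": "sm_86", "ada": "sm_89", "blackwell": "sm_120"}


def neuron_package(flavour):
    """Return the RPM name for a flavour, rejecting unknown flavours."""
    if flavour not in COMPUTE_CAP:
        raise ValueError(f"unknown flavour: {flavour!r}")
    return f"helexa-neuron-{flavour}"


def install_command(neuron):
    """Build an argv that installs the right package on one neuron host.

    Hypothetical: ssh as the transport, sudo and -y on the remote side.
    --refresh and --allowerasing are the flags named in the commit message.
    """
    pkg = neuron_package(neuron["flavour"])
    remote = ["sudo", "dnf", "install", "-y", "--refresh", "--allowerasing", pkg]
    return ["ssh", neuron["host"], shlex.join(remote)]


def plan(manifest):
    """One install command per neuron host, in manifest order."""
    return [install_command(n) for n in manifest["neurons"]]


if __name__ == "__main__":
    # Dry run: print the commands rather than executing them.
    for argv in plan(MANIFEST):
        print(shlex.join(argv))
```

Printing the plan instead of running it keeps the sketch safe to try; a rolling deploy would execute the commands one host at a time and stop on the first failure.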