# mistralrs-package

RPM packaging pipeline for [mistral.rs](https://github.com/EricLBuehler/mistral.rs) on Fedora 43 / x86_64 with CUDA support.

This repo does not contain the mistral.rs source. It clones upstream at a given release tag, cross-compiles with CUDA, and produces signed RPMs published to a dnf repo at `rpm.lair.cafe`.

## How it works

Three Gitea Actions workflows drive the pipeline:

1. **poll-upstream** runs every 15 minutes, checks GitHub for the latest mistral.rs release tag, and triggers a build if the corresponding RPM doesn't already exist on `rpm.lair.cafe`. It also checks the upstream `main` branch HEAD and triggers prerelease builds for the unstable repo.
2. **build-release** runs in three stages:
   - **build**: clones upstream at the tag and compiles `mistralrs` with the CUDA features enabled for each flavour's compute capability, on a `cuda-13.0` runner.
   - **package**: builds an RPM from the compiled binary using `rpmbuild`.
   - **publish**: GPG-signs the RPMs, rsyncs them to `rpm.lair.cafe`, and updates the repo metadata with `createrepo_c`.
3. **build-prerelease** has the same structure as build-release, but clones at a specific commit from `main`, takes the version from `Cargo.toml` with a prerelease release suffix (e.g. `0.8.1-0.1.20260511git1a2b3c4`), and publishes to the unstable repo.
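
The "does the RPM already exist" check in poll-upstream could look roughly like the sketch below. It is not the workflow's actual code: `rpm_url` and `rpm_published` are hypothetical helpers, the file name pattern is borrowed from the example under "Forcing a rebuild", and the version is assumed to be passed without any tag prefix.

```bash
# Build the URL where a stable RPM for a given flavour and version would live.
# Arguments: flavour (e.g. ada), version (e.g. 0.8.1).
# Hypothetical helper; the "-1.fc43.x86_64" suffix is taken from the
# "Forcing a rebuild" example in this README.
rpm_url() {
    printf 'https://rpm.lair.cafe/fedora/43/x86_64/mistralrs-%s-%s-1.fc43.x86_64.rpm\n' "$1" "$2"
}

# Succeed if the RPM is already published, i.e. an HTTP HEAD request for its
# URL does not fail. Hypothetical helper; needs network access and curl.
rpm_published() {
    curl --silent --fail --head --output /dev/null "$(rpm_url "$1" "$2")"
}
```

A polling job could then run `rpm_published ada 0.8.1 || echo "build needed"` for each flavour and only dispatch a build when the check fails.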

### Flavours

Build flavours are defined in the workflow matrix. Each flavour targets a specific GPU generation with the same CUDA 13.0 toolkit and features (cuda, cudnn, flash-attn, nccl).

Currently defined:

| Flavour   | Compute cap | Example GPUs          |
|-----------|-------------|-----------------------|
| ampere    | sm_86       | RTX 3060, A2000–A6000 |
| ada       | sm_89       | RTX 4060–4090, L40    |
| blackwell | sm_120      | RTX 5090              |
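
As a rough illustration of how a matrix entry might turn into a build command, the sketch below maps a flavour name to its compute capability and passes it to the build. It rests on two assumptions that the workflow files would need to confirm: that the upstream build reads the `CUDA_COMPUTE_CAP` environment variable, and that the Cargo feature names match the list above. `compute_cap_for` and `build_flavour` are hypothetical names.

```bash
# Map a flavour name to the numeric compute capability from the table above.
# Hypothetical helper, not part of the workflows.
compute_cap_for() {
    case "$1" in
        ampere)    echo 86 ;;
        ada)       echo 89 ;;
        blackwell) echo 120 ;;
        *)         echo "unknown flavour: $1" >&2; return 1 ;;
    esac
}

# Build one flavour from inside an upstream checkout.
# Assumptions: the build honours CUDA_COMPUTE_CAP, and the feature names
# below are the Cargo features the README lists.
build_flavour() {
    cap="$(compute_cap_for "$1")" || return 1
    CUDA_COMPUTE_CAP="$cap" cargo build --release --features "cuda cudnn flash-attn nccl"
}
```

Keeping the mapping in one function means an unknown flavour fails early instead of producing a binary for the wrong architecture.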

### Systemd integration

Each RPM installs a templated systemd unit (`mistralrs@.service`). Instances are configured via environment files in `/etc/mistralrs/`:

```bash
# copy the example config
sudo cp /etc/mistralrs/default.conf.example /etc/mistralrs/mymodel.conf
# edit MISTRALRS_ARGS, HF_TOKEN, etc.
sudo systemctl start mistralrs@mymodel
```
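
To see which instances are configured on a host, something like the following sketch may help. It assumes the instance name is simply the config file name without the `.conf` suffix, as in the `mymodel` example; `list_instances` is a hypothetical helper, not something the RPM installs.

```bash
# Print the systemd unit name for every *.conf file in a config directory.
# Argument: config directory (defaults to /etc/mistralrs).
# Assumption: instance name == file name without the .conf suffix.
list_instances() {
    dir="${1:-/etc/mistralrs}"
    for conf in "$dir"/*.conf; do
        # Skip the unexpanded glob when the directory has no .conf files.
        [ -e "$conf" ] || continue
        printf 'mistralrs@%s\n' "$(basename "$conf" .conf)"
    done
}
```

The example file `default.conf.example` is not listed because it does not end in `.conf`. For a single instance, `sudo systemctl enable --now mistralrs@mymodel` also starts it at boot, and `journalctl -u mistralrs@mymodel -f` follows its logs.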

## Infrastructure setup

The RPM repo is hosted on `oolon` (oolon.kosherinata.internal) behind nginx with TLS via Let's Encrypt. The setup scripts in `script/setup/` are run once from a dev workstation with SSH access to oolon.

### 1. DNS

```bash
./script/setup/dns.sh
```

Creates a CNAME record for `rpm.lair.cafe` via the Cloudflare API. Requires a Cloudflare API token in `~/.cloudflare/lair.cafe`.

### 2. TLS certificate

```bash
./script/setup/cert.sh
```

Obtains a Let's Encrypt certificate for `rpm.lair.cafe` using the Cloudflare DNS challenge. Run on oolon.

### 3. Nginx and repo directory

```bash
./script/setup/nginx.sh
```

Syncs the nginx config to oolon, creates the `gitea_ci` system user with SSH access for CI publishing, sets up the RPM repo directory at `/var/www/rpm/fedora/43/x86_64`, and reloads nginx. Requires the `gitea_ci` SSH public key at `~/.ssh/id_gitea_ci.pub`.

### 4. GPG signing key

```bash
./script/setup/gpg.sh
```

Manages the RPM signing key in a dedicated keyring at `~/.gnupg/lair`:

- Creates a certify-only ed25519 master key (no expiry) for `rpm@lair.cafe` if one doesn't exist.
- Adds a signing subkey with 1-year expiry.
- Cross-signs the key with your personal keys from the default keyring.
- Exports the public key and syncs it to `oolon:/var/www/rpm/<short-id>.gpg`.

After running the script, add two secrets to the Gitea repo:

| Secret               | Value                                                                                  |
|----------------------|----------------------------------------------------------------------------------------|
| `RPM_SIGNING_KEY`    | Output of `gpg --homedir ~/.gnupg/lair --armor --export-secret-subkeys <subkey-fpr>!`  |
| `RPM_SIGNING_KEY_ID` | `rpm@lair.cafe`                                                                        |

The trailing `!` in the export command restricts the export to that specific subkey. Only the signing subkey is shared with CI; the master key stays on the workstation.

#### Rotating the signing subkey

```bash
gpg --homedir ~/.gnupg/lair --quick-add-key <master-fpr> ed25519 sign 1y
```

Then update the `RPM_SIGNING_KEY` secret in Gitea with the new subkey. The public key served to users doesn't change since it's anchored to the master key.

### 5. Runner prerequisites

#### sequoia-sq (for RPM signing)

Runners that run the publish job need `sequoia-sq` installed:

```bash
sudo dnf install sequoia-sq
```

## Client setup

### Stable packages

```bash
sudo rpm --import https://rpm.lair.cafe/<short-id>.gpg
sudo tee /etc/yum.repos.d/lair-cafe.repo > /dev/null <<'EOF'
[lair-cafe]
name=lair.cafe RPM Repository
baseurl=https://rpm.lair.cafe/fedora/$releasever/$basearch/
enabled=1
gpgcheck=1
gpgkey=https://rpm.lair.cafe/<short-id>.gpg
EOF

# install the package for your GPU generation
sudo dnf install mistralrs-ampere     # RTX 3000 series
sudo dnf install mistralrs-ada        # RTX 4000 series
sudo dnf install mistralrs-blackwell  # RTX 5000 series
```
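
If you are unsure which flavour matches your card, the compute capability reported by the driver can be mapped to a package name. The sketch below is only a convenience under assumptions: `flavour_for_cap` and `suggest_package` are hypothetical helpers, and `suggest_package` relies on an `nvidia-smi` recent enough to support the `compute_cap` query field.

```bash
# Map a CUDA compute capability (as printed by nvidia-smi, e.g. "8.9")
# to a package flavour from the Flavours table in this README.
# Hypothetical helper, not shipped with the packages.
flavour_for_cap() {
    case "$1" in
        8.6)  echo ampere ;;
        8.9)  echo ada ;;
        12.0) echo blackwell ;;
        *)    echo "no flavour for compute capability $1" >&2; return 1 ;;
    esac
}

# Query the first GPU and print the matching package name.
# Assumes nvidia-smi is installed and supports --query-gpu=compute_cap.
suggest_package() {
    cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
    flavour="$(flavour_for_cap "$cap")" || return 1
    echo "mistralrs-$flavour"
}
```

For example, `sudo dnf install "$(suggest_package)"` would install the matching package, and fails without installing anything when the GPU has no corresponding flavour.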

### Unstable (prerelease) packages

Unstable packages are built from the latest upstream `main` commit and published to a separate repo. The RPM release field uses the Fedora snapshot convention (e.g. `0.8.1-0.1.20260511git1a2b3c4.fc43`) so stable releases automatically supersede any installed prerelease.

```bash
sudo tee /etc/yum.repos.d/lair-cafe-unstable.repo > /dev/null <<'EOF'
[lair-cafe-unstable]
name=lair.cafe RPM Repository (unstable)
baseurl=https://rpm.lair.cafe/fedora/$releasever/$basearch/unstable/
enabled=0
gpgcheck=1
gpgkey=https://rpm.lair.cafe/<short-id>.gpg
EOF

# install from unstable on demand
sudo dnf --enablerepo=lair-cafe-unstable install mistralrs-ada
```
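
The snapshot version string described above can be assembled as in this sketch. `prerelease_vr` is a hypothetical helper; the fixed `0.1` prefix and the 7-character short hash are inferred from the example string rather than taken from the workflow.

```bash
# Compose the version-release string for a prerelease build.
# Arguments: upstream version, commit date (YYYYMMDD), commit hash.
# Hypothetical helper; "0.1" and the 7-character hash are assumptions
# inferred from the example 0.8.1-0.1.20260511git1a2b3c4.
prerelease_vr() {
    version="$1"
    date="$2"
    short="$(printf '%s' "$3" | cut -c1-7)"
    printf '%s-0.1.%sgit%s\n' "$version" "$date" "$short"
}
```

Because the release field starts with `0.`, it sorts below the stable release `1` for the same version, which is why a stable `0.8.1-1` replaces an installed `0.8.1-0.1.…` snapshot. If `rpmdevtools` is installed, `rpmdev-vercmp` can be used to confirm the ordering of two such strings.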

## Forcing a rebuild

To force a rebuild of an already-published RPM (e.g. after a packaging change), remove the RPM from the repo server and update the index:

```bash
ssh oolon "
  sudo rm /var/www/rpm/fedora/43/x86_64/mistralrs-ada-<version>-1.fc43.x86_64.rpm \
    && cd /var/www/rpm/fedora/43/x86_64 \
    && sudo createrepo_c --update .;
"
```

The next poll-upstream cycle (every 15 minutes) will detect the missing package and trigger a full rebuild. You can also trigger poll-upstream manually from the Gitea Actions UI to avoid waiting.

Always run `createrepo_c --update` after deleting the RPM. Otherwise the repo index keeps referencing a missing file, which causes errors for dnf clients.
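
One way to make it harder to forget the index update is to generate both steps together. The sketch below only prints the remote command; `remote_rebuild_cmd` is a hypothetical helper, and the path and file name pattern are copied from the example above.

```bash
# Print the remote command that removes one stable RPM and refreshes the
# repo index in the same step.
# Arguments: flavour (e.g. ada), version (e.g. 0.8.1).
# Hypothetical helper; path and "-1.fc43.x86_64" suffix follow the example.
remote_rebuild_cmd() {
    repo="/var/www/rpm/fedora/43/x86_64"
    rpm="mistralrs-$1-$2-1.fc43.x86_64.rpm"
    printf 'sudo rm %s/%s && cd %s && sudo createrepo_c --update .\n' "$repo" "$rpm" "$repo"
}
```

Review the output first, then run it with `ssh oolon "$(remote_rebuild_cmd ada 0.8.1)"`.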

## CI secrets

The build-release and build-prerelease workflows require the following secrets:

| Secret               | Purpose                                 |
|----------------------|-----------------------------------------|
| `DISPATCH_TOKEN`     | Gitea API token for triggering builds   |
| `RPM_SIGNING_KEY`    | ASCII-armored GPG signing subkey        |
| `RPM_SIGNING_KEY_ID` | GPG key UID (`rpm@lair.cafe`)           |
| `RSYNC_SSH_KEY`      | SSH private key for the `gitea_ci` user |
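
A job can fail fast when a secret is missing rather than partway through publishing. The sketch below assumes each secret is exposed to the step as an environment variable of the same name, which depends on how the workflow maps secrets; `missing_secrets` is a hypothetical helper.

```bash
# Print the name of every required secret that is unset or empty in the
# current environment; return non-zero if any are missing.
# Assumption: secrets are exported as environment variables with the same
# names as in the table above.
missing_secrets() {
    status=0
    for name in DISPATCH_TOKEN RPM_SIGNING_KEY RPM_SIGNING_KEY_ID RSYNC_SSH_KEY; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}
```

Calling `missing_secrets || exit 1` at the top of a step lists the missing names in the job log and stops before any signing or rsync happens.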