mistralrs-package/readme.md
rob thijssen 0d6f48fcc0 docs: update readme and CLAUDE.md for per-GPU flavours and prerelease
Replace cuda13 references with ampere/ada/blackwell flavours, add
unstable repo client setup instructions, remove obsolete nvm runner
prerequisites and flavours.yml references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 14:48:39 +03:00

# mistralrs-package
RPM packaging pipeline for [mistral.rs](https://github.com/EricLBuehler/mistral.rs) on Fedora 43 / x86_64 with CUDA support.
This repo does not contain the mistral.rs source. It clones upstream at a given release tag, cross-compiles with CUDA, and produces signed RPMs published to a dnf repo at `rpm.lair.cafe`.
## How it works
Three Gitea Actions workflows drive the pipeline:
1. **poll-upstream** runs every 15 minutes, checks GitHub for the latest mistral.rs release tag, and triggers a build if the corresponding RPM doesn't already exist on `rpm.lair.cafe`. It also checks the upstream `main` branch HEAD and triggers prerelease builds for the unstable repo.
2. **build-release** runs in three stages:
   - **build**: clones upstream at the tag and compiles `mistralrs` with flavour-specific CUDA features on a `cuda-13.0` runner.
   - **package**: builds an RPM from the compiled binary using `rpmbuild`.
   - **publish**: GPG-signs the RPMs, rsyncs them to `rpm.lair.cafe`, and updates the repo metadata with `createrepo_c`.
3. **build-prerelease** has the same structure as build-release, but clones upstream at a specific commit from `main`, takes the version from `Cargo.toml` with a prerelease release suffix (e.g. `0.8.1-0.1.20260511git1a2b3c4`), and publishes to the unstable repo.
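The poll-upstream decision described above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the function names are hypothetical, the RPM file name pattern is taken from the "Forcing a rebuild" section, and stripping a leading `v` from the upstream tag is an assumption about the tag format.

```bash
# Hypothetical sketch of the poll-upstream check; not the real workflow script.
REPO_BASE="https://rpm.lair.cafe/fedora/43/x86_64"

# Turn an upstream tag into an RPM version (assumes tags look like v0.8.1).
tag_to_version() {
  printf '%s\n' "${1#v}"
}

# URL where the stable RPM for a flavour ($1) and version ($2) would live.
rpm_url() {
  printf '%s/mistralrs-%s-%s-1.fc43.x86_64.rpm\n' "$REPO_BASE" "$1" "$2"
}

# Succeeds when the RPM already exists on the repo server (HTTP HEAD request).
rpm_published() {
  curl --silent --fail --head --output /dev/null "$(rpm_url "$1" "$2")"
}

# Latest upstream release tag via the GitHub REST API (needs curl and jq).
latest_upstream_tag() {
  curl --silent --fail \
    https://api.github.com/repos/EricLBuehler/mistral.rs/releases/latest |
    jq --raw-output .tag_name
}

# Print every flavour whose RPM for the given tag is not yet published.
flavours_to_build() {
  local version flavour
  version=$(tag_to_version "$1")
  for flavour in ampere ada blackwell; do
    rpm_published "$flavour" "$version" || printf '%s\n' "$flavour"
  done
}
```

Running `flavours_to_build "$(latest_upstream_tag)"` would list the flavours that still need a build; an empty result means there is nothing to trigger.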
### Flavours
Build flavours are defined in the workflow matrix. Each flavour targets a specific GPU generation with the same CUDA 13.0 toolkit and features (cuda, cudnn, flash-attn, nccl).
Currently defined:
| Flavour | Compute cap | GPU generation |
|------------|-------------|---------------------------|
| ampere | sm_86 | RTX 3060, A2000–A6000 |
| ada | sm_89 | RTX 4060–4090, L40 |
| blackwell | sm_120 | RTX 5090 (not B100/B200, which are sm_100) |
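As a rough sketch of how one flavour could translate into a build invocation: the helper names below are hypothetical, and pinning the target through the `CUDA_COMPUTE_CAP` environment variable is an assumption about the upstream build rather than something this repo documents. The feature list comes from the paragraph above. The command is printed, not executed.

```bash
# Map a flavour name to the compute capability listed in the table.
compute_cap_for() {
  case "$1" in
    ampere)    echo 86 ;;
    ada)       echo 89 ;;
    blackwell) echo 120 ;;
    *) echo "unknown flavour: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) a build command for one flavour.
# CUDA_COMPUTE_CAP is an assumed mechanism for selecting the GPU target.
build_command_for() {
  local cap
  cap=$(compute_cap_for "$1") || return 1
  printf 'CUDA_COMPUTE_CAP=%s cargo build --release --features "cuda cudnn flash-attn nccl"\n' "$cap"
}
```

For example, `build_command_for ada` prints a command with `CUDA_COMPUTE_CAP=89`, and an unknown flavour name fails with a non-zero status.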
### Systemd integration
Each RPM installs a templated systemd unit (`mistralrs@.service`). Instances are configured via environment files in `/etc/mistralrs/`:
```bash
# copy the example config
sudo cp /etc/mistralrs/default.conf.example /etc/mistralrs/mymodel.conf
# edit MISTRALRS_ARGS, HF_TOKEN, etc.
sudo systemctl start mistralrs@mymodel
```
## Infrastructure setup
The RPM repo is hosted on `oolon` (oolon.kosherinata.internal) behind nginx with TLS via Let's Encrypt. The setup scripts in `script/setup/` are run once from a dev workstation with SSH access to oolon.
### 1. DNS
```bash
./script/setup/dns.sh
```
Creates a CNAME record for `rpm.lair.cafe` via the Cloudflare API. Requires a Cloudflare API token in `~/.cloudflare/lair.cafe`.
### 2. TLS certificate
```bash
./script/setup/cert.sh
```
Obtains a Let's Encrypt certificate for `rpm.lair.cafe` using the Cloudflare DNS challenge. Run on oolon.
### 3. Nginx and repo directory
```bash
./script/setup/nginx.sh
```
Syncs the nginx config to oolon, creates the `gitea_ci` system user with SSH access for CI publishing, sets up the RPM repo directory at `/var/www/rpm/fedora/43/x86_64`, and reloads nginx. Requires the `gitea_ci` SSH public key at `~/.ssh/id_gitea_ci.pub`.
### 4. GPG signing key
```bash
./script/setup/gpg.sh
```
Manages the RPM signing key in a dedicated keyring at `~/.gnupg/lair`:
- Creates a certify-only ed25519 master key (no expiry) for `rpm@lair.cafe` if one doesn't exist.
- Adds a signing subkey with 1-year expiry.
- Cross-signs the key with your personal keys from the default keyring.
- Exports the public key and syncs it to `oolon:/var/www/rpm/<short-id>.gpg`.
After running the script, add two secrets to the Gitea repo:
| Secret | Value |
|---------------------|-----------------------------------------------------------------------------------------|
| `RPM_SIGNING_KEY` | Output of `gpg --homedir ~/.gnupg/lair --armor --export-secret-subkeys <subkey-fpr>!` |
| `RPM_SIGNING_KEY_ID`| `rpm@lair.cafe` |
The trailing `!` in the export command restricts the export to that specific subkey. Only the signing subkey is shared with CI; the master key stays on the workstation.
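To find the `<subkey-fpr>` value, one option is to parse GnuPG's machine-readable listing. The helper below is a hypothetical sketch: it reads `--with-colons` output on stdin and prints the fingerprint of each signing-capable subkey that is neither expired nor revoked.

```bash
# Read `gpg --with-colons` output on stdin and print the fingerprint of every
# usable signing subkey. Field 1 is the record type, field 2 the validity
# ("e" expired, "r" revoked), field 12 the key capabilities ("s" means sign);
# on "fpr" records field 10 holds the fingerprint.
signing_subkey_fprs() {
  awk -F: '
    $1 == "pub" || $1 == "sec" { want = 0 }
    $1 == "sub" || $1 == "ssb" { want = ($12 ~ /s/ && $2 != "e" && $2 != "r") }
    $1 == "fpr" && want        { print $10; want = 0 }
  '
}
```

Usage, assuming the keyring layout described above:

```bash
gpg --homedir ~/.gnupg/lair --list-keys --with-colons \
    --with-subkey-fingerprint rpm@lair.cafe | signing_subkey_fprs
```

If more than one fingerprint is printed (for example, shortly after a rotation), pick the subkey you intend CI to use.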
#### Rotating the signing subkey
```bash
gpg --homedir ~/.gnupg/lair --quick-add-key <master-fpr> ed25519 sign 1y
```
Then update the `RPM_SIGNING_KEY` secret in Gitea with the new subkey. The public key served to users doesn't change since it's anchored to the master key.
### 5. Runner prerequisites
#### sequoia-sq (for RPM signing)
Runners that run the publish job need `sequoia-sq` installed:
```bash
sudo dnf install sequoia-sq
```
## Client setup
### Stable packages
```bash
sudo rpm --import https://rpm.lair.cafe/<short-id>.gpg
sudo tee /etc/yum.repos.d/lair-cafe.repo > /dev/null <<'EOF'
[lair-cafe]
name=lair.cafe RPM Repository
baseurl=https://rpm.lair.cafe/fedora/$releasever/$basearch/
enabled=1
gpgcheck=1
gpgkey=https://rpm.lair.cafe/<short-id>.gpg
EOF
# install the package for your GPU generation
sudo dnf install mistralrs-ampere # RTX 3000 series
sudo dnf install mistralrs-ada # RTX 4000 series
sudo dnf install mistralrs-blackwell # RTX 5000 series
```
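If you are unsure which package matches your card, the sketch below maps the GPU's compute capability to a package name. The helper names are hypothetical, and it assumes a driver recent enough for `nvidia-smi` to support the `compute_cap` query field. Only the three capabilities that have a flavour are recognised.

```bash
# Map a compute capability such as "8.9" to the matching package name.
package_for_cap() {
  case "$1" in
    8.6)  echo mistralrs-ampere ;;
    8.9)  echo mistralrs-ada ;;
    12.0) echo mistralrs-blackwell ;;
    *) echo "no mistralrs flavour for compute capability $1" >&2; return 1 ;;
  esac
}

# Compute capability of the first GPU as reported by the driver, e.g. "8.9".
detect_cap() {
  nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1
}
```

With both helpers defined, `sudo dnf install "$(package_for_cap "$(detect_cap)")"` installs the matching flavour, or fails with a message if the GPU has no corresponding build.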
### Unstable (prerelease) packages
Unstable packages are built from the latest upstream `main` commit and published to a separate repo. The release field follows the Fedora snapshot convention: in `0.8.1-0.1.20260511git1a2b3c4.fc43`, the release `0.1.20260511git1a2b3c4.fc43` sorts below a stable release such as `1.fc43`, so a stable build of the same version automatically supersedes any installed prerelease.
```bash
sudo tee /etc/yum.repos.d/lair-cafe-unstable.repo > /dev/null <<'EOF'
[lair-cafe-unstable]
name=lair.cafe RPM Repository (unstable)
baseurl=https://rpm.lair.cafe/fedora/$releasever/$basearch/unstable/
enabled=0
gpgcheck=1
gpgkey=https://rpm.lair.cafe/<short-id>.gpg
EOF
# install from unstable on demand
sudo dnf --enablerepo=lair-cafe-unstable install mistralrs-ada
```
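The snapshot naming used for unstable packages can be illustrated with a small sketch. The function names are hypothetical, and `sort -V` only approximates RPM's own version comparison, although the two agree for the simple case shown here.

```bash
# Build a version-release string for a snapshot build.
# Arguments: upstream version, build date (YYYYMMDD), commit hash.
snapshot_version_release() {
  local short
  short=$(printf '%s' "$3" | cut -c1-7)   # first seven characters of the hash
  printf '%s-0.1.%sgit%s\n' "$1" "$2" "$short"
}

# Print whichever of two version-release strings sorts higher.
# sort -V is an approximation of RPM ordering, used here for illustration.
newer_of() {
  printf '%s\n' "$1" "$2" | sort -V | tail -n 1
}
```

For example, `snapshot_version_release 0.8.1 20260511 1a2b3c4d5e6f` yields `0.8.1-0.1.20260511git1a2b3c4`, and `newer_of` applied to that string and `0.8.1-1` picks `0.8.1-1`, which is why the stable package replaces the prerelease on upgrade.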
## Forcing a rebuild
To force a rebuild of an already-published RPM (e.g. after a packaging change), remove the RPM from the repo server and update the index:
```bash
ssh oolon "
sudo rm /var/www/rpm/fedora/43/x86_64/mistralrs-ada-<version>-1.fc43.x86_64.rpm \
&& cd /var/www/rpm/fedora/43/x86_64 \
&& sudo createrepo_c --update .;
"
```
The next poll-upstream cycle (every 15 minutes) will detect the missing package and trigger a full rebuild. You can also trigger poll-upstream manually from the Gitea Actions UI to avoid waiting.
Do not delete the RPM without running `createrepo_c --update` afterwards; otherwise the repo index keeps referencing a missing file, which causes errors for dnf clients.
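When a packaging change affects every flavour, the same steps can be wrapped in a helper. This is a sketch with hypothetical function names; it assumes all three flavours share the file name pattern shown above and that each flavour's RPM has to be removed for it to be rebuilt.

```bash
REPO_DIR=/var/www/rpm/fedora/43/x86_64

# Print the path of each flavour's stable RPM for a version ($1).
rpm_paths() {
  local flavour
  for flavour in ampere ada blackwell; do
    printf '%s/mistralrs-%s-%s-1.fc43.x86_64.rpm\n' "$REPO_DIR" "$flavour" "$1"
  done
}

# Remove every flavour's RPM for a version on oolon, then refresh the index.
# The paths are expanded locally; they contain no spaces or shell metacharacters.
force_rebuild() {
  local paths
  paths=$(rpm_paths "$1" | tr '\n' ' ')
  ssh oolon "sudo rm -f $paths && cd $REPO_DIR && sudo createrepo_c --update ."
}
```

Run `rpm_paths <version>` first to review what `force_rebuild <version>` would delete.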
## CI secrets
The build-release and build-prerelease workflows require the following secrets:
| Secret               | Purpose                                  |
|----------------------|------------------------------------------|
| `DISPATCH_TOKEN`     | Gitea API token for triggering builds    |
| `RPM_SIGNING_KEY`    | ASCII-armored GPG signing subkey         |
| `RPM_SIGNING_KEY_ID` | GPG key UID (`rpm@lair.cafe`)            |
| `RSYNC_SSH_KEY`      | SSH private key for the `gitea_ci` user  |