docs: update readme and CLAUDE.md for per-GPU flavours and prerelease

Replace cuda13 references with ampere/ada/blackwell flavours, add unstable repo client setup instructions, remove obsolete nvm runner prerequisites and flavours.yml references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
readme.md
@@ -8,31 +8,34 @@ This repo does not contain the mistral.rs source. It clones upstream at a given
 
 Two Gitea Actions workflows drive the pipeline:
 
-1. **poll-upstream** runs every 15 minutes, checks GitHub for the latest mistral.rs release tag, and triggers a build if the corresponding RPM doesn't already exist on `rpm.lair.cafe`.
+1. **poll-upstream** runs every 15 minutes, checks GitHub for the latest mistral.rs release tag, and triggers a build if the corresponding RPM doesn't already exist on `rpm.lair.cafe`. It also checks the upstream `main` branch HEAD and triggers prerelease builds for the unstable repo.
 2. **build-release** runs in three stages:
    - **build** — clones upstream at the tag and compiles `mistralrs` with flavour-specific CUDA features on a `cuda-13.0` runner.
    - **package** — builds an RPM from the compiled binary using `rpmbuild`.
    - **publish** — GPG-signs the RPMs, rsyncs them to `rpm.lair.cafe`, and updates the repo metadata with `createrepo_c`.
+3. **build-prerelease** — same structure as build-release but clones at a specific commit from `main`, uses versioning from `Cargo.toml` with a prerelease release suffix (e.g. `0.8.1-0.1.20260511git1a2b3c4`), and publishes to the unstable repo.
 
 ### Flavours
 
-Build flavours are defined in the workflow matrix. Each flavour specifies a name, CUDA home path, cargo features, and compute capabilities. The RPM spec uses `update-alternatives` so multiple flavours can coexist, with priority: base=10, fa=20, nccl=30.
+Build flavours are defined in the workflow matrix. Each flavour targets a specific GPU generation with the same CUDA 13.0 toolkit and features (cuda, cudnn, flash-attn, nccl).
 
 Currently defined:
 
-| Flavour  | Features                      | Compute cap |
-|----------|-------------------------------|-------------|
-| cuda13   | cuda, cudnn, flash-attn, nccl | sm_120      |
+| Flavour    | Compute cap | GPU generation            |
+|------------|-------------|---------------------------|
+| ampere     | sm_86       | RTX 3060, A2000–A6000     |
+| ada        | sm_89       | RTX 4060–4090, L40        |
+| blackwell  | sm_120      | RTX 5090, B100, B200      |
 
 ### Systemd integration
 
-Each RPM installs a templated systemd unit (`mistralrs-<flavour>@.service`). Instances are configured via environment files in `/etc/mistralrs/`:
+Each RPM installs a templated systemd unit (`mistralrs@.service`). Instances are configured via environment files in `/etc/mistralrs/`:
 
 ```bash
 # copy the example config
-sudo cp /etc/mistralrs/cuda13.conf.example /etc/mistralrs/mymodel.conf
+sudo cp /etc/mistralrs/default.conf.example /etc/mistralrs/mymodel.conf
 # edit MISTRALRS_ARGS, HF_TOKEN, etc.
-sudo systemctl start mistralrs-cuda13@mymodel
+sudo systemctl start mistralrs@mymodel
 ```
 
 ## Infrastructure setup
@@ -95,25 +98,6 @@ Then update the `RPM_SIGNING_KEY` secret in Gitea with the new subkey. The publi
 
 ### 5. Runner prerequisites
 
-#### nvm (for UI builds)
-
-Runners that build the UI need [nvm](https://github.com/nvm-sh/nvm) installed for the `gitea_runner` user and an `nvm` label in their runner config:
-
-```bash
-sudo -u gitea_runner bash -c 'curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash'
-```
-
-Then add `nvm` to the labels in `/etc/act_runner/config.yml`:
-
-```yaml
-runner:
-  labels:
-    - "fedora-43:host"
-    - "nvm"
-```
-
-Restart the runner after changing labels. The `deploy-ui` workflow uses `runs-on: [fedora-43, nvm]` to select runners with Node.js capability.
-
 #### sequoia-sq (for RPM signing)
 
 Runners that run the publish job need `sequoia-sq` installed:
@@ -124,6 +108,8 @@ sudo dnf install sequoia-sq
 
 ## Client setup
 
+### Stable packages
+
 ```bash
 sudo rpm --import https://rpm.lair.cafe/<short-id>.gpg
 sudo tee /etc/yum.repos.d/lair-cafe.repo > /dev/null <<'EOF'
@@ -134,7 +120,29 @@ enabled=1
 gpgcheck=1
 gpgkey=https://rpm.lair.cafe/<short-id>.gpg
 EOF
-sudo dnf install mistralrs-cuda13
+
+# install the package for your GPU generation
+sudo dnf install mistralrs-ampere # RTX 3000 series
+sudo dnf install mistralrs-ada # RTX 4000 series
+sudo dnf install mistralrs-blackwell # RTX 5000 series
+```
+
+### Unstable (prerelease) packages
+
+Unstable packages are built from the latest upstream `main` commit and published to a separate repo. The RPM release field uses the Fedora snapshot convention (e.g. `0.8.1-0.1.20260511git1a2b3c4.fc43`) so stable releases automatically supersede any installed prerelease.
+
+```bash
+sudo tee /etc/yum.repos.d/lair-cafe-unstable.repo > /dev/null <<'EOF'
+[lair-cafe-unstable]
+name=lair.cafe RPM Repository (unstable)
+baseurl=https://rpm.lair.cafe/fedora/$releasever/$basearch/unstable/
+enabled=0
+gpgcheck=1
+gpgkey=https://rpm.lair.cafe/<short-id>.gpg
+EOF
+
+# install from unstable on demand
+sudo dnf --enablerepo=lair-cafe-unstable install mistralrs-ada
 ```
 
 ## Forcing a rebuild
@@ -143,7 +151,7 @@ To force a rebuild of an already-published RPM (e.g. after a packaging change),
 
 ```bash
 ssh oolon "
-sudo rm /var/www/rpm/fedora/43/x86_64/mistralrs-cuda13-<version>-1.fc43.x86_64.rpm \
+sudo rm /var/www/rpm/fedora/43/x86_64/mistralrs-ada-<version>-1.fc43.x86_64.rpm \
 && cd /var/www/rpm/fedora/43/x86_64 \
 && sudo createrepo_c --update .;
 "
@@ -155,7 +163,7 @@ Do not delete the RPM without running `createrepo_c --update` afterwards — thi
 
 ## CI secrets
 
-The build-release workflow requires the following secrets:
+The build-release and build-prerelease workflows require the following secrets:
 
 | Secret           | Purpose                                      |
 |------------------|----------------------------------------------|
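
The unstable-repo text added in this commit says a stable release supersedes an installed prerelease because of the snapshot release field. A minimal sketch of that ordering follows. It uses GNU `sort -V` as a stand-in for RPM's own version comparison, which is an assumption that holds for these simple strings but not for every RPM version. The helper name `newest_release` is hypothetical, and `1.fc43` is taken from the stable RPM filename pattern in the rebuild command.

```bash
#!/usr/bin/env bash
# Approximate RPM release ordering with GNU sort -V (an assumption: rpm's real
# comparison differs in edge cases, but agrees for these plain release strings).

# newest_release (hypothetical helper): print whichever argument sorts last.
newest_release() {
  printf '%s\n' "$@" | sort -V | tail -n 1
}

prerelease='0.1.20260511git1a2b3c4.fc43'  # snapshot release from the unstable repo
stable='1.fc43'                           # release field of a stable build

# The leading 0 in the snapshot release sorts below the stable 1.
newest_release "$prerelease" "$stable"
```

With the two strings above the helper prints `1.fc43`, which matches the claim that a stable `0.8.1-1.fc43` replaces `0.8.1-0.1.20260511git1a2b3c4.fc43`. On a Fedora host, `rpmdev-vercmp` from `rpmdevtools` should give the authoritative comparison.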