mistralrs-package

RPM packaging pipeline for mistral.rs on Fedora 43 / x86_64 with CUDA support.

This repo does not contain the mistral.rs source. It clones upstream at a given release tag, cross-compiles with CUDA, and produces signed RPMs published to a dnf repo at rpm.lair.cafe.

How it works

Three Gitea Actions workflows drive the pipeline:

  1. poll-upstream runs every 15 minutes, checks GitHub for the latest mistral.rs release tag, and triggers a build if the corresponding RPM doesn't already exist on rpm.lair.cafe. It also checks the upstream main branch HEAD and triggers prerelease builds for the unstable repo.
  2. build-release runs in three stages:
    • build — clones upstream at the tag and compiles mistralrs with flavour-specific CUDA features on a cuda-13.0 runner.
    • package — builds an RPM from the compiled binary using rpmbuild.
    • publish — GPG-signs the RPMs, rsyncs them to rpm.lair.cafe, and updates the repo metadata with createrepo_c.
  3. build-prerelease has the same structure as build-release, but clones upstream at a specific commit from main, takes the version from Cargo.toml, adds a prerelease release suffix (e.g. 0.8.1-0.1.20260511git1a2b3c4), and publishes to the unstable repo.
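
The check that poll-upstream performs can be sketched roughly as follows. This is not the actual workflow code: the function names are hypothetical, the upstream repository path (EricLBuehler/mistral.rs) and the availability of jq on the runner are assumptions, and the RPM file name pattern is taken from the "Forcing a rebuild" section.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from the workflow.
REPO_BASE="https://rpm.lair.cafe/fedora/43/x86_64"

# rpm_url <flavour> <version>: URL the stable RPM is expected to have.
rpm_url() {
    printf '%s/mistralrs-%s-%s-1.fc43.x86_64.rpm\n' "$REPO_BASE" "$1" "$2"
}

# tag_to_version <tag>: strip a leading "v" from a release tag.
tag_to_version() {
    printf '%s\n' "${1#v}"
}

# latest_tag: ask the GitHub API for the newest release tag.
# Assumes the upstream repo is EricLBuehler/mistral.rs and that jq is installed.
latest_tag() {
    curl -fsSL "https://api.github.com/repos/EricLBuehler/mistral.rs/releases/latest" \
        | jq -r '.tag_name'
}

# needs_build <flavour> <version>: succeeds when the RPM cannot be fetched.
# Note that a network failure also counts as "missing" in this sketch.
needs_build() {
    ! curl -fsI -o /dev/null "$(rpm_url "$1" "$2")"
}
```

A poll cycle would then amount to something like: version=$(tag_to_version "$(latest_tag)"), followed by needs_build ada "$version" to decide whether to dispatch build-release.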

Flavours

Build flavours are defined in the workflow matrix. Each flavour targets a specific GPU generation with the same CUDA 13.0 toolkit and features (cuda, cudnn, flash-attn, nccl).

Currently defined:

Flavour     Compute cap   GPU generation
ampere      sm_86         RTX 3060, A2000–A6000
ada         sm_89         RTX 4060–4090, L40
blackwell   sm_120        RTX 5090, B100, B200
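
If you are unsure which flavour matches a machine, the table can be turned into a small lookup. The sketch below is not part of this repo: the function names are hypothetical, the exact-match mapping from compute capability to flavour is an assumption, and it relies on an nvidia-smi recent enough to support the compute_cap query field.

```shell
#!/bin/sh
# Sketch only: maps a CUDA compute capability to a package flavour.

# flavour_for_cap <major.minor>: prints the flavour or fails for unknown values.
flavour_for_cap() {
    case "$1" in
        8.6)  echo ampere ;;
        8.9)  echo ada ;;
        12.0) echo blackwell ;;
        *)    echo "no flavour defined for compute capability $1" >&2
              return 1 ;;
    esac
}

# detect_flavour: query the first GPU reported by nvidia-smi.
detect_flavour() {
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1) || return 1
    flavour_for_cap "$cap"
}
```

On a host with a supported GPU, sudo dnf install "mistralrs-$(detect_flavour)" would then pick the matching package.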

Systemd integration

Each RPM installs a templated systemd unit (mistralrs@.service). Instances are configured via environment files in /etc/mistralrs/:

# copy the example config
sudo cp /etc/mistralrs/default.conf.example /etc/mistralrs/mymodel.conf
# edit MISTRALRS_ARGS, HF_TOKEN, etc.
sudo systemctl start mistralrs@mymodel
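
For hosts that run several instances, the copy step can be wrapped in a helper that refuses to overwrite an existing config. This is a convenience sketch, not something the RPM ships: new_instance is a hypothetical name, and the optional second argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: create an instance config from the packaged example file.

# new_instance <name> [conf_dir]: prints the systemd unit name on success.
new_instance() {
    name=$1
    conf_dir=${2:-/etc/mistralrs}
    target="$conf_dir/$name.conf"
    if [ -e "$target" ]; then
        echo "refusing to overwrite $target" >&2
        return 1
    fi
    cp "$conf_dir/default.conf.example" "$target" || return 1
    printf 'mistralrs@%s\n' "$name"
}
```

After editing the new file, sudo systemctl enable --now mistralrs@mymodel starts the instance and enables it at boot, and journalctl -u mistralrs@mymodel -f follows its logs.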

Infrastructure setup

The RPM repo is hosted on oolon (oolon.kosherinata.internal) behind nginx with TLS via Let's Encrypt. The setup scripts in script/setup/ are run once from a dev workstation with SSH access to oolon.

1. DNS

./script/setup/dns.sh

Creates a CNAME record for rpm.lair.cafe via the Cloudflare API. Requires a Cloudflare API token in ~/.cloudflare/lair.cafe.

2. TLS certificate

./script/setup/cert.sh

Obtains a Let's Encrypt certificate for rpm.lair.cafe using the Cloudflare DNS challenge. Run on oolon.

3. Nginx and repo directory

./script/setup/nginx.sh

Syncs the nginx config to oolon, creates the gitea_ci system user with SSH access for CI publishing, sets up the RPM repo directory at /var/www/rpm/fedora/43/x86_64, and reloads nginx. Requires the gitea_ci SSH public key at ~/.ssh/id_gitea_ci.pub.

4. GPG signing key

./script/setup/gpg.sh

Manages the RPM signing key in a dedicated keyring at ~/.gnupg/lair:

  • Creates a certify-only ed25519 master key (no expiry) for rpm@lair.cafe if one doesn't exist.
  • Adds a signing subkey with 1-year expiry.
  • Cross-signs the key with your personal keys from the default keyring.
  • Exports the public key and syncs it to oolon:/var/www/rpm/<short-id>.gpg.

After running the script, add two secrets to the Gitea repo:

Secret               Value
RPM_SIGNING_KEY      Output of gpg --homedir ~/.gnupg/lair --armor --export-secret-subkeys <subkey-fpr>!
RPM_SIGNING_KEY_ID   rpm@lair.cafe

The trailing ! in the export command restricts the export to that specific subkey. Only the signing subkey is shared with CI; the master key stays on the workstation.
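
To find the value for <subkey-fpr>, one option is to parse gpg's machine-readable listing. The sketch below is an illustration rather than part of gpg.sh: the function names are hypothetical, and it assumes GnuPG 2.1 or later, where --with-colons prints an fpr record after each sub record.

```shell
#!/bin/sh
# Sketch only: list fingerprints of usable signing subkeys.

# signing_subkey_fprs: reads `gpg --with-colons` output on stdin and prints
# the fingerprint of every subkey that has the sign capability (field 12)
# and is neither expired nor revoked (field 2).
signing_subkey_fprs() {
    awk -F: '
        $1 == "pub" { want = 0 }
        $1 == "sub" { want = ($12 ~ /s/ && $2 !~ /^[er]$/) }
        $1 == "fpr" && want { print $10; want = 0 }
    '
}

# lair_signing_subkeys: run the listing against the dedicated keyring.
lair_signing_subkeys() {
    gpg --homedir ~/.gnupg/lair --list-keys --with-colons rpm@lair.cafe \
        | signing_subkey_fprs
}
```

The fingerprint it prints, followed by !, is what goes into the --export-secret-subkeys command above.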

Rotating the signing subkey

gpg --homedir ~/.gnupg/lair --quick-add-key <master-fpr> ed25519 sign 1y

Then update the RPM_SIGNING_KEY secret in Gitea with the new subkey. The public key served to users doesn't change since it's anchored to the master key.

5. Runner prerequisites

sequoia-sq (for RPM signing)

Runners that run the publish job need sequoia-sq installed:

sudo dnf install sequoia-sq

Client setup

Stable packages

sudo rpm --import https://rpm.lair.cafe/<short-id>.gpg
sudo tee /etc/yum.repos.d/lair-cafe.repo > /dev/null <<'EOF'
[lair-cafe]
name=lair.cafe RPM Repository
baseurl=https://rpm.lair.cafe/fedora/$releasever/$basearch/
enabled=1
gpgcheck=1
gpgkey=https://rpm.lair.cafe/<short-id>.gpg
EOF

# install the package for your GPU generation (pick one)
sudo dnf install mistralrs-ampere      # RTX 3000 series
sudo dnf install mistralrs-ada         # RTX 4000 series
sudo dnf install mistralrs-blackwell   # RTX 5000 series

Unstable (prerelease) packages

Unstable packages are built from the latest upstream main commit and published to a separate repo. The RPM release field uses the Fedora snapshot convention (e.g. 0.8.1-0.1.20260511git1a2b3c4.fc43) so stable releases automatically supersede any installed prerelease.

sudo tee /etc/yum.repos.d/lair-cafe-unstable.repo > /dev/null <<'EOF'
[lair-cafe-unstable]
name=lair.cafe RPM Repository (unstable)
baseurl=https://rpm.lair.cafe/fedora/$releasever/$basearch/unstable/
enabled=0
gpgcheck=1
gpgkey=https://rpm.lair.cafe/<short-id>.gpg
EOF

# install from unstable on demand
sudo dnf --enablerepo=lair-cafe-unstable install mistralrs-ada
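
The snapshot release string described above can be assembled from a build date and a commit hash. The sketch below only illustrates the format from the example; snapshot_release is a hypothetical name, and the 7-character hash length is inferred from that example rather than taken from the workflow.

```shell
#!/bin/sh
# Sketch only: build a Fedora-style snapshot release suffix.

# snapshot_release <YYYYMMDD> <commit-sha>: prints e.g. 0.1.20260511git1a2b3c4
# The leading "0.1" keeps the release below "1", which is why a stable build
# of the same version sorts higher than any prerelease.
snapshot_release() {
    short=$(printf '%s' "$2" | cut -c1-7)
    printf '0.1.%sgit%s\n' "$1" "$short"
}
```

If rpmdevtools is installed, rpmdev-vercmp 0.8.1-0.1.20260511git1a2b3c4 0.8.1-1 can be used to confirm the ordering locally.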

Forcing a rebuild

To force a rebuild of an already-published RPM (e.g. after a packaging change), remove the RPM from the repo server and update the index:

ssh oolon "
    sudo rm /var/www/rpm/fedora/43/x86_64/mistralrs-ada-<version>-1.fc43.x86_64.rpm \
    && cd /var/www/rpm/fedora/43/x86_64 \
    && sudo createrepo_c --update .;
"

The next poll-upstream cycle (every 15 minutes) will detect the missing package and trigger a full rebuild. You can also trigger poll-upstream manually from the Gitea Actions UI to avoid waiting.

Do not delete the RPM without running createrepo_c --update afterwards. Skipping that step leaves the repo index referencing a missing file, which causes errors for dnf clients.
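
One way to make the index update hard to forget is to generate both steps as a single remote command. This is a sketch under assumptions: the function names are hypothetical, and it assumes stable RPMs always carry release 1 as in the file name above.

```shell
#!/bin/sh
# Sketch only: remove a published RPM and refresh the index in one step.
REPO_DIR=/var/www/rpm/fedora/43/x86_64

# rebuild_cmd <flavour> <version>: prints the command to run on the repo host.
rebuild_cmd() {
    printf 'sudo rm %s/mistralrs-%s-%s-1.fc43.x86_64.rpm && sudo createrepo_c --update %s\n' \
        "$REPO_DIR" "$1" "$2" "$REPO_DIR"
}

# force_rebuild <flavour> <version>: run the generated command on oolon.
force_rebuild() {
    ssh oolon "$(rebuild_cmd "$1" "$2")"
}
```

For example, rebuild_cmd ada 0.8.1 prints the command for inspection, and force_rebuild ada 0.8.1 executes it over SSH.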

CI secrets

The build-release and build-prerelease workflows require the following secrets:

Secret               Purpose
DISPATCH_TOKEN       Gitea API token for triggering builds
RPM_SIGNING_KEY      ASCII-armored GPG signing subkey
RPM_SIGNING_KEY_ID   GPG key UID (rpm@lair.cafe)
RSYNC_SSH_KEY        SSH private key for the gitea_ci user