mistralrs-package/CLAUDE.md
rob thijssen 0d6f48fcc0 docs: update readme and CLAUDE.md for per-GPU flavours and prerelease
Replace cuda13 references with ampere/ada/blackwell flavours, add
unstable repo client setup instructions, remove obsolete nvm runner
prerequisites and flavours.yml references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 14:48:39 +03:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Purpose

This repo packages mistral.rs (a Rust LLM inference server) into RPMs for Fedora 43 / x86_64. It does not contain the mistral.rs source — it clones upstream at a given tag, cross-compiles with CUDA, and produces signed RPMs published to a self-hosted dnf repo at rpm.lair.cafe.

Architecture

Pipeline flow

  1. poll-upstream (.gitea/workflows/poll-upstream.yml) — cron every 15 min, checks GitHub for latest mistral.rs release tag. If the corresponding RPMs don't exist on rpm.lair.cafe, triggers build-release. Also checks upstream main branch HEAD and triggers build-prerelease for the unstable repo.
  2. build-release (.gitea/workflows/build-release.yml) — three-stage pipeline:
    • build — runs on a cuda-13.0 runner. Clones upstream at tag, runs cargo build --release --locked with flavour-specific CUDA features.
    • package — runs rpmbuild -bb rpm/mistralrs.spec with --define for version and flavour.
    • publish — GPG-signs RPMs, rsyncs to rpm.lair.cafe, runs createrepo_c --update. Uses concurrency group rpm-publish to prevent metadata races.
  3. build-prerelease (.gitea/workflows/build-prerelease.yml) — same structure as build-release, but it clones a specific commit from main, omits --locked, uses a prerelease release suffix, and publishes to the unstable repo at rpm.lair.cafe/fedora/$releasever/$basearch/unstable/.
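
The "do the RPMs already exist?" check in poll-upstream can be sketched roughly as below. This is only an illustration: the `v` tag prefix, the `mistralrs-<flavour>-<version>-1.fc43.x86_64.rpm` file name, and the helper names `version_from_tag`, `rpm_url` and `needs_build` are all hypothetical, and the real workflow may decide differently.

```shell
# Strip a leading "v" from an upstream tag (assumed tag format: v0.8.0).
version_from_tag() {
  echo "${1#v}"
}

# Build the URL where a stable RPM would be expected.
# $1 = version, $2 = flavour. The file name pattern is hypothetical.
rpm_url() {
  echo "https://rpm.lair.cafe/fedora/43/x86_64/mistralrs-$2-$1-1.fc43.x86_64.rpm"
}

# Exit 0 when the RPM cannot be fetched, i.e. a build should be triggered.
# Note: a network failure is also treated as "missing" in this sketch.
needs_build() {
  ! curl --silent --fail --head --output /dev/null "$(rpm_url "$1" "$2")"
}

# Example (not executed here because it needs network access):
#   version=$(version_from_tag v0.8.0)
#   needs_build "$version" ada && echo "trigger build-release for ada"
```

Keeping the URL construction separate from the HTTP check makes the naming assumption easy to adjust once the real package name is known.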

Flavours

Flavours are defined in the workflow matrix. Each flavour targets a specific GPU generation using the same CUDA 13.0 toolkit and features (cuda, cudnn, flash-attn, nccl); only the compute capability varies.

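| Flavour   | Compute capability | GPU generation           |
|-----------|--------------------|--------------------------|
| ampere    | sm_86              | RTX 3060, A2000-A6000    |
| ada       | sm_89              | RTX 4060-4090, L40       |
| blackwell | sm_120             | RTX 5090, B100, B200     |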
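
As a rough sketch of how a flavour could select its compute capability at build time, the mapping in the table can be written as a small shell function. Passing the value through the `CUDA_COMPUTE_CAP` environment variable is an assumption about how the workflow drives the build, and `compute_cap` and `build_command` are hypothetical helper names; the command is printed, not run.

```shell
# Map a flavour name to its compute capability (values from the table above).
compute_cap() {
  case "$1" in
    ampere)    echo 86 ;;
    ada)       echo 89 ;;
    blackwell) echo 120 ;;
    *) echo "unknown flavour: $1" >&2; return 1 ;;
  esac
}

# Print the build command for one flavour.
# Assumption: the compute capability is passed via CUDA_COMPUTE_CAP.
build_command() {
  cap=$(compute_cap "$1") || return 1
  echo "CUDA_COMPUTE_CAP=$cap cargo build --release --locked --features cuda,cudnn,flash-attn,nccl"
}

build_command ada
```

An unknown flavour name, such as the retired cuda13, fails early instead of silently building for the wrong architecture.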

Key files

  • rpm/mistralrs.spec — RPM spec (binary-only package, no rebuild)
  • rpm/systemd/mistralrs@.service — templated systemd unit (@BINARY@ and @FLAVOUR@ are sed-replaced during rpmbuild)
  • rpm/systemd/mistralrs@.conf.example — example env file for instances
  • script/setup/ — one-time infra setup scripts (DNS, TLS cert, nginx, GPG) for rpm.lair.cafe on host oolon
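
The placeholder substitution in the systemd unit can be illustrated with a minimal sed call. The two template lines below are a hypothetical stand-in (the real unit file's contents and the installed binary path are not shown here), and `render_unit` is an illustrative helper name.

```shell
# Replace the @BINARY@ and @FLAVOUR@ placeholders in a template read from stdin.
# $1 = binary name, $2 = flavour.
render_unit() {
  sed -e "s|@BINARY@|$1|g" -e "s|@FLAVOUR@|$2|g"
}

# Hypothetical two-line excerpt of a unit template.
printf '%s\n' \
  'Description=mistral.rs (@FLAVOUR@) instance %i' \
  'ExecStart=/usr/bin/@BINARY@' |
  render_unit mistralrs-ada ada
```

Using `|` as the sed delimiter keeps the substitution safe if a replacement value ever contains a `/`, for example a full binary path.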

Commands

Build an RPM from a pre-built binary:

rpmdev-setuptree
cp artifacts/mistralrs-ada ~/rpmbuild/SOURCES/
cp rpm/systemd/mistralrs@.service ~/rpmbuild/SOURCES/
cp rpm/systemd/mistralrs@.conf.example ~/rpmbuild/SOURCES/
rpmbuild -bb rpm/mistralrs.spec --define "mistralrs_version 0.8.0" --define "mistralrs_flavour ada"
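
To package every flavour rather than just ada, the same rpmbuild invocation could be generated in a loop. This sketch only prints the commands; it assumes the binary for each flavour and the systemd files have already been copied into ~/rpmbuild/SOURCES/, and `package_command` is a hypothetical helper name.

```shell
# Print the rpmbuild command for one version and flavour.
# $1 = version, $2 = flavour.
package_command() {
  echo "rpmbuild -bb rpm/mistralrs.spec --define \"mistralrs_version $1\" --define \"mistralrs_flavour $2\""
}

# Emit one command per flavour; pipe the output to sh to actually run them.
for flavour in ampere ada blackwell; do
  package_command 0.8.0 "$flavour"
done
```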

Infrastructure

  • CI runs on Gitea Actions (self-hosted), not GitHub Actions
  • RPM repo hosted at rpm.lair.cafe on host oolon.kosherinata.internal
  • Stable repo: rpm.lair.cafe/fedora/$releasever/$basearch/
  • Unstable repo: rpm.lair.cafe/fedora/$releasever/$basearch/unstable/
  • TLS via Let's Encrypt with Cloudflare DNS challenge
  • Publish uses rsync over SSH as gitea_ci user
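
For reference, a client-side dnf repo definition for the two repos might be generated along these lines. The repo ids and the `repo_stanza` helper are hypothetical, https is assumed from the TLS setup above, and the `gpgkey=` line is left out because the public key location is not given here.

```shell
# Print a dnf .repo stanza.
# $1 = repo id (hypothetical), $2 = path suffix ("" for stable, "unstable/" for unstable).
repo_stanza() {
  cat <<EOF
[$1]
name=$1
baseurl=https://rpm.lair.cafe/fedora/\$releasever/\$basearch/$2
enabled=1
gpgcheck=1
EOF
}

repo_stanza lair-cafe ""
repo_stanza lair-cafe-unstable "unstable/"
```

The output of each call could be written to a file under /etc/yum.repos.d/, with the signing key imported separately so that gpgcheck=1 can succeed.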