mistralrs-package/CLAUDE.md
rob thijssen ef7e3a3183
refactor: rename package from mistralrs-server to mistralrs
Rename the RPM package from mistralrs-server-<flavour> to
mistralrs-<flavour> and the installed binary from mistralrs-server
to mistralrs, matching the upstream CLI binary name.

Adds Obsoletes/Provides for the old package name so dnf will cleanly
replace it on upgrade.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-27 18:53:32 +03:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Purpose

This repo packages mistral.rs (a Rust LLM inference server) into RPMs for Fedora 43 / x86_64. It does not contain the mistral.rs source — it clones upstream at a given tag, cross-compiles with CUDA, and produces signed RPMs published to a self-hosted dnf repo at rpm.lair.cafe.

Architecture

Pipeline flow

  1. poll-upstream (.gitea/workflows/poll-upstream.yml) — cron every 15 min, checks GitHub for latest mistral.rs release tag. If the corresponding RPM doesn't exist on rpm.lair.cafe, triggers build-release.
  2. build-release (.gitea/workflows/build-release.yml) — four-stage pipeline:
    • plan — reads flavours.yml, emits a JSON matrix of flavours + stripped version.
    • build — runs on a cuda-13.0 runner. Clones upstream at tag, calls script/build-binary.sh to cargo build --release --locked with flavour-specific CUDA features.
    • package — runs rpmbuild -bb rpm/mistralrs.spec with --define for version and flavour.
    • publish — GPG-signs RPMs, rsyncs to rpm.lair.cafe, runs createrepo_c --update. Uses concurrency group rpm-publish to prevent metadata races.
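
The poll-upstream decision and the version stripping done in the plan stage can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the function names, the assumption that upstream tags carry a leading "v", and the RPM filename layout (release "1", dist tag "fc43") are all hypothetical and should be checked against the real workflow files and spec.

```shell
# Sketch only: helper names, tag format, and filename layout are assumptions.

# Strip a leading "v" from an upstream tag, e.g. v0.7.0 -> 0.7.0.
strip_version() {
  printf '%s\n' "${1#v}"
}

# Build the RPM filename we would expect for a version and flavour.
# $1 = stripped version, $2 = flavour name.
# The "-1.fc43.x86_64" suffix is a guess at the spec's release and dist tag.
rpm_filename() {
  printf 'mistralrs-%s-%s-1.fc43.x86_64.rpm\n' "$2" "$1"
}

# Succeed (exit 0) when the RPM at URL $1 is NOT yet published,
# i.e. when a build should be triggered.
needs_build() {
  ! curl --silent --fail --head --output /dev/null "$1"
}

# Example with a fixed tag; the real job would read the tag from GitHub.
version="$(strip_version v0.7.0)"
rpm_filename "$version" cuda13
```

A caller could pass the full URL of that file on rpm.lair.cafe to needs_build and trigger build-release only when it succeeds. The directory layout of the repo is not stated here, so the URL prefix is left to the caller.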

Flavours

Defined in flavours.yml. Each flavour specifies a name, cuda_home, cargo_features, and compute_caps. The RPM spec uses update-alternatives so multiple flavours can coexist, with priority: base=10, fa=20, nccl=30.
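
To make the priority scheme concrete, the sketch below prints the kind of update-alternatives registration each flavour might perform in its scriptlets. It only prints the commands and does not run them. The link path /usr/bin/mistralrs and the per-flavour binary paths /usr/bin/mistralrs-<flavour> are assumptions and are not taken from rpm/mistralrs.spec.

```shell
# Print (do not execute) an update-alternatives call for one flavour.
# $1 = flavour name, $2 = priority.
# Argument order is: --install <link> <name> <path> <priority>.
# Both paths below are assumed for illustration.
alternatives_cmd() {
  printf 'update-alternatives --install /usr/bin/mistralrs mistralrs /usr/bin/mistralrs-%s %s\n' "$1" "$2"
}

# Priorities as listed above: the highest priority wins in automatic mode.
for pair in base:10 fa:20 nccl:30; do
  alternatives_cmd "${pair%%:*}" "${pair##*:}"
done
```

With all three flavours installed and the alternative left in automatic mode, the nccl binary would be the one selected, since 30 is the highest priority.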

Key files

  • flavours.yml — flavour matrix definition (drives CI matrix)
  • rpm/mistralrs.spec — RPM spec (binary-only package, no rebuild)
  • rpm/systemd/mistralrs@.service — templated systemd unit (@BINARY@ and @FLAVOUR@ are sed-replaced during rpmbuild)
  • rpm/systemd/mistralrs@.conf.example — example env file for instances
  • script/setup/ — one-time infra setup scripts (DNS, TLS cert, nginx, GPG) for rpm.lair.cafe on host oolon
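
The placeholder substitution in the systemd unit can be illustrated with a small sed filter. This is a sketch under assumptions: the helper name render_unit, the sample template lines, and the binary path are hypothetical, and the spec may perform the replacement differently.

```shell
# Replace @BINARY@ and @FLAVOUR@ in a template read from stdin.
# $1 = installed binary path, $2 = flavour name.
# "|" is used as the sed delimiter so that slashes in the path need no
# escaping; this assumes neither value contains "|".
render_unit() {
  sed -e "s|@BINARY@|$1|g" -e "s|@FLAVOUR@|$2|g"
}

# Hypothetical two-line stand-in for rpm/systemd/mistralrs@.service.
printf '%s\n' \
  'Description=mistral.rs (@FLAVOUR@) instance %i' \
  'ExecStart=@BINARY@' \
  | render_unit /usr/bin/mistralrs-cuda13 cuda13
```

The systemd specifier %i passes through untouched, because only the two @...@ tokens are rewritten.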

Commands

Build a binary locally (requires CUDA toolkit):

FLAVOUR_NAME=cuda13 \
CUDA_HOME=/usr/local/cuda-13.0 \
CARGO_FEATURES="cuda cudnn flash-attn nccl" \
CUDA_COMPUTE_CAP=120 \
SRC_DIR=./src \
./script/build-binary.sh

Build an RPM from a pre-built binary:

rpmdev-setuptree
cp artifacts/mistralrs-cuda13 ~/rpmbuild/SOURCES/
cp rpm/systemd/mistralrs@.service ~/rpmbuild/SOURCES/
cp rpm/systemd/mistralrs@.conf.example ~/rpmbuild/SOURCES/
rpmbuild -bb rpm/mistralrs.spec --define "mistralrs_version 0.7.0" --define "mistralrs_flavour cuda13"

Infrastructure

  • CI runs on Gitea Actions (self-hosted), not GitHub Actions
  • RPM repo hosted at rpm.lair.cafe on host oolon.kosherinata.internal
  • TLS via Let's Encrypt with Cloudflare DNS challenge
  • Publish uses rsync over SSH as gitea_ci user