# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Purpose

This repo packages [mistral.rs](https://github.com/EricLBuehler/mistral.rs) (a Rust LLM inference server) into RPMs for Fedora 43 / x86_64. It does **not** contain the mistral.rs source; instead, it clones upstream at a given tag, cross-compiles with CUDA, and produces signed RPMs published to a self-hosted dnf repo at `rpm.lair.cafe`.

## Architecture

### Pipeline flow

1. **poll-upstream** (`.gitea/workflows/poll-upstream.yml`): runs on a cron schedule every 15 minutes and checks GitHub for the latest mistral.rs release tag. If the corresponding RPM doesn't exist on `rpm.lair.cafe`, it triggers `build-release`.
2. **build-release** (`.gitea/workflows/build-release.yml`): four-stage pipeline:
   - **plan**: reads `flavours.yml` and emits a JSON matrix of flavours plus the stripped version.
   - **build**: runs on a `cuda-13.0` runner. Clones upstream at the tag and calls `script/build-binary.sh`, which runs `cargo build --release --locked` with flavour-specific CUDA features.
   - **package**: runs `rpmbuild -bb rpm/mistralrs.spec` with `--define` for version and flavour.
   - **publish**: GPG-signs RPMs, rsyncs them to `rpm.lair.cafe`, and runs `createrepo_c --update`. Uses concurrency group `rpm-publish` to prevent metadata races.
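
The poll-upstream decision above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual script: every function name is hypothetical, the URL layout and RPM file name pattern under `rpm.lair.cafe` are placeholders, and it assumes upstream tags carry a leading `v`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; none of these names come from the repository.

# Strip a leading "v" from an upstream tag, e.g. v0.7.0 -> 0.7.0.
# Assumes upstream tags look like "v<version>".
strip_version() {
  printf '%s\n' "${1#v}"
}

# Compose the URL where a published RPM is assumed to live.
# The directory layout and file name pattern are placeholders.
rpm_url() {
  local version="$1" flavour="$2"
  printf 'https://rpm.lair.cafe/x86_64/mistralrs-%s-%s-1.x86_64.rpm\n' \
    "$flavour" "$version"
}

# Succeed if the URL answers a HEAD request with a success status.
rpm_exists() {
  curl --silent --fail --head --output /dev/null "$1"
}

# Succeed (exit 0) when no RPM is published yet for this tag and flavour.
needs_build() {
  local tag="$1" flavour="$2" version
  version="$(strip_version "$tag")"
  ! rpm_exists "$(rpm_url "$version" "$flavour")"
}
```

A caller might run `needs_build v0.7.0 cuda13 && echo "trigger build-release"`. Only `rpm_exists` touches the network; the other helpers are pure string manipulation.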

### Flavours

Defined in `flavours.yml`. Each flavour specifies a name, `cuda_home`, `cargo_features`, and `compute_caps`. The RPM spec uses `update-alternatives` so multiple flavours can coexist, with priority: base=10, fa=20, nccl=30.
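
To make the priority scheme concrete, here is a small sketch of how a scriptlet could register a flavour with `update-alternatives`. The function names, both file paths, and the idea of passing the flavour kind (`base`, `fa`, `nccl`) as a separate argument are assumptions for illustration; the real spec may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; paths and names are illustrative only.

# Map a flavour kind to the priority listed above (base=10, fa=20, nccl=30).
alternatives_priority() {
  case "$1" in
    base) echo 10 ;;
    fa)   echo 20 ;;
    nccl) echo 30 ;;
    *)    echo "unknown flavour kind: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the update-alternatives call for one flavour.
# Argument order is: --install <link> <name> <path> <priority>.
alternatives_install_cmd() {
  local flavour="$1" kind="$2" priority
  priority="$(alternatives_priority "$kind")" || return 1
  printf 'update-alternatives --install /usr/bin/mistralrs mistralrs /usr/libexec/mistralrs/mistralrs-%s %s\n' \
    "$flavour" "$priority"
}
```

With several flavours installed, `update-alternatives` in automatic mode points the link at the highest-priority one, so under this scheme an `nccl` build would win over `fa` and `base`.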

### Key files

- `flavours.yml`: flavour matrix definition (drives CI matrix)
- `rpm/mistralrs.spec`: RPM spec (binary-only package, no rebuild)
- `rpm/systemd/mistralrs@.service`: templated systemd unit (`@BINARY@` and `@FLAVOUR@` are sed-replaced during rpmbuild)
- `rpm/systemd/mistralrs@.conf.example`: example env file for instances
- `script/setup/`: one-time infra setup scripts (DNS, TLS cert, nginx, GPG) for `rpm.lair.cafe` on host `oolon`
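
The placeholder substitution in the systemd unit can be pictured with a short sketch. The `render_unit` function and the example values are hypothetical; the spec file may perform the replacement differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not taken from rpm/mistralrs.spec.

# Read a unit template on stdin and write it to stdout with both
# placeholders substituted. "|" is the sed delimiter so that paths
# containing "/" need no escaping; values containing "|" or "&"
# would need escaping, which this sketch does not handle.
render_unit() {
  local binary="$1" flavour="$2"
  sed -e "s|@BINARY@|${binary}|g" -e "s|@FLAVOUR@|${flavour}|g"
}
```

For example, `render_unit /usr/bin/mistralrs cuda13 < rpm/systemd/mistralrs@.service` would print the unit with both markers filled in.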

## Commands

Build a binary locally (requires CUDA toolkit):

```bash
FLAVOUR_NAME=cuda13 CUDA_HOME=/usr/local/cuda-13.0 CARGO_FEATURES="cuda cudnn flash-attn nccl" CUDA_COMPUTE_CAP=120 SRC_DIR=./src ./script/build-binary.sh
```
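
For orientation, a script driven by these variables might be shaped like the sketch below. It is not the contents of `script/build-binary.sh`: the function names are hypothetical, and the real script may select a specific package, set extra variables, or copy the resulting binary elsewhere.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of an env-driven build wrapper.

# Fail early if any named environment variable is unset or empty.
require_env() {
  local name value
  for name in "$@"; do
    eval "value=\${${name}:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: ${name}" >&2
      return 1
    fi
  done
}

# Build inside SRC_DIR with the flavour's CUDA toolkit and features.
build_binary() {
  require_env FLAVOUR_NAME CUDA_HOME CARGO_FEATURES CUDA_COMPUTE_CAP SRC_DIR || return 1
  export CUDA_HOME CUDA_COMPUTE_CAP
  export PATH="${CUDA_HOME}/bin:${PATH}"
  # cargo accepts a space- or comma-separated feature list.
  (cd "$SRC_DIR" && cargo build --release --locked --features "$CARGO_FEATURES")
}
```

Checking the variables up front keeps a missing `CUDA_HOME` from surfacing only as an obscure compiler error partway through a long build.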

Build an RPM from a pre-built binary:

```bash
rpmdev-setuptree
cp artifacts/mistralrs-cuda13 ~/rpmbuild/SOURCES/
cp rpm/systemd/mistralrs@.service ~/rpmbuild/SOURCES/
cp rpm/systemd/mistralrs@.conf.example ~/rpmbuild/SOURCES/
rpmbuild -bb rpm/mistralrs.spec --define "mistralrs_version 0.7.0" --define "mistralrs_flavour cuda13"
```

## Infrastructure

- CI runs on Gitea Actions (self-hosted), not GitHub Actions
- RPM repo hosted at `rpm.lair.cafe` on host `oolon.kosherinata.internal`
- TLS via Let's Encrypt with Cloudflare DNS challenge
- Publish uses rsync over SSH as `gitea_ci` user
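
Putting the publish pieces together, the sign, upload, and re-index sequence could look roughly like this. It is a sketch under stated assumptions: the `run` and `publish_rpms` names, the `DRY_RUN` switch, and the remote directory (`/srv/rpm` unless `REMOTE_DIR` is set) are invented for illustration, and the `rpm-publish` concurrency group is a workflow-level setting that this script does not reproduce.

```bash
#!/usr/bin/env bash
# Hypothetical publish sketch; the remote path is a placeholder.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Sign the RPMs in a local directory, copy them to the repo host,
# then refresh the repo metadata in place.
publish_rpms() {
  local rpm_dir="$1"
  local remote="gitea_ci@oolon.kosherinata.internal"
  local remote_dir="${REMOTE_DIR:-/srv/rpm}"
  run rpm --addsign "$rpm_dir"/*.rpm
  run rsync --archive --verbose "$rpm_dir"/ "${remote}:${remote_dir}/"
  run ssh "$remote" createrepo_c --update "$remote_dir"
}
```

Setting `DRY_RUN=1` before calling `publish_rpms ./out` prints the three commands without signing or contacting the host, which is a cheap way to review the sequence.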