# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Purpose

This repo packages [mistral.rs](https://github.com/EricLBuehler/mistral.rs) (a Rust LLM inference server) into RPMs for Fedora 43 / x86_64. It does **not** contain the mistral.rs source; it clones upstream at a given tag, cross-compiles with CUDA, and produces signed RPMs published to a self-hosted dnf repo at `rpm.lair.cafe`.

## Architecture

### Pipeline flow

1. **poll-upstream** (`.gitea/workflows/poll-upstream.yml`): runs on a cron schedule every 15 min and checks GitHub for the latest mistral.rs release tag. If the corresponding RPMs don't exist on `rpm.lair.cafe`, it triggers `build-release`. It also checks the upstream `main` branch HEAD and triggers `build-prerelease` for the unstable repo.
2. **build-release** (`.gitea/workflows/build-release.yml`), a three-stage pipeline:
   - **build**: runs on a `cuda-13.0` runner. Clones upstream at the tag and runs `cargo build --release --locked` with flavour-specific CUDA features.
   - **package**: runs `rpmbuild -bb rpm/mistralrs.spec` with `--define` for version and flavour.
   - **publish**: GPG-signs the RPMs, rsyncs them to `rpm.lair.cafe`, and runs `createrepo_c --update`. Uses the concurrency group `rpm-publish` to prevent metadata races.
3. **build-prerelease** (`.gitea/workflows/build-prerelease.yml`): same structure as build-release, but it clones a specific commit from `main`, omits `--locked`, uses a prerelease release suffix, and publishes to the unstable repo at `rpm.lair.cafe/fedora/$releasever/$basearch/unstable/`.
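The decision made by poll-upstream in step 1 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the workflow's actual script: the RPM file naming scheme, the helper names, the default Fedora release, and the use of HTTPS are all hypothetical. The only external endpoint is GitHub's public "latest release" API, and `latest_tag` needs `curl` and `jq`.

```shell
#!/usr/bin/env bash
# Sketch of the poll-upstream check. All names below are illustrative.

REPO_BASE="https://rpm.lair.cafe/fedora"   # HTTPS is assumed

# Strip a leading "v" from an upstream tag (v0.8.0 -> 0.8.0).
tag_to_version() {
  local tag="$1"
  printf '%s\n' "${tag#v}"
}

# Build the URL where a stable RPM would be expected.
# ASSUMPTION: the file name pattern is hypothetical, not taken from the spec.
rpm_url() {
  local version="$1" flavour="$2" releasever="${3:-43}"
  printf '%s/%s/x86_64/mistralrs-%s-%s-1.fc%s.x86_64.rpm\n' \
    "$REPO_BASE" "$releasever" "$flavour" "$version" "$releasever"
}

# Ask GitHub for the latest release tag (network access required).
latest_tag() {
  curl -fsSL https://api.github.com/repos/EricLBuehler/mistral.rs/releases/latest \
    | jq -r '.tag_name'
}

# Return 0 (build needed) if the RPM for any flavour is missing.
needs_build() {
  local version="$1" flavour
  for flavour in ampere ada blackwell; do
    if ! curl -fsI -o /dev/null "$(rpm_url "$version" "$flavour")"; then
      return 0
    fi
  done
  return 1
}

# Offline demonstration: print the URL that would be probed.
rpm_url 0.8.0 ada
```

A workflow step could then run `needs_build "$(tag_to_version "$(latest_tag)")"` and trigger `build-release` only when it returns 0.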

### Flavours

Flavours are defined in the workflow matrix. Each flavour targets a specific GPU generation using the same CUDA 13.0 toolkit and features (cuda, cudnn, flash-attn, nccl), varying only the compute capability.

| Flavour   | Compute cap | GPU generation                  |
|-----------|-------------|---------------------------------|
| ampere    | sm_86       | RTX 3060, A2000–A6000           |
| ada       | sm_89       | RTX 4060–4090, L40              |
| blackwell | sm_120      | RTX 5090 (B100/B200 are sm_100) |
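To make the mapping concrete, the sketch below turns a flavour name into its compute capability and prints the build command it implies. It assumes the upstream build honours the `CUDA_COMPUTE_CAP` environment variable and that the Cargo feature names match the list above. The function names are illustrative, and the authoritative values remain those in the workflow matrix.

```shell
#!/usr/bin/env bash
# Sketch: map a flavour to its compute capability (values from the table above).

flavour_compute_cap() {
  case "$1" in
    ampere)    echo 86 ;;
    ada)       echo 89 ;;
    blackwell) echo 120 ;;
    *) echo "unknown flavour: $1" >&2; return 1 ;;
  esac
}

# Print (not run) the build command for a flavour.
# ASSUMPTION: CUDA_COMPUTE_CAP is how the compute capability is passed to the build.
build_command() {
  local flavour="$1" cap
  cap="$(flavour_compute_cap "$flavour")" || return 1
  printf 'CUDA_COMPUTE_CAP=%s cargo build --release --locked --features "cuda cudnn flash-attn nccl"\n' "$cap"
}

build_command ada
```

Printing the command rather than running it keeps the sketch usable on machines without a CUDA toolkit.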

### Key files

- `rpm/mistralrs.spec`: RPM spec (binary-only package, no rebuild)
- `rpm/systemd/mistralrs@.service`: templated systemd unit (`@BINARY@` and `@FLAVOUR@` are sed-replaced during rpmbuild)
- `rpm/systemd/mistralrs@.conf.example`: example env file for instances
- `script/setup/`: one-time infra setup scripts (DNS, TLS cert, nginx, GPG) for `rpm.lair.cafe` on host `oolon`
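The placeholder substitution applied to the systemd unit can be illustrated as below. This is a sketch only: the real substitution happens during rpmbuild, and the `render_unit` helper, the sample template lines, and the binary path are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill @BINARY@ and @FLAVOUR@ in a unit template read from stdin.
render_unit() {
  local binary="$1" flavour="$2"
  # "|" is used as the sed delimiter so that "/" in paths needs no escaping.
  sed -e "s|@BINARY@|${binary}|g" -e "s|@FLAVOUR@|${flavour}|g"
}

# Example with made-up template lines (not the contents of the real unit file).
printf '%s\n' 'Description=mistral.rs (@FLAVOUR@)' 'ExecStart=@BINARY@' \
  | render_unit /usr/bin/mistralrs-ada ada
```

Reading the template from stdin lets the same helper be tried against `rpm/systemd/mistralrs@.service` with a redirect.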

## Commands

Build an RPM from a pre-built binary:
```bash
# Create the ~/rpmbuild directory tree
rpmdev-setuptree
# Stage the pre-built binary and the systemd files as sources
cp artifacts/mistralrs-ada ~/rpmbuild/SOURCES/
cp rpm/systemd/mistralrs@.service ~/rpmbuild/SOURCES/
cp rpm/systemd/mistralrs@.conf.example ~/rpmbuild/SOURCES/
# Build the binary RPM for the given version and flavour
rpmbuild -bb rpm/mistralrs.spec --define "mistralrs_version 0.8.0" --define "mistralrs_flavour ada"
```

## Infrastructure

- CI runs on Gitea Actions (self-hosted), not GitHub Actions
- RPM repo hosted at `rpm.lair.cafe` on host `oolon.kosherinata.internal`
  - Stable repo: `rpm.lair.cafe/fedora/$releasever/$basearch/`
  - Unstable repo: `rpm.lair.cafe/fedora/$releasever/$basearch/unstable/`
- TLS via Let's Encrypt with Cloudflare DNS challenge
- Publish uses rsync over SSH as `gitea_ci` user
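On a client machine, a dnf repo definition pointing at the stable or unstable URL might look like the output of the sketch below. It assumes the repo is served over HTTPS and that packages are GPG-checked. The repo IDs, the `repo_file` helper, and the `gpgkey` URL are placeholders, not values taken from this repository.

```shell
#!/usr/bin/env bash
# Print a .repo stanza for the given channel ("stable" or "unstable").
repo_file() {
  local channel="$1" suffix=""
  if [ "$channel" = "unstable" ]; then
    suffix="unstable/"
  fi
  # \$releasever and \$basearch stay literal so that dnf expands them, not the shell.
  cat <<EOF
[lair-cafe-${channel}]
name=rpm.lair.cafe (${channel})
baseurl=https://rpm.lair.cafe/fedora/\$releasever/\$basearch/${suffix}
enabled=1
gpgcheck=1
# The key URL below is a placeholder; substitute the repo's real public key location.
gpgkey=https://rpm.lair.cafe/RPM-GPG-KEY-PLACEHOLDER
EOF
}

repo_file unstable
```

Redirecting the output to a file under `/etc/yum.repos.d/` (for example with `sudo tee`) would register the repo with dnf.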