docs(generic): document GPU inference hosts and planned cortex proxy

Add the three mistral.rs backends (beast, benjy, quadbrat) with their GPU capacity and the port 1234 / no-auth / no-TLS contract. Note that consumers must currently discover model availability per-host via /v1/models, and that cortex (git.lair.cafe/helexa/cortex) will eventually unify them behind https://cortex.internal:443.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 changed files with 88 additions and 8 deletions
--- a/generic.md
+++ b/generic.md
@@ -25,7 +25,7 @@ Projects are Rust cargo workspaces. The repository root contains:
 ├── <frontend-dir>/         # Vite + React + SWC + TS frontend(s) — see §4
 ├── asset/                  # deployment artifacts (see §6)
 ├── script/                 # deploy.sh and related operational scripts
-└── README.md
+└── readme.md
 ```

 ### Crate naming
@@ -177,11 +177,32 @@ Tauri. Consumes the same `<app>-api` as the web client. Shares types via the `<a
 ### Central database: Postgres
 Default for any app with a central data store.

+- **Default server: `magrathea.kosherinata.internal:5432`**, with `frankie.hanzalova.internal` as a streaming standby. Unless a project explicitly specifies otherwise, assume a new app uses this cluster. Postgres 18 (path: `/var/lib/pgsql/18/data/`).
 - Connection is **mTLS with passwordless auth**. Host-level client certificates issued by the internal step-ca, with cert CN → pg role mapping via `pg_ident.conf`.
-- No passwords in config files, ever. Connection strings reference cert paths.
+- No passwords in config files, ever. Connection strings reference cert paths (§11 TLS / PKI).
+
+**Granting an app access to the database:**
+
+1. Create the Postgres role(s) the app needs (e.g., `<app>_rw`, `<app>_ro`) on the **primary only** — replication carries them to the standby.
+2. Map the app host's cert CN to the Postgres role by dropping a file at `/var/lib/pgsql/18/data/pg_ident.conf.d/<app-host-fqdn>.conf` with one line per mapping:
+   ```
+   cert_cn <app-host-fqdn> <db-username>
+   ```
+   Multiple lines if the host connects as more than one role.
+3. Deploy the **same** ident drop-in to **both** `magrathea` and `frankie`. Streaming replication does not copy configuration files such as `pg_ident.conf`, so a failover to a server missing the mapping will lock the app out.
+4. On each server, reload Postgres to pick up the change (no restart needed):
+   ```
+   sudo systemctl reload postgresql-18
+   ```
+5. Verify from the app host by connecting with its host cert and confirming the role resolves as expected.
+
+`deploy.sh` should handle steps 2–4 idempotently when an app is being deployed to a new host (or when a host's cert CN changes).
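
Steps 2–4 can be sketched as follows. This is a minimal illustration in Python, not the project's actual `deploy.sh`. The helper names (`render_ident_dropin`, `write_if_changed`) are hypothetical, and it assumes the `pg_ident.conf.d` directory is already pulled in by an `include_dir` directive in `pg_ident.conf`. The file is rewritten only when its content differs, so the caller can skip the reload when nothing changed.

```python
import os
import tempfile
from pathlib import Path

MAP_NAME = "cert_cn"  # map name used in the drop-in format above


def render_ident_dropin(host_fqdn: str, roles: list[str]) -> str:
    """Render one `cert_cn <host> <role>` line per role (sorted, de-duplicated)."""
    lines = [f"{MAP_NAME} {host_fqdn} {role}" for role in sorted(set(roles))]
    return "\n".join(lines) + "\n"


def write_if_changed(path: Path, content: str) -> bool:
    """Write `content` to `path` atomically; return True only if the file changed."""
    if path.exists() and path.read_text() == content:
        return False
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        # Assumption: the mapping holds no secrets, so world-readable is acceptable.
        os.chmod(tmp, 0o644)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
    return True
```

A caller would run `systemctl reload postgresql-18` on a server only when `write_if_changed` returns `True` there, which keeps repeated deploys quiet.
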
 - Migrations via `sqlx-cli` or `refinery`; migration files live in `crates/<app>-data/migrations/`.
+- **Migrations are sequentially versioned and immutable once landed.** File naming follows the tool's convention (`V0001__init.sql`, `V0002__add_users.sql`, … for refinery; `0001_init.sql`, `0002_add_users.sql`, … for sqlx). Each new schema change lands as a **new** file with the next sequence number. **Never** edit a migration that has already landed on `main`, even if it hasn't been deployed yet: checksums diverge and the migration runner will refuse to start (or worse, leave production out of sync with dev).
 - Schema changes are forward-only in production. Destructive migrations require a dedicated maintenance window and an explicit plan.
+- If you catch a bug in a recently-added migration *before* it's been merged or deployed anywhere, amending is fine — but the moment it's landed on `main` or run against any database, treat it as frozen and write a follow-up migration to correct the mistake.
 - Use `sqlx` with compile-time query checking (`sqlx prepare`) and commit the generated `.sqlx/` offline query cache so CI builds don't need a live database.
+- **Agentic contributors working in a project with a Postgres dependency will usually have MCP access to a Postgres MCP server scoped to that project's database(s).** Prefer using the MCP server to inspect schema, verify query shapes against real tables, and sanity-check migrations before applying them — don't guess at column names or types when you can look them up. The scope is limited to the project's own databases; don't assume access to unrelated ones.
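
The sequential-versioning rule above lends itself to a small CI check. The sketch below is illustrative only: `check_migration_sequence` is a hypothetical helper, not part of `sqlx-cli` or `refinery`. It assumes plain (non-reversible) migration files numbered from 1, and it checks numbering only, not checksums.

```python
import re

# Accepts sqlx-style `0001_init.sql` and refinery-style `V0001__init.sql`.
_NAME = re.compile(r"^V?(\d+)_{1,2}[A-Za-z0-9_]+\.sql$")


def check_migration_sequence(filenames: list[str]) -> list[str]:
    """Return a list of problems; empty means versions run 1..N without gaps or duplicates."""
    problems = []
    versions = []
    for name in sorted(filenames):
        match = _NAME.match(name)
        if match is None:
            problems.append(f"unrecognised migration name: {name}")
            continue
        versions.append(int(match.group(1)))
    # After sorting, the i-th version must equal i; report the first mismatch only.
    for expected, actual in enumerate(sorted(versions), start=1):
        if actual != expected:
            problems.append(f"expected version {expected}, found {actual}")
            break
    return problems
```

Feeding it the names in `crates/<app>-data/migrations/` and failing the build on a non-empty result would catch a skipped or reused sequence number before it reaches a database.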

 ### Distributed database: Turso
 When the app's data model is distributed (edge replicas, per-site local copies with sync), use Turso. Auth via Turso-issued tokens stored in the per-host secret store, not in `manifest.yml`.
@@ -459,11 +480,46 @@ This is the environment these apps deploy into. Claude Code should assume it.
 - Internal DNS split-horizon via `.internal` domains (`hanzalova.internal`, `kosherinata.internal`, etc.).

 ### TLS / PKI
-- Internal PKI via Smallstep `step-ca` at `ca.internal`.
-- Host certs renewed via systemd timers.
-- mTLS everywhere internal services talk to each other.
+- Internal PKI via Smallstep `step-ca` at `https://ca.internal`.
+- Every host runs `step.service` (the Smallstep renewer), which keeps the host's cert fresh. **Certs are issued with a 24-hour expiry** and renewed continuously. Services must tolerate cert rotation rather than assume certs are stable for the life of the process.
+- **mTLS everywhere** internal services talk to each other.
 - **Quantum-safe** SSH (sntrup761x25519 KEX) and TLS (X25519MLKEM768 where peers support it) are the default. External peers that don't support PQ fall back to classical curves — document the fallback explicitly in nginx config.

+**Standard cert paths on every host:**
+
+| Path | Contents | Mode |
+| --- | --- | --- |
+| `/etc/pki/ca-trust/source/anchors/root-internal.pem` | Internal root CA bundle | world-readable |
+| `/etc/pki/tls/misc/$(hostname -f).pem` | Host cert (public) | world-readable |
+| `/etc/pki/tls/private/$(hostname -f).pem` | Host private key | ACL grants read to service-account users |
+
+Application code and systemd units should reference these paths directly — they're the same on every host, so config templates don't need to bake in a hostname. The key file is not world-readable; each app's service account is granted read access via `setfacl` (e.g., `setfacl -m u:<app>:r /etc/pki/tls/private/$(hostname -f).pem`) as part of deploy. This happens in `deploy.sh` alongside the `systemd-sysusers` step (§8).
+
+**Reacting to cert rotation:**
+
+Services that hold cert state in memory (most Rust daemons using `rustls` or `openssl`) must reload when the host cert changes. Ship a pair of systemd units alongside the service unit:
+
+```ini
+# /etc/systemd/system/<app>-api-cert.path
+[Path]
+PathChanged=/etc/pki/tls/misc/<hostname>.pem
+Unit=<app>-api-cert-reload.service
+
+[Install]
+WantedBy=multi-user.target
+```
+
+```ini
+# /etc/systemd/system/<app>-api-cert-reload.service
+[Service]
+Type=oneshot
+ExecStart=/bin/systemctl reload <app>-api.service
+```
+
+The service unit itself needs an `ExecReload=` that makes the daemon re-read its certs without dropping in-flight requests (typically `SIGHUP` handling in the Rust binary). If the daemon can't reload gracefully, the fallback is to use `ExecStart=/bin/systemctl restart <app>-api.service` in the cert-reload unit instead, but prefer graceful reload.
+
+Ship these `.path` and cert-reload `.service` units from `asset/systemd/` the same way as the main unit.
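
On the daemon side, the `ExecReload=` contract amounts to re-reading the cert and key when `SIGHUP` arrives. The sketch below shows the shape of that handling in Python for brevity; the real daemons are Rust, and nothing here is taken from an existing service. `CertStore` and `install_sighup_reload` are hypothetical names, and the sketch assumes the unit sends `SIGHUP` to the main process (for example via `ExecReload=/bin/kill -HUP $MAINPID`).

```python
import signal
from pathlib import Path


class CertStore:
    """Holds the current cert/key bytes and re-reads them on request."""

    def __init__(self, cert_path: Path, key_path: Path) -> None:
        self._cert_path = cert_path
        self._key_path = key_path
        self._stale = False
        self._pair = self._read()

    def _read(self) -> tuple[bytes, bytes]:
        return self._cert_path.read_bytes(), self._key_path.read_bytes()

    def request_reload(self) -> None:
        """Mark the cached pair stale; cheap enough to call from a signal handler."""
        self._stale = True

    def reload(self) -> bool:
        """Re-read both files; keep the old pair if either read fails."""
        try:
            self._pair = self._read()
        except OSError:
            return False
        return True

    def current(self) -> tuple[bytes, bytes]:
        """Return the pair to use for the next new connection."""
        if self._stale:
            self._stale = False
            self.reload()
        return self._pair


def install_sighup_reload(store: CertStore) -> None:
    """Wire SIGHUP to the store; the handler only sets a flag."""
    signal.signal(signal.SIGHUP, lambda signum, frame: store.request_reload())
```

The handler does no I/O itself: it flags the pair as stale, and the next call to `current()` picks up the rotated files. Connections already in flight keep the pair they started with, which is the "no dropped requests" property the reload path is meant to preserve.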
+
 ### Ingress
 - Per-site nginx reverse proxy terminates all WAN inbound 443.
 - Public DNS via Cloudflare, **unproxied by default** (CF's mTLS origin-pull has been unreliable). Revisit if/when that changes.
@@ -476,6 +532,26 @@ This is the environment these apps deploy into. Claude Code should assume it.
 - SELinux enforcing per §10.
 - Podman quadlets for containerised workloads; bare-metal systemd units for native Rust binaries (preferred where feasible).

+### GPU / inference
+Three bare-metal GPU hosts run [`mistral.rs`](https://github.com/EricLBuehler/mistral.rs) serving an OpenAI-compatible API on port `1234`:
+
+| Host | GPU(s) |
+| --- | --- |
+| `beast.hanzalova.internal:1234` | 2× RTX 5090 |
+| `benjy.hanzalova.internal:1234` | 1× RTX 4090 |
+| `quadbrat.hanzalova.internal:1234` | 1× RTX 3060 |
+
+- **No TLS, no auth.** The endpoints accept any bearer token (including a dummy one — most clients still require a non-empty token field). They are reachable only via the WireGuard mesh and protected at the network layer.
+- Model availability and capacity differ per host. Each host loads a different set depending on VRAM, and the set changes over time. Consumers must discover what's loaded by querying `/v1/models` on each endpoint rather than hard-coding model names to hosts.
+- **Planned: unified proxy at `https://cortex.internal:443`.** [`cortex`](https://git.lair.cafe/helexa/cortex) is an in-progress project that will load, evict, and route models across the three backends and expose a single TLS-terminated endpoint. Until it ships, inference consumers must talk to the three backends directly and handle discovery/routing themselves.
+- When `cortex` lands, consumers should point at `https://cortex.internal:443` and drop the direct-backend logic. Until then, a simple strategy is: query `/v1/models` on all three hosts, pick the host that has the requested model loaded (prefer larger GPUs first for throughput), and fall back through the list on errors.
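
The interim strategy in the last bullet can be sketched as below. This is illustrative only: it assumes `/v1/models` returns the OpenAI list shape (`{"data": [{"id": ...}]}`), the helper names are hypothetical, and the host order simply mirrors the table above (largest GPU first).

```python
import json
import urllib.request
from typing import Callable

# Ordered largest GPU first, per the host table.
HOSTS = [
    "beast.hanzalova.internal:1234",
    "benjy.hanzalova.internal:1234",
    "quadbrat.hanzalova.internal:1234",
]


def fetch_models(host: str, timeout: float = 3.0) -> list[str]:
    """Return the model ids `host` reports; plain HTTP with a dummy bearer token."""
    request = urllib.request.Request(
        f"http://{host}/v1/models",
        headers={"Authorization": "Bearer dummy"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload.get("data", [])]


def candidate_hosts(
    model: str,
    hosts: list[str] = HOSTS,
    fetch: Callable[[str], list[str]] = fetch_models,
) -> list[str]:
    """Hosts that currently report `model`, in preference order; failing hosts are skipped."""
    found = []
    for host in hosts:
        try:
            if model in fetch(host):
                found.append(host)
        except (OSError, ValueError, KeyError, TypeError):
            continue  # host unreachable or response not in the assumed shape
    return found
```

A consumer would send the request to the first host in the returned list and move down the list on errors. Once `cortex` is available, the whole module collapses to a single base URL.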
+
+### Source hosting
+- **New projects are hosted on the self-hosted Gitea instance** at `git.lair.cafe` (or `git.internal` on the WireGuard mesh — both resolve to the same instance). Agentic contributors will usually have MCP access to this Gitea and should prefer it over any public forge when creating repos, issues, or PRs.
+- **Legacy projects** live under various GitHub / GitLab orgs tied to my public username (`grenade`). These will continue to exist but are being migrated to Gitea over time, especially when they come up for a refactor.
+- **When a project has been relocated**, the original public repo should carry a prominent notice at the top of its `readme.md` (or a GitHub archival notice) pointing to the new Gitea URL. If you're working in a repo that looks stale or superseded, check for such a notice before assuming it's still the canonical location.
+- Default to `git.lair.cafe` / `git.internal` for new scaffolds. Only push a new project to GitHub/GitLab if there's a specific reason (OSS visibility, CI integration that only the public forge offers, etc.) — and note the reason in the project `readme.md`.
+
 ---

 ## 12. Code Quality and Tooling
@@ -503,8 +579,11 @@ This is the environment these apps deploy into. Claude Code should assume it.

 ### Documentation
 - Every public item in library crates has a doc comment.
-- Each crate has a `README.md` or top-level module doc explaining its role in the workspace.
-- The repo `README.md` covers: what the project does, how to build, how to run locally, how to deploy. Point readers to this document for architectural conventions.
+- Each crate has a `readme.md` or top-level module doc explaining its role in the workspace.
+- The repo `readme.md` covers: what the project does, how to build, how to run locally, how to deploy. Point readers to this document for architectural conventions.
+- **Name readme files `readme.md` (lowercase), not `README.md`.** The shouty all-caps spelling is a convention I don't share; filenames aren't where emphasis belongs. Every forge in use (Gitea, GitHub, GitLab) renders `readme.md` as the repo landing page just as readily as `README.md`. Other conventional top-level docs — `license`, `changelog`, `contributing` — follow the same rule: lowercase, no shouting.
+- **Exception: `CLAUDE.md` and `AGENTS.md` stay in uppercase.** These are agent-facing instruction files and are easy to miss in a file listing when lowercased. The all-caps spelling is the established convention and the one that tooling (Claude Code and other agent harnesses) looks for, so leave them as-is.
+- **Agents may modify `CLAUDE.md` and `AGENTS.md` at their own discretion** — no approval needed to add, update, or remove guidance when it's warranted. Diffs get reviewed, so unintentional drift will surface in the normal flow. Treat these as living instructions that should be kept accurate and current.

 ### Commits
 - **Use [Conventional Commits](https://www.conventionalcommits.org/) syntax for every commit.** `type(scope): subject`, with types drawn from the standard set (`feat`, `fix`, `docs`, `refactor`, `test`, `chore`, `build`, `ci`, `perf`, `style`). Scope is the crate, component, or area touched. Subject is imperative and under ~70 characters. A body may follow if the *why* isn't self-evident.
@@ -532,4 +611,5 @@ When scaffolding or extending a project:
 11. SELinux stays enforcing. Work with the default policy first; ship a custom module only when necessary (§10). Never suggest `setenforce 0`.
 12. Prefer fewer dependencies. Prefer bare-metal systemd over containers unless there's a reason.
 13. Commit in Conventional Commits syntax. Commit autonomously when the work is done; hold off when follow-ups on the same topic are likely (§12 Commits).
-14. When unsure, ask — these preferences are defaults, not mandates, but deviations should be deliberate.
+14. Default new repos to `git.lair.cafe` / `git.internal` (self-hosted Gitea). Public forges only with a stated reason (§11 Source hosting).
+15. When unsure, ask — these preferences are defaults, not mandates, but deviations should be deliberate.
--- a/readme.md
+++ b/readme.md
Author	SHA1	Message	Date
rob thijssen	83652460ed	docs(generic): document GPU inference hosts and planned cortex proxy Add the three mistral.rs backends (beast, benjy, quadbrat) with their GPU capacity and the port 1234 / no-auth / no-TLS contract. Note that consumers must currently discover model availability per-host via /v1/models, and that cortex (git.lair.cafe/helexa/cortex) will eventually unify them behind https://cortex.internal:443. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:25:59 +03:00
rob thijssen	c5ea03b026	docs(generic): document default Postgres cluster and cert-CN mapping flow Call out magrathea (primary) / frankie (standby) as the default Postgres cluster and document the concrete steps to grant an app access: create roles on the primary, drop a pg_ident.conf.d file on both servers, and reload postgresql-18. The both-servers detail is easy to miss and costs the app during a failover. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:13:17 +03:00
rob thijssen	2bc1a08055	docs(generic): document TLS cert paths, rotation cadence, and reload pattern Expand §11 TLS/PKI with the concrete host cert paths, file modes, and the ACL-for-service-accounts pattern. Document the 24h cert expiry and the continuous step.service renewal so implementations don't assume certs are stable. Add the standard systemd .path/.service reload pair for services that need to re-read certs without restart. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 13:38:42 +03:00
rob thijssen	a0de8ba18c	docs(generic): keep CLAUDE.md/AGENTS.md uppercase, allow autonomous edits Carve out the agent-instruction files as exceptions to the lowercase-readme convention — their all-caps naming is what tooling expects and what makes them visible in a file listing. Also document that agents can modify these files on their own judgement; diffs get reviewed so drift is caught downstream. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:54:32 +03:00
rob thijssen	c644e7ba46	docs: adopt lowercase readme.md convention Add guidance in generic.md §12 that readme files (and other conventional top-level docs: license, changelog, contributing) should be named in lowercase, not shouty all-caps. Update all README.md references in generic.md and rename this repo's own README.md to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:40:30 +03:00
rob thijssen	eaf2398c7a	docs(generic): document migration immutability and sequential versioning Migrations are sequentially numbered and frozen once committed. Editing an already-landed migration causes checksum divergence and migration-runner failures at deploy time — new changes must go in new files. Call this out explicitly so contributors don't quietly break a service by "fixing" a prior migration in place. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:36:52 +03:00
rob thijssen	e9447f54f4	docs(generic): note Postgres MCP server availability for agentic contributors Projects with a Postgres dependency typically expose an MCP server scoped to their database(s). Call this out so agents know to verify schema and query shapes against the real database rather than guessing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:34:37 +03:00
rob thijssen	4f66508d86	docs(generic): document Gitea (git.lair.cafe) as default source host Note that new projects default to the self-hosted Gitea instance at git.lair.cafe (git.internal on the WireGuard mesh), that legacy projects on GitHub/GitLab are being migrated as they come up for refactor, and that relocated repos should carry a prominent pointer to the new URL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:32:10 +03:00