Compare commits
8 Commits
4881720304...main

| Author | SHA1 | Date | |
|---|---|---|---|
| | 83652460ed | | |
| | c5ea03b026 | | |
| | 2bc1a08055 | | |
| | a0de8ba18c | | |
| | c644e7ba46 | | |
| | eaf2398c7a | | |
| | e9447f54f4 | | |
| | 4f66508d86 | | |
96
generic.md
@@ -25,7 +25,7 @@ Projects are Rust cargo workspaces. The repository root contains:
├── <frontend-dir>/ # Vite + React + SWC + TS frontend(s) — see §4
├── asset/ # deployment artifacts (see §6)
├── script/ # deploy.sh and related operational scripts
└── README.md
└── readme.md
```

### Crate naming
@@ -177,11 +177,32 @@ Tauri. Consumes the same `<app>-api` as the web client. Shares types via the `<a
### Central database: Postgres
Default for any app with a central data store.

- **Default server: `magrathea.kosherinata.internal:5432`**, with `frankie.hanzalova.internal` as a streaming standby. Unless a project explicitly specifies otherwise, assume a new app uses this cluster. Postgres 18 (path: `/var/lib/pgsql/18/data/`).
- Connection is **mTLS with passwordless auth**. Host-level client certificates issued by the internal step-ca, with cert CN → pg role mapping via `pg_ident.conf`.
- No passwords in config files, ever. Connection strings reference cert paths.
- No passwords in config files, ever. Connection strings reference cert paths (§11 TLS / PKI).
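
As a hedged illustration of "connection strings reference cert paths", the sketch below assembles a libpq-style keyword/value string from the standard host cert paths given under TLS / PKI. The function name `pg_conn_string`, the database name, and the role are hypothetical; `sslmode`, `sslrootcert`, `sslcert`, and `sslkey` are standard libpq connection parameters, but how a particular Rust driver consumes them may differ.

```rust
/// Build a libpq-style keyword/value connection string that authenticates
/// with the host's client certificate instead of a password.
///
/// `host_fqdn` is the app host's own FQDN (what `hostname -f` prints).
/// `db_name` and `role` are illustrative parameters, not fixed by the document.
pub fn pg_conn_string(host_fqdn: &str, db_name: &str, role: &str) -> String {
    // Standard cert paths, identical on every host apart from the FQDN.
    let root_ca = "/etc/pki/ca-trust/source/anchors/root-internal.pem";
    let cert = format!("/etc/pki/tls/misc/{host_fqdn}.pem");
    let key = format!("/etc/pki/tls/private/{host_fqdn}.pem");

    [
        // Default primary; a project may specify a different cluster.
        "host=magrathea.kosherinata.internal".to_string(),
        "port=5432".to_string(),
        format!("dbname={db_name}"),
        format!("user={role}"),
        // verify-full checks both the CA chain and the server hostname.
        "sslmode=verify-full".to_string(),
        format!("sslrootcert={root_ca}"),
        format!("sslcert={cert}"),
        format!("sslkey={key}"),
    ]
    .join(" ")
}
```

Because the string carries only paths, it can live in a config template without exposing a secret; the private key itself stays protected by the file ACL.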

**Granting an app access to the database:**

1. Create the Postgres role(s) the app needs (e.g., `<app>_rw`, `<app>_ro`) on the **primary only** — replication carries them to the standby.
2. Map the app host's cert CN to the Postgres role by dropping a file at `/var/lib/pgsql/18/data/pg_ident.conf.d/<app-host-fqdn>.conf` with one line per mapping:
   ```
   cert_cn <app-host-fqdn> <db-username>
   ```
   Multiple lines if the host connects as more than one role.
3. Deploy the **same** ident drop-in to **both** `magrathea` and `frankie` — standbys don't replicate `pg_ident.conf` contents, and a failover to a server missing the mapping will lock the app out.
4. On each server, reload Postgres to pick up the change (no restart needed):
   ```
   sudo systemctl reload postgresql-18
   ```
5. Verify from the app host by connecting with its host cert and confirming the role resolves as expected.

`deploy.sh` should handle steps 2–4 idempotently when an app is being deployed to a new host (or when a host's cert CN changes).
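
The idempotent part of steps 2 to 4 can be sketched as pure functions: render the drop-in, compare it with what is already on each server, and only write and reload when they differ. This is a minimal sketch, not the project's `deploy.sh`; the function names and the `DB_SERVERS` constant are hypothetical, and copying the file and issuing the reload are left to the caller.

```rust
/// Both database servers need the same drop-in (step 3).
pub const DB_SERVERS: [&str; 2] = [
    "magrathea.kosherinata.internal",
    "frankie.hanzalova.internal",
];

/// Absolute path of the drop-in on a database server (step 2).
pub fn ident_dropin_path(app_host_fqdn: &str) -> String {
    format!("/var/lib/pgsql/18/data/pg_ident.conf.d/{app_host_fqdn}.conf")
}

/// Render one `cert_cn <fqdn> <role>` line per role the host connects as.
pub fn render_ident_dropin(app_host_fqdn: &str, roles: &[&str]) -> String {
    let mut out = String::new();
    for role in roles {
        out.push_str(&format!("cert_cn {app_host_fqdn} {role}\n"));
    }
    out
}

/// True when the file on a server must be (re)written and Postgres reloaded.
/// `existing` is `None` when the drop-in is not present yet.
pub fn needs_update(existing: Option<&str>, desired: &str) -> bool {
    existing != Some(desired)
}
```

Running the comparison per server keeps a repeat deploy from reloading Postgres when nothing changed, while still repairing a standby that is missing the mapping.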
- Migrations via `sqlx-cli` or `refinery`; migration files live in `crates/<app>-data/migrations/`.
- **Migrations are sequentially versioned and immutable once committed.** File naming follows the tool's convention (`V0001__init.sql`, `V0002__add_users.sql`, … for refinery; `0001_init.sql`, `0002_add_users.sql`, … for sqlx). Each new schema change lands as a **new** file with the next sequence number — **never** edit a migration that has already been committed, even if it hasn't been deployed yet, because checksums diverge and the migration runner will refuse to start (or worse, leave production out of sync with dev).
- Schema changes are forward-only in production. Destructive migrations require a dedicated maintenance window and an explicit plan.
- If you catch a bug in a recently-added migration *before* it's been merged or deployed anywhere, amending is fine — but the moment it's landed on `main` or run against any database, treat it as frozen and write a follow-up migration to correct the mistake.
- Use `sqlx` with compile-time query checking (`sqlx prepare`) and commit the generated `.sqlx/` offline query cache so CI builds don't need a live database.
- **Agentic contributors working in a project with a Postgres dependency will usually have MCP access to a Postgres MCP server scoped to that project's database(s).** Prefer using the MCP server to inspect schema, verify query shapes against real tables, and sanity-check migrations before applying them — don't guess at column names or types when you can look them up. The scope is limited to the project's own databases; don't assume access to unrelated ones.
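
The "next sequence number" rule lends itself to a small pre-commit or CI check. The sketch below handles only the refinery-style names quoted above (`V0001__init.sql`) and assumes, as a local convention rather than a requirement of the tool, that versions start at 1 and have no gaps. Both function names are hypothetical.

```rust
/// Parse the sequence number from a refinery-style file name such as
/// `V0002__add_users.sql`. Returns `None` for any other shape.
pub fn refinery_version(file_name: &str) -> Option<u32> {
    let rest = file_name.strip_prefix('V')?;
    let (digits, tail) = rest.split_once("__")?;
    if digits.is_empty() || !tail.ends_with(".sql") {
        return None;
    }
    digits.parse().ok()
}

/// Check that the given migration files are numbered 1, 2, 3, ... with no
/// gaps and no duplicates. Input order does not matter.
pub fn check_sequence(file_names: &[&str]) -> Result<(), String> {
    let mut versions = Vec::new();
    for name in file_names {
        match refinery_version(name) {
            Some(v) => versions.push(v),
            None => return Err(format!("unrecognised migration name: {name}")),
        }
    }
    versions.sort_unstable();
    for (i, v) in versions.iter().enumerate() {
        let expected = i as u32 + 1;
        if *v != expected {
            return Err(format!("expected version {expected}, found {v}"));
        }
    }
    Ok(())
}
```

A check like this catches numbering mistakes only; it cannot tell whether an already-committed file was edited, which is what the runner's checksum comparison is for.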

### Distributed database: Turso
When the app's data model is distributed (edge replicas, per-site local copies with sync), use Turso. Auth via Turso-issued tokens stored in the per-host secret store, not in `manifest.yml`.
@@ -459,11 +480,46 @@ This is the environment these apps deploy into. Claude Code should assume it.
- Internal DNS split-horizon via `.internal` domains (`hanzalova.internal`, `kosherinata.internal`, etc.).

### TLS / PKI
- Internal PKI via Smallstep `step-ca` at `ca.internal`.
- Host certs renewed via systemd timers.
- mTLS everywhere internal services talk to each other.
- Internal PKI via Smallstep `step-ca` at `https://ca.internal`.
- Every host runs `step.service` (the Smallstep renewer) which keeps the host's cert fresh. **Certs are issued with a 24-hour expiry** and renewed continuously — services must tolerate cert rotation, not assume certs are stable for the life of the process.
- **mTLS everywhere** internal services talk to each other.
- **Quantum-safe** SSH (sntrup761x25519 KEX) and TLS (X25519MLKEM768 where peers support it) are the default. External peers that don't support PQ fall back to classical curves — document the fallback explicitly in nginx config.

**Standard cert paths on every host:**

| Path | Contents | Mode |
| --- | --- | --- |
| `/etc/pki/ca-trust/source/anchors/root-internal.pem` | Internal root CA bundle | world-readable |
| `/etc/pki/tls/misc/$(hostname -f).pem` | Host cert (public) | world-readable |
| `/etc/pki/tls/private/$(hostname -f).pem` | Host private key | ACL grants read to service-account users |

Application code and systemd units should reference these paths directly — they're the same on every host, so config templates don't need to bake in a hostname. The key file is not world-readable; each app's service account is granted read access via `setfacl` (e.g., `setfacl -m u:<app>:r /etc/pki/tls/private/$(hostname -f).pem`) as part of deploy. This happens in `deploy.sh` alongside the `systemd-sysusers` step (§8).

**Reacting to cert rotation:**

Services that hold cert state in memory (most Rust daemons using `rustls` or `openssl`) must reload when the host cert changes. Ship a pair of systemd units alongside the service unit:

```ini
# /etc/systemd/system/<app>-api-cert.path
[Path]
PathChanged=/etc/pki/tls/misc/<hostname>.pem
Unit=<app>-api-cert-reload.service

[Install]
WantedBy=multi-user.target
```

```ini
# /etc/systemd/system/<app>-api-cert-reload.service
[Service]
Type=oneshot
ExecStart=/bin/systemctl reload <app>-api.service
```

The service unit itself needs an `ExecReload=` that causes the daemon to re-read its certs without dropping in-flight requests (typically `SIGHUP` handling in the Rust binary). If the daemon can't reload gracefully, `ExecStart=/bin/systemctl restart <app>-api.service` is the fallback — but prefer graceful reload.

Ship these `.path` and cert-reload `.service` units from `asset/systemd/` the same way as the main unit.
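
On the daemon side, "re-read its certs without dropping in-flight requests" usually comes down to swapping a shared pointer. The following is a minimal std-only sketch under stated assumptions: `CertPair` and `CertStore` are hypothetical names, the PEM bytes are kept raw instead of being parsed into a `rustls` or `openssl` config, and whatever receives `SIGHUP` (a signal-handling crate, not shown) is assumed to call `reload()`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};

/// PEM bytes for the host cert and key, exactly as read from disk.
pub struct CertPair {
    pub cert_pem: Vec<u8>,
    pub key_pem: Vec<u8>,
}

/// Holds the current cert pair and lets a reload swap it in one step.
pub struct CertStore {
    cert_path: PathBuf,
    key_path: PathBuf,
    current: RwLock<Arc<CertPair>>,
}

impl CertStore {
    /// Read both files once at startup.
    pub fn load(cert_path: &Path, key_path: &Path) -> io::Result<Self> {
        let pair = read_pair(cert_path, key_path)?;
        Ok(Self {
            cert_path: cert_path.to_path_buf(),
            key_path: key_path.to_path_buf(),
            current: RwLock::new(Arc::new(pair)),
        })
    }

    /// Snapshot for a new connection. Work already in flight keeps the
    /// `Arc` it was handed, so a later reload does not disturb it.
    pub fn current(&self) -> Arc<CertPair> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Re-read both files and swap them in. If reading fails, the error is
    /// returned and the previous pair stays active.
    pub fn reload(&self) -> io::Result<()> {
        let pair = read_pair(&self.cert_path, &self.key_path)?;
        *self.current.write().unwrap() = Arc::new(pair);
        Ok(())
    }
}

fn read_pair(cert_path: &Path, key_path: &Path) -> io::Result<CertPair> {
    Ok(CertPair {
        cert_pem: fs::read(cert_path)?,
        key_pem: fs::read(key_path)?,
    })
}
```

New connections call `current()` and pick up the rotated cert; connections established before the reload finish with the pair they started with.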

### Ingress
- Per-site nginx reverse proxy terminates all WAN inbound 443.
- Public DNS via Cloudflare, **unproxied by default** (CF's mTLS origin-pull has been unreliable). Revisit if/when that changes.
@@ -476,6 +532,26 @@ This is the environment these apps deploy into. Claude Code should assume it.
- SELinux enforcing per §10.
- Podman quadlets for containerised workloads; bare-metal systemd units for native Rust binaries (preferred where feasible).

### GPU / inference
Three bare-metal GPU hosts run [`mistral.rs`](https://github.com/EricLBuehler/mistral.rs) serving an OpenAI-compatible API on port `1234`:

| Host | GPU(s) |
| --- | --- |
| `beast.hanzalova.internal:1234` | 2× RTX 5090 |
| `benjy.hanzalova.internal:1234` | 1× RTX 4090 |
| `quadbrat.hanzalova.internal:1234` | 1× RTX 3060 |

- **No TLS, no auth.** The endpoints accept any bearer token (including a dummy one — most clients still require a non-empty token field). They are reachable only via the WireGuard mesh and protected at the network layer.
- Model availability and capacity differ per host. Each host loads a different set depending on VRAM, and the set changes over time. Consumers must discover what's loaded by querying `/v1/models` on each endpoint rather than hard-coding model names to hosts.
- **Planned: unified proxy at `https://cortex.internal:443`.** [`cortex`](https://git.lair.cafe/helexa/cortex) is an in-progress project that will load, evict, and route models across the three backends and expose a single TLS-terminated endpoint. Until it ships as functional, inference consumers must talk to the three backends directly and handle discovery/routing themselves.
- When `cortex` lands, consumers should point at `https://cortex.internal:443` and drop the direct-backend logic. Until then, a simple strategy is: query `/v1/models` on all three hosts, pick the host that has the requested model loaded (prefer larger GPUs first for throughput), and fall back through the list on errors.
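
The interim routing strategy can be sketched without committing to an HTTP client. In the hedged example below, the caller is assumed to have already queried `/v1/models` on each host and collected the model IDs into a map; `candidates` and `first_ok` are hypothetical helper names, and the actual request is passed in as a closure.

```rust
use std::collections::HashMap;

/// The three backends, ordered largest GPU first.
pub const BACKENDS: [&str; 3] = [
    "beast.hanzalova.internal:1234",
    "benjy.hanzalova.internal:1234",
    "quadbrat.hanzalova.internal:1234",
];

/// Given the model IDs each backend reported from `/v1/models`, return the
/// backends that currently have `model` loaded, in preference order.
/// Hosts missing from `loaded` (for example, unreachable ones) are skipped.
pub fn candidates(loaded: &HashMap<String, Vec<String>>, model: &str) -> Vec<&'static str> {
    BACKENDS
        .iter()
        .copied()
        .filter(|host| {
            loaded
                .get(*host)
                .map_or(false, |models| models.iter().any(|m| m == model))
        })
        .collect()
}

/// Try each candidate in order and return the first successful result,
/// falling through the list on errors.
pub fn first_ok<T, E>(hosts: &[&str], mut call: impl FnMut(&str) -> Result<T, E>) -> Option<T> {
    hosts.iter().find_map(|host| call(*host).ok())
}
```

Keeping discovery and the request separate means the same two helpers can be deleted in one step once consumers switch to the `cortex` endpoint.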

### Source hosting
- **New projects are hosted on the self-hosted Gitea instance** at `git.lair.cafe` (or `git.internal` on the WireGuard mesh — both resolve to the same instance). Agentic contributors will usually have MCP access to this Gitea and should prefer it over any public forge when creating repos, issues, or PRs.
- **Legacy projects** live under various GitHub / GitLab orgs tied to my public username (`grenade`). These will continue to exist but are being migrated to Gitea over time, especially when they come up for a refactor.
- **When a project has been relocated**, the original public repo should carry a prominent notice at the top of its `readme.md` (or a GitHub archival notice) pointing to the new Gitea URL. If you're working in a repo that looks stale or superseded, check for such a notice before assuming it's still the canonical location.
- Default to `git.lair.cafe` / `git.internal` for new scaffolds. Only push a new project to GitHub/GitLab if there's a specific reason (OSS visibility, CI integration that only the public forge offers, etc.) — and note the reason in the project `readme.md`.

---

## 12. Code Quality and Tooling
@@ -503,8 +579,11 @@ This is the environment these apps deploy into. Claude Code should assume it.

### Documentation
- Every public item in library crates has a doc comment.
- Each crate has a `README.md` or top-level module doc explaining its role in the workspace.
- The repo `README.md` covers: what the project does, how to build, how to run locally, how to deploy. Point readers to this document for architectural conventions.
- Each crate has a `readme.md` or top-level module doc explaining its role in the workspace.
- The repo `readme.md` covers: what the project does, how to build, how to run locally, how to deploy. Point readers to this document for architectural conventions.
- **Name readme files `readme.md` (lowercase), not `README.md`.** The shouty all-caps spelling is a convention I don't share; filenames aren't where emphasis belongs. Every forge in use (Gitea, GitHub, GitLab) renders `readme.md` as the repo landing page just as readily as `README.md`. Other conventional top-level docs — `license`, `changelog`, `contributing` — follow the same rule: lowercase, no shouting.
- **Exception: `CLAUDE.md` and `AGENTS.md` stay in uppercase.** These are agent-facing instruction files and are easy to miss in a file listing when lowercased. The all-caps spelling is the established convention and the one that tooling (Claude Code and other agent harnesses) looks for, so leave them as-is.
- **Agents may modify `CLAUDE.md` and `AGENTS.md` at their own discretion** — no approval needed to add, update, or remove guidance when it's warranted. Diffs get reviewed, so unintentional drift will surface in the normal flow. Treat these as living instructions that should be kept accurate and current.

### Commits
- **Use [Conventional Commits](https://www.conventionalcommits.org/) syntax for every commit.** `type(scope): subject`, with types drawn from the standard set (`feat`, `fix`, `docs`, `refactor`, `test`, `chore`, `build`, `ci`, `perf`, `style`). Scope is the crate, component, or area touched. Subject is imperative and under ~70 characters. A body may follow if the *why* isn't self-evident.
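
For a commit-msg hook or CI lint, the header rule can be checked mechanically. This is a rough sketch, not a full Conventional Commits parser: `check_header` is a hypothetical name, the optional `!` breaking-change marker is accepted, the "~70 characters" guideline is treated as a hard limit of 70 on the subject purely for illustration, and imperative mood is not checked.

```rust
/// Types the document lists as the standard set.
const TYPES: [&str; 10] = [
    "feat", "fix", "docs", "refactor", "test", "chore", "build", "ci", "perf", "style",
];

/// Validate the first line of a commit message against `type(scope): subject`.
/// The scope is optional here; an empty `()` is rejected.
pub fn check_header(line: &str) -> Result<(), String> {
    let (prefix, subject) = line
        .split_once(": ")
        .ok_or("missing `: ` between type and subject")?;

    // Allow the optional breaking-change marker, as in `feat(api)!: ...`.
    let prefix = prefix.strip_suffix('!').unwrap_or(prefix);

    let ty = match prefix.split_once('(') {
        Some((ty, rest)) => {
            let scope = rest.strip_suffix(')').ok_or("scope is missing `)`")?;
            if scope.is_empty() {
                return Err("empty scope".to_string());
            }
            ty
        }
        None => prefix,
    };

    if !TYPES.iter().any(|t| *t == ty) {
        return Err(format!("unknown type `{ty}`"));
    }
    if subject.trim().is_empty() {
        return Err("empty subject".to_string());
    }
    // Assumed hard limit; the document only says "under ~70 characters".
    if subject.chars().count() > 70 {
        return Err("subject longer than 70 characters".to_string());
    }
    Ok(())
}
```

For example, `check_header("docs(readme): rename README.md to readme.md")` passes, while `check_header("update stuff")` is rejected for lacking a type.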
@@ -532,4 +611,5 @@ When scaffolding or extending a project:
11. SELinux stays enforcing. Work with the default policy first; ship a custom module only when necessary (§10). Never suggest `setenforce 0`.
12. Prefer fewer dependencies. Prefer bare-metal systemd over containers unless there's a reason.
13. Commit in Conventional Commits syntax. Commit autonomously when the work is done; hold off when follow-ups on the same topic are likely (§12 Commits).
14. When unsure, ask — these preferences are defaults, not mandates, but deviations should be deliberate.
14. Default new repos to `git.lair.cafe` / `git.internal` (self-hosted Gitea). Public forges only with a stated reason (§11 Source hosting).
15. When unsure, ask — these preferences are defaults, not mandates, but deviations should be deliberate.