docs(generic): document TLS cert paths, rotation cadence, and reload pattern

Expand §11 TLS/PKI with the concrete host cert paths, file modes, and the
ACL-for-service-accounts pattern. Document the 24h cert expiry and the
continuous step.service renewal so implementations don't assume certs are
stable. Add the standard systemd .path/.service reload pair for services
that need to re-read certs without restart.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:38:42 +03:00
parent a0de8ba18c
commit 2bc1a08055


@@ -462,11 +462,46 @@ This is the environment these apps deploy into. Claude Code should assume it.
- Internal DNS split-horizon via `.internal` domains (`hanzalova.internal`, `kosherinata.internal`, etc.).
### TLS / PKI
-- Internal PKI via Smallstep `step-ca` at `ca.internal`.
-- Host certs renewed via systemd timers.
-- mTLS everywhere internal services talk to each other.
+- Internal PKI via Smallstep `step-ca` at `https://ca.internal`.
+- Every host runs `step.service` (the Smallstep renewer) which keeps the host's cert fresh. **Certs are issued with a 24-hour expiry** and renewed continuously; services must tolerate cert rotation, not assume certs are stable for the life of the process.
+- **mTLS everywhere** internal services talk to each other.
- **Quantum-safe** SSH (sntrup761x25519 KEX) and TLS (X25519MLKEM768 where peers support it) are the default. External peers that don't support PQ fall back to classical curves; document the fallback explicitly in nginx config.
**Standard cert paths on every host:**
| Path | Contents | Mode |
| --- | --- | --- |
| `/etc/pki/ca-trust/source/anchors/root-internal.pem` | Internal root CA bundle | world-readable |
| `/etc/pki/tls/misc/$(hostname -f).pem` | Host cert (public) | world-readable |
| `/etc/pki/tls/private/$(hostname -f).pem` | Host private key | ACL grants read to service-account users |
Application code and systemd units should reference these paths directly — they're the same on every host, so config templates don't need to bake in a hostname. The key file is not world-readable; each app's service account is granted read access via `setfacl` (e.g., `setfacl -m u:<app>:r /etc/pki/tls/private/$(hostname -f).pem`) as part of deploy. This happens in `deploy.sh` alongside the `systemd-sysusers` step (§8).
**Reacting to cert rotation:**
Services that hold cert state in memory (most Rust daemons using `rustls` or `openssl`) must reload when the host cert changes. Ship a pair of systemd units alongside the service unit:
```ini
# /etc/systemd/system/<app>-api-cert.path
[Path]
PathChanged=/etc/pki/tls/misc/<hostname>.pem
Unit=<app>-api-cert-reload.service
[Install]
WantedBy=multi-user.target
```
```ini
# /etc/systemd/system/<app>-api-cert-reload.service
[Service]
Type=oneshot
ExecStart=/bin/systemctl reload <app>-api.service
```
The main service unit (`<app>-api.service`) needs an `ExecReload=` that causes the daemon to re-read its certs without dropping in-flight requests (typically `SIGHUP` handling in the Rust binary). If the daemon can't reload gracefully, the fallback is to set `ExecStart=/bin/systemctl restart <app>-api.service` in the cert-reload unit instead, but prefer graceful reload.
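For illustration only (this excerpt is not part of the commit), the `ExecReload=` line in the main unit might look like the sketch below. It assumes the daemon installs a `SIGHUP` handler that re-reads the cert and key from the standard paths; the source does not confirm how any particular daemon handles the signal.

```ini
# /etc/systemd/system/<app>-api.service (excerpt, hypothetical)
[Service]
# Assumption: the daemon re-reads its cert and key when it receives SIGHUP.
# $MAINPID is set by systemd to the PID of the service's main process.
ExecReload=/bin/kill -HUP $MAINPID
```

With this in place, `systemctl reload <app>-api.service` (which the cert-reload unit runs) delivers the signal. Note that `kill` returns as soon as the signal is sent, so systemd considers the reload finished before the daemon has necessarily picked up the new cert.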
Ship these `.path` and cert-reload `.service` units from `asset/systemd/` the same way as the main unit.
### Ingress
- Per-site nginx reverse proxy terminates all WAN inbound 443.
- Public DNS via Cloudflare, **unproxied by default** (CF's mTLS origin-pull has been unreliable). Revisit if/when that changes.