Reverse proxies and edge ingress
Extends generic.md §11 (Network / Ingress). That section says "per-site nginx reverse
proxy terminates all WAN inbound 443"; this doc names the proxies, maps what sits behind
each, and pins down the two access paths and the per-vhost cert choice — plus the one
gotcha that bites every time (a public name doesn't work from inside the mesh).
1. The proxies (one per site)
Each WireGuard site has a single nginx edge proxy. All WAN-inbound 443 for that site is port-forwarded by the site's OPNsense router to its proxy, which terminates TLS and fans out to internal upstreams.
| Site | Edge proxy (nginx host) | Notable hosts behind it |
|---|---|---|
| kosherinata (DC) | oolon.kosherinata.internal | magrathea (Postgres primary), nikola, gramathea, … |
| hanzalova (office) | hanzalova.internal | GPU/inference: beast, benjy, quadbrat; bob (helexa-bench API + Agent Zero); frankie (Postgres streaming standby); trillian; the workstation |
Site octet encodes the mesh subnet (10.<site>.0.0/16); see generic.md §11. New
office services front on hanzalova.internal; new DC services on oolon.
2. Two access paths — and the mesh hairpin gotcha
A service can be reached two ways, and they are not interchangeable:
- Public (from the WAN): public DNS (Cloudflare, unproxied by default) → site WAN IP → OPNsense forwards :443 → site nginx → upstream. Cert: Let's Encrypt (external-tls.md).
- Internal (from the mesh): split-horizon .internal DNS → the host/proxy directly over WireGuard → nginx. Cert: internal CA (internal-tls.md).
Gotcha: public names don't hairpin. From inside the mesh, a public name still resolves (via public DNS) to the site's WAN IP, so the packet hits the OPNsense LAN interface, which only forwards :443 inbound from the WAN, not from the LAN. The connection dead-ends (or worse, gets OPNsense's own default cert). So a service that mesh clients also need must be published under a *.internal name with its own internal-CA vhost, in addition to its public vhost.
This is why dual-audience services get two vhosts on the same proxy — one public
(LE), one internal (lair CA) — usually sharing one webroot and one upstream.
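One way to spot the gotcha from a shell is to classify the address a name resolves to. The sketch below is illustrative only: path_for_ip is a hypothetical helper, and it assumes that every mesh address sits under the 10.<site>.0.0/16 scheme and that any other address is a WAN address.

```shell
# Sketch only: path_for_ip is a hypothetical helper, not part of the repo.
# Assumption: mesh addresses are the only 10.x addresses a client sees;
# anything else is treated as a public (WAN) address.
path_for_ip() {
    case "$1" in
        10.*) echo "mesh" ;;
        *)    echo "public" ;;
    esac
}

# From a mesh host, compare what each name resolves to, for example:
#   path_for_ip "$(getent hosts bench.internal  | awk '{print $1; exit}')"
#   path_for_ip "$(getent hosts bench.helexa.ai | awk '{print $1; exit}')"
```

A "public" answer for a name that mesh clients need is the signal that the service also wants a *.internal vhost.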
3. Per-vhost cert choice
| vhost audience | name | cert | doc |
|---|---|---|---|
| Public / WAN | <svc>.<public-zone> (e.g. bench.helexa.ai) | Let's Encrypt (certbot, Cloudflare DNS-01, ECDSA) | external-tls.md |
| Mesh-only | <svc>.internal | internal CA (step ca, lair provisioner, step@ renewal) | internal-tls.md |
Provisioner credentials for the internal CA (~/.step/secrets/provisioner, shipped to
the host transiently and removed) are covered in internal-tls.md §4.
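The cert choice in this section reduces to a suffix test on the vhost name. A minimal sketch, assuming a hypothetical helper called cert_source (it is not part of any project script, and the two labels it prints are made up for illustration):

```shell
# Sketch only: cert_source is a hypothetical helper.
# Prints which issuer a vhost name should use, following the table above.
cert_source() {
    case "$1" in
        *.internal) echo "internal-ca" ;;   # mesh-only name: internal CA (internal-tls.md)
        *)          echo "letsencrypt" ;;   # public name: Let's Encrypt (external-tls.md)
    esac
}

cert_source bench.helexa.ai   # prints: letsencrypt
cert_source bench.internal    # prints: internal-ca
```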
4. nginx conventions on the proxies
- sites-available/ + sites-enabled/ symlink, included via /etc/nginx/conf.d/sites-enabled.conf (include /etc/nginx/sites-enabled/*.conf;). One file per server_name; enable with a relative symlink (ln -sf ../sites-available/<name>.conf /etc/nginx/sites-enabled/).
- Static SPA served from /var/www/<name> with SPA fallback (try_files $uri $uri/ /index.html;); API reverse-proxied to the internal host:port. Internal vhosts add ssl_trusted_certificate <internal root> and pin ssl_protocols TLSv1.3.
- SELinux (enforcing): webroots must be labelled httpd_sys_content_t or nginx returns 403. After creating/populating /var/www/<name>, run restorecon -R /var/www/<name>; rsynced files inherit the dir's type.
- Never reference a cert path before the cert exists: nginx -t fails on a missing ssl_certificate and blocks the whole server from (re)starting. Issue the cert, then install the TLS vhost (gate on the cert's presence; serve an http-only bootstrap until then if needed).
- Config + cert/renewal wiring is installed idempotently from each project's infra-setup.sh (deployment-gitea-actions.md §2); the recurring artifact rsync (e.g. built SPA dist/) rides in the deploy workflow.
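The relative-symlink and cert-gating conventions above could be combined in one idempotent step. This is a sketch under stated assumptions, not the contents of any project's infra-setup.sh: enable_site, its arguments, and the NGINX_DIR override are all hypothetical names introduced for illustration.

```shell
# Sketch only: enable_site and NGINX_DIR are hypothetical.
# NGINX_DIR defaults to /etc/nginx and can be overridden to try the
# function against a scratch directory.
NGINX_DIR="${NGINX_DIR:-/etc/nginx}"

# enable_site <name> [cert-path]
# Enables sites-available/<name>.conf through a relative symlink.
# If a cert path is given and that file is missing or empty, it refuses
# (returns 1) so nginx -t never sees a dangling ssl_certificate.
enable_site() {
    es_name="$1"
    es_cert="${2:-}"
    if [ ! -f "$NGINX_DIR/sites-available/$es_name.conf" ]; then
        echo "enable_site: no sites-available/$es_name.conf" >&2
        return 1
    fi
    if [ -n "$es_cert" ] && [ ! -s "$es_cert" ]; then
        echo "enable_site: $es_cert missing; issue the cert first" >&2
        return 1
    fi
    mkdir -p "$NGINX_DIR/sites-enabled"
    # Re-running is safe: -f replaces an existing link with the same target.
    ln -sf "../sites-available/$es_name.conf" "$NGINX_DIR/sites-enabled/"
}
```

After a successful call, nginx -t followed by a reload would pick up the vhost; when the call fails on a missing cert, the http-only bootstrap stays in place until the cert has been issued.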
5. Worked example: helexa-bench UI
The bench visualisation is reached both ways, fronted by hanzalova.internal:
| vhost | cert | DNS |
|---|---|---|
| bench.helexa.ai (public) | Let's Encrypt | Cloudflare A → office WAN IP; OPNsense forwards WAN :443 → hanzalova |
| bench.internal (mesh) | internal lair CA, renewed by step@bench.timer | split-horizon bench.internal → hanzalova mesh IP |
Both vhosts share one webroot (/var/www/bench.helexa.ai, the built SPA) and proxy
/api to the helexa-bench read API on bob.hanzalova.internal:13132. The internal vhost
exists precisely because of §2: from a workstation on the mesh, bench.helexa.ai
hairpins to the OPNsense LAN interface and fails, so mesh users hit bench.internal.
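Because the two vhosts differ only in name and cert, one template can render both. The following is a rough sketch rather than the deployed config: render_bench_vhost is a hypothetical function, the cert paths are placeholders supplied by the caller, and plain http to the upstream is an assumption. It also leaves out the ssl_trusted_certificate and ssl_protocols lines that §4 calls for on internal vhosts.

```shell
# Sketch only: render_bench_vhost and the cert paths passed to it are
# hypothetical; real paths come from external-tls.md / internal-tls.md.
# usage: render_bench_vhost <server_name> <fullchain> <privkey>
render_bench_vhost() {
    cat <<EOF
server {
    listen 443 ssl;
    server_name $1;

    ssl_certificate     $2;
    ssl_certificate_key $3;

    # Shared webroot: the built SPA, with SPA fallback.
    root /var/www/bench.helexa.ai;
    location / {
        try_files \$uri \$uri/ /index.html;
    }

    # helexa-bench read API on bob (http upstream is an assumption).
    location /api {
        proxy_pass http://bob.hanzalova.internal:13132;
    }
}
EOF
}
```

Calling it once with bench.helexa.ai and the Let's Encrypt paths, and once with bench.internal and the internal-CA paths, would yield the two files to place in sites-available/.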