# External TLS: public certs for WAN-facing vhosts

Extends `generic.md` §11 (TLS / PKI). That section and `internal-tls.md` cover the
**internal** PKI (Smallstep `step-ca`, `*.internal` names, mesh-only). This doc covers
the other half: **publicly trusted certs for names served to the public internet** at a
site's WAN edge, e.g. `bench.helexa.ai`, `qapish.ai`, `*.zap.pics`.

Decision rule (the whole strategy in one line):

> **Public, internet-resolvable name → Let's Encrypt. Mesh-only `*.internal` name →
> internal CA (`internal-tls.md`).** A service reached both ways gets one vhost of each
> (see `reverse-proxies.md`).

Public certs must chain to a publicly trusted root (browsers off the mesh don't trust
the `lair` internal root), so these come from Let's Encrypt, never `step-ca`.

---

## 1. Issuance: certbot + Cloudflare DNS-01, ECDSA

Our public DNS zones are on Cloudflare, so we use the **DNS-01** challenge via the
`certbot-dns-cloudflare` plugin. DNS-01 is deliberate:

- **No inbound :80 needed.** The challenge is a TXT record, not an HTTP hit, so a cert
  can be issued (or renewed) even while nginx is stopped or the host isn't yet reachable
  from the WAN. (This is why a dormant edge proxy doesn't block issuance.)
- **Wildcard-capable**, if a zone ever wants `*.example.com`.

Keys are **ECDSA** (`--key-type ecdsa`), matching the rest of the fleet.

```sh
sudo certbot certonly \
  -m ops@<domain> --agree-tos --no-eff-email --noninteractive \
  --cert-name <domain> \
  --key-type ecdsa \
  --dns-cloudflare \
  --dns-cloudflare-credentials /root/.certbot-internal \
  --dns-cloudflare-propagation-seconds 60 \
  --keep-until-expiring \
  -d <domain>
```

- **`/root/.certbot-internal`** holds the Cloudflare API token. One token covers all the
  zones we manage (`helexa.ai`, `zap.pics`, …), so new sub-domains under an existing zone
  need no new credential: just run the command.
- **`--keep-until-expiring`** makes scripted/repeated runs idempotent (no-op if the cert
  is still valid), so this is safe to call unconditionally from `infra-setup.sh`.
- `--cert-name <domain>` pins the lineage name so the cert lands at a predictable path
  regardless of `-d` ordering.

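
The plugin reads the token from a small INI-style file. A minimal sketch for creating it with safe permissions follows; the helper name `write_cf_credentials` and the `CF_API_TOKEN` variable in the usage line are illustrative assumptions, not part of the existing tooling.

```sh
#!/bin/sh
# Hypothetical helper: write a certbot-dns-cloudflare credentials file.
#   $1 = destination path, $2 = Cloudflare API token
write_cf_credentials() {
    cred_path="$1"
    token="$2"
    # Restrictive umask inside a subshell so a newly created file is never
    # group- or world-readable; the plugin warns about loose modes.
    (
        umask 077
        printf 'dns_cloudflare_api_token = %s\n' "$token" > "$cred_path"
    )
    # Also tighten the mode in case the file already existed.
    chmod 600 "$cred_path"
}
```

Run it as root, for example `write_cf_credentials /root/.certbot-internal "$CF_API_TOKEN"`. Per the plugin's documentation, the token needs DNS edit permission on every zone it is meant to cover.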
## 2. Paths

certbot's standard layout (do **not** relocate; the renew timer expects it):

| Path | Contents |
| --- | --- |
| `/etc/letsencrypt/live/<domain>/fullchain.pem` | cert + intermediate chain |
| `/etc/letsencrypt/live/<domain>/privkey.pem` | private key |

These live under root-only `/etc/letsencrypt/live` (`0700`). Scripts that check for an
existing cert must `sudo test -d /etc/letsencrypt/live/<domain>`; an unprivileged
`test` silently returns false and will wrongly conclude the cert is missing.

## 3. Renewal

Automatic via certbot's own `certbot-renew.timer` (systemd): **no per-cert unit**,
unlike the internal `step@<name>` template. certbot renews any lineage within 30 days of
expiry and runs the configured deploy hook. Ensure nginx reloads after renewal with a
deploy hook (once per host):

```sh
#!/bin/sh
# /etc/letsencrypt/renewal-hooks/deploy/reload-nginx.sh (chmod +x)
systemctl reload nginx 2>/dev/null || true
```
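
As a quick sanity check on the renewal window, the sketch below asks `openssl` whether a lineage will still be valid 30 days from now. The function name `renewal_status` and the `LE_LIVE` override are assumptions for illustration; the 30-day figure is the window stated above.

```sh
#!/bin/sh
# Sketch: report whether a lineage is inside the 30-day renewal window.
# LE_LIVE is an illustrative override; the default is certbot's layout.
LE_LIVE="${LE_LIVE:-/etc/letsencrypt/live}"
RENEW_WINDOW_SECS=$((30 * 24 * 60 * 60))

renewal_status() {
    # `openssl x509 -checkend N` exits 0 if the cert is still valid N
    # seconds from now, non-zero otherwise. An unreadable or missing
    # file also yields "due", so treat that answer as a prompt to look.
    if sudo openssl x509 -noout -checkend "$RENEW_WINDOW_SECS" \
            -in "${LE_LIVE}/$1/fullchain.pem" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "due"
    fi
}
```

`renewal_status <domain>` prints `ok` or `due`. For an end-to-end test of the DNS-01 path, `sudo certbot renew --dry-run` runs against the staging environment; deploy hooks are not called during a dry run, so the reload hook needs a separate check.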
## 4. nginx wiring

```nginx
server {
    listen 443 ssl;
    http2 on;
    server_name <domain>;

    ssl_certificate     /etc/letsencrypt/live/<domain>/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/<domain>/privkey.pem;
    ssl_protocols       TLSv1.2 TLSv1.3;
}
```

Keep an `:80` server for the same name only if you want an HTTP→HTTPS redirect; the
cert itself needs no `:80` (DNS-01). Never reference a cert path before the cert exists:
`nginx -t` fails on a missing `ssl_certificate` file and blocks **all** of nginx from
(re)starting. Issue first, then install the TLS vhost (gate the vhost install on
`sudo test -d /etc/letsencrypt/live/<domain>`).

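
The issue-first ordering can be sketched as a gated install step. The function name, its arguments, and the `LE_LIVE` override are hypothetical; the rollback assumes the vhost file is being installed for the first time.

```sh
#!/bin/sh
# Sketch: install a TLS vhost only once its certificate exists.
#   $1 = domain, $2 = vhost source file, $3 = destination in nginx's conf dir
LE_LIVE="${LE_LIVE:-/etc/letsencrypt/live}"

install_tls_vhost() {
    domain="$1" src="$2" dest="$3"

    # Gate: the check needs root because the live directory is 0700.
    if ! sudo test -d "${LE_LIVE}/${domain}"; then
        echo "no cert for ${domain} yet; skipping TLS vhost" >&2
        return 1
    fi

    sudo install -m 0644 "$src" "$dest" || return 1

    # Validate the whole config before touching the running server.
    if sudo nginx -t; then
        sudo systemctl reload nginx
    else
        # Roll back so a bad vhost cannot block later (re)starts.
        # Assumes $dest did not exist before this call.
        sudo rm -f "$dest"
        return 1
    fi
}
```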
## 5. Gotchas

- **SAN, not CN.** Modern clients ignore CN; the served name must be in the SAN. certbot
  sets SAN from `-d`, so this is automatic. But if `curl` reports *"no alternative
  certificate subject name matches target hostname"*, the listener answering isn't the
  one holding this cert (see next point).
- **Wrong cert on the public endpoint = a routing problem, not a cert problem.** If a
  public name returns something like `CN=opnsense.<site>.internal`, the WAN `:443`
  forward (or HAProxy SNI route) on OPNsense isn't landing on the site's nginx. Fix the
  edge route (`reverse-proxies.md` §2), not the cert.

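
To see which certificate the public endpoint is actually serving, a sketch along these lines can help. The function name `served_cert` is hypothetical, and the `-ext` option assumes OpenSSL 1.1.1 or newer.

```sh
#!/bin/sh
# Sketch: print subject, issuer and SANs of the cert served for a name.
served_cert() {
    host="$1"
    # -servername sets SNI, which an SNI-routing edge uses to pick a backend.
    # stdin from /dev/null makes s_client exit after the handshake.
    openssl s_client -connect "${host}:443" -servername "$host" \
            </dev/null 2>/dev/null |
        openssl x509 -noout -subject -issuer -ext subjectAltName
}
```

If the subject or SAN list shows an internal name rather than `<domain>`, a different listener answered the request, which points at the edge route rather than the cert.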
## 6. Checklist for a new public vhost

1. Add the public DNS record on Cloudflare (unproxied by default; see `generic.md` §11).
2. Issue the cert (§1), from `infra-setup.sh`, idempotently.
3. Point the nginx vhost at the `live/<domain>` paths (§4); `nginx -t` && reload.
4. Confirm the site's OPNsense forwards WAN `:443` to this nginx (`reverse-proxies.md`).