PostgreSQL Cluster

Three-node Patroni cluster with pgvector, consolidating the rbv and immich PostgreSQL containers previously running on gramathea.

All connections (client and replication) use mutual TLS via the internal step CA. No password authentication is used anywhere.

Certificate convention on all infra hosts:

  • CA: /etc/pki/ca-trust/source/anchors/root-internal.pem
  • Cert: /etc/pki/tls/misc/$(hostname -f).pem
  • Key: /etc/pki/tls/private/$(hostname -f).pem

Each cert is provisioned for both client and server use, so the same PEM pair serves for PostgreSQL SSL, client certificate authentication, and Patroni replication.
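
The convention can be captured in a small helper so that later commands do not repeat the paths. This is only a sketch: the function name `cert_paths` is hypothetical and not an existing script on the hosts. It prints the three paths for a given FQDN and does not check that the files exist.

```shell
# cert_paths [FQDN]
# Hypothetical helper for illustration. Prints the CA, certificate and key
# paths for a host as KEY=path lines. Defaults to this host's FQDN.
cert_paths() {
    fqdn=${1:-$(hostname -f)}
    printf 'CA=%s\n'   /etc/pki/ca-trust/source/anchors/root-internal.pem
    printf 'CERT=%s\n' "/etc/pki/tls/misc/${fqdn}.pem"
    printf 'KEY=%s\n'  "/etc/pki/tls/private/${fqdn}.pem"
}
```

Calling `cert_paths frankie.hanzalova.internal` prints the same three paths that ssl.conf uses below.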

Nodes

Hostname                    Site       Role
frankie.hanzalova.internal  primary    primary site, node 1
(TBD)                       primary    primary site, node 2
(TBD)                       secondary  secondary site, node 3

Hardware: ASRock E3C236D4M-4L, E3-1230 v6, 16 GB RAM, 2×1 TB SSD.

Replication topology (target):

  • Primary → sync standby: within primary site (no WireGuard on critical write path)
  • Primary → async standby: secondary site node (DR copy)

Phase 1 — Standalone on frankie

Install PostgreSQL 17 and pgvector

# Add PGDG repository
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/F-$(rpm -E %fedora)-x86_64/pgdg-fedora-repo-latest.noarch.rpm

# Disable the Fedora-packaged postgres to avoid conflicts
sudo dnf -qy module disable postgresql

# Install server and pgvector
sudo dnf install -y postgresql17-server pgvector_17

# Initialise the data directory
sudo /usr/pgsql-17/bin/postgresql-17-setup initdb

# Enable and start
sudo systemctl enable --now postgresql-17.service

Make certificates readable by postgres

Grant the postgres user read access via ACL, leaving ownership as root. This way cert renewals take effect automatically without re-copying.

# Use the FQDN (hostname -f) so the path matches the certificate convention
sudo setfacl -m u:postgres:r /etc/pki/tls/private/$(hostname -f).pem

Configure postgresql.conf

sudo -u postgres mkdir -p /var/lib/pgsql/17/data/postgresql.conf.d
if ! sudo -u postgres grep 'postgresql.conf.d' /var/lib/pgsql/17/data/postgresql.conf &> /dev/null; then
    echo 'include_dir = postgresql.conf.d' | sudo -u postgres tee --append /var/lib/pgsql/17/data/postgresql.conf
fi
echo "listen_addresses = '*'" | sudo -u postgres tee /var/lib/pgsql/17/data/postgresql.conf.d/listen.conf
sudo -u postgres tee /var/lib/pgsql/17/data/postgresql.conf.d/ssl.conf <<'EOF'
ssl = on
ssl_cert_file = '/etc/pki/tls/misc/frankie.hanzalova.internal.pem'
ssl_key_file  = '/etc/pki/tls/private/frankie.hanzalova.internal.pem'
ssl_ca_file   = '/etc/pki/ca-trust/source/anchors/root-internal.pem'
EOF
sudo -u postgres tee /var/lib/pgsql/17/data/postgresql.conf.d/memory.conf <<'EOF'
shared_buffers = 4GB
work_mem = 64MB
maintenance_work_mem = 512MB
EOF
sudo -u postgres tee /var/lib/pgsql/17/data/postgresql.conf.d/wal.conf <<'EOF'
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1GB
EOF
sudo -u postgres tee /var/lib/pgsql/17/data/postgresql.conf.d/checkpoint.conf <<'EOF'
checkpoint_completion_target = 0.9
EOF
sudo -u postgres tee /var/lib/pgsql/17/data/postgresql.conf.d/vchord.conf <<'EOF'
shared_preload_libraries = 'vchord'
EOF
sudo -u postgres tee /var/lib/pgsql/17/data/postgresql.conf.d/logging.conf <<'EOF'
log_destination = 'stderr'
logging_collector = off
EOF

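# A reload applies the SSL, wal_keep_size and checkpoint settings. listen_addresses,
# shared_buffers, wal_level, max_wal_senders, logging_collector and
# shared_preload_libraries only take effect on a restart, which is done
# after VectorChord is installed below.
sudo systemctl reload postgresql-17.service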
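
Not every setting is applied by a reload. As a hedged sketch, the `pending_restart` column of `pg_settings` shows which parameters have a new value in the configuration files that is waiting for a restart. The `PSQL` variable and the function name are assumptions of this example, not existing tooling.

```shell
# PSQL is an assumption of this sketch: the command used to reach the local
# instance as a superuser. Override it to target another instance.
PSQL=${PSQL:-sudo -u postgres psql}

# Print one parameter name per line for settings changed in the config files
# that will only take effect after a restart. No output means none pending.
pending_restart_settings() {
    # -A unaligned output, -t rows only, -X do not read ~/.psqlrc
    $PSQL -AtX -c "SELECT name FROM pg_settings WHERE pending_restart ORDER BY name;"
}
```

Run `pending_restart_settings` after the reload; any names it prints still need `systemctl restart postgresql-17.service`.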

Configure pg_hba.conf

Extend the default rules with certificate-only authentication for LAN connections. Local unix-socket access retains peer authentication for admin use.

sudo -u postgres mkdir -p /var/lib/pgsql/17/data/pg_hba.conf.d
if ! sudo -u postgres grep 'pg_hba.conf.d' /var/lib/pgsql/17/data/pg_hba.conf &> /dev/null; then
    # pg_hba.conf include directives take no '=' (unlike postgresql.conf)
    echo 'include_dir pg_hba.conf.d' | sudo -u postgres tee --append /var/lib/pgsql/17/data/pg_hba.conf
fi
sudo -u postgres tee /var/lib/pgsql/17/data/pg_hba.conf.d/network-connections.conf <<'EOF'
hostnossl all all 0.0.0.0/0 reject
hostssl all all 10.3.0.0/16 cert map=cn
hostssl all all 10.6.0.0/16 cert map=cn
hostssl replication replicator 10.0.0.0/8 cert clientcert=verify-full map=cn
EOF

sudo systemctl reload postgresql-17.service

Configure pg_ident.conf

Maps the CN of each client certificate to the appropriate database user. Add a line for each application host.

sudo -u postgres mkdir -p /var/lib/pgsql/17/data/pg_ident.conf.d
if ! sudo -u postgres grep 'pg_ident.conf.d' /var/lib/pgsql/17/data/pg_ident.conf &> /dev/null; then
    # pg_ident.conf include directives take no '=' (unlike postgresql.conf)
    echo 'include_dir pg_ident.conf.d' | sudo -u postgres tee --append /var/lib/pgsql/17/data/pg_ident.conf
fi
sudo -u postgres tee /var/lib/pgsql/17/data/pg_ident.conf.d/immich.conf <<'EOF'
cn gramathea.kosherinata.internal immich
EOF
sudo -u postgres tee /var/lib/pgsql/17/data/pg_ident.conf.d/rbv.conf <<'EOF'
cn gramathea.kosherinata.internal rbv
EOF

sudo systemctl reload postgresql-17.service
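
Each `map=NAME` option in pg_hba.conf must match a map name in pg_ident.conf, otherwise cert authentication fails for that rule. As a rough cross-check, the sketch below compares the two drop-in directories. The function name `undefined_hba_maps` is hypothetical, and the parsing is deliberately simple: it assumes unquoted map names and no nested include directives inside the drop-in files.

```shell
# undefined_hba_maps HBA_DIR IDENT_DIR
# Hypothetical helper: prints each map name used as map=NAME in HBA_DIR/*.conf
# that has no mapping line in IDENT_DIR/*.conf. No output means all resolve.
undefined_hba_maps() {
    hba_dir=$1
    ident_dir=$2
    # Map names referenced by non-comment pg_hba lines, e.g. "map=cn" -> "cn"
    used=$(grep -hv '^[[:space:]]*#' "$hba_dir"/*.conf \
        | grep -oE 'map=[^[:space:]]+' | cut -d= -f2 | sort -u)
    # The first field of every non-comment, non-blank pg_ident line is a map name
    defined=$(awk '!/^[ \t]*(#|$)/ { print $1 }' "$ident_dir"/*.conf | sort -u)
    for name in $used; do
        printf '%s\n' "$defined" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}
```

Run it as a user who can read the data directory, passing /var/lib/pgsql/17/data/pg_hba.conf.d and /var/lib/pgsql/17/data/pg_ident.conf.d as the two arguments.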

Create roles and databases

No passwords — authentication is via certificate only.

sudo -u postgres psql <<'EOF'
CREATE USER rbv;
CREATE DATABASE rbv OWNER rbv;

CREATE USER immich;
CREATE DATABASE immich OWNER immich;

CREATE USER replicator REPLICATION;
EOF
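
To confirm that no role can fall back to password authentication, the stored passwords can be inspected. This is a sketch under the assumption that the check runs as a superuser, since `pg_authid` is not readable otherwise; the `PSQL` variable and the function name are introduced for illustration only.

```shell
# PSQL is an assumption of this sketch: the command used to reach the local
# instance as a superuser.
PSQL=${PSQL:-sudo -u postgres psql}

# Print the names of roles that have a stored password.
# No output means no role has one.
roles_with_password() {
    $PSQL -AtX -c "SELECT rolname FROM pg_authid WHERE rolpassword IS NOT NULL ORDER BY rolname;"
}
```

After the statements above, `roles_with_password` is expected to print nothing for rbv, immich and replicator, because `CREATE USER` without a `PASSWORD` clause stores none.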

Install VectorChord

VectorChord is not in PGDG — install from the GitHub release zip. Check https://github.com/tensorchord/VectorChord/releases for the current version.

curl \
    --fail \
    --show-error \
    --location \
    --silent \
    --output /tmp/postgresql-17-vchord_1.1.1_x86_64-linux-gnu.zip \
    --url https://github.com/tensorchord/VectorChord/releases/download/1.1.1/postgresql-17-vchord_1.1.1_x86_64-linux-gnu.zip
unzip \
    -d /tmp/vchord \
    /tmp/postgresql-17-vchord_1.1.1_x86_64-linux-gnu.zip

sudo install \
    --owner root \
    --group root \
    /tmp/vchord/pkglibdir/vchord.so \
    /usr/pgsql-17/lib/
sudo install \
    --owner root \
    --group root \
    --mode 644 \
    /tmp/vchord/sharedir/extension/vchord* \
    /usr/pgsql-17/share/extension/

rm -rf /tmp/vchord /tmp/postgresql-17-vchord_1.1.1_x86_64-linux-gnu.zip
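
Since the version appears in several places, it can be kept in one variable when upgrading. The sketch below assumes that other releases follow the same asset naming pattern as the 1.1.1 URL above, which should be checked on the releases page. `VCHORD_VERSION`, `PG_MAJOR` and both function names are introduced for this example only.

```shell
# Names introduced for this sketch only; defaults match the commands above.
VCHORD_VERSION=${VCHORD_VERSION:-1.1.1}
PG_MAJOR=${PG_MAJOR:-17}

# Print the release asset file name for the chosen versions.
vchord_asset() {
    printf 'postgresql-%s-vchord_%s_x86_64-linux-gnu.zip\n' "$PG_MAJOR" "$VCHORD_VERSION"
}

# Print the download URL, assuming the same pattern as the 1.1.1 release.
vchord_url() {
    printf 'https://github.com/tensorchord/VectorChord/releases/download/%s/%s\n' \
        "$VCHORD_VERSION" "$(vchord_asset)"
}
```

The curl call can then use `--output "/tmp/$(vchord_asset)"` and `--url "$(vchord_url)"`.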

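VectorChord requires preloading, which needs a restart, not just a reload.

Caution

The append to postgresql.conf below is deprecated in favour of /var/lib/pgsql/17/data/postgresql.conf.d/vchord.conf above. If that file is in place, skip the append and run only the restart.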

sudo tee -a /var/lib/pgsql/17/data/postgresql.conf <<'EOF'

# VectorChord
shared_preload_libraries = 'vchord'
EOF

sudo systemctl restart postgresql-17

Enable pgvector and VectorChord

sudo -u postgres psql -d rbv <<'EOF'
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS vchord;
EOF

sudo -u postgres psql -d immich <<'EOF'
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS vchord;
EOF
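
A quick way to confirm that both extensions are present in a database is to read `pg_extension`. This is a minimal sketch; the `PSQL` variable and the function name are assumptions of the example.

```shell
# PSQL is an assumption of this sketch: the command used to reach the local
# instance as a superuser.
PSQL=${PSQL:-sudo -u postgres psql}

# vector_extensions DBNAME
# Print "name version" for the vector and vchord extensions in DBNAME.
vector_extensions() {
    $PSQL -AtX -F ' ' -d "$1" -c "SELECT extname, extversion FROM pg_extension WHERE extname IN ('vector', 'vchord') ORDER BY extname;"
}
```

Run `vector_extensions rbv` and `vector_extensions immich`; each should list two rows.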

Open firewall port

sudo firewall-cmd --zone=$(firewall-cmd --get-default-zone) --add-service postgresql --permanent
sudo firewall-cmd --reload
sudo firewall-cmd --list-services

Migrate data from gramathea

Run on gramathea. The dump uses password auth against the existing containers; the restore connects to frankie using the host certificate.

# Dump from the running quadlet containers (password auth, local)
pg_dump -h localhost -p 4432 -U rbv rbv > rbv.sql
pg_dump -h localhost -p 5432 -U postgres immich > immich.sql   # adjust port/user for immich container

# Restore on frankie using cert auth
psql "host=frankie.hanzalova.internal \
      user=rbv dbname=rbv \
      sslmode=verify-full \
      sslcert=/etc/pki/tls/misc/gramathea.kosherinata.internal.pem \
      sslkey=/etc/pki/tls/private/gramathea.kosherinata.internal.pem \
      sslrootcert=/etc/pki/ca-trust/source/anchors/root-internal.pem" \
    < rbv.sql

psql "host=frankie.hanzalova.internal \
      user=immich dbname=immich \
      sslmode=verify-full \
      sslcert=/etc/pki/tls/misc/gramathea.kosherinata.internal.pem \
      sslkey=/etc/pki/tls/private/gramathea.kosherinata.internal.pem \
      sslrootcert=/etc/pki/ca-trust/source/anchors/root-internal.pem" \
    < immich.sql
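
Before restoring, it is worth confirming that a dump was not cut short. Plain-format pg_dump output ends with a `-- PostgreSQL database dump complete` comment, so its absence suggests a truncated file. The function name below is hypothetical, and the check is a sketch that says nothing about whether the data itself is correct.

```shell
# dump_is_complete FILE
# Hypothetical helper: succeeds if FILE ends with the pg_dump completion marker.
dump_is_complete() {
    tail -n 10 "$1" | grep -q -- '-- PostgreSQL database dump complete'
}
```

For example, `dump_is_complete rbv.sql && echo ok` before running the restore for rbv, and likewise for immich.sql.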

Update application connection strings

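rbv services use the connection string format accepted by libpq/sqlx. SSL parameters can be passed inline or via environment variables; inline is shown here for clarity. The URL is wrapped for readability only: in a unit file it must be written on a single line, because systemd replaces a trailing backslash with a space when joining lines. Update /etc/systemd/system/rbv-*.service: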

postgres://rbv@frankie.hanzalova.internal/rbv\
?sslmode=verify-full\
&sslcert=/etc/pki/tls/misc/gramathea.kosherinata.internal.pem\
&sslkey=/etc/pki/tls/private/gramathea.kosherinata.internal.pem\
&sslrootcert=/etc/pki/ca-trust/source/anchors/root-internal.pem
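
As a sketch, the same URL can be assembled on one line, which is the form a systemd unit needs since systemd joins backslash-continued lines with a space. The function name `db_url` is hypothetical; the server name and paths are the ones used in this document.

```shell
# db_url USER DBNAME [CLIENT_FQDN]
# Hypothetical helper: prints the connection URL on a single line.
# CLIENT_FQDN selects the client cert and key; defaults to this host's FQDN.
db_url() {
    user=$1
    db=$2
    client=${3:-$(hostname -f)}
    printf 'postgres://%s@frankie.hanzalova.internal/%s' "$user" "$db"
    printf '?sslmode=verify-full'
    printf '&sslcert=/etc/pki/tls/misc/%s.pem' "$client"
    printf '&sslkey=/etc/pki/tls/private/%s.pem' "$client"
    printf '&sslrootcert=/etc/pki/ca-trust/source/anchors/root-internal.pem\n'
}
```

On gramathea, `psql "$(db_url rbv rbv)" -c 'SELECT 1'` is a quick way to check the URL before putting it into the unit files.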

The sed -i password substitution in script/deploy.sh can be removed once the services are updated to cert-based connection strings.

For immich, update DB_HOSTNAME and DB_USERNAME, and set:

DB_SSL_MODE=verify-full
DB_SSL_CERT=/etc/pki/tls/misc/gramathea.kosherinata.internal.pem
DB_SSL_KEY=/etc/pki/tls/private/gramathea.kosherinata.internal.pem
DB_SSL_ROOT_CERT=/etc/pki/ca-trust/source/anchors/root-internal.pem

Once both applications are confirmed working against frankie, stop and disable the postgres quadlets on gramathea:

sudo systemctl disable --now rbv-postgres.service
# and the immich postgres equivalent

Phase 2 — Patroni HA (when second node is ready)

To be documented once node 2 hardware is provisioned.

Key steps will be:

  1. Install etcd on all three nodes
  2. Install Patroni on all three nodes
  3. Bootstrap Patroni on node 1 (adopts existing data directory — no re-initdb)
  4. Stream base backup to node 2, add as sync standby (cert auth for replication)
  5. Add node 3 as async standby