hermes: single-container deploy (gateway + dashboard), as deployed on bob

The image's command selects mode; no command = interactive CLI which crash-loops under systemd. Switched to the supported headless setup: one container running `gateway run` with the dashboard supervised alongside via HERMES_DASHBOARD=1 (same netns so the dashboard can reach the gateway, which two bridge-networked containers could not). Image fails closed on a 0.0.0.0 dashboard bind, so HERMES_DASHBOARD_INSECURE=1 opts into the chosen trusted-LAN exposure on :5100.

Verified live on bob: gateway stable, dash HTTP 200 across the LAN, inference endpoint reachable, enrolled in podman-auto-update.timer. Dropped the redundant separate dashboard quadlet.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011D3YeWKpjg5bT488fVanCH
2026-06-23 12:53:51 +03:00
parent 745a676702
commit 1142929874
2 changed files with 58 additions and 51 deletions
--- a/images/hermes/readme.md
+++ b/images/hermes/readme.md
@@ -24,47 +24,51 @@ rebuild via the workflow's `force` dispatch input, or locally:
 HERMES_REF=v0.2.0 ./build.sh
 ```

-## One image, two roles
+## How it runs (single container, gateway + dashboard)

-Upstream's compose runs a `gateway` (the agent) and a `dashboard` (web UI on
-`127.0.0.1:9119`) from the **same image**. Persistent state — `config.yaml`,
-`.env`, sessions, memory, skills — all lives under `/opt/data` (the single
-volume). Provider keys and the model backend go in those mounted files, never in
-the image.
+The image is an s6-overlay supervisor. The **command selects the mode** — the
+default (no command) is the *interactive CLI*, which exits without a TTY under
+systemd and crash-loops. The supported headless setup (per the image's own
+startup guidance) is **one container running `gateway run` with the dashboard
+supervised alongside** via `HERMES_DASHBOARD`:
+
+- `Exec=gateway run` → the agent daemon (`hermes gateway run --replace`), s6-supervised.
+- `HERMES_DASHBOARD=1` → the dashboard web UI (binds `0.0.0.0:9119`) in the **same**
+  container — which is what lets it reach the gateway (two bridge-networked
+  containers could not, unlike upstream's host-networked compose).
+- The image **fails closed** on a non-loopback dashboard bind: it refuses
+  `0.0.0.0` unless OAuth is configured *or* `HERMES_DASHBOARD_INSECURE=1` is set.
+  We expose on the trusted LAN without auth, so we opt in. ⚠ Anyone on the LAN
+  can reach the API-key-storing UI — switch to `HERMES_DASHBOARD_OAUTH_CLIENT_ID`
+  for real auth if that's not acceptable.
+
+Persistent state — `config.yaml`, `.env`, sessions, memory, skills — all lives
+under `/opt/data` (the single `:Z` volume). Keys/backend go in those mounted
+files, never in the image or quadlet.

 ## Deploying on bob

 See [`hermes.container`](hermes.container) — a rootful quadlet matching the
-existing `agent-zero` / `open-webui` services on bob. Summary:
+existing `agent-zero` / `open-webui` services. As deployed:

-1. `git.lair.cafe/lair/hermes:latest` must be published first (run the `images`
-   workflow).
+1. Publish `git.lair.cafe/lair/hermes:latest` (the `images` workflow).
 2. `sudo install -d -o 10000 -g 10000 /var/lib/hermes`
 3. Drop `config.yaml` into `/var/lib/hermes` (owned `10000:10000`) — **LLM backend
-   → local sovereign inference.** Hermes exposes a `custom` provider for any
-   OpenAI-compatible endpoint, so point it at the same endpoint open-webui uses:
+   → local sovereign inference** via Hermes' `custom` OpenAI-compatible provider:

   ```yaml
   # /var/lib/hermes/config.yaml
   model:
-     provider: "custom"                              # OpenAI-compatible endpoint
-     base_url: "http://hanzalova.internal:31313/v1"
-     api_key: "beast"                                # matches open-webui's OPENAI_API_KEY
-     default: "<model-id-your-endpoint-serves>"      # see: curl http://hanzalova.internal:31313/v1/models
-     # context_length: 32768                         # optional
-     # max_tokens:     4096                           # optional, output ceiling
+     provider: "custom"
+     base_url: "http://hanzalova.internal:31313/v1"   # same endpoint open-webui uses
+     api_key: "<your-inference-key>"
+     default: "<model-id>"                            # see: curl …/v1/models
+     # context_length / max_tokens optional
   ```

-   Any other secrets (web-search/tool keys, messaging tokens) go in
-   `/var/lib/hermes/.env`, never in the quadlet.
+   Other secrets (web-search/tool keys, messaging tokens) go in
+   `/var/lib/hermes/.env`.
 4. Install `hermes.container` to `/etc/containers/systemd/`, `daemon-reload`,
-   `start hermes.service`.
-
-### Dashboard LAN exposure (resolved)
-
-The image binds the dashboard on **`0.0.0.0:9119` by default**
-(`HERMES_DASHBOARD_HOST` / `HERMES_DASHBOARD_PORT`), so bridge networking +
-`PublishPort=5100:9119` in the quadlet exposes it on the LAN at `:5100` with no
-override. ⚠ The dashboard **stores provider API keys and has no auth** — keep it
-on a trusted LAN only, and front it with an authenticating reverse proxy for any
-wider exposure.
+   `start hermes.service`. Dashboard then serves on the LAN at
+   `http://bob.hanzalova.internal:5100/`; `AutoUpdate=registry` enrolls it in the
+   host's `podman-auto-update.timer`.
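
The checks listed in the commit message (gateway stable, dashboard HTTP 200 across the LAN, inference endpoint reachable) could be repeated after a redeploy with a small smoke check along these lines. This is a sketch and not part of the commit: the function names `http_status`, `is_ok`, and `smoke_check` are hypothetical, the URLs are the ones quoted in the README, and it assumes `/v1/models` answers 200 without an auth header, which may not hold for every endpoint.

```shell
#!/usr/bin/env bash
# Post-deploy smoke check for hermes.service (illustrative sketch).
set -u

# URLs as quoted in the README; override via the environment if they differ.
DASH_URL="${DASH_URL:-http://bob.hanzalova.internal:5100/}"
MODELS_URL="${MODELS_URL:-http://hanzalova.internal:31313/v1/models}"

# Print the HTTP status code for a URL (curl reports 000 when the request fails).
http_status() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' "$1" 2>/dev/null || true
}

# Succeed only when the given status code is exactly 200.
is_ok() {
  [ "$1" = "200" ]
}

# Check the unit, the dashboard, and the inference endpoint in turn.
smoke_check() {
  systemctl is-active --quiet hermes.service \
    || { echo "hermes.service is not active"; return 1; }
  is_ok "$(http_status "$DASH_URL")" \
    || { echo "dashboard did not return 200"; return 1; }
  is_ok "$(http_status "$MODELS_URL")" \
    || { echo "inference endpoint did not return 200"; return 1; }
  echo "ok"
}

# Only run the checks when invoked with "run", so sourcing has no side effects.
if [ "${1:-}" = "run" ]; then
  smoke_check
fi
```

Saved under a hypothetical name such as `smoke.sh`, it would be run on the host as `bash smoke.sh run`. Auto-update enrollment can be inspected separately with `podman auto-update --dry-run`, which lists the units it would consider without pulling or restarting anything.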