Commit Graph

5 Commits

Author SHA1 Message Date
a71b4e6b84 feat(github): per-repo commit enumeration for full history backfill
Adds a new github-repo EventSource that enumerates all repos via
/user/repos and walks each repo's /commits?author= endpoint, which
has no 1000-result cap unlike the Search API. Events use the same
github-commit:{sha} ID scheme as github_search for dedup. Per-repo
poller state enables full backfill on first run, page-1-only on
subsequent polls. Weekly poll interval by default.

Closes #1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 14:59:26 +03:00
88fbbba60b feat(hg): revset-based author query, group discovery, one-shot ingest script
Rewrites the hg worker to use json-log?rev=author() which matches the
changeset author (not the pusher), capturing commits landed by sheriffs.
Repos are discovered within configured groups plus individually listed
repos. The worker skips entirely after the first successful backfill.

Adds script/hg-ingest.sh for offline ingestion via local hg clones —
clones one repo at a time, caches extracted changesets to .tsv, inserts
via psql, and sets poller_state when done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 13:58:21 +03:00
8867ff5df3 feat(deploy): manifest-driven config, teardown + db-perms, hardening
deploy.sh:
- never rsync into /; stage to /tmp on the remote and install at final
  paths via sudo bash heredoc, closing the parent-dir attribute leak
  that broke three hosts in the earlier rsync incident
- shell-quote heredoc args via ${var@Q}
- drop -A -X on the remaining (web) rsyncs
- generic worker.secrets loop reads (env-var → pass path) from manifest;
  GITEA_TOKEN now flows through automatically
- in-memory bash substitution for templates (secrets never on argv)
- simplify semanage port labelling: --add 2>/dev/null || --modify (the
  old grep pre-check matched only the first listed port)
- restorecon back to short flags (Fedora policycoreutils has no long
  forms; --recursive errored at deploy time)
- quieter health probe loop: curl diagnostics only on final failure

manifest as source of truth:
- api.config.bind drives BIND_ADDR, firewalld port, semanage label,
  health-probe URL
- web.config.{server_name,root,api_upstream} drives nginx render,
  rsync targets, restorecon scope
- nginx config renamed to site.conf.tmpl; firewalld svc to
  moments-api.xml.tmpl; both rendered at deploy time
- topology flip: api → nikola, worker → frootmig (anjie freed)

new scripts:
- script/teardown.sh: idempotent component teardown, never rsyncs,
  shared-state cleanup gated on absence of remaining env files,
  --remove-docroot guard against shallow / system paths
- script/db-perms.sh: rewritten — fixes grep/append role mismatch that
  appended duplicates on re-run, adds postgres reload, hits primary +
  standby in a single invocation

readme: genericized; deployment topology no longer carries real host
or site names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 16:39:10 +03:00
abce3803ca chore(deploy): strip infra commentary from asset/ config files
These ship in a public repo; topology narration in nginx, systemd,
firewalld, and env templates is gratuitous. Keep the config terse —
directives speak for themselves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 20:23:11 +03:00
110b523fd0 chore(deploy): add manifest, systemd units, nginx config, deploy.sh
Wires up the prod deployment per architecture-doc conventions:

- api → nikola.kosherinata.internal, loopback bind 127.0.0.1:42424
  (less-common port, registered with SELinux as http_port_t).
- worker → frootmig.kosherinata.internal, no listening port.
- web (static ui/dist + nginx server_name rob.tn) → nikola, with
  /api/* reverse-proxied to the loopback API.
- db → existing magrathea cluster via mTLS, hostname-baked DATABASE_URL
  rendered into /etc/moments/{api,worker}.env at deploy time.

Cert rotation: step-ca renews host certs every 24h; .path units watch
/etc/pki/tls/misc/<host>.pem and trigger systemctl restart of the
relevant service. Both binaries hold cert state in rustls and read
once at startup, so restart is the right reload semantics.

deploy.sh contract matches the architecture doc: positional env arg,
component list (or `all` / `default`), --dry-run support. Renders
config templates from `pass`, rsyncs over ssh+sudo, runs sysusers /
restorecon / semanage / systemctl / nginx -t idempotently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 20:17:17 +03:00