feat(hg): revset-based author query, group discovery, one-shot ingest script

Rewrites the hg worker to query json-log?rev=author(), which matches the
changeset author (not the pusher) and so captures commits landed by
sheriffs. Repos are discovered within the configured groups, in addition
to individually listed repos. Once the first backfill succeeds, the
worker skips its run entirely.
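
For illustration only, a minimal sketch of how such a revset query URL
could be assembled. The helper names are hypothetical, and joining
several author terms with "or" in one request is an assumption; the
worker may equally issue one request per term.

```python
from urllib.parse import quote


def author_revset(terms):
    """Build a revset such as: author(a) or author(b)."""
    # Assumption: terms are plain words that need no revset quoting.
    return " or ".join(f"author({term})" for term in terms)


def json_log_url(host, repo, terms):
    """Return the hgweb json-log URL for the given repo and author terms."""
    # Percent-encode the whole revset, including parentheses and spaces.
    revset = quote(author_revset(terms), safe="")
    return f"https://{host}/{repo}/json-log?rev={revset}"


url = json_log_url("hg-edge.mozilla.org", "mozilla-central", ["rthijssen", "grenade"])
print(url)
```

The host, repo, and author terms above are the values from the env
template changed in this commit.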

Adds script/hg-ingest.sh for offline ingestion via local hg clones. It
clones one repo at a time, caches extracted changesets to .tsv, inserts
them via psql, and sets poller_state when done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 13:45:33 +03:00
parent 1bbe55dc84
commit 88fbbba60b
4 changed files with 284 additions and 112 deletions


@@ -14,8 +14,9 @@ GITEA_TOKEN={{GITEA_TOKEN}}
 GITEA_POLL_INTERVAL_SECS=600
 HG_HOST=hg-edge.mozilla.org
-HG_REPOS=build/puppet,build/tools,build/buildbot-configs
-HG_AUTHOR_TERMS=thijssen,grenade
+HG_GROUPS=build,integration
+HG_REPOS=mozilla-central
+HG_AUTHOR_TERMS=rthijssen,grenade
 HG_POLL_INTERVAL_SECS=86400
 BUGZILLA_HOST=bugzilla.mozilla.org