3 Commits

Author SHA1 Message Date
818a535903 feat(worker): capture commits on non-default branches and forks
The ingestion paths each had a gap that let non-default-branch work
slip through: /search/commits silently excludes forks, the per-repo
REST commit scan only walked the default branch, and the user events
feed ages out after 90 days. Catch them by enumerating branches per
repo and scanning each (with per-branch state cursors so a brand-new
branch isn't cut off by the default branch's cursor), pre-filtering
branches via a GraphQL HEAD-author check so big upstream forks like
azure-docs don't trigger hundreds of wasted REST calls, treating
GitHub's HTTP 500 on author-filtered empty branches as "no commits"
rather than a server error, and adding fork:true to the search query.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 16:04:58 +03:00
45ceec2ec7 feat(worker): add github events poller
Adds the first ingestion source. Page-1 polling is ETag-conditional
(304s don't count against rate limit); the very first run paginates
back through Link "next" pages up to a 10-page safety cap so the
table starts populated rather than waiting for new activity.

Hits /users/{user}/events/public — works without auth, returns the
right scope for a public timeline. Token (GITHUB_TOKEN) is optional;
when present it raises the rate limit from 60 to 5000/hr.

New plumbing:

  moments-core::sources
    - EventSource trait (poll() -> count)
    - PollerStateStore trait (etag persistence port)
    - run_poller driver: tokio interval + jittered exponential backoff

  moments-data::github
    - GithubSource impl, raw payload preserved as JSONB
    - parse_link_next for pagination
    - 4 unit tests covering parser + Link parsing

  migration 0002_poller_state.sql
    - one row per source: source, etag, last_modified, last_fetched

Worker binary spawns one tokio task per source (just github for now)
and aborts on SIGINT. Verified by smoke-curling the upstream endpoint:
ETag and Link headers are present; payload shape matches the parser.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:59:15 +03:00
6775309043 chore: scaffold moments workspace
Cargo workspace with five crates per architecture conventions:

- moments-entities: Source enum, Event, EventQuery, SourceSummary
- moments-core:     EventReader / EventWriter ports
- moments-data:     PgStore (sqlx postgres adapter) + 0001_init.sql
- moments-api:      axum binary; /v1/{healthz,events,sources}
- moments-worker:   skeleton; pollers land in step 2

Sources committed-to for ingestion: github, gitea, hg, bugzilla.
Workstation events explicitly retired (not deferred).

Build + clippy clean. sqlx queries use the runtime API for now;
will switch to compile-time-checked macros + .sqlx offline cache
once magrathea has the moments_{ro,rw} roles and database created.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:47:06 +03:00