Commit Graph

7 Commits

Author SHA1 Message Date
e4052c4c9a feat(worker): add github search api source for historical backfill
The Events API is hard-capped at 90 days (15 events for grenade
right now). The Search API has its own 1000-result-per-query cap
but reaches the start of the user's GitHub history — for grenade,
430 issues/PRs going back to 2012-08-08.

  GET /search/issues?q=author:<user>&sort=created&order=desc

Polled hourly by default but defaults to 24h interval since this is
backfill, not a live feed. After the first run most upserts are
no-ops. Stored as Source::Github with action "Issue" or "PullRequest"
(distinguished by the .pull_request field on the search item),
keyed `github-issue:<owner>/<repo>#<n>`.

/search/commits is deliberately not used: GitHub matches the same
commit across every fork that contains it, so 275k of grenade's
"commits" are mostly duplicated fork hits in repos he never authored
to. If commit history becomes valuable we should enumerate his repos
and walk per-repo /commits?author= instead.

Visibility: search/issues items don't carry .private, so we lookup
/repos/{full_name} once per unique repo encountered (cached for the
duration of the poll). Failure to resolve is treated as private —
better to under-expose than over-expose on the public timeline.

Reshape: presentation/github.rs gains an Issue/PullRequest path that
extracts from the search item shape (html_url, number, title, state,
.pull_request.merged_at) rather than the events-API wrapper. Merged
PRs use the GitMerge icon, mirroring the events-API path.

Worker now spawns two tokio tasks (events + search), aborts both
on SIGINT. New env: SEARCH_POLL_INTERVAL_SECS (default 86400).

Tests: +2 in moments-data (URL parsing), +2 in moments-core
(search Issue + merged-PR reshape) — 14 total green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:49:06 +03:00
3c0253519f feat: ingest private events; surface public-only
The DB now stores everything GitHub will give us, the API only ever
returns public events (for now).

Endpoint switch in the github poller: when GITHUB_TOKEN is set we
hit /users/{u}/events (public + private), otherwise fall back to
/users/{u}/events/public. Either way each event's top-level `public`
boolean is captured into a new column.

Schema:
  migration 0003_event_public.sql adds events.public BOOLEAN NOT NULL
  DEFAULT true, plus an index on (public, occurred_at DESC).

Wire:
  Event gains a `public: bool` field.
  EventQuery gains `include_private: bool` (default false).
  list_events and source_summaries gate on it.
  moments-api pins include_private = false at every call site —
  threading it as a query param is a future-auth concern, not now.

The default-true on the column keeps existing rows correct: the 11
events already in the DB came from /events/public and are genuinely
public.

After this change, clear poller_state so the next worker run does a
fresh backfill via /events:

  DELETE FROM poller_state WHERE source = 'github';

Tests: +2 in github poller (private flag captured, default-public
on missing field) — 10 total green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:33:40 +03:00
003f427e98 feat(api): reshape raw events into TimelineItem
GET /v1/events now returns the presentation form rather than raw
upstream payloads. The frontend stays dumb: it renders title /
subtitle / body segments and picks an icon from a small kebab-case
enum. Title and subtitle are arrays of {text} | {text, url} segments
so the UI can interleave plain copy with anchors without parsing.

New entities (in moments-entities):

  TimelineItem        — id, source, action, occurred_at, icon, title, subtitle, body
  TitleSegment        — Text | Link
  TimelineBody        — Markdown | Commits | Links
  CommitSummary       — sha, short_sha, message, url, author
  TimelineIcon        — kebab-case enum; UI falls back to Generic on unknowns

Reshape lives in moments-core::presentation, dispatched by source.
github.rs covers the event types observed on grenade's feed:
PushEvent, PullRequest{,Review,ReviewComment}Event, Issues{,Comment}Event,
Create/Delete/Fork/Watch/Release/CommitComment/PublicEvent.
Anything else falls back to a generic "<action> on <repo>" line.
Other sources (gitea, hg, bugzilla) currently use a stub fallback;
they get their own reshape modules in steps 5 and 6.

4 unit tests cover the load-bearing cases (push commit list, merged
PR icon swap, issue-comment markdown body, unknown-event fallback).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:08:18 +03:00
418834c960 docs(asset/sql): document mtls and ssh-sudo run modes
The previous bootstrap docs implied a `-U postgres` connection that
won't work over the network — postgres peer auth is local-socket
only. Document the two paths that actually work on this infra:

  (a) mTLS as the network superuser `grenade` using the host cert
      via PGSSL* env vars (cert paths from /etc/pki/tls per §11).
  (b) ssh to the db host and sudo to the local postgres peer.

No script changes — only comments in bootstrap.sql and
bootstrap-moments.sql.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:07:57 +03:00
45ceec2ec7 feat(worker): add github events poller
Adds the first ingestion source. Page-1 polling is ETag-conditional
(304s don't count against rate limit); the very first run paginates
back through Link "next" pages up to a 10-page safety cap so the
table starts populated rather than waiting for new activity.

Hits /users/{user}/events/public — works without auth, returns the
right scope for a public timeline. Token (GITHUB_TOKEN) is optional;
when present it raises the rate limit from 60 to 5000/hr.

New plumbing:

  moments-core::sources
    - EventSource trait (poll() -> count)
    - PollerStateStore trait (etag persistence port)
    - run_poller driver: tokio interval + jittered exponential backoff

  moments-data::github
    - GithubSource impl, raw payload preserved as JSONB
    - parse_link_next for pagination
    - 4 unit tests covering parser + Link parsing

  migration 0002_poller_state.sql
    - one row per source: source, etag, last_modified, last_fetched

Worker binary spawns one tokio task per source (just github for now)
and aborts on SIGINT. Verified by smoke-curling the upstream endpoint:
ETag and Link headers are present; payload shape matches the parser.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:59:15 +03:00
e40d6b0e44 chore(asset): add postgres bootstrap and pg_ident template
Idempotent SQL for role and database creation, split between the
postgres-database scope (bootstrap.sql) and the moments-database
scope (bootstrap-moments.sql), since CREATE DATABASE can't run
inside a DO block or transaction.

Roles:
  moments_rw — owner of the moments database; runs migrations
               and writes events from moments-worker.
  moments_ro — read-only; consumed by moments-api.

The pg_ident template is rendered per-host by deploy.sh once it
lands; one (host, role) mapping per file. Reload required on both
magrathea and frankie after install — pg_ident is not replicated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:52:35 +03:00
6775309043 chore: scaffold moments workspace
Cargo workspace with five crates per architecture conventions:

- moments-entities: Source enum, Event, EventQuery, SourceSummary
- moments-core:     EventReader / EventWriter ports
- moments-data:     PgStore (sqlx postgres adapter) + 0001_init.sql
- moments-api:      axum binary; /v1/{healthz,events,sources}
- moments-worker:   skeleton; pollers land in step 2

Sources committed-to for ingestion: github, gitea, hg, bugzilla.
Workstation events explicitly retired (not deferred).

Build + clippy clean. sqlx queries use the runtime API for now;
will switch to compile-time-checked macros + .sqlx offline cache
once magrathea has the moments_{ro,rw} roles and database created.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:47:06 +03:00