feat(worker): add github events poller

Adds the first ingestion source. Page-1 polling is ETag-conditional
(304s don't count against rate limit); the very first run paginates
back through Link "next" pages up to a 10-page safety cap so the
table starts populated rather than waiting for new activity.
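
A minimal sketch of the first-run backfill described above. The Page
struct, the backfill function and the fetch closure are hypothetical
names standing in for the real HTTP call; none of them is taken from
this commit. The loop follows "next" until it runs out or the cap is
reached:

```rust
/// Hypothetical page shape: event ids plus the URL taken from Link rel="next".
#[derive(Debug, Clone)]
pub struct Page {
    pub event_ids: Vec<u64>,
    pub next: Option<String>,
}

/// The 10-page safety cap named in the commit message.
pub const MAX_BACKFILL_PAGES: usize = 10;

/// Walks "next" links starting at `first_url`, stopping when there is no
/// next page or after `max_pages` fetches, whichever comes first.
/// `fetch` is an assumed stand-in for the real HTTP request.
pub fn backfill<F>(first_url: &str, max_pages: usize, mut fetch: F) -> Vec<u64>
where
    F: FnMut(&str) -> Page,
{
    let mut collected = Vec::new();
    let mut url = Some(first_url.to_string());
    let mut fetched = 0;
    while let Some(current) = url.take() {
        if fetched >= max_pages {
            break;
        }
        let page = fetch(&current);
        fetched += 1;
        collected.extend(page.event_ids);
        url = page.next;
    }
    collected
}
```

Passing the fetcher as a closure keeps the cap logic testable without
network access.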

Hits /users/{user}/events/public, which works without auth and returns
the right scope for a public timeline. Token (GITHUB_TOKEN) is optional;
when present it raises the rate limit from 60 to 5000 requests/hr.

New plumbing:

  moments-core::sources
    - EventSource trait (poll() -> count)
    - PollerStateStore trait (etag persistence port)
    - run_poller driver: tokio interval + jittered exponential backoff
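
One way the jittered exponential backoff could be computed, as a sketch
only. The function name backoff_delay, its parameters and the choice of
"full jitter" (scaling the capped delay by a random factor in [0, 1])
are assumptions; the commit does not say which jitter scheme run_poller
uses:

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based). The base delay doubles
/// with each attempt, is capped at `max`, and is then scaled by `jitter`,
/// a value in [0.0, 1.0] that the caller draws from its own RNG.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32, jitter: f64) -> Duration {
    // 2^attempt, falling back to u32::MAX once the shift would overflow.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    // Exponential ceiling, never above `max`.
    let ceiling = base.checked_mul(factor).unwrap_or(max).min(max);
    // Treat NaN as "no jitter" so mul_f64 cannot panic on it.
    let scale = if jitter.is_nan() { 1.0 } else { jitter.clamp(0.0, 1.0) };
    ceiling.mul_f64(scale)
}
```

Taking the jitter as an argument keeps the function deterministic, so
the doubling and the cap can be checked without a random source.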

  moments-data::github
    - GithubSource impl, raw payload preserved as JSONB
    - parse_link_next for pagination
    - 4 unit tests covering parser + Link parsing
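
For illustration, a Link "next" parser could look like the sketch below.
Only the name parse_link_next comes from this commit; the signature and
body are assumptions, and the comma split is a simplification that would
misparse a target URL containing a comma:

```rust
/// Returns the URL tagged rel="next" in a `Link` header value,
/// or None when the header advertises no next page.
pub fn parse_link_next(header: &str) -> Option<String> {
    // Each comma-separated entry looks like: <url>; rel="next"
    for part in header.split(',') {
        let mut pieces = part.split(';');
        let target = pieces.next()?.trim();
        // The remaining pieces are parameters; look for rel=next.
        let is_next = pieces.any(|p| {
            let p = p.trim();
            p == r#"rel="next""# || p == "rel=next"
        });
        if is_next {
            // Strip the angle brackets around the target URL.
            let url = target.strip_prefix('<')?.strip_suffix('>')?;
            return Some(url.to_string());
        }
    }
    None
}
```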

  migration 0002_poller_state.sql
    - one row per source: source, etag, last_modified, last_fetched

Worker binary spawns one tokio task per source (just github for now)
and aborts on SIGINT. Verified by smoke-curling the upstream endpoint:
ETag and Link headers are present; payload shape matches the parser.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:59:15 +03:00
parent e40d6b0e44
commit 45ceec2ec7
10 changed files with 489 additions and 9 deletions


@@ -1,6 +1,8 @@
pub mod github;
use async_trait::async_trait;
use chrono::{DateTime, Utc};
-use moments_core::{EventReader, EventWriter, StoreError};
+use moments_core::{EventReader, EventWriter, PollerState, PollerStateStore, StoreError};
use moments_entities::{Event, EventQuery, Source, SourceSummary};
use sqlx::Row;
use sqlx::postgres::{PgPool, PgPoolOptions};
@@ -105,6 +107,74 @@ impl EventReader for PgStore {
    }
}

#[async_trait]
impl PollerStateStore for PgStore {
    async fn load(&self, source: &str) -> Result<Option<PollerState>, StoreError> {
        let row = sqlx::query(
            r#"
            SELECT source, etag, last_modified, last_fetched
            FROM poller_state
            WHERE source = $1
            "#,
        )
        .bind(source)
        .fetch_optional(&self.pool)
        .await
        .map_err(map_err)?;

        Ok(match row {
            None => None,
            Some(r) => Some(PollerState {
                source: r.try_get("source").map_err(map_err)?,
                etag: r.try_get("etag").map_err(map_err)?,
                last_modified: r.try_get("last_modified").map_err(map_err)?,
                last_fetched: r.try_get("last_fetched").map_err(map_err)?,
            }),
        })
    }

    async fn save(
        &self,
        source: &str,
        etag: Option<&str>,
        last_modified: Option<DateTime<Utc>>,
    ) -> Result<(), StoreError> {
        sqlx::query(
            r#"
            INSERT INTO poller_state (source, etag, last_modified, last_fetched)
            VALUES ($1, $2, $3, now())
            ON CONFLICT (source) DO UPDATE
            SET etag = EXCLUDED.etag,
                last_modified = EXCLUDED.last_modified,
                last_fetched = EXCLUDED.last_fetched
            "#,
        )
        .bind(source)
        .bind(etag)
        .bind(last_modified)
        .execute(&self.pool)
        .await
        .map_err(map_err)?;
        Ok(())
    }

    async fn touch(&self, source: &str) -> Result<(), StoreError> {
        sqlx::query(
            r#"
            INSERT INTO poller_state (source, last_fetched)
            VALUES ($1, now())
            ON CONFLICT (source) DO UPDATE
            SET last_fetched = EXCLUDED.last_fetched
            "#,
        )
        .bind(source)
        .execute(&self.pool)
        .await
        .map_err(map_err)?;
        Ok(())
    }
}

#[async_trait]
impl EventWriter for PgStore {
    async fn upsert_events(&self, events: &[Event]) -> Result<usize, StoreError> {