Commit Graph

5 Commits

Author SHA1 Message Date
ee93429317 feat: language stream graph on dashboard
Full-stack feature showing programming languages by commit activity
as a stream graph on the dashboard.

Backend:
- migration: repo_languages table (source, repo, language, bytes, color)
- worker: fetch language breakdowns via GitHub GraphQL (batched,
  20 repos/request) and Gitea REST API during poll cycles
- API: GET /v1/languages/daily (daily commit counts per language),
  GET /v1/languages/repos (all stored repo language data)
- fix timezone bug in daily_counts and language_daily_counts: the
  PostgreSQL server timezone (Europe/Sofia, UTC+3) shifted day
  boundaries, miscounting events near midnight. Now uses explicit
  UTC boundaries in generate_series JOINs.
- use per-source CASE for repo name extraction in language query
  to match gitea payload structure (repo.full_name vs repo.name)
- Gitea languages use GitHub colors via COALESCE fallback
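
The day-boundary shift behind the timezone fix can be reproduced outside
SQL. This is a minimal sketch, assuming a fixed UTC+3 offset as stated
above; the helper name day_bucket and the sample timestamp are
illustrative and not taken from the codebase.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for the server timezone described above (UTC+3).
SERVER_TZ = timezone(timedelta(hours=3))


def day_bucket(ts: datetime, tz: timezone) -> str:
    """Return the calendar day (ISO date) that ts falls on when viewed in tz."""
    return ts.astimezone(tz).date().isoformat()


# A sample event late in the evening, UTC.
event = datetime(2026, 5, 5, 22, 30, tzinfo=timezone.utc)

utc_day = day_bucket(event, timezone.utc)  # explicit UTC day boundaries
local_day = day_bucket(event, SERVER_TZ)   # server-local day boundaries
```

With these inputs the same event is counted on 2026-05-05 under UTC
boundaries but on 2026-05-06 under the server-local ones, which is the
kind of miscount near midnight that the fix addresses.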

Frontend:
- LanguageStreamGraph component: pure SVG stream graph, weekly
  buckets, centered baseline, top 8 languages + Other, GitHub
  canonical language colors, legend with color dots
- DashPage/ProjectPage: fetch repo languages once via new endpoint
  instead of per-repo forge proxy calls (eliminates 200+ GitHub
  API calls and 403 rate limit errors)
- removed fetchLanguages forge proxy wrapper (dead code)
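
The stream graph layout described above (top 8 languages + Other,
centered baseline) can be sketched as follows. This is an illustrative
Python version, not the component's actual code: the function names are
hypothetical, the input is assumed to be already bucketed by week, and
the baseline is assumed to sit at minus half the weekly total.

```python
from collections import defaultdict


def top_n_with_other(weekly, n=8):
    """Collapse all but the n most active languages into an 'Other' series.

    weekly: list of dicts mapping language -> commit count, one per week.
    """
    totals = defaultdict(int)
    for week in weekly:
        for lang, count in week.items():
            totals[lang] += count
    # Rank by total activity, breaking ties alphabetically for stability.
    keep = sorted(totals, key=lambda lang: (-totals[lang], lang))[:n]
    collapsed = []
    for week in weekly:
        row = {lang: week.get(lang, 0) for lang in keep}
        if len(totals) > n:
            row["Other"] = sum(c for lang, c in week.items() if lang not in keep)
        collapsed.append(row)
    return collapsed


def centered_layers(week):
    """Stack one week's counts around y = 0 (baseline at -total / 2).

    Returns language -> (y0, y1) band edges, in the dict's insertion order.
    """
    y = -sum(week.values()) / 2
    bands = {}
    for lang, count in week.items():
        bands[lang] = (y, y + count)
        y += count
    return bands
```

Each (y0, y1) pair gives the lower and upper edge of one language's band
for that week; joining the edges across weeks yields the SVG paths.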

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 06:27:59 +03:00
c66aaeb268 feat: discover contributed repos via GitHub GraphQL API
The REST /user/repos endpoint only returns repos where the user is
owner, collaborator, or org member. Repos contributed to via PRs
(e.g. polkadot-js/api, zed-industries/zed) were never discovered
and their commits were missing from moments.

Now supplements /user/repos with a GraphQL
repositoriesContributedTo query, which returns all repos the user
has committed to, opened issues/PRs on, or reviewed — with cursor-
based pagination and no result cap.
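
The cursor-based pagination can be sketched as below. This is a hedged
Python illustration, not the worker's code: the selected fields, page
size, and contribution types are assumptions, and run_query is a
hypothetical transport that takes (query, variables) and returns the
decoded "data" object of the GraphQL response.

```python
from typing import Callable, Iterator, Optional

# Query shape follows GitHub's GraphQL schema; the selected fields, page
# size, and contribution types are illustrative choices.
QUERY = """
query($cursor: String) {
  viewer {
    repositoriesContributedTo(
      first: 100
      after: $cursor
      contributionTypes: [COMMIT, ISSUE, PULL_REQUEST, PULL_REQUEST_REVIEW]
    ) {
      nodes { nameWithOwner }
      pageInfo { hasNextPage endCursor }
    }
  }
}
"""


def contributed_repos(run_query: Callable[[str, dict], dict]) -> Iterator[str]:
    """Yield owner/name for every page, following endCursor until exhausted."""
    cursor: Optional[str] = None
    while True:
        data = run_query(QUERY, {"cursor": cursor})
        conn = data["viewer"]["repositoriesContributedTo"]
        for node in conn["nodes"]:
            yield node["nameWithOwner"]
        if not conn["pageInfo"]["hasNextPage"]:
            return
        cursor = conn["pageInfo"]["endCursor"]
```

Passing the transport in as a function keeps the loop testable without
network access; the results would then be merged with the /user/repos
list before scanning.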

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 05:38:57 +03:00
f77a8ab48f fix: use since cursor in github-repo polls to prevent missed commits
After initial backfill, scan_repo was fetching only page 1 (100 most
recent commits) per repo. If more than 100 commits landed between
7-day polls, older ones in that window were permanently missed.

Now stores the newest commit date in poller_state.last_modified and
passes it as &since= on subsequent polls, with full pagination, so
only genuinely new commits are fetched but none are skipped.

On first poll after deploy, last_modified is NULL so no since filter
is applied — triggering a full re-backfill that catches any
previously missed commits.
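
The polling behaviour described above can be sketched as follows. This
is a minimal Python illustration under stated assumptions: fetch_page is
a hypothetical wrapper around the /commits endpoint, the cursor is taken
from the committer date, and a short page is treated as the last one.
None of these details are confirmed by the commit itself.

```python
from typing import Callable, List, Optional, Tuple

PER_PAGE = 100


def poll_repo(
    fetch_page: Callable[[int, Optional[str]], List[dict]],
    last_modified: Optional[str],
) -> Tuple[List[dict], Optional[str]]:
    """Fetch every commit newer than last_modified, walking all pages.

    fetch_page(page, since) stands in for a request to
    /repos/{owner}/{repo}/commits with per_page=100, the given page, and
    an optional since filter. Returns the commits and the cursor to store.
    """
    commits: List[dict] = []
    page = 1
    while True:
        batch = fetch_page(page, last_modified)  # since=None: full backfill
        commits.extend(batch)
        if len(batch) < PER_PAGE:  # a short page means nothing is left
            break
        page += 1
    # ISO 8601 UTC timestamps in one format compare correctly as strings.
    dates = [c["commit"]["committer"]["date"] for c in commits]
    newest = max(dates, default=last_modified)
    return commits, newest
```

When a poll returns nothing, the stored cursor is left unchanged. If the
commit at the cursor timestamp is returned again on the next poll, the
shared event ID scheme would deduplicate it.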

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 05:03:41 +03:00
45fd45f5da fix: stamp _repo into github-repo commit payloads for project attribution
The /repos/{owner}/{repo}/commits endpoint doesn't include repo info
in its response. Without _repo in the payload, these commits were
invisible to the projects query. Add _repo to parse_commit and include
it in the COALESCE chain for github source repo extraction.
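
The stamping and the fallback lookup can be sketched as below. This is a
hypothetical Python analogue: the parse_commit signature, the
owner/name format of _repo, and the other field names in the fallback
chain are assumptions for illustration, not the project's actual schema.

```python
from typing import Optional


def parse_commit(raw: dict, owner: str, repo: str) -> dict:
    """Copy the API commit object and stamp the repo it was fetched from.

    The response carries no repository field, so the caller, which knows
    which /repos/{owner}/{repo}/commits URL it requested, supplies it.
    """
    payload = dict(raw)
    payload["_repo"] = f"{owner}/{repo}"
    return payload


def repo_of(payload: dict) -> Optional[str]:
    """Python analogue of a COALESCE chain: first non-null candidate wins."""
    candidates = (
        (payload.get("repo") or {}).get("name"),             # assumed field
        (payload.get("repository") or {}).get("full_name"),  # assumed field
        payload.get("_repo"),                                # stamped above
    )
    return next((c for c in candidates if c is not None), None)
```

Putting _repo last in the chain means payloads that already carry repo
info keep resolving as before, while per-repo commits gain attribution.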

After deploy, reset github-repo poller state to re-ingest with _repo:
  DELETE FROM poller_state WHERE source LIKE 'github-repo%';

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 17:59:31 +03:00
a71b4e6b84 feat(github): per-repo commit enumeration for full history backfill
Adds a new github-repo EventSource that enumerates all repos via
/user/repos and walks each repo's /commits?author= endpoint, which,
unlike the Search API, has no 1000-result cap. Events use the same
github-commit:{sha} ID scheme as github_search for dedup. Per-repo
poller state enables a full backfill on the first run and page-1-only
fetches on subsequent polls. The poll interval is weekly by default.
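
The dedup effect of the shared ID scheme can be sketched as follows.
Only the github-commit:{sha} format comes from the message above; the
function names and the in-memory store are hypothetical stand-ins for
the real event storage.

```python
from typing import Dict, Iterable


def event_id(sha: str) -> str:
    """Build the shared event ID in the github-commit:{sha} format."""
    return f"github-commit:{sha}"


def merge_events(store: Dict[str, str], source: str, shas: Iterable[str]) -> int:
    """Insert events keyed by ID, skipping ones that are already present.

    store is an in-memory stand-in for the events table, mapping event ID
    to the source that first recorded it. Returns the number inserted.
    """
    inserted = 0
    for sha in shas:
        key = event_id(sha)
        if key not in store:
            store[key] = source
            inserted += 1
    return inserted
```

Because both sources derive the ID from the commit SHA alone, a commit
seen first by github_search is skipped when github-repo finds it again.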

Closes #1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 14:59:26 +03:00