Full-stack feature showing programming languages by commit activity
as a stream graph on the dashboard.
Backend:
- migration: repo_languages table (source, repo, language, bytes, color)
- worker: fetch language breakdowns via GitHub GraphQL (batched,
20 repos/request) and Gitea REST API during poll cycles
- API: GET /v1/languages/daily (daily commit counts per language),
GET /v1/languages/repos (all stored repo language data)
- fix timezone bug in daily_counts and language_daily_counts: the
PostgreSQL server timezone (Europe/Sofia, UTC+2 or UTC+3 with DST)
shifted day boundaries, miscounting events near midnight. Now uses
explicit UTC boundaries in generate_series JOINs.
- use per-source CASE for repo name extraction in language query
to match the Gitea payload structure (repo.full_name vs repo.name)
- Gitea languages use GitHub colors via COALESCE fallback
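To illustrate the timezone fix above: a minimal sketch of bucketing
events by UTC calendar day regardless of the server's local zone. The
helper names (utc_day, count_by_utc_day) are hypothetical, and the
actual fix lives in SQL, not Python.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def utc_day(ts: datetime) -> str:
    """Return the UTC calendar day (YYYY-MM-DD) of a timezone-aware timestamp."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).date().isoformat()


def count_by_utc_day(timestamps) -> Counter:
    """Count events per UTC day (hypothetical stand-in for daily_counts)."""
    return Counter(utc_day(ts) for ts in timestamps)


# 02:30 on 2 June in a fixed UTC+3 zone is 23:30 on 1 June in UTC,
# so the event belongs to the 1 June bucket.
utc_plus_3 = timezone(timedelta(hours=3))
event = datetime(2024, 6, 2, 2, 30, tzinfo=utc_plus_3)
counts = count_by_utc_day([event])
```

Bucketing by the local date instead would have placed this event on
2 June, which is the kind of shift the commit describes.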
Frontend:
- LanguageStreamGraph component: pure SVG stream graph, weekly
buckets, centered baseline, top 8 languages + Other, GitHub
canonical language colors, legend with color dots
- DashPage/ProjectPage: fetch repo languages once via new endpoint
instead of per-repo forge proxy calls (eliminates 200+ GitHub
API calls and 403 rate limit errors)
- removed fetchLanguages forge proxy wrapper (dead code)
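For reference, the "top 8 + Other" selection and centered baseline
could be computed roughly as below. This is a sketch under assumptions,
not the LanguageStreamGraph code: the function names are hypothetical
and it lays out a single weekly bucket only.

```python
from __future__ import annotations


def top_n_with_other(totals: dict[str, float], n: int = 8) -> list[str]:
    """Keep the n largest languages; add "Other" if anything was cut."""
    ranked = sorted(totals, key=lambda name: (-totals[name], name))
    return ranked[:n] + (["Other"] if len(ranked) > n else [])


def centered_layers(
    bucket: dict[str, float], order: list[str]
) -> dict[str, tuple[float, float]]:
    """Stack one bucket's values symmetrically around y = 0.

    Returns {layer: (y0, y1)}; languages not in `order` fold into "Other".
    """
    values = {name: 0.0 for name in order}
    for lang, count in bucket.items():
        key = lang if lang in values else "Other"
        if key in values:
            values[key] += count
    y = -sum(values.values()) / 2  # centered baseline
    layers = {}
    for name in order:
        layers[name] = (y, y + values[name])
        y += values[name]
    return layers


order = top_n_with_other({"Rust": 10, "Go": 5, "C": 1}, n=2)
layers = centered_layers({"Rust": 4, "Go": 2, "C": 2}, order)
```

Each (y0, y1) pair is the vertical extent of one band in that bucket;
joining the pairs across buckets gives the SVG path for a language.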
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The REST /user/repos endpoint only returns repos where the user is
owner, collaborator, or org member. Repos contributed to via PRs
(e.g. polkadot-js/api, zed-industries/zed) were never discovered
and their commits were missing from moments.
Now supplements /user/repos with a GraphQL
repositoriesContributedTo query, which returns all repos the user
has committed to, opened issues/PRs on, or reviewed. The query uses
cursor-based pagination and has no result cap.
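A sketch of the cursor loop described above. Assumptions: the query is
rooted at viewer, the page size is 100, and run_query is a hypothetical
callable that posts the query and returns the "data" object; the real
worker may differ on all three.

```python
from typing import Callable, Iterator, Optional

QUERY = """
query($cursor: String) {
  viewer {
    repositoriesContributedTo(
      first: 100
      after: $cursor
      contributionTypes: [COMMIT, ISSUE, PULL_REQUEST, PULL_REQUEST_REVIEW]
    ) {
      nodes { nameWithOwner }
      pageInfo { hasNextPage endCursor }
    }
  }
}
"""


def contributed_repos(run_query: Callable[[str, dict], dict]) -> Iterator[str]:
    """Yield owner/name for every page until hasNextPage is false."""
    cursor: Optional[str] = None
    while True:
        data = run_query(QUERY, {"cursor": cursor})
        conn = data["viewer"]["repositoriesContributedTo"]
        for node in conn["nodes"]:
            yield node["nameWithOwner"]
        if not conn["pageInfo"]["hasNextPage"]:
            return
        cursor = conn["pageInfo"]["endCursor"]
```

Passing the transport in as run_query keeps the loop testable without
network access.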
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After initial backfill, scan_repo was fetching only page 1 (100 most
recent commits) per repo. If more than 100 commits landed between
7-day polls, older ones in that window were permanently missed.
Now stores the newest commit date in poller_state.last_modified and
passes it as &since= on subsequent polls, with full pagination, so
only genuinely new commits are fetched but none are skipped.
On first poll after deploy, last_modified is NULL so no since filter
is applied. This triggers a full re-backfill that catches any
previously missed commits.
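The polling logic above, sketched with a hypothetical fetch_page
callable standing in for the HTTP request. Assumptions: commit dates
are ISO 8601 UTC strings in one fixed format (so string comparison
orders them), and an empty page marks the end.

```python
from typing import Callable, List, Optional, Tuple


def scan_repo(
    fetch_page: Callable[[int, Optional[str]], List[dict]],
    last_modified: Optional[str],
) -> Tuple[List[dict], Optional[str]]:
    """Fetch every page of commits newer than last_modified.

    fetch_page(page, since) returns a list of {"sha", "date"} dicts;
    since=None means "no filter", i.e. a full backfill.
    Returns the commits and the value to store as the new last_modified.
    """
    commits: List[dict] = []
    page = 1
    while True:
        batch = fetch_page(page, last_modified)
        if not batch:
            break
        commits.extend(batch)
        page += 1
    # Keep the old marker when nothing new arrived.
    newest = max((c["date"] for c in commits), default=last_modified)
    return commits, newest
```

If the API treats since as inclusive, the boundary commit may come
back again on the next poll; a stable event ID makes that harmless.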
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The /repos/{owner}/{repo}/commits endpoint doesn't include repo info
in its response. Without _repo in the payload, these commits were
invisible to the projects query. Add _repo to parse_commit and include
it in the COALESCE chain for github source repo extraction.
After deploy, reset github-repo poller state to re-ingest with _repo:
DELETE FROM poller_state WHERE source LIKE 'github-repo%';
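Roughly what the _repo change amounts to, as a sketch. Only _repo and
parse_commit are named in this commit; the other payload keys
(repository.full_name, repo.name) and the helper repo_of are
assumptions used to imitate a COALESCE chain.

```python
from typing import Optional


def parse_commit(raw: dict, repo_full_name: str) -> dict:
    """Copy the API commit object and tag it with the repo it came from."""
    return {**raw, "_repo": repo_full_name}


def repo_of(payload: dict) -> Optional[str]:
    """First non-null candidate wins, like SQL COALESCE."""
    candidates = (
        (payload.get("repository") or {}).get("full_name"),  # assumed key
        (payload.get("repo") or {}).get("name"),             # assumed key
        payload.get("_repo"),                                # added here
    )
    return next((c for c in candidates if c is not None), None)


payload = parse_commit({"sha": "abc123"}, "zed-industries/zed")
```

Without the _repo entry, repo_of would return None for this payload,
which matches the "invisible to the projects query" symptom.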
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a new github-repo EventSource that enumerates all repos via
/user/repos and walks each repo's /commits?author= endpoint, which
has no 1000-result cap unlike the Search API. Events use the same
github-commit:{sha} ID scheme as github_search for dedup. Per-repo
poller state enables full backfill on first run, page-1-only on
subsequent polls. Weekly poll interval by default.
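How the shared ID scheme gives dedup across sources, as a minimal
sketch. The in-memory dict and the function names are hypothetical;
the real store is presumably a database table with a unique key.

```python
from typing import Dict, Iterable


def event_id(sha: str) -> str:
    """ID scheme shared by github_search and github-repo."""
    return f"github-commit:{sha}"


def insert_events(store: Dict[str, dict], events: Iterable[dict]) -> int:
    """Insert events keyed by ID, skipping ones already present.

    Returns the number of newly inserted events.
    """
    inserted = 0
    for event in events:
        if event["id"] not in store:
            store[event["id"]] = event
            inserted += 1
    return inserted


store: Dict[str, dict] = {}
from_search = [{"id": event_id("abc123"), "source": "github_search"}]
from_repo = [
    {"id": event_id("abc123"), "source": "github-repo"},
    {"id": event_id("def456"), "source": "github-repo"},
]
first = insert_events(store, from_search)
second = insert_events(store, from_repo)
```

The commit seen by both sources is stored once; only the commit the
Search API missed is added by the second call.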
Closes #1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>