Live-stream builder-live.log instead of post-mortem dump #1

Open
opened 2026-04-16 09:43:18 +00:00 by grenade · 0 comments
Owner

Current behaviour

scripts/copr-build.sh submits the build, runs copr-cli watch-build
to block until completion, then uses copr-cli download-build to
fetch each chroot's builder-live.log and emits them as ::group::
blocks.

This means the logs only appear in CI output after the build
finishes. For a long-running COPR build (tens of minutes on larger
projects), that's a long blind spot — you see status transitions but
not the actual compiler/linker output until the build is done.

Proposed behaviour

Stream each chroot's builder-live.log into the CI output as it is
written by the COPR builder. The log file is continuously updated on
the mirror at a predictable URL:

https://download.copr.fedorainfracloud.org/results/<owner>/<project>/<chroot>/<build_id>-<pkg>/builder-live.log
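
A small helper makes the URL construction explicit. This is only a sketch: the function name `live_log_url` and its argument order are hypothetical, and it assumes the directory component is exactly `<build_id>-<pkg>` as written above. If the mirror turns out to zero-pad the build ID in the directory name, pad it before calling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the live-log URL for one chroot.
# Assumes the path layout quoted above; argument names are illustrative.
live_log_url() {
  local owner=$1 project=$2 chroot=$3 build_id=$4 pkg=$5
  printf 'https://download.copr.fedorainfracloud.org/results/%s/%s/%s/%s-%s/builder-live.log\n' \
    "$owner" "$project" "$chroot" "$build_id" "$pkg"
}
```

Keeping this in one place means a layout change on the mirror only needs a fix in a single function.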

Implementation sketch

  1. Submit with --nowait (same as today), capture build ID.
  2. Query the COPR REST API for the list of chroots for this build:
    GET https://copr.fedorainfracloud.org/api_3/build/<build_id>
    
    The response includes a chroots array.
  3. For each chroot, spawn a background tailer:
    • Poll the live log URL every ~5s with an HTTP Range: bytes=N-
      request to fetch only new bytes since the last poll.
    • Prefix each emitted line with [<chroot>] so interleaved output
      from parallel chroots is still decipherable.
    • Exit when either: (a) the chroot's build completes (polled via
      API) and the final read returns no new bytes, or (b) the mirror
      returns a 416 (Range Not Satisfiable) plus a closed builder-live
      endpoint.
  4. Foreground: copr-cli watch-build <id> for overall status (as
    today) — exits non-zero on failure.
  5. wait for all tailers to drain before exiting.
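
Step 3 could be sketched roughly as below. The function names (`complete_bytes`, `fetch_from`, `tail_chroot`) are hypothetical, and so is the completion signal: this sketch assumes the caller creates a flag file once the API reports the chroot as finished. It also assumes chroot names contain no `/`, `&` or `\`, since the name is spliced into a `sed` expression. Only complete lines are emitted until the final pass, so a line that is still being written is not split across two polls.

```shell
#!/usr/bin/env bash
# Sketch of one background tailer; all names here are illustrative.

# Print how many leading bytes of FILE form complete (newline-terminated) lines.
complete_bytes() {
  local file=$1 size partial=0
  size=$(wc -c <"$file")
  # $(...) strips a trailing newline, so non-empty output means the last
  # byte is not a newline and the final line is still being written.
  if [ -n "$(tail -c 1 "$file")" ]; then
    partial=$(tail -n 1 "$file" | wc -c)
  fi
  echo $((size - partial))
}

# Fetch bytes from OFFSET onward into OUT; print the HTTP status,
# or 000 if curl itself failed (DNS, timeout, connection reset).
fetch_from() {
  local url=$1 offset=$2 out=$3 status
  status=$(curl --silent --max-time 30 --range "${offset}-" \
                --output "$out" --write-out '%{http_code}' "$url") || status=000
  echo "$status"
}

# Poll URL until DONE_FLAG exists and a read yields no new bytes.
tail_chroot() {
  local url=$1 chroot=$2 done_flag=$3 interval=${4:-5}
  local offset=0 n last=0 status chunk
  chunk=$(mktemp)
  while :; do
    n=0
    # Sample the flag before fetching, so a read always follows completion.
    if [ -e "$done_flag" ]; then last=1; fi
    status=$(fetch_from "$url" "$offset" "$chunk")
    if [ "$status" = 206 ]; then
      if [ "$last" = 1 ]; then
        n=$(( $(wc -c <"$chunk") ))     # final pass: emit everything
      else
        n=$(complete_bytes "$chunk")    # otherwise hold back a partial line
      fi
      head -c "$n" "$chunk" | sed "s/^/[$chroot] /"
      offset=$((offset + n))
    fi
    # Any other status (404, 416, 5xx, 000) counts as "nothing new this poll".
    if [ "$last" = 1 ] && [ "$n" -eq 0 ]; then break; fi
    sleep "$interval"
  done
  rm -f "$chunk"
}
```

A caller might start one tailer per chroot with `tail_chroot "$url" "$chroot" "$flag" &`, create each flag file once the API reports that chroot as finished, and then `wait` for all of them after `copr-cli watch-build` returns.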

Edge cases to handle

  • Log file doesn't exist yet — the chroot may be in pending /
    importing state. Initial polls will 404. Retry with backoff until
    the URL appears, or give up after N minutes.
  • Build fails before any log is written — import failure, no
    chroot ever runs. Fall back to the current post-mortem download-build
    behaviour.
  • Partial log on failure — builder-live.log may be truncated when
    the builder is killed. Accept it; emit what exists.
  • Network flakiness — treat transient 5xx / curl failures as
    "try again next poll", don't abort the stream.
  • Rawhide vs stable chroots finishing at different times — the
    stream should continue for still-running chroots even after one
    finishes.
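
One way to keep these cases readable is to reduce each poll to a status code and map it to an action. The sketch below is illustrative only: `poll_action`, `next_delay`, the action names and the 60-second cap are all assumptions, and status `000` stands for "curl itself failed".

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the edge cases above.

# Map the HTTP status of one poll to an action name.
poll_action() {
  case $1 in
    200|206) echo emit ;;   # new bytes arrived
    404)     echo wait ;;   # log not created yet: back off and retry
    416)     echo idle ;;   # nothing past our offset yet
    *)       echo retry ;;  # 5xx, 000 (curl failure), anything unexpected
  esac
}

# Exponential backoff with a cap: next_delay CURRENT [CAP], CAP defaults to 60.
next_delay() {
  local d=$(( $1 * 2 )) cap=${2:-60}
  if [ "$d" -gt "$cap" ]; then d=$cap; fi
  echo "$d"
}
```

Starting from a 5-second delay, repeated 404s would then wait 10, 20, 40 and 60 seconds, staying at 60 until the log appears or the overall "give up after N minutes" limit is reached.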

Interleaving

Parallel tailers writing to the same stdout will interleave lines.
The [<chroot>] prefix is the minimum disambiguation. For clearer
reading, each tailer could buffer its output and flush only on line
boundaries.
::group:: per chroot doesn't work during live streaming because
groups must be opened and closed contiguously — we'd have to drop
grouping or open one group per line (ugly). Line prefix is the
pragmatic choice.
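
A line-at-a-time prefixer is enough to implement this. The sketch below uses a hypothetical function name, `prefix_lines`, and issues one `printf` per line. Short single writes to a pipe are generally not split (POSIX guarantees this up to `PIPE_BUF` bytes), so in practice lines from different tailers should stay whole, although very long lines could still interleave.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prefix each stdin line with "[<chroot>] ".
prefix_lines() {
  local chroot=$1 line
  # `|| [ -n "$line" ]` keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '[%s] %s\n' "$chroot" "$line"
  done
}
```

For example, `printf 'cc -c a.c\nld a.o\n' | prefix_lines fedora-40-x86_64` prints each line prefixed with `[fedora-40-x86_64] `.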

Scope

Implement inside scripts/copr-build.sh (bash + curl + jq).
Keep the existing post-mortem download-build dump as a fallback
path for when streaming fails or the build errors before any log
appears — the current code can be wrapped in a finally-style trap
rather than being the primary path.
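
The `finally`-style wrapping might look like the sketch below. Everything in it is hypothetical: `post_mortem_dump` is a placeholder for the existing v1 logic, `run_with_fallback` and `on_exit` are illustrative names, and the "did we stream anything" signal is assumed to be a marker file that a tailer creates the first time it emits output.

```shell
#!/usr/bin/env bash
# Sketch: run the streaming path, fall back to the post-mortem dump on exit
# if nothing was streamed. All names are illustrative.

post_mortem_dump() {
  # Placeholder for the current download-build + ::group:: logic.
  echo "post-mortem dump for build $1"
}

on_exit() {
  local rc=$?                       # preserve the script's exit status
  if [ ! -e "$state_dir/streamed" ]; then
    post_mortem_dump "$build_id" || true
  fi
  rm -rf "$state_dir"
  exit "$rc"
}

# run_with_fallback BUILD_ID COMMAND [ARGS...]
run_with_fallback() {
  build_id=$1; shift
  state_dir=$(mktemp -d)            # tailers would touch "$state_dir/streamed"
  trap on_exit EXIT
  "$@"
}
```

Because the handler runs on every exit and re-raises the original status, a failed build still fails the CI step, and the dump only appears when the live stream produced nothing.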

Bump the action to v2 since this changes output timing behaviour
enough that consumers may want to opt in. Keep v1 as the stable
post-mortem variant.

Context

The post-mortem approach was chosen initially because it is simple
and reliable — see the discussion that led to v1 in helexa/cortex
commit history. Streaming was deferred as a follow-up, to be picked up
if post-mortem visibility turned out to be insufficient in practice.

Reference: actions/copr-publish#1