Some checks failed
CI / Test (push) Waiting to run
CI / CUDA type-check (push) Failing after 18s
build-prerelease / Resolve version stamps (push) Successful in 30s
CI / Format (push) Successful in 31s
CI / Clippy (push) Successful in 2m25s
build-prerelease / Build cortex binary (push) Successful in 5m19s
build-prerelease / Build neuron-ada (push) Has been cancelled
build-prerelease / Package cortex RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-ada RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-ampere RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-blackwell RPM (push) Has been cancelled
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Has been cancelled
build-prerelease / Build neuron-blackwell (push) Has been cancelled
CI / Build cortex SRPM (push) Has been cancelled
CI / Build neuron SRPM (push) Has been cancelled
CI / Publish cortex to COPR (push) Has been cancelled
CI / Publish neuron to COPR (push) Has been cancelled
CI / Bump version in source (push) Has been cancelled
build-prerelease / Build neuron-ampere (push) Has been cancelled
CI run 255 job 3 (CUDA type-check) fails with:
error: could not execute process `*** rustc -vV` (never executed)
Caused by: No such file or directory (os error 2)
The redacted `***` is `sccache`. The ci.yml workflow-level env block
sets `RUSTC_WRAPPER: sccache` because the generic `rust` runner has
sccache installed and routes the cache to caveman.kosherinata.internal.
The new `cuda-check` job runs on `cuda-13.0` (where nvcc lives), and
that runner doesn't carry sccache on PATH — so cargo's first action
(`sccache rustc -vV` to probe the compiler version) fails before
borrow-check even starts.
`build-prerelease.yml`, which uses the same `cuda-13.0` runner for
the actual release neuron builds, deliberately does NOT set
RUSTC_WRAPPER. That's the pattern this commit applies.
Fix: override `RUSTC_WRAPPER` (plus the SCCACHE_* and AWS_* env
vars) with empty values locally on the job. We lose caching on the
cuda-check job (it's borrow-check-only and finishes in a couple of
minutes anyway), but the gate runs.
The job's purpose — fail fast on `#[cfg(feature = "cuda")]`
borrowck errors that the default-feature gate misses — is what
matters, and that purpose was undermined by the env inheritance.
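
For comparison, a minimal sketch of a different approach from the one this commit takes: probe for the wrapper at run time instead of clearing it per job. `choose_rustc_wrapper` is a hypothetical helper and is not part of ci.yml. It relies on cargo treating an empty `RUSTC_WRAPPER` as "no wrapper".

```shell
#!/bin/sh
# Sketch only: choose_rustc_wrapper is a hypothetical helper, not part of ci.yml.
# Prints the wrapper name if that tool is on PATH, otherwise prints nothing.
# An empty RUSTC_WRAPPER tells cargo to invoke rustc directly.
choose_rustc_wrapper() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s' "$1"
  fi
}

RUSTC_WRAPPER="$(choose_rustc_wrapper sccache)"
export RUSTC_WRAPPER
echo "RUSTC_WRAPPER='${RUSTC_WRAPPER}'"
```

Inside a workflow this would presumably have to run in the same step as the cargo invocation, or have its result appended to `$GITHUB_ENV`, because an `export` does not carry over between steps. The explicit per-job `env:` override used here avoids that extra plumbing.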
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
346 lines
11 KiB
YAML
name: CI

on:
  push:
    branches: ["**"]
    tags: ["v*"]
  pull_request:
    branches: [main]

# Share a concurrency group with build-prerelease.yml so the two
# workflows don't race on the same `rust` runner workspace (act's
# /root/.cache/act/<hash>/hostexecutor/ is shared across concurrent
# jobs and one job's checkout step nukes another's in-flight build
# files). cancel-in-progress=false → they queue; same-ref pushes
# coalesce per workflow via cancel-in-progress on each.
concurrency:
  group: cortex-runner-pool-${{ github.ref }}
  cancel-in-progress: false

env:
  CARGO_INCREMENTAL: "0"
  RUSTC_WRAPPER: sccache
  SCCACHE_BUCKET: sccache
  SCCACHE_ENDPOINT: http://caveman.kosherinata.internal:9000
  SCCACHE_REGION: auto
  SCCACHE_S3_USE_SSL: "false"
  AWS_ACCESS_KEY_ID: ${{ secrets.SCCACHE_S3_ACCESS_KEY }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.SCCACHE_S3_SECRET_KEY }}
  # fmt, clippy, and test all run in parallel on the same `rust` runner
  # and would otherwise share /root/.cache/act/<hash>/hostexecutor/target/,
  # racing each other's cargo temp files (.tmpXXXXXX) and failing builds
  # mid-compile. Give each job its own target directory so the invocations
  # don't collide. sccache still backs the actual rustc cache, so the
  # rebuild penalty is small.
  CARGO_TARGET_DIR: target-${{ github.job }}

jobs:
  fmt:
    name: Format
    runs-on: rust
    steps:
      - uses: actions/checkout@v4
      - run: cargo fmt --check --all

  clippy:
    name: Clippy
    runs-on: rust
    steps:
      - uses: actions/checkout@v4
      # sccache occasionally fails with spurious race-condition errors;
      # retrying the same invocation succeeds without code changes.
      # Allow up to 3 attempts before declaring real failure.
      - name: Clippy (with retry)
        run: |
          for attempt in 1 2 3; do
            echo "::group::clippy attempt ${attempt}"
            if cargo clippy --workspace -- -D warnings; then
              echo "::endgroup::"
              exit 0
            fi
            echo "::endgroup::"
            echo "clippy failed on attempt ${attempt}"
            if [ "${attempt}" -lt 3 ]; then
              sleep 5
            fi
          done
          echo "clippy failed after 3 attempts"
          exit 1
      - run: sccache --show-stats

  test:
    name: Test
    runs-on: rust
    steps:
      - uses: actions/checkout@v4
      # See the clippy job for why this is retried.
      - name: Test (with retry)
        run: |
          for attempt in 1 2 3; do
            echo "::group::test attempt ${attempt}"
            if cargo test --workspace; then
              echo "::endgroup::"
              exit 0
            fi
            echo "::endgroup::"
            echo "test failed on attempt ${attempt}"
            if [ "${attempt}" -lt 3 ]; then
              sleep 5
            fi
          done
          echo "test failed after 3 attempts"
          exit 1
      - run: sccache --show-stats

  # Type-check the CUDA-only code path. Borrow-check only: we
  # never run the tests here (the runner has no GPU). This catches
  # the category of bug where a refactor compiles fine under the
  # default feature set (which is what the `clippy` and `test` jobs
  # exercise) but fails inside a `#[cfg(feature = "cuda")]` block.
  # `runs-on: cuda-13.0` selects the runner that ships nvcc /
  # cudarc's build prerequisites. The generic `rust` and `rpm`
  # runners don't have them (the previous label `rpm` was tried
  # first and tripped cudarc's `nvcc --version` build script;
  # see commit history).
  cuda-check:
    name: CUDA type-check
    runs-on: cuda-13.0
    # The workflow-level env sets `RUSTC_WRAPPER: sccache` for the
    # `rust` runner (where fmt/clippy/test live and sccache is
    # installed). The `cuda-13.0` runner doesn't have sccache on
    # PATH, so inheriting the wrapper makes cargo bail with
    # `could not execute process `sccache rustc -vV` (never executed)`
    # before borrow-check even starts. Clear it locally. Also clear
    # SCCACHE_* so cargo doesn't try to contact the cache (the
    # remote auth headers come from secrets that aren't present on
    # this runner either). Lose the cache, keep the gate.
    env:
      RUSTC_WRAPPER: ""
      SCCACHE_BUCKET: ""
      SCCACHE_ENDPOINT: ""
      SCCACHE_REGION: ""
      SCCACHE_S3_USE_SSL: ""
      AWS_ACCESS_KEY_ID: ""
      AWS_SECRET_ACCESS_KEY: ""
    steps:
      - uses: actions/checkout@v4
      - name: cargo check --features cuda (with retry)
        run: |
          for attempt in 1 2 3; do
            echo "::group::cuda-check attempt ${attempt}"
            if cargo check -p neuron --features cuda --all-targets; then
              echo "::endgroup::"
              exit 0
            fi
            echo "::endgroup::"
            echo "cuda-check failed on attempt ${attempt}"
            if [ "${attempt}" -lt 3 ]; then
              sleep 5
            fi
          done
          echo "cuda-check failed after 3 attempts"
          exit 1

  srpm-cortex:
    name: Build cortex SRPM
    runs-on: rpm
    needs: [fmt, clippy, test, cuda-check]
    if: startsWith(github.ref, 'refs/tags/v')
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Determine version
        id: version
        run: |
          VERSION="${GITHUB_REF#refs/tags/v}"
          echo "VERSION=${VERSION}" >> "$GITHUB_OUTPUT"

      - name: Stamp version
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          sed -i '/\[workspace\.package\]/,/\[/{ s/^version = ".*"/version = "'"${VERSION}"'"/ }' Cargo.toml
          sed -i "s/^Version:.*/Version: ${VERSION}/" cortex.spec

      - name: Generate changelog entry
        uses: https://git.lair.cafe/actions/rpm-changelog@v1
        with:
          spec: cortex.spec
          version: ${{ steps.version.outputs.VERSION }}

      - name: Generate source tarball
        run: |
          set -ex
          VERSION="${{ steps.version.outputs.VERSION }}"
          tar czf /tmp/cortex-${VERSION}.tar.gz \
            --transform "s,^\.,cortex-${VERSION}," \
            --exclude='./target' \
            --exclude='./.git' \
            --exclude='*.tar.gz' \
            --exclude='*.src.rpm' \
            .
          mv /tmp/cortex-${VERSION}.tar.gz .

      - name: Vendor Rust dependencies
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          cargo vendor vendor/
          tar czf cortex-${VERSION}-vendor.tar.gz vendor/
          rm -rf vendor/

      - name: Build SRPM
        run: |
          rpmbuild -bs cortex.spec \
            --define "_sourcedir $(pwd)" \
            --define "_srcrpmdir $(pwd)"

      - name: Upload SRPM artifact
        uses: actions/upload-artifact@v3
        with:
          name: srpm-cortex
          path: "*.src.rpm"

  srpm-neuron:
    name: Build neuron SRPM
    runs-on: rpm
    needs: [fmt, clippy, test, cuda-check]
    if: startsWith(github.ref, 'refs/tags/v')
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Determine version
        id: version
        run: |
          VERSION="${GITHUB_REF#refs/tags/v}"
          echo "VERSION=${VERSION}" >> "$GITHUB_OUTPUT"

      - name: Stamp version
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          sed -i '/\[workspace\.package\]/,/\[/{ s/^version = ".*"/version = "'"${VERSION}"'"/ }' Cargo.toml
          sed -i "s/^Version:.*/Version: ${VERSION}/" helexa-neuron.spec

      - name: Generate changelog entry
        uses: https://git.lair.cafe/actions/rpm-changelog@v1
        with:
          spec: helexa-neuron.spec
          version: ${{ steps.version.outputs.VERSION }}

      - name: Generate source tarball
        run: |
          set -ex
          VERSION="${{ steps.version.outputs.VERSION }}"
          tar czf /tmp/helexa-neuron-${VERSION}.tar.gz \
            --transform "s,^\.,helexa-neuron-${VERSION}," \
            --exclude='./target' \
            --exclude='./.git' \
            --exclude='*.tar.gz' \
            --exclude='*.src.rpm' \
            .
          mv /tmp/helexa-neuron-${VERSION}.tar.gz .

      - name: Vendor Rust dependencies
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          cargo vendor vendor/
          tar czf helexa-neuron-${VERSION}-vendor.tar.gz vendor/
          rm -rf vendor/

      - name: Build SRPM
        run: |
          rpmbuild -bs helexa-neuron.spec \
            --define "_sourcedir $(pwd)" \
            --define "_srcrpmdir $(pwd)"

      - name: Upload SRPM artifact
        uses: actions/upload-artifact@v3
        with:
          name: srpm-neuron
          path: "*.src.rpm"

  copr-cortex:
    name: Publish cortex to COPR
    runs-on: fedora-43
    needs: srpm-cortex
    steps:
      - name: Download SRPM
        uses: actions/download-artifact@v3
        with:
          name: srpm-cortex

      - name: Publish to COPR
        uses: https://git.lair.cafe/actions/copr-publish@v1
        with:
          project: helexa/helexa
          srpm: "*.src.rpm"
          copr-config: ${{ secrets.COPR_CONFIG }}

  copr-neuron:
    name: Publish neuron to COPR
    runs-on: fedora-43
    needs: srpm-neuron
    steps:
      - name: Download SRPM
        uses: actions/download-artifact@v3
        with:
          name: srpm-neuron

      - name: Publish to COPR
        uses: https://git.lair.cafe/actions/copr-publish@v1
        with:
          project: helexa/helexa
          srpm: "*.src.rpm"
          copr-config: ${{ secrets.COPR_CONFIG }}

  bump-version:
    name: Bump version in source
    runs-on: rust
    needs: [copr-cortex, copr-neuron]
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Determine version
        id: version
        run: echo "VERSION=${GITHUB_REF#refs/tags/v}" >> "$GITHUB_OUTPUT"

      - name: Stamp version
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          sed -i '/\[workspace\.package\]/,/\[/{ s/^version = ".*"/version = "'"${VERSION}"'"/ }' Cargo.toml
          sed -i "s/^Version:.*/Version: ${VERSION}/" cortex.spec
          sed -i "s/^Version:.*/Version: ${VERSION}/" helexa-neuron.spec
          cargo check --workspace 2>/dev/null || true

      - name: Generate cortex changelog entry
        uses: https://git.lair.cafe/actions/rpm-changelog@v1
        with:
          spec: cortex.spec
          version: ${{ steps.version.outputs.VERSION }}

      - name: Generate helexa-neuron changelog entry
        uses: https://git.lair.cafe/actions/rpm-changelog@v1
        with:
          spec: helexa-neuron.spec
          version: ${{ steps.version.outputs.VERSION }}

      - name: Commit and push
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          git config user.name "Gitea Actions"
          git config user.email "actions@git.lair.cafe"
          git add Cargo.toml Cargo.lock cortex.spec helexa-neuron.spec
          if git diff --cached --quiet; then
            echo "Nothing to commit for ${VERSION}"
          else
            git commit -m "chore: bump version to ${VERSION}"
            git remote set-url origin "https://gitea-actions:${GITEA_TOKEN}@git.lair.cafe/helexa/cortex.git"
            git push origin HEAD:main
          fi