Some checks failed
build-prerelease / Package cortex RPM (push) Blocked by required conditions
CI / CUDA type-check (push) Failing after 11s
build-prerelease / Resolve version stamps (push) Successful in 30s
CI / Format (push) Successful in 32s
CI / Clippy (push) Successful in 2m31s
build-prerelease / Build cortex binary (push) Successful in 4m32s
CI / Test (push) Successful in 5m42s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Build neuron-blackwell (push) Successful in 6m8s
build-prerelease / Package helexa-neuron-ada RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-ampere RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-blackwell RPM (push) Has been cancelled
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Has been cancelled
build-prerelease / Build neuron-ampere (push) Has been cancelled
build-prerelease / Build neuron-ada (push) Has been cancelled
Step 2 of the Responses rollout: native `/v1/responses` endpoint on
neuron that consumes the same InferenceEvent stream as
`/v1/chat/completions` but emits it as the Responses API's named
SSE event family. No gateway-side translation.
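The difference between the two wire shapes comes down to whether each Server-Sent Events frame carries an `event:` line. A minimal sketch of the framing, assuming a hypothetical `sse_frame` helper (neuron itself relies on axum's SSE support, so this only illustrates the bytes on the wire, and the JSON payload is made up):

```rust
/// Format one Server-Sent Events frame.
/// `event` is the optional event name: a Responses-style stream passes
/// `Some("response.created")` and so on, while a chat-completions-style
/// stream would pass `None` and send `data:` lines only.
/// (Hypothetical helper for illustration; not part of neuron.)
fn sse_frame(event: Option<&str>, data: &str) -> String {
    let mut out = String::new();
    if let Some(name) = event {
        out.push_str("event: ");
        out.push_str(name);
        out.push('\n');
    }
    // Each line of a multi-line payload needs its own `data:` prefix.
    for line in data.lines() {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    // A blank line terminates the frame.
    out.push('\n');
    out
}
```

Calling `sse_frame(Some("response.created"), "{\"id\":\"r1\"}")` yields an `event:` line, a `data:` line, and the terminating blank line.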
## Surface
- `cortex-core::responses` envelope types: `ResponsesRequest`,
`ResponsesInput` (text | items), `ResponsesInputItem` (message |
function_call | function_call_output | reasoning),
`ResponsesContentPart` (input_text | input_image | output_text),
`ResponsesResponse`, `ResponsesOutputItem`, `ResponsesUsage`, plus
an `events::*` constants module so the projector and the wire shape
stay in sync without string typos.
- `neuron::wire::openai_responses`:
  - `request_to_chat(req)` flattens Responses input + instructions
    into a `ChatCompletionRequest` the candle harness already
    understands. Text-only Parts collapse to a string; mixed
    text+image Parts go to chat's content-array shape; reasoning
    items drop; function_call / function_call_output round-trip
    via tool_calls / tool_call_id metadata so the surface is
    consistent for the day the harness emits tool calls.
  - `project_responses_stream(rx, meta)` reads InferenceEvents
    and emits the eight named events that compose a Responses
    stream: response.created → output_item.added → content_part.added
    → output_text.delta×N → output_text.done → content_part.done
    → output_item.done → response.completed. Synthesises start
    frames if the producer skips Start (poisoned model, early
    disconnect) so the stream stays coherent.
  - `build_response(meta, text, reason, usage)` for the
    non-streaming path.
- `CandleHarness::inference_stream(req)` extracted from
`chat_completion_stream`, returning a typed `InferenceStream`
(event receiver + id/created/model_id metadata). Both
`chat_completion_stream` and the new `responses_stream` are now
thin wrappers that pick their wire projection. TP path got the
same treatment (`chat_completion_tp_stream` → `inference_tp_stream`).
- `POST /v1/responses` route on neuron. Non-streaming returns one
buffered `ResponsesResponse`; streaming returns axum SSE with
both an event name and JSON data per frame (Responses, unlike
chat completions, uses named `event:` lines). The
`inference_error_response` helper was hoisted out so the chat and
responses handlers share the InferenceError → HTTP mapping.
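The event ordering and the start-frame synthesis described for `project_responses_stream` can be sketched with plain slices instead of channels. Everything here is an assumption for illustration: the `InferenceEvent` variants are a hypothetical stand-in for the real type, `project_event_names` is not the real function, and the event names are abbreviated the same way as in the list above.

```rust
/// Hypothetical stand-in for the harness's event type; the real
/// `InferenceEvent` may carry different variants and payloads.
#[derive(Debug, Clone, PartialEq)]
enum InferenceEvent {
    Start,
    Delta(String),
    Done,
}

/// Project inference events into the ordered list of Responses event
/// names (abbreviated as in the commit message).
fn project_event_names(events: &[InferenceEvent]) -> Vec<String> {
    const START_FRAMES: [&str; 3] =
        ["response.created", "output_item.added", "content_part.added"];
    const END_FRAMES: [&str; 4] = [
        "output_text.done",
        "content_part.done",
        "output_item.done",
        "response.completed",
    ];

    let mut out = Vec::new();
    let mut started = false;
    for ev in events {
        // Emit the start frames exactly once, whether or not the
        // producer actually sent `Start` first.
        if !started {
            out.extend(START_FRAMES.iter().map(|s| s.to_string()));
            started = true;
        }
        match ev {
            InferenceEvent::Start => {}
            InferenceEvent::Delta(_) => out.push("output_text.delta".to_string()),
            InferenceEvent::Done => {
                out.extend(END_FRAMES.iter().map(|s| s.to_string()));
                break; // Nothing is emitted after the terminal frames.
            }
        }
    }
    out
}
```

A stream of `Start`, two `Delta`s, and `Done` produces nine names (three start frames, two deltas, four closing frames). The same stream without `Start` still opens with `response.created`. What the real projector does when the stream ends without a terminal event is not stated above, so this sketch simply stops.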
## CI
Also bundles the `cuda-check` runner-label fix from feedback on
commit 1859777: `runs-on: rpm` doesn't ship the CUDA toolkit so
cudarc's nvcc-version build script blew up. Switched to
`runs-on: cuda-13.0` per the existing labels.
## Scope cuts (documented in the modules)
- `previous_response_id` rejected at translate time with 400
(`code: chained_conversation_not_supported`) — stateful chained
conversations need a persistence layer we haven't built.
- Reasoning items dropped (no Qwen3 `<think>` routing yet).
- Single output item per response (one `"message"` carrying text);
`function_call` items reserved but not synthesised.
- Streaming events cover the core set; `response.in_progress`
and the web_search / image_generation event families are
out-of-scope.
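The first scope cut, rejecting `previous_response_id` at translate time, might look roughly like the following. This is a sketch under stated assumptions: the trimmed-down `ResponsesRequest`, the `TranslateError` type, and `check_supported` are hypothetical names, and only the 400 status and the `chained_conversation_not_supported` code come from the list above.

```rust
/// Hypothetical, trimmed-down request; the real `ResponsesRequest`
/// in `cortex-core::responses` has more fields.
struct ResponsesRequest {
    model: String,
    previous_response_id: Option<String>,
}

/// Hypothetical error carrying the HTTP status and machine-readable code.
#[derive(Debug, PartialEq)]
struct TranslateError {
    status: u16,
    code: &'static str,
}

/// Reject requests that rely on server-side conversation state
/// before any translation work is done.
fn check_supported(req: &ResponsesRequest) -> Result<(), TranslateError> {
    if req.previous_response_id.is_some() {
        return Err(TranslateError {
            status: 400,
            code: "chained_conversation_not_supported",
        });
    }
    Ok(())
}
```

Failing fast in the translator keeps the handler free of persistence concerns until a storage layer exists.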
22 new tests: 5 in cortex-core (envelope round-trips), 13 in
neuron::wire (request translator + projector + non-streaming
builder), 4 in neuron's tests/api.rs (route surface — 503 when no
candle, 400 on previous_response_id, 404 on missing model for
both stream and non-stream).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
329 lines
10 KiB
YAML
name: CI

on:
  push:
    branches: ["**"]
    tags: ["v*"]
  pull_request:
    branches: [main]

# Share a concurrency group with build-prerelease.yml so the two
# workflows don't race on the same `rust` runner workspace (act's
# /root/.cache/act/<hash>/hostexecutor/ is shared across concurrent
# jobs and one job's checkout step nukes another's in-flight build
# files). cancel-in-progress=false → they queue; same-ref pushes
# coalesce per workflow via cancel-in-progress on each.
concurrency:
  group: cortex-runner-pool-${{ github.ref }}
  cancel-in-progress: false

env:
  CARGO_INCREMENTAL: "0"
  RUSTC_WRAPPER: sccache
  SCCACHE_BUCKET: sccache
  SCCACHE_ENDPOINT: http://caveman.kosherinata.internal:9000
  SCCACHE_REGION: auto
  SCCACHE_S3_USE_SSL: "false"
  AWS_ACCESS_KEY_ID: ${{ secrets.SCCACHE_S3_ACCESS_KEY }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.SCCACHE_S3_SECRET_KEY }}
  # fmt, clippy, and test all run in parallel on the same `rust` runner
  # and would otherwise share /root/.cache/act/<hash>/hostexecutor/target/,
  # racing each other's cargo temp files (.tmpXXXXXX) and failing builds
  # mid-compile. Give each job its own target directory so the invocations
  # don't collide. sccache still backs the actual rustc cache, so the
  # rebuild penalty is small.
  CARGO_TARGET_DIR: target-${{ github.job }}

jobs:
  fmt:
    name: Format
    runs-on: rust
    steps:
      - uses: actions/checkout@v4
      - run: cargo fmt --check --all

  clippy:
    name: Clippy
    runs-on: rust
    steps:
      - uses: actions/checkout@v4
      # sccache occasionally fails with spurious race-condition errors;
      # retrying the same invocation succeeds without code changes.
      # Allow up to 3 attempts before declaring real failure.
      - name: Clippy (with retry)
        run: |
          for attempt in 1 2 3; do
            echo "::group::clippy attempt ${attempt}"
            if cargo clippy --workspace -- -D warnings; then
              echo "::endgroup::"
              exit 0
            fi
            echo "::endgroup::"
            echo "clippy failed on attempt ${attempt}"
            if [ "${attempt}" -lt 3 ]; then
              sleep 5
            fi
          done
          echo "clippy failed after 3 attempts"
          exit 1
      - run: sccache --show-stats

  test:
    name: Test
    runs-on: rust
    steps:
      - uses: actions/checkout@v4
      # See the clippy job for why this is retried.
      - name: Test (with retry)
        run: |
          for attempt in 1 2 3; do
            echo "::group::test attempt ${attempt}"
            if cargo test --workspace; then
              echo "::endgroup::"
              exit 0
            fi
            echo "::endgroup::"
            echo "test failed on attempt ${attempt}"
            if [ "${attempt}" -lt 3 ]; then
              sleep 5
            fi
          done
          echo "test failed after 3 attempts"
          exit 1
      - run: sccache --show-stats

  # Type-check the CUDA-only code path. Borrow-check-only — we
  # never run the tests here (the runner has no GPU). This catches
  # the category of bug where a refactor compiles fine under the
  # default feature set (which is what the `clippy` and `test` jobs
  # exercise) but fails inside a `#[cfg(feature = "cuda")]` block.
  # `runs-on: cuda-13.0` selects the runner that ships nvcc /
  # cudarc's build prerequisites. The generic `rust` and `rpm`
  # runners don't have them (the previous label `rpm` was tried
  # first and tripped cudarc's `nvcc --version` build script —
  # see commit history).
  cuda-check:
    name: CUDA type-check
    runs-on: cuda-13.0
    steps:
      - uses: actions/checkout@v4
      - name: cargo check --features cuda (with retry)
        run: |
          for attempt in 1 2 3; do
            echo "::group::cuda-check attempt ${attempt}"
            if cargo check -p neuron --features cuda --all-targets; then
              echo "::endgroup::"
              exit 0
            fi
            echo "::endgroup::"
            echo "cuda-check failed on attempt ${attempt}"
            if [ "${attempt}" -lt 3 ]; then
              sleep 5
            fi
          done
          echo "cuda-check failed after 3 attempts"
          exit 1

  srpm-cortex:
    name: Build cortex SRPM
    runs-on: rpm
    needs: [fmt, clippy, test, cuda-check]
    if: startsWith(github.ref, 'refs/tags/v')
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Determine version
        id: version
        run: |
          VERSION="${GITHUB_REF#refs/tags/v}"
          echo "VERSION=${VERSION}" >> "$GITHUB_OUTPUT"

      - name: Stamp version
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          sed -i '/\[workspace\.package\]/,/\[/{ s/^version = ".*"/version = "'"${VERSION}"'"/ }' Cargo.toml
          sed -i "s/^Version:.*/Version: ${VERSION}/" cortex.spec

      - name: Generate changelog entry
        uses: https://git.lair.cafe/actions/rpm-changelog@v1
        with:
          spec: cortex.spec
          version: ${{ steps.version.outputs.VERSION }}

      - name: Generate source tarball
        run: |
          set -ex
          VERSION="${{ steps.version.outputs.VERSION }}"
          tar czf /tmp/cortex-${VERSION}.tar.gz \
            --transform "s,^\.,cortex-${VERSION}," \
            --exclude='./target' \
            --exclude='./.git' \
            --exclude='*.tar.gz' \
            --exclude='*.src.rpm' \
            .
          mv /tmp/cortex-${VERSION}.tar.gz .

      - name: Vendor Rust dependencies
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          cargo vendor vendor/
          tar czf cortex-${VERSION}-vendor.tar.gz vendor/
          rm -rf vendor/

      - name: Build SRPM
        run: |
          rpmbuild -bs cortex.spec \
            --define "_sourcedir $(pwd)" \
            --define "_srcrpmdir $(pwd)"

      - name: Upload SRPM artifact
        uses: actions/upload-artifact@v3
        with:
          name: srpm-cortex
          path: "*.src.rpm"

  srpm-neuron:
    name: Build neuron SRPM
    runs-on: rpm
    needs: [fmt, clippy, test, cuda-check]
    if: startsWith(github.ref, 'refs/tags/v')
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Determine version
        id: version
        run: |
          VERSION="${GITHUB_REF#refs/tags/v}"
          echo "VERSION=${VERSION}" >> "$GITHUB_OUTPUT"

      - name: Stamp version
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          sed -i '/\[workspace\.package\]/,/\[/{ s/^version = ".*"/version = "'"${VERSION}"'"/ }' Cargo.toml
          sed -i "s/^Version:.*/Version: ${VERSION}/" helexa-neuron.spec

      - name: Generate changelog entry
        uses: https://git.lair.cafe/actions/rpm-changelog@v1
        with:
          spec: helexa-neuron.spec
          version: ${{ steps.version.outputs.VERSION }}

      - name: Generate source tarball
        run: |
          set -ex
          VERSION="${{ steps.version.outputs.VERSION }}"
          tar czf /tmp/helexa-neuron-${VERSION}.tar.gz \
            --transform "s,^\.,helexa-neuron-${VERSION}," \
            --exclude='./target' \
            --exclude='./.git' \
            --exclude='*.tar.gz' \
            --exclude='*.src.rpm' \
            .
          mv /tmp/helexa-neuron-${VERSION}.tar.gz .

      - name: Vendor Rust dependencies
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          cargo vendor vendor/
          tar czf helexa-neuron-${VERSION}-vendor.tar.gz vendor/
          rm -rf vendor/

      - name: Build SRPM
        run: |
          rpmbuild -bs helexa-neuron.spec \
            --define "_sourcedir $(pwd)" \
            --define "_srcrpmdir $(pwd)"

      - name: Upload SRPM artifact
        uses: actions/upload-artifact@v3
        with:
          name: srpm-neuron
          path: "*.src.rpm"

  copr-cortex:
    name: Publish cortex to COPR
    runs-on: fedora-43
    needs: srpm-cortex
    steps:
      - name: Download SRPM
        uses: actions/download-artifact@v3
        with:
          name: srpm-cortex

      - name: Publish to COPR
        uses: https://git.lair.cafe/actions/copr-publish@v1
        with:
          project: helexa/helexa
          srpm: "*.src.rpm"
          copr-config: ${{ secrets.COPR_CONFIG }}

  copr-neuron:
    name: Publish neuron to COPR
    runs-on: fedora-43
    needs: srpm-neuron
    steps:
      - name: Download SRPM
        uses: actions/download-artifact@v3
        with:
          name: srpm-neuron

      - name: Publish to COPR
        uses: https://git.lair.cafe/actions/copr-publish@v1
        with:
          project: helexa/helexa
          srpm: "*.src.rpm"
          copr-config: ${{ secrets.COPR_CONFIG }}

  bump-version:
    name: Bump version in source
    runs-on: rust
    needs: [copr-cortex, copr-neuron]
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Determine version
        id: version
        run: echo "VERSION=${GITHUB_REF#refs/tags/v}" >> "$GITHUB_OUTPUT"

      - name: Stamp version
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          sed -i '/\[workspace\.package\]/,/\[/{ s/^version = ".*"/version = "'"${VERSION}"'"/ }' Cargo.toml
          sed -i "s/^Version:.*/Version: ${VERSION}/" cortex.spec
          sed -i "s/^Version:.*/Version: ${VERSION}/" helexa-neuron.spec
          cargo check --workspace 2>/dev/null || true

      - name: Generate cortex changelog entry
        uses: https://git.lair.cafe/actions/rpm-changelog@v1
        with:
          spec: cortex.spec
          version: ${{ steps.version.outputs.VERSION }}

      - name: Generate helexa-neuron changelog entry
        uses: https://git.lair.cafe/actions/rpm-changelog@v1
        with:
          spec: helexa-neuron.spec
          version: ${{ steps.version.outputs.VERSION }}

      - name: Commit and push
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}
        run: |
          VERSION="${{ steps.version.outputs.VERSION }}"
          git config user.name "Gitea Actions"
          git config user.email "actions@git.lair.cafe"
          git add Cargo.toml Cargo.lock cortex.spec helexa-neuron.spec
          if git diff --cached --quiet; then
            echo "Nothing to commit for ${VERSION}"
          else
            git commit -m "chore: bump version to ${VERSION}"
            git remote set-url origin "https://gitea-actions:${GITEA_TOKEN}@git.lair.cafe/helexa/cortex.git"
            git push origin HEAD:main
          fi