cortex

helexa/cortex

rob thijssen 18ae3c30ee

CI / Format (push) Successful in 34s
CI / Clippy (push) Successful in 2m17s
CI / Test (push) Successful in 4m16s
CI / Build cortex SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Resolve version stamps (push) Successful in 35s
build-prerelease / Build cortex binary (push) Successful in 4m28s
build-prerelease / Build neuron-blackwell (push) Successful in 3m42s
build-prerelease / Build neuron-ampere (push) Successful in 4m27s
build-prerelease / Build neuron-ada (push) Successful in 4m51s
build-prerelease / Package cortex RPM (push) Successful in 1m25s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 2m50s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m40s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 6m52s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 2m32s

post-validation cleanup: cuDNN runtime + repetition penalty

Two followups from the live single-GPU validation pass.

1. deploy.sh now ensures libcudnn.so.9 is available on each neuron
   host before installing/upgrading the package. Probes ldconfig first
   so hosts with a manual (tar/runfile) cuDNN install are untouched,
   then adds NVIDIA's RHEL9 CUDA repo (the Fedora 43 CUDA repo doesn't
   ship cuDNN; only the RHEL9 one does) and installs libcudnn9-cuda-13.
   benjy hit "cannot open shared object file: libcudnn.so.9" during
   validation; this prevents that recurring.

2. candle.rs applies a 1.1 repetition penalty over the last 64
   generated tokens before sampling, in both the non-streaming
   chat_completion path and the streaming chat_completion_stream
   path. Without it small Q4_K_M models degenerate into "Wait, no,
   no..." loops once they hit a confident-but-wrong path; with it
   sampling stays coherent. Defaults match mistral.rs and llama.cpp;
   exposing the value via the OpenAI request (frequency/presence
   penalty mapping) is Stage 8 territory.

Both paths route through a new sample_with_penalty() helper so future
sampling tweaks land in one place.
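
For readers unfamiliar with the technique in item 2, here is a minimal
sketch of a repetition penalty in Rust. It is not the code from
candle.rs: it works on a plain slice of logits rather than candle
tensors, the function names apply_repeat_penalty and
sample_with_penalty_sketch are hypothetical, and greedy argmax stands in
for whatever sampler the real helper uses. The only values taken from the
commit message are the 1.1 penalty and the 64-token window. The rule
shown (divide positive logits, multiply negative ones) is the common
formulation and is assumed here, not confirmed by the source.

```rust
use std::collections::HashSet;

// Values from the commit message; the constant names are illustrative.
const REPEAT_PENALTY: f32 = 1.1;
const REPEAT_LAST_N: usize = 64;

/// Lowers the logit of every token seen in the last `last_n` entries of
/// `generated`. Positive logits are divided by `penalty` and negative ones
/// multiplied, so a penalised token always becomes less likely.
fn apply_repeat_penalty(logits: &mut [f32], generated: &[u32], penalty: f32, last_n: usize) {
    let start = generated.len().saturating_sub(last_n);
    let mut seen = HashSet::new();
    for &tok in &generated[start..] {
        // Penalise each distinct token once, however often it repeats.
        if !seen.insert(tok) {
            continue;
        }
        // Token ids outside the vocabulary are ignored.
        if let Some(logit) = logits.get_mut(tok as usize) {
            if *logit >= 0.0 {
                *logit /= penalty;
            } else {
                *logit *= penalty;
            }
        }
    }
}

/// Assumed shape of a shared helper: penalise, then pick a token.
/// Greedy argmax is a stand-in for the real sampling step.
fn sample_with_penalty_sketch(logits: &[f32], generated: &[u32]) -> Option<u32> {
    let mut adjusted = logits.to_vec();
    apply_repeat_penalty(&mut adjusted, generated, REPEAT_PENALTY, REPEAT_LAST_N);
    adjusted
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i as u32)
}
```

Calling the same helper from both the non-streaming and the streaming
loop, passing the tokens generated so far, keeps the two paths from
drifting apart when the penalty or window is later made configurable.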

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-19 14:48:08 +03:00

deploy.sh                  post-validation cleanup: cuDNN runtime + repetition penalty       2026-05-19 14:48:08 +03:00
generate-packages-json.py  ci: add build-prerelease workflow for CUDA RPMs on rpm.lair.cafe  2026-05-18 17:01:35 +03:00
validate-neuron.sh         fix(validate-neuron): jq for JSON, say→stderr, sane max_tokens    2026-05-19 13:43:02 +03:00