cortex

helexa/cortex


rob thijssen f084aaab8e

fix(stage-8e-2c): cast bf16/f16 activations to f32 around QMatMul

candle's QTensor::cuda_fwd requires f32 inputs because its on-the-fly
GGUF dequantization accumulates in f32. The model dtype flowing into
MaybeQuantLinear::forward is bf16, so QMatMul::forward errored with
"unexpected dtype, expected: F32, got: BF16".

Wrap the Quant arm to cast the activation to f32 before the matmul
and cast the result back to the input dtype. Each cast is a single
kernel launch on the activation tensor (small relative to weight
traffic); it's the price of in-situ GGUF-style quantization, and it is
what mistralrs does inside its own Linear wrapper.

The Plain arm is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
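
The cast-around-matmul pattern described in the commit message can be sketched with a std-only toy model. Everything below is a hypothetical stand-in: `DType`, `Tensor`, `QMatMul` and this `MaybeQuantLinear` are simplified illustrations, not candle's types and not the code in this repository. The "weight" is reduced to a single scale factor so the example stays self-contained.

```rust
use std::fmt;

/// Toy dtype tag (hypothetical; not candle's `DType`).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DType {
    F32,
    BF16,
    F16,
}

/// Toy tensor: values are always stored as f32 and only tagged with a nominal dtype.
#[derive(Clone, Debug, PartialEq)]
pub struct Tensor {
    pub data: Vec<f32>,
    pub dtype: DType,
}

impl Tensor {
    /// Stand-in for a dtype cast; a real cast would also convert the storage.
    pub fn to_dtype(&self, dtype: DType) -> Tensor {
        Tensor { data: self.data.clone(), dtype }
    }
}

/// Error raised when the quantized matmul receives a non-f32 input.
#[derive(Debug, PartialEq)]
pub struct DTypeError {
    expected: DType,
    got: DType,
}

impl fmt::Display for DTypeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unexpected dtype, expected: {:?}, got: {:?}", self.expected, self.got)
    }
}

/// Toy quantized matmul that, like the behaviour described above, accepts only f32.
pub struct QMatMul {
    pub scale: f32,
}

impl QMatMul {
    pub fn forward(&self, x: &Tensor) -> Result<Tensor, DTypeError> {
        if x.dtype != DType::F32 {
            return Err(DTypeError { expected: DType::F32, got: x.dtype });
        }
        let data = x.data.iter().map(|v| v * self.scale).collect();
        Ok(Tensor { data, dtype: DType::F32 })
    }
}

/// Linear layer that is either plain or quantized (assumed shape of the real enum).
pub enum MaybeQuantLinear {
    Plain { scale: f32 },
    Quant(QMatMul),
}

impl MaybeQuantLinear {
    pub fn forward(&self, x: &Tensor) -> Result<Tensor, DTypeError> {
        match self {
            // Plain arm: runs in the input dtype, no casts.
            Self::Plain { scale } => Ok(Tensor {
                data: x.data.iter().map(|v| v * scale).collect(),
                dtype: x.dtype,
            }),
            // Quant arm: cast to f32, run the matmul, cast back to the input dtype.
            Self::Quant(q) => {
                let in_dtype = x.dtype;
                let y = q.forward(&x.to_dtype(DType::F32))?;
                Ok(y.to_dtype(in_dtype))
            }
        }
    }
}
```

In this sketch, calling `QMatMul::forward` directly on a bf16-tagged tensor returns the dtype error, while going through `MaybeQuantLinear::Quant` succeeds and hands back a tensor tagged with the caller's original dtype.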

2026-05-21 20:05:19 +03:00

cuda

feat(stage-8d-1): import mistralrs GDN CUDA kernels — build infra only

2026-05-21 11:34:11 +03:00

harness

fix(stage-8e-2c): cast bf16/f16 activations to f32 around QMatMul

2026-05-21 20:05:19 +03:00

api.rs

chore(neuron): log load_model failures server-side with full chain

2026-05-19 13:08:54 +03:00

config.rs

feat(neuron): load default_models on service activation