helexa/cortex

rob thijssen 825bf4e905

build-prerelease / Resolve version stamps (push) Successful in 30s
CI / CUDA type-check (push) Successful in 31s
CI / Format (push) Successful in 42s
build-prerelease / Build cortex binary (push) Successful in 5m9s
build-prerelease / Build neuron-blackwell (push) Successful in 6m4s
build-prerelease / Package cortex RPM (push) Successful in 1m32s
CI / Test (push) Successful in 7m19s
build-prerelease / Build neuron-ampere (push) Successful in 8m40s
build-prerelease / Build neuron-ada (push) Successful in 5m17s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 3m0s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 3m1s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m53s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m14s
CI / Clippy (push) Successful in 2m29s
CI / Build cortex SRPM (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped

feat(neuron): M-RoPE Stage 4 — wire interleaved M-RoPE into the TP path

Mirror Stage 3 into the tensor-parallel Qwen3.6 model:

- TpQwen3_5Attention / DecoderLayer take (cos, sin) instead of a scalar
  offset and apply via apply_cos_sin.
- TpQwen3_5Model gains the replicated rotary + rope_delta (reset in
  clear_kv_cache, settable). forward_inner builds the cos/sin once —
  interleaved M-RoPE from explicit position_ids (vision) or plain at
  offset+rope_delta (text/decode). forward() and forward_with_positions()
  delegate; the old single-shot forward_with_vision is gone.
- prefill_with_images_chunked now computes get_rope_index over the whole
  prompt once, stores rope_delta on the base model, and slices the
  (3, prompt_len) position tensor per chunk. Every rank therefore assigns
  image tokens their 14×14 grid coordinates and steps in lockstep. Every
  chunk, text or image, carries its M-RoPE slice because the image shifts
  the positions of the surrounding text.
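
A minimal sketch of the position bookkeeping described in the last bullet,
in plain Rust without the tensor library. The names rope_index, chunk_slice
and decode_position are hypothetical stand-ins, not the crate's API, and the
layout (text tokens share one index on all three axes, image tokens keep a
fixed temporal index while height and width step over the grid, text after
the image resumes at the largest image position plus one) is an assumption
based on the usual Qwen-VL convention, for a prompt holding a single image.

```rust
/// Positions for one prompt: three axes (temporal, height, width),
/// each holding `prompt_len` entries, plus the offset used at decode time.
struct RopeIndex {
    axes: [Vec<f32>; 3],
    rope_delta: i64,
}

/// Hypothetical stand-in for `get_rope_index`, single image only.
/// `image_start` is the index of the first image token; the image
/// occupies `grid_h * grid_w` consecutive tokens.
fn rope_index(prompt_len: usize, image_start: usize, grid_h: usize, grid_w: usize) -> RopeIndex {
    let image_len = grid_h * grid_w;
    assert!(image_start + image_len <= prompt_len, "image must fit in the prompt");
    let mut axes: [Vec<f32>; 3] = Default::default();
    let mut next = 0usize; // next free position index
    let mut tok = 0usize; // current token index
    while tok < prompt_len {
        if tok == image_start && image_len > 0 {
            // Image tokens: constant temporal index, row/column offsets on h/w.
            for row in 0..grid_h {
                for col in 0..grid_w {
                    axes[0].push(next as f32);
                    axes[1].push((next + row) as f32);
                    axes[2].push((next + col) as f32);
                }
            }
            // The image consumes only max(h, w) positions, not h * w.
            next += grid_h.max(grid_w);
            tok += image_len;
        } else {
            // Text tokens: the same index on all three axes.
            for axis in axes.iter_mut() {
                axis.push(next as f32);
            }
            next += 1;
            tok += 1;
        }
    }
    // Negative whenever the image used fewer positions than tokens.
    RopeIndex { axes, rope_delta: next as i64 - prompt_len as i64 }
}

/// Slice the (3, prompt_len) positions for one prefill chunk.
/// Panics if `start` lies past the end of the prompt.
fn chunk_slice(idx: &RopeIndex, start: usize, len: usize) -> [&[f32]; 3] {
    let end = (start + len).min(idx.axes[0].len());
    [&idx.axes[0][start..end], &idx.axes[1][start..end], &idx.axes[2][start..end]]
}

/// Plain-RoPE position for text/decode steps: token offset plus rope_delta.
fn decode_position(offset: usize, rope_delta: i64) -> i64 {
    offset as i64 + rope_delta
}
```

Under these assumptions a 14×14 image gives rope_delta = 14 - 196 = -182,
which is why a text-only chunk after the image cannot fall back to its raw
token offset and still needs its slice of the precomputed positions.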

Also build the position-id tensor as f32 directly (positions are small
integers, exact in f32) to avoid an i64→f32 cast on the GPU.
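
The exactness claim can be checked in isolation. The sketch below builds one
axis of position ids directly as f32 and guards the range; positions_f32 and
F32_EXACT_INT_LIMIT are hypothetical names for illustration, not identifiers
from the crate. f32 has a 24-bit significand, so every integer from 0 to
2^24 is represented exactly.

```rust
/// Every integer in 0..=2^24 is exactly representable in f32.
const F32_EXACT_INT_LIMIT: i64 = 1 << 24;

/// Build `len` consecutive position ids starting at `start`, as f32.
/// Hypothetical helper; asserts that the whole range stays exact.
fn positions_f32(start: i64, len: usize) -> Vec<f32> {
    assert!(start >= 0 && start + len as i64 <= F32_EXACT_INT_LIMIT);
    (0..len as i64).map(|i| (start + i) as f32).collect()
}
```

Past 2^24 the spacing between adjacent f32 values grows to 2, so 2^24 + 1
rounds back to 2^24. Prompt positions stay far below that bound.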

The TP forward is cuda-gated — CI CUDA type-check is the compile gate.
Non-cuda build + clippy + full workspace tests green; rope math + the
plain-RoPE-reduction invariant covered by unit tests.
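
As an illustration of the plain-RoPE-reduction invariant: when the temporal,
height and width positions of a token are equal, interleaved M-RoPE should
yield the same cos/sin as plain RoPE at that position. The sketch below is
not the crate's unit test. The function names are hypothetical, and the axis
assignment (frequency i reads axis i % 3) is a simplifying assumption, since
the commit does not state the real interleave layout or section sizes.

```rust
/// Standard RoPE inverse frequencies: theta^(-2i/d) for i in 0..d/2.
fn inv_freqs(head_dim: usize, theta: f32) -> Vec<f32> {
    (0..head_dim / 2)
        .map(|i| theta.powf(-2.0 * i as f32 / head_dim as f32))
        .collect()
}

/// Plain RoPE: every frequency uses the same scalar position.
fn plain_cos_sin(pos: f32, inv: &[f32]) -> (Vec<f32>, Vec<f32>) {
    inv.iter()
        .map(|f| {
            let angle = pos * f;
            (angle.cos(), angle.sin())
        })
        .unzip()
}

/// Interleaved M-RoPE (assumed layout): frequency i reads axis i % 3
/// of the (temporal, height, width) position triple.
fn interleaved_cos_sin(pos: [f32; 3], inv: &[f32]) -> (Vec<f32>, Vec<f32>) {
    inv.iter()
        .enumerate()
        .map(|(i, f)| {
            let angle = pos[i % 3] * f;
            (angle.cos(), angle.sin())
        })
        .unzip()
}
```

The invariant holds for any axis assignment, because with equal positions
every frequency sees the same scalar whichever axis it reads.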

Completes the interleaved-M-RoPE work for the vision spatial misread.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-04 18:46:27 +03:00

src         2026-06-04 18:46:27 +03:00  feat(neuron): M-RoPE Stage 4 — wire interleaved M-RoPE into the TP path
tests       2026-06-01 13:42:11 +03:00  feat(neuron,cortex-core): source-aware loader (scheme:org/name)
build.rs    2026-05-21 11:34:11 +03:00  feat(stage-8d-1): import mistralrs GDN CUDA kernels — build infra only
Cargo.toml  2026-06-04 16:32:23 +03:00  fix(neuron): render HF chat templates via minijinja pycompat