Two interlocked bugs surfaced while trying to load Qwen/Qwen3.5-0.8B
(the same applies to Qwen/Qwen3.6-27B):
1. Qwen3-Next config.json does NOT have a top-level `rope_theta`.
It lives inside `rope_parameters: { rope_theta, partial_rotary_factor,
rope_type, mrope_section, mrope_interleaved }`. Our TextConfig
declared `rope_theta` as a non-optional top-level field, so the
deserializer bailed with the misleading "missing field
`rope_theta` at line 74 col 5".
Replaced it with a nested `RopeParameters` struct that mirrors the
upstream shape. Defaults are conservative (rope_theta=10000,
partial_rotary_factor=1.0), so a missing or partial block degrades
to standard full-rotation RoPE rather than failing.
2. `partial_rotary_factor: 0.25` means only `head_dim * 0.25 = 64` of
the 256 head_dim values get RoPE applied; the rest pass through
unchanged. Our RotaryEmbedding was building the inv_freq table
for the full head_dim and rotating everything, which was silently
wrong for every full-attention layer.
`RotaryEmbedding` now derives `rotary_dim` from
`head_dim * partial_rotary_factor`, builds its cos/sin tables at
that smaller size, and in `apply()` splits q/k into (rotate, pass)
on the last dim, applies `rope_slow` to the rotate slice only, and
re-concatenates. This mirrors the reference Python's
`apply_rotary_pos_emb` exactly for the non-trivial
`partial_rotary_factor` case.
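For illustration, the split / rotate / re-concatenate step from item 2
can be sketched on a single head vector in plain Rust. This is a
minimal sketch, not the actual `RotaryEmbedding` code: the helper names
`rotary_dim` and `apply_partial_rope` are hypothetical, it works on a
`&[f32]` instead of tensors, and it assumes the rotate-half pairing
(element i with element i + rotary_dim/2) used by the reference
`apply_rotary_pos_emb`.

```rust
/// Hypothetical helper: number of leading head_dim values that get RoPE.
fn rotary_dim(head_dim: usize, partial_rotary_factor: f64) -> usize {
    (head_dim as f64 * partial_rotary_factor) as usize
}

/// Hypothetical helper: rotate the first `rotary_dim` values of one head
/// vector at position `pos` and pass the remaining values through unchanged.
fn apply_partial_rope(x: &[f32], pos: usize, rotary_dim: usize, theta: f32) -> Vec<f32> {
    assert!(rotary_dim % 2 == 0 && rotary_dim <= x.len());
    let half = rotary_dim / 2;
    // Split on the last dim into the slice to rotate and the pass-through slice.
    let (rot, pass) = x.split_at(rotary_dim);
    let mut out = vec![0.0f32; x.len()];
    for i in 0..half {
        // Frequencies are sized by rotary_dim, not by the full head_dim.
        let inv_freq = 1.0 / theta.powf(2.0 * i as f32 / rotary_dim as f32);
        let (sin, cos) = (pos as f32 * inv_freq).sin_cos();
        // Rotate-half convention: x * cos + rotate_half(x) * sin.
        out[i] = rot[i] * cos - rot[i + half] * sin;
        out[i + half] = rot[i + half] * cos + rot[i] * sin;
    }
    // Re-concatenate: the pass-through slice is copied back untouched.
    out[rotary_dim..].copy_from_slice(pass);
    out
}
```

With `head_dim = 256` and `partial_rotary_factor = 0.25`,
`rotary_dim(256, 0.25)` is 64, so values 64..256 of each head come back
unchanged from `apply_partial_rope`.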
Tests updated: the config-deserialise fixture now uses the real
`rope_parameters` shape (matching the Qwen3.6-27B and Qwen3.5-0.8B
configs). The linear-attention forward-smoke test was already using
full rotation, which still works; it only needed to move to the
nested struct.
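The default-fallback behaviour described in item 1 can be sketched
without any deserializer. This is an assumption-laden illustration, not
the actual struct: only two of the upstream fields are shown, the `f64`
types are a guess, and `resolve` is a hypothetical stand-in for what
per-field defaults (for example serde's `#[serde(default)]`, if that is
what the real struct uses) would produce.

```rust
/// Hypothetical mirror of the nested `rope_parameters` block.
/// Field names follow the upstream config.json; types are assumed.
#[derive(Debug, Clone, PartialEq)]
struct RopeParameters {
    rope_theta: f64,
    partial_rotary_factor: f64,
}

impl Default for RopeParameters {
    fn default() -> Self {
        // Conservative defaults: standard full-rotation RoPE.
        Self { rope_theta: 10_000.0, partial_rotary_factor: 1.0 }
    }
}

/// Hypothetical stand-in for deserialization with per-field defaults:
/// any field absent from the config falls back to the default value.
fn resolve(rope_theta: Option<f64>, partial_rotary_factor: Option<f64>) -> RopeParameters {
    let d = RopeParameters::default();
    RopeParameters {
        rope_theta: rope_theta.unwrap_or(d.rope_theta),
        partial_rotary_factor: partial_rotary_factor.unwrap_or(d.partial_rotary_factor),
    }
}
```

A missing block corresponds to `resolve(None, None)`, which yields the
full-rotation defaults; a partial block such as
`resolve(None, Some(0.25))` keeps the default `rope_theta` and takes
the supplied factor.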
After this, the load that previously failed at "parse Qwen3-Next
(qwen3_5) config.json: missing field `rope_theta`" should reach the
actual safetensors materialisation step.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>