fix(qwen3_5): tensor names are under model.language_model.*, not model.*

Qwen3-Next is a multimodal architecture whose text core sits under `model.language_model.*`, a sibling to `model.visual.*` (the vision tower) and to the top-level `lm_head` / `mtp.*`. Every text-side tensor in the safetensors files carries that prefix:

    model.language_model.embed_tokens.weight
    model.language_model.layers.{i}.{input,post_attention}_layernorm.weight
    model.language_model.layers.{i}.linear_attn.{in_proj_*, conv1d.weight, A_log, dt_bias, norm.weight, out_proj.weight}
    model.language_model.layers.{i}.self_attn.{q,k,v,o}_proj.weight + {q,k}_norm.weight
    model.language_model.layers.{i}.mlp.{gate,up,down}_proj.weight
    model.language_model.norm.weight
    lm_head.weight    (top-level; not under language_model)

The fix is a single pre-emptive change in `Qwen3_5Model::load`: derive `text_vb = vb.pp("model.language_model")` once and walk embed_tokens / layers / norm from there. `lm_head` stays at the top-level VB; that path was already correct.

The non-text tensors (`model.visual.*`, `mtp.*`) are ignored: we never reference them, so the safetensors mmap is fine even though their bytes are mapped into the address space.

After this, the load that was failing with "cannot find tensor model.embed_tokens.weight" should proceed to materialising the actual layer weights, where any further bugs will be substantive architecture issues rather than naming ones.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
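For illustration, the prefix derivation described above can be sketched without any ML dependency. `Scope` and `text_core_names` below are hypothetical stand-ins written for this note; they are not the candle `VarBuilder` API or the code in this crate, and they only imitate the dot-joined path behaviour of `pp` to show which names the loader would end up requesting. Only a subset of the per-layer tensors is enumerated.

```rust
/// Hypothetical stand-in for a prefix-scoped tensor lookup.
/// This is NOT the real VarBuilder; it only tracks the dotted path.
#[derive(Clone, Debug)]
pub struct Scope {
    prefix: String,
}

impl Scope {
    /// Top-level scope with an empty prefix.
    pub fn root() -> Self {
        Self { prefix: String::new() }
    }

    /// Push a path segment, joining with '.' (imitates the `pp` calls in the diff).
    pub fn pp(&self, segment: &str) -> Self {
        Self { prefix: self.name(segment) }
    }

    /// Fully qualified name of `tensor` inside this scope.
    pub fn name(&self, tensor: &str) -> String {
        if self.prefix.is_empty() {
            tensor.to_string()
        } else {
            format!("{}.{}", self.prefix, tensor)
        }
    }
}

/// Names a loader shaped like the one in this commit would request.
/// Assumption: one representative tensor per layer is enough to show the pattern.
pub fn text_core_names(num_layers: usize) -> Vec<String> {
    let vb = Scope::root();
    // Derive the text-core scope once; everything text-side hangs off it.
    let text_vb = vb.pp("model.language_model");

    let mut names = vec![text_vb.pp("embed_tokens").name("weight")];

    let vb_l = text_vb.pp("layers");
    for i in 0..num_layers {
        names.push(vb_l.pp(&i.to_string()).pp("input_layernorm").name("weight"));
    }

    names.push(text_vb.pp("norm").name("weight"));
    // lm_head is resolved from the top-level scope, not from text_vb.
    names.push(vb.pp("lm_head").name("weight"));
    names
}
```

Calling `text_core_names(2)` yields the embedding, two layer-norm names, the final norm, and `lm_head.weight`, with only the last one outside `model.language_model`. Deriving the scope once keeps the prefix in a single place, so a future layout change would touch one line rather than three.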
2026-05-20 16:47:51 +03:00
parent 07c44d5db1
commit a77f19686e
1 changed files with 12 additions and 3 deletions
--- a/crates/neuron/src/harness/arch/qwen3_5/mod.rs
+++ b/crates/neuron/src/harness/arch/qwen3_5/mod.rs
@@ -223,7 +223,15 @@ impl Qwen3_5Model {
        let dtype = vb.dtype();
        let device = vb.device().clone();
-        let embed_vb = vb.pp("model.embed_tokens");
+        // Qwen3-Next is a multimodal architecture whose text core lives
+        // under `model.language_model.*` — sibling to `model.visual.*`
+        // (the vision tower) and to top-level `lm_head` / `mtp.*`.
+        // Every text-side tensor in the safetensors files is under
+        // this prefix; we ignore the vision and MTP weights for
+        // language-model inference.
+        let text_vb = vb.pp("model.language_model");
+        let embed_vb = text_vb.pp("embed_tokens");
        let embed_weight = embed_vb
            .get((cfg.vocab_size, cfg.hidden_size), "weight")
            .with_context(|| format!("load '{}/weight'", embed_vb.prefix()))?;
@@ -240,7 +248,7 @@ impl Qwen3_5Model {
            );
        }
-        let vb_l = vb.pp("model.layers");
+        let vb_l = text_vb.pp("layers");
        let mut layers = Vec::with_capacity(cfg.num_hidden_layers);
        for i in 0..cfg.num_hidden_layers {
            layers.push(Qwen3_5DecoderLayer::load(
@@ -251,7 +259,8 @@ impl Qwen3_5Model {
            )?);
        }
-        let norm = Qwen3_5RmsNorm::load(&vb.pp("model.norm"), cfg.hidden_size, cfg.rms_norm_eps)?;
+        let norm =
+            Qwen3_5RmsNorm::load(&text_vb.pp("norm"), cfg.hidden_size, cfg.rms_norm_eps)?;
        Ok(Self {
            embed_tokens,