Vision: tensor-parallel implementation (Stage E) #12
Context
Deferred during planning of the initial vision capability for
Qwen3.6-27B (umbrella: #3). Stage A–C of that plan ships single-GPU
vision only. Refs:
~/.claude/plans/foamy-twirling-catmull.md.
Problem
Qwen3.6-27B in BF16 is ~54 GB and does not fit on a single 5090
(32 GB). Without TP-vision, the helexa stack can only serve vision
requests against the 27B by quantising heavily (Q4/Q5 ISQ on one
GPU) — which sacrifices the quality that motivates running 27B in
the first place. This issue tracks bringing vision support to the
TP path so beast/benjy (2×5090, 32 GB each) can serve full-quality
multimodal Qwen3.6-27B.
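The sizing argument above can be checked with a rough back-of-the-envelope sketch. It uses only the figures quoted in this issue (27B parameters, BF16, 32 GB per card, a vision tower of up to ~2 GB) and assumes the LM weights split evenly across ranks. The function names are hypothetical, and the estimate deliberately ignores KV cache, activations and the replicated embedding layer, so treat it as a lower bound rather than a measurement.

```rust
/// Weight footprint in GB (10^9 bytes) for a model with
/// `params_billion` billion parameters at `bytes_per_param` bytes each.
fn weights_gb(params_billion: f64, bytes_per_param: f64) -> f64 {
    params_billion * bytes_per_param
}

/// Rough per-rank weight footprint: LM weights split evenly over `tp`
/// ranks, plus weights that are replicated on every rank (here, the
/// vision tower). Assumption: an even split, no other replicated state.
fn per_rank_gb(lm_gb: f64, replicated_gb: f64, tp: u32) -> f64 {
    lm_gb / f64::from(tp) + replicated_gb
}

/// True if the estimated weights fit within `vram_gb` on one card.
fn fits(per_rank: f64, vram_gb: f64) -> bool {
    per_rank < vram_gb
}
```

With 2 bytes per BF16 parameter, `weights_gb(27.0, 2.0)` gives the 54 GB quoted above, which does not fit in 32 GB. `per_rank_gb(54.0, 2.0, 2)` comes to 29 GB of weights per rank, which does, with the caveat that the remaining headroom has to cover everything the sketch leaves out.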
Scope
- Bring the VisionTower struct from
  crates/neuron/src/harness/arch/qwen3_5/vision.rs (introduced by
  Stage A) into the TP path at
  crates/neuron/src/harness/tp/tp_qwen3_5.rs.
- Replicate the vision tower on every rank rather than sharding it.
  It is small relative to the LM (~1-2 GB out of 54 GB for the LM in
  BF16); sharding ViT layers across NCCL would add latency without
  meaningful VRAM benefit, and the embedding layer is already
  replicated per tp_qwen3_5.rs:30-31.
- Decide how image input reaches the ranks: either every rank encodes
  the same source bytes (broadcast pre-decode by the leader), or
  rank 0 encodes and broadcasts the resulting patch embeddings via
  NCCL Broadcast. Net data volume favours the latter for large
  images; the former is simpler. Pick one with a justifying note.
- Add Job variants on the TP worker pool for image encoding +
  image-aware forward, mirroring the single-GPU variants from
  Stage A/B.
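To make the last two scope items concrete, here is a minimal sketch of what the distribution choice and the new worker jobs might look like. Every name here (ImageDistribution, the Job variants and their fields, runs_vision_tower) is hypothetical; the real Job enum in crates/neuron/src/harness/tp/ may be shaped quite differently, and no NCCL call is shown.

```rust
/// Hypothetical: how image input reaches the TP ranks.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ImageDistribution {
    /// Leader broadcasts the undecoded source bytes; every rank
    /// decodes them and runs its own replica of the vision tower.
    BroadcastSourceBytes,
    /// Rank 0 runs the vision tower and broadcasts the resulting
    /// patch embeddings to the other ranks.
    BroadcastPatchEmbeddings,
}

/// Hypothetical job set for the TP worker pool. The two image
/// variants stand in for "image encoding" and "image-aware forward".
#[allow(dead_code)]
#[derive(Debug)]
enum Job {
    /// Text-only forward pass.
    Forward { tokens: Vec<u32> },
    /// Encode one image into patch embeddings.
    EncodeImage { image_bytes: Vec<u8> },
    /// Forward pass with patch embeddings spliced into the prompt.
    ForwardWithImage {
        tokens: Vec<u32>,
        patch_embeddings: Vec<f32>,
        num_patches: usize,
    },
}

/// Whether `rank` has to run the vision tower under `strategy`.
fn runs_vision_tower(rank: usize, strategy: ImageDistribution) -> bool {
    match strategy {
        ImageDistribution::BroadcastSourceBytes => true,
        ImageDistribution::BroadcastPatchEmbeddings => rank == 0,
    }
}
```

Under the first strategy every rank does redundant encoding work but only small messages cross ranks before decode; under the second only rank 0 encodes and the other ranks wait on the broadcast. Whichever is picked, the justifying note asked for above should say which trade-off mattered.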
Acceptance
- tensor_parallel = 2 with a vision-capable config produces a
  working multimodal model on beast.
- An image request returns text via the TP path; prompt_tokens
  includes patch tokens.
- cargo test -p neuron --features cuda-integration reports ok on a
  CUDA host with NCCL.
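The prompt_tokens criterion can be illustrated with a small accounting sketch. Only the field name prompt_tokens comes from this issue; the Usage struct and prompt_usage function are hypothetical, and how many patch tokens a given image yields is left to the vision tower.

```rust
/// Hypothetical usage record reported back with a completion.
#[derive(Debug, PartialEq, Eq)]
struct Usage {
    prompt_tokens: usize,
}

/// Counts prompt tokens as text tokens plus the patch tokens of every
/// image in the request, so image input is visible in the usage figure.
fn prompt_usage(text_tokens: usize, patch_tokens_per_image: &[usize]) -> Usage {
    let patch_total: usize = patch_tokens_per_image.iter().sum();
    Usage {
        prompt_tokens: text_tokens + patch_total,
    }
}
```

An integration test along these lines could assert that the same text prompt reports a strictly larger prompt_tokens once an image is attached.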
Blocked by
Stage A–C of the vision plan must land first. This issue is the
gate to production use of vision on Qwen3.6-27B; without it,
hosting full-quality vision on beast/benjy is impossible.
References
- ~/.claude/plans/foamy-twirling-catmull.md
- crates/neuron/src/harness/tp/tp_qwen3_5.rs
- crates/neuron/src/harness/tp/mod.rs
- crates/neuron/src/harness/tp/tp_qwen3_5.rs:30-31