From 577781de8d12fa32839a4eeff30d7ef363a76478 Mon Sep 17 00:00:00 2001
From: rob thijssen
Date: Tue, 2 Jun 2026 15:51:57 +0300
Subject: [PATCH] fix(neuron): derive Clone on ImageInput for the CUDA vision dispatch
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

CUDA type-check in CI failed on commit 24968e9 with E0308:

    error[E0308]: mismatched types
        --> crates/neuron/src/harness/candle.rs:1707:33
    1707 |                 images.clone(),
         |                 ^^^^^^^^^^^^^^ expected `Vec<ImageInput>`, found `&Vec<ImageInput>`

In Stage B5 the cuda branch of `chat_completion` matches
`&vision_route` to keep the `vision_route: Option<...>` alive for both
arms, which makes `images` bind as `&Vec<ImageInput>`. The subsequent
`images.clone()` call doesn't deep-clone because `ImageInput` doesn't
derive `Clone`, so `Vec<ImageInput>` is not `Clone` either. rustc falls
back to cloning the `&Vec<ImageInput>` reference, which has the wrong
type for the worker job.

The CPU build (non-cuda) compiled fine because that branch is behind
`#[cfg(feature = "cuda")]`; the cuda-check job is what catches the
regression.

Fix: derive `Clone` on `ImageInput`. The clone cost is one pixel-buffer
memcpy per image (3 × 448 × 448 × 4 bytes = 2,408,448 bytes, about
2.3 MiB, at the fixed 448×448 resolution), which is fine on the
chat-completion hot path: vision requests are far less frequent than
text-only decode steps.

Co-Authored-By: Claude Opus 4.7
---
 crates/neuron/src/harness/device_worker/jobs.rs | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/crates/neuron/src/harness/device_worker/jobs.rs b/crates/neuron/src/harness/device_worker/jobs.rs
index 38d2d98..d0b023d 100644
--- a/crates/neuron/src/harness/device_worker/jobs.rs
+++ b/crates/neuron/src/harness/device_worker/jobs.rs
@@ -32,6 +32,13 @@ pub struct TpHandle(pub u64);
 /// `Job::EncodeImage`. Pixels are row-major `(c, h, w)` f32 — the
 /// shape `harness::preprocess::preprocess` produces. Carries the
 /// shape inline since `Vec<f32>` is rank-1.
+///
+/// `Clone` so the vision-aware dispatch in `chat_completion` can
+/// match `&vision_route` (carrying borrowed images) and still hand an
+/// owned `Vec<ImageInput>` to the worker job. The clone cost is one
+/// pixel-buffer memcpy per image, which is fine at fixed-resolution
+/// sizes (3 × 448 × 448 × 4 bytes = ~2.3 MiB per image).
+#[derive(Clone)]
 pub struct ImageInput {
     pub pixels: Vec<f32>,
     pub c: usize,
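
Reviewer note: the method-resolution behaviour described in the commit message can be reproduced outside the crate. The sketch below is a minimal stand-in, not the code in `candle.rs`: `submit`, `dispatch`, and `pixel_bytes` are hypothetical names, and the `h` and `w` fields are assumed from the `(c, h, w)` doc comment, since only `pixels` and `c` are visible in the hunk.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the real `ImageInput`. `pixels` and `c` appear in
// the diff; `h` and `w` are assumed from the "(c, h, w)" doc comment.
#[derive(Clone, Debug, PartialEq)]
pub struct ImageInput {
    pub pixels: Vec<f32>,
    pub c: usize,
    pub h: usize,
    pub w: usize,
}

// Hypothetical worker entry point: it takes ownership of the images.
pub fn submit(images: Vec<ImageInput>) -> usize {
    images.len()
}

// Matching on a reference binds `images` as `&Vec<ImageInput>`.
pub fn dispatch(vision_route: &Option<Vec<ImageInput>>) -> usize {
    match vision_route {
        // With `ImageInput: Clone`, `.clone()` resolves to `Vec::clone` and
        // yields an owned `Vec<ImageInput>`. Without the derive, method
        // resolution falls through to `<&Vec<_> as Clone>::clone`, which only
        // copies the reference and fails to type-check here (E0308).
        Some(images) => submit(images.clone()),
        None => 0,
    }
}

// Bytes copied per image clone: c * h * w elements of f32.
pub fn pixel_bytes(c: usize, h: usize, w: usize) -> usize {
    c * h * w * size_of::<f32>()
}
```

Removing `Clone` from the derive list on the stand-in struct should reproduce the same `expected Vec, found &Vec` mismatch at the `submit(images.clone())` call. `pixel_bytes(3, 448, 448)` gives the per-image copy size quoted in the commit message.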