fix(neuron): TP-vision Stage 0 — reject image requests on the TP path
Some checks failed
build-prerelease / Resolve version stamps (push) Waiting to run
CI / Format (push) Waiting to run
CI / CUDA type-check (push) Successful in 32s
build-prerelease / Build cortex binary (push) Has been cancelled
build-prerelease / Build neuron-blackwell (push) Has been cancelled
build-prerelease / Build neuron-ampere (push) Has been cancelled
build-prerelease / Build neuron-ada (push) Has been cancelled
build-prerelease / Package cortex RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-ada RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-ampere RPM (push) Has been cancelled
build-prerelease / Package helexa-neuron-blackwell RPM (push) Has been cancelled
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Has been cancelled
CI / Clippy (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build cortex SRPM (push) Has been cancelled
CI / Build neuron SRPM (push) Has been cancelled
CI / Publish cortex to COPR (push) Has been cancelled
CI / Publish neuron to COPR (push) Has been cancelled
CI / Bump version in source (push) Has been cancelled
The TP inference path has no vision tower, and the TP dispatch in chat_completion / inference_stream returns before the VisionUnsupported guard runs. As a result, an image request to a TP-loaded model (e.g. beast's tp=2 Qwen3.6-27B) was silently dropped and answered from text alone: the exact issue-#3 confident-hallucination pattern Stage C killed for single-GPU.

Add the request_has_images → VisionUnsupported guard to both chat_completion_tp and inference_tp_stream, before prefill / before the SSE stream opens, so beast returns a clean 400 vision_unsupported. The guard is unconditional for now (TP has no tower); Stage 3 makes it conditional on the TP model's has_vision once real TP-vision lands.

Detection is covered by the existing request_has_images unit test; the guard itself is cuda-gated (validated by CI's CUDA type-check).

Refs TP-vision plan Stage 0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1
.claude/scheduled_tasks.lock
Normal file
@@ -0,0 +1 @@
+{"sessionId":"a27586bb-2ca7-4e92-8d82-12f41b39f392","pid":3106893,"procStart":"59753850","acquiredAt":1780571089798}
@@ -2739,6 +2739,18 @@ impl CandleHarness {
             return Err(poisoned_error(&model_id));
         }

+        // Stage 0 (TP-vision): the TP path has no vision tower yet, so
+        // an image-bearing request can't be honoured. Reject it cleanly
+        // with `vision_unsupported` instead of silently dropping the
+        // image and answering from text alone (the issue-#3 confident-
+        // hallucination pattern). Made conditional on the TP model's
+        // `has_vision` once Stage 3 wires real TP-vision.
+        if request_has_images(&request) {
+            let _g = span.enter();
+            tracing::warn!("TP chat_completion: rejecting image request, TP vision unsupported");
+            return Err(InferenceError::VisionUnsupported { model_id });
+        }
+
         let tp_for_marker = Arc::clone(&tp);
         let handle = tokio::spawn(chat_completion_tp_inner(tp, request).instrument(span.clone()));
         match handle.await {
@@ -2816,6 +2828,20 @@ impl CandleHarness {
             return Err(poisoned_error(&request.model));
         }

+        // Stage 0 (TP-vision): reject image requests on the TP streaming
+        // path before opening the SSE stream — the TP path has no vision
+        // tower yet, so honouring the image is impossible and silently
+        // dropping it would hallucinate. Returns a clean 400; made
+        // conditional on `has_vision` in Stage 3.
+        if request_has_images(&request) {
+            tracing::warn!(
+                "TP chat_completion (stream): rejecting image request, TP vision unsupported"
+            );
+            return Err(InferenceError::VisionUnsupported {
+                model_id: request.model.clone(),
+            });
+        }
+
         let prompt = build_prompt_for_request(tp.chat_template.as_deref(), &request);
         let encoding = tp
             .tokenizer
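The guard added in both hunks can be sketched in isolation as follows. This is a minimal, self-contained illustration, not the repository's code: `ContentPart`, `Message`, `ChatRequest`, the body of `request_has_images`, and the stubbed completion are all simplified, hypothetical stand-ins. Only the names `request_has_images` and `InferenceError::VisionUnsupported { model_id }` come from the diff. The point it shows is the ordering: the image check runs before any TP work starts, so an image-bearing request fails fast instead of being answered from text alone.

```rust
// Hypothetical, simplified request types; the real ones are not shown in this commit.
#[derive(Debug, Clone, PartialEq)]
pub enum ContentPart {
    Text(String),
    ImageUrl(String),
}

#[derive(Debug, Clone)]
pub struct Message {
    pub role: String,
    pub content: Vec<ContentPart>,
}

#[derive(Debug, Clone)]
pub struct ChatRequest {
    pub model: String,
    pub messages: Vec<Message>,
}

// Only the variant used by the guard is modelled here.
#[derive(Debug, PartialEq)]
pub enum InferenceError {
    VisionUnsupported { model_id: String },
}

/// Assumed behaviour: true if any message carries at least one image part.
pub fn request_has_images(request: &ChatRequest) -> bool {
    request
        .messages
        .iter()
        .any(|m| m.content.iter().any(|p| matches!(p, ContentPart::ImageUrl(_))))
}

/// Stand-in for the TP entry point: the guard runs before any prefill work.
pub fn chat_completion_tp(request: &ChatRequest) -> Result<String, InferenceError> {
    if request_has_images(request) {
        // Reject up front rather than dropping the image silently.
        return Err(InferenceError::VisionUnsupported {
            model_id: request.model.clone(),
        });
    }
    // Placeholder for the real tensor-parallel prefill and decode.
    Ok(format!("text-only completion for {}", request.model))
}
```

In this sketch a text-only request reaches the placeholder completion, while a request containing any `ContentPart::ImageUrl` returns `VisionUnsupported` carrying the requested model id. How that error is mapped to the HTTP 400 `vision_unsupported` response is outside what this commit shows.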