fix: increase max_tokens to 8192 for R1 reasoning overhead

R1 models use 500-2000 tokens for <think> blocks before the final
response. 4096 was too tight — the model would exhaust the budget
mid-thought and never emit the JSON.
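The headroom arithmetic can be sketched as follows. The 2000-token reasoning overhead comes from the commit message; the answer size is an illustrative assumption, not a measured value:

```rust
// Budget arithmetic behind the bump (illustrative numbers only).
// R1-style models spend reasoning tokens inside <think> before the
// visible answer; both draw from the same max_tokens budget.
fn tokens_left_for_answer(max_tokens: i64, reasoning_tokens: i64) -> i64 {
    max_tokens - reasoning_tokens
}

fn main() {
    // Worst-case reasoning overhead cited in the commit message.
    let reasoning = 2000;
    // Hypothetical size of the structured JSON answer (assumption).
    let answer_needed = 2500;

    // With the old limit, a long think block leaves too little room:
    assert!(tokens_left_for_answer(4096, reasoning) < answer_needed);
    // The new limit keeps headroom even in the worst case:
    assert!(tokens_left_for_answer(8192, reasoning) >= answer_needed);
}
```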

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
commit 6f4f864d28
parent 185cb4586e
Date: 2026-03-09 18:17:48 +02:00


@@ -63,7 +63,7 @@ impl ClaudeClient {
     ) -> Result<(String, Option<Usage>)> {
         let body = MessagesRequest {
             model: self.model.clone(),
-            max_tokens: 4096,
+            max_tokens: 8192,
             system: system.to_string(),
             messages: messages.to_vec(),
         };