fix: increase max_tokens to 8192 for R1 reasoning overhead
R1 models use 500-2000 tokens for <think> blocks before the final response. 4096 was too tight: the model would exhaust the budget mid-thought and never emit the JSON.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -63,7 +63,7 @@ impl ClaudeClient {
     ) -> Result<(String, Option<Usage>)> {
         let body = MessagesRequest {
             model: self.model.clone(),
-            max_tokens: 4096,
+            max_tokens: 8192,
             system: system.to_string(),
             messages: messages.to_vec(),
         };
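The arithmetic behind the new limit can be sketched as follows. This is an illustrative check, not part of the commit: the constant names and the 4096-token response budget are assumptions chosen to mirror the commit message's figures (500-2000 tokens of <think> overhead on top of the desired response).

```rust
// Hypothetical budget check illustrating why 4096 was too tight.
// REASONING_OVERHEAD_MAX is the upper end of the <think> overhead
// observed for R1 models; RESPONSE_BUDGET is the room we want left
// for the actual JSON response. Both names are illustrative.
const REASONING_OVERHEAD_MAX: u32 = 2000;
const RESPONSE_BUDGET: u32 = 4096;

fn required_max_tokens() -> u32 {
    REASONING_OVERHEAD_MAX + RESPONSE_BUDGET
}

fn main() {
    // Old limit: a worst-case think block leaves only ~2096 tokens,
    // so a long response gets truncated before the JSON completes.
    assert!(4096 - REASONING_OVERHEAD_MAX < RESPONSE_BUDGET);
    // New limit: 8192 covers worst-case reasoning plus the response.
    assert!(8192 >= required_max_tokens());
    println!("required max_tokens >= {}", required_max_tokens());
}
```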