10 Commits

Author SHA1 Message Date
185cb4586e fix: strip R1 think blocks before JSON extraction
DeepSeek-R1 models emit <think>...</think> before their actual response.
The brace-counting extractor would grab the first { inside the thinking
block (which contains partial JSON fragments) rather than the final
strategy JSON.

strip_think_blocks() removes all <think>...</think> sections including
unterminated blocks (truncated responses), leaving only the final output
for extract_json to process.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 18:17:06 +02:00
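
The think-block stripping described in 185cb4586e can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the repository's actual code: only the name strip_think_blocks() comes from the commit message, and the signature and the choice to drop everything after an unterminated <think> are assumptions.

```rust
/// Remove every `<think>...</think>` section from `input`.
/// Assumption: an unterminated `<think>` (truncated response) discards
/// everything from the opening tag to the end of the input.
fn strip_think_blocks(input: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    let mut out = String::with_capacity(input.len());
    let mut rest = input;

    while let Some(start) = rest.find(OPEN) {
        // Keep the text that precedes the thinking block.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            // Terminated block: continue scanning after the closing tag.
            Some(end) => rest = &after_open[end + CLOSE.len()..],
            // Unterminated block: nothing after it is usable output.
            None => {
                rest = "";
                break;
            }
        }
    }

    out.push_str(rest);
    out
}
```

Running the stripper before the brace-counting extractor means the first { the extractor sees belongs to the final answer rather than to a fragment inside the reasoning text.
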
b947f48b01 feat: client-side validation, cycling detection, quantity prompt fix
- validate_strategy(): hard error if quantity is not a parseable decimal
  (catches "ATR_SIZED" etc. before sending to swym API); soft warning if
  a sell rule has no entry_price stop-loss or no bars_since_entry time exit
- Hard validation errors skip the backtest and feed errors back to the LLM
  via IterationRecord.validation_notes included in summary()
- json_contains_kind(): recursive helper to search strategy JSON tree
- diagnose_history(): add cycling detection — triggers is_converged when
  any avg_sharpe value appears 3+ times in history (not just last 3 streak),
  catching the alternating RSI<30 / RSI<25 pattern seen in the latest run
- prompts: clarify that quantity must parse as a float; list invalid
  placeholder strings ("ATR_SIZED", "FULL_BALANCE", "dynamic", etc.)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 17:56:59 +02:00
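
The cycling check added to diagnose_history() in b947f48b01 might look something like the sketch below. The function name is_cycling, the min_repeats parameter, and the rounding to four decimal places (needed so that floats can be counted as equal) are all assumptions made for illustration; the commit only states that a value appearing 3+ times anywhere in history triggers convergence.

```rust
use std::collections::HashMap;

/// Returns true when any avg_sharpe value occurs at least `min_repeats`
/// times anywhere in the history, not only in a consecutive streak.
/// Assumption: values are bucketed by rounding to 4 decimal places.
fn is_cycling(avg_sharpes: &[f64], min_repeats: usize) -> bool {
    let mut counts: HashMap<i64, usize> = HashMap::new();
    for &sharpe in avg_sharpes {
        if !sharpe.is_finite() {
            continue; // skip NaN/inf rather than bucket them
        }
        let key = (sharpe * 10_000.0).round() as i64;
        let seen = counts.entry(key).or_insert(0);
        *seen += 1;
        if *seen >= min_repeats {
            return true;
        }
    }
    false
}
```

An alternating history such as 0.42, 0.51, 0.42, 0.51, 0.42 is flagged on the fifth entry, even though no three consecutive iterations are alike.
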
e27aabae34 feat(agent): improve LLM feedback loop and convergence detection
Three related improvements to help the model learn and explore effectively:

Strategy JSON in history: include the compact strategy JSON in each
IterationRecord::summary() so the LLM knows exactly what was tested in
every past iteration, not just the outcome metrics. Without this the model
had no record of what it tried once conversation history was trimmed.

Rule comment in audit: include rule_comment from the condition audit in
the formatted audit string so the LLM can correlate hit-rate data with
the rule's stated purpose.

Convergence detection and anti-anchoring: diagnose_history() now returns
(String, bool) where the bool signals that the last 3 iterations had
avg_sharpe spread < 0.03 (model stuck in local optimum). When converged:
- Emit a ⚠ CONVERGENCE DETECTED note listing untried candle intervals
- Suppress best_so_far JSON to break the anchoring effect that was
  causing the model to produce near-identical strategies for 13+ iterations
- Add a targeted "try a different approach" instruction

Also add volume-as-field clarification to the DSL mistakes section in
the system prompt, fixing the "unknown variant `volume`" submit error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 14:38:07 +02:00
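
The spread-based convergence test from e27aabae34 can be illustrated with a short sketch. The function name and parameters here are hypothetical; the commit itself only specifies a window of the last 3 iterations and an avg_sharpe spread below 0.03.

```rust
/// Returns true when the last `window` avg_sharpe values lie within
/// `max_spread` of each other (max - min < max_spread).
/// Returns false when there are fewer than `window` values.
fn is_converged(avg_sharpes: &[f64], window: usize, max_spread: f64) -> bool {
    if window == 0 || avg_sharpes.len() < window {
        return false;
    }
    let recent = &avg_sharpes[avg_sharpes.len() - window..];
    let min = recent.iter().copied().fold(f64::INFINITY, f64::min);
    let max = recent.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    (max - min) < max_spread
}
```

With the values from the commit message this would be called as is_converged(&history, 3, 0.03); a true result is the point at which the best_so_far JSON would be suppressed.
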
fb1145acae fix(swym): parse result_summary from actual API response structure
The swym API response structure differs from what the code previously
assumed. Fix all field extraction to match the real shape:

- total_positions: backtest_metadata.position_count (not top-level)
- sharpe_ratio, win_rate, profit_factor: instruments.{key}.{field}.value
  wrapped decimal strings (not plain floats); treat Decimal::MAX sentinel
  (~7.9e28) as None
- net_pnl: instruments.{key}.pnl (plain decimal string)
- instrument key derived as "{exchange_no_underscores}-{base}_{quote}"

Also fix coverage-based backtest_from clamping: after the coverage
check, compute the effective backtest start as the max first_open across
all instruments × common intervals, so strategies never fail with
"requested range outside available data". Log per-interval date ranges
for each instrument at startup.
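
A minimal sketch of that clamping step, assuming timestamps are Unix seconds and that the requested start is kept whenever it is already inside the covered range (both are assumptions; the function name is hypothetical):

```rust
/// Effective backtest start: the latest `first_open` across all
/// instrument/interval pairs, but never earlier than the requested start.
/// Assumption: all timestamps are Unix seconds.
fn effective_backtest_from(requested_from: i64, first_opens: &[i64]) -> i64 {
    first_opens.iter().copied().fold(requested_from, i64::max)
}
```

Taking the maximum guarantees that every instrument and interval has data at the chosen start, at the cost of shortening the backtest window to that of the instrument with the latest first candle.
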

Additionally:
- Compact format_audit_summary to handle {"rules":[...],"total_bars":N}
  structure with per-condition true_count/evaluated breakdown
- Drop avg_bars from summary_line (field absent from API)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 14:22:29 +02:00
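
Two of the extraction rules in fb1145acae lend themselves to a short illustration: deriving the instrument key and treating the Decimal::MAX sentinel as a missing value. The sketch below is not the repository's code. It parses into f64 instead of a decimal type to stay dependency-free, the function names and the 7.9e28 threshold constant are assumptions, and the exchange and symbol strings in the usage note are made-up examples.

```rust
/// Hypothetical cutoff: anything at or above this magnitude is treated as
/// the Decimal::MAX sentinel (~7.9e28) mentioned in the commit message.
const SENTINEL_THRESHOLD: f64 = 7.9e28;

/// Build the key "{exchange_no_underscores}-{base}_{quote}".
fn instrument_key(exchange: &str, base: &str, quote: &str) -> String {
    format!("{}-{}_{}", exchange.replace('_', ""), base, quote)
}

/// Parse a decimal string metric; unparseable, non-finite, or sentinel
/// values all map to None.
fn parse_metric(raw: &str) -> Option<f64> {
    let value: f64 = raw.trim().parse().ok()?;
    if !value.is_finite() || value.abs() >= SENTINEL_THRESHOLD {
        None
    } else {
        Some(value)
    }
}
```

For example, instrument_key("binance_spot", "btc", "usdt") yields "binancespot-btc_usdt", and parse_metric("79228162514264337593543950335") yields None.
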
c7a2d65539 fix(prompts): forbid dynamic quantity expressions, require plain decimal string
The model was generating Expr objects for quantity (e.g. ATR-based sizing),
causing consistent QuantitySpec deserialization failures. Replace the
"prefer dynamic sizing" hint with an explicit rule: quantity must always
be a fixed decimal string like "0.001".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 13:11:40 +02:00
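
The plain-decimal rule for quantity can also be checked on the client, which is what validate_strategy() in b947f48b01 is described as doing. The sketch below shows one way to write such a check; validate_quantity is a hypothetical name, and rejecting non-positive values is an extra assumption that the commit messages do not state.

```rust
/// Accept only quantities that parse as a finite, positive number.
/// Placeholder strings such as "ATR_SIZED" fail to parse and are rejected.
fn validate_quantity(quantity: &str) -> Result<f64, String> {
    match quantity.trim().parse::<f64>() {
        // Rust's f64 parser accepts "inf" and "NaN", so check finiteness too.
        Ok(q) if q.is_finite() && q > 0.0 => Ok(q),
        Ok(_) => Err(format!(
            "quantity {quantity:?} must be a finite, positive decimal"
        )),
        Err(_) => Err(format!(
            "quantity {quantity:?} is not a parseable decimal, e.g. \"0.001\""
        )),
    }
}
```

Returning the error as a string makes it straightforward to feed the message back to the model as a validation note instead of submitting a strategy that would fail deserialization.
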
292c101859 docs(prompts): add DSL expression kind reference and three working examples
Shows correct usage of rsi/bollinger/ema_trend condition shortcuts, entry_price
and bars_since_entry ExprKind values, and func/cross_over/bin_op expressions.
Also calls out common model mistakes (rsi as ExprKind, bars_since_entry as
FuncName, expr_field) and adds a note that spot strategies are long-only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 13:09:01 +02:00
fc9b7e094a feat(agent): add strategy quality introspection
Log full strategy JSON at debug level, show full anyhow cause chain on
submit failures, surface condition_audit_summary for 0-trade results in
both logs and the summary fed back to the AI each iteration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 12:58:49 +02:00
deb28f6714 chore: local defaults
2026-03-09 12:24:30 +02:00
b7aa458e40 feat(claude): add configurable API base URL via --anthropic-url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 10:28:44 +02:00
934566879e chore: init
2026-03-09 10:15:33 +02:00