# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

`scout` is an autonomous strategy search agent for the [swym](https://swym.rs) backtesting platform. It runs a loop: asks Claude to generate trading strategies → submits backtests to swym → evaluates results → feeds learnings back → repeats. Promising strategies are automatically validated on out-of-sample (OOS) data to filter out overfitting.

## Architecture

### Core Modules

- **`agent.rs`** - Main orchestration logic. Contains the `run()` function that implements the search loop, strategy validation, and learning feedback. Key types and functions: `IterationRecord`, `LedgerEntry`, `validate_strategy()`, `diagnose_history()`.
- **`claude.rs`** - Claude API client. Handles model communication, JSON extraction from responses, and context length detection for R1-family models with thinking blocks.
- **`swym.rs`** - Swym backtesting API client. Wraps all swym API calls: candle coverage, strategy validation, backtest submission, polling, and metrics retrieval.
- **`prompts.rs`** - System and user prompts for the LLM. Generates the DSL schema context and iteration-specific prompts with prior results.
- **`config.rs`** - CLI argument parsing and configuration. Defines the `Cli` struct with all command-line flags and environment variables.

### Key Data Flows

1. **Strategy Generation**: `agent::run()` → `claude::chat()` → extracts JSON strategy → validates → submits to swym
2. **Backtest Execution**: `swym::submit_backtest()` → `swym::poll_until_done()` → `BacktestResult::from_response()`
3. **Learning Loop**: `load_prior_summary()` reads `run_ledger.jsonl` → fetches metrics via `swym::compare_runs()` → formats compact summary → appends to iteration prompt
4. **OOS Validation**: Promising in-sample results trigger a re-backtest on held-out data → strategies passing both phases are saved to `validated_*.json`

### Important Patterns

- **Deduplication**: Strategies are deduplicated by full JSON serialization using a HashMap (`tested_strategies`). Identical strategies are skipped with a warning.
- **Validation**: Validation has two stages: client-side (structure, quantity parsing, exit rules) and server-side (DSL schema validation via `/strategies/validate`).
- **Context Management**: Conversation history is trimmed to keep the last 6 messages (3 exchanges) to avoid token limits. Prior results are summarized in the next prompt.
- **Error Recovery**: Three consecutive failures trigger an abort. Transient API errors are logged but don't stop the run.
- **Ledger Persistence**: Each backtest writes a `LedgerEntry` to `run_ledger.jsonl` for cross-run learning. Uses atomic O_APPEND writes.

## Development Commands

```bash
# Build
cargo build

# Run with default config
cargo run

# Run with custom flags
cargo run -- \
  --swym-url https://dev.swym.hanzalova.internal/api/v1 \
  --max-iterations 50 \
  --instruments binance_spot:BTCUSDC,binance_spot:ETHUSDC

# Run tests
cargo test

# Run with debug logging
RUST_LOG=debug cargo run
```

## DSL Schema

Strategies are JSON objects with the schema defined in `src/dsl-schema.json`. The DSL uses a rule-based structure with `when` (entry conditions) and `then` (exit actions). Key concepts:

- **Indicators**: `{"kind":"indicator","name":"...","params":{...}}`
- **Comparators**: `{"kind":"compare","lhs":"...","op":"...","rhs":"..."}`
- **Functions**: `{"kind":"func","name":"...","args":[...]}`

See `src/dsl-schema.json` for the complete schema and `prompts.rs::system_prompt()` for how it's presented to Claude.
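
Because strategies are plain JSON, the deduplication and context-trimming patterns listed under Important Patterns reduce to a few lines of standard-library Rust. The following is a minimal sketch, not the code in `agent.rs`: the `TestedStrategies` wrapper, the `check_and_record()` and `trim_history()` helpers, and the choice to store the first iteration number as the map value are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical tracker: maps a strategy's serialized JSON to the
/// iteration that first produced it (the value type is an assumption).
struct TestedStrategies {
    seen: HashMap<String, u32>,
}

impl TestedStrategies {
    fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Returns `None` and records the strategy if it is new, or
    /// `Some(first_iteration)` if an identical serialization was seen before.
    fn check_and_record(&mut self, strategy_json: &str, iteration: u32) -> Option<u32> {
        if let Some(&first) = self.seen.get(strategy_json) {
            return Some(first);
        }
        self.seen.insert(strategy_json.to_owned(), iteration);
        None
    }
}

/// Drops the oldest entries so that at most `keep` messages remain.
fn trim_history<T>(history: &mut Vec<T>, keep: usize) {
    if history.len() > keep {
        let excess = history.len() - keep;
        history.drain(..excess);
    }
}
```

A caller would skip the backtest (and log a warning) whenever `check_and_record()` returns `Some(_)`, and call `trim_history(&mut messages, 6)` after each exchange. Note that keying on the serialized string treats two strategies as identical only when the strings match byte for byte, so this sketch assumes the serializer emits keys in a stable order.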
## Model Families

The code supports different Claude model families via `ModelFamily` enum in `config.rs`:

- **Sonnet**: Standard model, no special handling
- **Opus**: Larger context, higher cost
- **R1**: Has thinking blocks (`...`) that need to be stripped before JSON extraction

Context length is auto-detected from the server's `/api/v1/models` endpoint (LM Studio) or `/v1/models/{id}` (OpenAI-compatible). Output token budget is set to half the context window.

## Output Files

- `strategy_001.json` through `strategy_NNN.json` - Every strategy attempted (full JSON)
- `validated_001.json` through `validated_NNN.json` - Strategies that passed OOS validation (includes in-sample + OOS metrics)
- `best_strategy.json` - Strategy with highest average Sharpe across instruments
- `run_ledger.jsonl` - Persistent record of all backtests for learning across runs

## Common Tasks

### Adding a new CLI flag

1. Add field to `Cli` struct in `config.rs`
2. Add clap derive attribute with `#[arg(short, long, env = "VAR_NAME")]`
3. Use the flag in `agent::run()` via `cli.flag_name`

### Extending the DSL

1. Update `src/dsl-schema.json` with new expression kinds
2. Add validation logic in `validate_strategy()` if needed
3. Update prompts in `prompts.rs` to guide the model

### Modifying the learning loop

1. Edit `load_prior_summary()` in `agent.rs` to change how prior results are formatted
2. Adjust `diagnose_history()` to add new diagnostics or change convergence detection
3. Update `prompts.rs::iteration_prompt()` to incorporate new information

### Adding new validation checks

Add to `validate_strategy()` in `agent.rs`. Returns `(hard_errors, warnings)` where hard errors block submission and warnings are logged but allow the backtest to proceed.

## Testing Strategy

The codebase uses `anyhow` for error handling and `tracing` for logging.
Key test areas:

- Strategy JSON extraction from various response formats
- Context length detection from LM Studio/OpenAI endpoints
- Ledger entry serialization/deserialization
- Backtest result parsing from swym API responses
- Deduplication logic
- Convergence detection in `diagnose_history()`
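
To make the first test area concrete, here is one way a JSON extraction helper and its test cases could look. This is a sketch under stated assumptions: `extract_json_object()` is a hypothetical name, not the function in `claude.rs`, and it only locates the first balanced `{ ... }` span without checking that the span is valid JSON or that it matches the DSL schema.

```rust
/// Hypothetical helper: returns the first balanced `{ ... }` span in `text`,
/// ignoring braces that appear inside double-quoted strings.
fn extract_json_object(text: &str) -> Option<&str> {
    let start = text.find('{')?;
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;

    for (offset, ch) in text[start..].char_indices() {
        if in_string {
            // Inside a string only escapes and the closing quote matter.
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' => in_string = true,
            '{' => depth += 1,
            '}' => {
                // The scan starts on '{', so depth is at least 1 here.
                depth -= 1;
                if depth == 0 {
                    let end = start + offset + ch.len_utf8();
                    return Some(&text[start..end]);
                }
            }
            _ => {}
        }
    }
    None // No opening brace was ever closed.
}
```

Useful cases to cover in a test are a response wrapped in a Markdown code fence, a string value containing `}`, escaped quotes inside a string, a response with no JSON at all, and a truncated object. One known limitation of this approach is that a stray `{` in the prose before the strategy would be taken as the start of the object.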