rob thijssen 11fe79ed25 docs: add CLAUDE.md for future Claude Code instances
Add comprehensive guidance document covering architecture, data flows,
development commands, DSL schema reference, and common patterns for
working with the scout strategy search agent.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-03-12 05:38:28 +02:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

scout is an autonomous strategy search agent for the swym backtesting platform. It runs a loop: ask Claude to generate trading strategies → submit backtests to swym → evaluate results → feed learnings back → repeat. Promising strategies are automatically re-validated on out-of-sample (OOS) data to filter out overfitting.

Architecture

Core Modules

  • agent.rs - Main orchestration logic. Contains the run() function that implements the search loop, strategy validation, and learning feedback. Key types: IterationRecord, LedgerEntry, validate_strategy(), diagnose_history().
  • claude.rs - Claude API client. Handles model communication, JSON extraction from responses, and context length detection for R1-family models with thinking blocks.
  • swym.rs - Swym backtesting API client. Wraps all swym API calls: candle coverage, strategy validation, backtest submission, polling, and metrics retrieval.
  • prompts.rs - System and user prompts for the LLM. Generates the DSL schema context and iteration-specific prompts with prior results.
  • config.rs - CLI argument parsing and configuration. Defines Cli struct with all command-line flags and environment variables.

Key Data Flows

  1. Strategy Generation: agent::run() → claude::chat() → extracts JSON strategy → validates → submits to swym
  2. Backtest Execution: swym::submit_backtest() → swym::poll_until_done() → BacktestResult::from_response()
  3. Learning Loop: load_prior_summary() reads run_ledger.jsonl → fetches metrics via swym::compare_runs() → formats compact summary → appends to iteration prompt
  4. OOS Validation: Promising in-sample results trigger re-backtest on held-out data → strategies passing both phases saved to validated_*.json
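
The overall loop can be sketched in stripped-down form (all names here are illustrative stand-ins, not the actual agent.rs API):

```rust
// Minimal sketch of the generate → backtest → learn loop.
// `generate` and `backtest` stand in for the claude/swym calls.

struct Summary(String); // compact summary of prior results fed into the next prompt

fn generate(_prior: &Summary) -> String {
    r#"{"when":[],"then":[]}"#.to_string() // stand-in for claude::chat() + JSON extraction
}

fn backtest(_strategy: &str) -> Result<f64, String> {
    Ok(1.2) // stand-in for swym submission/polling; returns a score
}

fn run(max_iterations: usize) -> Vec<f64> {
    let mut prior = Summary(String::new());
    let mut scores = Vec::new();
    let mut consecutive_failures = 0;
    for _ in 0..max_iterations {
        let strategy = generate(&prior);
        match backtest(&strategy) {
            Ok(score) => {
                consecutive_failures = 0;
                prior = Summary(format!("last score: {score}")); // learning feedback
                scores.push(score);
            }
            Err(_) => {
                consecutive_failures += 1;
                if consecutive_failures >= 3 {
                    break; // abort after repeated failures
                }
            }
        }
    }
    scores
}
```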

Important Patterns

  • Deduplication: Strategies are deduplicated by full JSON serialization using a HashMap (tested_strategies). Identical strategies are skipped with a warning.
  • Validation: Two stages: client-side (structure, quantity parsing, exit rules) and server-side (DSL schema validation via /strategies/validate).
  • Context Management: Conversation history is trimmed to the last 6 messages (3 exchanges) to stay within token limits. Prior results are summarized in the next prompt.
  • Error Recovery: Three consecutive failures abort the run. Transient API errors are logged but don't stop the run.
  • Ledger Persistence: Each backtest writes a LedgerEntry to run_ledger.jsonl for cross-run learning. Uses atomic O_APPEND writes.
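
The deduplication pattern reduces to a set keyed on the serialized strategy text. A sketch (the real code keys on the full JSON serialization of the parsed strategy; the `trim` here is only a stand-in for that canonicalization step):

```rust
use std::collections::HashSet;

/// Returns true if this strategy was already tested, inserting it otherwise.
/// `HashSet::insert` returns false when the value was already present.
fn is_duplicate(tested: &mut HashSet<String>, strategy_json: &str) -> bool {
    !tested.insert(strategy_json.trim().to_string())
}
```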

Development Commands

# Build
cargo build

# Run with default config
cargo run

# Run with custom flags
cargo run -- \
  --swym-url https://dev.swym.hanzalova.internal/api/v1 \
  --max-iterations 50 \
  --instruments binance_spot:BTCUSDC,binance_spot:ETHUSDC

# Run tests
cargo test

# Run with debug logging
RUST_LOG=debug cargo run

DSL Schema

Strategies are JSON objects with the schema defined in src/dsl-schema.json. The DSL uses a rule-based structure with when (entry conditions) and then (exit actions). Key concepts:

  • Indicators: {"kind":"indicator","name":"...","params":{...}}
  • Comparators: {"kind":"compare","lhs":"...","op":"...","rhs":"..."}
  • Functions: {"kind":"func","name":"...","args":[...]}

See src/dsl-schema.json for the complete schema and prompts.rs::system_prompt() for how it's presented to Claude.
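
Putting the three expression kinds together, a hypothetical strategy might look like the document below. Every indicator name, parameter, and function here is invented for illustration; src/dsl-schema.json is the authority on what is actually valid:

```rust
// A hypothetical strategy document; names and params are illustrative only.
const EXAMPLE_STRATEGY: &str = r#"{
  "when": {
    "kind": "compare",
    "lhs": {"kind": "indicator", "name": "rsi", "params": {"period": 14}},
    "op": "<",
    "rhs": 30
  },
  "then": [
    {"kind": "func", "name": "enter_long", "args": [{"qty": "10%"}]}
  ]
}"#;

/// Naive structural check, not a real validator: all three expression
/// kinds from the schema notes should appear somewhere in the document.
fn mentions_all_kinds(doc: &str) -> bool {
    ["indicator", "compare", "func"]
        .iter()
        .all(|k| doc.contains(&format!(r#""kind": "{k}""#)))
}
```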

Model Families

The code supports different Claude model families via ModelFamily enum in config.rs:

  • Sonnet: Standard model, no special handling
  • Opus: Larger context, higher cost
  • R1: Has thinking blocks (<think>...</think>) that need to be stripped before JSON extraction

Context length is auto-detected from the server's /api/v1/models endpoint (LM Studio) or /v1/models/{id} (OpenAI-compatible). Output token budget is set to half the context window.
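
The budgeting and R1 think-block handling described above reduce to something like this sketch (simplified; the real logic lives in claude.rs and config.rs):

```rust
#[derive(Debug, PartialEq)]
enum ModelFamily {
    Sonnet,
    Opus,
    R1,
}

/// Output token budget is half the detected context window.
fn output_budget(context_window: u32) -> u32 {
    context_window / 2
}

/// Strip a leading <think>...</think> block before JSON extraction.
/// Only the R1 family emits these; other families pass through unchanged.
fn strip_thinking(family: &ModelFamily, text: &str) -> String {
    if *family != ModelFamily::R1 {
        return text.to_string();
    }
    match (text.find("<think>"), text.find("</think>")) {
        (Some(_), Some(end)) => text[end + "</think>".len()..].trim_start().to_string(),
        _ => text.to_string(),
    }
}
```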

Output Files

  • strategy_001.json through strategy_NNN.json - Every strategy attempted (full JSON)
  • validated_001.json through validated_NNN.json - Strategies that passed OOS validation (includes in-sample + OOS metrics)
  • best_strategy.json - Strategy with highest average Sharpe across instruments
  • run_ledger.jsonl - Persistent record of all backtests for learning across runs
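
Appending to run_ledger.jsonl with O_APPEND semantics maps to std's OpenOptions; a minimal sketch of the pattern (the real LedgerEntry is serialized elsewhere, so this takes the JSON line as a string):

```rust
use std::fs::OpenOptions;
use std::io::Write;

/// Append one ledger entry as a single JSONL line.
/// `append(true)` opens the file with O_APPEND, so each write is
/// positioned at the current end of the file.
fn append_ledger_entry(path: &str, entry_json: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{}", entry_json.trim())
}
```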

Common Tasks

Adding a new CLI flag

  1. Add field to Cli struct in config.rs
  2. Add clap derive attribute with #[arg(short, long, env = "VAR_NAME")]
  3. Use the flag in agent::run() via cli.flag_name
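
Steps 1–2 together look roughly like this fragment (a sketch using clap's derive API; the flag name and env var below are invented examples, not existing fields of the real Cli struct):

```rust
use clap::Parser;

#[derive(Parser)]
struct Cli {
    // ...existing flags...

    /// Maximum strategies to promote to OOS validation per run (example flag).
    #[arg(short, long, env = "SCOUT_MAX_VALIDATIONS", default_value_t = 3)]
    max_validations: usize,
}
```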

Extending the DSL

  1. Update src/dsl-schema.json with new expression kinds
  2. Add validation logic in validate_strategy() if needed
  3. Update prompts in prompts.rs to guide the model

Modifying the learning loop

  1. Edit load_prior_summary() in agent.rs to change how prior results are formatted
  2. Adjust diagnose_history() to add new diagnostics or change convergence detection
  3. Update prompts.rs::iteration_prompt() to incorporate new information

Adding new validation checks

Add checks to validate_strategy() in agent.rs. It returns (hard_errors, warnings): hard errors block submission, while warnings are logged but allow the backtest to proceed.
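
A new check slots into that return shape, which is roughly (a simplified sketch, not the real agent.rs implementation; the example checks are invented):

```rust
/// Sketch of the validation contract: the first Vec blocks submission,
/// the second is logged but lets the backtest proceed.
fn validate_strategy(strategy_json: &str) -> (Vec<String>, Vec<String>) {
    let mut hard_errors = Vec::new();
    let mut warnings = Vec::new();
    if !strategy_json.contains("\"when\"") {
        hard_errors.push("missing entry conditions (`when`)".to_string());
    }
    if !strategy_json.contains("\"then\"") {
        hard_errors.push("missing exit actions (`then`)".to_string());
    }
    if strategy_json.len() > 10_000 {
        warnings.push("unusually large strategy document".to_string());
    }
    (hard_errors, warnings)
}
```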

Testing Strategy

The codebase uses anyhow for error handling and tracing for logging. Key test areas:

  • Strategy JSON extraction from various response formats
  • Context length detection from LM Studio/OpenAI endpoints
  • Ledger entry serialization/deserialization
  • Backtest result parsing from swym API responses
  • Deduplication logic
  • Convergence detection in diagnose_history()