Compare commits
13 Commits
87d31f8d7e...main
| Author | SHA1 | Date | |
|---|---|---|---|
| | 11fe79ed25 | | |
| | fcb9a2f553 | | |
| | 75c95f7935 | | |
| | 6601da21cc | | |
| | 8de3ae5fe1 | | |
| | a435d3a99d | | |
| | b476199de8 | | |
| | d76d3b9061 | | |
| | 0945c94cc8 | | |
| | a0316be798 | | |
| | 609d64587b | | |
| | 6692bdb490 | | |
| | 36689e3fbb | | |
116
CLAUDE.md
Normal file
@@ -0,0 +1,116 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

`scout` is an autonomous strategy search agent for the [swym](https://swym.rs) backtesting platform. It runs a loop: asks Claude to generate trading strategies → submits backtests to swym → evaluates results → feeds learnings back → repeats. Promising strategies are automatically validated on out-of-sample data to filter out overfitting.

## Architecture

### Core Modules

- **`agent.rs`** - Main orchestration logic. Contains the `run()` function that implements the search loop, strategy validation, and learning feedback. Key types: `IterationRecord`, `LedgerEntry`. Key functions: `validate_strategy()`, `diagnose_history()`.
- **`claude.rs`** - Claude API client. Handles model communication, JSON extraction from responses, and context length detection for R1-family models with thinking blocks.
- **`swym.rs`** - Swym backtesting API client. Wraps all swym API calls: candle coverage, strategy validation, backtest submission, polling, and metrics retrieval.
- **`prompts.rs`** - System and user prompts for the LLM. Generates the DSL schema context and iteration-specific prompts with prior results.
- **`config.rs`** - CLI argument parsing and configuration. Defines the `Cli` struct with all command-line flags and environment variables.

### Key Data Flows

1. **Strategy Generation**: `agent::run()` → `claude::chat()` → extracts JSON strategy → validates → submits to swym
2. **Backtest Execution**: `swym::submit_backtest()` → `swym::poll_until_done()` → `BacktestResult::from_response()`
3. **Learning Loop**: `load_prior_summary()` reads `run_ledger.jsonl` → fetches metrics via `swym::compare_runs()` → formats compact summary → appends to iteration prompt
4. **OOS Validation**: Promising in-sample results trigger re-backtest on held-out data → strategies passing both phases saved to `validated_*.json`

### Important Patterns

- **Deduplication**: Strategies are deduplicated by full JSON serialization using a HashMap (`tested_strategies`). Identical strategies are skipped with a warning.
- **Validation**: Two-stage validation—client-side (structure, quantity parsing, exit rules) and server-side (DSL schema validation via `/strategies/validate`).
- **Context Management**: Conversation history is trimmed to keep the last 6 messages (3 exchanges) to avoid token limits. Prior results are summarized in the next prompt.
- **Error Recovery**: Three consecutive failures trigger an abort. Transient API errors are logged but don't stop the run.
- **Ledger Persistence**: Each backtest writes a `LedgerEntry` to `run_ledger.jsonl` for cross-run learning. Uses atomic O_APPEND writes.
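
The context-management pattern above can be illustrated with a minimal sketch. It is not the code in `agent.rs`: the `Message` struct and the `trim_history` helper are hypothetical stand-ins, and the only thing taken from the description is the idea of keeping the most recent 6 messages.

```rust
/// Hypothetical stand-in for the conversation message type (not the real `claude::Message`).
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: String,
}

/// Keep only the most recent `keep` messages, dropping the oldest ones first.
/// Shorter histories are left untouched.
fn trim_history(conversation: &mut Vec<Message>, keep: usize) {
    if conversation.len() > keep {
        let excess = conversation.len() - keep;
        // `drain(..excess)` removes the first `excess` elements in place.
        conversation.drain(..excess);
    }
}
```

Calling `trim_history(&mut conversation, 6)` after each exchange would keep three user/assistant pairs, which matches the "3 exchanges" figure quoted above.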

## Development Commands

```bash
# Build
cargo build

# Run with default config
cargo run

# Run with custom flags
cargo run -- \
  --swym-url https://dev.swym.hanzalova.internal/api/v1 \
  --max-iterations 50 \
  --instruments binance_spot:BTCUSDC,binance_spot:ETHUSDC

# Run tests
cargo test

# Run with debug logging
RUST_LOG=debug cargo run
```

## DSL Schema

Strategies are JSON objects with the schema defined in `src/dsl-schema.json`. The DSL uses a rule-based structure with `when` (entry conditions) and `then` (exit actions). Key concepts:

- **Indicators**: `{"kind":"indicator","name":"...","params":{...}}`
- **Comparators**: `{"kind":"compare","lhs":"...","op":"...","rhs":"..."}`
- **Functions**: `{"kind":"func","name":"...","args":[...]}`

See `src/dsl-schema.json` for the complete schema and `prompts.rs::system_prompt()` for how it's presented to Claude.

## Model Families

The code supports different model families via the `ModelFamily` enum in `config.rs`:

- **Sonnet**: Standard model, no special handling
- **Opus**: Larger context, higher cost
- **R1**: Has thinking blocks (`<think>...</think>`) that need to be stripped before JSON extraction

Context length is auto-detected from the server's `/api/v1/models` endpoint (LM Studio) or `/v1/models/{id}` (OpenAI-compatible). Output token budget is set to half the context window.
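
As a rough illustration of the two rules above (stripping thinking blocks and halving the context window), here is a small sketch. The function names `strip_think_blocks` and `output_token_budget` are assumptions made for this example; the actual helpers in `claude.rs` may be named and structured differently.

```rust
/// Remove every `<think>...</think>` span so that only the final answer remains.
/// If a `<think>` tag is never closed, everything after it is dropped.
fn strip_think_blocks(text: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = rest.find(OPEN) {
        // Keep the text that precedes the thinking block.
        out.push_str(&rest[..start]);
        match rest[start..].find(CLOSE) {
            // `end` is relative to `start`, so skip past the closing tag.
            Some(end) => rest = &rest[start + end + CLOSE.len()..],
            None => {
                rest = "";
                break;
            }
        }
    }
    out.push_str(rest);
    out
}

/// Output token budget: half of the detected context window.
fn output_token_budget(context_len: u32) -> u32 {
    context_len / 2
}
```

Stripping before JSON extraction matters because a thinking block can itself contain braces, which a naive "first `{` to last `}`" extractor would pick up.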

## Output Files

- `strategy_001.json` through `strategy_NNN.json` - Every strategy attempted (full JSON)
- `validated_001.json` through `validated_NNN.json` - Strategies that passed OOS validation (includes in-sample + OOS metrics)
- `best_strategy.json` - Strategy with highest average Sharpe across instruments
- `run_ledger.jsonl` - Persistent record of all backtests for learning across runs

## Common Tasks

### Adding a new CLI flag

1. Add field to `Cli` struct in `config.rs`
2. Add clap derive attribute with `#[arg(short, long, env = "VAR_NAME")]`
3. Use the flag in `agent::run()` via `cli.flag_name`

### Extending the DSL

1. Update `src/dsl-schema.json` with new expression kinds
2. Add validation logic in `validate_strategy()` if needed
3. Update prompts in `prompts.rs` to guide the model

### Modifying the learning loop

1. Edit `load_prior_summary()` in `agent.rs` to change how prior results are formatted
2. Adjust `diagnose_history()` to add new diagnostics or change convergence detection
3. Update `prompts.rs::iteration_prompt()` to incorporate new information

### Adding new validation checks

Add to `validate_strategy()` in `agent.rs`. Returns `(hard_errors, warnings)` where hard errors block submission and warnings are logged but allow the backtest to proceed.

## Testing Strategy

The codebase uses `anyhow` for error handling and `tracing` for logging. Key test areas:

- Strategy JSON extraction from various response formats
- Context length detection from LM Studio/OpenAI endpoints
- Ledger entry serialization/deserialization
- Backtest result parsing from swym API responses
- Deduplication logic
- Convergence detection in `diagnose_history()`
133
docs/plan/cross-run-learning.md
Normal file
@@ -0,0 +1,133 @@
# Plan: Cross-run learning via run ledger and compare endpoint

## Context

Scout currently starts from scratch every run — no memory of prior iterations. The upstream
patch `e47c18` adds:
1. **Enriched `result_summary`**: sortino_ratio, calmar_ratio, max_drawdown, pnl_return,
   avg_win, avg_loss, max_win, max_loss, avg_hold_duration_secs
2. **Compare endpoint**: `GET /api/v1/paper-runs/compare?ids=uuid1,uuid2,...` returns
   `RunMetricsSummary` for up to 50 runs in one call

Goal: persist enough state across runs so that iteration 1 of a new run starts informed by
all previous runs' strategies and outcomes.

## Changes

### 1. Run ledger — persist strategy + run_id per backtest (`src/agent.rs`)

After each successful `run_single_backtest`, append a JSONL entry to `{output_dir}/run_ledger.jsonl`:

```json
{"run_id":"uuid","instrument":"BTCUSDC","candle_interval":"4h","strategy":{...},"timestamp":"2026-03-10T12:38:15Z"}
```

One line per instrument-backtest (3 per iteration for 3 instruments). The strategy JSON is
duplicated across instrument entries for the same iteration — this keeps the format flat and
self-contained.

Use `OpenOptions::new().append(true).create(true)` — no locking needed since scout is single-threaded.
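
A minimal sketch of the append step, using only the standard library. The helper name `append_jsonl_line` is hypothetical, and the sketch takes an already-serialised JSON string rather than a `LedgerEntry` so that it stays self-contained.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append one already-serialised JSON object as a single line of a JSONL file.
/// The file is created on first use and never truncated.
fn append_jsonl_line(path: &Path, json_line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().append(true).create(true).open(path)?;
    // Build the full line first so the entry and its newline are handed to
    // the OS together rather than as two separate writes.
    let mut bytes = Vec::with_capacity(json_line.len() + 1);
    bytes.extend_from_slice(json_line.as_bytes());
    bytes.push(b'\n');
    file.write_all(&bytes)
}
```

Opening in append mode on every call is simple and keeps no file handle alive between backtests, which seems reasonable at a few entries per iteration.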

### 2. Load prior runs on startup (`src/agent.rs`)

At the top of `run()`, before the iteration loop:
1. Read `run_ledger.jsonl` if it exists (ignore if missing — first run)
2. Collect all `run_id`s
3. Call `swym.compare_runs(&run_ids)` (batching in groups of 50)
4. Join metrics back to strategies from the ledger
5. Group by strategy (entries with the same strategy JSON share an iteration)
6. Rank by average sharpe across instruments
7. Build a `prior_results_summary: Option<String>` for the initial prompt
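
Steps 3, 5, and 6 can be sketched with plain standard-library types. This is an illustration under simplifying assumptions: run ids and strategy keys are plain strings, and the helper names `batch_ids` and `rank_by_avg_sharpe` are made up for the example rather than taken from the codebase.

```rust
use std::collections::HashMap;

/// Split run ids into batches no larger than `batch_size`
/// (the plan uses 50, the compare endpoint's limit). `batch_size` must be > 0.
fn batch_ids(ids: &[String], batch_size: usize) -> Vec<Vec<String>> {
    ids.chunks(batch_size).map(|chunk| chunk.to_vec()).collect()
}

/// Group `(strategy_key, sharpe)` rows by strategy and rank by average Sharpe,
/// best first. Rows without a Sharpe value are ignored when averaging;
/// strategies with no values at all sort last.
fn rank_by_avg_sharpe(rows: &[(String, Option<f64>)]) -> Vec<(String, f64)> {
    let mut groups: HashMap<&str, Vec<f64>> = HashMap::new();
    for (key, sharpe) in rows {
        let group = groups.entry(key.as_str()).or_default();
        if let Some(s) = sharpe {
            group.push(*s);
        }
    }
    let mut ranked: Vec<(String, f64)> = groups
        .into_iter()
        .map(|(key, values)| {
            let avg = if values.is_empty() {
                f64::NEG_INFINITY
            } else {
                values.iter().sum::<f64>() / values.len() as f64
            };
            (key.to_string(), avg)
        })
        .collect();
    // Descending by average; ties broken by key so the order is deterministic.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}
```

Using `NEG_INFINITY` for strategies without metrics pushes them to the bottom of the ranking instead of dropping them, so they still count toward the total.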

### 3. Compare endpoint client (`src/swym.rs`)

Add `RunMetricsSummary` struct:

```rust
pub struct RunMetricsSummary {
    pub id: Uuid,
    pub status: String,
    pub candle_interval: Option<String>,
    pub total_positions: Option<u32>,
    pub win_rate: Option<f64>,
    pub profit_factor: Option<f64>,
    pub net_pnl: Option<f64>,
    pub sharpe_ratio: Option<f64>,
    pub sortino_ratio: Option<f64>,
    pub calmar_ratio: Option<f64>,
    pub max_drawdown: Option<f64>,
    pub pnl_return: Option<f64>,
    pub avg_win: Option<f64>,
    pub avg_loss: Option<f64>,
    pub max_win: Option<f64>,
    pub max_loss: Option<f64>,
    pub avg_hold_duration_secs: Option<f64>,
}
```

Add `SwymClient::compare_runs(&self, run_ids: &[Uuid]) -> Result<Vec<RunMetricsSummary>>`:
- `GET {base_url}/paper-runs/compare?ids={comma_separated}`
- Parse JSON array response using `parse_number()` for decimal strings
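
The request shape and the decimal-string handling might look roughly like the sketch below. Both helpers are hypothetical: `compare_url` takes ids as strings to avoid the `uuid` dependency, and `parse_decimal` only approximates what the existing `parse_number()` is described as doing, since its real signature is not shown here.

```rust
/// Build the compare URL for one batch of run ids.
/// Ids are joined with commas; UUIDs contain only URL-safe characters,
/// so this sketch does no percent-encoding.
fn compare_url(base_url: &str, run_ids: &[&str]) -> String {
    format!(
        "{}/paper-runs/compare?ids={}",
        base_url.trim_end_matches('/'),
        run_ids.join(",")
    )
}

/// Parse a metric that arrives as a decimal string.
/// Returns `None` for anything that is not a finite number.
fn parse_decimal(raw: &str) -> Option<f64> {
    raw.trim().parse::<f64>().ok().filter(|v| v.is_finite())
}
```

Filtering out non-finite values keeps `NaN` and `inf` from leaking into averages such as the per-strategy Sharpe.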

### 4. Enrich `BacktestResult` with new fields (`src/swym.rs`)

Add to `BacktestResult`: `sortino_ratio`, `calmar_ratio`, `max_drawdown`, `pnl_return`,
`avg_win`, `avg_loss`, `max_win`, `max_loss`, `avg_hold_duration_secs`.

Parse all in `from_response()` via existing `parse_number()`.

Update `summary_line()` to include `max_dd={:.1}%` and `sortino={:.2}` when present —
these two are the most useful additions for the model's reasoning.

### 5. Prior-results-aware initial prompt (`src/prompts.rs`)

Modify `initial_prompt()` to accept `prior_summary: Option<&str>`.

When present, insert before the "Design a trading strategy" instruction:

```
## Learnings from {N} prior backtests across {M} strategies

{top 5 strategies ranked by avg sharpe, each showing:}
- Interval, rule count, avg metrics across instruments
- One-line description of the strategy approach (extracted from rule comments)
- Full strategy JSON for the top 1-2

{compact table of all prior strategies' avg metrics}

Use these insights to avoid repeating failed approaches and to build on what worked.
```

Limit to ~2000 tokens of prior context to avoid crowding the prompt. If many prior runs,
show only the top 5 + bottom 3 (worst performers to avoid), plus a count of total runs.
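
The "top 5 + bottom 3" selection can be written as a small slice helper. This is a sketch, not the plan's implementation: `select_top_and_worst` is a hypothetical name, and it deliberately keeps the two groups disjoint, so a strategy never appears in both lists when there are only six or seven in total.

```rust
/// Given items already sorted best-first, return `(top, worst)`:
/// up to `top_n` best items, and up to `worst_n` items from the tail
/// that are not already part of `top`.
fn select_top_and_worst<T>(sorted: &[T], top_n: usize, worst_n: usize) -> (&[T], &[T]) {
    let top_len = sorted.len().min(top_n);
    // Start of the tail window, clamped so it never overlaps the top slice.
    let worst_start = sorted.len().saturating_sub(worst_n).max(top_len);
    (&sorted[..top_len], &sorted[worst_start..])
}
```

With `top_n = 5` and `worst_n = 3`, a list of ten yields the first five and the last three, while a list of five or fewer yields an empty "worst" group.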

### 6. Ledger entry struct (`src/agent.rs`)

```rust
#[derive(Serialize, Deserialize)]
struct LedgerEntry {
    run_id: Uuid,
    instrument: String,
    candle_interval: String,
    strategy: Value,
    timestamp: String,
}
```

## Files to modify

- `src/swym.rs` — `RunMetricsSummary` struct, `compare_runs()` method, enrich `BacktestResult`
  with new fields, update `summary_line()`
- `src/agent.rs` — `LedgerEntry` struct, append-to-ledger after backtest, load-ledger-on-startup,
  call compare endpoint, build prior summary, pass to initial prompt
- `src/prompts.rs` — `initial_prompt()` accepts optional prior summary

## Verification

1. `cargo build --release`
2. Run once → confirm `run_ledger.jsonl` is created with entries
3. Run again → confirm:
   - Ledger is loaded, compare endpoint is called
   - Iteration 1 prompt includes prior results summary (visible at debug log level)
   - New entries are appended (not overwritten)
4. Check that enriched metrics (sortino, max_drawdown) appear in summary_line output
235
src/agent.rs
@@ -1,14 +1,26 @@
use std::io::Write as IoWrite;
use std::path::Path;
use std::time::Duration;

use anyhow::{Context, Result};
use serde::{Deserialize, Serialize};
use serde_json::Value;
use tracing::{debug, error, info, warn};
use uuid::Uuid;

use crate::claude::{self, ClaudeClient, Message};
use crate::config::{Cli, Instrument};
use crate::prompts;
use crate::swym::{BacktestResult, SwymClient};
use crate::swym::{BacktestResult, RunMetricsSummary, SwymClient};

/// Persistent record of a single completed backtest, written to the run ledger.
#[derive(Debug, Serialize, Deserialize)]
struct LedgerEntry {
    run_id: Uuid,
    instrument: String,
    candle_interval: String,
    strategy: Value,
}

/// A single iteration's record: strategy + results across instruments.
#[derive(Debug)]
@@ -190,14 +202,24 @@ pub async fn run(cli: &Cli) -> Result<()> {

    // Load DSL schema for the system prompt
    let schema = include_str!("dsl-schema.json");
    let system = prompts::system_prompt(schema, claude.family());
    let has_futures = instruments.iter().any(|i| i.is_futures());
    let system = prompts::system_prompt(schema, claude.family(), has_futures);
    info!("model family: {}", claude.family().name());

    // Resolve ledger path: explicit --ledger-file takes precedence, else <output_dir>/run_ledger.jsonl
    let ledger_path = cli.ledger_file.clone().unwrap_or_else(|| cli.output_dir.join("run_ledger.jsonl"));
    info!("ledger: {}", ledger_path.display());

    // Load prior runs from ledger and build cross-run context for iteration 1
    let prior_summary = load_prior_summary(&ledger_path, &swym).await;

    // Agent state
    let mut history: Vec<IterationRecord> = Vec::new();
    let mut conversation: Vec<Message> = Vec::new();
    let mut best_strategy: Option<(f64, Value)> = None; // (avg_sharpe, strategy)
    let mut consecutive_failures = 0u32;
    // Deduplication: track canonical strategy JSON → first iteration it was tested.
    let mut tested_strategies: std::collections::HashMap<String, u32> = std::collections::HashMap::new();

    let instrument_names: Vec<String> = instruments.iter().map(|i| i.symbol.clone()).collect();

@@ -206,7 +228,7 @@ pub async fn run(cli: &Cli) -> Result<()> {

        // Build the user prompt
        let user_msg = if iteration == 1 {
            prompts::initial_prompt(&instrument_names, &available_intervals)
            prompts::initial_prompt(&instrument_names, &available_intervals, prior_summary.as_deref(), has_futures)
        } else {
            let results_text = history
                .iter()
@@ -372,6 +394,27 @@ pub async fn run(cli: &Cli) -> Result<()> {
            }
        }

        // Deduplication check: skip strategies identical to one already tested this run.
        let strategy_key = serde_json::to_string(&strategy).unwrap_or_default();
        if let Some(&first_iter) = tested_strategies.get(&strategy_key) {
            warn!("duplicate strategy (identical to iteration {first_iter}), skipping backtest");
            let record = IterationRecord {
                iteration,
                strategy: strategy.clone(),
                results: vec![],
                validation_notes: vec![format!(
                    "DUPLICATE: this exact strategy was already tested in iteration {first_iter}. \
                     You submitted identical JSON. You MUST design a completely different strategy — \
                     different indicator family, different entry conditions, or different timeframe. \
                     Do NOT submit the same JSON again."
                )],
            };
            info!("{}", record.summary());
            history.push(record);
            continue;
        }
        tested_strategies.insert(strategy_key, iteration);

        // Run backtests against all instruments (in-sample)
        let mut results: Vec<BacktestResult> = Vec::new();

@@ -397,12 +440,13 @@ pub async fn run(cli: &Cli) -> Result<()> {
                            info!(" condition audit: {}", serde_json::to_string_pretty(audit).unwrap_or_default());
                        }
                    }
                    append_ledger_entry(&ledger_path, &result, &strategy);
                    results.push(result);
                }
                Err(e) => {
                    warn!(" backtest failed for {}: {e:#}", inst.symbol);
                    results.push(BacktestResult {
                        run_id: uuid::Uuid::nil(),
                        run_id: Uuid::nil(),
                        instrument: inst.symbol.clone(),
                        status: "failed".to_string(),
                        total_positions: None,
@@ -413,6 +457,15 @@ pub async fn run(cli: &Cli) -> Result<()> {
                        total_pnl: None,
                        net_pnl: None,
                        sharpe_ratio: None,
                        sortino_ratio: None,
                        calmar_ratio: None,
                        max_drawdown: None,
                        pnl_return: None,
                        avg_win: None,
                        avg_loss: None,
                        max_win: None,
                        max_loss: None,
                        avg_hold_duration_secs: None,
                        total_fees: None,
                        avg_bars_in_trade: None,
                        error_message: Some(e.to_string()),
@@ -550,6 +603,7 @@ async fn run_single_backtest(
            &inst.symbol,
            &inst.base(),
            &inst.quote(),
            inst.market_kind(),
            strategy,
            starts_at,
            finishes_at,
@@ -573,6 +627,179 @@ async fn run_single_backtest(
    Ok(BacktestResult::from_response(&final_resp, &inst.symbol))
}

/// Append a ledger entry for a completed backtest so future runs can learn from it.
fn append_ledger_entry(ledger: &Path, result: &BacktestResult, strategy: &Value) {
    // Skip nil run_ids (error placeholders)
    if result.run_id == Uuid::nil() {
        return;
    }
    let entry = LedgerEntry {
        run_id: result.run_id,
        instrument: result.instrument.clone(),
        candle_interval: strategy["candle_interval"]
            .as_str()
            .unwrap_or("?")
            .to_string(),
        strategy: strategy.clone(),
    };
    // Append newline inside the serialised bytes so the entire write is a single
    // write_all() syscall — O_APPEND + single write() is atomic on Linux local
    // filesystems, making concurrent instances safe for typical entry sizes.
    let mut bytes = match serde_json::to_vec(&entry) {
        Ok(b) => b,
        Err(e) => {
            warn!("could not serialize ledger entry: {e}");
            return;
        }
    };
    bytes.push(b'\n');
    if let Err(e) = std::fs::OpenOptions::new()
        .append(true)
        .create(true)
        .open(ledger)
        .and_then(|mut f| f.write_all(&bytes))
    {
        warn!("could not write ledger entry: {e}");
    }
}

/// Load the run ledger, fetch metrics via the compare endpoint, and return a compact
/// prior-results summary string for the initial prompt. Returns `None` if the ledger
/// is absent, empty, or the compare call fails.
async fn load_prior_summary(ledger: &Path, swym: &SwymClient) -> Option<String> {
    let contents = std::fs::read_to_string(ledger).ok()?;

    // Parse all ledger entries (blank and malformed lines are skipped)
    let entries: Vec<LedgerEntry> = contents
        .lines()
        .filter(|l| !l.trim().is_empty())
        .filter_map(|l| serde_json::from_str(l).ok())
        .collect();
    if entries.is_empty() {
        return None;
    }
    info!("loaded {} ledger entries from previous runs", entries.len());

    // Fetch metrics for all run_ids
    let run_ids: Vec<Uuid> = entries.iter().map(|e| e.run_id).collect();
    let metrics = match swym.compare_runs(&run_ids).await {
        Ok(m) => m,
        Err(e) => {
            warn!("could not fetch prior run metrics: {e}");
            return None;
        }
    };

    // Build a map from run_id → metrics
    let metrics_map: std::collections::HashMap<Uuid, &RunMetricsSummary> =
        metrics.iter().map(|m| (m.id, m)).collect();

    // Group entries by strategy, using the full strategy JSON as the grouping key.
    let mut strategy_groups: std::collections::HashMap<String, Vec<(&LedgerEntry, Option<&RunMetricsSummary>)>> =
        std::collections::HashMap::new();
    // Cap at 3 entries per unique strategy (one per instrument is enough).
    // Without this, a strategy repeated across many iterations swamps the summary.
    for entry in &entries {
        let key = serde_json::to_string(&entry.strategy).unwrap_or_default();
        let group = strategy_groups.entry(key).or_default();
        if group.len() < 3 {
            let m = metrics_map.get(&entry.run_id).copied();
            group.push((entry, m));
        }
    }

    // Compute avg sharpe per strategy group
    let mut strategies: Vec<(f64, &Value, Vec<(&LedgerEntry, Option<&RunMetricsSummary>)>)> = strategy_groups
        .into_values()
        .map(|group| {
            let sharpes: Vec<f64> = group
                .iter()
                .filter_map(|(_, m)| m.and_then(|m| m.sharpe_ratio))
                .collect();
            let avg_sharpe = if sharpes.is_empty() {
                f64::NEG_INFINITY
            } else {
                sharpes.iter().sum::<f64>() / sharpes.len() as f64
            };
            let strategy = &group[0].0.strategy;
            (avg_sharpe, strategy, group)
        })
        .collect();
    strategies.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));

    let total_strategies = strategies.len();
    let total_backtests = entries.len();

    // Build summary text: top 5 + bottom 3 (when there are more than 5), truncated to ~6000 bytes below
    let mut lines = vec![format!(
        "## Learnings from {} prior backtests across {} strategies\n",
        total_backtests, total_strategies
    )];
    lines.push("### Best strategies (ranked by avg Sharpe):".to_string());

    let show_top = strategies.len().min(5);
    for (rank, (avg_sharpe, strategy, group)) in strategies.iter().take(show_top).enumerate() {
        let interval = strategy["candle_interval"].as_str().unwrap_or("?");
        let rule_count = strategy["rules"].as_array().map(|r| r.len()).unwrap_or(0);
        // Collect per-instrument metrics
        let inst_lines: Vec<String> = group
            .iter()
            .filter_map(|(entry, m)| {
                let m = (*m)?;
                Some(format!(
                    " {}: trades={} sharpe={:.3} net_pnl={:.2}{}",
                    entry.instrument,
                    m.total_positions.unwrap_or(0),
                    m.sharpe_ratio.unwrap_or(0.0),
                    m.net_pnl.unwrap_or(0.0),
                    m.max_drawdown.map(|d| format!(" max_dd={:.1}%", d * 100.0)).unwrap_or_default(),
                ))
            })
            .collect();
        // Pull the first rule comment as a strategy description
        let description = strategy["rules"][0]["comment"]
            .as_str()
            .unwrap_or("(no description)");
        lines.push(format!(
            "\n [{interval}, {rule_count} rules, avg_sharpe={avg_sharpe:.3}] {description}"
        ));
        lines.extend(inst_lines);
        // Include full JSON only for the top 2 (`rank` is the 0-based position in the sorted list)
        if rank < 2 {
            lines.push(format!(
                " strategy JSON: {}",
                serde_json::to_string(strategy).unwrap_or_default()
            ));
        }
    }

    // Worst 3 (if we have more than 5)
    if strategies.len() > 5 {
        lines.push("\n### Worst strategies (avoid repeating these):".to_string());
        let worst_start = strategies.len().saturating_sub(3);
        for (avg_sharpe, strategy, _) in strategies.iter().skip(worst_start) {
            let interval = strategy["candle_interval"].as_str().unwrap_or("?");
            let description = strategy["rules"][0]["comment"].as_str().unwrap_or("(no description)");
            lines.push(format!(" [{interval}, avg_sharpe={avg_sharpe:.3}] {description}"));
        }
    }

    lines.push(
        "\nUse these results to avoid repeating failed approaches and build on what worked.\n".to_string(),
    );

    let summary = lines.join("\n");
    // Truncate to ~6000 bytes to stay within prompt budget. Back up to a char
    // boundary first: slicing a `str` in the middle of a multi-byte character panics.
    if summary.len() > 6000 {
        let mut cut = 5900;
        while !summary.is_char_boundary(cut) {
            cut -= 1;
        }
        Some(format!("{}…\n[truncated — {} total strategies]\n", &summary[..cut], total_strategies))
    } else {
        Some(summary)
    }
}

fn save_validated_strategy(
    output_dir: &Path,
    iteration: u32,
@@ -118,6 +118,13 @@ pub struct Cli {
    #[arg(long, default_value = "./results")]
    pub output_dir: PathBuf,

    /// Path to the run ledger JSONL file used for cross-run learning.
    /// Defaults to <output_dir>/run_ledger.jsonl when not specified.
    /// Pass a different path to seed a new run from a specific ledger
    /// (e.g. a curated export from a previous campaign).
    #[arg(long)]
    pub ledger_file: Option<PathBuf>,

    /// Poll interval in seconds when waiting for backtest completion.
    #[arg(long, default_value_t = 2)]
    pub poll_interval_secs: u64,
@@ -167,4 +174,22 @@ impl Instrument {
        }
        "usdc".to_string()
    }

    /// Instrument kind for the paper-run config `instrument.kind` field.
    /// Derived from the exchange identifier (case-insensitive).
    pub fn market_kind(&self) -> &'static str {
        let e = self.exchange.to_ascii_lowercase();
        if e.contains("futures_usd") || e.contains("futures_um") {
            "futures_um"
        } else if e.contains("futures_coin") || e.contains("futures_cm") {
            "futures_cm"
        } else {
            "spot"
        }
    }

    /// True when this instrument is traded on a futures market.
    pub fn is_futures(&self) -> bool {
        self.market_kind() != "spot"
    }
}
@@ -74,6 +74,11 @@
            { "$ref": "#/definitions/SizingFixedUnits" },
            { "$ref": "#/definitions/Expr" }
          ]
        },
        "reverse": {
          "type": "boolean",
          "default": false,
          "description": "Flip-through-zero flag (futures only). When true and an opposite position is currently open, the submitted order quantity becomes position_qty + configured_qty, closing the existing position and immediately opening a new one in the opposite direction in a single order. When flat the flag has no effect and configured_qty is used as normal. Omit or set false for standard close-only behaviour."
        }
      }
    },
292
src/prompts.rs
@@ -4,7 +4,7 @@ use crate::config::ModelFamily;
///
/// Accepts a `ModelFamily` so each family can receive tailored guidance
/// while sharing the common DSL schema and strategy evaluation rules.
pub fn system_prompt(dsl_schema: &str, family: &ModelFamily) -> String {
pub fn system_prompt(dsl_schema: &str, family: &ModelFamily, has_futures: bool) -> String {
    let output_instructions = match family {
        ModelFamily::DeepSeekR1 => {
            "## Output format\n\n\
@@ -103,6 +103,14 @@ Buy a fixed number of base units (semantic alias for a decimal string):
"right":{{"kind":"func","name":"atr","period":14}}}}
```

CRITICAL — ATR sizing and balance limits: `N/atr(14)` expresses quantity in BASE asset units.
For BTC, 4h ATR ≈ $1500–3000. So `1000/atr(14)` ≈ 0.4–0.7 BTC ≈ $32k–56k notional —
silently rejected on a $10k account (fill returns None, 0 positions open, no error shown).
The numerator N represents your intended dollar risk per trade. For a $10k account keep N ≤ 200.
`200/atr(14)` ≈ 0.07–0.13 BTC ≈ $5.6k–10k notional — fits within a $10k account.
Prefer `percent_of_balance` for most sizing. Only reach for ATR-based Expr sizing when you need
volatility-scaled position risk, and keep the numerator proportional to your risk tolerance.

**4. Exit rules** — use `position_quantity` to close the exact open size:
```json
{{"kind":"position_quantity"}}
@@ -110,14 +118,35 @@ Buy a fixed number of base units (semantic alias for a decimal string):
Alternatively, `"9999"` works for exits: sell quantities are automatically capped to the open
position size, so a large fixed number is equivalent to `position_quantity`.

CRITICAL mistakes to never make:
- `{{"method":"position_quantity"}}` is WRONG — `position_quantity` is an Expr, not a SizingMethod.
CORRECT: `{{"kind":"position_quantity"}}`. The `"method"` field belongs ONLY to the three
declarative sizing objects (`fixed_sum`, `percent_of_balance`, `fixed_units`).
CRITICAL — the `"method"` vs `"kind"` distinction:
- `"method"` belongs ONLY to the three declarative sizing objects: `fixed_sum`, `percent_of_balance`, `fixed_units`.
- `"kind"` belongs to Expr objects: `position_quantity`, `bin_op`, `func`, `field`, `literal`, etc.
- `{{"method":"position_quantity"}}` is ALWAYS WRONG. It will be rejected every time.
CORRECT: `{{"kind":"position_quantity"}}`.
- If you used `{{"method":"percent_of_balance",...}}` for the buy, use `{{"kind":"position_quantity"}}` for the sell.
These are different object types — buy uses a SizingMethod (`method`), sell uses an Expr (`kind`).
- `{{"method":"fixed_sum","amount":"100","multiplier":"2.0"}}` is WRONG — `fixed_sum` has no
`multiplier` field. Only `amount` is accepted alongside `method`.
- NEVER add extra fields to SizingMethod objects — they use `additionalProperties: false`.

### Reverse / flip-through-zero (futures only)

Setting `"reverse": true` on a rule action enables a single-order position flip on futures.
When an opposite position is open, quantity = `position_qty + configured_qty`, which closes
the existing position and opens a new one in the opposite direction in one order (fees split
proportionally). When flat the flag has no effect — `configured_qty` is used normally.

This lets you collapse a 4-rule long+short strategy (separate open/close for each leg) into
2 rules, reducing round-trip fees and keeping logic compact:

```json
{{"side": "sell", "quantity": {{"method": "percent_of_balance", "percent": "10", "asset": "usdc"}}, "reverse": true}}
```

Use `reverse` when you always want to be in a position — the signal flips you from long to
short (or vice versa) rather than first exiting and then re-entering separately. Do NOT use
`reverse` on spot markets (short selling is not supported there).

### Multi-timeframe
Any expression can reference a different timeframe via "timeframe" field.
Use higher timeframes as trend filters, lower timeframes for entry precision.
@@ -142,6 +171,13 @@ Use higher timeframes as trend filters, lower timeframes for entry precision.
6. **Composite / hybrid**: Combine families. Trend filter + mean-reversion entry.
Momentum confirmation + volatility sizing.

7. **Supertrend consensus flip (futures only)**: Use `any_of` across multiple
Supertrend configs (e.g. period=7/mul=1.5, period=10/mul=2.0, period=20/mul=3.0)
so that ANY flip triggers a long or short entry. Combine with `"reverse": true`
for an always-in-market approach where the opposite signal is the stop-loss.
Varying multiplier tightens/loosens the band; varying period controls sensitivity.
Risk: choppy markets generate many whipsaws — best on daily or 4h.

## Risk management (always include)

Every strategy MUST have:
@@ -149,6 +185,10 @@ Every strategy MUST have:
|
||||
- A time-based exit: use bars_since_entry to avoid holding losers indefinitely
|
||||
- Reasonable position sizing: prefer ATR-based or percent-of-balance over fixed quantity
|
||||
|
||||
Exception: always-in-market flip strategies (using `"reverse": true`) do not need an
|
||||
explicit stop-loss or time exit — the opposite signal acts as the stop. These are
|
||||
only valid on futures. See Example 6 and Example 7.
|
||||
|
||||
{output_instructions}
|
||||
|
||||
## Interpreting backtest results
|
||||
@@ -157,7 +197,11 @@ When I share results from previous iterations, use them to guide your next strat
|
||||
|
||||
- **Zero trades**: The entry conditions are too restrictive or never co-occur.
|
||||
Relax thresholds, simplify conditions, or check if the indicator periods make
|
||||
sense for the candle interval.
|
||||
sense for the candle interval. Also check your position sizing — if using an
|
||||
ATR-based Expr quantity (`N/atr(14)`), a large N can produce a notional value
|
||||
exceeding your account balance (e.g. `1000/atr(14)` on BTC ≈ 0.4 BTC ≈ $32k),
|
||||
which is silently rejected by the fill engine. Switch to `percent_of_balance`
|
||||
or reduce N to ≤ 200 for a $10k account.
|
||||
|
||||
- **Many trades but negative PnL**: The entry signal has no edge, or the exit
|
||||
logic is poor. Try different indicator combinations, add trend filters, or
|
||||
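The ATR sizing arithmetic in the "Zero trades" bullet above can be checked with a small sketch. The function names are hypothetical, and the inputs (ATR of about $2,500 and a BTC price of about $80,000) are not stated in the prompt; they are the values implied by its "0.4 BTC ≈ $32k" figure.

```rust
/// Base-asset quantity produced by an ATR-based sizing expression `N / atr`.
pub fn atr_quantity(n: f64, atr: f64) -> f64 {
    n / atr
}

/// Notional value of a quantity in quote currency.
pub fn notional(qty: f64, price: f64) -> f64 {
    qty * price
}

/// True when the order's notional is larger than the account balance,
/// which is the situation the prompt says is silently rejected.
pub fn exceeds_balance(n: f64, atr: f64, price: f64, balance: f64) -> bool {
    notional(atr_quantity(n, atr), price) > balance
}
```

With these assumed inputs, `N = 1000` gives roughly 0.4 BTC and about $32k of notional, which exceeds a $10k balance, while `N = 200` gives 0.08 BTC and about $6.4k, which fits.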
@@ -190,6 +234,9 @@ Common mistakes to NEVER make:
|
||||
- `"kind": "expr_field"` does NOT exist. Use `{{"kind":"field","field":"close"}}`.
|
||||
- Every Expr object MUST have a `"kind"` field. `{{"field":"close"}}` is WRONG — missing `"kind"`.
|
||||
CORRECT: `{{"kind":"field","field":"close"}}`. The `"kind"` is never optional.
|
||||
This applies to ALL field access including offset lookups:
|
||||
`{{"field":"volume","offset":-1}}` is WRONG. CORRECT: `{{"kind":"field","field":"volume","offset":-1}}`.
|
||||
`{{"field":"high","offset":-2}}` is WRONG. CORRECT: `{{"kind":"field","field":"high","offset":-2}}`.
|
||||
- `rsi`, `adx`, `supertrend` are NOT valid inside `apply_func`. Use only `apply_func`
|
||||
with `ApplyFuncName` values: `highest`, `lowest`, `sma`, `ema`, `wma`, `std_dev`, `sum`,
|
||||
`bollinger_upper`, `bollinger_lower`.
|
||||
@@ -473,26 +520,241 @@ CRITICAL: `apply_func` uses `"input"`, not `"expr"`. Writing `"expr":` will be r
|
||||
- Don't set RSI thresholds at extreme values (< 10 or > 90) — too rare to fire
|
||||
- Don't use very short periods (< 5) on high timeframes — noisy
|
||||
- Don't use very long periods (> 100) on low timeframes — too slow to react
|
||||
- Don't switch to 15m or shorter intervals when results are poor — higher frequency amplifies
|
||||
fees and noise, making edge harder to find. Prefer 1h or 4h. If Sharpe is negative across
|
||||
intervals, the issue is signal logic, not timeframe — fix the signal before changing interval.
|
||||
- Don't create strategies with more than 5-6 conditions — overfitting risk
|
||||
- Don't ignore fees — a strategy needs to overcome 0.1% per round trip
|
||||
- Always gate buy rules with position state "flat" and sell rules with "long"
|
||||
- Never add a short-entry (sell when flat) rule — spot markets are long-only
|
||||
- Never use an expression object for `quantity` — it must always be a plain decimal string like `"0.01"`
|
||||
- Never use a placeholder string for `quantity` — `"ATR_SIZED"`, `"FULL_BALANCE"`, `"dynamic"`, etc. are all invalid and will be rejected. Use `"0.01"` or similar.
|
||||
"##
|
||||
- Spot markets are long-only: gate buy (entry) rules with state "flat" and sell (exit) rules with state "long". Never add a short-entry (sell when flat) rule on spot.
|
||||
- Futures markets support both directions: long entry = buy when flat; long exit = sell when long; short entry = sell when flat; short exit (cover) = buy when short. Always include a stop-loss and time exit for both long and short legs.
|
||||
- Never use a placeholder string for `quantity` — `"ATR_SIZED"`, `"FULL_BALANCE"`, `"dynamic"`, etc. are all invalid and will be rejected.
|
||||
- Don't use large ATR-based sizing numerators. `N/atr(14)` gives BASE units; for BTC (ATR ≈ $2000
|
||||
on 4h), `1000/atr(14)` ≈ 0.5 BTC ≈ $40k — silently rejected on a $10k account. Keep N ≤ 200
|
||||
or use `percent_of_balance`. The condition audit may show entry conditions firing while 0 positions
|
||||
open — this is the typical symptom of an oversized ATR quantity.
|
||||
- `{{"method":"position_quantity"}}` is WRONG for exit rules — use `{{"kind":"position_quantity"}}` (see Quantity section above).
|
||||
{futures_examples}"##,
|
||||
futures_examples = if has_futures { FUTURES_SHORT_EXAMPLES } else { "" },
|
||||
)
|
||||
}
|
||||
|
||||
/// Short-entry and short-exit strategy examples, injected into the system prompt when
|
||||
/// futures instruments are present.
|
||||
const FUTURES_SHORT_EXAMPLES: &str = r##"
|
||||
|
||||
### Example 5 — Futures short: EMA trend-following short with ATR stop
|
||||
|
||||
On futures you can also short. Short entry = `"side": "sell"` when `"state": "flat"`;
|
||||
short exit (cover) = `"side": "buy"` when `"state": "short"`. Stop-loss for a short
|
||||
is price rising above entry, e.g. entry_price * 1.02. You may run long and short legs
|
||||
in the same strategy (4 rules total), or a short-only strategy (2 rules).
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "rule_based",
|
||||
"candle_interval": "4h",
|
||||
"rules": [
|
||||
{
|
||||
"comment": "Short entry: EMA9 crosses below EMA21 while price is below EMA50 (downtrend)",
|
||||
"when": {
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{"kind": "position", "state": "flat"},
|
||||
{"kind": "ema_crossover", "fast_period": 9, "slow_period": 21, "direction": "below"},
|
||||
{"kind": "ema_trend", "period": 50, "direction": "below"}
|
||||
]
|
||||
},
|
||||
"then": {"side": "sell", "quantity": {"method": "percent_of_balance", "percent": "10", "asset": "usdc"}}
|
||||
},
|
||||
{
|
||||
"comment": "Short exit: EMA9 crosses back above EMA21, OR 2% stop-loss, OR 48-bar time exit",
|
||||
"when": {
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{"kind": "position", "state": "short"},
|
||||
{
|
||||
"kind": "any_of",
|
||||
"conditions": [
|
||||
{"kind": "ema_crossover", "fast_period": 9, "slow_period": 21, "direction": "above"},
|
||||
{
|
||||
"kind": "compare",
|
||||
"left": {"kind": "field", "field": "close"},
|
||||
"op": ">",
|
||||
"right": {"kind": "bin_op", "op": "mul", "left": {"kind": "entry_price"}, "right": {"kind": "literal", "value": "1.02"}}
|
||||
},
|
||||
{
|
||||
"kind": "compare",
|
||||
"left": {"kind": "bars_since_entry"},
|
||||
"op": ">=",
|
||||
"right": {"kind": "literal", "value": "48"}
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
"then": {"side": "buy", "quantity": {"kind": "position_quantity"}}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Key short-specific notes:
|
||||
- Stop-loss for short = close > entry_price * (1 + stop_pct), e.g. `* 1.02` for 2% stop
|
||||
- Take-profit for short = close < entry_price * (1 - target_pct), e.g. `* 0.97` for 3% target
|
||||
- Short exit uses `"side": "buy"` with `{"kind": "position_quantity"}` (same as long exit uses sell)
|
||||
- `percent_of_balance` for short entry uses `"usdc"` as the asset (the collateral currency)
|
||||
|
||||
### Example 6 — Futures flip-through-zero: 2-rule EMA trend-follower using `reverse`
|
||||
|
||||
When you always want to be in a position (long during uptrends, short during downtrends),
|
||||
use `"reverse": true` to flip from one side to the other in a single order. This uses half
|
||||
the round-trip fee count compared to a 4-rule separate-entry/exit approach.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "rule_based",
|
||||
"candle_interval": "4h",
|
||||
"rules": [
|
||||
{
|
||||
"comment": "Go long (or flip short→long): EMA9 crosses above EMA21 while above EMA50",
|
||||
"when": {
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{"kind": "any_of", "conditions": [
|
||||
{"kind": "position", "state": "flat"},
|
||||
{"kind": "position", "state": "short"}
|
||||
]},
|
||||
{"kind": "ema_crossover", "fast_period": 9, "slow_period": 21, "direction": "above"},
|
||||
{"kind": "ema_trend", "period": 50, "direction": "above"}
|
||||
]
|
||||
},
|
||||
"then": {"side": "buy", "quantity": {"method": "percent_of_balance", "percent": "10", "asset": "usdc"}, "reverse": true}
|
||||
},
|
||||
{
|
||||
"comment": "Go short (or flip long→short): EMA9 crosses below EMA21 while below EMA50",
|
||||
"when": {
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{"kind": "any_of", "conditions": [
|
||||
{"kind": "position", "state": "flat"},
|
||||
{"kind": "position", "state": "long"}
|
||||
]},
|
||||
{"kind": "ema_crossover", "fast_period": 9, "slow_period": 21, "direction": "below"},
|
||||
{"kind": "ema_trend", "period": 50, "direction": "below"}
|
||||
]
|
||||
},
|
||||
"then": {"side": "sell", "quantity": {"method": "percent_of_balance", "percent": "10", "asset": "usdc"}, "reverse": true}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Key flip-strategy notes:
|
||||
- Gate each rule on `flat OR opposite` (using `any_of`) so it fires both on initial entry and on flip
|
||||
- `reverse: true` handles the flip math automatically — no need to size for `position_qty + new_qty`
|
||||
- This pattern works best for trend-following where you want continuous market exposure
|
||||
- Still add a time-based or ATR stop if you want a safety exit when the trend stalls
|
||||
|
||||
### Example 7 — Futures triple-Supertrend consensus flip
|
||||
|
||||
Multiple Supertrend instances with different period/multiplier combos act as a tiered
|
||||
signal. `any_of` fires on the FIRST flip — the fastest line (7/1.5) reacts quickly,
|
||||
the slowest (20/3.0) confirms strong trends. `reverse: true` makes it always-in-market:
|
||||
the opposite signal is the stop-loss. No explicit stop or time exit needed.
|
||||
|
||||
Varying parameters to tune:
|
||||
- Tighter multipliers (1.0–2.0) → more signals, more whipsaws
|
||||
- Looser multipliers (2.5–4.0) → fewer signals, longer holds
|
||||
- Try `all_of` instead of `any_of` to require consensus across all three (stronger filter)
|
||||
|
||||
```json
|
||||
{{
|
||||
"type": "rule_based",
|
||||
"candle_interval": "4h",
|
||||
"rules": [
|
||||
{{
|
||||
"comment": "LONG (or flip short→long): any Supertrend flips bullish",
|
||||
"when": {{
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{{"kind": "any_of", "conditions": [
|
||||
{{"kind": "position", "state": "flat"}},
|
||||
{{"kind": "position", "state": "short"}}
|
||||
]}},
|
||||
{{
|
||||
"kind": "any_of",
|
||||
"conditions": [
|
||||
{{"kind": "cross_over", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 7, "multiplier": "1.5"}}}},
|
||||
{{"kind": "cross_over", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 10, "multiplier": "2.0"}}}},
|
||||
{{"kind": "cross_over", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 20, "multiplier": "3.0"}}}}
|
||||
]
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "buy", "quantity": {{"method": "percent_of_balance", "percent": "5", "asset": "usdc"}}, "reverse": true}}
|
||||
}},
|
||||
{{
|
||||
"comment": "SHORT (or flip long→short): any Supertrend flips bearish",
|
||||
"when": {{
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{{"kind": "any_of", "conditions": [
|
||||
{{"kind": "position", "state": "flat"}},
|
||||
{{"kind": "position", "state": "long"}}
|
||||
]}},
|
||||
{{
|
||||
"kind": "any_of",
|
||||
"conditions": [
|
||||
{{"kind": "cross_under", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 7, "multiplier": "1.5"}}}},
|
||||
{{"kind": "cross_under", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 10, "multiplier": "2.0"}}}},
|
||||
{{"kind": "cross_under", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 20, "multiplier": "3.0"}}}}
|
||||
]
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "sell", "quantity": {{"method": "percent_of_balance", "percent": "5", "asset": "usdc"}}, "reverse": true}}
|
||||
}}
|
||||
]
|
||||
}}
|
||||
```
|
||||
|
||||
Key Supertrend-specific notes:
|
||||
- `supertrend` ignores `field` — it uses OHLC internally; omit the `field` param
|
||||
- `multiplier` controls band width: lower = tighter, more reactive; higher = wider, more stable
|
||||
- `any_of` → first flip triggers (responsive); `all_of` → all three must agree (conservative)
|
||||
- Gate on position state to prevent re-entries scaling into an existing position"##;
|
||||
|
||||
/// Build the user message for the first iteration (no prior results).
|
||||
pub fn initial_prompt(instruments: &[String], candle_intervals: &[String]) -> String {
|
||||
/// `prior_summary` contains a formatted summary of results from previous runs, if any.
|
||||
pub fn initial_prompt(instruments: &[String], candle_intervals: &[String], prior_summary: Option<&str>, has_futures: bool) -> String {
|
||||
let prior_section = match prior_summary {
|
||||
Some(s) => format!("{s}\n\n"),
|
||||
None => String::new(),
|
||||
};
|
||||
let starting_instruction = if prior_summary.is_some() {
|
||||
"Based on the prior results above:\n\
|
||||
- A strategy is \"promising\" if avg_sharpe >= 0.5 AND it traded >= 10 times per instrument. \
|
||||
If the best prior strategy meets both thresholds, refine it (tighten entry conditions, \
|
||||
adjust the exit, or tune the interval).\n\
|
||||
- If no prior strategy reaches avg_sharpe >= 0.5, do NOT repeat the same indicator family. \
|
||||
Scan the best-strategies list: if they all use the same core indicator (e.g. all use \
|
||||
Bollinger Bands, or all use EMA crossovers, or all use RSI threshold), your FIRST strategy \
|
||||
MUST use a completely different indicator family — for example: MACD crossover, ATR \
|
||||
breakout, volume spike, donchian channel breakout, or stochastic oscillator. Only after \
|
||||
that novelty attempt may you refine prior work.\n\
|
||||
- Never repeat an approach that produced 0 trades or fewer than 5 trades per instrument."
|
||||
} else {
|
||||
"Start with a multi-timeframe trend-following approach with proper risk management \
|
||||
(stop-loss, time exit, and ATR-based position sizing)."
|
||||
};
|
||||
let market_type = if has_futures { "futures" } else { "spot" };
|
||||
format!(
|
||||
r#"Design a trading strategy for crypto spot markets.
|
||||
r#"{prior_section}Design a trading strategy for crypto {market_type} markets.
|
||||
|
||||
Available instruments: {}
|
||||
Available candle intervals: {}
|
||||
|
||||
Start with a multi-timeframe trend-following approach with proper risk management
|
||||
(stop-loss, time exit, and ATR-based position sizing). Use "usdc" as the quote asset.
|
||||
{starting_instruction} Use "usdc" as the quote asset.
|
||||
|
||||
Respond with ONLY the strategy JSON."#,
|
||||
instruments.join(", "),
|
||||
|
||||
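One detail of the prompts.rs changes above is worth checking: the prompt template is built with `format!`, where `{{` and `}}` are escapes for literal braces, but `FUTURES_SHORT_EXAMPLES` is a plain raw-string constant that is passed in as an argument. Format arguments are inserted verbatim, so brace escapes inside them are not collapsed. Examples 5 and 6 in the constant use single braces, while Example 7 uses doubled ones; if the constant is only ever used as an argument, as this hunk suggests, the doubled braces would likely reach the model as `{{ ... }}`. The sketch below shows the behaviour with hypothetical stand-in constants and a hypothetical `render` function.

```rust
/// Stand-ins for prompt fragments stored in raw-string constants.
pub const SINGLE_BRACES: &str = r#"{"kind": "position_quantity"}"#;
pub const DOUBLED_BRACES: &str = r#"{{"kind": "position_quantity"}}"#;

/// Renders a tiny template: braces in the template literal are escaped,
/// while `examples` is inserted as an argument and left untouched.
pub fn render(examples: &str) -> String {
    format!(
        r#"template: {{"kind":"field"}} | examples: {examples}"#,
        examples = examples
    )
}
```

Calling `render(SINGLE_BRACES)` yields single braces in both halves, whereas `render(DOUBLED_BRACES)` keeps the doubled braces in the examples half.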
95
src/swym.rs
@@ -49,6 +49,37 @@ pub struct CandleCoverage {
|
||||
pub coverage_pct: Option<f64>,
|
||||
}
|
||||
|
||||
/// Response from `GET /api/v1/paper-runs/compare?ids=...`.
|
||||
#[derive(Debug, Deserialize)]
|
||||
pub struct RunMetricsSummary {
|
||||
pub id: Uuid,
|
||||
pub status: String,
|
||||
pub candle_interval: Option<String>,
|
||||
pub total_positions: Option<u32>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub win_rate: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub profit_factor: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub net_pnl: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub sharpe_ratio: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub sortino_ratio: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub calmar_ratio: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub max_drawdown: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub pnl_return: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub avg_win: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub avg_loss: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub avg_hold_duration_secs: Option<f64>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
pub struct BacktestResult {
|
||||
pub run_id: Uuid,
|
||||
@@ -62,6 +93,15 @@ pub struct BacktestResult {
|
||||
pub total_pnl: Option<f64>,
|
||||
pub net_pnl: Option<f64>,
|
||||
pub sharpe_ratio: Option<f64>,
|
||||
pub sortino_ratio: Option<f64>,
|
||||
pub calmar_ratio: Option<f64>,
|
||||
pub max_drawdown: Option<f64>,
|
||||
pub pnl_return: Option<f64>,
|
||||
pub avg_win: Option<f64>,
|
||||
pub avg_loss: Option<f64>,
|
||||
pub max_win: Option<f64>,
|
||||
pub max_loss: Option<f64>,
|
||||
pub avg_hold_duration_secs: Option<f64>,
|
||||
pub total_fees: Option<f64>,
|
||||
pub avg_bars_in_trade: Option<f64>,
|
||||
pub error_message: Option<String>,
|
||||
@@ -89,6 +129,15 @@ impl BacktestResult {
|
||||
let net_pnl = summary.and_then(|s| parse_number(&s["net_pnl"]));
|
||||
let total_pnl = summary.and_then(|s| parse_number(&s["total_pnl"]));
|
||||
let sharpe_ratio = summary.and_then(|s| parse_number(&s["sharpe_ratio"]));
|
||||
let sortino_ratio = summary.and_then(|s| parse_number(&s["sortino_ratio"]));
|
||||
let calmar_ratio = summary.and_then(|s| parse_number(&s["calmar_ratio"]));
|
||||
let max_drawdown = summary.and_then(|s| parse_number(&s["max_drawdown"]));
|
||||
let pnl_return = summary.and_then(|s| parse_number(&s["pnl_return"]));
|
||||
let avg_win = summary.and_then(|s| parse_number(&s["avg_win"]));
|
||||
let avg_loss = summary.and_then(|s| parse_number(&s["avg_loss"]));
|
||||
let max_win = summary.and_then(|s| parse_number(&s["max_win"]));
|
||||
let max_loss = summary.and_then(|s| parse_number(&s["max_loss"]));
|
||||
let avg_hold_duration_secs = summary.and_then(|s| parse_number(&s["avg_hold_duration_secs"]));
|
||||
let total_fees = summary.and_then(|s| parse_number(&s["total_fees"]));
|
||||
|
||||
Self {
|
||||
@@ -103,6 +152,15 @@ impl BacktestResult {
|
||||
total_pnl,
|
||||
net_pnl,
|
||||
sharpe_ratio,
|
||||
sortino_ratio,
|
||||
calmar_ratio,
|
||||
max_drawdown,
|
||||
pnl_return,
|
||||
avg_win,
|
||||
avg_loss,
|
||||
max_win,
|
||||
max_loss,
|
||||
avg_hold_duration_secs,
|
||||
total_fees,
|
||||
avg_bars_in_trade: None,
|
||||
error_message: resp.error_message.clone(),
|
||||
@@ -128,6 +186,12 @@ impl BacktestResult {
|
||||
self.net_pnl.unwrap_or(0.0),
|
||||
self.sharpe_ratio.unwrap_or(0.0),
|
||||
);
|
||||
if let Some(sortino) = self.sortino_ratio {
|
||||
s.push_str(&format!(" sortino={:.2}", sortino));
|
||||
}
|
||||
if let Some(dd) = self.max_drawdown {
|
||||
s.push_str(&format!(" max_dd={:.1}%", dd * 100.0));
|
||||
}
|
||||
if self.total_positions.unwrap_or(0) == 0 {
|
||||
if let Some(audit) = &self.condition_audit_summary {
|
||||
let audit_str = format_audit_summary(audit);
|
||||
@@ -160,6 +224,15 @@ fn parse_number(v: &Value) -> Option<f64> {
|
||||
if f.abs() > 1e20 { None } else { Some(f) }
|
||||
}
|
||||
|
||||
/// Serde deserializer for `Option<f64>` that accepts both JSON numbers and decimal strings.
|
||||
fn deserialize_opt_number<'de, D>(deserializer: D) -> Result<Option<f64>, D::Error>
|
||||
where
|
||||
D: serde::Deserializer<'de>,
|
||||
{
|
||||
let v = Value::deserialize(deserializer)?;
|
||||
Ok(parse_number(&v))
|
||||
}
|
||||
|
||||
/// Render a condition_audit_summary Value into a compact one-line string.
|
||||
///
|
||||
/// Handles the primary shape from the swym API:
|
||||
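The lenient number handling behind `deserialize_opt_number` (accept a JSON number or a decimal string, drop anything else) can be illustrated without serde. This is a std-only sketch: the `Raw` enum is a hypothetical stand-in for `serde_json::Value`, and only the `1e20` magnitude cutoff is taken from the diff. The string parsing and the finiteness check are assumptions about how the real `parse_number` might behave.

```rust
/// Hypothetical stand-in for a JSON value; the real code uses `serde_json::Value`.
pub enum Raw {
    Null,
    Num(f64),
    Str(String),
}

/// Sketch of a lenient parser: numbers pass through, decimal strings are
/// parsed, and null, unparseable, or absurdly large values become `None`.
pub fn parse_number(v: &Raw) -> Option<f64> {
    let f = match v {
        Raw::Num(n) => *n,
        // Assumption: strings are parsed with the standard f64 parser.
        Raw::Str(s) => s.trim().parse::<f64>().ok()?,
        Raw::Null => return None,
    };
    // The 1e20 cutoff appears in the diff; the finiteness check is an
    // addition of this sketch to reject NaN as well.
    if !f.is_finite() || f.abs() > 1e20 {
        None
    } else {
        Some(f)
    }
}
```

Under these assumptions, both `Raw::Num(1.5)` and `Raw::Str("0.42".into())` produce a value, while `"1e30"` and non-numeric strings map to `None`.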
@@ -295,6 +368,7 @@ impl SwymClient {
|
||||
instrument_symbol: &str,
|
||||
base_asset: &str,
|
||||
quote_asset: &str,
|
||||
market_kind: &str,
|
||||
strategy: &Value,
|
||||
starts_at: &str,
|
||||
finishes_at: &str,
|
||||
@@ -312,7 +386,7 @@ impl SwymClient {
|
||||
"name_exchange": instrument_symbol,
|
||||
"underlying": { "base": base_asset, "quote": quote_asset },
|
||||
"quote": "underlying_quote",
|
||||
"kind": "spot"
|
||||
"kind": market_kind
|
||||
},
|
||||
"execution": {
|
||||
"mocked_exchange": instrument_exchange,
|
||||
@@ -386,6 +460,25 @@ impl SwymClient {
|
||||
}
|
||||
}
|
||||
|
||||
/// Fetch metrics for multiple completed runs via the compare endpoint.
|
||||
/// Batches requests in groups of 50 (API maximum).
|
||||
pub async fn compare_runs(&self, run_ids: &[Uuid]) -> Result<Vec<RunMetricsSummary>> {
|
||||
let mut results = Vec::new();
|
||||
for chunk in run_ids.chunks(50) {
|
||||
let ids = chunk.iter().map(|id| id.to_string()).collect::<Vec<_>>().join(",");
|
||||
let url = format!("{}/paper-runs/compare?ids={}", self.base_url, ids);
|
||||
let resp = self.client.get(&url).send().await.context("compare runs request")?;
|
||||
if !resp.status().is_success() {
|
||||
let status = resp.status();
|
||||
let body = resp.text().await.unwrap_or_default();
|
||||
anyhow::bail!("compare runs {status}: {body}");
|
||||
}
|
||||
let mut batch: Vec<RunMetricsSummary> = resp.json().await.context("parse compare response")?;
|
||||
results.append(&mut batch);
|
||||
}
|
||||
Ok(results)
|
||||
}
|
||||
|
||||
/// Fetch condition audit summary for a completed run.
|
||||
pub async fn condition_audit(&self, run_id: Uuid) -> Result<Value> {
|
||||
let url = format!("{}/paper-runs/{}/condition-audit", self.base_url, run_id);
|
||||
|
||||
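The batching in `compare_runs` above (groups of 50 IDs, comma-joined into the `ids` query parameter) can be isolated into a pure function that is easy to test without a network. This is a sketch: `compare_urls` is a hypothetical name, IDs are plain strings rather than `Uuid`, and the base URL in the usage note is a placeholder.

```rust
/// Builds one compare URL per batch of run IDs.
///
/// `batch_size` must be non-zero, because `chunks` panics on a chunk size of 0.
pub fn compare_urls(base_url: &str, run_ids: &[String], batch_size: usize) -> Vec<String> {
    run_ids
        .chunks(batch_size)
        .map(|chunk| {
            // Join the IDs of this batch into a comma-separated query value.
            format!("{}/paper-runs/compare?ids={}", base_url, chunk.join(","))
        })
        .collect()
}
```

For example, 120 IDs with a batch size of 50 produce three URLs (50, 50, and 20 IDs), and an empty slice produces none, so no request would be sent.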