Compare commits
28 Commits
b947f48b01...main
| Author | SHA1 | Date | |
|---|---|---|---|
| | 11fe79ed25 | | |
| | fcb9a2f553 | | |
| | 75c95f7935 | | |
| | 6601da21cc | | |
| | 8de3ae5fe1 | | |
| | a435d3a99d | | |
| | b476199de8 | | |
| | d76d3b9061 | | |
| | 0945c94cc8 | | |
| | a0316be798 | | |
| | 609d64587b | | |
| | 6692bdb490 | | |
| | 36689e3fbb | | |
| | 87d31f8d7e | | |
| | 3892ab37c1 | | |
| | 85896752f2 | | |
| | ee260ea4d5 | | |
| | 3f8d4de7fb | | |
| | 7e1ff51ae0 | | |
| | 5146b3f764 | | |
| | 759439313e | | |
| | 9a7761b452 | | |
| | 8d53d6383d | | |
| | 55e41b6795 | | |
| | 51e452b607 | | |
| | 89f7ba66e0 | | |
| | 6f4f864d28 | | |
| | 185cb4586e | | |
116
CLAUDE.md
Normal file
@@ -0,0 +1,116 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

`scout` is an autonomous strategy search agent for the [swym](https://swym.rs) backtesting platform. It runs a loop: asks Claude to generate trading strategies → submits backtests to swym → evaluates results → feeds learnings back → repeats. Promising strategies are automatically validated on out-of-sample data to filter overfitting.

## Architecture

### Core Modules

- **`agent.rs`** - Main orchestration logic. Contains the `run()` function that implements the search loop, strategy validation, and learning feedback. Key types: `IterationRecord`, `LedgerEntry`. Key functions: `validate_strategy()`, `diagnose_history()`.
- **`claude.rs`** - Claude API client. Handles model communication, JSON extraction from responses (including stripping the thinking blocks emitted by R1-family models), and context length detection.
- **`swym.rs`** - Swym backtesting API client. Wraps all swym API calls: candle coverage, strategy validation, backtest submission, polling, and metrics retrieval.
- **`prompts.rs`** - System and user prompts for the LLM. Generates the DSL schema context and iteration-specific prompts with prior results.
- **`config.rs`** - CLI argument parsing and configuration. Defines the `Cli` struct with all command-line flags and environment variables.

### Key Data Flows

1. **Strategy Generation**: `agent::run()` → `claude::chat()` → extracts JSON strategy → validates → submits to swym
2. **Backtest Execution**: `swym::submit_backtest()` → `swym::poll_until_done()` → `BacktestResult::from_response()`
3. **Learning Loop**: `load_prior_summary()` reads `run_ledger.jsonl` → fetches metrics via `swym::compare_runs()` → formats a compact summary → passes it to the initial (iteration 1) prompt
4. **OOS Validation**: Promising in-sample results trigger a re-backtest on held-out data → strategies passing both phases are saved to `validated_*.json`

### Important Patterns

- **Deduplication**: Within a run, strategies are deduplicated by their full JSON serialization using a HashMap (`tested_strategies`). Identical strategies are skipped with a warning.
- **Validation**: Two-stage validation: client-side (structure, quantity parsing, exit rules) and server-side (DSL schema validation via `/strategies/validate`).
- **Context Management**: Conversation history is trimmed to the last 6 messages (3 exchanges) to avoid token limits. Prior results are summarized in the next prompt.
- **Error Recovery**: Three consecutive failures abort the run. Transient API errors are logged but don't stop the run.
- **Ledger Persistence**: Each backtest writes a `LedgerEntry` to `run_ledger.jsonl` for cross-run learning. Each entry is appended with a single write to a file opened in append mode (`O_APPEND`), which keeps concurrent instances from interleaving entries on Linux local filesystems.
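
The context-management rule above can be sketched as follows. This is a minimal illustration, not the code in `agent.rs`: the `Message` struct is a hypothetical stand-in for the real type in `claude.rs`, and `trim_conversation` and `MAX_HISTORY_MESSAGES` are names invented for this example.

```rust
/// Hypothetical stand-in for the `Message` type in `claude.rs`.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Assumed cap: 6 messages, i.e. 3 user/assistant exchanges.
pub const MAX_HISTORY_MESSAGES: usize = 6;

/// Drop the oldest messages so that at most `keep` remain.
/// Newer messages are kept because they carry the latest feedback.
pub fn trim_conversation(conversation: &mut Vec<Message>, keep: usize) {
    if conversation.len() > keep {
        let excess = conversation.len() - keep;
        conversation.drain(..excess);
    }
}
```

Trimming from the front keeps the most recent exchanges verbatim, which is why older results have to be re-summarized in each new prompt.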

## Development Commands

```bash
# Build
cargo build

# Run with default config
cargo run

# Run with custom flags
cargo run -- \
  --swym-url https://dev.swym.hanzalova.internal/api/v1 \
  --max-iterations 50 \
  --instruments binance_spot:BTCUSDC,binance_spot:ETHUSDC

# Run tests
cargo test

# Run with debug logging
RUST_LOG=debug cargo run
```

## DSL Schema

Strategies are JSON objects with the schema defined in `src/dsl-schema.json`. The DSL uses a rule-based structure with `when` (entry conditions) and `then` (exit actions). Key concepts:

- **Indicators**: `{"kind":"indicator","name":"...","params":{...}}`
- **Comparators**: `{"kind":"compare","lhs":"...","op":"...","rhs":"..."}`
- **Functions**: `{"kind":"func","name":"...","args":[...]}`

See `src/dsl-schema.json` for the complete schema and `prompts.rs::system_prompt()` for how it's presented to Claude.

## Model Families

The code supports different Claude model families via the `ModelFamily` enum in `config.rs`:

- **Sonnet**: Standard model, no special handling
- **Opus**: Larger context, higher cost
- **R1**: Has thinking blocks (`<think>...</think>`) that need to be stripped before JSON extraction

Context length is auto-detected from the server's `/api/v1/models` endpoint (LM Studio) or `/v1/models/{id}` (OpenAI-compatible). The output token budget is set to half the context window.
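
As a rough sketch of the two rules above, the following shows one way to strip thinking blocks and to derive the output budget. Both function names are hypothetical and do not come from `claude.rs`; treating an unclosed `<think>` as "discard the rest" is an assumption made for this example.

```rust
/// Remove every `<think>...</think>` span from `text`.
/// Assumption: an unclosed `<think>` discards everything after it.
pub fn strip_think_blocks(text: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = rest.find(OPEN) {
        // Keep the text that precedes the thinking block.
        out.push_str(&rest[..start]);
        match rest[start..].find(CLOSE) {
            // `end` is relative to `start`, so add both offsets.
            Some(end) => rest = &rest[start + end + CLOSE.len()..],
            None => return out,
        }
    }
    out.push_str(rest);
    out
}

/// Half the context window goes to output; the rest is left for input.
pub fn output_budget(context_length: u32) -> u32 {
    context_length / 2
}
```

Stripping before JSON extraction matters because a reasoning chain may itself contain braces that would confuse a naive "first `{` to last `}`" extractor.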

## Output Files

- `strategy_001.json` through `strategy_NNN.json` - Every strategy attempted (full JSON)
- `validated_001.json` through `validated_NNN.json` - Strategies that passed OOS validation (includes in-sample + OOS metrics)
- `best_strategy.json` - Strategy with highest average Sharpe across instruments
- `run_ledger.jsonl` - Persistent record of all backtests for learning across runs
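
A small sketch of two conventions behind these files: the zero-padded file name and the "average Sharpe across instruments" ranking. The helper names are hypothetical, and skipping instruments without a Sharpe value is an assumption of this example rather than documented behavior.

```rust
/// File name for the strategy produced in a given iteration,
/// zero-padded to three digits (wider numbers are not truncated).
pub fn strategy_file_name(iteration: u32) -> String {
    format!("strategy_{iteration:03}.json")
}

/// Average Sharpe across instruments, ignoring instruments that
/// reported no value. Returns `None` when no instrument has one.
pub fn average_sharpe(sharpes: &[Option<f64>]) -> Option<f64> {
    let present: Vec<f64> = sharpes.iter().flatten().copied().collect();
    if present.is_empty() {
        None
    } else {
        Some(present.iter().sum::<f64>() / present.len() as f64)
    }
}
```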

## Common Tasks

### Adding a new CLI flag

1. Add field to `Cli` struct in `config.rs`
2. Add clap derive attribute with `#[arg(short, long, env = "VAR_NAME")]`
3. Use the flag in `agent::run()` via `cli.flag_name`

### Extending the DSL

1. Update `src/dsl-schema.json` with new expression kinds
2. Add validation logic in `validate_strategy()` if needed
3. Update prompts in `prompts.rs` to guide the model

### Modifying the learning loop

1. Edit `load_prior_summary()` in `agent.rs` to change how prior results are formatted
2. Adjust `diagnose_history()` to add new diagnostics or change convergence detection
3. Update `prompts.rs::iteration_prompt()` to incorporate new information

### Adding new validation checks

Add to `validate_strategy()` in `agent.rs`. Returns `(hard_errors, warnings)` where hard errors block submission and warnings are logged but allow the backtest to proceed.
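
To illustrate the `(hard_errors, warnings)` contract, here is a simplified sketch. It is not the real `validate_strategy()`: `StrategySketch` is a hypothetical flattened view of a strategy (the real code inspects a JSON value), and which checks count as hard errors versus warnings is an assumption made for this example.

```rust
/// Hypothetical, simplified view of a strategy.
pub struct StrategySketch {
    pub rule_count: usize,
    pub has_exit_rule: bool,
    pub quantity: String,
}

/// Returns `(hard_errors, warnings)`.
/// Hard errors should block submission; warnings are only reported.
pub fn validate_sketch(s: &StrategySketch) -> (Vec<String>, Vec<String>) {
    let mut hard_errors = Vec::new();
    let mut warnings = Vec::new();

    if s.rule_count == 0 {
        hard_errors.push("strategy has no rules".to_string());
    }
    // Quantity must parse as a positive number.
    match s.quantity.parse::<f64>() {
        Ok(q) if q > 0.0 => {}
        _ => hard_errors.push(format!("quantity `{}` is not a positive number", s.quantity)),
    }
    if !s.has_exit_rule {
        warnings.push("no exit rule; positions may never close".to_string());
    }
    (hard_errors, warnings)
}
```

A caller would skip the backtest when the first vector is non-empty and otherwise log the second vector before proceeding.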

## Testing Strategy

The codebase uses `anyhow` for error handling and `tracing` for logging. Key test areas:

- Strategy JSON extraction from various response formats
- Context length detection from LM Studio/OpenAI endpoints
- Ledger entry serialization/deserialization
- Backtest result parsing from swym API responses
- Deduplication logic
- Convergence detection in `diagnose_history()`
133
docs/plan/cross-run-learning.md
Normal file
@@ -0,0 +1,133 @@
# Plan: Cross-run learning via run ledger and compare endpoint

## Context

Scout currently starts from scratch every run, with no memory of prior iterations. The upstream
patch `e47c18` adds:
1. **Enriched `result_summary`**: sortino_ratio, calmar_ratio, max_drawdown, pnl_return,
   avg_win, avg_loss, max_win, max_loss, avg_hold_duration_secs
2. **Compare endpoint**: `GET /api/v1/paper-runs/compare?ids=uuid1,uuid2,...` returns
   `RunMetricsSummary` for up to 50 runs in one call

Goal: persist enough state across runs so that iteration 1 of a new run starts informed by
all previous runs' strategies and outcomes.

## Changes

### 1. Run ledger: persist strategy + run_id per backtest (`src/agent.rs`)

After each successful `run_single_backtest`, append a JSONL entry to `{output_dir}/run_ledger.jsonl`:

```json
{"run_id":"uuid","instrument":"BTCUSDC","candle_interval":"4h","strategy":{...},"timestamp":"2026-03-10T12:38:15Z"}
```

One line per instrument-backtest (3 per iteration for 3 instruments). The strategy JSON is
duplicated across instrument entries for the same iteration, which keeps the format flat and
self-contained.

Use `OpenOptions::new().append(true).create(true)`. No locking is needed since scout is single-threaded.
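
A minimal sketch of the append step, using only the standard library. The helper name `append_line` is hypothetical, and the caller is assumed to pass an already-serialized JSON object without embedded newlines.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append one JSONL record to `path`, creating the file if needed.
/// The newline is added to the buffer first so the record and its
/// terminator are handed to the OS in a single `write_all` call.
pub fn append_line(path: &Path, json_line: &str) -> io::Result<()> {
    let mut bytes = Vec::with_capacity(json_line.len() + 1);
    bytes.extend_from_slice(json_line.as_bytes());
    bytes.push(b'\n');

    let mut file = OpenOptions::new().append(true).create(true).open(path)?;
    file.write_all(&bytes)
}
```

Because the file is opened in append mode, a second run adds to the ledger instead of overwriting it.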

### 2. Load prior runs on startup (`src/agent.rs`)

At the top of `run()`, before the iteration loop:
1. Read `run_ledger.jsonl` if it exists (ignore if missing, as on a first run)
2. Collect all `run_id`s
3. Call `swym.compare_runs(&run_ids)` (batching in groups of 50)
4. Join metrics back to strategies from the ledger
5. Group by strategy (entries with the same strategy JSON share an iteration)
6. Rank by average sharpe across instruments
7. Build a `prior_results_summary: Option<String>` for the initial prompt
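
Step 3 can be sketched with slice chunking. This example only builds the query strings and performs no HTTP calls; run IDs are plain strings here rather than `Uuid` values, and `compare_queries` and `COMPARE_BATCH` are names invented for the illustration.

```rust
/// Maximum number of run IDs the compare endpoint accepts per call.
pub const COMPARE_BATCH: usize = 50;

/// Split `run_ids` into batches and build one `ids=...` query string
/// per batch, preserving the original order.
pub fn compare_queries(run_ids: &[String]) -> Vec<String> {
    run_ids
        .chunks(COMPARE_BATCH)
        .map(|batch| format!("ids={}", batch.join(",")))
        .collect()
}
```

The responses from each batch would then be concatenated before joining metrics back to ledger entries by `run_id`.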

### 3. Compare endpoint client (`src/swym.rs`)

Add `RunMetricsSummary` struct:

```rust
pub struct RunMetricsSummary {
    pub id: Uuid,
    pub status: String,
    pub candle_interval: Option<String>,
    pub total_positions: Option<u32>,
    pub win_rate: Option<f64>,
    pub profit_factor: Option<f64>,
    pub net_pnl: Option<f64>,
    pub sharpe_ratio: Option<f64>,
    pub sortino_ratio: Option<f64>,
    pub calmar_ratio: Option<f64>,
    pub max_drawdown: Option<f64>,
    pub pnl_return: Option<f64>,
    pub avg_win: Option<f64>,
    pub avg_loss: Option<f64>,
    pub max_win: Option<f64>,
    pub max_loss: Option<f64>,
    pub avg_hold_duration_secs: Option<f64>,
}
```

Add `SwymClient::compare_runs(&self, run_ids: &[Uuid]) -> Result<Vec<RunMetricsSummary>>`:
- `GET {base_url}/paper-runs/compare?ids={comma_separated}`
- Parse JSON array response using `parse_number()` for decimal strings

### 4. Enrich `BacktestResult` with new fields (`src/swym.rs`)

Add to `BacktestResult`: `sortino_ratio`, `calmar_ratio`, `max_drawdown`, `pnl_return`,
`avg_win`, `avg_loss`, `max_win`, `max_loss`, `avg_hold_duration_secs`.

Parse all in `from_response()` via existing `parse_number()`.

Update `summary_line()` to include `max_dd={:.1}%` and `sortino={:.2}` when present;
these two are the most useful additions for the model's reasoning.
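
The "when present" formatting can be sketched as below. The function name `summary_suffix` is hypothetical, and the example assumes `max_drawdown` is stored as a fraction (0.125 meaning 12.5%), which is how the prior-summary code in `agent.rs` treats it.

```rust
/// Build the optional tail of a summary line. Each metric is
/// appended only when the API actually returned a value.
pub fn summary_suffix(max_drawdown: Option<f64>, sortino_ratio: Option<f64>) -> String {
    let mut suffix = String::new();
    if let Some(dd) = max_drawdown {
        // Assumption: drawdown is a fraction, so scale to percent.
        suffix.push_str(&format!(" max_dd={:.1}%", dd * 100.0));
    }
    if let Some(sortino) = sortino_ratio {
        suffix.push_str(&format!(" sortino={:.2}", sortino));
    }
    suffix
}
```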

### 5. Prior-results-aware initial prompt (`src/prompts.rs`)

Modify `initial_prompt()` to accept `prior_summary: Option<&str>`.

When present, insert before the "Design a trading strategy" instruction:

```
## Learnings from {N} prior backtests across {M} strategies

{top 5 strategies ranked by avg sharpe, each showing:}
- Interval, rule count, avg metrics across instruments
- One-line description of the strategy approach (extracted from rule comments)
- Full strategy JSON for the top 1-2

{compact table of all prior strategies' avg metrics}

Use these insights to avoid repeating failed approaches and to build on what worked.
```

Limit to ~2000 tokens of prior context to avoid crowding the prompt. If there are many prior runs,
show only the top 5 + bottom 3 (worst performers to avoid), plus a count of total runs.
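
One possible way to pick the "top 5 + bottom 3" is sketched here. Strategies are reduced to `(name, avg_sharpe)` pairs for brevity, and `select_for_prompt`, `TOP_N`, and `BOTTOM_N` are hypothetical names. Unlike a plain `skip(len - 3)`, this sketch never lists a strategy in both groups, which is a design choice of the example rather than something the plan specifies.

```rust
pub const TOP_N: usize = 5;
pub const BOTTOM_N: usize = 3;

/// Sort by average Sharpe (best first) and return `(top, worst)`.
/// The worst group starts no earlier than the end of the top group,
/// so short histories do not repeat the same strategy twice.
pub fn select_for_prompt(
    mut ranked: Vec<(String, f64)>,
) -> (Vec<(String, f64)>, Vec<(String, f64)>) {
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    let top_len = ranked.len().min(TOP_N);
    let worst_start = ranked.len().saturating_sub(BOTTOM_N).max(top_len);
    let worst = ranked[worst_start..].to_vec();
    ranked.truncate(top_len);
    (ranked, worst)
}
```

With ten strategies this yields five "best" and three "worst" entries; with six it yields five "best" and a single "worst".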

### 6. Ledger entry struct (`src/agent.rs`)

```rust
#[derive(Serialize, Deserialize)]
struct LedgerEntry {
    run_id: Uuid,
    instrument: String,
    candle_interval: String,
    strategy: Value,
    timestamp: String,
}
```

## Files to modify

- `src/swym.rs`: `RunMetricsSummary` struct, `compare_runs()` method, enrich `BacktestResult`
  with new fields, update `summary_line()`
- `src/agent.rs`: `LedgerEntry` struct, append-to-ledger after backtest, load-ledger-on-startup,
  call compare endpoint, build prior summary, pass to initial prompt
- `src/prompts.rs`: `initial_prompt()` accepts optional prior summary

## Verification

1. `cargo build --release`
2. Run once → confirm `run_ledger.jsonl` is created with entries
3. Run again → confirm:
   - Ledger is loaded, compare endpoint is called
   - Iteration 1 prompt includes prior results summary (visible at debug log level)
   - New entries are appended (not overwritten)
4. Check that enriched metrics (sortino, max_drawdown) appear in `summary_line()` output
336
src/agent.rs
@@ -1,14 +1,26 @@
use std::io::Write as IoWrite;
use std::path::Path;
use std::time::Duration;

use anyhow::{Context, Result};
use serde::{Deserialize, Serialize};
use serde_json::Value;
use tracing::{debug, error, info, warn};
use uuid::Uuid;

use crate::claude::{self, ClaudeClient, Message};
use crate::config::{Cli, Instrument};
use crate::prompts;
use crate::swym::{BacktestResult, SwymClient};
use crate::swym::{BacktestResult, RunMetricsSummary, SwymClient};

/// Persistent record of a single completed backtest, written to the run ledger.
#[derive(Debug, Serialize, Deserialize)]
struct LedgerEntry {
    run_id: Uuid,
    instrument: String,
    candle_interval: String,
    strategy: Value,
}

/// A single iteration's record: strategy + results across instruments.
#[derive(Debug)]
@@ -132,7 +144,8 @@ pub async fn run(cli: &Cli) -> Result<()> {

    // Init clients
    let swym = SwymClient::new(&cli.swym_url)?;
    let claude = ClaudeClient::new(&cli.anthropic_key, &cli.anthropic_url, &cli.model);
    let mut claude = ClaudeClient::new(&cli.anthropic_key, &cli.anthropic_url, &cli.model);
    claude.apply_server_limits().await;

    // Check candle coverage for all instruments
    info!(
@@ -189,13 +202,24 @@ pub async fn run(cli: &Cli) -> Result<()> {

    // Load DSL schema for the system prompt
    let schema = include_str!("dsl-schema.json");
    let system = prompts::system_prompt(schema);
    let has_futures = instruments.iter().any(|i| i.is_futures());
    let system = prompts::system_prompt(schema, claude.family(), has_futures);
    info!("model family: {}", claude.family().name());

    // Resolve ledger path: explicit --ledger-file takes precedence, else <output_dir>/run_ledger.jsonl
    let ledger_path = cli.ledger_file.clone().unwrap_or_else(|| cli.output_dir.join("run_ledger.jsonl"));
    info!("ledger: {}", ledger_path.display());

    // Load prior runs from ledger and build cross-run context for iteration 1
    let prior_summary = load_prior_summary(&ledger_path, &swym).await;

    // Agent state
    let mut history: Vec<IterationRecord> = Vec::new();
    let mut conversation: Vec<Message> = Vec::new();
    let mut best_strategy: Option<(f64, Value)> = None; // (avg_sharpe, strategy)
    let mut consecutive_failures = 0u32;
    // Deduplication: track canonical strategy JSON → first iteration it was tested.
    let mut tested_strategies: std::collections::HashMap<String, u32> = std::collections::HashMap::new();

    let instrument_names: Vec<String> = instruments.iter().map(|i| i.symbol.clone()).collect();

@@ -204,7 +228,7 @@ pub async fn run(cli: &Cli) -> Result<()> {

        // Build the user prompt
        let user_msg = if iteration == 1 {
            prompts::initial_prompt(&instrument_names, &available_intervals)
            prompts::initial_prompt(&instrument_names, &available_intervals, prior_summary.as_deref(), has_futures)
        } else {
            let results_text = history
                .iter()
@@ -263,14 +287,21 @@ pub async fn run(cli: &Cli) -> Result<()> {
            content: response_text.clone(),
        });

        // Log R1 reasoning chain at debug level so it can be inspected when
        // the model makes repeated DSL mistakes (run with RUST_LOG=debug).
        if let Some(thinking) = claude::extract_think_content(&response_text) {
            debug!("R1 thinking ({} chars):\n{}", thinking.len(), thinking);
        }

        // Extract strategy JSON
        let strategy = match claude::extract_json(&response_text) {
            Ok(s) => s,
            Err(e) => {
                warn!("failed to extract strategy JSON: {e}");
                warn!("failed to extract strategy JSON: {e:#}");
                warn!(
                    "raw response: {}",
                    &response_text[..response_text.len().min(500)]
                    "raw response ({} chars): {}",
                    response_text.len(),
                    &response_text[..response_text.len().min(800)]
                );
                consecutive_failures += 1;
                if consecutive_failures >= 3 {
@@ -316,7 +347,7 @@ pub async fn run(cli: &Cli) -> Result<()> {
        let strat_path = cli.output_dir.join(format!("strategy_{iteration:03}.json"));
        std::fs::write(&strat_path, serde_json::to_string_pretty(&strategy)?)?;

        // Hard validation errors: skip the expensive backtest and give immediate feedback.
        // Hard client-side validation errors: skip without hitting the API.
        if !hard_errors.is_empty() {
            let record = IterationRecord {
                iteration,
@@ -329,6 +360,61 @@ pub async fn run(cli: &Cli) -> Result<()> {
            continue;
        }

        // Server-side validation: call /strategies/validate to get ALL DSL errors
        // at once before submitting a backtest. This avoids burning a full backtest
        // round-trip on a structurally invalid strategy and gives the model a
        // complete list of errors to fix in one shot.
        match swym.validate_strategy(&strategy).await {
            Ok(api_errors) if !api_errors.is_empty() => {
                for e in &api_errors {
                    warn!(" DSL error at {}: {}", e.path.as_deref().unwrap_or("(top-level)"), e.message);
                }
                let error_notes: Vec<String> = api_errors
                    .iter()
                    .map(|e| format!("DSL error at {}: {}", e.path.as_deref().unwrap_or("(top-level)"), e.message))
                    .collect();
                validation_notes.extend(error_notes);
                let record = IterationRecord {
                    iteration,
                    strategy: strategy.clone(),
                    results: vec![],
                    validation_notes,
                };
                info!("{}", record.summary());
                history.push(record);
                continue;
            }
            Ok(_) => {
                // Valid — proceed to backtest
            }
            Err(e) => {
                // Network/parse failure from the validate endpoint — log and proceed
                // anyway so a transient API issue doesn't stall the run.
                warn!(" strategy validation request failed (proceeding): {e:#}");
            }
        }

        // Deduplication check: skip strategies identical to one already tested this run.
        let strategy_key = serde_json::to_string(&strategy).unwrap_or_default();
        if let Some(&first_iter) = tested_strategies.get(&strategy_key) {
            warn!("duplicate strategy (identical to iteration {first_iter}), skipping backtest");
            let record = IterationRecord {
                iteration,
                strategy: strategy.clone(),
                results: vec![],
                validation_notes: vec![format!(
                    "DUPLICATE: this exact strategy was already tested in iteration {first_iter}. \
                     You submitted identical JSON. You MUST design a completely different strategy — \
                     different indicator family, different entry conditions, or different timeframe. \
                     Do NOT submit the same JSON again."
                )],
            };
            info!("{}", record.summary());
            history.push(record);
            continue;
        }
        tested_strategies.insert(strategy_key, iteration);

        // Run backtests against all instruments (in-sample)
        let mut results: Vec<BacktestResult> = Vec::new();

@@ -354,12 +440,13 @@ pub async fn run(cli: &Cli) -> Result<()> {
                            info!(" condition audit: {}", serde_json::to_string_pretty(audit).unwrap_or_default());
                        }
                    }
                    append_ledger_entry(&ledger_path, &result, &strategy);
                    results.push(result);
                }
                Err(e) => {
                    warn!(" backtest failed for {}: {e:#}", inst.symbol);
                    results.push(BacktestResult {
                        run_id: uuid::Uuid::nil(),
                        run_id: Uuid::nil(),
                        instrument: inst.symbol.clone(),
                        status: "failed".to_string(),
                        total_positions: None,
@@ -370,6 +457,15 @@ pub async fn run(cli: &Cli) -> Result<()> {
                        total_pnl: None,
                        net_pnl: None,
                        sharpe_ratio: None,
                        sortino_ratio: None,
                        calmar_ratio: None,
                        max_drawdown: None,
                        pnl_return: None,
                        avg_win: None,
                        avg_loss: None,
                        max_win: None,
                        max_loss: None,
                        avg_hold_duration_secs: None,
                        total_fees: None,
                        avg_bars_in_trade: None,
                        error_message: Some(e.to_string()),
@@ -507,6 +603,7 @@ async fn run_single_backtest(
            &inst.symbol,
            &inst.base(),
            &inst.quote(),
            inst.market_kind(),
            strategy,
            starts_at,
            finishes_at,
@@ -527,13 +624,180 @@ async fn run_single_backtest(
        .await
        .context("poll")?;

    Ok(BacktestResult::from_response(
        &final_resp,
        &inst.symbol,
        &inst.exchange,
        &inst.base(),
        &inst.quote(),
    Ok(BacktestResult::from_response(&final_resp, &inst.symbol))
}

/// Append a ledger entry for a completed backtest so future runs can learn from it.
fn append_ledger_entry(ledger: &Path, result: &BacktestResult, strategy: &Value) {
    // Skip nil run_ids (error placeholders)
    if result.run_id == Uuid::nil() {
        return;
    }
    let entry = LedgerEntry {
        run_id: result.run_id,
        instrument: result.instrument.clone(),
        candle_interval: strategy["candle_interval"]
            .as_str()
            .unwrap_or("?")
            .to_string(),
        strategy: strategy.clone(),
    };
    // Append newline inside the serialised bytes so the entire write is a single
    // write_all() syscall — O_APPEND + single write() is atomic on Linux local
    // filesystems, making concurrent instances safe for typical entry sizes.
    let mut bytes = match serde_json::to_vec(&entry) {
        Ok(b) => b,
        Err(e) => {
            warn!("could not serialize ledger entry: {e}");
            return;
        }
    };
    bytes.push(b'\n');
    if let Err(e) = std::fs::OpenOptions::new()
        .append(true)
        .create(true)
        .open(ledger)
        .and_then(|mut f| f.write_all(&bytes))
    {
        warn!("could not write ledger entry: {e}");
    }
}

/// Load the run ledger, fetch metrics via the compare endpoint, and return a compact
/// prior-results summary string for the initial prompt. Returns `None` if the ledger
/// is absent, empty, or the compare call fails.
async fn load_prior_summary(ledger: &Path, swym: &SwymClient) -> Option<String> {
    let path = ledger;
    let contents = std::fs::read_to_string(&path).ok()?;

    // Parse all ledger entries
    let entries: Vec<LedgerEntry> = contents
        .lines()
        .filter(|l| !l.trim().is_empty())
        .filter_map(|l| serde_json::from_str(l).ok())
        .collect();
    if entries.is_empty() {
        return None;
    }
    info!("loaded {} ledger entries from previous runs", entries.len());

    // Fetch metrics for all run_ids
    let run_ids: Vec<Uuid> = entries.iter().map(|e| e.run_id).collect();
    let metrics = match swym.compare_runs(&run_ids).await {
        Ok(m) => m,
        Err(e) => {
            warn!("could not fetch prior run metrics: {e}");
            return None;
        }
    };

    // Build a map from run_id → metrics
    let metrics_map: std::collections::HashMap<Uuid, &RunMetricsSummary> =
        metrics.iter().map(|m| (m.id, m)).collect();

    // Group entries by strategy (use candle_interval + rules fingerprint)
    // We use the full strategy JSON as the grouping key.
    let mut strategy_groups: std::collections::HashMap<String, Vec<(&LedgerEntry, Option<&RunMetricsSummary>)>> =
        std::collections::HashMap::new();
    // Cap at 3 entries per unique strategy (one per instrument is enough).
    // Without this, a strategy repeated across many iterations swamps the summary.
    for entry in &entries {
        let key = serde_json::to_string(&entry.strategy).unwrap_or_default();
        let group = strategy_groups.entry(key).or_default();
        if group.len() < 3 {
            let m = metrics_map.get(&entry.run_id).copied();
            group.push((entry, m));
        }
    }

    // Compute avg sharpe per strategy group
    let mut strategies: Vec<(f64, &Value, Vec<(&LedgerEntry, Option<&RunMetricsSummary>)>)> = strategy_groups
        .into_values()
        .map(|group| {
            let sharpes: Vec<f64> = group
                .iter()
                .filter_map(|(_, m)| m.and_then(|m| m.sharpe_ratio))
                .collect();
            let avg_sharpe = if sharpes.is_empty() {
                f64::NEG_INFINITY
            } else {
                sharpes.iter().sum::<f64>() / sharpes.len() as f64
            };
            let strategy = &group[0].0.strategy;
            (avg_sharpe, strategy, group)
        })
        .collect();
    strategies.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));

    let total_strategies = strategies.len();
    let total_backtests = entries.len();

    // Build summary text — top 5 + bottom 3 (if distinct), capped at ~2000 chars
    let mut lines = vec![format!(
        "## Learnings from {} prior backtests across {} strategies\n",
        total_backtests, total_strategies
    )];
    lines.push("### Best strategies (ranked by avg Sharpe):".to_string());

    let show_top = strategies.len().min(5);
    for (avg_sharpe, strategy, group) in strategies.iter().take(show_top) {
        let interval = strategy["candle_interval"].as_str().unwrap_or("?");
        let rule_count = strategy["rules"].as_array().map(|r| r.len()).unwrap_or(0);
        // Collect per-instrument metrics
        let inst_lines: Vec<String> = group
            .iter()
            .filter_map(|(entry, m)| {
                let m = (*m)?;
                Some(format!(
                    " {}: trades={} sharpe={:.3} net_pnl={:.2}{}",
                    entry.instrument,
                    m.total_positions.unwrap_or(0),
                    m.sharpe_ratio.unwrap_or(0.0),
                    m.net_pnl.unwrap_or(0.0),
                    m.max_drawdown.map(|d| format!(" max_dd={:.1}%", d * 100.0)).unwrap_or_default(),
                ))
            })
            .collect();
        // Pull the first rule comment as a strategy description
        let description = strategy["rules"][0]["comment"]
            .as_str()
            .unwrap_or("(no description)");
        lines.push(format!(
            "\n [{interval}, {rule_count} rules, avg_sharpe={avg_sharpe:.3}] {description}"
        ));
        lines.extend(inst_lines);
        // Include full JSON only for the top 2
        let rank = strategies.iter().position(|(_, s, _)| std::ptr::eq(*s, *strategy)).unwrap_or(99);
        if rank < 2 {
            lines.push(format!(
                " strategy JSON: {}",
                serde_json::to_string(strategy).unwrap_or_default()
            ));
        }
    }

    // Worst 3 (if we have more than 5)
    if strategies.len() > 5 {
        lines.push("\n### Worst strategies (avoid repeating these):".to_string());
        let worst_start = strategies.len().saturating_sub(3);
        for (avg_sharpe, strategy, _) in strategies.iter().skip(worst_start) {
            let interval = strategy["candle_interval"].as_str().unwrap_or("?");
            let description = strategy["rules"][0]["comment"].as_str().unwrap_or("(no description)");
            lines.push(format!(" [{interval}, avg_sharpe={avg_sharpe:.3}] {description}"));
        }
    }

    lines.push(format!(
        "\nUse these results to avoid repeating failed approaches and build on what worked.\n"
    ));

    let summary = lines.join("\n");
    // Truncate to ~6000 chars to stay within prompt budget
    if summary.len() > 6000 {
        Some(format!("{}…\n[truncated — {} total strategies]\n", &summary[..5900], total_strategies))
    } else {
        Some(summary)
    }
}

fn save_validated_strategy(
@@ -662,6 +926,48 @@ pub fn diagnose_history(history: &[IterationRecord]) -> (String, bool) {
        }
    }

    // --- Repeated API error detection ---
    // If the same DSL error variant has appeared in 2+ consecutive iterations,
    // call it out explicitly so the model knows exactly what to fix.
    {
        let recent_errors: Vec<String> = history
            .iter()
            .rev()
            .take(4)
            .flat_map(|rec| rec.results.iter())
            .filter_map(|r| r.error_message.as_deref())
            .filter(|e| e.contains("unknown variant"))
            .map(|e| {
                // Extract the variant name: "unknown variant `foo`, expected ..."
                e.split('`')
                    .nth(1)
                    .unwrap_or(e)
                    .to_string()
            })
            .collect();

        if recent_errors.len() >= 2 {
            // Find the most frequent bad variant
            let mut counts: std::collections::HashMap<&str, usize> = std::collections::HashMap::new();
            for v in &recent_errors {
                *counts.entry(v.as_str()).or_default() += 1;
            }
            if let Some((bad_variant, count)) = counts.into_iter().max_by_key(|(_, c)| *c) {
                if count >= 2 {
                    notes.push(format!(
                        "⚠ DSL ERROR (repeated {count}×): the swym API rejected \
                         `{bad_variant}` as an unknown variant. \
                         Check the 'Critical: expression kinds' section — \
                         `{bad_variant}` may be a FuncName (use inside \
                         {{\"kind\":\"func\",\"name\":\"{bad_variant}\",...}}) \
                         or it may not be supported at all. \
                         Use ONLY the documented kinds and func names."
                    ));
                }
            }
        }
    }

    // --- Zero-trade check ---
    let zero_trade_iters = history
        .iter()
159
src/claude.rs
@@ -2,12 +2,20 @@ use anyhow::{Context, Result};
use reqwest::Client;
use serde::{Deserialize, Serialize};
use serde_json::Value;
use tracing::{info, warn};

use crate::config::ModelFamily;

pub struct ClaudeClient {
    client: Client,
    api_key: String,
    api_url: String,
    model: String,
    family: ModelFamily,
    /// Effective max output tokens, initialised from the family default and
    /// optionally updated by `apply_server_limits()` after querying the
    /// server's model metadata.
    max_output_tokens: u32,
}

#[derive(Serialize)]
@@ -43,19 +51,93 @@ pub struct Usage {

impl ClaudeClient {
    pub fn new(api_key: &str, api_url: &str, model: &str) -> Self {
        let family = ModelFamily::detect(model);
        // R1 thinking can take several minutes; use a generous timeout.
        let timeout_secs = if family.has_thinking() { 300 } else { 120 };
        let client = Client::builder()
            .timeout(std::time::Duration::from_secs(120))
            .timeout(std::time::Duration::from_secs(timeout_secs))
            .build()
            .expect("build http client");
        let max_output_tokens = family.max_output_tokens();
        Self {
            client,
            api_key: api_key.to_string(),
            api_url: api_url.to_string(),
            model: model.to_string(),
            family,
            max_output_tokens,
        }
    }

    /// Send a conversation to Claude and get the text response.
    pub fn family(&self) -> &ModelFamily {
        &self.family
    }
|
||||
|
||||
/// Query the server for the loaded model's actual context length and
|
||||
/// update `max_output_tokens` accordingly.
|
||||
///
|
||||
/// Uses half the loaded context window for output, leaving the other
|
||||
/// half for the system prompt and conversation history. Falls back to
|
||||
/// the family default if the server does not expose the information.
|
||||
///
|
||||
/// Tries two endpoints:
|
||||
/// 1. LM Studio `/api/v1/models` — returns `loaded_instances[].config.context_length`
|
||||
/// 2. OpenAI-compat `/v1/models/{id}` — returns `context_length` if present
|
||||
pub async fn apply_server_limits(&mut self) {
|
||||
match self.query_context_length().await {
|
||||
Some(ctx_len) => {
|
||||
// Reserve half the context for input (system prompt + history).
|
||||
let budget = ctx_len / 2;
|
||||
info!(
|
||||
"server context_length={ctx_len} → max_output_tokens={budget} \
|
||||
(was {} from family default)",
|
||||
self.max_output_tokens,
|
||||
);
|
||||
self.max_output_tokens = budget;
|
||||
}
|
||||
None => {
|
||||
info!(
|
||||
"could not determine server context_length; \
|
||||
using family default max_output_tokens={}",
|
||||
self.max_output_tokens,
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Try to discover the loaded context length for the current model.
|
||||
async fn query_context_length(&self) -> Option<u32> {
|
||||
let base = self.api_url.trim_end_matches('/');
|
||||
|
||||
// --- Strategy 1: LM Studio proprietary /api/v1/models ---
|
||||
let lmstudio_url = format!("{base}/api/v1/models");
|
||||
if let Ok(resp) = self.client.get(&lmstudio_url).send().await {
|
||||
if resp.status().is_success() {
|
||||
if let Ok(json) = resp.json::<Value>().await {
|
||||
if let Some(ctx) = lmstudio_context_length(&json, &self.model) {
|
||||
return Some(ctx);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// --- Strategy 2: OpenAI-compat /v1/models/{id} ---
|
||||
let oai_url = format!("{base}/v1/models/{}", self.model);
|
||||
if let Ok(resp) = self.client.get(&oai_url).send().await {
|
||||
if resp.status().is_success() {
|
||||
if let Ok(json) = resp.json::<Value>().await {
|
||||
if let Some(n) = json["context_length"].as_u64() {
|
||||
return Some(n as u32);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
warn!("could not query context_length from server for model {}", self.model);
|
||||
None
|
||||
}
|
||||
|
||||
/// Send a conversation to the model and get the text response.
|
||||
pub async fn chat(
|
||||
&self,
|
||||
system: &str,
|
||||
@@ -63,7 +145,7 @@ impl ClaudeClient {
|
||||
) -> Result<(String, Option<Usage>)> {
|
||||
let body = MessagesRequest {
|
||||
model: self.model.clone(),
|
||||
max_tokens: 4096,
|
||||
max_tokens: self.max_output_tokens,
|
||||
system: system.to_string(),
|
||||
messages: messages.to_vec(),
|
||||
};
|
||||
@@ -98,9 +180,54 @@ impl ClaudeClient {
|
||||
}
|
||||
}
|
||||
|
||||
/// Extract a JSON object from Claude's response text.
|
||||
/// Looks for the first `{` ... `}` block, handling markdown code fences.
|
||||
/// Extract the loaded context_length for a model from the LM Studio
|
||||
/// `/api/v1/models` response.
|
||||
///
|
||||
/// Matches on the `key` field only (LM Studio uses `key`; some variants
|
||||
/// append a quantization suffix like `@q4_k_m`, so we strip that too).
|
||||
fn lmstudio_context_length(json: &Value, model_id: &str) -> Option<u32> {
|
||||
let models = json["models"].as_array()?;
|
||||
let model_base = model_id.split('@').next().unwrap_or(model_id);
|
||||
|
||||
for entry in models {
|
||||
let key = entry["key"].as_str().unwrap_or("");
|
||||
let key_base = key.split('@').next().unwrap_or(key);
|
||||
|
||||
if key_base == model_base || key == model_id {
|
||||
// Prefer the actually-loaded context (loaded_instances[0].config.context_length)
|
||||
// over the theoretical max_context_length.
|
||||
let loaded = entry["loaded_instances"]
|
||||
.as_array()
|
||||
.and_then(|a| a.first())
|
||||
.and_then(|inst| inst["config"]["context_length"].as_u64())
|
||||
.map(|n| n as u32);
|
||||
if loaded.is_some() {
|
||||
return loaded;
|
||||
}
|
||||
// Fall back to max_context_length if no loaded instance info
|
||||
if let Some(n) = entry["max_context_length"].as_u64() {
|
||||
return Some(n as u32);
|
||||
}
|
||||
}
|
||||
}
|
||||
None
|
||||
}
|
||||
|
||||
/// Return the content of the first `<think>` block, if any.
|
||||
/// Used for debug logging of R1 reasoning chains.
|
||||
pub fn extract_think_content(text: &str) -> Option<String> {
|
||||
let start = text.find("<think>")? + "<think>".len();
|
||||
let end = text[start..].find("</think>").map(|i| start + i)?;
|
||||
Some(text[start..end].trim().to_string())
|
||||
}
|
||||
|
||||
/// Extract a JSON object from a model response text.
|
||||
/// Handles markdown code fences and R1-style `<think>...</think>` blocks.
|
||||
pub fn extract_json(text: &str) -> Result<Value> {
|
||||
// Strip R1-style thinking blocks before looking for JSON
|
||||
let text = strip_think_blocks(text);
|
||||
let text = text.as_ref();
|
||||
|
||||
// Strip markdown fences if present
|
||||
let cleaned = text
|
||||
.replace("```json", "")
|
||||
@@ -137,3 +264,25 @@ pub fn extract_json(text: &str) -> Result<Value> {
|
||||
|
||||
serde_json::from_str(&cleaned[s..e]).context("parse extracted JSON")
|
||||
}
|
||||
|
||||
/// Remove `<think>...</think>` blocks emitted by R1-family reasoning models.
|
||||
/// Handles multiple blocks and unterminated blocks (truncated responses); nested tags are not fully stripped.
|
||||
fn strip_think_blocks(text: &str) -> std::borrow::Cow<'_, str> {
|
||||
if !text.contains("<think>") {
|
||||
return std::borrow::Cow::Borrowed(text);
|
||||
}
|
||||
let mut out = String::with_capacity(text.len());
|
||||
let mut rest = text;
|
||||
while let Some(start) = rest.find("<think>") {
|
||||
out.push_str(&rest[..start]);
|
||||
rest = &rest[start + "<think>".len()..];
|
||||
if let Some(end) = rest.find("</think>") {
|
||||
rest = &rest[end + "</think>".len()..];
|
||||
} else {
|
||||
// Unterminated — discard the rest (truncated thinking block)
|
||||
rest = "";
|
||||
}
|
||||
}
|
||||
out.push_str(rest);
|
||||
std::borrow::Cow::Owned(out)
|
||||
}
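The stripping behaviour above can be checked in isolation. The following is a minimal sketch using only the standard library: `strip_think` mirrors the loop in `strip_think_blocks`, while `json_slice` is a hypothetical helper (not part of the diff) that assumes the JSON object runs from the first `{` to the last `}` once thinking is removed.

```rust
use std::borrow::Cow;

/// Drop every `<think>...</think>` span. An unterminated block swallows
/// the remainder of the text (truncated response).
fn strip_think(text: &str) -> Cow<'_, str> {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";
    if !text.contains(OPEN) {
        // Nothing to strip: avoid allocating.
        return Cow::Borrowed(text);
    }
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = rest.find(OPEN) {
        out.push_str(&rest[..start]);
        rest = &rest[start + OPEN.len()..];
        rest = match rest.find(CLOSE) {
            Some(end) => &rest[end + CLOSE.len()..],
            None => "", // unterminated: discard the tail
        };
    }
    out.push_str(rest);
    Cow::Owned(out)
}

/// Hypothetical helper: slice from the first `{` to the last `}` after
/// removing think blocks. Parsing the slice is left to the caller.
fn json_slice(text: &str) -> Option<String> {
    let stripped = strip_think(text);
    let s = stripped.find('{')?;
    let e = stripped.rfind('}')? + 1;
    if s < e {
        Some(stripped[s..e].to_string())
    } else {
        None
    }
}
```

Note that the loop pairs each opening tag with the first closing tag that follows it, so a nested block such as `<think>a<think>b</think>c</think>d` leaves `c</think>d` behind. A brace scan like the one in `json_slice` tolerates that residue as long as it contains no braces.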
|
||||
|
||||
@@ -2,6 +2,50 @@ use std::path::PathBuf;
|
||||
|
||||
use clap::Parser;
|
||||
|
||||
/// Model family — controls token budgets and prompt style.
|
||||
#[derive(Debug, Clone, PartialEq)]
|
||||
pub enum ModelFamily {
|
||||
/// DeepSeek-R1 and its distillations: emit `<think>` blocks that count
|
||||
/// against the output-token budget, so we need a much larger max_tokens.
|
||||
DeepSeekR1,
|
||||
/// General instruction-following models (Qwen, Llama, Mistral, …).
|
||||
Generic,
|
||||
}
|
||||
|
||||
impl ModelFamily {
|
||||
/// Detect family from a model name string (case-insensitive).
|
||||
pub fn detect(model: &str) -> Self {
|
||||
let m = model.to_ascii_lowercase();
|
||||
if m.contains("deepseek-r1") || m.contains("r1-distill") || m.contains("r1_distill") {
|
||||
Self::DeepSeekR1
|
||||
} else {
|
||||
Self::Generic
|
||||
}
|
||||
}
|
||||
|
||||
/// Display name for logging.
|
||||
pub fn name(&self) -> &'static str {
|
||||
match self {
|
||||
Self::DeepSeekR1 => "DeepSeek-R1",
|
||||
Self::Generic => "Generic",
|
||||
}
|
||||
}
|
||||
|
||||
/// Maximum output tokens to request. R1 thinking blocks can be thousands
|
||||
/// of tokens; reserve enough headroom for the JSON after thinking.
|
||||
pub fn max_output_tokens(&self) -> u32 {
|
||||
match self {
|
||||
Self::DeepSeekR1 => 32768,
|
||||
Self::Generic => 8192,
|
||||
}
|
||||
}
|
||||
|
||||
/// Whether this model family emits chain-of-thought before its response.
|
||||
pub fn has_thinking(&self) -> bool {
|
||||
matches!(self, Self::DeepSeekR1)
|
||||
}
|
||||
}
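To see how the family default and the server-reported context length interact, here is a small self-contained sketch. The enum and `detect` follow the diff; `effective_max_output` is a hypothetical free function that condenses what `ClaudeClient::new` and `apply_server_limits` do together, and the model names in the usage note are illustrative only.

```rust
#[derive(Debug, Clone, PartialEq)]
enum ModelFamily {
    DeepSeekR1,
    Generic,
}

impl ModelFamily {
    /// Case-insensitive substring match on the model name.
    fn detect(model: &str) -> Self {
        let m = model.to_ascii_lowercase();
        if m.contains("deepseek-r1") || m.contains("r1-distill") || m.contains("r1_distill") {
            Self::DeepSeekR1
        } else {
            Self::Generic
        }
    }

    /// Family default used when the server exposes no context length.
    fn max_output_tokens(&self) -> u32 {
        match self {
            Self::DeepSeekR1 => 32768,
            Self::Generic => 8192,
        }
    }
}

/// Hypothetical helper: half the server context if known, otherwise the
/// family default.
fn effective_max_output(model: &str, server_ctx: Option<u32>) -> u32 {
    match server_ctx {
        Some(ctx) => ctx / 2,
        None => ModelFamily::detect(model).max_output_tokens(),
    }
}
```

One consequence worth keeping in mind: a server value always wins, so an R1-family model loaded with a 16384-token context gets a budget of 8192, well below its 32768 family default.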
|
||||
|
||||
/// Autonomous strategy search agent for the swym backtesting platform.
|
||||
///
|
||||
/// Runs a loop: ask Claude to generate/refine strategies → submit backtests to swym →
|
||||
@@ -74,6 +118,13 @@ pub struct Cli {
|
||||
#[arg(long, default_value = "./results")]
|
||||
pub output_dir: PathBuf,
|
||||
|
||||
/// Path to the run ledger JSONL file used for cross-run learning.
|
||||
/// Defaults to <output_dir>/run_ledger.jsonl when not specified.
|
||||
/// Pass a different path to seed a new run from a specific ledger
|
||||
/// (e.g. a curated export from a previous campaign).
|
||||
#[arg(long)]
|
||||
pub ledger_file: Option<PathBuf>,
|
||||
|
||||
/// Poll interval in seconds when waiting for backtest completion.
|
||||
#[arg(long, default_value_t = 2)]
|
||||
pub poll_interval_secs: u64,
|
||||
@@ -123,4 +174,22 @@ impl Instrument {
|
||||
}
|
||||
"usdc".to_string()
|
||||
}
|
||||
|
||||
/// Instrument kind for the paper-run config `instrument.kind` field.
|
||||
/// Derived from the exchange identifier (case-insensitive).
|
||||
pub fn market_kind(&self) -> &'static str {
|
||||
let e = self.exchange.to_ascii_lowercase();
|
||||
if e.contains("futures_usd") || e.contains("futures_um") {
|
||||
"futures_um"
|
||||
} else if e.contains("futures_coin") || e.contains("futures_cm") {
|
||||
"futures_cm"
|
||||
} else {
|
||||
"spot"
|
||||
}
|
||||
}
|
||||
|
||||
/// True when this instrument is traded on a futures market.
|
||||
pub fn is_futures(&self) -> bool {
|
||||
self.market_kind() != "spot"
|
||||
}
|
||||
}
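The exchange-to-kind mapping is easy to exercise on its own. This sketch restates `market_kind` as a free function over a plain string; the exchange identifiers in the usage note are assumed examples, not values confirmed by the diff.

```rust
/// Map an exchange identifier to the paper-run `instrument.kind` value.
/// Matching is case-insensitive and substring-based, as in the diff.
fn market_kind(exchange: &str) -> &'static str {
    let e = exchange.to_ascii_lowercase();
    if e.contains("futures_usd") || e.contains("futures_um") {
        "futures_um"
    } else if e.contains("futures_coin") || e.contains("futures_cm") {
        "futures_cm"
    } else {
        // Anything unrecognised is treated as spot.
        "spot"
    }
}

/// True when the exchange identifier maps to a futures market.
fn is_futures(exchange: &str) -> bool {
    market_kind(exchange) != "spot"
}
```

For instance, an identifier such as `binance_futures_usd` would map to `futures_um`, while any unrecognised string falls through to `spot`. That fallback also means a misspelled futures exchange silently disables futures-only features such as `reverse`.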
|
||||
|
||||
@@ -66,11 +66,53 @@
|
||||
"properties": {
|
||||
"side": { "type": "string", "enum": ["buy", "sell"] },
|
||||
"quantity": {
|
||||
"$ref": "#/definitions/DecimalString",
|
||||
"description": "Per-order size in base asset units, e.g. \"0.001\" for BTC."
|
||||
"description": "Per-order size in base asset units. Fixed decimal string (e.g. \"0.001\"), a declarative SizingMethod object, or a dynamic Expr object. When a method or Expr returns None the order is skipped; negative values are clamped to zero.",
|
||||
"oneOf": [
|
||||
{ "$ref": "#/definitions/DecimalString" },
|
||||
{ "$ref": "#/definitions/SizingFixedSum" },
|
||||
{ "$ref": "#/definitions/SizingPercentOfBalance" },
|
||||
{ "$ref": "#/definitions/SizingFixedUnits" },
|
||||
{ "$ref": "#/definitions/Expr" }
|
||||
]
|
||||
},
|
||||
"reverse": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Flip-through-zero flag (futures only). When true and an opposite position is currently open, the submitted order quantity becomes position_qty + configured_qty, closing the existing position and immediately opening a new one in the opposite direction in a single order. When flat the flag has no effect and configured_qty is used as normal. Omit or set false for standard close-only behaviour."
|
||||
}
|
||||
}
|
||||
},
|
||||
"SizingFixedSum": {
|
||||
"description": "Buy `amount` worth of quote currency at the current price. qty = amount / current_price.",
|
||||
"type": "object",
|
||||
"required": ["method", "amount"],
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"method": { "const": "fixed_sum" },
|
||||
"amount": { "$ref": "#/definitions/DecimalString", "description": "Quote-currency amount, e.g. \"500\" means buy $500 worth." }
|
||||
}
|
||||
},
|
||||
"SizingPercentOfBalance": {
|
||||
"description": "Spend `percent`% of the named asset's free balance on the base asset. qty = balance(asset) * percent/100 / current_price.",
|
||||
"type": "object",
|
||||
"required": ["method", "percent", "asset"],
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"method": { "const": "percent_of_balance" },
|
||||
"percent": { "$ref": "#/definitions/DecimalString", "description": "Percentage, e.g. \"2\" means 2% of the free balance." },
|
||||
"asset": { "type": "string", "description": "Asset name to look up, e.g. \"usdc\". Matched case-insensitively." }
|
||||
}
|
||||
},
|
||||
"SizingFixedUnits": {
|
||||
"description": "Buy exactly `units` of base asset. Semantic alias for a fixed decimal quantity.",
|
||||
"type": "object",
|
||||
"required": ["method", "units"],
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"method": { "const": "fixed_units" },
|
||||
"units": { "$ref": "#/definitions/DecimalString", "description": "Base asset quantity, e.g. \"0.01\" means 0.01 BTC." }
|
||||
}
|
||||
},
|
||||
"Rule": {
|
||||
"type": "object",
|
||||
"required": ["when", "then"],
|
||||
@@ -280,7 +322,12 @@
|
||||
{ "$ref": "#/definitions/ExprBinOp" },
|
||||
{ "$ref": "#/definitions/ExprApplyFunc" },
|
||||
{ "$ref": "#/definitions/ExprUnaryOp" },
|
||||
{ "$ref": "#/definitions/ExprBarsSince" }
|
||||
{ "$ref": "#/definitions/ExprBarsSince" },
|
||||
{ "$ref": "#/definitions/ExprEntryPrice" },
|
||||
{ "$ref": "#/definitions/ExprPositionQuantity" },
|
||||
{ "$ref": "#/definitions/ExprUnrealisedPnl" },
|
||||
{ "$ref": "#/definitions/ExprBarsSinceEntry" },
|
||||
{ "$ref": "#/definitions/ExprBalance" }
|
||||
]
|
||||
},
|
||||
"ExprLiteral": {
|
||||
@@ -417,6 +464,55 @@
|
||||
"description": "Maximum bars to look back."
|
||||
}
|
||||
}
|
||||
},
|
||||
"ExprEntryPrice": {
|
||||
"description": "Volume-weighted average entry price of the current open position. Returns None when flat.",
|
||||
"type": "object",
|
||||
"required": ["kind"],
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"kind": { "const": "entry_price" }
|
||||
}
|
||||
},
|
||||
"ExprPositionQuantity": {
|
||||
"description": "Absolute quantity of the current open position in base asset units. Returns None when flat.",
|
||||
"type": "object",
|
||||
"required": ["kind"],
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"kind": { "const": "position_quantity" }
|
||||
}
|
||||
},
|
||||
"ExprUnrealisedPnl": {
|
||||
"description": "Estimated unrealised PnL of the current open position in quote asset. Returns None when flat.",
|
||||
"type": "object",
|
||||
"required": ["kind"],
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"kind": { "const": "unrealised_pnl" }
|
||||
}
|
||||
},
|
||||
"ExprBarsSinceEntry": {
|
||||
"description": "Number of complete primary-interval bars elapsed since the current position was opened. Computed as floor((now - time_enter) / primary_interval_secs). Returns None when flat.",
|
||||
"type": "object",
|
||||
"required": ["kind"],
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"kind": { "const": "bars_since_entry" }
|
||||
}
|
||||
},
|
||||
"ExprBalance": {
|
||||
"description": "Free balance of the named asset (matched case-insensitively). Returns None when the asset is not found or balance data is unavailable.",
|
||||
"type": "object",
|
||||
"required": ["kind", "asset"],
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"kind": { "const": "balance" },
|
||||
"asset": {
|
||||
"type": "string",
|
||||
"description": "Internal asset name, e.g. \"usdt\", \"btc\". Case-insensitive."
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
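The sizing formulas in the schema descriptions can be worked through numerically. The sketch below is an illustration only: it uses `f64` for brevity (the schema uses decimal strings, so the real engine presumably uses a decimal type), all function names are hypothetical, and returning `None` for a non-positive price is an assumption.

```rust
/// fixed_sum: qty = amount / current_price.
fn fixed_sum_qty(amount: f64, price: f64) -> Option<f64> {
    if price <= 0.0 {
        return None; // assumed: no usable price means the order is skipped
    }
    Some((amount / price).max(0.0)) // negative values are clamped to zero
}

/// percent_of_balance: qty = balance * percent/100 / current_price.
fn percent_of_balance_qty(balance: f64, percent: f64, price: f64) -> Option<f64> {
    if price <= 0.0 {
        return None;
    }
    Some((balance * percent / 100.0 / price).max(0.0))
}

/// reverse: when an opposite position is open, submit
/// position_qty + configured_qty; when flat, the flag has no effect.
fn reverse_order_qty(configured_qty: f64, opposite_position_qty: Option<f64>, reverse: bool) -> f64 {
    match (reverse, opposite_position_qty) {
        (true, Some(open)) => open + configured_qty,
        _ => configured_qty,
    }
}
```

With a price of 50000, `fixed_sum` with amount 500 gives 0.01 units, and 2% of a 10000 balance gives 0.004 units. A `reverse` sell of 0.01 against an open long of 0.03 submits 0.04, leaving a 0.01 short.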
|
||||
|
||||
491
src/prompts.rs
@@ -1,9 +1,28 @@
|
||||
/// System prompt for the strategy-generation Claude instance.
|
||||
use crate::config::ModelFamily;
|
||||
|
||||
/// System prompt for the strategy-generation model.
|
||||
///
|
||||
/// This is the most important part of the agent — it defines how Claude
|
||||
/// thinks about strategy design, what it knows about the DSL, and how
|
||||
/// it should interpret backtest results.
|
||||
pub fn system_prompt(dsl_schema: &str) -> String {
|
||||
/// Accepts a `ModelFamily` so each family can receive tailored guidance
|
||||
/// while sharing the common DSL schema and strategy evaluation rules.
|
||||
pub fn system_prompt(dsl_schema: &str, family: &ModelFamily, has_futures: bool) -> String {
|
||||
let output_instructions = match family {
|
||||
ModelFamily::DeepSeekR1 => {
|
||||
"## Output format\n\n\
|
||||
Think through your strategy design carefully before committing to it. \
|
||||
After your thinking, output ONLY a bare JSON object — no markdown fences, \
|
||||
no commentary, no explanation. Start with `{` and end with `}`. \
|
||||
Your thinking will be stripped automatically; only the JSON is used."
|
||||
}
|
||||
ModelFamily::Generic => {
|
||||
"## How to respond\n\n\
|
||||
You must respond with ONLY a valid JSON object — the strategy config.\n\
|
||||
No prose, no markdown explanation, no commentary.\n\
|
||||
Just the raw JSON starting with { and ending with }.\n\n\
|
||||
The JSON must be a valid strategy with \"type\": \"rule_based\".\n\
|
||||
Use \"usdc\" (not \"usdt\") as the quote asset for balance expressions."
|
||||
}
|
||||
};
|
||||
|
||||
format!(
|
||||
r##"You are a quantitative trading strategy researcher. Your task is to design,
|
||||
evaluate, and iteratively refine trading strategies expressed in the swym JSON DSL.
|
||||
@@ -33,6 +52,10 @@ sma, ema, wma, rsi, std_dev, sum, highest, lowest, atr, supertrend, adx,
|
||||
bollinger_upper, bollinger_lower — applied to any candle field (open/high/low/close/volume)
|
||||
with configurable period and optional offset.
|
||||
|
||||
These are FuncNames used INSIDE `{{"kind":"func","name":"...","period":N}}` expressions.
|
||||
`atr`, `adx`, and `supertrend` use OHLC internally and ignore the `field` parameter.
|
||||
To use ADX as a trend-strength filter: `{{"kind":"compare","left":{{"kind":"func","name":"adx","period":14}},"op":">","right":{{"kind":"literal","value":"25"}}}}`
|
||||
|
||||
### Composed indicators (apply_func)
|
||||
Apply rolling functions to arbitrary expressions: EMA of EMA, Hull MA (WMA of expression),
|
||||
VWAP (sum of close*volume / sum of volume), standard deviation of returns, etc.
|
||||
@@ -51,11 +74,78 @@ bars_since_entry — complete bars elapsed since position was opened
|
||||
balance — free balance of a named asset (e.g. "usdt", "usdc")
|
||||
|
||||
### Quantity
|
||||
Action quantity MUST be a fixed decimal string that parses as a floating-point number,
|
||||
e.g. `"quantity": "0.001"`.
|
||||
NEVER use an expression object for quantity — only plain decimal strings are accepted.
|
||||
NEVER use placeholder strings like `"ATR_SIZED"`, `"FULL_BALANCE"`, `"percent_of_balance"`,
|
||||
`"dynamic"`, or any non-numeric string — these will be rejected immediately.
|
||||
Action quantity accepts four forms — pick the simplest one for your intent:
|
||||
|
||||
**1. Declarative sizing methods (preferred — instrument-agnostic, readable):**
|
||||
|
||||
Spend a fixed quote amount (e.g. $500 worth of base at current price):
|
||||
```json
|
||||
{{"method":"fixed_sum","amount":"500"}}
|
||||
```
|
||||
|
||||
Spend a percentage of free quote balance (e.g. 5% of USDC):
|
||||
```json
|
||||
{{"method":"percent_of_balance","percent":"5","asset":"usdc"}}
|
||||
```
|
||||
|
||||
Buy a fixed number of base units (semantic alias for a decimal string):
|
||||
```json
|
||||
{{"method":"fixed_units","units":"0.01"}}
|
||||
```
|
||||
|
||||
**2. Plain decimal string** — use only when you have a specific reason:
|
||||
`"0.01"` (0.01 BTC, 3.0 ETH, 50.0 SOL — instrument-specific, not portable)
|
||||
|
||||
**3. Expr** — for dynamic sizing not covered by the methods above, e.g. ATR-based:
|
||||
```json
|
||||
{{"kind":"bin_op","op":"div",
|
||||
"left":{{"kind":"literal","value":"200"}},
|
||||
"right":{{"kind":"func","name":"atr","period":14}}}}
|
||||
```
|
||||
|
||||
CRITICAL — ATR sizing and balance limits: `N/atr(14)` expresses quantity in BASE asset units.
|
||||
For BTC, 4h ATR ≈ $1500–3000. So `1000/atr(14)` ≈ 0.4–0.7 BTC ≈ $32k–56k notional —
|
||||
silently rejected on a $10k account (fill returns None, 0 positions open, no error shown).
|
||||
The numerator N represents your intended dollar risk per trade. For a $10k account keep N ≤ 200.
|
||||
`200/atr(14)` ≈ 0.07–0.13 BTC ≈ $5.6k–10k notional — fits within a $10k account.
|
||||
Prefer `percent_of_balance` for most sizing. Only reach for ATR-based Expr sizing when you need
|
||||
volatility-scaled position risk, and keep the numerator proportional to your risk tolerance.
|
||||
|
||||
**4. Exit rules** — use `position_quantity` to close the exact open size:
|
||||
```json
|
||||
{{"kind":"position_quantity"}}
|
||||
```
|
||||
Alternatively, `"9999"` works for exits: sell quantities are automatically capped to the open
|
||||
position size, so a large fixed number is equivalent to `position_quantity`.
|
||||
|
||||
CRITICAL — the `"method"` vs `"kind"` distinction:
|
||||
- `"method"` belongs ONLY to the three declarative sizing objects: `fixed_sum`, `percent_of_balance`, `fixed_units`.
|
||||
- `"kind"` belongs to Expr objects: `position_quantity`, `bin_op`, `func`, `field`, `literal`, etc.
|
||||
- `{{"method":"position_quantity"}}` is ALWAYS WRONG. It will be rejected every time.
|
||||
CORRECT: `{{"kind":"position_quantity"}}`.
|
||||
- If you used `{{"method":"percent_of_balance",...}}` for the buy, use `{{"kind":"position_quantity"}}` for the sell.
|
||||
These are different object types — buy uses a SizingMethod (`method`), sell uses an Expr (`kind`).
|
||||
- `{{"method":"fixed_sum","amount":"100","multiplier":"2.0"}}` is WRONG — `fixed_sum` has no
|
||||
`multiplier` field. Only `amount` is accepted alongside `method`.
|
||||
- NEVER add extra fields to SizingMethod objects — they use `additionalProperties: false`.
|
||||
|
||||
### Reverse / flip-through-zero (futures only)
|
||||
|
||||
Setting `"reverse": true` on a rule action enables a single-order position flip on futures.
|
||||
When an opposite position is open, quantity = `position_qty + configured_qty`, which closes
|
||||
the existing position and opens a new one in the opposite direction in one order (fees split
|
||||
proportionally). When flat the flag has no effect — `configured_qty` is used normally.
|
||||
|
||||
This lets you collapse a 4-rule long+short strategy (separate open/close for each leg) into
|
||||
2 rules, reducing round-trip fees and keeping logic compact:
|
||||
|
||||
```json
|
||||
{{"side": "sell", "quantity": {{"method": "percent_of_balance", "percent": "10", "asset": "usdc"}}, "reverse": true}}
|
||||
```
|
||||
|
||||
Use `reverse` when you always want to be in a position — the signal flips you from long to
|
||||
short (or vice versa) rather than first exiting and then re-entering separately. Do NOT use
|
||||
`reverse` on spot markets (short selling is not supported there).
|
||||
|
||||
### Multi-timeframe
|
||||
Any expression can reference a different timeframe via "timeframe" field.
|
||||
@@ -81,6 +171,13 @@ Use higher timeframes as trend filters, lower timeframes for entry precision.
|
||||
6. **Composite / hybrid**: Combine families. Trend filter + mean-reversion entry.
|
||||
Momentum confirmation + volatility sizing.
|
||||
|
||||
7. **Supertrend consensus flip (futures only)**: Use `any_of` across multiple
|
||||
Supertrend configs (e.g. period=7/mul=1.5, period=10/mul=2.0, period=20/mul=3.0)
|
||||
so that ANY flip triggers a long or short entry. Combine with `"reverse": true`
|
||||
for an always-in-market approach where the opposite signal is the stop-loss.
|
||||
Varying multiplier tightens/loosens the band; varying period controls sensitivity.
|
||||
Risk: choppy markets generate many whipsaws — best on daily or 4h.
|
||||
|
||||
## Risk management (always include)
|
||||
|
||||
Every strategy MUST have:
|
||||
@@ -88,14 +185,11 @@ Every strategy MUST have:
|
||||
- A time-based exit: use bars_since_entry to avoid holding losers indefinitely
|
||||
- Reasonable position sizing: prefer ATR-based or percent-of-balance over fixed quantity
|
||||
|
||||
## How to respond
|
||||
Exception: always-in-market flip strategies (using `"reverse": true`) do not need an
|
||||
explicit stop-loss or time exit — the opposite signal acts as the stop. These are
|
||||
only valid on futures. See Example 6 and Example 7.
|
||||
|
||||
You must respond with ONLY a valid JSON object — the strategy config.
|
||||
No prose, no markdown explanation, no commentary.
|
||||
Just the raw JSON starting with {{ and ending with }}.
|
||||
|
||||
The JSON must be a valid strategy with "type": "rule_based".
|
||||
Use "usdc" (not "usdt") as the quote asset for balance expressions.
|
||||
{output_instructions}
|
||||
|
||||
## Interpreting backtest results
|
||||
|
||||
@@ -103,7 +197,11 @@ When I share results from previous iterations, use them to guide your next strat
|
||||
|
||||
- **Zero trades**: The entry conditions are too restrictive or never co-occur.
|
||||
Relax thresholds, simplify conditions, or check if the indicator periods make
|
||||
sense for the candle interval.
|
||||
sense for the candle interval. Also check your position sizing — if using an
|
||||
ATR-based Expr quantity (`N/atr(14)`), a large N can produce a notional value
|
||||
exceeding your account balance (e.g. `1000/atr(14)` on BTC ≈ 0.4 BTC ≈ $32k),
|
||||
which is silently rejected by the fill engine. Switch to `percent_of_balance`
|
||||
or reduce N to ≤ 200 for a $10k account.
|
||||
|
||||
- **Many trades but negative PnL**: The entry signal has no edge, or the exit
|
||||
logic is poor. Try different indicator combinations, add trend filters, or
|
||||
@@ -134,11 +232,31 @@ Common mistakes to NEVER make:
|
||||
- `"kind": "bars_since_entry"` is a valid standalone Expr (no extra fields needed).
|
||||
Do NOT put `"bars_since_entry"` as a `"name"` inside `{{"kind":"func",...}}` — that is WRONG.
|
||||
- `"kind": "expr_field"` does NOT exist. Use `{{"kind":"field","field":"close"}}`.
|
||||
- Every Expr object MUST have a `"kind"` field. `{{"field":"close"}}` is WRONG — missing `"kind"`.
|
||||
CORRECT: `{{"kind":"field","field":"close"}}`. The `"kind"` is never optional.
|
||||
This applies to ALL field access including offset lookups:
|
||||
`{{"field":"volume","offset":-1}}` is WRONG. CORRECT: `{{"kind":"field","field":"volume","offset":-1}}`.
|
||||
`{{"field":"high","offset":-2}}` is WRONG. CORRECT: `{{"kind":"field","field":"high","offset":-2}}`.
|
||||
- `rsi`, `adx`, `supertrend` are NOT valid inside `apply_func`. Use only `apply_func`
|
||||
with `ApplyFuncName` values: `highest`, `lowest`, `sma`, `ema`, `wma`, `std_dev`, `sum`,
|
||||
`bollinger_upper`, `bollinger_lower`.
|
||||
- `volume` is a candle FIELD, not a func name. Access it as `{{"kind":"field","field":"volume"}}`.
|
||||
To compute EMA of volume: `{{"kind":"apply_func","name":"ema","period":20,"expr":{{"kind":"field","field":"volume"}}}}`.
|
||||
To compute EMA of volume: `{{"kind":"apply_func","name":"ema","period":20,"input":{{"kind":"field","field":"volume"}}}}`.
|
||||
- `bollinger_upper` and `bollinger_lower` are FUNC NAMES, not Expr kinds. To compare close to the upper band:
|
||||
`{{"kind":"compare","left":{{"kind":"field","field":"close"}},"op":">","right":{{"kind":"func","name":"bollinger_upper","period":20}}}}`
|
||||
NEVER write `{{"kind":"bollinger_upper",...}}` — `bollinger_upper` is not an Expr kind.
|
||||
NEVER set `"field":"bollinger_upper"` on a func Expr — `bollinger_upper`/`bollinger_lower` have no `field`
|
||||
parameter; they compute from close internally. Just `{{"kind":"func","name":"bollinger_upper","period":20}}`.
|
||||
- The `{{"kind":"bollinger",...}}` Condition (shorthand) only accepts `"band": "above_upper"` or
|
||||
`"band": "below_lower"`. There is NO `above_lower` or `below_upper` — those are invalid and will be
|
||||
rejected. Use `above_upper` (price above the upper band) or `below_lower` (price below the lower band).
|
||||
- `adx` is a FUNC NAME, not a Condition kind. To filter for strong trends (ADX > 25):
|
||||
`{{"kind":"compare","left":{{"kind":"func","name":"adx","period":14}},"op":">","right":{{"kind":"literal","value":"25"}}}}`
|
||||
NEVER write `{{"kind":"adx",...}}` — `adx` is not a Condition kind, it is a FuncName used inside `{{"kind":"func",...}}`.
|
||||
- `roc` (rate of change), `hma` (Hull MA), `ma` (generic), `vwap`, `macd`, `cci`, `stoch` are NOT supported.
|
||||
Use `sma`, `ema`, `wma`, `rsi`, `atr`, `adx`, `supertrend`, `std_dev`, `sum`, `highest`, `lowest`,
|
||||
`bollinger_upper`, `bollinger_lower` only. There is no generic `ma` — use `sma` or `ema` explicitly.
|
||||
Hull MA can be composed as WMA(2*WMA(n/2) - WMA(n)) with an outer period of about sqrt(n), using `apply_func`.
|
||||
|
||||
## Working examples
|
||||
|
||||
@@ -159,7 +277,7 @@ Common mistakes to NEVER make:
|
||||
{{"kind": "ema_trend", "period": 50, "direction": "above"}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "buy", "quantity": "0.001"}}
|
||||
"then": {{"side": "buy", "quantity": "0.01"}}
|
||||
}},
|
||||
{{
|
||||
"comment": "Sell: EMA9 crosses below EMA21, OR 2% stop-loss, OR 72-bar time exit",
|
||||
@@ -187,7 +305,7 @@ Common mistakes to NEVER make:
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "sell", "quantity": "0.001"}}
|
||||
"then": {{"side": "sell", "quantity": {{"kind": "position_quantity"}}}}
|
||||
}}
|
||||
]
|
||||
}}
|
||||
@@ -210,7 +328,7 @@ Common mistakes to NEVER make:
|
||||
{{"kind": "bollinger", "period": 20, "band": "below_lower"}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "buy", "quantity": "0.001"}}
|
||||
"then": {{"side": "buy", "quantity": "0.01"}}
|
||||
}},
|
||||
{{
|
||||
"comment": "Sell: RSI recovers above 55, OR 3% stop-loss, OR 48-bar time exit",
|
||||
@@ -238,7 +356,7 @@ Common mistakes to NEVER make:
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "sell", "quantity": "0.001"}}
|
||||
"then": {{"side": "sell", "quantity": {{"kind": "position_quantity"}}}}
|
||||
}}
|
||||
]
|
||||
}}
|
||||
@@ -265,7 +383,7 @@ Common mistakes to NEVER make:
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "buy", "quantity": "0.001"}}
|
||||
"then": {{"side": "buy", "quantity": "0.01"}}
|
||||
}},
|
||||
{{
|
||||
"comment": "Sell: 2-ATR stop-loss below entry price, OR 48-bar time exit",
|
||||
@@ -300,38 +418,343 @@ Common mistakes to NEVER make:
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "sell", "quantity": "0.001"}}
|
||||
"then": {{"side": "sell", "quantity": {{"kind": "position_quantity"}}}}
|
||||
}}
|
||||
]
|
||||
}}
|
||||
```
|
||||
|
||||
### Example 4 — MACD crossover (composed from primitives)
|
||||
|
||||
MACD has no native support, but can be composed from `func` and `apply_func`.
|
||||
The MACD line is `EMA(12) - EMA(26)`; the signal line is `EMA(9)` of the MACD line.
|
||||
|
||||
```json
|
||||
{{
|
||||
"type": "rule_based",
|
||||
"candle_interval": "4h",
|
||||
"rules": [
|
||||
{{
|
||||
"comment": "Buy: MACD line crosses above signal line",
|
||||
"when": {{
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{{"kind": "position", "state": "flat"}},
|
||||
{{
|
||||
"kind": "cross_over",
|
||||
"left": {{
|
||||
"kind": "bin_op", "op": "sub",
|
||||
"left": {{"kind": "func", "name": "ema", "period": 12}},
|
||||
"right": {{"kind": "func", "name": "ema", "period": 26}}
|
||||
}},
|
||||
"right": {{
|
||||
"kind": "apply_func", "name": "ema", "period": 9,
|
||||
"input": {{
|
||||
"kind": "bin_op", "op": "sub",
|
||||
"left": {{"kind": "func", "name": "ema", "period": 12}},
|
||||
"right": {{"kind": "func", "name": "ema", "period": 26}}
|
||||
}}
|
||||
}}
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "buy", "quantity": "0.01"}}
|
||||
}},
|
||||
{{
|
||||
"comment": "Sell: MACD crosses below signal, OR 2% stop-loss, OR 72-bar time exit",
|
||||
"when": {{
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{{"kind": "position", "state": "long"}},
|
||||
{{
|
||||
"kind": "any_of",
|
||||
"conditions": [
|
||||
{{
|
||||
"kind": "cross_under",
|
||||
"left": {{
|
||||
"kind": "bin_op", "op": "sub",
|
||||
"left": {{"kind": "func", "name": "ema", "period": 12}},
|
||||
"right": {{"kind": "func", "name": "ema", "period": 26}}
|
||||
}},
|
||||
"right": {{
|
||||
"kind": "apply_func", "name": "ema", "period": 9,
|
||||
"input": {{
|
||||
"kind": "bin_op", "op": "sub",
|
||||
"left": {{"kind": "func", "name": "ema", "period": 12}},
|
||||
"right": {{"kind": "func", "name": "ema", "period": 26}}
|
||||
}}
|
||||
}}
|
||||
}},
|
||||
{{
|
||||
"kind": "compare",
|
||||
"left": {{"kind": "field", "field": "close"}},
|
||||
"op": "<",
|
||||
"right": {{"kind": "bin_op", "op": "mul",
|
||||
"left": {{"kind": "entry_price"}},
|
||||
"right": {{"kind": "literal", "value": "0.98"}}}}
|
||||
}},
|
||||
{{
|
||||
"kind": "compare",
|
||||
"left": {{"kind": "bars_since_entry"}},
|
||||
"op": ">=",
|
||||
"right": {{"kind": "literal", "value": "72"}}
|
||||
}}
|
||||
]
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "sell", "quantity": {{"kind": "position_quantity"}}}}
|
||||
}}
|
||||
]
|
||||
}}
|
||||
```
|
||||
|
||||
Key pattern: `apply_func` wraps any `Expr` tree using the `"input"` field (NOT `"expr"`).
|
||||
This enables EMA-of-expression (signal line), WMA-of-expression (Hull MA), or std_dev-of-returns.
|
||||
There is NO native `macd` func name — always compose it as `bin_op(sub, func(ema,12), func(ema,26))` as shown above.
|
||||
CRITICAL: `apply_func` uses `"input"`, not `"expr"`. Writing `"expr":` will be rejected by the API.
|
||||
|
||||
## Anti-patterns to avoid
|
||||
|
||||
- Don't use the same indicator for both entry and exit (circular logic)
|
||||
- Don't set RSI thresholds at extreme values (< 10 or > 90) — too rare to fire
|
||||
- Don't use very short periods (< 5) on high timeframes — noisy
|
||||
- Don't use very long periods (> 100) on low timeframes — too slow to react
|
||||
- Don't switch to 15m or shorter intervals when results are poor — higher frequency amplifies
|
||||
fees and noise, making edge harder to find. Prefer 1h or 4h. If Sharpe is negative across
|
||||
intervals, the issue is signal logic, not timeframe — fix the signal before changing interval.
|
||||
- Don't create strategies with more than 5-6 conditions — overfitting risk
|
||||
- Don't ignore fees — a strategy needs to overcome 0.1% per round trip
|
||||
- Always gate buy rules with position state "flat" and sell rules with "long"
|
||||
- Never add a short-entry (sell when flat) rule — spot markets are long-only
|
||||
- Never use an expression object for `quantity` — it must always be a plain decimal string like `"0.001"`
|
||||
- Never use a placeholder string for `quantity` — `"ATR_SIZED"`, `"FULL_BALANCE"`, `"dynamic"`, etc. are all invalid and will be rejected. Use `"0.001"` or similar.
|
||||
"##
|
||||
- Spot markets are long-only: gate buy (entry) rules with state "flat" and sell (exit) rules with state "long". Never add a short-entry (sell when flat) rule on spot.
|
||||
- Futures markets support both directions: long entry = buy when flat; long exit = sell when long; short entry = sell when flat; short exit (cover) = buy when short. Always include a stop-loss and time exit for both long and short legs.
|
||||
- Never use a placeholder string for `quantity` — `"ATR_SIZED"`, `"FULL_BALANCE"`, `"dynamic"`, etc. are all invalid and will be rejected.
|
||||
- Don't use large ATR-based sizing numerators. `N/atr(14)` gives BASE units; for BTC (ATR ≈ $2000
|
||||
on 4h), `1000/atr(14)` ≈ 0.5 BTC ≈ $40k — silently rejected on a $10k account. Keep N ≤ 200
|
||||
or use `percent_of_balance`. The condition audit may show entry conditions firing while 0 positions
|
||||
open — this is the typical symptom of an oversized ATR quantity.
|
||||
- `{{"method":"position_quantity"}}` is WRONG for exit rules — use `{{"kind":"position_quantity"}}` (see Quantity section above).
|
||||
{futures_examples}"##,
|
||||
futures_examples = if has_futures { FUTURES_SHORT_EXAMPLES } else { "" },
|
||||
)
|
||||
}
|
||||
|
||||
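The oversized-quantity warning in the anti-pattern list above can be checked with simple arithmetic. The following is a minimal sketch, not code from scout or swym: the helper names are hypothetical, the BTC price of $80,000 is an assumption implied by the prompt's "0.5 BTC ≈ $40k" figure, and no leverage is assumed.

```rust
/// Base-asset quantity produced by an `N / atr(14)` sizing expression.
/// (Hypothetical helper for illustration only.)
fn atr_sized_quantity(numerator: f64, atr: f64) -> f64 {
    numerator / atr
}

/// Notional value of a base-asset quantity, in quote currency.
fn notional(quantity: f64, price: f64) -> f64 {
    quantity * price
}

/// True when the order's notional value does not exceed the account balance.
/// Assumes an unleveraged account; a real venue may apply other limits.
fn fits_account(quantity: f64, price: f64, balance: f64) -> bool {
    notional(quantity, price) <= balance
}
```

With an ATR of 2000, a numerator of 1000 gives 0.5 BTC, which at the assumed price is a $40,000 order and does not fit a $10,000 account. A numerator of 200 gives 0.1 BTC, or $8,000, which does fit.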
/// Short-entry and short-exit strategy examples, injected into the system prompt when
|
||||
/// futures instruments are present.
|
||||
const FUTURES_SHORT_EXAMPLES: &str = r##"
|
||||
|
||||
### Example 5 — Futures short: EMA trend-following short with ATR stop
|
||||
|
||||
On futures you can also short. Short entry = `"side": "sell"` when `"state": "flat"`;
|
||||
short exit (cover) = `"side": "buy"` when `"state": "short"`. Stop-loss for a short
|
||||
is price rising above entry, e.g. entry_price * 1.02. You may run long and short legs
|
||||
in the same strategy (4 rules total), or a short-only strategy (2 rules).
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "rule_based",
|
||||
"candle_interval": "4h",
|
||||
"rules": [
|
||||
{
|
||||
"comment": "Short entry: EMA9 crosses below EMA21 while price is below EMA50 (downtrend)",
|
||||
"when": {
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{"kind": "position", "state": "flat"},
|
||||
{"kind": "ema_crossover", "fast_period": 9, "slow_period": 21, "direction": "below"},
|
||||
{"kind": "ema_trend", "period": 50, "direction": "below"}
|
||||
]
|
||||
},
|
||||
"then": {"side": "sell", "quantity": {"method": "percent_of_balance", "percent": "10", "asset": "usdc"}}
|
||||
},
|
||||
{
|
||||
"comment": "Short exit: EMA9 crosses back above EMA21, OR 2% stop-loss, OR 48-bar time exit",
|
||||
"when": {
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{"kind": "position", "state": "short"},
|
||||
{
|
||||
"kind": "any_of",
|
||||
"conditions": [
|
||||
{"kind": "ema_crossover", "fast_period": 9, "slow_period": 21, "direction": "above"},
|
||||
{
|
||||
"kind": "compare",
|
||||
"left": {"kind": "field", "field": "close"},
|
||||
"op": ">",
|
||||
"right": {"kind": "bin_op", "op": "mul", "left": {"kind": "entry_price"}, "right": {"kind": "literal", "value": "1.02"}}
|
||||
},
|
||||
{
|
||||
"kind": "compare",
|
||||
"left": {"kind": "bars_since_entry"},
|
||||
"op": ">=",
|
||||
"right": {"kind": "literal", "value": "48"}
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
"then": {"side": "buy", "quantity": {"kind": "position_quantity"}}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Key short-specific notes:
|
||||
- Stop-loss for short = close > entry_price * (1 + stop_pct), e.g. `* 1.02` for 2% stop
|
||||
- Take-profit for short = close < entry_price * (1 - target_pct), e.g. `* 0.97` for 3% target
|
||||
- Short exit uses `"side": "buy"` with `{"kind": "position_quantity"}` (same as long exit uses sell)
|
||||
- `percent_of_balance` for short entry uses `"usdc"` as the asset (the collateral currency)
|
||||
|
||||
### Example 6 — Futures flip-through-zero: 2-rule EMA trend-follower using `reverse`
|
||||
|
||||
When you always want to be in a position (long during uptrends, short during downtrends),
|
||||
use `"reverse": true` to flip from one side to the other in a single order. This uses half
|
||||
the round-trip fee count compared to a 4-rule separate-entry/exit approach.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "rule_based",
|
||||
"candle_interval": "4h",
|
||||
"rules": [
|
||||
{
|
||||
"comment": "Go long (or flip short→long): EMA9 crosses above EMA21 while above EMA50",
|
||||
"when": {
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{"kind": "any_of", "conditions": [
|
||||
{"kind": "position", "state": "flat"},
|
||||
{"kind": "position", "state": "short"}
|
||||
]},
|
||||
{"kind": "ema_crossover", "fast_period": 9, "slow_period": 21, "direction": "above"},
|
||||
{"kind": "ema_trend", "period": 50, "direction": "above"}
|
||||
]
|
||||
},
|
||||
"then": {"side": "buy", "quantity": {"method": "percent_of_balance", "percent": "10", "asset": "usdc"}, "reverse": true}
|
||||
},
|
||||
{
|
||||
"comment": "Go short (or flip long→short): EMA9 crosses below EMA21 while below EMA50",
|
||||
"when": {
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{"kind": "any_of", "conditions": [
|
||||
{"kind": "position", "state": "flat"},
|
||||
{"kind": "position", "state": "long"}
|
||||
]},
|
||||
{"kind": "ema_crossover", "fast_period": 9, "slow_period": 21, "direction": "below"},
|
||||
{"kind": "ema_trend", "period": 50, "direction": "below"}
|
||||
]
|
||||
},
|
||||
"then": {"side": "sell", "quantity": {"method": "percent_of_balance", "percent": "10", "asset": "usdc"}, "reverse": true}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Key flip-strategy notes:
|
||||
- Gate each rule on `flat OR opposite` (using `any_of`) so it fires both on initial entry and on flip
|
||||
- `reverse: true` handles the flip math automatically — no need to size for `position_qty + new_qty`
|
||||
- This pattern works best for trend-following where you want continuous market exposure
|
||||
- Still add a time-based or ATR stop if you want a safety exit when the trend stalls
|
||||
|
||||
### Example 7 — Futures triple-Supertrend consensus flip
|
||||
|
||||
Multiple Supertrend instances with different period/multiplier combos act as a tiered
|
||||
signal. `any_of` fires on the FIRST flip — the fastest line (7/1.5) reacts quickly,
|
||||
the slowest (20/3.0) confirms strong trends. `reverse: true` makes it always-in-market:
|
||||
the opposite signal is the stop-loss. No explicit stop or time exit needed.
|
||||
|
||||
Varying parameters to tune:
|
||||
- Tighter multipliers (1.0–2.0) → more signals, more whipsaws
|
||||
- Looser multipliers (2.5–4.0) → fewer signals, longer holds
|
||||
- Try `all_of` instead of `any_of` to require consensus across all three (stronger filter)
|
||||
|
||||
```json
|
||||
{{
|
||||
"type": "rule_based",
|
||||
"candle_interval": "4h",
|
||||
"rules": [
|
||||
{{
|
||||
"comment": "LONG (or flip short→long): any Supertrend flips bullish",
|
||||
"when": {{
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{{"kind": "any_of", "conditions": [
|
||||
{{"kind": "position", "state": "flat"}},
|
||||
{{"kind": "position", "state": "short"}}
|
||||
]}},
|
||||
{{
|
||||
"kind": "any_of",
|
||||
"conditions": [
|
||||
{{"kind": "cross_over", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 7, "multiplier": "1.5"}}}},
|
||||
{{"kind": "cross_over", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 10, "multiplier": "2.0"}}}},
|
||||
{{"kind": "cross_over", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 20, "multiplier": "3.0"}}}}
|
||||
]
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "buy", "quantity": {{"method": "percent_of_balance", "percent": "5", "asset": "usdc"}}, "reverse": true}}
|
||||
}},
|
||||
{{
|
||||
"comment": "SHORT (or flip long→short): any Supertrend flips bearish",
|
||||
"when": {{
|
||||
"kind": "all_of",
|
||||
"conditions": [
|
||||
{{"kind": "any_of", "conditions": [
|
||||
{{"kind": "position", "state": "flat"}},
|
||||
{{"kind": "position", "state": "long"}}
|
||||
]}},
|
||||
{{
|
||||
"kind": "any_of",
|
||||
"conditions": [
|
||||
{{"kind": "cross_under", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 7, "multiplier": "1.5"}}}},
|
||||
{{"kind": "cross_under", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 10, "multiplier": "2.0"}}}},
|
||||
{{"kind": "cross_under", "left": {{"kind": "field", "field": "close"}}, "right": {{"kind": "func", "name": "supertrend", "period": 20, "multiplier": "3.0"}}}}
|
||||
]
|
||||
}}
|
||||
]
|
||||
}},
|
||||
"then": {{"side": "sell", "quantity": {{"method": "percent_of_balance", "percent": "5", "asset": "usdc"}}, "reverse": true}}
|
||||
}}
|
||||
]
|
||||
}}
|
||||
```
|
||||
|
||||
Key Supertrend-specific notes:
|
||||
- `supertrend` ignores `field` — it uses OHLC internally; omit the `field` param
|
||||
- `multiplier` controls band width: lower = tighter, more reactive; higher = wider, more stable
|
||||
- `any_of` → first flip triggers (responsive); `all_of` → all three must agree (conservative)
|
||||
- Gate on position state to prevent re-entries scaling into an existing position"##;
|
||||
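The short-side exit levels described in the Example 5 notes mirror the long-side ones with the signs flipped. A minimal sketch of that arithmetic follows; the function names are hypothetical and the exit check covers only the stop-loss and time-exit branches, not the EMA crossover branch.

```rust
/// Price above which a short position is stopped out: entry * (1 + stop_pct).
fn short_stop_price(entry_price: f64, stop_pct: f64) -> f64 {
    entry_price * (1.0 + stop_pct)
}

/// Price below which a short position takes profit: entry * (1 - target_pct).
fn short_target_price(entry_price: f64, target_pct: f64) -> f64 {
    entry_price * (1.0 - target_pct)
}

/// Stop-loss OR time exit for an open short, following the `any_of` exit
/// in Example 5 (the crossover branch is omitted in this sketch).
fn should_cover_short(
    close: f64,
    entry_price: f64,
    stop_pct: f64,
    bars_since_entry: u32,
    max_bars: u32,
) -> bool {
    close > short_stop_price(entry_price, stop_pct) || bars_since_entry >= max_bars
}
```

For an entry at 100 with a 2% stop and a 3% target, the stop sits near 102 and the target near 97.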
|
||||
/// Build the user message for the first iteration (no prior results).
|
||||
pub fn initial_prompt(instruments: &[String], candle_intervals: &[String]) -> String {
|
||||
/// `prior_summary` contains a formatted summary of results from previous runs, if any.
|
||||
pub fn initial_prompt(instruments: &[String], candle_intervals: &[String], prior_summary: Option<&str>, has_futures: bool) -> String {
|
||||
let prior_section = match prior_summary {
|
||||
Some(s) => format!("{s}\n\n"),
|
||||
None => String::new(),
|
||||
};
|
||||
let starting_instruction = if prior_summary.is_some() {
|
||||
"Based on the prior results above:\n\
|
||||
- A strategy is \"promising\" if avg_sharpe >= 0.5 AND it traded >= 10 times per instrument. \
|
||||
If the best prior strategy meets both thresholds, refine it (tighten entry conditions, \
|
||||
adjust the exit, or tune the interval).\n\
|
||||
- If no prior strategy reaches avg_sharpe >= 0.5, do NOT repeat the same indicator family. \
|
||||
Scan the best-strategies list: if they all use the same core indicator (e.g. all use \
|
||||
Bollinger Bands, or all use EMA crossovers, or all use RSI threshold), your FIRST strategy \
|
||||
MUST use a completely different indicator family — for example: MACD crossover, ATR \
|
||||
breakout, volume spike, donchian channel breakout, or stochastic oscillator. Only after \
|
||||
that novelty attempt may you refine prior work.\n\
|
||||
- Never repeat an approach that produced 0 trades or fewer than 5 trades per instrument."
|
||||
} else {
|
||||
"Start with a multi-timeframe trend-following approach with proper risk management \
|
||||
(stop-loss, time exit, and ATR-based position sizing)."
|
||||
};
|
||||
let market_type = if has_futures { "futures" } else { "spot" };
|
||||
format!(
|
||||
r#"Design a trading strategy for crypto spot markets.
|
||||
r#"{prior_section}Design a trading strategy for crypto {market_type} markets.
|
||||
|
||||
Available instruments: {}
|
||||
Available candle intervals: {}
|
||||
|
||||
Start with a multi-timeframe trend-following approach with proper risk management
|
||||
(stop-loss, time exit, and ATR-based position sizing). Use "usdc" as the quote asset.
|
||||
{starting_instruction} Use "usdc" as the quote asset.
|
||||
|
||||
Respond with ONLY the strategy JSON."#,
|
||||
instruments.join(", "),
|
||||
|
||||
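Example 4 in the prompt composes MACD as `EMA(12) - EMA(26)` with a signal line that is `EMA(9)` of that difference. The same composition can be sketched numerically in Rust. This is an illustration under stated assumptions, not swym's implementation: the EMA is seeded with the first input value (one common convention), and `crossed_over` is a hypothetical reading of what a `cross_over` condition checks on the latest bar.

```rust
/// Exponential moving average with smoothing factor 2 / (period + 1),
/// seeded with the first input value (an assumed convention).
fn ema(values: &[f64], period: usize) -> Vec<f64> {
    let alpha = 2.0 / (period as f64 + 1.0);
    let mut out = Vec::with_capacity(values.len());
    let mut prev: Option<f64> = None;
    for &v in values {
        let next = match prev {
            Some(p) => alpha * v + (1.0 - alpha) * p,
            None => v,
        };
        out.push(next);
        prev = Some(next);
    }
    out
}

/// MACD line = EMA(fast) - EMA(slow); signal line = EMA(signal) of the MACD line.
/// This corresponds to `bin_op(sub, ...)` wrapped by `apply_func` in the DSL example.
fn macd(closes: &[f64], fast: usize, slow: usize, signal: usize) -> (Vec<f64>, Vec<f64>) {
    let fast_ema = ema(closes, fast);
    let slow_ema = ema(closes, slow);
    let macd_line: Vec<f64> = fast_ema.iter().zip(&slow_ema).map(|(f, s)| f - s).collect();
    let signal_line = ema(&macd_line, signal);
    (macd_line, signal_line)
}

/// True when `left` was at or below `right` on the previous bar and is above it
/// on the latest bar (hypothetical interpretation of `cross_over`).
fn crossed_over(left: &[f64], right: &[f64]) -> bool {
    let n = left.len().min(right.len());
    n >= 2 && left[n - 2] <= right[n - 2] && left[n - 1] > right[n - 1]
}
```

Calling `macd(&closes, 12, 26, 9)` and passing the two returned series to `crossed_over` reproduces the buy condition of Example 4 in spirit; warm-up handling and exact seeding may differ in the real engine.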
203
src/swym.rs
@@ -4,6 +4,21 @@ use serde::{Deserialize, Serialize};
|
||||
use serde_json::Value;
|
||||
use uuid::Uuid;
|
||||
|
||||
/// Response from `POST /api/v1/strategies/validate`.
|
||||
#[derive(Debug, Deserialize)]
|
||||
pub struct ValidationResponse {
|
||||
pub valid: bool,
|
||||
#[serde(default)]
|
||||
pub errors: Vec<ValidationError>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Deserialize, Clone)]
|
||||
pub struct ValidationError {
|
||||
/// Dotted JSON path to the offending field. Absent for top-level structural errors.
|
||||
pub path: Option<String>,
|
||||
pub message: String,
|
||||
}
|
||||
|
||||
/// Client for the swym backtesting API.
|
||||
pub struct SwymClient {
|
||||
client: Client,
|
||||
@@ -30,6 +45,39 @@ pub struct CandleCoverage {
|
||||
pub first_open: String,
|
||||
pub last_close: String,
|
||||
pub count: u64,
|
||||
pub expected_count: Option<u64>,
|
||||
pub coverage_pct: Option<f64>,
|
||||
}
|
||||
|
||||
/// Response from `GET /api/v1/paper-runs/compare?ids=...`.
|
||||
#[derive(Debug, Deserialize)]
|
||||
pub struct RunMetricsSummary {
|
||||
pub id: Uuid,
|
||||
pub status: String,
|
||||
pub candle_interval: Option<String>,
|
||||
pub total_positions: Option<u32>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub win_rate: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub profit_factor: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub net_pnl: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub sharpe_ratio: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub sortino_ratio: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub calmar_ratio: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub max_drawdown: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub pnl_return: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub avg_win: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub avg_loss: Option<f64>,
|
||||
#[serde(default, deserialize_with = "deserialize_opt_number")]
|
||||
pub avg_hold_duration_secs: Option<f64>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
@@ -45,6 +93,15 @@ pub struct BacktestResult {
|
||||
pub total_pnl: Option<f64>,
|
||||
pub net_pnl: Option<f64>,
|
||||
pub sharpe_ratio: Option<f64>,
|
||||
pub sortino_ratio: Option<f64>,
|
||||
pub calmar_ratio: Option<f64>,
|
||||
pub max_drawdown: Option<f64>,
|
||||
pub pnl_return: Option<f64>,
|
||||
pub avg_win: Option<f64>,
|
||||
pub avg_loss: Option<f64>,
|
||||
pub max_win: Option<f64>,
|
||||
pub max_loss: Option<f64>,
|
||||
pub avg_hold_duration_secs: Option<f64>,
|
||||
pub total_fees: Option<f64>,
|
||||
pub avg_bars_in_trade: Option<f64>,
|
||||
pub error_message: Option<String>,
|
||||
@@ -52,16 +109,10 @@ pub struct BacktestResult {
|
||||
}
|
||||
|
||||
impl BacktestResult {
|
||||
/// Parse a backtest response.
|
||||
///
|
||||
/// `exchange`, `base`, `quote` are needed to derive the instrument key used
|
||||
/// in the `result_summary.instruments` map (e.g. `binancespot-eth_usdc`).
|
||||
/// Parse a backtest response using the flat summary fields added in swym patch 8fb410311.
|
||||
pub fn from_response(
|
||||
resp: &PaperRunResponse,
|
||||
instrument: &str,
|
||||
exchange: &str,
|
||||
base: &str,
|
||||
quote: &str,
|
||||
) -> Self {
|
||||
let summary = resp.result_summary.as_ref();
|
||||
if let Some(s) = summary {
|
||||
@@ -70,28 +121,47 @@ impl BacktestResult {
|
||||
tracing::debug!("[{instrument}] result_summary: null");
|
||||
}
|
||||
|
||||
// The API key for per-instrument stats: "binance_spot" + "eth" + "usdc" → "binancespot-eth_usdc"
|
||||
let inst_key = format!("{}-{}_{}", exchange.replace('_', ""), base, quote);
|
||||
|
||||
let total_positions = summary.and_then(|s| {
|
||||
s["backtest_metadata"]["position_count"].as_u64().map(|v| v as u32)
|
||||
});
|
||||
|
||||
let inst_stats = summary.and_then(|s| s["instruments"].get(&inst_key));
|
||||
let total_positions = summary.and_then(|s| s["total_positions"].as_u64().map(|v| v as u32));
|
||||
let winning_positions = summary.and_then(|s| s["winning_positions"].as_u64().map(|v| v as u32));
|
||||
let losing_positions = summary.and_then(|s| s["losing_positions"].as_u64().map(|v| v as u32));
|
||||
let win_rate = summary.and_then(|s| parse_number(&s["win_rate"]));
|
||||
let profit_factor = summary.and_then(|s| parse_number(&s["profit_factor"]));
|
||||
let net_pnl = summary.and_then(|s| parse_number(&s["net_pnl"]));
|
||||
let total_pnl = summary.and_then(|s| parse_number(&s["total_pnl"]));
|
||||
let sharpe_ratio = summary.and_then(|s| parse_number(&s["sharpe_ratio"]));
|
||||
let sortino_ratio = summary.and_then(|s| parse_number(&s["sortino_ratio"]));
|
||||
let calmar_ratio = summary.and_then(|s| parse_number(&s["calmar_ratio"]));
|
||||
let max_drawdown = summary.and_then(|s| parse_number(&s["max_drawdown"]));
|
||||
let pnl_return = summary.and_then(|s| parse_number(&s["pnl_return"]));
|
||||
let avg_win = summary.and_then(|s| parse_number(&s["avg_win"]));
|
||||
let avg_loss = summary.and_then(|s| parse_number(&s["avg_loss"]));
|
||||
let max_win = summary.and_then(|s| parse_number(&s["max_win"]));
|
||||
let max_loss = summary.and_then(|s| parse_number(&s["max_loss"]));
|
||||
let avg_hold_duration_secs = summary.and_then(|s| parse_number(&s["avg_hold_duration_secs"]));
|
||||
let total_fees = summary.and_then(|s| parse_number(&s["total_fees"]));
|
||||
|
||||
Self {
|
||||
run_id: resp.id,
|
||||
instrument: instrument.to_string(),
|
||||
status: resp.status.clone(),
|
||||
total_positions,
|
||||
winning_positions: None,
|
||||
losing_positions: None,
|
||||
win_rate: inst_stats.and_then(|s| parse_ratio_value(&s["win_rate"])),
|
||||
profit_factor: inst_stats.and_then(|s| parse_ratio_value(&s["profit_factor"])),
|
||||
total_pnl: inst_stats.and_then(|s| parse_decimal_str(&s["pnl"])),
|
||||
net_pnl: inst_stats.and_then(|s| parse_decimal_str(&s["pnl"])),
|
||||
sharpe_ratio: inst_stats.and_then(|s| parse_ratio_value(&s["sharpe_ratio"])),
|
||||
total_fees: None,
|
||||
winning_positions,
|
||||
losing_positions,
|
||||
win_rate,
|
||||
profit_factor,
|
||||
total_pnl,
|
||||
net_pnl,
|
||||
sharpe_ratio,
|
||||
sortino_ratio,
|
||||
calmar_ratio,
|
||||
max_drawdown,
|
||||
pnl_return,
|
||||
avg_win,
|
||||
avg_loss,
|
||||
max_win,
|
||||
max_loss,
|
||||
avg_hold_duration_secs,
|
||||
total_fees,
|
||||
avg_bars_in_trade: None,
|
||||
error_message: resp.error_message.clone(),
|
||||
condition_audit_summary: summary.and_then(|s| s.get("condition_audit_summary").cloned()),
|
||||
@@ -116,6 +186,12 @@ impl BacktestResult {
|
||||
self.net_pnl.unwrap_or(0.0),
|
||||
self.sharpe_ratio.unwrap_or(0.0),
|
||||
);
|
||||
if let Some(sortino) = self.sortino_ratio {
|
||||
s.push_str(&format!(" sortino={:.2}", sortino));
|
||||
}
|
||||
if let Some(dd) = self.max_drawdown {
|
||||
s.push_str(&format!(" max_dd={:.1}%", dd * 100.0));
|
||||
}
|
||||
if self.total_positions.unwrap_or(0) == 0 {
|
||||
if let Some(audit) = &self.condition_audit_summary {
|
||||
let audit_str = format_audit_summary(audit);
|
||||
@@ -129,27 +205,32 @@ impl BacktestResult {
|
||||
}
|
||||
|
||||
/// Is this result promising enough to warrant out-of-sample validation?
|
||||
/// Always requires positive net_pnl and enough trades; additionally requires sharpe > `min_sharpe` when sharpe is available.
|
||||
pub fn is_promising(&self, min_sharpe: f64, min_trades: u32) -> bool {
|
||||
self.status == "complete"
|
||||
&& self.sharpe_ratio.unwrap_or(0.0) > min_sharpe
|
||||
&& self.total_positions.unwrap_or(0) >= min_trades
|
||||
&& self.net_pnl.unwrap_or(0.0) > 0.0
|
||||
if self.status != "complete" { return false; }
|
||||
if self.total_positions.unwrap_or(0) < min_trades { return false; }
|
||||
if self.net_pnl.unwrap_or(0.0) <= 0.0 { return false; }
|
||||
match self.sharpe_ratio {
|
||||
Some(sr) => sr > min_sharpe,
|
||||
None => true, // sharpe absent from the summary; positive net_pnl plus enough trades is sufficient signal
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Parse a `{"interval": null, "value": "123.45"}` ratio wrapper.
|
||||
/// Returns `None` for null, missing, or sentinel values (Decimal::MAX ≈ 7.9e28).
|
||||
fn parse_ratio_value(v: &Value) -> Option<f64> {
|
||||
let s = v.get("value")?.as_str()?;
|
||||
let f: f64 = s.parse().ok()?;
|
||||
/// Parse a numeric JSON value — accepts either a plain JSON number or a decimal string.
|
||||
/// Returns `None` for null, missing, or sentinel values (>1e20 in magnitude).
|
||||
fn parse_number(v: &Value) -> Option<f64> {
|
||||
let f = v.as_f64().or_else(|| v.as_str()?.parse().ok())?;
|
||||
if f.abs() > 1e20 { None } else { Some(f) }
|
||||
}
|
||||
|
||||
/// Parse a plain decimal string JSON value.
|
||||
/// Returns `None` for null, missing, or sentinel values.
|
||||
fn parse_decimal_str(v: &Value) -> Option<f64> {
|
||||
let f: f64 = v.as_str()?.parse().ok()?;
|
||||
if f.abs() > 1e20 { None } else { Some(f) }
|
||||
/// Serde deserializer for `Option<f64>` that accepts both JSON numbers and decimal strings.
|
||||
fn deserialize_opt_number<'de, D>(deserializer: D) -> Result<Option<f64>, D::Error>
|
||||
where
|
||||
D: serde::Deserializer<'de>,
|
||||
{
|
||||
let v = Value::deserialize(deserializer)?;
|
||||
Ok(parse_number(&v))
|
||||
}
|
||||
|
||||
/// Render a condition_audit_summary Value into a compact one-line string.
|
||||
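The `parse_number` change above accepts either a JSON number or a decimal string and drops sentinel values. The sketch below shows the same filtering using only the standard library. `RawMetric` is a hypothetical stand-in for `serde_json::Value`, and the extra `is_finite` check is an addition of this sketch, not something the diff does.

```rust
/// Hypothetical stand-in for the JSON shapes a metric field can take.
enum RawMetric {
    Number(f64),
    Text(String),
    Null,
}

/// Magnitude above which a value is treated as a sentinel (e.g. a maxed-out
/// decimal) rather than a real metric. Matches the 1e20 cutoff in the diff.
const SENTINEL_LIMIT: f64 = 1e20;

/// Return the metric as `f64`, or `None` for null, unparseable, non-finite,
/// or sentinel values.
fn parse_metric(raw: &RawMetric) -> Option<f64> {
    let value = match raw {
        RawMetric::Number(n) => *n,
        RawMetric::Text(s) => s.parse::<f64>().ok()?,
        RawMetric::Null => return None,
    };
    if value.is_finite() && value.abs() <= SENTINEL_LIMIT {
        Some(value)
    } else {
        None
    }
}
```

A string such as `"79228162514264337593543950335"` (about 7.9e28) parses successfully but is rejected by the cutoff, so downstream averages are not distorted by it.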
@@ -254,6 +335,32 @@ impl SwymClient {
|
||||
resp.json().await.context("parse candle coverage")
|
||||
}
|
||||
|
||||
/// Validate a strategy against the swym DSL schema.
|
||||
///
|
||||
/// Calls `POST /api/v1/strategies/validate` and returns a structured list
|
||||
/// of all validation errors. Returns `Ok(vec![])` when the strategy is valid.
|
||||
/// Returns `Err` only on network or parse failures, not on DSL errors.
|
||||
pub async fn validate_strategy(&self, strategy: &Value) -> Result<Vec<ValidationError>> {
|
||||
let url = format!("{}/strategies/validate", self.base_url);
|
||||
let resp = self
|
||||
.client
|
||||
.post(&url)
|
||||
.json(strategy)
|
||||
.send()
|
||||
.await
|
||||
.context("validate strategy request")?;
|
||||
|
||||
if !resp.status().is_success() {
|
||||
let status = resp.status();
|
||||
let body = resp.text().await.unwrap_or_default();
|
||||
anyhow::bail!("validate strategy {status}: {body}");
|
||||
}
|
||||
|
||||
let parsed: ValidationResponse =
|
||||
resp.json().await.context("parse validation response")?;
|
||||
Ok(parsed.errors)
|
||||
}
|
||||
|
||||
/// Submit a backtest run.
|
||||
pub async fn submit_backtest(
|
||||
&self,
|
||||
@@ -261,6 +368,7 @@ impl SwymClient {
|
||||
instrument_symbol: &str,
|
||||
base_asset: &str,
|
||||
quote_asset: &str,
|
||||
market_kind: &str,
|
||||
strategy: &Value,
|
||||
starts_at: &str,
|
||||
finishes_at: &str,
|
||||
@@ -278,7 +386,7 @@ impl SwymClient {
|
||||
"name_exchange": instrument_symbol,
|
||||
"underlying": { "base": base_asset, "quote": quote_asset },
|
||||
"quote": "underlying_quote",
|
||||
"kind": "spot"
|
||||
"kind": market_kind
|
||||
},
|
||||
"execution": {
|
||||
"mocked_exchange": instrument_exchange,
|
||||
@@ -352,6 +460,25 @@ impl SwymClient {
|
||||
}
|
||||
}
|
||||
|
||||
/// Fetch metrics for multiple completed runs via the compare endpoint.
|
||||
/// Batches requests in groups of 50 (API maximum).
|
||||
pub async fn compare_runs(&self, run_ids: &[Uuid]) -> Result<Vec<RunMetricsSummary>> {
|
||||
let mut results = Vec::new();
|
||||
for chunk in run_ids.chunks(50) {
|
||||
let ids = chunk.iter().map(|id| id.to_string()).collect::<Vec<_>>().join(",");
|
||||
let url = format!("{}/paper-runs/compare?ids={}", self.base_url, ids);
|
||||
let resp = self.client.get(&url).send().await.context("compare runs request")?;
|
||||
if !resp.status().is_success() {
|
||||
let status = resp.status();
|
||||
let body = resp.text().await.unwrap_or_default();
|
||||
anyhow::bail!("compare runs {status}: {body}");
|
||||
}
|
||||
let mut batch: Vec<RunMetricsSummary> = resp.json().await.context("parse compare response")?;
|
||||
results.append(&mut batch);
|
||||
}
|
||||
Ok(results)
|
||||
}
|
||||
|
||||
/// Fetch condition audit summary for a completed run.
|
||||
pub async fn condition_audit(&self, run_id: Uuid) -> Result<Value> {
|
||||
let url = format!("{}/paper-runs/{}/condition-audit", self.base_url, run_id);
|
||||
|
||||
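The batching in `compare_runs` relies on `slice::chunks` to keep each request within the 50-ID limit. A minimal standard-library sketch of just the URL construction follows; `compare_urls` is a hypothetical helper, IDs are plain strings rather than `Uuid`, and no HTTP request is made.

```rust
/// Maximum number of run IDs per compare request (the limit stated in the diff).
const COMPARE_BATCH_SIZE: usize = 50;

/// Build one compare URL per batch of at most `COMPARE_BATCH_SIZE` IDs.
/// An empty input yields no URLs, so no request would be sent.
fn compare_urls(base_url: &str, run_ids: &[String]) -> Vec<String> {
    run_ids
        .chunks(COMPARE_BATCH_SIZE)
        .map(|chunk| format!("{}/paper-runs/compare?ids={}", base_url, chunk.join(",")))
        .collect()
}
```

For 120 IDs this produces three URLs carrying 50, 50, and 20 IDs. Because `chunks` never yields an empty slice, the final partial batch needs no special handling.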