Backtesting lets you evaluate how a strategy would have performed on historical data before deploying it live. Hyperoru supports backtesting for both program strategies (deterministic replay) and prompt strategies (re-run AI decisions).

Program backtesting

Program strategies are replayed against historical kline data. The sandbox receives the same MarketData API as in live trading, but it is backed by historical snapshots instead of live data.

Running a program backtest

curl -X POST https://api.hyperoru.com/api/backtests \
  -H "Content-Type: application/json" \
  -d '{
    "program_id": "program-uuid",
    "config": {
      "symbols": ["BTC", "ETH"],
      "start_date": "2025-01-01T00:00:00Z",
      "end_date": "2025-06-01T00:00:00Z",
      "interval": "1h",
      "initial_balance": 10000.0,
      "slippage_pct": 0.05,
      "maker_fee_pct": 0.02,
      "taker_fee_pct": 0.05
    }
  }'
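
For scripted runs, the same request can be assembled in Python. This is a minimal sketch using only the standard library: the endpoint, field names, and values are copied from the curl example above, while the helper name `build_backtest_request` is hypothetical. Authentication is not shown in the curl example, so none is added here; the send step is left commented out.

```python
import json
import urllib.request

API_URL = "https://api.hyperoru.com/api/backtests"  # endpoint from the curl example


def build_backtest_request(program_id: str, config: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a program backtest."""
    body = json.dumps({"program_id": program_id, "config": config}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_backtest_request(
    "program-uuid",  # placeholder ID, as in the curl example
    {
        "symbols": ["BTC", "ETH"],
        "start_date": "2025-01-01T00:00:00Z",
        "end_date": "2025-06-01T00:00:00Z",
        "interval": "1h",
        "initial_balance": 10000.0,
        "slippage_pct": 0.05,
        "maker_fee_pct": 0.02,
        "taker_fee_pct": 0.05,
    },
)

# Sending is commented out so the sketch has no network side effects.
# Add whatever authentication headers your deployment requires first.
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```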

Prompt backtesting

Prompt backtesting re-runs AI decisions on historical market snapshots. This helps you evaluate whether prompt changes improve decision quality without waiting for live results.

curl -X POST https://api.hyperoru.com/api/prompt-backtests \
  -H "Content-Type: application/json" \
  -d '{
    "prompt_id": "prompt-uuid",
    "trader_id": "trader-uuid",
    "config": {
      "symbols": ["BTC"],
      "start_date": "2025-03-01T00:00:00Z",
      "end_date": "2025-04-01T00:00:00Z",
      "interval": "4h",
      "initial_balance": 10000.0
    }
  }'

Prompt backtests call the LLM for each historical snapshot, so they consume API credits and take longer than program backtests. Use shorter time ranges and larger intervals for initial iteration.
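
To get a feel for how range and interval drive cost, the number of snapshots can be estimated before submitting a run. The sketch below assumes one LLM call per candle in the range. Whether Hyperoru issues one call per candle or one per candle per symbol is not stated here, so treat the result as a lower bound. The helper name `estimate_snapshots` is hypothetical.

```python
from datetime import datetime, timedelta

# Candle lengths for the intervals listed in the configuration options.
INTERVALS = {
    "1m": timedelta(minutes=1),
    "5m": timedelta(minutes=5),
    "15m": timedelta(minutes=15),
    "1h": timedelta(hours=1),
    "4h": timedelta(hours=4),
    "1d": timedelta(days=1),
}


def estimate_snapshots(start_date: str, end_date: str, interval: str) -> int:
    """Count whole candles between two ISO 8601 timestamps."""
    start = datetime.fromisoformat(start_date.replace("Z", "+00:00"))
    end = datetime.fromisoformat(end_date.replace("Z", "+00:00"))
    return (end - start) // INTERVALS[interval]


# The prompt backtest example above: March 2025 at 4h candles.
print(estimate_snapshots("2025-03-01T00:00:00Z", "2025-04-01T00:00:00Z", "4h"))  # 186
# The same month at 1h candles needs four times as many snapshots.
print(estimate_snapshots("2025-03-01T00:00:00Z", "2025-04-01T00:00:00Z", "1h"))  # 744
```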

Configuration options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| symbols | string[] | | Symbols to include in the backtest |
| start_date | datetime | | Start of the historical period |
| end_date | datetime | | End of the historical period |
| interval | string | 1h | Candlestick interval (1m, 5m, 15m, 1h, 4h, 1d) |
| initial_balance | float | 10000.0 | Starting virtual balance in USD |
| slippage_pct | float | 0.05 | Simulated slippage as a percentage of order value |
| maker_fee_pct | float | 0.02 | Simulated maker fee percentage |
| taker_fee_pct | float | 0.05 | Simulated taker fee percentage |
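
A small client-side model of these options can catch typos before a request is sent. The sketch below mirrors the defaults listed above. The class name `BacktestConfig` and its checks are hypothetical, and it assumes that the three parameters without a listed default must be supplied; the server's own validation rules may differ.

```python
from dataclasses import asdict, dataclass

VALID_INTERVALS = ("1m", "5m", "15m", "1h", "4h", "1d")


@dataclass
class BacktestConfig:
    # No default is listed for these three, so they are treated as required here.
    symbols: list[str]
    start_date: str
    end_date: str
    # Defaults copied from the configuration options.
    interval: str = "1h"
    initial_balance: float = 10000.0
    slippage_pct: float = 0.05
    maker_fee_pct: float = 0.02
    taker_fee_pct: float = 0.05

    def __post_init__(self) -> None:
        if not self.symbols:
            raise ValueError("symbols must contain at least one symbol")
        if self.interval not in VALID_INTERVALS:
            raise ValueError(f"interval must be one of {VALID_INTERVALS}")
        if self.initial_balance <= 0:
            raise ValueError("initial_balance must be positive")


config = BacktestConfig(
    symbols=["BTC"],
    start_date="2025-03-01T00:00:00Z",
    end_date="2025-04-01T00:00:00Z",
    interval="4h",
)
payload_config = asdict(config)  # plain dict, ready to place under "config"
```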

Results

Each backtest result includes summary metrics, an equity curve, and a trade log.

Metrics

| Metric | Description |
| --- | --- |
| Total PnL | Net profit/loss over the test period |
| PnL % | Return as a percentage of initial balance |
| Sharpe Ratio | Risk-adjusted return (annualized) |
| Max Drawdown | Largest peak-to-trough decline |
| Win Rate | Percentage of profitable trades |
| Profit Factor | Gross profit divided by gross loss |
| Total Trades | Number of trades executed |
| Avg Trade Duration | Mean time a position was held |
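
Several of these metrics follow directly from the per-trade PnL values. The sketch below applies the definitions in the table to a list of trade results. It is an illustration of the formulas, not Hyperoru's implementation: the function name and the sample trades are hypothetical, and Sharpe ratio and trade duration are left out because they depend on how returns and timestamps are sampled.

```python
def summarize_trades(trade_pnls: list[float], initial_balance: float) -> dict:
    """Compute trade-based summary metrics from net PnL per trade."""
    wins = [pnl for pnl in trade_pnls if pnl > 0]
    losses = [pnl for pnl in trade_pnls if pnl < 0]
    gross_profit = sum(wins)
    gross_loss = -sum(losses)  # expressed as a positive number
    total_pnl = sum(trade_pnls)
    return {
        "total_pnl": total_pnl,
        "pnl_pct": total_pnl / initial_balance * 100,
        # Fraction of trades that closed with a profit.
        "win_rate": len(wins) / len(trade_pnls) if trade_pnls else 0.0,
        # Undefined when there are no losing trades; reported as infinity here.
        "profit_factor": gross_profit / gross_loss if gross_loss else float("inf"),
        "total_trades": len(trade_pnls),
    }


# Hypothetical trade results in USD, invented for illustration.
summary = summarize_trades([120.0, -50.0, 80.0, -30.0, 40.0], initial_balance=10000.0)
print(summary)
```

In this sample, three of five trades are profitable, gross profit is 240 against a gross loss of 80, so the profit factor is 3.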

Example response

{
  "id": "backtest-uuid",
  "status": "completed",
  "metrics": {
    "total_pnl": 1250.30,
    "pnl_pct": 12.5,
    "sharpe_ratio": 1.85,
    "max_drawdown_pct": 8.2,
    "win_rate": 0.62,
    "profit_factor": 2.1,
    "total_trades": 47,
    "avg_trade_duration_hours": 6.3
  },
  "equity_curve": [...],
  "trades": [...]
}
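
A response like this can be turned into a one-line summary for logs or notebooks. The sketch below parses the example above, with the elided `equity_curve` and `trades` arrays replaced by empty lists. Note that `win_rate` arrives as a fraction (0.62) while `pnl_pct` and `max_drawdown_pct` are already in percent. The function name is hypothetical, and status values other than `completed` are not documented here, so the fallback branch is an assumption.

```python
import json

raw = """
{
  "id": "backtest-uuid",
  "status": "completed",
  "metrics": {
    "total_pnl": 1250.30,
    "pnl_pct": 12.5,
    "sharpe_ratio": 1.85,
    "max_drawdown_pct": 8.2,
    "win_rate": 0.62,
    "profit_factor": 2.1,
    "total_trades": 47,
    "avg_trade_duration_hours": 6.3
  },
  "equity_curve": [],
  "trades": []
}
"""


def format_summary(result: dict) -> str:
    """Render the headline metrics of a backtest result as a single line."""
    if result["status"] != "completed":
        # Assumed behavior: unfinished runs carry no usable metrics yet.
        return f"Backtest {result['id']} is {result['status']}"
    m = result["metrics"]
    return (
        f"PnL {m['total_pnl']:+.2f} USD ({m['pnl_pct']:+.1f}%), "
        f"Sharpe {m['sharpe_ratio']:.2f}, "
        f"max drawdown {m['max_drawdown_pct']:.1f}%, "
        f"win rate {m['win_rate']:.0%} over {m['total_trades']} trades"
    )


result = json.loads(raw)
print(format_summary(result))
# PnL +1250.30 USD (+12.5%), Sharpe 1.85, max drawdown 8.2%, win rate 62% over 47 trades
```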

Equity curve

The equity curve tracks portfolio value over time. Use it to visualize drawdown periods and growth patterns.
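
Max drawdown can be recomputed from the equity curve by tracking the running peak. The sketch below assumes the curve has been reduced to a plain list of positive portfolio values; the actual element format of `equity_curve` is not shown in the example response, and the sample values are hypothetical.

```python
def max_drawdown_pct(equity: list[float]) -> float:
    """Largest peak-to-trough decline, as a percentage of the peak."""
    if not equity:
        raise ValueError("equity curve is empty")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)  # highest value seen so far
        worst = max(worst, (peak - value) / peak * 100)
    return worst


# Hypothetical curve: peaks at 10500, dips to 9660, then recovers.
curve = [10000.0, 10500.0, 9660.0, 10800.0, 10260.0, 11250.0]
print(f"{max_drawdown_pct(curve):.1f}%")  # 8.0%
```

The deepest decline here is from 10500 to 9660 (8%), which is larger than the later dip from 10800 to 10260 (5%).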

Trade log

Every simulated trade is recorded with:
  • Entry/exit timestamps and prices
  • Position size, leverage, and direction
  • PnL and fees
  • The decision reasoning (program output or LLM response)

Compare multiple backtest runs side-by-side to evaluate prompt or parameter changes. The analytics API supports filtering backtests by program, prompt, and date range.
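
Once two results have been fetched, a side-by-side comparison is a simple dictionary diff. The sketch below works on the `metrics` objects locally and does not call the analytics API, whose endpoints are not described here. The baseline uses values from the example response; the candidate run is invented purely for illustration.

```python
def compare_metrics(baseline: dict, candidate: dict) -> dict:
    """Return candidate minus baseline for every metric present in both runs."""
    shared = [name for name in baseline if name in candidate]
    return {name: candidate[name] - baseline[name] for name in shared}


baseline = {"total_pnl": 1250.30, "sharpe_ratio": 1.85, "max_drawdown_pct": 8.2, "win_rate": 0.62}
# Hypothetical second run, e.g. after a prompt change.
candidate = {"total_pnl": 1410.00, "sharpe_ratio": 1.70, "max_drawdown_pct": 10.5, "win_rate": 0.60}

deltas = compare_metrics(baseline, candidate)
print(f"{'metric':<18} {'baseline':>10} {'candidate':>10} {'delta':>10}")
for name, delta in deltas.items():
    print(f"{name:<18} {baseline[name]:>10.2f} {candidate[name]:>10.2f} {delta:>+10.2f}")
```

In this made-up comparison the candidate earns more but with a deeper drawdown and a lower Sharpe ratio, so a higher PnL alone does not make it the better run.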

Best practices

Use realistic fees

Set slippage and fee parameters to match your actual exchange costs. Overly optimistic assumptions inflate results.
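
To see why these parameters matter, consider the cost of opening and closing one position. The sketch below assumes that the `_pct` values are expressed in percent (so 0.05 means 0.05%, not 5%) and that slippage and fee are both charged on each fill. Neither assumption is confirmed here, so check it against a small test run before relying on the numbers.

```python
def round_trip_cost(notional: float, slippage_pct: float, fee_pct: float) -> float:
    """Estimated cost in USD of entering and exiting a position of the given size."""
    per_fill = notional * (slippage_pct + fee_pct) / 100  # percent to fraction
    return 2 * per_fill  # one fill to open, one to close


# Default slippage with the default taker fee on a 1000 USD position.
taker_cost = round_trip_cost(1000.0, slippage_pct=0.05, fee_pct=0.05)
# Same position filled at the default maker fee.
maker_cost = round_trip_cost(1000.0, slippage_pct=0.05, fee_pct=0.02)
print(f"taker: {taker_cost:.2f} USD, maker: {maker_cost:.2f} USD")
# taker: 2.00 USD, maker: 1.40 USD
```

Under these assumptions, a strategy whose average trade earns less than about 0.2% of notional loses money on taker fills even if its signals are right.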

Avoid overfitting

Test on out-of-sample data. Split your historical period into train and test windows, tune the strategy on the train window, and judge it only on the test window it has not seen.
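
One way to set this up is to derive two date ranges from a single period and run a separate backtest on each. The sketch below splits on a whole-day boundary; the helper name `split_period` and the 70/30 ratio are hypothetical choices, not Hyperoru defaults.

```python
from datetime import datetime, timedelta

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def split_period(start_date: str, end_date: str, train_fraction: float = 0.7):
    """Split a period into consecutive train and test windows at a whole day."""
    start = datetime.fromisoformat(start_date.replace("Z", "+00:00"))
    end = datetime.fromisoformat(end_date.replace("Z", "+00:00"))
    train_days = int((end - start).days * train_fraction)  # round down to full days
    cutoff = start + timedelta(days=train_days)
    train = {"start_date": start.strftime(ISO_FORMAT), "end_date": cutoff.strftime(ISO_FORMAT)}
    test = {"start_date": cutoff.strftime(ISO_FORMAT), "end_date": end.strftime(ISO_FORMAT)}
    return train, test


# The period from the program backtest example: 151 days in total.
train, test = split_period("2025-01-01T00:00:00Z", "2025-06-01T00:00:00Z")
print(train)  # {'start_date': '2025-01-01T00:00:00Z', 'end_date': '2025-04-16T00:00:00Z'}
print(test)   # {'start_date': '2025-04-16T00:00:00Z', 'end_date': '2025-06-01T00:00:00Z'}
```

Each dictionary can be merged into a backtest `config` so that the two runs differ only in their date range.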

Account for regime changes

Markets shift between trending and ranging. A strategy that works in one regime may fail in another.

Start with longer intervals

Begin with 4h or 1d intervals to iterate quickly, then refine with shorter intervals once the logic is sound.