Backtesting - Hyperoru

Backtesting lets you evaluate how a strategy would have performed on historical data before deploying it live. Hyperoru supports backtesting for both program strategies (deterministic replay) and prompt strategies (re-run AI decisions).

Program backtesting

Program strategies are replayed against historical kline data. Inside the sandbox, the program sees the same MarketData API it uses in live trading, backed by historical snapshots instead of live data.

Running a program backtest

curl -X POST https://api.production.hyperoru.com/api/backtests \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "program_id": "program-uuid",
    "config": {
      "symbols": ["BTC", "ETH"],
      "start_date": "2025-01-01T00:00:00Z",
      "end_date": "2025-06-01T00:00:00Z",
      "interval": "1h",
      "initial_balance": 10000.0,
      "slippage_pct": 0.05,
      "maker_fee_pct": 0.02,
      "taker_fee_pct": 0.05
    }
  }'

Prompt backtesting

Prompt backtesting re-runs AI decisions on historical market snapshots. This helps you evaluate whether prompt changes improve decision quality without waiting for live results.

curl -X POST https://api.production.hyperoru.com/api/prompt-backtests \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt_id": "prompt-uuid",
    "trader_id": "trader-uuid",
    "config": {
      "symbols": ["BTC"],
      "start_date": "2025-03-01T00:00:00Z",
      "end_date": "2025-04-01T00:00:00Z",
      "interval": "4h",
      "initial_balance": 10000.0
    }
  }'

Prompt backtests call the LLM for each historical snapshot, so they consume API credits and take longer than program backtests. Use shorter time ranges and larger intervals for initial iteration.
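To get a rough sense of how range and interval drive cost, the number of snapshots can be estimated before submitting a run. The sketch below is an illustration, not part of the Hyperoru API: `estimate_snapshots` is a hypothetical helper, and it assumes one snapshot per whole interval in the range. Whether the LLM is called once per snapshot or once per snapshot per symbol is not stated here, so treat the result as a lower bound.

```python
from datetime import datetime, timedelta, timezone

# Interval strings from the configuration options, mapped to durations.
INTERVALS = {
    "1m": timedelta(minutes=1),
    "5m": timedelta(minutes=5),
    "15m": timedelta(minutes=15),
    "1h": timedelta(hours=1),
    "4h": timedelta(hours=4),
    "1d": timedelta(days=1),
}


def estimate_snapshots(start: datetime, end: datetime, interval: str) -> int:
    """Count the whole intervals between start and end (hypothetical helper)."""
    if end <= start:
        raise ValueError("end must be after start")
    return (end - start) // INTERVALS[interval]


# Same range as the prompt backtest request above.
start = datetime(2025, 3, 1, tzinfo=timezone.utc)
end = datetime(2025, 4, 1, tzinfo=timezone.utc)
for interval in ("1h", "4h", "1d"):
    print(interval, estimate_snapshots(start, end, interval))
```

For this one-month range, moving from `1h` to `4h` cuts the estimate from 744 snapshots to 186.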

Configuration options

Parameter	Type	Default	Description
`symbols`	`string[]`	—	Symbols to include in the backtest
`start_date`	`datetime`	—	Start of the historical period
`end_date`	`datetime`	—	End of the historical period
`interval`	`string`	`1h`	Candlestick interval (`1m`, `5m`, `15m`, `1h`, `4h`, `1d`)
`initial_balance`	`float`	`10000.0`	Starting virtual balance in USD
`slippage_pct`	`float`	`0.05`	Simulated slippage as a percentage of order value
`maker_fee_pct`	`float`	`0.02`	Simulated maker fee percentage
`taker_fee_pct`	`float`	`0.05`	Simulated taker fee percentage
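A small client-side helper can apply these defaults and catch obvious mistakes before a request is sent. This is a minimal sketch under stated assumptions: `build_config` is a hypothetical function, the defaults and interval list are copied from the table above, and the checks are local conveniences that may not match what the server enforces.

```python
from datetime import datetime

VALID_INTERVALS = {"1m", "5m", "15m", "1h", "4h", "1d"}

# Defaults copied from the configuration table.
DEFAULTS = {
    "interval": "1h",
    "initial_balance": 10000.0,
    "slippage_pct": 0.05,
    "maker_fee_pct": 0.02,
    "taker_fee_pct": 0.05,
}


def _parse(ts: str) -> datetime:
    # Accept a trailing "Z" on Python versions where fromisoformat rejects it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def build_config(symbols, start_date, end_date, **overrides):
    """Merge overrides into the defaults and run basic local checks."""
    config = {
        **DEFAULTS,
        **overrides,
        "symbols": list(symbols),
        "start_date": start_date,
        "end_date": end_date,
    }
    if not config["symbols"]:
        raise ValueError("at least one symbol is required")
    if config["interval"] not in VALID_INTERVALS:
        raise ValueError(f"unsupported interval: {config['interval']}")
    if _parse(end_date) <= _parse(start_date):
        raise ValueError("end_date must be after start_date")
    return config


config = build_config(
    ["BTC"], "2025-03-01T00:00:00Z", "2025-04-01T00:00:00Z", interval="4h"
)
print(config)
```

The resulting dictionary can be serialized as the `config` object in either backtest request.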

Results

A completed backtest returns summary metrics, an equity curve, and a trade log.

Metrics

Metric	Description
Total PnL	Net profit/loss over the test period
PnL %	Return as a percentage of initial balance
Sharpe Ratio	Risk-adjusted return (annualized)
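Max Drawdown	Largest peak-to-trough decline in portfolio value, as a percentage
Win Rate	Share of trades that were profitable (returned as a fraction in the example response, e.g. `0.62`)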
Profit Factor	Gross profit divided by gross loss
Total Trades	Number of trades executed
Avg Trade Duration	Mean time a position was held
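Several of these metrics follow directly from the per-trade PnL values. The sketch below uses the standard textbook definitions; Hyperoru's exact formulas are not documented here, so `summarize_trades` is a hypothetical helper for cross-checking results, not the platform's implementation. The Sharpe ratio is left out because its annualization depends on a sampling convention this page does not specify.

```python
def summarize_trades(pnls, initial_balance):
    """Compute summary metrics from a list of per-trade net PnL values."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)  # stored as a positive number
    total_pnl = sum(pnls)
    wins = sum(1 for p in pnls if p > 0)
    return {
        "total_pnl": total_pnl,
        "pnl_pct": 100.0 * total_pnl / initial_balance,
        # Fraction of winning trades, matching the 0..1 form in the example response.
        "win_rate": wins / len(pnls) if pnls else 0.0,
        # Undefined when there are no losing trades; reported here as infinity.
        "profit_factor": gross_profit / gross_loss if gross_loss else float("inf"),
        "total_trades": len(pnls),
    }


# Made-up trade results for illustration only.
trades = [120.0, -50.0, 80.0, -30.0, 200.0]
summary = summarize_trades(trades, initial_balance=10000.0)
print(summary)
```

With these five made-up trades, three are winners, gross profit is 400 and gross loss is 80, so the win rate is 0.6 and the profit factor is 5.0.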

Example response

{
  "id": "backtest-uuid",
  "status": "completed",
  "metrics": {
    "total_pnl": 1250.30,
    "pnl_pct": 12.5,
    "sharpe_ratio": 1.85,
    "max_drawdown_pct": 8.2,
    "win_rate": 0.62,
    "profit_factor": 2.1,
    "total_trades": 47,
    "avg_trade_duration_hours": 6.3
  },
  "equity_curve": [...],
  "trades": [...]
}

Equity curve

The equity curve tracks portfolio value over time. Use it to visualize drawdown periods and growth patterns.
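Maximum drawdown can be recomputed from the equity curve by tracking the running peak. The format of `equity_curve` is not shown on this page, so the sketch below assumes it has been reduced to a plain list of portfolio values in time order; `max_drawdown_pct` is a hypothetical helper.

```python
def max_drawdown_pct(equity):
    """Largest peak-to-trough decline, as a percentage of the peak."""
    if not equity:
        raise ValueError("equity curve is empty")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)  # highest value seen so far
        worst = max(worst, (peak - value) / peak)  # decline from that peak
    return 100.0 * worst


# Made-up portfolio values for illustration only.
curve = [10000.0, 10500.0, 9660.0, 10200.0, 11000.0, 10450.0]
print(round(max_drawdown_pct(curve), 2))
```

In this made-up curve the drop from 10500 to 9660 (8%) is deeper than the later drop from 11000 to 10450 (5%), so the first one determines the result.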

Trade log

Every simulated trade is recorded with:

Entry/exit timestamps and prices
Position size, leverage, and direction
PnL and fees
The decision reasoning (program output or LLM response)

Compare multiple backtest runs side-by-side to evaluate prompt or parameter changes. The analytics API supports filtering backtests by program, prompt, and date range.
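Once two runs have completed, a side-by-side comparison can be as simple as diffing their `metrics` objects. The sketch below works on dictionaries shaped like the example response; `compare_metrics` is a hypothetical helper, the candidate values are made up, and no analytics API call is shown because its request format is not documented on this page.

```python
def compare_metrics(baseline, candidate):
    """Return candidate minus baseline for every metric present in both runs."""
    return {
        name: candidate[name] - baseline[name]
        for name in baseline
        if name in candidate
    }


# Baseline values taken from the example response; candidate values are made up.
baseline = {"pnl_pct": 12.5, "sharpe_ratio": 1.85, "max_drawdown_pct": 8.2, "win_rate": 0.62}
candidate = {"pnl_pct": 14.0, "sharpe_ratio": 1.60, "max_drawdown_pct": 11.5, "win_rate": 0.64}

for name, delta in compare_metrics(baseline, candidate).items():
    print(f"{name}: {delta:+.2f}")
```

Here the candidate earns more but with a lower Sharpe ratio and a deeper drawdown, which is the kind of trade-off a single headline number would hide.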

Best practices

Use realistic fees

Set slippage and fee parameters to match your actual exchange costs. Overly optimistic assumptions inflate results.
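A quick calculation shows how much these settings matter per trade. The sketch below rests on assumptions that this page does not confirm: the `_pct` values are read as percent (so `0.05` means 0.05%), both entry and exit are filled as taker orders, and slippage applies on both legs. `round_trip_cost` is a hypothetical helper.

```python
def round_trip_cost(order_value, slippage_pct=0.05, taker_fee_pct=0.05):
    """Estimated cost in USD of opening and closing one position."""
    per_side_pct = slippage_pct + taker_fee_pct  # cost of a single fill, in percent
    return 2 * order_value * per_side_pct / 100.0


cost = round_trip_cost(1000.0)
print(f"per round trip: ${cost:.2f}")
print(f"over 47 trades: ${47 * cost:.2f}")
```

Under these assumptions a $1,000 position costs about $2 to open and close, so a strategy whose average trade earns less than that is unprofitable after costs even if its raw signals look good.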

Avoid overfitting

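Test on out-of-sample data. Split your historical period into train and test windows, tune the strategy on the train window only, and judge it by how it performs on the test window it has never seen.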
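One simple way to produce the two windows is a chronological split, with the earlier portion used for tuning and the later portion held back. The sketch below is an illustration: `split_period` and the 70/30 ratio are assumptions, not a Hyperoru feature. Each window's bounds would be passed as `start_date` and `end_date` in a separate backtest request.

```python
from datetime import datetime, timezone


def split_period(start, end, train_fraction=0.7):
    """Split [start, end] chronologically into a train and a test window."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    if end <= start:
        raise ValueError("end must be after start")
    cutoff = start + (end - start) * train_fraction
    return (start, cutoff), (cutoff, end)


# Same range as the program backtest request earlier on this page.
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
end = datetime(2025, 6, 1, tzinfo=timezone.utc)
train, test = split_period(start, end)
print("train:", train[0].isoformat(), "to", train[1].isoformat())
print("test: ", test[0].isoformat(), "to", test[1].isoformat())
```

Keeping the test window strictly later than the train window avoids leaking future information into the tuning step.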

Account for regime changes

Markets shift between trending and ranging. A strategy that works in one regime may fail in another.

Start with longer intervals

Begin with 4h or 1d intervals to iterate quickly, then refine with shorter intervals once the logic is sound.
