Chapter 5: Backtesting¶

Validate strategies properly with historical simulation.

What is Backtesting?¶

Backtesting simulates how a strategy would have performed historically:

Text Only

portfolio main:
  weights = rank(momentum).long_short(top=0.2, bottom=0.2)
  backtest from 2015-01-01 to 2024-12-31

Output:

Text Only

Sharpe: 0.85
Return: 8.2% annual
Max Drawdown: -15.3%

Why Backtesting Matters¶

Validates Ideas¶

Before risking real money: - Does the signal work historically? - How does it behave in different regimes? - What's the worst-case scenario?

Identifies Issues¶

Implementation errors
Data problems
Unrealistic assumptions

Backtesting Fundamentals¶

The Backtest Loop¶

Text Only

For each trading day:
  1. Observe today's data (NOT future)
  2. Compute signals
  3. Generate target weights
  4. Execute trades (with realistic costs)
  5. Mark-to-market
  6. Record performance

Key Principles¶

No look-ahead bias: Only use data available at decision time
Realistic execution: Include transaction costs
Survivorship-free data: Include delisted stocks
Point-in-time data: Use data as it was known then

Basic Backtesting¶

Simple Backtest¶

Text Only

signal momentum:
  emit zscore(ret(prices, 60))

portfolio simple:
  weights = rank(momentum).long_short(top=0.2, bottom=0.2)
  backtest from 2020-01-01 to 2024-12-31

With Rebalancing¶

Text Only

portfolio monthly:
  weights = rank(momentum).long_short(top=0.2, bottom=0.2)
  backtest rebal=21 from 2020-01-01 to 2024-12-31  // Rebalance monthly

With Transaction Costs¶

Text Only

portfolio realistic:
  weights = rank(momentum).long_short(top=0.2, bottom=0.2)
  costs = tc.bps(10)  // 10 basis points round-trip
  backtest from 2020-01-01 to 2024-12-31

Performance Metrics¶

Return Metrics¶

Metric	Description	Good Value
Total Return	Cumulative return	>50%
Annual Return	Annualized return	>5%
Excess Return	Return vs benchmark	>0%

Risk Metrics¶

Metric	Description	Good Value
Volatility	Annualized std dev	<15%
Max Drawdown	Largest peak-to-trough	<20%
VaR (95%)	Value at Risk	<2% daily

Risk-Adjusted Metrics¶

Metric	Formula	Good Value
Sharpe Ratio	Return / Vol	>0.5
Sortino Ratio	Return / Downside Vol	>1.0
Calmar Ratio	Return / Max DD	>0.5
Information Ratio	Active Return / TE	>0.5

Interpreting Results¶

Text Only

Backtest Results
================
Total Return: 85%
Annual Return: 8.2%
Annual Volatility: 9.6%
Sharpe Ratio: 0.85

Max Drawdown: -15.3%
Avg Drawdown: -4.2%
Drawdown Duration: 45 days (avg)

Turnover: 320% annual
Win Rate: 54%

Common Pitfalls¶

1. Look-Ahead Bias¶

Using information not available at decision time:

Text Only

// WRONG: Using future data
signal bad:
  future_return = ret(prices, -60)  // Future returns!
  emit future_return

// CORRECT: Only past data
signal good:
  past_return = ret(prices, 60)
  emit zscore(past_return)

2. Survivorship Bias¶

Only testing on stocks that still exist:

Text Only

Full Universe (2015):      1000 stocks
Survivors (to 2024):        800 stocks
Survivorship bias: Testing only on 800 winners

Solution: Use survivorship-free data that includes delisted stocks.

3. Overfitting¶

Finding patterns that only exist in sample:

Text Only

// High risk of overfitting
params:
  p1: range(1, 100, 1)
  p2: range(1, 100, 1)
  p3: range(1, 100, 1)
  // 1,000,000 combinations tested!

Signs of overfitting: - Perfect in-sample, terrible out-of-sample - Complex, unintuitive parameters - Very different optimal params across periods

4. Ignoring Transaction Costs¶

Text Only

// Unrealistic: No costs
portfolio no_costs:
  weights = ...
  backtest rebal=1  // Daily rebalancing, no costs
  // Result: 20% annual return! (unrealistic)

// Realistic: Include costs
portfolio with_costs:
  weights = ...
  costs = tc.bps(10)
  backtest rebal=1
  // Result: 2% annual return (costs destroy alpha)

5. Data Mining¶

Testing many hypotheses until one works:

Text Only

Test 100 random signals
By chance, ~5 will be "significant" (p < 0.05)
These are false positives!

Solution: Multiple testing correction, out-of-sample validation.

Walk-Forward Analysis¶

The Right Way to Test¶

Text Only

params:
  lookback: range(20, 100, 20)

portfolio validated:
  weights = rank(momentum).long_short(top=0.2, bottom=0.2)

  backtest walk_forward(
    train_years = 5,   // Optimize on 5 years
    test_years = 1     // Test on 1 year (unseen)
  ) from 2010-01-01 to 2024-12-31

How It Works¶

Text Only

Period 1:
  Train: 2010-2014 → Find optimal params
  Test:  2015      → Evaluate (out-of-sample)

Period 2:
  Train: 2011-2015 → Re-optimize
  Test:  2016      → Evaluate

...

Combined test results = True expected performance

Interpreting Walk-Forward Results¶

Text Only

Walk-Forward Results
====================
In-Sample Sharpe:  1.2
Out-of-Sample Sharpe: 0.65

Degradation: 46%

<30% degradation: Robust strategy
30-50% degradation: Some overfitting
>50% degradation: Significant overfitting

Robustness Checks¶

1. Sensitivity Analysis¶

Text Only

params:
  lookback: [40, 50, 60, 70, 80]

// Run all parameter values
// Check: Are results similar across all values?

2. Subsample Testing¶

Text Only

// Test on different time periods
backtest from 2010-01-01 to 2014-12-31  // Period 1
backtest from 2015-01-01 to 2019-12-31  // Period 2
backtest from 2020-01-01 to 2024-12-31  // Period 3

// Results should be consistent

3. Universe Variations¶

Test on different asset subsets: - Large cap vs small cap - Different sectors - Different geographies

4. Transaction Cost Sensitivity¶

Text Only

// Test different cost assumptions
costs = tc.bps(5)   // Optimistic
costs = tc.bps(10)  // Realistic
costs = tc.bps(20)  // Conservative

Complete Backtesting Example¶

Text Only

data:
  source = "prices_fundamentals.parquet"
  format = parquet

// Parameters for optimization
params:
  mom_lookback: [40, 60, 80]
  value_weight: [0.3, 0.4, 0.5]
  top_pct: [0.15, 0.20, 0.25]

// Signals
signal momentum:
  emit neutralize(zscore(ret(prices, mom_lookback)), by=sectors)

signal value:
  emit neutralize(zscore(book_to_market), by=sectors)

signal combined:
  mom_wt = 1 - value_weight
  emit mom_wt * momentum + value_weight * value

// Portfolio with constraints
portfolio main:
  weights = rank(combined).long_short(
    top = top_pct,
    bottom = top_pct,
    cap = 0.03
  )

  constraints:
    gross_exposure = 2.0
    net_exposure = 0.0
    max_sector = 0.20

  costs = tc.bps(10)

  // Walk-forward validation
  backtest walk_forward(
    train_years = 5,
    test_years = 1,
    metric = "sharpe"
  ) rebal=21 from 2010-01-01 to 2024-12-31

Run:

Bash

sigc run strategy.sig --walk-forward --verbose

Best Practices Checklist¶

Exercises¶

Backtest a simple momentum strategy with different lookback periods
Add transaction costs and measure the impact
Run walk-forward validation and measure degradation
Test strategy on different time periods

Next Chapter¶

Continue to Chapter 6: Risk Management to learn about controlling portfolio risk.