How it works.
Most retail traders test strategies with a spreadsheet or a one-off Python script. That process lets lookahead bias leak in, invites overfitting, and ignores frictions such as commissions and slippage. The framework is an opinionated harness that forces an honest answer to a single question.
Strategy in, scorecard out.
- 01
Strategy interface
Every strategy subclasses the same Strategy base class. Signal mode returns BUY / SELL / HOLD per bar; target-position mode specifies exact sizes (including shorts and partials). The framework takes care of sizing, execution, and reporting from there.
- 02
Data layer
Equities via Yahoo, Alpaca, Polygon, Tiingo, and more. Crypto via CCXT (110+ exchanges, public OHLCV without API keys). Options via synthetic Black-Scholes or the free philippdubach chains dataset back to 2008.
- 03
Dual execution engines
Every run executes twice. The bar engine replays daily OHLCV with market orders at next open. The event engine replays the same bars as order-book events, filling stop / limit / OCO intrabar. If the PnL diverges, the strategy is execution-sensitive.
- 04
Monte Carlo stressors
Six generators run automatically: geometric Brownian motion (sanity baseline), block bootstrap, Markov regime switching, noise injection, Heston stochastic volatility, and a trained cGAN conditioned on four real SPY regimes.
- 05
Standardized scorecard
A four-page PNG: equity curve and trade stats; Monte Carlo distributions; bar-vs-event comparison; letter grades across performance, risk, trade quality, and robustness. Same output format for every strategy, every member, every semester.
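To make step 01 concrete, here is a minimal sketch of what the two strategy modes could look like. Everything in it is an assumption made for illustration: the method names `on_bar` and `target_position`, the `Bar` and `Signal` types, and the convention that a strategy only ever sees bars up to the current one. The framework's real base class may differ.

```python
from abc import ABC
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Signal(Enum):
    BUY = "BUY"
    SELL = "SELL"
    HOLD = "HOLD"


@dataclass(frozen=True)
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float


class Strategy(ABC):
    """Hypothetical base class: a subclass overrides one of the two hooks."""

    def on_bar(self, history: list[Bar]) -> Signal:
        """Signal mode: return BUY, SELL, or HOLD for the latest bar."""
        return Signal.HOLD

    def target_position(self, history: list[Bar]) -> Optional[float]:
        """Target-position mode: return a signed size, or None to use signal mode."""
        return None


class SmaCross(Strategy):
    """Signal mode: BUY when the close is above its simple moving average."""

    def __init__(self, window: int = 3) -> None:
        self.window = window

    def on_bar(self, history: list[Bar]) -> Signal:
        if len(history) < self.window:
            return Signal.HOLD  # not enough data yet
        closes = [bar.close for bar in history[-self.window:]]
        sma = sum(closes) / self.window
        last = history[-1].close
        if last > sma:
            return Signal.BUY
        if last < sma:
            return Signal.SELL
        return Signal.HOLD


class HalfShort(Strategy):
    """Target-position mode: always hold a short of half a unit (a partial short)."""

    def target_position(self, history: list[Bar]) -> Optional[float]:
        return -0.5


# Usage: history contains only bars up to "now", so the strategy cannot look ahead.
bars = [Bar(c, c, c, c, 0.0) for c in (10.0, 11.0, 12.0)]
signal = SmaCross(window=3).on_bar(bars)
target = HalfShort().target_position(bars)
```

Passing the strategy a history that ends at the current bar is one simple way to keep lookahead bias out by construction: the strategy has no handle on future data to misuse.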
Bar vs. event, side by side.
This is the part most non-quant audiences don't see. Two engines exist because stop orders can fire intrabar and bar engines can't model that. Running both and measuring the gap is the cheapest form of robustness check.
Bar-based
Replays daily OHLCV. Every order becomes a market order filled at the next day's open. Fast, simple, and sufficient for swing and position strategies, but it fundamentally cannot model a stop-loss triggering intraday.
Event-driven
Replays the same bars as order-book events. Stop orders fire intrabar at the trigger price. Bracket exits (OCO stop-loss plus take-profit) are real. Captures what the bar engine cannot.
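The gap between the two engines is easiest to see on a single stop-loss. The sketch below is an illustration, not the framework's code: the function names are hypothetical, the bar engine is assumed to notice the stop only on a close and exit at the next open, and the event engine is assumed to fill at the trigger price, or at the open if the bar gaps through the stop.

```python
from typing import NamedTuple, Optional, Sequence


class Bar(NamedTuple):
    open: float
    high: float
    low: float
    close: float


def bar_engine_exit(bars: Sequence[Bar], stop: float) -> Optional[float]:
    """Long position: the stop is seen on a close, the exit fills at the next open."""
    for i, bar in enumerate(bars[:-1]):
        if bar.close <= stop:
            return bars[i + 1].open
    return None  # stop never triggered (or no next bar to fill on)


def event_engine_exit(bars: Sequence[Bar], stop: float) -> Optional[float]:
    """Long position: the stop fires intrabar as soon as the low touches it."""
    for bar in bars:
        if bar.low <= stop:
            # Assumption: a bar that opens below the stop fills at the open.
            return min(bar.open, stop)
    return None


# Long from 100 with a stop at 95, followed by a slide.
entry, stop = 100.0, 95.0
bars = [
    Bar(100.0, 101.0, 99.0, 100.0),
    Bar(100.0, 100.0, 94.0, 98.0),  # low pierces the stop, close recovers
    Bar(97.0, 98.0, 90.0, 91.0),    # close finally below the stop
    Bar(89.0, 90.0, 88.0, 89.0),
]
bar_pnl = bar_engine_exit(bars, stop) - entry      # exits at 89
event_pnl = event_engine_exit(bars, stop) - entry  # exits at 95
gap = event_pnl - bar_pnl
```

In this made-up price series the event engine exits two bars earlier and six points higher than the bar engine. A strategy whose results depend on stops behaving as intended will show exactly this kind of divergence.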
“If a strategy performs well on one data source but poorly on another, that's a red flag — the strategy might be fitting to quirks in the data rather than real market dynamics.”
If the scorecard says it works, it works.
A strategy that earns a pass on this scorecard has cleared a deliberately high bar: positive returns against a benchmark, no apparent edge (a Sharpe near zero) on pure GBM noise, survivable drawdowns across GAN-generated crash regimes, and agreement between two independent execution engines. That combination is hard to fake.
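The GBM sanity check can be sketched in a few lines. Driftless geometric Brownian motion has no exploitable structure, so a strategy that trades without lookahead should average a Sharpe close to zero on it; a clearly positive number points to a bug or a bias. The toy momentum rule, the parameter values, and all function names below are assumptions chosen for illustration, not the framework's generators.

```python
import math
import random
import statistics


def gbm_path(n_steps, s0=100.0, mu=0.0, sigma=0.2, dt=1 / 252, rng=None):
    """Simulate one GBM price path with n_steps + 1 points."""
    rng = rng or random.Random()
    prices = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        log_ret = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        prices.append(prices[-1] * math.exp(log_ret))
    return prices


def annualized_sharpe(returns, periods_per_year=252):
    """Mean over standard deviation of per-period returns, annualized."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def momentum_returns(prices, lookback=20):
    """Toy rule: long for the next bar if price is above its level lookback bars ago."""
    out = []
    for t in range(lookback, len(prices) - 1):
        position = 1.0 if prices[t] > prices[t - lookback] else 0.0
        # The position is decided at t and earns the return from t to t + 1.
        out.append(position * (prices[t + 1] / prices[t] - 1.0))
    return out


def mean_sharpe_on_noise(n_paths=100, n_steps=504, seed=0):
    """Average Sharpe of the toy rule across many driftless GBM paths."""
    rng = random.Random(seed)
    sharpes = [
        annualized_sharpe(momentum_returns(gbm_path(n_steps, rng=rng)))
        for _ in range(n_paths)
    ]
    return statistics.mean(sharpes)


noise_sharpe = mean_sharpe_on_noise()
```

Averaging across many paths matters: a single two-year path can easily show a Sharpe of plus or minus one by chance, which is why the check looks at the distribution and not at one run.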
A pass is strong evidence, not a guarantee. The scorecard tells you whether a strategy deserves further scrutiny, not that it will be profitable live.
See Limitations for an honest audit of what this framework is and isn't.