June 2026 · 5 min read

I Built a Backtest That Returned +3,542%. Then I Proved It Was a Lie.

How survivorship bias inflated my strategy nearly 6×, and what it took to find the honest number.

My backtest said $100,000 would have become $3.6 million — 37% a year for eleven years, on a plain momentum strategy. I should have been thrilled. Instead I got the nagging feeling every honest builder learns to trust: 37% sustained for a decade is Buffett-and-Renaissance territory. I'm one person with a Rails app and a data subscription. That's not genius — that's a bug.

And that number would have sold. A +3,542% curve on a landing page converts. Which is exactly why I didn't trust it — the numbers that flatter you deserve the most scrutiny, because you're the least motivated to question them.

So I spent a week trying to prove my own headline was a lie.

The tell: same risk, nearly 6× the reward

I rebuilt the exact same strategy on a survivorship-free universe: one that included every company actually in the index at the time, including the ones that later went bankrupt or got acquired. Same strategy, same window, same risk: the maximum drawdowns came back nearly identical (~30% each). The only thing I changed was the universe, which removed the survivorship bias.

[Chart: equity curves, claimed (survivor-biased) vs. actual (survivorship-free)]

Identical top-1000 momentum strategy, 2015–2026, $100k start. Same drawdown. The biased version ends at $3.64M; the honest one at $630k. The biased ending balance is ~5.8× the honest one, and that whole gap is the bias.

That's not a discrepancy you reconcile with a footnote. The two runs share the same rules, the same window, and nearly the same drawdown; the only thing that differs is the universe. So the gap can't come from the strategy. When one run earns ~6× more for the same risk, the extra return isn't skill. The extra return is the bias.
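To make the "same risk, different reward" comparison concrete, here is a minimal sketch in Python (for illustration only, not the Rails code behind this post). It computes maximum drawdown and the ending-balance multiple for two curves. Only the start and end values come from the post; the interior points are invented so that both curves show a 30% worst decline.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest balance seen so far
        worst = max(worst, (peak - value) / peak)  # decline from that peak
    return worst


def ending_multiple(biased_curve, honest_curve):
    """How many times larger the biased ending balance is than the honest one."""
    return biased_curve[-1] / honest_curve[-1]


# Balances in thousands of dollars. Start and end match the post;
# the points in between are made up for illustration.
biased_curve = [100, 150, 105, 400, 3640]
honest_curve = [100, 120, 84, 300, 630]

print(f"biased max drawdown: {max_drawdown(biased_curve):.0%}")
print(f"honest max drawdown: {max_drawdown(honest_curve):.0%}")
print(f"ending-balance multiple: {ending_multiple(biased_curve, honest_curve):.1f}x")
```

Two curves with the same worst decline and wildly different endpoints are the pattern worth investigating before believing the bigger number.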

What survivorship bias actually is

My universe was built from today's stock list, backfilled with old prices. So the backtest could only ever pick from companies that survived to the present. Every Lehman, every Washington Mutual, every name that went to zero was silently absent — the strategy couldn't lose on companies that no longer exist, and the pool was quietly stacked with winners I already knew about. When I reconstructed the real point-in-time universe, half of it had been missing — and not a random half. The losers.
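The difference between the two universes can be sketched in a few lines of Python. This assumes each security comes with a listing date and an optional delisting date; the `Listing` record, the function names, and the tickers are all hypothetical, not taken from any real data feed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Listing:
    ticker: str
    listed: date
    delisted: Optional[date]  # None means still trading today


def universe_on(listings, as_of):
    """Point-in-time universe: tickers that were actually tradable on `as_of`."""
    return {
        item.ticker
        for item in listings
        if item.listed <= as_of and (item.delisted is None or as_of < item.delisted)
    }


def survivor_universe(listings):
    """Survivor-biased universe: only the tickers that still trade today."""
    return {item.ticker for item in listings if item.delisted is None}


# Hypothetical tickers and dates, for illustration only.
LISTINGS = [
    Listing("AAA", date(2010, 1, 4), None),
    Listing("BBB", date(2012, 6, 1), date(2018, 3, 15)),    # later went bankrupt
    Listing("CCC", date(2014, 9, 2), date(2021, 11, 30)),   # later acquired
    Listing("DDD", date(2019, 5, 1), None),                 # not yet listed in 2016
]

as_of = date(2016, 1, 4)
point_in_time = universe_on(LISTINGS, as_of)
missing_from_survivors = point_in_time - survivor_universe(LISTINGS)

print(sorted(point_in_time))           # what the strategy could really have bought
print(sorted(missing_from_survivors))  # what a today's-list backtest never sees
```

In this toy data the survivor-biased universe silently drops BBB and CCC, which are exactly the names that later disappeared.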

The bugs the clean data threw at me

Survivorship-free data is its own minefield. Three that nearly got me:

ETFs and a test ticker. "Every stock trading today" includes SPY, 3× leveraged ETFs, and ZVZZT — a Nasdaq test symbol with fake prices. By dollar volume they swamped real companies; my first clean run started trading leveraged ETFs and returned obvious garbage. Fix: a common-stock whitelist. The data doesn't know what a "stock" is — you have to tell it.
The ×1000 ghost. A few small-caps were ranking above Apple. Moog had ~45 million shares its entire history — except one filing that reported 45 billion. The financials had scattered ×1000 unit errors; one bad row turned a $2B company into a $2T one. Fix: per-ticker outlier rejection, robust even when most of a ticker's filings are wrong.
Two companies, one ticker. WM was Washington Mutual — until it collapsed in 2008, when Waste Management took the symbol. My cache had stitched them into a single "stock" that crashed to pennies and then recovered to $100. The crash and the recovery were different companies. Fix: detect the multi-month gap, split the series.
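The three fixes above can be sketched roughly as follows, in Python for illustration. Everything here is an assumption: the security-type labels, the `is_test` flag, the 20× threshold, and the 60-day gap are placeholders rather than the values used for this post. The median-based filter in particular only works while fewer than half of a ticker's filings are wrong, so it does not cover the harder majority-wrong case mentioned above.

```python
from datetime import date, timedelta
from statistics import median

# 1. Common-stock whitelist. The type labels are assumed; real feeds differ.
COMMON_STOCK_TYPES = {"common_stock"}


def is_common_stock(security):
    """Keep only whitelisted security types and drop exchange test symbols."""
    return security["type"] in COMMON_STOCK_TYPES and not security.get("is_test", False)


# 2. Per-ticker outlier rejection against the ticker's own median.
def reject_unit_errors(share_counts, max_ratio=20.0):
    """Drop share counts more than `max_ratio` times above or below the median."""
    mid = median(share_counts)
    return [s for s in share_counts if mid / max_ratio <= s <= mid * max_ratio]


# 3. Split a price series wherever consecutive bars are far apart in time.
def split_on_gap(bars, max_gap=timedelta(days=60)):
    """`bars` is a list of (date, close) sorted by date. Returns a list of segments."""
    segments = []
    for bar in bars:
        if segments and bar[0] - segments[-1][-1][0] <= max_gap:
            segments[-1].append(bar)   # continues the current company's series
        else:
            segments.append([bar])     # first bar, or a gap: start a new series
    return segments


# Illustrative inputs; the prices and share counts are made up.
securities = [
    {"ticker": "AAPL", "type": "common_stock"},
    {"ticker": "SPY", "type": "etf"},
    {"ticker": "ZVZZT", "type": "common_stock", "is_test": True},
]
tradable = [s["ticker"] for s in securities if is_common_stock(s)]

share_counts = [45_000_000, 45_200_000, 45_000_000_000, 44_800_000, 45_100_000]
clean_counts = reject_unit_errors(share_counts)

bars = [
    (date(2008, 9, 1), 4.0),
    (date(2008, 9, 2), 3.5),
    (date(2009, 1, 5), 31.0),
    (date(2009, 1, 6), 31.5),
]
segments = split_on_gap(bars)

print(tradable)
print(clean_counts)
print(len(segments))
```

Each check is deliberately dumb and local: it looks at one security or one ticker at a time, so a bad row can only damage its own series.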

The number I didn't want

The honest answer was about 17% a year (~12% at a risk setting I'd actually run), with a ~30% drawdown. That's genuinely good: it beats the S&P 500 roughly two-to-one with half the drawdown, and through the 2008 crash it lost 26% while the market lost 56% peak to trough. But it's not a fairy tale, and it's a long way from +3,542%.
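The annualized figures can be sanity-checked from the ending balances alone. A minimal Python sketch, assuming the window runs from January 2015 to mid-2026 (about 11.4 years); that window length is an assumption, and the exact percentages shift with it.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start balance, end balance, and duration."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


START = 100_000
BIASED_END = 3_640_000
HONEST_END = 630_000
YEARS = 11.4  # assumption: January 2015 to mid-2026

biased_cagr = cagr(START, BIASED_END, YEARS)
honest_cagr = cagr(START, HONEST_END, YEARS)

print(f"biased CAGR: {biased_cagr:.1%}")
print(f"honest CAGR: {honest_cagr:.1%}")
```

Under that assumption the two results land close to the 37% and 17% quoted here. Compounding is why a gap of roughly 20 points a year turns into a nearly 6× gap in the ending balance.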

Watching $3.6M collapse into $630k — because I went looking for the truth — was a real gut-punch. The temptation to just not look that hard is real. It's exactly why most backtests you see online are lies. Not malicious ones. Just unexamined ones.

A real edge looks the same on clean data and dirty data. If scrubbing the bias out changes your headline by nearly 6×, that headline was never an edge. It was a bias with good marketing.

I'd rather sell a 17% I've tried my hardest to break than a 37% that dies under a single question. In a field drowning in curve-fit, survivorship-inflated, cherry-picked backtests, the honest number — the one you can prove — is the only one worth putting my name on. Less money than I started the week believing in. A lot more truth. I'll take that trade.