Look-ahead bias hides in your timestamps

Overfit 5 Jul 2026 5 min read

Backtesting and overfitting

Look-ahead bias is often a timestamp bug rather than an obvious future leak. Event time, release time, timezone handling, and resampling choices all matter.

Look-ahead bias rarely announces itself as future_return.shift(-1). That bug exists, and we have all written some version of it, but the nastier leaks are smaller. A timestamp parsed in the wrong timezone. A daily bar labelled at midnight when the close happened at 16:00 New York. A fundamental field stamped with fiscal period end rather than filing time. A macro series revised in 2024 and treated as though the 2011 vintage said the same thing.

The backtest still runs chronologically. The code looks respectable. The data is already carrying tomorrow inside today's row.

Every timestamp in a research table needs a meaning. Event time is when the thing happened. Release time is when someone published it. Availability time is when your system could have used it. If those collapse into one column called date, your backtest is asking for trouble.
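A minimal sketch of keeping the three meanings apart in a pandas table. The column names (event_time, release_time, available_time), the helper names, and the rows are all illustrative assumptions, not a vendor schema or real data:

```python
import pandas as pd

# Hypothetical rows: names and values are made up for illustration.
features = pd.DataFrame(
    {
        "field": ["quarterly_eps", "macro_series"],
        "event_time": pd.to_datetime(["2024-03-31 00:00", "2024-02-29 00:00"], utc=True),
        "release_time": pd.to_datetime(["2024-05-02 20:05", "2024-03-12 12:30"], utc=True),
        "available_time": pd.to_datetime(["2024-05-02 20:20", "2024-03-12 12:31"], utc=True),
    }
)


def out_of_order_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows that break event_time <= release_time <= available_time."""
    ordered = (df["event_time"] <= df["release_time"]) & (
        df["release_time"] <= df["available_time"]
    )
    return df[~ordered]


def collapsed_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows where all three timestamps are identical, which deserves a second look."""
    same = (df["event_time"] == df["release_time"]) & (
        df["release_time"] == df["available_time"]
    )
    return df[same]


bad_order = out_of_order_rows(features)
suspicious = collapsed_rows(features)
```

Both checks come back empty for these rows. A table built from a single date column would fail the second one on every row.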

Daily bars are not timeless

Take a simple US equity daily bar. The row labelled 2024-03-14 might contain the open, high, low, close, and volume for the session ending at 16:00 Eastern time. If your signal is computed from that row and traded at that same close, the order had to be sent before the close printed, so you may have used the close to decide a trade placed before the close. If your resampling code labels weekly bars on Friday and includes Friday's close, a Friday-at-open strategy using that bar is looking ahead by a session.

This is why pandas details matter. The pandas time series user guide covers timezone localisation, conversion, and resampling in one place, because these are not cosmetic operations. tz_localize declares which timezone naive timestamps belong to. tz_convert changes the representation of an already timezone-aware timestamp without moving the instant it refers to. resample groups observations into bins whose labels and closed sides must match the trading decision.
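A small sketch of the difference between the two timezone calls, applied to date-labelled daily bars. It assumes a regular session with a 16:00 New York close (no half days). The prices and the known_at_utc column name are made up:

```python
import pandas as pd

# Date-labelled daily bars, as they often arrive from a file (values are made up).
bars = pd.DataFrame(
    {"close": [100.0, 101.5]},
    index=pd.to_datetime(["2024-03-08", "2024-03-11"]),
)

# tz_localize: declare that the naive 16:00 wall-clock time is New York time.
# Assumption: every session closes at 16:00 local time.
close_time_local = (bars.index + pd.Timedelta(hours=16)).tz_localize("America/New_York")

# tz_convert: same instants, expressed in UTC.
close_time_utc = close_time_local.tz_convert("UTC")

# Hypothetical column: the earliest moment each close could have been known.
bars["known_at_utc"] = close_time_utc
```

The two closes land at different UTC hours (21:00 and 20:00) because US daylight saving time started between them. A hard-coded offset would have mislabelled one of them by an hour.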

If that sounds pedantic, good. Backtesting is mostly pedantry with a P&L column.

Resampling can move information backwards

The common leak is building lower-frequency features from higher-frequency data and then joining them back to the trade table without checking the label.

A weekly volatility estimate computed from Monday through Friday and labelled on Monday is a leak. A month-end fundamental ratio forward-filled from the fiscal period end, rather than the filing date, is a leak. A daily signal using the full day's high and low before the close is a leak unless the strategy trades after those values are known.

In pandas, I want the code to show intent:

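# Assumes close is a daily close Series on a DatetimeIndex and daily_index is the trading calendar.
weekly_close = close.resample("W-FRI", label="right", closed="right").last()
weekly_signal = weekly_close.pct_change(12)  # 12-week return
# method="ffill" fills from the weekly labels, so a Friday missing from daily_index (a holiday) is not dropped.
trade_signal = weekly_signal.shift(1).reindex(daily_index, method="ffill")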

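That shift(1) is not decoration. It says the Friday close signal is first available after the Friday close, not during it. Applied to the weekly series, as here, it is the conservative version: each signal waits one full weekly bar before it is used. The exact shift depends on the execution rule, but there must be a rule. Silence is usually a leak.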
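One way to audit the labels discussed in this section is to ask, for every bin, whether its last contributing observation is later than its label. This is a sketch, and leaking_bins is a hypothetical helper name, not a pandas function:

```python
import pandas as pd


def leaking_bins(index: pd.DatetimeIndex, rule: str, **resample_kwargs) -> pd.Series:
    """Return bins whose last contributing observation is later than the bin label."""
    stamps = pd.Series(index, index=index)
    # For each bin, the latest timestamp that fed into it; empty bins are dropped.
    last_seen = stamps.resample(rule, **resample_kwargs).max().dropna()
    return last_seen[last_seen > last_seen.index]


days = pd.bdate_range("2024-03-04", "2024-03-15")  # two full business weeks

# Friday-labelled weeks that end on Friday: nothing is later than its label.
clean = leaking_bins(days, "W-FRI", label="right", closed="right")

# Monday-labelled weeks that run Monday to Friday: every label is too early.
leaky = leaking_bins(days, "W-MON", label="left", closed="left")
```

An empty result only says the label is not earlier than the data. Whether the strategy can trade at that label is still a matter for the execution rule.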

Point-in-time data is a contract

Point-in-time data means a query as of time T returns only what was knowable at time T. Not what is true now about then. What was knowable then.

Fundamentals are the obvious example. A company's fiscal quarter may end on 31 March, but the 10-Q may arrive weeks later, and the numbers may be restated after that. If your factor joins March quarter fundamentals to 31 March prices, you have bought information from the future. Economic data has the same disease through vintages and revisions. Index membership has it through additions and deletions. Analyst estimates have it through backfilled histories.
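A hedged sketch of the fundamentals case using pandas.merge_asof, keyed on filing time rather than fiscal period end. The ticker, dates, EPS value, and column names are hypothetical:

```python
import pandas as pd

# Hypothetical filing record: the quarter ends on 31 March but is filed in May.
fundamentals = pd.DataFrame(
    {
        "ticker": ["ABC"],
        "period_end": pd.to_datetime(["2024-03-31 00:00"]),
        "filed_at": pd.to_datetime(["2024-05-02 16:30"]),
        "eps": [1.25],
    }
)

# Two moments at which the strategy makes a decision.
decisions = pd.DataFrame(
    {
        "ticker": ["ABC", "ABC"],
        "decision_time": pd.to_datetime(["2024-04-15 15:55", "2024-05-03 15:55"]),
    }
)

# For each decision, take the latest filing strictly before decision_time.
pit = pd.merge_asof(
    decisions.sort_values("decision_time"),
    fundamentals.sort_values("filed_at"),
    left_on="decision_time",
    right_on="filed_at",
    by="ticker",
    direction="backward",
    allow_exact_matches=False,
)
```

The April decision gets no EPS because nothing had been filed yet. The May decision sees 1.25. Keying the same join on period_end would have handed the April row a number that was not public. Restatements would need one row per filing version, which this sketch does not model.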

The point-in-time problem is now turning up in machine-learning evaluations as well. Mostapha Benhenda's Look-Ahead-Bench is about financial LLMs, but the principle is familiar: a system trained or evaluated with knowledge from after the test date can look predictive without being tradeable. Different tool, same sin.

Timezones are a source of false alpha

Cross-market strategies are where timestamps become malicious. A signal using European closes and US closes has to respect the order in which those closes occur. A macro release at 08:30 New York is not available to a London signal at 08:00 London. A futures session that trades through midnight should not be chopped into calendar days because a CSV parser found a date column and felt confident.

The pandas docs for DataFrame.tz_localize, Timestamp.tz_convert, and DataFrame.resample are not bedtime reading, but they repay attention. I want all raw timestamps normalised to UTC at ingestion, with the original exchange timezone retained as metadata. I want session calendars from the exchange or a library built for them. I want joins expressed as "latest observation available before decision time", not "same date".
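A sketch of the ingestion rule applied to the London and New York example: store UTC, keep the source timezone as metadata, and compare instants rather than dates. The event names, column names, and the to_utc helper are assumptions for illustration:

```python
import pandas as pd

# Hypothetical event log with local wall-clock times and their source timezones.
raw = pd.DataFrame(
    {
        "event": ["london_signal", "us_macro_release"],
        "local_time": ["2024-03-14 08:00", "2024-03-14 08:30"],
        "source_tz": ["Europe/London", "America/New_York"],  # retained as metadata
    }
)


def to_utc(local_time: str, tz: str) -> pd.Timestamp:
    """Interpret a naive wall-clock string in its source timezone, then convert to UTC."""
    return pd.Timestamp(local_time).tz_localize(tz).tz_convert("UTC")


raw["time_utc"] = [to_utc(t, tz) for t, tz in zip(raw["local_time"], raw["source_tz"])]

signal_time = raw.loc[raw["event"] == "london_signal", "time_utc"].iloc[0]
release_time = raw.loc[raw["event"] == "us_macro_release", "time_utc"].iloc[0]

# The release is usable only if it happened strictly before the decision.
release_usable = release_time < signal_time
```

Both rows carry the date 2024-03-14, so a same-date join would pair them. In UTC the release is at 12:30 and the signal at 08:00, so release_usable is False.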

The test I run

For any feature, I ask three questions.

First, when did the underlying event happen? Second, when was it published? Third, when could my strategy have acted on it after ingestion, validation, and scheduling? If the answer to all three is the same timestamp, I get suspicious.

Then I run a deliberately delayed version of the strategy. Lag every feature by one more bar, one more day, or one more filing cycle. A genuine medium-horizon effect should degrade, not disappear instantly. If one extra lag kills the whole strategy, the edge may have been living in the timestamp convention rather than the market.
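A toy version of the delay test on synthetic data. The edge_by_extra_lag helper and the position rule (position proportional to the feature) are assumptions for illustration, not a full backtest:

```python
import numpy as np
import pandas as pd


def edge_by_extra_lag(feature: pd.Series, returns: pd.Series, extra_lags=(0, 1, 2)) -> pd.Series:
    """Mean per-bar P&L of a position proportional to the feature, for each extra lag."""
    out = {}
    for lag in extra_lags:
        position = feature.shift(lag)  # delay the feature by `lag` additional bars
        out[lag] = (position * returns).mean()
    return pd.Series(out, name="mean_pnl")


# Synthetic i.i.d. returns: no honest feature should show an edge here.
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0, 0.01, size=1000))

# Deliberately leaky feature: it already contains the bar's own return.
leaky_feature = returns.copy()

result = edge_by_extra_lag(leaky_feature, returns)
```

At lag 0 the mean P&L is the average squared return, which is positive by construction. With one extra bar of lag it falls to roughly zero. That is the collapse pattern described above: the edge was in the label, not in the market.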

Look-ahead bias is not always a dramatic coding error. Sometimes it is a daily bar with the wrong label, quietly handing you the close before you have earned it.