# Explaining the hot-hand fallacy fallacy

Since first coming across Joshua Miller and Adam Sanjurjo’s great work demonstrating that the hot-hand fallacy was itself a fallacy, I’ve been looking for a good way to explain simply the logic behind their argument. I haven’t found something that completely hits the mark yet, but the following explanation from Miller and Sanjurjo in The Conversation might be useful to some:

In the landmark 1985 paper “The hot hand in basketball: On the misperception of random sequences,” psychologists Thomas Gilovich, Robert Vallone and Amos Tversky (GVT, for short) found that when studying basketball shooting data, the sequences of makes and misses are indistinguishable from the sequences of heads and tails one would expect to see from flipping a coin repeatedly.

Just as a gambler will get an occasional streak when flipping a coin, a basketball player will produce an occasional streak when shooting the ball. GVT concluded that the hot hand is a “cognitive illusion”; people’s tendency to detect patterns in randomness, to see perfectly typical streaks as atypical, led them to believe in an illusory hot hand.

In what turns out to be an ironic twist, we’ve recently found this consensus view rests on a subtle – but crucial – misconception regarding the behavior of random sequences. In GVT’s critical test of hot hand shooting conducted on the Cornell University basketball team, they examined whether players shot better when on a streak of hits than when on a streak of misses. In this intuitive test, players’ field goal percentages were not markedly greater after streaks of makes than after streaks of misses.

GVT made the implicit assumption that the pattern they observed from the Cornell shooters is what you would expect to see if each player’s sequence of 100 shot outcomes were determined by coin flips. That is, the percentage of heads should be similar for the flips that follow streaks of heads, and the flips that follow streaks of tails.

Our surprising finding is that this appealing intuition is incorrect. For example, imagine flipping a coin 100 times and then collecting all the flips in which the preceding three flips are heads. While one would intuitively expect that the percentage of heads on these flips would be 50 percent, instead, it’s less.

Here’s why.

Suppose a researcher looks at the data from a sequence of 100 coin flips, collects all the flips for which the previous three flips are heads and inspects one of these flips. To visualize this, imagine the researcher taking these collected flips, putting them in a bucket and choosing one at random. The chance the chosen flip is a heads – equal to the percentage of heads in the bucket – we claim is less than 50 percent.

To see this, let’s say the researcher happens to choose flip 42 from the bucket. Now it’s true that if the researcher were to inspect flip 42 before examining the sequence, then the chance of it being heads would be exactly 50/50, as we intuitively expect. But the researcher looked at the sequence first, and collected flip 42 because it was one of the flips for which the previous three flips were heads. Why does this make it more likely that flip 42 would be tails rather than a heads?

If flip 42 were heads, then flips 39, 40, 41 and 42 would be HHHH. This would mean that flip 43 would also follow three heads, and the researcher could have chosen flip 43 rather than flip 42 (but didn’t). If flip 42 were tails, then flips 39 through 42 would be HHHT, and the researcher would be restricted from choosing flip 43 (or 44, or 45). This implies that in the world in which flip 42 is tails (HHHT) flip 42 is more likely to be chosen as there are (on average) fewer eligible flips in the sequence from which to choose than in the world in which flip 42 is heads (HHHH).

This reasoning holds for any flip the researcher might choose from the bucket (unless it happens to be the final flip of the sequence). The world HHHT, in which the researcher has fewer eligible flips besides the chosen flip, restricts his choice more than world HHHH, and makes him more likely to choose the flip that he chose. This makes world HHHT more likely, and consequently makes tails more likely than heads on the chosen flip.

In other words, selecting which part of the data to analyze based on information regarding where streaks are located within the data, restricts your choice, and changes the odds.
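The claim can be checked with a quick Monte Carlo simulation. The sketch below is in Python using only the standard library; the function name `mean_proportion_after_streak` and its parameters are illustrative assumptions of mine, not anything from Miller and Sanjurjo's article. It generates many sequences of 100 fair coin flips, puts every flip that follows three heads into that sequence's "bucket", computes the proportion of heads in each bucket, and averages those proportions over the sequences that have a non-empty bucket.

```python
import random

def mean_proportion_after_streak(n_flips=100, streak=3, n_sequences=20000, seed=0):
    """Average, over simulated sequences, of the within-sequence proportion of
    heads among flips immediately preceded by `streak` consecutive heads."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_sequences):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True = heads
        # Collect every flip whose preceding `streak` flips were all heads.
        bucket = [flips[i] for i in range(streak, n_flips)
                  if all(flips[i - streak:i])]
        if bucket:  # sequences with no eligible flip are skipped
            proportions.append(sum(bucket) / len(bucket))
    return sum(proportions) / len(proportions)

if __name__ == "__main__":
    print(mean_proportion_after_streak())
```

If the argument is right, the printed average should sit noticeably below 0.5, even though every individual flip is a fair coin. Note that the average is taken per sequence, which mirrors analysing each player's shots separately; pooling all the buckets into one big bucket before taking the proportion would be a different calculation.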

There are a few other pieces in the article that make it worth reading, but here is an important punchline to the research:

Because of the surprising bias we discovered, their finding of only a negligibly higher field goal percentage for shots following a streak of makes (three percentage points), was, if you do the calculation, actually 11 percentage points higher than one would expect from a coin flip!

An 11 percentage point relative boost in shooting when on a hit-streak is not negligible. In fact, it is roughly equal to the difference in field goal percentage between the average and the very best 3-point shooter in the NBA. Thus, in contrast with what was originally found, GVT’s data reveal a substantial, and statistically significant, hot hand effect.

## 11 thoughts on “Explaining the hot-hand fallacy fallacy”

1. funstein19 says:

They found it to be true in baseball.

Work more recent than 1985 has disputed their study as poor model design.

On Wed, Jun 27, 2018 at 12:01 PM, Jason Collins blog wrote:

> Jason Collins posted: “Since first coming across Joshua Miller and Adam > Sanurjo’s great work demonstrating that the hot-hand fallacy was itself a > fallacy, I’ve been looking for a good way to explain simply the logic > behind their argument. I haven’t found something that complete” >

2. funstein19 says:

I can’t read. A fallacy of a fallacy is obviously too confusing for my small mind. Have a great day.

3. colinhutton says:

Debunking debunkers Miller and Sanjurjo (MS).

The description in the linked article of the ‘flips’ procedures is analogous to significant aspects of roulette. This raises the question of whether MS conclude that their findings apply also to betting on the outcome of a spin of the wheel following a sequence of, say, 3 Reds. In short, whether the odds of Black then coming up are better than 50/50.

If that is their conclusion, then they have bamboozled themselves.

To check that they have not inadvertently misrepresented the import of their findings in the illustration they use in the article (and the convoluted/confusing explanations – “buckets” etc) I followed the links in the article to their ‘definitive’ paper – a 64-page (!) PDF. (I note that the title/abstract of the paper includes references to “Gambler’s Fallacy”).

I read/studied the Abstract and Introduction. Painfully time-consuming as their ‘explanations’ are as garbled as those in their article. (Perhaps, also, because I am a retired chartered accountant, not a statistician. But I doubt it). As it is, I am now confident that they believe they have “proved” both the ‘Hot Hand Fallacy’ and the ‘Gambler’s Fallacy’ to be fallacious. In short, they have bamboozled themselves.

Their claim (particularly insofar as it relates to wagering on roulette) is on a par with those who claim to have invented/constructed the proverbial perpetual motion machine. Their paper is grossly flawed. A total nonsense. I could go on and explain further – but enough already.

So, Jason, regarding MS’s “great paper”, I confidently predict that you will never “find a good way to explain simply the logic behind their argument” :)

4. I don’t get it :-( In table 1 of their revised 2016 paper they are adding up the third column, and dividing by the number of rows with any heads in the first two places (which is 6). That seems like a pretty meaningless sum to me. What they ought to be doing is multiplying column 2 by column 3, adding those values up and dividing by the total of column 2 (which is 8). That gives the correct expected value of 1/2. They even give a reasonable-sounding analysis of why their sum is wrong in the passage following the table. Why do they promote their dubious sum giving the wrong answer as somehow correct? It isn’t clear, at least to me.

1. GVT effectively used the “dubious” method in examination of the hot hand – concluding that it didn’t exist. If GVT had used that same methodology on coin flips, they would conclude people have cold hands (or that the gambler’s fallacy is true). Miller and Sanjurjo aren’t promoting the dubious sum – they’re pointing out that it is dubious and led to an incorrect conclusion by GVT.

1. It still looks as though they are asserting the problematical claim – rather than exposing it. For example, consider the first paragraph of their introduction:

“Jack takes a coin from his pocket and decides to flip it, say, one hundred times. As he is curious about what outcome typically follows a heads, whenever he flips a heads he commits to writing the outcome of the next flip on the scrap of paper next to him. Upon completing the one hundred flips, Jack of course expects the proportion of heads written on the scrap of paper to be one-half. Shockingly, Jack is wrong. For a fair coin, the expected proportion of heads is smaller than one-half.”

I think it is challenging to find a sympathetic interpretation of this paragraph. Surely, Jack is correct and the authors of the paper are mistaken.

2. I don’t like that particular paragraph either – not because it is wrong, but because Jack is unlikely to discover the subtle bias in a single series of 100 – the expectation is around 0.495.

But to test it, try this R code, looking at one million reps of a series of 4:

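    # Simulate `rep` sequences of `n` fair coin flips (1 = heads, 0 = tails)
    rep <- 1e6
    n <- 4
    data <- array(sample(c(0,1), rep*n, replace=TRUE), c(rep,n))
    prob <- rep(NA, rep)
    for (i in 1:rep){
      heads1 <- data[i,1:(n-1)]==1  # which of flips 1..n-1 are heads
      heads2 <- data[i,2:n]==1      # whether the following flip is heads
      # proportion of heads among flips that follow a heads (NaN if none)
      prob[i] <- sum(heads1 & heads2)/sum(heads1)
    }
    # Average the per-sequence proportions, dropping sequences with no eligible flip
    print(mean(prob, na.rm=TRUE))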

Borrowing from here: http://andrewgelman.com/2015/07/09/hey-guess-what-there-really-is-a-hot-hand/

You can test the series of 100 by adjusting n to 100 (suggest reducing rep to decrease runtime if you do).

3. Hmm. Math mistake on my part, dammit. The authors are correct about this being a counter-intuitive result. Thanks for spending the time to point my problem out.
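The two sums debated in this thread can also be computed exactly for sequences of three flips, without any simulation. The following is a minimal Python sketch under my own assumptions: the function name `table_sums` and its variable names are hypothetical, and it is meant to mirror the calculation described in the comments above rather than reproduce the paper's table. It enumerates all 8 equally likely sequences, records the flips that follow a heads, and then compares the unweighted average of the per-sequence proportions (dividing by the 6 sequences that have a heads in the first two places) with the pooled proportion (dividing by the 8 recorded flips).

```python
from fractions import Fraction
from itertools import product

def table_sums(n_flips=3):
    """Return (unweighted average of per-sequence proportions, pooled proportion)."""
    per_sequence = []    # proportion of heads after a heads, one per eligible sequence
    total_recorded = 0   # flips that follow a heads, pooled over all sequences
    total_heads = 0      # how many of those pooled flips are heads
    for seq in product("HT", repeat=n_flips):
        followers = [seq[i] for i in range(1, n_flips) if seq[i - 1] == "H"]
        if not followers:
            continue  # no heads in the first n_flips - 1 places
        heads = followers.count("H")
        per_sequence.append(Fraction(heads, len(followers)))
        total_recorded += len(followers)
        total_heads += heads
    unweighted = sum(per_sequence) / len(per_sequence)
    pooled = Fraction(total_heads, total_recorded)
    return unweighted, pooled

unweighted, pooled = table_sums()
print(unweighted, pooled)  # 5/12 1/2
```

The pooled figure of 1/2 is the sum the commenter expected; the unweighted 5/12 is what you get when each sequence's proportion counts equally, which is effectively what happens when a researcher computes a percentage for each player's sequence and then averages across players.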

5. colinhutton says:

OK I get it now. The first paragraph of the introduction totally derailed me.

In my defense, the syntax is truly horrible. The worst example being the short sentence “Shockingly, Jack is wrong.” That clearly implies the reader of the paper will be shocked – which I duly was – and the first sentence of the immediately subsequent 2nd paragraph is just as misleading.

Anecdote : Soon after Crown commenced operation in Melbourne 20 years ago, I went to have a look at the mug punters crowded around the roulette tables. The Casino provided a feature to save them the hassle of recording black / red outcomes to track/identify streaks. It consisted of a prominent electronic display at each wheel of the R/B outcome of the 8 (10?) preceding spins. I didn’t believe that even mug punters would be influenced by streaks. I should have been cynical enough to realise that Crown would know exactly what it was doing.

6. David says:

The hot hand fallacy fallacy exists, but so does the hot hand fallacy (maybe). See – Daks, Alon and Desai, Nishant and Goldberg, Lisa R., Do the Golden State Warriors Have Hot Hands? (November 4, 2017). Available at SSRN: https://ssrn.com/abstract=2984615 or http://dx.doi.org/10.2139/ssrn.2984615 – or, like me, watch https://www.youtube.com/watch?v=bPZFQ6i759g via numberphile.

Would be interested to know if you think the hot hand fallacy fallacy should have been caught in peer review or if it was a particularly hard error to spot.

1. Sorry for the slow response. I’ll say it was a hard error to spot – I read the paper and relied on it for years, so it likely would have slipped by me. For me it’s an advertisement for open peer review: imagine if the paper and data were publicly available as a working paper for a while before publishing. I think then the problem might have been found.