Higgs and Stats

Whenever there is some science news, I hold my breath until SWAB comments. And on this issue Ethan Siegel does not disappoint. I highly recommend reading him if you’re interested in great science writing.

Anyway, I’ve been pretty confused about a lot of the statistics around the evidence for the Higgs Boson. Let me set this up first, though. Here’s Ethan:


Back in 1976, there were only four quarks that had been discovered, but suspicions were incredibly strong that there were actually six. (There are, in fact, six.) If you look at the above graph, the dotted line represents the expected background, while the solid line represents the signal published here from the E288 Collaboration’s famous Fermilab experiment. Looking at it, you would very likely suspect that you’re seeing a new particle right at that 6.0 GeV peak, where there ought to be no background. Statistically, you can analyze the data yourself and find that you’d be 98% likely to have found a new particle, rather than have a fluke. In fact, the particle was named (the Upsilon), but when they looked to confirm its existence… nothing!

In other words, it was a statistical fluke, now known as the Oops-Leon (after Leon Lederman, one of the collaboration’s leaders). The real Upsilon was found the next year, and you shouldn’t feel too bad for Leon; he was awarded the Nobel Prize in 1988.

But the lesson was learned. It takes a 99.99995% certainty in order to call something a discovery these days.

5 sigma?! WTF?! That’s humongous. (A 99.99995% certainty works out to roughly five standard deviations on a normal distribution.) That says to me that they’re either using the wrong distribution or the number of observations is immensely higher than any dataset I’ve ever seen. Considering these are probably the most competent statisticians on earth, I have to assume the latter, but… seriously?! FIVE standard deviations?
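For anyone who wants to check the conversion between a "certainty" percentage and a number of standard deviations, here’s a minimal sketch. It assumes the test statistic follows a standard normal distribution, which is my simplification and not necessarily what the physicists actually do. The function names `sigma_from_confidence` and `confidence_from_sigma` are made up for this post, and the `two_sided` switch is there because I don’t know which tail convention the quoted percentages use.

```python
from statistics import NormalDist

# Assumption: the test statistic is standard normal (mean 0, std dev 1).
STANDARD_NORMAL = NormalDist()


def sigma_from_confidence(confidence, two_sided=True):
    """Number of standard deviations matching a confidence level."""
    tail = 1.0 - confidence          # probability of a fluke
    if two_sided:
        tail /= 2.0                  # split the fluke probability over both tails
    return STANDARD_NORMAL.inv_cdf(1.0 - tail)


def confidence_from_sigma(sigma, two_sided=True):
    """Confidence level matching a number of standard deviations."""
    tail = 1.0 - STANDARD_NORMAL.cdf(sigma)
    if two_sided:
        tail *= 2.0
    return 1.0 - tail


# The two certainty figures quoted above: 98% and 99.99995%.
for level in (0.98, 0.9999995):
    two = sigma_from_confidence(level, two_sided=True)
    one = sigma_from_confidence(level, two_sided=False)
    print(f"{level:.7%}: two-sided {two:.2f} sigma, one-sided {one:.2f} sigma")
```

Running it prints the sigma level for each quoted percentage under both tail conventions, so you can see how much the answer depends on that choice.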

I’d love to see the data.
