Excel Statistics

Attention Conservation Notice: Far too many words (again) on why I think (good) statistics matters and (bad) statistics should be destroyed.

I pontificate a lot about statistics. I knew it had gotten bad when a friend commented to me recently that he liked my rant, and I had to ask which one. I think I fight so hard for statistics because there was a period in my life (mostly my undergraduate years) when I thought statistics wasn't a field worth studying. It's just means, medians, and modes. We learned that stuff in elementary school. Why would anyone want to become a statistician?

After beginning to study nonlinear dynamics and complex systems, I saw statistics everywhere. A lot of it was shoddily done. Some of it was not. And then I took mathematical statistics in the spring semester of my senior year, and realized that statistics is not, in fact, a castle built in the sky. Rather, it's a castle solidly constructed, with foundations in measure theory. Lots of smart people worked on making it the field it is today. There are some philosophical differences amongst its practitioners, but overall the field offers a lot of tools for studying the stochastic world we live in.

And yet. And yet I still find myself frequently railing against statistics. Like when I see a course like this. Do I think that sociologists should learn more statistics? Yes. But not like this. No joke, one of the examples from the first-day handout proposes doing a \(t\)-test to determine whether married or unmarried individuals have more sex.

And I quote:

People who are married have sex an average of 78 times a year; unmarried people have sex on average 53 times a year. The \(t\)-test value is 7.29 indicating that the difference is statistically significant.

The hypothesis is supported – cannot be rejected – given these measurements.
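For the record, there is nothing exotic behind a number like that 7.29. Here is a minimal sketch of a two-sample \(t\) statistic. The handout doesn't say which variant it used, so I'm assuming Welch's version (difference in sample means divided by its estimated standard error). The function name `welch_t` and the toy samples are mine, made up purely for illustration; they are not the handout's data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic.

    Difference in sample means divided by the estimated standard
    error of that difference (unequal variances allowed).
    """
    # statistics.variance is the sample variance (divides by n - 1)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Made-up toy samples (hypothetical, NOT the handout's data)
group_a = [2.0, 4.0, 6.0, 8.0]
group_b = [1.0, 2.0, 3.0, 4.0]

t_stat = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}")
```

A few lines of arithmetic, in other words. Computing the number is the easy part.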

Let's not worry about whether the estimator they're using is asymptotically \(t\)-distributed. (Do these students even remember what 'asymptote' means?) Much more important is that this question isn't interesting. Statistics got its start answering sociological questions, about life and death, crime and punishment. Poisson proved a generalization of the law of large numbers so that he could use it in designing a fair jury system for courts.

And now sociologists in training run \(t\)-tests to see who has more sex.

During an office conversation with a friend, we coined a new phrase: 'Excel statistics.' That's the sort of statistics this handout advocates. 'Let's do a hypothesis test, because that involves math! Which makes it science!' No. 'Let's fit a line through the data, and report an \(R^{2}\) value!' No. 'Let's...'

Readily available computing has brought statistics to the masses. Which should be a good thing. Unfortunately, the masses aren't ready for statistics.