
Consider this Rule:
You Cannot Persuade a Falling Apple.
The pun falls from Isaac Newton’s famous meditation under an apple tree, his discovery of gravity, and the creation of modern physics. You cannot persuade a falling apple because it is in the throes of gravity. Persuasion, by contrast, only works on the uncertain, the ambiguous, the disputed. And, not only can you not persuade a falling apple, it follows that you should not try to persuade a falling apple because it makes you like that little boy running down a slope flapping his arms in an attempt to fly.
I always remember this Rule anytime somebody tries to interject science into a controversy. Hey, if the issue is a Falling Apple, then shouldn’t it be as obvious as a falling apple? Doesn’t the fact that we are disputing it mean that it isn’t a falling apple? And, so why is anyone trying to moot the point with science?
Consider this in the context of the current furor over concussions in the NFL. While everyone knows the NFL produces violent hits that often cause head injuries, it has only been in the past year or so that this topic has risen to the top of the Change Agenda. Lots of people are buzzing about it and demanding that the NFL do something to reduce the risk. Why now?
The proximate cause of the buzz is Malcolm Gladwell’s 2009 New Yorker piece. The distal cause is a couple of publications in the peer review literature a few years farther back. Gladwell got the Cool Table buzzing with his fabulous rhetorical skills – just read his writing – and his equally fabulous FauxItAll skills – carefully read his writing. Since the buzz is based on science, it would not be a good idea to take Gladwell’s account too seriously. He does get paid by the word and not by the truth, you know.
But, what is that science?
If we restrict that answer to only sources that are published in the peer review literature and not anything in the pop press, it appears that there are two prominent articles, here and here, both by Professor Guskiewicz and colleagues, the first on cognitive impairment and the second on depression. These papers seem to be the most widely discussed in the pop press whether with sports sources like ESPN or general sources like the NYT.
While the two papers are published in two different sources and address two different outcomes, it’s apparent that the studies both flow from the same data collection. This is not unusual, particularly in health and medical research. Teams will mount a large data collection, then spend several years analyzing and publishing parts of it. Obviously this practice helps the vita as you get more hits from the same data point. And, if you are not paying careful attention, this practice can lead to a bandwagon effect as a series of stories appear over time making things look more compelling than they really are.
The key feature to this data is the method of collection. Guskiewicz et al. obtained the cooperation of the NFL Players Union and sent a self report survey to retired NFL players. The mailings went out to all 3683 retirees and 2552 were returned. The surveys contained all of the information to be analyzed. Thus, all of the data, both cause and effect, were from only one source, the retired player, and provided from their self report.
Okay. Take a moment and think about this like a good scientist. This is one of the simplest forms of data collection available. You create a paper and pencil questionnaire. You mail it out to a population. You analyze what you get back. This is a one shot, nonrandom, self report, retrospective survey. This has no control over:
1. Who gets the survey.
2. Who returns the survey.
3. The conditions under which people complete the survey.
4. Who actually completes the survey.
5. Whether each respondent understands each question the same way.
And, you can think of additional concerns if you let yourself get skeptical. But, you get the point. This data collection is inherently shaky. Poor control. No randomization. Scientists call it a “biased sample.”
Now, this method is better than one of those popup polls on a politics blog that asks you to vote on whether the President is:
A. An Idiot.
B. A Fool.
C. A Liar.
D. A Muslim.
E. All of the Above.
But, it is still a biased method and thus biased science. Of course, it is practically impossible to do great science on this problem. The unattainable ideal would be to randomly select men who want to play football and randomly assign them to different conditions – amount and severity of contact, length of exposure, timing of exposure, etc. We would then carefully measure all of the variables in our model, both causes and effects, and at the end of the study, count the outcomes, making comparisons between our randomly assigned and controlled conditions. We will not be able to do a great scientific study on this problem, so we will have to face limitations in our methods. But accepting limitations does not mean we forget about them.
The key question in the Guskiewicz et al. work is, What is the dose-response relationship? Stated in football terms, do more hits cause more cognitive impairment and depression? The survey measures “hits” in a variety of ways, but again only through the self report of the athlete. There’s no medical record with a physician’s report or any films to read. Just self report. Same thing with the outcomes of impairment and depression. It’s what the athlete says. There’s no independent data source. So, what’s the relationship between hits and health?
Here’s how Guskiewicz et al. describe it for cognitive impairments.
Statistical analysis of the data identified an association between recurrent concussion and clinically diagnosed MCI (chi2 = 7.82, df = 2, P = 0.02) and self-reported significant memory impairments (chi2 = 19.75, df = 2, P = 0.001).
And now for depression.
There was an association between recurrent concussion and diagnosis of lifetime depression (chi2=71.21, df=2, P<0.005).
Bingo! The effect is statistically significant and beyond the conventional P < .05 level for Mild Cognitive Impairment (MCI), self reported memory loss, and depression.
Strong news, right?
No.
The statistical significance of a test is a function of two things: the effect size and the sample size. To get “more” significance, just add more subjects to the analysis even with the same effect size. Forget the significance for now and seek the effect size, the strength of the relationship between hits and health.
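To see the sample size point in numbers, here is a minimal Python sketch. It assumes a chi-square test with df = 2 (the same df as the quoted results), the standard relation chi2 = N × w² where w is the effect size for a chi-square test, and the closed-form upper tail for df = 2, p = exp(-chi2 / 2). The effect size of 0.05 and the sample sizes are made-up illustration values, not numbers from the papers.

```python
import math


def chi2_from_effect(w, n):
    """Chi-square statistic implied by effect size w and sample size n."""
    return n * w * w


def chi2_p_value_df2(chi2):
    """Upper-tail p-value for a chi-square statistic with 2 degrees of freedom.

    For df = 2 the tail probability has the closed form exp(-chi2 / 2),
    so no statistics library is needed.
    """
    return math.exp(-chi2 / 2.0)


# Hypothetical effect size, held constant while only the sample size changes.
EFFECT_SIZE_W = 0.05

for n in (500, 1000, 2500, 10000):
    chi2 = chi2_from_effect(EFFECT_SIZE_W, n)
    p = chi2_p_value_df2(chi2)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"N={n:>6}  chi2={chi2:6.2f}  p={p:.4f}  {verdict}")
```

With these illustration values the very same effect moves from "not significant" to "significant" somewhere between N = 1000 and N = 2500. Nothing about the strength of the relationship changed, only the number of surveys in the pile.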
To do that, let’s translate those misleading P values (.02, .001, and .005) into the Windowpane display.
48-52 for Mild Cognitive Impairment
47-53 for self reported Memory Loss
47-53 for Depression
Now, a small effect size is considered 45-55, so the effects here are not even small. The only reason these results got published is because they are “statistically significant” and we know that is a function of sample size. For example, if only 1000 retired players had returned the surveys and the same effect sizes had occurred, the results would not have been “statistically significant.”
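The Windowpane arithmetic can be sketched in a few lines of Python. This assumes the Windowpane is the binomial effect size display, where a correlation r becomes a split of 50 − 50r versus 50 + 50r. The function names are mine, and the 0.10 benchmark is the conventional "small" correlation that corresponds to 45-55.

```python
def windowpane_from_r(r):
    """Return the (low, high) percentage split implied by correlation r."""
    half_gap = 50.0 * r
    return 50.0 - half_gap, 50.0 + half_gap


def r_from_windowpane(low, high):
    """Recover the correlation r from a low-high Windowpane split."""
    return (high - low) / 100.0


# Conventional benchmark for a "small" effect, i.e. a 45-55 Windowpane.
SMALL_R = 0.10

# Windowpane splits as reported in the text above.
reported = {
    "Mild Cognitive Impairment": (48, 52),
    "Memory Loss (self report)": (47, 53),
    "Depression": (47, 53),
}

for outcome, (low, high) in reported.items():
    r = r_from_windowpane(low, high)
    size = "below small" if r < SMALL_R else "small or larger"
    print(f"{outcome:<28} {low}-{high}  r={r:.2f}  ({size})")
```

Read this way, a 48-52 Windowpane is a correlation of about .04 and a 47-53 Windowpane is about .06, both short of the .10 that earns the label "small."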
Think, now, about all the limitations we noted to this data collection. No randomization. No control. All self report from one reporting source. What if some of the guys interpret a question one way while the rest interpret another way? What if some of the surveys are completed by loved ones and not the players? Hey, about 70% of the surveys were returned; is it possible that people worried about concussions were more likely to respond? Hey, hey, hey, this was sponsored by the Players Union and they are now renegotiating the contract. All of these issues serve to mess with the data and should increase our scientific skepticism. Any of these biases (sometimes called rival explanations or threats to internal validity) could easily produce the very small effects reported in the publications.
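Here is a rough Python sketch of how just one of those biases, who bothers to mail the survey back, could manufacture a small effect out of nothing. Every number in it is hypothetical. It assumes depression is equally common with or without a history of recurrent concussion in the full population, and that players who have both are somewhat more likely to respond.

```python
def respondent_depression_rates(p_depressed, base_response, extra_response):
    """Depression rates among respondents, split by concussion history.

    In the full population depression is unrelated to concussion: both
    groups have the same rate, p_depressed. Players with BOTH a concussion
    history and depression return the survey at base_response +
    extra_response; everyone else returns it at base_response.
    Returns (rate_with_concussion, rate_without_concussion).
    """
    # Respondents with a concussion history.
    depressed_c = p_depressed * (base_response + extra_response)
    not_depressed_c = (1.0 - p_depressed) * base_response
    rate_c = depressed_c / (depressed_c + not_depressed_c)

    # Respondents without a concussion history.
    depressed_n = p_depressed * base_response
    not_depressed_n = (1.0 - p_depressed) * base_response
    rate_n = depressed_n / (depressed_n + not_depressed_n)

    return rate_c, rate_n


# Hypothetical inputs: 10% depression in both groups, a 68% baseline
# response rate, and a 15-point response bump for worried players.
rate_c, rate_n = respondent_depression_rates(0.10, 0.68, 0.15)
print(f"Depression among respondents with concussions:    {rate_c:.3f}")
print(f"Depression among respondents without concussions: {rate_n:.3f}")
```

Under these made-up assumptions the returned surveys show a depression gap of roughly two percentage points between the groups, even though no gap exists in the population that was mailed.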
Of course, you probably don’t think about any of this stuff when you’re at a press conference and one of the researchers is talking about it. Look at this shot from a news conference yesterday at WVU.

The lab coat. The lectern. The blue background. You don’t need to know that this is Julian Bailes, one of the study coauthors. You just know this is a credible guy telling the truth.
Except really he’s just a guy with a PowerPoint slide, an argument, and yes, that cool white lab coat. Gee, does he wear that lab coat when he does the statistical analysis? Probably not since he’s a neurosurgeon and not an applied data analyst.
You might be feeling a bit confused with these now obvious criticisms and ask yourself why the hell this crap got published in the first place. The reason you ask that is because you don’t understand how peer review science works. If you carefully read the paper, you see that the writers are aware of the limitations we’ve discussed and that they qualify the findings. They don’t report this as a Falling Apple, but as something that might be a Falling Apple.
From my point of view as either a reviewer or just a reader of that literature, I support their right to voice this position as long as they support my right to disagree. It also helps to realize that this study is not a neurological study, but a psychological study since it is just self reporting. Science is a long and winding road with lots of ups and downs before anything gets into the Received View of textbooks, training, and tenure. Any one or two or even a few publications are never decisive.
FauxItAlls in the pop press don’t know this because they don’t compete in peer review. ESPN or NYT only wants eyeballs and ears for their advertising. Gladwell packs them in with his fabulous writing skill. The pop press guys don’t want Falling Apples. They want uncertainty, doubt, ambiguity, fear, risk, worry. And FauxItAlls deliver that with grace, style, and wit.
The interesting persuasion problem here lands squarely in the lap of the NFL. When the Cool Table is generating negative buzz about the Greatest Game on Turf, the NFL must act. It doesn’t matter that the Cool Table is badly abusing the science here; if they did it on any given Sunday, they’d get flagged for a personal foul, fined by the Commissioner, and given a suspension. Roger Goodell can’t call Malcolm Gladwell to the woodshed on this, but has to deal with the image problem caused by Gladwell and his ilksters.
The bell tolls for the NFL and it will be interesting to see how they answer it.