B.S. In The World Of Research
Here is a quote from a comment posted on my favourite stats blog by someone who is not the author of said blog:
“The stand out example for me is nutrition science. A lot of the big, obvious effects have been picked through and now so much of it is simmering in noise with strong incentives to find various different things by getting significance. Alcohol/chocolate/coffee does, doesn’t, does, doesn’t, does, doesn’t cause increased mortality. I don’t know how we could expect that discipline to turn around. There is good work being done there here and there, but so much of it is GIGO.”
…by which is meant Garbage In Garbage Out. The writer is Joe B. Bak-Coleman, who describes himself as a computational social scientist. You can check him out here.
This point gets made repeatedly on Gelman’s stats blog, by him and by many other contributors: much of social, medical, and health science research is based on bad statistical analysis of data that is fundamentally ‘noisy’, in which case there is little reason, statistically or logically speaking, to believe that whatever ‘influence of X on Y’ the research purports to have found really exists.
[I hear you wondering – what is ‘noisy’ data? Check out the end of this post.]
To put it in more human terms: a researcher who has a big data set and is determined to find a very particular effect of some X on some Y lurking in that data set can almost always find it, with attached statistical tests that are (too often) taken to imply that the effect is not just the result of random noise in the data.
And, if that X on Y effect can be portrayed as ‘cool’ or ‘unexpected’ or, sadly, politically convenient, the researcher can become a minor celebrity and get asked to give a TED talk or be interviewed by a podcaster or written up in the NYT.
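To make that ‘finding effects in pure noise’ point concrete, here is a toy simulation, entirely made up by me and not based on any actual study: generate an outcome and a few dozen candidate predictors that are all pure random noise, test each one, and watch a couple of ‘statistically significant’ effects appear anyway.

```python
# Toy illustration: mining pure noise for "significant" effects.
# All numbers here (500 subjects, 40 candidate predictors) are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects, n_candidates = 500, 40

y = rng.normal(size=n_subjects)                  # outcome: pure noise
X = rng.normal(size=(n_subjects, n_candidates))  # candidate X variables: also pure noise

significant = []
for j in range(n_candidates):
    r, p = stats.pearsonr(X[:, j], y)            # test each X against Y
    if p < 0.05:
        significant.append((j, round(r, 3), round(p, 4)))

print(f"'Significant' effects found in pure noise: {len(significant)}")
print(significant)  # typically around 2 of the 40, i.e. roughly the 5% false-positive rate at work
```

Run it with a different random seed and you get a different handful of ‘discoveries’, which is the whole problem.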
A different phenomenon, equally important, is when public policy bureaucrats, like, say, the US Surgeon General, make statements about research on, say, the health effects of alcohol consumption, that indicate they did not even read the research they are citing. A topic for another post.
However, what I want to stress here is that the incentives in research are almost perfectly upside down. If a researcher plugs along and tries to develop careful, credible evidence that an unsurprising ‘X influences Y’ statement is true, you will likely never know about it. What the public hears about are the sexy results, such as:
- signing an honesty pledge at the top of a form promotes honesty more than signing it at the bottom of the form.
- cold showers can change your life and reduce body fat. (That one, with a sample size of 49, was eventually retracted).
- people who sit at a wobbly work station rather than a stable one are more likely to judge their personal relationships as unlikely to last.
- student subjects who are told with great seriousness that they are putting a lucky golf ball sink more putts than those who are not told that. (An attempt by other researchers to replicate this experiment failed.)
- Adverse Infant Health Outcomes Increased After the 2016 U.S. Presidential Election Among Non-White U.S.-born and Foreign-born Mothers… that’s actually the title of a 2024 paper published in Demography which, conveniently, also tells you what the authors claim to have ‘found’. This is an example of the ‘politically convenient’ sort of result, and was arrived at using a method called ‘regression discontinuity’, which is so prone to abuse that even I could explain why. Maybe I’ll try one day….
And, just to round things off, here are some quotes from Retraction Watch, a fun (but depressing) site which I have mentioned before, that keeps an eye on research studies that get retracted for, well, many reasons.
i. “Our list of retracted or withdrawn COVID-19 papers is up past 500.”
ii. “And have you seen our leaderboard of authors with the most retractions lately — or our list of top 10 most highly cited retracted papers?”
iii. “The retraction of “a final batch” of 678 articles concludes Sage’s investigation into questionable peer review, citation manipulation, and other signs of paper mill activity at one of its journals, according to the publisher.” [For the curious, the journal in question is Journal of Intelligent and Fuzzy Systems. Yes, really.]
iv. “A Scientist Is Paid to Study Maple Syrup. He’s Also Paid to Promote It.”
My point is not a new one. The world is full of bullshit, and that includes that part of the world in which highly educated researchers are supposed to be trying to improve our understanding of the world.
Appendix (sort of) on Noisy Data
This refers to data that is only weakly connected to the actual thing it is supposed to be a measure of. Here’s a good example from a seminar I recently attended. The researcher was studying the impact of parental attention on how their kids did in school. Said researcher had data on a number of things, including, as one example, the time spent by parents reading to their pre-kindergarten children. However, this data came from a survey in which parents were asked to recall how many times in a typical week during the previous year they had read to their kids.
How good is any parent going to be at figuring out what a typical week is, or at recalling how many times they read to their kid over the past year? The actual fact that the researcher needs, the impact of which on kids’ outcomes they are trying to study, is the time parents actually spend reading to their kids. The answers parents give to that survey question will be related to that actual fact, but only imperfectly. Parents will over- and/or understate what they actually did when they answer that question. A lot of studies in social and health science rely on survey data, and it all tends to be noisy in this way. People have imperfect memories, and there is also variation in what people think a particular survey question is asking them. There are of course other things that can make data ‘noisy’ when it is not the result of some objective, regularized measurement process.
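For the curious, here is a minimal sketch of what that kind of noise does to an estimated effect. The numbers are invented (not the seminar data): the kid’s outcome is driven by true weekly reading time, but the researcher only observes a noisy survey recollection of it, and regressing on the noisy measure shrinks the estimated effect toward zero.

```python
# Minimal sketch of attenuation from noisy survey data. All parameters are
# hypothetical: true effect of reading time on the outcome is set to 1.0.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

true_reading = rng.gamma(shape=2.0, scale=2.0, size=n)        # true hours/week (assumed distribution)
reported = true_reading + rng.normal(scale=2.0, size=n)       # survey answer: true value plus recall noise
outcome = 1.0 * true_reading + rng.normal(scale=3.0, size=n)  # kid's outcome, true slope = 1.0

def ols_slope(x, y):
    """Slope from a simple one-variable least-squares regression."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print("slope using true reading time:", round(ols_slope(true_reading, outcome), 2))  # about 1.0
print("slope using survey report    :", round(ols_slope(reported, outcome), 2))      # noticeably smaller
```

The noisier the survey answers relative to the true variation in reading time, the more the estimated effect gets dragged toward zero, which is one reason noisy data plus strong incentives to ‘find something’ is such a bad combination.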