
People Sayin’ Stuff

When I was a young prof, I didn’t do any data research, but I saw a lot of it presented, and it was generally true way back then that economists did not often do survey research. That is, they did not put together questionnaires, get a lot of folks to fill them out and then analyze their responses.

To be sure, government-sponsored and government-funded large-scale surveys were just beginning to appear, like The Panel Study of Income Dynamics (PSID), which began in 1968 with a nationally representative sample of over 18,000 individuals living in 5,000 families in the United States. It describes itself as the longest running longitudinal household survey in the world. Wow, eh?

Things like the PSID are publicly available and painstakingly constructed, and those who do use them in their research are able to offer feedback on any issues they find with the survey. The PSID, like any such survey, asks people to answer questions about their income and such, and it is well-recognized that people’s answers are not entirely reliable, as I will discuss further below. However, when you have a panel like this, in which people answer the same questions repeatedly over time, it is possible to detect when some answers just don’t make sense.

So, what I am claiming economists did not do much in the 70s and 80s was what I described above: construct a questionnaire, get some folks to fill it out, and then analyze their answers with a view to understanding something about people in general.

The reasons for this lack of interest in surveys were, so far as I could tell, two.

One was a belief that it didn’t matter what people said about things. We economists were doing behavioural science, so it was what people did that was important, not what they said. Talk is cheap, but actions have consequences. We want data on what people actually bought and how many hours they worked and such.

Two was a recognition that people say all kinds of things that they don’t mean and/or are internally inconsistent. The most famous example of this in my discipline (I can’t find a reference to it; it was eons ago) was when someone went to a group of people in charge of running firms and asked them if their goal was to maximize the firm’s profit. This is of interest because one assumption made in the standard economic model of firm behaviour is that very thing: firms always behave so as to maximize their profits.

The person who asked this was chagrined to find that most of those asked said ‘No, that’s not how we operate at all.’

The chagrin was decidedly lessened when those same people were asked ‘If you become aware of a change in your firm’s operations that will increase profits, do you implement it?’, to which most firm managers said ‘Well, of course we do.’

Like I said, inconsistent. They do maximize profits, they just don’t like to say it that way.

The problem with this stance on not doing surveys was that it left economists far behind empirical researchers in psychology, sociology and even political science, who were very big on survey research, and did tons of it.

Because here’s the thing. The absolutely essential input into all empirical research in econ, poli-sci, psych and soc is data. Getting your hands on good data is step one in any empirical research, and gathering good data is costly and time-consuming. Unless, of course, you just put together a bunch of questions on a form and use (back in the day) the phone or door-to-door canvassing or (more recently and much easier) some online platform. Indeed, things like Survey Monkey mean that even you, gentle reader, can construct a survey and try to get people to answer it.

So, the absolute explosion in (mostly useless, imho) social science and other research in the last couple of decades has been at least partly aided by the fact that one can so easily generate data sets from online surveys, so long as one does not look too closely at what those surveys actually mean.

A prime example is the research by economists and others on – wait for it – ‘happiness’. Happiness research has grown massively in the last twenty years, all the result of people putting together surveys asking people how happy they are on a scale of one to whatever.

Indeed, there is something now called a World Happiness Report. You should click and check it out. How do they measure happiness, you ask? Here is what they say:

Our happiness ranking is based on a single life evaluation question called the Cantril Ladder:

    • Please imagine a ladder with steps numbered from 0 at the bottom to 10 at the top.

    • The top of the ladder represents the best possible life for you and the bottom of the ladder represents the worst possible life for you.

    • On which step of the ladder would you say you personally feel you stand at this time?

Is that not awesome? But the really cool thing is that, because the steps on the ladder are numbered from 0 to 10, you can treat happiness like any other quantity, you know, like pints of milk. ‘Joe says he’s on step 4, so he has 4 units of happiness.’ That means you can add happiness numbers together and find averages and standard deviations, and then you can publish reports that say things like this:

Finns have an average life evaluation score of 7.736 (is that precise or what?), which is much better than the Germans with their average of 6.753.

Then you can do endless amounts of statistical operations in which you toss in other numbers like GDP and average temperature in March and the percentage of people who live in urban areas and average life expectancy and daily minutes watching TV in October and... do you not see it, children, there is no end to the set of possible papers one can write. Indeed, we can figure out why those damn Germans are so much grumpier than Finns.
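For anyone who wants the machinery laid bare, here is a minimal sketch in Python of the arithmetic these rankings rest on. The numbers are entirely made up for illustration (they are not from any real survey): treat each ladder answer as a cardinal quantity, average within countries, and then fit a line through the averages and whatever covariate is handy.

```python
# A minimal sketch with hypothetical numbers, not real survey data.
# Ladder answers are treated as cardinal quantities, averaged by country,
# and then regressed on whatever covariate is at hand.
from statistics import mean, stdev, linear_regression

# Hypothetical 0-to-10 Cantril Ladder answers from a handful of respondents.
ladder_answers = {
    "Finland": [8, 7, 9, 7, 8],
    "Germany": [6, 7, 5, 7, 8],
    "Elsewhere": [5, 6, 6, 4, 7],
}

country_means = {c: mean(xs) for c, xs in ladder_answers.items()}
country_sds = {c: stdev(xs) for c, xs in ladder_answers.items()}
print(country_means)  # these are the numbers reported to three decimals in the rankings
print(country_sds)

# Toss in another number (here: hypothetical log GDP per capita) and fit a line
# through the country averages. This is the template for an endless stream of papers.
log_gdp = {"Finland": 10.9, "Germany": 10.8, "Elsewhere": 9.5}
countries = list(ladder_answers)
fit = linear_regression([log_gdp[c] for c in countries],
                        [country_means[c] for c in countries])
print(f"slope={fit.slope:.2f}, intercept={fit.intercept:.2f}")
```

Every step of that is routine statistics; the only thing it quietly assumes is that Joe’s step 7 and Priya’s step 7 are the same amount of the same thing.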

All you have to do is never ever think about this question: If I put myself on step 7 of that ladder, is there any reason to believe that makes me in any important way similar to someone in Sri Lanka (or Manitoba, or down the street for that matter) who also thinks they are on step 7?

If that question occurs to you, Mr./Ms. researcher, just put it out of your mind, and go write another paper.

It is not just in this research on so-called ‘happiness’ that what people say is taken as objective truth that can be quantified and analyzed. One of the first articles I posted on this site was about some local researchers who claimed they wanted to learn how much discrimination was being experienced by recent immigrants, so – they asked them. When it turned out that non-immigrant white people claimed to be discriminated against more than immigrants, it did not occur to the researchers to consider the possibility that their survey responses were not telling them about actual discrimination. Onward, ever onward, there are (discriminatory) dragons to be slain, dammit.

And, think about it. If we care about discrimination, we care about how people behave, about what they do. We want to know if people are being cursed at due to their background or race, or if they are being denied jobs. But damn, that is very hard to discover, whereas if we just accept that whenever people say these things happened then they did – voila! We can really get somewhere.

While it is true that the internet age has made doing surveys easier, the problems with survey research are not new. One problem with which I am very familiar is the bias in surveys of voting behavior. If you ask a random sample of adults whether they voted in the last federal election, you will invariably get something like 80%+ of them saying they did, even though we know only some 60% of eligible adults actually voted.

There are a number of reasons for this. First, people feel that they should vote, and are reluctant to admit they did not, even to a pollster. Second, and probably more important, the sample in any voluntary voting survey is not truly representative, in that people who willingly answer surveys are also more likely to vote. Or, put another way, someone who didn’t vote is more likely than someone who did to tell a pollster to piss off, or to not fill out the online questionnaire.
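To see how that arithmetic plays out, here is a small back-of-envelope sketch. The response and fibbing rates below are assumptions I chose purely for illustration; the point is only that differential response plus a little face-saving is enough to turn 60% actual turnout into something like 80% reported turnout.

```python
# Back-of-envelope sketch with assumed rates, chosen only for illustration.
actual_turnout = 0.60           # share of eligible adults who actually voted
p_respond_voter = 0.50          # assumed: voters agree to answer the poll at this rate
p_respond_nonvoter = 0.30       # assumed: non-voters are likelier to tell the pollster to piss off
p_nonvoter_claims_voted = 0.25  # assumed: share of responding non-voters who say they voted anyway

# Composition of the people who actually end up in the sample.
voters_in_sample = actual_turnout * p_respond_voter
nonvoters_in_sample = (1 - actual_turnout) * p_respond_nonvoter
share_voters = voters_in_sample / (voters_in_sample + nonvoters_in_sample)

# Reported turnout: genuine voters in the sample plus non-voters who claim they voted.
reported_turnout = share_voters + (1 - share_voters) * p_nonvoter_claims_voted
print(f"actual {actual_turnout:.0%}, reported {reported_turnout:.0%}")  # roughly 60% vs 79%
```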

What got me started on this ‘sayin’ stuff’ topic was a post on the stats blog titled ‘Junk Science Presented as Public Health Research’. This was written by good old Gelman about a paper published in a journal sponsored by The Journal of the American Medical Association (although not in JAMA itself). The paper claimed that in a survey of 10,000 US adults, some 7% reported being injured in or otherwise present at a mass shooting. I mean, JAMA, right? Must be true.

Gelman the statistician pointed out that, given what we know about the actual number of such shootings, that 7% claim would require the average number of people present at mass shootings over the last 50 years to be some 700, which is implausible. Some mass shootings have thousands present, like the one in Las Vegas in 2017, but that is a huge outlier, considering that ‘mass shooting’ is defined as 4 or more people being hit by gunfire.
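That is pure back-of-envelope arithmetic, and easy to reproduce. The inputs below are my own rough assumptions (roughly 260 million US adults, and a cumulative count of mass shootings over 50 years on the order of 25,000, which depends heavily on the definition used); the point is the order of magnitude, not the exact figures.

```python
# Rough order-of-magnitude check of the kind Gelman makes.
# All inputs are illustrative assumptions, not data from the paper or the blog post.
us_adults = 260_000_000
share_claiming_presence = 0.07    # the survey's 7% figure
mass_shootings_50_years = 25_000  # assumed order of magnitude; depends on the definition

people_implied_present = us_adults * share_claiming_presence
avg_present_per_shooting = people_implied_present / mass_shootings_50_years
print(f"{avg_present_per_shooting:,.0f} people present at the average mass shooting")
# roughly 700, which is hard to square with 'mass shooting' meaning 4 or more people hit
```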

Well, this led me to an even worse example in another JAMA Network paper, this one from 2024, titled ‘Prevalence and Risk Factors of Depression and Posttraumatic Stress Disorder After a Mass Shooting’.

You can read the whole thing here, free. Worth every penny.

This paper surveyed people who were actually present at that shooting at the Route 91 Harvest Music Festival in Vegas in 2017, in which 60 people were killed. They contacted 1000 people on a list of those who were present (there were tens of thousands in attendance), got a final sample of 177 people, and concluded from the answers those 177 people gave to their survey that ‘eighty-seven participants (49.2%) reported past-year MDEs [Major Depressive Episodes], while 112 (63.3%) reported past-year PTSD’.

All fine so far, but in their conclusions they write ‘We documented a high burden of MDE and PTSD among witnesses and survivors of the Las Vegas MVI [Mass Violence Incident].’

No, folks, you did not. You documented that this was true in your small and certainly biased sample. Think about it, readers. Some 800 of the 1000 people contacted did not reply, and who is more likely to respond to a survey about this incident? Those who feel they are suffering from depression and PTSD, or those who have moved on and are doing ok?
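To put some numbers on how little that response rate pins down, here is a simple bounding exercise using the paper’s own counts: the 823 people who never replied could in principle fall anywhere, so the true prevalence even among the 1,000 contacted is only bracketed, not measured.

```python
# Bounding exercise using the paper's reported counts.
contacted = 1_000
responded = 177
ptsd_among_respondents = 112   # the 63.3% figure reported in the paper

nonrespondents = contacted - responded
lower = ptsd_among_respondents / contacted                     # if no non-respondent had PTSD
upper = (ptsd_among_respondents + nonrespondents) / contacted  # if every non-respondent did
print(f"PTSD prevalence among the 1,000 contacted: between {lower:.1%} and {upper:.1%}")
# between 11.2% and 93.5%; and the 1,000 contacted are themselves a tiny,
# non-random slice of the tens of thousands who were actually there
```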

Given that bias, and despite what they titled the paper, this work tells us nothing about the ‘prevalence’ of MDEs or PTSD in the set of people actually present at that shooting.

The researchers do eventually note that “Study limitations include a small sample size, predominantly female respondents, and a low response rate, all of which may affect generalizability.” Sample bias? Not even mentioned.

Besides, generalizability to what? There is no reason to believe your results tell us anything about the thousands of people at that event, period. And of course, after that brief and inadequate cautionary note, they go on to compound their error:

“This study found that witnesses and survivors of the Las Vegas MVI continued to have substantial mental health challenges even 4 years later, emphasizing the need for sustained mental health support.”

Isn’t it funny how leaving out the word ‘some’ in that sentence makes it seem so much more... important?

Let me emphasize my point. Gelman called the paper that asserted 7% of people in the US have been at a mass shooting ‘junk science’ because there was no reason to believe that 7% number was actually true in the entire US population. As he writes, what the researchers know is only that 7% of the people they asked said they were present. The characters in this PTSD paper cannot even claim that their results are representative of the population of people who were present at the Vegas shooting. They only know what 177 people from their biased sample said about having PTSD.

I will finish this by going back to that original JAMA paper Gelman dissed in his Stats blog. Along with taking the researchers to task for doing ‘junk science’, Gelman also suggests that a study of how many people might have been present at some mass shooting somewhere doesn’t even belong in a public health journal, and that it was published there to push a political agenda.

Some of his readers disagreed with this point, and he eventually caved and wrote that maybe the study was legitimately about public health.

I think he was right the first time. It makes sense to publish a paper on mass shootings in a public health journal if the methods used to deal with public health problems might be useful in dealing with mass shootings. If that makes sense to you, then think back to, oh, 2020, and consider the methods advocated by public health officials to deal with that problem. Clearly, a serious public health problem is sufficient to require all kinds of draconian measures, including the suspension of civil liberties.

imho, there definitely is a political agenda here. Same goes for alcohol consumption, by the way. Big public health problem, dontchaknow.

Just because I’m paranoid doesn’t mean people are not trying to mess with me.