
Motivational Workshops are Good For You. Really. Stanford Says So

I mentioned in my last post I had been reading about academic fraud, and I was. I do. Frequently. I didn’t post anything on it, but after posting my last piece I came upon some material I had dug up a while back on a case that is, well, improbable. Not fraud, or so say the experts who know more than I, but research that makes you wonder ‘What were these people thinking?’

The story starts with an improbable figure: Tony Robbins. You may remember him: motivational speaker, all over TV at one point, good-looking guy, great teeth.

He was on OWN for a while and wrote several books, including Unlimited Power: The New Science of Personal Achievement (1986). He also went through some rough times, having been accused of sexual misconduct in a 2019 BuzzFeed report. I don’t know what came of that, and it is quite separate from his role in our current story.

A group of researchers, some from Stanford’s Genetics Department and its Stanford Health Innovation Lab (SHIL), undertook some research projects in which Robbins’s seminars figured. I don’t know how many projects there were in total, or how many papers they got published on them; here I will focus on two I do know about:

Non-traditional immersive seminar enhances learning by promoting greater physiological and psychological engagement compared to a traditional lecture format  

This was published in the journal Physiology and Behavior in 2021, and you can read it here if you like. I’ll refer to this as the learning paper. The other is

Effects of an immersive psychosocial training program on depression and well-being: A randomized clinical trial

Published in The Journal of Psychiatric Research in 2022; you can read it here. I’ll refer to this as the depression paper.

Each paper lists nine separate author-researchers, five of whom appear as authors on both papers. Michael Snyder, the Director of SHIL, is listed only on the depression paper, but other authors on the learning paper are from the Stanford Dept of Genetics.

The learning paper’s abstract starts with this sentence:

“The purpose of this study was to determine the impact of an immersive seminar, which included moderate intensities of physical activity, on learning when compared to traditional lecture format.”

In fact the researchers observed 26 people, 13 in an ‘immersive treatment’ group and 13 in a control group, all 26 of whom went through a two-day seminar. The difference between the two groups is described as follows:

“For the IMS group, participants were provided with a hard copy of lecture material which was presented at UPW with a combination of state elevation sessions (jumping, shouting, fist pumping, and high five behaviors) which were conducted approximately once every hour to raise arousal and to interrupt sedentary behavior, as well as mindfulness mediation[sic] that focused on a wide variety of awareness, affective states, thoughts and images [35] that were conducted once at the end of each day.”

The control group got the same seminar and hard copy of material, but without the meditation and ‘elevation sessions’.

What does this have to do with Tony Robbins? You will find his name in the paper, sort of, at the very end –

Funding

This study financially supported by Robbins Research International Inc.

What this does not mention is that the ‘lectures’ the research subjects attended were in fact a two-day Robbins seminar, one of those things one pays a lot for.

Still, nothing wrong that I can see with someone like Robbins sponsoring research to see if one of two different approaches leads to better learning outcomes in his seminars. However, because all the subjects are people who paid a lot to be in this environment, one can’t really claim they are representative of the general population, so one has no reason to think that the results of this study tell us much about the impact of such ‘immersive’ learning in other situations.

Beyond that, there are only 13 research subjects in each group. That is a rather small number, which makes one wonder whether the results mean much. Here’s what the researchers say they found:

“The primary findings of this study were that learning was greater in the IMS compared to the CON as the increased performance on the exam was sustained 30-days post event when compared to CON, which decreased 30-days post event.”

Ok, but the question that comes to me at this point is – why did these well-placed researchers bother with this? The paper’s Conclusion section actually notes that “Previous studies have shown that physical activity can promote learning in traditional classrooms [2,3].” They then go on to note that those previous studies utilized different types of exercise. Whoop-de-do. I would think researchers at this high level would be interested in research that could move the needle more than that. I get that Robbins (maybe only partly) funded the work, but I doubt these people are hard up for research funding. So, in the end my only question is really: why and how did these researchers get involved with Robbins at all, and on such a mediocre project?

Hang on. I came upon this in Gelman’s statistics blog, which I read regularly and mention here often. He didn’t take these guys on; all he did was reprint some material from an article in The San Francisco Chronicle which, based on the bits I’ve been able to read (paywall), was really hard on these researchers and this research.

As I noted, Robbins had been in the news not long ago for bad behavior, so if the learning paper was the extent of this Robbins/researchers interaction, I would likely chalk this up to no more than a newspaper thinking it can score points by embarrassing some Stanford researchers for their association with Robbins, and take another strip off Robbins himself in the process. Nothing (much) to see here.

But then there’s the depression paper.

Again, Robbins’ name appears only once, at the end of the paper, like this:

“This study was not funded by Robbins Research International; however, they did allow participants to participate in the DWD program at no charge. They also provided housing for two research coordinators who stayed on site during the trial.”

DWD here refers to Date With Destiny, which is a Tony Robbins seminar that the paper describes as follows:

“…a six-day immersive training program that includes a subsequent 30-day daily psychosocial exercise follow-up period. DWD is popular with thousands of people using this intervention annually. The program combines a variety of lifestyle and psychological approaches that seek to improve well-being, including cognitive reframing, guided meditation and visualization, neurolinguistic programming, gratitude, goal setting, guided hypnosis, community belonging and engagement, and exercise. Although components of the program such as exercise, gratitude, and cognitive reframing have independently been found to improve mental health and wellness (Goyal et al., 2014; Kvam et al., 2016; Mikkelsen et al., 2017; Schuch et al., 2016), the effectiveness of the DWD program has not been investigated.”

No mention anywhere of the fact that this is an ongoing Tony Robbins (money-making) seminar series for which people (although not the subjects of this study) pay big bucks to attend, although people familiar with the Robbins operation would recognize the DWD name. Here’s what the Robbins website says about DWD:

Create life according to your terms

Dive deep into the patterns that are holding you back, ignite your motivation, and build momentum toward the life of your dreams.

I guess that’s kinda ‘mental health and wellness’, right?

The researchers describe their experiment as follows:

“A randomized clinical trial was conducted in which 45 participants were randomized at 1:1 ratio to DWD (n = 23) or a gratitude journaling control group (n = 22) (Fig. 1). Depressed individuals (n = 27), as assessed by the Patient Health Questionnaire-9 (PHQ-9; see below), and those without depression (n = 18) were recruited by email, flyers, and physician referral in the U.S.”

At least there are more subjects this time, right? Ah, no. In fact there were 14 depressed subjects assigned to the seminar and 13 assigned to the ‘gratitude journaling’ control group. The other 18 subjects, 9 in each group, were not designated as depressed according to their own original responses on the above-mentioned PHQ-9 questionnaire, and as we shall see, it is the originally depressed subjects who are the stars of the show.

That’s because the big question that is being asked here is: does attending a Tony Robbins six-day DWD event reduce depression?

Can anyone guess what answer Mr. Robbins would like to hear to that question?

And indeed, that is the answer we get – in spades. Here’s the payoff, from the paper’s Abstract:

“Seventy-nine percent (11/14) of depressed participants in the intervention condition were in remission (PHQ-9 ≤ 4) by week one and 100% (14/14) were in remission at week six.”

In remission here means that on that self-reported questionnaire, the scores generated by their answers were below the threshold at which they are considered depressed. In other words, the Tony Robbins DWD seminar has a 100% cure rate for depression.

One need not even ask how the control group did, my god, all the depressed people were cured by the treatment. Huzzah!

More times than I could count, I have written in this blog, ‘If it seems too good to be true, it ain’t true’.

Reasons for skepticism are many and varied.

In clinical trials of anti-depressants, typically something like half of participants report ‘feeling better’ after six to eight weeks. A paper in the Lancet I dug up said that 62% of adults reported ‘improvement’ in depression after psychotherapy, at varying time frames.

But DWD – 100% cure rate. Go, Tony.

There is once again the small sample issue here, just as in the learning paper, but to this under-educated economist, the subjects themselves are the big question mark. These are people who self-reported being depressed, went to a DWD seminar for six days for free and then were asked again afterward about how they felt. Ya think maybe they were inclined to believe in the power of DWD?

The bits of the SF Chronicle article that are quoted on Gelman’s blog suggest that some of the SHIL researchers knew and were fans of Robbins before the research started. I can’t say anything about that, and I don’t think you need to know that to wonder about the results in the depression paper.

A final note. If you do go to download the depression paper, you will find that in 2024 the journal also published a Corrigendum on the original 2022 paper, with the same nine authors on it. A corrigendum is published when a mistake is found in a published paper but the mistake is thought not so egregious as to warrant retracting the paper completely. This one notes that there was an error in calculating the post-treatment PHQ-9 score for one of the treatment subjects, and as a result, the cure rate was ‘really’ only 93%.

Most interesting in this corrigendum, all two pages of it, is the following paragraph, which I quote:

“Finally, we note that after the article was first made available online on March 9th, 2022, Dr. Snyder became a co-founder of a startup, Marble Therapeutics, on July 12th, 2022. Mr. Robbins later invested in Marble Therapeutics on September 26th, 2022, three months after the final version of the article was published. We do not believe there was a conflict at the time this work was done, but nevertheless wish to note this relationship.”

There’s that other thing I often write: can’t make this shit up.

Speeding Rich People

This post is about something interesting that I just read about in an academic paper (which you can also read here). It’s titled ‘How Do People React to Income-Based Fines? Evidence from Speeding Tickets Discontinuities’ and I know, ‘interesting’ and ‘academic paper’ are not supposed to show up in the same sentence…

It turns out that in Finland, according to the author of this paper, the fine one pays if caught driving at more than 20km/h over the posted limit depends on your income. If your income is below a set cut-off, you pay a set fine, but if it is above that cut-off, then the size of the fine increases with your income. The paper includes the following sentence – ‘For example, in 2019, the Police assigned NHL ice hockey player Rasmus Ristolainen an income-based speeding ticket equal to approximately 120,000 euros.’

Ouch, eh? I do wonder if it was the police who set that fine or a magistrate, but either way, that’s a big speeding ticket.

One question to ask here is why Finland does that. The obvious – but still perhaps wrong – answer would be that Finland’s government operates under a pretty egalitarian ethic, which might be seen to imply that rich folks should pay more for everything, including breaking the law. However, the author is interested in this system not for that reason, but because a well-known theory of crime deterrence, due to now-dead economist Gary Becker, would predict that with this speeding fine structure, one should find that a lot of speeders are caught doing 19km/h or less over the limit. That is, if you bar-graphed the number of tickets given for being 1km/h, 2km/h, … 15km/h, … 19km/h, 20km/h, 21km/h, … 30km/h and so on over the limit, then the bars at 18 and 19 should be noticeably higher than the other bars. Put simply, no one wants to get dinged for doing 20 or 21 over when one could incur a much smaller fine by slowing down just a little.
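To make that bar-graph test concrete, here is a minimal sketch (in Python, with made-up ticket counts, nothing from the paper’s data) of the tabulation Becker’s prediction calls for: count tickets at each speed over the limit and check for excess mass just below the 20km/h threshold.

```python
from collections import Counter

# Hypothetical ticket data: how many km/h over the limit each ticketed driver was going.
# In the real study this would come from the Finnish ticket records.
tickets_kmh_over = [12, 17, 18, 18, 19, 19, 19, 19, 21, 22, 24, 25, 28, 30, 34]

counts = Counter(tickets_kmh_over)

# Becker-style bunching check: compare the number of tickets just below the
# 20 km/h threshold (where the income-based fine kicks in) with the number just above.
just_below = counts[18] + counts[19]
just_above = counts[20] + counts[21]

print("Tickets at 18-19 km/h over:", just_below)
print("Tickets at 20-21 km/h over:", just_above)
print("Bunching just below the threshold?", just_below > just_above)
```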

Unfortunately for Becker, not that he cares now, this does not turn out to be the case, according to the paper. One might instantly say that maybe that’s because most speeders have incomes below that cut-off, and hence don’t face the steep increase in fine for doing 20km/h or more over. But the author has really good data, including data on the speeders’ incomes, and so can tell that the bunching does not happen even among drivers with incomes above the cut-off.

What they do find is that the higher income drivers who get pinched with the higher fine tend to slow down afterwards. Quoting from the paper:

“Those assigned, on average, a 200 euro larger fine are approximately 2-3 percentage points less likely to commit another traffic crime in the following 4-8 months. Compared to the average speeding behavior of the speeders who receive a smaller fixed fine, this estimate implies a 15-20 percent reduction in recidivism.”

I should stress that this is a research paper that has not yet been peer-reviewed, and I am not about to vouch for how well the statistical and data work was done.

I will, however, suggest that the Becker prediction of bunching would not be expected to hold in this instance – at least not by me – given that those who pay the higher fines are, by definition, folks with higher incomes, who therefore might be expected to be willing to pay higher fines. It is basic to economic theory that people with higher incomes have – other things being equal – a higher willingness to pay for most things, including driving fast. Indeed, one might expect that higher-income people put a higher value on their own time, which they can ‘save’ by driving fast. The author does not seem to consider this, as his behavioral model assumes that drivers may value speeding differently, but that any differences are purely idiosyncratic, and so not related to their income.

In fact, it occurs to me that all this really has nothing to do with the fines in Finland being based on income for richer folks, although that is what originally caught my eye. What Becker’s theory says is that people react to an increased fine by being less likely to speed. Since the fines for going above 20km/h over the limits are higher than for going less fast than that, people will go 19km/h over rather than 20 or 21. It doesn’t really matter that the amount of the fine is based on income, what matters is that it goes up at the 20km/h threshold.

Given that, one could do a cleaner test of Becker’s theory in Ontario, where speeding fines are as follows:

Less than 20 km/h over: $3.00 per km/h over.

20 to less than 30 km/h over: $4.50 per km/h over.

30 to less than 50 km/h over: $7.00 per km/h over.

Above 50 km/h over: $9.75 per km/h over.

(One also gets ‘demerit points’ added to one’s driving record for speeding. Those have consequences too, and the number of demerits you get depends on how fast you go – but the thresholds, for some reason, are different than for the fines. Go figure.)
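For concreteness, here is a rough sketch of that fine schedule as a function, based only on the numbers listed above and ignoring the demerit points and the separate stunt-driving charge; the point is simply that the fine jumps at 20, 30 and 50 km/h over, which is exactly where Becker’s theory would predict bunching just below.

```python
def ontario_speeding_fine(kmh_over: float) -> float:
    """Approximate Ontario set fine implied by the schedule above (dollars)."""
    if kmh_over <= 0:
        return 0.0
    if kmh_over < 20:
        return 3.00 * kmh_over
    if kmh_over < 30:
        return 4.50 * kmh_over
    if kmh_over < 50:
        return 7.00 * kmh_over
    return 9.75 * kmh_over  # 50+ over also means a court date and a much bigger stunt-driving fine

# The discontinuities at the thresholds are what generate the bunching prediction:
for speed in (19, 20, 29, 30, 49, 50):
    print(f"{speed} km/h over -> ${ontario_speeding_fine(speed):.2f}")
```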

Thus Ontario speeding fines depend on how fast you go, but not on your income, and there are three thresholds where the fines jump. Becker’s theory would predict bunching below 20, 30 and 50km/h over in response to this fine structure. And doing more than 50km/h over will get you charged with a second offense, ‘stunt driving’, which involves a whole additional set of penalties, including a fine of $2k to $10k (set by a judge, not your income, because if you go that fast, you have to go to court). Another reason for Becker’s theory to predict bunching under 50km/h.

Thus one could test Becker’s deterrence theory in Canada without any need for data on the incomes of the speeders. (You’d have to take into account the effect of the demerit point thresholds, too. Complicated.) Which would be good for anyone who set out to do this, because I’m pretty sure that, unlike in Finland, such data does not exist. The Finns (and the Danes, I think) seem to be happy with their government collecting all kinds of data on them, whether they speed or not. Canadians and Americans, not so much.

So far as I know, Becker’s theory has not generally done all that well when confronted with data about actual criminal behavior. A couple of economists did a review in 2014 of the empirical (that is, data-driven) research on criminal deterrence, and this is from the Abstract of the paper they wrote about what they found in their review (which you can download here):

“While there is considerable evidence that crime is responsive to police and to the existence of attractive legitimate labor market opportunities, there is far less evidence that crime responds to the severity of criminal sanctions.”

That response is central to Becker’s theory, and a theory that predicts behavior that does not seem to occur out in the world is what is known as a bad theory. That does not prevent it from being quite famous, however – among economists, at least.

 

What Does a 226% Improvement Smell Like?

Let me first express my thanks to Andrew Gelman of Columbia whose blog (see above) first brought this to my attention, so I can bring it to yours.

This is another of those ‘if it seems too good to be true it probably is’ research papers that I so love. The paper is titled ‘Overnight olfactory enrichment using an odorant diffuser improves memory and modifies the uncinate fasciculus in older adults’ and it was published in Frontiers in Neuroscience in July of 2023.

It reports on a study in which participants – all elderly, like me – were given scent diffusers to take home and use for two hours each night, starting when they went to bed – the diffusers automatically shut off two hours after they were started. The participants were given cognition tests and a functional MRI (that’s where the uncinate fasciculus bit in the title comes from) before they started the experiment and again six months later, after they had used the diffusers for that period of time.

The ‘treatment group’ got a set of 7 genuine essential oils to use in their diffusers, while the control group got ‘de minimis amounts of odorant’ according to the researchers. (Do you suppose those in the Control Group noticed that their diffusers produced no scent? But I digress.)

In the end there were a total of 43 participants in the two groups, and the headline result of this research was, quoting the paper,

“A statistically significant 226% improvement was observed in the enriched group compared to the control group on the Rey Auditory Verbal Learning Test…”

Pretty impressive, eh? Six months of smelling essential oils for two hours/night at bedtime, that’s all it took to get that huge improvement in Auditory Verbal Learning.

Anyone smell anything?

Here are some facts about this ‘controlled’ experiment. First, 43 participants? A bit more than 20 in each group? That is what statisticians call a small sample. But wait, there’s more. If you look at the flow chart of how they recruited and screened participants for this study, you find that 132 subjects passed the initial screening. Of these, only 68 were included in the Control and Treatment groups that were used in the statistical analysis of the results, and of those 68, 25 dropped out during the study. That leaves the 43 whose results are reported on, of the 132 who passed the screen.

Smell anything yet? Why did those 25 people drop out? That’s 25 of the 68 who were put into the two groups – about 37%. What does that dropout rate imply for the credibility of the results? People don’t drop out randomly; they do it for reasons.
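For the record, the attrition arithmetic, using just the counts from the flow chart described above:

```python
passed_screening = 132
put_into_groups = 68     # control + treatment, as used in the analysis
dropped_out = 25
analyzed = put_into_groups - dropped_out   # the 43 whose results are reported

print(f"Dropout rate: {dropped_out / put_into_groups:.1%} of the 68")
print(f"Analyzed: {analyzed} of the {passed_screening} who passed screening "
      f"({analyzed / passed_screening:.1%})")
```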

And, as one of the readers of the Gelman blog pointed out, that 226% improvement claim comes from the control group scoring on average 0.73 points worse post-treatment than pre-treatment on a particular test, while the treatment group scored 0.92 points better on average. So you have a difference of 1.65 points in the two groups’ average ‘improvement’ on the test, and 1.65 is 2.26 times 0.73.

Interesting arithmetic. I think Gelman’s reader is right, as that 226% number doesn’t come up anywhere else in the paper. However, note that 1.65 being 2.26 times 0.73 is not the same as 1.65 being a 226% ‘improvement’ over 0.73. The latter would require 1.65 to be more than three times as large as 0.73 (0.73 plus 226% of 0.73 is about 2.38), and it is not. Neuroscientists don’t do a lot of basic arithmetic, I guess. That detail aside, just looking at the difference in average scores for the two groups – what does a ‘point’ mean in this context, anyway? Is it big? How ‘big’ is a 1.65 point improvement on this test? What does that actually translate into, memory-wise? The researchers do not say.
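If you want to check the arithmetic yourself, here it is, assuming (as Gelman’s reader suggests) that the 226% figure comes from those two group averages:

```python
# Average change in test score, post minus pre, in points on the memory test.
control_change = -0.73    # control group got slightly worse on average
enriched_change = 0.92    # 'enriched' group got slightly better on average

difference = enriched_change - control_change        # 1.65 points
ratio = difference / abs(control_change)             # 1.65 / 0.73 ≈ 2.26

print(f"Between-group difference: {difference:.2f} points")
print(f"Ratio to the control group's change: {ratio:.2f} (the apparent source of '226%')")

# What a genuine 226% improvement over 0.73 would have to be:
print(f"0.73 increased by 226%: {0.73 * (1 + 2.26):.2f} points")
```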

One last thing. At the end of the paper there is a section called Funding, under which is written ‘This research was sponsored by Procter and Gamble.’

There, now you smell it, eh?

I know nothing about the journal Frontiers in Neuroscience, but if their editors did not stop and wonder about that 226% claim, well, I don’t think I’ll subscribe. Actually, it’s an open access journal so I can read it for free, as I did this smelly article. But I won’t. Read the journal, I mean. Unless maybe I’m looking for something else to write about here.

The impact of strip clubs on sex crimes

I ended a post back on April 25 with the following question:

Does the presence of bricks-and-mortar adult entertainment establishments have a positive, negative, or no effect on the commission of sex crimes in the surrounding neighborhood?

I then asked you to consider what sort of data would be required to provide credible evidence as to the correct answer to that question.

Fair warning, this is going to be a longish article, but I would suggest that a credible answer to the first question above has some social value. And, full disclosure, this post is part of my ‘Studies show’ inoculation campaign.

‘Swat I do.

I do think the answer to this question is of more than passing interest.

If the presence of adult entertainment establishments (aees, henceforth) like strip clubs and such could be shown to reduce the incidence of sex crimes like sexual assault and rape, this might be counted as a reason to allow them to operate. If, on the other hand, they are associated with an increase in such crimes, then that is a reason to ban them entirely. The ban/allow decision for aees is of course complex, and other factors may also be important (e.g., links to organized crime, drug use). Still, the answer could be a significant input into city policy-setting on such places.

More disclosure, this is not a very original post. I got wind of all this reading Andrew Gelman’s Statistical Inference blog back when. However, he didn’t dig into the details much. I have, and I think it is another nice illustration of an important principle: if it sounds really good, be skeptical.

Ok, then – our story begins with a paper by two economists titled “The Effect of Adult Entertainment Establishments on Sex Crime: Evidence from New York City”, written by Riccardo Ciacci and Maria Micaela Sviatschi and published in 2021 in The Economic Journal, a well-respected outlet in my old discipline.

The following sentences from the Abstract of their paper lay out what they find –

“We find that these businesses decrease sex crime by 13% per police precinct one week after the opening, and have no effect on other types of crime. The results suggest that the reduction is mostly driven by potential sex offenders frequenting these establishments rather than committing crimes.”

Trust me, if true, that’s a big deal. A 13% reduction on average, and in the first week after the aees open.

Social scientists rarely find effects of that size attributable to any single thing. That’s huge. One might even venture to say – unbelievable.

It is not surprising that The Economic Journal was happy to give space in its pages to publish these results. And, coming back to what I wrote above, what city politician could ignore the possibility that licensing aees in their jurisdiction might reduce sex crimes by 13%?

To dig deeper we return to the ‘extra credit’ question I posed in that post of April 25 – what kind of data would one need to answer the question?

Well, you need to be able to make a comparison of sex crime numbers between areas where aees operate, and areas where they do not. An obvious possibility is to find two political jurisdictions such that one contains aees, and the other, perhaps due to different laws, does not. Then you can compare the incidence of sex crimes in those two jurisdictions and get your answer.

That approach is just fraught with difficulties, all following from the fact that the two jurisdictions are bound to be different from one another in a whole host of ways, any one of which might be the reason for any sex-crime difference you find. Demographics, incomes, legal framework, policing differences, the list goes on and on. You can try to account for all that, but it’s very difficult, you need all kinds of extra data, and you can never be certain that any difference you find can actually be attributed to the presence/absence of aees.

The alternative is to look at a single jurisdiction, like NYC, and find data on where aees operate and where they do not. Now NYC is a highly heterogeneous place – it’s huge, and its neighborhoods differ a lot, so it sort of seems like we’re back to the same problem.

However, suppose you can get data on when and where aees open and close in NYC. Then, you have before and after data for each establishment and its neighborhood. If an aee started operating in neighborhood X on June 23, 2012, you can then look at sex crime data in that area before and after the opening date. You still want to assure yourself that nothing else important in that neighborhood changed around that same time, but that seems like a doable thing.
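As a toy illustration of that before/after idea (my own made-up weekly counts, and nothing like the econometrics the authors actually run), the comparison for a single establishment and its precinct would look something like this:

```python
from datetime import date

opening_date = date(2012, 6, 23)   # the hypothetical opening in neighborhood X

# Hypothetical weekly counts of reported sex crimes in that precinct,
# keyed by the first day of each week.
weekly_counts = {
    date(2012, 6, 4): 5,
    date(2012, 6, 11): 4,
    date(2012, 6, 18): 6,
    date(2012, 6, 25): 3,
    date(2012, 7, 2): 4,
    date(2012, 7, 9): 3,
}

before = [n for week, n in weekly_counts.items() if week < opening_date]
after = [n for week, n in weekly_counts.items() if week >= opening_date]

print("Average weekly count before opening:", sum(before) / len(before))
print("Average weekly count after opening: ", sum(after) / len(after))
```

The real work, of course, is in convincing yourself that nothing else important changed in that precinct at the same time.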

This is pretty much what our economists did, as we will see, but there is still another issue: data on sex crimes.

All data on criminal activity carries with it certain problems. Data on arrests and convictions for crimes is generally pretty reliable, but crimes are committed all the time for which no arrests are made and/or no convictions occur. Still, the crimes occur, and for the purposes of this question, you want data on the occurrence of sex crimes, not on arrests for them.

We’ll come back to the crime data below, but I’ll start with the data on aees.

The authors note that if you are going to open a strip club in NYC there is a bureaucratic process to go through, and the first thing a potential operator has to do is register the business with a government bureau.

To quote directly from the paper:

“We construct a new data set on adult entertainment establishments that includes the names and addresses of the establishments, providing precise geographic information. We complement this with information on establishment registration dates from the New York Department of State and Yellow Pages, which we use to define when an establishment opened.”

So, the researchers know where each aee opened, and they know when. But note for later that for the ‘when’ bit they use the date of registration with the NY Department of State.

The location they get from the Department and the Yellow Pages then allows the researchers to determine in which NYPD precinct the aee is located, and that is going to allow them to associate each aee, once it opens, with crime data from that precinct.

So, what crime data do they use? As I’ve noted, such data always has issues.

Here’s one thing the economists say about their crime data.

“The crime data include hourly information on crimes observed by the police, including sex crimes. The data set covers the period from 1 January 2004 to 29 June 2012. Since these crimes are reported by the police, it minimises the biases associated with self-reported data on sex crime.”

Ok, hold on. ‘Crimes observed by the police’? What does that mean? How many of the people arrested for or even suspected of a crime by the police had that crime observed by the police? Speeders and stop-sign ignorers, perhaps. But burglaries, murders, and – the point here – sexual assaults or rapes? How often are those crimes observed by police?

The vast majority of crimes come to light and are investigated by police on the basis of a report by a private citizen. In the case of sex crimes, most often a victim is found somewhere or comes to the police after the crime has occurred, inducing police to begin an investigation.

This sentence from the paper clears things up… a bit.

“We categorise adult entertainment establishments by New York Police Department (NYPD) precincts to match crime data from the ‘stop-and-frisk’ program.”

Ah. You may remember NYC’s (in)famous ‘stop and frisk’ program of several years (and mayors) ago. NYPD officers would stop folks on the street and – chat them up. Ask questions of various kinds, and then fill out and turn in a card that recorded various aspects of the encounter. As we will see below, virtually none of these s-a-f encounters resulted in a report of a crime or an arrest.

So….’crime data’? From stop and frisk encounters? Need to know a lot more about how that data was used.

And we shall, but let’s go back to the other key bit of data – where and when aees opened in NYC. The date used for an aee’s ‘opening’ is, according to the quote above, the date on which the establishment was registered with the NY Dept of State.

Can you think of any establishment that needs a city or health or any other license to operate, that actually starts serving customers the day after it files the licensing paperwork?

To be sure, I have never operated a business, but I don’t think that can possibly be how it works. For one thing, how many different licenses do you suppose a strip club needs to operate at all? A health inspection, a liquor license, a fire inspection, a building safety certificate…?

This is not a detail, because the BIG headline this paper starts with is that a strip club reduces the number of sex crimes in the precinct in which it is located in the first week of operation. If the researchers are using the date of registration to determine when that first week was – there’s a problem.

Ok, time to let the rest of the cats out of the proverbial bag. I mentioned above that I came upon this research through a post on Gelman’s blog in which some folks expressed considerable skepticism about the economists’ findings. Those skeptics are, to give credit where due:

Brandon del Pozo, PhD, MPA, MA (corresponding author); Division of General Internal Medicine, Rhode Island Hospital/The Warren Alpert Medical School

Peter Moskos, PhD; Department of Law, Police Science, and Criminal Justice Administration, John Jay College of Criminal Justice, New York

John K. Donohue, JD, MBA; Center on Policing, Rutgers University

John Hall, MPA, MS; Crime Control Strategies, New York Metropolitan Transportation Authority Police Department

They lay out their issues with the paper in considerable detail in a paper of their own titled:

Registering a proposed business reduces police stops of innocent people? Reconsidering the effects of strip clubs on sex crimes found in Ciacci & Sviatschi’s study of New York City

which was published in Police Practice and Research on May 3, 2024.

This post is already quite long, so I am going to just give you the two most salient (in my opinion) points that are made by the skeptics in their paper.

First, as to the economists’ ‘sex crime’ data:

“The study uses New York City Police Department stop, question and frisk (SQF) report data to measure what it asserts are police-observed sex crimes, and uses changes in the frequency of the reports to assert the effect of opening an adult entertainment establishment on these sex crimes. These reports document forcible police stops of people based on less than probable cause, not crimes. Affirmatively referring to the SQF incidents included in the study as ‘sex crimes,’ which the paper does throughout (see p. 2 and p. 6, for example), is a category error. Over 94% of the analytic sample used in the study records a finding that there was insufficient cause to believe the person stopped had committed a crime….In other words, 94% of the reports are records of people who were legally innocent of the crime the police stopped them to investigate.”

And then, for the data on the openings of aees:

“This brings us back to using the date a business is registered with New York State as a proxy for its opening date, considering it provides a discrete date memorialized by a formal process between the government and a business. However, the date of registration is not an opening date, and has no predictable relationship to it, regardless of the type of business, or whether it requires the extra reviews necessary for a liquor license. New York City’s guidance to aspiring business owners reinforces the point that registration occurs well before opening.”

I close with the following. It turns out our four skeptics sent a comment to The Economic Journal laying out all their concerns about the original research; the Journal duly sent said commentary on to the authors, Ciacci and Sviatschi, who responded that they did not think these concerns affected the important points in their paper. So the journal not only did not retract the paper, it also declined to publish a Commentary on its findings by the four skeptics. (Econ journals do publish such Comments from time to time. Not this time.)

I mean, that would just make the original authors – and the Journal – look bad, no? The skeptics did, as we saw, eventually get their concerns into the public domain via a different publication – one read by pretty much nobody who reads The Economic Journal, I’m thinking.

Again – if it seems too good to be true…Objects in mirror may be smaller than they appear.

It’s Getting Hot Out: The Efficacy of Heat Warnings

 

Summer’s about here, and we can look forward to more of that Environment Canada staple – The Heat Warning. You know, the alerts about high temps and humidity you see on your favourite source for weather info.

I never think much about them, figuring people are pretty good at understanding when it’s hot out and what to do about it. It turns out some local researchers got to wondering whether these alerts did any measurable good.

Their work was written up in the Freeps some while back, in an article headlined:

Do hot-weather alerts help? No, not really: London researchers

– published on Aug 22, 2022.

The tag line below the title reads “Those heat alerts telling us to be careful when temperatures spike? Turns out they do little to keep people overcome by heat out of hospital, say London researchers calling for changes to make the warnings more effective.”

The Freeps reporter has the research right in this case. In the newspaper article you will find the following two paragraphs –

“The researchers compiled data on patients with heat-related illnesses who showed up in emergency rooms from 2012-18 and looked at whether their numbers dropped after the harmonized heat warnings kicked in.”

Then later –

“While there did appear to be a slight drop in heat-related emergency room visits after the provincial warning system was introduced, particularly in children and adults with chronic conditions, the results were not statistically significant, Clemens said.”

I went and read the research paper, published in The Canadian Journal of Public Health in 2022 (I’m a geek; you can read it too, here, although you will have to get past the paywall). That is indeed what the researchers say in the paper.

This research paper strikes me as reporting on potentially useful research. The Freeps article notes that “In southern Ontario, heat alerts are issued amid daytime highs of 31 C or higher, lows of 20 C or when the humidex reaches 40.” You want to put off digging that garden to another, cooler, day. Old coots like me are particularly aware of this.

But setting aside my own instincts, I am all in favor of research to determine whether government initiatives are having their hoped-for effect. My unease about the research arose from the following lines in the Freeps article, in which the lead researcher is quoted –

“This research points to the need to raise awareness of heat-related illness. I’d like to see this translate into more education and physician-level awareness . . . ,” Clemens said. “As an endocrinologist, (I) could help inform or prepare my complex patients to better protect themselves.”

Huh? Exactly how does this research point to that? These research findings say the current warning system had no impact on heat-related emergency-room visits. What is the logic leading from that useful finding to the first sentence in the quote above? And as to the second quoted sentence, by all means, go ahead and inform and prepare your patients, but what does “this research” have to do with that?

Then, at the very end of the paper, we find this:

What are the key implications for public health interventions, policy or practice?

  1. More heat alerts were triggered in Ontario between 2013 and 2018, and many cities spent more days under heat warnings. The implementation of a harmonized HWIS appeared to reduce rates of ED visits for heat-related illness in some subpopulations, but at a provincial level, the change was not statistically significant.

  2. Given HWPs are a main policy tool to protect populations against heat, we suggest ongoing efforts to support effective HWP in our communities, with a particular focus on at-risk groups.

 

The journal itself probably has a requirement – since it is a public health journal – to include in any published paper a final statement on the public health implications of the research. However, point 1 is not an implication of the research findings. It is just a restatement of the fact that the research found the warnings had no impact. Moreover, it is entirely misleading to say that the HWIS ‘appeared to reduce rates of ED visits…’ and then immediately say ‘the change was not statistically significant’. All social science research operates with the knowledge that there is a lot going on in the world that we can’t identify, or even know about, and so any difference we see in data (like differences in ER visits) might be due to random chance. Researchers can’t just say something ‘appeared’ to be different when in fact the difference was statistically insignificant.

So, why cling to the ‘we found this, but it wasn’t significant’ language? Why not just say ‘we found no impact’? That is a useful thing to find, indeed, and researchers should expect to find exactly nothing much of the time. Finding nothing advances our knowledge about the world; it is very useful to learn ‘well, that doesn’t seem to have any impact’.

Then, in implication 2 above, they write “…we suggest ongoing efforts to support effective HWP in our communities….”

C’mon folks, you just found that HWPs are ineffective in reducing ER visits, so how is ‘support effective HWP in our communities’ an implication of that finding? Particularly since nothing in your research tells anyone what an effective HWP might look like.

Having hung around with social science researchers nearly all my adult life, I will bravely put forward a hypothesis about motivations here: there is nothing that would have induced the researchers to write, instead of the two misguided points above, this implication of their research:

Our research suggests that the HWIS program and its associated HWPs be ended, and the resources involved be directed toward programs for which there is evidence of effectiveness.

That sentence never stood a chance of appearing in their paper.

 

 

Immigrant Discrimination and the Freeps

The media these days like to publish stories about academic research, but as a rule they do a bad job of it, and that is particularly true of my hometown paper, The Freeps.

An article published in the Freeps on July 31, 2023, headlined:

‘Alarming’: Study reveals hostility toward immigrants in London, region

illustrates well what I mean.

The line under that headline read: “The study, funded by the London and Middlesex Local Immigration Partnership, surveyed the experiences of 30 London and Middlesex County immigrant and racialized people.”

Now, hold on. What can one possibly conclude about anything that might be happening in London-Middlesex after talking to 30 people? The immigrant population in the area was, according to Stats Can figures reported in the actual study (more on it below), around 90,000 in 2016, and is no doubt higher than that now. Thirty is a laughably small sample of that population. However, reading on, the article goes on to say that this was ‘…a followup to a survey conducted by the same team that found about 60 per cent of those who identified as immigrants in Southwestern Ontario said they experienced some level of discrimination or racism in the last three years.’

Ok, so this suggests that at least two studies were done, a survey plus interviews of 30 local immigrants. The writer for the Freeps claims that 60% of immigrants reported discrimination in that survey. I determined to go find the actual studies to sort all this out, partly due to my reading this sentence further down in the article:

“A group of Western University researchers led by Esses heard newcomers say they were overlooked for promotion and their work was underappreciated.”

Ok, how many of your peers report being overlooked for promotion, or underappreciated at their job? Maybe everyone? What makes that discrimination?

My curiosity fully aroused, I found the two studies on the website of the sponsoring organization mentioned above. You can, too, at this link.

The first study, which surveyed 829 L-M residents in March of 2021, is written up in the paper dated August 2021. The second is indeed a report on interviews with 30 immigrants from the L-M region, and is dated March 2023. This is the study mentioned in the tag line, and it is worth noting that all 30 people interviewed for that study reported being immigrants and reported experiencing discrimination. Anyone who did not report those two things when first contacted to be interviewed was not in fact interviewed. So the rate of reported discrimination among the interviewed group was 100%, by design. That’s not what the article’s tag line would have you think but…ah, details.

As to the first, much larger survey, that’s where the Freeps reporting gets worse and the research gets, well, interesting. The 829 respondents to the survey were contacted by a hired polling company that used random-telephone-number dialing to collect its sample of respondents. Those who were actually given the survey to respond to were put into three groups, which the researchers titled Immigrants and Visible Minorities, Indigenous Peoples, and White Non-Immigrants.

Now, if one is trying to understand discrimination experienced by immigrants living in London-Middlesex, it seems very odd to include Indigenous Peoples in the survey. If anyone is 180 degrees different from being an immigrant, that would be indigenous folks.

On the other hand, including a set of White Non-Immigrants in the survey makes sense. Whatever you learn about discrimination among immigrants is pretty meaningless without a point of comparison: the White Non-Immigrants can be considered the analog to a control group in a drug study. It’s as if someone tells you that the Bismarck displaced 41,000 tons when it was built; that doesn’t tell you it was one huge battleship unless you also know how big other ships of the era were.

Below are the self-reported rates of discrimination of these three groups – that is, the percentage of survey-takers in each group who reported being discriminated against – you can find these numbers on p.20 of the 2021 report:

Immigrants and Visible Minorities: 36.7%

Indigenous Peoples: 61.6%

White Non-immigrants: 44.4%

Which brings us back to the Freeps writing that “…60 per cent of those who identified as immigrants in Southwestern Ontario said they experienced some level of discrimination or racism in the last three years.”

Clearly that’s just wrong. Inaccurate. (See why I love the Freeps?) Indigenous peoples most certainly do not identify as immigrants. Count on it. The self-reported rate of discrimination among the immigrant group was 36.7%, which is way less than 60 in anyone’s arithmetic.

So the Freeps got the facts wrong, and they erred in the direction in which the Freeps always errs, in my experience. The Freeps has become The London Alarmist, always making things seem as bad as possible, so here they report the biggest, baddest number, even if it’s the wrong number.

However, I cannot let the researchers off free on this one, either. The word ‘Alarming’ in the headline is accurate, in that researcher Vicki Esses did use that adjective in describing the stories they heard in the interview study. But, of course, in the 2023 interview study the interviewees were pre-selected for saying they had been discriminated against. The earlier 2021 survey study could then be viewed as an attempt to understand how representative those stories are of the general experience of immigrants in the area.

But here’s the thing, which you alert readers likely have already noticed. White non-immigrants reported being discriminated against at a higher rate than the immigrants. As my foul-tongued friend Hugo might say – WTF?

The Freeps reporter did not question Esses about this finding from the survey, and I would bet a buck said reporter did not read either report. I mean, who has time to learn about the things one writes about? I would bet a lot more than a buck that had Ms. Rivers turned in a story to her editor headlined ‘Immigrants less discriminated against locally than white non-immigrants’ she would not have gotten her byline into The Alarmist.

A final note on the research, specifically the survey report. As the researchers write on p. 51 of the 2021 report – “Nonetheless, because participation was voluntary, it is likely that interest in the topic had some influence on whether or not eligible individuals participated, leading to some inevitable potential biasing of the samples.”

Yea. Likely, indeed. They note that the use of random-phone-dialing to get initial respondents helps work against bias, and that is only partly true. The researchers don’t tell us much about that respondent recruiting process, and were this study being presented in a seminar, here are just a few of the questions I would ask:

Did the phone-calling include cell phone numbers or just land-lines?

Was there a set text the callers used to screen potential survey-takers, and if so, what did it say?

Given that the survey involves unconfirmed self-reporting (the results from which should always be taken with a grain of salt), what reason is there for confidence that the reports of discrimination correspond to actual discrimination?

The reasoning behind the first two questions is simple: if only land-lines were called, as used to be the case, a whole swath of Canadians, mostly younger, who have no landlines anymore, is left out of the pool of potential survey-takers. How that might bias the results I cannot say, but it seems it must to some extent, so this is important for understanding the survey results.

And, if the callers doing the recruiting let out the fact – or even the possibility – that they were asking people to participate in a survey on discrimination, then my Spidey sense goes into full vibration mode. That will disproportionately attract people who feel discriminated against, and it renders the ‘% experiencing discrimination’ statistic unrepresentative of what happens in the general L-M population. It is not reassuring that the researchers say so little about the respondent recruiting process.

The third question is prompted by the fact that White non-immigrants reported more discrimination than Immigrants. This makes it very difficult for me to believe that these responses tell us much about discrimination against immigrants.

To go back to the Random-Controlled Drug Trial analogy above, if the group you gave the drug to (the white non-immigrants) is more likely to get the disease the drug is supposed to cure than the group that did not get it (the immigrants), your drug does not work; indeed, it’s bloody dangerous.

That finding should have the researchers questioning just what the survey responses actually tell them, if anything. Do they believe that white non-immigrants are actually more likely to experience discrimination than immigrants? If so, are they seeking research funding to look into discrimination against white non-immigrants? I rather doubt the answer to either question is yes, but that’s the implication of their survey results if they want to insist that the survey responses tell us something about actual discrimination. And that ‘Alarming’ thing kinda suggests they do.

 

 

Ask Yourself: Do I Feel Lucky? Well, do you…?

 

Bad luck and trouble, two of my best friends – Sam (Lightnin’) Hopkins/Mack McCormick

You can spend a lot of time reading about things that seem social sciencey, but are in fact pure politics. One example of what I’m talking about is the following non-question: what matters more in life, luck or talent?

It’s not a scientific question because 1) luck is impossible to measure, 2) talent is only measured approximately, at best, and 3) there is no scale on which one can put a life to decide the ‘more’ part of the question.

However, political types, by which I mean politicians, advocates, activists and ‘experts’ are happy to go on and on about which matters more, and they are all quite sure they know the answer.

An Opinion piece showed up in the Feb 21 Report on Business section of the Globe on this non-question, titled Rich and successful? It’s likely you’re just lucky. Written by Mark Rank, said to be a Professor of Social Welfare at Washington University in St Louis, the piece is labelled as being ‘Special to the Globe and Mail’, which I think just means that Rank is not on the staff of the Globe.

It was a very annoying article.

Let me explain.

In his piece Rank weighs in on the side of luck in this debate, and I’m not writing to argue against that position; as I wrote, it’s a pointless argument. I’m writing because Rank badly mischaracterizes a piece of academic research in supporting his position. He writes:

“Take the case of who becomes wealthy and who experiences poverty. It turns out that the random factor is very much in play. In a fascinating research article titled Talent Versus Luck: The Role of Randomness in Success and Failure, mathematical physicist Alessandro Pluchino and his colleagues were able to empirically quantify the relative importance of talent versus luck in terms of acquiring great wealth over a 40-year working-age lifespan. What they found was that the most talented people almost never reached the peaks of economic success – rather, the ones most likely to achieve the pinnacle of wealth were those with more average talents but who happened to catch a couple of lucky breaks.”

Before I explain what is wrong here, I doff my hat to Professor Rank for apparently citing some actual research, and for including in his article a link to the paper he is citing. That happens all too rarely.

The above paragraph makes the research of Pluchino and colleagues seem like it is about real people, living ’40-year working-age lifespan(s)’, right? And, it seems that for these ‘people’ it turned out that the ‘most talented people’ were not generally the ones to achieve ‘the pinnacle of wealth’. Rather it was those people with good luck who did well.

That could hardly be further from the truth of what the Pluchino article does.

To start with, his statement that Pluchino and colleagues “…were able to empirically quantify the relative importance of talent versus luck…” is flat-out wrong. The adverb ‘empirically’ says that they observed people and recorded what they observed to support their ‘luck matters more than talent’ claim. [Merriam-Webster – empiric: capable of being verified or disproved by observation or experiment; originating in or based on observation or experience.] They observed no one, and they gathered no data about anybody.

In fact, what the Pluchino et al paper does is report on the construction of a mathematical model, in which the authors interpret their purely mathematical result as demonstrating that luck matters more than talent.

However, none of this ‘talent’ they write about is embodied or observed in actual people, nor do they observe the amount of good or bad luck that any real people experience.

Pluchino and friends build a model in which the ‘people’ are theoretical entities who do nothing. In the model they are assigned varying levels of theoretical talent by the researchers, and then they are bombarded with theoretical good or bad luck. They do not respond to what happens to them in any way. In fact, in the model, they are not allowed to make any decisions or take any actions – they are automatons. The researchers give these automatons varying levels of ‘talent’, but the same ‘wealth’ (also theoretical) to start out with, then run them through forty fictitious instances of good or bad luck. Having performed those mathematical operations, they observe how much wealth each automaton ends up with. (These forty hits of theoretical good or bad luck are what Rank refers to as a ’40-year working life span’. No one works in the model, either – good or bad things just happen to them. 40 times.)

In addition to this, there is no interaction between the various fictitious automatons. What happens to automaton no. 12 has no influence on, nor is it influenced by, what happens to any other automaton. You know, just like in the real world, where people go through their lives in perfect isolation.

I repeat, there is no actual data about anything collected or observed by these researchers. Thus, there is nothing whatsoever ‘empirical’ about this research, and I leave it to you to consider what this fictitious world of automatons might tell us about luck vs talent out in the real (empirical) world. (Note that because the automatons don’t do anything, the role of diligence, effort, or good decision-making doesn’t have even a theoretical place in this research.)
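Since the whole exercise is a simulation, a toy version is easy to write down. What follows is only a sketch of the automaton world described above; the doubling/halving rules, the probabilities, and the talent distribution here are my own stand-ins, not necessarily the paper’s exact specification.

```python
import random

random.seed(1)

N_AGENTS = 1000
N_EVENTS = 40        # the forty 'hits' of theoretical luck mentioned above
START_WEALTH = 10.0

# Each automaton gets a fixed 'talent' level and the same starting wealth; it never acts.
agents = [{"talent": min(max(random.gauss(0.6, 0.1), 0.0), 1.0), "wealth": START_WEALTH}
          for _ in range(N_AGENTS)]

for _ in range(N_EVENTS):
    for agent in agents:
        draw = random.random()
        if draw < 0.25:                                   # unlucky event: wealth halves
            agent["wealth"] /= 2
        elif draw < 0.50 and random.random() < agent["talent"]:
            agent["wealth"] *= 2                          # lucky event, exploited with prob = talent
        # otherwise nothing happens to this automaton this round

richest = max(agents, key=lambda a: a["wealth"])
most_talented = max(agents, key=lambda a: a["talent"])
print("Talent of the richest automaton:       ", round(richest["talent"], 2))
print("Wealth of the most talented automaton: ", round(most_talented["wealth"], 2))
```

Nothing in that exercise involves observing a single real person, which is the point.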

Pluchino and company do discuss other research papers they say provide evidence that luck matters more than other things. These other papers are duly referenced, and the interested reader can go read them and judge for themselves how convincing any of them are. I have not done so, and there is no indication Rank has, either.

The point here is that Pluchino and his colleagues provide no empirical evidence that luck matters more than talent, only that it does in their theoretical model, and Rank is wrong to say they do. Professor Rank has misused their research in trying to support his own view on this matter in his op-ed column. Given that Rank is a Professor of Social Welfare, I am not surprised that he believes luck matters more than talent, but there is no evidence in his article or the Pluchino paper to support that view.

Sadly, academics tend to believe that any press is good press, so I doubt Pluchino and friends are upset by Rank’s complete misrepresentation of their research in his article.

I was.

 

Following the Hot Hand of Science

Anyone vaguely familiar with basketball has heard of the ‘hot hand’ phenomenon. Someone on the team gets a hot shooting streak going, they can’t seem to miss, and their teammates start looking to get the hot-handed player the ball. I played backyard hoops a lot in my youth, and there were (very few) times when it happened to me; every shot I threw up seemed to go in – briefly.

Well, academics got wind of this long ago also, and decided to investigate whether there was anything to it. Yeah, sure, players talk about experiencing it, or seeing it, but it could easily be just a matter of perception, something that would disappear into the ether once subjected to hard-nosed observation and statistical analysis.

The canonical paper to do this analysis was published in 1985 in Cognitive Psychology, authored by Gilovich, Vallone and Tversky. The last of this trio, Amos Tversky, was a sufficiently notable scholar that young economists like me were told to read some of his work back in the day. He died young, at age 59, in 1996, six years before his frequent co-researcher, Daniel Kahneman, was awarded the Nobel Prize in Economics. The work the Nobel committee cited in awarding the prize to Kahneman was mostly done with Tversky, so there is little doubt Tversky would have shared the prize had he lived long enough, but Nobels are, by rule, not given to the dead.

Now, as a research question, looking for a basketball hot hand is in many ways ideal: the trio used data on shots made and missed by players in the NBA, which tracks such data very carefully, and beyond that, they did their own controlled experiment, putting the Cornell basketball teams to work taking shots, and recording the results. Good data is everything in social science, and the data doesn’t get much better than that. Well, bear with me here, this is most of the Abstract of that 1985 paper:

“Basketball players and fans alike tend to believe that a player’s chance of hitting a shot are greater following a hit than following a miss on the previous shot. However, detailed analyses of the shooting records of the Philadelphia 76ers provided no evidence for a positive correlation between the outcomes of successive shots. The same conclusions emerged from free-throw records of the Boston Celtics, and from a controlled shooting experiment with the men and women of Cornell’s varsity teams. The outcomes of previous shots influenced Cornell players’ predictions but not their performance. The belief in the hot hand and the “detection” of streaks in random sequences is attributed to a general misconception of chance according to which even short random sequences are thought to be highly representative of their generating process.”

That is, a player who hits a shot expects he is likely to hit the next one, too. When he does, he files this away as ‘having a hot hand’, but the frequency with which he actually hits the second shot is no higher than when he had missed his previous shot. Standard ‘cognitive bias’ causes the player – and fans – to see it that way, that’s all. They remember the times the second shot is made more than they remember the times it is missed.
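For readers who want to see what the basic calculation looks like, here is a small Python illustration of the comparison at the heart of that kind of analysis: how often a hit follows a hit versus how often a hit follows a miss in a player’s shot record. This is my own sketch of the idea, not the authors’ code, and the example sequence is invented.

```python
def conditional_hit_rates(shots):
    """shots: a list of 1s (hits) and 0s (misses), in the order they were taken.
    Returns (hit rate after a hit, hit rate after a miss)."""
    after_hit = [shots[i] for i in range(1, len(shots)) if shots[i - 1] == 1]
    after_miss = [shots[i] for i in range(1, len(shots)) if shots[i - 1] == 0]

    def rate(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return rate(after_hit), rate(after_miss)

# A made-up shot record, purely to show the calculation.
shots = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0]
print(conditional_hit_rates(shots))  # a 'hot hand' would show up as the first number being larger
```

The 1985 analysis was considerably more careful than this, and also looked at streaks longer than a single shot, but the underlying comparison is of this kind.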

Damn scientists are always messing with our hopes and dreams, right? No Easter Bunny, no extra-terrestrials in Mississauga, and no hot hand. Is nothing sacred? Other researchers went looking for evidence of a hot hand over the ensuing years, but belief in it became known in academic circles as ‘the hot hand fallacy’, the general consensus being that the hot hand did not exist in the real world of basketball.

33 years later

But wait, it’s now 2018 and a paper by Miller and Sanjurjo appears in Econometrica, the premier journal for economic analysis involving probability and/or statistics. Its title is “Surprised by the hot-hand fallacy? A truth in the law of small numbers”.

Here’s some of what their Abstract says:

We prove that a subtle but substantial bias exists in a common measure of the conditional dependence of present outcomes on streaks of past outcomes in sequential data…. We observe that the canonical study [that is, Gilovich, Vallone and Tversky] in the influential hot hand fallacy literature, along with replications, are vulnerable to the bias. Upon correcting for the bias, we find that the longstanding conclusions of the canonical study are reversed.

It took over 30 years for two economists to figure out that ‘the canonical study’ of the hot hand did its ciphering wrong, and that once this is corrected, its findings are not just no longer true, they are reversed. The data collected in 1985 do provide evidence of the existence of a hot hand.
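The flavour of the mistake is easiest to see with coin flips, no basketball required. Take many short sequences of fair coin flips, compute within each sequence the proportion of heads among flips that immediately follow a head, and then average those proportions across sequences: you do not get 0.50, you get something noticeably lower. That, in essence, is the bias Miller and Sanjurjo identified in the way streak shooting was being measured. The little Python simulation below is my own illustration of the phenomenon, not their code; it uses streaks of length one, whereas the original study mostly looked at longer streaks, but the same bias applies there.

```python
import random

def streak_bias_demo(seq_len=4, n_sims=200_000, seed=1):
    """Average, across many short sequences of fair coin flips, the within-sequence
    proportion of heads among flips that immediately follow a head."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_sims):
        flips = [rng.randint(0, 1) for _ in range(seq_len)]          # 1 = heads
        follows_head = [flips[i] for i in range(1, seq_len) if flips[i - 1] == 1]
        if follows_head:        # sequences with no flip following a head are skipped
            proportions.append(sum(follows_head) / len(follows_head))
    return sum(proportions) / len(proportions)

# For sequences of length 4 the exact expected value works out to 17/42, about 0.40,
# even though every individual flip is a fair 50/50 coin.
print(streak_bias_demo())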

Think about this. In 1985 some very clever academics showed there was no such thing as a hot hand in the real world of basketball, and the academics who peer-reviewed their work agreed with them. Thirty-plus years later, some other clever academics realized that the first set had gotten something wrong, and that fixing it reversed the previous findings – and the academics who peer-reviewed their work agreed with them.

Ain’t social science wonderful? A question for which there is excellent data, a situation rarer than hen’s teeth in social science, is investigated and a conclusive answer arrived at, and thirty years later that answer is shown to be not just wrong but backwards.

No one did anything shady here. There was no messing with data, the 2018 guys used the same data used in 1985. A mistake, a subtle but significant mistake, accounts for the turnaround, and it took 33 years to discover it. One can hardly blame the 1985 researchers for not seeing the mistake, given that no one else did for such a long time.

The Lesson?

So, in case my point is not yet obvious, science is not a set of settled facts. Those do exist – sort of – but anyone who understands the process of science even a little understands that settled facts are settled only until they are overturned. And if that is true for such a clean research question as an investigation of a basketball hot hand, think about a more typical social science question in which two things are almost always true. One, the data is not at all what the researchers need, so they make do with what they can actually gather. Two, the right way to analyze that data – among endless possibilities – is a matter of disagreement among respectable social scientists. Following that kinda science will make you dizzy, my friends.

A teaser: think about this social scientific question. It is arguably of more importance than basketball shooting.

Does the availability of bricks-and-mortar adult entertainment establishments have a positive, negative, or no effect on the commission of sex crimes in the surrounding neighborhood?

Whaddya think is the right answer?

For extra credit: what kind of data would a researcher need to gather to answer that question?

Now that’s real (i.e., messy) social science.

Stay tuned, because a couple of economists set out to investigate the question above, and I’ll have a go at what they did and their findings in a future post.

 

 

Who needs experts – and who needs AI?

I had planned to follow up my post on the Freeps printing a ‘news’ article in which the only news was a set of comments by one person (link) with a post on the use of ‘experts’ in media more generally. Before I could, a regular reader pointed me to a piece on a site called The Hub that covered the same ground. Having read Howard Anglin’s piece carefully, and enjoyed it a great deal, I’ve decided the best thing for me to do is just provide my own readers with a link to it (link).

I can’t see myself writing anything better than he did… at least not yet.

Sci-fi in aid of Science

I was a pretty big fan of science fiction in my younger days, and still read some from time to time. I think Frank Herbert’s Dune is a great novel (the sequels not so much), and I enjoyed reading works by Heinlein, Le Guin and Asimov.

One of the genre’s leading lights back then was Arthur C. Clarke, who wrote the novel 2001: A Space Odyssey (in 1982) [not true, see below] on which the film was based. I was not a Clarke fan and don’t remember that I read any of his stuff. However, he made an interesting contribution to the culture beyond his books themselves when he formulated three ‘laws’ regarding technology that have come to be known as Clarke’s Laws. He didn’t proclaim these all at once, and in any case it is the third law that is most cited; so far as I can determine, it first appeared in a letter he wrote to Science in 1968. [If anyone has better info on the third law’s original appearance and antecedents I’d love to hear it.]

Clarke’s Third Law is: ‘Any sufficiently advanced technology is indistinguishable from magic.’

That strikes me – and many others, apparently – as a perceptive statement. Think of how someone living in 1682 anywhere in the world would regard television or radio. 

As with any perceptive and oft-repeated assertion,  this prompted others to lay down similar edicts, such as Grey’s Law: “Any sufficiently egregious incompetence is indistinguishable from malice.”

[I cannot trace Grey’s law back to anyone named Grey – if you can, let me know.]

Note that there is a difference, as Clarke’s law speaks to how something will be perceived, whereas Grey’s points at the consequences of incompetence vs malice. If you are denied a mortgage by a bank despite your stellar credit rating, the impact on you of that decision does not depend on whether it is attributable to the credit officer’s incompetence or dislike of you. 

On to Science, then, and what I will call Gelman’s Third Law (although Gelman himself does not refer to it that way).

Most non-academics I know view academics and their research with a somewhat rosy glow. If someone with letters after their name writes something, and particularly if they write it in an academic journal, they believe it. 

It does nothing to increase my popularity with my friends to repeatedly tell them: it ain’t so. There is a lot of crappy (a technical academic term, I will elaborate in future posts) research being done, and a lot of crappy research being published, even in peer-reviewed journals. What is worse is that as far as I can tell, the credible research is almost never the stuff that gets written up in the media. Some version of Gresham’s Law [‘bad money drives out good money’] seems to be at work here. 

A blog that I read regularly is titled Statistical Modeling, Causal Inference, and Social Science (gripping title, eh?), written by Andrew Gelman, a Political Science and Statistics prof at Columbia U. I recommend it to you, but warn that you’d better have your geek hard-hat on for many of the posts.

Although I often disagree with Gelman, he generally writes well and I have learned tons from his blog. One of the things that has endeared it to me is his ongoing campaign against academic fraud and incompetent research. 

He has formulated a Law of his own, which he modestly attributes to Clarke, but which I will here dub Gelman’s Third Law: 

“Any sufficiently crappy research is indistinguishable from fraud.”

I think this law combines the insights of Clarke’s and Grey’s. The consequences of believing the results from crappy research do not differ from the consequences of believing the results from fraudulent research, as with Grey. However, it is also true that there is no reason to see the two things as different. If you are so incompetent at research as to produce crap, then you should be seen as a fraud, as with Clarke. 

I will be writing about crappy/fraudulent research often here, in hopes of convincing readers that they should be very skeptical the next time they read those deadly words: “Studies show…”

I will close this by referring you, for your reading pleasure, to a post by Gelman titled:    

 It’s bezzle time: The Dean of Engineering at the University of Nevada gets paid $372,127 a year and wrote a paper that’s so bad, you can’t believe it.

It’s a long post, but non-geeky, and quite illuminating. (Aside: I interviewed for an academic position at U of Nevada in Reno a hundred years ago. They put me up in a casino during my visit. Didn’t gamble, didn’t get a job offer.) You can read more about this intrepid and highly paid Dean here. His story is really making the (academic) rounds these days. 

You’re welcome, and stay tuned. I got a million of ‘em….

p.s. Discovered this since I wrote the above, but before posting. One of many reasons this stuff matters, from Nevada Today

University receives largest individual gift in its history to create the George W. Gillemot Aerospace Engineering Department 

The $36 million gift is the largest individual cash gift the University has received in its 149-year history 

Anyone care to bet on whether this Dean gets canned?

Corrigendum: An alert reader has pointed out that Clarke’s 2001 novel was not written in 1982 – both the novel and the film appeared in 1968. The film was in fact based largely on one of Clarke’s short stories from 1951, The Sentinel. Clarke did write a novel called 2010: Odyssey Two in 1982, and a not-so-successful movie was based on that in 1984.