Disturbing fact: The Stanford prison experiment, run by Philip Zimbardo, wasn't reproducible but that didn't stop Zimbardo from using it to promote his ideologies about the impossibility of rehabilitating criminals, or from becoming the president of the American Psychological Association.
The APA has a really good style guide, but I don't trust them for actual psychology.
This is a great list for people who want to smugly say "Um, actually" a lot in conversation.
Based on my brief stint doing data work in psychology research, amongst many other problems they are AWFUL at stats. And it isn't a skill issue as much as a cultural one. They teach it wrong and have a "well, everybody else does it" attitude towards p-hacking and other statistical malpractice.
"they are AWFUL at stats."
SF author Michael Flynn was a process control engineer as his day job; he wrote about how designing statistically valid experiments is incredibly difficult, and the potential for fooling yourself is high, even when you really do know what you are doing and you have nearly perfect control over the measurement setup.
And on top of that, you're trying to measure the behavior of people, not widgets; and people change their behavior based on the context and what they think you're measuring.
There was a lab set up to do "experimental economics" at Caltech back in the late 80's/early 90's. Trouble is, people make different economic decisions when they are working with play money rather than real money.
As someone who's part of a startup (hrpotentials.com) trying to bring truly scientifically valid psychological testing into HR processes .... yeah. We've been at it for almost 7 years, and we're finally at a point where we can say we have something that actually makes scientific sense - and we're not inventing anything new, just commercializing the science! It only took an electrical engineer (not me) with a strong grasp of statistics working for years with a competent professor of psychology to separate the wheat from the chaff. There's some good science there it's just ... not used much.
How are you going to get around Griggs v. Duke Power Co.? AFAIK, personality tests have not (yet) been given the regulatory eye, but testing cognitive ability has.
Yeah, this is an era which is notorious for pseudoscience.
There’s surely irony here
Um, actually I’d say it is the responsibility of all scientists, both professional and amateur, to point out falsehoods when they’re uttered, and not an act of smugness.
[um] has many contexts, but is usually a cue that something unexpected, off the average, is about to be said.
[actually] is a neutral declaration that some cognitive structure was presented that is at odds with physically observable fact, which will now be laid out for you.
> Claimed result: Women risk being judged by the negative stereotype that women have weaker math ability, and this apprehension disrupts their math performance on difficult tests.
I'll never understand stances trying to hide biological differences between different sexes or ethnic backgrounds.
We know for a fact that sex or ethnicity impacts the body, yet we seem unable to cope with the idea that there are also differences in how brains (and hormones) work.
Women have, on average, a higher emotional intelligence, which is e.g. tied to higher linguistic proficiency. That helps in many different fields and, on average, women tend to learn languages more easily than men.
At the same time, on average, they may perform slightly worse than men in highly computational fields (math or chess).
I want to reiterate what I'm getting at before the rest of the post:
Genetics matter when you look at very large samples, but they are irrelevant on smaller (or single) samples.
I feel the NBA provides a great example.
On average, African Americans are taller than white men and have a higher muscular density.
On large samples, they tend to outperform white men. But as soon as you make the samples smaller, even at elite levels, you find out that Larry Bird (30+ years ago) or Nikola Jokic (today) are the best players in the world.
The same applies to women: just because averages over large samples explain some statistics, such as females performing worse at maths on average, doesn't change the fact that a woman can be the best chess player or cryptographer in the world.
> On average, African Americans are taller than white men and have a higher muscular density.
Are you comparing direct descendants of Yoruba versus descendants of Celts in America? Or mixed descendants of Bantu and Cherokee versus mixed descendants of Anglo-Saxons and Slavs? In your study would Barack Obama be a person of color or a person of pallor?
Or is this data you have gathered observing people at Costco. Just checking on your scientific methodology.
Differences are hidden because (1) differences, even small ones, are used to justify discrimination, (2) some feel the need to correct for stereotypes, and (3) these differences often don't really exist or amount to a small effect size.[0]
In the end, we're talking about distributions of people, and staring at these differences mischaracterizes all but those at the mean.
All that matters is who can pass the test.
[0]: I also encourage you to ask ChatGPT/Grok/Claude "men vs women math performance studies." You'll be shocked to find most studies point to no or small differences.
[1]: Malcolm Gladwell wrote a great piece about his experience as a runner that seems appropriate to share https://www.newyorker.com/magazine/1997/05/19/the-sports-tab...
Circular reasoning can be used to "prove" anything, so it's not helpful as a basis for policy making.
> We know for a fact that sex or ethnicity impacts body yet we seem unable to cope with the idea that there are also differences in how brains work.
Here is your error. You’re assuming that a physical difference in morphology is linked to behavioral or neural correlates. That’s not the case, since observed statistical- or group-level differences need not be driven by biology. You’re assuming biological determinism, and the evidence for direct genetic effects on behavior isn’t there.
> and the evidence for direct genetic effects on behavior isn’t there.
Yes it is. There's an entire field for studying this called Behavioral Genetics.
The easiest evidence comes from comparing monozygotic and dizygotic twins (identical vs fraternal twins). The variance in behavior is higher among the dizygotic twins, who have different genomes.
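For a sense of what that comparison looks like in practice, here is a minimal sketch of Falconer's formula, which estimates heritability from the gap between identical- and fraternal-twin correlations. The correlations below are made-up illustrative numbers, not results from any particular study.

    # Falconer's formula: rough partition of behavioral variance from twin
    # correlations. r_mz / r_dz are hypothetical values, not real data.
    def falconer(r_mz, r_dz):
        h2 = 2 * (r_mz - r_dz)    # additive genetic ("heritability")
        c2 = 2 * r_dz - r_mz      # shared environment
        e2 = 1 - r_mz             # non-shared environment + measurement error
        return h2, c2, e2

    h2, c2, e2 = falconer(r_mz=0.6, r_dz=0.35)
    print(f"h2={h2:.2f}  c2={c2:.2f}  e2={e2:.2f}")   # roughly 0.50, 0.10, 0.40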
It's not an error unless you're able to demonstrate the opposite.
I have yet to see studies that demonstrate that different sexes, hormones or even ethnicities do not impact cognitive abilities or higher proficiency in different fields.
Whereas I've seen plenty that show that women, on average, demonstrate higher cognitive abilities linked to verbal proficiency, text comprehension or executive tasks. Women also tend to have better memory than men.
The fact is that there are genetic differences in how our brains work. And let's not ignore the huge importance of hormones, extremely potent regulators of how we function.
To ignore that we have differences that, at large, help explain statistics is asinine.
And how are you able to rule out that societal or environmental effects are the primary driver? How is your argument not circular, that observed differences are therefore the result of biology?
I've never stated that biology is the primary driver.
I merely stated that biology should not be ignored when judging very large samples.
There are cross-sex cognitive tests on which women and men tend to perform differently, such as spatial awareness or speed perception, among many others.
What's the environmental or cultural factor behind the fact that a female's brain, on average, is able to judge speed much more accurately than a male's?
I see you edited your response after my reply. I’m not denying that you’ve read about those observed differences. I’m trying to say that those differences don’t need to be driven by biology, and evidence suggests otherwise. Behavior can’t be reduced to genetics, and the mechanistic link isn’t there. You are claiming that morphological differences explain the variation. Besides, by your reasoning, you could look at the NBA before Bill Russell and make very different claims.
> And how are you able to rule out
It is not possible to rule out unfalsifiable hypotheses.
Approximate replication rates in psychology: https://www.nature.com/articles/nature.2015.18248
So a list of famous psychology experiments that do replicate may be shorter.
I think one would wish the famous ones to be more often replicable.
Nonreplicable publications are cited more than replicable ones (2021)
> We use publicly available data to show that published papers in top psychology, economics, and general interest journals that fail to replicate are cited more than those that replicate. This difference in citation does not change after the publication of the failure to replicate. Only 12% of postreplication citations of nonreplicable findings acknowledge the replication failure.
https://www.science.org/doi/10.1126/sciadv.abd1705
Press release: https://rady.ucsd.edu/why/news/2021/05-21-a-new-replication-...
There may be minute details like having a confident frame of reference for the confidence tests. Cultures, even psychologies might swing certain ideas and their compulsions.
The incentive of all psychology researchers is to do new work rather than replications. Because of this, publicly-funded psychology PhDs should be required to perform study replication as part of their training. Protocol + results should be put in a database.
Sure, dump it on the lowest level employee, who has the least training and the most to lose. Grad school already takes too long, pays too little, and involves too much risk of not finishing. And it doesn't solve the problem of people having to generate copious quantities of research in order to sustain their careers.
Disclosure: Physics PhD.
How interesting would it be if every PhD thesis had to have a "replication" section, where they tried to replicate some famous paper's results.
>Source: Stern, Gerlach, & Penke (2020)
Wow, what are the odds?
https://en.wikipedia.org/wiki/Stern%E2%80%93Gerlach_experime...
I'm still amazed that wikipedia doesn't have a redirect away from its mobile site
(It's on my list to rewrite those URLs in HN comments at least)
> Source: Hagger et (63!) al. 2016
I can't help chuckling at the idea that over 1.98 * 10^87 people were involved in the paper.
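For the curious, the arithmetic behind the joke checks out; a one-liner to verify it:

    # 63! really is about 1.98e87 -- more than the number of atoms in the
    # observable universe, let alone coauthors.
    import math
    print(f"{math.factorial(63):.3e}")   # -> 1.983e+87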
famous cognitive psychology experiments that do replicate: IQ tests
http://www.psychpage.com/learning/library/intell/mainstream....
in fact, the foundational statistical models considered the gold standard for statistics today were developed for this testing.
Survivorship bias. You can easily make someone's IQ test not replicate. (Hit them on the head really hard.)
> in fact, the foundational statistical models considered the gold standard for statistics today were developed for this testing.
The normal distribution predates the general factor model of IQ by hundreds of years.[0]
You can try other distributions yourself (a quick sketch follows below); it's going to be hard to find one that better fits the existing IQ data than the normal (bell curve) distribution.
[0] https://en.wikipedia.org/wiki/Normal_distribution#History
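A minimal sketch of that exercise with scipy, on simulated IQ-like scores rather than real test data (so the normal trivially wins here; with real scores you would load those instead):

    # Fit a few candidate distributions to IQ-like scores and compare
    # log-likelihoods. Scores are synthetic (mean 100, sd 15), NOT real data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    scores = rng.normal(loc=100, scale=15, size=5000)

    candidates = {"normal": stats.norm, "logistic": stats.logistic,
                  "lognorm": stats.lognorm, "gamma": stats.gamma}

    for name, dist in candidates.items():
        params = dist.fit(scores)                      # maximum-likelihood fit
        loglik = np.sum(dist.logpdf(scores, *params))  # higher = better fit
        print(f"{name:8s} log-likelihood = {loglik:.1f}")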
Darwin's cousin, Francis Galton, for whom the log-normal distribution is often called the Galton distribution, was among the first to investigate psychometrics.
not realizing he was hundreds of years late to the game, he still went ahead and coined the term "median"
more tidbits here https://en.wikipedia.org/wiki/Francis_Galton#Statistical_inn...
A key factor behind psychology's low replication rate is the absence of theories that define the field. In most science fields, an initial finding can be compared to theory before publication, which may weed out unlikely results in advance. But psychology doesn't have this option -- no theories, so no litmus test.
It's important to say that a psychology study can be scientific in one sense -- say, rigorous and disciplined, but at the same time be unscientific, in the sense that it doesn't test a falsifiable, defining psychological theory -- because there aren't any of those.
Or, to put it more simply, scientific fields require falsifiable theories about some aspect of nature, and the mind is not part of nature.
Future neuroscience might fix this, but don't hold your breath for that outcome. I suspect we'll have AGI in artificial brains before we have testable, falsifiable neuroscience theories about our natural brains.
> Claimed result: Adopting expansive body postures for 2 minutes (like standing with hands on hips or arms raised) increases testosterone, decreases cortisol, and makes people feel more powerful and take more risks.
A heuristic I use that is unreasonably good at identifying grifters and charlatans: Unnecessarily invoking cortisol or other hormones when discussing behavioral topics. Influencers, podcasters, and pseudoscience practitioners love to invoke cortisol, testosterone, inflammation, and other generic concepts to make their ideas sound more scientific. Instead of saying "stress levels" they say "cortisol". They also try to suggest that cortisol is bad and you always want it lower, which isn't true.
Dopamine is another favorite of the grifters. Whenever someone starts talking about raising dopamine or doing something to increase dopamine, they're almost always being misleading or just outright lying. Health and fitness podcasters are the worst at this right now.
> Dopamine is another favorite of the grifters. Whenever someone starts talking about raising dopamine or doing something to increase dopamine, they're almost always being misleading or just outright lying. Health and fitness podcasters are the worst at this right now.
Yeah it’s ridiculous. You know what raises dopamine very effectively? Cocaine, gambling, heroin, meth, etc. If they really believed their own advice they’d all be doing meth or cocaine all day every day. If you look at what happens to regular meth users, it doesn’t seem like raising dopamine all the time is a good idea.
Is anyone tracking how much damage to society bad social science has done? I imagine it's quite a bit.
The most obvious one is the breakdown of trust in scientific research. A frequent discussion I would have with another statistics friend of mine was that the anti-vax crowd really isn't as off base as they are popularly portrayed, and if anything, the "trust the science!" rhetoric is more clearly incorrect.
Science should never be taught as dogmatic, but the reproducibility crisis has ultimately fostered a culture where one should not question "established" results (Kahneman famously proclaimed that one "must" accept the unbelievable priming results in his famous book), especially if one is interested in a long academic career.
The trouble is that some trust is necessary in communicating scientific observations and hypotheses to the general public. It's easy to blame the failure of the public to unify around Covid on cultural divides, but the truth is that skepticism around high-stakes, hastily done science is well warranted. The trouble is that even when you can step through the research and see the conclusions are sound, the skepticism remains.
However, as someone who has spent a long career using data to understand the world, I suspect the harm directly caused by wrong conclusions being reached is smaller than one would think. This is largely because, despite lip service to "data driven decision making", science and statistics are very rarely the prime driver of any policy decision.
I imagine it's comparable to the damage done when policies are set that are not based on studies.
Let's be candid: Most policies have no backing in science whatsoever. The fact that some were backed by poor science is not an indictment of much.
We rack up quite a lot of awfulness with eugenics, phrenology, the "science" that influenced Stalin's disastrous agriculture policies in the early USSR, overpopulation scares leading to China's one-child policy, etc. Although one could argue these were back-justifications for the awfulness that people wanted to do anyway.
Those things were not done by awful people though - they all thought they were serving the public good. We only judge them as awful now because of the results. Nearly all of these ideas (Lysenkoism I think was always fringe) were embraced by the educated elites of the time.
Lysenkoism! That's the one. Thank you for reminding me of the name (and for knowing what I was grasping at).
I think some "bad people" used eugenics and phrenology to justify prior hate, but they were also effective tools at convincing otherwise "good people" to join them.
I'm struggling to imagine many negative effects on society caused by the specific papers in this list.
Public policies were made (or justified) based on some of this research. People used this "settled science" to make consequential decisions.
Stereotype threat for example was widely used to explain test score gaps as purely environmental, which contributed to the public seeing gaps as a moral emergency that needed to be fixed, leading to affirmative action policies.
> Smile to Feel Better Effect
> Claimed result: Holding a pen in your teeth (forcing a smile-like expression) makes you rate cartoons as funnier compared to holding a pen with your lips (preventing smiling). More broadly, facial expressions can influence emotional experiences: "fake it till you make it."
I read this about a decade ago, and started, when going into a situation where I wanted to have a natural smile, grimacing maniacally like I had a pencil in my teeth. The thing is, it's just so silly, it always makes me laugh at myself, at which point I have a genuine smile. I always doubted whether the claimed connection was real, but it's been a useful tool anyway.
Yeah, the marshmallow one taught me to have patience and look for the long returns on investments of personal effort.
I think there may be something to a few of these, and more consideration may be needed of how these studies are conducted.
Let’s leave open our credulities for the inquest of time.
Little of this is considered cognitive psychology. The vast majority would be viewed as "social psychology"
Setting that aside, among any scientific field I'm aware of, psychology has taken the replication crisis most seriously. Rigor across all areas of psychology is steadily increasing: https://journals.sagepub.com/doi/full/10.1177/25152459251323...
No mention of the Stanford Prison Experiment, I notice.
Papers should not be accepted until an independent lab has replicated the results. It’s pretty simple, but people are incentivized not to care whether it’s replicable, because they need the paper published to advance their career.
Well, at least the growth mindset study is not fully debunked yet. It's basically a modern interpretation of what we've known to be true about self-fulfilling prophecies. If you tell children they can be smart and competent if they work hard, then they will work hard and become smart and competent. This should be a given.
I wonder what the replication rate is for ML papers.
From working in industry and rubbing shoulders with CS people who prioritize writing papers over writing working software I’m sure that in a high fraction of papers people didn’t implement the algorithm they thought they implemented.
Don't get me started, I have seen repos that I'm fairly sure never ran in their presented form. A guy in our lab thinks authors purposefully mess up their code when publishing on GitHub to make it harder to replicate. I'm starting to come around on his theory.
> Claimed result: Listening to Mozart temporarily makes you smarter.
This belongs in a dungeon crawl game. You find an artifact that plays music to you. Depending on the music played (depends on the artifact's enchantment and blessed status), it can buff or debuff your intelligence by several points temporarily.
If the "failed replication" was a single study, as in many cases listed here, there is still an open question as to whether (1) the replication study was underpowered (the ones I looked at had pretty small n's), or (2) the re-implementation of the original study was flawed. So I'm not so sure we can quickly label the original studies as "debunked", any more than we can express a high level of confidence in the original studies.
(This isn't a comment on any of the individual studies listed.)
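On the underpowered point in (1), here is a minimal sketch of how sample-size requirements scale with effect size, using statsmodels; the effect sizes are generic Cohen benchmarks, not taken from any study on the list:

    # Participants per group needed for a two-sample t-test at alpha = 0.05
    # and 80% power, for a few illustrative effect sizes (Cohen's d).
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for d in (0.8, 0.5, 0.2):   # "large", "medium", "small"
        n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.8,
                                 alternative="two-sided")
        print(f"d = {d}: ~{n:.0f} per group")
    # Roughly 26, 64, and 394 per group: a replication with a few dozen
    # participants is well powered only for large effects.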
> Most results in the field do actually replicate and are robust [citation needed], so it would be a pity to lose confidence in the whole field just because of a few bad apples.
Is there a good list of results that do consistently replicate?
I thought we knew that these were vehicles by wannabe self-help authors to puff up their status for money. See for example “Grit” and “Deep Work” and other bullshit entries in a breathlessly hyped up genre of pseudoscience.
One thing that confuses me is that some of these papers were successfully replicated, so juxtaposing them to the ones that have not been replicated at all given the title of the page feels a bit off. Not sure if fair.
The ego depletion effect seems intuitively surprising to me. Science is often unintuitive. I do know that it is easier to make forward-thinking decisions when I am not tired, so I don't know.
>some of these papers were successfully replicated, so juxtaposing them to the ones that have not been replicated at all given the title of the page feels a bit off. Not sure if fair.
I don't like Giancotti's claims. He wrote:
>This post is a compact reference list of the most (in)famous cognitive science results that failed to replicate and should, for the time being, be considered false.
I don't agree with Giancotti's epistemological claims but today I will not bloviate at length about the epistemology of science. I will try to be brief.
If I understand Marco Giancotti correctly, one particular point is that Giancotti seems to be saying that Hagger et al. have impressively debunked Baumeister et al.
The ego depletion "debunking" is not really what I would call a refutation. It says, "Results from the current multilab registered replication of the ego-depletion effect provide evidence that, if there is any effect, it is close to zero. ... Although the current analysis provides robust evidence that questions the strength of the ego-depletion effect and its replicability, it may be premature to reject the ego-depletion effect altogether based on these data alone."
Maybe Baumeister's protocol was fundamentally flawed, but the counter-argument from Hagger et al. does not convince me. I wasn't thrilled with Baumeister's claims when they came out, but now I am somehow even less thrilled with the claims of Hagger et al., and I absolutely don't trust Giancotti's assessment. I could believe that Hagger executed Baumeister's protocol correctly, but I can't believe Giancotti has a grasp of what scientific claims "should" be "believed."
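For readers wondering what "if there is any effect, it is close to zero" means numerically in a multilab registered replication, here is a minimal sketch of a fixed-effect (inverse-variance) pooled estimate. The per-lab numbers are made up for illustration, not Hagger et al.'s data:

    # Inverse-variance (fixed-effect) pooling of per-lab effect sizes, the
    # kind of summary a multilab replication reports. Numbers are invented.
    import numpy as np

    lab_d  = np.array([0.10, -0.05, 0.02, 0.08, -0.01])  # per-lab Cohen's d
    lab_se = np.array([0.12,  0.10, 0.15, 0.11,  0.09])  # per-lab std. errors

    w = 1.0 / lab_se**2
    pooled = np.sum(w * lab_d) / np.sum(w)
    pooled_se = np.sqrt(1.0 / np.sum(w))
    lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
    print(f"pooled d = {pooled:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
    # A confidence interval straddling zero is what "close to zero" looks like.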
The idea isn't that it is easier to do things when not tired. It is that you specifically get tired exercising self control.
I think that can be subtly confused with people thinking you can't get better at self control with practice. That is, I would think a deliberate practice of doing more and more self control every day should build up your ability to do more self control. And it would be easy to think that means you have a stamina for self control that depletes in the same way that aerobic fitness works. But those don't necessarily follow from each other.
Now I want to know which cognitive psychology experiments were successfully replicated though.