P-Values: Are we using a flawed statistical tool?

Update: 2025-09-22

Description

P-values show up in almost every scientific paper, yet they’re one of the most misunderstood ideas in statistics. In this episode, we break from our usual journal-club format to unpack what a p-value really is, why researchers have fought about it for a century, and how that famous 0.05 cutoff became enshrined in science. Along the way, we share stories from our own papers—from a Nature feature that helped reshape the debate to a statistical sleuthing project that uncovered a faulty method in sports science. The result: a behind-the-scenes look at how one statistical tool has shaped the culture of science itself.

Statistical topics

Bayesian statistics
Confidence intervals
Effect size vs. statistical significance
Fisher’s conception of p-values
Frequentist perspective
Magnitude-Based Inference (MBI)
Multiple testing / multiple comparisons
Neyman-Pearson hypothesis testing framework
P-hacking
Posterior probabilities
Preregistration and registered reports
Prior probabilities
P-values
Researcher degrees of freedom
Significance thresholds (p < 0.05)
Simulation-based inference
Statistical power
Statistical significance
Transparency in research
Type I error (false positive)
Type II error (false negative)
Winner’s Curse

Methodological morals

“If p-values tell us the probability the null is true, then octopuses are psychic.”
“Statistical tools don't fool us, blind faith in them does.”
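
The first moral can be made concrete with a few lines of arithmetic. The sketch below is a toy illustration, not the hosts' worked example: the 8-for-8 record, the one-in-a-million prior for psychic ability, and the 90% hit rate under the "psychic" hypothesis are all assumed numbers chosen for illustration. It contrasts the p-value (the probability of data this extreme if the octopus is only guessing) with the posterior probability that the octopus is only guessing given the data.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, chance: float = 0.5) -> float:
    """One-sided p-value: P(at least `successes` correct | pure guessing)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def posterior_null(
    successes: int,
    trials: int,
    prior_null: float,
    chance: float = 0.5,
    skill: float = 0.9,
) -> float:
    """P(pure guessing | data) from Bayes' rule in a two-hypothesis toy model.

    `skill` is the assumed hit rate if the octopus really were psychic;
    it is a hypothetical value, not something taken from the episode.
    """
    like_null = comb(trials, successes) * chance**successes * (1 - chance) ** (trials - successes)
    like_alt = comb(trials, successes) * skill**successes * (1 - skill) ** (trials - successes)
    numerator = like_null * prior_null
    return numerator / (numerator + like_alt * (1 - prior_null))


if __name__ == "__main__":
    # Assumed record: 8 correct picks out of 8 matches.
    p = binomial_p_value(8, 8)
    # Assumed prior: a one-in-a-million chance that an octopus is psychic.
    post = posterior_null(8, 8, prior_null=0.999999)
    print(f"p-value: {p:.4f}")
    print(f"P(just guessing | data): {post:.5f}")
```

Under these assumptions the p-value is 1/256, well below 0.05, yet the probability that the octopus is just guessing stays above 0.999. Reading the p-value as "the probability the null is true" would mean concluding that the octopus is almost certainly psychic, which is the inverse fallacy the moral pokes fun at.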

References

Nuzzo R. Scientific method: statistical errors. Nature. 2014 Feb 13;506(7487):150-2. doi: 10.1038/506150a.
Nuzzo R. Scientists perturbed by loss of stat tools to sift research fudge from fact. Scientific American. 2015:16-18.
Nuzzo RL. The inverse fallacy and interpreting P values. PM&R. 2015 Mar;7(3):311-4. doi: 10.1016/j.pmrj.2015.02.011. Epub 2015 Feb 25.
Nuzzo R. Probability wars. New Scientist. 2015;225(3012):38-41.
Sainani KL. Putting P values in perspective. PM&R. 2009 Sep;1(9):873-7. doi: 10.1016/j.pmrj.2009.07.003.
Sainani KL. Clinical versus statistical significance. PM&R. 2012 Jun;4(6):442-5. doi: 10.1016/j.pmrj.2012.04.014.
McLaughlin MJ, Sainani KL. Bonferroni, Holm, and Hochberg corrections: fun names, serious changes to p values. PM&R. 2014 Jun;6(6):544-6. doi: 10.1016/j.pmrj.2014.04.006. Epub 2014 Apr 22.
Sainani KL. The Problem with "Magnitude-based Inference". Med Sci Sports Exerc. 2018 Oct;50(10):2166-2176. doi: 10.1249/MSS.0000000000001645.
Sainani KL, Lohse KR, Jones PR, Vickers A. Magnitude-based Inference is not Bayesian and is not a valid method of inference. Scand J Med Sci Sports. 2019 Sep;29(9):1428-1436. doi: 10.1111/sms.13491.
Lohse KR, Sainani KL, Taylor JA, Butson ML, Knight EJ, Vickers AJ. Systematic review of the use of "magnitude-based inference" in sports science and medicine. PLoS One. 2020 Jun 26;15(6):e0235318. doi: 10.1371/journal.pone.0235318.
Wasserstein RL, Lazar NA. The ASA statement on p-values: context, process, and purpose. The American Statistician. 2016;70(2):129-133.

Kristin and Regina’s online courses:

Demystifying Data: A Modern Approach to Statistical Understanding

Clinical Trials: Design, Strategy, and Analysis

Medical Statistics Certificate Program

Writing in the Sciences

Epidemiology and Clinical Research Graduate Certificate Program

Programs that we teach in:

Epidemiology and Clinical Research Graduate Certificate Program

Find us on:

Kristin - LinkedIn & Twitter/X

Regina - LinkedIn & ReginaNuzzo.com

(00:00) - Intro & claim of the episode

(01:00) - Why p-values matter in science

(02:44) - What is a p-value? (ESP guessing game)

(06:47) - Big vs. small p-values (psychic octopus example)

(08:29) - Significance thresholds and the 0.05 rule

(09:00) - Regina’s Nature paper on p-values

(11:32) - Misconceptions about p-values

(13:18) - Fisher vs. Neyman-Pearson (history & feud)

(16:26) - Botox analogy and type I vs. type II errors

(19:41) - Dating app analogies for false positives/negatives

(22:02) - How the 0.05 cutoff got enshrined

(24:40) - Misinterpretations: statistical vs. practical significance

(26:16) - Effect size, sample size, and “statistically discernible”

(26:45) - P-hacking and researcher degrees of freedom

(29:46) - Transparency, preregistration, and open science

(30:52) - The 0.05 cutoff trap (p = 0.049 vs 0.051)

(31:18) - The biggest misinterpretation: what p-values actually mean

(33:29) - Paul the psychic octopus (worked example)

(35:59) - Why Bayesian statistics differ

(39:49) - Why aren’t we all Bayesian? (probability wars)

(41:05) - The ASA p-value statement (behind the scenes)

(43:16) - Key principles from the ASA white paper

(44:15) - Wrapping up Regina’s paper

(45:33) - Kristin’s paper on sports science (MBI)

(48:10) - What MBI is and how it spread

(50:43) - How Kristin got pulled in (Christie Aschwanden & FiveThirtyEight)

(54:05) - Critiques of MBI and “Bayesian monster” rebuttal

(56:14) - Spreadsheet autopsies (Welsh & Knight)

(58:05) - Cherry juice example (why MBI misleads)

(01:00:22) - Rebuttals and smoke & mirrors from MBI advocates

(01:02:55) - Winner’s Curse and small samples

(01:03:38) - Twitter fights & “establishment statistician”

(01:05:56) - Cult-like following & Matrix red pill analogy

(01:08:06) - Wrap-up
