Normal Curves: Sexy Science, Serious Statistics

P-Values: Are we using a flawed statistical tool?

Update: 2025-09-22

Description

P-values show up in almost every scientific paper, yet they’re one of the most misunderstood ideas in statistics. In this episode, we break from our usual journal-club format to unpack what a p-value really is, why researchers have fought about it for a century, and how that famous 0.05 cutoff became enshrined in science. Along the way, we share stories from our own papers—from a Nature feature that helped reshape the debate to a statistical sleuthing project that uncovered a faulty method in sports science. The result: a behind-the-scenes look at how one statistical tool has shaped the culture of science itself.


Statistical topics

  • Bayesian statistics
  • Confidence intervals 
  • Effect size vs. statistical significance
  • Fisher’s conception of p-values
  • Frequentist perspective
  • Magnitude-Based Inference (MBI)
  • Multiple testing / multiple comparisons
  • Neyman-Pearson hypothesis testing framework
  • P-hacking
  • Posterior probabilities
  • Preregistration and registered reports
  • Prior probabilities
  • P-values
  • Researcher degrees of freedom
  • Significance thresholds (p < 0.05)
  • Simulation-based inference
  • Statistical power 
  • Statistical significance
  • Transparency in research 
  • Type I error (false positive)
  • Type II error (false negative)
  • Winner’s Curse
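The episode's running example, Paul the "psychic" octopus (who picked the winner in all 8 of his 2010 World Cup matches), lends itself to a simulation-based p-value. The sketch below is ours, not from the episode: under the null hypothesis that Paul guesses each match like a fair coin flip, we ask how often pure luck does at least as well as 8 out of 8.

```python
import random

random.seed(42)

# Paul the "psychic" octopus picked the winner in 8 of 8 matches.
# Null hypothesis: Paul guesses each match like a fair coin flip.
# The p-value is the probability of doing at least this well by luck alone.

n_matches = 8
n_correct_observed = 8
n_sims = 100_000

extreme = 0
for _ in range(n_sims):
    correct = sum(random.random() < 0.5 for _ in range(n_matches))
    if correct >= n_correct_observed:
        extreme += 1

p_sim = extreme / n_sims
p_exact = 0.5 ** n_matches  # exact one-sided p-value: 1/256, about 0.004

print(f"simulated p = {p_sim:.4f}, exact p = {p_exact:.4f}")
```

Note that a p-value this small says the data are surprising *if* Paul is guessing; it is not the probability that Paul lacks psychic powers, which is the misreading the episode warns against.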


Methodological morals

  • “​​If p-values tell us the probability the null is true, then octopuses are psychic.”
  • “Statistical tools don't fool us, blind faith in them does.”
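Two of the topics above, p-hacking and multiple testing, come down to simple arithmetic: run enough tests on pure noise and "significant" results appear on their own. A minimal simulation of that arithmetic (ours, not the hosts'):

```python
import random

random.seed(0)

# Each test of a true null has a 5% chance of landing below 0.05.
# If a study runs 20 independent tests, the chance that at least one
# is "significant" purely by luck is 1 - 0.95**20, roughly 0.64.

ALPHA = 0.05
n_tests = 20       # comparisons per study (researcher degrees of freedom)
n_studies = 10_000

def null_p_value():
    # Under a true null with a continuous test statistic,
    # the p-value is uniformly distributed on (0, 1).
    return random.random()

false_hits = sum(
    any(null_p_value() < ALPHA for _ in range(n_tests))
    for _ in range(n_studies)
)
rate = false_hits / n_studies
print(f"studies with at least one false positive: {rate:.2f}")
```

This is why preregistration and corrections for multiple comparisons matter: without them, a determined search through 20 analyses will "find" something most of the time.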


References


Kristin and Regina’s online courses:

  • Demystifying Data: A Modern Approach to Statistical Understanding
  • Clinical Trials: Design, Strategy, and Analysis
  • Medical Statistics Certificate Program
  • Writing in the Sciences
  • Epidemiology and Clinical Research Graduate Certificate Program

Programs that we teach in:

  • Epidemiology and Clinical Research Graduate Certificate Program


Find us on:

  • Kristin - LinkedIn & Twitter/X
  • Regina - LinkedIn & ReginaNuzzo.com

Timestamps

  • (00:00) - Intro & claim of the episode
  • (01:00) - Why p-values matter in science
  • (02:44) - What is a p-value? (ESP guessing game)
  • (06:47) - Big vs. small p-values (psychic octopus example)
  • (08:29) - Significance thresholds and the 0.05 rule
  • (09:00) - Regina’s Nature paper on p-values
  • (11:32) - Misconceptions about p-values
  • (13:18) - Fisher vs. Neyman-Pearson (history & feud)
  • (16:26) - Botox analogy and type I vs. type II errors
  • (19:41) - Dating app analogies for false positives/negatives
  • (22:02) - How the 0.05 cutoff got enshrined
  • (24:40) - Misinterpretations: statistical vs. practical significance
  • (26:16) - Effect size, sample size, and “statistically discernible”
  • (26:45) - P-hacking and researcher degrees of freedom
  • (29:46) - Transparency, preregistration, and open science
  • (30:52) - The 0.05 cutoff trap (p = 0.049 vs. 0.051)
  • (31:18) - The biggest misinterpretation: what p-values actually mean
  • (33:29) - Paul the psychic octopus (worked example)
  • (35:59) - Why Bayesian statistics differ
  • (39:49) - Why aren’t we all Bayesian? (probability wars)
  • (41:05) - The ASA p-value statement (behind the scenes)
  • (43:16) - Key principles from the ASA white paper
  • (44:15) - Wrapping up Regina’s paper
  • (45:33) - Kristin’s paper on sports science (MBI)
  • (48:10) - What MBI is and how it spread
  • (50:43) - How Kristin got pulled in (Christie Aschwanden & FiveThirtyEight)
  • (54:05) - Critiques of MBI and “Bayesian monster” rebuttal
  • (56:14) - Spreadsheet autopsies (Welsh & Knight)
  • (58:05) - Cherry juice example (why MBI misleads)
  • (01:00:22) - Rebuttals and smoke & mirrors from MBI advocates
  • (01:02:55) - Winner’s Curse and small samples
  • (01:03:38) - Twitter fights & “establishment statistician”
  • (01:05:56) - Cult-like following & Matrix red pill analogy
  • (01:08:06) - Wrap-up


Comments (1)

Nima Rahmati

What a podcast. Thank you both. Professor Sainani, I've learned a lot from you, and I try to apply your valuable lessons and insights in my academic career. You make me enjoy science even when I feel disappointed or frustrated, and I am truly grateful for that.

Oct 14th


Regina Nuzzo and Kristin Sainani