
Data Skeptic

Author: Kyle Polich

Subscribed: 29,981
Played: 424,642


The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.
353 Episodes
Nirupam Gupta, a Computer Science postdoctoral researcher at EPFL in Switzerland, joins us today to discuss his work “Byzantine Fault-Tolerance in Peer-to-Peer Distributed Gradient-Descent.”   Works Mentioned: Byzantine Fault-Tolerance in Peer-to-Peer Distributed Gradient-Descent by Nirupam Gupta and Nitin H. Vaidya   Conference Details:
Mikko Lauri, Post Doctoral researcher at the University of Hamburg, Germany, comes on the show today to discuss the work Information Gathering in Decentralized POMDPs by Policy Graph Improvements. Follow Mikko: @mikko_lauri Github
Leaderless Consensus


Balaji Arun, a PhD student in the Systems Software Research Group at Virginia Tech, joins us today to discuss his research on distributed systems through the paper “Taming the Contention in Consensus-based Distributed Systems.”  Works Mentioned: “Taming the Contention in Consensus-based Distributed Systems” by Balaji Arun, Sebastiano Peluso, Roberto Palmieri, Giuliano Losa, and Binoy Ravindran; “Fast Paxos” by Leslie Lamport
Maartje ter Hoeve, PhD student at the University of Amsterdam, joins us today to discuss her research in automated summarization through the paper “What Makes a Good Summary? Reconsidering the Focus of Automatic Summarization.”  Works Mentioned: “What Makes a Good Summary? Reconsidering the Focus of Automatic Summarization” by Maartje ter Hoeve, Julia Kiseleva, and Maarten de Rijke Contact Email: Twitter: Website:



Brian Brubach, Assistant Professor in the Computer Science Department at Wellesley College, joins us today to discuss his work “Meddling Metrics: the Effects of Measuring and Constraining Partisan Gerrymandering on Voter Incentives.” Works Mentioned: Meddling Metrics: the Effects of Measuring and Constraining Partisan Gerrymandering on Voter Incentives by Brian Brubach, Aravind Srinivasan, and Shawn Zhao
Aside from victory questions like “can black force a checkmate on white in 5 moves?” many novel questions can be asked about a game of chess. Some questions are trivial (e.g. “How many pieces does white have?”) while more computationally challenging questions can contribute interesting results in computational complexity theory. In this episode, Josh Brunner, Master's student in Theoretical Computer Science at MIT, joins us to discuss his recent paper Complexity of Retrograde and Helpmate Chess Problems: Even Cooperative Chess is Hard. Works Mentioned: Complexity of Retrograde and Helpmate Chess Problems: Even Cooperative Chess is Hard by Josh Brunner, Erik D. Demaine, Dylan Hendrickson, and Julian Wellman; 1x1 Rush Hour With Fixed Blocks is PSPACE Complete by Josh Brunner, Lily Chung, Erik D. Demaine, Dylan Hendrickson, Adam Hesterberg, Adam Suhl, and Avi Zeff
Eli Goldweber, a graduate student at the University of Michigan, comes on today to share his work in applying formal verification to systems and a modification to the Paxos protocol discussed in the paper On the Significance of Consecutive Ballots in Paxos. Works Mentioned: Previous Episode on Paxos Paper: On the Significance of Consecutive Ballots in Paxos by Eli Goldweber, Nuda Zhang, and Manos Kapritsos Thanks to our sponsor: NordVPN: 68% off a 2-year plan and one month free! With NordVPN, all the data you send and receive online travels through an encrypted tunnel. This way, no one can get their hands on your private information. NordVPN is quick and easy to use to protect the privacy and security of your data. Check them out at
Today on the show we have Adrian Martin, a postdoctoral researcher from Pompeu Fabra University in Barcelona, Spain. He comes on the show today to discuss his research from the paper “Convolutional Neural Networks can be Deceived by Visual Illusions.” Works Mentioned in Paper: “Convolutional Neural Networks can be Deceived by Visual Illusions” by Alexander Gomez-Villa, Adrian Martin, Javier Vazquez-Corral, and Marcelo Bertalmio Examples: Snake Illusions Twitter: Alex: @alviur Adrian:  @adriMartin13 Thanks to our sponsor! Keep your home internet connection safe with NordVPN! Get 68% off plus a free month at  (30-day money-back guarantee!)
Have you ever wanted to hear what an earthquake sounds like? Today on the show we have Omkar Ranadive, Computer Science Master's student at Northwestern University, who collaborates with Suzan van der Lee, an Earth and Planetary Sciences professor at Northwestern University, on the crowd-sourcing project Earthquake Detective.  Email Links: Suzan:  Omkar: Works Mentioned:  Paper: Applying Machine Learning to Crowd-sourced Data from Earthquake Detective by Omkar Ranadive, Suzan van der Lee, Vivian Tang, and Kevin Chao Github: Earthquake Detective: Thanks to our sponsors! Brilliant is an awesome platform with interesting courses, like Quantum Computing! There is something for you and surely something for the whole family! Get 20% off Brilliant Premium at
Byzantine fault tolerance (BFT) is a desirable property in a distributed computing environment. BFT means the system can survive the loss of nodes and nodes becoming unreliable. There are many different protocols for achieving BFT, though not all options can scale to large network sizes. Ted Yin joins us to explain BFT, survey the wide variety of protocols, and share details about HotStuff.
Alpha Fold


Kyle shares some initial reactions to the announcement about Alpha Fold 2's celebrated performance in the CASP14 prediction.  By many accounts, this exciting result means protein folding is now a solved problem. Thanks to our sponsors! Brilliant is a great last-minute gift idea! Give access to 60+ interactive courses including Quantum Computing and Group Theory. There's something for everyone at Brilliant. They have award-winning courses, taught by teachers, researchers and professionals from MIT, Caltech, Duke, Microsoft, Google and many more. Check them out at to take advantage of 20% off a Premium membership. BetterHelp is an online professional counseling platform. Start communicating with a licensed professional in under 24 hours! It's safe, private and convenient. From online messages to phone and video calls, there is something for everyone. Get 10% off your first month at
Above all, everyone wants voting to be fair. What does fair mean and how can we measure it? Kenneth Arrow posited a simple set of conditions that one would certainly desire in a voting system. For example, unanimity: if everyone picks candidate A, then A should win! Yet surprisingly, under a few basic assumptions, Arrow's impossibility theorem demonstrates that no voting system exists which can satisfy all the criteria. This episode is a discussion about the structure of the proof and some of its implications. Works Mentioned: A Difficulty in the Concept of Social Welfare by Kenneth J. Arrow   Three Brief Proofs of Arrow's Impossibility Theorem by John Geanakoplos   Thank you to our sponsors!   BetterHelp is much more affordable than traditional offline counseling, and financial aid is available! Get started in less than 24 hours. Data Skeptic listeners get 10% off your first month when you visit:   Let Springboard School of Data jumpstart your data career! With 100% online and remote schooling, supported by a vast network of professional mentors with a tuition-back guarantee, you can't go wrong. Up to twenty $500 scholarships will be awarded to Data Skeptic listeners. Check them out at and enroll using code: DATASK
As the COVID-19 pandemic continues, the public (or at least those with Twitter accounts) are sharing their personal opinions about mask-wearing via Twitter. What does this data tell us about public opinion? How does it vary by demographic? What, if anything, can make people change their minds? Today we speak to Neil Yeung and Jonathan Lai, undergraduate students in the Department of Computer Science at the University of Rochester, and Professor of Computer Science Jiebo Luo, to discuss their recent paper, Face Off: Polarized Public Opinions on Personal Face Mask Usage during the COVID-19 Pandemic. Works Mentioned Emails: Neil Yeung Jonathan Lai Jiebo Luo Thanks to our sponsors! Springboard School of Data offers a comprehensive career program encompassing data science, analytics, engineering, and Machine Learning. All courses are online and tailored to fit the lifestyle of working professionals. Up to 20 Data Skeptic listeners will receive $500 scholarships. Apply today at Check out Brilliant's group theory course to learn about object-oriented design! Brilliant is great for learning something new or to get an easy-to-look-at review of something you already know. Check them out at to get 20% off of a year of Brilliant Premium!
Niclas Boehmer, second-year PhD student at Berlin Institute of Technology, comes on today to discuss the computational complexity of bribery in elections through the paper “On the Robustness of Winners: Counting Briberies in Elections.” Links Mentioned: Works Mentioned: “On the Robustness of Winners: Counting Briberies in Elections” by Niclas Boehmer, Robert Bredereck, Piotr Faliszewski, and Rolf Niedermeier Thanks to our sponsors: Springboard School of Data: Springboard is a comprehensive end-to-end online data career program. Create a portfolio of projects to spring your career into action. Learn more about how you can be one of twenty $500 scholarship recipients at This opportunity is exclusive to Data Skeptic listeners. (Enroll with code: DATASK) NordVPN: Protect your home internet connection with unlimited bandwidth. Data Skeptic listeners, take advantage of their Black Friday offer: purchase a 2-year plan, get 4 additional months free. (Use coupon code DATASKEPTIC)
Clement Fung, a Societal Computing PhD student at Carnegie Mellon University, discusses his research in security of machine learning systems and a defense against targeted sybil-based poisoning called FoolsGold. Works Mentioned: The Limitations of Federated Learning in Sybil Settings Twitter: @clemfung Website: Thanks to our sponsors: Brilliant - Online learning platform. Check out Geometry Fundamentals! Visit for 20% off Brilliant Premium! BetterHelp - Convenient, professional, and affordable online counseling. Take 10% off your first month at
Simson Garfinkel, Senior Computer Scientist for Confidentiality and Data Access at the US Census Bureau, discusses his work modernizing the Census Bureau disclosure avoidance system from private to public disclosure avoidance techniques using differential privacy. Some of the discussion revolves around the topics in the paper Randomness Concerns When Deploying Differential Privacy.   WORKS MENTIONED: “Calibrating Noise to Sensitivity in Private Data Analysis” by Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith "Issues Encountered Deploying Differential Privacy" by Simson L Garfinkel, John M Abowd, and Sarah Powazek "Randomness Concerns When Deploying Differential Privacy" by Simson L. Garfinkel and Philip Leclerc Check out:   Thank you to our sponsor, BetterHelp. Professional and confidential in-app counseling for everyone. Save 10% on your first month of services with
Distributed Consensus


Heidi Howard, Computer Science research fellow at Cambridge University, discusses Paxos, Raft, and distributed consensus in distributed systems along with her work “Paxos vs. Raft: Have we reached consensus on distributed consensus?” She goes into detail about the leaders in Paxos and Raft and how the Raft consensus algorithm actually inspired her to pursue her PhD. Paxos vs Raft paper: Leslie Lamport paper “The Part-Time Parliament” Leslie Lamport paper “Paxos Made Simple” Twitter: @heidiann360 Thank you to our sponsor! Their apps challenge is still accepting submissions! Find more information at
ACID Compliance


Linhda joins Kyle today to talk through A.C.I.D. Compliance (atomicity, consistency, isolation, and durability). The presence of these four properties helps ensure that a database's transactions are processed reliably. Kyle uses examples such as Google Sheets, bank transactions, and even the game Rummikub.   Thanks to this week's sponsors: - Their Apps Challenge is underway and available at Brilliant - Check out their Quantum Computing Course, I highly recommend it! Other interesting topics I’ve seen are Neural Networks and Logic. Check them out at
Patrick Rosenstiel joins us to discuss the National Popular Vote.
Defending the p-value


Yudi Pawitan joins us to discuss his paper Defending the P-value.
Comments (18)



Jan 23rd


@6:00: The threshold for statistical significance does not "depend on the outcome." It raises a red flag even to hear someone say that, especially the host of a "data science" podcast. (Of course, if he knew what he was talking about, he'd be a "statistician" instead.) He might more accurately have said that any such estimate of the minimum sample size depends on the number of planned comparisons and the assumed effect size for each measured effect. Confusion about this should disqualify someone from hosting such a podcast.

Aug 23rd


@2:19: Too much interpretation as if respondents were randomly sampled. Respondents self-selected.

Aug 23rd

Antonio Andrade

thanks so much for sharing the results

Aug 12th


@1:03: It doesn't "beg the question"; it "raises the question." To "beg the question" is to commit a logical fallacy in which one assumes the conclusion.

Jun 15th

Benjamin Weckerle

Is the spin-off / journal club podcast on castbox?

Jun 2nd

Platte Gruber

KILLER intro, awesome work!

Jan 8th

Marco Gorelli

"I find it stunning that people don't do that. The only thing I can think of is that there's just the lack of focused time. There's so many things we could spend our time on now we spend a little on all of them and we don't have depth that we need. A lot of people will come to a conference or something like that just to be away from work and only focus on one thing. Unfortunately they also bring their phone and completely break that paradigm. " ouch

Dec 8th

Bhavul Gauri

Brilliantly put!

Aug 27th

Akshay Shirsath

Thoughtful episode.

Jun 8th

Achint Verma

A very very high level introduction to Kalman Filters. You could have talked about the matrices.

Mar 29th

Troy Kirin

Golden, thanks for this!

Mar 19th

Vannucci Santos

Why was the guy talking about ethics so evasive?

Dec 25th

Anna Malahova

I love everything about this podcast channel! It is easy to listen to and easy to understand without a data science background, with interesting topics and examples of situations where they apply. Very enjoyable and entertaining delivery. Informative show notes that help you recall what the episode was about even after a while. Really can't think of any downsides. I am listening to all episodes starting from the early days like an audiobook, and I love how the music intro evolved over time.

Nov 24th

Abdul Wahab Abrar

What about using Deep Learning techniques directly and integrating them with Neuroscience?

Feb 12th

Giancarlo Vercellino

Relevant from the 23rd minute

Dec 12th