Data Skeptic

Author: Kyle Polich

Subscribed: 32,319 · Played: 593,891

Description

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.
554 Episodes
Alex Bisberg, a PhD candidate at the University of Southern California, specializes in network science and game analytics, with a focus on understanding social and competitive success in multiplayer online games. In this episode, listeners can expect to learn about player interactions and patterns of behavior from a network perspective. Through his research on games, Alex sheds light on how network analysis and statistical tests might explain positive contagious behaviors, such as generosity, and explores the dynamics of collaboration and competition in gaming environments. These insights offer valuable lessons not only for game developers looking to enhance player experience, engagement, and retention, but also for anyone interested in understanding how virtual interactions shape social networks and behavior.
In this episode we discuss the GitHub collaboration network with Behnaz Moradi-Jamei, assistant professor at James Madison University. As a network scientist, Behnaz created and analyzed a network of about 700,000 contributors to GitHub repositories. The network was built by treating developers as nodes and linking them with edges based on shared contributions to the same repositories: if two developers contributed to the same project, an edge was formed between them, representing a collaborative relationship. The resulting network contains 32 million such connections. Using community detection algorithms, Behnaz's analysis reveals how developer communities form, function, and evolve, which can serve as guidance for OSS community managers.
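The construction described above (developers as nodes, an edge whenever two developers contributed to the same repository) can be sketched in a few lines. This is a minimal illustration on made-up data, not the pipeline used in the research: the function name `build_collaboration_edges` and the toy contribution records are assumptions, and community detection itself is not shown.

```python
from collections import defaultdict
from itertools import combinations


def build_collaboration_edges(contributions):
    """Turn (developer, repository) pairs into undirected developer-developer edges.

    Two developers are linked if they contributed to at least one common repository.
    """
    # Group contributors by repository.
    contributors_by_repo = defaultdict(set)
    for developer, repo in contributions:
        contributors_by_repo[repo].add(developer)

    # Link every pair of developers who share a repository.
    # Sorting gives each undirected edge a single canonical form.
    edges = set()
    for developers in contributors_by_repo.values():
        for a, b in combinations(sorted(developers), 2):
            edges.add((a, b))
    return edges


# Hypothetical toy data, for illustration only.
contributions = [
    ("alice", "repo-x"),
    ("bob", "repo-x"),
    ("bob", "repo-y"),
    ("carol", "repo-y"),
    ("dave", "repo-z"),
]

edges = build_collaboration_edges(contributions)
print(sorted(edges))
```

A community detection algorithm would then be run on this edge set. Note that a repository with n contributors yields n(n-1)/2 edges, so large projects dominate the edge count.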
We are joined by Abhishek Paudel, a PhD Student at George Mason University with a research focus on robotics, machine learning, and planning under uncertainty, using graph-based methods to enhance robot behavior. He explains how graph-based approaches can model environments, capture spatial relationships, and provide a framework for integrating multiple levels of planning and decision-making.
We are joined by Maciej Besta, a senior researcher of sparse graph computations and large language models at the Scalable Parallel Computing Lab (SPCL). In this episode, we explore the intersection of graph theory and high-performance computing (HPC), Graph Neural Networks (GNNs) and LLMs.
Graph Databases and AI

2024-10-21 · 35:58

In this episode, we sit down with Yuanyuan Tian, a principal scientist manager at Microsoft Gray Systems Lab, to discuss the evolving role of graph databases in various industries such as fraud detection in finance and insurance, security, healthcare, and supply chain optimization. 
Our new season "Graphs and Networks" begins here!  We are joined by new co-host Asaf Shapira, a network analysis consultant and the podcaster of NETfrix – the network science podcast. Kyle and Asaf discuss ideas to cover in the season and explore Asaf's work in the field.
Join us for our capstone episode on the Animal Intelligence season. We recap what we loved, what we learned, and things we wish we had gotten to spend more time on. This is a great episode to see how the podcast is produced. Now that the season is ending, our current co-host, Becky, is moving to emeritus status. In this last installment we got to spend a little more time getting to know Becky and where her work will take her after this. Did Data Skeptic inspire her to learn more about machine learning? Tune in and find out. 
David Obembe, a recent University of Tartu graduate, discussed his master's thesis on integrating LLMs with process mining tools. He explained how process mining uses event logs to create maps that identify inefficiencies in business processes. David shared his research on LLMs' potential to enhance process mining, including experiments evaluating their performance and future improvements using Retrieval Augmented Generation (RAG).
Open Animal Tracks

2024-09-17 · 22:45

Our guest today is Risa Shinoda, a PhD student at the Kyoto University Agricultural Systems Engineering Lab, where she applies computer vision techniques. She talked about the OpenAnimalTracks dataset and what it is used for: helping researchers predict which animal left a given footprint. She also discussed how she built a model for predicting animal tracks, the algorithms used, the accuracy achieved, and opportunities for further improving the model.
This episode features an interview with Mélisande Teng, a PhD candidate at Université de Montréal. Her research lies in the intersection of remote sensing and computer vision for biodiversity monitoring.
Ant Encounters

2024-08-26 · 31:26

In this interview with author Deborah Gordon, Kyle asks questions about the mechanisms at work in an ant colony and what ants might teach us about how to build artificial intelligence. Ants are surprisingly adaptive creatures whose behavior emerges from their complex interactions. Aspects of network theory and the statistical nature of ant behavior are just some of the interesting details you'll get in this episode.  
Computing Toolbox

2024-08-19 · 38:44

This season it’s become clear that computing skills are vital for working in the natural sciences. In this episode, we were fortunate to speak with Madlen Wilmes, co-author of the book "Computing Skills for Biologists: A Toolbox". We discussed the book and why it’s a great resource for students and teachers. Madlen also shared her experience and advice on transitioning from academia to an industry career, and on how data analytic skills transfer to jobs that professionals might not always consider. Join us and learn more about the book and careers using transferable skills.
In this episode, we talked shop with Hager Radi about her biodiversity monitoring work. While biodiversity modeling may sound simple, count organisms and mark their location, there is a lot more to it than that! Incomplete and biased data can make estimations hard. There are also many species with very few observations in the wild. Using machine learning and remote sensing data, scientists can build models that predict species distributions with limited data. Listen in and hear about Hager’s work tackling these challenges and the tools she has built.
Hacking the Colony

2024-08-08 · 41:03

Today, Ashay Aswale and Tony Lopez shared their work on swarm robotics and what they have learned from ants. Robotic swarms must solve the same problems that eusocial insects do. What if your pheromone trail goes cold? What if you’re getting bad information from a bad actor within the swarm? Answering these questions can help tackle serious robotic challenges. For example, a swarm of robots can lose a few members to accidents and malfunctions, but a single large robot cannot. Additionally, a swarm could host many castes, like an ant colony. Specialization with redundancy built in seems like a win-win! Tune in and hear more about this fascinating topic.
Primate Poses

2024-07-31 · 32:57

During this season we have talked with researchers working to utilize machine learning for behavioral observations. In previous episodes, you have heard about the software people like Richard use, but you haven’t heard much from scientists modifying and using these tools for specific research cases. PhD student Richard Vogg is working with multi-camera setups to track lemurs and macaques solving puzzle boxes in the wild. His work is part of a larger movement to automate behavioral analyses of video data. Listen in and learn why this tech is useful and why multi-camera setups are a good idea for more reliably identifying poses and individual animals.
Generative AI can struggle to create realistic animals, and 2D representations often have mistakes like extra limbs and tails. If 2D wasn’t hard enough, there are researchers working on generative 3D models. 3D models present an extra challenge because there is a paucity of training datasets. In this episode, PhD students Sandeep and Oindrila walked us through their work on creating 3D animals using 2D data. Join us to learn about their pipelines, quality control, tie-in with iNaturalist, and how this tech could streamline FX pipelines.
Weird Communication

2024-07-15 · 38:29

Today, we sat down with Dr. Ignacio Escalante Meza to learn about opiliones and treehoppers. Opiliones, known as “daddy long legs” in the US, are understudied arachnids known for their tenacious locomotor behavior, sociality, and chemical communication. Treehoppers communicate through the stems of plants using vibrations. They can signal danger, attract mates, and communicate with their offspring. Join us to learn how researchers turn their vibrations into sound waves and study what they have to say.
Human shipping operations have increased significantly in the past few decades. While that means international trade and cheap goods for humans, it also means the ocean has experienced an increase in noise pollution. This has a measurable negative impact on marine mammals and other aquatic life. Could mathematics be the solution? This interview explores how optimization techniques can guide voyage planning in a way that balances multiple objectives, including fuel cost and sound reduction.
Robbie Moon from the Georgia Tech Scheller College of Business joins us to discuss the analysis of unstructured data and the application of NLP methodologies towards financial data.
iNaturalist

2024-06-24 · 37:53

Have you ever participated in citizen science? Do you want to? One of the most popular platforms for crowdsourcing biodiversity data is iNaturalist. In addition to being a great science tool, the iNaturalist app can help you identify the organisms you encounter every day. We talked to Executive Director Scott Loarie about how scientists use iNaturalist. We also got to discuss what makes iNaturalist’s AI species recognition so good, and how citizen scientists are constantly providing high-quality training data. Listen in and learn how this fun-to-use tool works, where it's headed, and how you can get involved.
Comments (32)

DemonDogs

incredibly awful audio for some reason

Apr 25th

Bill

Wow this is fascinating

Oct 1st

DemonDogs

the guest has classic psychological and groupthink issues in his research he needs to get out more

Aug 3rd

Data Science LL

Hi guys in data skeptic. Thanks to share your valuable contents. May I acheive the text of your podcast? Wish luck.

Jul 20th

Vassili Savinov

great episode, cant wait to hear next one. Thanks!

Jun 28th

DemonDogs

Description?

Jan 23rd

ncooty

@6:00: The threshold for statistical significance does not "depend on the outcome." It raises a red flag even to hear someone say that, especially the host of a "data science" podcast. (Of course, if he knew what he was talking about, he'd be a "statistician" instead.) He might more accurately have said that any such estimate of the minimum sample size depends on the number of planned comparisons and the assumed effect size for each measured effect. Confusion about this should disqualify someone from hosting such a podcast.

Aug 23rd

ncooty

@2:19: Too much interpretation as if respondents were randomly sampled. Respondents self-selected.

Aug 23rd

Antonio Andrade

thanks so much for sharing the results

Aug 12th

ncooty

@1:03: It doesn't "beg the question"; it "raises the question." To "beg the question" is to commit a logical fallacy in which one assumes the conclusion.

Jun 15th

Benjamin Weckerle

Is the spin-off / journal club podcast on castbox?

Jun 2nd

Platte Gruber

KILLER intro, awesome work!

Jan 8th