Data Skeptic

Author: Kyle Polich

Subscribed: 32,364
Played: 596,103

Description

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.
557 Episodes
In this episode, Dave Bechberger, Principal Graph Architect at AWS and author of "Graph Databases in Action", brings deep insights into the field of graph databases and their applications. Together we delve into specific scenarios in which graph databases provide unique solutions, such as fraud detection, and learn how to optimize a database for questions about connections, such as "How are these entities related?" or "What patterns of interaction indicate anomalies?" This discussion sheds light on when organizations should consider adopting graph databases, particularly for cases that require scalable analysis of highly interconnected data, and provides practical insights into leveraging graph databases for performance improvements in tasks that traditional relational databases struggle with.
Graph Transformations

2024-12-09 · 32:47

In this episode, we talk with Adam Machowczyk, a PhD student at the University of Leicester who specializes in graph rewriting and its intersection with machine learning, particularly Graph Neural Networks. Adam explains how graph rewriting provides a formalized method to modify graphs using rule-based transformations, allowing for tasks like graph completion, attribute prediction, and structural evolution. By bridging the worlds of graph rewriting and machine learning, Adam's work aspires to open new possibilities for creating adaptive, scalable models capable of solving challenges that traditional methods struggle with, such as handling heterogeneous graphs or incorporating incremental updates efficiently. Real-life applications discussed include using graph transformations to improve recommender systems in social networks, molecular research in chemistry, and IoT network analysis.
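As a rough illustration of a rule-based graph transformation, the sketch below applies one simple rewrite rule to an undirected graph: wherever edges a-b and b-c exist but a-c does not, add a-c (a naive form of graph completion). The rule, the function name, and the toy data are assumptions chosen for illustration; they are not taken from Adam's work.

```python
def apply_triangle_rule(edges):
    """Apply one pass of a rewrite rule: if a-b and b-c exist but a-c does not, add a-c.

    `edges` is an iterable of (node, node) pairs describing an undirected graph.
    Returns the rewritten edge set as a set of frozensets.
    """
    edge_set = {frozenset(e) for e in edges}

    # Build an adjacency map so the left-hand side of the rule (a-b-c) is easy to match.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    new_edges = set()
    for b, nbrs in adjacency.items():
        for a in nbrs:
            for c in nbrs:
                # Match: a and c share neighbor b and are not yet connected.
                if a != c and frozenset((a, c)) not in edge_set:
                    new_edges.add(frozenset((a, c)))

    # Right-hand side of the rule: the original graph plus the closing edges.
    return edge_set | new_edges


# Hypothetical toy graph: a path x - y - z
rewritten = apply_triangle_rule([("x", "y"), ("y", "z")])
```

Matches are collected against the original graph and applied together, so a single call performs exactly one rewriting step.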
In this episode, data scientist Wentao Su shares his experience in A/B testing on social media platforms like LinkedIn and TikTok. We talk about how network science can enhance A/B testing by accounting for complex social interactions, especially in environments where users are both viewers and content creators. These interactions might cause a "spillover effect", meaning that influence can cross between experimental groups and distort the results. To mitigate this effect, our guest presents heuristics and algorithms they developed ("one-degree label propagation") that produce good results on big data with minimal running time, and so optimize user experience and advertiser performance on social media platforms.
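To make the spillover idea concrete, here is a minimal sketch that measures, for each user, the fraction of their direct connections assigned to the other experimental group. A high fraction suggests the user's outcome may be influenced across groups. This is only a simple diagnostic under assumed inputs (an edge list and a group assignment, both hypothetical); it is not the guest's "one-degree label propagation" algorithm.

```python
def cross_group_exposure(edges, assignment):
    """Return, per user, the fraction of neighbors assigned to a different group.

    `edges` is an iterable of (user, user) pairs (undirected connections).
    `assignment` maps each user to a group label such as "treatment" or "control".
    """
    neighbors = {user: set() for user in assignment}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    exposure = {}
    for user, nbrs in neighbors.items():
        if not nbrs:
            # Isolated users cannot be exposed through the network.
            exposure[user] = 0.0
            continue
        other = sum(1 for n in nbrs if assignment[n] != assignment[user])
        exposure[user] = other / len(nbrs)
    return exposure


# Hypothetical toy network and group assignment
edges = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u3", "u4")]
assignment = {"u1": "treatment", "u2": "treatment", "u3": "control", "u4": "control"}
exposure = cross_group_exposure(edges, assignment)
```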
Alex Bisberg, a PhD candidate at the University of Southern California, specializes in network science and game analytics, with a focus on understanding social and competitive success in multiplayer online games. In this episode, listeners can expect to learn about players' interactions and patterns of behavior from a network perspective. Through his research on games, Alex sheds light on how network analysis and statistical tests might explain positive contagious behaviors, such as generosity, and explores the dynamics of collaboration and competition in gaming environments. These insights offer valuable lessons not only for game developers looking to enhance player experience, engagement and retention, but also for anyone interested in understanding the ways that virtual interactions shape social networks and behavior.
In this episode we discuss the GitHub Collaboration Network with Behnaz Moradi-Jamei, assistant professor at James Madison University. As a network scientist, Behnaz created and analyzed a network of about 700,000 contributors to GitHub repositories. The network was built by treating developers as nodes and linking them with edges based on shared contributions to the same repositories: if two developers contributed to the same project, an edge (connection) was formed between them to represent a collaborative relationship. The resulting network consists of 32 million such connections. Using community detection algorithms, Behnaz's analysis reveals insights into how developer communities form, function, and evolve, which can serve as guidance for OSS community managers.
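The construction described here (developers as nodes, an edge whenever two developers contributed to the same repository) can be sketched in a few lines of Python. The contribution records, names, and the choice to store a shared-repository count on each edge are assumptions made for illustration; they are not taken from the study.

```python
from collections import defaultdict
from itertools import combinations


def build_collaboration_edges(contributions):
    """Return a dict mapping developer pairs to the number of repositories they share.

    `contributions` is an iterable of (developer, repository) pairs.
    """
    # Group developers by the repository they contributed to.
    repo_to_devs = defaultdict(set)
    for dev, repo in contributions:
        repo_to_devs[repo].add(dev)

    edges = defaultdict(int)
    for devs in repo_to_devs.values():
        # Every pair of contributors to the same repository gets an edge.
        for a, b in combinations(sorted(devs), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical toy data
contributions = [
    ("alice", "repo1"), ("bob", "repo1"),
    ("alice", "repo2"), ("bob", "repo2"), ("carol", "repo2"),
]
edges = build_collaboration_edges(contributions)
```

Note that a repository with n contributors produces n(n-1)/2 edges, which is one reason a network of this kind can reach tens of millions of connections.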
We are joined by Abhishek Paudel, a PhD Student at George Mason University with a research focus on robotics, machine learning, and planning under uncertainty, using graph-based methods to enhance robot behavior. He explains how graph-based approaches can model environments, capture spatial relationships, and provide a framework for integrating multiple levels of planning and decision-making.
We are joined by Maciej Besta, a senior researcher of sparse graph computations and large language models at the Scalable Parallel Computing Lab (SPCL). In this episode, we explore the intersection of graph theory and high-performance computing (HPC), Graph Neural Networks (GNNs) and LLMs.
Graph Databases and AI

2024-10-21 · 35:58

In this episode, we sit down with Yuanyuan Tian, a principal scientist manager at Microsoft Gray Systems Lab, to discuss the evolving role of graph databases in various industries such as fraud detection in finance and insurance, security, healthcare, and supply chain optimization. 
Our new season "Graphs and Networks" begins here!  We are joined by new co-host Asaf Shapira, a network analysis consultant and the podcaster of NETfrix – the network science podcast. Kyle and Asaf discuss ideas to cover in the season and explore Asaf's work in the field.
Join us for our capstone episode on the Animal Intelligence season. We recap what we loved, what we learned, and things we wish we had gotten to spend more time on. This is a great episode to see how the podcast is produced. Now that the season is ending, our current co-host, Becky, is moving to emeritus status. In this last installment we got to spend a little more time getting to know Becky and where her work will take her after this. Did Data Skeptic inspire her to learn more about machine learning? Tune in and find out. 
David Obembe, a recent University of Tartu graduate, discusses his master's thesis on integrating LLMs with process mining tools. He explains how process mining uses event logs to create maps that identify inefficiencies in business processes. David shares his research on the potential of LLMs to enhance process mining, including experiments evaluating their performance and future improvements using Retrieval Augmented Generation (RAG).
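One common way to turn an event log into a process map is to count which activity directly follows which within each case (often called a directly-follows graph). The sketch below shows that idea under assumed inputs: the log format, activity names, and function name are hypothetical and are not drawn from David's thesis or from any particular process mining tool.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity A is directly followed by activity B within a case.

    `event_log` is an iterable of (case_id, timestamp, activity) tuples.
    """
    # Group events by case so that each case forms one trace.
    traces = defaultdict(list)
    for case_id, timestamp, activity in event_log:
        traces[case_id].append((timestamp, activity))

    counts = Counter()
    for events in traces.values():
        events.sort()  # order each case by timestamp
        activities = [activity for _, activity in events]
        for a, b in zip(activities, activities[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy log: case2 repeats "check stock", a possible sign of rework
event_log = [
    ("case1", 1, "receive order"),
    ("case1", 2, "check stock"),
    ("case1", 3, "ship"),
    ("case2", 1, "receive order"),
    ("case2", 2, "check stock"),
    ("case2", 3, "check stock"),
    ("case2", 4, "ship"),
]
dfg = directly_follows(event_log)
```

Edges such as a self-loop on "check stock" are the kind of pattern an analyst might inspect when looking for inefficiencies.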
Open Animal Tracks

2024-09-17 · 22:45

Our guest today is Risa Shinoda, a PhD student at Kyoto University's Agricultural Systems Engineering Lab, where she applies computer vision techniques. She talks about the OpenAnimalTracks dataset, which helps researchers identify animals from their footprints. She also explains how she built a model for predicting which animal left a track, the algorithms used, the accuracy they achieved, and further opportunities to improve the model.
This episode features an interview with Mélisande Teng, a PhD candidate at Université de Montréal. Her research lies at the intersection of remote sensing and computer vision for biodiversity monitoring.
Ant Encounters

2024-08-26 · 31:26

In this interview with author Deborah Gordon, Kyle asks questions about the mechanisms at work in an ant colony and what ants might teach us about how to build artificial intelligence. Ants are surprisingly adaptive creatures whose behavior emerges from their complex interactions. Aspects of network theory and the statistical nature of ant behavior are just some of the interesting details you'll get in this episode.  
Computing Toolbox

2024-08-19 · 38:44

This season it’s become clear that computing skills are vital for working in the natural sciences. In this episode, we were fortunate to speak with Madlen Wilmes, co-author of the book "Computing Skills for Biologists: A Toolbox". We discussed the book and why it’s a great resource for students and teachers. Madlen also shared her experience and advice on transitioning from academia to an industry career, and on how data analytics skills transfer to jobs that young professionals might not always consider. Join us and learn more about the book and careers using transferable skills.
In this episode, we talked shop with Hager Radi about her biodiversity monitoring work. While biodiversity modeling may sound simple, count organisms and mark their location, there is a lot more to it than that! Incomplete and biased data can make estimations hard. There are also many species with very few observations in the wild. Using machine learning and remote sensing data, scientists can build models that predict species distributions with limited data. Listen in and hear about Hager’s work tackling these challenges and the tools she has built.
Hacking the Colony

2024-08-08 · 41:03

Today, Ashay Aswale and Tony Lopez shared their work on swarm robotics and what they have learned from ants. Robotic swarms must solve the same problems that eusocial insects do. What if your pheromone trail goes cold? What if you’re getting bad information from a bad actor within the swarm? Answering these questions can help tackle serious robotic challenges. For example, a swarm of robots can lose a few members to accidents and malfunctions, but a single large robot cannot. Additionally, a swarm could host many castes, like an ant colony. Specialization with redundancy built in seems like a win-win! Tune in and hear more about this fascinating topic.
Primate Poses

2024-07-31 · 32:57

During this season we have talked with researchers working to utilize machine learning for behavioral observations. In previous episodes, you have heard about the software people like Richard use, but you haven’t heard much from scientists modifying and using these tools for specific research cases. PhD student Richard Vogg is working with multi-camera setups to track lemurs and macaques solving puzzle boxes in the wild. His work is part of a larger movement to automate behavioral analyses of video data. Listen in and learn why this tech is useful and why multi-camera setups are a good idea for more reliably identifying poses and individual animals.
Generative AI can struggle to create realistic animals, and 2D representations often have mistakes like extra limbs and tails. If 2D wasn’t hard enough, there are researchers working on generative 3D models. 3D models present an extra challenge because there is a paucity of training datasets. In this episode, PhD students Sandeep and Oindrila walked us through their work on creating 3D animals using 2D data. Join us to learn about their pipelines, quality control, tie-in with iNaturalist, and how this tech could streamline FX pipelines.
Weird Communication

2024-07-15 · 38:29

Today, we sat down with Dr. Ignacio Escalante Meza to learn about opiliones and treehoppers. Opiliones, known as “daddy long legs” in the US, are understudied arachnids known for their tenacious locomotor behavior, sociality, and chemical communication. Treehoppers communicate through the stems of plants using vibrations. They can signal danger, attract mates, and communicate with their offspring. Join us to learn how researchers turn their vibrations into sound waves and study what they have to say.
Comments (32)

DemonDogs

incredibly awful audio for some reason

Apr 25th

Bill

Wow this is fascinating

Oct 1st

DemonDogs

the guest has classic psychological and groupthink issues in his research he needs to get out more

Aug 3rd

Data Science LL

Hi guys at Data Skeptic. Thanks for sharing your valuable content. May I get the text of your podcast? Wishing you luck.

Jul 20th

Vassili Savinov

great episode, cant wait to hear next one. Thanks!

Jun 28th

DemonDogs

Description?

Jan 23rd

ncooty

@6:00: The threshold for statistical significance does not "depend on the outcome." It raises a red flag even to hear someone say that, especially the host of a "data science" podcast. (Of course, if he knew what he was talking about, he'd be a "statistician" instead.) He might more accurately have said that any such estimate of the minimum sample size depends on the number of planned comparisons and the assumed effect size for each measured effect. Confusion about this should disqualify someone from hosting such a podcast.

Aug 23rd

ncooty

@2:19: Too much interpretation as if respondents were randomly sampled. Respondents self-selected.

Aug 23rd

Antonio Andrade

thanks so much for sharing the results

Aug 12th

ncooty

@1:03: It doesn't "beg the question"; it "raises the question." To "beg the question" is to commit a logical fallacy in which one assumes the conclusion.

Jun 15th

Benjamin Weckerle

Is the spin-off / journal club podcast on castbox?

Jun 2nd

Platte Gruber

KILLER intro, awesome work!

Jan 8th