Practical AI: Machine Learning, Data Science, LLM

302 Episodes

Mozart to Megadeath at CHRP

2024-12-19 (53:54)

Daniel and Chris groove with Jeff Smith, Founder and CEO at CHRP.ai. Jeff describes how CHRP anonymously analyzes emotional wellness data, derived from employees' music preferences, giving HR leaders actionable insights to improve productivity, retention, and overall morale. By monitoring key trends and identifying shifts in emotional health across teams, CHRP.ai enables proactive decisions to ensure employees feel supported and engaged.

Sidekick is an AI Shopify expert

2024-12-11 (51:39)

Today, Chris explores Shopify Magic and other AI offerings with Mike Tamir, Distinguished ML Engineer and Head of Machine Learning, and Matt Colyer, Director of Product Management for Sidekick. They talk about how Shopify uses generative AI and LLMs to enhance their products, and they take a deeper dive into Sidekick, a first-of-its-kind, AI-enabled commerce assistant that understands a merchant’s business (products, orders, customers) and has been trained to know all about Shopify.

Full-duplex, real-time dialogue with Kyutai

2024-12-04 (50:05)

Kyutai, an open science research lab, made headlines over the summer when they released their real-time speech-to-speech AI assistant (beating OpenAI to market with their teased GPT-driven speech-to-speech functionality). Alex from Kyutai joins us in this episode to discuss the research lab, their recent Moshi models, and what might be coming next from the lab. Along the way we discuss small models and the AI ecosystem in France.

Clones, commerce & campaigns

2024-11-29 (53:12)

Chris and Daniel dive into what Trump’s impending second term could mean for AI companies, model developers, and regulators, unpacking the potential shifts in policy and innovation. Next, they discuss the latest models, like Qwen, that blur the performance gap between open and closed systems. Finally, they explore new AI tools for meeting clones and AI-driven commerce, sparking a conversation about the balance between digital convenience and fostering genuine human connections.

scikit-learn & data science you own

2024-11-19 (52:04)

We are at GenAI saturation, so let's talk about scikit-learn, a longtime favorite for data scientists building classifiers, time series analyzers, dimensionality reducers, and more! Scikit-learn is deployed across industry and drives a significant portion of the "AI" that is actually in production. :probabl is a new kind of company that is stewarding this project along with a variety of other open source projects. Yann Lechelle and Guillaume Lemaitre share some of the vision behind the company and talk about the future of scikit-learn!

Creating tested, reliable AI applications

2024-11-13 (50:09)

It can be frustrating to get an AI application working amazingly well 80% of the time and failing miserably the other 20%. How can you close the gap and create something that you can rely on? Chris and Daniel talk through this process, behavior testing, and the flow from prototype to production in this episode. They also talk a bit about the apparent slowdown in the release of frontier models.

AI is changing the cybersecurity threat landscape

2024-11-05 (55:25)

This week, Chris is joined by Gregory Richardson, Vice President and Global Advisory CISO at BlackBerry, and Ismael Valenzuela, Vice President of Threat Research & Intelligence at BlackBerry. They address how AI is changing the threat landscape, why human defenders remain a key part of our cyber defenses, and they explain the AI standoff between cyber threat actors and cyber defenders.

The path towards trustworthy AI

2024-10-29 (51:46)

Elham Tabassi, the Chief AI Advisor at the U.S. National Institute of Standards & Technology (NIST), joins Chris for an enlightening discussion about the path towards trustworthy AI. Together they explore NIST's 'AI Risk Management Framework' (AI RMF) within the context of the White House's 'Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence'.

Big data is dead, analytics is alive

2024-10-24 (50:21)

We are on the other side of "big data" hype, but what is the future of analytics and how does AI fit in? Till and Adithya from MotherDuck join us to discuss why DuckDB is taking the analytics and AI world by storm. We dive into what makes DuckDB, a free, in-process SQL OLAP database management system, unique, including its ability to execute lightning-fast analytics queries against a variety of data sources, even on your laptop! Along the way we dig into the intersections with AI, such as text-to-SQL, vector search, and AI-driven SQL query correction.

Practical workflow orchestration

2024-10-15 (58:26)

Workflow orchestration has always been a pain for data scientists, but this is exacerbated in these AI hype days by agentic systems executing arbitrary (not pre-defined) workflows with a variety of failure modes. Adam from Prefect joins us to talk through their open source Python library for orchestration and visibility into Python-based pipelines. Along the way, he introduces us to things like Marvin, their AI engineering framework, and ControlFlow, their agent workflow system.

Towards high-quality (maybe synthetic) datasets

2024-10-09 (57:04)

As Argilla puts it: "Data quality is what makes or breaks AI." However, what exactly does this mean and how can AI teams properly collaborate with domain experts towards improved data quality? David Berenstein & Ben Burtenshaw, who are building Argilla & Distilabel at Hugging Face, join us to dig into these topics along with synthetic data generation & AI-generated labeling / feedback.

Understanding what's possible, doable & scalable

2024-10-03 (01:02:23)

We are constantly hearing about disillusionment as it relates to AI. Some of that is probably valid, but Mike Lewis, an AI architect from Cincinnati, has proven that he can consistently get LLM and GenAI apps to the point of real enterprise value (even with the Big Cos of the world). In this episode, Mike joins us to share some stories from the AI trenches & highlight what it takes (practically) to show what is possible, doable & scalable with AI.

GraphRAG (beyond the hype)

2024-09-25 (55:04)

Seems like we are hearing a lot about GraphRAG these days, but there are lots of questions: what is it, is it hype, what is practical? One of our all-time favorite podcast friends, Prashanth Rao, joins us to dig into this topic beyond the hype. Prashanth gives us a bit of background and practical use cases for GraphRAG and graph data.

Pausing to think about scikit-learn & OpenAI o1

2024-09-17 (50:12)

Recently the company stewarding the open source library scikit-learn announced their seed funding. Also, OpenAI released "o1" with new behavior in which it pauses to "think" about complex tasks. Chris and Daniel take some time to do their own thinking about o1 and the contrast to the scikit-learn ecosystem, which has the goal to promote "data science that you own."

Cybersecurity in the GenAI age

2024-09-11 (51:36)

Dinis Cruz drops by to chat about cybersecurity for generative AI and large language models. In addition to discussing The Cyber Boardroom, Dinis also delves into cybersecurity efforts at OWASP and that organization's Top 10 for LLMs and Generative AI Apps.

AI is more than GenAI

2024-09-05 (40:05)

GenAI is often what people think of when someone mentions AI. However, AI is much more. In this episode, Daniel breaks down a history of developments in data science, machine learning, AI, and GenAI to give listeners a better mental model. Don't miss this one if you want to understand the AI ecosystem holistically and how models, embeddings, data, prompts, etc. all fit together.

Metrics Driven Development

2024-08-29 (42:14)

How do you systematically measure, optimize, and improve the performance of LLM applications (like those powered by RAG or tool use)? Ragas is an open source effort that has been trying to answer this question comprehensively, and they are promoting a "Metrics Driven Development" approach. Shahul from Ragas joins us to discuss Ragas in this episode, and we dig into specific metrics, the difference between benchmarking models and evaluating LLM apps, generating synthetic test data and more.

Threat modeling LLM apps

2024-08-22 (54:40)

If you have questions at the intersection of Cybersecurity and AI, you need to know Donato at WithSecure! Donato has been threat modeling AI applications and seriously applying those models in his day-to-day work. He joins us in this episode to discuss his LLM application security canvas, prompt injections, alignment, and more.

Only as good as the data

2024-08-14 (45:42)

You might have heard that "AI is only as good as the data." What does that mean and what data are we talking about? Chris and Daniel dig into that topic in this episode, exploring the categories of data that you might encounter working in AI (for training, testing, fine-tuning, benchmarks, etc.). They also discuss the latest developments in AI regulation with the EU's AI Act coming into force.

Gaudi processors & Intel's AI portfolio

2024-08-07 (46:30)

There is an increasing desire for and effort towards GPU alternatives for AI workloads and an ability to run GenAI models on CPUs. Ben and Greg from Intel join us in this episode to help us understand Intel's strategy as it relates to AI along with related projects, hardware, and developer communities. We dig into Intel's Gaudi processors, open source collaborations with Hugging Face, and AI on CPU/Xeon processors.

Comments (14)

J T

PendDown: 100-200

Oct 16th

James Martin

The concept of the Practical AI show sounds really interesting! It's so important to understand discussions around AI and related topics and apply them in real-world scenarios. If you're also interested in financial services, make sure to check out this link: https://myconveyancingmatters.co.uk/services/mortgage-product-transfer/ which provides helpful information on mortgage product transfers.

Oct 14th

Russell Johnson

thanks for the episode, It really gave a good understanding of the different areas. sometimes it is good to dial back the technical to summarise the basics.

Sep 7th

Aidan Goodall

Please consider removing the annoying background music while guests are speaking (27.30) so we can listen to the content and not 60 seconds of sponsored ad intro music

Jun 19th

:

The file is damaged and won't play properly.

May 13th

mrs rime

🔴💚Really Amazing ️You Can Try This💚WATCH💚ᗪOᗯᑎᒪOᗩᗪ👉https://co.fastmovies.org

Jan 16th

Annakaye Bennett

✅ CLICK HERE Full HD 1080p 4K👉👉https://co.fastmovies.org

Jan 13th

Farshid Hesami

Thank for podcast and it's was very useful for me

Dec 12th

Mohammed Boosiri

great talks. I loved the concluding part.

Aug 18th

Andrew Miller

Processing data is a pretty complex process. This podcast did a pretty good job explaining it. But if you want to learn something more about labeling and annotation, look here https://marketbusinesstimes.com/data-labeling/. In this article, you can find some decent information about raw data processing in machine learning which can really help you down the road.

Apr 20th

Justin lee

I got a call yesterday from a friend of mine. He told me that he has to do courses in Machine Learning, Data Science. After that I shared the link of this post with him. Because Machine Learning, Data Science and chat gpt https://supercharged-by-ai.com/ service were told on this post. A website link was also given here. You can get the initial information of AI from that website.

Jan 18th

Denial Brown

I noticed that the educational niche market is now rapidly developing due to companies such as https://geniusee.com/edtech that have simplified the creation and development of software or web products for training. Now there are already good strategies for creating such things.

Dec 10th

Robert Jackson

An alternative to DigitalOcean hosting: https://scalegrid.io/mysql/digitalocean.html

Sep 6th

Mark Cund

Great introduction to what's going on in AI. Already started on getting MachineBox up and running. Looking forward to my commutes so I can learn some more! Mark Cund (@AluminumBlonde)

Aug 9th
