DataTalks.Club

Author: DataTalks.Club

Description

DataTalks.Club - the place to talk about data!
202 Episodes
In this session, Sofya shares her journey building a pet-tech startup that blends machine learning, sensor data, and canine behavior analytics. She walks through her path from early programming explorations to launching a health monitoring device designed around anomaly detection and long-term behavioral baselines.

TIMECODES:
00:00 Sofya's pet tech startup with machine learning sensor data and behavior pattern analytics
10:00 Journey from programming hobby to full time software development career
17:20 Career growth after skipping university and building practical experience
24:07 Puppy adoption story and family influence on pet focused innovation
32:16 Dog health monitoring framed as anomaly detection in real world machine learning
37:05 Collecting canine data with emphasis on sleep patterns and cycle tracking
43:35 Establishing a dog's normal baseline through long term data observation
49:34 Startup funding through personal savings and early stage bootstrapping
55:28 Finding cofounders and collaborators through meetups and coworking communities
59:48 Closing insights on Sofya's educational path and early device prototypes

Connect with Sofya:
- Website - https://www.fit-tails.com/
- Linkedin - https://www.linkedin.com/in/sofya-yulpatova/

Connect with DataTalks.Club:
- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

In this talk, Xia He-Bleinagel, Head of Data & Cloud at NOW GmbH, shares her remarkable journey from studying automotive engineering across Europe to leading modern data, cloud, and engineering teams in Germany.

We dive into her transition from hands-on engineering to leadership, how she balanced family with career growth, and what it really takes to succeed in today’s cloud, data, and AI job market.

TIMECODES:
00:00 Studying Automotive Engineering Across Europe
08:15 How Andrew Ng Sparked a Machine Learning Journey
11:45 Import–Export Work as an Unexpected Career Boost
17:05 Balancing Family Life with Data Engineering Studies
20:50 From Data Engineer to Head of Data & Cloud
27:46 Building Data Teams & Tackling Tech Debt
30:56 Learning Leadership Through Coaching & Observation
34:17 Management vs. IC: Finding Your Best Fit
38:52 Boosting Developer Productivity with AI Tools
42:47 Succeeding in Germany’s Competitive Data Job Market
46:03 Fast-Track Your Cloud & Data Career
50:03 Mentorship & Supporting Working Moms in Tech
53:03 Cultural & Economic Factors Shaping Women’s Careers
57:13 Top Networking Groups for Women in Data
1:00:13 Turning Domain Expertise into a Data Career Advantage

Connect with Xia:
- Linkedin - https://www.linkedin.com/in/xia-he-bleinagel-51773585/
- Github - https://github.com/Data-Think-2021
- Website - https://datathinker.de/

In this talk, Anusha Akkina, co-founder of Auralytix, shares her journey from working as a Chartered Accountant and Auditor at Deloitte to building an AI-powered finance intelligence platform designed to augment, not replace, human decision-making. Together with host Alexey from DataTalks.Club, she explores how AI is transforming finance operations beyond spreadsheets, from tackling ERP limitations to creating real-time insights that drive strategic business outcomes.

TIMECODES:
00:00 Building trust in AI finance and introducing Auralytix
02:22 From accounting roots to auditing at Deloitte and Paraxel
08:20 Moving to Germany and pivoting into corporate finance
11:50 The data struggle in strategic finance and the need for change
13:23 How Auralytix was born: bridging AI and financial compliance
17:15 Why ERP systems fail finance teams and how spreadsheets fill the gap
24:31 The real cost of ERP rigidity and lessons from failed transformations
29:10 The hidden risks of spreadsheet dependency and knowledge loss
37:30 Experimenting with ChatGPT and coding the first AI finance prototype
43:34 Identifying finance’s biggest pain points through user research
47:24 Empowering finance teams with AI-driven, real-time decision insights
50:59 Developing an entrepreneurial mindset through strategy and learning
54:31 Essential resources and finding the right AI co-founder

Connect with Anusha:
- Linkedin - https://www.linkedin.com/in/anusha-akkina-acma-cgma-56154547/
- Website - https://aurelytix.com/

At Qdrant Conference, builders, researchers, and industry practitioners shared how vector search, retrieval infrastructure, and LLM-driven workflows are evolving across developer tooling, AI platforms, analytics teams, and modern search research.

Andrey Vasnetsov (Qdrant) explained how Qdrant was born from the need to combine database-style querying with vector similarity search, something he first built during the COVID lockdowns. He highlighted how vector search has shifted from an ML specialty to a standard developer tool and why hosting an in-person conference matters for gathering honest, real-time feedback from the growing community.

Slava Dubrov (HubSpot) described how his team uses Qdrant to power AI Signals, a platform for embeddings, similarity search, and contextual recommendations that support HubSpot’s AI agents. He shared practical use cases like look-alike company search, reflected on evaluating agentic frameworks, and offered career advice for engineers moving toward technical leadership.

Marina Ariamnova (SumUp) presented her internally built LLM analytics assistant that turns natural-language questions into SQL, executes queries, and returns clean summaries, cutting request times from days to minutes. She discussed balancing analytics and engineering work, learning through real projects, and how LLM tools help analysts scale routine workflows without replacing human expertise.

Evgeniya (Jenny) Sukhodolskaya (Qdrant) discussed the multi-disciplinary nature of DevRel and her focus on retrieval research. She shared her work on sparse neural retrieval, relevance feedback, and hybrid search models that blend lexical precision with semantic understanding, contributing methods like Mini-COIL and shaping Qdrant’s search quality roadmap through end-to-end experimentation and community education.

Speakers:

Andrey Vasnetsov
Co-founder & CTO of Qdrant, leading the engineering and platform vision behind a developer-focused vector database and vector-native infrastructure.
Connect: https://www.linkedin.com/in/andrey-vasnetsov-75268897/

Slava Dubrov
Technical Lead at HubSpot working on AI Signals: embedding models, similarity search, and context systems for AI agents.
Connect: https://www.linkedin.com/in/slavadubrov/

Marina Ariamnova
Data Lead at SumUp, managing analytics and financial data workflows while prototyping LLM tools that automate routine analysis.
Connect: https://www.linkedin.com/in/marina-ariamnova/

Evgeniya (Jenny) Sukhodolskaya
Developer Relations Engineer at Qdrant specializing in retrieval research, sparse neural methods, and educational ML content.
Connect: https://www.linkedin.com/in/evgeniya-sukhodolskaya/

In this talk, Hugo Bowne-Anderson, an independent data and AI consultant, educator, and host of the podcasts Vanishing Gradients and High Signal, shares his journey from academic research and curriculum design at DataCamp to advising teams at Netflix, Meta, and the US Air Force. Together, we explore how to build reliable, production-ready AI systems, from prompt evaluation and dataset design to embedding agents into everyday workflows.

You’ll learn about:
- How to structure teams and incentives for successful AI adoption
- Practical prompting techniques for accurate timestamp and data generation
- Building and maintaining evaluation sets to avoid “prompt overfitting”
- Cost-effective methods for LLM evaluation and monitoring
- Tools and frameworks for debugging and observing AI behavior (Logfire, Braintrust, Arize Phoenix)
- The evolution of AI agents, from simple RAG systems to proactive, embedded assistants
- How to escape “proof of concept purgatory” and prioritize AI projects that drive business value
- Step-by-step guidance for building reliable, evaluable AI agents

This session is ideal for AI engineers, data scientists, ML product managers, and startup founders looking to move beyond experimentation into robust, scalable AI systems. Whether you’re optimizing RAG pipelines, evaluating prompts, or embedding AI into products, this talk offers actionable frameworks to guide you from concept to production.

LINKS:
- Escaping POC Purgatory: Evaluation-Driven Development for AI Systems - https://www.oreilly.com/radar/escaping-poc-purgatory-evaluation-driven-development-for-ai-systems/
- Stop Building AI Agents - https://www.decodingai.com/p/stop-building-ai-agents
- How to Evaluate LLM Apps Before You Launch - https://www.youtube.com/watch?si=90fXJJQThSwGCaYv&v=TTr7zPLoTJI&feature=youtu.be
- My Vanishing Gradients Substack - https://hugobowne.substack.com/
- Building LLM Applications for Data Scientists and Software Engineers - https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=datatalksclub

TIMECODES:
00:00 Introduction and Expertise
04:04 Transition to Freelance Consulting and Advising
08:49 Restructuring Teams and Incentivizing AI Adoption
12:22 Improving Prompting for Timestamp Generation
17:38 Evaluation Sets and Failure Analysis for Reliable Software
23:00 Evaluating Prompts: The Cost and Size of Gold Test Sets
27:38 Software Tools for Evaluation and Monitoring
33:14 Evolution of AI Tools: Proactivity and Embedded Agents
40:12 The Future of AI is Not Just Chat
44:38 Avoiding Proof of Concept Purgatory: Prioritizing RAG for Business Value
50:19 RAG vs. Agents: Complexity and Power Trade-Offs
56:21 Recommended Steps for Building Agents
59:57 Defining Memory in Multi-Turn Conversations

Connect with Hugo:
- Twitter - https://x.com/hugobowne
- Linkedin - https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/
- Github - https://github.com/hugobowne
- Website - https://hugobowne.github.io/

In this talk, Sebastian, a bioinformatics researcher and software engineer, shares his inspiring journey from wet lab biotechnology to computational bioinformatics. Hosted by DataTalks.Club, this session explores how data science, AI, and open-source tools are transforming modern biological research, from DNA sequencing to metagenomics and protein structure prediction.

You’ll learn about:
- The difference between wet lab and dry lab workflows in biotechnology
- How bioinformatics enables faster insights through data-driven modeling
- The MicW2Graph project and its role in studying wastewater microbiomes
- Using co-abundance networks and the CCLasso algorithm to map microbial interactions
- How AlphaFold revolutionized protein structure prediction
- Building scientific knowledge graphs to integrate biological metadata
- Open-source tools like VueGen and VueCore for automating reports and visualizations
- The growing impact of AI and large language models (LLMs) in research and documentation
- Key differences between R (Bioconductor) and Python ecosystems for bioinformatics

This talk is ideal for data scientists, bioinformaticians, biotech researchers, and AI enthusiasts who want to understand how data science, AI, and biology intersect. Whether you work in genomics, computational biology, or scientific software, you’ll gain insights into real-world tools and workflows shaping the future of bioinformatics.

Links:
- MicW2Graph: https://zenodo.org/records/12507444
- VueGen: https://github.com/Multiomics-Analytics-Group/vuegen
- Awesome-Bioinformatics: https://github.com/danielecook/Awesome-Bioinformatics

TIMECODES:
00:00 Sebastian’s Journey into Bioinformatics
06:02 From Wet Lab to Computational Biology
08:23 Wet Lab vs Dry Lab Explained
12:35 Bioinformatics as Data Science for Biology
15:30 How DNA Sequencing Works
19:29 MicW2Graph and Wastewater Microbiomes
23:10 Building Microbial Networks with CCLasso
26:54 Protein–Ligand Simulation Basics
29:58 Predicting Protein Folding in 3D
33:30 AlphaFold Revolution in Protein Prediction
36:45 Inside the MicW2Graph Knowledge Graph
39:54 VueGen: Automating Scientific Reports
43:56 VueCore: Visualizing Omics Data
47:50 Using AI and LLMs in Bioinformatics
50:25 R vs Python in Bioinformatics Tools
53:17 Closing Thoughts from Ecuador

Connect with Sebastian:
- Twitter - https://twitter.com/sayalaruano
- Linkedin - https://linkedin.com/in/sayalaruano
- Github - https://github.com/sayalaruano
- Website - https://sayalaruano.github.io/

In this episode, we talked with Aishwarya Jadhav, a machine learning engineer whose career has spanned Morgan Stanley, Tesla, and now Waymo. Aishwarya shares her journey from big data in finance to applied AI in self-driving, gesture understanding, and computer vision. She discusses building an AI guide dog for the visually impaired, contributing to malaria mapping in Africa, and the challenges of deploying safe autonomous systems. We also explore the intersection of computer vision, NLP, and LLMs, and what it takes to break into the self-driving AI industry.

TIMECODES:
00:51 Aishwarya’s career journey from finance to self-driving AI
05:45 Building AI guide dog for the visually impaired
12:03 Exploring LiDAR, radar, and Tesla’s camera-based approach
16:24 Trust, regulation, and challenges in self-driving adoption
19:39 Waymo, ride-hailing, and gesture recognition for traffic control
24:18 Malaria mapping in Africa and AI for social good
29:40 Deployment, safety, and testing in self-driving systems
37:00 Transition from NLP to computer vision and deep learning
43:37 Reinforcement learning, robotics, and self-driving constraints
51:28 Testing processes, evaluations, and staged rollouts for autonomous driving
52:53 Can multimodal LLMs be applied to self-driving?
55:33 How to get started in self-driving AI careers

Connect with Aishwarya:
- Linkedin - https://www.linkedin.com/in/aishwaryajadhav8/

In this episode, we talked with Ranjitha Kulkarni, a machine learning engineer with a rich career spanning Microsoft, Dropbox, and now NeuBird AI. Ranjitha shares her journey into ML and NLP, her work building recommendation systems, early AI agents, and cutting-edge LLM-powered products. She offers insights into designing reliable AI systems in the new era of generative AI and agents, and how context engineering and dynamic planning shape the future of AI products.

TIMECODES:
00:00 Career journey and early curiosity
04:25 Speech recognition at Microsoft
05:52 Recommendation systems and early agents at Dropbox
07:44 Joining NeuBird AI
12:01 Defining agents and LLM orchestration
16:11 Agent planning strategies
18:23 Agent implementation approaches
22:50 Context engineering essentials
30:27 RAG evolution in agent systems
37:39 RAG vs agent use cases
40:30 Dynamic planning in AI assistants
43:00 AI productivity tools at Dropbox
46:00 Evaluating AI agents
53:20 Reliable tool usage challenges
58:17 Future of agents in engineering

Connect with Ranjitha:
- Linkedin - https://www.linkedin.com/in/ranjitha-gurunath-kulkarni

In this episode, we talked with Abouzar Abbaspour, a data engineer whose career spans software engineering in Iran, building crowd and recommendation systems at a Dutch theme park, deploying large-scale ML models at Bol.com, and now working at Tesla. Abouzar shares how he bridged diverse industries, tackled real-world data challenges, and adapted to new roles while keeping a hands-on approach to machine learning and engineering.

TIMECODES:
00:00 Career journey and early motivations
06:17 Moving to Europe for data science
12:18 Working with theme parks and crowd modeling
18:29 Lessons from ride and visitor data
23:06 Building recommendation systems at Efteling
27:26 Joining Bol.com and the Dutch e-commerce industry
32:49 Product and brand recommendation logic
36:09 Experimenting with "Tinder for brands"
40:26 Engagement metrics and product validation
43:02 From ML engineering to data engineering roles
52:04 Hands-on skills at Tesla and industry expectations
57:43 Career growth, learning, and advice

Connect with Abouzar:
- Linkedin - / abouzar-abbaspour
- Website - https://www.abouzar-abbaspour.com/

In this episode, we chat with Dashel Ruiz, whose journey spans semiconductors, machine learning, and teaching. Dashel shares how he transitioned from hardware to data science, navigated complex projects in diverse industries, and now combines technical expertise with a passion for teaching. Tune in to hear insights on building a career in data, mastering new technologies, and making an impact both in the lab and the classroom.

TIMECODES:
00:00 Dashel's unique career path from music to semiconductors
06:16 The transition into data and software engineering at Microchip
11:44 Discovering machine learning to solve real problems in semiconductor manufacturing
20:40 How Dashel found the Machine Learning Zoomcamp and his experience with it
29:33 The practical advantages of DataTalks.Club courses over other platforms
39:52 Overcoming challenges and the value of the learning community
48:10 Hands-on project experience: From image classification to Kaggle competitions
54:12 Staying motivated throughout the long-term course
59:55 The importance of deployment and full-stack ML skills
1:07:36 Closing thoughts on teaching and future courses

Connect with Dashel:
- Linkedin - https://www.linkedin.com/in/dashel-ruiz-perez-2b036172/

In this episode, we talk with Micheal Lanham, an AI and software innovator with over two decades of experience spanning game development, fintech, oil and gas, and agricultural tech. Micheal shares his journey from building neural network-based games and evolutionary algorithms to writing influential books on AI agents and deep learning. He offers insights into the evolving AI landscape, practical uses of AI agents, and the future of generative AI in gaming and beyond.

TIMECODES:
00:00 Micheal Lanham’s career journey and AI agent books
05:45 Publishing journey: AR, Pokémon Go, sound design, and reinforcement learning
10:00 Evolution of AI: evolutionary algorithms, deep learning, and agents
13:33 Evolutionary algorithms in prompt engineering and LLMs
18:13 AI agent books second edition and practical applications
20:57 AI agent workflows: minimalism, task breakdown, and collaboration
26:25 Collaboration and orchestration among AI agents
31:24 Tools and reasoning servers for agent communication
35:17 AI agents in game development and generative AI impact
38:57 Future of generative AI in gaming and immersive content
41:42 Coding agents, new LLMs, and local deployment
45:40 AI model trends and data scientist career advice
53:36 Cognitive testing, evaluation, and monitoring in AI
58:50 Publishing details and closing remarks

Connect with Micheal:
- Linkedin - https://www.linkedin.com/in/micheal-lanham-189693123/

At PyData Berlin, community members and industry voices highlighted how AI and data tooling are evolving across knowledge graphs, MLOps, small-model fine-tuning, explainability, and developer advocacy.

- Igor Kvachenok (Leuphana University / ProKube) combined knowledge graphs with LLMs for structured data extraction in the polymer industry, and noted how MLOps is shifting toward LLM-focused workflows.
- Selim Nowicki (Distill Labs) introduced a platform that uses knowledge distillation to fine-tune smaller models efficiently, making model specialization faster and more accessible.
- Gülsah Durmaz (Architect & Developer) shared her transition from architecture to coding, creating Python tools for design automation and volunteering with PyData through PyLadies.
- Yashasvi Misra (Pure Storage) spoke on explainable AI, stressing accountability and compliance, and shared her perspective as both a data engineer and active Python community organizer.
- Mehdi Ouazza (MotherDuck) reflected on developer advocacy through video, workshops, and branding, showing how creative communication boosts adoption of open-source tools like DuckDB.

Igor Kvachenok
Master’s student in Data Science at Leuphana University of Lüneburg, writing a thesis on LLM-enhanced data extraction for the polymer industry. Builds RDF knowledge graphs from semi-structured documents and works at ProKube on MLOps platforms powered by Kubeflow and Kubernetes.
Connect: https://www.linkedin.com/in/igor-kvachenok/

Selim Nowicki
Founder of Distill Labs, a startup making small-model fine-tuning simple and fast with knowledge distillation. Previously led data teams at Berlin startups like Delivery Hero, Trade Republic, and Tier Mobility. Sees parallels between today’s ML tooling and dbt’s impact on analytics.
Connect: https://www.linkedin.com/in/selim-nowicki/

Gülsah Durmaz
Architect turned developer, creating Python-based tools for architectural design automation with Rhino and Grasshopper. Active in PyLadies and a volunteer at PyData Berlin, she values the community for networking and learning, and aims to bring ML into architecture workflows.
Connect: https://www.linkedin.com/in/gulsah-durmaz/

Yashasvi (Yashi) Misra
Data Engineer at Pure Storage, community organizer with PyLadies India, PyCon India, and Women Techmakers. Advocates for inclusive spaces in tech and speaks on explainable AI, bridging her day-to-day in data engineering with her passion for ethical ML.
Connect: https://www.linkedin.com/in/misrayashasvi/

Mehdi Ouazza
Developer Advocate at MotherDuck, formerly a data engineer, now focused on building community and education around DuckDB. Runs popular YouTube channels ("mehdio DataTV" and "MotherDuck") and delivered a hands-on workshop at PyData Berlin. Blends technical clarity with creative storytelling.
Connect: https://www.linkedin.com/in/mehd-io/

In this episode, we talk with Daniel, an astrophysicist turned machine learning engineer and AI ambassador. Daniel shares his journey bridging astronomy and data science, how he leveraged live courses and public knowledge sharing to grow his skills, and his experiences working on cutting-edge radio astronomy projects and AI deployments. He also discusses practical advice for beginners in data and astronomy, and insights on career growth through community and continuous learning.

TIMECODES:
00:00 Lunar eclipse story and Daniel’s astronomy career
04:12 Electromagnetic spectrum and MeerKAT data explained
10:39 Data analysis and positional cross-correlation challenges
15:25 Physics behind radio star detection and observation limits
16:35 Radio astronomy’s advantage and machine learning potential
20:37 Radio astronomy progress and Daniel’s ML journey
26:00 Python tools and experience with Zoomcamps
31:26 Intel internship and exploring LLMs
41:04 Sharing progress and course projects with orchestration tools
44:49 Setting up Airflow 3.0 and building data pipelines
47:39 AI startups, training resources, and NVIDIA courses
50:20 Student access to education, NVIDIA experience, and beginner astronomy programs
57:59 Skills, projects, and career advice for beginners
59:19 Starting with data science or engineering
1:00:07 Course sponsorship, data tools, and learning resources

Connect with Daniel:
- Linkedin - / egbodaniel

At Berlin Buzzwords, industry voices highlighted how search is evolving with AI and LLMs.

- Kacper Łukawski (Qdrant) stressed hybrid search (semantic + keyword) as core for RAG systems and promoted efficient embedding models for smaller-scale use.
- Manish Gill (ClickHouse) discussed auto-scaling OLAP databases on Kubernetes, combining infrastructure and database knowledge.
- André Charton (Kleinanzeigen) reflected on scaling search for millions of classifieds, moving from Solr/Elasticsearch toward vector search, while returning to a hands-on technical role.
- Filip Makraduli (Superlinked) introduced a vector-first framework that fuses multiple encoders into one representation for nuanced e-commerce and recommendation search.
- Brian Goldin (Voyager Search) emphasized spatial context in retrieval, combining geospatial data with AI enrichment to add the “where” to search.
- Atita Arora (Voyager Search) highlighted geospatial AI models, the renewed importance of retrieval in RAG, and the cautious but promising rise of AI agents.

Together, their perspectives show a common thread: search is regaining center stage in AI, with scaling, hybridization, multimodality, and domain-specific enrichment shaping the next generation of retrieval systems.

Kacper Łukawski
Senior Developer Advocate at Qdrant, he educates users on vector and hybrid search. He highlighted Qdrant’s support for dense and sparse vectors, the role of search with LLMs, and his interest in cost-effective models like static embeddings for smaller companies and edge apps.
Connect: https://www.linkedin.com/in/kacperlukawski/

Manish Gill
Engineering Manager at ClickHouse, he spoke about running ClickHouse on Kubernetes, tackling auto-scaling and stateful sets. His team focuses on making ClickHouse scale automatically in the cloud. He credited its speed to careful engineering and reflected on the shift from IC to manager.
Connect: https://www.linkedin.com/in/manishgill/

André Charton
Head of Search at Kleinanzeigen, he discussed shaping the company’s search tech, moving from Solr to Elasticsearch and now vector search with Vespa. Kleinanzeigen handles 60M items, 1M new listings daily, and 50k requests/sec. André explained his career shift back to hands-on engineering.
Connect: https://www.linkedin.com/in/andrecharton/

Filip Makraduli
Founding ML DevRel engineer at Superlinked, an open-source framework for AI search and recommendations. Its vector-first approach fuses multiple encoders (text, images, structured fields) into composite vectors for single-shot retrieval. His Berlin Buzzwords demo showed e-commerce search with natural-language queries and filters.
Connect: https://www.linkedin.com/in/filipmakraduli/

Brian Goldin
Founder and CEO of Voyager Search, which began with geospatial search and expanded into documents and metadata enrichment. Voyager indexes spatial data and enriches pipelines with NLP, OCR, and AI models to detect entities like oil spills or windmills. He stressed adding spatial context (“the where”) as critical for search and highlighted Voyager’s 12 years of enterprise experience.
Connect: https://www.linkedin.com/in/brian-goldin-04170a1/

Atita Arora
Director of AI at Voyager Search, with nearly 20 years in retrieval systems, now focused on geospatial AI for Earth observation data. At Berlin Buzzwords she hosted sessions, attended talks on Lucene, GPUs, and Solr, and emphasized retrieval quality in RAG systems. She is cautiously optimistic about AI agents and values the event as both learning hub and professional reunion.
Connect: https://www.linkedin.com/in/atitaarora/

In this episode, we talked with Pastor, a medical doctor who built a career in machine learning while studying medicine. Pastor shares how he balanced both fields, leveraged live courses and public sharing to grow his skills, and found opportunities through freelancing and mentoring.

TIMECODES:
00:00 Pastor’s background and early programming journey
06:05 Learning new tools and skills on the job while studying medicine
11:44 Balancing medical studies with data science work and motivation
13:48 Applying medical knowledge to data science and vice versa
18:44 Starting freelance work on Upwork and overcoming language challenges
24:03 Joining the machine learning engineering course and benefits of live cohorts
27:41 Engaging with the course community and sharing progress publicly
35:16 Using LinkedIn and social media for career growth and interview opportunities
41:03 Building reputation, structuring learning, and leveraging course projects
50:53 Volunteering and mentoring with DeepLearning.AI and Stanford Code in Place
57:00 Managing time and staying productive while studying medicine and machine learning

Connect with Pastor:
- Twitter - https://x.com/PastorSotoB1
- Linkedin - / pastorsoto
- Github - https://github.com/sotoblanco
- Website - https://substack.com/@pastorsoto

Struggling with data trust issues, dashboard drama, or constant pipeline firefighting? In this deep‑dive interview, Lior Barak shows you how to shift from a reactive “fix‑it” culture to a mindful, impact‑driven practice rooted in Zen/Wabi‑Sabi principles.
You’ll learn:
Why 97% of CEOs say they use data, but only 24% call themselves data‑driven
The traffic‑light dashboard pattern (green / yellow / red) that instantly tells execs whether numbers are safe to use
A practical rule for balancing maintenance, rollout, and innovation while avoiding team burnout
How to quantify ROI on data products, kill failing legacy systems, and handle ad‑hoc exec requests without derailing roadmaps
How to turn “imperfect” data into business value with mindful communication, root‑cause logs, and automated incident review loops
🕒 TIMECODES
00:00 Community and mindful data strategy
04:06 Career journey and product management insights
08:03 Wabi-sabi data and the trust crisis
11:47 AI, data imperfection, and trust challenges
20:05 Trust crisis examples and root cause analysis
25:06 Regaining trust through mindful data management
30:47 Traffic light system and effective communication
37:41 Communication gaps and team workload balance
39:58 Maintenance stress and embracing Zen mindset
49:29 Accepting imperfection and measuring impact
56:19 Legacy systems and managing executive requests
01:00:23 Role guidance and closing reflections
🔗 Connect with Lior
LinkedIn - https://www.linkedin.com/in/liorbarak
Website - https://cookingdata.substack.com/
Cooking Data newsletter: https://cookingdata.substack.com/
Data product lifecycle manager: https://app--data-product-lifecycle-manager-c81b10bb.base44.app/
🔗 Connect with DataTalks.Club
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/u/0/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
Check other upcoming events - https://lu.ma/dtc-events
GitHub: https://github.com/DataTalksClub
LinkedIn - https://www.linkedin.com/company/datatalks-club/
Twitter - https://x.com/DataTalksClub
Website - https://datatalks.club/
🔗 Connect with Alexey
Twitter - https://x.com/Al_Grigor
Linkedin - https://www.linkedin.com/in/agrigorev/
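The traffic-light dashboard pattern from Lior's episode can be sketched in a few lines. The episode description does not say how the colors are computed, so everything here is an assumption made for illustration: the two inputs (data age and share of failed quality checks), the function name, and the thresholds are all hypothetical.

```python
def traffic_light(hours_since_refresh, failed_check_ratio,
                  max_age_hours=24.0, warn_ratio=0.01, fail_ratio=0.05):
    """Map two hypothetical health signals to a green / yellow / red status.

    hours_since_refresh: age of the newest data behind the dashboard.
    failed_check_ratio:  share of data quality checks that failed (0.0 to 1.0).
    The default thresholds are placeholders, not recommendations.
    """
    # Red: data is badly stale or too many checks failed; do not use the numbers.
    if hours_since_refresh > 2 * max_age_hours or failed_check_ratio >= fail_ratio:
        return "red"
    # Yellow: usable with caution; something is late or slightly off.
    if hours_since_refresh > max_age_hours or failed_check_ratio >= warn_ratio:
        return "yellow"
    # Green: fresh data and checks passing.
    return "green"

for age, ratio in [(2, 0.0), (30, 0.0), (2, 0.2)]:
    print(age, ratio, traffic_light(age, ratio))
```

Keeping the rule in one small function makes the thresholds easy to review with stakeholders, which fits the episode's emphasis on communication over tooling.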
In this episode, we talk with Orell about his journey from electrical engineering to freelancing in data engineering. We explore lessons from startup life, working with messy industrial data, the realities of freelancing, and how to stay up to date with new tools.
Topics covered:
Why Orell left a PhD and a simulation‑focused start‑up after COVID hit
What he learned trying (and failing) to commercialise medical‑imaging simulations
The first freelance project and the long, quiet months that followed
How he now finds clients, keeps projects small and delivers value quickly
Typical work he does for industrial companies: parsing messy machine logs, building simple pipelines, adding structure later
Favorite everyday tools (Python, DuckDB, a bit of C++) and the habit of blocking time for learning
Advice for anyone thinking about freelancing: cash runway, networking, and focusing on problems rather than “perfect” tech choices
A practical conversation for listeners who are curious about moving from research or permanent roles into freelance data engineering.
🕒 TIMECODES
0:00 Orell’s career and move to freelancing
9:04 Startup experience and data engineering lessons
16:05 Academia vs. startups and starting freelancing
25:33 Early freelancing challenges and networking
34:22 Freelance data engineering and messy industrial data
43:27 Staying practical, learning tools, and growth
50:33 Freelancing challenges and client acquisition
58:37 Tools, problem-solving, and manual work
🔗 CONNECT WITH ORELL
Bluesky - https://bsky.app/profile/orgarten.bsk...
LinkedIn - / ogarten
Github - https://github.com/orgarten
Website - https://orellgarten.com
🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
GitHub: https://github.com/DataTalksClub
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/
🔗 CONNECT WITH ALEXEY
Twitter - / al_grigor
Linkedin - / agrigorev
Thinking about swapping your 9‑to‑5 for client work, but worried that a long German-style notice period will kill your chances? In this live interview, seven‑year data‑freelance veteran Dimitri walks through his experience of taking his freelance career to the next level.
About the Speaker: Dimitri Visnadi is an independent data consultant with a focus on data strategy. He has consulted for leading companies in the marketing data space, such as Unilever, Ferrero, Heineken, and Red Bull. He has lived and worked in 6 countries across Europe in both corporate and startup organizations. He was part of data departments at Hewlett-Packard (HP) and a Google-partnered consulting firm, where he worked on data products and strategy. Having received a Master’s in Business Analytics with Computer Science from University College London and a Bachelor’s in Business Administration from John Cabot University, Dimitri still has close ties to academia and holds a mentor position in entrepreneurship at both institutions.
🕒 TIMECODES
00:00 Dimitri’s journey from corporate to freelance data specialist
05:41 Job tenure trends, tech career shifts, and freelance types
10:50 Freelancing challenges, success, and finding clients
17:33 Freelance market trends and Dimitri’s job board
23:51 Starting points, top freelance skills, and market insights
32:48 Building a lifestyle business: scaling and work-life balance
45:30 Data Freelancer course and marketing for freelancers
48:33 Subscription services and managing client relationships
56:47 Pricing models and transitioning advice
1:01:02 Notice periods, networking, and risks in freelancing transition
🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/
🔗 CONNECT WITH DIMITRI
Linkedin - https://www.linkedin.com/in/visnadi/
In this podcast episode, we talked with Will Russell about From Hackathons to Developer Advocacy.
About the Speaker: Will Russell is a Developer Advocate at Kestra, known for his videos on workflow orchestration. Previously, Will built open source education programs to help up-and-coming developers make their first contributions to open source. With a passion for developer education, Will creates technical video content and documentation that makes technologies more approachable for developers.
In this episode, we sit down with Will, a developer advocate, content creator, and passionate community builder. We’ll hear about his unique path through tech, the lessons he’s learned, and his approach to making complex topics accessible and engaging. Whether you’re curious about open source, hackathons, or what it’s like to bridge the gap between developers and the broader tech community, this conversation is full of insights and inspiration.
🕒 TIMECODES
0:00 Introduction, career journeys, and video setup and workflow
10:41 From hackathons to open source: Early experiences and learning
16:04 Becoming a hackathon organizer and the value of soft skills
23:18 How to organize a hackathon, memorable projects, and creativity
33:39 Major League Hacking: Building community and scaling student programs
41:16 Mentorship, development environments, and onboarding in open source
49:14 Developer advocacy, content strategy, and video tips
57:16 Will’s current projects and future plans for content creation
🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - https://www.linkedin.com/company/datatalks-club/
Twitter - https://twitter.com/DataTalksClub
Website - https://datatalks.club/
🔗 CONNECT WITH WILL
LinkedIn - https://www.linkedin.com/in/wrussell1999/
Twitter - https://x.com/wrussell1999
GitHub - https://github.com/wrussell1999
Website - https://wrussell.co.uk/
In this podcast episode, we talked with Lavanya Gupta about Building a Strong Career in Data.
About the Speaker: Lavanya is a Carnegie Mellon University (CMU) alumna of the Language Technologies Institute (LTI). She works as a Sr. AI/ML Applied Associate at JPMorgan Chase in their specialized Machine Learning Center of Excellence (MLCOE) vertical. Her latest research on long-context evaluation of LLMs was published in EMNLP 2024. In addition to having a strong industrial research background of 5+ years, she is also an enthusiastic technical speaker. She has delivered talks at events such as Women in Data Science (WiDS) 2021, PyData, Illuminate AI 2021, TensorFlow User Group (TFUG), and MindHack! Summit. She also serves as a reviewer at top-tier NLP conferences (NeurIPS 2024, ICLR 2025, NAACL 2025). Additionally, through her collaborations with various prestigious organizations, like Anita Borg and Women in Coding and Data Science (WiCDS), she is committed to mentoring aspiring machine learning enthusiasts.
In this episode, we talk about Lavanya Gupta’s journey from software engineer to AI researcher. She shares how hackathons sparked her passion for machine learning, her transition into NLP, and her current work benchmarking large language models in finance. Tune in for practical insights on building a strong data career and navigating the evolving AI landscape.
🕒 TIMECODES
00:00 Lavanya’s journey from software engineer to AI researcher
10:15 Benchmarking long context language models
12:36 Limitations of large context models in real domains
14:54 Handling large documents and publishing research in industry
19:45 Building a data science career: publications, motivation, and mentorship
25:01 Self-learning, hackathons, and networking
33:24 Community work and Kaggle projects
37:32 Mentorship and open-ended guidance
51:28 Building a strong data science portfolio
🔗 CONNECT WITH LAVANYA
LinkedIn - / lgupta18
🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/