In this episode, we talk with Micheal Lanham, an AI and software innovator with over two decades of experience spanning game development, fintech, oil and gas, and agricultural tech. Micheal shares his journey from building neural network-based games and evolutionary algorithms to writing influential books on AI agents and deep learning. He offers insights into the evolving AI landscape, practical uses of AI agents, and the future of generative AI in gaming and beyond.

TIMECODES
00:00 Micheal Lanham’s career journey and AI agent books
05:45 Publishing journey: AR, Pokémon Go, sound design, and reinforcement learning
10:00 Evolution of AI: evolutionary algorithms, deep learning, and agents
13:33 Evolutionary algorithms in prompt engineering and LLMs
18:13 AI agent books second edition and practical applications
20:57 AI agent workflows: minimalism, task breakdown, and collaboration
26:25 Collaboration and orchestration among AI agents
31:24 Tools and reasoning servers for agent communication
35:17 AI agents in game development and generative AI impact
38:57 Future of generative AI in gaming and immersive content
41:42 Coding agents, new LLMs, and local deployment
45:40 AI model trends and data scientist career advice
53:36 Cognitive testing, evaluation, and monitoring in AI
58:50 Publishing details and closing remarks

Connect with Micheal
LinkedIn - https://www.linkedin.com/in/micheal-lanham-189693123/

Connect with DataTalks.Club:
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
GitHub: https://github.com/DataTalksClub
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

At PyData Berlin, community members and industry voices highlighted how AI and data tooling are evolving across knowledge graphs, MLOps, small-model fine-tuning, explainability, and developer advocacy.

- Igor Kvachenok (Leuphana University / ProKube) combined knowledge graphs with LLMs for structured data extraction in the polymer industry, and noted how MLOps is shifting toward LLM-focused workflows.
- Selim Nowicki (Distill Labs) introduced a platform that uses knowledge distillation to fine-tune smaller models efficiently, making model specialization faster and more accessible.
- Gülsah Durmaz (Architect & Developer) shared her transition from architecture to coding, creating Python tools for design automation and volunteering with PyData through PyLadies.
- Yashasvi Misra (Pure Storage) spoke on explainable AI, stressing accountability and compliance, and shared her perspective as both a data engineer and active Python community organizer.
- Mehdi Ouazza (MotherDuck) reflected on developer advocacy through video, workshops, and branding, showing how creative communication boosts adoption of open-source tools like DuckDB.

Igor Kvachenok
Master’s student in Data Science at Leuphana University of Lüneburg, writing a thesis on LLM-enhanced data extraction for the polymer industry. Builds RDF knowledge graphs from semi-structured documents and works at ProKube on MLOps platforms powered by Kubeflow and Kubernetes.
Connect: https://www.linkedin.com/in/igor-kvachenok/

Selim Nowicki
Founder of Distill Labs, a startup making small-model fine-tuning simple and fast with knowledge distillation. Previously led data teams at Berlin startups like Delivery Hero, Trade Republic, and Tier Mobility. Sees parallels between today’s ML tooling and dbt’s impact on analytics.
Connect: https://www.linkedin.com/in/selim-nowicki/

Gülsah Durmaz
Architect turned developer, creating Python-based tools for architectural design automation with Rhino and Grasshopper. Active in PyLadies and a volunteer at PyData Berlin, she values the community for networking and learning, and aims to bring ML into architecture workflows.
Connect: https://www.linkedin.com/in/gulsah-durmaz/

Yashasvi (Yashi) Misra
Data Engineer at Pure Storage, community organizer with PyLadies India, PyCon India, and Women Techmakers. Advocates for inclusive spaces in tech and speaks on explainable AI, bridging her day-to-day in data engineering with her passion for ethical ML.
Connect: https://www.linkedin.com/in/misrayashasvi/

Mehdi Ouazza
Developer Advocate at MotherDuck, formerly a data engineer, now focused on building community and education around DuckDB. Runs popular YouTube channels ("mehdio DataTV" and "MotherDuck") and delivered a hands-on workshop at PyData Berlin. Blends technical clarity with creative storytelling.
Connect: https://www.linkedin.com/in/mehd-io/

In this episode, we talk with Daniel, an astrophysicist turned machine learning engineer and AI ambassador. Daniel shares his journey bridging astronomy and data science, how he leveraged live courses and public knowledge sharing to grow his skills, and his experiences working on cutting-edge radio astronomy projects and AI deployments. He also discusses practical advice for beginners in data and astronomy, and insights on career growth through community and continuous learning.

TIMECODES
00:00 Lunar eclipse story and Daniel’s astronomy career
04:12 Electromagnetic spectrum and MeerKAT data explained
10:39 Data analysis and positional cross-correlation challenges
15:25 Physics behind radio star detection and observation limits
16:35 Radio astronomy’s advantage and machine learning potential
20:37 Radio astronomy progress and Daniel’s ML journey
26:00 Python tools and experience with ZoomCamps
31:26 Intel internship and exploring LLMs
41:04 Sharing progress and course projects with orchestration tools
44:49 Setting up Airflow 3.0 and building data pipelines
47:39 AI startups, training resources, and NVIDIA courses
50:20 Student access to education, NVIDIA experience, and beginner astronomy programs
57:59 Skills, projects, and career advice for beginners
59:19 Starting with data science or engineering
1:00:07 Course sponsorship, data tools, and learning resources

Connect with Daniel
LinkedIn - / egbodaniel

Connect with DataTalks.Club:
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
GitHub: https://github.com/DataTalksClub
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

At Berlin Buzzwords, industry voices highlighted how search is evolving with AI and LLMs.

- Kacper Łukawski (Qdrant) stressed hybrid search (semantic + keyword) as core for RAG systems and promoted efficient embedding models for smaller-scale use.
- Manish Gill (ClickHouse) discussed auto-scaling OLAP databases on Kubernetes, combining infrastructure and database knowledge.
- André Charton (Kleinanzeigen) reflected on scaling search for millions of classifieds, moving from Solr/Elasticsearch toward vector search, while returning to a hands-on technical role.
- Filip Makraduli (Superlinked) introduced a vector-first framework that fuses multiple encoders into one representation for nuanced e-commerce and recommendation search.
- Brian Goldin (Voyager Search) emphasized spatial context in retrieval, combining geospatial data with AI enrichment to add the “where” to search.
- Atita Arora (Voyager Search) highlighted geospatial AI models, the renewed importance of retrieval in RAG, and the cautious but promising rise of AI agents.

Together, their perspectives show a common thread: search is regaining center stage in AI, with scaling, hybridization, multimodality, and domain-specific enrichment shaping the next generation of retrieval systems.

Kacper Łukawski
Senior Developer Advocate at Qdrant, he educates users on vector and hybrid search. He highlighted Qdrant’s support for dense and sparse vectors, the role of search with LLMs, and his interest in cost-effective models like static embeddings for smaller companies and edge apps.
Connect: https://www.linkedin.com/in/kacperlukawski/

Manish Gill
Engineering Manager at ClickHouse, he spoke about running ClickHouse on Kubernetes, tackling auto-scaling and stateful sets. His team focuses on making ClickHouse scale automatically in the cloud. He credited its speed to careful engineering and reflected on the shift from IC to manager.
Connect: https://www.linkedin.com/in/manishgill/

André Charton
Head of Search at Kleinanzeigen, he discussed shaping the company’s search tech, moving from Solr to Elasticsearch and now vector search with Vespa. Kleinanzeigen handles 60M items, 1M new listings daily, and 50k requests/sec. André explained his career shift back to hands-on engineering.
Connect: https://www.linkedin.com/in/andrecharton/

Filip Makraduli
Founding ML DevRel engineer at Superlinked, an open-source framework for AI search and recommendations. Its vector-first approach fuses multiple encoders (text, images, structured fields) into composite vectors for single-shot retrieval. His Berlin Buzzwords demo showed e-commerce search with natural-language queries and filters.
Connect: https://www.linkedin.com/in/filipmakraduli/

Brian Goldin
Founder and CEO of Voyager Search, which began with geospatial search and expanded into documents and metadata enrichment. Voyager indexes spatial data and enriches pipelines with NLP, OCR, and AI models to detect entities like oil spills or windmills. He stressed adding spatial context (“the where”) as critical for search and highlighted Voyager’s 12 years of enterprise experience.
Connect: https://www.linkedin.com/in/brian-goldin-04170a1/

Atita Arora
Director of AI at Voyager Search, with nearly 20 years in retrieval systems, now focused on geospatial AI for Earth observation data. At Berlin Buzzwords she hosted sessions, attended talks on Lucene, GPUs, and Solr, and emphasized retrieval quality in RAG systems. She is cautiously optimistic about AI agents and values the event as both learning hub and professional reunion.
Connect: https://www.linkedin.com/in/atitaarora/

In this episode, we talked with Pastor, a medical doctor who built a career in machine learning while studying medicine. Pastor shares how he balanced both fields, leveraged live courses and public sharing to grow his skills, and found opportunities through freelancing and mentoring.

TIMECODES
00:00 Pastor’s background and early programming journey
06:05 Learning new tools and skills on the job while studying medicine
11:44 Balancing medical studies with data science work and motivation
13:48 Applying medical knowledge to data science and vice versa
18:44 Starting freelance work on Upwork and overcoming language challenges
24:03 Joining the machine learning engineering course and benefits of live cohorts
27:41 Engaging with the course community and sharing progress publicly
35:16 Using LinkedIn and social media for career growth and interview opportunities
41:03 Building reputation, structuring learning, and leveraging course projects
50:53 Volunteering and mentoring with DeepLearning.AI and Stanford Code in Place
57:00 Managing time and staying productive while studying medicine and machine learning

Connect with Pastor
Twitter - https://x.com/PastorSotoB1
LinkedIn - / pastorsoto
GitHub - https://github.com/sotoblanco
Website - https://substack.com/@pastorsoto

Connect with DataTalks.Club:
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
GitHub: https://github.com/DataTalksClub
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

Struggling with data trust issues, dashboard drama, or constant pipeline firefighting? In this deep-dive interview, Lior Barak shows you how to shift from a reactive “fix-it” culture to a mindful, impact-driven practice rooted in Zen/Wabi-Sabi principles.

You’ll learn:
- Why 97% of CEOs say they use data, but only 24% call themselves data-driven
- The traffic-light dashboard pattern (green / yellow / red) that instantly tells execs whether numbers are safe to use
- A practical rule for balancing maintenance, rollout, and innovation, and avoiding team burnout
- How to quantify ROI on data products, kill failing legacy systems, and handle ad-hoc exec requests without derailing roadmaps
- Turning “imperfect” data into business value with mindful communication, root-cause logs, and automated incident review loops

🕒 TIMECODES
00:00 Community and mindful data strategy
04:06 Career journey and product management insights
08:03 Wabi-sabi data and the trust crisis
11:47 AI, data imperfection, and trust challenges
20:05 Trust crisis examples and root cause analysis
25:06 Regaining trust through mindful data management
30:47 Traffic light system and effective communication
37:41 Communication gaps and team workload balance
39:58 Maintenance stress and embracing Zen mindset
49:29 Accepting imperfection and measuring impact
56:19 Legacy systems and managing executive requests
01:00:23 Role guidance and closing reflections

🔗 Connect with Lior
LinkedIn - https://www.linkedin.com/in/liorbarak
Website - https://cookingdata.substack.com/
Cooking Data newsletter: https://cookingdata.substack.com/
Data product lifecycle manager: https://app--data-product-lifecycle-manager-c81b10bb.base44.app/

🔗 Connect with DataTalks.Club
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/u/0/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
Check other upcoming events - https://lu.ma/dtc-events
GitHub: https://github.com/DataTalksClub
LinkedIn - https://www.linkedin.com/company/datatalks-club/
Twitter - https://x.com/DataTalksClub
Website - https://datatalks.club/

🔗 Connect with Alexey
Twitter - https://x.com/Al_Grigor
LinkedIn - https://www.linkedin.com/in/agrigorev/

In this episode, we talk with Orell about his journey from electrical engineering to freelancing in data engineering. We explore lessons from startup life, working with messy industrial data, the realities of freelancing, and how to stay up to date with new tools.

Topics covered:
- Why Orell left a PhD and a simulation-focused start-up after COVID hit
- What he learned trying (and failing) to commercialise medical-imaging simulations
- The first freelance project and the long, quiet months that followed
- How he now finds clients, keeps projects small and delivers value quickly
- Typical work he does for industrial companies: parsing messy machine logs, building simple pipelines, adding structure later
- Favorite everyday tools (Python, DuckDB, a bit of C++) and the habit of blocking time for learning
- Advice for anyone thinking about freelancing: cash runway, networking, and focusing on problems rather than “perfect” tech choices

A practical conversation for listeners who are curious about moving from research or permanent roles into freelance data engineering.

🕒 TIMECODES
0:00 Orell’s career and move to freelancing
9:04 Startup experience and data engineering lessons
16:05 Academia vs. startups and starting freelancing
25:33 Early freelancing challenges and networking
34:22 Freelance data engineering and messy industrial data
43:27 Staying practical, learning tools, and growth
50:33 Freelancing challenges and client acquisition
58:37 Tools, problem-solving, and manual work

🔗 CONNECT WITH ORELL
Bluesky - https://bsky.app/profile/orgarten.bsk...
LinkedIn - / ogarten
GitHub - https://github.com/orgarten
Website - https://orellgarten.com

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
GitHub: https://github.com/DataTalksClub
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

🔗 CONNECT WITH ALEXEY
Twitter - / al_grigor
LinkedIn - / agrigorev

Thinking about swapping your 9-to-5 for client work, but worried that a long German-style notice period will kill your chances? In this live interview, seven-year data-freelance veteran Dimitri walks through his experience of taking his freelance career to the next level.

About the Speaker: Dimitri Visnadi is an independent data consultant with a focus on data strategy. He has consulted for companies leading the marketing data space, such as Unilever, Ferrero, Heineken, and Red Bull.

He has lived and worked in 6 countries across Europe in both corporate and startup organizations. He was part of data departments at Hewlett-Packard (HP) and a Google-partnered consulting firm, where he worked on data products and strategy.

Having received a Master's in Business Analytics with Computer Science from University College London and a Bachelor's in Business Administration from John Cabot University, Dimitri still has close ties to academia and holds a mentor position in entrepreneurship at both institutions.

🕒 TIMECODES
00:00 Dimitri’s journey from corporate to freelance data specialist
05:41 Job tenure trends, tech career shifts, and freelance types
10:50 Freelancing challenges, success, and finding clients
17:33 Freelance market trends and Dimitri’s job board
23:51 Starting points, top freelance skills, and market insights
32:48 Building a lifestyle business: scaling and work-life balance
45:30 Data Freelancer course and marketing for freelancers
48:33 Subscription services and managing client relationships
56:47 Pricing models and transitioning advice
1:01:02 Notice periods, networking, and risks in freelancing transition

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

🔗 CONNECT WITH DIMITRI
LinkedIn - https://www.linkedin.com/in/visnadi/

In this podcast episode, we talked with Will Russell about From Hackathons to Developer Advocacy.

About the Speaker: Will Russell is a Developer Advocate at Kestra, known for his videos on workflow orchestration. Previously, Will built open source education programs to help up-and-coming developers make their first contributions in open source. With a passion for developer education, Will creates technical video content and documentation that makes technologies more approachable for developers.

In this episode, we sit down with Will, a developer advocate, content creator, and passionate community builder. We’ll hear about his unique path through tech, the lessons he’s learned, and his approach to making complex topics accessible and engaging. Whether you’re curious about open source, hackathons, or what it’s like to bridge the gap between developers and the broader tech community, this conversation is full of insights and inspiration.

🕒 TIMECODES
0:00 Introduction, career journeys, and video setup and workflow
10:41 From hackathons to open source: Early experiences and learning
16:04 Becoming a hackathon organizer and the value of soft skills
23:18 How to organize a hackathon, memorable projects, and creativity
33:39 Major League Hacking: Building community and scaling student programs
41:16 Mentorship, development environments, and onboarding in open source
49:14 Developer advocacy, content strategy, and video tips
57:16 Will’s current projects and future plans for content creation

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - https://www.linkedin.com/company/datatalks-club/
Twitter - https://twitter.com/DataTalksClub
Website - https://datatalks.club/

🔗 CONNECT WITH WILL
LinkedIn - https://www.linkedin.com/in/wrussell1999/
Twitter - https://x.com/wrussell1999
GitHub - https://github.com/wrussell1999
Website - https://wrussell.co.uk/

In this podcast episode, we talked with Lavanya Gupta about Building a Strong Career in Data.

About the Speaker: Lavanya is a Carnegie Mellon University (CMU) alumna of the Language Technologies Institute (LTI). She works as a Sr. AI/ML Applied Associate at JPMorgan Chase in their specialized Machine Learning Center of Excellence (MLCOE) vertical. Her latest research on long-context evaluation of LLMs was published at EMNLP 2024. In addition to having a strong industrial research background of 5+ years, she is also an enthusiastic technical speaker. She has delivered talks at events such as Women in Data Science (WiDS) 2021, PyData, Illuminate AI 2021, TensorFlow User Group (TFUG), and MindHack! Summit. She also serves as a reviewer at top-tier NLP conferences (NeurIPS 2024, ICLR 2025, NAACL 2025). Additionally, through her collaborations with various prestigious organizations, like AnitaB.org and Women in Coding and Data Science (WiCDS), she is committed to mentoring aspiring machine learning enthusiasts.

In this episode, we talk about Lavanya Gupta’s journey from software engineer to AI researcher. She shares how hackathons sparked her passion for machine learning, her transition into NLP, and her current work benchmarking large language models in finance. Tune in for practical insights on building a strong data career and navigating the evolving AI landscape.

🕒 TIMECODES
00:00 Lavanya’s journey from software engineer to AI researcher
10:15 Benchmarking long context language models
12:36 Limitations of large context models in real domains
14:54 Handling large documents and publishing research in industry
19:45 Building a data science career: publications, motivation, and mentorship
25:01 Self-learning, hackathons, and networking
33:24 Community work and Kaggle projects
37:32 Mentorship and open-ended guidance
51:28 Building a strong data science portfolio

🔗 CONNECT WITH LAVANYA
LinkedIn - / lgupta18

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

In this podcast episode, we talked with Eddy Zulkifly about From Supply Chain Management to Digital Warehousing and FinOps.

About the Speaker: Eddy Zulkifly is a Staff Data Engineer at Kinaxis, building robust data platforms across Google Cloud, Azure, and AWS. With a decade of experience in data, he actively shares his expertise as a Mentor on ADPList and Teaching Assistant at Uplimit. Previously, he was a Senior Data Engineer at Home Depot, specializing in e-commerce and supply chain analytics. Currently pursuing a Master’s in Analytics at the Georgia Institute of Technology, Eddy is also passionate about open-source data projects and enjoys watching/exploring the analytics behind the Fantasy Premier League.

In this episode, we dive into the world of data engineering and FinOps with Eddy Zulkifly, Staff Data Engineer at Kinaxis. Eddy shares his unconventional career journey, from optimizing physical warehouses with Excel to building digital data platforms in the cloud.

🕒 TIMECODES
0:00 Eddy’s career journey: From supply chain to data engineering
8:18 Tools & learning: Excel, Docker, and transitioning to data engineering
21:57 Physical vs. digital warehousing: Analogies and key differences
31:40 Introduction to FinOps: Cloud cost optimization and vendor negotiations
40:18 Resources for FinOps: Certifications and the FinOps Foundation
45:12 Standardizing cloud cost reporting across AWS/GCP/Azure
50:04 Eddy’s master’s degree and closing thoughts

🔗 CONNECT WITH EDDY
Twitter - https://x.com/eddarief
LinkedIn - https://www.linkedin.com/in/eddyzulkifly/
GitHub: https://github.com/eyzyly/eyzyly
ADPList: https://adplist.org/mentors/eddy-zulkifly

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - https://www.linkedin.com/company/datatalks-club/
Twitter - https://twitter.com/DataTalksClub
Website - https://datatalks.club/

In this podcast episode, we talked with Bartosz Mikulski about Data Intensive AI.

About the Speaker: Bartosz is an AI and data engineer. He specializes in moving AI projects from the good-enough-for-a-demo phase to production by building a testing infrastructure and fixing the issues detected by tests. On top of that, he teaches programmers and non-programmers how to use AI. He contributed one chapter to the book 97 Things Every Data Engineer Should Know, and he was a speaker at several conferences, including Data Natives, Berlin Buzzwords, and Global AI Developer Days.

In this episode, we discuss Bartosz’s career journey, the importance of testing in data pipelines, and how AI tools like ChatGPT and Cursor are transforming development workflows. From prompt engineering to building Chrome extensions with AI, we dive into practical use cases, tools, and insights for anyone working in data-intensive AI projects. Whether you’re a data engineer, AI enthusiast, or just curious about the future of AI in tech, this episode offers valuable takeaways and real-world experiences.

0:00 Introduction to Bartosz and his background
4:00 Bartosz’s career journey from Java development to AI engineering
9:05 The importance of testing in data engineering
11:19 How to create tests for data pipelines
13:14 Tools and approaches for testing data pipelines
17:10 Choosing Spark for data engineering projects
19:05 The connection between data engineering and AI tools
21:39 Use cases of AI in data engineering and MLOps
25:13 Prompt engineering techniques and best practices
31:45 Prompt compression and caching in AI models
33:35 Thoughts on DeepSeek and open-source AI models
35:54 Using AI for lead classification and LinkedIn automation
41:04 Building Chrome extensions with AI integration
43:51 Comparing Cursor and GitHub Copilot for coding
47:11 Using ChatGPT and Perplexity for AI-assisted tasks
52:09 Hosting static websites and using AI for development
54:27 How blogging helps attract clients and share knowledge
58:15 Using AI to assist with writing and content creation

🔗 CONNECT WITH Bartosz
LinkedIn: https://www.linkedin.com/in/mikulskibartosz/
GitHub: https://github.com/mikulskibartosz
Website: https://mikulskibartosz.name/blog/

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - https://www.linkedin.com/company/datatalks-club/
Twitter - https://twitter.com/DataTalksClub
Website - https://datatalks.club/

In this podcast episode, we talked with Nemanja Radojkovic about MLOps in Corporations and Startups.

About the Speaker: Nemanja Radojkovic is Senior Machine Learning Engineer at Euroclear.

In this event, we’re diving into the world of MLOps, comparing life in startups versus big corporations. Joining us again is Nemanja, a seasoned machine learning engineer with experience spanning Fortune 500 companies and agile startups. We explore the challenges of scaling MLOps on a shoestring budget, the trade-offs between corporate stability and startup agility, and practical advice for engineers deciding between these two career paths, whether they’re navigating legacy frameworks or experimenting with cutting-edge tools.

1:00 MLOps in corporations versus startups
6:03 The agility and pace of startups
7:54 MLOps on a shoestring budget
12:54 Cloud solutions for startups
15:06 Challenges of cloud complexity versus on-premise
19:19 Selecting tools and avoiding vendor lock-in
22:22 Choosing between a startup and a corporation
27:30 Flexibility and risks in startups
29:37 Bureaucracy and processes in corporations
33:17 The role of frameworks in corporations
34:32 Advantages of large teams in corporations
40:01 Challenges of technical debt in startups
43:12 Career advice for junior data scientists
44:10 Tools and frameworks for MLOps projects
49:00 Balancing new and old technologies in skill development
55:43 Data engineering challenges and reliability in LLMs
57:09 On-premise vs. cloud solutions in data-sensitive industries
59:29 Alternatives like Dask for distributed systems

🔗 CONNECT WITH NEMANJA
LinkedIn - / radojkovic
GitHub - https://github.com/baskervilski

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

In this podcast episode, we talked with Adrian Brudaru about the past, present and future of data engineering.

About the speaker: Adrian Brudaru studied economics in Romania but soon got bored with how creative the industry was, and chose to go instead for the more factual side. He ended up in Berlin at the age of 25 and started a role as a business analyst. At the age of 30, he had enough of startups and decided to join a corporation, but quickly found out that it did not provide the challenge he wanted.

As going back to startups was not a desirable option either, he decided to postpone his decision by taking freelance work and has never looked back since. Five years later, he co-founded a company in the data space to try new things. This company is also looking to release open source tools to help democratize data engineering.

0:00 Introduction to DataTalks.Club
1:05 Discussing trends in data engineering with Adrian
2:03 Adrian's background and journey into data engineering
5:04 Growth and updates on Adrian's company, dltHub
9:05 Challenges and specialization in data engineering today
13:00 Opportunities for data engineers entering the field
15:00 The "Modern Data Stack" and its evolution
17:25 Emerging trends: AI integration and Iceberg technology
27:40 DuckDB and the emergence of portable, cost-effective data stacks
32:14 The rise and impact of dbt in data engineering
34:08 Alternatives to dbt: SQLMesh and others
35:25 Workflow orchestration tools: Airflow, Dagster, Prefect, and GitHub Actions
37:20 Audience questions: Career focus in data roles and AI engineering overlaps
39:00 The role of semantics in data and AI workflows
41:11 Focusing on learning concepts over tools when entering the field
45:15 Transitioning from backend to data engineering: challenges and opportunities
47:48 Current state of the data engineering job market in Europe and beyond
49:05 Introduction to Apache Iceberg, Delta, and Hudi file formats
50:40 Suitability of these formats for batch and streaming workloads
52:29 Tools for streaming: Kafka, SQS, and related trends
58:07 Building AI agents and enabling intelligent data applications
59:09 Closing discussion on the place of tools like dbt in the ecosystem

🔗 CONNECT WITH ADRIAN BRUDARU
LinkedIn - / data-team
Website - https://adrian.brudaru.com/

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

In this podcast episode, we talked with Alexander Guschin about launching a career off Kaggle.

About the Speaker: Alexander Guschin is a Machine Learning Engineer with 10+ years of experience, a Kaggle Grandmaster ranked 5th globally, and a teacher to 100K+ students. He leads DS and SE teams and contributes to open-source ML tools.

0:00 Starting with Machine Learning: Challenges and Early Steps
13:05 Community and Learning Through Kaggle Sessions
17:10 Broadening Skills Through Kaggle Participation
18:54 Early Competitions and Lessons Learned
21:10 Transitioning to Simpler Solutions Over Time
23:51 Benefits of Kaggle for Starting a Career in Machine Learning
29:08 Teamwork vs. Solo Participation in Competitions
31:14 Schoolchildren in AI Competitions
42:33 Transition to Industry and MLOps
50:13 Encouraging teamwork in student projects
50:48 Designing competitive machine learning tasks
52:22 Leaderboard types for tracking performance
53:44 Managing small-scale university classes
54:17 Experience with Coursera and online teaching
59:40 Convincing managers about Kaggle's value
61:38 Secrets of Kaggle competition success
63:11 Generative AI's impact on competitive ML
65:13 Evolution of automated ML solutions
66:22 Reflecting on competitive data science experience

🔗 CONNECT WITH ALEXANDER GUSCHIN
LinkedIn - https://www.linkedin.com/in/1aguschin/
Website - https://www.aguschin.com/

🔗 CONNECT WITH DataTalksClub
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Datalike Substack - https://datalike.substack.com/
LinkedIn: / datatalks-club

In this podcast episode, we talked with Andrey Cheptsov about the future of AI infrastructure. About the Speaker: Andrey Cheptsov is the founder and CEO of dstack, an open-source alternative to Kubernetes and Slurm, built to simplify the orchestration of AI infrastructure. Before dstack, Andrey worked at JetBrains for over a decade helping different teams make the best developer tools. During the event, Andrey discussed the complexities of AI infrastructure. We explored topics like the challenges of using Kubernetes for AI workloads, the need to rethink container orchestration, and the future of hybrid and cloud-only infrastructures. Andrey also shared insights into the role of on-premise and bare-metal solutions, edge computing, and federated learning. 00:00 Andrey's Career Journey: From JetBrains to dstack 5:00 The Motivation Behind dstack 7:00 Challenges in Machine Learning Infrastructure 10:00 Transitioning from Cloud to On-Prem Solutions 14:30 Reflections on OpenAI's Evolution 17:30 Open Source vs Proprietary Models: A Balanced Perspective 21:01 Monolithic vs. Decentralized AI businesses 22:05 The role of privacy and control in AI for industries like banking and healthcare 30:00 Challenges in training large AI models: GPUs and distributed systems 37:03 DeepSpeed's efficient training approach vs. brute force methods 39:00 Challenges for small and medium businesses: hosting and fine-tuning models 47:01 Managing Kubernetes challenges for AI teams 52:00 Hybrid vs. cloud-only infrastructure 56:03 On-premise vs. bare-metal solutions 58:05 Exploring edge computing and its challenges 🔗 CONNECT WITH ANDREY CHEPTSOV Twitter - / andrey_cheptsov LinkedIn - / andrey-cheptsov GitHub - https://github.com/dstackai/dstack/ Website - https://dstack.ai/ 🔗 CONNECT WITH DataTalksClub Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html Datalike Substack - https://datalike.substack.com/ LinkedIn: / datatalks-club
In this podcast episode, we talked with Tamara Atanasoska about building fair AI systems. About the Speaker: Tamara works on ML explainability, interpretability and fairness as an Open Source Software Engineer at probable. She is a maintainer of fairlearn and a contributor to scikit-learn and skops. Tamara has both a computer science/software engineering and a computational linguistics (NLP) background. During the event, the guest discussed their career journey from software engineering to open-source contributions, focusing on explainability in AI through scikit-learn and Fairlearn. They explored fairness in AI, including challenges in credit loans, hiring, and decision-making, and emphasized the importance of tools, human judgment, and collaboration. The guest also shared their involvement with PyLadies and encouraged contributions to Fairlearn. 00:00 Introduction to the event and the community 01:51 Topic introduction: Linguistic fairness and socio-technical perspectives in AI 02:37 Guest introduction: Tamara’s background and career 03:18 Tamara’s career journey: Software engineering, music tech, and computational linguistics 09:53 Tamara’s background in language and computer science 14:52 Exploring fairness in AI and its impact on society 21:20 Fairness in AI models 26:21 Automating fairness analysis in models 32:32 Balancing technical and domain expertise in decision-making 37:13 The role of humans in the loop for fairness 40:02 Joining Probable and working on open-source projects 46:20 The skops library and its integration with Hugging Face 50:48 PyLadies and community involvement 55:41 The ethos of scikit-learn and Fairlearn 🔗 CONNECT WITH TAMARA ATANASOSKA LinkedIn - https://www.linkedin.com/in/tamaraatanasoska GitHub - https://github.com/TamaraAtanasoska 🔗 CONNECT WITH DataTalksClub Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html Datalike Substack - https://datalike.substack.com/ LinkedIn: / datatalks-club
In this podcast episode, we talked with Agita Jaunzeme about career choices, transitions and promotions in and out of tech. About the Speaker: Agita has designed a career spanning DevOps/DataOps engineering, management, community building, education, and facilitation. She has worked on projects across corporate, startup, open source, and non-governmental sectors. Following her passion, she founded an NGO focusing on the inclusion of expats and locals in Porto. Embodying the values of innovation, automation, and continuous learning, Agita provides practical insights on promotions, career pivots, and aligning work with passion and purpose. During this event, the guest discussed their career journey, starting with their transition from art school to programming and later into DevOps, eventually taking on leadership roles. They explored the challenges of burnout and the importance of volunteering, founding an NGO to support inclusion, gender equality, and sustainability. The conversation also covered key topics like mentorship, the differences between data engineering and data science, and the dynamics of managing volunteers versus employees. Additionally, the guest shared insights on community management, developer relations, and the importance of product vision and team collaboration. 0:00 Introduction and Welcome 1:28 Guest Introduction: Agita’s Background and Career Highlights 3:05 Transition to Tech: From Art School to Programming 5:40 Exploring DevOps and Growing into Leadership Roles 7:24 Burnout, Volunteering, and Founding an NGO 11:00 Volunteering and Mentorship Initiatives 14:00 Discovering Programming Skills and Early Career Challenges 15:50 Automating Work Processes and Earning a Promotion 19:00 Transitioning from DevOps to Volunteering and Project Management 24:00 Managing Volunteers vs. Employees and Building Organizational Skills 31:07 Personality traits in engineering vs. data roles 33:14 Differences in focus between data engineers and data scientists 36:24 Transitioning from volunteering to corporate work 37:38 The role and responsibilities of a community manager 39:06 Community management vs. developer relations activities 41:01 Product vision and team collaboration 43:35 Starting an NGO and legal processes 46:13 NGO goals: inclusion, gender equality, and sustainability 49:02 Community meetups and activities 51:57 Living off-grid in a forest and sustainability 55:02 Unemployment party and brainstorming session 59:03 Unemployment party: the process and structure 🔗 CONNECT WITH AGITA JAUNZEME LinkedIn - /agita 🔗 CONNECT WITH DataTalksClub Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html Datalike Substack - https://datalike.substack.com/ LinkedIn: / datatalks-club
In this podcast episode, we talked with Isabella Bicalho about career advice, learning, and featuring women in ML and AI. About the Speaker: Isabella is a Machine Learning Engineer and Data Scientist with three years of hands-on AI development experience. She draws upon her early computational research expertise to develop ML solutions. While contributing to open-source projects, she runs a newsletter dedicated to showcasing women's accomplishments in data science. During this event, the guest discussed her transition into machine learning, her freelance work in AI, and the growing AI scene in France. She shared insights on freelancing versus full-time work, the value of open-source contributions, and developing both technical and soft skills. The conversation also covered career advice, mentorship, and her Substack series on women in data science, emphasizing leadership, motivation, and career opportunities in tech. 0:00 Introduction 1:23 Background of Isabella Bicalho 2:02 Transition to machine learning 4:03 Study and work experience 5:00 Living in France and language learning 6:03 Internship experience 8:45 Focus areas of Inria 9:37 AI development in France 10:37 Current freelance work 11:03 Freelancing in machine learning 13:31 Moving from research to freelancing 14:03 Freelance vs. full-time data science 17:00 Finding first freelance client 18:00 Involvement in open-source projects 20:17 Passion for open-source and teamwork 23:52 Starting new projects 25:03 Community project experience 26:02 Teaching and learning 29:04 Contributing to open-source projects 32:05 Open-source tools vs. projects 33:32 Importance of community-driven projects 34:03 Learning resources 36:07 Green space segmentation project 39:02 Developing technical and soft skills 40:31 Gaining insights from industry experts 41:15 Understanding data science roles 41:31 Project challenges and team dynamics 42:05 Turnover in open-source projects 43:05 Managing expectations in open-source work 44:50 Mentorship in projects 46:17 Role of AI tools in learning 47:59 Overcoming learning challenges 48:52 Discussion on Substack 49:01 Interview series on women in data 50:15 Insights from women in data science 51:20 Impactful stories from Substack 53:01 Leadership challenges in projects 54:19 Career advice and opportunities 56:07 Motivating others to step out of comfort zone 57:06 Contacting for Substack story sharing 58:00 Closing remarks and connections 🔗 CONNECT WITH ISABELLA BICALHO GitHub: https://github.com/bellabf LinkedIn: / isabella-frazeto 🔗 CONNECT WITH DataTalksClub Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html Datalike Substack - https://datalike.substack.com/ LinkedIn: / datatalks-club
Reflection on an Almost Two-Year Journey of Generative AI in Industry – Maria Sukhareva About the speaker: Maria Sukhareva is a principal key expert in Artificial Intelligence at Siemens with over 15 years of experience at the forefront of generative AI technologies. Known for her keen eye for technological innovation, Maria excels at transforming cutting-edge AI research into practical, value-driven tools that address real-world needs. Her approach is both hands-on and results-focused, with a commitment to creating scalable, long-term solutions that improve communication, streamline complex processes, and empower smarter decision-making. Maria's work reflects a balanced vision, where the power of innovation is met with ethical responsibility, ensuring that her AI projects deliver impactful and production-ready outcomes. We talked about: 00:00 DataTalks.Club intro 02:13 Career journey: From linguistics to AI 08:02 The evolution of AI expertise and its future 13:10 AI vulnerabilities: Bypassing bot restrictions 17:00 Non-LLM classifiers as a more robust solution 22:56 Risks of chatbot deployment: Reputational and financial 27:13 The role of AI as a tool, not a replacement for human workers 31:41 The role of human translators in the age of AI 34:49 Evolution of English and its Germanic roots 38:44 Beowulf and Old English 39:43 Impact of the Norman occupation on English grammar 42:34 Identifying mushrooms with AI apps and safety precautions 45:08 Decoding ancient languages like Sumerian 49:43 The evolution of machine translation and multilingual models 53:01 Challenges with low-resource languages and inconsistent orthography 57:28 Transition from academia to industry in AI Join our Slack: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html