The Data Exchange with Ben Lorica

Author: Ben Lorica
© 2023 The Data Exchange with Ben Lorica
Description
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning, and AI. Detailed show notes for each episode can be found on https://thedataexchange.media/. The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
172 Episodes
Paras Jain and Sarah Wooders are graduate students at UC Berkeley’s Sky Computing Lab. They are part of the team behind Skyplane, an open source project that accelerates wide-area transfers in the cloud via overlay routing and parallelism. Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/ Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS. Detailed show notes can be found on The Data Exchange web site.
Pablo Villalobos is a Staff Researcher at Epoch, and lead author of the recent paper “Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning”. We discuss the key findings in this paper, as well as a related study Pablo conducted on scaling laws.
Jinsung Yoon (Senior Research Scientist) and Sercan Arik (Staff Research Scientist and Manager) are part of the Google team behind EHR-Safe, a set of tools for generating highly realistic and privacy-preserving synthetic Electronic Health Records.
Brandon Jenkins is Co-founder and COO of Fundrise, the largest direct-to-individuals alternative investment platform in the country. Our conversation centered on their recent foray into technology investing, specifically startup companies in the data infrastructure space.
Zongheng Yang is a researcher in the Sky Computing Lab at UC Berkeley, a multi-year research initiative that utilizes distributed systems, programming languages, security, and machine learning to separate the services that a company requires from the choice of a specific cloud. He provides a detailed overview and update on SkyPilot, a groundbreaking intercloud broker that views the cloud ecosystem as a unified and integrated entity rather than a collection of disparate, largely incompatible clouds. SkyPilot enables users to run Machine Learning and Data Science batch jobs on any cloud, realize substantial cost savings, access the best hardware across clouds, and enjoy higher resource availability.
Jesse Anderson, Evan Chan, and I delve into the current developments and possibilities within the realm of data engineering and platforms. As the foundation for artificial intelligence and machine learning, data plays a crucial role in the advancement of these technologies. Download a copy of the FREE Report: https://gradientflow.com/2023trendsreport/
This week we discuss AI regulations with Gabriela Zanfir-Fortuna, VP for Global Privacy at the Future of Privacy Forum, and Andrew Burt, Managing Partner at BNH, the first law firm focused on AI and Analytics.
Dylan Patel is the Chief Analyst at SemiAnalysis, a boutique semiconductor research and consulting firm focused on the semiconductor supply chain from chemical inputs to fabs to design IP and strategy. In this episode, we discuss the emerging open source software stack for PyTorch that makes it easier and more accessible to implement non-Nvidia backends (see his recent post).
Peter Norvig (of Google and Stanford) and Alfred Spector (of MIT) are part of the team of authors behind the must-read book Data Science in Context: Foundations, Challenges, Opportunities. We discussed their recent book and took a deep dive into their Data Science Analysis Rubric. We also talked about trending topics in AI, including looming regulations, synthetic data, and Large Language and Foundation Models.
Percy Liang is Associate Professor of Computer Science and Statistics, and Director of the new Center for Research on Foundation Models at Stanford University. We discussed a new suite of tools (HELM) designed to help users and researchers understand language models in their totality. We also discussed recent trends in AI, including the rise of Generative AI and Foundation Models. Download a copy of our FREE 2023 Trends in Data and AI Report: https://gradientflow.com/2023trendsreport/
Jenn Webb, special correspondent and managing editor at Gradient Flow, recently organized a mini-panel to discuss themes and trends for 2023. The panel consisted of myself and Mikio Braun. More information on these trends can be found in our Annual Trends Report, which is available for free download (see details below). Download a copy of the FREE Report: https://gradientflow.com/2023trendsreport/
Given the growing interest in Generative AI, we revisit a conversation with Mark Chen, Research Scientist at OpenAI and part of the team behind DALL·E 2, a new AI system that can create realistic images and art based on natural language descriptions.
On this special end-of-the-year episode, we revisit conversations with two data science leaders in the e-commerce space: Wendy Foster, Director, Engineering & Data Science at Shopify, and Olivia Liao, Senior Director of Data Science at Stitch Fix.
Shayan Mohanty is the CEO of Watchful, a modern and interactive solution that places the control of data labeling back in the hands of data scientists, machine learning practitioners, and subject matter experts. This podcast focuses on a data management system (written in Rust) they built to deliver the level of automation and interactivity that Watchful requires.
Frank Liu is Director of Operations & ML Architect at Zilliz, the company behind Milvus, an open source vector database. We discuss their recent VLDB paper (“A Cloud Native Vector Database Management System”) that describes recent updates to Milvus, as well as vector databases and vector search in general.
Ira Cohen is co-founder and Chief Data Scientist at Anodot, a startup that uses time series tools to monitor business data in real time, so organizations can proactively resolve revenue, cost, and customer experience issues before they impact business performance. We recently wrote a well-received post that provided a detailed overview of the state of technologies for collecting, storing, and unlocking time series.
Roy Schwartz is Professor of Natural Language Processing at The Hebrew University of Jerusalem. We discussed a recent survey paper Roy co-wrote, which presents a broad overview of existing methods to improve NLP efficiency through the lens of traditional NLP pipelines.
On this Thanksgiving holiday weekend in the U.S., we revisit a Twitter Spaces conversation I had with Andrew Burt, Managing Partner at BNH, the first law firm focused on AI risks, and Bob Friday, Chief AI Officer at Juniper Networks.
Hung Bui is the CEO of VinAI, a premier Artificial Intelligence research-based company developing world-class products and services. Hung assembled the VinAI team just over three years ago and they are now among the Top 20 Global Companies in AI Research in 2022.
Bob van Luijt is CEO of SeMI Technologies, the company behind the popular vector search engine Weaviate. Bob describes Weaviate’s key features and core components and its popular use cases, and he also provides an overview of its near-term roadmap. We also discuss how vector search engines compare with existing data management systems.