Data Science Conversations

33 Episodes

Understanding Cause and Effect: Is Causal Discovery The Missing Layer in Artificial Intelligence?

2026-02-11 | 54:08

Michael Haft, founder of xplain Data, discusses causal discovery and causal AI, explaining how understanding cause-and-effect relationships goes beyond predictive modeling to enable truly intelligent interventions. He explores the technical foundations of object analytics, real-world applications in healthcare and manufacturing, and his vision for integrating causal AI into future intelligent systems.

Episode Summary
Causal Discovery vs. Prediction - Causal discovery aims to understand why things happen rather than just predicting what will happen. Unlike predictive models that rely on correlations, causal discovery identifies the true cause-and-effect relationships needed for intelligent interventions and goal achievement.
The Confounder Challenge - Understanding causality requires comprehensive data to identify confounders: hidden common causes that create spurious correlations. The gray hair and glasses example illustrates how age acts as a confounder, making the two correlated without a direct causal relationship between them.
Object Analytics Technology - Traditional machine learning requires flat tables, but real-world data (like electronic health records with 150+ tables) is inherently complex. Object analytics allows algorithms to work with comprehensive, holistic data structures, enabling deeper causal analysis without manual feature engineering.
Manufacturing Use Case - A cylinder head manufacturing example demonstrates how causal discovery identified the complete pathway from washing machine timing through part temperature to false negative leakage test results, enabling an intelligent process intervention that traditional predictive models couldn't provide.
Healthcare Applications - Projects using MIMIC hospital data analyze causes of pressure injuries in patients. The vision is to provide doctors with causal knowledge derived from millions of patient records to improve treatment decisions, discover new drug effects, and enable cost-efficient healthcare.
Path to Causal Maturity - Organizations need education on the difference between prediction and causality, comprehensive data availability, and engagement from both business owners (who have problems to solve) and data science teams. The shift requires iterative learning and hands-on experience with the technology.
Community Edition Launch - xplain Data is releasing a community edition starting with pre-configured object analytics models for the MIMIC healthcare dataset, followed by a full version for the broader data science community, with free access for universities and evaluation purposes.
Future of Causal AI - The next generation of AI systems will integrate causal layers with large language models, moving beyond text rephrasing to answering "why" questions based on empirical cause-and-effect relationships, particularly transforming healthcare and enabling more explainable, intelligent decision-making systems.
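
The gray hair and glasses example from the summary can be illustrated with a small simulation. This is only a sketch under invented assumptions, not the method discussed in the episode: the age range, the linear probabilities, and the names `pearson`, `people`, and `stratum` are all made up for illustration. Age drives both traits independently, so the two correlate across the whole population but not within a narrow age band.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: age is the only cause of both traits.
people = []
for _ in range(20000):
    age = rng.uniform(20, 80)
    p = (age - 20) / 60               # assumed: probability rises linearly with age
    gray = int(rng.random() < p)      # gray hair depends on age only
    glasses = int(rng.random() < p)   # glasses depend on age only
    people.append((age, gray, glasses))

# Correlation over everyone: age is not controlled for.
overall = pearson([g for _, g, _ in people], [s for _, _, s in people])

# Correlation inside a narrow age band: age is (approximately) held fixed.
stratum = [(g, s) for age, g, s in people if 48 <= age < 52]
within = pearson([g for g, _ in stratum], [s for _, s in stratum])

print(f"overall correlation:       {overall:.3f}")
print(f"within ages 48-52 (n={len(stratum)}): {within:.3f}")
```

In this toy setup the overall correlation is clearly positive while the within-band correlation is close to zero, which is the signature of a common cause rather than a direct effect. Real causal discovery has to find such confounders in the data instead of being told about them.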

Predicting the Next Financial Crisis: The 18-Year Cycle Peak and the Bursting of the AI Investment Bubble

2025-11-19 | 01:04:29

In this episode, Akhil Patel, a globally recognized expert in economic cycles, discusses the 18-year boom-bust pattern and warns that we're approaching the peak of the current cycle in 2026, with a major financial crisis likely in 2027. He analyzes the AI investment bubble, draws parallels to historical manias, and provides practical strategies for businesses and investors to prepare for the downturn.

Episode Summary
1. Understanding Economic Cycles - Akhil Patel explains why cycles matter, emphasizing that cyclical patterns appear throughout nature and human behavior, particularly in stock markets and economies. Understanding these rhythms helps predict both prosperity and crisis periods.
2. The 18-Year Cycle Theory - The hypothesis of a regular 18-year boom-bust cycle (sometimes 16-20 years) in Western economies, particularly the US and UK. This pattern, first identified by economist Homer Hoyt in the 1930s through Chicago land sales data, has preceded every major financial crisis over the past 200 years.
3. Land Values Drive Cycles - Land is identified as the key indicator because it's a scarce, monopolistic asset that captures economic surplus. Property prices and speculation patterns serve as the primary mechanism driving both the boom and bust phases, with banking credit amplifying these movements.
4. Current Cycle (2011-2026) - Walking through the present cycle, Akhil identifies 2011-2012 as the starting point following the 2008 crisis. The COVID pandemic compressed what would normally be a 7-year second half into just 2 years of mania (2020-2022), though we're still seeing bubble behavior in AI investments arriving on schedule.
5. AI Investment Bubble Analysis - The current AI sector exhibits classic bubble characteristics: inflated valuations disconnected from fundamentals, enormous capital investment with questionable returns, and incestuous interconnections between major players (Nvidia, OpenAI, Oracle). Parallels are drawn to the dot-com bubble, 1980s Japan, and 19th-century railway booms.
6. Crisis Timing: 2026-2027 - Akhil predicts the property market will peak in 2026, with a major financial crisis following 6-12 months later in 2027. The trigger location is uncertain but likely in areas with extreme speculation, possibly the Middle East, parts of Asia, or unexpectedly Germany, rather than the US, which remains cautious after 2008.
7. Practical Preparation Strategies - Key recommendations include: avoid leverage, build cash reserves, ensure businesses can survive revenue declines, don't buy based solely on capital gains momentum, and position to acquire assets during the downturn. The advice emphasizes survival first, then opportunistic expansion during recovery.
8. Future Outlook Beyond Crisis - Despite the predicted downturn, Akhil remains optimistic about the next cycle (post-2030), believing AI and blockchain technologies are genuinely transformative once properly applied. The tech sector typically leads recovery, offering significant opportunities for those who survive the crisis with resources intact.

"Insuring Non-Determinism": How Munich RE is Managing AI's Probabilistic Risks

2025-10-28 | 39:09

Peter Bärnreuther from Munich RE discusses the emerging field of AI insurance, explaining how companies can manage the inherent risks of probabilistic AI systems through specialized insurance products. The conversation covers real-world AI failures, different types of AI risks, and how insurance can help both corporations and AI vendors scale their operations safely.

Key Topics Discussed
Peter's Career Journey: Peter Bärnreuther transitioned from studying physics and economics to risk management at Accenture, then Munich RE, where he developed crypto insurance products before joining the AI risk team to create coverage for AI-related risks.
Probabilistic vs Deterministic Systems: Unlike traditional deterministic systems where errors can be traced, AI systems are probabilistic. They can be 99.5% accurate but never 100% certain, creating fundamentally new risks that require insurance coverage.
AI Risk Categories: Two main types exist: traditional machine learning risks (classification errors, as in fraud detection) and generative AI risks (IP infringement, hallucinations, legal compliance issues), each requiring different insurance approaches.
Real-World AI Incidents: Examples include airline chatbots promising unauthorized discounts, lawyers using fake legal cases, and AI house valuation systems losing $300M+ by failing to adjust to market changes during price drops.
Insurance Product Structure: Munich RE offers two main products: one for corporations using AI internally for risk mitigation, and another for AI vendors needing trust-building to scale their business and attract enterprise clients.
Specific Use Cases: Successful implementations include solar panel fault detection (100% accuracy guarantee), credit card fraud prevention (99.9% performance guarantee), and battery health assessment for electric vehicles with compensation guarantees.
Market Challenges: Key difficulties include pricing models with limited historical data, concept drift where AI performance degrades over time, accumulation risk when multiple clients use similar foundation models, and "silent coverage" issues in existing insurance policies.
Future Market Outlook: AI insurance may either become a separate line of business (like cyber insurance) or be integrated into traditional policies, with current focus on US and European markets and strongest traction in IT security applications.

How AI is Transforming Data Analytics and Visualisation in the Enterprise

2025-09-03 | 01:11:08

Chris Parmer (Chief Product Officer & Co-Founder, Plotly) and Domenic Ravita (VP of Marketing, Plotly) discuss the evolution of AI-powered data analytics and how natural language interfaces are democratizing advanced analytics.

Key Topics Discussed
AI's Market Category Convergence: Domenic describes how AI is collapsing traditional boundaries between business intelligence tools (Power BI, Tableau), data science platforms, and AI coding tools, creating a quantum leap similar to the drag-and-drop revolution 20 years ago.
The 30/70 Engineering Reality: Chris reveals that LLMs represent only 30% of AI analytics products, with 70% being sophisticated tooling, error correction loops, and multi-agent systems. Raw LLM output succeeds only one-third of the time without extensive supporting infrastructure.
Code-First AI Architecture: Plotly's approach generates Python code rather than having AI directly process data, creating more rigorous analytics. The system generates 2,000-5,000 lines of code in under two minutes through parallel processing while maintaining 90%+ accuracy.
Natural Language as Universal Equalizer: Discussion of how natural language interfaces eliminate the learning curves of different analytics tools (Salesforce, Tableau, Google Analytics), potentially democratizing data visualization across organizations by providing a common interface.
Vibe Analysis Concept: Introduction of "vibe analysis", the data equivalent of "vibe coding", enabling fluid, rapid data exploration that keeps analysts in flow states through natural language interactions with AI-powered tools.
Transparency and Trust Building: Exploration of building user trust through auto-generated specifications in natural language, transparent logging interfaces, and making underlying code assumptions visible and adjustable to prevent misleading results.
Human-AI Collaboration Balance: Chris emphasizes that while AI accelerates visualization creation and data exploration, human interpretation remains essential for generating insights. The risk lies in systems that attempt to "skip to the finish" with fully automated decision-making.
Infrastructure Misconceptions: Domenic predicts people will wrongly assume AI analytics requires extensive data warehouses and semantic layers, when effective analysis can work with standard databases and file formats, making advanced analytics more accessible than many realize.

Enterprise Data Architecture in The Age of AI - How To Balance Flexibility, Control and Business Value

2025-05-26 | 01:06:46

In this episode, we had the privilege of speaking with Nikhil Srinidhi from Rewire.

Nikhil helps large organizations tackle complex business challenges by building high-performing teams focused on data, AI, and technology. With practical experience in data and software engineering, he drives impactful and lasting change. Before joining Rewire in 2024, Nikhil spent over six years at McKinsey and QuantumBlack, where he led holistic data and AI initiatives, particularly for clients in life sciences and healthcare. Earlier in his career, he worked as a data engineer in Canada, specializing in financial services. Nikhil holds a degree in Electrical Engineering and Economics from McGill University in Montreal, Canada.

Key Principles For Scaling AI In Enterprise: Leadership Lessons With Walid Mehanna

2024-12-10 | 01:03:57

In this episode, we had the privilege of speaking with Walid Mehanna, Chief Data and AI Officer at Merck Group. Walid shares deep insights into how large, complex organizations can scale data and AI and create lasting impact through thoughtful leadership.

As Chief Data & AI Officer of Merck Group, Walid led the Merck Data & AI Organization, delivering strategy, value, architecture, governance, engineering, and operations across the whole company globally. Hand in hand with Merck’s business sectors and their data offices, the organization harnessed the power of Data & AI. Walid is glad to be part of Merck as another curious mind dedicated to human progress.

Maximising the Impact of Your Data & AI Consulting Projects

2024-11-25 | 46:47

In our latest episode of the Data Science Conversations Podcast, we spoke with Christoph Sporleder, Managing Partner at Rewire, about the evolving role of consulting in the data and AI space.

This conversation is a must-listen for anyone dealing with the challenges of integrating AI into business processes or considering an AI project with an external consulting firm. Christoph draws from decades of experience, offering practical advice and actionable insights for organizations and practitioners alike.

Key Topics Discussed
1. Evolution of Data and Cloud Computing: The shift from local computing to cloud technologies, enabling broader data integration and advanced analytics, with the rise of IoT and machine data.
2. Data Management Challenges: Discussion on the evolution from data warehouses to data lakes and the emerging concept of data mesh for better governance and scalability.
3. Importance of Strategy in AI: Why a clear strategy is crucial for AI adoption, including aligning organizational leadership and identifying impactful use cases.
4. Sectoral Adoption of Data and AI: Differences in adoption across sectors, with early adopters in finance and insurance versus later adoption in manufacturing and infrastructure.
5. Consulting Models and Engagement: Insights into consulting engagement types, including strategy consulting, system integration, and body leasing, and their respective challenges and benefits.
6. Challenges in AI Implementation: Common pitfalls in AI projects, such as misalignment with business goals, inadequate infrastructure planning, and siloed lighthouse initiatives.
7. Leadership’s Role in AI Success: The critical need for senior leadership commitment to drive AI adoption, ensure process integration, and manage organizational change.
8. Effective Collaboration with Consultants: Best practices for successful partnerships with consultants, including aligning on objectives, managing personnel transitions, and setting clear engagement expectations.
9. Future Trends in Data and AI: Emerging trends like componentized AI architectures, Gen AI integration, and the growing focus on embedding AI within business processes.
10. Tips for Managing Long-Term Projects: Strategies for handling staff rotations and maintaining project continuity in consulting engagements, emphasizing planning and communication.

KP Reddy: How AI is Reshaping Startup Dynamics and VC Strategies

2024-09-24 | 01:01:53

KP Reddy, founder and managing partner of Shadow Ventures, explains how AI is set to redefine the startup landscape and the venture capital model. KP shares his unique perspective on the rapidly evolving role of AI in entrepreneurship, offering insights into:
- The still-limited adoption of GenAI in large companies
- How AI is empowering leaner, more efficient startups
- The potential for AI to disrupt traditional venture capital strategies
- The emergence of new business models driven by AI capabilities
- Real-world applications of AI in industries like construction, life sciences, and professional services

The Evolution of GenAI: From GANs to Multi-Agent Systems

2024-08-29 | 43:27

Early Interest in Generative AI: Martin's initial exposure to Generative AI in 2016 through a conference talk in Milano, Italy, and his early work with Generative Adversarial Networks (GANs).
Development of GANs and Early Language Models since 2016: The evolution of Generative AI from visual content generation to text generation with models like Google's Bard and the increasing popularity of GANs in 2018.
Launch of GenerativeAI.net and Online Course: Martin's creation of GenerativeAI.net and an online course, which gained traction after being promoted on platforms like Reddit and Hacker News.
Defining Generative AI: Martin’s explanation of Generative AI as a technology focused on generating content, contrasting it with Discriminative AI, which focuses on classification and selection.
Evolution of GenAI Technologies: The shift from LSTM models to Transformer models, highlighting key developments like the "Attention Is All You Need" paper and the impact of Transformer architecture on language models.
Impact of Computing Power on GenAI: The role of increasing computing power and larger datasets in improving the capabilities of Generative AI.
Generative AI in Business Applications: Martin’s insights into the real-world applications of GenAI, including customer service automation, marketing, and software development.
Retrieval Augmented Generation (RAG) Architecture: The use of RAG architecture in enterprise AI applications, where documents are chunked and queried to provide accurate and relevant responses using large language models.
Technological Drivers of GenAI: The advancements in chip design, including Nvidia’s focus on GPU improvements and the emergence of new processing unit architectures like the LPU.
Small vs. Large Language Models: A comparison between small and large language models, discussing their relative efficiency, cost, and performance, especially in specific use cases.
Challenges in Implementing GenAI Systems: Common challenges faced in deploying GenAI systems, including the costs associated with training and fine-tuning large language models and the importance of clean data.
Measuring GenAI Performance: Martin’s explanation of the complexities in measuring the performance of GenAI systems, including the use of the Hallucination Leaderboard for evaluating language models.
Emerging Trends in GenAI: Discussion of future trends such as the rise of multi-agent frameworks, the potential for AI-driven humanoid robots, and the path towards Artificial General Intelligence (AGI).
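
The "chunked and queried" step of a RAG pipeline mentioned above can be sketched in a few lines. This is a toy illustration, not the architecture described in the episode: production systems typically use learned embeddings and a vector store, whereas this sketch swaps in plain word-count vectors with cosine similarity. The sample documents and the helper names `chunk_words`, `vectorize`, `cosine`, and `retrieve` are hypothetical.

```python
import math
from collections import Counter

def chunk_words(text, size=8):
    """Split a document into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def vectorize(text):
    """Bag-of-words vector; a stand-in for a learned embedding."""
    return Counter(w.strip(".,?!").lower() for w in text.split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:k]

# Hypothetical documents, invented for this example.
documents = [
    "Refunds are issued within 14 days of a return request. "
    "Contact support to start a return.",
    "Shipping takes 3 to 5 business days. "
    "Express shipping is available at checkout.",
]
chunks = [c for doc in documents for c in chunk_words(doc)]

question = "How many days until refunds are issued?"
top = retrieve(question, chunks, k=1)

# The retrieved chunk is placed in the prompt that would go to a language model.
prompt = f"Answer using only this context:\n{top[0]}\n\nQuestion: {question}"
print(prompt)
```

The design point is that the language model only sees the retrieved chunks, so chunk size and retrieval quality largely determine how relevant the final answer can be.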

Future AI Trends: Strategy, Hardware and AI Security at Intel

2024-07-24 | 01:02:33

In this episode, we sit down with Steve Orrin, Federal Chief Technology Officer at Intel Corporation. Steve shares his extensive experience and insights on the transformative power of AI and its parallels with past technological revolutions. He discusses Intel’s pioneering role in enabling these shifts through innovations in microprocessors, wireless connectivity, and more.

Steve highlights the pervasive role of AI in various industries and everyday technology, emphasizing the importance of a heterogeneous computing architecture to support diverse AI environments. He talks about the challenges of operationalizing AI, ensuring real-world reliability, and the critical need for robust AI security. Confidential computing emerges as a key solution for protecting AI workloads across different platforms.

The episode also explores Intel’s strategic tools like oneAPI and OpenVINO, which streamline AI development and deployment. This episode is a must-listen for anyone interested in the evolving landscape of AI and its real-world applications.

Intel's Legacy and Technological Revolutions
- Historical parallels between past tech revolutions (PC era, internet era) and the current AI era.
- Intel's contributions to major technological shifts, including the development of wireless technology, USB, and cloud computing.

AI's Current and Future Landscape
- AI's pervasive role in everyday technology and various industries.
- Importance of computing hardware in facilitating AI advancements.
- AI's integration across different environments: cloud, network, edge, and personal devices.

Intel's Approach to AI
- Focus on heterogeneous computing architectures for diverse AI needs.
- Development of software tools like oneAPI and OpenVINO to enable cross-platform AI development.

Challenges and Solutions in AI Deployment
- Scaling AI from lab experiments to real-world applications.
- Ensuring AI security and trustworthiness through transparency and lifecycle management.
- Addressing biases in AI datasets and continuous monitoring for maintaining AI integrity.

AI Security Concerns
- Protection of AI models and data through hardware security measures like confidential computing.
- Importance of data privacy and regulatory compliance in AI deployments.
- Emerging threats such as AI model poisoning, prompt injection attacks, and adversarial attacks.

Innovations in AI Hardware and Software
- Confidential computing as a critical technology for securing AI.
- Research into using AI for chip layout optimization and process improvements in various industries.
- Future trends in AI applications, including generative AI for fault detection and process optimization.

Collaboration and Standards in AI Security
- Intel's involvement in developing industry standards and collaborating with competitors and other stakeholders.
- The role of industry forums and standards bodies like NIST in advancing AI security.

Advice for Aspiring AI Security Professionals
- Importance of hands-on experience with AI technologies.
- Networking and collaboration with peers and industry experts.
- Staying informed through industry news, conferences, and educational resources.

Exciting Developments in AI
- Fusion of multiple AI applications for complex problem-solving.
- Advancements in AI hardware, such as AI PCs and edge devices.
- Potential transformative impacts of AI on everyday life and business operations.

Enhancing GenAI with Knowledge Graphs: A Deep Dive with Kirk Marple

2024-06-06 | 44:46

In this episode we talk to Kirk Marple about the power of Knowledge Graphs when combined with GenAI models. Kirk explains the growing relevance of knowledge graphs in the AI era, their practical applications, their integration with LLMs, and the future potential of Graph RAG.

A veteran of Microsoft and General Motors, Kirk has spent the last 30 years in software development and data leadership roles. He also successfully exited the first startup he founded, RadiantGrid, which was acquired by Wohler Technologies.

Now, as the technical founder and CEO of Graphlit, Kirk and his team are streamlining the development of vertical AI apps with their end-to-end, cloud-based offering that ingests unstructured data and leverages retrieval augmented generation to improve accuracy, domain specificity, adaptability, and context understanding, all while expediting development.

Episode Summary

Introduction to Knowledge Graphs:
- Knowledge graphs extract relationships between entities like people, places, and things, facilitating efficient information retrieval.
- They represent intricate interactions and interrelationships, enabling users to "walk the graph" and uncover deeper insights.

Importance in the AI Era:
- Knowledge graphs enhance data retrieval and filtering, crucial for feeding accurate data into large language models (LLMs) and multimodal models.
- They provide an additional axis for information retrieval, complementing vector search.

Industry Use Cases:
- Commonly used in customer data platforms and CRM models to map relationships within and between companies.
- Knowledge graphs can convert complex datasets into structured, easily queryable formats.

Challenges and Limitations:
- Familiarity with graph databases and the ETL process for graph data integration is still developing.
- Graph structures are less common and more complex than traditional relational models.

Integrating Knowledge Graphs with LLMs:
- Knowledge graphs enrich data integration and semantic understanding, adding context to text retrieved by LLMs.
- They can help reduce hallucinations in LLMs by grounding responses with more accurate and comprehensive context.

Graph RAG (Retrieval Augmented Generation):
- Combines knowledge graphs with RAG to provide additional context for LLM-generated responses.
- Allows retrieval of data not directly cited in the text, enhancing the breadth of information available for queries.

Scalability and Efficiency:
- Effective graph database architectures can handle large-scale graph data efficiently.
- Graph RAG requires a robust ingestion pipeline and careful management of data freshness and retrieval processes.

Future Developments:
- Growing interest and implementation of knowledge graphs and Graph RAG in various industries.
- Potential for new tools and standardization efforts to make these technologies more accessible and effective.

Graphlit: Simplifying Knowledge Graphs:
- The platform focuses on simplifying the creation and use of knowledge graphs for developers.
- Provides APIs for easy integration, supporting domain-specific vertical AI applications.
- Offers a unified pipeline for data ingestion, extraction, and knowledge graph construction.

Open Source and Community Contributions:
- Recommendations for...

Using Open Source LLMs in Language for Grammatical Error Correction (GEC)

2024-03-04 | 50:27

At LanguageTool, Bartmoss St Clair (Head of AI) is pioneering the use of Large Language Models (LLMs) for grammatical error correction (GEC), moving away from the tool's initial non-AI approach to create a system capable of catching and correcting errors across multiple languages.

LanguageTool supports over 30 languages, has several million users and over 4 million installations of its browser add-on, and benefits from a diverse team of employees from around the world.

Episode Summary
- LanguageTool decided against using existing LLMs like GPT-3 or GPT-4, citing the cost, speed, and accuracy benefits of developing its own models and aiming for a balance between performance, speed, and cost.
- The tool is designed to work with low latency for real-time applications, catering to a wide range of users including academics and businesses, with the aim of correcting grammar accurately without being intrusive.
- Bartmoss discusses the nuanced approach to grammar correction, acknowledging that language evolves and user preferences may vary, necessitating a balance between strict grammatical rules and user acceptability.
- The company employs a mix of decoder and encoder-decoder models depending on the task, with a focus on contextual understanding and the challenges of maintaining the original meaning of text while correcting grammar.
- A hybrid system that combines rule-based algorithms with machine learning is used to provide nuanced grammar corrections and explanations for the corrections, enhancing user understanding and trust.
- LanguageTool is developing a generalized GEC system, incorporating legacy rules and machine learning for comprehensive error correction across various types of text.
- Model training involves a mix of user data, expert-annotated data, and synthetic data, aiming to reflect real user error patterns for effective correction.
- The company has built tools to benchmark GEC tasks, focusing on precision, recall, and user feedback to guide quality improvements.
- The introduction of LLMs has expanded LanguageTool's capabilities, including rewriting and rephrasing, and improved error detection beyond simple grammatical rules.
- Despite the higher costs associated with LLMs and hosting infrastructure, the investment is seen as worthwhile for improving user experience and conversion rates for premium products.
- Bartmoss speculates on the future impact of LLMs on language evolution, noting their current influence and the importance of adapting to changes in language use over time.
- LanguageTool prioritizes privacy and data security, avoiding external APIs for grammatical error correction and developing its systems in-house with open-source models.
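
Precision and recall for a GEC system, as mentioned in the summary, can be computed by comparing the edits a system proposes against edits annotated by experts. The sketch below is an assumption-laden illustration and not LanguageTool's benchmark tooling: the edit representation `(start, end, replacement)`, the function name `gec_scores`, and the sample edits are invented here, and the F0.5 score (which weights precision above recall) is added as a choice that is common in GEC evaluation rather than something stated in the episode.

```python
def gec_scores(proposed, gold, beta=0.5):
    """Precision, recall, and F-beta over sets of (start, end, replacement) edits.

    An edit counts as correct only if span and replacement both match exactly.
    """
    true_positives = len(proposed & gold)
    precision = true_positives / len(proposed) if proposed else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta

# Hypothetical token-level edits for one sentence.
gold = {(0, 1, "She"), (2, 3, "goes"), (5, 6, "the")}   # expert annotations
proposed = {(0, 1, "She"), (2, 3, "going")}             # system output

precision, recall, f_half = gec_scores(proposed, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F0.5={f_half:.2f}")
```

Here one of the two proposed edits is right (precision 0.5) and one of the three needed edits is found (recall about 0.33). Favoring precision reflects the point made above about not being intrusive: a wrong suggestion tends to cost more user trust than a missed one.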

The Path to Responsible AI with Julia Stoyanovich of NYU

2024-01-29 | 48:09

In this enlightening episode, Dr. Julia Stoyanovich delves into the world of responsible AI, exploring the ethical, societal, and technological implications of AI systems. She underscores the importance of global regulations, human-centric decision-making, and the proactive management of biases and risks associated with AI deployment. Through her expert lens, Dr. Stoyanovich advocates for a future where AI is not only innovative but also equitable, transparent, and aligned with human values.

Julia is an Institute Associate Professor at NYU in both the Tandon School of Engineering and the Center for Data Science. In addition, she is Director of the Center for Responsible AI, also at NYU. Her research focuses on responsible data management, fairness, diversity, transparency, and data protection in all stages of the data science lifecycle.

Episode Summary
- The definition of Responsible AI
- Example of ethical AI in the medical world: Fast MRI technology
- Fairness and diversity in AI
- The role of regulation: what it can and can’t do
- Transparency, bias in AI models, and data protection
- The dangers of Gen AI hype and problematic AI narratives from the tech industry
- The importance of humans in ensuring ethical development
- Why “Responsible AI” is actually a bit of a misleading term
- What Data & AI leaders can do to practise Responsible AI

Transforming Freight Logistics with AI and Machine Learning

2023-12-08 | 01:01:43

Luis Moreira-Matias is Senior Director of Artificial Intelligence at sennder, Europe’s leading digital freight forwarder. At sennder, Luis founded sennAI, the organization that oversees the creation (from R&D to real-world productization) of proprietary AI technology for the road logistics industry.

Over his 15-year career, Luis has led 50+ FTEs across 4+ organisations to develop award-winning ML solutions to real-world problems in fields such as e-commerce, travel, logistics, and finance. Luis holds a Ph.D. in Machine Learning from the U. Porto, Portugal. He has a world-class academic track record, with high-impact publications at top-tier venues in ML/AI fundamentals, 5 patents, and multiple keynotes worldwide, ranging from Brisbane (Australia) to Las Palmas (Spain).

The future of LLMs, ELMs and the semantic layer

2023-11-01 | 34:50

In this episode Tarush Aggarwal, formerly of Salesforce and WeWork, is back on the podcast to discuss the evolution of the semantic layer and how it can help practitioners get results from LLMs. We also discuss how smaller ELMs (expert language models) might be the future when it comes to consistent, reliable outputs from Generative AI, and the impact of all of this on traditional BI tools.

Data Strategy Evolved: How the Biological Model fuels enterprise data performance

2023-05-09 | 56:37

In this episode Patrick McQuillan shares his innovative Biological Model, a concept you can use to enhance data outcomes in large enterprises. The concept takes the idea that the best way to design a data strategy is to align it closely with a biological system.

He discusses the power of centralized information, the importance of data governance, and the necessity for a common performance narrative across an organization.

Episode Summary
- Biological Model Concept
- Centralized vs. Decentralized Data
- Data Collection and Maturity
- Horizontal translation layer
- Partnership with vertical leaders
- Curated data layers
- Data dictionary for consistency
- Focusing on vital metrics
- Data Flow in Organizations
- Biological Model Governance
- Overcoming Inconsistency and Inaccuracy

Mapping forests: Verifying carbon offsetting with machine learning

2023-03-14 | 25:08

In this episode Heidi Hurst returns to talk to us about how, in her current role at Pachama, she is using the power of machine learning to fight climate change. She discusses her work in measuring the capacity of existing forests and reforestation projects using satellite imagery.

Episode Summary
1. The importance of carbon credits verification in mitigating climate change
2. How Pachama is using machine learning and satellite imagery to verify carbon projects
3. Three types of carbon projects: avoided deforestation, reforestation, and improved forest management
4. Challenges in using satellite imagery to measure the capacity of existing forests
5. The role of multispectral imaging in measuring density of forests
6. Challenges in collecting data from dense rainforests and weather obstructions
7. The impact of machine learning on scaling up carbon verification
8. Advancements in the field of satellite imaging, particularly in small satellite constellations

How Science is (mis)communicated in Online Media

2023-02-15 33:56

Ágnes Horvát is an Assistant Professor in Communication and Computer Science at Northwestern University. Her work focuses on understanding how online networks induce biased information production, sharing and processing across digital platforms.

- The new post-normal era for science: having an awareness of the context and values that impact scientific research.
- Where is science communication in relation to digital platforms? Scholars work hard on discovering scientific findings, but information doesn’t always reach the public appropriately.
- How to communicate scientific research: it’s not just about communicating with scientists and general audiences. News needs to reach policymakers and governments too for real change.
- The production of scientific research has exploded recently thanks to decision-making demands, and the pandemic had a lot to do with this. Scientists were under pressure to carry out research quickly and at the expense of quality.
- Misinformation can have detrimental consequences, even leading to vaccine hesitancy in some communities.
- The surprising effect of retracting papers: papers that are later retracted are more likely to receive more engagement before being withdrawn.
- Why are paper retractions on the rise? Again, the recent pandemic has caused an increase in retractions.
- Is social media helping or hindering science research? While the platforms are helping to spread real news, social media also helps the spread of false information.
- As long as you have quality data and robust trends, you will identify that trend regardless of the method.
- Reducing the problem of miscommunication: with whom does the responsibility lie?

How Observability is Advancing Data Reliability and Data Quality

2022-05-18 43:49

Modern data infrastructures and platforms store huge amounts of multidimensional data. But data pipelines frequently break, and a machine learning algorithm's performance is only as good as the quality and reliability of the data itself. In this episode we are joined by Lior Gavish and Ryan Kearns of Monte Carlo to talk about how the new concept of Data Observability is advancing Data Reliability and Data Quality at scale.

Episode Summary
- An overview of Data Reliability/Quality and why it is so critical for organisations
- The limitations of traditional approaches in the area of Data Reliability
- Data observability and why it is different to traditional approaches to Data Quality
- The 5 Pillars of Data Observability
- How to improve data reliability/quality at scale and generate trust in data with stakeholders
- How observability can lead to better outcomes for Data Science and engineering teams
- Examples of data observability use cases in industry
- Overview of O’Reilly’s upcoming book, The Fundamentals of Data Quality

The Pitfalls of Using AI Systems for Hiring

2022-02-01 44:43

In this episode we are joined by Julia Stoyanovich from NYU to talk about her work on how AI is being used in the hiring process. Whether you are responsible for hiring on behalf of a business or are a job seeker, you will find this podcast very interesting, but for very different reasons.

Episode Summary
- Algorithmic decision making in the hiring process: what does that mean for businesses and job seekers?
- The hiring process: the funnel effect.
- Lack of public disclosure about the use of algorithmic tools as part of the talent acquisition pipeline.
- Are job seekers being unfairly screened out of the hiring process?
- How AI-based implementations of psychometric instruments are used today.
- Is it possible to measure a person’s personality based on data alone?
- Do these systems remove bias and discrimination from the hiring process?
- Testing the stability and consistency of these algorithmic systems.
- Vendors of these systems and their lack of testing and recognition of the issues.
- Are new laws needed so the hiring process is fairer and more transparent?
- What does the future of hiring look like: fewer AI systems and more human intervention?
