DataFramed


Author: DataCamp

Subscribed: 1,786
Played: 10,958

Description

Data science is one of the fastest growing industries and has been called the ‘Sexiest job of the 21st Century’. But what exactly is data science? In this podcast, brought to you by DataCamp, Hugo Bowne-Anderson approaches the question by exploring what problems data science can solve rather than defining what data science is. From automated medical diagnosis and self-driving cars to recommendation systems and climate change, come on a journey with experts from industry and academia to explore the industry that will change the course of the 21st century.
49 Episodes
#48 Managing Data Science Teams
In this episode of DataFramed, the DataCamp podcast, Hugo speaks with Angela Bassa about managing data science teams. Angela is Director of Data Science at iRobot, where she leads the team through development of machine learning algorithms, sentiment analysis, and anomaly detection processes. iRobot makes consumer robots that we all know and love, such as the Roomba and the Braava, which are, respectively, a robotic vacuum cleaner and a robotic mop. Angela will talk about how to get into data science management, the most important strategies to ensure that your data science team delivers value to the organization, how to hire data scientists, and key points to consider as your data science team grows over time. She will also cover the types of trade-offs you need to make as a data science manager and how to make the right ones. Along the way, you’ll see why a former marine biologist has the skills and ways of thinking to be a super data scientist at a company like iRobot, and you’ll also see the importance of throwing data analysis parties.
LINKS FROM THE SHOW
FROM THE INTERVIEW
Angela on Twitter
HBR Newsletters
iRobot Careers
Data Science Internship
FROM THE SEGMENTS
Correcting Data Science Misconceptions (w/ Heather Nolis ~18:45)
Using docker to deploy an R plumber API (By Jonathan Nolis)
Enterprise Web Services with Neural Networks Using R and TensorFlow (By Jonathan Nolis and Heather Nolis)
Project of the Month (w/ David Venturi ~38:45)
Rise and Fall of Programming Languages (R Project by David Robinson)
Learn, Practice, Apply! (By Ramnath Vaidyanathan)
Apply to create a DataCamp project!
Original music and sounds by The Sticks.
#47 Human-centered Design in Data Science
Hugo speaks with Peter Bull about the importance of human-centered design in data science. Peter is a data scientist for social good and co-founder of DrivenData, a company that brings cutting-edge practices in data science and crowdsourcing to some of the world's biggest social challenges and the organizations taking them on, including machine learning competitions for social good. They’ll speak about the practice of considering how humans interact with data and data products, and how important it is to consider them while designing your data projects. They’ll see how human-centered design provides a robust and reproducible framework for involving the end user throughout the data work, illustrated by examples such as DrivenData’s work in financial services and Mobile Money in Tanzania. Along the way, they’ll discuss the role of empathy in data science, the increasingly important conversation around data ethics and much, much more.
LINKS FROM THE SHOW
FROM THE INTERVIEW
Peter on Twitter
DrivenData
Deon (Ethics Checklist)
Cookiecutter Data Science
If you liked this interview, you might be interested in working with DrivenData! Currently, the team is looking for a software engineer who loves the idea of building Python applications for social impact. Apply Here!
FROM THE SEGMENTS
Probability Distributions and their Stories (with Justin Bois at ~24:00)
Justin's Website at Caltech
Probability distributions and their stories (By Justin Bois)
Studies in Interpretability (with Peadar Coyle at ~38:10)
Interpretable ML Symposium
How will the GDPR impact machine learning? (By Andrew Burt)
How to use Bayesian Stats in your daily job (Gates, Perry, Zorn (2002))
Fairness in Machine Learning (By Moritz Hardt)
Original music and sounds by The Sticks.
#46 AI in Healthcare, an Insider's Account
In this episode of DataFramed, a DataCamp podcast, Hugo speaks with Arnaub Chatterjee. Arnaub is a Senior Expert and Associate Partner in the Pharmaceutical and Medical Products group at McKinsey & Company. They’ll discuss cutting through the hype about artificial intelligence (AI) and machine learning (ML) in healthcare by looking at practical applications and how McKinsey & Company is helping the industry evolve.
Tune in for an insider’s account of what has worked in healthcare, from ML models being used to predict nearly everything in clinical settings, to imaging analytics for disease diagnosis, to wound therapeutics. Will robots and AI replace disciplines such as radiology, ophthalmology, and dermatology? How have the moving parts of data science work evolved in healthcare? What does the future of data science, ML and AI in healthcare hold? Stick around to find out.
LINKS FROM THE SHOW
FROM THE INTERVIEW
McKinsey Analytics on Twitter
Hot off the press article for HBR’s Future of Healthcare online forum (By Arnaub Chatterjee)
Our latest piece on the promise & challenge of AI (By James Manyika and Jacques Bughin)
Are robots coming for our jobs? (mckinsey.com)
Analytics Careers page (mckinsey.com)
How we help clients in healthcare analytics (mckinsey.com)
AI analysis of 400+ use cases, including ones in healthcare (By Michael Chui et al. mckinsey.com)
FROM THE SEGMENTS
Machines that Multi-task (with Manny Moss)
Part 1 at ~21:05
Responsible AI in Consumer Enterprise
Hilary Mason, DJ Patil and Mike Loukides on Data Ethics
EthicalOS Toolkit
Part 2 at ~40:00
21 Definitions of Fairness Tutorial from FAT* (Arvind Narayanan)
Kate Crawford's keynote address "The Trouble with Bias" from NIPS 2017
The (im)possibility of Fairness (Sorelle et al. arXiv.org)
Learning from disparate data sources (Li Y et al. PubMed.gov)
Distributed Multi-task Learning (Liyang Xie et al. KDD.org)
The Cost of Fairness in Binary Classification (Aditya Krishna Menon et al. proceedings.mlr.press)
Original music and sounds by The Sticks.
#45 Decision Intelligence and Data Science
In this episode of DataFramed, Hugo speaks with Cassie Kozyrkov, Chief Decision Scientist at Google Cloud. Cassie and Hugo will be talking about data science, decision making and decision intelligence, which Cassie thinks of as data science plus plus, augmented with the social and managerial sciences. They’ll talk about the different and evolving models for how the fruits of data science work can be used to inform robust decision making, along with the pros and cons of the various models for embedding data scientists in organizations relative to the decision function. They’ll tackle head on why so many organizations fail at using data to robustly inform decision making, along with best practices for working with data, such as not verifying your results on the data that inspired your models. As Cassie says, “Split your damn data”.
LINKS FROM THE SHOW
FROM THE INTERVIEW
Cassie on Twitter
Is data science a bubble? (By Cassie Kozyrkov, Hackernoon)
Incompetence, delegation, and population (By Cassie Kozyrkov, Hackernoon)
Populations — You’re doing it wrong (By Cassie Kozyrkov, Hackernoon)
What on earth is data science? (By Cassie Kozyrkov, Hackernoon)
FROM THE SEGMENTS
Probability Distributions and their Stories (with Justin Bois at ~19:45)
Justin's Website at Caltech
Probability distributions and their stories (By Justin Bois)
Machines that Multi-Task (with Friederike Schüür of Fast Forward Labs ~43:45)
Sebastian Ruder’s Overview of Multi-Task Learning in Deep Neural Networks
Multi-Task Learning for NLP, also by Sebastian Ruder
GANs for Fake Celebrity Images (Karras et al, Nvidia)
Adversarial Multi-Task Learning for Text Classification (Liu et al., arXiv.org)
Original music and sounds by The Sticks.
#44 Project Jupyter and Interactive Computing
In this episode of DataFramed, Hugo speaks with Brian Granger, co-founder and co-lead of Project Jupyter, physicist and co-creator of the Altair package for statistical visualization in Python.
They’ll speak about data science, interactive computing, open source software and Project Jupyter. With over 2.5 million public Jupyter notebooks on GitHub alone, Project Jupyter is a force to be reckoned with. What is interactive computing and why is it important for data science work? What are all the moving parts of the Jupyter ecosystem, from notebooks to JupyterLab to JupyterHub and Binder, and why are they so relevant as more and more institutions adopt open source software for interactive computing and data science? From Netflix running around 100,000 Jupyter notebook batch jobs a day to LIGO’s Nobel Prize-winning discovery of gravitational waves publishing all their results reproducibly using notebooks, Project Jupyter is everywhere.
LINKS FROM THE SHOW
FROM THE INTERVIEW
Brian on Twitter
Project Jupyter
Beyond Interactive: Notebook Innovation at Netflix (Ufford, Pacer, Seal, Kelley, Netflix Tech Blog)
Gravitational Wave Open Science Center (Tutorials)
JupyterCon YouTube Playlist
jupyterstream Github Repository
FROM THE SEGMENTS
Machines that Multi-Task (with Friederike Schüür of Fast Forward Labs)
Part 1 at ~24:40
Brief Introduction to Multi-Task Learning (By Friederike Schüür)
Overview of Multi-Task Learning Use Cases (By Manny Moss)
Multi-Task Learning for the Segmentation of Building Footprints (Bischke et al., arXiv.org)
Multi-Task as Question Answering (McCann et al., arXiv.org)
The Salesforce Natural Language Decathlon: A Multitask Challenge for NLP
Part 2 at ~44:00
Rich Caruana’s Awesome Overview of Multi-Task Learning and Why It Works
Sebastian Ruder’s Overview of Multi-Task Learning in Deep Neural Networks
Massively Multi-Task Network for Drug Discovery, 259 Tasks (!) (Ramsundar et al. arXiv.org)
Brief Overview of Multi-Task Learning with Video of Newsie, the Prototype (By Friederike Schüür)
Original music and sounds by The Sticks.
#43 Election Forecasting and Polling
Hugo speaks with Andrew Gelman about statistics, data science, polling, and election forecasting. Andy is a professor of statistics and political science and director of the Applied Statistics Center at Columbia University. This week we’ll be talking about the ins and outs of general polling and election forecasting, the biggest challenges in gauging public opinion, the ever-present challenge of getting representative samples in order to model the world, and the types of corrections statisticians can and do perform.
"Chatting with Andy was an absolute delight and I cannot wait to share it with you!" -Hugo
LINKS FROM THE SHOW
FROM THE INTERVIEW
Andrew's Blog
Andrew on Twitter
We Need to Move Beyond Election-Focused Polling (Gelman and Rothschild, Slate)
We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results (Cohn, The New York Times)
19 things we learned from the 2016 election (Gelman and Azari, Science, 2017)
The best books on How Americans Vote (Gelman, Five Books)
The best books on Statistics (Gelman, Five Books)
Andrew's Research
FROM THE SEGMENTS
Statistical Lesson of the Week (with Emily Robinson at ~13:30)
The five Cs (Loukides, Mason, and Patil, O'Reilly)
Data Science Best Practices (with Ben Skrainka ~40:40)
Oberkampf & Roy’s Verification and Validation in Scientific Computing provides a thorough yet very readable treatment
A comprehensive framework for verification, validation, and uncertainty quantification in scientific computing (Roy and Oberkampf, Science Direct)
Original music and sounds by The Sticks.
#40 Becoming a Data Scientist
Hugo speaks with Renée Teate about the many paths to becoming a data scientist. Renée is a Data Scientist at higher ed analytics start-up HelioCampus, and creator and host of the Becoming a Data Scientist Podcast. In addition to discussing the many possible ways to become a data scientist, they will discuss the common data scientist profiles and how to figure out which ones may be a fit for you. They’ll also dive into the fact that you need to figure out both where you are in terms of skills and knowledge and where you want to go in terms of your career. Renée has a bunch of great suggestions for aspiring data scientists and also flags several important pitfalls and warnings. On top of this, they'll dive into how much statistics, linear algebra and calculus you need to know in order to become an effective data scientist and/or data analyst.
LINKS FROM THE SHOW
FROM THE INTERVIEW
Becoming a Data Scientist (Renée's Blog)
Renée's Twitter
Data Sci Guide (Data Science Learning Directory)
FROM THE SEGMENTS
Statistical Distributions and their Stories (with Justin Bois at ~19:20)
Justin's Website at Caltech
Probability distributions and their stories
Programming Topic of the Week (with Emily Robinson at ~43:20)
Categorical Data in the Tidyverse, a DataCamp Course taught by Emily Robinson.
R for Data Science Book by Hadley Wickham (Factors Chapter)
Inference for Categorical Data, a DataCamp Course taught by Andrew Bray.
stringsAsFactors: An unauthorized biography (Roger Peng, July 24, 2015)
Wrangling categorical data in R (Amelia McNamara & Nicholas J Horton, August 30, 2017)
Original music and sounds by The Sticks.
Comments (3)

Paolo Eusebi

Amazing episode! How many listeners worked with Stan in R? What are their impressions over other bayesian software?

Oct 9th

Rafael Anjos

The contents are very good. Thank you for your good job

Sep 18th

Anthony Giancursio

Ol

Jul 19th

Nuage Laboratoire

Anthony Giancursio text

Oct 10th