DiscoverThe Python Podcast.__init__
The Python Podcast.__init__

The Python Podcast.__init__

Author: Tobias Macey

Subscribed: 4,525Played: 66,032
Share

Description

The weekly podcast about the Python programming language, its ecosystem, and its community. Tune in for engaging, educational, and technical discussions about the broad range of industries, individuals, and applications that rely on Python.
284 Episodes
Reverse
News media is an important source of information for understanding the context of the world. To make it easier to access and process the contents of news sites Lucas Ou-Yang built the Newspaper library that aids in automatic retrieval of articles and prepare it for analysis. In this episode he shares how the project got started, how it is implemented, and how you can get started with it today. He also discusses how recent improvements in the utility and ease of use of deep learning libraries open new possibilities for future iterations of the project.
Data applications are complex and continually evolving, often requiring collaboration across multiple teams. In order to keep everyone on the same page a high level abstraction is needed to facilitate a cross-cutting view of the data orchestration across integration, transformation, analytics, and machine learning. Dagster is an innovative new framework that leans on the power and flexibility of Python to provide an extensible interface to the complete lifecycle of data projects. In this episode Nick Schrock explains how he designed the Dagster project to allow for integration with the entire data ecosystem while providing an opinionated structure for connecting the different stages of computation. He also discusses how he is working to grow an open ecosystem around the Dagster project, and his thoughts on building a sustainable business on top of it without compromising the integrity of the community. This was a great conversation about playing the long game when building a business while providing a valuable utility to a complex problem domain.
The internet is a rich source of information, but a majority of it isn't accessible programmatically through APIs or databases. To address that shortcoming there are a variety of web scraping frameworks that aid in extracting structured data from web pages. In this episode Attila Tóth shares the challenges of web data extraction, the ways that you can use it, and how Scrapy and ScrapingHub can help you with your projects.
A large portion of the software industry has standardized on Git as the version control sytem of choice. But have you thought about all of the information that you are generating with your branches, commits, and code changes? Davide Spadini created the PyDriller framework to simplify the work of mining software repositories to perform research on the technical and social aspects of software engineering. In this episode he shares some of the insights that you can gain by exploring the history of your code, the complexities of building a framework to interact with Git, and some of the interesting ways that PyDriller can be used to inform your own development practices.
The Musicbrainz project was an early entry in the movement to build an open data ecosystem. In recent years, the Metabrainz Foundation has fostered a growing ecosystem of projects to support the contribution of, and access to, metadata, listening habits, and review of music. The majority of those projects are written in Python, and in this episode Param Singh explains how they are built, how they fit together, and how they support the goals of the Metabrains Foundation. This was an interesting exporation of the work involved in building an ecosystem of open data, the challenges of making it sustainable, and the benefits of building for the long term rather than trying to achieve a quick win.
Python is a leading choice for data science due to the immense number of libraries and frameworks readily available to support it, but it is still difficult to scale. Dask is a framework designed to transparently run your data analysis across multiple CPU cores and multiple servers. Using Dask lifts a limitation for scaling your analytical workloads, but brings with it the complexity of server administration, deployment, and security. In this episode Matthew Rocklin and Hugo Bowne-Anderson discuss their recently formed company Coiled and how they are working to make use and maintenance of Dask in production. The share the goals for the business, their approach to building a profitable company based on open source, and the difficulties they face while growing a new team during a global pandemic.
Netflix uses machine learning to power every aspect of their business. To do this effectively they have had to build extensive expertise and tooling to support their engineers. In this episode Savin Goyal discusses the work that he and his team are doing on the open source machine learning operations platform Metaflow. He shares the inspiration for building an opinionated framework for the full lifecycle of machine learning projects, how it is implemented, and how they have designed it to be extensible to allow for easy adoption by users inside and outside of Netflix. This was a great conversation about the challenges of building machine learning projects and the work being done to make it more achievable.
One of the best methods for learning programming is to just build a project and see how things work first-hand. With that in mind, Ken Youens-Clark wrote a whole book of Tiny Python Projects that you can use to get started on your journey. In this episode he shares his inspiration for the book, his thoughts on the benefits of teaching testing principles and the use of linting and formatting tools, as well as the benefits of trying variations on a working program to see how it behaves. This was a great conversation about useful strategies for supporting new programmers in their efforts to learn a valuable skill.
Python is an intuitive and flexible language, but that versatility can also lead to problematic designs if you're not careful. Nikita Sobolev is the CTO of Wemake Services where he works on open source projects that encourage clean coding practices and maintainable architectures. In this episode he discusses his work on the DRY Python set of libraries and how they provide an accessible interface to functional programming patterns while maintaining an idiomatic Python interface. He also shares the story behind the wemake Python styleguide plugin for Flake8 and the benefits of strict linting rules to engender good development habits. This was a great conversation about useful practices to build software that will be easy and fun to work on.
Barry Warsaw has been a member of the Python community since the very beginning. His contributions to the growth of the language and its ecosystem are innumerable and diverse, earning him the title of Friendly Language Uncle For Life. In this episode he reminisces on his experiences as a core developer, a member of the Python Steering Committee, and his roles at Canonical and LinkedIn supporting the use of Python at those companies. In order to know where you are going it is always important to understand where you have been and this was a great conversation to get a sense of the history of how Python has gotten to where it is today.
Building and managing servers is a challenging task. Configuration management tools provide a framework for handling the various tasks involved, but many of them require learning a specific syntax and toolchain. PyInfra is a configuration management framework that embraces the familiarity of Pure Python, allowing you to build your own integrations easily and package it all up using the same tools that you rely on for your applications. In this episode Nick Barrett explains why he built it, how it is implemented, and the ways that you can start using it today. He also shares his vision for the future of the project and you can get involved. If you are tired of writing mountains of YAML to set up your servers then give PyInfra a try today.
Programming languages are a powerful tool and can be used to create all manner of applications, however sometimes their syntax is more cumbersome than necessary. For some industries or subject areas there is already an agreed upon set of concepts that can be used to express your logic. For those cases you can create a Domain Specific Language, or DSL to make it easier to write programs that can express the necessary logic with a custom syntax. In this episode Igor Dejanović shares his work on textX and how you can use it to build your own DSLs with Python. He explains his motivations for creating it, how it compares to other tools in the Python ecosystem for building parsers, and how you can use it to build your own custom languages.
Once you release an application into production it can be difficult to understand all of the ways that it is interacting with the systems that it integrates with. The OpenTracing project and its accompanying ecosystem of technologies aims to make observability of your systems more accessible. In this episode Austin Parker and Alex Boten explain how the correlation of tracing and metrics collection improves visibility of how your software is behaving, how you can use the Python SDK to automatically instrument your applications, and their vision for the future of observability as the OpenTelemetry standard gains broader adoption.
Our thought patterns are rarely linear or hierarchical, instead following threads of related topics in unpredictable directions. Topic modeling is an approach to knowledge management which allows for forming a graph of associations to make capturing and organizing your thoughts more natural. In this episode Brett Kromkamp shares his work on the Contextualize project and how you can use it for building your own topic models. He explains why he wrote a new topic modeling engine, how it is architected, and how it compares to other systems for organizing information. Once you are done listening you can take Contextualize for a test run for free with his hosted instance.
You spend a lot of time and energy on building a great application, but do you know how it's actually being used? Using a product analytics tool lets you gain visibility into what your users find helpful so that you can prioritize feature development and optimize customer experience. In this episode PostHog CTO Tim Glaser shares his experience building an open source product analytics platform to make it easier and more accessible to understand your product. He shares the story of how and why PostHog was created, how to incorporate it into your projects, the benefits of providing it as open source, and how it is implemented. If you are tired of fighting with your user analytics tools, or unwilling to entrust your data to a third party, then have a listen and then test out PostHog for yourself.
The divide between Python 2 and 3 lasted a long time, and in recent years all of the new features were added to version 3. To help bridge the gap and extend the viability of version 2 Naftali Harris created Tauthon, a fork of Python 2 that backports features from Python 3. In this episode he explains his motivation for creating it, the process of maintaining it and backporting features, and the ways that it is being used by developers who are unable to make the leap. This was an interesting look at how things might have been if the elusive Python 2.8 had been created as a more gentle transition.
Dependency management in Python has taken a long and winding path, which has led to the current dominance of Pip. One of the remaining shortcomings is the lack of a robust mechanism for resolving the package and version constraints that are necessary to produce a working system. Thankfully, the Python Software Foundation has funded an effort to upgrade the dependency resolution algorithm and user experience of Pip. In this episode the engineers working on these improvements, Pradyun Gedam, Tzu-Ping Chung, and Paul Moore, discuss the history of Pip, the challenges of dependency management in Python, and the benefits that surrounding projects will gain from a more robust resolution algorithm. This is an exciting development for the Python ecosystem, so listen now and then provide feedback on how the new resolver is working for you.
One of the most common causes of bugs is incorrect data being passed throughout your program. Pydantic is a library that provides runtime checking and validation of the information that you rely on in your code. In this episode Samuel Colvin explains why he created it, the interesting and useful ways that it can be used, and how to integrate it into your own projects. If you are tired of unhelpful errors due to bad data then listen now and try it out today.
More of us are working remotely than ever before, many with no prior experience with a remote work environment. In this episode Quinn Slack discusses his thoughts and experience of running Sourcegraph as a fully distributed company. He covers the lessons that he has learned in moving from partially to fully remote, the practices that have worked well in managing a distributed workforce, and the challenges that he has faced in the process. If you are struggling with your remote work situation then this conversation has some useful tips and references for further reading to help you be successful in the current environment.
After you write your application, you need a way to make it available to your users. These days, that usually means deploying it to a cloud provider, whether that's a virtual server, a serverless platform, or a Kubernetes cluster. To manage the increasingly dynamic and flexible options for running software in production, we have turned to building infrastructure as code. Pulumi is an open source framework that lets you use your favorite language to build scalable and maintainable systems out of cloud infrastructure. In this episode Luke Hoban, CTO of Pulumi, explains how it differs from other frameworks for interacting with infrastructure platforms, the benefits of using a full programming language for treating infrastructure as code, and how you can get started with it today. If you are getting frustrated with switching contexts when working between the application you are building and the systems that it runs on, then listen now and then give Pulumi a try.
loading
Comments (3)

Antonio Andrade

terrible audio this time

Jan 14th
Reply

Nihan Dip

this Masonite dude is so full of himself 😂

Sep 21st
Reply

Antonio Andrade

Tobias, are you a robot? nice postcast

May 27th
Reply
loading
Download from Google Play
Download from App Store