DiscoverThe Python Podcast.__init__
The Python Podcast.__init__

The Python Podcast.__init__

Author: Tobias Macey

Subscribed: 4,576Played: 69,657
Share

Description

The weekly podcast about the Python programming language, its ecosystem, and its community. Tune in for engaging, educational, and technical discussions about the broad range of industries, individuals, and applications that rely on Python.
294 Episodes
Reverse
In a software project writing code is just one step of the overall lifecycle. There are many repetitive steps such as linting, running tests, and packaging that need to be run for each project that you maintain. In order to reduce the overhead of these repeat tasks, and to simplify the process of integrating code across multiple systems the use of monorepos has been growing in popularity. The Pants build tool is purpose built for addressing all of the drudgery and for working with monorepos of all sizes. In this episode core maintainers Eric Arellano and Stu Hood explain how the Pants project works, the benefits of automatic dependency inference, and how you can start using it in your own projects today. They also share useful tips for how to organize your projects, and how the plugin oriented architecture adds flexibility for you to customize Pants to your specific needs.
Building a machine learning model is a process that requires well curated and cleaned data and a lot of experimentation. Doing it repeatably and at scale with a team requires a way to share your discoveries with your teammates. This has led to a new set of operational ML platforms. In this episode Michael Del Balso shares the lessons that he learned from building the platform at Uber for putting machine learning into production. He also explains how the feature store is becoming the core abstraction for data teams to collaborate on building machine learning models. If you are struggling to get your models into production, or scale your data science throughput, then this interview is worth a listen.
The CPython implementation has grown and evolved significantly over the past ~25 years. In that time there have been many other projects to create compatible runtimes for your Python code. One of the challenges for these other projects is the lack of a fully documented specification of how and why everything works the way that it does. In the most recent Python language summit Mark Shannon proposed implementing a formal specification for CPython, and in this episode he shares his reasoning for why that would be helpful and what is involved in making it a reality.
Artificial intelligence applications can provide dramatic benefits to a business, but only if you can bring them from idea to production. Henrik Landgren was behind the original efforts at Spotify to leverage data for new product features, and in his current role he works on an AI system to evaluate new businesses to invest in. In this episode he shares advice on how to identify opportunities for leveraging AI to improve your business, the capabilities necessary to enable aa successful project, and some of the pitfalls to watch out for. If you are curious about how to get started with AI, or what to consider as you build a project, then this is definitely worth a listen.
Python and Java are two of the most popular programming languages in the world, and have both been around for over 20 years. In that time there have been numerous attempts to provide interoperability between them, with varying methods and levels of success. One such project is JPype, which allows you to use Java classes in your Python code. In this episode the current lead developer, Karl Nelson, explains why he chose it as his preferred tool for combining these ecosystems, how he and his team are using it, and when and how you might want to use it for your own projects. He also discusses the work he has done to enable use of JPype on Android, and what is in store for the future of the project. If you have ever wanted to use a library or module from Java, but the rest of your project is already in Python, then this episode is definitely worth a listen.
The release of Python 3.9 introduced a new parser that paves the way for brand new features. Every programming language has its own specific syntax for representing the logic that you are trying to express. The way that the rules of the language are defined and validated is with a grammar definition, which in turn is processed by a parser. The parser that the Python language has relied on for the past 25 years has begun to show its age through mounting technical debt and a lack of flexibility in defining new syntax. In this episode Pablo Galindo and Lysandros Nikolaou explain how, together with Python's creator Guido van Rossum, they replaced the original parser implementation with one that is more flexible and maintainable, why now was the time to make the change, and how it will influence the future evolution of the language.
The way that applications are being built and delivered has changed dramatically in recent years with the growing trend toward cloud native software. As part of this movement toward the infrastructure and orchestration that powers your project being defined in software, a new approach to operations is gaining prominence. Commonly called GitOps, the main principle is that all of your automation code lives in version control and is executed automatically as changes are merged. In this episode Victor Farcic shares details on how that workflow brings together developers and operations engineers, the challenges that it poses, and how it influences the architecture of your software. This was an interesting look at an emerging pattern in the development and release cycle of modern applications.
Learning to code is a neverending journey, which is why it's important to find a way to stay motivated. A common refrain is to just find a project that you're interested in building and use that goal to keep you on track. The problem with that advice is that as a new programmer, you don't have the knowledge required to know which projects are reasonable, which are difficult, and which are effectively impossible. Steven Lott has been sharing his programming expertise as a consultant, author, and trainer for years. In this episode he shares his insights on how to help readers, students, and colleagues interested enough to learn the fundamentals without losing sight of the long term gains. He also uses his own difficulties in learning to maintain, repair, and captain his sailboat as relatable examples of the learning process and how the lessons he has learned can be translated to the process of learning a new technology or skill. This was a great conversation about the various aspects of how to learn, how to stay motivated, and how to help newcomers bridge the gap between what they want to create and what is within their grasp.
Python is a powerful and expressive programming language with a vast ecosystem of incredible applications. Unfortunately, it has always been challenging to share those applications with non-technical end users. Gregory Szorc set out to solve the problem of how to put your code on someone else's computer and have it run without having to rely on extra systems such as virtualenvs or Docker. In this episode he shares his work on PyOxidizer and how it allows you to build a self-contained Python runtime along with statically linked dependencies and the software that you want to run. He also digs into some of the edge cases in the Python language and its ecosystem that make this a challenging problem to solve, and some of the lessons that he has learned in the process. PyOxidizer is an exciting step forward in the evolution of packaging and distribution for the Python language and community.
Servers and services that have any exposure to the public internet are under a constant barrage of attacks. Network security engineers are tasked with discovering and addressing any potential breaches to their systems, which is a never-ending task as attackers continually evolve their tactics. In order to gain better visibility into complex exploits Colin O'Brien built the Grapl platform, using graph database technology to more easily discover relationships between activities within and across servers. In this episode he shares his motivations for creating a new system to discover potential security breaches, how its design simplifies the work of identifying complex attacks without relying on brittle rules, and how you can start using it to monitor your own systems today.
News media is an important source of information for understanding the context of the world. To make it easier to access and process the contents of news sites Lucas Ou-Yang built the Newspaper library that aids in automatic retrieval of articles and prepare it for analysis. In this episode he shares how the project got started, how it is implemented, and how you can get started with it today. He also discusses how recent improvements in the utility and ease of use of deep learning libraries open new possibilities for future iterations of the project.
Data applications are complex and continually evolving, often requiring collaboration across multiple teams. In order to keep everyone on the same page a high level abstraction is needed to facilitate a cross-cutting view of the data orchestration across integration, transformation, analytics, and machine learning. Dagster is an innovative new framework that leans on the power and flexibility of Python to provide an extensible interface to the complete lifecycle of data projects. In this episode Nick Schrock explains how he designed the Dagster project to allow for integration with the entire data ecosystem while providing an opinionated structure for connecting the different stages of computation. He also discusses how he is working to grow an open ecosystem around the Dagster project, and his thoughts on building a sustainable business on top of it without compromising the integrity of the community. This was a great conversation about playing the long game when building a business while providing a valuable utility to a complex problem domain.
The internet is a rich source of information, but a majority of it isn't accessible programmatically through APIs or databases. To address that shortcoming there are a variety of web scraping frameworks that aid in extracting structured data from web pages. In this episode Attila Tóth shares the challenges of web data extraction, the ways that you can use it, and how Scrapy and ScrapingHub can help you with your projects.
A large portion of the software industry has standardized on Git as the version control sytem of choice. But have you thought about all of the information that you are generating with your branches, commits, and code changes? Davide Spadini created the PyDriller framework to simplify the work of mining software repositories to perform research on the technical and social aspects of software engineering. In this episode he shares some of the insights that you can gain by exploring the history of your code, the complexities of building a framework to interact with Git, and some of the interesting ways that PyDriller can be used to inform your own development practices.
The Musicbrainz project was an early entry in the movement to build an open data ecosystem. In recent years, the Metabrainz Foundation has fostered a growing ecosystem of projects to support the contribution of, and access to, metadata, listening habits, and review of music. The majority of those projects are written in Python, and in this episode Param Singh explains how they are built, how they fit together, and how they support the goals of the Metabrains Foundation. This was an interesting exporation of the work involved in building an ecosystem of open data, the challenges of making it sustainable, and the benefits of building for the long term rather than trying to achieve a quick win.
Python is a leading choice for data science due to the immense number of libraries and frameworks readily available to support it, but it is still difficult to scale. Dask is a framework designed to transparently run your data analysis across multiple CPU cores and multiple servers. Using Dask lifts a limitation for scaling your analytical workloads, but brings with it the complexity of server administration, deployment, and security. In this episode Matthew Rocklin and Hugo Bowne-Anderson discuss their recently formed company Coiled and how they are working to make use and maintenance of Dask in production. The share the goals for the business, their approach to building a profitable company based on open source, and the difficulties they face while growing a new team during a global pandemic.
Netflix uses machine learning to power every aspect of their business. To do this effectively they have had to build extensive expertise and tooling to support their engineers. In this episode Savin Goyal discusses the work that he and his team are doing on the open source machine learning operations platform Metaflow. He shares the inspiration for building an opinionated framework for the full lifecycle of machine learning projects, how it is implemented, and how they have designed it to be extensible to allow for easy adoption by users inside and outside of Netflix. This was a great conversation about the challenges of building machine learning projects and the work being done to make it more achievable.
One of the best methods for learning programming is to just build a project and see how things work first-hand. With that in mind, Ken Youens-Clark wrote a whole book of Tiny Python Projects that you can use to get started on your journey. In this episode he shares his inspiration for the book, his thoughts on the benefits of teaching testing principles and the use of linting and formatting tools, as well as the benefits of trying variations on a working program to see how it behaves. This was a great conversation about useful strategies for supporting new programmers in their efforts to learn a valuable skill.
Python is an intuitive and flexible language, but that versatility can also lead to problematic designs if you're not careful. Nikita Sobolev is the CTO of Wemake Services where he works on open source projects that encourage clean coding practices and maintainable architectures. In this episode he discusses his work on the DRY Python set of libraries and how they provide an accessible interface to functional programming patterns while maintaining an idiomatic Python interface. He also shares the story behind the wemake Python styleguide plugin for Flake8 and the benefits of strict linting rules to engender good development habits. This was a great conversation about useful practices to build software that will be easy and fun to work on.
Barry Warsaw has been a member of the Python community since the very beginning. His contributions to the growth of the language and its ecosystem are innumerable and diverse, earning him the title of Friendly Language Uncle For Life. In this episode he reminisces on his experiences as a core developer, a member of the Python Steering Committee, and his roles at Canonical and LinkedIn supporting the use of Python at those companies. In order to know where you are going it is always important to understand where you have been and this was a great conversation to get a sense of the history of how Python has gotten to where it is today.
loading
Comments (3)

Antonio Andrade

terrible audio this time

Jan 14th
Reply

Nihan Dip

this Masonite dude is so full of himself 😂

Sep 21st
Reply

Antonio Andrade

Tobias, are you a robot? nice postcast

May 27th
Reply
Download from Google Play
Download from App Store