Episode 7: The Old Parser

Update: 2024-01-31

Description

Context-free grammars, non-deterministic finite automata, left-to-right leftmost derivations... what even is all that?! Today we're talking about how Python parses your source code. We start gently with how this worked in the past. Come listen to Łukasz's high-level explanations and Pedantic Pablo's "well actuallys".

Timestamps

(00:00:00) INTRO

(00:01:35) You can still download Python 1.0!

(00:02:19) The original tokenizer

(00:03:10) What even is a tokenizer?

(00:04:08) FUN FACTS ABOUT THE TOKENIZER

(00:04:34) Circumflex

(00:05:16) Python's invisible braces

(00:08:29) Backticks in the syntax

(00:11:00) Where are the comments stored?

(00:12:27) GRAMMAR

(00:13:37) What is a grammar?

(00:16:25) The long-forgotten 'access' keyword

(00:20:25) Making LL1 do things it wasn't meant to do

(00:23:24) SURPRISE QUESTION 1: soft keywords

(00:24:46) What's a context-free grammar?

(00:26:51) A note about backslashes

(00:29:33) The Dragon Book(s)

(00:31:27) PARSING: What is it?

(00:35:23) How to generate a parser?

(00:39:00) LL Cool Parser

(00:41:15) What if we used LR?

(00:44:01) Let's have three tokenizers!

(00:47:50) 2to3 and its legacy

(00:52:38) Black and its blib2to3

(00:54:04) The pesky 'with' statement and the death of LL1

(01:00:05) PR OF THE WEEK: GH-113745

(01:05:41) SURPRISE QUESTION 2: Subclasses of SyntaxError

(01:07:02) WHAT'S GOING ON IN CPYTHON?

(01:09:16) Sam Gross nominated as a core dev

(01:10:13) Free-threading progress

(01:13:11) Faster CPython changes

(01:17:29) ntpath.isreserved()

(01:20:11) Pablo and the DWARF

(01:22:02) OUTRO

Pablo Galindo and Łukasz Langa
