
Python Bytes

Author: Michael Kennedy and Brian Okken


Description

Python Bytes is a weekly podcast hosted by Michael Kennedy and
Brian Okken. The show is a short discussion on the headlines and noteworthy news
in the Python, developer, and data science space.
169 Episodes
Sponsored by Datadog: pythonbytes.fm/datadog

Brian #1: D-Tale, suggested by @davidouglasmit via Twitter. “D-Tale is the combination of a Flask back-end and a React front-end to bring you an easy way to view & analyze Pandas data structures. It integrates seamlessly with ipython notebooks & python/ipython terminals. Currently this tool supports such Pandas objects as DataFrame, Series, MultiIndex, DatetimeIndex & RangeIndex.” Way cool UI for visualizing data. The Live Demo shows: Describe (column statistics, graph, and top 100 values), filter, correlations, charts, heat map.

Michael #2: Carnets by Nicolas Holzschuch. A standalone Jupyter notebooks implementation for iOS. The power of Jupyter notebooks. In your pocket. Anywhere. Everything runs on your device. No need to set up a server, no need for an internet connection. Standard packages like Numpy, Matplotlib, Sympy and Pandas are already installed. You're ready to edit notebooks. Carnets uses iOS 11 filesharing ability. You can store your notebooks in iCloud, access them using other apps, share them. Extended keyboard: on iPads, you get an extended toolbar with basic actions on your keyboard. Install more packages: Add more Python packages with %pip (if they are pure Python). OpenSource: Carnets is entirely OpenSource, and released under the FreeBSD license.

Brian #3: BeeWare Podium, suggested by Katie McLaughlin, @glasnt on Twitter. NOT a pip install; download a binary from https://github.com/beeware/podium/releases (Linux and macOS). Still early, so you gotta do the open and trust from the apps directory thing for running stuff not from the app store. But oh man is it worth it. HTML5 based presentation frameworks are cool: run a presentation right in your browser.
My favorite has been remark.js: presenter mode, where notes are especially useful while practicing a talk; a running timer, super helpful while giving a talk; and you write the talk in markdown, so it’s super easy to version control. Issues: presenter mode, full screen, with an extended monitor is hard to do. Notes and timer on the laptop with the full presentation on the extended screen is super cool but requires full screening with the mouse. Podium uses similar syntax as remark.js and I think uses remark under the hood, but it’s a native app, not a browser. Handles the presenter mode and extended screen smoothly, like Keynote and others. Removes the need for boilerplate html in your markdown file (remark.js md files have cruft). Can’t wait to try this out for my next presentation.

Michael #4: pytest-mock-resources, via Daniel Cardin. pytest fixture factories to make it easier to test against code that depends on external resources like Postgres, Redshift, and MongoDB. Code which depends on external resources such as databases (postgres, redshift, etc) can be difficult to write automated tests for. Conventional wisdom might be to mock or stub out the actual database calls and assert that the code works correctly before/after the calls. Whether the actual query did the correct thing truly requires that you execute the query. Having tests depend upon a real postgres instance running somewhere is a pain, very fragile, and prone to issues across machines and test failures. Therefore pytest-mock-resources (primarily) works by managing the lifecycle of docker containers and providing access to them inside your tests.

Brian #5: How James Bennett is testing in 2020. Follow up from Testing Django applications in 2018. Favors unittest over pytest. tox for testing over multiple Django and Python versions, including the tox-travis plugin. pyenv for local Python installation management and the pyenv-virtualenv plugin for venvs. Custom runtests.py for setting up environment and running tests. Changed to src/ directory layout.
Coverage and reporting failure if coverage dips, with a healthy perspective: “… this isn’t because I have 100% coverage as a goal. Achieving that is so easy in most projects that it’s meaningless as a way to measure quality. Instead, I use the coverage report as a canary. It’s a thing that shouldn’t change, and if it ever does change I want to know, because it will almost always mean something else has gone wrong, and the coverage report will give me some pointers for where to look as I start investigating.” Testing is more than tests, it’s also black, isort, flake8, mypy, and even spell checking sphinx documentation. Using tox.ini for utility scripts, like cleanup, pipupgrade, …

Michael #6: Python and PyQt: Building a GUI Desktop Calculator, by Leodanis Pozo Ramos at realpython. Some interesting take-aways on the basics of PyQt. Widgets: QWidget is the base class for all user interface objects, or widgets. These are rectangular-shaped graphical components that you can place on your application’s windows to build the GUI. Layout Managers: Layout managers are classes that allow you to size and position your widgets at the places you want them to be on the application’s form. Main Windows: Most of the time, your GUI applications will be Main Window-Style. This means that they’ll have a menu bar, some toolbars, a status bar, and a central widget that will be the GUI’s main element. Applications: The most basic class you’ll use when developing PyQt GUI applications is QApplication. This class is at the core of any PyQt application. It manages the application’s control flow as well as its main settings. Signals and Slots: PyQt widgets act as event-catchers. Widgets always emit a signal, which is a kind of message that announces a change in its state. Due to Qt licensing, you can only use the free version for non-commercial projects or internal, non-redistributed ones; otherwise you purchase a commercial license for $5,500/yr/dev.
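The signals-and-slots idea from item #6 can be illustrated without a GUI toolkit. What follows is a plain-Python sketch of the concept only, not the PyQt API: the Signal and Button classes and their method names are hypothetical stand-ins chosen for illustration.

```python
# A plain-Python sketch of the signal/slot idea; this is NOT the PyQt API.
class Signal:
    """Hypothetical stand-in for a Qt signal: keeps a list of connected slots."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        # A slot is any callable that should run when the signal is emitted.
        self._slots.append(slot)

    def emit(self, *args):
        # Call every connected slot with the arguments carried by the signal.
        for slot in self._slots:
            slot(*args)


class Button:
    """Hypothetical widget that announces clicks through a signal."""

    def __init__(self, label):
        self.label = label
        self.clicked = Signal()

    def click(self):
        # A state change is announced by emitting the signal.
        self.clicked.emit(self.label)


pressed = []
button = Button("7")
button.clicked.connect(pressed.append)  # the slot records which key was pressed
button.click()
print(pressed)  # ['7']
```

The widget never needs to know who is listening, which is the decoupling the article describes.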
Extras

Brian: PyCascades 2020 livestream videos of day 1 & day 2 are available. Huge shout-out and thank you to all of the volunteers for this event. In particular Nina Zakharenko for calming me down before my talk.

Michael: Recording for Python for .NET devs webcast available. Take some of our free courses with our mobile app.

Joke: Why do programmers confuse Halloween with Christmas? Because OCT 31 == DEC 25. Speed dating is useless. 5 minutes is not enough to properly explain the benefits of the Unix philosophy.
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean

Special guest: Kojo Idrissa!

Michael #1: donkeycar. Have you ever seen a proper RC car race? Donkeycar is a minimalist and modular self driving library for Python. It is developed for hobbyists and students with a focus on allowing fast experimentation and easy community contributions. Use Donkey if you want to: Make an RC car drive itself. Compete in self driving races like DIY Robocars. Experiment with autopilots, mapping, computer vision and neural networks. Log sensor data (images, user inputs, sensor readings). Drive your car via a web or game controller. Leverage community contributed driving data. Use existing CAD models for design upgrades.

Brian #2: RIP Pipenv: Tried Too Hard. Do what you need with pip-tools. By Nick Timkovich. No releases of pipenv in 2019. It “has been held back by several subdependencies and a complicated release process”. Main benefits of pipenv: pin everything and use hashes for verifying packages. The two file concept (Pipfile and Pipfile.lock) is pretty cool and useful. But we can do that with the pip-tools command line tool pip-compile, which is also used by pipenv:

    pip-compile --generate-hashes --output-file requirements.txt requirements.in

What about virtual environment support? This, or the equivalent for your shell, works fine, and it’s built in:

    python -m venv venv --prompt $(basename $PWD)

Kojo #3: str.casefold(), used for caseless matching. “Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string.” Especially helpful for Unicode characters:

    firstString = "der Fluß"
    secondString = "der Fluss"

    # ß is equivalent to ss
    if firstString.casefold() == secondString.casefold():
        print('The strings are equal.')
    else:
        print('The strings are not equal.')
    # prints "The strings are equal."
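To see why casefold() is described as "more aggressive" than lowercasing, it helps to run both on the same pair of strings. This is a small sketch; the helper name caseless_equal is made up for illustration and is not part of the standard library.

```python
def caseless_equal(left: str, right: str) -> bool:
    """Compare two strings while ignoring case distinctions."""
    return left.casefold() == right.casefold()


first = "der Fluß"
second = "der Fluss"

print(first.lower() == second.lower())  # False: lower() leaves ß unchanged
print(caseless_equal(first, second))    # True: casefold() maps ß to ss
```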
Michael #4: Virtualenv, via Brian Skinn. Virtualenv 20.0.0 beta1 is available. Announcement by Bernat Gabor. Why the major release? I identified three main pain points: Creating a virtual environment is slow (takes around 3 seconds, even in offline mode; while 3 seconds does not seem that long if you need to create tens of virtual environments, it quickly adds up). The API used within PEP-405 is excellent if you want to create virtual environments; however, only that. It does not allow us to describe the target environment flexibly or to do that without actually creating the environment. The duality of virtualenv versus venv. Right, python3.4 has the venv module as defined by PEP-405. In theory, we could switch to that and forget virtualenv. However, it is not that simple. virtualenv offers a few benefits that venv does not. Benefits over venv: Ability to discover alternate versions (-p 2 creates a python 2 virtual environment, -p 3.8 a python 3.8, -p pypy3 a PyPy 3, and so on). virtualenv packages out of the box the wheel package as part of the seed packages; this significantly improves package installation speed as pip can now use its wheel cache when installing packages. You are guaranteed to work even when distributions decide not to ship venv (Debian derivatives notably make venv an extra package, and not part of the core binary). Can be upgraded out of band from the host python (often via just pip/curl - so can pull in bug fixes and improvements without needing to wait until the platform upgrades venv). Easier to extend, e.g., we added Xonsh activation script generation without much pushback, support for PowerShell activation on POSIX platforms.

Brian #5: Property-based tests for the Python standard library (and builtins). Zac Hatfield-Dodds and Paul Ganssle, so far. Goal: Find and fix bugs in Python, before they ship to users. “CPython's existing test suite is good, but bugs still slip through occasionally. We think that using property-based testing tools - i.e.
Hypothesis - can help with this. They're no magic bullet, but computer-assisted testing techniques routinely try inputs that humans wouldn't think of (or bother trying), and turn up bugs that humans missed.” “Writing tests that describe every valid input often leads to tighter validation and cleaner designs too, even when no counterexamples are found!” “We aim to have a compelling proof-of-concept by PyCon US, and be running as part of the CPython CI suite by the end of the sprints.” Hypothesis and property based testing is superb to throw at algorithmic pure functions, and the test criteria is relatively straightforward for function pairs that have round trip logic, like tokenize/untokenize, encode/decode, compress/decompress, etc. And there’s probably tons of those types of methods in Python. At the very least, I’m interested in this to watch how other people are using hypothesis.

Kojo #6: PyCon US Tutorial Schedule & Registration. Find the schedule at https://us.pycon.org/2020/schedule/tutorials/ They tend to sell out FAST. Videos are up fast afterwards. What’s interesting to me? Migration from Python 2 to 3; Welcome to Circuit Python (Kattni Rembor); Intro to Property-Based Testing; Minimum Viable Documentation (Heidi Waterhouse).

Extras

Michael: Foreword for Mastering Python Networking. Pyramid (Waitress) and Django both issued security CVEs. You should upgrade! StackOverflow Survey 2020 is open. Go fill it out!

Joke: See the cartoon: https://trello-attachments.s3.amazonaws.com/58e3f7c543422d7f3ad84f33/5df14f77efb5642d017a593f/31cba5cdf0e9805d47837916555dd7ab/b5cb6570af72883f06c3dcbf47679e9d.jpg
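The round-trip property mentioned in item #5 (compress/decompress) can be sketched with the standard library alone. Note the substitution: this is a hand-rolled random test, not Hypothesis, which would also generate smarter inputs and shrink failing cases. The function name check_round_trip and its parameters are assumptions for illustration.

```python
import random
import zlib


def check_round_trip(trials: int = 200, seed: int = 0) -> int:
    """Assert decompress(compress(data)) == data for many random inputs."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    for _ in range(trials):
        size = rng.randrange(0, 256)
        data = bytes(rng.randrange(256) for _ in range(size))
        # The property: a round trip must return exactly the original bytes.
        assert zlib.decompress(zlib.compress(data)) == data
    return trials


print(check_round_trip())  # 200
```

The test states one property for all inputs instead of listing hand-picked examples, which is the core idea behind property-based testing.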
Sponsored by Datadog: pythonbytes.fm/datadog

Special guest: Vicki Boykis: @vboykis

Michael #1: clize: Turn functions into command-line interfaces, via Marcelo. Follow up from Typer on episode 164. Features: Create command-line interfaces by creating functions and passing them to clize.run (https://clize.readthedocs.io/en/stable/api.html#clize.run). Enjoy a CLI automatically created from your functions’ parameters. Bring your users familiar --help messages generated from your docstrings. Reuse functionality across multiple commands using decorators. Extend Clize with new parameter behavior. I love how this is pure Python without its own API for the default case.

Vicki #2: How to cheat at Kaggle AI contests. Kaggle is a platform, now owned by Google, that allows data scientists to find data sets, learn data science, and participate in competitions. Many people participate in Kaggle competitions to sharpen their data science/modeling skills. Recently, a competition that was related to analyzing pet shelter data resulted in a huge controversy. Petfinder.my is a platform that helps people find pets to rescue in Malaysia from shelters. In 2019, they announced a collaboration with Kaggle to create a machine learning predictor algorithm of which pets (worldwide) were more likely to be adopted based on the metadata of the descriptions on the site. The total prize offered was $25,000. After several months, a contestant won. He was previously a Kaggle grandmaster, and won $10k.
A volunteer, Benjamin Minixhofer, offered to put the algorithm in production, and when he did, he found that there was a huge discrepancy between first and second place. Technical aspects of the controversy: The data they gave asked the contestants to predict the speed at which a pet would be adopted, from 1-5, and included input features like type of animal, breed, coloration, whether the animal was vaccinated, and adoption fee. The initial training set had 15k animals and the teams, after a couple months, were then given 4k animals that their algorithms had not seen before as a test of how accurate they were (common machine learning best practice). In a Jupyter notebook Kernel on Kaggle, Minixhofer explains how the winning team cheated. First, they individually scraped Petfinder.my to find the answers for the 4k test data. Using md5, they created a hash for each unique pet, and looked up the score for each hash from the external dataset - there were 3500 overlaps. They did Pandas column manipulation to get at the hidden prediction variable for every 10th pet and replaced the prediction that should have been generated by the algorithm with the actual value. Using mostly: obfuscated functions, Pandas, and dictionaries, as well as MD5 hashes. Fallout: He was fired from H2O.ai. Kaggle issued an apology.

Michael #3: Configuring uWSGI for Production Deployment. We run a lot of uWSGI backed services. I’ve spoken in-depth back on Talk Python 215: The software powering Talk Python courses and podcast about this. This is guidance from Bloomberg Engineering’s Structured Products Applications group. We chose uWSGI as our host because of its performance and feature set. But, while powerful, uWSGI’s defaults are driven by backward compatibility and are not ideal for new deployments. There is also an official Things to Know doc. Unbit, the developer of uWSGI, has “decided to fix all of the bad defaults (especially for the Python plugin) in the 2.1 branch.” The 2.1 branch is not released yet.
Warning, I had trouble with die-on-term and systemctl. Settings I’m using:

    # This option tells uWSGI to fail to start if any parameter
    # in the configuration file isn’t explicitly understood by uWSGI.
    strict = true

    # The master uWSGI process is necessary to gracefully re-spawn
    # and pre-fork workers, consolidate logs, and manage many other features
    master = true

    # uWSGI disables Python threads by default, as described in the Things to Know doc.
    enable-threads = true

    # This option will instruct uWSGI to clean up any temporary files or UNIX sockets it created
    vacuum = true

    # By default, uWSGI starts in multiple interpreter mode
    single-interpreter = true

    # Prevents uWSGI from starting if it is unable to find or load your application module
    need-app = true

    # uWSGI provides some functionality which can help identify the workers
    auto-procname = true
    procname-prefix = pythonbytes-

    # Forcefully kill workers after 60 seconds. Without this feature,
    # a stuck process could stay stuck forever.
    harakiri = 60
    harakiri-verbose = true

Vicki #4: Thinc: A functional take on deep learning, compatible with Tensorflow, PyTorch, and MXNet. A deep learning library that abstracts away some TF and Pytorch boilerplate, from Explosion. Already runs under the covers in SpaCy, an NLP library used for deep learning. Type checking, particularly helpful for Tensors: PyTorchWrapper and TensorFlowWrapper classes and the intermingling of both. Deep support for numpy structures and semantics. Assumes you’re going to be using stochastic gradient descent, and operates in batches. Also cleans up the configuration and hyperparameters. Mainly hopes to make it easier and more flexible to do matrix manipulations, using a codebase that already existed but was not customer-facing. Examples and code are all available in notebooks in the GitHub repo.

Michael #5: pandas-vet, via Jacob Deppen. A plugin for Flake8 that checks pandas code. Starting with pandas can be daunting.
The usual internet help sites are littered with different ways to do the same thing and some features that the pandas docs themselves discourage live on in the API. Makes pandas a little more friendly for newcomers by taking some opinionated stances about pandas best practices. The idea to create a linter was sparked by Ania Kapuścińska's talk at PyCascades 2019, "Lint your code responsibly!"

Vicki #6: NumPy beginner documentation. NumPy is the backbone of numerical computing in Python: Pandas (which I mentioned before), scikit-learn, Tensorflow, and Pytorch all lean heavily on, if not directly depend on, its core concepts, which include matrix operations through a data structure known as a NumPy array (ndarray), which is different than a Python list. Anne Bonner wrote up new documentation for NumPy that introduces these fundamental concepts to beginners coming to both Python and scientific computing. Before, you went directly to the section about arrays and had to search through it to find what you wanted. The new guide, which is very nice, includes a step-by-step on how arrays work, how to reshape them, and illustrated guides on basic array operations.

Extras

Vicki: I write a newsletter, Normcore Tech, about all things tech that I’m not seeing covered in the mainstream tech media. I’ve written before about machine learning, data for NLP, Elon Musk memes, and Nginx. There’s a free version that goes out once a week and paid subscribers get access to one more newsletter per week, but really it’s more about the idea of supporting in-depth writing about tech. vicki.substack.com

Michael: pip 20.0 Released - Default to doing a user install (as if --user was passed) when the main site-packages directory is not writeable and user site-packages are enabled, cache wheels built from Git requirements, and more. Homebrew: brew install python@3.8

Joke: An SEO expert walks into a bar, bars, pub, public house, Irish pub, tavern, bartender, beer, liquor, wine, alcohol, spirits...
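The hash lookup described in the Kaggle item (#2) shows why held-out test labels must not be obtainable elsewhere. Below is a rough sketch of how an MD5-keyed lookup matches records; every record, field, and function name here is invented for illustration and is not taken from the competition notebook.

```python
import hashlib


def record_key(name: str, description: str) -> str:
    """Return an MD5 hex digest that identifies one listing by its text."""
    text = f"{name}|{description}"
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical scraped listings with their known adoption-speed labels.
scraped = [
    ("Milo", "Playful kitten, vaccinated", 2),
    ("Rex", "Calm senior dog", 4),
]
leaked_labels = {record_key(name, desc): speed for name, desc, speed in scraped}

# A "test" row whose label is supposed to be hidden from contestants.
hidden_row = ("Rex", "Calm senior dog")
print(leaked_labels.get(record_key(*hidden_row)))  # 4
```

Because the digest is opaque, a dictionary of hashes reveals nothing about its origin at a glance, which is partly why the lookup was hard to spot in obfuscated code.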
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean

Michael #1: Amazon is now offering quantum computing as a service. Amazon Braket – A fully managed service that allows scientists, researchers, and developers to begin experimenting with computers from multiple quantum hardware providers in a single place. We all know about bits. Quantum computers use a more sophisticated data representation known as a qubit or quantum bit. Each qubit can exist in state 1 or 0, but also in superpositions of 1 and 0, meaning that the qubit simultaneously occupies both states. Such states can be specified by a two-dimensional vector that contains a pair of complex numbers, making for an infinite number of states. Each of the complex numbers is a probability amplitude, basically the odds that the qubit is a 0 or a 1, respectively. Amazon Braket is a new service designed to let you get some hands-on experience with qubits and quantum circuits. You can build and test your circuits in a simulated environment and then run them on an actual quantum computer. See linked announcement. Language looks familiar:

    bell = Circuit().h(0).cnot(0, 1)
    print(device.run(bell, s3_folder).result().measurement_counts())

How it works: Quantum computers work by manipulating the amplitudes of the state vector. To program a quantum computer, you figure out how many qubits you need, wire them together into a quantum circuit, and run the circuit. When you build the circuit, you set it up so that the correct answer is the most probable one, and all the rest are highly improbable.

Brian #2: A quick-and-dirty guide on how to install packages for Python, by Brett Cannon. Good modern intro to venv use. Pro: short, simple, quick; uses --prompt in every example (more people need to use this) and suggests using the directory name containing the env;
send it to all your co-workers that STILL aren’t using virtual environments; hints at an improved form of --prompt coming in Python 3.9. Con: uses .venv, and I’m a venv (no dot) kinda guy; hints at an improved form of --prompt coming in Python 3.9, where --prompt . will deduce the directory name. In 3.8 it just names your env “.”.

Michael #3: Say No to the no code movement. Article by Alex Hudson. 2020 is going to be the year of “no code”: the movement that says you can write business logic and even entire applications without having the training of a software developer. Every company is a software company. But software devs are in short supply and outcomes are variable. Two distinct benefits to transitioning business processes into the software domain: “change control” becomes a software problem rather than a people problem, and it’s easier to innovate on what makes a business distinct. The basic problem with “no code”: the idea of writing business logic in text form according to the syntax of a technical programming language is anathema. The “simpler abstraction” misconception. The “simpler syntax” misconception. Configuration over code: Many No Code advocates are building significant systems by pulling together off-the-shelf applications and integrating them. But the logic has been implemented as configuration as opposed to code. The equivalence of code: There are reasons why developers still use plain text; if something came along that was better, many (not all!) developers would drop text like a hot rock. Where does “No code” fail in practice? 80% there and then … Where does “No code” succeed? “No Code” systems are extremely good for putting together proofs-of-concept which can demonstrate the value of moving forward with development.

Brian #4: What I learned going from prison to Python, by Shadeed “Sha” Wallace-Stepter. Presented at North Bay Python. I got this recommended to me by many people, even those not in the Python community, including my good friends Chuck Forbes and Dr.
Donna Beegle, who work to fight poverty. Amazing story. Go listen to it.

Michael #5: A real QUICK → Qt5 based gUI generator for ClicK. Via Ricky Teachey. Inspired by Gooey, the GUI generator for classical Python argparse-based command line programs. Take a standard Click-based app, add --gui to the command line and you get a GUI!

Brian #6: Falsehoods programmers believe about time; also More falsehoods programmers believe about time; “wisdom of the crowd” edition. All of these assumptions are wrong: There are always 24 hours in a day. Months have either 30 or 31 days. … A week always begins and ends in the same month. … The system clock will always be set to the correct local time. The system clock will always be set to a time that is not wildly different from the correct local time. If the system clock is incorrect, it will at least always be off by a consistent number of seconds. … It will never be necessary to set the system time to any value other than the correct local time. Ok, testing might require setting the system time to a value other than the correct local time but it will never be necessary to do so in production. … Human-readable dates can be specified in universally understood formats such as 05/07/11. … From “More …”: The day before Saturday is always Friday. … Two subsequent calls to a getCurrentTime() function will return distinct results. The second of two subsequent calls to a getCurrentTime() function will return a larger result. The software will never run on a space ship that is orbiting a black hole.

Extras

Michael: REMI GUI editor

Joke: https://twitter.com/mbbillz/status/921119218703257600
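A few of the falsehoods in item #6 are easy to check with the standard library. The sketch below uses dates picked only for illustration; it shows February's length, a week that straddles two months, and why a monotonic clock is the safer tool for measuring intervals.

```python
import calendar
import time
from datetime import date, timedelta

# "Months have either 30 or 31 days": February has 28 or 29.
print(calendar.monthrange(2019, 2)[1])  # 28
print(calendar.monthrange(2020, 2)[1])  # 29

# "A week always begins and ends in the same month."
monday = date(2020, 1, 27)
sunday = monday + timedelta(days=6)
print(monday.month, sunday.month)  # 1 2

# "The second of two subsequent calls ... will return a larger result":
# the wall clock can be adjusted, so measure intervals with a monotonic clock,
# which never goes backwards (though two readings may be equal).
start = time.monotonic()
end = time.monotonic()
print(end >= start)  # True
```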
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean

Brian #1: iterators, generators, coroutines. Cool quick read article by Mark McDonnell. Starts with an attempt at a gentle introduction to the iterator protocol (why does everyone think that users need to start with this info?). Muscle through this part or just skim it. Should be an appendix. Generators (start here): functions that use yield. Unbound generators: they don’t stop. Generator Expressions: Like for v in ("foo" for i in range(5)): … Use parens instead of brackets, otherwise they are like list comprehensions. Specifically: (expression for item in collection if condition). Generators using generators / nested generators: yield from. Given bar() and baz() are generators, this works:

    def foo():
        yield from bar()
        yield from baz()

Coroutines are an extension of generators. “Generators use the yield keyword to return a value at some point in time within a function, but with coroutines the yield directive can also be used on the right-hand side of an = operator to signify it will accept a value at that point in time.” Then….. coroutine example, some asyncio stuff, … honestly I got lost. Bottom line: I’m still looking for a great tutorial on coroutines that doesn’t explain the iterator protocol (boring!), and shows an example NOT using asyncio and NOT a REPL example. I want to know how I can make use of coroutines in an actual program (toy ok) where the use of coroutines actually helps the structure and makes it more maintainable, etc.

Michael #2: requests-toolbelt. A toolbelt of useful classes and functions to be used with requests. multipart/form-data encoder - The main attraction is a streaming multipart form-data object, MultipartEncoder.
User-Agent constructor - You can easily construct a requests-style User-Agent string. SSLAdapter - Allows the user to choose one of the SSL protocols made available in Python's ssl module for outgoing HTTPS connections. ForgetfulCookieJar - prevents a particular requests session from storing cookies.

Brian #3: Pandas Validation. We covered Bulwark in episode 162. There are other approaches and projects looking at the same problem.

pandas-validation, suggested by Lance: “… pandas-validation lets you create a template of what your pandas dataframe should look like and it'll validate the entire dataframe against that template. So if you have a dataframe with first column being strings second column being dates and the third being address, you can use a mixture of built in validate types to ensure your data conforms to that. It will even let you set up some regex and make sure that the data in a column conforms to that regex.” - Lance. Supports dates, timestamps, numeric values, strings.

pandera: “pandera provides a flexible and expressive API for performing data validation on tidy (long-form) and wide data to make data processing pipelines more readable and robust.” “pandas data structures contain information that pandera explicitly validates at runtime. This is useful in production-critical or reproducible research settings.” “pandera enables users to: Check the types and properties of columns in a DataFrame or values in a Series. Perform more complex statistical validation. Seamlessly integrate with existing data analysis/processing pipelines via function decorators.”

A few different approaches. I can’t really tell from the outside if there is a clear winner or solution that’s working better for most cases. I’d like to hear from listeners which they use, if any. Or if we missed the obvious validation method most people are using.

Michael #4: qtpy. I have been inspired to check out Qt again, but the libraries and versions are confusing.
Provides a uniform layer to support PyQt5, PySide2, PyQt4 and PySide with a single codebase. Basically, you can write your code as if you were using PySide2 but import Qt modules from qtpy instead of PySide2 (or PyQt5).

Brian #5: pylightxl. Viktor Kis submission. “A light weight, zero dependency, minimal functionality excel read/writer python library” Well. Reader right now. Writing coming soon. :) Some cool examples in the docs to get you started grabbing data from spreadsheets right away. Features: Zero non-standard library dependencies. Single source code that supports both Python37 and Python27. The light weight library is only 3 source files that can be easily copied directly into a project for those that have installation/download restrictions. In addition the library’s size and zero dependency makes pyinstaller compilation small and easy! 100% test-driven development for highest reliability/maintainability with 100% coverage on all supported versions. API aimed to be user friendly, intuitive and to the point with no bells and whistles. Structure: database > worksheet > indexing, for example db.ws('Sheet1').index(row=1,col=2) or db.ws('Sheet1').address(address='B1'). Read excel files (.xlsx, .xlsm), all sheets or selective few for speed/memory management. Index cells data by row/col number or address. Calling an entire row/col of data returns an easy to use list output: db.ws('Sheet1').row(1) or db.ws('Sheet1').rows. Worksheet data size is consistent for each row/col. Any data that is empty will return a ‘’.

Michael #6: python-ranges, via Aiden Price. Continuous Range, RangeSet, and RangeDict data structures for Python. Best understood as an example:

    tax_info = RangeDict({
        Range(0, 9701): (0, 0.10, 0),
        Range(9701, 39476): (970, 0.12, 9700),
        ...
    })
    income = int(input("What is your income? $"))
    base, marginal_rate, bracket_floor = tax_info[income]

Range and RangeSet objects are mutually compatible for things like union(), intersection(), difference(), and symmetric_difference().

Extras

Brian: pytest-check works with pytest-rerunfailures - with other plugins, it may not. - known incompatibility with flaky and retry.

Michael: Pandas goes 1.0 (via Jeremy Schendel). Just put out a release candidate for 1.0, and will be using SemVer going forward. PyCharm security from Anthony Shaw. Video for Python for Decision Makers webcast is out.

Joke: Optimist: The glass is half full. Pessimist: The glass is half empty. Engineer: The glass is twice as large as it needs to be.
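As one possible answer to the wish in item #1 for a coroutine example that does not use asyncio, here is a toy generator-based coroutine that receives values through send(). The running_average name is made up for illustration, and this is a sketch of the yield-on-the-right-hand-side idea rather than the article's own example.

```python
def running_average():
    """Coroutine: receives numbers via send() and yields the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # yield hands back the current average and waits for the next value.
        value = yield average
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)  # prime the coroutine so it runs up to the first yield
print(averager.send(10))  # 10.0
print(averager.send(20))  # 15.0
print(averager.send(30))  # 20.0
```

The caller keeps no state of its own; the running total lives inside the paused function, which is the structural benefit over a plain function plus global variables.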
Sponsored by Datadog: pythonbytes.fm/datadog

Michael #1: Data driven journalism via cjworkbench, via Michael Paholski. The data journalism platform with built in training. Think spreadsheet + ETL automation. Designed around modular tools for data processing -- table in, table out -- with no code required. Features include: Modules to scrape, clean, analyze and visualize data. An integrated data journalism training program. Connect to Google Drive, Twitter, and API endpoints. Every action is recorded, so all workflows are repeatable and transparent. All data is live and versioned, and you can monitor for changes. Write custom modules in Python and add them to the module library.

Brian #2: remi: A Platform-independent Python GUI library for your applications. Python REMote Interface library. “Remi is a GUI library for Python applications which transpiles an application's interface into HTML to be rendered in a web browser. This removes platform-specific dependencies and lets you easily develop cross-platform applications in Python!” No dependencies. pip install git+https://github.com/dddomodossola/remi.git doesn’t install anything else. Yes. Another GUI in a web page, but for quick and dirty internal tools, this will be very usable. Basic app:

    import remi.gui as gui
    from remi import start, App

    class MyApp(App):
        def __init__(self, *args):
            super(MyApp, self).__init__(*args)

        def main(self):
            container = gui.VBox(width=120, height=100)
            self.lbl = gui.Label('Hello world!')
            self.bt = gui.Button('Press me!')
            self.bt.onclick.do(self.on_button_pressed)
            container.append(self.lbl)
            container.append(self.bt)
            return container

        def on_button_pressed(self, widget):
            self.lbl.set_text('Button pressed!')
            self.bt.set_text('Hi!')

    start(MyApp)

Michael #3: Typer. Build great CLIs. Easy to code. Based on Python type hints. Typer is FastAPI's little sibling. And it's intended to be the FastAPI of CLIs. Just declare once the types of parameters (arguments and options) as function parameters.
You do that with standard modern Python types. You don't have to learn a new syntax, the methods or classes of a specific library, etc. Based on Click Example (min version) import typer def main(name: str): typer.echo(f"Hello {name}") if __name__ == "__main__": typer.run(main) Brian #4: Effectively using Matplotlib Chris Moffitt “… I think I was a little premature in dismissing matplotlib. To be honest, I did not quite understand it and how to use it effectively in my workflow.” That very much sums up my relationship with matplotlib. But I’m ready to take another serious look at it. one reason for complexity is 2 interfaces MATLAB like state-based interface object based interface (use this) recommendations: Learn the basic matplotlib terminology, specifically what is a Figure and an Axes . Always use the object-oriented interface. Get in the habit of using it from the start of your analysis. Start your visualizations with basic pandas plotting. Use seaborn for the more complex statistical visualizations. Use matplotlib to customize the pandas or seaborn visualization. Runs through an example Describes figures and plots Includes a handy reference for customizing a plot. Related: StackOverflow answer that shows how to generate and embed a matplotlib image into a flask app without saving it to a file. Style it with pylustrator.readthedocs.io :) Michael #5: Django Simple Task django-simple-task runs background tasks in Django 3 without requiring other services and workers. It runs them in the same event loop as your ASGI application. Here’s a simple overview of how it works: On application start, a queue is created and a number of workers starts to listen to the queue When defer is called, a task(function or coroutine function) is added to the queue When a worker gets a task, it runs it or delegates it to a threadpool On application shutdown, it waits for tasks to finish before exiting ASGI server It is required to run Django with ASGI server. 
Example from django_simple_task import defer def task1(): time.sleep(1) print("task1 done") async def task2(): await asyncio.sleep(1) print("task2 done") def view(requests): defer(task1) defer(task2) return HttpResponse(b"My View") Brian #6: PyPI Stats at pypistats.org Simple interface. Pop in a package name and get the download stats. Example use: Why is my open source project now getting PRs and issues? I’ve got a few packages on PyPI, not updated much. cards and submark are mostly for demo purposes for teaching testing. pytest-check is a pytest plugin that allows multiple failures per test. I only hear about issues and PRs on one of these. So let’s look at traffic. cards: downloads day: 2 week: 24 month: 339 submark: day: 5 week: 9 month: 61 pytest-check: day: 976 week: 4,524 month: 19,636 That totally explains why I need to start actually supporting pytest-check. Cool. Note: it’s still small. Top 20 packages are all downloaded over 1.3 million times per day. Extras: Comment from January Python PDX West meetup “Please remember to have one beginner friendly talk per meetup.” Good point. Even if you can’t present here in Portland / Hillsboro, or don’t want to, I’d love to hear feedback of good beginner friendly topics that are good for meetups. PyCascades 2020 discount code listeners-at-pycascades for 10% off FireFox 72 is out with anti-fingerprinting and PIP - Ars Technica Joke: Language essays comic
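The django-simple-task flow described in Michael #5 above (a queue created at startup, workers listening on it, defer adding work, shutdown waiting for the queue to drain) can be sketched with the standard library alone. This is a minimal illustration of the pattern, not django-simple-task's actual code: worker, main, and the results list are hypothetical names, and the two tasks return strings instead of printing so the outcome is easy to check.

```python
import asyncio
import inspect


async def worker(queue, results):
    """Pull tasks off the queue; await coroutine functions, run plain functions in a thread."""
    loop = asyncio.get_running_loop()
    while True:
        task = await queue.get()
        try:
            if inspect.iscoroutinefunction(task):
                results.append(await task())
            else:
                # Blocking callables are delegated to the default thread pool.
                results.append(await loop.run_in_executor(None, task))
        finally:
            queue.task_done()


def task1():
    return "task1 done"


async def task2():
    await asyncio.sleep(0.01)
    return "task2 done"


async def main():
    # "Application start": create the queue and a couple of workers.
    queue = asyncio.Queue()
    results = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(2)]

    # "defer": enqueue the callable and return immediately.
    queue.put_nowait(task1)
    queue.put_nowait(task2)

    # "Application shutdown": wait for queued work, then stop the workers.
    await queue.join()
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return sorted(results)


results = asyncio.run(main())
print(results)
```

Because everything runs in one event loop, a long blocking function would stall the other tasks if it were not handed to the thread pool, which is presumably why the library makes the same split.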
Sponsored by us! Support us by visiting pythonbytes.fm/biz [courses] and pythonbytes.fm/pytest [book], or becoming a patron at patreon.com/pythonbytes Brian #1: Meditations on the Zen of Python Moshe Zadka The Zen of Python is not "the rules of Python" or "guidelines of Python". It is full of contradiction and allusion. It is not intended to be followed: it is intended to be meditated upon. Moshe gives some of his thoughts on the different lines of the Zen of Python. Full Zen of Python can be found here or in a REPL with import this A few Beautiful is better than ugly Consistency helps. So black, flake8, pylint are useful. “But even more important, only humans can judge what humans find beautiful. Code reviews and a collaborative approach to writing code are the only realistic way to build beautiful code. Listening to other people is an important skill in software development.” Complex is better than complicated. “When solving a hard problem, it is often the case that no simple solution will do. In that case, the most Pythonic strategy is to go "bottom-up." Build simple tools and combine them to solve the problem.” Readability counts “In the face of immense pressure to throw readability to the side and just "solve the problem," the Zen of Python reminds us: readability counts. Writing the code so it can be read is a form of compassion for yourself and others.” Michael #2: nginx raided by Russian police Russian police have raided today the Moscow offices of NGINX, Inc., a subsidiary of F5 Networks and the company behind the internet's most popular web server technology. Russian search engine Rambler.ru claims full ownership of NGINX code. Rambler claims that Igor Sysoev developed NGINX while he was working as a system administrator for the company, hence they are the rightful owner of the project. Sysoev never denied creating NGINX while working at Rambler. 
In a 2012 interview, Sysoev claimed he developed NGINX in his free time and that Rambler wasn't even aware of it for years. Update: Promptly following the event we took measures to ensure the security of our master software builds for NGINX, NGINX Plus, NGINX WAF and NGINX Unit—all of which are stored on servers outside of Russia. No other products are developed within Russia. F5 remains committed to innovating with NGINX, NGINX Plus, NGINX WAF and NGINX Unit, and we will continue to provide the best-in-class support you’ve come to expect. Brian #3: I'm not feeling the async pressure Armin Ronacher “Async is all the rage.” But before you go there, make sure you understand flow control and back pressure. “…back pressure is resistance that opposes the flow of data through a system. Back pressure sounds quite negative … but it's here to save your day.” If parts of your system are async, you have to make sure the entire flow through the system doesn’t have overflow points. An example shown with reader/writer that is way hairier than you’d think it should be. “New Footguns: async/await is great but it encourages writing stuff that will behave catastrophically when overloaded.” “So for you developers of async libraries here is a new year's resolution for you: give back pressure and flow control the importance they deserve in documentation and API.” Michael #4: codetiming from Real Python via Doug Farrell A flexible, customizable timer for your Python code For a complete tutorial on how codetiming works, see Python Timer Functions: Three Ways to Monitor Your Code on Real Python. 
Time your code via A timer class A decorator A context manager Brian #5: Making Python Programs Blazingly Fast Martin Heinz Seemed like a good followup to the last topic Profiling with command line time python something.py python -m cProfile -s time something.py timing functions with wrapper Misses timeit, but see that also, https://docs.python.org/3.8/library/timeit.html How to make things faster: use built in types over custom types caching/memoization with lru_cache use local variables and local aliases when looping use functions… (kinda duh, but sure). don’t repeatedly access attributes in loops use f-strings over other formatting use generators. or at least experiment with them. the memory savings could result in speedup Michael #6: LocalStack via Graham Williamson and Jan 'oglop' Gazda A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline! LocalStack spins up the following core Cloud APIs on your local machine: S3, DynamoDB, Lambda, Elasticsearch see many more services paid one has more LocalStack builds on existing best-of-breed mocking/testing tools, most notably kinesalite/dynalite and moto. While these tools are awesome (!), they lack functionality for certain use cases. LocalStack combines the tools, makes them interoperable, and adds important missing functionality on top of them Has lots of config and knobs, but runs in docker so that helps Extras: Python Job Board Michael: Guido interviewed for JavaScript language! Microsoft: We're creating a new Rust-based programming language for secure coding New webcast: Python for the .NET developer Ace Python Interviews free course Joke: Types of software jobs.
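Two of the points in Brian #5 above, measuring with timeit and memoizing with lru_cache, fit in one short sketch. The Fibonacci functions are an illustrative stand-in chosen here, not an example taken from the article, and the timings will vary by machine, so no expected numbers are shown.

```python
import timeit
from functools import lru_cache


def fib_plain(n):
    """Naive recursion: recomputes the same subproblems many times."""
    return n if n < 2 else fib_plain(n - 1) + fib_plain(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    """Same recursion, but each distinct n is computed only once."""
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


# timeit.timeit accepts a callable and runs it `number` times.
plain_seconds = timeit.timeit(lambda: fib_plain(18), number=20)
cached_seconds = timeit.timeit(lambda: fib_cached(18), number=20)

print(f"plain:  {plain_seconds:.4f}s")
print(f"cached: {cached_seconds:.4f}s")
print(fib_cached.cache_info())
```

After the first call the cached version answers from its table, so most of its 20 timed runs are dictionary lookups; cache_info() shows the hit and miss counts.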
Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Aly Aly #1: Andrew Godwin - Just Add Await: Retrofitting Async into Django — DjangoCon 2019 Andrew is leading the implementation of asynchronous support for Django Overview of Async Landscape How synchronous and asynchronous code interact Async functions are different than sync functions which makes it hard to design APIs Difficulties in adding Async support to Django Django is a project that a lot of people are familiar with; its new async implementation also needs to feel familiar The plan was to implement async capabilities in three phases Phase 1: ASGI Support (Django 3.0) This phase lays the groundwork for future changes ORM is async-aware: using it from async code raises a SynchronousOnlyOperation exception Phase 2: Async Views, Async Handlers, and Async Middleware (Django 3.1) Add async capabilities for the core part of the request path There is a branch where things are mostly working, just need to fix a couple of tests Phase 3: Async ORM (Django 3.2 / 4.0) Largest, most difficult and most unbounded part of the project ORM queries can result in lots of database lookups; have to be careful here Async Project Wiki - project status, find out how to contribute Brian #2: gamesbyexample Al Sweigart “PythonStdioGames : A collection of games (with source code) to use for example programming lessons. Written in Python 3. Click on the src folder to view all of the programs.” I first learned programming by modifying games written by others and seeing what the different parts do when I change them. For me it was Lunar Lander on a TRS-80, and it took forever to type in the listing from the back of a magazine. But now, you can just clone a repo and play with existing files. Cool features: They're short, with a limit of 256 lines of code. They fit into a single source code file and have no installer. They only use the Python standard library. They only use stdio text; print() and input() in Python. 
They're well commented. They use as few programming concepts as possible. If classes, list comprehensions, recursion, aren't necessary for the program, then they aren't used. Elegant and efficient code is worthless next to code that is easy to understand and readable. These programs are for education, not production. Standard best practices, like not using global variables, can be ignored to make it easier to understand. They do input validation and are bug free. All functions have docstrings. There’s also a todo list if people want to help out. Aly #3: Bulwark Open-source library that allows users to property test pandas DataFrames Goal is to make it easy for data analysts and data scientists to write tests Tests around data are different; they are not deterministic, they require us to think about testing in a different way With property tests, we can check an object has a certain property Property tests for DataFrames include validating the shape of the DataFrame, checking that a column is within a certain range, verifying a DataFrame has no NaNs, etc Bulwark allows you to implement property tests as checks. Each check Takes a DataFrame and optional arguments The check will make an assertion about a DataFrame property If the assertion passes, the check will return the original, unaltered DataFrame If the check fails, an AssertionError is raised and you have context around why it failed Bulwark also allows you to implement property checks as decorators This is useful if you design data pipelines as functions Each function takes in input data, performs an action, returns output Add decorators to pipeline functions to validate properties of the input DataFrame Lots of builtin checks and decorators; easy to add your own Slides with example usage and tips: Property Testing with Pandas with Bulwark Brian #4: Poetry 1.0.0 Sebastien Eustace caution: not backwards compatible full change log Highlights: Poetry is getting serious. 
more ways to manage environments switch between python versions in a project with poetry env use /path/to/python or poetry env use python3.7 Improved support for private indices (instead of just pypi) can specify index per dependency can specify a secondary index can specify a non-pypi index as default, avoiding pypi Env variable support to more easily work with poetry in a CI environment Improved add command to allow for constraints, paths, directories, etc for a dependency publishing allows api tokens marker specifiers on dependencies. Aly #5: Kubernetes for Full-Stack Developers With the rise of containers, Kubernetes has become the de facto platform for running and coordinating containerized applications across multiple machines This guide follows steps new users would take when learning how to deploy applications to Kubernetes: Learn Kubernetes core concepts Build modern 12 Factor web applications Get applications working inside of containers Deploy applications to Kubernetes Manage cluster operations New to containers? Check out my Introduction to Docker talk Brian #6: testmon: selects tests affected by changed files and methods On a previous episode (159) we mentioned pytest-picked and I incorrectly assumed it would run tests related to code that has changed, ‘cause it says “Run the tests related to the unstaged files or the current branch (according to Git)”. I was wrong, Michael was right. It runs the tests that are in modified test files. What I was thinking of is “testmon” which does what I was hoping for. “pytest-testmon is a pytest plugin which selects and executes only tests you need to run. It does this by collecting dependencies between tests and all executed code (internally using Coverage.py) and comparing the dependencies against changes. 
testmon updates its database on each test execution, so it works independently of version control.” If you had tried testmon before, like me, be aware that there have been significant changes in 1.0.0 Very cool to see continued effort on this project. Extras: Aly: Finding local Python User Groups PyCon.org Events Calendar Meetup.com search for Python PyTennessee 2019 on March 7 - 8. Tickets on sale now! I will be giving a talk on the Facade Design Pattern Brian: Next episode is planned to be a live recording during the Jan 7 Python PDX West meetup. There will also be 1-2 other talks. Joke: From Tyler Matteson Two coroutines walk into a bar. RuntimeError: 'bar' was never awaited. From Ben Sandofsky Q: How many developers on a message board does it take to screw in a light bulb? A: “Why are you trying to do that?”
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest: Anthony Herbert Anthony #1: Larry Hastings - Solve Your Problem With Sloppy Python - PyCon 2018 Michael’s personal automation things that I do all the time stripe to sheets automation urlify tons of reporting wakeup - to get 100 on Lighthouse deploy (on my servers) creating import data for video courses measuring duration of audio files Michael #2: Introduction to ASGI: Emergence of an Async Python Web Ecosystem by Florimond Manca Python growth is not just data science Python web development is back with an async spin, and it's exciting. One of the main drivers of this endeavour is ASGI , the Asynchronous Server Gateway Interface. A guided tour about what ASGI is and what it means for modern Python web development. Since 3.5 was released, the community has been literally async-ifying all the things. If you're curious, a lot of the resulting projects are now listed in aio-libs and awesome-asyncio . An overview of ASGI Why should I care? Interoperability is a strong selling point, but there are many more advantages to using ASGI-based components for building Python web apps. Speed: the async nature of ASGI apps and servers makes them really fast (for Python, at least) — we're talking about 60k-70k req/s (consider that Flask and Django only achieve 10-20k in a similar situation). Features: ASGI servers and frameworks give you access to inherently concurrent features (WebSocket, Server-Sent Events, HTTP/2) that are impossible to implement using sync/WSGI. Stability: ASGI as a spec has been around for about 3 years now, and version 3.0 is considered very stable. Foundational parts of the ecosystem are stabilizing as a result. To get your hands dirty, try out any of the following projects: uvicorn: ASGI server. Starlette: ASGI framework. TypeSystem: data validation and form rendering Databases: async database library. orm: asynchronous ORM. 
HTTPX: async HTTP client w/ support for calling ASGI apps (useful as a test client). Anthony #3: Python Insights Michael #4: Assembly via Luiz Honda Assembly is a Pythonic Object-Oriented Web Framework built on Flask, that groups your routes by class Assembly is a pythonic object-oriented, mid stack, batteries included framework built on Flask, that adds structure to your Flask application, and groups your routes by class. Assembly allows you to build web applications in much the same way you would build any other object-oriented Python program. Assembly helps you create small to enterprise level applications easily. Decisions made for you + features: github.com/mardix/assembly#decisions-made-for-you--features Examples, root URLs: # Extending Assembly makes it a route automatically # By default, Index will be the root url class Index(Assembly): # index is the entry route # -> / def index(self): return "welcome to my site" # method name becomes the route # -> /hello/ def hello(self): return "I am a string" # underscored method names will be dasherized # -> /about-us/ def about_us(self): return "I am a string" Example of /blog. # The class name is part of the url prefix # This will become -> /blog class Blog(Assembly): # index will be the root # -> /blog/ def index(self): return [ { "title": "title 1", "content": "content" }, ... ] # with params. The order will be respected # -> /comments/1234/ # 1234 will be passed to the id def comments(self, id): return [ { comments... } ] Anthony #5: Building a Standalone GPS Logger with CircuitPython using @Adafruit and particle hardware Michael #6: 10 reasons python is good to learn Python is popular and good to learn because, in Michael’s words, it’s a full spectrum language. 
And the reasons are: Python Is Free and Open-Source Python Is Popular, Loved, and Wanted Python Has a Friendly and Devoted Community Python Has Elegant and Concise Syntax Python Is Multi-Platform Python Supports Multiple Programming Paradigms Python Offers Useful Built-In Libraries Python Has Many Third-Party Packages Python Is a General-Purpose Programming Language Python Plays Nice with Others Extras: Michael: I was just on .NET Rocks podcast talking about Python for the .NET Developer New Python for the .NET Developer 9-hour course New Python for Decision Makers course, 2.5 hours of exploring Python for your org. Hidden files in Finder: use shortcut cmd+shift+. Anthony: Pretty Printed YouTube channel Joke: The failed pickup line A girl is hanging out at a bar with her friends. Some guy comes up to her and says: “You are the ; to my line of code.” She responds, “Get outta here creep, I code in Python.”
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Type Hints for Busy Python Programmers Al Sweigart, @AlSweigart We’ve (Michael and myself, of course) convinced you that type hints might be a good thing to help reduce bugs or confusion or whatever. Now what? Al’s got your back with this no nonsense guide to get you going very quickly. Written as a conversation between a programmer and a type hint expert. Super short. Super helpful. typing and mypy are the modules you need. There are other tools, but let’s start there. Doesn’t affect run time, so you gotta run the tool. Gradually add, don’t have to do everything in one go. Covers the basics And then the “just after basics” stuff you’ll run into right away when you start, like: Allowing a type and None: Union[int, NoneType] Optional parameters Shout out to Callable, Sequence, Mapping, Iterable, available in the documentation when you are ready for them later Just really a great get started today guide. Michael #2: auto-py-to-exe A .py to .exe converter using a simple graphical interface built using Eel and PyInstaller in Python. Using the Application Select your script location (paste in or use a file explorer) Outline will become blue when file exists Select other options and add things like an icon or other files Click the big blue button at the bottom to convert Find your converted files in /output when complete Short 3 min video. Brian #3: How to document Python code with Sphinx Moshe Zadka, @moshezadka I’m intimidated by sphinx. Not sure why. But what I’ve really just wanted to do is to use it for this use of generating documentation of code based on the code and the docstrings. Many of the tutorials I’ve looked at before got me stuck somewhere along the way and I’ve given up. But this looks promising. Example module with docstring shown. Simple docs/index.rst, no previous knowledge of restructured text necessary. 
Specifically what extensions do I need: autodoc, napoleon, and viewcode example docs/conf.py that’s really short setting up tox to generate the docs and the magic command line incantation necessary: sphinx-build -W -b html -d {envtmpdir}/doctrees . {envtmpdir}/html That’s it. (well, you may want to host the output somewhere, but I can figure that out.) Super simple. Awesome Michael #4: Snek is a cross-platform PowerShell module for integrating with Python via Chad Miars Snek is a cross-platform PowerShell module for integrating with Python. It uses the Python for .NET library to load the Python runtime directly into PowerShell. Using the dynamic language runtime, it can then invoke Python scripts and modules and return the result directly to PowerShell as managed .NET objects. Kind of funky syntax, but that’s PowerShell for you ;) Even allows for external packages installed via pip Brian #5: How to use Pandas to access databases Irina Truong, @irinatruong You can use pandas and sqlalchemy to easily slurp tables right out of your db into memory. But don’t. pandas isn’t lazy and reads everything, even the stuff you don’t need. This article has tips on how to do it right. Recommendation to use the CLI for exploring, then shift to pandas and sqlalchemy. Tips (with examples, not shown here): limit the fields to just those you care about limit the number of records with limit or by selecting only rows where a particular field is a specific value, or something. Let the database do joins, even though you can do it in pandas Estimate memory usage with small queries and .memory_usage().sum(). Tips on reading chunks and converting small int types into small pandas types instead of 64 bit types. Michael #6: ijson — Iterative JSON parser with a standard Python iterator interface Iterative JSON parser with a standard Python iterator interface Most common usage is having ijson yield native Python objects out of a JSON stream located under a prefix. 
Here’s how to process all European cities: // from: { "earth": { "europe": [ ... ] } } stream each entry in europe as item: objects = ijson.items(f, 'earth.europe.item') cities = (o for o in objects if o['type'] == 'city') for city in cities: do_something_with(city) Extras: Michael: Python decision makers webcast on January 14th, 9:30am US Pacific Guido steps down from Steering Council via Vincent POULAILLEAU GitHub Archive Program, Preserving open source software for future generations, video Python 2.7 will be removed from Homebrew, via Allan Hansen Django 3.0 released Joke: Question: "What is the best prefix for global variables?" Answer: # via shinjitsu A web developer walks into a restaurant. He immediately leaves in disgust as the restaurant was laid out in tables. via shinjitsu
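For the type hints item (Brian #1) above, here is a small sketch of the "just after basics" cases it mentions: a value that may be None, an optional parameter, and Sequence as a parameter type. The function names first_even and label are made up for illustration and do not come from Al's guide. Optional[int] is the typing shorthand for Union[int, None].

```python
from typing import Optional, Sequence, Union


def first_even(numbers: Sequence[int]) -> Optional[int]:
    """Return the first even number, or None when there is none."""
    for n in numbers:
        if n % 2 == 0:
            return n
    return None


def label(value: Union[int, str], prefix: Optional[str] = None) -> str:
    """Format a value, with an optional prefix that defaults to None."""
    text = str(value)
    return text if prefix is None else f"{prefix}{text}"


print(first_even([1, 3, 4]))
print(label(3, prefix="#"))
```

The hints do nothing at run time; a checker such as mypy has to be run over the file to get any benefit from them.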
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: Final type PEP 591 -- Adding a final qualifier to typing This PEP proposes a "final" qualifier to be added to the typing module---in the form of a final decorator and a Final type annotation---to serve three related purposes: Declaring that a method should not be overridden Declaring that a class should not be subclassed Declaring that a variable or attribute should not be reassigned Some situations where a final class or method may be useful include: A class wasn’t designed to be subclassed or a method wasn't designed to be overridden. Perhaps it would not work as expected, or be error-prone. Subclassing or overriding would make code harder to understand or maintain. For example, you may want to prevent unnecessarily tight coupling between base classes and subclasses. You want to retain the freedom to arbitrarily change the class implementation in the future, and these changes might break subclasses. # Example for a class: from typing import final @final class Base: ... class Derived(Base): # Error: Cannot inherit from final class "Base" ... And for a method: class Base: @final def foo(self) -> None: ... class Derived(Base): def foo(self) -> None: # Error: Cannot override final attribute "foo" # (previously declared in base class "Base") ... It seems to also mean const RATE: Final = 3000 class Base: DEFAULT_ID: Final = 0 RATE = 300 # Error: can't assign to final attribute Base.DEFAULT_ID = 1 # Error: can't override a final attribute Brian #2: flit 2 Michael #3: Pint via Andrew Simon Physical units and builtin unit conversion to everyday python numbers like floats. When you receive inputs in different unit systems, it can be difficult to account for that in software. Pint handles the unit conversion automatically in a wide array of contexts – you can add 2 meters and 5 inches and get the correct result without any additional work. 
The integration with numpy and pandas is seamless, and it’s made my life so much simpler overall. Units and types of measurements Think you need this? How about the Mars Climate Orbiter The MCO MIB has determined that the root cause for the loss of the MCO spacecraft was the failure to use metric units in the coding of a ground software file, “Small Forces,” used in trajectory models. Specifically, thruster performance data in English units instead of metric units was used in the software application code titled SM_FORCES (small forces). Brian #4: 8 great pytest plugins Jeff Triplett Michael #5: 11 new web frameworks via LuisCarlos Contreras Sanic [flask like] - a web server and web framework that’s written to go fast. It allows the usage of the async / await syntax added in Python 3.5 Starlette [flask like] - A lightweight ASGI framework which is ideal for building high performance asyncio services, designed to be used either as a complete framework, or as an ASGI toolkit. Masonite - A developer centric Python web framework that strives for an actual batteries included developer tool with a lot of out of the box functionality. Craft CLI is the edge here. FastAPI - A modern, high-performance, web framework for building APIs with Python 3.6+ based on standard Python type hints. Responder - Based on Starlette, Responder’s primary concept is to bring the niceties that are brought forth from both Flask and Falcon and unify them into a single framework. Molten - A minimal, extensible, fast and productive framework for building HTTP APIs with Python. Molten can automatically validate requests according to predefined schemas. Japronto - A screaming-fast, scalable, asynchronous Python 3.5+ HTTP toolkit integrated with pipelining HTTP server based on uvloop and picohttpparser. Klein [flask like] - A micro-framework for developing production-ready web services with Python. It is ‘micro’ in that it has an incredibly small API similar to Bottle and Flask. 
Quart [flask like]- A Python ASGI web microframework. It is intended to provide the easiest way to use asyncio functionality in a web context, especially with existing Flask apps. BlackSheep - An asynchronous web framework to build event based, non-blocking Python web applications. It is inspired by Flask and ASP.NET Core. BlackSheep supports automatic binding of values for request handlers, by type annotation or by conventions. Cyclone - A web server framework that implements the Tornado API as a Twisted protocol. The idea is to bridge Tornado’s elegant and straightforward API to Twisted’s Event-Loop, enabling a vast number of supported protocols. Brian #6: Raise Better Exceptions in Python Extras Michael: Naming venvs --prompt Another new course coming soon: Python for decision makers and business leaders Some random interview over at Real Python: Python Community Interview With Brian Okken Joke via Daniel Pope What's a tractor's least favorite programming language? Rust.
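One point about the PEP 591 item (Michael #1) above that is easy to miss: the "Error" comments in its examples are what a static type checker reports, not what the interpreter does. The sketch below, assuming Python 3.8 or later where typing.final and typing.Final exist, runs without raising anything. MAX_RATE is an illustrative name chosen here.

```python
from typing import Final, final

MAX_RATE: Final = 3000


class Base:
    @final
    def foo(self) -> str:
        return "base"


class Derived(Base):
    # A type checker such as mypy flags this override of a final method,
    # but Python itself does not stop it.
    def foo(self) -> str:
        return "derived"


# Reassigning a Final name is also only a type-checker error.
MAX_RATE = 300

print(Derived().foo(), MAX_RATE)
```

So final and Final only protect code that is actually run through a checker as part of development or CI.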
This episode is sponsored by DigitalOcean - pythonbytes.fm/digitalocean Brian #1: Python already replaced Excel in banking “If you wanted to prove your mettle as an entry-level banker or trader it used to be the case that you had to know all about financial modeling in Excel. Not any more. These days it's all about Python, especially on the trading floor. "Python already replaced Excel," said Matthew Hampson, deputy chief digital officer at Nomura, speaking at last Friday's Quant Conference in London. "You can already walk across the trading floor and see people writing Python code...it will become much more common in the next three to four years."” Michael #2: GitHub launches 'Security Lab' to help secure open source ecosystem At the GitHub Universe developer conference, GitHub announced the launch of a new community program called Security Lab GitHub says Security Lab founding members have found, reported, and helped fix more than 100 security flaws already. Other organizations, as well as individual security researchers, can also join. A bug bounty program with rewards of up to $3,000 is also available, to compensate bug hunters for the time they put into searching for vulnerabilities in open source projects. Bug reports must contain a CodeQL query. CodeQL is a new open source tool that GitHub released today; a semantic code analysis engine that was designed to find different versions of the same vulnerability across vast swaths of code. Starting today automated security updates are generally available and have been rolled out to every active repository with security alerts enabled. Once a security flaw is fixed, the project owner can publish the security advisory, and GitHub will warn all upstream project owners who are using vulnerable versions of the original maintainer's code. But before publishing a security advisory, project owners can also request and receive a CVE number for their project's vulnerability directly from GitHub. 
And last, but not least, GitHub also updated Token Scanning, its in-house service that can scan users' projects for API keys and tokens that have been accidentally left inside their source code. Brian #3: pybit.es now has some test challenges Uses pytest, coverage.py, and mutpy (for mutation testing) Most other challenges have tests that validate the code you write. New challenges (3 so far) have you write the tests for existing code. Tests are evaluated with both coverage.py and mutpy another mutation testing tool is mutmut, written about earlier this year by Ned Batchelder. Michael #4: pyhttptest - a command-line tool for HTTP tests over RESTful APIs via Florian Dahlitz A command-line tool for HTTP tests over RESTful APIs Tired of writing test scripts against your RESTful APIs every time? Describe HTTP request test cases in a simple, widely used format (JSON) within a file. Run one command and gain a summary report. Example { "name": "TEST: List all users", "verb": "GET", "endpoint": "users", "host": "https://github.com", "headers": { "Accept-Language": "en-US" }, "query_string": { "limit": 5 } } Brian #5: xarray suggested by Guido Imperiale xarray is a mature library that builds on top of numpy, pandas and dask to offer arrays that are n-dimensional (numpy and dask do it, but pandas doesn't) self-described and indexed (pandas does it, but numpy and dask don't) out-of-memory, multi-threaded, and cloud-distributed (dask does it, but numpy and pandas don't). Additionally, xarray can semi-transparently swap numpy with other backends, such as sparse , while retaining the same API. Michael #6: Animated SVG Terminals Florian Dahlitz termtosvg is a Unix terminal recorder written in Python that renders your command line sessions as standalone SVG animations. 
Features: Produce lightweight and clean looking animations or still frames embeddable on a project page Custom color themes, terminal UI and animation controls via user-defined SVG templates Rendering of recordings in asciicast format made with asciinema Examples: nbedos.github.io/termtosvg/pages/examples.html Extras pytest 5.3.0 released, please read changelog if you use pytest, especially if you use it with CI and depend on --junitxml, as they have changed the format to a newer version. Michael: PyCon registration is open (via Jacqueline Wilson) Facebook: Microsoft's Visual Studio Code is now our default development platform Black friday at Talk Python Training! New course coming soon: Python for the .NET developer Jokes What do you get when you put root beer in a square glass? Beer. Q: What do you call optimistic front-end developers? A: Stack half-full developers. Also, I was going to tell a version control joke, but they are only funny if you git them.
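The pyhttptest test case shown earlier in this episode is plain JSON, so it can be inspected with nothing but the standard library. The following is a minimal sketch that assumes only the field names from that example; the helper build_url is hypothetical and is not part of pyhttptest, and no request is actually sent.

```python
import json
from urllib.parse import urlencode, urljoin

# The test case from the show notes, stored as a JSON string.
TEST_CASE = """
{
  "name": "TEST: List all users",
  "verb": "GET",
  "endpoint": "users",
  "host": "https://github.com",
  "headers": {"Accept-Language": "en-US"},
  "query_string": {"limit": 5}
}
"""


def build_url(case: dict) -> str:
    """Join host, endpoint and query string into one URL (hypothetical helper)."""
    base = urljoin(case["host"].rstrip("/") + "/", case["endpoint"])
    query = urlencode(case.get("query_string", {}))
    return f"{base}?{query}" if query else base


case = json.loads(TEST_CASE)
url = build_url(case)
print(case["verb"], url)
```

Keeping the test description as data means the same file can be read by a runner, a linter, or a quick script like this one.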
This episode is sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: pydantic via Colin Sullivan Data validation and settings management using python type annotations. (We covered Cerberus, this is similar) pydantic enforces type hints at runtime, and provides user friendly errors when data is invalid. class User(pydantic.BaseModel): id: int name = 'John Doe' signup_ts: datetime = None friends: List[int] = [] external_data = { 'id': '123', 'signup_ts': '2019-06-01 12:22', 'friends': [1, 2, '3'] } user = User(**external_data) id is of type int; the annotation-only declaration tells pydantic that this field is required. Strings, bytes or floats will be coerced to ints if possible; otherwise an exception will be raised. name is inferred as a string from the provided default; because it has a default, it is not required. signup_ts is a datetime field which is not required (and takes the value None if it's not supplied). Why use it? There's no new schema definition micro-language to learn. In benchmarks pydantic is faster than all other tested libraries. Use of recursive pydantic models, typing's standard types (e.g. List, Tuple, Dict etc.) and validators allow complex data schemas to be clearly and easily defined, validated, and parsed. As well as BaseModel, pydantic provides a [dataclass](https://pydantic-docs.helpmanual.io/usage/dataclasses/) decorator which creates (almost) vanilla python dataclasses with input data parsing and validation. Brian #2: Coverage.py 5.0 beta 1 adds context support Please try out the beta, even without trying contexts, as it helps Ned Batchelder to make sure it’s as backwards compatible as possible while still adding this super cool functionality. Coverage 5.0 beta 1 announcement The changes. Measurement contexts in depth. 
Trying out contexts with pytest and pytest-cov: (venv) $ pip install coverage==5.0b1 (venv) $ pip install pytest-cov (venv) $ pytest --cov=foo --cov-context=test test_foo.py (venv) $ coverage html --show-contexts (venv) $ open htmlcov/index.html results in coverage report that has little dropdowns on the right for lines that are covered, and what context they were covered. For the example above, with pytest-cov, it shows what test caused each line to be hit. Contexts can do way more than this. One example, split up different levels of tests, to see which lines are only hit by unit tests, indicating missing higher level tests, or the opposite. The stored db could also possibly be mined to see how much overlap there is between tests, and maybe help with higher level tools to predict the harm or benefit from removing some tests. I’m excited about the future, with contexts in place. Even if you ignore contexts, please go try out the beta ASAP to make sure your old use model still works. Michael #3: PSF is seeking developers for paid contract improving pip via Brian Rutledge The Python Software Foundation Packaging Working Group is receiving funding to work on the design, implementation, and rollout of pip's next-generation dependency resolver. This project aims to complete the design, implementation, and rollout of pip's next-generation dependency resolver. Lower the barriers to installing Python software, empowering users to get a version of a package that works. It will also lower the barriers to distributing Python software, empowering developers to make their work available in an easily reusable form. Because of the size of the project, funding has been allocated to secure two contractors, a senior developer and an intermediate developer, to work on development, testing and building test infrastructure, code review, bug triage, and assisting in the rollout of necessary features. 
Total pay: Stage 1: $116,375, Stage 2: $103,700 Brian #4: dovpanda - Directions OVer PANDAs Dean Langsam “Directions are hints and tips for using pandas in an analysis environment. dovpanda is an overlay for working with pandas in an analysis environment. If you think your task is common enough, it probably is, and Pandas probably has a built-in solution. dovpanda is an overlay module that tries to understand what you are trying to do with your data, and help you find easier ways to write your code.” “The main usage of dovpanda is its hints mechanism, which is very easy and works out-of-the-box. Just import it after you import pandas, whether inside a notebook or in a console.” It’s like training wheels for pandas to help you get the most out of pandas and learn while you are doing your work. Very cool. Michael #5: removestar via PyCoders newsletter Tool to automatically replace 'import *' in Python files with explicit imports Report only mode and modify in place mode. Brian #6: pytest-quarantine: Save the list of failing tests, so that they can be automatically marked as expected failures on future test runs. Brian Rutledge Really nice email from Brian: “Hi Brian! We've met a couple times at PyCon in Cleveland. Thanks for your podcasts, and your book. I've gone from being a complete pytest newbie, to helping my company adopt it, to writing a plugin. The plugin was something I developed at work, and they let me open-source it. I wanted to share it with you as a way of saying "thank you", and because you seem to be a bit of connoisseur of pytest plugins. ;) Here it is: https://github.com/EnergySage/pytest-quarantine/” pytest has a cool feature called xfail, to allow you to mark tests you know fail. pytest-quarantine allows you to run your suite and generate a file of all failures, then use that to mark the xfails. Then you or your team can chip away at these failures until you get rid of them. 
But in the meantime, your suite can still be useful for finding new failures. And, the use of an external file to mark failures makes it so you don’t have to edit your test files to mark the tests that are xfail. Extras: MK: Our infrastructure is fully carbon neutral! Joke: A cop pulls Dr. Heisenberg over for speeding. The officer asks, "Do you know how fast you were going?" Heisenberg pauses for a moment, then answers, "No, but I know where I am.” [1] See Uncertainty principle, also called Heisenberg uncertainty principle or indeterminacy principle, statement, articulated (1927) by the German physicist Werner Heisenberg, that the position and the velocity of an object cannot both be measured exactly, at the same time, even in theory.
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guests: Dan Bader Cecil Philip Dan #1: Why You Should Use python -m pip https://snarky.ca/why-you-should-use-python-m-pip/ Cecil #2: Visual Studio Online: Web-Based IDE & Collaborative Code Editor https://visualstudio.microsoft.com/services/visual-studio-online/ Michael #3: Python Adopts a 12-month Release Cycle The long discussion on changing the Python project's release cadence has come to a conclusion: the project will now be releasing new versions on an annual basis. Described in PEP 602 The steering council thinks that having a consistent schedule every year for when we hit beta, RC, and final will help the community: Know when to start testing the beta to provide feedback Know when to expect the RC so the community can prepare their projects for the final release Know when the final release will occur to coordinate their own releases (if necessary) when the final release of Python occurs Allow core developers to more easily plan their work to make sure work lands in the release they are targeting Make sure that core developers and the community have a shorter amount of time to wait for new features to be released Dan #4: Black 19.10b0 Released, stable release coming soon https://twitter.com/llanga/status/1188968251918819329 “Black Friday” release date? https://twitter.com/llanga/status/1189145837991014402 Playground: https://black.now.sh/ Cecil #5: Navigating code on GitHub Example: https://github.com/talkpython/100daysofcode-with-python-course/blob/master/days/10-12-pytest/guess/guess.py Michael #6: lolcommits: selfies for software developers. lolcommits takes a snapshot with your webcam every time you git commit code, and archives a lolcat style image with it. git blame has never been so much fun. Infinite uses: Animate your progress through a project and watch as you age. See what you looked like when you broke the build. Keep a joint lolrepository for your entire company. 
Plugins: Lolcommits allows a growing list of plugins to perform additional work on your lolcommit image after capturing. Animate: Configure lolcommits to generate an animated GIF with each commit for extra lulz! Extras: Dan: Article & Course on Python 3.8 https://realpython.com/python38-new-features/ https://realpython.com/courses/cool-new-features-python-38/ Cecil: Twitch learning Python channel Michael: New Anvil course, free one - https://talkpython.fm/anvil PSF yearly survey is out: https://twitter.com/thepsf/status/1190004772704784385 Joke: LOLCODE
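Dan's item on python -m pip in this episode comes down to running pip through one specific interpreter, so that packages land in that interpreter's environment rather than wherever a bare pip command happens to point. A small sketch of the same idea from inside Python follows; pip_command is a hypothetical helper name, "requests" is just an example package, and nothing is installed by this snippet.

```python
import sys


def pip_command(*args: str) -> list:
    """Build a pip invocation tied to the interpreter running this script."""
    # sys.executable is the path of the current Python binary, so "-m pip"
    # targets this interpreter's environment and no other.
    return [sys.executable, "-m", "pip", *args]


cmd = pip_command("install", "requests")
print(" ".join(cmd))
```

The resulting list could be handed to subprocess.run(cmd, check=True) when an install is really wanted.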
Sponsored by Datadog: pythonbytes.fm/datadog Michael #1: Guido retires Guido van Rossum has left Dropbox and retired (post) Let’s all take a moment to say thank you. I wonder what will come next in terms of creative projects Some comments from community members (see Twitter thread) Brian #2: SeleniumBase Automated UI Testing with Selenium WebDriver and pytest. Very expressive and intuitive automation library built on top of Selenium WebDriver. method overview very readable (this is a workflow test, but still, quite readable): from seleniumbase import BaseCase class MyTestClass(BaseCase): def test_basic(self): self.open("https://xkcd.com/353/") self.assert_title("xkcd: Python") self.assert_element('img[alt="Python"]') self.click('a[rel="license"]') self.assert_text("free to copy and reuse") self.go_back() self.click("link=About") self.assert_text("xkcd.com", "h2") self.open("https://store.xkcd.com/collections/everything") self.update_text("input.search-input", "xkcd book\n") self.assert_exact_text("xkcd: volume 0", "h3") includes plugins for including screenshots in test results. supports major CI systems some cool features that I didn’t expect: user onboarding demos, assisted QA (partially automated with manual questions), support for selenium grid, lots of command line options, including headless Michael #3: Reimplementing a Solaris command in Python gained 17x performance improvement from C Postmortem by Darren Moffat Is Python slow? A result of fixing a memory allocation issue in the /usr/bin/listusers command Decided to investigate if this ancient C code could be improved by conversion to Python. The C code was largely untouched since 1988 and was around 800 lines long; it was written in an era when the number of users was fairly small and probably existed in the local files /etc/passwd or a smallish NIS server. It turns out that the algorithm to implement listusers is basically some simple set manipulation. 
Rewrite of listusers in Python 3 turned out to be roughly a 10th of the number of lines of code. But Python would be slower, right? Turns out it isn't, and in fact for some of my datasets (that had over 100,000 users in them) it was 17 times faster. A few of the comments asked about the availability of the Python version. It is the listusers command in Oracle Solaris 11.4 SRU 9 and higher. Since we ship the /usr/bin/listusers file as the Python code you can see it by just viewing the file in an editor. Note though that it is not open source and is covered by the Oracle Solaris licenses. Brian #4: 20 useful Python tips and tricks you should know I admit it, I’m capable of getting link-baited by the occasional listicle. Some great stuff, especially for people coming from other languages. Chained assignment: x = y = z = 2 Chained comparison: 2 < x <= 8 2 < x > 4 0 < x < 4 < y < 16 Multiple assignment: x, y, z = 2, 4, 8 More Advanced Multiple Assignment: x, *y, z = 2, 4, 8, 16 I’ve been using the * for unpacking a lot, especially with *_ Merge dictionaries: z = {**x, **y} Join strings: '_'.join(['u', 'v', 'w']) using list(set(something)) to remove duplicates. aggregate elements. using zip to element-wise combine two or more iterables. >>> x = [1, 2, 3] >>> y = ['a', 'b', 'c'] >>> list(zip(x, y)) [(1, 'a'), (2, 'b'), (3, 'c')] and then some other weird stuff that I don’t find that useful. Michael #5: Complexity Waterfall via Ahrem Ahreff Heavy use of wemake-python-styleguide Code smells! Use your refactoring tools and write tests. Automation enables an opportunity of “Continuous Refactoring” and “Architecture on Demand” development styles. Brian #6: Plynth Plynth is a GUI framework for building cross-platform desktop applications with HTML, CSS and Python. Plynth has integrated the standard CPython implementation with Chromium's rendering engine. 
You can run your python scripts seamlessly with HTML/CSS instead of using Javascript with modules from pip Plynth uses Chromium/Electron for its rendering. With Plynth, every Javascript library turns into a Python module. Not open source. But free for individuals, including commercial use and education. A bunch of tutorial videos that are not difficult to follow, and not long, but… not really obvious code either. Python 3.6 and 3.7 development kits available Extras: Michael: Google Is Uncovering Hundreds Of Race Conditions Within The Linux Kernel Joke: Q: What's a web developer's favorite tea? A: URL gray via Aideen Barry
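Several of the tips from Brian's "20 useful Python tips" item in this episode fit into one short script. This is a sketch with made-up values, not code from the article. Two details are worth noting: in Python 3 zip() returns a lazy iterator, so it has to be wrapped in list() to see the pairs, and a set does not keep the original order, so sorted() is used here where the notes mention list(set(...)).

```python
# Chained assignment and chained comparison.
x = y = z = 2
in_range = 0 < x <= 8

# Star unpacking: the starred name collects whatever is left over.
first, *middle, last = 2, 4, 8, 16
head, *_ = [1, 2, 3]  # keep the first item, ignore the rest

# Merging dictionaries: keys from the right-hand dict win on conflict.
a = {"u": 1, "v": 2}
b = {"v": 20, "w": 30}
merged = {**a, **b}

# Joining strings.
joined = "_".join(["u", "v", "w"])

# zip() pairs items element-wise; list() materializes the iterator.
pairs = list(zip([1, 2, 3], ["a", "b", "c"]))

# Removing duplicates; sorted() gives a predictable order.
unique = sorted(set([3, 1, 3, 2, 1]))

print(middle, merged, joined, pairs, unique)
```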
Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Bob Belderbos Brian #1: Lesser Known Coding Fonts Interesting examination of some coding fonts. Link to a great talk called Cracking the Code, by Jonathan David Ross, about coding fonts and Input. I’m trying out Input Mono right now, and quite like it. Fira code: https://github.com/tonsky/FiraCode Bob #2: Django Admin Handbook As a Django developer knowing the admin is pretty important. Free ebook of 40 or so pages, you can consume it in one evening. There are a lot of good tricks, 3 I liked: How to optimize queries in Django admin (override get_queryset) How to export CSV from Django admin (useful for data analysis in Jupyter for example) How to override save behaviour for Django admin (used this to notify users upon publishing a new exercise on our platform) Some more cool ebooks on that site, e.g. Tweetable #Python. Michael #3: Your Guide to the CPython Source Code Let’s talk about exploring the CPython code You’ll want to get the code: git clone https://github.com/python/cpython Compile the code (Anthony gives lots of steps for macOS, Windows, and Linux) Structure: cpython/ │ ├── Doc ← Source for the documentation ├── Grammar ← The computer-readable language definition ├── Include ← The C header files ├── Lib ← Standard library modules written in Python ├── Mac ← macOS support files ├── Misc ← Miscellaneous files ├── Modules ← Standard Library Modules written in C ├── Objects ← Core types and the object model ├── Parser ← The Python parser source code ├── PC ← Windows build support files ├── PCbuild ← Windows build support files for older Windows versions ├── Programs ← Source code for the python executable and other binaries ├── Python ← The CPython interpreter source code └── Tools ← Standalone tools useful for building or extending Python Some cool “hidden” goodies. For example, check out Lib/concurrent/futures/process.py, it comes with a cool ascii diagram of the process. 
Lots more covered, that we don’t have time for: The Python Interpreter Process The CPython Compiler and Execution Loop Objects in CPython The CPython Standard Library Installing a custom version Brian #4: Six Django template tags not often used in tutorials Here’s a few: {% empty %}, for use in for loops when the array is empty {% lorem [count] [method] [random] %} for automatically filling with Lorem Ipsum text. {% verbatim %} … {% endverbatim %}, stop the rendering engine from trying to parse it and replace stuff. https://hipsum.co/ Bob #5: Beautiful code snippets with Carbon Beautiful images, great for teaching Python / programming. Used by a lot of developers, nice example I spotted today. Supports typing and drag and drop, just generated this link by dropping a test module onto the canvas! Great to expand Twitter char limit (we use it to generate Python Tip images). Follow the project here, seems they now integrate with GitHub. Michael #6: Researchers find bug in Python script may have affected hundreds of studies More info via Mike Driscoll at Thousands of Scientific Papers May be Invalid Due to Misunderstanding Python In a paper published October 8, researchers at the University of Hawaii found that a programming error in a set of Python scripts commonly used for computational analysis of chemistry data returned varying results based on which operating system they were run on. Scientists did not understand that Python’s glob.glob() does not return sorted results, throwing doubt on the results of more than 150 published chemistry studies. The researchers were trying to analyze results from an experiment involving cyanobacteria and discovered significant variations in results run against the same nuclear magnetic resonance spectroscopy (NMR) data. The scripts, called the "Willoughby-Hoye" scripts after their creators, were found to return correct results on macOS Mavericks and Windows 10. But on macOS Mojave and Ubuntu, the results were off by nearly a full percent. 
The module depends on the operating system for the order in which the files are returned. And the results of the scripts' calculations are affected by the order in which the files are processed. The fix: A simple list.sort()! Williams said he hopes the paper will get scientists to pay more attention to the computational side of experiments in the future. Extras: Nov 5 is the next Python PDX West Using Big Tech Tools Working on: PyBites platform: added flake8/ black code formatting, UI enhancements. Michael: Bezos DDoS'd: Amazon Web Services' DNS systems knackered by hours-long cyber-attack PyPI Just Crossed the 200,000 Packages Threshold! (via RP) XKCD Date — via André Jaenisch, Enter https://explainxkcd.com/wiki/index.php/1425:_Tasks and learn, that it was published on 24th Sep 2014. Joke: Q: What did the Network Administrator say when they caught a nasty virus? A: It hurts when IP
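The glob ordering problem from Michael's item in this episode is easy to guard against. The sketch below is not the Willoughby-Hoye code; the *.out pattern, the file names, and the helper list_data_files are all made up for illustration. It only shows the fix the notes describe, which is sorting the result of glob.glob() before processing it.

```python
import glob
import os
import tempfile


def list_data_files(directory: str, pattern: str = "*.out") -> list:
    """Return matching paths in a deterministic (sorted) order."""
    # glob.glob() hands back names in whatever order the operating system
    # yields them, so sort explicitly before relying on the order.
    return sorted(glob.glob(os.path.join(directory, pattern)))


with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files in a scratch directory.
    for name in ("run_10.out", "run_2.out", "run_1.out"):
        open(os.path.join(tmp, name), "w").close()
    names = [os.path.basename(path) for path in list_data_files(tmp)]

print(names)
```

Note that sorted() orders strings lexicographically, so run_10.out comes before run_2.out; if numeric order matters, a key function that extracts the number is needed.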
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: Building a Python C Extension Module Tutorial, learn to use the Python API to write Python C extension modules. And: Invoke C functions from within Python Pass arguments from Python to C and parse them accordingly Raise exceptions from C code and create custom Python exceptions in C Define global constants in C and make them accessible in Python Test, package, and distribute your Python C extension module Extending Your Python Program: there may be other lesser-used system calls that are only accessible through C Steps: Writing a Python Interface in C Figure out the arguments (e.g. int fputs(const char *, FILE *) ) Implement in C: #include <Python.h> static PyObject *method_fputs(PyObject *self, PyObject *args) { char *str, *filename = NULL; int bytes_copied = -1; /* Parse arguments */ if(!PyArg_ParseTuple(args, "ss", &str, &filename)) { return NULL; } FILE *fp = fopen(filename, "w"); bytes_copied = fputs(str, fp); fclose(fp); return PyLong_FromLong(bytes_copied); } In line 2, you declare the argument types you wish to receive from your Python code. In line 6, you’ll see that PyArg_ParseTuple() copies into the char*’s. Write regular C code (fopen, fputs) Return: PyLong_FromLong() creates a PyLongObject, which represents an integer object in Python. There are a few extra functions that are necessary: write definitions of your module and the methods it contains. Before you can import your new module, you first need to build it. You can do this by using the Python package distutils. 
Brian #2: What’s New in Python 3.8 - docs.python.org We’ve already talked about the big hitters: assignment expressions, (the walrus operator) positional only parameters, (the / in the param list) f-strings support = for self-documenting expressions and debugging There are a few more goodies I wanted to quickly mention: More async: python -m asyncio launches a native async REPL More helpful warnings and messages when using is and is not to compare strings and integers and other types intended to be compared with == and != Missing the comma in stuff like [(1,2) (3,4)]. Happens all the time with parametrized testing you can do iterable unpacking in a yield or return statement x = (1, 2, 3) a, *b = x return a, *b <- this used to be a syntax error you had to do return (a, *b) New module importlib.metadata lets you access things like version numbers or dependent library required version numbers, and cool stuff like that. quite a few more goodies. I run through all my favorites on testandcode.com/91 Michael #3: UK National Cyber Security Centre (NCSC) is warning developers of the risks of sticking with Python 2.7, particularly for library writers NCSC likens companies continuing to use Python 2 past its EOL to tempting another WannaCry or Equifax incident. Equifax details: a vulnerability, dubbed CVE-2017-5638, was discovered in Apache Struts, an open source development framework for creating enterprise Java applications that Equifax, along with thousands of other websites, uses… Quote: "If you're still using 2.x, it's time to port your code to Python 3," the NCSC said. "If you continue to use unsupported modules, you are risking the security of your organisation and data, as vulnerabilities will sooner or later appear which nobody is fixing." Moreover: "If you maintain a library that other developers depend on, you may be preventing them from updating to 3," the agency added. 
"By holding other developers back, you are indirectly and likely unintentionally increasing the security risks of others.” "If migrating your code base to Python 3 is not possible, another option is to pay a commercial company to support Python 2 for you," the NCSC said. NCSC: If you don't migrate, you should expect security incidents Python's popularity makes updating code imperative: The reason the NCSC is warning companies about Python 2's impending EOL is because of the language's success. Brian #4: Pythonic News Sebastian A. Steins “A Hacker News lookalike written in Python/Django” “ powering https://news.python.sc" Cool that it’s open source, and on github Was submitted to us by Sebastian, and a few others too, so there is excitement. It’s like 6 days old and has 153 stars on github, 4 contributors, 18 forks. Fun. Michael #5: Deep Learning Workstations, Servers, Laptops, and GPU Cloud GPU-accelerated with TensorFlow, PyTorch, Keras, and more pre-installed. Just plug in and start training. Save up to 90% by moving off your current cloud and choosing Lambda. They offer: TensorBook: GPU Laptop for $2,934 Lambda Quad: 4x GPU Workstation for $21,108 (yikes!) All in: Lambda Hyperplane: 8x Tesla V100 Server, starting at $114,274 But compare to: AWS EC2: p3.8xlarge @ $12.24 per Hour => $8,935 / month Brian #6: Auto formatters for Python A comparison of autopep8, yapf, and black Auto formatters are super helpful for teams. They shut down the unproductive arguments over style and make code reviews way more pleasant. People can focus on content, not being the style police. We love black. But it might be a bit over the top for some people. Here are a couple of other alternatives. autopep8 - mostly focuses on PEP8 “autopep8 automatically formats Python code to conform to the PEP 8 style guide. It uses the pycodestyle utility to determine what parts of the code needs to be formatted. 
autopep8 is capable of fixing most of the formatting issues that can be reported by pycodestyle.” black - does more. It doesn’t have many options, but you can alter line length, can turn off string quote normalization, and you can limit or focus the files it sees. It does a cool “check that the reformatted code still produces a valid AST that is equivalent to the original,” but you can turn that off with --fast. yapf - way more customizable. Great if you want to auto format to a custom style guide. “The ultimate goal is that the code YAPF produces is as good as the code that a programmer would write if they were following the style guide. It takes away some of the drudgery of maintaining your code.” Article is cool in that it shows some sample code and how it’s changed by the different formatters. Extras: Michael: New courses coming Financial Aid Launches for PyCon US 2020! Joke: American Py Song From Eric Nelson: Math joke. “i is as complex as it gets. jk.”
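The Python 3.8 features from Brian's "What's New" item in this episode can be tried in a few lines. This is a sketch with made-up function names and values; it assumes Python 3.8 or newer, since each construct below is a syntax error on older versions.

```python
def ratio(numerator, denominator, /):
    """Positional-only parameters: callers cannot pass these by keyword."""
    return numerator / denominator


def head_and_rest(values):
    """Unpacking in a return statement without parentheses (new in 3.8)."""
    first, *rest = values
    return first, *rest


data = [1, 2, 3, 4]

# Assignment expression (the walrus operator): bind and test in one step.
if (n := len(data)) > 3:
    # The "=" specifier prints both the expression text and its value.
    message = f"{n=}"
else:
    message = "short"

result = head_and_rest((1, 2, 3))
print(message, result, ratio(6, 3))
```

Calling ratio(numerator=6, denominator=3) raises a TypeError, which is the point of the / marker in the parameter list.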
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: JPMorgan’s Athena Has 35 Million Lines of Python 2 Code, and Won’t Be Updated to Python 3 in Time With 35 million lines of Python code, the Athena trading platform is at the core of JPMorgan's business operations. A late start to migrating to Python 3 could create a security risk. Athena platform is used internally at JPMorgan for pricing, trading, risk management, and analytics, with tools for data science and machine learning. This extensive feature set utilizes over 150,000 Python modules, over 500 open source packages, and 35 million lines of Python code contributed by over 1,500 developers, according to data presented by Misha Tselman, executive director at J.P. Morgan Chase in a talk at PyData 2017. And JPMorgan is going to miss the deadline Roadmap puts "most strategic components" compatible with Python 3 by the end of Q1 2020 JPMorgan uses Continuous Delivery, with 10,000 to 15,000 production changes per week "If you maintain a library that other developers depend on," the post states, "you may be preventing them from updating to 3. By holding other developers back, you are indirectly and likely unintentionally increasing the security risks of others," adding that developers who do not publish code publicly should "consider your colleagues who may also be using your code internally." 
Brian #2: organize suggested by Ariel Barkan a Python based file management automation tool configuration is via a yml file command line tool to organize your file system examples: move all of your screenshots off of your desktop into a screenshots folder move old incomplete downloads into trash remove empty files from certain folders organize receipts and invoices into date based folders Michael #3: PEP 589 – TypedDict: Type Hints for Dictionaries With a Fixed Set of Keys Author: Jukka Lehtosalo Sponsor: Guido van Rossum Status: Accepted Version: 3.8 PEP 484 defines the type Dict[K, V] for uniform dictionaries, where each value has the same type, and arbitrary key values are supported. It doesn't properly support the common pattern where the type of a dictionary value depends on the string value of the key. Core idea: Consider creating a type to validate an arbitrary JSON document with a fixed schema Proposed syntax: from typing import TypedDict class Movie(TypedDict): name: str year: int movie: Movie = {'name': 'Blade Runner', 'year': 1982} Operations on movie can be checked by a static type checker movie['director'] = 'Ridley Scott' # Error: invalid key 'director' movie['year'] = '1982' # Error: invalid value type ("int" expected) Brian #4: gazpacho gazpacho is a web scraping library “It replaces requests and BeautifulSoup for most projects. “ “gazpacho is small, simple, fast, and consistent.” example of using gazpacho to scrape hockey data for fantasy sports. simple interface, short scripts, really beginner friendly retrieve with get, parse with Soup. I don’t think it will completely replace the other tools, but for simple get/parse/find operations, it may make for slimmer code. Note, I needed to update certificates to get this to work. see this. Michael #5: How pip install Works via PyDist What happens when you run pip install [somepackage]? First pip needs to decide which distribution of the package to install. 
This is more complex for Python than many other languages. There are 7 different kinds of distributions, but the most common these days are source distributions and binary wheels. A binary wheel is a more complex archive format, which can contain compiled C extension code. Compiling, say, numpy from source takes a long time (~4 minutes on my desktop), and it is hard for package authors to ensure that their source code will compile on other people's machines. Most packages with C extensions will build multiple wheel distributions, and pip needs to decide which if any are suitable for your computer. To find the distributions available, pip requests https://pypi.org/simple/[somepackage], which is a simple HTML page full of links, where the text of the link is the filename of the distribution. To select a distribution, pip first determines which distributions are compatible with your system and implementation of python. For binary wheels, it parses the filenames according to PEP 425, extracting the python implementation, application binary interface, and platform. All source distributions are assumed to be compatible, at least at this step in the process. Once pip has a list of compatible distributions, it sorts them by version, chooses the most recent version, and then chooses the "best" distribution for that version. It prefers binary wheels if there are any. Determining the dependencies for this distribution is not simple either. For binary wheels, the dependencies are listed in a file called METADATA. But for source distributions the dependencies are effectively whatever gets installed when you execute their setup.py script with the install command. What happens though if one of the distributions pip finds violates the requirements of another? It ignores the requirement and installs idna anyway! Next pip has to actually build and install the package. It needs to determine which library directory to install the package in: the system's, the user's, or a virtualenv's? 
Controlled by sys.prefix, which in turn is controlled by pip's executable path and the PYTHONPATH and PYTHONHOME environment variables. Finally, it moves the wheel files into the appropriate library directory, and compiles the python source files into bytecode for faster execution. Now your package is installed! Brian #6: daily pandas tricks Kevin Markham is sending out one pandas tip or trick per day via twitter. It’s been fun to watch and learn new bits. The link is a sampling of a bunch of them. Here’s just one example: Need to rename all of your columns in the same way? Use a string method: Replace spaces with _: df.columns = df.columns.str.replace(' ', '_') Make lowercase & remove trailing whitespace: df.columns = df.columns.str.lower().str.rstrip() Extras Michael: Switched to Adobe Audition Azure Databricks drops Python 2 Better Jupyter in VS Code macOS Catalina (so far so good) Jokes: via Sarcastic Pharmacist Hard to distinguish hard from easy in programming
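The filename parsing step from the "How pip install Works" item in this episode can be sketched with plain string handling. This is not pip's implementation; parse_wheel_filename and WheelTags are hypothetical names, and the numpy filename is only an illustrative input. The field layout (name, version, optional build tag, then python, ABI and platform tags) follows the wheel format specification, with the tags themselves defined by PEP 425.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # An optional build tag may sit between the version and the python tag,
    # so there are either 5 or 6 dash-separated fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The three compatibility tags are always the last three fields.
    return WheelTags(parts[0], parts[1], *parts[-3:])


tags = parse_wheel_filename("numpy-1.17.3-cp38-cp38-manylinux1_x86_64.whl")
print(tags)
```

An installer would then compare these three tags against the tags supported by the running interpreter to decide whether the wheel is compatible.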
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Python alternative to Docker Matt Layman Using Shiv, from LinkedIn Mentioned briefly in episode 114 Shiv uses zipapp, PEP 441. Execute code directly from a zip file. App code and dependencies can be bundled together. “Having one artifact eliminates the possibility of a bad interaction getting to your production system.” article includes an example of all the steps for packaging a Django app with Gunicorn. includes talking about deployment. Matt includes shoutouts to: Platform as a Service providers Manual steps to do it all. Docker Compares the process against Docker and discusses when to choose one over the other. Also an interesting read: Docker is in deep trouble Michael #2: How to support open-source software and stay sane via Jason Thomas written by Anna Nowogrodzki Releasing lab-built open-source software often involves a mountain of unforeseen work for the developers. Article opens: “On 10 April, astrophysicists announced that they had captured the first ever image of a black hole. This was exhilarating news, but none of the giddy headlines mentioned that the image would have been impossible without open-source software.” The image was created using Matplotlib, a Python library for graphing data, as well as other components of the open-source Python ecosystem. Just five days later, the US National Science Foundation (NSF) rejected a grant proposal to support that ecosystem, saying that the software lacked sufficient impact. Open-source software is widely acknowledged as crucially important in science, yet it is funded non-sustainably. “It’s sort of the difference between having insurance and having a GoFundMe when their grandma goes to the hospital,” says Anne Carpenter Challenges Scientists writing open-source software often lack formal training in software engineering. Yet poorly maintained software can waste time and effort, and hinder reproducibility. 
If your research group is planning to release open-source software, you can prepare for the support work. Obsolescence isn’t bad, she adds: knowing when to stop supporting software is an important skill. However long your software will be used for, good software-engineering practices and documentation are essential. These include continuous integration systems (such as TravisCI), version control (Git) and unit testing. To facilitate maintenance, Varoquaux recommends focusing on code readability over peak performance. Brian #3: The Hippocratic License Coraline Ada Ehmke Interesting idea to derive from MIT, but add restrictions. This license adds these restrictions: “The software may not be used by individuals, corporations, governments, or other groups for systems or activities that actively and knowingly endanger, harm, or otherwise threaten the physical, mental, economic, or general well-being of individuals or groups in violation of the United Nations Universal Declaration of Human Rights” I could see others with different restrictions, or this but more. Michael #4: MATLAB vs Python: Why and How to Make the Switch MATLAB® is widely known as a high-quality environment for any work that involves arrays, matrices, or linear algebra. I personally used it for wavelet-decomposition of real time eye measurements during cognitively intensive human workloads… That toolbox costs $2000 per user. Difference in philosophy: Closed, paid vs. open source. Since Python is available at no cost, a much broader audience can use the code you develop. Also, GNU Octave is apparently a free and open-source clone of MATLAB. Brian #5: PyperCard - Easy GUIs for All Nicholas Tollervey Came up on episode 143. Also, episode 89 of Test & Code. Really easy to quickly set up a GUI specified by a list of “Card” objects.
(different from cards project) Simple examples are choose-your-own-adventure type applications, where one button takes you to another card, and another button, a different card. However, the “next card” could be a Python function that can do anything, as long as it returns a string with the name of the next card. Lots of potential here, especially with input boxes, images, sound, and more. Super fun, but also might have business use. Michael #6: pynode Article: Bridging Node.js and Python with PyNode to Predict Home Prices Call Python code from Node.js. Define a Python method. In Node, require pynode:
    const pynode = require('@fridgerator/pynode')
Start an interpreter:
    pynode.startInterpreter()
Call the function:
    pynode.call('add', 1, 2, (err, result) => {
      if (err) return console.log('error : ', err)
      result === 3 // true
    })
Jokes The "Works on My Machine" Certification Program, get certified!
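The card idea from Brian #5, where a button leads either to a named card or to a function that returns the name of the next card, can be sketched without any GUI. This is a plain-Python illustration and not PyperCard's API: the `DECK` layout, the card names, and the `pick_ending` and `play` helpers are all hypothetical.

```python
from typing import Callable, Dict, List, Union

# A transition is either the name of the next card, or a function that
# receives the cards visited so far and returns the name of the next card.
Transition = Union[str, Callable[[List[str]], str]]


def pick_ending(visited: List[str]) -> str:
    """Hypothetical transition function: the ending depends on the path taken."""
    return "treasure" if "key_room" in visited else "locked_door"


# Hypothetical deck: card name -> {button label -> transition}.
DECK: Dict[str, Dict[str, Transition]] = {
    "start": {"Take the key": "key_room", "Walk on": "door"},
    "key_room": {"Continue": "door"},
    "door": {"Open": pick_ending},
    "treasure": {},
    "locked_door": {},
}


def play(deck: Dict[str, Dict[str, Transition]], presses: List[str],
         start: str = "start") -> List[str]:
    """Simulate a sequence of button presses and return the cards visited."""
    visited = [start]
    for label in presses:
        transition = deck[visited[-1]][label]
        # Functions are called to compute the next card; strings are used as-is.
        next_card = transition(visited) if callable(transition) else transition
        visited.append(next_card)
    return visited


with_key = play(DECK, ["Take the key", "Continue", "Open"])
without_key = play(DECK, ["Walk on", "Open"])
```

Keeping transitions as plain strings or callables means the simple cases stay declarative, while any card can still run arbitrary Python before deciding where to go next.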
Sponsored by Datadog: pythonbytes.fm/datadog Michael #1: How to Stand Out in a Python Coding Interview Real Python, by James Timmins Are tech interviews broken? Well at least we can try to succeed at them anyway You’ve made it past the phone call with the recruiter, and now it’s time to show that you know how to solve problems with actual code… Interviews aren’t just about solving problems: they’re also about showing that you can write clean production code. This means that you have a deep knowledge of Python’s built-in functionality and libraries. Things to learn Use enumerate() to iterate over both indices and values Debug problematic code with breakpoint() Format strings effectively with f-strings Sort lists with custom arguments Use generators instead of list comprehensions to conserve memory Define default values when looking up dictionary keys Count hashable objects with the collections.Counter class Use the standard library to get lists of permutations and combinations Brian #2: The Python Software Foundation has updated its Code of Conduct There’s now one code of conduct for PSF and PyCon US and other spaces sponsored by the PSF This includes some regional conferences, such as PyCascades, and some meetup groups, (ears perk up) The docs Code of Conduct Enforcement Guidelines Reporting Guidelines Do we need to care? all of us, yes. If there weren’t problems, we wouldn’t need these. attendees, yes. Know before you go. organizers, yes. Better to think about it ahead of time and have a plan than have to make up a strategy during an event if something happens. me, in particular, and Michael. Ugh. yes. our first meetup is next month. I’d like to be in line with the rest of Python. So, yep, we are going to have to talk about this and put something in place. Michael #3: The Interview Study Guide For Software Engineers A checklist on my last round of interviews that covers many of the popular topics. Warm Up With The Classics Fizz Buzz 560. 
Subarray Sum Equals K Arrays: Left Rotation Strings: Making Anagrams Nth Fibonacci Many many videos on interview topics and ideas Data Structures Algorithms Big O Notation Dynamic Programming String Manipulation System Design Operating Systems Threads Object Oriented Design Patterns SQL Fun conversation in the comments Brian #4: re-assert: “show where your regex match assertion failed” Anthony Sottile “re-assert provides a helper class to make assertions of regexes simpler.” The Matches object allows for useful pytest assertion messages. In order to get my head around it, I looked at the test code: https://raw.githubusercontent.com/asottile/re-assert/master/tests/re_assert_test.py and modified it to remove all of the with pytest.raises(AssertionError)… to actually get to see the errors and how to use it.
    def test_match_old():
>       assert re.match('foo', 'fob')
E       AssertionError: assert None
E        +  where None = <function match at 0x...>('foo', 'fob')
E        +    where <function match at 0x...> = re.match

test_re.py:8: AssertionError
____________ test_match_new ___________________

    def test_match_new():
>       assert Matches('foo') == 'fob'
E       AssertionError: assert Matches('foo') ^ == 'fob'
E         -Matches('foo')
E         -    # regex failed to match at:
E         -    #
E         -    #> fob
E         -    #    ^
E         +'fob'
Michael #5: awesome-python-typing Collection of awesome Python types, stubs, plugins, and tools to work with them. Taxonomy Static type checkers Stub packages Tools Integrations Articles Communities Related Static type checkers: mypy - Optional static typing for Python 3 and 2 (PEP 484). Stub packages: Typeshed - Collection of library stubs for Python, with static types. Tools (super category): pytest-mypy - Mypy static type checker plugin for Pytest. Articles: Typechecking Django and DRF - Full tutorial about type-checking django. Brian #6: Developer Advocacy: Frequently Asked Questions Dustin Ingram I know a handful of people who have this job title. What is it? disclaimer: Dustin is a DA at Google.
Other companies might be different What is it? “I help represent the Python community at [company]" “part of my job is to be deeply involved in the Python community.” working on projects that help Python, PyPI, packaging, etc. speaking at conferences talking to people. customers and non-customers talking to product teams being “user zero” for new products and features paying attention to places users might raise issues about products working in open source creating content for Python devs being involved in the community as a company rep representing Python in the company coordinating with other DAs Work/life? Not all DAs travel all the time. that was my main question. Talk Python episode: War Stories of the Developer Evangelists Extras: https://www.meetup.com/Python-PDX-West/ Michael: requests moves to PSF Joke: via https://twitter.com/NotGbo/status/1173667028965777410 Web Dev Merit Badges
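Several of the "Things to learn" from Michael #1 in this episode fit into a few lines. The sketch below uses only the standard library; the sample data (`words`, `prices`) is made up for illustration.

```python
from collections import Counter
from itertools import combinations, permutations

words = ["spam", "eggs", "spam", "ham"]  # made-up sample data

# enumerate() yields (index, value) pairs; f-strings format them.
indexed = [f"{i}: {w}" for i, w in enumerate(words)]

# collections.Counter tallies hashable objects.
counts = Counter(words)

# dict.get() supplies a default when the key is missing.
prices = {"spam": 2.5}
ham_price = prices.get("ham", 0.0)

# sorted() accepts a key function for custom ordering (the sort is stable).
by_length = sorted(words, key=len)

# A generator expression feeds sum() without building an intermediate list.
total_chars = sum(len(w) for w in words)

# itertools provides combinations and permutations.
pairs = list(combinations("abc", 2))
n_orderings = len(list(permutations("abc")))
```

Calling `breakpoint()` anywhere in such a script (Python 3.7+) drops into the debugger at that line, which covers the debugging tip from the same list.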
Comments (12)

Raymond Buhr

I think the methodology for calculating language popularity specifically underrepresents both R and Python. If you check out trends for the dplyr (R) or pandas (Python) data manipulation packages, both dwarf the overall language-specific searches. I wonder if that bias also partially led to the declining interest in Ruby on Rails.

Jun 25th

connor maynes

fgr Dr rhh

Jun 1st

Raj

Thanks for the kubernetes example, and overall good episode

Mar 22nd

Mian A. Shah

ypf

Jan 28th

GreatBahram

As usual, perfect!

Jan 27th

Antonio Andrade

I think you missed highlighting all the nice work of Real Python and your podcasts; these are key stuff for Python in 2018!

Dec 27th

Vignesh Anand Krishnan

The jokes are good but let brian do it. 😂

Dec 13th

GreatBahram

Congrats Python Bytes. This episode was really great 😎

Oct 27th

pyguy

Joel Grus talk can be found here: https://youtu.be/7jiPeIFXb6U

Oct 8th

Antonio Andrade

Víbora means snake in Spanish. Umm, just thinking about Python

Aug 4th

GreatBahram

It's interesting the title is Flask but you guys spoke more about Django? Kidding? Hahaha, please don't mess with us micro framework fans. Thanks

Jun 28th

Antonio Andrade

Nice, another super good Python podcast

May 20th