Contributor

66 Episodes
Loris Degioanni (@lorisdegio) joins Eric Anderson (@ericmander) to chat about Falco, the open-source runtime security tool for modern cloud infrastructures. Loris is the founder and CTO of Sysdig, and co-creator of Wireshark, the legendary open-source packet analysis tool. Today, Loris talks about all these projects and more - tune in to learn about some deep history and Loris’ predictions for the future.
Subscribe to Contributor on Substack for email notifications, and join our Slack community!
In this episode we discuss:
How Loris began working with Gerald Combs as a student in Italy
Why Loris’ teams name their products after animals
The new non-profit Wireshark Foundation
Parallel development of cloud technology and containers during Loris’ career
The little things that make open-source projects go viral
Links:
Falco
Sysdig
Wireshark
People mentioned:
Solomon Hykes (@solomonhykes)
Emre Baran (@emre) is the CEO and co-founder of Cerbos, the open-source authorization layer for implementing roles and permissions. Cerbos allows developers to decouple authorization logic from core code into its own centrally distributed component. Easier said than done, perhaps - but Cerbos is secure, intentionally simple to implement, and developer-focused.
In this episode we discuss:
The difference between authentication and authorization
Why Cerbos is language-agnostic
Authorization patterns in a single application versus a larger network
The reason most devs start out trying to do authorization themselves, and sometimes give up
How the upcoming Cerbos Cloud will empower less technical users to deploy and manage policies and logs
Links:
Cerbos
Cerbos Cloud Beta
Zanzibar: Google’s Consistent, Global Authorization System
People mentioned:
Charith Ellawala (GitHub: @charithe)
Other episodes:
Open Policy Agent with Torin Sandall
Eric Anderson (@ericmander) has a conversation with Liam Randall (@Hectaman) and Bailey Hayes (@baihay) of Cosmonic, the platform-as-a-service environment for building cloud-native applications using WebAssembly. Bailey is also on the steering committee for the Bytecode Alliance, which stewards WebAssembly. In 2021, Cosmonic donated their WebAssembly runtime, wasmCloud, to the CNCF as an open-source project. Today, Liam and Bailey trace the history of WebAssembly, and their personal paths alongside it.
In this episode we discuss:
How WebAssembly came together over the last decade to become the fourth standardized language of the web
The moments when Bailey and Liam both realized they might be changing the future of computing
Modding Microsoft Flight Simulator with Wasm modules
Liam’s thoughts on how WebAssembly will affect business models going forward
Links:
Cosmonic
WebAssembly
Bytecode Alliance
CNCF wasmCloud
Wasmtime
WAMR
Better together: A Kubernetes and Wasm case study
Spin
People mentioned:
Kevin Hoffman (@KevinHoffman)
Kelsey Hightower (@kelseyhightower)
Guy Bedford (@guybedford)
Peter Huene (@peterhuene)
Chris Aniszczyk (@cra)
Other episodes:
Envoy Proxy with Matt Klein
Suborbital with Connor Hicks
Eric Anderson (@ericmander) is joined by Milos Rusic (@rusic_milos) to discuss Haystack, the open-source NLP framework for leveraging Transformer models and building intelligent search systems. Milos and his colleagues at deepset were early contributors to Hugging Face’s Transformer models, and began building pipelines for searching large document stores. Today, Haystack is wildly popular, with an active Discord community and over 6,000 GitHub stars.
In this episode we discuss:
A deep dive into how Haystack works and its many use cases
How a customer demo with one-minute-long queries helped inspire Haystack
Marketing open-source projects vs word of mouth
NLP applications working with structured data and translating between types of data
Imagining a world where every person has their own personal ChatGPT
Links:
Haystack
deepset
Hugging Face
Notion
Other episodes:
Milvus with Frank Liu
Eric Anderson (@ericmander) talks with Artyom Keydunov (@keydunov) about Cube, the semantic layer for building data applications. Cube helps engineers bridge data warehouses and data experiences, and provides access control, security, caching, and more helpful features. The project began in open-source and has evolved quite a lot over the last few years with a ton of community support.
In this episode we discuss:
What is a semantic layer?
Coming up with the idea to open-source during a game of ping pong
Setting a ten-company-deployment goal
Using Cube to track COVID stats in lockdown
How one contributor built a GraphQL API
Links:
Cube
Superset
Metabase
Observable
Streamlit
People mentioned:
Pavel Tiunov (@paveltiunov87)
Eric Anderson (@ericmander) and Erika Hokanson (@erikawh0) remember the life of Jeff Meyerson, creator of the influential podcast Software Engineering Daily. He passed during the summer of 2022. Still, his work lives on - thousands of episodes, talks, music, a book, and a community of dedicated listeners and engineers whose lives were touched by Jeff’s dreams.
Software Engineering Daily is still running, and you can listen to new episodes right here or wherever you get your podcasts.
Links:
Software Engineering Daily
Software Engineering Radio
The Prion (SoundCloud) (Spotify)
You Are Not A Commodity
Move Fast: How Facebook Builds Software
People mentioned:
Pranay Mohan (@pranaymohan)
We’re kicking off the new year with a conversation between Eric Anderson (@ericmander), Sergei Egorov (@bsideup) and Eli Aleyner (@ealeyner). Sergei and Eli founded AtomicJar to maintain Testcontainers, the family of open-source libraries that allow developers to write and run integration tests locally, and treat them as unit tests. Testcontainers is wildly popular, with over six thousand GitHub stars (and climbing!). Tune in to find out how Sergei and Eli are helping people test their software faster, more easily, and more efficiently.
In this episode we discuss:
How Testcontainers solves the problem of confidence
The value of GitHub’s network effect
Inspiration from Amazon’s S3 “test bunny”
Consequences of Docker’s over- and under-adoption
Replicating success in other languages besides Java
Links:
Testcontainers
AtomicJar
Spring
Quarkus
Micronaut
How We Maintain Security Testing within the Software Development Life Cycle
People mentioned:
Richard North (@whichrich)
Kevin Wittek (@Kiview)
Martin Fowler (@martinfowler)
Eric Anderson (@ericmander) is joined by Nate Rush (@naterush1997) and Aaron Diamond-Reivich (@_aaronDR) to talk about Mito, the open-source spreadsheet that generates Python code for data analysts. Mito is a Python library and acts as an extension to a Jupyter Notebook. Tune in to find out how the Mito team is bridging the gap in data science between spreadsheets and programming.
In this episode we discuss:
How Nate, Aaron and Aaron’s fraternal twin brother Jake have been friends since middle school
Programming tools for spreadsheet users vs spreadsheet tools for people who are trying to become programmers
Advantages to integrating into other open-source projects
Reflecting on the hype around Python data science
Python needs for Mito’s enterprise customers
Links:
Mito
Project Jupyter
pandas
Superhuman
Streamlit
People mentioned:
Jacob Diamond-Reivich (@Jake_Stack808)
Eric Anderson (@ericmander) and Simba Khadder (@simba_khadder) explore Featureform, the “virtual” feature store platform that aims to standardize data pipelines for machine learning. Contributor is no stranger to feature stores, but Simba has a broader definition than most. Join us to learn how Featureform enables data scientists and machine learning practitioners to solve a common, but rarely addressed organizational problem.
In this episode we discuss:
How there is no standard or north star for MLOps
Why enterprise is where Featureform’s value shines
ML platform problems vs MLOps problems
Why copy/paste and Git don’t cut it
Deploying MLOps solutions that make data scientists and everyone else happy
Links:
Featureform
Terraform
Apache Spark
Feathr
Other episodes:
TensorFlow with Rajat Monga
Eric Anderson (@ericmander) hosts Ben Haynes (@benhaynes), CEO and co-founder of Directus. Directus is an open-source data platform that layers on SQL databases to provide an instant API, and includes a no-code data studio interface. Listen in to find out how Directus is aiming to democratize the modern data stack for everyone.
In this episode we discuss:
The inspiration to create an “admin interface on steroids”
Reflecting on Directus’ unusual linear growth trend
How Directus powers digital experiences, applications, and internal dev tools
Ben’s thoughts on maintaining a sustainable, premium open-source experience
Automated data processing with Directus Flows
Links:
Directus
Supabase
Other episodes:
Chef with Adam Jacob
Eric Anderson (@ericmander) chats with Toni de la Fuente (@ToniBlyx) about how he created Prowler, an open-source security tool for AWS. Toni talks about taking Prowler from a nights-and-weekends project to his current full-time job, managing a team of four. They discuss transitioning from primarily coding to primarily managing tickets and users, as well as being “client zero” and bringing the project to big companies.
In this episode we discuss:
The roadmap from open-source Prowler to Prowler Pro
Prowler’s diverse set of users
What Toni learned from quitting an earlier open-source project
The differences between Prowler and other security services for AWS
Links:
Prowler on GitHub
Prowler Pro
Verica
Black Hat
People mentioned:
Aaron Rinehart
Casey Rosenthal
Eric Anderson (@ericmander) meets legendary open-source developer Max Howell (@mxcl) to talk about tea, a decentralized protocol for remunerating the open-source ecosystem. Max is the creator of Homebrew, and he chats about his exit from the project. The conversation turns to his newest project, tea, which is an evolution of Brew, and takes inspiration from blockchain technology. They also discuss Max’s famous interview at Google and his time working for Apple.
In this episode we discuss:
Max’s experience creating Homebrew, one of the largest open-source projects ever
The utility of Web3 beyond decentralized finance
Writing a white paper for tea, “just like everyone else”
Why Max wants a global team, with people in every time zone
How tea ensures a sustainable future for open-source
Links:
Homebrew
tea.xyz
tea white paper
Bitcoin white paper
Max’s Google interview tweet
Log4j vulnerability
“Nebraska” XKCD comic
NixOS
People mentioned:
Timothy Lewis
Eric Anderson (@ericmander) and Connor Hicks (@cohix) launch into detail on Suborbital, an open-source project that allows developers to create WebAssembly projects embedded in other applications. Connor conceived of Suborbital while frustrated with the cold start problem that can impact Function-as-a-Service platforms. Today, Suborbital collaborates with companies like Microsoft on a community called Wasm Builders, dedicated to sharing and developing innovations in WebAssembly applications.
In this episode we discuss:
The three tentpoles of WebAssembly that make it a useful foundation for Suborbital
Surprising niche use cases for WebAssembly like IoT and data modeling
Open-source tools in the Suborbital ecosystem
Putting focus on building a larger Wasm Builders community
Connor’s thoughts on how WebAssembly can improve edge computing
Links:
Suborbital
WebAssembly
Suborbital Compute
Atmo
Reactr
Subo
Sat
Firecracker
Eric Anderson (@ericmander) and Frank Liu (@frankzliu) talk about Milvus, the open-source vector database built for scalable similarity search. Vector databases are built to search, index and store embeddings, a requirement for powerful AI applications. Frank is Director of Operations at Zilliz, the company that stewards the project. Tune in to find out how Milvus is the database for the AI era.
In this episode we discuss:
A crash course on embeddings and vector databases
Using Milvus for logo search, crypto predictions, drug discovery, and more
Other open-source projects at Zilliz that complement Milvus
“Embedding Everything”
How Milvus incorporates tunable consistency into its search process
Links:
Milvus
Zilliz
Towhee
Attu
Feder
Other episodes:
ClickHouse with Alexey Milovidov and Ivan Blinkov
Correction:
Milvus is based on a “shared storage” architecture, not “shared nothing.”
Eric Anderson (@ericmander) reunites with old colleagues Kenn Knowles (@KennKnowles) and Pablo Estrada (@polecitoem) for a conversation on Apache Beam, the open-source programming model for data processing. The trio once worked together at Google, and Beam was a turning point in the history of open-source there. Today, both Kenn and Pablo are members of the Beam PMC, and join the show with the inside scoop on Beam’s past, present and future.
In this episode we discuss:
Transitioning Beam to the Apache Way
How “inner source” works at Google
Thoughts on the relationship between batch processing and streaming
Some ways that community “power users” have contributed to Beam
Information on Beam Summit 2022, the first onsite summit since COVID began
The first few people to register can use code BEAM_POD_INV for a discount on tickets!
Links:
Apache Beam
Apache Spark
Apache Flink
Apache Nemo
Apache Samza
Apache Crunch
MapReduce paper
MillWheel paper
FlumeJava paper
Dataflow paper
Beam Summit 2022 Website
Other episodes:
TensorFlow with Rajat Monga
Eric Anderson (@ericmander) returns to Temporal with co-founder Maxim Fateev (@mfateev) and principal engineer Dominik Tornow (@DominikTornow). When Maxim joined us in September of 2020, the company called their project a “workflow orchestrator.” Today, Temporal has grown in popularity and usability, but the terminology around that abstraction has changed. Tune in to track the evolution of what Maxim calls a genuinely “new category of software.”
In this episode we discuss:
New features and developments in the last 2 years
The proper way to pronounce “Temporal”
How Temporal guarantees that workflow execution actually runs to completion
Describing Temporal as a new pair of glasses
Replay, Temporal’s first developer conference on August 25-26, in Seattle
Links:
Temporal
Cadence
Apache Cassandra
Replay
People mentioned:
Samar Abbas (@samarabbas77)
Other episodes:
Temporal with Maxim Fateev
Apache Cassandra with Patrick McFadin
Eric Anderson (@ericmander) interviews Avi Press (@avi_press) about Scarf, the distribution platform for open-source software that facilitates analytics and commercialization. Scarf offers a set of tools that allows founders and maintainers to understand adoption of their products, including Scarf Gateway, which provides a central access point to containers and packages. From there, open-source developers can connect with the people that rely on their work.
In this episode we discuss:
Why you can’t rely on GitHub as a source of comprehensive data about open-source software
Tracing a user’s journey interacting with a project across multiple platforms
How better observability allows maintainers to make better software
Inspiring indie maintainers to commercialize their projects
The privilege of being able to work in open-source, and how Scarf can enable a more inclusive developer community
Links:
Scarf
Tidelift
Gitcoin
OpenTeams
Aviyel
Eric Anderson (@ericmander) and Patrick Dougherty (@cpdough) talk about Rasgo, the data transformation platform for MLOps that makes generating SQL easy. The team at Rasgo recently open-sourced a package called RasgoQL that allows users to execute SQL queries against a data warehouse using Python syntax. Tune in to find out how Rasgo aims to bridge an important gap in the Modern Data Stack.
In this episode we discuss:
The advantages of offering both a low-code/no-code UI and a Python interface
"How can a data scientist, without needing full-time resources from data engineering, be somewhat self-sufficient in data prep and able to deliver those insights without a massive human capital investment needed?"
Where Rasgo fits into the world of feature stores
Why one Rasgo user took a trip to a wind farm in Texas
Eric’s predictions for the future of data prep and transformation
Links:
Rasgo
RasgoQL
DuckDB
Delta Lake
People mentioned:
Jared Parker (@jaredtparker_)
Eric Anderson (@ericmander) and Willem Pienaar (@willpienaar) talk about Feast, the open-source feature store for machine learning. Feature stores act as a bridge between models and data, and allow data scientists to ship features into production without the need for engineers. Willem co-created Feast at Gojek, and later teamed up with the folks at Tecton to back the project.
In this episode we discuss:
The value of feature stores in MLOps
What happens when you open-source too early
Why most open-source code has nothing to hide
Bringing an open-source project to an existing company
Good and bad use cases for a feature store
Links:
Feast
Tecton
Turing
Merlin
Kubeflow
apply() Conference
People mentioned:
Mike Del Balso
Kevin Stumpf (@kevinmstumpf)
Ajey Gore (@AjeyGore)
Demetrios Brinkmann (@Dpbrinkm)
Wes McKinney (@wesmckinn)
Other episodes:
Flyte with Ketan Umare
Great Expectations with Abe Gong and Kyle Eaton
Eric Anderson (@ericmander) and Ketan Umare (@ketanumare) discuss Flyte, the open-source workflow automation platform for large-scale machine learning and data use cases. Ketan is a former engineer at Lyft, where he created Flyte to help models in Pricing, Locations, ETA, and more. Today, the project allows machine learning developers everywhere to bring their ideas from conception to production.
In this episode we discuss:
How Flyte combines compute with parts of a workflow engine in a way that is best for the user
The importance of reliable fares and ETA predictions at a ride-sharing app
A progenitor to Flyte called “Better Airflow”
Ketan’s innovative approach to bringing typing to machine learning workloads
Why Flyte landed at the Linux Foundation
Links:
Flyte
Union.ai
Apache Airflow
Kubeflow
Luigi
MLTwist
Other episodes:
Great Expectations with Abe Gong and Kyle Eaton
Envoy Proxy with Matt Klein