Make it Work

Tech infrastructure that gets us excited. Conversations & screen sharing. 🔧 💻

Pairing-up on a CDN PURGE with Elixir

Listen to the full pairing session for pull request #549. The focus is on replacing an existing Fastly implementation with Jerod's Pipedream, which is built on top of the open-source Varnish HTTP Cache. We cover the initial problem, the proposed solution, the implementation details, and the testing process.

The process begins with a pull request that, for the sake of rapid feedback, is set up to automatically deploy to new production. This allows for real-time testing in a production setting without affecting the actual production traffic. The new production - changelog-2025-05-05 - serves as a production replica for testing the new PURGE functionality.

To understand how the PURGE works, we first examine the cache headers of a request. The cache-status header reveals whether a request was a hit, a miss, or stale. A stale status indicates that the cached content has expired but is still being served while a fresh version is fetched in the background. The goal of the new system is to explicitly purge the cache, ensuring that users always get the latest content.

A manual purge is performed using a PURGE request with curl, which demonstrates how a single instance can be cleared. The real challenge lies in purging all CDN instances globally. This requires a mechanism to discover all the instances and send a purge request to each one.

The existing solution for purging all instances is a bash one-liner that uses dig to perform a DNS lookup, retrieves the IP addresses of all the CDN instances, and then loops through them, sending a curl purge request to each. The task is to replicate this logic in Elixir.

The first step is to perform the DNS lookup in Elixir. A new module uses Erlang's :inet_res module to resolve the IPv6 addresses of the CDN instances. This produces the list of all instances that need to be purged.

Next, a new Pipedream module is created to handle the purging logic. It is designed as a drop-in replacement for the existing Fastly module, with the same interface, allowing for a seamless transition. The core of this module is a purge function that takes a URL, retrieves the list of CDN instances, and then sends a purge request to each instance.

The Pipedream module is implemented using Test-Driven Development (TDD): write a failing test first, then write the code that makes it pass. This ensures that the code is correct and behaves as expected.

The first test verifies that a purge request is sent to a single CDN instance. This involves mocking the DNS lookup to return a single IP address and then asserting that an HTTP request is made to that address. The test is then extended to handle multiple instances, ensuring that the looping logic is correct.

A key challenge in testing is the deconstruction of the URL. The purge/1 function receives a full URL, but the purge request needs to be sent to a specific IP address with the original host as a header. This requires parsing the URL to extract the host and the path. The sketches below show one way these pieces could fit together.
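Here is a minimal Elixir sketch of the overall shape, not the actual pull request code: the CDN.Purge module name, the cdn.internal hostname, and the Req-based HTTP client are all illustrative assumptions.

```elixir
defmodule CDN.Purge do
  @moduledoc """
  Illustrative sketch, not the PR #549 code: discover every CDN
  instance via DNS, then send each one a PURGE request.
  """

  # Hypothetical hostname whose AAAA records list every CDN instance.
  @instances_host ~c"cdn.internal"

  @doc "Resolve the IPv6 address of every CDN instance."
  def instances do
    for addr <- :inet_res.lookup(@instances_host, :in, :aaaa) do
      addr |> :inet.ntoa() |> to_string()
    end
  end

  @doc """
  Send a PURGE for `url` to every instance. The instance list and the
  HTTP client are injectable so tests can swap them out.
  """
  def purge(url, opts \\ []) do
    instances = Keyword.get_lazy(opts, :instances, &instances/0)
    http = Keyword.get(opts, :http, &Req.request/1)

    uri = URI.parse(url)
    path = uri.path || "/"

    # Target each instance by IP; keep the original host in the Host
    # header so Varnish can locate the cached object to drop.
    for ip <- instances do
      http.(method: :purge, url: "http://[#{ip}]#{path}", headers: [{"host", uri.host}])
    end
  end
end
```

The dig one-liner it replaces does the same two steps - resolve the AAAA records, then fan out PURGE requests - just in bash.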
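And a matching ExUnit sketch of the first tests described above; it uses plain function injection rather than whichever mocking approach the session actually settles on:

```elixir
defmodule CDN.PurgeTest do
  use ExUnit.Case, async: true

  test "sends one PURGE per CDN instance, preserving the original host" do
    parent = self()

    # Fake HTTP client: records each request instead of sending it.
    http = fn opts ->
      send(parent, {:purged, opts[:url], opts[:headers]})
      {:ok, %{status: 200}}
    end

    CDN.Purge.purge("https://changelog.com/feed",
      instances: ["fdaa::1", "fdaa::2"],
      http: http
    )

    assert_received {:purged, "http://[fdaa::1]/feed", [{"host", "changelog.com"}]}
    assert_received {:purged, "http://[fdaa::2]/feed", [{"host", "changelog.com"}]}
  end
end
```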
Once the unit tests are passing, the new purge functionality is deployed to the new production environment for real-world testing. This allows for verification of the entire workflow, from triggering a purge to observing the cache status of subsequent requests. The testing process involves editing an episode, which triggers a purge, and then using curl to check the cache headers: a miss indicates that the purge was successful. The tests are performed on both the application and the static assets, ensuring that all backends are purged correctly.

With the core functionality in place, the next steps involve refining the implementation and adding more features:

- Configuration: moving hardcoded values, such as the application name and port, to a configuration file.
- Error handling: implementing robust error handling for DNS lookups and HTTP requests.
- Security: adding a token to the purge request to prevent unauthorized purges.
- Observability: using tools like Honeycomb.io to monitor the purge requests and ensure that they are being processed correctly.

By following a methodical approach that combines TDD, a staging environment, and careful consideration of the implementation details, it is possible to build a robust and reliable global CDN purge system with Elixir. This not only improves the performance and reliability of the CDN but also provides a solid foundation for future enhancements.

🍿 This entire conversation is available to Make it Work members as full videos served from the CDN, and also a Jellyfin media server: makeitwork.tv/cdn-purge-with-elixir 👈 Scroll to the bottom of the page for CDN & media server info

LINKS

🐙 github.com/thechangelog/changelog.com pull request #549
🐙 github.com/thechangelog/pipely

EPISODE CHAPTERS

(00:00) - The Goal
(03:54) - The Elixir Way
(07:18) - Pipedream vs Pipely
(09:26) - Copy, paste & start TDD-ing
(13:36) - TDD talk
(17:08) - Let's TDD!
(24:45) - Does it work?
(30:24) - It works!
(33:15) - Should we test DNS failures?
(35:02) - Let's test the HTTP part
(37:15) - All tests passing
(37:53) - Let's test this in production
(40:29) - Let's check if it's working as expected
(41:43) - Does purging the static backend work?
(43:54) - Next steps
(47:35) - Let's look at requests in Honeycomb.io
(51:56) - How does it feel to be this close to finishing this?
(52:45) - Remember how this started?

06-29
56:00

I LOVE TLS

In the world of web infrastructure, what starts as a simple goal can often lead you down a fascinating rabbit hole of history, philosophy, and clever engineering. This is the story of our journey to build a simple, single-purpose, open-source CDN for changelog.com, and the one major hurdle that stood in our way: Varnish, our HTTP caching layer of choice, doesn't support TLS backends.

Enter Nabeel Sulieman, a shipit.show guest who had previously introduced us to KCert, a simpler alternative to cert-manager. We knew if anyone could help us solve this TLS conundrum, it was him. After a couple of false starts, we finally recorded the working solution. As Nabeel aptly put it: "Third time is the charm."

🍿 This entire conversation is available to Make it Work members as full videos served from the CDN, and also a Jellyfin media server: makeitwork.tv/i-love-tls 👈 Scroll to the bottom of the page for CDN & media server info

LINKS

🐙 github.com/thechangelog/pipely pull request #8
🐙 github.com/nabsul/tls-exterminator
👀 Varnish - Why no SSL?
🚲 PHK's Bikeshed
🏡 bikeshed.org

EPISODE CHAPTERS

(00:00) - How this started
(02:05) - What makes TLS & SSL interesting for you?
(05:58) - Disabling issues & pull requests
(08:19) - What is Pipely?
(14:03) - Why no SSL? (in Varnish)
(15:36) - Who is Poul-Henning Kamp?
(17:30) - The Bikeshed
(19:46) - Pipely pull request #8
(23:56) - Dagger instead of Docker
(29:41) - pipely Dagger module
(36:52) - What is saswqatch?
(40:44) - ghcr.io/gerhard/sysadmin
(43:45) - Let's benchmark!
(51:52) - What happens next?
(01:00:17) - Wrap-up

05-29
01:03:08

DevOps Sushi

In this episode, we sit down for a deep-dive conversation with Mischa van den Burg, a former nurse who made the leap into the world of DevOps. We explore the practical realities, technical challenges, and hard-won wisdom gained from building and managing modern infrastructure. This isn't your typical high-level overview; we get into the weeds on everything from homelab setups to the nuances of GitOps tooling.

We start by exploring the journey from nursing to DevOps - the why behind the career change (00:54) - focusing on the transferable skills and the mindset required to succeed in a field defined by continuous learning and complex problem-solving.

What are the most engaging aspects of DevOps (04:49)? We discuss the satisfaction of automating complex workflows and building resilient systems. Conversely, we also tackle the hardest parts of the job (05:48), moving beyond the cliché "it's the people" to discuss the genuine technical and architectural hurdles faced in production environments.

We move past the buzzword and into the practical application of "breaking down silos" (07:36). The conversation details concrete strategies for fostering collaboration between development and operations, emphasising shared ownership, transparent communication, and the cultural shift required to make it work.

We discuss critical lessons learned from the field (13:07), including the importance of simplicity, the dangers of over-engineering, and the necessity of building systems that are as easy to decommission as they are to deploy.

The heart of the conversation tackles an important perspective: why choose Kubernetes for a homelab (23:06)? We break down the decision-making process, comparing it to alternatives like Nomad and Docker Swarm. The discussion covers the benefits of using a consistent, API-driven environment for both personal projects and professional development. We also touch on the hardest Talos OS issue encountered (36:17), providing a specific, real-world example of troubleshooting in an immutable infrastructure environment. Two of Everything & No in-place upgrades are important pillars of this mindset, and we cover them both (41:14). We then pivot to a practical comparison of GitOps tools, detailing the migration from ArgoCD to Flux (46:50) and the specific technical reasons that motivated the change.

We conclude (50:40) by reflecting on the core principles of DevOps and platform engineering, emphasising the human element and the ultimate goal of delivering value, not just managing technology.

🍿 This entire conversation, as well as the screen sharing part, is available to Make it Work members as full videos served from the CDN, and also a Jellyfin media server:

DevOps Sushi 1 - conversational part
DevOps Sushi 2 - screen sharing part

Scroll to the bottom of those pages 👆 for CDN & media server info

LINKS

🍣 Jiro Dreams of Sushi
✍️ I'm In Love with my Work: Lessons from a Japanese Sushi Master
🎬 Why I Use Kubernetes For My Homelab
🐙 Mischa's homelab GitHub repository
🎁 Mischa's Free DevOps Community
🎓 KubeCraft DevOps School

EPISODE CHAPTERS

(00:00) - Intro
(00:54) - From Nurse to DevOps Engineer - Why?
(04:49) - What are the fun DevOps things?
(05:48) - Hardest part in DevOps
(07:36) - What does breaking down silos mean to you?
(13:07) - Hard earned lessons that are worth sharing
(17:44) - The Bear that Dreams of DevOps
(23:06) - Why I use Kubernetes for my Homelab?
(29:04) - Your recommendation for someone starting today
(36:17) - Hardest Talos issue that you've hit
(41:14) - No in-place upgrades
(46:50) - From ArgoCD to Flux
(50:40) - Remembering what's important

04-29
58:39

Fast Infrastructure

Hugo Santos, founder & CEO of Namespace Labs, joins us today to share his passion for fast infrastructure. From childhood stories & dial-up modem phone line wiring experiences, we get to speed-testing Hugo's current home internet connection: 25 gigabit FTTP.

We shift focus to Namespace, and talk about how it evolved from software-defined storage to building an application platform that starts Kubernetes clusters in seconds. The underlying infrastructure is fast, custom built, and is able to:

- Spin up thousands of isolated, virtual machine-based Kubernetes clusters
- Run millions of jobs concurrently
- Control everything from CPU/RAM allocation to networking setup
- Deliver exceptionally low latency at high concurrency

A significant portion of the conversation centres on a major service degradation Namespace experienced in October 2024. Hugo shares the full story, including:

- How a hardware delivery delay combined with network issues from a third-party provider created problems
- The difficult decision to rebuild the network setup rather than depend on unreliable components
- The emotional toll of not meeting self-imposed high standards despite working around the clock
- The surprising customer loyalty, with no customers leaving despite an impact on their build system

Hugo emphasizes taking full responsibility for this incident: "That's on us. We decide which companies we work with..."

The episode concludes with Hugo sharing his philosophy on excellence: "I find that it's usually some kind of unrelenting curiosity that really propels people beyond just being good to being excellent... When we approach how we build our products, it's with that same level of unrelenting curiosity and willingness to break through and change things."

🍿 This entire conversation, including all three YouTube videos, is available for members only as a 1h+ long movie at makeitwork.tv/fast-infrastructure

LINKS

Post mortem: Oct 22, 2024 outage
🐙 namespacelabs/foundation
Google's Boq (mention)
🎬 Open-source application platform inspired by Google's Boq
🎬 Why is this 25 gigabit home internet slow?
🎬 Remote Docker build faster than local?

EPISODE CHAPTERS

(00:33) - Weekend projects
(03:16) - Love for all things infrastructure
(09:58) - Hugo's 25 gigabit home internet connection
(13:33) - How does this love for infrastructure translate to Namespace.so?
(15:28) - What does it mean for a Kubernetes cluster to spin up fast?
(20:24) - What does a job mean in infrastructure terms?
(23:12) - Let's talk about your last major outage
(37:15) - What does Namespace.so look like in practice?
(39:51) - Namespace Foundation - Open-source Kubernetes app platform
(40:54) - Complex preview scenarios
(42:37) - One last thought

02-28
45:16

Keep Alert Chaos in Check

Today we talk with Matvey Kukuy and Tal Borenstein, co-founders of Keep, a startup focused on helping companies manage and make sense of their alert systems. The discussion comes three years after Matvey's previous appearance - https://shipit.show/36 - where he talked about Grafana Labs' acquisition of his previous startup Amixr (now Grafana OnCall).

Keep tackles a significant challenge in modern tech infrastructure: managing the overwhelming volume of alerts that companies receive from their various monitoring systems. Some enterprises deal with up to 70,000 alerts daily, making it crucial to identify which ones represent actual incidents requiring attention.

We explore real-world examples of major incidents, including the significant CrowdStrike outage in July 2024 that caused widespread system crashes and resulted in an estimated $10 billion in worldwide damages. This incident highlighted how critical it is to quickly identify and respond to serious issues among numerous alerts. Matvey also tells us about his most black swan experience.

The episode concludes with a hint that some of Keep's AI features may eventually be released as open source once they're sufficiently polished.

LINKS

🎧 Keep on-call simple
CrowdStrike - Wikipedia
🎬 The Black Swan Theory
Keep Playground
Show HN: Keep - GitHub Actions for your monitoring tools

EPISODE CHAPTERS

(00:00) - What is new after three years?
(02:58) - Take us through the last memorable incident
(07:16) - My most black swan
(08:50) - How would Keep have made the CrowdStrike experience different?
(12:38) - How do companies end up in that place?
(15:29) - Keep name origin
(17:40) - Why would someone pick Keep?
(23:22) - Let's think about our use case
(25:03) - Demo ends
(28:21) - Reporting capabilities?
(30:25) - Deploying & running Keep
(33:12) - 2025 for Keep
(38:50) - Until next time

01-26
41:28

Let's build a CDN - Part 2

This is a follow-up to Let's build a CDN - Part 1. A new friend joins us. We talk about the high-level picture, including why Varnish and why we are doing this in the first place. We go through the plan for this session, and then just make it happen. The video in the show notes captures most of this pairing session.

If you enjoyed this podcast and the YouTube video, you can now watch the full movie in 4K on 📺 makeitwork.tv. Offline download is available.

LINKS

Hypertext Transfer Protocol (HTTP) Field Name Registry
RFC 9211: The Cache-Status HTTP Response Header Field
libvmod-dynamic

EPISODE CHAPTERS

(00:00) - Who is James?
(01:15) - Who is Matt?
(02:26) - Why Varnish?
(06:01) - Would you still choose Varnish today?
(10:10) - Did you do a typo?
(11:04) - Why are we doing this?
(17:21) - Where did we stop in part 1?
(21:40) - What are we trying to achieve today?
(24:03) - Outro

12-16
24:55

Move fast & break nothing

This is the audio version of 🎬 Ninjastructure - Move fast & break nothing.

Matias Pan, a professional maté drinker & Senior Software Engineer at Dagger, shows us an approach to Infrastructure as Code built with Pulumi. We look at Go code, discuss procedural (imperative) vs. declarative styles, spend some time on state management & introduce the concept of Ninjas in the context of infrastructure: move fast & break nothing.

In the second half, Matias uses diagrams to talk through different ideas for rolling this out into production. Which of the two approaches would you choose?

LINKS

Testing Pulumi programs

EPISODE CHAPTERS

(00:00) - Why Pulumi instead of Terraform?
(02:40) - Procedural or declarative?
(07:19) - What is Ninjastructure?
(08:47) - First thing that gets provisioned in an AWS account
(11:18) - How does the network module work?
(14:29) - Biggest advantage to using Pulumi over Terraform
(17:02) - Stacks = different environments
(18:20) - Where is state stored?
(20:18) - Where did you choose to store the state?
(21:46) - How to use this in production?
(24:32) - The GitOps approach
(29:17) - Outro

10-26
30:25

TalosCon 2024

We have 3 conversations from TalosCon 2024:

1. Vincent Behar & Louis Fradin from Ubisoft tell us how they are building the next generation of game servers on Kubernetes. Recorded in a coffee shop.
2. We catch up with David Flanagan on the AI stack that he had success with in the context of rawkode.academy. David also tells us the full story behind his office burning down earlier this year. Recorded in the hallway track.
3. As for the last conversation, Gerhard finally gets together with Justin Garrison in person. They talk about TalosCon, some of the reasons behind users migrating off Cloud, and why Kubernetes & Talos hold a special place in their minds. Recorded in the workshop cinema room.

LINKS

🎬 25,000 servers at Ubisoft - Vincent Behar & Louis Fradin - TalosCon 2024
Agones is a library for hosting, running and scaling dedicated game servers on Kubernetes
🎬 Managing Talos with CUElang - David Flanagan - TalosCon 2024
Xiu is a simple, high performance and secure live media server written in pure Rust
🎬 From Homelab to Production - Gerhard Lazu - TalosCon 2024

EPISODE CHAPTERS

(00:00) - Intro
(00:52) - Vincent + Louis: Cinema conference talk
(02:09) - Vincent + Louis: What do you do?
(03:06) - Vincent + Louis: How do you split work?
(04:58) - Vincent + Louis: Game servers on Kubernetes
(08:07) - Vincent + Louis: What made you choose Omni & Talos?
(11:14) - Vincent + Louis: What could be better about them?
(12:58) - Vincent + Louis: Tell us about your talk
(16:50) - Vincent + Louis: What if Omni didn't exist?
(18:11) - Vincent + Louis: Last takeaway for the listeners
(18:53) - David: What is your AI stack for creating content?
(20:31) - David: Can AI guide me through running OCR on a video?
(21:18) - David: Which AI tools worked best for you?
(23:09) - David: Any nice AI tools which are worth mentioning?
(24:20) - David: My office went on fire in March
(26:13) - David: Which Linux distro do you use?
(27:18) - David: The extended version behind the office fire
(30:37) - David: What are you looking forward to?
(33:07) - David: What tech stack runs rawkode.academy?
(38:44) - Justin: Finally meeting in person!
(39:13) - Justin: What was your contribution to TalosCon 2024?
(41:21) - Justin: What would you improve for next time?
(43:59) - Justin: What did you love about this conference?
(46:00) - Justin: Help us visualize the venue
(47:16) - Justin: What are you thinking for the next TalosCon?
(49:22) - Justin: What is most interesting for you in Talos & Omni?
(55:25) - Justin: What is missing?
(01:00:25) - Justin: How do you see the growing discontent with the Cloud & Kubernetes?
(01:07:55) - Justin: What are your takeaways from TalosCon 2024?

09-28
01:11:48

Access your Kubernetes pods anywhere

How does Michal Kuratczyk, Staff Software Engineer at RabbitMQ, access Kubernetes workloads securely, from anywhere? Regardless of whether it's a Google Kubernetes Engine (GKE) cluster or Kubernetes in Docker (KinD), Tailscale is a simple solution for this particular use case. It also makes it easy to share private services with all devices on a tailnet, including with friends who want to access them on a smartphone.

Watch the demo: 🎬 Access your Kubernetes pods anywhere

If you want to watch the full, 32-minute video, go to 🎁 https://makeitwork.gerhard.io

LINKS

🎬 Access your Kubernetes pods anywhere
Tailscale Kubernetes Operator
RabbitMQ Cluster Kubernetes Operator
🎬 TGIR S01E07: How to monitor RabbitMQ?
🗂️ Observe and Understand RabbitMQ - RabbitMQ Summit 2019
🎬 Observe and understand RabbitMQ - RabbitMQ Summit 2019

EPISODE CHAPTERS

(00:00) - INTRO
(05:12) - DEMO STARTS
(06:11) - RabbitMQ in Kubernetes
(07:32) - Tailscale in Kubernetes
(11:59) - Magic DNS
(13:31) - Let me connect to it
(15:33) - Is this the last RabbitMQ 3 minor?
(17:12) - An alternative way to expose a service
(19:11) - Works on any tailnet device
(22:04) - How do we continue?
(23:26) - Have you tried upgrading the operator?
(24:23) - Can we try it?
(25:43) - DEMO ENDS
(25:54) - Exit nodes & subnet routers
(28:50) - OUTRO

08-12
33:20

Modern CI/CD - Part 1

What does it look like to build a modern CI/CD pipeline from scratch in 2024? While many of you would pick GitHub Actions and be done with it, how do you run it locally? And what do you need to do to get caching to work?

Tom Chauveau joins us to help Alex Sims build a modern CI/CD pipeline from scratch. We start with a Remix app, write the CI/CD pipeline in TypeScript, and get it working locally. While we don't finish, this is a great start (according to Alex).

This was recorded in January 2024, just as Dagger was preparing to launch Functions in the v0.10 release. While many things have improved in Dagger since then, the excitement & the joy of approaching CI/CD with this mindset have remained the same.

LINKS

🎬 Modern CI/CD from Scratch (using Dagger TypeScript Modules)
🎉 Introducing Dagger Functions (a.k.a. Dagger Modules)
🌌 Daggerverse

EPISODE CHAPTERS

(00:47) - Intro
(01:35) - Current CI/CD pipeline
(03:40) - Why not a single pipeline stage?
(04:29) - Dagger expectations
(05:18) - Thinking of retiring GitHub Actions
(05:48) - Why the GitHub Actions & Jenkins split?
(06:46) - TypeScript in Dagger Modules
(08:40) - Modules extend the Engine API
(09:45) - Plan for today
(10:57) - Pairing session conclusions
(12:11) - Is it faster?
(13:10) - Re-using the cache between runs
(14:50) - Key takeaways
(19:04) - What comes next?
(22:43) - Not if you are using Jenkins
(23:33) - Thank you

07-07
24:30

Let's build a CDN - Part 1

This started as a conversation between James A Rosen & Gerhard in August 2023. Several months later, it evolved into a few epic pairing sessions captured in these GitHub threads:

thechangelog#480 (reply in thread)
thechangelog#486

The last pairing session eventually led to 🎧 Kaizen! Should we build a CDN? This is the follow-up to that. How far did we get in 1 hour?

LINKS

The 5-hour CDN
varnish - Docker Official Image
Introduction to Varnish
Magento2 Varnish config
Magento Internals: Cache Purging and Cache Tags
Varnish modules

EPISODE CHAPTERS

(00:00) - Intro
(02:08) - The 5-hour CDN
(03:44) - Varnish container image
(05:00) - Varnish container image command
(06:31) - Local-friendly Varnish container image
(06:44) - Varnish command-line options
(08:30) - Varnish parameters
(09:45) - Experimenting with Varnish locally
(12:36) - Varnish purging
(15:22) - Backend fetch failed
(16:20) - Varnish debug mode & logs
(17:29) - Why can't we query the backend?
(21:08) - Why is the backend sick?
(22:49) - That's the problem!

05-27
24:10

KubeCon EU 2024

For our 4th episode, we have four conversations from KubeCon EU 2024. We talk to Jesse Suen about Argo CD & Kargo, Solomon Hykes shares the next evolution of Dagger, and Justin Cormack dives into Docker & AI. We also catch up with Frederic Branczyk & Thor Hansen on the latest Parca & Polar Signals Cloud updates since our last conversation.

Each conversation has a video version too:

Jesse Suen: 🎬 GitOps & ClickOps beyond Kubernetes
Solomon Hykes: 🎬 Pipelines as Functions
Justin Cormack: 🎬 Works on my Computer
Frederic Branczyk & Thor Hansen: 🎬 always BPF

LINKS

1. Jesse Suen
- What's New in Kargo v0.5.0
- 🎬 Navigating Multi-stage Deployment Pipelines via the GitOps Approach
2. Solomon Hykes
- Introducing Dagger Functions
- A Keynote heard around the world - KubeCon EU 2024 Recap
- 🎬 Local & open-source AI Developer Meetup
3. Justin Cormack
- AI Trends Report 2024: AI's Growing Role in Software Development
- Building a Video Analysis and Transcription Chatbot with the GenAI Stack
4. Frederic Branczyk & Thor Hansen
- Correlating Tracing with Profiling using eBPF

LET'S MAKE IT BETTER

If you enjoyed this episode, I would appreciate your feedback on Apple Podcasts or Spotify. If there is something that would have made it better for you, let me know: makeitwork@gerhard.io

EPISODE CHAPTERS

(00:00) - Intro
(00:39) - Jesse Suen (JS)
(00:56) - JS: Hardest ArgoCD question that you got today
(01:54) - JS: Rendered YAML branches
(04:06) - JS: What is top of your mind?
(06:12) - JS: Kargo beyond Kubernetes
(08:20) - JS: Trusting Kargo with production
(09:49) - JS: GitOps for leadership, UIs for app devs
(12:11) - JS: How is this KubeCon different?
(12:55) - JS: Anything that you will do different after this KubeCon?
(14:58) - Solomon Hykes (SH)
(15:10) - SH: What are you most excited about?
(16:12) - SH: What is different about functions this time?
(16:40) - SH: What makes functions fun for you?
(18:01) - SH: Anything significant that happened at this KubeCon?
(19:38) - SH: Thoughts on Dagger in production
(20:21) - SH: What does Dagger 1.0 mean?
(21:28) - SH: Asks for the Dagger Community
(23:04) - SH: How do Dagger SDKs work with Modules?
(25:02) - SH: Thoughts on the tech industry
(27:19) - Justin Cormack (JC)
(27:35) - JC: Docker & AI
(32:14) - JC: Docker Build Cloud
(35:30) - JC: Web Assembly & WASM
(39:37) - JC: KubeCon Community
(42:01) - Frederic Branczyk (FB) & Thor Hansen (TH)
(42:23) - TH: Excited to announce Polar Signals Cloud
(42:47) - FB & TH: Most exciting feature since launch
(45:24) - FB & TH: How is this KubeCon different?
(47:14) - TH & FB: What are you going to do different after this KubeCon?
(49:06) - FB & TH: Plans for next KubeCon?
(50:24) - FB & TH: Anything apart from AI that is exciting?
(51:25) - TH & FB: Any hot takes?
(52:12) - Outro

04-30
52:47

80ms response SLO

Alex Sims, Solutions Architect & Sr. Software Engineer at James and James Fulfilment, talks about their journey to an 80ms response SLO with PHP & React.

Alex shares how they optimised API performance, specifically highlighting improvements made by altering interactions with Amazon S3 and Redis. Key points include the transition from synchronous to asynchronous S3 processes, the impact of Amazon's SLO on write speed, and the significant runtime reduction achieved through JIT compilation in PHP 8. We conclude with insights into decision-making in technology architecture, emphasising the balance between choosing cutting-edge technology and the existing skill set of the team.

🎬 View the video part of this episode at 80ms response SLO with PHP & React
🎁 Access the audio & video as a single conversation at makeitwork.gerhard.io

EPISODE CHAPTERS

(00:00) - Introduction
(00:50) - 2023: A Year of Productive Chaos at James & James
(02:43) - Alex's Journey throughout our Conversations
(03:33) - What does James & James do?
(04:29) - What does Alex do at James & James?
(05:33) - Technical Challenges in 2023
(06:37) - Who is James?
(08:21) - Why do you - Alex - do what you do?
(10:52) - 2023 Highlights
(16:22) - Where does Kubernetes fit?
(18:03) - LESSON 1: Be aware of the different EC2 node type behaviour
(21:05) - instances.vantage.sh
(22:46) - LESSON 2: Understand the time cost of AWS S3 writes
(24:19) - LESSON 3: Connecting to Redis is expensive
(25:58) - Be careful when mixing persistent connections and transactions in Redis
(26:41) - LESSON 4: Always check for SELECT *
(28:15) - Lessons recap
(29:35) - OODA
(30:24) - SCREEN SHARING
(31:02) - Wrap-up
(35:58) - Planning for the next conversation

02-29
37:59

Automation Engine

Today we delve into BuildKit and Dagger, focusing on their significance in the development and deployment of containerized applications, as well as Kubernetes integration.

- BuildKit's Role: Essential for anyone using Docker Build, facilitating efficient, dependency-aware container builds with advanced caching mechanisms. It's not just for Docker: it serves as a versatile execution engine across various projects, including Dagger.
- Eric's Attraction to BuildKit: The power of BuildKit's DAG (Directed Acyclic Graph) execution model and its parallelization and deduplication capabilities drew Eric to maintain and contribute to the project.
- First BuildKit Project: Eric's initial project, bincastle, aimed to build a development environment from source, highlighting BuildKit's ability to handle complex builds.
- Introduction of Dagger: Dagger builds on top of BuildKit and enhances automation by allowing developers to use familiar programming languages without being confined to a specific domain-specific language (DSL). It aims to simplify and optimize automation tasks, particularly in CI/CD environments.
- Dagger's Enhancements over BuildKit: Dagger introduces a language-agnostic layer, making automation more accessible and scalable. It incorporates features like remote caching and a services layer, potentially positioning it as a simpler alternative to Kubernetes for certain use cases.
- Future Directions: Ongoing developments, such as modules for sharing automation code within Dagger, aim to foster an ecosystem where developers can easily reuse and contribute to collective automation solutions.

The conversation highlights the evolving landscape of development tools, where BuildKit and Dagger play pivotal roles in making containerized development and deployment more efficient and user-friendly. Eric and Gerhard discuss the potential for these tools to simplify and enhance automation, reflecting on past projects and future possibilities.

🎬 View the video part of this episode: Deploying and Experimenting with Dagger 0.9 on Kubernetes 1.28
🎁 Access the audio & video as a single conversation at makeitwork.gerhard.io

LINKS

sipsma/bincastle
moby/buildkit
dagger/dagger
dagger --mod github.com/sipsma/daggerverse/yamlinvaders@8071646e5831d7c93ebcd8cca46444250bf25b8c shell play

EPISODE CHAPTERS

(00:00) - Intro
(01:43) - What attracted you to BuildKit?
(03:23) - What was the first project that you used BuildKit for?
(06:03) - Four years later, do you still want to see that idea through?
(06:44) - What is Dagger?
(08:18) - How much does Dagger add on top of BuildKit?
(10:42) - How does Gerhard think of Dagger in relation to BuildKit?
(12:48) - Dagger Modules - a way to share automation code
(14:01) - If someone installs Dagger today, what happens under the hood?
(14:47) - Why is the Engine distributed as a container image?
(16:02) - If the Dagger Engine was a single binary, how would you run it?
(17:05) - Thoughts on BuildKit caching?
(18:15) - What about remote caching?
(20:53) - Let's run Dagger on K8s on this Latte Panda Sigma
(21:43) - SCREEN SHARE
(22:06) - As we approach KubeCon, what is on your list?
(23:19) - An idea for next time when we get together

02-29
24:29

How much CPU & Memory?

This episode looks into the observability tools Parca & Polar Signals Cloud with Frederic Branczyk and Thor Hansen. We discuss experiences and discoveries using Parca for detailed system-wide performance analysis, which transcends programming languages.

We highlight a significant discovery related to kube-prometheus and the unnecessary CPU usage caused by the Prometheus exporter's attempts to access BTRFS stats, leading to a beneficial configuration change for Kubernetes users globally.

We also explore Parca Agent's installation on Kubernetes 1.28 running on Talos 1.5, the process of capturing memory profiles with Parca, and the efficiency of the Parca Agent in terms of memory and CPU usage.

We touch upon the continuous operation of the Parca Agent, the importance of profiling for debugging and optimization, and the potential of profile-guided optimizations in Go 1.22 for enhancing software efficiency.

🎬 Screen sharing videos that go with this episode:

First impressions: Parca Agent on K8s 1.28 running as Talos 1.5
See where your Go code allocates memory
How to debug a memory issue with Parca?
See which line of your Go code allocates the most memory

🎁 Access the audio & all videos as a single conversation at makeitwork.gerhard.io

LINKS

Go Profile-guided optimization
View Profiling Data within Code
Announcing Continuous Memory Profiling for Rust

EPISODE CHAPTERS

(00:00) - Intro
(02:21) - kube-prometheus discovery & fix
(06:29) - Parca Agent on K8s 1.28 running as Talos 1.5
(06:49) - How to capture memory profiles with Parca?
(08:42) - pprof.me
(10:42) - Data retention in Parca
(11:42) - A real-world memory issue debugging example
(16:05) - How much memory is Parca Server expected to use?
(17:39) - How much memory is the Parca Agent expected to use?
(19:42) - What about Parca Agent CPU usage?
(21:57) - Is Parca Agent meant to run continuously?
(23:03) - Other Parca stories worth sharing
(25:19) - What are the things that you are looking forward to in 2024?
(27:23) - Golang Profile Guided Optimisations with Parca
(30:22) - Frederic's surprise screen share
(34:02) - Wrap-up

02-29
35:36
