PurePerformance

Author: PurePerformance

Subscribed: 135 | Played: 2,914

Description

The brutal truth about digital performance engineering and operations.

Andreas (aka Andi) Grabner and Brian Wilson are veterans of the digital performance world. Combined, they have seen too many applications that fail to scale and perform up to expectations. With more rapid deployment models made possible through continuous delivery and a mentality shift sparked by DevOps, they feel it’s time to share their stories. In each episode, they and their guests discuss different topics concerning performance, ranging from common performance problems for specific technology platforms to best practices in developing, testing, deploying and monitoring software performance and user experience. Be prepared to learn a lot about metrics.

Andi & Brian both work at Dynatrace, where they get to witness more real world customer performance issues than they can TPS report at.
321 Episodes
While Artificial Intelligence seems to have just popped up when OpenAI brought ChatGPT to the consumer market, it has its roots in the middle of the 20th century. But what is it that all of a sudden made it into every conversation we seem to have?
Thomas Natschlaeger, Principal Data Scientist at Dynatrace, who has been working in the AI and Machine Learning space for the past 30 years, gives us a brief historical overview and describes the critical evolutionary steps and compelling events that made the technology what it is today. Tune in and hear about how AIs are trained, how they are optimized and, most importantly, how their outputs can be tested and validated!
In our conversation we discuss current trends towards small language models that will help model digital twins of our existing roles, and how AIs are used to validate other AIs, much like a senior engineer who pair programs with a junior and thereby provides essential feedback on current accuracy and input to improve the outcome of future tasks.
Links we discussed:
LinkedIn Profile from Thomas: https://www.linkedin.com/in/thomas-natschlaeger/
Ask Me Anything Session on Davis CoPilot: https://www.linkedin.com/posts/grabnerandi_llm-copilot-activity-7373837743971393536-QgxV?utm_source=share&utm_medium=member_desktop&rcm=ACoAAABLhVQBbh8Jkn_K8din5tsQlMCpXRNzlKU
Voxxed Conference Talk: https://amsterdam.voxxeddays.com/talk/?id=39801
Attention is all you need paper: https://en.wikipedia.org/wiki/Attention_Is_All_You_Need
On September 8 the world saw the npm supply chain attack. Fortunately, the community reacted in record time to avert a disaster. In today's episode we have Constanze Roedig, Key Researcher at SBA Research, who introduces us to the new buddy of SBoM (Software Bill of Materials): SBoB (Software Bill of Behaviors), and shares her thoughts on how that new approach to fingerprinting software can help cyber security teams.
What's a BoB? It's a detailed runtime behavior profile of software. It expands on the static validation that SBOMs offer, as it allows security teams to validate the correct execution behavior of deployed software at deploy time or continuously in production. Thanks to eBPF, malicious behavior such as opening unexpected ports or accessing unexpected files can be detected.
Listen to Constanze, who shares the work she and Vadim Bauer, Owner of 8gear, have done on this topic. You will learn how software vendors can create their own SBOBs and ship them with their container images, and how security teams can get alerted on detected malicious behavior or enforce policies against it. Make sure to check out their GitHub repo, star it if you like it and try their hands-on tutorial!
Links:
Constanze LinkedIn: https://www.linkedin.com/in/croedig/
Vadim LinkedIn: https://www.linkedin.com/in/vadim-bauer/
OBobCtl GitHub Repo: https://github.com/k8sstormcenter/bobctl
Cloud Native Summit Munich Talk: https://www.youtube.com/watch?v=XETuwndd_mw&index=11&pp=iAQB
npm supply chain attack: https://www.infosecurity-magazine.com/news/npm-supply-chain-attack-averted/
Defining AI-Native in 2025 is like trying to define Cloud Native back in 2014! We are in the early stages of understanding what AI really means to us. The ecosystem is just evolving, and many organizations are still struggling with re-architecting their digital systems to cloud native patterns!
To learn more about the current transformational wave, the AI-Native Wave, we have invited Pini Reznik, CEO and Co-Founder of re:cinq. We will discuss what we can learn from previous "waves of innovation," why the business must care, and why the primary AI use case should not be just cost-cutting! Make sure to get a copy of his book or catch his talk from Cloud Native Munich.
All links we discussed here:
Pini's LinkedIn: https://www.linkedin.com/in/pinireznik/
The Next Transformation Mini Book: https://re-cinq.com/mini-book
Cloud Native Munich Talk: https://www.youtube.com/watch?v=CHb3TLEV8ZU
Most AI projects still fail, are too costly, or don't provide the value they hoped to gain. The root cause is nothing new: it's non-optimized models or code that runs the logic behind your AI Apps. The solution is also not new: tuning the system based on insights from Observability!
To learn more about the state of AI Observability, we invited back Nir Gazit, CEO and Co-Founder of traceloop, the company behind OpenLLMetry, the open source observability standard that is seeing exponential adoption growth!
Tune in and learn how OpenLLMetry became such a successful open source project, which problems it solves, and what we can learn from other AI project implementations that successfully launched their AI Apps and Agents.
Links we discussed:
Nir's LinkedIn: https://www.linkedin.com/in/nirga/
OpenLLMetry: https://github.com/traceloop/openllmetry
Traceloop Hub LLM Gateway: https://www.traceloop.com/docs/hub
Did you know that the average salary for a Platform Engineer is 42.5% more than that of a DevOps engineer? But why is that?
We sat down with Artem Lajko, CNCF Kubestronaut and Ambassador as well as author of the book Implementing GitOps with Kubernetes. We dive into the role of a platform engineer, the common pitfalls in implementing IDPs and why Backstage and AI won't solve all your problems. And we touch upon a topic hot off the press around Terraform: It's not dead!
Links we discussed:
Artem's LinkedIn: https://www.linkedin.com/in/lajko/
Talk slides from Cloud Land: https://lajko10-my.sharepoint.com/personal/artem_lajko_dev/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fartem%5Flajko%5Fdev%2FDocuments%2FAttachments%2Fcloud%20land%2D2025%5F%2Epdf&parent=%2Fpersonal%2Fartem%5Flajko%5Fdev%2FDocuments%2FAttachments&ga=1
State of Platform Engineering Report: https://platformengineering.org/reports/state-of-platform-engineering-vol-3
Upjet GitHub Project: https://github.com/crossplane/upjet
"Privacy engineering is the art of translating privacy laws and policies into code, figuring out how to make legal requirements such as ‘an individual must be able to request deletion of all their personal data’ a technical reality." That was the elegant explanation from Cat Easdon when asked what she does in her day job.
If you want to learn more, tune in to this episode. Cat, Privacy Engineer at Dynatrace, shares her learnings on topics such as when the right time is to form your own privacy engineering team, why privacy means different things to different people and regulators, and which privacy considerations apply specifically to the observability industry so that our users trust our services!
Links:
Cat's LinkedIn Profile: https://www.linkedin.com/in/easdon/
Publications from Cat: https://www.dynatrace.com/engineering/persons/catherine-easdon/
Blog on Managing Sensitive Data at Scale: https://www.dynatrace.com/news/blog/manage-sensitive-data-and-privacy-requirements-at-scale/
Semgrep for lightweight code scanning: https://github.com/semgrep/semgrep
The IAPP: https://iapp.org/
'Meeting your users' expectations' is formally described by the theory of contextual integrity: https://www.open.edu/openlearncreate/mod/page/view.php?id=214540
Facebook's $5 billion fine from the FTC: http://ftc.gov/news-events/news/press-releases/2019/07/ftc-imposes-5-billion-penalty-sweeping-new-privacy-restrictions-facebook
Fact-check: "The $5 billion penalty against Facebook is the largest ever imposed on any company for violating consumers’ privacy and almost 20 times greater than the largest privacy or data security penalty ever imposed worldwide. It is one of the largest penalties ever assessed by the U.S. government for any violation." I think that's still true; the largest fine under the GDPR was €1.2 billion (again for Facebook/Meta).
More than 50% of platform engineering leads don't know how to measure the impact of their platform! Many platform projects fall into common anti-pattern traps that make the platform look great on Day 1 but fail to scale and excite on Day 2!
Daniel Bryant, whose profile tagline is "Helping you build better platforms", shares his thoughts on how to measure the value of your platform, how to avoid common anti-patterns and why he believes that the future of platform engineering is in Platform Democracy!
And of course, we wrap everything up with a discussion around the impact of Agentic AI on platform engineering. So tune in!
Here are the links we discussed:
Daniel's LinkedIn Profile: https://www.linkedin.com/in/danielbryantuk/
Platform Engineering Book for Technical Product Leaders: https://www.amazon.de/Platform-Engineering-Technical-Product-Leaders/dp/1098153642/ref=asc_df_1098153642
Platform Engineering Day Talk: https://www.syntasso.io/post/syntasso-at-platengday-london-presentation-recap
Kratix Website: https://www.kratix.io/
AI-Driven Platform Engineering Blog: https://www.syntasso.io/post/what-we-learned-building-a-prototype-ai-driven-dev-interface-for-kratix
Platform Democracy: https://www.syntasso.io/post/platform-democracy-rethinking-who-builds-and-consumes-your-internal-platform
Platform Anti Patterns: https://www.syntasso.io/post/platform-building-antipatterns-slow-low-and-just-for-show
Slide Deck on Platform Engineering for Devs and Architects: https://speakerdeck.com/danielbryantuk/platform-engineering-for-software-developers-and-architects-redux
"How do you measure the impact you have with your platform engineering initiative?" is a question you should be able to answer. To show improvement you first need to know what the status quo is. And this is where frameworks such as DX Core 4 come in. Never heard about it? Then tune into this episode where we have Dušan Katona, Sr Director of Platform Engineering at Ataccama, who is a big fan of the DX Core 4 metrics and who has just applied them in his current role to optimize developer experience.
Dušan explains the details behind those four core metrics: Speed, Effectiveness, Quality and Impact. He also shares how improving those metrics by a single point results in the equivalent of 10 hours saved per developer per year.
And here are the relevant links we discussed today:
Dušan's LinkedIn Profile: https://www.linkedin.com/in/dusankatona/
DX Core 4 Blog: https://getdx.com/research/measuring-developer-productivity-with-the-dx-core-4/
Marian's JIRA Analytics Open Source Project: https://github.com/marian-kamenistak/jira-lead-cycle-time-duration-extractor
"15 years ago it was enough to be smart - going forward it's not a differentiator - being smart will just make you average!" But what is it? What makes great leaders worth following, and how do they manage to triple their value while others keep waiting for their 5% raise?
4 years ago Marian Kamenistak launched the Engineering Leadership Community out of Prague, Czech Republic. Drawing on his experience in Silicon Valley, he has grown this community to 1500 members with the mission to create "Leaders worth following". Tune in and hear from Marian on how to think and talk about value impact versus getting held up trying to achieve technical perfection, why it's important to build a network around you, the difference between mentorship and management, as well as how to prove to your leadership the value that you bring to the organization!
Links we discussed today:
Marian's LinkedIn: https://www.linkedin.com/in/mariankamenistak/
Engineering Leadership Conference: https://www.elc-conference.io/
Engineering Leadership Community: https://www.engineeringleaders.io/
The Leadership Pipeline Book: https://www.amazon.com/Leadership-Pipeline-Build-Powered-Company/dp/0470894563
Scientific research is the foundation of many innovative solutions in any field. Did you know that Dynatrace runs its own Research Lab within the Campus of the Johannes Kepler University (JKU) in Linz, Austria, just 2 kilometers away from our global engineering headquarters? What started in 2020 has grown to 20 full-time researchers and many more students who do research on topics such as GenAI, Agentic AI, Log Analytics, Processing of Large Data Sets, Sampling Strategies, Cloud Native Security or Memory and Storage Optimizations.
Tune in and hear from Otmar and Martin how they are researching the N+2 generation of Observability and AI, how they are contributing to open source projects such as OpenTelemetry, and what their predictions are for when AI finally takes control of us humans!
To learn more about their work check out these links:
Martin's LinkedIn: https://www.linkedin.com/in/mflechl/
Otmar's LinkedIn: https://www.linkedin.com/in/otmar-ertl/
Dynatrace Research Lab: https://careers.dynatrace.com/locations/linz/#__researchLab
As a leader who wants to optimize an organization, you are bound to fail if you isolate social (culture and people) changes from technical (tools and processes) changes. When we asked Lesley Cordero, Staff Engineer at The New York Times, how to solve this dilemma, she answered: "Platform Engineering, it can drive organizational sustainability by practicing sociotechnical principles that provide a community driven support system for application developers using our standardized shared platform architecture."
Tune in to our latest episode and learn more about the importance of leadership to continuously keep up and balance the tension between "Developers" and "Operations", between "End User Experience" and "Developer Experience" and ultimately between "Culture and People" and "Tools and Processes".
Links we discussed:
Lesley's LinkedIn: https://www.linkedin.com/in/lesleycordero/
GOTO Conference Talk: https://www.youtube.com/watch?v=Jx-XrUONJ-o
QCon 2025 Talk Details: https://qconlondon.com/presentation/apr2025/platform-engineering-practice-sociotechnical-excellence
DevOpsCon 2024 Talk Details: https://devopscon.io/business-company-culture/platform-engineering-devops/
Do you plan for incidents? Do you have a time / cost budget for them in your sprint or quarterly planning? Do you have engineers that are "interruptible"?
We discussed those and more questions with Lisa Karlin Curtis, Founding Engineer at incident.io, who teaches us why we need to think differently about dealing with incidents!
In our discussion we learn why modern incident management embraces more incidents that are publicly shared within an organization to foster learning. We learn how to train more people to become incident responders, how to triage and categorize incidents, how to better plan for them and how to best report on them.
We also touch on AI, and how AI-generated code will eventually result in more incidents, which we should use as an opportunity to learn and improve our engineering process.
P.S.: This was our 10th-anniversary podcast episode!!
Here are the links we discussed in the podcast:
Lisa's LinkedIn: https://www.linkedin.com/in/lisa-karlin-curtis-a4563920/
Her talk at ELC Prague: https://docs.google.com/presentation/d/18536WBHBcPEppEeXXP7o5UQOX2XfWoGmfds2CHegHq4/edit?slide=id.g3434e0cba65_0_0#slide=id.g3434e0cba65_0_0
Incident Playbook: https://incident.io/guide
MCP (Model Context Protocol) is an open source standard for connecting AI assistants to the systems where data lives. But you probably already knew that if you have followed the recent hype around this topic after Anthropic made their announcement at the end of 2024.
To learn why MCPs are not magic in themselves, but enable "magic" new use cases that make engineers more efficient, we have invited Dana Harrison, Staff Site Reliability Engineer at Telus. Dana goes into the use cases he and his team have been testing out over the past months to increase developer efficiency.
In our conversation we also talk about the difference between local and remote MCPs, the importance of keeping resilience in mind as MCPs connect to many different API backends, and how we can and should observe the interactions with MCPs.
Links we discussed:
Anthropic Blog: https://www.anthropic.com/news/model-context-protocol
Dana's LinkedIn: https://www.linkedin.com/in/danaharrisonsre/overlay/about-this-profile/
So you think Distributed Tracing is the new thing? Well, it's not! But it's never been as exciting as today!
In this episode we combine 50 years of Distributed Tracing experience across our guests and hosts. We invited Christoph Neumueller and Thomas Rothschaedl, who have seen the early days of agent-based instrumentation, how global standards like the W3C Trace Context allowed tracing to connect large enterprise systems, and how OpenTelemetry is commoditizing data collection across all tech stacks.
Tune in and learn about the difference between spans and traces, why collecting the data is only part of the story, how to cope with too much data, and how traces relate and connect to logs, metrics and events.
Links we discussed:
YouTube with Christoph: LINK WILL FOLLOW ONCE VIDEO IS POSTED
Christoph's LinkedIn: https://www.linkedin.com/in/christophneumueller/
Thomas's LinkedIn: https://www.linkedin.com/in/rothschaedl/
In the ever-changing IT world, creating content that stays relevant for long is hard. One of the objectives of "Platform Engineering for Architects: Crafting Modern Platforms as a Product" was to stay timeless by providing practical examples of use cases not necessarily tied to current technology trends.
The book focuses on the importance of building a platform with a purpose, making the impact measurable, and ensuring the platform keeps evolving by continuously involving the end users (the engineering teams) in its evolution.
Tune in to this episode and hear from Max Körbächer (Founder of Liquid Reply), Hilliary Lipsig (Senior Principal SRE at Red Hat), and Andi Grabner (Co-Host of PurePerformance) on what made them write a book on Platform Engineering, and get some personal insights into what gets the authors excited about their respective topics.
If you have a chance, meet Max, Hilliary, and Andi at KubeCon in London. They will present at Platform Engineering Day and do a book signing at KubeCrawl!
Links we discussed:
Book on Amazon: https://www.amazon.com/Platform-Engineering-Architects-Crafting-platforms-ebook/dp/B0DH5DJFTH
Platform Engineering Day Session: https://colocatedeventseu2025.sched.com/event/1u5mX/platform-engineering-for-architects-crafting-platforms-as-a-product-max-korbacher-liquid-reply-hilliary-lipsig-red-hat
Hilliary Lipsig: https://www.linkedin.com/in/hilliary-lipsig-a5935245/
Max Körbächer: https://www.linkedin.com/in/maxkoerbaecher/
Andi Grabner: https://www.linkedin.com/in/grabnerandi/
One petabyte is the equivalent of 11,000 4K movies. And CERN's Large Hadron Collider (LHC) generates this every single second. Only a fraction of this data (~1 GB/s) is stored and analyzed using a multicluster batch job dispatcher with Kueue running on Kubernetes.
In this episode we have Ricardo Rocha, Platform Engineering Lead at CERN and CNCF Advocate, explaining why after 20 years at CERN he is still excited about the work he and his colleagues are doing. To kick things off we learn about the impact that the CNCF has on the scientific community and how to best balance an implementation of that scale between "ease of use" and "optimized for throughput". Tune in and learn about custom hardware being built 20 years ago and how the advent of the latest chip generation has impacted the evolution of data scientists around the globe.
Links we discussed:
Ricardo's LinkedIn: https://www.linkedin.com/in/ricardo-rocha-739aa718/
KubeCon SLC Keynote: https://www.youtube.com/watch?v=xMmskWIlktA&list=PLj6h78yzYM2Pw4mRw4S-1p_xLARMqPkA7&index=5
Kueue CNCF Project: https://kubernetes.io/blog/2022/10/04/introducing-kueue/
The word "Compliance" reminds many of mandatory training or audits. Two things not everyone gets excited about!
Tune in and meet Michiel de Lepper, who has spent most of his career in Security and Compliance. He gives us a different perspective on the importance of compliance, why it exists, how it intertwines with security and threat detection, what it has to do with security posture management and why he thinks it's one of the most exciting things in IT!
Links we discussed:
Michiel's LinkedIn: https://www.linkedin.com/in/madelepper/
Blog posts on security and compliance:
https://www.dynatrace.com/news/blog/dynatrace-for-executives-security-compliance/
https://www.dynatrace.com/news/blog/manage-compliance-and-resilience-at-scale-with-dynatrace/
https://www.dynatrace.com/news/blog/dynatrace-kspm-transforming-kubernetes-security-and-compliance/
Feature Flagging - some may call them "glorified if-statements" - has been a development practice for decades. But have we reached a stage where organizations are doing "Feature Flag-Driven Development"? After all, it took years to establish a test-driven development culture despite having great tools and frameworks available!
To learn more we invited Ben Rometsch, Co-Founder of Flagsmith, to chat about the history, state and future of Feature Flagging. He gives us an update on where the market is heading, how the CNCF project OpenFeature and its community are driving best practices, what the role of AI might be and what he thinks might be next!
A couple of links we discussed during the episode:
Ben on LinkedIn: https://www.linkedin.com/in/benrometsch/
YouTube Video on Observability & Feature Flagging: https://www.youtube.com/watch?v=VZakh1_oEL8
OpenFeature: https://openfeature.dev/
To predict the future, it's important to know the past. And that is true for Bernd Greifeneder, Founder and CTO of Dynatrace, who has been driving innovation in observability and security since he founded Dynatrace 20 years ago!
Bernd agreed to sit down, look behind the covers and answer the open questions that people posted on his LinkedIn in response to his recent observability prediction blog. Tune in and learn about Bernd's thoughts on the evolution from reactive to preventive operations, who is behind the convergence of observability & security, why observability can help those that have serious intentions for sustainability and how observability becomes mandatory and indispensable for AI-driven services.
We mentioned a lot of links in today's session. Here they are:
Our podcast from 9 years ago: https://www.spreaker.com/episode/015-leading-the-apm-market-from-enterprise-into-cloud-native--9607734
Bernd's LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7275101213237354497/
Predictions Blog: https://www.dynatrace.com/news/blog/observability-predictions-for-2025/
K8s Predictive Scaling Lab: https://github.com/Dynatrace/obslab-predictive-kubernetes-scaling
Security Video: https://www.youtube.com/watch?v=ICUwRy4JFTk
Carbon Impact App: https://www.youtube.com/watch?v=8Px0BB1U1yk
AI & LLM Observability Video: https://www.youtube.com/watch?v=eW2KuWFeZyY
eBay, Yahoo, Netflix and then 10+ years at Uber. In this episode we sit down with Vishnu Acharya, Head of Network Infrastructure EMEA and Platform Engineering at Uber. Vishnu shares how Uber has scaled over the years to about 4000 engineers and how his team makes sure that infrastructure and platform engineering scale with the growing company and the growing demand on their digital services.
Tune in and learn how Vishnu thinks about SLOs across all layers of the stack, how they manage to get better insights with their cloud providers and why it's important to have an end-to-end understanding of the most critical end user journeys.
Links we discussed:
Conference talk at Observability & SRE Summit: https://www.iqpc.com/events-observability-sre-summit/speakers/vishnu-acharya
Vishnu's LinkedIn Page: https://www.linkedin.com/in/vishnuacharya/
Uber Engineering Blog: https://www.uber.com/blog/engineering/