#113 - Faster Incident Response feat. Tim Armandpour // CTO @ PagerDuty

Update: 2024-12-05

Description

Planning AND Practice: The Secret to Incident Response

Plan and PRACTICE for better incident response with insights from Tim Armandpour, CTO of PagerDuty. Learn the secrets to resilience from the team that mitigated the impact of a major outage—handling a 250% traffic surge while delivering on their SLA.

Listen to find out:

🛠️ Why planning AND practice are both critical for incident response.

🚧 How to practice for incident response (e.g., Failure Fridays with Chaos Engineering).

🧑‍🤝‍🧑 Ownership: Why tech AND business teams must join post-mortems.

☁️ How to mitigate the impact of your cloud provider’s lower SLA.

⚓ Which architectural patterns are more resilient?

⚖️ WARNING: “Bend” the CAP theorem at your own risk.

Timestamps:
(00:00:00) Introduction to Alphalist Podcast
(00:01:00) Meet Tim Armandpour
(00:01:56) Tim's Early Career
(00:06:22) Handling Major Incidents at PagerDuty
(00:09:21) The Importance of Preparedness
(00:13:54) Practicing Failure Scenarios
(00:18:16) Resilient Infrastructure and Architectural Patterns
(00:22:44) Standardization and Data Management
(00:25:48) Exploring Infrastructure Resilience
(00:26:20) Achieving High Availability with Lower SLA Cloud Platforms
(00:29:38) Defining Meaningful SLIs
(00:32:15) Assessing Incident Readiness
(00:35:15) The Importance of Ownership
(00:41:30) Continuous Improvement
(00:43:53) Lessons from a Yogurt Business
(00:48:18) Final Thoughts and Takeaways

Tobias Schlottke - alphalist CTO Podcast