Incident Response Essentials: From Postmortems to Communication Strategies - DevOps 212

Update: 2024-08-22
Description

In today's episode, Warren, Will, and special guest Falit Jain dive deep into the intricate world of incident management and response, drawing from rich experiences at tech giants like Amazon and Disney. They explore real-life scenarios, including Amazon's complex debugging challenges with over 150 engineers maintaining their detail page, and the high stakes of live streaming events at Disney.

Join them as they discuss the crucial aspects of effective incident response, from the importance of familiarity with systems and the role of on-call processes to the value of communication and meticulous postmortems. They also examine cultural influences from leadership, the balance between new feature launches and system stability, and the significance of metrics like mean time to resolution and error budgets.

Socials
  • LinkedIn: Falit Jain

Picks


Become a supporter of this podcast: https://www.spreaker.com/podcast/adventures-in-devops--6102036/support.

Charles M Wood