Bonus E2 – CrowdStrike Crisis: An IT Nightmare Unfolds
Update: 2024-07-21
Description
Introduction:
- Welcome back to a bonus episode of Off the Wire.
- Highlight of the week: a bad patch pushed out by CrowdStrike caused worldwide outages.
Initial Impact:
- Anthony's experience: dealing with server and workstation blue screens.
- Timeline of the incident: it began at 12:09 AM, with alerts starting to come in around 12:40 AM.
- Initial thoughts and confusion about the cause of the outages.
Incident Breakdown:
- Detailed account of the events from the first alert to the realization of the issue.
- Actions taken: communicating with the team, creating a list of affected servers, and initial troubleshooting steps.
- The emotional toll: dealing with the uncertainty and high-stress situation.
Discovery and Response:
- Identifying that the issue was linked to CrowdStrike after finding relevant information on its support portal.
- Relief upon realizing it was not a hack but a bad patch.
- Steps taken to mitigate the issue: removing CrowdStrike from systems, following CrowdStrike's fix instructions.
Operational Challenges:
- Logistics of fixing the issue across remote and local systems.
- Game plan for addressing workstation issues at different office locations.
- The coordination effort: managing communications and task delegation.
Post-Incident Reflection:
- The importance of a coordinated response and having a "bug-out" bag.
- CrowdStrike's handling of the incident and the need for transparency.
- Discussion of potential industry-wide implications and the fragility of IT infrastructure.
Impact and Future Considerations:
- Worldwide impact: other organizations were affected, including critical infrastructure.
- Reflection on CrowdStrike's reputation and future trust.
- Legal and liability considerations for CrowdStrike in various jurisdictions.
Closing Thoughts:
- The importance of preparedness and having a response plan in place.
- Lessons learned from the incident and changes to be implemented.
- Invitation to listeners to share feedback and follow on social media.
Outro:
- Thanks for joining this bonus episode.
- Reminder about the regular podcast schedule and mention of recent episodes.
- Encouragement to share the podcast with others and stay tuned for more content.