Critical Incidents: a guide for developers

1:00pm - 1:30pm on Friday, October 5 in PennTop North

Lais Varejão

Audience Level:
All
Slides:
http://bit.ly/critical-incidents-guide
Watch:
https://youtu.be/oQL4s01QbbU

Overview

As developers, we know there’s no such thing as bug-free software. Incidents are inevitable, and being prepared is key. How would you handle a database outage if it happened right now? Or a critical bug affecting half of your users? How do you plan the full recovery, from bug discovery to postmortem?

Description

As developers, we know (too well) there’s no such thing as bug-free software. Whether you work on a small or a large production system, incidents are inevitable, and being prepared is key. How would you handle, for example, a database outage if it happened right now? Or a critical bug affecting half of your users?

Navigating crisis mode is never easy, but a strong company culture and a recovery plan give you guidance and mitigate damage. In this talk, I will share some success cases, such as GitLab’s database outage recovery, and my personal experience as a project manager overcoming a critical incident in a subscription system built with Django.

What can you do to prepare your team? When should you enter crisis mode? How do you assemble a recovery plan? To answer these and other questions, I will provide a step-by-step guide from a Modern Agile perspective, starting with bug discovery and managing the client’s expectations, continuing through data recovery, and ending with the incident postmortem.
