I was excited to contribute an article,
"Move past incident response to reliability,"
to GitHub’s The ReadME Project.
This topic was particularly on my mind when I wrote it towards the end of last year,
when I was focused on my Infrastructure Engineering project.
That project is a bit paused at the moment, as I’m focused on another project that I’ll
get to announce in the next month or two. (No details there yet, but if you look at my
recent writing, you can probably make a good guess.)
I will get back to writing on infrastructure engineering in a bit, but for now,
you can read this piece.
This piece was fun to write because I tried to say something relatively controversial
(at some point, most incident programs get caught up in process), in the context of a
101-ish incident program overview.