Irrational Exuberance!

Describing fault domains.

August 17, 2019 Fault domains are one of the most useful concepts I've found in architecting reliable systems, and don't get enough attention. If you want to make your software predictably reliable, including measuring your reliability risk, then it's an extremely useful concept to spend some time with.

Distributed systems vocabulary.

August 11, 2019 One of the challenges of having a discussion about distributed systems is agreeing on vocabulary. I've written up some of the vocabulary I've found most useful for having such discussions.

Head in the clouds.

July 7, 2019 When I wrote about the public cloud expansion forcing infrastructure engineers to evolve their role, I sort of imagined that the precursor question (should we run our infrastructure on the public cloud?) was already quite settled. But it's a discussion that I find myself having more rather than less frequently each year, so I've taken some time to structure and document my thinking.

Don't follow the sun.

July 3, 2019 When I get the chance to speak with engineering leaders, I sometimes get asked to endorse a plan, already underway, to spin up a “follow the sun” on-call rotation. My advice is probably not what folks anticipate: please don’t.

How to invest in technical infrastructure.

May 19, 2019 I'm speaking at Velocity on June 12th on 'How Stripe invests in technical infrastructure', and this is the rough outline of the content the talk will cover. I hope to see y'all there.