My Top 5 Books for DevOps/SRE

muncus

Marc Dougherty

Posted on November 29, 2020


Disclaimer: The opinions expressed here are my own, and do not represent those of my employer.

I am the type of person who learns well from self-study, like reading books or watching instructional videos. I recognize that this method does not work for everyone, but if it works for you, I'd like to recommend a few books that have helped me grow in my career as a Site Reliability Engineer.

Note: these are not affiliate links, and I have not been compensated in any way to recommend these items.

1. Crucial Conversations

You may have heard the old Operations adage "when things are going well, everyone forgets you exist". The corollary is that we're usually involved when other teams are having a Bad Day, and tempers can flare on both sides.

Crucial Conversations has helped me build the skills to better understand the needs of partner teams, and leadership, when they may not be communicating very clearly (because things are a 🗑🔥).

This book has helped me identify my own defensive reactions, and build habits to help avoid them. This leads to the perception that I am keeping a cooler head during outages, even if I'm still freaking out!

This is a book that I try to at least skim every year or two, because the lessons and practices that speak to me are different each time I read it.

2. Thanks for the Feedback

This book is all about asking for feedback and really understanding it (regardless of how poorly it might be delivered).

I'm not going to lie, this might be the hardest book I've ever read. The book illustrates difficult situations, and I often found myself identifying with the more antagonistic party.

Because of that, this book had a real effect on how I treat feedback, and I certainly feel better about receiving difficult feedback than before reading this book.

3. Accelerate: The Science of Lean Software and DevOps

Finally, we get to the books with technical content!

As an SRE, I spend a lot of my time making the case for change, whether in business requirements or in engineering time. This book provides the data to make these cases more easily, and points you at the right measurements to show the impact of the change.

If you're not familiar with this book, it is based on the last 4(?) years of DORA State of DevOps Report survey data, making it the largest dataset you could hope to find on the topic.

The book (and to some extent the site above) lays out a set of 24 Key Capabilities, and the relationships between them. The relationships between these capabilities are sometimes obvious (e.g. Continuous Delivery drives Software Delivery Performance), and sometimes more surprising (e.g. Continuous Delivery also drives Job Satisfaction).

My favorite parts are the "intuitive" correlations that do not hold up in a data set this large, e.g. having a separate QA team that owns tests does not correlate with fewer failed changes.

If you're looking to make change in your organization, this book is a must-have.

4. Site Reliability Workbook

The sequel to the first SRE Book, the SRE Workbook improves on the original by using more thorough non-Google examples, along with tools and techniques available to those outside of Google. Many of the success stories come from Google customers, but since they are using industry-standard technologies, the examples are more relatable.

I especially appreciate that SLOs are called out as the first step toward reliability. Reliable systems are inherently more expensive, and high-availability systems often extremely so. Getting a shared understanding of reliability requirements between Engineering, Operations (or SRE), and Product (representing the Customer) records the expectations of all parties, and can make it clear that for some systems, 90% uptime (or lower) can be the Right Thing.
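To make that trade-off concrete, here is a rough sketch of the arithmetic behind an availability target. This is my own illustration and not taken from the book: the function name `allowed_downtime_minutes` and the 30-day window are assumptions, and real SLOs are often measured on requests rather than wall-clock time.

```python
def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability target permits within the window.

    Hypothetical helper for illustration: treats availability as plain
    wall-clock uptime over a fixed window.
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - slo_percent / 100)


# Compare a relaxed target with progressively stricter ones.
for target in (90.0, 99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% -> {minutes:.1f} minutes of downtime per 30 days")
```

Under these assumptions, a 90% target leaves about three days of downtime per month, while 99.99% leaves only a few minutes. Seeing the numbers side by side can help the conversation about which systems actually need the expensive end of that range.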

There are several chapters about how Google has structured SRE, and how it structures SRE Engagements. I'd caution readers to be skeptical here, as these decisions have consequences on how your SRE org is able to engage with partner teams and other parts of your organization, and what worked for Google may not work for you.

5. Everybody Writes

For most of us, our written communication has the largest audience. Unless you're doing a lot of public speaking, the emails and documents that we write reach many more people than our verbal communications.

The author of this book works in Marketing, but the first 3/4 of this book is full of lessons that anyone can benefit from. Perhaps most importantly, it teaches us that writing well is a learned skill, not an innate ability. The best way for us to improve our written communication is to do more of it!

I was surprised to see a discussion of "content limits", basically maximum sizes for specific types of content (e.g. a blog post should be at most 1500 words). This was a new concept for me, and these guidelines help me make content (from blog posts to emails) that is more accessible to the reader (even if it does take me longer to write!).

Do you have a favorite book that's helped you grow personally or professionally?
Tell me about it in the comments!
