How strongly do I recommend The DevOps Handbook?
10 / 10
The DevOps Handbook comes to you from the authors of Accelerate, The Phoenix Project, and The Unicorn Project. If you’re having trouble deciding which devops book to pick…
The DevOps Handbook is an excellent choice for software engineering leaders and engineering book clubs. Read this book together with your team and you’re likely to find several areas for continuous improvement and meaningful conversation.
Top Ideas in This Book
As an engineering manager or leader, your job is to improve feedback loops, primarily by making them faster:
Engineers instinctively know that faster feedback is better. Waiting 2 minutes for a test suite to complete is much better than 30 minutes. But how many organizations invest energy toward tightening feedback loops?
To my mind, quality of the feedback is important but a secondary consideration to speed. I’d rather have a fast code review of okay quality than a slow code review of great quality. I can work with the first to extract more information, but slow feedback often arrives so late that everyone feels frustrated.
Google has all their code in a single repo with billions of files. Randy Shoup, current VP of Engineering and Chief Architect at eBay, formerly of Google, called Google’s monorepo “the most powerful mechanism for preventing failures.”
At my company, we still have multiple repos but after reading The DevOps Handbook, we’re starting to see the pain points that could be alleviated by consolidating our code into a single repository.
For instance, when fixing a bug ticket that spans both our API and mobile app, we currently need to touch multiple repos, create multiple pull requests, and perform fragmented code reviews. Consolidating repositories would improve our ability to view code changes as a single body of work.
Moving toward a monorepo is about progress, not perfection. You have the option of consolidating similar repositories first. For instance, if you have multiple web apps, those could go into the same repo. Work in a stepwise fashion toward the monorepo.
When I interview engineers and ask about automated testing at their current job, the most frequent response is that they don’t have time to do automated testing. They are too busy shipping features! That customers demand! Yesterday!
It’s an understandable but penny-wise, pound-foolish mistake that everyone is complicit in.
More important than the daily work itself is improving how you do that work. Constantly improve your ability to develop software, using the four metrics for software delivery performance (deployment frequency, lead time for changes, change failure rate, and time to restore service) as a guide. Results will follow.
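To make the four metrics concrete, here is a minimal sketch of how you might compute deployment frequency, lead time for changes, change failure rate, and time to restore service from a list of deployment records. The `Deployment` record and its field names are my own assumptions for illustration, not something taken from the book; in practice you would pull this data from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    # Hypothetical record shape; adapt to whatever your pipeline exposes.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def delivery_metrics(deployments: List[Deployment], period_days: int) -> dict:
    """Summarize the four delivery metrics over a reporting period."""
    lead_times = [hours_between(d.committed_at, d.deployed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        hours_between(d.deployed_at, d.restored_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failures) / len(deployments) if deployments else None,
        "median_hours_to_restore": median(restore_times) if restore_times else None,
    }
```

Even a rough report like this, reviewed weekly, gives the team a shared baseline to improve against.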
The DevOps Handbook is filled with stories and interviews of software engineering leaders at Etsy, Target, Macy’s, Google, Facebook, and other companies all pursuing devops transformations.
The takeaway for me was that cultural transformations are possible. If Target can improve their software development environment and culture, so can your company.
But you need transformational leaders. Reformers. People who know how to navigate your own company culture, blowing up bureaucracy with an inviting and inclusive mentality, not a threatening one.
The devops movement is largely about cross-training your software engineers to grow operational skills in delivery, reliability, and system administration.
You are developing generalists, capable of solving business problems across a broader range of engineering disciplines. This minimizes the handoffs and transitions between teams that slow down delivery. Likewise, you reduce the likelihood of someone owning the reliability of a feature or solution that somebody else developed.
On my team, every engineer works full stack for both pragmatic and growth reasons. Pragmatically, we have a small team and everyone needs to be capable of solving many types of problems. This also provides growth areas for my engineers to learn new skills, try new technologies, and take ownership over business and engineering outcomes.
About a third of The DevOps Handbook is dedicated to telemetry and monitoring.
Too few teams can successfully answer the question, “Is it working?” at both the system and usage levels.
At the system level, we of course want to know that our servers, databases, and clients are operating as expected. We want insight into system performance and issues.
At the usage level, we need to know whether the features and solutions we’re building are delivering business value. Are users actually using the things we build and how?
Where we fail to instrument our systems with telemetry, we fail to engage in learning and improvement. In other words, we’re just building stuff and passively hoping for the best.
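As a rough illustration of instrumenting at both levels, here is a small sketch that counts how often a feature is used (usage level) and records how long it takes (system level). Everything here is hypothetical, including the `export_report` feature and the in-memory stores; a real system would send these measurements to whatever metrics backend you already run.

```python
import time
from collections import Counter, defaultdict
from contextlib import contextmanager

# In-memory stand-ins for a metrics backend (an assumption for this sketch).
usage_counts = Counter()          # usage level: how often each feature is used
latencies_ms = defaultdict(list)  # system level: how long each operation takes


def record_usage(feature: str) -> None:
    usage_counts[feature] += 1


@contextmanager
def timed(operation: str):
    """Record the wall-clock duration of the wrapped block in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        latencies_ms[operation].append((time.perf_counter() - start) * 1000)


def export_report(order_id: int) -> str:
    # Hypothetical feature, instrumented at both levels.
    record_usage("export_report")
    with timed("export_report"):
        return f"report for order {order_id}"
```

With even this much in place, “Is anyone using the export feature, and is it fast enough?” becomes a question you can answer with data rather than a guess.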
Watch work flow through the system to identify limiting factors. This is why kanban boards are so useful — they make work visible.
Knowing we need to watch work flow through the development system, engineering managers should orient standups around the work. Focus on how each project is advancing or is roadblocked.
At the opposite end of the standup spectrum is watching the people, where each developer reports their status. These standups are usually boring and developers disengage quickly, so they are easy to spot, yet they are common enough that all teams eventually fall into this trap.
In my experience, most software engineers know that they should write tests. So when developers don’t write automated tests, first check the incentives. Are they motivated to write tests? If yes, then you likely have a skill problem.
The same applies to flaky, broken, or slow pipelines: there is usually a skill gap that needs to be addressed, and just throwing new tech at the problem isn’t a solution.
The fear of deploying code causes teams to deploy less frequently, a vicious cycle that is hard to stop. In many cases the root cause of deployment fear is a lack of automated testing or unreliable automated testing. In such environments, regressions and bugs are common, creating reactive work and panic, which nobody likes. Hence the fear of deployment.
Integration tests can be a particular pain point because they are slower and flakier than unit tests and acceptance tests. When possible, we want to shift tests from integration down to acceptance or, even better, down to unit tests.
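One common way to push a test down to the unit level is to inject the slow dependency and swap in an in-memory fake. The sketch below is my own illustration, not an example from the book: `order_total` and `fake_get_price` are hypothetical names, and the fake stands in for what would otherwise be a call to a real pricing service.

```python
def order_total(items, get_price):
    """Sum the cost of (sku, quantity) pairs using the injected price lookup."""
    return sum(get_price(sku) * quantity for sku, quantity in items)


def fake_get_price(sku):
    # In-memory stand-in for the real pricing service (prices in cents).
    return {"apple": 50, "pear": 75}[sku]


def test_order_total():
    # Runs in microseconds, with no network or database to go flaky.
    assert order_total([("apple", 2), ("pear", 1)], fake_get_price) == 175
```

The pricing logic is now covered by a fast, deterministic test, and a much smaller number of integration tests can focus on whether the real service is wired up correctly.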