In this series about DevOps, Part 1 introduced the underlying principles of DevOps. Part 2 discussed creating environments early in the development process for more deterministic, predictable, and secure releases. Part 3 profiled a Twitter effort to integrate information security into the development process and amplify the feedback loop. In this article, learn about standardizing the work of IT operations to increase project predictability. See how standardization can also increase the throughput of work.
Create repeatable, standardized deployment stories
A recurring problem in the DevOps value stream is that handoffs between development and IT operations are often not sufficiently standardized. When every deployment is done differently, every production environment becomes a unique snowflake, and the organization never builds mastery of its procedures or configurations. As Luke Kanies, founder of Puppet Labs, said, "If your infrastructure is special, you're doing it wrong."
This article explores a pattern for defining reusable deployment procedures that can be used across projects. The Agile methodology offers an elegant solution: turn deployment activities into a user story. For example, you could build a reusable user story for IT operations called "Deploy into high availability environment." The story defines the exact steps to build the environment, how long it takes, what resources are required, and so on.
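One way to picture such a reusable story is as a small data record. The following is a minimal sketch in Python, not a feature of any particular Agile tool: the class name `DeploymentStory`, its fields, and the sample steps and resources are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentStory:
    """A reusable IT operations user story (hypothetical structure)."""

    title: str
    steps: tuple[str, ...]            # exact, ordered steps to build the environment
    estimated_days: float             # how long the story usually takes
    resources: tuple[str, ...] = ()   # people or systems the story needs

    def checklist(self) -> list[str]:
        """Return the steps as a numbered checklist for a project plan."""
        return [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]


# Illustrative values only; a real story would list your organization's steps.
ha_story = DeploymentStory(
    title="Deploy into high availability environment",
    steps=(
        "Provision two application servers",
        "Configure the load balancer",
        "Run smoke tests",
    ),
    estimated_days=3.0,
    resources=("IT operations engineer", "network engineer"),
)

for line in ha_story.checklist():
    print(line)
```

Making the record immutable (`frozen=True`) reflects the idea that a standardized story is reused as-is across projects rather than edited per deployment.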
Project managers can then use reusable deployment procedures to integrate deployment activities accurately into the project plan. For instance, you would have high confidence in the deployment schedule if you knew that the "Deploy into high availability environment" story has been executed fifteen times in the past year and that it takes an average of three days, plus or minus one day. You also gain confidence that deployment activities are being properly integrated into every software project.
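An estimate such as "three days, plus or minus one day" can be derived from a record of past executions. The sketch below assumes you have logged how many days each execution of the story took; the `history` values are made up for illustration, and using the sample standard deviation as the "plus or minus" figure is one reasonable choice, not the only one.

```python
from __future__ import annotations

from statistics import mean, stdev


def summarize_durations(days: list[float]) -> tuple[float, float]:
    """Return (average, sample standard deviation) of past durations in days."""
    if len(days) < 2:
        raise ValueError("need at least two past executions to estimate spread")
    return mean(days), stdev(days)


# Hypothetical durations, in days, for fifteen past executions of the story.
history = [3, 2, 4, 3, 3, 2, 4, 3, 4, 2, 3, 3, 4, 2, 3]

average, spread = summarize_durations(history)
print(f"{len(history)} runs: {average:.1f} days, plus or minus {spread:.1f}")
```

The more often a story has been executed, the more trustworthy the average and spread become, which is the practical payoff of reusing the same story across projects.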
Some software projects require unique environments that IT operations doesn't officially support. You can allow for exceptions in such cases if the "unsupported" environment is supported, in a standardized way, by someone outside of IT operations. You get the benefits of environment standardization (reduced production variance, fewer snowflakes in production, increased ability of IT operations to reliably support and maintain) while allowing nimbleness for special cases.
Stop the line when the deployment pipeline breaks
A problem with IT operations deployment stories is that, no matter how much detail they contain, things rarely go as planned when it's time to deploy. For instance, you may find that:
- Someone breaks the build and someone else checks in their changes without waiting for the build to be fixed.
- Nobody pays attention when the build breaks or tests fail.
- The build stays broken for longer than a few minutes.
- Automated test suites are flaky, untrusted, and consistently broken.
- Writing tests for the code and environment is deferred, so some required tests aren't automated.
- Deployments are unreliable, and the code doesn't work correctly when deployed to a production-like environment.
As Deming wrote in his Fourteen Points for Management, "We must cease dependence on mass inspection to achieve quality. Improve the process and build quality into the product in the first place."
The goal is to have assurance that, at any point in time, the code and environment are in a deployable state. However, this can happen only if you maintain the integrity of the continuous integration process and deployment pipeline.
The only way to achieve the goal is to prioritize keeping the system working over doing new work. When any part of the continuous integration process, automated test suite, or deployment pipeline fails, the entire team is notified and must fix the problem. If the failure was due to a code error, you must roll back the commit. If it was due to an environment error, you must fix the environment creation process. If it was due to a documentation failure, you must fix the documentation.
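The stop-the-line rule in the paragraph above can be sketched as a toy model. This is an illustration only, not the API of any real continuous integration server: the `Pipeline` class, the `FailureCause` enum, and the method names are hypothetical. The remediation text for each cause is taken directly from the paragraph.

```python
from enum import Enum
from typing import Optional


class FailureCause(Enum):
    CODE = "code error"
    ENVIRONMENT = "environment error"
    DOCUMENTATION = "documentation failure"


# Required fix for each cause, as stated in the article text.
REMEDIATION = {
    FailureCause.CODE: "roll back the commit",
    FailureCause.ENVIRONMENT: "fix the environment creation process",
    FailureCause.DOCUMENTATION: "fix the documentation",
}


class Pipeline:
    """Toy pipeline that refuses new check-ins while it is broken."""

    def __init__(self) -> None:
        self.broken_by: Optional[FailureCause] = None

    def report_failure(self, cause: FailureCause) -> str:
        """Stop the line and return the message sent to the whole team."""
        self.broken_by = cause
        return f"Pipeline stopped ({cause.value}): {REMEDIATION[cause]}."

    def check_in(self, change: str) -> bool:
        """Accept a change only while the pipeline is healthy.

        The change itself is ignored here; a real system would queue it.
        """
        return self.broken_by is None

    def mark_fixed(self) -> None:
        """Resume normal work once the problem has been fixed."""
        self.broken_by = None


pipeline = Pipeline()
print(pipeline.report_failure(FailureCause.CODE))
print(pipeline.check_in("new feature"))  # rejected while the line is stopped
pipeline.mark_fixed()
print(pipeline.check_in("new feature"))  # accepted again
```

Rejecting check-ins while the pipeline is broken addresses the first symptom in the list earlier in this section, where someone checks in changes on top of a broken build.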
Help for information security and QA
The high deployment rates typically associated with DevOps often put enormous pressure on QA and information security. Consider the case where development is doing ten deployments per day and information security requires a four-month lead time to execute an application security review.
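A quick calculation shows the scale of the mismatch. The sketch below uses the two figures from the example above; the 30-day month and the assumption that deployments happen every calendar day are simplifications added for illustration.

```python
DEPLOYS_PER_DAY = 10          # from the example above
REVIEW_LEAD_TIME_MONTHS = 4   # from the example above
DAYS_PER_MONTH = 30           # assumption: rounded calendar month, deploying every day


def deployments_during_review(per_day: int, months: int,
                              days_per_month: int = DAYS_PER_MONTH) -> int:
    """Count deployments that ship while a single security review is in progress."""
    return per_day * months * days_per_month


unreviewed = deployments_during_review(DEPLOYS_PER_DAY, REVIEW_LEAD_TIME_MONTHS)
print(f"Deployments shipped during one review: {unreviewed}")
```

Under these assumptions, roughly 1,200 deployments reach production before one review finishes, which is why a review process with a multi-month lead time cannot act as a gate at this deployment rate.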
The famous 2011 Dropbox failure is an example of the risk posed by insufficiently tested deployments. Authentication was turned off for four hours, enabling unauthorized users to access all stored data.
There's good news for QA and information security teams, though. Development organizations capable of sustaining high deployment rates are likely using continuous integration and deployment practices. Such methods require a culture that prioritizes fixing issues that stop the deployment pipeline over developing new features.
If changes to the code or the environment take you out of a deployable state, the entire organization, by habit, should swarm to fix the issue. Only by doing this can you have confidence that you're in a deployable state and that the deployment process is working. Information security or QA shouldn't be the lonely voices trying to stop a potentially catastrophic deployment from happening. Jez Humble, who co-wrote the seminal book Continuous Delivery, gave a great talk on this topic.
In this article, you learned a bit about increasing project predictability, accuracy, and throughput by standardizing the work of IT operations. Continuous integration and deployment practices can provide confidence that you're always in a deployable state.
The fifth, and final, part of this series discusses why everyone needs DevOps now.
- Check out the previous parts of this "DevOps distilled" series.
- The Deployment Pipeline: from check-in to release: Jez Humble's talk describing the deployment pipeline pattern and discussing how continuous integration, configuration management, and automated testing fit into continuous delivery. He also gives examples of controlling infrastructure changes using Puppet in the context of a deployment pipeline.
- Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation: Explores the principles and technical practices that enable rapid, incremental delivery of high-quality, valuable new functionality to users.
- Put Your Robots to Work: Security Automation at Twitter: A presentation by the product security team at Twitter that explores how information security integrates into a DevOps work stream.
- Here's How The Amazing Twitter Infosec Team Helps DevOps: Read Gene Kim's blog entry.
- Brakeman 1.9.0 Released: Learn more about the static analysis security scanner for Ruby on Rails.
- Improving Browser Security with CSP: Read how Twitter uses CSP.
- The Convergence of DevOps: Read how DevOps is the culmination of three amazing and significant movements.
- The History of DevOps: Get a firsthand account of the history of DevOps.
- The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win: Learn how to recognize problems that happen in IT organizations, how these problems jeopardize nearly every commitment the business makes, and how DevOps techniques can fix the problems.
- 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr: The seminal presentation by John Allspaw and Paul Hammond about how Flickr achieved fast flow of features while maintaining world-class stability, reliability, and security. They address the benefits of this high rate of change, as well as the culture and technology needed to make it possible.
- W. Edwards Deming: Read about Deming on Wikipedia.