What is DevOps?
By definition, DevOps outlines a software development process and an organizational culture shift that speeds the delivery of higher quality software by automating and integrating the efforts of development and IT operations teams – two groups that traditionally practiced separately from each other, or in silos.
In practice, the best DevOps processes and cultures extend beyond development and operations to incorporate inputs from all application stakeholders - including platform and infrastructure engineering, security, compliance, governance, risk management, line-of-business, end-users and customers - into the software development lifecycle.
DevOps represents the current state of the evolution of software delivery cycles during the past 20+ years, from giant application-wide code releases every several months or even years, to iterative smaller feature or functional updates released as frequently as every day or several times per day.
Ultimately, DevOps is about meeting software users’ ever-increasing demand for frequent, innovative new features and uninterrupted performance and availability.
How we got to DevOps
Until just before 2000, most software was developed and updated using waterfall methodology, a linear approach to large-scale development projects. Software development teams would spend months developing large bodies of new code that impacted most or all of the application. Because the changes were so extensive, they spent several more months integrating that new code into the code base.
Next, quality assurance (QA), security and operations teams would spend still more months testing the code. The result was months or even years between software releases, and often several significant patches or bug fixes between releases as well. And this “big bang” approach to feature delivery was often characterized by complex and risky deployment plans, hard to schedule interlocks with upstream and downstream systems, and IT’s “great hope” that the business requirements had not changed drastically in the months leading up to production “go live.”
To speed development and improve quality, development teams began adopting agile software development methodologies, which are iterative rather than linear and focus on making smaller, more frequent updates to the application code base. Chief among these methodologies are continuous integration and continuous delivery, or CI/CD. In CI/CD smaller chunks of new code are merged into the code base every one or two weeks, and then automatically integrated, tested and prepared for deployment to the production environment. Agile evolved the “big bang” approach into a series of “smaller snaps” which also compartmentalized risks.
The more effectively these agile development practices accelerated software development and delivery, the more they exposed still-siloed IT operations - system provisioning, configuration, acceptance testing, management, monitoring - as the next bottleneck in the software delivery lifecycle.
So DevOps grew out of agile. It added new processes and tools that extend the continuous iteration and automation of CI/CD to the rest of the software delivery lifecycle. And it implemented close collaboration between development and operations at every step in the process.
How DevOps works: The DevOps lifecycle
The DevOps lifecycle (sometimes called the continuous delivery pipeline, when portrayed in a linear fashion) is a series of iterative, automated development processes, or workflows, executed within a larger, automated and iterative development lifecycle designed to optimize the rapid delivery of high-quality software. The name and number of workflows can differ depending on whom you ask, but they typically boil down to these six:
- Planning (or ideation). In this workflow, teams scope out new features and functionality in the next release, drawing from prioritized end-user feedback and case studies, as well as inputs from all internal stakeholders. The goal in the planning stage is to maximize the business value of the product by producing a backlog of features that when delivered produce a desired outcome that has value.
- Development. This is the programming step, where developers test, code, and build new and enhanced features, based on user stories and work items in the backlog. A combination of practices such as test-driven development (TDD), pair programming, and peer code reviews, among others are common. Developers often use their local workstations to perform the “inner loop” of writing and testing code before sending it down the continuous delivery pipeline.
- Integration (or build, or continuous Integration and continuous delivery (CI/CD). As noted above, in this workflow the new code is integrated into the existing code base, then tested and packaged into an executable for deployment. Common automation activities include merging code changes into a “master” copy, checking out that code from a source code repository, and automating the compile, unit test and packaging into an executable. Best practice is to store the output of the CI phase in a binary repository, for the next phase.
- Deployment (usually called continuous deployment). Here the runtime build output (from integration) is deployed to a runtime environment - usually a development environment where runtime tests are executed for quality, compliance and security. If errors or defects are found, developers have a chance to intercept and remediate any problems before any end users see them. There are typically environments for development, test, and production, with each environment requiring progressively “stricter” quality gates. A good practice for deployment to a production environment is typically to deploy first to a subset of end users, and then eventually to all users once stability is established.
- Operations. If getting features delivered to a production environment is characterized as “Day 1”, then once features are running in production “Day 2” operations occur. Monitoring feature performance, behavior, and availability ensures that the features are able to provide value add to end users. Operations ensures that features are running smoothly and that there are no interruptions in service - by making sure the network, storage, platform, compute and security posture are all healthy! If something goes wrong, operations ensures incidents are identified, the proper personnel are alerted, problems are determined, and fixes are applied.
- Learning (sometimes called continuous feedback). This is the gathering of feedback from end users and customers on features, functionality, performance and business value to take back to planning for enhancements and features the next release. This would also include any learning and backlog items from the operations activities, that could empower developers to proactively avoid any past incidents that could happen again in the future. This is the point where the “wraparound” to the Planning phase happens and we “continuously improve!”
Three other important continuous workflows occur between these workflows:
Continuous testing: Classical DevOps lifecycles include a discrete “test” phase that occurs between integration and deployment. However, DevOps has advanced such that certain elements of testing can occur in planning (behavior-driven development), development (unit testing, contract testing), integration (static code scans, CVE scans, linting), deployment (smoke testing, penetration testing, configuration testing), operations (chaos testing, compliance testing), and learning (A/B testing). Testing is a powerful form of risk and vulnerability identification and provides an opportunity for IT to accept, mitigate, or remediate risks.
Security: While waterfall methodologies and agile implementations 'tack on' security workflows after delivery or deployment, DevOps strives to incorporate security from the start (Planning) - when security issues are easiest and least expensive to address - and continuously throughout the rest of the development cycle. This approach to security is referred to as shifting left (which is easier to understand if you look Figure 1). Some organizations have had less success shifting left than others, which led to the rise of DevSecOps (see below).
Compliance. Regulatory compliance (governance and risk) are also best addressed early and throughout the development lifecycle. Regulated industries are often mandated to provide a certain level of observability, traceability and access of how features are delivered and managed in their runtime operational environment. This requires planning, development, testing, and enforcement of policies in the continuous delivery pipeline and in the runtime environment. Auditability of compliance measures is extremely important for proving compliance to 3rd party auditors.
It's generally accepted that DevOps methods can't work without a commitment to DevOps culture, which can be summarized as a different organizational and technical approach to software development.
At the organizational level, DevOps requires continuous communication, collaboration and shared responsibility among all software delivery stakeholders - software development and IT operations teams for certain, but also security, compliance, governance, risk and line-of-business teams - to innovate quickly and continually, and to build quality into software from the start.
In most cases the best way to accomplish this is to break down these silos and reorganize them into cross-functional, autonomous DevOps teams that can work on code projects from start to finish - planning to feedback - without making handoffs to, or waiting for approvals from, other teams. When put in the context of agile development, the shared accountability and collaboration are the bedrock of having a shared product focus that has a valuable outcome.
At the technical level, DevOps requires a commitment to automation that keeps projects moving within and between workflows, and to feedback and measurement that enable teams to continually accelerate cycles and improve software quality and performance.
DevOps tools: Building a DevOps toolchain
The demands of DevOps and DevOps culture put a premium on tooling that supports asynchronous collaboration, seamlessly integrates DevOps workflows, and automates the entire DevOps lifecycle as much as possible. Categories of DevOps tools include:
- Project management tools - tools that enable teams to build a backlog of user stories (requirements) that form coding projects, break them down into smaller tasks and track the tasks through to completion. Many support agile project management practices such, as Scrum, Lean and Kanban, that developers bring to DevOps. Popular open source options include GitHub Issues and Jira.
- Collaborative source code repositories - version-controlled coding environments that that let multiple developers to work on the same code base. Code repositories should integrate with CI/CD, testing and security tools, so that when code is committed to the repository it can automatically move to the next step. Open source code repositories include GiHub and GitLab.
- CI/CD pipelines - tools that automate code checkout, building, testing and deployment. Jenkins is the most popular open source tool in this category; many previously open-source alternatives, such as CircleCI, are now available in commercial versions only. When it comes to continuous deployment (CD) tools, Spinnaker straddles between application and infrastructure as code layers. ArgoCD is another popular open source choice for Kubernetes native CI/CD.
- Test automation frameworks - these include software tools, libraries and best practices for automating unit, contract, functional, performance, usability, penetration and security tests. The best of these tools support multiple languages; some use artificial intelligence (AI) to automatically reconfigure tests in response to code changes. The expanse of test tools and frameworks is far and wide! Popular open source test automation frameworks include Selenium, Appium, Katalon, Robot Framework, and Serenity (formerly known as Thucydides).
- Configuration management (infrastructure as code) tools - these enable DevOps engineers to configure and provision fully versioned and fully documented infrastructure by executing a script. Open source options include Ansible (Red Hat), Chef, Puppet and Terraform. Kubernetes performs the same function for containerized applications (see 'DevOps and cloud-native development,' below).
- Monitoring tools - these help DevOps teams identify and resolve system issues; they also gather and analyze data in real time to reveal how code changes impact application performance. Open source monitoring tools include Datadog, Nagios, Prometheus and Splunk.
- Continuous feedback tools - tools that gather feedback from users, either through heatmapping (recording users' actions on screen), surveys, or self-service issue ticketing.
DevOps and cloud-native development
Cloud-native is an approach to building applications that leverage foundational cloud computing technologies. The goal of cloud-native is to enable a consistent and optimal application development, deployment, management and performance across public, private and multicloud environments.
Today, cloud-native applications are typically
- Built using microservices - loosely-coupled, independently deployable components that have their own self-contained stack, and communicate with each other via REST APIs, event streaming or message brokers.
- Deployed in containers - executable units of code that contain all the code, runtimes and operating system dependencies required to run the application. (For most organizations, 'containers' is synonymous with Docker containers, but other container types exist.)
- Operated (at scale) using Kubernetes, an open-source container orchestration platform for scheduling and automating the deployment, management and scaling of containerized applications.
In many ways, cloud-native development and DevOps are made for each other.
Developing and updating microservices - i.e., the iterative delivery of small units of code to a small code base - is a perfect fit for DevOps rapid release and management cycles. And it would be difficult to deal with the complexity of a microservices architecture without DevOps deployment and operation.
By packaging and permanently fixing all OS dependencies, containers enable rapid CI/CD and deployment cycles, because all integration, testing and deployment occurs in the same environment. And Kubernetes orchestration performs the same continuous configuration tasks for containerized applications as Ansible, Puppet and Chef perform for non-containerized applications.
Most leading cloud computing providers - including AWS, Google, Microsoft Azure, and IBM Cloud - offer some sort of managed DevOps pipeline solution.
What is DevSecOps?
DevSecOps is DevOps that continuously integrates and automates security throughout the DevOps lifecycle - from start to finish, from planning through feedback and back to planning again.
Another way to put this is that DevSecOps is what DevOps was supposed to be from the start. But two of the early significant (and for a time insurmountable) challenges of DevOps adoption were integrating security expertise into cross-functional teams (a cultural problem), and implementing security automation into the DevOps lifecycle (a technical issue). Security came to be perceived as the "Team of 'No,'" and as an expensive bottleneck in many DevOps practices.
DevSecOps emerged as a specific effort to integrate and automate security as originally intended. In DevSecOps, security is a “first class” citizen and stakeholder along with development and Operations, and brings security into the development process with a product focus.
Watch 'What is DevSecOps?' to learn more about DevSecOps principles, benefits and use cases:
DevOps and site reliability engineering (SRE)
Site reliability engineering (SRE) uses software engineering techniques to automate IT operations tasks - e.g. production system management, change management, incident response, even emergency response - that might otherwise be performed manually by systems administrators. SRE seeks to transform the classical system administrator into an engineer.
The ultimate goal of SRE is similar to the goal of DevOps, but more specific: SRE aims to balance an organization's desire for rapid application development with its need to meet performance and availability levels specified in service level agreements (SLAs) with customers and end-users.
Site reliability engineers achieve this balance by determining an acceptable level of operational risk caused by applications - called an 'error budget' - and by automating operations to meet that level.
On a cross-functional DevOps team, SRE can serve as a bridge between development and operations, providing the metrics and automation the team needs to push code changes and new features through the DevOps pipeline as quickly as possible, without 'breaking' the terms of the organizations SLAs.
DevOps and IBM Cloud
DevOps requires collaboration across business, development, and operation stakeholders to expeditiously deliver and run reliable software. Organizations that use DevOps tools and practices while transforming their culture build a powerful foundation for digital transformation, and for modernizing their applications as the need for automation widens across business and IT operations.
A move toward greater automation should start with small, measurably successful projects, which you can then scale and optimize for other processes and in other parts of your organization.
Working with IBM, you’ll have access to AI-powered automation capabilities, including prebuilt workflows, to make every IT services process more intelligent, freeing up teams to focus on the most important IT issues and accelerate innovation.
Take the next step:
- See how you can place AI at the core of your entire IT operations toolchain with IBM Cloud Pak for Watson AIOps.
- Explore additional IBM tools to support a DevOps approach, including IBM Architecture Room Live, IBM Rational Test Workbench, IBM UrbanCode Deploy, and IBM UrbanCode Velocity.
- Discover how IBM DevOps can help you shorten releases, improve reliability and stay ahead of the competition.
- Build DevOps skills through our “Introduction to DevOps for Cloud Solutions” course contained within the Cloud Architect Professional learning curriculum or the courses on DevOps capabilities and IBM Cloud Continuous Delivery Toolchain contained within the Cloud Developer Professional learning curriculum.
- Register to download the Gartner report and discover how to future-proof your IT operations with AI.
- Download the IBM Cloud infographic that shows the benefits of AI-powered automation for IT operations. (PDF, 467 KB)
- Check out the eBook, Putting the Ops into DevOps for Dummies (PDF, 2.1 MB).
- Read about the five “must-have’s” for automation success (link resides outside IBM) in this HFS Research report.
Get started with an IBM Cloud account today.
IBM Cloud Pak for Watson AIOps uses machine learning and natural language understanding to correlate structured and unstructured data across your operations toolchain in real time to uncover hidden insights and help identify root causes faster. Eliminating the need for multiple dashboards, Watson AIOps feeds insights and recommendations directly into your team workflows to speed incident resolution.
About the author
Andrea C. Crawford is a Distinguished Engineer at IBM with expertise in classical and modern DevOps. She’s passionate about helping client’s transform application delivery through people, process and tool modernization. Andrea enjoys exploring tactical “outliers” like gardening and riding motorcycles.
https://twitter.com/acmthinks (link resides outside IBM)
https://medium.com/@acmThinks (link resides outside IBM)