DORA metrics are a set of standardized software delivery performance indicators that describe how effectively an enterprise builds, deploys and manages software.
Developed by Google’s DevOps Research and Assessment (DORA) team, DORA metrics provide industry-standard benchmarks for software delivery and DevOps maturity.
At their core, DORA metrics focus on two dimensions: throughput and stability.
Throughput-oriented metrics reveal how quickly code changes move through the delivery pipeline (from commit to production), while stability-oriented metrics show how frequently deployments break and how fast teams recover. Together, they give business leaders and DevOps teams evidence-based, reproducible insights into DevOps performance. Businesses don’t have to rely on subjective performance measures or vanity indicators such as “lines of code written” or “hours worked.”
Tracking DORA metrics enables DevOps teams to see how they stack up against industry-recognized performance tiers (low, medium, high and elite) and whether they are aligned with enterprise-wide goals. It helps businesses objectively identify bottlenecks in continuous integration/continuous delivery (CI/CD) pipelines, software testing and incident management.
DORA metrics also make it easier for leadership to communicate the health of engineering departments to business stakeholders and justify investments in the advanced automation, observability and site reliability engineering (SRE) tools that help optimize DevOps practices.
In DevOps, DORA refers to the DevOps Research and Assessment team that popularized organizational performance metrics for software development teams.
DORA—now part of Google Cloud—started as an independent research project that studied how software delivery processes affect business outcomes. The DORA team was interested in answering one key question: What makes high‑performing software teams different from low performers?
They collected survey and telemetry data from thousands of teams and found that a small set of delivery- and operations-related metrics were strongly predictive of overall performance. Those metrics are deployment frequency, lead time for changes, change failure rate and failed deployment recovery time. The results also showed that release speed and software stability can go hand in hand, so teams don’t have to prioritize one over the other.
Eventually, the four indicators became the standard DORA metrics, which provided industry-standard benchmarks for software delivery performance. In 2024, the DORA research team added a fifth metric—deployment rework rate—to the framework.
The five DORA metrics can be grouped into two overarching categories: “throughput” and “stability.”
Throughput metrics include deployment frequency, lead time for changes and failed deployment recovery time, which together describe how many changes an application can handle and how quickly changes flow to production.
Stability metrics include change failure rate and deployment rework rate, which measure the proportion of deployments that cause failures and the share of deployments that are unplanned or driven by production incidents.
Deployment frequency measures how often a team successfully deploys code to production over a period of time (per day, per week, per month).
Deployment frequency is one of the most visible signals of how agile and confident a software team is in its delivery process. When a team deploys code to production many times per day, it usually means that automated build, test and deployment processes are running smoothly and that sound observability mechanisms are in place. These conditions enable development teams to release software without fear of breaking the system.
In contrast, teams that deploy only once per month (or even less frequently) often rely on long-lived code branches, manual testing, complex approval protocols and big release days. All of these factors slow down the user feedback cycle and increase the cost of fixing defects.
Notably, high deployment frequency is not a sign of health if it comes with frequent outages or rollbacks. High-performing teams combine high deployment frequency with robust quality assurance practices (multilevel automated testing, staggered deployments, feature flags and real-time monitoring practices) that help teams ship often while minimizing risk to users.
From a leadership perspective, tracking deployment frequency over time can reveal bottlenecks in the software delivery pipeline. For example, a drop in deployment frequency after introducing a new approval board might indicate that the process is too heavy‑handed. Conversely, a steady increase in deployment frequency—paired with improving reliability metrics—is a strong indicator that investments in CI/CD pipelines and DevOps practices are paying off.
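As a rough illustration of how deployment frequency might be tracked, the following Python sketch counts successful production deployments per ISO calendar week. The function name `deployments_per_week` and the sample dates are hypothetical; in practice, the dates would come from a team's own CI/CD or deployment logs.

```python
from collections import Counter
from datetime import date


def deployments_per_week(deploy_dates):
    """Count successful production deployments per ISO (year, week)."""
    counts = Counter()
    for d in deploy_dates:
        iso = d.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


# Hypothetical sample data: dates of successful production deployments.
deploys = [
    date(2024, 3, 4),
    date(2024, 3, 5),
    date(2024, 3, 5),
    date(2024, 3, 12),
]

weekly = deployments_per_week(deploys)
for (year, week), count in sorted(weekly.items()):
    print(f"{year}-W{week:02d}: {count} deployment(s)")
```

Plotting these weekly counts over several months is one simple way to spot the drops and steady increases described above.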
Lead time for changes (also called change lead time) measures the amount of time it takes for a code change to reach deployment and run successfully in live production after the change has been committed to the main code branch in a version control system.
This end-to-end window captures everything, including build time, test execution, code review, approvals, pipeline stages and the actual deployment. It reveals how fast a system can convert a decision into a tangible change for end users.
When lead time is short (under an hour), organizations can respond to incidents in near real time. For example, if a security team flags a vulnerability, the engineering team can implement a fix, validate it through automated checks and get it live in a very short span of time. This rapid feedback loop ties development work directly to business outcomes, because teams can measure the impact of changes quickly and adjust accordingly.
Reducing change lead time almost always requires DevOps teams to automate and simplify the delivery pipeline. This process typically includes investing in continuous integration (CI) tools so that every commit is built and tested quickly and streamlining approval processes so that approvals are integrated with development workflows.
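To make the measurement concrete, here is a minimal Python sketch that computes lead time for changes from pairs of commit and deployment timestamps. It assumes each change is recorded as a hypothetical `(committed_at, deployed_at)` pair and reports the median, which is one common way to summarize the metric; teams might reasonably choose a mean or a percentile instead.

```python
from datetime import datetime, timedelta
from statistics import median


def median_lead_time(changes):
    """Return the median time from commit to successful production deploy.

    `changes` is an iterable of (committed_at, deployed_at) datetime pairs.
    """
    seconds = [
        (deployed_at - committed_at).total_seconds()
        for committed_at, deployed_at in changes
    ]
    if not seconds:
        raise ValueError("at least one change is required")
    return timedelta(seconds=median(seconds))


# Hypothetical sample data: three changes with lead times of
# 30 minutes, 2 hours and 45 minutes.
changes = [
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 9, 30)),
    (datetime(2024, 3, 4, 10, 0), datetime(2024, 3, 4, 12, 0)),
    (datetime(2024, 3, 5, 14, 0), datetime(2024, 3, 5, 14, 45)),
]

print(median_lead_time(changes))
```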
Failed deployment recovery time, also called mean time to recovery (MTTR) or mean time to restore service, assesses how long it takes to restore service to normal after a deployment-related failure.
Service remediation might require developers to roll back a bad change, deploy a hotfix (a small, urgent code change released outside the normal release cycle) or otherwise mitigate the impact of the issue so users can use the system normally again.
When recovery time is short (a matter of minutes), it usually means several DevOps components are working well together.
Observability is strong. Real-time telemetry data—metrics, logs and traces—is fully aggregated and centralized in dashboards. Incidents are detected quickly (often automatically) and without waiting for users to complain.
Alerting is appropriately tuned so that engineers are notified promptly but not overwhelmed by noise.
DevOps teams rely on runbooks (step-by-step tactical guides for carrying out technical tasks), playbooks (higher-level strategic guides that describe how teams should respond to a situation) and documented rollback procedures. These frameworks take the guesswork out of determining what to do in the middle of an outage.
Long recovery times (hours or even days) often indicate weak operational practices. Detection might be poor, so production failures linger until customers report them. Diagnosis is a slow process, because data collection is scattered, error messages are unclear or teams lack a shared understanding of how the system behaves in failure modes. In these environments, each incident becomes a high-stress, high-risk event instead of a straightforward problem-solving exercise with a clear resolution protocol.
When paired with the other DORA metrics, low failed deployment recovery times help ensure that even if deployments occasionally fail, the failure is contained and response times are short.
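Failed deployment recovery time can be sketched in a similar way. The Python example below averages the time between a deployment-related failure and the restoration of service. The `(failed_at, restored_at)` pair format and the sample incidents are assumptions made for illustration, not a prescribed data model.

```python
from datetime import datetime, timedelta


def mean_recovery_time(incidents):
    """Return the mean time from failure to restored service.

    `incidents` is an iterable of (failed_at, restored_at) datetime pairs.
    """
    durations = [restored_at - failed_at for failed_at, restored_at in incidents]
    if not durations:
        raise ValueError("at least one incident is required")
    return sum(durations, timedelta()) / len(durations)


# Hypothetical sample data: one 10-minute and one 50-minute recovery.
incidents = [
    (datetime(2024, 5, 2, 14, 0), datetime(2024, 5, 2, 14, 10)),
    (datetime(2024, 5, 9, 9, 30), datetime(2024, 5, 9, 10, 20)),
]

print(mean_recovery_time(incidents))
```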
Change failure rate (CFR) is a “quality of delivery” metric that tells DevOps teams how frequently deployments fail in live production environments. Any problematic production deployment that forces the team to intervene immediately—instead of waiting until the next planned release to address the issue—is considered a “failed” deployment.
Tracking CFR over time helps teams understand how reliably they are delivering changes and serves as a counterbalance to throughput metrics (deployment frequency, lead time for changes) that prioritize speed.
CFR is typically expressed as a percentage and is calculated by dividing the number of failed deployments by the total number of deployments over the same period and then multiplying by 100.
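The calculation described above can be expressed as a short Python function. The function name and the sample counts are illustrative only.

```python
def change_failure_rate(failed_deployments, total_deployments):
    """Return CFR as a percentage of deployments that failed in production."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and the total")
    return 100 * failed_deployments / total_deployments


# Hypothetical example: 3 failed deployments out of 40 in the same period.
cfr = change_failure_rate(3, 40)
print(f"Change failure rate: {cfr}%")
```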
A low change failure rate indicates that a team is shipping changes that generally behave as expected in production. When CFR is low and stable, teams can safely increase deployment frequency and shorten change lead time without harming user experience.
However, a high change failure rate suggests that too many changes are destabilizing the system. Causes can include incomplete or flaky tests, staging and test environments that differ from production, deployments that are too large, or slow monitoring and alerting. High CFR signals that the organization should prioritize smaller, more incremental changes and safer deployment practices (such as canary releases, where a new app or feature is rolled out to a small group of users before the wider release) so that fewer changes turn into production incidents.
Deployment rework rate is a stability metric that measures what fraction of a team’s deployments are rework. “Rework” refers to any deployment that is triggered by a user-facing bug or incident in production and is not part of the original, planned release train.
Deployment rework rate complements the original four DORA metrics by making reactive work visible. It reveals how much of a team’s development effort is spent cleaning up past mistakes instead of moving the product forward.
A low rework rate generally indicates that DevOps teams are catching most issues before they reach end users and spending most of their time doing value-creating work (planned optimizations and refactors). A high rework rate means that teams are spending too much time on firefighting tasks.
Rework rates are especially valuable when combined with other key metrics. For example, if a team has high deployment frequency and a low rework rate, it signals that the team is shipping new software iterations quickly without sacrificing quality. But if the team shows high deployment frequency alongside a high rework rate, it suggests that they are shipping a lot of code but also burning a lot of time fixing it.
Importantly, a nonzero rework rate is normal (and often healthy). Even the best development teams can make mistakes, and well-built software applications will still require improvements and feature updates over time. The goal isn’t to hit 0% rework but to hold rework to a small, predictable percentage of all deployments.
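As a minimal sketch of how rework rate might be computed, the Python example below assumes that each deployment record carries a hypothetical `trigger` field set to either `"planned"` or `"incident"`. How a team actually labels rework deployments depends on its own tooling and conventions.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    id: str
    trigger: str  # hypothetical field: "planned" or "incident"


def rework_rate(deployments):
    """Return the percentage of deployments triggered by production incidents."""
    if not deployments:
        raise ValueError("at least one deployment is required")
    rework = sum(1 for d in deployments if d.trigger == "incident")
    return 100 * rework / len(deployments)


# Hypothetical sample data: 18 planned deployments and 2 incident-driven ones.
history = [Deployment(f"d{i}", "planned") for i in range(18)]
history += [Deployment("d18", "incident"), Deployment("d19", "incident")]

rate = rework_rate(history)
print(f"Deployment rework rate: {rate}%")
```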
For each DORA metric, teams are classified into one of four performance tiers—elite, high, medium or low—and each tier is based on established thresholds for the metric. The thresholds are:
| Metric | Elite | High | Medium | Low |
|---|---|---|---|---|
| Deployment frequency | Multiple deployments daily | Daily to weekly deployments | Weekly to monthly deployments | Monthly to twice-yearly deployments |
| Lead time for changes | Less than one hour | Several hours to one week | Several weeks to six months | More than six months |
| Failed deployment recovery time | Less than one hour | Less than one day | One day to one week | One week to one month |
| Change failure rate | 0–15% | 16–30% | 31–45% | 46–60% |
Because deployment rework rate is a newer metric, it doesn’t yet have established thresholds or performance tiers. However, many enterprises and DevOps teams track their own rework baselines and aim to trend downward over time. Using rework rate data alongside change failure rate and other reliability metrics helps teams improve deployment rework rates without overcorrecting based on a single number.
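For teams that want to map their own measurements onto the tiers in the table, the following Python sketch classifies change failure rate and failed deployment recovery time. It rests on a few assumptions that the table does not state: upper bounds are treated as inclusive for CFR, fractional values between bands (such as 15.5%) fall into the next tier, and any value beyond the listed "low" range is still reported as low.

```python
from datetime import timedelta


def cfr_tier(cfr_percent):
    """Map a change failure rate (percent) to a performance tier."""
    if cfr_percent <= 15:
        return "elite"
    if cfr_percent <= 30:
        return "high"
    if cfr_percent <= 45:
        return "medium"
    return "low"  # assumption: values above the table's range count as low


def recovery_time_tier(recovery_time):
    """Map a failed deployment recovery time to a performance tier."""
    if recovery_time < timedelta(hours=1):
        return "elite"
    if recovery_time < timedelta(days=1):
        return "high"
    if recovery_time <= timedelta(weeks=1):
        return "medium"
    return "low"  # assumption: values above the table's range count as low


print(cfr_tier(7.5))
print(recovery_time_tier(timedelta(hours=5)))
```

A team might land in different tiers for different metrics, which is itself a useful signal about where to focus improvement work.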
DORA metrics support the broader DevOps approach to software development, which prioritizes continuous collaboration between multidisciplinary teams across app development and operations functions. DevOps environments can only be as successful as the teams that comprise them.
DORA metrics enable enterprises to quantify team performance according to criteria that are closely aligned with broader value streams and business outcomes. For example, high performers and elite performers are more likely to meet profitability, time‑to‑market and customer satisfaction goals.
DevOps cultures are also characterized by a commitment to free-flowing communication and shared responsibility among software delivery stakeholders, all of whom work together to build high-quality software and accelerate product innovation. Therefore, many organizations use DORA metrics to create a continuous improvement feedback loop. The metrics measure each team’s performance. Teams can discuss opportunities for improvement based on the measurements, experiment with new approaches and remeasure performance to see which changes actually improve delivery practices.
Importantly, DORA metrics are diagnostic, not punitive. They help reveal systemic patterns in development processes, instead of focusing on the work of individual developers.
DORA metrics provide quantitative, repeatable measures of how quickly and reliably DevOps teams ship software. They replace subjective statements such as “we’re slow” or “we’re stable” with concrete numbers that can be monitored over time.
Tracking DORA metrics across the delivery pipeline enables teams to pinpoint where delays or failures occur, which makes it easier to decide which improvements will be most impactful.
Leaders can use DORA trends to justify investments in advanced tools or process optimizations by showing how those changes might affect release velocity and stability.
DORA data enables enterprises to compare themselves against industry benchmarks and diagnose DevOps maturity gaps across teams, which is especially valuable in large, multiteam environments.
When DORA metrics are shared transparently, they bring together product, operations and software engineering departments around shared outcomes and help multidisciplinary teams tweak their practices to get better results.