Enabling DevOps Success with Shift Left Continuous Testing
In my last blog post, I shared the 5 best practices my teams have learned during our own DevOps Journey.
This blog post focuses on my experiences with the second practice, Shift Left with continuous testing using automation and virtualization to eliminate long back-end test cycles and increase quality.
When we delivered our products on a longer development schedule, we followed Agile development practices and tested throughout our sprints; however, we were not at ship quality at the end of each sprint. We had to spend time (initially months, then weeks) at the end of the release cycle testing and uncovering issues that could have been found earlier in the cycle. This approach led to rework, inefficiencies and increased cost. There were multiple causes, including a lack of automation, no enforcement of done criteria, and a team culture that depended on the "end game" to flush out the final release issues. A key part of our DevOps journey has been to shift this testing to the left in our timeline and our process, so that we find problems earlier, when they are much less expensive to fix. The table below came from an IBM study showing that it’s 100 times more expensive to fix a defect found once it has been released.
There are many different test phases and for clarity, I wanted to share my team's definitions.
Our goal is to shift as much of this testing to the left as possible and to make not only the testing but also the progression of any build through the quality stages of our delivery pipeline automated and continuous. To accomplish this in my organization, we had to invest in having a single version of the truth for the quality of our deliverable across all facets of our testing. We had to change development ways of working to create and run automation as a part of the development process, not as an afterthought. In addition, we had to invest in creating test automation assets and ensuring strong governance across all of our testing end to end. Anything being done manually was analyzed, and an explicit decision was made to continue the manual work, eliminate it, or automate it.
The Pipeline - enabling speed and quality
A delivery pipeline which incorporates build, test automation and deployment is a critical element of moving quickly. Bringing together test automation from different sources into the delivery pipeline can be a challenge. Like many clients, my team has a set of test automation tools that have evolved over a period of time: JUnit tests, automated tests from our Rational Test Workbench tools like Rational Functional Tester and Rational Performance Tester, and a number of other home-grown tools that our developers and testers have written to make their jobs easier. This leads to a very disjointed environment where each team has its own isolated view of quality. To move quickly, the results of the individual teams have to feed into an overall view of quality of the entire deliverable in a repeatable way.
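To make the roll-up idea concrete, here is a minimal sketch of how results from different tools could feed one overall view. It assumes each tool can emit a JUnit-style XML report with tests, failures and errors counts on a testsuite element. The names SuiteSummary, summarize_junit_xml and overall_view are hypothetical, and this is not how our dashboards are actually implemented.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class SuiteSummary:
    name: str
    tests: int
    failed: int  # failures plus errors

    @property
    def green(self) -> bool:
        return self.failed == 0


def summarize_junit_xml(xml_text: str) -> SuiteSummary:
    """Read one JUnit-style report and count its results."""
    root = ET.fromstring(xml_text)
    # Some tools wrap the suite in <testsuites>; use the first suite then.
    suite = root if root.tag == "testsuite" else root.find("testsuite")
    if suite is None:
        raise ValueError("no <testsuite> element found")
    failed = int(suite.get("failures", 0)) + int(suite.get("errors", 0))
    return SuiteSummary(
        name=suite.get("name", "unnamed"),
        tests=int(suite.get("tests", 0)),
        failed=failed,
    )


def overall_view(summaries):
    """Roll every team's suite into one answer: is the deliverable green?"""
    failed = sum(s.failed for s in summaries)
    return {
        "total": sum(s.tests for s in summaries),
        "failed": failed,
        "green": failed == 0,
        "red_suites": [s.name for s in summaries if not s.green],
    }
```

The point of the design is that every tool reports in the same shape, so the overall answer is computed the same way on every build rather than assembled by hand.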
There are a couple of techniques I have used to address this challenge and speed up getting to this single view of quality:
Single view of quality
In order to know that something was GA ready, we needed the ability to view our results in a consistent way. We use Rational Quality Manager (RQM) for our dashboards. Our tests report results to RQM using the JUnit Selenium adapter and other adapters that exist or can be built, and RQM shows the overall results from the different test suites. One can drill down into the 'red x' to understand the cause and follow links to the work items and requirements, so when a test fails, we can trace it back to the change sets and to how the feature was intended to work. This makes debugging the failures more efficient. This view alone has increased the efficiency of our scrums, shortening them by 20% and saving time every day! Visit the live delivery pipeline view from Jazz.net. Here is a snapshot of our view.
Make consistency easy
A large portion of our test cases are run against Web UIs, which are traditionally very difficult to automate because the test cases are very fragile. They break easily when code changes are made. As a result, developers and testers started building their own frameworks to make it easier. Not only is this inefficient, because we have multiple people solving the same problems (which are not actually related to the development of our product), but it also exacerbated the problem described above. To address this issue, we started one single community source project to address the challenge of Web UI automation. We invested in one person to prototype the framework and make it available, then started an internal community source project where the entire team can contribute. By having a single framework developed through a community source project, you give people an incentive to leverage work done by others, contribute to a common framework which is integrated into the delivery, and share their great ideas which benefit others. Re-use is possible, maintenance of the framework is more efficient, and you start to converge on one tool. Consistency happens by making the 'right way the easy way'. Because it is easier, we've doubled our number of automated tests between releases. For greater details on our approach, you can read more here.
A pipeline is no good without test assets
Having a visible pipeline that represents the quality of the release provides people an incentive to contribute more test automation. Having the right tools in place is important; however, it is most critical to have a culture which values automation and is committed to creating assets as new features are developed. Assessing test automation coverage gaps and deciding the smartest place to invest is challenging. We have created test automation heat maps driven by the number of client problems found as compared to test coverage to help us understand where to start. You can never automate everything, but you can create a culture which questions any testing or deployment that is not automated.
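As an illustration of the heat map idea, here is a minimal sketch that ranks product areas by client problems relative to automated coverage. The scoring formula (client problems multiplied by the uncovered fraction), the area names and the numbers are all assumptions made for the example; they are not our actual model or data.

```python
from dataclasses import dataclass


@dataclass
class AreaStats:
    name: str
    client_problems: int       # problems reported by clients in this area
    automated_coverage: float  # fraction of the area covered, 0.0 to 1.0


def heat(area: AreaStats) -> float:
    # Assumed scoring: many client problems and little coverage means hotter.
    return area.client_problems * (1.0 - area.automated_coverage)


def heat_map(areas):
    """Return areas ordered from hottest (invest first) to coolest."""
    return sorted(areas, key=heat, reverse=True)


# Hypothetical sample data, for illustration only.
areas = [
    AreaStats("reporting", client_problems=10, automated_coverage=0.5),
    AreaStats("planning", client_problems=40, automated_coverage=0.25),
    AreaStats("build", client_problems=24, automated_coverage=0.75),
]

for area in heat_map(areas):
    print(f"{area.name}: heat {heat(area):.1f}")
```

Whatever formula a team picks, the value is in having an explicit, shared ranking to argue about instead of automating whatever happens to be easiest.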
While it is true that all of our testing will not be automated, continuing to create and run tests manually should be an overt business decision rather than something that is continued because the team culture perpetuates it. I have leveraged a number of approaches to help drive cultural change and attitude around creating test assets and automation.
Definition of done: In Agile practices, the definition of done and 'holding the line' are critical elements of culture change. It is important to ensure that test automation is part of the definition of done. Each feature team is responsible for creating test automation as a part of writing a new feature. Test automation is managed as code and the feature team is responsible for maintaining the code changes, not the test team. This governance approach significantly improves ownership and the long term value of the assets being created.
Innovation: It is also good to incent teams to experiment with techniques and approaches like Test Driven Development. Engineers love to innovate and as engineers are more involved in testing, it is important for them to realize the innovation opportunities in this space. Sharing success stories with the broader team drives excitement and energy across all disciplines.
Stability early: Results drive momentum and appropriate testing early on enables the team to find problems, fix them and get stable drivers and move faster. I remember the days where we went weeks without a stable driver that we could use to test and progress. Now if we have more than one hour with a broken build it impacts our velocity. This culture change to focus on 'getting green runs' and finding and fixing problems is a critical change in culture and is essential to allowing all the other more complex testing to occur much earlier.
Continuous Complex Testing in production-like environments finds difficult problems early
System Verification Testing
Many of the end game defects were found by our System Verification Test team. We analyzed the following to understand what was preventing us from shifting it left.
In evaluating the answers, the team found that we had automation that was no longer valuable. They established priorities with stakeholders and provided a coverage model which allowed them to invest in the important areas based on customer risk, rather than the approach of "run everything, all of the time."
Additionally, the team identified that they were spending roughly 60% of their time setting up and configuring complex test environments, with only 40% of their time allocated to actual testing. To remove the manual effort, we created infrastructure definition patterns and automated deployment scripts. This allowed the team to save 90% of the time they were spending on manual deployment and configuration of the test environment. This increase in speed enables us to run SVT continuously, keeping pace with feature development. We are also able to take the time savings and allocate them to more automation and more exploratory testing.
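To put those percentages together, here is a small worked calculation. It assumes the team's total time stays the same and that every hour saved on setup is redirected to testing work, which is a simplification; the function name reallocate is hypothetical.

```python
from fractions import Fraction


def reallocate(setup_share: Fraction, saving: Fraction):
    """Return (remaining setup share, freed share) of the team's total time."""
    remaining_setup = setup_share * (1 - saving)
    freed = setup_share - remaining_setup
    return remaining_setup, freed


setup_before = Fraction(60, 100)    # 60% of time spent on environment setup
testing_before = Fraction(40, 100)  # 40% of time spent on actual testing
saving = Fraction(90, 100)          # 90% of the setup time saved

setup_after, freed = reallocate(setup_before, saving)
testing_after = testing_before + freed  # assumes all freed time goes to testing

print(f"setup after automation: {float(setup_after):.0%}")
print(f"time freed up: {float(freed):.0%}")
print(f"time available for testing: {float(testing_after):.0%}")
```

Under these assumptions, setup shrinks from 60% to 6% of the team's time, which frees 54% and raises the time available for automation and exploratory testing from 40% to 94%.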
Performance Regression Testing
It is essential that our Performance Testing is run continuously. We used to run it only every 2 weeks because of the manual deployment and configuration work. That was problematic at the end of the cycle, when regressions surfaced late enough to impact the GA product. Today we’re able to run regressions against the daily build.
We accomplished this by developing a set of performance verification tests which we integrated into the daily build process. Now, when a build completes, it starts a job that automatically deploys a set of virtual machines and installs the latest build and performance automation. The job runs the performance automation to gather performance data, and then analyzes the results against baseline data for the prior release, flagging any potential regressions as well as highlighting areas where performance has improved. The analysis is published in HTML format, so that the performance data can easily be surfaced in project dashboards. Now, we are able to assess the performance of Rational Team Concert (including plan loading), Rational Quality Manager, and Rational DOORS Next Generation every day, and detect performance regressions as soon as they are introduced.
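The analysis step described above can be sketched roughly as follows. This is a minimal illustration, not our actual tooling: the 10% tolerance, the scenario names other than plan loading, and the function names classify, analyze and to_html are all assumptions.

```python
from html import escape

THRESHOLD = 0.10  # assumed tolerance: ignore changes within 10% of baseline


def classify(baseline_ms, current_ms, threshold=THRESHOLD):
    """Compare one measurement with its baseline from the prior release."""
    change = (current_ms - baseline_ms) / baseline_ms
    if change > threshold:
        return "regression"
    if change < -threshold:
        return "improved"
    return "unchanged"


def analyze(baseline, current, threshold=THRESHOLD):
    """Return (scenario, baseline, current, status) rows for shared scenarios."""
    rows = []
    for scenario in sorted(baseline):
        if scenario in current:
            status = classify(baseline[scenario], current[scenario], threshold)
            rows.append((scenario, baseline[scenario], current[scenario], status))
    return rows


def to_html(rows):
    """Render the rows as a simple HTML table that a dashboard can embed."""
    lines = [
        "<table>",
        "<tr><th>Scenario</th><th>Baseline (ms)</th>"
        "<th>Current (ms)</th><th>Status</th></tr>",
    ]
    for scenario, base, cur, status in rows:
        lines.append(
            f"<tr><td>{escape(scenario)}</td><td>{base}</td>"
            f"<td>{cur}</td><td>{status}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical timings in milliseconds, for illustration only.
baseline = {"plan loading": 1000, "open work item": 200, "run query": 500}
current = {"plan loading": 1300, "open work item": 205, "run query": 400}
report = to_html(analyze(baseline, current))
```

A tolerance band like this keeps normal run-to-run noise from being flagged every day; in practice the right threshold depends on how stable the measurements are.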
As enterprise projects implement DevOps, we find multiple teams contributing to the same solution, but not all developing at the same pace. One example might be two teams: one is building a mobile application that calls a back-end service, like a database or an application server, and the other is working on the back-end modifications needed by the mobile application. It is essential that the mobile (or systems of engagement) team move quickly and also test their applications with these back-end systems. However, the back-end services team may not move at the same pace. Many clients use test virtualization as a way to handle these differences in speed. Test virtualization enables the mobile team to simulate the back-end system, staying in sync with code changes from the back-end services team while still running their mobile testing continuously, enabling speed with quality.
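To show the idea in miniature, the sketch below stands up a tiny simulated back-end that returns canned responses, so client code can be tested before the real service is ready. It uses only the Python standard library rather than a test virtualization product, and the /accounts/42 endpoint, the canned data and the names VirtualBackend, start_virtual_backend and fetch_balance are all hypothetical.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Canned responses agreed with the back-end team (hypothetical contract).
CANNED_RESPONSES = {"/accounts/42": {"id": 42, "balance": 100}}


class VirtualBackend(BaseHTTPRequestHandler):
    """Simulated back-end service that answers GET requests from canned data."""

    def do_GET(self):
        body = CANNED_RESPONSES.get(self.path)
        status = 200 if body is not None else 404
        payload = json.dumps(body if body is not None else {"error": "not found"})
        data = payload.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_virtual_backend():
    """Start the simulated service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), VirtualBackend)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_balance(host, port, account_id):
    """Stand-in for the client code under test: calls the back-end over HTTP."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", f"/accounts/{account_id}")
        response = conn.getresponse()
        return json.loads(response.read())["balance"]
    finally:
        conn.close()
```

A test would call start_virtual_backend(), point fetch_balance at the returned server's address, and shut the server down afterwards. Because the client talks to the simulation over the same protocol it will use in production, swapping in the real service later should only require changing the host and port.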
Security testing is an essential part of our quality criteria. Our product development teams run static security testing with AppScan as a part of their build process. Our CLM FVT team focuses on dynamic security scans with AppScan, which we run in our production-like environments. This allows us to find security problems throughout the release cycle and make sure we are delivering a secure product. We also run penetration testing (malicious testing), where we look for security issues a hacker might find.
Market pressures and technology changes are driving the need for speed while continuing to improve quality. As development executives, it is our responsibility to find the right balance. Continuous Testing depends on the essential ingredients of automating everything possible and integrating it into the pipeline, shifting complex production testing to the left, and changing team culture. Making this shift takes investment and requires analysis of your current bottlenecks and re-prioritization of work to enable the changes needed to drive continuous testing. The pay-off is stability early which enables quality and provides the confidence needed by the entire team to move at the speed of business!