Test Data Management in the DevOps Lifecycle
Technologies such as Cloud, Mobile, Big Data, and Social are pushing companies to innovate more rapidly. DevOps is an enterprise capability for continuous software delivery that enables clients to seize market opportunities and reduce time to customer feedback. The need for DevOps stems from the pressures being placed on organizations today as they balance the need for stability and innovation. Test data management is about easily creating targeted, right-sized test databases rather than cloning entire production environments, and it is becoming more critical in the DevOps lifecycle. This blog post explores what DevOps is, how test data management fits within DevOps, and the key principles and capabilities of both DevOps and test data management.
What is DevOps & the need for it
DevOps is a new movement in the Information Technology industry. It is a Software Delivery approach based on lean principles in which the lines of business, development, operations, and quality assurance departments collaborate to deliver a new or enhanced capability to customers.
IBM has a very holistic definition of DevOps:
“An enterprise capability for continuous software delivery that enables clients to seize market opportunities and reduce time to customer feedback.”
This definition positions DevOps as a business capability that enables companies to take a capability from ideation to production efficiently, while capturing customer feedback to enhance the capability and its delivery.
The need for DevOps stems from the pressures being placed on organizations today as they balance the need for stability and innovation. New technological advances such as Social, Mobile, Big Data, and Cloud are pushing companies to innovate more rapidly than they have ever done before and to deliver more software-driven capabilities than ever before. At the same time, they must keep their systems stable and provide capabilities to their customers continuously. This requires efficient and effective Software Delivery capabilities. A recent survey found that 86% of companies believe that Software Delivery is ‘important’ to their business, while only 25% believe that they are good at it.
What is Test Data Management & the need for it in DevOps
Test data management is about easily creating targeted, right-sized test databases rather than cloning entire production environments. Without the need to manually create and maintain test data, development and test environments are more manageable for continuous testing. Simply stated, test data management is the process of quickly creating realistic test data at the time it is required for testing.
So, how does test data management fit with DevOps? DevOps requires that developers deploy applications regularly in order to validate their integrations, functionality, and performance. The goal of testing in DevOps is to perform these validations by carrying out appropriate integration, functional, and performance tests. This requires that applications be tested every time they are delivered (continuous integration and delivery). For such applications, providing good sets of test data is inherently challenging, and the challenge is exacerbated by the need to test the application with new and refreshed test data each time. Test data management addresses these challenges, which makes it a prerequisite for DevOps.
The Principles and Capabilities of DevOps
IBM has identified four key principles around which it has based its DevOps solution. They are:
1. Develop and test against a production-like system addresses the need for software delivery organizations (development, quality assurance, operations, security) to use environments in the ‘dev’ and ‘test’ stages of software delivery that are as close to the production systems as possible. For example, test data management takes a subset of your production data to easily create your test data for continuous testing. This allows teams to validate that the systems being developed will function and perform as designed, long before they are ready for deployment to production. It also tests and validates that the deployment processes function as designed, reducing challenges during final deployment to production.
2. Iterative and frequent deployments using repeatable and reliable processes addresses using ‘agile’ and/or ‘iterative’ development practices: develop – test – deploy – validate – adjust. Doing this cycle in a rapid and repetitive manner requires good software engineering and continuous deployment capabilities. This principle also allows operations teams to validate and enhance their systems early in the development lifecycle.
3. Continuously monitor and validate operational quality characteristics addresses the ‘feedback loop’ in the development lifecycle. Throughout the lifecycle, development and delivery teams need metrics to validate that the software they are delivering is functioning and performing as designed. During the development lifecycle, these metrics are gathered through continuous testing for integration, functionality, performance, and security. Once in production, teams additionally need to capture how customers are interacting with the software delivered.
4. Amplify feedback loops addresses the need to enhance the ‘feedback loops’ mentioned in the previous principle. This continuous feedback should be used to drive continuous improvement of both the software and the software delivery processes and capabilities. This requires that all stakeholders (lines of business, development, quality assurance, security, architecture, and operations) have access to the metrics being gathered. It also requires that the metrics be in a form that is consumable by these stakeholders.
The Principles and Capabilities of Test Data Management
IBM has identified the following best practices for test data management that enable continuous integration and delivery for DevOps:
1. Discover test data
Test cases need to be associated with the appropriate test data and finding the right test data for your test cases is critical. In some cases this data may exist across several production databases. For example, an application might use data from a customer record from a Siebel CRM database along with related details on purchased items from a separate inventory management system database. The goal is to capture the end-to-end business process and associated test data wherever it may reside. This will enable you to extract the appropriate data into the subset needed for the test cases.
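As a rough illustration of discovery across systems, the sketch below matches customer records from one source with purchased items from another. The record layouts, the shared customer_id key, and the discover_test_data helper are all assumptions made for this example; they do not describe any particular product or the schemas of the systems named above.

```python
from collections import defaultdict

# Hypothetical extracts from two separate systems, related by customer_id.
crm_customers = [
    {"customer_id": "C1", "name": "Customer One"},
    {"customer_id": "C2", "name": "Customer Two"},
    {"customer_id": "C3", "name": "Customer Three"},
]
inventory_purchases = [
    {"customer_id": "C1", "sku": "A-100"},
    {"customer_id": "C1", "sku": "B-200"},
    {"customer_id": "C3", "sku": "A-100"},
]


def discover_test_data(customers, purchases, min_items=1):
    """Return customers with at least `min_items` purchases, plus their SKUs."""
    # Group purchased SKUs by the customer they belong to.
    skus_by_customer = defaultdict(list)
    for purchase in purchases:
        skus_by_customer[purchase["customer_id"]].append(purchase["sku"])

    # Keep only customers whose related data satisfies the test case's needs.
    matches = []
    for customer in customers:
        skus = skus_by_customer.get(customer["customer_id"], [])
        if len(skus) >= min_items:
            matches.append({**customer, "skus": skus})
    return matches


# A test case that needs a customer with two or more purchased items.
candidates = discover_test_data(crm_customers, inventory_purchases, min_items=2)
```

In practice the two inputs would come from queries against the respective databases; the point here is only that the end-to-end record is assembled from more than one source.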
2. Automate creation of realistic “right sized” test data
How are enterprises creating test data today? They are either creating it manually or just cloning their entire production system to obtain their test database instead of extracting only the subset of test data needed to support the test case. These manual processes do not provide the agility needed for continuous integration and delivery for DevOps. Automated test data generation allows for rapid creation of test databases for various types of testing on demand.
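To make the difference between cloning and subsetting concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers/orders schema, the sample rows, and the extract_subset function are hypothetical; the idea it shows is selecting only the rows a test needs while keeping related rows consistent.

```python
import sqlite3

# Hypothetical schema standing in for a production database.
SCHEMA = """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
"""


def build_source():
    """Create a small in-memory 'production' database for the example."""
    source = sqlite3.connect(":memory:")
    source.executescript(SCHEMA)
    source.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ada", "EU"), (2, "Bob", "US"), (3, "Cy", "EU")],
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(10, 1, 25.0), (11, 2, 40.0), (12, 3, 15.5), (13, 1, 9.99)],
    )
    source.commit()
    return source


def extract_subset(source, region):
    """Copy only customers in `region`, and only their orders, into a new database."""
    target = sqlite3.connect(":memory:")
    target.executescript(SCHEMA)

    customers = source.execute(
        "SELECT id, name, region FROM customers WHERE region = ?", (region,)
    ).fetchall()
    target.executemany("INSERT INTO customers VALUES (?, ?, ?)", customers)

    # Follow the relationship so that no order is copied without its customer.
    ids = [row[0] for row in customers]
    if ids:
        placeholders = ",".join("?" * len(ids))
        orders = source.execute(
            f"SELECT id, customer_id, total FROM orders "
            f"WHERE customer_id IN ({placeholders})",
            ids,
        ).fetchall()
        target.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

    target.commit()
    return target


subset = extract_subset(build_source(), "EU")
```

The subset holds two customers and their three orders instead of the full data set, and every order still points at a customer that exists in the test database.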
3. Mask sensitive information for compliance and protection
Protecting data privacy is no longer optional—it’s the law! Organizations must have procedures in place to de-identify data across non-production environments to comply with data privacy regulations and avoid data breaches. Data masking provides development teams with meaningful test data, without exposing sensitive private information such as Personally Identifiable Information (PII) and Protected Health Information (PHI). Masking takes real data and makes it realistic but fictional so that no sensitive data is compromised.
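One simple way to picture masking is a deterministic substitution that keeps the shape of the data. The sketch below is an assumption-laden illustration, not a compliance-grade implementation: the key, the list of fictional names, and the mask_name and mask_digits helpers are invented for this example.

```python
import hashlib
import hmac

# Assumptions for illustration only: a demo key and a tiny pool of fictional names.
SECRET_KEY = b"demo-only-key"
FAKE_NAMES = ["Alex Reed", "Sam Lee", "Jo Park", "Kim Diaz"]


def _digest(value):
    """Keyed hash, so the same input always maps to the same masked output."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()


def mask_name(name):
    """Replace a real name with a fictional one chosen deterministically."""
    return FAKE_NAMES[_digest(name)[0] % len(FAKE_NAMES)]


def mask_digits(value, keep_last=0):
    """Replace digits with derived digits, keeping separators and length.

    The last `keep_last` digits are left unchanged, which some tests rely on.
    """
    digest = _digest(value)
    total_digits = sum(ch.isdigit() for ch in value)
    seen = 0
    masked = []
    for ch in value:
        if not ch.isdigit():
            masked.append(ch)  # keep separators such as '-' in place
            continue
        seen += 1
        if seen > total_digits - keep_last:
            masked.append(ch)  # preserved trailing digit
        else:
            masked.append(str(digest[(seen - 1) % len(digest)] % 10))
    return "".join(masked)


masked_name = mask_name("Ada Lovelace")
masked_id = mask_digits("123-45-6789", keep_last=4)
```

Because the mapping is deterministic, the same customer masks to the same value in every table, so joins between masked tables still line up.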
4. Refresh test data for continuous integration and delivery
Organizations are continuously integrating and delivering applications as part of DevOps. For this effort testers and developers need access to test data continuously, in order to run tests and builds, and run them again until the functionality works. Organizations can streamline test data delivery by enabling testers and developers to refresh test data without the need to involve database administrators (DBAs). This improves operational efficiency, provides more time for testing, and enables releases to be delivered more quickly and continuously.
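A self-service refresh can be as simple as restoring the test database from a known baseline before each run. The following sketch assumes a SQLite database and uses the standard library's Connection.backup method; the accounts table and the make_baseline and refresh helpers are hypothetical names chosen for the example.

```python
import sqlite3


def make_baseline():
    """Create the known-good baseline data set (hypothetical schema)."""
    baseline = sqlite3.connect(":memory:")
    baseline.executescript("""
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL);
        INSERT INTO accounts VALUES (1, 100.0), (2, 50.0);
    """)
    baseline.commit()
    return baseline


def refresh(baseline, test_db):
    """Overwrite the test database with the contents of the baseline."""
    test_db.rollback()        # discard uncommitted changes left by a failed test
    baseline.backup(test_db)  # copy every page of the baseline into test_db


baseline = make_baseline()
test_db = sqlite3.connect(":memory:")
refresh(baseline, test_db)    # initial load

# A test run changes the data...
test_db.execute("UPDATE accounts SET balance = 0")
test_db.execute("DELETE FROM accounts WHERE id = 2")
test_db.commit()

# ...and the tester restores the known state without involving a DBA.
refresh(baseline, test_db)
```

Wiring a function like this into the build pipeline means each test run starts from the same data, which is what makes repeated runs comparable.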
5. Analyze test data results
While functional testing confirms the behavior of the application, test data management enables organizations to assess changes in test data for success or failure. Comparing pre-test data against post-test data helps to assess whether the test passed or failed. This best practice surfaces hidden errors, allowing organizations to quickly identify and resolve defects for continuous integration and delivery.
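The pre-test versus post-test comparison can be sketched as a diff between two snapshots keyed by primary key. The snapshot format and the diff_snapshots function below are assumptions for illustration; a real tool would read the snapshots from the database.

```python
def diff_snapshots(before, after):
    """Compare two {primary_key: row} snapshots and report what changed."""
    added = {key: after[key] for key in after.keys() - before.keys()}
    removed = {key: before[key] for key in before.keys() - after.keys()}
    changed = {
        key: (before[key], after[key])
        for key in before.keys() & after.keys()
        if before[key] != after[key]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical order rows captured before and after a test run.
before = {
    1: {"status": "NEW", "total": 25.0},
    2: {"status": "NEW", "total": 40.0},
}
after = {
    1: {"status": "SHIPPED", "total": 25.0},
    2: {"status": "NEW", "total": 40.0},
    3: {"status": "NEW", "total": 15.5},
}

result = diff_snapshots(before, after)

# The test passes only if the data changed in exactly the expected way.
expected_changed_keys = {1}
test_passed = (
    set(result["changed"]) == expected_changed_keys and not result["removed"]
)
```

A change that the functional assertions never looked at, such as an unexpected deleted row, would show up under "removed" and fail the check.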
DevOps is a new movement in the IT industry that is critical to the success of innovative companies. Test data management takes a subset of your production data to easily create your test data for continuous testing as part of DevOps. Test data management is hence a fundamental part of the principles and capabilities of DevOps, and it is a must for continuous software delivery.
About the Author
Sanjeev is an 18-year veteran of the software industry. For the past 15+ years he has been with the Rational Brand, coming to IBM via the Rational acquisition. He is currently a Rational Specialty Architect in the Mid-Atlantic Business Unit in the United States. His current areas of expertise include Mobile, DevOps, Agile Transformation and Application Lifecycle Management. He has spoken at several international industry conferences, including IBM Innovate, and has written multiple internal and external articles. He blogs at