Test Data Delivery: Key Pillar of the Modern Software Factory

In today's rapidly changing and disruptive business environment, speed is the name of the game. The desire to achieve digital transformation is pushing enterprise IT to deliver more capabilities faster, with higher quality. As traditional IT departments embark on transforming their service development and delivery processes by adopting Agile and DevOps principles, they are finding software testing to be an inhibitor to faster development. Data-driven testing is a well-established technique that allows QA teams to deliver quality at higher efficiency with easier-to-maintain coverage. While data-driven testing helps, it also shifts the testing problem to a secure test data availability problem.

Organizations realize that speeding up testing is not possible without an efficient test data management practice. One of the major issues testing teams face is having the skill set and ability to locate the right data across different data sources to fulfill their testing scenarios. For simpler applications, the UI can help find the data, but most of today's business applications are, in reality, composite apps. It takes much longer for testing teams to correlate and find consistent data across multiple application components. Some testing cycles can be stalled for weeks because the right test data is unavailable.

What exactly are testing teams looking for? We have analyzed the test data requirements of many of our enterprise customers and have seen a very common pattern. Testing teams are generally looking for specific types of business objects related to the application domain they are testing. For example, in a banking application, testers are looking for bank accounts with specific characteristics. Similarly, QA teams in insurance companies are looking for insurance policies to test their desired scenarios. These business objects generally are part of an organization's master data.

Most testing teams, however, work in silos. They all need very similar types of data, yet each team spends a lot of time finding or creating that data for its own testing needs. If we create a standardized, shared service that helps testing teams get business objects based on their criteria, we can bring clear efficiency into the testing process. This is where Test Data Delivery comes into play.

The 5 Characteristics of Test Data Delivery

For Test Data Delivery to be most effective, it needs to have the following 5 key characteristics:


1. Standardized Requests

A testing team member sees the world from a business perspective, so they should be familiar with business objects, their relationships, and objectives (in the business domain/context they are testing). If they are not familiar with business objects, it will be impossible to create valid testing scenarios. What they often don't know is how the business model gets translated many times across federated databases. Test Data Delivery should abstract the underlying DB models and complexities into terms more familiar to the testing team member.

There needs to be a standard way of requesting the test data so the service can be consumed across testing teams. Since we are dealing with a mix of manual and automated testing, the request process should be consumable both interactively as well as through APIs. QA engineers will interact with the service at two different phases:

Test Case Setup Phase:

During test case setup time, a QA engineer needs to be able to explore the data interactively and define test data criteria that will be linked back to the test case. Similarly, the Test Automation engineer needs to be able to define test data criteria external to the test automation code so that it can be maintained, managed and requested independently.

Test Execution Phase:

When the testing team is about to execute the test case, they should be able to send the data criteria along with the target environment they are about to test against, and have the service inject all necessary data into the test.
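One way to picture the two phases is a criteria document stored outside the test code, and a request built from it at execution time. The sketch below is illustrative only: the JSON field names, the `DataRequest` class, the `build_request` helper, and the `qa-2` environment name are all assumptions made for this example, not part of any particular TDM product.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical criteria document, defined during test case setup and kept
# outside the automation code (for example, in a file linked to the test case).
CRITERIA_JSON = """
{
  "business_object": "bank_account",
  "filters": {"account_type": "savings", "status": "active"},
  "quantity": 2
}
"""


@dataclass(frozen=True)
class DataRequest:
    """What is sent to the service at execution time (assumed shape)."""
    business_object: str
    filters: dict
    quantity: int
    target_environment: str


def build_request(criteria_text: str, target_environment: str) -> DataRequest:
    """Combine stored criteria with the environment chosen at execution time."""
    criteria = json.loads(criteria_text)
    return DataRequest(
        business_object=criteria["business_object"],
        filters=criteria["filters"],
        quantity=criteria.get("quantity", 1),
        target_environment=target_environment,
    )


# Execution phase: the same criteria can be reused against any environment.
request = build_request(CRITERIA_JSON, target_environment="qa-2")
payload = json.dumps(asdict(request), sort_keys=True)  # body for an API call
```

Because the criteria live apart from the test logic, they can be reviewed and updated without touching the automation code, and the same document can serve both interactive and API-driven requests.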

2. Automated Fulfillment

Standardizing the request process makes it possible to automate delivery. As with any other automation, a decision should be made on which objects and operations need to be exposed via the service in an automated fashion. Ideally, you figure out which objects are most requested by the testing community. If you have a Master Data Management process, you can determine which business objects are going to be most required across multiple teams. Generally, business objects categorized as master data are most in demand, followed by referential and transactional data objects.

In terms of operations, the following are generally most in demand:

Find from different environments.
Reserve business object in target environment.
Copy to a different target environment.
Clone to the same or a different environment.
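As a rough illustration of how find, reserve, and copy might look behind a shared service, here is a minimal in-memory sketch. The class name `InMemoryTestDataService`, its method names, and the dictionary-based storage are hypothetical; a real service would sit in front of actual databases and would need persistence and concurrency control.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class InMemoryTestDataService:
    # environment name -> list of business objects (plain dicts with an "id")
    environments: dict = field(default_factory=dict)
    # (environment, object id) -> tester currently holding the reservation
    reservations: dict = field(default_factory=dict)

    def find(self, env, **criteria):
        """Return objects in `env` whose attributes match all criteria."""
        return [
            obj for obj in self.environments.get(env, [])
            if all(obj.get(key) == value for key, value in criteria.items())
        ]

    def reserve(self, env, object_id, tester):
        """Reserve an object so other testers do not modify it mid-test."""
        key = (env, object_id)
        if key in self.reservations:
            return False  # already held by someone else
        self.reservations[key] = tester
        return True

    def copy_to(self, source_env, object_id, target_env):
        """Copy an object, unchanged, into another environment."""
        source = next(
            obj for obj in self.environments[source_env] if obj["id"] == object_id
        )
        duplicate = copy.deepcopy(source)
        self.environments.setdefault(target_env, []).append(duplicate)
        return duplicate


service = InMemoryTestDataService(environments={
    "staging": [
        {"id": 1, "type": "savings", "balance": 500},
        {"id": 2, "type": "checking", "balance": 0},
    ],
})
matches = service.find("staging", type="savings")
first_claim = service.reserve("staging", 1, "alice")
second_claim = service.reserve("staging", 1, "bob")
service.copy_to("staging", 1, "qa")
```

In this sketch the second reservation attempt is rejected, which is the behavior that keeps two testers from colliding on the same business object.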

To manage interesting test data objects independently, organizations can create a repository of test data objects called a gold copy. The base data in this repository comes from production, after PII has been cleaned up and redundant data objects have been removed. The operations listed above can be extended to the gold copy repository to allow discovery and self-management of the gold copy database.

While synthetic data generation is the most secure approach to acquiring test data, it is time-consuming to set up, especially for large and complex schemas. Cloning provides a phased approach to implementing synthetic data generation. With cloning, you enable testers to select an object from the source and make one or more copies of that object, along with anything related to it, in the target system. The source and target can be the same underlying data source. In a very simplified clone process, data generation techniques are applied only to generate key values and unique attributes. Once the basic cloning capability is set up, synthetic generation schemes for other attributes required by the testing team can be added iteratively, sprint by sprint.
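The simplified clone process described above can be sketched as follows. This is an assumption-laden example: the `clone_account` function, the `id`, `account_number`, and `account_id` field names, and the `TEST-` numbering scheme are invented for illustration. Only the key and the unique attribute are generated; every other attribute is copied as-is.

```python
import copy
import itertools


def clone_account(account, transactions, id_source, copies=1):
    """Clone an account and its related transactions.

    Only the primary key and the unique account number are generated;
    all other attributes are copied unchanged from the source.
    """
    clones = []
    for _ in range(copies):
        new_id = next(id_source)
        new_account = copy.deepcopy(account)
        new_account["id"] = new_id
        new_account["account_number"] = f"TEST-{new_id:06d}"  # unique attribute

        # Bring along related rows, re-pointing the foreign key to the clone.
        new_transactions = []
        for row in transactions:
            if row["account_id"] != account["id"]:
                continue
            new_row = copy.deepcopy(row)
            new_row["account_id"] = new_id
            new_transactions.append(new_row)

        clones.append((new_account, new_transactions))
    return clones


source = {"id": 1, "account_number": "ACC-000001", "type": "savings", "balance": 500}
rows = [{"account_id": 1, "amount": 50}, {"account_id": 2, "amount": 75}]
result = clone_account(source, rows, itertools.count(1000), copies=2)
```

Later iterations could replace the copied attributes (for example `balance`) with generated values, one attribute at a time, which is the sprint-by-sprint path toward fuller synthetic generation.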

3. Integrated With Continuous Delivery Tool Chain

To give testing teams better usability and a better experience, TDM services should be integrated with the SDLC and the Continuous Delivery toolchain. The major integration areas are the following.

Test case management tools: Testing teams spend a lot of time in test case management tools setting up and managing test cases. For higher efficiency, they need the ability to assign test data criteria to test cases and to request test data when the test cases are about to be executed.

Test automation frameworks: As organizations increase the percentage of automated test cases, being able to initialize and clean up test data in an automated fashion is critical. The TDM service not only needs to inject the required test data scenarios into the backend database but also needs to feed input data to the automation framework. For this purpose, the service needs to be orchestratable via an API.
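A small sketch of how an automation framework might wrap initialization and cleanup around a test is shown below. `FakeTestDataClient` is a stand-in written for this example, and its `provision` and `release` methods are hypothetical; a real client would call the TDM service's API instead of an in-memory dictionary.

```python
from contextlib import contextmanager


class FakeTestDataClient:
    """In-memory stand-in for a TDM service client (illustrative only)."""

    def __init__(self):
        self.database = {}  # plays the role of the backend database
        self._next_id = 1

    def provision(self, business_object, attributes):
        record = {"id": self._next_id, "business_object": business_object, **attributes}
        self.database[record["id"]] = record
        self._next_id += 1
        return record

    def release(self, record_id):
        self.database.pop(record_id, None)


@contextmanager
def provisioned(client, business_object, **attributes):
    """Inject test data before the test body and clean it up afterwards."""
    record = client.provision(business_object, attributes)
    try:
        yield record  # also serves as input data for the test itself
    finally:
        client.release(record["id"])


client = FakeTestDataClient()
with provisioned(client, "insurance_policy", status="lapsed") as policy:
    during = dict(client.database)  # data is present while the test runs
    policy_status = policy["status"]
after = dict(client.database)  # data is removed once the test finishes
```

The `try`/`finally` ensures cleanup runs even when the test body raises, so failed tests do not leave stale data behind.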

Environment provisioning tools: On-demand, automated environment provisioning for testing is the cornerstone of a well-run Continuous Delivery pipeline. When preparing the environment, being able to prepare the right test data in the backend databases and/or virtual services can help speed up the testing process. This is especially important for automated regression suite execution, where the test cases and test data criteria are already established. To prepare test data as part of environment provisioning, the provisioning tool should be able to collect all the test data criteria needed for the regression test suite and invoke the Test Data Service as part of the provisioning process.

4. Secured Data

Many organizations rely on production data to quickly get a large amount of real-world test data objects into their test databases. To make sure the data is secure, a process for masking sensitive information is set up. Generally, frequent refreshes are scheduled to keep the data fresh. While production data might be the best way to bootstrap a test database, subsequent needs for secure data are better fulfilled by synthetically generating the needed data. Synthetic data generation not only helps keep test data compliant with regulations like GDPR, but also enforces the discipline of creating only data focused on the test scenarios. Keeping only need-based data eliminates unnecessary maintenance and storage costs.
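To make the masking step concrete, here is a minimal sketch that replaces sensitive fields with keyed hashes. The field list, the `mask_value` and `mask_record` helpers, and the `masked-` prefix are assumptions for this example. Keyed hashing is a form of pseudonymization, so whether it satisfies a given regulation is a question for the organization's compliance team, not something this sketch settles.

```python
import hashlib
import hmac

# Hypothetical list of fields treated as sensitive in this example.
SENSITIVE_FIELDS = frozenset({"name", "email", "ssn"})


def mask_value(value, secret):
    """Replace a value with a deterministic, keyed pseudonym."""
    digest = hmac.new(secret, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
    return f"masked-{digest[:12]}"


def mask_record(record, secret, sensitive_fields=SENSITIVE_FIELDS):
    """Mask sensitive fields and leave all other attributes untouched."""
    return {
        key: mask_value(value, secret) if key in sensitive_fields else value
        for key, value in record.items()
    }


secret = b"example-key-keep-outside-source-control"
customer = {"id": 7, "name": "Jane Doe", "email": "jane@example.com", "balance": 120}
masked = mask_record(customer, secret)
```

Because the same input and key always produce the same pseudonym, records that share a masked value in different tables still join correctly after each refresh.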

5. Agile

With new requirements, test data needs are constantly evolving. Good test data practice requires a way to minimize the impact of those changes. Since the epicenter of change is the requirement, using a model-based approach to requirement management, and automatically driving the generation and maintenance of test cases and test data requirements from that model, can provide the agility needed by today's agile development teams. As a requirement changes and the model gets updated, the impacted test cases and test data requirements change automatically. A similar model-based approach should be used to maintain and manage the synthetic test data generation services. Synthetic data generation rules based on the test data requirement model can alleviate the service maintenance burden and allow a more effective TDM practice.

Conclusion

Test Data Delivery is the process of creating a standardized and shared service to help testing teams get the business objects they need based on their specific criteria. Test Data Delivery helps bring clear efficiency into the testing process. With these five characteristics, testers can more fully focus on locating the right business objects. The result is faster testing and faster application delivery. To read the Bloor Market Update Report on CA Test Data Manager, click here.
