Handling test data with IBM Rational Performance Tester 7.0, Part 2
Using files for very large sets of test data
How to work with huge volumes of data for workload simulation
Test data is an important part of most software testing, especially performance testing, which often requires large amounts of test data. IBM® Rational Performance Tester leverages the Eclipse Test and Performance Tools Platform (TPTP) datapool structures for handling test data. Rational Performance Tester includes many features that make using these datapools very easy and flexible. Once you start using datapools with more than 10,000 rows of data, however, there can be long delays in starting up a test. For performance testing of large systems, it is not inconceivable to have test data requirements of one million records or more. In these situations, the TPTP datapools may not be the most effective solution.
This article series shows you how to create Rational Performance Tester tests that use files instead of datapools for storing and handling test data. This technique can be used to address the need for using large sets of test data in performance testing with Rational Performance Tester.
In Part 1 of this series of two articles you created a test with a datapool. Now in Part 2, you will modify the same test to use a test data file in place of the datapool.
Note: This article was developed using Performance Tester version 7.0.0. This should work with future versions although specific procedures could change. You can also use this article with RPT version 6.1.2, although some screen shots and procedures may not be exactly the same. Performance Tester version 6.1.1 and earlier will not work due to changes in the Performance Tester custom code API.
Handling large volumes of test data with Rational Performance Tester
You are starting with the simple test (with a datapool used to hold multiple search items), which was created in Part 1 of this series. This article will show you how to create a new version of the same test, one that will use a test datafile for the same functionality as the datapool. The test datafile will allow more efficient handling of very large volumes of data, although this article will only demonstrate with a relatively small volume.
Setting up the article files
Several files are included with this article:
- TestData.csv: A file containing 50 random names used for search strings.
- GetTestData.java: A completed custom code module from which you can copy and paste.
- SetupTestDataArea.java: A completed custom code module from which you can copy and paste.
These files should already be downloaded and unzipped to your C:\temp\ directory if you performed the steps in Part 1 of this series; if not, do so now. If you put them in another location, you will have to update the location in several steps, which are pointed out in the following sections.
Rational Performance Tester custom coding strategy
You will add code to a test that will emulate Rational Performance Tester's built-in datapool functionality. This means that you want multiple simulated users, running simultaneous instances of the same test, to each retrieve a unique test data record from a file. Each time a test instance gets a record from the test data file, you want to increment a shared pointer to the row in the file so that the next test instance (simulated user) will fetch the next row.
To use a shared row pointer, you will use the IDataArea interface from Rational Performance Tester's test execution services, and initialize the pointer in a separate test that runs before any other test gets test data. To get the test data from a file, you will use simple java.io classes such as StreamTokenizer.
For simplicity, you should have the row pointer always wrap back to the beginning if the end of the file is reached. You will handle concurrency issues in the performance schedule instead of through the code. This implementation will also require all users accessing a given test data file to be running on the same machine. Multiple machines can be used for more virtual users, but these would need to have their own copy of the test data file.
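The wrap-around behavior of the shared row pointer can be sketched in plain Java. This is a minimal illustration only, not the article's code: a `HashMap` stands in for the test engine's data area, and the key name `rowPointer`, the class name, and the `claimRow` method are assumptions made up for this example. Like the strategy described above, it is deliberately not thread-safe, because concurrency is handled in the schedule rather than in code.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedRowPointerSketch {
    // Hypothetical key under which the shared pointer is stored.
    static final String KEY = "rowPointer";

    /**
     * Returns the 1-based row the caller should read, then advances the
     * shared pointer, wrapping back to row 1 once lastRow has been handed out.
     */
    static int claimRow(Map<String, Object> sharedArea, int lastRow) {
        // The pointer is stored as an Object but used as an integer.
        int current = (Integer) sharedArea.get(KEY);
        int next = (current >= lastRow) ? 1 : current + 1;
        sharedArea.put(KEY, next);
        return current;
    }

    public static void main(String[] args) {
        // Stand-in for the shared data area (an assumption for this sketch).
        Map<String, Object> sharedArea = new HashMap<>();
        sharedArea.put(KEY, 1);

        // Five simulated users claim rows from a three-row file, one after another.
        for (int user = 1; user <= 5; user++) {
            System.out.println("user " + user + " reads row " + claimRow(sharedArea, 3));
        }
    }
}
```

With a three-row file, the five simulated users receive rows 1, 2, 3, 1, and 2, which is the wrap-around described above.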
Create a test using a data file for test data
To create a test and test data, perform the following steps:
- Copy the first recorded test.
You will now create a second test that will use the original CSV file for test data instead of a datapool. Since you want the exact same test steps, you shouldn't have to re-record the test. You will make a copy of the initial recording and use that as your new test.
- Click the Create New Test From Recording toolbar button.
- In the Create New Test window, select Create Test From Existing Recording and HTTP Recording, as shown in Figure 1. Click
Figure 1. Creating a test from an existing recording
- In the next Create New Test From Recording window, select the recording LargeTestData_datapool.rec, which was recorded in Part 1 (Figure 2).
Figure 2. Selecting an existing recording to generate a new test
- In the next Create New Test From Recording window, select the tests folder and enter the name LargeTestData_data_file, as shown in Figure 3. Click Finish to generate the new test.
Figure 3. Selecting the location and name for the new test
- Add custom code for accessing test data files.
Since the new test was generated from the initial recording, it will not include the edits for substituting datapool values. This is good, since you want to substitute values from files instead. To do this, you will use custom code modules.
- Add a new custom code class to the test LargeTestData_data_file by right-clicking the second page (titled developerWorks : Rational : Products : Performance Tester) in the Test Contents section and selecting Add > Custom Code, as shown in Figure 4.
Figure 4. Adding custom code
- This will create a custom code module just before the search page titled IBM developerWorks > Search results in the Test Contents section. If the custom code is not just before the last page, then you can use the Up and Down buttons in the center of the test editor to place the code.
- Enter the name test.custom.GetTestData in the Class name field (Figure 5) and click Generate Code.
Note: Be sure to enter a period after custom in the class name. This will create a package named custom to contain the code you are developing. This makes it easier to distinguish your custom code from other Java™ files generated by Rational Performance Tester in the test package. It is recommended that you do not edit the Rational Performance Tester-generated Java files in the test package.
Figure 5. Name and placement of custom code
- In the Code Editor window for the GetTestData.java class, enter the code shown in the listings below. You can also copy this from the included article files.
Listing 1. Get file row pointer and initialize return variable
Listing 2. Open file, go to current row, and get value
Listing 3. Update new row pointer and return test data
The GetTestData code does the following:
- Gets the current file row pointer, stored as an object but used as an integer, from the global variable in the data area for the test engine (this will be set up later)
- Opens the test data file for reading using a StreamTokenizer to parse the values (the file name and path are hard coded)
- Goes to the current shared line of the test data file
- Increments the row pointer for the next virtual user or, if the end of the file is reached, goes back to the beginning of the file and resets the pointer to 1
- Gets the value from the file using the
- Catches and handles potential file and IO exceptions
- Returns the file value to the test
Note: More information on Rational Performance Tester custom code can be found in the Help section Extending test execution with custom code.
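The steps listed above can be sketched in plain Java as follows. This is an illustrative sketch under stated assumptions, not the GetTestData.java file shipped with the article: it does not use the Rational Performance Tester custom code API, a `Map` stands in for the test engine's data area, the key name `fileRowPointer` is hypothetical, and the data is assumed to hold one value per line. A `StringReader` replaces the hard-coded file so the example runs anywhere; a real module would open the test data file (for example with a `FileReader`) and close it when done.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

final class FileRowReaderSketch {
    // Hypothetical key for the shared row pointer.
    static final String POINTER_KEY = "fileRowPointer";

    static String getTestData(Map<String, Object> dataArea, Reader source) {
        // Get the current row pointer, stored as an Object but used as an int.
        int row = ((Integer) dataArea.get(POINTER_KEY)).intValue();
        String value = "";
        int nextRow = 1; // default: wrap back to the first row

        try {
            // Configure the tokenizer so that each non-empty line is one token.
            StreamTokenizer st = new StreamTokenizer(source);
            st.resetSyntax();
            st.wordChars(' ', 255);          // printable characters form a word
            st.whitespaceChars(0, ' ' - 1);  // control characters (CR, LF, tab) separate words

            // Go to the current row and remember its value.
            boolean found = true;
            for (int i = 1; i <= row; i++) {
                if (st.nextToken() == StreamTokenizer.TT_EOF) {
                    found = false;
                    break;
                }
                value = st.sval;
            }

            if (!found) {
                value = ""; // pointer was past the end of the data
            } else if (st.nextToken() != StreamTokenizer.TT_EOF) {
                nextRow = row + 1; // more rows remain, so simply advance
            }
        } catch (IOException e) {
            // Handle read failures by returning an empty value.
            value = "";
        }

        // Update the shared pointer for the next caller and return the value.
        dataArea.put(POINTER_KEY, Integer.valueOf(nextRow));
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> dataArea = new HashMap<>();
        dataArea.put(POINTER_KEY, Integer.valueOf(1));
        String csv = "Alice\nBob Smith\nCarol\n";
        for (int i = 0; i < 4; i++) {
            System.out.println(getTestData(dataArea, new StringReader(csv)));
        }
    }
}
```

Because the data is re-read from the start on every call, the cost of each call grows with the row number; this mirrors the "go to the current row" step described above rather than optimizing it.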
- In addition to the View Code button in the test editor, you can also open and edit custom code classes by opening the Navigator view and browsing to the \src\test\custom folder, as shown in Figure 6.
Figure 6. Opening custom code outside of a test
You will now create a class to set up and initialize the shared file row pointer. Since a custom code module can only run in the context of a test, you will create a simple test that contains nothing except the code module. Instead of using the HTTP recorder, you will create a blank test.
- Create a new blank test by right-clicking the tests folder and selecting New > Other > Test > Test Assets > New HTTP Test, as shown in Figure 7.
Figure 7. New HTTP test
- In the New HTTP Performance Test window, enter the name LargeTestData_setup, then click Next.
- Enter a description and click Next.
- In the final New HTTP Performance Test window, change the Number of HTTP pages to generate to 0 (zero), as shown in Figure 8.
- Click Finish.
Figure 8. Creating a blank test
- Add a new custom code class to the test LargeTestData_setup by right-clicking in the Test Contents section and selecting Add > Custom Code.
- Enter the class name test.custom.SetupTestDataArea as shown in Figure 9, and then click Generate Code.
Figure 9. LargeTestData_setup test
- In the code editor window for the SetupTestDataArea.java class, enter the lines of code shown in Listing 4. You can also copy this from the included article files.
Listing 4. SetupTestDataArea code
The SetupTestDataArea code does the following:
- Creates a variable
- Stores the global variable in the data area for the test engine, which allows access from different tests and different schedule user groups.
- Save and close both SetupTestDataArea.java and the test LargeTestData_setup.
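The role of the setup step can be illustrated with a small plain-Java sketch. It is not the SetupTestDataArea.java file from the article files and does not use the Rational Performance Tester API: a `ConcurrentHashMap` stands in for the engine-wide data area, and the key name `fileRowPointer`, the starting value of 1, and both method names are assumptions chosen for this example. It shows why the setup must run first: until the pointer has been stored, code that reads it has nothing to work with.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SetupPointerSketch {
    // Hypothetical key shared by the setup code and the code that reads rows.
    static final String POINTER_KEY = "fileRowPointer";

    /** Creates the row pointer and stores it in the shared area (assumed to start at row 1). */
    static void initialize(Map<String, Object> engineArea) {
        Integer firstRow = Integer.valueOf(1);
        engineArea.put(POINTER_KEY, firstRow);
    }

    /** Reads the pointer back, failing loudly if the setup step has not run yet. */
    static int currentRow(Map<String, Object> engineArea) {
        Object stored = engineArea.get(POINTER_KEY);
        if (stored == null) {
            throw new IllegalStateException("row pointer not initialized; run the setup step first");
        }
        return ((Integer) stored).intValue();
    }

    public static void main(String[] args) {
        // Stand-in for a data area visible to every test in the run.
        Map<String, Object> engineArea = new ConcurrentHashMap<>();
        initialize(engineArea);
        System.out.println("starting row: " + currentRow(engineArea));
    }
}
```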
- Replace the recorded search value with a value returned by custom code.
You want your test to substitute the search value with test data, similar to what you did with the datapool in Part 1, but using data from the file that is returned by the custom code.
- In the LargeTestData_data_file test, expand the last page in the Test Contents, which should be titled IBM developerWorks > Search results.
- Select the first request (the first line/node in the expanded page). In the Request Attributes section to the right, you will notice the URL is shown with encoded parameters including "query=RecordedSearch", with RecordedSearch being the search string you entered while recording. This value is displayed in inverse dark green, indicating that it is currently substituted by a datapool value.
- In the URL box, right-click anywhere in the RecordedSearch value and select Substitute From > Custom Code: test.custom.GetTestData, as shown in Figure 10. This will change the substitution from the datapool value to the return value of the custom code class.
Figure 10. Custom code value substitution
- In the Test Data section of the test editor, you should now see that the item query is substituted with the custom code GetTestData, as shown in Figure 11.
Figure 11. Substituting parameters with custom code test data
- Create a new schedule by right-clicking the schedules folder in the Test Navigator pane and selecting New > Performance Schedule.
- In the Performance Schedule window, enter the name LargeTestData, then click Finish.
- Add another user group by right-clicking in the Schedule Contents section and selecting Add > User Group.
- Enter the group name Initialization and change the group size to Absolute with a value of 1. If this group is not at the top of the Schedule Contents section, click the Up button to move it to the top.
- Add a test to the Initialization group by right-clicking Initialization (1 user) in the Schedule Contents section (Figure 11) and selecting Add > Test.
- From the Select Performance Tests window, select LargeTestData_setup and click
Figure 12. Performance schedule Initialization user group
- Change the name of the other user group by selecting the group in the Schedule Contents section and entering Users in the group name field. The group size for this should remain Percentage with a value of
- Add a loop to the Users group by right-clicking Users in the Schedule Contents section and selecting Add > Loop.
- With the loop still selected, in the Schedule Element Details to the right, enter 3 for the number of iterations.
- Select the Control the rate of iterations checkbox and enter an iteration rate of minute, as shown in Figure 13.
Figure 13. Performance schedule loop iteration rate
- Add a test to the Users group by right-clicking the Loop in the Schedule Contents section and selecting Add > Test.
- From the Select Performance Tests window, select LargeTestData_data_file (Figure 14) and click
Figure 14. Performance schedule Users group
- Set the schedule option by selecting the schedule name LargeTestData at the top of the Schedule Contents section.
- In the Schedule Element Details section, on the General tab, select the Add a delay between starting each user checkbox and enter a delay of
- Click the Test Log tab in the Schedule Element Details section to set the log level. Set the Log Level for And also show all other types to Action Details, as shown in Figure 15. You can ignore the recommendation warning, since this test will not log excessive amounts of data.
Figure 15. Performance schedule for LargeTestData
Important! As noted earlier, the IBM developerWorks Web site is not to be used as a load testing site. Please do not run any more users than specified in this article. Also, do not run repeated tests beyond what is necessary to complete the steps in this article.
- Run the test and verify the test data that is used.
- Run the schedule by clicking the Run toolbar button.
- In the Run configurations window, edit the configuration created in Part 1 of this series.
- Click the Schedule tab and select the schedule LargeTestData. Click Run to begin execution.
- After the test has completed, open the execution history by right-clicking LargeTestData [date+time] in the Performance Test Runs view and selecting Display Execution History.
- Expand the events and verify that the query values are different and match values from the test data file. Refer to Part 1 of this series for more detailed steps on how to do this.
Test data with files summary
Using files to handle test data in Rational Performance Tester allows you to work with large sets of data, such as a million records or more, without experiencing the long test start-up delays you would have with similarly sized datapools. This technique does require more work than using datapools, both in custom coding and in managing the files outside of the Rational Performance Tester workspace.
While datapools are the easiest way to link test data with your tests in Rational Performance Tester, they can become inefficient with very large sizes (in general over 10,000 records). To overcome this, Rational Performance Tester allows you to also use test data files directly (through custom coding) to handle whatever data volume is required for your testing.
Another advantage of this technique, in addition to the speed of running tests, is the added flexibility of how you want your tests to handle test data. This article demonstrated only one simple implementation of this, and you can always enhance this further to expand the functionality.
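As one possible enhancement, and not something this article's implementation does, the row pointer could be made safe for simultaneous access in code instead of relying on the schedule to stagger users. The sketch below uses `java.util.concurrent.atomic.AtomicInteger` for this; the class name, the `claimRow` and `simulate` methods, and the thread-based "users" are all assumptions invented for the illustration, and how such an object would be shared between tests in Rational Performance Tester is left open.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

final class AtomicRowPointerSketch {
    private final AtomicInteger pointer = new AtomicInteger(1);
    private final int lastRow;

    AtomicRowPointerSketch(int lastRow) {
        this.lastRow = lastRow;
    }

    /** Atomically returns the current 1-based row and advances, wrapping to 1 after lastRow. */
    int claimRow() {
        return pointer.getAndUpdate(p -> p >= lastRow ? 1 : p + 1);
    }

    /** Runs several threads as stand-ins for virtual users; returns how often each row was claimed. */
    static int[] simulate(int users, int claimsPerUser, int lastRow) {
        AtomicRowPointerSketch shared = new AtomicRowPointerSketch(lastRow);
        AtomicIntegerArray counts = new AtomicIntegerArray(lastRow + 1);
        Thread[] threads = new Thread[users];
        for (int u = 0; u < users; u++) {
            threads[u] = new Thread(() -> {
                for (int i = 0; i < claimsPerUser; i++) {
                    counts.incrementAndGet(shared.claimRow());
                }
            });
            threads[u].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        int[] result = new int[lastRow + 1]; // index 0 is unused
        for (int r = 1; r <= lastRow; r++) {
            result[r] = counts.get(r);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] counts = simulate(4, 250, 10);
        for (int r = 1; r <= 10; r++) {
            System.out.println("row " + r + " claimed " + counts[r] + " times");
        }
    }
}
```

Because each claim is a single atomic update, 1,000 claims against a 10-row file hand out every row exactly 100 times, regardless of how the threads interleave.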
You are not even limited by coding capabilities coming from Rational Performance Tester: because all Rational Performance Tester code is Java, you can leverage countless Java libraries (many open source) for file handling, string manipulation, or whatever you need to accomplish your testing goals.