# IBM Storage Protect - Disk Workload Simulation Tool

Use the **Disk Workload Simulation Tool** as a benchmarking tool to identify performance issues with your hardware setup and configuration. You can run the tool before you install the IBM Storage Protect server and client to verify the performance of the IBM Storage Protect server database and storage pool disks. The tool, a Perl script named `sp_disk_load_gen.pl`, uses the **ldeedee** program, which is similar to the Linux operating system **dd** command, to run a non-destructive workload on the system. It also uses the **iostat** command to monitor the workload for IBM FlashSystem systems.

**Note**: In the context of _IBM Elastic Storage System_, the Disk Workload Simulation Tool can report performance statistics only for local devices that are monitored by the **iostat** or **mmpmon** commands. The tool can drive loads against other network-attached devices, but it does not collect or report performance statistics for them. When the tool is run against a file system on an _IBM Elastic Storage System_, the tool automatically runs the **mmpmon** command.

Sample data from the _iostat_ command is extracted for the specific disks that were involved in the test. Then, peak and average measurements for input/output operations per second (IOPS) and throughput are calculated. The script uses the _ldeedee_ command across multiple threads to drive the I/O by using direct I/O.
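
The peak and average calculation can be illustrated with a short shell sketch. It is not the tool's own code: the function name `summarize_samples` and the one-value-per-line input format are assumptions made for illustration, and the real script works on full _iostat_ output.

```shell
# summarize_samples: read one numeric sample per line on stdin (for example,
# one IOPS value per monitoring interval) and print the average and the peak.
# Hypothetical helper for illustration only; not part of sp_disk_load_gen.pl.
summarize_samples() {
  awk 'NF { sum += $1; n++; if (n == 1 || $1 > max) max = $1 }
       END { if (n) printf "avg=%.2f peak=%.2f\n", sum / n, max }'
}
```

For example, `printf '100\n300\n200\n' | summarize_samples` prints `avg=200.00 peak=300.00`.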

**Tips**:
* The _iostat_ tool monitors and reports on all I/O for the related disks, even activity that is being driven by applications other than the workload tool. For this reason, ensure that other activity is stopped before you run the tool.
* New storage arrays go through an initialization process. Allow this process to end before you measure disk performance. On IBM FlashSystem disk systems, you can monitor the initialization progress in the **Running Tasks** view.

The _Disk Workload Simulation Tool_ can run the following types of workloads:
1. **Storage pool workload**
   * The storage pool workload simulates IBM Storage Protect server-side data deduplication, in which large, 256 KB block-size sequential read and write operations are overlapped. The write process simulates incoming backups while the read operation simulates identification of duplicate data. The tool creates a read and write thread for every file system that is included in the test, allowing multiple sessions and processes to be striped across more than one file system.
   * You can also simulate a storage pool workload that conducts only read I/O or only write I/O operations:
     * Simulate restore operations by specifying the `mode=readonly` option.
     * Simulate backup operations by specifying the `mode=writeonly` option.
1. **Database workload**
    * The database workload simulates IBM Storage Protect database disk access in which small, 8 KB read and write operations are performed randomly across the disk. For this workload, 10 GB files are pre-created on each of the specified file systems and then read and write operations are run to random ranges within these files. Multiple threads are issued against each file system, sending I/O requests simultaneously.
    * For the database workload, configurations typically have one file system for each pool on the storage array. Include all database file systems when you are testing the database workload.
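
As a rough illustration of the two access patterns, the following sketch uses the standard **dd** command as a stand-in for **ldeedee**, whose options are not described here. The function names and file arguments are assumptions. The sketch writes 256 KB blocks sequentially and reads 8 KB blocks at random offsets. It does not overlap reads with writes or run multiple threads as the tool does.

```shell
# Stand-in for the two I/O patterns, using plain dd (NOT ldeedee).
# Extra dd arguments can be appended after the required parameters, for
# example oflag=direct or iflag=direct on GNU dd when the file system
# supports direct I/O.

# seq_write FILE COUNT [DD_ARGS...]
# Create or overwrite FILE with COUNT sequential 256 KB blocks.
seq_write() (
  file=$1 count=$2
  shift 2
  dd if=/dev/zero of="$file" bs=256k count="$count" "$@" 2>/dev/null
)

# rand_read FILE READS [DD_ARGS...]
# Read READS single 8 KB blocks from random offsets within FILE.
rand_read() (
  file=$1 reads=$2
  shift 2
  blocks=$(( $(wc -c < "$file") / 8192 ))
  [ "$blocks" -gt 0 ] || exit 1   # FILE must hold at least one 8 KB block
  for skip in $(awk -v n="$reads" -v b="$blocks" \
      'BEGIN { srand(); for (i = 0; i < n; i++) print int(rand() * b) }'); do
    dd if="$file" of=/dev/null bs=8k count=1 skip="$skip" "$@" 2>/dev/null || exit 1
  done
)

# Example (hypothetical scratch file):
#   seq_write /tmp/pattern_test.dat 4096    # 1 GB of sequential writes
#   rand_read /tmp/pattern_test.dat 1000    # 1000 random 8 KB reads
```

Use a scratch file for such experiments, because `seq_write` overwrites the file that you name.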

To use the _Disk Workload Simulation Tool_ effectively, experiment with test runs by including different quantities of file systems in the simulation until the performance of the system diminishes.

Depending on disk speed and the number of file systems that you are testing, the script takes approximately 3 to 10 minutes to run.

## Usage

To use the _Disk Workload Simulation Tool_, complete the following steps:

1. Choose either _storage pool file systems_ or _database file systems_.
1. Collect a list of the file systems that are associated with the chosen type of storage. 
   * Break the file systems into groups, according to which pool they belong to on the disk system. <br>Grouping is used to ensure that physical disks from all volumes on all arrays for the storage type are engaged in the test.
   * **Note**: In the context of _IBM Elastic Storage System_, only a single IBM Storage Scale file system is defined for storage. Therefore, you must create temporary directories to use when you run the workload simulation tool, and specify them with the `fslist` option. For example, issue the `mkdir` command to create temporary directories:
     ```
      mkdir /esstsm1/perftest/1
      mkdir /esstsm1/perftest/2
      < ... >
      mkdir /esstsm1/perftest/14
     ```
1. Prepare the file system for IBM Storage Protect.
   * Change to the `tools\sp-load-generator` directory.
   * Copy and run the `storage_prep_*.pl` Perl script for the appropriate operating system, and specify the size of the system that you are configuring. <br>For example, for a medium system, issue the following command:
     ```
     perl storage_prep_lnx.pl medium
     ```
   * List all file systems by issuing the `df` command. Verify that file systems are mounted at the correct LUN and mount point. Also, verify the available space. The amount of used space should be approximately 1%. <br>For example:
     ```
     [root@tapsrv04 ~]# df -h /tsminst1/*
     Filesystem                                     Size Used Avail Use%  Mounted on
     /dev/mapper/360050763008101057800000000000003  134G 188M 132G   1%   /tsminst1/TSMalog
     ```
1. Run an initial test of the workload that includes one file system of the storage type from each pool on the storage array.
   * For example, to simulate the IBM Storage Protect storage pool workload on a medium-scale system, issue the following command:
     ```
     perl sp_disk_load_gen.pl workload=stgpool fslist=/tsminst1/TSMfile00,/tsminst1/TSMfile01,/tsminst1/TSMfile02,/tsminst1/TSMfile03,/tsminst1/TSMfile04,/tsminst1/TSMfile05,/tsminst1/TSMfile06,/tsminst1/TSMfile07
     ```
   * For example, to simulate backup operations (by using only write I/O) for an IBM Storage Protect storage pool workload on a medium-scale system, issue the following command:
     ```
     perl sp_disk_load_gen.pl workload=stgpool fslist=/tsminst1/TSMfile00,/tsminst1/TSMfile01,/tsminst1/TSMfile02,/tsminst1/TSMfile03,/tsminst1/TSMfile04,/tsminst1/TSMfile05,/tsminst1/TSMfile06,/tsminst1/TSMfile07 mode=writeonly
     ```
   * For example, to simulate the database workload on a small-scale system and include all four of the database file systems, issue the following command:
     ```
     perl sp_disk_load_gen.pl workload=db fslist=/tsminst1/TSMdbspace00,/tsminst1/TSMdbspace01,/tsminst1/TSMdbspace02,/tsminst1/TSMdbspace03
     ```
   Record the reported results for each test run.
1. If you have implemented a storage configuration with multiple arrays that are not combined into a single storage pool, rerun the previous test, but modify it to include one additional file system from each pool.
   * For example, if you have two pools on the array that is dedicated to the storage pool, your test sequence includes file system counts of 2, 4, 6, 8, 10, and so on.
1. Continue repeating these tests while the reported performance measurements improve. When performance diminishes, capture the results of the last test that indicated improvement. Use these results as the measurements for comparison.
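
The stepwise test sequence in the preceding steps can be sketched as a small shell helper that builds the `fslist` value for each run. The function name `build_fslist` and the assignment of the `TSMfile` file systems to two pools are hypothetical. Substitute the grouping that matches your disk system. The loop only prints the commands (a dry run), so that you can review them before you run each test and record its results.

```shell
# build_fslist N POOL...
# Each POOL argument is a space-separated list of file systems in one pool.
# Prints a comma-separated list with the first N file systems of every pool.
# Paths that contain spaces are not supported by this sketch.
build_fslist() (
  n=$1
  shift
  list=""
  for pool in "$@"; do
    i=0
    for fs in $pool; do
      [ "$i" -ge "$n" ] && break
      list="${list:+$list,}$fs"
      i=$((i + 1))
    done
  done
  printf '%s\n' "$list"
)

# Hypothetical grouping: two pools with four file systems each.
pool_a="/tsminst1/TSMfile00 /tsminst1/TSMfile01 /tsminst1/TSMfile02 /tsminst1/TSMfile03"
pool_b="/tsminst1/TSMfile04 /tsminst1/TSMfile05 /tsminst1/TSMfile06 /tsminst1/TSMfile07"

# Dry run: print the command for 2, 4, 6, and 8 file systems.
for n in 1 2 3 4; do
  echo perl sp_disk_load_gen.pl workload=stgpool \
    "fslist=$(build_fslist "$n" "$pool_a" "$pool_b")"
done
```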

## About Results

**Note**: 
* The performance results from the _Disk Workload Simulation Tool_ may not represent the maximum capabilities of the disk subsystem that is being tested. 
* The primary intent is to provide measurements that can be compared against the lab results (or baseline) for medium and large systems.
* The _Disk Workload Simulation Tool_ is not intended to be a replacement for disk performance analysis tools. It can be used to spot configuration problems that affect performance before you run IBM Storage Protect workloads in a production environment. For example, performance results that are significantly lower than the baselines from the test lab systems can indicate a configuration problem.
* If you are using hardware other than the Storwize® components that are included in this document, use your test results as a rough estimate of how other disk types compare with the tested configurations.

### Example
This example shows the output from a storage pool workload test on a small system. <br>Eight file systems are included. The following command is issued:
```
perl sp_disk_load_gen.pl workload=stgpool fslist=/tsminst1/TSMfile00,/tsminst1/TSMfile01,/tsminst1/TSMfile02,/tsminst1/TSMfile03,/tsminst1/TSMfile04,/tsminst1/TSMfile05,/tsminst1/TSMfile06,/tsminst1/TSMfile07
```

The output shows the following results:
```
===================================================================
: IBM Storage Protect disk performance test (Program version 5.1)
:
: Workload type: stgpool
: Number of filesystems: 8
: Mode: readwrite
: Files to write per fs: 5
: File size: 2 GB
:
===================================================================
:
: Beginning I/O test.
: The test can take upwards of ten minutes, please be patient ...
: Starting write thread ID: 1 on filesystem /tsminst1/TSMfile00
: Starting read thread ID: 2 on filesystem /tsminst1/TSMfile00
: Starting write thread ID: 3 on filesystem /tsminst1/TSMfile01
: Starting read thread ID: 4 on filesystem /tsminst1/TSMfile01
: Starting write thread ID: 5 on filesystem /tsminst1/TSMfile02
: Starting read thread ID: 6 on filesystem /tsminst1/TSMfile02
: Starting write thread ID: 7 on filesystem /tsminst1/TSMfile03
: Starting read thread ID: 8 on filesystem /tsminst1/TSMfile03
: Starting write thread ID: 9 on filesystem /tsminst1/TSMfile04
: Starting read thread ID: 10 on filesystem /tsminst1/TSMfile04
: Starting write thread ID: 11 on filesystem /tsminst1/TSMfile05
: Starting read thread ID: 12 on filesystem /tsminst1/TSMfile05
: Starting write thread ID: 13 on filesystem /tsminst1/TSMfile06
: Starting read thread ID: 14 on filesystem /tsminst1/TSMfile06
: Starting write thread ID: 15 on filesystem /tsminst1/TSMfile07
: Starting read thread ID: 16 on filesystem /tsminst1/TSMfile07
: All threads are finished. Stopping iostat process with id 15732
===================================================================
: RESULTS:
: Devices reported on from output:
: dm-25
: dm-28
: dm-7
: dm-6
: dm-4
: dm-8
: dm-12
: dm-15
:
: Average R Throughput (KB/sec): 227438.06
: Average W Throughput (KB/sec): 224826.38
: Avg Combined Throughput (MB/sec): 441.66
: Max Combined Throughput (MB/sec): 596.65
:
: Average IOPS: 1767.16
: Peak IOPS: 2387.43 at 08/05/2015 09:38:27
:
: Total elapsed time (seconds): 171
===================================================================
```
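
The reported values can be pulled out of saved output and cross-checked with a few lines of shell. The helper names in this sketch are assumptions, and the conversion of 1 MB to 1024 KB is inferred from the sample numbers rather than taken from the tool's documentation: (227438.06 + 224826.38) / 1024 = 441.66 MB/sec, which matches the reported average combined throughput.

```shell
# result_value LABEL
# Read tool output on stdin and print the first number that follows "LABEL:".
# Hypothetical helper; it relies on the line format shown in the example.
result_value() {
  awk -v label="$1" '
    !found && (p = index($0, label ":")) {
      split(substr($0, p + length(label) + 1), f, " ")
      print f[1]
      found = 1
    }'
}

# combined_mb READ_KB WRITE_KB
# Combined throughput in MB/sec, assuming 1 MB = 1024 KB.
combined_mb() {
  awk -v r="$1" -v w="$2" 'BEGIN { printf "%.2f\n", (r + w) / 1024 }'
}

# avg_io_kb THROUGHPUT_MB IOPS
# Average size of one I/O operation in KB.
avg_io_kb() {
  awk -v t="$1" -v i="$2" 'BEGIN { printf "%.1f\n", t * 1024 / i }'
}
```

With the sample output saved in a file such as `results.txt` (a hypothetical name), `result_value 'Avg Combined Throughput (MB/sec)' < results.txt` prints `441.66`. Dividing throughput by IOPS gives a quick sanity check: `avg_io_kb 441.66 1767.16` prints `255.9`, which is consistent with the 256 KB block size of the storage pool workload.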

## What to do next
Compare your performance results against test lab results by reviewing sample outputs for storage pool and database workloads on both medium and large systems:
* For the **storage pool workload**, the measurement for average combined throughput, in MB per second, combines the read and write throughput. This is the most useful value when you compare results for the storage pool workload.
* For the **database workload**, the peak IOPS measurements add the peak read and write operations per second, for a specific time interval. This is the most useful value when you compare results for the database workload.
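
One simple way to express the comparison is as a percentage of the baseline. The following sketch assumes that you supply the baseline value yourself from the test lab results. The function name `pct_of_baseline` is hypothetical, and the value `500` in the usage note is a made-up placeholder, not a published baseline.

```shell
# pct_of_baseline MEASURED BASELINE
# Print MEASURED as a percentage of BASELINE with one decimal place.
# Returns a non-zero status if BASELINE is not a positive number.
pct_of_baseline() {
  awk -v m="$1" -v b="$2" 'BEGIN {
    if (b <= 0) exit 1
    printf "%.1f\n", 100 * m / b
  }'
}
```

For example, `pct_of_baseline 441.66 500` prints `88.3`. The document does not define how far below the baseline counts as significantly lower, so choose a threshold that suits your environment.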

---
