Creating high volume random test data

Question & Answer

Question

How do I create high volume random test data?

Answer

A useful application of a Cartesian product is to manufacture test data. The output of a Cartesian product is a multiplier of the number of rows from each table listed in the FROM clause. If 10,000 rows of data are needed, simply use any combination of existing tables whose number of rows multiply to 10,000.

The SQL can accept many variations to achieve different datatypes and values. The SELECT could use functions such as NOW(), RANDOM(), etc. to generate random values for different data types.

For example, if there is a table called dummy_data with ten rows of data, we can use the following SQL to create 10,000 rows of data:

DROP TABLE test_5;

CREATE TABLE test_5 AS

SELECT a.rowid AS "A_ROWID", b.rowid AS "B_ROWID",

c.rowid AS "C_ROWID", d.rowid AS "D_ROWID",

CAST(RANDOM()*10000 AS DECIMAL(6,2)) AS "RAND_NUM",

CAST(ROUND(RANDOM()*10000) AS INTEGER) AS "RAND_INT",

NOW()-(RANDOM()*30) "RAND_DT",

a.col1 AS "COL_VAL",

CHR(65+(RANDOM()*57)) AS "RAND_LTR" --ASCII 65-122 are letters

FROM dummy_data a, --first ten rows

dummy_data b, --multiplier of ten, increases output to 100 rows

dummy_data c, --multiplier of ten, increases output to 1000 rows

dummy_data d; --multiplier of ten, increases output to 10000 rows

--------------------------------------------------------------------------------

SELECT "A_ROWID","RAND_NUM","RAND_INT","RAND_DT",

"COL_VAL","RAND_LTR" FROM test_5 LIMIT 5;

The above SQL produced the following output:

--------+----------+----------+---------+---------+----------

151200006 | 166.61 | 167 | 2006-03-03 | G | m

151200006 | 8323.79 | 8324 | 2006-02-06 | G | L

151200006 | 2783.91 | 2784 | 2006-02-23 | G | S

151200006 | 8694.92 | 8695 | 2006-02-05 | G | O

151200006 | 8737.49 | 8737 | 2006-02-05 | G | R

(5 rows)

SELECT COUNT(*) FROM test_5;

count

-------

10000

(1 row)

\\d test_5

Table "test_5"

Attribute | Type | Modifier | Default Value

----------+------+----------+---------------

A_ROWID | bigint | |

B_ROWID | bigint | |

C_ROWID | bigint | |

D_ROWID | bigint | |

RAND_NUM | numeric(6,2) | |

RAND_INT | integer | |

RAND_DT | date | |

COL_VAL | character(1) | |

RAND_LTR | character(1) | |

Distributed on hash: "A_ROWID"

[{"Product":{"code":"SSULQD","label":"IBM PureData System"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Component":"Database","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"1.0.0","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Historical Number

NZ815137

Was this topic helpful?

Document Information

More support for:
IBM PureData System

Software version:
1.0.0

Document number:
197665

Modified date:
17 October 2019

UID

swg21578328

IBM Support

Tips