IBM Support

SQL RAND Function Not So Random

Troubleshooting


Problem

This document discusses the results that are preceded by the SQL RAND() function.

Resolving The Problem

Unseeded SQL RAND() function produces results that are not random between jobs. For unchanging data in a file, the data returned is identical between similar unseeded RAND function runs. For example:

Replace MYFILE with any file you like. The version used here has only three records in it and produces only three results each run.

SELECT RAND() FROM MYFILE

Following are the results when this is run five times in a row:

5.1387066255684077E-001
1.7572557756279183E-001
3.0863368633075960E-001

5.3453169347209084E-001
9.4763023773918886E-001
1.7172765282143621E-001

7.0223090304269542E-001
2.2641682180242317E-001
4.9476607562486646E-001

1.2469862971892452E-001
8.3895382549516284E-002
3.8962981048005613E-001

2.7723014007995850E-001
3.6805322428052611E-001
9.8345896786400955E-001

If you sign off and run it again, the same five result sets are produced. It may be expected to give a different result set each time especially with the following wording of the RAND function text in the V5R4 SQL Reference manual:

================================================
The RAND function returns a floating point value between 0 and 1.
expression
If an expression is specified, it is used as the seed value. The argument must be an expression that returns a value of a built-in small integer, large integer, character-string, or graphic-string data type. A string argument is cast to integer before evaluating the function. For more information on converting strings to integer, see “INTEGER or INT” on Page 307.

The data type of the result is double-precision floating point. If the argument can be null, the result can be null; if the argument is null, the result is the null value.
A specific seed value will produce the same sequence of random numbers for a specific instance of a RAND function in a query each time the query is executed. If a seed value is not specified, a different sequence of random numbers is produced each time the query is executed.

RAND is a non-deterministic function.

Example

oAssume that host variable HRAND is an INTEGER with a value of 100. The following statement:

SELECT RAND(:HRAND)
FROM SYSIBM.SYSDUMMY1

Returns a random floating-point number between 0 and 1, such as the approximate value .0121398.
oTo generate values in a numeric interval other than 0 to 1, multiply the RAND function by the size of the desired interval. For example, to get a random number between 0 and 10, such as the approximate value 5.8731398, multiply the function by 10:

SELECT RAND(:HRAND) * 10
FROM SYSIBM.SYSDUMMY1

================================================

This is working as designed. When an unseeded RAND function is run the first time in a job, it actually uses a 'default' seed value, which is the same each first time run. If truly random results are desired, a different seed must be provided for each first run. The following is an example of a seeded run that will provide such a random first run.

Run the following at job setup (just the first time) to get the RAND initially seeded:

SELECT RAND(MICROSECOND(CURRENT_TIMESTAMP))
FROM MYFILE

Microseconds of current timestamp is a 6-digit integer so this is a good way to seed each first run differently.

Note: It is possible to also get two identical values using this although there is a very small probability of it.

Then, you can run whatever SQL has the unseeded RAND() ....

SELECT RAND() FROM MYFILE

This provides the random results desired (unless you happen to get an identical microsecond from a prior run; however, this would be highly unlikely).

[{"Type":"MASTER","Line of Business":{"code":"LOB57","label":"Power"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SWG60","label":"IBM i"},"Platform":[{"code":"PF012","label":"IBM i"}],"Version":"6.1.0"}]

Historical Number

455503059

Document Information

Modified date:
18 December 2019

UID

nas8N1014278