Statistics for user-defined functions

To create statistical information for user-defined functions (UDFs), edit the SYSSTAT.ROUTINES catalog view.

The runstats utility does not collect statistics for UDFs. If UDF statistics are available, the optimizer can use them when it estimates costs for various access plans. If statistics are not available, the optimizer uses default values that assume a simple UDF.

Table 1 lists the catalog view columns for which you can provide estimates to improve performance. Note that only column values in SYSSTAT.ROUTINES (not SYSCAT.ROUTINES) can be modified by users.

Table 1. Function Statistics (SYSCAT.ROUTINES and SYSSTAT.ROUTINES)
Statistic Description
IOS_PER_INVOC Estimated number of read or write requests executed each time a function is called
INSTS_PER_INVOC Estimated number of machine instructions executed each time a function is called
IOS_PER_ARGBYTE Estimated number of read or write requests executed per input argument byte
INSTS_PER_ARGBYTE Estimated number of machine instructions executed per input argument byte
PERCENT_ARGBYTES Estimated average percent of input argument bytes that a function will actually process
INITIAL_IOS Estimated number of read or write requests executed the first or last time a function is invoked
INITIAL_INSTS Estimated number of machine instructions executed the first or last time a function is invoked
CARDINALITY Estimated number of rows generated by a table function
For example, consider EU_SHOE, a UDF that converts an American shoe size to the equivalent European shoe size. For this UDF, you might set the values of statistic columns in SYSSTAT.ROUTINES as follows:
  • INSTS_PER_INVOC. Set to the estimated number of machine instructions required to:
    • Call EU_SHOE
    • Initialize the output string
    • Return the result
  • INSTS_PER_ARGBYTE. Set to the estimated number of machine instructions required to convert the input string into a European shoe size
  • PERCENT_ARGBYTES. Set to 100, indicating that the entire input string is to be converted
  • INITIAL_INSTS, IOS_PER_INVOC, IOS_PER_ARGBYTE, and INITIAL_IOS. Each set to 0, because this UDF only performs computations
PERCENT_ARGBYTES would be used by a function that does not always process the entire input string. For example, consider LOCATE, a UDF that accepts two arguments as input and returns the starting position of the first occurrence of the first argument within the second argument. Assume that the length of the first argument is small enough to be insignificant relative to the second argument and that, on average, 75% of the second argument is searched. Based on this information and the following assumptions, PERCENT_ARGBYTES should be set to 75:
  • Half the time the first argument is not found, which results in searching the entire second argument
  • The first argument is equally likely to appear anywhere within the second argument, which results in searching half of the second argument (on average) when the first argument is found

You can use INITIAL_INSTS or INITIAL_IOS to record the estimated number of machine instructions or read or write requests that are performed the first or last time that a function is invoked; this might represent the cost, for example, of setting up a scratchpad area.

To obtain information about I/Os and the instructions that are used by a UDF, use output provided by your programming language compiler or by monitoring tools that are available for your operating system.