Netezza Performance Server query optimization and UDX calls

There are many internal performance algorithms that are intended to make queries as fast and as efficient as possible. These internal algorithms, in combination with the UDX registration settings and query design, can result in some unexpected behaviors for when, and how often, a UDX is invoked during a query. For example, when you review the log messages or plan files for test queries, you might find that the UDX was not called, or perhaps a UDF was called more or less often than you expected. For example:
  • When you define a UDF as RETURNS NULL ON NULL INPUT, if the Netezza Performance Server system detects a NULL input value to the UDF, it skips the UDF and automatically returns a NULL value.
  • When you define a UDF as DETERMINISTIC, the Netezza Performance Server system might call the UDF only once during statement preparation time rather than once for each row it operates on during the query execution. This only happens if the UDF takes all literal arguments or no arguments, or if the UDF RETURNS NULL ON NULL INPUT and it is given at least one literal NULL as an argument.
  • If your query uses the same UDF more than once, and the UDF takes the same arguments and is DETERMINISTIC, the Netezza Performance Server query algorithms can apply common subexpression elimination (CSE) to improve the query performance. With CSE, the Netezza Performance Server system calls the function only once for a common result that it can apply to the other uses of the function within the query.
  • The Netezza Performance Server Just In Time (JIT) statistics process can also increase the number of UDX invocations. JIT statistics run fast sample queries on the affected tables to assess query performance. Thus, the process can invoke the UDXs in the query several times as it seeks the best plan for the query.

The last two examples are Netezza Performance Server query performance optimizations. For the first two example situations, you can change the query optimization behavior if necessary by changing the UDF registration settings.

If you register the UDF as NON DETERMINISTIC, the Netezza Performance Server system always invokes the function to obtain a value. (The NON DETERMINISTIC setting might also be the reason the log shows that a UDF was invoked more than you expected.)

If you register the UDF as CALLED on NULL INPUT, the Netezza Performance Server system invokes the function for one or more NULL input values. Your function must then be designed to handle input NULL values appropriately.

Carefully consider the performance implications for these changes; if your UDF really is DETERMINISTIC or it returns NULL on NULL input, there are performance benefits to the resulting query optimizations. You might want to use different settings for these registration options in your test environment than in the production environment.