Creating a UDAF written in R
The example in this section shows how to create, on a Db2® system, a user-defined aggregate function (UDAF) written in R.
The following example creates a scalar function called average
that calculates
the average of several integers.
Code
Add the following code to the $DB2_HOME/function/routine/average.R file.
nz.mode <- 'aggregate'
nz.init <- function() {
setOutputNull(0)
setOutputNull(1)
}
nz.accum <- function() {
input <- getInputColumn(0)
if (is.null(input))
return()
sum <- getOutputColumn(0)
count <- getOutputColumn(1)
if (is.null(count)) {
setOutputInt64(1, 1)
} else {
setOutputInt64(1, count + 1)
}
if (is.null(sum)) {
setOutput(0, input)
} else {
setOutput(0, sum + input)
}
}
nz.merge <- function() {
inputSum <- getInputColumn(0)
inputCount <- getInputColumn(1)
if(is.null(inputSum) || is.null(inputCount))
return()
outputSum <- getOutputColumn(0)
if (is.null(outputSum))
setOutput(0, inputSum)
else
setOutput(0, inputSum + outputSum)
outputCount <- getOutputColumn(1)
if (is.null(outputCount))
setOutput(1, inputCount)
else
setOutput(1, inputCount + outputCount)
}
nz.final <- function () {
sum <- getInputColumn(0)
if (is.null(sum)) {
setOutputNull(0)
return()
}
count <- getInputColumn(1)
if (is.null(count))
{
userError("NumberAverage.returnValue(): something is wrong.")
return()
}
setOutputDouble(0, as.numeric(as.numeric(sum) / count))
}
Deployment
Deploy the UDX to a Db2 instance by issuing the following command, where path_to_average.R is to be replaced by the fully qualified path to the average.R file.
db2 "CREATE FUNCTION average(integer) returns double aggregate with (integer, bigint) language R parameter style \
NPSGENERIC FENCED NOT THREADSAFE ALLOW PARALLEL NO DBINFO DETERMINISTIC \
NO EXTERNAL ACTION CALLED ON NULL INPUT NO SQL external name 'path_to_average.R' "
Running
It is assumed that there is a table 'test_int' with a single integer column 'i':
I
-----------
1
2
4
8
Use this function in a query, for example:
db2 "select average(i) from test_int"
The following result is returned:
1
------------------------
+3.75000000000000E+000
1 record(s) selected.