R language aggregates 2
This example uses the following file name: sampleuda.R.
This example finds the longest string and largest number in the given group and returns the sum of the numbers and the length of the string.
All examples are run on a standard R data set with the name iris. You can load this data set into the Netezza appliance by using the Netezza R Library client-side package function as.nz.data.frame(). For more details, see the Netezza Package for R Developer's Guide.
Code
nz.mode <- 'aggregate'
nz.init <- function () {
setOutputString(0, '')
setOutputDouble(1, 0)
}
nz.accum <- function () {
inputstr <- getInputColumn(0)
statestr <- getOutputColumn(0)
if (nchar(statestr) < nchar(inputstr))
setOutputString(0, inputstr)
inputnum <- getInputColumn(1)
statenum <- getOutputColumn(1)
if (inputnum > statenum)
setOutputDouble(1, inputnum)
}
nz.merge <- function () {
inputstr <- getInputColumn(0)
inputnum <- getInputColumn(1)
# after running nz.accum there is no possibility to
# get a NULL value in the input state - therefore,
# if NULL is encountered, we can skip this value set
# since it must to be coming from an empty data slice
if (is.null(inputstr) || is.null(inputnum))
return()
statestr <- getOutputColumn(0)
statenum <- getOutputColumn(1)
if (nchar(statestr) < nchar(inputstr))
setOutputString(0, inputstr)
if (inputnum > statenum)
setOutputDouble(1, inputnum)
}
nz.final <- function () {
setOutputDouble(0, nchar(getInputColumn(0)) + getInputColumn(1))
}Compilation
/nz/export/ae/utilities/bin/compile_ae --language r --template compile \
--version 3 --db dev --user nz /tmp/sampleuda.RRegistration
/nz/export/ae/utilities/bin/register_ae --language r --template uda \
--version 3 --db dev --user nz \
--sig 'ruda(VARCHAR(255),DOUBLE)' \
--return 'DOUBLE' --state '(VARCHAR(255),DOUBLE)' --exe sampleuda.R
Running
The following example shows an invocation:
SELECT ruda('text',2);
RUDA
------
6
(1 row)
SELECT DISTINCT(datasliceid) FROM nza..iris;
DATASLICEID
-------------
4
3
1
2
(4 rows)iris table is stored on data slices 1-4. The UDA
is then invoked as any other SQL
aggregate:SELECT ruda(CLASS,SEPALLENGTH) FROM nza..iris;
RUDA
------
17.9
(1 row)SELECT MAX(sepallength) FROM nza..iris;
MAX
-----
7.9
SELECT MAX(LENGTH(class)) FROM nza..iris;
MAX
-----
10The queries show the same result, a sum of 17.9.