R language output mode
The UDX output signature is the definition of a specific function or aggregate result. It can be a scalar (UDF/UDA) or a table (UDTF).
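# Returns a single value for each row of the data frame.
apply(iris, 1, function(x) length(x))
# Returns a vector of five values for each row: the length and four square roots.
apply(iris[,1:4], 1, function(x) c(length(x),sqrt(as.double(x))))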
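To see how the shape of the result differs between the two calls above, they can be run in plain R, outside the adapter. This is only an illustrative sketch; the variable names `scalar_like`, `vector_like`, and `table_like` are made up for this example.

```r
# iris ships with base R (datasets package), so no extra setup is needed.
scalar_like <- apply(iris, 1, function(x) length(x))
vector_like <- apply(iris[, 1:4], 1,
                     function(x) c(length(x), sqrt(as.double(x))))

length(scalar_like)  # one value per input row
dim(vector_like)     # values per row, then number of input rows

# apply() stores each row's result as a column, so transpose the matrix
# to get one output row per input row.
table_like <- t(vector_like)
dim(table_like)
```

The first call yields one value per row, while the second yields several values per row, which is why a fixed output shape cannot be assumed for an arbitrary user function.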
In the R Adapter, the output mode is controlled by the OUTPUT_TYPE environment variable that you set during registration.
- For sparse mode, add:
--define "r_ae_output_type=SPARSE"
- For table mode, add:
--define "r_ae_output_type=TABLE"
Sparse Output Mode
Generally, the output of the user-provided function cannot be restricted to any predefined form. In the sparse output mode, the R AE returns a table of the definition TABLE(columnid INT4, value VARCHAR(16000)), which means that each R AE output column is converted to a character string. To retrieve the original value, you must manually cast each value to the required data type. However, you should avoid this practice because it might introduce extra rounding errors and degrade performance, especially for large data sets.
In this mode, there is no difference between returning output data with a general setOutput function and specific setOutput<DataType> functions, where <DataType> is a placeholder for a specific data type identifier. All output data is eventually stored as a character string.
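The round trip through character strings described above can be imitated in plain R. This is a minimal sketch, not adapter code: the data frame `sparse` only mimics the (columnid, value) layout, the 1-based `columnid` numbering is an assumption, and `as.character()` stands in for whatever string conversion the adapter actually performs.

```r
# Hypothetical values for one output row; not produced by the adapter.
row_result <- c(4, sqrt(c(5.1, 3.5, 1.4, 0.2)))

# Sparse-style layout: one (columnid, value) pair per output column,
# with every value stored as a character string.
sparse <- data.frame(
  columnid = seq_along(row_result),   # assumed 1-based numbering
  value = as.character(row_result),
  stringsAsFactors = FALSE
)

# Manual cast back to the numeric type, as a consumer of the table would do.
restored <- as.numeric(sparse$value)

# Any difference here is precision lost in the string conversion.
max_abs_error <- max(abs(restored - row_result))
print(sparse)
print(max_abs_error)
```

The cast has to be repeated for every value, which is where the extra cost on large data sets comes from.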
Table Output Mode
In the table output mode, the output signature is one of the following:
- An exact setting that specifies the columns with their data types.
- TABLE(ANY), for which you must define a shaper function that specifies the output signature at run time.
In the table output mode, avoid the general setOutput function. Instead, use the specific setOutput<DataType> functions, where <DataType> is a placeholder for a specific data type identifier.
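The idea behind type-specific setters can be sketched in plain R with mock functions. Everything in this example is hypothetical: `signature`, `mock_set_int32`, and `mock_set_double` are stand-ins written for illustration and are not the adapter's setOutput<DataType> functions. The sketch also assumes that the purpose of the typed setters is to keep each value consistent with the declared column type.

```r
# Hypothetical declared signature, for example TABLE(n INT4, root DOUBLE).
signature <- c(n = "integer", root = "double")

# Mock output row; stands in for the adapter's internal output buffer.
out <- vector("list", length(signature))

# Mock typed setters: each one writes only to a column of its own type.
mock_set_int32 <- function(buf, i, value) {
  stopifnot(signature[[i]] == "integer")
  buf[[i]] <- as.integer(value)
  buf
}
mock_set_double <- function(buf, i, value) {
  stopifnot(signature[[i]] == "double")
  buf[[i]] <- as.double(value)
  buf
}

out <- mock_set_int32(out, 1, 4)
out <- mock_set_double(out, 2, sqrt(5.1))

# Each column keeps the type that the signature declares.
col_types <- vapply(out, typeof, character(1))
print(col_types)
```

With setters like these, a mismatch between a value and the declared column type fails at the point of the call rather than surfacing later in the result table.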