Create the C++ file for the UDTF

To begin, use any text editor to create your C++ file. The file name must have a .cpp extension. You might want to create a directory such as /home/nz/udx_files as your area for UDX code files.

Your C++ file must include the udxinc.h header file, which contains the required declarations for user-defined table functions and processing on the SPUs.

#include "udxinc.h"
In addition, make sure that you include any standard C++ library header files that your table function requires. If your UDTF requires any user-defined shared libraries, note the names of the libraries because you must specify them when you register the UDTF in the database. For example, the following file also includes string.h, which declares memcpy():
#include "udxinc.h"
#include <string.h>

User-defined shared libraries must exist in the database before you can register the UDTF and specify those libraries as dependencies. You can register the UDTF without specifying any library dependencies, and after the libraries are added, use the ALTER FUNCTION command to update the UDTF definition with the correct dependencies. For more information about user-defined shared libraries, see Create a user-defined shared library.

The UDX classes and functions for API version 2 are defined in a namespace called nz::udx_ver2. Your C++ program must reference the correct namespace. For example:
#include "udxinc.h"

using namespace nz::udx_ver2;
To implement a UDTF, you create a class that is derived from the Udtf base class. For example:
#include "udxinc.h"

using namespace nz::udx_ver2;

class parseNames : public Udtf {
public:
};
The parseNames UDTF takes an input table of string fields that are separated by spaces or commas, and returns a table in which each field of the input string is output on its own row. The sample nextOutputRow() code in this topic splits on commas only. As with other UDXs, you define the variables that are required for the UDTF algorithm at the class level. For example:
#include "udxinc.h"
using namespace nz::udx_ver2;

class parseNames : public Udtf {

private:
    char value[1000];
    int valuelen;
    int i;
public:
};
The parseNames UDTF uses the following variables:
  • The value variable contains a copy of the input parameter.
  • The valuelen variable contains the length of the input string.
  • The i variable is a counter.
Each UDTF must implement the instantiate() method and a constructor, plus two UDTF-specific methods: newInputRow() and nextOutputRow(). The UDTF-specific nextEoiOutputRow() method is optional. Examples of the methods and their purposes follow.
  • As with UDFs, the instantiate() method is used to create the UDTF object dynamically. In UDX version 2, the instantiate() method takes one argument (UdxInit *pInit), which enables access to the memory specification, the log setting, and the UDX environment (see UDX environment). The constructor must also take a UdxInit object and pass it to the base class constructor. An example follows:
    #include "udxinc.h"
    using namespace nz::udx_ver2;
    
    class parseNames : public Udtf {
    private:
        char value[1000];
        int valuelen;
        int i;
    public:
        parseNames(UdxInit *pInit) : Udtf(pInit) {}
    
        static Udtf* instantiate(UdxInit*);
    };
    
    Udtf* parseNames::instantiate (UdxInit* pInit) {   
        return new parseNames(pInit); 
    }
  • For a UDTF, you use the newInputRow() method to perform initialization actions such as copying input arguments, initializing class variables, and managing situations such as null input variables. The method is called once for each input row. For the parseNames UDTF example, the following sample code copies the input list to the variable value, sets valuelen to the length of the input string, and initializes the variable i to zero:
        virtual void newInputRow() {
            StringArg *valuesa = stringArg(0);
            bool valuesaNull  = isArgNull(0);
            if (valuesaNull)
                valuelen = 0;
            else {
                if (valuesa->length >= 1000)
                    throwUdxException("Input value must be less than 1000 characters.");
                memcpy(value, valuesa->data, valuesa->length);
                value[valuesa->length] = 0;
                valuelen = valuesa->length;
            }
            i = 0;
        }
  • You use the nextOutputRow() method to create and return the next output row of the table. The method must also detect when there is no more data to return and, in that case, return Done. Netezza Performance Server calls this method at least once per input row. Sample code follows:
        virtual DataAvailable nextOutputRow() {
            if (i >= valuelen)
                return Done;
            // save starting position of name
            int start = i;
            // scan string for next comma
            while ((i < valuelen) && value[i] != ',')
                i++;
            // return word
            StringReturn *rk = stringReturnColumn(0);
            if (rk->size < i-start)
              throwUdxException("Value exceeds return size");
            memcpy(rk->data, value+start, i-start);
            rk->size = i-start;
            i++;
            return MoreData;
        }

    As shown in the example, you create a column by using the appropriate column return type such as stringReturnColumn() or intReturnColumn(), and you specify the position of the column. The example passes 0 for the first (and only) output column. The return MoreData statement indicates that a row is being returned and that there might be another row to process. When the counter variable i reaches the end of the input string, there is no more data to process and nextOutputRow() returns Done.

  • If your UDTF supports the TABLE WITH FINAL syntax, the nextEoiOutputRow() method is called at least once after the end of the input so that the UDTF can process and output any remaining data. The base class has a default implementation of this method that returns no rows when called. The method is similar to nextOutputRow() except that newInputRow() is not called before it. A sample method follows:
        virtual DataAvailable nextEoiOutputRow() {
            return Done;
        }
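
The call sequence described in this topic can be tried outside the database with a small stand-alone mock. The following sketch is an illustration under stated assumptions, not code from udxinc.h: the ParseNamesMock class, the runMock() driver, the locally defined DataAvailable enum, and the std::string that stands in for the StringReturn column are all hypothetical names. The driver loop only approximates how Netezza Performance Server might invoke the methods (newInputRow() once per input row, nextOutputRow() until it returns Done, then nextEoiOutputRow() after the last row).

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the return values; NOT the udxinc.h type.
enum DataAvailable { Done, MoreData };

// Mock of the parseNames state machine with no dependency on udxinc.h.
class ParseNamesMock {
    char value[1000];
    int valuelen = 0;
    int i = 0;
    std::string out;  // assumption: stands in for the StringReturn column
public:
    // Mirrors newInputRow(): copy the input and reset the scan position.
    void newInputRow(const char* data, int length, bool isNull) {
        if (isNull) {
            valuelen = 0;
        } else {
            // The real sample calls throwUdxException(); the mock asserts.
            assert(length < 1000);
            std::memcpy(value, data, length);
            value[length] = 0;
            valuelen = length;
        }
        i = 0;
    }

    // Mirrors nextOutputRow(): emit the next comma-separated field.
    DataAvailable nextOutputRow() {
        if (i >= valuelen)
            return Done;
        int start = i;
        while (i < valuelen && value[i] != ',')
            i++;
        out.assign(value + start, i - start);
        i++;  // step over the comma
        return MoreData;
    }

    // Mirrors the sample nextEoiOutputRow(): no rows at end of input.
    DataAvailable nextEoiOutputRow() { return Done; }

    const std::string& column() const { return out; }
};

// Hypothetical driver that imitates the assumed call order.
std::vector<std::string> runMock(const std::vector<std::string>& inputRows) {
    ParseNamesMock udtf;
    std::vector<std::string> rows;
    for (const std::string& in : inputRows) {
        udtf.newInputRow(in.data(), static_cast<int>(in.size()), false);
        while (udtf.nextOutputRow() == MoreData)
            rows.push_back(udtf.column());
    }
    while (udtf.nextEoiOutputRow() == MoreData)
        rows.push_back(udtf.column());
    return rows;
}
```

For instance, calling runMock({"Ann,Bob", "Cy"}) yields three output rows, one per field. Because the scan stops at commas only, a field that contains spaces is returned unchanged, and a trailing comma does not produce an extra empty row.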