Scratchpads for external functions and methods

A scratchpad enables a user-defined function or method to save its state from one invocation to the next.

For example, here are two situations where saving state between invocations is beneficial:

  1. Functions or methods that, to be correct, depend on saving state.
    An example of such a function or method is a simple counter function that returns a '1' the first time it is called, and increments the result by one each successive call. Such a function could, in some circumstances, be used to number the rows of a SELECT result:
       SELECT counter(), a, b+c, ...
         FROM tablex
         WHERE ...

    The function needs a place to store the current value for the counter between invocations, where the value will be guaranteed to be the same for the following invocation. On each invocation, the value can then be incremented and returned as the result of the function.

    This type of routine is NOT DETERMINISTIC. Its output does not depend solely on the values of its SQL arguments.

  2. Functions or methods where the performance can be improved by the ability to perform some initialization actions.
    An example of such a function or method, which might be a part of a document application, is a match function, which returns 'Y' if a given document contains a given string, and 'N' otherwise:
       SELECT docid, doctitle, docauthor
         FROM docs
         WHERE match('myocardial infarction', docid) = 'Y'
    This statement returns all the documents containing the particular text string value represented by the first argument. What match would like to do is:
    • First time only.

      Retrieve a list of all the document IDs that contain the string 'myocardial infarction' from the document application, that is maintained outside of the database manager. This retrieval is a costly process, so the function would like to do it only one time, and save the list somewhere handy for subsequent calls.

    • On each call.

      Use the list of document IDs saved during the first call to see if the document ID that is passed as the second argument is contained in the list.

    This type of routine is DETERMINISTIC. Its answer only depends on its input argument values. What is shown here is a function whose performance, not correctness, depends on the ability to save information from one call to the next.

Both of these needs are met by the ability to specify a SCRATCHPAD in the CREATE statement:
   CREATE FUNCTION counter()

   CREATE FUNCTION match(varchar(200), char(15))
     RETURNS char(1) ... SCRATCHPAD 10000;

The SCRATCHPAD keyword tells the database manager to allocate and maintain a scratchpad for a routine. The default size for a scratchpad is 100 bytes, but you can determine the size (in bytes) for a scratchpad. The match example is 10000 bytes long. The database manager initializes the scratchpad to binary zeros before the first invocation. If the scratchpad is being defined for a table function, and if the table function is also defined with NO FINAL CALL (the default), the database manager refreshes the scratchpad before each OPEN call. If you specify the table function option FINAL CALL, the database manager does not examine or change the content of the scratchpad after its initialization. For scalar functions defined with scratchpads, the database manager also does not examine or change the scratchpad's content after its initialization. A pointer to the scratchpad is passed to the routine on each invocation, and the database manager preserves the routine's state information in the scratchpad.

So for the counter example, the last value returned could be kept in the scratchpad. And the match example could keep the list of documents in the scratchpad if the scratchpad is big enough, otherwise it could allocate memory for the list and keep the address of the acquired memory in the scratchpad. Scratchpads can be variable length: the length is defined in the CREATE statement for the routine.

The scratchpad only applies to the individual reference to the routine in the statement. If there are multiple references to a routine in a statement, each reference has its own scratchpad, thus scratchpads cannot be used to communicate between references. The scratchpad only applies to a single database agent (an agent is a database entity that performs processing of all aspects of a statement). There is no "global scratchpad" to coordinate the sharing of scratchpad information between the agents. This is especially important for situations where the database manager establishes multiple agents to process a statement (in either a single partition or multiple partition database). In these cases, even though there might only be a single reference to a routine in a statement, there could be multiple agents doing the work, and each would have its own scratchpad. In a multiple partition database, where a statement referencing a UDF is processing data on multiple partitions, and invoking the UDF on each partition, the scratchpad would only apply to a single partition. As a result, there is a scratchpad on each partition where the UDF is executed.

If the correct execution of a function depends on there being a single scratchpad per reference to the function, then register the function as DISALLOW PARALLEL. This will force the function to run on a single partition, thereby guaranteeing that only a single scratchpad will exist per reference to the function.

Because it is recognized that a UDF or method might require system resources, the UDF or method can be defined with the FINAL CALL keyword. This keyword tells the database manager to call the UDF or method at end-of-statement processing so that the UDF or method can release its system resources. It is vital that a routine free any resources it acquires; even a small leak can become a big leak in an environment where the statement is repetitively invoked, and a big leak can cause a database crash.

As the scratchpad is of a fixed size, the UDF or method can itself include a memory allocation and thus, can make use of the final call to free the memory. For example, the preceding match function cannot predict how many documents will match the given text string. So a better definition for match is:
   CREATE FUNCTION match(varchar(200), char(15))
     RETURNS char(1) ... SCRATCHPAD 10000 FINAL CALL;

For UDFs or methods that use a scratchpad and are referenced in a subquery, the database manager might make a final call, if the UDF or method is so specified, and refresh the scratchpad between invocations of the subquery. You can protect yourself against this possibility, if your UDFs or methods are ever used in subqueries, by defining the UDF or method with FINAL CALL and using the call-type argument, or by always checking for the binary zero state of the scratchpad.

If you do specify FINAL CALL, note that your UDF or method receives a call of type FIRST. This could be used to acquire and initialize some persistent resource.

Following is a simple Java™ example of a UDF that uses a scratchpad to compute the sum of squares of entries in a column. This example takes in a column and returns a column containing the cumulative sum of squares from the top of the column to the current row entry:
  EXTERNAL NAME 'UDFsrv!SumOfSquares'
  // Sum Of Squares using Scratchpad UDF
  public void SumOfSquares(int inColumn,
                           int outSum)
  throws Exception
    int sum = 0;
    byte[] scratchpad = getScratchpad();
    // variables to read from SCRATCHPAD area
    ByteArrayInputStream byteArrayIn = new ByteArrayInputStream(scratchpad);
    DataInputStream dataIn = new DataInputStream(byteArrayIn);

    // variables to write into SCRATCHPAD area
    byte[] byteArrayCounter;
    int i;
    ByteArrayOutputStream byteArrayOut = new ByteArrayOutputStream(10);
    DataOutputStream dataOut = new DataOutputStream(byteArrayOut);
        // initialize data
	    sum = (inColumn * inColumn); 
        // save data into SCRATCHPAD area
        byteArrayCounter = byteArrayOut.toByteArray();
        for(i = 0; i < byteArrayCounter.length; i++)
          scratchpad[i] = byteArrayCounter[i];
        // read data from SCRATCHPAD area
        sum = dataIn.readInt();
        // work with data
        sum = sum + (inColumn * inColumn);
        // save data into SCRATCHPAD area
        byteArrayCounter = byteArrayOut.toByteArray();
        for(i = 0; i < byteArrayCounter.length; i++)
          scratchpad[i] = byteArrayCounter[i];
    //set the output value
    set(2, sum);  
  } // SumOfSquares UDF
Please note that there is a built-in database function that performs the same task as the SumOfSquares UDF. This example was chosen to demonstrate the use of a scratchpad.