Understanding locus control and analytic executables

AEs can run on either the SPUs or the host. In most cases it is best to run an operation on multiple SPUs rather than on the single host, so that portions of the operation run concurrently and take advantage of the Netezza appliance's massively parallel architecture. However, there are situations where running an operation only on the host is desirable. For example, when you must see all of the data from all of the data slices, you can use a table function AE to perform an aggregation-type operation. In a single query, the output of the AEs that run on the SPUs becomes the input of a single AE that runs on the host. The AE on the host performs the aggregation operation and returns a result for the query. In addition, debugging on the host is much easier than debugging on the SPUs.
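The two-stage pattern described above can be illustrated with a minimal sketch in plain Python. It does not use the Netezza AE API. The names DATA_SLICES, spu_stage, host_stage, and run_query are hypothetical, and a thread pool stands in for the SPUs. The sketch shows only the data flow: each SPU-side stage sees one data slice, and a single host-side stage sees every partial result.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data slices; on a real appliance each SPU sees only its own slices.
DATA_SLICES = [
    [3, 8, 1],
    [7, 2],
    [9, 4, 6],
]


def spu_stage(rows):
    """Stand-in for an AE running on one SPU: summarize a single data slice."""
    return {"count": len(rows), "total": sum(rows)}


def host_stage(partials):
    """Stand-in for the single AE on the host: combine all partial results."""
    count = sum(p["count"] for p in partials)
    total = sum(p["total"] for p in partials)
    return {"count": count, "total": total, "mean": total / count}


def run_query(slices):
    # The SPU stage runs concurrently, one task per data slice.
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(spu_stage, slices))
    # The host stage is the only place where results from every slice are visible.
    return host_stage(partials)


if __name__ == "__main__":
    print(run_query(DATA_SLICES))
```

The mean cannot be computed correctly on any single slice, which is why the final step belongs in the one locus that sees all of the data.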

AEs are called from a database system that performs query optimization. Sometimes the database optimizer chooses to run a function in a particular locus (SPUs or host), which means that the AE runs in that locus as well. AEs and UDXs are subject to the same locus rules, but for AEs the outcome can be more problematic, especially for remote AEs. AEs can be used as database extensions, but they can also be used as a mechanism for concurrent processing when they run only on the SPUs. If a remote AE is running only on the SPUs and a query or a portion of a query unexpectedly runs on the host, then that query hangs. Therefore, it is useful to understand locus control before you write an AE. The next sections describe the locus behavior of the scalar, table, and aggregate SQL functions.