density.kernel Function (GPL)

Syntax

density.kernel.<kernel function>(<algebra>, fixedWindow(<numeric>), <function>)

or

density.kernel.<kernel function>(<algebra>, nearestNeighbor(<integer>), <function>)

or

density.kernel.<kernel function>.joint(<algebra>, fixedWindow(<numeric>), <function>)

or

density.kernel.<kernel function>.joint(<algebra>, nearestNeighbor(<integer>), <function>)

<kernel function>. A kernel function. This specifies how data are weighted by the density function, depending on how close the data are to the current point.

<algebra>. Graph algebra, such as x*y. Refer to Brief Overview of GPL Algebra for an introduction to graph algebra.

<numeric>. fixedWindow specifies the proportion of data points to include when calculating the smooth function. This takes a numeric value between 0 and 1 and is optional. Alternatively, you can use the nearestNeighbor function to calculate the smoother's bandwidth.

<integer>. nearestNeighbor specifies the number (k) of nearest neighbors to include when calculating the smooth function. This takes a positive integer and is optional. Alternatively, you can use the fixedWindow function to calculate the smoother's bandwidth.

<function>. One or more valid functions. These are optional. Use scaledToData("false") when comparing densities with very different sample sizes.

joint. Used to create densities based on values in the first (x axis) and second (y axis) dimensions. Without the joint modifier, the density is based only on values in the first (x axis) dimension. You would typically use the modifier for 3-D densities.
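
As an illustration of the optional <function> argument, the following sketch draws one density curve per group without scaling to the data. The variable name group is hypothetical, and splitting the curves with a color aesthetic is an assumption about how the chart is built. Only scaledToData("false") comes from the description above.

```
COMMENT: "group" is a hypothetical categorical variable used for illustration
ELEMENT: line(position(density.kernel.epanechnikov(x, scaledToData("false"))), color(group))
```

Because both the bandwidth and the functions are optional, the bandwidth argument is omitted here and the default fixed window applies.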

Description

Calculates the probability density using a nonparametric kernel function. This is often used to add a distribution curve that does not assume a particular model (like normal or Poisson). You can use the fixedWindow function or the nearestNeighbor function to specify the smoother's bandwidth. If you do not specify an explicit bandwidth, the internal algorithm uses a fixed window whose size is determined by the underlying data values and the specific kernel function.
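
For orientation, the textbook form of a kernel density estimate can be written as follows. This is a general sketch, not a statement of the internal GPL algorithm. The symbols n (number of cases), h (bandwidth), and K (kernel function) are notation introduced here, and how a fixedWindow or nearestNeighbor value maps to h is an assumption.

```latex
\hat{f}(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)
```

With a fixed window, h is the same at every point x. With a nearest-neighbor bandwidth, h is typically replaced by h_k(x), the distance from x to its k-th nearest data value, so the window widens where the data are sparse.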

Examples

Figure 1. Example: Adding the default kernel distribution
ELEMENT: line(position(density.kernel.epanechnikov(x)))

Figure 2. Example: Adding a kernel distribution using a fixed window
ELEMENT: line(position(density.kernel.epanechnikov(x, fixedWindow(0.05))))

Figure 3. Example: Adding a kernel distribution using k nearest neighbors
ELEMENT: line(position(density.kernel.epanechnikov(x, nearestNeighbor(100))))

Figure 4. Example: Creating a 3-D graph showing kernel densities
COORD: rect(dim(1,2,3))
ELEMENT: interval(position(density.kernel.epanechnikov.joint(x*y)))
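
The examples above do not cover a joint density with an explicit bandwidth or a kernel other than Epanechnikov. The following sketch combines the documented syntax forms to show both. The bandwidth values 0.1 and 50 are arbitrary illustrations, not recommended settings, and the two fragments are meant to be used in separate graphs.

```
COMMENT: 3-D joint density with an explicit fixed window (0.1 is an arbitrary value)
COORD: rect(dim(1,2,3))
ELEMENT: interval(position(density.kernel.epanechnikov.joint(x*y, fixedWindow(0.1))))

COMMENT: Separate 2-D graph using a Gaussian kernel with 50 nearest neighbors (arbitrary value)
ELEMENT: line(position(density.kernel.gaussian(x, nearestNeighbor(50))))
```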

Kernel Functions

uniform. All data receive equal weights.

epanechnikov. Data near the current point receive higher weights than extreme data. This function weights extreme points more than the triweight, biweight, and tricube kernels but less than the Gaussian and Cauchy kernels.

biweight. Data far from the current point receive more weight than the triweight kernel allows but less weight than the Epanechnikov kernel permits.

tricube. Data close to the current point receive higher weights than both the Epanechnikov and biweight kernels allow.

triweight. Data close to the current point receive higher weights than any other kernel allows. Extreme cases get very little weight.

gaussian. Weights follow a normal distribution, resulting in higher weighting of extreme cases than the Epanechnikov, biweight, tricube, and triweight kernels.

cauchy. Extreme values receive more weight than any other kernel allows, with the exception of the uniform kernel.
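
The weighting behavior described above can be compared using the standard textbook definitions of these kernels, where u is the scaled distance between a data value and the current point. The formulas are given here as an assumption for orientation. The exact scaling and normalization that GPL applies are not stated in this section and may differ.

```latex
\begin{array}{lll}
\text{uniform}      & K(u) = \tfrac{1}{2}                          & |u| \le 1 \\[4pt]
\text{epanechnikov} & K(u) = \tfrac{3}{4}\,(1 - u^2)               & |u| \le 1 \\[4pt]
\text{biweight}     & K(u) = \tfrac{15}{16}\,(1 - u^2)^2           & |u| \le 1 \\[4pt]
\text{tricube}      & K(u) = \tfrac{70}{81}\,(1 - |u|^3)^3         & |u| \le 1 \\[4pt]
\text{triweight}    & K(u) = \tfrac{35}{32}\,(1 - u^2)^3           & |u| \le 1 \\[4pt]
\text{gaussian}     & K(u) = \tfrac{1}{\sqrt{2\pi}}\,e^{-u^2/2}    & -\infty < u < \infty \\[4pt]
\text{cauchy}       & K(u) = \dfrac{1}{\pi\,(1 + u^2)}             & -\infty < u < \infty
\end{array}
```

In this form, the uniform, Epanechnikov, biweight, tricube, and triweight kernels give zero weight to data outside |u| ≤ 1, whereas the Gaussian and Cauchy kernels give every data value some weight.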