Changing Statistics and Graphic Elements
You can convert a graphic element to another type, change the statistic used to draw the graphic element, or specify the collision modifier that determines what happens when graphic elements overlap.
How to Convert a Graphic Element
- Select the graphic element that you want to convert.
- Click the Element tab on the properties palette.
- Select a new graphic element type from the Type list.
Graphic Element Type | Description |
---|---|
Point | A marker identifying a specific data point. A point element is used in scatterplots and other related visualizations. |
Interval | A rectangular shape drawn at a specific data value and filling the space between an origin and another data value. An interval element is used in bar charts and histograms. |
Line | A line that connects data values. |
Path | A line that connects data values in the order they appear in the dataset. |
Area | A line that connects data elements with the area between the line and an origin filled in. |
Polygon | A multi-sided shape enclosing a data region. A polygon element could be used in a binned scatterplot or a map. |
Schema | An element consisting of a box with whiskers and markers indicating outliers. A schema element is used for boxplots. |
How to Change the Statistic
- Select the graphic element whose statistic you want to change.
- Click the Element tab on the properties palette.
- Select the statistic that you want to use. The available statistics are described below.
Summary Statistics Calculated from a Continuous Field
- Mean. A measure of central tendency. The arithmetic average, the sum divided by the number of cases.
- Median. The value above and below which half of the cases fall, the 50th percentile. If there is an even number of cases, the median is the average of the two middle cases when they are sorted in ascending or descending order. The median is a measure of central tendency not sensitive to outlying values (unlike the mean, which can be affected by a few extremely high or low values).
- Mode. The most frequently occurring value. If several values share the greatest frequency of occurrence, each of them is a mode.
- Minimum. The smallest value of a numeric variable.
- Maximum. The largest value of a numeric variable.
- Range. The difference between the minimum and maximum values.
- Mid Range. The middle of the range, that is, the value whose difference from the minimum is equal to its difference from the maximum.
- Sum. The sum or total of the values, across all cases with nonmissing values.
- Cumulative Sum. The cumulative sum of the values. Each graphic element shows the sum for one subgroup plus the total sum of all previous groups.
- Percent Sum. The sum of the field within each subgroup, expressed as a percentage of the sum across all groups.
- Cumulative Percent Sum. The cumulative version of Percent Sum. Each graphic element shows the percentage for one subgroup plus the total percentage of all previous groups, so the last group reaches 100%.
- Variance. A measure of dispersion around the mean, equal to the sum of squared deviations from the mean divided by one less than the number of cases. The variance is measured in units that are the square of those of the variable itself.
- Standard Deviation. A measure of dispersion around the mean, equal to the square root of the variance. In a normal distribution, about 68% of cases fall within one standard deviation of the mean and about 95% of cases fall within two standard deviations. For example, if the mean age is 45, with a standard deviation of 10, about 95% of the cases would be between 25 and 65 in a normal distribution.
- Standard Error. A measure of how much the value of a test statistic varies from sample to sample. It is the standard deviation of the sampling distribution for a statistic. For example, the standard error of the mean is the standard deviation of the sample means.
- Kurtosis. A measure of the extent to which there are outliers. For a normal distribution, the value of the kurtosis statistic is zero. Positive kurtosis indicates that the data exhibit more extreme outliers than a normal distribution. Negative kurtosis indicates that the data exhibit less extreme outliers than a normal distribution. The definition of kurtosis that is used, where the value is 0 for a normal distribution, is sometimes referred to as excess kurtosis. Some software may report kurtosis such that the value is 3 for a normal distribution.
- Skewness. A measure of the asymmetry of a distribution. The normal distribution is symmetric and has a skewness value of 0. A distribution with a significant positive skewness has a long right tail. A distribution with a significant negative skewness has a long left tail. As a guideline, a skewness value more than twice its standard error is taken to indicate a departure from symmetry.
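The definitions above can be checked by hand with a short script. The following Python sketch is illustrative only and is not the application's implementation: the function names `describe` and `cumulative_and_percent_sums` are hypothetical, subgroups are assumed to be processed in the order they are supplied, and skewness and kurtosis are left out because their exact formulas are not given here.

```python
import math
import statistics
from itertools import accumulate


def describe(values):
    """Summary statistics for one subgroup of a continuous field."""
    n = len(values)
    lo, hi = min(values), max(values)
    variance = statistics.variance(values)  # sum of squared deviations / (n - 1)
    std_dev = math.sqrt(variance)
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),      # averages the two middle cases if n is even
        "modes": statistics.multimode(values),    # every value that shares the top frequency
        "minimum": lo,
        "maximum": hi,
        "range": hi - lo,
        "mid_range": (lo + hi) / 2,
        "sum": sum(values),
        "variance": variance,
        "std_dev": std_dev,
        "std_error": std_dev / math.sqrt(n),      # standard error of the mean
    }


def cumulative_and_percent_sums(groups):
    """groups maps a subgroup name to its values, in display order (an assumption)."""
    sums = [sum(values) for values in groups.values()]
    total = sum(sums)
    running = list(accumulate(sums))  # sum for this group plus all previous groups
    return {
        name: {
            "sum": s,
            "cumulative_sum": c,
            "percent_sum": 100 * s / total,
            "cumulative_percent_sum": 100 * c / total,
        }
        for name, s, c in zip(groups, sums, running)
    }
```

For example, `describe([2, 4, 4, 6, 9])` gives a mean of 5, a median of 4, a range of 7, a mid range of 5.5, and a variance of 7.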
The following region statistics may result in more than one graphic element per subgroup. When using the interval, area, or edge graphic elements, a region statistic results in one graphic element showing the range. All other graphic elements result in two separate elements, one showing the start of the range and one showing the end of the range.
- Region: Range. The range of values between the minimum and maximum values.
- Region: 95% Confidence Interval of Mean. A range of values that has a 95% chance of including the population mean.
- Region: 95% Confidence Interval of Individual. A range of values that has a 95% chance of including the predicted value given the individual case.
- Region: 1 Standard Deviation above/below Mean. A range of values between 1 standard deviation above and below the mean.
- Region: 1 Standard Error above/below Mean. A range of values between 1 standard error above and below the mean.
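Each region statistic is a pair of values, a start and an end. The Python sketch below shows one way such pairs could be computed for a single subgroup. It is an approximation under stated assumptions, not the application's method: the function name `region_statistics` is hypothetical, the confidence interval of the mean uses a normal critical value (roughly 1.96 for 95%) where a Student's t value may be more appropriate for small samples, and the confidence interval of the individual is omitted because its formula is not given here.

```python
import math
import statistics


def region_statistics(values, confidence=0.95):
    """Return (start, end) pairs for the region statistics of one subgroup."""
    n = len(values)
    mean = statistics.fmean(values)
    std_dev = statistics.stdev(values)   # sample standard deviation (n - 1)
    std_error = std_dev / math.sqrt(n)   # standard error of the mean
    # Assumption: normal approximation for the critical value.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return {
        "range": (min(values), max(values)),
        "ci_mean": (mean - z * std_error, mean + z * std_error),
        "std_dev": (mean - std_dev, mean + std_dev),
        "std_error": (mean - std_error, mean + std_error),
    }
```

An interval element would draw one bar spanning each pair, whereas a point element would draw two markers, one at the start and one at the end.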
Count-Based Summary Statistics
- Count. The number of rows/cases.
- Cumulative Count. The cumulative number of rows/cases. Each graphic element shows the count for one subgroup plus the total count of all previous groups.
- Percent of Count. The percentage of rows/cases in each subgroup compared to the total number of rows/cases.
- Cumulative Percent of Count. The cumulative percentage of rows/cases in each subgroup compared to the total number of rows/cases. Each graphic element shows the percentage for one subgroup plus the total percentage of all previous groups.
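The count-based statistics follow the same pattern as the sum-based ones, with the number of cases taking the place of the summed field. The Python sketch below is illustrative only. The function name `count_statistics` is hypothetical, and subgroups are assumed to be ordered by first appearance in the data.

```python
from collections import Counter
from itertools import accumulate


def count_statistics(categories):
    """categories holds one subgroup label per row/case."""
    counts = Counter(categories)                  # keeps first-seen order of labels
    total = sum(counts.values())
    running = list(accumulate(counts.values()))   # count plus all previous groups
    return {
        label: {
            "count": count,
            "cumulative_count": cumulative,
            "percent_of_count": 100 * count / total,
            "cumulative_percent_of_count": 100 * cumulative / total,
        }
        for (label, count), cumulative in zip(counts.items(), running)
    }
```

With five cases in group `a`, three in `b`, and two in `c`, the percentages are 50, 30, and 20, and the cumulative percentages are 50, 80, and 100.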
How to Specify the Collision Modifier
The collision modifier determines what happens when graphic elements overlap.
- Select the graphic element for which you want to specify the collision modifier.
- Click the Element tab on the properties palette.
- From the Modifier drop-down list, select a collision modifier. -auto- lets the application determine which collision modifier is appropriate for the graphic element type and statistic.
Overlay. Draw graphic elements on top of each other when they have the same value.
Stack. Stack graphic elements that would normally be superimposed when they have the same data values.
Dodge. Move graphic elements next to other graphic elements that appear at the same value, rather than superimposing them. The graphic elements are arranged symmetrically. That is, the graphic elements are moved to opposite sides of a central position. Dodging is very similar to clustering.
Pile. Move graphic elements next to other graphic elements that appear at the same value, rather than superimposing them. The graphic elements are arranged asymmetrically. That is, the graphic elements are piled on top of one another, with the graphic element on the bottom positioned at a specific value on the scale.
Jitter (normal). Randomly reposition graphic elements at the same data value using a normal distribution.
Jitter (uniform). Randomly reposition graphic elements at the same data value using a uniform distribution.
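To make the difference between the modifiers concrete, the Python sketch below computes where elements that share a value might be drawn under stacking, dodging, and jittering. It is a simplified illustration, not the application's layout algorithm: the function names, the `width` and `spread` parameters, and their default values are all hypothetical.

```python
import random
from itertools import accumulate


def stack(heights):
    """Stack: each element starts where the previous one ends. Returns (bottom, top) pairs."""
    tops = list(accumulate(heights))
    bottoms = [0] + tops[:-1]
    return list(zip(bottoms, tops))


def dodge(x, count, width=0.2):
    """Dodge: place `count` elements side by side, symmetrically around x."""
    return [x + (i - (count - 1) / 2) * width for i in range(count)]


def jitter(x, count, spread=0.1, distribution="uniform", rng=None):
    """Jitter: move each element by a random offset drawn from the chosen distribution."""
    rng = rng or random.Random()
    if distribution == "normal":
        # Normal jitter: offsets cluster near x; spread is the standard deviation.
        return [x + rng.gauss(0, spread) for _ in range(count)]
    # Uniform jitter: offsets are equally likely anywhere within +/- spread.
    return [x + rng.uniform(-spread, spread) for _ in range(count)]
```

For three elements at the position 1.0, `dodge(1.0, 3)` places them at about 0.8, 1.0, and 1.2, whereas overlay would leave all three at 1.0. Passing a seeded `random.Random` as `rng` makes the jittered positions repeatable.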