Field (Variable) Types
Icons appear next to fields in field lists to indicate the field type and data type. Icons also identify multiple response sets.
|Multiple response set type|Icon|
|Multiple response set, multiple categories|(icon not shown)|
|Multiple response set, multiple dichotomies|(icon not shown)|
A field's measurement level is important when you create a visualization. The measurement levels are described below. You can temporarily change a field's measurement level by right-clicking the field in a field list and choosing an option. In most cases, you need to consider only the two broadest classifications of fields, categorical and continuous:
Categorical. Data with a limited number of distinct values or categories (for example, gender or religion). Categorical fields can be string (alphanumeric) or numeric fields that use numeric codes to represent categories (for example, 0 = male and 1 = female). Also referred to as qualitative data. Sets, ordered sets, and flags are all categorical fields.
- Set. A field/variable whose values represent categories with no intrinsic ranking (for example, the department of the company in which an employee works). Also known as a nominal variable. Other examples include region, zip code, and religious affiliation.
- Ordered Set. A field/variable whose values represent categories with some intrinsic ranking (for example, levels of service satisfaction from highly dissatisfied to highly satisfied). Also known as an ordinal variable. Other examples include attitude scores representing degree of satisfaction or confidence, and preference rating scores.
- Flag. A field/variable with two distinct values, such as Yes and No or 1 and 2. Also known as a dichotomous or binary variable. Note that IBM® SPSS® Statistics treats flags as sets (nominal variables).
Continuous. Data measured on an interval or ratio scale, where the data values indicate both the order of values and the distance between values. For example, a salary of $72,195 is higher than a salary of $52,398, and the distance between the two values is $19,797. Also referred to as quantitative, scale, or numeric range data.
Categorical fields define categories in the visualization, typically to draw separate graphic elements or to group graphic elements. Continuous fields are often summarized within categories of categorical fields. For example, a default visualization of income for gender categories would display the mean income for males and the mean income for females. The raw values for continuous fields can also be plotted, as in a scatterplot. For example, a scatterplot may show the current salary and beginning salary for each case. A categorical field could be used to group the cases by gender.
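The summary described above (a continuous field averaged within the categories of a categorical field) can be sketched in a few lines of Python. This is only an illustration of the idea, not how the product computes it; the function name `mean_by_category` and the income figures are invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases as (gender, income) pairs; the values are made up.
cases = [
    ("male", 50000),
    ("female", 60000),
    ("male", 70000),
    ("female", 80000),
]

def mean_by_category(rows):
    """Group a continuous value by a categorical value and average each group."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return {category: mean(values) for category, values in groups.items()}

summary = mean_by_category(cases)
print(summary)
```

Each key in `summary` corresponds to one graphic element (for example, one bar), and the value is the height of that element. A scatterplot, by contrast, would plot every `(x, y)` case without summarizing.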
Measurement level isn't the only property of a field that determines its type. A field is also stored as a specific data type. Possible data types are strings (non-numeric data such as letters), numeric values (real numbers), and dates. Unlike the measurement level, a field's data type cannot be changed temporarily. You must change the way the data are stored in the original data set.
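One way to picture the two independent properties is a small record that carries both. The sketch below is an assumption-laden model, not the product's data structure: the `Field`, `DataType`, and `Measurement` names are hypothetical. It shows that overriding the measurement level leaves the stored data type untouched.

```python
from dataclasses import dataclass, replace
from enum import Enum

class Measurement(Enum):
    SET = "set"
    ORDERED_SET = "ordered set"
    FLAG = "flag"
    CONTINUOUS = "continuous"

class DataType(Enum):
    STRING = "string"
    NUMERIC = "numeric"
    DATE = "date"

@dataclass(frozen=True)
class Field:
    """Hypothetical model of a field: a stored data type plus a measurement level."""
    name: str
    data_type: DataType
    measurement: Measurement

    def is_categorical(self):
        # Sets, ordered sets, and flags are all categorical.
        return self.measurement is not Measurement.CONTINUOUS

# A numeric field coded 0/1 that was (wrongly) marked as continuous.
gender = Field("gender", DataType.NUMERIC, Measurement.CONTINUOUS)

# A temporary override changes only the measurement level.
as_flag = replace(gender, measurement=Measurement.FLAG)
print(as_flag.is_categorical(), as_flag.data_type)
```

The original `gender` record is unchanged, which mirrors the temporary nature of the override; changing `data_type` would correspond to re-storing the data in the original data set.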
Multiple Response Sets
Some data files support a special kind of "field" called a multiple response set. Multiple response sets aren't really "fields" in the normal sense. They use multiple fields to record responses to questions where the respondent can give more than one answer. Multiple response sets are treated like categorical fields, and you can do most of the things with them that you can do with categorical fields.
Multiple response sets can be multiple dichotomy sets or multiple category sets.
Multiple dichotomy sets. A multiple dichotomy set typically consists of multiple dichotomous fields: fields with only two possible values of a yes/no, present/absent, checked/not checked nature. Although the fields may not be strictly dichotomous, all of the fields in the set are coded the same way.
For example, a survey provides five possible responses to the question, "Which of the following sources do you rely on for news?" The respondent can indicate multiple choices by checking a box next to each choice. The five responses become five fields in the data file, coded 0 for No (not checked) and 1 for Yes (checked).
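As a rough sketch of how such a set might be tabulated, the following Python counts the respondents who checked each source. The five source names and the responses are hypothetical (the survey text above does not list them), and `count_dichotomy_set` is an illustrative helper, not a product function.

```python
# Hypothetical field names for the five news sources.
SOURCES = ["tv", "radio", "newspaper", "internet", "word_of_mouth"]

# One record per respondent; 1 = Yes (checked), 0 = No (not checked).
news_responses = [
    {"tv": 1, "radio": 0, "newspaper": 1, "internet": 1, "word_of_mouth": 0},
    {"tv": 0, "radio": 0, "newspaper": 0, "internet": 1, "word_of_mouth": 0},
    {"tv": 1, "radio": 1, "newspaper": 0, "internet": 1, "word_of_mouth": 0},
]

def count_dichotomy_set(rows, fields, counted_value=1):
    """Count, for each field in the set, how many respondents have the counted value."""
    return {
        field: sum(1 for row in rows if row[field] == counted_value)
        for field in fields
    }

news_counts = count_dichotomy_set(news_responses, SOURCES)
print(news_counts)
```

Because a respondent can check several boxes, the counts can add up to more than the number of respondents (here, seven responses from three respondents).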
Multiple category sets. A multiple category set consists of multiple fields, all coded the same way, often with many possible response categories. For example, a survey item states, "Name up to three nationalities that best describe your ethnic heritage." There may be hundreds of possible responses, but for coding purposes the list is limited to the 40 most common nationalities, with everything else relegated to an "other" category. In the data file, the three choices become three fields, each with 41 categories (40 coded nationalities and one "other" category).
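A multiple category set can be tabulated by pooling the values of all its fields. The sketch below assumes a hypothetical coding scheme (codes 1 to 40 for the listed nationalities, 41 for "other", and `None` for an unused slot) and assumes each category is counted at most once per respondent; neither detail comes from the text above.

```python
from collections import Counter

OTHER = 41  # assumed code for the "other" category

# One tuple per respondent: up to three coded choices, None where no answer was given.
heritage_rows = [
    (3, 17, None),
    (17, OTHER, None),
    (3, None, None),
]

def count_category_set(rows):
    """Pool the fields of a multiple category set and count respondents per category."""
    counts = Counter()
    for row in rows:
        # set() ensures a category repeated by one respondent is counted once.
        for code in set(row):
            if code is not None:
                counts[code] += 1
    return counts

heritage_counts = count_category_set(heritage_rows)
print(heritage_counts)
```

Unlike the dichotomy layout, the number of fields here is set by the maximum number of answers (three), not by the number of possible categories (41).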