Use this tab to view the frequency distribution of the selected
column. A frequency distribution lists each distinct value in a column, how
often it occurs, and the properties and characteristics of those values.
- Total Rows
- Shows the total number of rows in the source table of the column.
- Data Class
- Shows the inferred or selected data class of the column.
- Cardinality
- Shows the number of distinct values in the column and that number as a
percentage of the total number of records.
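As a rough illustration of the Cardinality figures, the following sketch computes a distinct-value count and its percentage of the total records for a list of values. The function name `cardinality` and the sample data are assumptions made for this example; they are not part of the product.

```python
def cardinality(values):
    """Return (distinct_count, percent_of_total) for a column's values."""
    total = len(values)
    distinct = len(set(values))
    # Guard against an empty column to avoid dividing by zero.
    percent = 100.0 * distinct / total if total else 0.0
    return distinct, percent


# Hypothetical column with 8 records and 3 distinct values.
column = ["A", "B", "A", "C", "A", "B", "A", "A"]
distinct, percent = cardinality(column)
print(distinct, percent)
```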
- A filter consists of an attribute, an operator, and a value. Click the
filter icon, and then select the attribute, the operator, and a text or date
value by which to filter the metadata. You can require that all of the
conditions match or that any of the conditions match. While a filter is
applied, the icon changes to indicate it. Click Clear to remove
the filter.
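The attribute, operator, and value structure of a filter, together with the all or any option, can be sketched as follows. This is only a conceptual model: the operator names in `OPERATORS`, the function `apply_filters`, and the row layout are assumptions for this example and do not reflect how the product implements filtering.

```python
import operator

# Hypothetical operator set; the product's actual operator list may differ.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    "contains": lambda cell, text: text in cell,
}


def apply_filters(rows, conditions, match="all"):
    """Keep rows that satisfy all (or any) of the (attribute, operator, value) conditions."""
    combine = all if match == "all" else any
    return [
        row
        for row in rows
        if combine(OPERATORS[op](row[attr], value) for attr, op, value in conditions)
    ]


rows = [
    {"value": "NY", "count": 120},
    {"value": "CA", "count": 80},
    {"value": "N/A", "count": 3},
]
conditions = [("count", ">", 50), ("value", "contains", "N")]

matches_all = apply_filters(rows, conditions, match="all")
matches_any = apply_filters(rows, conditions, match="any")
```

With these sample rows, the all option keeps only the row that meets both conditions, while the any option keeps every row that meets at least one.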
- Frequency Distribution grid
- The screen lists the current item row and row count, the page number and
total page count, forward and back paging arrows, and the number of items
displayed per page, which is adjustable.
- Shows 50 records per page by default. You can display 50, 100,
500, 1000, or 5000 records per page.
- You can sort on the paged grid in two ways:
- Click on a column header on the grid.
- Right-click any column header on the grid to open the Customize Table
dialog. On the Sorting tab of the dialog, you can select up to three sorting
criteria from three selection lists.
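The paging and multi-criteria sorting described for the grid can be approximated with the sketch below. It assumes rows are dictionaries and that each sort criterion is a hypothetical `(column, descending)` pair; the functions `sort_rows` and `page` are illustrative names, not product APIs.

```python
def sort_rows(rows, criteria):
    """Sort rows by up to three (column, descending) criteria, first criterion most significant."""
    if len(criteria) > 3:
        raise ValueError("at most three sorting criteria are supported")
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant
    # criterion first yields a correct multi-level ordering.
    for column, descending in reversed(criteria):
        result.sort(key=lambda row: row[column], reverse=descending)
    return result


def page(rows, page_number, page_size=50):
    """Return the rows on a 1-based page number."""
    start = (page_number - 1) * page_size
    return rows[start:start + page_size]


rows = [
    {"value": "B", "count": 2},
    {"value": "A", "count": 2},
    {"value": "C", "count": 5},
]
# Sort by count descending, then by value ascending.
ordered = sort_rows(rows, [("count", True), ("value", False)])
third_page = page(list(range(120)), page_number=3, page_size=50)
```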
- Data Value
- Shows the actual data value from the source.
- Frequency
- The Count field shows how often this data value
appears in the column.
- The Percent field shows the percentage of the total
records that this data value represents.
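To make the Count and Percent fields concrete, here is a minimal sketch that derives both from a list of column values. The function name `frequency_distribution` and the sample values are assumptions for this example.

```python
from collections import Counter


def frequency_distribution(values):
    """Return (value, count, percent) tuples, most frequent value first."""
    total = len(values)
    counts = Counter(values)
    return [
        (value, count, 100.0 * count / total)
        for value, count in counts.most_common()
    ]


# Hypothetical column with 8 records.
column = ["A", "B", "A", "C", "A", "B", "A", "A"]
distribution = frequency_distribution(column)
for value, count, percent in distribution:
    print(value, count, percent)
```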
- Value Flag
- Shows whether the data value is valid, invalid, or default.
- Data Type
- Shows the inferred or selected data type of the data value.
- Length
- Shows the inferred length of the data value.
- Format
- Shows the general format of the data value.
- Transformation Value
- Click to add a transformation value for the distinct value in the frequency
distribution. Entering a transformation value lets you create a mapping
reference table. Other applications, such as DataStage®, can use mapping
reference tables to correct invalid data.
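Conceptually, a mapping reference table pairs each source value with its transformation value, and a downstream application substitutes one for the other. The sketch below models that idea with a dictionary; `build_mapping_table`, `apply_mapping`, and the sample values are hypothetical and do not represent the table format that the product generates.

```python
def build_mapping_table(entries):
    """Build a source-to-target mapping from (data_value, transformation_value) pairs.

    Pairs without a transformation value (None) are left out of the table.
    """
    return {source: target for source, target in entries if target is not None}


def apply_mapping(values, mapping):
    """Replace each value that has a mapping; pass other values through unchanged."""
    return [mapping.get(value, value) for value in values]


# Hypothetical distinct values and the transformation values entered for them.
entries = [("N.Y.", "NY"), ("Calif", "CA"), ("NY", None)]
mapping = build_mapping_table(entries)
corrected = apply_mapping(["N.Y.", "NY", "Calif", "TX"], mapping)
```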
- Value
- Click the Definition field to type a definition
for the distinct value in the frequency distribution when the column has a
data class of Code or of Indicator and the distinct values are representations
for specific meanings.
- The Sources field indicates whether the data value came
from the source analysis or was entered by a user within Information Analysis.
- The Type field indicates the type of data value.
- Drill Down
- Click to view detailed information for all the associated rows from the
source data based on the specific value.
- The screen lists the current item row and row count, the page number and
total page count, forward and back paging arrows, and the number of items
displayed per page, which is adjustable.
- Shows 50 records per page by default. You can display 50, 100,
500, 1000, or 5000 records per page.
- You can sort on the paged grid in two ways:
- Click on a column header on the grid.
- Right-click any column header on the grid to open the Customize Table
dialog. On the Sorting tab of the dialog, you can select up to three sorting
criteria from three selection lists.
- Delete User Value
- Click to remove the selected data value from the frequency distribution.
You can remove a data value only if its frequency count
is zero.
- New Value
- Click to add a data value to the frequency distribution. You can add a
data value to the frequency distribution if a known data value does not occur
in the data source, but you want to include that value in a reference table.
- Reference Tables
- Click to create, edit, or view reference tables. Reference tables provide
the characteristics and properties from the frequency distribution for use
outside of information analysis. For example, you can use reference tables
in other suite components to enforce column domain and completeness requirements
or to control data conversion.
- Rebuild Inferences
- When column analysis runs, the system makes inferences based on
the actual data values. Incomplete or invalid data can result
in inappropriate inferences. Rebuilding inferences after you make analysis
decisions can often improve them.
- After you mark invalid values in the Domain & Completeness tab,
select Valid Values to rebuild column analysis inferences
using only the valid values in the column.
- Select All Values to rebuild column analysis inferences
using all the values in the column.
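The following sketch shows why rebuilding inferences from only the valid values can change the result. It uses a deliberately simple, hypothetical inference (data type and maximum length) and hypothetical value flags; the product's actual inference rules are not described in this topic, and `infer` and `rebuild_inferences` are illustrative names only.

```python
def infer(values):
    """Infer a (data_type, max_length) pair from string values (simplified rule)."""
    if values and all(value.isdigit() for value in values):
        data_type = "integer"
    else:
        data_type = "string"
    max_length = max((len(value) for value in values), default=0)
    return data_type, max_length


def rebuild_inferences(flagged_values, valid_only=True):
    """Re-run inference on (value, flag) pairs, optionally ignoring non-valid values."""
    values = [
        value
        for value, flag in flagged_values
        if flag == "valid" or not valid_only
    ]
    return infer(values)


# Hypothetical column in which one value was marked invalid.
flagged = [("10023", "valid"), ("94105", "valid"), ("UNKNOWN", "invalid")]
from_valid_values = rebuild_inferences(flagged, valid_only=True)
from_all_values = rebuild_inferences(flagged, valid_only=False)
```

In this example the single invalid value is enough to turn an integer column of length 5 into a string column of length 7, which is the kind of inappropriate inference that rebuilding with valid values avoids.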