In this scenario, you want to investigate the correlation
between car models and incidents of burning.
The Facet Pairs view helps you identify facet values from two selected
facets that are highly correlated with each other. The content analytics miner requires two sets of
search results to calculate a correlation. Accordingly, you select
two facets, each of which represents one of the two search result sets
of the document set.
The following scenario is based on public data from the
National Highway Traffic Safety Administration (NHTSA). An IBM® Content
Analytics with Enterprise Search administrator created a
crawler to add this data to a content analytics collection and defined
facets to classify the data for quick retrieval.
To explore statistics for a pair of facets:
- Click the Facet Pairs tab to open
the Facet Pairs view.
- In the Facet Navigation area, select Model as
the facet to explore as rows in a two-dimensional view. To
specify the second facet, select Burn as the
facet to explore as columns.
The Facet Pairs
view shows how these two facets correlate. The degree of correlation
is color-coded as shown in the Correlation Amount key. Yellow indicates
the least amount of correlation. Orange indicates a moderate amount
of correlation. Red indicates the highest amount of correlation. Focus
on the intersections that have the highest correlation values.
- To expand the area that shows statistics, click the arrow
on the border between the facet pairs table and the facet navigation
tree. When you view the data for a facet pair, you can choose among
three views: Table, Bird's eye, and Grid.
- Click the Table view icon to view
the data as a table. By default, the facets are sorted
by frequency. Click the column headers to sort by correlation value
or by facet values.
- To filter rows in the Table view, type the filter criteria
in the Filter Rows field. For
example, to see data for just Ford Explorers in the table, type expl in
the Filter Rows field. The view is dynamically
updated to show only the rows with facet values that contain expl.
- To further refine the filter, specify filter criteria for
the columns in the Table view by typing characters in the Filter
Columns field. For example, to explore
the Explorer data that also contains the terms flame or fire, type f.
The table is dynamically redrawn to show only the facet values that
meet both the Filter Rows and Filter Columns criteria.
Tip: To remove a filter, clear the field that corresponds to
the filter that you want to remove.
- To quickly identify the highly correlated intersections
among all of the data, click the Bird's eye view icon.
Click and drag the blue square in the upper left corner of the
view to highlight different data. For example, drag
the blue square until you locate areas that contain orange and red
cells, which indicate moderate to high correlation. Hover over a cell
to see the frequency and correlation statistics for that intersection,
such as statistics for the intersection of Expedition and flame.
- To view the area that you selected in greater detail, click
the Grid icon. Only the data
in the area that you highlighted in the Bird's eye view is displayed
in the Grid view. By default, the Grid view shows only a 15 x 15 cell
area of the possible 100 x 100 table at a time.
You can see the
comparison values in table form by row and column, one facet for each
dimension. For example, you can see the intersection of Expedition
and Burn, including all of the facet values that are associated with
Burn, such as fire, smoke, flame, and fire hazard. Intersections
of highly correlated facet values are highlighted in shades of orange
and red.
In the cell where Expedition and flame intersect, you
can see two numbers. The top number is the frequency, which is the
number of documents that contain both facet values (Expedition and
flame). The second number is the correlation value. You can also see
the frequency of each facet value (the number of documents that are
returned by the facet value). For example, you might see the number
3318 displayed under the flame facet value and the number 4568 displayed
under the Expedition facet value.
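To make the relationship between the frequency numbers and a correlation value more concrete, the following Python sketch computes a simple lift-style ratio: the observed number of documents that contain both facet values, divided by the number expected if the two values were independent. This is an assumption for illustration only. This topic does not state the formula that the product uses, and the function name lift_correlation, the cell frequency (120), and the collection size (500,000) are hypothetical. Only the numbers 3318 and 4568 come from the example above.

```python
def lift_correlation(both: int, row_count: int, col_count: int, total: int) -> float:
    """Observed co-occurrence divided by the co-occurrence expected under independence.

    This is an assumed, simplified measure, not the product's documented formula.
    """
    if min(row_count, col_count, total) <= 0:
        raise ValueError("row_count, col_count, and total must be positive")
    expected = row_count * col_count / total
    return both / expected


# 4568 (Expedition) and 3318 (flame) are the example frequencies from the text.
# The cell frequency (120) and the collection size (500_000) are made up.
value = lift_correlation(both=120, row_count=4568, col_count=3318, total=500_000)
print(f"Expedition x flame: frequency=120, correlation={value:.2f}")
```

Under this assumption, a value near 1 means that the two facet values occur together about as often as chance would predict, and larger values correspond to the cells that the view would color orange or red.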
After you discover highly correlated pairs of facet values,
you can go to the Documents view and look at the textual data to glean
additional insight.
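The Filter Rows and Filter Columns behavior described in the Table view steps can be pictured as a substring match on facet values. The sketch below is a rough illustration in Python, not the product's implementation. The helper name filter_values is hypothetical, and case-insensitive matching is an assumption. The facet values are taken from the examples in this scenario.

```python
def filter_values(values, text):
    """Keep the facet values that contain `text` (case-insensitive match is assumed)."""
    needle = text.strip().lower()
    if not needle:
        # An empty filter field removes the filter, as in the Tip above.
        return list(values)
    return [v for v in values if needle in v.lower()]


models = ["Explorer", "Expedition"]              # row facet values (Model)
burn = ["fire", "smoke", "flame", "fire hazard"]  # column facet values (Burn)

rows = filter_values(models, "expl")  # Filter Rows field
cols = filter_values(burn, "f")       # Filter Columns field

# Only cells whose row and column both pass their filters remain in the table.
visible_cells = [(r, c) for r in rows for c in cols]
print(visible_cells)
```

Clearing either field restores all rows or all columns for that dimension, which is why the two filters can be removed independently.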