Level: Introductory Andy Smith (andy_smith@uk.ibm.com), Software Developer, IBM
01 Apr 2001
Andy Smith takes a look at some of the benefits of an interactive presentation and exploration of data. He examines some traditional methods of step-by-step data exploration and filtering, and identifies their shortfalls. He introduces Attribute Explorer, and shows how its benefits are brought to bear on the problems identified. Finally, he discusses the potential use of Attribute Explorer in a car showroom kiosk application.
Attribute Explorer falls into the category of information visualization.
This is distinct from scientific visualization, which may involve very large data sets with rich three-dimensional representations, often closely related to real-world
objects and situations. (Examples include stress analysis in building materials,
weather patterns, and medical scanning.) Information visualization places less reliance on a literal presentation,
using more abstract representations to convey meaning. Typical examples
include business graphic representations such as bar charts, line plots,
and pie charts. Attribute Explorer uses rapid feedback in response to user interaction to give insights into data relationships that might otherwise escape notice.
This work is informed by research carried out by Professor Bob Spence and
colleagues at Imperial College, London (see Resources).
Traditional solutions
It is a very common requirement to be able to subset data based on varying
criteria in order to home in on data of interest. General query languages
such as SQL are designed for this purpose, and many Web sites provide
facilities that incorporate databases that allow "trial and error" inquiries
to be submitted.
Listing 1 contains an example of SQL that might be used to select a set
of suitable cars for closer analysis:
Listing 1. Data subsetting using a query language
SELECT Manufacturer, Model
  FROM Cars
 WHERE CityMPG > 30
   AND HorsePower > 90;
A graphical user interface might mask the complexity of the query language by offering
entry fields for the required minimum city MPG and horsepower, and then
executing the query in response to a trigger such as a "Go" button.
This approach has several drawbacks:
- The result set will often be empty, or overwhelmingly large, especially
if the user is unfamiliar with the data.
- Users commonly do not have a clear idea of what criteria
to apply; they tend to refine their requirements based on what is available.
Several discrete attempts at
refining the criteria and observing the results are needed to discover what is available.
- Traditional approaches do not give any indication of data that
almost conforms to the selection criteria. In many cases
it would be extremely useful to know that data only just fails, and
on what criteria it fails.
- The time taken to formulate criteria and receive the result set
inhibits understanding of the data.
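The first two drawbacks can be reproduced with a minimal sketch that runs the criteria from Listing 1 against the five sample rows shown later in this column. This is an illustration only: the in-memory SQLite database, the `run_query` helper, and the `EngineCC` column name are assumptions, and because the sample rows carry no Manufacturer or Model values, the query selects the numeric columns instead.

```python
import sqlite3

# The five sample rows from this column's car table:
# (city MPG, highway MPG, engine size in cc, horsepower).
CARS = [
    (46, 50, 1000, 55),
    (33, 37, 1200, 73),
    (17, 25, 1300, 255),
    (31, 33, 1300, 63),
    (39, 43, 1300, 70),
]


def run_query(min_city_mpg, min_horsepower):
    """Return the rows that pass both criteria, as a conventional query would."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; only the four numeric attributes are known.
    conn.execute(
        "CREATE TABLE Cars (CityMPG INTEGER, HighwayMPG INTEGER, "
        "EngineCC INTEGER, HorsePower INTEGER)"
    )
    conn.executemany("INSERT INTO Cars VALUES (?, ?, ?, ?)", CARS)
    rows = conn.execute(
        "SELECT CityMPG, HorsePower FROM Cars "
        "WHERE CityMPG > ? AND HorsePower > ? ORDER BY CityMPG",
        (min_city_mpg, min_horsepower),
    ).fetchall()
    conn.close()
    return rows


first_try = run_query(30, 90)   # the criteria from Listing 1
second_try = run_query(30, 60)  # a guessed relaxation of the horsepower limit
print(first_try)
print(second_try)
```

On these rows the first attempt returns nothing at all, and nothing in the empty result hints that relaxing the horsepower limit, rather than the MPG limit, is what brings cars back. The user only finds that out by submitting a second query.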
A dynamic approach
With the advent of powerful personal computers over the last decade,
it has become possible to present information in a very dynamic way,
responsive to user interactions with a barely discernible delay. This
provides an opportunity to dispense with the traditional query
submission approach, and instead provide immediate feedback to
finely adjusted criteria.
Such responsiveness invites exploration and experimentation, and capitalises
on the human brain's ability to discern patterns that emerge from
such feedback.
Attribute Explorer data
As its name suggests, Attribute Explorer deals with data representing
attributes of objects.
It works well with homogeneous tabular data where each column represents an
attribute, and each row is an instance of an object with the described
attributes. This, of course, is the typical result of a relational query,
so a large segment of the world's data might lend itself to analysis using
this technique.
Continuing with the car theme, the table below shows a short selection of car-related
data. A larger set is used in subsequent Attribute Explorer
screen shots.
Example of car data as used in this column:
| City MPG | Highway MPG | Engine size (cc) | Horsepower |
|---|---|---|---|
| 46 | 50 | 1000 | 55 |
| 33 | 37 | 1200 | 73 |
| 17 | 25 | 1300 | 255 |
| 31 | 33 | 1300 | 63 |
| 39 | 43 | 1300 | 70 |
| ... | ... | ... | ... |
Benefit 1: Distribution
Each attribute to be explored is displayed as a bar chart, with the
range of attribute values segmented along the horizontal axis, and each data
point displayed as a rectangle within its segment. Figure 1a shows the "City MPG"
attribute chart as first displayed. Figure 1b shows
the same chart with a constraint applied.
In these charts each rectangle represents
a car. In the first chart, all cars are shown as white, indicating that
all cars meet the applied constraint. In the second chart a constraint of a
minimum city MPG of 23 has been applied using the slider below the chart.
Those cars that fail this constraint are colored to indicate the failure.
Figure 1a. Attribute distribution, unconstrained
Figure 1b. Attribute distribution, constrained
As constraints are applied, cars fall within or outside of those constraints,
but all cars remain visible, providing a constant overview of the
situation.
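The idea of keeping every data point visible while flagging the ones that fail can be sketched as follows. This is not the applet's code: the `distribution` function, the segment width of 10, and the text rendering are assumptions made for illustration, applied to the city MPG column of the sample table with the minimum of 23 used in Figure 1b.

```python
from collections import defaultdict

CITY_MPG = [46, 33, 17, 31, 39]  # city MPG column of the sample table


def distribution(values, segment_width, minimum):
    """Group values into fixed-width segments, flagging each as pass or fail.

    Every value stays in the result; failing the constraint changes only
    the flag, which a display would render as a colour.
    """
    segments = defaultdict(list)
    for value in values:
        start = (value // segment_width) * segment_width
        segments[start].append((value, value >= minimum))
    return dict(sorted(segments.items()))


chart = distribution(CITY_MPG, segment_width=10, minimum=23)
for start, cars in chart.items():
    # "[ ]" stands in for a white rectangle, "[x]" for a coloured one.
    marks = " ".join("[ ]" if ok else "[x]" for _, ok in cars)
    print(f"{start:>2}-{start + 9:<2} {marks}")
```

Because the constraint only changes a flag, moving the slider never removes a rectangle from the chart, which is what preserves the overview.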
Benefit 2: Cumulative analysis
It is rare that data is analysed based on a single attribute. But when
constraints are applied to multiple attributes simultaneously, what
is the cumulative effect, and how can it be represented? It is here
that Attribute Explorer begins to come into its own.
Figure 2. Cumulative effects of constraints

In Figure 2 we see all four attributes from the car data table above presented
simultaneously. Constraints have been applied to all four attributes such
that the city MPG should be greater than 23, the highway MPG greater than
31, the engine size greater than 2000cc, and the horsepower greater than 140.
Note that in the "Engine Size" chart the horizontal car delineation has been
suppressed; otherwise the horizontal lines would be too close together
for clarity. Only three cars meet all our constraints, and these are shown
in white on each of the charts. The remainder are color-coded with a
progressively darker shade as they fail more constraints. From this we can see
where a small relaxation of a constraint might include cars that would otherwise
have been dismissed from consideration. Indeed, the ability to rapidly
adjust constraints encourages experimentation and consideration of
many more possibilities. During such experimentation a mental model of the
data is rapidly acquired. As constraints are altered, the number of constraint failures for each car
is recalculated. Cars within each attribute segment are reordered such that
the cars with the lowest number of constraint failures are shown closest to
the horizontal axis.
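The recalculation and reordering described above can be sketched in a few lines. The names `failed_attributes` and `rank` are hypothetical, and the code is a simplified stand-in for whatever the applet does internally. It uses the four lower limits quoted for Figure 2 and the five sample rows, so the counts differ from the figure, which is drawn from a larger data set.

```python
ATTRIBUTES = ("city_mpg", "highway_mpg", "engine_cc", "horsepower")

# The five sample rows from the car data table.
CARS = [
    (46, 50, 1000, 55),
    (33, 37, 1200, 73),
    (17, 25, 1300, 255),
    (31, 33, 1300, 63),
    (39, 43, 1300, 70),
]

# Lower limits quoted for Figure 2; a value must be greater than its limit.
LOWER_LIMITS = {
    "city_mpg": 23,
    "highway_mpg": 31,
    "engine_cc": 2000,
    "horsepower": 140,
}


def failed_attributes(car, limits):
    """Names of the attributes on which this car fails its constraint."""
    return [name for name, value in zip(ATTRIBUTES, car) if value <= limits[name]]


def rank(cars, limits):
    """Order cars by number of failed constraints, fewest first (ties keep input order)."""
    return sorted(cars, key=lambda car: len(failed_attributes(car, limits)))


failure_counts = [len(failed_attributes(car, LOWER_LIMITS)) for car in CARS]
ranked = rank(CARS, LOWER_LIMITS)

for car in ranked:
    print(car, "fails:", failed_attributes(car, LOWER_LIMITS))
```

The failure count is what a display would map to a shade, and the sorted order is what would place the best candidates closest to the horizontal axis. Rerunning `rank` whenever a limit changes gives the continuous re-ranking the text describes.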
Benefit 3: Discovery through movement?
Whilst the cumulative application of attribute constraints lets us rank
data and discern the likely effect of further constraint tightening or
relaxation, an additional benefit can be derived by virtue of the rapid feedback
that is possible.
Figure 3. Attribute relationship discovery
In Figure 3 we see all four attributes with a single constraint applied to
city MPG. Notice that the grouping of white (100% acceptable) cars is broadly
similar for city MPG and highway MPG. Notice also that the distribution of
white cars for engine size and horsepower is roughly inverted from that of the
MPG attributes. If you have scripting enabled in your browser, move the mouse
over the image and it will start to animate, simulating the movement of the city MPG constraint band. Now we can see how strong the relationship between city and highway MPG is,
and how strong the inverse relationship is with engine size and horsepower.
This might not come as a great surprise, given a knowledge of the trade-offs
between economy and engine size in cars. However, it is often useful to
have predicted relationships confirmed, and even more useful to discover
anomalies, because they provide an indication that further analysis of the
data might be necessary. Attribute Explorer might not be the tool for that further analysis, but as a means of discovering the need, it provides a very useful technique.
An applet version of the Attribute Explorer that uses the example data is provided (see Download). The applet requires the
Java 2 Plug-in.
If you do not already have the plug-in installed, its
download and installation should start automatically on Internet Explorer
and Netscape Navigator. The download is
about 8MB, which might take a few minutes depending on your connection speed.
It is interesting to speculate on the ability of Attribute Explorer to
provide insight through movement. With the illustrated data the pattern
that emerges is clear, but will it be useful with more enigmatic data?
One way to find out might be to conduct a series of controlled user tests,
both with data that is known to have clearly discernible relationships between
its attributes, and with data where known relationships, if any, are more
subtle. Will the subjects pick out the known patterns quickly and easily?
Will they discover relationships hitherto unknown? Does Attribute Explorer
reveal relationships faster than a static plotting of one attribute against
another?
Feedback on these questions is welcome. If you would like to download
Attribute Explorer (see Resources)
and conduct the user tests described (or any other tests), we would be most
interested in your findings.
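For readers who want a static counterpart to the moving constraint band, the sweep can be approximated numerically. The sketch below is an assumption-laden illustration, not part of Attribute Explorer: the `sweep` function and the chosen thresholds are hypothetical, and five sample rows are far too few to establish a relationship. It simply shows the kind of summary that could be compared against what test subjects perceive in the animation.

```python
# The five sample rows: (city MPG, highway MPG, engine size in cc, horsepower).
CARS = [
    (46, 50, 1000, 55),
    (33, 37, 1200, 73),
    (17, 25, 1300, 255),
    (31, 33, 1300, 63),
    (39, 43, 1300, 70),
]


def sweep(cars, thresholds):
    """For each minimum city MPG, summarise the cars that still pass.

    Returns (minimum, count, mean highway MPG, mean horsepower) tuples.
    Thresholds that leave no passing cars are skipped.
    """
    results = []
    for minimum in thresholds:
        passing = [car for car in cars if car[0] >= minimum]
        if not passing:
            continue
        mean_highway = sum(car[1] for car in passing) / len(passing)
        mean_hp = sum(car[3] for car in passing) / len(passing)
        results.append((minimum, len(passing), mean_highway, mean_hp))
    return results


summary = sweep(CARS, thresholds=[0, 30, 35, 45])
for minimum, count, mean_highway, mean_hp in summary:
    print(f"city MPG >= {minimum:>2}: {count} cars, "
          f"mean highway MPG {mean_highway:.2f}, mean horsepower {mean_hp:.2f}")
```

On these rows the mean highway MPG of the passing cars rises and the mean horsepower falls as the city MPG minimum is raised, which matches the direction of the relationships seen in Figure 3.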
Car kiosk application
As mentioned above, the Attribute Explorer applies a ranking based on the
number of attribute constraints failed, and then applies a color coding
to the attribute bar charts to convey this ranking. The ranking algorithm is
not very subtle: It does not take into account the degree of failure for
each attribute, and it does not allow for a relative ranking between attributes.
However, these enhancements could be easily incorporated. The user interface is necessarily generic since Attribute Explorer is
designed without prior knowledge of the data that it is to analyze.
However, for a specific application, an enhanced ranking algorithm might be
used to good effect when coupled with a bespoke user interface designed with
one or two tasks in mind. If we take a car showroom as an example, the attributes of cars are
many, yet the number of attributes that affect customers' buying
decisions is perhaps less numerous. Consider a kiosk application in
the showroom, or a Web application. Underlying the application is a car database much like that used in the examples above, but with several more attributes included.
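The two ranking enhancements mentioned above, degree of failure and relative weighting between attributes, could take many forms. One possible sketch follows. The `shortfall` and `penalty` functions, the proportional shortfall measure, the numeric weights, and the limits are all assumptions chosen for illustration, and the `row` key simply records each car's position in the sample table.

```python
# Hypothetical importance weights for two attributes.
WEIGHTS = {"city_mpg": 3.0, "horsepower": 1.0}
# Hypothetical lower limits chosen so that several sample cars fail narrowly.
LIMITS = {"city_mpg": 35, "horsepower": 70}

cars = [
    {"row": 1, "city_mpg": 46, "horsepower": 55},
    {"row": 2, "city_mpg": 33, "horsepower": 73},
    {"row": 3, "city_mpg": 17, "horsepower": 255},
    {"row": 4, "city_mpg": 31, "horsepower": 63},
    {"row": 5, "city_mpg": 39, "horsepower": 70},
]


def shortfall(value, lower_limit):
    """Fraction by which a value falls short of its lower limit (0.0 if it passes)."""
    if value >= lower_limit:
        return 0.0
    return (lower_limit - value) / lower_limit


def penalty(car, limits=LIMITS, weights=WEIGHTS):
    """Weighted sum of shortfalls; lower is better, 0.0 means every limit is met."""
    return sum(weights[name] * shortfall(car[name], limits[name]) for name in limits)


ordered = sorted(cars, key=penalty)
for car in ordered:
    print(car["row"], round(penalty(car), 3))
```

Under a simple failure count, the car in row 3 (one failure) would rank above the car in row 4 (two failures). With degree of failure taken into account, row 4's two narrow misses cost less than row 3's large miss on the heavily weighted city MPG attribute, so their order is reversed.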
Figure 4. Car kiosk initial screen
The application supports two tasks: a filtering and ranking of the
available cars based on the customer's criteria, and then a detailed
examination of the top ranked cars. Figure 4 shows the initial kiosk
screen. To the left of the screen is a selection mechanism for the set of
car attributes most commonly used by customers
when making a buying decision. A prompt invites customers to
select one or more attributes on which they wish to base their car
selection.
The Car Suitability list and Specification detail areas are muted because no criteria have yet been established with which to present a list of ranked vehicles.
Figure 5. Car kiosk with one attribute selected
In Figure 5 we see that the customer has selected Cost as the
first attribute of interest. A mechanism for constraining the acceptable
cost range for cars is presented, above which is a bar chart showing
the cost distribution for available cars. This chart has less prominence
than in the generic Attribute Explorer interface. The intent is that
sophisticated users will understand its significance and find it of use, but
that less sophisticated users will not be intimidated. The constraining mechanism is initialised without any constraint, so all
cars are deemed to fall within an acceptable cost range. The red range marker extends
the full width of the available area. Above this is a means for the user to
specify a Low, Medium, or High importance for this
attribute, with Medium being the default. Having established cost as a ranking criterion, the Car Suitability list
is no longer muted, and a list of cars in ascending sequence by cost is
shown.
Figure 6. Car kiosk with constraints applied
In Figure 6 the customer has selected Doors and Insurance as two further
attributes, and has begun to apply constraints to these attributes.
As this happens, the sequence of available cars is altered in real time
to reflect the changed ranking. Against each car is an indication of
which attributes have failed. In this way, the user is provided with cues to
suggest which attribute constraints might be relaxed in order to bring
certain cars back into contention. All of the above interactions support the task of filtering
and ranking the available cars. Having arrived at a suitably ranked
set of cars, the customer can select individual cars for closer analysis,
at which point the detailed information typically available in a brochure
is displayed. This is illustrated in Figure 7.
Figure 7. Car kiosk with car selected
This mock-up illustrates the use of Attribute Explorer for car selection,
with a user interface geared to two specific tasks. Different designs aimed
at different product sets could use the same
underlying technology to help users home in on the ideal product for
them in a way that encourages exploration and experimentation.
Summary
We have introduced the Attribute Explorer and contrasted it with
more traditional data exploration techniques. We have summarised
its benefits to include:
- An attribute-based display showing the distribution of objects by attribute,
and allowing the application of a clearly indicated constraint
without eliminating from consideration objects that fail the constraint.
- Multiple concurrent attribute displays, each display being able to show the
cumulative effect of all applied attribute constraints.
- The ability to discern attribute relationships through very rapid feedback
in response to the modification of attribute constraints.
We have then considered an application for an enhanced Attribute Explorer
analysis engine, presenting a bespoke user interface designed for a car
showroom or car Web site. The power of Attribute Explorer lies in its simplicity and flexibility. The
concepts are readily understood, and the presentation lends itself to
learning by experimentation. There is no penalty associated with
experimentation, and no feedback delay as alternative constraints are applied. It can work as a generalised data analysis tool, or as the engine behind
applications tailored for a specific domain. If you would like to apply
Attribute Explorer to your data, download the prototype and try it out.
Download
| Name | Size | Download method |
|---|---|---|
| us-atex-AEApplet.zip | 111KB | HTTP |
Resources
Learn
Get products and technologies
- You can download the Attribute Explorer prototype from alphaWorks.
About the author
Andy Smith is a Software Engineer with the IBM Ease of Use team in Warwick, UK. He has worked on a number of projects exploring the ease-of-use aspects of data visualization. You can contact him at andy_smith@uk.ibm.com. The Auto Kiosk graphics were produced by Jaclyn Neal.