Attribute Explorer falls into the category of information visualization. This is distinct from scientific visualization, which may involve very large data sets with rich three-dimensional representations, often closely related to real-world objects and situations. (Examples include stress analysis in building materials, weather patterns, and medical scanning.)
Information visualization places less reliance on a literal presentation, using more abstract representations to convey meaning. Typical examples include business graphic representations such as bar charts, line plots, and pie charts. Attribute Explorer uses rapid feedback in response to user interaction to give insights into data relationships that might otherwise escape notice. This work is informed by research carried out by Professor Bob Spence and colleagues at Imperial College, London (see Resources).
It is a very common requirement to be able to subset data based on varying criteria in order to home in on data of interest. General query languages such as SQL are designed for this purpose, and many Web sites provide facilities that incorporate databases that allow "trial and error" inquiries to be submitted.
Listing 1 contains an example of SQL that might be used to select a set of suitable cars for closer analysis:
Listing 1. Data subsetting using a query language
SELECT Manufacturer, Model
FROM Cars
WHERE CityMPG > 30 AND
HorsePower > 90
A graphical user interface might mask the complexity of the query language by offering entry fields for the required minimum city MPG and horsepower, and then executing the query in response to a trigger such as a "Go" button.
This approach has several drawbacks:
- The result set will often be empty, or overwhelmingly large, especially if the user is unfamiliar with the data.
- Users commonly do not have a clear idea of what criteria to apply; they tend to refine their requirements based on what is available. Several discrete attempts at refining the criteria and observing the results are needed to discover what is available.
- Traditional approaches do not give any indication of data that almost conforms to the selection criteria. In many cases it would be extremely useful to know that data only just fails, and on what criteria it fails.
- The time taken to formulate criteria and receive the result set inhibits understanding of the data.
With the advent of powerful personal computers over the last decade, it has become possible to present information in a very dynamic way, responsive to user interactions with a barely discernible delay. This provides an opportunity to dispense with the traditional query submission approach, and instead provide immediate feedback to finely adjusted criteria.
Such responsiveness invites exploration and experimentation, and capitalises on the human brain's ability to discern patterns that emerge from such feedback.
As its name suggests, Attribute Explorer deals with data representing attributes of objects. It works well with homogeneous tabular data where each column represents an attribute, and each row is an instance of an object with the described attributes. This, of course, is the typical result of a relational query, so a large segment of the world's data might lend itself to analysis using this technique.
Continuing with the car theme, the table below shows a short selection of car-related data. A larger set is used in subsequent Attribute Explorer screen shots.
Example of car data as used in this column:
| City MPG | Highway MPG | Engine size (cc) | Horsepower |
|---|---|---|---|
| 46 | 50 | 1000 | 55 |
| 33 | 37 | 1200 | 73 |
| 17 | 25 | 1300 | 255 |
| 31 | 33 | 1300 | 63 |
| 39 | 43 | 1300 | 70 |
| ... | ... | ... | ... |
Each attribute to be explored is displayed as a bar chart, with the range of attribute values segmented along the horizontal axis, and each data point displayed as a rectangle within its segment.
Figure 1a shows the "City MPG" attribute chart as first displayed. Figure 1b shows the same chart with a constraint applied. In these charts each rectangle represents a car. In the first chart, all cars are shown as white, indicating that no car fails a constraint, since none has yet been applied. In the second chart a constraint of a minimum city MPG of 23 has been applied using the slider below the chart. Those cars that fail this constraint are colored to indicate the failure.
Figure 1a. Attribute distribution, unconstrained

Figure 1b. Attribute distribution, constrained

As constraints are applied, cars fall within or outside of those constraints, but all cars remain visible, providing a constant overview of the situation.
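The segmented display described above can be sketched in a few lines of Python. This is only an illustration, not the Attribute Explorer implementation: the function name `segment_chart`, the segment width of 10, and the text rendering are assumptions, and the data is the five-row sample from the table in this column. Every car stays in the chart; a failing car is simply flagged.

```python
from collections import defaultdict

# Sample rows from the table in this column:
# (city MPG, highway MPG, engine size in cc, horsepower)
CARS = [
    (46, 50, 1000, 55),
    (33, 37, 1200, 73),
    (17, 25, 1300, 255),
    (31, 33, 1300, 63),
    (39, 43, 1300, 70),
]

WIDTH = 10  # assumed segment width along the horizontal axis


def segment_chart(values, segment_width, minimum=None):
    """Group values into fixed-width segments and flag each against a minimum.

    Returns {segment_start: [(value, passes), ...]}; no value is ever dropped.
    """
    segments = defaultdict(list)
    for value in values:
        start = (value // segment_width) * segment_width
        passes = minimum is None or value >= minimum
        segments[start].append((value, passes))
    return dict(sorted(segments.items()))


city_mpg = [car[0] for car in CARS]
chart = segment_chart(city_mpg, WIDTH, minimum=23)

# Text rendering: "[ ]" is a car that meets the constraint, "[#]" one that fails.
for start, cars in chart.items():
    marks = " ".join("[ ]" if ok else "[#]" for _, ok in cars)
    print(f"{start:>3}-{start + WIDTH - 1:<3} {marks}")
```

Calling `segment_chart(city_mpg, WIDTH)` without a minimum corresponds to the unconstrained view, in which every car is marked as passing.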
Benefit 2: Cumulative analysis
It is rare that data is analysed based on a single attribute. But when constraints are applied to multiple attributes simultaneously, what is the cumulative effect, and how can it be represented? It is here that Attribute Explorer begins to come into its own.
Figure 2. Cumulative effects of constraints

In Figure 2 we see all four attributes from the data in the table above presented simultaneously. Constraints have been applied to all four attributes such that the city MPG should be greater than 23, the highway MPG greater than 31, the engine size greater than 2000cc, and the horsepower greater than 140. Note that in the "Engine Size" chart the horizontal car delineation has been suppressed; otherwise the horizontal lines would be too close together for clarity.
Only three cars meet all our constraints, and these are shown in white on each of the charts. The remainder are color-coded with a progressively darker shade as they fail more constraints. From this we can see where a small relaxation of a constraint might include cars that would otherwise have been dismissed from consideration. Indeed, the ability to rapidly adjust constraints encourages experimentation and consideration of many more possibilities. During such experimentation a mental model of the data is rapidly acquired.
As constraints are altered, the number of constraint failures for each car is recalculated. Cars within each attribute segment are reordered such that the cars with the lowest number of constraint failures are shown closest to the horizontal axis.
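A minimal sketch of this recalculation and reordering follows, assuming "greater than" thresholds as in the Figure 2 description. The threshold values, the segment width, and the names `failure_count` and `stack_segments` are illustrative assumptions chosen to suit the five-row sample table; they are not taken from the prototype.

```python
from collections import defaultdict

# Sample rows from the table in this column.
CARS = [
    {"city_mpg": 46, "highway_mpg": 50, "engine_cc": 1000, "horsepower": 55},
    {"city_mpg": 33, "highway_mpg": 37, "engine_cc": 1200, "horsepower": 73},
    {"city_mpg": 17, "highway_mpg": 25, "engine_cc": 1300, "horsepower": 255},
    {"city_mpg": 31, "highway_mpg": 33, "engine_cc": 1300, "horsepower": 63},
    {"city_mpg": 39, "highway_mpg": 43, "engine_cc": 1300, "horsepower": 70},
]

# Illustrative thresholds only; each value must be exceeded to pass.
CONSTRAINTS = {"city_mpg": 23, "highway_mpg": 31, "engine_cc": 1100, "horsepower": 60}


def failure_count(car, constraints):
    """Number of constraints this car fails."""
    return sum(1 for attr, threshold in constraints.items() if car[attr] <= threshold)


def stack_segments(cars, attr, segment_width, constraints):
    """Bin cars by one attribute, then order each bin by failure count.

    The first entry in each bin is the one drawn closest to the horizontal axis.
    """
    segments = defaultdict(list)
    for car in cars:
        start = (car[attr] // segment_width) * segment_width
        segments[start].append((car[attr], failure_count(car, constraints)))
    # sorted() is stable, so cars with equal failure counts keep their input order.
    return {
        start: sorted(stack, key=lambda entry: entry[1])
        for start, stack in sorted(segments.items())
    }


failures = [failure_count(car, CONSTRAINTS) for car in CARS]
stacked = stack_segments(CARS, "engine_cc", 500, CONSTRAINTS)
print(failures)
print(stacked)
```

The failure count doubles as the shade index: zero failures would be drawn in white, with higher counts mapped to progressively darker colors.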
Benefit 3: Discovery through movement?
Whilst the cumulative application of attribute constraints lets us rank data and discern the likely effect of further constraint tightening or relaxation, an additional benefit can be derived by virtue of the rapid feedback that is possible.
Figure 3. Attribute relationship discovery

In Figure 3 we see all four attributes with a single constraint applied to city MPG. Notice that the grouping of white (100% acceptable) cars is broadly similar for city MPG and highway MPG. Notice also that the distribution of white cars for engine size and horsepower is roughly inverted from that of the MPG attributes. If you have scripting enabled in your browser, move the mouse over the image and it will start to animate, simulating the movement of the city MPG constraint band.
Now we can see how strong the relationship between city and highway MPG is, and how strong the inverse relationship is with engine size and horsepower. This might not come as a great surprise, given a knowledge of the trade-offs between economy and engine size in cars. However, it is often useful to have predicted relationships confirmed, and even more useful to discover anomalies, because they provide an indication that further analysis of the data might be necessary. Attribute Explorer might not be the tool for that further analysis, but as a means of discovering the need, it provides a very useful technique.
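As a numerical cross-check on what the animation suggests visually, a correlation coefficient can be computed between pairs of attributes. The sketch below is not part of Attribute Explorer; it applies the standard Pearson formula to the five sample rows from the table in this column, so the result says nothing about the larger data set used in the figures. The helper name `pearson` is an assumption.

```python
from math import sqrt

# Sample columns from the table in this column.
CITY_MPG = [46, 33, 17, 31, 39]
HIGHWAY_MPG = [50, 37, 25, 33, 43]
HORSEPOWER = [55, 73, 255, 63, 70]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


r_highway = pearson(CITY_MPG, HIGHWAY_MPG)
r_horsepower = pearson(CITY_MPG, HORSEPOWER)

# A value near +1 indicates a strong direct relationship,
# a value near -1 a strong inverse one.
print(f"city vs highway MPG: {r_highway:+.2f}")
print(f"city MPG vs horsepower: {r_horsepower:+.2f}")
```

A single coefficient summarises a relationship but hides the individual anomalies that the interactive display makes visible.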
An applet version of the Attribute Explorer that uses the example data is provided (see Download). The applet requires the Java 2 Plug-in. If you do not already have the plug-in installed, its download and installation should start automatically on Internet Explorer and Netscape Navigator. The download is about 8MB, which might take a few minutes depending on your connection speed.
It is interesting to speculate on the ability of Attribute Explorer to provide insight through movement. With the illustrated data the pattern that emerges is clear, but will it be useful with more enigmatic data? One way to find out might be to conduct a series of controlled user tests, both with data that is known to have clearly discernible relationships between its attributes, and with data where known relationships, if any, are more subtle. Will the subjects pick out the known patterns quickly and easily? Will they discover relationships hitherto unknown? Does Attribute Explorer reveal relationships faster than a static plotting of one attribute against another?
Feedback on these questions is welcome. If you would like to download Attribute Explorer (see Resources) and conduct the user tests described (or any other tests), we would be most interested in your findings.
As mentioned above, the Attribute Explorer applies a ranking based on the number of attribute constraints failed, and then applies a color coding to the attribute bar charts to convey this ranking. The ranking algorithm is not very subtle: It does not take into account the degree of failure for each attribute, and it does not allow for a relative ranking between attributes. However, these enhancements could be easily incorporated.
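One way such an enhancement might look is sketched below. It is a hypothetical scheme, not the prototype's algorithm: each failed constraint contributes a penalty proportional to how far the value falls short of the minimum, normalised by the attribute's range in the data and multiplied by a per-attribute weight. The constraint values, the weights, and the name `weighted_penalty` are all assumptions, and here a value equal to the minimum is treated as passing.

```python
# Two attributes from the sample table in this column.
CARS = [
    {"city_mpg": 46, "horsepower": 55},
    {"city_mpg": 33, "horsepower": 73},
    {"city_mpg": 17, "horsepower": 255},
    {"city_mpg": 31, "horsepower": 63},
    {"city_mpg": 39, "horsepower": 70},
]

CONSTRAINTS = {"city_mpg": 35, "horsepower": 70}  # assumed minimum values
WEIGHTS = {"city_mpg": 1.0, "horsepower": 2.0}    # assumed relative importance

# Range of each attribute in the data, used to put shortfalls on a common scale.
RANGES = {
    attr: (max(car[attr] for car in CARS) - min(car[attr] for car in CARS)) or 1
    for attr in CONSTRAINTS
}


def weighted_penalty(car, constraints, weights, ranges):
    """Sum of weighted, range-normalised shortfalls; 0.0 means no failures."""
    penalty = 0.0
    for attr, minimum in constraints.items():
        shortfall = max(0, minimum - car[attr])
        penalty += weights[attr] * shortfall / ranges[attr]
    return penalty


penalties = [weighted_penalty(car, CONSTRAINTS, WEIGHTS, RANGES) for car in CARS]

# Indices of the cars, best (lowest penalty) first.
ranking = sorted(range(len(CARS)), key=lambda i: penalties[i])
print(ranking)
```

With a simple failure count, the first three cars in the list would tie at one failure each; the weighted penalty separates the car that misses by 2 MPG from the one that misses by 18.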
The user interface is necessarily generic since Attribute Explorer is designed without prior knowledge of the data that it is to analyze. However, for a specific application, an enhanced ranking algorithm might be used to good effect when coupled with a bespoke user interface designed with one or two tasks in mind.
If we take a car showroom as an example, the attributes of cars are many, yet the number of attributes that affect customers' buying decisions is perhaps smaller. Consider a kiosk application in the showroom, or a Web application. Underlying the application is a car database much like that used in the examples above, but with several more attributes included.
Figure 4. Car kiosk initial screen

The application supports two tasks: a filtering and ranking of the available cars based on the customer's criteria, and then a detailed examination of the top ranked cars. Figure 4 shows the initial kiosk screen. To the left of the screen is a selection mechanism for the set of car attributes most commonly used by customers when making a buying decision. A prompt invites customers to select one or more attributes on which they wish to base their car selection.
The Car Suitability list and Specification detail areas are muted because no criteria have yet been established with which to present a list of ranked vehicles.
Figure 5. Car kiosk with one attribute selected

In Figure 5 we see that the customer has selected Cost as the first attribute of interest. A mechanism for constraining the acceptable cost range for cars is presented, above which is a bar chart showing the cost distribution for available cars. This chart has less prominence than in the generic Attribute Explorer interface. The intent is that sophisticated users will understand its significance and find it of use, but that less sophisticated users will not be intimidated.
The constraining mechanism is initialised without any constraint, so all cars are deemed to fall within an acceptable cost range. The red range marker extends the full width of the available area. Above this is a means for the user to specify a Low, Medium, or High importance for this attribute, with Medium being the default.
Having established cost as a ranking criterion, the Car Suitability list is no longer muted, and a list of cars in ascending sequence by cost is shown.
Figure 6. Car kiosk with constraints applied

In Figure 6 the customer has selected Doors and Insurance as two further attributes, and has begun to apply constraints to these attributes. As this happens, the sequence of available cars is altered in real time to reflect the changed ranking. Against each car is an indication of which attributes have failed. In this way, the user is provided with cues to suggest which attribute constraints might be relaxed in order to bring certain cars back into contention.
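The kiosk behaviour described above might be sketched as follows. Everything specific here is hypothetical: the car names and values, the acceptable ranges, the mapping of Low, Medium, and High importance to the numbers 1, 2, and 3, and the use of cost as a tie-breaker are assumptions made for illustration, not details of the mock-up.

```python
# Assumed numeric weights for the three importance settings.
IMPORTANCE = {"Low": 1, "Medium": 2, "High": 3}

# Hypothetical showroom data; "insurance" stands for an insurance group number.
CARS = [
    {"name": "Car A", "cost": 9000, "doors": 3, "insurance": 4},
    {"name": "Car B", "cost": 12000, "doors": 5, "insurance": 7},
    {"name": "Car C", "cost": 15000, "doors": 5, "insurance": 5},
    {"name": "Car D", "cost": 22000, "doors": 2, "insurance": 12},
]

# Attributes the customer has selected, each with an acceptable (low, high)
# range and an importance setting.
CRITERIA = {
    "cost": {"range": (0, 16000), "importance": "High"},
    "doors": {"range": (4, 5), "importance": "Medium"},
    "insurance": {"range": (1, 6), "importance": "Low"},
}


def failed_attributes(car, criteria):
    """Names of the selected attributes whose value lies outside its range."""
    return [
        attr
        for attr, rule in criteria.items()
        if not rule["range"][0] <= car[attr] <= rule["range"][1]
    ]


def suitability_list(cars, criteria):
    """Cars ordered by total importance of failed attributes, then by cost."""
    rows = []
    for car in cars:
        failed = failed_attributes(car, criteria)
        penalty = sum(IMPORTANCE[criteria[attr]["importance"]] for attr in failed)
        rows.append((penalty, car["cost"], car["name"], failed))
    rows.sort(key=lambda row: (row[0], row[1]))
    return [(name, failed) for _, _, name, failed in rows]


suitability = suitability_list(CARS, CRITERIA)
for name, failed in suitability:
    print(name, "fails:", ", ".join(failed) if failed else "nothing")
```

Re-running `suitability_list` whenever a range or importance setting changes gives the real-time reordering, and the list of failed attributes per car supplies the cue for which constraint to relax.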
All of the above interactions support the task of filtering and ranking the available cars. Having arrived at a suitably ranked set of cars, the customer can select individual cars for closer analysis, at which point the detailed information typically available in a brochure is displayed. This is illustrated in Figure 7.
Figure 7. Car kiosk with car selected

This mock-up illustrates the use of Attribute Explorer for car selection, with a user interface geared to two specific tasks. Different designs aimed at different product sets could use the same underlying technology to help users home in on the ideal product for them in a way that encourages exploration and experimentation.
We have introduced the Attribute Explorer and contrasted it with more traditional data exploration techniques. We have summarised its benefits to include:
- An attribute-based display showing the distribution of objects by attribute, and allowing the application of a clearly indicated constraint without eliminating from consideration objects that fail the constraint.
- Multiple concurrent attribute displays, each display being able to show the cumulative effect of all applied attribute constraints.
- The ability to discern attribute relationships through very rapid feedback in response to the modification of attribute constraints.
We have then considered an application for an enhanced Attribute Explorer analysis engine, presenting a bespoke user interface designed for a car showroom or car Web site.
The power of Attribute Explorer lies in its simplicity and flexibility. The concepts are readily understood, and the presentation lends itself to learning by experimentation. There is no penalty associated with experimentation, and no feedback delay as alternative constraints are applied.
It can work as a generalised data analysis tool, or as the engine behind applications tailored for a specific domain. If you would like to apply Attribute Explorer to your data, download the prototype and try it out.
| Name | Size | Download method |
|---|---|---|
| us-atex-AEApplet.zip | 111KB | HTTP |
Learn
- You can purchase Robert Spence's book Information Visualization from Amazon.com, amongst other places.
- Read about DB2 Data Warehouse Edition, the replacement product for IBM DB2 Intelligent Miner for Data, for which the Attribute Explorer prototype can act as a plug-in viewer.
- Visit the IBM Ease of Use site for the latest in design guidelines, from designing for the Web to out-of-box-experience.
- The Usability Professionals' Association Web site offers a variety of approaches for getting users involved early in the design process.
Get products and technologies
- You can download the Attribute Explorer prototype from alphaWorks.

Andy Smith is a Software Engineer with the IBM Ease of Use team in Warwick, UK. He has worked on a number of projects exploring the ease-of-use aspects of data visualization. You can contact him at andy_smith@uk.ibm.com. The Auto Kiosk graphics were produced by Jaclyn Neal.