Visualization

Visual theories applied to data

Psychology

Gestalt theory explains how the brain reliably groups and divides elements based on their common visual features. These perception principles for grouping elements can help you construct a data visualization based on human cognition.

Illustration of dots representing the similarity concept

Similarity

Elements are perceived as groups depending on the visual characteristics they share—like color or value.

Illustration of dots representing the proximity concept

Proximity

Proximity is a stronger cue than similarity: the human eye perceives elements as related based on how close they are to one another.

Illustration of dots representing the enclosure concept

Enclosure

Introduced by Palmer in 1992, the common regions principle shows how enclosing elements in other elements helps people see individual items as distinct groups.

Visual features to categorize elements

Building upon these principles, consider Jacques Bertin’s “Visual Variables”, a series of dimensions used to visually represent features and to categorize elements. Two-dimensional variables such as size, value, texture, color, and orientation allow your eye to distinguish and combine objects in complex environments.

Visual representation from Jacques Bertin's Semiology of Graphics

Visual features to quantify elements

Visual variables and chart typologies have been studied to understand how the human brain processes and navigates them. A 1984 study by William Cleveland and Robert McGill ranks the most common visual features two shapes can have by how easily the human brain detects differences between them. They ordered the following visual features from the most to the least recognizable:

Illustration representing points on a common scale

Position on common scale

Because we share a common spatial reference system, position is the easiest feature to recognize and evaluate about elements in space. In bar charts and scatter plots, for example, elements are evaluated with the smallest amount of effort.

Illustration representing points on a non-aligned scale

Position on non-aligned scales

Separate scales, even if not aligned, are easy to evaluate and compare. Repeating the same axis in a different position does not reduce the visualization’s readability.

Three vertical lines with staggered lengths

Length

Bar charts are easy to understand because the human brain recognizes, compares, and evaluates lengths more easily than many other dimensions. Length interpretation is affected, but not compromised, when objects are not aligned.

Three short lines facing in different directions

Direction

The human eye recognizes the direction of lines with ease, making this aspect the strength of line and trend charts. In this case, implied continuity is perfectly suited to represent temporal data and changes over time.

Three lines starting from the same point moving out in different angles

Angle

Angles provide a sense of proportion and help people make comparisons. Although angles are considered harder to read than lengths, studies show there are no significant differences, for example, when comparing pie charts and stacked bar charts.

Three circles of different sizes representing area

Area

Harder than length to understand, area is the amount of space an element occupies. It requires more time to process and interpret. When you add this dimension, always scale values to the surface of the shapes you are using.

Two cubes with different size volumes

Volume

Significantly more complicated, volumes imply the use of 3D, which, when displayed in a 2D environment, prevents users from accurately reading the numbers they represent. As with areas, scale values to the volume of the objects you are using.
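The scaling advice for area and volume can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the function names and the idea of sizing everything relative to a maximum value are assumptions made for the example. The point is that the value should drive the area (or volume), so the radius grows with the square root of the value and a cube's edge with its cube root.

```python
import math


def radius_for_value(value: float, max_value: float, max_radius: float) -> float:
    """Radius of a circle whose AREA is proportional to `value`.

    Hypothetical helper: `max_value` is drawn with `max_radius`.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Area grows with radius squared, so take the square root of the ratio.
    return max_radius * math.sqrt(value / max_value)


def edge_for_value(value: float, max_value: float, max_edge: float) -> float:
    """Edge of a cube whose VOLUME is proportional to `value`."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Volume grows with edge cubed, so take the cube root of the ratio.
    return max_edge * (value / max_value) ** (1 / 3)


# A value four times smaller gets half the radius, i.e. a quarter of the area.
print(radius_for_value(25, 100, 40))
# A value eight times smaller gets roughly half the edge.
print(edge_for_value(1, 8, 10))
```

Scaling the radius or edge linearly with the value instead would exaggerate differences, because the visible surface or volume would grow much faster than the data.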

Three lines with different degrees of curvature

Curvature

Curved elements are harder to recognize and evaluate because they lack a linear length and common landmarks.

Two squares with different patterned textures inside of them

Shading

Varying shades show differentiation, but they are extremely hard to recognize and compare. Values are easily misread, especially when they are close to one another.

A group of lines stacked on top of each other with different opacities in order of lightest to darkest

Color saturation

Mostly used to show a scale of values, color saturation is hard to evaluate precisely.

Keep it simple

Avoid decoration

Eliminate everything that is not necessary to communicate as clearly as possible. A widely used practice for evaluating whether content is meaningful or decorative is Edward Tufte’s data-ink ratio, defined as the “proportion of a graphic’s ink devoted to the non-redundant display of data information.” In practice: erase everything you can without losing critical information.

Rely on conventions and metaphors when appropriate

Take advantage of concepts that don’t need to be scrutinized to be understood. Know cultural and industry-specific conventions to make your message immediate and straightforward to your audience.

Illustration of conventional pairings: Bad and Good, Cold and Warm, Low value and High value

Be unambiguous

Give each element a single purpose and be consistent when you pair dimensions of data with visual variables.

Make it visually and structurally consistent

To facilitate users’ confidence with your data visualization:

  • Display content in the same way among sections.
  • Define rules for titles, subtitles, visual fields, representation, and sources.
  • Establish a hierarchy of information between the data representation and filters, handlers, or legends.
  • Create a predictable and familiar journey through the application’s environments.

Text

Good use of typography in charts is essential: numbers and labels give context and reference points to understanding data. However, data carries the main message and typography should never distract or dominate. Typography purely helps in presenting the information.

Title, labels, legend

Establish a clear hierarchy between all typographic elements, such as titles, descriptions, labels, legends, and tooltips. Create a solid and thoughtful structure:

  • Use bold type only to emphasize something.
  • Use type rotation only if strictly necessary.
  • Keep labels concise.
  • Use a legend only when space is tight; otherwise, consider labeling elements directly.
  • Calibrate label behavior and size to avoid overlapping.

Table

Tables are the textual counterpart of a data visualization, and they’re often used to solve accessibility issues.

  • Never left-align whole numbers; always right-align them.
  • Always align numbers on the decimal point.
  • Never order entries randomly; they must be sorted according to one field.
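The three table rules above can be sketched with plain string formatting. The rows below are made-up sample data and `format_rows` is a hypothetical helper; the sketch assumes a fixed number of decimals, which is what makes the decimal points line up when the numbers are right-aligned.

```python
def format_rows(rows: list[tuple[str, float]], decimals: int = 2) -> list[str]:
    """Left-align labels and right-align numbers with a fixed number of decimals.

    With the same number of decimals on every row, right alignment also
    aligns the decimal points.
    """
    label_width = max(len(label) for label, _ in rows)
    numbers = [f"{value:,.{decimals}f}" for _, value in rows]
    number_width = max(len(number) for number in numbers)
    return [
        f"{label:<{label_width}}  {number:>{number_width}}"
        for (label, _), number in zip(rows, numbers)
    ]


# Hypothetical sample data, not taken from the guidelines.
rows = [("Product B", 1250.5), ("Product A", 98.25), ("Product C", 7.0)]

# Sort by exactly one field (here: value, descending) rather than leaving
# the entries in arbitrary order.
rows_sorted = sorted(rows, key=lambda row: row[1], reverse=True)

for line in format_rows(rows_sorted):
    print(line)
```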

Colors

As one of the most powerful sensory cues, color can be a highly influential visual property. When using color, you must first consider how color is communicating meaning. Choose colors strategically:

  • Never use different colors to represent the same kind of data.
  • Use contrasting colors to show contrasting data, and complementary colors for complementary data.
  • Use similar colors for groups of data.

Color can be used to label and categorize elements, to measure quantities, to show intervals, and to highlight key items.

Categorize

Color schemes for nominal data typically use different hues to identify discrete categories. Select distinctive colors from a limited range of lightness grades to avoid implying levels of magnitude. It’s important to ensure that there is enough contrast between the colors so that they are easily distinguishable.

Do: Chart where color contrast is at the correct level (legend: 18-25 yo, 26-35 yo, 40-60 yo, 60+)

Don't: Chart where color contrast is not at the correct level (legend: 18-25 yo, 26-35 yo, 40-60 yo, 60+)

Sort

Data that progresses from low to high (rather than identifying categories) can be communicated with a sequential color scheme. Use light colors for low data values and dark colors for high data values. These schemes are color blind-safe. Single-hue schemes can be created using grades of a single color.

Data that progresses outward from a middle value can be represented with a diverging scheme. Use light colors for the middle data value and dark colors for the end values. These schemes need to be specifically tested for color blindness accessibility. Diverging schemes should be multi-hued, blending between three colors, where the middle is between the two extremes.
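As a rough sketch of the two schemes described above, the functions below map a data value to a color: light to dark for a sequential scheme, and dark to light to dark across three colors for a diverging one. The specific RGB values and function names are illustrative assumptions, and plain RGB interpolation is used only for brevity; production palettes are usually built in a perceptually uniform color space and tested for color blindness.

```python
RGB = tuple[int, int, int]


def _clamp01(t: float) -> float:
    return min(1.0, max(0.0, t))


def lerp_color(start: RGB, end: RGB, t: float) -> RGB:
    """Linearly interpolate between two RGB colors, with t in [0, 1]."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def sequential_color(value: float, low: float, high: float,
                     light: RGB = (222, 235, 247),
                     dark: RGB = (8, 48, 107)) -> RGB:
    """Single-hue scheme: light for low values, dark for high values."""
    t = _clamp01((value - low) / (high - low))
    return lerp_color(light, dark, t)


def diverging_color(value: float, low: float, mid: float, high: float,
                    low_color: RGB = (33, 102, 172),
                    mid_color: RGB = (247, 247, 247),
                    high_color: RGB = (178, 24, 43)) -> RGB:
    """Three-color scheme: light at the middle value, dark at both ends."""
    if value <= mid:
        t = _clamp01((value - low) / (mid - low))
        return lerp_color(low_color, mid_color, t)
    t = _clamp01((value - mid) / (high - mid))
    return lerp_color(mid_color, high_color, t)


print(sequential_color(5, 0, 10))      # somewhere between light and dark blue
print(diverging_color(0, -8, 0, 8))    # the light middle color
print(diverging_color(-8, -8, 0, 8))   # the dark "low" end color
```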

Diverging

Evaluation of manager by department

Stacked bar chart showing a divergent set of colors
  • Very positive
  • Positive
  • Neutral
  • Negative
  • Very negative

Uni-directional

Evaluation of manager by department

Stacked bar chart showing a convergent set of colors
  • Very positive
  • Positive
  • Neutral
  • Negative
  • Very negative

A diverging palette is especially effective when the two opposing values have a positive/negative connotation.

Show intervals and ratio

Breaking up quantitative data values into discrete classifications or bins makes them easier to read than using a continuous gradient scale.
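Binning can be sketched as a lookup against a sorted list of class boundaries. The boundaries below assume a scale running from -8 to +8 in steps of 2, chosen purely for illustration, and `bin_index` is a hypothetical helper name.

```python
from bisect import bisect_right

# Inner class boundaries for an assumed scale from -8 to +8 in steps of 2.
# Seven boundaries produce eight classes: [-8, -6), [-6, -4), ..., [6, 8].
EDGES = [-6, -4, -2, 0, 2, 4, 6]


def bin_index(value: float, edges: list[float] = EDGES) -> int:
    """Return the index of the class a value falls into.

    Values below the first boundary land in class 0, values at or above the
    last boundary land in the final class. A value equal to a boundary
    belongs to the class that starts at that boundary.
    """
    return bisect_right(edges, value)


for temperature in (-7.2, -6, 1.3, 7.5):
    print(temperature, "-> class", bin_index(temperature))
```

Each class index can then be paired with one discrete color, so the reader compares a handful of distinct steps instead of estimating a position on a continuous gradient.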


Temperature variations across the USA

Illustration of a map with a convergent color palette (legend from -8 to +8 in steps of 2)

Ok

Using a uni-directional palette to display positive and negative temperatures is technically correct, but is it the most effective way to communicate the data? Consider the context of the data when attempting to show its meaning.

Temperature variations across the USA

Illustration of a map with a divergent color palette (legend from -8 to +8 in steps of 2)

Best

Switching to a diverging palette, the message stands out clearly: the reader can immediately see if there are adjacent areas with opposite temperature behaviours.

Temperature variations across the USA

Illustration of a map with a rainbow color palette (legend from -8 to +8 in steps of 2)

No

A rainbow palette may be colorful, but it is not effective: a purple area next to a blue area suggests to the reader that the blue area is colder. But, looking at the legend, it’s exactly the opposite.

Use color to tell a story

The tips above can help you improve the legibility of your data, but color can also be used to emphasize the story you want to tell. Color can give elements a hierarchy, emphasizing multiple levels of information within the data visualization.


Bar chart showing importance of color, one bar is highlighted

Using a colored element among gray elements makes the focus of the chart clear to the reader.
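The gray-plus-accent approach can be sketched as a small color-assignment helper. The hex values, labels, and function name are illustrative assumptions; the resulting list could be passed to whatever charting library draws the bars.

```python
def highlight_colors(labels: list[str], focus: str,
                     highlight: str = "#0070d2",
                     muted: str = "#c9c7c5") -> list[str]:
    """Return one color per label: an accent for the focus item, gray for the rest."""
    return [highlight if label == focus else muted for label in labels]


# Hypothetical categories; only "Product B" should draw the reader's eye.
labels = ["Product A", "Product B", "Product C"]
print(highlight_colors(labels, focus="Product B"))
```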

Revenues

Set of bar charts show revenues, three bar charts each containing four bars

Product A

Product B

Product C

In some cases it can be useful to make a more subtle distinction between a specific element and others of the same color. In the example above, varying the brightness or saturation of the same tint is a valid way to make the city of New York stand out.

Scatter plot example chart showing different data points

above target (var %)

above target (absolute revenue)

below target (var %)

below target (absolute revenue)

Use color to highlight elements that can vary through interaction. In this example, colors change if the target changes, creating new categories of objects.

Shading with black

Visualizations can be rich and “colorful” even in situations where color can’t be used. Different shades of black, together with patterns and markers, are a valid substitute for color. Maintain contrast between light and dark shades while calibrating pattern density and type to create hierarchy and guide focus.

Treemap example in color

Treemap example in gray scale

Patterns and black shades can be used to display and quantify two different categories.

Manage tricky situations

Big numbers, small changes

When data points are large numbers or are close in value, variations in element size (bar height, bubble size, or line segments) can be too similar, and the differences indistinguishable.

In these cases, consider displaying only the changes (instead of absolute values), possibly as percentages.

Data sample

Year                         2013  2014  2015  2016
Revenue ($M)                  201   205   212   220
Variation (previous year)       0    +4    +7    +8
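The variation row can be derived from the revenue row with a short calculation. This sketch uses the sample revenue figures (201, 205, 212, 220); the function names are assumptions, and the first year is given a variation of 0 because it has no previous year, matching the sample.

```python
years = [2013, 2014, 2015, 2016]
revenue = [201, 205, 212, 220]  # $M, from the data sample


def variations(values: list[float]) -> list[float]:
    """Absolute change from the previous entry; the first entry is 0."""
    return [0] + [curr - prev for prev, curr in zip(values, values[1:])]


def percent_variations(values: list[float]) -> list[float]:
    """Percentage change from the previous entry; the first entry is 0.0."""
    return [0.0] + [100 * (curr - prev) / prev
                    for prev, curr in zip(values, values[1:])]


print(variations(revenue))  # [0, 4, 7, 8]
print([round(p, 1) for p in percent_variations(revenue)])
```

Plotting these small numbers instead of the raw revenues gives bars whose heights differ visibly, which is the effect the zoomed-in chart aims for.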

Revenues

Bar chart showing a selected section representing revenues

Revenues variation

Bar chart showing a zoomed in selected section representing revenues

Comparable scales

Sometimes, we need to compare entities that have strongly different scales. For example, the revenues over time of two products with different levels of diffusion, or the population growth of two countries of different sizes. In these cases, percentages can tell much more than absolute values.

Year                  2015    2016  var (%)
Product A revenues    $100    $250    +150%
Product B revenues    $500    $750     +50%
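The percentage column can be reproduced with a one-line formula. The sketch below uses the Product A and Product B figures from the table; `percent_change` is a hypothetical helper name.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from `old` to `new`, expressed as a percentage."""
    if old == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return 100 * (new - old) / old


# (2015 revenue, 2016 revenue) in dollars, from the table.
revenues = {"Product A": (100, 250), "Product B": (500, 750)}

for name, (r2015, r2016) in revenues.items():
    absolute = r2016 - r2015
    relative = percent_change(r2015, r2016)
    print(f"{name}: {absolute:+d} absolute, {relative:+.0f}%")
# Product A: +150 absolute, +150%
# Product B: +250 absolute, +50%
```

Product B gained more in absolute terms, yet Product A grew three times as fast, which is why the choice between absolute values and percentages changes the message of the chart.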

Revenues trend

Line chart showing trends in revenue

Focus on actual values

The chart is correct if we want to show the different weight of the two products, but it’s less effective in showing their performance.

Revenues trend

Line chart zoomed in showing trends in revenue

Highlight performance

Using percentages in the chart, the reader immediately understands that Product A outperformed Product B.

Use reference points

When creating a visualization, always think about the main message you want to convey. Make the main idea stand out by calibrating what’s around it.

fig. A

Four bar charts, each with four bars of different colors representing different data categories (legend: 18-25 yo, 26-35 yo, 40-60 yo, 60+)

fig. B

Two bar charts, each with four bars showing different data comparisons

Accessibility

Making data visualization accessible to everyone is crucial, especially when designing visual analytics tools to help people understand phenomena and make strategic decisions.

Here are some things to keep in mind when designing for people with visual impairments.

Low vision

For people who are blind, include text descriptions of any meaningful images that can be read by assistive technology (screen readers). A person with low vision is still able to perceive visual insights with the help of accessibility features and considerations.

  • Provide the opportunity to increase the size of elements and focus on particular areas by zooming and magnifying.
  • Follow the WCAG 2.0 Level AA guidelines for contrast — high contrast is crucial to improve legibility. Body text should have a contrast ratio of at least 4.5:1 (and large text should be at least 3:1) against its background color.
  • Include textual content to provide alternatives to visual information (especially color).
  • Use captions and always provide a table version of the visualization.
  • Allow keyboard alternatives for navigating graphical user interfaces.
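The contrast thresholds in the list above can be checked programmatically. The sketch below follows the relative luminance and contrast ratio formulas published in WCAG 2.0; the function names and the sample colors are assumptions made for the example.

```python
RGB = tuple[int, int, int]


def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.0 formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linearize(channel) for channel in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: RGB, background: RGB) -> float:
    """Ratio between 1 (identical colors) and 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: RGB, background: RGB, large_text: bool = False) -> bool:
    """Level AA: at least 4.5:1 for body text, 3:1 for large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


white = (255, 255, 255)
print(round(contrast_ratio((0, 0, 0), white), 2))  # black on white
print(passes_aa((200, 200, 200), white))           # light gray label on white
```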

Color blind

The most common forms of color blindness are deuteranomaly and deuteranopia, which together account for about 6% of men, and protanomaly and protanopia, which account for another 2%. Tritanopia is very rare and affects less than 0.001% of men.

Deuteranopia: normal color version and simulation

Protanopia: normal color version and simulation

Tritanopia: normal color version and simulation

For those affected, the two most difficult colors to distinguish are green and red. Stressing the contrast between dark and light values is a good way to make the colors recognizable. While hue and saturation have minimal effect on legibility, brightness differences are far more perceptible.

Original

HSB 111, 56, 69

HSB 0, 67, 89

HSB 108, 49, 40

HSB 0, 100, 100

Deuteranopia simulation
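To make the brightness point concrete, the sketch below converts the four HSB values listed above to RGB and compares their brightness components. It assumes, without confirmation from the figure, that the swatches form two green and red pairs in the order listed; the helper names are also assumptions.

```python
import colorsys

HSB = tuple[int, int, int]  # hue in degrees, saturation and brightness in percent


def hsb_to_rgb(h: int, s: int, b: int) -> tuple[int, int, int]:
    """Convert HSB (degrees, percent, percent) to 8-bit RGB."""
    r, g, bl = colorsys.hsv_to_rgb(h / 360, s / 100, b / 100)
    return tuple(round(channel * 255) for channel in (r, g, bl))


def brightness_gap(first: HSB, second: HSB) -> int:
    """Difference between the brightness components of two HSB colors."""
    return abs(first[2] - second[2])


# Assumed pairing of the swatches (green, red); not confirmed by the source.
pairs = [
    ((111, 56, 69), (0, 67, 89)),
    ((108, 49, 40), (0, 100, 100)),
]

for green, red in pairs:
    print(hsb_to_rgb(*green), hsb_to_rgb(*red),
          "brightness gap:", brightness_gap(green, red))
```

Under that assumed pairing, the second pair has a brightness gap three times larger than the first, which is the kind of difference that remains visible when red and green hues are hard to tell apart.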

Easily distinguishable formats (like bar charts or treemaps) or markers (as used in scatter plots or maps) can supplement color to express information. Those work perfectly, even in black and white, and reduce problems for people with color blindness.

Remember, the best way to support accessibility for all visual disabilities is to provide alternatives to visual information (especially color), using text and tabular versions of the visualization.
