The principle of proportional ink
Misleading colours in plots
Today I would like to discuss a basic rule for the design of data graphics, the principle of proportional ink.
The rule is very simple: when a shaded region is used to represent a numerical value, the area of that shaded region should be directly proportional to the corresponding value. In other words, the amount of ink used to indicate a value should be proportional to the value itself.
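As a minimal sketch of what the rule means in practice, the Python snippet below checks whether a set of shaded areas is proportional to the values they encode. The function name is_proportional_ink, the tolerance and the sample numbers are illustrative assumptions, not part of any plotting library.

```python
import math


def is_proportional_ink(values, areas, rel_tol=0.01):
    """Return True if every area/value ratio matches the first one.

    Hypothetical helper: `values` are the numbers being plotted and
    `areas` are the shaded areas used to draw them, in any consistent
    unit (for example pixels). All values must be positive.
    """
    ratios = [area / value for value, area in zip(values, areas)]
    return all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)


# Illustrative numbers only: two bars encoding the values 10 and 20.
print(is_proportional_ink([10, 20], [150, 300]))  # twice the value, twice the ink
print(is_proportional_ink([10, 20], [150, 600]))  # twice the value, four times the ink
```

The first call respects the principle and the second does not, because the second bar uses four times the ink for only twice the value.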
This rule derives from a more general principle that Edward Tufte set out in his classic book The Visual Display of Quantitative Information.
There, he argues that “The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented.” (1983, p. 56)
This principle seems simple, yet it is often violated: the amount of ink we see in a graphic is frequently not proportional to the values it represents.
Some examples
The first example is a graphic about vaccines, which has many problems beyond the ink (I discussed them at length in a previous article). The designers chose to colour men in blue and women in pink (yes, it is sexist). That could work, but all the other elements in the picture use the same pink, which is confusing. Moreover, in the pie chart the pink represents unvaccinated people, both male and female, which makes it even more confusing.
Another example comes from politics: let’s look at this picture showing the results of a survey about parties. Do you see something blue? Yes, the blue ink used to represent 20.5% is almost double the red ink used to represent 20%.
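The arithmetic behind the survey example can be made explicit. The sketch below assumes, as estimated by eye above, that the blue area is roughly twice the red one; the factor 2.0 is an approximation rather than a measurement, and the function name exaggeration is hypothetical.

```python
def exaggeration(value_a, value_b, ink_a, ink_b):
    """Compare how different two values look with how different they are.

    Returns the ink ratio divided by the value ratio. A result of 1.0
    means the ink is proportional to the values.
    """
    return (ink_a / ink_b) / (value_a / value_b)


# Survey example: 20.5% drawn with about twice the ink used for 20%.
# The ink figures (2.0 and 1.0) are rough visual estimates.
factor = exaggeration(20.5, 20.0, ink_a=2.0, ink_b=1.0)
print(round(factor, 2))  # 1.95
```

Under that assumption the two values differ by only 2.5%, while the ink differs by about 100%, so the picture overstates the gap by a factor of almost two.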
Conclusions
Many forms of data visualisation use coloured areas to represent data. To avoid misleading viewers, the sizes of the shaded areas in such graphs must be directly proportional to the magnitudes of the values being indicated. This is the principle of proportional ink. Unfortunately, it is violated frequently and in many different ways. As you look at data graphics, be on the lookout for such violations.
When you create your own visualisations, don’t cheat: design your plots with this principle in mind.