This post is part of a series based on the data displayed in O’Reilly’s 2016 Data Science Salary Survey. Using the Data Chefs Revision Organizer as a guide, we will rethink and revise some of the visualizations featured in the report.
In this post, I want to focus on the visualization for the share of survey respondents by self-reported age category:
Again, the authors used the arcing blue circle theme to depict the breakdown by age category. On the plus side, the data labels are consistently placed, all falling along the bottom-right of each value circle (or the inside of the arc), and the order is intuitive: youngest to oldest. Also, the circles appear to be sized properly by area (as opposed to diameter).
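To make the area-versus-diameter point concrete, here is a minimal sketch in Python. The function names and the numbers are mine, purely for illustration; they are not taken from the report. When circles are sized by area, the radius grows with the square root of the value, so a value one quarter the size gets half the radius. Scaling the diameter directly would make that same circle cover only one sixteenth of the area, which exaggerates the difference.

```python
import math


def radius_by_area(value, max_value, max_radius):
    """Radius that makes the circle's AREA proportional to value."""
    return max_radius * math.sqrt(value / max_value)


def radius_by_diameter(value, max_value, max_radius):
    """Radius that makes the circle's DIAMETER proportional to value.

    Shown only for contrast: this visually understates small values.
    """
    return max_radius * (value / max_value)


def area(radius):
    return math.pi * radius ** 2


# Hypothetical example: a value of 25 drawn next to a value of 100,
# where the larger circle has radius 10.
r_area = radius_by_area(25, 100, 10)      # half the radius
r_diam = radius_by_diameter(25, 100, 10)  # a quarter of the radius

print(area(r_area) / area(10))  # share of the big circle's area when sized by area
print(area(r_diam) / area(10))  # share of the big circle's area when sized by diameter
```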
Using circles is not necessarily a bad way to depict category data, but doing so has some limitations. The main drawback is that by using distinct circles, you lose the relation of each part to the whole.
For this data, I propose using a form of visualization in which the part/whole relationship is central: pie chart, donut chart, waffle chart, or stacked 100% bar chart, shown below:
The biggest downside to using these part/whole visualizations is that there isn’t a lot of room to label smaller values. For that reason, I created a legend for all the values in each graph.
And, although this isn’t a problem with the visualization itself, if you pay attention to the values in the original, you’ll see that they add up to more than 100%: 101%, to be exact. What probably happened is that more than one value was rounded up, giving the total an extra full percent. In my revisions, I changed the value for the 41-50 category from 16% to 15% so that the values would sum to 100%. This was a completely arbitrary choice because I had no access to the raw data to know exactly how they were rounded.
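The rounding effect is easy to reproduce. The sketch below uses made-up shares, not the survey's raw data (which I don't have), to show how rounding each value independently can push the total to 101%. It also shows one common way to avoid that when you do have the raw numbers: the largest-remainder method, which rounds everything down and then hands the leftover points to the values with the largest fractional parts. This is not what I did in my revisions, and the function name `round_to_100` is my own.

```python
import math


def round_to_100(shares):
    """Largest-remainder rounding.

    Assumes `shares` are percentages that sum to (approximately) 100.
    Returns whole-number percentages that sum to exactly 100.
    """
    floors = [math.floor(s) for s in shares]
    shortfall = 100 - sum(floors)
    # Indices ordered by fractional part, largest first.
    order = sorted(
        range(len(shares)),
        key=lambda i: shares[i] - floors[i],
        reverse=True,
    )
    # Give one extra point to each of the largest remainders.
    for i in order[:shortfall]:
        floors[i] += 1
    return floors


# Hypothetical raw shares (illustration only; they sum to 100).
raw = [20.7, 35.6, 15.8, 27.9]

naive = [round(s) for s in raw]
adjusted = round_to_100(raw)

print(naive, sum(naive))        # each value rounded on its own
print(adjusted, sum(adjusted))  # largest-remainder result
```

With these made-up numbers, the naive rounding totals 101 while the adjusted values total 100.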
I think any one of these would work in place of the original. Thoughts?