We will be posting a series based on the data displayed in O’Reilly’s 2016 Data Science Salary Survey. Using the Data Chefs Revision Organizer as a guide, we will rethink and revise some of the visualizations featured in the report.
I recently read O’Reilly’s 2016 Data Science Salary Survey (by John King & Roger Magoulas). People working in the field of data science answered questions about their job titles, age, salaries, tools, tasks, etc., and the report summarizes the results. I thought the report offered a pretty fascinating overview of the data science industry, and it is definitely worth a read.
However, I was a little thrown off by the choices the authors made in visualizing the data. Here is a selection of representative pages:
As you can see, King & Magoulas opted to use a series of blue circles to represent the data throughout the report. While the circles provide a common visual theme, I don’t think they best represent this particular data.
One example is the visualization for tasks, the work activities in which survey respondents reported major engagement:
The values are displayed as circle areas, sorted from highest to lowest, starting from the bottom left and curving clockwise around to the bottom middle. The relative sizes of the circle areas seem to be accurate, but notice the positioning of the labels on the circles. From 69% down through 36%, the data and category labels are consistently positioned to the right of each circle. From 32% on down, the label placement gets inconsistent: sometimes left, sometimes right, depending on the available space.
The same space constraints also force the authors to alter the positioning of the circles themselves. To fit the long category text, the bottom-right side of the arc had to be squashed, which gives the visualization an odd, bean-like shape.
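For anyone who wants to check the point about circle areas, here is a minimal Python sketch. It assumes the report scales each circle's area (not its diameter) to the value, which is how the chart appears to work. The function name `radius_for_area` is made up for illustration, and the three percentages are the ones quoted above.

```python
import math

def radius_for_area(value, scale=1.0):
    """Radius of a circle whose area equals scale * value (hypothetical helper)."""
    return math.sqrt(scale * value / math.pi)

# Percentages quoted in the text above; the rest of the chart is not reproduced here.
values = [69, 36, 32]

largest = values[0]
for v in values:
    area_ratio = v / largest
    # With area scaling, widths shrink with the square root of the value.
    diameter_ratio = radius_for_area(v) / radius_for_area(largest)
    print(f"{v}%: area ratio {area_ratio:.2f}, diameter ratio {diameter_ratio:.2f}")
```

Under that assumption, the 36% circle is still about 0.72 times as wide as the 69% circle even though its value is only about half as large. That mismatch between width and value is part of why circle sizes are harder to compare by eye than lengths.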
The revision I’ve proposed, a horizontal bar chart, is a lot cleaner. The data labels are consistent: categories to the left of the bars, values to the right. The relative sizes of the bars are also easy to compare at a glance, which is not really the case with the circles.
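To make the layout concrete, here is a rough text-only sketch of that kind of bar chart in Python. It is not the code behind my revision, just an illustration of the idea: categories on the left, a bar whose length is proportional to the value, and the value on the right. The function `text_bar_chart` and the task labels are placeholders I made up; only the three percentages come from the report as quoted above.

```python
def text_bar_chart(rows, width=40):
    """Render (label, percent) pairs as horizontal text bars, largest first."""
    rows = sorted(rows, key=lambda r: r[1], reverse=True)
    label_width = max(len(label) for label, _ in rows)
    lines = []
    for label, pct in rows:
        # Bar length is linear in the value, so lengths compare directly.
        bar = "#" * round(width * pct / 100)
        lines.append(f"{label.rjust(label_width)} | {bar} {pct}%")
    return lines

# Placeholder labels; the real category names are in the report.
rows = [
    ("Task B (placeholder)", 36),
    ("Task A (placeholder)", 69),
    ("Task C (placeholder)", 32),
]

for line in text_bar_chart(rows):
    print(line)
```

Because every label sits in the same column and every value sits at the end of its bar, there is no need to move labels around to fit the available space.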
This bar chart may lack the novelty or the visual pop of the original, but I think it’s more appropriate for the data, and far easier to understand.
What do you think?