How can we visualize the distribution of 33 categories over the whole of the USA? The webpage meetup.com hosts diferent types of events all over the world. For a class we wondered if it was posible to observe diferences in the interests between the people of diferent states. The main dificulty was that there were 33 categories to compare among 50 states, a simple barchart wouldn’t be enough as 33*50 columns is more than any one person can absorb.
We compromised. Instead of showing every minor diference among states we chose to observe tendencies. Then, a choropleth would be created to reflect these tendencies and add geographical information to the mix.
We show tendencies by dividing each state into 100 colored cells we call percentage cells. The amount of cells that are of a certain color represents the percentage of the interest of the inhabitants of the state towards the category represented by the color. By randomizing the positioning of the cels inside the state we obtain a puzle that broadly shows the main categories present in each one of the states.
We substitute the states geographical shape for one that removes surface area biases in the visualization as this only adds noise to what we want to observe. However, we maintain the general positions of the states in respect to each other so as to facilitate the detection of geographical related relations.
Finally, since the randomly positioned cells are difficult to count in order to find the exact percentage, we let the user click a state to get an easier to read visualization (a barchart) of the specific area.
We hoped that the resulting visualization would enable us to detect patterns between the states by comparing their tendencies.
- We found it easier to find outliers and anomalies simply by looking for states that have a diferent “average” color.
- We found it was easy to find geographical tendencies. For instance, regions of the USA where there was a recognizably larger interest in Tech (shown in blue).
- It’s generally easy to compare two states interest in a category by comparing the amount of cells of the desired color, and similar percentages can be distinguished after clicking each state to get the broken down percentages in the barchart.
We didnt completely finish the project, people looking to build upon or improve should consider the following areas:
- The code needs tidying up (D3.js is declaratively writen and can be dificult to read in a big project like this).
- Percentages are not calculated but proxied by sampling 100 events per state and asuming the categories of these are representative of the interests per state.
- There’s much need for a way to select some subset of categories to compare because as of now those most common tend to hide the rest. (++ im 99% sure no one can distinguish 33 colors).
- Responsiveness: the visualization looks horrible on smaller screens.
- Performance: the transitions lag on slower computers because calculations are repeated for each of many cells (100 per state).