Hexagon Binning

A series of maps of MOOC (Massive Open Online Course) participants recently crossed my desk. These were produced by Anthony Robinson (paper: Exploring Class Discussions from a Massive Open Online Course in Cartography) and Carolyn Fish (blog: Making Maps of a MOOC). They highlight some interesting patterns among MOOC students, but I was particularly struck by the use of “hexagon bins”. Here is an example:

Sample hexbin map from Carolyn Fish’s blog. See her post for full discussion and more maps – many more sophisticated than this.

Hexagon bins (or ‘hexbins’) are a way of visualizing the density of discrete point data. Although it might be easy to find the densest regions of a traditional “scatter plot” of data points, it is difficult to make more meaningful comparisons. For example, do two areas have the same density? What about points which lie on top of each other? For a non-map scatter plot, the solution is to “bin” the data into ranges of values. The bins are then plotted using color or symbol size to show the number of points in each bin.

The same idea can be used on a map. Although square bins could be used, hexagons work better. This is because a hexagon is closer to the ideal shape, a circle, which represents the point density around its center. Circles do not tessellate, however, and the chosen shape must. Equilateral triangles and squares also tessellate, but neither approximates a circle as well as a hexagon does.
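One way to put a number on "closer to a circle" is the isoperimetric quotient, 4πA/P², which equals 1 for a circle and is smaller for less compact shapes. This measure is my choice for illustration, not something taken from the maps discussed here. A minimal sketch comparing the three regular shapes that tessellate:

```python
import math

def compactness(n_sides):
    """Isoperimetric quotient 4*pi*A / P**2 of a regular polygon.

    For a regular n-gon this simplifies to pi / (n * tan(pi / n)).
    A circle scores 1.0; lower values are less circle-like.
    """
    return math.pi / (n_sides * math.tan(math.pi / n_sides))

for name, n in [("triangle", 3), ("square", 4), ("hexagon", 6)]:
    print(f"{name:8s} {compactness(n):.3f}")
```

The hexagon scores highest of the three, at roughly 0.91 against about 0.79 for the square and 0.60 for the triangle.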

The map above uses color to indicate each bin’s count (here, the number of students). It is also possible to draw hexagons of different sizes on the same constant-sized grid. This is particularly useful if there is a natural upper limit on a bin’s count. Color and size could of course be combined to show multiple attributes, although Carolyn Fish uses different color shades to do this in her maps (see her blog for examples).
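If you do size the hexagons, one common convention for proportional symbols is to make the drawn area, rather than the edge length, proportional to the count. The sketch below assumes that convention and a known maximum count; the function names and the pointy-top orientation are hypothetical choices for illustration.

```python
import math

def scaled_size(count, max_count, cell_size):
    """Corner radius of a hexagon whose AREA is proportional to count.

    Area grows with the square of the radius, so the radius is scaled
    by the square root of the count fraction. A bin holding max_count
    points fills its whole grid cell.
    """
    return cell_size * math.sqrt(count / max_count)

def hexagon_vertices(cx, cy, size):
    """Six corners of a pointy-top hexagon centered on (cx, cy)."""
    return [
        (cx + size * math.cos(math.radians(60 * i + 30)),
         cy + size * math.sin(math.radians(60 * i + 30)))
        for i in range(6)
    ]

# A bin with a quarter of the maximum count gets half the radius.
corners = hexagon_vertices(0.0, 0.0, scaled_size(25, 100, 2.0))
```

Because the grid spacing stays fixed, the scaled hexagons never overlap their neighbours.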

A number of mapping applications and toolkits support hexagon bins, including ArcGIS and PostGIS. You could also construct and populate your own hexagon grid using a script.
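As a rough idea of what such a script involves, here is a minimal sketch that assigns already-projected (x, y) points to a pointy-top hexagon grid using axial coordinates and cube rounding. This is a widely used approach, not necessarily how the maps above were made; the function names and the sample points are made up for illustration.

```python
import math
from collections import Counter

def hex_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon."""
    x, z = q, r
    y = -x - z
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    # Recompute the component with the largest rounding error so that
    # the cube coordinates still sum to zero.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz

def point_to_hex(x, y, size):
    """Axial (q, r) of the hexagon containing a projected point.

    size is the distance from a hexagon's center to its corners.
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return hex_round(q, r)

def hex_center(q, r, size):
    """Projected (x, y) of the center of hexagon (q, r)."""
    return size * math.sqrt(3) * (q + r / 2), size * 1.5 * r

def hexbin(points, size):
    """Count the points falling in each hexagon."""
    return Counter(point_to_hex(x, y, size) for x, y in points)

# Made-up sample points in projected map units.
points = [(0.1, 0.1), (-0.2, 0.3), (1.8, 0.1), (5.0, 5.0)]
counts = hexbin(points, size=1.0)
for (q, r), n in sorted(counts.items()):
    print((q, r), hex_center(q, r, 1.0), n)
```

Only hexagons that contain at least one point appear in the result, so empty cells need to be generated separately if you want to draw them.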

One important thing to bear in mind is that you should only use hexagon bins with equal-area map projections. This is because equal-area projections are the only ones which preserve data density. You should be using equal-area projections for virtually all statistical maps anyway. Hexagon bins are also pointless if the bins do not all cover the same area.
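To see why the projection matters, the sketch below compares the true area of a 10° by 10° cell on a sphere with its area on two projected maps: the Lambert cylindrical equal-area projection and the plate carrée (plain longitude/latitude). Both projections are chosen purely for illustration, and a spherical Earth is assumed.

```python
import math

R = 6371.0  # mean Earth radius in km (spherical approximation)

def sphere_cell_area(lat1, lat2, dlon):
    """True area of a latitude/longitude cell on the sphere (degrees in)."""
    band = math.sin(math.radians(lat2)) - math.sin(math.radians(lat1))
    return R ** 2 * math.radians(dlon) * band

def equal_area_xy(lon, lat):
    """Lambert cylindrical equal-area projection."""
    return R * math.radians(lon), R * math.sin(math.radians(lat))

def plate_carree_xy(lon, lat):
    """Plate carree: longitude and latitude used directly as x and y."""
    return R * math.radians(lon), R * math.radians(lat)

def projected_cell_area(project, lat1, lat2, dlon):
    """Area of the cell on the map; both projections map it to a rectangle."""
    x1, y1 = project(0.0, lat1)
    x2, y2 = project(dlon, lat2)
    return (x2 - x1) * (y2 - y1)

for lat in (0, 30, 60):
    true_area = sphere_cell_area(lat, lat + 10, 10)
    ea = projected_cell_area(equal_area_xy, lat, lat + 10, 10) / true_area
    pc = projected_cell_area(plate_carree_xy, lat, lat + 10, 10) / true_area
    print(f"lat {lat:2d}-{lat + 10:2d}: equal-area x{ea:.2f}, plate carree x{pc:.2f}")
```

The equal-area ratio stays at 1 for every band, while the plate carrée inflates areas more and more towards the poles, so equal-sized bins drawn on it would cover less and less ground.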

You could apply the binning before or after the projection. The above example applies the binning before the projection, which is why many of the hexagons have distorted shapes, although their areas are all identical. Extended over the entire Earth, such a grid produces thin, stretched hexagons around the Poles. This is deemed acceptable here because there is no data in the polar regions.

The alternative is to apply the binning after the projection. This results in a grid of regular hexagons, all the same shape and size. At first glance, this is probably easier to interpret, but the underlying map regions will appear distorted. For example, you could project the regular grid back onto a sphere, and find that many of the hexagons (typically those near the edges of the map) become distorted.

In conclusion, “hexagon bin” maps are a good way to display discrete point data. However, as is often the case, compromises may be required for global maps.
