Saturday 19 November 2011

Large, complex thematic datasets and web maps

The Guardian recently published an interactive web map (built by ITO World) of the deaths and injuries from road traffic accidents in Great Britain between 2000 and 2010. That's nearly 33,000 people killed and 3 million injured, each represented as a separate point on the map.

Viewing the overall picture at zoom level 4 (a scale of about 1:36m) we see the all too familiar mass of points on a web map with a legend that doesn't relate to the map we're seeing. Points overlap and coalesce. It shows very little other than the fact that there's an awful lot of death and injury. At this scale the data need aggregating in some sensible way. For instance, cluster analysis would spatially summarize the data and could be mapped using an isopleth technique for a more suitable small scale map. At any of the lower zoom levels (smaller-scale) the map is less useful still.
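As a rough illustration of what aggregating the points might look like, here is a minimal Python sketch. It uses plain grid binning, which is a much simpler stand-in for the cluster analysis and isopleth mapping suggested above, not an implementation of either. The function name `bin_points`, the cell size and the sample coordinates are all hypothetical and are not taken from the Guardian/ITO World data.

```python
import math
from collections import Counter


def bin_points(points, cell_deg):
    """Count points per square grid cell of side `cell_deg` degrees.

    `points` is an iterable of (lon, lat) pairs. The result maps a
    (column, row) cell index to the number of points that fall in it.
    """
    counts = Counter()
    for lon, lat in points:
        col = math.floor(lon / cell_deg)
        row = math.floor(lat / cell_deg)
        counts[(col, row)] += 1
    return counts


# Hypothetical incident locations (lon, lat), invented for illustration.
sample = [
    (-0.12, 51.50), (-0.13, 51.51), (-0.10, 51.52),  # three close together
    (-2.24, 53.48), (-2.25, 53.47),                  # two close together
]

cells = bin_points(sample, cell_deg=0.5)
for cell, n in sorted(cells.items()):
    print(cell, n)
```

Each cell count could then be symbolized as a single graduated or shaded symbol at small scales, in place of thousands of overlapping dots. A density surface or proper cluster analysis would give a smoother result, but the principle is the same: summarize first, then map.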

At zoom level 9 (about 1:1m) we start to see a little more clearly that the dots have different colours but they are all still the same shape. If we look a little more carefully in the background we can just make out some lighter symbols. What are these? Are they part of the basemap or part of the data? It's worth mentioning here that the underlying basemap labels are obscured by the dots. If they cannot be placed on top of the data then why have them at all?

At zoom level 11 (about 1:280,000) we see some symbols appear that resemble those in the legend but they are illegible. There are a lot of background symbols though...and with all that overlapping transparency the colours on the basemap are heavily compromised. Here then, we start to see that a neutral basemap in a single hue (e.g. light grey) containing very little detail would provide a more uniform background and allow the data to be seen more clearly rather than melt into the background.

It's not until zoom level 15 (about 1:18,000) that we start to see the symbols as they appear in the legend, but is it any clearer? There are 12 variations of the symbols and they all overlap. Actually, that's not strictly true: yellow circles and triangles seem to be the lowest layer, and blue circles and triangles and pink triangles the highest, so the latter take visual priority even though they represent categorical data.
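For readers wondering where the scale figures quoted for zoom levels 4, 9, 11 and 15 come from, the short Python sketch below reproduces them approximately. It rests on assumptions not stated in the post: the standard Web Mercator tiling scheme with 256-pixel tiles, a 96 dpi screen, and scale measured at the equator. The function name `scale_denominator` is purely illustrative.

```python
import math

EARTH_RADIUS_M = 6378137.0   # sphere radius used by Web Mercator
TILE_SIZE_PX = 256           # assumed tile size
SCREEN_DPI = 96              # assumed screen resolution
METRES_PER_INCH = 0.0254


def scale_denominator(zoom, lat_deg=0.0):
    """Approximate map scale denominator (the N in 1:N) at a zoom level."""
    circumference = 2 * math.pi * EARTH_RADIUS_M
    # Ground distance covered by one screen pixel at this zoom and latitude.
    metres_per_pixel = (
        circumference * math.cos(math.radians(lat_deg))
        / (TILE_SIZE_PX * 2 ** zoom)
    )
    # Number of screen pixels in one metre of physical screen.
    pixels_per_screen_metre = SCREEN_DPI / METRES_PER_INCH
    return metres_per_pixel * pixels_per_screen_metre


for z in (4, 9, 11, 15):
    print(f"zoom {z}: about 1:{round(scale_denominator(z)):,}")
```

Under these assumptions each zoom level halves the scale denominator. Passing a latitude, for example `scale_denominator(9, lat_deg=54)`, gives a smaller denominator, because Web Mercator stretches the map away from the equator, so the true ground scale over Great Britain is somewhat larger than the equatorial figures.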

I've never been able to zoom in further than level 15, since the map fails to draw any symbols beyond that scale, but what can we take from this map in terms of its cartography? The importance of designing for specific scales cannot be overstated. Putting a mass of points on a map simply doesn't work at most, if not all, scales. At smaller scales, data need manipulating so they are in a form suited to a small scale thematic map type. At larger scales, symbols need to be simple and clear. That said, I like the map for one simple reason: it's one of the first I have seen that has attempted to show a very complex data set by type rather than with a single coloured generic marker symbol. At larger scales the symbol design is generally good and gives a mechanism to visually disentangle incidents by type, transport, age, date and gender. It's doing what cartography was designed to do: allow the map maker to take complex data and classify, symbolize and provide a picture so that patterns can be seen that go beyond what a table, graph or uniform point marker web map can provide. It's not perfect, as I've pointed out, but it's pleasing to see web maps begin to show signs of cartographic thinking and design.


  1. And I thought I was harsh :)

  2. Yeah, there's definitely too much going on. The Oakland Crimespotting map is still my favorite for the way they present this much point data.

  3. Very interesting to read your comments and Steven's. Hope you don't mind me responding to both together.

    In response, I'll describe what we were aiming for. We prepared this in time for the World Day of Remembrance for Road Traffic Victims. It was intended to be simply a presentation of the data - analysis will come later. The 'insight' is just that there is "an awful lot of death and injury". We felt that people had been shown the headline numbers as statistics many times before and that we wanted to show what that data 'felt' like translated to individuals across the countries and in their area. The choice of symbols was to try to emphasise the fatalities as individuals - we found the addition of age and sex made each loss more human.

    Given the focus on individuals, we didn't want to aggregate them in any way. We were happy that the lower zoom levels would just be thumbnails, hinting at the mass of data, encouraging people to zoom in or search. Maybe we need something more to guide users to this.

    Of course, this release wasn't without constraints, and further releases will address filtering, neutral base maps, symbol overlaps, clickable details, etc. There are a number of other views onto this data available ( , , ) and we wanted to start by focusing on the overall picture. We are just about to release a similar map for the USA, then follow with some more specific versions, and then some analysis.

    Hopefully that explains a bit more about what we were attempting, and that we weren't just "sticking it on a map" because the data had some coordinates...

  4. To make the raw data easy and unproblematic to maneuver and construe, so that one can make improved and good judgment of the information, it is utterly imperative to systemize the entire compilation of data.