Cartonerd: dot density

Wednesday, 7 March 2018

Dotty election map

Well that escalated quickly...

While I've been working on the forthcoming book and mooc I've been doing some data wrangling in the background at work. For the 2012 Presidential election I made a gallery of maps that illustrated diverse styles of cartography along with some comments on the map types. Each map can tell a different story of the election. I've been in the process of updating this with a new gallery of the 2016 election results (currently around ten maps but more to come) and I got to the tricky one - the dasymetric dot density map. It requires quite a bit of manipulation of data so here is the map, and in this blog I'll explain a little of the process.

--------------------------
Update: There's now a web map which shows the data at 6 scales in much more detail than the screengrab above. Check it out here or below:

---------------------------

In 2012 I made a similar map for the Obama/Romney election. It was a product of the web mapping technology of the time and was made using ArcMap (full disclosure for those who don't know: I work for Esri, who make ArcGIS). At the smallest scale 1 dot = 1,000 votes. At the largest, 1 dot = 10 votes and if you printed the map out it would be as large as a football field. It took 3 months to cajole the largest scale map onto the web!!! I wanted to update the map, and the four years that have intervened have brought new software capabilities. For 2012 I had to generate up to 12 million points and position them. Now, using ArcGIS Pro, I can use the dot density renderer and let the software take the strain. And if I were going all out then why not try and make a map where 1 dot = 1 vote? So, for me, the map is a technical challenge. Part of what I do at work is to push the software to see what it is capable of, to test it and to show others what capabilities it affords.

So how to make the map? Well, it's a product of a number of decisions, each one of which propagates into the map. I'll be doing a proper write-up on the ArcGIS blog in due course but, in summary, a dasymetric map takes data held at one spatial unit (in this case counties) and reapportions it to different (usually smaller) areas. Those different areas are, broadly, urban. It uses a technique developed by the late Waldo Tobler called pycnophylactic reallocation modelling. The point of the map is to show where people live and vote rather than simply painting an entire county with a colour, which creates a map that often misleads. [Waldo sadly passed away recently and I was running the model when I heard of his death a couple of weeks ago. I met him a few times and his legacy to computational geography and cartography is immense.]

I used the National Land Cover Database to extract urban areas. It's a raster dataset at 30m resolution. I used the impervious surface categories and created a polygon dataset with three classes, broadly dense urban, urban, and rural. I then did some data wrangling in ArcGIS Pro (more of that in a different blog) to reapportion the Democrat and Republican total votes at county level into the new polygons. There's some weighting involved so the dense urban polygons get (in total) 50% of the data. The urban get 35% of the data and the rural polygons get 15% of the data. Then I got the dot density renderer in ArcGIS Pro to draw the dots, one for each vote resulting in a map with nearly 130 million dots.
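As a rough illustration of the weighting described above, here is a minimal Python sketch. It is not the ArcGIS Pro workflow used for the map. The 50/35/15 weights come from the text, but the function name, the input layout, the area-proportional split within each class, and the rescaling when a class is missing from a county are all assumptions made for the example.

```python
from collections import defaultdict

# Class weights quoted in the post: share of a county's votes per land cover class.
CLASS_WEIGHTS = {"dense_urban": 0.50, "urban": 0.35, "rural": 0.15}


def reapportion(county_votes, polygons, weights=CLASS_WEIGHTS):
    """Split one county's vote total across its land cover polygons.

    county_votes: total votes for one party in the county.
    polygons: list of (polygon_id, land_cover_class, area) tuples.
    Returns {polygon_id: votes}; the values sum to county_votes.
    """
    area_by_class = defaultdict(float)
    for _, cls, area in polygons:
        area_by_class[cls] += area

    # Assumption: if a class is absent from the county, rescale the
    # remaining weights so that no votes are lost.
    present = {cls: weights[cls] for cls in area_by_class}
    total_weight = sum(present.values())

    result = {}
    for pid, cls, area in polygons:
        class_share = present[cls] / total_weight
        # Assumption: within a class, votes are shared in proportion to area.
        result[pid] = county_votes * class_share * area / area_by_class[cls]
    return result


# Hypothetical county with 10,000 votes and four polygons.
example = reapportion(
    10_000,
    [
        ("a", "dense_urban", 2.0),
        ("b", "dense_urban", 6.0),
        ("c", "urban", 10.0),
        ("d", "rural", 400.0),
    ],
)
print(example)
```

Whatever the split within a class, the county total is preserved, which is the property that matters for a reallocation of this kind.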

The result is a map that pushes the data into areas where people actually live. It leaves areas where no-one lives devoid of data. It reveals the structure of the US population surface. Most maps that take a dasymetric approach end up looking like this but I think there's value in the approach. To me it presents a better visual comparison of the amount of red and blue than the standard county level map, which maps geography, not people, and overemphasises relatively sparsely populated large geographical areas.

So the map I saw on my desktop late Tuesday afternoon took 35 minutes to draw. Technical challenge achieved. ArcGIS Pro nailed it. This is a map that I couldn't have made in the previous election cycle. I was excited and so I took a quick screengrab, sent out a tweet and went home to walk Wisley the dog.

And that, I thought, was that. I'd put the map on the backburner and return to doing layout reviews for the book and doing last-minute work on the mooc over the next couple of weeks. But then something unexpected happened. My phone started pinging. Slowly at first but then a little more during the evening as people began to see the map on Twitter and like or re-tweet it. That's nice, I thought. I went to bed. Wednesday morning I woke to a relative avalanche of likes and retweets. I spent the day in Palm Springs at our Developer Summit and my phone never stopped. By the end of the day it had received around 3,000 likes and had been retweeted 2,000 times. I'm writing this Thursday morning and it's currently at 7,000 likes and a little over 3,000 retweets. The side-effect of this 15 minutes of map fame is I've picked up an extra 1,000 followers (25% increase) on my nearly 10 year old Twitter habit.

But there's a problem. The screengrab was quick and dirty and while there have been many and varied comments on the 'map' it's by no means the finished article. I want to create a hi-res version and also make a web map like the 2012 version. I don't have time to do this in the next couple of weeks but it will happen. But be assured, I am aware of a number of issues. Some have already spotted them and commented.

The symbols - I chose a very default red and blue. Each dot has 90% transparency so overlapping dots at this scale will undoubtedly coalesce into clumps. The impression will appear to bleed across the map. I need to tweak the colours (less saturated) and adjust the transparency to get a better effect. I will also likely do what I did for the 2012 map and classify the data so that at small scales 1 dot = 100 or 1,000 etc., to remove visual 'noise' at those scales. I'll also check for too many overlaps and overprinting. I actually think there's a problem in some areas with blue dots overprinting red. There should be more mixing and more purple. And no, there are no yellow dots. The map only displays Democrat and Republican votes in what remains, effectively, a binary voting outcome.
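To see why 90% transparent dots still coalesce, and why draw order can make one colour overprint the other, here is a small Python sketch. It assumes the renderer blends each dot with the standard "over" rule on an opaque white background, which is an assumption about the drawing model rather than a description of ArcGIS Pro internals. The function names are made up for the example.

```python
ALPHA = 0.10  # 90% transparency, as stated in the post

RED = (1.0, 0.0, 0.0)
BLUE = (0.0, 0.0, 1.0)
WHITE = (1.0, 1.0, 1.0)


def draw(dot, background, alpha=ALPHA):
    """Blend one semi-transparent dot over an opaque background colour."""
    return tuple(alpha * d + (1 - alpha) * b for d, b in zip(dot, background))


def stack(dots, background=WHITE):
    """Draw the dots in list order, first to last, on top of each other."""
    colour = background
    for dot in dots:
        colour = draw(dot, colour)
    return colour


def coverage(n, alpha=ALPHA):
    """Fraction of the background hidden after n overlapping dots."""
    return 1 - (1 - alpha) ** n


for n in (1, 5, 10, 20, 50):
    print(n, "overlapping dots hide", round(coverage(n), 3), "of the background")

# Same 20 red and 20 blue dots, three different draw orders.
red_then_blue = stack([RED] * 20 + [BLUE] * 20)
blue_then_red = stack([BLUE] * 20 + [RED] * 20)
interleaved = stack([RED, BLUE] * 20)
print("red then blue:", red_then_blue)
print("blue then red:", blue_then_red)
print("interleaved:  ", interleaved)
```

Under this model, whichever colour is drawn last dominates the result, while interleaving the same dots gives something close to purple. That is consistent with the overprinting problem noted above.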

The data - it's county data, reapportioned. Dot maps convey a positioning that is a function of the processing, not where people actually live or vote. Dots are positioned randomly. Some have, quite reasonably, interpreted the map as showing where votes are and this is a fundamental drawback of the approach. No personal information is in the map at all. I also need to double-check a few areas where people have pointed out apparent anomalies in the map, compared to their personal knowledge of the areas. There may be errors. I need to check. That said, it's a function of the way I've used the NLCD so that data is the basis for reapportionment.
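The point about random positioning can be made concrete with a short Python sketch. It scatters dots inside a polygon by rejection sampling, which is one common way to do it; whether the ArcGIS Pro renderer works this way is not stated here, so treat that as an assumption. The L-shaped polygon and the function names are hypothetical.

```python
import random


def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_dots(ring, count, rng):
    """Place `count` dots uniformly at random inside the polygon."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    dots = []
    while len(dots) < count:
        # Sample in the bounding box and keep only points inside the polygon.
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, ring):
            dots.append((x, y))
    return dots


# Hypothetical L-shaped polygon holding five dots.
polygon = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
run_a = random_dots(polygon, 5, random.Random(1))
run_b = random_dots(polygon, 5, random.Random(2))
print(run_a)
print(run_b)
```

The two runs map exactly the same data yet put every dot somewhere different, which is why an individual dot's position carries no meaning.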

The geography - yes, I hold my hand up. There's no Alaska or Hawaii. I apologise. I'm not sure I'll go back as it requires doing some movement of those states to position them around the lower 48 and put them back in. It's conceptually easy but a non-trivial task when you're working in a GIS. I'll think about it. I understand this is unpalatable for some and I accept that criticism.

The interpretations - many have offered some fascinating insights into the gaps and the patterns through Twitter replies. I'll be going through these more carefully when the hullabaloo dies down and teasing out some. But more than anything I've been blown away by the nice things that have been said about the map. It shows the election result in a different way. It tells a different story. One of my favourite responses was this by Thomas de Beus...a lovely mashup and play on the classic photo of Trump's preferred view of the data to hang on the wall of the White House by Trey Yingst.

And this is the point of making a map like this. It presents the SAME data in a different way. It leads to different insights, different interpretations and a different perception. Neither of the above is right or wrong. They are different. Of course, we all have our own view on which serves our needs and which we prefer but that's for us as individuals.

My only regret is that I excitedly tweeted a rough version. I should have waited until I made the map properly. I'll do that but I suspect this is my one viral 15 minutes of fame and I regret it doesn't reflect the quality I know the final version will exhibit. A finished map likely won't get the same traction but we'll see. At the very least it has ignited a discussion. It brings different cartographic eyes to the dataset. Will it ever be hung in the White House? Unlikely.

Thanks for your interest and comments thus far!

Ken

Hurriedly written from a hotel in Palm Springs during which time the map's had many more likes, 11 more mentions and I've picked up another 86 followers. I can only apologise to them when they realise I tweet just as much about beer and football as I do about maps.

Monday, 11 September 2017

Pointillist cartography

The Washington Post have published an article that explores alternative methods for mapping elections. "Toward a more perfect 2016 presidential election results map" does an excellent job of establishing the problem of mapping totals in massively different geographical units. They don't really explain you have to normalize the totals but, instead, leap to the population-equalizing density cartogram as one alternative before quickly dismissing it as hard to read.

They then offer a map that takes precinct level data and scales the results by number of votes.

What they seem to have done is create a proportional symbol map with very small circular symbols that have been scaled across a ridiculously small size range. They've used a lot of transparency to allow overlapping symbols to build a composite patch of more opaque colour in areas with a lot of small geographical areas.

This is pointillist cartography (note, I said pointillist, not pointless). Proportional symbol maps are not new. Neither are dot density maps. This version isn't particularly innovative but it does do a very good job of mitigating the perceptual problems of widely varying geographical areas. Each place gets the same symbology treatment and, so, the map provides a well balanced mix of red and blue with a lot of white space in between. They used a symbol treatment that goes from red through white to blue with the intermediate colours reserved for marginal precincts. I like this approach. It avoids the unusual purple often used for areas that are finely balanced. It means the map brings focus to those areas that are more partisan. Of course, with a shift in the symbology you could bring focus to marginal areas if that was the map you wanted to show.

A similar approach is to use solid fills for small areas and then show larger areas as small circular symbols. Mixing the techniques on a single map can be useful and also mitigates the visual impact of large areas. Here's an illustration using the technique that I recently made for my forthcoming book. The top is a standard choropleth with a diverging colour scheme. The bottom is the pointillist version.

So, overall I really like this kind of approach to deal with perceptual issues. But the article does hide a more interesting problem. The opening paragraph is at pains to say we've been over this ground before. We have - ad nauseam. Yet so many prefer the standard choropleth and, worse, sometimes with totals. But when they suggest it's a problem for the 'designer' that's where the real problem lies. Everyone these days is a bloody 'designer'. But everything is designed. I always balk when someone tells me they're a designer. A designer of what precisely? Furniture? Buildings? UI? Maps? A cartographer knows how to map election data. They know the problems and they know the solutions that best deal with particular visual issues to get to a map that matches a particular narrative. Far too many 'designers' are busy scrambling to try and figure out how to overcome problems that have already been figured out.

Talk to a cartographer. That's their job. They know what they're doing and likely have a good solution. Pointillist cartography isn't new. I'm pleased to see articles like the one I note here picking up these techniques. I just hope they get used a little more rather than being marginalized by 'designers' who default to the standard choropleth.

Saturday, 3 August 2013

The dottiness of dot maps

Dot density mapping seems to be the new hexagon in mapping. We've seen iPhone vs Android neighbourhoods, locals vs tourists and languages mapped. The US census has been mapped as one dot per person. I had a bash at mapping the results of the 2012 Presidential election. There are plenty of others and Eric Fischer writes a nice discussion on some of the issues you have to contend with when making such maps (picking up some threads of a discussion I had with him after my previous blog).

And then we have The Guardian's "Every person in England and Wales on a map".

There isn't much information about the map's construction except each person in England and Wales has been represented with a dot. The data source is the 2011 census. That equates to 56,075,912 dots on the map.

According to The Guardian's Chris Cross it creates a "beautiful picture of population density across the country" and you can "zoom in to the highest level to see the individual dots". OK then...here goes:

At its proper size this is the maximum scale, which is 1:72,224. I struggle to see individual dots. The reason is simple...the dots are too large so in the areas with the most people they are coalescing into an amorphous blob. We have no way of seeing any variation amongst the most densely populated areas. All we get is a black fill for the underlying polygons. This also creates the illusion of 'totality' in the sense that the area is absolutely rammed with people to the point of there being no room left for anyone else (is this the political point the map maker wanted to make?). Black is never a good colour to use on a map for 'fills'. Leave it for linework and labels....or make your dots sufficiently small so we can see some detail in these areas.
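To put rough numbers on the coalescing, here is a back-of-envelope Python sketch. The 1:72,224 scale is from the text; the 96 dpi display, the dot sizes in pixels and the population densities are illustrative assumptions, not values taken from The Guardian's map. It also treats the scale as nominal and ignores how a web map's true scale varies with latitude. The coverage formula is the standard expectation for discs scattered uniformly at random, ignoring edge effects.

```python
import math

SCALE_DENOMINATOR = 72_224   # maximum map scale quoted in the post
PIXEL_SIZE_M = 0.0254 / 96   # assumption: a 96 dpi display


def dot_ground_diameter_m(dot_px):
    """Ground distance spanned by a dot that is dot_px pixels wide."""
    return dot_px * PIXEL_SIZE_M * SCALE_DENOMINATOR


def covered_fraction(people_per_km2, dot_px):
    """Expected share of the area hidden by randomly placed dots, one per person."""
    radius_km = dot_ground_diameter_m(dot_px) / 2 / 1000
    dot_area_km2 = math.pi * radius_km ** 2
    return 1 - math.exp(-people_per_km2 * dot_area_km2)


print("1 px spans about", round(dot_ground_diameter_m(1), 1), "m on the ground")

# Hypothetical densities (people per square km) and dot widths (pixels).
for density in (500, 5_000):
    for dot_px in (1, 2):
        share = covered_fraction(density, dot_px)
        print(density, "per km2,", dot_px, "px dots:", round(share, 3), "covered")
```

Under these assumptions even a one-pixel dot spans many metres on the ground, so one dot per person saturates quickly in dense areas.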

That's not the biggest problem though...

The map above clearly shows that the data has been mapped into boundaries. These look like wards...the primary unit of English and Welsh electoral geography. It's not the finest scale of geography and it's conceivable they've used data reported at sub-units (e.g. Output Areas). And so the dots are placed randomly within the areas. That's a fairly standard technique for dot density mapping but let's not get carried away. This does not create a picture of population density. Using wards or other arbitrary boundaries means the data is constrained by the pattern that those boundaries create and NOT the pattern of where people live. So I don't find this a beautiful picture of population across England and Wales; I find it a visual misrepresentation of that statement.

The standard technique for mapping census data in wards is a choropleth map which, through the use of graduated colours, would allow us to compare the population density of areas across the country. It might be considered a 'boring' technique, particularly for the media, but it would be useful in this context. Instead we've got randomly placed dots across the entire country including all those areas only populated by parkland...or airports...or reservoirs and lakes...or sheep...or, well, the map suggests that people exhaust space but in different densities. People don't. In fact, many areas on the map will have far lower densities (practically zero) and many areas will actually be far more populated.

Here's a few of those land uses in the map extract above...

Whenever people look at dot maps, particularly those that suggest each dot is an individual data item (person in this case) the presumption is that the dot is the actual location. The danger of imputing the characteristics of an area to a finer resolution of data is termed the ecological fallacy and this is manifest in The Guardian's map. They've taken data reported in areas and created a map that at least 'suggests' a finer scale of mapping than the data is capable of providing. At best we know there are a lot of people in one area compared to another but the map suggests people are everywhere. The arbitrary boundaries of wards create the pattern we see, not the populated surface.

The map really should use a dasymetric technique which takes into account a secondary data set that allows us to position the source data more in line with where we logically know the data exists. In this case, using urban land use would help. Using residential land use (because that's where the census reports location) would be even better. Additionally, mask out all the parks, airports and unpopulated areas and THEN use the dot density technique to distribute the dots in those new areas. That's the technique I used for the dasymetric dot density map of the 2012 Presidential election data. I also avoided the ecological fallacy by deliberately NOT using 1 dot to symbolize 1 vote because I have no idea of voting patterns at that scale. I left it at 1 dot is 10 votes. I perhaps stretched things even then and maybe should have left it at 1 dot is 100 votes. Either way...the map shouldn't suggest more than the data is capable of showing.
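The mask-then-distribute idea can be sketched in a few lines of Python. This is an illustration only: the tiny land use grid, the cell codes and the function name are hypothetical, and the real work would be done on proper land use data in a GIS. The dot value of 10 matches the 2012 map described above.

```python
import random

DOT_VALUE = 10  # 1 dot = 10 votes, as in the 2012 map

# Hypothetical land use grid for one reporting area.
# R = residential, P = park, W = water, A = airport
GRID = [
    "PPRR",
    "WRRR",
    "WAAR",
]


def place_dots(grid, votes, dot_value, rng):
    """Scatter votes // dot_value dots, using residential cells only."""
    cells = [
        (row, col)
        for row, line in enumerate(grid)
        for col, code in enumerate(line)
        if code == "R"
    ]
    if not cells:
        raise ValueError("no residential cells to receive the data")

    # Votes left over after integer division are not drawn at this dot value.
    n_dots = votes // dot_value
    dots = []
    for _ in range(n_dots):
        row, col = rng.choice(cells)
        # Random position inside the chosen cell, in grid units (x, y).
        dots.append((col + rng.random(), row + rng.random()))
    return dots


dots = place_dots(GRID, 257, DOT_VALUE, random.Random(0))
print(len(dots), "dots placed, none in parks, water or the airport")
```

Raising the dot value reduces both the dot count and the implied precision, which is the trade-off behind choosing 10 or 100 votes per dot.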

The point here (no pun intended) is that accurate mapping requires quite a bit of data processing. It requires thought and, often, further data than simply the item to be mapped. It requires a knowledge and understanding of what you're doing and the limitations inherent in the technique and how you're crow-barring the data into it.

And the biggest problem of this sort of maverick cartography is the fact it gets huge readership. It appears in a national publication, on a web site with large readership and is promoted by Twitter accounts and blogs with large followings. I've said before that one of the big dangers in modern cartography is that we're seeing even more maps made by people who are unaware of basic techniques who produce maps consumed by the uninitiated. Cartography is important. The point of cartographic techniques is to marshal the way we map in a way that avoids misinterpretations. Cartography is not a set of rules to constrain innovation or stifle experimentation...it's a guide to ensure that the map's meaning will be interpreted accurately. The Guardian have failed massively. While what they've created isn't in itself wrong, the way it's presented is what causes the confusion. Maps like this should have a series of caveats to accompany them so the rest of us don't make false assumptions. Honest cartography would be a good start. Instead, these sorts of maps seem to be made because the author has found a way to make them...thus solving a technical challenge. In so doing they may get a viral map; they may be seen as innovative. Very few question the value of their cartographic approach but if we're to improve the honesty in modern cartography then we need to see these sorts of maps published with a greater sense of caution about what they show rather than the fanfare of technical triumphalism.