Friday 3 February 2017

Warp Factor Eleven

The UK Onshore Geophysical Library's William Smith - Interactive has wound me up.  It represents a cartofail that's extremely common.

William Smith produced his classic and beautiful maps of the Geology of England and Wales in the early 1800s. They are stunning. Making them available digitally is also wonderful but...and this drives me nuts - why oh why, time and time again, do we see people scan these wonderful old maps and then warp them to Web Mercator?

There's a reason Smith chose the right projection for his maps....because it was the right projection.

Would it not be easier to just change the projection of the web map service you're using and then allow the maps to be seen as intended? Instead we get the maps stretched and distorted. Horizontal text also becomes stretched and sits on a curve. It looks absurd.

200 years ago Smith made his map right. The very least we can do is honour it by using modern technology to re-present his work properly, not turn the warp factor up to eleven.

Thursday 2 February 2017

What's the point?

Campaign group 38 Degrees have produced a map of the NHS crisis in the UK that purports to show the location of signatories to a petition demanding improved resources for the NHS.

Click here to put in your own UK postcode and here if you want to just see the map I've screen grabbed below

Each signatory is shown with a lovely blue Google map pin (zoom out to get the full 'death by map pin' effect just for kicks). Here's the cartography bit: "To protect anonymity, we randomly assign locations in the constituency for each signature. No real locations are shown."

Say what?

So you take data, you ignore location other than it exists within a certain boundary, you give it a false location and then put it on a map. Let's just zoom in a bit...

There you go. Mr Gordon Bennett in that lovely house round the corner signed the petition. Except he didn't, did he, because this is a randomly placed marker. Someone in the area whose postcode cannot define a particular property has had their data pinned to Mr Bennett's house. I suspect that pisses them both off.

This sort of map tells this huge lie while at the same time purporting a level of precision that assigns the unreality to very particular houses on the map. It's a version of the ecological fallacy in the interpretation of statistical data, where inferences about the nature of individuals are deduced from inferences about the group to which those individuals belong. In this case precise location...albeit randomly assigned.

If you're going to randomize the data for display (a good thing) then aggregate it into a choropleth (you clearly have the boundaries which you're using to demarcate the selection) or show the postcode totals as a proportional symbol or do anything other than use point markers that make no sense and, worse, impute nonsense. Total cartojunk that obfuscates the real message. Put the damn numbers on the map. Make them big. Make those crucial messages the visual.
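A minimal sketch of that aggregation, in Python. The signature records and constituency names below are invented for illustration (the real petition data isn't reproduced here). The point is only that a count per constituency is all the map can honestly claim to know.

```python
from collections import Counter

# Hypothetical records: one constituency name per signature.
# Invented data; the real petition records are not reproduced here.
signatures = [
    "Constituency A", "Constituency B", "Constituency A",
    "Constituency C", "Constituency A", "Constituency B",
]


def totals_by_constituency(records):
    """Count signatures per constituency, discarding any point location."""
    return Counter(records)


totals = totals_by_constituency(signatures)

# Largest first: these are the numbers to put on the map.
for name, count in totals.most_common():
    print(f"{name}: {count}")
```

For a choropleth you'd divide each total by the constituency's population or electorate first. As proportional symbols the raw totals can be shown directly.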

ht @StevenFeldman

Wednesday 1 February 2017

Dangerous times. Dangerous maps.

A map showing the distribution of people from the countries named in Trump's Executive Order banning foreign nationals appeared recently.

You can read the full blog post by the publishers here but here's a screen grab in case the tweet is taken down:

There are a few things that upset me about this map.

Firstly it's non-normalized. It shows totals. Choropleths need a rate or ratio. It was pointed out to me that the map uses Congressional Districts as boundaries and that means the populations are 'roughly' the same so it's all ok.

I'm afraid 'roughly' doesn't cut it. Each Congressional District represents about 711,000 people but you'll notice it's also split by state boundaries so, in fact, it's the data per state that's reapportioned into roughly equally populated areas. That'd be fine for mapping if you discounted Montana, Wyoming, North and South Dakota which encompass one Congressional District each. But they have populations of 1 million, 584,000, 740,000 and 853,000 respectively. None of these population totals can be easily split further without resulting in numbers even further away from 711,000 but it means the map of totals fails to respect these differences. This means the visual message is warped when you're trying to compare across the map. The primary function of the choropleth is to support visual comparison and unless you accommodate the underlying discrepancy in populations it doesn't.
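To put a number on 'roughly', here's a quick sketch using only the population figures quoted above. The function and variable names are mine, and the populations are the rounded values from this post rather than official census counts.

```python
# Average Congressional District population, as quoted in the post.
AVERAGE_DISTRICT_POP = 711_000

# Single-district states and their (rounded) populations, as quoted in the post.
single_district_states = {
    "Montana": 1_000_000,
    "Wyoming": 584_000,
    "North Dakota": 740_000,
    "South Dakota": 853_000,
}


def deviation_percent(population, average=AVERAGE_DISTRICT_POP):
    """Percentage by which a district's population differs from the average."""
    return (population - average) / average * 100


deviations = {
    state: deviation_percent(pop)
    for state, pop in single_district_states.items()
}

for state, dev in deviations.items():
    print(f"{state}: {dev:+.1f}% relative to the average district")
```

On these figures Montana comes out at roughly 41% above the average district and Wyoming at roughly 18% below it, which is a lot of slack to hide behind 'roughly'.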

The problem is exacerbated because those States are very large anyway so they inevitably dominate the map. Also, because the map uses the inappropriate Web Mercator projection those northern states are enlarged in relation to the rest of the country too - a further warping that our brains won't adjust for in deciphering the map's message.
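The projection effect can be estimated too. On a spherical Mercator projection (which is what Web Mercator uses), areas are inflated by a factor of 1/cos²(latitude) relative to the equator. The sketch below compares two latitudes that I've picked as assumptions: about 47.5°N for North Dakota and about 34°N for Los Angeles. They're approximate and purely illustrative.

```python
import math


def mercator_area_scale(lat_deg):
    """Area inflation on a spherical Mercator map at a given latitude,
    relative to the equator: sec^2(latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


# Approximate latitudes, chosen for illustration only (assumptions).
NORTH_DAKOTA_LAT = 47.5
LOS_ANGELES_LAT = 34.0

# How much larger North Dakota is drawn, relative to Los Angeles,
# than it would be on an equal-area projection.
relative_inflation = (
    mercator_area_scale(NORTH_DAKOTA_LAT) / mercator_area_scale(LOS_ANGELES_LAT)
)

print(f"Relative area inflation: {relative_inflation:.2f}x")
```

With those latitudes the northern state is drawn at about one and a half times the area it would get, relative to Los Angeles, on an equal-area map. That's on top of it already being physically enormous.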

There are plenty of other techniques that the map's maker could have used - cartograms, proportional symbols, dot density, hex-binning and so on. Each has benefits and each has drawbacks. The authors said they "wanted to stick with the district boundaries so people could see which district they reside in". So they chose to go with geography so people have less of a barrier to understanding the map. Fine - but the consequence of that decision is you have to be prepared to deal with the inherent cognitive bias and work hard to mitigate it properly.

Even if you think I'm being too nerdy about the issue of totals on choropleths (I'm not) then think about it this way...the map suggests around 3,400 per Congressional District as the bottom value of the highest class in the legend. That's a lot of people, right? Three and a half thousand of them. As a percentage? Less than 0.5% and, frankly, that doesn't make the map nearly as persuasive. Dig a little further:

So the upper class actually goes from 3,400 to 51,652.  And look at how tiny that little place is in downtown Los Angeles. 51,652 people all crammed into Congressional District CA-28 which you can hardly see, compared to 1,620 in North Dakota which you can really, really see. The choropleth doesn't help at all here. A different technique altogether would help. But even at 51,652 that's only 7% of the population. Still not exactly a huge proportion.
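The percentages are easy to check. A sketch using the figures quoted in this post: the 711,000 denominator for the legend break and for CA-28 is the average district size, assumed here because the district's actual population isn't given, and 740,000 is the North Dakota figure from earlier.

```python
def percent_of(count, population):
    """Express a count as a percentage of a population."""
    return count / population * 100


AVERAGE_DISTRICT_POP = 711_000  # average district size quoted in the post
NORTH_DAKOTA_POP = 740_000      # rounded state population quoted in the post

# Bottom of the highest legend class, as a share of an average district.
legend_break_pct = percent_of(3_400, AVERAGE_DISTRICT_POP)

# CA-28, assuming it holds roughly the average district population.
ca28_pct = percent_of(51_652, AVERAGE_DISTRICT_POP)

# North Dakota's single at-large district.
north_dakota_pct = percent_of(1_620, NORTH_DAKOTA_POP)

print(f"Legend break: {legend_break_pct:.2f}%")
print(f"CA-28:        {ca28_pct:.2f}%")
print(f"North Dakota: {north_dakota_pct:.2f}%")
```

These are the numbers a normalized choropleth would be classifying, and they tell a much quieter story than the totals do.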

The data is also a little misleading. Libya cannot be extracted as a separate country from the American Community Survey used as a source for the map so the 'Other North Africa' designation was used - meaning people not on the banned list are included in the map. How many? Hard to know.

And reds? Hey, this is a sensitive issue. I mean a really f*cking sensitive issue. Red is not the colour to use because it's value-laden. We process it in a particular way and it means 'danger'. If the map is supposed to be an impartial display of the data then red is not the colour to use.

Finally, a friend of mine told me they were concerned that the map even existed, given it shows WHERE people on the banned list live. Popups even provide summaries broken down by country. I countered by suggesting that at Congressional District level there's enough generalization to mask real locations but I take the point and it raises an ethical issue for cartography. In a time of unpresidented [sic] political turmoil, is it morally OK to publish this sort of map just because you can easily scrape the data? What purpose does it support? Given the general outrage that the ban on entry for nationals of 7 countries is tantamount to a partial ban on Muslims, the map could easily incite or inflame the situation further. If the intent is to be impartial then you have to be ridiculously careful to ensure you do just that and this map doesn't. Unless you are setting out to be explicitly persuasive or even propagandist, cartographers and map-makers have a responsibility to make maps that are not misleading and when dealing with sensitive subject matter it becomes crucial.

I am absolutely sure that the map-makers here actually had the opposite intention because they include contact details for Congressional Representatives - presumably as a call to action to encourage people to call in their opposition to the ban. Trouble is, for every one that might go to that effort there will be many more that look at a sea of red and interpret it differently. That's the power of maps.

As it stands the map is dangerous. It shows where people who are currently on a banned list live, and that serves no purpose. It uses a good technique, but poorly, which amounts to nothing more than creating visual alternative facts. It uses the wrong projection, which exacerbates the problem. It uses slightly dubious data and, certainly, a bad choice of colours. I'd ban this sort of mapping. Period.