But if statistics can be twisted and turned to establish almost anything, all the more reason to make sure people know what they mean. Governments and other bodies that produce official statistics need to minimise the risk of those figures being abused by making them as clear as possible in the first place.
This is one of the subjects currently under discussion at the annual conference of the International Association for Official Statistics in Paris, themed “Better Statistics for Better Lives”. As we shall see, there are numerous common problems with official statistics that should give the delegates food for thought.
Mind the margin
Among the most politicised official statistics are employment and unemployment rates. When compiling the figures, the government obviously doesn’t go around asking every person whether they are employed or unemployed. It asks a small subset of the population and then generalises that group’s unemployment rate to everyone else. This means the level of unemployment at any given time is a guess – a good guess, but still a guess.
That’s statistics in a nutshell. Nothing is certain, but we throw numbers around as if they were absolutely certain. To echo an example that the statistician David Spiegelhalter has used, the Office for National Statistics (ONS) recently reported that the number of unemployed people fell by 55,000 between May and July of this year. However, the error around that guess was plus or minus 75,000. In other words, the true change could have been anything from a decrease of 130,000 to an increase of 20,000.
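For readers who want to check the arithmetic, here is a minimal sketch in Python. It treats the quoted margin as a simple plus-or-minus range around the estimate, which is an assumption made for illustration; the function names are hypothetical and this is not how the ONS computes or publishes its figures.

```python
def change_interval(estimate, margin):
    """Return (low, high) for an estimated change and its margin of error."""
    return estimate - margin, estimate + margin


def describe(estimate, margin):
    """Return the range and whether the direction of change is clear."""
    low, high = change_interval(estimate, margin)
    # If the range spans zero, we cannot tell a fall from a rise.
    direction_known = low > 0 or high < 0
    return low, high, direction_known


# A reported fall of 55,000 in unemployment, give or take 75,000.
low, high, direction_known = describe(-55_000, 75_000)
print(f"Change in unemployment: between {low:+,} and {high:+,}")
if direction_known:
    print("The direction of the change is clear")
else:
    print("A fall and a rise are both consistent with the data")
```

Because the range runs from a fall of 130,000 to a rise of 20,000, it includes zero, so the sketch reports that the direction of the change cannot be pinned down.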
So while we think unemployment went down, we don’t know for sure. But you wouldn’t have known that from how it was reported in the media, with the sort of certainty that nearly always accompanies news about official statistics. News headlines don’t like ambiguity, nor do the first few paragraphs of the stories below them.
But before we lay all the blame on journalists, there is an underlying issue with the statistical presentation. The ONS announcement didn’t even mention the uncertainty in the figures until several sections below the headline numbers, and you’d have to do quite a bit more digging than that to find the +/- 75,000 margin of error.
It’s not just the ONS that routinely glosses over such uncertainties, of course. For the UN’s official climate change statistics, I spent 22 minutes trying to find the variability tables … I gave up. The US Bureau of Labor Statistics recently published a report that began: “Total non-farm payroll employment increased by 201,000 in August, and the unemployment rate was unchanged at 3.9%.” The bureau waited until 4,200 words and nine pages later to add a dense paragraph trying to explain the margin of error.
The bureau also includes many different measures of unemployment: U-3 for the total unemployed as a proportion of civilian labour force, for example; and U-6 for the “total unemployed plus all persons marginally attached to the labour force, plus total employed part time for economic reasons, as a percent of the civilian labour force plus all persons marginally attached to the labour force”.
I am a statistician and even I have to read this stuff over and over before I understand it. Again, this is a very common phenomenon. Complex statistics have their place, but if there is no clear explanation, they are not helpful to the public and potentially quite damaging.
So how do the bodies that produce these figures win? First, they need to do a better job of reporting these uncertainties. In the UK, the Royal Statistical Society and the ONS are working together on how to do this right now (yes, US Bureau of Labor Statistics and the 13 other bodies that produce official statistics for the US, that is a hint). Clarity is key. I mean, this may sound like a crazy approach, but has anyone ever considered saying that unemployment is down 55,000, plus or minus 75,000?
We also need, and I know this can sound like a cliche, a better education programme for statistics in schools. Since statistics are being used to drive enormous decisions, our children need to know how to question these numbers.
A recent paper found that in the US, for example, while there is a new standard that requires a stronger emphasis on statistics in schools, maths teachers are not well prepared to teach the subject. As for the UK, most people in the stats community will tell you that statistics is poorly taught in schools – crammed into core mathematics, or (in my opinion) even worse, spread out in thin bits to geography and biology.
Lastly, politicians need to stay out of the bodies that produce national statistics. When politicians recently didn’t like the statistics coming out of the independent statistics agency in Puerto Rico, for instance, they dismantled it. In Greece, the chief statistician is being criminally charged for releasing what seems like the truth. The UK has not been immune to this in the past, either: Labour’s decision in the early 2000s to switch the Bank of England’s inflation target from retail price inflation to consumer price inflation is arguably the most obvious example, since it removed house prices from the equation at a time when they were rising rampantly.
A great statistician, George Box, once said: “All models are wrong, but some are useful.” In short, there is error any time we model something – or in other words, make a prediction. What statisticians need to be better at explaining, and what the public need to be better at understanding, is that none of these numbers are exact. We also need to make statistics clearer so that anyone can understand them. In an era of fake news, where verifiable facts can seem a rare commodity, statisticians are too often doing us all a disservice.