The book “The Data Detective: Ten Easy Rules to Make Sense of Statistics”, by Tim Harford, is well written. Mr. Harford presents his ideas straightforwardly. He himself provides a compelling summary of the ten rules. According to him, we should:
- learn to stop and notice our emotional reaction to a claim, rather than accepting or rejecting it because of how it makes us feel;
- look for ways to combine the “bird’s eye” statistical perspective with the “worm’s eye” view from personal experience;
- look at the labels on the data we’re being given, and ask if we understand what’s really being described;
- look for comparisons and context, putting any claim into perspective;
- look behind the statistics at where they came from—and what other data might have vanished into obscurity;
- ask who is missing from the data we’re being shown, and whether our conclusions might differ if they were included;
- ask tough questions about algorithms and the big datasets that drive them, recognizing that without intelligent openness they cannot be trusted;
- pay more attention to the bedrock of official statistics—and the sometimes heroic statisticians who protect it;
- look under the surface of any beautiful graph or chart;
- keep an open mind, asking how we might be mistaken, and whether the facts have changed.
Underlying all these guidelines is one “golden rule”: be curious. In my view, each of them is quite sensible, but not exactly groundbreaking.
The book starts by decrying Darrell Huff’s “How to Lie with Statistics”. In an overreaction, Harford sees it as a disservice to the cause of data analysis. However, Huff states early in his 1954 book that “[s]tatistical methods and statistical terms are necessary in reporting the mass data of social and economic trends, business conditions, ‘opinion’ polls, the census.” As a consequence, I do not see why we should not take Huff’s approach at face value or, in his words: “[t]he crooks already know these tricks; honest men must learn them in self-defense.”
I found Harford’s digressions on the perils of “motivated reasoning” and “naive realism” especially interesting. Besides, for someone reading the book now, his remark on how a measure ceases to be a good one when it becomes a target could not be timelier, given the recent decision of the World Bank to discontinue the “Doing Business Report”.
The book also reminds us how we are drawn to surprising news, which is usually bad news or a mere fluke. It stresses the importance of remaining skeptical of both hype and hysteria, and emphasizes the properties of “intelligently open decisions” (i.e., information should be accessible and usable, and decisions should be understandable and assessable).
A good point is how standard statistical tests assume that all data have been gathered before being tested. If the data are gathered bit by bit and tested incrementally, these tests stop being valid, because each extra look at the data is another chance for a fluke to cross the significance threshold. A related problem is the practice of hypothesizing after the results are known (or, simply, HARKing). As an example of food for thought, the book states that it was possible to craft an algorithm that would either “give an equal rate of false positives for all races” or “where the risk ratings matched the risk of rearrest [in a business deal] for all races, but it wasn’t possible to do both at the same time”.
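The point about incremental testing can be checked with a small simulation. The sketch below is my own illustration, not something taken from the book: it assumes normally distributed data with a known standard deviation of 1 and a true mean of 0 (so every "significant" result is a false positive), and it compares a single z-test on the full sample with the habit of testing after every batch and stopping at the first "significant" result. The function names, batch size and number of trials are arbitrary choices.

```python
import math
import random

Z_CRIT = 1.96  # two-sided 5% critical value for a z-test with known variance


def rejects_fixed(sample):
    """Test once, after all the data are in."""
    n = len(sample)
    z = sum(sample) / math.sqrt(n)  # sample mean * sqrt(n), with sigma = 1
    return abs(z) > Z_CRIT


def rejects_with_peeking(sample, step):
    """Test after every `step` observations; stop at the first 'significant' result."""
    total = 0.0
    for i, x in enumerate(sample, start=1):
        total += x
        if i % step == 0 and abs(total / math.sqrt(i)) > Z_CRIT:
            return True
    return False


def false_positive_rates(trials=5000, n=100, step=10, seed=0):
    """Return (fixed-sample rate, peeking rate) when the null hypothesis is true."""
    rng = random.Random(seed)
    fixed = peeking = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true mean is 0
        fixed += rejects_fixed(sample)
        peeking += rejects_with_peeking(sample, step)
    return fixed / trials, peeking / trials


if __name__ == "__main__":
    fixed_rate, peeking_rate = false_positive_rates()
    print(f"fixed-sample test: {fixed_rate:.3f}")
    print(f"test with peeking: {peeking_rate:.3f}")
```

Under these assumptions the fixed-sample test should reject in about 5% of the trials, as designed, while the peeking procedure rejects noticeably more often, even though nothing real is there to be found.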
Although it does not endorse it, the book presents a relevant argument on how risky it is to provide information to governments often regarded as incompetent. The implicit assumption that governmental statistics are collected not by the government but for the government deserves more careful consideration.
In conclusion, the author highlights, among others, the stories of Marin Mersenne (1588-1648), a monk and mathematician, and Andreas Georgiou, the former President of the Greek Statistical Authority. The first was essential to the development of science by stimulating open debate among disparate thinkers of his age, in stark contrast with the secrecy permeating alchemy at the same time. The second faces an unjust ordeal for doing his job, attacked by leftists, rightists and bureaucrats from his home country. In my own country, Brazil, I see a similar development in the coordinated effort to disrupt the work and the reputation of the former judge Sérgio Moro.
Note: first published on Goodreads: https://www.goodreads.com/review/show/3962558666.