Lies, Damn Lies, and Covid Statistics
Writing about bad analytics in the age of Covid is easy; embarrassingly easy. Since Lone Star has published several assessments around the epidemic, we’ve had our share of problems.
As 2021 comes to an end, it seems worth looking back over the last two years to summarize our work and other bad data, bad analysis, and generally bad reporting we’ve seen about the Covid-19 epidemic.
Here’s a top-five list of the worst lies, damn lies, and Covid statistics.
Number 5
The “data” you are seeing doesn’t mean what it seems to mean. Something Lone Star Analysis has generally done right is what we call “you can’t read the other guy’s numbers.” We’d have put this issue higher on the list at one time, but it has become more common to understand that different states and nations report statistics differently. More importantly, it’s generally accepted today that some places (e.g., China) simply lie a lot.
Grade for most reporting = C minus (much improved)
Grade for Lone Star = B+ (generally good here)
Number 4
Political framing is simply the wrong way to think about Covid statistics. I recently had a conversation with an otherwise brilliant quant analyst who wanted to argue that “Republicans resist vaccinations.” Of course, there is (sadly) a political component to nearly everything in the U.S., but this misses the point.
Chris Arnade has long pointed out that the biggest voting bloc in America is the Non-Voter. Urban inner cores, which nominally lean left, don’t vote. So, in the DFW area (our headquarters), the lowest vaccination rates are among the non-voters, in Blue islands, in a Red state. When you read stories about the “red-blue divide,” you should be very suspicious.
Our polling shows that other issues, like income level, gender, and other factors, are probably more important than party affiliation. Lone Star has been very consistent about polling a demographic which matches the census, not likely voters.
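For readers who want to see what matching a sample to the census can look like in practice, here is a minimal sketch of post-stratification weighting. It is an illustration under stated assumptions, not a description of Lone Star’s actual polling method: the function names, the two income groups, the responses, and the census shares are all hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent so weighted group shares match population_shares.

    sample_groups: list of group labels, one per respondent (hypothetical data).
    population_shares: dict mapping group label to its share of the population.
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    missing = set(counts) - set(population_shares)
    if missing:
        raise ValueError(f"no population share for groups: {sorted(missing)}")
    # weight = (share of population) / (share of sample) for the respondent's group
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


def weighted_rate(responses, weights):
    """Weighted mean of 0/1 responses."""
    return sum(w * r for r, w in zip(responses, weights)) / sum(weights)


# Hypothetical poll: low-income respondents are under-sampled (2 of 10),
# while the assumed census split is 50/50.
groups = ["low_income"] * 2 + ["high_income"] * 8
vaccinated = [0, 1] + [1] * 6 + [0] * 2  # 1 = vaccinated, 0 = not
census_shares = {"low_income": 0.5, "high_income": 0.5}

weights = poststratification_weights(groups, census_shares)
raw = sum(vaccinated) / len(vaccinated)
adjusted = weighted_rate(vaccinated, weights)

print(f"raw rate:      {raw:.3f}")       # 0.700
print(f"adjusted rate: {adjusted:.3f}")  # 0.625
```

In this made-up sample the unweighted rate overstates vaccination because the group with the lower rate is under-represented. Weighting to the assumed census shares pulls the estimate down, which is the kind of correction a likely-voter sample would miss.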
Grade for most reporting = F (persistently)
Grade for Lone Star = A (consistently)
Number 3
Using words without meaning and changing even squishy definitions. Remember “herd immunity”? There was never a good consensus on what this meant, and like other terms, some religious wars were fought over semantics. This is, of course, a standard problem in analytics. Groups whose forecasts fail often try to change definitions. This falls into the “lie” category. It’s not ok to move the target you failed to hit.
Grade for most reporting = D
Grade for Lone Star = A minus (we slip sometimes)
Number 2
Doing analytics around the data that is easy to get. This is another well-known problem in the analysis world. We gravitate to the data we can easily acquire, so we look in the wrong places, like the drunk in the old story who searches for his keys under the streetlight because the light is better there. Everyone is guilty of this to some extent.
It seems impossible to avoid it completely, but advanced analytic methods sidestep many of the pitfalls here. Sadly, most reporters and analysts are not doing the sidestep.
A great example is recent stories suggesting that perhaps we should not be counting “cases” of Covid now that vaccines have rendered many people immune to the extent that most people who test positive have mild or no symptoms. Is it a “case” if no one is sick? The problem, of course, is that we have decent data on positive test results, and we don’t have easy data on the degree of sickness.
Grade for most reporting = D plus (slow improvement)
Grade for Lone Star = B minus (we slip here too)
Number 1
Using the wrong analytic framework. In hindsight, it is clear that early models were dominated by flu and some other respiratory diseases with different characteristics than Covid-19.
It turns out droplets are not the concern most expected. The models had to start somewhere, but as G.E.P. Box warned us, “all models are wrong.” Too many of us ignored his warning. This problem is a key reason why predictions of herd immunity (including the ones from Lone Star) were wrong.
While some of Lone Star’s work was off base, others have been wildly off the mark. In particular, we take issue with work intended to coerce rather than inform the public. We don’t think analytics should be used as fear porn.
Grade for most reporting = F plus (little changed)
Grade for Lone Star = C plus (this one is hard)
It can be depressing to see the low quality of the information provided to the American public and our leaders. It may be a good thing that our polling shows the public generally doesn’t believe what the press and health officials say.
Our best practices work shows accountability is an important attribute of excellent analytics and A.I. We want to hold Lone Star accountable, so here you have it: the places where we’ve been right and where we’ve missed.