Why is it so hard to tell how deadly 2019-nCoV is?

With 2019-nCoV still uncontrolled across China, and global control challenged by incidents on cruise ships, in alpine ski towns, in apartment blocks, and elsewhere, conversations about the coronavirus epidemic always circle back to trying to figure out just how deadly it really is.

The disease’s case fatality rate (CFR), the percentage of 2019-nCoV infections that result in death, is devilishly difficult to calculate this early in the outbreak. So the range of numbers tossed out is very wide, running from 0.1% (like the seasonal flu) through 10% (like SARS) all the way up to 30% (like MERS).

Why is it so difficult to calculate a CFR for this virus? Well, it’s partially because people are irresponsibly quoting numbers with no scholarly support whatsoever, whether too high or too low. But it mainly comes down to the fact that the CFR of an outbreak of a brand new virus is hard to estimate.

Just to put real information up front, responsible estimates are all between 1% and 10% right now, and have been trending toward the lower end of that range.

Calculating a CFR is simple in theory

For any disease that has already run its course, where we can look back at a sample of cases that we’re certain is representative and divide them cleanly into deaths and recoveries, the CFR is a simple ratio: the number of deaths divided by the number of cases.

For instance:

  • The CFR of SARS is about 10%, because about eight thousand cases yielded about eight hundred deaths.
  • The CFR of the 2014-2016 West African Ebola virus outbreak was about 40%, because about 28,000 cases yielded about 11,000 deaths.
  • The CFR of the 2009 H1N1 influenza epidemic was about 0.01-0.08%, because between 700 million and 1.4 billion cases yielded between 150,000 and 575,000 deaths.
  • The CFR of the 1918 influenza pandemic has been estimated at around 10% to 20%.

For a thoroughly studied epidemic that’s already over, this kind of pat number can be calculated with relative ease, although as you can see, the numbers can carry significant uncertainty even after a century of retrospective research.
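The ratios in the list above can be checked with a few lines of Python. This is only a sketch using the rounded figures quoted in this article; the function name `cfr` is made up for illustration.

```python
def cfr(deaths, cases):
    """Crude case fatality rate as a percentage: deaths divided by cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * deaths / cases


# Rounded figures quoted in the list above.
sars = cfr(800, 8_000)
ebola = cfr(11_000, 28_000)

# For H1N1 both inputs are ranges, so the CFR is a range too:
# fewest deaths over most cases, and most deaths over fewest cases.
h1n1_low = cfr(150_000, 1_400_000_000)
h1n1_high = cfr(575_000, 700_000_000)

print(f"SARS:      {sars:.0f}%")
print(f"Ebola:     {ebola:.0f}%")
print(f"H1N1 2009: {h1n1_low:.2f}% to {h1n1_high:.2f}%")
```

Note how the H1N1 range spans almost a factor of eight, purely because of uncertainty in the inputs.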

The trouble is that in a new and evolving outbreak, both the number of deaths and the number of cases are very uncertain, and the relationship between them is far from guaranteed to remain constant over time.

Uncertainty in the number of cases

Right now, there’s huge uncertainty in the number of cases of 2019-nCoV that are actually out there.

The best estimates are that ascertainment, the percentage of cases known to medical authorities, is still pretty low, and probably still rising over time. Early in the epidemic, ascertainment was estimated in the single-digit percentages, meaning real case numbers were probably more than ten times higher than official numbers.
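To see how strongly ascertainment drives the case count, here is a minimal sketch. The reported figure of 10,000 and the ascertainment values are hypothetical placeholders chosen for illustration, not estimates for this outbreak.

```python
def estimated_true_cases(reported_cases, ascertainment):
    """Scale reported cases up by the fraction of cases that get reported.

    `ascertainment` is a fraction between 0 (exclusive) and 1 (inclusive).
    """
    if not 0 < ascertainment <= 1:
        raise ValueError("ascertainment must be in (0, 1]")
    return reported_cases / ascertainment


reported = 10_000  # hypothetical official count, for illustration only
for ascertainment in (0.05, 0.10, 0.50):  # hypothetical ascertainment rates
    true_cases = estimated_true_cases(reported, ascertainment)
    print(f"ascertainment {ascertainment:.0%}: about {true_cases:,.0f} real cases")
```

With ascertainment below 10%, the real case count is more than ten times the official one, which is the situation described above for the early epidemic.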

Even now, in Hubei province, thousands of patients with 2019-nCoV-like symptoms are being quarantined together in huge spaces without even being tested for 2019-nCoV, likely including many with the cold or flu who don’t even have 2019-nCoV. On the other hand, thousands of 2019-nCoV patients are also suffering at home, while others may not even have symptoms. And thousands more cases are latent.

Outside Hubei and especially outside China, ascertainment is almost certainly higher, but significant uncertainty still exists.

Uncertainty in the amount of time it takes for the disease to run its course

We need better data on how long this disease takes to run its course, before we can figure out which of the known cases should be considered as part of the denominator for a CFR calculation.

Someone who exhibited symptoms yesterday and was diagnosed today is a real case, but hasn’t had a chance to die yet. So should they be included in the “number of cases” term in our CFR calculation? Probably not.

But how long does someone have to have symptoms before you can count them? Do you wait a fixed period of time, or until they’re already recovered?

This uncertainty around the timetable of the disease is likely to be a big factor with 2019-nCoV, because the epidemic has been approximately doubling in size every week, while some reports from Wuhan indicate that deaths tend to come in the third week of the infection.
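A rough sketch of why this matters: if cases double every week and deaths lag onset by two to three weeks, today's deaths come from a case pool that was several times smaller than today's. The code below isolates that one bias under strong assumptions (pure exponential growth, a single fixed lag, and perfect ascertainment). It ignores under-ascertainment, which pushes in the opposite direction, so it is not a way to get a corrected CFR.

```python
def lag_underestimate_factor(doubling_days, lag_days):
    """Factor by which deaths/cases understates the CFR from lag alone.

    Assumes cases grow as 2 ** (t / doubling_days) and that every death
    occurs exactly `lag_days` after the case is counted.
    """
    if doubling_days <= 0 or lag_days < 0:
        raise ValueError("doubling_days must be positive, lag_days non-negative")
    return 2.0 ** (lag_days / doubling_days)


# Doubling time of about a week, deaths arriving in the third week.
for lag in (14, 21):
    factor = lag_underestimate_factor(doubling_days=7, lag_days=lag)
    print(f"lag of {lag} days: today's case count is about {factor:.0f}x "
          f"the pool that today's deaths came from")
```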

Uncertainty in the number of deaths

In most outbreaks, the official death numbers are likely to be closer to correct than the official case numbers, because more serious cases which are more likely to result in deaths are also more likely to wind up in the hospital. However, there are still a number of factors that can lead to significant uncertainty.

For one, some people do die of epidemic disease at home, completely outside the medical system. This is often pretty rare in times and places with functioning and trusted health systems, but there have been reports of this happening in Wuhan. We don’t know how often it’s happening.

In addition, in Hubei, significant numbers of deaths are occurring in hospitals without being confirmed by positive tests, and they are not counted as 2019-nCoV casualties. We don’t know how many.

It’s also the case that current case numbers contain people who haven’t died yet, but will die in the future, and we don’t know how many (see the above section on how long it takes the disease to run its course).

Finally, since the onset of this outbreak there have been rumors on Chinese social media that the government has been having bodies cremated before they can be tested and entered into the official death count. It’s very hard to evaluate these allegations of official suppression of fatality numbers, much less factor them into a CFR estimate, which is why many of us are keeping a close eye on the numbers coming from outside China.

Changes in these relationships over time

In addition, there is no single CFR for a disease — no guarantee that the CFR derived from one set of cases, even if it is unimpeachably correct, will extend to other circumstances. So talking about “the” CFR of 2019-nCoV is a little bit of a fool’s errand.

The CFR in a place where all the cases are making it into hospital beds may be significantly lower than in a place where many cases are winding up in improvised epidemic wards, or suffering at home.

The CFR in a first world hospital may be better than one in a third world hospital, and better in a hospital with plenty of resources to spare than in one seeing lots of cases. The Washington State hospital that treated the first US case did so with the help of a robot, used experimental drugs like remdesivir, and had a team of specialists standing by. Those conditions would be unlikely if it had more cases, and are unlikely in Wuhan under even the best of circumstances.

The CFR in a population with more vulnerable people may be worse than one in a place with a younger population. Japan, for example, has nearly twice China’s share of elderly residents, while India has about half of it.

The CFR in the same place may improve over time as new treatments emerge, or as experience with the clinical features of the new disease accumulates. The CFR of 2019-nCoV may drop significantly if repurposed antivirals prove effective and become widely available, or if new insights emerge about how to care for critical cases.

Numbers you’ll see that are definitely wrong

There are a number of ways that people in the popular press are calculating CFRs for 2019-nCoV that are wrong. The numbers may eventually turn out to be correct, but these methods are wrong, and the numbers should not be relied on at all until they’re derived by more reliable methods.

Some examples:

  • Making stuff up. A surprising number of people, including press outlets and even some medical experts, have been claiming the CFR of 2019-nCoV is “similar to seasonal flu” (which has a 0.1% CFR) or even “similar to the common cold” (which has a CFR of approximately zero) with no sources or calculations whatsoever, despite the fact that no responsible, data-based estimates have come close to this, or could be sustained by the numbers we have available. 2019-nCoV has caused a minimum of 725 deaths already; for it to have a CFR like the flu, at least 725,000 cases would already have had to happen and run their course, which is much higher than responsible estimates.
  • Naively dividing current death numbers by current case numbers. This has all kinds of biases in both directions, and the 2% figure it produces, while possibly approximately correct, can’t be relied on.
  • Naively dividing confirmed death numbers by the sum of current death and recovered numbers. This method is a favorite of conspiracy cranks and doomsayers. It will produce reliable results in an epidemic that is over, and in which the ascertainment rate of fatal and nonfatal cases is similar. But in an ongoing and rapidly growing epidemic with a long recovery timeline and higher ascertainment of serious cases, like this one, it will produce massive overestimates. The 23% number that people derive with this method is almost certainly one of them.
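The arithmetic behind the first bullet, and the opposite biases of the two naive ratios, can be sketched with a toy model. Everything in the simulation is a hypothetical assumption made for illustration: a true CFR of 2%, cases doubling every 7 days, deaths at day 18 and recoveries at day 28 after onset, and perfect ascertainment. It is not a model of the real outbreak.

```python
def implied_resolved_cases(deaths, cfr_pct):
    """Resolved cases needed for `deaths` to be consistent with a given CFR."""
    return deaths / (cfr_pct / 100.0)


def naive_estimates(true_cfr, doubling_days, days_to_death,
                    days_to_recovery, horizon_days):
    """Return (deaths/cases, deaths/(deaths+recovered)) in a toy epidemic.

    New cases on day d are 2 ** (d / doubling_days). Expected values are
    used instead of random draws, so the result is deterministic.
    """
    new_cases = [2.0 ** (day / doubling_days) for day in range(horizon_days + 1)]
    total_cases = sum(new_cases)
    # Only cases old enough to have reached an outcome can count as one.
    old_enough_to_die = sum(new_cases[: horizon_days - days_to_death + 1])
    old_enough_to_recover = sum(new_cases[: horizon_days - days_to_recovery + 1])
    deaths = true_cfr * old_enough_to_die
    recovered = (1.0 - true_cfr) * old_enough_to_recover
    return deaths / total_cases, deaths / (deaths + recovered)


print(f"Cases needed for 725 deaths at a 0.1% CFR: "
      f"{implied_resolved_cases(725, 0.1):,.0f}")

# All parameters below are hypothetical.
per_case, per_outcome = naive_estimates(
    true_cfr=0.02, doubling_days=7, days_to_death=18,
    days_to_recovery=28, horizon_days=60,
)
print("true CFR in the toy model:      2.0%")
print(f"deaths / cases:                 {per_case:.1%}")
print(f"deaths / (deaths + recovered):  {per_outcome:.1%}")
```

In this toy setup the first ratio lands well below the true value and the second well above it, even though the underlying CFR never changes. In the real data, under-ascertainment of mild cases adds a further bias that the sketch leaves out.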

Right now, ridiculously high and ridiculously low CFR numbers, as well as plausible numbers derived by totally unreliable methods, are everywhere. Skepticism pays.

Which numbers might actually be right?

Right now, none.

Medical experts continue to entertain numbers as low as 1% (still ten times higher than the seasonal flu) and as high as the SARS rate of approximately 10% for current caseloads in China.

While the higher of those numbers is now seemingly less likely, there’s still no clear consensus on a specific number for current caseloads, no clear consensus on how it might differ by circumstances (hospitalized vs improvised conditions, by country), and no clear consensus on how it might change over time.  The highest recent number we’ve seen was an estimate of 6.5% from a recent modeling study.

One notable report we’re paying attention to is a lengthy Caixin interview with Peng Zhiyong, the director of acute medicine at Wuhan University South Central Hospital, about the experiences of his group.

Dr. Peng reports on 138 cases his hospital admitted up to January 28, a group which has a couple of notable advantages as a study group: they have had time for the disease to largely run its course, they entered the medical system before case numbers got totally out of control, and they were seen in a hospital and treated pretty well. Of this group, about a quarter required ICU treatment, and about 4% died. Dr. Peng also reports that the fatalities in this group all came in approximately the third week of the symptomatic phase of the disease.

Although not formally published yet, this is the first cohort study of any kind to report fatality numbers. It’s unclear what this means, in exact terms, because we don’t know the percentage of admitted patients among all patients, but it speaks against the higher end mortality estimates of 10% and above. It also speaks against using current case numbers as a denominator, because the epidemic has been growing so rapidly and many cases have not had time to develop to the serious stage yet.
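A cohort of 138 is small, so even a clean 4% carries real statistical uncertainty. As a rough illustration, the sketch below computes a 95% Wilson score interval. The count of 6 deaths is an assumption (it is simply the whole number closest to "about 4%" of 138, not a figure from the interview), and the interval says nothing about the cases that never reached the hospital.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# ASSUMPTION: 6 deaths is a stand-in for "about 4%" of 138 admitted patients.
low, high = wilson_interval(events=6, n=138)
print(f"point estimate: {6 / 138:.1%}")
print(f"95% interval:   {low:.1%} to {high:.1%}")
```

Under that assumption the upper end of the interval stays below 10%, which fits the reading that this cohort speaks against the highest estimates.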

This article in Swiss Medical Weekly summarizes different reports and possibilities, but doesn’t account for long estimates of the outcome time by Dr. Peng and others.

This large cohort study on medRxiv is another good piece of information. It’s the largest cohort study so far, and it comes from outside of Wuhan, where conditions are under more control and ascertainment is higher. It reports a CFR of about 1.4%, lower than other recent reports. It’s based largely on cases which are still unresolved, which will bias the estimate downward, but it is also based only on the cases which made it to the hospital, which will bias the estimate upward.

As time passes, more reliable cohort studies will start to give specific numbers for ascertained cases, and population surveys and reviews of death records will begin to pin down how representative those cases are of the general population. Cases in different places will get separate numbers attached to them, and studies from different times will show temporal trends. Research on any new therapies will focus on CFRs with and without them. The CFR landscape of 2019-nCoV will become clearer.

In the meantime, don’t commit yourself mentally to any single number, and pay no attention to numbers that are far below 1% or far above 10%, that have no sourcing, or that rely on unreliable methods.



  • Trace

    Great write up. Statistics and numbers are so hard to track, and are frequently used to argue one side against another (“There are three types of lies — lies, damn lies, and statistics.”). I appreciate the explanation as to how these numbers are arrived at, and why they can’t be determined at this point.

    • Jon Stokes (The Prepared), replying to Trace

      Thanks!
