A Very Important Paper on Infection Rate and Mortality

I was going to place this in my monthly links post, but it seems too timely to wait:

Here is a very important paper on COVID infection rates and mortality rates. It has plenty of math so I don't understand all of it, but the authors appear to be high-level econometricians. They describe in plain language their assumptions, which seem very humble, and why they get such broad estimates.

Some very readable excerpts from the paper:
"Criteria used to determine who is eligible for testing typically require demonstration of symptoms associated with presence of infection or close contact with infected persons. This gives considerable reason to believe that some fraction of untested persons are asymptomatic or pre-symptomatic carriers of the COVID-19 disease. Presuming this is correct, the actual rate of infection has been higher than the reported rate.
"It is perhaps less appreciated that available measurement of confirmed cases is imperfect because the prevalent tests for infection are not fully accurate. There is basis to think that accuracy is highly asymmetric. Various sources suggest that the positive predictive value (the probability that, conditional on testing positive, an individual is indeed infected) of the tests in use is close to one. However, it appears that the negative predictive rate (the probability that, conditional on testing negative, the individual is indeed not infected) may be substantially less than one. Presuming this asymmetry, the actual rate of infection has again been higher than the reported rate. 
Combining the problems of missing data and imperfect test accuracy yields the conclusion that reported rates of infections are lower than actual rates."
Obviously, because tests are scarce we mostly test people who show symptoms. This means asymptomatic carriers don't make it into the case count. Okay. What's less obvious is the inaccuracy of the test. If the test says you have it, it's very likely you have it. But if the test says you don't have it, there's a real chance you actually do have it. These two reasons combined mean more people have it than we would otherwise think.
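To make the asymmetry concrete, here is a minimal sketch of the arithmetic using the law of total probability. The positive share and the predictive values below are hypothetical numbers I picked for illustration, not figures from the paper, and the function name is my own.

```python
def infection_rate_among_tested(positive_share, ppv, npv):
    """P(infected | tested), split by test result.

    ppv: P(infected | tested positive)
    npv: P(not infected | tested negative)
    """
    negative_share = 1.0 - positive_share
    # True positives plus the infected people hiding among the negatives.
    return ppv * positive_share + (1.0 - npv) * negative_share


# Hypothetical inputs, chosen only for illustration (not from the paper).
positive_share = 0.20  # share of tests that come back positive
ppv = 1.0              # a positive result is always right

for npv in (1.0, 0.9, 0.6):
    rate = infection_rate_among_tested(positive_share, ppv, npv)
    print(f"NPV = {npv:.1f}: infected share of tested = {rate:.2f}")
```

With a perfect test the infected share equals the positive share. As the negative predictive value drops, the true share climbs well above what the positive tests alone suggest.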

This leads to the optimistic conclusion,
"reported rates of severe illness conditional on infection are higher than actual rates."
The analysis is based on data from three places: Illinois, New York, and Italy. The authors caution against generalizing from these results,
"the available data on the rate of positive tests for tested persons reveal almost nothing about the population infection rate. Moreover, a huge increase in the rate of testing would be required to substantially narrow the width of the bound."
Very few people have been tested:
"As of early April 2020, the fraction of the population who have been tested is very small in most locations. For example, the fraction who have been tested by April 6, 2020 was about 0.005 in Illinois, 0.017 in New York, and 0.012 in Italy"
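Here is a rough sketch of why such small tested fractions leave the population infection rate almost unconstrained. It assumes nothing at all about untested people, so their infection rate can be anywhere from 0 to 1. That is my own worst-case simplification, not the paper's method, and the 0.20 positive share is a hypothetical placeholder. The tested fractions are the ones quoted above.

```python
def worst_case_bounds(tested_fraction, infected_among_tested):
    """Bounds on the population infection rate when nothing is assumed
    about the untested (their infection rate may be anything in [0, 1])."""
    known_infected = tested_fraction * infected_among_tested
    untested_fraction = 1.0 - tested_fraction
    lower = known_infected                      # no untested person is infected
    upper = known_infected + untested_fraction  # every untested person is infected
    return lower, upper


# Tested fractions quoted from the paper for April 6, 2020.
tested = {"Illinois": 0.005, "New York": 0.017, "Italy": 0.012}

# Hypothetical infected share among the tested (illustration only).
HYPOTHETICAL_INFECTED_AMONG_TESTED = 0.20

for place, fraction in tested.items():
    lower, upper = worst_case_bounds(fraction, HYPOTHETICAL_INFECTED_AMONG_TESTED)
    print(f"{place}: [{lower:.3f}, {upper:.3f}]")
```

The width of each interval is simply the untested fraction, no matter what the tests say. The paper's own bounds are narrower than these, so the authors are evidently assuming more than this sketch does.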
So what's the punchline?
"On April 6, 2020, the bounds on the infection rates in Illinois, New York, and Italy respectively are [0.001, 0.517], [0.008, 0.645], and [0.003, 0.510]."
But when they import more assumptions from experts, the updated infection rates look like:
"Considering again April 6, 2020, the updated bounds are [0.002, 0.517], [0.011, 0.645], and [0.004, 0.510]."
Good God that's a huge range! Harm per infection isn't much better. Focusing on Italy where data exists for severe health outcomes, they use the non-updated bounds to calculate:
"Focusing on April 6, 2020, we see that the bound on the probability of being hospitalized if infected is [0.001, 0.172]. The bound on the probability of needing intensive care is narrow, being [0, 0.02]. The fatality rate on April 6 lies in the bound [0.001, 0.086]. It is notable that this upper bound on fatality is substantially lower than the fatality rate among confirmed infected individuals, which was 0.125 on April 6."
You might say that the paper's bounds are so wide that it tells us next to nothing. Indeed, I have read people on Twitter and in Marginal Revolution's comments saying this. That is completely wrong.

Consider the phrase "I may or may not," as in, "I may or may not go to the store." You might think, of course you may or may not go to the store, those are the only logical possibilities! This is the wrong way of understanding it. "May or may not" is contrary to "will" and to a third possibility, "will not." It says "both are possibilities" as opposed to "one or the other is certainly true."

Likewise, it's easy to think, of course the fatality rate is between .001 and .086! This, of course, is wrong for the same reason. This paper is admitting a large range of possibilities. Half of us could be infected and the death rate could be very low, but maybe few of us are infected and the death rate is very high. "It could go either way, we don't know" is information.
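A back-of-the-envelope sketch shows how the two ends of the Italy bounds fit together. It treats the fatality rate as deaths per capita divided by the infection rate, and it backs out deaths per capita from the rounded numbers quoted above by pairing the upper fatality bound with the lower infection bound. That pairing is my assumption, not the paper's stated calculation, and the 0.05 midpoint is an arbitrary illustration.

```python
# Italy bounds quoted above for April 6, 2020 (non-updated).
INFECTION_LOW, INFECTION_HIGH = 0.003, 0.510
FATALITY_HIGH = 0.086


def fatality_if_infected(deaths_per_capita, infection_rate):
    """Deaths are fixed; the more people infected, the lower the rate per infection."""
    return deaths_per_capita / infection_rate


# Assumption: the upper fatality bound corresponds to the lower infection bound.
deaths_per_capita = FATALITY_HIGH * INFECTION_LOW

for infection_rate in (INFECTION_LOW, 0.05, INFECTION_HIGH):
    fatality = fatality_if_infected(deaths_per_capita, infection_rate)
    print(f"infection rate {infection_rate:.3f} -> fatality if infected {fatality:.3f}")
```

If few of us are infected, the same death count implies a fatality rate near 0.086. If half of us are, it rounds to 0.001, which matches the lower end of the quoted fatality bound.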

I almost titled this post "New Paper Says Coronavirus Could Be as Bad as the Flu," which is technically true, but the paper also says it could be much, much worse.
