How lethal is COVID-19 compared with the common flu? To answer this, health officials first have to know how many people in a community are infected, or have been in the past. They’re using virus swab tests to identify people who have COVID-19, and they’re administering antibody blood tests to detect who has had the infection previously.

The accuracy of antibody tests is still in question, however. Inaccurate test results could produce inaccurate infection counts, which would then produce inaccurate prevalence and mortality rates. But Chicago Booth’s Panos Toulis suggests a statistical method that officials can use to arrive at more accurate counts and thereby craft more-nuanced containment policies.

Antibody testing isn’t yet widespread, so officials are trying to extrapolate from studies conducted in certain locations, including Santa Clara County, California, and New York State. In Santa Clara, officials tested 3,330 people, of whom 50 tested positive for COVID-19 antibodies. After some reweighting to make the sample more representative of the general population, a research team led by Stanford’s Eran Bendavid estimates that 2–4 percent of the general population in the area was likely infected.

But how robust and trustworthy are these results? The Santa Clara study used a standard statistical method that Toulis compares to a Swiss Army knife. Standard methods, which involve assumptions, can serve many purposes and are great if you need one tool for many jobs, he says. But when possible, you reach for a tool more specific to the task at hand.

Toulis reexamined the results using a different method, aiming to eliminate unnecessary assumptions by including two unknown factors as parameters. One was the false-positive rate—if someone does not have COVID-19 antibodies, what is the probability that the test result will still be positive? The second was the true-positive rate—if someone has the antibodies, what is the probability that a test will come out positive? “Our target is disease prevalence. In between, there are these two numbers we don’t care as much about but that help solve the equation,” he explains.

This gave him three unknowns, which functioned as parameters for his problem: the true-positive rate, the false-positive rate, and the prevalence rate. He then looked at all possible values of the unknown parameters and assessed the likelihood of the observed test results under any given combination of values. For example, what is the likelihood of seeing 50 positives out of 3,330 trials when the true-positive rate is 85 percent, the false-positive rate is 0.8 percent, and the prevalence rate is 5 percent? He had some sense of the actual true- and false-positive rates because the Santa Clara researchers, using blood collected before the COVID-19 crisis, had conducted a validation study that estimated those rates for their test.
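The likelihood being evaluated here can be sketched as a binomial calculation: any single test comes back positive either because an infected person is correctly flagged or because an uninfected person is falsely flagged. A minimal sketch in Python (the function and variable names are illustrative, not Toulis's code):

```python
from math import comb

def likelihood(positives, n, prevalence, tpr, fpr):
    """Probability of observing `positives` out of `n` tests, given a
    prevalence rate and the test's true- and false-positive rates."""
    # Chance that any one test is positive: infected and correctly
    # flagged, or uninfected but falsely flagged.
    p_pos = prevalence * tpr + (1 - prevalence) * fpr
    return comb(n, positives) * p_pos**positives * (1 - p_pos)**(n - positives)

# The example from the text: 50 positives in 3,330 tests, evaluated at
# a true-positive rate of 85%, false-positive rate of 0.8%, prevalence of 5%.
L = likelihood(50, 3330, prevalence=0.05, tpr=0.85, fpr=0.008)
```

At these particular parameter values, the expected number of positives is roughly 167, so observing only 50 is vanishingly unlikely, and a combination like this one would be ruled out by the data.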

Toulis scanned every possible joint combination, a computationally manageable problem because there were only three numbers involved, all of them rates between 0 and 100 percent. If a particular combination was inconsistent with the data, he threw it out. The remaining combinations corresponded to a confidence set of prevalence values, all of them statistically plausible.
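That scan can be sketched as follows, under assumptions not in the article: a coarse illustrative grid over the three rates, and a normal-approximation acceptance test standing in for the exact binomial test inversion Toulis actually uses (his procedure also incorporates the validation data, which narrows the set further):

```python
from math import sqrt

def consistent(k, n, prevalence, tpr, fpr, z=1.96):
    # A combination survives if the observed count k lies inside an
    # approximate 95% acceptance interval for the implied binomial
    # distribution of positive test results.
    p_pos = prevalence * tpr + (1 - prevalence) * fpr
    sd = sqrt(n * p_pos * (1 - p_pos))
    return abs(k - n * p_pos) <= z * sd

# Illustrative grid (step sizes are arbitrary choices): prevalence 0-5%,
# false-positive rate 0-2%, true-positive rate 80-100%.
confidence_set = set()
for prev_i in range(0, 101):          # prevalence in 0.05% steps
    for fpr_i in range(0, 41):        # false-positive rate in 0.05% steps
        for tpr_i in range(80, 101):  # true-positive rate in 1% steps
            if consistent(50, 3330, prev_i / 2000, tpr_i / 100, fpr_i / 2000):
                confidence_set.add(prev_i / 2000)
```

Without the validation-study constraint linking the two error rates, this crude version yields a somewhat wider set than Toulis reports, but the logic is the same: each surviving prevalence value is one the data cannot rule out.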

For example, he finds that in Santa Clara, a 2 percent prevalence rate could occur only if the testing kit used had an atypically low false-positive rate. By contrast, a prevalence rate close to 0 would happen only if there were a relatively high false-positive rate, of about 1.5 percent. Applying this logic to the data, Toulis narrowed the range of possibilities and argues that the true prevalence rate was likely somewhere between 0 and 2 percent, and between 0.3 percent and 1.8 percent if the actual false-positive rate of the test was 0.5 percent, an estimate based on validation data from the testing-kit manufacturer and the Santa Clara study authors.
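The effect of pinning down the false-positive rate can be illustrated by rerunning the same kind of scan with that rate fixed at 0.5 percent (again a rough sketch with an illustrative grid and a normal-approximation test, not Toulis's exact procedure):

```python
from math import sqrt

def acceptable(k, n, p_pos, z=1.96):
    # Approximate 95% acceptance check for the observed positive count.
    sd = sqrt(n * p_pos * (1 - p_pos))
    return abs(k - n * p_pos) <= z * sd

FPR = 0.005  # false-positive rate fixed at 0.5%
plausible = {
    prev_i / 1e4
    for prev_i in range(0, 301)       # prevalence 0-3% in 0.01% steps
    for tpr_i in range(80, 101)       # true-positive rate 80-100%
    if acceptable(50, 3330,
                  (prev_i / 1e4) * (tpr_i / 100) + (1 - prev_i / 1e4) * FPR)
}
```

Fixing the false-positive rate removes the combinations that allowed a prevalence near zero, and this approximation lands in the same neighborhood as the 0.3-1.8 percent interval Toulis reports, though it does not reproduce his exact bounds.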

Toulis sees two ways to produce even more accurate results: either run bigger studies involving more people, which would make the classical statistical methods more robust, or redo his calculations with additional data that better pin down the relationship between the false- and true-positive rates.

Toulis says his method can help assess the accuracy of the Santa Clara findings, or of any findings arising from classical statistical methods. When he applies the same statistical process to data from New York State, which tested more people and produced a higher prevalence rate, he determines that in mid-April, the prevalence rate in New York was likely between 11 and 18 percent.

While his research doesn’t produce a definite conclusion, Toulis says that it could help officials avoid making mistakes. “One consistent finding, for instance, is that COVID-19 prevalence still appears to be very low compared with the 70–90 percent range that is typically associated with herd immunity,” he says. “This means that reopening policies will likely fail if they start prematurely, and there are signs of that already happening in data from several countries.” After easing social-distancing restrictions, France, Germany, Korea, and Iran are among the countries experiencing a reemergence of virus cases.