Adam Pearce

You Regress It: Have Masks Prevented 66,000 Infections in New York City?

A paper published last week claiming mask usage prevented 66,000 coronavirus cases in NYC has received widespread media coverage. It has also been heavily criticized for, among other things, a tenuous analysis of a regression discontinuity.

Fitting lines to the number of new cases each day in New York City before and after face masks were mandatory, the authors attribute the steeper decline in cases to face masks: new cases were falling by 39 per day before the order and 106 after, so face masks probably sped up the rate of decrease by about 67 cases a day.
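As a rough sketch of that mechanic (not the paper's code or data), the comparison boils down to two least-squares fits and a subtraction. The series below is synthetic, built to fall by exactly 39 and then 106 cases a day, and the helper names `ols_slope` and `slope_change` are made up for this illustration.

```python
def ols_slope(ys):
    """Least-squares slope of ys against the day index 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def slope_change(daily_cases, cutoff):
    """Fit separate lines before and after `cutoff`; return both slopes and their difference."""
    before = ols_slope(daily_cases[:cutoff])
    after = ols_slope(daily_cases[cutoff:])
    return before, after, after - before


# Synthetic series (NOT the NYC data): falls 39/day for 14 days, then 106/day.
pre = [5000 - 39 * d for d in range(14)]
post = [pre[-1] - 106 * (d + 1) for d in range(14)]

before, after, diff = slope_change(pre + post, cutoff=14)
print(round(before), round(after), round(diff))  # -39 -106 -67
```

The 67 cases a day is just the gap between the two fitted slopes, so everything depends on which days end up on each side of the cutoff.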

There are some bold claims along the way (“After April 3, the only difference in the regulatory measures between NYC and the United States lies in face covering on April 17 in NYC”), but let’s take a closer look at the mechanics of this regression.

With large day-of-week patterns, the regression is sensitive to the exact start and end dates — can you tweak them to make a chart recommending against mask usage?

These small adjustments aren’t unreasonable. Mobility decreased before the lockdown order; the mask order had a three-day grace period. And there’s a variable lag between infections and positive tests that the paper doesn’t engage with. Add a seven-day lag to the paper’s dates to account for incubation, testing and reporting, and the regression makes it look like there’s a strong case for banning masks!
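To see how much a few days can matter, here is a minimal sketch on made-up data, not the NYC series. The underlying trend is a steady decline of 50 cases a day that never changes; the only other ingredient is an assumed weekend reporting dip. The 10-day fitting window, the size of the dip and the helper names are all arbitrary choices for illustration.

```python
def ols_slope(ys):
    """Least-squares slope of ys against the day index 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Synthetic data (NOT real case counts): constant decline of 50 cases/day
# plus a hypothetical dip of 600 reported cases on two days of each week.
WEEKLY = [0, 0, 0, 0, 0, -600, -600]
cases = [5000 - 50 * d + WEEKLY[d % 7] for d in range(40)]

WINDOW = 10  # days fitted on each side of the cutoff (arbitrary)


def slope_change(series, cutoff, window=WINDOW):
    """Slope after the cutoff minus slope before it, each fit on `window` days."""
    before = ols_slope(series[cutoff - window:cutoff])
    after = ols_slope(series[cutoff:cutoff + window])
    return after - before


# Slide the cutoff one day at a time across a full week.
# Negative: the decline looks steeper after the cutoff. Positive: it looks slower.
changes = {cutoff: slope_change(cases, cutoff) for cutoff in range(10, 17)}
for cutoff, change in changes.items():
    print(f"cutoff day {cutoff}: slope change {change:+.1f} cases/day")
```

Even though nothing happens at any of these cutoffs, the fitted slope change comes out negative for some days and positive for others, purely because of where the weekend dips fall inside each window.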

With fuzzy boundaries, a local regression might be more appropriate. But it’s not really possible to do causal inference with case counts from just three regions, as this paper tries to do, especially if you’re fitting straight lines to exponential infection curves.
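The straight-line problem can be sketched with a toy curve, assuming cases shrink by a constant 5% a day (a made-up rate, not an estimate for NYC). Nothing changes at the cutoff, yet the two linear fits disagree, while fits to the log of the counts recover the same rate on both sides.

```python
import math


def ols_slope(ys):
    """Least-squares slope of ys against the day index 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Synthetic curve (NOT real data): cases shrink 5% per day for four weeks.
cases = [5000 * 0.95 ** d for d in range(28)]
cutoff = 14  # arbitrary split; the decay rate is identical on both sides

# Straight lines fitted to the raw counts.
linear_before = ols_slope(cases[:cutoff])
linear_after = ols_slope(cases[cutoff:])

# Straight lines fitted to log counts, where exponential decay is linear.
log_cases = [math.log(c) for c in cases]
log_before = ols_slope(log_cases[:cutoff])
log_after = ols_slope(log_cases[cutoff:])

print(f"raw counts: {linear_before:.1f} vs {linear_after:.1f} cases/day")
print(f"log counts: {log_before:.4f} vs {log_after:.4f} per day")
```

On the raw counts the decline appears to slow after the cutoff simply because there are fewer cases left to lose, so a change in linear slope on its own says little about whether anything changed.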

A growing body of evidence supports mask usage. Shoddy statistics published in PNAS (with disconcertingly positive initial expert reactions and continued public reference) will make it harder to communicate the results of future research, especially after masks have become politicized and recommendations have shifted. The paper’s abstract puts it well: “sound science is essential in decision-making for the current and future public health pandemics.”

NYC case data // chart code

The chart from the paper has been lightly edited for clarity.

The top-line 66,000 number comes from a similarly suspect regression on cumulative cases. The discontinuous regression was more interesting to illustrate and the results are in the same ballpark.