A (Longish) Twitter Thread on Modeling During COVID-19

I came across this tweet by Republican Senator John Cornyn (TX) while surfing social media during a pandemic, trying to satisfy my brain’s dual, equally rational cravings for both information and numbness.


What Cornyn is referencing is the fact that prominent predictive models of deaths attributed to COVID-19 in the United States have been revised a few times lately—each of the most recent revisions has lowered the point estimate of deaths we’re expecting to see as a direct consequence of infection. A lot of political and media elites on the US right are claiming that this is proof that the virus never warranted the concern we gave it. Many are trying to frame the whole response to COVID-19 as an overreaction by “liberals” who are using it as a smokescreen for their true, dastardly plan: to institute socialism in the United States.

(For the record, I think that there are a lot of legitimate concerns regarding privacy and whether or not the largely necessary expansion of government powers will be easily relinquished once the threat has cleared. But I find the efforts to trivialize the virus’ impact to be dangerous—and the insistence that everything be framed as but the backdrop for a deeper partisan scheme to be at once ludicrous, facile, and just sogoddamnexhausting).

The thing is, if you couldn’t tell from that oh-so-sly inclusion of “climate” there, Cornyn is a pretty ardent climate change denier. And there’s a pretty concerted effort among some GOP elites to start sowing doubt about the accuracy of “models” as part of a larger rhetorical attack on an issue that is an even greater existential threat to the species, but with a longer time horizon.

Thanks to my wonderful newborn daughter (no sarcasm there; I’m feeling truly ecstatic and blessed), I haven’t had a lot of sleep. So my first instinct was to respond with something sassy.


But while Cornyn is quite possibly not acting in good faith, I know that the vast majority of real-life conservatives are perfectly wonderful folk who are:

  1. Concerned to see the nation’s economy spinning down the drain
  2. Upset and anxious at the complete disruption of their lives
  3. Worried about the amount of power being exerted upon them by local and state authorities
  4. Confused over all of the changing information

Most people aren’t totally polarized on this bogus modeling point, so I thought I’d pop my science-communication hat on and write a thread about modeling, why it’s scientific, and how a model can be both informative and liable to change. I figured that it could actually be a decent lesson for some, so I thought I’d post the text of the thread here on the blog, with some minor corrections and augmentations, both because I spent a decent amount of time pecking it out while holding a baby in one arm and because Twitter isn’t a fan of emphasized text. (Although you’re more than welcome to check it out in its natural habitat with all the little goofs).

For those who aren’t here in bad faith: Hi! I’m a scientist. Let’s go ahead and have that honest conversation about modeling.

First and foremost, let’s tackle the idea that “science” must equal the “scientific method” that we learned in elementary school.

If you ask any practicing scientist, they’ll tell you that this concept of how science works is a good rule of thumb, but it’s often more complicated than that. Sometimes people observe the world without an overarching hypothesis to test. Sometimes the process of testing and theorizing and concluding is iterative, tracing back over itself multiple times. Sometimes it looks absolutely nothing like the idea we learned in grade school.

What we learned in grade school is, in actuality, a simplification: it’s an abstraction, relying on a set of assumptions that relax how closely it resembles the real world, and that lets us get a handle on the bigger, more complex reality it represents.

It is itself, ironically enough here, a model.

Models are used at various stages of the scientific process—in all its complicated glory. They’re used to help us theorize, they’re used to generate testable hypotheses, they’re used to test predictions, and they’re used to predict novel things based upon things we’ve seen and studied before.

Models aren’t perfect—again, they’re abstractions and simplifications. They can’t be perfect. And that’s good! Because otherwise they’d be useless—just like a map that’s a perfect 1:1 representation of the area it’s meant to convey.

So we make some simplifications. We make assumptions that make a problem or research question more tractable. Or, in keeping with the map analogy, we keep some parts of the picture and remove others to clarify things and help the map fulfill its purpose: getting us to where we want to go. Or at least getting us into the general area.

This is where deep subject-matter expertise comes in. Mapmakers know exactly which parts of reality they need to remove, accentuate, and stylize to help you get to where you’re going. Scientists do the same! Building off of generations of prior work, scientists have developed an ever-better sense of which parts need to be pruned and which emphasized to get something that, while obviously not reality, at least produces outcomes that we can “map” to reality with an incredible degree of usefulness.

But remember where I said the map at least gets us into the general area of where we want to go? I made that point intentionally, with the specific idea of predictive models in mind. Which is exactly the kind of model we’re talking about here. And predictive models, by the way, are just as much a part of the aforementioned wonderfully multifaceted scientific process as anything else!

Predictive models account for the fact that they’re abstractions. In addition to their estimates, they produce ranges of uncertainty. In fact, these ranges are usually the most important part, because nine times out of ten the specific point estimate predicted by the model is technically wrong.

Like, what are the odds that the temperature for tomorrow will be exactly what the weatherman predicted? Pretty slim. What are the odds that it’ll be pretty darn close? Pretty darn high.
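To make that concrete, here’s a minimal, purely illustrative sketch in Python of how a forecast can come packaged with an uncertainty range. To be clear, this is nothing like the actual COVID-19 models: every name and number here (the growth-rate range, the case counts, the simple exponential-growth assumption) is invented for demonstration. The underlying idea, though, is real: when you’re not sure about an input, you sample plausible values for it and report the spread of the outputs alongside the middle-of-the-road point estimate.

```python
import random

def forecast_cases(current_cases, growth_rate, days):
    """Project case counts forward under simple exponential growth."""
    return current_cases * (1 + growth_rate) ** days

def forecast_with_uncertainty(current_cases, days, n_sims=10_000):
    """Sample plausible growth rates and report the spread of outcomes.

    The 5%-25% daily growth range is made up for illustration; in a real
    model it would come from fitting the model to observed data.
    """
    sims = sorted(
        forecast_cases(current_cases, random.uniform(0.05, 0.25), days)
        for _ in range(n_sims)
    )
    point = sims[n_sims // 2]  # median: the "point estimate"
    low, high = sims[int(0.025 * n_sims)], sims[int(0.975 * n_sims)]  # 95% interval
    return point, low, high

point, low, high = forecast_with_uncertainty(current_cases=1_000, days=14)
print(f"14-day projection: ~{point:,.0f} cases (95% range: {low:,.0f} to {high:,.0f})")
```

The median is the headline number that gets quoted in the news; the range around it is where most of the actual information lives.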

But predictive models, and the estimates they generate, are entirely reliant on the data fed to them. In models that deal with the social world (like those dealing with COVID-19), that data is likely to change. Ergo, the predictions and the uncertainty ranges change too.

Sometimes that new information reflects changes in the situation on the ground. Sometimes it allows us to amend and correct some of the other simplifying assumptions that helped bring about the previous estimate. More often than not, it’s a little bit of both: the situation has obviously changed, but we’re also getting better information. That latter bit isn’t too hard to pull off because, as the name implies, it’s a NOVEL virus. Many of the important parts of the predictive model (rate of serious illness, ease of infection, how long someone is contagious, death rate, etc.) are getting updated with better and better information. Plus, there’s the obvious fact that we haven’t been testing as much as we should have, and that we’ve been doing pretty well at this social distancing thing. You know, aside from the slow start.
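And here’s that mechanism in miniature: a toy, back-of-the-envelope SIR-style simulation. Again, everything here is invented for illustration (the function, the parameter values, the one-million-person population), and the real models have far more moving parts and are fit to actual data. But it shows how revising just two inputs as better information comes in, the reproduction number and the fatality rate, is enough to pull the projected death toll way down.

```python
def sir_deaths(r0, fatality_rate, population=1_000_000,
               infectious_days=10, horizon=365):
    """Toy SIR model: project total deaths from a handful of parameters.

    All numbers here are hypothetical; this is a sketch of the mechanism,
    not a real epidemiological model.
    """
    s, i, r = population - 1.0, 1.0, 0.0   # susceptible, infected, recovered
    beta = r0 / infectious_days            # daily transmission rate
    gamma = 1 / infectious_days            # daily recovery rate
    for _ in range(horizon):               # simple one-day time steps
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return r * fatality_rate

# Early, pessimistic inputs vs. revised inputs once better data arrives:
print(f"Early estimate:   {sir_deaths(r0=3.0, fatality_rate=0.010):,.0f} deaths")
print(f"Revised estimate: {sir_deaths(r0=1.5, fatality_rate=0.005):,.0f} deaths")
```

Same code, same structure; new data, new answer. The lower number isn’t the model recanting, it’s the model doing its job.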

That said: I’ve seen a few people argue that the previous COVID-19 lethality models already assumed “maximum social distancing.” So what gives? While that’s true, we have to remember that the model they’re talking about is super complicated. It’s got a lot of moving parts. Social distancing isn’t the only factor that can cause the model to spit out a new, lower figure.

(But also, like, the number is changing because we’re doing new kinds of social distancing based on recommendations generated by the new data. So, while each iteration of the model assumes max distancing, what “max distancing” means has shifted throughout the outbreak).

Anyways, lots of things can change about the model as we get new data. It doesn’t mean that the model is bad. In fact, the model’s creators have been upfront about the fact that the estimates should decrease as we get better info on the virus and how to avoid spreading it.

Errors and revisions don’t mean that it’s a poor model. Models, and predictive models like these, are an important part of science—the errors and revisions don’t magically make them unscientific. The model is working off the best knowledge global experts have at their disposal right now. But that knowledge keeps improving, and the situation on the ground keeps evolving. Therefore, the data are updating. Therefore, so too are the estimates.

We should be celebrating every time the model predicts fewer deaths. Because every time it does, it’s because we’re learning more and getting more effective at preventing those deaths. And the entire reason we were able to do that is because of the previous models.

So when you hear people like Senator Cornyn question the entire pursuit of modeling as unscientific, I hope you now get a sense of how silly it is. Modeling is a foundational part of science. And the fact that the estimates are changing is a feature, not a bug.

Hopefully some of the ideas in here can be helpful to people when they’re thinking about predictive models—especially in areas like this one, with a lot of eyeballs watching, continuously improving data quality, and a naturally evolving world.