HACKER Q&A
📣 yosito

Are counterfactual models reliable science?


I've noticed a lot of news related to pandemic, climate and other scenarios that compares actual measurements to a "counterfactual" model, where an alternative reality is simulated which "would have happened" if key facts were changed. For example, using a model for how many people would have been hospitalized for covid in a country if the country had been unvaccinated, and comparing that to real hospitalization numbers to make claims about how many lives were saved.
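To make the comparison concrete, here is a minimal sketch of the arithmetic behind a "hospitalizations averted" headline. Everything in it is hypothetical: the function name, the numbers, and above all the assumption that vaccinated people would otherwise have been hospitalized at the same rate as unvaccinated people. Real studies use far more elaborate models, and the estimate is only as good as that kind of assumption.

```python
def counterfactual_hospitalizations(pop_vaccinated, pop_unvaccinated, hosp_unvaccinated):
    """Estimate total hospitalizations had nobody been vaccinated.

    ASSUMPTION (hypothetical, for illustration only): without the vaccine,
    vaccinated people would have been hospitalized at the same per-capita
    rate as the people who actually stayed unvaccinated.
    """
    total_population = pop_vaccinated + pop_unvaccinated
    # Scale the observed unvaccinated hospitalization rate to everyone.
    return hosp_unvaccinated * total_population / pop_unvaccinated


# Made-up numbers, not taken from any real study.
pop_vaccinated = 800_000
pop_unvaccinated = 200_000
hosp_vaccinated = 400
hosp_unvaccinated = 1_000

observed = hosp_vaccinated + hosp_unvaccinated
counterfactual = counterfactual_hospitalizations(
    pop_vaccinated, pop_unvaccinated, hosp_unvaccinated
)
averted = counterfactual - observed

print(f"observed: {observed}, counterfactual: {counterfactual:.0f}, averted: {averted:.0f}")
```

The "averted" figure is the difference between a measurement and a model output, so it moves whenever the modelling assumption changes. If, say, unvaccinated people were on average younger or older than vaccinated people, the single shared rate above would be the wrong stand-in.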

I am not skeptical that vaccines save lives, but I am skeptical that these counterfactual models can be used to accurately quantify specific truths. From my layman's perspective, it seems like comparing reality to a fictional scenario and describing the difference. Like saying, "the moon contains 100% less cheese than it would if it were made of cheese" or "the average global temperature is 1 degree higher than it would have been if it were 1 degree lower".

I don't understand the science or math behind counterfactual models, but I'm sure there are people on HN who do. So my question is: when I see studies based on them, do those studies tend to be useful, reliable, trustworthy science, or is it often people using fancy math to manufacture evidence to promote their own opinions?


  👤 Phithagoras Accepted Answer ✓
“All models are wrong, but some are useful” - British statistician George E. P. Box. Any model can simply be built badly, and even well-designed models give garbage results when fed garbage data.

Let's consider groundwater contaminant models. Do they precisely predict what is seen onsite? Most of the time, engineers never have the budget to find out how right or wrong the model is. They start with sparse data and apply principles that we know are "wrong" but that work well enough at the scale of the problem (i.e. flow over 500m rather than flow around a single sand grain). At the end, they typically only know whether or not the original problem is still a problem for the client. If it's solved, the model is right (enough)!

It's more important to consider the implications of the model for the most practical questions possible. Colleagues of mine are trying to tune climate change models based on data from ancient lake sediments. Personally, I don't care whether their model predicts 2 degrees of change or 1 degree. The broad strokes of the climate picture are not cheerful, and the mathematical and field-specific details of such models are intricate yet simultaneously boring as hell. There is so much to know that arguments are best left to people who care enough to complete a PhD in the subject.

Widespread bikeshedding over a particular model X, Y, or Z is actively harmful to solving the global problems we'll face, like whether to build dikes and levees or to just displace huge swathes of the population to higher ground. How can we develop infrastructure that can transport food no matter what, or take water from flooded areas to drought-stricken ones? Good climate and sea level rise models could help with specific details (5m dike or 10?), but the fundamental sociopolitical questions don't require much numeric precision. (A 50-year dike, or a 10-year dike so everyone can move?)


👤 brudgers
Counterfactuals are rational.

You turn the steering wheel to avoid hitting oncoming buses.

And don't question why.

Don't argue, I can't be certain.