Model, Model, Who’s Got the Model?
On the limits of formal inference relative to down home thinkin'. Or, sometimes, things that are expensive are worse.
If you’re reading this blog, there’s a good chance you’re already familiar with the idea of causal inference, and how important it is to produce estimates that identify the causal parameters behind whatever causal questions you happen to have about the world. After all, that’s how you avoid classic errors of causal reasoning like you see in causal inference textbooks (like mine). One classic error that proper causal inference estimation would help you avoid is something like “I observe that people in hospitals tend to be more sick than people outside hospitals, and therefore hospitals cause disease; let’s shut down the hospitals.”
How ludicrous! And so the book notes the error, the reader agrees, and then the book goes on to show you the kinds of approaches to data analysis and estimation that would let you avoid such an error in the future.
Except… there’s a piece missing here. There must be. That example of faulty reasoning was used to justify an introduction to methods that let you perform causal inference so as to avoid being tricked by the problem… except that we already spotted the problem. And we already could take that problem into account when interpreting any ol’ raw correlation between sickness and being in the hospital.
In fact, we already have a rough idea of what the actual causal effect is likely to be, relative to the correlation - that’s what makes this a good example in the first place! The example is supposed to be a justification for formal causal modeling, but it very self-evidently does not actually need any formal causal modeling to get the broad strokes right.
We have a model that lets us address causal problems in data analysis without producing an estimate that is, itself, representative of a causal effect. The idea that causal inference is about producing a causal estimate is, well, incomplete.
Using that internal, human model for interpreting non-causal estimates in a causal way is not only possible, but in many cases it may actually be better. Which is good, because it’s definitely the most common way we do it.
What Causal Inference in Data Analysis is For
It would be very odd for me, of all people, to argue that we don’t need formal causal modeling. Accordingly, that’s really not the case I’m making. But we should be clear on what formal causal inference actually does get us.
The purpose of formal causal modeling, and all those fancy statistical methods designed to get us a causal effect, is to provide a specific estimate of the effect of one thing on another. The goal is for the answer to appear directly in the estimation: an “identified” estimate that represents only the causal impact you’re interested in learning about.
Generalizing from another area, basic observation lets us see that things fall to the floor. More thorough observation of planets can tell us that this has something to do with masses attracting each other over distances. Formal theory tells us the actual formula for gravitational forces, based on the masses of the objects in question, the distance between them, and a gravitational constant. Precise experimentation and data analysis tells us what calculations you can do to derive the actual gravitational constant.
For an example in the social sciences, we have a general notion that if children are properly nourished, they’ll do better in school. We can look at the data and see that student test scores tend to be better in places with a school-lunch program, but also we know that that correlation is confounded. Precise causal identification can isolate the effect of school lunches on test scores and let us say something like “giving a child school lunches increases their test scores by X.”
Only in the last step do we actually need formal statistical modeling. And getting that precise estimate is important in a lot of cases! There’s a reason we put all this effort into getting it.1 But we got pretty far without it, didn’t we?
The Human Touch
Unsurprisingly, people who like to do statistical causal inference are chock full of examples of ways that humans make mistakes about causal inference and the consequences of doing so. But it’s not really plausible that we’re that bad at it. We need to learn and recognize basic causal relationships to survive; evolutionarily, it stands to reason that we’d be capable of figuring stuff out at least to some extent. Pointing out places where humans can be tricked into making errors because they fall short of some formal standard, without asking whether they are in fact succeeding at some more appropriate, and pragmatically equivalent, task, is an error in our reasoning, not theirs.
It’s not even the case that we’re good at causal inference in everyday tasks but awful at it in ones that require the use of more-general data. We’re actually not too bad at that either. Anyone who has taught a class on causal inference and had students talk through their reasoning, or even just read a comment thread under any statistical result, knows that even without sophisticated statistical training, people are in fact pretty darn good at spotting sources of confounding, often without even being prompted, and in many cases can even figure out how adjusting for the confounder would shift the finding.
Post a statistic about how carbon emissions are falling in rich countries? Scroll down about an inch to find a bevy of people explaining that it’s just that those countries are exporting those emissions to poorer nations.
Sure, when it comes to political hot-button issues and internet reply threads, we’re far more motivated to look for confounders when we want a way of dismissing a finding we don’t like,2 but that doesn’t change the fact that we’re more than equipped to do it, even if we make mistakes sometimes. Sick people and hospitals, people taking credit for stuff that was clearly going to happen anyway, some rich-person thing being associated with rich-person outcomes, the list goes on; we can spot this stuff. Talk to someone about an area they know a lot about and have a vested interest in actually finding the right answer and you’ll find someone who would be right at home raising their hand on the title slide and proposing threats to identification in an economics seminar.
Where Formal Inference Falls Short
OK, so as humans we have a basic intuition about causality, even in statistical applications, and can sometimes figure stuff out on our own. So what? We’ll also get that stuff wrong a lot of the time, and this won’t give us an accurate and clearly identified estimate of what relationships are there. This is clearly just a second-best to causal identification.
Except that “getting it wrong a lot of the time and not getting an accurate and clearly identified estimate”, uh, also describes what we get from formal causal identification a lot of the time.
Yes, doing proper formal causal inference and then applying it to an appropriately selected and large data set where our necessary assumptions apply will give us a better and more precise answer than just sorta thinking about it for a while and using our intuition. But that perfect storm doesn’t actually occur all that often, and we know that. Proper causal inference often relies on an experiment or quasi-experiment that doesn’t actually exist, or some attempt to shakily mimic one, or some very confident statements about knowing the model of the world.
There are a lot of assumptions, acknowledged and otherwise, that must be correct in order for our identification strategy to actually identify what we say it does. And even when our causal inference ducks are actually in a row, there’s no shortage of opportunities to make statistical mistakes in applying them. We have plenty of examples of these kinds of errors in the past, just like we have plenty of examples of mistakes in applying intuition.
The formal-identification process operates on a basis of attempting to find the correct answer. The intuitive process doesn’t. We imagine that when there are flaws in both, then surely the formal-identification process is still closer to that correct answer, since it was the option trying to find the correct answer in the first place. But that’s not really how real-world inference, statistical or otherwise, works, pragmatically.
Honestly, unless you really know what you’re doing, I suspect there’s a good chance you’ll get closer to the truth with an intuitive approach than with a sloppy attempt at formal identification.
One approach to formal causal inference that we might consider a bit more honest is partial identification, where we acknowledge that there are assumptions we must make to do formal causal inference, but that we don’t know the real world well enough to make them with confidence. So we acknowledge the gap and ask what range of conclusions is consistent with what we do know - perhaps the direction our bias runs in, but not enough to pin down a precise value.
Human intuition about causal inference, when done well, is sort of like an extreme version of partial identification. We know there’s some bias left in what we see. We don’t try to pin it down to a specific value, but instead reason about what a likely range of actual results might be (even if it’s just as generic as “the correlation is probably weaker than what I see in the data”). The main difference between what partial identification does and what intuitive approaches do is that partial identification has a formal model do the work, whereas with intuitive approaches it happens in the brain.
And that’s the big difference - causal inference always requires a model. Formal inference puts that model down on paper and makes it do the statistical work of producing an estimate. The model is part of the estimation process. Intuitive inference puts that model in your head; it’s something to be applied to the statistics after they’re done. The model is part of the interpretation process.
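To make that concrete, here’s a toy sketch of the partial-identification move, with entirely made-up numbers: rather than reporting a point estimate, report the range of effects consistent with a hedged assumption about the confounding bias.

```python
# Toy sketch of partial identification: combine the raw (confounded)
# association with an assumed range for the bias, and report an
# interval rather than a point estimate. All numbers are made up.

def bounded_effect(raw_estimate, min_bias, max_bias):
    """Interval for the causal effect, assuming
    true effect = raw estimate - bias, with bias in [min_bias, max_bias]."""
    return (raw_estimate - max_bias, raw_estimate - min_bias)

# "The correlation is probably weaker than what I see in the data":
# we observe +0.40 and believe confounding added somewhere between 0 and 0.25.
low, high = bounded_effect(0.40, min_bias=0.0, max_bias=0.25)
print(f"effect plausibly between {low:.2f} and {high:.2f}")
# effect plausibly between 0.15 and 0.40
```

The intuitive version runs the same subtraction in your head, just without writing down the interval.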
The Market Test
My point is not just that people can sometimes get causal stuff right in their personal decision-making or thinking. Rather, this more intuitive approach to causal inference can actually be highly useful in cases where you’re making real-world decisions. In particular, in my work in consulting, I find this to be a heavily favored approach to causal inference.
Instead of having an analyst produce an extremely careful piece of causal inference, a more common approach is to have them produce a more basic descriptive piece of work. Then, after the fact, businesspeople will use their intuition to adjust the results for confounding that they anticipate.
Let’s say you want to know whether your promotions last year did a good job. You could try to get a team to do some careful quasi-experimental study that produces a per-dollar causal effect of promotional spending. Or you could get a baseline correlation between spending and sales, and then apply your domain knowledge to say things like “alright, we see that a dollar of promotions is associated with three dollars of sales… but product X had a viral thing and probably doesn’t count, can you rerun without that? $2 now. And we were favoring promotions on popular items so this is picking up some of that and we can shade it down. But even if that’s half of the effect we’re still at like a 50% return. I don’t know if that’s right but it does at least seem positive.”
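That back-of-envelope chain of adjustments can be written out explicitly. This is just the dialogue’s made-up figures restated; the point is the shape of the reasoning, not the numbers.

```python
# Back-of-envelope version of the adjustment chain in the dialogue.
# Every figure here is illustrative, not real data.

raw_sales_per_dollar = 3.00   # raw association: $3 of sales per $1 of promotion
after_dropping_viral = 2.00   # rerun without the viral product X
selection_share      = 0.5    # assume up to half reflects targeting popular items

causal_low  = after_dropping_viral * (1 - selection_share)  # $1.00 per dollar
causal_high = after_dropping_viral                          # $2.00 per dollar

# Net return per promo dollar (incremental sales minus the dollar spent):
return_low, return_high = causal_low - 1.0, causal_high - 1.0
print(f"plausible return per promo dollar: {return_low:+.2f} to {return_high:+.2f}")
```

Somewhere between break-even and a dollar of net return: imprecise, but enough to say the program at least seems non-negative.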
This is off-the-cuff, imprecise, and you can imagine a bunch of ways this could go wrong. But is it really a mistake to do this over something more precise? Imagine you wanted to answer the same question more formally. There are plenty of studies that do so, but what bet would you place on these studies being bulletproof? And even if they were, the data requirements of formal causal inference often mean that you can’t actually study all the cases you want, so you’d probably end up having to use an estimate of some other set of promotions than the ones you want to know about - decent chance that the external validity of that other study is even weaker than the internal validity of the “mentally adjust the correlation” approach you just took. Not to mention much slower and more expensive. And as a businessperson you’re less likely to understand where the answer came from.
Not to say that the formal approach is never the best option. If you do have the time and opportunity to do things right, and getting a precise answer really is important, then have at it. Certainly in industries where experimentation (often via A/B testing) is cheap and fast, it seems to be useful and successful.
But don’t confuse the power of formal causal inference where it works (if it does) with the idea that it’s the only way, or even always the best way, of learning about causal relationships in the world. I don’t think it’s a mistake that, in many applications, we choose to go another way.
1. The value of the precise estimate is one of many reasons why one of my least favorite social-media sentiments - the “look at this empirical study with a causal finding that confirms my political beliefs, what a waste of time for the researchers to just confirm this already-obvious truth I already knew 🙄” - is wrong. I find it hard to take anyone seriously after I see them do this.
2. With the carbon-emissions example, the response is to some degree more a political talking point people are repeating than a conclusion they’ve derived on their own, and you’ll find the exact same replies even if the original statistic already accounts for the exporting of emissions.
Of course, this all goes the other way as well, where we’re not that motivated to look for confounders if it would be convenient to avoid doing so. Like, US presidents probably aren’t that responsible for the job market going up and down, most of the time. But whether someone says everyone losing their jobs in 2020 and regaining them in 2021 is because of (a) COVID, or (b) Trump is bad at the economy and Biden is good at it, is going to be at the very least correlated with their opinion of those two presidents otherwise.