Causality in Complex Systems: GiveDirectly and Egger et al. (2022)
A little bit of Giving TuesDAG
Oh, Hello
I admit, it’s been a while since I updated the blog. Why not? It’s a mix of the fact that I had another baby this past year, and the fact that I just sorta didn’t feel like it.1 Geez, I probably should have posted something when the second edition of my book came out. Oops. That said, I’ve got a few outlines I’ve been working on and you can expect to see some more articles somewhat soon after this one.
Today is a special article though. As you may know, the subscription fees that subscribers pay for this blog have always gone to charity, and for most of that time the recipient has been GiveDirectly, an organization dedicated to just sorta sending money to people who need money. GiveDirectly reached out to ask if I would encourage people to donate today, Giving Tuesday, as a part of their Substackers Fight Poverty drive. So here we are.
Please do consider heading over to GiveDirectly and chipping in today, which comes with a 1.5x match of your donation. This year’s campaign is going to send cash transfers to three villages in Rwanda.
In addition to its efforts to send resources where they are needed (and trusting the people in those places to figure out for themselves how they can be best used), GiveDirectly also has a history of working with researchers to verify the impact of their work. So if you aren’t interested in giving for the sake of improving lives, maybe you’ll be swayed by their ability to convincingly untangle a rather thorny causal inference problem.
Shock to the System
Today we’ll be going over one of the pieces of research produced by GiveDirectly in coordination with researchers: Egger et al. (2022). You can go ahead and read the whole thing if you like; it’s open access.
This study covered a drive by GiveDirectly where they sent about $1,000 to each of over 10,000 rural households across 653 villages. This money comprised roughly 15% of the GDP of those villages. Importantly, the payouts were rolled out over time and assigned randomly. Using the randomized experiment they could pin down the impact of the funds, and they did find that households receiving the money showed meaningfully large increases in consumption and assets. That’s as you’d expect (and hope!).
They weren’t just able to get impacts on individual households, though. They were able to do something pretty rare and figure out the causal impact on the local economic system as a whole.
When we do causal inference, it’s often because we recognize that a raw correlation between two variables might exist for lots of reasons. Maybe one of those variables causes the other, but also maybe there’s some alternate reason they are correlated. Our goal is to find some experiment, or experiment-like thing, or subset of the variation (i.e. use control or matching variables) in which those alternate reasons don’t apply. That way we know that any remaining relationship between the variables must represent a causal effect.
This whole plan runs into a wall whenever you start studying a complex interlocking system where everything is related to everything else and feedback loops arise.
To see what I mean, consider the below causal diagram meant to study the impact of a teacher training program.
The teacher training program isn’t randomly assigned, but that’s okay. Maybe you’re more/less likely to take the training based on how good a teacher you already were, and the program might be more/less likely to be offered based on where you are, and both of those things also affect future student success, which the program is trying to improve. But just control for those two things and you should be fine to call the remaining relationship a causal effect.
Now let’s consider a far more complicated system, like say an economy where every part of the economy affects every other part of the economy. We want to know how the price of beans (x11) affects wages in the trucking sector (x19). So we draw up a nice little diagram:
Hmm, well, uh.
The problem isn’t just the complexity either, it’s the fact that these systems feed back into themselves. A classic example of this, again in education, is the attempt to estimate “peer effects” where you want to know whether, for example, having a super smart student in your class makes you a better student as well.
Doesn’t seem too bad, right? Regress each student’s performance on some measure of how smart their peers are, adjust for some variables that determine how students are sorted together, and that should do it, right? Well, no. This has long been known to be a very thorny problem. Among other issues, let’s assume that peer effects do work. Being around someone smart makes you smarter. Okay, so bring a smart person in the room and they make you smarter. Then you’re a smarter person… which should make them smarter. So then they make you smarter again? Even if you could show that exogenously introducing smart kids into classrooms improves everyone else’s scores, this back-and-forth reflection problem makes backing out the actual effect quite difficult.2
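A tiny simulation makes the reflection problem concrete. This is a hypothetical two-student model (not from the paper): each student’s score equals a baseline ability plus a structural peer effect `b` times their partner’s score, and we solve the simultaneous system for the equilibrium each pair settles into. A naive regression of one student’s score on the other’s then overstates the true `b`:

```python
import random

# Hypothetical two-student "reflection" model (illustrative only).
random.seed(0)
b = 0.3  # true structural peer effect (assumed)
pairs = [(random.gauss(50, 10), random.gauss(50, 10)) for _ in range(100_000)]

y1s, y2s = [], []
for a1, a2 in pairs:
    # Equilibrium of the simultaneous system y1 = a1 + b*y2, y2 = a2 + b*y1.
    y1 = (a1 + b * a2) / (1 - b**2)
    y2 = (a2 + b * a1) / (1 - b**2)
    y1s.append(y1)
    y2s.append(y2)

# Naive OLS slope of y1 on y2: cov(y1, y2) / var(y2).
m1, m2 = sum(y1s) / len(y1s), sum(y2s) / len(y2s)
cov = sum((x - m1) * (y - m2) for x, y in zip(y1s, y2s)) / len(y1s)
var = sum((y - m2) ** 2 for y in y2s) / len(y2s)
print(round(cov / var, 2))  # ~0.55, well above the true b = 0.3
```

The algebra behind the bias: with independent abilities, the naive slope converges to 2b/(1 + b²), which is about 0.55 when b = 0.3. The feedback loop bakes itself into the equilibrium scores, so the raw correlation mixes the effect of you on your peer with the effect of your peer back on you.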
More broadly, applied causal researchers live in fear of violating the assumption known as SUTVA, the Stable Unit Treatment Value Assumption. SUTVA is an assumption that’s necessary to get many different forms of causal inference to work as expected. If SUTVA breaks, then your estimate probably breaks too. In this case, the relevant part of SUTVA is the assumption that your outcome depends only on your own treatment status, not on anyone else’s. This fails whenever there are spillover effects.
If our data says “person A got the pill that cures hair loss, but their neighbor person B didn’t”, but A feels bad and shares some pills with person B, or if the pill creates a cloud of hair-loss-reducing bacteria that also happen to waft over into B’s garden, then we’ll see B get some of the effect too. Obviously, then, comparing hair loss between A and B won’t actually tell us the impact of the pill, since our “control” unit actually got a little treatment too.
You can see how this makes identifying any sort of effect in a big complex system very difficult. You can’t just do a regular causal inference task, even one with a thorny and complex causal diagram. You also need to worry about whether your treatment is actually your treatment or whether there are spillovers. You need to worry about whether your outcomes are feeding back into the system and affecting your treatment in a feedback loop.
Thankfully, none of these problems are fundamentally unsolvable. They just require that you properly account for all of them in your model, and happen to have the data to be able to estimate that model. So not fundamentally unsolvable. Just practically unsolvable. Unless…
Unsnarled
Those of us used to causal inference using observational data are used to being just a touch jealous of the experimentalists. After all, we have to come up with all these clever designs, spot randomization out in the wild, and do all this complex modeling, while the experimentalists just get to create their own exogenous variation and then compare some averages.
However, Egger et al. don’t have it that easy. If they just randomized the funds they gave out, they’d still run into issues with feedback loops and spillovers. Randomization doesn’t fix everything.
The thing that Egger et al. do that allows them to get at these complex dynamics is to not just randomize once, but rather to use a three-level randomization procedure. They have a large set of candidate villages, which are divided up into regions that might reasonably have spillovers on each other. They first randomized those regions into “high-saturation” and “low-saturation” regions. The second tier of randomization is the selection of actual villages to receive treatment. In the high-saturation regions, 2/3 of villages were randomly selected for the direct-funding treatment. In low-saturation regions, 1/3 of villages were selected.3 Within treated villages, individual households were not randomized: all eligible households were awarded funds. The third level of randomization is again at the village level: different villages received their funding at different times, in a randomized order.
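The assignment procedure is simple enough to sketch in a few lines. The region and village counts below are hypothetical placeholders (the real study covered 653 villages grouped into larger areas), but the three levels are the same:

```python
import random

# Sketch of the three-level randomization; counts and names are hypothetical.
random.seed(2)
regions = {f"region_{r}": [f"village_{r}_{v}" for v in range(9)] for r in range(10)}

# Level 1: randomize regions into high- vs low-saturation arms.
region_names = list(regions)
random.shuffle(region_names)
high_sat = set(region_names[: len(region_names) // 2])

# Level 2: within each region, randomize which villages are treated
# (2/3 of villages in high-saturation regions, 1/3 in low-saturation ones).
treated_villages = []
for region, villages in regions.items():
    share = 2 / 3 if region in high_sat else 1 / 3
    treated_villages += random.sample(villages, round(len(villages) * share))
# (Within a treated village there is no further randomization: every
# eligible household receives the transfer.)

# Level 3: randomize the rollout order of the treated villages.
rollout = random.sample(treated_villages, len(treated_villages))
print(len(treated_villages))  # 45 of the 90 toy villages end up treated
```

With 10 regions of 9 villages each, the 2/3 vs 1/3 split treats 6 villages per high-saturation region and 3 per low-saturation region, for 45 treated villages total under this toy configuration.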
Each of these layers of randomization accomplishes a different goal.
The first level of randomization, to a high-saturation or low-saturation region, produces exogenous variation in how much spillover we would expect. After all, if we’re interested in the effect of treatment on non-recipients, then it sure would be great to have random variation in how much treatment those non-recipients are around. And they get that! Nearby non-treated villages see double-digit increases in their expenditures, plausibly as a result of increased economic activity in those nearby villages (noting that less-nearby non-treated villages see basically no effect, and that we have a staggered rollout, that third level of randomization, to try to rule out shared economic trends).4
The second level of randomization, to receive or not receive the treatment, is the classic one, and we can use it to see a direct effect of the funding on local economic activity. Further, they can take another crack at local spillover effects by examining the impact of the policy on households that were ineligible to receive money (they also went up!).
One thing they can’t quite solve with their randomization is that feedback-loop problem. They follow up on their outcomes of interest (spending, assets, and so on) a year later. This might be enough time for the treated-village economic boom to affect nearby towns, which might then affect the original town again, muddying the waters of the effect (although feedback is unlikely in this case to produce a positive effect where there was none, since a feedback loop of no effect is still no effect).
They can address this a little bit by bringing in more theory. Because they are tracking spending, they can track specifically how much of the additional funds that came in to treated households were spent. Economists have a decent idea of how the marginal propensity to consume (the share of each additional dollar a household receives that gets spent) converts into activity through the rest of the economy as that funding bounces back and forth between people. So there’s not really data on reflection, but it can still be modeled and understood.5
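The bouncing-back-and-forth logic is just a geometric series. As a back-of-the-envelope sketch (the numbers here are illustrative, not the paper’s estimates): if households spend a fraction `mpc` of each dollar they receive, a transfer generates round after round of local re-spending, and the rounds sum to a closed-form multiplier:

```python
# Back-of-the-envelope multiplier logic; mpc and transfer are assumed values.
transfer = 1000.0
mpc = 0.8  # assumed marginal propensity to consume

# Sum the rounds of re-spending until they become negligible...
total_spending, round_amount = 0.0, transfer * mpc
while round_amount > 1e-9:
    total_spending += round_amount
    round_amount *= mpc

# ...which matches the geometric series transfer * mpc / (1 - mpc).
closed_form = transfer * mpc / (1 - mpc)
print(round(total_spending), round(closed_form))  # both ~4000
```

So a single $1,000 transfer with an assumed MPC of 0.8 would generate roughly $4,000 of downstream local spending, which is the sense in which the feedback can be modeled with theory rather than observed round by round.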
So that’s the plan. And once they have that in place, the rest of the analysis falls into place. Their analytic models aren’t all that complex: mostly just regressions with interaction terms (to allow impacts to be different between treated and spillover groups, or between eligible and ineligible households).
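For binary regressors like these, a fully saturated interaction regression is equivalent to comparing the four group means, so the logic can be sketched without any regression library. The effect sizes below are invented for illustration, not the paper’s estimates:

```python
import random
from statistics import mean

# Sketch of the interaction-term logic with invented effect sizes:
# with two binary regressors, the saturated regression coefficients
# are just differences of the four cell means.
random.seed(3)
rows = []
for _ in range(40_000):
    treat = random.random() < 0.5  # household's village got transfers
    elig = random.random() < 0.5   # household itself was eligible
    y = random.gauss(100, 10)
    if treat and elig:
        y += 30                    # direct effect on recipients (assumed)
    elif treat:
        y += 8                     # within-village spillover on ineligibles (assumed)
    rows.append((treat, elig, y))

def cell(t, e):
    return mean(y for tr, el, y in rows if tr == t and el == e)

effect_on_eligible = cell(True, True) - cell(False, True)
effect_on_ineligible = cell(True, False) - cell(False, False)
interaction = effect_on_eligible - effect_on_ineligible
print(round(effect_on_eligible), round(effect_on_ineligible), round(interaction))
# roughly 30, 8, and 22
```

The interaction coefficient is exactly the gap between the effect on eligible households and the effect on ineligible ones, which is what lets one regression report direct effects and spillovers at the same time.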
Egger et al. do something impressive here with their design and analysis, and get at some fundamental economic questions about how money moves in an economy that are very tricky to answer with data. Causal inference has trouble here, and often we fall back on theory when the data is weak. But none of these problems are unsolvable; you just need a model that allows for things like spillovers to happen, and the variation (which they created) needed to identify those models.
Causal inference can always tell us how to get around a causal inference problem! It might tell us that we have to do something impossible. But sometimes it just tells us to do something hard. GiveDirectly did the something-hard, and we (and the targeted recipients of funding, as well as their neighbors) are better off for it.
Again, consider donating to GiveDirectly’s Substackers Fight Poverty goal today. You can also do a paid subscription to this newsletter, which will end up going to them as well, but that will take longer and probably not get the 1.5x match, so I recommend donating directly instead.
I also started playing Magic again, which is maybe the first non-work-related hobby I’ve had in about ten years, so that can’t have helped. Pauper is a great format, y’all.
In my book I talk about how issues of cycles and feedback can be avoided by respecting time in your causal diagram. Feedback cycles require time to work, so if you incorporate that into your diagram, there’s no longer a feedback loop. However, in many applications this doesn’t actually solve the problem. How long do these peer effects take to work? Do we actually measure student performance often enough that we can measure “performance just after the smart student comes in and has a chance to influence the others but BEFORE those others have a chance to influence the smart student”? You can draw a causal diagram that uses time to get us out of this problem, but good luck getting the data.
A common critique of this kind of experimental design is that this potentially life-changing funding is withheld from some villages purely on the basis of random chance. However, in all applications like this there is only so much funding available to distribute. Distributing to all villages in the sample is not feasible. A design in which all sampled villages receive funds is possible, but it would require picking a smaller sample in the first place, again excluding villages on a likely arbitrary basis.
They can also do neat stuff like looking at who is the most direct beneficiary of those spillover effects. They find nearby-village spillovers are stronger for individuals who would be ineligible for funding than for individuals who would have been eligible if their village was selected. This makes sense, since ineligible families are more likely to be wealthier and have businesses that might see increased sales from a nearby increase in wealth and thus demand.
Data is really only needed to understand stuff we don’t already know about. We don’t collect data to prove 2 + 2 = 4 or whether parachutes keep you from dying if you fall out of a plane. If we think we have a pretty good theory, that doesn’t just fill a gap for the data, it makes the data we have much more meaningful and worth interpreting. Granted, in some cases we can disagree whether our theory is actually good enough (and data helps there).



