My notes from part 2 of the American Evaluation Association (AEA) eStudy course being facilitated by Jonny Morell.
- one person commented that “measuring collective impact of small effects of multiple programs is almost always shot down by stakeholders in favour of measuring process outcomes” and Jonny talked about how evaluators are typically engaged to evaluate a single program, so if we tried to measure outcomes from other programs, we’d get shot down for wasting resources and going outside of the scope that we were hired for
- this got me thinking about boundaries (something I’ve been reflecting on a lot lately as part of a group that I’m working with). The “scope” of a project is a boundary and it makes sense for a program to bound this scope to the work that their program is doing. They have limited funds and don’t want to spend them evaluating the broader system in which they are working. But the organization operates within that system and if change really does happen by a bunch of programs contributing little bits of outcomes that all accumulate – how would we ever see that?
- There are “collective impact” initiatives and I’ve seen evaluations of those, but the ones I’ve seen tend to be cases where a single funder is funding a bunch of programs that all set out to improve the same thing, and wants to evaluate across all of them.
- But what about programs that aren’t linked through a collective impact project – what about all the sorts of programs running in the world that just happen to affect similar things?
- [Crazy idea: what if someone (like a philanthropic foundation) funded an evaluation of the impact of an entire system that relates to some issue – say, poverty, for example – with the freedom to go and investigate whatever programs/services/initiatives the evaluative process uncovers. Is anyone doing something like this?]
Timing of Effects
- we don’t often talk about “how long will it take?” for the effects in a program model to happen. So even if we say something is an “intermediate effect”, how long does that actually mean? Often effects don’t happen as soon as people expect, or as soon as they would like.
- also, sometimes things need to hit a tipping point, so you might not see effects for a long time, and then you see a big effect. This challenges people’s “common sense” feeling that things will be linear (you put in a bit of work, you get a bit of effect, you put in more work, you get a bit more effect).
- “Success may mean that the rich get richer. In a very successful program, benefits may not be symmetrically distributed. Evaluation methodology, straightforward. The politics and values? Not so much”
- example: an agriculture program that leads to increased crop yield is expected to improve family standard of living. But does that improvement get evenly distributed?
- Looking at distributions is important!
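
To make the point about distributions concrete, here is a minimal Python sketch. The income gains are invented numbers (not data from the course or from any real program), and `top_share` is a hypothetical helper. It only shows how a healthy-looking average can hide the fact that most of the benefit went to a few households.

```python
from statistics import mean, median

# Hypothetical change in household income after the program (arbitrary units).
# These values are invented purely for illustration.
gains = [0, 0, 1, 1, 2, 2, 4, 30]

def top_share(values, fraction=0.25):
    """Share of the total gain captured by the top `fraction` of households."""
    ordered = sorted(values, reverse=True)
    k = max(1, round(len(ordered) * fraction))  # number of households in the top group
    return sum(ordered[:k]) / sum(ordered)

print("mean gain:", mean(gains))
print("median gain:", median(gains))
print("share of total gain going to the top quarter:", top_share(gains))
```

With these made-up numbers the mean gain looks respectable, but the median household gained far less and the top quarter of households captured most of the total.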
Three ways to use complexity
- e.g. thinking about a program as an organism evolving in an environment
- instrumentally – would need to do a lot of math and specific data
- conceptually – how might this program change? does it become more or less adaptable to its environment? does this program compete for resources from the environment with other programs? thinking about the program in this way changes how I think about the program
- metaphorically – e.g. chaos has a very technical meaning, but it’s not useful in evaluation because we never let chaos happen – we never let feedback loops go on uncontrolled. We intervene when things start to go off the rails. But the notion of chaos, of repeated patterns that can’t be controlled or predicted, can still be useful as a metaphor
- [I don’t understand that difference between conceptual and metaphorical use – going to post a question about this on the workshop discussion site]
Cross-cutting themes in Evaluation
- whenever you are thinking about complexity, need to think about:
- patterns
- predictability
- how change happens
- without thinking about complexity, 3 ways we think about change:
- from the outside: take a systems view; events in a program’s environment make a difference
- expected causal relationships: identified model content in terms of elements and relationships
- traditional social science theories: usual paradigmatic stuff depending on your background (e.g., economics, sociology)
- when you add complexity to the mix:
- emergence – change cannot be explained by the behaviour of a system’s parts
- sensitive dependence – small (sometimes random) changes can affect an entire trajectory over time. We usually think of a linear model – we only care about groups, we want a large n, we don’t want to see those little other things
- limitations of models – models simplify, causal dynamics are going on that are unknown. (explicit or implicit models, quant or qual). Remember “all models are wrong but some are useful”. They help us identify things we care about and come up with methods, but need to remember that they aren’t perfect
- evolutionary dynamics – think of programs as organisms evolving in a diverse ecosystem. Helps him to think of this as a metaphor
- preferential attachment – on a random basis, “larger” becomes a larger attractor.
- Jonny thinks it’s useful to think about each of these in an evaluation – you may not necessarily need to use them, but it’s worth thinking about whether they could be useful
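
Preferential attachment is easier to see in a toy simulation than in a definition. The sketch below is an assumption-laden illustration, not something shown in the course: a network grows one node at a time, and each newcomer links to an existing node with probability proportional to how many links that node already has. The function name and the starting network are hypothetical choices.

```python
import random

def preferential_attachment(n_new, seed=0):
    """Grow a network one node at a time; each new node links to an
    existing node chosen with probability proportional to its degree."""
    rng = random.Random(seed)
    degree = {0: 1, 1: 1}   # start with two nodes joined by one link
    endpoints = [0, 1]      # each node appears once per link it has
    for new in range(2, 2 + n_new):
        target = rng.choice(endpoints)  # picking an endpoint = degree-proportional choice
        degree[target] += 1
        degree[new] = 1
        endpoints += [target, new]
    return degree

degree = preferential_attachment(2000)
ranked = sorted(degree.values(), reverse=True)
print("five largest degrees:", ranked[:5])
print("median degree:", ranked[len(ranked) // 2])
```

Every node follows the same rule, yet a few early or lucky nodes end up with many links while the typical node has very few: “larger” becomes a larger attractor.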
You can use simple methods to evaluate in situations of complexity
- e.g., attendees of a program may affect their (non-attendee) friends. And their friends may also know each other. And the attendees may affect one another too. And maybe there are community-wide effects too. And maybe those effects might feedback and change the program too.
- you could track the program over time (to see if there is a feedback loop from community to the program)
- you could interview staff about their perceptions of needs
- there are unpredictable changes in the community – you could do a content analysis of community social media; you could do open-ended interviews of community members
- program theory – you can specify desired outcomes (and you can measure them); you can’t specify the path to the desired outcomes in the beginning – but you can track stuff and look at it post hoc
- or you may decide that it is worth using fancy tools (such as agent-based or system dynamics modelling; formal network analysis)
Networks
- network structures can tell us a lot about relationships
- even without doing fancy calculations, sometimes just looking at a network structure can be revealing
- in some evaluations, it is worth doing network analysis
- fractal structures
- example of a healthcare system – primary, secondary, tertiary health care
- primary clinics feed into secondary clinics, and secondary clinics feed into a tertiary system – if the link between the secondary and tertiary clinics breaks, the whole thing falls apart
- fractal structure: unless you know the scale, you can’t tell how close or far away you are from it (e.g., snowflake, vascular system of the human body)
- leads to robustness – whereas if you only have one link (e.g., the only way to get into the tertiary clinic is referral from one secondary clinic), then if that link breaks, the whole system is wrecked
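
The single-link problem in the clinic example can be checked without any fancy network tools. Here is a minimal sketch using only the Python standard library. The clinic names, the two referral networks, and the helpers `reachable` and `survives` are all hypothetical, made up to mirror the example above.

```python
from collections import deque

def reachable(links, start):
    """Return every node reachable from `start` over undirected links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical referral networks (names are illustrative only).
single_path = [("primary_A", "secondary_1"), ("primary_B", "secondary_1"),
               ("secondary_1", "tertiary")]
redundant = single_path + [("primary_A", "secondary_2"),
                           ("primary_B", "secondary_2"),
                           ("secondary_2", "tertiary")]

def survives(links, broken):
    """Can primary_A still reach tertiary care once `broken` is removed?"""
    remaining = [link for link in links if link != broken]
    return "tertiary" in reachable(remaining, "primary_A")

broken = ("secondary_1", "tertiary")
print("single path survives:", survives(single_path, broken))  # False
print("redundant survives:  ", survives(redundant, broken))    # True
```

Removing the one secondary-to-tertiary link cuts off the first network entirely, while the second network still has another route.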
Competing Program Theories
- we can have different, competing program theories
- e.g. one theory might be that increasing air pollution controls and increased use of clean fuel sources –> decreased air pollution and increased economic growth (which is a theory that those who endorse more air pollution controls and promoting the use of more clean fuels might suggest)
- but another theory might be that air pollution controls –> decreased air pollution, but increasing clean fuel sources –> increased cost of doing business –> slowed economic growth (which is a theory that those who oppose more air pollution controls and promoting the use of more clean fuels might suggest)
- what would it take to activate one or the other program theory? it might be (a) small change(s). And it’s not really knowable/predictable which events will tip the balance
- in complex systems, small changes can lead to big results
- simple programs can exhibit complex behaviours
- so it’s always worth thinking about “might there be complex behaviours going on?”
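
A standard textbook illustration of “small changes can lead to big results” is the logistic map, a one-line rule whose trajectories diverge from nearly identical starting points. It was not part of the course material as far as these notes record, and it is a mathematical toy rather than an evaluation method. The function name, the parameter `r=4.0`, and the starting values are choices made for this sketch.

```python
def logistic_trajectory(x0, r=4.0, steps=50):
    """Iterate x -> r * x * (1 - x), a very simple rule with complex behaviour."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

# Two starting points that differ by one part in a million.
a = logistic_trajectory(0.200000)
b = logistic_trajectory(0.200001)
gaps = [abs(x - y) for x, y in zip(a, b)]

print("gap at the start:", gaps[0])
print("largest gap in the first 10 steps:", max(gaps[:11]))
print("largest gap over all 50 steps:", max(gaps))
```

The two runs stay close for a while and then end up in completely different places, even though nothing about the rule itself changed.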
How much do you need to know about complexity?
- his argument by analogy:
- how much do you know about a t-test?
- if you know what it is appropriate for, and that most people accept p < 0.05 as a level of significance, you can probably use the t-test reasonably – you can probably make sense of it
- but there is lots more to know about the t-test – things like the distribution of data, underlying theory, there’s a whole argument about whether the level of 0.05 is really appropriate, central limit theorem, definition of degrees of freedom etc., etc.
- do we need to know all of that deeper stuff to do a decent job of using a t-test? probably not. We’d be better at doing it if we knew all the underlying stuff, but there’s no magical amount of stuff that we can say we “need” to know
- he thinks it’s similar with complexity – knowing more is better, but hard to say how much is “enough”
Feedback Loops
- “feedback loops can produce nonlinear behaviour”
- but the nature of those feedback loops matters – things like how long the lag for a feedback is (shorter lag = quicker loops)
- it was very interesting to see lags added into a program logic model and see how that affected the overall timeline
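
Here is a rough sketch of what adding lags to a logic model does to the timeline. The steps, the lag values (in months), and the `earliest_month` helper are all hypothetical. The sketch also assumes the model has no loops, so it only shows how lags stack up along a chain, not what a feedback loop would do.

```python
# Hypothetical logic model: each step lists its prerequisites and the lag
# (in months) before it shows up once those prerequisites are in place.
model = {
    "training delivered": {"after": [], "lag": 0},
    "skills improve":     {"after": ["training delivered"], "lag": 3},
    "practice changes":   {"after": ["skills improve"], "lag": 6},
    "peer uptake":        {"after": ["practice changes"], "lag": 9},
    "community outcome":  {"after": ["practice changes", "peer uptake"], "lag": 12},
}

def earliest_month(step, model, cache=None):
    """Earliest month a step can appear: its own lag after its slowest prerequisite."""
    cache = {} if cache is None else cache
    if step not in cache:
        start = max((earliest_month(p, model, cache) for p in model[step]["after"]),
                    default=0)
        cache[step] = start + model[step]["lag"]
    return cache[step]

for step in model:
    print(f"{step:20s} month {earliest_month(step, model)}")
```

With these invented lags, the “community outcome” cannot show up before month 30, which is a lot later than a logic model drawn without lags tends to suggest.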