Intro to Philosophy – Week 4 – Morality

  • the first lecture explored the “status of morality” – not “is this moral statement correct?” but rather “what is it that we are doing when we make moral statements? are moral statements objective facts? or are they relative to culture or personal perspective? are they emotional?”
  • empirical judgments are things that we can discover by observation (e.g., the earth rotates around the sun; electricity has positive and negative charges; the Higgs boson exists; it was sunny today)
  • moral judgments are things that we judge to be right/wrong, good/bad (e.g., it is good to give to charity; parents are morally obliged to take care of their children; Pol Pot’s genocidal actions were morally abhorrent; polygamy is morally dubious)
  • 3 questions to ask about these judgments
    1. are they the kinds of things that can be true/false, or are they merely opinions? (empirical judgments can be true/false; some philosophers think that moral judgments are merely opinion, though others disagree)
    2. if moral judgments can be true/false, what makes them true/false?
    3. if they are true, are they objectively true? (or only true relative to a culture or personal perspective)
  • three broad approaches that philosophers have taken to these questions: objectivism, relativism, and emotivism

Objectivism

  • “our moral judgments are the sorts of things that can be true or false, and what makes them true or false are facts that are generally independent of who we are or what cultural groups we belong to – they are objective moral facts”
  • in this approach, if people disagree about morality of something, they are seen as disagreeing over some objective fact about morality
  • e.g., genocide is morally abhorrent – this seems to be something that can be true/false, and seems to be objectively true (if someone disagreed, we’d probably think they are wrong!)
  • e.g., polygamy is morally dubious – but many cultures practice it – perhaps it isn’t objectively true – so this example argues against objectivism
  • objection to objectivism: how can we determine whether a moral claim is true? We can’t observe its truth the way we can with empirical judgments.
    • potential responses to the objection: if you take the position that what is right is what maximizes overall happiness, then you can observe which option maximizes overall happiness to make your moral judgments. Or you can point to mathematics: there are mathematical facts that we can know without observing them in the physical world – instead, we reason our way to them. So perhaps we can do the same with morals.

Relativism

  • “our moral judgments are indeed things that can be true or false, but they are only true or false relative to something that can vary between people”
  • e.g., the statement “one must drive on the left side of the road” – is true in Britain, but false in the US (so it’s a statement that can be true or false, but whether it is true or false is relative to where you are)
  • e.g., polygamy is morally dubious – can be true or false, but depends on your culture
  • e.g., Oedipus sleeping with his mother was morally bad – (remember, he didn’t know it was his mother) – if you consider incest wrong, is it wrong across the board or only wrong if you know?
  • subjectivism: a form of relativism where “our moral judgments are indeed true or false, but they’re only true or false relative to the subjective feelings of the person who makes them” “X is bad” = “I dislike X”
    • subjectivism has a hard time explaining disagreements
  • cultural relativism: a form of relativism where “our moral judgments are indeed true or false, but they’re only true or false relative to the culture of the person who makes them.” “X is bad” = “X is disapproved of in my culture”
  • objection to relativism: it seems like there is moral progress (e.g., people used to think that slavery was morally OK, but now we’ve progressed to say that slavery is morally wrong). However, under a relativist view, you’d say that slavery was morally acceptable relative to that time and culture. So relativism does not seem to allow for moral progress.
    • potential answer to the objection: cultures overlap and change over time – so, for example, if you consider “America” a culture, its standards can shift, and later judgments can count as progress relative to the culture’s own evolving norms
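
A tiny way to picture the relativist idea (my own sketch, not from the lecture): the same sentence gets evaluated against a context, just like the driving example above.

```python
# Tiny sketch (mine, not the lecture's): for the relativist, a claim is
# evaluated against a context, the way "one must drive on the left" is.

def must_drive_on_left(country):
    """True or false only relative to where you are."""
    left_side_countries = {"Britain", "Japan", "Australia"}
    return country in left_side_countries

print(must_drive_on_left("Britain"))  # True
print(must_drive_on_left("US"))       # False

# Cultural relativism treats "X is morally dubious" the same way:
# "X is bad" = "X is disapproved of in my culture".
def morally_dubious(practice, culture_norms):
    return practice in culture_norms["disapproved"]

print(morally_dubious("polygamy", {"disapproved": {"polygamy"}}))  # True
print(morally_dubious("polygamy", {"disapproved": set()}))         # False
```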

Emotivism

  • “moral judgments are neither objectively true/false nor relatively true/false. They’re direct expressions of our emotive reactions”
  • objection to emotivism: we reason our way to moral conclusions – e.g., you might say “it’s wrong for Oedipus to sleep with his mother,” but then someone says “But he didn’t know it was his mother” and you reason “OK, he can’t be held morally responsible since he didn’t know.” But emotivism says that moral judgments are based only on emotions.
    • potential answer to the objection: even emotive evaluations answer to reason – e.g., if you prefer A to B and prefer B to C, but then prefer C to A, that’s irrational. So we do use reason when it comes to emotions/preferences (see the sketch after this list).
  • some people in the class discussion asked questions like “Can’t there be a universal principle that unites objectivism and relativism?” E.g., a relativist might say “Women should wear headscarves in some cultures but not others,” while an objectivist could say the principle is “When in Rome, do as the Romans do” – which would work out to “Women should wear headscarves in those cultures where that is what is expected, and not in cultures where it is not.” Another discussion point was that we could agree on a moral judgment but disagree on the reason for it (e.g., we agree that kicking dogs is morally wrong, but one person might think it’s because you are causing pain to the dog, while another thinks it’s because it desensitizes the person doing it to cruelty)
  • “Objective” can mean moral principles independent of us, or it can mean moral principles apply to everyone equally (relativists would just object to the latter).
  • Another question from the class was whether objectivism could be right for some moral principles, relativism for others, and emotivism for yet others. Philosophers talk about “agent neutrality” – the reasons that morality provides for whether something is moral are independent of the individual – and they talk about morality as overriding. If this is correct, you’d expect morality to be a unified domain.
  • Probably none of these theories is right as it stands – each needs more work before we can figure out which, if any, is correct.
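
To make the preference-transitivity point concrete, here’s a small sketch I put together (mine, not the lecturer’s) that checks a set of pairwise preferences for the kind of cycle described above:

```python
# Sketch (mine, not the lecture's): detect intransitive preferences,
# e.g., preferring A to B and B to C but also C to A.

def has_preference_cycle(prefers):
    """prefers maps each option to the options it is preferred over.
    Return True if some option ends up 'preferred to itself'."""
    def reachable(start, target, seen):
        # Is `target` reachable from `start` by following "preferred over"?
        for worse in prefers.get(start, []):
            if worse == target:
                return True
            if worse not in seen:
                seen.add(worse)
                if reachable(worse, target, seen):
                    return True
        return False
    return any(reachable(option, option, set()) for option in prefers)

# A > B, B > C, C > A: irrational by the transitivity standard.
print(has_preference_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))  # True
# A > B, B > C, and nothing preferred over A: consistent.
print(has_preference_cycle({"A": ["B"], "B": ["C"]}))  # False
```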

Intro to Philosophy – Week 3 – Philosophy of the Mind

  • Cartesian dualism: the body is made of material stuff (i.e., stuff that has “extension” (i.e., takes up space)) and the mind is made of immaterial stuff (i.e., does not have extension)
  • Princess Elizabeth of Bohemia was a student of Descartes who brought up the following problem: how can an immaterial mind affect a material body? Our thoughts cause us to do things, but how does the immaterial interact with the material?
  • Another problem is how does the ingestion of a material substance (e.g., psychoactive drugs) affect an immaterial mind (i.e., hallucinations)?
  • Physicalism = “all that exists is physical stuff”
  • Identity theory = one view of physicalism in which “mental phenomena, like thoughts and emotions, etc. are identical with certain physical phenomena”
    • e.g., the mental state of “pain” is identical to a particular type of nerve cell firing
    • a reductionist view – i.e., reduces mental states to physical processes
    • token = instances of a certain type (e.g., Fido and Patches are two tokens of the type “Basset hound”)
    • token identity theory = each instance of a mental phenomenon (e.g., a particular pain that I am feeling) is identical to a particular physical state that I’m in
    • type-type identity theory = types of mental phenomena (like “pain” or “sadness”) are identical to types of physical phenomena (e.g., a particular cocktail of neurotransmitters, hormones, etc.)
      • type identity theory is a stronger claim than token identity theory
  • problem with type-type identity theory:
    • a human, an octopus, and an alien can all feel pain, but have very different brain states
    • Hilary Putnam raised this issue of “multiple realisability” in 1967 – the same mental state can be “realised” from different physical states
    • similarly – currency can be coins & paper in one place, but shells in another place – so currency is “multiply realisable”. It doesn’t matter what they are made of – what matters is how they function.
  • Functionalism = “we should identify mental states not by what they’re made of, but by what they do. And what mental states do is they are caused by sensory stimuli and current mental states and cause behaviour and new mental states”
    • e.g., the smell of chocolate (a sensory stimulus) causes a desire for chocolate (a mental state), which may cause the thought (another mental state) “where is my coat?” and the behaviours of putting on a coat and going to the store; but if I have a belief that there is chocolate in the fridge, the desire for chocolate could instead lead to the behaviour of getting the chocolate out of the fridge
    • functionalism gets away from the question of “what are mental states made of?” and instead focuses on what mental states do
  • philosophers often use the computer as a metaphor for mind – a computer is an information processing machine and it doesn’t matter what it’s made of, it only matters what it does
  • this is a computational view of the mind
  • Turing Test – you ask an entity questions and you don’t know if you are talking to a person or a computer. If we can build a computer that can fool the person asking questions into thinking they are human, we have built a computer that is sufficiently complex to say that it can “think” or it has a “mind”
    • some problems with the Turing test:
      • it’s language-based, so a being that can’t use our language couldn’t pass it
      • it’s too anthropocentric – what about animal intelligence? or aliens
      • it does not take into account the inner states of a machine – e.g., a machine that calculates 8 + 2 = 10 is going through a process, but a machine that just has a huge database of files and pulls the answer 10 out of its “8 + 2” file is not – we wouldn’t want to say that it is “thinking” (see the sketch at the end of this section)
  • John Searle’s Chinese Room Thought Experiment
    • You are in a room where you get slips of paper with symbols on them delivered to you through an “input” hole in the wall, and you have a book that tells you what symbols to write in response, which you write down on a slip of paper and pass through the “output” slot in the wall. As it turns out, the symbols are Chinese characters, and the book is written in such a way that you are giving intelligent answers to the person sending the questions to you. When they receive your “answers”, they are convinced you are a being with a mind that is answering their questions – but you have no idea that they’re questions and no idea what you are saying in response, because you can neither read nor write Chinese. This is how computers work – they get an input and they are programmed with a list of rules to produce a certain output. But we shouldn’t say that the computer is “thinking” any more than the person in the room understands Chinese. There is no understanding going on within a computer – it doesn’t have a “mind” – and if it passes the Turing test, it’s just a really good simulation.
    • syntactic properties = physical properties, e.g., shape
    • semantic properties = what the symbols mean/represent
    • a computer only operates based on syntactic properties – it is programmed to respond to the syntactic property of a given symbol with a given response – it does not “understand” the symbol’s semantic properties
    • aboutness of thought – thoughts are “about” something – they have meaning
  • some problems with the computational view of the mind
    • doesn’t allow us to understand how we can get “aboutness of thought”
    • the “gaping hole of consciousness”
    • the hard problem of consciousness: what makes some lumps of matter conscious while others are not?
  • a lot of philosophers were writing when computers were becoming a big deal, so perhaps their thinking was limited by thinking of minds as computers – perhaps we should step away from computational analysis as a metaphor for the mind because it’s limiting our thinking?
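
To make the lookup-table worry concrete, here’s a toy sketch (my own, not from the course) of two “machines” that give identical answers to “8 + 2” – a questioner who only sees outputs, as in the Turing test, can’t tell which process produced them:

```python
# Toy illustration (mine, not the course's): two "machines" whose outputs
# are indistinguishable from the outside, like the Chinese Room.

def computing_machine(question):
    """Actually carries out a process: parse the question and add."""
    left, right = question.split("+")
    return int(left) + int(right)

def lookup_machine(question):
    """No arithmetic at all: pulls a canned answer out of a table, the way
    the Chinese Room operator just follows the rule book."""
    answer_table = {"8 + 2": 10, "7 + 5": 12}  # imagine every question pre-stored
    return answer_table[question.strip()]

print(computing_machine("8 + 2"))  # 10
print(lookup_machine("8 + 2"))     # 10 -- same output, different "inner states"
```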


Follow-up discussion

  • most philosophers use the phrase “intentionality”, which the prof of this session avoided when she talked about “aboutness of thought” because it comes with a lot of philosophical “baggage” that she didn’t want to get into
  • in the discussion forum of the class, people were asking things like “do animals have minds? and how could we know if animals have minds?”
    • one school of philosophy says that you need to have language to have thoughts and since animals don’t have language (as far as we know), they don’t have thoughts
    • but others don’t think this is a fair argument – e.g., if a dog is barking at a squirrel in a tree, just because it might not have as “rich” a concept of squirrel as humans do (e.g., a squirrel is a mammal with a bushy tail, etc.), we can still infer from its behaviour that it is “thinking” something we can roughly describe as “the dog thinks there’s a squirrel in the tree”
    • she suggests checking out Peter Carruthers’ work on the animal mind for more information
  • someone in the discussion said that the Turing test doesn’t test if a machine is conscious, but rather it tests at what point humans are willing to attribute conscious states to other things (similarly, at what point do infants start to think of other people as having a consciousness?)



Intro to Philosophy – Week 2 – Epistemology

Epistemology

  • studying and theorising about our knowledge of the world.
  • we have lots of information, but how do we tell good information from bad information?
  • “propositional knowledge” = knowledge that a certain proposition is the case
  • “proposition” = what is expressed by a declarative sentence, i.e., a sentence that declares that something is the case.
    • e.g., “the cat is on the mat” is a sentence about how the world is
    • it can be true or false
  • not all statements are declarative (e.g., “Shut that door” is not a sentence that declares how the world is. It cannot be true or false).
  • “ability knowledge” = know-how (e.g., knowing how to ride a bike)
  • two conditions for propositional knowledge
    • truth condition – if you know something is the case (e.g., you know that the cat is on the mat), then it has to be true (e.g., the cat really is on the mat)
      • you cannot know a falsehood
      • you can think you know a falsehood, but you cannot actually know it
        • we are interested in when you actually know something, not just when you think you know
    • belief – knowledge requires that you believe it to be true (e.g., if you don’t believe Paris is the capital of France, you cannot have the knowledge that Paris is the capital of France)
      • someone might say “I don’t just believe that Paris is the capital of France, I know that Paris is the capital of France.” But this doesn’t mean that belief in a proposition is different in kind from knowledge of that proposition – it means we don’t merely believe it, but also take ourselves to know it. This is indicative of the fact that a knowledge claim is stronger than a belief claim (i.e., knowledge at the very least requires belief).
  • this doesn’t mean you have to be infallible or certain, but if you are wrong about the fact, then you didn’t really know it (you just thought you did)
  • also, when we talk about propositional knowledge, we aren’t talking about knowledge that something is likely or probably true – we are talking about something that either is or is not true
    • we do sometimes “qualify” or “hedge” our knowledge claims (perhaps because we are unsure), but we are really concerned with actual truth
  • knowledge isn’t just about getting it right – it also requires getting to the truth in the right kind of way
    • e.g., imagine a trial where the accused is, in fact, guilty. One juror decides that the accused is guilty based on considering the evidence/judge’s instructions/the law, while another juror decides the accused is guilty based on prejudice without listening to any of the evidence. Although they both “got it right” (i.e., what they believe is true), the first juror knows the accused is guilty, but the second juror does not know it
  • there are two intuitions about the nature of knowledge:
    • anti-luck intuition – it’s not a matter of luck that you ended up at the right answer; you actually formed your belief in the right kind of way (e.g., considering the evidence, making reasoned arguments), not that you got to the truth randomly/by chance
    • ability intuition – you get to the truth through your ability (e.g., the juror who used prejudice and happened to get the right answer did not get to the right answer by their abilities)

The Classical Account of Knowledge and the Gettier Problem

  • knowledge requires more than just truth and belief – but what is it that is required?
  • the classical account of knowledge (a.k.a., the tripartite account of knowledge):
    1. the proposition is true
    2. one believes it
    3. one’s belief is justified (i.e., you can offer good reasons in support of why you believe what you do)
  • until the mid-1960s, this classical account of knowledge was accepted by most people
  • but in 1963, Edmund Gettier published a 2.5 page paper that demolished this account – he showed some counter-examples of situations that fit the three above named criteria, but where people don’t know – they actually come to their belief by luck
  • his examples were very complicated, but here are some simple counter examples (we can call them Gettier-style cases)
  • e.g., the stopped clock example:
    • you come downstairs and see the clock says 8 am and you believe, based on the justification that this clock has always been reliable, that it is 8 am. And it happens to be 8 am. So you have a justified true belief (i.e., it satisfies the classical account of knowledge). But imagine the clock stopped 12 hours ago, but you just happened to look at the clock when it was 8 am – so you got it right by luck. So you cannot actually know the time based on looking at a stopped clock!
  • e.g., the sheep case
    • a farmer looks out his window, sees what looks like a sheep, and believes there is a sheep in the field. There is a sheep in the field, but it is hidden behind a sheep-shaped rock, which is what the farmer actually saw. So his belief is true (i.e., there is a sheep in the field) and he has a justification (he sees what looks like a sheep in the field), but he got it right only by sheer luck that there was a sheep hidden behind that rock. If there had not been a sheep hidden behind that rock, he would believe there was a sheep in the field and he would be wrong. So he does not actually know there is a sheep in the field (he just thinks he knows and happens to be right by luck)
  • people try to attack Gettier-style cases – e.g., asking “does the farmer really believe that there is a sheep in the field, or does he believe that the rock is a sheep?” – because if it is the latter, then he has a false belief (i.e., the rock is not a sheep) and the case does not violate the classical account of knowledge. But this is just attacking a single case – to knock down Gettier-style cases in general, you’d need to think about Gettier-style cases as a whole and find a way to blow up the whole thing
  • there is a general formula for constructing Gettier-style cases (see the sketch at the end of this section)
    • take a belief that is formed in such a way that it would usually result in a false belief, but which is still justified (e.g., looking at something that looks like a sheep, or looking at a stopped clock)
    • make the belief true, for reasons that have nothing to do with the justification (e.g., the hidden sheep, happening to look at the stopped clock at the right time)
  • at first, people thought there would be some simple fix (e.g., adding a fourth condition onto the classical account), but after much trying, no one has found a way to do this
  • one example of how someone tried
    • Keith Lehrer proposed adding a fourth condition that says the subject isn’t basing their belief on any false assumptions (a.k.a. “lemmas”)
    • this sounds like a reasonable approach
    • but what do we mean by “assumptions”?
    • a narrow definition of “assumptions” = something that the subject is actively thinking about (but you don’t look at the clock and actively think “I assume the clock is working” – you just believe it is without actively thinking about that assumption)
    • a broad definition of “assumptions” = a belief you have that is in some sense germane to the target belief in the Gettier-style case (e.g., you do believe the clock is working even though you aren’t actively thinking that) – but this is so broad that it will exclude genuine cases of knowledge, because of all the things we believe, some may be false
  • two questions raised by Gettier-style cases
    1. is justification even necessary for knowledge?
    2. how does one go about eliminating knowledge-undermining luck?
  • so, it really is not that obvious what knowledge is
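
To summarize the structure of these cases for myself, here’s a schematic sketch (my framing, not the lecturer’s) of the tripartite account as a simple predicate, with the stopped-clock case as data that satisfies all three conditions even though, intuitively, the subject doesn’t know:

```python
# Schematic sketch (my framing, not the lecture's): the tripartite account
# as a predicate, plus the stopped-clock case as data that satisfies it.

def classical_knowledge(case):
    """Tripartite account: knowledge = truth + belief + justification."""
    return case["true"] and case["believed"] and case["justified"]

stopped_clock = {
    "proposition": "it is 8 am",
    "true": True,       # it really is 8 am
    "believed": True,   # you believe it, having looked at the clock
    "justified": True,  # this clock has always been reliable
    "lucky": True,      # the clock stopped 12 hours ago -- right by chance
}

# The classical account counts this as knowledge...
print(classical_knowledge(stopped_clock))  # True
# ...which is Gettier's point: all three conditions hold while the truth is
# reached only by luck, so justified true belief isn't sufficient.
```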

Do We Have Any Knowledge?

  • radical skepticism contends that we don’t know nearly as much as we think we know – and in its most extreme form suggests that we can’t know anything
  • skeptical hypotheses are scenarios that are indistinguishable from normal life, so you can’t possibly know they aren’t occurring
    • e.g., brain-in-a-vat – if you were a brain in a vat being fed the necessary nutrients to stay alive and being fed fake experiences
    • there is no way to know this isn’t true because any “evidence” you can provide against it (e.g., I can feel objects around me, I can have a conversation with you) could be explained by the situation of being a brain in a vat (e.g., your brain is being fed signals that make it appear that you can feel objects or have a conversation)
    • note that radical skepticism isn’t saying you are a brain-in-a-vat or even that it’s likely that you are a brain-in-a-vat. It’s just asking “How would you know that you aren’t a brain-in-a-vat?” And really, you can’t know.

Report on “Delivering the Benefits of Digital Health Care”

A report on “Delivering the benefits of digital health care” from the Nuffield Trust in the UK recently came across my desk. It covers a bigger scope of technology than the project I’m working on (a project about transforming clinical care, and implementing an electronic health record across three large health organizations to support that transformation, but one that does not include telehealth or some of the other IT “solutions” discussed in this report), but some of the “lessons learned” that they share resonate with what we are doing.

Some highlights:

“Clinically led improvement, enabled by new technology, is transforming the delivery of health care and our management of population health. Yet strategic decisions about clinical transformation and the associated investment in information and digital technology can all too often be a footnote to NHS board discussions. This needs to change. This report sets out the possibilities to transform health care offered by digital technologies, with important insight about how to grasp those possibilities and benefits from those furthest on in their digital journey” (p. 5, emphasis mine)

  • this report suggests that rather than focusing on the technology with an eye to productivity gains, “the most significant gains are to be found in more radical thinking and changes in clinical working practices” (p. 5).
    • it’s “not about replacing analogue or paper processes with digital ones. It’s about rethinking what work is done, re-engineering how it is done and capitalising on opportunities afforded by data to learn and adapt.” (p. 6)
    • This reminds me of what my IT management professor in my MBA program liked to say: “If you automate a mess, all you get is an automated mess”. It’s much better to focus on getting your processes right, and then automating them, rather than just automating what you have.
    • “It’s fundamentally not a technology project; it’s fundamentally a culture change and a business transformation project” (Robert Wachter, UCSF) (p. 22)
  • in a notable failure, the NHS in the UK spent 9 years and nearly £10 billion and failed to digitise the hospital and community health sectors with reasons for the failure being “multiple, complex, and overlapping” including “an attempt to force top-down change, with lack of consideration to clinical leadership, local requirements, concerns, or skills” (p. 14)
  • it is noted that implementing an electronic health record (EHR) [which is what the project I’m working on is doing] is particularly challenging
  • they also note that things take longer than you expect:
    • “The history of technology as it enters industries is that people say ‘this is going to transform everything in two years’. And then you put it in and nothing happens and people say ‘why didn’t it work the way we expected it to?’… And then lo and behold after a period of 10 years, it begins working.” (Robert Wachter, University of California San Francisco (UCSF)) (p. 20)
  • and they note that “the technologies that have released the greatest immediate benefits have been carefully designed to make people’s jobs or the patient’s interaction easier, with considerable investment in the design process.” (p. 20)
  • poorly designed systems, however, can really decrease productivity
  • getting full benefit of the system “requires a sophisticated and complex interplay between the technology, the ‘thoughtflow’ (clinical decision-making) and the ‘workflow’ (the clinical pathway)” (p. 21)
  • systems with automated data entry (e.g., bedside medical device integration, where devices that monitor vital signs at the bedside automatically enter their data into the EHR, without requiring a clinician to do it manually) really help maximize the benefits

Seven Lessons Learned

  1. [Clinical] Transformation First
    • it’s a “transformation programme supported by new technology” (p. 22)
  2. Culture change is crucial
    • “many of the issues faced […] are people problems, not technology problems” (p. 23)
    • you need:
      • “a culture that is receptive to change
      • a strong change management process
      • clinical champions/supporting active staff engagement” (p. 23)
  3. User-centred design
    • you need to really understand the work so that you design the system to meet the needs of the clinician
    • “the combination of a core package solution with a small number of specialist clinical systems is emerging as the norm in top-performing digital hospitals” (p. 8)
  4. Invest in analytics
    • data analytics allows you to make use of all the data you collect as a whole (in addition to using it for direct clinical care)
    • requires “analytical tools available to clinicians in real time” (p. 8)
  5. Multiple iterations and continuous learning
    • you aren’t going to get it right the first time, no matter how carefully you plan [this is something that our new Chief Clinical Information Officer is always reminding us of] and so you will need “several cycles – some quite painful – before the system reaches a tipping point where all of this investment starts to pay off” (p. 26)
  6. Support interoperability
    • to provide coordinated care, you need to be able to share data across multiple settings
    • “high-performing digital hospitals are integrating all their systems, to as low a number as possible, across their organisation” (p. 9)
  7. Strong information governance
    • when you start to digitize patient information, the size and scope of privacy issues change (i.e., while there is a risk that an unauthorized person could look at a patient’s paper record, or that paper records could be lost when being transported between places, with digitized records there is a risk that all of your patients’ records could be accessed by an unauthorized person, and it is much easier to search electronic records for a specific person, condition, etc.)
    • you need “strong data governance and security” (p. 9)

Seven Opportunities to Drive Improvement

  1. More systematic, high-quality care
    • health care “often falls short of evidence-based good practice” (p. 31)
    • “technologies that aid clinical decision-making and help clinicians to manage the exponential growth in medical knowledge and evidence offer substantial opportunities to reduce variation and improve the quality of care” (p. 31)
    • integrated clinical decision support systems and computerized provider order entry systems:
      • reduce the likelihood of med errors (they cite a review paper (Radley et al., 2013) [which I have now obtained to check out what methods the papers they reviewed used to measure med errors])
      • reduced provider resource use
      • reduced lab, pharmacy & radiology turnaround times
      • reduced need for ancillary staff (p. 32)
    • at Intermountain Healthcare, “staff are encouraged to deviate from the standardised protocol, subject to clear justification for doing so, with a view to it being refined over time” (p. 34) – “hold on to variation across patients and limit variation across clinicians” (p. 35) as “no protocol perfectly fits each patient” (p. 35)
    • need to avoid alert fatigue – by only using them sparingly (or else they will get ignored and the really important ones will be missed) and targeting them to the right time (e.g., having prescribing alerts fire while the provider is prescribing)
    • be on the lookout for over-compliance – “Intermountain Health experienced problems where clinicians were too ready to adopt the default prescribing choice, leading to inappropriate care in some cases” (p. 37)
  2. More proactive and targeted care
    • “patient data can be used to predict clinical risk, enabling providers to target resources where they are needed most and spot problems that would benefit from early intervention” (p. 38)
    • drawing on patient data, computer-based algorithms “can generate risk scores, highlighting those at high risk of readmission and allowing preventative measures to be put in place” (p. 39)
    • “it may also have a role in predicting those in the community who are likely to use health care services in the near future” (p. 39)
    • “monitoring of vital signs, [which are then] electronically recorded, [can be used to] calculate early warning scores [and] automatically escalate to appropriate clinicians [and] combine these data with laboratory tests to alert staff to risks of sepsis, acute kidney injury or diarrhoeal illness” (p. 39) (a rough sketch of such a scoring rule follows this list)
      • Cerner estimates using an early warning system for sepsis “could reduce in-hospital patient mortality by 24% and reduce length of stay by 21%, saving US$5,882 per treated patient” (p. 41)
    • there’s also opportunity to “check a patient’s status from [a] remote location within the hospital, as well as facilitating handover between staff and task prioritisation using electronic lists” (p. 39)
    • monitoring of vital signs throughout the whole hospital is best to maximize benefits
    • predictive analytics is only as good as the quality of the data you put into the system
    • lots of data is unstructured – need to find ways to use these data (e.g., natural language processing)
  3. Better coordinated care
    • coordinated care leads to a better care experience, reduces risk of duplication or neglect
    • “if all health care professionals have access to all patient information in real time, there is significant potential to reduce waste (e.g., duplication of tests). It can help make sure things are done at the right time, at the right place and not overdone” (p. 45)
    • “chasing [a] report or a result […is…] an inefficient use of time, effort and energy and doesn’t really give confidence to the patient and carers” (p. 47)
    • but note that “systems to share results/opinions digitally can remove the opportunity for informal exchange of views and advice across teams, which often enrich and improve clinical decision-making” (p. 48), so alternative ways of doing this may need to be provided.
  4. Improved access to specialist expertise
    • telehealth (not part of the project I’m working on)
  5. Greater patient engagement
    • this section referred to tools, like wearable tech (e.g., Fitbit) or patient portals, that empower patients to take more control of their own health (not part of the project I’m working on)
    • “patient co-production of data into a hospital EHR will redefine the interaction with care services” (e.g., questionnaires that patients fill out before they even come to the healthcare facility, tracking of long-term data (e.g., blood pressure, weight))
  6. Improved resource management
    • e-rostering (i.e., of staff), patient flow management, business process support (e.g, HR, facilities, billing) all discussed (not relevant to the project I’m working on)
    • ability of staff to remotely access health records “can transform the way that staff in the community deliver care” (p. 66)
  7. System improvement and learning
    • “feeding learning from clinical and non-clinical data back into existing processes is essential to fully realising the benefits of digital technology” (p. 70)
    • Intermountain Healthcare:
      • captures three types of data:
        • intermediate & final clinical outcomes
        • cost data
        • patient satisfaction and experience
      • “clinical registries are derived directly from clinical workflows” – currently has “58 condition-specific registries – tracking a complete set of intermediate and final clinical and cost outcomes by patient” (p. 71)
      • remember that data collection is costly, so only collect data routinely if you are using it for some purpose that adds value (“Intermountain Healthcare does this through small individual projects, before building data collection into existing processes”) (p. 76)
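
The report doesn’t spell out how an early warning score is computed, so here is a minimal sketch of the kind of rule it describes. The vital signs, thresholds, and weights below are invented for illustration – this is not the NEWS standard and not Cerner’s actual sepsis algorithm:

```python
# Minimal sketch of an early-warning-score rule of the kind the report
# describes. Thresholds and weights are illustrative only.

def early_warning_score(vitals):
    """Score one set of vital signs; higher = more concerning."""
    score = 0
    if vitals["heart_rate"] > 110 or vitals["heart_rate"] < 50:
        score += 2
    if vitals["resp_rate"] > 24:
        score += 2
    if vitals["systolic_bp"] < 100:
        score += 2
    if vitals["temperature"] > 38.5 or vitals["temperature"] < 36.0:
        score += 1
    return score

def maybe_escalate(patient_id, vitals, threshold=4):
    """The 'automatic escalation' the report mentions: flag the patient to
    clinicians whenever the score crosses a threshold."""
    score = early_warning_score(vitals)
    if score >= threshold:
        print(f"ALERT: patient {patient_id} early warning score is {score}")

# With bedside medical device integration, vitals land in the EHR without
# manual entry, so a rule like this can run on every new reading.
maybe_escalate("1234", {"heart_rate": 121, "resp_rate": 26,
                        "systolic_bp": 95, "temperature": 38.9})
```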

What could the future look like?

  • operational improvement from:
    • combining the impact of a bunch of small changes [this assumes that (a) the different elements of the system are additive, as opposed to complex, and (b) the “benefits” outweigh the unintended negative consequences]
    • getting the “full benefit” out of all the technologies (i.e., it will take time for people to implement the available technologies and to optimize their use) [this doesn’t even include technologies that are not yet available]
  • “benefits” they expect are most likely to see:
    • “reduced duplication and rework
    • removing unjustified variation in standard clinical processes
    • identifying deteriorating patients and those at risk
    • predicting the probability of an extended stay or readmission
    • cutting out unnecessary steps
    • improving communication and handoffs
    • removing administrative tasks from clinical staff
    • scheduling and improving flow
    • inventory & procurement management
    • rostering, mobile working, and staff deployment
    • patient self-service for administrative tasks such as booking
    • other automation, e.g., robotics in back office” (p. 80-1)
  • redesigning the whole pathway:
    • “reduced variation
    • ability to ensure the most appropriate level of care
    • fitting staffing skill mix to demand more effectively” (p. 81)
  • population health management
    • “early intervention & targeting
    • enabling patient self-management
    • shared decision-making
    • measuring outcomes and value rather than counting activities” (p. 82)
      • all this requires better data and analytics, learning & improvement processes, and supporting patients with self-management and supporting shared decision-making (p. 82)

“Early strategic priorities should be the areas where technology is able to facilitate some relatively easy and significant wins. Most notable are the systematic and comprehensive use of vital signs monitoring and support for mobile working. In the short to medium term, the use of EHRs, telehealth, patient portals and staff rostering apps can also generate savings and improve quality. However, these require sophisticated leadership with support for organisational development and change management to ensure that the full benefits are realised. In the longer term, the really big benefits will come from the transition to a system and ways of working premised on continual learning and self-improvement.” (p. 88, emphasis mine)

Potential unintended consequences mentioned in the report:

  • decreased productivity if the system is poorly designed (e.g., time spent on data entry, time spent responding to unhelpful alerts)
  • “over-compliance” – “Intermountain Health experienced problems where clinicians were too ready to adopt the default prescribing choice, leading to inappropriate care in some cases” (p. 37)
  • “systems to share results/opinions digitally can remove the opportunity for informal exchange of views and advice across teams, which often enrich and improve clinical decision-making” (p. 48)

Limitations:

  • they noted there was little evidence on this type of work in the literature, particularly in terms of return on investment
Imison, C., Castle-Clarke, S., Watson, R., & Edwards, N. (2016). Delivering the benefits of digital health care. Nuffield Trust. [Download the full report.]

Implementation Science

Trying to avoid falling into yet another rabbit hole of reading (this time on “Implementation Science”[1]), but here are notes from a couple of papers I’ve read trying to get the lay of the land on this.

Implementation Matters

Diffusion or Technology Transfer = “the spread of new ideas, technologies, manufactured products […] or […] programs” (p. 327)

  • Phases of program diffusion:
    • dissemination: “how well information about a program’s existence and value is supplied to” end users
    • adoption: whether an end user “decides to try the new program”
    • implementation: “how well the program is conducted during a trial period”
    • sustainability: “whether the program is maintained over time” (p. 327)

8 aspects of implementation:

  • fidelity: “the extent to which the innovation corresponds to the originally intended program (a.k.a. adherence, compliance, integrity, faithful replication)”
  • dosage: “how much of the original program has been delivered (a.k.a. quantity, intervention strength)”
  • quality: “how well different programs have been conducted”
  • participant responsiveness: “degree to which the program stimulates the interest or hold the attention of participants”
  • program differentiation: “extent to which a program’s theory and practices can be distinguished from other programs (a.k.a. program uniqueness)”
  • monitoring of control/comparison conditions: “describing the nature and amount of services received by members of these groups (treatment contamination, usual care, alternative services)”
  • program reach: “rate of involvement and representativeness of program participants (participation rates, program scope)”
  • adaptation: “changes made in the original program during implementation (a.k.a. program modification, reinvention)” (p. 329)

In order to evaluate whether a program –> outcomes, you need to monitor:

  • how the program is being implemented:
    • so you know what you are actually evaluating
    • because negative results could occur because you didn’t actually implement as planned (and if you don’t monitor what is actually implemented, you would come to the incorrect conclusion that the program doesn’t work)
    • because positive results could come from an innovation that was implemented instead of what was planned (and if you don’t monitor what is actually implemented, you would come to the incorrect conclusion that the program works and would miss out on being able to sustain/spread the innovation that actually does work)
  • what the comparator group is actually getting (so you know what you are actually comparing your program to)

It’s important to find the right mix of fidelity and adaptation. Although fidelity can lead to improved outcomes, no program is implemented with 100% fidelity, and some adaptation to local context can also improve outcomes; so, it is important to “find the right mix of fidelity and adaptation”. Importantly, you need to “specify the theoretically important components of interventions, and to determine how well these specific components are delivered or altered during implementation. This is because core program components should receive emphasis in terms of fidelity. Other less central program features can be altered to achieve a good ecological fit.” (p. 341)

Source: Durlak, J. A., & DuPre, E. P. (2008). Implementation matters: A review of research on the influence of implementation on program outcomes and the factors affecting implementation. Am J Community Psychol, 41: 327-350.

Making sense of implementation theories, models and frameworks

  • implementation science came from struggles with getting research into practice; often attempts to implement evidence-based practice were not based in an explicit strategy/theory and it was hard to “understand and explain how and why implementation succeeds or fails, thus restraining opportunities to identify factors that predict the likelihood of implementation success and develop better strategies to achieve more successful implementations.” (p. 1)
  • in response, researchers have created a lot of theories and used some from other disciplines and now people find it difficult to pick a theory to use
  • Implementation Science: “the scientific study of methods to promote the systematic uptake of research findings and other EBPs into routine practice to improve the quality and effectiveness of health services and care” (p. 2)
  • diffusion – dissemination – implementation continuum
    • diffusion – practices spread through passive, untargeted, unplanned mechanisms
    • dissemination – practices spread through active mechanisms/planned strategies
    • implementation – “the process of putting to use or integrating new practices within a setting” (p. 2)
  • theory – “a set of analytical principles or statements designed to structure our observation, understanding and explanation of the world” (p. 2) – usually described as “made up of definitions of variables, a domain where the theory applies, a set of relationships between the variables and specific predictions. A “good theory” provides a clear explanation of how and why specific relationships lead to specific events” (p. 2)
  • model – “a deliberate simplification of a phenomenon or a specific aspect of a phenomenon. Models need not be completely accurate representations of reality to have value”.
    • not always easy to distinguish between a “model” and a “theory” – “Models can be described as theories with a more narrowly defined scope of explanation; a model is descriptive, whereas a theory is explanatory as well as descriptive” (p. 2)
  • framework – “a structure, overview, outline, system or plan consisting of various descriptive categories, e.g. concepts, constructs or variables, and the relations between them that are presumed to account for a phenomenon. Frameworks do not provide explanations; they only describe empirical phenomena by fitting them into a set of categories”
  • in implementation science:
    • “theory usually implies some predictive capacity […] and attempts to explain the causal mechanisms of implementation”
    • models “are commonly used to describe and/or guide the process of translat[ing] research into practice […] rather than to predict or analyse what factors influence implementation outcomes”
    • frameworks “often have a descriptive purpose by pointing to factors believed or found to influence implementation outcomes” (p. 3)
      • models and frameworks are typically checklists and don’t specify mechanisms of change


  • there is overlap among these categories – Nilsen sorts the field’s theoretical approaches into five (process models, determinant frameworks, classic theories, implementation theories, and evaluation frameworks)
  • “the use of a single theory that focuses only on a particular aspect of implementation will not tell the whole story. Choosing one approach often means placing weight on some aspects (e.g. certain causal factors) at the expense of others, thus offering only partial understanding. Combining the merits of multiple theoretical approaches may offer more complete understanding and explanation, yet such combinations may mask contrasting assumptions regarding key issues. […] Furthermore, different approaches may require different methods, based on different epistemological and ontological assumptions.” (p. 9)
  • research is needed to determine if use of theories/models/frameworks does, in fact, improve implementation
Source: Nilsen, P. (2015). Making sense of implementation theories, models and frameworks. Implementation Science, 10:53. (full text)

Footnotes

1. There’s another rabbit hole of “Program Science” awaiting me as well!

Intro to Philosophy – Week 1 – What is Philosophy

As I’ve been doing so much reading on things like theory, complexity science, and research methodology, I’ve been reading more and more papers with words like epistemology and ontology, and it’s prompted me to do a bit of a refresher on philosophy. I’ve never actually done a Philosophy 101 type of course – my philosophy education has been pieced together from a few philosophy courses during my various university degrees (specifically, biomedical ethics and critical thinking courses during my undergrad and a business ethics course from the philosophy department during my MBA), an ethics training course I took in my previous job so that I could serve as a consultant with the organization’s ethics department[1], and reading Sophie’s World and The Matrix and Philosophy. So I figured I should go back to basics and thus have enrolled in Introduction to Philosophy, offered by the University of Edinburgh on Coursera.

What is Philosophy?

  • is both a subject and an activity – philosophy is what philosophers do
  • Dr. Ward’s definition: “the activity of working out the right way of thinking about things”
    • all subjects try to think about their domains in the right way, but what makes philosophy different is that working in other subjects involves doing the work of, say, physics (e.g., collecting data, developing theories, testing theories), while philosophy involves stepping back from that work and working out the right way to think about things (e.g., “what do we mean by ‘physical reality’?” “what distinguishes a good scientific theory from a bad one?”)
    • another example: medicine involves trying to treat people’s illnesses based on an understanding of the best available medical theory. That used to mean trying to balance black bile and yellow bile, because it was believed that diseases were caused by imbalances in the “humours”. But philosophy of medicine involves stepping back and thinking about how we understand “health” and “illness” (e.g., perhaps noticing that balancing black bile and yellow bile didn’t actually work to heal people would lead people to rethink their understanding of the causes of illness).


  • philosophical questions can arise from anywhere (you can always step back and ask questions about how you are thinking about something)
  • similar to when children ask you “why?” and when you answer, they ask you “why?” again, etc. – in this analogy, the philosopher is in the role of both the child asking “why?” and the person trying to come up with answers
  • since we can ask philosophical questions about anything, they can be trivial, but they can also be very important (e.g., if we hadn’t asked philosophical questions about the way medicine was understood, we’d still be using leeches to treat most diseases)
  • in the past, people found it acceptable to enslave others or commit genocide – but today those ideas are indefensible. We don’t know which of our current beliefs/values will be looked back on as indefensible (e.g., farming animals for food; our treatment of the planet) – philosophers will step back and ask questions about these things

How Do You Do Philosophy?

  • the best way to learn how to do philosophy is to do it
  • working out the best way to think about something is something that we do naturally
    • we look around for evidence, we think about what that evidence means, and draw a conclusion
  • “argument” means evidence and a line of reasoning to support a conclusion
  • “premises” = claims made to support a conclusion
  • we examine arguments to see if we think they are good arguments
    • if the argument’s conclusion follows from the premises (i.e., if the premises are true then the conclusion must be true), then the argument is valid
    • we can question the truth of the premises – so even if the conclusion follows from the premises, if one or more of the premises are not true, then the argument does not support the conclusion
  • if an argument is valid and its premises are true, then we say it is a sound argument
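
To see what “valid” amounts to mechanically, here’s a small sketch (mine, not from the course) that checks a propositional argument by brute force: an argument form is valid exactly when no assignment of truth values makes all the premises true and the conclusion false:

```python
# Sketch (mine, not the course's): check propositional validity by trying
# every assignment of truth values to the variables.

from itertools import product

def is_valid(variables, premises, conclusion):
    """premises and conclusion are functions from an assignment dict to bool."""
    for values in product([True, False], repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False  # counterexample: premises true, conclusion false
    return True

# Modus ponens -- "if p then q; p; therefore q" -- is valid.
print(is_valid(["p", "q"],
               [lambda v: (not v["p"]) or v["q"], lambda v: v["p"]],
               lambda v: v["q"]))  # True

# Affirming the consequent -- "if p then q; q; therefore p" -- is not.
print(is_valid(["p", "q"],
               [lambda v: (not v["p"]) or v["q"], lambda v: v["q"]],
               lambda v: v["p"]))  # False
```

Soundness then adds a question the machine can’t settle: are the premises actually true?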

An argument against free will:

  1. The way the world was in the past controls exactly how it is in the present, and how it will be in the future
  2. We’re part of the world, just like everything else.
  3. We can’t control how things were in the past, or the way the past controls the present and future
  4. Therefore, we don’t control anything that happens in the world, including all the things that we think, say and do
  • we can question the premises – e.g., some people think humans aren’t a part of the world like everything else because we have “souls”; or perhaps the past doesn’t completely determine the present and future (some indeterminacy)
  • when questioning the premises, you’ll then have more work to do to support your thinking about whether the premises are true or not
  • it is hard, but useful work, to clarify our thinking, our premises, our arguments
  • it is also useful to keep in mind the “big picture” – philosophy is not just about constructing clever arguments, but also thinking about why these issues are important

Is There a “Right” Way to Think about Things?

  • David Hume thought a skeptical attitude was the appropriate way to approach philosophy
  • he felt that it was important that philosophy stay true to our sensory experience of the world
    • e.g., causation – Hume argued that we can’t really know causation – e.g., when we see one billiard ball hit another and the other moves off, we attribute the causation (i.e., we add the notion of causation with our mind), as all we actually see is the behaviour of the two balls
    • he also thought there wasn’t really a “self” – all we really experience is our thoughts/feelings/impressions as they pass through our minds, but our mind adds something extra that we think of as our “self” – we don’t observe a “self” above and beyond the thoughts/feelings/impressions
    • he also thought there was no reason, based on our sensory experience of the world, to believe in an omnipotent, omniscient “God”
  • he didn’t think there was a “right” way to think about the world
  • Immanuel Kant thought the possibility of a world that didn’t conform to the rules and patterns that our mind imposes on experience was nonsensical
    • the rules that govern our thought are the same as rules that govern the world, and that we can know this just by thinking about it. So, for Kant, there is a right way of thinking about things, and we can arrive at it by the clear and careful use of reason.

Footnotes

1. Sadly, I never got to put that into practice, as I left that organization for my current job before an ethics consult came up that I could participate in.

CES Webinar: Words of Evaluation

Title: Words of Evaluation: A terminological dictionary to clarify communication in evaluation practice
Speakers: Richard Marceau, Francine Sylvain, Ghislain Arbour, Frank Hogg
Offered by: Canadian Evaluation Society

  • project at the École nationale d’administration publique (ENAP) in Quebec City to create a terminological dictionary for people working in evaluation in French (2014)
  • then they realized that they could extend this work to English (currently working on it)
  • not just a matter of translating the French dictionary into English, but rather:
    • extracting the conceptual system
    • applying the methodology for a terminological dictionary
  • what are the needs of evaluators when it comes to communication?
    • evaluators sometimes run into communication issues because they use different words to refer to the same thing (or the same word to refer to different things)
    • communication between evaluators and clients – even more so!
    • as evaluators, our product is information, which is made of words, so we need to have clear communication!
  • challenges in evaluation language:
    • inaccuracy
    • incoherence – e.g., if you are writing a paper about needs assessment, you may just define terms relevant to that; others do the same for other types of/aspects of evaluation, but when you try to put it all together, you don’t have a coherent system
    • jargon – not known by non-experts
  • What do we currently have?
    • evaluation textbooks contain specialized knowledge (but don’t solve our jargon issue and may not solve the incoherence problem)
    • general dictionaries use general language, but don’t contain the specialized knowledge, so may not work for our purposes
  • a terminological dictionary is meant to be a blend of the two – contain specialized knowledge but attempts to provide a coherent system
    • vertical coherence
    • horizontal coherence – e.g., if you have a definition of “program” and a definition of “evaluation”, then your definition of “program evaluation” should make sense in terms of the first two definitions
  • their terminological dictionary is focused on performance of programs (not on evaluation methodology or sociology of evaluation) – not designed to do research on evaluation, but rather to support evaluation practice
  • stay tuned for the release of the English version

Very similar slide deck to the one presented is available here.


Qualitative Comparative Analysis

While researching evaluating complex programs/complex systems, I came across an evaluation approach/method that I wasn’t familiar with: Qualitative Comparative Analysis (QCA). So I’ve done a bit of reading on this to see if it might be something I can use in my work.

Background

  • evaluations face the tension between contextualization (i.e., needing to understand the context in which an intervention occurs, because context (in addition to the intervention) can affect outcomes) and generalization (i.e., being able to inform future practice by identifying recurring patterns of what works and what does not work)
  • “project evaluation has tended to focus on the comparison of ‘before and after’ situations, and has not adequately incorporated the influence of contextual local conditions on infrastructure project development” (Verweij & Gerrits, 2012, p. 41)
  • there is a “misfit between the way infrastructure development projects are understood and the methodologies used to evaluate them” (Verweij & Gerrits, 2012, p. 41)
  • variable-oriented studies can allow the identification of generic patterns (e.g., rail projects lead to the biggest cost overruns among transportation infrastructure projects), but don’t account for local contexts that also affect outcomes, while case-based studies allow an understanding of the effect of context, but don’t allow for identifying patterns for generalization. QCA is meant to integrate these two approaches (a toy sketch of QCA’s truth-table step appears at the end of these notes)
  • when we say that an infrastructure project is “complex”, “it usually means that it is perceived to be difficult”  (Verweij & Gerrits, 2012, p. 42), but complexity is more than that – it is a “multi-layered concept” (p. 42):
    • comprises a “mixture of generic elements [e.g., suburbanization] and local conditions [e.g., specific features of geography that affect how suburbanization could (or could not) happen at a specific site]” (p. 43)
    • “becomes even more complex if its social fabric is taken into account”  (Verweij & Gerrits, 2012, p. 43)
    • “thus, the local built order emerges from interaction between generic and specific physical and social elements”  (Verweij & Gerrits, 2012, p. 43)
  • it is important that “this specific pattern of local conditions and generic developments is researched to understand ex-ante how a project should be executed, and to understand ex-post what leads to certain outcomes”  (Verweij & Gerrits, 2012, p. 43)
  • to understand complex infrastructure projects
    • they “take place within a specific interacting mix of local conditions and generic patterns that occurs in any given location”
    • “the causal relationship between site-specific conditions and generic developments are poorly known and, if known, only for that specific time and place” – so “known causal relationships specific to a certain area are by definition case-specific”
    • “the emergent nature of any built area implies that it is the result of longitudinal development”, i.e., it is the result of past changes and events that are to some extent path-dependent (Verweij & Gerrits, 2012, p. 43)
  • types of complexity:
    • generic complexity “focuses on the emergence of complex processes and structures from a limited set of variables. It assumes a general set of rules from which emergent complexity flows”… but it is missing that the “emergent nature of infrastructure projects is partly determined by local systems” (Verweij & Gerrits, 2012, p. 44)
    • situated complexity focuses “on the explanatory value of the contextualization of a phenomenon”. Buijs et al “argue that while open systems ‘do not operate according to general rules applied in all context’, a systematic comparison can reveal differences and similarities between the operations of different systems. This approach to situated complexity focuses both on recurring patterns over multiple systems and the idiosyncratic events in particular systems, since both determine how systems develop over time” (Verweij & Gerrits, 2012, p. 44)
  • how can complexity be understood, “which is basically a question of how reality can be understood”
    • positivism: “primarily concerned with determining general rules by taking reality apart in discrete components” (Verweij & Gerrits, 2012, p. 44) – similar to generic complexity
    • postpositivism: “has many different sub-strands that range from the extreme relativism of social constructivism to the more realist thesis of negotiated subjectivism or critical realism” (“the common theme within those strands is that the contextualization is explanatory for what is being observed”) (Verweij & Gerrits, 2012, p. 44) – similar to situated complexity
  • “if systems are said to be open, then it follows that their boundaries do not exist a priori, and any individual will develop a particular demarcation or set of boundary judgements about the system which includes and excludes variables (i.e., a reduction of real complexity) that may be connected but not perceived as such by the observer. Thus, there is no unambiguous separation between systems and their context, and the observer is as much part of the complexity as the system or agents that are observed. Situated complexity is therefore not confined to the presupposed demarcations of system but intersects all system representation by respondents.” (Verweij & Gerrits, 2012, p. 44)
    • implications of this for evaluation:
      • people choose the boundaries of the system
      • we should take multiple perspectives into account
      • “cause and effect relations do exist and can be known through respondents’ perceptions” (Verweij & Gerrits, 2012, p. 45)
      • evaluators cannot be separated from the evaluation

Criteria for a complexity-informed methodological framework

  • “it balances between in-depth understanding and reductionist generalization” (Verweij & Gerrits, 2012, p. 45)
  • “case-based”, where “projects are treated holistically and not as a collection of parts” (p. 46); with multiple cases that are compared “to allow for causal inference – for studying patterns across cases” (Verweij & Gerrits, 2012, p. 46)
    • this “rejects the idea that variables can be disaggregated from cases and analysed separately as if it is the variables rather than the cases that are causal. […] It is the ‘case’ and the state of important conditions of each case” (they are “bundles of conditions that interact together”) (Blackman et al, 2013, p. 4)
  • “allow the observation and analysis of complex interaction between the variables” (Verweij & Gerrits, 2012, p. 45)
  • “consider how situated complexity came into being over time (i.e., complex dynamics)” (Verweij & Gerrits, 2012, p. 45)

Qualitative Comparative Analysis (QCA)

  • an umbrella term for:
    • crisp set QCA (csQCA) – where conditions are scored as binaries (present (1) or absent (0))
    • multi-value QCA (mvQCA)
    • fuzzy set QCA (fsQCA) – allows conditions to be scored on a gradient of set membership between 0 and 1 (e.g., 0, 0.33, 0.67, 1)
  • “aims to integrate the case-oriented and variable-oriented approaches”  (Verweij & Gerrits, 2012, p. 46)
  • “can be used to achieve a systematic comparison across a smaller number of individual cases in order to preserve complexity, and yet being as parsimonious as possible and illuminating otherwise hidden causal paths on a micro level” (Rihoux & Lobe, cited in Verweij & Gerrits, 2012, p. 46)
  • in this process, you examine multiple cases:
    • “to uncover the most frequent combinations of causal conditions (i.e., variables) that produce a certain outcome” (conjunctural causation)
    • note that “different configurations” may produce the outcome (equifinality)
    • note that “factors can have different effects in different contexts” (multifinality)
  • also looks at asymmetric causality: “the presence and absence of outcomes require different explanations” (Verweij & Gerrits, 2012, p. 46)
  • uses “dialogue between theoretical and empirical evidence, which is especially important in the selection and construction of cases and variables”  (Verweij & Gerrits, 2012, p. 47)
  • “in QCA, variables are conceptualized as causal conditions or sets”  (Verweij & Gerrits, 2012, p. 47) – but note that since “social phenomena […] are often difficult to grasp in terms of sets” … “theoretical and substantive knowledge should be used to substantiate the constructions (and membership) of sets”  (Verweij & Gerrits, 2012, p. 47)
  • “sets can be intersected” [with ‘logical and’] and “unified” [with ‘logical or’]
  • QCA then is “able to systematically compare and analyze these set conjunctions”  (Verweij & Gerrits, 2012, p. 47)
  • QCA looks at “set relations instead of correlations”  (Verweij & Gerrits, 2012, p. 47)
    • “a condition is necessary if it has to be present for the outcome to occur” (p. 47)
    • “a condition is sufficient if it can produce the outcome by itself” (p. 47)
    • an INUS condition is “insufficient but non-redundant part of an unnecessary but sufficient condition” (Mackie, cited in Verweij & Gerrits, 2012, p. 48) – that is, since it is often a combination or “recipe” of conditions that are required to cause an outcome, there are “usually no purely necessary or sufficient condition” (p. 48) but rather conditions that are not sufficient on their own nor necessary on their own, but are sufficient when combined with other specific conditions
  • it is important to note that “‘neither necessity nor sufficiency exists independently of theories that propose causes'” (Ragin, cited in Verweij & Gerrits, 2012, p. 48)
  • once you have each case assigned to a set (i.e., you know its score for each condition, as well as its outcome), you consider a “truth table”:
    • “lists all of the logically possible configurations”  (Verweij & Gerrits, 2012, p. 48-9)
    • fundamental unit of analysis = truth table row
    Condition A   Condition B   Condition C   Outcome   Distribution of cases
         1             1             1           1
         1             1             0           1
         1             0             1           1
         1             0             0           1
         0             1             1           1
         0             1             0           0
         0             0             1           0
         0             0             0           0
    (the “Distribution of cases” column – how many observed cases fall into each row – is left blank in this illustrative table)
  • then “the truth table can be minimized to produce a so-called solution (i.e., a statement about patterns across cases)” (Verweij & Gerrits, 2012, p. 46). This is done using Boolean algebra (see the sketch after this list)
    • it is important that this is not done by just applying the formula, but includes “interpreting the formula and critically assessing it in light of individual cases: does it make sense?” Thus, there should be “several iterations between data and concepts, generating increased understanding/interpretation of the cases” (Verweij & Gerrits, 2012, p. 50)
  • also, “we can use the method in an exploratory rather than explanatory mode. This is particularly useful when we have ‘contradictory’ configurations (i.e., configurations without 100% of the cases having the same outcome state). In such cases, we can then return to the cases and seek additional differentiating characteristics of their previous trajectories and contexts that might help us to resolve the contradiction (i.e., generate an explanation of the difference in observed outcome state)” (Byrne, 2013, p. 224)
  • Steps in QCA
    • define the outcome
    • identify conditions thought to be relevant to the outcome
    • gather data on the identified conditions for each case
      • if csQCA, determine if each condition and the outcome are “present” or “absent”
      • if fsQCA, determine value for each condition and the outcome
        • dichotomization (or scoring) “requires judgement and discussion” (Blackman et al, 2013, p. 13)
    • determine the sets (“shared configurations of conditions”)
    • allocate cases to sets
    • create the “truth table”
      • share with practitioners – allows “them to help [researchers] develop accounts of causality in more detail”, is a “valuable knowledge exchange opportunity” and “enables [researchers] to incorporate insights from practice into our explanations so that these could be grounded in practitioners’ worlds”
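
To make these steps concrete, here is a minimal csQCA sketch in Python (my own illustration – the condition names and the fuzzy scores are invented). It minimizes the illustrative truth table above using sympy’s SOPform, a Quine-McCluskey-style Boolean minimizer, as a stand-in for dedicated QCA software such as the R “QCA” package or the fsQCA program:

    # Minimize the illustrative truth table above into a Boolean "solution".
    from sympy import symbols
    from sympy.logic import SOPform

    A, B, C = symbols("A B C")  # the three conditions from the table above

    # Truth-table rows (scores for conditions A, B, C) whose outcome was 1:
    rows_with_outcome = [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 1],
        [1, 0, 0],
        [0, 1, 1],
    ]

    print(SOPform([A, B, C], rows_with_outcome))
    # prints: A | (B & C) -- A alone is sufficient, as is B combined with C

    # fsQCA variant: membership scores lie between 0 and 1, intersection
    # ("logical and") is min(), union ("logical or") is max(), and Ragin's
    # consistency of "X is sufficient for Y" is sum(min(x, y)) / sum(x).
    def consistency_of_sufficiency(x_scores, y_scores):
        return sum(min(x, y) for x, y in zip(x_scores, y_scores)) / sum(x_scores)

    # invented membership scores for one condition (x) and the outcome (y):
    print(consistency_of_sufficiency([0.9, 0.7, 0.3, 0.1], [1.0, 0.6, 0.4, 0.2]))  # 0.95

Per the point above, the minimized formula is a starting point for interpretation against the individual cases, not an end point.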

Example

  • Blackman et al (2013) conducted a QCA on 27 local areas with particularly high teenage pregnancy rates, with the outcome being whether the gap in teenage pregnancy rates narrowed. They found “causal combinations” across their 27 cases in 5 sets that used different combinations of 4 conditions (they started with 9 conditions, but 5 did not show up in any of the “causal combinations”)
  • these sets included:
    • narrowing gap was seen in:
      • cases with a high proportion of black and minority ethnic groups
      • cases where there was a combination of low #s in drug treatment and high #s of people under 18 years of age and a “basic” standard of commissioning
    • not narrowing gap was seen when:
      • lower proportion of black and minority ethnic groups and low #s of under 18s
      • lower proportion of black and minority ethnic groups and higher #s in drug treatment
      • lower proportion of black and minority ethnic groups and good/exemplary standard of commissioning
    • ideas as to why these patterns were seen:
      • having a basic standard of commissioning was seen in one ‘good’ combination, but a good/exemplary standard was seen in a ‘bad’ combination – the authors speculated that this could be because getting a “good/exemplary” rating may require a lot more meetings/documentation, which takes people away from doing the real work
      • the lower #s in drug treatment may reflect the level of substance use (which may be similar to level of risk taking generally)
      • areas with the higher #s of under 18s may provide more services to under 18s, leading to lower teen pregnancy rates (but only when combined with lower #s in drug treatment (possibly = lower risk taking) and where a basic standard of commissioning is met)

Limitations of QCA

  • “a static method [that] does not fully capture the dynamics of complex systems”  (Verweij & Gerrits, 2012, p. 51)
    • some potential workarounds for this:
      • “using multiple iterations of the method (i.e., before, during, and after a certain intervention)
      • interpreting the time dimension
      • conceptualizing time as (part of) a set
      • complementing QCA with other methods” (Verweij & Gerrits, 2012, p. 51)
  • can only include a limited number of conditions because the number of possible combinations increases exponentially
    • “many configurations […] will have no cases […but…] this is not a problem, because, in complexity terms, such configurations can be considered as describing empty attractor states in the possibility space” (Byrne, 2013, p. 224)
    • but there will still likely be lots of cases in the “occupied attractor states – configurations” (Byrne, 2013, p. 224) when we use lots of conditions in our QCA
  • adding a new case can lead to a different solution formula – though this “is actually part of the philosophy behind QCA and its roots in systemic thinking. With QCA, the researcher does not strive to identify a single central tendency that reflects reality as more cases are added. Rather, it helps researchers to examine the different causal pathways that lead to a particular outcome and how such pathways are linked to individual cases” (Verweij & Gerrits, 2012, p. 46)

References
Blackman, T., Wistow, J., & Byrne, D. (2013). Using Qualitative Comparative Analysis to understand complex policy problems. Evaluation. 19(2): 126-140. (page references in this blog posting refer to the “Accepted Manuscript” version posted at http://oro.open.ac.uk/37540/2/5C07E325.pdf)
Byrne, D. (2013). Evaluating complex social interventions in a complex world. Evaluation. 19(3): 217-228.
Verweij, S. & Gerrits, L. M. (2012). Understanding and researching complexity with Qualitative Comparative Analysis: Evaluating transportation infrastructure projects. Evaluation. 19(1): 40-55.

More readings on evaluation and complexity

I’m falling down a rabbit hole of reading on this topic! Here are some notes on more papers I’ve read lately… and there will be more (as again, this posting got quite long, so I’m just posting it now and will start another blog posting for the other articles I’m reading on this topic!)

Applying complexity theory: A review to inform evaluation design

  • this paper uses “complexity” to refer to “understanding the social systems within which interventions are implemented as complex” (p. 119)
  • he defines a complex system as “comprised of multiple interacting actors, objects and processes defined as a system based on interest or function” and “are nested”. “The interaction of components in a complex system gives rise to ’emergent’ properties, which cannot be understood by examining the individual system components” and “interactions are non-linear” (p. 119)
  • “challenges posed by complex social systems for evaluation relate to uncertainty in the nature and timing of impacts arising from interventions, due to non-linear interactions within complex systems and the ’emergent’ nature of system outcomes. There are also likely to be differing values and valuation of outcomes from actors across different parts of a complex system, making judgements of ‘what worked’ contested.” (p. 120)
  • “due to the open boundaries of complex systems, there are always multiple interventions operating and interacting, creating difficulties identifying the effects of one intervention over another” (p. 120)
  • in the literature, “there is little consensus regarding what the key characteristics of a complexity informed policy or program evaluation approach should be” (emphasis mine, p. 120)
  • this review paper identified the following themes:
    • developing an understanding of the system
      • “the need to develop a picture of the system operating to aid analysis of both interaction and of changes in system parts” (p. 121)
      • “boundaries are constructs with decisions of inclusion and exclusion reflecting positions of actors involved in boundary definitions” (i.e., “boundaries likely reflect the interest of evaluators and others defining evaluation scope”) (p. 121)
      • “complex system boundaries are socially constructed, so we should be asking about what systems are being targeted for change and what ‘change’ means to various people involved” (p. 125)
    • attractors, emergence, and other complexity concerns
      • emergent properties: “generated through the operation of the system as a whole and cannot be identified through examining individual system parts” (p. 123)
        • this challenges the role of evaluating predetermined goals
      • attractor states: “depict a pattern of system behaviour and represents stability, with a change in attractor state representing a qualitative shift in the system,  with likely impacts on emergent phenomena” (p. 123)
      • it’s important to keep “a holistic view of the system over long time periods” (p. 123)
    • defining appropriate level of analysis
      • the literature includes “a clear call for evaluation to focus upon multiple levels, whilst also noting the challenge this creates” (p. 123)
    • timing of evaluations
      • “non-linear interactions and potential for sudden system transformation suggest we cannot predict when the effects of an intervention will present. Therefore, long evaluative time frames may be required” (p. 123)
      • “evaluation should, if possible, occur concurrently alongside programme development and implementation”  (p. 123)
      • but long timeframes “pose a challenge to the question of what should be evaluated” and “suggest that evaluative activity needs to be on going and that the line between evaluation and monitoring may be blurred. Attribution of outcomes to specific interventions becomes more complicated over time with the number of local adaptations, national level policy changes, and social and economic contextual changes likely to increase.” (p. 123)
      • there is a “role of evaluation for understanding local adaptations and feeding back into implementation processes”  (p. 123) – and this “may be more immediate and relevant to current implementation decisions [than a focus on outcomes/attribution] and therefore provide a more tangible focus for evaluation”  (p. 123)
    • participatory methods
      • “used to gather perspectives of actors across the system to develop systems descriptions; understand how interventions are adapted at the local level; and make explicit different value claims of actors across the system” (p. 124)
    • case study and comparison designs
      • pro: “ability to develop a detailed understanding of a system (or a limited number of systems), in line with complexity theory concepts” (p. 124)
    • multiple and mixed methods
      • “a logical response to the challenge of providing contextualised information on what works” (p. 124)
    • layering theory to guide evaluation
      • “multiple theories can be nested for explanation at multiple levels of a system” (p. 124)
  • “participation built into the evaluation from the start, and a close relationship with stakeholders throughout the evaluation lifecycle is part of an ‘agile’ evaluation” (p. 125)

Perturbing ongoing conversations about systems and complexity in health services and systems

  • “What matters is making sense of what is relevant, i.e., how a particular intervention works in the dynamics of particular settings and contexts” (p. 549)
  • “the most useful questions addressing complex problems must imply an open system: ‘What will the intervention be able to produce?’ and ‘What kind of behaviour will emerge?’ ‘What are our frames of reference?’ ‘What are our ideas and values in relation to success?’” (Stengers, cited on p. 549)
  • “Frameworks for understanding policy development do not merely describe the process. They invariably indicate what a “well-functioning” process is like. And so they place a value on certain structures and behaviour. As our theories change, so do our views of what is good” (Glouberman cited on p. 549)
  • “common to complex systems are two fundamental themes:
    • the universal interconnectedness and interdependence of all phenomena
    • the intrinsically dynamic nature of reality” (p. 549)
  • although there seems to be lots of talk about complexity, its “uptake” in health systems/services has been slow
    • “reductionism remains the dominant paradigm”
    • we often break down the work of clinicians into “discrete activities based on a business model driven by the agenda of cost containment rather than improved patient health” (p. 550)
    • “we must counterintuitively work to develop appropriate abstract frameworks and categories, and reflect on our ways of knowing, if we are to gain a deeper understanding of the processes that operate in complex systems, and how to intervene more successfully” (p. 550)
  • “the awareness of complexity does not imply answering questions or solving problems: rather, it means opening problems up to dynamic reality, as well as increasing the relative level of awareness. Thus, the notion of complexity […] strongly supports the possibility that […] questions and answers may change, as well as the nature of questions and answers upon which scientific investigation is built” (p. 551)

Theory-based Evaluation and Types of Complexity

  • “evaluation deficit” – “the unsatisfactory situation in which most evaluations, conducted at local and other sub-national levels, provide the kind of information (on output) that does not immediately inform an analysis of effects and impacts at higher levels, i.e., whether global objectives have been met. Or, conversely, impact assessment is not corroborated by an understanding of the working of programmes.” (p. 59)
  • Stame criticizes “mainstream evaluation” for “choosing to play a low-key role. Neither wanting to enter into the ‘value’ problem […], nor wanting to discuss the theoretical implications of programmes, evaluators have concentrated their efforts on developing a methodology for verifying the internal validity (causality) and external validity (generalizability) of programmes” (p. 59). She goes on to list the consequences of this “low-key” approach:
    • “fail to formally or explicitly specify theories”
    • assuming the programs are “rational” – e.g., “assuming the needs are known, decision makers are informed […], decisions are taken with the aim of maximizing gains from existing resources”
    • since programs are seen as “rational”, “politics was seen as a disturbance or interference and the political context itself never became an object of inquiry”
    • thinking of the outcome of evaluation as being just “‘instrumental’ use: saying that something worked or did not work” (p. 60)
  • theory-oriented evaluations:
    • “changes […] the attitude towards methods” … “All methods can have merit when one puts the theories that can explain a program at the centre of the evaluation design. No method is seen as the ‘gold standard’. Theories should be made explicit, and the evaluation steps should be built around them: by elaborating on assumptions; revealing causal chains; and engaging all concerned parties” (p. 60)
    • some different approaches to theory-oriented evaluations:
      • Theory-driven evaluation (Chen & Rossi) – many programs have “‘no theory’, goals are unclear, and measures are false”, so “evaluations are ‘at best social accounting studies that enumerate clients, describe programs, and sometimes count outcomes’”. “The black box is an empty box.” Thus, their approach is “more to provide a programme’s missing theory than to discuss the way programmes exist in the world of politics” (p. 61).
      • Theory-based evaluation (Weiss) – the “black box is full of many theories […that] take the form of assumptions, tacit understandings, etc: often more than one for the same programme” (i.e., the different people involved – the many program implementers, recipients, funders, etc. – may all be operating based on different ideas of how the program works and may not even be aware of their own theories). Two parts to theories of change: (1) “‘implementation theory’, which forecasts in a descriptive way the steps to be taken in the implementation of the programme” and (2) “‘programmatic theory’, based on the mechanisms that make things happen” (p. 61-62)
      • Realist Evaluation (Pawson & Tilley): they “stress what the components of a good programme theory should be: context (C) and mechanism (M), which account for outcome (O). Evaluation should be based on the CMO configuration. Programmes are seen as opportunities that an agent, situated inside structures and organizations, can choose to take, and the outcomes will depend on how the mechanism that is supposed to be at work will be enacted in a given context.” “We cannot know why something changes, only that something has changed […] in a given case. And that is why it is so difficult to say whether the change can be attributed to the programme. The realist approach is based on a ‘generative’ theory of causality: it is not programmes that make things change, it is people, embedded in their context, who, when exposed to programmes, do something to activate given mechanisms, and change. So the mystery of the black box is unveiled: people inhabit it.” (p. 62)
    • similarities among these theory-oriented approaches:
      • evaluation is based on “an account of what may happen”
      • they “consider programmes in their context”
      • use “all methods that might be suitable”
      • “are clearly committed to internal validity (they indeed look for causality), but nonetheless allow for comparisons across different situations” (p. 63)
    • differences  among these theory-oriented approaches: role of theory, role of context
  • “reality is complex because:
    • it is stratified, and actors are embedded in their own contexts; and
    • each aspect that may be examined and dealt with by a programme is multifaceted” (p. 63) [this doesn’t seem to fit any of the other definitions of “complexity” that I’ve read]
  • “if […] the evaluator considers that what is important is to know how impact has been attained and why, s/he is bound to consider that means […] are relevant. Evaluation is then concerned with different ways of reaching objectives, and tries to judge which policy instruments, in isolation or in combination, and in what sequence, are better suited to the actors’ situation in given contexts” (p. 66)

Complex, but not quite complex enough: The turn to the complexity sciences in evaluation scholarship

  • This article provides a critique of the way that many evaluators have been writing about, and attempting to apply, “complexity sciences” (see my previous posts here and here for my notes from some of the types of articles he’s critiquing)
  • Mowles’ main critiques are that:
    • “there is a tendency either to over-claim or under-claim [the] importance” (p. 160) of complexity sciences
    • evaluation “scholars are not always careful about which of the manifestations of the complexity sciences they are appealing to” (p. 160)
    • evaluation scholars do not always “demonstrate how they understand [the complexity sciences] in social terms” (p. 160)
      • evaluators who favour a “contingency approach to complexity” (i.e., we can pick and choose when to use it based on our decision about whether a program (or part of a program) is “complex”) “suggest complexity is a ‘lens’ or framework to be applied if helpful, and take emergence to mean the opposite of being tightly planned” (p. 167). This leads to evaluators seeing only those programs (or parts of programs) that they have deemed to be complex as needing a special approach, e.g., “trying to feed back data and information in real time” (p. 167)
      • But “in portraying emergence as a special phenomenon [these evaluators] have implicitly dismissed the idea that the human interaction is always complex, and that emergence, which we might understand in social terms as the interplay of intentions, is always happening, whether a social program is tightly planned or not” (p. 167)
    • thus, “complexity sciences” are used as just another tool within evaluation as a “logical, rational activity” (p. 160) – he cites Fleck who “described the ways in which groups of scholars, committed to understanding the world in a particular way, resist the rise of new ideas by either ignoring them or rearticulating them in terms of the prevailing orthodoxy” (p. 161) – with the implication being that this is what many evaluators are doing – rather than grappling with the complexity sciences to see what the implications are for evaluation, they are trying to fit the complexity sciences into their existing ways of evaluating. He goes on to ask “what difference appealing to the complexity sciences makes to the prescriptions that scholars recommend for evaluative practice” (p. 161). – that is, does “applying” complexity theory lead these evaluators to do anything differently than they would have done without it?
  • trends noted in evaluation scholarship re: complexity:
    • many suggest that complexity is something that an evaluator should choose at what time/in what circumstances to use
      • many use the “Stacey Matrix”, which is “a contingency theory of organizations understood as complex adaptive systems [that] suggests that the nature of the decision facing managers depends on the situation facing them” (p. 163) [the one Patton uses in his Developmental Evaluation book – with low-high certainty on one axis and low-high agreement on the other, used to determine if something is simple, complicated, complex, or chaotic] – even though “Stacey himself abandoned the idea that organizations can be helpfully understood as complex adaptive systems, and has moved on from a contingency perspective” (p. 163)
      • using this approach “allows evaluators in the mainstream to claim that the complexity sciences may be quite helpful but only in circumstances of their own choosing” – this represents the “‘spectator theory of knowledge’, which sustains a separation between the observer and the thing observed.” (p. 164)
      • Mowles suggests that everything is complex – even following rules like a recipe [the oft-given example of “simple”] “is a highly social process where the rules inform practice and practice informs the rule” (p. 163)
    • talking about “complexity sciences” as if it were just one thing, “homogenizing” them OR just picking “some of the characteristics of particular manifestations of the complexity sciences” (p. 162) [thought: it’s kind of funny that complexity theory includes the notions that the whole is not just the sum of the parts/you can’t understand the whole just by looking at the parts… but then we say we are applying complexity theory by just looking at some of the parts]
      • he notes that Patton draws on a lot of aspects of complexity sciences for his Developmental Evaluation approach “without offering a view as to whether one particular branch of the complexity sciences is more helpful than another” (p. 164)
      • he also notes, somewhat snarkily (though not unjustifiably) in my opinion, that “In the development of the disciplines of evaluation, particularly those claiming to be theory-based, it is probably important to know what the theories being taken up actually claim to be revealing about nature, and to be able to make distinctions between one theory and another” (p. 164)
    • making the assumption that “the social is best understood in systemic terms”; social/health interventions understood as a “system with a boundary, even if that boundary is ‘open’. Interaction is then understood as taking place between entities, agents, even institutions operating at different ‘levels’ of the system, or between systems, which leads to the idea that social change can be both wholesale and planned”… this “allows scholars to avoid explaining their theory of social action, or to interpret complexity theories from the perspective of social theory and thus to read into them more than they sustain” (p. 162)
  • “insights from complexity theory help us understand why social activity is unpredictable”, but remember that “evaluation practice […] is also a social activity”, so “it can no longer be grounded in the certainties of the rational, designing evaluator” (p. 163)
  • a brief summary of how complexity theory evolved over time:
    • Step 0: equilibrium model in classical physics & economics that assumes
      (a) system with a boundary, made of interacting entities
      (b) entities are homogeneous
      (c) interactions occur at an average rate
      (d) system moves towards equilibrium
    • Step 1:
      • removes assumption (d) (i.e., not assumed to be moving towards equilibrium)
      • replaces linear equations with non-linear
      • output of one equation feeds into next iteration of the equation
      • basis for modeling chaos (see the logistic-map sketch after this list)
    • Step 2:
      • removes assumptions (c) (i.e., interactions not assumed to occur at an average rate) and (d) (i.e., not assumed to be moving towards equilibrium)
      • used to explain dissipative structures, things jumping to different states, ability of things to self-organize
    • Step 3:
      • removes assumptions (b) (i.e., entities are not homogeneous), (c) (i.e., interactions not assumed to occur at an average rate) and (d) (i.e., not assumed to be moving towards equilibrium)
  • complex adaptive systems (CAS) – “agent-based models run on computer” are “temporal models that change qualitatively over time and attempt to explain how order emerges from apparent disorder, without any overall blue-print or plan” (p. 165)
    • attempts “to describe how global patterns arise from local agent behaviour” (p. 166)
    • can operate at Step 2 or 3
  • in real life, people are not homogenous and interactions are not average and not linear, so it is at Step 3 where we see “truly evolutionary and novel behaviour emerge” (p. 166)
  • “models are helpful in supporting us to think about real world problems, [but remember that…] “mathematical models uncover fundamental truths about mathematical objects and not much about the real world” (p. 166)
  • Mowles identifies three evaluation scholars (Callaghan, Sanderson, Westhorp) who suggest that evaluators should “draw on insights from the complexity sciences more generally to inform evaluation practice, rather than understanding the insights to refer only to special cases” (p. 167)
  • he identifies that the use of experimental methods to evaluate represents the “highest degree of abstraction” (p. 167) from the program being evaluated and notes that “Theories of Change” are a “hybrid of systems thinking and emancipatory social theory” (p. 168) as they “draw on propositional logic and represent social change in the form of entity-based logic models showing the linear development of social interventions towards their conclusions” (p. 167), but also “often point to the importance of participation and involvement of the target population of programmes to inspire motivation” (pp. 167-8)
  • “realist evaluators” talk of “‘generative’ theories of causality, i.e., ones that open up the ‘black box’ of what people actually do to make social programmes ‘work’ or not” (p. 168) – they argue that “interventions do or do not achieve what they set out to because of a combination of context, mechanism and outcomes (CMO). [RE] is concerned with finding what works for whom and in what circumstances and then extrapolating a detailed and evolving explanation to other contexts” (p. 168)
    • Callaghan “adds […] on the idea of a mechanism, that what people are doing locally in their specific contexts to make social projects work is to negotiate order” (p. 168)
    • Westhorp “recommends trying to identify the local ‘rules’ according to which people are operating as a way of offering richer evaluative explanations of what is going on” (p. 168)
    • but Mowles suggests that this does not go far enough and that rather than opening the black box, realist evaluators  “use a mystery to explain a mystery” (p. 168) and don’t seem able “to let go of the idea of a system with a boundary, outside which the evaluator stands, comprising abstract, interacting parts” (p. 169)
    • he also suggests that the “persistence of systematic abstractions and predictive rationality may be that they protect the discipline of evaluation by separating the evaluator from the object to be evaluated” (p. 169) – not that evaluators are totally “unaware of the way that they influence social interventions” (p. 169), but that they “only go so far in developing how much these non-linear sciences apply to them and what they are doing in the practice of evaluation” (p. 170)
  • Mowles’ suggested alternative is “a radical interpretation of the complexity sciences, which understands human interaction as always complex and emergent” (p. 160)
  • he references Stacey (remember, the one who has moved on from the contingency approach) and colleagues and notes that “they argue that in moving from computer modelling [e.g., CAS] to theories of the social, but by preserving some of the insights by analogy, it might be helpful to think of social interaction as tending neither towards equilibrium nor as linear, nor as forming any kind of a whole. Social life always takes place locally between diverse individuals who have their own history and multiple understandings of what is happening as they engage and take up broader social themes” (p. 170)
  • he goes on to say that rather than thinking in terms of a system with a boundary, we think of “global patterns of human relating aris[ing] from many, many local interactions, paradoxically informing and informed by […] the habitus. The habitus is habitual and repetitive, but because it is dynamically and paradoxically emerging it also plays out in surprising, novel and sometimes unwanted ways because of the interweaving of intentions.”
  • Mowles discusses the following implications for evaluation of his “radical” interpretation of complexity:
    • evaluators cannot really just “decide” which social interventions (or parts thereof) are complex and which ones are not
    • “calls into question the idea that emergence is a special category of social activity” (p. 171) – “social life is always emerging in one pattern or another, whether an intervention is tightly or loosely planned, and that people are always acting reasonably […] rather than rationally” (p. 171)
    • “evaluation is a situated, contextual practice undertaken by particular people with specific life-histories interacting with specific others, who are equally socially formed. The evaluative relationship is an expression of power relations, both between the commissioner of the social intervention/evaluation and the evaluator, and between these and the people comprising the intervention, which will inform how the evaluation emerges” (p. 171)
    • simplifying programs for the purposes of evaluation “cover[s] over the very improvisational and adaptive/responsive activity that makes social projects works, and even improve them, and which should be of interest both to commissioners and evaluators” (p. 171)
    • “an evaluator convinced about complexity might […] take an interest in how their own practice forms, and is formed by the relationships they are caught up in with the people they are evaluating” (p. 171) – they’d be interested in:
      • “how people in the intervention negotiate order” (p. 171)
      • “how the evaluation itself is negotiated” (p. 171)
      • “how power relations play out in, and affect, the social intervention, including the framing of both the social development project as a logical project and the evaluation as a rational activity” (p. 171)
      • “pay[ing] close attention to the quality of conversational life of social interventions, including how participants took up and understood any quantitative indicators that they might be using in the unfolding project” (p. 171)
      • “there will always be unintended and unwanted outcomes of social activity, which may be just as important as what is intended” (p. 171)
      • “how the programme changed over time, and how people accounted for these changes: ‘progress’ in terms of the social intervention, could also be understood in the movement of people’s thinking and their sense of identity” (p. 171)
      • “evaluators should assume a greater humility in their work and their claims about predictability, causality and replicability” (p. 171)
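
To make Step 1 concrete: the logistic map is the textbook example of a non-linear equation whose output feeds into its next iteration, the basis for modeling chaos. This sketch is my own illustration, not an example from Mowles’ paper.

    # Logistic map: x_{n+1} = r * x_n * (1 - x_n). A non-linear iteration
    # (Step 1 above); my own illustration, not from Mowles' paper.
    def logistic_map(r, x0, steps):
        xs = [x0]
        for _ in range(steps):
            xs.append(r * xs[-1] * (1 - xs[-1]))
        return xs

    print(logistic_map(2.5, 0.2, 30)[-1])      # settles to a fixed point (0.6)
    print(logistic_map(3.9, 0.2, 30)[-3:])     # never settles: chaotic regime
    print(logistic_map(3.9, 0.2001, 30)[-3:])  # tiny change in x0, very different path
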
References
Martin, C.M., & Sturmberg, J. P. (2009). Perturbing ongoing conversations about systems and complexity in health services and systems. Journal of Evaluation in Clinical Practice. 15: 549-552.
Mowles, C. (2014). Complex, but not quite complex enough: The turn to the complexity sciences in evaluation scholarship. Evaluation. 20(2): 160-75.
Stame, N. (2004). Theory-based Evaluation and Types of Complexity. Evaluation. 10(1): 58-76.
Walton, M. (2014) Applying complexity theory: A review to inform evaluation design. Evaluation and Program Planning. 45: 119-126.

Complexity and Evaluation

Notes from some readings on complexity and evaluation.

A Review of Three Recent Books on Complexity and Evaluation

Gerrits and Verweij (2015) reviewed three books that explored complexity and evaluation:

  • Forss et al’s Evaluating the Complex: Attribution, Contribution, and Beyond (2011)
  • Patton’s Developmental Evaluation: Applying Complexity Concepts to Enhance Innovation and Use (2011)
  • Wolf-Branigin’s Using Complexity Theory for Research and Program Evaluation (2013)

They note that all three of these books raise a similar concern (“that the complexity of social reality is often ignored, leading to misguided evaluation and policy recommendations, and that the current methodological toolbox is not particularly well-suited to deal with complexity” (p. 485)), but that they deal with this concern in different ways.

How they define complexity:

  • Forss et al:
    • “there is a difference between complexity as an experience and complexity as a precise quality of social processes and structures” (p. 485)
    • give multiple definitions
    • mention “a system state somewhere between order and chaos” and a focus on the non-linear and situated nature of complex systems (p. 485)
  • Patton:
    • “describes rather than defines complexity” (p. 485)
    • core principles of: non-linearity, emergence, adaptive behavior, uncertainty, dynamics, co-evolution
    • “bolts on Holling’s adaptive cycle and panarchy” (p. 485)
  • Wolf-Branigin:
    • “settles on Mitchell’s (2009) definition which focuses on the self-organizing aspect of complex systems, out of which collective behavior emerges” (p. 485)
    • “emergent behavior […] is a process that is embedded in complex systems” (p. 485)
    • complex systems –> complex adaptive systems “when the constituent elements show mutual adaptation” (p. 485)

They note that Wolf-Branigin offers a “complexity-friendly set of evaluation methods”, and that Forss et al, being an edited volume of chapters by different authors, deal with complexity in a variety of ways (with possibly some conflation of complexity and complicatedness, which suggests that some of the authors perhaps did not have a clear understanding of complexity).

In contrast to a focus on methods, they note that Patton views complexity as a “heuristic and sense-making device” (p. 487). Thus Developmental Evaluation “is a dynamic kind of evaluation that does not only seek to identify causal relationships and to serve accountability, but that also offers an approach that interacts with the programs it evaluates, preferably feeding results back into the program on the fly, so as to develop it” (Gerrits & Verweij, 2015, p. 486).

A few other points of interest:

  • “Whereas complicated interventions can be evaluated by asking “what works for whom in what contexts” […] in complex programs, ‘it is not possible to report on these in terms of “what works”… because what “it” is constantly changes'” (Gerrits & Verweij, 2015, p. 486)
  • When the “object of evaluation is complex (i.e., changes over time, etc.), it challenges the evaluation methods that do not account for that complexity” (Gerrits & Verweij, 2015, p. 488)
  • “Complexity features a language that is relatively foreign to evaluators and that is difficult to operationalize” (Gerrits & Verweij, 2015, p. 488)

A Paper on “Evaluating Complex and Unfolding Interventions in Real Time”

  • “simple interventions rely upon a single (a coherent set of) known mechanisms with a single (a coherent set of) output whose benefits are understood to lead to measurable and widely anticipated outcomes” – e.g., a drug to treat a disease
  • “complicated interventions involve a number of interrelated parts, all of which are required to function in a predictable way if the whole intervention is to succeed. The processes are broadly predictable and outputs arrive at outcomes in well-understood ways” – e.g., a rocket ship is complicated – lots of interrelated parts, but it functions as expected (e.g., “it does not transform itself over time into a toaster”)
  • “complex interventions are characterized by:
    • feedback loops
    • adaptation and learning by both those delivering and those receiving the intervention
    • portfolio of activities and desired outcomes which may be re-prioritized or changed
    • sensitive to starting conditions
    • outcomes tend to change, possibly significantly, over time
    • have multiple components which may act independently and interdependently” (Ling, p. 80)
  • when delivering (or receiving) complex interventions, people:
    • “learn and adapt
    • reflexively seek to make sense of the systems in which they act and where possible to change how they work
    • adapt behaviour based on a changing understanding of the consequences of their actions”
    • of course, they (and the evaluators) only have an “incomplete understanding of these systems and their actions based on this limited understanding may be unpredictable” (p. 81)
  • RCTs can be used for simple and even complicated interventions, but are not appropriate for evaluating complex interventions because they are “inherently unable to deal with complexity” (p. 80)
  • also, it is important to remember that “interventions interact with complex systems in ways that cannot be predicted. The evaluation challenge lies in understanding this interaction” (emphasis mine, p. 80)

“While we need to challenge the expectation that evaluations of the complex will lead to more precise predictions and greater control, we should not abandon the belief that appropriately structured evaluations can contribute positively to reflexivity while simultaneously fulfilling the evaluators’ mission to strengthen both learning and accountability. To do so we will need to trade our search for universal generalizability in favour of more modest, more contingent, claims. In evaluating complex interventions we should settle for constantly improving understanding and practice by focusing on reducing key uncertainties.” (p. 81)

  • problem with “more conventional approaches” to program evaluation when used in situations of complexity:
    • expect to understand the whole by looking at a combination of its parts
    • evaluations “therefore […try to…] build up detailed pieces of evidence into an accurate account of the costs (or efforts) and the consequences, […] add up all the inputs, describe the processes, list the outputs and (possibly) weight outcomes and put this together to form judgements about and draw evaluative conclusions” (p. 81)
    • this can work for simple or complicated interventions “where we can make highly plausible assumptions that we know enough about both the intervention and the context” (p. 81)
  • for complexity, however, this is not the case:
    • need to “start with an understanding of the systems within which the parts operate”
    • “it is not simply the presence of [factors] (and the more the better), [but] rather it is how these parts are combined and balanced […] and how they are shaped to address local circumstances or resonate with national agendas. In other words, how they form a system of improvement and how this system interacts with other systems in and around healthcare services. From an evaluator’s point of view, ‘What matters is making sense of what is relevant, i.e., how a particular intervention works in the dynamics of particular settings and contexts.’” (emphasis mine, p. 81-2)
  • “conceptualizing complex interventions is made more difficult still by the fact that we rarely find an intervention that can adequately be described as a single system. More often there are systems nested within systems.” (p. 82)
    • e.g., systems “operating individual, organization, and whole-system levels (or micro, meso, and macro)” (p. 82)
    • “when we talk about an intervention being context-dependent, or context-rich, we are describing how the processes and outcomes in each case are shaped by the particular ways in which these systems and subsystems uniquely interact” (p. 82)
  • “most economic evaluations are still primarily quantitative evaluations of “black box” interventions – that is, with little or no explicit interest in how and why they generate different effects or place different demands on the use of resources” (p. 83)
  • we need to recognize that the context in which an intervention is conducted is important, but “this approach to contextualization could lead to the conclusion that every context is different and unique and so we cannot use the lessons from one evaluation to inform decisions elsewhere […] To address this challenge, we can use complexity thinking to go beyond simply arguing that each context is different by showing how particular systems function and how systems interact. If this were successful it would provide a way of contextualizing and then allowing ‘mid-range generalization’. This could deliver sufficiently thick description of the workings of systems and subsystems to support reflexive learning within the intervention and more informed decision making elsewhere. It establishes mid-ground between the uniqueness of everything and universal generalizability.” (emphasis mine, p. 83-4)
  • “evaluations should more often be conducted in real time and support reflexive learning and informed adaptation. Rather than seeing an intervention as a fixed sequence of activities, organized in linear form, capable of being duplicated and repeated, we see an intervention as including a process of reflection and adaptation as the characteristics of the complex system become more apparent to practitioners. The evaluation aims in real time to understand these and support more informed adaptation by practitioners. It also provides an account of if and how effectively practitioners have adapted their activities in the light of intended goals. They can be held to account for their intelligent adaptation rather than slavishly adhering to a set of instructions. Furthermore, the evaluation should say something about how the approach might be applied elsewhere.” (emphasis mine, p. 84-5)
  • Ling cites Stirling’s Uncertainty Matrix as a useful way to think about the “different kinds and causes of uncertainty” (p. 85)

Uncertainty Matrix (adapted from Stirling, 2010)

                                       knowledge about possibilities
                                       non-problematic    problematic
    knowledge about   non-problematic  risk               ambiguity
    probabilities     problematic      uncertainty        ignorance

  • probabilities – i.e., the chance of something happening
  • possibilities – i.e., the range of things that can happen
  • our knowledge of probabilities and possibilities can each be either non-problematic (i.e., we know the chance of something happening and we know the range of things that can happen, respectively) or problematic
  • risk – when we know the range of possibilities and each of their probabilities – we can engage in risk assessments, expert consensus, optimizing models
  • uncertainty – limited number of possibilities but we don’t know the probabilities of them occurring – we can use scenarios, sensitivity testing, etc.
  • ignorance – both the range of possibilities and their probabilities not known – we need to monitor, be flexible and adaptive
  • ambiguity – range of possibilities is problematic, but probabilities not problematic – we can use participatory deliberation, multicriteria mapping, etc. (all four kinds are sketched in code after this list)
  • for simple interventions, evaluations aim for certainty
  • for complicated interventions, evaluations aim to reduce (known) uncertainty
  • for complex interventions, evaluations aim to support a self-improving system
    • first aim to expose uncertainties, then to reduce them
    • “need to understand both activities and contexts, important to identify how learning and feed back happens, understand both system dynamics but also what makes change ‘sticky’, real-time evaluation necessary, requires a counterfactual space or matrix” (p. 86)
  • Ling recommends “an evaluation approach based […on] understanding the unfolding ‘Contribution Stories’ that those involved in delivering and adapting interventions work with to describe their activities and anticipated events” (p. 86-7)
    • Contribution Stories “aim to surface and outline how those involved in the intervention understand the causal pathways connecting the intervention to intended outcomes” and “provide an opportunity to explore their thinking about how the different aspects of the intervention interact with each other and with other systems” (p. 87)
    • from the Contribution Stories, “more abstract Theories of Change can be developed which trace the causal pathway linking resources use to outcomes achieved. These Theories of Change will be contingent and context-dependent and should be expressed as ‘mid-range theories’; not so specific that they amount to nothing more than a listing of micro-level descriptions of the causal pathway of the specific intervention but also not so abstract that it cannot be tested or informed by the evidence from the evaluation.” (emphasis mine, p. 87)
    • next, evaluators (1) “identify key uncertainties associated with the intervention – those anticipated causal linkages for which there is limited evidence or inherent ambiguities or ignorance”, and (2) plan data collection & analysis to reduce these uncertainties, hopefully producing evidence that would be both relevant and timely (p. 87)
  • 6 stages at which evaluators should “reflect on the consequences of complexity” (p. 87):
    1. Understand the intervention’s Theory of Change and its related uncertainties
      • include the “importance of learning and adapt[ation]” (p. 87)
      • “identify key dependencies upon systems and subsystems which lie outside the formal structures of the intervention” (p. 87)
    2. Collect and analyse data focused on key uncertainties
      • “identify where key uncertainties exist”
      • identify “what sort of uncertainty it is” (ignorance, risk, ambiguity, uncertainty)
      • “data collection alone may not address all of the key uncertainties” (p. 87)
    3. Identify how reflexive learning takes place through the project and plan data collection and analysis to support this, strengthening the formative role of evaluation
      • there is a “creation of evidence by the project itself as it learns and adapts”
      • the “evaluation can support this learning as part of a formative role at the same time as building a data base for its own summative evaluation” (p. 87) (with a shift in the balance towards a more formative role) [this sounds like what I’ll be doing with my project]
    4. Building a portfolio of activities and costs
      • “identifying boundaries around the cost base is made difficult when the success of a project may depend more on harnessing synergies from outside the intervention itself.” (p. 88)
      • “a major cost in conditions of complexity is equipping projects to be adaptable and responsive to a changing environment. Essentially, part of what is being ‘bought’ is flexibility and, by definition, this means that some resources might not need to be used. It could be regarded as the cost of uncertainty” (p. 88)
    5. Understand what would have happened in the absence of the intervention
      • it is “often much harder to identify the counterfactual” for a complex intervention than for simple/complicated ones, but it is still “crucial to pose the core question in an evaluation which is ‘did it make a difference?’”, which of course requires us to ask “compared to what?”
      • rather than the counterfactual being a single thing, think of it more as “a counterfactual space of more or less likely alternative states. This might be produced by scenarios, modelling, simulation, or even expert judgement depending upon the nature of the uncertainty” (p. 88) [I’ve sketched a toy version of this in code just after these notes]
    6. “The evaluation judgment should not aim to identify attribution (what proportion of the outcome was produced by the intervention?) but rather to clarify contribution (how reasonable is it to believe that the intervention contributes to the intended goals effectively and might there be better ways of doing this?)” (p. 88)
  • the above is a general outline – still needs to be fleshed out
  • important to remember that “interventions change as they unfold” and “this adaptation is both necessary and unpredictable” (p. 89)
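  • to make the “counterfactual space” idea concrete for myself, here is a toy Monte Carlo sketch. This is my own illustration, not anything from Ling – the baseline, trend, and shock numbers are invented – but it shows what it could mean to treat the counterfactual as a distribution of plausible “what would have happened anyway” outcomes rather than as a single number:

```python
# Toy sketch (mine, not Ling's): build a *space* of plausible no-intervention
# outcomes by sampling uncertain assumptions, then ask how the observed
# outcome compares against the whole distribution.
import random

def no_intervention_outcome(baseline, trend, shock_sd):
    """One plausible 'what would have happened anyway' endpoint."""
    return baseline + trend + random.gauss(0, shock_sd)

def counterfactual_space(n=10_000):
    outcomes = []
    for _ in range(n):
        # Each draw is one scenario: an uncertain secular trend plus
        # volatility from the other systems the intervention sits inside.
        trend = random.uniform(0.0, 2.0)
        shock_sd = random.uniform(0.5, 1.5)
        outcomes.append(no_intervention_outcome(10.0, trend, shock_sd))
    return outcomes

observed = 14.0  # the outcome actually measured with the intervention in place
space = counterfactual_space()
share_beaten = sum(o < observed for o in space) / len(space)
print(f"Observed outcome beats {share_beaten:.0%} of simulated counterfactuals")
```

    • instead of “did it make a difference?” having a yes/no answer against one counterfactual, the question becomes where the observed outcome sits within the whole space of alternatives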

A Few Points from the Stirling Paper

  • I looked up the Stirling paper that Ling had cited to read more about the uncertainty matrix. This paper made the point that “when knowledge is uncertain, experts should avoid pressures to simplify their advice. Render decision-makers accountable for decisions.” (p. 1029).
  • Also: “An overly narrow focus on risk is an inadequate response to incomplete knowledge.” (p. 1029)
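  • as I read it, the matrix turns on two questions: how well can we characterize the possible outcomes, and how well can we characterize their likelihoods? Here is my own reconstruction of the four quadrants as a small lookup (my paraphrase of Stirling – worth checking against the paper before relying on it):

```python
# My reconstruction of Stirling's (2010) uncertainty matrix as a lookup.
# Two questions: are the possible outcomes well characterized, and are
# their likelihoods well characterized?
def kind_of_incomplete_knowledge(outcomes_clear: bool, likelihoods_clear: bool) -> str:
    if outcomes_clear and likelihoods_clear:
        return "risk"         # both well characterized: classic risk assessment applies
    if outcomes_clear:
        return "uncertainty"  # we know what could happen, but not how likely it is
    if likelihoods_clear:
        return "ambiguity"    # likelihoods may be calculable, but what counts as an outcome is contested
    return "ignorance"        # we don't even know what we don't know

print(kind_of_incomplete_knowledge(outcomes_clear=True, likelihoods_clear=False))  # -> uncertainty
```

  • this maps onto the four types Ling lists in stage 2 above (ignorance, risk, ambiguity, uncertainty) – the point being that conventional risk assessment only really fits the “risk” quadrant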

A Paper on “Using Programme Theory to Evaluate Complicated and Complex Aspects of Interventions”

  • It’s not about “creating messier logic models with everything connected to everything. Indeed, the art of dealing with the complicated and complex real world lies in knowing when to simplify and when, and how, to complicate” (p. 30)
  • various names for “program theory”:
    • programme logic
    • theory-based evaluation
    • theory of change
    • theory-driven evaluation
    • theory-of-action
    • intervention logic
    • impact pathway analysis
    • programme theory-driven evaluation science
    • all of these refer to “a variety of ways of developing a causal model linking programme inputs and activities to a chain of intended or observed outcomes, and then using this model to guide the evaluation” (p. 30)
  • Glouberman and Zimmerman’s (2002) analogy re: complexity:
    • simple = following a recipe (very predictable)
    • complicated = sending a rocket ship to the moon (need a lot of expertise, but there is high certainty about the outcome; doing it once increases your likelihood of doing it again with the same result)
    • complex = raising a child (every child is unique and needs to be understood as such; what works well with one child will not necessarily work well with another; uncertainty about outcome)
  • Rogers suggests using this distinction to think about different aspects of an intervention (as some aspects of an intervention could be simple, while others are complicated or complex)
  • simple linear logic models (inputs –> activities –> outputs –> outcomes –> impact):
    • lack information about other things that can affect program outcomes, such as “implementation context, concurrent programmes and the characteristics of clients” (p. 34)
    • risk “overstating the causal contribution of the intervention” (p. 34)
    • best to reserve simple logic models for “aspects of interventions that are in fact tightly controlled, well-understood and homogeneous or for situations where only an overall orientation about the causal intent of the intervention is required, and they are clearly understood to be heuristic simplifications and not accurate models” (p. 35)
  • complicated logic models:
    • multi-site, multi-governance – can be challenging to get multiple groups to agree on evaluation questions/plans, but if there is a clear understanding of the “causal pathway” (e.g., a parasite causes a known problem, program is working to reduce the spread of that parasite), you can use a single logic model, report data separately for each site and in aggregate for the whole
    • simultaneous causal strands – all of which are required for the program to work (“not optional alternatives but each essential”, p. 37); must show them in the logic model (and indicate they are all required) and collect data on them
    • alternative causal strands – where the “programme can work through one or the other of the causal pathways” (p. 37); often, different “causal strands are effective in particular contexts”; difficult to denote visually on a logic model
      • can conduct evaluations that involve “comparative analysis over time of carefully selected instances of similar policy initiatives implemented in different contextual circumstances” (Sanderson, 2000, cited in Rogers, 2008, p. 37)
      • it’s important for the evaluation to document the alternative causal strands “to guide appropriate replication into other locations and times” (p. 38)
  • complex logic models:
    • two aspects of complexity that Rogers talks about as having been addressed in published evaluations:
      • recursive causality & tipping points – rather than program logic being a simple “linear progression from initial outcomes to subsequent outcomes” (p. 38), the links are “likely to be recursive rather than unidirectional” and have “feedback mechanisms [and] interactive configurations” – it’s “mutual, multidirectional, and multilateral” (Patton, 1997, cited in Rogers, 2008, p. 38)
      • “many interventions depend on activating a ‘virtuous circle’ where an initial success creates the conditions for further success,” so, “evaluation needs to get early evidence of these small changes, and track changes throughout implementation” (p. 38)
      • ‘tipping points’ – “where a small additional effort can have a disproportionately large effect, can be created through virtuous circles, or [as] a result of achieving certain critical levels” (p. 38)
      • can be hard to show virtuous circles/tipping points on logic model diagrams, so they may require explanatory notes on the diagrams [I wonder if we can do anything with technology to better illustrate this? I’ve sketched a toy feedback/tipping-point model in code after these notes on the Rogers paper]
      • emergence of outcomes
        • what outcomes there will be, and how they will be achieved, “emerge during implementation of an intervention”
        • this may be appropriate:
          • “when dealing with a ‘wicked problem’
          • where partnerships and network governance are involved, so activities and specific objectives emerge through negotiation and through developing and using opportunities
          • where the focus is on building community capacity, leadership, etc., which can then be used for various specific purposes” (p. 39)
        • could develop a “series of logic models […] alongside development of the intervention, reflecting changes in understanding. Data collection, then, must be similarly flexible.” (p. 39)
          • may have a clear idea of the overall goals, but “specific activities and causal pathways are expected to evolve during implementation, to take advantage of emerging opportunities and to learn from difficulties” (p. 40) – so could develop an initial model that is “both used to guide planning and implementation, but [is] also revised as plans change” (p. 40) [this is what we are doing on my current project]
  • interventions that have both complicated and complex aspects
    • e.g., multi-level/multi-site (complicated) and emergent outcomes (complex)
    • could have a logic model that “provide[s] a common framework that can accommodate local adaptation and change” (p. 40)
    • “a different approach is not to present a causal model at all, but to articulate the common principles or rules that will be used to guide emergent and responsive strategy and action” (p. 42-3)
  • how to use program theory/logic models for complicated & complex program models
    • with simple logic models, we use program theory/logic models to create performance measures that we use to monitor program implementation and make improvements
    • with complicated & complex models, we cannot do this so formulaically
    • one of the important uses of program theory/logic models in these situations is in having “discussions based around the logic models” (p. 44)
    • evaluation methods tend to be more “qualitative, communicative, iterative, and participative” (p. 44)
    • Rogers also describes the use of ‘emergent evaluation’ – engaging stakeholders in “highly participative” processes that “recognize difference instead of seeking consensus that might reflect power differences rather than agreement” (p. 44) – these “multi-stakeholder dialogues [are] used simultaneously in the roles of data collection, hypothesis testing and intervention, rather than evaluators going away with the model and returning at the end with results” (p. 44), and stakeholders can then “start to use the emerging program theories […] to guide planning, management and evaluation of their specific activities” (p. 44)
    • having “participatory monitoring and evaluation to build better understanding and better implementation of the intervention” (p. 45)
    • citing Douthwaite et al., 2003: “Self-evaluation, and the learning it engenders, is necessary for successful project management in complex environments” (p. 45)
  • final thoughts:
    • “The anxiety provoked by uncertainty and ambiguity can lead managers and evaluators to seek the reassurance of a simple logic model, even when this is not appropriate[, but…] a better way to contain this anxiety might be to identify instead the particular elements of complication or complexity that need to be addressed, and to address them in ways that are useful” (p. 45)
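  • since flat diagrams (and bullet points!) struggle with feedback, here is a toy sketch of the recursive-causality idea. This is my own illustration, not Rogers’ – the coefficients and the threshold are invented – but it shows how a virtuous circle differs from a linear inputs –> activities –> outputs chain: early success feeds back into capacity, and crossing a tipping point produces a disproportionately large jump:

```python
# Toy sketch (mine, not Rogers'): contrast a linear logic model (a chain
# with no feedback) with a 'virtuous circle' that has a tipping point.

# A simple linear logic model is just a one-way chain:
LINEAR_CHAIN = ["inputs", "activities", "outputs", "outcomes", "impact"]

def virtuous_circle(periods=12, seed_success=0.05, tipping_point=0.3):
    """Each period, success builds capacity, which raises the next period's success."""
    success = seed_success
    for t in range(periods):
        feedback = 0.5 * success                         # recursive link: outcomes feed back into capacity
        boost = 0.3 if success > tipping_point else 0.0  # disproportionate jump past the critical level
        success = min(1.0, success + feedback + boost)
        print(f"period {t:2d}: success = {success:.2f}")

virtuous_circle()
```

  • running this shows slow compounding for the first few periods and then a sharp jump once success crosses the threshold – which is exactly why Rogers says evaluation “needs to get early evidence of these small changes” rather than waiting for end-of-project measurement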

I have a lot more articles to read on this topic, but this blog posting is getting very long, so I’m going to publish this now and start a new posting for more notes from other papers.

References
Gerrits, L., & Verweij, S. (2015). Taking stock of complexity in evaluation: A discussion of three recent publications. Evaluation, 21(4), 481–491.
Ling, T. (2012). Evaluating complex and unfolding interventions in real time. Evaluation, 18(1), 79–91.
Rogers, P. (2008). Using programme theory to evaluate complicated and complex aspects of interventions. Evaluation, 14(1), 29–48.
Stirling, A. (2010). Keep it complex. Nature, 468, 1029–1031.