Evaluator Competencies Series: Evaluation Standards

Next up in my evaluator competency series is the second Reflective Practice standard:

1.2 Integrates the Canadian/US Joint Committee Program Evaluation Standards in professional practice.

The Program Evaluation Standards are a set of statements that evaluators – or evaluation users – can use to plan and execute high quality evaluations and to judge the quality of evaluation proposals and evaluations. The standards were developed by the Joint Committee on Standards for Educational Evaluation (JCSEE) and are used by both the Canadian Evaluation Society (CES) and the American Evaluation Association (AEA). The standards were most recently published in 2011 (the 3rd edition); they were reviewed more recently than that to see if they needed updating but it was decided that they were sufficient the way they are.

The standards are grouped into five categories:

  • Utility
  • Feasibility
  • Propriety
  • Accuracy
  • Evaluation Accountability

In each of these categories, there are several standard statements that describe what high quality evaluations should do. For example, under the category of “utility”, there are 8 statements of what evaluations should do to be useful and under the “propriety” category, there are 7 statements of what evaluations should do to be ethical, just, and fair.

As I reviewed the standard statements for this blog posting, I noticed that both the CES and AEA, which both list the statements on their websites, include the following note: “Authors wishing to reproduce the standard names and standard statements with attribution to the JCSEE may do so after notifying the JCSEE of the specific publication or reproduction.” So, since I haven’t notified the JCSEE that I would like to reproduce the statements here on my blog, I can’t do so. You can read them over on the CES website though.

But the standards are more than just the statements. There’s a whole book published by the JCSEE that describes the standard statements in detail, explaining where the standards come from and how they can be applied.

It should also be noted that, despite the “should” wording of the standards, they aren’t meant to be slavishly followed, but to be applied in context and with nuance.


The standards also exist in tension with each other and you have to figure out the right balance. For example, there is a standard that says you should use resources efficiently, but another standard that says you should include the full range of groups and people who are affected by the program being evaluated. Evaluators need to find the balance between being thorough and being efficient in our use of resources.

In terms of my own practice, I think I can be more explicit in my use of the program standards. I’ve been an evaluator for a decade and I’ve integrated a lot of the standards into my work such that it’s just second nature (things like being efficient in my use of resources, using effective project management practices, using reliable and valid information, and being transparent). But there are other standards for which, as I read them I think “I could probably do better” (e.g., being more explicit about my evaluation reasoning or encouraging external meta-evaluation).

The evaluation standards are such a big topic that I’m barely scratching the surface here. So once I’m done this blog series on evaluator competencies, my next series is going to be on the evaluation standards! I think that will be a good way to get me to spend a bit of time reflecting on each of the standards and thinking about how I can improve my practice related to each one. And I’ll be sure to contact the JCSEE to let them know I’d like to reproduce the standard statements here on my blog!

Image sources:

  • Book cover from Amazon.ca
  • Scales photo posted on Flickr by Hans Splinter with a Creative Commons license.

Posted in evaluation, evaluator competencies, reflection

Blog Series! Evaluator Competencies

So I had an idea. As I ease my way into blogging in a more reflective way, I thought that perhaps I could do a blog series about the Canadian Evaluation Society (CES) evaluator competencies, where in each post I reflect on one of the competencies. The competencies have been recently revised, so it seems like a good time to do this. Plus, having a series will give me ideas for topics – and hey, let’s make it every Sunday, so that I’ll have a deadline as well. This seems like a good way to get me into the habit of writing here.

What Are Evaluator Competencies?

Competencies are defined as “the background, knowledge, skills, and dispositions program evaluators need to achieve standards that constitute sound evaluations.” (Stevahn et al, 2005) 

Cited by CES

The competencies were created as part of the program for the Credentialed Evaluator (CE) designation. To get the designation, one has to demonstrate that they have education and/or experience related to 70% of the competencies in each of the five domains. I got my CE under the original set of competencies, but anyone applying now would use the new set. It was a few years ago that I did my CE application, so it’s another reason why it’s a good time for me to reflect on where I am now with respect to the competencies.
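To make the “70% in each of the five domains” rule concrete, here is a minimal sketch in Python. The number of competencies per domain and the applicant’s counts are made up for illustration (they are not the real CES figures), and `meets_ce_threshold` and `domains_below` are hypothetical helpers, not anything the CES provides.

```python
# Hypothetical competency counts per domain, for illustration only.
# These are NOT the real CES numbers.
TOTALS = {
    "reflective": 7,
    "technical": 10,
    "situational": 8,
    "management": 7,
    "interpersonal": 8,
}


def domains_below(demonstrated, totals=TOTALS, percent=70):
    """List the domains where the applicant falls short of the required share."""
    # Integer arithmetic avoids floating-point surprises right at the cut-off.
    return [
        domain
        for domain, total in totals.items()
        if demonstrated.get(domain, 0) * 100 < percent * total
    ]


def meets_ce_threshold(demonstrated, totals=TOTALS, percent=70):
    """True only if every single domain reaches the threshold on its own."""
    return not domains_below(demonstrated, totals, percent)


# A made-up applicant: 29 of 40 competencies overall (72.5%),
# but only 5 of 8 in the situational domain (62.5%).
applicant = {
    "reflective": 6,
    "technical": 7,
    "situational": 5,
    "management": 5,
    "interpersonal": 6,
}

print(meets_ce_threshold(applicant))
print(domains_below(applicant))
```

The point of the sketch is that the threshold applies domain by domain: a strong overall total does not make up for one weak domain.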

The Domains

The five competency domains are:

  1. reflective practice
  2. technical practice
  3. situational practice
  4. management practice
  5. interpersonal practice

Reflective Practice


“Reflective Practice competencies focus on the evaluator’s knowledge of evaluation theory and practice; application of evaluation standards, guidelines, and ethics; and awareness of self, including reflection on one’s practice and the need for continuous learning and professional growth.” (Source)

1.1 Knows evaluation theories, models, methods and tools and stays informed about new thinking and best practices.

I have taught Program Planning and Evaluation at both SFU (in the Masters of Public Health program) and UBC (in the Masters of Health Administration program) in the past couple of years, and I find that teaching is a great way to both deepen my own understanding of evaluation theories, models, methods, and tools and to stay informed about new thinking and best practices. In deciding what to include in a course, and how best to present it, and coming up (whenever possible) with activities the class can do to learn it, I learn more every time I prepare, update, and deliver a class. Also, students ask great questions (sometimes even after a class has ended and they’ve gone on to work in places where they are involved in evaluation) and sometimes it’s things that I’m not familiar with and I have to go and do some research to find out more.

I think my main reflection related to this area is that I am a firm believer that there is no one “right” way to do evaluation, and that it is best to start with what the purpose of an evaluation is and then figure out what approach, design, and methods will best help you achieve the purpose. Oftentimes, those requesting an evaluation come to it with assumptions about methods or design – like “I need you to do a survey of the program clients” or “how can I set up a randomized controlled trial to evaluate my program?” So I often find myself saying things like “Let’s begin at the beginning. Why do you want an evaluation? What do you want to know? What will you do with that information once you have it?”

Given that I think it’s important to find the best fit of approach, design, and methods to the purpose of an evaluation, it means that I need to be familiar with lots of different theories, models, methods and tools!

In terms of new thinking and best practices:

  • I’m currently reading Principles-Focused Evaluation [1].
  • I attend evaluation conferences and pick sessions where I can learn about new things – and deepen my understanding of things I’m familiar with. For example, at the most recent CES conference, I took a workshop on reflective practice to deepen my skills in that area (which I’m now actively working on integrating into my life), I attended a session on rubrics to learn more about those (next step there is to try applying rubrics to an evaluation!), and I attended a session on realist evaluation (next step there is to have the presenter come to my class as a guest speaker so that I and my students can learn more!).
  • I include a section in my course on “hot topics” in evaluation, which gives me the opportunity to explore the latest thinking in evaluation with my students. Recently, I’ve included complexity and systems thinking, and indigenous evaluation [2]. I also try to demonstrate reflective practice and humility to my students by telling them that I am exploring new areas, so I’m not an expert in these topics (especially indigenous evaluation), but that I’m sharing my learning journey with them.

Image source: Posted on Flickr by Allan Watkin with a Creative Commons licence.


1 Expect to see a blog posting on that once I’m done the book.
2 Expect to read more about indigenous evaluation when I get to competency 3.7 in this blog series.
Posted in evaluation, evaluator competencies, notes, reflection

what is the point of this blog?


I recently had coffee with a new friend and fellow evaluator, Meagan Sutton. We were introduced by a mutual friend who knew that Meagan was interested in chatting with evaluators who write blogs and that I am an evaluator who writes a blog! We had a great chat and it got me thinking about why I have this blog and how I might grow what I do with it.


I originally started this blog as a place to keep notes of work-related stuff I was reading. I have a pretty terrible memory and I find my personal blog a great way to remember stuff that I did – it’s easy to search through and accessible anywhere with an Internet connection – so I figured rather than having notes in various notebooks and jotted down in the margins of printed copies of journal articles, I could use this blog as my brain dump for various things I learn [1]. So whenever I went to a conference, attended a webinar, or read a book or article where I wanted to record what I was learning, I dumped it on this blog. I am an external processor, so it helps me to remember and understand things when I write them down. For webinars I tend to take notes directly into my blog and publish that, but for conferences, I usually write notes on paper during the conference. That’s partially because it helps keep me awake and attentive during conference sessions and partially because I don’t like lugging my laptop around during a conference. It’s also because I find it helpful to look at all the notes I’ve taken and sort or synthesize them together for the whole conference. If I type my notes during the conference, I find it harder to remove the superfluous stuff, whereas if I’m deciding what is worth typing out from a bunch of handwritten notes, I find it easier to be succinct, as I’ll select just the main points to blog about. The downside is that it often takes me quite a while to do that, and I can end up posting my conference summary blog posting many months later [2].

Meagan asked me how I promote this blog and honestly, I don’t. Since I saw the blog as mostly just an externalization of my memory, I didn’t think anyone else would ever want to read it. I have had a few people contact me after reading something on my blog that they found through Google – and actually have had some interesting conversations result – but it’s pretty rare.

Occasionally, I add some reflection into these blog postings – like thoughts about how what I was reading or learning at a conference might relate to work that I do, but that’s been pretty minimal.


At the same time, I’ve been working on improving my reflective practice, mostly through reflective writing that I’m doing privately rather than in a public forum like this. Part of that is because the reflections I’ve been writing are part of the data I am using in the evaluation I’m working on, so I need it documented where the rest of the data (including my team’s reflections) are. And part of it is because some of what I write about is confidential or politically sensitive, so is not for sharing publicly.

And this is where blogging as an evaluator can get sticky. Sometimes there are things you want to reflect on and process, and maybe even start a conversation with fellow evaluators about, but that you aren’t able to make anonymous for discussion in a public forum. Or you have conflicts with clients that you want to reflect on, but can’t do that publicly either. How does one navigate this? I honestly don’t know the answer, but as I think about expanding this blog to become more reflective, it’s something I’ll need to think more about.

I guess the flip side of this is: why do I want to put my reflections out into the world? I guess because I see it as an opportunity to engage with others. As I mentioned above, without even sharing my blog postings beyond just posting them here, I’ve had some interesting interactions with other evaluators who stumbled on my blog – imagine what could happen if I tweeted out these blog postings (like I do my personal blog postings with my personal Twitter account) and actually wrote some reflective stuff – things I’m thinking about/struggling with/wanting to know more about? Perhaps I could connect with others facing similar issues and get different perspectives on the things I’m thinking about.

Image credits:

  • Coffee – posted on Flickr by Jen with a Creative Commons license
  • Blog – posted on Flickr by Xiaobin Liu with a Creative Commons license
  • Twisty Water Looking Thing – posted on Flickr by Mario with a Creative Commons license
  • Megaphone – from Pixabay by OpenClipart-Vectors with a free for commercial use license


1 I briefly co-opted this blog for blog postings I was required to do during an Internet marketing class that I took in my MBA, but then switched it back to stuff related to my work.
2 Though I made it a priority to do it more quickly from the last conference I attended and actually got it posted just two weeks after the conference instead of months and months later.
Posted in evaluation, reflection

Webinar Notes: Shifting Mental Models to Advance Systems Change

Title: Shifting Mental Models to Advance Systems Change

Offered by: FSG, New Profit, and the Collective Impact Forum.

-Tammy Heinz, Program Officer, Hogg Foundation
-Hayling Price, Senior Consultant, FSG
-Darrell Scott, Founder, PushBlack
-Julie Sweetland, Vice President for Strategy and Innovation, Frameworks Institute
-Rick Ybarra, Program Officer, Hogg Foundation

Hayling Price:

  • “Systems change is about shifting conditions that are holding a problem in place”
  • “It’s not about getting more young people to beat the odds. It’s about changing the odds”
  • 6 conditions of systems change
    • structural change (policies, practices, resource flows (who gets funding and why? how are human resources allocated?)) [explicit – easiest to find and to change]
    • relationships & connections (not just having someone on your LinkedIn, but actually engaging), power dynamics (who is getting funded and why? some people have a leg up, some people are dealing with a history of oppression) [semi-explicit]
    • transformative change (mental models) [implicit]
  • mental models: deeply held beliefs, assumptions, etc.
  • the policies, practices, and resource flows are not handed to us by nature – they are created by humans based on our mental models

Darrell Scott

  • PushBlack – nation’s largest nonprofit media platform for Black people
  • 4 million subscribers with emotionally-driven stories about Black history, culture, and current events
  • through Facebook Messenger – meeting people where they are at
  • Go to Facebook Messenger and search “PushBlack” to sign up!
  • ran the largest get-out-the-vote campaign on social media in history in 2018
    • got subscribers to contact their friends (relates to relationships and connections part of the conditions of system change)
  • giving subscribers tools to work at the local level (e.g., to be heard when Black people are killed by police, to free innocent Black people)
  • test their messages with a small subset of the audience before sending out only the best performing messages to the broader audience

Julie Sweetland

  • uses the phrase “cultural models”, which is a similar concept from anthropology
  • “cultural models are cognitive short cuts created through years of experience and expectation. They are largely automatic assumptions, and can be implicit”
  • “People rely on cultural models to interpret, organize and make meaning out of all sorts of stimuli, from daily experiences to social issues”
  • they believe that understanding mental/cultural models helps you identify which ones are holding a problem in place
  • e.g., Google image search “ocean” and the top hits are pictures of “beautiful blue expanse” – this is a mental model that Americans hold of the ocean – this holds implications for policy:
    • people think it is so big, that it’s invincible
    • people think it’s water and think about the surface – not thinking about what’s underneath, about how it’s an ecosystem, it produces oxygen, it affects weather, etc.
  • it’s not that the ocean isn’t blue or isn’t big, but that’s just a piece of the picture
  • e.g., some people’s mental model of “teenager”, is about “risk and rebellion” – people defying expectations from adults. Again, not a complete picture.
  • 3 models are consistently barriers to productive conversations on social issues (especially in American context, but they’ve also seen them internationally):
    • individualism: assumption that problems, solutions, and consequences happen at the personal level
    • us vs. them: assumption that another social group is distinct, different, and problematic (beyond people – can be human vs. animals; environment vs. economy)
    • fatalism: assumption that social problems are too big, too bad, or too difficult to fix
  • there are also mental models that are specific to a given situation, but the above three tend to show up in lots of areas
  • one thing that doesn’t work: correcting their mistakes
    • “myth busters” – they don’t work! A study of the myth-fact structure found that people misremembered the myths as true, that this got worse over time, and that they attributed the false information to the CDC (Skurnik et al (2005), JAMA)
    • mental models are there because we’ve heard it so many times. When you restate a “bad” mental model, you reinforce it (e.g., if you state: Myth: Flu vaccines cause the flu, you reinforce their mental model that flu vaccines cause the flu (doesn’t matter that you said it was a “myth”))
    • never remind people of things you wish they’d forget
  • another thing that doesn’t work: giving people more information
    • it’s not that you shouldn’t use facts
    • but if people have a particular mental model, stacking data on top does not change their mental model
    • you need to help them build a new mental model
  • another thing that doesn’t work: leaving causation to the public imagination
    • leaving people with their bad mental models won’t help
  • instead of trying to rebut people’s misunderstanding – try to redirect attention to what is true and how things do work

Tammy Heinz and Rick Ybarra

  • Hogg Foundation for Mental Health
  • historically funded lots of programs and research
  • Mental Health has been focused on diagnosis and treatments, with end goal of symptom reduction
  • now moving their work upstream
  • traditionally, there has been a medical/disease model of health
  • in the 1970s, people started asking whether mental health conditions were really chronic or whether people could get better
  • shifting a mental model is not something that can happen quickly
  • in the past 20 years, there’s been some deliberate work to shift the thinking around mental health
  • huge shift towards peers helping in mental health care teams
  • thinking about “recovery” – it’s not an expectation of only symptom control


  • there are multiple mental models on an issue – you can call up a more productive one (e.g., maybe “fatalism” is the first thing that comes to mind, but you can call up a more productive mental model)
  • how do you figure out what mental models people are using?
    • Hayling: we are constantly testing out models through our work
    • Julie: ask people “what are ideas you wish you’d never hear again?” and you’ll get a pretty good idea of the mental models that are being a problem
  • how do you change mental models around emotionally charged issues?
    • Rick: listening. Figure out what mental models are driving things. Really learn and understand where people are coming from.
    • Tammy: being clear about where you want to go
    • Hayling: make things plain
    • Julie: call people in rather than calling them out

Update: Here’s a link to the recording of the webinar.

Posted in event notes, notes, webinar notes

Recap of the 2019 Canadian Evaluation Society conference

This year’s conference was in Halifax and, as always, it was a wonderful opportunity to reconnect with my evaluation friends, make some wonderful new friends, to pause and reflect on my practice, and to learn a thing or two. And I think this is quite possibly the fastest I’ve ever put together my post-conference recap here on ye old blog! (The conference ended on May 29 and I’m posting this on June 14!)

Student Case Competition

The highlight of the conference for me this year was the Student Case Competition finals. In this competition, student teams from around the country, each coached by an experienced evaluator, compete in round 1 where they have 5 hours to review a case (typically a nonprofit organization or program) and then complete an evaluation plan for that program. Judges review all the submissions and the top 3 teams from round 1 move on to the finals, where they get to compete live at the conference. They are given a different case and have 5 hours to come up with a plan, which they then present to an audience of conference goers, including representatives from the organization and three judges. After all three teams present, the judges deliberate and a winning team is announced!

I had the honour of coaching a team of amazing students from Simon Fraser University. The competition rules do not allow teams to talk to their coaches when they are actually working on the cases, so my role was to work with them before the round, talking about strategies for approaching the work, as well as chatting with them about evaluation in general. Most of the students on the team had not yet taken an evaluation course, so I also provided some resources that I use when I teach evaluation.

I will admit that I was a bit nervous watching the presentations – not because I didn’t think my team would do well, as I know they worked really hard and are all exceptionally intelligent, enthusiastic and passionate, but because it’s a huge challenge to come up with a solid evaluation plan and a presentation in such a short period of time, and because they were competing among the best in the country!

But I need not have been worried. They came up with such a well thought through, appropriate to the organization, and professional plan and presented it with all the enthusiasm, professionalism, grace, and passion that I have come to know they possess. I was definitely one proud evaluation mama watching my team do that presentation and so very, very proud of them when they won! Congratulations to Kathy, Damien, Stephanie, Manal, and Cassandra! And to Dasha, who was part of the team that won round 1, but wasn’t able to join us in Halifax for the finals.

Kudos also go to the two other teams who competed in the finals – students from École nationale d’administration publique (ENAP) and Memorial University of Newfoundland (MUN). Great competitors and, as I had the pleasure of learning when we all went out to the pub afterwards, as well as chatting at the kitchen party the next night, all very lovely people!

Conference Learnings

As usual, I took a tonne of notes throughout the conference and, as usual for my post-conference recaps, I will:

  • summarize some of my insights, by topic (in alphabetical order) rather than by session as I went to some different sessions that covered similar things
  • where possible, include the names of people who said the brilliant things that I took note of, because I think it is important to give credit where credit is due. Sometimes I missed names (e.g., if an audience member asked a question or made a statement, as audience members don’t always state their name or I don’t catch it)
  • apologize in advance if my paraphrasing of what people said is not as elegant as the way that people actually said them.

Anything in [square brackets] is my thoughts that I’ve added upon reflection on what the presenter was talking about.

Federal Government

  • every time I go to CES, I find I learn a little bit more about how the federal government works (since so many evaluators work there!). This time I learned that Canada Revenue Agency (CRA) doesn’t report up to Treasury Board – they report to Finance

Indigenous Evaluation

  • the conference was held on Mi’kma’ki, the ancestral and unceded territory of the Mi’kmaq People.
  • the indigenous welcome to the conference was fantastic and it was given by a man named Jude. I didn’t catch his full name and I couldn’t find his name in the conference program or on Twitter. [Note to self: I need to do better at catching and remembering names so I can properly give credit where credit is due]. He talked about how racism, sexism, ableism, transphobia, and other forms of oppression are at play in the world today. He also talked about how there is a difference between guilt and responsibility. We need to take responsibility for making things better now, not just feel guilty about the way things are.
  • Nan Wehipeihana talked about an evaluation of a sports participation program and how they moved from sports participation “by” Māori to sports participation “as” Māori. They talked about what it would look like to participate “as” Māori (e.g., using Māori language, Māori structures (tribal, subtribal, kin groups) are embedded in the activity, activities occur in places that are meaningful to Māori people (e.g., kayaking on our rivers, activities on our mountains)). Developed a rubric in the shape of a five-point star (took a year to develop).
  • I went to a Lightning Roundtable session hosted by Larry Bremner, Nicole Bowman, and Andrealisa Belizer where they were leading a discussion on Connecting to Reconciliation through our Profession and Practice. One of the things that Larry mentioned that struck me was the importance of not just indigenous approaches to evaluation, but indigenous approaches to program development. It doesn’t make sense to design a program without indigenous communities as equal partners and then to say you are going to take an indigenous approach to evaluation – the horse has left the barn by that point.
  • They also talked about how evaluators are culpable for the harm that is still happening because we haven’t done right in our work. They talked about how the CES needs to keep the government’s feet to the fire on the Truth and Reconciliation Commission’s (TRC) Calls to Action. Really, after the Commission, there should have been a TRC implementation committee who could go around the country and help get the Calls to Action implemented (Larry Bremner).
  • They talked about not only what can CES do at the national level, but what can we do at the chapter level. As the president of one of the chapters, this is something I need to reflect on and speak to the council about. I also need to revisit the Truth and Reconciliation Commission’s Calls to Action (as it was a while ago that I read that report) and read “Reclaiming Power and Place: The Final Report of the National Inquiry into Missing and Murdered Indigenous Women and Girls“, which was released the week after the CES national conference.
  • I also went to a concurrent session where the panelists were discussing the TRC Calls to Action. They pointed out that CBC has a website where they are tracking progress on the 94 Calls to Action: Beyond 94.
  • CES added a competency about indigenous evaluation in its recent updating of the CES competencies:
    • 3.7 Uses evaluation processes and practices that support reconciliation and build stronger relationships among Indigenous and non-Indigenous peoples.
  • Many evaluators saw this new competency and said “I don’t work with indigenous populations, so how can I relate to this competency?” [I will admit, I had that thought as well when the new competencies were announced. Not that I don’t think this is an important competency for evaluators to have – but more that I didn’t know how to apply it in the work I am currently doing or where to start in figuring out what I should do.] The CES is trying to provide examples to support evaluators. (Linda Lee) E.g.:
Presentation slide
  • I also learned that EvalIndigenous is open to indigenous and non-indigenous people – anyone who wants to move forward indigenous worldviews and wants indigenous communities to have control of their own evaluations. So I joined their Facebook group! (Nicole Bowman and Larry Bremner)
  • Evaluators typically use a Western European approach and many use an “extractive” evaluation process, where they take stuff out of the community and leave (I can’t remember if this slide was from Larry Bremner or Linda Lee).
Presentation slide
  • I also found this discussion of indigenous self-identification helpful (Larry Bremner):
Presentation slide
  • There is still so much work to do and so much harm being inflicted on indigenous people:
    • there are more indigenous kids in care today than were in residential schools – this is the new residential schools. (Larry Bremner)
    • During the discussion with the audience, some audience members mentioned “trauma tourism” – that it can be re-traumatizing for indigenous people to share traumas they have experienced, and that non-indigenous people, in their attempts to learn more about the experiences of indigenous people, need to be mindful of this and not further burden indigenous people.
    • If you google “indigenous women”, all the results you get are about missing and murdered indigenous women and girls. Where is the focus on the strengths in the community?


  • evaluators are learners (Barrington)
  • Bloom’s Taxonomy is a hierarchy of cognitive processes that we go through when we do an evaluation – notice that evaluation is at the top – it’s the hardest part (Gail Barrington)

Bloom taxonomy.jpg
By Xristina la – Own work, CC BY-SA 3.0, Link

  • single loop learning is where you repeat the same process over and over again, without ever questioning the problem you are trying to fix (sort of like the PDSA cycle). There’s no room for growth or transformation. (Gail Barrington)

By Xjent03 – Own work, CC BY-SA 3.0, Link

  • in contrast, double loop learning allows you to question if you are really tackling the correct problem (sometimes the way that the problem is defined is causing problems/making things difficult to solve) and the decision making rules you are using, allowing for innovation/transformation/growth. (Gail Barrington)

By Xjent03 – Own work, CC BY-SA 3.0, Link

Pattern Matching

  • “Pattern matching is the underlying logic of theory-based evaluation” – specify a theory, collect data based on that, see if they match (Sebastian Lemire)
  • Trochim wrote about both verification AND falsification, but in practice most people just come up with a theory and try to find evidence to support it (confirmation bias)
    (Sebastian Lemire)
  • humans are wired to see patterns, even when they aren’t there and we tend to focus on evidence in support of the patterns (Sebastian Lemire)
  • having more data is not the solution! (Sebastian Lemire)
    • e.g., when people were given more information on horses and then made bets, they didn’t get any more accurate in their bets, but they did get more confident in their bets
  • evaluators need to do reflective practice – e.g., to look for our biases (Sebastian Lemire)
  • structured analytic techniques (see slide below) – not a recipe, but a structured process (Sebastian Lemire)
Presentation slide
  • pay attention to alternative explanations – in the context of commissioned evaluations, it can be hard to get commissioners to agree to you spending time on looking at alternative explanations and we often go into an evaluation assuming that the program is the cause (bias) (Sebastian Lemire)
  • falsification: specify what data you would expect to see if your hypothesis was wrong
    (Sebastian Lemire)

Power and Privilege

  • since we have under-served, under-represented, and under-privileged people, we must also have over-served, over-represented, and over-privileged people (Jude, who gave the indigenous welcome. I didn’t catch his last name and I can’t find it on the conference website)
  • recognize your power and privilege, recognize your biases and think about where they come from and work to prevent your biases from affecting your work
    (Jude, who gave the indigenous welcome. I didn’t catch his last name and I can’t find it on the conference website)
  • and speaking of power and privilege, the opening plenary on the Tuesday morning was a manel. For the uninitiated, a “manel” is a panel of speakers who are all male. It’s an example of bias – men being more often recognized as experts and given a platform as experts when there are many, many qualified women. I called it out on Twitter:
  • a friend of mine who is a male re-tweeted this saying he was glad to see that someone called it out and when I spoke to him later, he told me that people were giving him kudos for calling it out and he had to point out that it was actually a woman who called it out. So another great example of women being made invisible and men getting credit.
  • I do regret, however, that I neglected to point out that it was a “white manel” specifically. There’s so much more to diversity than just “men” and “women”!

Realist Evaluation

  • Michelle Naimi (who I know from the BC evaluation scene) gave a great presentation on a realist evaluation project she’s been working on related to violence prevention training in emergency departments. My notes on realist evaluation don’t do it justice, but I think my main learning here is that this is an approach that I can learn more about. I’m definitely inviting her as a guest speaker the next time I teach evaluation!
Michelle Naimi gives a presentation at the Canadian Evaluation Society conference

Reflective Practice

  • I took a pre-conference workshop, led by Gail Barrington, on reflective practice. This is an area that I’ve identified that I want to improve in my own work and life, and a pre-conference workshop where I got to learn some techniques and actually try them out seemed like a perfect opportunity for professional development.
  • Gail talked about:
    • how she doesn’t see her work and her self as separate – they are seamless
    • if you don’t record your thoughts, they don’t endure. (How many great ideas have you had and lost?) [I’d add, how many great ideas have you had, forgotten about, and then been reminded of later when you read something you wrote?]
    • evaluators are always serving others – we need to take care of ourselves too
  • The best part of the workshop was that we got to try out some techniques for reflective practice as we learned them

Warm up activity: In this activity, we took a few minutes to answer the following questions:

-Who am I?
-What do I hope to get out of this workshop?
-To get the most out of this workshop, I need to ____

Then we re-read what we wrote and answered this:

-As I read this, I am aware that __________

  • and that is an example of reflection!
  • [Just had an idea! I could use that at the start of class to introduce the notion of reflective practice from the beginning of class. If I turn my class into more of a flipped classroom approach, I could have more in-class time to do fun, experiential things like this than listening to lecture 🙂 ]
Resistance Exercise: Another quick writing exercise:

-What are the personal barriers that hold me back from reflection?
-What are the lifestyle/family barriers that hold me back from reflection?
-What barriers at work are holding me back from being transformative?

Then we re-read what we wrote and answered this:

-As I read this, I am aware that __________
The Morning Pages:

Write three pages of stream of consciousness first thing in the morning in a journal that you like writing in. Before you’ve done anything else – and before your inner critic has woken up. If you can’t think of anything to write, just write “I can’t think of anything to write” over and over again until something comes to you.

All sorts of things will pop up – might be ideas for a project you are working on, or “to do” items to add to your list. You can annotate in margins, transfer things to your main to do list later, or some of it might not be useful to you now and you don’t have to look at it again.
  • Gail said it’s very different writing first thing in the morning compared to later in the day. I know that I’m unlikely to get up an extra half hour earlier than I already do, but I could give this a try on a weekend morning when I’m not feeling rushed to get to work to see if it’s different for me too.
Start Now Activity:

-The thoughts/ideas that prevent me from journaling now ____

Then we re-read what we wrote and answered this:

-As I read this, I am aware that __________
  • for some people, writing is not for them. An alternative is using a voice memo app. We gave it a try in the workshop and I was kind of meh on it, but I used it two more times during the conference when I had a quick thought I wanted to capture. I think the challenge will be that if I want to retrieve those ideas, I’ll need to listen to the recordings, which seems like a big time sink, depending on how much I say (as I can be verbose).
  • we also talked about meditation and went out on a meditative walk (Gail put up the quotation “solvitur ambulando”, citing St. Augustine, and noting that it is Latin for “solved by walking”. But when I just googled it, it turns out that it was actually from the philosopher Diogenes, and actually refers to something that is solved by a practical experiment). For our walk, we set an intention (to think about one thing that I’ll change at my work), then forgot about it and went for a mindful walk – paying attention to the sensations of walking (e.g., the feeling of your feet on the ground as you step, the colours and shapes and sounds and smells you encounter). It was a rainy day, but I was definitely struck with all the beauty around me, and was reminded about how beneficial mindfulness can be.
  • My take home from all my reflections in this workshop was:
    • taking time to do things like reflective practice and mindfulness meditation is a choice. I say that I don’t have enough time to do these things, but it’s actually that I have been choosing not to spend my time doing these things. There are a variety of reasons for those choices (which I did reflect on and got some valuable insights about). Remembering that this is a choice – and being more mindful of what choices I’m making – is going to be my intention as I return back to work after my conference/holiday.


  • I’ve been to sessions on Rubrics by Kate McKegg, Nan Wehipeihana, and their colleagues at a number of conferences and I always learn useful things. This year was no exception. The stuff in this section is all from McKegg and Wehipeihana (and they had a couple of collaborators who weren’t there but “presented” via video).
  • rubrics are a way to make our evaluative reasoning explicit
  • just evaluating on if goals are met is not enough. Rubrics can help us with situations like:
    • what counts as “meeting targets”? (e.g., what if you meet an unimportant target but don’t meet an important one? Or you way exceed one target and miss another by a little bit? etc.)
    • what if you meet targets but there are some large unintended negative consequences?
    • do the ends justify the means? (what if you meet targets but only by doing unethical things?)
    • whose values do you use?
  • 3 core parts of a rubric:
    • criteria (e.g., reach of a program, educational outcomes, etc.)
    • levels (standards) (e.g., bad, poor, good, excellent; could also include “harmful”)
      • some people don’t like to see “harmful” as a level, but e.g., when we saw inequities, we needed a way to be able to say that it was beyond poor and actually causing harm
    • importance of each criterion (e.g., weighting)
      • sometimes all criteria are equally important and sometimes not
  • rubrics can be used to evaluate emerging strategies:
    • evaluation can be used in situations of complexity to track evolving understanding
Presentation slide
  • in all systems change, there is no final “there”
    • in situations of complexity, cause-and-effect are only really coherent in retrospect [they are not predictable] and do not necessarily repeat
    • we only know things in hindsight and our knowledge is only partial – we must be humble
    • need to be looking out continually for what emerges
  • in complexity thinking, we are only starting to see what indigenous communities have long known
    • our reality is created in relation, interpretive
    • Western knowledge dismissed this
Presentation slide
  • need to bring things together to make sense of multiple lines of evidence
    • “weaving diverse strands of evidence together” in the sensemaking process
  • we have to make judgments and decisions about what to do next with limited/patchy information. Rubrics give us a traceable method to make our reasoning explicit
  • having agreed on values at the start helps to navigate complexity
  • break-even analysis flips return-on-investment:
Presentation slide
  • when you can’t do a full cost-benefit analysis (e.g., don’t have information on ALL costs and ALL benefits), can see if the benefits are at least greater than costs
  • think about how rubrics are presented – e.g., minirubrics with red/yellow/green
Presentation slide
  • but that might not be appropriate in some contexts – e.g., if a program is just developing and it’s unreasonable to expect that certain criteria would be at a good level yet
  • a growing flower as a metaphor for different stages of different parts of a program may be more appropriate to a development program. May also be more appropriate in an indigenous context
Presentation slide
  • it’s important to talk about how the criteria relate to each other (not in isolation)
    • they do each analysis separately (e.g., analyze the survey; analyze the interviews)
    • then map that to the rubric
    • then take that to the stakeholders for sensemaking; stakeholders can help you understand why you saw what you saw (e.g., when you see what might seem like conflicting results)
  • like with other evaluation stuff, might not say “we are building a rubric” to stakeholders at the start (it’s jargon). Instead, ask questions like “what is important to you?” or “If you were participating *as* Māori, what would that look/sound/feel like to you?”
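
[To help myself think through the three core parts of a rubric described above (criteria, levels, and weighting), here is a minimal Python sketch. It is my own illustration, not something McKegg and Wehipeihana presented: the numeric weights, the ratings, the `Criterion` and `summarize` names, and the idea of averaging level positions are all assumptions made for the example. Treating ordinal levels as numbers is a simplification, so the sketch reports any “harmful” rating separately rather than letting a weighted average hide it.]

```python
from dataclasses import dataclass

# Levels from the notes, ordered worst to best, with the optional "harmful" level.
LEVELS = ["harmful", "bad", "poor", "good", "excellent"]


@dataclass
class Criterion:
    name: str      # e.g., "reach of a program"
    weight: float  # relative importance (hypothetical numbers)
    rating: str    # one of LEVELS, agreed on during sensemaking


def summarize(criteria):
    """Combine ratings into a weighted position on the LEVELS scale.

    Any criterion rated "harmful" is listed separately so that strong
    results elsewhere cannot average it away.
    """
    total_weight = sum(c.weight for c in criteria)
    weighted = sum(c.weight * LEVELS.index(c.rating) for c in criteria)
    return {
        "weighted_score": weighted / total_weight,
        "harmful": [c.name for c in criteria if c.rating == "harmful"],
    }


# Hypothetical ratings for two of the example criteria from the notes.
example = [
    Criterion("reach of a program", weight=2.0, rating="excellent"),
    Criterion("educational outcomes", weight=1.0, rating="harmful"),
]
print(summarize(example))
```

[In this made-up example the weighted score on its own would look middling, which is exactly the “what if you meet targets but there are large unintended negative consequences?” problem. Listing the harmful criterion separately keeps that reasoning explicit and traceable.]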

Theory of Change

  • to be a theory of change (TOC) requires a “causal explanation” (i.e., a logic model on its own is not a TOC – we need to talk about why those arrows would lead to those outcomes) (John Mayne) [This also came up as a question to my case competition team – and my team gave a great answer! Did I mention I’m so proud of them?]
  • complexity affects the notion of causation – in complexity, there isn’t “a” cause, there are many causes (John Mayne)
  • people assume you have to have a TOC that can fit on one page – but that doesn’t always work – can do nested TOCs (John Mayne)
  • interventions are aimed at changing the behaviour of groups/institutions, so TOCs should reflect that (John Mayne)
    • there is lots of research on behaviour change, such as on Bennett’s hierarchy, or the COM-B model (John Mayne):
Presentation slide
  • causal link assumptions – what conditions are needed for that link to work? (John Mayne) (e.g., could label the arrows on a logic model with these assumptions – Andrew Koleros)
Presentation slide


As with pretty much any conference I go to, I came home with a reading list:

And some to dos:

  • re-read the TRC Calls to Action and figure out which things I can take action on! And then take action!
  • try writing “the Morning Pages”
  • listen to the audiorecorded reflections that I made during the conference and document any insights I want to capture
  • read all the books on the above list!

Sessions I Attended


CHSPR Conference Poster References

Some colleagues and I are presenting a poster at the Centre for Health Services & Policy Research conference on March 7-8, 2019. Rather than cluttering up our poster with a reference list, we are putting our references online here and our poster will have a QR code linked to this page. So if you’ve come looking for the references from our poster, you’ve come to the right place!

  1. American Society for Quality. (2017). What is audit? Retrieved from American Society for Quality: http://asq.org/learn-about-quality/auditing/
  2. Baily, M. A., Bottrell, M., Lynn, J., & Jennings, B. (2006). The Ethics of Using QI Methods to Improve Health Care Quality and Safety. RAND Corporation. Retrieved from http://www.thehastingscenter.org/wp-content/uploads/The-Ethics-of-Using-QI-Methods.pdf
  3. Benjamin, A. (2008). Audit: how to do it in practice. BMJ: British Medical Journal, 336(7655), 1241.
  4. Canadian Evaluation Society. (2015, October). What is Evaluation. Retrieved March 3, 2017, from Canadian Evaluation Society: http://evaluationcanada.ca/what-is-evaluation
  5. Cook, P. F., & Lowe, N. K. (2012). Differentiating the Scientific Endeavors of Research, Program Evaluation, and Quality Improvement Studies. Journal of Obstetric, Gynecologic, and Neonatal Nursing, 41(1), 1-3.
  6. Council for International Development. (2014, June). Monitoring Versus Evaluation. Retrieved March 3, 2017, from Council for International Development: http://www.cid.org.nz/assets/Key-issues/Good-Development-Practice/Factsheet-17-Monitoring-versus-evaluation.pdf
  7. Hedges, C. (2009). Pulling It All Together: QI, EBP, and Research. Nursing Management, 40(4), 10-12.
  8. Hill, S. L., & Small, N. (2006). Differentiating Between Research, Audit and Quality Improvement: Governance Implications. Clinical Governance: An International Journal, 11(2), 98-10
  9. Naidoo, N. (2011). What is Research? A Conceptual Understanding. African Journal of Emergency Medicine, 1(1), 47-48.
  10. Newhouse, R. P., Pettit, J. C., Poe, S., & Rocco, L. (2006). The Slippery Slope: Differentiating between Quality Improvement and Research. Journal of Nursing Administration, 36(4), 211-219.
  11. Shirey, M. R., Hauck, S. L., Embree, J. L., Kinner, T. J., Schaar, G. L., Phillips, L. A., . . . McCool, I. A. (2011). Showcasing Differences Between Quality Improvement, Evidence-Based Practice, and Research. The Journal of Continuing Education in Nursing, 42(2), 57-68
  12. U.S. Department of Health & Human Services. (2009, 01 15). Basic HHS Policy for Protection of Human Research Subjects. Retrieved March 3, 2017, from Office for Human Research Protections – U.S. Department of Health & Human Services: https://www.hhs.gov/ohrp/regulations-and-policy/regulations/45-cfr-46/#46.102
  13. United States Government Accountability Office. (2011, May). Performance Measurement and Evaluation: Definitions and Relationships. Retrieved March 3, 2017, from Program Performance Assessment: http://www.gao.gov/assets/80/77277.pdf

Recap of the 2018 Canadian Evaluation Society conference

This year’s Canadian Evaluation Society (CES) conference was held in Calgary, Alberta and had a theme of Co-Creation. As always, I had a great time connecting with old friends and making new ones, learning a lot, and getting to share some of my own learnings too.

As I usually do at conferences, I took a tonne of notes, but for this blog posting I’m going to summarize some of my insights, by topic (in alphabetical order) rather than by session as I went to some different sessions that covered similar things. Where possible, I’ve included the names of people who said the brilliant things that I took note of, because I think it is important to give credit where credit is due, but I apologize in advance if my paraphrasing of what people said is not as elegant as the way that people actually said them. Anything in [square brackets] is my thoughts that I’ve added upon reflection on what the presenter was talking about.

Complex Systems

  • I didn’t see as many things about complexity as I usually do at evaluation conferences.
  • No one person has their mind around a complex system. You need all the people in the room to understand it. Systems are messy because people are messy. (Patrick Field)


  • How do we get past the view that humans are supreme beings and the environment is just there to serve us (vs. a stewardship view)? This view is deeply embedded in our identities. Even the legal system is set up to prioritize making money (and views environmentalism as a “nuisance”) (Jane Davidson)
  • People have a fear of being evaluated on things they don’t (feel they) have control over: “don’t evaluate me on sustainability stuff! It’s affected by so much else!” Looking at outcomes that are outside the control of the program isn’t meant to be about evaluating a program/organization on their performance, but about identifying the things that are constraining them from achieving the outcomes they are trying to achieve. It’s not as though, if you only look at the things within the box of your program, you can really control all of the things in the box and they aren’t affected by the things outside your program. [No program is a closed system.] (Jane Davidson)
  • We focus on doing evaluations to meet the client’s requests, and maybe we stretch it to cover some other things. Sometimes you can slip in stuff that the client didn’t ask for and then use that to demonstrate the value of it. People often limit what they ask for to what they think is possible, and sometimes you need to be able to demonstrate the possibilities first. (Jane Davidson)
  • It’s not just about asking “how good were the outcomes”, but “how good was this organization in making the trade offs?”(Jane Davidson)

Evaluation Approaches and Methods

  • People limit their questions to what they think can be measured (e.g., I want to see indicator X move by Y%). When clients say “we can’t measure that!”, Jane tells them “Look, there are academics who spent their love life studying “love”. If they can do that, we can find a way to measure what you are really interested in. And it doesn’t have to be quantitative!” The client isn’t a measurement expert and they shouldn’t be limiting their questions to what they think can be measured. (Jane Davidson)
  • Once upon a time, evaluation was about “did you achieve your objectives?” but now we also think about the side effects too! (Jane Davidson).


  • Brower & Elnitsky talked about having to distinguish between evaluation/quality improvement/data collection/clinical indicators/performance indicators (and how, in their view, these aren’t different things) and to talk to their client about how evaluation adds value. This struck a chord with me as it was similar to some of the things that my co-authors and I talk about in a paper we currently have under review in the Canadian Journal of Program Evaluation.
  • Sarah Sangster felt that evaluation is like research, but more. She described how evaluation requires all the things you need to do research, but also has some things that research doesn’t (e.g., some evaluation-specific methods). She talked about how the ways that people sometimes try to differentiate evaluation and research are really shared (e.g., evaluation is often defined as referring to judging “merit/value/worth”, but research does that too, such as when research judges the “best treatment”). [Some of the things she talked about were things that my co-authors and I grappled with in our paper – such as how research is a lot more varied than people typically give it credit for (e.g., participatory action research or community-based research stretch the boundaries of traditional research in that the questions being explored come from the community instead of from the researchers and the results are specifically intended to be applied in the community rather than just being knowledge for knowledge’s sake).]

Evaluation Competencies

  • The CES is updating its list of evaluation competencies – those things a person should know and be able to do in order to be a competent evaluator. The evaluation competencies are used by the society to assess applicants for the Credentialed Evaluator designation – people have to demonstrate that they’ve met the competencies. The competencies are being revised and updated and the committee is taking comments on the draft until June 30, 2018. They expect to finalize the new competencies in Sept 2018.

Evaluation Ethics

  • The CES is also looking at renewing its ethics statement, which hasn’t been updated in 20 years! I went to a session where we looked at the existing statement and it clearly needs a lot of work. The society is currently doing an environmental scan (e.g., looking at other evaluation societies’ ethics guidelines/principles/codes/etc.) and consultations with stakeholders (e.g., the session I attended at the conference) and plans to have a decision by the fall on whether they are going to just tweak the existing statement or completely overhaul it. They hope to have a finished product to unveil at next year’s CES conference.
  • During the session that Alec from my team led, which was a lightning round table where people circled through various table discussions, one of the things we talked about while discussing doing observations was ethics. For example, when we are doing observations, it is understood that if you are in a public place, you might be observed and it is ethical to observe people. The question arose “are hospitals public places?”

Evaluation, Use Of

  • There was a fascinating panel of 3 mayors who were invited to the conference to talk about what value evaluation can add for municipalities. None of the mayors had even heard of the Canadian Evaluation Society prior to being invited to the conference, so we definitely have our work cut out for us in terms of advocating for evaluation at the municipal level. There is definitely lots of evaluation work that can be done at the municipal level and it would be worthwhile for the society to educate municipal politicians about what we do and how it can help them. The mayors were open to the idea of using evaluation findings in their decision making. There was a suggestion that there should be a panel of evaluators at the Canadian municipalities conference, just like we had the mayors’ panel at our evaluator conference, and I seriously hope the CES pursues this idea.

Evaluation as intervention

  • Evaluators affect the things they evaluate. The act of observing is well known to affect the behaviour of those being observed. As well, we know that “what gets measured gets managed,” so setting up specific indicators that will be measured will cause people to do things that they might not otherwise have done. This is an important thing that we should be discussing in our evaluation work.

Indigenous Evaluation

There was a lot of discussion around indigenous populations and indigenous evaluation, in keeping with the CES’s commitment “to incorporating reconciliation in its values, principles, and practices.”

  • The opening keynote on Reconciliation and Culturally Responsive Evaluation was introduced with “You will feel uncomfortable and that is by design. Ask yourself why it makes you uncomfortable.” [The history – and present – of indigenous people is a hard thing to grapple with for many reasons. There are so many injustices that have been done – and continue to be done – and each of us participates in a system that perpetuates that injustice. Doing nothing about it is to do harm. And for those of us who are not indigenous, there can be a mixture of ignorance about our own history and ignorance about our actions and inactions that contribute to the injustice, privilege, and lack of knowing what we can do that can all contribute to this discomfort.]
  • Several of the people speaking about indigenous evaluation talked about the need for indigenous-led evaluation. We have a long history of evaluation and research in indigenous communities being led by non-indigenous people where they take from the community, don’t contribute to the community, don’t ask the questions that the community needs answered, don’t understand things from the communities’ perspectives, impose Western view/perspectives/model, and then leave the community no better off than before.
  • “Scientific colonialism”: colonial powers export raw data from communities to “process it”. (Nicole Bowman)
  • Despite all the evaluation that has been going on for years, we still face all the same problems – maybe worse. (Kate McKegg)
  • “How do we ensure evaluation is socially just, as well as true, that it attends to the interests of everyone in society and not solely the privileged” (House, 1991 – cited by Kate McKegg).
  • “Culturally-responsive evaluation seems to be about giving “permission” to colonizers and settlers to do evaluation in indigenous communities” (Kate McKegg).
  • “Sometimes the stories we are telling are not the stories that need to be told.” (Larry Bremner) [Larry was talking about the ways in which evaluation can further perpetuate injustice against, and further ignore and marginalize, indigenous people, through what we do and do not study.] Is our own work maintaining colonial oppression?
  • “Trauma is never far from the surface in indigenous communities.” (Larry Bremner)
  • Since many aspects of culture and ceremony have been destroyed by colonialism, how are people supposed to heal, as culture and ceremony are ways of healing? (Nicole Bowman)
  • The lifespan of indigenous people is 15 years less than non indigenous people.
  • There is a lot of diversity among indigenous people in Canada: 617 First Nations, as well as Inuit and Métis; 60 languages.
  • Larry Bremner quoted a few people that he’d worked with: “Everyone is talking about reconciliation, but what happened to the “truth” part?” [in reference to the Truth and Reconciliation commission] and “In my community, reconciliation is about making white people feel less guilty.” [Even work that is supposed to be about dealing with injustice against indigenous people gets turned around to serve white people instead.]
  • Understanding our history is needed to understand the legal and policy world in which we live today. Evaluators need to understand authority and power. (Nicole Bowman)
  • How can non-indigenous people be good allies?
    • We have to be clear on our own identities as settlers and colonizers, recognize our privilege. Our identity is shaped by our history and our present. Colonization is still going on and is nonconsensual and designed to benefit the privileged. (Kate McKegg)
    • We don’t even know our own history, let alone that of indigenous people.(Kate McKegg)
    • It’s not indigenous people’s responsibility to teach us about this – it’s our own job. Only when we understand ourselves can we hear indigenous people. (Kate McKegg)
    • Do your homework. Expand your indigenous networks. Undertake relevant professional development. Build relationships. (Nan Wehipeihana)
    • Advocate for indigenous-led evaluation – indigenous people evaluating as indigenous people:

Slide by Nan Wehipeihana

  • During the opening keynote, an audience member asked how non indigenous people can learn if it’s not indigenous peoples’ responsibility to teach non indigenous people.
    • The panelists noted that indigenous people are a small group whose first priority is to do work to help their communities – expecting them to educate you is to put a burden on them that is not their responsibility.
    • Kate McKegg noted that indigenous people have been trying to talk to non indigenous people for years and we haven’t listened to them. She suggested that we can work with other settlers who want to learn – there is lots available to read, to start.
    • Nicole Bowman noted that observation is how we traditionally learned and it is part of science to observe – do some observing.
    • Larry pointed out that indigenous people have taught their ways to others before and people have taken their protocols and not used them well – why should they give non indigenous people more tools to hurt indigenous people?
  • Lea Bill from the First Nations Information Governance Centre spoke about the OCAP® principles, which refer to Ownership, Control, Access, and Possession of data, in that First Nations have rights to all of these. [I have learned about OCAP® before, but hadn’t realized until I saw this presentation that it was a registered trademark.]
    • All privacy legislation is about protecting individual privacy rights, but OCAP® is about collective, community rights.

Slide by Lea Bill

  • When you work with indigenous communities, you need to know who the knowledge holders in the community are – they have rights and privileges and if you don’t know, you could offend people and not get good information. (Lea Bill)
  • Indigenous indicators are bicultural – all things are interconnected and human beings are not separate from the environment. (Lea Bill)


  • Something I’ve been interested in lately is how people from different disciplines use words differently. Two disciplines might use the same word to mean different things, or they might use different words to mean the same thing. One of the sessions I attended was about a glossary that had been created to clarify words/phrases that are used by financial/accounting people vs. evaluation people. Check out the glossary here.


  • I attended a thematic breakfast session that was a live taping of an episode of the Eval Cafe podcast. It was a chance for a group of us to reflect on what we’d learned about at the conference. You can check out the podcast here.

Caroline & Brian doing soundcheck for a live podcast at the Canadian Evaluation Society 2018 conference


  • We need to think bigger than binary thinking – “what is in our control vs. not in our control?”, “Yes/No”, “Good/Bad”, “Pre/Post”. [Few things are really black or white – often things that we think of as binary are really more of a gradient or spectrum. There are fuzzy boundaries between things. It’s one of the reasons I like to start questions with “to what extent….” Like “to what extent did the program achieve its goals?”]

To Dos:

  • Watch “The Doctrine of Discovery – Unmasking the Domination Code”
  • Read “Pagans in the Promised Land” by Steven T. Newcomb
  • Research “are hospitals public places?” for the purposes of observations.

Sessions I Attended:


  • Opening Keynote Panel: Reconciliation and Culturally-Responsive Evaluation: Rhetoric or Reality? with panelists Dr. Nicole Bowman, Larry K. Bremner, Kate McKegg, and Nan Wehipeihana
  • Keynote with panelists Dr. Jane Davidson, Patrick Field, Sean Curry, Dr. Juha I. Uitto
  • Keynote by Lea Bill
  • Mayors’ Panel: A municipal perspective on co-creation and evaluation with panelists Heather Colberg, Mayor of Drumheller, Alberta; Mark Heyck, Mayor of Yellowknife, NWT; Stuart Houston, Mayor of Spruce Grove, Alberta
  • Fellows Panel – Evaluation for the Anthropocene.
  • Closing Keynote Panel: Reflection on Co-Creation Conference 2018 by CES Fellows – Our rapporteurs, realists, and renegades.

Concurrent Sessions:

  • Collaborating to improve wait times for a primary care geriatric assessment and support program by Emily Johnston, Krista Rondeau, Kathleen Douglas-England, Bethan Kingsley, Roma Thomson
  • Surveying an Under-Represented Population: What We Learned by Surveying Great-Grandma by Kate Woodman, Krista Brower
  • Co-Creating Evaluation Capacity in Primary Care Networks: A Case Example of Lessons Learned by Krista Brower, Sherry Elnitsky, Meghan Black
  • Evaluators faced with complexity: presentation of the results of a synthesis of the literature by Marie-Hélène L’Heureux
  • Knowledge translation and impacts – unpacking the black box by Ambrosio Catalla Jr, Ryan Catte
  • Evaluation and research: Two sides of the same coin or different kettles of fish? by Sarah Sangster, D. Karen Lawson
  • Who’s keeping score? A team-based approach to building a performance measurement scorecard by Beth Garner
  • Updating the CES Competencies for Evaluators: A Work in Progress by Gail Vallance Barrington, Christine Frank, Karyn Hicks, Marthe Hurteau, Birgitta Larsson, Linda Lee
  • Help Us Co-create CES’s Renewal Vision of Ethics in Program Evaluation! by CES Ethics Working Group on Ethics, Environmental Scan and Stakeholder Consultation Subcommittees
  • On the Road with the EvalCafe Podcast: Greetings from Calgary! by Carolyn Camman, Brian Hoessler
  • Integrating Social Impact Measurement Practice into Social Enterprises: A Sociotechnical Perspective by Victoria Carlan
  • From Collaboration to Collective Impact; Measuring Large-scale Social Change by Andrea Silverstone, Debb Hurlock, Tara Tharayil
  • The Rosetta Stone of Impact: A Glossary for Investors and Evaluators by David Pritchard, Michael Harnar, Sara Olsen

Presentations I Gave:

  • An inside job: Reflections on the practice of embedded evaluation by Amy Salmon, Mary Elizabeth Snow
  • How is evaluation indicator development like an orchestra? by Mary Elizabeth Snow, Alec Balasescu, Joyce Cheng, Allison Chiu, Abdul Kadernani, Stephanie Parent
  • Can co-creation lead to better evaluation? Towards a strategy for co-creation of qualitative data collection tools by Alec Balasescu, Joyce Cheng, Allison Chiu, Abdul Kadernani, Stephanie Parent, Mary Elizabeth Snow

Recap of the Canadian Evaluation Society’s 2017 national conference

The Canadian Evaluation Society’s national conference was held right here in Vancouver last month! I was one of the program co-chairs for the conference and I have to say that it was pretty awesome to see a year and a half’s worth of work by the organizing committee come to fruition! There were a lot of people involved in putting together the conference and so many more parts to it than I had realized when I started working on it and it was incredible to see everything work so smoothly!

As I usually do at conferences, I took a tonne of notes, but for this blog posting I’m going to summarize some of my insights, by topic (in alphabetical order) rather than by session [1], as I went to some different sessions that covered similar things. Where possible, I’ve included the names of people who said the brilliant things that I took note of, because I think it is important to give credit where credit is due, but I apologize in advance if my paraphrasing of what people said is not as elegant as the way that people actually said it.


  • Damien Contandriopoulos noted that context is often defined by what it is not – it is not your intervention – i.e., it’s whatever is outside your intervention, but it’s not the entire universe outside of your intervention. Just what is close enough to be relevant/important to the analysis. He also noted that some disciplines don’t talk about context at all (e.g., they might talk about the culture in which an intervention occurs, but don’t talk about it as separate from the intervention the way we talk about context as being separate from the intervention).
  • Depending on your conceptualization of “context”, you may want to:
    • neutralize the context (e.g., those who think that context “gets in the way” and thus they try to measure it and neutralize it so it won’t “interfere” with your results). Contandriopoulos clearly didn’t favour this approach, but noted that it could work if your evaluand was very concrete/clear.
    • adapt to context
    • describe the context
  • In all of the above options, it’s about generalizability/external validity (e.g., if you are trying to neutralize the context, you are wanting to know if the evaluand works and don’t want the context to interfere with your conclusion about if the evaluand works; if you are adapting to the context, you want to figure out how the evaluand might work in a given context; if you are describing the context, you are wanting to understand the context to use to interpret your evaluation findings)
  • From the audience, AEA president Kathryn Newcomer mentioned a paper by Nancy Cartwright about transferability of findings [2], specifically about how Cartwright talks about “support factors” rather than context. Further, she talked about how in the US there is lots of interest in “scaling up” interventions, but rarely do studies document the support factors that allow an intervention to work (e.g., you need to have a pool of highly qualified teachers in the area for program X to work). She suggested:
    • putting the support factors into the theory of change
    • considering: how do we know if the support factors are necessary or sufficient? What if you need a combination of factors that need to be present at the same time and in certain amounts for the program to work? etc.
  • Contandriopoulos mentioned that sometimes people just list “facilitators” and “barriers” as if that’s enough [but I liked Newcomer’s suggestion that “support factors” (or barriers, though she didn’t mention it) could be integrated into the theory of change]


  • Kas Aruskevich showed an image of a river in Alaska viewed from above and noted that if you were standing by the side of that river, you’d never know what the sources of that river are (as they are blocked by mountains) and she likened evaluation to taking that perspective from a distance where you look at the whole picture. I liked this analogy.
  • Kathy Robrigado talked about how the accountability function of evaluation is often seen as an antagonist to learning, but she sees it as a jumping off point for learning.
  • In summarizing the Leading Edge panel, E. Jane Davidson had a few things to say that were very insightful in relation to thinking I’ve been doing lately with my team about what evaluation is (and how it compares/relates to other disciplines that aim to assess program/projects/etc.). With respect to monitoring, she noted that people often expect key performance indicators (KPIs) to be an answer, but they aren’t. Often what’s the easiest to measure is not what’s most important. In evaluation, we need to think about what’s most important (not just what’s strong or weak, but what really matters).

Evaluation, History of the Field

  • Every time I go to an evaluation conference, someone gives a bit of a history of the field of evaluation from their perspective (perhaps one day I’ll compile them all into a timeline). This conference was no different, with closing keynote speaker Kylie Hutchinson talking about what she has seen as “innovations” in evaluation that had a lot of buzz around them and then eventually settled into an appropriate place [her description made me think of the “hype cycle”, which someone had coincidentally shown in one of the sessions that I was in]:
    • 1990s – logic models
    • 2000s – the big RCT debate (i.e., are RCTs really the “best” way to evaluate in all cases?)
    • social return on investment (SROI), Appreciative Inquiry
    • developmental evaluation, systems approaches
    • deliverology

Evaluators, Role of

  • Lyn Shulha noted that as an evaluator, you’ll never have the same context/working conditions from one evaluation to the next, and you’ll never have a “final” practice or theory – they will continue to change.
  • Kathy Robrigado talked about starting an evaluation as an “evaluator as critical friend” (e.g., asking provocative questions to understand the program/context, offering critiques of a person’s work, providing data to be examined through another lens). But after a while, they found this approach to be too resource intensive, as they had ~60 programs to deal with and data collection was cumbersome; they moved from critical friend to “strategic acquaintance” (or, as she put it, “we had to friendzone the programs”).
  • Michel Laurendeau stated that “evaluators are the experts in interpreting monitoring data” as what you see when you look at the data isn’t necessarily what is really going on [this reminded me of something that was discussed at last year’s CES conference: what the data says vs. what the data means]
  • Kylie Hutchinson talked about how many evaluators are talking about the evaluator as a social change agent. People gravitate to this profession because they want to be involved in social change – maybe they are a data geek, but they see how the data can lead to social change. She also talked about how many skills she has needed to build to support her evaluation practice: in grad school she focused on methods and statistics, but when she went on to become a consultant she didn’t find that she needed advanced statistics – she needed skills in facilitation, then data visualization, and now organizational development.

Knowledge Translation

  • Kim van der Woerd described getting knowledge into action as “the long journey from the head to the heart”. I really like this phrase, as just knowing something (with the head) doesn’t necessarily mean we take it to heart and put it into action. I wonder how thinking about how we can get things from the head to the heart could help us think about better ways to promote the translation of knowledge into action.


  • Lyn Shulha talked about learning spirals – as we travel from novice to expert, we can imagine ourselves descending down, say, a spiral staircase. At a given point, we can be at the same place as earlier, but deeper (as well, we are changed from when we were last at this point). She noted that we “need to hold onto our experiences and our truths lightly”, lest we end up traveling linearly rather than in a spiral.

Logic Models

  • One of the sessions I was in generated an interesting discussion about different ways that people use logic models, such as:
    • having the lead agency of a program create a logic model of how they think the program works and then having all the agencies operating the program create logic models of how they think the program works and then compare – if they have different views of how the program works, this can generate important discussions
    • calling the first version of the logic model “strawman #1” to emphasize that the logic model is meant to be challenged and changed.


  • Report structure recommended by Julian King in the Leading Edge panel on Rubrics:
    • answer the evaluation question
    • key evidence & reasoning behind how you came up with the answer
    • extra information
      • They summarized this as spoiler, evidence, discussion, repeat


  • E. Jane Davidson noted that in social sciences, people are often taught how to break things down, but not how to pack it back together again to answer the big picture question. For example, you’ll often see people report the quantitative results, then the qualitative results, but with no actual mixing of the data (so it’s not really “mixed methods” – it’s more just “both methods”).
  • Also from E. Jane Davidson – the length of a section of a report is typically proportional to how long it took you to do the work (which is why literature reviews are so long), but that’s not what’s most useful to the reader. It’s like we feel we have to put the reader through the same pain we went through to do the work; we want them to know we did so much work! And then they get to the end and we say “the results may or may not be…. and more research is needed.” Not helpful! Spoilers really are key in evaluation reporting – write it like a headline. Pique their interest in the spoiler and then they want to read the evidence (how did they decide that??).
    • 7 +/- 2 key evaluation questions (KEQ):
      • executive summary: KEQ 1, answer + brief evidence; KEQ 2, answer + brief evidence; KEQ 3, answer + brief evidence
      • and make sure your recommendations are actionable!


  • The Leading Edge Panel on Rubrics was easily my favourite session of the conference. I’ve done a bit of reading about rubrics after going to a session on them at the Australasian Evaluation Society conference in Perth, but found that this panel really brought the ideas to life for me.
  • Kate McKegg mentioned that she asked a group of people in healthcare if they thought that their organization’s key performance indicators (KPIs) reflected the value of what their organization does, and not a single person raised their hand [This resonated with me, as my team and I have been doing a lot of work lately on differentiating, among other things, monitoring and evaluation.]
  • Rubrics:
    • can help clarify what matters and include those things in your evaluation
    • are made of:
      • evaluative criteria – to come up with these, can check out the literature, talk to experts, talk to stakeholders (e.g., people on the front lines); can also think about what would be appropriate for the cultural context (e.g., what would make a program excellent in light of the cultural context?)
      • levels of importance (of the criteria) – remember, things that are easy to measure are not necessarily what’s important
      • rating scale – how to determine the level of performance (e.g., excellent-very good-good-adequate-emerging-not yet emerging-poor); depending on your context, you may choose different words (e.g., may use “thriving” instead of “excellent”)
    • can be:
      • analytic – describe the various performance levels for each criterion
      • holistic – a broad level of description of performance at each level (e.g., describe “excellent” overall (encompassing all the criteria) rather than describing “excellent” for each criterion individually)
    • analytic rubrics can provide more clarity, but require more data
  • You should be able to see your theory of change in the rubric. Key evaluation questions (KEQ) often follow the theory of change (e.g., KEQs might be “how well are we implementing?” or “how well are we achieving outcome #1?”). Think about the causal links in the theory of change and their strength. If there is a deal breaker, it should show up in the theory of change.
  • You can embed cultural values into the process (e.g., for the Maori, the word “rubric” didn’t resonate, so Nan Wehipeihana used a cultural metaphor that did; rather than words like “poor” and “excellent”, can use words that fit better like a “seed with latent potential” and “blooming” and “coming to fruition”)
  • Values are the basis for criteria – they reflect what is valued (and whose values hold sway matters)
  • Once you have a rubric, you need to collect data to “grade” the program using the rubric; data may come from all sorts of places (e.g., previous research, administrative data, photos from the program, interviews/surveys/focus groups)
  • Can make a table of each criterion and data source and use that to optimize your data collection:
Admin Data Interview Staff Interview Participants Photos from the Program
Criterion 1  x  x
 Criterion 2  x x x
 Criterion 3  x  x
 Criterion 4 x  x
 Criterion 5  x x x
  • Then you can look at all the things you want to collect from each data source (e.g., you can ask about criteria 2, 4, and 5 in interviews with staff; look for criteria 1, 2, and 3 in the photos from the program) = integrated data collection
  • Make sure that the data collection is designed to answer the evaluation questions.
  • Look to see if you are getting consistent information (i.e., saturation) or if the data is patchy or inconsistent and you need to get more clarity.
  • Bring data to stakeholders as you go along (especially for long evaluations – they don’t want to wait until the end of 3 years to find out how things are going!)
  • 3 steps to making sense of data:
    • analysis – breaking something down into its component parts and examining each part separately (King et al, 2013)
    • synthesis – putting together “a complex whole made up of a number of parts or elements” (OED online); assembling the different sources of data. Sometimes when you are working on data synthesis, you learn that what’s important isn’t what you initially thought was important (so you need to rejig your rubric). Also think about what the deal breakers are (e.g., if no one shows up to the program…)
    • sensemaking: helps to clarify things; one way to do this is to get all the stakeholders together, give them the synthesized data (a rough cut), and go through a process like this:
      • generalization: In general, I noticed…
      • exception: In general…, except….
      • contradiction: On one hand…, but on the other hand…
      • surprise: I was surprised by…
      • puzzle: I wonder…
    • When you think about the exceptions or contradictions – how big of a deal are they? Are they deal breakers?
    • As stakeholders do this, they start to understand the data and to own the evaluation. Often they make harder judgments than the evaluator might have.
    • Typically, the evaluators do the synthesis and bring that to the stakeholders to do sensemaking; but don’t spend a lot of time making the synthesized data look polished/finished – it should look rough as it is to be worked with. Not everyone will spend time reading the data synthesis in advance, so give them time to do that at the start of the session.
    • Put up the rubric and have the stakeholders grade the program.
    • Often people try to do analysis, synthesis, and sensemaking all at the same time, but you should do them separately.
  • Rubrics “aren’t just a method – they change the whole fabric of your evaluation”. They can help you “mix” methods (rather than just doing “both” methods) – they can help you make sense of the “constellation of evidence”.
  • I asked how they deal with situations that are dynamic. Their answer was that rubrics can evolve, especially with an innovative program. You create it based on what you imagine the outcome will be, but other things can emerge from the program. You can start with a high level rubric (you don’t want to get so detailed or overspecified that you paint yourself into a corner). You need it to be underspecified enough to be able to contextualize it to the setting. It’s like the concept of “implementation fidelity” – implementing something exactly as specified is not the best – you should be implementing enough of the intent in a way that will work in the setting.
  • Another audience member asked how would you determine if a rubric is valid/reliable? The speakers noted that often people ask “is it a valid tool?” meaning “was it compared to a gold standard /previously validated tool”? But those other tools are often too narrow/miss the mark. The speakers suggested that “construct validity is the mother of all validities” – the most important question is “is it useful for the people for whom it was built?”
  • Another audience member asked about “scaling up” rubrics. The speakers noted examples where they had worked on projects to create rubrics to be used across a broader group than those who created it – e.g., created by the Ministry of Education to be used by many different schools with the help of a facilitator. For these, you need to have a lot more detail/instructions on how to use it (and a good facilitator) since users won’t have the shared understanding that comes from having created it. They have also done “skinny rubrics” to be used by lots of different types of schools (so had to be underspecified), but again, need to provide lots of support to users.
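
For anyone who likes to see this kind of thing written out, here is a minimal Python sketch of the criterion-by-data-source table from the integrated data collection point above. It is only an illustration under stated assumptions: the staff interview and photo columns follow the example given in the notes (criteria 2, 4, and 5 for staff interviews; criteria 1, 2, and 3 for photos), while the admin data and participant interview assignments, and the function names, are made up for the example and were not presented by the panel.

```python
# Hypothetical criterion-by-data-source matrix.
# Only the "staff interviews" and "program photos" assignments follow the
# example in the notes; the other assignments are invented for illustration.
SOURCES = ["admin data", "staff interviews", "participant interviews", "program photos"]

coverage = {
    "Criterion 1": {"admin data", "program photos"},
    "Criterion 2": {"staff interviews", "participant interviews", "program photos"},
    "Criterion 3": {"participant interviews", "program photos"},
    "Criterion 4": {"admin data", "staff interviews"},
    "Criterion 5": {"admin data", "staff interviews", "participant interviews"},
}


def criteria_by_source(coverage, sources):
    """Invert the matrix: for each data source, list the criteria it should cover."""
    plan = {source: [] for source in sources}
    for criterion, used_sources in coverage.items():
        for source in used_sources:
            plan[source].append(criterion)
    return {source: sorted(criteria) for source, criteria in plan.items()}


def thinly_covered(coverage, minimum=2):
    """Return criteria that rely on fewer than `minimum` data sources."""
    return [c for c, used in coverage.items() if len(used) < minimum]


plan = criteria_by_source(coverage, SOURCES)
for source in SOURCES:
    print(f"{source}: {', '.join(plan[source])}")
```

Reading the table by column (one list of criteria per data source) is what lets you design each interview guide or photo review once, rather than going back to the same source for each criterion separately. The `thinly_covered` check is an extra assumption of mine: a quick way to spot criteria that depend on a single source.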

Systems Thinking

  • Systems archetypes are common patterns that emerge in systems. This was a concept that was brought up by an audience member in my session on complexity, and is something I want to read more about!
  • Heather Codd talked about three key concepts in using systems thinking (using Donella Meadows’s definition of a system as something with parts, links between parts, and a boundary) in evaluation:
    • interrelationships – understanding the interrelationships and what drives them helps us to understand what’s going on with the program (and she suggested using rich pictures to help focus the evaluation and think about what the consequences of the program might be)
    • boundaries – we need to pick a boundary for the purpose of analysis, but note that it is sensitive because it defines what is in and out of the evaluation. She suggested using critical systems heuristics to help describe the program, scope the evaluation, and decide on an evaluation approach.
      Critical systems heuristic slide
    • multiple perspectives – what are the world views being applied and what are the implications of those world views? She suggested you can do a stakeholder analysis, but also a stake analysis; she also suggested “framing” by using an idea from Bob Williams, where you add the words “something to do with…” in front of ideas (e.g., “something to do with a culture of health”, “something to do with managing heart disease”); this tool can help give you a sense of the intervention’s purpose and the evaluation’s purpose.
    • Evaluators are an element in a system and we cannot separate out our effect on the systems [This made me think of “co-evolution” – the evaluation co-evolves along with the rest of the system]
    • There are echoes in a system of what has happened before [e.g., intergenerational trauma]

Truth & Reconciliation

  • Last year, the CES took a position on reconciliation in Canada. Several of the speakers at the conference talked about this topic. For example, Kim van der Woerd talked about a witness as being one who listens with their whole heart and validates a message by sharing it (and that they have a responsibility to share it). She also noted that the Truth and Reconciliation Commission (TRC) wasn’t Canada’s first attempt at trying to build a good relationship between Aboriginal and non-Aboriginal people – the Royal Commission on Aboriginal Peoples put out a report with recommendations in 1996. But when it was evaluated in 2006, Canada received a failing grade with 76% of the 400+ recommendations being not done and with no significant progress. She noted that we shouldn’t wait 10 years before we evaluate how well Canada is doing on the TRC recommendations.
  • Paul Lacerte outlined a set of recommendations:
    • amplify the new narrative (where the old narrative was “the federal government takes care of the natives”)
    • conduct research & develop a reconciliation framework
    • set targets for recruiting and training indigenous evaluators
    • learn about and follow protocol (e.g., how to start a meeting, gift giving)
    • put up a sign in your workspace about the traditional territory on which you are working
    • volunteer for an indigenous non-profit
    • join the Moose Hide Campaign
  • At the start of her closing keynote, Kylie Hutchinson acknowledged that she was speaking on the unceded traditional territory of the Musqueam, Squamish, and Tsleil-Waututh First Nations. And then she said that she’d never said that before speaking, but that she would from now on. And I thought that it was a really cool thing to witness someone learning something new and putting it into practice like that, especially something so meaningful.


  • The best joke I heard in a presentation was when Kathy Robrigado, after a few acronym-filled sentences in her presentation, said, “As you know, government employees are paid by the number of acronyms they use”

To Dos:

Sessions I Attended:

  • Opening Keynote by Kim van der Woerd and Paul Lacerte
  • Short presentation: Causing Chaos: Complexity, theory of change, and developmental evaluation in an innovation institute by Darly Dash, Hilary Dunn, Susan Brown, Tanya Darisi, Celia Laur
  • Short presentation: Implications of complexity thinking on planning an evaluation of a system transformation by M. Elizabeth Snow, Joyce Cheng [This was one of my own presentations!]
  • Short presentation: Cycles of Learning: Considering the Process and Product of the Canadian Journal of Program Evaluation Special Issue by Michelle Searle, Cheryl Poth, Jennifer Greene, Lyn Shulha
  • Short presentation: Using System Mapping as an Evaluation Tool for Sustainability by Kas Aruskevich
  • Incorporating influence beyond academia data into performance measurement and evaluation projects by Christopher Manuel
  • Exploring Innovative Methods for Monitoring Access to Justice Indicators by Yvon Dandurand, Jessica Jahn
  • A Quasi-Experimental, Longitudinal Study of the Effects of Primary School Readiness Interventions by Andres Gouldsborough
  • What Would Happen If…? A Reflection on Methodological Choices for a Gendered Program by Jane Whynot, Amanda McIntyre, Janice Remai
  • Towards Strategic Accountability: From Programs to Systems by Kathy Robrigado
  • Getting comfortable with complexity: a network analysis approach to program logic and evaluation design by John Burrett
  • Communication in System Level Initiatives: A grounded theory study by Dorothy Pinto
  • Seeing the Bigger Picture: How to Integrate Systems Thinking Approaches into Evaluation Practice by Heather Codd
  • Understanding and Measuring Context: What? Why? and How? by Damien Contandriopoulos
  • A Graphic Designer, an Evaluator, and a Computer Scientist Walk into a Bar: Interdisciplinary for Innovation by M. Elizabeth Snow, Nancy Snow, Daniel J. Gillis [This was another one of my presentations and hands down the best presentation title I’ve ever had]
  • Big Bang, or Big Bust? The Role of Theory and Causation in the Big Data Revolution by Sebastian Lemire, Steffen Bohni Nielsen
  • Using Web Analytics for Program Evaluation – New Tools for Evaluating Government Services in the Digital Age at Economic and Social Development Canada by Lisa Comeau, Alejandro Pachon
  • The Future of Evaluation: Micro-Databases by Michel Laurendeau
  • Dylomo: Case studies from an online tool for developing interactive logic models by M. Elizabeth Snow, Nancy Snow [This was the last of my presentations]
  • Development and use of an App for Collecting Data: The Facility Engagement Initiative by Neale Smith, Graham Shaw, Chris Lovato, Craig Mitton, Jean-Louis Denis
  • Leading Edge Panel: Evaluative Rubrics – Delivering well-reasoned answers to real evaluative questions by Kate McKegg, Nan Wehipeihana, Judy Oakden, Julian King, E. Jane Davidson
  • Closing Keynote by Kylie Hutchinson

Next CES Conference:

  • Host: Alberta & Northwest Territories conference
  • May 26-29 – Calgary
  • May 31-June 1 – Yellowknife
  • Theme: Co-creation


1 Though I’ve listed all the sessions I attended at the bottom of this posting.
2 She didn’t say the name of the paper or the journal, but based on her comments about the paper, I believe it is likely this paper. Unfortunately, it’s behind a paywall, so I can’t read more than the abstract.

On Flexibility in Evaluation Design

Been doing some reading as I work on developing an evaluation plan for a complex program that will be implemented at many sites. Here are some notes from a few papers that I’ve read – I think if anything links these three together, it is the notion of the need to be flexible when designing an evaluation – but you also need to think about how you’ll maintain the rigour of your work.

Wandersman et al (2016)’s paper on using an evaluation approach called “Getting to Outcomes (GTO)” discussed the notion that just because an intervention has been shown to be effective in one setting does not necessarily mean it will work in other settings. While I wasn’t interested in the GTO approach per se, I found their introduction insightful.

Some notes I took from the paper:

  • the rationale for using evidence-based interventions is that since research studies show that a given intervention leads to positive outcomes, then if we take that intervention and implement it in the same way it was implemented in the research studies (i.e., fidelity to the intervention) on a broad scale (i.e., at many sites), then we should see those same positive outcomes on a broad scale
  • however, when this is actually done, evaluations often show that the positive outcomes compared to control sites don’t happen or that positive outcomes happen on average, but there is much variability among the sites such that some sites get the positive outcomes and others don’t (or even that some sites get negative outcomes)
  • from the perspective of each individual site, having positive outcomes on average (but not at their own particular site) is not good enough to say that this intervention “works”
  • when you implement complex programs at multi-sites/multi-levels, you “need to accommodate for the contexts of the sites, organizations, or individuals and the complete hierarchies that exist among these entities […] the complexity […] includes multiple targets of change and settings” (p. 549-50)
  • recommendations:
    • evaluate the intervention at each site in which it is implemented
    • examine the quality of the implementation
    • consider the fit of the intervention to the local context
      • “the important question is whether they are doing what they need to do in their own setting in order to be successful” (p. 547)
      • “the relevant evaluation question to be answered at scale is not ‘does the [evidence-based intervention] result in outcomes?’ but rather ‘how do we achieve outcomes in each setting?’” (p. 547)
    • evaluators should “assist program implementers to adapt and tailor programs to meet local needs and provide ongoing feedback to support program implementation” (p. 548)
  • empowerment evaluation: premise is: “if key stakeholders (including program staff and consumers) have the capacity to use the logic and tools of evaluation for planning more systematically, implementing with quality, self-evaluating, and using the information for continuous quality improvement, then they will be more likely to achieve their desired outcomes”

Balasubramanian et al (2015) discussed what they call “Learning Evaluation”, which they see as a blend of quality improvement and implementation research. To me it sounded similar to Developmental Evaluation (DE). For example, they state that:

  • “Two key aspects of this approach set it apart from other evaluation approaches; its emphasis on facilitating learning from small, rapid cycles of change within organizations and on capturing contextual and explanatory factors related to implementation and their effect on outcomes across organizations”  (p. 2 of 11)
  • “assessment needs to be flexible, grounded, iterative, contextualized, and participatory in order to foster rapid and transportable knowledge. This approach integrates the implementation and evaluation of interventions by establishing feedback loops that allow the intervention to adapt to ongoing contextual changes.” (p. 2 of 11)

That sounds a lot like DE to me. And it sounds a lot like how I’m looking to approach the evaluation I’m currently planning.

Principles underlying the “Learning Evaluation” approach (from page 3 of 11):

  1. Gather data to describe the types of changes made by healthcare organizations, how changes are implemented, and the evolution of the change process.
    • Why: To establish initial conditions for implementing innovations at each site and to describe implementation changes over time.
  2. Collect process and outcome data that are relevant to healthcare organizations and to the research team.
    • Why: To engage healthcare organizations in research and in continuous learning and quality improvement.
  3. Assess multi-level contextual factors that affect implementation, process, outcome, and transportability.
    • Why: Contextual factors influence quality improvement: need to evaluate conditions under which innovations may or may not result in anticipated outcomes.
  4. Assist healthcare organizations in applying data to monitor the change process and make further improvements.
    • Why: To facilitate continuous quality improvement and to stimulate learning within and across organizations.
  5. Operationalize common measurement and assessment strategies with the aim of generating transportable results.
    • Why: To conduct internally valid cross-organization mixed methods analysis.

A point that was made in this paper that resonated with me was that: “Within the context of a multi-site demonstration project conducted in real-world settings, it was not feasible to randomize sites or to specify target patient samples or measures a priori.” (p. 7 of 11) Instead, they incorporated elements to enhance the study’s rigour:

  • rigour in study design
    • considered each site as a “single group pre-post quasi-experimental study”, which is subject to history [1] and maturation [2] threats to internal validity
    • to counteract these threats, they collected qualitative data on implementation events (to allow them to examine if results are related to implementation of the intervention)
    • they also used member checking to validate their findings
  • rigour in analysis
    • rather than analyzing each source of data independently, they integrated findings
    • “triangulating data sources is critical to rigor in mixed methods analysis”
    • qualitative data analysis was conducted first within a given site (e.g., “to identify factors that hindered or facilitated implementation while also paying attention to the role contextual influences played” (p. 7 of 11)), then across sites.
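
Here is a rough sketch, in Python, of how I picture the logic of treating one site as a single-group pre-post study while using a qualitative log of implementation events to probe the history threat. Everything in it is hypothetical: the measurements, the event log, the 30-day window, and the function names are my own assumptions for illustration, not the authors' method.

```python
from datetime import date
from statistics import mean

# Hypothetical monthly outcome measurements for one site (invented numbers).
measurements = [
    (date(2014, 1, 1), 50),
    (date(2014, 2, 1), 51),
    (date(2014, 3, 1), 50),
    (date(2014, 4, 1), 52),
    (date(2014, 5, 1), 60),
    (date(2014, 6, 1), 61),
    (date(2014, 7, 1), 62),
]

# Hypothetical qualitative log of events at the site (invented entries).
events = [
    (date(2014, 3, 15), "staff trained"),
    (date(2014, 4, 20), "new workflow goes live"),
    (date(2014, 6, 10), "clinic renovation"),
]


def pre_post_change(measurements, start):
    """Mean outcome on or after `start` minus mean outcome before it."""
    pre = [value for when, value in measurements if when < start]
    post = [value for when, value in measurements if when >= start]
    return mean(post) - mean(pre)


def largest_jump(measurements):
    """Return (date, size) of the largest increase between consecutive measurements."""
    jumps = [
        (later_date, later_value - earlier_value)
        for (_, earlier_value), (later_date, later_value)
        in zip(measurements, measurements[1:])
    ]
    return max(jumps, key=lambda jump: jump[1])


def events_before(events, when, window_days=30):
    """Labels of logged events in the `window_days` leading up to `when`."""
    return [label for d, label in events if 0 <= (when - d).days <= window_days]


change = pre_post_change(measurements, start=date(2014, 4, 1))
jump_date, jump_size = largest_jump(measurements)
nearby = events_before(events, jump_date)

print(f"Pre-post change: {change:.2f}")
print(f"Largest jump: {jump_size} at {jump_date}")
print("Events logged shortly before the jump:", nearby)
```

The pre-post number alone can't say why the outcome moved; lining it up against what was happening at the site is what lets you judge whether the change plausibly tracks the implementation or something else going on at the same time.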

A few other points they make:

  • “ongoing learning and adaptation of measurement allows both rigor and relevance” (p. 8 of 11)
  • by “working collaboratively with innovators to develop data collection strategies and routine processes for jointly sharing and reflecting on data to foster continuous learning, improvement, and advocacy for policy changes” the organization can “develop capacity for data collection and monitoring for future efforts” (p. 8 of 11)
  • this approach “may feel to some to be at odds with current standards of rigor, which value fidelity to a priori hypotheses and methods”, but it is “not a ‘canned’ approach to evaluating healthcare innovations, but it involves the flexible application of five general principles” (p. 9 of 11). “This requires [evaluators] to be flexible and nimble in adapting their approach when proposed innovations are modified to fit the local context.” (p. 9 of 11)

Brainard & Hunter conducted a scoping review with the question “Do complexity-informed health interventions work?” What they found was that although “the lens of complexity theory is widely advocated to improve health care delivery,” there’s not much in the literature to support the idea that using a complexity lens to design an intervention makes the intervention more effective.

They used the term “‘complexity science’ as an umbrella term for a number of closely related concepts: complex systems, complexity theory, complex adaptive systems, systemic thinking, systems approach and closely related phrases” (p. 2 of 11). They noted the following characteristics of systems:

  • “Large number of elements, known and unknown.
  • Rich, possibly nested or looping, and certainly overlapping networks, often with poorly understood relationships between elements or networks.
  • Non-linearity, cause and effect are hard to follow; unintended consequences are normal.
  • Emergence and/or self-organization: unplanned patterns or structures that arise from processes within or between elements. Not deliberate, yet tend to be self-perpetuating.
  • A tendency to easily tip towards chaos and cascading sequences of events.
  • Leverage points, where system outcomes can be most influenced, but never controlled.” (p. 2 of 11)

They also had some recommendations for reporting on/evaluating complexity-informed interventions:

  • results should be monitored over the long term (e.g., more than 12 months) as results can take a long time to occur
  • barriers to implementation should be explored/discussed
  • unintended/unanticipated (including negative) changes should be actively looked for
  • support from the institution/senior staff combined with widespread collaborative effort is needed to successfully implement
  • complexity science or related phrases should be in the title of the article


Balasubramanian, B., Cohen, D.J., Davis, M.M., Gunn, R., Dickinson, L.M., Miller, W.L., Crabtree, B.F., & Stange, K.C. (2015). Learning Evaluation: blending quality improvement and implementation research methods to study healthcare innovations. Implementation Science. 10: 31.

Brainard, J., & Hunter, P.R. (2016). Do complexity-informed health interventions work? A scoping review. Implementation Science. 11: 127.

Wandersman, A., Alia, K., Cook, B.S., Hsu, L.L., & Ramaswamy, R. (2016). Evidence-Based Interventions Are Necessary but Not Sufficient for Achieving Outcomes in Each Setting in a Complex World: Empowerment Evaluation, Getting To Outcomes, and Demonstrating Accountability. American Journal of Evaluation. 37(4): 544-561.


1 i.e., how do you know results aren’t due to other events that are occurring concurrently with the intervention?
2 i.e., how do you know the results aren’t just due to naturally occurring changes over time rather than being due to the intervention?

Pragmatic Science

Another posting that was languishing in my drafts folder. Not sure why I didn’t publish it when I wrote it, but here it is now!

  • Berwick (2005) wrote an interesting commentary called “Broadening the view of evidence-based medicine” in which he describes how “scholars in the last half of the 20th century forged our modern commitment to evidence in evaluating clinical practices” (p. 315). Though it was seen as unwelcome at the time, they brought the scientific method to bear on the clinical world, and over time, the randomized controlled trial (RCT) became the “Crown Prince of methods […] which stood second to no other method” (p. 315). And while there has been a huge amount of benefit from this, he says “we have overshot the mark. We have transformed the commitment to ‘evidence-based medicine’ of a particular sort into an intellectual hegemony that can cost us dearly if we do not take stock and modify it” (p. 315). He points out that there are many ways of learning things:
    • “Did you learn Spanish by conducting experiments? Did you master your bicycle or your skis using randomized trials? Are you a better parent because you did a laboratory study of parenting? Of course not. And yet, do you doubt what you have learned?” (p. 315)
    • “Much of human learning relies wisely on effective approaches to problem solving, learning, growth, and development that are different from the types of formal science […and …] some of those approaches offer good defences against misinterpretation, bias, and confounding.” (p. 315).

  • He warns that limiting ourselves to only RCTs “excludes too much of the knowledge and practice that can be harvested from experience, itself, reflected upon” (p. 316)
  • “Pragmatic science” involves:
    • “tracking effects over time (rather than summarizing with stats)
    • using local knowledge in measurement
    • integrating detailed process knowledge into the work of interpretation
    • using small sample sizes and short experimental cycles to learn quickly
    • employing powerful multifactorial designs (rather than univariate ones focused on “summative” questions)” (p. 316)
  • explanatory trials:
    • evaluating efficacy (how well does it work in a tightly controlled setting)
    • clinical trials that test a causal research hypothesis in an ideal setting
    • high internal validity
    • test sample & setting: focus on homogeneity
  • pragmatic trials:
    • evaluating effectiveness (how well does it work in “real life”)
    • trials that help users decide between options
    • high external validity
    • test sample & setting: focus on heterogeneity
  • explanatory and pragmatic are not a dichotomy as most trials are not purely one or the other – there is a spectrum between them
  • Thorpe et al (2009) created a tool (called PRECIS) to help people designing clinical trials to distinguish where on that pragmatic-explanatory continuum their trial falls; it involves looking at 10 domains (see table below), with scores on these criteria placed on a 10-spoke wheel (to give you a spider diagram type of picture)
Criteria (explanatory trials vs. pragmatic trials):
  • participant eligibility
    • explanatory: strict
    • pragmatic: everyone with condition of interest can be enrolled
  • experimental intervention – flexibility
    • explanatory: strict adherence to protocol
    • pragmatic: highly flexible; practitioners have leeway on how to apply the intervention
  • experimental intervention – practitioner expertise
    • explanatory: narrow group, highly skilled
    • pragmatic: broad group of practitioners in broad range of settings
  • comparison group – flexibility
    • explanatory: strict; may use placebo instead of “usual practice”/“best alternative”
    • pragmatic: “usual practice”/“best alternative”, practitioner has leeway on how to apply it
  • comparison group – practitioner expertise
    • explanatory: standardized
    • pragmatic: broad group of practitioners in broad range of settings
  • follow-up intensity
    • explanatory: extensive follow-up & data collection; more than would routinely occur
    • pragmatic: no formal follow-up; use administrative database to collect outcome data
  • primary trial outcome
    • explanatory: outcome known to be direct & immediate result of intervention; may require specialized training
    • pragmatic: clinically meaningful to participants; special tests/training not required
  • participant compliance with intervention
    • explanatory: closely monitored
  • practitioner compliance with study protocol
    • explanatory: closely monitored
  • analysis of primary outcome
    • explanatory: intention-to-treat analysis usually used, but usually supplemented with “compliant participants” analysis to answer the question of “does this intervention work in the ideal situation?”; analysis focused on narrow mechanistic questions
    • pragmatic: intention-to-treat analysis (includes all patients regardless of compliance); meant to answer the question of whether the intervention works in “real world” conditions, “with all the noise inherent therein” (Thorpe et al, 2009)
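
The difference between an intention-to-treat analysis and a “compliant participants” analysis is easier for me to see with a toy example. This is a minimal Python sketch with ten invented participants; the data and function names are my own assumptions for illustration and are not taken from Thorpe et al.

```python
# Each hypothetical participant: (assigned arm, complied with assignment?, outcome improved?)
# All rows are invented for illustration only.
participants = [
    ("intervention", True, True),
    ("intervention", True, True),
    ("intervention", True, False),
    ("intervention", False, False),
    ("intervention", False, False),
    ("control", True, True),
    ("control", True, False),
    ("control", True, False),
    ("control", True, False),
    ("control", True, False),
]


def improvement_rate(rows):
    """Proportion of the given participants whose outcome improved."""
    return sum(1 for _, _, improved in rows if improved) / len(rows)


def intention_to_treat(participants, arm):
    """Everyone assigned to the arm counts, regardless of compliance."""
    return improvement_rate([p for p in participants if p[0] == arm])


def compliant_only(participants, arm):
    """Only participants who actually followed their assigned arm count."""
    return improvement_rate([p for p in participants if p[0] == arm and p[1]])


itt = intention_to_treat(participants, "intervention")
compliant = compliant_only(participants, "intervention")
control = intention_to_treat(participants, "control")

print(f"Intention-to-treat improvement rate: {itt:.2f}")
print(f"Compliant-participants improvement rate: {compliant:.2f}")
print(f"Control improvement rate: {control:.2f}")
```

In this made-up data the compliant-only rate is higher than the intention-to-treat rate, because the people who didn't follow the intervention (and didn't improve) are dropped. The first speaks to the ideal situation, the second to what happens when the intervention meets real-world noise.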

I also came across this article in Forbes magazine: Why We Need Pragmatic Science, and Why the Alternatives are Dead-Ends. It’s a short read, but it succinctly summarizes an argument I find myself often making: science is a powerful tool for understanding and explaining the world. It’s not the only tool (philosophy and the other humanities, for example, are great tools for different purposes), but it’s certainly the best one for certain purposes and it’s a fantastic one to have in our toolbox!


Berwick, D.M. (2005). Broadening the view of evidence-based medicine. Quality & Safety in Health Care. 14: 315-316.

Thorpe, K.E., Zwarenstein, M., Oxman, A.D., Treweek, S., Furberg, C.D., Altman, D.G., Tunis, S., Bergel, E., Harvey, I., Magid, D.J., & Chalkidou, K. (2009). A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. Canadian Medical Association Journal. 180(10): E47-E57.
