1.3 Integrates the Canadian Evaluation Society’s stated ethics in professional practice and ensures that ethical oversight is maintained throughout the evaluation.
Like many evaluators, a lot of my knowledge of ethics comes from the research world. I recently completed the latest version of the TriCouncil’s online Course on Research Ethics, which was required by the organization I work for as the course has been updated since I originally took their ethics training a long, long time ago. A lot of the concepts from research ethics – informed consent of participants, do no harm, justice, etc. – are applicable to evaluation as well.
As for how I integrate ethics into my work and ensure that ethical oversight is maintained throughout the evaluation, a few things that I do include:
creating systems to protect the privacy of data that my team and I collect, such as only storing data on secure networks and using passwords to protect data
discussing ethical considerations, such as confidentiality, conducting rigorous evaluations, and reporting findings accurately and completely (just to name a few), with my team throughout the evaluation process
holding strong on my commitment to do my work ethically, even when it is challenging. I consider my integrity to be a very important part of being an evaluator. Without integrity, there would be no point to doing the work that I do.
One area of ethical considerations that I’m seeking to learn more about is equity in evaluation. Since I don’t work in an area where there is an obvious equity lens – such as there would be working with a non-profit that explicitly focuses on equity, for example – I find it challenging to see how my work links with equity. But inequities often stem from institutions and systems where power imbalances and institutionalized racism/sexism/ableism/and many other -isms are so embedded and are often difficult for someone with a lot of privilege (such as a straight, white, cis person such as myself) to see. So I figure that this is an area that I need to learn more about so that I can do better. Two great resources that I’ve heard about recently for learning more about equity and evaluation are Equitable Evaluation and We All Count.
The CES ethics statement is currently under review, as it is about 20 years old. I went to a session at the CES 2018 conference where they were consulting with evaluators to see if the statement needed some tweaks, or a complete overhaul. The group I was in felt it was the latter and I know there is a committee that is hard at work at revising that statement. I’m actually quite looking forward to seeing what they come up with – and I’m sure I’ll write a blog posting on it once it comes out – now that I’m on such a roll with writing here!
When I joined the Australasian Evaluation Society (the year I went to their conference – it’s cheaper to join the society and pay the member conference registration fee than to just pay the non-member registration fee, so I joined), I had to attest to the fact that I would adhere to their ethical guidelines. I’m interested to see if CES will do the same with their new ethics statement when it’s released.
Given my interest in complexity and evaluation, I decided to take the American Evaluation Association (AEA) eStudy course being facilitated by Jonny Morell. I’ve seen Jonny speak at conferences before and have learned some useful things, so figured I could learn a few things from him in this more extended session. Sadly, the live presentation of the eStudy conflicts with other meetings that I have, so I’m only going to be able to see part of the presentations live and will have to watch the other parts of the presentations after the fact from the session recording.
Here’s one quote from his posting on complexity having awkward implications for evaluators that jumped out at me:
Contrast an automobile engine [not complex] with a beehive, a traffic jam, or an economy [complex]. I could identify each part of the engine, explain its construction, discuss how an internal combustion engine works, and what role that part plays in the operation of the engine. The whole engine may be greater than the sum of its parts, but the unique role of each part remains. The contribution of each individual part does not exist with beehives, traffic jams, or economies. With these, it may be possible to identify the rules of interaction that have to be in place for emergence to manifest, but it would still be impossible to identify the unique contribution of each part.
This really is a challenge for evaluators! Imagine being hired to evaluate a program – your job is to answer “what happens as a result of this program?”, but you know that your program is just one part of a larger, complex system, so you can never really definitively say “this program, and this program alone, caused X, Y, and Z”, as you know that outcomes are affected by so many things that are outside of the control of the program. That is the situation that we evaluators find ourselves in all the time. That’s not to say that we can’t do anything, but just that we need to be thoughtful in how we try to determine what results from a program in the context of everything else in the system. Learning about complexity and systems thinking can help us do that.
I had a conflicting meeting during the first session, held on July 9, 2019, so I watched the recording afterwards. Here are my notes:
people seem to think complexity is “mysterious” and “magic” – Jonny feels it is not
he feels that “most of the time you won’t have to use it at all”
if you learned thematic analysis or regression, you’d say “cool method, I’ll use it when it is needed and I won’t use it when I don’t need it”. He thinks complexity should be the same – use when it’s needed.
you might use complexity instead of another method (like you might say “thematic analysis is better than how I’ve been analyzing open ended survey data. I will use thematic analysis instead of what I was doing before”)
but you could also think about it like this: it can help you change how you conceptualize the problem, the data analysis strategies – “you begin to think differently about the world”
people seem to think that you need to use new fancy tools to apply complexity – and sometimes you do, but often you don’t – you can use familiar methods while applying complexity concepts
there’s no agreed upon definition of complexity – but he doesn’t worry about that
“systems” is a huge area (but he’s not that interested in it – though he did plug the AEA Systems TIG)
“complexity” also a huge area – and he thinks lots of the concepts are useful to evaluators
“I don’t know what complex systems are, but I know what complex systems do. I can work with that” – we can use that to make practical decisions on models, on methods, data interpretation, how to conceptualize the program.
He thinks that complexity is popular in evaluation today because there is a sense that programs aren’t successful and evaluators are the messengers (and people are shooting the messenger). And people think that maybe complexity can help explain why programs aren’t working.
“The fact that everything is connected to everything else is true, but useless.” He wants to help us learn the “art” of getting a sense of what connections are worth dealing with and which aren’t. We need to “discern meaning within the fact that everything is connected to everything else.”
Cross cutting themes in complexity science
predictability – what can we predict and how well can we predict it
how change happens –
Complex behaviours that might be useful in evaluation (not everything that you’ll read about when you read about complexity is useful in evaluation):
unpredictable outcome chains
network effects among outcomes
joint optimization of uncorrelated outcomes
It’s hard to talk to people (like evaluation stakeholders) about complexity
if we show people a logic model or theory of change, they can understand how things they do in their program are believed to lead to outcomes they are interested in
but things like network effects, or the idea that a program might benefit a few people a lot and most people not at all, are things we aren’t used to talking to evaluation stakeholders about
it’s difficult to say to people that we might not be able to show “intermediate outcomes” on the way to long-term outcomes (because results aren’t so linear)
your program may have negative effects in the broader system (programs are siloed, so you are only working within your own scope and aren’t concerned (or incentivized to be concerned) about stuff outside of your program). If we throw all of our financial and intellectual resources into HIV, we’d make a lot of improvements with respect to HIV. But that pulls the resources away from prenatal care, palliative care, primary care, etc., etc., etc. You are “impoverishing” the environment for every other program – and those programs will have to adapt to that.
preferential attractors – e.g., snowflakes – the odds of a molecule attaching to a big clump are greater than the odds of it attaching to a little clump; same thing with business – you are more likely to attach to a bigger centre of money than a small one
emergence is NOT “the whole is greater than the sum of the parts” – it’s about the WAY that the whole is greater than the sum of the parts. An engine is greater than the sum of its parts. But I could explain what the contribution of each of the parts is to the engine. That’s not the same for complex systems (like traffic jams, beehives, or economies) – you can’t explain the whole economy based on the contribution of each of its parts. Not just because we haven’t studied these enough – but because it is “theoretically impossible” to do so.
“Ignoring complexity can be rational, adaptive behaviour”
stovepipes are efficient ways to get things done
different programs have different time horizons
different organizations have different cultures
it takes resources to coordinate different programs/systems/organizations
Even if our stakeholders don’t buy into complexity, it’s still important for evaluators to think about and deal with
“if program designers build models that do not incorporate complex behaviour, they will:
miss important relationships
not be able to advocate effectively
not be effective in making changes to improve their programs
misunderstand how programs operate and what they may accomplish
these problems cannot be fixed in an evaluation, but it is still possible to evaluate the complex behaviours in their models”
e.g., he showed a logic model and talked about how, if you have a bunch of arrows leading into an outcome, you need to ask whether those are “AND” or “OR” (i.e., do you need all of the outputs to happen to lead to that outcome, or do you only need one? Or only need some combo?). He also added unintended consequences and talked about network effects.
the evaluator can still look at these complex behaviours – look for the data to support it. You can superimpose a complex model on top of the traditional logic model. You can do this even if the program stakeholders only see the logic model. You can show them the data interpreted based on their logic model, and then also show them how the data relates to the model that includes complexity (that might be what it takes to incorporate it).
He thinks most unintended consequences are undesirable. There are methods for measuring unintended consequences, and they can be measured within the scope of an evaluation.
Jonny hates the “butterfly effect” because, in his world, he doesn’t see big changes happening super easily. He sees people making lots of policy/program changes, but the outcomes don’t change! His take on sensitivity to initial conditions is that you can run the same program multiple times and get different results each time because there are differences in the context where it’s implemented, and so you can’t necessarily replicate the outcome chain. But if the program is operating within an attractor, you might be able to get to the same ultimate outcome.
E.g., if you roll a boulder down a hill, you won’t be able to predict its exact path (e.g., it might hit a pebble, wind might move it), but we know it will end up at the bottom of the hill because there is an attractor (gravity).
He’s not arguing to not measure intermediate outcomes, but we should think about these concepts [and maybe not be too overconfident in what we think we know about the outcome chain?]
The standards are categorized under five categories:
In each of these categories, there are several standard statements that describe what high quality evaluations should do. For example, under the category of “utility”, there are 8 statements of what evaluations should do to be useful and under the “propriety” category, there are 7 statements of what evaluations should do to be ethical, just, and fair.
As I reviewed the standard statements for this blog posting, I noticed that both the CES and AEA, which both list the statements on their websites, include the following note: “Authors wishing to reproduce the standard names and standard statements with attribution to the JCSEE may do so after notifying the JCSEE of the specific publication or reproduction.” So, since I haven’t notified the JCSEE that I would like to reproduce the statements here on my blog, I can’t do so. You can read them over on the CES website though.
But the standards are more than just the statements. There’s a whole book published by the JCSEE that describes the standard statements in detail, explaining where the standards come from and how they can be applied.
It should also be noted that, despite the “should” wording of the standards, they aren’t meant to be slavishly followed, but to be applied in context and with nuance.
The standards also exist in tension with each other and you have to figure out the right balance. For example, there is a standard that says you should use resources efficiently, but another standard that says you should include the full range of groups and people who are affected by the program being evaluated. Evaluators need to find the balance between being thorough and being efficient in our use of resources.
In terms of my own practice, I think I can be more explicit in my use of the program standards. I’ve been an evaluator for a decade and I’ve integrated a lot of the standards into my work such that it’s just second nature (things like being efficient in my use of resources, using effective project management practices, using reliable and valid information, and being transparent). But there are other standards for which, as I read them, I think “I could probably do better” (e.g., being more explicit about my evaluation reasoning or encouraging external meta-evaluation).
The evaluation standards are such a big topic that I’m barely scratching the surface here. So once I’m done this blog series on evaluator competencies, my next series is going to be on the evaluation standards! I think that will be a good way to get me to spend a bit of time reflecting on each of the standards and thinking about how I can improve my practice related to each one. And I’ll be sure to contact the JCSEE to let them know I’d like to reproduce the standard statements here on my blog!
So I had an idea. As I ease my way into blogging in a more reflective way, I thought that perhaps I could do a blog series about the Canadian Evaluation Society (CES) evaluator competencies, where in each post I reflect on one of the competencies. The competencies have been recently revised, so it seems like a good time to do this. Plus, having a series will give me ideas for topics – and hey, let’s make it every Sunday, so that I’ll have a deadline as well. This seems like a good way to get me into the habit of writing here.
What Are Evaluator Competencies?
Competencies are defined as “the background, knowledge, skills, and dispositions program evaluators need to achieve standards that constitute sound evaluations.” (Stevahn et al, 2005)
The competencies were created as part of the program for the Credentialed Evaluator (CE) designation. To get the designation, one has to demonstrate that they have education and/or experience related to 70% of the competencies in each of the five domains. I got my CE under the original set of competencies, but anyone applying now would use the new set. It was a few years ago that I did my CE application, so it’s another reason why it’s a good time for me to reflect on where I am now with respect to the competencies.
The five competency domains are:
“Reflective Practice competencies focus on the evaluator’s knowledge of evaluation theory and practice; application of evaluation standards, guidelines, and ethics; and awareness of self, including reflection on one’s practice and the need for continuous learning and professional growth.” (Source)
1.1 Knows evaluation theories, models, methods and tools and stays informed about new thinking and best practices.
I have taught Program Planning and Evaluation at both SFU (in the Masters of Public Health program) and UBC (in the Masters of Health Administration program) in the past couple of years, and I find that teaching is a great way to both deepen my own understanding of evaluation theories, models, methods, and tools and to stay informed about new thinking and best practices. In deciding what to include in a course, and how best to present it, and coming up (whenever possible) with activities the class can do to learn it, I learn more every time I prepare, update, and deliver a class. Also, students ask great questions (sometimes even after a class has ended and they’ve gone on to work in places where they are involved in evaluation) and sometimes it’s things that I’m not familiar with and I have to go and do some research to find out more.
I think my main reflection related to this area is that I am a firm believer that there is no one “right” way to do evaluation, and that it is best to start with what the purpose of an evaluation is and then figure out what approach, design, and methods will best help you achieve the purpose. Oftentimes, those requesting an evaluation come to it with assumptions about methods or design – like “I need you to do a survey of the program clients” or “how can I set up a randomized controlled trial to evaluate my program?” So I often find myself saying things like “Let’s begin at the beginning. Why do you want an evaluation? What do you want to know? What will you do with that information once you have it?”
Given that I think it’s important to find the best fit of approach, design, and methods to the purpose of an evaluation, it means that I need to be familiar with lots of different theories, models, methods and tools!
I attend evaluation conferences and pick sessions where I can learn about new things – and deepen my understanding of things I’m familiar with. For example, at the most recent CES conference, I took a workshop on reflective practice to deepen my skills in that area (which I’m now actively working on integrating into my life), I attended a session on rubrics to learn more about those (next step there is to try applying rubrics to an evaluation!), and I attended a session on realist evaluation (next step there is to have the presenter come to my class as a guest speaker so that I and my students can learn more!).
I include a section in my course on “hot topics” in evaluation, which gives me the opportunity to explore the latest thinking in evaluation with my students. Recently, I’ve included complexity and systems thinking, and indigenous evaluation (expect to read more about indigenous evaluation when I get to competency 3.7 in this blog series). I also try to demonstrate reflective practice and humility to my students by telling them that I am exploring new areas, so I’m not an expert in these topics (especially indigenous evaluation), but that I’m sharing my learning journey with them.
I recently had coffee with a new friend and fellow evaluator, Meagan Sutton. We were introduced by a mutual friend who knew that Meagan was interested in chatting with evaluators who write blogs and that I am an evaluator who writes a blog! We had a great chat and it got me thinking about why I have this blog and how I might grow what I do with it.
I originally started this blog as a place to keep notes of work-related stuff I was reading. I have a pretty terrible memory and I find my personal blog a great way to remember stuff that I did – it’s easy to search through and accessible anywhere with an Internet connection – so I figured rather than having notes in various notebooks and jotted down in the margins of printed copies of journal articles, I could use this blog as my brain dump for various things I learn. (I briefly co-opted this blog for blog postings I was required to do during an Internet marketing class that I took in my MBA, but then switched it back to stuff related to my work.) So whenever I went to a conference, attended a webinar, or read a book or article where I wanted to record what I was learning, I dumped it on this blog. I am an external processor, so it helps me to remember and understand things when I write them down. For webinars I tend to take notes directly into my blog and publish that, but for conferences, I usually write notes on paper during the conference. That’s partially because it helps keep me awake and attentive during conference sessions and partially because I don’t like lugging my laptop around during a conference, but also because I find it helpful to look at all the notes I’ve taken and sort or synthesize them together for the whole conference. If I type my notes during the conference, I find it harder to remove the superfluous stuff, whereas if I’m deciding what is worth typing out from a bunch of handwritten notes, I find it easier to be succinct, as I’ll select just the main points to blog about. The downside is that it often takes me quite a while to do that, and I can end up posting my conference summary blog posting many months later (though I made it a priority to do it more quickly after the last conference I attended and actually got it posted just two weeks after the conference instead of months and months later).
Meagan asked me how I promote this blog and honestly, I don’t. Since I saw the blog as mostly just an externalization of my memory, I didn’t think anyone else would ever want to read it. I have had a few people contact me after reading something on my blog that they found through Google – and actually have had some interesting conversations result – but it’s pretty rare.
Occasionally, I add some reflection into these blog postings – like thoughts about how what I was reading or learning at a conference might relate to work that I do, but that’s been pretty minimal.
At the same time, I’ve been working on improving my reflective practice, mostly through reflective writing that I’m doing privately rather than in a public forum like this. Part of that is because the reflections I’ve been writing are part of the data I am using in the evaluation I’m working on, so I need it documented where the rest of the data (including my team’s reflections) are. And part of it is because some of what I write about is confidential or politically sensitive, so is not for sharing publicly.
And this is where blogging as an evaluator can get sticky. Sometimes there are things you want to reflect on and process, and maybe even start a conversation with fellow evaluators about, but that you aren’t able to make anonymous for discussion in a public forum. Or you have conflicts with clients that you want to reflect on, but can’t do that publicly either. How does one navigate this? I honestly don’t know the answer, but as I think about expanding this blog to become more reflective, it’s something I’ll need to think more about.
I guess the flip side of this is: why do I want to put my reflections out into the world? I guess because I see it as an opportunity to engage with others. As I mentioned above, without even sharing my blog postings beyond just posting them here, I’ve had some interesting interactions with other evaluators who stumbled on my blog – imagine what could happen if I tweeted out these blog postings (like I do my personal blog postings with my personal Twitter account) and actually wrote some reflective stuff – things I’m thinking about/struggling with/wanting to know more about? Perhaps I could connect with others facing similar issues and get different perspectives on the things I’m thinking about.
Coffee – posted on Flickr by Jen with a Creative Commons license
Speakers:
Tammy Heinz, Program Officer, Hogg Foundation
Hayling Price, Senior Consultant, FSG
Darrell Scott, Founder, PushBlack
Julie Sweetland, Vice President for Strategy and Innovation, Frameworks Institute
Rick Ybarra, Program Officer, Hogg Foundation
“Systems change is about shifting conditions that are holding a problem in place”
“It’s not about getting more young people to beat the odds. It’s about changing the odds.”
6 conditions of systems change
structural change (policies, practices, resource flows (who gets funding and why? how are human resources allocated?)) [explicit – easiest to find and to change]
relationships & connections (not just having someone on your LinkedIn, but actually engaging), power dynamics (who is getting funded and why? some people have a leg up, some people are dealing with a history of oppression) [semi-explicit]
transformative change (mental models) [implicit]
mental models: deeply held beliefs, assumptions, etc.
the policies, practice, resource flows are not handed to us by nature – they are created by humans based on our mental models
PushBlack – nation’s largest nonprofit media platform for Black people
4 million subscribers with emotionally-driven stories about Black history, culture, and current events
through Facebook Messenger – meeting people where they are at
Go to Facebook Messenger and search “PushBlack” to sign up!
ran the largest get-out-the-vote campaign on social media in history in 2018
got subscribers to contact their friends (relates to relationships and connections part of the conditions of system change)
giving subscribers tools to work at the local level (e.g., to be heard when Black people are killed by police, to free innocent Black people)
test their messages with a small subset of their audience before sending out only the best performing messages to the broader audience
uses the phrase “cultural models”, which is a similar concept from anthropology
“cultural models are cognitive short cuts created through years of experience and expectation. They are largely automatic assumptions, and can be implicit”
“People rely on cultural models to interpret, organize and make meaning out of all sorts of stimuli, from daily experiences to social issues”
believe that understanding mental/cultural models helps you to identify which mental models are holding a problem in place
e.g., Google image search “ocean” and the top hits are pictures of “beautiful blue expanse” – this is a mental model that Americans hold of the ocean – this holds implications for policy:
people think it is so big, that it’s invincible
people think it’s water and think about the surface – not thinking about what’s underneath, about how it’s an ecosystem, it produces oxygen, it affects weather, etc.
it’s not that the ocean isn’t blue or isn’t big, but that’s just a piece of the picture
e.g., some people’s mental model of “teenager”, is about “risk and rebellion” – people defying expectations from adults. Again, not a complete picture.
3 models are consistently barriers to productive conversations on social issues (especially in American context, but they’ve also seen them internationally):
individualism: assumption that problems, solutions, and consequences happen at the personal level
us vs. them: assumption that another social group is distinct, different, and problematic (beyond people – can be human vs. animals; environment vs. economy)
fatalism: assumption that social problems are too big, too bad, or too difficult to fix
there are also mental models that are specific to a given situation, but the above three tend to show up in lots of areas
one thing that doesn’t work: correcting their mistakes
“myth busters” – they don’t work! A study of the myth-fact structure found that people misremembered the myths as true, that this got worse over time, and that they attributed the false information to the CDC (Skurnik et al. (2005), JAMA)
mental models are there because we’ve heard it so many times. When you restate a “bad” mental model, you reinforce it (e.g., if you state: Myth: Flu vaccines cause the flu, you reinforce their mental model that flu vaccines cause the flu (doesn’t matter that you said it was a “myth”))
never remind people of things you wish they’d forget
another thing that doesn’t work: giving people more information
it’s not that you shouldn’t use facts
but if people have a particular mental model, stacking data on top does not change their mental model
you need to help them build a new mental model
another thing that doesn’t work: leaving causation to the public imagination
leaving people with their bad mental models won’t help
instead of trying to rebut people’s misunderstanding – try to redirect attention to what is true and how things do work
Tammy Heinz and Rick Ybarra
Hogg Foundation for Mental Health
historically funded lots of programs and research
Mental Health has been focused on diagnosis and treatments, with end goal of symptom reduction
now moving their work upstream
traditionally, there has been a medical/disease model of health
in the 1970s, people started asking whether mental health issues were really chronic or whether people could get better
shifting a mental model is not something that can happen quickly
in the past 20 years, there’s been some deliberate work to shift the thinking around mental health
huge shift towards peers helping in mental health care teams
thinking about “recovery” – it’s not an expectation of only symptom control
there are multiple mental models on an issue – you can call up a more productive mental model (e.g., maybe “fatalism” is the first thing that comes to mind, but you can call up a more productive one instead)
how do you figure out what mental models people are using?
Hayling: we are constantly testing out models through our work
Julie: ask people “what are ideas you wish you’d never hear again?” and you’ll get a pretty good idea of the mental models that are being a problem
how do you change mental models around emotionally charged issues?
Rick: listening. Figure out what mental models are driving things. Really learn and understand where people are coming from.
Tammy: being clear about where you want to go
Hayling: make things plain
Julie: call people in rather than calling them out
This year’s conference was in Halifax and, as always, it was a wonderful opportunity to reconnect with my evaluation friends, make some wonderful new friends, to pause and reflect on my practice, and to learn a thing or two. And I think this is quite possibly the fastest I’ve ever put together my post-conference recap here on ye old blog! (The conference ended on May 29 and I’m posting this on June 14!)
Student Case Competition
The highlight of the conference for me this year was the Student Case Competition finals. In this competition, student teams from around the country, each coached by an experienced evaluator, compete in round 1 where they have 5 hours to review a case (typically a nonprofit organization or program) and then complete an evaluation plan for that program. Judges review all the submissions and the top 3 teams from round 1 move on to the finals, where they get to compete live at the conference. They are given a different case and have 5 hours to come up with a plan, which they then present to an audience of conference goers, including representatives from the organization and three judges. After all three teams present, the judges deliberate and a winning team is announced!
I had the honour of coaching a team of amazing students from Simon Fraser University. The competition rules do not allow teams to talk to their coaches when they are actually working on the cases, so my role was to work with them before the round, talking about strategies for approaching the work, as well as chatting with them about evaluation in general. Most of the students on the team had not yet taken an evaluation course, so I also provided some resources that I use when I teach evaluation.
I will admit that I was a bit nervous watching the presentations – not because I didn’t think my team would do well, as I know they worked really hard and are all exceptionally intelligent, enthusiastic and passionate, but because it’s a huge challenge to come up with a solid evaluation plan and a presentation in such a short period of time, and because they were competing among the best in the country!
But I need not have been worried. They came up with such a well thought through, appropriate to the organization, and professional plan and presented it with all the enthusiasm, professionalism, grace, and passion that I have come to know they possess. I was definitely one proud evaluation mama watching my team do that presentation and so very, very proud of them when they won! Congratulations to Kathy, Damien, Stephanie, Manal, and Cassandra! And to Dasha, who was part of the team that won round 1, but wasn’t able to join us in Halifax for the finals.
Kudos also go to the two other teams who competed in the finals – students from École nationale d’administration publique (ENAP) and Memorial University of Newfoundland (MUN). Great competitors and, as I had the pleasure of learning when we all went out to the pub afterwards, as well as chatting at the kitchen party the next night, all very lovely people!
As usual, I took a tonne of notes throughout the conference and, as usual for my post-conference recaps, I will:
summarize some of my insights, by topic (in alphabetical order) rather than by session as I went to some different sessions that covered similar things
where possible, include the names of people who said the brilliant things that I took note of, because I think it is important to give credit where credit is due. Sometimes I missed names (e.g., if an audience member asked a question or made a statement, as audience members don’t always state their name or I don’t catch it)
apologize in advance if my paraphrasing of what people said is not as elegant as the way that people actually said them.
Anything in [square brackets] is my thoughts that I’ve added upon reflection on what the presenter was talking about.
every time I go to CES, I find I learn a little bit more about how the federal government works (since so many evaluators work there!). This time I learned that Canada Revenue Agency (CRA) doesn’t report up to Treasury Board – they report to Finance
the indigenous welcome to the conference was fantastic and it was given by a man named Jude. I didn’t catch his full name and I couldn’t find his name in the conference program or on Twitter. [Note to self: I need to do better at catching and remembering names so I can properly give credit where credit is due]. He talked about how racism, sexism, ableism, transphobia, and other forms of oppression are at play in the world today. He also talked about how there is a difference between guilt and responsibility. We need to take responsibility for making things better now, not just feel guilty about the way things are.
Nan Wehipeihana talked about an evaluation of a sports participation program and how they moved from sports participation “by” Māori to sports participation “as” Māori. They talked about what it would look like to participate “as” Māori (e.g., using Māori language; Māori structures (tribal, subtribal, kin groups) are embedded in the activity; activities occur in places that are meaningful to Māori people (e.g., kayaking on our rivers, activities on our mountains)). They developed a rubric in the shape of a five-point star (it took a year to develop).
I went to a Lightning Roundtable session hosted by Larry Bremner, Nicole Bowman, and Andrealisa Belizer where they were leading a discussion on Connecting to Reconciliation through our Profession and Practice. One of the things that Larry mentioned that struck me was the importance of not just indigenous approaches to evaluation, but indigenous approaches to program development. It doesn’t make sense to design a program without indigenous communities as equal partners and then to say you are going to take an indigenous approach to evaluation – the horse has left the barn by that point.
They also talked about how evaluators are culpable for the harm that is still happening because we haven’t done right in our work. They talked about how the CES needs to keep the government’s feet to the fire on the Truth and Reconciliation Commission’s (TRC) Calls to Action. Really, after the Commission, there should have been a TRC implementation committee who could go around the country and help get the Calls to Action implemented (Larry Bremner).
I also went to a concurrent session where the panelists were discussing the TRC Calls to Action. They pointed out that CBC has a website where they are tracking progress on the 94 Calls to Action: Beyond 94.
CES added a competency about indigenous evaluation in its recent updating of the CES competencies:
3.7 Uses evaluation processes and practices that support reconciliation and build stronger relationships among Indigenous and non-Indigenous peoples.
Many evaluators saw this new competency and said “I don’t work with indigenous populations, so how can I relate to this competency?” [I will admit, I had that thought as well when the new competencies were announced. Not that I don’t think this is an important competency for evaluators to have – but more that I didn’t know how to apply it in the work I am currently doing or where to start in figuring out what I should do.]. The CES is trying to provide examples to support evaluators. (Linda Lee) E.g.:
I also learned that EvalIndigenous is open to indigenous and non-indigenous people – anyone who wants to move forward indigenous worldviews and wants indigenous communities to have control of their own evaluations. So I joined their Facebook group! (Nicole Bowman and Larry Bremner)
Evaluators typically use a Western European approach and many use an “extractive” evaluation process, where they take stuff out of the community and leave (I can’t remember if this slide was from Larry Bremner or Linda Lee).
I also found this discussion of indigenous self-identification helpful (Larry Bremner):
There is still so much work to do and so much harm being inflicted on indigenous people:
there are more indigenous kids in care today than were in residential schools – this is the new residential schools. (Larry Bremner)
During the discussion with the audience, some audience members mentioned “trauma tourism” – that it can be re-traumatizing for indigenous people to share traumas they have experienced and non-indigenous people, in their attempts to learn more about the experiences of indigenous people need to be mindful of this and not further burden indigenous people.
If you google “indigenous women”, all the results you get are about missing and murdered indigenous women and girls. Where is the focus on the strengths in the community?
evaluators are learners (Barrington)
Bloom’s Taxonomy is a hierarchy of cognitive processes that we go through when we do an evaluation – notice that evaluation is at the top – it’s the hardest part (Gail Barrington)
single loop learning is where you repeat the same process over and over again, without ever questioning the problem you are trying to fix (sort of like the PDSA cycle). There’s no room for growth or transformation. (Gail Barrington)
in contrast, double loop learning allows you to question if you are really tackling the correct problem (sometimes the way that the problem is defined is causing problems/making things difficult to solve) and the decision making rules you are using, allowing for innovation/transformation/growth. (Gail Barrington)
“Pattern matching is the underlying logic of theory-based evaluation” – specify a theory, collect data based on that, see if they match (Sebastian Lemire)
Trochim wrote about both verification AND falsification, but in practice most people just come up with a theory and try to find evidence to support it (confirmation bias) (Sebastian Lemire)
humans are wired to see patterns, even when they aren’t there and we tend to focus on evidence in support of the patterns (Sebastian Lemire)
having more data is not the solution! (Sebastian Lemire)
e.g., when people were given more information on horses and then made bets, they didn’t get any more accurate in their bets, but they did get more confident in their bets
evaluators need to do reflective practice – e.g., to look for our biases (Sebastian Lemire)
structured analytic techniques (see slide below) – not a recipe, but a structured process (Sebastian Lemire)
pay attention to alternative explanations – in the context of commissioned evaluations, it can be hard to get commissioners to agree to you spending time on looking at alternative explanations and we often go into an evaluation assuming that the program is the cause (bias) (Sebastian Lemire)
falsification: specify what data you would expect to see if your hypothesis was wrong (Sebastian Lemire)
Power and Privilege
since we have under-served, under-represented, and under-privileged people, we must also have over-served, over-represented, and over-privileged people (Jude, who gave the indigenous welcome. I didn’t catch his last name and I can’t find it on the conference website)
recognize your power and privilege, recognize your biases and think about where they come from and work to prevent your biases from affecting your work (Jude, who gave the indigenous welcome. I didn’t catch his last name and I can’t find it on the conference website)
and speaking of power and privilege, the opening plenary on the Tuesday morning was a manel. For the uninitiated, a “manel” is a panel of speakers who are all male. It’s an example of bias – men being more often recognized as experts and given a platform as experts when there are many, many qualified women. I called it out on Twitter:
a friend of mine who is a male re-tweeted this saying he was glad to see that someone called it out and when I spoke to him later, he told me that people were giving him kudos for calling it out and he had to point out that it was actually a woman who called it out. So another great example of women being made invisible and men getting credit.
I do regret, however, that I neglected to point out that it was a “white manel” specifically. There’s so much more to diversity than just “men” and “women”!
Michelle Naimi (who I know from the BC evaluation scene) gave a great presentation on a realist evaluation project she’s been working on related to violence prevention training in emergency departments. My notes on realist evaluation don’t do it justice, but I think my main learning here is that this is an approach that I can learn more about. I’m definitely inviting her as a guest speaker the next time I teach evaluation!
I took a pre-conference workshop, led by Gail Barrington, on reflective practice. This is an area that I’ve identified that I want to improve in my own work and life, and a pre-conference workshop where I got to learn some techniques and actually try them out seemed like a perfect opportunity for professional development.
Gail talked about:
how she doesn’t see her work and her self as separate – they are seamless
if you don’t record your thoughts, they don’t endure. (How many great ideas have you had and lost?) [I’d add, how many great ideas have you had, forgotten about, and then been reminded of later when you read something you wrote?]
evaluators are always serving others – we need to take care of ourselves too
The best part of the workshop was that we got to try out some techniques for reflective practice as we learned them
Warm up activity: In this activity, we took a few minutes to answer the following questions:
-Who am I? -What do I hope to get out of this workshop? -To get the most out of this workshop, I need to ____
Then we re-read what we wrote and answered this:
-As I read this, I am aware that __________
and that is an example of reflection!
[Just had an idea! I could use that at the start of class to introduce the notion of reflective practice from the beginning of class. If I turn my class into more of a flipped classroom approach, I could have more in-class time to do fun, experiential things like this than listening to lecture 🙂 ]
Resistance Exercise: Another quick writing exercise:
-What are the personal barriers that hold me back from reflection? -What are the lifestyle/family barriers that hold me back from reflection? -What barriers at work are holding me back from being transformative?
Then we re-read what we wrote and answered this:
-As I read this, I am aware that __________
The Morning Pages:
Write three pages of stream of consciousness first thing in the morning in a journal that you like writing in. Before you’ve done anything else – and before your inner critic has woken up. If you can’t think of anything to write, just write “I can’t think of anything to write” over and over again until something comes to you.
All sorts of things will pop up – might be ideas for a project you are working on, or “to do” items to add to your list. You can annotate in margins, transfer things to your main to do list later, or some of it might not be useful to you now and you don’t have to look at it again.
Gail said it’s very different writing first thing in the morning compared to later in the day. I know that I’m unlikely to get up an extra half hour earlier than I already do, but I could give this a try on a weekend morning when I’m not feeling rushed to get to work to see if it’s different for me too.
Start Now Activity:
-The thoughts/ideas that prevent me from journaling now ____
Then we re-read what we wrote and answered this:
-As I read this, I am aware that __________
for some people, writing is not for them. An alternative is using a voice memo app. We gave it a try in the workshop and I was kind of meh on it, but I used it two more times during the conference when I had a quick thought I wanted to capture. I think the challenge will be that if I want to retrieve those ideas, I’ll need to listen to the recordings, which seems like a big time sink, depending on how much I say (as I can be verbose).
we also talked about meditation and went out on a meditative walk (Gail put up the quotation “solvitur ambulando”, citing St. Augustine, and noting that it is Latin for “solved by walking”. But when I just googled it, it turns out that it was actually from the philosopher Diogenes, and actually refers to something that is solved by a practical experiment). For our walk, we set an intention (to think about one thing that I’ll change at my work), then forgot about it and went for a mindful walk – paying attention to the sensations of walking (e.g., the feeling of your feet on the ground as you step, the colours and shapes and sounds and smells you encounter). It was a rainy day, but I was definitely struck with all the beauty around me, and was reminded about how beneficial mindfulness can be.
My take home from all my reflections in this workshop was:
taking time to do things like reflective practice and mindfulness meditation is a choice. I say that I don’t have enough time to do these things, but it’s actually that I have been choosing not to spend my time doing these things. There are a variety of reasons for those choices (which I did reflect on and got some valuable insights about). Remembering that this is a choice – and being more mindful of what choices I’m making – is going to be my intention as I return back to work after my conference/holiday.
I’ve been to sessions on Rubrics by Kate McKegg, Nan Wehipeihana, and their colleagues at a number of conferences and I always learn useful things. This year was no exception. The stuff in this section is all from McKegg and Wehipeihana (and they had a couple of collaborators who weren’t there but “presented” via video).
rubrics are a way to make our evaluative reasoning explicit
just evaluating on if goals are met is not enough. Rubrics can help us with situations like:
what counts as “meeting targets”? (e.g., what if you meet an unimportant target but don’t meet an important one? Or you way exceed one target and miss another by a little bit? etc.)
what if you meet targets but there are some large unintended negative consequences?
do the ends justify the means? (what if you meet targets but only by doing unethical things?)
whose values do you use?
3 core parts of a rubric:
criteria (e.g., reach of a program, educational outcomes, etc.)
levels (standards) (e.g., bad, poor, good, excellent; could also include “harmful”)
some people don’t like to see “harmful” as a level, but e.g., when we saw inequities, we needed a way to be able to say that it was beyond poor and actually causing harm
importance of each criterion (e.g., weighting)
sometimes all criteria are equally important and sometimes not
rubrics can be used to evaluate emerging strategies:
evaluation can be used in situations of complexity to track evolving understanding
in all systems change, there is no final “there”
in situations of complexity, cause-and-effect relationships are only really coherent in retrospect [i.e., they are not predictable] and do not necessarily repeat
we only know things in hindsight and our knowledge is only partial – we must be humble
need to be looking out continually for what emerges
in complexity thinking, we are only starting to see what indigenous communities have long known
our reality is created in relation and is interpretive
Western knowledge dismissed this
need to bring things together to make sense of multiple lines of evidence
“weaving diverse strands of evidence together” in the sensemaking process
we have to make judgments and decisions about what to do next with limited/patchy information. Rubrics give us a traceable method to make our reasoning explicit
having agreed on values at the start helps to navigate complexity
break-even analysis flips return-on-investment:
when you can’t do a full cost-benefit analysis (e.g., don’t have information on ALL costs and ALL benefits), can see if the benefits are at least greater than costs
think about how rubrics are presented – e.g., minirubrics with red/yellow/green
but that might not be appropriate in some contexts – e.g., if a program is just developing and it’s unreasonable to expect that certain criteria would be at a good level yet
a growing flower as a metaphor for different stages of different parts of a program may be more appropriate to a development program. May also be more appropriate in an indigenous context
it’s important to talk about how the criteria relate to each other (not in isolation)
they do each analysis separately (e.g., analyze the survey; analyze the interviews)
then map that to the rubric
then take that to the stakeholders for sensemaking; stakeholders can help you understand why you saw what you saw (e.g., when you see what might seem like conflicting results)
like with other evaluation stuff, might not say “we are building a rubric” to stakeholders at the start (it’s jargon). Instead, ask questions like “what is important to you?” or “If you were participating *as* Māori, what would that look/sound/feel like to you?”
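To make the three core parts of a rubric (criteria, levels, and importance) a bit more concrete for myself, here is a minimal sketch in Python. This is my own illustration and not something the presenters showed: the level labels, the example weights, the weighted-average synthesis, and the rule that any “harmful” rating overrides everything else are all assumptions, and the names `Rating` and `synthesize` are made up for this example. In a real evaluation, the synthesis rule would come out of the values agreed on with stakeholders.

```python
import math
from dataclasses import dataclass

# Levels ordered from worst to best; the labels are illustrative only.
LEVELS = ["harmful", "bad", "poor", "good", "excellent"]


@dataclass
class Rating:
    criterion: str       # e.g., "reach of the program"
    level: str           # one of LEVELS
    weight: float = 1.0  # relative importance of this criterion


def synthesize(ratings):
    """Combine per-criterion ratings into one overall level.

    Assumed rule: a single "harmful" rating makes the overall rating
    "harmful"; otherwise take the weighted average of the level
    positions and round to the nearest level (halves round up).
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r.level == "harmful" for r in ratings):
        return "harmful"
    total_weight = sum(r.weight for r in ratings)
    score = sum(LEVELS.index(r.level) * r.weight for r in ratings) / total_weight
    return LEVELS[math.floor(score + 0.5)]


# Hypothetical ratings: reach is treated as twice as important as outcomes.
example = [
    Rating("reach of the program", "good", weight=2.0),
    Rating("educational outcomes", "poor", weight=1.0),
]
print(synthesize(example))
```

The point of writing it down this way is the traceability: the criteria, the levels, the weights, and the rule for combining them are all visible, so anyone who disagrees with the overall judgment can see exactly which part of the reasoning they disagree with.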
Theory of Change
to be a theory of change (TOC) requires a “causal explanation” (i.e., a logic model on its own is not a TOC – we need to talk about why those arrows would lead to those outcomes) (John Mayne) [This also came up as a question to my case competition team – and my team gave a great answer! Did I mention I’m so proud of them?]
complexity affects the notion of causation – in complexity, there isn’t “a” cause, there are many causes (John Mayne)
people assume you have to have a TOC that can fit on one page – but that doesn’t always work – can do nested TOCs (John Mayne)
interventions are aimed at changing the behaviour of groups/institutions, so TOCs should reflect that (John Mayne)
there is lots of research on behaviour change, such as on Bennett’s hierarchy, or the COM-B model (John Mayne):
causal link assumptions – what conditions are needed for that link to work? (John Mayne) (e.g., could label the arrows on a logic model with these assumptions – Andrew Koleros)
As with pretty much any conference I go to, I came home with a reading list:
Concurrent session: Steering Evaluation Through Uncharted Waters by M. Elizabeth Snow, Alec Balasescu, Abdul Kadernani, Sheila Matano, Stephanie Parent, Monika Viktorin, Shadi Mahmoodi, Sandra Wu [this was my presentation!]
Some colleagues and I are presenting a poster at the Centre for Health Services & Policy Research conference on March 7-8, 2019. Rather than cluttering up our poster with a reference list, we are putting our references online here and our poster will have a QR code linked to this page. So if you’ve come looking for the references from our poster, you’ve come to the right place!
Cook, P. F., & Lowe, N. K. (2012). Differentiating the Scientific Endeavors of Research, Program Evaluation, and Quality Improvement Studies. Journal of obstetric, gynecologic, and neonatal nursing, 41(1), 1-3.
Hedges, C. (2009). Pulling It All Together: QI, EBP, and Research. Nursing Management, 40(4), 10-12.
Hill, S. L., & Small, N. (2006). Differentiating Between Research, Audit and Quality Improvement: Governance Implications. Clinical Governance: An International Journal, 11(2), 98-10
Naidoo, N. (2011). What is Research? A Conceptual Understanding. African Journal of Emergency Medicine, 1(1), 47-48.
Newhouse, R. P., Pettit, J. C., Poe, S., & Rocco, L. (2006). The Slippery Slope: Differentiating between Quality Improvement and Research. Journal of Nursing Administration, 36(4), 211-219.
Shirey, M. R., Hauck, S. L., Embree, J. L., Kinner, T. J., Schaar, G. L., Phillips, L. A., . . . McCool, I. A. (2011). Showcasing Differences Between Quality Improvement, Evidence-Based Practice, and Research. The Journal of Continuing Education in Nursing, 42(2), 57-68
United States Government Accountability Office. (2011, May). Performance Measurement and Evaluation: Definitions and Relationships. Retrieved March 3, 2017, from Program Performance Assessment: http://www.gao.gov/assets/80/77277.pdf
This year’s Canadian Evaluation Society (CES) conference was held in Calgary, Alberta and had a theme of Co-Creation. As always, I had a great time connecting with old friends and making new ones, learning a lot, and getting to share some of my own learnings too.
As I usually do at conferences, I took a tonne of notes, but for this blog posting I’m going to summarize some of my insights, by topic (in alphabetical order) rather than by session as I went to some different sessions that covered similar things. Where possible, I’ve included the names of people who said the brilliant things that I took note of, because I think it is important to give credit where credit is due, but I apologize in advance if my paraphrasing of what people said is not as elegant as the way that people actually said them. Anything in [square brackets] is my thoughts that I’ve added upon reflection on what the presenter was talking about.
I didn’t see as many things about complexity as I usually do at evaluation conferences.
No one person has their mind around a complex system. You need all the people in the room to understand it. Systems are messy because people are messy. (Patrick Field)
How do we get past the view that humans are supreme beings and the environment is just there to serve us (vs. a stewardship view)? This view is deeply embedded in our identities. Even the legal system is set up to prioritize making money (and views environmentalism as a “nuisance”) (Jane Davidson)
People have a fear of being evaluated on things they don’t (feel they) have control over: “don’t evaluate me on sustainability stuff! It’s affected by so much else!” Looking at outcomes that are outside the control of the program isn’t meant to be about evaluating a program/organization on their performance, but about identifying the things that are constraining them from achieving the outcomes they are trying to achieve. It’s not as though, if you only look at the things within the box of your program, you can really control all of the things in the box and they aren’t affected by the things outside your program. [No program is a closed system]. (Jane Davidson)
We focus on doing evaluations to meet the client’s requests, and maybe we stretch it to cover some other things. Sometimes you can slip in stuff that the client didn’t ask for but then you can use that to demonstrate the value of it. People often limit what they ask for to what they think is possible and sometimes you need to be able to demonstrate the possibilities first (Jane Davidson)
It’s not just about asking “how good were the outcomes”, but “how good was this organization in making the trade offs?”(Jane Davidson)
Evaluation Approaches and Methods
People limit their questions to what they think can be measured (e.g., I want to see indicator X move by Y%). When clients say “we can’t measure that!”, Jane tells them “Look, there are academics who spent their love life studying ‘love’. If they can do that, we can find a way to measure what you are really interested in. And it doesn’t have to be quantitative!” The client isn’t a measurement expert and they shouldn’t be limiting their questions to what they think can be measured. (Jane Davidson)
Once upon a time, evaluation was about “did you achieve your objectives?” but now we also think about the side effects too! (Jane Davidson).
Bower & Elnitsky talked about having to distinguish between evaluation/quality improvement/data collection/clinical indicators/performance indicators (and how, in their view, these aren’t different things) and to talk to their client about how evaluation adds value. This struck a chord with me as it was similar to some of the things that my co-authors and I talk about in a paper we currently have under review in the Canadian Journal of Program Evaluation.
Sarah Sangster felt that evaluation is like research, but more. She described how evaluation requires all the things you need to do research, but also has some things that research doesn’t (e.g., some evaluation-specific methods). She talked about how the ways that people sometimes try to differentiate evaluation and research are really shared (e.g., evaluation is often defined as referring to judging “merit/value/worth”, but research does that too (e.g., research judges the “best treatment”)). [Some of the things she talked about were things that my co-authors and I grappled with in our paper – such as how research is a lot more varied than people typically give it credit for (e.g., participatory action research or community-based research stretch the boundaries of traditional research in that the questions being explored come from community instead of from the researchers and the results are specifically intended to be applied in the community rather than just being knowledge for knowledge’s sake).]
The CES is updating its list of evaluation competencies – those things a person should know and be able to do in order to be a competent evaluator. The evaluation competencies are used by the society to assess applicants for the Credentialed Evaluator designation – people have to demonstrate that they’ve met the competencies. The competencies are being revised and updated and the committee is taking comments on the draft until June 30, 2018. They expect to finalize the new competencies in Sept 2018.
The CES is also looking at renewing its ethics statement, which hasn’t been updated in 20 years! I went to a session where we looked at the existing statement and it clearly needs a lot of work. The society is currently doing an environmental scan (e.g., looking at other evaluation societies’ ethics guidelines/principles/codes/etc.) and consultations with stakeholders (e.g., the session I attended at the conference) and plans to have a decision by the fall on whether they are going to just tweak the existing statement or completely overhaul it. They hope to have a finished product to unveil at next year’s CES conference.
During the session that Alec from my team led, which was a lightning round table where people circled through various table discussions, one of the things we talked about while discussing doing observations was ethics. For example, when we are doing observations, it is understood that if you are in a public place, you might be observed and it is ethical to observe people. The question arose “are hospitals public places?”
Evaluation, Use Of
There was a fascinating panel of 3 mayors who were invited to the conference to talk about what value evaluation can add for municipalities. None of the mayors had even heard of the Canadian Evaluation Society prior to being invited to the conference, so we definitely have our work cut out for us in terms of advocating for evaluation at the municipal level. There is definitely lots of evaluation work that can be done at the municipal level and it would be worthwhile for the society to educate municipal politicians about what we do and how it can help them. The mayors were open to the idea of using evaluation findings in their decision making. There was a suggestion that there should be a panel of evaluators at the Canadian municipalities conference, just like we had the mayors’ panel at our evaluator conference, and I seriously hope the CES pursues this idea.
Evaluation as intervention
Evaluators affect the things they evaluate. The act of observing is well known to affect the behaviour of those being observed. As well, we know that “what gets measured gets managed,” so setting up specific indicators that will be measured will cause people to do things that they might not otherwise have done. This is an important thing that we should be discussing in our evaluation work.
The opening keynote on Reconciliation and Culturally Responsive Evaluation was introduced with “You will feel uncomfortable and that is by design. Ask yourself why it makes you uncomfortable.” [The history – and present – of indigenous people is a hard thing to grapple with for many reasons. There are so many injustices that have been done – and continue to be done – and each of us participates in a system that perpetuates that injustice. Doing nothing about it is to do harm. And for those of us who are not indigenous, there can be a mixture of ignorance about our own history and ignorance about our actions and inactions that contribute to the injustice, privilege, and lack of knowing what we can do that can all contribute to this discomfort.]
Several of the people speaking about indigenous evaluation talked about the need for indigenous-led evaluation. We have a long history of evaluation and research in indigenous communities being led by non-indigenous people where they take from the community, don’t contribute to the community, don’t ask the questions that the community needs answered, don’t understand things from the communities’ perspectives, impose Western views/perspectives/models, and then leave the community no better off than before.
“Scientific colonialism”: colonial powers export raw data from communities to “process it”. (Nicole Bowman)
Despite all the evaluation that has been going on for years, we still face all the same problems – maybe worse. (Kate McKegg)
“How do we ensure evaluation is socially just, as well as true, that it attends to the interests of everyone in society and not solely the privileged” (House, 1991 – cited by Kate McKegg).
“Culturally-responsive evaluation seems to be about giving “permission” to colonizers and settlers to do evaluation in indigenous communities” (Kate McKegg).
“Sometimes the stories we are telling are not the stories that need to be told.” (Larry Bremner) [Larry was talking about the ways in which evaluation can further perpetuate injustice against, and further ignore and marginalize, indigenous people, through what we do and do not study.] Is our own work maintaining colonial oppression?
“Trauma is never far from the surface in indigenous communities.” (Larry Bremner)
Since many aspects of culture and ceremony have been destroyed by colonialism, how are people supposed to heal, as culture and ceremony are ways of healing? (Nicole Bowman)
The lifespan of indigenous people is 15 years less than non-indigenous people.
There is a lot of diversity among indigenous people in Canada: 617 First Nations, as well as Inuit and Métis; 60 languages.
Larry Bremner quoted a few people that he’d worked with: “Everyone is talking about reconciliation, but what happened to the “truth” part?” [in reference to the Truth and Reconciliation Commission] and “In my community, reconciliation is about making white people feel less guilty.” [Even work that is supposed to be about dealing with injustice against indigenous people gets turned around to serve white people instead.]
Understanding our history is needed to understand the legal and policy world in which we live today. Evaluators need to understand authority and power. (Nicole Bowman)
How can non-indigenous people be good allies?
We have to be clear on our own identities as settlers and colonizers, recognize our privilege. Our identity is shaped by our history and our present. Colonization is still going on and is nonconsensual and designed to benefit the privileged. (Kate McKegg)
We don’t even know our own history, let alone that of indigenous people. (Kate McKegg)
It’s not indigenous people’s responsibility to teach us about this – it’s our own job. Only when we understand ourselves can we hear indigenous people. (Kate McKegg)
Do your homework. Expand your indigenous networks. Undertake relevant professional development. Build relationships. (Nan Wehipeihana)
Advocate for indigenous-led evaluation – indigenous people evaluating as indigenous people:
During the opening keynote, an audience member asked how non-indigenous people can learn if it’s not indigenous peoples’ responsibility to teach non-indigenous people.
The panelists noted that indigenous people are a small group whose first priority is to do work to help their communities – expecting them to educate you is to put a burden on them that is not their responsibility.
Kate McKegg noted that indigenous people have been trying to talk to non-indigenous people for years and we haven’t listened to them. She suggested that we can work with other settlers who want to learn – there is lots available to read, to start.
Nicole Bowman noted that observation is how we traditionally learned and it is part of science to observe – do some observing.
Larry pointed out that indigenous people have taught their ways to others before and people have taken their protocols and not used them well – why should they give non-indigenous people more tools to hurt indigenous people?
Lea Bill from the First Nations Information Governance Centre spoke about the OCAP® principles, which refer to Ownership, Control, Access, and Possession of data, in that First Nations have rights to all of these. [I have learned about OCAP® before, but hadn’t realized until I saw this presentation that it was a registered trademark.]
All privacy legislation is about protecting individual privacy rights, but OCAP® is about collective, community rights.
When you work with indigenous communities, you need to know who the knowledge holders in the community are – they have rights and privileges and if you don’t know, you could offend people and not get good information. (Lea Bill)
Indigenous indicators are bicultural – all things are interconnected and human beings are not separate from the environment. (Lea Bill)
Something I’ve been interested in lately is how people from different disciplines use words differently. Two disciplines might use the same word to mean different things, or they might use different words to mean the same thing. One of the sessions I attended was about a glossary that had been created to clarify words/phrases that are used by financial/accounting people vs. evaluation people. Check out the glossary here.
I attended a thematic breakfast session that was a live taping of an episode of the Eval Cafe podcast. It was a chance for a group of us to reflect on what we’d learned about at the conference. You can check out the podcast here.
We need to think bigger than binary thinking – “what is in our control vs. not in our control?”, “Yes/No”, “Good/Bad”, “Pre/Post”. [Few things are really black or white – often things that we think of as binary are really more of a gradient or spectrum. There are fuzzy boundaries between things. It’s one of the reasons I like to start questions with “to what extent….” Like “to what extent did the program achieve its goals?”]
Watch “The Doctrine of Discovery – Unmasking the Domination Code”
Read “Pagans in the Promised Land” by Steven T. Newcomb
Research “are hospitals public places?” for the purposes of observations.
Sessions I Attended:
Opening Keynote Panel: Reconciliation and Culturally-Responsive Evaluation: Rhetoric or Reality? with panelists Dr. Nicole Bowman, Larry K. Bremner, Kate McKegg, and Nan Wehipeihana
Keynote with panelists Dr. Jane Davidson, Patrick Field, Sean Curry, Dr. Juha I. Uitto
Keynote by Lea Bill
Mayors’ Panel: A municipal perspective on co-creation and evaluation with panelists Heather Colberg, Mayor of Drumheller, Alberta; Mark Heyck, Mayor of Yellowknife, NWT; Stuart Houston, Mayor of Spruce Grove, Alberta
Fellows Panel – Evaluation for the Anthropocene.
Closing Keynote Panel: Reflection on Co-Creation Conference 2018 by CES Fellows – Our rapporteurs, realists, and renegades.
Collaborating to improve wait times for a primary care geriatric assessment and support program by Emily Johnston, Krista Rondeau, Kathleen Douglas-England, Bethan Kingsley, Roma Thomson
Surveying an Under-Represented Population: What We Learned by Surveying Great-Grandma by Kate Woodman, Krista Brower
Co-Creating Evaluation Capacity in Primary Care Networks: A Case Example of Lessons Learned by Krista Brower, Sherry Elnitsky, Meghan Black
Evaluators faced with complexity: presentation of the results of a synthesis of the literature by Marie-Hélène L’Heureux
Knowledge translation and impacts – unpacking the black box by Ambrosio Catalla Jr, Ryan Catte
Evaluation and research: Two sides of the same coin or different kettles of fish? by Sarah Sangster, D. Karen Lawson
Who’s keeping score? A team-based approach to building a performance measurement scorecard by Beth Garner
Updating the CES Competencies for Evaluators: A Work in Progress by Gail Vallance Barrington, Christine Frank, Karyn Hicks, Marthe Hurteau, Birgitta Larsson, Linda Lee
Help Us Co-create CES’s Renewal Vision of Ethics in Program Evaluation! by CES Ethics Working Group on Ethics, Environmental Scan and Stakeholder Consultation Subcommittees
On the Road with the EvalCafe Podcast: Greetings from Calgary! by Carolyn Camman, Brian Hoessler
Integrating Social Impact Measurement Practice into Social Enterprises: A Sociotechnical Perspective by Victoria Carlan
From Collaboration to Collective Impact; Measuring Large-scale Social Change by Andrea Silverstone, Debb Hurlock, Tara Tharayil
The Rosetta Stone of Impact: A Glossary for Investors and Evaluators by David Pritchard, Michael Harnar, Sara Olsen
Presentations I Gave:
An inside job: Reflections on the practice of embedded evaluation by Amy Salmon, Mary Elizabeth Snow
How is evaluation indicator development like an orchestra? by Mary Elizabeth Snow, Alec Balasescu, Joyce Cheng, Allison Chiu, Abdul Kadernani, Stephanie Parent
Can co-creation lead to better evaluation? Towards a strategy for co-creation of qualitative data collection tools by Alec Balasescu, Joyce Cheng, Allison Chiu, Abdul Kadernani, Stephanie Parent, Mary Elizabeth Snow
The Canadian Evaluation Society’s national conference was held right here in Vancouver last month! I was one of the program co-chairs for the conference and I have to say that it was pretty awesome to see a year and a half’s worth of work by the organizing committee come to fruition! There were a lot of people involved in putting together the conference and so many more parts to it than I had realized when I started working on it and it was incredible to see everything work so smoothly!
As I usually do at conferences, I took a tonne of notes, but for this blog posting I’m going to summarize some of my insights, by topic (in alphabetical order) rather than by session (though I’ve listed all the sessions I attended at the bottom of this posting), as I went to some different sessions that covered similar things. Where possible, I’ve included the names of people who said the brilliant things that I took note of, because I think it is important to give credit where credit is due, but I apologize in advance if my paraphrasing of what people said is not as elegant as the way that people actually said them.
Damien Contandriopoulos noted that context is often defined by what it is not – it is not your intervention – i.e., it’s whatever is outside your intervention, but it’s not the entire universe outside of your intervention. Just what is close enough to be relevant/important to the analysis. He also noted that some disciplines don’t talk about context at all (e.g., they might talk about the culture in which an intervention occurs, but don’t talk about it as separate from the intervention the way we talk about context as being separate from the intervention).
Depending on your conceptualization of “context”, you may want to:
neutralize the context (e.g., those who think that context “gets in the way” and thus they try to measure it and neutralize it so it won’t “interfere” with your results). Contandriopoulos clearly didn’t favour this approach, but noted that it could work if your evaluand was very concrete/clear.
adapt to context
describe the context
In all of the above options, it’s about generalizability/external validity (e.g., if you are trying to neutralize the context, you are wanting to know if the evaluand works and don’t want the context to interfere with your conclusion about if the evaluand works; if you are adapting to the context, you want to figure out how the evaluand might work in a given context; if you are describing the context, you are wanting to understand the context to use to interpret your evaluation findings)
From the audience, AEA president Kathryn Newcomer mentioned a paper by Nancy Cartwright about transferability of findings [she didn’t say the name of the paper or the journal, but based on her comments about the paper, I believe it is likely this paper; unfortunately, it’s behind a paywall, so I can’t read more than the abstract], specifically about how Cartwright talks about “support factors” rather than context. Further, she talked about how in the US there is lots of interest in “scaling up” interventions, but rarely do studies document the support factors that allow an intervention to work (e.g., you need to have a pool of highly qualified teachers in the area for program X to work). She suggested:
putting the support factors into the theory of change
considering: how do we know if the support factors are necessary or sufficient? What if you need a combination of factors that need to be present at the same time and in certain amounts for the program to work? etc.
Contandriopoulos mentioned that sometimes people just list “facilitators” and “barriers” as if that’s enough [but I liked Newcomer’s suggestion that “support factors” (or barriers, though she didn’t mention it) could be integrated into the theory of change]
Kas Aruskevich showed an image of a river in Alaska viewed from above and noted that if you were standing by the side of that river, you’d never know what the sources of that river are (as they are blocked by mountains) and she likened evaluation to taking that perspective from a distance where you look at the whole picture. I liked this analogy.
Kathy Robrigado talked about how the accountability function of evaluation is often seen as an antagonist to learning, but she sees it as a jumping off point for learning.
In summarizing the Leading Edge panel, E. Jane Davidson had a few things to say that were very insightful in relation to thinking I’ve been doing lately with my team about what evaluation is (and how it compares/relates to other disciplines that aim to assess program/projects/etc.). With respect to monitoring, she noted that people often expect key performance indicators (KPIs) to be an answer, but they aren’t. Often what’s the easiest to measure is not what’s most important. In evaluation, we need to think about what’s most important (not just what’s strong or weak, but what really matters).
Evaluation, History of the Field
Every time I go to an evaluation conference, someone gives a bit of a history of the field of evaluation from their perspective (perhaps one day I’ll compile them all into a timeline). This conference was no different, with closing keynote speaker Kylie Hutchison talking about what she has seen as “innovations” in evaluation that had a lot of buzz around them and then eventually settled into an appropriate place [her description made me think of the “hype cycle”, which someone had coincidentally shown in one of the sessions that I was in]:
1990s – logic models
2000s – the big RCT debate (i.e., are RCTs really the “best” way to evaluate in all cases)
social return on investment (SROI), Appreciative Inquiry
developmental evaluation, systems approaches
Evaluators, Role of
Lyn Shulha noted that as an evaluator, you’ll never have the same context/working conditions from one evaluation to the next, and you’ll never have a “final” practice or theory – they will continue to change.
Kathy Robrigado talked about starting an evaluation as an “evaluator as critical friend” (e.g., asking provocative questions to understand the program/context, offering critiques of a person’s work, providing data to be examined through another lens). But after a while, they found this approach to be too resource intensive, as they had ~60 programs to deal with and data collection was cumbersome; they moved from critical friend to “strategic acquaintance” (or, as she put it, “we had to friendzone the programs”).
Michel Laurendeau stated that “evaluators are the experts in interpreting monitoring data” as what you see when you look at the data isn’t necessarily what is really going on [this reminded me of something that was discussed at last year’s CES conference: what the data says vs. what the data means]
Kylie Hutchison talked about how many evaluators are talking about the evaluator as a social change agent. People gravitate to this profession because they want to be involved in social change – maybe they are a data geek, but they see how the data can lead to social change. She also talked about how many skills she has needed to build to support her evaluation practice: in grad school she focused on methods and statistics, but when she went on to become a consultant she didn’t find that she needed advanced statistics – she needed skills in facilitation, then data visualization, and now organizational development.
Kim van der Woerd described getting knowledge into action as “the long journey from the head to the heart”. I really like this phrase, as just knowing something (with the head) doesn’t necessarily mean we take it to heart and put it into action. I wonder how thinking about how we can get things from the head to the heart could help us think about better ways to promote the translation of knowledge into action.
Lyn Shulha talked about learning spirals – as we travel from novice to expert, we can imagine ourselves descending, say, a spiral staircase. At a given point, we can be at the same place as earlier, but deeper (as well, we are changed from when we were last at this point). She noted that we “need to hold onto our experiences and our truths lightly”, lest we end up traveling linearly rather than in a spiral.
One of the sessions I was in generated an interesting discussion about different ways that people use logic models, such as:
having the lead agency of a program create a logic model of how they think the program works and then having all the agencies operating the program create logic models of how they think the program works and then compare – if they have different views of how the program works, this can generate important discussions
calling the first version of the logic model “strawman #1” to emphasize that the logic model is meant to be challenged and changed.
Report structure recommended by Julian King in the Leading Edge panel on Rubrics:
answer the evaluation question
key evidence & reasoning behind how you came up with the answer
They summarized this as spoiler, evidence, discussion, repeat
E. Jane Davidson noted that in social sciences, people are often taught how to break things down, but not how to pack it back together again to answer the big picture question. For example, you’ll often see people report the quantitative results, then the qualitative results, but with no actual mixing of the data (so it’s not really “mixed methods” – it’s more just “both methods”).
Also from E. Jane Davidson – the length of a section of a report is typically proportional to how long it took you to do the work (which is why literature reviews are so long), but that’s not what’s most useful to the reader. It’s like we feel we have to put the reader through the same pain we went through to do the work; we want them to know we did so much work! And then they get to the end and we say “the results may or may not be…. and more research is needed.” Not helpful! Spoilers really are key in evaluation reporting – write it like a headline. Pique their interest in the spoiler and then they want to read the evidence (how did they decide that??)
and make sure your recommendations are actionable!
The Leading Edge Panel on Rubrics was easily my favourite session of the conference. I’ve done a bit of reading about rubrics after going to a session on them at the Australasian Evaluation Society conference in Perth, but found that this panel really brought the ideas to life for me.
Kate McKegg mentioned that she asked a group of people in healthcare if they thought that their organizations’ key performance indicators (KPIs) reflected the value of what their organization does, and not a single person raised their hand. [This resonated with me, as my team and I have been doing a lot of work lately on differentiating, among other things, monitoring and evaluation.]
Rubrics can help clarify what matters and include those things in your evaluation
Rubrics are made of:
evaluative criteria – to come up with these, can check out the literature, talk to experts, talk to stakeholders (e.g., people on the front lines); can also think about what would be appropriate for the cultural context (e.g., what would make a program excellent in light of the cultural context?)
levels of importance (of the criteria) – remember, things that are easy to measure are not necessarily what’s important
rating scale (how to determine the level of performance, e.g., excellent-very good-good-adequate-emerging-not yet emerging-poor); depending on your context, you may choose different words (e.g., may use “thriving” instead of “excellent”)
analytic rubrics – describe the various performance levels for each criterion
holistic rubrics – a broad level of description of performance at each level (e.g., describe “excellent” overall (encompassing all the criteria) rather than describing “excellent” for each criterion individually)
analytic rubrics can provide more clarity, but require more data
You should be able to see your theory of change in the rubric. Key evaluation questions (KEQs) often follow the theory of change (e.g., KEQs might be “how well are we implementing?” or “how well are we achieving outcome #1?”). Think about the causal links in the theory of change and their strength. If there is a deal breaker, it should show up in the theory of change.
You can embed cultural values into the process (e.g., for the Maori, the word “rubric” didn’t resonate, so Nan Wehipeihana used a cultural metaphor that did; rather than words like “poor” and “excellent”, can use words that fit better like a “seed with latent potential” and “blooming” and “coming to fruition”)
Values are the basis for criteria – they reflect what is valued (and whose values hold sway matters)
Once you have a rubric, you need to collect data to “grade” the program using the rubric; data may come from all sorts of places (e.g., previous research, administrative data, photos from the program, interviews/surveys/focus groups)
Can make a table of each criterion and data source (e.g., photos from the program) and use that to optimize your data collection.
Then you can look at all the things you want to collect from each data source (e.g., you can ask about criteria 2, 4, and 5 in interviews with staff; look for criteria 1, 2, and 3 in the photos from the program) = integrated data collection
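A minimal sketch of that criteria-by-data-source table, assuming hypothetical criterion names and the example pairings mentioned above (it is an illustration only, not something shown by the panel). It records which sources can inform each criterion, then flips the table so you can see everything to collect from each source:

```python
from collections import defaultdict

# Hypothetical table: each criterion mapped to the data sources that can inform it.
# The pairings echo the example above (interviews: criteria 2, 4, 5; photos: 1, 2, 3).
criteria_sources = {
    "criterion 1": ["photos from the program"],
    "criterion 2": ["staff interviews", "photos from the program"],
    "criterion 3": ["photos from the program"],
    "criterion 4": ["staff interviews"],
    "criterion 5": ["staff interviews"],
}


def plan_by_source(matrix):
    """Invert the table: for each data source, list the criteria to look for."""
    plan = defaultdict(list)
    for criterion, sources in matrix.items():
        for source in sources:
            plan[source].append(criterion)
    return dict(plan)


plan = plan_by_source(criteria_sources)
for source, criteria in plan.items():
    print(f"{source}: {', '.join(criteria)}")
```

A criterion that ends up with no data source at all would be a prompt to add a source or rethink the criterion.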
Make sure that the data collection is designed to answer the evaluation questions.
Look to see if you are getting consistent information (i.e., saturation) or if the data is patchy or inconsistent and you need to get more clarity.
Bring data to stakeholders as you go along (especially for long evaluations – they don’t want to wait until the end of 3 years to find out how things are going!)
3 steps to making sense of data:
analysis – breaking something down into its component parts and examining each part separately (King et al, 2013)
synthesis – putting together “a complex whole made up of a number of parts or elements” (OED online); assembling the different sources of data. Sometimes when you are working on data synthesis, you learn that what’s important isn’t what you initially thought was important (so you need to rejig your rubric). Also think about what the deal breakers are (e.g., if no one shows up to the program…)
sensemaking: helps to clarify things; one way to do this is to get all the stakeholders together, give them the synthesized data (a rough cut), and go through a process like this:
generalization: In general, I noticed…
exception: In general…, except….
contradiction: On one hand…, but on the other hand…
surprise: I was surprised by…
puzzle: I wonder…
When you think about the exceptions or contradictions – how big of a deal are they? Are they deal breakers?
As stakeholders do this, they start to understand the data and to own the evaluation. Often they make harder judgments than the evaluator might have.
Typically, they do the synthesis and bring that to the stakeholders to do sensemaking; but don’t spend a lot of time making the synthesized data look polished/finished – it should look rough as it is to be worked with. Not everyone will spend time reading the data synthesis in advance, so give them time to do that at the start of the session.
Put up the rubric and have the stakeholders grade the program.
Often people try to do analysis, synthesis, and sensemaking all at the same time, but you should do them separately.
Rubrics “aren’t just a method – they change the whole fabric of your evaluation”. They can help you “mix” methods (rather than just doing “both” methods) – they can help you make sense of the “constellation of evidence”.
I asked how they deal with situations that are dynamic. Their answer was that rubrics can evolve, especially with an innovative program. You create it based on what you imagine the outcome will be, but other things can emerge from the program. You can start with a high level rubric (you don’t want to get so detailed or overspecified that you paint yourself into a corner). You need it to be underspecified enough to be able to contextualize it to the setting. It’s like the concept of “implementation fidelity” – implementing something exactly as specified is not the best – you should be implementing enough of the intent in a way that will work in the setting.
Another audience member asked how you would determine if a rubric is valid/reliable. The speakers noted that often people ask “is it a valid tool?” meaning “was it compared to a gold standard/previously validated tool?” But those other tools are often too narrow/miss the mark. The speakers suggested that “construct validity is the mother of all validities” – the most important question is “is it useful for the people for whom it was built?”
Another audience member asked about “scaling up” rubrics. The speakers noted examples where they had worked on projects to create rubrics to be used across a broader group than those who created it – e.g., created by the Ministry of Education to be used by many different schools with the help of a facilitator. For these, you need to have a lot more detail/instructions on how to use it (and a good facilitator) since users won’t have the shared understanding that comes from having created it. They have also done “skinny rubrics” to be used by lots of different types of schools (so had to be underspecified), but again, need to provide lots of support to users.
Systems archetypes are common patterns that emerge in systems. This was a concept that was brought up by an audience member in my session on complexity, and is something I want to read more about!
Heather Codd talked about three key concepts in using systems thinking (using Donella Meadows’s definition of a system as something with parts, links between parts, and a boundary) in evaluation:
interrelationships – understanding the interrelationships and what drives them helps us to understand what’s going on with the program (and she suggested using rich pictures to help focus the evaluation and think about what the consequences of the program might be)
boundaries – we need to pick a boundary for the purpose of analysis, but note that it is sensitive because it defines what is in and out of the evaluation. She suggested using critical systems heuristics to help describe the program, scope the evaluation, and decide on an evaluation approach.
multiple perspectives – what are the world views being applied and what are the implications of those world views? She suggested you can do a stakeholder analysis, but also a stake analysis; she also suggested “framing” by using an idea from Bob Williams, where you add the words “something to do with…” in front of ideas (e.g., “Something to do with a culture of health”, “something to do with managing heart disease”); this tool can help give you a sense of the intervention’s purpose and the evaluation’s purpose.
Evaluators are an element in a system and we cannot separate out our effect on the systems [This made me think of “co-evolution” – the evaluation co-evolves along with the rest of the system]
There are echoes in a system of what has happened before [e.g., intergenerational trauma]
Truth & Reconciliation
Last year, the CES took a position on reconciliation in Canada. Several of the speakers at the conference talked about this topic. For example, Kim van der Woerd talked about a witness as being one who listens with their whole heart and validates a message by sharing it (and that they have a responsibility to share it). She also noted that the Truth and Reconciliation Commission (TRC) wasn’t Canada’s first attempt at trying to build a good relationship between Aboriginal and non-Aboriginal people – the Royal Commission on Aboriginal Peoples put out a report with recommendations in 1996. But when it was evaluated in 2006, Canada received a failing grade with 76% of the 400+ recommendations being not done and with no significant progress. She noted that we shouldn’t wait 10 years before we evaluate how well Canada is doing on the TRC recommendations.
Paul Lacerte outlined a set of recommendations:
amplify the new narrative (where the old narrative was “the federal government takes care of the natives”)
conduct research & develop a reconciliation framework
set targets for recruiting and training indigenous evaluators
learn about and follow protocol (e.g., how to start a meeting, gift giving)
put up a sign in your workspace about the traditional territory on which you are working
At the start of her closing keynote, Kylie Hutchison acknowledged that she was speaking on the unceded traditional territory of the Musqueam, Squamish, and Tsleil-Waututh First Nations. And then she said that she’d never made that acknowledgement before speaking, but that she would be doing so now. And I thought that it was a really cool thing to witness someone learning something new and putting it into practice like that, especially something so meaningful.
The best joke I heard in a presentation was when Kathy Robrigado, after a few acronym-filled sentences in her presentation, said, “As you know, government employees are paid by the number of acronyms they use”
Opening Keynote by Kim van der Woerd and Paul Lacerte
Short presentation: Causing Chaos: Complexity, theory of change, and developmental evaluation in an innovation institute by Darly Dash, Hilary Dunn, Susan Brown, Tanya Darisi, Celia Laur
Short presentation: Implications of complexity thinking on planning an evaluation of a system transformation by M. Elizabeth Snow, Joyce Cheng [This was one of my own presentations!]
Short presentation: Cycles of Learning: Considering the Process and Product of the Canadian Journal of Program Evaluation Special Issue by Michelle Searle, Cheryl Poth, Jennifer Greene, Lyn Shulha
Short presentation: Using System Mapping as an Evaluation Tool for Sustainability by Kas Aruskevich
Incorporating influence beyond academia data into performance measurement and evaluation projects by Christopher Manuel
Exploring Innovative Methods for Monitoring Access to Justice Indicators by Yvon Dandurand, Jessica Jahn
A Quasi-Experimental, Longitudinal Study of the Effects of Primary School Readiness Interventions by Andres Gouldsborough
What Would Happen If…? A Reflection on Methodological Choices for a Gendered Program by Jane Whynot, Amanda McIntyre, Janice Remai
Towards Strategic Accountability: From Programs to Systems by Kathy Robrigado
Getting comfortable with complexity: a network analysis approach to program logic and evaluation design by John Burrett
Communication in System Level Initiatives: A grounded theory study by Dorothy Pinto
Seeing the Bigger Picture: How to Integrate Systems Thinking Approaches into Evaluation Practice by Heather Codd
Understanding and Measuring Context: What? Why? and How? by Damien Contandriopoulos
A Graphic Designer, an Evaluator, and a Computer Scientist Walk into a Bar: Interdisciplinary for Innovation by M. Elizabeth Snow, Nancy Snow, Daniel J. Gillis [This was another one of my presentations and hands down the best presentation title I’ve ever had]
Big Bang, or Big Bust? The Role of Theory and Causation in the Big Data Revolution by Sebastian Lemire, Steffen Bohni Nielsen
Using Web Analytics for Program Evaluation – New Tools for Evaluating Government Services in the Digital Age at Economic and Social Development Canada by Lisa Comeau, Alejandro Pachon
The Future of Evaluation: Micro-Databases by Michel Laurendeau
Dylomo: Case studies from an online tool for developing interactive logic models by M. Elizabeth Snow, Nancy Snow [This was the last of my presentations]
Development and use of an App for Collecting Data: The Facility Engagement Initiative by Neale Smith, Graham Shaw, Chris Lovato, Craig Mitton, Jean-Louis Denis
Leading Edge Panel: Evaluative Rubrics – Delivering well-reasoned answers to real evaluative questions by Kate McKegg, Nan Wehipeihana, Judy Oakden, Julian King, E Jane Davidson