Impact evaluation’s day in the sun (Part III) December 3, 2008
Posted by Paul Duignan in: Attribution, Evaluation debates, Impact evaluation, Outcomes systems architecture, Evaluation planning
[Please read Part I and Part II first]. Now that I have got to Part III of this posting on the impact evaluation debate, the time has come for me to front up and tell you how I think things would be done if I were in charge of the world. First, the concept of there being a shoot-out between impact evaluation and other types of evaluation is about as sensible as trying to have a debate as to whether a fork or a knife is the better kitchen utensil. It obviously depends on what you want to do with it. This does not, of course, mean that it might not be reasonable to argue that there are not enough knives in the kitchen at a particular point in time and that we should go out and encourage knife collecting rather than just getting more and more forks.
To get underway, let's get rid of a couple of positions which I think are untenable in the impact evaluation discussion. These are:
- Blind insistence on only ever supporting the use of a very limited number of impact designs (e.g. true experiments and regression discontinuity designs). This does not seem sensible to me in the face of good arguments (e.g. by Scriven) that true experiments have major flaws as impact designs in some instances (e.g. in cases where it is difficult to establish a credible no-treatment control group). (For more info on what impact designs there are see here.)
- Blind insistence that impact designs can never provide any particularly useful information because: 1) humans and human societies can’t be studied with experimental methods; 2) everything is an interacting system of such complexity that you can never tease out one cause from another; 3) interventions change so much in their implementation that you can never know what the actual intervention is; or 4) things can never really be generalized from one situation to another. This does not seem sensible to me either.
However, I can understand at least part of the sentiment behind the people promoting the first position above - at least some of them are, I presume, motivated by wanting to produce results which will be accepted by all stakeholders because they cannot be challenged on methodological grounds. I can also understand where the people promoting the second position above are coming from. I think that in a number of cases they are right, and traditional impact designs are not going to work because one or more of the problems they point out are present to a significant degree.
So I sit in the middle of the road (not necessarily a safe place to be when arguments are raging) believing that impact evaluation, when used appropriately, can add a great deal to our knowledge. However, I think that any proposal for an impact evaluation needs to consider the answer to three questions. These are:
- Is the method suitable for actually providing an estimate of the impact of an intervention (Scriven's point again)?
- Is the impact evaluation going to be able to be carried right through to completion? Shadish, Cook and Campbell (2002) set out many of the potential problems which can arise in impact designs and over which you often do not have much control (e.g. compensatory equalization, where the inequality between the intervention and control groups is compensated for and there is no longer a differential intervention to study). Well set-up impact evaluations are major investments; if they fall over, much of this investment is lost. This is a much harder question to answer than question 1 above, because you may have an impact design which could potentially provide useful impact information but the practicalities of the situation are such that it is unlikely to be carried through to a successful conclusion. In at least two major impact evaluations I have been involved in, I have had my reservations about whether or not they would be able to be carried through to completion. However, I kept my reservations to myself because I did not want to be a pessimist in the face of all of the enthusiasm in the room. In any case, no one was going to accept that we should cancel the impact evaluation just because I did not think that it would be able to be seen through to completion. In the event, in both cases the evaluations were not able to be carried through to completion as planned when they were commenced.
- Is the impact evaluation a sector priority in terms of both the intervention it is looking at and the type of evaluation being undertaken?
I think that the trickiest aspect of answering these questions is getting question 2 assessed properly. I think that if we talk things through we can get question 1 answered alright, as long as no extremist positions are taken. I also think, as outlined below, that if we put the work in, we can answer question 3. Question 2 remains a very difficult risk management question. The reality of how impact evaluations tend to be set up (the effort one has to go to to marshal the resources etc.) means that at the planning stage of such evaluations no one really wants someone in the room who is going to be saying that it is all unlikely to work. Having evaluation resources allocated as a result of a process which involves answering question 3 may help with this problem. This is in contrast to having evaluation resources contingent on being used on a particular evaluation project and disappearing if they are not going to be used on that project.
As for the third question, I think that, first, we always need to have on the table that impact evaluation is one of the set of methods which we can use to assess a program, and there may be different priorities for different types of evaluation at different points in the development of an intervention. The framework I use for thinking about these things is set out in the Five Basic Building-Blocks of Outcomes Systems. So, an intervention which is still in the early stages of development is normally a higher priority for non-process formative evaluation than one which has settled down as an intervention and which may therefore be a higher priority for impact evaluation. In some cases, given the priority for spending scarce evaluation resources on other programs, just monitoring of attributable indicators (outputs) may be appropriate (you will probably need to look at the diagram in the article above to follow the discussion here).
The second aspect of the third question is what I think is currently missing in thinking about impact (and other types of) evaluation. We tend to approach the problem from the point of view of the question: ‘what sort of evaluation should this program have?’ This is a ‘program-centric’ approach. I think that we should take a ‘sector-centric’ view instead and be asking - ‘what are the strategic knowledge needs of this sector?’ From this we then work back to defining what impact or other evaluation may be appropriate for a particular program.
Here is an example of how this approach could work for a sector. It is taken from the article Reframing Program Evaluation as Part of Collecting Strategic Information for Sector Decision-making, which outlines the generic steps in the process. The example concerns a social sector issue - interventions for high-risk youth with multiple problems (e.g. justice, education, health).
- An overall outcomes model could be built for the social sector with a focus on at risk young people who are involved in problems in a range of areas (e.g. justice, education, health).
- Onto this could be mapped existing interventions (e.g. separate interventions by different agencies, e.g. justice, education, health).
- Evidence could be mapped onto the outcomes model showing that it might be more effective to have ‘wrap-around’ services which were not provided within the traditional service delivery silos.
- A priority evaluation question could be identified as: ‘do wrap-around services improve the outcomes for at risk youth over provision of individual traditional silo-based services?’ Subsidiary evaluation questions could include: ‘what is the best way of providing wrap-around services for youth from particular cultural groups?’
- An intensive evaluation could be launched of a pilot wrap-around intervention which is funded by a range of stakeholder groups.
- A number of wrap-around projects may be launched which will receive lower levels of evaluation, not focused on longer-term outcomes, but more just on feedback on cultural appropriateness.
- Many other wrap-around projects may also be occurring for which there will not be any demand for evaluation (with the resources which would have been spent on evaluating a number of separate programs being pooled and spent on the intensive evaluation described in the fifth step above). These programs would only be subject to contractual monitoring of attributable indicators (outputs) - the third of the five outcomes systems building-blocks in the diagram from the article referred to above - and measurement of not-necessarily attributable indicators. See Contracting for Outcomes for detail on how such contracting could work.
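For readers who find it easier to see this sort of thing written down mechanically, the last three steps above could be sketched roughly as follows. This is only an illustration and not something taken from the article: the names `Tier`, `Project`, `assign_tier` and `pooled_budget`, the two flags on a project, and the budget figures in the usage lines are all made up for the purpose of the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    # The three levels of evaluation effort described in the steps above.
    INTENSIVE = "intensive evaluation of longer-term outcomes"
    LIGHT = "lighter evaluation, e.g. feedback on cultural appropriateness"
    MONITORING = "contractual monitoring of indicators only"


@dataclass
class Project:
    name: str
    is_pilot: bool = False        # hypothetical flag: chosen as the sector's pilot
    wants_feedback: bool = False  # hypothetical flag: lighter evaluation planned


def assign_tier(project: Project) -> Tier:
    """Assign an evaluation tier from the sector's point of view (assumed rule)."""
    if project.is_pilot:
        return Tier.INTENSIVE
    if project.wants_feedback:
        return Tier.LIGHT
    return Tier.MONITORING


def pooled_budget(budgets: dict[str, float], plan: dict[str, Tier]) -> float:
    """Sum the evaluation budgets of monitoring-only projects.

    These are the resources that could be pooled into the intensive evaluation.
    """
    return sum(b for name, b in budgets.items() if plan[name] is Tier.MONITORING)


# Usage with made-up projects and made-up budget figures.
projects = [
    Project("Pilot wrap-around", is_pilot=True),
    Project("Wrap-around B", wants_feedback=True),
    Project("Wrap-around C"),
    Project("Wrap-around D"),
]
plan = {p.name: assign_tier(p) for p in projects}
budgets = {"Pilot wrap-around": 50.0, "Wrap-around B": 10.0,
           "Wrap-around C": 8.0, "Wrap-around D": 7.0}
print(plan["Wrap-around C"].value)
print(pooled_budget(budgets, plan))
```

The point of the sketch is simply that the tier is decided by the sector's knowledge needs rather than program by program, and that the resources freed up by monitoring-only projects are what fund the intensive evaluation.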
This approach, from a sector perspective, would then allow us to work all the way back to identifying the type of evaluation (impact or otherwise) which should be undertaken in regard to particular programs. Obviously, it would take time to develop such an approach, but it could be developed progressively, and as its results were made public it could come to influence the selection of impact (and other types of) evaluation for particular projects. How high-level stakeholders could be encouraged to use such an approach is discussed in What Added Value can Evaluators Bring to Governance, Development and Progress Through Policy-making? The Role of Large Visualized Outcomes Models in Policy Making.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton Mifflin Company (pp. 72-81).