John Gargani is an independent evaluation expert, and a former President of the American Evaluation Association (AEA). For the past 30 years, he has run the evaluation consultancy Gargani + Company, conducted research on evaluation, and written about evaluation, impact, and scaling.
He’s also a member of the evaluation reference group to CGIAR’s Independent Advisory and Evaluation Service (IAES).
Early this year, IAES released a new set of evaluation guidelines, which build on the CGIAR Independent Science for Development Council (ISDC)’s Quality of Research for Development (QoR4D) Frame of Reference, and provide the framing, criteria, dimensions, and methods for assessing QoR4D – both within CGIAR, and in other like-minded organizations.
On February 27 and 28 this year, IAES ran a workshop (report) in Rome, Italy on the development and potential application of the guidelines, with a mix of participants from within and outside the CGIAR system. Following the workshop, we spoke to Gargani to hear his take on where CGIAR and IAES should now focus their attention as they seek to level up the evaluation of research and innovation for development.
Q: How would you contrast the evaluation of innovation to the evaluation of research?
A: Research and innovation are quite different. Research is a process. It's the practical application of the scientific method to learn, discover, improve, and invent. Innovations are a potential output of scientific research that make it possible to accomplish something that was previously impossible.
For example, one team of scientists may conduct basic research on soil chemistry, and another team may subsequently apply the earlier research to develop a new fertilizer. The new fertilizer may be considered an innovation because it helps farmers achieve crop yields that were not previously possible.
Because research and innovations are different, we evaluate them differently. CGIAR typically uses four criteria when evaluating research projects and proposals—scientific credibility, legitimacy, relevance, and effectiveness. These terms have specific meanings for CGIAR, and may be operationalized differently for basic and applied research.
Evaluations of innovations often use more and/or different criteria. These evaluations tend to emphasize the impact innovations have on people—farmers, communities, women, etc.—and the natural environment. They should address the equity of impacts and take the inherent riskiness of innovations into account.
Q: What about evaluating research and innovation for development?
A: Research for development is scientific research undertaken to improve the lives of people and the state of the environment in developing regions of the world. Often, this research supports the development of innovations with the potential to produce large or ‘transformational’ impacts.
Consequently, we need to address two large domains when evaluating research and innovations for development—the quality of scientific research and development impacts. This is why CGIAR combined its evaluative criteria for research with other criteria commonly used for development projects (the DAC Criteria, with some useful modifications).
The expanded criteria provide a starting place for evaluations. Not all criteria need be used, criteria may be modified, and other criteria may be added. This flexibility allows evaluators to address intended positive impacts and risks (such as the potential for unintended harm). Importantly, flexibility makes it possible to incorporate the perspectives of different stakeholder groups.
For instance, if our innovative fertilizer increases yields and reduces labor, it can affect stakeholders differently. Some may lose income and a measure of control over their lives. They may judge the innovation harmful. Others may believe that avoiding work they consider exploitative and dangerous is beneficial. Some community members may save money and gain food security. Some women may find they earn less selling the surplus of what they grow for their own families.
Evaluators are obligated to understand these tensions, and this requires the participation of people who are affected. That wouldn't be the case for traditional research in a university setting.
Q: What are the special things CGIAR needs to think about for innovation evaluation? What is the role of performance monitoring data in the success of such evaluations?
A: Innovations are inherently risky because they have never been tried. We don’t know what will happen when people begin to use them. They may produce unexpectedly large benefits or create unintended harm.
CGIAR can reduce these risks through monitoring. Some potentially harmful impacts will be anticipated at the outset, and traditional monitoring approaches can track them and act as an early warning system. Other potentially harmful impacts will not be anticipated. To guard against them, exploratory monitoring methods are needed. This entails searching for impacts rather than confirming them. An effective way to do this is to ask stakeholders – people notice quickly how their lives change and whether they consider the changes improvements.
CGIAR can also reduce risk by monitoring the larger system in which impacts are created. Innovations are not merely introduced to the world; efforts are made to scale their adoption to achieve large impacts. This is especially true in agriculture, where researchers hope their new fertilizers, crop varieties, and farming methods will be widely adopted to achieve impacts at scale. As adoption increases, researchers expect impacts will change for the better (become larger, more widely distributed, more equitable, etc.).
This isn’t always the case, because scaling takes place in complex systems. And this is why my work with Rob McLean on scaling impact has led us to think about scaling as a dynamic process. Not only can scaling affect impacts, but it can affect the underlying social, economic, or natural systems in which impacts are created. This is what it means to be transformational. The transformed systems may make it easier or harder to achieve desirable impacts, or they may change the qualities of impacts for better or worse. Again, monitoring can act like an early warning system by identifying changes in underlying social, economic, and natural systems as scaling takes place.
Q: What evaluation approaches and methods would be most useful for researchers applying the new CGIAR guidelines?
A: I think it’s wide open. That’s the beauty of the guidelines. Evaluators should use the methods they believe are most appropriate for the context and purposes of their evaluations, and they must justify their choices to others. Other than that, the guidelines don’t restrict you.
Scientific organizations like CGIAR may favor structured, quantitative evaluations. There is an important place for evaluations like this, and a long history of their effective application in agriculture. Yet, to its credit, CGIAR has created guidelines that give evaluators permission to widen their gaze, to use methods that are qualitative, anthropological, exploratory, and participatory, and to combine them in novel ways.
Good evaluators know how to do this. It’s the state of the art in evaluation—combining methods to produce better information than any method could produce on its own. I believe that’s critical for evaluating research for development.