How should the recently developed Quality of Research for Development (QoR4D) Evaluation Guidelines be applied to the evaluation of research processes and performance? This question was the subject of a workshop held in Rome on 27‒28 February 2023 (see report). The focus was on widening conventional science-focused assessments to incorporate engagement, learning, and impact.
As background, the shift from assessing quality of science to considering the quality of research more broadly in CGIAR emerged from a Green Paper by Harvard University and a subsequent 2016 directive from the Independent Science and Partnership Council (ISPC) to develop a framework for QoR4D based on the founding principles of “salience, credibility, and legitimacy.”
The result was a framework consisting of 4 elements and 17 criteria for evaluation. Participants in the workshop were keen to learn how to apply this framework to the huge range of actions being evaluated in ways that are timely, improve design, and enhance assessment of effectiveness. This issue of usefulness to implementors and practitioners was of particular interest, reflecting participants’ desire for tools that are relevant and useful in the widest range of real-world situations.
A key theme that emerged in discussion was the tension between self-assessment and peer review. A recent advance in science assessment worldwide is the use of a mixed-methods framework to inform internal and external peer reviews, with impact evaluations guided by co-created theories of change. The challenge is to ensure that all relevant stakeholders are involved in co-creation of the theory of change.
Another interesting challenge is achieving consistency of judgment, using comparable quality criteria across a wide range of research projects with diverse performance expectations in different contexts. Much can be learned from the QoR4D evaluations of other organizations, which add value at different entry points, such as priority setting, design, implementation, evaluation, communication, predictive science, and reflective science. There was agreement that the CGIAR QoR4D Guidelines had wider use and application beyond CGIAR. Their uptake and use by donors, scientists, and partners should therefore be monitored.
Challenges and themes emerging from the workshop included:
- How to adapt the Guidelines to different contexts and stages of the project cycle – While continuing to draw inspiration and guidance from the Guidelines, evaluators should be willing to adapt them continuously to a project’s unique context, the needs of end users, and the stage in the project cycle (beginning, middle or end) at which the evaluation is taking place.
- How to incorporate continuous learning – The learning and developmental components of evaluations deserve more serious attention, not only to develop capacity, but also to enable adaptation and ongoing refinement based on lessons learned, aiming for “adaptive rigor.” Reflection should, therefore, be built into the entire evaluation process and practiced by all evaluation actors: evaluators, beneficiaries, and those whose research is being evaluated.
- How to put people at the heart of everything – R4D evaluation deals with humans, complete with their strengths, weaknesses, biases, and capacities. This applies to the evaluator as well as those being evaluated. Evaluations will have stronger developmental impact if they are co-designed with project executants and end users. This starts with an upfront participatory evaluability assessment involving key project actors (including end users) and continues with decision-makers, researchers and end users being involved in all four of the evaluation dimensions (research design, inputs, process and outputs). Researchers and end users should even help to formulate evaluation questions. Co-design is particularly useful during the scoping and preparation phase, but also during the data gathering, analysis and communication of learning phases (p.12 of the Guidelines).
- How to connect the dots between different evaluation components – Process evaluation, performance evaluation, and capacity development are crucial components of R4D evaluation. These elements could be better aligned by paying attention to more comprehensive theories of change in the early stages of project development and in evaluability assessments.
- How to develop more helpful theories of change – While the theory-of-change concept is well integrated into CGIAR’s work, research outcomes and impacts are sometimes not explicit enough and have to be reconstructed by evaluators. Consequently, a thorough evaluability assessment, conducted upfront with project executants, is crucial. It would also be helpful if the CGIAR QoR4D Guidelines were accompanied by a good theory of change to make clear what R4D evaluations want to achieve.
- How to achieve true transdisciplinarity – Transdisciplinarity refers not only to the composition of the evaluation team in terms of discipline, gender, seniority, and culture, but also to that of the research team, as well as to the involvement of practitioners and end users. Transdisciplinary ways of working should be reflected in all five critical phases: (i) co-defining problems, (ii) co-producing new knowledge, (iii) assessing new knowledge, (iv) disseminating new knowledge in the realms of both science and practice, and (v) using new knowledge in science and practice (Hoffmann et al., 2019). These criteria might be included in the QoR4D Guidelines for maximum impact. To prevent the process from becoming too arduous, these dimensions could be evaluated qualitatively by expanding the list of evaluation questions.
- How to balance rigor, depth, time, and budgets – A comprehensive evaluation with sufficient breadth and rigor, which captures all the complexities and incorporates capacity development, may be unrealistic when capacity and budgets are low and time is short. Tailoring evaluation processes to budgets, capacity, and time constraints, without rushing through them as superficial “box-ticking” exercises, requires careful trade-off analysis and a range of tools aligned with different resources and budgets.
- How to reconcile the applied, practicable impacts of research on the ground with conventional, objective measures of research impact – The usefulness of bibliometrics such as Impact Factors and h-indices for measuring research impact in development is hotly debated internationally. The objectivity of these measures is, however, an advantage; the challenge, therefore, is to blend them with more subjective measures of applied impact to inform an integrated evaluation.
In conclusion, the QoR4D Guidelines are useful to different actors in different ways, depending on context and circumstances. There was broad agreement on the need for research quality evaluations that are sufficiently inclusive, fair, consistent, and transparent. The QoR4D Guidelines have gone a long way to achieving this, but adopting the Guidelines will take commitment to properly resourcing evaluations of this type. An ontology or shared vocabulary is needed, however, to allow comparison and consistency across different contexts and cases. Process-rich evaluations are time-consuming and may not always be appropriate for light evaluations, so additional rapid and cheaper approaches should be tested. Immediate priorities are to critically adapt these Guidelines for different contexts and to ensure that they are used and funded consistently in evaluation. Dissemination of the Guidelines will need some thought; a body of case work and accessible learning material are needed to support uptake. Interactive online courses could be a useful next step after the next version.
 See p.7 of the QoR4D Guidelines.
 See p.10 and p.13 of the QoR4D Guidelines.