Applying Quality of Research for Development Frame of Reference to Process and Performance Evaluations

These pioneering evaluation guidelines provide the framing, criteria, dimensions, and methods for assessing the quality of CGIAR research for development. They can be adapted by like-minded organizations conducting research for development (R4D).

Rigorous, independent, and external evaluations form the foundation of CGIAR’s effort to inform the design of interventions, provide actionable evidence to support management and governance decisions, and ensure a high level of accountability to donors. In 2022, the CGIAR System Council and Board approved the CGIAR Evaluation Framework and Policy accompanied by specific guidelines to facilitate uptake of the framework and policy.


Listen to the podcast EvalEdge Episode #19, "CGIAR's IAES Guidelines on Evaluating Quality of Research for Development," in which the IAES Evaluation Lead, Ms Svetlana Negroustoueva, presents the guidelines.

Soliciting Input and Feedback

For queries and feedback about learning from the roll-out and application of this Beta version contact the Evaluation Function within the Independent Advisory and Evaluation Service (IAES) of CGIAR at

Frequently Asked Questions (FAQs)

The FAQs below have been developed and will be updated to facilitate roll-out and learning, and to best ensure effective and cost-efficient evaluations, including in CGIAR.

Key Concepts and Definitions

1. Is there a confounding of quality of science with quality of research in the Guidelines?

For grounding, the Guidelines use definitions of “science” and “research” from the CGIAR 2030 Strategy (page 1). Research is defined as the “Generation and communication of data, information and knowledge on an empirical basis.” Science refers to “Rigorous theory-based research.” Consequently, science is part of research, and evaluating quality of science (QoS) is an integral part of evaluating quality of research (QoR). The intent is not to mandate use of the same terminology in other contexts; a clear reference was needed to ground this document.

2. Why is evaluation of quality of science emphasized in the Guidelines?

The Evaluation Guidelines operationalize the CGIAR Evaluation Policy (2022) and the QoR4D Framework (2017). The QoR4D Framework recognized that high-quality science is necessary but needs to be placed into a research-for-development framework. The QoR4D Framework was based on the Transdisciplinary Research Quality Assessment Framework (QAF) (Belcher et al., 2016). It established a methodology to evaluate elements of relevance, credibility, legitimacy and effectiveness (positioning for use). At the same time, quality of science as a means to achieve quality of research was not given the emphasis and definition that it deserves. The Guidelines aim to address this important issue by further bridging into performance and process evaluations.

3. Why is evaluation of quality of science complex?

Evaluating QoS is essential for interventions that produce or deliver science and related outputs as part of the research cycle. Evaluation of QoS in the R4D context is not straightforward, but if done systematically it is not complex. The QoS evaluation criterion captures the key elements, credibility and legitimacy, which are operationalized by relevant dimensions and clearly defined indicators. Used together, they can significantly strengthen the evaluative findings and recommendations.

4. Looking at some of the language, it seems that we are still evaluating how good research is rather than how good the program is in a specific context. Are we trying to achieve more research and publications or more outreach of the program?

CGIAR is a science-based organization that aims for developmental outcomes. More high-quality publications and more research innovations are “outputs” that will feed into greater development outcomes. “Outputs” is only one of the four dimensions.

5. Who are the key stakeholders in the design, implementation and use of process and performance evaluations in the CGIAR?

According to the CGIAR-wide Evaluation Policy and Framework (revised in 2022), performance and process evaluations help meet CGIAR’s evolving internal needs, supporting an organizational culture that engages with and uses such evaluations for accountability, learning, and evidence-based steering. Evaluations can be fully independent (i.e. commissioned by funders) or largely independent (i.e. commissioned within CGIAR as an operating entity: the CGIAR System Organization, all CGIAR Centers and all the organizational business units under One CGIAR). Therefore, all of them are key evaluation stakeholders.

Framing and Application

6. Can we use the framework to assess development projects tackling scientific research although not necessarily with the primary objective of delivering scientific outputs?

Yes, this is the core objective of the Guidelines. Certain development interventions evaluated might not have the stated objective to deliver science per se. The first step of the evaluation is to ask the question “Was the objective of the evaluand to deliver science outputs and outcomes for development?” The answer can be established using the theory of change or in other ways. If it is “no,” then inclusion of the QoS evaluation criterion is not recommended; however, it is still possible to pursue a different pathway to the evaluation design. An example is provided on page 13, which mentions CGIAR Platforms that do not deliver science but are a tool to coordinate its delivery.

7. Are there other frameworks for assessing quality of research? How is the CGIAR Quality of Research for Development Framework different?

Two related assessment frameworks are explicitly cited in the CGIAR Evaluation Guidelines, namely the UK’s Research Excellence Framework (REF) and the Research Quality Plus (RQ+) of the International Development Research Centre (IDRC). As CGIAR is a research-for-development organization in the agricultural space, the QoR4D Framework is different from the frameworks used by purely research and purely development organizations, and those not specific to agriculture. The Evaluation Guidelines and the QoR4D Framework are companion documents when implementing evaluations of CGIAR research.

8. Is there a particular form of governance that supports the deployment and adoption of such a framework?

In CGIAR, the Guidelines operationalize the CGIAR-wide Evaluation Policy and Framework (revised in 2022), approved by the CGIAR System Board (23rd Session) and System Council (15th Meeting). Irrespective of the QoR4D Framework, grounding in the OECD/DAC evaluation framing used in multilateral development assistance facilitates similar governance-type set-ups.

9. The document mentions that the guidelines can be applied beyond CGIAR. If so, who could potentially use the document?

While the main audience of the guidelines is CGIAR-related stakeholders involved in evaluating Quality of Science (QoS) in CGIAR, other users in research-for-development (R4D) contexts may find them useful, such as national agriculture research organizations and ministries of agriculture engaged with multilateral development assistance; UN agencies such as FAO and UNEP; selected accredited entities to the Green Climate Fund; and academia. The translation of the Guidelines into Spanish, which also builds on a precursor tri-lingual online discussion jointly held by the CGIAR and FAO evaluation offices on “How to evaluate science, technology and innovation in a development context?”, is intended to widen their reach and potential application.

Evaluation Criteria and QoS Dimensions

10. How can the Guidelines be applied to research that is: short term (low value, less than 2 years); sponsored by national/international agencies (1-5 years); or part of large network projects covering multi-commodity institutions and multi-disciplinary sciences (> 5 years)?

The approach, dimensions, steps and methods proposed in the Guidelines can be applied to process and performance evaluations of interventions with different types and durations of research activities, particularly those more aligned to developmental assistance. The Guidelines offer a flexible and adaptable menu of considerations, with elements that can be used by research institutes such as universities, NARS and international agencies. The choice of evaluation criteria is context specific. Evaluating QoS in four dimensions allows consideration of a continuum from design to outputs, facilitating wide applicability throughout the project/program cycle: from proposal, to mid-term corrections, to end of project, to evaluation of effectiveness several years after the project finishes.

11. While QoS evaluation criterion is separated from the 6 DAC criteria, there's some overlap when operationalizing the evaluation of QoS in the 4 dimensions. E.g. "relevance" (an OECD DAC criterion) is stressed in assessing the research ‘design’ (p. 7).

Correct, by design there would be some overlap. The dimension of “design” relates to relevance, but “scientific relevance” is different from “relevance of a development-type intervention” (hence OECD DAC), again driven by what is being evaluated. Table 3 illustrates the relationship between the QoS evaluation criterion, its use across the four dimensions, and their potential application to the other six OECD DAC criteria.

12. Table 3 can be read as a matrix approach across the lines of the four dimensions of QoS. Can you provide clarity on whether the intended matrix approach would be at odds with a typically linear evaluation report?

The premise of the Guidelines is their adaptability to different contexts, which may not involve using a designated QoS criterion. Table 3 is intended to illustrate the potential application of the four dimensions even if a designated QoS evaluation criterion is not used, that is, if and when any of the other six evaluation criteria apply.

13. Implied definitions of efficiency and effectiveness seem inconsistent in different sections. Effectiveness should surely include outputs (Table 3)

Definitions of “efficiency” and “effectiveness” are from the CGIAR Evaluation Policy and Annex 2 of the Guidelines. Table 2 showcases the differences between the “Relevance” and “Effectiveness” evaluation criteria and the QoR4D elements. The point on “outputs” is taken; they will be included in the next version.

14. If Quality of Science is defined as the process of performing science to the highest quality standards, should it not be analyzed while the project is ongoing, ideally during the first two years of a three-year project (compare Figure 6)?

It is unlikely that the full spectrum of quality of science (credibility and legitimacy) can be delivered in a three-year project cycle. The “process” dimension covers “performing science.” By definition it is the “how” of a performance, but the other dimensions relate to the design of the performance (play, script), inputs into it (like costumes, talents, stage), and outputs (standing ovations, articles in newspapers praising the performance). Many factors that facilitate or inhibit performance are out of the control of the performers, and many are likely to arise after the performance, i.e. the intervention. With stated objectives relevant to QoS, all four dimensions (design, inputs, processes and outputs) can be used throughout the project cycle.

15. In which ways do the guidelines consider the contribution to development? How can we credibly link science and research with developmental impact?

The “design” dimension is key to considering the contribution to development impact. One of the three recommended key evaluation questions (p. 9 of the Guidelines) captures how appropriate and responsive the research design is in addressing development challenges. The “outputs” dimension includes a context of “positioning for use,” which can be assessed by “scaling readiness.” These are important when considering contribution to development impact. Annex 6 provides several evaluation sub-questions and suggested methods to answer the question “Is research design appropriate to the development challenges in the context?” The Guidelines suggest the use of qualitative methods to determine whether the link between the impact assessment plan and indicators in the theory of change (ToC) is clearly defined. Another aspect considered in the Guidelines is the interconnection of the research design with the Sustainable Development Goals (SDGs). This can be assessed using a mixed-methods approach, including ToC analysis, desk review of reports, and the use of rubrics and bibliometric indicators. A designed bibliometric indicator can be used to measure thematic alignment with SDG-relevant topics (Annex 8), although this can only be used three years after the completion of the intervention. What happens to the scientific and research outputs of an R4D intervention beyond effectiveness and sustainability, in achieving developmental impact, is not within the scope of performance and process evaluations.

16. To what extent are the guidelines intended for performance evaluation of individual researchers or research teams, vs higher levels of aggregation like One CGIAR Research Initiatives or Research Centers?

The “input” dimension considers the skills of individual researchers and teams which can be evaluated by both quantitative and qualitative indicators.

17. Are soft skills, such as leadership, teamwork and communication considered in the evaluation?

Some of these aspects are indirectly addressed in the “process” dimension, under the assessment criterion “Clearly defined roles and responsibilities.” Skills such as leadership and teamwork are considered under “management processes” and are usually evaluated with qualitative indicators. “Communication” is both a process and an output and can be evaluated in several ways with both qualitative and quantitative indicators. Furthermore, such elements are often explored under the “effectiveness” evaluation criterion, as they often facilitate or inhibit delivery of results.

18. Subsection 3.3 takes up higher level societal benefits (only after published science); with insufficient attention to finding solutions for socio economic aspects. This bigger picture may be implicit in figure 6, with unclear light vs dark blue.

All evaluations are context specific, and the context drives the selection of evaluation criteria. The four dimensions of the QoS criterion allow consideration of a continuum from “design” to “outputs.” Consideration of what happens with the scientific and research outputs of an R4D intervention segues into “effectiveness” and “sustainability” along the “impact pathway.” The latter is not in the scope of performance and process evaluations.

Methods and Tools

19. How does one value local knowledge and diverse voices in the pursuit of science? How do you improve end user (like farmers, small enterprises) involvement in the research process or key decisions on research?

The standards in the CGIAR Evaluation Framework, which the Guidelines operationalize, cover “how” evaluations are to be conducted, i.e. legitimacy and participation, responsiveness to GDI, transparency, etc. (pages 4–5). In line with evaluation industry standards, evaluation design and implementation would consider who is engaged in the work and who benefits from it, as well as how involved researchers from lower- and middle-income countries and local communities were in the design and delivery of the research. The “Process” dimension includes assessing “Engagement with local knowledge” to determine whether local communities, stakeholders or populations were effectively engaged and considered in the research process. This can be done using qualitative methods, such as review of intervention reports, interviews and focus group discussions (see Annex 6).

20. Why is using a mixed method approach important? What are the most useful qualitative methods to assess quality of science?

The use of mixed methods enhances the credibility and validity of evaluative findings and supports high-quality science-specific conclusions. Quantitative and qualitative methods complement each other, and triangulating their results enhances the robustness of findings. Quantitative methods alone might miss the rounded picture of the context that qualitative methods can provide. The selection of the best qualitative methods will depend on the evaluation and on the availability of documentation. However, document review and assessment of scientific “outputs” by subject matter experts (SMEs) certainly add credibility and rigor to the evaluation. The use of evaluative rubrics, e.g. the traffic light system, helps to reduce the subjectivity of qualitative indicators.
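As an illustration of how a traffic light rubric can make ratings more consistent across evaluators, the minimal sketch below maps numeric indicator scores to red, amber or green. The 0-to-1 score scale, the two thresholds and the indicator names are hypothetical assumptions chosen for illustration; they are not taken from the Guidelines.

```python
# Hypothetical thresholds on a 0-1 score scale (assumed, not from the Guidelines).
AMBER_THRESHOLD = 0.4
GREEN_THRESHOLD = 0.7


def traffic_light(score: float) -> str:
    """Map a score in [0, 1] to a traffic light rating."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= GREEN_THRESHOLD:
        return "green"
    if score >= AMBER_THRESHOLD:
        return "amber"
    return "red"


def rate_indicators(scores: dict) -> dict:
    """Apply the same rubric to every indicator so ratings are comparable."""
    return {name: traffic_light(value) for name, value in scores.items()}


# Hypothetical indicator scores agreed by the evaluation team.
example_scores = {
    "engagement with local knowledge": 0.8,
    "roles and responsibilities": 0.5,
    "data management": 0.2,
}
ratings = rate_indicators(example_scores)
for indicator, rating in ratings.items():
    print(f"{indicator}: {rating}")
```

Fixing the thresholds before scoring begins is the design choice that matters here: every evaluator applies the same cut-offs, so differences in ratings reflect differences in evidence rather than in interpretation.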

21. What is the rationale for proposing Social Network Analysis among suggested methods?

Social Network Analysis (SNA) can be a powerful tool for the “Process” dimension to measure the involvement of stakeholders and partners. As a graphic way to depict the number and strength of connections between people, it is a significant addition to other quantitative and qualitative methods that are less visually attractive. The use of such a method adds power to the evaluation and allows for a better understanding of stakeholders’ involvement through the use of network graphs. The blog “Alone we can do so little; together we can do so much” (University of Florida with CGIAR) can provide some insights.
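To make “number and strength of connections” concrete, the sketch below computes both quantities for each actor from a small list of collaborations. The actors, the weights (counts of joint activities) and the function name are hypothetical; a real evaluation would typically use a dedicated network library and plot the resulting graph.

```python
from collections import defaultdict

# Hypothetical collaboration records: (partner A, partner B, joint activities).
collaborations = [
    ("Center A", "University B", 3),
    ("Center A", "NARS C", 1),
    ("University B", "NARS C", 2),
    ("Center A", "Farmer group D", 4),
]


def connection_summary(edges):
    """Return {actor: (number of connections, total strength)}.

    The network is treated as undirected: each record counts for both partners.
    """
    neighbours = defaultdict(set)
    strength = defaultdict(int)
    for a, b, weight in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
        strength[a] += weight
        strength[b] += weight
    return {actor: (len(neighbours[actor]), strength[actor]) for actor in neighbours}


summary = connection_summary(collaborations)
# List actors from the most to the least strongly connected.
for actor, (degree, total) in sorted(summary.items(), key=lambda item: -item[1][1]):
    print(f"{actor}: {degree} connections, total strength {total}")
```

In this made-up network, “Farmer group D” has only one connection but a relatively high strength, which is the kind of pattern a network graph makes visible at a glance.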

22. When using bibliometrics, would you be able to effectively baseline the discipline or the sub-discipline to allow comparisons? Comparing bibliometrics of agriculture with other disciplines might be very hard.

It is indeed always important to have appropriate comparators. Normally, both Web of Science and Scopus already delineate some subfield disciplines. Disciplines related to agriculture, such as nutrition, health and employment, could be included in a bibliometric analysis through an appropriate choice of keywords. The technical note on bibliometrics recommends assembling a custom dataset of agricultural research for development publications. This might require some manual work at first, but it is an investment that, once made, can be reused and updated across multiple evaluations.
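One common way to use such comparators is to divide each publication's citation count by the average for its subfield in the comparator dataset, so that values above 1 indicate above-baseline performance. The sketch below illustrates this under stated assumptions: the records, subfield labels and function names are hypothetical, and this simple ratio is not necessarily the indicator defined in the technical note on bibliometrics.

```python
from statistics import mean

# Hypothetical comparator dataset: (publication id, subfield, citation count).
comparator_set = [
    ("c1", "agronomy", 4),
    ("c2", "agronomy", 8),
    ("c3", "agronomy", 6),
    ("c4", "nutrition", 20),
    ("c5", "nutrition", 10),
]

# Hypothetical publications from the intervention being evaluated.
evaluated = [("p1", "agronomy", 9), ("p2", "nutrition", 12)]


def subfield_baselines(records):
    """Average citations per subfield in the comparator dataset."""
    by_field = {}
    for _, field, cites in records:
        by_field.setdefault(field, []).append(cites)
    return {field: mean(values) for field, values in by_field.items()}


def normalized_citations(publications, baselines):
    """Citations divided by the subfield baseline; None if no usable baseline."""
    result = {}
    for pub_id, field, cites in publications:
        baseline = baselines.get(field)
        # A missing subfield or a zero average gives no meaningful ratio.
        result[pub_id] = cites / baseline if baseline else None
    return result


baselines = subfield_baselines(comparator_set)
scores = normalized_citations(evaluated, baselines)
for pub_id, score in scores.items():
    print(pub_id, score)
```

In this made-up data, the nutrition paper has more raw citations than the agronomy paper but scores lower once each is compared with its own subfield, which is why the choice of comparator matters.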

23. Are there particular tools that are used to centralize the data?

CGIAR performance data are presented in the Results Dashboard. CGSpace is also a great resource, as a collaborative platform of several centers, the CGIAR system office and its research initiatives and platforms.

Knowledge Management and Communication

24. I acknowledge that Quality of Science may have standalone outcomes or recommendations, but shouldn’t it also overlap with the other OECD/DAC derived criteria of efficiency, sustainability and impact. The greater the science quality, the greater the th

The Guidelines specify that QoS-related recommendations should be triangulated with evidence from answering evaluation questions under the other evaluation criteria.

25. What is the role of communication to influence change? How will the use of evidence be encouraged?

Communication plays a key role in the evaluation process. Effective communication will increase the understanding of performance and process evaluations of QoS, build buy-in from stakeholders, and strengthen their confidence in the credibility and rigor behind recommendations. As such, “knowledge management” facilitates effective communication, delivery of appropriate evaluative messages and the use of evaluative information. Communication of QoS-specific evidence is encouraged through presentations, short videos, podcasts, blogs, conference interventions and ad-hoc events to share related knowledge and learnings; see, for example, the virtual side event at the FAO Science and Innovation Forum 2022.