Program evaluation

From Wikipedia, the free encyclopedia

Program evaluation is a systematic method for collecting, analyzing, and using information to answer basic questions about projects, policies and programs.[1] Program evaluation is used in the public and private sector and is taught in numerous universities. Evaluation became particularly relevant in the U.S. in the 1960s during the period of the Great Society social programs associated with the Kennedy and Johnson administrations.[2][3] Extraordinary sums were invested in social programs, but the impacts of these investments were largely unknown.

Program evaluations can involve quantitative methods of social research, qualitative methods, or both. People who do program evaluation come from many different backgrounds, such as sociology, psychology, economics, and social work. Some graduate schools also have specific training programs for program evaluation.


Paradigms in Program Evaluation

Potter (2006)[4] identifies and describes three broad paradigms within program evaluation. The first, and probably most common, is the positivist approach, in which evaluation can only occur where there are “objective”, observable and measurable aspects of a program, requiring predominantly quantitative evidence. The positivist approach includes evaluation dimensions such as needs assessment, assessment of program theory, assessment of program process, impact assessment and efficiency assessment (Rossi, Lipsey and Freeman, 2004).[5]

The second paradigm identified by Potter (2006) is that of interpretive approaches, where it is argued that it is essential that the evaluator develops an understanding of the perspective, experiences and expectations of all stakeholders. This would lead to a better understanding of the various meanings and needs held by stakeholders, which is crucial before one is able to make judgments about the merit or value of a program. The evaluator’s contact with the program is often over an extended period of time and, although there is no standardized method, observation, interviews and focus groups are commonly used.

Potter (2006) also identifies critical-emancipatory approaches to program evaluation, which are largely based on action research for the purposes of social transformation. This type of approach is much more ideological and often includes a greater degree of social activism on the part of the evaluator. Because of its critical focus on societal power structures and its emphasis on participation and empowerment, Potter argues this type of evaluation can be particularly useful in developing countries.

Whatever paradigm is used in a program evaluation, whether positivist, interpretive or critical-emancipatory, it is essential to acknowledge that evaluation takes place in specific socio-political contexts. Evaluation does not exist in a vacuum: all evaluations, whether evaluators are aware of it or not, are influenced by socio-political factors. It is important to recognize that evaluations, and the findings which result from the evaluation process, can be used in favour of or against particular ideological, social and political agendas (Weiss, 1999).[6] This is especially true in an age when resources are limited and organizations compete to have certain projects prioritised over others (Louw, 1999).[7]

Dimensions of Program Evaluation

Program evaluators may assess programs on several dimensions to determine whether the program works. Rossi et al. (2004) divide these dimensions into 5 main categories: needs assessment, program theory, process analysis, impact analysis, and cost-benefit & cost-effectiveness analysis.[8]

A needs assessment examines the nature of the problem that the program is meant to address. This includes evaluating who is affected by the problem, how widespread the problem is, and what effects stem from the problem. For example, for a housing program aimed at mitigating homelessness, a program evaluator may want to find out how many people are homeless in a given geographic area and what their demographics are.

The program theory is the formal description of the program's concept and design. This is also called a logic model or impact pathways.[9] The program theory breaks down the components of the program and shows anticipated short- and long-term effects. An analysis of the program theory examines how the program is organized and how that organization will lead to desired outcomes. It will also reveal unintended or unforeseen consequences of a program, both positive and negative. The program theory drives the hypotheses to test for impact evaluation. Developing a logic model can also build common understanding amongst program staff and stakeholders (see Participatory Impact Pathways Analysis).
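A logic model of this kind can be written down as a simple data structure. The following sketch is purely illustrative: the program, its components, and its outcomes are hypothetical examples loosely based on the housing program mentioned above, not drawn from any real evaluation.

```python
# A minimal sketch of a logic model as a plain data structure.
# All entries are hypothetical, for illustration only.

logic_model = {
    "program": "Housing assistance program",
    "inputs": ["funding", "case workers", "housing units"],
    "activities": ["intake interviews", "placement in housing", "follow-up visits"],
    "outputs": ["households placed", "follow-up visits completed"],
    "short_term_outcomes": ["stable housing for participants"],
    "long_term_outcomes": ["reduced homelessness in the area"],
}

def describe(model):
    """Print the anticipated chain from inputs through long-term outcomes."""
    for stage in ("inputs", "activities", "outputs",
                  "short_term_outcomes", "long_term_outcomes"):
        print(f"{stage}: {', '.join(model[stage])}")

describe(logic_model)
```

Laying out the stages explicitly in this way mirrors what a logic model diagram does on paper: each stage is expected to lead to the next, and the short- and long-term outcomes supply the hypotheses an impact evaluation would test.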

Process analysis looks beyond the theory of what the program is supposed to do and instead evaluates how the program is being implemented. The evaluation determines whether target populations are being reached, people are receiving the intended services, staff are adequately qualified, etc.

The impact evaluation determines the causal effects of the program. More information about impact evaluation is found under the heading 'Determining Causation'.

Finally, cost-benefit or cost-effectiveness analysis assesses the efficiency of a program. Evaluators outline the benefits and costs of the program for comparison. An efficient program achieves its outcomes at a lower cost per unit of benefit (equivalently, a higher benefit-to-cost ratio).
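As a rough sketch of such a comparison, consider two hypothetical job-training programs; all figures below are invented for illustration, and "outcomes achieved" stands in for whatever benefit measure the evaluation uses.

```python
# A minimal sketch of a cost-effectiveness comparison between two
# hypothetical programs. All figures are made up for illustration.

def cost_per_outcome(total_cost, outcomes_achieved):
    """Cost-effectiveness ratio: dollars spent per unit of outcome."""
    return total_cost / outcomes_achieved

# Hypothetical: outcomes_achieved = number of participants employed.
program_a = cost_per_outcome(total_cost=500_000, outcomes_achieved=250)  # $2,000 per job
program_b = cost_per_outcome(total_cost=400_000, outcomes_achieved=160)  # $2,500 per job

# The more efficient program achieves each outcome at lower cost.
more_efficient = "A" if program_a < program_b else "B"
print(more_efficient)  # prints "A"
```

Note that program B spends less in total but is still the less efficient of the two, which is why evaluators compare ratios rather than raw budgets.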

Determining Causation

Perhaps the most difficult part of evaluation is determining whether the program itself is causing observed impacts. Events or processes outside of the program may be the real cause of the observed outcome (or the real prevention of the anticipated outcome).

Causation is difficult to determine. One main reason for this is self-selection bias.[10] People select themselves to participate in a program. For example, in a job training program, some people decide to participate and others do not. Those who do participate may differ from those who do not in important ways. They may be more determined to find a job or have better support resources. These characteristics, rather than the job training program itself, may be causing the observed outcome of increased employment.

Random assignment addresses this problem. If a program could randomly assign people to participate or not participate, self-selection bias would be eliminated: the two groups would, on average, be identical apart from the program itself, so differences in their outcomes could be attributed to the program. Correlation alone cannot prove causation, but randomization makes a causal interpretation defensible.
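The effect of self-selection, and how random assignment removes it, can be illustrated with a small simulation. The numbers here are entirely hypothetical: "motivation" raises both the chance of enrolling and the chance of finding a job, while the program itself is assumed to have no effect at all.

```python
import random

# A small simulation (hypothetical numbers) of self-selection bias.
# Motivation raises both enrollment and employment; the program itself
# has zero effect, so any observed gap is pure selection bias.

random.seed(0)

def employment_gap(randomize, n=100_000):
    """Employment rate of participants minus that of non-participants."""
    treated, control = [], []
    for _ in range(n):
        motivated = random.random() < 0.5
        if randomize:
            enrolls = random.random() < 0.5  # assignment ignores motivation
        else:
            enrolls = random.random() < (0.8 if motivated else 0.2)  # self-selection
        employed = random.random() < (0.7 if motivated else 0.3)  # program plays no role
        (treated if enrolls else control).append(employed)
    return sum(treated) / len(treated) - sum(control) / len(control)

naive_gap = employment_gap(randomize=False)  # inflated: reflects motivation, not the program
random_gap = employment_gap(randomize=True)  # close to the true program effect of zero
print(round(naive_gap, 2), round(random_gap, 2))
```

Under self-selection the participants look substantially more successful even though the program does nothing, while under random assignment the gap shrinks toward the true effect of zero.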

However, since most programs cannot use random assignment, causation usually cannot be established definitively. Impact analysis can still provide useful information. For example, the outcomes of the program can be described, so the evaluation can report that people who participated in the program were more likely to experience a given outcome than people who did not participate.

If the program is fairly large, and there is enough data, statistical analysis can be used to make a reasonable case for the program by showing, for example, that other causes are unlikely.
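One simple form such a statistical argument can take is a permutation test: if group labels were irrelevant, how often would chance alone produce a gap as large as the one observed? The data below are invented for illustration.

```python
import random

# A minimal sketch of arguing that chance is an unlikely explanation
# for an observed difference. All data are hypothetical.

random.seed(1)

# Hypothetical outcomes (1 = employed): 40 participants, 40 non-participants.
participants = [1] * 28 + [0] * 12        # 70% employed
non_participants = [1] * 18 + [0] * 22    # 45% employed
observed_gap = sum(participants) / 40 - sum(non_participants) / 40

# Permutation test: randomly relabel who "participated" and count how
# often the relabelled gap is at least as large as the observed one.
pooled = participants + non_participants
trials, extreme = 10_000, 0
for _ in range(trials):
    random.shuffle(pooled)
    gap = sum(pooled[:40]) / 40 - sum(pooled[40:]) / 40
    if gap >= observed_gap:
        extreme += 1

p_value = extreme / trials
print(round(observed_gap, 2), p_value)  # a small p-value makes "pure chance" unlikely
```

A test like this does not rule out confounders such as self-selection; it only addresses one alternative explanation, random variation, which is why it supports rather than replaces the design considerations above.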

Types of Program Evaluation

Program evaluation is often divided into distinct types.[11]

Evaluation can be performed at any stage of a program. The results may be used to decide how the program is delivered, what form the program will take, or to examine its outcomes. For example, an exercise program for elderly adults might first seek to learn which activities are motivating and interesting to this group; those activities would then be included in the program.

Process Evaluation (Formative Evaluation) is concerned with how the program is delivered. It deals with things such as when the program activities occur, where they occur, and who delivers them. In other words, it asks the question: Is the program being delivered as intended? An effective program may not yield desired results if it is not delivered properly.

Outcome Evaluation (Summative Evaluation) addresses the question of what results the program produces. It is common to speak of short-term outcomes and long-term outcomes. For example, in an exercise program, a short-term outcome could be a change in knowledge about the health effects of exercise, or it could be a change in exercise behavior. A long-term outcome could be less likelihood of dying from heart disease.

CDC framework

In 1999, the Centers for Disease Control and Prevention (CDC) published a six-step framework for conducting evaluation of public health programs. The publication of the framework is a result of the increased emphasis on program evaluation of government programs in the US. The six steps are:

  1. Engage stakeholders.
  2. Describe the program.
  3. Focus the evaluation.
  4. Gather credible evidence.
  5. Justify conclusions.
  6. Ensure use and share lessons learned.

Methodological Challenges Presented by Language and Culture

The purpose of this section is to draw attention to some of the methodological challenges and dilemmas evaluators potentially face when conducting a program evaluation in a developing country. In many developing countries the major sponsors of evaluation are donor agencies from the developed world, and these agencies require regular evaluation reports in order to maintain accountability and control of resources, as well as to generate evidence for the program’s success or failure (Bamberger, 2000).[12] However, there are many hurdles and challenges which evaluators face when attempting to implement an evaluation that makes use of techniques and systems not developed within the context to which they are applied (Smith, 1990).[13] Some of the issues include differences in culture, attitudes, language and political process (Ebbutt, 1998; Smith, 1990).[14]

Culture is defined by Ebbutt (1998, p. 416) as a “constellation of both written and unwritten expectations, values, norms, rules, laws, artifacts, rituals and behaviours that permeate a society and influence how people behave socially”. Culture can influence many facets of the evaluation process, including data collection, evaluation program implementation and the analysis and understanding of the results of the evaluation (Ebbutt, 1998). In particular, instruments which are traditionally used to collect data such as questionnaires and semi-structured interviews need to be sensitive to differences in culture, if they were originally developed in a different cultural context (Bulmer & Warwick, 1993)[15]. The understanding and meaning of constructs which the evaluator is attempting to measure may not be shared between the evaluator and the sample population and thus the transference of concepts is an important notion, as this will influence the quality of the data collection carried out by evaluators as well as the analysis and results generated by the data (ibid).

Language also plays an important part in the evaluation process, as language is tied closely to culture (ibid). Language can be a major barrier to communicating concepts which the evaluator is trying to access, and translation is often required (Ebbutt, 1998). There are a multitude of problems with translation, including the loss of meaning as well as the exaggeration or enhancement of meaning by translators (ibid). For example, terms which are contextually specific may not translate into another language with the same weight or meaning. In particular, data collection instruments need to take meaning into account, as subject matter that may not be considered sensitive in one context might prove sensitive in the context in which the evaluation is taking place (Bulmer & Warwick, 1993).

Thus, evaluators need to take into account two important concepts when administering data collection tools: lexical equivalence and conceptual equivalence (ibid). Lexical equivalence asks the question: how does one phrase a question in two languages using the same words? This is a difficult task to accomplish, and techniques such as back-translation may aid the evaluator but may not result in perfect transference of meaning (ibid). This leads to the next point, conceptual equivalence. It is not a common occurrence for concepts to transfer unambiguously from one culture to another (ibid). Data collection instruments which have not undergone adequate testing and piloting may therefore render results which are not useful, as the concepts measured by the instrument may have taken on a different meaning, rendering the instrument unreliable and invalid (ibid).

Thus, it can be seen that evaluators need to take into account the methodological challenges created by differences in culture and language when attempting to conduct a program evaluation in a developing country.

References


  1. ^ Administration for Children and Families (2006) The Program Manager's Guide to Evaluation. Chapter 2: What is program evaluation?.
  2. ^ US Department of Labor, History of the DOL (no date) Chapter 6: Eras of the New Frontier and the Great Society, 1961-1969.
  3. ^ National Archives, Records of the Office of Management and Budget (1995) 51.8.8 Records of the Office of Program Evaluation.
  4. ^ Potter, C. (2006). Program Evaluation. In M. Terre Blanche, K. Durrheim & D. Painter (Eds.), Research in practice: Applied methods for the social sciences (2nd ed.) (pp. 410-428). Cape Town: UCT Press.
  5. ^ Rossi, P., Lipsey, M.W., & Freeman, H.E. (2004). Evaluation: a systematic approach (7th ed.). Thousand Oaks: Sage.
  6. ^ Weiss, C.H. (1999). Research-policy linkages: How much influence does social science research have? World Social Science Report, pp. 194-205.
  7. ^ Louw, J. (1999). Improving practice through evaluation. In D. Donald, A. Dawes & J. Louw (Eds.), Addressing childhood adversity (pp. 60-73). Cape Town: David Philip.
  8. ^ Rossi, P.H., Lipsey, M.W. & Freeman, H.E. (2004) Evaluation: A Systematic Approach.
  9. ^ Centers for Disease Control and Prevention. Framework for Program Evaluation in Public Health. MMWR 1999;48(No. RR-11).
  10. ^ Miller, D.C. & Salkind, N.J. (2002). Handbook of Research Design & Social Measurement (6th ed., revised). Thousand Oaks: Sage.
  11. ^ University of Texas, Austin, Division of Instructional Innovation and Assessment (2007). Types of program evaluation.
  12. ^ Bamberger, M. (2000). The Evaluation of International Development Programs: A View from the Front. American Journal of Evaluation, 21, pp. 95-102.
  13. ^ Smith, T. (1990) Policy evaluation in third world countries: some issues and problems. The Asian Journal of Public Administration, 12, pp. 55-68.
  14. ^ Ebbutt, D. (1998). Evaluation of projects in the developing world: some cultural and methodological issues. International Journal of Educational Development, 18, pp. 415-424.
  15. ^ Bulmer, M. and Warwick, D. (1993). Social research in developing countries: surveys and censuses in the Third World. London: Routledge.
