Measuring Success: Evaluation Article Types for the Public Health Education and Promotion Section of Frontiers in Public Health
Matthew Lee Smith1,* and Marcia G. Ory2
1Department of Health Promotion and Behavior, College of Public Health, The University of Georgia, Athens, GA, USA
2Department of Health Promotion and Community Health Sciences, School of Public Health, Texas A&M Health Science Center, College Station, TX, USA
Edited by: Irene Olivia Adelaide Sandvold, Health Resources and Services Administration, USA
Reviewed by: Geraldine Sanchez Aglipay, University of Illinois at Chicago, USA; Irene Olivia Adelaide Sandvold, Health Resources and Services Administration, USA
This article was submitted to Public Health Education and Promotion, a section of the journal Frontiers in Public Health.
Received 2014 Apr 28; Accepted 2014 Jul 21.
Keywords: evaluation, review criteria, public health education and promotion, article type, peer review
Copyright © 2014 Smith and Ory.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Front Public Health. 2014; 2: 111.
Published online 2014 Aug 13. Prepublished online 2014 Jun 16. doi: 10.3389/fpubh.2014.00111
The aims of this article are to provide a rationale about the importance of evaluation in public health initiatives; justify Public Health Education and Promotion’s decision to create an Evaluation Article Type; and outline the evaluation criteria from which submitted articles will be assessed for publication.
The Importance and Use of Evaluation in Public Health Education and Promotion
Evaluation is a process used by researchers, practitioners, and educators to assess the value of a given program, project, or policy (1). The primary purposes of evaluation in public health education and promotion are to: (1) determine the effectiveness of a given intervention and/or (2) assess and improve the quality of the intervention. Through evaluation, we can identify our level of success in evoking desired outcomes and accomplishing desired objectives. This is accomplished by carefully formulating specific, measurable objective statements that enable evaluators to assess if the intervention influences intended indicators and/or if the correct measures were used to gauge effectiveness. Determining the impact of our efforts has vast implications for the future of the intervention. For example, through evaluation we are able to identify the essential elements of a given intervention (e.g., activities, content, resources, and structure), refine content and implementation strategies, and decide whether or not to invest more resources for scalability.
High-quality evaluation is contingent upon the appropriateness of the design and selected measures for the questions being posed and the population being studied. Measurement is especially critical to evaluation because it enables the evaluator to know if changes or improvements occur as a result of the intervention, and it provides testable evidence for participant progress and program success. Evaluation is a critical factor for demonstrating accountability to all stakeholders included in the intervention. More specifically, conducting an appropriate and rigorous evaluation shows that the evaluator is accountable to the audiences and communities they serve, the organization for which they work, the funding agency supporting the project, and the greater field of public health.
Evaluation serves many varied purposes in addition to providing accountability for the stakeholders. At the very core, evaluative efforts help determine if predetermined objectives related to behavior change or health improvement were achieved in the proposed health education or promotion initiative. Evaluation is also useful to improve elements surrounding program implementation (e.g., partnership development, fidelity, effectiveness, and efficiency) and can increase the level of community support for a given intervention or initiative. Further, evaluation contributes to our knowledge about the determinants of health issues as well as the best and most appropriate public health interventions to address them. This knowledge is extremely valuable to guide future research and practice. Evaluation also informs policy decisions at the organizational, local, state, national, and international level.
The role of evaluation has evolved over time. There are many types of evaluation, which are primarily defined by their design and purpose (2). The selection of an evaluation design is dependent upon the initiative’s focus, health issue being targeted, audience, setting, and timeline. Efficacy research includes evaluation performed under strict and regulated conditions, often in the form of randomized controlled trials (RCT). This type of evaluation is beneficial to determine what types of interventions work, while controlling for confounders and external influences. Effectiveness research includes evaluation performed in less controlled situations. This type of evaluation is beneficial to determine if the effects from RCT can be replicated in ‘real-world’ settings and conditions, often on a grander scale. Dissemination and implementation research typically includes evaluations performed in ‘real-world’ settings. This type of evaluation is beneficial to determine how to get what is known to be effective into the hands of the people, organizations, and communities that need them most. Much of this translational and pragmatic research includes evaluation about participant recruitment and retention, organizational adoption, fidelity, partnership formation and collaboration, data collection processes, scalability, and sustainability.
There are many phases of evaluation, which are primarily defined by their purpose and timing in the initiative’s delivery (3–5). Formative evaluation typically occurs in the early stages of an initiative to ‘pilot test’ for the purposes of obtaining feedback from involved parties, adjusting and enhancing the intervention components and content, and guiding the future directions of the initiative. Formative evaluation is most often concerned with feasibility and the appropriateness of materials and procedures. Formative evaluation permits preliminary testing and refinement of study hypotheses, data collection instruments, and statistical/analytical procedures. Generally, this form of evaluation occurs on a small scale to ensure unanticipated problems (e.g., glitches, breakdowns, lengthy delays, and departures from the design) are identified and the intervention quality is improved before ‘going to scale’ (i.e., prior to allocating larger investments of time, effort, and resources). Process evaluation is a type of formative evaluation that focuses on the intervention itself (as opposed to the outcomes) and should occur throughout the ‘life’ of an initiative. This type of evaluation uses data to assess the delivery of services and examine the nature and quality of processes and procedures. Process evaluation helps the evaluator to define the content, activities, and parameters of the initiative. It also addresses whether or not the intervention reached the intended audience, was appropriate for the audience, and was delivered as intended (including elements of fidelity and receipt of adequate intervention dose). Summative evaluation encompasses the overall merit of the intervention in terms of immediate impact as well as intermediate- and long-term outcomes. 
In addition to the intervention’s effectiveness, this type of evaluation also encompasses process evaluation, considering that predicted outcomes and objectives can only be achieved if the intervention is delivered with fidelity, as intended.
Rationale for Introducing Evaluation Article Type Submissions
Recognizing the importance of evaluation, the Public Health Education and Promotion section has created an Article Type dedicated to evaluation. Evaluation is a special niche of public health education and promotion that assesses interventions’ ability to change health-related knowledge, perceptions, behavior, and service/resource utilization. While many public health education and promotion evaluations examine program efficacy and effectiveness, the emergent emphasis on translational issues of program dissemination and implementation (e.g., participant and delivery site recruitment and retention, fidelity, and maintenance/sustainability) requires the application of pragmatic research principles and methodologies (6, 7). Such translational evaluations address different research questions than traditional efficacy and effectiveness evaluations and are often conducted under pragmatic research designs. Pragmatic designs also attempt to promote the translation between research and practice (8). Thus, articles written using these methodological techniques require tailored review criteria to determine their appropriateness for publication.
Further, in public health, there are many types of innovations (e.g., trainings, courses, curricula, health promotion programs, and environmental or policy change), and there are many ways to report the participants, procedures, and findings of these initiatives based on the data collection methodology and research design (e.g., CONSORT for reporting randomized controlled trials, TREND for non-randomized evaluations, and STROBE for observational studies) (9–11). Although these guidelines are good for documenting the quality of the evaluation in terms of the appropriateness, sophistication, and replicability of the research design and evaluation, they are not all encompassing for innovations in public health education and promotion. As such, a general, expansive, and all-encompassing set of criteria is needed to assess evaluation-related manuscripts submitted to the Public Health Education and Promotion section to ensure published manuscripts are rigorous, timely, relevant, and responsive to public health needs.
Evaluation Article Type
Public Health Education and Promotion will accept a broad spectrum of articles that evaluate programs, courses, curricula, teaching methods, and other pedagogical elements as well as public health innovations at the organizational, environmental, or policy levels relevant to our mission. Such translational research articles will require a sufficient description of the program logistics, procedures, and participants/sample. Additionally, submissions will require a Discussion section that shares practical implications, lessons learned for future applications of the program, and acknowledgment of any methodological constraints. Articles should not exceed 6,000 words and include a maximum of five tables/graphs. Details about the Evaluation Article Type can be found online (http://www.frontiersin.org/Public_Health_Education_and_Promotion/articletype). A detailed description of the criteria used by Review Editors during the peer-review process is available as a separate file. While this information is obviously beneficial for Review Editors, we hope it will be consulted by authors prior to submitting evaluation-related manuscripts to the Public Health Education and Promotion section.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Evaluation Article Type Criteria for Peer Review Public Health Education and Promotion
Indicate what this article evaluates
__an educational or training program or intervention
__a teaching method
__multiple pedagogical facets
__a health promotion program
__an environmental, technological, or policy change (natural or planned)
__none of the above (i.e., inappropriately categorized for submission as an Evaluation article)
__other. Please specify: ________________________________________________
Indicate the target audience
__community professionals (of one or more disciplines)
__healthcare professionals (of one or more disciplines)
Note: In the following questions, “intervention/program” is used to encompass the innovation (subject of the targeted effort, whether it is a course, curriculum, pilot project, program, etc.) being evaluated.
Significance of issue being addressed by the intervention/program: (scored out of a maximum of 10 points)
Appropriateness of the methodology used to address the question being asked: (scored out of a maximum of 10 points)
Quality of the writing: (scored out of a maximum of 10 points)
Quality of the figure(s) and tables: (scored out of a maximum of 10 points)
Significance of the evaluation findings: (scored out of a maximum of 10 points)
Application of the evaluation findings: (scored out of a maximum of 10 points)
Could this program be replicated by other organizations?
__Unclear based on the presented information
An Evaluation article has the following mandatory sections: abstract, introduction, background and rationale, methods, results, discussion, and conclusions. Are all sections present?
Is the abstract written in a clear and comprehensive way?
Does the introduction present the problem in an appropriate context?
Does the Introduction identify a knowledge gap justifying the need for evaluation?
Other comments on introduction.
Background and Rationale:
Is the literature review sufficient?
Is the intent of the intervention/program adequately described?
Are the questions asked by the evaluation those that are most essential to the success of the intervention/program being attempted?
Other comments on background and rationale.
Is the population adequately described so it is clear: (1) how they were recruited and (2) whether they are representative of the broader population?
Is the intervention/program adequately described (e.g., development, previous findings, components, format/design)?
Are the intervention/program implementation procedures adequately described (e.g., structure, participant and site recruitment, data collection)?
Is the evaluation methodology appropriate to assess the process?
Is the evaluation methodology appropriate to assess the outcome?
Are the statistical analysis techniques appropriate? Adequately presented?
Other comments on methods.
Are findings accurately reported from data presented?
Is the level of detail of the results appropriate (too much, too little, about right)?
Is essential information missing?
Are effect sizes for outcomes available to enable cross-study comparison, when appropriate?
Other comments on results.
Are the reported findings summarized briefly and described within the context of what is currently known about the innovation (using findings from previous evaluations)?
Does the discussion address the knowledge gap identified in the Introduction section?
Does the discussion address all possible concerns of both internal and external validity of the findings (Limitations section)?
Does the article conclude with practical recommendations for others who might replicate this intervention/program (or similar programs)?
Does the article conclude with applied recommendations for practitioners in the field who might deliver this intervention/program (or similar programs) in their communities/settings?
Does the evaluation contribute concrete recommendations for delivering and/or improving the intervention/program in future applications (directed toward researchers or practitioners, as appropriate)?
Other comments on discussion.
Are the conclusions justified?
Overall, does the article contribute to building Evidence-Based Practice?
Is prior work properly and fully cited?
An Evaluation article should not exceed 4,500 words. Should any part of the article be shortened? If yes, please specify which part should be shortened.
An Evaluation article should not include more than 5 tables/figures. If there are more tables/figures included, please specify if you believe tables can be combined, condensed, or eliminated.
Language and Grammar:
Is the language and grammar of sufficient quality?
Should the paper be sent to an expert in English language and scientific writing?
Please add any further comments you have regarding this manuscript.
1. Springett J. Issues in participatory evaluation. In: Minkler M, Wallerstein N, editors. Community Based Participatory Research for Health. New York: Jossey-Bass (2003). p. 263–86.
2. Flay BR, Biglan A, Boruch RF, Castro FG, Gottfredson D, Kellam S, et al. Standards of evidence: criteria for efficacy, effectiveness and dissemination. Prev Sci (2005) 6(3):151–75. doi:10.1007/s11121-005-5553-y
3. McKenzie JF, Neiger BL, Thackeray R. Planning, Implementing, & Evaluating Health Promotion Programs: A Primer. 5th ed. San Francisco: Benjamin Cummings (2009). 464 p.
4. Royse D, Thyer B, Padgett D. Program Evaluation: An Introduction. Belmont, CA: Cengage Learning (2009). 416 p.
5. Windsor RA, Baranowski T, Clark N, Cutter G. Evaluation of health promotion and education programs. J Sch Health (1984) 54(8):318. doi:10.1111/j.1746-1561.1984.tb08946.x
6. Glasgow RE. What does it mean to be pragmatic? Pragmatic methods, measures, and models to facilitate research translation. Health Educ Behav (2013) 40(3):257–65. doi:10.1177/1090198113486805
7. Glasgow RE, Chambers D. Developing robust, sustainable, implementation systems using rigorous, rapid and relevant science. Clin Transl Sci (2012) 5(1):48–55. doi:10.1111/j.1752-8062.2011.00383.x
8. Glasgow RE, Lichtenstein E, Marcus AC. Why don’t we see more translation of health promotion research to practice? Rethinking the efficacy-to-effectiveness transition. Am J Public Health (2003) 93(8):1261–7. doi:10.2105/AJPH.93.8.1261
9. Des Jarlais DC, Lyles C, Crepaz N. Improving the reporting quality of nonrandomized evaluations of behavioral and public health interventions: the TREND statement. Am J Public Health (2004) 94(3):361–6. doi:10.2105/AJPH.94.3.361
10. Moher D, Schulz KF, Altman DG. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet (2001) 357(9263):1191–4. doi:10.1016/S0140-6736(00)04337-3
11. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. Prev Med (2007) 45(4):247–51. doi:10.1016/j.ypmed.2007.08.012
Evaluation is a methodological area that is closely related to, but distinguishable from, more traditional social research. Evaluation utilizes many of the same methodologies used in traditional social research, but because evaluation takes place within a political and organizational context, it requires group skills, management ability, political dexterity, sensitivity to multiple stakeholders, and other skills that social research in general does not rely on as much. Here we introduce the idea of evaluation and some of the major terms and issues in the field.
Definitions of Evaluation
Probably the most frequently given definition is:
Evaluation is the systematic assessment of the worth or merit of some object
This definition is hardly perfect. There are many types of evaluations that do not necessarily result in an assessment of worth or merit -- descriptive studies, implementation analyses, and formative evaluations, to name a few. Better perhaps is a definition that emphasizes the information-processing and feedback functions of evaluation. For instance, one might say:
Evaluation is the systematic acquisition and assessment of information to provide useful feedback about some object
Both definitions agree that evaluation is a systematic endeavor and both use the deliberately ambiguous term 'object' which could refer to a program, policy, technology, person, need, activity, and so on. The latter definition emphasizes acquiring and assessing information rather than assessing worth or merit because all evaluation work involves collecting and sifting through data, making judgements about the validity of the information and of inferences we derive from it, whether or not an assessment of worth or merit results.
The Goals of Evaluation
The generic goal of most evaluations is to provide "useful feedback" to a variety of audiences including sponsors, donors, client-groups, administrators, staff, and other relevant constituencies. Most often, feedback is perceived as "useful" if it aids in decision-making. But the relationship between an evaluation and its impact is not a simple one -- studies that seem critical sometimes fail to influence short-term decisions, and studies that initially seem to have no influence can have a delayed impact when more congenial conditions arise. Despite this, there is broad consensus that the major goal of evaluation should be to influence decision-making or policy formulation through the provision of empirically-driven feedback.
'Evaluation strategies' means broad, overarching perspectives on evaluation. They encompass the most general groups or "camps" of evaluators; although, at its best, evaluation work borrows eclectically from the perspectives of all these camps. Four major groups of evaluation strategies are discussed here.
Scientific-experimental models are probably the most historically dominant evaluation strategies. Taking their values and methods from the sciences -- especially the social sciences -- they emphasize the desirability of impartiality, accuracy, objectivity, and the validity of the information generated. Included under scientific-experimental models would be: the tradition of experimental and quasi-experimental designs; objectives-based research that comes from education; econometrically-oriented perspectives including cost-effectiveness and cost-benefit analysis; and the recent articulation of theory-driven evaluation.
The second class of strategies are management-oriented systems models. Two of the most common of these are PERT, the Program Evaluation and Review Technique, and CPM, the Critical Path Method. Both have been widely used in business and government in the United States. It would also be legitimate to include the Logical Framework or "Logframe" model developed at the U.S. Agency for International Development and general systems theory and operations research approaches in this category. Two management-oriented systems models were originated by evaluators: the UTOS model where U stands for Units, T for Treatments, O for Observing Operations and S for Settings; and the CIPP model where the C stands for Context, the I for Input, the first P for Process and the second P for Product. These management-oriented systems models emphasize comprehensiveness in evaluation, placing evaluation within a larger framework of organizational activities.
The third class of strategies are the qualitative/anthropological models. They emphasize the importance of observation, the need to retain the phenomenological quality of the evaluation context, and the value of subjective human interpretation in the evaluation process. Included in this category are the approaches known in evaluation as naturalistic or 'Fourth Generation' evaluation; the various qualitative schools; critical theory and art criticism approaches; and, the 'grounded theory' approach of Glaser and Strauss among others.
Finally, a fourth class of strategies is termed participant-oriented models. As the term suggests, they emphasize the central importance of the evaluation participants, especially clients and users of the program or technology. Client-centered and stakeholder approaches are examples of participant-oriented models, as are consumer-oriented evaluation systems.
With all of these strategies to choose from, how to decide? Debates that rage within the evaluation profession -- and they do rage -- are generally battles between these different strategists, with each claiming the superiority of their position. In reality, most good evaluators are familiar with all four categories and borrow from each as the need arises. There is no inherent incompatibility between these broad strategies -- each of them brings something valuable to the evaluation table. In fact, in recent years attention has increasingly turned to how one might integrate results from evaluations that use different strategies, carried out from different perspectives, and using different methods. Clearly, there are no simple answers here. The problems are complex and the methodologies needed will and should be varied.
Types of Evaluation
There are many different types of evaluations depending on the object being evaluated and the purpose of the evaluation. Perhaps the most important basic distinction in evaluation types is that between formative and summative evaluation. Formative evaluations strengthen or improve the object being evaluated -- they help form it by examining the delivery of the program or technology, the quality of its implementation, and the assessment of the organizational context, personnel, procedures, inputs, and so on. Summative evaluations, in contrast, examine the effects or outcomes of some object -- they summarize it by describing what happens subsequent to delivery of the program or technology; assessing whether the object can be said to have caused the outcome; determining the overall impact of the causal factor beyond only the immediate target outcomes; and, estimating the relative costs associated with the object.
Formative evaluation includes several evaluation types:
- needs assessment determines who needs the program, how great the need is, and what might work to meet the need
- evaluability assessment determines whether an evaluation is feasible and how stakeholders can help shape its usefulness
- structured conceptualization helps stakeholders define the program or technology, the target population, and the possible outcomes
- implementation evaluation monitors the fidelity of the program or technology delivery
- process evaluation investigates the process of delivering the program or technology, including alternative delivery procedures
Summative evaluation can also be subdivided:
- outcome evaluations investigate whether the program or technology caused demonstrable effects on specifically defined target outcomes
- impact evaluation is broader and assesses the overall or net effects -- intended or unintended -- of the program or technology as a whole
- cost-effectiveness and cost-benefit analysis address questions of efficiency by standardizing outcomes in terms of their dollar costs and values
- secondary analysis reexamines existing data to address new questions or use methods not previously employed
- meta-analysis integrates the outcome estimates from multiple studies to arrive at an overall or summary judgement on an evaluation question
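The inverse-variance pooling at the heart of a fixed-effect meta-analysis can be sketched in a few lines. The study estimates and standard errors below are hypothetical, purely for illustration; real meta-analyses would also test for heterogeneity and consider random-effects models.

```python
import math

def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling of study estimates.

    Each study's estimate is weighted by 1 / SE^2, so more precise
    studies contribute more to the summary estimate.
    """
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * est for w, est in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect estimates (e.g., mean differences) from three studies
estimates = [0.30, 0.10, 0.25]
std_errors = [0.10, 0.15, 0.12]

pooled, se = fixed_effect_pool(estimates, std_errors)
print(f"pooled estimate = {pooled:.3f}, SE = {se:.3f}")
print(f"95% CI = ({pooled - 1.96*se:.3f}, {pooled + 1.96*se:.3f})")
```

Note how the second study, with the largest standard error, pulls the pooled estimate toward its value least: precision, not sample order, determines influence.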
Evaluation Questions and Methods
Evaluators ask many different kinds of questions and use a variety of methods to address them. These are considered within the framework of formative and summative evaluation as presented above.
In formative research the major questions and methodologies are:
What is the definition and scope of the problem or issue, or what's the question?
Formulating and conceptualizing methods might be used including brainstorming, focus groups, nominal group techniques, Delphi methods, brainwriting, stakeholder analysis, synectics, lateral thinking, input-output analysis, and concept mapping.
Where is the problem and how big or serious is it?
The most common method used here is "needs assessment", which can include analysis of existing data sources, sample surveys, interviews of constituent populations, qualitative research, expert testimony, and focus groups.
How should the program or technology be delivered to address the problem?
Some of the methods already listed apply here, as do detailing methodologies like simulation techniques, or multivariate methods like multiattribute utility theory or exploratory causal modeling; decision-making methods; and project planning and implementation methods like flow charting, PERT/CPM, and project scheduling.
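The PERT/CPM scheduling methods mentioned above rest on a simple computation: the project's duration is the longest path of dependent activities. A minimal sketch of the CPM forward pass, using an entirely hypothetical activity network for a program rollout:

```python
# Hypothetical activity network: name -> (duration in weeks, predecessors)
activities = {
    "plan":      (3, []),
    "materials": (5, ["plan"]),
    "training":  (4, ["plan"]),
    "delivery":  (6, ["materials", "training"]),
    "review":    (2, ["delivery"]),
}

def critical_path(activities):
    """Forward pass of the Critical Path Method (CPM).

    Computes each activity's earliest finish time, then backtracks
    through the binding (longest-finishing) predecessors to recover
    the critical path and the total project duration.
    """
    earliest = {}  # activity -> earliest finish time

    def finish(name):
        if name not in earliest:
            dur, preds = activities[name]
            earliest[name] = dur + max((finish(p) for p in preds), default=0)
        return earliest[name]

    for name in activities:
        finish(name)

    # Backtrack from the activity that finishes last
    path = [max(earliest, key=earliest.get)]
    while activities[path[-1]][1]:
        preds = activities[path[-1]][1]
        path.append(max(preds, key=earliest.get))
    path.reverse()
    return path, earliest[path[-1]]

path, length = critical_path(activities)
print("critical path:", " -> ".join(path), "| project duration:", length)
```

Activities off the critical path ("training" here) have slack: delaying them slightly does not delay the project, which is precisely the scheduling insight PERT/CPM provides.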
How well is the program or technology delivered?
Qualitative and quantitative monitoring techniques, the use of management information systems, and implementation assessment would be appropriate methodologies here.
The questions and methods addressed under summative evaluation include:
What type of evaluation is feasible?
Evaluability assessment can be used here, as well as standard approaches for selecting an appropriate evaluation design.
What was the effectiveness of the program or technology?
One would choose from observational and correlational methods for demonstrating whether desired effects occurred, and quasi-experimental and experimental designs for determining whether observed effects can reasonably be attributed to the intervention and not to other sources.
What is the net impact of the program?
Econometric methods for assessing cost effectiveness and cost/benefits would apply here, along with qualitative methods that enable us to summarize the full range of intended and unintended impacts.
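The core of a cost-effectiveness comparison is the incremental cost-effectiveness ratio (ICER): the extra cost of a new program divided by its extra health effect relative to the alternative. The dollar figures and QALY values below are hypothetical, chosen only to show the arithmetic.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost-effectiveness ratio: additional cost per
    additional unit of health effect when replacing the old program
    with the new one."""
    return (cost_new - cost_old) / (effect_new - effect_old)

# Hypothetical figures: costs in dollars, effects in quality-adjusted life years
ratio = icer(cost_new=120_000, effect_new=14.0, cost_old=80_000, effect_old=12.0)
print(f"ICER = ${ratio:,.0f} per additional QALY")
```

Decision-makers then compare this ratio against a willingness-to-pay threshold to judge whether the added effectiveness justifies the added cost.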
Clearly, this introduction is not meant to be exhaustive. Each of these methods, and the many not mentioned, is supported by an extensive methodological research literature. This is a formidable set of tools. But the need to improve, update and adapt these methods to changing circumstances means that methodological research and development needs to have a major place in evaluation work.
Copyright ©2006, William M.K. Trochim, All Rights Reserved
Last Revised: 10/20/2006