CHAPTER 6
Assessing the Success of Professional Practice Models: Evaluation
KEY WORDS
Evaluation, evaluation plan, performance indicators, measurement, feedback
OBJECTIVES
By the end of this chapter, readers will be able to:
1. Explain the definition, types, and processes used for evaluation
2. Describe the purpose of evaluation frameworks
3. Identify the components of an evaluation plan
4. State at least three performance indicators associated with implementing a professional practice model (PPM)
5. Choose valid and reliable instruments
6. Analyze data-collection strategies
7. Apply at least one method for engaging end users in the evaluation process
8. Develop a transparent method for disseminating evaluation feedback
EVALUATION
Although evaluation is actually conducted during and after implementation of a project or program, it must be a consideration before the implementation process is finalized.
Incorporating evaluation into the implementation process enhances the continuity and credibility of the project, improving the likelihood of success.
The goal of evaluation is to provide objective feedback that is useful for decision making or, in a more contemporary sense, to support development of innovations and adaptation of interventions in complex dynamic environments (Patton, 2011). In this case, the aim of evaluation is to provide objective feedback on the progress of the implementation process and the extent to which intended goals and objectives were successfully met (outcomes). Because evaluation is not only a conceptual exercise, but also a practical service, Stufflebeam and Coryn (2014) pose the following operational definition of evaluation: “the systematic process of delineating, obtaining, reporting, and applying descriptive and judgmental information about an object’s merit, worth, probity, feasibility, safety, significance, and/or equity” (p. 14). Two forms of evaluation are typically described: formative and summative.
Formative evaluation is a learning process that is intended to provide feedback on a project’s progress in order to assess what more might be done to advance such progress (Bennett, 2011), providing frequent opportunities for continuous improvement. As such, the evidence obtained from formative evaluation helps to determine whether the project was implemented as planned and whether the number of recipients reached established targets, and to assess whether adjustments are needed and, if so, to recommend them (Weiss, 1998). When performed effectively, formative evaluation ensures that implementation strategies, activities, and materials work as planned. It also helps to understand whether the strategies used are serving the target population as designed and whether the number of people being served is more or less than expected.
Summative evaluation, on the other hand, refers to the results of a program or project, and includes specific intermediate and long-term outcomes and impact measures. Summative evaluation is most often used at the end of a project to learn how well the program succeeded in achieving its ultimate goals, whether changes in participants’ knowledge, attitudes, beliefs, or behaviors occurred as planned, and whether the desired goals were achieved with efficiency (McDavid, Huse, & Hawthorn, 2013). See Table 6.1 for differences and similarities between formative and summative evaluation. Both formative and summative evaluative processes work together to provide a comprehensive assessment of a project’s success in meeting its goals (observed versus specified outcomes).
Table 6.1 Similarities and Differences Between Formative and Summative Evaluation
Evaluation of a project or program, or a change in the manner of working, involves ongoing collection and analysis of information to inform stakeholders about its progress and the achievement of its goals (in this case, to satisfy nursing’s responsibility to the health system regarding improvements in practice).
Data from this continuous assessment are used to revise the implementation process; to better manage limited resources, justify ongoing resource use, or seek additional resources; and, finally, to document end-of-program accomplishments. Internal evidence generated from evaluation data ensures that goals/objectives are being met and limited system resources are prudently expended; organizations desperately need these data to make informed decisions. Evaluation may include both quantitative data (usually used to assess specific outcomes) and qualitative data, which help to capture more detailed meanings (e.g., barriers and facilitators or differences among departments) and to better understand different perspectives and contexts surrounding the appropriateness of the implementation process.
During the implementation planning step of professional practice model integration, it is important to consider evaluation and even design it into the process. That way, its requirements and burdens, such as collecting data and interpreting and reporting results, are known and well planned for.
For example, the performance of data collectors could affect the reliability, accuracy, and completeness of data; the sample size could limit the use of certain statistical tests; and the preservation of confidentiality could restrict access to patient or employee records; any of these could negatively affect the quality of the data. Such burdens must be considered and addressed up front to avoid later credibility issues. Evaluation is intimately tied to implementation in that, without it, the success of the project cannot be effectively demonstrated (and methods to improve it are left unknown). Thus, understanding and applying rigor to the evaluation process is paramount.
A thorough definition of evaluation includes: “the systematic assessment of an object’s merit, worth, probity, feasibility, safety, significance, or equity” (Stufflebeam & Coryn, 2014, p. 24). The key word here is systematic. Systematic connotes a logical, orderly, consistent course of action (Figure 6.1). Such a process requires thinking about (prior to implementation) and establishing a written evaluation scheme (or plan) that allows for adequate reflection of its multiple components and consideration of assessment activities that are well designed and function optimally.
To assess the value of professional practice model (PPM) integration, it is necessary to judge the implementation’s ability to do things well or at least according to some standard (merit); to meet a need with some attached value (worth); to conduct it with integrity (probity), practicality (feasibility), and safety (does not induce harm); and to engender practice implications that are applied fairly (equity; Stufflebeam & Coryn, 2014). Using the more specific program evaluation standards (Yarbrough, Shulha, Hopson, & Caruthers, 2011) ensures that the ultimate purposes of evaluation—demonstrating the attainment of goals, monitoring progress and modifying project planning as appropriate, demonstrating accountability, and justifying funding—are respected. These standards address the utility of the evaluation (credibility of the process and products), feasibility (practicality), propriety (appropriate agreements, permissions, equity, transparency, disclosure, etc.), accuracy (quality information with justified conclusions), and accountability (documentation and benchmarking).
Another approach to ensuring a comprehensive evaluation is the reach, effectiveness, adoption, implementation, and maintenance (RE-AIM) framework. This comprehensive approach was developed by Glasgow, McKay, and Piette (2001) to be used in real-world settings. This method ensures the representativeness of participants, their demographics, and their acceptance of the implementation strategies. Calculating the proportion of implementation strategies completed and goals achieved at any point in time assesses effectiveness, while the analysis also considers contextual adaptations of the strategies, the time required to implement certain strategies, and the resources consumed. Furthermore, RE-AIM allows benchmarking to similar systems to add perspective and uses qualitative data to capture facilitators, barriers, and other contextual issues.
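As a rough illustration of the reach and effectiveness proportions just described, the short Python sketch below computes simple percentages from hypothetical counts; the variable names and numbers are illustrative assumptions, not prescribed RE-AIM measures.

```python
# Illustrative sketch: RE-AIM-style "reach" and "effectiveness" proportions
# computed from hypothetical project counts (all numbers are made up).

def proportion(part: int, whole: int) -> float:
    """Return part/whole as a percentage, guarding against division by zero."""
    return 100.0 * part / whole if whole else 0.0

# Reach: what share of eligible nurses participated in PPM activities?
eligible_nurses = 420        # hypothetical number of eligible staff
participating_nurses = 315   # hypothetical number who took part

# Effectiveness: what share of planned strategies and goals are complete so far?
strategies_planned, strategies_completed = 24, 18
goals_set, goals_achieved = 10, 6

print(f"Reach: {proportion(participating_nurses, eligible_nurses):.1f}% of eligible staff")
print(f"Strategies completed: {proportion(strategies_completed, strategies_planned):.1f}%")
print(f"Goals achieved to date: {proportion(goals_achieved, goals_set):.1f}%")
```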
RE-AIM also considers long-term delivery of the project by examining the extent to which the PPM becomes part of routine practice, and it estimates efficiency by examining costs. The evaluation of a PPM integration project consumes health system assets in terms of human and supply resources, and this must be taken into account prior to start-up.
Most decisions about maintaining a project long term are influenced, not only by the overall impact of a project, but also by its costs.
In this situation, optimum integration of the professional practice model is afforded by systematic monitoring of how well the implementation process meets important objectives (e.g., exceeds that of similar organizations or a national average); addresses the original need for the professional practice model (e.g., in terms of improving some patient factors); uses accurate information to reach defensible conclusions; is efficient; and is measured in an inclusive, ethical, practical, safe, and fair manner that provides the best evidence for formulating responsible revisions (if necessary) so that the ultimate success of the project can be realized.
This last point—best possible integration over the long term—is most significant for health systems today as they struggle to invest in what seems like endless, continuous changes and programs.
From this perspective, designing a comprehensive and systematic evaluation plan, while simultaneously adhering to evaluation standards, should be cohesively assimilated with the implementation process. A credible internal professional or an external consultant with expertise in project evaluation who applies a monitoring and evaluation framework to the process can assist greatly with identifying evaluation components, contemplating planned activities and feedback mechanisms, and determining whether they are indeed the most appropriate ones to execute.
MONITORING AND EVALUATION FRAMEWORKS
A clear blueprint from which to monitor and evaluate the integration of a PPM facilitates a more complete understanding of the project’s goals and objectives, shows the relationships among the many strategies that were developed for implementation, and describes how contextual factors may affect successful integration. Some questions to ponder prior to choosing a monitoring and evaluation framework include:
• Does the framework help to inform the progress of the project (e.g., is it useful)?
• Does the framework point to specific information that is needed to learn whether implementation strategies are being executed in the way they were planned (e.g., does it facilitate selection of indicators)?
• Does the framework clarify how to assess the results, impact, and success of the project?
• Does the framework provide a structure for determining whether the expected objectives and implementation goals were accomplished?
• Does the framework suggest competencies and responsibilities of evaluators (e.g., who is best to lead and actively participate in the evaluation)?
• Does the framework suggest sources of information?
• Does the framework account for the context of the project?
• Is the framework practical (does it use resources wisely)?
• Does the framework promote integrity, flexibility, robustness, and inclusiveness?
Determining which framework is suitable for a specific health system is difficult and some organizations prefer to combine aspects of several frameworks for a more individualized approach. Selection of the evaluation framework that best suits the implementation strategies and responds to institutional requirements is the best approach. Four popular evaluation frameworks that are used to assess the progress and outcomes of large projects/programs are described in what follows. However, they do not represent the totality of possible frameworks from which to choose.
Goals-based evaluation (GBE), a classic framework in the literature on organizational evaluation, reports evaluation results assessed only in relation to predetermined goals. The term goals in this approach is used broadly to include objectives, performance targets, and expected outcomes derived from an implementation plan. The goals-based approach to evaluation was developed by Tyler (1942) and has continued to evolve.
GBE focuses on the degree to which the program met its predefined goals, whether they were met on time, and, in some cases, how (or whether) goals should be changed in the future. A key strength of this model is the determination of whether results align with goals. Obviously, employing the GBE framework is only useful if the goals of the project were clear in the first place, if there is consensus about them, and if they are time bound and measurable. It is also relatively simple compared to other approaches. However, it introduces bias by disregarding consequences and unintended effects as well as the question of whether the original goals were valid. In addition, it reports findings at the end of a project, essentially eliminating the improvement benefits of formative evaluation.
To help mitigate these limitations, the goal-free evaluation (GFE) framework was designed to prevent bias by avoiding the risk of overlooking unintended results (Scriven, 1991). Instead of starting from predetermined goals, observations of actual processes and outcomes are made and used to assess all of a project’s effects, whether anticipated or not, providing a more complete profile. Scriven argued that, because almost all projects either fall short of their goals or overachieve them, evaluating predetermined goals may be a waste of time. GFE may be less costly to apply and is flexible, but it is prone to evaluator bias and less likely to intentionally assess the project goals because it is not explicitly centered on those goals. It also may demand more effort and judgment from evaluators. Although this framework seems simpler, many believe its limitations outweigh its advantages.
Occasionally the GBE and GFE are combined by having the GB and GF evaluators design and conduct their evaluations independently and then synthesize their results. Interpretation involves assessing the results, including whether conclusions support or contradict each other. Both sets of evaluators (possibly with key stakeholders) then weigh the data from both approaches to make an evaluative conclusion. A more theoretical evaluation framework is presented in the text that follows.
A logic model is a diagram that paints a picture of how a project is supposed to work by expressing the thinking behind a plan, that is, the rationale for the plan.
A logic model describes the overall inputs and the connections between project strategies and the anticipated short-term, intermediate, and long-term outcomes (Knowlton & Phillips, 2013). When implemented well, logic models often become reference points for participants by pointing them in the right direction, clarifying the strategy of the evaluation, reminding them of targeted goals and timelines for completion, explaining the project to others, and suggesting ways to organize and prepare reports.
A logic model is created by the actual team responsible for completing the evaluation and usually is depicted in a diagram for ease of understanding (Figure 6.2). This framework shows the relationship between the context (the environment) and the inputs, processes, and outcomes of an integration project. There is no “one way” to present these processes; rather, the elements are simply ordered or mapped in a logical manner with the results clearly depicted at the end. The model should be simple enough to be understood, yet contain sufficient detail so that fundamental elements are present. Usually a goal statement is listed first, followed by more specific objectives, implementation strategies, formative and summative evaluation measures, and finally the outcomes. Context or organizational factors that can affect the attainment of outcomes can be listed anywhere. Explicit links (shown by arrows or dotted lines) between components should be clearly evident to avoid confusion. When implementation processes are well written with measurable objectives, logic models essentially become a visual representation of the fundamental components of implementation with additional evaluation measures added.
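To make this ordering concrete, the brief Python sketch below represents a simplified, hypothetical logic model as a plain data structure and prints its elements in their logical order; the goal, objectives, measures, and context factors shown are illustrative assumptions, not prescribed content.

```python
# Illustrative sketch of a simplified logic model for PPM integration,
# ordered from goal to outcomes (all entries are hypothetical placeholders).

logic_model = {
    "goal": "Integrate the professional practice model into daily nursing practice",
    "objectives": [
        "90% of direct care nurses complete PPM education within 6 months",
        "Each unit adapts at least one workflow to reflect the PPM",
    ],
    "implementation_strategies": [
        "Unit-based champion program",
        "Monthly interdisciplinary practice councils",
    ],
    "formative_measures": [
        "Attendance logs for PPM education sessions",
        "Quarterly focus groups on barriers and facilitators",
    ],
    "summative_measures": [
        "Change in nurse-reported practice environment scores",
        "Patient experience scores at 12 and 24 months",
    ],
    "context_factors": [
        "Staffing levels",
        "Competing organizational initiatives",
    ],
}

# Print the elements in their logical order, mirroring the diagram.
for component, items in logic_model.items():
    print(component.replace("_", " ").title())
    for item in (items if isinstance(items, list) else [items]):
        print(f"  - {item}")
```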
A strong rationale for using theory-based evaluation is that it offers several ways to analyze the logical or theoretical consequences of a project or program, and it focuses on shared understandings of project goals and strategies, increasing the likelihood of attaining desired outcomes.
Limitations of this evaluation framework, however, include time requirements and the heavy emphasis on objective measurement, leaving less room for more qualitative data.
The Centers for Disease Control and Prevention’s (CDC; 2011) well-established process for program evaluation includes six steps:
1. Engage stakeholders (those involved in project operation, including those affected by the project).
2. Describe the process (the expected goals and objectives, strategies, resources, context, and evaluation framework).
3. Focus the design (consider the purpose, users, uses, questions, methods, and agreements).
4. Gather evidence (indicators, sources, quality, quantity, and logistics).
5. Justify conclusions (link conclusions to the evidence gathered and judge them against agreed-on implementation goals and objectives; compare conclusions to existing standards, professional literature, and similar organizations).
6. Ensure use (through reporting and dissemination methods that engage stakeholders).
This evaluation framework is similar to the plan–do–check–act cycle of the Deming method (1950) that many health systems use for performance improvement. It is simple, clear, and factual but lacks focus on the human side of change, the competence of evaluators, or leadership involvement.
Stufflebeam’s (1966) Context, Input, Process, and Product (CIPP) evaluation framework is one of the most comprehensive and widely used evaluation models. Unlike others, the CIPP framework systematically guides both evaluators and stakeholders in posing relevant questions and conducting assessments at the beginning of a project (context and input evaluation), while it is in progress (input and process evaluation), and at its end (project outcomes evaluation). This approach serves to improve and achieve accountability through a “learning-by-doing” approach (Zhang et al., 2011). It is especially relevant for guiding evaluations of programs, projects, personnel, products, institutions, and evaluation systems (Stufflebeam, 2003). The CIPP model is thorough, attends to the context, helps identify needs at the beginning of a project, monitors and documents a project’s process and potential procedural barriers, provides feedback regarding the extent to which planned activities are carried out, guides staff on how to modify and improve the program plan, assesses the degree to which participants can carry out their roles, and identifies needs for project adjustments. Finally, it measures, interprets, and judges intended and unintended outcomes and interprets the positive and negative effects the program had on its target audience (Mertens & Wilson, 2012; Zhang et al., 2011). The CIPP framework applies a combination of methodological techniques to ensure all outcomes are noted and assists in verifying evaluation findings (Stufflebeam & Coryn, 2014).
The CIPP evaluation framework advocates involving stakeholders (in fact, it suggests engaging them and seeking them out); requires evaluators to study the feasibility and potential results of a project prior to implementation; and considers the institutional context, including the impact of individual personalities and the importance of the prevailing organizational climate.
One disadvantage may be the time necessary to carry out the CIPP model. Nevertheless, it is a worthy framework for systematic evaluation of PPM integration.
Although the frameworks presented here have been well applied and remain relevant, many evaluators are beginning to question whether evaluating innovation (or change) is best conducted using these traditional methods.
The ability of a workforce to change practice in line with a professional practice model often involves constantly searching for what is working, the use of data to inform that search, and then altering approaches or changing direction as new ideas present themselves.
To do this, waiting for end-of-project outcomes reports is not seen by some as particularly valuable. Instead, newer views hold that continuous data collection, analysis, and interpretation should be used “on the fly” to inform adjustments in implementation strategies. Furthermore, rather than the evaluators communicating results to those on the front lines, direct care professionals become active participants in the evaluation process, ensuring that their real needs and those of the patient/family are recognized and addressed. In this manner, professional evaluators, project staff, clinical staff, and project beneficiaries (e.g., patients and families and sometimes members of the community) all become colleagues in an ongoing effort to integrate the PPM into clinical practice.
THE COMPONENTS OF EVALUATION
Using an established or customized framework, an evaluation plan consists of a performance indicator set; identified sources of data; data-collection methods, including appropriate samples, time frames, and logistics (who will be responsible for what); analysis; interpretation; and dissemination (the provision of feedback). Each of these components will be briefly discussed in the text that follows.
Selecting a Logical Indicator Set
Indicators are ways of measuring (indicating) that progress on a project is being achieved. Using the overall goals and objectives in the implementation plan, indicators are identified that help monitor performance and best measure the predetermined targets. Setting indicators is a complex process of deciding what best gauges whether a project has met certain goals or effected some change. For example, good indicators can point to the extent to which the implementation goals have been met or whether a certain desirable practice change has occurred. But indicators must be comprehensive in order to provide the most value.
Indicators often measure tangible outcomes, but also changes in knowledge, attitudes, and behaviors, which are often less tangible and not always easy to count. Quantitative indicators often use frequencies (e.g., number of participants in a class), scores on some instrument (e.g., the total score on the Hospital Consumer Assessment of Healthcare Providers and Systems [HCAHPS; Price et al., 2014]), and rates (e.g., infection rates), whereas qualitative indicators are generally more descriptive and tend to use interviews, focus groups, and case studies.
Qualitative indicators are helpful in understanding complex processes or relationships. For example, a focus group may help to more fully understand why participants in a certain educational course did not finish. In health care, the structure, process, and outcomes format based on the widely used Donabedian health outcomes model (Donabedian, 1966) is often applied to develop a comprehensive indicator set. Whereas structural indicators are concerned with the environment (e.g., number of resources accessible to staff), process indicators examine the approach used (e.g., the number of participants who actually attended and completed an educational program), and outcomes indicators measure the results of the implementation (e.g., increased knowledge, an example of the actual objective set in the implementation plan [hence, the advantage of making it measurable at the beginning]).
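For the rate type of quantitative indicator mentioned earlier, the short Python sketch below shows one common way such a rate is computed (events per 1,000 device-days); the counts are fabricated for illustration and are not drawn from the text.

```python
# Illustrative sketch: computing a quantitative rate indicator,
# here infections per 1,000 device-days (counts are fabricated).

infections = 4           # hypothetical infection count for the quarter
device_days = 2_750      # hypothetical total device-days for the same period

rate_per_1000 = (infections / device_days) * 1000
print(f"Infection rate: {rate_per_1000:.2f} per 1,000 device-days")
```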
Developing a comprehensive indicator set includes establishing a feasible number of structure, process, and outcomes indicators that are relevant to the project, practical enough for data collection and interpretation, and have the ability to detect change.
Ideally, a good indicator set will include both quantitative and qualitative measures that incorporate the perspectives of key stakeholders (patients and nurses). It is important to note that an indicator set must be manageable, so a reasonable number of meaningful structure, process, and outcomes indicators for each objective is better than a large set of cumbersome measures.
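One way to keep such a set manageable is to record, for each objective, a small number of structure, process, and outcome indicators along with their data sources and targets. The Python sketch below shows such a record for a single hypothetical objective; the indicator names, sources, and targets are assumptions for illustration only.

```python
# Illustrative sketch: a small structure/process/outcome indicator set for one
# hypothetical implementation objective, with data sources noted per indicator.

indicator_set = {
    "objective": "Increase nurses' knowledge of the PPM's core concepts",
    "indicators": [
        {"type": "structure", "name": "PPM resources available per unit",
         "source": "internal resource inventory", "target": ">= 1 per unit"},
        {"type": "process", "name": "Percent of nurses completing PPM education",
         "source": "education attendance records", "target": ">= 90%"},
        {"type": "outcome", "name": "Mean score on PPM knowledge test",
         "source": "post-education knowledge test", "target": ">= 85% correct"},
    ],
}

# Simple check that each indicator type is represented at least once.
types_present = {i["type"] for i in indicator_set["indicators"]}
missing = {"structure", "process", "outcome"} - types_present
print("Indicator set is balanced." if not missing else f"Missing indicator types: {missing}")
```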
Identifying Data Sources
With a comprehensive list of specific indicators developed, one can begin to identify sources of data. Primary data sources are those that are directly accessed by the evaluator and are collected specifically for the evaluation of a given project. For example, using a questionnaire to gather information about patients’ satisfaction is a primary source. Primary data are preferred because they are more reliable and objective. Secondary data are data that have already been collected and may be electronically stored. Using secondary data has advantages and disadvantages: it is easily retrievable and less labor intensive, thereby limiting expenses, but it can be prone to inaccuracies or incompleteness.

Whereas primary data can be collected through direct contact with participants using questionnaires, interview guides, or observations, secondary data are obtained through both internal and external sources. Internal secondary data are a practical source and may be more pertinent to the project at hand because those data come from within the organization implementing the PPM. For example, the National Database of Nursing Quality Indicators (NDNQI; Press Ganey, 2015) may be a secondary source that is quickly available, has been analyzed by similar systems and departments, and offers available benchmarks. Other internal data sources include billing databases and internal organizational data, such as decision support or performance improvement departments’ existing databases on patient volumes, various procedures, or demographic information. If these data are insufficient, external data sources may be helpful. Although a little more cumbersome to access, external data can provide timely and rich information. For example, federal government databases (such as those of the U.S. Census Bureau, the CDC, or the Agency for Healthcare Research and Quality [AHRQ]), local state health statistics, state health workforce surveys, pharmaceutical registries, and professional organization databases offer excellent information that is often underutilized.
For each indicator developed, an appropriate data source or sources should be listed. Whatever measures are chosen (a combination of measures is best), there are several considerations to keep in mind. First, existing data may already be available and can be accessed relatively easily. It is important not to interrupt the routine clinical workflow during data collection, to prevent negative feelings that might sabotage the process. Second, because this is an evaluation of a specific project, there may be a need for specialized training or consultative expertise. For example, those new to primary data collection may introduce bias when administering questionnaires simply by their presence or verbal responses. Obtaining guidance from a good evaluation consultant and ensuring the appropriate training of data collectors are musts. Third, the validity and reliability of quantitative instruments must be ensured.
Validity refers to the accuracy or truthfulness of a measure such that interpretations can be supported, and it is usually reported in the literature (e.g., content validity, construct validity). Reliability, on the other hand, refers to dependability or consistency. For example, does a particular questionnaire yield the same scores when administered at different times? Depending on the nature of the evaluation, varying reliability estimates should be used. For example, if administering the same survey twice, test–retest reliability may be applied. Or, if multiple individuals are conducting the same observations, interrater reliability may be more appropriate. Examining the validity and reliability evidence for each measure used in the evaluation allows the evaluators to justify the use of that measure for the project, ultimately ensuring credibility of findings, which is necessary for effectively reporting evaluation results. If an existing questionnaire from the literature (e.g., a pain instrument) is used, retrieving the validity and reliability estimates from the literature and documenting them for future use is beneficial. If a survey is designed by the organization to meet its unique needs (a home-grown instrument), pilot testing for validity/reliability prior to actual evaluation is essential. Those responsible for evaluation of the implementation process can make preliminary recommendations, but seeking feedback from others, including patients and direct care nurses, is helpful. Choosing the right mechanism or instrument to measure progression and/or success of PPM integration is vital. Likewise, if an instrument will be used multiple times over the course of an implementation project, ensuring that the process is consistently applied each time is crucial. Finally, in some cases, permissions and informed consents may be warranted.
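As a hedged illustration of the two reliability estimates just mentioned, the Python sketch below computes a test–retest correlation (Pearson r) and a simple interrater agreement statistic (Cohen’s kappa) from small, fabricated data sets, assuming the SciPy and scikit-learn libraries are available.

```python
# Illustrative sketch: estimating test-retest reliability (Pearson r) and
# interrater reliability (Cohen's kappa) from small, fabricated example data.
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

# Test-retest: the same 8 nurses complete a hypothetical PPM attitude survey
# twice, two weeks apart; a higher correlation suggests more stable scores.
time1 = [42, 37, 50, 45, 39, 48, 41, 44]
time2 = [40, 38, 49, 46, 37, 47, 43, 45]
r, _ = pearsonr(time1, time2)
print(f"Test-retest reliability (Pearson r): {r:.2f}")

# Interrater: two observers rate the same 10 care episodes against an
# observation checklist (1 = behavior present, 0 = absent).
rater_a = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Interrater reliability (Cohen's kappa): {kappa:.2f}")
```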
Several options exist for identifying data sources and some examples are listed as follows:
• Surveys/questionnaires
• Interviews
• Focus groups
• Observation checklists and/or videos
• Tests of knowledge
• Existing data—records and databases
• Key informants
• External peer review
• Case studies
• Diaries/journals
• Logs/meeting minutes/participation lists