Evaluation of Outcomes





Knowing the results of healthcare services is key to demonstrating their value; likewise, measuring those results, or outcomes, is critical to stimulating improvement in healthcare structures and processes. Today, outcome measurement has become not only necessary but also an expectation among all healthcare service stakeholders, yet methods that support measurement alongside program improvement remain complex, especially compared with the relative ease of measuring structure and process.

A number of measures constitute health outcomes, including traditional measures such as those focused on disease morbidity and mortality, as well as health economic measures, quality of life, functional status, and stakeholder satisfaction with healthcare services. Some of the earliest attempts to measure outcomes were made by Florence Nightingale in the 1850s when she monitored the impact of interventions aimed at improving survival rates during the Crimean War (Nightingale, 1863). Attempts to evaluate healthcare can increasingly be seen in the literature since that time, but the work of Donabedian (1966) describing structure, process, and outcome as the three key monitoring areas supporting health systems improvement significantly advanced the field of quality management.

Definitions of what constitutes healthcare quality likely differ depending on the stakeholder, but most would probably identify quality healthcare as that which provides safe, accessible, evidence-based health services shown to ensure optimal health, functional status, and wellness, delivered at an acceptable cost. Regardless of how it is defined, the evaluation of quality always starts with identification of an outcome target; methods are then proposed and developed, and systems built or augmented in a manner that supports achievement of the outcome. The implementation, and ultimately the evaluation, of these processes and structural systems complete a quality loop. Whether the outcome is advanced practice registered nurse (APRN)–role specific (e.g., position turnover; role satisfaction) or associated with a medical diagnosis (e.g., complication rate; disease severity), the process remains the same, enabling healthcare practitioners not only to understand their impact but also to clarify the contribution of implemented processes and systems of care, including the need for further improvement. APRNs must embrace outcome measurement alongside evaluation of process and structural system components, using it as a potent method to demonstrate the value of their services and to provide ongoing feedback that incentivizes improvement in both systems of care and the processes used.


Health Outcomes Research

Health outcomes research evaluates the results of specific healthcare treatments or interventions (Agency for Healthcare Research and Quality [AHRQ], 2020). The conduct of health outcomes research requires experimental or mixed methods designs with the intent to understand differences between two clinical approaches to a problem. Interventions tested span a broad range, from new structures that deliver an existing standard of care in a new setting, to new roles, to varied care processes aimed at improving outcomes.

Efficacy and Effectiveness Research

Randomized controlled trials (RCTs) test either efficacy or effectiveness of an intervention. Classically, patients are randomized to either receive an intervention, or to become part of the control group that is managed in the currently acceptable manner. Differences in the targeted outcome are then analyzed to determine whether the intervention is efficacious or effective depending on the phase of intervention testing. Efficacy testing is conducted within the context of what is commonly referred to as a phase 3 RCT, where the potential benefits of an intervention are studied under ideal, highly controlled conditions (Piantadosi, 2017; U.S. Food and Drug Administration [FDA], 2020). In comparison, effectiveness testing or phase 4 research refers to testing the intervention in a real-world, less controlled setting, often using what is referred to as an historical control group from a previous efficacy RCT (Hulley et al., 2013; Piantadosi, 2017; FDA, 2020). The rationale for use of an historical control group is that if an intervention has been found to be efficacious in previous phase 3 research, it may be unethical to withhold its use in a phase 4 effectiveness trial; therefore, the historical control serves as the group for comparison instead of randomizing patients to a control arm where they may be significantly and unethically disadvantaged (Piantadosi, 2017). The findings from effectiveness research are much more generalizable to a variety of patients, practitioners, and clinical settings because of its implementation within less strictly controlled circumstances (Piantadosi, 2017). As more and more nurse researchers are being prepared as clinical trialists, the future will likely hold findings from numerous efficacy and effectiveness RCTs led by APRN investigators.

Comparative Effectiveness Research

Comparative effectiveness research is conducted to test use of an intervention that has already been found to be effective, but that will now be delivered in a new manner (Piantadosi, 2017). For example, treatment with intravenous tissue plasminogen activator (tPA) has been found to be both efficacious and effective in the treatment of acute ischemic stroke patients within 4.5 hours of symptom onset. Treatment with tPA is classically given within the walls of an acute care hospital and has been studied extensively in these settings. The emergence of mobile stroke units (ambulances with computed tomography scanning capabilities) provides the opportunity for a comparative effectiveness study of tPA treatment in the field compared to that provided in the hospital. Similarly, tPA treatment is classically provided by physicians after clinical examination and review of neuroimaging; however, use of APRNs to diagnose stroke and prescribe/administer tPA is increasing dramatically due to a shortage of vascular neurologists. Therefore, a comparison of APRN-prescribed to physician-prescribed tPA provides another example of a comparative effectiveness research project. While these examples pertain strictly to acute stroke, APRNs are encouraged to think of how they might design comparative effectiveness research that could support role expansion and the discovery of better methods to support optimal patient outcomes.

Risk and/or Severity Adjustment

Risk adjustment refers to accounting or controlling for patient-specific factors that may impact outcomes (Alexandrov et al., 2019; Duncan, 2018). Risk adjustment “levels the playing field,” so to speak, in an effort to determine the efficacy or effectiveness of an intervention in patients who may inherently carry different levels of risk, for example, a young, otherwise healthy athlete being treated for a sports-related extremity fracture versus an elderly patient with multiple comorbidities and disabilities being treated for the same extremity fracture. Without risk adjustment, inaccurate conclusions might be drawn when treatment outcomes are poor, when in reality the cause might be inherent patient factors. There are varying methods of risk adjustment, and selecting the most appropriate method depends on which outcomes are being evaluated, the time period under study, the study population, and the purpose of the evaluation (Duncan, 2018). The most common variables used for risk adjustment include age, severity of illness, and comorbid conditions (Alexandrov et al., 2019; Duncan, 2018).
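To make the idea concrete, one common risk-adjustment technique is indirect standardization: each patient's expected probability of an event is taken from a risk model built on variables such as age and comorbid conditions, and a caseload's observed events are then compared to the expected count as an observed-to-expected (O/E) ratio. The risk model, probabilities, and patient records below are entirely hypothetical, a minimal sketch rather than any validated instrument:

```python
# Illustrative risk adjustment via indirect standardization.
# A (hypothetical) risk model assigns each patient an expected
# complication probability from age and comorbidity count; the
# observed-to-expected (O/E) ratio then compares actual events to
# the count predicted by the patient mix.

def expected_risk(age, comorbidities):
    """Toy risk model: baseline risk plus increments for age and comorbidity."""
    risk = 0.02                                # baseline complication probability
    risk += 0.01 * max(0, (age - 50) // 10)    # +1% per decade over age 50
    risk += 0.03 * comorbidities               # +3% per comorbid condition
    return min(risk, 0.95)

def oe_ratio(patients):
    """patients: list of (age, comorbidities, had_complication) tuples."""
    observed = sum(1 for _, _, event in patients if event)
    expected = sum(expected_risk(age, cm) for age, cm, _ in patients)
    return observed / expected

# Two caseloads with the SAME raw complication rate (1 in 4):
young = [(25, 0, False), (30, 0, False), (28, 0, True), (22, 0, False)]
older = [(78, 3, False), (82, 4, True), (75, 2, False), (80, 3, False)]

print(round(oe_ratio(young), 2))  # 12.5: far more events than patient mix predicts
print(round(oe_ratio(older), 2))  # 1.85: near expectation despite identical raw rate
```

The identical unadjusted rates mask very different performance once patient mix is considered, which is exactly the "level playing field" the technique is meant to provide.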


Evolution of Outcome Evaluation

As briefly described, Florence Nightingale (1863) is credited with much of the early work in medicine tied to outcome evaluation, with scrutiny of processes and structures when results were deemed unacceptable or in need of improvement. Her exceptional influence on the development of epidemiology as well as quality improvement is seen in early 20th-century England's embrace of similar methods led by Emory W. Groves (1908), a physician who championed outcome measurement. Dr. Groves's (1908) early work demonstrated variability in surgical mortality, fostering interest in understanding what should be considered “acceptable” rates of death, alongside standardization of medical approaches to diseases so that like comparisons could be made between providers and hospitals.

Interestingly, at the time of Dr. Groves's (1908) work in England, practitioners in the United States held a rather fatalistic view of poor medical outcomes, believing that they were not linked to system structure, processes of care, or practitioner abilities, but instead were due to factors beyond human control. Massive social reform dominated the late 19th and early 20th centuries in the United States, with significant public administrative growth along with required performance auditing. Deficits in medical education were made public during this time with the publication of the Flexner Report of 1910, which cited poor medical school educational standards and ultimately led to a retooling of American medical education (Fee, 1982). Nurse licensure formalized requirements for nursing education, mandating preliminary education, professional training in nursing, licensing examinations, and formal state registration (Wyatt, 2019). While these measures acted to improve structural healthcare quality, it wasn't until 1914 that outcome evaluation formally emerged in the United States, when physician Ernest Codman publicly recommended 12-month postoperative patient follow-up as a measure of surgical success (Codman, 1914). In 1917, Dr. Codman articulated this concept as the End-Result Idea, advocating for measurement and public reporting of results (outcomes) so that hospitals would improve the medical care delivered at their facilities (Codman, 1917). Sadly, Dr. Codman's End-Result Idea was viewed as heretical and dangerous, stifling work on the science of outcome measurement and limiting health quality improvement primarily to structural and process description until the work of Donabedian.


In 1966, Avedis Donabedian detailed descriptions of structure, process, and outcome measures and took up the torch for development of criteria to evaluate outcomes. In 1980, he described structure as stable attributes within the care setting, for example, staffing and available equipment; processes as interventions used by healthcare professionals, including how skillfully those interventions were executed; and outcomes as the resultant change in health status directly attributable to care structure and process (Donabedian, 1980). Donabedian (1980) identified that outcomes must be assessed in relation to both structure and process variables to provide a thorough understanding of how and why the outcome occurred. Figure 18.1 illustrates Donabedian's (1980) concepts, showing the inter-relatedness of structure, process, and outcome, with outcomes the result of process and structural interactions. A shortcoming of the Donabedian (1980) model is its exclusion of individual patient factors or characteristics that may alter expected results, for example, inherent susceptibility to different outcomes due to significant comorbidities, or health disparities affecting access to services. Although Donabedian's (1980) work defined the important relationship among structure, process, and outcome, practitioners rarely embraced outcome measurement in routine clinical practice until the 1990s, when it became an expectation.

Ellwood’s Outcomes Management

In 1988, Dr. Paul Ellwood published his definition of Outcomes Management (OM), describing it as “. . . a technology of patient experience designed to help patients, payers, and providers make rational medical care-related choices based on better insight into the effect of these choices on patient life” (p. 1549). The suggestion that knowledge of healthcare outcomes could drive patient, payer, and provider healthcare choices revolutionized contemporary medicine and nursing care at a time in the United States when hospital payment had become capitated with the emergence of Diagnosis-Related Groups (DRGs). Ellwood's (1988) landmark paper prophetically predicted a dramatic shift in healthcare stakeholder engagement well before the launch of consumer internet access, including requirements such as public reporting and standardized healthcare quality core measures that could be tied to reimbursement in association with the results produced.

FIGURE 18.1 The Inter-relatedness of Donabedian's Structure and Process Concepts in the Production of Outcomes.

Ellwood (1988) suggested four essential principles for inclusion in an OM program:

  1. An emphasis on standards that providers can use to select appropriate interventions;
  2. The measurement of patient functional status and well-being, along with disease-specific outcomes;
  3. A pooling of outcome data on a massive scale; and
  4. The analysis and dissemination of the database to appropriate decision makers.

Today, Ellwood’s (1988) principles support a number of initiatives that are now standard requirements for health systems, including:

Use of evidence-based guidelines developed by scientific and professional practice agencies to support delivery of optimal healthcare;

Use of valid and reliable tools to capture such factors as disease severity, disability, functional status, health consumer perceptions of service quality or satisfaction, and quality of life, with many of these developed to validly capture patient outcomes within a discrete population or diagnosis of interest;

Use of large national registries to capture standardized payer-required core measures for pooling and analyses; and,

Benchmarking of provider and/or institutional performance, along with publicly reported data.

Following publication of Ellwood's (1988) paper, the Outcomes Management Model (Figure 18.2) emerged, demonstrating four key steps in the measurement of outcomes using an effectiveness research approach (Wojner, 2001). The first of these steps centered on identification of outcome targets, along with other important variables theoretically linked to the outcome, including those that could confound outcome attainment; this process lent itself to construction of an outcome measurement repository or database. The second step included review of the evidence supporting current and new or evolving practice processes, with collaborative agreement reached across all providers/stakeholders on an evidence-supported, standardized approach to patient management. This effort resulted in the development of structured care methods consisting of care maps/pathways, order sets, decision-support algorithms, policies, and procedures to assist provider compliance with the newly agreed upon approach to patient management. The third step involved implementing the new care methods, with time allowed for care processes to stabilize; once the new practices were stable, data collection would begin to capture structure, process, and outcome data. The last phase involved data analysis with interdisciplinary review of findings. This phase led to agreed upon next steps/changes to further outcome improvement, often recycling back to phase two, where new approaches or further refinement of existing processes could be agreed upon and standardized for a subsequent phase of testing and evaluation (Wojner, 2001).
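The repository built in the model's first step can be sketched as a simple record structure that groups variables by Donabedian's three domains plus the confounders identified for risk adjustment. The field names and stroke-flavored example values below are illustrative assumptions, not part of Wojner's (2001) model:

```python
from dataclasses import dataclass

@dataclass
class OutcomeRecord:
    """One patient's entry in a hypothetical outcomes repository,
    with variables grouped by Donabedian's three domains plus
    potential confounders identified in step one."""
    patient_id: str
    # Structure: stable attributes of the care setting
    care_setting: str                  # e.g., "ED", "mobile stroke unit"
    # Process: what was done, and how faithfully
    protocol_followed: bool
    door_to_treatment_min: int
    # Outcome: resultant change in health status
    discharge_functional_score: int    # higher = better function
    # Confounders for risk adjustment
    age: int
    comorbidities: int

def process_fidelity(records):
    """Share of cases managed per the standardized approach,
    one of the process measures captured in steps three and four."""
    return sum(r.protocol_followed for r in records) / len(records)

repository = [
    OutcomeRecord("A1", "ED", True, 52, 85, 61, 1),
    OutcomeRecord("A2", "ED", False, 88, 60, 74, 3),
    OutcomeRecord("A3", "mobile stroke unit", True, 35, 90, 58, 0),
]
print(process_fidelity(repository))  # 2 of 3 cases followed the protocol
```

Capturing structure, process, outcome, and confounder variables in one record is what later allows the fourth step's analysis to relate outcomes back to the processes and structures that produced them.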

Models for Outcome Evaluation

When beginning an outcome measurement project, APRNs must start by considering the model that will best frame their efforts. Numerous methods to support outcome evaluation have emerged over time, and each contains a similar approach to the measurement of outcomes, although the Outcomes Management Model depicts the process as effectiveness research (Wojner, 2001) while most other models adhere to a quality improvement structure. For example, the Plan-Do-Study-Act (PDSA) model starts with identification of the quality (outcome) target and planning, moves to standardization of an approach to care, followed by collection of performance data, and last, examination of the results; findings often require recycling through the process as methods are refined and further improved, until the targeted outcome is achieved. PDSA is a popular quality model for guiding the work of outcome evaluation under the authority of local quality improvement initiatives and extends from the work of Deming (1986), who first developed it as the Plan-Do-Check-Act (PDCA) framework. Six Sigma is another performance improvement model supported by similar elements and a cyclic methodology: Define-Measure-Analyze-Improve-Control (Council for Six Sigma Certification, 2018).

FIGURE 18.2 Wojner's Outcomes Management Model.

Source: Reprinted with permission from the Health Outcomes Institute, LLC-P, Fountain Hills, Arizona.

Selection of a model to support the work of outcome evaluation is largely dependent on the purpose. If the APRN intends to disseminate findings as effectiveness research in a publication or presentation, the Outcomes Management Model may provide the most appropriate framework for the project, including the attainment of local Ethics or Institutional Review Board approval (Alexandrov et al., 2019; Wojner, 2001). Conversely, if the project is only intended for internal quality improvement without external dissemination, selection of a quality model may be the best option. Table 18.1 provides elements that the APRN should consider in determining whether the project should be classified as effectiveness research or quality improvement.


TABLE 18.1 Effectiveness Research Versus Quality Improvement

Project purpose

Quality improvement:
  • Focuses on a specific local performance gap compared to the existing standard of care
  • Focus is to improve a specific aspect of health or healthcare delivery that is not consistently and appropriately being implemented

Effectiveness research:
  • Identifies a specific deficit in scientific knowledge in the literature and practice
  • Proposes to address specific research questions or test hypotheses to develop new knowledge or advance existing knowledge

Intended methods

Quality improvement:
  • Focuses on the implementation of only scientifically acceptable evidence-based practice changes
  • Implementation methods may be staged and sequential over time based on feedback received, with the intent of tailoring the process to local resource availability
  • Intended analyses focus on structural and process changes alongside outcome measurement

Effectiveness research:
  • Focuses on the implementation of either a new/novel untested intervention, or testing of effectiveness of an intervention shown to be efficacious in another sample
  • Protocol defines the intervention and its implementation in detail, along with a case report form that enables collection of patient characteristics and demographics, process fidelity, structural standardization, and outcome data using valid and reliable methods and instruments
  • Analytical strategies allow comparison between groups using experimental or mixed methods in relation to the intervention(s) being tested

Risk–benefit analysis

Quality improvement:
  • The provider–patient relationship is not altered
  • Benefit to patients and other stakeholders is well established scientifically
  • Institutional benefit is specified (e.g., improved efficiency or cost per case)
  • Participant risk is associated with nonparticipation

Effectiveness research:
  • Requires Ethics or Institutional Review Board approval
  • Use of personal health information extends beyond the usual provider–patient relationship
  • Benefit to participants or the institution may be unknown
  • Risks to participants may occur and must be disclosed
  • Written informed consent may be required

Generalizability

Quality improvement:
  • Generalizability to other settings is unlikely and is not the main intent

Effectiveness research:
  • Results may be generalizable depending on the sample, methods, and limitations cited