14
CLINICAL EVALUATION METHODS
After establishing a framework for evaluating students in clinical practice and exploring one’s own values, attitudes, and biases that may influence evaluation, the teacher identifies a variety of methods for collecting data on student performance. Clinical evaluation methods are strategies for assessing students’ performance in clinical practice. That practice may be with patients in hospitals and other healthcare facilities, in communities, in simulation and learning laboratories, and in virtual environments. Some evaluation methods are most appropriate for use by clinical educators or preceptors who are on site with students and can observe their performance; other evaluation methods assess students’ knowledge, cognitive skills, and other competencies but do not involve direct observation of their performance.
There are many evaluation methods for use in nursing education. Some methods, such as reflective writing assignments, are most appropriate for formative evaluation, whereas others are useful for either formative or summative evaluation. In this chapter, varied strategies are presented for evaluating clinical performance.
Selecting Clinical Evaluation Methods
There are several factors to consider when selecting clinical evaluation methods to use in a course. First, the evaluation methods should provide information on student performance of the clinical competencies associated with the course. With the evaluation methods, the teacher collects data on performance to judge whether students are developing the clinical competencies or have achieved them by the end of the course. For many outcomes of a course, there are different strategies that can be used, thereby providing flexibility in choosing methods for evaluation. Most evaluation methods provide data on multiple outcomes. For example, a written assignment in which students compare two different data sets might relate to outcomes on assessment, analysis, and writing. In planning the evaluation for a clinical course, the teacher reviews the outcomes or competencies to be developed and decides which evaluation methods will be used for assessing them, recognizing that most methods provide information on more than one outcome or competency.
In clinical courses in nursing programs, students are typically evaluated on the outcomes of clinical practice, as identified in Exhibit 13.1. These relate to students’ knowledge; use of evidence in practice; higher level thinking and clinical judgment; psychomotor, technical, and informatics competencies; communication, collaboration, and teamwork skills; values and professional behaviors; quality and safety competencies; leadership skills; responsibility; and self-assessment and development. Some of these competencies are easier to assess than others, but all aspects should be addressed in the evaluation process. Because of the breadth of competencies students need to develop, multiple strategies should be used for assessment in clinical courses.
Second, there are many different clinical evaluation strategies that might be used to assess performance. Varying the methods takes into account individual needs, abilities, and characteristics of learners. Some students may be more proficient in methods that depend on writing, whereas others may demonstrate their learning better in conferences and through discussions. Planning for multiple evaluation methods in clinical practice, as long as they are congruent with the outcomes to be evaluated, reflects these differences among students. It also avoids relying on one method, such as a rating scale, for determining the entire clinical grade.
Third, the teacher should always select evaluation methods that are realistic given the number of students to be evaluated and the practice opportunities available (in the clinical setting or simulation). Planning for an evaluation method that depends on patients with specific health problems or on particular clinical situations may not be realistic, given the types of experiences with actual patients available to students; in such cases, a simulation or the use of standardized patients would be more appropriate. Some methods are not feasible because of the number of students who would need to use them within the time frame of the course. Others may be too costly or require resources not available in the nursing education program or healthcare setting.
Fourth, evaluation methods can be used for either formative or summative evaluation. In the process of deciding how to evaluate students’ clinical performance, the teacher should identify whether the methods will be used to provide feedback to learners (formative) or for grading (summative). With formative clinical evaluation, the focus is on students’ progress in meeting the learning goals. At the end of the practicum, course, or semester, summative evaluation establishes whether the student met those goals and is competent (Oermann, 2016). In clinical practice, students should know ahead of time whether the assessment by the teacher is for formative or summative purposes. Some of the methods designed for clinical evaluation provide feedback to students on areas for improvement and should not be graded. Other methods, such as rating scales and written assignments, can be used for summative purposes and therefore can be counted in the course or clinical grade.
Fifth, before finalizing the protocol for evaluating clinical performance in a course, the teacher should review the purpose of each assignment completed by students in clinical practice and decide how many assignments will be required in the course. What are the purposes of these assignments, and how many are needed to demonstrate competency? In some clinical courses, students complete an excessive number of written assignments. How many assignments, whether formative or summative, are needed to meet the outcomes of the course? Students benefit from continuous feedback from the teacher, not from repetitive assignments that contribute little to their development of clinical knowledge and skills. Rather than completing daily or weekly care plans or other repetitive assignments, which may not even be consistent with current practice, students who have developed the competencies can progress to other, more relevant learning activities.
Sixth, in deciding how to evaluate clinical performance, the teacher should consider the time needed to complete the evaluation, provide feedback, and grade the assignment. Instead of requiring a series of written assignments in a clinical course, the same outcomes might be met through discussions with students, case analyses in postclinical conference, group writing activities, and other methods that require less teacher time yet accomplish the same purposes. Given the demands on nurse educators, it is important for teachers to consider their own time when planning how to evaluate students’ performance in clinical practice.
The rest of the chapter presents clinical evaluation methods for use in nursing education programs. Some of these methods, such as written assignments, were examined in earlier chapters.
Observation
The predominant strategy used to evaluate clinical performance is observing students in clinical practice, simulation and learning laboratories, and other settings. Although observation is widely used, there are threats to its validity and reliability. First, observations of students may be influenced by the teacher’s values, attitudes, and biases, as discussed in Chapter 13, Clinical Evaluation Process. Altmiller (2016) emphasized that feedback about performance should be an unbiased reflection of observations and events. There also may be overreliance on first impressions, which might change as the teacher or preceptor observes the student over time and in different situations. In any performance assessment, a series of observations should be made before conclusions about performance are drawn.
Second, in observing performance, there are many aspects of that performance on which the teacher may focus attention. For example, while observing a student administer an intravenous (IV) medication, the teacher may focus mainly on the technique used for its administration, ask limited questions about the purpose of the medication, and make no observations of how the student interacts with the patient. Another teacher observing this same student may focus on those other aspects. The same practice situation, therefore, may yield different observations.
Third, the teacher may arrive at incorrect judgments about the observation, such as inferring that a student is inattentive during conference when in fact the student is thinking about the comments made by others in the group. It is important to discuss observations with students, obtain their perceptions of their behavior, and be willing to modify one’s own inferences when new data are presented. In discussing observations and impressions with students, the teacher can learn about their perceptions of performance; this, in turn, may provide additional information that influences the teacher’s judgment about competencies.
Fourth, every observation in the clinical setting reflects only a sampling of the learner’s performance during a clinical activity. An observation of the same student at another time may reveal a different level of performance. The same holds true for observations of the teacher; on some clinical days and for some classes, the teacher’s behaviors do not represent a typical level of performance. An observation of the same teacher during another clinical activity and class may reveal a different quality of teaching.
Finally, similar to other clinical evaluation methods, the outcomes or competencies guide the teacher on what to observe. They help the teacher focus the observations of performance. All observations should be shared with the students.
Notes About Performance
It is difficult if not impossible to remember the observations made of each student for each clinical activity. For this reason, teachers need a strategy to help them remember their observations and the context in which the performance occurred. There are several ways of recording observations of students in clinical settings, simulation and learning laboratories, and other settings, such as notes about performance, checklists, and rating scales. These are summarized in Table 14.1.
The teacher can make notes that describe the observations made of students in the clinical setting; these are sometimes called anecdotal notes. Some teachers include only a description of the observed performance and then, after a series of observations, review the pattern of the student’s performance, whereas others include a judgment or impression with each observation. Notes about observations of performance should be recorded as close to the time of the observation as possible; otherwise, it is difficult to remember what was observed and the context, for example, the patient and clinical situation, of that observation. In a study of nursing programs in both the United States and Canada, 92.4% of clinical faculty who were teaching prelicensure students used anecdotal notes for keeping records of their observations of students (Hall, 2013). Notes can be handwritten or recorded on smartphones, tablets, or other types of portable devices, and then shared with students.
Notes should be shared with students as frequently as possible; otherwise, they are not effective for feedback. In a study by Quance (2016), students reported that they preferred to have this feedback before their next clinical experience. Considering the issues associated with observations of clinical performance, the teacher should discuss observations with the students and be willing to incorporate the students’ own judgments about the performance. Notes about performance also are useful in conferences with students, for example, at midterm and at the end of the term, as a way of reviewing a pattern of performance over time. When there are sufficient observations about performance, the notes can serve as documentation for ratings on the clinical evaluation tool.
TABLE 14.1 METHODS FOR RECORDING OBSERVATIONS OF CLINICAL PERFORMANCE

| Method | Description |
| --- | --- |
| Notes about performance | Used for recording descriptions of observations made of students in the clinical setting, simulation, and other learning activities in which clinical nurse educators, preceptors, and others observe performance. May also include interpretations or conclusions about the performance. Often referred to as anecdotal notes. |
| Checklists | Used primarily for recording observations of specific competencies, procedures, and skills performed by students; includes a list of behaviors to demonstrate competency and the steps for carrying out the procedure or skill. May also include errors in performance to check. |
| Rating scales | Used for recording judgments about students’ performance in clinical practice. Includes a set of defined clinical outcomes or competencies and a scale for rating the degree of competence (with multiple levels or pass–fail). |
Checklists
A checklist is a list of specific behaviors or actions to be observed with a place for marking whether or not they were present during the performance (Brookhart & Nitko, 2019). A checklist often lists the steps to be followed in performing a procedure or demonstrating a skill. Some checklists also include errors in performance that are commonly made. Checklists not only facilitate the teacher’s observations, but they also provide a way for learners to assess their own performance. With checklists, learners can review and evaluate their performance prior to assessment by the teacher.
Checklists are used frequently in healthcare settings to assess skills of nurses and document their continuing competence in performing them. They also are used to assess performance in simulations. Many checklists and tools have been developed for evaluating the performance of students, nurses, and other health professionals in simulations. When skills are assessed in an objective structured clinical examination (OSCE) or by using standardized patients, checklists are often included to guide observations of performance of those skills.
For common procedures and skills, teachers can find checklists already prepared that can be used for evaluation, and some nursing textbooks have accompanying skills checklists. When these resources are not available, teachers can develop their own checklists. However, there should be some consistency among faculty members in expectations of skill performance and in the checklist used for evaluation. Kardong-Edgren and Mulcock (2016) found multiple variations in skills checklists, instructional practices, and expectations of performance for their sample skill (inserting a Foley catheter) across the prelicensure program, making it difficult for students to learn the skill. To develop a checklist, the teacher first reviews the procedure or competency to identify its steps and the critical elements of its performance. The checklist should list the steps in order and include a scale for designating whether the student completed each step using the correct procedure. Generally, a yes–no scale is used.
In designing checklists, it is important not to include every possible step, which makes the checklist too cumbersome to use, but to focus instead on critical actions and where they fit into the sequence. The goal is for students to learn how to perform a procedure and use technology safely. When there are different ways of performing a procedure, the students should be allowed that flexibility when evaluated. Exhibit 14.1 provides an example of a checklist.
EXHIBIT 14.1 SAMPLE CHECKLIST
Student Name ______________________________________________________________
Instructions to teacher: Observe the student performing the following procedure and check only those steps that the student performed properly. After completing the checklist, discuss the performance with the student, reviewing aspects of the procedure to be improved.
IV Medication Administration
Checklist:
• Checks provider’s order.
• Checks medication administration record.
• Adheres to rights of medication administration.
• Assembles appropriate equipment.
• Checks compatibility with existing IV if present.
• Explains procedure to patient.
• Positions patient appropriately.
• Checks patency of administration port or line.
• Administers medication at proper rate and concentration.
• Monitors patient response.
• Flushes tubing as necessary.
• Documents IV medication correctly.
IV, intravenous.
Rating Scales
Rating scales, also referred to as clinical evaluation tools or instruments, provide a means of recording judgments about the observed performance of students in clinical practice. A rating scale has two parts: (a) a list of outcomes or competencies the student is to demonstrate in clinical practice and (b) a scale for rating the student’s performance of them.
Rating scales are most useful for summative evaluation of performance; after observing students over a period of time, the teacher arrives at conclusions about performance, rating it according to the scale provided with the tool. They also may be used to evaluate specific activities that the students complete in clinical practice, for example, rating a student’s presentation of a case in clinical conference. Other uses of rating scales are to (a) help students focus their attention on important competencies to be developed, (b) give specific feedback to students about their performance, and (c) demonstrate growth in clinical competencies over a designated time period if the same rating scale is used. Rating scales also are used to assess performance in simulations. In simulations, the goal of assessment is generally formative, providing feedback to students on their judgments and actions taken in the simulation. However, simulations can also be used for high-stakes evaluation, determining students’ achievement of end-of-program competencies (Bensfield, Olech, & Horsley, 2012; Oermann, Kardong-Edgren, & Rizzolo, 2016; Rizzolo, Kardong-Edgren, Oermann, & Jeffries, 2015). Using simulations for high-stakes evaluation (in which students must pass the simulation to pass the course) is described in Chapter 15, Simulation and Objective Structured Clinical Examinations for Assessment.
The same rating scale can be used for both a midterm evaluation (documenting students’ progress in developing the competencies) and the final evaluation (documenting that they can safely perform them). Exhibit 14.2 shows sample competencies from a rating scale that can be used midway through a course and for the final evaluation.
Types of Rating Scales
Many types of rating scales are used for evaluating clinical performance. The scales may have multiple levels for rating performance, such as 1 to 5 or exceptional to below average, or have two levels, such as pass–fail or satisfactory–unsatisfactory. Types of scales with multiple levels for rating performance include
• Letters: A, B, C, D, E or A, B, C, D, F
• Numbers: 1, 2, 3, 4, 5
• Qualitative labels: Excellent, very good, good, fair, and poor; exceptional, above average, average, and below average
• Frequency labels: Always, often, sometimes, and never
EXHIBIT 14.2 SAMPLE COMPETENCIES FROM RATING SCALE
Student Name ___________ Faculty Name __________ Date __________
Some instruments for rating clinical performance combine different types of scales, for example, rating performance of competencies on a scale of 1 to 4 based on the students’ independence in practice and their knowledge, skills, and attitudes. In one school of nursing, a grade is then generated from the ratings (Altmiller, 2017).
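To make the idea of generating a grade from ratings concrete, the sketch below averages ratings on a 1-to-4 scale and maps the result to a letter grade. This is a hypothetical illustration only, not the scheme reported by Altmiller (2017); the competency names, averaging rule, and grade cut-offs are all invented for the example.

```python
# Hypothetical sketch only: one way a clinical grade could be generated
# from competency ratings on a 1-4 scale. The competency names, averaging
# rule, and grade cut-offs are invented for illustration.

ratings = {
    "Collects relevant data from patient": 4,
    "Uses evidence in planning care": 3,
    "Communicates with the healthcare team": 4,
    "Performs procedures safely": 3,
}

average = sum(ratings.values()) / len(ratings)  # 3.5 on the 4-point scale
percentage = average / 4 * 100                  # 87.5%

# Assumed cut-offs; each school defines its own grading scale.
cutoffs = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
grade = next((letter for cut, letter in cutoffs if percentage >= cut), "F")

print(f"average rating {average:.2f} -> {percentage:.1f}% -> grade {grade}")
```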
A short description included with the letters, numbers, and labels for each of the outcomes or competencies rated improves objectivity and consistency (Brookhart & Nitko, 2019). For example, if the tool uses a scale with numbers, short descriptions should be written to clarify the performance expected at each level. For the competency “Collects relevant data from patient,” the descriptors might be:
4: Differentiates relevant from irrelevant data, analyzes multiple sources of data, establishes comprehensive database, identifies data needed for evaluating all possible patient problems.
3: Collects significant data from patients, uses multiple sources of data as part of assessment, identifies possible patient problems based on the data.
2: Collects significant data from patients, uses data to develop main patient problems.
1: Does not collect significant data and misses important cues in data; unable to explain relevance of data for patient problems.
Many rating scales for clinical evaluation have only two levels: pass–fail or satisfactory–unsatisfactory. A survey of nursing faculty from all types of programs indicated that most faculty members (n = 1,116; 83%) used pass–fail or satisfactory–unsatisfactory in their clinical courses (Oermann, Saewert, Charasika, & Yarbrough, 2009). It is generally easier and more reliable for teachers to rate performance as either satisfactory or unsatisfactory (or pass–fail) rather than differentiating performance according to 4 or 5 levels of proficiency.
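To illustrate why fewer scale levels tend to yield more consistent ratings, the sketch below computes simple percent agreement between two educators who independently rate the same ten students pass–fail. The ratings are invented, and a real analysis would typically add a chance-corrected statistic such as Cohen’s kappa.

```python
# Hypothetical sketch only: percent agreement between two educators who
# independently rate the same ten students pass (P) or fail (F).
# The ratings are invented for illustration.

rater_a = ["P", "P", "F", "P", "P", "P", "F", "P", "P", "P"]
rater_b = ["P", "P", "F", "P", "F", "P", "F", "P", "P", "P"]

matches = sum(a == b for a, b in zip(rater_a, rater_b))
agreement = matches / len(rater_a)

print(f"percent agreement: {agreement:.0%}")  # 90% (9 of 10 decisions match)
```

With only two levels, educators need to agree only on whether performance is acceptable; with 4 or 5 levels, they must also agree on fine gradations of proficiency, which is where ratings tend to diverge.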
Any rating form used in a nursing education program or healthcare system must be clear to all stakeholders. Students, educators, preceptors, and others need to understand the meaning of the competencies and scale levels. They also need to be able to determine examples of clinical performance that reflect each level in the scale. For example, what is satisfactory and unsatisfactory performance in establishing relationships with team members? If a scale with 5 levels is used, what are the differences in establishing relationships with team members at each of those levels? All too often the meaning of the competencies in the tool and levels used to rate observed performance are not fully understood by the clinical educators using it.
Teachers should be prepared for use of the form through faculty development. The meaning of the competencies and examples of performance that reflect each level should be discussed by all educators who will be using the form. There needs to be agreement on the meaning of each of the competencies and the behaviors that represent acceptable performance of them. Without these discussions, there may be wide variability in the interpretation of the competencies and behaviors that represent a pass or fail, or a 4, 3, 2, or 1 level of performance. In addition to these discussions, teachers can practice using the form to evaluate performance of students in digitally recorded simulations.
Along with teacher preparation for using the clinical evaluation tool, some schools develop guidelines that accompany the tool to improve consistency in its use among clinical educators. Walsh, Jairath, Paterson, and Grandjean (2010) reported on the development of their Clinical Performance Evaluation Tool (CPET), based on the Quality and Safety Education for Nurses (QSEN) competencies. The CPET has three parts: (a) a one-page checklist for teachers to evaluate student performance related to the QSEN competencies, (b) a key that explains the application of the competencies to the specific clinical course, and (c) guidelines for grading performance.
Issues With Rating Scales
One problem in using rating scales with multiple levels is consistency among clinical teachers and others in determining the level of performance based on the scale. This problem can occur even when descriptions are provided for each level of the rating scale. Teachers may differ in their judgments of whether the student collected relevant data, whether multiple sources of data were used, whether the database was comprehensive, whether all possible patient problems were considered, and so forth. Scales based on frequency labels are often difficult to use because students have limited opportunities to practice and demonstrate a skill at a level rated as always, often, sometimes, or never. How should teachers rate students’ performance when they have practiced the skill perhaps only once or twice? Even with two-level scales such as pass–fail, there is room for variability among educators.
Brookhart and Nitko (2019) identified errors that evaluators can make when using rating scales. Three of these can occur with tools that have multiple points on the scale for rating performance, such as 1 to 4:
1. Leniency error results when the teacher tends to rate all students toward the high end of the scale.
2. Severity error is the opposite of leniency, tending to rate all students toward the low end of the scale.
3. Central tendency error is hesitancy to mark either end of the rating scale, using only its midpoint instead. Rating students only at the extremes or only at the midpoint of the scale limits the validity of the ratings for all students and introduces the teacher’s own biases into the evaluation (Brookhart & Nitko, 2019). A sketch following this list illustrates how these three rating patterns might appear in an educator’s scores.
Three other errors that can occur with any type of clinical performance rating scale are a halo effect, personal bias, and a logical error:
4. Halo effect is a judgment based on a general impression of the student. With this error, the teacher lets an overall impression of the student influence the ratings of specific aspects of the student’s performance. This impression is considered to create a “halo” around the student that affects the teacher’s ability to objectively evaluate and rate specific competencies on the tool. This halo may be positive, giving the student a higher rating than is deserved, or negative, letting a general negative impression of the student result in lower ratings of specific aspects of the performance.
5. Personal bias occurs when the teacher’s biases influence ratings, such as favoring nursing students who do not work while attending school over those who are employed.
6. Logical error results when similar ratings are given for items on the scale that are logically related to one another. This is a problem with rating scales in nursing that are too long and often too detailed. For example, there may be multiple competencies related to communication skills to be rated. The teacher observes some of these competencies but not all of them. In completing the clinical evaluation form, the teacher gives the same rating to all competencies related to communication on the tool. When this occurs, often some of the items on the rating scale can be combined.
Two other errors that can occur with performance ratings are rater drift and reliability decay (Brookhart & Nitko, 2019):
7. Rater drift can occur when teachers redefine the performance behaviors to be observed and assessed. Initially in developing a clinical evaluation form, teachers agree on the competencies to be rated and the scale to be used. However, over a period of time, educators may interpret them differently, drifting away from the original intent. For this reason, faculty members, clinical educators, and others involved in the clinical evaluation should discuss as a group each competency on their evaluation tool at the beginning of the course and at the midpoint. This discussion should include the meaning of the competency and what a student’s performance would “look like” at each rating level in the tool. Simulated experiences in observing a performance, rating it with the tool, and discussing the rationale for the rating are valuable to prevent rater drift as the course progresses.
8. Reliability decay is a similar issue that can occur. Brookhart and Nitko (2019) indicated that immediately following training on using a performance rating tool, educators tend to use the tool consistently across students and with each other. As the course continues, though, faculty members may become less consistent in their ratings. Discussion of the clinical evaluation tool among course faculty, as indicated earlier, may improve consistency in use of the tool.
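The sketch below illustrates, with invented data, how leniency, severity, and central tendency might surface in one educator’s scores on a 1-to-4 scale. The scores and thresholds are arbitrary; this is not a validated screening method, only a way to picture the patterns.

```python
# Hypothetical sketch only: descriptive checks that might hint at leniency,
# severity, or central tendency in one educator's ratings on a 1-4 scale.
# The scores and thresholds are invented for illustration.

from statistics import mean, pstdev

scores = [3, 3, 3, 2, 3, 3, 3, 3, 2, 3]  # one educator's ratings of ten students

avg = mean(scores)       # 2.8
spread = pstdev(scores)  # 0.4

if avg >= 3.5:
    pattern = "possible leniency: ratings cluster at the high end"
elif avg <= 1.5:
    pattern = "possible severity: ratings cluster at the low end"
elif spread < 0.5:
    pattern = "possible central tendency: little use of the scale's extremes"
else:
    pattern = "no obvious rating-error pattern"

print(f"mean={avg:.2f}, sd={spread:.2f} -> {pattern}")
```

In practice, a skewed distribution can also reflect a genuinely strong or weak clinical group, so patterns like these warrant discussion among course faculty rather than automatic conclusions.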
Although there are issues with rating scales, they remain an important clinical evaluation method because they allow teachers, preceptors, and others to rate performance over time and to note patterns of performance. Exhibit 14.3 provides guidelines for using rating scales for clinical evaluation in nursing.