Strategies for assessing and evaluating learning outcomes

Jane M. Kirkpatrick, PhD, RN and Diann A. DeWitt, PhD, RN, CNE

This chapter is dedicated to our outstanding colleague and friend Dr. Lillian Yeager, who died in May 2006. Dr. Yeager was appointed Dean for the School of Nursing at Indiana University Southeast in 2002 following approximately 30 years as an exceptional educator there. It was our pleasure to collaborate with such an outstanding nurse educator and administrator. Lillian had a keen sense of humor and zest for life, both of which aided in her valiant efforts to live life fully in her fight with cancer. She is greatly missed and will long be remembered by many. Thanks for inspiring us, Lillian!

The purpose of this chapter is to discuss the uses, advantages, disadvantages, and issues related to a variety of strategies that faculty can use to assess and evaluate student learning. The Carnegie Foundation’s report on nursing education (Benner, Sutphen, Leonard, & Day, 2009) calls for stronger integration of both clinical and classroom instruction and “radical transformation” in how nursing education is provided. Just as teaching methods are expanding to ensure that graduates achieve desired outcomes as identified by the American Association of Colleges of Nursing (AACN) essentials (2008) and National League for Nursing (NLN) competencies (2010), we must also expand our assessment and evaluation strategies to determine if these competencies are attained. Competency in clinical judgment, critical thinking, and best nursing practices may need multiple measures to be evaluated accurately. As educators expand teaching methods and seek to assess deep learning and critical thinking, we can explore ways that our active learning teaching strategies can be transformed to assess the desired learning outcomes. The chapter includes practical information on a variety of outcome assessment strategies. Included are ways to select strategies, improve validity and reliability of the assessment strategies, and increase the effectiveness of their use.

Assessment and evaluation

Just what is the difference between assessment and evaluation? In many instances it seems that these two terms are interchangeable. Assessment basically means obtaining information for a specific purpose. The information collected may be quantitative or qualitative depending on how it will be used (Brookhart, 2005). The main purpose of assessment is understanding and improving student learning (T. A. Angelo, personal communication, November 7, 2006). “It involves making our expectations explicit and public; setting appropriate criteria and high standards for learning quality; systematically gathering, analyzing, and interpreting evidence to determine how well performance matches those expectations and standards; and using the resulting information to document, explain, and improve performance” (Angelo, 1995, pp. 7–9). This definition of assessment is quite similar to formative evaluation, a process of determining progress with the goal of making improvements.

Evaluation is a term that is more commonly associated with summative evaluation, which takes assessment to the next level of judging the value and quality of performance at a defined end point. Summative evaluation suggests that a decision may be made. In clinical disciplines, faculty must evaluate student attainment of course outcomes and defined program competencies.

In determining validity, faculty must ask whether the assessment technique is appropriate to the purpose and whether it provides useful and meaningful data (Linn & Gronlund, 2005). Faculty must consider the fit of the assessment strategy with the identified objectives. In other words, does the strategy measure what it is supposed to measure? For instance, if the objective for an assignment is for the student to demonstrate skill in written communication, evaluating student performance through oral questioning will not provide valid data. Similarly, at the nursing department level, faculty should coordinate assessment and evaluation strategies with nursing program outcomes such as critical thinking and communication. It is a challenge to develop sound criteria for assessment that accurately reflect the specified outcomes, objectives, and content. To establish face validity, faculty must seek input from colleagues by asking questions such as “Do these criteria appear to measure what my objectives are?” In addition, obtaining the opinion of other content experts can assist in determining whether there is adequate sampling of the content (content validity). Whereas these types of validity (e.g., face, content) constitute the traditional approach to establishing validity, Gronlund (2006) asserts that this view is being replaced by validity as a unitary concept, based on several different categories of evidence (e.g., face-related evidence, content-related evidence). The evidence available to establish validity determines whether validity is considered low, medium, or high.

Once assessment and evaluation criteria or rubrics are developed, it is essential to establish their reliability. The most common method in this situation is for two or more instructors to independently rate sample student work using the agreed-upon criteria or rubric. The ratings are then correlated to establish interrater reliability, which is expressed as the percentage of agreement between scores. An example of using criteria to establish interrater reliability is provided in Box 25-1.

BOX 25-1 Establishing Interrater Reliability

Develop criteria and apply them to sample work

Have 2 or more observers independently rate performance, then correlate

The formula for % Agreement is as follows:

% Agreement = total # of agreements / (# of agreements + # of disagreements)

Example: 3 raters evaluate written communication using the following criteria:

1. Clear expression of ideas

2. Logical flow/organization

3. Correct use of syntax, grammar, APA format

4. Incorporation of research findings

• Item 1: 2 agree, 1 does not

• Item 2: all 3 agree

• Item 3: 2 agree, 1 does not

• Item 4: all 3 agree

10 (total agreements) / [10 (agreements) + 2 (disagreements)] = 10/12 = 0.83, or 83% (>70% is good)

Polit, D. F., & Hungler, B. P. (1999). Nursing research: Principles and methods (6th ed.). Philadelphia, PA: Lippincott, p. 416.
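The calculation in Box 25-1 can be sketched in a few lines of Python. This is a minimal illustration, not a method prescribed by the cited source. It assumes that, for each criterion, raters who gave the most common rating count as agreements and the remaining raters count as disagreements, which matches the tallies in the box. The function name and the rating labels are hypothetical.

```python
from collections import Counter


def percent_agreement(ratings_by_item):
    """Return agreements / (agreements + disagreements) across all items.

    Assumption: for each item, raters who gave the most common rating
    count as agreements; all other raters count as disagreements.
    """
    agreements = 0
    disagreements = 0
    for ratings in ratings_by_item:
        most_common_count = Counter(ratings).most_common(1)[0][1]
        agreements += most_common_count
        disagreements += len(ratings) - most_common_count
    return agreements / (agreements + disagreements)


# Hypothetical ratings from three raters on the four criteria in Box 25-1.
sample_ratings = [
    ["meets", "meets", "below"],    # Item 1: 2 agree, 1 does not
    ["meets", "meets", "meets"],    # Item 2: all 3 agree
    ["exceeds", "meets", "meets"],  # Item 3: 2 agree, 1 does not
    ["meets", "meets", "meets"],    # Item 4: all 3 agree
]

agreement = percent_agreement(sample_ratings)
print(f"{agreement:.0%}")  # 10 agreements out of 12 ratings
```

With these sample ratings the function counts 10 agreements and 2 disagreements, the same totals used in the box.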

A multiplicity of assessment strategies can provide a more complete picture of the student’s abilities and therefore contribute to the trustworthiness of the process. It is a serious limitation to rely on a single technique. Each assessment technique has limitations and issues that can influence the reliability, validity, and appropriateness of the technique for given student populations. Using multiple assessment techniques provides a more robust and accurate framework for making evaluative decisions.

Effectiveness

After the assessment strategy is implemented, it is essential to determine its overall effectiveness. Issues related to the implementation of the assessment strategy should be examined as well. Some questions faculty should ask include the following: Was the strategy an effective use of resources (e.g., student and faculty time and financial resources)? Were there adequate data to determine if the learning outcome was met? Are there any problems with the implementation of the technique? What revisions are necessary? Would the faculty consider this strategy to be a good choice for future use?

Matching the assessment strategy to the domain of learning

Educators must also be mindful of the domain of learning being assessed or evaluated (see Chapter 13). Cognitive learning is typically assessed with strategies requiring the students to write, submit portfolios, or complete tests (see Chapter 11). Assessment in the psychomotor domain typically involves simulations and simulated patients and ultimately occurs in clinical practice (see Chapters 19 and 20). Assessment in the affective domain is particularly important in nursing and is discussed further here.

The taxonomy of affective assessment and evaluation as applied to nursing (Krathwohl, Bloom, & Masia, 1964) lists five behavioral categories: (1) receiving, (2) responding, (3) valuing, (4) organization of values, and (5) characterization by a value or value complex. The beginning student may be at the receiving level, able to hear and recognize the values. As the student progresses, more sophisticated affective growth would demonstrate the ability to respond to or communicate about the particular value or issue. At the next level, the student embraces the value. Ultimately, the student would act on the value. Once actions are consistent, the highest levels of the affective domain would be realized.

Examples of areas in which nursing students encounter the affective domain include socialization to the roles of the nurse, caring for patients who are dying, meeting spirituality needs, working with sexuality concerns, and becoming culturally competent.

For example, students are expected to increase their level of cultural competence throughout the curriculum. At the beginning level students may be expected to become self-aware using exploration of their own cultural and health care practices as well as values. A mid-program outcome could focus on student awareness of the cultural orientation of the patients under their care. At graduation the expected outcomes would be to act in a culturally competent manner when providing care to all patients and demonstrate the ability to advocate for an individual patient’s unique needs.

Multiple strategies could be used for assessment of these outcomes, including written papers identifying the student’s own cultural background, a critical review of an interaction in caregiving with a patient of another culture, or perhaps the use of media (e.g., video recording, web page development, or even a collage) to demonstrate key concepts and values held by a given culture. The assessment and evaluation strategy would be designed to assess evidence of self-awareness, recognition of the values and conflicts in areas in which judgments must be made, and mechanisms for advocacy.

Development of the affective domain is progressive and can be tied to critical thinking. Because of the progressive nature of development, formative assessment and evaluation across the curriculum may be most appropriate, with a summative assessment and evaluation at the time of graduation. Many of the assessment methods listed in this chapter can be adapted to evaluate the affective domain.

Communicating grading expectations

When assessment strategies are used to collect data for grading purposes, it is imperative that the grading requirements be communicated to the students. Information about grading criteria is typically provided to students in the course syllabus. Other methods such as checklists, guidelines, or grading scales can be used as well. See Box 25-2 for an example of a writing assignment with a grading rubric.

BOX 25-2 Sample Writing Assignment with Grading Rubric

Courtesy of Colorado Christian University

College of Adult and Graduate Studies, Division of Nursing and Sciences

NUR 400A Transitions in Nursing: Career Advancement

Scholarly Paper Assignment

Please select a nursing topic of interest to you for this scholarly paper assignment. The purpose of this paper is, in part, to assess your ability to think critically and communicate clearly as you begin the RN-BSN program.

The paper is to be 6–8 pages in length and follow APA format. Please refer to CAGS Guidelines for APA Style in Doc Sharing for additional information. A minimum of three (3) professional journal articles (less than five years old) should be used. All scholarly papers should have an introduction that includes a thesis statement, the body of the paper and a conclusion or summary.

Make sure the paper includes the following:

• A title page

• Body of the paper

• Introduction (including thesis statement or focus for the paper)

• At least one direct quote from a credible information source (such as a nursing article)

• At least one in-text citation of an information source (from a different source than the one you obtained the quote from). This should not be a direct quote. It should be a reference to an idea or concept presented by the author.

• Conclusion (summary of main points)

• A reference page

Please refer to the NUR 400A Grading Rubric for Scholarly Paper (in Doc Sharing) for specific grading criteria.

NUR 400A Grading Rubric for Scholarly Paper

Critical Thinking and Written Communication
Objectives	Below Expectations	Meets Expectations	Exceeds Expectations	Points
	<75%	75–90%	90–100%
Organization (introduction and conclusion as well as transitions) The student will provide an introduction (including a clear thesis statement) and conclusion as well as use transitions to provide a logical flow.	Missing the introduction (with a thesis statement), the conclusion, or both. AND Transitions not used or transitions are inconsistent which provides some confusion to the reader.	Adequate introduction clearly focuses the paper and includes a thesis statement. Plausible conclusion summarizes paper. AND Transitions are ordinary (get the job done but in a routine fashion) but purposefully connect content providing logical flow.	Insightful, original introduction (including thesis statement) clearly focuses the paper. Convincing conclusion summarizes paper. AND Transitions are original and purposefully connect content providing strong logical flow.	___ of 20
Development and Evidence (Analysis and Development) The student will analyze and develop content in a logical progression.	Content analysis and development are general or vague but presented in a somewhat logical progression or no logical progression at all.	Content analysis and development are adequate and presented in a logical progression.	Content analysis and development are insightful and presented in logical progression.	___ of 20
Development and Evidence (Support) The student will support thesis, main points, and/or claims with appropriate evidence.	The topic statement, main points, announcement and/or claims are not supported or are supported with general or vague (reader gains few insights) personal examples and obvious textual sources.	The thesis, main points, and/or claims are supported with relevant personal examples (reader gains some insight), textual sources, and appropriate external sources.	The thesis, main points, and/or claims are supported with relevant personal examples (reader gains insight), textual sources, and scholarly academic sources.	___ of 20
Structure and Usage (Language) The student will use effective academic language, variety in sentence structure, and active voice.	Conversational word choice; some variety in sentence structure; active and passive voice are used or passive voice is primarily used.	Academic word choice; variety in sentence structure; active voice is primarily used.	Strong effective academic word choice; variety in sentence structure; active voice is primarily used.	___ of 20
Structure and Usage (Conventions—Mechanics & APA) The student will use writing mechanics properly (spelling; capitalization; punctuation; pronoun references; subject–verb agreement; consistent verb tense). The student will use APA format correctly.	The paper contains numerous (7+) mechanical errors and the errors seriously distract from the writer’s purpose. AND APA format errors are numerous (7+).	The paper contains a few (4–6) mechanical errors but the errors do not distract from the writer’s purpose. AND There are a few (4–6) APA format errors.	Mechanical errors are rare (0–3) and the errors do not distract from the writer’s purpose. AND APA format errors are rare (0–3).	___ of 20
			Total Points	/100
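The point structure of the rubric above (five criteria worth 20 points each, with three percentage bands) can be sketched as a short tallying routine. This is only an illustration under stated assumptions: the function names and the sample scores are hypothetical, the criterion labels are shortened from the rubric, and a score of exactly 90% is treated here as "Exceeds Expectations" because the rubric's 75–90% and 90–100% bands overlap at that point.

```python
# Criteria and maximum points taken from the NUR 400A rubric in Box 25-2.
MAX_POINTS = {
    "Organization": 20,
    "Development and Evidence (Analysis and Development)": 20,
    "Development and Evidence (Support)": 20,
    "Structure and Usage (Language)": 20,
    "Structure and Usage (Conventions)": 20,
}


def performance_level(points, max_points):
    """Map a criterion score to a rubric column.

    Assumption: exactly 90% is scored as "Exceeds Expectations",
    since the rubric's bands overlap at 90%.
    """
    percent = 100 * points / max_points
    if percent < 75:
        return "Below Expectations"
    if percent < 90:
        return "Meets Expectations"
    return "Exceeds Expectations"


def score_paper(awarded):
    """Return (total points, performance level per criterion) for one paper."""
    levels = {}
    for name, max_pts in MAX_POINTS.items():
        pts = awarded[name]
        if not 0 <= pts <= max_pts:
            raise ValueError(f"{name}: {pts} is outside 0-{max_pts}")
        levels[name] = performance_level(pts, max_pts)
    total = sum(awarded[name] for name in MAX_POINTS)
    return total, levels


# Hypothetical scores for one student paper.
example_scores = {
    "Organization": 18,
    "Development and Evidence (Analysis and Development)": 16,
    "Development and Evidence (Support)": 14,
    "Structure and Usage (Language)": 17,
    "Structure and Usage (Conventions)": 19,
}

total, levels = score_paper(example_scores)
print(f"{total} of {sum(MAX_POINTS.values())}")
for criterion, level in levels.items():
    print(f"{criterion}: {level}")
```

Reporting a level for each criterion alongside the total keeps the analytic feedback visible to the student rather than collapsing it into a single number.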

Rubrics are another way to inform students about grading expectations. According to Stevens and Levi (2005), rubrics are rating scales used to assess performance. The two types of rubrics are holistic and analytic. The holistic approach is based on global scoring, often with descriptive information for each area based on a numerical scoring system, whereas analytic scoring involves examining each significant characteristic of the written work or portfolio. For example, in assessment of writing, the organization, ideas, and style may be judged individually according to analytic scoring (Linn & Gronlund, 2005). The global method seems more suitable for summative assessment, whereas the analytic method is useful in providing specific feedback to students for the purpose of performance improvement.

Regardless of the type, rubrics are composed of four parts: (1) a task description (the assignment), (2) a scale, (3) the dimensions of the assignment, and (4) descriptions of each performance level (Stevens & Levi, 2005). The first portion of a rubric contains a clear description of the assignment and should be matched to the learning outcomes of the course. The next part of the rubric is a scale to describe levels of performance. Such a scale may include levels such as “excellent,” “competent,” and “needs work.” The dimensions of the assignment are the third part of rubric development, where the task is broken down into components. Last, differentiated descriptions of each performance level are explicitly identified. Rubrics thus provide clarity of expectations to assist students in the successful completion of assignments as well as making grading of these assignments more objective for the faculty. See Box 25-3 for an example of grading rubrics.

Box 25-3 Examples of Grading Rubrics

“A” grade

The final course synthesis paper clearly defines a researchable problem; the search strategy provides sufficient relevant data for understanding the problem; the coding sheet is focused and guides the analysis of the data; issues of reliability and validity are identified; the literature is synthesized, rather than reviewed or summarized; the paper concludes with recommendations based on the research synthesis. The paper is written using the IUSON writing guidelines.

Participation in discussions and learning activities integrates course concepts and reflects critical thinking about research synthesis. Participation is thoughtful, respectful, informed, and substantiated. Peer review of the synthesis paper reflects the reviewers’ understanding of the synthesis process, provides practical suggestions, and is presented in a collegial manner.

Dissemination of the findings of the research synthesis includes a written plan for publication and an oral presentation to faculty and classmates. The plan for publication includes thoughtful selection of a journal; draft of a query or cover letter; and, if the paper needs revisions to suit publication guidelines, a statement about revisions needed that matches the journal publication guidelines. The professional presentation is well organized, supported by visual aids (e.g., PowerPoint slides), and uses professional communication style to suit the audience.

“B” grade

The final synthesis paper clearly defines a researchable problem; the search strategy yields mostly relevant data for understanding the problem; the coding sheet lacks one or more important data and/or does not reflect the scope of the problem statement; issues of reliability and validity are unclear; the review of literature is primarily synthesis with minimal summary; the paper concludes with mostly appropriate recommendations. The paper is free of major errors in grammar or style.

Participation in discussions and learning activities usually integrates course concepts and reflects critical thinking about research synthesis. Participation is helpful but may not contribute substantially to the focus of the course. Peer review of the synthesis paper does not include relevant aspects of the peer review checklists or overlooks areas in which feedback is needed.

Dissemination of the findings of the research synthesis includes a written plan for publication and an oral presentation to faculty and classmates. The plan for publication includes appropriate selection of a journal; the drafts of the query or cover letter are generally appropriate to the situation; general revisions are noted but do not consider manuscript guidelines of the journal. The professional presentation is fairly well organized; the visual aids (e.g., PowerPoint slides) enhance the presentation; the presentation is delivered with consideration for the audience.

“C” grade

The final synthesis paper has an ill-defined problem; the search strategy yields irrelevant or tangential data for understanding the problem; the coding sheet is not well focused or neglects key variables or includes irrelevant variables; issues of reliability and validity are not identified or are ignored; the review is more summary than synthesis; the paper does not include recommendations or includes recommendations that are not drawn from the data. There are substantial errors in grammar and/or writing style.

Participation in discussions occurs on an irregular basis and is not grounded in course concepts, comments do not reflect critical thinking, and there are breaches of course norms and etiquette. Peer review of the synthesis paper does not provide substantive or helpful feedback to classmates; significant aspects of the peer review checklist are ignored.

Dissemination of the findings of the research synthesis includes a written plan for publication and an oral presentation to faculty and classmates. There is no plan for publication or the journal selected is not appropriate for the content of the paper; the drafts of the query or cover letters are not clearly written and do not capture attention of the reader; there is not clear understanding of the revisions needed of the paper for the style requirements for the selected journal. The professional presentation is not well organized; visual aids (e.g., PowerPoint slides) are not used or the visuals do not clarify or highlight key points of the presentation; the presentation exceeds time limits and/or is not suited to the audience. The presenter is unable to answer audience questions, if any, about the material.

IUSON, Indiana University School of Nursing.

Courtesy of Indiana University School of Nursing.

Strategies for assessing and evaluating learning outcomes

Nursing faculty can use a variety of strategies to assess and evaluate student learning. This section identifies several strategies known to be effective in nursing. Table 25-1 provides an overview of these strategies.

TABLE 25-1

Overview of Assessment and Evaluation Strategies

Technique	Domain and Assessment Purpose	Possible Applications	Advantages	Disadvantages	Issues
Portfolio (paper and electronic)	High-level cognitive; Affective; Psychomotor (if video); Formative; Summative	Placement in program of study; For evidence of progress; Outcome measure for individual or program; Marketing tool for job placement	Broad sample of student work; Documents progress; Identifies student strengths and weaknesses; Critical thinking with student reflection; If electronic portfolio, is easier to make updates and convenient for online programs	Time for collection and grading; Need storage space; Not direct observation; Limited reliability; Additional expenses with electronic portfolios; Time needed for learning	Ownership; Responsibility for collection; Nonselective versus selective portfolio; Are you evaluating process or product?; Deciding on the format for organizing the portfolio
Role play	Cognitive; Affective; Psychomotor; Formative	Formative feedback for psychomotor skills, communication techniques, problem-solving skills	Active participation of student; Stimulates creativity; Variables can be controlled; Can repeat; Provides practice in peer review skills	Immediate feedback may not be possible; Self-consciousness of participant	Takes time to build comfort with technique; Need familiarity with material
