Strategies for assessing and evaluating learning outcomes


Jane M. Kirkpatrick, PhD, RN and Diann A. DeWitt, PhD, RN, CNE


This chapter is dedicated to our outstanding colleague and friend Dr. Lillian Yeager, who died in May 2006. Dr. Yeager was appointed Dean for the School of Nursing at Indiana University Southeast in 2002 following approximately 30 years as an exceptional educator there. It was our pleasure to collaborate with such an outstanding nurse educator and administrator. Lillian had a keen sense of humor and zest for life, both of which aided in her valiant efforts to live life fully in her fight with cancer. She is greatly missed and will long be remembered by many. Thanks for inspiring us, Lillian!


The purpose of this chapter is to discuss the uses, advantages, disadvantages, and issues related to a variety of strategies that faculty can use to assess and evaluate student learning. The Carnegie Foundation’s report on nursing education (Benner, Sutphen, Leonard, & Day, 2009) calls for stronger integration of clinical and classroom instruction and a “radical transformation” in how nursing education is provided. Just as teaching methods are expanding to ensure that graduates achieve the desired outcomes identified in the American Association of Colleges of Nursing (AACN) essentials (2008) and the National League for Nursing (NLN) competencies (2010), we must also expand our assessment and evaluation strategies to determine whether these competencies are attained. Competency in clinical judgment, critical thinking, and best nursing practices may require multiple measures to be evaluated accurately. As educators expand their teaching methods and seek to assess deep learning and critical thinking, active learning strategies can themselves be transformed to assess the desired learning outcomes. The chapter includes practical information on a variety of outcome assessment strategies, including ways to select strategies, improve the validity and reliability of assessment strategies, and increase the effectiveness of their use.




Assessment and evaluation


Just what is the difference between assessment and evaluation? In many instances the two terms are used interchangeably. Assessment means obtaining information for a specific purpose. The information collected may be quantitative or qualitative, depending on how it will be used (Brookhart, 2005). The main purpose of assessment is understanding and improving student learning (T. A. Angelo, personal communication, November 7, 2006). “It involves making our expectations explicit and public; setting appropriate criteria and high standards for learning quality; systematically gathering, analyzing, and interpreting evidence to determine how well performance matches those expectations and standards; and using the resulting information to document, explain, and improve performance” (Angelo, 1995, pp. 7–9). This definition of assessment closely resembles formative evaluation, a process of determining progress with the goal of making improvements.


Evaluation is a term that is more commonly associated with summative evaluation, which takes assessment to the next level of judging the value and quality of performance at a defined end point. Summative evaluation suggests that a decision may be made. In clinical disciplines, faculty must evaluate student attainment of course outcomes and defined program competencies.



Selecting strategies


The strategies discussed in this chapter provide faculty with a variety of techniques for assessing and evaluating student learning outcomes. Several of the strategies may be more familiar as teaching strategies. Adapting a teaching strategy as an assessment or evaluation tool allows students to practice the same process by which they will ultimately be evaluated.


The major reasons for faculty to consider new assessment and evaluation strategies are so they can better (1) assess and evaluate all domains of learning, (2) assess higher levels of the cognitive domain (e.g., analysis, synthesis), (3) assess critical thinking, and (4) prepare students for licensing or certification examinations. By providing a more authentic assessment, in which the student is asked to perform or demonstrate the learning in a way that is as closely related as possible to the ultimate performance required in the real world, faculty gain richer and deeper evidence of student progress.


In selecting strategies, the philosophy of the faculty regarding accountability and responsibility for learning must be considered. Many of the strategies discussed are compatible with active teaching techniques. Critical reflections, short essays, and guided writing assignments encourage students to interact with the material in a different way than they would if they were studying for a multiple-choice test. The major challenges of using these strategies include (1) the time it takes to use the strategy and (2) the difficulty of establishing the validity and reliability of data-gathering instruments and methods. To avoid some of the pitfalls associated with these strategies, faculty should attend to the purpose, setting, procedures, and validity and reliability of the assessment, as discussed in the following sections.





Purpose

The purpose of assessment and evaluation is to ascertain that students have achieved their potential and have acquired the knowledge, skills, and abilities set forth in courses and curricula. The instructional goals and course objectives will indicate the type of behavior (cognitive, affective, or psychomotor) to be assessed. The learning experiences must have relevance to the students and be valued in the grading system. Finally, the grading criteria should be shared with the students before the assessment occurs.


The timing of the assessment relates closely to the purpose. Assessment, or formative evaluation, is much like feedback given to recognize progress; this type of evaluation is appropriate throughout the term. Summative evaluation suggests that a decision may be made, such as assigning a grade or determining whether a student passes or fails a course.



Setting

Another critical factor to consider is the setting in which the instruction and assessment will occur. Most faculty are comfortable with assessment and evaluation in traditional classroom settings, but more than half of all nursing schools are now using some form of computer-based learning support. For some, technology provides an adjunct or support to the nursing course. For others, the entire course is web-based and delivered online.


When considering how to implement assessment and evaluation strategies in an electronic environment, faculty need to address how the technology supports the assessment purpose. Most of the strategies discussed in this chapter can be used in an online community. For example, a threaded discussion can be used for critiquing or even as a forum for verbal questioning. Concept maps can be developed in an electronic format. Students or faculty can maintain an electronic portfolio representative of student work throughout the course.




Procedures

Although procedures for using assessment strategies vary, any procedure selected must be well planned. The strategy should be pilot-tested before it is fully implemented. This process should help prevent unexpected difficulties and allow for refinement and quality improvements prior to full-scale implementation. It is also important to delineate the responsibilities associated with the methods used. For example, in the case of portfolio assessment, a decision must be made about whether students or faculty will collect and keep the work. Another area of concern is the environment in which the assessment will take place. Because of the anxiety and stress associated with the process of being evaluated, faculty must attempt to provide an atmosphere conducive to the process. Humor, when used appropriately, may help place students at ease.



Validity and reliability

The issues of validity and reliability are critical, especially when the purpose is summative evaluation. The terms validity and reliability are defined and described in Chapter 24. For the purposes of this chapter, specific examples are given to clarify how validity and reliability are established for assessment methods other than multiple-choice tests.


In determining validity, faculty must ask whether the assessment technique is appropriate to the purpose and whether it provides useful and meaningful data (Linn & Gronlund, 2005). Faculty must consider the fit of the assessment strategy with the identified objectives. In other words, does the strategy measure what it is supposed to measure? For instance, if the objective for an assignment is for the student to demonstrate skill in written communication, evaluating student performance through oral questioning will not provide valid data. Similarly, at the nursing department level, faculty should coordinate assessment and evaluation strategies with nursing program outcomes such as critical thinking and communication. It is a challenge to develop sound criteria for assessment that accurately reflect the specified outcomes, objectives, and content. To establish face validity, faculty must seek input from colleagues by asking questions such as “Do these criteria appear to measure what my objectives are?” In addition, obtaining the opinion of other content experts can assist in determining whether there is adequate sampling of the content (content validity). Whereas these types of validity (e.g., face, content) constitute the traditional approach to establishing validity, Gronlund (2006) asserts that this view is being replaced by validity as a unitary concept, based on several different categories of evidence (e.g., face-related evidence, content-related evidence). The evidence available to establish validity determines whether validity is considered low, medium, or high.


Once assessment and evaluation criteria or rubrics are developed, it is essential to establish their reliability. The most commonly used method in this situation is to have two or more instructors independently rate sample student work using the agreed-upon criteria or rubric. The ratings are then correlated to establish interrater reliability, which is expressed as a percentage of agreement between scores. An example of using criteria to establish interrater reliability is provided in Box 25-1.
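
For example, if two raters assign the same rubric rating to 17 of 20 sample papers, the percentage of agreement is 17 ÷ 20, or 85%. The brief sketch below is a minimal illustration of this calculation in Python; the scores shown are hypothetical ratings for ten sample papers on a 4-point rubric scale and are not drawn from Box 25-1.

    # A minimal sketch of the percent-agreement calculation; the two sets of
    # rubric ratings below are hypothetical scores for ten sample papers.
    rater_a = [4, 3, 3, 2, 4, 1, 3, 4, 2, 3]
    rater_b = [4, 3, 2, 2, 4, 1, 3, 3, 2, 3]

    # Count the papers on which both raters assigned the same rating.
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    percent_agreement = 100 * agreements / len(rater_a)
    print(f"Interrater agreement: {percent_agreement:.0f}%")  # prints 80%

Agreement below an acceptable level signals that the criteria or the descriptions of performance levels need clarification before the rubric is used for grading.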



A multiplicity of assessment strategies can provide a more complete picture of the student’s abilities and therefore contribute to the trustworthiness of the process. It is a serious limitation to rely on a single technique. Each assessment technique has limitations and issues that can influence the reliability, validity, and appropriateness of the technique for given student populations. Using multiple assessment techniques provides a more robust and accurate framework for making evaluative decisions.




Matching the assessment strategy to the domain of learning


Educators must also be mindful of the domain of learning being assessed or evaluated (see Chapter 13). Cognitive learning is typically assessed with strategies requiring the students to write, submit portfolios, or complete tests (see Chapter 11). Assessment in the psychomotor domain typically involves simulations and simulated patients and ultimately occurs in clinical practice (see Chapters 19 and 20). Assessment in the affective domain is particularly important in nursing and is discussed further here.


The taxonomy of affective assessment and evaluation as applied to nursing (Krathwohl, Bloom, & Masia, 1964) lists five behavioral categories: (1) receiving, (2) responding, (3) valuing, (4) organization of values, and (5) characterization by a value or value complex. The beginning student may be at the receiving level, able to hear and recognize the values. As the student progresses, affective growth is demonstrated by the ability to respond to or communicate about the particular value or issue. At the next level, the student embraces the value. Ultimately, the student would act on the value. Once actions are consistent, the highest level of the affective domain has been realized.


Examples of areas in which nursing students encounter the affective domain include socialization to the roles of the nurse, caring for patients who are dying, meeting spirituality needs, working with sexuality concerns, and becoming culturally competent.


For example, students are expected to increase their level of cultural competence throughout the curriculum. At the beginning level, students may be expected to become self-aware through exploration of their own cultural and health care practices and values. A mid-program outcome could focus on student awareness of the cultural orientation of the patients under their care. At graduation, the expected outcomes would be to act in a culturally competent manner when providing care to all patients and to demonstrate the ability to advocate for an individual patient’s unique needs.


Multiple strategies could be used to assess these outcomes, including written papers identifying the student’s own cultural background, a critical review of a caregiving interaction with a patient of another culture, or the use of media (e.g., video recording, web page development, or even a collage) to demonstrate key concepts and values held by a given culture. The assessment and evaluation strategy would be designed to assess evidence of self-awareness, recognition of the values and conflicts in areas in which judgments must be made, and mechanisms for advocacy.


Development of the affective domain is progressive and can be tied to critical thinking. Because of the progressive nature of development, formative assessment and evaluation across the curriculum may be most appropriate, with a summative assessment and evaluation at the time of graduation. Many of the assessment methods listed in this chapter can be adapted to evaluate the affective domain.



Communicating grading expectations


When assessment strategies are used to collect data for grading purposes, it is imperative that the grading requirements be communicated to the students. Information about grading criteria is typically provided to students in the course syllabus. Other methods such as checklists, guidelines, or grading scales can be used as well. See Box 25-2 for an example of a writing assignment with grading rubric.



BOX 25-2   Sample Writing Assignment with Grading Rubric


Courtesy of Colorado Christian University



College of Adult and Graduate Studies, Division of Nursing and Sciences



NUR 400A Transitions in Nursing: Career Advancement



Scholarly Paper Assignment


Please select a nursing topic of interest to you for this scholarly paper assignment. The purpose of this paper is, in part, to assess your ability to think critically and communicate clearly as you begin the RN-BSN program.


The paper is to be 6–8 pages in length and follow APA format. Please refer to CAGS Guidelines for APA Style in Doc Sharing for additional information. A minimum of three (3) professional journal articles (less than five years old) should be used. All scholarly papers should have an introduction that includes a thesis statement, the body of the paper, and a conclusion or summary.


Make sure the paper includes the following:



Please refer to the NUR 400A Grading Rubric for Scholarly Paper (in Doc Sharing) for specific grading criteria.



NUR 400A Grading Rubric for Scholarly Paper


Critical Thinking and Written Communication

Each objective is rated as Below Expectations (<75%), Meets Expectations (75–90%), or Exceeds Expectations (90–100%) and is scored out of 20 points, for a total of 100 points.


Below Expectations: Content analysis and development are general or vague but presented in a somewhat logical progression or no logical progression at all.
Meets Expectations: Content analysis and development are adequate and presented in a logical progression.
Exceeds Expectations: Content analysis and development are insightful and presented in a logical progression.
___ of 20


Development and Evidence (Support): The student will support the thesis, main points, and/or claims with appropriate evidence.
Below Expectations: The thesis, main points, and/or claims are not supported or are supported with general or vague (reader gains few insights) personal examples and obvious textual sources.
Meets Expectations: The thesis, main points, and/or claims are supported with relevant personal examples (reader gains some insight), textual sources, and appropriate external sources.
Exceeds Expectations: The thesis, main points, and/or claims are supported with relevant personal examples (reader gains insight), textual sources, and scholarly academic sources.
___ of 20


Below Expectations: Conversational word choice; some variety in sentence structure; active and passive voice are used or passive voice is primarily used.
Meets Expectations: Academic word choice; variety in sentence structure; active voice is primarily used.
Exceeds Expectations: Strong, effective academic word choice; variety in sentence structure; active voice is primarily used.
___ of 20


Total Points: ___ of 100





Rubrics are another way to inform students about grading expectations. According to Stevens and Levi (2005), rubrics are rating scales used to assess performance. The two types of rubrics are holistic and analytic. The holistic approach is based on global scoring, often with descriptive information for each area tied to a numerical scoring system, whereas analytic scoring involves examining each significant characteristic of the written work or portfolio. For example, in an assessment of writing, the organization, ideas, and style may each be judged individually with analytic scoring (Linn & Gronlund, 2005). The global method seems more suitable for summative assessment, whereas the analytic method is useful for providing specific feedback to students for the purpose of performance improvement.


Regardless of the type, rubrics are composed of four parts: (1) a task description (the assignment), (2) a scale, (3) the dimensions of the assignment, and (4) descriptions of each performance level (Stevens & Levi, 2005). The first portion of a rubric contains a clear description of the assignment and should be matched to the learning outcomes of the course. The next part of the rubric is a scale describing levels of performance; the scale may include levels such as “excellent,” “competent,” and “needs work.” The dimensions of the assignment are the third part of rubric development, in which the task is broken down into components. Last, differentiated descriptions of each performance level are explicitly identified. Rubrics thus clarify expectations, helping students complete assignments successfully and making grading more objective for faculty. See Box 25-3 for examples of grading rubrics.
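
For faculty who manage rubrics in an electronic environment, the dimensions and scale of an analytic rubric can also be represented in a simple structure and used to total a score. The sketch below is a minimal, hypothetical illustration only; the dimension names, point values, and sample ratings are assumptions for demonstration and are not taken from the boxes in this chapter.

    # Minimal sketch of analytic rubric scoring: each dimension is rated on the
    # same scale and the points are totaled. All names and values are hypothetical.
    scale_points = {"needs work": 10, "competent": 15, "excellent": 20}

    dimensions = ["organization", "evidence", "style", "APA format"]

    # Ratings an instructor might assign for one student paper.
    ratings = {
        "organization": "excellent",
        "evidence": "competent",
        "style": "competent",
        "APA format": "needs work",
    }

    total = sum(scale_points[ratings[dim]] for dim in dimensions)
    maximum = len(dimensions) * max(scale_points.values())
    print(f"Analytic score: {total} of {maximum}")  # prints "Analytic score: 60 of 80"

A holistic rubric, by contrast, would assign a single global rating to the work rather than summing scores across dimensions.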



Box 25-3   Examples of Grading Rubrics


“A” grade


The final course synthesis paper clearly defines a researchable problem; the search strategy provides sufficient relevant data for understanding the problem; the coding sheet is focused and guides the analysis of the data; issues of reliability and validity are identified; the literature is synthesized, rather than reviewed or summarized; the paper concludes with recommendations based on the research synthesis. The paper is written using the IUSON writing guidelines.


Participation in discussions and learning activities integrates course concepts and reflects critical thinking about research synthesis. Participation is thoughtful, respectful, informed, and substantiated. Peer review of the synthesis paper reflects the reviewers’ understanding of the synthesis process, provides practical suggestions, and is presented in a collegial manner.


Dissemination of the findings of the research synthesis includes a written plan for publication and an oral presentation to faculty and classmates. The plan for publication includes thoughtful selection of a journal; draft of a query or cover letter; and, if the paper needs revisions to suit publication guidelines, a statement about revisions needed that matches the journal publication guidelines. The professional presentation is well organized, supported by visual aids (e.g., PowerPoint slides), and uses professional communication style to suit the audience.



“B” grade


The final synthesis paper clearly defines a researchable problem; the search strategy yields mostly relevant data for understanding the problem; the coding sheet lacks one or more important data elements and/or does not reflect the scope of the problem statement; issues of reliability and validity are unclear; the review of literature is primarily synthesis with minimal summary; the paper concludes with mostly appropriate recommendations. The paper is free of major errors in grammar or style.


Participation in discussions and learning activities usually integrates course concepts and reflects critical thinking about research synthesis. Participation is helpful but may not contribute substantially to the focus of the course. Peer review of the synthesis paper does not include relevant aspects of the peer review checklists or overlooks areas in which feedback is needed.


Dissemination of the findings of the research synthesis includes a written plan for publication and an oral presentation to faculty and classmates. The plan for publication includes appropriate selection of a journal; the drafts of the query or cover letter are generally appropriate to the situation; general revisions are noted but do not consider manuscript guidelines of the journal. The professional presentation is fairly well organized; the visual aids (e.g., PowerPoint slides) enhance the presentation; the presentation is delivered with consideration for the audience.



“C” grade


The final synthesis paper has an ill-defined problem; the search strategy yields irrelevant or tangential data for understanding the problem; the coding sheet is not well focused or neglects key variables or includes irrelevant variables; issues of reliability and validity are not identified or are ignored; the review is more summary than synthesis; the paper does not include recommendations or includes recommendations that are not drawn from the data. There are substantial errors in grammar and/or writing style.


Participation in discussions occurs on an irregular basis and is not grounded in course concepts, comments do not reflect critical thinking, and there are breaches of course norms and etiquette. Peer review of the synthesis paper does not provide substantive or helpful feedback to classmates; significant aspects of the peer review checklist are ignored.


Dissemination of the findings of the research synthesis includes a written plan for publication and an oral presentation to faculty and classmates. There is no plan for publication or the journal selected is not appropriate for the content of the paper; the drafts of the query or cover letters are not clearly written and do not capture the attention of the reader; there is no clear understanding of the revisions needed to meet the style requirements of the selected journal. The professional presentation is not well organized; visual aids (e.g., PowerPoint slides) are absent or do not clarify or highlight key points of the presentation; the presentation exceeds time limits and/or is not suited to the audience. The presenter is unable to answer audience questions, if any, about the material.


IUSON, Indiana University School of Nursing.


Courtesy of Indiana University School of Nursing.



Strategies for assessing and evaluating learning outcomes


Nursing faculty can use a variety of strategies to assess and evaluate student learning. This section identifies several strategies known to be effective in nursing. Table 25-1 provides an overview of these strategies.



TABLE 25-1


Overview of Assessment and Evaluation Strategies

For each technique, the table summarizes the domain and assessment purpose, possible applications, advantages, disadvantages, and issues. Techniques covered include the portfolio (paper and electronic) and role play.
