Thursday, June 26, 2008

On the Arbitrary Nature of Cut Scores Used in Progression Testing

As I continue to be contacted weekly by students, faculty, parents, and others about the use of progression policies in nursing programs across the country, something interesting has come to light: schools are using varying cut scores in their progression policies. Indeed, Nibert, Young, and Britt (2003) noted that among the schools surveyed in their study, the scores used for "benchmarking" (i.e., the cut score) ranged from 77 to 90 (on today's score scale, that would be 770 to 900).

Why should the scores schools use be different from place to place? Standard setting for educational tests is something of a science (Broadfoot, 2002; Karantonis & Sireci, 2006), but it doesn't appear that most schools are using any empirical data to support the cut scores they have chosen. This can be inferred from the fact that schools have such varying cut scores in place for their progression policies. How are decisions on cut scores made? What evidence supports one score over another?

For example, Lewis (2006) reported, using a HESI Exit Exam® dataset with N = 8,009, that students scoring in the 800-849 range on the Exit Exam passed the NCLEX-RN® 93.3% of the time. Students scoring in the 700-799 range passed the NCLEX-RN 85.3% of the time (see the Figure).





Why, then, would schools have a cut score of 850 on the Exit Exam when data from a large sample suggest that students scoring just below 850 (in the 800-849 range) have better than a 9-in-10 chance of passing the NCLEX-RN?
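To make the arithmetic concrete, here is a minimal sketch in Python. It uses only the two pass rates reported by Lewis (2006) above; the cohort of 100 students per band is a hypothetical round number chosen for illustration, not a figure from the study, and the function names are my own.

```python
# Pass rates reported by Lewis (2006) for two HESI Exit Exam score bands.
PASS_RATE_BY_BAND = {
    "800-849": 0.933,
    "700-799": 0.853,
}

# Hypothetical cohort size per band, for illustration only (not from the study).
COHORT_SIZE = 100


def expected_passers(pass_rate: float, cohort_size: int = COHORT_SIZE) -> float:
    """Expected number of students in the cohort who would pass the NCLEX-RN."""
    return round(cohort_size * pass_rate, 1)


def expected_failers(pass_rate: float, cohort_size: int = COHORT_SIZE) -> float:
    """Expected number of students in the cohort who would fail the NCLEX-RN."""
    return round(cohort_size * (1 - pass_rate), 1)


if __name__ == "__main__":
    for band, rate in PASS_RATE_BY_BAND.items():
        print(
            f"Band {band}: of {COHORT_SIZE} students held back by a cut score "
            f"of 850, about {expected_passers(rate)} would be expected to pass "
            f"and about {expected_failers(rate)} to fail."
        )
```

Under these assumptions, a cut score of 850 holds back roughly 93 likely passers in order to stop roughly 7 likely failures for every 100 students in the 800-849 band.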

Zieky and Perie (2006) suggest that when setting a cut score, the harm that will be done if students are misclassified must be considered. If students are misclassified as "likely to fail" and are therefore prevented from graduating and taking the NCLEX-RN, there can be profound consequences for those students' lives. Their job plans, financial status, personal relationships, and many other dimensions of life are instantly in jeopardy. There is a serious risk of harm, therefore, if students are misclassified as unprepared for the licensure exam when they are in fact prepared.

On the other hand, if students are misclassified as likely to pass, and are therefore allowed to graduate and sit for the licensure exam, the misclassification is less harmful for the student (who can take the exam again in a matter of weeks) but more consequential for the school, which seeks to have a high NCLEX-RN pass rate.

The only logical conclusion, then, is that when cut scores are chosen non-empirically (i.e., not based on available data), and possibly even arbitrarily, the risk of harm is shifted significantly to the student, with schools erring on the side of not allowing qualified students to graduate and sit for the licensure exam.

The school's licensure pass rate may be protected, but the lives of many students are profoundly and negatively impacted. Many, many qualified nurses are then kept from the workforce because they are unable to test for a nursing license due to a progression policy that prevents their graduation on the basis of a score from one test.

References

Broadfoot, P. (2002). Dynamic versus arbitrary standards: Recognising the human factor in assessment. Assessment in Education: Principles, Policy & Practice, 9(2), 157-159.

Karantonis, A., & Sireci, S. G. (2006). The bookmark standard-setting method: A literature review. Educational Measurement: Issues and Practice, 25(1), 4-12.

Lewis, C. (2006). Predictive accuracy of the HESI Exit Exam on NCLEX-RN pass rates and effects of progression policies on nursing student exit exam scores. Dissertation Abstracts International, 66(11), B. (UMI No. 3195986)

Nibert, A. T., Young, A., & Britt, R. (2003). The HESI Exit Exam: Progression benchmark and remediation guide. Nurse Educator, 28(3), 141-145.

Zieky, M., & Perie, M. (2006). A primer on setting cut scores on tests of educational achievement. Retrieved June 22, 2008 from http://www.ets.org/Media/Research/pdf/Cut_Scores_Primer.pdf.

Monday, June 16, 2008

On the Power of Testing

Using tests in appropriate ways can be a very powerful tool for learning. Specifically, there is an effect, the testing effect, in which the simple act of being tested changes the knowledge one has (Roediger & Karpicke, 2006a). This effect has been seen in much basic (laboratory) research and in applied research. For quite some time, some educators have promoted the use of tests not only for assessment purposes (i.e., summative assessment), but also for learning (formative assessment). These educators can now rest assured that there is some empirical backing for their position, that is, that testing can be used for learning, too.


Additional work by the authors (Roediger & Karpicke, 2006b) showed that repeated testing was more effective among samples of college students than was repeated studying. The authors provide several potential theoretical reasons for the results, including some based on basic understanding of human memory and some based upon the idea that "testing as learning" provides practice to the test-taker, thereby increasing recall and performance on future tests.

Another important bit of research is on providing feedback to students after multiple-choice tests (Butler & Roediger, 2008). Using a 3x3x3x2 factorial experimental design, the researchers investigated how the amount of study, number of multiple-choice alternatives, feedback condition, and report option (forced vs. free report) interacted to influence participants' test performance. Results showed that prior testing and studying both resulted in improved later performance on the study measures, but that prior testing had a larger effect than studying. Also, feedback given to the participants reduced the number of errors they made on future tests, likely because they corrected misinformation in their knowledge, allowing for more accurate recall of the remembered material.

There are frequently questions about the value of testing in student learning, and the research discussed here provides some evidence on that point. Tests don't have to be only for summative assessment - that is, to assess what students have learned. Tests can also be used to assist in students' learning. Providing frequent testing in courses, along with feedback on the tests that focuses on correcting misremembered information, may be an effective strategy to enhance student learning.

References

Butler, A., & Roediger, H. (2008). Feedback enhances the positive effects and reduces the negative effects of multiple-choice testing. Memory & Cognition, 36(3), 604-616.

Roediger, H. L., & Karpicke, J. D. (2006a). The power of testing memory: Basic research and implications for educational practice. Perspectives on Psychological Science, 1(3), 181-210. doi: 10.1111/j.1745-6916.2006.00012.x.

Roediger, H. L., & Karpicke, J. D. (2006b). Test-enhanced learning. Psychological Science, 17(3), 249-255. doi: 10.1111/j.1467-9280.2006.01693.x.

Tuesday, June 3, 2008

Who Will Pass and Who Will Fail? That is the Question. Should It Be?

The nursing education literature is full of research (of varying quality) on predicting NCLEX-RN® passing and failure by nursing students. It has been a dominant topic in the literature since a national test became available. It seems that nearly every month, in one nursing education publication or another, there is some new report on predicting NCLEX-RN outcomes, on programs to help students "at risk" of failure on the NCLEX-RN, or on remediation of low-scoring students on some academic skill set.

What effect does this preoccupation by nursing faculty have on addressing other problems in nursing education? Most of the studies and reports put out on this topic focus on student-level variables. GPA. Test scores. Test anxiety. Course failures.

What about curriculum evaluation? There isn't much about that. If students get to the end of an academic program and are unprepared for the licensure exam, is that a student problem, or is that a curricular problem? Students don't pass themselves. Faculty pass students on to the next course. So, what are students to think when they get to the end of a program yet are un(der)prepared to take the NCLEX-RN? Is it really their fault? I think not. In the age of public school accountability, it is not only students who pay, but also faculty when student performance is sub-par. In nursing education, however, it is much easier to shift the burden to students and make the issue one of student preparation, rather than one of the systems and processes that get students to the end of their programs, un(der)prepared, in the first place.

Calibration is key. Curricula must be calibrated to the test. This does not mean that one cannot teach more than is on the test (NCLEX-RN), but you certainly cannot teach less than is on the test and expect graduates to pass. The NCLEX-RN blueprint changes periodically, and every 2-3 years it seems the passing standard on the NCLEX-RN is raised (the test becomes more difficult to "pass"). Do faculty stay on top of these changes and re-calibrate their curricula to meet the dynamic nature of the licensure exam? If the passing standard increases periodically, are nursing education curricula adjusted accordingly in difficulty level? This could be part of the seemingly omnipresent problem of NCLEX-RN pass rates.

I won't discuss whether or not the NCLEX-RN should drive nursing education (at least at the pre-licensure level) the way it does. It does, without question. If it were announced tomorrow that the NCSBN was changing the content of the licensure exam to include a significant focus on genetic therapies for developmental disorders, schools would be compelled to increase or add this content to their current curricula. What I think is happening is that subtler changes, such as a 0.07 logit increase in the passing standard, are not being followed by schools as closely as they should be. True, the national pass rate doesn't "plummet" when the passing standard is changed, but NCLEX pass rates seem to be a nagging problem for schools, and this could reflect the underlying, always dynamic nature of the test graduates take to become licensed. This idea is not meant to set aside issues of quality and consistency in the curricula themselves, but it is a reasonable proposition, given the gravity of the available data on the problem.
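For readers unfamiliar with logits, here is a rough sketch in Python of what a 0.07 logit shift means in probability terms. It assumes a simple one-parameter logistic (Rasch-type) model, the usual setting in which logits are reported; the ability value of 0.0 is a hypothetical placeholder rather than an actual NCLEX-RN passing standard, and the function names are my own.

```python
import math

# Size of the passing-standard increase mentioned in the post, in logits.
STANDARD_INCREASE = 0.07


def logistic(x: float) -> float:
    """Standard logistic function, mapping logits to probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def p_correct(ability: float, difficulty: float) -> float:
    """Probability of a correct answer under a one-parameter logistic model.

    Assumption: the probability depends only on the gap between the
    candidate's ability and the item's difficulty, both in logits.
    """
    return logistic(ability - difficulty)


# Hypothetical candidate whose ability sits exactly at the old standard.
old_standard = 0.0
new_standard = old_standard + STANDARD_INCREASE
candidate_ability = old_standard

before = p_correct(candidate_ability, old_standard)
after = p_correct(candidate_ability, new_standard)

if __name__ == "__main__":
    print(f"Item at the old standard: P(correct) = {before:.4f}")
    print(f"Item at the new standard: P(correct) = {after:.4f}")
    print(f"Difference: {before - after:.4f}")
```

Under these assumptions the change is a drop of a little under two percentage points per item for a candidate sitting right at the old standard. That is small enough to go unnoticed by a program, yet it applies to every borderline graduate.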