Entries by Nathan Thompson, PhD

“Dichotomous” Vs “Polytomous” in IRT?

What is the difference between the terms dichotomous and polytomous in psychometrics?  Well, these terms represent two subcategories within item response theory.  Item response theory (IRT) is the dominant psychometric paradigm for constructing, scoring and analyzing assessments.  Virtually all large-scale assessments utilize IRT because of its well-documented advantages.  In many cases, however, it is referred […]

Multistage Testing

Multistage testing (MST) is a type of computerized adaptive testing (CAT).  This means it is an exam delivered on computer that is dynamically personalized for each examinee or student.  Typically, this is done with respect to the difficulty of the questions, by making the exam easier for lower-ability students and harder for higher-ability students.  Doing […]

Ebel Method of Standard Setting

The Ebel method of standard setting is a psychometric approach to establish a cutscore for tests consisting of multiple-choice questions. It is usually used for high-stakes examinations in the fields of higher education, medical and health professions, and for selecting applicants. How is the Ebel method performed? The Ebel method requires a panel of judges who […]

Distractor analysis for test items

Distractor analysis refers to the process of evaluating the performance of incorrect answers vs the correct answer for multiple choice items on a test.  It is a key step in the psychometric analysis process to evaluate item and test performance as part of documenting test reliability and validity. What is a distractor? Multiple-choice questions always […]

What is multi-modal test delivery?

Multi-modal test delivery refers to an exam that can be delivered in several different ways, or to an online testing software platform designed to support this process. For example, you might provide the option for a certification exam to be taken on computer at third-party testing centers or via paper at the annual […]

Confidence interval for test scores

A confidence interval for test scores is a common way to interpret the results of a test by phrasing it as a range rather than a single number.  We all know that tests are imperfect measurements that happen at a given slice in time, and performance could in actuality vary over time.  The examinee might […]

Composite Test Score

A composite test score refers to a test score that is combined from the scores of multiple tests, that is, a test battery.  The purpose is to create a single number that succinctly summarizes examinee performance.  Of course, some information is lost by this, so the original scores are typically reported as well. This is […]

Inter-rater reliability vs agreement

Inter-rater reliability and inter-rater agreement are important concepts in certain psychometric situations.  Many assessments never involve raters, but plenty of others do.  This article will define these two concepts and discuss two psychometric situations where they are important.  For a more detailed treatment, I recommend Tinsley […]

Split Half Reliability Index

Split Half Reliability is an internal consistency approach to quantifying the reliability of a test, in the paradigm of classical test theory.  Reliability refers to the repeatability or consistency of the test scores; we definitely want a test to be reliable.  The name comes from a simple description of the method: we split the test […]