
How do I conduct a modified-Angoff study?

There are a number of acceptable methodologies in the psychometric literature for standard-setting studies, which establish cutscores (also known as passing points).  Examples include Angoff, modified-Angoff, Bookmark, Contrasting Groups, and Borderline.  The modified-Angoff approach is by far the most commonly used, yet it remains a black box to many professionals in the testing industry, especially non-psychometricians in the credentialing field.  This post aims to demystify it.  There is some flexibility in how the study is implemented, but this article describes a sound method.

What to Expect with the Modified-Angoff Approach

First of all, do not expect a straightforward, easy process that leads to an unassailably correct cutscore.  All standard setting methods involve some degree of subjectivity.  The goal of the methods is to reduce that subjectivity as much as possible.  Some methods focus on content, others on data, while some try to meld the two.

Step 1: Prepare Your Team

The modified-Angoff process depends on a representative sample of subject matter experts (SMEs), usually 6-20.  By “representative” I mean they should represent the various stakeholders.  A certification for medical assistants might include experienced medical assistants, nurses, and physicians, from different areas of the country.  You must train them about their role and how the process works, so they can understand the end goal and drive toward it.

Step 2: The Minimally Competent Candidate (MCC)

This concept is the core of the Angoff process, though it is known by a range of terms or acronyms, including minimally qualified candidate (MQC) or just barely qualified (JBQ).  The reasoning is that we want our exam to separate candidates who are qualified from those who are not.  So we ask the SMEs to define what makes someone qualified (or unqualified!) from a perspective of skills and knowledge.  This leads to a conceptual definition of an MCC.  We then want to estimate what score this borderline candidate would achieve, which is the goal of the remainder of the study.  This step can be conducted in person or via webinar.

Step 3: Round 1 Ratings

Next, ask your SMEs to read through all the items on your test form and estimate the percentage of MCCs that would answer each item correctly.  A rating of 100 means the item is a slam dunk; it is so easy that every MCC would get it right.  A rating of 40 means the item is very difficult.  Most ratings are in the 60-90 range if the items are well-developed.  The ratings should be gathered independently; if everyone is in the same room, let them work on their own in silence.  This can easily be conducted remotely, though.

Step 4: Discussion

This is where it gets fun.  Identify the items where there is the most disagreement (as defined by grouped frequency distributions or standard deviation) and have the SMEs discuss them.  Maybe two SMEs thought an item was super easy and gave it a 95, and two other SMEs thought it was super hard and gave it a 45.  They will try to convince the other side of their folly.  Chances are that there will be no shortage of opinions and you, as the facilitator, will find your greatest challenge is keeping the meeting on track.  This step can be conducted in person or via webinar.
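To make the "most disagreement" idea concrete, here is a minimal Python sketch that flags items by the standard deviation of their Round 1 ratings.  The ratings, the item names, and the threshold of 15 points are all made up for illustration; pick a threshold (or simply a top-N list) that suits your own panel.

```python
from statistics import mean, stdev

# Hypothetical Round 1 ratings on the 0-100 scale: one list per item, one value per SME.
round1 = {
    "item_01": [95, 90, 45, 50, 70],
    "item_02": [80, 85, 75, 80, 80],
    "item_03": [60, 65, 70, 60, 65],
    "item_04": [95, 45, 95, 45, 70],
}

def flag_for_discussion(ratings, sd_threshold=15.0):
    """Return (item, mean, sd) for items whose rating SD exceeds the threshold,
    sorted so the widest disagreement comes first.  The threshold is an assumption."""
    flagged = []
    for item, values in ratings.items():
        sd = stdev(values)  # sample standard deviation across SMEs
        if sd > sd_threshold:
            flagged.append((item, mean(values), sd))
    return sorted(flagged, key=lambda row: row[2], reverse=True)

for item, avg, sd in flag_for_discussion(round1):
    print(f"{item}: mean={avg:.1f}, sd={sd:.1f}")
```

Sorting by SD gives the facilitator a ready-made agenda: start with the most contested item and work down until time runs out.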

Step 5: Round 2 Ratings

Raters then re-rate the items based on the discussion.  The goal is that there will be greater consensus.  In the previous example, it’s not likely that every rater will settle on a 70.  But if your raters all end up between 60 and 80, that’s OK.  How do you know there is enough consensus?  We recommend the intraclass correlation, an index of inter-rater reliability described by Shrout and Fleiss (1979).
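Shrout and Fleiss describe several forms of the intraclass correlation, and this post does not say which one to use.  As one possible illustration, the sketch below computes ICC(2,1) (two-way random effects, single rater, absolute agreement) from the usual ANOVA mean squares.  The choice of ICC(2,1) and the ratings themselves are assumptions for the example.

```python
from statistics import mean

def icc_2_1(ratings):
    """ICC(2,1) per Shrout & Fleiss (1979).
    `ratings` is a list of rows (one per item), each holding one rating per rater."""
    n = len(ratings)        # number of items (targets)
    k = len(ratings[0])     # number of raters (judges)
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_items = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_items - ss_raters

    bms = ss_items / (n - 1)                 # between-items mean square
    jms = ss_raters / (k - 1)                # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))     # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

# Hypothetical ratings: rows are items, columns are SMEs.
round1 = [[95, 90, 45, 50, 70], [80, 85, 75, 80, 80],
          [60, 65, 70, 60, 65], [95, 45, 95, 45, 70]]
round2 = [[75, 70, 65, 70, 70], [80, 85, 80, 80, 80],
          [60, 65, 65, 60, 65], [75, 65, 70, 70, 70]]

print("Round 1 ICC(2,1):", round(icc_2_1(round1), 3))
print("Round 2 ICC(2,1):", round(icc_2_1(round2), 3))
```

Comparing the two rounds side by side is the point: you are looking for the coefficient to rise after the discussion, not for any particular magic number.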

Step 6: Evaluate Results and Final Recommendation

Evaluate the results from Round 2 as well as Round 1.  An example of this is below.  What is the recommended cutscore?  It is the average or the sum of the Angoff ratings, depending on the scale you prefer.  Did the reliability improve?  Estimate the mean and SD of examinee scores (there are several methods for this).  What sort of pass rate do you expect?  Even better, use the Beuk Compromise as a “reality check” between the modified-Angoff approach and actual test data.  You should take multiple points of view into account, and the SMEs need to vote on a final recommendation.  They, of course, know the material and the candidates, so they have the final say.  This means that standard setting is a political process; again, reduce that effect as much as you can.

[Figure: Angoff Method]
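The arithmetic behind the recommended cutscore and a rough pass rate can be sketched in a few lines of Python.  Everything numeric here is hypothetical, and the pass-rate function assumes examinee scores are roughly normal, which is only one of the several estimation methods mentioned above.  This sketch does not implement the Beuk Compromise.

```python
from statistics import NormalDist, mean

# Hypothetical Round 2 ratings (percent of MCCs expected to answer correctly):
# rows are items, columns are SMEs.
round2 = [
    [75, 70, 65, 70, 70],
    [80, 85, 80, 80, 80],
    [60, 65, 65, 60, 65],
    [75, 65, 70, 70, 70],
]

def angoff_cutscore(ratings):
    """Return the cutscore on the percent-correct scale and on the raw (number-correct) scale."""
    item_means = [mean(row) for row in ratings]
    percent_cut = mean(item_means)       # average of the item-level Angoff ratings
    raw_cut = sum(item_means) / 100      # sum of the ratings, expressed as items correct
    return percent_cut, raw_cut

def expected_pass_rate(cut, score_mean, score_sd):
    """Proportion of examinees scoring above the cut, ASSUMING normally distributed scores."""
    return 1 - NormalDist(score_mean, score_sd).cdf(cut)

percent_cut, raw_cut = angoff_cutscore(round2)
print(f"Cutscore: {percent_cut:.1f}% correct ({raw_cut:.2f} of {len(round2)} items)")

# Hypothetical examinee score distribution on the percent-correct scale.
print(f"Expected pass rate: {expected_pass_rate(percent_cut, 76, 9):.1%}")
```

Running the same two functions on the Round 1 ratings shows the panel how much the discussion moved the recommendation, which is useful context for the final vote.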

 

Step 7: Write Up Your Report

Validity refers to evidence gathered to support test score interpretations.  Well, you have lots of relevant evidence here.  Document it.  If your test gets challenged, you’ll have all this in place.  On the other hand, if you just picked 70% as your cutscore because it was a nice round number, you could be in trouble.

Additional Topics

In some situations, there are more issues to worry about.  Multiple forms?  You’ll need to equate in some way.  Using item response theory?  You’ll have to convert the Angoff-recommended cutscore onto the theta metric using the Test Response Function (TRF).  New credential and no data available?  That’s a real chicken-and-egg problem there.
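For the IRT case, the conversion amounts to finding the theta at which the TRF equals the raw Angoff cutscore.  Below is a minimal sketch that assumes a three-parameter logistic model with parameters on the logistic metric (no 1.7 scaling constant) and uses simple bisection.  The item parameters are invented for illustration, and your calibration software may use a different model or scaling.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model (logistic metric assumed)."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def trf(theta, items):
    """Test response function: expected number-correct score at a given theta."""
    return sum(p_3pl(theta, a, b, c) for a, b, c in items)

def theta_for_raw_cut(raw_cut, items, lo=-4.0, hi=4.0, tol=1e-6):
    """Find theta where TRF(theta) equals raw_cut, by bisection (the TRF is increasing)."""
    if not trf(lo, items) <= raw_cut <= trf(hi, items):
        raise ValueError("raw cutscore is outside the TRF range on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if trf(mid, items) < raw_cut:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical (a, b, c) parameters for a four-item form.
items = [(1.0, -1.0, 0.20), (1.2, -0.5, 0.25), (0.8, 0.0, 0.20), (1.1, 0.5, 0.20)]

theta_cut = theta_for_raw_cut(2.84, items)
print(f"Theta cutscore: {theta_cut:.3f}")
```

Bisection is used here because the TRF is monotonically increasing in theta, so no derivatives or starting values are needed.  A raw cutscore below the sum of the guessing parameters has no theta equivalent, which is what the range check guards against.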

Where Do I Go From Here?

Ready to take the next step and actually apply the modified-Angoff process to improving your exams?  Download our free Angoff Analysis Tool.

Want to go even further and implement automation in your Angoff study?  Sign up for a free account in our FastTest item banker.

References

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428.


Nathan Thompson, PhD

Chief Product Officer at ASC
I am a psychometrician, software developer, author, and researcher, currently serving as Chief Product Officer for Assessment Systems Corporation (ASC). My mission is to elevate the profession of psychometrics by using software to automate the menial stuff like job analysis and Angoff studies, so we can focus on more innovative work. My core goal is to improve assessment throughout the world. I was originally trained as a psychometrician, doing an undergrad at Luther College in Math/Psych/Latin and then a PhD in Psychometrics at the University of Minnesota. I then worked multiple roles in the testing industry, including item writer, test development manager, essay test marker, consulting psychometrician, software developer, project manager, and business leader. Research and innovation are incredibly important to me. In addition to my own research, I am cofounder and Membership Director at the International Association for Computerized Adaptive Testing. You can often find me at other important conferences like ATP, ICE, CLEAR, and NCME. I've published many papers and presentations, and my favorite remains http://pareonline.net/getvn.asp?v=16&n=1.
