analytics-pencils-guttman

Guttman Errors: Additional Insight into Examinees

Guttman errors are a concept derived from the Guttman Scaling approach to evaluating assessments.  There are a number of ways that they can be used.  Meijer (1994) suggests an evaluation of Guttman errors as a way to flag aberrant response data, such as cheating or low motivation.  He quantified this with two different indices, G and G*.

 

What is a Guttman error?

It occurs when an examinee answers an item incorrectly when we expect them to get it correct, or vice versa.  Here, we describe the Goodenough methodology as laid out in Dunn-Rankin, Knezek, Wallace, & Zhang (2004).  Goodenough is a researcher’s name, not a comment on the quality of the algorithm!

In Guttman scaling, we begin by taking the scored response matrix (0s and 1s for dichotomous items) and sorting both the columns and rows.  Rows (persons) are sorted by observed score and columns (items) are sorted by observed difficulty.  The following table is sorted in such a manner, and all the data fit the Guttman model perfectly: all 0s and 1s fall neatly on either side of the diagonal.

 

  Score Item 1 Item 2 Item 3 Item 4 Item 5
P =   0.0 0.2 0.4 0.6 0.8
Person 1 1 1 0 0 0 0
Person 2 2 1 1 0 0 0
Person 3 3 1 1 1 0 0
Person 4 4 1 1 1 1 0
Person 5 5 1 1 1 1 1

 

Now consider the following table.  Ordering remains the same, but Person 3 has data that falls outside of the diagonal.

 

  Score Item 1 Item 2 Item 3 Item 4 Item 5
P =   0.0 0.2 0.4 0.6 0.8
Person 1 1 1 0 0 0 0
Person 2 2 1 1 0 0 0
Person 3 3 1 1 0 1 0
Person 4 4 1 1 1 1 0
Person 5 5 1 1 1 1 1

 

Some publications on the topic are unclear as to whether this is one error (two cells are flipped) or two errors (a cell that is 0 should be 1, and a cell that is 1 should be 0).  In fact, this article changes the definition from one to the other while looking at two rows the same table.  The Dunn-Rankin et al. book is quite clear: you must subtract the examinee response vector from the perfect response vector for that person’s score, and each cell with a difference counts as an error.

 

  Score Item 1 Item 2 Item 3 Item 4 Item 5
P =   0.0 0.2 0.4 0.6 0.8
Perfect 3 1 1 1 0 0
Person 3 3 1 1 0 1 0
Difference 1 -1

 

Thus, there are two errors.

Usage in data forensics

Meijer suggested the use of G, raw Guttman error count, and a standardized index he called G*:

G*=G/(r(k-r).

 

Here, k is the number of items on the test and r is the person’s score.

How is this relevant to data forensics?  Guttman errors can be indicative of several things:

  1. Preknowledge: A low ability examinee memorizes answers to the 20 hardest questions on a 100 item test. Of the 80 they actually answer, they get half correct.
  2. Poor motivation or other non-cheating issues: in a K12 context, a smart kid that is bored might answer the difficult items correctly but get a number of easy items incorrect.
  3. External help: a teacher might be giving answers to some tough items, which would show in the data as a group having a suspiciously high number of errors on average compared to other groups

How can I calculate G and G*?

Because the calculations are simple, it’s feasible to do both in a simple spreadsheet for small datasets.  But for a data set of any reasonable size, you will need specially designed software for data forensics, such as SIFT.

What’s the big picture?

Guttman error indices are by no means perfect indicators of dishonest test-taking, but can be helpful in flagging potential issues at both an individual and group level.  That is, you could possibly flag individual students with high numbers of Guttman errors, or if your test is administered in numerous separate locations such as schools or test centers, you can calculate the average number of Guttman errors at each and flag the locations with high averages.

As with all data forensics, though, this flagging process does not necessarily mean there is nefarious goings-on.  Instead, it could simply give you a possible reason to open a deeper investigation.

The following two tabs change content below.

Nathan Thompson, PhD

Chief Product Officer at ASC
I am a psychometrician, software developer, author, and researcher, currently serving as Chief Product Officer for Assessment Systems Corporation (ASC). My mission is to elevate the profession of psychometrics by using software to automate the menial stuff like job analysis and Angoff studies, so we can focus on more innovative work. My core goal is to improve assessment throughout the world. I was originally trained as a psychometrician, doing an undergrad at Luther College in Math/Psych/Latin and then a PhD in Psychometrics at the University of Minnesota. I then worked multiple roles in the testing industry, including item writer, test development manager, essay test marker, consulting psychometrician, software developer, project manager, and business leader. Research and innovation are incredibly important to me. In addition to my own research, I am cofounder and Membership Director at the International Association for Computerized Adaptive Testing, You can often find me at other important conferences like ATP, ICE, CLEAR, and NCME. I've published many papers and presentations, and my favorite remains http://pareonline.net/getvn.asp?v=16&n=1.

Latest posts by Nathan Thompson, PhD (see all)

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply