Computerized classification test


A computerized classification test (CCT) is, as its name suggests, a test administered by computer for the purpose of classifying examinees. The most common CCT is a mastery test that classifies examinees as "Pass" or "Fail," but the term also covers tests that classify examinees into more than two categories. Although the term can refer to any computer-administered test used for classification, it usually refers to tests that are interactively administered or of variable length, similar to computerized adaptive testing (CAT). Like CAT, a variable-length CCT can accomplish the goal of the test with a fraction of the number of items used in a conventional fixed-form test.
A CCT requires several components:
  1. An item bank calibrated with a psychometric model selected by the test designer
  2. A starting point
  3. An item selection algorithm
  4. A termination criterion and scoring procedure
The starting point is not a topic of contention; research on CCT primarily investigates different methods for the other three components. Note that the termination criterion and scoring procedure are separate in CAT but coincide in CCT, because the test terminates when a classification is made. A CAT design must therefore specify five components rather than four.
An introduction to CCT is found in Thompson and a book by Parshall, Spray, Kalohn and Davey. A bibliography of published CCT research is found below.

How it works

A CCT is very similar to a CAT. Items are administered one at a time to an examinee. After the examinee responds to an item, the computer scores it and determines whether the examinee can be classified yet. If so, the test is terminated and the examinee is classified. If not, another item is administered. This process repeats until the examinee is classified or another stopping condition, such as a maximum test length, is met.
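The loop described above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names run_cct, select_item, get_response and try_classify are hypothetical, and the item selection rule and classification rule are passed in as functions because the article treats them as separate design choices. The max_items cap is an assumed example of "another stopping condition".

```python
from typing import Callable, Hashable, List, Optional, Sequence, Tuple


def run_cct(
    bank: Sequence[Hashable],
    select_item: Callable[[List[Hashable], List[Hashable], List[int]], Hashable],
    get_response: Callable[[Hashable], int],
    try_classify: Callable[[List[Hashable], List[int]], Optional[str]],
    max_items: int,
) -> Tuple[Optional[str], int]:
    """Administer items one at a time until a classification is made.

    Returns (decision, number_of_items_administered). The decision is None
    if the bank or the assumed max_items cap runs out first.
    """
    remaining = list(bank)
    administered: List[Hashable] = []
    responses: List[int] = []

    while remaining and len(administered) < max_items:
        # Pick the next item from those not yet used.
        item = select_item(remaining, administered, responses)
        remaining.remove(item)

        # Administer and score it (1 = correct, 0 = incorrect).
        administered.append(item)
        responses.append(get_response(item))

        # Stop as soon as the examinee can be classified.
        decision = try_classify(administered, responses)
        if decision is not None:
            return decision, len(administered)

    return None, len(administered)
```

In practice, try_classify would be one of the termination criteria discussed later in the article, and select_item one of the item selection methods.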

Psychometric model

Two approaches are available for the psychometric model of a CCT: classical test theory (CTT) and item response theory (IRT). Classical test theory assumes a state model because it is applied by determining item parameters for a sample of examinees determined to be in each category. For instance, several hundred "masters" and several hundred "nonmasters" might be sampled to determine the difficulty and discrimination of each item, but doing so requires being able to easily identify a distinct set of people in each group. IRT, on the other hand, assumes a trait model; the knowledge or ability measured by the test is a continuum. The classification groups must be defined more or less arbitrarily along the continuum, for example by using a cutscore to demarcate masters and nonmasters, but the specification of item parameters assumes a trait model.
There are advantages and disadvantages to each. CTT offers greater conceptual simplicity. More importantly, CTT requires fewer examinees in the sample for calibration of item parameters to be used eventually in the design of the CCT, making it useful for smaller testing programs. See Frick for a description of a CTT-based CCT. Most CCTs, however, utilize IRT. IRT offers greater specificity, but the most important reason may be that the design of a CCT is expensive, and is therefore more likely done by a large testing program with extensive resources. Such a program would likely use IRT.

Starting point

A CCT must have a specified starting point to enable certain algorithms. If the sequential probability ratio test is used as the termination criterion, it implicitly assumes a starting ratio of 1.0. If the termination criterion is a confidence interval approach, a starting point on theta must be specified. Usually, this is 0.0, the center of the distribution, but it could also be randomly drawn from a certain distribution if the parameters of the examinee distribution are known. Previous information about an individual examinee, such as their score the last time they took the test, may also be used.

Item selection

In a CCT, items are selected for administration throughout the test, unlike the traditional method of administering a fixed set of items to all examinees. While this is usually done by individual item, it can also be done in groups of items known as testlets.
Methods of item selection fall into two categories: cutscore-based and estimate-based. Cutscore-based methods maximize the information provided by the item at the cutscore, or cutscores if there is more than one, regardless of the ability of the examinee. Estimate-based methods maximize information at the current estimate of examinee ability, regardless of the location of the cutscore. Both work efficiently, but the efficiency depends in part on the termination criterion employed. Because the sequential probability ratio test only evaluates probabilities near the cutscore, cutscore-based item selection is more appropriate for it. Because the confidence interval termination criterion is centered on the examinee's ability estimate, estimate-based item selection is more appropriate for it. This is because the test makes a classification when the confidence interval is small enough to lie completely above or below the cutscore. The confidence interval is smaller when the standard error of measurement is smaller, and the standard error of measurement is smaller when there is more information at the theta level of the examinee.
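The two selection strategies differ only in the point on theta at which information is maximized, as the following sketch illustrates. It assumes a two-parameter logistic (2PL) IRT model, which the article does not prescribe; the function names and the three-item bank are hypothetical and chosen purely for illustration.

```python
import math
from typing import Dict, Tuple

# Each item is (a, b): discrimination and difficulty under an assumed 2PL model.
Bank = Dict[str, Tuple[float, float]]


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_item(bank: Bank, target_theta: float) -> str:
    """Return the name of the item with maximum information at target_theta."""
    return max(bank, key=lambda name: information(target_theta, *bank[name]))


# Hypothetical bank with equal discrimination and varying difficulty.
bank = {"easy": (1.0, -1.5), "medium": (1.0, 0.0), "hard": (1.0, 1.5)}

cutscore = 0.0   # assumed cutscore on the theta scale
theta_hat = 1.4  # assumed current ability estimate

cutscore_based = select_item(bank, cutscore)   # targets the cutscore
estimate_based = select_item(bank, theta_hat)  # targets the examinee
```

Here the cutscore-based choice is the "medium" item and the estimate-based choice is the "hard" item, because a 2PL item is most informative where its difficulty is closest to the targeted theta.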

Termination criterion

There are three termination criteria commonly used for CCTs. Bayesian decision theory methods offer great flexibility by presenting an infinite choice of loss/utility structures and evaluation considerations, but also introduce greater arbitrariness. A confidence interval approach calculates a confidence interval around the examinee's current theta estimate at each point in the test, and classifies the examinee when the interval falls completely within a region of theta that defines a classification. This was originally known as adaptive mastery testing, but it does not necessarily require adaptive item selection, nor is it limited to the two-classification mastery testing situation. The sequential probability ratio test frames the classification problem as a test between two hypotheses: that the examinee's theta equals a specified point above the cutscore, or that it equals a specified point below the cutscore.
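The confidence interval and sequential probability ratio test criteria can be sketched as follows for a single cutscore. Several details are assumptions not taken from the article: the 2PL response model, the half-width delta that places the two hypothesized points around the cutscore, the nominal error rates alpha and beta with Wald's decision thresholds, and the z multiplier for the interval. The function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def p_correct(theta: float, a: float, b: float) -> float:
    """Assumed 2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def sprt_decision(
    responses: Sequence[int],
    items: Sequence[Tuple[float, float]],
    cutscore: float,
    delta: float = 0.5,
    alpha: float = 0.05,
    beta: float = 0.05,
) -> Optional[str]:
    """Compare the likelihood at cutscore + delta with that at cutscore - delta."""
    theta_hi, theta_lo = cutscore + delta, cutscore - delta
    log_ratio = 0.0  # log of the starting ratio 1.0
    for u, (a, b) in zip(responses, items):
        p_hi = p_correct(theta_hi, a, b)
        p_lo = p_correct(theta_lo, a, b)
        if u == 1:
            log_ratio += math.log(p_hi / p_lo)
        else:
            log_ratio += math.log((1.0 - p_hi) / (1.0 - p_lo))
    if log_ratio >= math.log((1.0 - beta) / alpha):
        return "Pass"
    if log_ratio <= math.log(beta / (1.0 - alpha)):
        return "Fail"
    return None  # keep testing


def ci_decision(
    theta_hat: float, sem: float, cutscore: float, z: float = 1.96
) -> Optional[str]:
    """Classify when the interval theta_hat +/- z * sem excludes the cutscore."""
    if theta_hat - z * sem > cutscore:
        return "Pass"
    if theta_hat + z * sem < cutscore:
        return "Fail"
    return None  # interval still straddles the cutscore
```

With items of discrimination 1.0 and difficulty equal to the cutscore, each correct response adds exactly 2 * delta * a = 0.5 to the log ratio in this sketch, so six consecutive correct responses are needed to cross the upper threshold of log(19), roughly 2.94.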