The specificity of a test refers to its ability to identify people who do not have an alcohol use disorder. As such, specificity reflects the proportion of non-alcohol abusers correctly identified (true negatives). Accordingly, a test with high specificity produces few false positives (i.e., non-alcohol abusers identified by the screening test as alcohol abusers). Referring again to table 1, specificity is calculated by dividing the true negative cases by the total number of non-alcohol abusers, a/(a+b).
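The calculation can be sketched in a few lines of Python. The cell letters follow the text (a = true negatives, b = false positives). The counts are hypothetical and are not taken from table 1.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Proportion of people without the disorder whom the test labels negative: a / (a + b)."""
    total_without_disorder = true_negatives + false_positives
    if total_without_disorder == 0:
        raise ValueError("the sample contains no people without the disorder")
    return true_negatives / total_without_disorder


# Hypothetical counts for illustration only (not from table 1).
a = 180  # true negatives: non-alcohol abusers who screen negative
b = 20   # false positives: non-alcohol abusers who screen positive
print(specificity(a, b))  # 0.9
```

With these assumed counts, 180 of the 200 non-alcohol abusers screen negative, so specificity is 0.9.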

Positive Predictive Value

Another useful statistic in evaluating screening tests is called positive predictive value. This refers to the proportion of persons identified as positive on the screening test who actually have the disorder. The likelihood that a person with a positive test result actually has an alcohol problem is calculated by dividing the true positives by the number of positives identified by the screening test, d/(b+d). It should be noted that as the prevalence of the disorder in the population being screened increases, the positive predictive value of the measure increases as well.
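The dependence on prevalence can be illustrated with a short Python sketch. It rewrites d/(b+d) in terms of sensitivity, specificity, and prevalence, which is an equivalent form by Bayes' rule. The sensitivity (0.85), specificity (0.90), and prevalence values are hypothetical and do not describe any particular instrument.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV = true positives / (true positives + false positives), as population proportions."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    all_positive = true_positive_rate + false_positive_rate
    if all_positive == 0:
        raise ValueError("the test produces no positive results under these inputs")
    return true_positive_rate / all_positive


# Hypothetical test characteristics, held fixed while prevalence varies.
for prevalence in (0.05, 0.20, 0.50):
    ppv = positive_predictive_value(0.85, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}")
```

Under these assumptions, the same test yields a higher positive predictive value in each successively higher-prevalence population, although its sensitivity and specificity are unchanged.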

Likelihood Ratios

The method of likelihood ratios to describe the accuracy of a screening test has been touted as quicker and more powerful than the sensitivity/specificity strategy. According to Sackett (1992), a likelihood ratio reflects the odds that a positive finding on a screening test would occur in a person with, as opposed to a person without, an alcohol use disorder. He described the significance of different likelihood ratios as follows:

When a finding's likelihood ratio is above 1.0, the probability of disease goes up (because the finding is more likely among patients with, than without, the disorder); when the likelihood ratio is below 1.0, the probability of disease goes down (because the finding is less likely among patients with, than without, the disorder); finally, when the likelihood ratio is close to 1.0, the probability of disease is unchanged (because the finding is equally likely in patients with, and without, the disorder). (pp. 2643-2644, emphasis in original)

The calculation of the likelihood ratio is based on sensitivity and specificity, as follows:

likelihood ratio = sensitivity / (1 - specificity)

More information on likelihood ratios and their uses is provided by Feinstein (1985) and Sackett (1992).
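As a rough illustration, the Python sketch below computes the likelihood ratio for a positive result and applies it to a pretest probability. The conversion through odds (post-test odds = pretest odds x likelihood ratio) is the standard way likelihood ratios are used and is not spelled out in the text above. All numeric values are hypothetical.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Likelihood ratio for a positive finding: sensitivity / (1 - specificity)."""
    if specificity >= 1.0:
        raise ValueError("likelihood ratio is undefined when specificity equals 1")
    return sensitivity / (1.0 - specificity)


def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, multiply by the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical values: sensitivity 0.80, specificity 0.90, pretest probability 0.10.
lr = positive_likelihood_ratio(0.80, 0.90)
print(f"likelihood ratio = {lr:.1f}")
print(f"post-test probability = {post_test_probability(0.10, lr):.2f}")
```

A likelihood ratio of 1.0 leaves the probability unchanged in this calculation, which matches Sackett's description quoted above.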

Receiver Operating Characteristic Curves

These curves are used to determine cutoff scores for use with a particular screening measure. Changing the test's cutoff has implications for its sensitivity, specificity, and positive predictive value. For example, lowering the cutoff for a screening test generally will produce a greater number of positive test results. Such a strategy typically will result in greater sensitivity, but at the same time, it will reduce the test's specificity. An excellent example of the effect of using different cutoff points for several widely used screening measures was presented recently by Russell et al. (1994).
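The tradeoff can be seen in a small Python sketch that recomputes sensitivity and specificity at several cutoffs. The scores are invented for illustration and do not come from Russell et al. (1994) or from any real instrument. The rule that a score at or above the cutoff counts as a positive screen is also an assumption.

```python
def sensitivity_specificity(scores_with, scores_without, cutoff):
    """Treat a score at or above the cutoff as a positive screen (assumed rule)."""
    true_positives = sum(score >= cutoff for score in scores_with)
    true_negatives = sum(score < cutoff for score in scores_without)
    return true_positives / len(scores_with), true_negatives / len(scores_without)


# Hypothetical screening scores for people with and without the disorder.
scores_with_disorder = [1, 2, 2, 3, 3, 3, 4, 4]
scores_without_disorder = [0, 0, 0, 1, 1, 1, 2, 3]

for cutoff in (1, 2, 3):
    sens, spec = sensitivity_specificity(scores_with_disorder, scores_without_disorder, cutoff)
    print(f"cutoff={cutoff}  sensitivity={sens:.3f}  specificity={spec:.3f}")
```

In this invented data set, lowering the cutoff from 3 to 1 raises sensitivity from 0.625 to 1.000 while specificity falls from 0.875 to 0.375. Plotting sensitivity against 1 - specificity across all cutoffs gives the curve described in this section.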

Overview of Screening Measures

There is no shortage of screening measures. Applying the criteria outlined in the introduction to this Guide yielded a core group of 23 instruments. They are identified in table 2, along with a variety of descriptive information such as target population, number of items, administration options, time to administer, and availability of normative data. Availability of psychometric data, including various types of reliability and validity, is indicated in table 3.

As can be seen, a number of useful self-report screening measures are available to clinicians and researchers. Around half of those listed in the tables are geared to use with adults. Six (26 percent) were developed specifically for use with adolescents. Six other instruments were developed for use with adults and adolescents. The measures range in length from the brief 4-item CAGE to the 350-item Computerized Lifestyle Assessment; 5 of the screening measures include 10 or fewer items. Several of the measures include two or more distinct scales, should such further information be of use in a particular screening endeavor.

While the vast majority of measures are available in a paper and pencil self-administered format, other options are present. Several measures can be used in an interview format, and a number of the measures have been adapted for computerized scoring. Regardless of format, the measures, with few exceptions, can be completed in less than 15 minutes, and five can be completed in just 1 or 2 minutes. Scoring of the majority of the measures likewise entails a relatively brief time. Finally, table 2 indicates whether norms are available generally and for particular subgroups.

Selection of Measures for Specific Purposes

It is not possible to make definitive statements about which screening measure to select because of the many variables involved, such as the population being assessed, the amount of time available, the setting, and the goals of the screening. Therefore, this section discusses three interrelated components of selecting a screening measure: guidelines for selecting a screening tool, a summary of studies that have explicitly compared screening measures with each other, and suggestions regarding specific measures. However, the suggestions about selecting a particular measure need to be evaluated carefully in the context of the particular setting in which the screening will occur.

Guidelines on Selecting Measures

Three central issues need to be addressed in selecting a screening tool:

• The goals of the screening

• The time available for conducting the screening

• The resources available for scoring the screening test and providing feedback/referral for positive cases.

Identifying the goals of screening in a particular situation might appear straightforward. Indeed, all screening endeavors on some level are designed to detect alcohol problems among those tested. However, a related issue is the degree of sensitivity and specificity desired. For example, the person doing the screening may want to focus on maximizing sensitivity and thus identify as many of the persons with an alcohol problem as possible. On the other hand, another user might want to emphasize specificity and thus minimize the number of persons without an alcohol problem who are incorrectly identified as positive.

The amount of time available for performing the screening should not be a major impediment, because some measures can be completed in just a couple of minutes (e.g., CAGE and TWEAK). However, other measures can take much longer, and the balancing of time with the relative benefits or advantages of a particular measure has to be assessed.

An evaluation of resources available for two particular activities is needed. The first is resources for scoring and interpreting the screening data collected. Conveniently, a host of measures are available that can be scored and evaluated in just a few minutes. The second category of resources needed revolves around time and effort required to act upon screening results. Since screening is intended to detect persons with alcohol problems, resources to provide feedback and referral for evaluation/assessment are needed. The sensitivity versus specificity emphasis of a given measure has implications for the amount of resources necessary for subsequent feedback/referral of positive cases.
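To make the resource implication concrete, the Python sketch below estimates how many positive screens would need follow-up under two hypothetical emphases. The number screened (1,000), the prevalence (0.10), and both sets of sensitivity and specificity values are assumptions chosen for illustration only.

```python
def expected_positive_screens(n_screened, prevalence, sensitivity, specificity):
    """Return the expected counts of (true positives, false positives) among those screened."""
    true_positives = n_screened * prevalence * sensitivity
    false_positives = n_screened * (1.0 - prevalence) * (1.0 - specificity)
    return true_positives, false_positives


# Two hypothetical measures (or cutoffs) applied to the same assumed population.
scenarios = {
    "sensitivity emphasized": (0.95, 0.70),
    "specificity emphasized": (0.75, 0.95),
}
for label, (sens, spec) in scenarios.items():
    tp, fp = expected_positive_screens(1000, 0.10, sens, spec)
    print(f"{label}: {tp:.0f} true + {fp:.0f} false = {tp + fp:.0f} positive screens to follow up")
```

Under these assumptions, the sensitivity-oriented option detects more of the people with alcohol problems but generates roughly three times as many positive screens requiring feedback or referral.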

Comparisons Among Screening Measures

Another resource for making decisions on the selection of a screening measure is data on direct comparisons between measures. Some useful information in this regard has been provided by Maisto et al. (in press).

They reviewed research involving direct contrasts of self-report screening measures for alcohol problems in a variety of settings. The 13 studies reviewed covered 14 different instruments, the majority of which are included in table 2. Most of the studies surveyed were conducted in primary care settings. A particular focus of the review was on the measures' sensitivity and specificity.

Most of the comparisons between screening measures involved three tests: the CAGE, the Michigan Alcoholism Screening Test (MAST), and the Short MAST. Among the conclusions reached by Maisto et al. were the following:

• In direct contrasts, the MAST is generally more sensitive than the CAGE, although the CAGE may perform better than the MAST with elderly primary care patients.

• The CAGE and Short MAST performed comparably.

• The CAGE is particularly popular in primary care settings.

• The TWEAK and T-ACE warrant additional use and study as screening measures, not only because of their promising performance to date, but also because they were designed for use in obstetric/gynecologic clinics and other primary health care settings.

These conclusions may be useful in deliberations involving the choice of specific scales. However, only a subset of the measures listed in table 2 were examined, and these findings should not preclude the use of the other measures.

Closing Points and Suggestions

Cautions have been made throughout this chapter about the need to select a screen

TABLE 2. Descriptive information on a variety of screening measures
