clinical populations, assuming that the study
performance will be equivalent in both (Hui and
Walter 1980).
Other forms of systematic error, or bias, in
reference classification of the study population have
been recognized. These include ascertainment bias
and diagnostic-review bias. Ascertainment bias
results from unequal vigor in application of the
reference classification method among all the study
subjects (Wasson et al. 1982). For instance, if the
reference technique is invasive, subjects who are less
ill will often be spared the procedure but may still be
assigned to the disease-free group or the favorable
prognosis class. Similarly, if long-term monitoring
of subjects is necessary for their final classification,
subjects at low risk or with less severe symptoms
may be included in the analysis even though they are
lost to follow-up.
When the results of the method under study are
known to the investigator making the reference class
assignments, and when such knowledge can influence
the classification made, there is a risk that,
consciously or unconsciously, the classification will
be biased in favor of agreement with the study
method results. This kind of bias, called diagnostic-review
bias (Ransohoff and Feinstein 1978), leads to
overestimation of study performance. It can be
avoided by "blind" interpretation of the reference
method results, that is, interpretation made without
knowledge of the study method results. A similar form of bias,
incorporation bias (Ransohoff and Feinstein 1978),
arises when the reference method includes as one of
its criteria the results of the method whose performance
is being studied. Such circular reasoning is
especially likely to arise when continuing controversy
concerning the precise clinical utility of a
diagnostic or predictive test motivates researchers to
evaluate its performance after it has been incorporated
into clinical practice.
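The direction of diagnostic-review bias can be illustrated with a small calculation. The sketch below is not taken from the cited studies; it assumes a hypothetical parameter, `sway`, giving the fraction of subjects with discordant results whose reference classification is changed to agree with the study method because the reviewer knows its result. The function name and all input values are illustrative assumptions.

```python
def apparent_performance(sensitivity, specificity, prevalence, sway):
    """Apparent sensitivity and specificity under diagnostic-review bias.

    Assumption (hypothetical model): the reference method would be perfect
    if read blind, but a fraction `sway` of discordant subjects is
    reclassified to agree with the study method result.
    """
    diseased = prevalence
    healthy = 1.0 - prevalence

    # True cross-classification, expressed as fractions of the population.
    true_pos = diseased * sensitivity
    false_neg = diseased * (1.0 - sensitivity)
    true_neg = healthy * specificity
    false_pos = healthy * (1.0 - specificity)

    # Swayed false negatives are relabeled disease-free (apparent true negatives);
    # swayed false positives are relabeled diseased (apparent true positives).
    app_true_pos = true_pos + sway * false_pos
    app_false_neg = (1.0 - sway) * false_neg
    app_true_neg = true_neg + sway * false_neg
    app_false_pos = (1.0 - sway) * false_pos

    app_sensitivity = app_true_pos / (app_true_pos + app_false_neg)
    app_specificity = app_true_neg / (app_true_neg + app_false_pos)
    return app_sensitivity, app_specificity


if __name__ == "__main__":
    for sway in (0.0, 0.1, 0.3):
        se, sp = apparent_performance(0.80, 0.90, 0.30, sway)
        print(f"sway={sway:.1f}  apparent sensitivity={se:.3f}  apparent specificity={sp:.3f}")
```

With `sway` set to zero the function returns the true values; any positive `sway` raises both apparent values under this model, which matches the overestimation described above.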
The following are excerpts from the methods
section of an unusually thorough evaluation report
detailing the performance of a number of plasma
enzymes for the diagnosis of acute myocardial
infarction (Werner et al. 1982):
We investigated patients with acute myocardial
infarction and patients in whom this
condition was suspected but ruled out. The
diagnostic classifications were established
after the patient's discharge by a review of all
clinical findings, including history, electrocardiographic
data, and laboratory data. The
document on "Nomenclature and Criteria for
Diagnosis of Ischemic Heart Disease" [Report
of the Joint International Society and Federation
of Cardiology/World Health Organization
Task Force on Standardization of Clinical
Table 4.2
Effects of Common Deficiencies in the Design of Performance Evaluations

                                          Sensitivity or        Specificity or
                                          fraction correctly    fraction correctly
                                          classified,           classified,
Deficiency                                event group           non-event group
--------------------------------------------------------------------------------
Reference classification
  Imperfect reference method
    sensitivity<1                         correct               under
    fraction correctly classified,
    event group<1
    specificity, sensitivity<1            incorrect             incorrect
    fraction correctly classified,
    both groups<1
  Ascertainment bias                      incorrect             incorrect
  Diagnostic-review bias                  over                  over
  Incorporation bias                      over                  over
Study population spectrum
  Inappropriate population                (not applicable)      over
  Inadequate heterogeneity                over                  over
  Work-up bias                            over                  under
  Selection bias                          over                  under
Analytical methodology
  Inaccurate or imprecise method          under                 under
  Test-review bias                        over                  over
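The "imperfect reference method" rows of Table 4.2 can be checked with a short calculation. The sketch below is an illustration, not the source's own derivation: it assumes that errors of the study method and errors of the reference method are independent given the subject's true status, and the function name and input values are hypothetical.

```python
def apparent_vs_imperfect_reference(test_se, test_sp, prevalence, ref_se, ref_sp):
    """Apparent sensitivity and specificity of a study method when the
    groups are defined by an imperfect reference method.

    Assumption: study-method and reference-method errors are independent
    given true disease status.
    """
    diseased = prevalence
    healthy = 1.0 - prevalence

    # Composition of the reference-positive ("event") group.
    ref_pos_diseased = diseased * ref_se
    ref_pos_healthy = healthy * (1.0 - ref_sp)

    # Composition of the reference-negative ("non-event") group.
    ref_neg_diseased = diseased * (1.0 - ref_se)
    ref_neg_healthy = healthy * ref_sp

    # Fraction of the event group that the study method calls positive.
    app_se = (ref_pos_diseased * test_se + ref_pos_healthy * (1.0 - test_sp)) / (
        ref_pos_diseased + ref_pos_healthy
    )
    # Fraction of the non-event group that the study method calls negative.
    app_sp = (ref_neg_healthy * test_sp + ref_neg_diseased * (1.0 - test_se)) / (
        ref_neg_diseased + ref_neg_healthy
    )
    return app_se, app_sp


if __name__ == "__main__":
    # Reference method misses some disease but never mislabels healthy subjects.
    se, sp = apparent_vs_imperfect_reference(0.90, 0.95, 0.20, ref_se=0.80, ref_sp=1.0)
    print(f"reference sensitivity<1:    sensitivity={se:.3f}  specificity={sp:.3f}")

    # Reference method errs in both directions.
    se, sp = apparent_vs_imperfect_reference(0.90, 0.95, 0.20, ref_se=0.80, ref_sp=0.90)
    print(f"reference imperfect both:   sensitivity={se:.3f}  specificity={sp:.3f}")
```

In the first case the event group contains only truly diseased subjects, so the sensitivity estimate is unaffected, while missed cases in the non-event group pull the specificity estimate down. In the second case both estimates are distorted; under the independence assumption used here both fall, which is one way the "incorrect" entries can arise.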