The Logic of Laboratory Medicine - page 72

cited reasons are that the value (1) is used by other
researchers, (2) yields a specificity of 0.95 (for
diagnostic studies), or (3) yields maximum
efficiency among the subjects studied. If the first
criterion applies, the results of the evaluation can be
compared to those reported by others. However, it
may not permit ready comparison of the perform-
ance of the method with that of alternative diagnostic
studies. The second criterion permits comparisons
among alternative studies also evaluated at critical
values yielding a specificity of 0.95. The third criterion
is problematic because the resulting performance
findings often cannot be compared with the
findings from other evaluations concerned with the
same or alternative studies.
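
As a minimal sketch of how criteria (2) and (3) can select different critical values, the Python below uses invented results for 20 unaffected and 10 affected subjects. The data, the function names, and the convention that results at or above the critical value are classified as positive are all illustrative assumptions, not taken from the text.

```python
# Hypothetical study results (invented for illustration only).
unaffected = list(range(1, 21))                       # 20 subjects without the condition
affected = [12, 15, 18, 19, 21, 22, 24, 25, 27, 30]   # 10 subjects with the condition

def specificity(cutoff):
    # Assumption: a result below the cutoff is classified as negative.
    return sum(x < cutoff for x in unaffected) / len(unaffected)

def sensitivity(cutoff):
    # Assumption: a result at or above the cutoff is classified as positive.
    return sum(x >= cutoff for x in affected) / len(affected)

def efficiency(cutoff):
    # Fraction of all subjects correctly classified.
    correct = sum(x >= cutoff for x in affected) + sum(x < cutoff for x in unaffected)
    return correct / (len(affected) + len(unaffected))

candidates = sorted(set(unaffected + affected))

# Criterion (2): lowest candidate cutoff whose specificity reaches 0.95.
cutoff_spec95 = min(c for c in candidates if specificity(c) >= 0.95 - 1e-12)

# Criterion (3): cutoff with the highest efficiency among these subjects.
cutoff_max_eff = max(candidates, key=efficiency)

for name, c in [("specificity 0.95", cutoff_spec95),
                ("maximum efficiency", cutoff_max_eff)]:
    print(f"{name}: cutoff={c}, sens={sensitivity(c):.2f}, "
          f"spec={specificity(c):.2f}, eff={efficiency(c):.3f}")
```

With these invented data the two criteria pick different critical values, which illustrates why findings reported at a study-specific maximum-efficiency value are hard to compare across evaluations.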
The statistical treatment of performance results
includes calculation of the confidence limits for the
performance estimates. Confidence limits define the
range of values within which, to a stipulated level of
confidence, the true value of the estimate lies. For
proportions (such as sensitivity, specificity, and
fraction correctly classified), confidence limit calcu-
lations are based upon the properties of the binomial
distribution. For a proportion derived from a large
number of subjects (N more than 50), the approxi-
mate confidence limits of the estimate are,
$$
\frac{\text{estimate} + \dfrac{z_c^2}{2N} \pm z_c \sqrt{\dfrac{\text{estimate}\,(1 - \text{estimate})}{N} + \dfrac{z_c^2}{4N^2}}}{1 + \dfrac{z_c^2}{N}}
$$
where $z_c$ is the confidence coefficient as found with
the standard normal distribution; $z_c$ equals 1.96 for a
95% confidence level and 1.645 for a 90% confi-
dence level. For proportions derived from a small
number of subjects (fewer than 50), the calculation
of confidence limits is mathematically involved, so
they are usually taken from a table or graph. The
95% confidence limits for proportions derived from
samples of 10, 20, 30, and 50 subjects are shown in
Figure 4.1. If the result frequency distributions are
modeled, the performance measure estimates and
associated confidence limits as computed from the
model parameters should also be presented. As an
example, based on the result frequency distributions
for transferrin saturation as a classification study in
the diagnosis of iron deficiency in 1-year-olds as
reported by Dallman et al. (1981), the empirical
specificity for a transferrin saturation of 10% is
0.75. As there were 110 iron-replete infants
studied, the 95% confidence limits for the empirical
specificity of the study result are 0.662 and 0.822.
Based on lognormal models of the result frequency
distributions, the specificity at a transferrin saturation
of 10% is 0.79, with 95% confidence limits of
0.719 and 0.844. The width of the confidence interval
for the estimate derived from the distribution
model will always be smaller than the interval for
the empirical estimate (White and James 1996).
Here the model-based interval (width 0.125) is 22%
narrower than the empirical interval (width 0.160).
The ROC curve for transferrin saturation
(see Chapter 3) shown in Figure 4.2 indicates the
95% confidence intervals for the empirical sensitiv-
ity and specificity estimates.
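
The large-sample limits given above have the form commonly known as the Wilson score interval. As a hedged sketch, the Python below evaluates that formula for the transferrin saturation example (empirical specificity 0.75, 110 iron-replete infants) and compares the interval widths using the limits reported in the text. The function name `wilson_limits` and its parameter names are illustrative choices, not from the source.

```python
import math

def wilson_limits(estimate, n, z_c=1.96):
    """Approximate confidence limits for a proportion (large-sample formula).

    estimate: observed proportion; n: number of subjects;
    z_c: confidence coefficient (1.96 for 95%, 1.645 for 90%).
    """
    z2 = z_c * z_c
    centre = estimate + z2 / (2 * n)
    half_width = z_c * math.sqrt(estimate * (1 - estimate) / n + z2 / (4 * n * n))
    denominator = 1 + z2 / n
    return (centre - half_width) / denominator, (centre + half_width) / denominator

# Worked example from the text: specificity 0.75 among 110 subjects.
lower, upper = wilson_limits(0.75, 110)
print(f"{lower:.3f} {upper:.3f}")  # 0.662 0.822

# Relative narrowing of the model-based interval, using the reported limits.
empirical_width = 0.822 - 0.662
model_width = 0.844 - 0.719
narrowing = 1 - model_width / empirical_width
print(f"{narrowing:.0%}")  # 22%
```

The computed limits agree with the 0.662 and 0.822 quoted for the empirical specificity. The model-based limits (0.719 and 0.844) are taken as reported, since they depend on the fitted lognormal parameters, which are not given here.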
Confidence limits should also be calculated for
likelihood ratios (Fleiss 1981). Figure 4.3 shows the
likelihood ratio curve for transferrin saturation (see
Chapter 3) with the approximate 95% confidence
intervals for the ratios indicated. Werner et al. state
that:
We estimated the uncertainties of these
measurements by analogy with binomial distri-
bution ... The uncertainty (standard deviation)
ranges for sensitivity estimates from 0.70 ±
0.08 to 0.80 ± 0.07 and for specificity
estimates from 0.96 ± 0.02 to 0.99 ± 0.01.
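
As an illustration of the binomial standard deviation referred to in this quotation, the sketch below computes sqrt(p(1 - p)/N) for a proportion p. The sample size of 33 is a hypothetical value chosen only for illustration; the group sizes used by Werner et al. are not stated in the passage, and the function name is likewise an assumption.

```python
import math

def binomial_sd(proportion, n):
    # Standard deviation of a proportion estimated from n subjects,
    # by analogy with the binomial distribution.
    return math.sqrt(proportion * (1 - proportion) / n)

n_hypothetical = 33  # assumed group size, not taken from the source

for p in (0.70, 0.80):
    print(f"{p:.2f} ± {binomial_sd(p, n_hypothetical):.2f}")
# 0.70 ± 0.08
# 0.80 ± 0.07
```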
Some researchers also present data for the
predictive value of study results (Linnet 1988).
When these calculations are based upon appropriate
epidemiologic estimates of the prevalence of the
Evaluating Classification Studies
4-7
Figure 4.1 95% confidence limits for performance
estimates. The number of study subjects is indicated.
[Figure: axes "Estimate" and "Actual proportion", each from 0 to 1; curves for 50, 30, 20, and 10 subjects.]