
50 Senz of Sith

Estimation of Measurement Uncertainty in Qualitative Testing

Last updated 2024-09-08

Measurement uncertainty (MU) is crucial to the reliability of test results, particularly in medical laboratories. The conventional framework laid out in JCGM-GUM-3 is well suited to quantitative testing, but applying it to qualitative tests (e.g., positive/negative outcomes) is challenging. JCGM-GUM-7 offers a more appropriate approach, modeling MU with probability distributions and Bayesian probability.

JCGM-GUM-7 Framework and Bayesian Approach

In JCGM-GUM-7, MU is modeled using probability distributions, which can be used to quantify the MU in both qualitative and quantitative results. A significant aspect of this approach is the use of Bayesian Probability to integrate prior knowledge with observed data.

Unlike frequentist methods, which rely solely on the observed data, Bayesian methods combine prior information with data to update our beliefs about a hypothesis.

Binary Outcomes in Qualitative Testing

Most qualitative tests produce binary outcomes, such as positive vs negative or detected vs not detected. These outcomes can be easily modeled using the Bernoulli distribution. A useful way to analyze such results is through a 2x2 contingency table, which compares the test results with the true target condition (TC). Conventionally, the columns represent the true TC, while the rows represent the test results.

Here is a typical contingency table:

| Test Result | Target Condition: Positive | Target Condition: Negative |
|---|---|---|
| Detected | tp | fp |
| Not Detected | fn | tn |

True Positive Rate (TPR) and False Negative Rate (FNR)

In a Bayesian analysis, binary outcomes are evaluated conditional on the true target condition. When the condition is positive, we can calculate:

The probability of a positive test given the condition is positive:

\[P(\text{Test+}|\text{TC+}) = \text{TPR} = \frac{\text{tp}}{\text{tp} + \text{fn}}\]

The probability of a negative test given the condition is positive:

\[P(\text{Test-}|\text{TC+}) = \text{FNR} = \frac{\text{fn}}{\text{tp} + \text{fn}}\]

Similarly, when the condition is negative, we can calculate:

The probability of a negative test given the condition is negative:

\[P(\text{Test-}|\text{TC-}) = \text{TNR} = \frac{\text{tn}}{\text{fp} + \text{tn}}\]

The probability of a positive test given the condition is negative:

\[P(\text{Test+}|\text{TC-}) = \text{FPR} = \frac{\text{fp}}{\text{fp} + \text{tn}}\]
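The four conditional rates above can be computed directly from the cells of the contingency table. The following is a minimal sketch; the function name `conditional_rates` is an assumption made for illustration, and the counts are those of the equal-prevalence column in the prevalence example of this article.

```python
def conditional_rates(tp, fp, fn, tn):
    """Return TPR, FNR, TNR and FPR from the four cells of a 2x2 table."""
    positives = tp + fn  # all cases where the target condition is positive
    negatives = fp + tn  # all cases where the target condition is negative
    return {
        "TPR": tp / positives,
        "FNR": fn / positives,
        "TNR": tn / negatives,
        "FPR": fp / negatives,
    }


rates = conditional_rates(tp=264, fp=17, fn=26, tn=273)
for name, value in rates.items():
    print(f"{name} = {value:.4f}")
```

Within each target condition the two rates sum to 1, so TPR + FNR = 1 and TNR + FPR = 1.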

Predictive Values vs. Likelihood Ratios

Although Positive Predictive Value (PPV) and Negative Predictive Value (NPV) are commonly used in diagnostic tests, these metrics are significantly influenced by the prevalence of the condition in the population. On the other hand, Positive Likelihood Ratios (LR+) and Negative Likelihood Ratios (LR-) are much more robust indicators, as they are not directly affected by prevalence.

\[\text{LR+} = \frac{P(\text{Test+}|\text{TC+})}{P(\text{Test+}|\text{TC-})} = \frac{\text{TPR}}{\text{FPR}}\]

\[\text{LR-} = \frac{P(\text{Test-}|\text{TC-})}{P(\text{Test-}|\text{TC+})} = \frac{\text{TNR}}{\text{FNR}}\]

\[\text{PPV} = P(\text{TC+}|\text{Test+}) = \frac{\text{tp}}{\text{tp}+\text{fp}}\]

\[\text{NPV} = P(\text{TC-}|\text{Test-}) = \frac{\text{tn}}{\text{tn}+\text{fn}}\]

Note that LR- is defined here as TNR/FNR, which is the reciprocal of the more commonly used FNR/TNR. With this definition, a larger LR- indicates a better test, just as for LR+.
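To see why the predictive values depend on prevalence while the likelihood ratios do not, it may help to write Bayes' theorem in odds form. The symbol \(\pi = P(\text{TC+})\) for the prevalence is introduced here for illustration and is not part of the original notation.

\[\text{PPV} = \frac{\text{TPR} \cdot \pi}{\text{TPR} \cdot \pi + \text{FPR} \cdot (1 - \pi)}\]

\[\frac{\text{PPV}}{1 - \text{PPV}} = \frac{\pi}{1 - \pi} \times \text{LR+}\]

\[\frac{\text{NPV}}{1 - \text{NPV}} = \frac{1 - \pi}{\pi} \times \text{LR-}\]

The last line uses LR- = TNR/FNR as defined in this article. In each case the prevalence enters only through the prior odds, and the likelihood ratio is the factor that converts prior odds into posterior odds. For example, prior odds of 1:100 and LR+ = 15.53 give posterior odds of 0.1553, that is, a PPV of about 13.44%.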

Example of Prevalence Impact

Let’s compare PPV, NPV, and likelihood ratios at different prevalence levels. Sensitivity, specificity, and the likelihood ratios are the same in every column, yet PPV falls sharply as prevalence falls, while NPV and accuracy rise.

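| | Equal Prevalence | Unbalanced Prevalence | Low Prevalence | Very Low Prevalence |
|---|---|---|---|---|
| Prevalence % | 50.00 | 33.33 | 9.09 | 0.99 |
| Contingency table [tp fp; fn tn] | [264 17; 26 273] | [264 34; 26 546] | [264 170; 26 2730] | [264 1700; 26 27300] |
| Sensitivity % | 91.03 | 91.03 | 91.03 | 91.03 |
| Specificity % | 94.14 | 94.14 | 94.14 | 94.14 |
| FNR % | 8.97 | 8.97 | 8.97 | 8.97 |
| FPR % | 5.86 | 5.86 | 5.86 | 5.86 |
| LR+ | 15.53 | 15.53 | 15.53 | 15.53 |
| LR- | 10.49 | 10.49 | 10.49 | 10.49 |
| PPV % | 93.95 | 88.59 | 60.83 | 13.44 |
| NPV % | 91.30 | 95.45 | 99.06 | 99.90 |
| Accuracy % | 92.59 | 93.10 | 93.86 | 94.11 |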
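The prevalence-dependent rows of the table can be reproduced from the four contingency tables. This is a minimal sketch; the helper name `summarise` and the short scenario labels are assumptions made for illustration, and the counts are read from the table as (tp, fp, fn, tn).

```python
# Counts (tp, fp, fn, tn) for each prevalence scenario in the table above.
scenarios = {
    "Equal": (264, 17, 26, 273),
    "Unbalanced": (264, 34, 26, 546),
    "Low": (264, 170, 26, 2730),
    "Very low": (264, 1700, 26, 27300),
}


def summarise(tp, fp, fn, tn):
    """Return prevalence, PPV, NPV and accuracy as percentages."""
    total = tp + fp + fn + tn
    return {
        "prevalence": 100 * (tp + fn) / total,
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
        "accuracy": 100 * (tp + tn) / total,
    }


results = {name: summarise(*cells) for name, cells in scenarios.items()}
for name, r in results.items():
    print(f"{name:<10} prev={r['prevalence']:6.2f}  PPV={r['ppv']:6.2f}  "
          f"NPV={r['npv']:6.2f}  acc={r['accuracy']:6.2f}")
```

Only the fp and tn cells change between scenarios, which is why sensitivity, specificity and both likelihood ratios stay fixed while PPV drops.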

Sources of Data for MU Analysis

Data for constructing the contingency table in the context of measurement uncertainty analysis can come from several sources:

  1. Validation (MV) studies conducted by the manufacturer and the user.
  2. Published literature on the performance of similar tests.
  3. Participation in external quality assurance (EQA) programs.
  4. Routine internal quality control (IQC) checks.
  5. Prior estimation of MU.

These data sources can be pooled using Bayes’ Theorem. When starting with minimal prior information, we often use a uniform non-informative prior such as Beta(1,1) for each target condition, which corresponds to a count of 1 in every cell of the contingency table.

Example of Combining Data Using Bayes’ Theorem

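Prior values:

| Test Result | Target Condition: Positive | Target Condition: Negative |
|---|---|---|
| Detected | 1 | 1 |
| Not Detected | 1 | 1 |

Likelihood values:

| Test Result | Target Condition: Positive | Target Condition: Negative |
|---|---|---|
| Detected | a | b |
| Not Detected | c | d |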

Given the above observation, the posterior probability according to Bayes’ Theorem is:

\[P(\text{posterior}) \propto P(\text{prior}) \times P(\text{likelihood})\]

Given that the probability mass function of Binomial(n, p) is:

\[P(x) = \binom{n}{x} \centerdot p^x \centerdot (1-p)^{n-x}\]

and that the binomial coefficient does not depend on p:

\[P(x) \propto p^x \centerdot (1-p)^{n-x}\]

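hence, for TC+, where p is the TPR and the Beta(1,1) prior contributes exponents of 1 - 1 = 0:

\[P(\text{posterior}) \propto (p^{1-1} \centerdot (1-p)^{1-1}) \centerdot (p^a \centerdot (1-p)^c)\]

\[P(\text{posterior}) \propto p^{(1+a)-1} \centerdot (1-p)^{(1+c)-1}\]

which is the kernel of Beta(1 + a, 1 + c).

Similarly, for TC-, where p is the TNR:

\[P(\text{posterior}) \propto p^{(1+d)-1} \centerdot (1-p)^{(1+b)-1}\]

which is the kernel of Beta(1 + d, 1 + b).

The posterior parameters can be shown in this table: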

Posterior values:

| Test Result | Target Condition: Positive | Target Condition: Negative |
|---|---|---|
| Detected | 1 + a | 1 + b |
| Not Detected | 1 + c | 1 + d |
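The posterior for one target condition can be checked numerically. The sketch below assumes the standard parametrisation in which the Beta(α, β) density is proportional to p^(α-1) (1-p)^(β-1), so that a Beta(1,1) prior with a detected and c not-detected results gives Beta(1 + a, 1 + c). The function names are assumptions made for illustration, and the data (a = 30, c = 0) are the TC+ column of the MV study in Case 1.

```python
import math


def beta_pdf(p, alpha, beta):
    """Density of Beta(alpha, beta) at p, computed via log-gamma for stability."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(p) + (beta - 1) * math.log(1 - p))


def posterior_params(a, c, prior=(1, 1)):
    """Beta parameters after observing a 'detected' and c 'not detected' results."""
    return prior[0] + a, prior[1] + c


# TC+ column of the MV data in Case 1: a = 30 detected, c = 0 not detected.
alpha, beta = posterior_params(a=30, c=0)

# Midpoint-rule integration of p * pdf(p) over (0, 1) gives the posterior mean.
n = 20_000
grid = [(i + 0.5) / n for i in range(n)]
numeric_mean = sum(p * beta_pdf(p, alpha, beta) for p in grid) / n
closed_form_mean = alpha / (alpha + beta)

print(alpha, beta)
print(round(numeric_mean, 4), round(closed_form_mean, 4))
```

The closed-form mean α / (α + β) equals (1 + a) / (2 + a + c), which is the TPR obtained by reading the posterior table as if it were a table of counts.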

Examples of MU Estimation

Case 1

In this example, the laboratory uses data from the MV study and IQC to estimate MU.

Each 2x2 table below is written as [tp fp; fn tn].

| Year | Prior | Data | Posterior | LR+ | LR- |
|---|---|---|---|---|---|
| - | Uniform non-informative prior [1 1; 1 1] | MV [30 1; 0 29] | [31 2; 1 30] | $$\frac{31/32}{2/32} = 15.50$$ | $$\frac{30/32}{1/32} = 30.00$$ |
| 2023 | Previous MU [31 2; 1 30] | IQC (2022) [50 0; 0 50] | [81 2; 1 80] | $$\frac{81/82}{2/82} = 40.50$$ | $$\frac{80/82}{1/82} = 80.00$$ |
| 2024 | MU (2023) [81 2; 1 80] | IQC (2023) [53 0; 0 53] | [134 2; 1 133] | $$\frac{134/135}{2/135} = 67.00$$ | $$\frac{133/135}{1/135} = 133.00$$ |
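The year-on-year updating in Case 1 can be sketched as a loop in which each posterior becomes the next prior. The function names `add_tables` and `likelihood_ratios` are assumptions made for illustration, and tables are stored as (tp, fp, fn, tn) tuples.

```python
def add_tables(prior, data):
    """Cell-wise sum of two 2x2 tables given as (tp, fp, fn, tn)."""
    return tuple(p + d for p, d in zip(prior, data))


def likelihood_ratios(tp, fp, fn, tn):
    """LR+ = TPR/FPR and LR- = TNR/FNR, following this article's definitions."""
    tpr, fnr = tp / (tp + fn), fn / (tp + fn)
    tnr, fpr = tn / (fp + tn), fp / (fp + tn)
    return tpr / fpr, tnr / fnr


uniform_prior = (1, 1, 1, 1)
yearly_data = [
    ("MV", (30, 1, 0, 29)),
    ("IQC (2022)", (50, 0, 0, 50)),
    ("IQC (2023)", (53, 0, 0, 53)),
]

history = []
current = uniform_prior
for label, data in yearly_data:
    current = add_tables(current, data)  # this posterior is the next prior
    lr_pos, lr_neg = likelihood_ratios(*current)
    history.append((label, current, lr_pos, lr_neg))
    print(f"{label:<11} posterior={current}  LR+={lr_pos:.2f}  LR-={lr_neg:.2f}")
```

Because the uniform prior puts a count of 1 in every cell, the FPR and FNR never reach zero, so both likelihood ratios stay finite even when the IQC data contain no false results.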

Case 2

In this example, the laboratory uses data from the EQA program and IQC to estimate MU. The laboratory started the test in 2022 and joined the EQA program in 2023.

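Each 2x2 table below is written as [tp fp; fn tn].

| Year | Prior | Data: EQA | Data: IQC | Posterior | LR+ | LR- |
|---|---|---|---|---|---|---|
| 2023 | [1 1; 1 1] | - | [50 0; 0 50] | [51 1; 1 51] | $$\frac{51/52}{1/52} = 51.00$$ | $$\frac{51/52}{1/52} = 51.00$$ |
| 2024 | [51 1; 1 51] | [16 1; 0 15] | [50 0; 0 50] | [117 2; 1 116] | $$\frac{117/118}{2/118} = 58.50$$ | $$\frac{116/118}{1/118} = 116.00$$ |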
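When several data sources arrive in the same year, as in the 2024 row of Case 2, they can be pooled in one step. The sketch below uses a hypothetical helper `pool` and stores each table as (tp, fp, fn, tn). It also illustrates that the order of pooling does not matter, since the update is a cell-wise sum.

```python
def pool(*tables):
    """Cell-wise sum of any number of 2x2 tables given as (tp, fp, fn, tn)."""
    return tuple(sum(cells) for cells in zip(*tables))


prior_2024 = (51, 1, 1, 51)  # posterior carried over from 2023
eqa = (16, 1, 0, 15)         # EQA results
iqc = (50, 0, 0, 50)         # IQC results

posterior = pool(prior_2024, eqa, iqc)
same_posterior = pool(iqc, prior_2024, eqa)  # different order, same result

tp, fp, fn, tn = posterior
lr_pos = (tp / (tp + fn)) / (fp / (fp + tn))  # TPR / FPR
lr_neg = (tn / (fp + tn)) / (fn / (tp + fn))  # TNR / FNR
print(posterior, round(lr_pos, 2), round(lr_neg, 2))
```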

Case 3

In this example, the test used in the laboratory produces three possible outcomes: positive, intermediate, and negative. However, the technical manager has decided to group the positive and intermediate results together under a single category called non-negative, because this test is used for screening purposes, where false negatives are considered to have more serious consequences than false positives.

This reclassification simplifies the analysis by transforming the test outcomes into binary results, which is more compatible with our methodology. The technical manager or laboratory director would need to provide a clear justification for this reclassification, explaining why it is appropriate to combine the positive and intermediate results in the context of the test’s clinical use and objectives.

After the reclassification, measurement uncertainty (MU) estimation can proceed using the same approach as outlined in Case 1 and Case 2. With the test outcomes now categorized into non-negative (positive + intermediate) and negative results, we can apply the same Bayesian analysis framework.

Before reclassification:

| Test Result | Target Condition: Positive | Target Condition: Negative |
|---|---|---|
| Detected | a | b |
| Intermediate | c | d |
| Not Detected | e | f |

After reclassification, the number of false positives increases from b to b + d, while the number of false negatives (e) is unchanged:

| Test Result | Target Condition: Positive | Target Condition: Negative |
|---|---|---|
| Non-negative (Detected + Intermediate) | a + c | b + d |
| Not Detected | e | f |
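The reclassification amounts to adding the Detected and Intermediate rows. A minimal sketch follows; the function name `collapse_to_binary` and all counts are hypothetical and are used only to stand in for a to f.

```python
def collapse_to_binary(table):
    """Merge the 'Detected' and 'Intermediate' rows into one non-negative row.

    `table` maps each test result to (count with TC+, count with TC-).
    Returns the binary table as (tp, fp, fn, tn).
    """
    detected_pos, detected_neg = table["Detected"]
    inter_pos, inter_neg = table["Intermediate"]
    fn, tn = table["Not Detected"]
    return detected_pos + inter_pos, detected_neg + inter_neg, fn, tn


# Hypothetical counts standing in for a..f; not taken from any real study.
three_level = {
    "Detected": (40, 2),      # a, b
    "Intermediate": (5, 3),   # c, d
    "Not Detected": (1, 45),  # e, f
}

binary = collapse_to_binary(three_level)
print(binary)
```

The resulting (tp, fp, fn, tn) tuple can then be combined with a prior in the same way as in Case 1 and Case 2.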
