17 Diagnostic Tests
Testing for the presence of a disease, condition or defect is very common in everyday life. It ranges from complex tests for unusual diseases to checks of newly manufactured electrical gadgets for defects. Very often there is a gold standard test, one that is deemed to determine the presence or absence of the condition perfectly. However, alternative tests are always being sought, often because they are cheaper or easier to use than the gold standard.
In a study to diagnose malaria in children attending an outpatient clinic in Ghana, children with a clinical suspicion of malaria were tested using three methods. First, a blood film, reported as a count of the malaria parasites, was done (the gold standard). Two rapid diagnostic kits, called here RDT.1 and RDT.2, were done concurrently and reported as positive (1) or negative (0). These were done for 100 patients and recorded in malaria.csv. Our task is to evaluate RDT.1’s ability to accurately and reliably diagnose malaria.
First, we read the data
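# Assumes the tidyverse is loaded, e.g. library(tidyverse),
# for read_csv(), mutate(), across() and %>%
df_malaria <-
  read_csv("./Data/malaria.txt") %>%
  mutate(
    # A non-zero parasite count on the blood film is gold standard positive
    gold = ifelse(mps == 0, 0, 1) %>%
      factor(levels = c(1, 0),
             labels = c("Positive", "Negative")),
    # Recode both rapid tests from 1/0 to labelled factors
    across(
      c(rdt.1, rdt.2),
      ~ factor(
        .x,
        levels = c(1, 0),
        labels = c("Positive", "Negative")
      )
    )
  )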
The summary of the data is shown below
df_malaria %>% summarytools::dfSummary(graph.col = FALSE)
Data Frame Summary
df_malaria
Dimensions: 100 x 4
Duplicates: 44
-----------------------------------------------------------------------------------------
No Variable Stats / Values Freqs (% of Valid) Valid Missing
---- ----------- ------------------------------ -------------------- ---------- ---------
1 mps Mean (sd) : 3365.2 (23683.3) 53 distinct values 100 0
[numeric] min < med < max: (100.0%) (0.0%)
0 < 62.5 < 236155
IQR (CV) : 413.8 (7)
2 rdt.1 1. Positive 52 (52.0%) 100 0
[factor] 2. Negative 48 (48.0%) (100.0%) (0.0%)
3 rdt.2 1. Positive 51 (51.0%) 100 0
[factor] 2. Negative 49 (49.0%) (100.0%) (0.0%)
4 gold 1. Positive 54 (54.0%) 100 0
[factor] 2. Negative 46 (46.0%) (100.0%) (0.0%)
-----------------------------------------------------------------------------------------
And then tabulate rdt.1 and the Gold Standard as
df_malaria %>%
  gtsummary::tbl_cross(
    col = gold,
    row = rdt.1,
    label = list(
      gold ~ "Gold Standard",
      rdt.1 ~ "First RDT"
    )
  ) %>%
  gtsummary::bold_labels()
| First RDT | Gold Standard: Positive | Gold Standard: Negative | Total |
|---|---|---|---|
| Positive | 50 | 2 | 52 |
| Negative | 4 | 44 | 48 |
| Total | 54 | 46 | 100 |
The table above decomposes the test results into 4 distinct categories.
- Those with both RDT.1 and the gold standard positive (True positive) were 50.
- Those with both RDT.1 and the gold standard negative (True negative) were 44.
- Those with a positive RDT.1 result but a negative gold standard result (False positive) were 2.
- Finally, those with a negative RDT.1 result but a positive gold standard result (False negative) were 4.
We operationalise these by extracting the relevant cells of the table, as below
tp <- 50
tn <- 44
fp <- 2
fn <- 4
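Rather than typing the counts by hand, the four cells can also be read off a table object by name. The following is only a sketch: the matrix `tab` is rebuilt here from the counts shown above so that the example runs on its own, and with the data loaded one would presumably use a cross-tabulation of `rdt.1` and `gold` instead.

```r
# Rebuild the 2 x 2 table from the counts shown above
# (rows: RDT.1 result, columns: gold standard result).
# matrix() fills column by column, so the first column is gold = Positive.
tab <- matrix(
  c(50, 4, 2, 44),
  nrow = 2,
  dimnames = list(
    rdt.1 = c("Positive", "Negative"),
    gold  = c("Positive", "Negative")
  )
)

# Index by row (test result) and column (gold standard result)
tp <- tab["Positive", "Positive"]  # test positive, disease present
fp <- tab["Positive", "Negative"]  # test positive, disease absent
fn <- tab["Negative", "Positive"]  # test negative, disease present
tn <- tab["Negative", "Negative"]  # test negative, disease absent

c(tp = tp, tn = tn, fp = fp, fn = fn)
```

Indexing by dimension names rather than by position guards against mixing up the cells if the factor levels are ordered differently.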
17.1 True prevalence of the disease
The true prevalence of the disease is the proportion of the diseased individuals observed in the study population as determined by the gold standard. This is mathematically given by \[True~prevalence = \frac{tp + fn}{tp + tn + fp + fn}\]
And determined with our data as
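A minimal sketch of this calculation in R, using the four counts from the table above. The variable name `true_prev` is chosen here purely for illustration.

```r
# Counts from the RDT.1 versus gold standard table
tp <- 50; tn <- 44; fp <- 2; fn <- 4

# Everyone positive by the gold standard, over everyone tested
true_prev <- (tp + fn) / (tp + tn + fp + fn)
true_prev  # 54 of 100 children, i.e. 0.54
```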
17.2 Apparent prevalence of the disease
The apparent prevalence of the disease is the proportion of the diseased individuals observed in the study population as determined by the RDT.1 test. This is mathematically given by \[Apparent~prevalence = \frac{tp + fp}{tp + tn + fp + fn}\] And determined with our data by
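The same kind of sketch applies here, again using the counts from the table above; `apparent_prev` is an illustrative name.

```r
# Counts from the RDT.1 versus gold standard table
tp <- 50; tn <- 44; fp <- 2; fn <- 4

# Everyone positive by RDT.1, over everyone tested
apparent_prev <- (tp + fp) / (tp + tn + fp + fn)
apparent_prev  # 52 of 100 children, i.e. 0.52
```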
17.3 Sensitivity of a test
The sensitivity of a test is defined as the proportion of individuals with the disease who are correctly identified by the test applied. It ranges from 0, a completely useless test, to 1, a perfect test. Mathematically this is defined as
\[Sensitivity = \frac{tp}{tp + fn}\] And is determined below
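As a sketch, the sensitivity of RDT.1 can be computed directly from the counts in the table above. The name `sensitivity` is illustrative.

```r
# Counts from the RDT.1 versus gold standard table
tp <- 50; tn <- 44; fp <- 2; fn <- 4

# Of the 54 children with malaria by the gold standard,
# the proportion that RDT.1 flagged as positive
sensitivity <- tp / (tp + fn)
round(sensitivity, 2)  # approximately 0.93
```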
17.4 Specificity of a test
The specificity of a test is defined as the proportion of individuals without the disease who are correctly identified by the test used. It ranges from 0, a completely useless test to 1, a perfect test. Mathematically this is defined as \[Specificity = \frac{tn}{tn + fp}\]
And is determined as below
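A minimal sketch of the specificity calculation, assuming the same four counts as before; `specificity` is an illustrative name.

```r
# Counts from the RDT.1 versus gold standard table
tp <- 50; tn <- 44; fp <- 2; fn <- 4

# Of the 46 children without malaria by the gold standard,
# the proportion that RDT.1 reported as negative
specificity <- tn / (tn + fp)
round(specificity, 2)  # approximately 0.96
```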
17.5 Predictive value of a test
17.5.1 Positive predictive value of a test
The positive predictive value (PPV) of a test is defined as the proportion of individuals with a positive test result who have the disease. This is a more useful measure compared to the sensitivity and specificity because it indicates how much weight one has to put on a positive test result when confronted with one. Mathematically it is defined as:
\[PPV = \frac{tp}{tp + fp}\]
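With the counts from the table above, the PPV of RDT.1 could be computed as in this sketch. The name `ppv` is illustrative.

```r
# Counts from the RDT.1 versus gold standard table
tp <- 50; tn <- 44; fp <- 2; fn <- 4

# Of the 52 children with a positive RDT.1,
# the proportion who truly have malaria
ppv <- tp / (tp + fp)
round(ppv, 2)  # approximately 0.96
```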
17.5.2 Negative predictive value of a test
The negative predictive value (NPV) of a test is defined as the proportion of individuals with a negative test result who do not have the disease. As with the PPV, this is a more useful measure compared to the sensitivity and specificity as it indicates how much weight one has to put on a negative test result when confronted with one. Mathematically it is defined as: \[NPV = \frac{tn}{tn + fn}\] And determined below
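A sketch of the NPV calculation with the counts from the table above; `npv` is an illustrative name.

```r
# Counts from the RDT.1 versus gold standard table
tp <- 50; tn <- 44; fp <- 2; fn <- 4

# Of the 48 children with a negative RDT.1,
# the proportion who truly do not have malaria
npv <- tn / (tn + fn)
round(npv, 2)  # approximately 0.92
```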
17.6 Likelihood ratio of a test
The likelihood ratio of a test is another way of expressing its usefulness. Unlike the previous statistics about tests, likelihood ratios are not confined to the range 0 to 1. A likelihood ratio of 1 indicates a useless (non-discriminatory) test.
17.6.1 The Positive likelihood ratio (LR+)
This is the ratio of the chance of a positive result if the patient has the disease to the chance of a positive result if the patient does not have the disease. The higher the positive likelihood ratio the better the test.
This is mathematically equivalent to
\[LR+ = \frac{Sensitivity}{1-Specificity}\]
Applying this to our data so far we have
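A minimal sketch of the LR+ calculation, recomputing sensitivity and specificity from the counts so that the example stands alone. The name `lr_pos` is illustrative.

```r
# Counts from the RDT.1 versus gold standard table
tp <- 50; tn <- 44; fp <- 2; fn <- 4

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)

# Chance of a positive result in the diseased,
# relative to the chance of a positive result in the non-diseased
lr_pos <- sensitivity / (1 - specificity)
round(lr_pos, 2)  # approximately 21.3
```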
17.6.2 Negative likelihood ratio (LR-)
The negative likelihood ratio (LR-), on the other hand, is the ratio of the chance of a negative result in a person who has the disease to the chance of a negative result in a person who does not have the disease. The lower the negative likelihood ratio the better the test.
Computationally this is equivalent to \[LR- = \frac{1-Sensitivity}{Specificity}\] Applying this to our data so far we have
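A sketch of the LR- calculation along the same lines, with sensitivity and specificity recomputed from the counts. The name `lr_neg` is illustrative.

```r
# Counts from the RDT.1 versus gold standard table
tp <- 50; tn <- 44; fp <- 2; fn <- 4

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)

# Chance of a negative result in the diseased,
# relative to the chance of a negative result in the non-diseased
lr_neg <- (1 - sensitivity) / specificity
round(lr_neg, 2)  # approximately 0.08
```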
17.7 Summary
Fortunately, all these can be obtained in one go using the epi.tests() function from the epiR package. The function, however, requires a table formatted in a specific way. Below we create the table
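# %$% is the exposition pipe from the magrittr package,
# which must be loaded, e.g. library(magrittr)
table.test <-
  df_malaria %$%
  table(rdt.1, gold)
table.test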
gold
rdt.1 Positive Negative
Positive 50 2
Negative 4 44
And then we evaluate the test
table.test %>% epiR::epi.tests()
Outcome + Outcome - Total
Test + 50 2 52
Test - 4 44 48
Total 54 46 100
Point estimates and 95% CIs:
--------------------------------------------------------------
Apparent prevalence * 0.52 (0.42, 0.62)
True prevalence * 0.54 (0.44, 0.64)
Sensitivity * 0.93 (0.82, 0.98)
Specificity * 0.96 (0.85, 0.99)
Positive predictive value * 0.96 (0.87, 1.00)
Negative predictive value * 0.92 (0.80, 0.98)
Positive likelihood ratio 21.30 (5.48, 82.77)
Negative likelihood ratio 0.08 (0.03, 0.20)
False T+ proportion for true D- * 0.04 (0.01, 0.15)
False T- proportion for true D+ * 0.07 (0.02, 0.18)
False T+ proportion for T+ * 0.04 (0.00, 0.13)
False T- proportion for T- * 0.08 (0.02, 0.20)
Correctly classified proportion * 0.94 (0.87, 0.98)
--------------------------------------------------------------
* Exact CIs
Conclusion: With the high (all above 0.9) Sensitivity, Specificity, PPV and NPV, the test appears to be a very good one. This is confirmed by the relatively high LR+ and low LR-.