17 Diagnostic Tests

Testing for the presence of a disease or other condition is common in everyday practice. It ranges from complex laboratory tests for rare diseases to checking newly manufactured electrical gadgets for defects. Very often there is a gold standard test, one that is deemed to determine the presence or absence of the condition perfectly. Alternative tests are nevertheless sought, often because they are cheaper or easier to use than the gold standard.

In a study to diagnose malaria in children attending an outpatient clinic in Ghana, children with a clinical suspicion of malaria were tested using three methods. First, a blood film was examined and reported as a count of malaria parasites (the gold standard). Two rapid diagnostic kits, called here RDT.1 and RDT.2, were used concurrently and reported as positive (1) or negative (0). The results for 100 patients were recorded in malaria.txt. Our task is to evaluate RDT.1's ability to diagnose malaria accurately and reliably.

First, we read the data and recode the test results as factors:

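library(tidyverse)  # read_csv(), mutate(), across(), %>%

df_malaria <- 
    read_csv("./Data/malaria.txt") %>% 
    mutate(
        # Gold standard: any parasites on the blood film counts as positive
        gold = ifelse(mps == 0, 0, 1) %>% 
            factor(levels = c(1,0),
                   labels = c("Positive", "Negative")),
        # Recode both RDT results with "Positive" as the first level
        across(
            c(rdt.1, rdt.2), 
            ~factor(
                .x, 
                levels = c(1,0),
                labels = c("Positive", "Negative")
            )
        )
    )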

The summary of the data is shown below:

df_malaria %>% summarytools::dfSummary(graph.col = FALSE)
Data Frame Summary  
df_malaria  
Dimensions: 100 x 4  
Duplicates: 44  

-----------------------------------------------------------------------------------------
No   Variable    Stats / Values                 Freqs (% of Valid)   Valid      Missing  
---- ----------- ------------------------------ -------------------- ---------- ---------
1    mps         Mean (sd) : 3365.2 (23683.3)   53 distinct values   100        0        
     [numeric]   min < med < max:                                    (100.0%)   (0.0%)   
                 0 < 62.5 < 236155                                                       
                 IQR (CV) : 413.8 (7)                                                    

2    rdt.1       1. Positive                    52 (52.0%)           100        0        
     [factor]    2. Negative                    48 (48.0%)           (100.0%)   (0.0%)   

3    rdt.2       1. Positive                    51 (51.0%)           100        0        
     [factor]    2. Negative                    49 (49.0%)           (100.0%)   (0.0%)   

4    gold        1. Positive                    54 (54.0%)           100        0        
     [factor]    2. Negative                    46 (46.0%)           (100.0%)   (0.0%)   
-----------------------------------------------------------------------------------------

We then cross-tabulate rdt.1 against the gold standard:

df_malaria %>% 
    gtsummary::tbl_cross(
        col = gold,
        row = rdt.1,
        label = list(
            gold ~ "Gold Standard",
            rdt.1 ~ "First RDT"
        )
    ) %>% 
    gtsummary::bold_labels()

             Gold Standard
First RDT    Positive   Negative   Total
Positive           50          2      52
Negative            4         44      48
Total              54         46     100

The table above divides the test results into four distinct categories.

True positives: both RDT.1 and the gold standard were positive (50 children).
True negatives: both RDT.1 and the gold standard were negative (44 children).
False positives: RDT.1 was positive although the gold standard was negative (2 children).
False negatives: RDT.1 was negative although the gold standard was positive (4 children).

We record these four counts for use in the calculations that follow:

tp <- 50
tn <- 44
fp <- 2
fn <- 4
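
Rather than typing the counts by hand, they can be read off a 2 x 2 table by row and column name. The sketch below is a minimal illustration of that idea. It rebuilds the table from the four counts reported above so that it runs on its own, and the object name `tab` is hypothetical. With the real data, the result of `table(df_malaria$rdt.1, df_malaria$gold)` could be indexed in the same way.

```r
# Rebuild the 2 x 2 table from the counts reported above
# (rows = RDT.1 result, columns = gold standard result)
tab <- matrix(
    c(50, 2,
      4, 44),
    nrow = 2, byrow = TRUE,
    dimnames = list(
        rdt.1 = c("Positive", "Negative"),
        gold  = c("Positive", "Negative")
    )
)

# Index by name so the code does not depend on the order of the cells
tp <- tab["Positive", "Positive"]  # test positive, disease present
fp <- tab["Positive", "Negative"]  # test positive, disease absent
fn <- tab["Negative", "Positive"]  # test negative, disease present
tn <- tab["Negative", "Negative"]  # test negative, disease absent

c(tp = tp, fp = fp, fn = fn, tn = tn)
```

Indexing by label rather than by position guards against silently swapping cells if the factor levels are ever reordered.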

17.1 True prevalence of the disease

The true prevalence of the disease is the proportion of diseased individuals in the study population as determined by the gold standard. This is mathematically given by $\text{True prevalence} = \frac{tp + fn}{tp + tn + fp + fn}$

With our data this is

true.prevalence <- (tp+fn)/(tp+tn+fp+fn)
true.prevalence
[1] 0.54

17.2 Apparent prevalence of the disease

The apparent prevalence of the disease is the proportion of diseased individuals in the study population as determined by the RDT.1 test. This is mathematically given by $\text{Apparent prevalence} = \frac{tp + fp}{tp + tn + fp + fn}$ and is determined with our data by

apparent.prevalence <- (tp+fp)/(tp+tn+fp+fn)
apparent.prevalence
[1] 0.52

17.3 Sensitivity of a test

The sensitivity of a test is defined as the proportion of individuals with the disease who are correctly identified by the test. It ranges from 0 (the test detects none of the diseased individuals) to 1 (it detects all of them). Mathematically this is defined as

$\text{Sensitivity} = \frac{tp}{tp + fn}$ and is determined below

sensitivity <- tp/(tp+fn)
sensitivity
[1] 0.9259259

17.4 Specificity of a test

The specificity of a test is defined as the proportion of individuals without the disease who are correctly identified by the test. It ranges from 0 (the test clears none of the disease-free individuals) to 1 (it clears all of them). Mathematically this is defined as $\text{Specificity} = \frac{tn}{tn + fp}$

It is determined as below

specificity <- tn/(tn+fp)
specificity
[1] 0.9565217

17.5 Predictive value of a test

17.5.1 Positive predictive value of a test

The positive predictive value (PPV) of a test is defined as the proportion of individuals with a positive test result who have the disease. In practice this is often a more useful measure than the sensitivity and specificity because it indicates how much weight to put on a positive test result when confronted with one. Mathematically it is defined as:

$\text{PPV} = \frac{tp}{tp + fp}$

ppv <- tp/(tp+fp)
ppv
[1] 0.9615385

17.5.2 Negative predictive value of a test

The negative predictive value (NPV) of a test is defined as the proportion of individuals with a negative test result who do not have the disease. As with the PPV, this is often a more useful measure in practice than the sensitivity and specificity because it indicates how much weight to put on a negative test result when confronted with one. Mathematically it is defined as $\text{NPV} = \frac{tn}{tn + fn}$ and is determined below

npv <- tn/(tn+fn)
npv
[1] 0.9166667
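
Unlike sensitivity and specificity, the predictive values depend on how common the disease is among the people tested. The sketch below applies Bayes' theorem to recompute the PPV and NPV of RDT.1 at a different prevalence. The function name `predictive_values()` and the 5% prevalence are illustrative assumptions and are not part of the study.

```r
# Predictive values from sensitivity, specificity and prevalence (Bayes' theorem)
predictive_values <- function(sens, spec, prev) {
    ppv <- (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
    npv <- (spec * (1 - prev)) / (spec * (1 - prev) + (1 - sens) * prev)
    c(ppv = ppv, npv = npv)
}

sens <- 50 / 54  # sensitivity of RDT.1 from the study table
spec <- 44 / 46  # specificity of RDT.1 from the study table

# At the study prevalence (0.54) this reproduces the PPV and NPV computed above
predictive_values(sens, spec, prev = 0.54)

# At a hypothetical prevalence of 5% the PPV falls while the NPV rises
predictive_values(sens, spec, prev = 0.05)
```

This is why predictive values estimated in one setting should be carried over to another only when the prevalence is similar.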

17.6 Likelihood ratio of a test

The likelihood ratio of a test is another way of expressing its usefulness. Unlike the previous statistics, likelihood ratios are not confined to the range 0 to 1: they can take any non-negative value. A likelihood ratio of 1 indicates a useless (non-discriminatory) test.

17.6.1 The Positive likelihood ratio (LR+)

This is the ratio of the chance of a positive result if the patient has the disease to the chance of a positive result if the patient does not have the disease. The higher the positive likelihood ratio, the better the test.

This is mathematically equivalent to

$LR+ = \frac{\text{Sensitivity}}{1 - \text{Specificity}}$

Applying this to our data we have

pLR <- sensitivity/(1-specificity)
pLR
[1] 21.2963

17.6.2 The Negative likelihood ratio (LR-)

The negative likelihood ratio (LR-), on the other hand, is the ratio of the chance of a negative result if the patient has the disease to the chance of a negative result if the patient does not have the disease. The lower the negative likelihood ratio, the better the test.

Computationally this is equivalent to $LR- = \frac{1 - \text{Sensitivity}}{\text{Specificity}}$. Applying this to our data we have

nLR <- (1-sensitivity)/specificity
nLR
[1] 0.07744108
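
One practical use of likelihood ratios is to update the probability of disease once a test result is known: the post-test odds equal the pre-test odds multiplied by the likelihood ratio. The sketch below illustrates this, taking the study prevalence (0.54) as the pre-test probability. The helper name `post_test_prob()` is hypothetical.

```r
# Convert a pre-test probability to a post-test probability using a likelihood ratio
post_test_prob <- function(pre_prob, lr) {
    pre_odds  <- pre_prob / (1 - pre_prob)  # probability -> odds
    post_odds <- pre_odds * lr              # update the odds with the test result
    post_odds / (1 + post_odds)             # odds -> probability
}

lr_pos <- (50 / 54) / (1 - 44 / 46)  # LR+ for RDT.1
lr_neg <- (1 - 50 / 54) / (44 / 46)  # LR- for RDT.1

post_test_prob(0.54, lr_pos)  # probability of malaria after a positive RDT.1
post_test_prob(0.54, lr_neg)  # probability of malaria after a negative RDT.1
```

With the pre-test probability set to the study prevalence, the first value equals the PPV and the second equals 1 - NPV. A clinician with a different pre-test probability in mind can substitute it for 0.54.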

17.7 Summary

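Fortunately, all of these can be obtained in one go using the epi.tests() function from the epiR package. The function, however, requires a table formatted in a specific way: test results in the rows and the gold standard in the columns, with the positive category first in each. Below we create the table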

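library(magrittr)  # provides the %$% exposition pipe
# Rows: RDT.1 result; columns: gold standard result
table.test <- 
    df_malaria %$%
    table(rdt.1, gold)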

table.test
          gold
rdt.1      Positive Negative
  Positive       50        2
  Negative        4       44

And then we evaluate the test:

table.test %>% epiR::epi.tests()
          Outcome +    Outcome -      Total
Test +           50            2         52
Test -            4           44         48
Total            54           46        100

Point estimates and 95% CIs:
--------------------------------------------------------------
Apparent prevalence *                  0.52 (0.42, 0.62)
True prevalence *                      0.54 (0.44, 0.64)
Sensitivity *                          0.93 (0.82, 0.98)
Specificity *                          0.96 (0.85, 0.99)
Positive predictive value *            0.96 (0.87, 1.00)
Negative predictive value *            0.92 (0.80, 0.98)
Positive likelihood ratio              21.30 (5.48, 82.77)
Negative likelihood ratio              0.08 (0.03, 0.20)
False T+ proportion for true D- *      0.04 (0.01, 0.15)
False T- proportion for true D+ *      0.07 (0.02, 0.18)
False T+ proportion for T+ *           0.04 (0.00, 0.13)
False T- proportion for T- *           0.08 (0.02, 0.20)
Correctly classified proportion *      0.94 (0.87, 0.98)
--------------------------------------------------------------
* Exact CIs
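
The intervals marked as exact can be cross-checked in base R. The sketch below uses `binom.test()`, which returns a Clopper-Pearson (exact binomial) interval, for the sensitivity of RDT.1. That epiR uses the same method for these rows is an assumption here, so small differences from the table above are possible.

```r
# Exact (Clopper-Pearson) 95% confidence interval for the sensitivity of RDT.1:
# 50 true positives among the 54 gold-standard positives
sens_test <- binom.test(x = 50, n = 54, conf.level = 0.95)

sens_test$estimate  # point estimate, 50/54
sens_test$conf.int  # lower and upper 95% limits
```

The same call with `x = 44, n = 46` would give the corresponding interval for the specificity.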

Conclusion: The sensitivity, specificity, PPV and NPV are all above 0.9, so RDT.1 appears to be a very good test. This is supported by the high LR+ and the low LR-.
