22 Analysis of Variance

22.1 Introduction

Hypertension, an elevated blood pressure beyond normal, is a very common medical condition, especially in adults. To treat this condition, persons with hypertension are often given blood pressure-lowering drugs. There are many types of this drug, with varying blood pressure-lowering effects when taken. In a study to identify the blood pressure-lowering effect, four drugs, here called A, B, C and D were randomly administered to persons with hypertension. Their systolic blood pressure (SBP) was measured before and after administration of the drugs.

The drop in SBP was then recorded. The data systolic.txt was captured from this study. The variables in the data include drug, the drug administered with 1, 2, 3, and 4 representing drugs A, B, C and D, respectively. The other variables in the data are disease representing other diseases present in the patient with 1, 2, 3 representing Asthma, Diabetes Mellitus (DM) and Obesity, and systolic representing the drop in systolic blood pressure a week after starting the drug.

We first read the data and summarise it.

df_syst <- 
    read.table(".\\Data\\systolic.txt") %>% 
    mutate(
        drug = factor(drug, levels = 1:4, labels = c("A", "B", "C", "D")),
        disease = factor(
            disease, levels = 1:3, 
            labels = c("Asthma", "DM", "Obesity")
        )
    )
df_syst %>% summarytools::dfSummary(graph.col = F)
Data Frame Summary  
df_syst  
Dimensions: 58 x 3  
Duplicates: 3  

------------------------------------------------------------------------------------
No   Variable    Stats / Values            Freqs (% of Valid)   Valid      Missing  
---- ----------- ------------------------- -------------------- ---------- ---------
1    drug        1. A                      15 (25.9%)           58         0        
     [factor]    2. B                      15 (25.9%)           (100.0%)   (0.0%)   
                 3. C                      12 (20.7%)                               
                 4. D                      16 (27.6%)                               

2    disease     1. Asthma                 19 (32.8%)           58         0        
     [factor]    2. DM                     19 (32.8%)           (100.0%)   (0.0%)   
                 3. Obesity                20 (34.5%)                               

3    systolic    Mean (sd) : 18.9 (12.8)   32 distinct values   58         0        
     [integer]   min < med < max:                               (100.0%)   (0.0%)   
                 -6 < 21 < 44                                                       
                 IQR (CV) : 19 (0.7)                                                
------------------------------------------------------------------------------------

As the data analyst, the investigator asks you to determine from the data which drug has the highest blood pressure-lowering effect and the best and worst performing drugs. Recollect that if there were only two drugs we would most likely to performing a Student t-test for this.

However, we have four categories and cannot answer these questions with the t-test. This is where the Analysis of Variance (ANOVA) in its simplest form kicks in. First, we visualize the distribution of the SBP-lowering effect in all four drugs in the boxplot below:

df_syst %>% 
    ggplot(aes(x = drug, y = systolic, color = drug)) +
    geom_boxplot()+
    labs(title = "Decrease in systolic BP for the various drugs",
         x = "Drug", color = "Drug", 
         y = "Change in SBP (mmHg)")+
    theme_bw()

Summarizing the change in SBP by the drugs administered we have

df_syst %>% 
    group_by(drug) %>% 
    rstatix::get_summary_stats(type = "common")
Table 6.2:
drug variable n min max median iqr mean sd se ci
A systolic 15 -3 44 25   11.5 26.1  11.7  3.02 6.47
B systolic 15 3 42 28   14   25.5  11.6  3    6.43
C systolic 12 -6 29 8   12   8.75 10    2.89 6.37
D systolic 16 -5 27 13.5 13.5 13.5  9.32 2.33 4.97

From the figure and the summary above it is obvious the drug with the greatest lowering effect appears to be A, followed by B, D and C in that order. How do we know if we can say for sure that B is better than A, which in turn is better than D etc? To answer this question we perform an ANOVA.

22.2 One-way analysis of variance

The analysis of variance is a parametric test used to compare means for more than two groups. It tests the null hypothesis

H0: The population means of all the groups are the same. Ha: At least one of the population means differ from the rest.

Before we do proper ANOVA we summarise the change in SBP for the different drugs, presenting the data with the confidence interval of the mean.

It is apparent from the confidence intervals of the mean change in systolic blood pressures that the effect of drug A is quite similar to that of B. Also, both A and B are quite different from C and D. C and D on the other hand have similar effects. With this preliminary finding at the back of our minds, we fit a linear regression model as below.

df_syst %>% 
    aov(systolic ~ drug, data = .) %>% 
    broom::tidy()
Table 6.3:
term df sumsq meansq statistic p.value
drug 3 3.13e+03 1.04e+03 9.09 5.75e-05
Residuals 54 6.21e+03 115                   

From the ANOVA output above we have a p-value of the order 0.000057! This is very small indicating at least one of our means significantly differs from the rest.

22.2.1 Postestimation pairwise comparison

We have made the point that ANOVA only tell us at least one of the means is significantly different from the others. To pick up which drug’s mean effect(s) differ from the others we use post-estimation tests. An example is the Pairwise t-test.

So why don’t we just do multiple T-tests and present the results as such? This is because the t-test is not designed for this. The significance level is only set up for single comparisons not multiple meaning the p-values obtained for such comparison will be invalid. Many authors have come up with ways of correcting these p-values with the common one implemented in R listed below. The following is an extract from the help page of functionp.adjust().

“The adjustment methods include the Bonferroni correction (”Bonferroni”) in which the p-values are multiplied by the number of comparisons. Less conservative cor- reactions are also included by Holm (1979) (“Holm”), Hochberg (1988) (“Hochberg”), Hommel (1988) (“hommel”), Benjamini & Hochberg (1995) (“BH” or its alias “fdr”), and Benjamini & Yekutieli (2001) (“BY”), respectively. A pass-through option (“none”) is also included.”

Below we determine the p-values for the multiple comparisons of the various drugs using the “holm” method

df_syst %$% 
    pairwise.t.test(
        x = systolic, g = drug, data = ., p.adjust.methods = "holm"
        ) %>% 
    broom::tidy()
Table 7.4:
group1 group2 p.value
B A 0.892   
C A 0.000665
C B 0.000846
D A 0.00769 
D B 0.00863 
D C 0.502   

The analysis output generated above are p-values of pairwise comparison of the means of the decrease in SBP for the various drugs. It can be inferred that the difference in the reduction between A and B is not significant (p-value: 0.89214). In other words, the reduction obtained for the two drugs can be said to be comparable. However, there is a significant difference between both A and C, A and D, B and C, and B and D. Also there isn’t a significant difference between C and D (p-value: 0.50216). An arguably more informative way of showing these pairwise differences can be obtained in R by the use of Tukey’s post-estimation test.

df_syst %>% 
    aov(systolic ~ drug, data = .) %>% 
    TukeyHSD() %>% 
    broom::tidy()
Table 6.4:
term contrast null.value estimate conf.low conf.high adj.p.value
drug B-A 0 -0.533 -10.9 9.84 0.999   
drug C-A 0 -17.3   -28.3 -6.31 0.000626
drug D-A 0 -12.6   -22.8 -2.35 0.0101  
drug C-B 0 -16.8   -27.8 -5.78 0.000949
drug D-B 0 -12     -22.2 -1.82 0.0148  
drug D-C 0 4.75  -6.1 15.6  0.654   

The output above shows the pairwise differences, the 95%CI of the differences and Tukey’s adjusted p-values. Comparing the two methods, Holms and Tukey we observe the multiple comparison p-values differ slightly. However, the conclusions remain the same.

Conclusion: The drugs with the highest SBP lowering effect are A and B with no evidence that one does better than the other. C and D on the other hand have significantly smaller SBP lowering effects compared to A and B. Drugs C and D don’t perform significantly differently from each other.

22.3 Two-Way Analysis of Variance

In the above data, the are two possible independent variables that may explain the BP-lowering effect obtained. These are the disease and drug administered. For a Two-way analysis of variance, we primarily come up with three hypotheses:

H0: The population means of all who took the various drugs are the same.

H0: The population means of all who had the various diseases are the same.

H0: There is no significant interaction between the drug taken and the disease of the study participant in determining the mean BP change.

To evaluate this hypothesis, we perform a two-way analysis of variance with interaction.

df_syst %>% 
    aov(systolic ~ drug*disease, data = .) %>% 
    broom::tidy()
Table 6.5:
term df sumsq meansq statistic p.value
drug 3 3.13e+03 1.04e+03 9.46 5.58e-05
disease 2 419        209        1.9  0.162   
drug:disease 6 707        118        1.07 0.396   
Residuals 46 5.08e+03 110                   

The analysis output indicates there is a significant difference in the means for the various drugs but not for the various diseases present in the study subjects. Also, there is no significant interaction between the drug and the disease type. For a visual look at this effect, we plot the means below:

df_syst %>% 
    aov(systolic ~ drug*disease, data = .) %>% 
    ggeffects::ggeffect(terms = c("drug","disease")) %>% 
    plot()

The plot above indicates that the systolic blood pressure among the groups of disease conditions does not seem to vary significantly for each drug given. This is reflected in the high p-value for the interaction between the drug and disease.

22.4 Repeated measure analysis of variance

Often in research, the investigator desires to determine the change in an outcome over repeated measures, often after a specific time interval. This involves measuring the outcome in the same study participant multiple times. Since these measurements are not independent, the methodology for the analysis is not the same as that described above. When only two measurements are made, the paired T-test suffices to use here. however, if you have more than 2 measurements on the same study participant. This is where a repeated measure is used.

For this subsection, we will use the data that involves a set of hypertension patients, whose blood pressures were measured every 2 months for a year.

set.seed(7)
df_sbp <- 
    readxl::read_xlsx("./Data/sbp_longitudinal.xlsx") %>% 
    sample_n(50)

Next, we select the required variables

df_sbp_selected <-
    df_sbp %>%  
    select(sid, sbp0, sbp4, sbp8, sbp12)

And then view the data

df_sbp_selected %>% summarytools::dfSummary()
Data Frame Summary  
df_sbp_selected  
Dimensions: 50 x 5  
Duplicates: 0  

-----------------------------------------------------------------------------------------------------------
No   Variable      Stats / Values             Freqs (% of Valid)   Graph               Valid      Missing  
---- ------------- -------------------------- -------------------- ------------------- ---------- ---------
1    sid           1. AA0991KAB                1 ( 2.0%)                               50         0        
     [character]   2. AB1053KAA                1 ( 2.0%)                               (100.0%)   (0.0%)   
                   3. AC0162AGB                1 ( 2.0%)                                                   
                   4. AD0510KAB                1 ( 2.0%)                                                   
                   5. AD1208KAB                1 ( 2.0%)                                                   
                   6. AF0129APA                1 ( 2.0%)                                                   
                   7. AF0217KAB                1 ( 2.0%)                                                   
                   8. AF0347TTB                1 ( 2.0%)                                                   
                   9. AH0084TTB                1 ( 2.0%)                                                   
                   10. AJ0478APB               1 ( 2.0%)                                                   
                   [ 40 others ]              40 (80.0%)           IIIIIIIIIIIIIIII                        

2    sbp0          Mean (sd) : 141.3 (20)     28 distinct values       :               50         0        
     [numeric]     min < med < max:                                    :   :           (100.0%)   (0.0%)   
                   106 < 139 < 189                                     : . :                               
                   IQR (CV) : 25.2 (0.1)                             : : : : . :                           
                                                                   : : : : : : : : :                       

3    sbp4          Mean (sd) : 141.3 (27.1)   35 distinct values     :                 50         0        
     [numeric]     min < med < max:                                : : .               (100.0%)   (0.0%)   
                   100 < 136.5 < 215                               : : :                                   
                   IQR (CV) : 34 (0.2)                             : : : :                                 
                                                                   : : : : . :                             

4    sbp8          Mean (sd) : 145.2 (32.1)   37 distinct values     :                 50         0        
     [numeric]     min < med < max:                                . :                 (100.0%)   (0.0%)   
                   101 < 138.5 < 277                               : : :                                   
                   IQR (CV) : 38.5 (0.2)                           : : : .                                 
                                                                   : : : : :   .   .                       

5    sbp12         Mean (sd) : 138.5 (17.9)   37 distinct values     :                 50         0        
     [numeric]     min < med < max:                                . : : .             (100.0%)   (0.0%)   
                   110 < 136 < 179                                 : : : :                                 
                   IQR (CV) : 23.5 (0.1)                           : : : : . :                             
                                                                   : : : : : : :                           
-----------------------------------------------------------------------------------------------------------

Next, we need to convert the data into a long format and view the first 6 records.

df_sbp_long <- 
    df_sbp_selected %>% 
    pivot_longer(
        cols = sbp0:sbp12, values_to = "bp", names_to = "period") %>% 
    mutate(
        period = factor(period, levels = c("sbp0", "sbp4", "sbp8", "sbp12")))

df_sbp_long %>% head()
Table 13.1:
sid period bp
NG0200AGB sbp0 130
NG0200AGB sbp4 127
NG0200AGB sbp8 119
NG0200AGB sbp12 147
RA0112KAA sbp0 187
RA0112KAA sbp4 207

We first summarize the blood pressures for the four periods of measurement.

df_sbp_long %>% 
    group_by(period) %>% 
    summarize(Mean = mean(bp), SD = sd(bp))
Table 13.2:
period Mean SD
sbp0 141 20  
sbp4 141 27.1
sbp8 145 32.1
sbp12 139 17.9

We observe that there are differences between the four means. Further, we show this graphically.

df_sbp_long %>% 
ggplot(aes(x = period, y = bp, group = sid)) +
    geom_line(alpha =.1, col = "red") +
    stat_summary(
        fun.data = mean_sdl, 
        geom="line", 
        colour = "black", 
        linewidth = .8, 
        group=1, linetype = 2) +
    labs(x = "Time Blood pressure taken",
         y = "Blood pressure (mmHg)",
         title = "Variation of blood pressure over the review periods",
         subtitle = "Sampling done on all patients",
         caption = "Source: 2010 data")+
    theme_bw()

There may be an insignificant drop in BP over the period. Next, we perform a repeated measure ANOVA to determine differences between the means for the various review periods.

aov(bp ~ period + Error(sid/period), data = df_sbp_long) %>% broom::tidy()
Table 13.4:
stratum term df sumsq meansq statistic p.value
sid Residuals 49 6.2e+04  1.26e+03          
sid:period period 3 1.13e+03 377        0.924 0.431
sid:period Residuals 147 6e+04        408                 

22.5 Kruskal Wallis Test

At the beginning of this section, we noted that ANOVA is a parametric test. Therefore it requires approximately normally distributed dependent data to workmwith. So what if the data deviates significantly from normality or any of the assumptions under which the ANOVA can be used? Under those conditions we use the Kruskal-Wallis test, the non-parametric equivalent of the one-way ANOVA. This tests the null hypothesis

H0: The population distribution (or centre) of all the groups is the same.

Ha: At least one of the population distributions (or centre) differs from the rest.

Below we use this on the systolic blood pressure and drug used before

df_syst %$% kruskal.test(systolic, drug) %>% broom::tidy()
Table 13.5:
statistic p.value parameter method
20.5 0.000136 3 Kruskal-Wallis rank sum test

With the small p-value, we conclude that the centre of at least one of the effects of the drug is significantly different from the others. This conclusion is not different from that derived from the analysis of variance. At least the centre of one of the systolic blood pressure is significantly different from another

22.6 Assumptions