library(ggplot2) # for general plotting
library(car)     # for ANOVA (Type II used, better than Type I when there is an unbalanced design)

Information of data source

Enzyme inhibitors were used to treat mites, fecundity and enzyme activity following enzyme inhibition were assessed.

Inhibitors:

PBO - CYP (cytochrome P450) inhibitor

DEM - GST inhibitor

DEF - esterase inhibitor

Read in the data and view structure to identify any issues in data formatting

DEF.data <- read.csv("~/Lab Stuff/Adapted mites/Tomato/Inhibitor assay/Thesis - enzyme activity/DEF/TA DEF R data.csv", header = TRUE)

str(DEF.data)
## 'data.frame':    19 obs. of  3 variables:
##  $ Mite.Strain: Factor w/ 2 levels "TU","TU-A": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Treatment  : Factor w/ 2 levels "Control","DEF": 1 1 1 1 1 2 2 2 2 2 ...
##  $ Activity   : num  824 878 863 805 801 ...

Formulate hypothesis

H0: There will be no difference in enzyme activity.

HA: Treatment with detoxification enzyme inhibitor will result in degreased enzyme activity. ## Conduct data exploration

Outliers in the response variable (Activity) within explanatory variables (Treatment, Mite.Strain).

ggplot(DEF.data, aes(x = Treatment, y = Activity)) + geom_boxplot()+ theme_classic()

ggplot(DEF.data, aes(x = Mite.Strain, y = Activity)) + geom_boxplot() + theme_classic()

No outliers.

Collinearity of the explanatory variables

Does not apply, all explanatory variables are categorical/factorial.

Spatial/temporal or other hierarchical aspects of sampling design

No.

Interactions (is the quality of the data good enough to include them?)

A Mite.Strain by Treatment interaction will be included to check if the different mite strains responded differently to treatment.

Samples collected from one trial, so no Trial main effect to include in model.

Zero inflation in Y

No

Are categorical covariates balanced?

summary(DEF.data)
##  Mite.Strain   Treatment     Activity     
##  TU  :10     Control: 9   Min.   : 117.0  
##  TU-A: 9     DEF    :10   1st Qu.: 174.9  
##                           Median : 275.0  
##                           Mean   : 548.2  
##                           3rd Qu.: 870.8  
##                           Max.   :1144.5

No, but close.

Apply model

# fit linear model and display model fit information and ANOVA table
m <- lm(Activity ~ Mite.Strain + Treatment + Mite.Strain:Treatment, data = DEF.data)
summary(m)
## 
## Call:
## lm(formula = Activity ~ Mite.Strain + Treatment + Mite.Strain:Treatment, 
##     data = DEF.data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -48.80 -29.57  -1.69  23.24  53.07 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    834.43      15.15  55.091  < 2e-16 ***
## Mite.StrainTU-A                265.96      22.72  11.706 6.06e-09 ***
## TreatmentDEF                  -687.83      21.42 -32.111 3.04e-15 ***
## Mite.StrainTU-A:TreatmentDEF  -190.59      31.23  -6.104 2.02e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 33.87 on 15 degrees of freedom
## Multiple R-squared:  0.9942, Adjusted R-squared:  0.9931 
## F-statistic: 862.4 on 3 and 15 DF,  p-value: < 2.2e-16
Anova(m)
## Anova Table (Type II tests)
## 
## Response: Activity
##                        Sum Sq Df  F value    Pr(>F)    
## Mite.Strain            128661  1  112.164 2.336e-08 ***
## Treatment             2854755  1 2488.707 < 2.2e-16 ***
## Mite.Strain:Treatment   42735  1   37.255 2.019e-05 ***
## Residuals               17206 15                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Calculate effect size and display
result.anova<-Anova(m)
ss<-result.anova$"Sum Sq"   ##ss = sum of squares
pes<-ss/(ss+ss[length(ss)])  ##pes = partial e squared
pes[length(pes)]<-""
result.anova$"Part E Sq"<-pes
result.anova
## Anova Table (Type II tests)
## 
## Response: Activity
##                        Sum Sq Df  F value     Pr(>F) Part E Sq
## Mite.Strain            128661  1  112.164 2.3400e-08   0.88204
## Treatment             2854755  1 2488.707 0.0000e+00   0.99401
## Mite.Strain:Treatment   42735  1   37.255 2.0186e-05   0.71295
## Residuals               17206 15
# perform post-hoc Tukey-Kramer test of contrasts
TukeyHSD(aov(Activity ~ Mite.Strain + Treatment + Mite.Strain:Treatment, data = DEF.data))
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Activity ~ Mite.Strain + Treatment + Mite.Strain:Treatment, data = DEF.data)
## 
## $Mite.Strain
##             diff     lwr      upr   p adj
## TU-A-TU 121.8678 88.6991 155.0365 1.1e-06
## 
## $Treatment
##                  diff       lwr       upr p adj
## DEF-Control -775.1195 -808.2882 -741.9509     0
## 
## $`Mite.Strain:Treatment`
##                              diff         lwr       upr     p adj
## TU-A:Control-TU:Control  265.9638   200.48202  331.4455 0.0000000
## TU:DEF-TU:Control       -687.8300  -749.56677 -626.0932 0.0000000
## TU-A:DEF-TU:Control     -612.4560  -674.19277 -550.7192 0.0000000
## TU:DEF-TU-A:Control     -953.7938 -1019.27548 -888.3120 0.0000000
## TU-A:DEF-TU-A:Control   -878.4198  -943.90148 -812.9380 0.0000000
## TU-A:DEF-TU:DEF           75.3740    13.63723  137.1108 0.0146624
# plot interactions
interaction.plot(DEF.data$Mite.Strain, DEF.data$Treatment, DEF.data$Activity, type="l", leg.bty="o", leg.bg="grey95", lwd=2, ylab="Activity", xlab="Mite.Strain", main="Mite.Strain:Treatment")

Validate model

DEF.data$m.fit <- fitted(m)    # fitted values
DEF.data$m.res <- rstandard(m) # Pearson residuals

Residual distribution / Overdispersion

We assumed normal residuals. This is the least important regression assumption but its can be tested with a qq plot.

ggplot(DEF.data, aes(sample = m.res)) + geom_qq() +
  geom_abline(intercept = 0, slope = 1) + theme_classic()

Nice.

Residuals vs fitted values

Testing for:

Linearity - there should be no curvilinear pattern in the residuals.

Equal variance - the vertical spread of the residuals should be constant across all fitted values.

ggplot(DEF.data, aes(x = m.fit, y = m.res)) + 
  geom_point() + geom_hline(yintercept = 0) + geom_smooth() + theme_classic()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Looks OK.

Residuals vs explanatory variables

Should be centered around 0, if not then model requires another explanatory variable(s), to account for observed variation.

ggplot(DEF.data, aes(x = Treatment, y = m.res)) + 
  geom_boxplot() + theme_classic() + geom_hline(yintercept = 0) 

ggplot(DEF.data, aes(x = Mite.Strain, y = m.res)) + 
  geom_boxplot() + theme_classic() + geom_hline(yintercept = 0) 

ggplot(DEF.data, aes(x = Mite.Strain:Treatment, y = m.res)) + 
  geom_boxplot() + theme_classic() + geom_hline(yintercept = 0) 

Nice.

Model is valid and interpretation of ANOVA is good.

sessionInfo()
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252   
## [3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C                   
## [5] LC_TIME=English_Canada.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] car_3.0-3     carData_3.0-2 ggplot2_3.1.1
## 
## loaded via a namespace (and not attached):
##  [1] zip_2.0.2         Rcpp_1.0.1        cellranger_1.1.0 
##  [4] pillar_1.4.1      compiler_3.6.0    plyr_1.8.4       
##  [7] forcats_0.4.0     tools_3.6.0       digest_0.6.19    
## [10] evaluate_0.14     tibble_2.1.1      gtable_0.3.0     
## [13] pkgconfig_2.0.2   rlang_0.3.4       openxlsx_4.1.0   
## [16] curl_3.3          yaml_2.2.0        haven_2.1.0      
## [19] xfun_0.7          rio_0.5.16        withr_2.1.2      
## [22] stringr_1.4.0     knitr_1.23        hms_0.4.2        
## [25] grid_3.6.0        data.table_1.12.2 readxl_1.3.1     
## [28] foreign_0.8-71    rmarkdown_1.13    magrittr_1.5     
## [31] scales_1.0.0      htmltools_0.3.6   abind_1.4-5      
## [34] colorspace_1.4-1  labeling_0.3      stringi_1.4.3    
## [37] lazyeval_0.2.2    munsell_0.5.0     crayon_1.3.4