Home Page - http://dmcglinn.github.io/quant_methods/
GitHub Repo - https://github.com/dmcglinn/quant_methods


Overview of multivariate ordination methods

Multivariate statistical analyses are generalizations of the univariate approaches we examined in the last lesson. In the univariate case we considered a model of the form

\[y_i = b_0 + b_1 x_{i1} + \cdots + b_p x_{ip} + \varepsilon_i, \qquad i = 1, \ldots, n. \qquad \text{(eq. 1)}\]

We are interested in multivariate methods when we are no longer just trying to explain variance in a single response vector y; instead, we would like to understand the dominant patterns and sources of variation in a matrix of response variables Y whose columns are all of the same type (i.e., same units).

Often we are simply interested in whether we can represent our response matrix Y using a relatively small number of dimensions (without any reference to an explanatory matrix). For example, if Y has n samples (i.e., rows) and p descriptors (i.e., columns), then we can fully represent all the variation in Y using p hypothetical axes; however, what would a p-dimensional graph look like? Any dimensionality > 3 is difficult for the human mind to comprehend. Furthermore, is that degree of complexity even needed to see the major trends in Y? Picture trying to interpret a pairs plot with \(p(p-1)/2\) graphs (one graph for each unique pair of columns in Y).

The good news is that, in general, the information in a matrix can be reduced down to a few major axes of variation. Ordination is the term that describes this process of reducing the information needed to represent a matrix; it derives from the idea of ordinating, or putting things in order. With ordination approaches we are attempting to take a high-dimensional data matrix and explain its patterns with a small number of axes. Although this is a very general concept across scientific disciplines, the term ordination is generally only used in ecology.

There are generally considered to be two types of ordination:

  1. Indirect or unconstrained ordination, in which only a single matrix is analyzed.
  2. Direct or constrained ordination, in which one matrix is used to explain the variance of another matrix.

The figure below (Legendre and Legendre 1998, Fig. 11.1) illustrates this conceptual shift from modeling a vector y (panel b) to modeling a matrix Y. Indirect ordination does not use an explanatory matrix X (panel a), while direct ordination methods do (panel c).

Today we will examine both indirect and direct methods of ordination. In general, ordination is frequently used to explore patterns in datasets graphically; however, it can also be used to carry out hypothesis testing.

Despite their sometimes terrifying names, ordination methods have a close kinship with multiple regression (as illustrated in the figure above). One key difference is that the response variable is multivariate rather than univariate; however, for many approaches the underlying algebra is very similar between regression and ordination.

Overview of Ordination methods (adapted from http://ordination.okstate.edu/terminol.htm)

Acronym | Name | Type of analysis | Algorithm | R function
PCA | Principal Components Analysis | indirect | Eigenanalysis, SVD | stats::princomp, vegan::rda
CA | Correspondence Analysis | indirect | RA, eigenanalysis, SVD | vegan::cca
DCA | Detrended Correspondence Analysis | indirect | RA with detrending and rescaling | vegan::decorana
NMDS | Non-metric Multidimensional Scaling | indirect | Distance-based ordination, non-eigenbased | vegan::metaMDS
RDA | Redundancy Analysis | direct | Eigenanalysis, SVD | vegan::rda
CCA | Canonical Correspondence Analysis | direct | RA with regressions, eigenanalysis | vegan::cca
DCCA | Detrended Canonical Correspondence Analysis | direct | RA with regressions and detrending | NA

This presentation by Abel Valdivia reviews the types of ordination and gives examples of their graphical output.

Additionally, this key created by Mike Palmer provides a decision tree that can help guide your choice of method.

Create a community matrix

The first, very common challenge when working with multivariate analyses is constructing the multivariate matrix we wish to analyze. Essentially, a community matrix is a cross-tab structure in which each descriptor element (e.g., species identity) is a column and each sample element (e.g., site identity) is a row. A cross-tab structure is a very inefficient way to store your raw data, so in many cases we must aggregate the raw data ourselves, which can be accomplished using a for loop or other simple functions. Using the Great Smoky Mountains tree dataset that was used in HW 3, I will demonstrate how to do this with a computationally inefficient but easy-to-understand for loop. As a reminder, the metadata for the tree data is located here.

Quick note: In this lesson we will use the R package dummies to create dummy variables. This package is now defunct and no longer on the main package repository, CRAN. To install it you will need the package remotes and its function install_github; specifically, once remotes is installed, run remotes::install_github('cran/dummies').
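
For convenience, the one-time installation looks like this (a small sketch; skip it if dummies is already installed on your machine):

install.packages('remotes')              # only needed if remotes is not yet installed
remotes::install_github('cran/dummies')  # installs the archived dummies package from GitHub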

# load relevant packages and code for today's lesson
library(vegan)
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.6-4
library(dummies) # this package is no longer on cran see note above
## dummies-1.5.6 provided by Decision Patterns
source('./scripts/utility_functions.R')

# load data
tree <- read.csv('./data/treedata_subset.csv')

# create a community site x species matrix of mean species cover values
# we can do this with a for loop but it takes a while to run
uni_sp <- unique(tree$spcode)
uni_site <- unique(tree$plotID)
comm <- matrix(0, ncol=length(uni_sp), nrow=length(uni_site))
colnames(comm) <- uni_sp
rownames(comm) <- uni_site
for(i in seq_along(uni_sp)) {
    for(j in seq_along(uni_site)) {
        comm[j , i] <- mean(tree$cover[tree$spcode == uni_sp[i] &
                          tree$plotID == uni_site[j]])
    }
}
comm[1:5, 1:5]

Alternatively we could use the function tapply which is much more efficient.

# alternatively we can use a tapply function 
comm <- tapply(tree$cover,
              INDEX = list(tree$plotID, tree$spcode),
              mean)
# examine the community matrix
comm[1:5, 1:5]
##              ABIEFRA ACERNEG ACERNIG ACERRUB AILAALT
## ATBN-01-0303      NA      NA      NA       6      NA
## ATBN-01-0304      NA      NA      NA       7      NA
## ATBN-01-0305      NA      NA      NA       5      NA
## ATBN-01-0306      NA      NA      NA       7      NA
## ATBN-01-0307      NA      NA      NA       5      NA
# replace the NAs with zeros
comm <- ifelse(is.na(comm), 0, comm)
comm[1:5, 1:5]
##              ABIEFRA ACERNEG ACERNIG ACERRUB AILAALT
## ATBN-01-0303       0       0       0       6       0
## ATBN-01-0304       0       0       0       7       0
## ATBN-01-0305       0       0       0       5       0
## ATBN-01-0306       0       0       0       7       0
## ATBN-01-0307       0       0       0       5       0

Now that we’ve created our matrix, it is usually a good idea to make sure everything looks OK. The row and column sums (i.e., marginal sums) provide one reasonable, if very coarse, way to see how the data are structured.

# visually explore the cover variable between species and sites
uni_sp <- unique(tree$spcode)
sp_sum <- apply(comm, 2, sum)
site_sum <- apply(comm, 1, sum)
par(mfrow=c(2,2))
hist(sp_sum)
# generate one color per species (colorRampPalette returns a palette function)
col_ramp <- colorRampPalette(c('red', 'orange', 'blue'))
sp_cols <- col_ramp(length(uni_sp))
plot(sp_sum[order(sp_sum, decreasing=T)], type='o', col='red', lwd=2,
     xlab='Sp Rank', ylab='Sum Cover')
hist(site_sum)
plot(site_sum[order(site_sum, decreasing=T)], type='o', col='red', lwd=2,
     xlab='Site Rank', ylab='Sum Cover')

par(mfrow=c(1,1))

Above we can see that most species have small total covers (i.e., most species are rare), which is a fairly universal ecological pattern, so this seems correct. We can also see that most sites have a total cover of approximately 40, and that this distribution isn’t as highly skewed as the species marginal totals.

NOTE: These plots are not terribly helpful for understanding the variability in the community matrix; they are only useful for looking at gross aggregate properties of the matrix. That said, such aggregate summaries can be useful for spotting errors in the matrix. Let’s get our environmental variables ready to use and start conducting some actual analyses.

Create an explanatory matrix

In the tree dataset, each site has one set of environmental measurements. These are replicated across the rows of the tree data object:

##            plotID  spcode               species cover elev     tci streamdist
## 1    ATBN-01-0403 ABIEFRA         Abies fraseri     1 1660 5.70146      490.9
## 775  ATBN-01-0403 BETUALL Betula alleghaniensis     1 1660 5.70146      490.9
## 4547 ATBN-01-0403 PICERUB          Picea rubens     1 1660 5.70146      490.9
##      disturb     beers
## 1    CORPLOG 0.2244286
## 775  CORPLOG 0.2244286
## 4547 CORPLOG 0.2244286

So we just need to pull out the environmental data from a single one of those rows. In the end, we need the explanatory matrix to have the same number of rows, in the same order, as the community matrix. We could write a for loop to do this, but we’ll take the easier approach of using the function aggregate.

cols_to_keep <- c('elev', 'tci', 'streamdist', 'disturb', 'beers')
env <- aggregate(tree[ , cols_to_keep], by = list(tree$plotID), function(x) x[1])
# aggregate does have a side effect of adding a column to the output which 
# is the unique plotID in this case. We just need to move that so it is a 
# row name instead. 
row.names(env) <- env[ , 1]
env <- env[ , -1]
head(env)
##                elev      tci streamdist disturb     beers
## ATBN-01-0303  896.1 4.705636     197.00 CORPLOG 1.9906803
## ATBN-01-0304  947.3 4.447437     125.30 CORPLOG 0.8167341
## ATBN-01-0305 1027.0 6.149170     174.60 CORPLOG 0.5860782
## ATBN-01-0306  450.2 4.133772     202.50  LT-SEL 0.8601108
## ATBN-01-0307  477.0 5.587310     134.20  LT-SEL 0.1009244
## ATBN-01-0308  461.8 4.415634      31.62  LT-SEL 1.4324567
# before we can use this explanatory matrix we need to check 
# that its rows are in the same order as our response matrix

all.equal(rownames(comm), rownames(env))
## [1] TRUE
# now that we've completed that check we can rename the rows to something 
# more manageable

rownames(comm) <- 1:nrow(comm)
rownames(env) <- 1:nrow(env)

Indirect or Unconstrained Ordination

Principal Components Analysis (PCA)

Principal Components Analysis (PCA) is a useful tool for either

  1. examining correlations between the columns of a matrix
  2. potentially reducing the set of explanatory variables included in a model.
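
As a quick sketch of use (2), we could run a PCA on the scaled numeric environmental variables assembled above (the env object) to see how few axes are needed to summarize them:

# PCA of the numeric environmental variables (scaled to unit variance)
env_pca <- rda(scale(env[ , c('elev', 'tci', 'streamdist', 'beers')]))
# proportion of the total variance captured by each axis
round(env_pca$CA$eig / env_pca$tot.chi, 2)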

Note that it is Correspondence Analysis (CA, covered below), not PCA, that is also sometimes known as Reciprocal Averaging (RA).

Simplified description of PCA algorithm (from Zelený 2018):

  1. Take the matrix of samples x species (or, more generally, samples x descriptors) and place each sample in a multidimensional space in which each dimension is defined by the abundance of one species (or descriptor). The samples then form a cloud of points in this multidimensional space.
  2. Calculate the centroid of the cloud.
  3. Move the origin of the axes to this centroid.
  4. Rotate the axes so that the first axis goes through the cloud in the direction of the highest variance, the second is perpendicular to the first and goes in the direction of the second highest variance, and so on. The positions of the samples on the resulting rotated axes are the sample scores on the ordination axes.

Figure 1: PCA ordination of five samples and two species. (Fig. 9.2 from Legendre & Legendre 1998.)

Figure 2: 3D schema of PCA ordination algorithm

Check out the nice interactive visualization at Explained Visually.
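
Here is a minimal by-hand sketch of those steps using base R and the comm matrix built above; the resulting eigenvalues should match (up to numerical precision) those reported by vegan::rda below.

# steps 2-3: center each species column so the origin sits at the centroid
cen_comm <- scale(comm, center = TRUE, scale = FALSE)
# step 4: eigenanalysis of the covariance matrix rotates the axes to the
# directions of maximum variance
eig <- eigen(cov(cen_comm))
# the sample scores are the centered data projected onto the rotated axes
scores <- cen_comm %*% eig$vectors
head(eig$values)   # compare with the eigenvalues from rda(comm) below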

We’ll start by using PCA to simply examine the correlations in tree species covers that are contained in the community matrix.

tree_pca <- rda(comm)
tree_pca
## Call: rda(X = comm)
## 
##               Inertia Rank
## Total           114.3     
## Unconstrained   114.3   52
## Inertia is variance 
## 
## Eigenvalues for unconstrained axes:
##    PC1    PC2    PC3    PC4    PC5    PC6    PC7    PC8 
## 23.245 14.729  9.554  8.586  6.875  4.925  4.379  4.207 
## (Showing 8 of 52 unconstrained eigenvalues)

Like the object returned by the function lm, the output of rda is just a named list. We can get a sense of the structure of this named list using the function str.

str(tree_pca)
## List of 10
##  $ colsum        : Named num [1:52] 1.061 0.184 0.185 2.751 0.332 ...
##   ..- attr(*, "names")= chr [1:52] "ABIEFRA" "ACERNEG" "ACERNIG" "ACERRUB" ...
##  $ tot.chi       : num 114
##  $ Ybar          : num [1:734, 1:52] -0.00692 -0.00692 -0.00692 -0.00692 -0.00692 ...
##   ..- attr(*, "scaled:center")= Named num [1:52] 0.18733 0.00954 0.00681 4.05995 0.01226 ...
##   .. ..- attr(*, "names")= chr [1:52] "ABIEFRA" "ACERNEG" "ACERNIG" "ACERRUB" ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:734] "1" "2" "3" "4" ...
##   .. ..$ : chr [1:52] "ABIEFRA" "ACERNEG" "ACERNIG" "ACERRUB" ...
##   ..- attr(*, "METHOD")= chr "PCA"
##  $ method        : chr "rda"
##  $ call          : language rda(X = comm)
##  $ pCCA          : NULL
##  $ CCA           : NULL
##  $ CA            :List of 6
##   ..$ eig    : Named num [1:52] 23.24 14.73 9.55 8.59 6.88 ...
##   .. ..- attr(*, "names")= chr [1:52] "PC1" "PC2" "PC3" "PC4" ...
##   ..$ poseig : NULL
##   ..$ u      : num [1:734, 1:52] -0.0148 -0.0263 0.0329 -0.0505 -0.048 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:734] "1" "2" "3" "4" ...
##   .. .. ..$ : chr [1:52] "PC1" "PC2" "PC3" "PC4" ...
##   ..$ v      : num [1:52, 1:52] 0.04414 0.000326 -0.000369 -0.407768 -0.005665 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:52] "ABIEFRA" "ACERNEG" "ACERNIG" "ACERRUB" ...
##   .. .. ..$ : chr [1:52] "PC1" "PC2" "PC3" "PC4" ...
##   ..$ rank   : int 52
##   ..$ tot.chi: num 114
##  $ inertia       : chr "variance"
##  $ regularization: chr "this is a vegan::rda result object"
##  - attr(*, "class")= chr [1:2] "rda" "cca"

This is useful because, for example, if we wanted to pull out the eigenvalues of the analysis we can see that they are stored under $CA$eig. We can access each of the objects stored in a named list just as we would reference columns in a data.frame, using $.

tree_pca$tot.chi
## [1] 114.3032
tree_pca$CA$eig
##          PC1          PC2          PC3          PC4          PC5          PC6 
## 23.244770597 14.729455729  9.553715342  8.586350801  6.875460601  4.924553087 
##          PC7          PC8          PC9         PC10         PC11         PC12 
##  4.379409841  4.206722212  3.648091703  3.432661578  2.784056682  2.579211642 
##         PC13         PC14         PC15         PC16         PC17         PC18 
##  2.231308613  2.172385623  2.005743843  1.939263045  1.756857179  1.595933867 
##         PC19         PC20         PC21         PC22         PC23         PC24 
##  1.524147315  1.392534007  1.265278989  1.144346819  1.022333994  0.987984586 
##         PC25         PC26         PC27         PC28         PC29         PC30 
##  0.894203997  0.890931916  0.758238314  0.580826522  0.513812015  0.475536085 
##         PC31         PC32         PC33         PC34         PC35         PC36 
##  0.338373458  0.295940221  0.221512264  0.213810566  0.181667497  0.155612934 
##         PC37         PC38         PC39         PC40         PC41         PC42 
##  0.124854861  0.110463860  0.092544030  0.084629816  0.078266544  0.077549045 
##         PC43         PC44         PC45         PC46         PC47         PC48 
##  0.055542679  0.046692964  0.031058911  0.025908859  0.024458021  0.022436260 
##         PC49         PC50         PC51         PC52 
##  0.013732027  0.007551564  0.003227927  0.001201937
# the eigenvalues sum up to equal the total inertia 
sum(tree_pca$CA$eig)
## [1] 114.3032
# inertia in a PCA is equal to the total variance of the matrix because PCA
# uses the usual Euclidean distance as its metric of how different samples are:
# 1 / (n-1) * sum(x - xbar)^2 where n is sample size and xbar is the mean x value
# you can verify that this is equal to the sum of the variances (i.e., diagonal 
# elements) in the species covariance matrix 
sum(diag(cov(comm)))
## [1] 114.3032
# or if you want to do it by hand with matrix algebra
# first center the community matrix by subtracting the column means
cen_comm <- apply(comm, 2, function(x) x - mean(x))
# then compute the covariance matrix; note the divisor here is n rather than
# n - 1, so the total below is slightly smaller than the sample variance above
cov_comm <- nrow(comm)^-1 * t(cen_comm) %*% cen_comm
sum(diag(cov_comm))
## [1] 114.1474
# the ratio of the eigenvalue to the total variance is the amount of 
# variance explained by each PCA axis
round(tree_pca$CA$eig / tree_pca$tot.chi, 2)
##  PC1  PC2  PC3  PC4  PC5  PC6  PC7  PC8  PC9 PC10 PC11 PC12 PC13 PC14 PC15 PC16 
## 0.20 0.13 0.08 0.08 0.06 0.04 0.04 0.04 0.03 0.03 0.02 0.02 0.02 0.02 0.02 0.02 
## PC17 PC18 PC19 PC20 PC21 PC22 PC23 PC24 PC25 PC26 PC27 PC28 PC29 PC30 PC31 PC32 
## 0.02 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.00 0.00 0.00 0.00 
## PC33 PC34 PC35 PC36 PC37 PC38 PC39 PC40 PC41 PC42 PC43 PC44 PC45 PC46 PC47 PC48 
## 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 
## PC49 PC50 PC51 PC52 
## 0.00 0.00 0.00 0.00

We can see from the above that PCA axis 1 captures approximately 20% of the total variance in the community matrix. Let’s graph the data to get a better sense of the correlation structure.

plot(tree_pca)

biplot(tree_pca)

cleanplot.pca(tree_pca)
## Warning in arrows(0, 0, spe.sc1[, ax1], spe.sc1[, ax2], length = ahead, :
## zero-length arrow is of indeterminate angle and so skipped
## Warning in arrows(0, 0, spe.sc2[, ax1], spe.sc2[, ax2], length = ahead, :
## zero-length arrow is of indeterminate angle and so skipped

# p120-121 of Numerical Ecology in R:
# Scaling 1 = distance biplot: the eigenvectors are scaled to unit length. (1)
# Distances among objects in the biplot are approximations of their
# Euclidean distances in multidimensional space. (2) The angles among
# descriptor vectors are meaningless.
# Scaling 2 = correlation biplot: each eigenvector is scaled to the square root of
# its eigenvalue. (1) Distances among objects in the biplot are not approximations
# of their Euclidean distances in multidimensional space. (2) The angles
# between descriptors in the biplot reflect their correlations.
# scaling 2 is the default for vegan ordination graphing functions
ordiplot(tree_pca, display = 'sp')

Correspondence Analysis (CA), Detrended Correspondence Analysis (DCA), and NMDS

# each of these different indirect ordination approaches
# has different strengths and weaknesses
# Correspondence analysis (CA) ordinates samples based on weighted
# averages of the columns (i.e., the species in this case)
tree_ca <- cca(comm)

# Detrended correspondence analysis (DCA) is identical except it attempts
# to account for a well known artifact of CA known as the arch-effect 
# by detrending subsequent axes from previous axes. 
tree_dca <- decorana(comm)

# Non-metric multidimensional scaling (NMDS) is unique in that you may 
# specify one of a number of different distance metrics to use. By 
# default the Bray-Curtis distance is used by metaMDS. 
tree_mds <- metaMDS(comm)

NMDS maximizes the rank-order correlation between the original distance measures and the distances in ordination space. Points are moved iteratively to minimize “stress”, a measure of the mismatch between the two kinds of distance.
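
One useful diagnostic is a Shepard diagram, which compares the observed dissimilarities against the ordination distances. A short sketch using the tree_mds object created above and vegan’s stressplot function:

tree_mds$stress        # stress of the final NMDS configuration (lower is better)
stressplot(tree_mds)   # Shepard diagram: observed dissimilarity vs. ordination distance
ordiplot(tree_mds, display = 'sites')   # plot the site scores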

Direct or Constrained Ordination

Redundancy Analysis (RDA)

First let’s carry out an RDA, which expects a linear response of each species to the environmental variables. RDA is the most direct analog of OLS regression for a multivariate response variable.

rda_tree <- rda(comm, env)
# the above breaks b/c we have a categorical factor in env 

# vegan requires that we write out each term if we are not going to 
# convert the factor to a dummy matrix 
rda_tree <- rda(comm ~ env$elev + env$tci +
               env$streamdist + env$disturb + env$beers)
# alternatively we could use a shorthand approach
rda_tree <- rda(comm ~ . , data=env)
rda_tree
## Call: rda(formula = comm ~ elev + tci + streamdist + disturb + beers,
## data = env)
## 
##                Inertia Proportion Rank
## Total         114.3032     1.0000     
## Constrained    19.1096     0.1672    7
## Unconstrained  95.1936     0.8328   52
## Inertia is variance 
## 
## Eigenvalues for constrained axes:
##   RDA1   RDA2   RDA3   RDA4   RDA5   RDA6   RDA7 
## 12.454  3.922  0.889  0.751  0.693  0.285  0.116 
## 
## Eigenvalues for unconstrained axes:
##    PC1    PC2    PC3    PC4    PC5    PC6    PC7    PC8 
## 14.210 12.741  7.891  6.811  5.608  4.455  4.122  3.415 
## (Showing 8 of 52 unconstrained eigenvalues)
RsquareAdj(rda_tree)
## $r.squared
## [1] 0.1671835
## 
## $adj.r.squared
## [1] 0.1591536

The output above provides us with some useful information. Inertia is another name for variation, or in this case variance. "Total" refers to the total variance, "Constrained" refers to the amount of variance explained by the explanatory variables, and "Unconstrained" refers to the residual variance; Constrained + Unconstrained = Total. An \(R^2\) statistic can be derived simply as Constrained / Total, and the function RsquareAdj computes both \(R^2\) and adjusted \(R^2\). "Rank" indicates the number of axes; for the constrained portion this reflects the number of explanatory variables (with the factor disturb expanded into dummy variables). The eigenvalues are displayed for both the constrained and unconstrained axes, and in this context they indicate how much variance each axis explains.
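
As a sanity check, the \(R^2\) reported above can be recovered by hand from the inertia components stored in the fitted object (a small sketch, assuming vegan’s internal element names: $CCA$tot.chi holds the constrained inertia and $tot.chi the total):

# R^2 = constrained inertia / total inertia
rda_tree$CCA$tot.chi / rda_tree$tot.chi
# this should match the r.squared value reported by RsquareAdj(rda_tree) above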

We can plot our model result to get a sense of which variables correlate with which species along which axes.

plot(rda_tree, type='n', scaling=1)
orditorp(rda_tree, display='sp', cex=0.5, scaling=1, col='blue')
text(rda_tree, display='cn', col='red')

We interpret the plot above as we have interpreted the previous ordination plots, with one important difference: the environmental variables are now displayed, and their placement indicates their loading on the two displayed RDA axes. elev loads heavily on RDA1, indicating that this variable explains a large portion of the variance associated with axis 1. The location of a species relative to the environmental variables indicates how strongly that species is associated with a particular environmental variable. So, for example, ABIEFRA (Abies fraseri) increases in cover as elevation increases.

Canonical Correspondence Analysis (CCA)

Let’s carry out a Canonical Correspondence Analysis (CCA) as well. CCA is appropriate for modeling unimodal or hump-shaped responses to explanatory variables (rather than linear as with RDA).

cca_tree <- cca(comm ~ ., data=env)
RsquareAdj(cca_tree, 100)
## $r.squared
## [1] 0.09932996
## 
## $adj.r.squared
## [1] 0.09060108
anova(cca_tree, permutations = 999)
## Permutation test for cca under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: cca(formula = comm ~ elev + tci + streamdist + disturb + beers, data = env)
##           Df ChiSquare      F Pr(>F)    
## Model      7    0.5546 11.438  0.001 ***
## Residual 726    5.0285                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(cca_tree, by='margin', permutations = 999)
## Permutation test for cca under reduced model
## Marginal effects of terms
## Permutation: free
## Number of permutations: 999
## 
## Model: cca(formula = comm ~ elev + tci + streamdist + disturb + beers, data = env)
##             Df ChiSquare       F Pr(>F)    
## elev         1    0.2329 33.6268  0.001 ***
## tci          1    0.0335  4.8363  0.001 ***
## streamdist   1    0.0525  7.5778  0.001 ***
## disturb      3    0.0920  4.4260  0.001 ***
## beers        1    0.0341  4.9295  0.001 ***
## Residual   726    5.0285                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(cca_tree, type='n', scaling=1)
orditorp(cca_tree, display='sp', cex=0.5, scaling=1, col='blue')
text(cca_tree, display='bp', col='red')

The CCA model doesn’t explain as much variation and its plot looks slightly different, but the general take-home message has not changed much.

Hypothesis testing

Now let’s carry out hypothesis testing.

anova(rda_tree, permutations=10)
## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 10
## 
## Model: rda(formula = comm ~ elev + tci + streamdist + disturb + beers, data = env)
##           Df Variance     F  Pr(>F)  
## Model      7   19.110 20.82 0.09091 .
## Residual 726   95.194                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(rda_tree, by='margin', permutations=10)
## Permutation test for rda under reduced model
## Marginal effects of terms
## Permutation: free
## Number of permutations: 10
## 
## Model: rda(formula = comm ~ elev + tci + streamdist + disturb + beers, data = env)
##             Df Variance      F  Pr(>F)  
## elev         1    8.019 61.154 0.09091 .
## tci          1    1.597 12.176 0.09091 .
## streamdist   1    2.300 17.544 0.09091 .
## disturb      3    2.421  6.155 0.09091 .
## beers        1    1.363 10.396 0.09091 .
## Residual   726   95.194                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In a real analysis you would specify a much larger number of permutations (at least 1000). The first test examines overall model fit relative to a randomized or permuted matrix of data. The second test examines the partial effects of the individual variables included in the model.

Model Comparison

Model comparison can be carried out with direct ordination methods using nested models of different degrees of complexity, as we did with univariate lm-type models.

rda_tree_simple <- update(rda_tree, . ~ . - beers)
anova(rda_tree_simple, rda_tree)
## Permutation tests for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model 1: comm ~ elev + tci + streamdist + disturb
## Model 2: comm ~ elev + tci + streamdist + disturb + beers
##   ResDf ResChiSquare Df ChiSquare      F Pr(>F)    
## 1   727       96.557                               
## 2   726       95.194  1    1.3631 10.396  0.001 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The above test suggests that the more complex model, which includes the variable beers, is strongly supported. Note that this specific example is identical to the output for the beers term when using anova(rda_tree, by = 'margin').

Model comparison can also be undertaken using adjusted \(R^2\). This is often easier to discuss and communicate than a myriad of significance tests.
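
For example, the two nested models fit above can be compared directly on adjusted \(R^2\) (a quick sketch using RsquareAdj, which is loaded with vegan):

# compare the nested models by adjusted R^2 rather than by a permutation test
RsquareAdj(rda_tree_simple)$adj.r.squared
RsquareAdj(rda_tree)$adj.r.squared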

Variation Partitioning

Lastly, let’s carry out variance partitioning. We can use this approach to examine how much of the explained variance is due to different groups of variables. In other words, this approach is really only useful if you are interested in comparing the relative importance of one set of variables to another set of variables.

## variance partitioning

moisture <- env[ , c('elev', 'tci', 'beers', 'streamdist')]
# because the variable disturb is a factor we need to convert it into 
# a dummy matrix using the function dummies::dummy
disturb <- dummy(env$disturb)
## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts =
## FALSE): non-list contrasts argument ignored
# examine the explanatory power of each group of variables
varpart(comm, moisture, disturb)
## Warning: collinearity detected in X2: mm = 4, m = 3
## Warning: collinearity detected in cbind(X1,X2): mm = 8, m = 7
## 
## Partition of variance in RDA 
## 
## Call: varpart(Y = comm, X = moisture, disturb)
## 
## Explanatory tables:
## X1:  moisture
## X2:  disturb 
## 
## No. of explanatory tables: 2 
## Total variation (SS): 83784 
##             Variance: 114.3 
## No. of observations: 734 
## 
## Partition table:
##                      Df R.squared Adj.R.squared Testable
## [a+c] = X1            4   0.14600       0.14132     TRUE
## [b+c] = X2            3   0.04722       0.04331     TRUE
## [a+b+c] = X1+X2       7   0.16718       0.15915     TRUE
## Individual fractions                                    
## [a] = X1|X2           4                 0.11584     TRUE
## [b] = X2|X1           3                 0.01784     TRUE
## [c]                   0                 0.02547    FALSE
## [d] = Residuals                         0.84085    FALSE
## ---
## Use function 'rda' to test significance of fractions of interest
showvarparts(2)

The output indicates that the moisture group of variables has the largest individual fraction of explained variance (approximately 12%, fraction [a]), whereas the disturbance group of variables explains only approximately 2% on its own (fraction [b]). We can also see that there is not a large fraction of shared variance (fraction [c]), which indicates that the effects of the two groups of variables are somewhat independent of one another.
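
As the varpart output notes, the significance of individual fractions can be tested with partial (conditioned) ordination. A minimal sketch for fraction [a] (moisture after controlling for disturbance), using the objects created above:

# fraction [a]: variance explained by moisture after removing the disturbance effect
frac_a <- rda(comm, moisture, disturb)
anova(frac_a, permutations = 999)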