R provides an excellent environment for carrying out numerical experiments or simulations. These can be very helpful for increasing mathematical competence, testing hypotheses, assessing statistical power, and generating theoretical expectations.
We've already experimented a bit with simulations in this course:
In the first example listed above, the simulation was designed to inform our understanding of regression and to examine how sensitive it is to violations of its assumptions. In examples 2 and 3, the role of the simulation was to generate a null model against which an observed pattern could be tested.
Here I present a simulation of a theoretical model, with the aim of better understanding both the model itself and a broader concept known as chaos theory.
I was inspired to code this in R after watching this video by Veritasium: This equation will change how you see the world.
First we are going to define a simple theoretical model of population growth that has a negative feedback. In other words, as the population grows its rate of increase decreases, for example as it approaches a carrying capacity. This kind of growth model is known as a logistic growth model, and it contrasts with the exponential model of growth, in which the population continues to grow without bound. The logistic model can be presented mathematically as:
\[\dfrac{dN}{dt} = rN(1-N)\] where N is population size, t is time, and r is the intrinsic per capita rate of population growth. In this formulation, without loss of generality, the carrying capacity of the population is implicitly defined as equal to 1.
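When we code this model up we will iterate it in discrete time steps, so the update that the functions below actually perform is the difference-equation form of the same model:
\[N_{t+1} = N_t + r N_t (1 - N_t)\]
It is this discrete-time version of logistic growth whose behavior we simulate below.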
Let's code this simple model up in R and then examine its behavior with a simulation.
# simple model of logistic growth
dNt <- function(r, N) r * N * (1 - N)
# iterate growth through time
Nt <- function(r, N, t) {
    for (i in 1:(t - 1)) {
        # population at next time step is population at current time + pop growth
        N[i + 1] <- N[i] + dNt(r, N[i])
    }
    N
}
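As a quick sanity check (the values here are arbitrary and just for illustration), calling Nt() returns a trajectory of t abundances; starting below the carrying capacity, it should climb toward 1:
# example call: 5 time steps starting at N = 0.5 with r = 0.1
Nt(r = 0.1, N = 0.5, t = 5)
# returns a numeric vector of 5 abundances increasing toward 1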
Now let's examine what happens in this model if we run it through time for different starting abundance values.
t <- 100
r <- 0.1
# let's consider 4 different starting abundances (i.e., N(t=0) values)
Nt0 <- c(0.1, 0.5, 1.5, 2)
par(mfrow = c(2, 2))
for (i in seq_along(Nt0)) {
    plot(1:t, Nt(r, Nt0[i], t), type = 'l', xlab = 'time',
         ylab = 'Population size', main = paste('N(t=0) =', Nt0[i]),
         ylim = c(0, 2))
    abline(h = 1, lty = 2, col = 'grey')
}
So what we learn from this is that if a population starts below the carrying capacity (grey dashed line) it will increase toward it, and if it starts above the carrying capacity it will decrease toward it. We also learn that the rate of increase or decrease depends on how far from the equilibrium the population is at the beginning of the time series.
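As a minimal numerical check of this convergence (simply reusing the r, t, and Nt0 values defined above), we can pull out the final abundance of each trajectory:
# final abundance for each starting value; all should end up close to 1
sapply(Nt0, function(N0) tail(Nt(r, N0, t), 1))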
You can play around with an interactive version of the logistic population growth model that we examined above on this Shiny app (Source code).
OK, so all is well in the world of logistic population growth: given enough time, all of the populations eventually hit the carrying capacity and do not change from that point on. Therefore, if we wanted to predict long-term (i.e., equilibrium) abundance, this model would seem to suggest that it should be 1 for all of these populations, because they all share the same carrying capacity of 1.
To examine whether this is actually true, let's vary the other parameter of the model that we can adjust: the rate of population growth (r).
Question: do all of the populations go to the same expected population equilibrium (i.e., the carrying capacity, which is 1 in this specific case)?
To test this we will examine a range of r values from 0.01 to 3. We'll set the starting population size to 0.5, although where you actually start N ends up not making a difference for this specific question. In the approach laid out below we will assume that equilibrium is reached by the end of the first half of the time series, so only the second half is used. This seems a pretty reasonable assumption given what we learned above, and we'll run the simulation 10 times longer (1000 time steps) to make extra sure that the model has had time to reach equilibrium.
# set starting conditions and amount of time
t <- 1000
r <- seq(0.01, 3, .01)
Nt0 <- 0.5
# compute the population sizes across the times
e <- sapply(r, function(r) Nt(r, Nt0, t))
# only use 2nd half of times presuming those will be at equilibrium
thalf <- round(t/2)
e <- e[thalf:t, ]
t <- nrow(e)
maxE <- max(as.vector(e))
# plot simulation results
ptsize <- 0.25
plot(rep(r[1], t), e[ , 1], ylim = c(0, maxE), xlim = range(r),
     cex = ptsize, xlab = 'Population growth rate (r)',
     ylab = 'Equilibrium abundance')
for (i in seq_along(r)) {
    points(rep(r[i], t), e[ , i], cex = ptsize)
}
abline(h = 1, lty = 2, col = 'grey')
What we see above was quite unexpected to the ecologist Robert May, who first discovered this phenomenon (May 1976). Essentially, what this figure tells us is that for population growth rates above 2 the population does not necessarily settle at the equilibrium where we expect it (the dashed grey line at 1). Instead, the equilibrium abundance bifurcates into two values, and those bifurcate further and further until essentially any abundance value is possible at r values greater than about 2.6.
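One way to see this period doubling numerically (a rough sketch that reuses the e and r objects from the simulation above; the coarse rounding means the counts are only approximate, particularly near the bifurcation points where convergence is slow) is to count how many distinct abundance values each population visits among the retained time steps:
# approximate number of distinct abundance values visited at each r
n_states <- apply(round(e, 3), 2, function(x) length(unique(x)))
plot(r, n_states, type = 'h', xlab = 'Population growth rate (r)',
     ylab = 'Number of distinct abundances (approx.)')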
The important thing to remember about the result above is that this model is completely deterministic: there is no error or noise in it, and yet its long-term behavior is effectively unpredictable.
As might be imagined, this unpredictable behavior from one of the simplest and most canonical models of population growth shook ecology to its core, and the reverberations are still being felt over 40 years later.
This simple example demonstrates some of R's power to explore the parameter space of models and to uncover insights that would require considerably more pure mathematical expertise to obtain with pen and paper alone.
As noted above, the model we just examined was purely deterministic, meaning that it had no error or noise.
How would you change the following code chunk so that the model allows for additive process error?
What distribution will error in your model take (e.g., Gaussian, Log Normal)?
You may want to give the error term a default value so that downstream code that does not specify it does not break.
Can you reproduce the first figure in the lesson but now with stochasticity? (One possible way to modify the function is sketched after the code chunk below.)
# simple model of logistic growth
dNt <- function(r, N) r * N * (1 - N)
# iterate growth through time
Nt <- function(r, N, t) {
    for (i in 1:(t - 1)) {
        # population at next time step is population at current time + pop growth
        N[i + 1] <- N[i] + dNt(r, N[i])
    }
    N
}
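Here is one possible sketch of an answer, not the only one: it assumes additive Gaussian process error controlled by a standard deviation argument (err_sd is a name chosen here for illustration) that defaults to 0, so existing deterministic calls are unaffected. A log-normal or other error distribution would be an equally reasonable choice.
# one possible sketch: logistic growth with additive Gaussian process error
# (err_sd defaults to 0 so the deterministic behavior above is unchanged)
dNt <- function(r, N) r * N * (1 - N)

Nt <- function(r, N, t, err_sd = 0) {
    for (i in 1:(t - 1)) {
        # next population = current population + growth + random process error
        N[i + 1] <- N[i] + dNt(r, N[i]) + rnorm(1, mean = 0, sd = err_sd)
    }
    N
}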