###########################################################################
###########################################################################
###########################################################################
# Introductory IPM exercises
###########################################################################
###########################################################################
###########################################################################
# some of the following code has been adapted from code shared by johan dahlgren from nicole et al. 2011. Interdependent effects of habitat quality and climate on population growth of an endangered plant. J Ecol 99 1211-1218.
# OVERVIEW
# the following document is organized into 8 sections.
# A. plots for data exploration
# B. parameter estimation for regressions
# C. decide on how to model life history
# D. make a kernel
# E. basic analyses
# F. exercises
# G. advanced exercises (if you get bored)
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# A. plots for data exploration
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# read in data. you'll have to set your own file path here. the data is a .csv file included in the workshop's dropbox folder
d=read.csv('/Users/ctg/Dropbox/Shared/Sean/ESA_2013_workshop/For\ Participants/Intro_IPM_exercises_data_ESA_2013.csv')
# you'll notice that adults are stored at the beginning of the data set with values for size (measured in 2001) and sizeNext (measured in 2002). the number of seeds and flowering status were measured in 2001.
head(d)
# and that new recruits are stored at the end and were only observed in the second survey (2002)
tail(d)
# make some plots - figure 1
par(mfrow=c(2,2),mar=c(4,4,2,1))
plot(d$size,jitter(d$surv),main='Survival') # jittered to see easier
plot(d$size,d$sizeNext,main='Growth/Shrinkage/Stasis')
plot(d$size,d$fec.seed,main='Seeds') # jittered to see easier
hist(d$sizeNext[is.na(d$size)],main='Size of Recruits')
# note that we use NAs for missing or non-applicable data rather than 0 or some other indicator because this causes them to be automatically excluded from r's regression functions.
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# B. regressions build regressions for vital rate functions
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# 0. set up parameter list for regressions
# this sets up a list of the model parameters. these parameters will be estimated and recorded below.
params=data.frame(
surv.int=NA,
surv.slope=NA,
growth.int=NA,
growth.slope=NA,
growth.sd=NA,
seed.int=NA,
seed.slope=NA,
recruit.size.mean=NA,
recruit.size.sd=NA,
establishment.prob=NA
)
# 1. survival regression
surv.reg=glm(surv~size,data=d,family=binomial())
summary(surv.reg)
params$surv.int=coefficients(surv.reg)[1]
params$surv.slope=coefficients(surv.reg)[2]
# 2. growth regression
growth.reg=lm(sizeNext~size,data=d)
summary(growth.reg)
params$growth.int=coefficients(growth.reg)[1]
params$growth.slope=coefficients(growth.reg)[2]
params$growth.sd=sd(resid(growth.reg))
# 3. seeds regression
# note that we are just pooling all individuals into this regression regardless of whether they flowered or not. a later exercise will be to explicitly model flowering probability.
seed.reg=glm(fec.seed~size,data=d,family=poisson())
summary(seed.reg)
params$seed.int=coefficients(seed.reg)[1]
params$seed.slope=coefficients(seed.reg)[2]
# 4. size distribution of recruits
# in the dataframe, recruits are those individuals who have a value for sizeNext but not for size
params$recruit.size.mean=mean(d$sizeNext[is.na(d$size)])
params$recruit.size.sd=sd(d$sizeNext[is.na(d$size)])
# 5. establishment probability
# these data represent a single year's worth of data, hence establishment probability can be estimated by dividing the number of observed recruits by the number of seeds. hence the growth/survival measurements were taken in year t which the recruit sizes were measured in year t+1.
params$establishment.prob=sum(is.na(d$size))/sum(d$fec.seed,na.rm=TRUE)
# 6. plot the models over the data - figure 2
par(mfrow=c(2,2),mar=c(4,4,2,1))
xx=seq(0,8,by=.01)
plot(d$size,d$sizeNext,main='Growth/Shrinkage/Stasis')
lines(xx,predict(growth.reg,data.frame(size=xx)),col='red',lwd=3)
plot(d$size,jitter(d$surv),main='Survival') # jittered to see easier
lines(xx,predict(surv.reg,data.frame(size=xx),type='response'), col='red',lwd=3)
plot(d$size,d$fec.seed,main='Seeds') # jittered to see easier
lines(xx,predict(seed.reg,data.frame(size=xx),type='response'), col='red',lwd=3)
hist(d$sizeNext[is.na(d$size)],main='Size of Recruits',freq=FALSE)
lines(xx,dnorm(xx,params$recruit.size.mean,params$recruit.size.sd), col='red',lwd=3)
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# C. decide how to model life history
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# each of the functions below represents one or more of the vital rates. these functions are used to build the IPM and use output (the coefficients) from the regressions developed above.
# these functions represent the modeler's decision about how to decompose the life cycle. in this very simple example, we model (1) survival, (2) growth, (3) seed number, (4) the seedling size distribution and (5) the establishment probability. these last three functions are combined in the model for fecundity below. in practice, we'd need to decide what regressions to build in part B in advance, but it's easier to digest this section after you've seen the regressions above, so fortunately we built all the right regressions already. in practice, sections B and C are iterative one might inform one another.
## vital rate functions
# 1. probability of surviving
s.x=function(x,params) {
u=exp(params$surv.int+params$surv.slope*x)
return(u/(1+u))
}
# 2. growth function
g.yx=function(xp,x,params) {
dnorm(xp,mean=params$growth.int+params$growth.slope*x,sd=params$growth.sd)
}
# 3. reproduction function
f.yx=function(xp,x,params) {
params$establishment.prob*
dnorm(xp,mean=params$recruit.size.mean,sd=params$recruit.size.sd)*
exp(params$seed.int+params$seed.slope*x)
}
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# D. make a kernel
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# We begin by defining the boundary points (b;the edges of the cells defining the kernel), mesh points (y; the centers of the cells defining the kernel and the points at which the kernel is evaluated for the midpoint rule of numerical integration), and step size (h; the widths of the cells). The integration limits (min.size and max.size) span the range of sizes observed in the data set, and then some.
# 1. boundary points b, mesh points y and step size h
# integration limits - these limits span the range of sizes observed in the data set, and then some.
min.size=.9*min(c(d$size,d$sizeNext),na.rm=T)
max.size=1.1*max(c(d$size,d$sizeNext),na.rm=T)
# number of cells in the discretized kernel
n=100
# boundary points (the edges of the cells defining the kernel)
b=min.size+c(0:n)*(max.size-min.size)/n
# mesh points (midpoints of the cells)
y=0.5*(b[1:n]+b[2:(n+1)])
# width of the cells
h=y[2]-y[1]
# 2. make component kernels
# Next, we make the kernels. The function outer() evaluates the kernel at all pairwise combina- tions of the two vectors y and y. For the numerical integration, we’re using the midpoint rule (the simplest option) to estimated the area of a rectangle under the curve. The heights of the rectangles are given by the outer function and the width of the rectangles is h. We'll plot each step along the way.
par(mfrow=c(2,3))
G=h*outer(y,y,g.yx,params=params) # growth kernel
image(t(G),main='growth kernel') # plot it
S=s.x(y,params=params) # survival
plot(y,S,type='l',main='survival') # plot it
P=G # placeholder;redefine P on the next line
for(i in 1:n) P[,i]=G[,i]*S[i] # growth/survival kernel
image(t(P),main='survival/growth kernel') # plot it
abline(0,1,lwd=3) # plot 1:1, which represents stasis
F=h*outer(y,y,f.yx,params=params) # reproduction kernel
image(t(F),main='fecundity kernel') # plot it
K=P+F # full kernel
image(t(K),main='full kernel') # plot it
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# E. basic analyses
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# 1. get lamda,v,w
(lam=Re(eigen(K)$values[1])) # should be 1.013391
w.eigen=Re(eigen(K)$vectors[,1])
stable.dist=w.eigen/sum(w.eigen)
v.eigen=Re(eigen(t(K))$vectors[,1])
repro.val=v.eigen/v.eigen[1]
# 2. compute elasticity and sensitivity matrices
v.dot.w=sum(stable.dist*repro.val)*h
sens=outer(repro.val,stable.dist)/v.dot.w
elas=matrix(as.vector(sens)*as.vector(K)/lam,nrow=n)
# 3. plot results
# you might want to save this plot for comparison with later versions
par(mfrow=c(2,3))
image(y,y,t(K), xlab="Size (t)",ylab="Size (t+1)",col=topo.colors(100), main="Kernel")
contour(y,y,t(K), add = TRUE, drawlabels = TRUE)
plot(y,stable.dist,xlab="Size",type="l",main="Stable size distribution")
plot(y,repro.val,xlab="Size",type="l",main="Reproductive values")
image(y,y,t(elas),xlab="Size (t)",ylab="Size (t+1)",main="Elasticity")
image(y,y,t(sens),xlab="Size (t)",ylab="Size (t+1)", main="Sensitivity")
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# F. exercises
# -------------------------------------------------------------------
# -------------------------------------------------------------------
# for all of the following exercises, code from above can be adapted with just a few minor changes, so you don't need to start from scratch.
# 1. make the variance of the growth regression a function of size and obtain the value of lambda. in the model above, we simply modeled the variance as constant, as seen in params$growth.sd. by looking at the growth data in figure 1 (section A) you might guess that the variance increases as a function of size. you can see this in the following plot, where we plot the absolute value of the residuals of the growth regression against size:
plot(growth.reg$model$size,abs(resid(growth.reg)),xlab='size',ylab='residual')
# incorporating size in the growth variance requires 4 steps:
# a. build a regression that includes size as a predictor of the varaince. this is most easily done using generalized least squares (gls). gls allows us to simultaneously fit a model for the expected size at t+1 and the variance. for this example, we'll model the variance in growth as an exponential function sigma.2(size)=(sigma^2)*exp[2*constant*size]. see ?varExp for more details. the gls model will estimate sigma^2 and the constant, in addition to the intercept and slope used in the mean of the growth regression. gls() is in the nlme package, which you may need to install and load using install.packages('nlme') and library(nlme).
growth.reg=gls(sizeNext~size,weights=varExp(),na.action=na.omit, data=d)
summary(growth.reg)
# plot the model, just to be sure it's reasonable
plot(d$size,d$sizeNext,main='Growth/Shrinkage/Stasis')
lines(xx,predict(growth.reg,data.frame(size=xx)),col='red',lwd=3)
# b. modify the params data frame (where we store all the coefficients used to build the ipm) to have coefficients called growth.sd.int and growth.sd.slope and set the values of those coefficients equal to the appropriate coefficients from part a.
params$growth.int=coefficients(growth.reg)[1]
params$growth.slope=coefficients(growth.reg)[2]
params$growth.sigma2=summary(growth.reg)$sigma^2
params$growth.sigma2.exp=as.numeric(growth.reg$modelStruct$varStruct)
#c. modify the growth function, g.xy, to allow the standard deviation argument, sd, to be a function of size. this will follow a similar pattern to the argument for mean. it's probably easiest not to rename the function g.yx, otherwise you'll have modify the code in section D.2.
g.yx=function(xp,x,params) {
dnorm(xp,mean=params$growth.int+params$growth.slope*x, sd=sqrt(params$growth.sigma2*exp(2*params$growth.sigma2.exp*x)))
}
# d. rerun the code from sections D and E to obtain lambda. the value should be 0.980247. plot the P (survival/growth) matrix and check that it reasonably predicts the growth observations.
image(y,y,t(P))
points(d$size,d$sizeNext)
# 2. add a quadratic term to growth regression build in question 1 and obtain the value of lambda (it should be 0.9777275). you should follow steps analogous to those in question 1. to tell the growth regression to use a quadratic term, use I(size^2) as a predictor variable. as in question 1d, plot the P matrix to check that the model is reasonable.
growth.reg=gls(sizeNext~size+I(size^2),weights=varExp(), na.action=na.omit, data=d)
summary(growth.reg)
params$growth.int=coefficients(growth.reg)[1]
params$growth.slope=coefficients(growth.reg)[2]
params$growth.sqrd=coefficients(growth.reg)[3]
params$growth.sigma2=summary(growth.reg)$sigma^2
params$growth.sigma2.exp=as.numeric(growth.reg$modelStruct$varStruct)
g.yx=function(xp,x,params) {
dnorm(xp,
mean=params$growth.int+params$growth.slope*x+params$growth.sqrd*x^2,
sd=sqrt(params$growth.sigma2*exp(2*params$growth.sigma2.exp*x)))
}
image(y,y,t(P))
points(d$size,d$sizeNext)
# 3. incorporate flowering probability and calculate lambda. consider the case where the plant does not flower each year, perhaps dependent on stress deriving from some environmental factor. you'll note from the seed production data in figure 1 that there are very different patterns in seed production dependent on size; smaller plants produce no seeds because they don't flower. for example, larger plants may be more likely to flower because they can maintain a sufficient amount of stored resources to buffer against environmental variation. we might not know what triggers the decision to flower, but we can at least describe its size dependence by including the probability of flowering in the model. the flowering probability function will enter the reproduction kernel (called f.xy) above. the version of f.xy used above simply assumes that all plants flower, so including the flowering probability function will reduce this number. your task is as follows:
# a. write the flowering probability function. flowering probability is most easily modeled using logistic regression, so this will be very similar to the survival function (s.xy) above in section C.1. call your function p.flower.x(). when writing this function, note that you'll store the slope and intercept of the flowering regression as params$flower.int and param$flower.slope, respectively, as discussed below.
# b. modify the reproduction function (f.xy) to include the flowering probability function. this just amounts to multiplying the argument in f.xy by p.flower.x().
# c. build a logistic regression for flowering probability. see the data frame for binary flowering data under d$fec.flower. the code for estimating the survival regression can be easily modified to accomodate the flowering probability model since both should be logistic regressions on size. from this regression, you should obtain a slope and intercept, which should be stored in the param vector as as params$flower.int and param$flower.slope, respectively.
# d. update the regression for seed number to include only the individuals that flowered. this is easily done by using the code for the seed regression in section B.3 and changing the data argument to data=d[d$fec.flower==1,] .
# e. build the kernel, and plot a picture of it using your new function for f.yx. this should use code from section D and E without modification. does the peak of the fecundity kernel increase or decrease or decrease compared to your kernel from exercise 2?
# f. obtain the asymptotic population growth rate (lambda) and a plot of hte elasticity with respect to lambda. this should use code from sections E and E without modification. lambda should equal 0.9710969.