\newcommand{\dlow}{d_{\text{\small low}}} \newcommand{\dhigh}{d_{\text{\small high}}} \newcommand{\dnull}{d_0}
<>=
## source("plotCI.R") ## ??
library(plotrix)
library(emdbook) ## for dzinbinom and rzinbinom
library(bbmle)
library(MASS)
source("chapskel.R")
@
\section*{Summary}
This chapter introduces techniques and ideas related to simulating ecological patterns. Its main goals are: (1) to show you how to generate patterns you can use to sharpen your intuition and test your estimation tools; and (2) to introduce statistical power and related concepts, and show you how to estimate statistical power by simulation. This chapter and the supplements will also give you more practice working with \R.
\section{Introduction}
Chapters~\ref{chap:determ} and \ref{chap:stoch} gave a basic overview of functions to describe deterministic patterns and probability distributions to describe stochastic patterns. This chapter will show you how to use stochastic simulation to understand and test your data. Simulation is sometimes called \emph{forward} modeling, to emphasize that you pick a model and parameters and work forward to predict patterns in the data. Parameter estimation, or \emph{inverse} modeling (the main focus of this book), starts from the data and works backward to choose a model and estimate parameters.
Ecologists often use simulation to explore the patterns that emerge from ecological models. Often they use theoretical models without accompanying data, in order to understand qualitative patterns and plan future studies. But even if you have data and models in hand, you might want to start by simulating your system. You can use simulations to explore the functions and distributions you chose to quantify your data. If you can choose parameters that make the simulated output from those functions and distributions approximate your data, you can confirm that the models are reasonable --- and simultaneously find a rough estimate of the parameters.
You can also use simulated ``data'' from your system to test your estimation procedures. Chapters~6--8 will show you how to estimate parameters; in this chapter I'll work with more ``canned'' procedures like nonlinear regression. Since you never know the true answer to an ecological question --- you only have imperfect measurements with which you're trying to get as close to the answer as possible --- simulation is the only way to test whether you can correctly estimate the parameters of an ecological system. It's always good to test such a best-case scenario, where you know that the functions and distributions you're using are correct, before you proceed to real data. \emph{Power analysis} is a specific kind of simulation testing where you explore how large a sample size you would need to get a reasonably precise estimate of your parameters. You can also use power analysis to explore how variations in experimental design would change your ability to answer ecological questions.
\section{Stochastic simulation}
Static ecological processes, where the data represent a snapshot of some ecological system, are easy to simulate\footnote{Dynamic processes are more challenging. See Chapter~\ref{chap:dyn}.}. For static data, we can use a single function to simulate the deterministic process and then add heterogeneity. Often, however, we will chain together several different mathematical functions and probability distributions representing different stages in an ecological process to produce surprisingly complex and rich descriptions of ecological systems.
I'll start with three simple examples that illustrate the general procedure, and then move on to two slightly more in-depth examples.
\subsection{Simple examples}
\subsubsection{Single groups}
Figure~\ref{fig:sim1} shows the results of two simple simulations, each with a single group and single continuous covariate.
\begin{figure}
<>=
op=par(mfrow=c(1,2))
par(lwd=2,bty="l",las=1,cex=1.5)
x = 1:20
a =2; b=1
y_det = a+b*x
y = y_det+rnorm(20,sd=2)
plot(x,y)
abline(lm(y~x),lty=2)
abline(a,b)
legend("topleft",c("true","best fit"),
   lty=1:2,bty="n")
## b = 1
a = 20
k = 5
n=25
xmax=5
axmax=5
x= runif(n,min=0,max=xmax)
y_det= a*b/(b+x)
y = rnbinom(n,mu=y_det,size=k)
plot(x,y,xlim=c(0,axmax),axes=FALSE)
axis(side=1,at=0:5)
axis(side=2)
box()
curve(a*b/(b+x),add=TRUE,lty=1,to=xmax)
#mfun = function(a,b,k) {
#  -sum(dnbinom(y,mu=a/(b+x),size=k,log=TRUE))
#}
m1 = mle2(y~dnbinom(mu=a*b/(b+x),size=k),
   start=list(a=20,b=1,k=1))
curve(coef(m1)["a"]*coef(m1)["b"]/(coef(m1)["b"]+x),add=TRUE,lty=2,to=xmax)
legend("topright",
   c("true","best fit"),
   lty=1:2,
   bty="n")
par(op)
@
\caption{Two simple simulations: a linear function with normal errors ($Y \sim \mbox{Normal}(a+bx,\sigma^2)$), and a hyperbolic function with negative binomial errors ($Y \sim \mbox{NegBin}(\mu=a b/(b+x),k)$).}
\label{fig:sim1}
\end{figure}
The first simulation (Figure~\ref{fig:sim1}a) is a linear model with normally distributed errors. It might represent productivity as a function of nitrogen concentration, or predation risk as a function of predator density. The mathematical formula is $Y \sim \mbox{Normal}(a+bx,\sigma^2)$, specifying that $Y$ is a random variable drawn from a normal distribution with mean $a+bx$ and variance $\sigma^2$. The symbol $\sim$ means ``is distributed according to''. This model can also be written as $y_i=a+bx_i+\varepsilon_i$, $\varepsilon_i \sim N(0,\sigma^2)$, specifying that the $i$\textsuperscript{th} value of $Y$, $y_i$, is equal to $a+b x_i$ plus a normally distributed error term with mean zero. I will always use the first form because it is more general: normally distributed error is one of the few kinds that can simply be added onto the deterministic model in this way. The two lines on the plot show both the theoretical relationship between $y$ and $x$ and the best-fit line (by linear regression, \code{lm(y\~{}x)}; Section~\ref{sec:linreg}). The lines differ slightly because of the randomness incorporated in the simulation.
A few lines of \R\ code will run this simulation. Set up the values of $x$, and specify values for the parameters $a$ and $b$:
<<>>=
x = 1:20
a =2; b=1
@
Calculate the deterministic part of the model:
<<>>=
y_det = a+b*x
@
Pick 20~random normal deviates with the mean equal to the deterministic equation and $\sigma=2$:
<<>>=
y=rnorm(20,mean=y_det,sd=2)
@
(you could also specify this as \verb!y = y_det+rnorm(20,sd=2)!, corresponding to the additive model $y_i=a+bx_i+\varepsilon_i$, $\varepsilon_i \sim N(0,\sigma^2)$ (the \code{mean} parameter is zero by default). However, the additive form works only for the Normal, and not for most of the other distributions we will be using).
The second simulation uses hyperbolic functions ($y=ab/(b+x)$) with negative binomial error: in symbols, $Y \sim \mbox{NegBin}(\mu=a b/(b+x),k)$. The function is parameterized so that $a$ is the intercept term (when $x=0$, $y=a b/b=a$).
This simulation might represent the decreasing fecundity of a species with increasing population density: the hyperbolic function is a natural expression of the decreasing quantity of a limiting resource per individual. In this case, we cannot express the model as the deterministic function ``plus error''. Instead, we have to incorporate the deterministic model as a control on one of the parameters of the error distribution---in this case, the mean $\mu$. (Although the negative binomial is a discrete distribution, its parameters $\mu$ and $k$ are continuous.) Ecological models typically describe the differences in the mean among groups or as covariates change, but we could also allow the variance or the shape of the distribution to change.
The \R\ code for this simulation is easy, too. Define the parameters:
<<>>=
a = 20
b = 1
k = 5
@
How you simulate the $x$ values depends on the experimental design you are trying to simulate. In this case, we choose 50~$x$ values randomly distributed between 0 and 5 to simulate a study where the samples are chosen from naturally varying sites, in contrast to the previous simulation where $x$ varied systematically (\code{x=1:20}), simulating an experimental or observational study that samples from a gradient in the predictor variable $x$.
<<>>=
x= runif(50,min=0,max=5)
@
Now we calculate the deterministic mean \code{y\_det}, and then sample negative binomial values with the appropriate mean and overdispersion:
<<>>=
y_det= a*b/(b+x)
y = rnbinom(50,mu=y_det,size=k)
@
\subsubsection{Multiple groups}
Ecological studies typically compare the properties of organisms in different groups (e.g. control and treatment, parasitized and unparasitized, high and low altitude). Figure~\ref{fig:sim2} shows a simulation that extends the hyperbolic simulation above to compare the effects of a continuous covariate in two different groups (species in this case). Both groups have the same overdispersion parameter $k$, but the hyperbolic parameters $a$ and $b$ differ:
\begin{equation}
Y \sim \mbox{NegBin}(\mu=a_i b_i/(b_i+x),k)
\end{equation}
where $i$ is 1 or 2 depending on the species of an individual. Suppose we still have 50~individuals, but the first~25 are species~1 and the second~25 are species~2. We use \code{rep} to set up a factor that describes the group structure (the \R\ command \code{gl} is also useful for more complicated group assignments):
<<>>=
g = factor(rep(1:2,each=25))
@
Define vectors of parameters, each with one element per species, plus a single parameter $k$, since the two species share the same degree of overdispersion in this case:
<<>>=
a=c(20,10)
b=c(1,2)
k=5
@
\R's vectorization makes it easy to incorporate different parameters for different species into the formula, by using the group vector \code{g} to specify which element of the parameter vectors to use for any particular individual.
<<>>=
y_det = a[g]*b[g]/(b[g]+x)
y = rnbinom(50,mu=y_det,size=k)
@
\begin{figure}
<>=
##
## b = c(1,2)
## a = c(20,10)
## k = 5
## g = rep(1:2,each=25)
## x= runif(50,min=0,max=5)
## y_det= a[g]*b/(b+x)
## y = rnbinom(50,mu=y_det,size=k)
m2 = mle2(y~dnbinom(mu=a*b/(b+x),size=k),
   parameters=list(a~g,b~g),
   start=list(a=20,b=1,k=1))
axmax=8
xmax=5
op=par(lwd=2,las=1,mgp=c(3,1,0),bty="l",cex=1.5)
plot(x,y,pch=as.numeric(g),xlim=c(0,axmax),axes=FALSE,lwd=1)
axis(side=1,at=0:5)
axis(side=2)
box()
curve(a[1]*b[1]/(b[1]+x),add=TRUE,col="gray",to=xmax)
curve(a[2]*b[2]/(b[2]+x),add=TRUE,lty=2,col="gray",to=xmax)
c2 = coef(m2)
curve(c2[1]*c2[3]/(c2[3]+x),add=TRUE,lty=1,to=xmax)
curve((c2[1]+c2[2])*(c2[3]+c2[4])/(c2[3]+c2[4]+x),add=TRUE,lty=2,to=xmax)
legend("topright",
   outer(c("data","true","best fit"),
      c("(sp. 1)","(sp. 2)"),paste),
   lty=c(NA,1,1,NA,2,2),
   lwd=c(1,2,2,1,2,2),
   col=rep(c("black","gray","black"),2),
   pch=c(1,NA,NA,2,NA,NA),cex=0.8,bty="n")
par(op)
@
\caption{Simulation results from a hyperbolic/negative binomial model with groups differing in both intercept and slope: $Y \sim \mbox{NegBin}(\mu=a_i b_i/(b_i+x),k)$. Parameters: $a=\{20,10\}$, $b=\{1,2\}$, $k=5$.}
\label{fig:sim2}
\end{figure}
\subsection{Intermediate examples}
\subsubsection{Reef fish settlement}
<>=
set.seed(1001)
N = 603
a = 0.696
b = 9.79
mu=25.32
zprob=0.123
k=0.932
recrprob = function(S) { a/(1+(a/b)*S) }
settlers = rzinbinom(N,mu=mu,size=k,zprob=zprob)
recr = rbinom(N,prob=recrprob(settlers),size=settlers)
@
\begin{figure}
<>=
op = par(mfrow=c(1,2),mar=c(5,4,2,0.2),lwd=2,las=1,cex=1.5,bty="l")
hist(settlers,breaks=40,col="gray",ylab="Frequency",xlab="Settlers",main="")
plot(settlers,recr,xlab="Settlers",ylab="Recruits")
curve(a*x/(1+(a/b)*x),add=TRUE)
par(op)
@
\caption{Damselfish recruitment: (a) distribution of settlers; (b) recruitment as a function of settlement density.}
\label{fig:schmitt}
\end{figure}
The damselfish settlement data from \cite{Schmitt+1999} (p.~\pageref{sec:damsel}) include random variation in settlement density (the density of larvae arriving on a given anemone) and random variation in density-dependent recruitment (number of settlers surviving for 6~months on an anemone). To simulate the variation in settlement density I took random draws from a zero-inflated negative binomial (p.~\pageref{zinbinom}), although a non-inflated negative binomial, or even a geometric distribution (i.e. a negative binomial with $k=1$), might be sufficient to describe the data. Schmitt \emph{et al.} modeled density-dependent recruitment with a Beverton-Holt curve (equivalent to the Michaelis-Menten function). I have simulated this curve with binomial error (for survival of recruits) superimposed. The model is
\begin{equation}
R \sim \mbox{Binom}(N=S,p=a/(1+(a/b)S)).
\end{equation}
(With the recruitment probability per settler $p$ given as the hyperbolic function $a/(1+(a/b)S)$, the mean number of recruits is Beverton-Holt: $Np = aS/(1+(a/b)S)$.) The settlement density $S$ is drawn from the zero-inflated negative binomial distribution shown in Figure~\ref{fig:schmitt}a.
Set up the parameters, including the number of samples ($N$):
<<>>=
N = 603
a = 0.696
b = 9.79
mu=25.32
zprob=0.123
k=0.932
@
Define a function for the recruitment probability:
<<>>=
recrprob = function(S) { a/(1+(a/b)*S) }
@
Now simulate the number of settlers and the number of recruits, using \code{rzinbinom} from the \code{emdbook} package:
<<>>=
settlers = rzinbinom(N,mu=mu,size=k,zprob=zprob)
recr = rbinom(N,prob=recrprob(settlers),size=settlers)
@
% (d) a ``triangular'' or ``factor-ceiling'' distribution: uniform with exponentially decreasing maximum
% \cite{Thomson+1996};
% The fourth model is more theoretical, and just shows a set
% of data that are uniformly distributed for any specific value of $x$, but where
% the maximum of the uniform distribution varies with $x$:
% $y \sim \mbox{Unif}(0,\exp(-ax))$. This kind of data is very
% common in ecological settings where the covariate we've measured
% may set an upper bound on performance (survival/growth/fecundity), but where
% other factors (predation, limitation by other nutrients, etc.) determines
% the actual performance of the organism.
\subsubsection{Pigweed distribution and fecundity}
<>=
set.seed(1001)
L = 30
nparents = 50
offspr_per_parent = 10
noffspr = nparents*offspr_per_parent
dispdist = 2
parent_x = runif(nparents,min=0,max=L)
parent_y = runif(nparents,min=0,max=L)
angle = runif(noffspr,min=0,max=2*pi)
dist = rexp(noffspr,1/dispdist)
offspr_x = rep(parent_x,each=offspr_per_parent)+cos(angle)*dist
offspr_y = rep(parent_y,each=offspr_per_parent)+sin(angle)*dist
pos <- cbind(offspr_x,offspr_y)
ndist <- as.matrix(dist(pos,upper=TRUE,diag=TRUE))
nbrcrowd = apply(ndist<2,1,sum)-1
ci = nbrcrowd*3
M=2.3
alpha=0.49
mass_det=M/(1+ci)
mass = rgamma(length(mass_det),scale=mass_det,shape=alpha)
b = 271.6; k= 0.569
seed_det = b*mass
seed = rnbinom(length(seed_det),mu=seed_det,size=k)
f= fitdistr(nbrcrowd,"negative binomial")
@
\begin{figure}
<>=
op=par(mfrow=c(2,2),mar=c(5,5,0.2,0.2),lwd=1,las=1,mgp=c(3.5,1,0))
## plot 1
plot(offspr_x,offspr_y,xlab="",ylab="")
## plot 2
b1 = barplot(table(factor(nbrcrowd,levels=0:max(nbrcrowd)))/length(nbrcrowd),
   xlab="Number of neighbors",ylab="Proportion")
points(dnbinom(0:max(nbrcrowd),size=f$estimate["size"],
   mu=f$estimate["mu"]),pch=16,type="b")
## plot 3
plot(ci,mass,cex=0.5,log="y",xlab="Competition index",ylab="Biomass (g)")
curve(M/(1+x)*alpha,add=TRUE,from=0)
## plot 4
## x = runif(76,min=0,max=5)
seed_det = b*mass
seed = rnbinom(length(seed_det),mu=seed_det,size=k)
par(mgp=c(2.5,1,0))
plot(mass,1+seed,log="xy",xlab="Mass",ylab="1+Seed set")
curve(b*x+1,add=TRUE)
##abline(a=0,b=b,lty=1)
curve(qnbinom(0.975,mu=b*x,size=k)+1,lty=2,from=0.001,add=TRUE,type="s")
curve(qnbinom(0.025,mu=b*x,size=k)+1,lty=2,from=0.001,add=TRUE,type="s")
par(op)
@
\caption{Pigweed simulations. (a) Spatial pattern (Poisson cluster process). (b) Distribution of number of neighbors within 2~m. (c) End-of-year biomass, based on a hyperbolic function of crowding index with a gamma error distribution. (d) Seed set, proportional to biomass with a negative binomial error distribution.}
\label{fig:pigweed}
\end{figure}
\cite{PacalaSilander1990} did a series of experiments quantifying the strength and spatial scale of competition between the annual weeds velvetleaf (\emph{Abutilon theophrasti}) and pigweed (\emph{Amaranthus retroflexus}). They were interested in neighborhood competition among nearby plants. Local dispersal of seeds changes the distribution of the number of neighbors per plant.
If plants were randomly distributed we would expect a Poisson distribution of neighbors within a given distance, but if seeds have a limited dispersal range so that plants are spatially aggregated, we expect a distribution with higher variance (and a higher mean number of neighbors for a given overall plant density) such as the negative binomial. Neighbors increase local competition for nutrients, which in turn decreases plants' growth rate, their biomass at the end of the growing season, and their fecundity (seed set). Thus differences in dispersal and spatial patterning within and among species can in theory change competitive outcomes \cite{BPN2003}, although Pacala and Silander found that spatial structure had little effect in their system.
To explore the patterns of competition driven by local dispersal and crowding, we can simulate this spatial competitive process. Let's start by simulating a spatial distribution of plants in an $L\times L$ plot ($L=30$~m below). We'll use a \emph{Poisson cluster process}, where mothers are located randomly in space at points $\{x_p,y_p\}$ (called a \emph{Poisson process} in spatial ecology), and their children are distributed nearby (only the children, and not the mothers, are included in the final pattern). The simulation includes $N=50$ parents, for which we pick 50 $x$ and 50 $y$ values, each uniformly distributed between 0 and $L$. The distance of each child from its parent is exponentially distributed with rate=$1/d$ (mean dispersal distance $d$), and the direction is random --- that is, uniformly distributed between 0 and $2\pi$ radians\footnote{\R, like most computer languages, works in radians rather than degrees; to convert from degrees to radians, multiply by $2\pi/360$. Since \R\ doesn't understand Greek letters, use \code{pi} to denote $\pi$: \code{radians=degrees*2*pi/360}.}. I use a little bit of trigonometry to calculate the offspring locations (Figure~\ref{fig:pigweed}a). The formal mathematical definition of the model for offspring location is:
\begin{equation}
\begin{array}{lrcl}
\mbox{parent locations} & x_p, y_p & \sim & U(0,L) \nonumber \\
\mbox{distance from parent} & r & \sim & \mbox{Exp}(0,1/d) \nonumber \\
\mbox{dispersal angle} & \theta & \sim & U(0,2 \pi) \nonumber \\
\mbox{offspring } x & x_c & = & x_p + r \cos \theta \nonumber \\
\mbox{offspring } y & y_c & = & y_p + r \sin \theta. \nonumber
\end{array}
\end{equation}
In \R, set up the parameters:
<<>>=
set.seed(1001)
L = 30
nparents = 50
offspr_per_parent = 10
noffspr = nparents*offspr_per_parent
dispdist = 2
@
Pick locations for the parents:
<<>>=
parent_x = runif(nparents,min=0,max=L)
parent_y = runif(nparents,min=0,max=L)
@
Pick angles and distances for dispersal:
<<>>=
angle = runif(noffspr,min=0,max=2*pi)
dist = rexp(noffspr,1/dispdist)
@
Add the offspring displacements to the parent coordinates (using \\
\verb+rep(...,each=offspr_per_parent)+):
<<>>=
offspr_x = rep(parent_x,each=offspr_per_parent)+cos(angle)*dist
offspr_y = rep(parent_y,each=offspr_per_parent)+sin(angle)*dist
@
If you wanted to allow different numbers of offspring for each parent --- for example, drawn from a Poisson distribution --- you could use \code{offspr\_per\_parent={}rpois(nparents,lambda)} and then \code{rep(..., times=offspr\_per\_parent)}. Instead of specifying that each parent's coordinates should be repeated the same number of times, you would be telling \R\ to repeat each parent's coordinates according to its number of offspring.
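A minimal sketch of this variable-brood-size variant (not evaluated here, and not part of the pigweed example proper; the mean brood size of 10 is just an arbitrary illustrative value):
<<eval=FALSE>>=
## illustrative only: Poisson-distributed brood sizes around a made-up mean
mean_offspr = 10
offspr_per_parent = rpois(nparents,lambda=mean_offspr)
noffspr = sum(offspr_per_parent)
angle = runif(noffspr,min=0,max=2*pi)
dist = rexp(noffspr,1/dispdist)
## times= repeats each parent's coordinates once for each of its own offspring
offspr_x = rep(parent_x,times=offspr_per_parent)+cos(angle)*dist
offspr_y = rep(parent_y,times=offspr_per_parent)+sin(angle)*dist
@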
Next we calculate the neighborhood density, or the number of individuals within 2~m of each plant (not counting itself). Figure~\ref{fig:pigweed}(b) shows this distribution, along with a fitted negative binomial distribution. This calculation reduces the spatial pattern to a simpler non-spatial distribution of crowding.
% although we
%did have to decide on a particular size (2~m) for the competition
%neighborhood.
<<>>=
pos <- cbind(offspr_x,offspr_y)
ndist <- as.matrix(dist(pos,upper=TRUE,diag=TRUE))
nbrcrowd = apply(ndist<2,1,sum)-1
@
Next we use a relationship that Pacala and Silander found between end-of-year mass ($M$) and competition index ($C$). They fitted this relationship based on a competition index estimated as a function of the neighborhood density of conspecific (pigweed) and heterospecific (velvetleaf) competitors, $C = 1+c_{pp} n_p + c_{vp} n_v$. For this example, I simply made up a proportionality constant to match the observed range of competition indices. Pacala and Silander found that biomass $M \sim \mbox{Gamma}(\mbox{scale}=m/(1+C), \mbox{shape}=\alpha)$, with $m=2.3$ and $\alpha=0.49$.
<<>>=
ci = nbrcrowd*3
M=2.3
alpha=0.49
mass_det=M/(1+ci)
mass = rgamma(length(mass_det),scale=mass_det,shape=alpha)
@
Finally, we simulate seed set as a function of biomass, again using a relationship estimated by Pacala and Silander. Seed set is proportional to mass, with negative binomial errors: $S \sim \mbox{NegBin}(\mu=bM,k)$, with $b=271.6$, $k= 0.569$.
<<>>=
b = 271.6; k= 0.569
seed_det = b*mass
seed = rnbinom(length(seed_det),mu=seed_det,size=k)
@
Figure~\ref{fig:pigweed}d shows both mass and (1+seed set) on a logarithmic scale, along with dashed lines showing the 95\% confidence limits of the theoretical distribution.
<>=
f= fitdistr(nbrcrowd,"negative binomial")
@
The idea behind realistic static models is that they can link together simple deterministic and stochastic models of each process in a chain of ecological processes---in this case from spatial distribution to neighborhood crowding to biomass to seed set. (Pacala and Silander actually went a step further and computed the density-dependent survival probability. We could simulate this using a standard model like $\mbox{survival} \sim \mbox{Binom}(N=1,p=\mbox{logistic}(a+bC))$, where the logistic function keeps the survival probability between 0 and 1 as it changes with competition index.) Thus, although it's hard to write down a simple function or distribution that describes the relationship between competition index and the number surviving, as shown here we can break the relationship down into stages in the ecological process and use a simple model for each stage.
<>=
## triangular plot
a=2
b=0.5
x = runif(100,0,4)
y = runif(100,max=a*exp(-b*x))
plot(x,y)
curve(a*exp(-b*x),lty=2,add=TRUE)
@
\section{Power analysis}
Power analysis in the narrow sense means figuring out the (frequentist) statistical power, the probability of correctly rejecting the null hypothesis when it is false (Figure~\ref{fig:power1}). Power analysis is important, but the narrow frequentist definition suffers from some of the problems that we are trying to move beyond by learning new statistical methods, such as a focus on $p$ values and on the ``truth'' of a particular null hypothesis. Thinking about power analysis even in this narrow sense is already a vast improvement on the naive and erroneous ``the null hypothesis is false if $p<0.05$ and true if $p>0.05$'' approach.
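As a small concrete example of power in this narrow sense, using \code{power.t.test} (one of \R's built-in power calculators, discussed further below) with an arbitrary effect size and standard deviation:
<<>>=
## power for detecting a difference of 2 units between two group means,
## with sd=2 and 20 samples per group (all values made up for illustration)
power.t.test(n=20,delta=2,sd=2)$power
## or, turned around: sample size per group needed for 80% power
power.t.test(delta=2,sd=2,power=0.8)$n
@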
However, we should really be considering a much broader question: \textbf{How do the quality and quantity of my data and the true properties (parameters) of my ecological system affect the quality of the answers to my questions about ecological systems?} For any real experiment or observation situation, we don't know what is really going on (the ``true'' model or parameters), so we don't have the information required to answer these questions from the data alone. But we can approach them by analysis or simulation. Historically, questions about statistical power could only be answered by sophisticated analyses, and only for standard statistical models and experimental designs such as one-way ANOVA or linear regression. Increases in computing power have extended power analyses to many new areas, and \R's capability to run many repeated stochastic simulations is a great help. Paradoxically, the mathematical difficulty of deriving power formulas is a great equalizer: since even research statisticians typically use simulations to estimate power, it's now possible (by learning simulation, which is easier than learning advanced mathematical statistics) to work on an equal footing with even cutting-edge researchers. \begin{figure} <>= powplot <- function(m0=45.5,m1=54,sd=1.745,len=100,col1="gray",col2=gray(0.2)) { rng <- range(c(m0-4*sd,m1-4*sd,m0+4*sd,m1+4*sd)) curve(dnorm(x,m0,sd),from=rng[1],to=rng[2],yaxs="i", ylim=c(0,dnorm(m0,m0,sd)*1.1),axes=FALSE,ylab="",xlab="") axis(side=1) box() ## plot(pvec1,dnorm(pvec1,m0,sd),type="l",xlim=rng) ## lines(pvec2,dnorm(pvec2,m1,sd)) qval <- qnorm(c(0.025,0.975),m0,sd) pvec <- sort(c(qval,seq(rng[1],rng[2],length=len))) abline(v=qval,lty=2) pvec.up <- pvec[pvec>=qval[2]] pfun <- function(x,y,...) polygon(c(x,rev(x)),c(rep(0,length(x)),rev(y)),...) pfun(pvec.up,dnorm(pvec.up,m1,sd),col=col1) pfun(pvec.up,dnorm(pvec.up,m0,sd),col=col2) pvec.down <- pvec[pvec<=qval[1]] pfun(pvec.down,dnorm(pvec.down,m0,sd),col=col2) pfun(pvec.down,dnorm(pvec.down,m1,sd),col=col1) ##polygon(c(pvec1b,rev(pvec1b)), ## c(rep(0,length(pvec1b)),dnorm(rev(pvec1b),m1,sd)),col="gray") power=pnorm(qval[2],m1,sd,lower=FALSE)+pnorm(qval[1],m1,sd) curve(dnorm(x,m1,sd),from=rng[1],to=rng[2],add=TRUE,lwd=2) invisible(power) } powval <- function(m0=45.5,m1=54,sd=1.745,alpha=0.95) { 1-(pnorm(qnorm(1-(1-alpha)/2,m0,sd),m1,sd)- pnorm(qnorm((1-alpha)/2,m0,sd),m1,sd)) } @ <>= op <- par(mfrow=c(1,2)) par(lwd=2,las=1,bty="l",cex=1.5,mgp=c(2.5,1,0)) par(mar=c(4,4,2,1)) powplot(m0=0,m1=2,sd=0.75) mtext(side=2,las=0,"Probability",cex=1.5,line=1) mtext(side=1,las=0,"x",cex=1.5,line=2.5) text(c(0,2),c(0.56,0.56), c(expression(H[0]),expression(H[1]))) text(-2.6,0.1,expression(alpha)) arrows(c(-2.5,-2.2), c(0.09,0.09), c(-1.95,1.4), c(0.026,0.026), angle=15) text(4,0.4,"Power") text(4,0.37,expression(1-beta)) arrows(4,0.36,2,0.3,angle=15) ##xvec <- seq(35,55,len=100) par(mar=c(4,4,2,2)) curve(powval(x,m0=0,sd=0.25),from=-5,to=5,ylab="Power",xlab="Effect size") ##plot(xvec,sapply(xvec,function(x)powval(m1=x)),type="l") abline(h=0.05) curve(powval(x,m0=0,sd=0.75),lty=2,add=TRUE) curve(powval(x,m0=0,sd=2),lty=3,add=TRUE) points(2,powval(2,m0=0,sd=0.75),pch=16) par(xpd=NA,adj=0) text(c(1.5,2.5,3.5), c(1.03,0.88,0.36), c(expression(sigma==0.25), expression(sigma==0.75), expression(sigma==2)), cex=1,xpd=NA) par(op) @ \caption{The frequentist definition of power. 
In the left-hand plot, the type~I (false positive) rate $\alpha$ is the area under the tails of the null hypothesis $H_0$; the type~II error rate, $\beta$, is the area under the sampling distribution of the alternative hypothesis ($H_1$) between the tails of the null hypothesis; thus the power $1-\beta$ is the gray area shown that lies above the upper critical value of the null hypothesis curve. (There is also a tiny area where $H_1$ overlaps the lower tail of $H_0$.) The right-hand plot shows power as a function of effect size (distance between the means) and standard deviation; the point shows the situation (effect size=2, $\sigma=0.75$) illustrated in the left figure.} \label{fig:power1} \end{figure} The first part of the rather vague (but common-sense) question above is about ``quantity and quality of data and the true properties of the ecological system''. These properties include: \begin{itemize} \item{Number of data points (number of observations/sampling intensity)} \item{Distribution of data (experimental design) \begin{itemize} \item{Number of observations per site, number of sites} \item{Temporal and spatial extent (distance between the farthest samples, controlling the largest scale you can measure) and grain (distance between the closest samples, controlling the smallest scale you can measure)} \item{Even or clustered distribution in space and/or time. Blocking. Balance (i.e., equal or similar numbers of observations in each treatment)} \item{Distribution of continuous covariates --- mimicking the natural distribution, or \emph{stratified} to sample evenly across the natural range of values, or artificially extended to a wider range} \end{itemize} } \item{Amount of variation (measurement/sampling error, demographic stochasticity, environmental variation). Experimental control or quantification of variation.} \item{Effect size (small or large), or the distance of the true parameter from the null-hypothesis value.} \end{itemize} These properties will determine how much information you can extract from your data. Large data sets are better than smaller ones; balanced data sets with wide ranges are better than unbalanced data sets with narrow ranges; data sets with large extent (maximum spatial and/or temporal range) and small grain (minimum distance between samples) are best; and larger effects are obviously easier to detect and characterize. There are obvious tradeoffs between effort (measured in person-hours or dollars) and the number of samples, and in how you allocate that effort. Would you prefer more information about fewer samples, or less information about more? More observations at fewer sites or fewer at more sites? Should you spend your effort increasing extent or decreasing grain? Subtler tradeoffs also affect the value of an experiment. For example, controlling extraneous variation allows a more powerful answer to a statistical question --- but how do we know what is ``extraneous''? Variation actually affects the function of ecological systems \citep[Jensen's inequality:][]{RuelAyres1999}. Measuring a plant in a constant laboratory environment may turn out to answer the wrong question: we ultimately want to know how the plant performs in the natural environment, not in the lab, and variability is an important part of most environments. 
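As a tiny numerical illustration of the Jensen's inequality point above (using the hyperbolic function from earlier in the chapter and a made-up distribution for the environmental variable), average performance in a variable environment differs from performance at the average environment:
<<>>=
a = 20; b = 1
x = rgamma(1000,shape=2,scale=1) ## a variable environment (illustrative values)
mean(a*b/(b+x))                  ## mean performance across the variable environment
a*b/(b+mean(x))                  ## performance at the mean of the environment
@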
In contrast, performing ``unrealistic'' manipulations like pushing population densities beyond their natural limits may help to identify density-dependent processes that are real and important but undetectable at ambient densities \citep{Osenberg+2002}. There is no simple answer to these questions, but they're important to think about.
The quality of the answers we get from our analyses is as multifaceted as the quality of the data. \emph{Precision} specifies how finely you can estimate a parameter --- the number of significant digits, or the narrowness of the confidence interval --- while \emph{accuracy} specifies how likely your answer is to be correct. Accurate but imprecise answers are better than precise but inaccurate ones: at least then you know that your answer is imprecise, rather than being misled by an answer that is precise but wrong. But you need both precision and accuracy to understand and predict ecological systems.
More specifically, I will show how to estimate the following aspects of precision and accuracy for the damselfish system:
\begin{itemize}
\item{\emph{Bias} (accuracy): bias is the expected difference between the estimate and the true value of the parameter. If you run a large number of simulations with a true value of $d$ and estimate a value of $\hat d$ for each one, then the bias is $E[\hat d-d]$. Most simple statistical estimators are unbiased, and so most of us have come to expect (wrongly) that statistical estimates are generally unbiased. Most statistical estimators are indeed \emph{asymptotically} unbiased, which means that in the limit of a large amount of data they will give the right answer on average, but a surprisingly large number of common estimators are biased \citep{Poulin1996,Doak+2005}.}
\item{\emph{Variance} (precision): variance, or $E[(\hat d-E[\hat d])^2]$, measures the variability of the point estimates ($\hat d$) around their mean value. Just as an accurate but imprecise answer is of limited value, unbiased answers are of little use if they have high variance. With low bias we know that we get the right answer \emph{on average}, but high variability means that any particular estimate could be way off. With real data, we never know which estimates are right and which are wrong.}
\item{\emph{Confidence interval width} (precision): the width of the confidence intervals, either in absolute terms or as a proportion of the estimated value, provides useful information on the precision of your estimate. If the confidence interval is estimated correctly (see coverage, below) then the confidence interval should be related to the variance among estimates.}
\item{\emph{Mean squared error} (MSE: accuracy and precision) combines bias and variance as (bias$^2$+variance). It represents the total variation around the true value, rather than around the average estimated value: $(E[\hat d-d])^2+E[(\hat d-E[\hat d])^2] = E[(\hat d-d)^2]$. MSE gives an overall sense of the quality of the estimator.}
\item{\emph{Coverage} (accuracy): when we sample data and estimate parameters, we try to estimate the uncertainty in those parameters. Coverage describes how accurate those confidence intervals are, and (once again) can only be estimated via simulation. If the confidence intervals (for a given confidence level $1-\alpha$) are $\dlow$ and $\dhigh$, then the coverage describes the proportion or percentage of simulations in which the confidence intervals actually include the true value ($\mbox{Prob}(\dlow < d < \dhigh)$).
Ideally, the observed coverage should equal the nominal coverage of $1-\alpha$; values that are too high are pessimistic, overstating the level of uncertainty, while values that are too low are optimistic. (It often takes several hundred simulations to get a reasonably precise estimate of the coverage, especially when estimating the coverage for 95\% confidence intervals.)}
\item{\emph{Power} (precision): finally, the narrow-sense power gives the probability of correctly rejecting the null hypothesis, or in other words the fraction of the times that the null-hypothesis value $d_0$ will be outside of the confidence limits ($\mbox{Prob}(d_0 < \dlow \mbox{ or } d_0>\dhigh)$). In frequentist language, it is $1-\beta$, where $\beta$ is the probability of making a type~II error. \\
\begin{center}
\begin{tabular}{c|cc}
 & $H_0$ true & $H_0$ false \\
\hline
accept $H_0$ & $1-\alpha$ & $\beta$ \\
reject $H_0$ & $\alpha$ & $1-\beta$
\end{tabular}
\end{center}
Typically you specify an alternative hypothesis $H_1$, a desired type~I error rate $\alpha$, and a desired power $(1-\beta)$ and then calculate the required sample size, or calculate $(1-\beta)$ as a function of sample size, \emph{for some particular} $H_1$. When the effect size is zero (the difference between the null and the alternate hypotheses is zero --- i.e. the null hypothesis is true), the power is undefined, but it approaches $\alpha$\footnote{Not zero! Even when the null hypothesis is true, we reject it a proportion $\alpha$ of the time: thus we can expect to correctly reject the null hypothesis, even for very small effects, with probability at least $\alpha$.} as the effect size gets small ($H_1 \to H_0$). \R\ has built-in functions for several standard cases (power of tests of difference between means of two normal populations [\code{power.t.test}], tests of difference in proportions [\code{power.prop.test}], and one-way, balanced ANOVA [\code{power.anova.test}])%
\footnote{The \code{Hmisc} package, available on CRAN, has a few more power calculators.}. For more discussion of these cases, or for other fairly straightforward examples, you can look in any relatively advanced biometry book (e.g. \cite{SokalRohlf1995}), or even find a calculator on the web (search for ``statistical power calculator''). For more complicated and ecologically realistic examples, however, you'll probably have to find the answer through simulation, as demonstrated below.
}
\end{itemize}
\subsection{Simple examples}
\subsubsection{Linear regression}
<>=
x = 1:20
a =2; b=1; sd=8
N = 20
@
Let's start by estimating the statistical power of detecting the linear trend in Figure~\ref{fig:sim1}a, as a function of sample size. In order to find out whether we can reject the null hypothesis in a single ``experiment'', we simulate a data set with a given slope, intercept, and number of data points; run a linear regression; extract the $p$-value; and see whether it is less than our specified $\alpha$ criterion (usually 0.05). For example:
<<>>=
y_det = a+b*x
y = rnorm(N,mean=y_det,sd=sd)
m = lm(y~x)
coef(summary(m))["x","Pr(>|t|)"]
@
Extracting $p$-values from \R\ analyses can be tricky. In this case, the coefficients of the \emph{summary} of the linear fit are a matrix including the standard error, $t$ statistic, and $p$-value for each parameter; I used matrix indexing to pull out the specific value I wanted. More generally, you will have to use the \code{names} and \code{str} commands to pick through the results of a test to find the $p$-value.
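For example, a quick way to see what the summary object contains (a small sketch of this kind of exploration, using the fit \code{m} from above):
<<>>=
s = summary(m)
names(s)                 ## components of the summary object
colnames(coef(s))        ## the coefficient matrix has a "Pr(>|t|)" column
coef(s)["x","Pr(>|t|)"]  ## the same matrix indexing used above
@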
In order to estimate the \emph{probability} of successfully rejecting the null hypothesis when it is false (the power), we have to repeat this procedure many times and calculate the proportion of the time that we reject the null hypothesis.
Specify the number of simulations to run (400 is a reasonable number if we want to calculate a percentage --- even 100 would do to get a crude estimate):
<<>>=
nsim = 400
@
Set up a vector to hold the $p$-value for each simulation:
<<>>=
pval = numeric(nsim)
@
Now repeat what we did above \Sexpr{nsim} times, each time saving the $p$-value in the storage vector:
<>=
for (i in 1:nsim) {
  y_det = a+b*x
  y = rnorm(N,mean=y_det,sd=sd)
  m = lm(y~x)
  pval[i] = coef(summary(m))["x","Pr(>|t|)"]
}
@
Calculate the power:
<<>>=
sum(pval<0.05)/nsim
@
However, we don't just want to know the power for a single experimental design. Rather, we want to know how the power changes as we change some aspect of the design such as the sample size or the variance. Thus we have to repeat the entire procedure above multiple times, each time changing some parameter of the simulation such as the slope, or the error variance, or the distribution of the $x$ values. Coding this in \R\ usually involves nested \code{for} loops. For example:
<>=
bvec = seq(-2,2,by=0.1)
power.b = numeric(length(bvec))
for (j in 1:length(bvec)) {
  b = bvec[j]
  for (i in 1:nsim) {
    y_det = a+b*x
    y = rnorm(N,mean=y_det,sd=sd)
    m = lm(y~x)
    pval[i] = coef(summary(m))["x","Pr(>|t|)"]
  }
  power.b[j] = sum(pval<0.05)/nsim
}
@
The results would resemble a noisy version of the right subfigure in Figure~\ref{fig:power1}. The power equals $\alpha=0.05$ when the slope is zero, rising to 0.8 for slope $\approx \pm 1$. You could repeat these calculations for a different set of parameters (e.g. changing the sample size or the error variance). If you were feeling ambitious, you could calculate the power for many combinations of (e.g.) slope and sample size, using yet another \code{for} loop; saving the results in a matrix; and using \code{contour} or \code{persp} to plot the results.
\subsubsection{Hyperbolic/negative binomial data}
What about the power to detect the difference between the two groups shown in Figure~\ref{fig:sim2}, with hyperbolic dependence on $x$, negative binomial errors, and different intercepts and hyperbolic slopes? In order to estimate the power of the analysis, we have to know how to test statistically for a difference between the two groups. Jumping the gun a little bit (this topic will be covered in much greater detail in Chapter~6), we can define \emph{negative log-likelihood functions} both for a null model that assumes the parameters are the same for both groups and for a more complex model that allows the parameters to differ between groups.
<>=
load("negbinhyp-batch2.RData")
nsim <- 200
@
The \code{mle2} command in the \code{bbmle} package lets us fit the parameters of these models, and the \code{anova} command gives us a $p$-value for the difference between the models (p.~\pageref{mle2groups}):
<>=
m0 = mle2(y~dnbinom(mu=a*b/(b+x),size=k),start=list(a=15,b=1,k=5))
m1 = mle2(y~dnbinom(mu=a*b/(b+x),size=k),
   parameters=list(a~g,b~g),
   start=list(a=15,b=1,k=5))
anova(m0,m1)[2,"Pr(>Chisq)"]
@
Without showing the details, we now run a \code{for} loop that simulates the system above \Sexpr{nsim} times each for a range of sample sizes, uses \code{anova} to calculate the $p$-values, and calculates the proportion of $p$-values $<0.05$ for each sample size.
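The skeleton of such a loop might look something like the following sketch (the vector of sample sizes, starting values, and data-generation step are reconstructed from the earlier simulation for illustration; this is not the batch code that actually produced the figure):
<<eval=FALSE>>=
Nvec = c(20,30,50,100,200,500)   ## illustrative sample sizes
power.N = numeric(length(Nvec))
a = c(20,10); b = c(1,2); k = 5  ## true parameters, as in the earlier simulation
for (j in 1:length(Nvec)) {
  N = Nvec[j]
  pval = numeric(nsim)
  for (i in 1:nsim) {
    g = factor(rep(1:2,each=N/2))
    x = runif(N,min=0,max=5)
    y = rnbinom(N,mu=a[g]*b[g]/(b[g]+x),size=k)
    m0 = mle2(y~dnbinom(mu=a*b/(b+x),size=k),start=list(a=15,b=1,k=5))
    m1 = mle2(y~dnbinom(mu=a*b/(b+x),size=k),
       parameters=list(a~g,b~g),start=list(a=15,b=1,k=5))
    pval[i] = anova(m0,m1)[2,"Pr(>Chisq)"]
  }
  power.N[j] = sum(pval<0.05)/nsim
}
@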
Figure~\ref{fig:negbinhyppow} shows the results. For small sample sizes ($<20$), the power is abysmal ($\approx 0.2$--$0.4$). Power then rises approximately linearly with sample size, reaching acceptable levels (0.8 and up) for sample sizes of 50--100 and greater. The irregularity of the curve in Figure~\ref{fig:negbinhyppow} is due to stochastic variation in the simulations. We could run more simulations per sample size to reduce the variation, but it's probably unnecessary since all power analysis is approximate anyway.
\begin{figure}
<>=
op <- par(lwd=2,las=1,bty="l",cex=1.5,mgp=c(2.5,1,0))
plot(Nvec[power.N>0],power.N[power.N>0],log="x",type="b",
   xlab="Sample size",
   ylab="Power",ylim=c(0,1),axes=FALSE)
axis(side=2)
axis(side=1,cex.axis=0.8)
abline(h=1,lty=2)
box()
#grid()
par(op)
@
\caption{Statistical power to detect differences between two hyperbolic functions with intercepts $a=\{10,20\}$, slopes $b=\{2,1\}$, and negative binomial $k=5$, as a function of sample size. Sample size is plotted on a logarithmic scale.}
\label{fig:negbinhyppow}
\end{figure}
\subsubsection{Bias and variance in estimates of the negative binomial $k$ parameter}
For another simple example, one that demonstrates that there's more to life than $p$-values, consider the problem of estimating the $k$ parameter of a negative binomial distribution. Are standard estimators biased? How large a sample do you need for a reasonably accurate estimate of aggregation? Statisticians have long been aware that maximum likelihood estimates of the negative binomial $k$ and similar aggregation indices, while better than simpler method-of-moments estimates (p.~\pageref{negbinom-moments}), are biased for small sample sizes \citep{Pieters+1977,Piegorsch1990,Poulin1996,LloydSmith2007}. While you could delve into the statistical literature on this topic and even find special-purpose estimators that reduce the bias \citep{SahaPaul2005}, it's empowering to be able to explore the problem yourself through simulation. We can generate negative binomial samples with \code{rnbinom}, and the \code{fitdistr} command from the \code{MASS} package is a convenient way to estimate the parameters. \code{fitdistr} finds maximum likelihood estimates, which generally have good properties --- but are not infallible, as we will see shortly. For a single sample:
<>=
set.seed(1001)
@
<<>>=
x = rnbinom(100,mu=1,size=0.5)
f = fitdistr(x,"negative binomial")
f
@
(the standard deviations of the parameter estimates are given in parentheses). You can see that for this example the value of $k$ (\code{size}) is underestimated relative to the true value of 0.5 --- but how do the estimates behave in general? In order to dig the particular values we want (estimated $k$ and standard deviation of the estimate) out of the object that \code{fitdistr} returns, we have to use \code{str(f)} to examine its internal structure. It turns out that \verb+f$estimate["size"]+ and \verb+f$sd["size"]+ are the numbers we want.
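A compressed version of that exploration (\code{f} is the fit from above):
<<>>=
str(f)               ## the pieces of the "fitdistr" object
f$estimate["size"]   ## maximum likelihood estimate of k
f$sd["size"]         ## its standard deviation
@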
Set up a vector of sample sizes (\code{lseq} is a function from the \code{emdbook} package that generates a logarithmically spaced sequence) and set aside space for the estimated $k$ and its standard deviation:
<<>>=
Nvec = round(lseq(20,500,length=100))
estk = numeric(length(Nvec))
estksd = numeric(length(Nvec))
@
Now pick samples and estimate the parameters:
<>=
set.seed(1001)
for (i in 1:length(Nvec)) {
  N = Nvec[i]
  x = rnbinom(N,mu=1,size=0.5)
  f = fitdistr(x,"negative binomial")
  estk[i] = f$estimate["size"]
  estksd[i] = f$sd["size"]
}
@
\begin{figure}
<>=
op <- par(mfrow=c(1,2))
par(lwd=2,las=1,bty="l",cex=1.5,mgp=c(2.5,1,0))
plot(Nvec,estk,log="x",xlab="Sample size",ylab="Estimated k",axes=FALSE)
axis(side=2)
axis(side=1,cex.axis=0.7)
box()
lines(Nvec,predict(loess(estk~Nvec)))
abline(h=0.5,lty=2)
plot(Nvec,estksd,log="xy",xlab="Sample size",ylab="Estimated sd(k)",
   mgp=c(3,1,0),axes=FALSE)
axis(side=2)
axis(side=1,cex.axis=0.7)
box()
@
\caption{Estimates of negative binomial $k$ with increasing sample size. In the left-hand figure, the solid line is a \code{loess} fit and the horizontal dashed line is the true value. The $y$ axis in the right-hand figure is logarithmic.}
\label{fig:negbinpow}
\end{figure}
Figure~\ref{fig:negbinpow} shows the results: the estimate is indeed biased, and highly variable, for small sample sizes. For sample sizes below about 100, the estimate of $k$ is biased upward by about 20\% on average. The coefficient of variation (standard deviation divided by the mean) is similarly greater than 0.2 for sample sizes less than 100.
\subsection{Detecting under- and overcompensation in fish data}
Finally, we will explore a more extended and complex example --- the difficulty of estimating the exponent $d$ in the Shepherd function, $R=aS/(1+(a/b)S^d)$ (Figure~\ref{fig:power1}c). This parameter controls whether the Shepherd function is undercompensating ($d<1$: recruitment increases indefinitely as the number of settlers grows), saturating ($d=1$: recruitment reaches an asymptote), or overcompensating ($d>1$: recruitment decreases at high settlement).
% As a reminder, when $d=1$ the Shepherd function
% reduces to the Beverton-Holt (Michaelis-Menten{\slash}Holling type~II)
% function and the number of recruits $R$ increases to an asymptote as
% the number of settlers $S$ grows. When $d>1$, the Shepherd function
% is overcompensating (the number of recruits grows to a maximum but
% decreases at high settler densities); when $d<1$ it is
% undercompensating (the recruitment curve never reaches an asymptote,
% although the rate of increase of the curve continually slows).
\cite{Schmitt+1999} set $d=1$ %, as is common in
%fisheries examples, among other reasons
in part because $d$ is very hard to estimate reliably --- we are about to see just how hard. You can use the simulation approach described above to generate simulated ``data sets'' of different sizes whose characteristics match Schmitt et al.'s data: a zero-inflated negative binomial distribution of numbers of settlers and a Shepherd-function relationship (with a specified value of $d$) between the number of settlers and the number of recruits. For each simulated data set, use \R's \code{nls} function to estimate the values of the parameters by nonlinear least squares\footnote{Non-linear least-squares fitting assumes constant, normally distributed error, ignoring the fact that the data are really binomially distributed. Chapter~\ref{chap:opt} will present more sophisticated maximum likelihood approaches to this problem.}.
Then calculate the confidence limits on $d$ (using \code{confint}) and record the estimated value of the parameter and the lower and upper confidence limits.
<>=
mm = nls(recr~a*settlers/(1+(a/b)*settlers),start=c(a=0.696,b=9.79))
shep = nls(recr~a*settlers/(1+(a/b)*settlers^d),start=c(coef(mm),d=1))
recrprob = function(x,a,b,d) a/(1+(a/b)*x^d)
simdata = function(n,d=1,a=0.696,b=9.79) {
  scoefs = c(mu=25.32,k=0.932,zprob=0.123)
  settlers = rzinbinom(n,mu=scoefs["mu"],size=scoefs["k"],zprob=scoefs["zprob"])
  recr = rbinom(n,prob=recrprob(settlers,a,b,d),size=settlers)
  data.frame(settlers=settlers,recr=recr)
}
getvals = function(n=100,d=1) {
  OK = FALSE
  while (!OK) {
    z = simdata(n,d)
    mm = try(nls(recr~a*settlers/(1+(a/b)*settlers),start=c(a=0.696,b=9.79),data=z))
    shep = try(nls(recr~a*settlers/(1+(a/b)*settlers^d),start=c(coef(mm),d=1),data=z))
    OK = (class(shep)!="try-error" && class(mm)!="try-error")
    if (OK) {
      mm.ci = try(confint(mm))
      shep.ci = try(confint(shep))
      if (class(shep.ci)=="try-error") {
        s0 = summary(shep)
        ci_width = qt(0.975,s0$df[2])*s0$parameters[,"Std. Error"]
        shep.ci=c(s0$parameters[,"Estimate"]-ci_width,
           s0$parameters[,"Estimate"]+ci_width)
      }
    }
  }
  res = c(coef(mm),mm.ci,coef(shep),shep.ci)
  names(res) = c("a.mm","b.mm","a.mm.lo","b.mm.lo","a.mm.hi","b.mm.hi",
     "a.shep","b.shep","d.shep","a.shep.lo","b.shep.lo","d.shep.lo",
     "a.shep.hi","b.shep.hi","d.shep.hi")
  res
}
@
<>=
set.seed(1001)
nvec = c(50,100,200,300,400,500,1000,2000)
nsim=100
dvec = seq(0.7,1.3,by=0.1)
r1= getvals()
resmat = matrix(ncol=length(r1)+3,nrow=nsim*length(nvec)*length(dvec))
c0 = 1
for (i in 1:length(dvec)) {
  for (j in 1:length(nvec)) {
    for (k in 1:nsim) {
      cat(i,j,k,"\n")
      resmat[c0,] = c(dvec[i],nvec[j],k,getvals(n=nvec[j],d=dvec[i]))
      c0 = c0+1
    }
  }
}
@
<>=
load("chap5-batch2.RData")
d.null = 1
true.b=9.79
## nsim=400 ## for batch2
faclist <- list(factor(resmat[,"d"]),factor(resmat[,"n"]))
a.mm.mean = tapply(resmat[,"a.mm"],faclist,mean)
b.mm.mean = tapply(resmat[,"b.mm"],faclist,mean)
b.mm.sd = tapply(resmat[,"b.mm"],faclist,sd)
a.shep.mean = tapply(resmat[,"a.shep"],faclist,mean)
b.shep.mean = tapply(resmat[,"b.shep"],faclist,mean)
b.shep.sd = tapply(resmat[,"b.shep"],faclist,sd)
d.shep.mean = tapply(resmat[,"d.shep"],faclist,mean)
d.shep.sd = tapply(resmat[,"d.shep"],faclist,sd)
rshepOK <- resmat[resmat[,"shep.ci.OK"]==1,]
##rshepOK <- resmat
d.shep.OK <- table(factor(rshepOK[,"d"]),factor(rshepOK[,"n"]))
faclist2 <- list(factor(rshepOK[,"d"]),factor(rshepOK[,"n"]))
d.shep.pow = with(as.data.frame(rshepOK),
   tapply(d.shep.hi<d.null | d.shep.lo>d.null,faclist2,sum))/d.shep.OK
d.shep.cov = with(as.data.frame(rshepOK),
   tapply(d.shep.hi>d & d.shep.lo<d,faclist2,sum))/d.shep.OK
d.shep.cwid = with(as.data.frame(rshepOK),
   tapply(d.shep.hi-d.shep.lo,faclist2,mean))
@
<>=
nsim = 20
trueval=1.2
nullval=1
r1 = resmat[resmat[,"n"]==1000 & resmat[,"d"]==trueval & resmat[,"shep.ci.OK"],]
totsim=nrow(r1)
dm = mean(r1[,"d.shep"])
bias = dm-trueval
dsd = sd(r1[,"d.shep"])
dcover = sum(r1[,"d.shep.lo"]<trueval & r1[,"d.shep.hi"]>trueval)/totsim
dpow = sum(r1[,"d.shep.lo"]>nullval | r1[,"d.shep.hi"]<nullval)/totsim
ddens = density(r1[,"d.shep"])
ddens.l = density(r1[,"d.shep.lo"])
@
<>=
r=require(plotrix,quietly=TRUE,warn.conflicts=FALSE)
plotCI(1:nsim,r1[1:nsim,"d.shep"],li=r1[1:nsim,"d.shep.lo"],ui=r1[1:nsim,"d.shep.hi"],xlab="Simulation",
   ylab=expression(hat(d)),xlim=c(1,40),axes=FALSE,ylim=c(0.9,1.6))
abline(h=trueval,lwd=2)
abline(h=nullval,lwd=2,lty=2)
axis(side=2)
axis(side=1,at=c(1,10,20))
box()
axis.break(breakpos=25,axis=1,style="zigzag",brw=0.05)
n = length(ddens$x)
p1 = 35
sc=-0.8
polygon(c(p1+ddens$y*sc,p1),c(ddens$x,min(ddens$x)),col="gray")
segments(p1+ddens$y[1:(n-1)]*sc,ddens$x[1:(n-1)],
   p1+ddens$y[2:n]*sc,ddens$x[2:n])
sc2=+0.8
#polygon(c(p1+ddens.l$y*sc2,p1),c(ddens.l$x,min(ddens.l$x)),col="gray")
covvals=which(ddens.l$x<trueval & ddens.l$x>nullval)
npowvals=which(ddens.l$x<nullval)
@
\begin{figure}
<>=
clabyoff = -0.1
op=par(mfrow=c(2,2),mar=c(4.2,4.1,2,1),mgp=c(2.5,1,0),las=1,cex=1.5,
   bty="l",lwd=2)
##mtext("a",side=2,outer=FALSE,col=2,at=0.5)
##mtext("a",side=2,col=4,las=0,cex=2,las=1,adj=0,padj=1.1)
## bias
matplot(nvec,t(d.shep.mean),type="b",col=1,pch="7890123",
   xlab="Sample size",ylab="Estimated d",axes=FALSE)
axis(side=1,cex.axis=0.8)
axis(side=2,cex.axis=0.8)
box()
par(xpd=NA)
corner.label("a",yoff=clabyoff,xoff=0,cex=2)
par(xpd=FALSE)
#plotCI(nvec[col(d.shep.sd)],d.shep.mean,d.shep.sd/5,add=TRUE)
abline(h=dvec,lty=1:length(dvec),lwd=2,col="gray")
##
##
matplot(nvec,t(d.shep.cwid),type="b",col=1,pch="7890123",
   xlab="Sample size",ylab="Confidence interval width",axes=FALSE)
axis(side=1,cex.axis=0.8)
axis(side=2,cex.axis=0.8)
box()
par(xpd=NA)
corner.label("b",yoff=clabyoff,xoff=0,cex=2)
par(xpd=FALSE)
matplot(nvec,t(d.shep.cov),type="b",pch="7890123",col=1,
   xlab="Sample size",ylab="Coverage",mgp=c(3,1,0),ylim=c(0.75,1.0),
   cex.axis=0.8)
par(xpd=NA)
corner.label("c",yoff=clabyoff,xoff=0,cex=2)
par(xpd=FALSE)
abline(h=0.95,lwd=2)
matplot(nvec,t(d.shep.pow),type="b",pch="7890123",col=1,
   ylab=expression(paste("Power or ",alpha)),
   xlab="Sample size",ylim=c(0,1),cex.axis=0.8)
par(xpd=NA)
corner.label("d",yoff=clabyoff,xoff=0,cex=2)
par(xpd=FALSE)
abline(h=0.05,lty=2)
abline(h=0.8,lty=2)
par(op)
@
\caption{Summaries of statistical accuracy, precision, and power for estimating the Shepherd exponent $d$, for a range of $d$ values from undercompensation, $d=0.7$ (line marked ``7''), to overcompensation, $d=1.3$ (line marked ``3''). (a) Estimated $d$: the estimates are strongly biased upwards for sample sizes less than 500, especially for undercompensation ($d<1$). (b) Confidence interval width: the confidence intervals are large ($>0.4$) for sample sizes smaller than about 500, for any value of $d$. (c) Coverage of the nominal 95\% confidence intervals is adequate for large sample size ($>250$) and overcompensation ($d>1$), but poor even for large sample sizes when $d<1$. (d) For statistical power ($1-\beta$) of at least 0.8, sample sizes of 500--1000 are required if $d \le 0.7$ or $d \ge 1.2$; sample sizes of 1000 if $d=0.8$; and sample sizes of at least 2000 if $d=0.9$ or $d=1.1$. When $d=1.0$ (``0'' line), the probability of rejecting the null hypothesis is a little above the nominal value of $\alpha=0.05$.}
\label{fig:shep-pow1}
\end{figure}
Figure~\ref{fig:shep-pow1} gives a gloomier picture, showing the bias, precision, coverage, and power for a range of $d$ values from 0.7 to 1.3 and a range of sample sizes from 50 to 2000. It takes sample sizes of at least 500 to obtain reasonably unbiased estimates with adequate precision, and even then the coverage may be low if $d<1.0$ and the power low if $d$ is close to 1 ($0.9 \le d \le 1.1$). Because of the upward bias in $d$ at low sample sizes, the calculated power is actually \emph{higher} at very low sample sizes, but this is not particularly comforting. The power of the analysis is slightly better for overcompensation than undercompensation. The relatively low power values are as expected from Fig.~\ref{fig:shep-pow1}b, which shows wide confidence intervals. Low power would also be predictable from the high variance of the estimates, which I didn't even bother to show in Fig.~\ref{fig:shep-pow1}a because the error bars obscured the figure too much.
Another use for our simulations is to take a first look at the tradeoffs involved in adding complexity to models. Figure~\ref{fig:shep-pow2} shows estimates of $b$, the asymptote if $d=1$, for different sample sizes and values of $d$. If $d=1$, then the Shepherd model reduces to the Beverton-Holt model. In this case, you might think that it wouldn't matter whether you used the Shepherd or the Beverton-Holt model to estimate the $b$ parameter, but there are serious disadvantages to the Shepherd function. First, even when $d=1$, the Shepherd estimate of $d$ is biased upwards for low sample sizes, leading to a severe upward bias in the estimate of $b$. Second, not shown on the graph because it would have obscured everything else, the variance of the Shepherd estimate is far higher than the variance of the Beverton-Holt estimate (e.g. for a sample size of 200, the Beverton-Holt estimate is \Sexpr{round(b.mm.mean["1","200"],2)} $\pm$ \Sexpr{round(b.mm.sd["1","200"],2)} (s.d.), while the Shepherd estimate is \Sexpr{round(b.shep.mean["1","200"],2)} $\pm$ \Sexpr{round(b.shep.sd["1","200"],2)} (s.d.)). On the other hand, if $d$ is \emph{not} equal to 1, the bias in the Beverton-Holt estimate of $b$ is large and more or less independent of sample size. For reasonable sample sizes, if $d=0.9$ the Beverton-Holt estimate is biased upward by \Sexpr{round(b.mm.mean["0.9",nvec>200]-9.79)}; if $d=1.1$ it is biased downward by \Sexpr{9.79-round(b.mm.mean["1.1",nvec>200])}. Since the Beverton-Holt model isn't flexible enough to account for the changes in shape caused by $d$, it has to modify $b$ in order to compensate. This general phenomenon is called the \emph{bias--variance tradeoff} (see p.~\pageref{biasvar}): more complex models in general reduce bias at the price of increased variance. (The small-sample bias of the Shepherd is a separate, and slightly less general, phenomenon.) \begin{figure} <>= matplot(nvec,t(b.mm.mean[3:5,]),pch="901", ylab="Estimate of b",xlab="Number of samples",col="darkgray", type="b",ylim=c(5,20)) plotCI(rep(nvec,3),t(b.mm.mean[3:5,]),t(b.mm.sd[3:5,]),add=TRUE) matpoints(nvec,t(b.shep.mean[3:5,]),pch="901",col=1,type="b") ## ##matplot(nvec,t(b.shep.mean[3:5,]),pch="901",col=1,type="b") tmpsd = t(b.shep.sd[3:5,]) tmpsd[tmpsd>5] = NA ##plotCI(rep(nvec,3),t(b.shep.mean[3:5,]),tmpsd,add=TRUE) abline(h=true.b,lwd=2) text(c(1160,1168,471,1232), c(6.7,9.5,12.5,15.4), c("B-H est.: d=1.1", "B-H est.: d=1.0", "Shepherd est.", "B-H est.: d=0.9"),adj=0) @ \caption{Estimates of $b$, using Beverton-Holt or Shepherd functions, for different values of $d$ and sample sizes.} \label{fig:shep-pow2} \end{figure} % Simple (analytic) example: % power curve? % Trying to detect difference of a sample of $n$ values with % mean $\mu$, variance $\sigma^2$ from 0: expected value % of mean is $\mu$, standard error is $\sigma^2/n$. % What is the chance that the mean falls below the 97.5 quantile % of the null distribution, a normal distribution with % mean 0 and standard deviation $\sigma/\sqrt{n}$? 
% It's the lower tail of the distribution
% $\mbox{Normal}(0,\sigma^2/n)$:
% $\beta=\Phi(\Phi^{-1}$
<>=
## obsolete/not used
powfun = function(mu,sigma,n,alpha=0.05,two.tail=FALSE) {
  if (!two.tail) {
    critlevel = qnorm(1-alpha,mean=0,sd=sigma/sqrt(n))
    pnorm(critlevel,mean=mu,sd=sigma/sqrt(n),lower.tail=FALSE)
  } else {
    upper = qnorm(1-alpha/2,mean=0,sd=sigma/sqrt(n))
    lower = qnorm(alpha/2,mean=0,sd=sigma/sqrt(n))
    pnorm(upper,mean=mu,sd=sigma/sqrt(n),lower.tail=FALSE)+
      pnorm(lower,mean=mu,sd=sigma/sqrt(n),lower.tail=TRUE)
  }
}
@
%\begin{figure}
<>=
xvec = seq(-5,10,by=0.1)
m=2.75
alpha=0.05
upperq=1-alpha
xvec.low = c(xvec[xvec<upperq],upperq)
xvec.hi = c(upperq,xvec[xvec>upperq])
op=par(mfrow=c(1,2))
plot(xvec,dnorm(xvec),type="l",lwd=2,xlab="x",ylab="Probability density")
polygon(c(xvec.low,rev(xvec.low)),c(rep(0,length(xvec.low)),dnorm(rev(xvec.low),mean=m)),col="darkgray",border=NA)
polygon(c(xvec.hi,rev(xvec.hi)),c(rep(0,length(xvec.hi)),dnorm(rev(xvec.hi),mean=m)),col="lightgray",border=NA)
polygon(c(xvec.hi,rev(xvec.hi)),c(rep(0,length(xvec.hi)),dnorm(rev(xvec.hi),mean=0)),col=gray(0.3),border=NA)
curve(dnorm(x),lwd=2,add=TRUE)
upperq = qnorm(1-alpha)
abline(v=upperq,lty=2)
abline(v=0)
lines(xvec,dnorm(xvec,mean=m))
text(upperq+0.2,0.02,expression(alpha),col="white",cex=1.5,adj=0)
text(upperq-0.2,0.02,expression(beta),cex=1.5,adj=1)
text(m,0.2,expression(1-beta),cex=1.5,adj=0.5)
##
curve(powfun(x,sigma=1,n=1,two.tail=TRUE),from=-8,to=8,lwd=2,
   ylab=expression(paste("Power ",(1-beta))),xlab=expression(mu))
par(op)
@
%\caption{Basic power curves (normal distributions)}
%\label{fig:power1}
%\end{figure}
Because it is fundamentally difficult to estimate parameters or test hypotheses with noisy data, and most ecological data sets are noisy, power analyses are often depressing. On the other hand, even if things are bad, it's better to know how bad they are than just to guess; knowing how much you really know is important. In addition, there are design decisions you can make (e.g. number of treatments vs.\ number of replicates per treatment) that optimize power given the constraints of time and money. Remember that systematic biases, pseudo-replication, etc. --- factors that are rarely accounted for in your experimental design or in your power analysis --- are often far more important than the fussy details of your statistical design. While you should quantify the power of your experiment to make sure it has a reasonable chance of success, thoughtful experimental design (e.g. measuring and statistically accounting for covariates such as mass, rainfall, etc.; pairing control and treatment samples; or expanding the range of covariates tested) will make a much bigger difference than tweaking the details of your experiment to squeeze out a little bit more statistical power.
% \newpage
% \section{\R\ supplement}
% <<>>=
% ndist <- sqrt((outer(offspr_x,offspr_x,"-"))^2+(outer(offspr_y,offspr_y,"-"))^2)
% @
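As one concrete sketch of the kind of allocation tradeoff mentioned above (treatments versus replicates per treatment), \R's built-in \code{power.anova.test} can compare two ways of spending the same total sampling effort; the variance components used here are invented purely for illustration:
<<>>=
## 60 plots total: 3 treatments x 20 replicates vs. 6 treatments x 10 replicates
## (between.var = variance among treatment means, within.var = error variance;
##  both values are made up)
power.anova.test(groups=3,n=20,between.var=1,within.var=4)$power
power.anova.test(groups=6,n=10,between.var=1,within.var=4)$power
@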