This seems to be the best way for me to have a set of files available for my classmates, so the rest of you are just going to have to put up with this…

First, we have the R script provided by the authors of the BAT package (with a few minor additions of my own). Start by copying and pasting this chunk so we can go through and make sure it is working for everyone:

library(BAT)

### —– PROVIDED EXAMPLE —– ###

data(arrabida)

print(arrabida[1:30,1:8]) # have a look at how the file is formatted# arrange data

comm <- rbind(colSums(arrabida), colSums(geres), colSums(guadiana))

sites <- c(“Arrabida”, “Geres”, “Guadiana”)

row.names(comm) <- sites# alpha diversity

alpha(comm)

alpha(comm, phylotree)

alpha(comm, functree)raref.td <- alpha(comm, raref=1000)

raref.pd <- alpha(comm, phylotree, raref=1000)

raref.fd <- alpha(comm, functree, raref=1000)

par(mfrow=c(1,3))

boxplot(t(raref.td), names=sites, main=”TD”, pars=list (boxwex=0.8, staplewex=0, outwex=0.5),

range=1.5, lwd=0.8, lty=1, col=c(“grey30”, “grey60”, “grey90″), ylab=expression(alpha))

boxplot(t(raref.pd), names=sites, main=”PD”, pars=list (boxwex=0.8, staplewex=0, outwex=0.5),

range=1.5, lwd=0.8, lty=1, col=c(“grey30”, “grey60”, “grey90″), ylab=expression(alpha))

boxplot(t(raref.fd), names=sites, main=”FD”, pars=list (boxwex=0.8, staplewex=0, outwex=0.5),

range=1.5, lwd=0.8, lty=1, col=c(“grey30”, “grey60”, “grey90″), ylab=expression(alpha))alpha.estimate(comm)

alpha.estimate(comm, phylotree)

alpha.estimate(comm, functree)# accumulation curves

acc.arrabida <- alpha.accum(arrabida) # this can take a while

par(mfrow=c(1,2))

plot(acc.arrabida[,2], acc.arrabida[,17], xlab=”Individuals”, ylab=”Chao 1P”)

plot(acc.arrabida[,2], slope(acc.arrabida)[,17], xlab=”Individuals”, ylab=”Slope”)

accuracy(acc.arrabida, target=170)# beta diversity

beta(comm)

beta.multi(comm, phylotree, raref=1)

acc.beta <- beta.accum(arrabida, geres, abund=T) # this can take a while

par(mfrow=c(1,3))

plot(acc.beta[,2], xlab=”Sampling Units”, ylab=expression(beta[total]))

plot(acc.beta[,3], xlab=”Sampling Units”, ylab=expression(beta[repl]))

plot(acc.beta[,4], xlab=”Sampling Units”, ylab=expression(beta[rich]))### —– END OF EXAMPLE —– ###

Now we get to the fun stuff – experimenting with an unstudied dataset!

There are several files you will need, including the field monitoring dataset (QuadratCoverBAT), the taxonomy for species found at the site (OrderFamily), and the distance matrix (DisMatrix).

Look at the field dataset and compare it to the arrabida set. We are going to need to make some changes…

### REFORMAT FIELD DATA ###

library(reshape2)

anyDuplicated(QuadratCoverBAT) # Finds duplicated records

QuadratCover <- unique(QuadratCoverBAT) # eliminate duplicates because otherwise they will be added in the next step

MyData <- dcast(QuadratCover, PinTag+PinLocation~ScientificName, value.var=”Cover”, fun.aggregate=sum)

SiteData<- dcast(QuadratCover, PinLocation~ScientificName, value.var=”Cover”, fun.aggregate=length)

# convert PinLocaton to row name field

row.names(SiteData) <- SiteData$PinLocation

SiteData$PinLocation <- NULL

Does this look more like the arrabida dataset? If so, then lets look at the alpha and beta diversity.

### MEASURING DIVERSITY ###

# alpha diversity within each of the habitats

alpha(SiteData)# beta diversity among the habitats

beta(SiteData)

# 1=Mulch Woods; 2=N Prairie; 3=N Woods; 4=Pasture Prairie; 5=S Prairie; 6=S Woods

How about we try to look at the taxonomic diversity? We will follow the same rules as in the example:

- Same species = 0
- Same genus, not species = 0.25
- Same family, not genus = 0.5
- Same order, not family = 0.75
- Different order = 1

Can we generate a matrix in R using the taxonomy file? I could not make it work, but maybe we can work through it together.

If we can’t generate a working matrix, we can use one that I made by hand (but since it is by hand, there may be errors). It still needs some adjustments so that R knows what to do with it:

### GENERATE TAXONOMIC DISTANCE MATRIX ###

rownames(DisMatrix) <- DisMatrix[,1]

DisMatrix[,1] <- NULLdistance.matrix <- as.matrix(DisMatrix)

isSymmetric(distance.matrix) #headings must also be symmetric; don’t have any “-” in your headings!taxa <- hclust(as.dist(distance.matrix, diag=F), method=”complete”)

Then we can determine the taxonomic diversity of our sites:

### TAXONOMIC DIVERSITY ###

# alpha diversity within each of the habitats

alpha(SiteData, taxa) # interpretation??# beta diversity among the habitats

beta(SiteData, taxa) # interpretation??

Can you graph any of these and draw any conclusions about the habitats?