Spend some time with the On-line Splus Tutorial from Dalhousie University. If you are going to be using the UNIX versions of Splus, read the relevant sections of Running Splus at McMaster.
If you are using R in the BSB Windows labs, copy the shortcut to R from the course folder to the desktop. Edit the Properties of the new shortcut, so that R will "Start in" a folder of your choosing, such as "D:\Temp\myRproject1". Save the workspace image to this folder. After your session, copy the whole folder myRproject1 to a floppy disk or, by FTP, to a remote server. To begin a new session where you left off, copy myRproject1 from the floppy disk or remote server to the hard drive of the machine you are on.
The same procedure will work for Splus 2000, but by default Splus will start in D:\Temp\sp2000\userid so you will not usually have to set the "Start in" folder to anything else.
> xs <- seq(-4, 4, length = 100) > plot(xs, dnorm(xs), type = "l", main = "Standard Normal pdf")
> xx <- rnorm(100) > mean(xx) [1] 0.1211835 > sqrt(var(xx)) [1] 1.181486 > hist(xx)
> xb <- seq(-4, 4, by = 0.5) > hist(xx, breaks = xb)
> lines(xs, 100*0.5*dnorm(xs))
Note that plot() initiates a new plot but lines() adds lines to an existing plot. Explain the scaling factor.
Each observation is a stand of trees observed by satellite and on the ground.
ht Mean height dbh Mean diameter (at breast height) bnd5 Satellite Band 5 cci Crown cover index (fraction of ground covered by the canopy) type "Damaged" (road construction, wind damage, etc.) or "Undamaged".
Get the data and save it in a file called trees.txt. Use
> trees <- read.table("trees.txt",header=T)
to read data from the file into a data frame called trees. Attach the data frame to your session. Confirm the names of the 5 variables. Look at the first 6 rows to see what is there. Draw a scatterplot matrix for all variables, then for the 4 quantitative variables, for all stands of trees. Draw a scatterplot matrix for the 4 quantitative variables for the undamaged stands alone, and for the damaged stands alone.
> attach(trees) > dimnames(trees)[[2]] [1] "ht" "dbh" "bnd5" "cci" "type" > trees[1:6,] ht dbh bnd5 cci type 1 1.00 NA 102 0.04146902 undamaged 2 1.21 NA 99 0.05835823 undamaged 3 1.21 NA 99 0.06029240 undamaged 4 1.18 NA 102 0.06414085 undamaged 5 1.03 NA 103 0.04427970 undamaged 6 1.06 NA 104 0.05017752 undamaged > pairs(trees) > pairs(trees[, 1:4]) > pairs(trees[type == "undamaged", 1:4]) > pairs(trees[type == "damaged", 1:4])
There are some missing values in dbh. Does pairs() eliminate missing vales pairwise or listwise? One way to find out would be to do a pairs plot of the first 71 rows.
Because we have attached the data frame, we can refer directly to the variables within it, i.e. we can write cci instead of trees$cci.
> boxplot(split(cci, type))
What have you learned about the trees? Can the satellite readings predict crown cover? Discuss any special features you see in the data.
Use fix(by2) to write a function by2() which multiplies a given value by 2. Note that the value returned by a function is the last value computed (unlike other programming languages which require an assignment statement) and the default value of the argument is 1.
> by2 function(x = 1) { 2 * x } > by2() [1] 2 > by2(99) [1] 198
Use fix(norm3d) to write a function to prepare a grid, compute the standard bivariate normal probability density function over the grid, and display it as either a perspective plot or a contour plot.
Note that the scaling factor scl could be omitted in this calculation as it does not affect the shape of the distribution. Look at the first few rows of xynor; what did expand.grid do? What is the purpose of invisible() on the last line?
> norm3d function(view = "p") { xnor <- seq(-3., 3., by = 0.1) ynor <- xnor xynor <- expand.grid(list(xnor, ynor)) scl <- 1/(2 * pi) znor <- matrix(scl * exp(-0.5 * (xynor[, 1.]^2. + xynor[, 2.]^2.)), nrow = length(xnor)) if(view == "p") persp(xnor, ynor, znor) if(view == "c") contour(xnor, ynor, znor) invisible() }
Try the example discussed in class, Drawing an ellipse in Splus or R.
Try spin plots for the trees data, taking variables 3 at a time.
Try Exercise 1.14 from the text. The data are on the data disk that comes with the book.
Try Exercises 1.24 and 1.25. The data are on the data disk.