Spend some time with the On-line Splus Tutorial from Dalhousie University. If you are going to be using the UNIX versions of Splus, read the relevant sections of Running Splus at McMaster.
If you are using R in the BSB Windows labs, copy the shortcut to R from the course folder to the desktop. Edit the Properties of the new shortcut, so that R will "Start in" a folder of your choosing, such as "D:\Temp\myRproject1". Save the workspace image and session history to this folder. After your session, copy the whole folder myRproject1 to a floppy disk or by FTP to a remote server, or e-mail it to yourself. To begin a new session where you left off, copy myRproject1 from the floppy disk or remote server to the hard drive of the machine you are on.
The same procedure will work for Splus 2000, but by default Splus will start in D:\Temp\sp2000\userid so you will not usually have to set the "Start in" folder to anything else.
> xs <- seq(-4, 4, length = 100) > plot(xs, dnorm(xs), type = "l", main = "Standard Normal pdf")
> xx <- rnorm(100) > mean(xx) [1] 0.1211835 > sqrt(var(xx)) [1] 1.181486 > hist(xx)
> xb <- seq(-4, 4, by = 0.5) > hist(xx, breaks = xb)
> lines(xs, 100*0.5*dnorm(xs))
Note that plot() initiates a new plot but lines() adds lines to an existing plot. Explain the scaling factor.
> hist(xx, breaks = xb, prob = T) > lines(xs, dnorm(xs))
Each observation is a stand of trees observed by satellite and on the ground.
ht Mean height dbh Mean diameter (at breast height) bnd5 Satellite Band 5 cci Crown cover index (fraction of ground covered by the canopy) type "Damaged" (road construction, wind damage, etc.) or "Undamaged".
Get the data and save it in a file called trees.txt. This is the 1995 data extracted from a larger Case Study. Click here to see the Case Study. Use
> trees <- read.table("trees.txt",header=T)
to read data from the file into a data frame called trees. Attach the data frame to your session. Confirm the names of the 5 variables. Look at the first 6 rows to see what is there. Draw a scatterplot matrix for all variables, then for the 4 quantitative variables, for all stands of trees. Draw a scatterplot matrix for the 4 quantitative variables for the undamaged stands alone, and for the damaged stands alone.
> attach(trees) > dimnames(trees)[[2]] [1] "ht" "dbh" "bnd5" "cci" "type" > trees[1:6,] ht dbh bnd5 cci type 1 1.00 NA 102 0.04146902 undamaged 2 1.21 NA 99 0.05835823 undamaged 3 1.21 NA 99 0.06029240 undamaged 4 1.18 NA 102 0.06414085 undamaged 5 1.03 NA 103 0.04427970 undamaged 6 1.06 NA 104 0.05017752 undamaged > pairs(trees) > pairs(trees[, 1:4]) > pairs(trees[type == "undamaged", 1:4]) > pairs(trees[type == "damaged", 1:4])
There are some missing values in dbh. Does pairs() eliminate missing vales pairwise or listwise? One way to find out would be to do a pairs plot of the first 71 rows.
Because we have attached the data frame, we can refer directly to the variables within it, i.e. we can write cci instead of trees$cci.
> boxplot(split(cci, type))
What have you learned about the trees? Can the satellite readings predict crown cover? Discuss any special features you see in the data.
Try brush and spin plots in Splus for the trees data, spinning the variables 3 at a time.
The best way to move objects between R and Splus is to use dput() in one environment to create a text file with a portable version of the object, then dget() to import the file to the workspace of the other.
For example, if you have created the object trees in R, write it to a file called trees.dat in your R working directory. Be careful not to conflict with existing file names.
> dput(trees, "trees.dat")
Move the file trees.dat to your Splus working directory. Now you can create a new object trees in Splus with
> trees <- dget("trees.dat")
You don't have to move the file if you can remember the path. Remember that in R and Splus you can specify a path either with forward slashes or double back-slashes; the DOS convention of single back-slashes is not permitted. Either of the following will work if the file trees.dat is in D:\Temp.
> trees <- dget("D:/Temp/trees.dat") > trees <- dget("D:\\Temp\\trees.dat")
Try the data in Table 1.2.3 of Srivastava. The data are available online.
Use fix(by2) to write a function by2() which multiplies a given value by 2. Note that the value returned by a function is the last value computed (unlike other programming languages which require an assignment statement) and the default value of the argument is 1.
> by2 function(x = 1) { 2 * x } > by2() [1] 2 > by2(99) [1] 198
Use fix(norm3d) to write a function to prepare a grid, compute the standard bivariate normal probability density function over the grid, and display it as either a perspective plot or a contour plot.
Note that the scaling factor scl could be omitted in this calculation as it does not affect the shape of the distribution. Look at the first few rows of xynor; what did expand.grid do? What is the purpose of invisible() on the last line?
> norm3d function(view = "p") { xnor <- seq(-3., 3., by = 0.1) ynor <- xnor xynor <- expand.grid(list(xnor, ynor)) scl <- 1/(2 * pi) znor <- matrix(scl * exp(-0.5 * (xynor[, 1.]^2. + xynor[, 2.]^2.)), nrow = length(xnor)) if(view == "p") persp(xnor, ynor, znor) if(view == "c") contour(xnor, ynor, znor) invisible() }
There is more than one way to make this plot. Try the different functions described in Plotting the Bivariate Standard Normal Density in Splus or R.
Try the example discussed in class, Drawing an ellipse in Splus or R.