I think the best way to manage the .RData and .Rhistory files is to set up a separate folder for each project or assignment. Make a copy of the shortcut to the R application, then open the shortcut's properties (right-click on the shortcut icon) and set it to start in the project folder you made. .RData and .Rhistory will then be stored in that folder; copy both to a floppy if you want to take your work to a new computer at the end of a session. If you are working in the BSB lab, remember that you can't write to Drive K, so the best place to create the folder is in D:\Temp.
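The same arrangement can be checked from inside R. This is only a minimal sketch: the working directory plays the role of the shortcut's start-in folder, and save.image() writes .RData there. The folder name assignment1 and the use of tempdir() are placeholders of mine, not part of the course setup; in the lab you would point proj at a folder under D:\Temp instead.

```r
# Placeholder project folder (an assumption); in practice use your own,
# for example a folder under D:/Temp.
proj <- file.path(tempdir(), "assignment1")
dir.create(proj, showWarnings = FALSE, recursive = TRUE)

old <- setwd(proj)     # setwd() returns the previous working directory
getwd()                # confirm where R will now read and write files

x <- c(1.5, 2.5, 6.0)  # any object, so the workspace is not empty
save.image()           # writes .RData into the current working directory
saved <- file.exists(file.path(proj, ".RData"))

setwd(old)             # go back to where we started
```

In an interactive session, savehistory() should write .Rhistory to the working directory in the same way.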
Use the data frame mydata you set up when you were learning R.
> mydata$name <- c("Joe","Bill","Sam","Beth","Sue")
> mydata
    y  x1 x2 name
1 1.2 1.5  1  Joe
2 3.6 2.5  1 Bill
3 5.1 6.0  1  Sam
4 4.2 3.1  2 Beth
5 2.1 2.2  2  Sue
> lmfit <- lm(y~x1, data=mydata)
> lmfit

Call:
lm(formula = y ~ x1, data = mydata)

Coefficients:
(Intercept)           x1
     0.8519       0.7804

> plot(mydata$x1, mydata$y, xlab="x1", ylab="y", type="n")
> text(mydata$x1, mydata$y, mydata$name)
> title("An X-Y Text Plot")
> abline(lmfit)
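If mydata is no longer in your workspace, it can be rebuilt from the values printed in the session above. This is a sketch under one assumption: that y, x1 and x2 were all stored as plain numeric columns (x2 may originally have been a factor).

```r
# Values copied from the printed data frame in the session above.
# Assumption: all three columns are numeric.
mydata <- data.frame(
  y  = c(1.2, 3.6, 5.1, 4.2, 2.1),
  x1 = c(1.5, 2.5, 6.0, 3.1, 2.2),
  x2 = c(1, 1, 1, 2, 2)
)
mydata$name <- c("Joe", "Bill", "Sam", "Beth", "Sue")

# Refit the regression; the rounded coefficients should agree with
# the ones printed in the session.
lmfit <- lm(y ~ x1, data = mydata)
round(coef(lmfit), 4)
```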
Plot the Binomial distribution by setting up a spreadsheet with consecutive values of x in the first column, f(x) in the second column, and values for n and p in nearby cells. The graph should automatically redraw if n or p changes. Repeat for the Poisson distribution.
If you're not sure what I'm asking for here, see the Excel workbook distributions.xls.
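One way to check the f(x) column of your spreadsheet is to compute the same probabilities in R. The sketch below uses dbinom() and dpois(); the values n = 10, p = 0.3 and lambda = 3 are illustrative choices of mine, not values given in the exercise.

```r
# Illustrative parameter values (assumptions, not from the exercise).
n <- 10
p <- 0.3
lambda <- 3

x <- 0:n

# Two-column tables laid out like the spreadsheet: x, then f(x).
binom <- data.frame(x = x, fx = dbinom(x, size = n, prob = p))
pois  <- data.frame(x = x, fx = dpois(x, lambda = lambda))

print(round(binom, 4))
print(round(pois, 4))
```

Change n, p or lambda and rerun to compare against the spreadsheet after you change the corresponding cells. Note that the Poisson table is cut off at x = n here, so its probabilities sum to slightly less than 1.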
Generate 20 observations from a standard normal distribution and draw a graph showing a histogram (as relative frequencies), a smoothed density estimate, a dot plot, and the true standard normal probability density function.
Repeat this a few times with n = 20, then a few times with n = 40, a few times with n = 100, a few times with n = 1000, and a few times with n = 10000. How many observations do you need before you can say with any certainty whether or not a given sample came from a Normal distribution?
If you have time, do this again for a skewed distribution, such as the chi-square distribution with 1 or 3 degrees of freedom.
Since creating the graph involves several steps, you might want to write a function normdat(n) so you don't have to type in all the steps every time. The easiest way to write this function is to type fix(normdat); this will open a text editor, where you can write the function, then save it as you exit the editor. The same command fix(normdat) is also a convenient way to edit or modify an existing function.
> normdat
function(n = 50)
{
    xdat <- rnorm(n)
    hist(xdat, prob = TRUE)
    lines(density(xdat))
    points(xdat, rep(0, n))
    xgr <- seq(-4, 4, length = 100)
    lines(xgr, dnorm(xgr), lty = 2)
}
Once normdat() is written, you just have to type normdat(20) a few times, normdat(40) a few times, etc., to complete the exercise.
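For the optional skewed-distribution part, the same idea can be adapted to chi-square data. The function below is a sketch only: the name chisqdat and its df argument are my own, and the density grid starts just above zero because the chi-square density with 1 degree of freedom is unbounded at 0.

```r
# chisqdat() is a hypothetical name; it mirrors normdat() for
# chi-square samples with df degrees of freedom.
chisqdat <- function(n = 50, df = 3) {
  xdat <- rchisq(n, df = df)
  hist(xdat, freq = FALSE)            # histogram as relative frequencies
  lines(density(xdat))                # smoothed density estimate
  points(xdat, rep(0, n))             # dot plot along the x-axis
  # Start the grid slightly above 0: dchisq(0, df = 1) is infinite.
  xgr <- seq(0.01, max(xdat), length.out = 200)
  lines(xgr, dchisq(xgr, df = df), lty = 2)   # true density, dashed
  invisible(xdat)                     # return the sample without printing it
}
```

As with normdat(), you would then type chisqdat(20, df = 1) a few times, chisqdat(40, df = 1) a few times, and so on, and repeat with df = 3.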