Statistics 3N03 - Assignment #1

2001-09-25

Due: 2001-10-03 18:00


The following problems and data sets are taken from Montgomery & Runger, Applied Statistics and Probability for Engineers, 2nd edition. You may use any software you like. Submit your work as a report, pasting the graphs into a word processor and adding comments and discussion.

You don't have to type in all the data sets; those from Chapters 1-8 are available online.

2-23 (p 42)

Follow the instructions in the text and also do a lag-1 scatter plot. Is there evidence of trend or autocorrelation?
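A lag-1 scatter plot graphs each observation against the one that follows it, so its points are the pairs (x[t], x[t+1]). The sketch below, in Python using only the standard library, builds those pairs and computes the sample lag-1 autocorrelation as a numerical companion to the plot. The function names and the two short series are made up for illustration and are not the textbook data.

```python
from statistics import mean


def lag1_pairs(series):
    """Return the (x[t], x[t+1]) pairs plotted in a lag-1 scatter plot."""
    return list(zip(series[:-1], series[1:]))


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation:
    r1 = sum((x[t] - m) * (x[t+1] - m)) / sum((x[t] - m)^2), m = mean of series.
    """
    m = mean(series)
    numerator = sum((a - m) * (b - m) for a, b in lag1_pairs(series))
    denominator = sum((x - m) ** 2 for x in series)
    return numerator / denominator


# Made-up illustrative values, NOT the data from the textbook problems.
trending = [1, 2, 3, 4, 5, 6, 7, 8]            # steady upward trend
alternating = [1, -1, 1, -1, 1, -1, 1, -1]     # high, low, high, low, ...

pairs = lag1_pairs(trending)                   # points for the scatter plot
r_trend = lag1_autocorrelation(trending)       # positive
r_alt = lag1_autocorrelation(alternating)      # negative
```

The pairs can be passed to whatever plotting tool you are using. Note that a trend by itself produces positive lag-1 correlation, so look at the time series plot and the lag-1 plot together.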

2-31 (p 45)

Follow the instructions in the text and also do a lag-1 scatter plot. Is there evidence of trend or autocorrelation?

Crude Oil Data

The attached data file gives measurements of trace elements (vanadium, iron, and beryllium, all in % ash) and hydrocarbons (saturated and unsaturated, both in % area) in chemically analyzed samples of crude oil from three zones of sandstone (Wilhelm, Sub-Mulinia, Upper Mulinia). The data are listed in an arbitrary order within each zone.

Use a scatterplot matrix to study relations between the variables and use histograms to assess normality. Use box plots to look for differences between the zones. State your conclusions. Why would time series plots and lag-1 plots be inappropriate for these data?

13-8 (p 640)

Do graphical analyses using comparative box plots to compare crack growth rates between the three frequencies, between the three environments, and between the nine different combinations of frequency and environment. Repeat using the log of crack growth rate. State your conclusions. (The question asks for a test of hypothesis and an analysis of residuals but you are not expected to do those for this assignment.)

Hint: Review the notes on making comparative box plots. Set up the data in three columns and 36 rows. Put the 36 growth rate measures in the first column "growth". Put codes for the three different frequencies in the second column "freq" and codes for the three different environments in the third column "envir". Make a fourth column "freq.envir" with codes A to I or 1 to 9 (or air.f10, air.f1, etc.) to indicate the nine different combinations of frequency and environment. Now make three comparative box plots: growth vs freq (3 boxes on one graph), growth vs envir (3 boxes), and growth vs freq.envir (9 boxes). Finally, make a new column of log(growth) and repeat the three comparative box plots. (Note that in R you don't have to make a new column of log(growth); you can just add the option log="y" to the plot command to transform the Y-axis to a log scale.)
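The data layout described in the hint can be sketched as follows, in Python using only the standard library. Everything named here is an assumption for illustration: the level codes (only "air.f10" and "air.f1" appear in the hint), the four replicates per combination (36 rows divided by 9 combinations), the order in which the values are entered, and the placeholder growth values, which are not the textbook data.

```python
from collections import defaultdict
from itertools import product
from math import log

# Assumed level codes; substitute the levels given in the textbook table.
FREQS = ["f10", "f1", "f0.1"]
ENVIRS = ["air", "water", "salt"]
REPLICATES = 4  # assumed: 36 rows / 9 combinations


def build_rows(growth_values):
    """Attach freq, envir and freq.envir codes to the 36 growth values.

    Assumes the values are entered frequency by frequency, and within each
    frequency environment by environment, with the replicates together.
    """
    design = [(f, e) for f, e in product(FREQS, ENVIRS) for _ in range(REPLICATES)]
    if len(growth_values) != len(design):
        raise ValueError("expected %d growth values" % len(design))
    return [
        {"growth": g, "freq": f, "envir": e, "freq.envir": f"{e}.{f}"}
        for g, (f, e) in zip(growth_values, design)
    ]


def group_by(rows, factor, transform=lambda x: x):
    """Collect the (optionally transformed) growth values for each level of
    a factor; each resulting list is the data for one box in the plot."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor]].append(transform(row["growth"]))
    return dict(groups)


# Placeholder values only, NOT the textbook data.
placeholder_growth = [1.0 + 0.1 * i for i in range(36)]

rows = build_rows(placeholder_growth)
by_freq = group_by(rows, "freq")              # 3 boxes
by_envir = group_by(rows, "envir")            # 3 boxes
by_combo = group_by(rows, "freq.envir")       # 9 boxes
log_by_combo = group_by(rows, "freq.envir", transform=log)  # log(growth)
```

Each dictionary maps a level code to the list of values for one box, so the three comparative box plots and their log versions come from the same four columns without retyping the data.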

