STATISTICS 2MA3
TEST #1 * 2003-02-06
Instructions

This test is to be written in the BSB Computer Lab. The duration of the test is 2 hours.

Any calculator and one sheet of notes (8.5" x 11", one side only) are permitted. You may refer to the STAT 2MA3 web site, but you may not consult any other web pages.

Use R to do your analyses and draw graphs. Your response to each question should be in the form of a report. Prepare the report in a word processor, integrating graphics and discussion. Print and submit your report at the end of the test period. Only printed reports will be accepted; electronic submissions will not.

Questions

A. Bioavailability Study

In a study to compare the bioavailability of four different preparations of beta-carotene capsules (1 = Solatene 30 mg, 2 = Roche 60 mg, 3 = BASF 30 mg, 4 = BASF 60 mg), 23 volunteers gave fasting blood samples on two consecutive days (baseline 1 and baseline 2). They were then randomized to one of the four preparations and took one pill every other day for 12 weeks. Blood samples were drawn at 6, 8, 10 and 12 weeks and the results (plasma carotene levels) were as shown below.

Using graphical methods, determine whether the two baseline measures are consistent with each other. Which would be more reliable: baseline 1, baseline 2, or an average of the two? Use a comparative box plot to show how the difference between Week 12 and baseline varies among the preparations. Include any other graphs you find informative.

   prep subj bl1 bl2 wk6 wk8 wk10 wk12
1     1   71 298 116 174 178  218  190
2     1   73 124 146 294 278  244  262
3     1   80 176 200 276 286  308  334
4     1   83 116 180 164 238  308  226
5     1   90 152 142 290 300  270  268
6     1   92 106 106 246 206  304  356
7     2   78 114 110 280 220  178  210
8     2   82 106 114 114 176  100  104
9     2   84 100 100 144 114  154  142
10    2   85  92  92 164 116  140  112
11    2   87 212 212 354 430  352  382
12    2   89  92  94 160 200  150  170
13    3   72 180 162 432 336  440  472
14    3   79 186 198 242 252  240  336
15    3   88 202 208 476 408  414  416
16    3   94 192 160 264 252  320  350
17    3   95  80  88 160 152  208  226
18    4   74 174 182 206 268  202  232
19    4   75 252 234 590 594  522  566
20    4   76 210 230 474 500  472  444
21    4   77 162 152 202 204  140  180
22    4   86  68  64 262 216  214  216
23    4   93  74  72 218 164  218  184
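
As a possible starting point only, the sketch below reads the table above into R and draws the two graphs the question names. The object name `carotene` and the derived columns `baseline` and `change` are assumptions made for this illustration. Taking the mean of the two baselines is a placeholder choice, not a recommended answer; substitute whichever baseline your own graphical comparison supports.

```r
# Plasma carotene data copied from the table above.
# The name `carotene` is an assumption made for this sketch.
carotene <- read.table(header = TRUE, text = "   prep subj bl1 bl2 wk6 wk8 wk10 wk12
1     1   71 298 116 174 178  218  190
2     1   73 124 146 294 278  244  262
3     1   80 176 200 276 286  308  334
4     1   83 116 180 164 238  308  226
5     1   90 152 142 290 300  270  268
6     1   92 106 106 246 206  304  356
7     2   78 114 110 280 220  178  210
8     2   82 106 114 114 176  100  104
9     2   84 100 100 144 114  154  142
10    2   85  92  92 164 116  140  112
11    2   87 212 212 354 430  352  382
12    2   89  92  94 160 200  150  170
13    3   72 180 162 432 336  440  472
14    3   79 186 198 242 252  240  336
15    3   88 202 208 476 408  414  416
16    3   94 192 160 264 252  320  350
17    3   95  80  88 160 152  208  226
18    4   74 174 182 206 268  202  232
19    4   75 252 234 590 594  522  566
20    4   76 210 230 474 500  472  444
21    4   77 162 152 202 204  140  180
22    4   86  68  64 262 216  214  216
23    4   93  74  72 218 164  218  184
")

# Baseline agreement: points close to the dashed line y = x
# indicate that the two baseline measures agree for that subject.
plot(carotene$bl1, carotene$bl2,
     xlab = "Baseline 1", ylab = "Baseline 2")
abline(0, 1, lty = 2)

# Placeholder choice of baseline (mean of the two); this is an
# assumption of the sketch, to be replaced by a justified choice.
carotene$baseline <- (carotene$bl1 + carotene$bl2) / 2
carotene$change <- carotene$wk12 - carotene$baseline

# Comparative box plot of the Week 12 change by preparation.
boxplot(change ~ factor(prep), data = carotene,
        xlab = "Preparation", ylab = "Week 12 minus baseline")
```

Because the header line has one fewer field than the data rows, `read.table` treats the first column as row names, which matches the layout of the printed table.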

B. Niagara River Pollution

The file diesol.html gives dieldrin in solids, measured approximately weekly from 1986-04-02 to 1996-03-28 in the Niagara River at Niagara-on-the-Lake. There are 6 columns: julian = Julian date, year = year, month = month, day = day of the month, diel.sol = dieldrin concentration in ng/g, dl = TRUE if an upper or lower detection limit was reached, FALSE otherwise.

Study the time sequence graphically, looking for the following features: trend, cyclic effects, change-points, autocorrelation. Make a comparative box plot to compare dieldrin concentration over years, and another to compare dieldrin concentration over the 12 months of the year.
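
The graphs requested above could be produced along the following lines. This is a sketch under stated assumptions: the data frame `diesol` built here contains simulated placeholder values, not the Niagara River measurements, and only borrows the column names given in the file description. The helper column `date` and the smoothing span `f = 1/3` are also choices made for illustration.

```r
# Placeholder data using the documented column names. The values are
# SIMULATED for illustration and are NOT the real measurements.
set.seed(1)
when <- seq(as.Date("1986-04-02"), as.Date("1996-03-28"), by = "week")
diesol <- data.frame(year = as.integer(format(when, "%Y")),
                     month = as.integer(format(when, "%m")),
                     day = as.integer(format(when, "%d")),
                     diel.sol = rlnorm(length(when), meanlog = 1, sdlog = 0.5))

# Rebuild a Date from year/month/day so the time axis is labelled in years.
diesol$date <- as.Date(paste(diesol$year, diesol$month, diesol$day, sep = "-"))

# Time sequence plot with a lowess smooth to help judge trend,
# cyclic effects and possible change-points.
plot(diesol$date, diesol$diel.sol, type = "l",
     xlab = "Date", ylab = "Dieldrin in solids (ng/g)")
lines(lowess(as.numeric(diesol$date), diesol$diel.sol, f = 1/3), lwd = 2)

# Sample autocorrelation function. It treats the readings as equally
# spaced, which is only approximately true for near-weekly sampling.
acf(diesol$diel.sol, na.action = na.pass)

# Comparative box plots over years and over the 12 months.
boxplot(diel.sol ~ year, data = diesol,
        xlab = "Year", ylab = "Dieldrin (ng/g)")
boxplot(diel.sol ~ factor(month, levels = 1:12), data = diesol,
        xlab = "Month", ylab = "Dieldrin (ng/g)")
```

With the real file, the simulated `diesol` would be replaced by the data read from diesol.html; the plotting calls would stay the same.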

Are the detection limits a problem in interpreting these data? Do the detection limits affect the medians in the box plots? Does a log transformation make the data easier to interpret? Is the level of pollution decreasing? Justify your conclusions.
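
One way to approach the detection-limit and log-transformation questions is sketched below. It rests on several assumptions: the data are simulated placeholders rather than the real measurements, only a lower detection limit is modelled, and the limit of 1 ng/g is an arbitrary value chosen for illustration. The name `frac_dl` is likewise hypothetical.

```r
# Placeholder data: SIMULATED, not the real measurements. The lower
# detection limit of 1 ng/g is an arbitrary illustrative value.
set.seed(2)
n <- 200
diesol <- data.frame(year = sample(1986:1996, n, replace = TRUE),
                     diel.sol = rlnorm(n, meanlog = 0.5, sdlog = 0.7))
diesol$dl <- diesol$diel.sol < 1        # flag readings below the limit
diesol$diel.sol[diesol$dl] <- 1         # report them at the limit itself

# Fraction of flagged readings in each year. When well under half of
# a year's readings are flagged, that year's median is unlikely to be
# a censored value.
frac_dl <- tapply(diesol$dl, diesol$year, mean)
print(round(frac_dl, 2))

# Box plots by year on a log scale. Because log is monotone, the
# quartiles keep the same rank positions; only the spacing changes.
boxplot(diel.sol ~ year, data = diesol, log = "y",
        xlab = "Year", ylab = "Dieldrin (ng/g, log scale)")
```

The same two steps apply to the real data using its own `dl` column, keeping in mind that the file flags both upper and lower limits.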

