This test is to be written in the BSB Computer Lab. The duration of the test is 2 hours.
Any calculators and one sheet of notes (8.5" x 11", one side only) are permitted. You may refer to the STAT 2MA3 web site, but you may not consult any other web pages.
Use R to do your analyses and draw graphs. Your response to each question should be in the form of a report. Prepare the report in a word processor, integrating graphics and discussion. Print out and submit your report at the end of the test period. Only printed reports will be accepted; electronic submissions will not.
In a study to compare the bioavailability of four different preparations of beta-carotene capsules (1 = Solatene 30 mg, 2 = Roche 60 mg, 3 = BASF 30 mg, 4 = BASF 60 mg), 23 volunteers gave fasting blood samples on two consecutive days and were then randomized to one of the four preparations, taking one pill every other day for 12 weeks. Blood samples were drawn at 6, 8, 10 and 12 weeks; the results (plasma carotene levels) are shown below.
Using graphical methods, determine if the two baseline measures are consistent with each other. Which would be more reliable, baseline 1, baseline 2, or an average of the two? Use a comparative box plot to show how the difference between Week 12 and baseline varies between preparations. Give any other graphs you think are interesting.
   prep subj bl1 bl2 wk6 wk8 wk10 wk12
1     1   71 298 116 174 178  218  190
2     1   73 124 146 294 278  244  262
3     1   80 176 200 276 286  308  334
4     1   83 116 180 164 238  308  226
5     1   90 152 142 290 300  270  268
6     1   92 106 106 246 206  304  356
7     2   78 114 110 280 220  178  210
8     2   82 106 114 114 176  100  104
9     2   84 100 100 144 114  154  142
10    2   85  92  92 164 116  140  112
11    2   87 212 212 354 430  352  382
12    2   89  92  94 160 200  150  170
13    3   72 180 162 432 336  440  472
14    3   79 186 198 242 252  240  336
15    3   88 202 208 476 408  414  416
16    3   94 192 160 264 252  320  350
17    3   95  80  88 160 152  208  226
18    4   74 174 182 206 268  202  232
19    4   75 252 234 590 594  522  566
20    4   76 210 230 474 500  472  444
21    4   77 162 152 202 204  140  180
22    4   86  68  64 262 216  214  216
23    4   93  74  72 218 164  218  184
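As a starting point only, the table above could be read into R along the following lines. This is a minimal sketch, not a model answer: the names carotene, blmean and change are illustrative and do not come from the test, and using the average of the two baselines for the change score is a placeholder assumption, since choosing the baseline is part of the question.

```r
# Read the table; the header has one fewer field than the data rows,
# so read.table() takes the first (unnamed) column as row names.
carotene <- read.table(header = TRUE, text = "
prep subj bl1 bl2 wk6 wk8 wk10 wk12
1 1 71 298 116 174 178 218 190
2 1 73 124 146 294 278 244 262
3 1 80 176 200 276 286 308 334
4 1 83 116 180 164 238 308 226
5 1 90 152 142 290 300 270 268
6 1 92 106 106 246 206 304 356
7 2 78 114 110 280 220 178 210
8 2 82 106 114 114 176 100 104
9 2 84 100 100 144 114 154 142
10 2 85 92 92 164 116 140 112
11 2 87 212 212 354 430 352 382
12 2 89 92 94 160 200 150 170
13 3 72 180 162 432 336 440 472
14 3 79 186 198 242 252 240 336
15 3 88 202 208 476 408 414 416
16 3 94 192 160 264 252 320 350
17 3 95 80 88 160 152 208 226
18 4 74 174 182 206 268 202 232
19 4 75 252 234 590 594 522 566
20 4 76 210 230 474 500 472 444
21 4 77 162 152 202 204 140 180
22 4 86 68 64 262 216 214 216
23 4 93 74 72 218 164 218 184
")

# Label the preparations using the coding given in the question.
carotene$prep <- factor(carotene$prep, levels = 1:4,
                        labels = c("Solatene 30 mg", "Roche 60 mg",
                                   "BASF 30 mg", "BASF 60 mg"))

# Baseline 2 against baseline 1, with the line of equality for reference.
plot(carotene$bl1, carotene$bl2,
     xlab = "Baseline 1", ylab = "Baseline 2")
abline(0, 1)

# Placeholder assumption: average baseline; swap in bl1 or bl2 as appropriate.
carotene$blmean <- (carotene$bl1 + carotene$bl2) / 2
carotene$change <- carotene$wk12 - carotene$blmean

# Comparative box plot of the Week 12 change by preparation.
boxplot(change ~ prep, data = carotene,
        ylab = "Week 12 minus baseline")
```

Replacing blmean with bl1 or bl2 in the definition of change gives the corresponding plots for the other baseline choices.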
The file diesol.html gives dieldrin in solids, measured more or less weekly from 1986-04-02 to 1996-03-28 in the Niagara River at Niagara-on-the-Lake. There are 6 columns: julian = Julian Date, year = Year, month = Month, day = Day of the month, diel.sol = Dieldrin concentration in ng/g, dl = TRUE if an upper or lower detection limit was reached and FALSE otherwise.
Study the time sequence graphically, looking for the following features: trend, cyclic effects, change-points, autocorrelation. Make a comparative box plot to compare dieldrin concentration over years, and another to compare dieldrin concentration over the 12 months of the year.
Are the detection limits a problem in interpreting these data? Do the detection limits affect the medians in the box plots? Does a log transformation make the data easier to interpret? Is the level of pollution decreasing? Justify your conclusions.