(a) Heteroscedasticity is the condition where different groups within a population have different variances. [1 mark]
A parameter is a scalar or vector that indexes a family of probability distributions. [1 mark]
A test statistic can be derived from a pivotal quantity by replacing the unknown parameter by its hypothesized value. [1 mark]
The distribution of a test statistic when the null hypothesis is true is called the reference distribution for the test. [1 mark]
(b) A pivotal quantity is a function of a statistic and the parameter of interest that follows a standard distribution. That distribution must not depend on any unknown parameters.
To derive a test statistic from a pivotal quantity, replace the parameter of interest by its hypothesized value to make the pivotal quantity into a test statistic, and use the distribution of the pivotal quantity as the reference distribution.
To derive a confidence interval, use the distribution of the pivotal quantity to set limits between which the pivotal quantity will lie with a specified probability, then solve the inequalities to find limits on the parameter. [4 marks]
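As an illustration of both derivations, the following sketch uses the one-sample t setting. The choice of example is an assumption made here (a random sample of size n from a normal population with unknown mean and variance), not part of the marking scheme; the symbols \mu_0 and \alpha denote the hypothesized mean and the chosen error rate.

```latex
% Pivotal quantity: its distribution involves no unknown parameters.
T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}

% Test statistic: replace \mu by its hypothesized value \mu_0;
% the reference distribution is t_{n-1}.
t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}

% Confidence interval: bound the pivot with probability 1 - \alpha,
% then solve the inequalities for \mu.
P\!\left(-t_{n-1,\,1-\alpha/2} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{n-1,\,1-\alpha/2}\right) = 1 - \alpha
\;\Longrightarrow\;
\bar{x} - t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}
```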
(a) William Sealy Gosset discovered the t-distribution and called it t because it was used to do a test of hypothesis. He published this work under the pseudonym of "Student" because his employer, the Guinness brewery, did not want him publishing in his own name for fear of releasing trade secrets. [3 marks]
(b) This code will compute a two-sided P-value for a one-sample t-test of the hypothesis that the population mean is 27. [4 marks]
(a) Correct analysis: A paired t-test with a two-sided P-value to test the hypothesis that the mean difference in tire life between brands is 0 against the alternative that the mean difference is not 0. [15 marks for the correct analysis with P-value, 12 marks for the correct analysis without P-value, maximum of 10 marks for the invalid analysis done completely, maximum of 10 marks if BOTH the correct and the invalid analysis are given.]
Graph: Stem & leaf plot, histogram, dot plot, box plot are all acceptable.
Stem: units; Leaf: tenths
-0 | 8
 0 | 467
 1 |
 2 | 136
 3 | 0
Assumptions: Normality (stem & leaf plot looks OK, but sample is too small); independence (can't test because the sample is so small; ensured by random assignment of tires to cars and to left or right wheels).
n = 8, d_bar = 1.3625, s_d^2 = 1.75125, s_d = 1.3233, t0 = 2.912, df = 7, 0.05 > P > 0.02.
Conclusions: There is some evidence (0.05 > P > 0.02) from these data that the mean life is different for the two brands of tires.
Other valid analyses: A t-test without a P-value ("The hypothesis that the mean life is the same for both brands of tires is rejected at the 5% level of significance."), or compute a 95% confidence interval for the mean difference (0.26, 2.47) and note that it does not include 0.
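The paired analysis can be checked with a short script. This is a minimal sketch, not part of the marking scheme: it assumes the eight differences are exactly the values read off the stem-and-leaf plot of differences (stem: units, leaf: tenths), and it takes the critical values for 7 degrees of freedom from a standard t table rather than computing them. All variable names are illustrative.

```python
import math
import statistics

# Differences in tire life, read off the stem-and-leaf plot of differences
# (assumption: the plotted values are the exact differences).
diffs = [-0.8, 0.4, 0.6, 0.7, 2.1, 2.3, 2.6, 3.0]

n = len(diffs)
df = n - 1
d_bar = statistics.mean(diffs)
var_d = statistics.variance(diffs)      # sample variance, divisor n - 1
sd_d = math.sqrt(var_d)
se = sd_d / math.sqrt(n)                # standard error of the mean difference

# Test statistic for H0: mean difference = 0.
t0 = (d_bar - 0.0) / se

# Upper quantiles of t with 7 df, taken from a standard t table (not computed here).
T_975_DF7 = 2.365   # two-sided P = 0.05
T_990_DF7 = 2.998   # two-sided P = 0.02

# Bracket the two-sided P-value the way a printed table allows.
p_between_002_and_005 = T_975_DF7 < abs(t0) < T_990_DF7

# 95% confidence interval for the mean difference.
ci = (d_bar - T_975_DF7 * se, d_bar + T_975_DF7 * se)

print(f"n = {n}, d_bar = {d_bar:.4f}, var = {var_d:.5f}, t0 = {t0:.3f}, df = {df}")
print(f"0.05 > P > 0.02: {p_between_002_and_005}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Bracketing the P-value between tabled quantiles mirrors how the answer is reported above; software with a t distribution function would give an exact P-value instead.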
Invalid analysis: A two-sample t-test; t = 0.4763, df = 14, P > 0.5. Graph: comparative box plots, stem & leaf plots or dot diagrams. Assumptions: normality of each sample (looks OK, see graphs), independence (randomization ensures independence between cars within each sample, but pairing both brands on one car means that the samples are not independent, so the two-sample t-test analysis is not valid), homoscedasticity (looks OK, see graphs). Conclusions: There is no evidence (P > 0.5) from these data that the mean life is different between the two brands.
Stem: tens; Leaf: units
Brand A        Brand B
3 | 24         3 | 014
3 | 6778       3 | 688
4 |            4 | 2
4 | 58         4 | 8
(b) Correct analysis: A two-sample t-test with a two-sided P-value to test the hypothesis that both types of pipe have the same mean deflection temperature against the alternative that the mean deflection temperature is different. [15 marks for the correct analysis with P-value, 12 marks for the correct analysis without P-value, maximum of 10 marks for the invalid analysis done completely, maximum of 10 marks if BOTH the correct and the invalid analysis are given. Give up to a 3-mark bonus if the F test for homoscedasticity is done correctly.]
Graph: comparative box plots, stem & leaf plots or dot diagrams are acceptable.
Stem: hundreds, tens; Leaf: units
Type 1         Type 2
17 |           17 | 67
18 | 5789      18 | 05
19 | 34        19 | 277
20 | 567       20 | 016
21 | 3         21 |
n1 = n2 = 10, x1_bar = 196.7, x2_bar = 191.1, s1^2 = 101.5667, s2^2 = 117.4333, sp^2 = 109.5
t0 = 1.197, df = 18, 0.5 > P > 0.2
Assumptions: normality of each sample (looks OK, see graphs, samples are really too small to be sure), independence (no idea, we can hope that the pipe specimens were chosen randomly and that there was no way for the type 1 test results to affect type 2), homoscedasticity (looks OK, see graphs; could test with F0 = 117.4333/101.5667 = 1.156, reference distribution is F(9, 9) giving two-sided P > 0.5 so there is no evidence from these data of heteroscedasticity).
Conclusions: There is no evidence (0.5 > P > 0.2) from these data that the mean deflection temperature differs between the two types of pipe.
Other valid analyses: A t-test without a P-value ("The hypothesis that the mean deflection temperature is the same for both types of pipe is not rejected at the 5% level of significance."), or compute a 95% confidence interval for the difference in mean deflection temperature (-4.232, 15.432) and note that it includes 0.
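The two-sample calculations can be reproduced in the same way. This is a sketch under stated assumptions, not part of the marking scheme: the observations are those read off the comparative stem-and-leaf plot (stem: hundreds and tens, leaf: units), and the quantiles for 18 degrees of freedom are taken from a standard t table. Variable names are illustrative.

```python
import math
import statistics

# Deflection temperatures read off the comparative stem-and-leaf plot.
type1 = [185, 187, 188, 189, 193, 194, 205, 206, 207, 213]
type2 = [176, 177, 180, 185, 192, 197, 197, 200, 201, 206]

n1, n2 = len(type1), len(type2)
x1_bar, x2_bar = statistics.mean(type1), statistics.mean(type2)
s1_sq, s2_sq = statistics.variance(type1), statistics.variance(type2)

# Pooled variance and standard error of the difference in means.
df = n1 + n2 - 2
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))

# Test statistic for H0: the two means are equal.
t0 = (x1_bar - x2_bar) / se

# Upper quantiles of t with 18 df, from a standard t table (not computed here).
T_750_DF18 = 0.688  # two-sided P = 0.5
T_900_DF18 = 1.330  # two-sided P = 0.2
T_975_DF18 = 2.101  # two-sided P = 0.05

p_between_02_and_05 = T_750_DF18 < abs(t0) < T_900_DF18

# 95% confidence interval for the difference in means.
diff = x1_bar - x2_bar
ci = (diff - T_975_DF18 * se, diff + T_975_DF18 * se)

# Variance ratio for the optional F test (larger variance in the numerator).
f0 = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)

print(f"sp^2 = {sp_sq:.1f}, t0 = {t0:.3f}, df = {df}, F0 = {f0:.3f}")
print(f"0.5 > P > 0.2: {p_between_02_and_05}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The F statistic is only computed here; judging it against F(9, 9) still requires an F table or distribution function.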
Invalid analysis: A paired t-test; t = -1.062, df = 9, P > 0.2. Assumptions: The paired t-test analysis is not valid because there is no pairing here, the Type 1 specimens are unrelated to the Type 2 specimens.