Specifying Contrasts

Specifying contrasts in Splus

1999-03-24

A separate coefficient for each level of a factor cannot usually be estimated, because the dummy variables for the levels sum to a column of ones and so duplicate the intercept of the overall model. This overparametrization is removed prior to fitting by replacing the dummy variables with a set of linear combinations of the dummy variables that are linearly independent of each other and of the sum of the dummy variables. A factor with k levels admits k-1 such independent linear combinations. A particular choice of linear combinations is called a set of contrasts.
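The dependency can be seen on a small example. The sketch below is written for R, which is assumed here to behave like Splus for these functions; the factor f is made up for illustration.

```r
# Made-up factor with four levels, two observations per level.
f <- factor(rep(c("a", "b", "c", "d"), each = 2))

# One dummy (indicator) column per level.
dummies <- sapply(levels(f), function(l) as.numeric(f == l))

# Intercept plus all four dummies: five columns, but the dummies
# sum to the intercept column, so the rank is only four.
X.full <- cbind(Intercept = 1, dummies)
rank.full <- qr(X.full)$rank

# Replace the four dummies with k-1 = 3 linear combinations of them.
X.contr <- cbind(Intercept = 1, dummies %*% contr.helmert(4))
rank.contr <- qr(X.contr)$rank

print(c(columns = ncol(X.full), rank = rank.full))
print(c(columns = ncol(X.contr), rank = rank.contr))
```

With the contrasts in place, the number of columns matches the rank, so every coefficient can be estimated.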

Splus provides four predefined sets of contrasts (Helmert, orthogonal polynomial, sum, and treatment) and also accepts user-defined contrasts. Helmert contrasts are the default in Splus. Treatment contrasts are the default in GLMStat.

Helmert Contrasts

The jth linear combination compares level j+1 with the average of the first j levels; column j holds j times that difference, which keeps the entries whole numbers. The following example returns a Helmert parametrization based on four levels:

> contr.helmert(4)
  [,1] [,2] [,3]
1   -1   -1   -1
2    1   -1   -1
3    0    2   -1
4    0    0    3
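The properties of this matrix can be checked directly. This is a sketch in R, on the assumption that contr.helmert() returns the same matrix there as in Splus.

```r
H <- contr.helmert(4)

# Each column sums to zero, so it is orthogonal to the vector of ones.
col.sums <- colSums(H)

# Distinct columns are orthogonal to each other:
# the off-diagonal entries of t(H) %*% H are zero.
cross <- crossprod(H)

# Dividing column j by j gives level j+1 minus the average
# of the first j levels.
scaled <- sweep(H, 2, 1:3, "/")

print(col.sums)
print(cross)
print(scaled)
```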

Orthogonal Polynomials

The function contr.poly produces k-1 orthogonal contrasts representing polynomials of degree 1 through k-1 (linear, quadratic, cubic, and so on), labelled .L, .Q, .C in the output.

> contr.poly(4)
          .L   .Q         .C
1 -0.6708204  0.5 -0.2236068
2 -0.2236068 -0.5  0.6708204
3  0.2236068 -0.5 -0.6708204
4  0.6708204  0.5  0.2236068
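A quick check of what "orthogonal" means for these columns, sketched in R on the assumption that contr.poly() matches the Splus output shown above:

```r
P <- contr.poly(4)

# t(P) %*% P is the identity matrix: the columns are mutually
# orthogonal and each has unit length.
gram <- round(crossprod(P), 10)

# Each column sums to zero.
col.sums <- round(colSums(P), 10)

# The first (.L) column is linear in the level index 1:4,
# so its successive differences are constant.
steps <- round(diff(P[, 1]), 7)

print(gram)
print(col.sums)
print(steps)
```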

Sum

This produces contrasts between each of the first k-1 levels and level k.

> contr.sum(4)
  [,1] [,2] [,3]
1    1    0    0
2    0    1    0
3    0    0    1
4   -1   -1   -1
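A small fit may help show what the resulting coefficients measure. The sketch below uses R's lm() with made-up balanced data (y and f are illustrative, not from any real study) and assumes the contrasts argument behaves as in Splus.

```r
# Made-up balanced data: the level means are 2, 5, 8 and 11.
y <- c(1, 3, 4, 6, 7, 9, 10, 12)
f <- factor(rep(1:4, each = 2))

fit <- lm(y ~ f, contrasts = list(f = "contr.sum"))
level.means <- tapply(y, f, mean)

# Intercept: the mean of the four level means.
# Remaining coefficients: deviations of levels 1 to 3 from it.
print(coef(fit))
print(level.means[1:3] - mean(level.means))
```

The deviation for level 4 is not fitted separately; it is minus the sum of the other three, which is what the last row of -1s encodes.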

Treatment

These are not really contrasts, because the columns do not sum to zero and hence are not orthogonal to the vector of ones. They do, however, compare each treatment level with the first, which is useful if, say, the first level is the control treatment. These are the contrasts to use if you want to estimate the same coefficients as GLMStat fits.

> contr.treatment(4)
  2 3 4
1 0 0 0
2 1 0 0
3 0 1 0
4 0 0 1
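Both points can be illustrated with a short sketch in R. The data are made up, and the behaviour is assumed to match Splus.

```r
# The columns sum to 1, not 0, so they are not contrasts in the strict sense.
col.sums <- colSums(contr.treatment(4))

# Made-up balanced data: the level means are 2, 5, 8 and 11.
y <- c(1, 3, 4, 6, 7, 9, 10, 12)
f <- factor(rep(1:4, each = 2))

fit <- lm(y ~ f, contrasts = list(f = "contr.treatment"))
level.means <- tapply(y, f, mean)

# Intercept: the mean of level 1 (the baseline).
# Remaining coefficients: each other level's mean minus the level 1 mean.
print(col.sums)
print(coef(fit))
print(level.means[-1] - level.means[1])
```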

Specifying which contrasts Splus is to use

Here are two different ways to specify that "treatment" contrasts, instead of the default "helmert" contrasts, should be used for the factors sex and treat in a glm() fit. The first method sets the contrasts globally for all factors in the model; the second lets you specify the contrasts for each factor separately.

# Set the contrasts globally; the setting stays in effect for later fits.
options(contrasts = c(factor = "contr.treatment"))
glm(rem/n ~ sex * treat, family = binomial(link = logit),
    data = remission, subset = n > 0, weights = n)
 
glm(rem/n ~ C(sex, treatment) * C(treat, treatment),
    family = binomial(link = logit), data = remission,
    weights = n, subset = n > 0)
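For readers who want to try the per-factor form without the original data, here is a self-contained sketch in R. The data frame is made up for illustration and merely borrows the name remission and the column names from the calls above. It also uses the contrasts argument of R's glm(), which is assumed, not verified, to work the same way in Splus, as a cross-check on the C() form.

```r
# Made-up stand-in for the remission data (not the real data set).
remission <- data.frame(
  sex   = factor(c("F", "F", "M", "M")),
  treat = factor(c("A", "B", "A", "B")),
  rem   = c(3, 6, 4, 8),
  n     = c(10, 10, 10, 10)
)

# Per-factor contrasts through the contrasts argument of glm().
fit.arg <- glm(rem/n ~ sex * treat, family = binomial(link = logit),
               data = remission, weights = n,
               contrasts = list(sex = "contr.treatment",
                                treat = "contr.treatment"))

# Per-factor contrasts through C() in the formula.
fit.C <- glm(rem/n ~ C(sex, treatment) * C(treat, treatment),
             family = binomial(link = logit), data = remission,
             weights = n)

# The coefficient names differ, but the estimates agree.
print(coef(fit.arg))
print(coef(fit.C))
```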