A coefficient for each level of a factor cannot usually be estimated because of dependencies among the coefficients of the overall model: the dummy variables for a factor sum to a column of ones, which duplicates the intercept. Overparametrization induced by dummy variables is removed prior to fitting by replacing the dummy variables with a set of linear combinations of the dummy variables that are functionally independent of each other and of the sum of the dummy variables. A factor with k levels has k-1 possible independent linear combinations. A particular choice of linear combinations is called a set of contrasts.
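The dependency can be checked numerically. The following is a minimal sketch, assuming R (whose contr.* functions follow the S definitions) rather than Splus itself; the factor f and its levels are hypothetical, and Helmert contrasts are used only as one possible choice of linear combinations.

```r
# Hypothetical factor with k = 4 levels, two observations per level.
f <- factor(rep(c("a", "b", "c", "d"), each = 2))
k <- nlevels(f)

# One dummy (indicator) column per level.
D <- outer(as.integer(f), seq_len(k), "==") * 1
colnames(D) <- levels(f)

# Intercept plus all k dummies: k + 1 columns, but the dummies sum to the
# intercept column, so the matrix has rank k only.
X_full <- cbind(Intercept = 1, D)
rank_full <- qr(X_full)$rank

# Replace the dummies by k - 1 linear combinations (here Helmert contrasts).
C <- contr.helmert(k)
X_reduced <- cbind(Intercept = 1, D %*% C)
rank_reduced <- qr(X_reduced)$rank

print(c(columns = ncol(X_full), rank = rank_full))
print(c(columns = ncol(X_reduced), rank = rank_reduced))
```

The reduced matrix has as many columns as its rank, so every coefficient can be estimated.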
Splus provides four predefined sets of contrasts and also accepts user-defined contrasts. Helmert contrasts are the default in Splus; treatment contrasts are the default in GLMStat.
With Helmert contrasts, the jth linear combination is the difference between level j+1 and the average of the first j levels. The following example returns the Helmert parametrization for a factor with four levels:
> contr.helmert(4)
  [,1] [,2] [,3]
1   -1   -1   -1
2    1   -1   -1
3    0    2   -1
4    0    0    3
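The Helmert columns can be rebuilt by hand from the verbal definition. This is a sketch assuming R; the helper matrix by_hand is a name introduced here for illustration. Column j puts +1 on level j+1 and -1/j on each of the first j levels, then scales by j so the entries are integers.

```r
k <- 4
H <- contr.helmert(k)

# Column j: level j+1 minus the average of levels 1..j, scaled by j.
by_hand <- sapply(seq_len(k - 1), function(j) {
  v <- numeric(k)
  v[j + 1] <- 1
  v[seq_len(j)] <- -1 / j
  j * v
})

# Each column sums to zero, and the columns are mutually orthogonal
# (all off-diagonal entries of t(H) %*% H are zero).
col_sums <- colSums(H)
gram <- crossprod(H)

print(all.equal(unname(H), by_hand))
print(col_sums)
print(gram)
```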
The function contr.poly produces k-1 orthogonal contrasts representing polynomials of degree 1 through k-1 (linear, quadratic, cubic, and so on).
> contr.poly(4)
          .L   .Q         .C
1 -0.6708204  0.5 -0.2236068
2 -0.2236068 -0.5  0.6708204
3  0.2236068 -0.5 -0.6708204
4  0.6708204  0.5  0.2236068
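A quick check of the polynomial contrasts, sketched in R under the assumption that its contr.poly matches the output shown above: the columns have unit length and are mutually orthogonal, and the linear column is an exact linear function of the level index.

```r
P <- contr.poly(4)

# t(P) %*% P should be the 3 x 3 identity matrix (orthonormal columns).
gram <- crossprod(P)
is_orthonormal <- isTRUE(all.equal(unname(gram), diag(3)))

# Each column sums to zero, so it is orthogonal to the vector of ones.
sums_to_zero <- all(abs(colSums(P)) < 1e-12)

# The linear contrast (.L) is perfectly correlated with the scores 1, 2, 3, 4.
linear_cor <- cor(P[, ".L"], 1:4)

print(is_orthonormal)
print(sums_to_zero)
print(linear_cor)
```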
The function contr.sum produces contrasts between each of the first k-1 levels and level k.
> contr.sum(4)
  [,1] [,2] [,3]
1    1    0    0
2    0    1    0
3    0    0    1
4   -1   -1   -1
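One consequence of this coding is that the implied level effects sum to zero, with the last level taking minus the sum of the others. A small sketch in R; the coefficient vector beta is made up purely for illustration.

```r
S <- contr.sum(4)

# Hypothetical coefficients for the three contrast columns.
beta <- c(1.5, -0.5, 2.0)

# Effect of each of the four levels implied by the coding.
level_effects <- drop(S %*% beta)

print(level_effects)
print(sum(level_effects))
```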
The treatment parametrization is not really a set of contrasts, because the columns do not sum to zero and hence are not orthogonal to the vector of ones. It does, however, give a comparison of each treatment level with the first. This is useful if, say, the first level is the control treatment. This is the parametrization to use if you want to estimate the same coefficients that GLMStat fits.
> contr.treatment(4)
  2 3 4
1 0 0 0
2 1 0 0
3 0 1 0
4 0 0 1
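The comparison with the first level can be seen in a small fit. The sketch below assumes R and uses invented responses y and a hypothetical factor g; under treatment coding the intercept is the mean of level 1 and each remaining coefficient is that level's mean minus the level 1 mean.

```r
# Hypothetical data: four groups, two observations each.
y <- c(10, 12, 15, 17, 20, 22, 9, 11)
g <- factor(rep(1:4, each = 2))

fit <- lm(y ~ g, contrasts = list(g = "contr.treatment"))

# Group means, and the values the coefficients should take.
means <- tapply(y, g, mean)
expected <- unname(c(means[1], means[-1] - means[1]))

# The columns of the coding matrix sum to one, not zero.
col_sums <- colSums(contr.treatment(4))

print(coef(fit))
print(expected)
print(col_sums)
```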
Here are two ways to specify that "treatment" contrasts, instead of the default "helmert" contrasts, should be used for the factors sex and treat in a glm() fit. The first method sets the contrasts globally for all factors in the model; the second lets you specify the contrasts for each factor separately.
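# Method 1: set treatment contrasts globally for all factors, then fit.
# (ordered = "contr.poly" keeps the usual setting for ordered factors.)
options(contrasts = c(factor = "contr.treatment", ordered = "contr.poly"))
glm(rem/n ~ sex * treat, family = binomial(link = logit),
    data = remission, subset = n > 0, weights = n)

# Method 2: specify the contrasts for each factor in the formula.
glm(rem/n ~ C(sex, treatment) * C(treat, treatment),
    family = binomial(link = logit), data = remission,
    weights = n, subset = n > 0)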
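For readers working in R rather than Splus, the same idea can be sketched as follows. This is an illustration under assumptions, not the original analysis: the remission data frame below is a made-up stand-in with the same column names, R names the global option differently (an unnamed pair for unordered and ordered factors), and the per-factor choice is passed through the contrasts argument of glm() instead of C() in the formula.

```r
# Hypothetical stand-in data: rem remissions out of n patients per cell.
remission <- data.frame(
  sex   = factor(rep(c("f", "m"), each = 3)),
  treat = factor(rep(c("a", "b", "c"), times = 2)),
  rem   = c(3, 5, 4, 2, 6, 5),
  n     = c(10, 10, 9, 8, 12, 11)
)

# Method 1: set treatment contrasts globally, fit, then restore the old setting.
old <- options(contrasts = c("contr.treatment", "contr.poly"))
fit_global <- glm(rem / n ~ sex * treat, family = binomial(link = "logit"),
                  data = remission, weights = n, subset = n > 0)
options(old)

# Method 2: name the contrasts for each factor separately.
fit_arg <- glm(rem / n ~ sex * treat, family = binomial(link = "logit"),
               data = remission, weights = n, subset = n > 0,
               contrasts = list(sex = "contr.treatment",
                                treat = "contr.treatment"))

print(coef(fit_global))
print(coef(fit_arg))
```

Both fits use the same coding, so their coefficients agree; the intercept is the log-odds of remission in the first cell (sex f, treatment a).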