Module 2:4 - Factorial ANOVA

From SSMS

One Factor Linear Model

Equation for the one factor linear model

The one factor linear model describes an individual score as the sum of three parts: the grand mean, the treatment offset, and the individual error. The grand mean plus the treatment offset is the group mean, which is the predicted score and can be rewritten in several ways. Separating the score into three parts lets us see whether the factor we are testing actually makes a difference in the individual score. Later models describe the group mean in more parts, but those parts still add up to the group mean.
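For reference, the three parts can be written out as below. This is a sketch in the notation this page uses later for the two factor model (Y for the score, mu for the grand mean, alpha for the offset, epsilon for the error); the exact symbols in the original figure are an assumption.

```latex
% One factor linear model: score of individual i in group j
Y_{ij} = \mu_{.} + \alpha_j + \epsilon_{ij}

% The first two parts together are the group mean (the predicted score)
\mu_j = \mu_{.} + \alpha_j
```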

Introduction to Two Factor Linear Model

Two factor linear models allow us to analyze multi-factor experimental designs. In the video example, we have one one-factor design based on the route taken to campus and another based on the time of day. Each of those models can test only one effect: the effect of route or the effect of time. When the two factors are combined in a single model, there is one additional test, for the interaction of the two: the effect of specific route and time combinations.

Full Factorial Design

Full factorial designs are multi-factor designs in which two or more factors are completely crossed: measurements are taken for every combination of factor levels. Random assignment is still used for validity. The design is used to see the effect of each individual factor along with the interaction between the factors.

With two factors of two levels each, we can make a 2x2 design, which lets us see the effect for each of the three tests we want to perform.

Factorial Plot

Factorial plot showing the 2x2 data

A factorial plot makes the effects in a factorial design easier to see. Each factor is mapped onto the plot, and plotting the group means shows any differences visually. In the video example, each route is plotted as a line, so the overall effect of route shows up when we compare the averages of the two lines. The effect of time shows up when we average across the routes at each time and connect those averages with a line.

Interaction

The interaction between factors is the degree to which the effect of one factor depends on the level of the other factor. "Depends" here means that the effect of one factor cannot be described without mentioning the level of the other factor. In the video example, we see the interaction when describing the effect of time: we also need to mention the route, because the effect of time changes from one route to the other.
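One way to make "depends on" concrete is to compute the effect of time separately for each route and compare the two. The sketch below uses made-up group means (commute time in minutes), not the data from the video; the names `cell_means` and `time_effect` are hypothetical.

```python
# Hypothetical group means in minutes, keyed by (route, time).
# These numbers are invented for illustration; they are not the video's data.
cell_means = {
    ("Gilman", "8:00"): 22.0,
    ("Gilman", "9:30"): 14.0,
    ("La Jolla Village", "8:00"): 24.0,
    ("La Jolla Village", "9:30"): 20.0,
}


def time_effect(route):
    """Change in mean commute time from 8:00 to 9:30 on one route."""
    return cell_means[(route, "9:30")] - cell_means[(route, "8:00")]


effect_gilman = time_effect("Gilman")              # 14 - 22 = -8
effect_la_jolla = time_effect("La Jolla Village")  # 20 - 24 = -4

# If the two effects differ, the effect of time depends on the route.
# A nonzero difference is what an interaction looks like in the means.
difference = effect_gilman - effect_la_jolla
print(effect_gilman, effect_la_jolla, difference)  # -8.0 -4.0 -4.0
```

With these invented numbers, leaving later saves 8 minutes on one route but only 4 on the other, so the effect of time cannot be stated without naming the route.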

Two Factor Linear Model

Equation for the two factor linear model

The two factor linear model represents more means than the one factor linear model. It has more parts, to show how each factor contributes to the individual score. Yijk is the individual score, as before. Mu.. is the grand mean, the average over all individuals in all groups. Alpha j is the offset for level j of Factor A. Beta k is the offset for level k of Factor B. Alpha-beta jk is the interaction offset; it is a single term, not a multiplication of alpha and beta, and can be rewritten as Gamma jk. Eijk is the individual error for each score.
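Written out from the term-by-term description above, the model reads as follows. The typesetting is a reconstruction from the prose, not a copy of the original figure.

```latex
% Two factor linear model (population form)
Y_{ijk} = \mu_{..} + \alpha_j + \beta_k + (\alpha\beta)_{jk} + \epsilon_{ijk}

% Equivalent form with the interaction written as a single symbol
Y_{ijk} = \mu_{..} + \alpha_j + \beta_k + \gamma_{jk} + \epsilon_{ijk}
```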

Two-Factor Sample Model

The two-factor sample model uses Roman letters rather than Greek letters to distinguish it from the population model, but the components are the same. As in the population model, the interior portion (Y... + aj + bk + (ab)jk) is just the mean of the group at the jth level of Factor A and the kth level of Factor B.

Equation for the Two-Factor Sample Model
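A sketch of the sample model in LaTeX, built from the interior portion quoted above. The error symbol e and the bars over the means are assumptions about the original figure.

```latex
% Two-factor sample model
Y_{ijk} = \bar{Y}_{...} + a_j + b_k + (ab)_{jk} + e_{ijk}

% The interior portion is the group mean (written Yjk in the text)
\bar{Y}_{jk} = \bar{Y}_{...} + a_j + b_k + (ab)_{jk}
```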

Notations for Model

Numerals accompany the Roman letters to indicate which level of each factor a term refers to. For example, the group mean Yjk for level 1 of Factor A and level 1 of Factor B is written Y11. In the example from the video, Factor A represents Route Taken, and the routes are indexed by j: Gilman Drive is j = 1 and La Jolla Village Drive is j = 2. Factor B represents Time of Day and is indexed by k in the same way: 8:00am is k = 1 and 9:30am is k = 2. So any term involving the first level of Factor A has 1 in its j position, and any term involving the first level of Factor B has 1 in its k position.

Test Contributions for Each Factor

The goal is to test whether there are overall effects of the factors and/or interactions between the factors that might have produced the effect. The sample mean for a combination of factor levels is the model's value for that group without the error term, and each combination of factor levels has its own mean.

Finding the Values

Grand Mean (Y)

The grand mean is the sum of all the scores divided by the number of scores, ignoring the model. It is used in calculating all the other components.

Marginal Mean

The marginal mean is the overall mean of the jth level of Factor A or the kth level of Factor B. The marginal mean of one factor ignores the other factor, and the dots in its notation mark the subscripts that are averaged over. For example, in the video the marginal mean of Time of Day at 8:00am ignores the contribution of Route Taken and is written Y..1. On the factorial plot, this marginal mean is the midpoint between the two group means at that level.
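The grand mean and the marginal means can be sketched in a few lines of Python. The scores below are invented for illustration (three commute times per group, in minutes) and are not the data from the video; `scores`, `marginal_a` and `marginal_b` are hypothetical names.

```python
# Hypothetical commute times in minutes, three scores per group.
# Keys are (j, k): j = level of Factor A (route), k = level of Factor B (time).
scores = {
    (1, 1): [21, 22, 23],
    (1, 2): [13, 14, 15],
    (2, 1): [23, 24, 25],
    (2, 2): [19, 20, 21],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Grand mean: every score, ignoring the model.
grand_mean = mean(y for group in scores.values() for y in group)

# Marginal mean of a level: average every score at that level,
# ignoring (averaging over) the other factor.
marginal_a = {j: mean(y for (jj, _), g in scores.items() if jj == j for y in g)
              for j in (1, 2)}
marginal_b = {k: mean(y for (_, kk), g in scores.items() if kk == k for y in g)
              for k in (1, 2)}

print(grand_mean)   # 20.0
print(marginal_a)   # {1: 18.0, 2: 22.0}
print(marginal_b)   # {1: 23.0, 2: 17.0}
```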

Factorial Plot for Marginal Mean

Offset of Marginal Mean (aj or bk)

The offsets (aj or bk) represent the deviation of a marginal mean from the grand mean. For Factor B the formula is bk = Y..k - Y... (the marginal mean minus the grand mean).
From this we observe that the two offsets of a factor mirror each other: when one is positive, the other is negative by the same amount. This is enforced by the geometry of the factorial structure. The grand mean is the average of the two marginal means, so to the degree that one is higher, the other has to be lower.
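A small sketch of the offset calculation, assuming equal group sizes. The grand mean and marginal means here are invented numbers, not values from the video.

```python
# Hypothetical grand mean and marginal means in minutes (invented values).
grand_mean = 20.0
marginal_a = {1: 18.0, 2: 22.0}   # Factor A (route): levels j = 1, 2
marginal_b = {1: 23.0, 2: 17.0}   # Factor B (time):  levels k = 1, 2

# Offset = marginal mean - grand mean.
a = {j: m - grand_mean for j, m in marginal_a.items()}
b = {k: m - grand_mean for k, m in marginal_b.items()}

print(a)  # {1: -2.0, 2: 2.0}
print(b)  # {1: 3.0, 2: -3.0}

# With two equally sized levels, the two offsets of a factor cancel out.
print(sum(a.values()), sum(b.values()))  # 0.0 0.0
```

Because the two offsets of a factor cancel, knowing one of them is enough to write down the other.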

Factorial Plot of Offset of Marginal Mean

When we look at the marginal means together with the grand mean, we can see that there is a pivot point in the structure. The two marginal means of a factor differ from the grand mean by the same amount in opposite directions, so the offsets are related to each other. The degrees of freedom for the offsets of a factor with two levels are therefore 1 (the number of levels minus 1): we need only one independent piece of information, the offset for one level of the factor, to derive the other value.

Offset for Factor A
Offset for Factor B

Degrees of Interaction (ab)jk

Purely Additive Factor Effects (YPA)

Factorial Plot of Purely Additive Factor Effects

If there were no interaction between the factors (for example, if the effect of time of day were the same on both routes), the group means would equal the Purely Additive Factor Effects (YPA). These are the means that would occur with no interaction, i.e. with no (ab)jk term. Each purely additive mean is found by adding the offset for Factor A to the grand mean and then adding the offset for Factor B.
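As an illustration, the purely additive means can be built directly from the grand mean and the two sets of offsets. The values are invented, and `y_pa` is a hypothetical name for YPA.

```python
# Hypothetical grand mean and offsets in minutes (invented values).
grand_mean = 20.0
a = {1: -2.0, 2: 2.0}   # Factor A (route) offsets
b = {1: 3.0, 2: -3.0}   # Factor B (time) offsets

# Purely additive mean: grand mean + offset for A + offset for B.
y_pa = {(j, k): grand_mean + a[j] + b[k] for j in a for k in b}
print(y_pa)  # {(1, 1): 21.0, (1, 2): 15.0, (2, 1): 25.0, (2, 2): 19.0}

# With no interaction, the effect of time is identical on both routes.
time_effect = {j: y_pa[(j, 2)] - y_pa[(j, 1)] for j in a}
print(time_effect)  # {1: -6.0, 2: -6.0}
```

On a factorial plot these four means would form two parallel lines.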

(ab)jk

Pivot Point

(ab)jk is known as the degree of interaction, or the deviation from the Purely Additive Factor Effects. On the factorial plot, we can see how far each sample mean lies from its YPA value. We expected some interaction in the example, and the deviation of the sample means from the Purely Additive Factor Effects shows it. These deviations capture the interaction effect of the two factors: the interaction between factors is the degree to which the treatment means differ from the additive effects of the factors. Each (ab)jk is the actual treatment mean minus the corresponding Purely Additive Factor Effects mean. In the 2x2 design every combination yields an ab term of the same size, either positive or negative, which shows that these terms are not independent of each other.
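The deviation described above can be sketched as a subtraction of the purely additive means from the actual group means. Both sets of means below are invented for illustration.

```python
# Hypothetical actual group means and purely additive means (minutes).
cell_means = {(1, 1): 22.0, (1, 2): 14.0, (2, 1): 24.0, (2, 2): 20.0}
y_pa = {(1, 1): 21.0, (1, 2): 15.0, (2, 1): 25.0, (2, 2): 19.0}

# Interaction offset: actual group mean minus purely additive mean.
ab = {jk: cell_means[jk] - y_pa[jk] for jk in cell_means}
print(ab)  # {(1, 1): 1.0, (1, 2): -1.0, (2, 1): -1.0, (2, 2): 1.0}

# In a balanced 2x2 design all four offsets have the same size.
sizes = {abs(v) for v in ab.values()}
print(sizes)  # {1.0}
```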

The Structure of the Interaction Offsets in the Two Factor Linear Model

To study the interaction offsets in the model, we need to keep the grand mean where it is, and the overall mean of each level of each factor (the marginal means) where it is.

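All the points pivot together around this structure so that the overall means stay the same.

Degrees of freedom for an interaction: df = (j-1)(k-1), where j and k here stand for the number of levels of Factor A and Factor B. For the 2x2 design, df = (2-1)(2-1) = 1.

So in the 2x2 design, knowing any one of the interaction offsets tells you the other 3.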
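The sketch below shows why one offset is enough in a 2x2 design with equal group sizes: the offsets in every row and every column must cancel so that the marginal means stay put. The function names are hypothetical.

```python
def fill_interaction_offsets(ab_11):
    """Given (ab)11 in a balanced 2x2 design, derive the other three."""
    ab_12 = -ab_11   # offsets at level j = 1 must cancel
    ab_21 = -ab_11   # offsets at level k = 1 must cancel
    ab_22 = -ab_12   # offsets at level k = 2 must cancel
    return {(1, 1): ab_11, (1, 2): ab_12, (2, 1): ab_21, (2, 2): ab_22}


def interaction_df(levels_a, levels_b):
    """Degrees of freedom for the interaction: (levels of A - 1)(levels of B - 1)."""
    return (levels_a - 1) * (levels_b - 1)


offsets = fill_interaction_offsets(1.0)
print(offsets)               # {(1, 1): 1.0, (1, 2): -1.0, (2, 1): -1.0, (2, 2): 1.0}
print(interaction_df(2, 2))  # 1
```

For larger designs the same cancellation constraints hold, but more than one offset has to be known, which is what the df formula counts.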

The Cell Means as a Function of the Row, Column, and Interaction Effects

The group means are given by the following formula, which adds up 4 pieces of information:

Yjk = Y... + aj + bk + (ab)jk

That is, the group mean is a function of the grand mean, the offset for the level of one factor, the offset for the level of the other factor (in this case, the 8:00 am time of day), and the interaction offset.

So now we can refer back to the model and update the definition of the ab term:

(ab)jk = Yjk - (Y... + aj + bk)

Residuals and Test of Effects in the Two Factor Linear Model

The final term that needs to be added to our Two Factor Linear Model is the error, or residual: eijk = Yijk - Yjk

Error: the deviation between an individual's actual observed score and the score predicted by the model.

Treatment mean: the mean of the group defined by one combination of the levels of the two factors.
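A minimal sketch of the residual calculation, using invented scores (three per group, in minutes). Each residual is the observed score minus the mean of the group it belongs to.

```python
# Hypothetical commute times in minutes, keyed by (j, k); invented values.
scores = {
    (1, 1): [21, 22, 23],
    (1, 2): [13, 14, 15],
    (2, 1): [23, 24, 25],
    (2, 2): [19, 20, 21],
}

# Predicted score for every individual in a group: the group mean.
cell_means = {jk: sum(g) / len(g) for jk, g in scores.items()}

# Residual: observed score minus predicted score.
residuals = {jk: [y - cell_means[jk] for y in g] for jk, g in scores.items()}

print(residuals[(1, 1)])  # [-1.0, 0.0, 1.0]

# Within each group the residuals sum to zero.
print([sum(r) for r in residuals.values()])  # [0.0, 0.0, 0.0, 0.0]
```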

The overall effect of A is determined by looking at the variability of the aj values, which tells us whether they are large or small. The overall effect of B likewise depends on how far from zero the bk values are. The effect for specific combinations of A and B (the interaction) depends on the (ab)jk values.

For all 3 tests, we calculate an F ratio, which compares the mean square for the effect being tested with the mean square of error.

      • The Mean Square of Error is in the denominator of every test because it gives us the best guess of the variability for each of the tests.
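The three F ratios can be sketched by hand for a balanced 2x2 design. The data are invented (three scores per group, in minutes), the sums-of-squares formulas are the standard ones for equal group sizes rather than anything taken from the video, and all variable names are hypothetical.

```python
# Hypothetical commute times in minutes, keyed by (j, k); invented values.
scores = {
    (1, 1): [21, 22, 23],
    (1, 2): [13, 14, 15],
    (2, 1): [23, 24, 25],
    (2, 2): [19, 20, 21],
}
levels_a = sorted({j for j, _ in scores})
levels_b = sorted({k for _, k in scores})
n = len(scores[(1, 1)])  # scores per group; assumes equal group sizes


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Components of the sample model.
grand = mean(y for g in scores.values() for y in g)
cell = {jk: mean(g) for jk, g in scores.items()}
a = {j: mean(cell[(j, k)] for k in levels_b) - grand for j in levels_a}
b = {k: mean(cell[(j, k)] for j in levels_a) - grand for k in levels_b}
ab = {(j, k): cell[(j, k)] - (grand + a[j] + b[k]) for (j, k) in cell}

# Sums of squares: squared offsets, weighted by how many scores share them.
ss_a = n * len(levels_b) * sum(v ** 2 for v in a.values())
ss_b = n * len(levels_a) * sum(v ** 2 for v in b.values())
ss_ab = n * sum(v ** 2 for v in ab.values())
ss_error = sum((y - cell[jk]) ** 2 for jk, g in scores.items() for y in g)

# Degrees of freedom.
df_a = len(levels_a) - 1
df_b = len(levels_b) - 1
df_ab = df_a * df_b
df_error = sum(len(g) for g in scores.values()) - len(cell)

# Every test divides its mean square by the same mean square of error.
ms_error = ss_error / df_error
f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error

print(f_a, f_b, f_ab)  # 48.0 108.0 12.0
```

Each F ratio would then be compared with the F distribution on its own degrees of freedom and `df_error`; that lookup is left out of the sketch.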