# Module 2.7 - Multifactor ANOVA

## Introduction

When conducting a study, it is often necessary to analyze data that involves more than two factors. Previously, we learned about the Two Factor ANOVA, which lets us examine two main effects and the single interaction between two factors. The multifactor ANOVA lets us look at the three-way interaction, as well as the now-multiple two-way interactions among our three factors, and helps us determine whether any of those effects are statistically significant.

## General Linear Model

When conducting an ANOVA with more than two factors, the main difference is that there is an increase in the number of sources of information. Because of this, our general linear model is once again updated to include the additional factors.

### What's New

- An effect offset for our new factor (c)
- Two new two-way interactions, because the new factor (c) interacts with each of the previous ones (a and b)
- A three-way interaction, because there are now three factors
- A comprehensive error term, because there is now individual error within the third factor as well
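One conventional way to write the three-factor model is sketched below. The symbols are assumptions chosen for illustration ($\alpha$, $\beta$, $\gamma$ for factors a, b, and c), not necessarily the notation used in the original module.

$$
y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}
$$

Here $\gamma_k$ is the offset for the new factor, $(\alpha\gamma)_{ik}$ and $(\beta\gamma)_{jk}$ are the two new two-way interactions, $(\alpha\beta\gamma)_{ijk}$ is the three-way interaction, and $\varepsilon_{ijkl}$ is the error for observation $l$ in the cell defined by levels $i$, $j$, and $k$.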

### How to Test for Effects

When conducting a multifactor ANOVA, the formulas for F-tests are updated so that they also include the third factor.

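- Main Effects: the mean square for each factor divided by the mean square error
- Two-way Interactions: the mean square for each pair of factors divided by the mean square error
- Three-way Interaction: the mean square for the three-way term divided by the mean square error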
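In symbols, and assuming fixed factors A, B, and C with $a$, $b$, and $c$ levels and $n$ observations in every cell (this notation is an assumption for illustration, not taken from the original module), the tests have the form:

$$
F_A = \frac{MS_A}{MS_E}, \qquad F_{AB} = \frac{MS_{AB}}{MS_E}, \qquad F_{ABC} = \frac{MS_{ABC}}{MS_E}
$$

where each mean square is a sum of squares divided by its degrees of freedom:

$$
df_A = a - 1, \qquad df_{AB} = (a-1)(b-1), \qquad df_{ABC} = (a-1)(b-1)(c-1), \qquad df_E = abc(n-1)
$$

The tests for B, C, AC, and BC follow the same pattern.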

## Data Quality Check

It is important to always check the data for any accidental errors or inconsistencies. To do so, follow these easy steps:

1) Open your data set, then go to *Analyze* and click *Distribution*.

2) When setting up your distribution, on the left, highlight all the variables you want to check, then click *Y, Variables* to add them on the right. Click *OK*.

3) Look at the distributions to see if you have any anomalies. If the distribution looks the way you expect (e.g. the counts per level are equal because you controlled for the number in each group), then it's fine.

4) If your distribution does not look as expected, then double check to see if the values are valid or not. Otherwise, a typo could be responsible for the accidental creation of a new level!

Caution: Use discretion when deciding whether a numerical value is a typo or simply an outlier.

- Is there a negative number when you're only supposed to be working with positive values? Probably an error (but still possibly an outlier).
- Is there a number that represents a value you probably couldn't have recorded? (Such as in the Time of Day graph above, where most recordings took place between 8:00am and 9:30am, but two values supposedly are from 7:00pm and 7:30pm.) Probably an error (but still possibly an outlier).
- Is there a number that is still within the range of the expected, but seems far off from the rest of the distribution? Probably just an outlier (but still possibly an error).

5) Once you've determined that there are errors which need to be corrected, highlight the columns where the errors were found. Click on the *Columns* tab in the menu bar, then click *Recode*.

6) Find the error under "Old Value" and input the corrected version under "New Value."

- To replace the old values where they are, make sure to select "In place" from the drop-down menu
- To create a new column with all of the updated values, select "New Column"

7) Click *OK*.
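The same check-then-recode idea can be sketched outside JMP. The Python below is only an illustration under stated assumptions: the route values are made up, "Gilman Dr" (no period) stands in for a typo that created an extra level, and the helper names `level_counts` and `recode` are hypothetical. Returning a new list mirrors the "New Column" option rather than "In place".

```python
from collections import Counter

# Hypothetical raw values for a nominal column. "Gilman Dr" (no period)
# stands in for a typo that accidentally created a new level.
routes = ["Gilman Dr.", "Nobel Dr.", "Gilman Dr", "Genesee Ave.", "Nobel Dr.",
          "Gilman Dr.", "Genesee Ave.", "Nobel Dr.", "Genesee Ave.", "Gilman Dr."]


def level_counts(values):
    """Tally how often each level occurs, like a distribution of a nominal column."""
    return Counter(values)


def recode(values, mapping):
    """Return a new list in which each old value is replaced by its corrected version."""
    return [mapping.get(v, v) for v in values]


counts = level_counts(routes)
# Levels seen only once deserve a second look; they may be typos.
suspects = [level for level, n in counts.items() if n == 1]

# Old value -> new value, as in the Recode dialog.
cleaned = recode(routes, {"Gilman Dr": "Gilman Dr."})

print("Suspect levels:", suspects)
print("Counts after recoding:", dict(level_counts(cleaned)))
```

A rare level is only a prompt to look closer. As with numerical outliers, whether it is an error is still a judgment call.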

## Fit Model

Use Fit Model in JMP to set up a multifactor ANOVA. *Analyze > Fit Model*

Select the continuous variable being measured as the Y Variable.

Next, we want to add each of our model terms to the Model Effects section and cross our factors. There are three different ways to do this:

1) The fast way (recommended by Julian): select all three terms, and then under *Macros* select *Factorial Sorted*.

2) The fast, but out of order way: select all three terms, and then under *Macros* select *Full Factorial*. This produces all of the same output as option 1, but the main effects and higher order effects are in a different order.

3) The slow way: add each of the terms, and then make all of the appropriate two-way interactions by selecting two variables at a time and pressing *Cross*.

To form the three-way interaction, select one of the two-way interactions you just created, and then from the left hand column select the third factor that is not a part of that particular two-way interaction, and press *Cross*.

Finally, to limit the output to include only the basic sections, select *Minimal Report* under *Emphasis*.
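To see which terms a full factorial of three factors contains, here is a small Python sketch. It does not call or reproduce JMP's macros. The function name `full_factorial_terms` is hypothetical, and the ordering (lowest order first) is simply a choice made here, not a claim about how JMP orders its output.

```python
from itertools import combinations


def full_factorial_terms(factors):
    """List every main effect and interaction, lowest order first."""
    terms = []
    for order in range(1, len(factors) + 1):
        # Each combination of `order` factors is one term of that order.
        for combo in combinations(factors, order):
            terms.append("*".join(combo))
    return terms


terms = full_factorial_terms(["Route", "Time of Day", "Day of Week"])
for term in terms:
    print(term)
```

With three factors this yields seven terms: three main effects, three two-way interactions, and one three-way interaction, which is the same set you build by hand with *Cross*.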

## Interpreting the Three Factor ANOVA

### Effect Tests

The Effect Tests section shows us the main effect of each factor, the two-way interactions, and our three-way interaction.

When interpreting this output, Julian recommends that we start with the highest order term and work our way down to the lowest order terms. In this instance, we start with the three-way interaction, next move to the two-way interactions, and finally look at our main effects. This is helpful because our main effects may be qualified by one or more of our interactions. In the output shown above, we see main effects for all three factors. However, because there is a two-way interaction between Route and Time of Day, we know that the main effect of each of these factors may depend on the level of the other.

Julian also suggests that it may be helpful to journal the Effect Tests section so that we can easily view it while looking at the more in-depth output that will follow. To create a JMP Journal for any portion of an output, use the selection tool (found in the menu bar or tools menu) to select the given section, and then press command+j (mac) or control+j (PC).
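To make the numbers in an effect tests table less of a black box, the following Python sketch computes the sum of squares, degrees of freedom, and F ratio for every term of a balanced full factorial. It is an illustration only, not JMP's implementation: it assumes fixed factors and equal cell sizes, the function names are hypothetical, and the commute times are made up. It stops at the F ratios because the p-values need the F distribution, which the standard library does not provide.

```python
from itertools import combinations
from statistics import mean


def subsets(items):
    """Yield every subset of `items` as a tuple, smallest first (including the empty one)."""
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def balanced_factorial_anova(rows, factor_names):
    """Return ({term: (SS, df, F)}, SS_error, df_error) for a balanced full factorial.

    rows: list of (levels, y) pairs, where levels holds one level per factor.
    Assumes fixed factors and the same number of observations in every cell.
    """
    idx = tuple(range(len(factor_names)))
    # Mean response for every level combination of every subset of factors;
    # the empty subset holds the grand mean.
    marginal = {}
    for s in subsets(idx):
        groups = {}
        for levels, y in rows:
            groups.setdefault(tuple(levels[i] for i in s), []).append(y)
        marginal[s] = {key: mean(ys) for key, ys in groups.items()}
    n_levels = [len({levels[i] for levels, _ in rows}) for i in idx]

    def effect(s, levels):
        # Inclusion-exclusion over the marginal means gives the estimated
        # effect of term s at this observation's levels.
        return sum((-1) ** (len(s) - len(t)) * marginal[t][tuple(levels[i] for i in t)]
                   for t in subsets(s))

    # Error: variation of observations around their own cell mean.
    ss_error = sum((y - marginal[idx][tuple(levels)]) ** 2 for levels, y in rows)
    df_error = len(rows) - len(marginal[idx])
    ms_error = ss_error / df_error

    table = {}
    for s in subsets(idx):
        if not s:
            continue  # the empty subset is the grand mean, not a model term
        ss = sum(effect(s, levels) ** 2 for levels, _ in rows)
        df = 1
        for i in s:
            df *= n_levels[i] - 1
        table["*".join(factor_names[i] for i in s)] = (ss, df, (ss / df) / ms_error)
    return table, ss_error, df_error


# Hypothetical commute times (minutes): 2 routes x 2 times x 2 days, 2 trips per cell.
cells = {
    ("A", "early", "Mon"): (21.0, 23.0), ("A", "early", "Fri"): (19.0, 20.0),
    ("A", "late", "Mon"): (27.0, 28.0), ("A", "late", "Fri"): (25.0, 27.0),
    ("B", "early", "Mon"): (23.0, 24.0), ("B", "early", "Fri"): (21.0, 22.0),
    ("B", "late", "Mon"): (24.0, 25.0), ("B", "late", "Fri"): (22.0, 24.0),
}
rows = [(levels, y) for levels, ys in cells.items() for y in ys]
names = ["Route", "Time of Day", "Day of Week"]
table, ss_error, df_error = balanced_factorial_anova(rows, names)
for term, (ss, df, f) in table.items():
    print(f"{term:32s} SS={ss:8.3f}  df={df}  F={f:7.2f}")
```

In a balanced design the term sums of squares and the error sum of squares add up to the total sum of squares, which is a handy sanity check on any implementation like this one.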

### Effect Details

#### LSMeans Plots

The Effect Details section gives us more detailed output for each main effect and interaction, and is where we can produce LSMeans Plots. To produce an LSMeans Plot for any main effect or interaction, simply select the red arrow next to the desired section, and select LSMeans Plot from the drop down menu. To produce plots for all of the sections at the same time, hold down the command key (mac) or control key (PC) while selecting LSMeans Plot in any section.

To look at the LSMeans Plots without the LSMeans Tables, simply hold down the command or control key, and either minimize the tables using the grey arrow next to them, or deselect LSMeans Table under the red arrow in any section. This will leave only the LSMeans Plots, making it easier to scroll through and compare them.

Additionally, we can resize the LSMeans Plots by clicking and dragging the corner of a plot. To resize all of them at the same time, press command or control before clicking and dragging the corner of one of the plots. This will apply the resize to all of the LSMeans Plots at the same time.

#### Main Effect Follow Up Tests

To follow up on any of the main effects with pairwise comparisons, select the red arrow next to that factor. This will give a drop down menu with a number of options:

- LSMeans Contrast - allows us to compare specific levels within the factor
- LSMeans Dunnett - compares each level to a control level while controlling the FWER for that set of comparisons
- LSMeans Student's t - pairwise comparisons without controlling for Family Wise Error Rate (FWER)
- LSMeans Tukey HSD - used to make all pairwise comparisons within the factor while controlling for FWER
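To get a feel for why the choice between uncorrected Student's t comparisons and a FWER-controlling method matters, the sketch below counts the pairwise comparisons for a factor and estimates the familywise error rate if each one were run at alpha = 0.05 with no correction. It treats the comparisons as independent, which pairwise comparisons are not (they share data), so read the result as a rough illustration rather than an exact rate. The function names are hypothetical.

```python
from math import comb


def n_pairwise(levels):
    """Number of distinct pairs among `levels` factor levels."""
    return comb(levels, 2)


def fwer_if_independent(m, alpha=0.05):
    """Chance of at least one false positive across m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


# The example data set is 4x4x5, so factors have either 4 or 5 levels.
for levels in (4, 5):
    m = n_pairwise(levels)
    print(f"{levels} levels -> {m} comparisons, "
          f"approximate FWER {fwer_if_independent(m):.3f}")
```

Even with only four levels there are six comparisons, and the approximate familywise rate is already several times the nominal 0.05.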

#### Two-way Interactions

A two-way interaction plot averages over the third factor to show how the other two factors combine. This allows us to see whether the effect of the first factor is similar across the levels of the second factor, or whether it differs across those levels, meaning that an interaction is present.

**No Interaction:**

In the Route and Day of Week LSMeans Plot, the lines are all relatively parallel throughout, which is consistent with there being no statistically significant interaction. This shows us that while the routes differ in speed, they do not do so based on which day of the week we're looking at. Alternatively, we can read this plot as showing that the relative differences between commutes on different days do not vary based on route; Monday is generally slowest and Friday is generally fastest, regardless of which route we are looking at. The main effects of these factors are independent of one another, so there is no interaction.

**Interaction:**

In the Route and Time of Day LSMeans Plot, we can take a closer look at a statistically significant two-way interaction. The lines in this plot are not parallel, meaning that we need to consider **both** factors in order to make a prediction about the world. In order to know how much faster or slower a particular route is than another, we need to know what time of day we are talking about. Alternatively, we can read this to mean that in order to know what time of day is best to travel, we need to take into account which route we will be taking.

If an interaction were *not* present in this data, the routes would all maintain a similar relative distance from one another across the different times of day, meaning the time of day would not affect how much of a difference there was between the speeds of different routes.
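The "parallel lines" idea can be made concrete as a difference of differences. The sketch below uses made-up cell means for two hypothetical routes at two times of day. If the gap between the routes were the same at both times, the lines would be parallel and the contrast would be zero.

```python
# Hypothetical cell means (minutes) for two routes at two times of day.
means = {
    ("Route A", "early"): 20.0, ("Route A", "late"): 26.0,
    ("Route B", "early"): 22.0, ("Route B", "late"): 23.0,
}


def route_gap(time):
    """How much slower Route A is than Route B at the given time of day."""
    return means[("Route A", time)] - means[("Route B", time)]


gap_early = route_gap("early")
gap_late = route_gap("late")

# Difference of differences: zero would mean parallel lines (no interaction).
interaction_contrast = gap_late - gap_early

print("Gap early:", gap_early)
print("Gap late:", gap_late)
print("Interaction contrast:", interaction_contrast)
```

Here Route A is faster early but slower late, so the gap changes sign and the contrast is far from zero. Whether a contrast like this is statistically significant is what the two-way interaction test in the Effect Tests table decides.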

## Interpreting the Three Way Interaction

This is based on Module 2.7 5 of Julian's videos.

In the dataset "TimesToCampus(4x4x5).jmp", we are interested in whether the two-way interaction of two factors (Route and Time of Day) depends on the level of a third factor (Day of Week).

Looking at the Effect Details section of your Fit Model output, scroll down to the three-way interaction section that reads "Day of Week*Route*Time of Day". Under the red triangle, select "LSMeans Plot" to create a separate plot for each day of the week. In essence, we are looking at how the Route*Time of Day effect changes across the different days of the week.

The different days of the week (Monday, Tuesday, Wednesday, Thursday, Friday) are the levels of the Day of Week factor.

The pattern of the two-way interaction stays relatively the same across the levels of the third factor. We can see this by comparing the slopes of the Route lines on Friday to those on Thursday, Wednesday, Tuesday, and Monday. The line for Genesee Ave. slopes downward on every day of the week. The line for Gilman Dr. stays relatively flat across the levels of the third factor, and Nobel Dr. and La Jolla Village Dr. stay relatively close to each other. Although there are slight changes in the slopes across the levels of Day of Week, the general pattern of the slopes is the same across all days of the week. The LSMeans Plots for Monday and Friday are the most similar.

Although the two-way interaction pattern changes slightly across the levels of the third factor, the changes are small. We conclude that there is no evidence that the interaction between Route and Time of Day depends on Day of Week.
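One way to picture a three-way interaction numerically is to compute the two-way interaction contrast separately at each level of the third factor and then compare those contrasts. The sketch below does this for a made-up 2x2x2 set of cell means. The values, the route labels "A" and "B", and the function name are all hypothetical and are not taken from the TimesToCampus data.

```python
# Hypothetical cell means (minutes), keyed by day, then by (route, time of day).
means = {
    "Mon": {("A", "early"): 21.0, ("A", "late"): 27.0,
            ("B", "early"): 23.0, ("B", "late"): 24.0},
    "Fri": {("A", "early"): 19.0, ("A", "late"): 25.5,
            ("B", "early"): 21.0, ("B", "late"): 22.0},
}


def two_way_contrast(cells):
    """Route-by-time difference of differences within one day."""
    gap_late = cells[("A", "late")] - cells[("B", "late")]
    gap_early = cells[("A", "early")] - cells[("B", "early")]
    return gap_late - gap_early


contrast_by_day = {day: two_way_contrast(cells) for day, cells in means.items()}

# If the two-way pattern were identical on both days, this would be zero.
three_way_contrast = contrast_by_day["Fri"] - contrast_by_day["Mon"]

print(contrast_by_day)
print("Three-way contrast:", three_way_contrast)
```

In this made-up example the two-way contrast is large on both days and barely changes between them, which is the numerical counterpart of plots that look alike from one day to the next.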

## Three Way Interactions in Graph Builder

This is based on Module 2.7 6 of Julian's videos.

Using Graph Builder, first create the overall interaction of Route and Time of Day. To do this, drag and drop Time to Campus on the y-axis, Time of Day on the x-axis, and Route on Overlay (top-right). Adding Day of Week along the top (Group X) lets you see the same graphs as in the previous example using LSMeans Plots, except that they run across the page rather than down it.

To scroll through the plots for each level of Day of Week, right click the factor and choose your settings. Then you can have a better idea of to what degree the two-way interaction changes throughout each level.

Although the two-way interaction pattern is not exactly the same at every level of Day of Week, the three-way interaction in the Effect Tests table is not statistically significant, so these differences are plausibly due to chance. We do not have enough evidence to claim that the third factor changes our two-way interaction.

Now if you want to see how any third factor affects any other two-way interaction in Graph Builder, you can easily swap the variables. To do this, right click the factor of interest. Let's say we want to swap our factor Day of Week with Route.

We can see whether the third factor Route changes the two-way interaction of Time of Day and Day of Week.

If we swap Time of Day with Day of Week, we still have the same two-way interaction of Time of Day and Day of Week, but scrolling through each level of Route makes it look very different. However, we still cannot reject the null hypothesis: the three-way interaction test is not significant, and our power to detect a three-way interaction is lower than our power to detect a two-way interaction.

If we make a final swap of Route for Time of Day, the overall pattern of the interaction of Day of Week and Route changes, but that is not enough to claim there is a three-way interaction in the population. Three-way interactions are generally harder to interpret because of the many levels in the three factors. Other designs, such as 2x2x2 models, can be easier to interpret because it is easier to tell which factors affect one another.