In reality the normality assumption is probably not the biggest one (pay attention to that homogeneity of variance!), but it's still something to be aware of, and I thought I'd show an approach using QQ plots.
But first... what are some of these options for deciding if you're normal?
1) there are tests (Shapiro-Wilk, Kolmogorov-Smirnov, etc.; note that Bartlett's and Levene's are tests of homogeneity of variance, not normality),
2) make a QQ plot and eyeball it,
3) make a histogram and eyeball it...
....maybe there are more, but those are the three I can think of right now.
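Just so we're looking at the same thing, here's a rough sketch of what those three options look like in base R. The data are made up (residuals from a toy `lm` fit), and I'm using Shapiro-Wilk as the example test only because it ships with base R.

```r
# Made-up example data: residuals from a toy linear model
set.seed(1)
x <- rnorm(30)
y <- 2 * x + rnorm(30)
res <- residuals(lm(y ~ x))

# 1) a formal test (Shapiro-Wilk, from the stats package)
sw <- shapiro.test(res)
sw$p.value

# 2) a QQ plot against the normal distribution, with a reference line
qqnorm(res)
qqline(res)

# 3) a histogram
hist(res)
```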
Since the last two involve "eyeballing" it, it's no wonder folks who are not trained as statisticians end up doing the tests! Often we're doing the stats because we imagine them to be objective, and therefore the thought of adding back in subjective "eyeballs" doesn't sit well.
I gotta say, I don't like the tests! They seem overly stringent, and it can even be challenging to get genuinely normal data to pass them! And I just don't like the idea of testing assumptions without looking at the data, which tends to happen when you get in the habit of running tests.
So what to do instead? ...
I like QQ plots and so I'm going to show a cool little function that makes them a bit more useful (at least I feel so).
What are QQ plots?
QQ plots are a means of comparing two distributions. So as a test of normality, you can plot the quantiles of your data against the quantiles of a normal distribution and see if they match. If the two distributions are similar (and on the same scale), the points on a QQ plot will fall along the y=x line (unity). If they don't...well then they don't. And typically "real" data aren't going to fall right on that y=x line!
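If it helps to see what is actually being plotted, here's a minimal by-hand version. The sample is made up, and standardizing it first is my own choice so that the y=x line is the right reference; `qqnorm` does the same quantile matching for you but leaves the data on their original scale.

```r
# Made-up sample, just for illustration
set.seed(2)
dat <- rnorm(25, mean = 10, sd = 3)

# Sample quantiles: the standardized data, sorted
z <- sort((dat - mean(dat)) / sd(dat))

# Theoretical quantiles: where sorted values from a standard normal
# would be expected to fall for a sample of this size
theo <- qnorm(ppoints(length(z)))

plot(theo, z,
     xlab = "Theoretical quantiles",
     ylab = "Sample quantiles (standardized)")
abline(0, 1)  # the y = x line
```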
So how far away from that line can they be? Where is the transition from being "eh, pretty normal" to "nope...no way that's normal"? It is quite challenging to find any guidelines, which means we're left with a nice visual tool for testing normality, but little or no means of assessing how far from perfectly normal we can be and still be reasonable.
"Well... how do we interpret that?"
Well...what if we made a QQ plot using data actually drawn from a normal distribution with the same variance and sample size as our data set, and then compared that "normal" QQ plot to the QQ plot drawn with our data? Because it's randomly drawn from a normal distribution, with a presumably rather small sample size, the "normal" data won't fall perfectly on the y=x line, but it will give us a sense of where actual "normal" data might fall. That would be a useful comparison.
But what if we did that 8 times? Then we'd get a pretty good sense of what we might expect a normal QQ plot to look like. And if we can't identify our QQ plot from a sea of "normal" QQ plots, our data is probably sufficiently normal.
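To make the idea concrete, here's a bare-bones sketch of that comparison. To be clear, this is not the function discussed in this post; `qq_lineup_sketch` and its arguments are names I made up for illustration, and it assumes you hand it a plain numeric vector (residuals, say).

```r
# Hypothetical helper: hide the real QQ plot among simulated normal ones
qq_lineup_sketch <- function(res, n_sim = 8) {
  n <- length(res)

  # Simulated samples: same size, mean, and sd as the real data
  panels <- replicate(n_sim,
                      rnorm(n, mean = mean(res), sd = sd(res)),
                      simplify = FALSE)

  # Slot the real data into a random position among the simulations
  pos <- sample(n_sim + 1, 1)
  panels <- append(panels, list(res), after = pos - 1)

  # Lay the panels out on a square grid (3 x 3 when n_sim = 8)
  side <- ceiling(sqrt(n_sim + 1))
  old_par <- par(mfrow = c(side, side), mar = c(2, 2, 2, 1))
  on.exit(par(old_par))

  for (i in seq_along(panels)) {
    qqnorm(panels[[i]], main = paste("Panel", i))
    qqline(panels[[i]])
  }

  # Return the position of the real data without printing it
  invisible(pos)
}
```

You'd call it with something like `pos <- qq_lineup_sketch(residuals(fit))`, where `fit` is whatever model you ran, try to pick out your own data, and only then peek at `pos`.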
And I actually have a little R function that does just that...let's see it in action.
First...here's the function..
The qqfunc to the rescue.
Just type qqfunc and apply it to the model you just ran! Simple.
So there you have it...a means of using QQ plots to detect normality... don't have too much fun with it.