BCSS: Analysis of Variance: Almost always a bad idea

Ask a vague question, get a vague answer

Analysis of variance is a dangerous tool. It allows researchers to avoid asking precise questions. And if you don’t formulate your question precisely, then the answer won’t tell you anything. It’s like the gag in The Hitchhiker’s Guide to the Galaxy, where they build a supercomputer to figure out “the answer to life, the universe and everything”. When the computer finally figures it out, the answer is 42. The problem, you see, was that they hadn’t actually figured out what the question was.

Let’s have a look at a simple case. Here are data on cardiac output late in pregnancy in three groups of mothers: those with normal blood pressure, those with pre-eclampsia (PET) and those with gestational hypertension (GH).

. table outcome, c(mean co5) format(%2.1f)

----------------------
Hypertens |
ion       |
outcome   |  mean(co5)
----------+-----------
   Normal |        6.4
    P E T |        5.6
  Gest HT |        9.0
----------------------

I used the format option to limit the means to one decimal place.

What does an anova tell us?

. anova co5 outcome

                  Number of obs =     256    R-squared     = 0.4984
                  Root MSE      = .790256    Adj R-squared = 0.4944

    Source | Partial SS    df          MS        F   Prob > F
-----------+-------------------------------------------------
     Model |  156.97329     2  78.4866448   125.68     0.0000
   outcome |  156.97329     2  78.4866448   125.68     0.0000
  Residual | 157.999502   253  .624503962
-----------+-------------------------------------------------
     Total | 314.972792   255  1.23518742

Now, if you find that edifying you’re a better person than I. What the anova tells us is that mean cardiac output is not the same in all three groups. But that’s not the answer to any particular question: it doesn’t say which groups differ, in which direction, or by how much. It certainly doesn’t answer any useful clinical question.
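For anyone who wants to see where the anova’s summary numbers come from, here is a small Python sketch that rebuilds them from the sums of squares printed in the table. Python is my choice for a self-contained check and is not part of the original Stata analysis; the variable names are made up for the illustration.

```python
import math

# Sums of squares and degrees of freedom copied from the anova table above.
ss_model, df_model = 156.97329, 2
ss_resid, df_resid = 157.999502, 253
ss_total = ss_model + ss_resid

ms_model = ss_model / df_model   # mean square between groups
ms_resid = ss_resid / df_resid   # mean square within groups

f_stat = ms_model / ms_resid     # overall F test on (2, 253) degrees of freedom
r_squared = ss_model / ss_total  # share of total variation explained by group
root_mse = math.sqrt(ms_resid)   # residual standard deviation

print(f"F = {f_stat:.2f}, R-squared = {r_squared:.4f}, Root MSE = {root_mse:.5f}")
```

Every number here is a single summary of all three groups at once, which is why none of them says anything about a particular pair of groups.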

“Oh, but now we can do post-hoc comparisons”, you say. To which I might reply “But don’t you have any ideas you want to test?” The trouble with post-hoc comparisons is that you can rapidly end up with a bunch of comparisons, some of which are of interest and some meaningless.

First ask a question

To analyse data, you need to articulate the underlying hypothesis. And here, the hypothesis wasn’t “there’s some kinda difference in cardiac output between the three groups”. This is what I call the Empty Brain hypothesis, that assumes no previous research has been done, we know nothing about physiology, about hæmodynamics, about anything. And it’s not good enough. Science progresses by building on our understanding, not by wandering around hypothesising “some kinda difference”.

In fact, we expect that late in pregnancy, cardiac output in gestational hypertension will be higher than normal, causing high blood pressure because it exceeds the normal carrying capacity of the mother’s circulatory system. On the other hand, we expect that it will be below normal in pre-eclampsia because pre-eclampsia is characterised by a lot of clinical indicators of inadequate blood supply. The high blood pressure in pre-eclampsia is the result of very high peripheral resistance (the circulatory system basically closed tight) and the heart desperately trying to get the blood supply through.

Then use regression to answer it

We can use regression to ask this question:

. regress co5 i.outcome

      Source |          SS    df          MS    Number of obs =     256
-------------+------------------------------    F( 2, 253)    =  125.68
       Model |   156.97329     2  78.4866448    Prob > F      =  0.0000
    Residual |  157.999502   253  .624503962    R-squared     =  0.4984
-------------+------------------------------    Adj R-squared =  0.4944
       Total |  314.972792   255  1.23518742    Root MSE      =  .79026

-------------------------------------------------------------------------------
         co5 |      Coef.   Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
     outcome |
       P E T |  -.8146901     .217851    -3.74   0.000    -1.243722   -.3856577
     Gest HT |   2.612142    .1732165    15.08   0.000     2.271012    2.953272
             |
       _cons |   6.376119    .0534005   119.40   0.000     6.270953    6.481285
-------------------------------------------------------------------------------

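I have used Stata’s ‘i.’ prefix to tell Stata that outcome is to be treated as a set of separate categories rather than as a number. In this case, Stata will use the normotensive group (coded as zero) as the reference group, so each coefficient is the difference between that group’s mean and the normotensive mean.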
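To make the idea of a reference group concrete, here is a minimal Python sketch of indicator coding: one 0/1 column for each category except the reference. It is an illustration of the general technique, not Stata’s own code. The post only tells us that the normotensive group is coded zero; the codes 1 for PET and 2 for Gest HT, and the function name, are assumptions of mine.

```python
def indicator_columns(codes, reference=0):
    """Return the non-reference levels and one 0/1 row per observation."""
    levels = sorted(set(codes) - {reference})
    rows = [[1 if code == level else 0 for level in levels] for code in codes]
    return levels, rows

# Hypothetical coding: 0 = Normal (reference), 1 = PET, 2 = Gest HT.
codes = [0, 1, 2, 0]
levels, rows = indicator_columns(codes)
for code, row in zip(codes, rows):
    print(code, row)
```

A woman in the reference group gets zeros in every column, so her predicted value is just the constant; everyone else gets the constant plus the coefficient for her own group.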

Not just hypothesis tests – effect sizes too

The regression has tested our two hypotheses:

1. Compared with women with normal BP, those with PET will have lower cardiac outputs

2. Compared with women with normal BP, those with GH will have higher cardiac outputs

Furthermore, it has quantified the effects. Cardiac output is 0·8 litres a minute lower in pre-eclampsia, and 2·6 litres a minute higher in gestational hypertension.
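As a quick check that the regression and the table of means are telling the same story, this Python sketch (again, just arithmetic on the printed output, with names of my own choosing) adds each coefficient to the constant to recover the three group means.

```python
# Coefficients copied from the regress output above.
cons = 6.376119        # mean cardiac output in the reference (Normal) group
coef_pet = -0.8146901  # PET minus Normal
coef_gh = 2.612142     # Gest HT minus Normal

group_means = {
    "Normal": cons,
    "P E T": cons + coef_pet,
    "Gest HT": cons + coef_gh,
}
for group, mean in group_means.items():
    print(f"{group:8s} {mean:.1f}")
```

Rounded to one decimal place, these are the means in the first table: 6.4, 5.6 and 9.0.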

Robust variance estimation as well!

Another advantage of regression is that we can use Stata’s robust variance estimates. This topic is so important that it will be the subject of its own blog post. But note what happens when I invoke robust variance estimation:

. regress co5 i.outcome, robust

Linear regression                               Number of obs =     256
                                                F( 2, 253)    =  207.37
                                                Prob > F      =  0.0000
                                                R-squared     =  0.4984
                                                Root MSE      =  .79026

-------------------------------------------------------------------------------
             |                 Robust
         co5 |      Coef.   Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
     outcome |
       P E T |  -.8146901    .1947096    -4.18   0.000    -1.198148   -.4312319
     Gest HT |   2.612142    .1352127    19.32   0.000     2.345856    2.878428
             |
       _cons |   6.376119    .0549773   115.98   0.000     6.267847     6.48439
-------------------------------------------------------------------------------

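Our estimates for the effect sizes haven’t changed, but the standard errors and confidence intervals have. Robust variance estimation drops the assumption that cardiac output is equally variable in all three groups, an assumption we have no way of guaranteeing. If the data are also clustered, Stata has to be told which variable identifies the clusters, with the vce(cluster clustvar) option, before the standard errors will allow for it. Always a good idea!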
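To see exactly what moved between the two regressions, this Python sketch lines up the conventional and robust standard errors from the two outputs and recomputes the t statistics. It only uses numbers printed above; the dictionary layout and function name are mine.

```python
# (coefficient, conventional SE, robust SE) copied from the two regress outputs.
estimates = {
    "P E T": (-0.8146901, 0.217851, 0.1947096),
    "Gest HT": (2.612142, 0.1732165, 0.1352127),
    "_cons": (6.376119, 0.0534005, 0.0549773),
}

def t_statistic(coef, se):
    """A t statistic is the coefficient divided by its standard error."""
    return coef / se

ratios = {}
for name, (coef, se_ols, se_robust) in estimates.items():
    ratios[name] = se_robust / se_ols
    print(f"{name:8s} t: {t_statistic(coef, se_ols):7.2f} -> "
          f"{t_statistic(coef, se_robust):7.2f}   "
          f"robust/conventional SE = {ratios[name]:.2f}")
```

In this dataset the robust standard errors are smaller for both group comparisons and slightly larger for the constant, so the change is not simply a matter of every interval getting wider.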


Monday, 9 June 2014
