A basic principle in science is that of parsimony, or reducing complexity where possible, as typified by the application of Occam’s Razor.
William of Occam (or Ockham), a philosopher monk named after the English town that he came from, said something to the effect of ‘pluralitas non est ponenda sine necessitate’ (‘plurality should not be posited without necessity’). In other words, don’t increase, beyond what is necessary, the number of entities needed to explain something.
Occam’s Razor doesn’t necessarily mean that ‘less is always better’; it merely suggests that more complex models shouldn’t be used unless required, for example to increase model performance. As a saying commonly, but probably mistakenly, attributed to Albert Einstein puts it, ‘everything should be made as simple as possible, but not simpler’.
Common methods of measuring performance or ‘bang’, taking into account the cost, complexity or ‘buck’, are the Akaike Information Criterion (AIC), Bayesian or Schwarz Information Criterion (BIC), Minimum Message Length (MML) and Minimum Description Length (MDL).
Unlike the standard AIC, the latter three techniques take sample size into account, while MDL and MML also take the precision of the model estimates into account, but let’s just keep to the comparatively simpler AIC/BIC here.
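For reference, the usual textbook definitions of the two criteria are sketched below. The symbols are notation introduced here rather than taken from the book: \hat{L} is the maximised likelihood of the model, k the number of estimated parameters counted by the software, and n the sample size. Lower values are better for both, and only BIC's penalty grows with n.

```latex
\begin{align}
\mathrm{AIC} &= -2\ln\hat{L} + 2k \\
\mathrm{BIC} &= -2\ln\hat{L} + k\ln n
\end{align}
```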
An excellent new book by Thom Baguley, ‘Serious Stats’ (serious in this case meaning powerful rather than scary; see http://seriousstats.wordpress.com/), shows how to do a t-test using AIC/BIC in SPSS and R.
I’ll do it here using Stata regression, the idea being to compare a null model (e.g. just the constant) with a model including the group. In this case we’re looking at the difference in headroom between American and ‘Foreign’ cars in 1978 (well, it’s Thursday night!).
Here are the t-test results:
(1978 Automobile Data)
   Group |  Obs      Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+---------------------------------------------------------------
Domestic |   52  3.153846    .1269928    .9157578    2.898898   3.408795
 Foreign |   22  2.613636     .103676    .4862837     2.39803   2.829242
Domestic cars have slightly bigger mean headroom (but also larger variation!). The p value is 0.011, indicating that the probability of getting a difference in means as large as or larger than the one above (0.540), IF the null hypothesis that the population means are actually identical holds, is around 1 in 100.
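As a rough check, the standard pooled-variance t statistic can be recomputed from nothing more than the summary figures in the table above. This is a minimal Python sketch rather than the Stata code used for the post; the variable names are made up for illustration, and it stops at the t statistic because the Python standard library has no t distribution for the p value.

```python
import math

# Summary statistics copied from the t-test table above
n_dom, mean_dom, sd_dom = 52, 3.153846, 0.9157578   # Domestic
n_for, mean_for, sd_for = 22, 2.613636, 0.4862837   # Foreign

# Difference in mean headroom between the two groups
diff = mean_dom - mean_for

# Pooled (equal-variance) estimate of the common variance
df_pooled = n_dom + n_for - 2
pooled_var = ((n_dom - 1) * sd_dom**2 + (n_for - 1) * sd_for**2) / df_pooled

# Standard error of the difference and the t statistic
se_diff = math.sqrt(pooled_var * (1 / n_dom + 1 / n_for))
t_pooled = diff / se_diff

print(f"difference = {diff:.3f}, t = {t_pooled:.2f}, df = {df_pooled}")
```

The difference comes out at about 0.540 on 72 degrees of freedom, with a t statistic of roughly 2.6, which is in line with the two-sided p value of 0.011 quoted above.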
Using the method shown in Dr Thom’s book (Stata implementation on my Downloadables page) we get
Akaike’s information criterion and Bayesian information criterion
     Model |  Obs   ll(null)   ll(model)   df        AIC        BIC
-----------+--------------------------------------------------------
 nullmodel |   74  -92.12213   -92.12213    1   186.2443   188.5483
groupmodel |   74  -92.12213   -88.78075    2   181.5615   186.1696
AIC and BIC values are lower for the model including group, suggesting in this case that the increase in complexity (the two groups) is repaid by a commensurate increase in performance (i.e. we need to take into account the two group means for US and non-US cars, rather than assuming there’s just one common mean, or universal headroom).
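The AIC and BIC figures in the table above can be reproduced from the reported log-likelihoods, df and number of observations using the standard definitions (AIC = -2·ll + 2·k, BIC = -2·ll + k·ln n). The Python sketch below is only an illustration of that arithmetic, not the Stata implementation mentioned above; the function and variable names are hypothetical.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2*ll + 2*k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion: -2*ll + k*ln(n)."""
    return -2.0 * loglik + k * math.log(n)

n_obs = 74  # cars in the 1978 automobile data

# (log-likelihood, df) for each model, copied from the table above
models = {
    "nullmodel": (-92.12213, 1),
    "groupmodel": (-88.78075, 2),
}

results = {
    name: (aic(ll, k), bic(ll, k, n_obs))
    for name, (ll, k) in models.items()
}

for name, (a, b) in results.items():
    print(f"{name:>10}  AIC = {a:.4f}  BIC = {b:.4f}")
```

The extra parameter costs the group model 2 points of AIC and ln(74), about 4.3 points, of BIC, but the improvement in -2·ll (about 6.7) more than pays for it under both criteria.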
Of course, things get a little more complex when comparing several means, having different variances, etc. (as the example above actually does, although the means are still “significantly” different when the difference in variances is taken into account using a separate-variance t-test). Something to think about, and more info on applying AIC/BIC to a variety of statistical methods can be found in the refs below, particularly 3 and 5.
Further Reading (refs 2, 3, 4 and 5 are the most approachable, with Thom Baguley’s book, referred to above, more approachable still)
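For the curious, the separate-variance (Welch) t statistic and its Satterthwaite degrees of freedom can also be worked out from the standard errors in the t-test table. This is a minimal Python sketch under the usual textbook formulas, with illustrative variable names; it is not the output of Stata's own routine.

```python
import math

# Group sizes, means and standard errors copied from the t-test table
n_dom, mean_dom, se_dom = 52, 3.153846, 0.1269928   # Domestic
n_for, mean_for, se_for = 22, 2.613636, 0.103676    # Foreign

# Squared standard errors, i.e. s^2 / n for each group
v_dom = se_dom**2
v_for = se_for**2

# Welch t statistic: no pooling of the two variances
t_welch = (mean_dom - mean_for) / math.sqrt(v_dom + v_for)

# Satterthwaite approximation to the degrees of freedom
df_welch = (v_dom + v_for) ** 2 / (
    v_dom**2 / (n_dom - 1) + v_for**2 / (n_for - 1)
)

print(f"Welch t = {t_welch:.2f}, approximate df = {df_welch:.1f}")
```

With these figures the separate-variance t statistic is actually larger than the pooled one, because the smaller group (Foreign) is also the less variable one.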
1. Akaike, H., A new look at the statistical model identification. IEEE Transactions on Automatic Control, 1974. 19: p. 716-723.
2. Anderson, D.R., Model based inference in the life sciences: a primer on evidence. 2007, New York: Springer.
3. Dayton, C.M., Information criteria for the paired-comparisons problem. American Statistician, 1998. 52: p. 144-151.
4. Forsyth, R.S., D.D. Clarke, and R.L. Wright, Overfitting revisited: an information-theoretic approach to simplifying discrimination trees. Journal of Experimental and Theoretical Artificial Intelligence, 1994. 6: p. 289-302.
5. Sakamoto, Y., M. Ishiguro, and G. Kitagawa, Akaike information criterion statistics. 1986, Dordrecht: D. Reidel.
6. Schwarz, G., Estimating the dimension of a model. Annals of Statistics, 1978. 6: p. 461-464.
7. Wallace, C.S., Statistical and inductive inference by minimum message length. 2005, New York: Springer.
8. Wallace, C.S. and D.M. Boulton, An information measure for classification. Computer Journal, 1968. 11: p. 185-194.