Give p’s a Chance? Hoochie Coochie Hypothesis Tests

A common request for jobbing analysts is to ‘run these results through the computer and see if they’re significant’. Now, unfortunately, many folk, including, scarily, even lecturers in our craft, have a misconception about what ‘significance’ actually means.

Shout in a desperate monotone “it’s the probability of obtaining a result as extreme as, or more extreme than, the one observed, if the ‘null hypothesis’ of no difference or association were actually true” and people look flummoxed, yes flummoxed, as if you were speaking to them in the language of the ancient Huns, (another) language no-one has been able to figure out.

True, testing ‘something’ against the concept of ‘nothing’ is a bit kooky. If we really did have a situation where two groups ended up with identical averages we’d think it was a trifle dodgy to say the least.

And as for the notion of effect sizes! Picture, on an enchanted desert isle, two group means of 131.5 and 130, with a pooled standard deviation (sd) of 15. A difference of 1.5 divided by 15 is a Cohen’s (the late great Jacob Cohen; Cohen’s kappa, populariser of power analysis, maven of multiple regression) effect size of 0.10, where, given Jack’s arbitrary but conventional guidelines for mean differences, 0.20 is a small effect size, 0.50 medium, and 0.80 large.
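The sum itself is simple enough to sketch in a few lines of Python, using the desert-isle numbers from above:

```python
# Cohen's d: difference in group means divided by the pooled standard deviation.
def cohens_d(mean1, mean2, pooled_sd):
    return (mean1 - mean2) / pooled_sd

# The desert-isle numbers: means of 131.5 and 130, pooled sd of 15.
d = cohens_d(131.5, 130.0, 15.0)
print(f"Cohen's d = {d:.2f}")  # 0.10, below even the 'small' benchmark of 0.20
```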

Using an online calculator e.g.

http://www.graphpad.com/quickcalcs/ttest1/

we find that if there were 1000 in each group, the t test value would be 2.24 and our p value 0.026.

Voila, Eureka, Significance, as cook smiles and puts an extra dollop of custard on our pudding!

But if we ‘only’ had 100 in each group, our t value would be 0.71, our p value would be 0.48, and there’d be a sigh, a frown, a closing of doors and a grim-faced cook doling out the thrice-boiled cabbage….

But they’re the same means, the same sd, and the same effect size!
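If you'd rather not trust an online calculator, the same sums can be checked in Python with scipy's summary-statistics t test (no raw data needed, just the means, sds and group sizes from the example):

```python
from scipy import stats

# Same means (131.5 vs 130) and same sd (15); only the group size changes.
for n in (1000, 100):
    t, p = stats.ttest_ind_from_stats(131.5, 15, n, 130.0, 15, n)
    print(f"n = {n:4d} per group: t = {t:.2f}, p = {p:.3f}")
```

With 1000 per group p slips under the magical 0.05; with 100 per group it doesn't, yet the effect size never moved.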

Coming Up:  Guest Post on a possible, probable, Salvation.

Further/Future reading

G Cumming (2014) How significant is P? Australasian Science, March 2014. p. 37.

http://www.australasianscience.com.au/article/issue-march-2014/how-significant-p.html

also check out Prof G’s website

http://www.latrobe.edu.au/psy/research/cognitive-and-developmental-psychology/esci

with free Excel ESCI program and details of his illuminating 2012 book ‘The New Statistics’.

Now, back to honest resting from honest labour!

Minitab 17: think Mini Cooper, not Minnie Mouse

As it has been 3 or 4 years since the previous version, the new release of the Minitab 17 statistical package is surely cause for rejoicing, merriment, and an extra biscuit with a strong cup of tea.

At one of the centres where I work, the data analysts sit at the same lunch table, but are known by their packages: the Stata people, the SAS person, the R person, the SPSS person and so on. No Minitab person as yet, but maybe there should be. Not only for its easy-to-use graphics, mentioned in a previous post, but for its all-round interface, programmability (Minitab syntax looks a little like that great Kemeny-Kurtz language from 1964 Dartmouth College, BASIC, but more powerful), a few new features (Poisson regression for relative risks and counted data, although alas no negative binomial regression for trickier counted data), and even better graphics.

It also offers bubble plots, outlier tests, and the Box-Cox transformation (another great collaboration from 1964). Minitab was also one of the first packages to include Exploratory Data Analysis (e.g. box plots and smoothed regression), for when the data are about as well-behaved as the next-door neighbours strung out on espresso coffee mixed with red cordial.
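Minitab has Box-Cox built in, but for the curious, here is a rough sketch of what the transformation does, in Python with scipy (simulated skewed data, so the numbers are illustrative only):

```python
import numpy as np
from scipy import stats

# Simulated skewed positive data (lognormal), the sort of thing Box-Cox is for.
rng = np.random.default_rng(42)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=500)

# boxcox estimates the power (lambda) that makes the data most nearly normal.
transformed, lam = stats.boxcox(skewed)
print(f"estimated lambda: {lam:.2f}")  # near 0 means 'roughly a log transform'
print(f"skewness before: {stats.skew(skewed):.2f}, after: {stats.skew(transformed):.2f}")
```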

Not as much cachet for when the R and SAS programmers come a-swaggering in, but still worth recommending for those who may not be getting as much as they should be out of SPSS, particularly for graphics, yet find the other packages a little too high to climb.

http://www.minitab.com/en-us/

Expected Unexpected: Power bands, performance curves, rogue waves and black swans

Many years ago, I had a ride on a Kawasaki 500 Mach III 2-stroke motorcycle, which, along with its even more horrendous 750cc version, was known as the ‘widow-maker’. It was incredibly fast in a straight line, but if it went around corners at all, the rider had long since fallen (or jumped) off!

It also had a very narrow ‘power band’ http://en.wikipedia.org/wiki/Power_band, in that it would have no real power until about 7,000 revs per minute, and then all of a sudden it would whoop and holler like the proverbial bat out of hell, the front wheel would lift, the rider’s jaw would drop, and well, you get the idea! In statistical terms, this was a nonlinear relationship between twisting the throttle and the available power.

A somewhat less dramatic example of a nonlinear effect is the Yerkes-Dodson ‘law’ http://en.wikipedia.org/wiki/Yerkes%E2%80%93Dodson_law, in which optimum task performance is associated with medium levels of arousal (too much arousal = the ‘heebie-jeebies’, too little = ‘half asleep’).

Various simple and esoteric methods exist for finding global (the data follow a standard pattern such as a U shape, or an upside-down U) or local (different parts of the data might be better explained by different models, rather than ‘one size fits all’) relationships. A popular ‘local’ method is known as a ‘spline’, after the flexible metal ruler that draftspeople once used to fit curves. The ‘GT’ version, Multivariate Adaptive Regression Splines http://en.wikipedia.org/wiki/Multivariate_adaptive_regression_splines, is available in R (itself a little reminiscent of a Mach III cycle at times!), in the big-iron ‘1960s 390 cubic inch Ford Galaxie V8’ of the SAS statistical package, and in the original, sleek ‘Ferrari V12’ Salford Systems version.

Other nonlinear methods are available http://en.wikipedia.org/wiki/Loess_curve, but the thing to remember is that life doesn’t always fit within the lines, or follow some human’s idea of a ‘natural law’.
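As a little illustration (simulated data, not any particular package's defaults), a smoothing spline happily follows an inverted U of the Yerkes-Dodson sort, where a straight line gives up:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Simulated Yerkes-Dodson-style data: performance peaks at medium arousal.
rng = np.random.default_rng(0)
arousal = np.linspace(0, 10, 80)
performance = -(arousal - 5.0) ** 2 + 25.0 + rng.normal(0, 1.5, arousal.size)

# A smoothing spline bends with the data; a straight line cannot.
spline = UnivariateSpline(arousal, performance, s=arousal.size * 1.5 ** 2)
straight = np.polyfit(arousal, performance, 1)  # linear fit, for comparison

peak = arousal[np.argmax(spline(arousal))]
print(f"spline puts peak performance at arousal of about {peak:.1f}")
```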

For example, freak or rogue waves, which can literally break supertankers in half, were observed for centuries by mariners but have only recently been accepted by shore-bound scientists; similarly, the black swans (actually native to Australia) of the stock market http://www.fooledbyrandomness.com/

When analysing data, fitting models, (or riding motorcycles), please be careful!

SecretSource: of Minitab and Dataviz

When the goers go and the stayers stay, when shirts loosen and tattoos glisten, it’s time for the statisticians and the miners and the data scientists to talk, and walk, Big Iron.

R. S-Plus. SAS. Tableau. Stata. GnuPlot. Mondrian. DataDesk. Minitab. MINITAB?????? Okay, we’ll leave the others to get back to their arm wrasslin’, but if you want to produce high quality graphs, simply, readily and quickly, then Minitab could be for you.

A commercialized version of Omnitab, Minitab appeared at Penn State in 1972 and has long been associated with students learning stats, but also now with business, industrial and medical/health quality management and six sigma, etc. There are some other real ‘rough and tumble’ applications involving Minitab – D. R. Helsel’s ‘Statistics for Censored Environmental Data Using Minitab and R’ (Wiley 2012), for instance.

IBM SPSS and Microsoft Excel can produce good graphs (‘good’ in the ‘good sense’ of John Tukey, Edward Tufte, William Cleveland, Howard Wainer, Stephen Few, Nathan Yau etc.), with the soft pedal down and ‘caution switches’ on, but Minitab is probably going to be easier.

For example, the Statistical Consulting Centre at the University of Melbourne uses Minitab for most of its graphs (R for the trickiest ones). As well as general short courses on Minitab, R, SPSS and GenStat there’s a one day course in Minitab graphics in November, which I’ve done and can recommend.

More details on the Producing Excellent Graphics Simply (PEGS) course using Minitab at Melbourne are at

http://www.scc.ms.unimelb.edu.au/pegs.html

student and academic pricing for Minitab is at http://onthehub.com/

What, I wonder, would Florence Nightingale have used for graphics software if she were alive today?

Electric Stats: PSPP and SPSS

Most people use computer stats packages if they want to perform statistical or data analysis. One of the most popular packages, particularly in psychology and physiotherapy, is SPSS, now known as IBM SPSS. Although there is room for growth in some areas such as ‘robust regression’ (regression for handling data that may not follow the usual assumptions), IBM SPSS has many jazzy features / options such as decision trees and neural nets and Monte Carlo simulation, as well as all the old faves like ANOVA, t-tests and chi-square.
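For what it's worth, the old faves are the same sums in any package; as a hedged open-source illustration (Python rather than SPSS, with invented counts), a chi-square test on a 2x2 table looks like this:

```python
from scipy import stats

# An invented 2x2 table: treatment vs control by improved vs not improved.
observed = [[30, 20],
            [18, 32]]

chi2, p, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```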

I love SPSS and have been using it since 1981, back when SPSS analyses had to be submitted to run after 11 pm (23:00) so as not to hog the ‘mainframe’ computer resources. Alas, as with Minitab, SAS, Stata and others, SPSS can be expensive if you’re not a student or academic. An open-source alternative that is free as in sarsaparilla and free as in speech is GNU PSPP, which has nothing whatsoever to do with IBM or the former SPSS Inc.

PSPP has a syntax or command line / program interface for old-school users such as myself, *and* a snazzy GUI or Graphical User Interface. Currently, it doesn’t have all the features that 1981 SPSS had (e.g. ‘two-way ANOVA’), let alone the more recent ones, although it does have logistic regression for binary outcomes such as depressed / non-depressed. PSPP is easy to use (easier than open-source R and perhaps even R Commander, although nowhere near as powerful).
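PSPP's logistic regression fits the same model any package would; purely as an illustrative sketch (invented data and a hand-rolled likelihood in Python, nothing to do with PSPP's internals), here is that model for a binary outcome:

```python
import numpy as np
from scipy.optimize import minimize

# Invented data: probability of a 'depressed' outcome rises with a stress score.
rng = np.random.default_rng(1)
stress = rng.normal(0, 1, 300)
prob = 1 / (1 + np.exp(-(-0.5 + 1.2 * stress)))  # true intercept -0.5, slope 1.2
y = (rng.random(300) < prob).astype(float)

X = np.column_stack([np.ones_like(stress), stress])  # intercept column + predictor

def neg_log_lik(beta):
    logit = X @ beta
    # log(1 + exp(logit)) computed stably via logaddexp
    return np.sum(np.logaddexp(0.0, logit) - y * logit)

# Maximum likelihood: minimise the negative log-likelihood.
fit = minimize(neg_log_lik, x0=np.zeros(2))
print("estimated intercept and slope:", fit.x.round(2))
```

The estimates land near the true values used to simulate the data, give or take sampling variability.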

PSPP can handle most basic analyses, and is great for starters and for those using a computer at a worksite etc. where SPSS is not installed but who need to run basic analyses or test syntax. The PSPP team is to be congratulated!

http://www.gnu.org/software/pspp/   free, open-source PSPP

http://www-01.ibm.com/software/analytics/spss/  IBM SPSS

(students and academics can obtain less expensive versions of IBM SPSS from http://onthehub.com)