To many who are not members of our Craft, and even some that are, Statistics is something of a Foreign Language, difficult to grasp without a good understanding of its grammar, or at least a whole swag of useful rules.
Stats is also difficult to teach, note our students look of bored angst when we try to explain p values.
So could we teach Stats like a foreign language?
For starters, why don’t we teach statistical ‘tourists’/’travellers’/’consumers’ some useful ‘phrases’ they can actually use, like how to read Excel files into a stats package, how to do a box plot, check for odd values, do some basic recodes etc.
Such things rarely appear in texts. Instead, we tumble about teaching the statistical equivalent of ‘the pen of my aunt is on the table’ or ‘my hovercraft is full of eels’ (Monty Python), or ‘a wolverine is eating my leg’ (Tim Cahill).
For example, as well as assuming that the data are all clean and ready to go, why do stats books persist in showing how to read in a list of 10 or so numbers, rather than reading in an actual file?
Just as human languages may or may not directly have universal concepts, the same may apply for stats packages. The objects of R for example, are very succinct in conception, but very dfficult to explain.
Such apparent lack of universality, may be why English borrows words like ‘gourmand’ (to cite from my own book chapter), as English doesn’t otherwise have words for a person that eats for pleasure. Similarly, courgette/zucchini sounds better than baby marrow (and have you ever seen how big they can actually grow?).
Yet it’s a two way street, with English providing words to other languages, such as ‘weekend’.
According to the old Sapir-Whorf hypothesis, language precedes or at least shapes thought (but see John McWhorter’s recent 2014 book The Language Hoax), so if there’s no word for something, it’s supposedly hard to think about it.
In Stats package terms, instructors would have to somehow explain that it is very easy to extract and store, say, correlation values in R, for further processing, putting smiley faces beside large ones etc. But in SPSS and SAS we would normally have to use OMS/ODS, and think in terms of capturing information that would otherwise be displayed on a line printer. This is a difficult concept to explain to anyone under 45 or so!
Although there are many great books on learning stats packages, (something for a later post), and I myself can ‘speak’ SPSS almost like a native after 33 years, I only know a few words of other human languages, (and, funnily enough, only a few “words” of R).
If you’ll excuse me, my aunt and her pen are now going for a ride on a hovercraft.
(I hope there’s no eels! )
REFS
Sapir-Whorf Hypothesis
https://www.princeton.edu/~achaney/tmve/wiki100k/docs/Sapir%E2%80%93Whorf_hypothesis.html
Counter to the Sapir-Whorf Hypthesis
http://global.oup.com/academic/product/the-language-hoax-9780199361588;jsessioniid=455D218B50BA25BC37B119092C3F7CDE?cc=au&lang=en&
Hovercraft, Gourmands and Stats Packages
McKenzie D (2013) Chapter 14: ‘Statistics and the Computer’ in
http://www.elsevierhealth.com.au/epidemiology-and-public-health/vital-statistics-paperbound/9780729541497/