Wisdom of the Cloud

Many summers ago, when I started out in the Craft, I could log onto the trusty DEC-20 from literally anywhere in the world and use SPSS or BMDP to analyse data. Nowadays I have to have IBM SPSS or Stata installed on the right laptop or computer and bring it with me wherever I may roam, wondering dreamily whether I could just access my licensed stats packages from anywhere: a library, a beach, a forest, a coffee shop.

One option would be to subscribe to a stats package in the Cloud! In terms of mainline stats packages, https://www.apponfly.com/en/ has R (free, plus 8 euros ($A12.08) per month for the platform), NCSS 10 at 18 euros ($A27.19) per month plus platform, IBM SPSS 23 Base at 99 euros ($A149.54) per month plus platform, and IBM SPSS 23 Standard (adds logistic regression, hierarchical linear modelling, survival analysis etc.) at 199 euros ($A300.59) per month plus platform.

Another option, particularly if you’re more into Six Sigma / quality control type analyses, is Engineroom from http://www.moresteam.com at $US275 ($A378.55) per year.

Obviously, compare the prices against actually buying the software, but to be able to log in from anywhere, on different computers, and analyse data... sigh, it’s almost like the summer of ’85!

When Boogie becomes Woogie, when Dog becomes Wolf

An exciting (and not just for statisticians!) area of application in statistics/analytics/data science is change/anomaly/outlier detection. The general notion of outliers (e.g. ‘unlikely’ values) was covered in a previous post that looked at, amongst other things, very long pregnancies.

But tonight’s fr’instance comes from Fleming’s wonderful James Bond Jamaican adventure novel, Dr No (also a jazzy 1962 movie), which talks of London Radio Security shutting down radio connections with secret agents if a change in their message-transmitting style is detected. Such a change may indicate that their radio has fallen into enemy hands.

To use a somewhat less exotic example, imagine someone, probably not James Bond, tenpin bowling and keeping track of their scores. The scenario comes from HJ Harrington et al’s excellent Statistical Analysis Simplified: the Easy-to-Understand Guide to SPC and Data Analysis (McGraw-Hill, 1998).

In the 10th week, the score suddenly drops more than three standard deviations (a measure of scatter or variation around the mean or average) below the mean.

Enemy agents? Forgotten bowling shoes? Too many milk shakes?

Once again we have an anomaly or change, something often examined in industry (Statistical Process Control (SPC) and related areas) to determine the point at which, in the words of Tom Robbins’ great novel Even Cowgirls Get The Blues, ‘the boogie stopped and the woogie began’.
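For the curious, the bowling drop can be sketched as a simple three-sigma check. The scores below are made up for illustration (they are not taken from Harrington et al), and the function name and the choice to estimate the mean and standard deviation from the first nine weeks only are my own assumptions, not the book's procedure.

```python
from statistics import mean, stdev

# Hypothetical weekly scores (invented for illustration); week 10 is the bad night.
scores = [152, 148, 155, 150, 147, 153, 149, 151, 154, 120]

def three_sigma_flags(values, baseline_n):
    """Flag values more than 3 SDs from the mean of the first baseline_n values."""
    baseline = values[:baseline_n]
    centre = mean(baseline)
    sd = stdev(baseline)  # sample standard deviation of the baseline weeks
    lower, upper = centre - 3 * sd, centre + 3 * sd
    # Weeks are numbered from 1, as a bowler would count them.
    return [(week, v) for week, v in enumerate(values, start=1)
            if v < lower or v > upper]

flags = three_sigma_flags(scores, baseline_n=9)
print(flags)
```

The limits come from the earlier, 'normal' weeks rather than from all ten because a single odd value inflates the standard deviation it is being judged against; with only ten scores, one value can never sit more than about 2.85 sample standard deviations from a mean that includes it.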

Sudden changes in operations & processes can happen, and so a usual everyday assembly line (‘dog’) can in milliseconds become the unusual, and possibly even dangerous (‘wolf’), at which point hopefully an alarm goes off and corrective action is taken.

The basics of SPC were developed many years ago (and taken to Japan after WW2, a story in itself). Anomaly detection is a fast-growing area. For further experimentation / reading, a recent method based upon calculating the closeness of points to their neighbours is described in John Foreman’s marvellous DataSmart: using Data Science to Transform Information into Insight (Wiley, 2014).
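To give a flavour of the 'closeness to neighbours' idea, here is a minimal sketch that scores each point by its average distance to its k nearest neighbours, so that isolated points get large scores. It is a simplified illustration of the general idea, not a reproduction of the method in Foreman's book; the points, the function name knn_scores and the choice of k = 2 are all assumptions of mine.

```python
import math

def knn_scores(points, k):
    """Mean Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        # Distances to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical data: five points huddled together and one far away.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (0.8, 0.9), (5.0, 5.0)]

scores = knn_scores(points, k=2)
most_isolated = max(range(len(points)), key=scores.__getitem__)
print(points[most_isolated])
```

A point with a score much larger than the rest has no close neighbours, which is one reasonable working definition of an outlier. How large is 'much larger' remains a judgment call that depends on the data.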

We might want to determine if a credit card has been stolen on the basis of different spending patterns/places, or, to return to the opening example, detect an unauthorised intruder to a computer network (e.g. Clifford Stoll’s trailblazing The Cuckoo’s Egg: Tracking a Spy Through the Maze of Computer Espionage).

Finally, we might just want to figure out exactly when it was that our bowling performance dropped off!
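One simple way to put a date on such a drop is a cumulative sum (CUSUM) of deviations from the overall mean: the running total climbs while scores are above average and falls once they dip below, so its peak marks the last 'good' week. The sketch below uses invented scores and a hypothetical function name, and it assumes a single, lasting shift in level; it is an illustration rather than a full SPC procedure.

```python
from itertools import accumulate
from statistics import mean

def cusum_change_point(values):
    """Return (last week before the estimated shift, list of cumulative sums)."""
    centre = mean(values)
    cusum = list(accumulate(v - centre for v in values))
    # The largest absolute cumulative sum marks the estimated turning point.
    idx = max(range(len(cusum)), key=lambda i: abs(cusum[i]))
    return idx + 1, cusum  # weeks are numbered from 1

# Hypothetical scores: steady around 150, then around 135 from week 7 onwards.
scores = [150, 152, 149, 151, 148, 150, 135, 137, 134, 136, 133, 135]

last_good_week, cusum = cusum_change_point(scores)
print(last_good_week)
```

With these made-up numbers the cumulative sum peaks at week 6, which points to the woogie beginning in week 7.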