Friday, 20 July 2012

Inactivity

I heard a(nother?) radio news story on Inactivity this morning, claiming that it is as big a killer as smoking and obesity.
As I've already blogged about how sitting too much is killing us, I don't want to focus on the inactivity piece here but rather on the subject of multiple inter-related variables.
When analysing what's going on, whether in science, business or life generally, it is important to find causality. This search usually starts with correlation, but should not end there.
What do I mean? Well, if A causes B then we would expect there to be a correlation between A and B. However, just because X and Y are correlated does not mean that X causes Y.
There are some classic examples...I remember learning when studying for A level Economics that there is a very strong correlation between the consumption of sardines and deaths from cancer, measured by national rates of each. However, there is no evidence for a causative relationship.
One controversial point in the nutritional field is the correlation between red meat consumption and heart disease. Some believe this tells us red meat is unhealthy (and usually blame the saturated fats). Others (and I personally subscribe to this group) think that something else is going on, since there is no clearly established medical pathway between eating saturated fat and heart disease (and indeed many other scientific studies show no relationship).
This "something else" normally takes the form of a so-called "confounder". It is the common link. Westernisation is the confounder in the sardines and cancer correlation - associated with both, and arguably, also the "cause" of both.
In the red meat example, the confounder may not be so simple...it could be that people who eat lots of red meat simply eat too much (of everything) and it is the over-consumption that is the problem. It could be that the red meat is too often consumed with lots of starchy carbohydrates (just imagine that red meat in burger form, sandwiched in a nutritionally poor burger bun and with a large pile of fries and a super-sized soda and you'll catch my thinking) and it is in fact the sugar and starch that are the problem. Based on other studies, both excessive calorie consumption per se and high carb intake have been shown to have specific body and hormonal effects that tend towards weight gain and heart disease.
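To make the confounder idea concrete, here is a minimal sketch in Python using made-up, simulated data (not real sardine, cancer or red meat figures). The names z, x and y are purely illustrative: z plays the confounder, and x and y are each driven by z but never by each other. The helper functions pearson and partial_corr are my own, written for this illustration.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(vx * vy)

def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after removing the linear effect of zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumption for illustration: z is the hidden confounder.
z = [rng.gauss(0, 1) for _ in range(n)]
# x and y each depend on z plus their own noise; neither depends on the other.
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)                 # clearly positive, despite no causal link
r_xy_given_z = partial_corr(x, y, z) # close to zero once z is accounted for

print(f"corr(x, y)             = {r_xy:.2f}")
print(f"corr(x, y | z removed) = {r_xy_given_z:.2f}")
```

The raw correlation between x and y comes out clearly positive, yet it all but vanishes once z is controlled for. Of course, this only works here because we know z exists and can measure it, which is exactly what is missing in most real-world cases.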
But back to my main point...how do we separate, for example, the impact of inactivity and obesity on the death rate? Surely lots of people are both inactive and obese, and the direction in which the causative relationship flows is still open to argument.
Traditionally, multi-variable correlation (multiple regression) is used. This means that if we measure the correlation between N and P, and the correlation between M and P, then the strength of each single correlation guides us as to the amount of the variability in P that is explained by each of N and M. We can then combine N and M in a variety of formulae to work out how we can maximise the correlation between this M/N combo and P.
But, a warning: this is a statistical analysis. A result that shows we can explain 90% of the variation with a combo of M and N does not mean we have shown the causative process. We could still have a confounder in the mix.
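The warning above can be sketched with simulated data. In this hedged example, a hidden variable q (my own invention for the illustration) drives M, N and P, while P is never computed from M or N. A two-variable least-squares fit of P on M and N still explains most of the variation. The function fit_two_predictors is a hand-rolled helper, not a library call, and all the numbers are made up.

```python
import random

def fit_two_predictors(m, n, p):
    """Least-squares fit of p = a + b1*m + b2*n; returns (b1, b2, r_squared)."""
    k = len(p)
    mm, mn, mp = sum(m) / k, sum(n) / k, sum(p) / k
    dm = [v - mm for v in m]
    dn = [v - mn for v in n]
    dp = [v - mp for v in p]
    # Sums of squares and cross-products of the centred data.
    s_mm = sum(a * a for a in dm)
    s_nn = sum(a * a for a in dn)
    s_mn = sum(a * b for a, b in zip(dm, dn))
    s_mp = sum(a * b for a, b in zip(dm, dp))
    s_np = sum(a * b for a, b in zip(dn, dp))
    # Solve the 2x2 normal equations directly.
    det = s_mm * s_nn - s_mn ** 2
    b1 = (s_mp * s_nn - s_np * s_mn) / det
    b2 = (s_np * s_mm - s_mp * s_mn) / det
    ss_res = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(dm, dn, dp))
    ss_tot = sum(c * c for c in dp)
    return b1, b2, 1 - ss_res / ss_tot

rng = random.Random(1)  # fixed seed so the run is repeatable
size = 5000

# Assumption for illustration: q is a hidden confounder nobody measured.
q = [rng.gauss(0, 1) for _ in range(size)]
M = [v + 0.2 * rng.gauss(0, 1) for v in q]
N = [v + 0.2 * rng.gauss(0, 1) for v in q]
P = [v + 0.2 * rng.gauss(0, 1) for v in q]  # note: built from q only, not M or N

b1, b2, r_squared = fit_two_predictors(M, N, P)
print(f"share of variation in P 'explained' by M and N: {r_squared:.0%}")
```

The fit reports that the M/N combo accounts for the large majority of the variation in P, even though changing M or N in this simulated world would do nothing to P. The statistics are correct; it is the causal reading of them that would be wrong.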


In work I did with a major fitness club chain, we were looking for correlations between KPIs for the global clubs. We were looking to show that, for example, customer satisfaction survey scores were correlated with club performance (measured by membership, revenue or profit). Other possible "causes" of performance were retention scores, the physical state of the club etc. The statistical analysis showed that we could not find a single variable with a strong correlation with measures of performance. I found this fascinating. Surely, my thinking went, this means something big. Others in the management team were more sceptical, thinking that we never would find such correlations and that it was pointless looking.
My theory was that there was a confounder in the mix (maybe more than one). The best theory I could come up with was that the quality of club management (almost impossible to measure objectively) was the most likely confounder, explaining both the KPIs and the performance measures. It is broadly known in multi-site businesses that the management strength of the business unit managers is absolutely critical to success. 
The only way to demonstrate this with statistics would be time-series analysis of different clubs set against measurable management strength indices. Correlations between different variables will never get there.
And so back to inactivity. I am sceptical about the ability of the researchers to separate the figures from a causation point of view, even if they have been able to do so statistically.
Be very careful what conclusions you draw from correlation, especially if there are many competing and inter-related variables involved.
