Science and sensory deprivation

Scientific study of sensory deprivation dates back almost sixty years, to the mid-1950s, and float tanks as we know them today date back to the mid-1960s. They’ve never been mainstream, but they were invented and developed in research settings, and the literature on them is reasonably extensive. In the scientific community, the field is known as “flotation REST”, for Restricted Environmental Stimulation Technique, or Therapy.

What has been found? I’m still digging into it myself, but one survey I found lists:

reduced stress (Kjellgren, Sundequist, Norlander, & Archer, 2001),
reduced tension and anxiety (Fine & Turner, 1982; Schulz & Kaspar, 1994; Suedfeld, 1983),
reduced blood pressure (Fine & Turner, 1982; Turner, Fine, Ewy, & Sershon, 1989),
less muscle tension (Norlander, Bergman, & Archer, 1999),
increased well-being (Mahoney, 1990),
improved sleep (Ballard, 1993),
mild euphoria (Schulz & Kaspar, 1994),
increased originality (Forgays & Forgays, 1992; Norlander, Bergman, & Archer, 1998; Norlander, Kjellgren, & Archer, 2003; Sandlund, Linnarud, & Norlander, 2001; Suedfeld, Metcalfe, & Bluck, 1987), and
indications that the technique is a suitable complement to psycho-therapy (Jessen, 1990; Mahoney, 1990).

I want to try to dig up some of these articles, and I’ll post about them when I find details.

In the meantime, let’s start with a meta-analysis from 2005 (Dierendonck & Nijenhuis, Psychology and Health 20(3): 405–12). So, yes, a Dutch study. I’m going to call them D&N and save some syllables.

D&N found 27 studies published up to 2002 meeting two basic criteria:

the study examined the effects of standard flotation REST, and
it published quantifiable measures of its outcomes.

Many had small sample sizes and some lacked control groups — this is an unfortunate feature of the REST literature, being rather underfunded and obscure. But the point of a meta-analysis is to try to compensate for these defects by aggregating the results to simulate a larger sample size.

The study results could be divided into three categories:

direct physiological measures related to activation (e.g., blood pressure, level of cortisol, and level of adrenaline),
more subjective psychological measures of well-being (e.g., negative or positive affectivity, measured by questionnaire or observation), and
performance outcomes of activities involving a physiological component (e.g., archery, basketball).

Let’s talk a little about meta-analysis methodology, now. My background is actually astrophysics, so I’m not particularly well trained in statistics in general or this kind of study in particular, but I’ll explain what I know. If I get anything wrong, please let me know and I’ll update.

There are two basic questions to ask about any study result: (1) is the result statistically significant, and (2) how big is the effect size? Statistical significance is a probability measure: essentially, it asks how likely it is that this result is real, and not the result of a fluke or a lucky roll of the dice. (Strictly speaking, it measures how rarely chance alone would produce a result this strong if there were no real effect.) The generally accepted threshold for scientific believability is 95%, the “19 times out of 20” thing you see all the time in political polling. I believe this threshold is essentially arbitrary, but since 100% is impossible, the consensus is that for most purposes 95% is good enough.

Statistical significance, though, says only that some effect is real, and nothing about how big or important it is. It might be a statistically significant result that Finnish men are on average a quarter inch taller than American men, but there’s no reason to care. This is where effect size comes in, commonly measured by a value called Cohen’s d. It’s a measure of comparison between two groups; for instance, a study’s test group vs its control group, or the same study’s test group before and after floating. You take the difference between the average measured values for the two groups, and divide by the standard deviation to set the scale. That is, if the test group scores are on average half a standard deviation better than the control group scores, the comparison has an effect size of d=0.5.

So, how much is a lot? The interpretation of effect size varies by context, but for psychological and physiological studies, the distribution of any given measure across populations is typically quite broad, and interventions that can shift a measure by a whole standard deviation are quite rare. The “standard” interpretation offered by Cohen (1988) is apparently that less than 0.2 is small and more than 0.8 is large.
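To make the definition concrete, here is a minimal sketch of the calculation in Python. It assumes the "standard deviation" in the denominator is the pooled standard deviation of the two groups, which is a common convention but not something the studies above are guaranteed to share. The function names and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(test, control):
    """Difference in group means, in units of the pooled standard deviation."""
    n1, n2 = len(test), len(control)
    # Pooled variance is an assumption; the text only says "the standard deviation".
    pooled_var = ((n1 - 1) * variance(test) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(test) - mean(control)) / sqrt(pooled_var)


def cohen_label(d):
    """Rule of thumb quoted above: under 0.2 is small, over 0.8 is large."""
    size = abs(d)
    if size < 0.2:
        return "small"
    if size > 0.8:
        return "large"
    return "in between"


# Hypothetical scores, invented purely for illustration (not data from any study).
control_scores = [1, 2, 3, 4, 5]
test_scores = [2, 3, 4, 5, 6]

d = cohens_d(test_scores, control_scores)
print(f"d = {d:.2f} ({cohen_label(d)})")  # prints "d = 0.63 (in between)"
```

In this toy example the test group averages one point higher, and the scores within each group spread over about 1.6 points, so the shift is a bit under two thirds of a standard deviation.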

So, back to floating. What do we find from the meta-analysis? Of the 17 studies reporting control groups, six reported positive effects larger than 0.8, and only one an effect less than 0.2 (none were negative). The overall most likely effect size was 0.73, with 95% confidence it was between 0.52 and 0.94.

Outcome type	# studies	# subjects	d (exp vs control)	95% confidence interval
All measures	17	387	0.73	0.52–0.94
Physiology	5	83	0.59	0.15–1.04
Well-being	5	132	0.92	0.56–1.28
Performance	7	172	0.65	0.34–0.96

That’s large! As noted above, these measures typically vary widely across a population, so the standard deviation is quite wide, and an action (floating) that improves a measure by between a half and a full standard deviation is a meaningful improvement.
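As a rough check on how aggregation tightens the estimate, the three outcome rows of the table can be pooled with a textbook inverse-variance (fixed-effect) weighting. This is only a sketch: I am assuming the published intervals are symmetric normal intervals (d plus or minus 1.96 standard errors), and I don't know that D&N weighted their studies this way. The names `rows` and `pool` are mine.

```python
from math import sqrt

Z95 = 1.96  # normal-approximation multiplier for a 95% interval

# (d, lower bound, upper bound) for each outcome type, copied from the table above.
rows = {
    "Physiology": (0.59, 0.15, 1.04),
    "Well-being": (0.92, 0.56, 1.28),
    "Performance": (0.65, 0.34, 0.96),
}


def pool(rows):
    """Inverse-variance weighted mean of the effect sizes, with its 95% interval."""
    total_weight = 0.0
    weighted_sum = 0.0
    for d, lower, upper in rows.values():
        se = (upper - lower) / (2 * Z95)  # back out the standard error from the interval
        weight = 1 / se**2                # more precise rows count for more
        total_weight += weight
        weighted_sum += weight * d
    pooled_d = weighted_sum / total_weight
    pooled_se = sqrt(1 / total_weight)
    return pooled_d, pooled_d - Z95 * pooled_se, pooled_d + Z95 * pooled_se


d, lower, upper = pool(rows)
print(f"pooled d = {d:.2f}, 95% interval {lower:.2f} to {upper:.2f}")
```

Under these assumptions the pooled value comes out at about 0.73, with an interval of roughly 0.52 to 0.93, which is close to the "All measures" row. Note that the pooled interval is narrower than any of the three it was built from; that is the "larger sample size" effect at work.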

There was a wide variety of procedures used in these various studies, and in particular D&N were curious about the effect of the duration of the study. Some studies ran subjects through only one session in a tank, while others asked subjects to return several or many times over time periods of up to 28 weeks. D&N conclude with 99.5% confidence that the effect size does rise with duration, with six-month effects being as much as 40% larger than single-session effects.

There are a couple of dangers to be aware of in trying to draw strong conclusions from the literature as it exists. The most important is that the generally small sample sizes of the included studies lead to a high error variance: missing data and/or outliers could have a large effect. For instance, in a small study, if a few people who do not respond well to REST drop out, then the remaining effect could appear quite high.

(This point is important to keep in mind for the purpose of comparing apples to apples when looking at different studies and different techniques. But of course, even if it might turn out that only a subset of the population responds strongly to REST, that makes it a very good therapy for those people!)

The second, a limitation of all meta-analyses, is the so-called ‘file-drawer problem’: the idea that inconclusive or disappointing studies may get stuck in a drawer and never be published, making the average in the literature look overly optimistic. D&N calculate, though, that reducing the effect size to even below 0.5 would take an additional eight “lost” studies with zero effect. That’s rather too many to be plausible.
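The arithmetic behind that kind of "lost studies" estimate can be sketched in a few lines. I am assuming a simple unweighted average over the 17 controlled studies, with every unpublished study contributing an effect of exactly zero; D&N's actual calculation may have been more refined. The function name is hypothetical.

```python
def lost_studies_needed(published, mean_d, threshold):
    """Smallest number of zero-effect studies that drags the average below threshold.

    Assumes a plain unweighted average, which is a simplification.
    """
    total_effect = published * mean_d  # the lost studies add nothing to this sum
    lost = 0
    while total_effect / (published + lost) >= threshold:
        lost += 1
    return lost


print(lost_studies_needed(17, 0.73, 0.5))  # prints 8
```

With seven lost studies the average would still sit just above 0.5 (12.41 / 24 is about 0.52); the eighth pushes it just under, which agrees with the figure quoted above.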

Overall, it’s especially noteworthy that substantial improvements are being seen as a result of an easy therapy with no known harmful side effects. For comparison, a similar meta-analysis of other stress relaxation techniques such as relaxation exercises, biofeedback, or sitting comfortably on a couch yielded an effect size of 0.35 (van der Klink et al., 2001, as cited by D&N). According to D&N, Lipsey & Wilson (1993) surveyed 300 studies of psychological, educational, and other behavioral interventions for stress and coping, and an effect size of 0.73 would be in the top 25% of techniques.
