Sheldrake's Seven Experiments

Pets Who Sense When Their Caretakers are Returning
How do Pigeons Home?
The Organization of Termites
The Sense of Being Stared At
Phantom Touch
The Variability of the 'Fundamental Constants'
The Effects of Researchers' Expectations

Sheldrake, Rupert. Seven Experiments That Could Change the World. New York: Riverhead, 1995.

The Effects of Researchers' Expectations

Can researchers' expectations influence their results?

Rupert Sheldrake:

"How paranormal is normal science?

There is a good reason for the conventional taboo against parapsychology, making it a kind of outcast from established science. The existence of psychic phenomena would seriously endanger the illusion of objectivity. It would raise the possibility that many empirical results in many fields of science reflect the expectations of the experimenters through subtle unconscious influences. Ironically, the orthodox ideal of passive observation may well provide excellent conditions for paranormal effects:

An experimenter preparing [the] apparatus, getting . . . animals ready, and then leaving them with some feeling of assurance that the experiment will run and the animals will appropriately 'do their thing' cannot but remind us of certain aspects of magic, ritual, or perhaps petitionary prayer. Something is done with the confidence that it will produce a desired result, and the participant[s], once [they have] done this, psychologically [put] a distance between [themselves] and the outcome. [They are] not trying to make things happen, but just [trust] that they will.
...Such circumstances may provide an optimum opportunity for psychokinetic intervention.

This possibility has indeed been raised in a paper in Nature entitled 'Scientists confronting the paranormal', by the physicist David Bohm and others. They noted that the relaxed conditions necessary for the appearance of psychokinetic phenomena are also those most fruitful for scientific research in general. Conversely, tension, fear, and hostility tend not only to inhibit psi effects, but also to influence experiments in the so-called hard sciences too. 'If any of those who participate in a physical experiment are tense and hostile, and do not really want the experiment to work, the chances of success are greatly diminished.'

The defenders of orthodoxy generally reject or ignore the possibility of paranormal influences under any circumstances. The task of keeping science psi-free is undertaken by organized groups of Skeptics. These scientific vigilantes continually challenge any evidence for psi effects, rejecting it on one or more of the following grounds:

  1. Incompetent experimentation.

  2. Selective observation, recording, and reporting of data.

  3. Unconscious or conscious deception.

  4. Experimenter effects mediated by subtle cues.

Skeptics are right to point out these possible sources of error in parapsychological research. But the same sources of bias are present in orthodox research as well. The very fact that parapsychological research is subject to such critical scrutiny makes researchers in this field unusually conscious of the effects of expectation. Ironically, it is in conventional, uncontroversial fields of research that the influences of experimenters' expectations are most likely to pass undetected.

The evidence for experimenter effects in medicine and the behavioral sciences is undeniable. And that is why 'subtle cues' take on such an important explanatory role. Almost everyone agrees that subtle cues such as gestures, eye movements, body posture, and odors can influence people and animals. Skeptics are very keen on emphasizing the importance of such cues, and rightly so. A favorite example showing the importance of subtle communication is the story of Clever Hans, a famous horse in Berlin at the turn of the century. This horse could apparently perform arithmetic in the presence of its owner by tapping a hoof on the ground to count out an answer. Fraud seemed unlikely, since the owner would allow other people (free of charge) to question the animal themselves. The phenomenon was scientifically investigated in 1904 by the psychologist Oskar Pfungst, who concluded that the horse was receiving clues from gestures made, probably unwittingly, by the owner and other questioners. Pfungst found that he could get the horse to give the correct answer simply by concentrating his attention on the number, though he was not aware of making any movement that would give the number away.

No one denies that subtle cues from experimenters, passing through normal sensory channels, can affect people and animals. Skeptics claim that such influences may explain many examples of seemingly telepathic communication. But granted all this, the possibility remains that both subtle sensory cues and 'paranormal' influences play a part.

The story of Pfungst's investigation of Clever Hans has been told again and again to generations of psychology students. What is less well known is that after Pfungst's investigation, described in his book on Clever Hans published in 1911, further studies on horses with similar mathematical powers showed that more was involved than subtle sensory cues. For example, when Maurice Maeterlinck investigated the famous calculating horses of Elberfeld, he concluded that they were somehow reading his mind, rather than responding to subtle sensory cues. After a series of increasingly stringent tests, he finally thought of one which 'by virtue of its very simplicity, could not be exposed to any elaborate and far-fetched suspicions'. He took three cards with numbers on them, shuffled them without looking at them, and placed them face down on a board where the horse could see only their blank backs. 'There was therefore, at that moment, not a human soul on earth who knew the figures.' Yet, without hesitation, the horse rapped out the number the three cards formed. This experiment succeeded with other calculating horses too 'as often as I cared to try it.' These results go even beyond the possibility of telepathy, since Maeterlinck himself did not know the answers when the horses were tapping them out. They imply either that the horses were capable of clairvoyance, directly knowing what was on the cards, or precognition, knowing the number that would be in Maeterlinck's mind when he later turned the cards over.

For more than eighty years, the story of Clever Hans and Pfungst has been told and retold as a triumph of scepticism. It has taken on a mythic significance, enabling seemingly paranormal effects to be explained in terms of subtle cues. But what if some of the subtle cues are themselves paranormal? There is a taboo against even discussing this possibility, let alone investigating it. Nevertheless, the possible importance of parapsychological influences was suggested to Rosenthal by one of his colleagues at Harvard right at the outset of his research on experimenter effects:

Had I the wit or courage to do so, I could easily have conducted a study in which experimenters with varying expectations for their subjects' responses were prevented from having sensory contact with those subjects. My prediction, then and now, was (and would be) that under these conditions, no expectancy effects could occur. But I never did the study.

Maybe if someone actually did this study, Rosenthal's prediction would turn out to be wrong. Maybe some of the effects of experimenters' expectations are indeed paranormal. Such subtle influences would not be opposed to subtle cues; they would usually work along with them, and operate just as unconsciously.

Although experimenter effects are well recognized in the medical and behavioral sciences, the fact that they are explained--or explained away--in terms of 'subtle cues' prevents them from being taken very seriously in other fields of investigation such as biochemistry. Whereas a person or a rat might pick up a scientist's expectations and respond accordingly, an enzyme in a test tube would not be expected to respond to subtle body language, unconscious facial gestures, etc. Of course, there is a general recognition of the possibility of biased observation, but this is not a result of any actual influence on the experimental system itself. The scientist may 'see' a difference that fits his or her expectancy, but the difference is supposed to be only in the eye of the observer, not in the material studied.

Nevertheless, all this is merely an assumption. There has been practically no research on the influence of experimenters' expectations in fields of science such as agriculture, genetics, molecular biology, chemistry, and physics. Since the material studied is assumed to be immune from such influences, precautions against them are assumed to be unnecessary. Except in the behavioral sciences and in clinical research, double-blind procedures are rarely employed.

I now suggest a variety of tests to explore the possibility that experimenter effects may be far more widespread than previously thought.

Experiments on Possible Paranormal Experimenter Effects

In looking for experimenter effects, I think it is best to start with situations where the phenomena show an inherent variability, an inherent indeterminism, allowing scope for the biasing effects of expectancy. This is certainly the case with human and animal behavior, where expectancy effects have been so clearly demonstrated. I would not expect physical systems with a high degree of uniformity and predictability to show much scope for biasing effects, for example the dynamics of billiard balls (although even here, in a hotly contested game of billiards, a player might be highly motivated to affect the outcome of impacts and collisions, and could conceivably bring into play unconscious psychokinetic powers).

In fact, variable, statistical results are the norm in most fields of social and biological research, including sociology, ecology, veterinary medicine, agriculture, genetics, developmental biology, microbiology, neurophysiology, immunology, and so on. And so they are in quantum physics, where probabilities are of the essence. There are many areas of the physical sciences too where inherent variability is very apparent, as in crystallization processes--for example, every snowflake is different. And even the most mechanistic of systems, mass-produced machines, are variable. Their tendency to break down, for example, is measured statistically, as in the 'reliability' figures for different brands published in consumer surveys. And almost everyone has heard of 'lemons', individual cars or other machines which are exceptionally unreliable--in extreme cases even said to be 'jinxed'.

What I am proposing is a general type of experiment that can be conducted in many fields of enquiry. The experimental design follows Rosenthal's standard procedure but is extended to other areas which are so far unexplored. The purpose is to find out which systems are susceptible to the influence of experimenters' expectations, and to compare the susceptibilities of different systems. Here are two extreme examples.

First, students are given two samples of radioactive tracers, of the kind routinely used in biochemical and biophysical research, and led to believe that one is more radioactive than the other. In fact, both are the same. They then determine the levels of radioactivity following standard laboratory procedures, with automatic Geiger or scintillation counters. Do they tend to find more radioactivity in the samples where they expect it?

In the second example, in the field of consumer research, volunteers are given samples of a standard product, say an automatic camera, and told they are taking part in an experiment to study the 'Monday morning' effect, whereby an unusually high proportion of 'lemon' cameras are produced on Monday mornings. Half the cameras, drawn at random from a normal consignment, are labeled 'Monday Morning Sample'. The others are labeled 'Reliable Control'. The experiment is designed so that both lots of cameras are used to an equal extent under comparable conditions, and the volunteers are asked to report regularly on any problems they have encountered. Do the 'Monday morning' cameras tend to show more defects?

I would guess that the camera experiment might show a bigger expectancy effect than the radioactivity experiment. There are more ways that people's expectations could affect the results--for instance, they may be more on the look-out for faults with the 'Monday morning' cameras, or treat them with less respect, handling them more roughly. There would also be the possibility of unconscious paranormal influences; for example, their negative expectations about the 'Monday morning' cameras might somehow put a 'jinx' on them. But even the radioactivity experiment leaves scope for several kinds of influence, including conscious or unconscious errors in preparing the sample for radioactive analysis, and a psychokinetic influence on the process of radioactive decay itself, or on the operation of the measuring instrument. If these experiments did in fact provide positive evidence for expectancy effects, further research could then be designed to tease apart the possibilities, separating possible paranormal effects from other sources of bias.
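[Illustrative sketch, not part of Sheldrake's text.] One way the two extreme examples above might be analysed is a label-shuffling (permutation) test: if the samples really are identical, swapping the 'expected high' and 'control' labels should make no systematic difference. The function name, the readings, and the number of shuffles below are all hypothetical choices made for illustration, not anything specified in the book.

```python
import random

def permutation_p_value(expected_high, control, n_shuffles=10_000, seed=0):
    """One-sided p-value for mean(expected_high) - mean(control).

    Labels are shuffled at random; the p-value is the share of shuffles
    whose difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    k = len(expected_high)
    pooled = list(expected_high) + list(control)
    observed = sum(expected_high) / k - sum(control) / len(control)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k)
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (n_shuffles + 1)

# Hypothetical counts per minute reported by students (made-up numbers).
labelled_hot = [1012, 998, 1005, 1021, 990, 1003]
labelled_control = [1001, 1007, 995, 1010, 1002, 999]

p = permutation_p_value(labelled_hot, labelled_control)
print(f"one-sided p = {p:.3f}")
```

A small p-value would only indicate that the labelled groups differ; as the passage notes, further work would be needed to separate recording bias and handling differences from anything else.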

Here are some more examples of experiments of this general type.

1. A crystallization experiment

Many compounds do not crystallize readily even from supersaturated solutions; there may be delays of hours, days, or even weeks before crystals appear. However, crystallization can be initiated by putting in 'seeds' or 'nuclei' around which the crystals can form. In this experiment, students are given a supersaturated solution of a hard-to-crystallize substance and also two samples of a fine powder, one described as a 'nucleation enhancer', made by a special seed-enrichment process, and the other as an 'inert control'. In fact the two powders are identical. To each of several identical containers containing a fixed amount of supersaturated solution, the students add a small, defined amount of 'nucleation enhancer'; to an equal number of identical containers with the same amount of supersaturated solution, they add the same amount of 'inert control' powder. They examine the samples at regular intervals, recording which ones have crystallized. Do the samples they expect to crystallize sooner show a tendency to do so?

2. A biochemical experiment

Students in a biochemistry practical class are given two samples of a particular enzyme. One is described as having been treated with an inhibitor which partially blocks its activity; the other is described as the untreated control. In fact both samples are identical. They measure the enzyme activity, using standard biochemical techniques. Does the 'inhibited' enzyme tend to show lower activity than the 'control'?

3. A genetic experiment

Students in a genetics practical class are given seeds of a fast-growing plant species, for example Arabidopsis thaliana, a small plant in the mustard family commonly used for genetic research. They divide this lot of seeds into two samples. One is the control. The other is placed in a lead-shielded radiation chamber, covered with signs saying 'Radioactivity--Danger', and left there for a defined period before being taken out with great care. These seeds have now supposedly been subjected to powerful mutation-inducing radiations (but in fact there is no radioactive source in the chamber). Both samples are now raised under identical conditions, and the students in due course record the number of abnormal growth forms in both samples. Do they tend to find more 'mutant' forms in the 'irradiated' samples?

4. Another genetic experiment

Students in another genetics practical class are given fruit flies containing mutant genes, for example mutations in bithorax genes giving flies containing them a tendency to produce four wings instead of two. Such mutations are recessive; in other words, only flies with a double dose of the mutant genes develop abnormally. First-generation hybrids between such mutant flies and normal flies appear normal. But when these hybrids are crossed with each other, they give rise to progeny which show Mendelian segregation: most of these second-generation hybrids are normal-looking, but a minority show the mutant form to varying degrees.

The students are given two samples of the normal-looking hybrid flies, drawn from the same population, but one sample is said to have an 'enhancer' gene that causes the bithorax character to show a higher penetrance and expressivity in the segregating population. (In the jargon of genetics, 'penetrance' means the proportion of organisms showing the effects of the gene in question, and 'expressivity' the intensity with which the effects of the gene are realized.) The other sample of hybrid flies is said to be bred from a strain with an 'inhibitor' gene with the opposite effect.

The students then breed from the flies with the 'enhancer' gene and the 'inhibitor' gene, and carefully examine the resulting populations of flies. Do the populations with the 'enhancer' gene tend to show a higher proportion of abnormal flies, and does this character tend to be more strongly expressed? (The flies in both populations should be preserved, for example in alcohol, so they can later be re-examined independently.)
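[Illustrative sketch, not part of Sheldrake's text.] The Mendelian segregation described in this experiment can be made concrete with a small calculation. The code below assumes an idealised single recessive gene with full penetrance, so that one quarter of second-generation flies are expected to show the mutant form; the passage itself notes that real expression varies. The allele letters and the fly counts are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mutant_fraction(parent1="Bb", parent2="Bb", recessive="b"):
    """Expected share of offspring carrying a double dose of the recessive allele."""
    offspring = [a + b for a, b in product(parent1, parent2)]
    mutants = [g for g in offspring if g == recessive * 2]
    return Fraction(len(mutants), len(offspring))

def chi_square_3_to_1(normal, mutant):
    """Chi-square statistic (1 degree of freedom) against a 3:1 normal:mutant ratio."""
    total = normal + mutant
    expected_normal = total * 0.75
    expected_mutant = total * 0.25
    return ((normal - expected_normal) ** 2 / expected_normal
            + (mutant - expected_mutant) ** 2 / expected_mutant)

print(mutant_fraction())          # prints 1/4
print(chi_square_3_to_1(75, 25))  # prints 0.0 (exactly the expected ratio)
print(chi_square_3_to_1(60, 40))  # prints 12.0 (made-up counts far from 3:1)
```

For one degree of freedom, a statistic above about 3.84 corresponds to the conventional 5% significance level, so comparing the 'enhancer' and 'inhibitor' populations against the same expected ratio would give a first indication of whether either had drifted from it.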

5. An agricultural experiment

Agriculture students are told that as a practical exercise they are going to carry out a trial of a promising new growth stimulator, which when sprayed onto plants at regular intervals leads to enhanced yields. They carry out a field experiment on a crop of, say, beans, using a standard design, with replicated plots and a randomized allocation of test and control treatments. Throughout the flowering and fruiting season they spray the plants in the test plots at weekly intervals with the 'growth stimulator' solution, and the control plants with an equal volume of water. In fact the 'growth stimulator' solution is nothing but water. On each occasion they observe the plants carefully and note any differences they can see between the plants in the test and control plots. When the crop is mature, they harvest the plants in the various plots and measure their total weight and seed yield. Do the 'stimulated' plants grow better and give a higher yield than the 'controls'?
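[Illustrative sketch, not part of Sheldrake's text.] The 'standard design, with replicated plots and a randomized allocation' is not spelled out in the passage. One common reading is a randomized block layout, in which each block of plots receives both treatments equally often, in random order. The function name, block sizes, and seed below are assumptions made for illustration.

```python
import random

def allocate_plots(n_blocks, plots_per_block, seed=None):
    """Return one shuffled list of treatment labels per block.

    Each block gets equal numbers of 'stimulator' and 'control' plots,
    so differences in soil or position between blocks affect both equally.
    """
    if plots_per_block % 2:
        raise ValueError("plots_per_block must be even")
    rng = random.Random(seed)
    half = plots_per_block // 2
    layout = []
    for _ in range(n_blocks):
        block = ["stimulator"] * half + ["control"] * half
        rng.shuffle(block)
        layout.append(block)
    return layout

# Hypothetical field: 4 blocks of 6 plots each.
for number, block in enumerate(allocate_plots(4, 6, seed=1), start=1):
    print(f"block {number}: {block}")
```

Fixing the seed makes the layout reproducible, which lets a third party confirm afterwards that the allocation was not adjusted by hand.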

There is no need to multiply examples further. Clearly the same general principles could be extended to many fields of research. Such experiments would be particularly easy to carry out, at little cost, in the context of student practical classes, with the cooperation of the regular course instructors.


The only disadvantage of these experiments is that they involve deception. In this respect they follow the precedents established by Rosenthal and his colleagues, and by the use of placebos in medical research. Some people may object on ethical grounds, and I am not entirely happy with the use of deception as a means of affecting people's expectations. But I believe this kind of research can be justified because of its importance in helping to reveal the possible extent of expectancy effects in the practice of science, together with the dangers of self-deception.

However, I also believe that if such deception became commonplace it would be self-limiting. If experiments of this type give interesting and significant results, if further research on the topic became widespread, and if the results were well publicized, students would probably become increasingly aware of the possibility that they were sometimes being deceived by their instructors. They might then tend to be more sceptical about what they are told to expect, and hence less prone to expectancy effects. If the deliberate practice of deception makes students more aware of the effect of expectation, more on guard against it, this would be a valuable contribution to their scientific education.

The effects of the kind of deception used in these experiments may be relatively weak, because students' expectations may be lightly held, not involving much personal commitment; they are merely carrying out practical exercises that no one takes too seriously. Professional researchers, steeped in the currently accepted paradigms and with careers and reputations at stake, may show much stronger expectancy effects, and also be more prone to self-deception....


...If any significant expectancy effects show up, the investigations will need to be taken further in order to find out what kinds of factors, normal or paranormal, may have been playing a part. For example, in experiment 4, if a bias appeared in the ratios of abnormal to normal flies in the populations of second-generation hybrids in accordance with the experimenters' expectations, the first thing to check would be a possible bias in the recording of data. This could be done by a third party, who would count the preserved flies 'blind', not knowing which sample was which. Perhaps this check would show that the entire experimenter effect could be explained in terms of biased counting. On the other hand, perhaps it would show that only a part of the bias was introduced at this stage; it might confirm that the proportions of normal and abnormal flies really had been altered. Then there would have to be a check on the possibility that the experimenters did not preserve and count all the second-generation flies, but only a selected sample which might have been biased. But if it turned out that all the flies had been preserved, then the alteration in the ratios would begin to look like a paranormal effect.
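[Illustrative sketch, not part of Sheldrake's text.] The 'blind' recount by a third party could be organised by replacing the sample labels with opaque codes before the preserved flies are handed over. In the sketch below, the code format, the function names, and the counts are all hypothetical; the key linking codes to labels would be held by someone other than the counter.

```python
import random

def blind_codes(sample_labels, seed=None):
    """Assign each sample an opaque code.

    Returns (codes_for_counter, key). The counter sees only the sorted
    codes; the key maps each code back to its original label.
    """
    rng = random.Random(seed)
    codes = [f"S{i:03d}" for i in range(1, len(sample_labels) + 1)]
    rng.shuffle(codes)
    key = dict(zip(codes, sample_labels))
    return sorted(key), key

def unblind(counts_by_code, key):
    """Sum the counter's tallies per original label once counting is finished."""
    totals = {}
    for code, count in counts_by_code.items():
        label = key[code]
        totals[label] = totals.get(label, 0) + count
    return totals

# Hypothetical bottles of preserved second-generation flies.
labels = ["enhancer", "enhancer", "inhibitor", "inhibitor"]
codes, key = blind_codes(labels, seed=3)

# Made-up abnormal-fly counts reported by the blind counter, one per code.
reported = {code: 10 for code in codes}
print(unblind(reported, key))
```

Sorting the codes before handing them over means their order carries no information about which sample is which.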

A new experiment would be needed to resolve this question. The second experiment would be a repetition of the first, except that the experimenters would see the hybrid flies being crossed, but not be allowed to handle the flies or the fly-bottles until the second generation of flies had matured and was ready for counting. The flies would be looked after by people who did not know what expectations were being tested. If the expectancy effects still showed up after experimenters had no normal means of influencing the breeding and development of the flies, then they could be inferred to result from some paranormal influence....

. . .Students spend many hours in laboratories doing practical classes, in which they perform standard experiments illustrating the prevailing paradigm. These experiments have 'correct' results, namely those that conform to a well-established pattern of expectation. Nevertheless, these are not always the results that students obtain. I have had years of experience teaching in undergraduate laboratory classes, and have often been amazed by the variation in the students' results. Of course, deviant data are immediately put down to mistakes and inexperience. And students who persistently fail to get their experiments to work correctly are not regarded as promising researchers. They fare badly in their practical examinations, and are unlikely to pursue a scientific career. By contrast, professional scientists have succeeded in a long process of training and selection, in the course of which they have proved their ability to get the expected results from standard experiments. Is this success simply a matter of practical competence? Or does it also involve a subtle and unconscious ability to bring about experimenter effects in accordance with orthodox expectations?"
