Experiments
An experiment is a systematic and reproducible design for testing specific hypotheses.
Francis Bacon laid the theoretical foundation for shifting previously largely unsystematic experiments into a more structured form. With the rise of the disciplines in the Enlightenment, experiments thrived, thanks in part to the increasing amount of resources available in Europe through the Victorian age and other effects of colonialism. Departing from the more observational studies in physics, astronomy, biology and other fields, experiments opened the door to the wide testing of hypotheses. All the while, Mill and others built on Bacon to lead the necessary basic debates about so-called facts, creating the theoretical basis for evaluating the merit of experiments. These systematic experimental approaches aided many fields such as botany, chemistry, zoology and physics. Even more importantly, these fields created a body of knowledge that kickstarted many new fields of research and solidified others. The value of systematic experiments, and consequently of systematic knowledge, created a direct link to the practical application of that knowledge. The scientific method (so called in ignorance of any methods other than systematic experimental hypothesis testing and standardisation in engineering) hence became the motor of both the late Enlightenment and industrialisation, forming a crucial link between the Enlightenment and modernity.
Due to the demand for systematic knowledge, some disciplines matured, meaning that dedicated departments were established, including the laboratory spaces needed to conduct experiments. The main focus was to conduct experiments that were as reproducible as possible, ideally with 100 % confidence. Laboratory conditions thus aimed at keeping conditions constant and manipulating ideally only one or a few parameters, which were then tested systematically. Necessary repetitions were conducted as well, but were of less importance at that point. Many of the early experiments were hence rather simple, but produced knowledge that was more generalisable. There was also a general tendency for experiments to either work or not, which remains a source of great confusion today, as a trial-and-error approach, despite being valid, is often confused with a general mode of “experimentation”. In this sense, many people consider repairing a bike without any knowledge about bikes whatsoever a mode of “experimentation”. We therefore highlight that experiments are systematic. The next big step was the provision of certainty and of ways to calculate uncertainty, which came with the rise of probability theory and statistics.
First in astronomy, but then also in agriculture and other fields, it became apparent that reproducible settings may sometimes be hard to achieve. Measurement error was a prevalent problem of optics and other apparatus in astronomy in the 18th and 19th centuries, and Fisher equally recognised the mess, or variance, that nature forces onto a systematic experimenter. The demand for more food due to the growing population, and the availability of potent seed varieties and fertiliser, both made possible by scientific experimentation, raised the question of how to conduct experiments under field conditions. Laboratory experiments reached their limits, as plant growth experiments were hard to conduct in the small, confined spaces of a laboratory, and it was questioned whether the results were actually applicable in the real world. Hence experiments literally shifted into fields, with a dramatic effect on their design, conduct and outcome. While laboratory conditions aimed to minimise variance, ideally conducting experiments with high confidence, the new field experiments increased sample size to tame the variability, or messiness, of factors that could not be controlled, such as subtle changes in the soil or microclimate.
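The effect of sample size on uncontrolled variability can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions: the function names, the "true" mean yield, and the size of the field variation are all hypothetical and chosen for illustration, and plot yields are assumed to be normally distributed.

```python
import random
import statistics


def simulate_plot_yields(n_plots, true_mean=5.0, field_sd=1.5, seed=0):
    """Draw hypothetical yields for n_plots field plots.

    field_sd stands in for uncontrolled variation such as soil or
    microclimate; all values are invented for illustration.
    """
    rng = random.Random(seed)
    return [rng.gauss(true_mean, field_sd) for _ in range(n_plots)]


def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5


if __name__ == "__main__":
    # Compare the uncertainty of the estimated mean for growing sample sizes.
    for n in (25, 100, 400):
        yields = simulate_plot_yields(n)
        print(n, round(statistics.mean(yields), 2), round(standard_error(yields), 3))
```

Under these assumptions the standard error shrinks roughly with the square root of the number of plots, so quadrupling the sample size about halves the uncertainty of the estimated mean, even though the variability between individual plots stays the same.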
Field experiments became a revolution for many scientific fields. The systematic testing of hypotheses first allowed agriculture and other fields of production to thrive, but medicine, psychology, ecology and even economics then also used experimental approaches to test specific questions. This systematic generation of knowledge triggered a revolution in science, as knowledge became increasingly specific and detailed. Take antibiotics, where a wide array of remedies was successively developed and tested. This triggered the cascading effects of antibiotic resistance, demanding new and updated versions to keep pace with bacteria that are likewise constantly evolving. This showcases that while the field experiment led to many positive developments, it also created ripples that are hard to anticipate. The problems created by fertiliser and GMOs are becoming more and more apparent, and integrating the diversity of knowledge became a novel challenge within science. The rising number of systematic experiments that provided comparable data led to the creation of meta-analyses, which integrate knowledge from the available studies to find overarching effects. For instance, several studies may show that ibuprofen works against headaches, while other studies are inconclusive. Integrating all studies may nevertheless show that the drug works against headaches overall, just not in all circumstances and with certain specific restrictions. Cochrane reviews are the gold standard in medicine, being the most established and thought-out procedure for integrating knowledge into meta-studies. Besides rigorous standards, this implies a specific set of statistical approaches that are able to take the diversity of cases and studies into account. Mixed effect models became a revolution because they allow us not only to investigate what we want to know, but also to take into account what we do not want to know.
A good example of this is sports and exercise in health studies. A drug under investigation may have certain positive impacts as a treatment, yet this impact may be less pronounced in people who are already healthy through daily exercise. Another example would be that we want to investigate whether a treatment works to heal patients, but we do not want to know whether it works better in one hospital than in another.
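The idea of integrating several studies while accounting for differences between them can be sketched with a random-effects meta-analysis. The code below uses the DerSimonian-Laird estimator, one common choice among several and not necessarily the procedure used in any particular review. The study, or hospital, is treated as a random effect: its individual deviation is not of interest in itself, but the variance between studies is taken into account when pooling. The function name and all effect sizes and variances are hypothetical and invented for illustration.

```python
import math


def pool_random_effects(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects method.

    effects:   one estimated treatment effect per study
    variances: the sampling variance of each estimate
    Returns (pooled_effect, standard_error, tau2), where tau2 is the
    estimated between-study variance.
    """
    k = len(effects)
    # Fixed-effect step: weight each study by its inverse variance.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures how much the studies disagree with each other.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects step: add tau2 to every study's variance and re-weight.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


if __name__ == "__main__":
    # Invented effect sizes from five hypothetical headache trials.
    effects = [0.40, 0.10, 0.55, 0.05, 0.30]
    variances = [0.02, 0.03, 0.04, 0.02, 0.05]
    pooled, se, tau2 = pool_random_effects(effects, variances)
    print(f"pooled effect {pooled:.3f}, standard error {se:.3f}, tau^2 {tau2:.3f}")
```

When the studies agree closely, the estimated between-study variance is zero and the result equals the simple inverse-variance average. When they disagree, the pooled estimate carries a larger standard error, which reflects that the effect does not hold equally in all circumstances.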
Another severe challenge that emerged out of the development of field experiments was an almost exactly opposite trend. What do we do with singular cases? How do we deal with cases that are of pronounced importance, yet cannot be replicated? A famous example from ethnographic studies is Easter Island. Why did the people there channel so much of their resources into building gigantic statues, thereby bringing their society to the brink of collapse? While this is surely an intriguing question, there are no replicates of Easter Island. This is a singular problem, and such settings are often referred to as Natural Experiments. From a certain perspective our whole planet is a Natural Experiment, and from a statistical perspective it is a problem that we do not have any replicates, besides other ramifications. Such singular cases are increasingly relevant on a smaller scale as well. With a rise of qualitative methods in both diversity and abundance, and an urge to understand even complex systems and cases, there is clearly a demand for the integration of knowledge from Natural Experiments. From a statistical viewpoint, such cases are difficult because they are not reproducible, yet the knowledge can still be relevant, plausible and valid. To this end, I propose the concept of the niche in order to illustrate and conceptualise how single cases can still contribute to the production of knowledge. An example is the financial crisis of 2009, where many patterns were comparable to previous crises, but other factors were different. Hence this crisis shares many factors and patterns with previous crises regarding some layers of information, but is also novel and not transferable regarding other dynamics.
Real-world experiments are the latest development in the diversification of the arena of experiments. These types of experiments are currently widely explored in the literature, yet I do not recognise a coherent understanding of what real-world experiments are to date. They can, however, be seen as a continuation of the trend of natural experiments, where a solution-oriented agenda tries to generate one or several interventions, the effects of which are often tested within singular cases, but with evaluation criteria that are clear before the study is conducted. Most studies to date have defined this with such vigour, but the development of real-world experiments is only starting to emerge. Since this is only partly relevant for statistics, we will not elaborate further here, but point to the available literature.
Taken together, experiments can be powerful tools for testing specific hypotheses. Experiments should be systematic, but have evolved over time to investigate field conditions as well as small samples. The coming years and decades will show how the world of big data and the growing information from the interaction between science and society will shape and evolve the scientific experiment. Systematic experiments will probably remain a backbone of systematic knowledge production.