Cohort Study

Method Categorisation:
Quantitative - Qualitative
Deductive - Inductive
Individual - System - Global
Past - Present - Future

In short: A Cohort Study is a form of quantitative data analysis where two or more groups are tracked over time from a specific exposure to an outcome.

Background

SCOPUS hits for Cohort Study until 2019. Search terms: 'Cohort Study', 'Cohort Analysis' in Title, Abstract, Keywords. Source: own.

The term 'cohort' historically refers to a 300-600 man unit in the Roman army (3). It found its way into scientific research thanks to Wade Hampton Frost, a medical doctor from the US, who developed the cohort study method following his pioneering work in the field of epidemiology in the early 20th century. In a study published in 1939, after his death in 1938, he focused on the mortality rates of tuberculosis patients in Massachusetts and - presumably for the first time - assessed not only differences between age groups in any given year, but also the development of each cohort over time.

The method and the term 'cohort' were subsequently adopted by the social sciences, mostly focusing on the role of "generations" in historical developments as well as in demography (4). Social sciences and epidemiology remain the most important fields of application for Cohort Studies to this day (2). However, cohorts are no longer defined only by birth year, but by any shared characteristic of interest (e.g. the experience of a specific event) within a defined time interval (4). It is worth noting that the original term 'cohort analysis' today relates to a big data approach used for business analytics. The scientific method used in statistics and epidemiology today is referred to as 'cohort study' (2). More appropriate synonyms of the cohort study include incidence, longitudinal, forward-looking, follow-up, concurrent or prospective studies (3).

What the method does

"A cohort study tracks two or more groups forward from exposure to outcome. In its simplest form, a cohort study compares the experience of a group exposed to some factor with another group not exposed to the factor. If the former group has a higher or lower frequency of an outcome than the unexposed, then an association between exposure and outcome is evident." (3, p.341). The exposure must be clearly defined, with the option of defining several degrees of exposure (e.g. light and heavy smokers compared to non-smokers) (3). The cohorts may be tracked over years, but also over decades, depending on the research question.
A Cohort Study is a form of quantitative data analysis. There are two main types: prospective (following cohorts into the future with original data gathering) and retrospective (following cohorts up to the present based on existing data). A third type, ambidirectional, combines data from the past with data gathered from the present onwards, which is useful for studies that focus on both short-term and long-term outcomes of certain events or exposures (3).
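The comparison of graded exposure levels against an unexposed group can be sketched in a few lines of Python. This is only an illustration: the `Cohort` class, the `risk_ratios` helper and all counts are invented for this example and do not come from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """One group followed from exposure to outcome (hypothetical structure)."""
    name: str
    n_subjects: int  # people enrolled and followed
    n_outcomes: int  # people who developed the outcome during follow-up

    @property
    def risk(self) -> float:
        # Frequency of the outcome within the group
        return self.n_outcomes / self.n_subjects


def risk_ratios(exposed_groups, reference):
    """Compare each exposed group with the unexposed reference group."""
    return {g.name: g.risk / reference.risk for g in exposed_groups}


# Invented counts, for illustration only
non_smokers = Cohort("non-smokers", n_subjects=1000, n_outcomes=10)
light_smokers = Cohort("light smokers", n_subjects=500, n_outcomes=15)
heavy_smokers = Cohort("heavy smokers", n_subjects=400, n_outcomes=40)

ratios = risk_ratios([light_smokers, heavy_smokers], non_smokers)
for name, ratio in ratios.items():
    print(f"{name}: outcome is {ratio:.1f} times as frequent as in non-smokers")
```

A ratio above 1 indicates that the outcome is more frequent in the exposed group than in the unexposed one, a ratio below 1 that it is less frequent. A ratio that rises with the degree of exposure, as in these invented numbers, would point to a dose-response pattern.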

A table from Frost's original study. Source: Comstock 2001, p.9.

This table from Frost's original paper (see Key Publications) illustrates the mortality of tuberculosis patients (tuberculosis being both an infectious and a chronic disease), with the year of death in the columns and the age group of the patients in the rows. When analyzed per year, this table suggests the highest mortality among infants, followed by a peak in the 20-29 year group and another increase for the elderly. However, the death rates for one cohort - read diagonally over time - only indicate a high mortality for infants and 20-29 year olds, while the mortality rate decreases after that. By analyzing the temporal development of the cohort, Frost was able to reject the previous assumption that elderly people are especially at risk. Instead, he concluded that the high death rates in this age group are residuals of even higher death rates of this cohort in earlier years (1).
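The difference between reading such a table by column (one calendar year) and by diagonal (one cohort) can be made concrete with a small Python sketch. The numbers below are invented and deliberately simplified. They are not Frost's data, and the function names `cross_section` and `cohort_diagonal` are hypothetical. The sketch assumes that the width of each age group equals the spacing between the years, so that a cohort moves one row down for every column it moves to the right.

```python
# Rows: age groups; columns: calendar years (ten-year steps in both directions).
age_groups = ["0-9", "10-19", "20-29", "30-39"]
years = [1880, 1890, 1900, 1910]

# rates[i][j] = deaths per 100,000 in age group i during year j (invented values)
rates = [
    [600, 500, 400, 300],
    [150, 120, 100, 80],
    [450, 380, 300, 220],
    [480, 400, 310, 230],
]


def cross_section(table, year_idx):
    """Read one column: all age groups in a single calendar year."""
    return [row[year_idx] for row in table]


def cohort_diagonal(table, start_year_idx):
    """Follow the cohort that is in the youngest age group in the start year."""
    values = []
    for age_idx in range(len(table)):
        year_idx = start_year_idx + age_idx
        if year_idx >= len(table[0]):
            break  # the cohort has left the observed period
        values.append(table[age_idx][year_idx])
    return values


print("Year 1910, by age group:", cross_section(rates, years.index(1910)))
print("Cohort aged 0-9 in 1880:", cohort_diagonal(rates, years.index(1880)))
```

In this invented table, the column for 1910 shows the oldest group with a higher rate than the 20-29 group, whereas the diagonal for the cohort aged 0-9 in 1880 peaks at 20-29 and falls afterwards. This mirrors the kind of contrast described above, without reproducing the actual figures.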

Strengths & Challenges

Cohort studies are "(...) the best way to identify incidence and natural history of a disease, and can be used to examine multiple outcomes after a single exposure" (3, p.341). In addition, they are useful when studying rare exposures that happen in limited numbers of environments (e.g. a factory). They can reduce the risk of survivor bias when studying diseases that are rapidly fatal, and they allow for the calculation of incidence rates, relative risks, and confidence intervals (3, p.342).
However, Cohort Studies are less useful for diseases and outcomes that are very rare or take a long time to develop (3). Also, loss to follow-up can be an issue with prospective Cohort Studies, especially for longitudinal studies that span over years or decades. In addition, the exposure status of the subjects may change over time (3).
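As a rough illustration of the relative risks and confidence intervals mentioned above, the following Python sketch computes a relative risk for a two-group cohort and an approximate 95% confidence interval using the standard log-based (Katz) method. The function name `relative_risk` and the counts are assumptions made for this example. The approximation needs at least one outcome in each group and becomes unreliable for very small counts.

```python
import math


def relative_risk(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total, z=1.96):
    """Return (relative risk, lower bound, upper bound) of an approximate CI.

    z=1.96 corresponds to a 95% interval under a normal approximation
    on the log scale.
    """
    risk_exposed = exposed_cases / exposed_total        # cumulative incidence, exposed
    risk_unexposed = unexposed_cases / unexposed_total  # cumulative incidence, unexposed
    rr = risk_exposed / risk_unexposed

    # Standard error of ln(RR)
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Invented counts: 30 of 1000 exposed and 10 of 1000 unexposed develop the outcome
rr, lower, upper = relative_risk(30, 1000, 10, 1000)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

If the interval excludes 1, the association between exposure and outcome is unlikely to be due to chance alone at the chosen level. It says nothing about bias from loss to follow-up or changing exposure status, which have to be addressed in the study design.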

Normativity

Most often, it is difficult to find control groups that share the exact same characteristics as the analyzed group(s) except for the exposure. Attempts to compensate for this can lead to biases in the analysis (3).
Since several outcomes may be attributed to the original exposure, researchers might focus on those outcomes that are significant or supportive of a certain assumption, instead of transparently communicating all outcomes (3).
The increasing wealth of data that has become available in recent years through the internet and other sources has created many opportunities. However, some studies are now labelled cohort studies when in fact they only use available data and do not follow a designed sampling approach.

Key Publications

Frost, W.H. The age selection of mortality from tuberculosis in successive decades. Am J Hyg 1939; 30: 91-6. (Reprinted in Am J Epidemiol 1995; 141: 4-9).

The original paper from Frost, published posthumously, that popularized the method.

Power, C. Elliott, J. 2006. Cohort profile: 1958 British birth cohort (National Child Development Study). International Journal of Epidemiology 35. p.34-41.

A summary of the National Child Development Study, a British cohort study covering 17,000 children born in a single week in 1958, which followed these children's lives for several decades and investigated several health and social issues based on these datasets.

References

(1) Comstock, G.W. 2001. Cohort analysis: W.H. Frost's contributions to the epidemiology of tuberculosis and chronic disease. Sozial- und Präventivmedizin SPM 46. 7-12. Available at http://www.epi.msu.edu/janthony/requests/articles/Comstock_Cohort%20analysis%20Frost.pdf.

(2) Wikipedia. Cohort study. Available at https://en.wikipedia.org/wiki/Cohort_study.

(3) Grimes, D.A. Schulz, K.F. 2002. Cohort studies: marching towards outcomes. Lancet 359. p.341-345.

(4) Ryder, N.B. 1965. The Cohort As A Concept In The Study of Social Change. In: Mason, W.M. Fienberg, S. (eds.) 1985. Cohort Analysis in Social Research. Springer.

Further Information

Category:Statistics

The author of this entry is Christopher Franz.