Longitudinal studies: concept and characteristics

* What is a longitudinal study

* Differences between longitudinal and life-table studies

* Bibliography

SUMMARY: In this review we explore the concept of the longitudinal study. Current epidemiology textbooks generally do not define it, whereas statistical textbooks do. It is more common to speak of “longitudinal data” than of “longitudinal studies”. A longitudinal study implies the existence of repeated measurements (more than two) during follow-up. It would therefore be a subtype of cohort study that, unlike life-table studies, allows inferences at the individual level and the analysis of changes in variables (exposures and outcomes) and of transitions between different states of health. The characteristics of this type of design mean that special attention must be paid to quality control during its execution, to losses to follow-up, and to missing data in some of the measurements. The analysis should take the repeated measurements into account, and this is what finally gives a longitudinal study its character.

Keywords: Longitudinal studies. Cohort studies.

ABSTRACT: Longitudinal Studies: concept and particularities

In this review the definition of “longitudinal study” is analysed. Most current textbooks on epidemiology do not define a longitudinal study, whereas statistical textbooks do. It is more common to talk about longitudinal data than about longitudinal studies. A longitudinal study implies the existence of repeated measurements (more than two) across follow-up. According to these ideas, a longitudinal study can be considered a subtype of cohort study that, in contrast with life-table cohort studies, allows inference at the subject level and the analysis of changes in variables (exposures and outcomes) and of transitions among different health states. The characteristics of this design mean that special attention must be paid to quality control during data collection, to losses during follow-up, and to missing data in some measurements. The statistical analysis should take repeated measures into account, and this is what finally gives the longitudinal character to a study with repeated measurements.

Key words: Longitudinal studies. Cohort studies.


The discussion about the meaning of the term longitudinal was summed up by Chinn [1] in 1989: for epidemiologists it is synonymous with cohort or follow-up study, while for some statisticians it implies repeated measurements [2]. Chinn decided not to define the term longitudinal, since it was difficult to find a concept acceptable to everyone, and chose to treat it as equivalent to “follow-up”, the most common understanding among professionals at the time.

The longitudinal study in epidemiology

In the 1980s it was very common to use the term longitudinal simply to separate cause from effect, in opposition to the term cross-sectional. Miettinen defined it as a study whose base is the experience of the population over time (as opposed to a cross-section of the population) [3]. Consistent with this idea, Rothman, in his 1986 text, indicated that the word longitudinal denotes the existence of a time interval between exposure and the onset of disease [4]. Under this meaning, the case-control study, which is a sampling strategy for representing the experience of the population over time (especially under Miettinen’s ideas), would also be a longitudinal study. Abramson agrees with this idea and, in addition, distinguishes descriptive longitudinal studies (studies of change) from analytical longitudinal studies, in which he includes case-control studies [5]. Kleinbaum et al. [6] likewise define the term longitudinal in opposition to cross-sectional, but with a somewhat different nuance: they speak of the “longitudinal experience” of a population (versus the “cross-sectional experience”), which for them means carrying out at least two sets of observations over a follow-up period. These authors exclude case-control studies. Kahn and Sempos [7] do not have a section for these studies either, and in their keyword index the entry “longitudinal study” reads “see prospective study”.

This is reflected in the Dictionary of Epidemiology edited by Last, which treats the term “longitudinal study” as synonymous with cohort study or follow-up study [8]. In the classic text by Breslow and Day on cohort studies, the term longitudinal is considered equivalent to cohort and the two are used interchangeably [9]. However, Cook and Ware [10] defined the longitudinal study as one in which the same individual is observed on more than one occasion, and distinguished it from follow-up studies, in which individuals are followed until the occurrence of an event such as death or illness (although this event is already the second observation).

Since 1990, several texts have treated the term longitudinal as equivalent to other names, although most omit it. This is reflected in the book co-edited by Rothman and Greenland [11], in which there is no specific section on longitudinal studies in the chapters devoted to design. The Encyclopedia of Epidemiologic Methods [12] follows the same trend and offers no specific entry for this type of study. The fourth edition of Last’s Dictionary of Epidemiology reproduces the entry from previous editions [13]. Gordis considers it synonymous with concurrent prospective cohort study [14]. Aday partially follows Abramson’s ideas, already mentioned, and distinguishes descriptive studies (several cross-sectional studies sequenced in time) from analytical ones, among which are prospective or longitudinal cohort studies [15].

In other fields of clinical medicine, longitudinal is understood as the opposite of cross-sectional and is equated with cohort, frequently prospective. This can be seen, for example, in publications in the field of menopause [16].

The longitudinal study in statistics

Here the ideas are much clearer: a longitudinal study is one that involves more than two measurements during follow-up. There must be more than two, since every cohort study has two measurements, one at the beginning and one at the end of follow-up. This is the concept found in the 1979 text by Goldstein mentioned above [2]. In the same year Rosner stated explicitly that longitudinal data involve repeated measurements on subjects over time, and proposed a new method of analysis for such data [17]. Since then, articles in statistical journals (for example, [18-22]) and textbooks [23-25] have been consistent in using the same concept.

Two reference works in epidemiology, although they do not define longitudinal studies in the corresponding section, agree with the prevailing statistical notion. In the book co-edited by Rothman and Greenland, in the chapter Introduction to regression modeling, Greenland himself states that longitudinal data are repeated measurements on subjects over a period of time and that they can be obtained for time-dependent exposures (e.g., smoking, alcohol consumption, diet or blood pressure) or recurrent outcomes (e.g., pain, allergy, depression, etc.) [26]. In the Encyclopedia of Epidemiologic Methods, the entry “sample size” includes a section on “longitudinal studies” that provides the same information as Greenland [27].

It should be clarified that the statistical view of the “longitudinal study” is based on a particular analysis of the data (one that takes the repeated measurements into account) and that the same would apply to intervention studies, which also have follow-up [28].

To conclude this section, in the special issue of Epidemiologic Reviews dedicated to cohort studies, Tager, in an article focused on the outcome variable of cohort studies, broadly classified cohort studies into two groups, “life table” and “longitudinal” [29], while clarifying that this classification is somewhat “artificial”. The former are the conventional ones, in which the outcome is a discrete variable, exposure and population-time are summarized, incidences are estimated, and the main measure is the relative risk. The latter incorporate a different analysis that takes advantage of the repeated measurements on subjects over time and allows inference, at the individual level as well as at the population level, about changes in a process over time or about transitions between different states of health and disease.

The above ideas show that in epidemiology there is a tendency to avoid the term longitudinal study. Nevertheless, summarizing the ideas discussed previously, the notion of longitudinal study refers to a cohort study in which more than two measurements are made over time and in which the analysis takes the different measurements into account. The three key elements are: follow-up, more than two measurements, and an analysis that takes them into account. This can be done either prospectively or retrospectively, and the study can be observational or an intervention study.

Differences between longitudinal and life-table studies

Table 1 summarizes the general characteristics of both types of design. Life-table cohort studies summarize exposure and disease occurrence in the groups being compared, for example the frequency of lung cancer in smokers and nonsmokers. The inference provided by these studies relates to population means. They imply the assumption that exposure acts consistently over time and has an effect per unit of time throughout follow-up, and they can provide only limited inference about the time dependence of the associations between exposure and effect [29]. An example of this type of study is the Nurses’ Health Study, with over 120,000 nurses in 11 U.S. states, in which cumulative exposure to oral contraceptives was assessed as a risk factor for breast cancer [30].

Longitudinal studies can at any time behave like life-table studies. In addition, they allow inferences at the individual level and make it possible to assess how a process changes over time and the transitions between different states of health and disease. An example, as prolific in publications as the Nurses’ Health Study, is the MACS (Multicenter AIDS Cohort Study), in which nearly 5,000 men were recruited in four U.S. cities [31]. When measuring a variable that changes with time, the design must take into account the duration of follow-up and the spacing of the measurements [32].


Particularities of longitudinal studies

When measurements are made over time, quality control plays an essential role. It must be ensured that all measurements are made at the right time and with standardized techniques. The long duration of these studies requires special attention to staff turnover, deterioration of equipment, changes in technology and inconsistencies in participants’ responses over time [33].

There is a higher probability of losses during follow-up. Several factors are involved [34]:

* The definition of a stable population as an eligibility criterion. For example, requiring residence in a particular geographic area may make participants who change address ineligible to continue.

* Losses will be higher when respondents who could not be contacted on one occasion are not approached again in subsequent phases of follow-up.

* The purpose of the study also has an influence; for example, in a political science study, those who are not interested in politics drop out more.

* The amount of personal attention devoted to respondents. Telephone and postal interviews are less personal than face-to-face interviews and do not help to strengthen ties with the study.

* The time the respondent must spend meeting the researchers’ information needs. The longer it is, the higher the frequency of dropouts.

* The frequency of contact may also have an influence, although not everyone agrees. Some studies have documented that an excess of follow-up contacts is harmful, while others have found either no relationship or a negative one.

To prevent dropouts, strategies should be developed to retain and track cohort members. Willingness to participate should be assessed at the outset, and participants should be told what is expected of them. Ties with participants should be built by sending greeting cards, study updates, etc. The frequency of contact should be regular. Study staff should be enthusiastic, communicate easily, respond quickly and appropriately to participants’ problems, and adapt to their needs. Incentives that encourage continuation in the study should not be neglected [35].

Third, another problem that is more serious than in other cohort studies is the existence of missing data. Requiring a participant to have every measurement made can cause a problem similar to that of dropouts during follow-up. To address this, techniques for the imputation of missing values have been developed. Although it has been suggested that they may not be necessary if generalized estimating equations (GEE analysis) are applied [36], it has been found that other methods work better, even when losses are completely at random [37]. Losses of information are often differential, with more measurements lost in patients with a worse level of health. In these cases it is recommended that imputation take into account the existing data of the very individual whose data are missing [38].
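As an illustration of imputing from the individual’s own data, the following is a minimal sketch and not the procedure evaluated in the cited references. It assumes that visit times are sorted in increasing order and that linear interpolation between a subject’s observed values is acceptable; the function name `impute_within_subject` and the blood pressure values are hypothetical.

```python
import numpy as np

def impute_within_subject(times, values):
    """Fill missing values (NaN) of ONE subject from that subject's own data.

    Interior gaps are linearly interpolated between the subject's observed
    measurements; gaps before the first or after the last observation are
    filled with the nearest observed value (np.interp clamps at the ends).
    Assumes `times` is sorted in increasing order.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    observed = ~np.isnan(values)
    if not observed.any():
        # Nothing to borrow from: leave the subject entirely missing.
        return values.copy()
    return np.interp(times, times[observed], values[observed])

# Hypothetical subject: systolic blood pressure at visits 0, 1, 2, 3 (years).
visits = [0, 1, 2, 3]
subject_a = [120.0, np.nan, 130.0, np.nan]
filled_a = impute_within_subject(visits, subject_a)
print(filled_a)  # [120. 125. 130. 130.]
```

Carrying the last value forward at the end of follow-up is a strong assumption when losses are differential, so a sketch like this is only a starting point for comparison with model-based imputation.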


In the analysis of longitudinal studies it is possible to handle time-dependent covariates that can both influence the exposure under study and be influenced by it (variables that behave simultaneously as confounders and as intermediates between exposure and effect). Similarly, it allows control of recurrent outcomes that can act on the exposure and be caused by it (they behave both as confounders and as effects) [26].

Longitudinal analysis can be used when there are measurements of the effect and/or the exposure at different points in time. Suppose that a dependent variable Y is a function of a variable X that changes over time (time-dependent) and of another variable Z that is stable over time (time-independent), studied in N subjects at K time points, which is expressed by the following equation [17]:

$$Y_{it} = a z_i + b x_{it} + e_{it}$$

where the subscript i refers to the individual, t to the point in time, and e is an error term (Z, being stable, does not change and therefore has only one subscript). Having several measurements makes it possible to estimate the coefficient b without knowing the value of the stable variable, by regressing the difference in the effect (Y) on the difference in the values of the independent variables:

$$Y_{it} - Y_{i1} = b (x_{it} - x_{i1}) + a (z_i - z_i) + e_{it} - e_{i1} = b (x_{it} - x_{i1}) + e_{it} - e_{i1}$$

That is, it is not necessary to know the value of the time-independent (stable) variables. This is an advantage over other analyses, in which these variables must be known. The above model is easily generalized to a multivariate vector of factors that change over time.
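The differencing argument can be checked numerically. The sketch below simulates data from the model with a stable covariate z and a time-dependent covariate x, then estimates b from differences with respect to the first measurement without ever using z. All numerical values (sample size, coefficients, error scale, the correlation between x and z) are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_times = 500, 4
a_true, b_true = 2.0, 0.5  # hypothetical coefficients

z = rng.normal(size=n_subjects)                         # stable covariate
x = rng.normal(size=(n_subjects, n_times)) + z[:, None]  # time-dependent, correlated with z
e = rng.normal(scale=0.1, size=(n_subjects, n_times))    # error term
y = a_true * z[:, None] + b_true * x + e                 # Y_it = a z_i + b x_it + e_it

# Differences with respect to the first measurement: the term a * z_i cancels.
dy = (y[:, 1:] - y[:, [0]]).ravel()
dx = (x[:, 1:] - x[:, [0]]).ravel()

# Least-squares slope through the origin of dy on dx estimates b without z.
b_diff = float(dx @ dy) / float(dx @ dx)

# For comparison: regressing y on x while ignoring z is confounded by z.
b_naive = float(np.polyfit(x.ravel(), y.ravel(), 1)[0])

print(f"difference estimate of b: {b_diff:.3f}")
print(f"naive estimate ignoring z: {b_naive:.3f}")
```

In this simulated setting the difference estimate should land close to the assumed value of b, while the naive estimate is pulled away from it because x is correlated with the unmeasured stable variable.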

Longitudinal analysis is carried out within the framework of generalized linear models and has two objectives: to adopt the conventional tools of regression, in which the effect is related to the different exposures, and to take into account the correlation among the measurements of each subject. This last point is very important. Suppose the effect of growth on blood pressure is being analysed: a subject’s blood pressure values at the successive examinations depend on the initial or baseline value, and this must therefore be taken into account.

For example, a longitudinal analysis could be performed in a cohort of children in which the main exposure assessed is vitamin A deficiency (which may change over time) in relation to the risk of infection (which may occur several times over time), controlling for the influence of age, weight and height (time-dependent variables). Longitudinal analyses can be classified into three main groups [39]:

a) Marginal models: these combine the different measurements (which are cross-sections in time) of the prevalence of the exposure to obtain a mean prevalence or another summary measure of exposure over time, and relate it to the frequency of the disease. The longitudinal element is age or duration of follow-up in the regression analysis. The coefficients of these models are transformed into a population prevalence ratio; in the example of vitamin A and infection, this would be the prevalence of infection in children with vitamin A deficiency divided by the prevalence of infection in children without vitamin A deficiency.

b) Transition models: these regress the present outcome on past values and on past and present exposures. Markov models are an example. The model coefficients are transformed directly into a ratio of rates, that is, a rate ratio (RR); in the example, this would be the RR of infection associated with vitamin A deficiency.

c) Random effects models: these allow each individual to have unique regression parameters, and there are procedures for normally distributed outcomes, binary outcomes and person-time data. The model coefficients are transformed into an odds ratio that refers to the individual and is assumed to be constant across the population; in the example, this would be the odds of infection in a child with vitamin A deficiency against the odds of infection in the same child without vitamin A deficiency.
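To make the difference between the population-level contrast in (a) and the transition-based contrast in (b) concrete, the sketch below computes crude, unmodelled analogues of both from a tiny hypothetical data set. It does not fit any of the regression models described above and does not cover the random effects case; the data, the helper `ratio_of_proportions` and the choice to condition on being infection-free at the previous visit are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy data: rows are children, columns are 3 successive visits.
deficient = np.array([[1, 1, 0],
                      [1, 0, 0],
                      [0, 0, 1],
                      [0, 0, 0],
                      [1, 1, 1],
                      [0, 1, 0]])
infected = np.array([[1, 0, 0],
                     [1, 0, 0],
                     [0, 0, 1],
                     [0, 1, 0],
                     [0, 1, 1],
                     [0, 0, 0]])

def ratio_of_proportions(exposure, outcome):
    """Proportion with the outcome among exposed divided by that among unexposed."""
    exposure = np.asarray(exposure).astype(bool)
    outcome = np.asarray(outcome)
    return outcome[exposure].mean() / outcome[~exposure].mean()

# (a) Marginal-style contrast: pool every child-visit as a cross-section and
# compare the prevalence of infection by deficiency status.
marginal_pr = ratio_of_proportions(deficient, infected)

# (b) Transition-style contrast: condition on the past outcome. Among children
# free of infection at the previous visit, compare the risk of infection at the
# current visit by current deficiency status.
at_risk = infected[:, :-1] == 0
transition_rr = ratio_of_proportions(deficient[:, 1:][at_risk],
                                     infected[:, 1:][at_risk])

print(f"pooled prevalence ratio: {marginal_pr:.2f}")
print(f"ratio among those infection-free at the previous visit: {transition_rr:.2f}")
```

The two ratios differ because the second one conditions on each child’s previous state, which is the defining feature of a transition analysis.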

Linear, logistic and Poisson models, and many survival analyses, can be considered special cases of generalized linear models. There are procedures that allow for late entries, or entries at different and unequal times, in the observation of a cohort.

Besides the parametric models mentioned in the previous paragraph, analysis is also possible using non-parametric methods; for example, the use of functional analysis with splines has been reviewed recently [40, 41].

Several texts devoted specifically to the analysis of longitudinal data have been mentioned. One of them even offers examples of routines for carrying out the analysis correctly with various conventional statistical packages (STATA, SAS, SPSS) [25].


Bibliography

1. Chinn S. Longitudinal studies: objectives and ethical considerations. Rev Epidemiol Sante Publique 1989; 37: 417-29.

2. Goldstein H. The design and analysis of longitudinal studies. London: Academic Press, 1979.

3. Miettinen OS. Theoretical Epidemiology: Principles of Occurrence Research in Medicine. New York: Wiley, 1985.

4. Rothman KJ. Modern epidemiology. Boston: Little, Brown, 1986.

5. Abramson JH. Classification of epidemiologic research. J Clin Epidemiol 1989; 42: 819-20.

6. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research. Belmont: Lifetime Learning Publications, 1982.

7. Kahn HA, Sempos CT. Statistical methods in epidemiology. New York: Oxford University Press, 1989.

8. Last JM. A dictionary of epidemiology. 2nd ed. New York: Oxford University Press, 1988.

9. Breslow NE, Day NE. Statistical methods for cancer research. Volume II: The design and analysis of cohort studies. Lyon: IARC Scientific Publications, 1987.

10. Cook NR, Ware JH. Design and analysis methods for longitudinal research. Annu Rev Public Health 1983; 4: 1-24.

11. Rothman KJ, Greenland S, editors. Modern Epidemiology. 2nd ed. Philadelphia: Lippincott-Raven, 1998.

12. Gail MH, Benichou J, editors. Encyclopedia of epidemiologic methods. Chichester: Wiley, 2000.

13. Last JM. A dictionary of epidemiology. 4th ed. New York: Oxford University Press, 2000.

14. Gordis L. Epidemiology. Philadelphia: Saunders, 1996. p. 119.

15. Aday LA. Designing and conducting health surveys. 2nd ed. San Francisco: Jossey-Bass Publishers, 1996. p. 29-30.

16. Collins A, Landgren B-M. Longitudinal research on the menopause: methodological challenges. Acta Obstet Gynecol Scand 2002; 81: 579-80.

17. Rosner B. The analysis of longitudinal data in epidemiologic studies. J Chron Dis 1979; 32: 163-73.

18. Louis TA. General methods for analyzing repeated measures. Stat Med 1988; 7: 29-45.

19. Ware JH, Lipsitz S. Issues in the analysis of repeated categorical outcomes. Stat Med 1988; 7: 95-107.

20. Landis JR, Miller ME. Some methods for the overall analysis of categorical data in longitudinal studies. Stat Med 1988; 7: 29-45.

21. Zeger SL, Liang KY. An overview of methods for the analysis of longitudinal data. Stat Med 1992; 11: 1825-39.

22. Carlin JB, Wolfe R, Coffey C, Patton GC. Analysis of binary outcomes in longitudinal studies using weighted estimating equations and discrete-time survival methods: prevalence and incidence of smoking in an adolescent cohort. Stat Med 1999; 18: 2655-79.

23. Dwyer JH, Feinleib M, Lippert P, Hoffmeister H. Statistical models for longitudinal studies of health. New York: Oxford University Press, 1992.

24. Diggle PJ, Heagerty P, Liang KY, Zeger SL. Analysis of longitudinal data. 2nd ed. Oxford: Oxford University Press, 2002.

25. Twisk JWR. Applied longitudinal data analysis for epidemiology. A practical guide. Cambridge: Cambridge University Press, 2003.

26. Greenland S. Introduction to regression modeling. In: Rothman KJ, Greenland S, editors. Modern Epidemiology. 2nd ed. Philadelphia: Lippincott-Raven, 1998. p. 359-432.

27. Liu G. Sample size for epidemiologic studies. In: Gail MH, Benichou J, editors. Encyclopedia of epidemiologic methods. Chichester: Wiley, 2000. p. 787-788.

28. Galbraith S, Marschner IC. Guidelines for the design of clinical trials with longitudinal outcomes. Controlled Clin Trials 2002; 23: 257-73.

29. Tager IB. Outcomes of cohort studies. Epidemiol Rev 1998; 20: 15-28.

30. Lipnick RJ, Buring JE, Hennekens CH, Rosner B, Willett W, Bain C, et al. Oral contraceptives and breast cancer. A prospective cohort study. JAMA 1986; 255: 58-61.

31. Munoz A, Kirby AJ, He YD. Long-term survivors with HIV-infection: incubation period and longitudinal patterns of CD4+ lymphocytes. J Acquir Immune Defic Syndr Hum Retrovirol 1995; 8: 496-505.

32. Schlesselman JJ. Planning a longitudinal study. II. Frequency of measurement and study duration. J Chron Dis 1973; 26: 561-70.

33. Whitney CW, Lind BK, Wahl PW. Quality assurance and quality control in longitudinal studies. Epidemiol Rev 1998; 20: 71-80.

34. Deeg DJH, van Tilburg T, Smit JH, de Leeuw ED. Attrition in the Longitudinal Aging Study Amsterdam: The effect of differential inclusion in side studies. J Clin Epidemiol 2002; 55: 319-28.

35. Hunt JR, White E. Retaining and tracking cohort study members. Epidemiol Rev 1998; 20: 57-70.

36. Twisk J, de Vente W. Attrition in longitudinal studies: how to deal with missing data. J Clin Epidemiol 2002; 55: 329-37.

37. Touloumi G, Babiker AG, Pocock SJ, Darbyshire JH. Impact of missing data due to drop-outs on estimators for rates of change in longitudinal studies: a simulation study. Stat Med 2001; 20: 3715-28.

38. Engels JM, Diehr P. Imputation of missing longitudinal data: a comparison of methods. J Clin Epidemiol 2003; 56: 968-76.

39. Samet JM, Munoz A. Evolution of the cohort studies. Epidemiol Rev 1998; 20: 1-14.

40. Guo W. Functional data analysis in longitudinal settings using smoothing splines. Stat Meth Med Res 2004; 13: 49-62.

41. Zhang H. Mixed effects multivariate adaptive splines model for the analysis of longitudinal and growth curve data. Stat Meth Med Res 2004; 13: 63-82.

Miguel Delgado Rodriguez (1) and Javier Llorca Diaz (2)

(1) University of Jaén

(2) University of Cantabria.

Correspondence: Miguel Delgado Rodriguez. University of Jaén. Building B-3. 23071 Jaén.