(6) Epidemiology


In principle, randomized controlled trials are a superior instrument to observational studies, to be preferred whenever possible. In actual practice, however, they may prove more problematic. It may be relatively straightforward to test a new drug for the treatment of hypertension in hospitalized patients with a randomized trial, but doctors and patients may find it difficult to interpret the significance of its results when compared with those of another randomized trial testing a different drug. The first trial may have been on patients with a longer duration of hypertension than the second, one trial may have used a placebo as control while the other used another anti-hypertensive drug, and so on. It often happens that the trials were methodologically sound and addressed the same question, the treatment of hypertension, but in different ways that complicate the overall interpretation of the results and the task of doctors choosing a treatment for their patients.

There is even wider scope for this problem to arise with trials of interventions such as a screening programme in a large healthy population, which is much more complex and dependent on circumstances than just administering a drug to patients. Two trials testing the value of PSA (prostate-specific antigen), a potential early marker of prostate cancer, have recently provided diverging results. The trial from the United States showed no difference in mortality from prostate cancer between the men who underwent the planned screening programme and the unscreened (control) men.

One explanation of this disappointing result might be that a substantial proportion of the control group had spontaneously undergone an occasional PSA test; hence any difference between the screened and the control group might have been reduced to the point of becoming undetectable.

The trial from Europe in fact showed a reduction in mortality of about 20%. This, however, has to be weighed against the fact that of the 16% of men in whom the PSA test was positive, 3 out of 4 were found to have no cancer after undergoing a prostate biopsy, a procedure neither too pleasant nor totally exempt from complications such as infection or bleeding. The question of the possible beneficial effect of PSA screening for prostate cancer remains completely open.
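The proportions quoted for the European trial can be turned into counts per 1,000 screened men. The sketch below is only an arithmetic illustration: the cohort size of 1,000 and the function name `screening_outcomes` are assumptions, and it assumes that every man with a positive test goes on to biopsy.

```python
def screening_outcomes(n_screened, positive_fraction, no_cancer_share):
    """Split screened men into test-positives, biopsies without cancer, and cancers found.

    Assumes (for illustration only) that every test-positive man is biopsied.
    """
    positives = n_screened * positive_fraction
    biopsy_no_cancer = positives * no_cancer_share
    cancer_found = positives - biopsy_no_cancer
    return {
        "positive": positives,
        "biopsy_no_cancer": biopsy_no_cancer,
        "cancer_found": cancer_found,
        # Share of positive tests that correspond to a cancer
        "positive_predictive_value": cancer_found / positives,
    }


# 16% positive tests, 3 out of 4 of them without cancer at biopsy
result = screening_outcomes(n_screened=1000, positive_fraction=0.16, no_cancer_share=0.75)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, about 160 of every 1,000 screened men would test positive, roughly 120 would undergo a biopsy that finds no cancer, and roughly 40 would have a cancer detected.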

In the context of drug or vaccine testing, the randomized controlled trial is a `Phase 3' experimental study comparing the drug or vaccine to a reference treatment (placebo or other drug). `Phase 1' and `Phase 2' precede the randomized trial. In `Phase 1' experiments, the safety of the drug is usually explored by administering small but increasing doses to a limited number of volunteer subjects; data on absorption, distribution in the body, and elimination of the drug are also collected. Once a safe range of doses has been identified, `Phase 2' experiments provide initial information on whether the drug has some of the intended beneficial effects. Again, a limited number of subjects is studied using rapidly obtained responses: for example, in the cancer field the reduction of the mass of a tumor will signal some efficacy of the drug, although the most relevant effect, to be explored in `Phase 3' randomized trials, will be the patients' length of survival. These trials allow an evaluation of the efficacy of a treatment under ideal experimental conditions. When, however, the same treatment is applied in everyday practice, its actual effect, or effectiveness, will depend not only on its efficacy but also on how accurately the patients for whom it is indicated are identified, on how well they comply with the treatment, on whether they spontaneously resort to other treatments, and on a variety of other life circumstances.

Observing people without intervening with treatments

In randomized controlled trials, subjects treated differently are followed up over time to observe the effects of the intervention (treatment). People can, however, be observed over time even when no treatment was administered at the start of the study period. We first encountered this type of study when comparing rates of onset of diabetes in overweight and normal-weight people. These purely observational follow-up studies are called cohort studies. A group of subjects is chosen; a number of characteristics (exposures) of the subjects are measured and recorded, e.g. weight, blood pressure, diet, smoking habits. The subjects are then followed up in time, and a number of events are recorded, typically the occurrence of a disease, like diabetes or myocardial infarction, or death from a disease, or change in some trait like weight or blood pressure. A classical cohort study is the investigation of the health effects of tobacco in British doctors.

Cohort studies addressing possible short-term effects of exposures such as food poisoning may span days or weeks, while cohort studies investigating long-term effects, such as cancer or atherosclerosis, must necessarily last for decades, involving cumbersome logistics. As they unfold prospectively in the future, these studies are also known as prospective cohort studies or simply prospective studies. When all or some of the measurements made at the beginning are repeated over time, the study is often qualified as longitudinal.

In recent years, a number of large cohort studies have been started to investigate three types of open questions about common diseases: the long-term positive and negative effects of diet; the role played by genetic factors; the influence of early life experiences, in the maternal womb and during childhood, on adult diseases.

A prototype contemporary study: the EPIC international investigation

The EPIC (European Prospective Investigation into Cancer) is a project currently jointly coordinated by the International Agency for Research on Cancer in Lyon, France (a research centre of the World Health Organization) and the Department of Public Health of the Imperial College in London. It initially focused on cancers in relation to nutrition but was soon expanded to other chronic diseases, like diabetes and myocardial infarction, and to genetics and environmental factors. It started with several preliminary studies developing and testing questionnaires on habitual diet and moved to recruiting adults, mostly in the age range 35 to 70, in 23 centers in 10 West European countries (Denmark, France, Germany, Greece, Italy, the Netherlands, Norway, Spain, Sweden, United Kingdom). Some 520,000 people entered the study between 1992 and 2000. Each provided detailed information on diet, collected using comparable procedures in the different countries, and on other personal characteristics such as sex, age, education, alcohol and tobacco consumption, physical activity, reproductive history for women, and previous diseases. Height, weight, and waist and hip circumferences (as indicators of fat distribution) were measured. Blood was taken from about 385,000 subjects for storage at -196°C in freezers filled with liquid nitrogen.

At this temperature, all biochemical reactions taking place in blood are blocked and the specimens can be stored without alteration for years. The subjects are followed up, recording causes of death and the occurrence of cancers through permanent systems of cancer registration (Cancer Registries) where these exist or, for cancer as well as for other diseases, using physician or hospital records.

A wide range of specific studies is being conducted within the EPIC cohort and findings of major interest have already emerged. Nearly 15,000 deaths from any cause have been recorded and it has been shown that the distribution of fat in the body, in particular an increased deposit of fat in the abdomen translating into a large abdominal girth, predicts the risk of death.

Another study found that the risk of cancer of the intestine (colon and rectum) is associated with high consumption of red and processed meat. In a third study, the risk of breast cancer occurring after menopause was shown to be related to the level of both female and male hormones in the blood, a caution against the use of male hormones (like testosterone) that had been proposed for prevention of bone fragility in post-menopausal women.

Each of the blood specimens stored in the EPIC depository contains different fractions: serum and plasma, in which many biochemical compounds can be analyzed; envelopes of red cells, in which some substances like fatty acid molecules can be assayed; and most important, white blood cells that provide DNA for genetic studies. Genes are embodied as sequences of variable length of elementary molecules (nucleotides or bases, coupled as base-pairs) in the long molecules of DNA. These are in turn packed within the chromosomes included in a cell nucleus. Every human being receives 23 chromosomes from the father and 23 from the mother, each of these two sets containing more than three billion base-pairs. About 99% of these are common to all humans, but this leaves more than ten million nucleotides that can vary from one person to another. In these variations are hidden the differences in individual susceptibility to disease, a universe that has become accessible to direct exploration only recently, since the revolution in molecular biology and technology has made it possible first to measure and then to test the differences in the structure of individual nucleotides on a very large scale. Today, it is feasible to test a million nucleotides at once for variations in structure (form), in studies of `single nucleotide polymorphisms' (SNPs) that cover all chromosomes, i.e. the whole `genome'.

Within EPIC and in a fast-expanding number of other studies, associations are now sought between these genetic variations and disease, in the same way as associations of smoking with disease were investigated in the past. Testing hundreds of thousands or millions of associations, and understanding whether and in what way a gene variant causes a disease, involves three major challenges. First, it requires a very large number of cases of a disease, even beyond the numbers achievable in a project like EPIC: hence data from similar, albeit smaller, studies are combined with those of EPIC for `consortium' analyses. Second, testing a million associations increases the number of them that turn out to be statistically significant at the commonly adopted levels of 5% or 1% probability of error. With a million associations tested, a 5% level implies that 50,000 of them will appear as statistically significant merely by chance!

New methods of statistical analysis are being developed to keep this flurry of chance results under control. Third, and most complex, a single nucleotide variant will very rarely be found responsible `per se' for individual susceptibility to developing a common disease like colon cancer or myocardial infarction. The action of multiple genes is likely, and unravelling the puzzle of their cooperation, combined with the influences of external factors like diet, will require investigating the chains of events leading from the gene variants through gene expression into proteins and cellular functions, and finally to disease. For this task, resources like EPIC, which make possible not only the testing of genetic polymorphisms in DNA but also the assaying of biochemical components such as proteins in serum and plasma, represent a precious research instrument.
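The arithmetic behind the second challenge can be checked in a few lines. The sketch below assumes, for illustration, that none of the tested associations is real, so every `significant' result is a chance finding. The Bonferroni correction shown is a classic and deliberately simple remedy; it is named here as an example and is not necessarily one of the newer methods the text alludes to.

```python
def expected_chance_findings(n_tests, alpha):
    """Expected number of 'significant' results when no association is real."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha):
    """Per-test significance level that keeps the overall error level near alpha."""
    return alpha / n_tests


n_snps = 1_000_000  # one million nucleotide variants tested at once

for alpha in (0.05, 0.01):
    chance = expected_chance_findings(n_snps, alpha)
    threshold = bonferroni_threshold(n_snps, alpha)
    print(f"alpha={alpha}: about {chance:,.0f} chance findings; "
          f"Bonferroni per-test threshold {threshold:.1e}")
```

With a million tests the per-test threshold becomes extremely small, which is one reason why very large numbers of cases, as in the `consortium' analyses, are needed.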

Among studies broadly similar to EPIC, the British Biobank, which started recruitment of subjects aged 40-69 in 2007, targets a total of half a million. In Denmark, a `Danish National Birth Cohort' recruited nearly 100,000 pregnant women between 1997 and 2002 with the main objective of investigating how the period from conception to early childhood influences the health conditions of adult life. Both projects have a collection of blood specimens, as do several other studies at the advanced frontier of today's epidemiology, combining the study of genetic and external factors: dietary, occupational, environmental, and social. As these ongoing projects show, the timescale of epidemiology is often long and very different from the weeks, months, or few years taken by studies carried out in the laboratory using materials like cell cultures or experimental animals: the simple reason is that for finding out what happens and why it happens to people over a lifetime there is no real alternative to observing people over a lifetime.

The five key features of cohort studies

1. The choice of the population is crucial. Essentially the exposures that are the focus of investigation must be present and variable in intensity in the population; otherwise the study will be a waste of time and resources. There is little point in choosing a population where everybody eats similar diets for a study of diet and disease: for this reason, the EPIC investigation included a spectrum of countries from northern to southern Europe, where diets (still) exhibit sizeable differences. For the same reason, when investigating the possible health effects of an air pollutant like benzo-pyrene from heating, industrial, or vehicle exhausts, the first choice would be a population of, say, gas workers, some of whom were occupationally exposed at high levels in coal-firing, rather than a general urban population exposed to low and relatively uniform levels. A population may also be chosen because it shows a high frequency of a disease, for example liver cancer, or of a disease and an exposure, say liver cancer and hepatitis B. In this case, the purpose of the study is to find out whether the risk of liver cancer is indeed concentrated among people who had hepatitis. Cohorts of patients, for example those with chronic bronchitis, are special populations to be followed up in time after the first manifestations of the condition in order to understand the natural history of the disease development. This knowledge is indispensable to clinicians for formulating correct prognoses for individual patients.

2. The study design may include a cohort recruited in a single place, like the classic study in the small town of Framingham in Massachusetts that has provided fundamental information on the determinants of cardiovascular diseases, or several cohorts, as in the EPIC project. The number of subjects to be recruited should in any case be sufficient to detect with high probability the risk of disease associated with different levels of an exposure, e.g. of myocardial infarction with amount of fat in the diet. For this reason, studies of workplace hazards often combine populations of workers at several plants, each of which employs too few workers exposed to a particular hazard to permit a meaningful investigation. Usually the age range and the gender of the people to be included in the cohort are also specified. Should the people actually in the cohort be a random sample representative of the chosen population? Because comparisons are made between groups within the cohort, for instance between people eating different amounts of fat, this is not an absolute requirement (although if the proportion of people invited who refuse to enter the study is high, various types of biases may creep in).

3. The factors or exposures to be measured belong to two categories. First, those that can be measured at the moment of people's entry into the study: education, profession, blood pressure, blood glucose level, present smoking habits, and so on. Second, those that reflect past experience, recent or remote: lifetime smoking habits, past jobs, diet during the last week or month or year, and so on. A proper standardization of the methods of measurement can ensure the quality of measurements for the first category, but it cannot completely prevent errors of recollection for the second category.

4. Events such as disease occurrence or death are the typical responses to be recorded in most cohort studies. Mechanisms for tracing the people in the cohort are essential: a cohort study in which the percentage of subjects whose vital status (alive or dead) is unknown is higher than 5% or, at worst, 10% is usually regarded as of mediocre quality. Existing national or local systems of death registration and disease (e.g. cancer) registries are used for ascertaining both the status of a person and the disease diagnoses. When these systems are not in operation or are unreliable, an active follow-up mechanism has to be put in place, for example through a network of the subjects' doctors.

5. The analysis of a cohort study is straightforward. Incidence rates or risks of disease are computed for groups with different exposures and the relative rates or relative risks encountered are calculated to find out whether exposure-disease associations emerge. As many factors are at work, it will always be indispensable to adjust for several of them regarded as mere disturbances: for instance, when comparing the risk of lung cancer in people heavily and only slightly exposed to urban air pollution, it will be necessary to remove the influence of at least gender, age, and smoking habits. This can be done by the methods of logistic, Cox, or Poisson regression. These methods, easy to employ today thanks to user-friendly statistical computer packages, should not be applied blindly, lest one remove effects that should be left in. For example, in a study of the role of dietary salt in the causation of stroke it would be appropriate to remove the influence of other factors such as gender, age, tobacco smoking, blood cholesterol, and diabetes. On the other hand, it would be unwise to remove the influence of blood pressure, because the direct effect of salt is to increase blood pressure, which in turn influences stroke. Deciding which factors need to be adjusted for is specific to each study and requires careful consideration of the possible relationships between factors.
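The basic computation described under the fifth feature can be sketched as follows. All counts are invented for illustration. The text names regression methods for adjustment; the sketch instead uses the simpler Mantel-Haenszel stratified estimate, swapped in here only because it fits in a few lines and shows how a disturbing factor (here, hypothetically, smoking) can inflate a crude comparison.

```python
def incidence_rate(cases, person_years):
    """New cases divided by the person-time at risk."""
    return cases / person_years


def crude_rate_ratio(strata):
    """Rate ratio ignoring the stratifying factor (all strata pooled)."""
    exposed = incidence_rate(sum(s["exposed_cases"] for s in strata),
                             sum(s["exposed_py"] for s in strata))
    unexposed = incidence_rate(sum(s["unexposed_cases"] for s in strata),
                               sum(s["unexposed_py"] for s in strata))
    return exposed / unexposed


def mantel_haenszel_rate_ratio(strata):
    """Rate ratio adjusted for the stratifying factor (Mantel-Haenszel weights)."""
    numerator = denominator = 0.0
    for s in strata:
        total_py = s["exposed_py"] + s["unexposed_py"]
        numerator += s["exposed_cases"] * s["unexposed_py"] / total_py
        denominator += s["unexposed_cases"] * s["exposed_py"] / total_py
    return numerator / denominator


# Hypothetical lung cancer data: heavy versus slight air pollution exposure,
# stratified by smoking. Within each stratum the rate ratio is exactly 2.
strata = [
    {"name": "non-smokers", "exposed_cases": 2, "exposed_py": 2000,
     "unexposed_cases": 4, "unexposed_py": 8000},
    {"name": "smokers", "exposed_cases": 80, "exposed_py": 8000,
     "unexposed_cases": 10, "unexposed_py": 2000},
]

crude = crude_rate_ratio(strata)
adjusted = mantel_haenszel_rate_ratio(strata)
print(f"crude rate ratio:    {crude:.2f}")
print(f"adjusted rate ratio: {adjusted:.2f}")
```

In this invented example the heavily exposed group contains more smokers, so the crude rate ratio is close to 6 while the adjusted one is 2. Adjusting for a factor that lies on the causal pathway, like blood pressure in the salt and stroke example, would instead remove part of the very effect under study.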

The historical cohort study

Cohort studies are usually long-term investments (and people in the cohort may survive longer than the epidemiologists who initiate the study). A very advantageous short-cut, which has been used often in studying exposures in the workplace, is the `historical cohort study'. When records of employment are available, a cohort can be formed of all workers entering employment say between 1930 and 1950, who are then followed up to the present, establishing whether they are alive or dead through national or local registries, and in the latter case the date and cause of the death. This design allows calculation of rates and risks like any prospective cohort study, the only difference being that the cohort is followed up in the past rather than in the future.

An early example of a well-conducted historical cohort study is the 1913 German investigation of nearly 20,000 children born to tubercular parents and more than 7,000 children born to non-tubercular parents which showed that children of tubercular parents had shorter lives than children of non-tubercular parents and that their increased mortality was in addition related to the number of siblings and to lower social class.

A classic example is the 1968 study of asbestos insulation workers in the two states of New York and New Jersey who were present at the end of 1943 or subsequently enrolled until the end of 1962, and who were followed up until the end of April 1967. Their mortality from any cause was double that of men of the same age in the general population. For lung cancer, the mortality of the workers was eight times higher and in addition more than one-tenth of the deaths were due to mesothelioma, a malignant tumor of the linings of lung (pleura) or intestine (peritoneum) that is extremely rare in the general population. These findings clearly demonstrate the danger of asbestos, all the more so as the results for lung cancer were adjusted to remove the influence of the workers' smoking habits. The study went even one step further: it showed that in workers jointly exposed to asbestos and tobacco smoking, the risk of lung cancer was much increased by a reciprocal strengthening of their separate effects. The increase in lung cancer risk from asbestos was, as mentioned, about eightfold, and the increase in risk from smoking about twelvefold: the increase in risk arising from the combined exposure turned out to be close to 8 × 12 = 96-fold.

It was the first epidemiological evidence of how different factors can not only present as confounders of each other's effect within a study but can cooperate or `interact' to produce strong joint effects.
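To see why a 96-fold joint risk signals interaction, it helps to compare it with what would be expected if the two excess risks simply added up. The additive benchmark below is a standard epidemiological convention introduced here for comparison; it is not taken from the asbestos study itself, and the function names are illustrative.

```python
def joint_risk_if_additive(rr_a, rr_b):
    """Joint relative risk if the two excess risks simply add up."""
    return rr_a + rr_b - 1


def joint_risk_if_multiplicative(rr_a, rr_b):
    """Joint relative risk if each factor multiplies the effect of the other."""
    return rr_a * rr_b


rr_asbestos = 8   # about eightfold, as quoted in the text
rr_smoking = 12   # about twelvefold, as quoted in the text

print("additive expectation:      ", joint_risk_if_additive(rr_asbestos, rr_smoking))
print("multiplicative expectation:", joint_risk_if_multiplicative(rr_asbestos, rr_smoking))
```

The additive expectation is 19-fold, far below the roughly 96-fold risk quoted for workers exposed to both asbestos and tobacco, which is what the multiplicative model predicts.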

The same information for much less work and cost

Large long-term cohort studies, like the international EPIC or the birth cohorts, need massive amounts of information on diet, smoking habits, occupation, and many other factors collected on each member of the cohorts to be processed in statistical analyses, a task that does not pose insurmountable problems today thanks to the availability of software and computing facilities. Much less tractable are the problems arising from the need to carry out multiple laboratory analyses on the stored blood specimens of hundreds of thousands of people. These can, however, be overcome by using only the blood specimens from a fraction or representative `sample' of the cohort rather than from all its members (the same device is widely used for opinion polls). A sample that includes a number of subjects (not too small) can provide the same information as the whole cohort at a much reduced workload and expenditure.

For example, to investigate how the blood levels of sex hormones in 1997 influence the subsequent risk of breast cancer, advantage can be taken within the EPIC cohort of the breast cancer cases accumulated during the follow-up until 2008. All these cases, or a randomly selected subgroup, are included in the sample and for each of them one or more control women are extracted from the cohort at random.

Usually a more elaborate sampling plan is adopted, for example by picking at random a control belonging to the same country, centre, and age group as a case. With four controls per case, this `case-control' sampling design conveys essentially the same amount of information as the whole cohort and already with two controls per case the loss of information with respect to studying the whole cohort is minor. Hence it becomes possible to perform the hormone determinations only on the blood of, say, 2,000 cases and 4,000 controls, namely a total of 6,000 women instead of on the blood of the several hundred thousand women in the EPIC cohort. This type of approach has become common in recent years in cohort studies involving biochemical or genetic tests on stored specimens of blood or other biological materials like urine and hair. The approach, usually called case-control study within a cohort or nested case-control study, may also be advantageously used in every situation in which assessing an exposure is very cumbersome or costly. This might be the case when investigating whether low doses of ionizing radiation cause cancer, which requires determining the amount of radiation received by each member of a large cohort of nuclear reactor workers in the course of their entire working life. This detailed evaluation can be limited to the cases of cancer and to a number of controls picked at random from among the workers rather than extended to everyone in the cohort.
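The matched sampling plan described above can be sketched in code. This is a simplified illustration under stated assumptions: the record fields (`id`, `country`, `centre`, `age_group`), the function name `sample_matched_controls`, and the toy cohort are all hypothetical, controls are drawn only from non-cases, and the same person may serve as control for more than one case. Real studies also take account of when each case occurred during follow-up, which is not modelled here.

```python
import random
from collections import defaultdict


def matching_key(person):
    """Stratum used for matching: same country, centre, and age group."""
    return (person["country"], person["centre"], person["age_group"])


def sample_matched_controls(cohort, case_ids, controls_per_case, seed=0):
    """For each case, pick controls at random from non-cases in the same stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    case_ids = set(case_ids)

    # Group the non-cases by matching stratum
    pool = defaultdict(list)
    for person in cohort:
        if person["id"] not in case_ids:
            pool[matching_key(person)].append(person["id"])

    matched = {}
    for person in cohort:
        if person["id"] in case_ids:
            candidates = pool[matching_key(person)]
            k = min(controls_per_case, len(candidates))
            matched[person["id"]] = rng.sample(candidates, k)
    return matched


# Toy cohort of 20 women in two countries (hypothetical data)
cohort = [
    {"id": i, "country": "IT" if i % 2 else "FR", "centre": 1, "age_group": "50-59"}
    for i in range(1, 21)
]
cases = [1, 2]

matched = sample_matched_controls(cohort, cases, controls_per_case=4)
for case_id, control_ids in matched.items():
    print(f"case {case_id}: controls {sorted(control_ids)}")
```

Only the cases and their sampled controls then need laboratory work: in the text's example, 2,000 cases and 4,000 controls, or 6,000 women in total.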



