
Health is everybody's natural concern, and an everyday theme in the media. Outbreaks of disease such as the most recent influenza, occurring in many countries at the same time, make front-page news. Beyond epidemics, novel findings on dangerous pollutants in the environment, substances in food which prevent cancer, genes predisposing to disease or drugs promising to wipe them out, are reported regularly. Their actual relevance for human health depends crucially on the accumulation of evidence from studies, guided by the principles of epidemiology that directly observe and evaluate what happens in human populations and groups.

These studies combine two features. They explore health and disease with the instruments of medical research, ranging from records of medical histories to measures of height, weight, blood pressure to a wide variety of diagnostic tests and procedures. At the same time they involve individuals living in society exposed to a multitude of influences, and they cannot be conducted in the isolated and fully controlled conditions of laboratory experiments. Their design, conduct and analysis require, instead, the methods of statistics and of social sciences such as demography, the quantitative study of human populations. Without a clear understanding of this composite nature of epidemiology and of its reasoning in terms of probability and statistics it is hard to appreciate the strengths and weaknesses of the scientific evidence relevant to medicine and public health that epidemiology keeps producing. It is not only among the general public that a woolly appreciation or even a frank misreading of epidemiology often surfaces, for instance in debates on risks or on the merit, real or imagined, of a disease treatment. In my experience the same may occur with journalists, members of ethics committees, health administrators, health policy makers and even with experts in disciplines other than epidemiology responsible for evaluating and funding research projects.

(6) Epidemiology

WORLD HEALTH ORGANIZATION - TOBACCO - Leading cause of death, illness and impoverishment

Smoking In Asia: A Looming Health Epidemic


Stopping smoking at any age prolongs your life

In principle, randomized controlled trials are a superior instrument to observational studies, to be preferred whenever possible. In actual practice, however, they may prove more problematic. It may be relatively straightforward to test a new drug for the treatment of hypertension in hospitalized patients with a randomized trial, but interpreting what its results mean for doctors and patients, compared with those of another randomized trial testing a different drug, may be difficult. The first trial may have been on patients with a longer duration of hypertension than the second, one trial may have used a placebo as control while the other used another anti-hypertensive drug, and so on. It often happens that the trials are methodologically sound and address the same question, the treatment of hypertension, but in different ways that complicate the overall interpretation of the results and the task of doctors choosing a treatment for their patients.

There is even wider scope for this problem to arise with trials of interventions such as a screening programme in a large healthy population, which are much more complex and dependent on circumstances than just administering a drug to patients. Two trials testing the value of PSA (prostate-specific antigen), a potential early marker of prostate cancer, have recently provided diverging results. The trial from the United States has shown no difference in mortality from prostate cancer between the men who underwent the planned screening programme and the unscreened (control) men.

One explanation of this disappointing result might be that a substantial proportion of the control group had spontaneously undergone an occasional PSA test; hence any difference between the screened and the control group might have been reduced to the point of becoming undetectable.

The trial from Europe, by contrast, showed a reduction in mortality of about 20%. This, however, has to be weighed against the fact that of the 16% of men in whom the PSA test was positive, 3 out of 4 were found to have no cancer after undergoing a prostate biopsy, a procedure neither too pleasant nor totally exempt from complications such as infection or bleeding. The question of the possible beneficial effect of PSA screening for prostate cancer remains completely open.
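The trade-off described above can be made concrete with a little arithmetic. The sketch below takes the two figures quoted in the text (16% of men test positive; 3 out of 4 positives turn out to have no cancer at biopsy) and applies them to an illustrative group of 1,000 screened men. The group size, the function name, and the assumption that every positive test leads to a biopsy are ours, not the trial's.

```python
def screening_summary(n_screened, positive_fraction, no_cancer_share):
    """Split screened men into test-positives with and without cancer.

    positive_fraction: share of screened men with a positive PSA test.
    no_cancer_share:   share of test-positive men found cancer-free at biopsy.
    Assumes (for illustration only) that every positive test is followed by a biopsy.
    """
    positives = n_screened * positive_fraction
    without_cancer = positives * no_cancer_share
    with_cancer = positives - without_cancer
    # Positive predictive value: chance that a positive test reflects a real cancer.
    ppv = with_cancer / positives
    return positives, without_cancer, with_cancer, ppv


positives, without_cancer, with_cancer, ppv = screening_summary(1000, 0.16, 0.75)
print(f"positive tests:           {positives:.0f}")
print(f"biopsies with no cancer:  {without_cancer:.0f}")
print(f"biopsies with cancer:     {with_cancer:.0f}")
print(f"positive predictive value: {ppv:.0%}")
```

Under these figures, roughly three biopsies are performed on cancer-free men for every cancer found, which is the cost to be weighed against the mortality reduction.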

In the context of drug or vaccine testing, the randomized controlled trial is a `Phase 3' experimental study comparing the drug or vaccine to a reference treatment (placebo or other drug). `Phase 1' and `Phase 2' precede the randomized trial. In `Phase 1' experiments, the safety of the drug is usually explored by administering small but increasing doses to a limited number of volunteer subjects; data on absorption, distribution in the body, and elimination of the drug are also collected. Once a safe range of doses has been identified, `Phase 2' experiments provide initial information on whether the drug has some of the intended beneficial effects. Again, a limited number of subjects is studied using rapidly obtained responses: for example, in the cancer field the reduction of the mass of a tumor will signal some efficacy of the drug, although the most relevant effect, to be explored in `Phase 3' randomized trials, will be the patients' length of survival. These trials allow an evaluation of the efficacy of a treatment under ideal experimental conditions. When, however, the same treatment is applied in everyday practice, its actual effect, or effectiveness, will depend not only on its efficacy but also on how accurately the patients for whom it is indicated are identified, on how well they comply with the treatment, on whether they spontaneously resort to other treatments and on a variety of other life circumstances.

Observing people without intervening with treatments

In randomized controlled trials, subjects treated differently are followed up over time to observe the effects of the intervention (treatment). People can, however, be observed over time even when no treatment was administered at the start of the study period. We first encountered this type of study when comparing rates of onset of diabetes in overweight and normal-weight people. These purely observational follow-up studies are called cohort studies. A group of subjects is chosen; a number of characteristics (exposures) of the subjects are measured and recorded, e.g. weight, blood pressure, diet, smoking habits. The subjects are then followed up in time, and a number of events recorded, typically the occurrence of a disease, like diabetes or myocardial infarction, or death from a disease or change in some trait like weight or blood pressure. A classical cohort study is the investigation of the health effects of tobacco in British doctors.

Cohort studies addressing possible short-term effects of exposures such as food poisoning may span days or weeks, while cohort studies investigating long-term effects, such as cancer or atherosclerosis, must necessarily last for decades, involving cumbersome logistics. As they unfold prospectively in the future, these studies are also known as prospective cohort studies or simply prospective studies. When all or some of the measurements made at the beginning are repeated over time, the study is often qualified as longitudinal.

In recent years, a number of large cohort studies have been started to investigate three types of open questions about common diseases: the long-term positive and negative effects of diet; the role played by genetic factors; the influence of early life experiences, in the maternal womb and during childhood, on adult diseases.

A prototype contemporary study: the EPIC international investigation

EPIC (the European Prospective Investigation into Cancer) is a project currently jointly coordinated by the International Agency for Research on Cancer in Lyon, France (a research centre of the World Health Organization) and the Department of Public Health of the Imperial College in London. It initially focused on cancers in relation to nutrition but was soon expanded to other chronic diseases, like diabetes and myocardial infarction, and to genetics and environmental factors. It started with several preliminary studies developing and testing questionnaires on habitual diet and moved to recruiting adults, mostly in the age range 35 to 70, in 23 centers in 10 West European countries (Denmark, France, Germany, Greece, Italy, the Netherlands, Norway, Spain, Sweden, United Kingdom). Some 520,000 people entered the study between 1992 and 2000. Each provided detailed information on diet, collected using comparable procedures in the different countries, and on other personal characteristics such as sex, age, education, alcohol and tobacco consumption, physical activity, reproductive history for women, and previous diseases. Height, weight, and waist and hip circumferences (as indicators of fat distribution) were measured. Blood was taken from about 385,000 subjects for storage at -196°C in freezers filled with liquid nitrogen.

At this temperature, all biochemical reactions taking place in blood are blocked and the specimens can be stored without alteration for years. The subjects are followed up, recording causes of death and the occurrence of cancers through permanent systems of cancer registration (Cancer Registries) where these exist or, for cancer as well as for other diseases, using physician or hospital records.

A wide range of specific studies is being conducted within the EPIC cohort and findings of major interest have already emerged. Nearly 15,000 deaths from any cause have been recorded and it has been shown that the distribution of fat in the body, in particular an increased deposit of fat in the abdomen translating into a large abdominal girth, predicts the risk of death.

Another study found that the risk of cancer of the intestine (colon and rectum) is associated with high consumption of red and processed meat. In a third study, the risk of breast cancer occurring after menopause was shown to be related to the level of both female and male hormones in the blood, a caution against the use of male hormones (like testosterone) that had been proposed for prevention of bone fragility in post-menopausal women.

Each of the blood specimens stored in the EPIC depository contains different fractions: serum and plasma, in which many biochemical compounds can be analyzed; envelopes of red cells, in which some substances like fatty acid molecules can be assayed; and most important, white blood cells that provide DNA for genetic studies. Genes are embodied as sequences of variable length of elementary molecules (nucleotides or bases, coupled as base-pairs) in the long molecules of DNA. These are in turn packed within the chromosomes included in a cell nucleus. Every human being receives 23 chromosomes from the father and 23 from the mother, each of these two sets containing more than three billion base-pairs. About 99% of these are common to all humans, but this leaves more than ten million nucleotides that can vary from one person to another. In these variations are hidden the differences in individual susceptibility to disease, a universe that has become accessible to direct exploration only recently, since the revolution in molecular biology and technology has made it possible first to measure and then to test the differences in the structure of individual nucleotides on a very large scale. Today, it is feasible to test a million nucleotides at once for variations in structure (form), in studies of `single nucleotide polymorphisms' (SNPs) that cover all chromosomes, i.e. the whole `genome'.

Within EPIC and in a fast-expanding number of other studies, associations are now sought between these genetic variations and disease, in the same way as associations of smoking with disease were investigated in the past. Testing hundreds of thousands or millions of associations, and understanding whether and in what way a gene variant causes a disease, involves three major challenges. First, it requires a very large number of cases of a disease, even beyond the numbers achievable in a project like EPIC: hence data from similar, albeit smaller, studies are combined with those of EPIC for `consortium' analyses. Second, testing a million associations inflates the number that turn out to be statistically significant by chance alone at the commonly adopted levels of 5% or 1% probability of error. A 5% level implies that, even if none of the million associations were real, about 50,000 would still appear as statistically significant merely by chance!

New methods of statistical analysis are being developed to keep this flurry of chance results under control. Third, and most complex, a single nucleotide variant will very rarely be found responsible `per se' for individual susceptibility to developing a common disease like colon cancer or myocardial infarction. The action of multiple genes is likely and to unravel the puzzle of their cooperation combined with the influences of external factors like diet will require the investigation of the chains of events leading from the gene variants to gene expression into proteins, cellular functions, and finally to disease. For this task, resources like EPIC that make possible not only the testing of genetic polymorphisms in DNA but also the assaying of biochemical components such as proteins in serum and plasma represent a precious research instrument.
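The arithmetic behind the second challenge above is short enough to write out. The sketch below reproduces the 50,000 figure and then shows one simple, classical way of tightening the threshold, the Bonferroni correction. Bonferroni is used here only as a familiar illustration; the text does not say which newer methods are actually applied in EPIC or elsewhere.

```python
n_tests = 1_000_000   # nucleotide variants tested, as in the text
alpha = 0.05          # conventional 5% significance level

# If no variant were truly associated with the disease, each test would still
# have a 5% chance of crossing the threshold.
expected_false_positives = n_tests * alpha

# Bonferroni correction (illustrative choice): divide the level by the number
# of tests so that the chance of even one false positive stays near 5%.
bonferroni_threshold = alpha / n_tests

print(f"expected chance findings at 5%: {expected_false_positives:,.0f}")
print(f"per-test threshold after Bonferroni: {bonferroni_threshold:.1e}")
```

The corrected per-test threshold is far stricter than 5%, which is one reason why very large numbers of cases are needed in such studies.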

Among studies broadly similar to EPIC, the British Biobank, which started recruiting subjects aged 40-69 in 2007, targets a total of half a million. In Denmark, the `Danish National Birth Cohort' recruited nearly 100,000 pregnant women between 1997 and 2002 with the main objective of investigating how the period from conception to early childhood influences the health conditions of adult life. Both projects have a collection of blood specimens, as do several other studies at the advanced frontier of today's epidemiology, combining the study of genetic and external factors (dietary, occupational, environmental, and social). As these ongoing projects show, the timescale of epidemiology is often long and very different from the weeks, months, or few years taken by studies carried out in the laboratory using materials like cell cultures or experimental animals: the simple reason is that for finding out what happens and why it happens to people over a lifetime there is no real alternative to observing people over a lifetime.

The five key features of cohort studies

1. The choice of the population is crucial. Essentially the exposures that are the focus of investigation must be present and variable in intensity in the population; otherwise the study will be a waste of time and resources. There is little point in choosing a population where everybody eats similar diets for a study of diet and disease: for this reason, the EPIC investigation included a spectrum of countries from northern to southern Europe, where diets (still) exhibit sizeable differences. For the same reason, when investigating the possible health effects of an air pollutant like benzo-pyrene from heating, industrial, or vehicle exhausts, the first choice would be a population of, say, gas workers, some of whom were occupationally exposed at high levels in coal-firing, rather than a general urban population exposed to low and relatively uniform levels. A population may also be chosen because it shows a high frequency of a disease, for example liver cancer, or of a disease and an exposure, say liver cancer and hepatitis B. In this case, the purpose of the study is to find out whether the risk of liver cancer is indeed concentrated among people who had hepatitis. Cohorts of patients, for example those with chronic bronchitis, are special populations to be followed up in time after the first manifestations of the condition in order to understand the natural history of the disease development. This knowledge is indispensable to clinicians for formulating correct prognoses for individual patients.

2. The study design may include a cohort recruited in a single place, like the classic study in the small town of Framingham in Massachusetts that has provided fundamental information on the determinants of cardiovascular diseases, or several cohorts, as in the EPIC project. The number of subjects to be recruited should in any case be sufficient to detect with high probability the risk of disease associated with different levels of an exposure, e.g. of myocardial infarction with amount of fat in the diet. For this reason, studies of workplace hazards often combine populations of workers at several plants, each of which employs too few workers exposed to a particular hazard to permit a meaningful investigation. Usually the age range and the gender of the people to be included in the cohort are also specified. Should the people actually in the cohort be a random sample representative of the chosen population? Because comparisons are made between groups within the cohort, for instance between people eating different amounts of fat, this is not an absolute requirement (although if the proportion of people invited who refuse to enter the study is high, various types of biases may creep in).

3. The factors or exposures to be measured belong to two categories. First, those that can be measured at the moment of people's entry into the study: education, profession, blood pressure, blood glucose level, present smoking habits, and so on. Second, those that reflect past experience, recent or remote: lifetime smoking habits, past jobs, diet during the last week or month or year, and so on. A proper standardization of the methods of measurement can ensure the quality of measurements for the first category, but it cannot completely prevent errors of recollection for the second category.

4. Events such as disease occurrence or death are the typical responses to be recorded in most cohort studies. Mechanisms for tracing the people in the cohort are essential: a cohort study in which the percentage of subjects of whom it is unknown whether they are still alive or dead is higher than 5% or, at worst, 10% is usually regarded as of mediocre quality. Existing national or local systems of death registration and disease (e.g. cancer) registries are used both for ascertaining the status of a person and the disease diagnoses. When these systems are not in operation or are unreliable an active follow-up mechanism has to be put in place, for example through a network of the subjects' doctors.

5. The analysis of a cohort study is straightforward. Incidence rates or risks of disease are computed for groups with different exposures, and the relative rates or relative risks are calculated to find out whether exposure-disease associations emerge. As many factors are at work, it will always be indispensable to adjust for several of them regarded as mere disturbances: for instance, when comparing the risk of lung cancer in people heavily and only slightly exposed to urban air pollution, it will be necessary to remove the influence of at least gender, age, and smoking habits. This can be done by the methods of logistic, Cox's, or Poisson regression. These methods, easy to employ today thanks to user-friendly statistical computer packages, should not be applied blindly, lest one remove effects that should be left in. For example, in a study of the role of dietary salt in the causation of stroke it would be appropriate to remove the influence of other factors such as gender, age, tobacco smoking, blood cholesterol, diabetes.

On the other hand, it would be unwise to remove the influence of blood pressure because the direct effect of salt is to increase blood pressure which in turn influences stroke. Deciding which factors need to be adjusted for is specific to each study and requires careful consideration of the possible relationships between factors.
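The adjustment described in feature 5 can be sketched with the air pollution example. The code below does not use the regression methods named in the text; it swaps in a simpler classical technique, the Mantel-Haenszel stratified risk ratio, which makes the logic of "removing the influence" of smoking easy to follow. All counts are invented for illustration, and the function names are ours.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Crude relative risk: risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def mantel_haenszel_risk_ratio(strata):
    """Risk ratio adjusted for a stratifying factor (Mantel-Haenszel weights).

    Each stratum is (cases_exposed, n_exposed, cases_unexposed, n_unexposed).
    """
    numerator = 0.0
    denominator = 0.0
    for cases_exp, n_exp, cases_unexp, n_unexp in strata:
        total = n_exp + n_unexp
        numerator += cases_exp * n_unexp / total
        denominator += cases_unexp * n_exp / total
    return numerator / denominator


# Hypothetical cohort: lung cancer in people heavily vs slightly exposed to
# air pollution, split by smoking. Within each smoking stratum the risk is
# identical in both exposure groups, but the heavily exposed smoke more.
strata = {
    "smokers":     (80, 800, 20, 200),
    "non-smokers": (2, 200, 8, 800),
}

cases_exp = sum(s[0] for s in strata.values())
n_exp = sum(s[1] for s in strata.values())
cases_unexp = sum(s[2] for s in strata.values())
n_unexp = sum(s[3] for s in strata.values())

crude = risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp)
adjusted = mantel_haenszel_risk_ratio(strata.values())

print(f"crude relative risk:    {crude:.2f}")
print(f"adjusted for smoking:   {adjusted:.2f}")
```

In this invented data set the crude comparison suggests that pollution nearly triples the risk, while the smoking-adjusted figure shows no effect at all. The same mechanics would wrongly erase a real effect if applied to an intermediate factor such as blood pressure in the salt and stroke example.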

The historical cohort study

Cohort studies are usually long-term investments (and people in the cohort may survive longer than the epidemiologists who initiate the study). A very advantageous short-cut, which has been used often in studying exposures in the workplace, is the `historical cohort study'. When records of employment are available, a cohort can be formed of all workers entering employment say between 1930 and 1950, who are then followed up to the present, establishing whether they are alive or dead through national or local registries, and in the latter case the date and cause of the death. This design allows calculation of rates and risks like any prospective cohort study, the only difference being that the cohort is followed up in the past rather than in the future.

An early example of a well-conducted historical cohort study is the 1913 German investigation of nearly 20,000 children born to tubercular parents and more than 7,000 children born to non-tubercular parents which showed that children of tubercular parents had shorter lives than children of non-tubercular parents and that their increased mortality was in addition related to the number of siblings and to lower social class.

A classic example is the 1968 study of asbestos insulation workers in the two states of New York and New Jersey present at the end of 1943 or subsequently enrolled until the end of 1962 and followed up until the end of April 1967. Their mortality from any cause was double that of men of the same age in the general population. For lung cancer, the mortality of the workers was eight times higher and in addition more than one-tenth of the deaths were due to mesothelioma, a malignant tumor of the linings of lung (pleura) or intestine (peritoneum) that is extremely rare in the general population. These findings clearly demonstrated the danger of asbestos, all the more so as the results for lung cancer were adjusted to remove the influence of the workers' smoking habits. The study went even one step further: it showed that in workers jointly exposed to asbestos and tobacco smoking, the risk of lung cancer was much increased by a reciprocal strengthening of their separate effects. The increase in lung cancer risk from asbestos was, as mentioned, about eightfold, and the increase in risk from smoking about twelvefold: the increase in risk arising from the combined exposure turned out to be close to 8 x 12 = 96-fold.

It was the first epidemiological evidence of how different factors can not only present as confounders of each other's effect within a study but can cooperate or `interact' to produce strong joint effects.
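To see why a 96-fold increase signals interaction rather than two effects simply piling up, it helps to compare it with what plain addition of the excess risks would give. The sketch below uses the two relative risks quoted in the text; the additive benchmark (sum of the excess risks over a baseline of 1) is a standard comparison that we introduce here for illustration.

```python
rr_asbestos = 8.0   # relative risk of lung cancer from asbestos alone (from the text)
rr_smoking = 12.0   # relative risk from smoking alone (from the text)

# If the two exposures merely added their excess risks to the baseline of 1:
additive_expectation = 1 + (rr_asbestos - 1) + (rr_smoking - 1)

# If each exposure multiplies the risk left by the other:
multiplicative_expectation = rr_asbestos * rr_smoking

print(f"joint relative risk if effects add:      {additive_expectation:.0f}")
print(f"joint relative risk if effects multiply: {multiplicative_expectation:.0f}")
```

The joint risk reported for the insulation workers sits close to the multiplicative figure and far above the additive one.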

The same information for much less work and cost

Large long-term cohort studies, like the international EPIC or the birth cohorts, need massive amounts of information on diet, smoking habits, occupation, and many other factors collected on each member of the cohorts to be processed in statistical analyses, a task that does not pose insurmountable problems today thanks to the availability of software and computing facilities. Much less tractable are the problems arising from the need to carry out multiple laboratory analyses on the stored blood specimens of hundreds of thousands of people. These can, however, be overcome by using only the blood specimens from a fraction or representative `sample' of the cohort rather than from all its members (the same device is widely used for opinion polls). Provided it is not too small, such a sample can provide essentially the same information as the whole cohort at a much reduced workload and expenditure.

For example, to investigate how the blood levels of sex hormones in 1997 influence the subsequent risk of breast cancer, advantage can be taken within the EPIC cohort of the breast cancer cases accumulated during the follow-up until 2008. All these cases, or a randomly selected subgroup, are included in the sample and for each of them one or more control women are extracted from the cohort at random.

Usually a more elaborate sampling plan is adopted, for example by picking at random a control belonging to the same country, centre, and age group as a case. With four controls per case, this `case-control' sampling design conveys essentially the same amount of information as the whole cohort and already with two controls per case the loss of information with respect to studying the whole cohort is minor. Hence it becomes possible to perform the hormone determinations only on the blood of, say, 2,000 cases and 4,000 controls, namely a total of 6,000 women instead of on the blood of the several hundred thousand women in the EPIC cohort. This type of approach has become common in recent years in cohort studies involving biochemical or genetic tests on stored specimens of blood or other biological materials like urine and hair. The approach, usually called case-control study within a cohort or nested case-control study, may also be advantageously used in every situation in which assessing an exposure is very cumbersome or costly. This might be the case when investigating whether low doses of ionizing radiation cause cancer, which requires determining the amount of radiation received by each member of a large cohort of nuclear reactor workers in the course of their entire working life. This detailed evaluation can be limited to the cases of cancer and to a number of controls picked at random from among the workers rather than extended to everyone in the cohort.
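The matched sampling plan described above can be sketched in a few lines. The cohort below is entirely synthetic, the field names are hypothetical, and matching is done on country and age group only. The sketch also simplifies by drawing controls from everyone who never became a case; a real nested case-control study would typically sample from those still disease-free at the time each case occurred.

```python
import random
from collections import defaultdict


def sample_matched_controls(cohort, controls_per_case, seed=0):
    """For each case, draw controls at random from non-cases in the same stratum.

    cohort: list of dicts with keys "id", "country", "age_group", "is_case".
    Returns {case_id: [control_id, ...]}.
    """
    rng = random.Random(seed)

    # Pool of potential controls, grouped by matching stratum.
    pool = defaultdict(list)
    for person in cohort:
        if not person["is_case"]:
            pool[(person["country"], person["age_group"])].append(person["id"])

    matched = {}
    for person in cohort:
        if person["is_case"]:
            candidates = pool[(person["country"], person["age_group"])]
            # Raises ValueError if a stratum holds fewer controls than requested.
            matched[person["id"]] = rng.sample(candidates, controls_per_case)
    return matched


# Synthetic cohort of 1,000 people, 40 of whom became cases during follow-up.
cohort = [
    {
        "id": i,
        "country": ["IT", "DK"][i % 2],
        "age_group": ["50-59", "60-69"][(i // 2) % 2],
        "is_case": i % 25 == 0,
    }
    for i in range(1000)
]

matched = sample_matched_controls(cohort, controls_per_case=2)
specimens = set(matched) | {c for controls in matched.values() for c in controls}
print(f"cases: {len(matched)}, specimens to analyse: {len(specimens)} of {len(cohort)}")
```

Only the specimens of the cases and their sampled controls go to the laboratory, which is where the saving in workload comes from.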



(7) Epidemiology


National Cancer Institute - Cancer Epidemiology Matters Blog


Today's diseases arise from yesterday's causes

The sampling of a limited, but not too tiny, number of subjects out of a much larger cohort or population can be looked at from a different viewpoint. Ignoring the cohort for a moment, attention can be focused on the disease cases as the starting point of an investigation. This is what happens every day to doctors confronted with their patients. Again and again, keen doctors have been struck by the unusual occurrence of some events in the life experience of some of their patients, providing the first hints to causes of the disease. For example, in the early 1960s an ear, nose, and throat specialist noted that as many as one-quarter of patients with cancer of the nasal cavities, a very rare disease, were furniture workers, an infrequent occupation in the general population. This observation paved the way to subsequent epidemiological studies which showed that dusts produced in furniture and cabinet works can produce cancers of the nasal cavities, probably because the dust is loaded with several carcinogenic chemicals. Cancer of the nasal cavities and furniture making are both so rare that their repeated joint occurrence is unlikely to arise by chance. In ordinary circumstances, however, to judge whether such a joint occurrence is just coincidental requires some estimate of the frequency of the possible causal factor among the patients and in the general population from which they originate.

Observing, as happened in the brief period of three years at a hospital in Boston, seven cases of cancer of the vagina in women as young as 15 to 22 immediately prompted an enquiry into the life experience of the patients extending into the pre-natal intrauterine period. Rather than focusing only on the patients, the investigators carefully selected for each case four controls: women born within five calendar days and in the same service (ward or private). Examination of the medical history of the patients' mothers during pregnancy found that all the mothers of the cases and none of the mothers of the controls had been taking diethylstilbestrol, a synthetic oestrogen prescribed to prevent pregnancy loss in high-risk women. This case-control study provided strong evidence of a causal association between the drug and the cancer in daughters, explained on the basis of the alteration that it induced in the vaginal cells of the foetus that years later developed into a cancer. The use of the drug has since been proscribed.
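How unlikely is such a clean split to arise by coincidence? The sketch below takes the figures as given in the text (seven cases, four controls each, all case mothers and no control mothers exposed) and computes the probability of that pattern if the drug had nothing to do with the cancer, using the hypergeometric reasoning behind Fisher's exact test. It ignores the matching of controls to cases, which is a simplification we make for illustration.

```python
from math import comb


def prob_all_exposed_are_cases(n_cases, n_controls, n_exposed):
    """Chance that every exposed mother belongs to a case, if exposure were
    unrelated to disease (one-sided hypergeometric probability).

    Counts the ways of placing all n_exposed mothers among the cases, out of
    all ways of placing them among cases and controls together.
    """
    total = n_cases + n_controls
    return comb(n_cases, n_exposed) * comb(n_controls, 0) / comb(total, n_exposed)


n_cases = 7
n_controls = 4 * n_cases   # four controls per case, as in the text
n_exposed = 7              # mothers who had taken diethylstilbestrol

p = prob_all_exposed_are_cases(n_cases, n_controls, n_exposed)
print(f"probability of this split by chance alone: {p:.2e}")
```

A probability of this order leaves coincidence as a very poor explanation, even with so few cases.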

Case-control studies were in fact widespread in epidemiology well before their use in the special situation where cases and controls are extracted from an actual study cohort. They can be regarded as a natural expansion of the enquiry a doctor makes the first time he or she sees a patient, asking not only about symptoms but also about the patient's health history, familial precedents, eating habits, occupation, and other elements which may possibly have a bearing on the present condition. In a case-control study, this procedure is carried out in a formalized way, using questionnaires focused on the exposure of interest to the investigator, and extending the enquiry to control subjects as well. The great advantage of this type of study is that it capitalizes on cases of disease which are occurring currently as a result of past causes - to be identified - rather than requiring, like a cohort study, a follow-up of subjects lasting years, waiting for the cases to occur.

Case-control studies have contributed knowledge to all areas of epidemiology and medicine. One example whose full relevance is now tangible is the series of seminal case-control studies tracking the causes of cervical cancer, today the second commonest cancer in women in developing countries.

It had already been noted in the 19th century that this cancer was uncommon among nuns, suggesting that it was perhaps in some way connected with sexual activity. It then emerged from several case-control studies in the second half of the 20th century that the cancer was related to being married, particularly at an early age and to a high frequency of sexual intercourse. In one study, a frequency of intercourse of 15 times or more a month was 50% higher among the cancer cases than among the controls. In these studies, information on exposure (i.e. on frequency of intercourse) was collected, as very often in case-control studies, by interview and it could have been inaccurate; moreover, it was most likely that marriage and sexual intercourse were not directly relevant but reflected the action of some other unknown factor, probably infectious. The search for sexually transmitted micro-organisms started focusing in particular on several viruses, among which the human papilloma viruses (HPVs) were particularly suspect because they were known to produce benign tumour lesions in humans (warts) and malignant tumours in rabbits. Case-control studies in which the exposure was no longer the marital status nor the frequency of intercourse but the presence of the virus showed a strong association of some types of HPV with the cancer.

The more specific and accurate the laboratory method used to ascertain the presence of the virus in cells of the uterine cervix, the stronger the association turned out to be, indicating that the virus, and not something else, was the real factor at play. Would this also mean that it was the cause of the cancer? It could in fact be that the cancer developed first and the virus was found only as a guest settling in the cells once the cancer had begun. A case-control study is not a good instrument to solve this kind of `who's first' question, because the presence of the virus (and in general of an exposure) is ascertained at a moment when the disease is already established. Cohort studies showed that the infection with the virus preceded the cancer. Moreover, studies with newly developed vaccines demonstrated in a definitive way that HPVs are the cause of cervical cancer and that blocking them prevents occurrence of the disease. Vaccination campaigns in young women are now in progress in several countries. Epidemiology has made a crucial contribution to this major advance in public health, and case-control studies have been a pioneering component of it.

The four key features of case-control studies

1. The selection of cases is the starting point of case-control studies. Often, they are observed in hospital and the diagnosis can be accurate and if necessary refined, for example separating the different cellular types of lung cancer if one suspects that they may be influenced by different factors (exposures) to be investigated. Usually cases should have arisen very recently, i.e. they should be new or incident cases, for example of diabetes. If all cases of diabetes, whether they were diagnosed yesterday or ten years ago, are instead included in the study, it may happen that a factor emerging as different between cases and controls does in fact influence how long a diabetes patient survives rather than why a healthy subject becomes diabetic. These two features become inextricable and the results of the study will become hard to interpret.

2. The selection of controls obeys the fundamental and rather obvious principle that they should come from the same study population as the cases. There are usually no problems when the population is an existing cohort already under investigation, as we have seen when discussing the case-control study within a cohort. A similar situation holds when the cases are, for example, all stomach cancers recorded in a year by a cancer registry covering a defined population, and controls are picked up at random from the population.

The hurdle is that there will always be a proportion of selected controls who refuse to participate; they can be replaced by other people who consent to participate but in this way the controls are no longer rigorously representative of the population from which the cases come. When the latter is only vaguely defined, as when the cases are patients in a hospital, the problem of which population to sample to obtain controls may become very difficult. Taking as controls patients in the same hospital with diseases other than the one under study and not related to the factors under study is a widely adopted solution. It assumes that all kinds of patients reach the hospital for the same combination of reasons, medical, personal, administrative, or legal. This assumption may often be wrong as when, for example, the hospital has one highly specialized and reputed service for leukaemia, the disease under study, which receives patients from several regions while the other services of the hospital operate essentially on a local basis. In this situation, it is reasonable to select controls coming from the same area of residence as each case and it may be sensible to also match cases and controls for gender, age, interviewer, and calendar period of the interview. Going further and trying to find controls similar to the cases in other respects should be avoided. Not only is it difficult to find controls that match a case when the number of characteristics increases, but making cases and controls more and more similar makes the controls unrepresentative of their population of origin and destroys the possibility of discovering differences in exposures between cases and controls, i.e. the very purpose of the study. The choice of controls is a major challenge for epidemiologists, requiring both experience (including mistakes) and specific knowledge of the local context of the study.

3. Ascertaining exposures very often involves interviewing cases and controls about a variety of factors to which they may have been exposed, ranging from smoking habits through diet to medical history, depending on the purpose of the study. The same interviewer interviews a case and his or her controls, in a random order and within a short period of time, to avoid subtle changes in the way questions are asked that may intervene with the passing of time (a case-control study usually lasts for months or a few years as necessary to obtain the required number of cases and controls). Structured questionnaires are the rule for the interview and interviewers undergo training sessions on how to use them and, more generally, on the approach to the subjects. Ideally the interviewers should not know whether the person they are talking to is a case or a control, as this would avoid bias in the way questions are formulated and answers recorded. This `blind' condition is, however, seldom feasible in practice. In addition, the subjects themselves may remember incorrectly or report, consciously or unconsciously, past events and exposures. The extent to which this misreporting may be different for cases and controls produces a recall bias that distorts comparison. Similar problems affect telephone interviews and replies to self-administered questionnaires. Lesser difficulties arise when past exposures can be evaluated consulting written documents, for instance medical or employment records, although they may sometimes be incompletely or inaccurately filled in. Finally, an investigator may wish to explore the influence of a physiological factor like insulin on a disease such as colon cancer by measuring the blood levels of insulin in cases with colon cancer and controls without the disease; but who can guarantee that it is insulin influencing the disease rather than the other way round? Clearly ascertaining exposure is a delicate exercise in a case-control setting.

4. By now you should have noted the basic difference between a case-control and a prospective study. The prospective study observes events in their natural course from causes to possible effects. Computing and comparing incidence rates or risks of chronic bronchitis in smokers and non-smokers seeks to answer the question: how often do smokers develop the disease compared to non-smokers? A case-control study observes the events in a reverse sequence, from effects to possible causes. It starts from the disease and seeks to answer the question: what proportion of people with chronic bronchitis have been smokers compared to people with no disease? No incidence rates or risks can be calculated from a case-control study as the number of smokers and non-smokers at risk of developing the disease is as a rule unknown; we only have two samples of people who actually developed or did not develop the disease but we know the frequency of smoking in both samples. Fortunately, a proper data analysis permits us to compute the ratio of the two risks, each of them remaining unknown. If this sounds surprising, consider for a moment the figures from a prospective study (not a case-control!) of a population of 10,000 people, of whom 2,515 turn out to be smokers and 7,485 non-smokers:

                                                                Smokers   Non-smokers
Developed chronic bronchitis after three years of observation        25            15
Did not develop chronic bronchitis                                2,490         7,470
Total population (10,000)                                         2,515         7,485

In three years, 25 smokers out of 2,515 developed chronic bronchitis, hence their risk is 25/2,515. Similarly, the risk for non-smokers is 15/7,485. The ratio of the two risks is (25/2,515) / (15/7,485) = (25/2,515) x (7,485/15) = 4.9, i.e. smokers have almost a fivefold probability of developing chronic bronchitis. We could get nearly the same result by replacing 2,515 (the number of smokers initially at risk of disease) with 2,490, the number that did not actually develop the disease by the end of the three years of observation. This replacement is justified by 2,490 being a reasonably close approximation to 2,515 and, similarly, 7,470 to 7,485. In general, the smaller the number of diseased people in relation to the population size, i.e. the disease risk during the period of observation, the better the approximation will be. And as any long period can be broken down into very tiny intervals, it will in principle be possible to make the risk within each interval as small as we please, rendering the approximation virtually perfect (a device you may come across under the intimidating name of `incidence density sampling').

The new ratio, called the odds ratio, can now be computed as: (25/2,490)/(15/7,470) = (25/2,490) x (7,470/15) = 5.0, very close to 4.9.
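The arithmetic can be checked in a few lines. The following Python sketch is only an illustration: the function names are our own and do not belong to any library, while the counts are those of the prospective example.

```python
def risk_ratio(cases_exp, total_exp, cases_unexp, total_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / total_exp) / (cases_unexp / total_unexp)


def odds_ratio(cases_exp, noncases_exp, cases_unexp, noncases_unexp):
    """Odds of disease in the exposed divided by odds in the unexposed."""
    return (cases_exp / noncases_exp) / (cases_unexp / noncases_unexp)


# Counts from the prospective example: smokers are the exposed group.
rr = risk_ratio(25, 2515, 15, 7485)
or_ = odds_ratio(25, 2490, 15, 7470)

print(f"risk ratio = {rr:.2f}")
print(f"odds ratio = {or_:.2f}")
```

Run as written, this gives a risk ratio of about 4.96 and an odds ratio of 5.00, the two figures compared above.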

Why go to the trouble of computing an odds ratio when the risk ratio is already available? Because the odds ratio can, unlike the risk ratio, be calculated not only in a prospective study - as in the example - but also in a case-control study. For instance, a case-control study covering the same time span as our prospective study may have picked up from our population all 40 cases of bronchitis through hospital records and at random 160 controls without the disease, i.e. only 1.6% of the 2,490 + 7,470 subjects with no disease. The new figures look like this:


                                 Smokers   Non-smokers
Cases with chronic bronchitis         25            15
Controls                              40           120

The odds ratio is (25/40) / (15/120) = (25/40) x (120/15) = 5.0, exactly the same as before.
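Why the result does not depend on how many controls are drawn can be seen with a small sketch. It assumes, for simplicity, that the controls are an exactly proportional sample of the people without the disease; a real random sample would fluctuate around these counts. The function name and the alternative sample sizes (800 and 9,960) are ours.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Exposure odds among cases divided by exposure odds among controls."""
    return (exp_cases / unexp_cases) / (exp_controls / unexp_controls)


# People without the disease in the whole population (figures from the text).
smokers_no_disease, nonsmokers_no_disease = 2490, 7470
disease_free = smokers_no_disease + nonsmokers_no_disease

# Assumption: controls reproduce exactly the smoking frequency of the
# disease-free population, whatever the sampling fraction.
for n_controls in (160, 800, 9960):
    fraction = n_controls / disease_free
    exp_controls = smokers_no_disease * fraction
    unexp_controls = nonsmokers_no_disease * fraction
    print(n_controls, round(odds_ratio(25, 15, exp_controls, unexp_controls), 2))
```

The sampling fraction multiplies both numbers of controls and therefore cancels out of the ratio, which stays at 5.0 whether 160 controls or the whole disease-free population are used.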

Herein lies the remarkable advantage of a case-control study: the possibility of estimating, via the odds ratio computed from a comparatively small number of subjects, the same ratio of risks that in a prospective setting would require following up a large population for years. This advantage offsets the limitations already discussed (notably in the choice of controls and in ascertainment of exposure) of case-control studies and explains their continuing popularity with epidemiologists. Problems notwithstanding, the case-control study is an epidemiological tool adaptable to all manner of circumstances and relatively rapid to implement. As such, it has been popular and is still widely used as a first-line study when tackling a new health problem. When a group of people comes down with a serious gastrointestinal ailment after a festive dinner, the first thing an epidemiologist will do is to interview the sick people and then some healthy controls to ascertain the frequency with which individual food items served at the dinner were consumed by cases and controls. Hazardous items may in this way emerge and be identified and, hopefully, removed from the menu.





The boom in genetic case-control studies

Genes inherited from the parents are fixed characteristics of a living organism. They cannot be altered by the occurrence of a disease and they are not subject to recollection errors, unlike exposures ascertained by questioning cases and controls. For this reason, genes represent an ideal exposure to be measured accurately in case-control studies. Large series of cases, uniformly diagnosed, can be assembled from many clinical centres, providing adequate numbers for detecting associations between gene variants and disease. With the availability of techniques that permit testing a million `single nucleotide polymorphisms' (SNPs), the gene variants, the current trend is to first throw the net wide and explore SNPs distributed over all 23 chromosomes. These studies are labeled `GWAS', or Genome Wide Association Studies, and after the first phase are followed by confirmatory phases to check that the associations found in the first phase are not false positive results arising simply by chance. Several GWAS are in progress and many more are starting. The first results of one large study of 14,000 cases of seven common diseases and 3,000 controls have identified more than 20 associations, involving a mental disorder, coronary artery disease, type 1 and type 2 diabetes, an intestinal inflammatory disease, and rheumatoid arthritis.

Confirmed associations open the way to the investigation of the physiological mechanisms leading from the gene variant to the disease, a task beyond the scope of ordinary case-control studies. In principle, blood could be taken from cases and controls and biochemical studies carried out to probe these mechanisms. However, the presence of the disease makes the meaning of any physiological finding questionable: would it really represent a step leading to the disease or instead be a consequence of the disease itself? Only case-control studies conducted within cohorts where blood was collected and stored before the disease occurred, like the EPIC or the British Biobank cohorts, are free of this problem. Laboratory studies on experimental animals and on cells, including fresh white blood cells from human volunteers, complement these epidemiology-based studies on the physiological paths linking genes to disease. Understanding these paths also helps to clarify the role of the external factors at play. Drugs and other means (e.g. changes in diet) that can interfere with the paths and prevent disease development are the ultimate objective of this research.

Space, time, and individuals

Intervention studies, randomized or non-randomized, and observational studies (cohort and case-control) form the core of epidemiology. As they aim to test hypotheses about causal relationships between exposures and effects, they are often collectively called analytical studies (cross-sectional and correlation studies, of which more later on in this chapter, also belong to the group). Analytical studies are generally both preceded and followed by descriptive studies of how health and disease, as measured by rates of deaths, new cases of disease, or hospital admissions, are distributed in space by geographical area, in time by week or month or year, and in categories of people of different age, gender, and socio-economic status. Observing a disease distribution in space, time, and categories of people provides useful indications of which factors need to be explored in depth through analytical studies as possible determinants of the distribution. Visually friendly tools, like graphs and maps (sometimes collected in atlases), convey a pictorial view of disease burden and evolution. They facilitate the examination of descriptive data and the generation of causal hypotheses.

Cholera made headlines in 2008 when a lethal epidemic hit Zimbabwe, but in the 19th century it had often ravaged Europe, producing waves of high mortality, particularly among the poorest sections of the population living in squalid conditions. From foci in India, it spread to Europe; an early example (1832) of a health map illustrates its progression westward.

Today, cholera has practically disappeared from Europe while cardiovascular diseases head the league of causes of mortality. In particular, death rates from heart attacks tend to be markedly higher in northeast than in southwest Europe.

Sixty years ago, the marked difference in the occurrence of heart attacks between southern Europe and other regions, especially the United States, prompted American investigators to develop a comparative cohort study in seven countries with very different rates (Finland, Greece, Italy, Japan, the Netherlands, the United States, Yugoslavia). Together with other cohorts, especially the Framingham study, this `Seven Countries Study', follow-up of which has continued until recently, has provided essential information to establish high cholesterol, tobacco smoking, and high blood pressure as three major determinants of heart attacks.

Diseases vary both in the short and long term. An epidemic curve depicts the rise and fall of an outbreak, for example one of a viral disease, mumps, communicable from person to person through airborne droplets or saliva: a small (37 cases), self-limiting, and non-lethal outbreak localized particularly in a school.

Epidemic curves are not only useful as visual summaries of past epidemics. When constructed day by day or week by week while an epidemic is in progress, they allow, combined with information on the mechanism of transmission of the disease, the building of mathematical models, deterministic or probabilistic, predicting the likely evolution of the disease. Descriptive studies not only stimulate and guide the development of analytical epidemiological studies but are also an ultimate check that the results of the latter `make sense'. If tobacco smoking is an important cause of lung cancer, the pattern of lung cancer rates by geographical area and over a period of years should reflect, or at least be compatible with, the pattern of tobacco consumption, as has in fact been found again and again in many countries. And if effective measures of prevention have been taken, for instance reducing tobacco consumption or removing a pollutant or treating a disease, they should be reflected in a change in the rates of the disease.

Preventive measures and improved treatment in fact show up in the halving of the risk of heart attacks between the late 1970s and the late 1990s; at the same time, the persistent difference between higher and lower social classes is a stimulus for new research.

Sources of data

The description of health and disease in populations relies on data that are collected on an ongoing and systematic basis or through special enquiries. In principle, all countries in the world (192 are members of the World Health Organization) have a system of records of basic life events: births, deaths, and causes of deaths. The organization and, more importantly, the coverage and quality of the data collected are very variable. Only 20% of the world's population living in 75 countries, mostly economically developed, is in fact covered by cause-of-death statistics judged (for instance in respect to accuracy, completeness, and other requirements) as high or medium-high quality. For as many as 25% of the world population, WHO does not receive any data on cause of death. Several years, from three to ten or more, may be needed for cause of death to be reported to WHO and made available for international comparisons. As to births, 36% go unregistered, with vast differences between countries, from 2% in industrialized countries to 71% in the very least developed. In the decade 1995-2004, only 30% of the six billion world population lived in countries with complete registration of births and deaths. This percentage had not changed much from the 27% figure of the period 1965-74 (when the world population was less than 4 billion).

Morbidity statistics are disease-oriented, and range from hospital discharge records, in principle available wherever a hospital exists but in practice of vastly variable quality, to registries intended to cover all cases of a disease, or of a group of diseases, occurring within a population. In selected and limited areas of a number of developed countries, registries are operational for malformations, myocardial infarctions, diabetes, stroke, and other conditions. Cancer or, better, cancers (because the heading embraces several hundred different diseases), is the condition best covered. Cancer registries started in the 1930s in North America and the first nationwide registry was established in Denmark in 1942. By the end of the last century, there were close to 200 good-quality cancer registries, mostly local or regional, in 57 countries.

Because of the danger of contagion, several infectious diseases have been the object of compulsory and rapid notification at national level since the late 19th century. Today, cases of diseases such as smallpox, SARS, poliomyelitis, and cholera fall within the larger scope of the `International Health Regulations' and also require notification to the World Health Organization. Surveillance systems of communicable diseases have evolved in promptness and coverage, yet even the best systems based on doctors' diagnoses can hardly report a rising epidemic in less than one or two weeks. Recently, and somewhat surprisingly, just counting an increasing number of daily queries on influenza in Google has proved capable of detecting the rise of the disease in just one or two days. This simple and cheap approach might also work for other epidemic diseases in areas with a large population of Web users. An accurate estimate of the propagation and severity of a potentially fatal disease demands, however, a complete enumeration both of cases and of deaths. When many mild cases are not recorded, as may be the case for the A(H1N1) influenza, the extent and speed of the epidemic are underestimated and the fatality rate, i.e. the ratio of deaths to cases in a time interval, is overestimated (as deaths are less likely to go unrecorded).
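The effect of unrecorded mild cases on the fatality rate can be illustrated with invented numbers. In the Python sketch below every figure is hypothetical, including the assumption that only one case in four is recorded while all deaths are counted.

```python
def fatality_rate(deaths, cases):
    """Ratio of deaths to cases over the same time interval."""
    return deaths / cases


# Hypothetical epidemic, invented purely for illustration.
true_cases = 10_000
deaths = 20              # assumption: every death is recorded
recorded_share = 0.25    # assumption: only one case in four is recorded

recorded_cases = int(true_cases * recorded_share)

true_rate = fatality_rate(deaths, true_cases)
apparent_rate = fatality_rate(deaths, recorded_cases)

print(f"true fatality rate:     {true_rate:.2%}")
print(f"apparent fatality rate: {apparent_rate:.2%}")
```

With these figures the apparent fatality rate is four times the true one, simply because the denominator is four times too small.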





Friendly figures and hidden fallacies

Mortality rates from heart attacks in men aged 45-74 years in Europe may suggest among others the hypothesis that the disease distribution is related to some foods, a possibility which can be explored by correlating the rates of heart attacks in each European country (or even within smaller areas) with the average per capita consumption of several foods. Both the figures for heart attacks and for food consumption can be gathered easily from published statistics, making the exercise fast and friendly. If we had only two countries, one with a high consumption of, say, milk and the other with a low consumption, we could simply compute a rate ratio. A ratio different from 1 (equality of rates) would tell (if statistically significant) that there is an association or correlation between the occurrence of heart attacks and the consumption of milk. Here, however, we have not two but several countries and correspondingly several rates and differing levels of milk consumption; fortunately, the method of studying their correlation is just an extension of the rate ratio. If a correlation is indeed found, we should guard against inferring that milk consumption is a determinant of heart attacks. Not only do we have to take into account the possibility of confounding and bias encountered when interpreting associations in general, but here the association, called an ecological association, is at the level of geographical units, i.e. countries, not at the level of individuals, as in case-control and cohort studies.

In these, the consumption of milk and the health status (with or without heart attack) would have been measured and be known for each person, while in the correlation exercise all that is known is the rate for each country and the average consumption of milk (derived, for example, from sales figures). No one knows whether within each country the individuals who develop a heart attack are those who also consume more milk, and the observed correlation could be an artifact. Falsely believing that it is real would result in an ecological fallacy. As is often heard, and as epidemiologists, contrary to what is also often heard, know perfectly well, `correlation is not causation'.
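A toy example may make the ecological fallacy concrete. In the Python sketch below the four countries and all their figures are invented: within each country milk drinkers and non-drinkers have exactly the same rate of heart attacks, yet across countries the rate rises with the share of milk drinkers. The helper `pearson` is our own small implementation of the usual correlation coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical countries: (share of milk drinkers,
#                          heart attack rate in drinkers,
#                          heart attack rate in non-drinkers), rates per 1,000.
countries = [
    (0.2, 2.0, 2.0),
    (0.4, 3.0, 3.0),
    (0.6, 4.5, 4.5),
    (0.8, 5.0, 5.0),
]

shares = [share for share, _, _ in countries]
# Overall rate of each country: the two groups weighted by their shares.
rates = [s * rd + (1 - s) * rn for s, rd, rn in countries]
# Rate ratio of drinkers to non-drinkers inside each country.
within_ratios = [rd / rn for _, rd, rn in countries]

print(f"country-level correlation: {pearson(shares, rates):.2f}")
print("within-country rate ratios:", within_ratios)
```

The country-level correlation is close to 1 although every within-country rate ratio equals 1: in these invented data something other than milk, varying from country to country, drives the rates.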

Things, however, are even more complex as an opposite fallacy may be at work. Imagine two areas in one of which nobody smokes while in the other everybody smokes the same amount from the age of 15. The latter would have a much higher rate of lung cancer than the former, yet a study measuring smoking habits for each individual would be unable to detect any difference in lung cancer risk associated with smoking within each area. Only comparison of the two areas would reveal that in these circumstances smoking is a determinant of lung cancer but only at the population (area) level. To look only at the individual level may lead to an atomistic fallacy. Everything that has been said about geographical units applies also to time units, for example to correlations of concentrations of air pollutants during successive weeks and hospital admission rates for respiratory disease in the same weeks.

Cross-sectional surveys

Detailed information on health is gathered by special surveys of samples of a population, in which questions about health are asked or a health examination is carried out, or both. As in correctly conducted opinion polls, representative samples can be obtained by first subdividing the population by key criteria, typically gender, age, and place of residence, and then extracting at random within each subdivision or `stratum' a number of subjects to be included in the survey. All data collected refer, as in population censuses, to a fixed point in time (calendar date) even if the actual duration may span several days or weeks. Some surveys may be repeated regularly to monitor trends in health. A major periodical survey is the US National Health and Nutrition Examination Survey. It was first conducted in the period 1971-75 on a nationwide random sample of more than 30,000 subjects and included an interview focused on diet and a medical examination. The survey was repeated in 1976-80 and in 1988-94. Since 1999, it has become an ongoing biennial survey including an enlarged and variable number of interview, medical examination, and laboratory test items.
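The stratified sampling just described can be sketched as follows. This is a minimal illustration, not the procedure of any actual survey: the population register is invented, the function name is ours, and drawing the same number of subjects from every stratum is an assumption (real surveys often draw numbers proportional to stratum size).

```python
import random
from collections import defaultdict


def stratified_sample(people, stratum_of, per_stratum, seed=0):
    """Draw up to `per_stratum` subjects at random within each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for person in people:
        strata[stratum_of(person)].append(person)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical population register: (id, gender, age group).
population = [
    (i, "F" if i % 2 else "M", ("15-44", "45-64", "65+")[i % 3])
    for i in range(600)
]

# Strata are defined here by gender and age group together.
chosen = stratified_sample(population, lambda p: (p[1], p[2]), per_stratum=10)
print(len(chosen))
```

Here the 600 invented subjects fall into six strata (two genders by three age groups) and ten are drawn at random from each.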

While it is valuable to document in detail the health of a community and its changes in time, these surveys are usually less useful as tools to search for causes of disease. For example, blood pressure measurements and an electrocardiogram can be taken during the survey, and some electrocardiogram anomalies may be found to be more frequent among people with high blood pressure than among people with normal blood pressure. However, as both electrocardiogram and blood pressure were assessed at the same time, it is impossible to say which anomaly started first and can be a cause, direct or indirect, of the other. Indeed, they may both have begun at about the same time as the result of another common factor, such as tobacco smoking. All surveys of this type that collect data only once at a fixed point in time (cross-sectional studies) suffer from this shortcoming. Surveys of the same populations repeated at different dates but, as is often the case, on a different sample of people are not free of this limitation.

The burden of diseases

Another minefield, essential for the establishment of public health priorities, is the determination of the burden of different diseases due to different factors in a region or nation or even worldwide. In its simplest form, this may start with a frequently asked question of the type: what percentage of, for example, all cancers is due, say, to environment? Three main problems prevent a single answer. First, the definition of environment may cover all factors external to the body (sunlight, pollutants in air and water, tobacco smoke, foods, etc.) or it may be restricted to some of them, like pollutants in place of residence and occupation: the percentages will vary depending on the definition. A second reason is that these percentages obviously depend on how many people are exposed to the different components of the environment: if many smoke, the percentage of cancers due to environment, and to smoke in particular, is high; if few smoke, it is low. How many people smoke (and how much) may be relatively easy to determine, but there is usually much more uncertainty on how many people are exposed, for example, to carcinogenic air pollutants; moreover, for both exposures the numbers of exposed individuals vary from place to place, and percentages calculated for large countries or continents or the entire world hide these substantial variations as well as the uncertainty in the evaluation of the numbers of exposed people. Finally, even accurate percentages specific to a single place (e.g. a town) have the puzzling feature that they cannot be added up, as their total may exceed 100%, i.e. more than the total of cancer cases! In fact, if we knew all causes of cancer perfectly - which is far from being the case today – there would be a lot of double counting, as many cancers are due to the joint action of two or more causes, genetic and environmental. 
Occupational exposure to asbestos and tobacco smoking both independently increase the risk of lung cancer, but their combination further multiplies the risk for asbestos workers who smoke. It is correct to attribute the percentage of cancers due to this combined action, say 5% of all lung cancers, once to asbestos and once to smoking (because without either of the two exposures these cancers would not have occurred), but it is not correct to sum the corresponding percentages, 5% + 5%, because they refer to the same cancers.
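A numerical sketch shows how such percentages can exceed 100% when added. The relative risks below are invented round numbers chosen only to illustrate joint action; they are not estimates from any study, and the function name is ours.

```python
# Hypothetical relative risks of lung cancer, invented for illustration
# (the risk of people with neither exposure is taken as 1).
rr_asbestos_only = 5.0
rr_smoking_only = 10.0
rr_both = rr_asbestos_only * rr_smoking_only  # assumption: risks multiply


def fraction_prevented(rr_with, rr_without):
    """Share of cases that would not occur if risk fell from rr_with to rr_without."""
    return (rr_with - rr_without) / rr_with


# Among people exposed to both factors:
due_to_smoking = fraction_prevented(rr_both, rr_asbestos_only)   # smoking removed
due_to_asbestos = fraction_prevented(rr_both, rr_smoking_only)   # asbestos removed

print(f"attributable to smoking:  {due_to_smoking:.0%}")
print(f"attributable to asbestos: {due_to_asbestos:.0%}")
print(f"naive sum:                {due_to_smoking + due_to_asbestos:.0%}")
```

Under these assumptions 90% of the cancers among asbestos workers who smoke are attributable to smoking and 80% to asbestos; the naive sum of 170% merely counts the same cancers twice.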

With all these reservations, it should not be surprising if today one cannot be more precise than saying that on the grand scale of the whole world approximately one-third of cancer is attributable to environmental factors. Tobacco smoking accounts for at least 20%, alcoholic drinks for some 5%, infectious agents for at least 10% with higher percentages in developing countries, and occupational and environmental carcinogens for fractions variable from less than 1% to some 10%.

These percentages are a simplistic representation of the actual impact of a factor on the health of a population. The same percentage may reflect impacts of very different severity depending on whether the cancers affect young or old people or whether they are successfully treatable (and how and for how long) or not. These elements are taken into account in sophisticated analyses, developed in the last two decades, of the burden of disease in local, national, or world populations. Often, the results of burden of disease analyses are expressed as number of DALYs (disability-adjusted life years) lost due to a cause. One DALY corresponds to the loss of one year of life free of disability: hence the DALY unit of measurement incorporates both the loss of years of life, because of death, and the loss of quality of life, because of disability.
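In its simplest form the DALY count is the sum of two terms, the years of life lost through death and the years lived with disability, the latter scaled by a weight between 0 (full health) and 1. The Python sketch below uses that basic form only; the two diseases, the disability weights, and all the counts are invented, and actual burden of disease analyses involve further refinements.

```python
def dalys(deaths, years_lost_per_death, cases, disability_weight, years_disabled):
    """DALYs = years of life lost (YLL) + years lived with disability (YLD)."""
    yll = deaths * years_lost_per_death
    yld = cases * disability_weight * years_disabled
    return yll + yld


# Two hypothetical diseases with the same number of deaths and cases.
# Disease A strikes the young and disables severely; disease B strikes
# the old and disables mildly. All figures are invented.
disease_a = dalys(deaths=100, years_lost_per_death=30,
                  cases=500, disability_weight=0.2, years_disabled=10)
disease_b = dalys(deaths=100, years_lost_per_death=5,
                  cases=500, disability_weight=0.05, years_disabled=2)

print(f"disease A: {disease_a:.0f} DALYs")
print(f"disease B: {disease_b:.0f} DALYs")
```

Equal numbers of deaths and cases thus translate into very different burdens, 4,000 against 550 DALYs in this invented example.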

Working for the health of all

Epidemiology is at heart a field of applied research with the improvement of the health of all as the key aim. As such, epidemiology is an essential component of all public health activities that implement the organized efforts of society to promote, protect, and restore health. This concept of public health has no relation to how societal efforts to improve health are or should be organized; it does, however, imply that some kind of explicit organization should exist, rather than just dispersed and uncoordinated initiatives, for society to successfully tackle health problems.

Three broad activities contribute to people's health. In clinical medicine, doctors and other health personnel deal individually with each patient. They provide preventive measures such as drugs to control high cholesterol or elevated blood pressure, or deliver advice and psychological support to stop smoking. They intervene to diagnose, treat, and when possible cure, diseases with procedures ranging from the simple prescription of an antibiotic to a complex liver or heart transplant. Finally, they offer individual rehabilitation to people with disabling diseases. Prevention and early diagnosis at the population level form the second field of activity. Prevention addresses the root causes of disease, environmental or genetic. It embraces a vast array of regulations spanning control of pollutants in air, water, and the workplace, to traffic speed limits and safety requirements in home appliances. It includes compulsory and optional vaccination programmes as well as campaigns to foster healthy diet and behaviour. When it targets genetic causes of diseases, for example the screening of all newborns for genetic defects, primary prevention uses medical diagnostic tools, as do organized programmes of early diagnosis and treatment of diseases. These have proved effective and are operational in many countries for a limited number of high-impact diseases such as cancers of the uterine cervix and of the breast.

The third activity consists in the empowerment of people to exercise responsibility for their health through adoption of health promoting habits and participation in the decision processes that shape health policies. The latter in turn may reinforce or inhibit people's empowerment, the development of which depends on formal and informal education and on updated and accurate information.

Public health also coordinates these activities in relation to other societal actions, external to the health system, which strongly influence health, for example income and housing policies. In the coordination process, public health administrators and policy makers usually demand that the benefits and adverse effects of proposed policies be subject to economic analysis, in which epidemiologists play a specific role jointly with other specialists. Channelling the research results into practice, whether in clinical medicine, in population prevention, or for people's empowerment, requires as a first step the aggregation of the results of multiple studies to consolidate the total evidence available on a specific question, for example whether vitamin C protects against cancer in humans. This is done by critically reviewing the studies' reports, comparing methods and results, and drawing a general `best' answer to the question at hand. In the last two decades, the approach and methods used in a review, previously entirely left to the reviewer's discretion, have been refined and made more objective and rigorous under the heading of systematic reviews.

Systematic reviews, with and without meta-analysis

A systematic review is a review carried out using a systematic approach to minimize bias and random errors, a process which is explicitly documented in the methods section of the review itself. It usually offers a more objective appraisal of the available evidence than traditional reviews, conducted as narrative commentaries on the studies. In a systematic review, each study is scrutinized to assess its quality in respect of a number of criteria fixed in advance, e.g. how well the population is defined, whether the study responses were assessed blindly or not, and so on. This makes it possible to consider separately studies judged of higher and lower quality, rather than all of them together, and see whether the results of the lower-quality studies point in the same direction (e.g. towards a reduction or an increase in risk) as the higher-quality ones. Broadly consistent results can be combined in a statistical analysis, a meta-analysis, to provide a single summary estimate of risk. This analysis, in which each study is given a `weight' proportional to the number of disease cases it contributes, may cause a clear-cut result to emerge, while the individual studies, particularly if small in size, may each present a result statistically non-significant that is difficult to interpret.
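The weighting described above can be sketched in a few lines. This is only an illustration: the study figures are hypothetical, the function name is invented for the example, and averaging on the logarithmic scale is an assumption made here (it is a common convention for ratios, not something stated in the text). Real meta-analyses typically use more refined weights based on the precision of each study.

```python
import math

def pooled_relative_risk(studies):
    """Summary relative risk, weighting each study by its number of cases.

    studies: list of (relative_risk, number_of_cases) pairs.
    The average is taken on the log scale (an assumption of this sketch).
    """
    total_weight = sum(cases for _, cases in studies)
    weighted_log_sum = sum(cases * math.log(rr) for rr, cases in studies)
    return math.exp(weighted_log_sum / total_weight)

# Hypothetical studies: (relative risk, number of disease cases contributed)
hypothetical_studies = [(0.80, 40), (1.10, 15), (0.70, 120), (0.95, 25)]

summary = pooled_relative_risk(hypothetical_studies)
print(f"Summary relative risk: {summary:.2f}")
```

The largest study dominates the summary, which is the intended effect of weighting by the number of cases.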

Combining studies often permits the evaluation of rare events, too few of which occur in a single study. A typical case is that of side effects of new drugs, which occur infrequently, say once in a thousand treated patients. However, if a side effect is serious, for instance a major heart problem, it will have considerable impact when the drug is put on the market and used by hundreds of thousands or millions of people. Yet such an effect will be hard to detect in a randomized experiment of a size of, say, a few hundred subjects, which would be more than adequate to measure a much more frequent therapeutic effect. It is only by combining all available data from different randomized experiments that a sufficiently large number of patients is reached to allow the adverse event to become detectable. A telling example is Rofecoxib, a drug commercialized in 1999 as an anti-inflammatory remedy for rheumatic and muscular disorders. It was withdrawn from the market by the manufacturer in September 2004 on account of an increased risk of heart attacks, when an estimated 80 million people had already used it. However, if the manufacturer or the drug licensing authorities had conducted a timely meta-analysis, they would have detected the increased risk more than three years earlier, in 2000.
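A rough calculation shows why pooling matters for rare events. The sketch below assumes independent patients and the one-in-a-thousand rate mentioned above; the trial sizes (300 and 30,000) are hypothetical. Observing at least one event is only a precondition for detecting an increased risk, not a demonstration of it.

```python
def prob_at_least_one_event(event_rate, n_patients):
    """Probability of observing one or more events among n independent patients."""
    return 1.0 - (1.0 - event_rate) ** n_patients

event_rate = 1 / 1000  # side effect occurring once in a thousand treated patients

# Hypothetical sizes: one small trial versus many trials pooled together
for n_patients in (300, 30_000):
    expected = event_rate * n_patients
    p_any = prob_at_least_one_event(event_rate, n_patients)
    print(f"n={n_patients}: expected events={expected:.1f}, "
          f"P(at least one)={p_any:.3f}")
```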

Systematic reviews complemented by meta-analyses of randomized controlled trials are most valuable for clinical medicine. They have helped to develop the continuously evolving body of evidence-based medicine which guides doctors' everyday practice. They have also helped to put the evidence from randomized preventive trials carried out in populations on a firm basis, for example the prevention of myocardial infarction with cholesterol-lowering drugs.

Meta-analyses have also been extended to observational epidemiology studies directly relevant to public health. Combining in a single statistical analysis the results of observational studies, in which confounding factors and biases have usually been dealt with differently in each study, is, however, problematic. In randomized controlled trials, bias and confounding are prevented by randomization and do not impinge on a meta-analysis; this condition does not apply to observational studies. For these studies, systematic reviews are in any case necessary, while the worth of meta-analyses has to be assessed case by case.




Clinical medicine

Systematic reviews form an important part of clinical epidemiology, but more generally the quantitative and probabilistic traits of epidemiology pervade clinical medicine. It is common to find today in standard textbooks of medicine references to `NNTs' and schemes of `diagnostic decision trees'. Comparing treatment options is helped by computing the NNT, or number needed to treat. In severe hypertensive subjects, the risk of a major adverse outcome (such as death or stroke) in the coming three years may be as high as 20%. A treatment may, however, reduce it to 15%. The risk reduction obtained with the treatment is 20 - 15 = 5 percentage points, which means that out of 100 subjects treated, 5 avoid the major adverse outcome they would have otherwise suffered. This is the same as saying that for one subject to avoid a major adverse event, the number needing treatment is 100/5 = 20. Should a new treatment reduce the risk to 4%, it would be necessary to treat only about 6 patients (100/(20 - 4) = 6.25) to avoid one adverse event. Comparing the number of people who need to be treated for the two treatments, 20 against 6, conveys tangible information on the merits of the two treatments, the second being clearly superior (provided all other aspects are the same, for instance the frequency of side effects, but these can be dealt with in terms similar to NNT).
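The NNT arithmetic can be written out as a small function. The function name is made up for this illustration; the risks are those of the hypertension example, expressed in per cent.

```python
def number_needed_to_treat(risk_untreated_pct, risk_treated_pct):
    """NNT = 100 / absolute risk reduction, with risks given in per cent."""
    risk_reduction = risk_untreated_pct - risk_treated_pct
    if risk_reduction <= 0:
        raise ValueError("the treatment must lower the risk for an NNT to exist")
    return 100 / risk_reduction

print(number_needed_to_treat(20, 15))  # first treatment: 20.0
print(number_needed_to_treat(20, 4))   # new treatment: 6.25, i.e. about 6
```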

A diagnostic decision tree is designed to assist the physician in formulating a diagnosis. If a young man presents with a sudden vague but aching and recurrent pain in the left chest, one diagnostic possibility is coronary artery disease, the narrowing of the coronary arteries that supply blood to the heart. Given the young age of the patient and the absence of any other sign, this condition appears a priori unlikely, but being very serious it could be disastrous to miss it. The patient can thus undergo an exercise stress test whereby his electrocardiogram is monitored during controlled physical effort. A negative test would be reassuring; unfortunately the test is not perfect and sometimes it turns out falsely negative even in the presence of the disease, in the same way that it can be falsely positive in its absence. If narrative terms like `a priori unlikely', `sometimes falsely positive', `sometimes falsely negative' are replaced with figures of probabilities (derived from specific studies), a map, or decision tree, can be built of all possible courses of diagnostic actions. One course may be to dismiss straight away the diagnosis of coronary artery disease because the type of pain found in an otherwise healthy and young man makes the diagnosis less than 5% probable. The alternative course is to proceed to the stress test knowing, however, that it has a 30% probability of false negative results (i.e. it has a sensitivity of 100 - 30 = 70%) and a 10% probability of false positive results (i.e. it has a specificity of 100 - 10 = 90%). Combining these figures makes it possible to calculate the probability, or predictive power, that each alternative will correctly identify the disease if present or dismiss it if absent. A comparison of these probabilities, and of the penalties involved in a wrong diagnosis, helps the physician to analyze the diagnostic process, which often involves not just one but many possible tests, and to choose an optimal diagnostic strategy (these calculations are based on Bayes' theorem, a fundamental tool for drawing inferences of probabilistic nature from empirical observations, established as early as the mid-18th century by the Reverend Thomas Bayes).
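As an illustration of how Bayes' theorem combines these figures, the sketch below uses the sensitivity (70%) and specificity (90%) quoted for the stress test. The prior probability of disease is taken as exactly 5%, which is an assumption made for the example (the text says only `less than 5%'); the function name is likewise invented here.

```python
def predictive_values(prior, sensitivity, specificity):
    """Return (P(disease | positive test), P(no disease | negative test))."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    true_neg = (1 - prior) * specificity
    false_neg = prior * (1 - sensitivity)
    positive_predictive_value = true_pos / (true_pos + false_pos)
    negative_predictive_value = true_neg / (true_neg + false_neg)
    return positive_predictive_value, negative_predictive_value

# Assumed prior of 5%; sensitivity and specificity as given for the stress test
ppv, npv = predictive_values(prior=0.05, sensitivity=0.70, specificity=0.90)
print(f"P(disease | positive test)    = {ppv:.1%}")
print(f"P(no disease | negative test) = {npv:.1%}")
```

Under these assumed inputs a positive test raises the probability of disease from 5% to roughly 27%, while a negative test leaves it below 2%. A full decision tree would also weigh the penalties attached to each kind of error.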

Prevention and early diagnosis

In a strict technical sense, `prevention' denotes the activities aimed at directly modifying the root determinants of disease, which fall only into two broad categories: genes and environment, or in more archaic wording `nature and nurture'. Early diagnosis, on the other hand, aims at detecting and treating diseases before they become manifest through symptoms. These two neatly separated activities, both organized at the level of the whole population, have, however, a major bridge in the diagnosis of host risk factors, like high blood cholesterol or high blood pressure, that are not yet `diseases' but increase the chance of disease occurrence; on the one side, the host risk factors share this property with a person's genes predisposing to disease, while on the other they are themselves the result, like early disease, of a complex interplay of genes and environment. Some early disease diagnosis tests are carried out as `opportunistic screening tests' by individual doctors when they examine a patient: for instance, the PSA test for prostate cancer discussed in Chapter 5 has become, rightly or wrongly, popular in several developed countries even in the absence of firm evidence of net benefit. Only screenings for which this evidence exists do, however, qualify for systematic adoption in the population in the form of `organized screening programmes', such as those for colon cancer or for cervical and breast cancer in women, now implemented on a substantial scale in many countries. Screening programmes aimed at early diagnosis in apparently healthy populations are evaluated in the same ways as the diagnostic procedures in symptomatic patients previously discussed.

Programmes for different diseases can be compared, as can different alternatives within a programme, for instance screening for cervical cancer using either the cytological `Pap test' or the assay detecting the human papilloma virus. For this purpose, indexes such as the predictive power and the number needed to screen (NNS) are calculated. The latter is closely similar to the number needed to treat (NNT) and tells how many subjects one needs to test in order to avoid one death or other major adverse event within a period of time. It depends not only, as NNT does, on the probability that a treatment successfully avoids death but also on the probability that an apparently healthy subject turns out to have the disease without symptoms. NNS are usually in the range from several hundreds to, more often, several thousands.
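One rough way to see how the two probabilities combine is to divide the NNT among detected cases by the proportion of screened people who carry the undetected disease. This is a simplified approximation, assuming a screening test that finds every such case and that all cases found are treated. The function name and the figures used (NNT of 20, prevalence of 1%) are hypothetical.

```python
def number_needed_to_screen(nnt_among_cases, prevalence_undetected):
    """Approximate NNS as NNT divided by the prevalence of undetected disease.

    Simplifying assumptions of this sketch: the test misses no case, and
    every case found receives the treatment to which the NNT refers.
    """
    if not 0 < prevalence_undetected <= 1:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    return nnt_among_cases / prevalence_undetected

# Hypothetical figures: NNT of 20 among cases found, 1% undetected prevalence
nns = number_needed_to_screen(nnt_among_cases=20, prevalence_undetected=0.01)
print(f"Approximate number needed to screen: {nns:.0f}")
```

Because the prevalence of symptomless disease in an apparently healthy population is usually small, the NNS comes out far larger than the NNT.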

Screening for host factors, genetic or acquired, that may predispose to a disease stands on the basic assumption that subjects who will develop the disease can be distinguished from subjects who will not, so that any preventive intervention, for example a change in diet, can be concentrated on the former (should the distinction prove impossible, there would be no point in screening and any intervention would simply need to be applied to everybody). Looking closely at one of these risk factors, blood cholesterol, throws light on how far the basic assumption is justified and illustrates at the same time some general principles of prevention, taken in the wide and generic sense of any measure able to prevent at any point the progression from health to disease and death. Today, few will be surprised if a heavy smoker comes down with lung cancer. Many may be surprised, however, if told that avoiding heavy smoking will not wipe out the burden of lung cancer in the population because a substantial number of cases occur in fact in people who regularly smoke only moderately. People with frankly anomalous cholesterol levels, say above 6.5 millimoles per liter, represent 6 + 3 + 2 = 11% of the population in which it has been found that 13 + 9 + 8 = 30% of the deaths from heart attacks occur (in case you feel more comfortable with milligrams per 100 milliliters, 6.5 millimoles per liter is about 250 milligrams per 100 milliliters). Intervening on this `high-risk' fraction of the population, about one-tenth of the total, would prevent (assuming an intervention that is 100% effective) just one-third of the deaths, leaving untouched the other two-thirds. Why these disappointing results? Because the risk is not concentrated solely in people `at high risk', with cholesterol levels above 6.5 millimoles per liter, but involves everybody to some degree. As cholesterol levels increase over the very lowest levels (category 0-3.9), the risk of disease increases by small increments, with no abrupt jumps.
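The arithmetic of the high-risk strategy can be laid out explicitly. The percentages are those quoted above for the three cholesterol categories above 6.5 millimoles per liter; the function name and the `effectiveness' parameter are illustrative additions.

```python
def high_risk_strategy(population_pct, deaths_pct, effectiveness=1.0):
    """Return (% of population treated, % of deaths prevented, % of deaths remaining)."""
    treated = sum(population_pct)
    prevented = effectiveness * sum(deaths_pct)
    remaining = 100 - prevented
    return treated, prevented, remaining

# Shares (in per cent) for the three categories above 6.5 millimoles per liter
high_risk_population_pct = [6, 3, 2]
high_risk_deaths_pct = [13, 9, 8]

treated, prevented, remaining = high_risk_strategy(
    high_risk_population_pct, high_risk_deaths_pct
)
print(f"Treated: {treated}% of the population")
print(f"Deaths prevented: {prevented:.0f}%, deaths remaining: {remaining:.0f}%")
```

Even with a perfectly effective intervention, about 70% of the deaths occur outside the high-risk group and are left untouched. Lowering `effectiveness' to a more realistic value shrinks the prevented share further.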

As a consequence, the many people with only modest elevations in cholesterol who are also at a modestly increased risk produce more cases of heart attacks than the minority of people at high risk. This `paradox of prevention' implies that the bulk of cases could be prevented by moderately reducing the cholesterol level, hence the risk, of everybody. Abating cholesterol only in people with high levels is certainly beneficial to them but cannot do the public health job of preventing the mass of cases in the population. Many disease determinants have been found to increase the risk of some diseases in a smooth, continuous way like cholesterol, for example blood pressure for heart attacks, hydraulic pressure in the eye for glaucoma, or alcohol consumption for cancer of the oesophagus or liver cirrhosis. The graded distribution over the whole population of risk generated by these determinants, rather than its exclusive concentration in some groups, stresses their role as population disease determinants in contrast to individual determinants. The susceptibility of each person, rooted in their genetic make-up, plays - as does chance - a role in determining who becomes diseased, but the number affected will depend to a major extent on the population determinants. For example, there are no known populations with a high frequency of heart attacks without also an average (over the whole population) high level of cholesterol. The next question then becomes: why do population determinants differ from one population to another? Cholesterol level is diet dependent and, like alcohol consumption, is conditioned by available foods (or alcoholic drinks), traditional tastes, and behavior influenced by marketing and by economic constraints.
For infectious diseases, the proportion of people vaccinated is a typical population determinant of how often a disease will occur, because vaccinated people do not fall ill and at the same time they interrupt the chain of transmission of the contagion. For most diseases, multiple, rather than single, determinants are recognized. For example, blood cholesterol level, blood pressure, tobacco smoking, diabetes, and obesity are main population determinants of heart attacks. Interventions acting in turn on these determinants aim at promoting healthy habits, behaviors, and foods, and at limiting the availability of harmful products. This population strategy of prevention, based on a variable mix of incentives, education, and regulation, is beneficial to everybody, whatever one's known or unknown susceptibility or level of risk. It can be complemented by specific preventive actions, often involving the use of drugs (e.g. to lower cholesterol or blood pressure) for people known to be at definitely high risk. Recently the idea has dawned that a combination in a single pill (`polypill') of low doses of several drugs controlling cholesterol level, blood pressure, and blood clotting propensity could be used in a population prevention strategy by offering it to most or all middle-aged and older people. Whether this is an effective, safe, and realistic possibility remains to be explored. The general principle is that before being launched on a grand scale, a preventive measure must have been clearly shown to work. This involves research covering a large number of disease determinants, from proximate biological and genetic factors, to personal behavior traits, and to the `determinants of the determinants' operating at the level of the social or of the global environment.

Attention to the global environment has markedly increased in recent years. Localized `heat waves' have caused clearly documented excesses of mortality, and fluctuations in urban air pollutants, especially fine particulates, have been shown to increase hospital admissions for respiratory and cardiovascular ailments and to precipitate deaths from a variety of causes. Protocols to prevent these adverse effects, which affect in particular vulnerable, already sick people, have been put in place in a number of countries. In contrast to these meteorological episodes, the health consequences of the foreseen global climatic change are a completely new chapter for epidemiological investigation. A likely temperature increase of anything between 2°C and 5°C by the end of this century may be reflected in a sea-level elevation of 20 centimeters to 60 centimeters, involving a change in coastlines with consequent exposure of populations to flooding, already regularly experienced in a country like Bangladesh. Tropical cyclones, to which more than 300 million people are currently exposed, are expected to become more intense. The biological cycles of parasites are sensitive to climate changes, so that hundreds of millions of additional people will be infected by diseases like malaria. A further likely consequence is increased under-nutrition caused by droughts and rural poverty that, like the other sequels of climatic change, will induce mass migrations, themselves a source of severe health problems (as just one example, keeping well controlled a serious case of diabetes, a delicate but everyday routine task in developed countries, may become hopeless in a moving refugee population).
Today, these effects can be identified but their probable impact on health (currently quite modest) remains to be quantified through research that combines available epidemiological data, for example on malaria in different regions, with models simulating how the disease may evolve under various hypotheses of temperature and other environmental changes.



