Bias, Confounding & Causation

Community Medicine · Epidemiology · lean revision notes

Epidemiological studies aim to estimate the true association between an exposure and an outcome. Bias and confounding are the two great enemies of that truth, while the Bradford Hill criteria help us decide when an observed association is actually causal. This is a perennial NEET PG favourite, almost always asked as a clinical/research vignette demanding you name the specific error operating.

1. The big picture: why associations can be false

When a study reports an association, there are four possible explanations. You must rule out the first three before claiming causation.

Observed association → Is it chance? → Is it bias? → Is it confounding? → Only then: causation.

Explanation	Nature	Controlled by
Chance (random error)	Random, due to sampling	Larger sample size, statistical testing (p-value, CI)
Bias (systematic error)	Systematic, built into design/conduct	Good study design; cannot be fixed in analysis
Confounding	Systematic, due to a third variable	Design or analysis stage
True causation	Real biological relationship	Establish via Bradford Hill

High-yield: Random error → affects precision (corrected by ↑ sample size). Systematic error (bias) → affects validity/accuracy (cannot be corrected by increasing sample size). This precision-vs-validity distinction is heavily tested.
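A small numeric sketch of this distinction, assuming a hypothetical blood-pressure survey (true mean 120 mmHg, SD 15 mmHg) taken with a cuff that reads 5 mmHg too high. The function name and every number here are illustrative assumptions, not data from a real study.

```python
import math

def ci_95_for_mean(true_mean, sd, n, systematic_offset=0.0):
    """Return (centre, half_width) of an approximate 95% CI for a sample mean.

    The centre is shifted by any systematic error (bias);
    only the half-width (precision) depends on sample size n.
    """
    centre = true_mean + systematic_offset
    half_width = 1.96 * sd / math.sqrt(n)
    return centre, half_width

for n in (100, 10_000):
    centre, half_width = ci_95_for_mean(120.0, 15.0, n, systematic_offset=5.0)
    print(f"n={n}: estimate {centre} +/- {half_width:.2f} mmHg")
```

Raising n from 100 to 10,000 narrows the interval tenfold, yet the estimate stays centred on 125 rather than 120: more subjects buy precision, not validity.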

2. Bias — classification

Bias = any systematic error in design, conduct or analysis of a study that results in a mistaken estimate of the exposure–outcome association. Broadly, three families:

Selection bias — error in selecting/retaining study subjects.
Information (measurement/observation) bias — error in measuring exposure or outcome.
Confounding — traditionally listed separately (a distortion by a third factor), discussed below.

2A. Selection bias

Arises when the subjects studied are not representative of the target population, or when comparison groups differ systematically in ways related to both exposure and outcome.

Type	Setting	Classic description
Berkson's bias (admission rate bias)	Hospital-based case-control	Cases & controls selected from hospital; differential admission rates create spurious association between two unrelated diseases
Neyman bias (prevalence–incidence / survival bias)	Case-control using prevalent (survivor) cases	Rapidly fatal or quickly-cured cases are missed; only survivors studied → distorted exposure
Healthy worker effect	Occupational cohorts	Workers healthier than general population (sick people don't get/keep jobs) → underestimates harm
Non-response / volunteer bias	Surveys, screening	Responders/volunteers differ from non-responders
Loss to follow-up (attrition) bias	Cohort/RCT	Dropouts differ from those retained
Ascertainment / detection bias	Screening, surveillance	Exposed group monitored more closely → more disease detected

High-yield: Berkson's = two hospitalised groups, spurious association. Neyman = prevalent/surviving cases, the rapidly fatal ones are lost. These two are the most commonly confused — Berkson is about admission, Neyman is about survival.
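Berkson's bias can be reproduced with simple expected counts. The sketch below assumes two diseases, A and B, that are truly independent in a hypothetical population of 10,000, and assumes each reason for admission (disease A, disease B, or any other cause) acts independently with made-up admission probabilities. All names and numbers are illustrative.

```python
def odds_ratio(both, a_only, b_only, neither):
    """Odds ratio for the A-by-B 2x2 table."""
    return (both * neither) / (a_only * b_only)

def expected_admitted(count, *admission_probs):
    """Expected admissions when each listed reason independently leads to admission."""
    p_stays_out = 1.0
    for p in admission_probs:
        p_stays_out *= 1.0 - p
    return count * (1.0 - p_stays_out)

# Source population: A and B each affect 10%, independently (OR = 1).
population = {"both": 100, "a_only": 900, "b_only": 900, "neither": 8100}

# Hypothetical admission probabilities per reason.
P_ADMIT_A, P_ADMIT_B, P_ADMIT_OTHER = 0.5, 0.2, 0.1

hospital = {
    "both": expected_admitted(population["both"], P_ADMIT_A, P_ADMIT_B, P_ADMIT_OTHER),
    "a_only": expected_admitted(population["a_only"], P_ADMIT_A, P_ADMIT_OTHER),
    "b_only": expected_admitted(population["b_only"], P_ADMIT_B, P_ADMIT_OTHER),
    "neither": expected_admitted(population["neither"], P_ADMIT_OTHER),
}

population_or = odds_ratio(**population)
hospital_or = odds_ratio(**hospital)
print(f"population OR = {population_or:.2f}, hospital OR = {hospital_or:.2f}")
```

Under these assumptions the population odds ratio is 1.00 but the hospital odds ratio is about 0.42: differential admission alone has manufactured an association between two unrelated diseases.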

2B. Information (measurement) bias

Arises from incorrect measurement of exposure or outcome.

Type	Description	Example
Recall bias	Cases recall past exposure better than controls	Mothers of malformed babies recall drug intake more
Interviewer / observer bias	Interviewer probes exposed/cases differently	Knowing case status, interviewer asks leading questions
Reporting bias	Subjects under/over-report sensitive info	Under-reporting alcohol, smoking, sexual history
Misclassification bias	Subjects placed in wrong exposure/outcome category	Faulty diagnostic test labels diseased as healthy
Hawthorne effect	Subjects change behaviour because observed	Hygiene improves once participants know they are watched
Lead-time bias	Screening detects disease earlier → apparent ↑ survival without true benefit	Survival "from diagnosis" longer though death date unchanged
Length bias	Screening preferentially detects slow-growing (indolent) disease	Indolent cancers over-represented in screen-detected group

High-yield: Recall bias is the classic limitation of case-control studies. Lead-time and length bias are the classic limitations of screening programmes — distinguish them: lead-time = earlier detection clock-starting; length = slow tumours preferentially caught.
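Lead-time bias is easiest to see as arithmetic. The ages below describe one hypothetical patient whose date of death is not changed by screening; the variable names and values are assumptions for illustration only.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years survived, counted from the moment of diagnosis."""
    return age_at_death - age_at_diagnosis

AGE_AT_DEATH = 70               # identical with or without screening
AGE_SYMPTOMATIC_DIAGNOSIS = 67  # diagnosed when symptoms appear
AGE_SCREEN_DIAGNOSIS = 63       # diagnosed earlier by screening

unscreened = survival_from_diagnosis(AGE_SYMPTOMATIC_DIAGNOSIS, AGE_AT_DEATH)
screened = survival_from_diagnosis(AGE_SCREEN_DIAGNOSIS, AGE_AT_DEATH)
lead_time = AGE_SYMPTOMATIC_DIAGNOSIS - AGE_SCREEN_DIAGNOSIS

print(unscreened, screened, lead_time)  # 3 7 4
print("true gain in life:", (screened - unscreened) - lead_time)  # 0
```

The apparent survival gain (7 versus 3 years) equals the lead time exactly, so the true benefit in this example is zero. This is why screening is better judged by mortality rates than by survival from diagnosis.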

Misclassification — differential vs non-differential

Non-differential (random) misclassification — error equal across groups → biases estimate towards the null (dilutes a real effect). Tested fact.
Differential misclassification — error differs by group (e.g., recall bias) → can bias in either direction (towards or away from null).

High-yield: Non-differential misclassification → almost always biases towards the null (RR/OR pushed towards 1). This is a frequent one-liner MCQ.
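The direction of misclassification bias can be checked with expected counts. This sketch assumes a hypothetical case-control study with a binary exposure and applies made-up sensitivity and specificity values for exposure classification; the function names and figures are illustrative.

```python
def observed_counts(exposed, unexposed, sensitivity, specificity):
    """Expected (exposed, unexposed) counts after imperfect exposure classification."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    obs_unexposed = (exposed + unexposed) - obs_exposed
    return obs_exposed, obs_unexposed

def odds_ratio(cases, controls):
    """cases and controls are (exposed, unexposed) pairs."""
    return (cases[0] * controls[1]) / (cases[1] * controls[0])

# Non-differential: same error in cases and controls, true OR = 4.
true_cases, true_controls = (200, 100), (100, 200)
nd_or = odds_ratio(
    observed_counts(*true_cases, sensitivity=0.8, specificity=0.9),
    observed_counts(*true_controls, sensitivity=0.8, specificity=0.9),
)

# Differential (recall-bias pattern): cases recall better than controls, true OR = 1.
null_cases, null_controls = (100, 100), (100, 100)
diff_or = odds_ratio(
    observed_counts(*null_cases, sensitivity=0.9, specificity=1.0),
    observed_counts(*null_controls, sensitivity=0.6, specificity=1.0),
)

print(f"non-differential: true OR 4.00 -> observed {nd_or:.2f}")
print(f"differential:     true OR 1.00 -> observed {diff_or:.2f}")
```

With equal error in both groups the odds ratio is diluted from 4 to about 2.6 (towards 1). With better recall among cases, a true null result becomes an odds ratio of about 1.9 (away from 1).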

3. Confounding

A confounder is a third variable that is associated with the exposure and is an independent risk factor for the outcome, and is not an intermediate step in the causal pathway.

Three criteria for a confounder (all must be met):

Associated with the exposure (in the source population).
Independent risk factor for the outcome.
Not an intermediate variable on the causal pathway between exposure and outcome.

Classic example: Coffee–lung cancer association is confounded by smoking (coffee drinkers smoke more; smoking causes lung cancer). Once you adjust for smoking, the coffee association vanishes.

High-yield: Age and smoking are the two most common confounders in NEET PG vignettes. If a question shows an association that disappears after "adjustment," the third variable is a confounder.
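The coffee and smoking example can be made concrete with a hypothetical cohort. The counts below are invented so that coffee has no effect within either smoking stratum, while coffee drinkers are far more likely to smoke; none of these numbers come from a real study.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Relative risk of lung cancer, coffee drinkers vs non-drinkers."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# (cases in coffee drinkers, coffee drinkers, cases in non-drinkers, non-drinkers)
smokers = (80, 800, 20, 200)      # 10% risk in both coffee groups
non_smokers = (2, 200, 8, 800)    # 1% risk in both coffee groups

# Crude table: ignore smoking and add the strata together.
crude = tuple(s + n for s, n in zip(smokers, non_smokers))

rr_crude = risk_ratio(*crude)
rr_smokers = risk_ratio(*smokers)
rr_non_smokers = risk_ratio(*non_smokers)

print(f"crude RR = {rr_crude:.2f}")
print(f"smokers RR = {rr_smokers:.2f}, non-smokers RR = {rr_non_smokers:.2f}")
```

The crude relative risk is about 2.9, yet it is 1.0 in smokers and 1.0 in non-smokers. The whole crude association comes from the unequal distribution of smoking between coffee groups.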

Confounding vs effect modification (interaction)

Feature	Confounding	Effect modification (interaction)
What it is	A nuisance to be removed	A real biological phenomenon to be described
Effect on estimate	Distorts the true measure	Different effect size across strata
Stratified analysis	Stratum-specific estimates ≈ each other; adjusted (pooled) estimate differs from crude	Stratum-specific estimates differ from each other
Action	Control/remove it	Report it stratum-wise (don't "remove")

High-yield: In effect modification, stratum-specific RRs differ from each other. In confounding, stratum-specific RRs are similar to each other but differ from the crude estimate.
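The decision rule above can be written as a rough classifier. This is a teaching sketch only: the function name is hypothetical, and the 10% threshold is an assumed rule of thumb (a formal analysis would use a test of homogeneity such as Breslow-Day rather than a fixed cut-off).

```python
def interpret_strata(crude_rr, stratum_rrs, tolerance=0.10):
    """Classify a stratified analysis using a simple relative-difference threshold.

    1. Strata differ from each other      -> effect modification (report stratum-wise).
    2. Strata agree but differ from crude -> confounding (report adjusted estimate).
    3. Otherwise                          -> neither (crude estimate is acceptable).
    """
    lowest, highest = min(stratum_rrs), max(stratum_rrs)
    if (highest - lowest) / lowest > tolerance:
        return "effect modification"
    typical = sum(stratum_rrs) / len(stratum_rrs)
    if abs(crude_rr - typical) / typical > tolerance:
        return "confounding"
    return "neither"

print(interpret_strata(2.93, [1.0, 1.0]))   # confounding
print(interpret_strata(2.0, [1.0, 4.0]))    # effect modification
print(interpret_strata(2.0, [2.0, 2.05]))   # neither
```

Homogeneity across strata is checked first, because a single adjusted estimate is only meaningful when the strata tell the same story.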

Methods to control confounding

Design stage (before/during data collection):

Randomisation — gold standard; distributes known and unknown confounders equally (only in RCTs).
Restriction — limit study to one category (e.g., only non-smokers); reduces generalisability.
Matching — pair cases and controls on confounder (age, sex). Needs matched analysis (McNemar / conditional logistic regression).

Analysis stage (after data collected):

Stratification — analyse within strata; use Mantel–Haenszel to pool adjusted estimate.
Multivariate analysis — e.g., multiple logistic regression, Cox proportional hazards; adjusts for several confounders simultaneously.
Standardisation — direct/indirect, classically for age.

Flow — controlling confounding: Design phase → Randomisation / Restriction / Matching → Analysis phase → Stratification (Mantel–Haenszel) / Multivariate regression / Standardisation.
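A minimal sketch of the Mantel-Haenszel pooled odds ratio, computed as the sum of a·d/n over strata divided by the sum of b·c/n over strata. The two smoking strata below are hypothetical case-control counts chosen so that each stratum has an odds ratio of 2; the function names are illustrative.

```python
def mantel_haenszel_or(strata):
    """Pooled OR. Each stratum is (a, b, c, d):
    a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

def crude_or(strata):
    """OR from the collapsed table, ignoring the stratifying variable."""
    a, b, c, d = (sum(column) for column in zip(*strata))
    return (a * d) / (b * c)

strata = [
    (80, 20, 120, 60),   # smokers:     OR = (80*60)/(20*120) = 2
    (10, 40, 30, 240),   # non-smokers: OR = (10*240)/(40*30) = 2
]

print(f"crude OR = {crude_or(strata):.2f}")                         # 3.00
print(f"Mantel-Haenszel OR = {mantel_haenszel_or(strata):.2f}")     # 2.00
```

The crude odds ratio (3.0) overstates the association; the pooled estimate (2.0) matches the stratum-specific values, which is the pattern expected when the stratifying variable is a confounder.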

High-yield: Randomisation is the only method that controls unknown/unmeasured confounders — hence the RCT's supremacy. Restriction, matching, stratification and regression can only address known/measured confounders. This is the single most repeated fact in this topic.

Mnemonic for confounder control — "RM-SMS": Restriction, Matching, Stratification, Multivariate, Standardisation (+ Randomisation as the design overlord).
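Direct standardisation, the last item in that list, can be sketched with two hypothetical towns that share identical age-specific death rates but have different age structures. The towns, the standard population and all rates are invented for illustration.

```python
def crude_rate_per_1000(pop_by_age, rate_by_age):
    """Crude death rate: total deaths over total population."""
    deaths = sum(pop_by_age[g] * rate_by_age[g] for g in pop_by_age)
    return 1000 * deaths / sum(pop_by_age.values())

def directly_standardised_rate_per_1000(rate_by_age, standard_pop):
    """Apply the town's age-specific rates to a common standard population."""
    expected = sum(standard_pop[g] * rate_by_age[g] for g in standard_pop)
    return 1000 * expected / sum(standard_pop.values())

rates = {"young": 0.002, "old": 0.020}        # same age-specific rates in both towns
town_a_pop = {"young": 8000, "old": 2000}     # young town
town_b_pop = {"young": 2000, "old": 8000}     # old town
standard_pop = {"young": 6000, "old": 4000}   # hypothetical standard population

crude_a = crude_rate_per_1000(town_a_pop, rates)
crude_b = crude_rate_per_1000(town_b_pop, rates)
standardised = directly_standardised_rate_per_1000(rates, standard_pop)

print(f"crude: A = {crude_a:.1f}, B = {crude_b:.1f} per 1000")
print(f"age-standardised (both towns) = {standardised:.1f} per 1000")
```

The crude rates (5.6 versus 16.4 per 1000) suggest town B is far less healthy, but the standardised rate is 9.2 per 1000 for both: the crude difference was entirely due to age.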

4. Validity & reliability (linked concept)

Often paired with bias in MCQs.

Validity (accuracy) = measures what it intends to; threatened by bias (systematic error).
- Internal validity — results true for the study population.
- External validity (generalisability) — results applicable to wider population.
Reliability (precision/repeatability) = consistency on repetition; threatened by random error.

High-yield: A study can be highly reliable but invalid (consistently wrong — e.g., a miscalibrated weighing scale). Validity ≠ reliability. Restriction improves internal validity but reduces external validity.
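The miscalibrated-scale example can be put in numbers. The readings below are hypothetical repeated weighings of a person whose true weight is assumed to be 70 kg.

```python
import statistics

TRUE_WEIGHT_KG = 70.0
readings_kg = [72.1, 72.0, 71.9, 72.0]  # hypothetical repeated readings

# Validity: how far the average reading sits from the truth (systematic error).
bias_kg = statistics.mean(readings_kg) - TRUE_WEIGHT_KG

# Reliability: how tightly the readings cluster (random error).
spread_kg = statistics.stdev(readings_kg)

print(f"bias = {bias_kg:.1f} kg, spread (SD) = {spread_kg:.2f} kg")
```

The readings agree with each other to within a fraction of a kilogram (high reliability) but all sit about 2 kg above the truth (poor validity). Repeating the measurement more often would not remove that offset.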

5. Bradford Hill criteria for causation

Once chance, bias and confounding have been ruled out, the nine viewpoints proposed by Sir Austin Bradford Hill (1965) help judge whether an association is causal. None is individually sufficient; temporality is the only absolute requirement.

#	Criterion	Meaning
1	Temporality	Cause must precede effect (only essential criterion)
2	Strength	Larger RR/OR → more likely causal
3	Consistency	Repeatable across studies, populations, settings
4	Biological gradient	Dose–response relationship
5	Specificity	One cause → one effect (weakest, often violated)
6	Biological plausibility	Consistent with known biology
7	Coherence	Doesn't conflict with natural history/known facts
8	Experiment / reversibility	Removing exposure ↓ disease
9	Analogy	Similar agents cause similar effects

High-yield: Temporality is the sine qua non — the only criterion that MUST be satisfied. Strength and biological gradient (dose-response) are strong supporters. Specificity is the weakest/least useful (most diseases are multifactorial; smoking causes many diseases).

Mnemonic — "Timmy's Strong Consistent Biological Specific Plausible Coherent Experiment Analogy" or simply remember the lead trio: Temporality (must), Strength, Dose-response.

High-yield: A cohort study and an RCT can establish temporality (exposure measured before outcome). A case-control study cannot reliably establish temporality — a key reason it's weaker for causation.

6. Worked vignette logic (how MCQs phrase it)

"Cases and controls both drawn from a hospital, spurious link found" → Berkson's bias.
"Only surviving patients of MI studied; fatal cases missed" → Neyman (survival) bias.
"Mothers of children with birth defects recall drug use better" → Recall bias.
"Screen-detected cancers appear to survive longer though death timing unchanged" → Lead-time bias.
"Screening picks up mostly slow-growing tumours" → Length bias.
"Factory workers healthier than general public" → Healthy worker effect.
"Association of coffee with CHD disappears after adjusting for smoking" → Confounding (smoking).
"Effect of drug differs in diabetics vs non-diabetics" → Effect modification.
"Behaviour improved because subjects knew they were observed" → Hawthorne effect.

7. Complications / consequences of unaddressed errors

Spurious associations leading to wrong public-health policy.
Masking of true effects (non-differential misclassification dilutes real RR towards null).
Misleading screening "benefit" (lead-time/length bias inflating apparent survival).
Non-reproducible research and wasted resources.
Harm to patients if causal claims are acted on prematurely.

8. Key differentials / discriminators (don't confuse these)

Pair	Key discriminator
Selection vs information bias	Who you pick vs how you measure
Berkson vs Neyman	Admission rate vs survival
Confounding vs effect modification	Remove it vs report it (strata similar to each other but different from crude vs strata differ from each other)
Lead-time vs length bias	Earlier detection vs indolent-tumour over-detection
Random vs systematic error	Precision/sample-size vs validity/design
Validity vs reliability	Accuracy vs repeatability

Recently asked / exam angle

Identify the specific bias in a described case-control study (Berkson's, Neyman, recall) — the single most common stem.
Only criterion essential for causation → Temporality (repeated almost every cycle).
Best/only method to control unknown confounders → Randomisation.
Non-differential misclassification biases towards → the null.
Direction of bias from non-differential (random) vs differential misclassification.
Weakest Bradford Hill criterion → Specificity.
Distinguishing confounding from effect modification using stratified RR tables.
Healthy worker effect in occupational epidemiology stems.
Lead-time vs length bias in screening-programme questions.
Mantel–Haenszel as the technique for pooled stratum-adjusted estimate.
Which study design controls confounding at the design vs analysis stage.

Rapid revision

Random error → precision (fix with sample size); systematic error/bias → validity (fix with design, NOT sample size).
Berkson's bias = both groups from hospital → spurious association (admission rate bias).
Neyman bias = prevalence-incidence/survival bias; fatal cases missed in case-control.
Recall bias = classic flaw of case-control studies.
Lead-time bias (earlier detection) and length bias (indolent tumours) plague screening.
Healthy worker effect underestimates occupational harm.
A confounder must satisfy 3 criteria: linked to exposure, independent risk factor for outcome, NOT on causal pathway.
Non-differential misclassification → bias towards the null (dilutes true effect).
Randomisation is the ONLY method controlling unknown confounders.
Restriction/matching = design stage; stratification (Mantel–Haenszel) / multivariate regression = analysis stage.
Effect modification → stratum-specific RRs differ from each other (report, don't remove).
Temporality is the ONLY essential Bradford Hill criterion; specificity is the weakest; strength + dose-response are strong supporters.
