Bias, Confounding & Causation

Community Medicine · Epidemiology · lean revision notes

Epidemiological studies aim to estimate the true association between an exposure and an outcome. Bias and confounding are the two great enemies of that truth, while the Bradford Hill criteria help us decide when an observed association is actually causal. This is a perennial NEET PG favourite, almost always asked as a clinical/research vignette demanding you name the specific error operating.


1. The big picture: why associations can be false

When a study reports an association, there are four possible explanations. You must rule out the first three before claiming causation.

Observed association → Is it chance? → Is it bias? → Is it confounding? → Only then: causation.

Explanation | Nature | Controlled by
Chance (random error) | Random, due to sampling | Larger sample size, statistical testing (p-value, CI)
Bias (systematic error) | Systematic, built into design/conduct | Good study design; cannot be fixed in analysis
Confounding | Systematic, due to a third variable | Design or analysis stage
True causation | Real biological relationship | Establish via Bradford Hill

High-yield: Random error → affects precision (corrected by ↑ sample size). Systematic error (bias) → affects validity/accuracy (cannot be corrected by increasing sample size). This precision-vs-validity distinction is heavily tested.
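
Worked sketch of the precision-vs-validity point. All numbers are hypothetical (a blood-pressure cuff assumed to read 5 mmHg high), and the function name is illustrative only. Raising n shrinks the standard error, but the systematic offset stays exactly where it was.

```python
import math

def expected_estimate_and_se(true_value, systematic_offset, sd, n):
    """Return (expected sample mean, standard error of the mean)."""
    expected = true_value + systematic_offset   # bias shifts the centre
    standard_error = sd / math.sqrt(n)          # random error shrinks with n
    return expected, standard_error

# Hypothetical: true mean 120 mmHg, cuff reads 5 mmHg high, SD 15 mmHg.
small = expected_estimate_and_se(120, 5, 15, n=100)
large = expected_estimate_and_se(120, 5, 15, n=10_000)

for label, (estimate, se) in (("n=100", small), ("n=10000", large)):
    print(f"{label}: expected estimate {estimate} mmHg, SE {se:.2f}")
```

Both runs centre on 125 mmHg rather than 120; only the spread around that wrong value narrows.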


2. Bias — classification

Bias = any systematic error in design, conduct or analysis of a study that results in a mistaken estimate of the exposure–outcome association. Broadly, three families:

  1. Selection bias — error in selecting/retaining study subjects.
  2. Information (measurement/observation) bias — error in measuring exposure or outcome.
  3. Confounding — traditionally listed separately (a distortion by a third factor), discussed below.

2A. Selection bias

Arises when the subjects studied are not representative of the target population, or when comparison groups differ systematically in ways related to both exposure and outcome.

Type | Setting | Classic description
Berkson's bias (admission rate bias) | Hospital-based case-control | Cases & controls selected from hospital; differential admission rates create spurious association between two unrelated diseases
Neyman bias (prevalence–incidence / survival bias) | Case-control using prevalent (survivor) cases | Rapidly fatal or quickly-cured cases are missed; only survivors studied → distorted exposure
Healthy worker effect | Occupational cohorts | Workers healthier than general population (sick people don't get/keep jobs) → underestimates harm
Non-response / volunteer bias | Surveys, screening | Responders/volunteers differ from non-responders
Loss to follow-up (attrition) bias | Cohort/RCT | Dropouts differ from those retained
Ascertainment / detection bias | Screening, surveillance | Exposed group monitored more closely → more disease detected

High-yield: Berkson's = two hospitalised groups, spurious association. Neyman = prevalent/surviving cases, the rapidly fatal ones are lost. These two are the most commonly confused — Berkson is about admission, Neyman is about survival.
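
Worked sketch of Berkson's bias with made-up numbers. Two diseases A and B are independent in the source population (true OR = 1), but each carries its own assumed admission probability. Every count and probability below is hypothetical, and the admission model (independent reasons for admission) is a simplifying assumption.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a=both, b=A only, c=B only, d=neither."""
    return (a * d) / (b * c)

# Hypothetical source population of 10,000: A and B each have 10%
# prevalence and are independent. Keys are (has_a, has_b).
population = {
    (True, True): 100,
    (True, False): 900,
    (False, True): 900,
    (False, False): 8100,
}

# Assumed admission probabilities, acting independently of each other.
P_ADMIT_A, P_ADMIT_B, P_ADMIT_OTHER = 0.5, 0.2, 0.05

def p_admitted(has_a, has_b):
    """Probability of being hospitalised for at least one reason."""
    p_not_admitted = 1 - P_ADMIT_OTHER
    if has_a:
        p_not_admitted *= 1 - P_ADMIT_A
    if has_b:
        p_not_admitted *= 1 - P_ADMIT_B
    return 1 - p_not_admitted

# Expected number of hospitalised people in each cell.
hospital = {key: n * p_admitted(*key) for key, n in population.items()}

def table_or(table):
    return odds_ratio(table[(True, True)], table[(True, False)],
                      table[(False, True)], table[(False, False)])

population_or = table_or(population)
hospital_or = table_or(hospital)
print(f"Population OR: {population_or:.2f}")
print(f"Hospital OR:   {hospital_or:.2f}")
```

With these particular rates the hospital sample shows an OR of about 0.25, a spurious inverse association between two unrelated diseases. Other admission patterns distort the OR differently.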

2B. Information (measurement) bias

Arises from incorrect measurement of exposure or outcome.

Type | Description | Example
Recall bias | Cases recall past exposure better than controls | Mothers of malformed babies recall drug intake more
Interviewer / observer bias | Interviewer probes exposed/cases differently | Knowing case status, interviewer asks leading questions
Reporting bias | Subjects under/over-report sensitive info | Under-reporting alcohol, smoking, sexual history
Misclassification bias | Subjects placed in wrong exposure/outcome category | Faulty diagnostic test labels diseased as healthy
Hawthorne effect | Subjects change behaviour because observed | Hygiene improves once participants know they are watched
Lead-time bias | Screening detects disease earlier → apparent ↑ survival without true benefit | Survival "from diagnosis" longer though death date unchanged
Length bias | Screening preferentially detects slow-growing (indolent) disease | Indolent cancers over-represented in screen-detected group

High-yield: Recall bias is the classic limitation of case-control studies. Lead-time and length bias are the classic limitations of screening programmes. Distinguish them: lead-time = the survival clock starts earlier because diagnosis is earlier; length = slow-growing tumours are preferentially caught.
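
Lead-time bias as plain arithmetic. The ages below are hypothetical; the only point is that survival "from diagnosis" lengthens while age at death does not move.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years lived after the date of diagnosis."""
    return age_at_death - age_at_diagnosis

# Hypothetical patient: screening would detect the tumour at 60,
# symptoms would lead to diagnosis at 65, death occurs at 70 either way.
AGE_SCREEN_DETECTED = 60
AGE_CLINICAL_DIAGNOSIS = 65
AGE_AT_DEATH = 70

unscreened = survival_from_diagnosis(AGE_CLINICAL_DIAGNOSIS, AGE_AT_DEATH)
screened = survival_from_diagnosis(AGE_SCREEN_DETECTED, AGE_AT_DEATH)
lead_time = AGE_CLINICAL_DIAGNOSIS - AGE_SCREEN_DETECTED

print(f"Survival without screening: {unscreened} years")
print(f"Survival with screening:    {screened} years")
print(f"Lead time (no real gain):   {lead_time} years")
```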

Misclassification — differential vs non-differential

  • Non-differential (random) misclassification — error equal across groups → biases estimate towards the null (dilutes a real effect). Tested fact.
  • Differential misclassification — error differs by group (e.g., recall bias) → can bias in either direction (towards or away from null).

High-yield: Non-differential misclassification → almost always biases towards the null (RR/OR pushed towards 1). This is a frequent one-liner MCQ.
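
Worked sketch of the dilution effect, using hypothetical case-control counts and an assumed exposure measurement with 80% sensitivity and 90% specificity applied equally to cases and controls. The helper names are illustrative only.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)

def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts after imperfect exposure measurement."""
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed

# Same error rates in both groups, so the error is non-differential.
SENS, SPEC = 0.8, 0.9

# Hypothetical true counts: (exposed, unexposed).
true_cases = (60, 40)
true_controls = (30, 70)

obs_cases = misclassify(*true_cases, SENS, SPEC)
obs_controls = misclassify(*true_controls, SENS, SPEC)

true_or = odds_ratio(true_cases[0], true_cases[1],
                     true_controls[0], true_controls[1])
observed_or = odds_ratio(obs_cases[0], obs_cases[1],
                         obs_controls[0], obs_controls[1])

print(f"True OR:     {true_or:.2f}")
print(f"Observed OR: {observed_or:.2f}")
```

Here the true OR of 3.5 is observed as roughly 2.4: still above 1, but pulled towards the null.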


3. Confounding

A confounder is a third variable that is associated with the exposure and is an independent risk factor for the outcome, and is not an intermediate step in the causal pathway.

Three criteria for a confounder (all must be met):

  1. Associated with the exposure (in the source population).
  2. Independent risk factor for the outcome.
  3. Not an intermediate variable on the causal pathway between exposure and outcome.

Classic example: Coffee–lung cancer association is confounded by smoking (coffee drinkers smoke more; smoking causes lung cancer). Once you adjust for smoking, the coffee association vanishes.

High-yield: Age and smoking are the two most common confounders in NEET PG vignettes. If a question shows an association that disappears after "adjustment," the third variable is a confounder.
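
Worked sketch of the coffee example with an invented cohort. The counts are hypothetical and chosen so that coffee has no effect within either smoking stratum, while smokers are over-represented among coffee drinkers.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """RR = risk in exposed / risk in unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)

# Each tuple: (cases in coffee drinkers, coffee drinkers,
#              cases in non-drinkers, non-drinkers). Hypothetical data.
strata = {
    "smokers": (80, 800, 20, 200),
    "non-smokers": (2, 200, 8, 800),
}

stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Crude table: ignore smoking and add the strata together.
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

for name, rr in stratum_rr.items():
    print(f"RR in {name}: {rr:.2f}")
print(f"Crude RR: {crude_rr:.2f}")
```

Both stratum-specific RRs are 1.0, yet the crude RR is about 2.9. The whole crude association comes from smoking.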

Confounding vs effect modification (interaction)

Feature | Confounding | Effect modification (interaction)
What it is | A nuisance to be removed | A real biological phenomenon to be described
Effect on estimate | Distorts the true measure | Different effect size across strata
Stratified analysis | Stratum-specific estimates ≈ each other; adjusted (pooled) estimate differs from crude | Stratum-specific estimates differ from each other
Action | Control/remove it | Report it stratum-wise (don't "remove")

High-yield: In effect modification, stratum-specific RRs differ from each other. In confounding, stratum-specific RRs are similar to each other but differ from the crude estimate.
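
For contrast, a sketch of effect modification with hypothetical trial counts: a drug that is assumed to lower event risk in diabetics but to do nothing in non-diabetics.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """RR = risk in exposed / risk in unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)

# Each tuple: (events on drug, n on drug, events off drug, n off drug).
# Hypothetical data.
strata = {
    "diabetics": (10, 100, 40, 100),
    "non-diabetics": (20, 100, 20, 100),
}

stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

for name, rr in stratum_rr.items():
    print(f"RR in {name}: {rr:.2f}")
```

The two stratum-specific RRs (0.25 and 1.00) differ from each other, so a single pooled figure would hide the real finding. Report both.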

Methods to control confounding

Design stage (before/during data collection):

  1. Randomisation — gold standard; distributes known and unknown confounders equally (only in RCTs).
  2. Restriction — limit study to one category (e.g., only non-smokers); reduces generalisability.
  3. Matching — pair cases and controls on confounder (age, sex). Needs matched analysis (McNemar / conditional logistic regression).
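
Sketch of a matched-pair analysis. In a 1:1 matched case-control study only the discordant pairs carry information: the matched OR is their ratio, and McNemar's statistic tests whether they differ. The pair counts are hypothetical and the function names are illustrative.

```python
def matched_pair_or(case_only_exposed, control_only_exposed):
    """Matched OR = ratio of the two kinds of discordant pairs."""
    return case_only_exposed / control_only_exposed

def mcnemar_chi_square(case_only_exposed, control_only_exposed):
    """McNemar statistic (1 df), without continuity correction."""
    difference = case_only_exposed - control_only_exposed
    return difference ** 2 / (case_only_exposed + control_only_exposed)

# Hypothetical discordant pairs:
# 30 pairs where only the case was exposed,
# 10 pairs where only the control was exposed.
pair_or = matched_pair_or(30, 10)
chi_square = mcnemar_chi_square(30, 10)

print(f"Matched OR: {pair_or:.1f}")
print(f"McNemar chi-square: {chi_square:.1f}")
```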

Analysis stage (after data collected):

  4. Stratification — analyse within strata; use Mantel–Haenszel to pool adjusted estimate.
  5. Multivariate analysis — e.g., multiple logistic regression, Cox proportional hazards; adjusts for several confounders simultaneously.
  6. Standardisation — direct/indirect, classically for age.

Flow — controlling confounding: Design phase → Randomisation / Restriction / Matching → Analysis phase → Stratification (Mantel–Haenszel) / Multivariate regression / Standardisation.
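
Sketch of the Mantel–Haenszel pooled odds ratio, OR_MH = Σ(a·d/n) / Σ(b·c/n) summed over strata. The two strata below are hypothetical coffee/lung-cancer tables split by smoking, and the function names are illustrative.

```python
def mantel_haenszel_or(strata):
    """Pooled OR over strata of (a, b, c, d).

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

def crude_or(strata):
    """OR from the single table obtained by adding all strata together."""
    a, b, c, d = (sum(column) for column in zip(*strata))
    return (a * d) / (b * c)

# Hypothetical data.
strata = [
    (80, 720, 20, 180),   # smokers
    (2, 198, 8, 792),     # non-smokers
]

mh_or = mantel_haenszel_or(strata)
unadjusted_or = crude_or(strata)

print(f"Crude OR:           {unadjusted_or:.2f}")
print(f"Mantel-Haenszel OR: {mh_or:.2f}")
```

The crude OR is about 3.1 while the pooled adjusted OR is 1.0, which is the pattern expected when the stratifying variable is a confounder.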

High-yield: Randomisation is the only method that controls unknown/unmeasured confounders — hence the RCT's supremacy. Restriction, matching, stratification and regression can only address known/measured confounders. This is the single most repeated fact in this topic.

Mnemonic for confounder control — "RM-SMS": Restriction, Matching, Stratification, Multivariate, Standardisation (+ Randomisation as the design overlord).


4. Validity & reliability (linked concept)

Often paired with bias in MCQs.

  • Validity (accuracy) = measures what it intends to; threatened by bias (systematic error).
    • Internal validity — results true for the study population.
    • External validity (generalisability) — results applicable to wider population.
  • Reliability (precision/repeatability) = consistency on repetition; threatened by random error.

High-yield: A study can be highly reliable but invalid (consistently wrong — e.g., a miscalibrated weighing scale). Validity ≠ reliability. Restriction improves internal validity but reduces external validity.


5. Bradford Hill criteria for causation

Once chance, bias and confounding have been ruled out, the nine viewpoints proposed by Sir Austin Bradford Hill (1965) help judge whether an association is causal. None is individually sufficient; temporality is the only absolute requirement.

# | Criterion | Meaning
1 | Temporality | Cause must precede effect (only essential criterion)
2 | Strength | Larger RR/OR → more likely causal
3 | Consistency | Repeatable across studies, populations, settings
4 | Biological gradient | Dose–response relationship
5 | Specificity | One cause → one effect (weakest, often violated)
6 | Biological plausibility | Consistent with known biology
7 | Coherence | Doesn't conflict with natural history/known facts
8 | Experiment / reversibility | Removing exposure ↓ disease
9 | Analogy | Similar agents cause similar effects

High-yield: Temporality is the sine qua non — the only criterion that MUST be satisfied. Strength and biological gradient (dose-response) are strong supporters. Specificity is the weakest/least useful (most diseases are multifactorial; smoking causes many diseases).

Mnemonic — "Timmy's Strong Consistent Biological Specific Plausible Coherent Experiment Analogy" or simply remember the lead trio: Temporality (must), Strength, Dose-response.

High-yield: A cohort study and an RCT can establish temporality (exposure measured before outcome). A case-control study cannot reliably establish temporality — a key reason it's weaker for causation.


6. Worked vignette logic (how MCQs phrase it)

  • "Cases and controls both drawn from a hospital, spurious link found" → Berkson's bias.
  • "Only surviving patients of MI studied; fatal cases missed" → Neyman (survival) bias.
  • "Mothers of children with birth defects recall drug use better" → Recall bias.
  • "Screen-detected cancers appear to survive longer though death timing unchanged" → Lead-time bias.
  • "Screening picks up mostly slow-growing tumours" → Length bias.
  • "Factory workers healthier than general public" → Healthy worker effect.
  • "Association of coffee with CHD disappears after adjusting for smoking" → Confounding (smoking).
  • "Effect of drug differs in diabetics vs non-diabetics" → Effect modification.
  • "Behaviour improved because subjects knew they were observed" → Hawthorne effect.

7. Complications / consequences of unaddressed errors

  • Spurious associations leading to wrong public-health policy.
  • Masking of true effects (non-differential misclassification dilutes real RR towards null).
  • Misleading screening "benefit" (lead-time/length bias inflating apparent survival).
  • Non-reproducible research and wasted resources.
  • Harm to patients if causal claims are acted on prematurely.

8. Key differentials / discriminators (don't confuse these)

Pair | Key discriminator
Selection vs information bias | Who you pick vs how you measure
Berkson vs Neyman | Admission rate vs survival
Confounding vs effect modification | Remove it vs report it (strata similar to each other but different from crude vs strata differ from each other)
Lead-time vs length bias | Earlier detection vs indolent-tumour over-detection
Random vs systematic error | Precision/sample-size vs validity/design
Validity vs reliability | Accuracy vs repeatability

Recently asked / exam angle

  • Identify the specific bias in a described case-control study (Berkson's, Neyman, recall) — the single most common stem.
  • Only criterion essential for causation → Temporality (repeated almost every cycle).
  • Best/only method to control unknown confounders → Randomisation.
  • Non-differential misclassification biases towards → the null.
  • Direction of bias from random vs differential misclassification.
  • Weakest Bradford Hill criterion → Specificity.
  • Distinguishing confounding from effect modification using stratified RR tables.
  • Healthy worker effect in occupational epidemiology stems.
  • Lead-time vs length bias in screening-programme questions.
  • Mantel–Haenszel as the technique for pooled stratum-adjusted estimate.
  • Which methods control confounding at the design stage vs the analysis stage.

Rapid revision

  1. Random error → precision (fix with sample size); systematic error/bias → validity (fix with design, NOT sample size).
  2. Berkson's bias = both groups from hospital → spurious association (admission rate bias).
  3. Neyman bias = prevalence-incidence/survival bias; fatal cases missed in case-control.
  4. Recall bias = classic flaw of case-control studies.
  5. Lead-time bias (earlier detection) and length bias (indolent tumours) plague screening.
  6. Healthy worker effect underestimates occupational harm.
  7. A confounder must satisfy 3 criteria: linked to exposure, independent risk factor for outcome, NOT on causal pathway.
  8. Non-differential misclassification → bias towards the null (dilutes true effect).
  9. Randomisation is the ONLY method controlling unknown confounders.
  10. Restriction/matching = design stage; stratification (Mantel–Haenszel) / multivariate regression = analysis stage.
  11. Effect modification → stratum-specific RRs differ from each other (report, don't remove).
  12. Temporality is the ONLY essential Bradford Hill criterion; specificity is the weakest; strength + dose-response are strong supporters.