Measures of Association & Impact
Community Medicine · Epidemiology · lean revision notes
These measures quantify the strength of the link between an exposure and an outcome (association) and the public-health burden attributable to that exposure (impact). They are among the most heavily tested numerical topics in NEET PG epidemiology: expect either a 2×2 table to compute from or a confidence-interval interpretation question.
The master 2×2 table
Every calculation starts from a single contingency table. Memorise the cell letters — half the marks are lost by mislabelling rows and columns.
| | Disease + | Disease − | Total |
|---|---|---|---|
| Exposed + | a | b | a + b |
| Exposed − | c | d | c + d |
| Total | a + c | b + d | N |
- a = exposed who developed disease (exposed cases)
- b = exposed without disease
- c = unexposed who developed disease
- d = unexposed without disease
High-yield: In a 2×2 table, rows = exposure, columns = disease (by convention). Always orient the table this way before computing risk, because risk is read across a row (incidence within an exposure group).
Classification — which measure for which study
Measures split into two families: measures of association (strength of relationship) and measures of impact (potential benefit of removing the exposure).
| Category | Measure | Study design |
|---|---|---|
| Association | Relative Risk (RR) | Cohort, RCT (need incidence) |
| Association | Odds Ratio (OR) | Case-control (preferred), also cohort/cross-sectional |
| Impact (individual) | Attributable Risk (AR) | Cohort |
| Impact (individual, %) | Attributable Risk % (AR%) | Cohort / case-control (via OR) |
| Impact (community) | Population Attributable Risk (PAR) | Cohort + prevalence of exposure |
| Impact (community, %) | Population Attributable Risk % (PAR%) | Cohort + prevalence of exposure |
| Therapeutic impact | Number Needed to Treat (NNT) | RCT |
| Therapeutic harm | Number Needed to Harm (NNH) | RCT / cohort |
High-yield (most-asked single fact): Relative risk cannot be calculated in a case-control study because incidence is unknown — only the odds ratio is valid. Conversely, in a cohort study you can compute both RR and OR.
Relative Risk (Risk Ratio, RR)
RR measures how many times more likely the outcome is in the exposed versus the unexposed.
Stepwise approach: compute incidence in each group → take the ratio.
- Incidence in exposed = a / (a + b)
- Incidence in unexposed = c / (c + d)
- RR = [a/(a+b)] ÷ [c/(c+d)]
Interpretation:
| RR value | Meaning |
|---|---|
| RR = 1 | No association (exposure neither raises nor lowers risk) |
| RR > 1 | Exposure is a risk factor (positive association) |
| RR < 1 | Exposure is protective (e.g. vaccine, statin) |
High-yield: RR is the measure of choice for assessing strength of causal association and is used to satisfy Bradford-Hill's "strength" criterion. RR ≥ 2–3 suggests a strong association.
Worked example: Smokers — 80 of 1000 develop lung cancer (incidence 0.08). Non-smokers — 10 of 1000 (incidence 0.01). RR = 0.08/0.01 = 8 → smokers are 8× more likely.
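The stepwise approach above can be sketched in a few lines of Python. This is a minimal illustration, assuming the cell letters a, b, c, d follow the master 2×2 table; the function name `relative_risk` is made up for these notes, and there is no handling of zero cells.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Risk ratio from a 2x2 table (rows = exposure, columns = disease)."""
    incidence_exposed = a / (a + b)      # risk among the exposed
    incidence_unexposed = c / (c + d)    # risk among the unexposed
    return incidence_exposed / incidence_unexposed


# Smoking example: 80 of 1000 smokers and 10 of 1000 non-smokers develop lung cancer.
rr = relative_risk(a=80, b=920, c=10, d=990)
print(round(rr, 2))  # 8.0
```

Note that b and d are obtained by subtraction (1000 − 80 and 1000 − 10); feeding the row totals in as b and d is the commonest slip.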
Odds Ratio (OR)
When incidence cannot be measured (case-control: we start with diseased cases and select controls), we use odds.
OR = (a × d) / (b × c) — the cross-product ratio.
This equals (odds of exposure in cases) ÷ (odds of exposure in controls), which algebraically equals the disease odds ratio — the symmetry that makes case-control studies valid.
High-yield: OR approximates RR when the disease is rare (incidence < ~10%): the rare disease assumption. For common outcomes OR exaggerates RR, lying further from 1 in either direction (above RR when RR > 1, below RR when RR < 1).
Interpretation mirrors RR: OR = 1 (no association), > 1 (risk factor), < 1 (protective).
Worked example: Cases (oral cancer) 90 chew tobacco, 10 don't; controls 30 chew, 70 don't. OR = (90 × 70)/(10 × 30) = 6300/300 = 21.
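The cross-product can be checked with a short Python sketch. The function name `odds_ratio` is illustrative only; the cells are mapped as a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls, matching the master table.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product ratio ad/bc from a 2x2 table."""
    return (a * d) / (b * c)


# Oral cancer example: cases 90 chew / 10 don't; controls 30 chew / 70 don't.
or_value = odds_ratio(a=90, b=30, c=10, d=70)
print(or_value)  # 21.0
```

The key step is placing the 30 tobacco-chewing controls in cell b (exposed, no disease), not in cell c.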
RR vs OR — the comparison NEET loves
| Feature | Relative Risk | Odds Ratio |
|---|---|---|
| Based on | Incidence (risk) | Odds |
| Calculable in case-control | No | Yes |
| Calculable in cohort/RCT | Yes | Yes |
| Rare disease | RR ≈ OR | OR ≈ RR |
| Common disease | More intuitive | Exaggerates effect |
| Formula | [a/(a+b)] / [c/(c+d)] | ad/bc |
| Logistic regression output | Not given directly | Exponentiated coefficient = OR |
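A small numerical sketch may help fix the "rare disease" row in memory. Both cohorts below are hypothetical numbers chosen for illustration (they are not from any study); each has RR = 2, but the OR stays close to RR only when the outcome is rare.

```python
def relative_risk(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)


# Hypothetical cohorts of 1000 exposed and 1000 unexposed each.
rare = dict(a=20, b=980, c=10, d=990)       # outcome in 2% vs 1%
common = dict(a=600, b=400, c=300, d=700)   # outcome in 60% vs 30%

for label, table in (("rare", rare), ("common", common)):
    print(label, round(relative_risk(**table), 2), round(odds_ratio(**table), 2))
# rare 2.0 2.02
# common 2.0 3.5
```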
Attributable Risk (AR) — Risk Difference
AR is the excess risk in the exposed group attributable to the exposure — what would be prevented if exposure were removed.
AR = Incidence(exposed) − Incidence(unexposed) = [a/(a+b)] − [c/(c+d)]
Using the smoking example: AR = 0.08 − 0.01 = 0.07 (70 excess cases per 1000 smokers).
High-yield: RR tells you the strength of association (etiological importance); AR tells you the absolute excess risk carried by exposed individuals, and hence the absolute benefit of removing the exposure from them. A factor can have a huge RR but a tiny AR if the baseline risk is minuscule.
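The contrast between RR and AR can be sketched numerically. The smoking figures are the ones used above; the second table is a hypothetical very rare outcome invented purely to show a large RR alongside a negligible AR. Function names are illustrative.

```python
def relative_risk(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))


def attributable_risk(a, b, c, d):
    """Risk difference: incidence in exposed minus incidence in unexposed."""
    return a / (a + b) - c / (c + d)


# Smoking example from the notes.
ar_smoking = attributable_risk(80, 920, 10, 990)
print(round(ar_smoking * 1000))  # excess cases per 1000 smokers: 70

# Hypothetical rare outcome: 10 vs 1 cases per million.
rare_rr = relative_risk(10, 999_990, 1, 999_999)
rare_ar = attributable_risk(10, 999_990, 1, 999_999)
print(round(rare_rr, 1), rare_ar)  # RR is about 10, AR is about 9 per million
```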
Attributable Risk Percent (AR% / Attributable Fraction in the exposed)
The proportion of disease in the exposed group due to the exposure.
- AR% = [Incidence(exp) − Incidence(unexp)] / Incidence(exp) × 100
- Equivalent shortcut: AR% = [(RR − 1) / RR] × 100
Smoking: AR% = (8 − 1)/8 × 100 = 87.5% → 87.5% of lung cancer in smokers is due to smoking.
High-yield: From a case-control study you cannot get AR directly, but you can estimate AR% using OR in place of RR: AR% = (OR − 1)/OR × 100.
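As a quick check that the incidence form and the ratio shortcut agree, here is a minimal Python sketch. The function names are made up for these notes, and the formulas are meant for a ratio above 1 (a harmful exposure). Applying the shortcut to the oral cancer OR of 21 is an extra illustration, not a figure quoted in the notes.

```python
def ar_percent_from_incidence(inc_exposed: float, inc_unexposed: float) -> float:
    return (inc_exposed - inc_unexposed) / inc_exposed * 100


def ar_percent_from_ratio(ratio: float) -> float:
    """Shortcut (ratio - 1) / ratio; pass RR, or OR from a case-control study."""
    return (ratio - 1) / ratio * 100


print(round(ar_percent_from_incidence(0.08, 0.01), 1))  # 87.5 (smoking, incidences)
print(ar_percent_from_ratio(8))                         # 87.5 (smoking, RR shortcut)
print(round(ar_percent_from_ratio(21), 1))              # 95.2 (oral cancer, OR as stand-in)
```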
Population Attributable Risk (PAR)
AR refers only to the exposed; PAR scales the impact to the whole community, factoring in how common the exposure is.
PAR = Incidence(total population) − Incidence(unexposed)
Or, when prevalence of exposure (P) is known:
PAR = AR × P (prevalence of exposure)
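The two PAR formulas are the same quantity, which a short derivation makes clear. The symbols I_pop, I_e and I_u (incidence in the total population, the exposed and the unexposed) are shorthand introduced here; the only assumption is that the population is a mix of exposed (fraction P) and unexposed (fraction 1 − P).

```latex
\begin{align*}
I_{\text{pop}} &= P \cdot I_e + (1 - P) \cdot I_u \\
\text{PAR}     &= I_{\text{pop}} - I_u \\
               &= P \cdot I_e + (1 - P) \cdot I_u - I_u \\
               &= P \,(I_e - I_u) \\
               &= P \cdot \text{AR}
\end{align*}
```

For instance, with the smoking AR of 0.07 and a hypothetical exposure prevalence of 30%, PAR = 0.07 × 0.3 = 0.021, i.e. 21 excess cases per 1000 population.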
Population Attributable Risk Percent (PAR%)
Proportion of disease in the whole population attributable to the exposure — the number policymakers want.
- PAR% = [Incidence(pop) − Incidence(unexp)] / Incidence(pop) × 100
- Levin's formula: PAR% = [P(RR − 1)] / [P(RR − 1) + 1] × 100, where P = prevalence of exposure.
High-yield: PAR / PAR% depend on both the strength of association AND the prevalence of the exposure in the population. A weak risk factor that is extremely common (e.g. mild hypertension) can have a larger PAR than a strong but rare one. This is the conceptual core of population-level prevention (Geoffrey Rose's "prevention paradox").
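Levin's formula and the prevention-paradox point can be sketched together. The two exposure profiles below are hypothetical values picked only to show the effect; the function name is illustrative.

```python
def par_percent_levin(p_exposed: float, rr: float) -> float:
    """Levin's formula: P(RR - 1) / [P(RR - 1) + 1] x 100."""
    excess = p_exposed * (rr - 1)
    return excess / (excess + 1) * 100


# Hypothetical: a weak but very common factor vs a strong but rare one.
weak_common = par_percent_levin(p_exposed=0.50, rr=1.5)
strong_rare = par_percent_levin(p_exposed=0.01, rr=8.0)

print(round(weak_common, 1))  # 20.0
print(round(strong_rare, 1))  # 6.5
```

Despite an RR of only 1.5, the common exposure accounts for roughly three times the share of disease in the population in this sketch.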
Flow for impact measures:
RR (strength) → AR = excess risk in exposed → AR% = fraction of exposed disease preventable → PAR = AR × exposure prevalence → PAR% = community-level preventable fraction
Number Needed to Treat (NNT)
A clinical-trial measure: how many patients must receive the treatment to prevent one additional bad outcome.
Stepwise:
- Control Event Rate (CER) = risk in control arm
- Experimental Event Rate (EER) = risk in treatment arm
- Absolute Risk Reduction (ARR) = CER − EER
- NNT = 1 / ARR
Example: A drug drops MI rate from 5% (CER) to 3% (EER). ARR = 0.02. NNT = 1/0.02 = 50 → treat 50 patients to prevent one MI.
High-yield: NNT = 1/ARR (absolute risk reduction), not relative risk reduction. A lower NNT = better/more effective treatment. NNT must always be rounded up to the next whole patient.
- Relative Risk Reduction (RRR) = ARR / CER = (CER − EER)/CER — looks impressive but can mislead because it ignores baseline risk.
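The NNT steps can be put into a short Python sketch. The function name `nnt_summary` is invented for these notes; the `round(..., 9)` guard is an implementation detail added so that floating-point noise (for example 1/0.02 evaluating to 49.999...) does not push the rounded-up NNT off by one.

```python
import math


def nnt_summary(cer: float, eer: float):
    """Return (ARR, RRR, NNT) for control and experimental event rates."""
    arr = cer - eer                      # absolute risk reduction
    if arr <= 0:
        raise ValueError("no absolute risk reduction, NNT is undefined")
    rrr = arr / cer                      # relative risk reduction
    nnt = math.ceil(round(1 / arr, 9))   # round up to the next whole patient
    return arr, rrr, nnt


# MI example: 5% in controls, 3% on the drug.
arr, rrr, nnt = nnt_summary(cer=0.05, eer=0.03)
print(round(arr, 3), round(rrr, 2), nnt)  # 0.02 0.4 50
```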
Number Needed to Harm (NNH)
Same logic for adverse events.
- Absolute Risk Increase (ARI) = EER(harm) − CER(harm)
- NNH = 1 / ARI
Example: If a drug causes bleeding in 4% vs 1% on placebo, ARI = 0.03, NNH = 1/0.03 ≈ 33.3 → roughly one extra bleed for every 33 to 34 patients treated.
High-yield: A high NNH is good (harm is rare); a low NNT is good (benefit is frequent). The ideal drug has low NNT and high NNH. Favourable therapy: NNT << NNH.
| Measure | Formula | Better when |
|---|---|---|
| ARR | CER − EER | Higher |
| NNT | 1 / ARR | Lower |
| ARI | EER − CER (harm) | Lower |
| NNH | 1 / ARI | Higher |
| RRR | ARR / CER | Higher (but baseline-dependent) |
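The harm side of the table can be sketched the same way. The function below is illustrative and deliberately returns the unrounded value, leaving the rounding convention to the reader.

```python
def number_needed_to_harm(eer_harm: float, cer_harm: float) -> float:
    """NNH = 1 / ARI, where ARI = harm rate on treatment minus harm rate on control."""
    ari = eer_harm - cer_harm            # absolute risk increase
    if ari <= 0:
        raise ValueError("no absolute risk increase, NNH is undefined")
    return 1 / ari


# Bleeding example: 4% on the drug vs 1% on placebo.
nnh = number_needed_to_harm(eer_harm=0.04, cer_harm=0.01)
print(round(nnh, 1))  # 33.3
```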
Confidence Intervals — the killer interpretation question
A point estimate (RR or OR) is reported with a 95% confidence interval. Statistical significance is read from whether the CI crosses the null value of 1 (for ratios) or 0 (for risk differences / AR).
High-yield: If the 95% CI of an RR or OR includes 1.0, the result is NOT statistically significant (p > 0.05) — the association could be due to chance. If the entire CI lies above 1 (e.g. 1.4–3.2) → significant risk factor; entirely below 1 (e.g. 0.3–0.7) → significant protective effect.
- For difference measures (Attributable Risk / Risk Difference, ARR), the null value is 0, not 1: a CI crossing 0 is non-significant.
- A narrow CI = precise estimate (large sample); a wide CI = imprecise (small sample).
| OR/RR (95% CI) | Interpretation |
|---|---|
| 1.5 (1.2 – 1.9) | Significant ↑ risk |
| 0.6 (0.4 – 0.8) | Significant protection |
| 1.3 (0.9 – 1.8) | Not significant (crosses 1) |
| 2.0 (0.7 – 5.5) | Not significant despite OR = 2 |
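The reading rule in the table reduces to one comparison against the null value, which a few lines of Python can illustrate. The function name and the returned labels are made up for these notes; a CI that just touches the null is treated here as including it.

```python
def interpret_ci(lower: float, upper: float, null: float = 1.0) -> str:
    """Classify a 95% CI against the null (1 for ratios, 0 for differences)."""
    if lower <= null <= upper:
        return "not significant"
    return "significant increase" if lower > null else "significant decrease"


print(interpret_ci(1.2, 1.9))              # significant increase
print(interpret_ci(0.4, 0.8))              # significant decrease
print(interpret_ci(0.7, 5.5))              # not significant
print(interpret_ci(-0.01, 0.05, null=0))   # not significant (risk difference)
```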
Complications / pitfalls in interpretation
- Confounding: a third variable distorts the association (e.g. alcohol confounding the smoking–oral cancer link). Controlled by matching, stratification (Mantel-Haenszel), restriction, randomisation, or multivariable regression.
- Effect modification (interaction): the RR genuinely differs across strata — should be reported separately, not adjusted away.
- OR exaggerates RR when the disease is common — a frequent trap.
- Statistical vs clinical significance: a significant CI in a huge trial may reflect a trivial ARR (huge NNT). Always ask "how big is the effect?" not just "is p < 0.05?".
- RRR vs ARR deception: a 50% RRR sounds dramatic but if baseline risk is 0.2% the ARR is only 0.1% (NNT = 1000).
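The RRR vs ARR point can be made concrete with a short sketch that holds RRR fixed at 50% and varies the baseline risk. The 0.2% baseline is the one quoted above; the 20% and 2% baselines are hypothetical additions for contrast, and the function name is illustrative.

```python
import math


def nnt_for(baseline_risk: float, rrr: float) -> int:
    """NNT implied by a baseline (control) risk and a relative risk reduction."""
    arr = baseline_risk * rrr              # ARR = CER x RRR
    return math.ceil(round(1 / arr, 9))    # round() guards against float noise


for baseline in (0.20, 0.02, 0.002):
    print(baseline, nnt_for(baseline, rrr=0.5))
# 0.2 10
# 0.02 100
# 0.002 1000
```

The same "50% reduction" headline therefore corresponds to treating 10, 100 or 1000 patients to prevent one event.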
Key differentials — telling the measures apart
- RR vs OR: RR needs incidence (cohort); OR is the cross-product (case-control).
- AR vs AR%: AR is an absolute rate (difference); AR% is a proportion (fraction).
- AR% vs PAR%: AR% applies to the exposed group; PAR% applies to the whole population and incorporates exposure prevalence.
- ARR vs RRR: absolute difference vs relative (proportional) reduction; NNT uses ARR only.
Recently asked / exam angle
- Given a 2×2 table, compute OR or RR — the single commonest stem. Watch whether it says cohort (RR allowed) or case-control (OR only).
- "A case-control study reports an OR of 4.5 (95% CI 0.8–9.0) — what is the interpretation?" → Not statistically significant because CI includes 1.
- NNT from a trial: drug reduces event from X% to Y%, find NNT → 1/(X−Y as proportion).
- "Which measure best reflects the public-health benefit of an intervention in the community?" → PAR / PAR%.
- "Which measure is used to assess strength of association for causality?" → Relative Risk.
- AR% via the (RR−1)/RR shortcut, or via (OR−1)/OR in case-control data.
- Distinguishing AR% (exposed) from PAR% (population) in a worded question.
- Why OR ≈ RR — the rare disease assumption.
Mnemonic "ROAR PIN": ROAR = Relative risk and Odds ratio measure Association, the Rest (AR, PAR) measure impact; PIN = PAR needs Incidence plus prevalence of exposure. Keep this one-liner ready: "Cohort gives both, case-control gives odds only."
Mnemonic for NNT: "Number Needed to Treat = 1 over ARR; small is mighty."
Rapid revision
- OR = ad/bc; RR = [a/(a+b)] / [c/(c+d)].
- Case-control → OR only; cohort/RCT → both RR and OR.
- OR ≈ RR when disease is rare (rare disease assumption).
- RR/OR = 1 → no association; > 1 → risk factor; < 1 → protective.
- AR (risk difference) = Inc(exposed) − Inc(unexposed) = excess risk removable.
- AR% = (RR − 1)/RR × 100 = fraction of exposed disease due to exposure.
- PAR = AR × prevalence of exposure; PAR% uses Levin's formula and depends on exposure prevalence + RR.
- PAR% guides community/policy prevention; AR% is individual-level.
- NNT = 1/ARR; lower NNT = better drug; always round up.
- NNH = 1/ARI; higher NNH = safer drug; good therapy = low NNT, high NNH.
- CI crossing 1 (ratios) or 0 (differences) → not statistically significant.
- RR = strength of association (Bradford-Hill); AR/PAR = potential impact of intervention.