## Overview
Bayes' Theorem allows us to update our beliefs about the probability of an event based on new evidence. It's the foundation of Bayesian statistics and has wide applications in medicine, machine learning, and decision-making.
## The Theorem

P(A∣B) = [P(B∣A) × P(A)] / P(B)

Or more explicitly, expanding the marginal P(B):

P(A∣B) = [P(B∣A) × P(A)] / [P(B∣A) × P(A) + P(B∣A′) × P(A′)]
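The explicit form translates directly into code. A minimal sketch in Python (the function name `bayes_posterior` is just for illustration, not from any library):

```python
def bayes_posterior(prior, likelihood, likelihood_complement):
    """Posterior P(A|B) via the explicit form of Bayes' Theorem.

    prior                 -- P(A)
    likelihood            -- P(B|A)
    likelihood_complement -- P(B|A')
    """
    # Marginal P(B) = P(B|A) * P(A) + P(B|A') * P(A')
    marginal = likelihood * prior + likelihood_complement * (1 - prior)
    return likelihood * prior / marginal

# With a 50% prior, evidence 4x as likely under A as under A' yields an 80% posterior.
print(round(bayes_posterior(0.5, 0.8, 0.2), 3))  # 0.8
```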
## Terminology
| Term | Symbol | Description |
|---|---|---|
| Prior | P(A) | Initial probability before evidence |
| Likelihood | P(B∣A) | Probability of evidence given A |
| Posterior | P(A∣B) | Updated probability after evidence |
| Marginal | P(B) | Total probability of evidence |
## Extended Form

For multiple hypotheses H1, H2, …, Hn:

P(Hi∣E) = [P(E∣Hi) × P(Hi)] / ∑j [P(E∣Hj) × P(Hj)]
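With more than two hypotheses, the same computation just normalizes the likelihood-weighted priors. A sketch in Python (the helper name `posteriors` is hypothetical):

```python
def posteriors(priors, likelihoods):
    """Posterior P(Hi|E) for each hypothesis Hi via the extended form."""
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(E|Hi) * P(Hi)
    evidence = sum(joint)                                 # sum over j of P(E|Hj) * P(Hj)
    return [j / evidence for j in joint]

# Three equally likely hypotheses; the evidence strongly favors the first.
post = posteriors([1/3, 1/3, 1/3], [0.9, 0.05, 0.05])
print([round(p, 2) for p in post])  # [0.9, 0.05, 0.05]
```

Because the priors are equal here, the posteriors are just the normalized likelihoods.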
## The Bayesian Approach

Posterior ∝ Likelihood × Prior

P(Hypothesis∣Data) ∝ P(Data∣Hypothesis) × P(Hypothesis)
## Key Insights
- Prior matters: Initial beliefs affect the posterior
- Evidence updates: Strong evidence shifts the posterior
- Rare events: Even with good tests, rare events often yield surprising results
- Sequential updating: Can apply Bayes repeatedly as new evidence arrives
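The sequential-updating point can be made concrete: today's posterior becomes tomorrow's prior. A sketch with a hypothetical `update` helper, using a 95%-sensitive test with a 10% false-positive rate and a 1% prior:

```python
def update(prior, likelihood, likelihood_complement):
    """One Bayesian update: returns the posterior, which can serve as the next prior."""
    marginal = likelihood * prior + likelihood_complement * (1 - prior)
    return likelihood * prior / marginal

# Two positive test results applied in sequence.
belief = 0.01                            # 1% prior
for _ in range(2):
    belief = update(belief, 0.95, 0.10)  # yesterday's posterior is today's prior
print(round(belief, 3))  # 0.477 -- one positive gives ~0.088; a second pushes it near 50%
```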
## Examples

### Example 1: Medical Diagnosis
A disease affects 1% of the population. A test has:
- 95% sensitivity: positive in 95% of cases where the disease is present
- 90% specificity: negative in 90% of cases where the disease is absent
If you test positive, what's the probability you have the disease?
- P(D) = 0.01 (prior: disease prevalence)
- P(D′) = 0.99 (no disease)
- P(+∣D) = 0.95 (true positive rate)
- P(+∣D′) = 0.10 (false positive rate: 1 − specificity)
P(D∣+) = [P(+∣D) × P(D)] / [P(+∣D) × P(D) + P(+∣D′) × P(D′)]

P(D∣+) = (0.95 × 0.01) / [(0.95 × 0.01) + (0.10 × 0.99)]

P(D∣+) = 0.0095 / (0.0095 + 0.099) = 0.0095 / 0.1085 ≈ 0.088, or 8.8%
Only an 8.8% chance of disease despite a positive test: because the disease is rare, false positives outnumber true positives.
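The arithmetic can be checked in a few lines of Python:

```python
prior = 0.01           # P(D): disease prevalence
sensitivity = 0.95     # P(+|D)
false_positive = 0.10  # P(+|D') = 1 - specificity

evidence = sensitivity * prior + false_positive * (1 - prior)  # P(+)
posterior = sensitivity * prior / evidence                     # P(D|+)
print(round(posterior, 3))  # 0.088
```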
### Example 2: Spam Filter
- P(spam) = 0.30 (30% of emails are spam)
- P("lottery" ∣ spam) = 0.15 (15% of spam contains "lottery")
- P("lottery" ∣ not spam) = 0.01 (1% of legitimate emails contain "lottery")
If an email contains "lottery," probability it's spam:
P(spam∣lottery) = (0.15 × 0.30) / [(0.15 × 0.30) + (0.01 × 0.70)]

= 0.045 / (0.045 + 0.007) = 0.045 / 0.052 ≈ 0.87, or 87%
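The same check for the spam example:

```python
p_spam = 0.30    # P(spam)
p_w_spam = 0.15  # P("lottery" | spam)
p_w_ham = 0.01   # P("lottery" | not spam)

p_word = p_w_spam * p_spam + p_w_ham * (1 - p_spam)  # P("lottery")
p_spam_word = p_w_spam * p_spam / p_word             # P(spam | "lottery")
print(round(p_spam_word, 2))  # 0.87
```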
### Example 3: Two Defective Machines

Machine A produces 60% of items, with a 2% defect rate.
Machine B produces 40% of items, with a 5% defect rate.
If an item is defective, probability it came from Machine B:
P(B∣def) = [P(def∣B) × P(B)] / [P(def∣A) × P(A) + P(def∣B) × P(B)]

= (0.05 × 0.40) / [(0.02 × 0.60) + (0.05 × 0.40)] = 0.02 / (0.012 + 0.02) = 0.02 / 0.032 = 0.625, or 62.5%
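And the machine example:

```python
p_a, p_b = 0.60, 0.40          # production shares P(A), P(B)
p_def_a, p_def_b = 0.02, 0.05  # defect rates P(def|A), P(def|B)

p_def = p_def_a * p_a + p_def_b * p_b  # overall defect probability P(def)
p_b_def = p_def_b * p_b / p_def        # P(B | defective)
print(round(p_b_def, 3))  # 0.625
```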
## Common Applications
| Field | Application |
|---|---|
| Medicine | Diagnostic testing |
| Email | Spam filtering |
| Legal | Evaluating evidence |
| Finance | Risk assessment |
| AI/ML | Naive Bayes classifiers |
| Science | Updating hypotheses |
## Base Rate Fallacy
A common error is ignoring the prior (base rate). Example:
If a rare disease test is 99% accurate but the disease affects only 0.1% of people, most positive tests are false positives. Always consider the base rate!
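The numbers above can be sketched directly (assuming "99% accurate" means 99% sensitivity and 99% specificity):

```python
prior = 0.001          # 0.1% prevalence
sens, spec = 0.99, 0.99

p_pos = sens * prior + (1 - spec) * (1 - prior)  # P(+)
p_d_pos = sens * prior / p_pos                   # P(disease | +)
print(round(p_d_pos, 2))  # 0.09 -- roughly 91% of positives are false positives
```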