
Bayes' Theorem

Updating probabilities with new evidence: prior, likelihood, posterior, and applications.

Overview

Bayes' Theorem allows us to update our beliefs about the probability of an event based on new evidence. It's the foundation of Bayesian statistics and has wide applications in medicine, machine learning, and decision-making.

The Theorem

$$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$$

Or, expanding $P(B)$ with the law of total probability (where $A'$ denotes the complement of $A$):

$$P(A|B) = \frac{P(B|A) \times P(A)}{P(B|A) \times P(A) + P(B|A') \times P(A')}$$
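The explicit form translates almost directly into code. The following is a minimal sketch rather than a reference implementation; the function name `posterior` and the numbers in the usage line are illustrative assumptions, not part of the original text.

```python
def posterior(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from P(A), P(B|A), and P(B|A')."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability in [0, 1]")
    # Numerator: P(B|A) * P(A)
    joint_a = p_b_given_a * prior
    # Denominator, by the law of total probability:
    # P(B) = P(B|A) * P(A) + P(B|A') * P(A')
    evidence = joint_a + p_b_given_not_a * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return joint_a / evidence


# Illustrative numbers (assumed): P(A) = 0.5, P(B|A) = 0.8, P(B|A') = 0.2
print(round(posterior(0.5, 0.8, 0.2), 3))
```

With these assumed inputs the evidence is four times as likely under $A$ as under $A'$, so an even prior moves to a posterior of 0.8.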

Terminology

| Term | Symbol | Description |
|---|---|---|
| Prior | $P(A)$ | Initial probability before evidence |
| Likelihood | $P(B \mid A)$ | Probability of evidence given $A$ |
| Posterior | $P(A \mid B)$ | Updated probability after evidence |
| Marginal | $P(B)$ | Total probability of evidence |

Extended Form

For multiple mutually exclusive and exhaustive hypotheses $H_1, H_2, \ldots, H_n$ and evidence $E$:

$$P(H_i|E) = \frac{P(E|H_i) \times P(H_i)}{\sum_j P(E|H_j) \times P(H_j)}$$
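As a rough sketch of the extended form, the function below multiplies each prior by its likelihood and divides by the sum over all hypotheses. The name `posteriors`, the dictionary layout, and the three-hypothesis numbers are assumptions made for illustration; the priors are assumed to cover mutually exclusive, exhaustive hypotheses.

```python
def posteriors(priors, likelihoods):
    """Return {H_i: P(H_i|E)} given dicts of P(H_i) and P(E|H_i)."""
    if abs(sum(priors.values()) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    # Unnormalised terms P(E|H_i) * P(H_i), one per hypothesis
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # Denominator: sum over j of P(E|H_j) * P(H_j)
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("P(E) is zero, so the posterior is undefined")
    return {h: j / evidence for h, j in joint.items()}


# Illustrative (assumed) priors and likelihoods for three hypotheses
result = posteriors(
    {"H1": 0.5, "H2": 0.3, "H3": 0.2},
    {"H1": 0.1, "H2": 0.2, "H3": 0.5},
)
for hypothesis, probability in result.items():
    print(hypothesis, round(probability, 3))
```

In this made-up case H3 starts with the smallest prior but ends with the largest posterior, because the evidence is far more likely under it.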

The Bayesian Approach

$$\text{Posterior} \propto \text{Likelihood} \times \text{Prior}$$

$$P(\text{Hypothesis}|\text{Data}) \propto P(\text{Data}|\text{Hypothesis}) \times P(\text{Hypothesis})$$

Key Insights

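  1. Prior matters: the initial belief affects the posterior
  2. Evidence updates: the more the likelihood of the evidence differs between hypotheses, the further the posterior moves from the prior
  3. Rare events: even with an accurate test, a positive result for a rare event is often a false positive
  4. Sequential updating: Bayes' theorem can be applied repeatedly, with each posterior becoming the prior for the next piece of evidence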
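Sequential updating can be sketched as a loop in which each posterior is fed back in as the next prior. The function names and the numbers below are hypothetical. The sketch also assumes that the pieces of evidence are conditionally independent given the hypothesis, which real repeated tests may not satisfy.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """One application of Bayes' theorem for a hypothesis H and its complement."""
    joint = p_e_given_h * prior
    return joint / (joint + p_e_given_not_h * (1.0 - prior))


def sequential_update(prior, observations):
    """Apply Bayes' theorem once per observation, reusing each posterior as the next prior.

    observations is a list of (P(E|H), P(E|not H)) pairs, assumed conditionally independent.
    Returns the belief after each step, starting with the prior.
    """
    belief = prior
    history = [belief]
    for p_e_given_h, p_e_given_not_h in observations:
        belief = update(belief, p_e_given_h, p_e_given_not_h)
        history.append(belief)
    return history


# Hypothetical: 1% prior, then two positive results, each with
# P(+|H) = 0.95 and P(+|not H) = 0.10
history = sequential_update(0.01, [(0.95, 0.10), (0.95, 0.10)])
print([round(p, 3) for p in history])
```

Under these assumptions the belief climbs from 1% to roughly 9% after one positive result and to roughly 48% after two, which is the same answer as applying the theorem once to both results together.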

Examples

Example 1: Medical Diagnosis

A disease affects 1% of the population. A test for it has:

  • 95% sensitivity: it is positive for 95% of people who have the disease
  • 90% specificity: it is negative for 90% of people who do not have the disease, so its false positive rate is 10%

If you test positive, what's the probability you have the disease?

$$P(D) = 0.01 \quad \text{(prior: disease rate)}$$

$$P(D') = 0.99 \quad \text{(no disease)}$$

$$P(+|D) = 0.95 \quad \text{(true positive)}$$

$$P(+|D') = 0.10 \quad \text{(false positive)}$$

$$P(D|+) = \frac{P(+|D) \times P(D)}{P(+|D) \times P(D) + P(+|D') \times P(D')}$$

$$P(D|+) = \frac{0.95 \times 0.01}{(0.95 \times 0.01) + (0.10 \times 0.99)}$$

$$P(D|+) = \frac{0.0095}{0.0095 + 0.099} = \frac{0.0095}{0.1085} \approx 0.088 \text{ or } 8.8\%$$

There is only an 8.8% chance of disease despite the positive test. Because the disease is rare, false positives (0.099 of the population) far outnumber true positives (0.0095 of the population).
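One way to check this result is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of 100,000 people; the cohort size and variable names are illustrative, and only the rates come from the example.

```python
population = 100_000  # hypothetical cohort size, chosen so every count is a whole number

diseased = population * 1 // 100        # 1% prevalence
healthy = population - diseased         # everyone else
true_positives = diseased * 95 // 100   # 95% of the diseased test positive
false_positives = healthy * 10 // 100   # 10% of the healthy test positive

# Among everyone who tests positive, what share actually has the disease?
all_positives = true_positives + false_positives
p_disease_given_positive = true_positives / all_positives

print(true_positives, false_positives, round(p_disease_given_positive, 3))
```

In this cohort 950 people are true positives and 9,900 are false positives, so a positive result belongs to a diseased person only 950 times out of 10,850.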

Example 2: Spam Filter

  • $P(\text{spam}) = 0.30$ (30% of emails are spam)
  • $P(\text{"lottery"} | \text{spam}) = 0.15$ (15% of spam contains "lottery")
  • $P(\text{"lottery"} | \text{not spam}) = 0.01$ (1% of legitimate emails contain "lottery")

If an email contains "lottery", the probability that it is spam is:

$$P(\text{spam} | \text{"lottery"}) = \frac{0.15 \times 0.30}{(0.15 \times 0.30) + (0.01 \times 0.70)} = \frac{0.045}{0.045 + 0.007} = \frac{0.045}{0.052} \approx 0.87 \text{ or } 87\%$$

Example 3: Defective Items from Two Machines

Machine A produces 60% of the items, with a 2% defect rate. Machine B produces 40% of the items, with a 5% defect rate.

If an item is defective, the probability that it came from Machine B is:

$$P(B|\text{def}) = \frac{P(\text{def}|B) \times P(B)}{P(\text{def}|A) \times P(A) + P(\text{def}|B) \times P(B)}$$

$$= \frac{0.05 \times 0.40}{(0.02 \times 0.60) + (0.05 \times 0.40)} = \frac{0.02}{0.012 + 0.02} = \frac{0.02}{0.032} = 0.625 \text{ or } 62.5\%$$
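A result like this can also be sanity-checked by simulation. The sketch below is a simple Monte Carlo estimate, not an exact calculation: it draws many items, assigns each to a machine and a defect status using the rates from the example, and measures what share of the defective items came from Machine B. The function name, sample size, and seed are arbitrary assumptions.

```python
import random


def simulate_share_from_b(n_items, seed=0):
    """Estimate P(B | defective) by simulating n_items produced items."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    defective_total = 0
    defective_from_b = 0
    for _ in range(n_items):
        from_b = rng.random() < 0.40              # Machine B makes 40% of the items
        defect_rate = 0.05 if from_b else 0.02    # 5% for B, 2% for A
        if rng.random() < defect_rate:
            defective_total += 1
            if from_b:
                defective_from_b += 1
    if defective_total == 0:
        raise ValueError("no defective items were simulated; increase n_items")
    return defective_from_b / defective_total


estimate = simulate_share_from_b(200_000)
print(round(estimate, 3))  # expected to land near the exact value 0.625
```

The estimate fluctuates with the seed and the sample size, so it should be read as agreeing with 62.5% only to within sampling error.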

Common Applications

| Field | Application |
|---|---|
| Medicine | Diagnostic testing |
| Email | Spam filtering |
| Legal | Evaluating evidence |
| Finance | Risk assessment |
| AI/ML | Naive Bayes classifiers |
| Science | Updating hypotheses |

Base Rate Fallacy

A common error is ignoring the prior (base rate). Example:

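If a test for a rare disease is 99% accurate but the disease affects only 0.1% of people, most positive results are false positives. Taking "99% accurate" to mean 99% sensitivity and 99% specificity, only about 9% of positive results come from people who have the disease. Always consider the base rate!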
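To see how strongly the base rate drives the answer, the sketch below holds the test fixed and varies only the prevalence. It assumes that "99% accurate" means 99% sensitivity and 99% specificity, which the text does not spell out; the function name and the list of base rates are illustrative.

```python
def prob_disease_given_positive(base_rate, sensitivity=0.99, specificity=0.99):
    """P(disease | positive) for a given base rate (defaults are assumed test properties)."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


# Same test, different base rates
for base_rate in (0.001, 0.01, 0.1, 0.5):
    p = prob_disease_given_positive(base_rate)
    print(f"base rate {base_rate:>5.1%}: P(disease | positive) = {p:.1%}")
```

Under these assumptions the same positive result means about a 9% chance of disease at a 0.1% base rate, 50% at a 1% base rate, and about 92% at a 10% base rate.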