Central Limit Theorem | Statistics Cheat Sheet

Overview

The Central Limit Theorem (CLT) is one of the most important results in statistics. It states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the population's shape, provided the population variance is finite.

Statement

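For independent observations from a population with mean $\mu$ and finite variance $\sigma^2$, the sampling distribution of $\bar{x}$ is approximately normal for large $n$:

\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{approximately, for large } n

Formally, the standardized mean converges in distribution to the standard normal:

Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty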
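A quick simulation can make the statement concrete. The sketch below is an illustration under stated assumptions, not part of the theorem: the Exponential(1) population (right-skewed, with $\mu = 1$ and $\sigma = 1$), the sample size of 50, the 5,000 repetitions, and the function name are all arbitrary choices. It checks that the standardized sample means behave roughly like $N(0, 1)$.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from Exponential(rate=1); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


# Assumed population: Exponential(1), which has mu = 1 and sigma = 1.
mu, sigma, n = 1.0, 1.0, 50
means = simulate_sample_means(n=n, reps=5000)

# Standardize each sample mean: Z = (x_bar - mu) / (sigma / sqrt(n)).
z_scores = [(m - mu) / (sigma / n ** 0.5) for m in means]

mean_z = statistics.fmean(z_scores)
sd_z = statistics.stdev(z_scores)
share_within_196 = sum(abs(z) < 1.96 for z in z_scores) / len(z_scores)

print(f"mean of Z: {mean_z:.3f} (standard normal: 0)")
print(f"sd of Z:   {sd_z:.3f} (standard normal: 1)")
print(f"share with |Z| < 1.96: {share_within_196:.3f} (standard normal: about 0.95)")
```

The three printed summaries should land close to their standard normal counterparts, even though the individual exponential draws are far from normal.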

Conditions

Independence: Observations are independent draws from the same population
Sample size: $n$ is "large enough" (rule of thumb: $n \geq 30$)
Finite variance: Population has a finite mean and variance

How Large is "Large Enough"?

Population Shape	Recommended $n$
Normal	Any $n$
Symmetric	$n \geq 15$
Moderate skewness	$n \geq 30$
Highly skewed	$n \geq 50$ or more
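The thresholds in this table are rules of thumb. One way to see why skewed populations need a larger $n$ is that the skewness of $\bar{x}$ shrinks in proportion to $1/\sqrt{n}$. The sketch below checks this by simulation; the Exponential(1) population (skewness 2), the repetition count, and the helper names are assumptions chosen for illustration.

```python
import random
import statistics


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    center = statistics.fmean(values)
    m2 = statistics.fmean((v - center) ** 2 for v in values)
    m3 = statistics.fmean((v - center) ** 3 for v in values)
    return m3 / m2 ** 1.5


def sample_mean_skewness(n, reps, rng):
    """Skewness of the simulated sampling distribution of the mean for size n."""
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    return skewness(means)


rng = random.Random(1)
skew_by_n = {n: sample_mean_skewness(n, reps=10_000, rng=rng) for n in (5, 15, 30, 50)}

for n, simulated in skew_by_n.items():
    # For i.i.d. draws, skewness of the mean = population skewness / sqrt(n).
    theoretical = 2 / n ** 0.5
    print(f"n = {n:>2}: simulated skewness {simulated:.2f}, theory {theoretical:.2f}")
```

The sampling distribution is still visibly skewed at small $n$ and much closer to symmetric by $n = 50$, which is the pattern the table summarizes.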

Visual Demonstration

Population (any shape):
    The CLT says: For large n,
     /\         the sampling distribution
    /  \        of x̄ becomes:
___/    \___          ∩
                    ╱    ╲
                  ╱        ╲
                ─┴──────────┴─
                   Normal

Key Points

Works for any shape: Population can be skewed, uniform, bimodal, etc.
Larger n → better approximation: The larger the sample size, the closer the distribution of $\bar{x}$ is to normal
Focus on $\bar{x}$, not $X$: Individual observations keep the population distribution
Foundation of inference: Enables z-tests and confidence intervals

Applications

Confidence Intervals

When $n$ is large:

\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}
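The interval formula translates directly into a few lines of code. This is a minimal sketch assuming $\sigma$ is known; the function name `z_confidence_interval` and the observed mean of 58,000 are hypothetical, while $\sigma = 25{,}000$ and $n = 100$ are borrowed from the income example in this sheet.

```python
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Large-sample z-interval for the mean, assuming sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    margin = z_crit * sigma / n ** 0.5            # z_{alpha/2} * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical observed sample mean of 58,000.
low, high = z_confidence_interval(x_bar=58_000, sigma=25_000, n=100)
print(f"95% CI: ({low:,.0f}, {high:,.0f})")  # roughly (53,100, 62,900)
```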

Hypothesis Testing

Test statistic (large $n$ ):

Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}

Under the null hypothesis $\mu = \mu_0$, this $Z$ follows $N(0,1)$ approximately by the CLT.
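The test statistic can be computed the same way. The sketch below is one possible implementation, not a prescribed one: the function name `z_test` is hypothetical, the two-sided p-value is an addition not stated above, and the inputs reuse the numbers from the income example as if 55,000 were an observed sample mean.

```python
from statistics import NormalDist


def z_test(x_bar, mu_0, sigma, n):
    """Return the z statistic and a two-sided p-value, assuming sigma is known."""
    z = (x_bar - mu_0) / (sigma / n ** 0.5)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical test of H0: mu = 60,000 against a sample mean of 55,000.
z, p = z_test(x_bar=55_000, mu_0=60_000, sigma=25_000, n=100)
print(f"Z = {z:.2f}, two-sided p = {p:.4f}")  # Z = -2.00, p is about 0.0455
```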

Examples

Example 1: Non-Normal Population

Rolls of a fair die (discrete uniform distribution on 1 to 6, $\mu = 3.5$, $\sigma \approx 1.71$):

For $n = 40$ die rolls, find $P(\bar{x} > 3.7)$:

SE = \frac{1.71}{\sqrt{40}} = 0.27

Z = \frac{3.7 - 3.5}{0.27} = 0.74

P(Z > 0.74) = 1 - 0.7704 = 0.23
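The arithmetic above can be reproduced without rounding the intermediate steps. This sketch derives $\mu$ and $\sigma$ from the six faces of a fair die and applies the same normal approximation; the variable names are illustrative only.

```python
from statistics import NormalDist

faces = range(1, 7)
mu = sum(faces) / 6                                     # 3.5
sigma = (sum((f - mu) ** 2 for f in faces) / 6) ** 0.5  # about 1.71

n = 40
se = sigma / n ** 0.5            # standard error of the mean, about 0.27
z = (3.7 - mu) / se              # about 0.74
p = 1 - NormalDist().cdf(z)      # upper-tail probability, about 0.23

print(f"sigma = {sigma:.3f}, SE = {se:.3f}, Z = {z:.2f}, P = {p:.3f}")
```

Because the die is discrete, this is the plain normal approximation with no continuity correction, matching the hand calculation.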

Example 2: Skewed Population (Incomes)

Income: $\mu = \$60{,}000$, $\sigma = \$25{,}000$ (right-skewed)

For $n = 100$, find $P(\bar{x} < \$55{,}000)$:

SE = \frac{25000}{\sqrt{100}} = 2500

Z = \frac{55000 - 60000}{2500} = -2.0

P(Z < -2.0) = 0.0228
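An alternative to standardizing by hand is to model the sampling distribution of $\bar{x}$ directly as a normal distribution with mean $\mu$ and standard deviation $SE$. A brief sketch using the numbers from this example (the variable names are illustrative):

```python
from statistics import NormalDist

mu, sigma, n = 60_000, 25_000, 100
se = sigma / n ** 0.5                       # 2500.0

# Approximate sampling distribution of the mean under the CLT.
sampling_dist = NormalDist(mu=mu, sigma=se)
p = sampling_dist.cdf(55_000)               # P(x_bar < 55,000), about 0.0228

print(f"SE = {se:.0f}, P(x_bar < 55,000) = {p:.4f}")
```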

Example 3: Sum of Random Variables

By the CLT, for large $n$, the sum $S_n = X_1 + X_2 + \cdots + X_n$ of independent, identically distributed variables is approximately normal:

S_n \sim N(n\mu, n\sigma^2) \quad \text{approximately}

This form applies to totals such as aggregate insurance claims or total sales.
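The sum and mean versions of the CLT are the same statement on different scales. As a hedged illustration, the sketch below reuses the die-roll numbers from Example 1 and shows that $P(S_{40} > 148)$ under $N(n\mu, n\sigma^2)$ equals $P(\bar{x} > 3.7)$ under $N(\mu, \sigma^2/n)$; the threshold of 148 is simply $40 \times 3.7$.

```python
from statistics import NormalDist

mu, sigma, n = 3.5, 1.71, 40

# Sum scale: S_n is approximately N(n * mu, n * sigma^2).
sum_dist = NormalDist(mu=n * mu, sigma=sigma * n ** 0.5)
p_sum = 1 - sum_dist.cdf(148)

# Mean scale: x_bar is approximately N(mu, sigma^2 / n).
mean_dist = NormalDist(mu=mu, sigma=sigma / n ** 0.5)
p_mean = 1 - mean_dist.cdf(148 / n)

print(f"P(S_n > 148) = {p_sum:.4f}")
print(f"P(x_bar > 3.7) = {p_mean:.4f}")
```

Both probabilities come out at about 0.23, since dividing the sum by $n$ rescales the mean and standard deviation by the same factor.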

Relationship to Other Concepts

Concept	CLT Connection
Standard Error	$SE = \sigma/\sqrt{n}$ is the standard deviation of $\bar{x}$ and sets the spread of the CLT's normal approximation
Confidence Intervals	Justified by CLT
Z-tests	Work because of CLT
Sample Size	Larger $n$ improves the normal approximation

Common Misconceptions

Misconception	Reality
"Population becomes normal"	No, only $\bar{x}$ 's distribution does
"Works for any $n$ "	Need $n$ large enough
"Individual values are normal"	No, only sample means are
"Exact normality"	It's an approximation

Why It Matters

Enables inference: Many common procedures (z-tests, large-sample confidence intervals) rely on the CLT
Practical flexibility: No need to know the population distribution
Broad application: Works for many types of data
Quality control: Basis for control charts
Polling/surveys: Justifies sample-based conclusions