Chi-Square Distribution | Statistics Cheat Sheet

Overview

The chi-square ($\chi^2$) distribution is used for inference about population variance and for categorical data analysis, including goodness-of-fit and independence tests.

Definition

If $Z_1, Z_2, \ldots, Z_k$ are independent standard normal random variables, then the sum of their squares

$$\chi^2 = Z_1^2 + Z_2^2 + \cdots + Z_k^2$$

follows a chi-square distribution with $k$ degrees of freedom ($df = k$).
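The definition can be checked empirically. The sketch below is a minimal simulation using only the Python standard library: it sums $k$ squared standard normal draws many times and compares the sample mean and variance with the theoretical values $k$ and $2k$. The function name `simulate_chi_square`, the sample size, and the seed are illustrative choices, not part of the original sheet.

```python
import random
import statistics

def simulate_chi_square(k, n_samples=20000, seed=0):
    """Draw n_samples values of Z_1^2 + ... + Z_k^2 with each Z_i ~ N(0, 1)."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
        for _ in range(n_samples)
    ]

samples = simulate_chi_square(k=5)
sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)

# Theory: mean = k = 5 and variance = 2k = 10.
print(f"mean = {sample_mean:.2f}, variance = {sample_var:.2f}")
```

The printed values should land close to 5 and 10, with small deviations due to sampling noise.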

Properties

Property	Value
Range	$0$ to $\infty$ (non-negative)
Mean	$df$
Variance	$2 \times df$
Skewness	Positive (right-skewed)
Mode	$df - 2$ (for $df \geq 2$)

Shape

df = 2    df = 5    df = 10
  ╲         ╱╲        ╱─╲
   ╲       ╱  ╲      ╱   ╲
────╲─────╱────╲────╱─────╲────

Always right-skewed, though less so for large $df$
Approaches a symmetric, normal-like shape as $df$ increases
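To see how the shape changes with $df$, the summary values can be tabulated for a few cases. The sketch below is illustrative only: the helper name `chi2_summary` is hypothetical, and the skewness formula $\sqrt{8/df}$ is a standard result that is not listed in the properties table. The mode is taken as $0$ when $df < 2$.

```python
import math

def chi2_summary(df):
    """Return mean, variance, mode, and skewness of a chi-square distribution."""
    return {
        "mean": df,
        "variance": 2 * df,
        "mode": max(df - 2, 0),         # df - 2 for df >= 2, otherwise 0
        "skewness": math.sqrt(8 / df),  # standard result, assumed here
    }

for df in (2, 5, 10, 50):
    s = chi2_summary(df)
    print(df, s["mean"], s["variance"], s["mode"], round(s["skewness"], 3))
```

The skewness column shrinks as $df$ grows, which matches the statement that the distribution becomes more symmetric.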

Notation

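$\chi^2_{\alpha, df}$ denotes the critical value that leaves a right-tail area of $\alpha$ under the chi-square distribution with $df$ degrees of freedom.

Example: $\chi^2_{0.05, 10} = 18.307$ (right-tail area = 0.05, $df = 10$)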

Critical Values Table

$df$	$\chi^2_{0.995}$	$\chi^2_{0.99}$	$\chi^2_{0.05}$	$\chi^2_{0.025}$	$\chi^2_{0.01}$
1	0.000	0.000	3.841	5.024	6.635
5	0.412	0.554	11.070	12.833	15.086
10	2.156	2.558	18.307	20.483	23.209
15	4.601	5.229	24.996	27.488	30.578
20	7.434	8.260	31.410	34.170	37.566
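Table entries like these can be reproduced without a statistics library. The sketch below assumes integer $df$ and uses the known closed-form expressions for the chi-square right-tail probability (a finite sum for even $df$, an `erfc` term plus a finite sum for odd $df$), then inverts it by bisection. The names `chi2_sf` and `chi2_critical` are hypothetical helpers, not functions from any particular package.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term = math.exp(-half)
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    # Odd df: start from the df = 1 tail and add (df - 1) / 2 correction terms.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)  # extra term for df = 3
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total

def chi2_critical(alpha, df, tol=1e-9):
    """Find x with right-tail area alpha by bisection on chi2_sf."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # expand until the tail area drops below alpha
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for df in (1, 5, 10, 15, 20):
    row = [chi2_critical(a, df) for a in (0.995, 0.99, 0.05, 0.025, 0.01)]
    print(df, " ".join(f"{v:.3f}" for v in row))
```

Bisection is slow compared with dedicated quantile routines, but it keeps the example short and relies only on the tail probability being decreasing in $x$.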

Applications

1. Variance Testing

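Test statistic for a hypothesized population variance $\sigma_0^2$:

$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$

With $df = n - 1$, where $n$ is the sample size and $s^2$ is the sample variance.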

2. Goodness-of-Fit Test

Tests whether observed frequencies match expected frequencies:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where:

$O$ = observed frequency
$E$ = expected frequency
$df$ = (number of categories) $- 1$ (when no parameters are estimated from the data)

3. Test of Independence

Tests association between categorical variables:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where:

$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$
$df = (\text{rows} - 1) \times (\text{columns} - 1)$

Examples

Example 1: Goodness-of-Fit

Testing whether a die is fair: roll it 60 times and expect 10 outcomes per face under $H_0$.

Face	$O$	$E$	$(O-E)^2/E$
1	8	10	0.4
2	12	10	0.4
3	9	10	0.1
4	11	10	0.1
5	10	10	0.0
6	10	10	0.0

$$\chi^2 = 1.0, \quad df = 6 - 1 = 5$$

$$\text{Critical value } \chi^2_{0.05, 5} = 11.07$$

$$1.0 < 11.07 \Rightarrow \text{fail to reject } H_0 \text{ (die appears fair)}$$
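The die calculation can be reproduced in a few lines. This is a minimal sketch: the function name `goodness_of_fit_statistic` is a hypothetical helper, and the critical value is copied from the table rather than computed.

```python
def goodness_of_fit_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 12, 9, 11, 10, 10]  # counts for faces 1..6 from the table above
expected = [10] * 6                # 60 rolls / 6 faces under a fair die

stat = goodness_of_fit_statistic(observed, expected)
df = len(observed) - 1
critical = 11.070                  # chi^2_{0.05, 5}, taken from the critical values table
reject = stat > critical

print(f"chi2 = {stat:.1f}, df = {df}, reject H0: {reject}")
```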

Example 2: Independence Test

Survey of 200 people on product preference by gender:

	Product A	Product B	Total
Male	40	60	100
Female	60	40	100
Total	100	100	200

Expected (for each cell): $E = \frac{100 \times 100}{200} = 50$

$$\chi^2 = \frac{(40-50)^2}{50} + \frac{(60-50)^2}{50} + \frac{(60-50)^2}{50} + \frac{(40-50)^2}{50} = 2 + 2 + 2 + 2 = 8$$

$$df = (2-1)(2-1) = 1, \quad \chi^2_{0.05, 1} = 3.841$$

$$8 > 3.841 \Rightarrow \text{reject } H_0 \text{ (preference depends on gender)}$$
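The same steps generalize to any two-way table: derive expected counts from the margins, then sum the cell contributions. The sketch below uses hypothetical helper names (`expected_counts`, `independence_statistic`). The p-value line relies on the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$, which holds only for $df = 1$ and is an addition not found in the original example.

```python
import math

def expected_counts(table):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def independence_statistic(table):
    """Return the chi-square statistic and df for a two-way table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

table = [[40, 60],   # Male: Product A, Product B
         [60, 40]]   # Female: Product A, Product B

stat, df = independence_statistic(table)
p_value = math.erfc(math.sqrt(stat / 2))  # valid only because df == 1 here

print(f"chi2 = {stat:.1f}, df = {df}, p = {p_value:.4f}")
```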

Example 3: Variance Test

Testing $H_0: \sigma^2 = 25$ vs $H_1: \sigma^2 \neq 25$

Sample: $n = 20$, $s^2 = 40$

$$\chi^2 = \frac{(20-1)(40)}{25} = 30.4, \quad df = 19$$

For a two-sided test at $\alpha = 0.05$, each tail has area $0.025$:

$$\text{Lower critical: } \chi^2_{0.975, 19} = 8.907$$

$$\text{Upper critical: } \chi^2_{0.025, 19} = 32.852$$

$$8.907 < 30.4 < 32.852 \Rightarrow \text{fail to reject } H_0$$
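A short sketch of the two-sided decision rule, using the critical values quoted above rather than computing them. The helper names `variance_test_statistic` and `two_sided_reject` are illustrative, not from any library.

```python
def variance_test_statistic(n, sample_var, sigma0_sq):
    """Chi-square statistic (n - 1) * s^2 / sigma_0^2 for a variance test."""
    return (n - 1) * sample_var / sigma0_sq

def two_sided_reject(stat, lower, upper):
    """Reject H0 when the statistic falls outside [lower, upper]."""
    return stat < lower or stat > upper

stat = variance_test_statistic(n=20, sample_var=40, sigma0_sq=25)
df = 20 - 1
reject = two_sided_reject(stat, lower=8.907, upper=32.852)  # critical values for df = 19

print(f"chi2 = {stat:.1f}, df = {df}, reject H0: {reject}")
```

For a one-sided alternative such as $H_1: \sigma^2 > 25$, only the upper critical value would be used, taken at the full $\alpha$ instead of $\alpha/2$.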

Assumptions

Random sampling
Independence of observations
Expected frequencies $\geq 5$ in each cell (for categorical tests)
Normally distributed population (for variance tests)