Sampling Distributions

Overview

A sampling distribution is the probability distribution of a statistic (like the sample mean) calculated from all possible samples of a given size from a population.

Key Concepts

Term	Definition
Population	The entire group of interest
Sample	A subset of the population
Parameter	A numerical characteristic of a population ($\mu$, $\sigma$)
Statistic	A numerical characteristic of a sample ($\bar{x}$, $s$)
Sampling distribution	Distribution of a statistic over all possible samples

Sampling Distribution of the Mean

If we take all possible samples of size $n$ from a population and calculate $\bar{x}$ for each, the resulting sample means have the following mean and standard deviation:

Mean of $\bar{x}$

\mu_{\bar{x}} = \mu

The mean of sample means equals the population mean.

Standard Error of the Mean

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

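The standard deviation of the sample means, called the standard error, shrinks in proportion to $1/\sqrt{n}$ as $n$ increases. The formula assumes independent observations, which holds when sampling with replacement or when the population is large relative to $n$.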
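
Both results can be checked by brute force on a small case. The sketch below is illustrative only: the four-value population and $n = 2$ are made-up choices, and samples are drawn with replacement so that the draws are independent. It enumerates every possible ordered sample and compares the mean and variance of the sample means with $\mu$ and $\sigma^2 / n$.

```python
from itertools import product
from statistics import fmean, pvariance

population = [2, 4, 6, 8]   # hypothetical toy population
n = 2                       # sample size

# Every ordered sample of size n drawn with replacement (4**2 = 16 samples).
sample_means = [fmean(sample) for sample in product(population, repeat=n)]

mu = fmean(population)            # population mean
sigma_sq = pvariance(population)  # population variance (divides by N)

mean_of_means = fmean(sample_means)
var_of_means = pvariance(sample_means)

print("mean of sample means:", mean_of_means, "| mu:", mu)
print("variance of sample means:", var_of_means, "| sigma^2 / n:", sigma_sq / n)
```

Enumeration is only practical for tiny populations, since the number of samples grows as $N^n$; for anything larger, repeated random sampling is the usual substitute.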

Standard Error

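The standard error (SE) is the standard deviation of a statistic's sampling distribution, so it measures how much the statistic varies from sample to sample. For the sample mean: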

SE = \frac{\sigma}{\sqrt{n}} \quad \text{(if } \sigma \text{ known)}

SE = \frac{s}{\sqrt{n}} \quad \text{(if } \sigma \text{ estimated by } s \text{)}
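
As a minimal sketch of the two cases, the helper below (a hypothetical function named `standard_error`, not part of any library) uses a known $\sigma$ when one is supplied and otherwise estimates it with the sample standard deviation $s$. The data values are made up for illustration.

```python
import math
import statistics

def standard_error(sample, sigma=None):
    """Standard error of the mean for one sample.

    Uses the known population sigma when supplied; otherwise falls back
    to the sample standard deviation s (n - 1 denominator).
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample is empty")
    # statistics.stdev raises StatisticsError if fewer than two values are given.
    spread = sigma if sigma is not None else statistics.stdev(sample)
    return spread / math.sqrt(n)

scores = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up data
print(standard_error(scores))               # sigma estimated by s
print(standard_error(scores, sigma=2.0))    # sigma treated as known
```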

Interpretation

A smaller SE means:

Sample means cluster more tightly around $\mu$
Estimates of $\mu$ are more precise

Because SE is proportional to $1/\sqrt{n}$, larger samples give a smaller SE.

Properties

Property	Formula
Mean of $\bar{x}$	$\mu$
Variance of $\bar{x}$	$\sigma^2 / n$
Standard Error	$\sigma / \sqrt{n}$

Effect of Sample Size

$n$	SE relative to $\sigma$
1	$\sigma$
4	$\sigma/2$
9	$\sigma/3$
25	$\sigma/5$
100	$\sigma/10$

Quadrupling $n$ cuts SE in half.
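
This rule follows directly from the formula. Writing $SE_{4n}$ (notation introduced here only for this step) for the standard error at sample size $4n$:

SE_{4n} = \frac{\sigma}{\sqrt{4n}} = \frac{\sigma}{2\sqrt{n}} = \frac{1}{2} \cdot \frac{\sigma}{\sqrt{n}}

More generally, multiplying $n$ by a factor $k$ divides the SE by $\sqrt{k}$, which is why each row of the table uses a perfect square for $n$.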

Other Sampling Distributions

Sample Proportion

For the sample proportion $\hat{p}$ computed from samples of size $n$, where $p$ is the population proportion:

E(\hat{p}) = p

SE = \sqrt{\frac{p(1-p)}{n}}
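
One way to check these two formulas is to build the sampling distribution of $\hat{p}$ exactly from the binomial probabilities, since the number of successes in $n$ independent trials is binomial. The sketch below does that; the function name and the values $p = 0.3$, $n = 50$ are illustrative assumptions.

```python
import math

def proportion_sampling_distribution(p, n):
    """Return (mean, standard error) of p-hat from the exact binomial pmf."""
    # P(k successes in n independent trials), for k = 0..n
    pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * w for k, w in enumerate(pmf))
    var = sum(((k / n) - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, math.sqrt(var)

p, n = 0.3, 50                      # hypothetical population proportion and sample size
mean, se = proportion_sampling_distribution(p, n)
formula_se = math.sqrt(p * (1 - p) / n)

print("E(p-hat):", mean, "| p:", p)
print("SE from pmf:", se, "| SE from formula:", formula_se)
```

The two pairs of numbers should agree up to floating-point rounding.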

Difference of Means

For $\bar{x}_1 - \bar{x}_2$ from independent samples:

E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2

SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
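
A point that is easy to miss in this formula is that the variances add, not the standard errors. The short sketch below illustrates this with made-up numbers; `se_difference` is a hypothetical helper name.

```python
import math

def se_difference(sigma1, n1, sigma2, n2):
    """Standard error of x-bar1 - x-bar2 for two independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Made-up inputs chosen so the arithmetic is easy to follow.
se1 = 6 / math.sqrt(4)     # SE of the first mean
se2 = 8 / math.sqrt(4)     # SE of the second mean
combined = se_difference(6, 4, 8, 4)

print(se1, se2, combined)  # combined is sqrt(se1**2 + se2**2), not se1 + se2
```

With these inputs the individual standard errors are 3 and 4, and the standard error of the difference is 5 rather than 7.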

Sampling Variability

Population
    ↓
┌─────────────────────────────┐
│  Sample 1 → x̄₁             │
│  Sample 2 → x̄₂             │  → Sampling Distribution
│  Sample 3 → x̄₃             │     of x̄
│  ...                         │
│  Sample k → x̄ₖ             │
└─────────────────────────────┘

Each sample gives a different $\bar{x}$, and this sample-to-sample variation is what the sampling distribution describes.
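
The process in the diagram can be imitated by simulation. The sketch below is a rough illustration under assumed settings: a normal population with $\mu = 100$ and $\sigma = 20$, samples of size $n = 25$, and $k = 5000$ repeated samples. None of these values come from real data, and a finite number of samples only approximates the full sampling distribution.

```python
import math
import random
import statistics

rng = random.Random(42)      # fixed seed so the run is repeatable

mu, sigma = 100, 20          # assumed population parameters
n, k = 25, 5000              # sample size and number of repeated samples

def draw_sample_mean():
    """Draw one sample of size n from the population and return its mean."""
    return statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))

# One x-bar per sample, as in the diagram above.
sample_means = [draw_sample_mean() for _ in range(k)]

print("mean of sample means:", round(statistics.fmean(sample_means), 2))
print("SD of sample means:  ", round(statistics.stdev(sample_means), 2))
print("sigma / sqrt(n):     ", sigma / math.sqrt(n))
```

The first value should land close to $\mu$ and the second close to $\sigma / \sqrt{n}$, with small differences due to simulation noise.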

Examples

Example 1: Standard Error

Population: $\mu = 100$, $\sigma = 20$

Sample size $n = 25$:

SE = \frac{20}{\sqrt{25}} = \frac{20}{5} = 4

Sample size $n = 100$:

SE = \frac{20}{\sqrt{100}} = \frac{20}{10} = 2

Example 2: Probability Using SE

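Population: $\mu = 500$, $\sigma = 100$, sample size $n = 64$

Find $P(\bar{x} > 520)$. With $n = 64$, the Central Limit Theorem lets us treat $\bar{x}$ as approximately normal with mean $\mu$ and standard deviation SE, so we standardize: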

SE = \frac{100}{\sqrt{64}} = 12.5

Z = \frac{520 - 500}{12.5} = 1.6

P(Z > 1.6) = 1 - 0.9452 = 0.0548
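
The same calculation can be reproduced without a Z table. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module, and it rests on the same assumption as the worked example: that $\bar{x}$ is approximately normal.

```python
import math
from statistics import NormalDist

mu, sigma, n = 500, 100, 64      # values from Example 2
threshold = 520

se = sigma / math.sqrt(n)        # standard error of the mean
z = (threshold - mu) / se        # standardized distance from mu
p_tail = 1 - NormalDist().cdf(z) # upper-tail area of the standard normal

print(se, z, round(p_tail, 4))
```

An equivalent route is `1 - NormalDist(mu, se).cdf(threshold)`, which skips the explicit standardization step.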

Example 3: Required Sample Size

To cut the SE in half from its current value, the sample size must be quadrupled:

n_{\text{new}} = 4 \times n_{\text{current}}

To reduce SE from 10 to 5 when $\sigma = 50$:

\text{Current: } 10 = \frac{50}{\sqrt{n}} \Rightarrow n = 25

\text{Target: } 5 = \frac{50}{\sqrt{n_{\text{new}}}} \Rightarrow n_{\text{new}} = 100
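
Solving $SE = \sigma / \sqrt{n}$ for $n$ gives $n = (\sigma / SE)^2$, which can be wrapped in a small helper. The function below is a hypothetical sketch (the name `required_sample_size` is not from any library); it rounds up because a sample size must be a whole number, and it assumes $\sigma$ is known.

```python
import math

def required_sample_size(sigma, target_se):
    """Smallest whole n with sigma / sqrt(n) <= target_se."""
    if sigma <= 0 or target_se <= 0:
        raise ValueError("sigma and target_se must be positive")
    # Round up: any fractional n must be increased to meet the target.
    # Floating-point rounding can, in edge cases, push the result up by one.
    return math.ceil((sigma / target_se) ** 2)

n_current = required_sample_size(50, 10)   # SE of 10 with sigma = 50
n_new = required_sample_size(50, 5)        # SE of 5 with sigma = 50

print(n_current, n_new, n_new / n_current)
```

For the values in this example the ratio of the two sample sizes is 4, matching the quadrupling rule.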

Importance

Inference foundation: Understanding sampling variability enables hypothesis testing and confidence intervals
Precision planning: Calculate the sample size required to reach a target SE
Parameter estimation: Quantify the uncertainty in estimates