Overview
In hypothesis testing, we can make two types of errors when drawing conclusions about a population based on sample data.
Error Types
| Decision | H₀ True | H₀ False |
|---|---|---|
| Reject H₀ | Type I Error (α) | Correct Decision (Power) |
| Fail to Reject H₀ | Correct Decision | Type II Error (β) |
Type I Error (α)
Definition
Rejecting H₀ when it is actually true.
Also Called
- False positive
- False alarm
- α error
Probability
α = P(reject H₀ | H₀ is true)
Example
Concluding a drug works when it actually doesn't.
Type II Error (β)
Definition
Failing to reject H₀ when it is actually false.
Also Called
- False negative
- Missed detection
- β error
Probability
β = P(fail to reject H₀ | H₀ is false)
Example
Concluding a drug doesn't work when it actually does.
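Both error rates can be estimated by simulation. The sketch below is only an illustration: it assumes a two-sided one-sample z-test with known σ, and the function name `rejection_rate` and every parameter value (n = 25, σ = 1, a true mean of 0.5 when H₀ is false) are hypothetical choices, not part of the definitions above.

```python
import random
from statistics import NormalDist

def rejection_rate(true_mean, n=25, sigma=1.0, mu0=0.0, alpha=0.05,
                   trials=10_000, seed=0):
    """Fraction of simulated samples in which a two-sided z-test rejects H0: mu = mu0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    se = sigma / n ** 0.5                          # standard error of the sample mean
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true (true mean equals mu0): every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false (assumed true mean 0.5): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Estimated Type I error rate:  {type_i_rate:.3f}")
print(f"Estimated Type II error rate: {type_ii_rate:.3f}")
```

The estimated Type I rate should land close to the chosen α, while the Type II rate depends on how far the assumed true mean sits from the H₀ value.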
Power
Definition
The probability of correctly rejecting a false H₀: Power = 1 − β = P(reject H₀ | H₀ is false).
Desirable Values
- Power ≥ 0.80 is a common standard
- Higher power = better ability to detect effects
Relationships
Visual Representation
          H₀ true            H₁ true
       distribution       distribution
             ↓                  ↓
           ╭───╮              ╭───╮
          ╱     ╲            ╱     ╲
         ╱       ╲          ╱       ╲
    ────┴─────────┴────────┴─────────┴────
                  │                 │
                  │    Rejection    │
                  │←── Region ─────→│
           Critical Value
Area under H₀ curve in rejection region = α
Area under H₁ curve NOT in rejection region = β
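The two areas can be computed directly once both sampling distributions are written down. The sketch below assumes a right-tailed z-test on a sample mean with known σ; the numbers (H₀ mean 100, H₁ mean 103, σ = 10, n = 50) are hypothetical and serve only to make the areas concrete.

```python
from statistics import NormalDist

# Hypothetical setup: right-tailed test of H0: mu = 100 against H1: mu = 103.
mu0, mu1, sigma, n, alpha = 100.0, 103.0, 10.0, 50, 0.05
se = sigma / n ** 0.5  # standard error of the sample mean

h0 = NormalDist(mu0, se)  # sampling distribution of the mean if H0 is true
h1 = NormalDist(mu1, se)  # sampling distribution of the mean if H1 is true

critical_value = h0.inv_cdf(1 - alpha)   # reject H0 when the sample mean exceeds this
alpha_area = 1 - h0.cdf(critical_value)  # H0 curve area inside the rejection region
beta_area = h1.cdf(critical_value)       # H1 curve area outside the rejection region
power = 1 - beta_area

print(f"critical value = {critical_value:.2f}")
print(f"alpha = {alpha_area:.3f}, beta = {beta_area:.3f}, power = {power:.3f}")
```

Moving `critical_value` to the right shrinks `alpha_area` and grows `beta_area`, which is the tradeoff between the two errors.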
Tradeoff
| Choice | Effect |
|---|---|
| Lower α | Higher β (less power) |
| Higher α | Lower β (more power) |
| Larger n | Lower β (keeping α the same) |
Factors Affecting Power
| Factor | Effect on Power |
|---|---|
| Larger sample size (n) | ↑ Increases |
| Larger effect size | ↑ Increases |
| Lower variance (σ²) | ↑ Increases |
| Higher α | ↑ Increases |
| One-tailed vs Two-tailed | One-tailed has more power (when the effect lies in the hypothesized direction) |
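Each row of the table can be checked numerically. The following is a minimal sketch under stated assumptions: a one-sample z-test with known σ, a hypothetical helper `z_test_power`, and an arbitrary baseline (effect 0.5, σ = 2, n = 50, α = 0.05, two-tailed). Each scenario changes exactly one factor.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def z_test_power(effect, sigma, n, alpha, two_tailed=True):
    """Power of a one-sample z-test when the true mean exceeds the H0 mean by `effect`."""
    shift = effect * n ** 0.5 / sigma  # gap between the H0 and H1 curves in standard errors
    if two_tailed:
        z_crit = Z.inv_cdf(1 - alpha / 2)
        # Rejection can happen in either tail; the lower-tail term is usually tiny.
        return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)
    z_crit = Z.inv_cdf(1 - alpha)
    return Z.cdf(shift - z_crit)

# Hypothetical baseline; every other scenario changes one factor only.
baseline = {"effect": 0.5, "sigma": 2.0, "n": 50, "alpha": 0.05, "two_tailed": True}
scenarios = {
    "baseline": baseline,
    "larger n": {**baseline, "n": 100},
    "larger effect": {**baseline, "effect": 1.0},
    "lower variance": {**baseline, "sigma": 1.0},
    "higher alpha": {**baseline, "alpha": 0.10},
    "one-tailed": {**baseline, "two_tailed": False},
}
powers = {name: z_test_power(**kwargs) for name, kwargs in scenarios.items()}
for name, value in powers.items():
    print(f"{name:<15} power = {value:.3f}")
```

Every non-baseline scenario should report higher power than the baseline, matching the direction given in the table.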
Examples
Example 1: Courtroom Analogy
H₀: Defendant is innocent
- Type I Error: Convicting an innocent person (α)
- Type II Error: Acquitting a guilty person (β)
The justice system sets a very low α ("beyond reasonable doubt").
Example 2: Medical Screening
H₀: Patient does not have the disease
- Type I Error: False positive (unnecessary treatment)
- Type II Error: False negative (missed diagnosis)
Which error is worse depends on the disease and treatment.
Example 3: Quality Control
H₀: Product meets specifications
- Type I Error: Rejecting good products (waste)
- Type II Error: Accepting bad products (customer complaints)
Controlling Errors
To Reduce α
- Lower significance level
- Tradeoff: increases β
To Reduce β (Increase Power)
- Increase sample size
- Increase α (if acceptable)
- Reduce measurement error
- Focus on larger effect sizes
Power Analysis
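Before conducting a study, fix α, the target power (1 − β), the smallest effect size worth detecting, and an estimate of σ.
Power analysis then determines the required sample size to detect a meaningful effect.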
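A sample-size calculation of this kind can be sketched with the usual normal approximation, n ≈ ((z₁₋α/₂ + z₁₋β) · σ / effect)². The helper `required_sample_size` and the input values below are hypothetical, and the formula assumes a one-sample z-test with known σ; other designs (t-tests, two-sample comparisons, proportions) need different formulas.

```python
import math
from statistics import NormalDist

def required_sample_size(effect, sigma, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate n for a one-sample z-test to reach the target power."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)  # quantile matching the target power (1 - beta)
    n = ((z_alpha + z_beta) * sigma / effect) ** 2
    return math.ceil(n)        # round up so power is at least the target

# Hypothetical planning inputs: detect a shift of 0.5 when sigma is about 2.
n_needed = required_sample_size(effect=0.5, sigma=2.0)
print(f"Required sample size: {n_needed}")

# Halving the effect size roughly quadruples the required sample size.
n_small_effect = required_sample_size(effect=0.25, sigma=2.0)
print(f"Required sample size for half the effect: {n_small_effect}")
```

Rounding up with `math.ceil` is a deliberate choice: rounding down would leave the study slightly underpowered.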
Practical Significance vs Statistical Significance
- Statistical significance: p < α (the result would be unlikely if H₀ were true)
- Practical significance: Effect is large enough to matter
A very large sample can detect statistically significant but practically unimportant effects.
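To illustrate, the sketch below holds a tiny observed difference fixed and only grows the sample. It assumes a two-sided one-sample z-test with known σ; the helper `two_sided_p_value` and the numbers (difference 0.3, σ = 15) are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(observed_diff, sigma, n):
    """Two-sided p-value of a one-sample z-test for a given observed mean difference."""
    z = observed_diff / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

sigma = 15.0         # hypothetical population standard deviation
observed_diff = 0.3  # hypothetical difference from the H0 mean, tiny relative to sigma
standardized_effect = observed_diff / sigma  # does not change with n

p_values = {n: two_sided_p_value(observed_diff, sigma, n) for n in (100, 10_000, 1_000_000)}
for n, p in p_values.items():
    print(f"n = {n:>9,}  standardized effect = {standardized_effect:.2f}  p = {p:.4f}")
```

The standardized effect is identical in every row, so any change in the p-value comes from the sample size alone, not from the effect becoming more important.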