Data Visualization

Graphical representations: histograms, box plots, scatter plots, bar charts, and pie charts.

Overview

Data visualization is the graphical representation of data to help understand patterns, trends, and distributions. Choosing the right visualization depends on your data type and analysis goals.

Chart Types by Data Type

Data Type                    Recommended Charts
Categorical                  Bar chart, Pie chart
Numerical (one variable)     Histogram, Box plot, Dot plot
Numerical (two variables)    Scatter plot, Line chart
Time series                  Line chart, Area chart
Part-to-whole                Pie chart, Stacked bar

Histograms

Purpose

Display the distribution of a single continuous variable.

Key Elements

  • Bars: Represent frequency of values in each bin
  • Bins: Intervals of equal width
  • No gaps: Bars are adjacent (unlike bar charts)
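
To make the ideas of bins and bar heights concrete, here is a minimal sketch in Python (standard library only) that counts how many values fall into each of k equal-width bins. The function name bin_counts and the sample scores are illustrative assumptions, not part of any library.

```python
def bin_counts(data, k):
    """Count values in k equal-width bins spanning min(data)..max(data)."""
    lo, hi = min(data), max(data)
    if hi == lo:
        # All values identical: put everything in the first bin.
        return [len(data)] + [0] * (k - 1)
    width = (hi - lo) / k  # every bin has the same width
    counts = [0] * k
    for x in data:
        # Index of the bin containing x; the maximum goes in the last bin.
        i = min(int((x - lo) / width), k - 1)
        counts[i] += 1
    return counts


# Hypothetical sample: nine exam scores
scores = [52, 55, 61, 64, 68, 70, 71, 79, 92]
counts = bin_counts(scores, 4)  # four bins of width 10, starting at 52
for i, c in enumerate(counts):
    print(f"bin {i}: {'#' * c}")
```

Each printed row plays the role of one histogram bar: its length is the frequency of the values in that bin.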

Choosing Bin Width

Sturges' Rule: $k = 1 + 3.322 \times \log_{10}(n)$

Where $k$ is the number of bins and $n$ is the sample size. Round $k$ to a whole number; the width of each bin is then the data range divided by $k$.
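
As a worked illustration of the rule, the following Python sketch computes the suggested number of bins and the resulting bin width. The function names are hypothetical, and rounding the result up (rather than to the nearest integer) is an assumed convention.

```python
import math


def sturges_bins(n):
    """Number of bins suggested by Sturges' Rule, rounded up (assumed convention)."""
    return math.ceil(1 + 3.322 * math.log10(n))


def bin_width(data):
    """Equal bin width: data range divided by the Sturges bin count."""
    k = sturges_bins(len(data))
    return (max(data) - min(data)) / k


print(sturges_bins(100))   # 1 + 3.322 * 2 = 7.644, rounded up to 8
print(sturges_bins(1000))  # 1 + 3.322 * 3 = 10.966, rounded up to 11
```

Because the bin count grows with the logarithm of n, multiplying the sample size by ten adds only about three bins.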

Interpretation

  • Shape: Symmetric, skewed left/right, bimodal
  • Center: Where most data is concentrated
  • Spread: Width of the distribution
  • Outliers: Isolated bars far from center

Box Plots (Box-and-Whisker)

Components

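          ┌────────┬────────┐
  ├───────┤        │        ├───────┤      •
          └────────┴────────┘
 Min      Q₁      Med       Q₃     Max  Outlier

Min and Max here are the smallest and largest values that are not flagged as outliers; flagged points are drawn individually beyond the whiskers.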

Shows

  • Five-number summary
  • IQR (box length)
  • Skewness (median position within box)
  • Outliers (individual points)
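
The quantities a box plot displays can be computed directly. The sketch below uses Python's statistics module; the function name, the sample data, and the 1.5 × IQR cutoff for outliers are assumptions (the cutoff is a widely used convention, but the text above does not specify one). Different tools also compute quartiles slightly differently, so the values may not match other software exactly.

```python
import statistics


def box_plot_summary(data):
    """Five-number summary, IQR, whisker ends, and 1.5 * IQR outliers."""
    values = sorted(data)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                # length of the box
    low_fence = q1 - 1.5 * iqr   # assumed outlier cutoffs
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in values if low_fence <= x <= high_fence]
    outliers = [x for x in values if x < low_fence or x > high_fence]
    return {
        "min": values[0], "q1": q1, "median": median, "q3": q3,
        "max": values[-1], "iqr": iqr,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical sample with one unusually large value
summary = box_plot_summary([2, 4, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

For this sample the box runs from 4 to 8 with the median at 6, the whiskers end at 2 and 9, and 30 is drawn as an individual point.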

Scatter Plots

Purpose

Show the relationship between two numerical variables.

Interpretation

Pattern           Relationship
Upward slope      Positive correlation
Downward slope    Negative correlation
No pattern        No correlation
Curved pattern    Nonlinear relationship
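
The patterns in this table can be checked numerically. The sketch below assumes the Pearson correlation coefficient as the measure of correlation (the table does not name one) and uses hypothetical data. It also shows why a curved pattern needs its own row: a clear U-shape can have a coefficient of zero.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


hours = [1, 2, 3, 4, 5]                 # hypothetical study hours
upward = [52, 58, 61, 70, 74]           # rises with hours
curved = [(h - 3) ** 2 for h in hours]  # U-shaped: 4, 1, 0, 1, 4

print(round(pearson_r(hours, upward), 2))  # close to +1: upward slope
print(round(pearson_r(hours, curved), 2))  # 0.0 despite a clear curved pattern
```

A coefficient near zero therefore rules out a linear trend only, which is why the scatter plot itself should always be inspected.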

Bar Charts

Guidelines

  • Bars should have equal width
  • Start y-axis at zero
  • Order categories logically
  • Use horizontal bars for long labels

Types

  • Simple: One bar per category, for a single data series
  • Grouped: Side-by-side bars that compare categories across groups
  • Stacked: Show composition of totals

Pie Charts

Guidelines

  • Use for part-to-whole relationships
  • Limit to 5-7 categories
  • Order slices by size
  • Label directly when possible
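
The arithmetic behind a pie chart is a conversion of each count into a share of the total and an angle out of 360 degrees. The following Python sketch does this, orders slices by size, and merges the smallest categories into a single slice when there are too many. The function name, the "Other" label, the default limit of 6, and the budget figures are all assumptions made for illustration.

```python
def pie_slices(counts, max_slices=6):
    """Return (label, percent, degrees) per slice, largest first.

    Categories beyond max_slices - 1 are merged into an "Other" slice.
    """
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    if len(ordered) > max_slices:
        kept, rest = ordered[:max_slices - 1], ordered[max_slices - 1:]
        ordered = kept + [("Other", sum(v for _, v in rest))]
    total = sum(v for _, v in ordered)
    return [(label, 100 * v / total, 360 * v / total) for label, v in ordered]


budget = {"Rent": 50, "Food": 25, "Transport": 15, "Leisure": 10}  # hypothetical
for label, pct, deg in pie_slices(budget):
    print(f"{label:<10}{pct:5.1f}%  {deg:5.1f} degrees")
```

The percentages always sum to 100 and the angles to 360, which is what makes the chart a part-to-whole display.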

When to Avoid

  • Comparing similar-sized slices
  • Showing trends over time
  • Many categories

Line Charts

Purpose

Display trends over time or across ordered categories.

Guidelines

  • Time on x-axis
  • Connect data points with lines
  • Show grid lines for reference
  • Use markers for actual data points

Best Practices

Do

  • Choose appropriate chart type
  • Label axes clearly
  • Include units of measurement
  • Use consistent scales
  • Add meaningful titles

Don't

  • Use 3D effects (distorts perception)
  • Truncate the y-axis of a bar chart (exaggerates differences)
  • Use too many colors
  • Overcrowd with data
  • Use pie charts for precise comparisons between categories (bar lengths are easier to judge than angles)
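
The warning about truncated axes can be made concrete with a little arithmetic. The sketch below compares how tall two bars are drawn relative to each other when the axis starts at zero and when it starts at a higher baseline. The function name and the values are hypothetical.

```python
def apparent_ratio(a, b, baseline=0):
    """Ratio of drawn bar heights for values a and b when the axis starts at baseline."""
    return (b - baseline) / (a - baseline)


# Hypothetical values: 50 vs 55, a real difference of 10%
print(apparent_ratio(50, 55))               # axis from 0: second bar is 1.1 times as tall
print(apparent_ratio(50, 55, baseline=45))  # axis from 45: second bar is 2.0 times as tall
```

Starting the axis at 45 makes a 10% difference look like a doubling.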

Choosing the Right Chart

Purpose         Chart Type
Comparison      Bar chart
Distribution    Histogram, Box plot
Relationship    Scatter plot
Trend           Line chart
Composition     Pie chart, Stacked bar

Example Interpretations

Histogram Shape Analysis

  • Right-skewed: Long tail on right (e.g., income)
  • Left-skewed: Long tail on left (e.g., age at retirement)
  • Bimodal: Two peaks (possible subgroups)
  • Uniform: Bars of roughly equal height (all values about equally likely)
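
A rough numerical companion to reading skew off a histogram is to compare the mean with the median: a long right tail tends to pull the mean above the median, and a long left tail pulls it below. The Python sketch below applies that heuristic. It is only a rule of thumb that can fail for some distributions; the function name, the tolerance of 0.05, and the sample data are assumptions.

```python
import statistics


def skew_direction(data, tolerance=0.05):
    """Rough skew label from the mean-median gap, scaled by the standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    spread = statistics.pstdev(data)
    # Treat a small gap (relative to the spread) as no clear skew.
    if spread == 0 or abs(mean - median) / spread < tolerance:
        return "roughly symmetric"
    return "right-skewed" if mean > median else "left-skewed"


incomes = [28, 30, 32, 35, 38, 40, 45, 120]      # hypothetical, in thousands
retirement_ages = [45, 58, 62, 64, 65, 65, 66, 67]  # hypothetical

print(skew_direction(incomes))          # mean 46 > median 36.5
print(skew_direction(retirement_ages))  # mean 61.5 < median 64.5
```

The label is a starting point only; the histogram remains the better guide to the actual shape, including bimodality, which this check cannot detect.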

Scatter Plot Analysis

  • Look for direction (positive/negative)
  • Look for form (linear/curved)
  • Look for strength (tight/loose clustering)
  • Identify outliers