Overview

Data Distribution and Graphs tests the ability to read, interpret, and compare data presented in histograms, box plots, dot plots, two-way frequency tables, bar charts, line graphs, and pie charts on the digital SAT. Questions ask students to identify distribution shape, compare distributions by center and spread, and compute joint, marginal, and conditional frequencies from two-way tables. These are calculator-active medium-difficulty questions, and this topic is one of the most visually varied in the Problem-Solving and Data Analysis domain.

Key Points

1. Histograms

A histogram displays frequency data using adjacent bars, where:

  • The x-axis shows intervals (bins)
  • The y-axis shows frequency (count) for each bin
  • Bar width = bin width; bar height = frequency

What you can and cannot read:

  • Can read: frequency per bin, approximate shape, rough location of median
  • Cannot read: individual data values (unlike dot plots)
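
As a minimal sketch of why individual values are lost, the snippet below groups a small data set into bins the way a histogram does. The scores, the `bin_counts` helper, and the bin boundaries (each bin includes its lower edge and excludes its upper edge) are assumptions made up for illustration.

```python
from collections import Counter


def bin_counts(values, start, width):
    """Count values per bin [start + k*width, start + (k+1)*width).

    Bins that contain no values are simply omitted from the result.
    """
    counts = Counter((v - start) // width for v in values)
    return {
        (start + k * width, start + (k + 1) * width): counts[k]
        for k in sorted(counts)
    }


# Hypothetical quiz scores, invented for illustration.
scores = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78, 82, 85, 91]
histogram = bin_counts(scores, start=50, width=10)

# Print a text histogram: one '#' per data value in the bin.
for (low, high), count in histogram.items():
    print(f"{low}-{high}: {'#' * count} ({count})")
```

Once the scores are binned, only the counts survive: the 70-80 bar has height 5, but nothing in the histogram says whether those values were 70, 71, 73, 75, and 78 or five copies of 79.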

Shape vocabulary:

| Shape | Description |
| --- | --- |
| Symmetric | Mirror-image halves; mean ≈ median |
| Right-skewed (positive) | Long tail to the right; mean > median |
| Left-skewed (negative) | Long tail to the left; mean < median |
| Uniform | All bars approximately equal height |
| Bimodal | Two peaks |
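
The mean-versus-median relationships in the shape vocabulary can be checked on small data sets. The sketch below uses Python's standard `statistics` module; the three lists and the `skew_hint` helper are hypothetical, and the rule it applies is a rule of thumb rather than a formal skewness measure.

```python
from statistics import mean, median


def skew_hint(values):
    """Guess skew direction from mean vs. median (rule of thumb only)."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed (mean > median)"
    if m < med:
        return "left-skewed (mean < median)"
    return "roughly symmetric (mean = median)"


# Hypothetical data sets, invented for illustration.
right_tail = [1, 2, 2, 3, 3, 3, 4, 12]      # one large value stretches the right tail
left_tail = [1, 9, 10, 10, 10, 11, 11, 12]  # one small value stretches the left tail
symmetric = [2, 3, 4, 5, 6]

for name, data in [("right_tail", right_tail),
                   ("left_tail", left_tail),
                   ("symmetric", symmetric)]:
    print(name, mean(data), median(data), skew_hint(data))
```

In each skewed list the single extreme value pulls the mean toward the tail while the median barely moves, which is why the mean ends up on the tail side of the median.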

2. Box Plots (Box-and-Whisker Plots)

A box plot summarizes data using the five-number summary:

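Min --- Q1 [===Median===] Q3 --- Max
          |<-- IQR -->|

| Part | Meaning |
| --- | --- |
| Box left edge | Q1 (25th percentile) |
| Line inside box | Median (Q2, 50th percentile) |
| Box right edge | Q3 (75th percentile) |
| Box width | IQR = Q3 − Q1 |
| Whiskers | Extend to the smallest and largest values within 1.5×IQR of the box; with no outliers, these are the min and max |
| Dots beyond whiskers | Outliers |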

“Middle 50% of data” → the box (IQR). A longer box = greater spread in the middle half.

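Skew signal from box plot: If the median line sits closer to Q1 (often with a longer right whisker), the data are likely right-skewed; if it sits closer to Q3 (often with a longer left whisker), the data are likely left-skewed.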
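
The five-number summary and the 1.5×IQR outlier rule can be sketched in a few lines. This example assumes the "median of each half" convention for quartiles (the median is left out of both halves when the count is odd); other conventions, including those used by some calculators and libraries, can give slightly different Q1 and Q3. The data set and function names are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]        # values below the median position
    upper = data[(n + 1) // 2:]   # values above the median position
    return min(data), median(lower), median(data), median(upper), max(data)


def outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical data set, invented for illustration.
data = [3, 5, 6, 7, 8, 10, 12, 15, 40]
print(five_number_summary(data))  # (3, 5.5, 8, 13.5, 40)
print(outliers(data))             # [40]
```

Here IQR = 13.5 − 5.5 = 8, so the upper fence is 13.5 + 12 = 25.5 and the value 40 would be drawn as a dot beyond the right whisker. The median (8) is closer to Q1 than to Q3, matching the right-skew signal.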

3. Dot Plots

Each data point is shown as a dot on a number line. You can:

  • Read individual values
  • Count exact frequencies
  • Determine mean, median, and mode directly

Dot plots are best for small data sets. An outlier appears as an isolated dot far from the main cluster.
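
Because a dot plot preserves every data point, its statistics can be computed exactly. The sketch below represents a hypothetical dot plot as a mapping from each value to the number of dots stacked above it, then expands it back into the raw list; the numbers are invented for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical dot plot: value -> number of dots stacked above it.
dot_plot = {1: 2, 2: 4, 3: 3, 4: 1, 9: 1}

# Expand the counts back into the individual data points, in order.
values = [value for value, dots in sorted(dot_plot.items()) for _ in range(dots)]

print(len(values))        # total number of dots
print(mean(values))       # sum of all values divided by the count
print(median(values))     # middle value of the sorted list
print(multimode(values))  # value(s) with the tallest stack of dots
```

The isolated dot at 9 is the outlier: it raises the mean (32/11, about 2.9) but leaves the median at 2.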

4. Two-Way Frequency Tables

Two-way tables cross-tabulate two categorical variables. Three frequency types:

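| Type | Location | Formula |
| --- | --- | --- |
| Joint frequency | Individual cell | Cell value (÷ grand total as a proportion) |
| Marginal frequency | Row or column total | Sum of row or column (÷ grand total as a proportion) |
| Conditional frequency | Cell ÷ marginal | Cell / (row or column total) |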

Example table:

|  | Passed | Failed | Row Total |
| --- | --- | --- | --- |
| Female | 45 | 15 | 60 |
| Male | 30 | 10 | 40 |
| Column Total | 75 | 25 | 100 |

  • P(Passed | Female) = 45/60 = 0.75 (conditional: divide by row total 60)
  • P(Female and Passed) = 45/100 = 0.45 (joint: divide by grand total 100)
  • P(Female) = 60/100 = 0.60 (marginal: divide row total by grand total)
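
The three calculations above differ only in the denominator, which a short sketch makes explicit. The nested-dictionary layout and variable names are one possible representation chosen for illustration; the counts are the ones from the example table.

```python
# Counts from the example table: rows are gender, columns are result.
table = {
    "Female": {"Passed": 45, "Failed": 15},
    "Male": {"Passed": 30, "Failed": 10},
}

# Marginal totals and grand total, computed from the cells.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}
column_totals = {
    col: sum(cells[col] for cells in table.values())
    for col in ["Passed", "Failed"]
}
grand_total = sum(row_totals.values())

joint = table["Female"]["Passed"] / grand_total              # 45/100
marginal = row_totals["Female"] / grand_total                # 60/100
passed_given_female = table["Female"]["Passed"] / row_totals["Female"]     # 45/60
female_given_passed = table["Female"]["Passed"] / column_totals["Passed"]  # 45/75

print(joint, marginal, passed_given_female, female_given_passed)
```

The last two lines use the same cell (45) but different denominators: "of the females, what fraction passed" divides by the row total, while "of those who passed, what fraction were female" divides by the column total.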

5. Comparing Distributions

When comparing two distributions (two histograms, two box plots, two dot plots), address three aspects:

  1. Center: Which has the higher mean or median?
  2. Spread: Which has the larger range, IQR, or standard deviation?
  3. Shape: Are they symmetric or skewed? In the same direction?
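
The center and spread comparisons can be sketched with the standard `statistics` module. The two class score lists and the `describe` helper are hypothetical, and the population standard deviation (`pstdev`) is used here as one reasonable choice.

```python
from statistics import mean, median, pstdev


def describe(values):
    """Summarize center (mean, median) and spread (range, standard deviation)."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "std_dev": pstdev(values),
    }


# Hypothetical test scores for two classes, invented for illustration.
class_a = [70, 72, 74, 76, 78]
class_b = [60, 67, 74, 81, 88]

summary_a = describe(class_a)
summary_b = describe(class_b)
print(summary_a)
print(summary_b)
```

Both classes have a mean and median of 74, so their centers match, but class B has the larger range (28 versus 8) and the larger standard deviation. On the SAT this kind of comparison is usually made by eye from the graphs, with no computation required.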

6. Other Graph Types

| Type | Best used for | SAT usage |
| --- | --- | --- |
| Bar chart | Comparing categories | Frequency comparisons |
| Line graph | Change over time | Trend interpretation |
| Pie chart | Part-to-whole proportions | Percent of total |

Pitfalls and Common Mistakes

Mistake 1: Dividing by the grand total when asked for a conditional frequency. “What fraction of females passed?” → 45/100 = 0.45 (wrong) vs. 45/60 = 0.75 (correct). Fix: For conditional frequencies (“given that…”), divide by the marginal total (row or column), not the grand total.

Mistake 2: Misreading skew direction from a histogram. Students see a long left tail and call the distribution “right-skewed.” Fix: The skew name matches the direction of the tail, not the bulk of the data. Long tail pointing right = right-skewed (positive skew).

Mistake 3: Trying to read exact data values from a histogram. Histograms group values into bins — you cannot identify individual values, only the count per interval. Fix: If an exact value is needed, look for a dot plot or table, not a histogram.

Mistake 4: Confusing the median line in a box plot with the mean. The line inside the box is the median, not the mean. Fix: A box plot shows the median but not the mean; estimating the mean requires additional information about the data.

Mistake 5: Thinking a wider box means higher mean. A wider box (larger IQR) indicates more spread in the middle half, not a higher mean or median. Fix: Use the position of the median line (vertical mark inside the box) to compare centers, and the box width to compare spread.

Quick Reference Card

| Graph Type | Can Read | Cannot Read |
| --- | --- | --- |
| Histogram | Frequency per bin, shape | Individual values |
| Box plot | Median, Q1, Q3, IQR, outliers | Mean, exact distribution |
| Dot plot | Individual values, mode, median |  |

| Two-Way Table Frequency | Formula |
| --- | --- |
| Joint | Cell / Grand total |
| Marginal | Row or column total / Grand total |
| Conditional | Cell / Row total (or column total) |

| Skew | Mean vs Median | Tail Direction |
| --- | --- | --- |
| Right (positive) | Mean > Median | Points right |
| Left (negative) | Mean < Median | Points left |
| Symmetric | Mean ≈ Median | No clear tail |