Overview
Data Distribution and Graphs tests the ability to read, interpret, and compare data presented in histograms, box plots, dot plots, two-way frequency tables, bar charts, line graphs, and pie charts on the digital SAT. Questions ask students to identify distribution shape, compare distributions by center and spread, and compute joint, marginal, and conditional frequencies from two-way tables. These are calculator-active, medium-difficulty questions, and the topic is one of the most visually varied in the Problem-Solving and Data Analysis domain.
Key Points
1. Histograms
A histogram displays frequency data using adjacent bars, where:
- The x-axis shows intervals (bins)
- The y-axis shows frequency (count) for each bin
- Bar width = bin width; bar height = frequency
What you can and cannot read:
- Can read: frequency per bin, approximate shape, rough location of median
- Cannot read: individual data values (unlike dot plots)
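The "rough location of median" can be found by accumulating bar heights until the middle position is reached. The following is a minimal sketch of that idea; the bin labels, counts, and function names are hypothetical and chosen only for illustration.

```python
def bin_of_position(bins, counts, position):
    """Return the bin label that holds the data value at a 1-based position."""
    running = 0
    for label, count in zip(bins, counts):
        running += count  # cumulative frequency up to and including this bin
        if running >= position:
            return label
    raise ValueError("position exceeds the number of data values")


def median_bins(bins, counts):
    """Return the bins holding the lower and upper middle values.

    For an odd total both positions coincide. For an even total the median
    is the average of the two middle values, which may sit in different bins.
    """
    total = sum(counts)
    lower = (total + 1) // 2
    upper = total // 2 + 1
    return (bin_of_position(bins, counts, lower),
            bin_of_position(bins, counts, upper))


# Hypothetical histogram: 20 values spread over four bins.
bins = ["0-9", "10-19", "20-29", "30-39"]
counts = [4, 9, 5, 2]

# Cumulative counts are 4, 13, 18, 20, so the 10th and 11th values
# both fall in the second bin.
result = median_bins(bins, counts)
print(result)  # ('10-19', '10-19')
```

When both labels match, the median lies in that bin. The histogram still cannot say where inside the bin it falls.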
Shape vocabulary:
| Shape | Description |
|---|---|
| Symmetric | Mirror-image halves; mean ≈ median |
| Right-skewed (positive) | Long tail to the right; mean > median |
| Left-skewed (negative) | Long tail to the left; mean < median |
| Uniform | All bars approximately equal height |
| Bimodal | Two peaks |
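The mean-versus-median relationships in the table can be checked on small data sets. This sketch uses made-up values and a hypothetical helper, `skew_hint`; comparing mean and median is a rule of thumb for skew direction, not a formal definition.

```python
from statistics import mean, median


def skew_hint(values):
    """Suggest a skew direction by comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right"      # the long right tail pulls the mean up
    if m < med:
        return "left"       # the long left tail pulls the mean down
    return "symmetric"      # mean equals median


# Hypothetical data sets, each with nine values.
right_tail = [1, 2, 2, 3, 3, 3, 4, 4, 15]       # one large value on the right
left_tail = [1, 12, 12, 13, 13, 13, 14, 14, 15]  # one small value on the left
balanced = [1, 2, 3, 4, 5]

print(skew_hint(right_tail))  # right
print(skew_hint(left_tail))   # left
print(skew_hint(balanced))    # symmetric
```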
2. Box Plots (Box-and-Whisker Plots)
A box plot summarizes data using the five-number summary:
Min --- Q1 [===Median===] Q3 --- Max
|<-- IQR -->|
| Part | Meaning |
|---|---|
| Box left edge | Q1 (25th percentile) |
| Line inside box | Median (Q2, 50th percentile) |
| Box right edge | Q3 (75th percentile) |
| Box width | IQR = Q3 − Q1 |
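| Whiskers | Extend to the smallest and largest values that are not outliers (within 1.5×IQR of the box); with no outliers, these are the min and max |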
| Dots beyond whiskers | Outliers |
“Middle 50% of data” → the box (IQR). A longer box = greater spread in the middle half.
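Skew signal from a box plot: If the median line sits closer to Q1 (and the right whisker is longer), the data is likely right-skewed; if it sits closer to Q3 (and the left whisker is longer), the data is likely left-skewed.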
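The five-number summary and the 1.5×IQR outlier rule can be sketched in a few lines. This example assumes the "median of each half" method for quartiles (the overall median is left out of both halves when the count is odd); other quartile conventions exist and can give slightly different values. The data and function names are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    Assumes at least two data values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


def outlier_fences(q1, q3):
    """Return the (low, high) cutoffs of the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical data set with one unusually large value.
scores = [2, 4, 5, 7, 8, 9, 10, 12, 30]

summary = five_number_summary(scores)
low, high = outlier_fences(summary[1], summary[3])
outliers = [v for v in scores if v < low or v > high]

print(summary)    # (2, 4.5, 8, 11.0, 30)
print(low, high)  # -5.25 20.75
print(outliers)   # [30]
```

Here the IQR is 11 − 4.5 = 6.5, so any value above 20.75 would be drawn as a separate dot, and the right whisker would stop at 12.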
3. Dot Plots
Each data point is shown as a dot on a number line. You can:
- Read individual values
- Count exact frequencies
- Determine mean, median, and mode directly
Dot plots are best for small data sets. An outlier appears as an isolated dot far from the main cluster.
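Because a dot plot shows every value, its statistics can be computed exactly. The sketch below represents a hypothetical dot plot as a mapping from each value to its number of dots, expands it into a list, and applies Python's standard `statistics` functions.

```python
from statistics import mean, median, multimode

# Hypothetical dot plot: value on the number line -> number of dots above it.
dots = {2: 1, 3: 3, 4: 4, 5: 2, 9: 1}


def expand(dot_counts):
    """Turn value -> count pairs into a sorted list of individual values."""
    return [value
            for value, count in sorted(dot_counts.items())
            for _ in range(count)]


values = expand(dots)

print(values)             # [2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 9]
print(len(values))        # 11
print(median(values))     # 4
print(multimode(values))  # [4]
print(round(mean(values), 2))  # 4.18
```

The isolated dot at 9 is the kind of value that would be described as an outlier; it lifts the mean (about 4.18) slightly above the median (4).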
4. Two-Way Frequency Tables
Two-way tables cross-tabulate two categorical variables. Three frequency types:
| Type | Location | Formula |
|---|---|---|
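| Joint frequency | Individual cell | Cell value (as a proportion: cell / grand total) |
| Marginal frequency | Row or column total | Sum of row or column (as a proportion: total / grand total) |
| Conditional frequency | Cell relative to its row or column total | Cell / (row or column total) |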
Example table:
| | Passed | Failed | Row Total |
|---|---|---|---|
| Female | 45 | 15 | 60 |
| Male | 30 | 10 | 40 |
| Column Total | 75 | 25 | 100 |
- P(Passed | Female) = 45/60 = 0.75 (conditional — divide by row total 60)
- P(Female and Passed) = 45/100 = 0.45 (joint — divide by grand total 100)
- P(Female) = 60/100 = 0.60 (marginal — divide row total by grand total)
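The three calculations above differ only in the denominator. The sketch below stores the example table as a nested dictionary and computes each frequency type; the function names are hypothetical and exist only for this illustration.

```python
# The example table: row category -> column category -> count.
table = {
    "Female": {"Passed": 45, "Failed": 15},
    "Male": {"Passed": 30, "Failed": 10},
}


def row_total(table, row):
    return sum(table[row].values())


def column_total(table, column):
    return sum(cells[column] for cells in table.values())


def grand_total(table):
    return sum(row_total(table, row) for row in table)


def joint(table, row, column):
    """Cell divided by the grand total."""
    return table[row][column] / grand_total(table)


def marginal_row(table, row):
    """Row total divided by the grand total."""
    return row_total(table, row) / grand_total(table)


def conditional_given_row(table, row, column):
    """Cell divided by its row total: P(column | row)."""
    return table[row][column] / row_total(table, row)


def conditional_given_column(table, row, column):
    """Cell divided by its column total: P(row | column)."""
    return table[row][column] / column_total(table, column)


print(conditional_given_row(table, "Female", "Passed"))     # 0.75
print(joint(table, "Female", "Passed"))                     # 0.45
print(marginal_row(table, "Female"))                        # 0.6
print(conditional_given_column(table, "Female", "Passed"))  # 0.6
```

The last line answers a different question, "what fraction of those who passed are female?", which divides by the column total 75 rather than the row total 60.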
5. Comparing Distributions
When comparing two distributions (two histograms, two box plots, two dot plots), address three aspects:
- Center: Which has the higher mean or median?
- Spread: Which has the larger range, IQR, or standard deviation?
- Shape: Are they symmetric or skewed? In the same direction?
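A comparison of center and spread can be made concrete with two small data sets. This sketch uses invented scores for two hypothetical classes and reports the median, range, and population standard deviation for each; it illustrates the checklist above and is not a prescribed method.

```python
from statistics import median, pstdev


def describe(values):
    """Summarize center (median) and spread (range, standard deviation)."""
    data = sorted(values)
    return {
        "median": median(data),
        "range": data[-1] - data[0],
        "std_dev": pstdev(data),  # population standard deviation
    }


# Hypothetical scores: same center, very different spread.
class_a = [70, 72, 75, 75, 78, 80]
class_b = [55, 65, 75, 75, 85, 95]

a, b = describe(class_a), describe(class_b)

print(a["median"], b["median"])  # 75.0 75.0
print(a["range"], b["range"])    # 10 40
print(a["std_dev"] < b["std_dev"])  # True
```

Both classes share a median of 75, so the centers match, while class B has the larger range and standard deviation, so its scores are more spread out.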
6. Other Graph Types
| Type | Best used for | SAT usage |
|---|---|---|
| Bar chart | Comparing categories | Frequency comparisons |
| Line graph | Change over time | Trend interpretation |
| Pie chart | Part-to-whole proportions | Percent of total |
Pitfalls and Common Mistakes
Mistake 1: Dividing by the grand total when asked for a conditional frequency. “What fraction of females passed?” → 45/100 = 0.45 (wrong) vs. 45/60 = 0.75 (correct). Fix: For conditional frequencies (“given that…”), divide by the marginal total (row or column), not the grand total.
Mistake 2: Misreading skew direction from a histogram. Students see a long left tail and call the distribution “right-skewed.” Fix: The skew name matches the direction of the tail, not the bulk of the data. Long tail pointing right = right-skewed (positive skew).
Mistake 3: Trying to read exact data values from a histogram. Histograms group values into bins — you cannot identify individual values, only the count per interval. Fix: If an exact value is needed, look for a dot plot or table, not a histogram.
Mistake 4: Confusing the median line in a box plot with the mean. The line inside the box is the median, not the mean. Fix: Read the line as the median only. A box plot does not show the mean, so estimating the mean requires additional information.
Mistake 5: Thinking a wider box means higher mean. A wider box (larger IQR) indicates more spread in the middle half, not a higher mean or median. Fix: Use the position of the median line (vertical mark inside the box) to compare centers, and the box width to compare spread.
Quick Reference Card
| Graph Type | Can Read | Cannot Read |
|---|---|---|
| Histogram | Frequency per bin, shape | Individual values |
| Box plot | Median, Q1, Q3, IQR, outliers | Mean, exact distribution |
| Dot plot | Individual values, mean, median, mode | Nothing hidden (every value is shown) |

| Two-Way Table Frequency | Formula |
|---|---|
| Joint | Cell / Grand total |
| Marginal | Row or column total / Grand total |
| Conditional | Cell / Row total (or column total) |

| Skew | Mean vs Median | Tail Direction |
|---|---|---|
| Right (positive) | Mean > Median | Points right |
| Left (negative) | Mean < Median | Points left |
| Symmetric | Mean ≈ Median | No clear tail |