Overview
Sampling, Surveys, and Statistical Inference is a medium-to-hard conceptual topic in the Problem-Solving and Data Analysis domain on the digital SAT. Questions test the ability to evaluate whether a sample is representative of a population, identify sources of bias in data collection, interpret margin of error and confidence intervals, and distinguish between what can and cannot be concluded from observational studies versus experiments. These are calculator-active questions, but they primarily require logical reasoning rather than computation.
Key Points
1. Random Sampling Types
| Type | Description | SAT relevance |
|---|---|---|
| Simple random sample | Every member of the population has an equal chance of selection | Gold standard for generalization |
| Stratified random | Population divided into subgroups (strata); random sample taken from each | Ensures representation of subgroups |
| Cluster | Population divided into clusters; entire clusters selected randomly | Efficient for geographically dispersed populations |
| Systematic | Every nth member selected from a list | Practical; can generalize if list has no pattern |
Key principle: A random sample is required to generalize findings to the broader population. A non-random (biased) sample cannot support general conclusions.
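The four sampling types in the table can be sketched in a few lines of Python. This is an illustrative sketch only: the student roster, grade strata, and homeroom clusters are hypothetical, and the function names are invented for this example.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has an equal chance of selection."""
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a separate random sample from each subgroup (stratum)."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def systematic_sample(ordered_list, step, rng):
    """Pick a random starting point, then take every `step`-th member."""
    start = rng.randrange(step)
    return ordered_list[start::step]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical roster of 100 students, split two different ways.
students = [f"student_{i}" for i in range(100)]
grades = {"grade_9": students[:50], "grade_10": students[50:]}
homerooms = {f"room_{r}": students[r * 10:(r + 1) * 10] for r in range(10)}

srs = simple_random_sample(students, 10, rng)
strat = stratified_sample(grades, 5, rng)
clus = cluster_sample(homerooms, 2, rng)
syst = systematic_sample(students, 10, rng)
```

Each method relies on a random step somewhere (which members, which clusters, or which starting point), and that random step is what supports generalizing to the full roster.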
2. Sources of Bias
| Bias Type | Description | Example |
|---|---|---|
| Voluntary response | People choose to respond; tend to have extreme opinions | Online polls, “call in” surveys |
| Convenience | Sample taken from whoever is easily accessible | Surveying only students in one classroom |
| Leading question | Question wording steers respondents toward a particular answer | “Don’t you agree that…?” |
| Undercoverage | Some population segments are systematically excluded | Phone surveys missing those without phones |
| Nonresponse | Selected individuals do not respond; pattern differs from respondents | Low survey return rates |
A biased sample produces estimates that systematically miss the true population value. Collecting more responses in the same biased way does not remove the bias.
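A small simulation can make voluntary response bias concrete. The sketch below assumes a made-up population of 10,000 people with 40% support, and assumes that opponents answer an open poll far more often than supporters; both the population and the response rates are hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 4,000 supporters (True) and 6,000 opponents (False).
population = [True] * 4000 + [False] * 6000
true_support = sum(population) / len(population)

# Simple random sample of 500 people.
random_sample = rng.sample(population, 500)
random_estimate = sum(random_sample) / len(random_sample)

# Voluntary response: each person decides whether to reply.
# Assumed response rates: opponents reply much more often than supporters.
RESPONSE_RATE = {True: 0.05, False: 0.30}
volunteers = [person for person in population
              if rng.random() < RESPONSE_RATE[person]]
voluntary_estimate = sum(volunteers) / len(volunteers)

print(f"true support:       {true_support:.2f}")
print(f"random sample:      {random_estimate:.2f} (n = {len(random_sample)})")
print(f"voluntary response: {voluntary_estimate:.2f} (n = {len(volunteers)})")
```

Under these assumptions the voluntary poll collects more responses than the random sample yet lands much farther from the true value, because the people who chose to respond differ from the population.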
3. Margin of Error and Confidence Intervals
A confidence interval gives a range within which the true population parameter likely falls:
Confidence interval = p̂ ± MOE
Where MOE = margin of error and p̂ = sample estimate.
Example: “42% ± 3%” → the interval is [39%, 45%].
The SAT always uses a 95% confidence level. This means: if the same survey were repeated many times, approximately 95% of the resulting intervals would contain the true parameter.
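The interval arithmetic above can be sketched as follows. The function names are hypothetical, and values are written in percentage points to match the “42% ± 3%” example.

```python
def confidence_interval(estimate, margin_of_error):
    """Return (low, high) for estimate ± margin of error."""
    return (estimate - margin_of_error, estimate + margin_of_error)

def is_plausible(value, interval):
    """Check whether a claimed population value falls inside the interval."""
    low, high = interval
    return low <= value <= high

interval = confidence_interval(42, 3)
print(interval)                    # (39, 45)
print(is_plausible(44, interval))  # True
print(is_plausible(50, interval))  # False
```

A typical question asks which claimed population value is plausible given the survey result; the check is whether that value lies between the two endpoints.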
Effect of sample size on MOE:
Larger sample → smaller MOE → more precise estimate. Smaller variability + larger sample → most precise estimate.
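To see why a larger sample shrinks the margin of error, the sketch below uses a common textbook approximation for a sample proportion, MOE ≈ 1.96 · √(p̂(1 − p̂)/n). This formula is an assumption brought in for illustration; it does not appear in this entry, and the relationships above do not depend on computing it.

```python
import math

def approx_margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Assumes the textbook normal approximation; z = 1.96 corresponds
    to a 95% confidence level.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Same sample proportion, increasing sample size.
for n in (100, 400, 1600):
    print(n, round(approx_margin_of_error(0.5, n), 4))
# Roughly 0.098, 0.049, 0.0245: quadrupling n halves the MOE.

# Less variability (proportion far from 0.5) also gives a smaller MOE.
print(round(approx_margin_of_error(0.1, 400), 4))
```

Under this approximation the margin of error shrinks with the square root of the sample size, so halving the MOE takes about four times as many respondents.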
4. Observational Studies vs. Experiments
| Feature | Observational Study | Experiment |
|---|---|---|
| Researcher role | Observes without intervening | Assigns subjects to conditions |
| Random assignment | No | Yes (in a well-designed experiment) |
| Can establish causation | No — association only | Yes — if random assignment used |
| Can generalize to population | Only if sample was random | Only if sample was random |
Confounding (lurking) variable: A third variable correlated with both the explanatory and response variables, potentially explaining the observed association without a causal relationship.
Control group: Receives no treatment (or a placebo). Comparison to the treatment group isolates the effect of the intervention.
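Random assignment can be sketched as shuffling the participants and splitting the shuffled list. The participant labels and the function name are hypothetical; this is a minimal illustration rather than a full experimental design.

```python
import random

def randomly_assign(subjects, rng):
    """Shuffle a copy of the subjects, then split into two groups."""
    shuffled = subjects[:]      # copy so the original list is unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(2)  # fixed seed so the sketch is repeatable
participants = [f"participant_{i}" for i in range(20)]
treatment_group, control_group = randomly_assign(participants, rng)
```

Because chance alone decides who lands in each group, potential confounding variables tend to balance out between the groups, which is why a difference in outcomes can be attributed to the treatment.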
5. Valid and Invalid Conclusions — SAT Decision Tree
Was the sample randomly selected?
├── YES → Can generalize to the population
└── NO → Cannot generalize; results apply only to the sample
Was there random assignment to groups?
├── YES (experiment) → Can claim causation
└── NO (observational) → Association only; cannot claim causation
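The two questions in the tree are independent of each other, which a short sketch can make explicit. The function name and the returned labels are hypothetical.

```python
def valid_conclusions(random_sample, random_assignment):
    """Apply the two independent questions from the decision tree."""
    scope = "population" if random_sample else "sample only"
    claim = "causation" if random_assignment else "association only"
    return {"scope": scope, "claim": claim}

# Random survey with no assigned treatment: generalize, but no causation.
print(valid_conclusions(random_sample=True, random_assignment=False))

# Volunteers randomly assigned to groups: causation, but only for the sample.
print(valid_conclusions(random_sample=False, random_assignment=True))
```

Answering one question never settles the other: random selection governs who the result applies to, and random assignment governs whether a causal claim is supported.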
6. Common SAT Inference Scenarios
| Scenario | Valid conclusion | Invalid conclusion |
|---|---|---|
| Random survey of 500 city residents | Estimate city-wide preferences | Estimate national preferences |
| Study with random assignment, treatment shows improvement | Treatment causes improvement among the participants | Treatment works for a broader population, if the sample was non-random |
| Convenience sample shows trend | Trend exists in the sample | Trend applies to the population |
| Two variables are correlated in an observational study | Association exists | Causation |
Pitfalls and Common Mistakes
Mistake 1: Concluding causation from an observational study. A study shows that people who eat breakfast have higher GPAs — students conclude breakfast causes higher GPA. Fix: Observational studies cannot establish causation. Look for confounding variables (e.g., students who eat breakfast may also have more structured home environments).
Mistake 2: Generalizing from a biased or non-random sample. A survey of website users is used to draw conclusions about “all adults.” Fix: Only random samples of the target population support generalization to that population.
Mistake 3: Misinterpreting the confidence interval. Students read “95% confidence” as “there is a 95% chance that the true value lies in this particular interval.” Fix: The confidence level describes the procedure, not one interval: across many repeated samples, about 95% of the intervals built this way contain the true parameter. The true parameter is fixed; the interval varies across samples.
Mistake 4: Thinking a larger margin of error is better. A larger MOE means less precision. Fix: Smaller MOE = better. To reduce MOE, increase sample size.
Mistake 5: Confusing the scope of a conclusion. A study on students at one school cannot be used to make inferences about all students nationally. Fix: Conclusions are only valid for the population from which the random sample was drawn, not a broader group.
Related Entries
- Probability
- Statistical_Measures
- Data_Distribution_Graphs
- Scatterplots_Regression
- Ratios_Proportions
Quick Reference Card
| Concept | Rule |
|---|---|
| Random sample | Required to generalize to the population |
| Random assignment | Required to claim causation |
| Observational study | Association only — never causation |
| Experiment | Can establish causation if random assignment used |
| MOE interpretation | Plausible values for the true parameter are estimate ± MOE (at 95% confidence) |
| Larger n | Smaller MOE — more precise |
| Voluntary response bias | Overrepresents strong opinions |
| Convenience bias | Not representative; easy-to-access sample |
| Confounding variable | Third variable explaining apparent association |
| SAT confidence level | Always 95% |