Two formulas. One divides by n, the other by n−1. Most students memorize the difference for the exam but never understand why it exists. Here's the real explanation — and when to use each formula.
The Two Formulas Side by Side
Population:
σ = √[Σ(xᵢ − μ)² / n]
Sample:
s = √[Σ(xᵢ − x̄)² / (n − 1)]
Everything is identical except the denominator. Population divides by n. Sample divides by n−1.
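As a minimal sketch, both formulas can be written as one Python function whose only branch is the denominator. The function name `std_dev`, the `sample` flag, and the data values are illustrative choices, not part of any library.

```python
import math

def std_dev(values, sample=False):
    """Standard deviation of `values`.

    sample=False divides by n (population formula, sigma).
    sample=True divides by n - 1 (sample formula, s).
    `std_dev` and `sample` are illustrative names, not a library API.
    """
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("need at least 1 value (population) or 2 values (sample)")
    mean = sum(values) / n
    # Sum of squared deviations from the mean: identical in both formulas.
    ss = sum((x - mean) ** 2 for x in values)
    denominator = n - 1 if sample else n
    return math.sqrt(ss / denominator)

data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up values for illustration
print(std_dev(data))               # population: divides by 8
print(std_dev(data, sample=True))  # sample: divides by 7
```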
When to Use Each
| Situation | Formula | Divide By |
|-----------|---------|----------|
| All 30 students in your class | Population (σ) | 30 |
| 30 students surveyed from a school of 500 | Sample (s) | 29 |
| Every product from a batch | Population (σ) | n |
| 50 products tested from a batch of 1000 | Sample (s) | 49 |
| Census data (entire country) | Population (σ) | n |
| Survey of 1000 people | Sample (s) | 999 |
Simple rule: If your data IS the whole group → population. If your data REPRESENTS a larger group → sample.
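Python's standard library mirrors this split: `statistics.pstdev` divides by n and `statistics.stdev` divides by n−1. A short sketch, with made-up scores standing in for the "whole class" and "surveyed from a school" cases:

```python
import statistics

# Hypothetical test scores, used only for illustration.
scores = [72, 85, 90, 66, 78, 94, 81, 69]

# Case 1: these 8 students ARE the whole group -> population formula.
sigma = statistics.pstdev(scores)   # divides by n = 8

# Case 2: these 8 students REPRESENT a larger school -> sample formula.
s = statistics.stdev(scores)        # divides by n - 1 = 7

print(f"population sigma = {sigma:.2f}")
print(f"sample s         = {s:.2f}")
```

The data is the same in both cases; only the question being asked about it changes which function applies.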
Why n−1? The Intuition
When you calculate variance from a sample, you use the sample mean (x̄), not the population mean (μ). The sample mean sits at the center of the sample values because it is calculated from them: no other number, the population mean included, gives a smaller sum of squared deviations.
This means the sum of squared deviations around the sample mean, Σ(xᵢ − x̄)², is never larger than the sum around the true mean, Σ(xᵢ − μ)², and is usually smaller. The result: dividing that sum by n underestimates the population variance on average.
Dividing by n−1 instead of n inflates the variance just enough to compensate: averaged over many random samples, s² matches the population variance.
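One way to check this without probability theory is to enumerate every possible sample. The sketch below assumes a tiny hypothetical population (the six faces of a die) and lists all 216 equally likely samples of size 3 drawn with replacement. Averaged over all of them, the n-divided variance falls short of the true variance, while the (n−1)-divided version matches it.

```python
import itertools
import statistics

population = [1, 2, 3, 4, 5, 6]   # hypothetical population: faces of a die
n = 3                             # sample size

true_var = statistics.pvariance(population)   # 35/12, about 2.917

# Every ordered sample of size n drawn with replacement (6**3 = 216 of them).
samples = list(itertools.product(population, repeat=n))

# Average each estimator over all equally likely samples.
avg_div_n = statistics.fmean(statistics.pvariance(s) for s in samples)
avg_div_n_minus_1 = statistics.fmean(statistics.variance(s) for s in samples)

print(f"true variance:           {true_var:.4f}")
print(f"average dividing by n:   {avg_div_n:.4f}")
print(f"average dividing by n-1: {avg_div_n_minus_1:.4f}")
```

The exact match relies on sampling with replacement, so that each draw is independent; the population and sample size here are arbitrary choices for the demonstration.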
Visual Analogy
Imagine you're estimating the size of a room by measuring 5 points. Your 5 points will cluster around their own center (the sample mean), not the room's true center. The spread you measure around that center tends to be a bit too small. Dividing by n−1, known as Bessel's correction, fixes this.
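The same effect shows up in numbers. The sketch below assumes a made-up true center `mu` and five made-up measurements. It checks that the squared deviations around the sample's own mean are no larger in total than those around `mu`, and that the gap equals n(x̄ − μ)².

```python
mu = 10.0                                 # assumed true center (hypothetical)
points = [11.2, 12.5, 10.8, 13.1, 11.9]   # five made-up measurements

n = len(points)
x_bar = sum(points) / n                   # the sample's own center

ss_about_sample_mean = sum((x - x_bar) ** 2 for x in points)
ss_about_true_mean = sum((x - mu) ** 2 for x in points)

# Identity: sum (x - mu)^2 = sum (x - x_bar)^2 + n * (x_bar - mu)^2
gap = n * (x_bar - mu) ** 2

print(f"around sample mean: {ss_about_sample_mean:.3f}")
print(f"around true mean:   {ss_about_true_mean:.3f}")
print(f"gap n*(x_bar-mu)^2: {gap:.3f}")
```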
Worked Example: Comparing Both
Data: 4, 8, 6, 5, 3, 2, 8, 9, 2, 5
Mean = 5.2, Sum of squared deviations = 57.6
| Formula | Division | Variance | Std Dev |
|---------|----------|----------|---------|
| Population | 57.6 / 10 | 5.76 | 2.40 |
| Sample | 57.6 / 9 | 6.40 | 2.53 |
Difference: 5.4% in standard deviation (11% in variance). With n = 10, the correction is modest but noticeable.
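The figures above can be reproduced with Python's `statistics` module. This is a sketch for checking the arithmetic, using the same ten data values:

```python
import statistics

data = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]

mean = statistics.fmean(data)
ss = sum((x - mean) ** 2 for x in data)   # sum of squared deviations

pop_var = statistics.pvariance(data)      # ss / 10
samp_var = statistics.variance(data)      # ss / 9
pop_sd = statistics.pstdev(data)
samp_sd = statistics.stdev(data)

print(f"mean = {mean:.1f}, sum of squares = {ss:.1f}")
print(f"population: variance = {pop_var:.2f}, std dev = {pop_sd:.2f}")
print(f"sample:     variance = {samp_var:.2f}, std dev = {samp_sd:.2f}")
print(f"std dev difference = {(samp_sd / pop_sd - 1) * 100:.1f}%")
```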
How Sample Size Affects the Difference
| n | n−1 | Difference in variance (%) |
|---|-----|---------------|
| 3 | 2 | 50% |
| 5 | 4 | 25% |
| 10 | 9 | 11% |
| 30 | 29 | 3.4% |
| 100 | 99 | 1.0% |
| 1000 | 999 | 0.1% |
The smaller the sample, the more Bessel's correction matters. With n = 1000, it's negligible. With n = 3, it's enormous.
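The percentages in the table equal 1/(n−1), the relative gap between the two variances. The gap between the two standard deviations is √(n/(n−1)) − 1, roughly half as large. A small sketch that computes both, where `correction_percent` is an illustrative helper name:

```python
import math

def correction_percent(n):
    """Return (variance %, std dev %) by which dividing by n - 1 exceeds dividing by n.

    Illustrative helper, not a library function.
    """
    ratio = n / (n - 1)
    return (ratio - 1) * 100, (math.sqrt(ratio) - 1) * 100

for n in (3, 5, 10, 30, 100, 1000):
    var_diff, sd_diff = correction_percent(n)
    print(f"n={n:>4}  variance +{var_diff:.1f}%  std dev +{sd_diff:.1f}%")
```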
Common Mistakes
| Mistake | Example | Fix |
|---------|---------|-----|
| Using population formula for sample data | Survey of 50, divide by 50 | Divide by 49 |
| Using sample formula for population data | All 30 students, divide by 29 | Divide by 30 |
| Confusing which mean to use | Using μ when you have x̄ | Sample uses x̄, population uses μ |
| Forgetting to square root | Reporting variance as std dev | σ = √variance, not variance itself |
What If You Use the Wrong Formula?
- Using population formula on sample data → underestimates σ → you think data is less spread than it actually is → false precision
- Using sample formula on population data → overestimates σ → you think data is more spread than it actually is → false uncertainty
For large n (>30), the two standard deviations differ by less than 2%, so either formula gives a similar result. For small samples, choosing correctly matters.
The Trench Truth: In most real-world statistics, you're working with samples, not populations. The default should be the sample formula (n−1). Only use the population formula when you genuinely have every member of the group — which is rare outside of controlled settings like "all students in this one class."
Calculate both population and sample standard deviation with our standard deviation calculator.