Quick Answer
TL;DR
- Combined mean is the weighted average of group means, using each group size as the weight.
- Combined standard deviation is the spread of all observations after the groups are treated as one dataset.
- Within-group variation comes from each group's own standard deviation.
- Between-group variation comes from group means sitting above or below the combined mean.
- Use the sample formula when group SDs are sample SDs; use the population formula only for complete populations.
A student or analyst usually needs this formula after receiving only summaries: group size, group mean, and group standard deviation. The raw rows may be in separate lab sheets, classrooms, production lots, or survey waves. The practical objective is to recover the standard deviation of the combined dataset without making the common error of averaging the SDs.
This guide focuses on combining group summaries. If your problem is about adding independent random variables, use Combining Standard Deviations. If your problem is a two-sample t-test that assumes equal variances, use Pooled Standard Deviation or the pooled standard deviation calculator.
When This Formula Applies
The combined standard deviation formula applies when you know, for each group i, the size n_i, the mean xbar_i, and the sample SD s_i, and you want the sample standard deviation of the union of all observations.
| Situation | Use this article? | Reason |
|---|---|---|
| Three classes report n, mean, and sample SD; you need the SD of all students together | Yes | You are combining group summaries into one dataset. |
| Two labs report instrument error SDs; you need total error SD | No | That is propagation of independent variation; use Combining Standard Deviations. |
| Two treatment groups need one equal-variance estimate for a t-test | Usually no | That is pooled SD; it estimates shared within-group variability and ignores between-group mean differences. |
| You have all raw observations in one column | No | Use the sample standard deviation calculator directly. |
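Do not average standard deviations
The average of the group SDs ignores both the group sizes and the spread between group means, so it does not recover the SD of the combined dataset.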
Combined Mean Formula
The combined mean is a weighted mean. Each group mean contributes in proportion to the number of observations in that group.
Combined mean

$$\bar{x} = \frac{\sum_i n_i \bar{x}_i}{N}, \qquad N = \sum_i n_i$$
This step is required before calculating the combined standard deviation because the between-group term measures how far each group mean is from the combined mean.
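As a minimal sketch of the weighted mean described above (the function name `combined_mean` and the example numbers are illustrative assumptions, not part of the original article):

```python
def combined_mean(sizes, means):
    """Weighted mean of group means, using group sizes as weights."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    total_n = sum(sizes)
    if total_n <= 0:
        raise ValueError("total group size must be positive")
    return sum(n * m for n, m in zip(sizes, means)) / total_n


# Hypothetical summaries: a group of 3 and a group of 300.
weighted = combined_mean([3, 300], [10.0, 20.0])
unweighted = (10.0 + 20.0) / 2  # treats both groups as equally large
print(weighted, unweighted)
```

The weighted result sits close to the mean of the larger group, while the unweighted average lets the group of 3 pull the answer halfway toward its own mean.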
Combined Standard Deviation Formula
For sample standard deviations, combine the sums of squares, not the SDs. The formula has two parts: within-group variation and between-group variation.
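Combined sample variance from group summaries

$$s^2 = \frac{\sum_i (n_i - 1)\, s_i^2 + \sum_i n_i (\bar{x}_i - \bar{x})^2}{N - 1}$$

Combined sample standard deviation

$$s = \sqrt{s^2}$$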
Here N = sum(n_i). The first sum rebuilds the within-group sum of squares from each sample SD. The second sum adds the extra spread caused by different group means.
| Term | What it measures | Why it matters |
|---|---|---|
| sum((n_i - 1) * s_i^2) | Within-group sum of squares | Reconstructs the spread inside each group. |
| sum(n_i * (xbar_i - xbar)^2) | Between-group sum of squares | Adds spread created by group centers being different. |
| N - 1 | Total sample degrees of freedom | Matches the ordinary sample variance denominator for all observations together. |
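The three terms in the table can be assembled in a few lines. The sketch below assumes the reported SDs are sample SDs; the function name `combined_sample_sd` and the example groups are hypothetical.

```python
import math


def combined_sample_sd(sizes, means, sds):
    """Sample SD of the pooled observations, from per-group n, mean, and sample SD."""
    total_n = sum(sizes)
    if total_n < 2:
        raise ValueError("need at least two observations in total")
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / total_n
    # Within-group sum of squares, rebuilt from each sample SD.
    within_ss = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
    # Between-group sum of squares, from each mean's distance to the combined mean.
    between_ss = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    return math.sqrt((within_ss + between_ss) / (total_n - 1))


# Hypothetical summaries of the raw groups [2, 4, 6] and [10, 14].
result = combined_sample_sd([3, 2], [4.0, 12.0], [2.0, math.sqrt(8)])
print(result)
```

For these two groups the within-group sum of squares is 16 and the between-group sum of squares is 76.8, so the result equals the sample SD of the five raw values taken together.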
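For a full population, replace the sample SDs with population SDs sigma_i, weight each group variance by n_i instead of n_i - 1, and divide by N instead of N - 1:
Combined population variance

$$\sigma^2 = \frac{\sum_i n_i \sigma_i^2 + \sum_i n_i (\mu_i - \mu)^2}{N}$$

Here mu_i is the mean of group i and mu is the combined mean.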
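A population version might look like the sketch below. It assumes every reported SD was computed with an n denominator, so each group's sum of squares is n_i times its variance; the function name `combined_population_sd` and the example groups are hypothetical.

```python
import math


def combined_population_sd(sizes, means, pop_sds):
    """Population SD of the pooled observations, from per-group n, mean, and population SD."""
    total_n = sum(sizes)
    if total_n < 1:
        raise ValueError("need at least one observation")
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / total_n
    # A population SD uses an n denominator, so the group sum of squares is n * sigma^2.
    within_ss = sum(n * s ** 2 for n, s in zip(sizes, pop_sds))
    between_ss = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    return math.sqrt((within_ss + between_ss) / total_n)


# Hypothetical complete populations [2, 4, 6] and [10, 14].
pop_result = combined_population_sd([3, 2], [4.0, 12.0], [math.sqrt(8 / 3), 2.0])
print(pop_result)
```

Only the within-group weights and the final denominator differ from the sample version; the between-group term is unchanged.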
Worked Example
For this article, we verified the grouped formula against a raw-row check using three small inspection batches. The summary version below is what an analyst would have if the original batch sheets were already archived.
| Batch | Raw readings used for verification | n | Mean | Sample SD |
|---|---|---|---|---|
| A | 9.8, 10.1, 10.0, 10.2, 9.9 | 5 | 10.00 | 0.1581 |
| B | 10.4, 10.5, 10.6, 10.7 | 4 | 10.55 | 0.1291 |
| C | 9.6, 9.7, 9.8 | 3 | 9.70 | 0.1000 |
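Find the combined mean
N = 5 + 4 + 3 = 12, and xbar = (5 * 10.00 + 4 * 10.55 + 3 * 9.70) / 12 = 121.3 / 12 ≈ 10.1083.
Compute within-group sum of squares
(5 - 1) * 0.1581^2 + (4 - 1) * 0.1291^2 + (3 - 1) * 0.1000^2 ≈ 0.1000 + 0.0500 + 0.0200 = 0.1700.
Compute between-group sum of squares
5 * (10.00 - xbar)^2 + 4 * (10.55 - xbar)^2 + 3 * (9.70 - xbar)^2 ≈ 0.0587 + 0.7803 + 0.5002 = 1.3392, using the unrounded combined mean 121.3 / 12.
Divide by total degrees of freedom
s^2 = (0.1700 + 1.3392) / (12 - 1) = 1.5092 / 11 ≈ 0.1372.
Take the square root
s = sqrt(0.1372) ≈ 0.3704.
Why the answer is larger than the group SDs
The group SDs range from 0.1000 to 0.1581, but the batch means differ by as much as 0.85. The between-group sum of squares (1.3392) is therefore far larger than the within-group sum of squares (0.1700), and most of the combined spread comes from the batches being centered at different values.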
To audit the arithmetic, paste the 12 raw readings into the sample standard deviation calculator, or verify the squared-spread pieces with the variance calculator and the mean calculator.
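The same audit can be scripted. The sketch below recomputes each batch summary from the raw readings in the table, applies the grouped formula, and compares the result with the sample SD of all 12 readings. It uses only Python's standard `statistics` module; the variable names are illustrative.

```python
import math
import statistics

batches = {
    "A": [9.8, 10.1, 10.0, 10.2, 9.9],
    "B": [10.4, 10.5, 10.6, 10.7],
    "C": [9.6, 9.7, 9.8],
}

# Per-batch summaries: (n, mean, sample SD).
summaries = [
    (len(rows), statistics.mean(rows), statistics.stdev(rows))
    for rows in batches.values()
]

total_n = sum(n for n, _, _ in summaries)
grand_mean = sum(n * m for n, m, _ in summaries) / total_n
within_ss = sum((n - 1) * s ** 2 for n, _, s in summaries)
between_ss = sum(n * (m - grand_mean) ** 2 for n, m, _ in summaries)
from_summaries = math.sqrt((within_ss + between_ss) / (total_n - 1))

# Direct check: sample SD of every reading in one list.
all_rows = [x for rows in batches.values() for x in rows]
from_raw = statistics.stdev(all_rows)

print(round(from_summaries, 4), round(from_raw, 4))  # prints 0.3704 0.3704
```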
Decision Checklist
- Use this formula when each group summarizes the same measurement unit.
- Use group sizes as weights; do not give a group of 3 the same influence as a group of 300.
- Confirm whether each reported SD is sample SD or population SD before choosing the denominator.
- Include the between-group term when your goal is the SD of all observations combined.
- Exclude the between-group term only when estimating a shared within-group SD for a pooled t-test.
Common Mistakes
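Mistake: averaging SDs
The average of the group SDs ignores group sizes and between-group variation, so it is not the SD of the combined dataset.
Mistake: using pooled SD
Pooled SD estimates a shared within-group SD and leaves out differences between group means, so it is usually smaller than the combined SD when the means differ.
Mistake: mixing units
Every group must summarize the same measurement in the same unit before the summaries are combined.
Mistake: hiding imbalance
Weight each group by its size; a group of 3 should not get the same influence as a group of 300.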
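To see how far the first two mistakes can drift from the combined SD, the sketch below applies all three calculations to the batch summaries from the worked example. The pooled SD here uses the usual equal-variance form, the within-group sum of squares divided by N - k for k groups; that form is assumed rather than taken from this article.

```python
import math

# Batch summaries from the worked example: n, mean, sample SD.
sizes = [5, 4, 3]
means = [10.00, 10.55, 9.70]
sds = [0.1581, 0.1291, 0.1000]

total_n = sum(sizes)
grand_mean = sum(n * m for n, m in zip(sizes, means)) / total_n
within_ss = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
between_ss = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

# Mistake 1: plain average of the SDs (no weights, no between-group term).
average_sd = sum(sds) / len(sds)
# Mistake 2: pooled SD (within-group only, N - k degrees of freedom).
pooled_sd = math.sqrt(within_ss / (total_n - len(sizes)))
# Combined SD of all observations treated as one dataset.
combined_sd = math.sqrt((within_ss + between_ss) / (total_n - 1))

print(round(average_sd, 4), round(pooled_sd, 4), round(combined_sd, 4))
```

With these summaries the averaged and pooled values come out near 0.13 and 0.14, while the combined SD is about 0.37, because only the combined SD counts the spread between batch means.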
FAQ
- Can I combine standard deviations without group means? Not if you need the SD of the full combined dataset. You need group means to calculate between-group variation.
- Is combined SD the same as pooled SD? No. Combined SD includes differences between group means. Pooled SD estimates a shared within-group SD and is usually smaller when group means differ.
- What if the groups overlap? Do not use this formula for overlapping groups. The observations would be double-counted, so N and the sums of squares would be wrong.
- What if I only have variances? Use variances directly in the formula by replacing s_i^2 with the reported variance. Do not square a value that is already a variance.
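For the variance-only case in the last answer, a sketch might take sample variances as input and use them where s_i^2 would appear. The function name `combined_sd_from_variances` and the example groups are hypothetical.

```python
import math


def combined_sd_from_variances(sizes, means, variances):
    """Combined sample SD when each group reports a sample variance instead of an SD."""
    total_n = sum(sizes)
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / total_n
    # A reported variance already equals s_i^2, so it is used without squaring.
    within_ss = sum((n - 1) * v for n, v in zip(sizes, variances))
    between_ss = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    return math.sqrt((within_ss + between_ss) / (total_n - 1))


# Hypothetical summaries of the raw groups [1, 2] and [3, 4]; each sample variance is 0.5.
correct = combined_sd_from_variances([2, 2], [1.5, 3.5], [0.5, 0.5])
# The mistake: squaring values that are already variances.
squared_again = combined_sd_from_variances([2, 2], [1.5, 3.5], [0.5 ** 2, 0.5 ** 2])
print(correct, squared_again)
```

The first result matches the sample SD of the four raw values; the second shrinks the within-group term and gives a different answer.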