When to Use This Formula
When a problem gives you possible outcomes and their probabilities instead of a raw data list, you do not use the usual sample standard deviation workflow. You first compute the expected value of the random variable, then measure how far each outcome sits from that mean after weighting by probability.
This setup appears in reliability models, quality-control defect counts, insurance claims, game outcomes, and classroom probability questions. If you want to check the arithmetic numerically, the site's probability calculator, mean and variance calculator, and mean, variance, and standard deviation calculator are the most relevant companion tools.
| Input type | What you know | Best standard deviation approach |
|---|---|---|
| Raw dataset | Observed values such as 4, 7, 9, 10 | Use sample or population formulas on the data list |
| Frequency table | Observed values plus counts | Use weighted frequencies as shown in Standard Deviation from a Frequency Table |
| Probability distribution | Possible values plus probabilities summing to 1 | Use expected value and probability-weighted variance |
Key distinction
Core Formulas
For a discrete random variable X with outcomes xᵢ and probabilities pᵢ, the probabilities must satisfy Σpᵢ = 1. The mean of the distribution is:
Expected value: μ = E(X) = Σxᵢpᵢ
The variance and standard deviation are then:
Variance of a discrete probability distribution: σ² = Σ(xᵢ - μ)²pᵢ
Standard deviation of a discrete probability distribution: σ = √σ² = √(Σ(xᵢ - μ)²pᵢ)
This is the same spread concept used in Standard Deviation Formula Explained and Understanding Variance, but the weights now come from probabilities rather than repeated observations.
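The definitions above translate almost line for line into code. The following is a minimal Python sketch, not a reference implementation: the function name `distribution_stats` and the fair-die inputs are assumptions chosen for illustration, and the function assumes the probabilities are already valid (non-negative and summing to 1).

```python
import math


def distribution_stats(outcomes, probs):
    """Return (mean, variance, standard deviation) of a discrete distribution.

    Assumes probs are non-negative and sum to 1; no validation is done here.
    """
    # Expected value: each outcome weighted by its probability.
    mean = sum(x * p for x, p in zip(outcomes, probs))
    # Variance: probability-weighted squared distance from the mean.
    variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probs))
    # Standard deviation: square root of the variance.
    return mean, variance, math.sqrt(variance)


# Hypothetical fair six-sided die, used only to exercise the function.
faces = [1, 2, 3, 4, 5, 6]
die_mean, die_variance, die_sd = distribution_stats(faces, [1 / 6] * 6)
print(die_mean, die_variance, die_sd)
```

Note that nothing in the function divides by n or n - 1; the probabilities carry all of the weighting.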
Worked Example
Suppose X is the number of defective items found in a short production run. Its probability distribution is:
| x | P(X = x) | xP(X = x) | x²P(X = x) |
|---|---|---|---|
| 0 | 0.15 | 0.00 | 0.00 |
| 1 | 0.35 | 0.35 | 0.35 |
| 2 | 0.30 | 0.60 | 1.20 |
| 3 | 0.20 | 0.60 | 1.80 |
| Total | 1.00 | 1.55 | 3.35 |
The expected value is μ = 1.55. That means the long-run average number of defects per run is 1.55, even though 1.55 defects never occurs in a single run.
Using the variance formula directly gives σ² = (0 - 1.55)²(0.15) + (1 - 1.55)²(0.35) + (2 - 1.55)²(0.30) + (3 - 1.55)²(0.20) = 0.360375 + 0.105875 + 0.060750 + 0.420500 = 0.9475.
The standard deviation is σ = √0.9475 ≈ 0.973. In practical terms, the defect count in a single run typically lands within about one defect of the mean of 1.55.
- Verify the distribution
- Find the mean first
- Square each distance from the mean
- Weight by probability
- Take the square root last
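The five steps can be traced in order on the defect example. This is a small Python sketch that uses the outcomes and probabilities from the table above; the variable names are illustrative choices, and the tolerance check in step 1 relies on the default settings of `math.isclose`.

```python
import math

# Defect distribution from the worked example.
xs = [0, 1, 2, 3]
ps = [0.15, 0.35, 0.30, 0.20]

# Step 1: verify the distribution (non-negative, sums to 1 within tolerance).
assert all(p >= 0 for p in ps)
assert math.isclose(sum(ps), 1.0)

# Step 2: find the mean first.
mu = sum(x * p for x, p in zip(xs, ps))

# Step 3: square each distance from the mean.
squared_distances = [(x - mu) ** 2 for x in xs]

# Step 4: weight each squared distance by its probability and add them up.
variance = sum(d * p for d, p in zip(squared_distances, ps))

# Step 5: take the square root last.
sigma = math.sqrt(variance)

print(f"mu = {mu:.2f}, variance = {variance:.4f}, sigma = {sigma:.3f}")
# prints: mu = 1.55, variance = 0.9475, sigma = 0.973
```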
Interpretation tip
Shortcut Method
Many textbook problems are faster with the computational identity:
Variance shortcut: σ² = E(X²) - μ², where E(X²) = Σxᵢ²pᵢ
In the example above, E(X²) = 3.35 and μ² = 1.55² = 2.4025. So σ² = 3.35 - 2.4025 = 0.9475, which matches the long method exactly.
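One way to confirm that the two methods agree exactly, rather than only up to rounding, is to repeat the calculation with exact fractions. The sketch below uses Python's standard `fractions` module; writing the probabilities as hundredths is simply a convenient way to enter the values from the example table.

```python
from fractions import Fraction

# Defect distribution from the worked example, entered as exact fractions.
xs = [0, 1, 2, 3]
ps = [Fraction(15, 100), Fraction(35, 100), Fraction(30, 100), Fraction(20, 100)]

mu = sum(x * p for x, p in zip(xs, ps))        # E(X)
ex2 = sum(x ** 2 * p for x, p in zip(xs, ps))  # E(X^2)

# Shortcut: E(X^2) minus the squared mean.
var_shortcut = ex2 - mu ** 2
# Definition: probability-weighted squared deviations from the mean.
var_definition = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))

# With exact arithmetic the two results are identical, not just close.
assert var_shortcut == var_definition

print(float(mu), float(ex2), float(var_shortcut))
# prints: 1.55 3.35 0.9475
```

With ordinary floating-point numbers the two results can differ in the last few digits, so a tolerance-based comparison is the safer check there.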
When the shortcut helps most
Sample vs Distribution Standard Deviation
Students often mix up a sample standard deviation with the standard deviation of a probability distribution. They answer different questions. A sample standard deviation summarizes observed data and usually uses n - 1. A distribution standard deviation summarizes the theoretical model itself and uses the full probability weights.
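Use sample SD when you have observed data values and want to estimate the spread of the process or population that produced them.
Use distribution SD when the problem states the possible outcomes and their probabilities, so the model itself is known.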
That difference also explains why a probability-distribution question usually does not use Bessel's correction. The probabilities already define the whole distribution, so there is no sample-estimation adjustment.
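The contrast can be seen side by side in a short Python sketch. It uses the raw values 4, 7, 9, 10 from the input-type table and the defect distribution from the worked example; pairing those two inputs is only for illustration, since they describe unrelated quantities.

```python
import math
import statistics

# Raw observed data: the sample formula divides by n - 1 (Bessel's correction).
data = [4, 7, 9, 10]
sample_sd = statistics.stdev(data)       # divides by n - 1
population_sd = statistics.pstdev(data)  # divides by n

# Probability distribution: the weights are the probabilities themselves,
# so there is no n and no n - 1 anywhere in the calculation.
xs = [0, 1, 2, 3]
ps = [0.15, 0.35, 0.30, 0.20]
mu = sum(x * p for x, p in zip(xs, ps))
distribution_sd = math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(xs, ps)))

print(f"sample SD       = {sample_sd:.4f}")
print(f"population SD   = {population_sd:.4f}")
print(f"distribution SD = {distribution_sd:.4f}")
```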
Practical Patterns
| Scenario | What the standard deviation tells you |
|---|---|
| Defects per batch | How volatile the defect count is around the expected number of defects |
| Game payoff distribution | How risky or unpredictable the payoff is relative to its average |
| Demand outcomes in inventory planning | How much actual demand may swing around expected demand |
| Number of claims or failures | How stable or unstable the event count is over repeated periods |
If the distribution is approximately bell-shaped after aggregation, Understanding Normal Distribution and Z-Score Explained help you translate standard deviation into probability statements and unusually high or low outcomes.
Quick Bernoulli example
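A Bernoulli variable takes the value 1 with probability p and 0 with probability 1 - p, so the general formulas reduce to μ = p and σ² = p(1 - p). The Python sketch below checks this with an assumed p = 0.2, a hypothetical value picked only for illustration (for instance, a 20% chance that a single item is defective).

```python
import math

p = 0.2  # hypothetical success probability, chosen only for illustration

# Bernoulli distribution written as an ordinary two-row probability table.
xs = [0, 1]
ps = [1 - p, p]

# General probability-weighted formulas.
mu = sum(x * q for x, q in zip(xs, ps))
variance = sum((x - mu) ** 2 * q for x, q in zip(xs, ps))
sigma = math.sqrt(variance)

# The general formulas agree with the Bernoulli closed forms.
assert math.isclose(mu, p)
assert math.isclose(variance, p * (1 - p))

print(f"mean = {mu:.2f}, sd = {sigma:.2f}")
# prints: mean = 0.20, sd = 0.40
```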
Problem-Solving Checklist
- Check the inputs: Make sure the table gives **probabilities**, not frequencies. If you have counts, use the frequency-table workflow rather than the probability-distribution workflow.
- Sum to one: Confirm that **Σp(x) = 1**. If not, the table is incomplete or the values need normalization.
- Compute the mean first: Do not jump straight to squared deviations. Every variance term depends on **μ**.
- Use the shortcut when convenient: If **Σx²p(x)** is easy to build, use **E(X²) - μ²** to reduce arithmetic mistakes.
- Interpret the answer in context: A standard deviation is large or small only relative to the outcome scale, the mean, and the real decision you are making.
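The first two checklist items are easy to automate. The following Python sketch shows one possible way to do it; the helper names `check_distribution` and `normalize` and the tolerance value are assumptions made for this example, not part of any standard library.

```python
import math


def check_distribution(probs, tol=1e-9):
    """Raise ValueError if probs is not a valid probability distribution."""
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    total = sum(probs)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"probabilities sum to {total}, not 1")


def normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


# The defect table from the worked example passes the check.
check_distribution([0.15, 0.35, 0.30, 0.20])

# An incomplete table is rejected.
try:
    check_distribution([0.5, 0.4])
except ValueError as err:
    print("rejected:", err)

# Weights that do not sum to 1 can be rescaled before use.
rescaled = normalize([2, 3, 5])
check_distribution(rescaled)
```

Normalizing is only appropriate when the listed values are known to cover every outcome; if an outcome is missing from the table, rescaling hides the gap instead of fixing it.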
Once you see probabilities as weights, the formula becomes much easier to remember: find the mean, measure squared distance from that mean, weight by probability, and take the square root. The mathematics is the same idea as ordinary standard deviation, but the input is a model instead of a sample.