Intermediate · Probability · 10 min

Standard Deviation of a Probability Distribution

Learn how to calculate the mean, variance, and standard deviation of a discrete probability distribution using weighted outcomes, shortcut formulas, and practical examples.

By Standard Deviation Calculator Team · Data Science Team

When to Use This Formula

When a problem gives you possible outcomes and their probabilities instead of a raw data list, you do not use the usual sample standard deviation workflow. You first compute the expected value of the random variable, then measure how far each outcome sits from that mean after weighting by probability.

This setup appears in reliability models, quality-control defect counts, insurance claims, game outcomes, and classroom probability questions. If you want to check the arithmetic numerically, the most relevant companion tools on this site are the probability calculator; the mean and variance calculator; and the mean, variance, and standard deviation calculator.

| Input type | What you know | Best standard deviation approach |
| --- | --- | --- |
| Raw dataset | Observed values such as 4, 7, 9, 10 | Use sample or population formulas on the data list |
| Frequency table | Observed values plus counts | Use weighted frequencies as shown in Standard Deviation from a Frequency Table |
| Probability distribution | Possible values plus probabilities summing to 1 | Use expected value and probability-weighted variance |

Key distinction

A probability distribution describes the behavior of the random variable itself, so its standard deviation is usually a population-style parameter. You are not estimating from a sample at this stage.

Core Formulas

For a discrete random variable X with outcomes xᵢ and probabilities pᵢ, the probabilities must satisfy 0 ≤ pᵢ ≤ 1 and Σpᵢ = 1. The mean of the distribution is:

Expected value

μ = E(X) = Σ(xᵢpᵢ)

The variance and standard deviation are then:

Variance of a discrete probability distribution

σ² = Σ[(xᵢ - μ)²pᵢ]

Standard deviation of a discrete probability distribution

σ = √[Σ((xᵢ - μ)²pᵢ)]

This is the same spread concept used in Standard Deviation Formula Explained and Understanding Variance, but the weights now come from probabilities rather than repeated observations.
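The three formulas above can be sketched as one small Python function. This is a minimal illustration, not a reference implementation: the name `distribution_stats` and the fair-die example are assumptions introduced here and do not come from the article or its calculators.

```python
import math

def distribution_stats(outcomes, probs):
    """Return (mean, variance, standard deviation) of a discrete distribution.

    `outcomes` are the values x_i and `probs` are the matching p_i.
    The function name and signature are hypothetical, chosen for this sketch.
    """
    if len(outcomes) != len(probs):
        raise ValueError("outcomes and probs must have the same length")
    if any(p < 0 or p > 1 for p in probs):
        raise ValueError("each probability must lie between 0 and 1")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")

    # mu = sum of x_i * p_i
    mean = sum(x * p for x, p in zip(outcomes, probs))
    # sigma^2 = sum of (x_i - mu)^2 * p_i
    variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probs))
    # sigma = square root of the variance
    return mean, variance, math.sqrt(variance)

# Illustrative input (an assumption, not from the article): a fair six-sided die.
die_mean, die_var, die_sd = distribution_stats([1, 2, 3, 4, 5, 6], [1 / 6] * 6)
print(die_mean, die_var, die_sd)
```

The sum-to-one check uses a small tolerance because probabilities such as 1/6 are not exactly representable in floating point.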

Worked Example

Suppose X is the number of defective items found in a short production run. Its probability distribution is:

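| x | P(X = x) | x·P(X = x) | x²·P(X = x) |
| --- | --- | --- | --- |
| 0 | 0.15 | 0.00 | 0.00 |
| 1 | 0.35 | 0.35 | 0.35 |
| 2 | 0.30 | 0.60 | 1.20 |
| 3 | 0.20 | 0.60 | 1.80 |
| Total | 1.00 | 1.55 | 3.35 |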

The expected value is μ = 0(0.15) + 1(0.35) + 2(0.30) + 3(0.20) = 1.55. That means the long-run average number of defects per run is 1.55, even though no single run can contain exactly 1.55 defects.

Using the variance formula directly gives σ² = (0 - 1.55)²(0.15) + (1 - 1.55)²(0.35) + (2 - 1.55)²(0.30) + (3 - 1.55)²(0.20) = 0.9475.

The standard deviation is σ = √0.9475 ≈ 0.973. In practical terms, the number of defects in a run typically differs from the mean by about one defect.

  1. Verify the distribution: Check that all probabilities are between 0 and 1 and that they sum to 1.00.
  2. Find the mean first: Compute Σxᵢpᵢ before touching variance. The mean anchors every deviation.
  3. Square each distance from the mean: Use (xᵢ - μ)², not the unsquared distances, so positive and negative deviations do not cancel.
  4. Weight by probability: Multiply each squared distance by its probability pᵢ.
  5. Take the square root last: Variance comes first; standard deviation is its square root.
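The five steps can be traced in a few lines of Python using the defect distribution from the worked example. This is a sketch for checking the arithmetic; the variable names are illustrative choices, not part of the article.

```python
import math

# Defect distribution from the worked example
xs = [0, 1, 2, 3]
ps = [0.15, 0.35, 0.30, 0.20]

# Step 1: verify the distribution
assert all(0 <= p <= 1 for p in ps)
assert math.isclose(sum(ps), 1.0)

# Step 2: find the mean first
mu = sum(x * p for x, p in zip(xs, ps))

# Steps 3 and 4: square each distance from the mean, then weight by probability
weighted_sq_devs = [(x - mu) ** 2 * p for x, p in zip(xs, ps)]
variance = sum(weighted_sq_devs)

# Step 5: take the square root last
sigma = math.sqrt(variance)

print(f"mean     = {mu:.2f}")        # mean     = 1.55
print(f"variance = {variance:.4f}")  # variance = 0.9475
print(f"sd       = {sigma:.3f}")     # sd       = 0.973
```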

Interpretation tip

A standard deviation near 0 means most of the probability mass is tightly concentrated near the mean. A larger standard deviation means more probability sits farther away, either through wider spread or heavier tails.

Shortcut Method

Many textbook problems are faster with the computational identity:

Variance shortcut

σ² = E(X²) - [E(X)]² = Σ(xᵢ²pᵢ) - μ²

In the example above, E(X²) = 3.35 and μ² = 1.55² = 2.4025. So σ² = 3.35 - 2.4025 = 0.9475, which matches the long method exactly.

When the shortcut helps most

Use E(X²) - μ² when the table already includes an x²p(x) column or when you are doing the calculation by hand under time pressure.
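As a quick numerical check, the shortcut and the definition can be computed side by side for the defect example. The variable names below are illustrative assumptions.

```python
import math

# Defect distribution from the worked example
xs = [0, 1, 2, 3]
ps = [0.15, 0.35, 0.30, 0.20]

mu = sum(x * p for x, p in zip(xs, ps))           # E(X)
e_x2 = sum(x ** 2 * p for x, p in zip(xs, ps))    # E(X^2)

var_shortcut = e_x2 - mu ** 2                                  # E(X^2) - [E(X)]^2
var_direct = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))    # definition

print(f"E(X^2)   = {e_x2:.2f}")          # E(X^2)   = 3.35
print(f"shortcut = {var_shortcut:.4f}")  # shortcut = 0.9475
print(f"direct   = {var_direct:.4f}")    # direct   = 0.9475
```

One caveat when moving from hand arithmetic to code: in floating point, the shortcut subtracts two numbers of similar size, so it can lose precision when the mean is very large relative to the spread. For small tables like this one, the two methods agree to many decimal places.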

Sample vs Distribution Standard Deviation

Students often mix up a sample standard deviation with the standard deviation of a probability distribution. They answer different questions. A sample standard deviation summarizes observed data and usually divides by n - 1. A distribution standard deviation summarizes the theoretical model itself and uses the full probability weights.

Use sample SD when

You already observed a dataset and want to estimate variability from those measurements. See Sample vs Population for that workflow.

Use distribution SD when

You are given outcomes with probabilities and want the exact spread implied by the model, such as a binomial, geometric, or custom discrete distribution.

That difference also explains why a probability-distribution question usually does not use Bessel's correction. The probabilities already define the whole distribution, so there is no sample-estimation adjustment.
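The contrast can be illustrated by simulation. The sketch below computes the exact standard deviation of the defect distribution, then draws a random sample from it and estimates the spread with the n - 1 formula. The sample size and seed are arbitrary choices made for this illustration.

```python
import math
import random
import statistics

# Defect distribution from the worked example
xs = [0, 1, 2, 3]
ps = [0.15, 0.35, 0.30, 0.20]

# Distribution SD: exact, computed from the model's probabilities
mu = sum(x * p for x, p in zip(xs, ps))
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(xs, ps)))

# Sample SD: an estimate from simulated observations (seed and size are arbitrary)
rng = random.Random(42)
sample = rng.choices(xs, weights=ps, k=20_000)
sample_sd = statistics.stdev(sample)  # divides by n - 1

print(f"distribution SD = {sigma:.4f}")
print(f"sample SD       = {sample_sd:.4f}")
```

The sample value changes with the seed and sample size, while the distribution value does not. With a large sample the two should be close, which is what estimation from data aims for.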

Practical Patterns

| Scenario | What the standard deviation tells you |
| --- | --- |
| Defects per batch | How volatile the defect count is around the expected number of defects |
| Game payoff distribution | How risky or unpredictable the payoff is relative to its average |
| Demand outcomes in inventory planning | How much actual demand may swing around expected demand |
| Number of claims or failures | How stable or unstable the event count is over repeated periods |

If the distribution is approximately bell-shaped after aggregation, Understanding Normal Distribution and Z-Score Explained help you translate standard deviation into probability statements and unusually high or low outcomes.
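As a small illustration of the z-score arithmetic only, the sketch below standardizes each outcome of the defect example with z = (x - μ) / σ. A four-outcome distribution is not bell-shaped, so these values should not be converted into normal probabilities. The example only shows how the distribution's own μ and σ feed into the z-score.

```python
import math

# Defect distribution from the worked example
xs = [0, 1, 2, 3]
ps = [0.15, 0.35, 0.30, 0.20]

mu = sum(x * p for x, p in zip(xs, ps))
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(xs, ps)))

# z = (x - mu) / sigma: distance from the mean in standard-deviation units
z_scores = {x: (x - mu) / sigma for x in xs}

for x, z in z_scores.items():
    print(f"x = {x}: z = {z:+.2f}")
```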

Quick Bernoulli example

If X equals 1 for a machine failure and 0 for no failure, with P(failure) = 0.08, then μ = 0.08, σ² = 0.08(0.92) = 0.0736, and σ ≈ 0.271. Even a binary event has a meaningful standard deviation because the result varies from run to run.
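The Bernoulli result follows from the general formula: with outcomes 0 and 1, the variance Σ(xᵢ - μ)²pᵢ reduces to p(1 - p). The sketch below checks both routes for P(failure) = 0.08. The helper name `bernoulli_sd` is a hypothetical choice for this illustration.

```python
import math

def bernoulli_sd(p):
    """Standard deviation of a 0/1 variable with success probability p."""
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    return math.sqrt(p * (1 - p))

p = 0.08

# Closed form: variance = p(1 - p)
closed_form_var = p * (1 - p)

# General formula applied to outcomes 0 and 1
xs, ps = [0, 1], [1 - p, p]
mu = sum(x * q for x, q in zip(xs, ps))
general_var = sum((x - mu) ** 2 * q for x, q in zip(xs, ps))

print(round(closed_form_var, 4))   # 0.0736
print(round(general_var, 4))       # 0.0736
print(round(bernoulli_sd(p), 3))   # 0.271
```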

Problem-Solving Checklist

  • Check the inputs: Make sure the table gives **probabilities**, not frequencies. If you have counts, use the frequency-table workflow rather than the probability-distribution workflow.
  • Sum to one: Confirm that **Σp(x) = 1**. If not, the table is incomplete or the values need normalization.
  • Compute the mean first: Do not jump straight to squared deviations. Every variance term depends on **μ**.
  • Use the shortcut when convenient: If **Σx²p(x)** is easy to build, use **E(X²) - μ²** to reduce arithmetic mistakes.
  • Interpret the answer in context: A standard deviation is large or small only relative to the outcome scale, the mean, and the real decision you are making.
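The first two checklist items can be automated with a small validation helper. This is a sketch under stated assumptions: the function names `check_distribution` and `probabilities_from_counts` and the example counts are hypothetical. Turning counts into relative frequencies gives an empirical distribution of the observed data, not necessarily the true model.

```python
import math

def check_distribution(probs, tol=1e-9):
    """Return a list of problems found; an empty list means the table looks valid."""
    problems = []
    if any(p < 0 or p > 1 for p in probs):
        problems.append("a probability lies outside [0, 1]")
    if not math.isclose(sum(probs), 1.0, abs_tol=tol):
        problems.append(f"probabilities sum to {sum(probs):.4f}, not 1")
    return problems

def probabilities_from_counts(counts):
    """Normalize nonnegative counts into relative frequencies that sum to 1."""
    total = sum(counts)
    if total <= 0 or any(c < 0 for c in counts):
        raise ValueError("counts must be nonnegative with a positive total")
    return [c / total for c in counts]

# A table that fails the sum-to-one check (values chosen for illustration)
print(check_distribution([0.2, 0.5, 0.2]))

# Hypothetical counts that normalize to the worked example's probabilities
probs = probabilities_from_counts([15, 35, 30, 20])
print(probs, check_distribution(probs))
```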

Once you see probabilities as weights, the formula becomes much easier to remember: find the mean, measure squared distance from that mean, weight by probability, and take the square root. The mathematics is the same idea as ordinary standard deviation, but the input is a model instead of a sample.
