SDCalc
Intermediate · Applications · 9 min

Detecting Outliers with Standard Deviation

Learn how to identify outliers in your data using standard deviation. Master the 3-sigma rule and the IQR method, and understand when outliers should be removed.

What are Outliers?

Outliers are data points that differ significantly from other observations. They can be caused by measurement errors, data entry mistakes, or they might represent genuinely unusual cases worth investigating.

Figure: The orange point at (10, 50) is an outlier.

The 3-Sigma Rule

For normally distributed data, points more than 3 standard deviations from the mean are considered outliers. Such values occur only about 0.27% of the time by chance (roughly 1 in 370 observations).

Outlier if

x < μ - 3σ OR x > μ + 3σ
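The chance-occurrence figure quoted above can be checked numerically. The sketch below assumes a perfectly normal distribution and uses the identity P(|Z| > k) = erfc(k / √2); the function name `two_sided_tail` is made up for this illustration.

```python
import math

def two_sided_tail(k: float) -> float:
    """Probability that a standard normal value lies more than k sigma from the mean."""
    return math.erfc(k / math.sqrt(2))

# Fraction of normally distributed data outside 1, 2, and 3 sigma
for k in (1, 2, 3):
    print(f"beyond {k} sigma: {two_sided_tail(k):.2%}")
```

For k = 3 this gives about 0.27%. Real data is rarely exactly normal, so treat the figure as a guide rather than a guarantee.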

Example

If test scores have μ = 75 and σ = 10:

- Lower bound: 75 - 30 = 45
- Upper bound: 75 + 30 = 105
- Any score below 45 or above 105 is an outlier
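The same arithmetic can be written as a small Python sketch. The helper name `three_sigma_bounds` and the list of scores are hypothetical and exist only to illustrate the rule.

```python
def three_sigma_bounds(mu: float, sigma: float, k: float = 3.0) -> tuple[float, float]:
    """Return the (lower, upper) bounds mu - k*sigma and mu + k*sigma."""
    return mu - k * sigma, mu + k * sigma

lower, upper = three_sigma_bounds(75, 10)
print(lower, upper)  # 45.0 105.0

# Hypothetical scores, made up for this example
scores = [42, 58, 75, 91, 108]
flagged = [s for s in scores if s < lower or s > upper]
print(flagged)  # [42, 108]
```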

Z-Score Method

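Calculate the z-score for each data point, which measures how many standard deviations the point lies from the mean. If |z| > 3 (or sometimes 2.5), it's an outlier. With a threshold of 3, this is the same test as the 3-sigma rule.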

Z-Score

z = (x - μ) / σ

Threshold Options

- |z| > 3: Conservative (catches fewer outliers)
- |z| > 2.5: Moderate
- |z| > 2: Liberal (catches more outliers)
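To see how the threshold changes the result, here is a minimal sketch using Python's standard `statistics` module. It assumes the data is the whole population, so it uses `pstdev`; swap in `statistics.stdev` for a sample. The function names and the data are invented for illustration.

```python
import statistics

def z_scores(data):
    """Return the z-score of every value, using the population standard deviation."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # assumption: population, not sample, SD
    if sigma == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(x - mu) / sigma for x in data]

def flag_outliers(data, threshold=3.0):
    """Return the values whose |z| exceeds the threshold."""
    return [x for x, z in zip(data, z_scores(data)) if abs(z) > threshold]

# Hypothetical measurements, made up for this example
data = [48, 48, 49, 49, 49, 50, 50, 50, 50, 50,
        50, 50, 50, 51, 51, 51, 52, 52, 64, 70]

for threshold in (3, 2.5, 2):
    print(threshold, flag_outliers(data, threshold))
```

In this sample, thresholds of 3 and 2.5 flag only 70, while a threshold of 2 also flags 64.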

IQR Method (Alternative)

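The Interquartile Range (IQR) method is more robust because it doesn't use the mean or standard deviation, both of which are pulled toward extreme values. Quartiles depend only on the middle of the data, so a few outliers barely move them.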

1. Find Q1 (25th percentile) and Q3 (75th percentile)
2. Calculate IQR = Q3 - Q1
3. Lower fence = Q1 - 1.5 × IQR
4. Upper fence = Q3 + 1.5 × IQR
5. Points outside the fences are outliers
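The steps above can be sketched in Python with the standard `statistics` module. Quartile conventions differ between tools; this sketch assumes the "inclusive" method of `statistics.quantiles`, so other software may give slightly different fences. The function names and data are made up for illustration.

```python
import statistics

def iqr_fences(data, k=1.5):
    """Return (lower fence, upper fence) computed from Q1, Q3 and the IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def iqr_outliers(data, k=1.5):
    """Return the values that fall outside the fences."""
    lower, upper = iqr_fences(data, k)
    return [x for x in data if x < lower or x > upper]

# Hypothetical data with one extreme value
data = [2, 4, 4, 5, 5, 6, 6, 7, 30]

print(iqr_fences(data))    # (1.0, 9.0)
print(iqr_outliers(data))  # [30]

# For comparison: the 3-sigma rule on the same data (population SD assumed)
mu = statistics.fmean(data)
sigma = statistics.pstdev(data)
three_sigma = [x for x in data if abs(x - mu) > 3 * sigma]
print(three_sigma)         # []
```

Here the value 30 inflates the standard deviation enough that the 3-sigma rule misses it, while the IQR fences still catch it. This is the robustness the IQR method is valued for, and it matters most in small samples.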

Handling Outliers

Don't Automatically Delete!

Outliers aren't always errors. Before removing them, investigate:

- Is it a data entry or measurement error?
- Is it a genuine extreme value?
- Does it represent an important edge case?

When to Remove

- Confirmed data entry errors
- Measurement equipment malfunction
- Outside the possible range of values

When to Keep

- Represents real variability
- Important for your analysis
- Removing would bias results
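One way to put this into practice is to sort flagged values into groups instead of deleting them outright. The sketch below is only an illustration: the `triage` function, the 0 to 100 valid range, and the scores are all assumptions, with the 45 and 105 bounds taken from the test-score example.

```python
def triage(values, valid_min, valid_max, lower_bound, upper_bound):
    """Split values into impossible ('remove'), unusual ('review'), and ordinary ('keep')."""
    report = {"remove": [], "review": [], "keep": []}
    for v in values:
        if v < valid_min or v > valid_max:
            report["remove"].append(v)   # outside the possible range: likely an error
        elif v < lower_bound or v > upper_bound:
            report["review"].append(v)   # statistically unusual: investigate first
        else:
            report["keep"].append(v)
    return report

# Hypothetical test scores on an assumed 0-100 scale
scores = [38, 62, 75, 88, 97, 112, -5]
report = triage(scores, valid_min=0, valid_max=100, lower_bound=45, upper_bound=105)
print(report)
# {'remove': [112, -5], 'review': [38], 'keep': [62, 75, 88, 97]}
```

Only the impossible values are marked for removal. The score of 38 is unusual but possible, so it is set aside for a human decision rather than dropped.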