How should I interpret a high standard deviation?

A high standard deviation means the observations are spread farther from the mean on average. Whether that spread is acceptable depends on the context: wide dispersion might signal risk in finance, instability in manufacturing, or genuine natural variation in scientific data.

Why do some articles mention n while others mention n-1?

The denominator reflects the difference between population and sample formulas. Population variance and population standard deviation use N because the full dataset is known. Sample variance and sample standard deviation often use n-1 because Bessel’s correction reduces bias when estimating population spread from a sample.
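
The difference between the two denominators can be seen directly with Python's standard `statistics` module. This is a minimal sketch: the list `sample` is illustrative data, and whether it should be treated as a full population or as a sample is an assumption you have to make from context.

```python
import statistics

# Illustrative data; treat it either as a whole population or as a sample.
sample = [68, 72, 75, 70, 69, 74, 71, 73, 76, 72]

# Population formulas divide the sum of squared deviations by N.
pop_var = statistics.pvariance(sample)
pop_sd = statistics.pstdev(sample)

# Sample formulas divide by n - 1 (Bessel's correction).
samp_var = statistics.variance(sample)
samp_sd = statistics.stdev(sample)

print(f"population: variance={pop_var:.3f}, sd={pop_sd:.3f}")
print(f"sample:     variance={samp_var:.3f}, sd={samp_sd:.3f}")
```

The sample versions are always slightly larger for the same data, and the gap shrinks as n grows.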

What is a statistical interpretation guide?

A statistical interpretation guide is a page that moves beyond arithmetic and explains meaning. It tells you what a metric is, when the formula applies, and how to describe the result in plain English without overstating certainty.

Can I cite this article in a report?

You should cite the underlying authoritative reference for formal work whenever possible. This page is best used as an explanatory bridge that helps you understand the concept before quoting the original standard or handbook.

Why include direct citations on every article page?

Direct citations give readers a route to verify the definition, notation, and assumptions. That improves trust and reduces the chance that a simplified explanation is mistaken for the entire technical standard.

Cohen's d and Effect Size Calculation | Standard Deviation Calculator

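Beyond Statistical Significance: Understanding Effect Size

An effect size measures the magnitude of a difference or relationship on a scale that does not grow with sample size. A p-value tells you whether an effect is statistically significant; an effect size tells you how much that effect matters in practice. The distinction is essential for evidence-based decisions in research, medicine, education, and business.

Imagine a pharmaceutical trial in which a new drug shows a statistically significant improvement over placebo (p < 0.001). Without the effect size, you cannot tell whether the improvement is 0.1% or 50%. Effect size supplies that context and helps decision-makers judge whether the benefit justifies the cost, the side effects, or the implementation effort.

The most common effect size for comparing two groups is Cohen's d, which expresses the difference between means in standard deviation units. This standardization lets you compare results across studies and measurement scales.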

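Why Effect Size Matters

Statistical significance depends heavily on sample size. With a large enough sample, even a tiny difference becomes "significant." Conversely, an important effect may fail to reach significance in a small sample. Effect size addresses this by providing a measure of magnitude that does not scale with sample size.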

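The Significance Trap

A study with n = 10,000 might find that a 0.5-point difference on a 100-point scale has p < 0.001. The result is statistically significant but practically meaningless (d ≈ 0.05). Always report an effect size alongside the p-value.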
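
The gap between significance and magnitude can be sketched numerically. The example below rests on several assumptions that are not stated in the text: two equal-sized groups, n counted per group, a common standard deviation of 10 points (which is what d ≈ 0.05 implies for a 0.5-point difference), and a normal approximation to the test statistic. The function name `two_group_summary` is illustrative.

```python
import math
from statistics import NormalDist

def two_group_summary(mean_diff, sd, n_per_group):
    """Return (d, z, two-sided p) for two equal-sized groups sharing one SD.

    Uses a normal approximation, which is reasonable for large samples.
    """
    d = mean_diff / sd                      # standardized effect size
    se = sd * math.sqrt(2 / n_per_group)    # standard error of the mean difference
    z = mean_diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return d, z, p

# Same effect size, very different p-values as the sample grows.
for n in (50, 500, 10_000):
    d, z, p = two_group_summary(mean_diff=0.5, sd=10, n_per_group=n)
    print(f"n per group = {n:>6}: d = {d:.2f}, z = {z:.2f}, p = {p:.4f}")
```

The effect size stays at 0.05 in every row; only the p-value changes with n.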

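Key reasons to use effect sizes:

Meta-analysis: Effect sizes can be combined across studies to estimate an overall effect
Power analysis: Required when calculating the sample size needed for a future study
Practical decisions: Help judge whether an intervention is worth implementing
Replication: Give replication studies a target to compare against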
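
As an illustration of the power-analysis use, the sketch below estimates the per-group sample size needed to detect a given d. It assumes a two-sided test, two equal-sized independent groups, and the common normal-approximation formula n ≈ 2(z₁₋α/₂ + z_power)² / d². The function name `n_per_group` and the default alpha and power values are illustrative choices, not something the article specifies.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample comparison.

    Normal approximation; exact t-based calculations give slightly larger values.
    """
    if d == 0:
        raise ValueError("d must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

Smaller effects require much larger samples, because the required n grows with 1/d².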

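Cohen's d: The Standard Effect Size Measure

Cohen's d expresses the difference between two group means in units of the pooled standard deviation:

Cohen's d

d = (M₁ - M₂) / s_p

Here M₁ and M₂ are the two group means and s_p is the pooled standard deviation, calculated from the group sizes n₁, n₂ and the sample standard deviations s₁, s₂ as follows:

Pooled standard deviation

s_p = √[((n₁-1)s₁² + (n₂-1)s₂²) / (n₁+n₂-2)]

The sign of d indicates direction: d is positive when M₁ > M₂ and negative when M₁ < M₂. When the direction is clear from context, the absolute value |d| is often reported.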

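Why Pool the Standard Deviations?

Pooling assumes that both groups share the same population variance. Under that assumption, the pooled estimate is more stable than either group's standard deviation alone, and it matches the assumption of the independent-samples t-test.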

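Other Effect Size Measures

Cohen's d is the most widely used, but alternatives exist for specific situations:

Hedges' g: A Bias-Corrected Effect Size

Cohen's d slightly overestimates the population effect size in small samples. Hedges' g applies a correction factor:

Hedges' g correction

g = d × (1 - 3/(4(n₁+n₂) - 9))

With more than 20 observations per group, the difference is negligible. For small samples (n < 20), Hedges' g is recommended.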
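
To see how quickly the correction fades, the sketch below evaluates the factor 1 - 3/(4(n₁+n₂) - 9) for a few equal group sizes. The helper name `hedges_correction` and the reference value d = 0.50 are illustrative.

```python
def hedges_correction(n1, n2):
    """Small-sample correction factor that turns Cohen's d into Hedges' g."""
    return 1 - 3 / (4 * (n1 + n2) - 9)

# Apply the factor to an illustrative d of 0.50 at several group sizes.
for n in (5, 10, 20, 50):
    j = hedges_correction(n, n)
    print(f"n per group = {n:>2}: factor = {j:.4f}, d = 0.50 becomes g = {0.5 * j:.3f}")
```

The factor is always below 1, so g is slightly smaller in magnitude than d, and it approaches 1 as the groups grow.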

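Glass's Δ: When Variances Are Unequal

When one group is a control group with known variability, use only the control group's standard deviation as the denominator:

Glass's Delta

Δ = (M₁ - M₂) / s_control

This is useful when an intervention may itself change the variance (for example, an intervention that helps low scorers more than high scorers).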
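
A minimal sketch of Glass's Δ using only the standard library follows. It reuses the small example lists from this page's Python section, and it assumes the control group's sample standard deviation (n - 1 denominator) is the intended scale; the function name `glass_delta` is illustrative.

```python
import statistics

def glass_delta(treatment, control):
    """Glass's delta: mean difference scaled by the control group's sample SD."""
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    return mean_diff / statistics.stdev(control)

control = [68, 72, 75, 70, 69, 74, 71, 73, 76, 72]
treatment = [75, 79, 82, 78, 80, 77, 81, 76, 83, 79]

delta = glass_delta(treatment, control)
print(f"Glass's delta: {delta:.3f}")
```

Because only the control group's spread enters the denominator, a change in the treatment group's variance does not alter the scale of the result.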

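Interpreting Effect Sizes: Cohen's Guidelines

Jacob Cohen proposed conventions for small, medium, and large values of d. The "very large" and "huge" rows below are later extensions to his original three benchmarks:

Effect size (d)	Interpretation	Overlap
0.2	Small	85% overlap between groups
0.5	Medium	67% overlap between groups
0.8	Large	53% overlap between groups
1.2	Very large	38% overlap between groups
2.0	Huge	19% overlap between groups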
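
The overlap column appears consistent with 1 - U₁, where U₁ is Cohen's non-overlap measure for two normal distributions with equal variances. The sketch below computes it under that assumption; the function name `overlap_from_d` is illustrative, and other definitions of overlap give different percentages.

```python
from statistics import NormalDist

def overlap_from_d(d):
    """Overlap as 1 - U1 for two equal-variance normal distributions."""
    phi = NormalDist().cdf(abs(d) / 2)  # Cohen's U2
    u1 = (2 * phi - 1) / phi            # Cohen's U1 (proportion of non-overlap)
    return 1 - u1

for d in (0.2, 0.5, 0.8, 1.2, 2.0):
    print(f"d = {d}: overlap is about {overlap_from_d(d) * 100:.1f}%")
```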

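Context Matters

These are rough guidelines, not absolute rules. In some fields d = 0.2 can be highly meaningful (for example, reducing heart attack risk), while in others d = 0.8 may be expected (for example, tutoring versus no tutoring).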

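Worked Example: An Educational Intervention

A school tests a new reading program. Control group (n = 25): mean = 72, standard deviation = 12. Treatment group (n = 30): mean = 79, standard deviation = 14. Calculate Cohen's d:

Calculate the pooled variance

s_p² = [(25-1)(12)² + (30-1)(14)²] / (25+30-2) = [24×144 + 29×196] / 53 = [3456 + 5684] / 53 = 9140 / 53 ≈ 172.45

Calculate the pooled standard deviation

s_p = √172.45 ≈ 13.13

Calculate Cohen's d

d = (79 - 72) / 13.13 = 7 / 13.13 ≈ 0.53

Interpretation

A medium effect size (d = 0.53). Treatment-group scores are about half a standard deviation higher than control-group scores.

This means that if you pick one student at random from each group, the probability that the treatment-group student has the higher score is about 65%, assuming roughly normal score distributions with equal variances (Φ(d/√2) ≈ 0.65).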
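
The arithmetic above can be checked from the summary statistics alone. The sketch below assumes only the means, standard deviations, and group sizes given in the example. The helper name `cohens_d_from_stats` is illustrative, and the probability-of-superiority line relies on the normal, equal-variance assumption.

```python
import math
from statistics import NormalDist

def cohens_d_from_stats(m1, s1, n1, m2, s2, n2):
    """Cohen's d from group means, sample SDs, and sizes (pooled SD denominator)."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

# Treatment group first, control group second.
d = cohens_d_from_stats(79, 14, 30, 72, 12, 25)

# Probability that a random treatment score exceeds a random control score,
# assuming normal distributions with equal variances.
prob_superiority = NormalDist().cdf(d / math.sqrt(2))

print(f"d = {d:.2f}")
print(f"probability of superiority = {prob_superiority:.2f}")
```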

Python Implementation

Compute Cohen's d and Hedges' g programmatically:

python

import numpy as np

def cohens_d(group1, group2):
    """Calculate Cohen's d for two independent groups."""
    n1, n2 = len(group1), len(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)

    # Pooled standard deviation
    pooled_std = np.sqrt(((n1-1)*var1 + (n2-1)*var2) / (n1+n2-2))

    # Cohen's d
    d = (np.mean(group1) - np.mean(group2)) / pooled_std
    return d

def hedges_g(group1, group2):
    """Calculate Hedges' g (bias-corrected effect size)."""
    n1, n2 = len(group1), len(group2)
    d = cohens_d(group1, group2)

    # Correction factor for small sample bias
    correction = 1 - 3 / (4*(n1+n2) - 9)
    return d * correction

# Example usage
control = [68, 72, 75, 70, 69, 74, 71, 73, 76, 72]
treatment = [75, 79, 82, 78, 80, 77, 81, 76, 83, 79]

d = cohens_d(treatment, control)
g = hedges_g(treatment, control)
print(f"Cohen's d: {d:.3f}")
print(f"Hedges' g: {g:.3f}")
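
A confidence interval is often reported alongside the point estimate. The sketch below uses a common large-sample approximation for the standard error of d, SE ≈ √((n₁+n₂)/(n₁n₂) + d²/(2(n₁+n₂))), together with a normal critical value. This is an approximation rather than an exact interval, and the function name `cohens_d_ci` is illustrative.

```python
import math
from statistics import NormalDist

def cohens_d_ci(d, n1, n2, confidence=0.95):
    """Approximate confidence interval for Cohen's d (large-sample normal approximation)."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return d - z * se, d + z * se

# Values from the reading-program example: d = 0.53 with groups of 30 and 25.
low, high = cohens_d_ci(0.53, 30, 25)
print(f"95% CI for d: ({low:.2f}, {high:.2f})")
```

With groups this small the interval is wide, which is a reminder that a medium point estimate can still be compatible with a much smaller or larger true effect.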

Reading goal	What to focus on	Common mistake
Definition	What the metric is and what quantity it summarizes	Treating the formula as self-explanatory
Formula choice	Sample versus population assumptions and notation	Using n when n-1 is required or vice versa
Interpretation	Whether the result indicates concentration, spread, or risk	Calling a large value good or bad without context
