Advanced · 14 min

Hypothesis Testing with Standard Deviation

Q: How should I interpret a high standard deviation?

A high standard deviation means the observations are spread farther from the mean on average. Whether that spread is acceptable depends on the context: wide dispersion might signal risk in finance, instability in manufacturing, or genuine natural variation in scientific data.

Q: Why do some articles mention n while others mention n-1?

The denominator reflects the difference between population and sample formulas. Population variance and population standard deviation use N because the full dataset is known. Sample variance and sample standard deviation often use n-1 because Bessel’s correction reduces bias when estimating population spread from a sample.

Q: What is a statistical interpretation guide?

A statistical interpretation guide is a page that moves beyond arithmetic and explains meaning. It tells you what a metric is, when the formula applies, and how to describe the result in plain English without overstating certainty.

Q: Can I cite this article in a report?

You should cite the underlying authoritative reference for formal work whenever possible. This page is best used as an explanatory bridge that helps you understand the concept before quoting the original standard or handbook.

Q: Why include direct citations on every article page?

Direct citations give readers a route to verify the definition, notation, and assumptions. That improves trust and reduces the chance that a simplified explanation is mistaken for the entire technical standard.

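Learn how standard deviation is used in hypothesis testing. This article explains the t-test, the Z-test, and how to judge statistical significance.

Overview

Hypothesis testing is a statistical method for making decisions about a population based on sample data. Standard deviation plays a central role in judging whether an observed difference is statistically significant or simply due to chance.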

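State the hypotheses

Set up the null hypothesis (H₀) and the alternative hypothesis (H₁).

Choose a significance level

Choose the significance level (α), typically 0.05.

Compute the test statistic

Compute the test statistic using the standard deviation.

Compare with the critical value

Compare the statistic with the critical value, or compute the p-value.

Make a decision

Decide whether to reject or fail to reject H₀.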

Z-test

Use the Z-test when the population standard deviation (σ) is known and the sample is large (n ≥ 30 as a rule of thumb).

Z-test statistic

z = (x̄ - μ₀) / (σ / √n)

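Example

A manufacturer claims that its batteries last 100 hours on average (μ₀ = 100). Testing 36 batteries gives x̄ = 98 hours, and σ = 12 hours:

z = (98 - 100) / (12 / √36) = -2 / 2 = -1

With z = -1 and α = 0.05 (two-sided test), the critical values are ±1.96. Because |z| = 1 is less than 1.96, we fail to reject H₀. The difference is not statistically significant.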
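The battery example can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `z_test` and its parameters are illustrative choices, not part of the original article.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample Z-test; returns (z, p_value, reject_h0)."""
    se = sigma / sqrt(n)                           # standard error of the mean
    z = (sample_mean - mu0) / se                   # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return z, p_value, p_value < alpha

# Values from the battery example: x̄ = 98, μ₀ = 100, σ = 12, n = 36
z, p_value, reject_h0 = z_test(sample_mean=98, mu0=100, sigma=12, n=36)
print(f"z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject_h0}")
# prints: z = -1.00, p = 0.317, reject H0: False
```

Comparing the p-value with α leads to the same decision as comparing |z| with the critical value 1.96.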

t-test

Use the t-test when the population standard deviation is unknown and must be estimated from the sample (s is used in place of σ).

t-test statistic

t = (x̄ - μ₀) / (s / √n)

Choosing between the t-test and the Z-test

- Z-test: σ is known and n ≥ 30
- t-test: σ is unknown (s is used); applicable at any sample size

In practice, the true population σ is rarely known, so the t-test is far more common.
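The t statistic can be computed in the same way once s is estimated from the data. The sketch below uses hypothetical battery lifetimes (the data are invented for illustration) and assumes a two-sided critical value of 2.262 for α = 0.05 with 9 degrees of freedom, as found in a standard t-table. The standard library has no t distribution, so the decision here uses the critical value instead of a p-value.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical battery lifetimes in hours (illustrative data, not from the article)
lifetimes = [98, 102, 95, 101, 97, 99, 96, 100, 94, 98]
mu0 = 100                              # claimed mean under H0

n = len(lifetimes)
x_bar = mean(lifetimes)                # sample mean
s = stdev(lifetimes)                   # sample SD (n - 1 denominator)
t = (x_bar - mu0) / (s / sqrt(n))      # t statistic

# Two-sided critical value for alpha = 0.05, df = n - 1 = 9 (from a t-table)
t_critical = 2.262
reject_h0 = abs(t) > t_critical
print(f"t = {t:.3f}, reject H0: {reject_h0}")
# prints: t = -2.449, reject H0: True
```

Note that `statistics.stdev` uses the n - 1 denominator, which matches the sample standard deviation s in the formula.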

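Standard error

The standard error (SE) measures how far a sample mean typically falls from the population mean. It is the key concept linking standard deviation to hypothesis testing.

Standard error of the mean

SE = σ / √n (or s / √n when using the sample SD)

The standard error decreases as the sample size grows. Larger samples give more precise estimates and make real differences easier to detect.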
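A quick calculation makes the effect of sample size concrete. This sketch reuses σ = 12 from the battery example; the sample sizes are arbitrary illustrative choices.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / sqrt(n)

sigma = 12
for n in (9, 36, 144):
    print(f"n = {n:3d}, SE = {standard_error(sigma, n):.1f}")
# prints SE = 4.0, 2.0, 1.0 for n = 9, 36, 144
```

Because of the square root, halving the standard error requires four times as many observations.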

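Statistical significance

A result is called statistically significant when the p-value, the probability of observing a result at least as extreme as the one obtained if H₀ were true, is below the chosen threshold (α).

When p-value < α

Reject H₀. The result is statistically significant.

When p-value ≥ α

Fail to reject H₀. The result could be due to chance.

Statistical significance vs practical significance

A statistically significant result is not necessarily practically important. In very large samples, even a tiny difference can come out as “significant” yet be meaningless in practice. Always consider the effect size alongside the p-value.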
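The gap between statistical and practical significance can be illustrated numerically. The sketch below uses hypothetical numbers (a 0.1-hour difference measured on one million units) and, as an assumption, reports the effect size as a standardized mean difference in the style of Cohen's d; the article does not prescribe a particular effect-size measure.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values chosen for illustration
mu0, sigma = 100, 12
sample_mean, n = 100.1, 1_000_000

z = (sample_mean - mu0) / (sigma / sqrt(n))        # Z statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # two-sided p-value
effect_size = (sample_mean - mu0) / sigma          # standardized mean difference

print(f"z = {z:.2f}, significant at 0.05: {p_value < 0.05}")
print(f"effect size = {effect_size:.4f}")
```

Here the test rejects H₀ decisively, yet the standardized difference is under 0.01, far below the 0.2 that Cohen's conventions label a small effect.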

How to Read This Article

A statistics tutorial is a practical interpretation guide, not just a formula dump. It refers to the assumptions, notation, and reporting language that analysts need when they explain a result to a teacher, manager, client, or reviewer. The article body covers the specific topic, while the sections below create a common interpretation frame that readers can reuse across related metrics.

Reading goal	What to focus on	Common mistake
Definition	What the metric is and what quantity it summarizes	Treating the formula as self-explanatory
Formula choice	Sample versus population assumptions and notation	Using n when n-1 is required or vice versa
Interpretation	Whether the result indicates concentration, spread, or risk	Calling a large value good or bad without context
