Welch's t-test explained

In statistics, Welch's t-test, or unequal variances t-test, is a two-sample location test used to test the (null) hypothesis that two populations have equal means. It is named for its creator, Bernard Lewis Welch, and is an adaptation of Student's t-test^[1] that is more reliable when the two samples have unequal variances and possibly unequal sample sizes.^[2] ^[3] These tests are often referred to as "unpaired" or "independent samples" t-tests, as they are typically applied when the statistical units underlying the two samples being compared are non-overlapping. Because Welch's t-test has been less popular than Student's t-test^[2] and may be less familiar to readers, a more informative name is "Welch's unequal variances t-test", or "unequal variances t-test" for brevity.^[3]

Assumptions

Student's t-test assumes that the sample means being compared for two populations are normally distributed, and that the populations have equal variances. Welch's t-test is designed for unequal population variances, but the assumption of normality is maintained.^[1] Welch's t-test is an approximate solution to the Behrens–Fisher problem.

Calculations

Welch's t-test defines the statistic t by the following formula:

	t = \frac{\Delta\overline{X}}{s_{\Delta\bar{X}}} = \frac{\overline{X}_1 - \overline{X}_2}{\sqrt{s_{\bar{X}_1}^2 + s_{\bar{X}_2}^2}}

	s_{\bar{X}_i} = \frac{s_i}{\sqrt{N_i}}

where \overline{X}_i and s_{\bar{X}_i} are the i-th sample mean and its standard error, with s_i denoting the corrected sample standard deviation and N_i the sample size. Unlike in Student's t-test, the denominator is not based on a pooled variance estimate.
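As a minimal sketch of the statistic defined above, the following Python function computes t from two samples using only the standard library. The function name `welch_t` and the sample values are illustrative assumptions, not part of any library.

```python
import math
import statistics


def welch_t(x, y):
    """Welch's t statistic for two independent samples (illustrative helper)."""
    # Squared standard error of each mean: s_i^2 / N_i, where s_i^2 is the
    # corrected sample variance (n - 1 in the denominator).
    se2_x = statistics.variance(x) / len(x)
    se2_y = statistics.variance(y) / len(y)
    # The denominator adds the two squared standard errors; nothing is pooled.
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2_x + se2_y)


# Made-up example data.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]
print(welch_t(a, b))  # approximately -1.897
```

Swapping the two samples only flips the sign of the statistic.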

The degrees of freedom \nu associated with the unpooled variance estimate are approximated using the Welch–Satterthwaite equation:^[4]

	\nu \approx \frac{\left(\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}\right)^2}{\frac{s_1^4}{N_1^2 \nu_1} + \frac{s_2^4}{N_2^2 \nu_2}}

This expression can be simplified when N_1 = N_2:

	\nu \approx \frac{s_{\Delta\bar{X}}^4}{\nu_1^{-1} s_{\bar{X}_1}^4 + \nu_2^{-1} s_{\bar{X}_2}^4}

Here, \nu_i = N_i - 1 is the degrees of freedom associated with the i-th variance estimate.
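For equal sample sizes the approximation takes an especially compact form. As a sketch, write N for the common sample size (a symbol introduced here for illustration), so that \nu_1 = \nu_2 = N - 1 and each squared standard error is s_i^2 / N:

```latex
\nu \approx \frac{\left(\frac{s_1^2}{N} + \frac{s_2^2}{N}\right)^2}
                 {\frac{s_1^4}{N^2 (N - 1)} + \frac{s_2^4}{N^2 (N - 1)}}
    = (N - 1)\,\frac{\left(s_1^2 + s_2^2\right)^2}{s_1^4 + s_2^4}
```

If, in addition, s_1 = s_2, this gives \nu \approx 2(N - 1) = 2N - 2, the degrees of freedom used by Student's t-test. If one variance dominates the other, \nu approaches N - 1.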

The statistic approximately follows the t-distribution because the distribution of the variance estimate in the denominator is approximated by a scaled chi-square distribution. The approximation works better when both N_1 and N_2 are larger than 5.^[5] ^[6]
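The Welch–Satterthwaite approximation for \nu described in this section can be sketched in Python as follows. The helper name `welch_satterthwaite_df` and the data are assumptions made for illustration; only the standard library is used.

```python
import statistics


def welch_satterthwaite_df(x, y):
    """Approximate degrees of freedom for Welch's t-test (illustrative helper)."""
    n1, n2 = len(x), len(y)
    v1 = statistics.variance(x) / n1  # s_1^2 / N_1
    v2 = statistics.variance(y) / n2  # s_2^2 / N_2
    # Numerator: square of the summed variance-of-the-mean terms.
    # Denominator: each term squared and divided by its own nu_i = N_i - 1.
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Made-up example data.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]
nu = welch_satterthwaite_df(a, b)
print(nu)  # approximately 5.88
```

The result is generally not an integer. It always lies between the smaller of N_1 - 1 and N_2 - 1 and the pooled value N_1 + N_2 - 2.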

Statistical test

Once t and \nu have been computed, these statistics can be used with the t-distribution to test one of two possible null hypotheses:

that the two population means are equal, in which case a two-tailed test is applied; or
that one of the population means is greater than or equal to the other, in which case a one-tailed test is applied.

The approximate degrees of freedom are real numbers \left(\nu \in \mathbb{R}^+\right) and are used as such in statistics-oriented software, whereas they are rounded down to the nearest integer in spreadsheets.
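The following sketch shows how p-values for the two kinds of null hypothesis could be obtained from t and a fractional \nu, and how rounding \nu down changes the result. To stay dependency-free it integrates the t density numerically. The helper `t_sf` and the input values are assumptions for illustration, and in practice a library routine such as SciPy's `scipy.stats.t.sf` would normally be used instead.

```python
import math


def t_sf(t, nu, steps=2000):
    """Upper-tail probability P(T > t) for Student's t with nu degrees of freedom.

    Integrates the density over [0, |t|] with Simpson's rule; nu may be fractional.
    """
    log_c = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2) - 0.5 * math.log(nu * math.pi)

    def pdf(x):
        return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    tail = 0.5 - total * h / 3  # P(T > |t|), using symmetry about zero
    return tail if t >= 0 else 1.0 - tail


t_stat, nu = -1.8974, 5.8824  # made-up values of t and nu

# Null hypothesis "means are equal": two-tailed test.
p_two_sided = 2 * t_sf(abs(t_stat), nu)
# Null hypothesis "mean 1 >= mean 2": one-tailed test, P(T <= t_stat).
p_one_sided = t_sf(-t_stat, nu)
# Same two-tailed test with nu rounded down, as spreadsheets do.
p_floor = 2 * t_sf(abs(t_stat), math.floor(nu))

print(p_two_sided, p_one_sided, p_floor)
```

Rounding \nu down gives a heavier-tailed reference distribution, so the resulting p-value is somewhat larger than the one based on the fractional \nu.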

Advantages and limitations

Welch's t-test is more robust than Student's t-test and maintains type I error rates close to nominal for unequal variances and for unequal sample sizes under normality. Furthermore, the power of Welch's t-test comes close to that of Student's t-test, even when the population variances are equal and sample sizes are balanced.^[2] Welch's t-test can be generalized to more than two samples,^[7] and this generalization is more robust than one-way analysis of variance (ANOVA).
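The claim about type I error rates can be explored with a small simulation. The sketch below draws both samples from normal populations with equal means, so every rejection is a false positive, and counts how often each test rejects at the 5% level. All settings (sample sizes 10 and 40, standard deviations 3 and 1, 2000 replications, seed 0) and the helper names are arbitrary assumptions, and the p-values come from a simple numerical integration rather than a statistics library.

```python
import math
import random
import statistics


def t_sf(t, nu, steps=200):
    """P(T > t) for Student's t with (possibly fractional) nu, via Simpson's rule."""
    log_c = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2) - 0.5 * math.log(nu * math.pi)

    def pdf(x):
        return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    tail = 0.5 - total * h / 3
    return tail if t >= 0 else 1.0 - tail


def two_sided_p(x, y, pooled):
    """Two-sided p-value from Student's (pooled=True) or Welch's (pooled=False) test."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    diff = statistics.mean(x) - statistics.mean(y)
    if pooled:
        # Student: pooled variance estimate with N1 + N2 - 2 degrees of freedom.
        nu = n1 + n2 - 2
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / nu
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    else:
        # Welch: separate variance terms and Welch-Satterthwaite degrees of freedom.
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        nu = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return 2 * t_sf(abs(diff / se), nu)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_sim, alpha = 2000, 0.05
rejections = {"student": 0, "welch": 0}
for _ in range(n_sim):
    # Equal means (null hypothesis true); the small sample has the larger variance.
    x = [rng.gauss(0.0, 3.0) for _ in range(10)]
    y = [rng.gauss(0.0, 1.0) for _ in range(40)]
    rejections["student"] += two_sided_p(x, y, pooled=True) < alpha
    rejections["welch"] += two_sided_p(x, y, pooled=False) < alpha

student_rate = rejections["student"] / n_sim
welch_rate = rejections["welch"] / n_sim
print(f"Student: {student_rate:.3f}  Welch: {welch_rate:.3f}")
```

In a setting like this, where the smaller sample has the larger variance, Student's test would be expected to reject a true null hypothesis far more often than the nominal 5%, while Welch's test should stay close to it. The exact rates depend on the seed and the chosen settings.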

It is not recommended to pre-test for equal variances and then choose between Student's t-test and Welch's t-test.^[8] Rather, Welch's t-test can be applied directly, without any substantial disadvantage relative to Student's t-test, as noted above. Welch's t-test remains robust for skewed distributions when sample sizes are large.^[9] Reliability decreases for skewed distributions and smaller samples, where one could possibly perform Welch's t-test.^[10]

Software implementations

Language/Program	Function	Documentation
LibreOffice	`TTEST(Data1; Data2; Mode; Type)`	^[11]
MATLAB	`ttest2(data1, data2, 'Vartype', 'unequal')`	^[12]
Microsoft Excel pre 2010 (Student's T Test)	`TTEST(array1, array2, tails, type)`	^[13]
Microsoft Excel 2010 and later (Student's T Test)	`T.TEST(array1, array2, tails, type)`	^[14]
Minitab	Accessed through menu	^[15]
Origin software	Results of the Welch t-test are automatically output in the result sheet when conducting a two-sample t-test (Statistics: Hypothesis Testing: Two-Sample t-test)	^[16]
SAS	Default output from `proc ttest` (labeled "Satterthwaite")
Python (through 3rd-party library SciPy)	`scipy.stats.ttest_ind(a, b, equal_var=False)`	^[17]
R	`t.test(data1, data2, var.equal = FALSE)`	^[18]
JavaScript	`ttest2(data1, data2)`	^[19]
Haskell	`Statistics.Test.StudentT.welchTTest SamplesDiffer data1 data2`	^[20]
JMP	`Oneway(Y(YColumn), X(XColumn), Unequal Variances(1));`	^[21]
Julia	`UnequalVarianceTTest(data1, data2)`	^[22]
Stata	`ttest varname1 == varname2, welch`	^[23]
Google Sheets	`TTEST(range1, range2, tails, type)`	^[24]
GraphPad Prism	It is a choice on the t test dialog.
SPSS	An option in the menu	^[25] ^[26]
GNU Octave	`welch_test(x, y)`	^[27]

Notes and References

1. Welch, B. L. (1947). "The generalization of "Student's" problem when several different population variances are involved". Biometrika. 34 (1–2): 28–35. doi:10.1093/biomet/34.1-2.28. PMID 20287819. MR 19277.
2. Ruxton, G. D. (2006). "The unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test". Behavioral Ecology. 17 (4): 688–690. doi:10.1093/beheco/ark016.
3. Derrick, B.; Toher, D.; White, P. (2016). "Why Welchs test is Type I error robust". The Quantitative Methods for Psychology. 12 (1): 30–38. doi:10.20982/tqmp.12.1.p030.
4. "7.3.1. Do two processes have the same mean?". https://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm
5. Allwood, Michael (2008). "The Satterthwaite Formula for Degrees of Freedom in the Two-Sample t-Test". p. 6.
6. Yates; Moore; Starnes (2008). The Practice of Statistics (3rd ed.). New York: W.H. Freeman and Company. p. 792. ISBN 9780716773092.
7. Welch, B. L. (1951). "On the Comparison of Several Mean Values: An Alternative Approach". Biometrika. 38 (3/4): 330–336. doi:10.2307/2332579. JSTOR 2332579.
8. Zimmerman, D. W. (2004). "A note on preliminary tests of equality of variances". British Journal of Mathematical and Statistical Psychology. 57 (Pt 1): 173–181. doi:10.1348/000711004849222. PMID 15171807.
9. Fagerland, M. W. (2012). "t-tests, non-parametric tests, and large studies—a paradox of statistical practice?". BMC Medical Research Methodology. 12: 78. doi:10.1186/1471-2288-12-78. PMC 3445820. PMID 22697476.
10. Fagerland, M. W.; Sandvik, L. (2009). "Performance of five two-sample location tests for skewed distributions with unequal variances". Contemporary Clinical Trials. 30 (5): 490–496. doi:10.1016/j.cct.2009.06.007. PMID 19577012.
11. "Statistical Functions Part Five - LibreOffice Help".
12. "Two-sample t-test - MATLAB ttest2 - MathWorks United Kingdom".
13. "TTEST - Excel - Microsoft Office". office.microsoft.com. Archived from the original on 2010-06-13: https://web.archive.org/web/20100613052612/http://office.microsoft.com/en-us/excel-help/ttest-HP005209325.aspx
14. "T.TEST function".
15. "Overview for 2-Sample t - Minitab". https://support.minitab.com/en-us/minitab/18/help-and-how-to/statistics/basic-statistics/how-to/2-sample-t/before-you-start/overview/
16. "Help Online - Quick Help - FAQ-314 Does Origin supports Welch's t-test?". www.originlab.com. 2023-11-09.
17. "Scipy.stats.ttest_ind — SciPy v1.7.1 Manual".
18. "R: Student's t-Test".
19. "JavaScript npm: @stdlib/stats-ttest2".
20. "Statistics.Test.StudentT".
21. "Index of /Support/Help".
22. "Welcome to Read the Docs — HypothesisTests.jl latest documentation".
23. "Stata 17 help for ttest".
24. "T.TEST - Docs Editors Help".
25. Jeremy Miles: "Unequal variances t-test or U Mann-Whitney test?". Accessed 2014-04-11.
26. "One-Sample Test". https://www.ibm.com/support/knowledgecenter/SSLVMB_24.0.0/spss/base/syn_t-test_examples.html#syn_t-test_examples
27. "Function Reference: Welch_test".