Classification of m hypothesis tests
The following table classifies the possible outcomes when testing m null hypotheses:

| | Null hypothesis is True (H0) | Alternative hypothesis is True (H1) | Total |
|---|---|---|---|
| Declared significant | V | S | R |
| Declared non-significant | U | T | m − R |
| Total | m0 | m − m0 | m |
- m is the total number of hypotheses tested
- m0 is the number of true null hypotheses
- m − m0 is the number of true alternative hypotheses
- V is the number of false positives (Type I error) (also called "false discoveries")
- S is the number of true positives (also called "true discoveries")
- T is the number of false negatives (Type II error)
- U is the number of true negatives
- R = V + S is the number of rejected null hypotheses (also called "discoveries")
- In m hypothesis tests of which m0 are true null hypotheses, R is an observable random variable, and S, T, U, and V are unobservable random variables.
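The relationships among these counts can be illustrated with a small simulation. The p-value distributions below are made up for illustration (uniform under the null, pushed toward zero under the alternative); only the accounting identities matter:

```python
import random

random.seed(0)

m, m0 = 1000, 800          # hypothetical totals: m tests, m0 true nulls
alpha = 0.05

# Simulate p-values: uniform under the null, shifted toward 0 under the alternative.
p_null = [random.random() for _ in range(m0)]
p_alt = [random.random() ** 4 for _ in range(m - m0)]  # toy alternative distribution

V = sum(p < alpha for p in p_null)       # false positives (Type I errors)
S = sum(p < alpha for p in p_alt)        # true positives
U = m0 - V                               # true negatives
T = (m - m0) - S                         # false negatives
R = V + S                                # rejected nulls ("discoveries")

print(V, S, U, T, R)
```

Only R (and the split into declared significant/non-significant) is observable in practice; the simulation can report V, S, T, U because it knows which nulls are true.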
False discovery rate (FDR)
Based on the previous definitions we can define the false discovery proportion Q = V/R (with Q = 0 when R = 0) and the false discovery rate FDR = E[Q], and one wants to keep this value below a threshold q.
Material to review: http://brainder.org/2011/09/05/fdr-corrected-fdr-adjusted-p-values/
Familywise error rate (FWER)
The FWER is the probability of making even one type I error in the family: FWER = Pr(V ≥ 1).

A procedure controls the FWER in the weak sense if FWER control at level α is guaranteed only when all null hypotheses are true (i.e., when m0 = m).

A procedure controls the FWER in the strong sense if FWER control at level α is guaranteed for any configuration of true and non-true null hypotheses (including the case m0 = m).
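As a concrete illustration, the Bonferroni correction controls the FWER at level α in the strong sense by testing each of the m hypotheses at level α/m. A minimal sketch in plain Python, with hypothetical p-values:

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H0_i iff p_i <= alpha / m; controls FWER at alpha in the strong sense."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

# Hypothetical p-values for 10 tests: only p = 0.001 <= 0.05/10 = 0.005 survives.
pvals = [0.001, 0.01, 0.02, 0.04, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9]
print(bonferroni_reject(pvals))
```

Note that the uncorrected tests would reject the four p-values below 0.05; the correction keeps only the first, which is the price paid for FWER control.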
False discovery rate (FDR):
FDR procedures are designed to control the expected proportion of incorrectly rejected null hypotheses ("false discoveries").[1] FDR controlling procedures exert a less stringent control over false discovery compared to familywise error rate (FWER) procedures (such as the Bonferroni correction), which seek to reduce the probability of even one false discovery, as opposed to the expected proportion of false discoveries. Thus FDR procedures have greater power at the cost of increased rates of type I errors, i.e., rejecting the null hypothesis of no effect when it should fail to be rejected.
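The classic FDR-controlling procedure is the Benjamini–Hochberg step-up procedure: sort the p-values, find the largest k with p_(k) ≤ (k/m)q, and reject the k smallest. A minimal plain-Python sketch, with hypothetical p-values:

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure: controls FDR at level q
    for independent (or positively dependent) test statistics."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank k with p_(k) <= (k/m) * q
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * q:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject

# Hypothetical p-values: BH rejects four; Bonferroni (p <= 0.05/6) would reject only two.
print(benjamini_hochberg([0.001, 0.004, 0.019, 0.03, 0.2, 0.5]))
```

The contrast in the example shows the power gain described above: the less stringent FDR criterion admits discoveries that the FWER-controlling Bonferroni threshold would refuse.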
Post-hoc testing of ANOVAs
Multiple comparison procedures are commonly used in an analysis of variance after obtaining a significant omnibus test result, like the ANOVA F-test. The significant ANOVA result suggests rejecting the global null hypothesis H0 that the means are the same across the groups being compared. Multiple comparison procedures are then used to determine which means differ. In a one-way ANOVA involving K group means, there are K(K − 1)/2 pairwise comparisons. A number of methods have been proposed for this problem, some of which are:
- Single-step procedures
- Tukey–Kramer method (Tukey's HSD) (1951)
- Scheffé's method (1953)
- Rodger's method (precludes type 1 error rate inflation, using a decision-based error rate)
- Multi-step procedures based on Studentized range statistic
- Duncan's new multiple range test (1955)
- The Nemenyi test is similar to Tukey's range test in ANOVA.
- The Bonferroni–Dunn test allows comparisons while controlling the familywise error rate.
- Student–Newman–Keuls post-hoc analysis
- Dunnett's test (1955) for comparison of a number of treatments to a single control group.
If the variances of the groups being compared are similar, the Tukey–Kramer method is generally viewed as performing optimally or near-optimally in a broad variety of circumstances.[8] The situation where the variances of the groups being compared differ is more complex, and different methods perform well in different circumstances.
The Kruskal–Wallis test is the non-parametric alternative to ANOVA. Multiple comparisons can be done using pairwise comparisons (for example using Wilcoxon rank sum tests) and using a correction to determine if the post-hoc tests are significant (for example a Bonferroni correction).
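A sketch of that correction step, assuming the K(K − 1)/2 pairwise rank-sum p-values have already been computed (the group names and p-values below are hypothetical):

```python
from itertools import combinations

# Hypothetical p-values from pairwise Wilcoxon rank-sum tests among K = 4 groups.
pairwise_p = {("A", "B"): 0.003, ("A", "C"): 0.04, ("A", "D"): 0.20,
              ("B", "C"): 0.01, ("B", "D"): 0.35, ("C", "D"): 0.60}

groups = ["A", "B", "C", "D"]
alpha = 0.05
n_pairs = len(list(combinations(groups, 2)))  # K(K-1)/2 = 6 comparisons
threshold = alpha / n_pairs                   # Bonferroni-corrected threshold

# Keep only the pairs that survive the corrected threshold 0.05/6.
significant = [pair for pair, p in pairwise_p.items() if p <= threshold]
print(significant)
```

Without the correction, three pairs would be called significant at α = 0.05; with it, only the pair whose p-value falls below 0.05/6 remains.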
Holm–Bonferroni method
In statistics, the Holm–Bonferroni method[1] is a method used to counteract the problem of multiple comparisons. It is intended to control the familywise error rate and offers a simple test uniformly more powerful than the Bonferroni correction. It is one of the earliest uses of stepwise algorithms in simultaneous inference.

Hochberg correction
Hommel correction
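Returning to the Holm–Bonferroni method described above: it tests the sorted p-values p_(1) ≤ … ≤ p_(m) against the thresholds α/m, α/(m − 1), …, α, stopping at the first failure. A minimal plain-Python sketch, with hypothetical p-values:

```python
def holm_bonferroni(pvalues, alpha=0.05):
    """Holm step-down procedure: compare the k-th smallest p-value to
    alpha / (m - k + 1) and stop at the first non-rejection;
    controls the FWER at alpha in the strong sense."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for k, i in enumerate(order):          # k = 0 .. m-1
        if pvalues[i] <= alpha / (m - k):  # thresholds alpha/m, alpha/(m-1), ...
            reject[i] = True
        else:
            break                          # step-down: stop at first failure
    return reject

# Hypothetical p-values: the two smallest survive, then the procedure stops.
print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
```

Because every Holm threshold is at least as large as the single Bonferroni threshold α/m, the procedure rejects everything Bonferroni rejects and possibly more, which is the sense in which it is uniformly more powerful.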