10 Multiple Hypothesis Correction

TA 9

Bonferroni correction for multiple hypotheses

10.1 Bonferroni correction for multiple hypotheses

The Bonferroni correction is a multiple-comparison correction used when several dependent or independent statistical tests are performed simultaneously: a significance level α that is appropriate for each individual comparison is not appropriate for the set of all comparisons. To avoid many spurious positives, α must be lowered to account for the number of comparisons being performed; with m comparisons, each individual test is carried out at level α/m.
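To make the argument concrete, the minimal sketch below computes the probability of at least one false positive among m tests, 1 − (1 − α)^m, with and without the Bonferroni threshold α/m. It assumes the tests are independent and that every null hypothesis is true; the function names are illustrative and do not come from any library.

```python
def family_wise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive among m independent tests
    when every null hypothesis is true (independence is an assumption)."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / m


alpha = 0.05
for m in (1, 10, 100):
    uncorrected = family_wise_error_rate(alpha, m)
    corrected = family_wise_error_rate(bonferroni_threshold(alpha, m), m)
    print(f"m={m:>3}  uncorrected={uncorrected:.3f}  with Bonferroni={corrected:.3f}")
```

Without correction the family-wise error rate grows quickly with m, while testing each hypothesis at α/m keeps it at or below α.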

In multiple hypothesis testing there are two kinds of errors that must be considered:

1. Type I error (false positive): the rejection of a true null hypothesis (example: “an innocent person is convicted”).
2. Type II error (false negative): the non-rejection of a false null hypothesis (example: “a guilty person is not convicted”).

https://mathworld.wolfram.com/BonferroniCorrection.html

10.1.1 Example from gene editing

Background: A researcher analyzes thousands of genes to identify “differentially expressed genes” between two groups (e.g., normal vs. treated), which could alter the biological mechanisms of a species in response to a particular treatment.

If we analyze 10,000 results at a significance level of 0.05 and every null hypothesis is true, we should expect about 10,000 × 0.05 = 500 false positives.
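The arithmetic above can be checked with a small simulation. This is a sketch under the assumption that all 10,000 null hypotheses are true, so the p-values are uniformly distributed on [0, 1); the seed and variable names are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

n_tests = 10_000
alpha = 0.05

# Under a true null hypothesis, p-values are uniformly distributed on [0, 1).
null_p_values = rng.random(n_tests)

expected_false_positives = n_tests * alpha
observed_false_positives = int(np.sum(null_p_values < alpha))

print(f"expected: {expected_false_positives:.0f}")
print(f"observed in this simulation: {observed_false_positives}")
```

The observed count fluctuates around the expected value from run to run, but it stays in the hundreds, which is why an uncorrected threshold is not usable at this scale.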

To control false discoveries in multiple hypothesis testing, the significance level (α) must be adjusted to reduce the probability of making a Type I error.

https://www.reneshbedre.com/blog/multiple-hypothesis-testing-corrections.html

Import packages

import numpy as np
from statsmodels.stats.multitest import multipletests

Generate 500 random p-values

rand_num = np.random.random(500)

Set alpha value to 0.05

# alpha value
alpha = 0.05

What would happen if we didn’t correct the alpha value?

# without correction, how many times would we reject the null hypothesis?
print(np.sum(rand_num < alpha))

With Bonferroni correction

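# multipletests returns (reject, corrected p-values, Sidak alpha, Bonferroni alpha)
reject, pvals_corrected, _, _ = multipletests(pvals=rand_num, alpha=alpha, method='bonferroni')
# number of null hypotheses rejected after Bonferroni correction
print(np.sum(pvals_corrected < alpha))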
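To show what the Bonferroni adjustment does to each p-value, the sketch below recomputes it by hand (multiply by the number of tests, cap at 1) and compares the result with `multipletests`. The seed and the variable names are illustrative assumptions; the p-values are random, as in the cells above.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(seed=1)  # arbitrary seed, only for reproducibility
p_values = rng.random(500)
alpha = 0.05
m = len(p_values)

# Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1.
manual_adjusted = np.minimum(p_values * m, 1.0)

reject, library_adjusted, _, alpha_bonferroni = multipletests(
    pvals=p_values, alpha=alpha, method="bonferroni"
)

print(np.allclose(manual_adjusted, library_adjusted))  # manual and library agree
print(alpha_bonferroni, alpha / m)                     # corrected per-test alpha
print(int(reject.sum()))                               # hypotheses still rejected
```

Comparing an adjusted p-value against α is equivalent to comparing the raw p-value against α/m, so either view can be used to decide which hypotheses to reject.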

10.2 Benjamini-Hochberg correction for multiple hypotheses

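# Benjamini-Hochberg controls the false discovery rate (FDR) rather than the family-wise error rate
reject, pvals_corrected, _, _ = multipletests(pvals=rand_num, alpha=alpha, method='fdr_bh')
# number of null hypotheses rejected after Benjamini-Hochberg correction
print(np.sum(pvals_corrected < alpha))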
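The Benjamini-Hochberg procedure sorts the m p-values in ascending order and rejects every hypothesis up to the largest rank k with p_(k) ≤ (k/m)·α, which controls the false discovery rate instead of the probability of any false positive. The following is a minimal NumPy sketch of that step-up rule expressed through adjusted p-values; the function name `benjamini_hochberg` and the ten example p-values are made up for illustration and are not taken from the notes or from statsmodels.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return (reject, adjusted_p_values) for the Benjamini-Hochberg procedure."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                        # indices that sort p ascending
    ranked = p[order] * m / np.arange(1, m + 1)  # p_(k) * m / k for rank k
    # Enforce monotonicity: running minimum from the largest rank downwards.
    adjusted_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted_sorted = np.minimum(adjusted_sorted, 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted            # restore the original order
    return adjusted <= alpha, adjusted


# Illustrative p-values, not real data.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
reject, adjusted = benjamini_hochberg(p_values, alpha=0.05)
print(np.round(adjusted, 4))
print(int(reject.sum()))
```

For these ten values only the two smallest remain below 0.05 after adjustment, even though five raw p-values are below 0.05.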