Classical multiple testing theory assumes that the null distribution is known, which is often too stringent an assumption for today's large-scale experiments. This paper presents theoretical foundations for understanding the limitations caused by not knowing the null distribution, and how it can be properly learned from the same data set, when possible. In that setting, an oracle procedure is the Benjamini-Hochberg procedure applied with the true (unknown) null distribution; we aim to build a procedure that asymptotically mimics the oracle's performance (AMO for short). For a Gaussian null, our main result states that an AMO procedure exists if and only if the sparsity parameter k (the number of false nulls) is of order less than n/log(n), where n is the total number of tests.
This is joint work with Nicolas Verzelen; see https://arxiv.org/abs/1912.03109.
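
To make the oracle in the abstract concrete, here is a minimal sketch of the Benjamini-Hochberg step-up procedure applied with a given Gaussian null N(mu, sigma^2). It is an illustration under stated assumptions, not the procedure from the paper: the function names, the use of two-sided p-values, the level alpha = 0.05, and the toy observations are all hypothetical choices made for this example.

```python
import math


def two_sided_p_value(x, mu=0.0, sigma=1.0):
    """Two-sided p-value of x under a N(mu, sigma^2) null (assumed test form)."""
    z = abs(x - mu) / sigma
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices rejected by the BH step-up procedure."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    # Find the largest 1-based rank r such that p_(r) <= alpha * r / n.
    last = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / n:
            last = rank
    # Reject the hypotheses with the `last` smallest p-values.
    return sorted(order[:last])


def bh_with_gaussian_null(observations, mu, sigma, alpha=0.05):
    """Apply BH using p-values computed from the supplied Gaussian null."""
    p_values = [two_sided_p_value(x, mu, sigma) for x in observations]
    return benjamini_hochberg(p_values, alpha)


if __name__ == "__main__":
    # Hypothetical toy data: six values near 0 and two large deviations.
    data = [0.1, -0.3, 0.5, 4.5, -5.0, 0.2, 0.0, -0.1]
    print(bh_with_gaussian_null(data, mu=0.0, sigma=1.0))
    print(bh_with_gaussian_null(data, mu=0.0, sigma=3.0))
```

Running the same data with two different null scales shows how the set of rejections depends on the null supplied to the procedure, which is why the oracle needs the true null and why learning it from the data matters.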