A general reduction scheme for minimax lower bounds

In the last post, we defined a minimax lower bound as $latex \displaystyle \mathcal{R}^{*}\geq cs_{n}^{2} &fg=000000$ where

$latex {\mathcal{R}^{*}\triangleq\inf_{\hat{f}_{n}}\sup_{f\in\mathcal{F}}\mathbb E\left[d^{2}(\hat{f}_{n},f)\right]}&fg=000000$

and $latex {s_{n}\rightarrow0}&fg=000000$. The difficulty with this definition is that we must take the supremum over a massive set $latex {\mathcal{F}}&fg=000000$ and then the infimum over all possible estimators of $latex {f}&fg=000000$.

For every complex problem, there is a solution that is simple, neat, and wrong. – H. L. Mencken

To get around this, we shall go through a few steps that reduce the problem to a more tractable one. These steps can be summarized as follows:

  1. Reduction to bounds in probability: Let us first notice that, by Markov’s inequality,

    $latex \displaystyle \frac{\mathbb E\left[d^{2}(\hat{f}_{n},f)\right]}{s_{n}^{2}}\geq\mathbb P\left(d(\hat{f}_{n},f)\geq s_{n}\right). &fg=000000$

    Hence, it follows that

    $latex \displaystyle \inf_{\hat{f}_{n}}\sup_{f\in\mathcal{F}}\mathbb E\left[d^{2}(\hat{f}_{n},f)\right]\geq s_{n}^{2}\inf_{\hat{f}_{n}}\sup_{f\in\mathcal{F}}\mathbb P\left(d(\hat{f}_{n},f)\geq s_{n}\right). &fg=000000$

    Therefore, instead of finding lower bounds for the minimax risk $latex {\mathcal{R}^{*}}&fg=000000$ directly, it is enough to find a constant lower bound for

    $latex \displaystyle \inf_{\hat{f}_{n}}\sup_{f\in\mathcal{F}}\mathbb P\left(d(\hat{f}_{n},f)\geq s_{n}\right). &fg=000000$

  2. Reduction to a finite number of hypotheses: Next, we replace the supremum over $latex {\mathcal{F}}&fg=000000$ by a maximum over a finite number of hypotheses. Clearly,

    $latex \displaystyle \inf_{\hat{f}_{n}}\sup_{f\in\mathcal{F}}\mathbb P\left(d(\hat{f}_{n},f)\geq s_{n}\right)\geq\inf_{\hat{f}_{n}}\max_{f\in\left\{ f_{0},\ldots,f_{M}\right\} }\mathbb P\left(d(\hat{f}_{n},f)\geq s_{n}\right) &fg=000000$

    for any finite set $latex {\left\{ f_{0},\ldots,f_{M}\right\} }&fg=000000$ contained in $latex {\mathcal{F}}&fg=000000$. For each problem, these $latex {M+1}&fg=000000$ elements have to be chosen in a suitable way.

  3. Choice of $latex {2s_{n}}&fg=000000$-separated hypotheses: The events

    $latex \displaystyle \left\{ d(\hat{f}_{n},f_{j})\geq s_{n}\right\} ,\qquad j=0,\ldots,M, &fg=000000$

    are still hard to handle directly. To simplify the problem, it is better to bound the error probabilities

    $latex {\mathbb P_{f_{j}}\left(h\neq j\right)}&fg=000000$

    where $latex {h}&fg=000000$ is any measurable test function taking values in $latex {\left\{ 0,\ldots,M\right\} }&fg=000000$, that is, a rule that selects one of the hypotheses after observing the data, and $latex {\mathbb P_{f_{j}}\left(h\neq j\right)}&fg=000000$ is the probability that the test selects the wrong hypothesis when the data are generated under $latex {f_{j}}&fg=000000$.

    To do that, define the minimum-distance test $latex {h^{*}:\mathcal{F}\rightarrow\left\{ 0,\ldots,M\right\} }&fg=000000$ as

    $latex \displaystyle h^{*}(\hat{f}_{n})=\arg\min_{0\leq j\leq M}d(\hat{f}_{n},f_{j}). &fg=000000$

    Now, we need some conditions on the distances between the hypotheses $latex {f_{j}}&fg=000000$ and $latex {f_{k}}&fg=000000$ to pass from $latex {\mathbb P\left(d(\hat{f}_{n},f)\geq s_{n}\right)}&fg=000000$ to $latex {\mathbb P_{f_{j}}\left(h\neq j\right)}&fg=000000$.

    Lemma. Suppose $latex {d(\cdot,\cdot)}&fg=000000$ is a semi-distance and that we have constructed $latex {f_{0},\ldots,f_{M}}&fg=000000$ such that $latex {d(f_{j},f_{k})\geq2s_{n}}&fg=000000$ for all $latex {j\neq k}&fg=000000$. Then $latex {h^{*}(\hat{f}_{n})\neq j}&fg=000000$ implies $latex {d(\hat{f}_{n},f_{j})\geq s_{n}}&fg=000000$.

    Proof: Suppose $latex {h^{*}(\hat{f}_{n})\neq j}&fg=000000$. Then there exists $latex {k\neq j}&fg=000000$ such that $latex {d(\hat{f}_{n},f_{k})\leq d(\hat{f}_{n},f_{j})}&fg=000000$. By the triangle inequality,

    $latex \displaystyle 2s_{n}\leq d(f_{j},f_{k})\leq d(\hat{f}_{n},f_{j})+d(\hat{f}_{n},f_{k})\leq2d(\hat{f}_{n},f_{j}) &fg=000000$

    which gives $latex {d(\hat{f}_{n},f_{j})\geq s_{n}}&fg=000000$. $latex \Box&fg=000000$

    The earlier lemma implies that

    $latex \displaystyle \mathbb P_{f_{j}}(d(\hat{f}_{n},f_{j})\geq s_{n})\geq\mathbb P_{f_{j}}(h^{*}(\hat{f}_{n})\neq j). &fg=000000$

    Finally,

    $latex \displaystyle \begin{array}{rl} \displaystyle\inf_{\hat{f}_{n}}\sup_{f\in\mathcal{F}}\mathbb P_{f}(d(\hat{f}_{n},f)\geq s_{n}) & \displaystyle\geq\inf_{\hat{f}_{n}}\max_{0\leq j\leq M}\mathbb P_{f_{j}}(d(\hat{f}_{n},f_{j})\geq s_{n})\\ &\displaystyle \geq\inf_{\hat{f}_{n}}\max_{0\leq j\leq M}\mathbb P_{f_{j}}(h^{*}(\hat{f}_{n})\neq j)\\ & \displaystyle\geq\inf_{h}\max_{0\leq j\leq M}\mathbb P_{f_{j}}(h\neq j)\\ & \displaystyle\triangleq p_{e,M} \end{array} &fg=000000$

    where in the third step we replaced the tests of the form $latex {h^{*}(\hat{f}_{n})}&fg=000000$ by the larger class of ALL possible tests $latex {h}&fg=000000$; the infimum $latex {\inf_{h}}&fg=000000$ over this larger class can only be smaller than $latex {\inf_{\hat{f}_{n}}}&fg=000000$. A small code sketch of the minimum-distance test $latex {h^{*}}&fg=000000$ is given right after this list.
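
To fix ideas, here is a minimal sketch in Python of such a minimum-distance test in a hypothetical discretized setting: the estimator and the hypotheses are represented as vectors on a common grid, and the semi-distance is the empirical $latex {L_{2}}&fg=000000$ norm. The function names and toy hypotheses below are illustrative only; they are not part of the general scheme.

```python
import numpy as np

def l2_distance(f, g):
    """Empirical L2 semi-distance between two functions sampled on a common grid."""
    return np.sqrt(np.mean((f - g) ** 2))

def min_distance_test(f_hat, hypotheses, distance=l2_distance):
    """Minimum-distance test h*: return the index j of the hypothesis f_j that is
    closest to the estimator f_hat (ties broken by the smallest index)."""
    return int(np.argmin([distance(f_hat, f_j) for f_j in hypotheses]))

# Toy usage: three hypotheses on a grid and a noisy "estimate" of the second one.
grid = np.linspace(0.0, 1.0, 200)
hypotheses = [np.sin(2 * np.pi * k * grid) for k in range(3)]   # f_0, f_1, f_2
rng = np.random.default_rng(0)
f_hat = hypotheses[1] + 0.1 * rng.standard_normal(grid.size)
print(min_distance_test(f_hat, hypotheses))                      # prints 1
```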

Conclusion

It is enough to build $latex {f_{0},\ldots,f_{M}}&fg=000000$ such that $latex {d(f_{j},f_{k})\geq2s_{n}}&fg=000000$ for $latex {j\neq k}&fg=000000$ and verify that

$latex \displaystyle p_{e,M}\triangleq\inf_{h}\max_{0\leq j\leq M}\mathbb P_{f_{j}}(h\neq j)\geq c>0. &fg=000000$

This requires care: the first condition says that the $latex {f_{j}}&fg=000000$’s need to be well separated from each other, while the second one requires that they remain close enough that it is hard to distinguish them based on the observed sample. We call the quantity $latex {p_{e,M}}&fg=000000$ the minimax probability of error for the problem of testing the $latex {M+1}&fg=000000$ hypotheses $latex {f_{0},\dots,f_{M}}&fg=000000$.
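
As a sanity check on this tension, here is a small Monte Carlo sketch in a hypothetical toy model (not taken from this post): we observe $latex {n}&fg=000000$ i.i.d. $latex {N(\theta,1)}&fg=000000$ variables, the two hypotheses are $latex {\theta_{0}=0}&fg=000000$ and $latex {\theta_{1}=2s_{n}}&fg=000000$ with $latex {s_{n}=1/\sqrt{n}}&fg=000000$, so they are $latex {2s_{n}}&fg=000000$-separated, and the test compares the sample mean to the midpoint between them. The estimated error probability stays bounded away from zero for every $latex {n}&fg=000000$, which is exactly the behaviour the recipe asks of $latex {p_{e,M}}&fg=000000$.

```python
import numpy as np

rng = np.random.default_rng(0)

def max_error_probability(n, n_trials=100000):
    """Monte Carlo estimate of max_j P_{theta_j}(test != j) for the midpoint test
    in a toy model: X_1, ..., X_n iid N(theta, 1), theta_0 = 0, theta_1 = 2*s_n."""
    s_n = 1.0 / np.sqrt(n)
    theta = np.array([0.0, 2.0 * s_n])          # two 2*s_n-separated hypotheses
    midpoint = theta.mean()
    errors = []
    for j in (0, 1):
        # The sample mean of n iid N(theta_j, 1) variables is N(theta_j, 1/n).
        sample_mean = theta[j] + rng.standard_normal(n_trials) / np.sqrt(n)
        decision = (sample_mean > midpoint).astype(int)
        errors.append(np.mean(decision != j))
    return max(errors)

for n in (10, 100, 1000, 10000):
    print(n, round(max_error_probability(n), 3))
# The error stays near Phi(-1), roughly 0.16, for every n: the hypotheses shrink
# together at the same 1/sqrt(n) rate at which the data become more informative,
# so they remain hard to distinguish, which is the behaviour required of p_{e,M}.
```

If the separation were chosen much larger than $latex {1/\sqrt{n}}&fg=000000$ in this toy model, the two hypotheses would become easy to distinguish and the error probability would vanish, so the argument would not give a lower bound at that rate.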

In the next post, we will address the case $latex {M=1}&fg=000000$ and lower bound the error probability $latex {p_{e,M}}&fg=000000$.

In the meantime, let me know your comments below.

 
