Example with two hypotheses: Regression case

Nothing preaches better than the act.
–Benjamin Franklin (1706-1790)

We are now going to apply our version of Kullback’s theorem based on two hypotheses to the nonparametric regression model. Assume first the following conditions:

  1. For $latex {f:[0,1]\rightarrow{\mathbb R}}&fg=000000$ we assume the model

    $latex \displaystyle Y_{i}=f(X_{i})+\xi_{i},\quad i=1,\ldots,n. &fg=000000$

  2. The $latex {\xi_{i}}&fg=000000$ are iid with density $latex {p_{\xi}(\cdot)}&fg=000000$. Also there exist $latex {p_{*}>0}&fg=000000$ and $latex {v_{0}>0}&fg=000000$ such that

    $latex \displaystyle {\displaystyle {\displaystyle \int p_{\xi}(u)\log\frac{p_{\xi}(u)}{p_{\xi}(u+v)}du\leq p_{*}v^{2}}}, \ \ \ \ \ (1)&fg=000000$

    for any $latex {\vert v\vert\leq v_{0}}&fg=000000$.

  3. The $latex {X_{i}\in[0,1]}&fg=000000$ are fixed.
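
Condition (1) may look abstract, but Gaussian noise satisfies it exactly: if $latex {\xi\sim N(0,\sigma^{2})}&fg=000000$, the integral in (1) equals $latex {v^{2}/(2\sigma^{2})}&fg=000000$, so $latex {p_{*}=1/(2\sigma^{2})}&fg=000000$ works for every $latex {v}&fg=000000$. A minimal numerical sketch (the integration grid and shift values are arbitrary choices):

```python
import numpy as np

# Sketch: condition (1) holds for standard Gaussian noise.
# If xi ~ N(0, sigma^2), the integral in (1) equals v^2 / (2 sigma^2),
# so p_* = 1 / (2 sigma^2) works (here sigma = 1, hence p_* = 1/2).

def p_xi(u, sigma=1.0):
    return np.exp(-u ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))

def shifted_kl(v, lo=-20.0, hi=20.0, m=200001):
    # Riemann-sum approximation of  int p(u) log(p(u) / p(u + v)) du.
    u = np.linspace(lo, hi, m)
    du = u[1] - u[0]
    return float(np.sum(p_xi(u) * np.log(p_xi(u) / p_xi(u + v))) * du)

results = {v: shifted_kl(v) for v in (0.1, 0.5, 1.0)}
```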

Remember that to find a lower bound we need three ingredients: a class of functions, a distance, and the hypotheses. For this example we will use the following,

  1. The class of functions: the Hölder class $latex {\Sigma(\beta,L),\ \beta,L>0}&fg=000000$ (see this post for the definitions).
  2. The distance: We shall use the distance at a fixed point $latex {x_{0}\in[0,1]}&fg=000000$:$latex \displaystyle d(f,g)=\vert f(x_{0})-g(x_{0})\vert. &fg=000000$
  3. The hypotheses: We choose the following,

    $latex \displaystyle {\displaystyle {\displaystyle f_{0}(x)=0,\quad f_{1}(x)=Lh^{\beta}K\left(\frac{x-x_{0}}{h}\right),\quad x\in[0,1]}} &fg=000000$

    where $latex {h=c_{0}n^{-\frac{1}{2\beta+1}}}&fg=000000$ with $latex {c_{0}>0}&fg=000000$, and $latex {K:{\mathbb R}\rightarrow[0,\infty[}&fg=000000$ satisfies

    $latex \displaystyle K\in\Sigma(\beta,1/2)\cap C^{\infty}({\mathbb R})\quad\text{and}\quad K(u)>0\Leftrightarrow u\in]-1/2,1/2[. &fg=000000$
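
One concrete choice is the standard smooth bump kernel, which is $latex {C^{\infty}}&fg=000000$ and positive exactly on $latex {]-1/2,1/2[}&fg=000000$. A minimal sketch of the two hypotheses; the scaling constant $latex {a=0.05}&fg=000000$ is an illustrative assumption, taken small so that $latex {K}&fg=000000$ can lie in $latex {\Sigma(\beta,1/2)}&fg=000000$:

```python
import numpy as np

# Sketch of the two hypotheses. The bump kernel is one standard choice
# of K: C-infinity and positive exactly on (-1/2, 1/2). The constant
# a = 0.05 is an illustrative assumption (small enough for Sigma(beta, 1/2)).

def K(u, a=0.05):
    u = np.atleast_1d(np.asarray(u, dtype=float))
    out = np.zeros_like(u)
    inside = np.abs(u) < 0.5
    out[inside] = a * np.exp(-1.0 / (1.0 - 4.0 * u[inside] ** 2))
    return out

def f1(x, x0, h, L, beta):
    # f_1(x) = L h^beta K((x - x0) / h); f_0 is identically zero.
    return L * h ** beta * K((x - x0) / h)

# f_1 vanishes outside a window of width h around x0:
x0, h, L, beta = 0.5, 0.1, 1.0, 2.0
vals = f1(np.linspace(0.0, 1.0, 11), x0, h, L, beta)
```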

Remark 1 By classical methods one can show that the upper bound is $latex {n^{-\frac{2\beta}{2\beta+1}}}&fg=000000$, so our aim is to match the same rate in the lower bound.

We are now ready to start showing how to find the lower bound. An easy but practical way to proceed is to prove the following three statements:

  1. $latex \boldsymbol{f_{j}\in\Sigma(\beta,L)\ ,j=0,1.}&fg=000000$
    For $latex {l=\lfloor\beta\rfloor}&fg=000000$, the $latex {l}&fg=000000$-th order derivative of $latex {f_{1}}&fg=000000$ is

    $latex \displaystyle f_{1}^{(l)}(x)=Lh^{\beta-l}K^{(l)}\left(\frac{x-x_{0}}{h}\right). &fg=000000$


    $latex \displaystyle \begin{array}{rl} \vert f_{1}^{(l)}(x)-f_{1}^{(l)}(x^{\prime})\vert & ={\displaystyle Lh^{\beta-l}\vert K^{(l)}(u)-K^{(l)}(u^{\prime})\vert}\\ & \leq{\displaystyle Lh^{\beta-l}\vert u-u^{\prime}\vert^{\beta-l}/2=L\vert x-x^{\prime}\vert^{\beta-l}/2} \end{array} &fg=000000$

    with $latex {u=(x-x_{0})/h}&fg=000000$, $latex {u^{\prime}=(x^{\prime}-x_{0})/h}&fg=000000$ and $latex {x,x^{\prime}\in{\mathbb R}}&fg=000000$. We conclude that $latex {f_{1}\in\Sigma(\beta,L)}&fg=000000$; clearly $latex {f_{0}=0\in\Sigma(\beta,L)}&fg=000000$ as well.

  2. $latex \boldsymbol{d(f_{0},f_{1})\geq2s}&fg=000000$. We have

    $latex \displaystyle {\displaystyle {\displaystyle d(f_{0},f_{1})=\vert f_{1}(x_{0})\vert=Lh^{\beta}K(0)=Lc_{0}^{\beta}K(0)n^{-\frac{\beta}{2\beta+1}}.}} &fg=000000$

    We finish by taking

    $latex \displaystyle {\displaystyle s=\frac{1}{2}Lc_{0}^{\beta}K(0)n^{-\frac{\beta}{2\beta+1}}\triangleq An^{-\frac{\beta}{2\beta+1}}.} &fg=000000$

  3. $latex \boldsymbol{K(P_{0},P_{1})\leq\alpha<\infty}&fg=000000$. Notice first that $latex {P_{j}}&fg=000000$ is the distribution of $latex {Y_{1},\ldots,Y_{n}}&fg=000000$ under $latex {f=f_{j}.}&fg=000000$ This distribution admits the following joint density

    $latex \displaystyle {\displaystyle p_{j}(u_{1},\ldots,u_{n})=\prod_{i=1}^{n}p_{\xi}(u_{i}-f_{j}(X_{i})),\quad j=0,1.} &fg=000000$

    There exists an integer $latex {n_{0}}&fg=000000$ depending only on $latex {c_{0},L,\beta,K_{\max},v_{0}}&fg=000000$ such that for all $latex {n>n_{0}}&fg=000000$ we have $latex {nh\geq1}&fg=000000$ and $latex {Lh^{\beta}K_{\max}\leq v_{0}}&fg=000000$, where $latex {K_{\max}=\max_{u}K(u)}&fg=000000$. Assume also that there exists a real number $latex {a_{0}>0}&fg=000000$ such that, for all $latex {n\geq1}&fg=000000$,

    $latex \displaystyle {\displaystyle \sum_{i=1}^{n}I\left(\left|\frac{X_{i}-x_{0}}{h}\right|\leq\frac{1}{2}\right)\leq a_{0}\max(nh,1).} &fg=000000$

    Then, by (1) and the above assumptions, we get

    $latex \displaystyle \begin{array}{rl} K(P_{0},P_{1}) & ={\displaystyle \int\log\frac{dP_{0}}{dP_{1}}dP_{0}}\\ & ={\displaystyle \int\cdots\int\log\prod_{i=1}^{n}\frac{p_{\xi}(u_{i})}{p_{\xi}(u_{i}-f_{1}(X_{i}))}\prod_{i=1}^{n}\left[p_{\xi}(u_{i})du_{i}\right]}\\ & ={\displaystyle \sum_{i=1}^{n}\int\log\frac{p_{\xi}(y)}{p_{\xi}(y-f_{1}(X_{i}))}p_{\xi}(y)dy}\\ & \leq p_{*}\sum_{i=1}^{n}f_{1}^{2}(X_{i})\\ & ={\displaystyle p_{*}L^{2}h^{2\beta}\sum_{i=1}^{n}K^{2}\left(\frac{X_{i}-x_{0}}{h}\right)}\\ & \leq{\displaystyle p_{*}L^{2}h^{2\beta}K_{\max}^{2}\sum_{i=1}^{n}I\left(\left|\frac{X_{i}-x_{0}}{h}\right|\leq\frac{1}{2}\right)}\\ & \leq{\displaystyle p_{*}a_{0}L^{2}K_{\max}^{2}h^{2\beta}\max(nh,1)}\\ & ={\displaystyle p_{*}a_{0}L^{2}K_{\max}^{2}nh^{2\beta+1}}. \end{array} &fg=000000$

    If we choose

    $latex \displaystyle {\displaystyle c_{0}=\left(\frac{\alpha}{p_{*}a_{0}L^{2}K_{\max}^{2}}\right)^{\frac{1}{2\beta+1}}}, &fg=000000$

    and with $latex {h=c_{0}n^{-\frac{1}{2\beta+1}}}&fg=000000$ we obtain $latex {K(P_{0},P_{1})\leq\alpha}&fg=000000$.
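
For Gaussian noise the Kullback divergence in step 3 is exact, $latex {K(P_{0},P_{1})=p_{*}\sum_{i}f_{1}^{2}(X_{i})}&fg=000000$ with $latex {p_{*}=1/(2\sigma^{2})}&fg=000000$, which gives a quick numerical check of the whole chain. The equispaced design $latex {X_{i}=i/n}&fg=000000$ and all constants below ($latex {\alpha,L,\beta,a_{0}}&fg=000000$ and the bump constant) are illustrative assumptions:

```python
import numpy as np

# Numerical check of step 3 under Gaussian noise (sigma = 1, p_* = 1/2),
# where K(P0, P1) = p_* * sum_i f1(X_i)^2 exactly. The equispaced design
# and the constants alpha, L, beta, a0, a are illustrative assumptions.

def K_bump(u, a=0.05):
    u = np.atleast_1d(np.asarray(u, dtype=float))
    out = np.zeros_like(u)
    inside = np.abs(u) < 0.5
    out[inside] = a * np.exp(-1.0 / (1.0 - 4.0 * u[inside] ** 2))
    return out

p_star, alpha, L, beta, x0, a0 = 0.5, 0.1, 1.0, 2.0, 0.5, 2.0
K_max = float(K_bump(0.0)[0])

# c0 chosen as in the display above, then h = c0 * n^(-1/(2 beta + 1)).
c0 = (alpha / (p_star * a0 * L ** 2 * K_max ** 2)) ** (1.0 / (2.0 * beta + 1.0))

kl_values = {}
for n in (100, 1000, 10000):
    h = c0 * n ** (-1.0 / (2.0 * beta + 1.0))
    X = np.arange(1, n + 1) / n
    f1 = L * h ** beta * K_bump((X - x0) / h)
    kl_values[n] = p_star * float(np.sum(f1 ** 2))  # exact KL, stays <= alpha
```

As the derivation predicts, the divergence remains below $latex {\alpha}&fg=000000$ for every sample size, even though $latex {h\rightarrow0}&fg=000000$ and the number of design points in the window grows.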


Gathering all these results and combining them with the previous post, we can state the following theorem.

Theorem 1 Suppose that $latex {\beta>0}&fg=000000$ and $latex {L>0}&fg=000000$. Under all the assumptions mentioned above we have, for all $latex {x_{0}\in[0,1]}&fg=000000$ and for any estimator $latex {T_{n}}&fg=000000$,

$latex \displaystyle {\displaystyle \liminf_{n\rightarrow\infty}\inf_{T_{n}}\sup_{f\in\Sigma(\beta,L)}\mathbb E_{f}\left[n^{\frac{2\beta}{2\beta+1}}\left(T_{n}(x_{0})-f(x_{0})\right)^{2}\right]\geq c} &fg=000000$

where $latex {c>0}&fg=000000$ depends only on $latex {\beta,L,p_{*}}&fg=000000$ and $latex {a_{0}}&fg=000000$.

Last remarks

This result is very powerful because it proves that no estimator can improve on this convergence rate. At first sight, the hypotheses assumed here may seem very hard, or even impossible, to guess. The trick is to start with some general hypotheses and then gradually add the requirements needed to reach the desired conclusion.
