A global measure of risk for kernel estimators in Nikolski classes

Sergey Mikhailovich Nikolsky (Russian: Серге́й Миха́йлович Нико́льский; 30 April 1905 – 9 November 2012) was a Russian mathematician. He was born in Talitsa, which at that time was located in the Governorate of Perm, Russia. He had been an Academician since 28 November 1972 and won many scientific prizes. At the age of 92 he was still actively giving lectures at the Moscow Institute of Physics and Technology (MIPT), and in 2005, at the age of 100, he was still working at MIPT and giving talks at scientific conferences.
[Photo: Sergey Nikolskii, from the Russian Academy of Sciences]

The MSE measures the error of the estimator $latex {\hat{p}_{n}}&fg=000000$ at a single arbitrary point $latex {x_{0}}&fg=000000$, but it is also worth studying a global risk for $latex {\hat{p}_{n}}&fg=000000$. The mean integrated squared error (MISE) is an important global measure,

$latex \displaystyle \mathrm{MISE}\triangleq\mathop{\mathbb E}_{p}\int\left(\hat{p} _{n}(x)-p(x)\right)^{2}dx &fg=000000$

which, by Fubini's theorem and the bias-variance decomposition of the MSE, satisfies

$latex \displaystyle \mathrm{MISE}=\int\mathrm{MSE}(x)dx=\int b^{2}(x)dx+\int\sigma^{2}(x)dx. &fg=000000$

We proceed in the same way as for the $latex {\mathrm{MSE}}&fg=000000$: we analyze separately the bias term $latex {\int b^{2}(x)dx}&fg=000000$ and the variance term $latex {\int\sigma^{2}(x)dx}&fg=000000$.
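As a concrete illustration, the following minimal Python/NumPy sketch approximates the MISE of a Gaussian-kernel estimator by Monte Carlo and checks the decomposition above. The target density $latex {N(0,1)}&fg=000000$, the bandwidth $latex {h=0.3}&fg=000000$ and the helper name kde are my illustrative choices, not part of the theory.

    import numpy as np

    rng = np.random.default_rng(0)

    def kde(x, sample, h):
        # Gaussian-kernel density estimate, evaluated on the grid x
        u = (x[:, None] - sample[None, :]) / h
        return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

    n, h, reps = 200, 0.3, 500
    grid = np.linspace(-5, 5, 401)
    dx = grid[1] - grid[0]
    p = np.exp(-0.5 * grid**2) / np.sqrt(2 * np.pi)      # true N(0,1) density

    est = np.array([kde(grid, rng.standard_normal(n), h) for _ in range(reps)])
    mise = ((est - p) ** 2).mean(axis=0).sum() * dx      # Monte Carlo MISE
    bias2 = ((est.mean(axis=0) - p) ** 2).sum() * dx     # integral of b^2(x)
    var = est.var(axis=0).sum() * dx                     # integral of sigma^2(x)

    print(mise, "=", bias2 + var)

The two printed quantities coincide up to floating-point error, since the decomposition is an algebraic identity; both approximate the true MISE up to Monte Carlo error.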
[Figure: kernel density estimates (KDE) with different bandwidths]

1.1. Variance term of MISE

Let us first study the variance term.

Proposition 1 Suppose that $latex {K:{\mathbb R}\rightarrow{\mathbb R}}&fg=000000$ is a function satisfying

$latex \displaystyle \int K^{2}(u)du<\infty. &fg=000000$

Then for any $latex {h>0}&fg=000000$, $latex {n\geq1}&fg=000000$ and any probability density $latex {p}&fg=000000$ we have

$latex \displaystyle \int\sigma^{2}(x)dx\leq\frac{1}{nh}\int K^{2}(u)du. &fg=000000$

Proof: In Proposition 2 of the last post we obtained

$latex \displaystyle \sigma^{2}(x)=\frac{1}{nh^{2}}\mathop{\mathrm{Var}}\left(K\left(\frac{X_{1}-x}{h}\right)\right)\leq\frac{1}{nh^{2}}\mathop{\mathbb E}_{p}\left[K^{2}\left(\frac{X_{1}-x}{h}\right)\right] &fg=000000$

for all $latex {x\in{\mathbb R}}&fg=000000$. Therefore

$latex \displaystyle \begin{array}{rl} \int\sigma^{2}(x)dx\leq & \displaystyle\frac{1}{nh^{2}}\int\left[\int K^{2}\left(\frac{z-x}{h}\right)p(z)dz\right]dx\\ = & \displaystyle\frac{1}{nh^{2}}\int p(z)\left[\int K^{2}\left(\frac{z-x}{h}\right)dx\right]dz\\ = & \displaystyle\frac{1}{nh}\int K^{2}\left(u\right)du, \end{array} &fg=000000$

where the last equality follows from the substitution $latex {u=(z-x)/h}&fg=000000$ (so that $latex {dx=h\,du}&fg=000000$) and $latex {\int p(z)dz=1}&fg=000000$.

$latex \Box&fg=000000$
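As a quick sanity check of Proposition 1, the sketch below compares a Monte Carlo estimate of $latex {\int\sigma^{2}(x)dx}&fg=000000$ with the bound $latex {\frac{1}{nh}\int K^{2}(u)du}&fg=000000$ for a Gaussian kernel, where $latex {\int K^{2}(u)du=1/(2\sqrt{\pi})}&fg=000000$. The setup (density, bandwidth, helper kde) repeats the illustrative choices from the sketch above.

    import numpy as np

    rng = np.random.default_rng(1)

    def kde(x, sample, h):
        # same illustrative Gaussian-kernel estimator as before
        u = (x[:, None] - sample[None, :]) / h
        return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

    n, h, reps = 200, 0.3, 500
    grid = np.linspace(-5, 5, 401)
    dx = grid[1] - grid[0]

    est = np.array([kde(grid, rng.standard_normal(n), h) for _ in range(reps)])
    int_var = est.var(axis=0).sum() * dx       # Monte Carlo integral of sigma^2(x)
    bound = 1 / (n * h * 2 * np.sqrt(np.pi))   # (1/nh) * integral of K^2, Gaussian K

    print(int_var, "<=", bound)

Note that the bound does not depend on $latex {p}&fg=000000$ at all, which is what makes it useful for uniform statements.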

1.2. Bias term of MISE

The bias term, by contrast, can be controlled only over a subset of sufficiently smooth densities. For example, we assume that $latex {p}&fg=000000$ belongs to a Nikol'ski class of functions, defined as follows.

Definition 2 Let $latex {\beta>0}&fg=000000$ and $latex {L>0}&fg=000000$. The Nikol’ski class $latex {\mathcal{H}(\beta,L)}&fg=000000$ is the set of functions $latex {f:{\mathbb R}\rightarrow{\mathbb R}}&fg=000000$ whose derivatives $latex {f^{(l)}}&fg=000000$ of order $latex {l=\left\lfloor \beta\right\rfloor }&fg=000000$ exist and satisfy

$latex \displaystyle \left[\int\left(f^{(l)}(x+t)-f^{(l)}(x)\right)^{2}dx\right]^{1/2}\leq L\left|t\right|^{\beta-l},\quad\forall t\in{\mathbb R}. &fg=000000$
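To make the definition concrete, the sketch below numerically scans the ratio $latex {\left[\int(f^{(l)}(x+t)-f^{(l)}(x))^{2}dx\right]^{1/2}/|t|^{\beta-l}}&fg=000000$ for the standard normal density, with the illustrative choice $latex {\beta=1.5}&fg=000000$ (so $latex {l=1}&fg=000000$). The maximum over the scanned shifts gives an empirical lower estimate of the smallest admissible $latex {L}&fg=000000$; the grid and range of shifts are my own choices.

    import numpy as np

    beta, l = 1.5, 1                           # illustrative: l = floor(beta) = 1
    x = np.linspace(-10, 10, 20001)
    dx = x[1] - x[0]
    fprime = -x * np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)   # f' for f = N(0,1) density

    def shifted_l2(t):
        # L2 norm of f'(. + t) - f'(.), with f' interpolated on the grid
        shifted = np.interp(x + t, x, fprime, left=0.0, right=0.0)
        return np.sqrt(((shifted - fprime) ** 2).sum() * dx)

    ts = np.linspace(0.01, 3.0, 300)
    L_emp = max(shifted_l2(t) / t ** (beta - l) for t in ts)
    print("empirical L:", L_emp)               # any L above this works on the scanned range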

The next inequality will be very useful in Proposition 3.

Lemma (Generalized Minkowski inequality): For any Borel function $latex {g}&fg=000000$ on $latex {{\mathbb R}\times{\mathbb R}}&fg=000000$, we have

$latex \displaystyle \int\left(\int g(u,x)du\right)^{2}dx\leq\left[\int\left(\int g^{2}(u,x)dx\right)^{1/2}du\right]^{2}. &fg=000000$
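The inequality is easy to test numerically; the sketch below evaluates both sides on a grid for one arbitrary (and entirely illustrative) choice of $latex {g}&fg=000000$.

    import numpy as np

    # Numerical sanity check of the generalized Minkowski inequality
    u = np.linspace(-5, 5, 501)
    x = np.linspace(-5, 5, 501)
    du = u[1] - u[0]
    dx = x[1] - x[0]
    U, X = np.meshgrid(u, x, indexing="ij")    # g[i, j] = g(u_i, x_j)
    g = np.exp(-0.5 * (U**2 + X**2)) * (1 + 0.5 * np.sin(U * X))

    lhs = np.sum((g.sum(axis=0) * du) ** 2) * dx                  # int (int g du)^2 dx
    rhs = (np.sqrt((g**2).sum(axis=1) * dx).sum() * du) ** 2      # [int (int g^2 dx)^(1/2) du]^2
    print(lhs, "<=", rhs)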

We will assume that $latex {p}&fg=000000$ belongs to the following class of densities:

$latex \displaystyle \mathcal{P_{H}}(\beta,L)=\left\{ p\in\mathcal{H}(\beta,L)\left|p\geq0\quad\text{and}\quad\int p(x)dx=1\right.\right\} . &fg=000000$

Proposition 3 Assume that $latex {p\in\mathcal{P_{H}}(\beta,L)}&fg=000000$ and let $latex {K}&fg=000000$ be a kernel of order $latex {l=\left\lfloor \beta\right\rfloor }&fg=000000$ satisfying

$latex \displaystyle \int|u|^{\beta}|K(u)|du<\infty. &fg=000000$

Then, for any $latex {h>0}&fg=000000$ and $latex {n\geq1}&fg=000000$,

$latex \displaystyle \int b^{2}(x)dx\leq C_{2}^{2}h^{2\beta}, &fg=000000$

where

$latex \displaystyle C_{2}=\frac{L}{l!}\int|u|^{\beta}|K(u)|du. &fg=000000$

Proof: Fix $latex {x\in{\mathbb R}}&fg=000000$, $latex {u\in{\mathbb R}}&fg=000000$ and $latex {h>0}&fg=000000$, and write the Taylor expansion with integral remainder

$latex \displaystyle p(x+uh)=p(x)+p^{\prime}(x)uh+\cdots+\frac{(uh)^{l}}{(l-1)!}\int_{0}^{1}(1-\tau)^{l-1}p^{(l)}(x+\tau uh)d\tau. &fg=000000$

Since the kernel $latex {K}&fg=000000$ is of order $latex {l=\left\lfloor \beta\right\rfloor }&fg=000000$, the polynomial terms of the expansion integrate to zero and we obtain

$latex \displaystyle \begin{array}{rl} b(x)= & \displaystyle\int K(u)\frac{(uh)^{l}}{(l-1)!}\left[\int_{0}^{1}(1-\tau)^{l-1}p^{(l)}(x+\tau uh)d\tau\right]du\\ = & \displaystyle\int K(u)\frac{(uh)^{l}}{(l-1)!}\left[\int_{0}^{1}(1-\tau)^{l-1}\left(p^{(l)}(x+\tau uh)-p^{(l)}(x)\right)d\tau\right]du, \end{array} &fg=000000$

where the second equality holds because $latex {\int u^{l}K(u)du=0}&fg=000000$.

Applying the generalized Minkowski inequality twice and using the fact that $latex {p}&fg=000000$ belongs to the class $latex {\mathcal{H}(\beta,L)}&fg=000000$, we get the following upper bound:

$latex \displaystyle \begin{array}{rl} \int b^{2}(x)dx\leq & \displaystyle\int\left(\int|K(u)|\frac{|uh|^{l}}{(l-1)!}\left[\int_{0}^{1}(1-\tau)^{l-1}\left|p^{(l)}(x+\tau uh)-p^{(l)}(x)\right|d\tau\right]du\right)^{2}dx\\ \leq & \displaystyle\left(\int|K(u)|\frac{|uh|^{l}}{(l-1)!}\left(\int\left[\int_{0}^{1}(1-\tau)^{l-1}\left|p^{(l)}(x+\tau uh)-p^{(l)}(x)\right|d\tau\right]^{2}dx\right)^{1/2}du\right)^{2}\\ \leq & \displaystyle\left(\int|K(u)|\frac{|uh|^{l}}{(l-1)!}\int_{0}^{1}(1-\tau)^{l-1}\left[\int\left(p^{(l)}(x+\tau uh)-p^{(l)}(x)\right)^{2}dx\right]^{1/2}d\tau\,du\right)^{2}\\ \leq & \displaystyle\left(\int|K(u)|\frac{|uh|^{l}}{(l-1)!}\int_{0}^{1}(1-\tau)^{l-1}L|uh|^{\beta-l}d\tau\,du\right)^{2}\\ = & \displaystyle\left(\frac{L}{l!}\int|u|^{\beta}|K(u)|du\right)^{2}h^{2\beta}=C_{2}^{2}h^{2\beta}, \end{array} &fg=000000$

where the second and third inequalities are the generalized Minkowski inequality (in $latex {u}&fg=000000$ and in $latex {\tau}&fg=000000$, respectively), the fourth uses the Nikol'ski condition together with $latex {|\tau|\leq1}&fg=000000$, and the last step uses $latex {\int_{0}^{1}(1-\tau)^{l-1}d\tau=1/l}&fg=000000$.

$latex \Box&fg=000000$
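For the Gaussian kernel and $latex {p=N(0,1)}&fg=000000$, the expectation $latex {\mathop{\mathbb E}_{p}\hat{p}_{n}(x)}&fg=000000$ is the convolution $latex {(K_{h}*p)(x)}&fg=000000$, i.e. the $latex {N(0,1+h^{2})}&fg=000000$ density, so the bias is available in closed form and Proposition 3 can be sanity-checked directly. The values $latex {\beta=1.5}&fg=000000$ and $latex {L=0.9}&fg=000000$ below are illustrative assumptions (the empirical $latex {L}&fg=000000$ from the earlier sketch is smaller), and the Gaussian kernel has order 1, as required for $latex {l=1}&fg=000000$.

    import numpy as np

    beta, l, L = 1.5, 1, 0.9                   # illustrative choices; l! = 1
    x = np.linspace(-10, 10, 20001)
    dx = x[1] - x[0]

    def normal_pdf(t, s):
        return np.exp(-0.5 * (t / s) ** 2) / (s * np.sqrt(2 * np.pi))

    K = normal_pdf(x, 1.0)                     # Gaussian kernel on the same grid
    C2 = L * (np.abs(x) ** beta * K).sum() * dx   # C2 = (L / l!) * int |u|^beta |K(u)| du

    for h in (0.5, 0.2, 0.05):
        bias = normal_pdf(x, np.sqrt(1 + h**2)) - normal_pdf(x, 1.0)  # exact b(x)
        print(h, (bias**2).sum() * dx, "<=", C2**2 * h ** (2 * beta))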

Combining Propositions 1 and 3, we find

$latex \displaystyle \mathrm{MISE}\leq C_{2}^{2}h^{2\beta}+\frac{1}{nh}\int K^{2}(u)du, &fg=000000$

and minimizing the right-hand side with respect to $latex {h}&fg=000000$ we get

$latex \displaystyle h_{n}^{*}=\left(\frac{\int K^{2}(u)du}{2\beta C_{2}^{2}}\right)^{1/(2\beta+1)}n^{-1/(2\beta+1)}. &fg=000000$

Lastly, taking $latex {h=h_{n}^{*}}&fg=000000$ we see that

$latex \displaystyle \mathrm{MISE}=O\left(n^{-2\beta/(2\beta+1)}\right),\quad n\rightarrow\infty. &fg=000000$

This is exactly the same rate as for the $latex {\mathrm{MSE}}&fg=000000$.
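To see the rate in action, the sketch below plugs $latex {h_{n}^{*}}&fg=000000$ into the upper bound $latex {C_{2}^{2}h^{2\beta}+\frac{1}{nh}\int K^{2}(u)du}&fg=000000$ and rescales by $latex {n^{2\beta/(2\beta+1)}}&fg=000000$; the rescaled values stay constant across $latex {n}&fg=000000$, confirming the rate. Again, a Gaussian kernel with $latex {\beta=1.5}&fg=000000$ and $latex {L=1}&fg=000000$ is an illustrative assumption.

    import math

    import numpy as np

    beta, l, L = 1.5, 1, 1.0                   # illustrative choices
    u = np.linspace(-10, 10, 20001)
    du = u[1] - u[0]
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    intK2 = (K**2).sum() * du                  # int K^2(u) du  (about 1/(2*sqrt(pi)))
    C2 = L / math.factorial(l) * (np.abs(u) ** beta * K).sum() * du

    def mise_bound(n):
        h = (intK2 / (2 * beta * C2**2 * n)) ** (1 / (2 * beta + 1))   # optimal h_n*
        return C2**2 * h ** (2 * beta) + intK2 / (n * h)

    for n in (10**2, 10**4, 10**6):
        print(n, mise_bound(n) * n ** (2 * beta / (2 * beta + 1)))     # roughly constant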

Recapitulating: for $latex {\alpha>0}&fg=000000$ and $latex {h=\alpha n^{-1/(2\beta+1)}}&fg=000000$, the kernel estimator $latex {\hat{p}_{n}}&fg=000000$ satisfies

$latex \displaystyle \sup_{p\in\mathcal{P_{H}}(\beta,L)}\mathop{\mathbb E}_{p}\int\left(\hat{p} _{n}(x)-p(x)\right)^{2}dx\leq Cn^{-2\beta/(2\beta+1)}, &fg=000000$

where $latex {C>0}&fg=000000$ is a constant depending only on $latex {\beta}&fg=000000$, $latex {L}&fg=000000$, $latex {\alpha}&fg=000000$ and the kernel $latex {K}&fg=000000$.

Next time we will change the space of densities and see what happens there. Also, if time allows, we will look at the main problem with kernel estimators and some ideas for how to fix it.

As always, any comment/suggestion/idea is welcome.

Happy 2012 to all, and see you next year!

Source:
Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
