In the last post I forgot to mention that we use Nikol’ski classes of densities because the MISE is a risk corresponding to the $latex {\mathbb L^2({\mathbb R})}&fg=000000$ norm. Thus, it is natural to assume that $latex {p}&fg=000000$ is smooth with respect to this norm. Another way to describe smoothness in $latex {\mathbb L^{2}({\mathbb R})}&fg=000000$ is through Sobolev classes:

Definition 1 Let $latex {\beta\geq1}&fg=000000$ be an integer and $latex {L>0}&fg=000000$. The Sobolev class $latex {\mathcal{S}(\beta,L)}&fg=000000$ is the set of all $latex {\beta-1}&fg=000000$ times differentiable functions $latex {f:{\mathbb R}\rightarrow{\mathbb R}}&fg=000000$ having absolutely continuous derivative $latex {f^{(\beta-1)}}&fg=000000$ and satisfying

$latex \displaystyle \int(f^{(\beta)}(x))^{2}dx\leq L^{2}. &fg=000000$
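To make the definition concrete, here is a small numerical sketch. It assumes, purely for illustration, that $latex {f}&fg=000000$ is the standard normal density, for which the integrals have the closed forms $latex {\int(f')^{2}=1/(4\sqrt{\pi})}&fg=000000$ and $latex {\int(f'')^{2}=3/(8\sqrt{\pi})}&fg=000000$. The helper names (`squared_l2_norm`, `satisfies_sobolev_bound`) are hypothetical, and the code only checks the integral condition, not the differentiability requirements.

```python
import math

def phi(x):
    """Standard normal density (the illustrative choice of f)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1(x):
    """First derivative of phi."""
    return -x * phi(x)

def d2(x):
    """Second derivative of phi."""
    return (x * x - 1.0) * phi(x)

def squared_l2_norm(f, lo=-12.0, hi=12.0, steps=24000):
    """Midpoint-rule approximation of the integral of f(x)^2 over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * dx) ** 2 for i in range(steps)) * dx

def satisfies_sobolev_bound(beta_th_derivative, L):
    """Check only the integral condition: int (f^(beta))^2 dx <= L^2."""
    return squared_l2_norm(beta_th_derivative) <= L ** 2

if __name__ == "__main__":
    print("beta = 1:", squared_l2_norm(d1), "closed form:", 1 / (4 * math.sqrt(math.pi)))
    print("beta = 2:", squared_l2_norm(d2), "closed form:", 3 / (8 * math.sqrt(math.pi)))
    print("integral condition for beta = 2, L = 0.5:", satisfies_sobolev_bound(d2, 0.5))
```

Under these assumptions the normal density satisfies the integral condition of $latex {\mathcal{S}(2,L)}&fg=000000$ exactly when $latex {L^{2}\geq 3/(8\sqrt{\pi})}&fg=000000$.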

It is also interesting to notice that, for an integer $latex {\beta}&fg=000000$, we have the inclusion $latex {\mathcal{S}(\beta,L)\subset\mathcal{H}(\beta,L)}&fg=000000$ of the Sobolev class in the Nikol’ski class, which can be checked using the generalized Minkowski inequality. Consequently, over Sobolev classes we attain the same rate of convergence as over Nikol’ski classes.

Theorem 2: Suppose that, for an integer $latex {\beta\geq1}&fg=000000$:

• the function $latex {K}&fg=000000$ is a kernel of order $latex {\beta-1}&fg=000000$ satisfying the conditions

$latex \displaystyle {\displaystyle \int K^{2}(u)du}<\infty,\qquad\int|u|^{\beta}|K(u)|du<\infty; &fg=000000$

• the density $latex {p}&fg=000000$ is $latex {\beta-1}&fg=000000$ times differentiable, its derivative $latex {p^{(\beta-1)}}&fg=000000$ is absolutely continuous on $latex {{\mathbb R}}&fg=000000$ and

$latex \displaystyle \int(p^{(\beta)}(x))^{2}dx<\infty. &fg=000000$

Then for all $latex {n\geq1}&fg=000000$ and all $latex {h>0}&fg=000000$ the mean integrated squared error of the kernel estimator $latex {\hat{p} _{n}}&fg=000000$ satisfies, with $latex {l=\beta-1}&fg=000000$,

$latex \displaystyle \begin{array}{rl} \mathrm{MISE}\triangleq & \mathbb E_{p}\int(\hat{p} _{n}(x)-p(x))^{2}dx\\ \leq & \frac{1}{nh}\int K^{2}(u)du+\frac{h^{2\beta}}{(l!)^{2}}\left(\int|u|^{\beta}|K(u)|du\right)^{2}\int(p^{(\beta)}(x))^{2}dx. \end{array} &fg=000000$
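As a sanity check of the bound, the sketch below evaluates its right-hand side in one special case and compares it with the exact MISE. The setting is an assumption made only for illustration: $latex {\beta=2}&fg=000000$ (so the factorial is $latex {1!}&fg=000000$), $latex {K}&fg=000000$ the Gaussian kernel (a kernel of order 1) and $latex {p}&fg=000000$ the standard normal density. In that case every integral is available in closed form, because a Gaussian convolved with a Gaussian is again Gaussian. The function names are hypothetical.

```python
import math

SQRT_PI = math.sqrt(math.pi)

def mise_bound(n, h):
    """Right-hand side of the theorem for beta = 2, Gaussian K, p = N(0,1)."""
    int_K2 = 1.0 / (2.0 * SQRT_PI)      # integral of K(u)^2
    int_u2K = 1.0                       # integral of |u|^2 |K(u)|
    int_p2_sq = 3.0 / (8.0 * SQRT_PI)   # integral of (p''(x))^2
    l = 1                               # l = beta - 1
    bias_part = h ** 4 / math.factorial(l) ** 2 * int_u2K ** 2 * int_p2_sq
    return int_K2 / (n * h) + bias_part

def mise_exact(n, h):
    """Exact MISE in the same Gaussian/Gaussian setting.

    Uses E int p_hat^2 = int K^2 / (n h) + (1 - 1/n) int (K_h * p)^2,
    where K_h * p is the N(0, 1 + h^2) density.
    """
    int_K2 = 1.0 / (2.0 * SQRT_PI)
    smoothed_sq = 1.0 / (2.0 * SQRT_PI * math.sqrt(1.0 + h * h))  # int (K_h * p)^2
    cross = 1.0 / math.sqrt(2.0 * math.pi * (2.0 + h * h))        # int (K_h * p) p
    p_sq = 1.0 / (2.0 * SQRT_PI)                                  # int p^2
    return int_K2 / (n * h) + (1.0 - 1.0 / n) * smoothed_sq - 2.0 * cross + p_sq

if __name__ == "__main__":
    for n in (10, 100, 1000):
        for h in (0.1, 0.3, 0.5, 1.0):
            print(n, h, mise_exact(n, h), mise_bound(n, h))
```

In this example the bound holds with some slack, which is expected: the variance part drops a negative term and the bias part goes through the Minkowski-type inequality used in the proof.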

Proof: By the decomposition of the MISE into a variance term and a bias term, the Proposition bounding the variance takes care of the variance term. For the bias term, we apply the bias bound obtained for Nikol’ski classes with $latex {l=\left\lfloor \beta\right\rfloor =\beta-1}&fg=000000$, replacing $latex {L}&fg=000000$ by $latex {\left(\int(p^{(\beta)}(x))^{2}dx\right)^{1/2}}&fg=000000$. To justify this replacement, we write the increment of $latex {p^{(l)}}&fg=000000$ as an integral over $latex {\theta\in[0,1]}&fg=000000$ and use the generalized Minkowski inequality, which gives for all $latex {t\in{\mathbb R}}&fg=000000$ that

$latex \displaystyle \begin{array}{rl} \displaystyle\int\left(p^{(l)}(x+t)-p^{(l)}(x)\right)^{2}dx & =\displaystyle\int\left(p^{(l)}(x)+t\int_{0}^{1}p^{(l+1)}(x+\theta t)d\theta-p^{(l)}(x)\right)^{2}dx\\ & =\displaystyle\int\left(t\int_{0}^{1}p^{(l+1)}(x+\theta t)d\theta\right)^{2}dx\\ &\leq \displaystyle t^{2}\left(\int_{0}^{1}\left[\int\left(p^{(\beta)}(x+\theta t)\right)^{2}dx\right]^{1/2}d\theta\right)^{2}\\ & =\displaystyle t^{2}\left(\int_{0}^{1}\left[\int\left(p^{(\beta)}(u)\right)^{2}du\right]^{1/2}d\theta\right)^{2}\\ & = \displaystyle t^{2}\int\left(p^{(\beta)}(u)\right)^{2}du. \end{array} &fg=000000$

$latex \Box&fg=000000$
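The key inequality of the proof, $latex {\int(p^{(l)}(x+t)-p^{(l)}(x))^{2}dx\leq t^{2}\int(p^{(\beta)}(u))^{2}du}&fg=000000$, can also be checked numerically. The sketch below does so under the assumption that $latex {p}&fg=000000$ is the standard normal density with $latex {\beta=2}&fg=000000$ and $latex {l=1}&fg=000000$. The integrals are approximated by a plain midpoint rule, and the function names are hypothetical.

```python
import math

def phi(x):
    """Standard normal density (illustrative choice of p)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def p1(x):
    """p^(l) with l = 1: first derivative of the N(0,1) density."""
    return -x * phi(x)

def p2(x):
    """p^(beta) with beta = 2: second derivative of the N(0,1) density."""
    return (x * x - 1.0) * phi(x)

def integrate(g, lo=-15.0, hi=15.0, steps=30000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * dx) for i in range(steps)) * dx

def increment_norm_sq(t):
    """Left-hand side: int (p'(x + t) - p'(x))^2 dx."""
    return integrate(lambda x: (p1(x + t) - p1(x)) ** 2)

def minkowski_bound(t):
    """Right-hand side: t^2 * int (p''(u))^2 du."""
    return t * t * integrate(lambda u: p2(u) ** 2)

if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 3.0):
        print(t, increment_norm_sq(t), minkowski_bound(t))
```

For small $latex {t}&fg=000000$ the two sides are close, while for large $latex {t}&fg=000000$ the left-hand side stays bounded by $latex {2\int(p')^{2}}&fg=000000$ and the right-hand side grows like $latex {t^{2}}&fg=000000$.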

The natural questions that arise now are:

How should we choose the kernel $latex {K}&fg=000000$ and the bandwidth $latex {h}&fg=000000$ in an optimal way? And is this optimal choice consistent in some sense?

We will answer these questions next week. As always, please use the comment section for questions, suggestions or improvements.

See you next time.

Source: Tsybakov, A. (2009). Introduction to nonparametric estimation. Springer.