Monthly Archives: March 2012

Multivariate kernel density estimation

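We now briefly define the kernel density estimator in the multivariate case. Suppose that the data are d-dimensional, so that $latex {X_{i}=(X_{i1},\ldots,X_{id})}&fg=000000$ for $latex {i=1,\ldots,n}&fg=000000$. We will use the product kernel $latex \displaystyle \hat{f}_{h}(x)=\frac{1}{nh_{1}\cdots h_{d}}\sum_{i=1}^{n}\left\{ \prod_{j=1}^{d}K\left(\frac{x_{j}-X_{ij}}{h_{j}}\right)\right\} , &fg=000000$ where $latex {K}&fg=000000$ is a univariate kernel and $latex {h_{j}}&fg=000000$ is the bandwidth for the $latex {j}&fg=000000$-th coordinate. The risk is given by $latex \displaystyle \mathrm{MISE}\approx\frac{\left(\mu_{2}(K)\right)^{2}}{4}\left[\sum_{j=1}^{d}h_{j}^{4}\int f_{jj}^{2}(x)dx+\sum_{j\neq k}h_{j}^{2}h_{k}^{2}\int f_{jj}(x)f_{kk}(x)dx\right]+\frac{\left(\int K^{2}(x)dx\right)^{d}}{nh_{1}\cdots h_{d}}, &fg=000000$ where $latex {\mu_{2}(K)=\int u^{2}K(u)du}&fg=000000$ is the second moment of the kernel and $latex {f_{jj}}&fg=000000$ denotes the second partial derivative of $latex {f}&fg=000000$ with respect to $latex {x_{j}}&fg=000000$.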
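To make the product kernel concrete, here is a minimal Python sketch that evaluates the estimator at a single point. It assumes a Gaussian kernel for $latex {K}&fg=000000$, which is our choice for illustration and is not prescribed by the definition above. The names `gaussian_kernel` and `product_kde` are hypothetical helpers, and the sample data and bandwidths are made up.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the univariate kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def product_kde(x, data, bandwidths, kernel=gaussian_kernel):
    """Evaluate the product-kernel density estimate at a single point x.

    x          : sequence of d coordinates
    data       : list of n observations, each a sequence of d coordinates
    bandwidths : sequence of d positive bandwidths h_1, ..., h_d
    """
    n = len(data)
    d = len(x)
    if n == 0:
        raise ValueError("data must contain at least one observation")
    if len(bandwidths) != d or any(h <= 0 for h in bandwidths):
        raise ValueError("need one positive bandwidth per coordinate")

    # Normalising constant n * h_1 * ... * h_d
    norm = float(n)
    for h in bandwidths:
        norm *= h

    # Sum over observations of the product over coordinates of K((x_j - X_ij) / h_j)
    total = 0.0
    for obs in data:
        prod = 1.0
        for j in range(d):
            prod *= kernel((x[j] - obs[j]) / bandwidths[j])
        total += prod
    return total / norm


if __name__ == "__main__":
    # Made-up two-dimensional sample, for illustration only
    sample = [(0.1, -0.3), (0.4, 0.2), (-0.5, 0.0), (1.2, 0.9)]
    print(product_kde((0.0, 0.0), sample, bandwidths=(0.5, 0.5)))
```

Note that each coordinate has its own bandwidth, so the estimator can smooth more along one axis than another. This is why the risk above depends on all of $latex {h_{1},\ldots,h_{d}}&fg=000000$.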

Choosing the smoothing parameter

Two popular methods for choosing the bandwidth $latex {h}&fg=000000$ of the nonparametric density estimator are the plug-in method and cross-validation. For the first, we will focus on the “quick and dirty” plug-in method introduced by Silverman (1986). In cross-validation, we minimize a modified version of the quadratic risk of $latex {\hat{f}_{h}}&fg=000000$. The …