Importance of nonparametric statistics in regression.


I would like to start this blog with some basic ideas about density estimation and nonparametric regression.

The estimation of a probability density function (pdf) without assuming a parametric form for it is called nonparametric density estimation. This kind of estimation can serve as a building block in nonparametric regression.

The typical regression problem is set up as follows. Assume that we have a set of explanatory variables $X_1,\ldots,X_d$ and an explained variable $Y$ related in the following way:
\begin{equation}
Y=\mathbf{X}^{\top}\beta+\varepsilon\label{eq:linear_model}
\end{equation}
where $\mathbf{X}=(X_1,\ldots,X_d)^{\top}$, $\varepsilon$ is independent of $\mathbf{X}$, and $\E[\varepsilon]=0$.

Taking conditional expectations on both sides (and using that $\E[\varepsilon\vert\mathbf{X}]=0$), we can see model \eqref{eq:linear_model} as
\begin{equation}
\label{eq:cond_model}
\E[Y\vert\mathbf{X}]=X_{1}\beta_{1}+\cdots+X_{d}\beta_{d}=\mathbf{X}^{\top}\beta,
\end{equation}

where $\E[Y\vert\mathbf{X}]$ is the conditional expectation of $Y$ given $\mathbf{X}$.

[Figure: example of linear regression with one independent variable.]
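In the spirit of the figure, here is a minimal sketch in Python (assuming NumPy) of simulating data from model \eqref{eq:linear_model} with a single explanatory variable and recovering $\beta$ by least squares; the coefficient value, noise level, and sample size are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate Y = X*beta + eps with an illustrative beta = 2.0
n, beta = 200, 2.0
X = rng.uniform(-1.0, 1.0, size=n)
eps = rng.normal(0.0, 0.5, size=n)  # independent of X, mean zero
Y = beta * X + eps

# Least-squares estimate of beta (one regressor, no intercept)
beta_hat = np.sum(X * Y) / np.sum(X ** 2)
print(beta_hat)  # close to 2.0 for a sample this size
```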

Unfortunately, for many real problems, the linear form in \eqref{eq:cond_model} is not flexible enough.

To tackle this situation, we can generalize it to
\begin{equation}
\E[Y\vert\mathbf{X}]=m(\mathbf{X}),
\end{equation}
where $m(\cdot)$ is the true, unknown regression function.

Just to put things in perspective, suppose that $\mathbf{X}=(X_1,X_2)$ and the real model for the conditional expectation is
\begin{equation}
\label{eq:example}
\E[Y\vert\mathbf{X}]=\beta_1 X_1 + \beta_2 X_2 +\beta_3 X_2^2.
\end{equation}

Now, given a data sample, you have to estimate $\E[Y\vert\mathbf{X}]$ as accurately as possible in a single trial. That means you cannot change the model if the data do not fit it well.

In parametric models, the task is relatively easy: estimate the $\beta$’s producing the minimum error given the formula \eqref{eq:example}. The advantage is that, thanks to the parametric structure of your model, you have to deal with only a finite number of parameters.
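To make this concrete, here is a minimal sketch of that parametric task for model \eqref{eq:example}, again assuming NumPy; the coefficient values and noise level are hypothetical. Since the formula is known, estimating the three $\beta$’s reduces to ordinary least squares on the design matrix $(X_1, X_2, X_2^2)$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample from model (eq:example) with made-up betas
n = 500
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 1.0 * X1 - 0.5 * X2 + 2.0 * X2 ** 2 + rng.normal(0.0, 0.3, size=n)

# The parametric structure is known, so the task is just least
# squares on the design matrix implied by the formula
D = np.column_stack([X1, X2, X2 ** 2])
beta_hat, *_ = np.linalg.lstsq(D, Y, rcond=None)
print(beta_hat)  # approximately [1.0, -0.5, 2.0]
```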

However, suppose you relax the latter condition, and the only assumption is that there exists a function $m(\cdot)$ relating $Y$ to $\mathbf{X}$ (which could be required to be differentiable).

The question here is: how can we perform the regression under these new rules? This type of regression is called nonparametric regression, and we will return to it later.
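Although later posts develop this properly, a small preview may help fix ideas. The following is a minimal sketch of one classical nonparametric estimator, the Nadaraya–Watson kernel estimator, which estimates $m(x)$ as a locally weighted average of the observed responses; the Gaussian kernel, the bandwidth $h=0.3$, and the toy data are all illustrative choices, not a prescription.

```python
import numpy as np

def nadaraya_watson(x0, X, Y, h):
    """Estimate m(x0) as a weighted average of the Y's, where the
    weights decay as X_i moves away from x0; h is the bandwidth."""
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)  # Gaussian kernel weights
    return np.sum(w * Y) / np.sum(w)

# Toy usage: estimate a nonlinear m without writing down its formula
rng = np.random.default_rng(2)
X = rng.uniform(-3.0, 3.0, size=300)
Y = np.sin(X) + rng.normal(0.0, 0.2, size=300)
print(nadaraya_watson(0.0, X, Y, h=0.3))  # close to sin(0) = 0
```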

To start, we will build up nonparametric density estimation as a preamble in the next posts.
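As a small foretaste of those posts, here is a sketch of a kernel density estimator, which averages kernel bumps centered at the observations; again, the Gaussian kernel and the bandwidth are illustrative assumptions, not the construction the posts will follow in detail.

```python
import numpy as np

def kde(x0, X, h):
    """Kernel density estimate at x0: the average of Gaussian bumps
    of width h centered at each observation."""
    u = (x0 - X) / h
    return np.mean(np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)) / h

# Toy usage: the standard normal density at 0 is about 0.3989
rng = np.random.default_rng(3)
X = rng.normal(size=1000)
print(kde(0.0, X, h=0.3))
```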
