**1. Preliminaries**

Given a random variable $latex {X}&fg=000000$, we define the cumulative distribution function (or distribution function) as follows.

Definition: The cumulative distribution function, or cdf, is the function $latex {F_{X}:{\mathbb R}\rightarrow[0,1]}&fg=000000$ defined by $latex {F_{X}(x)=\mathbb P(X\leq x)}&fg=000000$.

The following theorem characterizes the class of distribution functions.

Theorem: A function $latex {F}&fg=000000$ mapping the real line to $latex {[0,1]}&fg=000000$ is a cdf for some probability measure $latex {\mathbb P}&fg=000000$ if and only if $latex {F}&fg=000000$ satisfies the following three conditions:

(i) $latex {F}&fg=000000$ is non-decreasing: $latex {x_{1}\leq x_{2}}&fg=000000$ implies that $latex {F(x_{1})\leq F(x_{2})}&fg=000000$.

(ii) $latex {F}&fg=000000$ is normalized: $latex {\lim_{x\rightarrow-\infty}F(x)=0}&fg=000000$ and $latex {\lim_{x\rightarrow\infty}F(x)=1.}&fg=000000$

(iii) $latex {F}&fg=000000$ is right-continuous: $latex {F(x)=F(x^{+})}&fg=000000$ for all $latex {x}&fg=000000$, where $latex {F(x^{+})=\underset{\stackrel{y\rightarrow x}{y>x}}{\lim}F(y).}&fg=000000$
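As a quick sanity check of these three conditions, here is a small Python sketch; the choice of the Exponential(1) cdf $latex {F(x)=1-e^{-x}}&fg=000000$ is my own illustrative assumption, not from the text:

```python
import math

def F(x):
    # cdf of the Exponential(1) distribution: F(x) = 1 - exp(-x) for x >= 0, else 0
    return 1 - math.exp(-x) if x >= 0 else 0.0

xs = [x / 10 for x in range(-50, 51)]

# (i) non-decreasing on a grid of points
assert all(F(a) <= F(b) for a, b in zip(xs, xs[1:]))
# (ii) normalized: F(x) -> 0 as x -> -infinity and F(x) -> 1 as x -> +infinity
assert F(-1e6) == 0.0 and abs(F(1e6) - 1) < 1e-12
# (iii) right-continuity at x = 0, the only point where the formula changes
assert abs(F(1e-12) - F(0)) < 1e-9
```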

Also, we can prove that the number of discontinuities of $latex {F(x)}&fg=000000$ is at most countable (see Froda’s theorem).

These results hold if we take a random vector $latex {X=(X_{1},\ldots,X_{k})}&fg=000000$, a vector $latex {x=(x_{1},\dots,x_{k})}&fg=000000$ and we define the distribution function of $latex {X}&fg=000000$ as $latex {F:{\mathbb R}^{k}\rightarrow[0,1]}&fg=000000$ such that $latex {F(x)=\mathbb P(X\leq x)=\mathbb P(X_{1}\leq x_{1},\dots,X_{k}\leq x_{k})}&fg=000000$, where the inequality $latex {X\leq x}&fg=000000$ is understood componentwise.
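A short Monte Carlo sketch makes the componentwise reading of $latex {X\leq x}&fg=000000$ concrete; the choice of a vector with two independent Uniform(0,1) coordinates is a hypothetical example of mine:

```python
import random

random.seed(0)
N = 100_000
# X = (X1, X2) with independent Uniform(0,1) coordinates
samples = [(random.random(), random.random()) for _ in range(N)]

def joint_cdf(x1, x2):
    # F(x) = P(X <= x) = P(X1 <= x1 and X2 <= x2), estimated empirically
    return sum(1 for (a, b) in samples if a <= x1 and b <= x2) / N

# for independent uniforms, F(x1, x2) = x1 * x2 on the unit square
assert abs(joint_cdf(0.5, 0.5) - 0.25) < 0.01
```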

A sequence of random vectors $latex {X_{n}}&fg=000000$ is said to **converge in distribution** to a random vector $latex {X}&fg=000000$ if

$latex \displaystyle \mathbb P(X_{n}\leq x)\rightarrow F(x)=\mathbb P(X\leq x) &fg=000000$

for every $latex {x}&fg=000000$ at which the limit distribution function $latex {F}&fg=000000$ is continuous. This mode of convergence is also called **weak convergence** or **convergence in law**. We will denote convergence in distribution by $latex {X_{n}\rightsquigarrow X}&fg=000000$.
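As a numerical illustration (my own example, not from the text): take $latex {X_{n}}&fg=000000$ to be a standardized sum of $latex {n}&fg=000000$ uniforms, which converges in distribution to a standard normal by the central limit theorem. A Python sketch comparing $latex {\mathbb P(X_{n}\leq x)}&fg=000000$ with the normal cdf:

```python
import math
import random

random.seed(1)

def Phi(x):
    # standard normal cdf, written via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def F_n(x, n, trials=20_000):
    # empirical cdf of X_n = (U_1 + ... + U_n - n/2) / sqrt(n/12), U_i ~ Uniform(0,1)
    count = 0
    for _ in range(trials):
        s = sum(random.random() for _ in range(n))
        if (s - n / 2) / math.sqrt(n / 12) <= x:
            count += 1
    return count / trials

# P(X_n <= x) is close to Phi(x); here Phi is continuous at every x
for x in (-1.0, 0.0, 1.0):
    assert abs(F_n(x, n=30) - Phi(x)) < 0.02
```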

**2. The Portmanteau Lemma**

The Portmanteau lemma gives a number of equivalent characterizations of weak convergence. Most of these characterizations are useful mainly in proofs; the last one also has intuitive value.

Theorem: For any random vectors $latex {X_{n}}&fg=000000$ and $latex {X}&fg=000000$ the following statements are equivalent:

(i) $latex {\mathbb P(X_{n}\leq x)\rightarrow\mathbb P(X\leq x)}&fg=000000$ for all continuity points of $latex {x\mapsto\mathbb P(X\leq x)}&fg=000000$.

(ii) $latex {\mathbb E\left(f(X_{n})\right)\rightarrow\mathbb E\left(f(X)\right)}&fg=000000$ for all bounded, continuous functions $latex {f}&fg=000000$.

(iii) $latex {\mathbb E\left(f(X_{n})\right)\rightarrow\mathbb E\left(f(X)\right)}&fg=000000$ for all bounded, Lipschitz functions $latex {f}&fg=000000$.

(iv) $latex {\liminf\mathbb E\left(f(X_{n})\right)\geq\mathbb E\left(f(X)\right)}&fg=000000$ for all nonnegative, continuous functions $latex {f}&fg=000000$.

(v) $latex {\liminf\mathbb P\left(X_{n}\in G\right)\geq\mathbb P\left(X\in G\right)}&fg=000000$ for every open set $latex {G}&fg=000000$.

(vi) $latex {\limsup\mathbb P\left(X_{n}\in F\right)\leq\mathbb P\left(X\in F\right)}&fg=000000$ for every closed set $latex {F}&fg=000000$.

(vii) $latex {\mathbb P\left(X_{n}\in B\right)\rightarrow\mathbb P\left(X\in B\right)}&fg=000000$ for all Borel sets $latex {B}&fg=000000$ with $latex {\mathbb P\left(X\in\partial B\right)=0}&fg=000000$, where $latex {\partial B=\bar{B}-\mathring{B}}&fg=000000$ is the boundary of $latex {B}&fg=000000$.

*Proof:* $latex {(i)\Rightarrow(ii)}&fg=000000$ Assume first that the distribution function of $latex {X}&fg=000000$ is continuous. Then condition $latex {(i)}&fg=000000$ implies that $latex {\mathbb P(X_{n}\in I)\rightarrow\mathbb P(X\in I)}&fg=000000$ for every rectangle $latex {I}&fg=000000$, since by inclusion-exclusion the probability of a rectangle is a finite linear combination of values of the distribution function.

Let $latex {f}&fg=000000$ be a bounded continuous function (by homogeneity we may assume that $latex {\|f\|_{\infty}=1}&fg=000000$). Fix $latex {\epsilon>0}&fg=000000$ and choose a closed rectangle $latex {I}&fg=000000$ such that $latex {\mathbb P(X\notin I)\leq\epsilon}&fg=000000$. Since $latex {I}&fg=000000$ is compact, the function $latex {f}&fg=000000$ is uniformly continuous on $latex {I}&fg=000000$: there exists $latex {\eta>0}&fg=000000$ such that $latex {|x-y|\leq\eta\implies|f(x)-f(y)|\leq\epsilon}&fg=000000$ for all $latex {(x,y)\in I^{2}}&fg=000000$. By compactness, we can cover $latex {I}&fg=000000$ with finitely many rectangles $latex {(I_{j})_{j=1,\ldots,p}}&fg=000000$ of radius $latex {\eta}&fg=000000$, so that on each $latex {I_{j}}&fg=000000$ the function $latex {f}&fg=000000$ varies by at most $latex {\epsilon}&fg=000000$.

We choose a point $latex {x_{j}}&fg=000000$ in each $latex {I_{j}}&fg=000000$ and define the function $latex {f_{\epsilon}:=\sum_{j=1}^{p}f(x_{j}){\bf 1}_{I_{j}}}&fg=000000$. Then $latex {\vert f-f_{\epsilon}\vert\leq\epsilon}&fg=000000$ on $latex {I}&fg=000000$ and

$latex \displaystyle \begin{array}{rl} |\mathbb E\left(f(X_{n})\right)-\mathbb E\left(f_{\epsilon}(X_{n})\right)| & \leq\epsilon+\mathbb P\left(X_{n}\notin I\right),\\ |\mathbb E\left(f_{\epsilon}(X_{n})\right)-\mathbb E\left(f_{\epsilon}(X)\right)| & \displaystyle \leq\sum_{j=1}^{p}\left|\mathbb P\left(X_{n}\in I_{j}\right)-\mathbb P\left(X\in I_{j}\right)\right||f(x_{j})|,\\ |\mathbb E\left(f(X)\right)-\mathbb E\left(f_{\epsilon}(X)\right)| & \leq\epsilon+\mathbb P\left(X\notin I\right)\leq2\epsilon. \end{array} &fg=000000$

By hypothesis $latex {\mathbb P\left(X_{n}\notin I\right)\rightarrow\mathbb P\left(X\notin I\right)\leq\epsilon}&fg=000000$, so $latex {\mathbb P\left(X_{n}\notin I\right)\leq2\epsilon}&fg=000000$ for sufficiently large $latex {n}&fg=000000$. Also, for sufficiently large $latex {n}&fg=000000$,

$latex \displaystyle \sum_{j=1}^{p}|\mathbb P\left(X_{n}\in I_{j}\right)-\mathbb P\left(X\in I_{j}\right)||f(x_{j})|\leq p\sup_{j}|\mathbb P\left(X_{n}\in I_{j}\right)-\mathbb P\left(X\in I_{j}\right)|\leq\epsilon. &fg=000000$

Combining the three displays with the triangle inequality, for sufficiently large $latex {n}&fg=000000$,

$latex \displaystyle \vert\mathbb E\left(f(X_{n})\right)-\mathbb E\left(f(X)\right)\vert\leq6\epsilon, &fg=000000$

which proves the result when the distribution function of $latex {X}&fg=000000$ is continuous.

In the general case, the set of discontinuity points of the distribution function is at most countable. Therefore, enlarging the rectangle $latex {I}&fg=000000$ slightly if necessary, we may assume that its boundary contains no discontinuity point; similarly, we may shrink the rectangles $latex {I_{j}}&fg=000000$ so that their boundaries avoid the discontinuity points.

$latex {(ii)\implies(iii)}&fg=000000$ Evident, since bounded Lipschitz functions are in particular bounded and continuous.

$latex {(ii)\implies(iv)}&fg=000000$ Let $latex {f}&fg=000000$ be a nonnegative continuous function. For $latex {M>0}&fg=000000$, define the function $latex {f_{M}}&fg=000000$ by $latex {f_{M}(x)=\inf(f(x),M)}&fg=000000$. This function is nonnegative, continuous and bounded by $latex {M}&fg=000000$, and $latex {f_{M}\leq f}&fg=000000$. We have for all $latex {n}&fg=000000$, $latex {\mathbb E\left(f_{M}(X_{n})\right)\leq\mathbb E\left(f(X_{n})\right).}&fg=000000$ By $latex {(ii)}&fg=000000$ the left-hand side converges to $latex {\mathbb E\left(f_{M}(X)\right)}&fg=000000$, and we deduce that $latex {\mathbb E\left(f_{M}(X)\right)\leq\liminf\mathbb E\left(f(X_{n})\right).}&fg=000000$ Letting $latex {M\rightarrow\infty}&fg=000000$, the monotone convergence theorem gives $latex {\mathbb E\left(f_{M}(X)\right)\uparrow\mathbb E\left(f(X)\right)}&fg=000000$, which yields $latex {(iv)}&fg=000000$.
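To see the final monotone-convergence step concretely, here is a toy closed-form computation; the choices $latex {X\sim\mathrm{Uniform}(0,1)}&fg=000000$ and $latex {f(x)=1/\sqrt{x}}&fg=000000$ (nonnegative, unbounded, with $latex {\mathbb E\left(f(X)\right)=2}&fg=000000$) are my own assumptions:

```python
def E_fM(M):
    # E[min(1/sqrt(X), M)] for X ~ Uniform(0,1), computed in closed form:
    # the integral of M over (0, 1/M^2) plus the integral of x^(-1/2)
    # over (1/M^2, 1), which equals 1/M + (2 - 2/M) = 2 - 1/M
    return 2 - 1 / M

# E[f_M(X)] increases to E[f(X)] = 2 as M grows (monotone convergence)
values = [E_fM(M) for M in (1, 2, 10, 100, 1000)]
assert values == sorted(values)
assert abs(values[-1] - 2) < 0.01
```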

$latex {(iv)\implies(ii)}&fg=000000$ Let $latex {f}&fg=000000$ be a bounded continuous function; then the functions $latex {f+\|f\|_{\infty}}&fg=000000$ and $latex {\|f\|_{\infty}-f}&fg=000000$ are nonnegative. By $latex {(iv)}&fg=000000$,

$latex \displaystyle \begin{array}{rl} \mathbb E\left(f(X)\right)+\|f\|_{\infty} & \leq\liminf\mathbb E\left(f(X_{n})\right)+\|f\|_{\infty},\\ \|f\|_{\infty}-\mathbb E\left(f(X)\right) & \leq\|f\|_{\infty}-\limsup\mathbb E\left(f(X_{n})\right). \end{array} &fg=000000$

Subtracting $latex {\|f\|_{\infty}}&fg=000000$ from both inequalities gives $latex {\limsup\mathbb E\left(f(X_{n})\right)\leq\mathbb E\left(f(X)\right)\leq\liminf\mathbb E\left(f(X_{n})\right)}&fg=000000$, so $latex {\lim\mathbb E\left(f(X_{n})\right)=\mathbb E\left(f(X)\right).}&fg=000000$

$latex {(iii)\implies(v)}&fg=000000$ Let $latex {G}&fg=000000$ be an open set of $latex {{\mathbb R}^{k}}&fg=000000$ and $latex {M}&fg=000000$ a strictly positive integer. Define the function $latex {f_{M}(x)=\inf\left(1,Md(x,G^{c})\right)}&fg=000000$. This function is $latex {M}&fg=000000$-Lipschitz and bounded by 1. The sequence $latex {(f_{M})}&fg=000000$ is increasing and converges pointwise to $latex {{\bf 1}_{G}}&fg=000000$. By $latex {(iii)}&fg=000000$ we know that $latex {\lim_{n}\mathbb E\left(f_{M}(X_{n})\right)=\mathbb E\left(f_{M}(X)\right)}&fg=000000$. Since $latex {\mathbb P\left(X_{n}\in G\right)\geq\mathbb E\left(f_{M}(X_{n})\right)}&fg=000000$, we conclude that $latex {\liminf\mathbb P\left(X_{n}\in G\right)\geq\mathbb E\left(f_{M}(X)\right).}&fg=000000$ Letting $latex {M\rightarrow\infty}&fg=000000$ and applying the monotone convergence theorem finishes the argument.
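The approximation of the indicator of an open set by these Lipschitz functions can be seen numerically. A sketch, assuming the concrete open set $latex {G=(0,1)\subset{\mathbb R}}&fg=000000$ (my own choice of example):

```python
# assumption: G = (0, 1) in R, so d(x, G^c) = min(x, 1 - x) inside G and 0 outside
def dist_to_complement(x):
    return max(0.0, min(x, 1 - x))

def f_M(x, M):
    # M-Lipschitz, bounded by 1, increases to the indicator of G as M grows
    return min(1.0, M * dist_to_complement(x))

x = 0.001  # a point of G close to the boundary
assert f_M(x, 1) == 0.001        # small M: still far below 1
assert f_M(x, 10_000) == 1.0     # large M: the indicator value is reached
assert all(f_M(1.5, M) == 0.0 for M in (1, 10, 100))  # outside G, f_M is 0
```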

$latex {(v)\iff(vi)}&fg=000000$ Immediate by passing to complements.

$latex {(v)+(vi)\implies(vii)}&fg=000000$ Let $latex {B}&fg=000000$ be a Borel set such that $latex {\mathbb P\left(X\in\partial B\right)=0}&fg=000000$. We have

$latex \displaystyle \mathbb P\left(X_{n}\in\mathring{B}\right)\leq\mathbb P\left(X_{n}\in B\right)\leq\mathbb P\left(X_{n}\in\bar{B}\right). &fg=000000$

We apply $latex {(vi)}&fg=000000$ to $latex {\bar{B}}&fg=000000$ and $latex {(v)}&fg=000000$ to $latex {\mathring{B}}&fg=000000$, and note that $latex {\mathbb P\left(X\in\partial B\right)=0}&fg=000000$ implies $latex {\mathbb P\left(X\in\mathring{B}\right)=\mathbb P\left(X\in B\right)=\mathbb P\left(X\in\bar{B}\right).}&fg=000000$

$latex {(vii)\implies(i)}&fg=000000$ Immediate: consider a continuity point $latex {x}&fg=000000$ of the distribution function of $latex {X}&fg=000000$ and take $latex {B=(-\infty,x]}&fg=000000$. $latex \Box&fg=000000$
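The boundary condition in $latex {(vii)}&fg=000000$ cannot be dropped. A toy illustration (my own example, with degenerate random variables): let $latex {X_{n}}&fg=000000$ be the constant $latex {1/n}&fg=000000$ and $latex {X}&fg=000000$ the constant $latex {0}&fg=000000$, so $latex {X_{n}\rightsquigarrow X}&fg=000000$, and take $latex {B=(-\infty,0]}&fg=000000$:

```python
# X_n = 1/n (a point mass) converges in distribution to X = 0,
# but P(X in boundary of B) = P(X = 0) = 1, so (vii) does not apply:
# P(X_n in B) = 0 for every n, while P(X in B) = 1.
def P_Xn_in_B(n):
    # X_n takes the single value 1/n, which is never <= 0
    return 1.0 if 1 / n <= 0 else 0.0

def P_X_in_B():
    # X takes the single value 0, which lies in B = (-inf, 0]
    return 1.0

assert all(P_Xn_in_B(n) == 0.0 for n in range(1, 1000))
assert P_X_in_B() == 1.0
```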

Next week, we will see other types of convergence for random variables and some very useful results.

**Sources:**

- Larry Wasserman, *All of Statistics: A Concise Course in Statistical Inference*, Springer, 2004.
- A. W. van der Vaart, *Asymptotic Statistics*, Cambridge University Press, 2000.

What does the notation {\mbox{{\bf 1}}_{G}} mean?

Sorry that should read 1 subscript G.

Hello Patrick!

The 1_G means the indicator function over the set G.

Thank you for a very helpful post. Been spending quite a number of hours with this now 🙂 One question: when proving (ii) -> (iv), exactly how do you conclude by the monotone convergence theorem? I can’t really seem to follow how we get rid of the M because of that. One thought I had was that we haven’t said anything about M other than it being positive, so would it be possible to argue that it’s arbitrary (but positive) such that we can let it tend to infinity? That doesn’t resonate well with your comment on concluding, however. Thanks!

Okay, I see now. The theorem includes taking the limit just as I was thinking. I’m a bit rusty it seems 🙂

Surely the line after “the three displays, applying the triangle inequality, show that” is a typo? It’s trivially 0.

Thanks for your helpful posting. I’m reading Van Der Vaart but embarrassed by some notations. For example what does X <= x mean for vectors? Does it mean element-wise compare, i.e.,

For X = (X1, X2, … ,Xn) and x = (x1, x2, …. , xn),

is the notation X <= x equivalent to

"X1 <= x1 and X2 <= x2 and ….. and Xn <= xn"?

I think my guess is right from contexts but still not 100% sure.