mjtsai1974's Dev Blog Welcome to mjt's AI world

Jensen's Inequality

Prologue To Jensen's Inequality

Jensen's inequality is used in the proofs of many well-known lemmas and theorems. It reveals that the equality $E\lbrack g(X)\rbrack$=$g(E\lbrack X\rbrack)$ rarely holds for a nonlinear function $g$. Without actually computing the distribution of $g(X)$, it lets us relate $E\lbrack g(X)\rbrack$ to $g(E\lbrack X\rbrack)$.

Jensen's Inequality

Let $g$ be a convex function, and let $X$ be any random variable for which $E\lbrack X\rbrack$ and $E\lbrack g(X)\rbrack$ exist. Then
$\;\;g(E\lbrack X\rbrack)\le E\lbrack g(X)\rbrack$
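
As a quick numerical illustration of the statement, here is a minimal sketch that compares the sample average of $g(X)$ with $g$ of the sample average. The helper name `jensen_gap`, the choice $g$=$\exp$, the standard normal samples and the seed are all assumptions made for this example; they do not come from the original post.

```python
import math
import random


def jensen_gap(g, samples):
    """Return mean(g(x)) - g(mean(x)) over the given samples.

    For a convex g this gap is nonnegative (up to floating-point error),
    because the inequality applies to the empirical distribution as well.
    """
    mean_x = sum(samples) / len(samples)
    mean_g = sum(g(x) for x in samples) / len(samples)
    return mean_g - g(mean_x)


# Hypothetical setup: 10,000 standard normal draws with a fixed seed.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

gap_exp = jensen_gap(math.exp, samples)                    # convex, nonlinear g
gap_linear = jensen_gap(lambda x: 3.0 * x + 1.0, samples)  # linear g

print("gap for exp:   ", gap_exp)
print("gap for linear:", gap_linear)
```

With the convex, nonlinear $g$ the gap should come out clearly positive, while for the linear $g$ it should vanish up to rounding, which matches the remark that equality is the exception for nonlinear $g$.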

➀why focus on convex functions?
Recall that a twice-differentiable convex function has a nonnegative second derivative, that is, $g''(x)\ge 0$. Its curve is bowl-shaped.
➁suppose $X$ is a random variable with 2 outcomes: event $e_1$, with probability $\frac {4}{7}$, on which $X$=$a$, and event $e_2$, with probability $\frac {3}{7}$, on which $X$=$b$.
You can treat the 2 values $a$ and $b$ as the inputs to $g$.

Convexity of $g$ forces every line segment connecting 2 points on the curve to lie on or above the part of the curve in between.

➂if we choose the line segment from $(a,g(a))$ to $(b,g(b))$, then
$(E\lbrack X\rbrack,E\lbrack g(X)\rbrack)$
=$(\frac {4}{7}\cdot a+\frac {3}{7}\cdot b,\frac {4}{7}\cdot g(a)+\frac {3}{7}\cdot g(b))$
=$\frac {4}{7}\cdot (a,g(a))+\frac {3}{7}\cdot (b,g(b))$
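Because the weights $\frac {4}{7}$ and $\frac {3}{7}$ are nonnegative and sum to 1, the point $(E\lbrack X\rbrack,E\lbrack g(X)\rbrack)$ lies on that line segment, so it must lie on or above the point $(E\lbrack X\rbrack,g(E\lbrack X\rbrack))$ on the curve.
Therefore, we have $g(E\lbrack X\rbrack)\le E\lbrack g(X)\rbrack$ for this 2-point random variable. The same geometric argument works for any 2 probabilities that sum to 1.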
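
The 2-point argument above can be checked with exact arithmetic. The following is only a sketch under assumed values: the post does not fix $a$, $b$ or $g$, so $a$=$0$, $b$=$7$ and $g(x)$=$x^2$ are picked here purely for illustration, and the helper name `two_point_jensen` is hypothetical.

```python
from fractions import Fraction


def two_point_jensen(g, a, b, p_a):
    """Return (E[X], g(E[X]), E[g(X)]) for X = a w.p. p_a and X = b w.p. 1 - p_a."""
    p_b = 1 - p_a
    mean_x = p_a * a + p_b * b            # x-coordinate of the point on the chord
    chord_height = p_a * g(a) + p_b * g(b)  # y-coordinate of that point, E[g(X)]
    curve_height = g(mean_x)              # height of the curve at E[X]
    return mean_x, curve_height, chord_height


def square(x):
    """An assumed convex g, chosen only for this example."""
    return x * x


# Probabilities 4/7 and 3/7 come from the text; a = 0 and b = 7 are assumptions.
mean_x, curve, chord = two_point_jensen(
    square, Fraction(0), Fraction(7), Fraction(4, 7)
)
print(mean_x, curve, chord)  # 3 9 21
```

Here $E\lbrack X\rbrack$=$3$, the curve height is $g(3)$=$9$ and the chord height is $E\lbrack g(X)\rbrack$=$21$, so the point on the chord sits above the point on the curve, as the argument predicts.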