Derivations of Several Inequalities for Reinforcement Learning
While studying the multi-armed bandit setting in reinforcement learning, several probability inequalities were used in lecture without derivation, so this post records the derivations.
Inequalities covered:
Markov, Chebyshev, Chernoff bounding, Hoeffding
Markov’s Inequality
For a nonnegative random variable \(X\): \[ \begin{equation} \mathbb{P}(X\ge a)\le\frac{\mathbb{E}[X]}{a},\quad a>0 \end{equation} \]
Proof 1
Let \[Y_a=\begin{cases}0, & X<a\\a, & X\ge a\end{cases}\] Then \(Y_a\le X\) (the case \(X<a\) uses \(X\ge 0\)), which implies \(\mathbb{E}[Y_a]\le \mathbb{E}[X]\). Combining this with the fact that \(\mathbb{E}[Y_a]=\mathbb{P}(X\ge a)\times a\), we obtain \[ \begin{gather} \mathbb{P}(X\ge a)\le\frac{\mathbb{E}[X]}{a},\quad a>0 \end{gather} \]
as needed.
Proof 2
Assuming \(X\ge 0\) has density \(p\), \[ \begin{gather} \mathbb{E}[X]=\int_0^\infty xp(x)dx\ge\int_a^\infty xp(x)dx\ge\int_a^\infty a p(x)dx=\mathbb{P}(X\ge a)\times a \end{gather} \]
Proof 3
Let \(Y=\frac{X}{a}\), which gives \(\mathbb{E}[Y]=\frac{\mathbb{E}[X]}{a}\) and \(\mathbb{P}(X\ge a)=\mathbb{P}(Y\ge 1)\). We want to show \(\mathbb{P}(Y\ge 1)\le \mathbb{E}[Y]\), for which it suffices to show \(\mathbb{I}\{Y\ge1\}\le Y\) pointwise (taking expectations on both sides then gives the desired result).
Case 1: \(Y\ge1\). The indicator equals 1 and \(Y\ge 1\), so the inequality holds.
Case 2: \(Y<1\). The indicator equals 0 and \(Y\ge 0\) (since \(X\ge 0\) and \(a>0\)), so the inequality holds.
Hence Markov's Inequality holds.
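As a quick worked example (added here for illustration), take \(X\sim\mathrm{Exp}(1)\), which is nonnegative with \(\mathbb{E}[X]=1\), so Markov's Inequality bounds the exact tail \(e^{-a}\) by \(1/a\): \[ \begin{gather} \mathbb{P}(X\ge 3)=e^{-3}\approx 0.050\le\frac{1}{3}\approx 0.333 \end{gather} \] The bound holds for every nonnegative random variable, but as the example shows it can be quite loose.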
Chebyshev’s Inequality
\[\mathbb{P}(|X-\mu|\ge c)\le\frac{\sigma^2}{c^2}, c>0\]
\[\mathbb{P}(|X-\mu|\ge k\sigma)\le\frac{1}{k^2}, k>0\]
where \(\mu\) is the mean and \(\sigma\) is the standard deviation of \(X\), both assumed finite.
Proof
Apply Markov’s Inequality to the nonnegative random variable \((X-\mu)^2\) with \(a:=c^2\): \(\mathbb{P}((X-\mu)^2\ge c^2)\le \frac{\mathbb{E}[(X-\mu)^2]}{c^2}=\frac{\sigma^2}{c^2}\). Since \(|X-\mu|\ge c\) if and only if \((X-\mu)^2\ge c^2\), this gives \[\mathbb{P}(|X-\mu|\ge c)\le\frac{\sigma^2}{c^2}\]
By letting \(c:=k\sigma\), we see \[\mathbb{P}(|X-\mu|\ge k\sigma)\le\frac{\sigma^2}{k^2\sigma^2}=\frac{1}{k^2}\]
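For instance, taking \(k=2\) gives a distribution-free guarantee that at most a quarter of the probability mass lies more than two standard deviations from the mean: \[ \begin{gather} \mathbb{P}(|X-\mu|\ge 2\sigma)\le\frac{1}{4} \end{gather} \] For a Gaussian the exact probability is about \(0.046\), so the bound applies to every finite-variance distribution but is far from tight for any particular one.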
Chernoff Bounding Method
Let \(M_Z(s):=\mathbb{E}[e^{sZ}]\) be the moment generating function.
\[\mathbb{P}(Z\ge t)\le \inf_{s>0}e^{-st}M_Z(s)\]
Proof
For any \(s>0\), the map \(z\mapsto e^{sz}\) is increasing and \(e^{sZ}\ge 0\), so Markov’s Inequality gives \(\mathbb{P}(Z\ge t)=\mathbb{P}(e^{sZ}\ge e^{st})\le e^{-st}\mathbb{E}[e^{sZ}]=e^{-st}M_Z(s)\). Taking the infimum over all \(s>0\) gives the result.
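As a worked example, let \(Z\sim\mathcal{N}(0,\sigma^2)\), whose moment generating function is \(M_Z(s)=e^{\sigma^2 s^2/2}\). For \(t>0\), minimizing \(e^{-st+\sigma^2 s^2/2}\) over \(s>0\) gives \(s^{*}=t/\sigma^2\), so the Chernoff bound yields the sub-Gaussian tail \[ \begin{gather} \mathbb{P}(Z\ge t)\le \inf_{s>0} e^{-st+\frac{\sigma^2 s^2}{2}}=e^{-\frac{t^2}{2\sigma^2}} \end{gather} \] which decays much faster than what Markov or Chebyshev give for the same distribution.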
Hoeffding’s Inequality
Let \(Z_1,\dots,Z_n\) be independent real-valued random variables with \(a_i\le Z_i\le b_i\) almost surely, for some constants \(a_i, b_i\). Let \(S_n=\sum_{i=1}^n Z_i\). Then for all \(t>0\),
\[\mathbb{P}(|S_n-\mathbb{E}[S_n]|\ge t)\le 2\exp\left(-\frac{2t^2}{\sum_{i=1}^n (b_i-a_i)^2}\right)\]
Proof
Using the Chernoff Bounding Method (the infimum can be replaced by a minimum since it is attained), we have \(\mathbb{P}(S_n-\mathbb{E}[S_n]\ge t)\le \min_{s>0}e^{-st}\mathbb{E}[e^{s(S_n-\mathbb{E}[S_n])}]\). By independence the expectation factorizes as \(\prod_{i=1}^n\mathbb{E}[e^{s(Z_i-\mathbb{E}[Z_i])}]\), and Hoeffding's Lemma bounds each factor by \(e^{s^2(b_i-a_i)^2/8}\). Minimizing \(e^{-st+\frac{s^2}{8}\sum_i(b_i-a_i)^2}\) over \(s>0\), attained at \(s=\frac{4t}{\sum_i(b_i-a_i)^2}\), gives \(\mathbb{P}(S_n-\mathbb{E}[S_n]\ge t)\le e^{-\frac{2t^2}{\sum_i(b_i-a_i)^2}}\). Applying the same argument to \(-S_n\) and taking a union bound yields the two-sided statement.
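To connect back to the bandit setting, below is a minimal numerical sanity check of the bound for bounded Bernoulli rewards (with \(a_i=0, b_i=1\) the bound simplifies to \(2e^{-2t^2/n}\), the form typically used for confidence intervals over empirical mean rewards). The parameters n, p, t and the trial count are arbitrary illustrative choices, not values from the lecture.

```python
import numpy as np

# Monte Carlo sanity check of Hoeffding's Inequality for Bernoulli rewards.
# With a_i = 0 and b_i = 1, sum_i (b_i - a_i)^2 = n, so the bound is 2*exp(-2*t^2/n).
rng = np.random.default_rng(0)

n, p = 100, 0.3        # n i.i.d. Bernoulli(p) rewards per run
trials = 100_000       # number of Monte Carlo runs
t = 10.0               # deviation threshold for S_n

# Each row is one run of n rewards; S_n is the row sum.
samples = rng.binomial(1, p, size=(trials, n))
S_n = samples.sum(axis=1)

empirical = np.mean(np.abs(S_n - n * p) >= t)
hoeffding = 2 * np.exp(-2 * t**2 / n)

print(f"empirical P(|S_n - E S_n| >= {t}): {empirical:.4f}")
print(f"Hoeffding bound:                   {hoeffding:.4f}")
assert empirical <= hoeffding
```

With these settings the empirical deviation probability comes out well below the bound (roughly 0.04 versus 0.27), consistent with Hoeffding being valid but conservative.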
