Lec-3
Outline¶
- Strongly Convex Functions
- Definition of The Strongly Convex and The Class \(S_\mu^1 (\R^n)\)
- Property of Strongly Convex Function
- Equivalent Definitions
- Examples
- Smooth and Strongly Convex Functions
- The Class \(S_{\mu, L}^{1, 1} (\R^n)\)
- Property of Smooth and Strongly Convex Function
- Conclusion
Strongly Convex Functions¶
Definition of The Strongly Convex and The Class \(S_\mu^1 (\R^n)\)¶
如果 \(f(x)\) 连续可微,且 \(\exists \mu > 0, \forall x, y \in \R^n\):
那么,就称 \(f(x)\) strongly convex, where \(\mu\) is the convexity parameter of \(f\).
Comparison
- \(S^1_\mu: f(y) \geq f(x) + \langle \nabla f(x), y-x \rangle + \frac 1 2 \mu \|y-x\|^2\)
- \(C_\mu^{1,1}: f(x) + \langle \nabla f(x), y-x \rangle + \frac 1 2 \mu \|y-x\|^2 \geq f(y) \geq f(x) + \langle \nabla f(x), y-x \rangle - \frac 1 2 \mu \|y-x\|^2\)
性质¶
Note: 也就是在最优解附近,函数的变化率是足够大的。
Note: 也就是,强凸系数是线性叠加的。
Proof (Cont.)
从而,\(\nabla \phi(x) = \nabla f(y) |_{y=x} - \nabla f(x) = 0\)。因此,由 Theorem 30:
$$ \begin{aligned} \phi(x) &= \min_\nu \phi(\nu) \geq \min_\nu \phi(y) + \langle \nabla f(y) , \nu - y \rangle + \frac 1 2 \mu | \nu - y |^2 \newline &= \phi(y) - \frac 1 {2\mu} | \nabla \phi(y) |^2 \end{aligned} $$ 展开,得到:
进一步,互换 \(x, y\):
然后两边相加:
$$ \frac 1 {\mu} | \nabla f(x) - \nabla f(y) |^2 \geq \langle \nabla f(x) - \nabla f(y), x - y \rangle $$ \(\blacksquare\)
Note: (25) 就是割线不等式
Equivalent Definitions¶
上面两个都是等价定义(证明略)。
\(S_\mu^2 (\R^n) \subseteq S_\mu^1 (\R^n)\)¶
如果 strongly convex function 同时还可以求二阶导(Hessian matrix),那么我们就有更加简单的判定方式:
证明略
Smooth and Strongly Convex Functions¶
Definition¶
- 注意条件数的定义
Example¶
性质¶
Some Useful Lemmas
Theorem 36 (Theorem 2.1.12)
If \(f \in \mathcal{S}_{\mu, L}^{1,1}\left(\mathbb{R}^n\right)\), then for any \(\boldsymbol{x}, \boldsymbol{y} \in \mathbb{R}^n\) we have $$ \begin{aligned} \langle\nabla f(\boldsymbol{x})-\nabla f(\boldsymbol{y}), \boldsymbol{x}-\boldsymbol{y}\rangle & \geq \frac{\mu L}{\mu+L}|\boldsymbol{x}-\boldsymbol{y}|^2 \newline & +\frac{1}{\mu+L}|\nabla f(\boldsymbol{x})-\nabla f(\boldsymbol{y})|^2 \end{aligned} $$
Proof.
Denote \(\phi(\boldsymbol{x})=f(\boldsymbol{x})-\frac{1}{2} \mu\|\boldsymbol{x}\|^2\). Then \(\nabla \phi(\boldsymbol{x})=\nabla f(\boldsymbol{x})-\mu \boldsymbol{x}\).
We check \(\phi \in \mathcal{F}_{L-\mu}^{1,1}\left(\mathbb{R}^n\right)\) as follows.
Case 1: If \(\mu=L\),
Add up the upper two inequalities, the theorem can be proved.
Case 2: If \(\mu<L\), $$ \begin{aligned} \langle\nabla f(\boldsymbol{x})-\nabla f(\boldsymbol{y}), \boldsymbol{x}-\boldsymbol{y}\rangle & \geq \frac{1}{L}|\nabla f(\boldsymbol{x})-\nabla f(\boldsymbol{y})|^2 \newline \phi(\boldsymbol{x}) & =f(\boldsymbol{x})-\frac{1}{2} \mu|\boldsymbol{x}|^2 \end{aligned} $$
从而可以推出: $$ \langle\nabla \phi(\boldsymbol{x})-\nabla \phi(\boldsymbol{y}), \boldsymbol{x}-\boldsymbol{y}\rangle \geq \frac{1}{L-\mu}|\nabla \phi(\boldsymbol{x})-\nabla \phi(\boldsymbol{y})|^2 $$.
Consider the LHS of the last inequality. We have
Consider the RHS of the last inequality. We have
After some rearrangements, we have
Conclusion¶
不同的 continuity,对于 0、1、2 阶的情况分别是什么?
Hierarchy of Functions¶
其中,PŁ 虽然比不上强凸,但是也是很好的性质。下面我们证明强凸 \(\implies\) PŁ,以及举反例说明 PŁ \(\centernot\implies\) 强凸。
\(z = f(x,y) = 2x^2\)
不难看出,由于沿着 y 轴,完全是平坦的,因此绝对非强凸(甚至非凸);但是