Section 9.4 Another Approach to Strong Duality
In this section, we provide an alternative approach to proving the Strong Duality Theorem, Theorem 4.2.4.
Theorem 9.4.1. Hyperplane Separation Theorem.
Given two disjoint convex sets \(U, V\subseteq \R^{n}\text{,}\) there is a hyperplane \(H=\{\mathbf{x}\in \R^n: \mathbf{h}^{\top}\mathbf{x}=k\} \) for some \(\mathbf{h}\in \R^n, k\in \R\text{,}\) such that \(\mathbf{h}^{\top}\mathbf{u} \geq k\) for all \(\mathbf{u}\in U\) and \(\mathbf{h}^{\top}\mathbf{v}\lt k\) for all \(\mathbf{v}\in V\text{.}\)
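As a concrete illustration (an example added for orientation, not part of the theorem statement), take \(U=\{\mathbf{x}\in\R^2: x_1\geq 1\}\) and \(V=\{\mathbf{x}\in\R^2: x_1\lt 0\}\text{.}\) These are disjoint convex sets, and choosing \(\mathbf{h}=\begin{bmatrix}1\\0\end{bmatrix}\) and \(k=\frac{1}{2}\) gives the separating hyperplane \(H=\{\mathbf{x}\in\R^2: x_1=\frac{1}{2}\}\text{,}\) since \(\mathbf{h}^\top\mathbf{u}=u_1\geq 1\geq \frac{1}{2}\) for all \(\mathbf{u}\in U\) and \(\mathbf{h}^\top\mathbf{v}=v_1\lt 0\lt \frac{1}{2}\) for all \(\mathbf{v}\in V\text{.}\)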
Activity 9.4.2.
(a)
Sketch two disjoint, non-empty convex sets \(U\) and \(V\text{.}\) What does \(H\) look like here?
(b)
Activity 9.4.3.
We prove the case of Theorem 9.4.1 where there are \(u_0\in U, v_0\in V\) that attain the minimum of \(\{\|u-v\|: u\in U, v\in V\}\text{;}\) that is, we assume such minimizers exist. (This is the case, for instance, when \(U\) and \(V\) are closed and at least one of them is bounded.)
(a)
Without loss of generality, let \(v_0=0\text{.}\) Why can we do this?
(b)
Let \(\mathbf{h} = u_0-v_0\text{.}\) Sketch \(u_0, v_0, \mathbf{h}\) and \(H:=\{\mathbf{x}\in \R^n: \mathbf{h}^\top \mathbf{x}=0\}\text{.}\)
(c)
We want to show that \(H\) is the separating hyperplane. Suppose that \(U\) were not contained in \(\{\mathbf{x}\in \R^n: \mathbf{h}^\top\mathbf{x}\geq0\}\text{.}\) What must be true about \(U\text{?}\)
(d)
Let \(u'\in U\) such that \(\mathbf{h}^\top u'\leq0\text{.}\) Sketch \(u'\text{.}\)
(e)
Let \(\mathbf{d} = u'-u_0\text{.}\) Describe geometrically what \(\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\) represents. (Think about dot products and projections, and take \(\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\) to start at \(u_0\text{.}\))
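As a hint, recall the standard projection formula: for a nonzero vector \(\mathbf{d}\text{,}\) the projection of a vector \(\mathbf{v}\) onto the line spanned by \(\mathbf{d}\) is \(\operatorname{proj}_{\mathbf{d}}(\mathbf{v})=\frac{\mathbf{v}^\top\mathbf{d}}{\mathbf{d}^\top\mathbf{d}}\cdot\mathbf{d}\text{.}\) Compare the expression above with this formula applied to \(\mathbf{v}=-u_0\text{.}\)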
(f)
Let \(\alpha = \frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\text{.}\) Show that \(0\lt \alpha \lt 1\text{.}\)
(g)
Let \(w=\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\text{.}\) Show that \(w = \alpha u' + (1-\alpha)u_0\text{.}\)
(h)
Show that \(w\in U\text{.}\)
(i)
Show that \(w^\top w = u_0^\top u_0 + \alpha \mathbf{d}^\top \mathbf{d}\left(2 \frac{u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}+\alpha\right)\text{,}\) and explain why \(w^\top w \lt u_0^\top u_0\text{.}\)
(j)
Why does the last statement result in a contradiction?
We now introduce a key idea which will tie together the primal and dual problems.
Definition 9.4.4.
Let \(S\subseteq \R^n\text{.}\) We call the cone of \(S\text{,}\) denoted \(\cone(S)\text{,}\) the set of non-negative combinations of elements of \(S\text{:}\) that is, \(\cone(S):=\left\{\displaystyle \sum_{i=1}^m u_i\mathbf{s}_i: u_i\in \R, u_i\geq 0, \mathbf{s}_i\in S \right\}\text{.}\)
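For example, in \(\R^2\) the non-negative combinations \(u_1\begin{bmatrix}1\\0\end{bmatrix}+u_2\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}u_1\\u_2\end{bmatrix}\) are exactly the vectors with non-negative entries, so \(\cone\left(\left\{\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\right\}\right)\) is the first quadrant \(\{\mathbf{x}\in\R^2: x_1\geq 0, x_2\geq 0\}\text{.}\)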
Activity 9.4.5.
(a)
In \(\R^2\text{,}\) describe \(\cone\left( \left\{ \begin{bmatrix} 1\\1\end{bmatrix}, \begin{bmatrix} 1\\-2\end{bmatrix} \right\} \right)\text{.}\)
(b)
Prove that for any \(S\subseteq \R^n, \cone(S)\) is convex.
(c)
Let \(A\) denote an \(n\times m\) matrix, and let \(P\) denote the cone of the columns of \(A\text{.}\)
Suppose
\(\mathbf{b}\not\in P\text{.}\) What does
Theorem 9.4.1 tell us?
We can codify the intuition gained from
Activity 9.4.5 in the following:
Theorem 9.4.6. The Farkas Lemma.
Given a matrix \(A\in \R^{n\times m}\) and vector \(\mathbf{b}\in \R^n\text{,}\) exactly one of the following is true:
(A) There is an \(\mathbf{x}\in \R^m\) such that \(\mathbf{x}\geq \mathbf{0}\) and \(A\mathbf{x}=\mathbf{b}\text{.}\)
(B) There is a \(\mathbf{y}\in \R^n\) such that \(\y^\top \vb\lt 0\) and \(\y^\top A \geq \mathbf{0}^\top\text{.}\)
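To see the dichotomy in a small case (an illustrative example, not taken from the surrounding activities), let \(A=\begin{bmatrix}1 \amp 0\\0 \amp 1\end{bmatrix}\) and \(\mathbf{b}=\begin{bmatrix}-1\\0\end{bmatrix}\text{.}\) No \(\mathbf{x}\geq\mathbf{0}\) satisfies \(A\mathbf{x}=\mathbf{b}\text{,}\) since the first entry of \(A\mathbf{x}\) is \(x_1\geq 0\text{,}\) which cannot equal \(-1\text{;}\) on the other hand, \(\y=\begin{bmatrix}1\\0\end{bmatrix}\) witnesses (B), since \(\y^\top\vb=-1\lt 0\) and \(\y^\top A=\begin{bmatrix}1 \amp 0\end{bmatrix}\geq\mathbf{0}^\top\text{.}\)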
Activity 9.4.7.
(a)
(b)
Suppose both cases (A) and (B) of Theorem 9.4.6 held at the same time. Use the product \(\y^\top A\x\) to derive a contradiction.
(c)
(d)
(e)
Let’s denote the normal vector of the separating hyperplane by \(\y\) (interesting choice 👀) so that \(\y^\top \vb\lt \y^\top p\) for any \(p\in P\text{.}\) Why must \(\y^\top \vb \lt 0\text{?}\)
(f)
Suppose \(A\) had a column \(A_j\) such that \(\y^\top A_j\lt 0\text{.}\) Show that there is an \(\alpha>0\) such that \(\y^\top (\alpha A_j)\lt \y^\top \vb \text{.}\) Why is this a contradiction?
(g)
(h)
Now that
Theorem 9.4.6 is proven, we apply it to
\(\begin{bmatrix} A \amp I_n \end{bmatrix} \) and
\(\vb \text{.}\)
Suppose (A) held, so that there is a \(\begin{bmatrix} \x \\ \vt\end{bmatrix} \geq \mathbf{0}\) satisfying \(\begin{bmatrix} A \amp I_n \end{bmatrix} \begin{bmatrix} \x \\ \vt\end{bmatrix} = \vb \text{.}\) How would \(A\x\) compare to \(\vb \text{?}\)
\(A\x\leq \vb\text{.}\)
\(A\x\geq \vb\text{.}\)
\(A\x= \vb\text{.}\)
No comparison may be made between \(A\x\) and \(\vb\text{.}\)
(i)
Suppose (A) failed. Then there is a \(\y\in \R^n\) satisfying (B) for \(\begin{bmatrix} A \amp I_n \end{bmatrix}, \vb \text{.}\)
What can we say about \(\y \) compared to \(\mathbf{0}\text{?}\)
\(\y^\top \leq \mathbf{0}^\top\text{.}\)
\(\y^\top \geq \mathbf{0}^\top\text{.}\)
\(\y^\top = \mathbf{0}^\top\text{.}\)
No comparison may be made between \(\y\) and \(\mathbf{0}\text{.}\)
What can we say about \(\y^\top A\) compared to \(\mathbf{0}\text{?}\)
\(\y^\top A\leq \mathbf{0}^\top\text{.}\)
\(\y^\top A\geq \mathbf{0}^\top\text{.}\)
\(\y^\top A = \mathbf{0}^\top\text{.}\)
No comparison may be made between \(\y^\top A\) and \(\mathbf{0}^\top\text{.}\)
What can we say about \(\y^\top \vb\) compared to \(0\text{?}\)
\(\y^\top \vb\leq 0\text{.}\)
\(\y^\top \vb\geq 0\text{.}\)
\(\y^\top \vb = 0\text{.}\)
No comparison may be made between \(\y^\top \vb\) and \(0\text{.}\)
Corollary 9.4.8. The Farkas Lemma v2.
Given a matrix \(A\in \R^{n\times m}\) and vector \(\mathbf{b}\in \R^n\text{,}\) exactly one of the following is true:
(A) There is an \(\mathbf{x}\in \R^m\) such that \(\mathbf{x}\geq \mathbf{0}\) and \(A\mathbf{x}\leq \mathbf{b}\text{.}\)
(B) There is a \(\mathbf{y}\in \R^n\) such that \(\y\geq \mathbf{0}\text{,}\) \(\y^\top \vb\lt 0\text{,}\) and \(\y^\top A \geq \mathbf{0}^\top\text{.}\)
Without loss of generality, we may let \(\y^\top \vb=-1\) in case (B).
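This normalization is justified by scaling: if \(\y\) satisfies (B) with \(\y^\top\vb=-c\) for some \(c>0\text{,}\) then \(\frac{1}{c}\y\) also satisfies (B), since multiplying by the positive constant \(\frac{1}{c}\) preserves \(\y\geq\mathbf{0}\) and \(\y^\top A\geq\mathbf{0}^\top\text{,}\) while \(\left(\frac{1}{c}\y\right)^\top\vb=-1\text{.}\)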
Theorem 9.4.9. The Strong Duality Theorem.
Given a pair of primal max-dual min problems, the primal max problem has an optimal solution \(\x^*\) if and only if the dual min problem has an optimal solution \(\y^*\text{.}\) Moreover, \(f(\x^*)=g(\y^*)\text{.}\)
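Recall (assuming the standard forms set up earlier in the text) that the primal max problem is \(\max f(\x)=\vc^\top\x+d\) subject to \(A\x\leq\vb\) and \(\x\geq\mathbf{0}\text{,}\) while the dual min problem is \(\min g(\y)=\vb^\top\y+d\) subject to \(A^\top\y\geq\vc\) and \(\y\geq\mathbf{0}\text{.}\)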
Activity 9.4.10. Proof of the Strong Duality Theorem.
(a)
Suppose that an optimal dual solution \(\y^*\) exists. Explain why, by Activity 4.2.2, it suffices to show that \(f(\x^*)\geq g(\y^*)\) for some feasible \(\x^*\text{.}\)
(b)
Without loss of generality, let \(d=0\) and let \(g^* = (\y^*)^\top \vb \text{.}\)
Consider the matrix
\(M = \begin{bmatrix} -\vc^\top \\ A\end{bmatrix}\text{,}\) and vector
\(\mathbf{d} = \begin{bmatrix} -g^* \\ \vb \end{bmatrix}\text{.}\) Apply
Corollary 9.4.8 to
\(M, \mathbf{d}\text{.}\) What does it mean for (A) to hold?
(c)
If
Corollary 9.4.8 (A) holds for
\(M, \mathbf{d}\text{,}\) then it holds for the pair
\(-\vc^\top, \begin{bmatrix} -g^*\end{bmatrix} \text{,}\) as well as the pair
\(A, \vb \) for the same
\(\x\in \R^m\text{.}\) Why does this show that a feasible optimal solution
\(\x^*\) exists and that
\(f(\x^*)=g(\y^*)\text{?}\)
(d)
On the other hand, suppose
Corollary 9.4.8 (B) holds for
\(M, \mathbf{d}\text{.}\) What would it mean for (B) to hold?
(e)
We would like to derive a contradiction.
Let
\(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}\) denote the vector produced by
Corollary 9.4.8 (B). Suppose
\(y_0=0\text{.}\) How would
\((\y^*+\y)^\top A\) compare to
\((\y^*)^\top A\) and
\((\y^*+\y)^\top \vb \) compare to
\((\y^*)^\top \vb \text{?}\) Why is this a contradiction?
(f)
Suppose \(y_0>0\text{.}\) Let \(\y' := \frac{\y}{y_0}\text{.}\)
Show that, since \(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}M\geq \mathbf{0}^\top\text{,}\) we have \((\y')^\top A \geq \vc^\top\text{.}\)
(g)
Show that, since \(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}\mathbf{d} = -1\text{,}\) we have \((\y')^\top \vb= g^*-\frac{1}{y_0}\text{.}\)
(h)
Explain why (f) and (g) produce a contradiction.