Section 4.2 Duality Theory
In this section, we establish the theoretical underpinnings of duality. This is a proof-heavy section.
Activity 4.2.2.
In this activity, we explore a foundational relationship between the primal max problem and its dual, called weak duality.
(a)
Consider the matrix product \(\mathbf{y}^{\top}A\mathbf{x}\text{.}\) Use this product to show that \(g\geq f\text{.}\)
(b)
Suppose there were feasible \(\mathbf{x_0}, \mathbf{y_0}\) for which \(f(\mathbf{x_0})= g(\mathbf{y_0})\text{.}\) What then must be true about these solutions? Can we prove our assertion?
(c)
(d)
Come up with a primal max problem (and corresponding min dual) where \(A, b, c, d, \mathbf{x}, \mathbf{y}\) all have integer values, so that the primal max and dual min problems achieve optimal solutions \(\mathbf{x_0}, \mathbf{y_0}\text{,}\) where \(f(\mathbf{x_0}) \lt g(\mathbf{y_0})\text{.}\)
(e)
Using the same values for \(A, b, c, d\) for the problem we just constructed, suppose we relax the condition that all our values must be integers. What can we say about the optimal solutions then?
What we have proven is the following:
Proposition 4.2.3.
For a primal maximization problem with objective function \(f\) and dual objective \(g\text{,}\) we have that
\begin{equation*}
f\leq g.
\end{equation*}
In particular, if there is a feasible primal solution \(\x'\) and feasible dual solution \(\y'\) such that
\begin{equation*}
f(\x')=g(\y')
\end{equation*}
then \(\x', \y'\) are optimal solutions for the primal and dual problems respectively.
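The inequality \(f\leq g\) can be sanity-checked numerically. The sketch below uses made-up data \(A, \vb, \vc\) (with constant term \(d=0\)); none of these numbers come from the activity. It verifies that one feasible primal point and one feasible dual point satisfy \(f\leq g\text{.}\)

```python
# Weak duality check on an illustrative primal/dual pair (made-up data).
# Primal:  maximize c.x  subject to  A x <= b, x >= 0
# Dual:    minimize b.y  subject to  A^T y >= c, y >= 0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

A = [[1, 2],
     [3, 1]]
b = [4, 6]
c = [3, 2]

x = [1, 1]    # a feasible primal point: A x = [3, 4] <= b
y = [1, 1]    # a feasible dual point:  A^T y = [4, 3] >= c

# feasibility checks
assert all(ax <= bi for ax, bi in zip(mat_vec(A, x), b))
At = [list(col) for col in zip(*A)]
assert all(ay >= ci for ay, ci in zip(mat_vec(At, y), c))

f = dot(c, x)    # primal objective value: 5
g = dot(b, y)    # dual objective value: 10
print(f, g)      # weak duality: f <= g
```

These particular feasible points are far from optimal; weak duality only promises that the primal value never exceeds the dual value.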
We have now shown that if \(f=g\) for a pair of feasible solutions, then both solutions are optimal for their respective problems. It would be good to prove that this is an if-and-only-if statement.
Theorem 4.2.4. Hyperplane Separation Theorem.
Given two disjoint convex sets \(U, V\subseteq \R^{n}\text{,}\) there is a hyperplane \(H=\{\mathbf{x}\in \R^n: \mathbf{h}^{\top}\mathbf{x}=k\} \) for some \(\mathbf{h}\in \R^n, k\in \R\text{,}\) such that \(\mathbf{h}^{\top}U \geq k, \mathbf{h}^{\top}V\lt k\text{.}\)
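As a quick numeric illustration (not part of any proof), the sketch below separates two made-up disjoint disks in \(\R^2\text{:}\) taking \(\mathbf{h}=u_0-v_0\) for the closest points \(u_0, v_0\) and a value of \(k\) between them, random samples from both disks land on opposite sides of \(H\text{.}\)

```python
import random

# Separating-hyperplane sketch for two disjoint disks in R^2 (the disks
# are illustrative, not from the theorem statement).
# U = disk of radius 1 about (3, 0), V = disk of radius 1 about (0, 0).
# Closest points: u0 = (2, 0), v0 = (1, 0), so h = u0 - v0 = (1, 0),
# and the hyperplane h.x = k with k = 1.5 sits between the disks.

h = (1.0, 0.0)
k = 1.5

def sample_disk(cx, cy, r):
    # rejection-sample a point of the closed disk of radius r about (cx, cy)
    while True:
        x, y = random.uniform(-r, r), random.uniform(-r, r)
        if x * x + y * y <= r * r:
            return (cx + x, cy + y)

random.seed(0)
for _ in range(1000):
    u = sample_disk(3.0, 0.0, 1.0)
    v = sample_disk(0.0, 0.0, 1.0)
    assert h[0] * u[0] + h[1] * u[1] >= k   # h.u >= k on all of U
    assert h[0] * v[0] + h[1] * v[1] < k    # h.v <  k on all of V
```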
Activity 4.2.5.
(a)
Sketch two disjoint non-empty convex sets \(U, V\text{.}\) What does \(H\) look like here?
(b)
Activity 4.2.6.
We prove the case of Theorem 4.2.4 in which there are \(u_0\in U, v_0\in V\) attaining the minimum of \(\{\|u-v\|: u\in U, v\in V\}\text{.}\) We assume such minimizers exist.
(a)
Without loss of generality, let \(v_0=0\text{.}\) Why can we do this?
(b)
Let \(\mathbf{h} = u_0-v_0\text{.}\) Sketch \(u_0, v_0, \mathbf{h}\) and \(H:=\{\mathbf{x}\in \R^n: \mathbf{h}^\top \mathbf{x}=0\}\text{.}\)
(c)
We want to show that \(H\) is the separating hyperplane. Suppose that \(U\) were not contained in \(\{\mathbf{x}\in \R^n: \mathbf{h}^\top\mathbf{x}\geq0\}\text{.}\) What must be true about \(U\text{?}\)
(d)
Let \(u'\in U\) be such that \(\mathbf{h}^\top u'\leq0\text{.}\) Sketch \(u'\text{.}\)
(e)
Let \(\mathbf{d} = u'-u_0\text{.}\) Describe geometrically what \(\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\) represents. (Think dot products and projections, and take the vector \(\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}\) to start at \(u_0\text{.}\))
(f)
Let \(\alpha = \frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\text{,}\) show that \(0\lt \alpha \lt 1\text{.}\)
(g)
Let \(w=\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\text{.}\) Show that \(w = \alpha u' + (1-\alpha)u_0\text{.}\)
(h)
Show that \(w\in U\text{.}\)
(i)
Show that \(w^\top w = u_0^\top u_0 + \alpha \mathbf{d}^\top \mathbf{d}\left(2 \frac{u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}+\alpha\right)\text{,}\) and explain why \(w^\top w \lt u_0^\top u_0\text{.}\)
(j)
Why does the last statement result in a contradiction?
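The algebra in parts (f) through (i) can be checked on concrete numbers. The points below are made up for illustration (with \(v_0=\mathbf{0}\)); exact rational arithmetic confirms \(0\lt\alpha\lt 1\text{,}\) the convex-combination identity for \(w\text{,}\) and \(w^\top w \lt u_0^\top u_0\text{.}\)

```python
from fractions import Fraction as F

# Numeric check of steps (f)-(i) with made-up points (taking v0 = 0):
# u0 = (2, 0) is the candidate closest point, and u' = (-1, 3) violates
# h.x >= 0 for h = u0.  The projected point w must then be closer to 0.

u0 = [F(2), F(0)]
up = [F(-1), F(3)]                      # u', with h . u' = -2 <= 0

d = [up[0] - u0[0], up[1] - u0[1]]      # d = u' - u0 = (-3, 3)
alpha = -(u0[0] * d[0] + u0[1] * d[1]) / (d[0] * d[0] + d[1] * d[1])
assert 0 < alpha < 1                    # alpha = 1/3, as in part (f)

w = [u0[0] + alpha * d[0], u0[1] + alpha * d[1]]
# part (g): w is the convex combination alpha*u' + (1 - alpha)*u0
assert w == [alpha * up[0] + (1 - alpha) * u0[0],
             alpha * up[1] + (1 - alpha) * u0[1]]

# part (i): w is strictly closer to the origin than u0, contradicting
# the minimality of ||u0 - v0||
assert w[0] ** 2 + w[1] ** 2 < u0[0] ** 2 + u0[1] ** 2
```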
We now introduce a key idea which will tie together the primal and dual problems.
Definition 4.2.7.
Let \(S\subseteq \R^n\text{.}\) We define the cone of \(S\text{,}\) denoted \(\cone(S)\text{,}\) to be the set \(\cone(S):=\left\{\displaystyle \sum_{i=1}^m u_i\mathbf{s}_i: u_i\in \R, u_i\geq 0, \mathbf{s}_i\in S \right\}\text{.}\)
Activity 4.2.8.
(a)
In \(\R^2\text{,}\) describe \(\cone\left( \left\{ \begin{bmatrix} 1\\1\end{bmatrix}, \begin{bmatrix} 1\\-2\end{bmatrix} \right\} \right)\text{.}\)
(b)
Prove that for any \(S\subseteq \R^n, \cone(S)\) is convex.
(c)
Let \(A\) denote an \(n\times m\) matrix, and let \(P\) denote the cone of the columns of \(A\text{.}\)
Suppose \(\mathbf{b}\not\in P\text{.}\) What does Theorem 4.2.4 tell us?
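In \(\R^2\text{,}\) cone membership can be decided directly: when the two columns are linearly independent, \(\mathbf{b}\) lies in the cone of the columns exactly when the unique solution of \(A\mathbf{x}=\mathbf{b}\) is nonnegative. A sketch (the columns are the vectors from Activity 4.2.8; the test points \(\mathbf{b}\) are made up):

```python
# Cone membership in R^2 via Cramer's rule (a sketch for the
# two-independent-columns case only).

def in_cone(A, b):
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    assert det != 0, "columns must be linearly independent"
    x1 = (b[0] * a22 - a12 * b[1]) / det   # Cramer's rule for A x = b
    x2 = (a11 * b[1] - b[0] * a21) / det
    return x1 >= 0 and x2 >= 0             # b in cone iff x >= 0

A = [[1, 1],
     [1, -2]]            # columns (1, 1) and (1, -2)

print(in_cone(A, [2, -1]))   # True:  (2,-1) = 1*(1,1) + 1*(1,-2)
print(in_cone(A, [0, 1]))    # False: requires a negative coefficient
```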
Theorem 4.2.9. The Farkas Lemma.
Given a matrix \(A\in \R^{n\times m}\) and \(\mathbf{b}\in \R^n\text{,}\) exactly one of the following is true:
(A) There is an \(\mathbf{x}\in \R^m\) such that \(\mathbf{x}\geq \mathbf{0}\) and \(A\mathbf{x}=\mathbf{b}\text{.}\)
(B) There is a \(\mathbf{y}\in \R^n\) such that \(\y^\top \vb\lt 0\) and \(\y^\top A \geq \mathbf{0}\text{.}\)
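Both alternatives can be exhibited on small made-up data. The certificates below are hand-computed; the point is that each certificate rules out the other case.

```python
# Farkas certificates for two right-hand sides (illustrative numbers).
# A has columns (1, 1) and (1, -2).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1, 1],
     [1, -2]]
cols = list(zip(*A))               # columns of A

# Case (A): b = (2, -1) equals A x for x = (1, 1) >= 0.
x = [1, 1]
b = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
assert b == [2, -1]

# Case (B): for b2 = (0, 1), the vector y = (1, -1) certifies
# infeasibility: y.b2 < 0 while y^T A >= 0, so no x >= 0 with
# A x = b2 can exist.
b2 = [0, 1]
y = [1, -1]
yA = [dot(y, col) for col in cols]     # y^T A = (0, 3)
assert dot(y, b2) < 0 and all(t >= 0 for t in yA)
```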
Activity 4.2.10.
(a)
(b)
Suppose both cases (A) and (B) of Theorem 4.2.9 held at the same time. Use the product \(\y^\top A\x\) to derive a contradiction.
(c)
(d)
(e)
Let’s denote the normal vector of the separating hyperplane by \(\y\) (interesting choice 👀) so that \(\y^\top \vb\lt \y^\top p\) for any \(p\in P\text{.}\) Why must \(\y^\top \vb \lt 0\text{?}\)
(f)
Suppose \(A\) had a column \(A_j\) such that \(\y^\top A_j\lt 0\text{.}\) Show that there is an \(\alpha>0\) such that \(\y^\top (\alpha A_j)\lt \y^\top \vb \text{.}\) Why is this a contradiction?
(g)
(h)
Now that Theorem 4.2.9 is proven, we apply it to \(\begin{bmatrix} A \amp I_n \end{bmatrix} \) and \(\vb \text{.}\)
Suppose (A) held, and we had that there was a \(\begin{bmatrix} \x \\ \vt\end{bmatrix} \geq 0\) so that \(\begin{bmatrix} A \amp I_n \end{bmatrix} \begin{bmatrix} \x \\ \vt\end{bmatrix} = \vb \text{.}\) How would \(A\x\) compare to \(\vb \text{?}\)
(i)
Suppose (A) failed. Then there is a \(\y\in \R^n\) satisfying (B) for \(\begin{bmatrix} A \amp I_n \end{bmatrix}, \vb \text{.}\)
What can we say about \(\y^\top A, \y^\top \vb \) and \(\y\) compared to \(\mathbf{0}, 0\text{?}\)
Corollary 4.2.11. The Farkas Lemma v2.
Given a matrix \(A\in \R^{n\times m}\) and \(\mathbf{b}\in \R^n\text{,}\) exactly one of the following is true:
(A) There is an \(\mathbf{x}\in \R^m\) such that \(\mathbf{x}\geq \mathbf{0}\) and \(A\mathbf{x}\leq \mathbf{b}\text{.}\)
(B) There is a \(\mathbf{y}\in \R^n\) such that \(\y\geq 0, \y^\top \vb\lt 0\) and \(\y^\top A \geq \mathbf{0}\text{.}\)
Without loss of generality, we may let \(\y^\top \vb=-1\) in case (B).
Theorem 4.2.12. The Strong Duality Theorem.
Given a pair of primal max-dual min problems, the primal max problem has an optimal solution \(\x^*\) if and only if the dual min problem has an optimal solution \(\y^*\text{.}\) Moreover, \(f(\x^*)=g(\y^*)\text{.}\)
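A numeric sketch of the conclusion (made-up data, with constant term \(d=0\)): the primal and dual points below are hand-computed vertices, and exact rational arithmetic confirms feasibility and \(f(\x^*)=g(\y^*)\text{,}\) which by Proposition 4.2.3 certifies that both are optimal.

```python
from fractions import Fraction as F

# Strong duality on a small illustrative pair (numbers are made up):
# Primal: max 3x1 + 2x2  s.t.  x1 + 2x2 <= 4,  3x1 + x2 <= 6,  x >= 0
# Dual:   min 4y1 + 6y2  s.t.  y1 + 3y2 >= 3,  2y1 + y2 >= 2,  y >= 0

x_star = [F(8, 5), F(6, 5)]   # vertex where both primal constraints are tight
y_star = [F(3, 5), F(4, 5)]   # vertex where both dual constraints are tight

# feasibility
assert x_star[0] + 2 * x_star[1] <= 4 and 3 * x_star[0] + x_star[1] <= 6
assert y_star[0] + 3 * y_star[1] >= 3 and 2 * y_star[0] + y_star[1] >= 2

f_star = 3 * x_star[0] + 2 * x_star[1]   # primal value = 36/5
g_star = 4 * y_star[0] + 6 * y_star[1]   # dual value   = 36/5
assert f_star == g_star                   # equal values certify optimality
print(f_star)
```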
Activity 4.2.13. Proof of the Strong Duality Theorem.
(a)
Suppose that an optimal dual solution \(\y^*\) exists. Explain why, by Activity 4.2.2, it suffices to show that \(f(\x^*)\geq g(\y^*)\) for some feasible \(\x^*\text{.}\)
(b)
Without loss of generality, let \(d=0\) and let \(g^* = (\y^*)^\top \vb \text{.}\)
Consider the matrices \(M = \begin{bmatrix} -\vc^\top \\ A\end{bmatrix}, \mathbf{d} = \begin{bmatrix} -g^* \\ \vb \end{bmatrix}\text{.}\) Apply Corollary 4.2.11 to \(M, \mathbf{d}\text{.}\) What does it mean for (A) to hold?
(c)
If Corollary 4.2.11 (A) holds for \(M, \mathbf{d}\text{,}\) then it holds for the pair \(-\vc^\top, \begin{bmatrix} -g^*\end{bmatrix} \text{,}\) as well as the pair \(A, \vb \text{,}\) for the same \(\x\in \R^m\text{.}\) Why does this show that a feasible optimal solution \(\x^*\) exists and that \(f(\x^*)=g(\y^*)\text{?}\)
(d)
On the other hand, suppose Corollary 4.2.11 (B) holds for \(M, \mathbf{d}\text{.}\) What would it mean for (B) to hold?
(e)
We would like to derive a contradiction. Let \(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}\) denote the vector produced by Corollary 4.2.11 (B). Suppose \(y_0=0\text{.}\) How would \((\y^*+\y)^\top A\) compare to \((\y^*)^\top A\text{,}\) and \((\y^*+\y)^\top \vb \) compare to \((\y^*)^\top \vb \text{?}\) Why is this a contradiction?
(f)
Suppose \(y_0>0\text{.}\) Let \(\y' := \frac{\y}{y_0}\text{.}\)
Using the fact that \(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}M\geq 0\text{,}\) show that \((\y')^\top A \geq \vc^\top\text{.}\)
(g)
Using the fact that \(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}\mathbf{d} = -1\text{,}\) show that \((\y')^\top \vb= g^*-\frac{1}{y_0}\text{.}\)
(h)
Explain why (f) and (g) produce a contradiction.
Activity 4.2.14. Complementary Slackness.
We assume the primal problem is canonical.
(a)
Prove that \(g-f = \vs^\top\x + \y^\top\vt \text{.}\)
(b)
Suppose \(\x\) is a feasible extreme point, and \(\y, \vt\) are the corresponding coefficients for the normal vectors. If \(x_j\neq 0\text{,}\) what does that say about \(s_j\text{?}\)
Hint.
\(s_j\) is the coefficient of the normal vector for the plane \(-x_j=0\text{.}\) If the feasible solution does not lie on \(-x_j=0\text{,}\) what can we say about \(s_j\text{?}\)
What about \(s_j\neq 0\text{,}\) \(y_i\neq0\text{,}\) or \(t_i\neq0\text{?}\)
(c)
If \(\x_*, \y_*\) are feasible optimal solutions with slack variables \(\vt_*, \vs_*\text{,}\) what must be true about \(\x_*^\top\vs_* + \y_*^\top\vt_*\text{?}\)
Definition 4.2.15.
Feasible variables \(\x, \y, \vs, \vt \) are said to exhibit complementary slackness if \(\vs^\top\x=0, \y^\top\vt=0\text{.}\)
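A quick check of the definition on made-up data: at the hand-computed optimal pair below, every positive \(x_j\) meets a zero \(s_j\) and every positive \(t_i\) meets a zero \(y_i\text{,}\) so both inner products vanish.

```python
from fractions import Fraction as F

# Complementary slackness on an illustrative pair (numbers are made up):
# Primal: max 4x1 + x2  s.t.  x1 + 2x2 <= 4,  3x1 + x2 <= 6,  x >= 0
# Optimal primal x = (2, 0); optimal dual y = (0, 4/3); both have value 8.

x = [F(2), F(0)]
y = [F(0), F(4, 3)]

t = [4 - (x[0] + 2 * x[1]), 6 - (3 * x[0] + x[1])]   # primal slacks = (2, 0)
s = [y[0] + 3 * y[1] - 4, 2 * y[0] + y[1] - 1]       # dual slacks   = (0, 1/3)

# wherever x_j > 0 the matching s_j is 0, and wherever t_i > 0 the
# matching y_i is 0, so both inner products vanish:
sx = s[0] * x[0] + s[1] * x[1]
yt = y[0] * t[0] + y[1] * t[1]
print(sx, yt)
```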