Another Approach to Strong Duality

Section 9.4 Another Approach to Strong Duality

In this section, we provide an alternative approach to proving the Strong Duality Theorem Theorem 4.2.4.

Theorem 9.4.1. Hyperplane Seperation Theorem.

Given two disjoint convex sets \(U, V\subseteq \R^{n}\text{,}\) there is a hyperplane \(H=\{\mathbf{x}\in \R^n: \mathbf{h}^{\top}\mathbf{x}=k\} \) for some \(\mathbf{h}\in \R^n, k\in \R\text{,}\) such that \(\mathbf{h}^{\top}U \geq k, \mathbf{h}^{\top}V\lt k\text{.}\)

🔗

Activity 9.4.2.

(a)

Sketch two nonempty convex sets \(U, V\text{,}\) what does \(H\) look like here?

🔗

(b)

Sketch two nonconvex sets where Theorem 9.4.1 fails.

🔗

Activity 9.4.3.

We prove the case of Theorem 9.4.1 where there are \(u_0\in U, v_0\in V\) that minimize \(\{\|u-v\|: u\in U, v\in V\}\text{.}\) We assume this is true.

🔗

(a)

Without loss of generality, let \(v_0=0\text{.}\) Why can we do this?

🔗

(b)

Let \(\mathbf{h} = u_0-v_0\text{.}\) Sketch \(u_0, v_0, \mathbf{h}\) and \(H:=\{\mathbf{x}\in \R^n: \mathbf{h}^\top \mathbf{x}=0\}\text{.}\)

🔗

(c)

We want to show that \(H\) is the separating hyperplane. Suppose that \(U\) was not contained in \(\{\mathbf{x}\in \R^n: \mathbf{h}^\top\mathbf{x}\geq0\}\) what must be true about \(U\text{?}\)

🔗

(d)

Let \(u'\in U\) such that \(\mathbf{h}^\top u'\leq0\text{.}\) Sketch \(u'\text{.}\)

🔗

(e)

Let \(\mathbf{d} = u'-u_0\) describe geometrically what \(\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u\) represents. (Think dot product, projections and for \(\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u\) to start at \(u\text{.}\))

🔗

(f)

Let \(\alpha = \frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\text{,}\) show that \(0\lt \alpha \lt 1\text{.}\)

🔗

(g)

Let \(w=\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\text{,}\) show that \(w = \alpha u' + (1-\alpha)u_0\)

🔗

(h)

Show that \(w\in U\text{.}\)

🔗

(i)

Show that \(w^\top w = u_0^\top u_0 + \alpha \mathbf{d}^\top \mathbf{d}\left(2 \frac{u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}+\alpha\right)\text{,}\) and explain why \(w^\top w \lt u_0^\top u_0\text{.}\)

🔗

(j)

Why does the last statement result in a contradiction?

🔗

We now introduce a key idea which will tie together the primal and dual problems.

🔗

Definition 9.4.4.

Let \(S\subseteq \R^n\text{.}\) We call the cone of \(S\text{,}\) denoted \(\cone(S)\) to be the set \(\cone(S):=\left\{\displaystyle \sum_{i=1}^m u_i\mathbf{s}_i: u_i\in \R, u_i\geq 0, \mathbf{s}_i\in S \right\}\text{.}\)

🔗

Activity 9.4.5.

(a)

In \(\R^2\text{,}\) describe \(\cone\left( \left\{ \begin{bmatrix} 1\\1\end{bmatrix}, \begin{bmatrix} 1\\-2\end{bmatrix} \right\} \right)\text{.}\)

🔗

(b)

Prove that for any \(S\subseteq \R^n, \cone(S)\) is convex.

🔗

(c)

Let \(A\) denote a \(n\times m\) matrix, and let \(P\) denote the cone of the columns of \(A\text{.}\)

🔗

Suppose \(\mathbf{b}\not\in P\text{.}\) What does Theorem 9.4.1 tell us?

🔗

We can codify the intuition gained from Activity 9.4.5 in the following:

🔗

Theorem 9.4.6. The Farkas Lemma.

Given a matrix \(A\in \R^{n\times m}\) and vector \(\mathbf{b}\in \R^n\text{,}\) exactly one of the following is true:

🔗

There is a \(\mathbf{x}\in \R^m\) such that \(\mathbf{x}\geq \mathbf{0}\) and \(A\mathbf{x}=\mathbf{b}.\)
🔗

🔗
There is a \(\mathbf{y}\in \R^n\) such that \(\y^\top \vb\lt 0\) and \(\y^\top A \geq \mathbf{0}\text{.}\)
🔗

🔗

🔗

Activity 9.4.7.

We prove Theorem 9.4.6 and a useful corollary.

🔗

(a)

Recall Activity 9.4.5 \(\textbf{(c)}\text{.}\) How do the cases (A) and (B) of Theorem 9.4.6 relate to the cone \(P\text{?}\)

🔗

(b)

Suppose both cases (A) and (B) of Theorem 9.4.6 held at the same time. use the product \(\y^\top A\x\) to derive a contradiction.

🔗

(c)

If Theorem 9.4.6 (A) were true, what could we say about (B)?

🔗

(d)

Suppose Theorem 9.4.6 (A) were false. We want to prove that (B) is true. What does Activity 9.4.5 \(\textbf{(c)}\) tell us?

🔗

(e)

Let’s denote the normal vector of the separating hyperplane by \(\y\) (interesting choice 👀) so that \(\y^\top \vb\lt \y^\top p\) for any \(p\in P\text{.}\) Why must \(\y^\top \vb \lt 0\text{?}\)

🔗

(f)

Suppose \(A\) had a column \(A_j\) such that \(\y^\top A_j\lt 0\text{,}\) show that there is an \(\alpha>0\) such that \(\y^\top (\alpha A_j)\lt \y^\top \vb \text{.}\) Why is this a contradiction?

🔗

(g)

Why is Theorem 9.4.6 (B) proven? Why is then Theorem 9.4.6 proven?

🔗

(h)

Now that Theorem 9.4.6 is proven, we apply it to \(\begin{bmatrix} A \amp I_n \end{bmatrix} \) and \(\vb \text{.}\)

🔗

Suppose (A) held, and we had that there was a \(\begin{bmatrix} \x \\ \vt\end{bmatrix} \geq 0\) so that \(\begin{bmatrix} A \amp I_n \end{bmatrix} \begin{bmatrix} \x \\ \vt\end{bmatrix} = \vb \text{.}\) How would \(A\x\) compare to \(\vb \text{?}\)

🔗

\(A\x\leq \vb\text{.}\)
🔗

🔗
\(A\x\geq \vb\text{.}\)
🔗

🔗
\(A\x= \vb\text{.}\)
🔗

🔗
No comparison may be made between \(A\x\) and \(\vb\text{.}\)
🔗

🔗

🔗

(i)

Suppose (A) failed. Then there is a \(\y\in \R^n\) satisfing (B) for \(\begin{bmatrix} A \amp I_n \end{bmatrix}, \vb \text{.}\)

🔗

What can we say about \(\y \) compared to \(\mathbf{0}\text{?}\)

🔗

\(\y^\top \leq \mathbf{0}^\top\text{.}\)
🔗

🔗
\(\y^\top \geq \mathbf{0}^\top\text{.}\)
🔗

🔗
\(\y^\top = \mathbf{0}^\top\text{.}\)
🔗

🔗
No comparison may be made between \(\y\) and \(\mathbf{0}p\text{.}\)
🔗

🔗

What can we say about \(\y^\top A\) compared to \(\mathbf{0}\text{?}\)

🔗

\(\y^\top A\leq \mathbf{0}^\top\text{.}\)
🔗

🔗
\(\y^\top A\geq \mathbf{0}^\top\text{.}\)
🔗

🔗
\(\y^\top A = \mathbf{0}^\top\text{.}\)
🔗

🔗
No comparison may be made between \(\y^\top A\) and \(\mathbf{0}^\top\text{.}\)
🔗

🔗

What can we say about \(\y^\top \vb\) compared to \(0\text{?}\)

🔗

\(\y^\top \vb\leq 0\text{.}\)
🔗

🔗
\(\y^\top \vb\geq 0\text{.}\)
🔗

🔗
\(\y^\top \vb = 0\text{.}\)
🔗

🔗
No comparison may be made between \(\y^\top \vb\) and \(0\text{.}\)
🔗

🔗

🔗

Corollary 9.4.8. The Farkas Lemma v2.

Given a matrix \(A\in \R^{n\times m}\) and vector \(\mathbf{b}\in \R^n\text{,}\) exactly one of the following is true:

🔗

There is a \(\mathbf{x}\in \R^m\) such that \(\mathbf{x}\geq \mathbf{0}\) and \(A\mathbf{x}\leq \mathbf{b}\text{.}\)
🔗

🔗
There is a \(\mathbf{y}\in \R^n\) such that \(\y\geq 0, \y^\top \vb\lt 0\) and \(\y^\top A \geq \mathbf{0}\text{.}\)
🔗

🔗

Without loss of generality, we may let \(\y^\top \vb=-1\) in case (B).

🔗

Recall the Strong Duality Theorem Theorem 4.2.4.

🔗

Theorem 9.4.9. The Strong Duality Theorem.

Given a pair of primal max-dual min problems, the primal max problem as an optimal solution \(\x^*\) if and only if the dual min problem has an optimal solution \(\y^*\text{.}\) Moreover, \(f(\x^*)=g(\y^*)\text{.}\)

🔗

Activity 9.4.10. Proof of the Strong Duality Theorem.

We provide an alternate proof of Theorem 4.2.4 in this activity.

🔗

(a)

Suppose that optimal dual solution \(\y^*\) exists. Explain why by Activity 4.2.2 it suffices to show that \(f(\x^*)\geq g(\y^*)\) for some feasible \(\x^*\text{.}\)

🔗

(b)

Without loss of generality, let \(d=0\) and let \(g^* = (\y^*)^\top \vb \text{.}\)

🔗

Consider the matrix \(M = \begin{bmatrix} -\vc^\top \\ A\end{bmatrix}\text{,}\) and vector \(\mathbf{d} = \begin{bmatrix} -g^* \\ \vb \end{bmatrix}\text{.}\) Apply Corollary 9.4.8 to \(M, \mathbf{d}\text{.}\) What does it mean for (A) to hold?

🔗

(c)

If Corollary 9.4.8 (A) holds for \(M, \mathbf{d}\text{,}\) then it holds for the pair \(-\vc^\top, \begin{bmatrix} -g^*\end{bmatrix} \text{,}\) as well as the pair \(A, \vb \) for the same \(\x\in \R^m\text{.}\) Why does this show that a feasible optimal solution \(\x^*\) exists and that \(f(\x^*)=g(\y^*)\text{?}\)

🔗

(d)

On the other hand, suppose Corollary 9.4.8 (B) holds for \(M, \mathbf{d}\text{.}\) What would it mean for (B) to hold?

🔗

(e)

We would like to derive a contradiction.

🔗

Let \(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}\) denote the vector produced by Corollary 9.4.8 (B). Suppose \(y_0=0\text{.}\) How would \((\y^*+\y)^\top A\) compare to \((\y^*)^\top A\) and \((\y^*+\y)^\top \vb \) compare to \((\y^*)^\top \vb \text{?}\) Why is this a contradiction?

🔗

(f)

Suppose \(y_0>0\text{.}\) Let \(\y' := \frac{\y}{y_0}\text{.}\)

🔗

Show that since \(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}M\geq 0\) that \((\y')^\top A \geq \vc^\top\text{.}\)

🔗

(g)

Show that since \(\begin{bmatrix} y_0 \amp \y^\top \end{bmatrix}\mathbf{d} = -1\) that \((\y')^\top \vb= g^*-\frac{1}{y_0}\text{.}\)

🔗

(h)

Explain why \(\textbf{(f)}\) and \(\textbf{(g)}\) produce a contradiction.

🔗

Prev Top Next