
Section 4.2 Duality Theory

In this section, we establish the theoretical underpinnings of duality. This is a proof-heavy section.

Observation 4.2.1.

Recall the primal maximization problem Activity 2.1.1, and the corresponding dual minimization problem Definition 4.1.5. By letting
\begin{equation*} A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}, b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}, c^{\top}= \begin{bmatrix}c_1 & c_2 & \cdots & c_m \end{bmatrix}, d=d, \end{equation*}
we can rephrase the primal max problem as follows: Maximize \(f = c^{\top}\mathbf{x}-d\) for \(\mathbf{x}\in \mathbb{R}^m\) subject to
\begin{equation*} A\mathbf{x}\leq b, \mathbf{x}\geq 0. \end{equation*}
Here, we understand \(\leq, \geq \) to denote entrywise inequality.
Likewise, we can rephrase the dual min problem as follows: Minimize \(g = \mathbf{y}^{\top}b-d\) for \(\mathbf{y}\in \mathbb{R}^n\) subject to
\begin{equation*} \mathbf{y}^{\top}A\geq c^{\top}, \mathbf{y}\geq 0. \end{equation*}
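For a concrete illustration (a toy instance of our own, not one from the text), take
\begin{equation*} A = \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}, b = \begin{bmatrix} 4 \\ 6 \end{bmatrix}, c^{\top}= \begin{bmatrix} 5 & 4 \end{bmatrix}, d = 0. \end{equation*}
The primal problem is then: maximize \(f = 5x_1+4x_2\) subject to
\begin{equation*} x_1+2x_2\leq 4, \quad 3x_1+x_2\leq 6, \quad x_1, x_2\geq 0, \end{equation*}
while the dual is: minimize \(g = 4y_1+6y_2\) subject to
\begin{equation*} y_1+3y_2\geq 5, \quad 2y_1+y_2\geq 4, \quad y_1, y_2\geq 0. \end{equation*}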

Activity 4.2.2.

In this activity, we explore a foundational relationship between the primal max problem and its dual, called weak duality.

(a)

Consider the matrix product \(\mathbf{y}^{\top}A\mathbf{x}\text{.}\) Use this product to show that \(g\geq f\text{.}\)
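One way to organize the estimate (a sketch; each inequality comes from one of the feasibility conditions above):
\begin{equation*} c^{\top}\mathbf{x} \leq (\mathbf{y}^{\top}A)\mathbf{x} = \mathbf{y}^{\top}(A\mathbf{x}) \leq \mathbf{y}^{\top}b, \end{equation*}
where the first inequality uses \(\mathbf{y}^{\top}A\geq c^{\top}\) and \(\mathbf{x}\geq 0\text{,}\) and the second uses \(A\mathbf{x}\leq b\) and \(\mathbf{y}\geq 0\text{.}\) Subtracting \(d\) throughout gives \(f\leq g\text{.}\)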

(b)

Suppose there were feasible \(\mathbf{x_0}, \mathbf{y_0}\) for which \(f(\mathbf{x_0})= g(\mathbf{y_0})\text{.}\) What then must be true about these solutions? Can we prove our assertion?

(d)

Come up with a primal max problem (and corresponding min dual) where \(A, b, c, d, \mathbf{x}, \mathbf{y}\) all have integer values, so that the primal max and dual min problems achieve optimal solutions \(\mathbf{x_0}, \mathbf{y_0}\text{,}\) where \(f(\mathbf{x_0}) \lt g(\mathbf{y_0})\text{.}\)

(e)

Using the same values for \(A, b, c, d\) for the problem we just constructed, suppose we relax the condition that all our values must be integers. What can we say about the optimal solutions then?
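One possible construction for parts (d) and (e) (a toy instance of our own; the text may intend something different): maximize \(f = x\) subject to
\begin{equation*} 2x\leq 1, \quad x\geq 0, \end{equation*}
so \(A = \begin{bmatrix} 2\end{bmatrix}, b = \begin{bmatrix} 1\end{bmatrix}, c = \begin{bmatrix} 1\end{bmatrix}, d = 0\text{.}\) Over the integers the primal optimum is \(x_0 = 0\) with \(f(x_0)=0\text{,}\) while the dual (minimize \(g = y\) subject to \(2y\geq 1, y\geq 0\)) has integer optimum \(y_0=1\) with \(g(y_0)=1\text{,}\) so \(f(x_0)\lt g(y_0)\text{.}\) Once integrality is relaxed, both problems are optimized at \(x=y=\frac{1}{2}\text{,}\) where \(f=g=\frac{1}{2}\text{.}\)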
We have now shown that if \(f=g\) for a pair of feasible solutions, then both solutions are optimal. It would be good to prove that this is an "if and only if" statement.

Activity 4.2.5.

(a)

Sketch two non-empty convex sets \(U, V\text{.}\) What does \(H\) look like here?

Activity 4.2.6.

We prove the case of Theorem 4.2.4 in which there are \(u_0\in U, v_0\in V\) attaining the minimum of \(\{\|u-v\|: u\in U, v\in V\}\text{;}\) that is, we assume such a minimizing pair exists.

(a)

Without loss of generality, let \(v_0=0\text{.}\) Why can we do this?

(b)

Let \(\mathbf{h} = u_0-v_0\text{.}\) Sketch \(u_0, v_0, \mathbf{h}\) and \(H:=\{\mathbf{x}\in \R^n: \mathbf{h}^\top \mathbf{x}=0\}\text{.}\)

(c)

We want to show that \(H\) is the separating hyperplane. Suppose that \(U\) was not contained in \(\{\mathbf{x}\in \R^n: \mathbf{h}^\top\mathbf{x}\geq 0\}\text{.}\) What must be true about \(U\text{?}\)

(d)

Let \(u'\in U\) be such that \(\mathbf{h}^\top u'\leq 0\text{.}\) Sketch \(u'\text{.}\)

(e)

Let \(\mathbf{d} = u'-u_0\text{.}\) Describe geometrically what \(\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\) represents. (Think about dot products and projections, and take \(\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\) to start at \(u_0\text{.}\))

(f)

Let \(\alpha = \frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\text{,}\) show that \(0\lt \alpha \lt 1\text{.}\)

(g)

Let \(w=\frac{-u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}\cdot \mathbf{d}+u_0\text{,}\) and show that \(w = \alpha u' + (1-\alpha)u_0\text{.}\)

(h)

Show that \(w\in U\text{.}\)

(i)

Show that \(w^\top w = u_0^\top u_0 + \alpha \mathbf{d}^\top \mathbf{d}\left(2 \frac{u_0^\top \mathbf{d}}{\mathbf{d}^\top \mathbf{d}}+\alpha\right)\text{,}\) and explain why \(w^\top w \lt u_0^\top u_0\text{.}\)

(j)

Why does the last statement result in a contradiction?
We now introduce a key idea which will tie together the primal and dual problems.

Definition 4.2.7.

Let \(S\subseteq \R^n\text{.}\) We define the cone of \(S\text{,}\) denoted \(\cone(S)\text{,}\) to be the set \(\cone(S):=\left\{\displaystyle \sum_{i=1}^m u_i\mathbf{s}_i: u_i\in \R, u_i\geq 0, \mathbf{s}_i\in S \right\}\text{.}\)
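For example, for a single nonzero vector \(\mathbf{s}\text{,}\) \(\cone(\{\mathbf{s}\})\) is the ray \(\{u\mathbf{s}: u\geq 0\}\text{,}\) and in \(\R^2\) the cone of the standard basis, \(\cone\left(\left\{\begin{bmatrix}1\\0\end{bmatrix}, \begin{bmatrix}0\\1\end{bmatrix}\right\}\right)\text{,}\) is the first quadrant \(\{\mathbf{x}\in\R^2: \mathbf{x}\geq 0\}\text{.}\)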

Activity 4.2.8.

(a)

In \(\R^2\text{,}\) describe \(\cone\left( \left\{ \begin{bmatrix} 1\\1\end{bmatrix}, \begin{bmatrix} 1\\-2\end{bmatrix} \right\} \right)\text{.}\)

(b)

Prove that for any \(S\subseteq \R^n, \cone(S)\) is convex.

(c)

Let \(A\) denote an \(n\times m\) matrix, and let \(P\) denote the cone of the columns of \(A\text{.}\)
Suppose \(\mathbf{b}\not\in P\text{.}\) What does Theorem 4.2.4 tell us?
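A brief sketch of where this points (our reading; the text leaves the details to the activity): \(\{\mathbf{b}\}\) is convex, and by part (b) so is \(P\text{,}\) so when \(\mathbf{b}\not\in P\) Theorem 4.2.4 should supply a hyperplane separating \(\mathbf{b}\) from \(P\text{.}\)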

Activity 4.2.10.

We prove Theorem 4.2.9 and a useful corollary.
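Reading ahead to the parts below, the two alternatives of Theorem 4.2.9 appear to be: (A) some \(\x\geq 0\) satisfies \(A\x=\vb\text{;}\) (B) some \(\y\) satisfies \(\y^\top A\geq \mathbf{0}^\top\) and \(\y^\top\vb\lt 0\text{.}\) As a sanity check (a toy instance of our own), take \(A = I_2\) and \(\vb = \begin{bmatrix} -1 \\ -1 \end{bmatrix}\text{:}\) no \(\x\geq 0\) satisfies \(A\x=\vb\text{,}\) and indeed \(\y = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\) gives \(\y^\top A = \begin{bmatrix} 1 & 1 \end{bmatrix}\geq \mathbf{0}^\top\) with \(\y^\top\vb = -2\lt 0\text{.}\)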

(b)

Suppose both cases (A) and (B) of Theorem 4.2.9 held at the same time. Use the product \(\y^\top A\x\) to derive a contradiction.

(e)

Let’s denote the normal vector of the separating hyperplane by \(\y\) (interesting choice 👀) so that \(\y^\top \vb\lt \y^\top p\) for any \(p\in P\text{.}\) Why must \(\y^\top \vb \lt 0\text{?}\)

(f)

Suppose \(A\) had a column \(A_j\) such that \(\y^\top A_j\lt 0\text{.}\) Show that there is an \(\alpha>0\) such that \(\y^\top (\alpha A_j)\lt \y^\top \vb \text{.}\) Why is this a contradiction?

(h)

Now that Theorem 4.2.9 is proven, we apply it to \(\begin{bmatrix} A & I_n \end{bmatrix} \) and \(\vb \text{.}\)
Suppose (A) held, so that there is a \(\begin{bmatrix} \x \\ \vt\end{bmatrix} \geq 0\) with \(\begin{bmatrix} A & I_n \end{bmatrix} \begin{bmatrix} \x \\ \vt\end{bmatrix} = \vb \text{.}\) How would \(A\x\) compare to \(\vb \text{?}\)

(i)

Suppose (A) failed. Then there is a \(\y\in \R^n\) satisfying (B) for \(\begin{bmatrix} A & I_n \end{bmatrix}, \vb \text{.}\)
What can we say about \(\y^\top A\text{,}\) \(\y^\top \vb\text{,}\) and \(\y\) compared to \(\mathbf{0}\) and \(0\text{?}\)
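Collecting (h) and (i), the corollary being assembled here (as we read it from these parts) is: exactly one of the following holds. (A) There is an \(\x\geq 0\) with \(A\x\leq \vb\text{.}\) (B) There is a \(\y\geq 0\) with \(\y^\top A\geq \mathbf{0}^\top\) and \(\y^\top \vb\lt 0\text{.}\)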

Activity 4.2.13. Proof of the Strong Duality Theorem.

We finally prove Theorem 4.2.12.

(a)

Suppose that an optimal dual solution \(\y^*\) exists. Explain why, by Activity 4.2.2, it suffices to show that \(f(\x^*)\geq g(\y^*)\) for some feasible \(\x^*\text{.}\)

(b)

Without loss of generality, let \(d=0\) and let \(g^* = (\y^*)^\top \vb \text{.}\)
Consider the matrix \(M = \begin{bmatrix} -\vc^\top \\ A\end{bmatrix}\) and the vector \(\mathbf{d} = \begin{bmatrix} -g^* \\ \vb \end{bmatrix}\text{.}\) Apply Corollary 4.2.11 to \(M, \mathbf{d}\text{.}\) What does it mean for (A) to hold?
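A sketch of the unpacking (our reading, assuming (A) asserts an \(\x\geq 0\) with \(M\x\leq \mathbf{d}\)): the block structure gives
\begin{equation*} M\x\leq \mathbf{d} \iff -\vc^\top \x \leq -g^* \text{ and } A\x\leq \vb, \end{equation*}
that is, \(\x\) is primal feasible and \(\vc^\top\x\geq g^*\text{.}\)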

(c)

If Corollary 4.2.11 (A) holds for \(M, \mathbf{d}\text{,}\) then it holds for the pair \(-\vc^\top, \begin{bmatrix} -g^*\end{bmatrix} \text{,}\) as well as the pair \(A, \vb \) for the same \(\x\in \R^m\text{.}\) Why does this show that a feasible optimal solution \(\x^*\) exists and that \(f(\x^*)=g(\y^*)\text{?}\)

(d)

On the other hand, suppose Corollary 4.2.11 (B) holds for \(M, \mathbf{d}\text{.}\) What would it mean for (B) to hold?

(e)

We would like to derive a contradiction.
Let \(\begin{bmatrix} y_0 & \y^\top \end{bmatrix}\) denote the vector produced by Corollary 4.2.11 (B). Suppose \(y_0=0\text{.}\) How would \((\y^*+\y)^\top A\) compare to \((\y^*)^\top A\text{,}\) and \((\y^*+\y)^\top \vb \) compare to \((\y^*)^\top \vb \text{?}\) Why is this a contradiction?

(f)

Suppose \(y_0>0\text{.}\) Let \(\y' := \frac{\y}{y_0}\text{.}\)
Show that since \(\begin{bmatrix} y_0 & \y^\top \end{bmatrix}M\geq 0\text{,}\) we have \((\y')^\top A \geq \vc^\top\text{.}\)

(g)

Show that since \(\begin{bmatrix} y_0 & \y^\top \end{bmatrix}\mathbf{d} = -1\text{,}\) we have \((\y')^\top \vb= g^*-\frac{1}{y_0}\text{.}\)

(h)

Explain why (f) and (g) produce a contradiction.

Activity 4.2.14. Complementary Slackness.

We assume the primal problem is canonical.

(a)

Prove that \(g-f = \vs^\top\x + \y^\top\vt \text{.}\)
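With the usual slack conventions (our reading of the canonical setup earlier in the text: \(\vt := \vb - A\x\) for the primal and \(\vs := A^\top\y - \vc\) for the dual), one possible derivation runs
\begin{equation*} g - f = \y^\top\vb - \vc^\top\x = \y^\top(A\x+\vt) - \vc^\top\x = (\y^\top A - \vc^\top)\x + \y^\top\vt = \vs^\top\x + \y^\top\vt. \end{equation*}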

(b)

Suppose \(\x\) is a feasible extreme point, and \(\y, \vs\) are the corresponding coefficients for the normal vectors. If \(x_j\neq 0\text{,}\) what does that say about \(s_j\text{?}\)
Hint.
\(s_j\) is the coefficient of the normal vector for the plane \(-x_j=0\text{.}\) If the feasible solution does not lie on \(-x_j=0\text{,}\) what can we say about \(s_j\text{?}\)
What about \(s_j\neq 0\text{?}\) \(y_i\neq 0\text{?}\) \(t_i\neq 0\text{?}\)

(c)

If \(\x_*, \y_*\) are feasible optimal solutions with slack variables \(\vt_*, \vs_*\text{,}\) what must be true about \(\x_*^\top\vs_* + \y_*^\top\vt_*\text{?}\)

Definition 4.2.15.

Feasible variables \(\x, \y, \vs, \vt \) are said to exhibit complementary slackness if \(\vs^\top\x=0, \y^\top\vt=0\text{.}\)
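To illustrate with our toy instance from Activity 4.2.2 (maximize \(f=x\) subject to \(2x\leq 1, x\geq 0\)): at the relaxed optima \(x=y=\frac{1}{2}\) the slacks are \(\vt = \vb - A\x = \begin{bmatrix} 0 \end{bmatrix}\) and \(\vs = A^\top\y - \vc = \begin{bmatrix} 0 \end{bmatrix}\text{,}\) so \(\vs^\top\x=0\) and \(\y^\top\vt=0\text{,}\) and these solutions exhibit complementary slackness.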