Section 4.2 Duality Theory
In this section, we establish the theoretical underpinnings of duality. This is a proof-heavy section.
Activity 4.2.2.
In this activity, we explore a foundational relationship between the primal max problem and its dual, called weak duality.
(a)
Consider the matrix product \(\mathbf{y}^{\top}A\mathbf{x}\text{.}\) Use this product to show that \(g\geq f\text{.}\)
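For reference, here is a sketch of the intended sandwich argument, assuming the canonical forms from Section 4.1 (and taking \(d=0\) for simplicity): the primal maximizes \(f(\mathbf{x})=\mathbf{c}^\top\mathbf{x}\) subject to \(A\mathbf{x}\leq\mathbf{b}\text{,}\) \(\mathbf{x}\geq\mathbf{0}\text{,}\) while the dual minimizes \(g(\mathbf{y})=\mathbf{b}^\top\mathbf{y}\) subject to \(A^\top\mathbf{y}\geq\mathbf{c}\text{,}\) \(\mathbf{y}\geq\mathbf{0}\text{.}\) Then
\begin{equation*}
f(\mathbf{x}) = \mathbf{c}^\top\mathbf{x} \leq (A^\top\mathbf{y})^\top\mathbf{x} = \mathbf{y}^\top A\mathbf{x} \leq \mathbf{y}^\top\mathbf{b} = g(\mathbf{y}),
\end{equation*}
where the first inequality uses \(A^\top\mathbf{y}\geq\mathbf{c}\) and \(\mathbf{x}\geq\mathbf{0}\text{,}\) and the second uses \(A\mathbf{x}\leq\mathbf{b}\) and \(\mathbf{y}\geq\mathbf{0}\text{.}\)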
(b)
Suppose there were feasible \(\mathbf{x_0}, \mathbf{y_0}\) for which \(f(\mathbf{x_0})= g(\mathbf{y_0})\text{.}\) What then must be true about these solutions? Can we prove our assertion?
(c)
(d)
Come up with a primal max problem (and corresponding min dual) where \(A, b, c, d, \mathbf{x}, \mathbf{y}\) all have integer values, so that the primal max and dual min problems achieve optimal solutions \(\mathbf{x_0}, \mathbf{y_0}\text{,}\) where \(f(\mathbf{x_0}) \lt g(\mathbf{y_0})\text{.}\)
(e)
Using the same values for \(A, b, c, d\) for the problem we just constructed, suppose we relax the condition that all our values must be integers. What can we say about the optimal solutions then?
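For instance, one possible construction for these last two parts (a sketch, not the only answer) takes \(A=\begin{pmatrix}2\end{pmatrix}\text{,}\) \(b=1\text{,}\) \(c=1\text{,}\) \(d=0\text{:}\) the primal maximizes \(f=x_1\) subject to \(2x_1\leq 1\text{,}\) \(x_1\geq0\text{,}\) and the dual minimizes \(g=y_1\) subject to \(2y_1\geq 1\text{,}\) \(y_1\geq0\text{.}\) Over the integers the optima are \(x_1=0\) and \(y_1=1\text{,}\) so \(f(\mathbf{x_0})=0\lt 1=g(\mathbf{y_0})\text{.}\) Relaxing integrality, both problems are optimized at \(1/2\text{,}\) where \(f=g=1/2\) and the gap closes.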
What we have proven is the following:
Proposition 4.2.3.
For a primal maximization problem with objective function \(f\) and dual objective \(g\text{,}\) we have for any feasible primal solution \(\x\) and feasible dual solution \(\y\) that
\begin{equation*}
f(\x)\leq g(\y).
\end{equation*}
In particular, if there is a feasible primal solution \(\x^*\) and feasible dual solution \(\y^*\) such that
\begin{equation*}
f(\x^*)=g(\y^*)
\end{equation*}
then \(\x^*, \y^*\) are optimal solutions for the primal and dual problems respectively.
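Weak duality is easy to test numerically for any particular feasible pair. Below is a minimal sketch in Python; the data \(A, b, c\) and the feasible points \(\x, \y\) are made-up values for illustration, not taken from the text.

```python
# A minimal numerical illustration of weak duality (made-up data).
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 5.0])

x = np.array([1.0, 1.0])  # primal feasible: A @ x <= b and x >= 0
y = np.array([2.0, 1.0])  # dual feasible: A.T @ y >= c and y >= 0

# The sandwich from Activity 4.2.2(a): f(x) = c.x <= y.(Ax) <= y.b = g(y).
print(c @ x, y @ (A @ x), y @ b)  # 8.0 10.0 14.0
```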
We now know that if \(f=g\) for a pair of feasible solutions, then both solutions are optimal. It would be good if the converse were also true. This is encapsulated by the following theorem.
Theorem 4.2.4. The Strong Duality Theorem.
Given a pair of primal max-dual min problems, the primal max problem has an optimal solution \(\x^*\) if and only if the dual min problem has an optimal solution \(\y^*\text{.}\) Moreover, \(f(\x^*)=g(\y^*)\text{.}\)
To prove this, we recall the idea we explored in Section 4.1: dual variables are the coefficients, or weights, of the normal vectors of the bounding hyperplanes.
Activity 4.2.5.
Before we delve into the proof, we illustrate the idea with a concrete case. Consider a canonical optimization problem captured by the following tableau (for simplicity's sake, we let \(d=0\)):
| \(x_1\) | \(x_2\) | \(x_3\) | \(x_4\) | \(-1\) |  |
| --- | --- | --- | --- | --- | --- |
| \(a_{11}\) | \(a_{12}\) | \(a_{13}\) | \(a_{14}\) | \(b_1\) | \(=-t_1\) |
| \(a_{21}\) | \(a_{22}\) | \(a_{23}\) | \(a_{24}\) | \(b_2\) | \(=-t_2\) |
| \(a_{31}\) | \(a_{32}\) | \(a_{33}\) | \(a_{34}\) | \(b_3\) | \(=-t_3\) |
| \(c_1\) | \(c_2\) | \(c_3\) | \(c_4\) | \(0\) | \(=f\) |
After pivoting, we achieve optimality with the following tableau:
| \(t_3\) | \(x_3\) | \(x_1\) | \(t_2\) | \(-1\) |  |
| --- | --- | --- | --- | --- | --- |
| \(a_{11}^*\) | \(a_{12}^*\) | \(a_{13}^*\) | \(a_{14}^*\) | \(b_1^*\) | \(=-x_4\) |
| \(a_{21}^*\) | \(a_{22}^*\) | \(a_{23}^*\) | \(a_{24}^*\) | \(b_2^*\) | \(=-t_1\) |
| \(a_{31}^*\) | \(a_{32}^*\) | \(a_{33}^*\) | \(a_{34}^*\) | \(b_3^*\) | \(=-x_2\) |
| \(c_1^*\) | \(c_2^*\) | \(c_3^*\) | \(c_4^*\) | \(d^*\) | \(=f\) |
(a)
Let \(y_i^*=-c^*_?\) if \(c_?^*\) is the coefficient for \(t_i\) in the optimal tableau, and let \(y^*_i=0\) otherwise. Similarly let \(s_j^*=-c^*_?\) if \(c_?^*\) is the coefficient for \(x_j\) in the optimal tableau, and let \(s^*_j=0\) otherwise.
Recall that the optimal tableau represents a reformulation of the original problem where \(f=c_1x_1+c_2x_2+c_3x_3+c_4x_4\) is instead rewritten as \(f=c_1^*t_3+c_2^*x_3+c_3^*x_1+c_4^*t_2-d^*\text{.}\)
Why must
\begin{equation*}
\displaystyle \sum_{j=1}^4 c_jx_j = \sum_{i=1}^3 y_i^*(-t_i) + \sum_{j=1}^4 s_j^*(-x_j)-d^*?
\end{equation*}
(b)
Use the fact that \(t_i = b_i-\left(\displaystyle \sum_{j=1}^4 a_{ij}x_j\right)\) to rewrite the above equality without \(t_i\)’s. Regroup the right hand side so we have the form:
\begin{equation*}
\displaystyle \sum_{j=1}^4 c_jx_j = (\text{linear function in terms of }x_j\text{'s})+(\text{constant}).
\end{equation*}
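For reference, carrying out this substitution and regrouping (a sketch of the computation) yields
\begin{equation*}
\displaystyle \sum_{j=1}^4 c_jx_j = \sum_{j=1}^4\left(\sum_{i=1}^3 y_i^*a_{ij} - s_j^*\right)x_j - \left(\sum_{i=1}^3 y_i^*b_i + d^*\right),
\end{equation*}
which is the promised form: a linear function of the \(x_j\)'s plus a constant.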
(c)
Recall that the equality we established above is an equality of functions of the \(x_j\)'s. What must the constant portion of the right-hand side be equal to? What must the linear function of the \(x_j\)'s on the right-hand side be equal to?
(d)
Prove that \(g(\y^*)=\displaystyle \sum_{i=1}^3 y_i^*b_i=-d^*.\)
(e)
We now need to show that these \(y_i^*\)'s form a feasible solution to the dual problem. Why do we have that \(c_j = \displaystyle \sum_{i=1}^3 y_i^*a_{ij}-s_j^*\text{?}\)
(f)
Why is each \(y_i^*, s_j^*\geq 0\text{?}\) Why does this show that \(\displaystyle \sum_{i=1}^3 y_i^*a_{ij} \geq c_j\text{?}\) This shows that the \(s_j^*\)'s are non-negative slack variables for the dual constraints and that the \(y_i^*\)'s are a feasible solution.
(g)
Show by Weak Duality (Proposition 4.2.3) that the \(y_i^*\)'s are an optimal dual solution.
Activity 4.2.6.
Adapt the arguments of Activity 4.2.5 to prove the Strong Duality Theorem (Theorem 4.2.4) for a general primal-dual optimization problem
| \(x_1\) | \(x_2\) | \(\cdots\) | \(x_m\) | \(-1\) |  |
| --- | --- | --- | --- | --- | --- |
| \(a_{11}\) | \(a_{12}\) | \(\cdots\) | \(a_{1m}\) | \(b_1\) | \(=-t_1\) |
| \(a_{21}\) | \(a_{22}\) | \(\cdots\) | \(a_{2m}\) | \(b_2\) | \(=-t_2\) |
| \(\vdots\) | \(\vdots\) | \(\ddots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
| \(a_{n1}\) | \(a_{n2}\) | \(\cdots\) | \(a_{nm}\) | \(b_n\) | \(=-t_n\) |
| \(c_1\) | \(c_2\) | \(\cdots\) | \(c_m\) | \(0\) | \(=f\) |
which we suppose, after pivoting, achieves optimality in some general optimal tableau.
Use the facts that
\begin{equation*}
\displaystyle \sum_{j=1}^m c_jx_j = \sum_{i=1}^n y_i^*(-t_i) + \sum_{j=1}^m s_j^*(-x_j)-d^*,
\end{equation*}
that \(t_i = b_i-\left(\displaystyle \sum_{j=1}^m a_{ij}x_j\right)\text{,}\) and that by construction each \(y_i^*, s_j^*\geq0\text{.}\)
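As a numerical companion to this argument, one can solve a small primal-dual pair with an off-the-shelf solver and watch the two optimal values coincide. The sketch below uses scipy.optimize.linprog with made-up data \(A, b, c\text{;}\) it illustrates the theorem rather than proving it.

```python
# Numerically illustrate strong duality on a small made-up problem.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 5.0])

# Primal: max c.x s.t. Ax <= b, x >= 0 (linprog minimizes, so negate c).
primal = linprog(-c, A_ub=A, b_ub=b)  # default bounds already give x >= 0
# Dual: min b.y s.t. A^T y >= c, y >= 0 (rewritten as -A^T y <= -c).
dual = linprog(b, A_ub=-A.T, b_ub=-c)

print(-primal.fun, dual.fun)  # both print 10.8, so f(x*) = g(y*)
```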
An alternative approach to proving this theorem is provided in Section 9.4.
Activity 4.2.7. Complementary Slackness.
We assume the primal problem is canonical.
(a)
Prove that \(g-f = \vs^\top\x + \y^\top\vt \text{.}\)
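For reference, here is a sketch of one such computation, assuming the canonical relations \(\vt=\mathbf{b}-A\x\) and \(\vs=A^\top\y-\mathbf{c}\) (and \(d=0\)):
\begin{equation*}
g-f = \mathbf{b}^\top\y - \mathbf{c}^\top\x = (\vt+A\x)^\top\y - (A^\top\y-\vs)^\top\x = \vs^\top\x + \y^\top\vt.
\end{equation*}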
(b)
Suppose \(\x\) is a feasible extreme point, and \(\y, \vs\) are the corresponding coefficients for the normal vectors. If \(x_j\neq 0\text{,}\) what does that say about \(s_j\text{?}\)
Hint.
\(s_j\) is the coefficient of the normal vector for the plane \(-x_j=0\text{.}\) If the feasible solution does not lie on \(-x_j=0\text{,}\) what can we say about \(s_j\text{?}\)
What about \(s_j\neq 0\text{?}\) \(y_i\neq0\text{?}\) \(t_i\neq0\text{?}\)
(c)
Give an example of a linear optimization primal-dual problem, and feasible solutions \(\x, \y\) with slack variables \(\vt, \vs\) where \(\vs^\top\x + \y^\top\vt > 0\text{.}\)
(d)
If \(\x^*, \y^*\) are feasible optimal solutions with slack variables \(\vt^*, \vs^*\text{,}\) what must be true about \((\vs^*)^\top\x^* + (\y^*)^\top\vt^*\text{?}\)
Definition 4.2.8.
Feasible variables \(\x, \y\) with slack variables \(\vs, \vt \) are said to exhibit complementary slackness if \(\vs^\top\x=0, \y^\top\vt=0\text{.}\)
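To see the definition in action, the strong duality sketch from earlier can be extended to check complementary slackness at an optimal pair (again with made-up data):

```python
# Check complementary slackness at an optimal pair (made-up data).
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 5.0])

primal = linprog(-c, A_ub=A, b_ub=b)   # max c.x s.t. Ax <= b, x >= 0
dual = linprog(b, A_ub=-A.T, b_ub=-c)  # min b.y s.t. A^T y >= c, y >= 0
x, y = primal.x, dual.x

t = b - A @ x        # primal slack variables
s = A.T @ y - c      # dual slack variables
print(s @ x, y @ t)  # both (numerically) zero, as the definition requires
```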