Saturday, August 8, 2015

Optimizing Part of the Objective Function II

Today's post extends yesterday's entry about optimizing a portion of the original objective function. I won't repeat much of the previous post, to save myself some typing.

The actual question motivating these posts, which I originally misread, asked how to minimize, not maximize, the sum of the $K$ largest terms in $\sum_{i=1}^N x_i$ (where $x$ is a vector of decision variables in an optimization model). As I mentioned in the last outing, that (or the symmetric case, maximizing the sum of the $K$ smallest terms) is trickier than what I covered yesterday. Everything from yesterday's post (the binary variables $z_i$, the auxiliary variables $y_i$, and the added constraints) carries over, with one exception: the objective function is now$$\textrm{minimize} \sum_{i=1}^N y_i.$$On top of all that, we require some additional constraints.

Update

Forget everything that follows and just skip down to the comments section. There's a clever reformulation, posted by Michael Grant in a comment, that gets the job done with a linear number of new variables (all continuous) and a linear number of constraints. I knew another formulation with no integer variables, and in fact just one new variable, but with an exponential number of constraints. I was debating whether to post it until I saw the more efficient solution in Michael's comment.
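For concreteness, here is a minimal sketch of that reformulation, exactly as Michael spells it out in his comment below. PuLP is simply my choice of modeling layer for illustration; the variable bounds and the constraint $\sum_i x_i = 10$ are made-up stand-ins for whatever the real model contains.

```python
# A minimal sketch (mine, not from the original question) of the LP
# reformulation in Michael Grant's comment below, using PuLP.
import pulp

N, K = 5, 3
prob = pulp.LpProblem("lp_sum_k_largest", pulp.LpMinimize)
x = [pulp.LpVariable(f"x{i}", lowBound=0, upBound=4) for i in range(N)]  # demo bounds
v = [pulp.LpVariable(f"v{i}", lowBound=0) for i in range(N)]
q = pulp.LpVariable("q")  # free variable; may be negative in general

# Epigraph of the sum of the K largest x_i: at an optimum,
# v_i = max(x_i - q, 0), so the objective equals that sum.
prob += pulp.lpSum(v) + K * q
for i in range(N):
    prob += x[i] <= v[i] + q
prob += pulp.lpSum(x) == 10  # toy stand-in for "constraints on x"

prob.solve()
print([xi.value() for xi in x], pulp.value(prob.objective))
# Spreading the mass evenly (x_i = 2) is optimal here: objective value 6.
```

Note that this works because we are minimizing (or bounding above) a convex function; the epigraph trick would not linearize the maximization version.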

5. Lower bounds on auxiliary variables


If we do nothing to prevent it, the solver will achieve an ideal objective value of 0 by setting $y_i = 0$ regardless of whether $z_i$ is 0 or 1. So we need additional constraints to ensure that $z_i = 1 \implies y_i = x_i$. We can accomplish that with$$y_i \ge x_i - U_i(1 - z_i)\quad \forall i \in \{1,\dots,N\},$$which forces $y_i\ge x_i$ if $z_i = 1$ and is vacuous if $z_i = 0$, since in that case the right-hand side is at most zero (recall $x_i \le U_i$).

6. Choosing the $K$ largest terms


Choosing the largest of the $x_i$ was automatic when the objective was to maximize the sum. When the objective is minimization, the solver will "want" to choose the smallest of the $x_i$, so we need to constrain it to select the largest values. An easy way to do this is to note that choosing the $K$ largest values is equivalent to saying that every value chosen is at least as big as every value not chosen. Expressed mathematically,$$z_i = 1 \wedge z_j = 0 \implies x_i \ge x_j.$$Note that, having assumed $0\le x_k \le U_k\,\forall k$, the most negative that any difference $x_i - x_j$ could be is $0 - U_j = -U_j$. That leads to the following additional constraints:$$x_i - x_j \ge -U_j(1-z_i+z_j)\quad \forall i,j\in \{1,\dots,N\},\,i\neq j.$$When $z_i = 1$ and $z_j = 0$, the right-hand side is 0, forcing $x_i \ge x_j$; in every other case it is $-U_j$ or $-2U_j$, and the constraint is vacuous.
Coupled with yesterday's formulation, that should get the job done.
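To see the whole thing assembled, here is a minimal sketch in PuLP (my choice of tooling, not the original poster's). I am assuming the pieces carried over from yesterday are $\sum_i z_i = K$, $y_i \le x_i$ and $y_i \le U_i z_i$; the bounds and the constraint $\sum_i x_i = 10$ are toy stand-ins for the real model.

```python
# A minimal sketch (mine) of the full MIP: minimize the sum of the K
# largest x_i.  Assumed carry-overs from the previous post: sum(z) == K,
# y_i <= x_i, and y_i <= U_i * z_i.  New here: constraints (5) and (6).
import pulp

N, K = 5, 3
U = [4.0] * N  # assumed upper bounds 0 <= x_i <= U_i

prob = pulp.LpProblem("mip_sum_k_largest", pulp.LpMinimize)
x = [pulp.LpVariable(f"x{i}", lowBound=0, upBound=U[i]) for i in range(N)]
y = [pulp.LpVariable(f"y{i}", lowBound=0) for i in range(N)]
z = [pulp.LpVariable(f"z{i}", cat="Binary") for i in range(N)]

prob += pulp.lpSum(y)        # objective: sum of the K largest x_i
prob += pulp.lpSum(z) == K   # select exactly K terms
prob += pulp.lpSum(x) == 10  # toy stand-in for the rest of the model

for i in range(N):
    prob += y[i] <= x[i]                      # carried over from part I
    prob += y[i] <= U[i] * z[i]               # carried over from part I
    prob += y[i] >= x[i] - U[i] * (1 - z[i])  # (5): z_i = 1 forces y_i = x_i
    for j in range(N):
        if j != i:
            # (6): every chosen x_i is at least as large as every unchosen x_j
            prob += x[i] - x[j] >= -U[j] * (1 - z[i] + z[j])

prob.solve()
print([xi.value() for xi in x], pulp.value(prob.objective))
# As with the LP version above, x_i = 2 for all i is optimal: objective 6.
```

Note the $N(N-1)$ pairwise constraints in (6): this formulation grows quadratically with $N$, which is one more reason to prefer the reformulation in the update above when it applies.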

9 comments:

  1. Paul: in fact, the sum of the largest $K$ elements of a vector is a convex function. There is no need whatsoever for binary variables to implement it. It's implemented, for instance, in the function "sum_largest" in CVX. I can give you more information via email if you like.

    Replies
    1. Michael, I agree that sum_largest is a convex function, but it is nonlinear (and not everywhere differentiable). For a convex solver of the right sort -- and I assume this includes some of the solvers CVX supports -- this is fine. The original source of the question posted it on a CPLEX forum, though, so I assume that he is looking at what would otherwise be a problem that is linear or at worst quadratically constrained. Do you see a way to avoid binary variables and still use CPLEX (or Gurobi or another equivalent solver)?

    2. Yes, it is nonlinear, but it is polyhedral. Just like the absolute value function, it admits an efficient LP representation. I'm not sure of the best place to offer the derivation. Perhaps you could link to the forum post, and I can go there and offer it as an answer, assuming that forum supports MathJax...

    3. It actually has a very useful interpretation in portfolio optimization: it can be used to ensure that, say, no more than 50% of your money is spent on any 10% of your stocks.

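Since this thread mentions both the sum_largest function and the portfolio-style cap, here is a sketch of mine showing both in CVXPY, the Python sibling of CVX. The sizes and the returns vector mu are made-up demo data, not anything from the comments.

```python
# A sketch (mine) of the two ideas in this thread, using CVXPY.
import numpy as np
import cvxpy as cp

# 1. sum_largest as an objective: minimize the sum of the K largest x_i
#    (the same toy problem as the sketches above, now via the built-in atom).
n, K = 5, 3
x = cp.Variable(n, nonneg=True)
prob = cp.Problem(cp.Minimize(cp.sum_largest(x, K)),
                  [cp.sum(x) == 10, x <= 4])
prob.solve()
print(x.value, prob.value)  # spreading evenly (x_i = 2) is optimal: value 6

# 2. sum_largest in a constraint: at most 50% of the budget on any 10% of
#    the stocks, as in the comment above.
n = 100
mu = np.linspace(0.02, 0.10, n)  # assumed expected returns, demo data only
w = cp.Variable(n, nonneg=True)  # portfolio weights
port = cp.Problem(cp.Maximize(mu @ w),
                  [cp.sum(w) == 1,
                   cp.sum_largest(w, n // 10) <= 0.5])
port.solve()
print(port.value)
```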
  2. I would never normally recommend anyone read my dissertation, but you can see a derivation on page 160: https://web.stanford.edu/~boyd/papers/pdf/disc_cvx_prog.pdf

    Replies
    1. Actually, slight correction: that derivation is for the sum of the $k$ largest absolute values. But I believe that simply dropping one "side" of the absolute value constraints recovers the non-absolute-value case.

  3. This comment has been removed by the author.

  4. Aha, I just realized comments here do get rendered by MathJax. Here you go (the additional variables are $v_i$, $i=1,2,\dots,N$, and $q$):$$\begin{array}{ll}\text{minimize} & \sum_i v_i + K q \\ \text{subject to} & x_i \leq v_i + q, ~ i=1,2,\dots, N \\ & v_i\geq 0, ~ i=1,2,\dots, N \\ & \text{constraints on $x$}\end{array}$$

  5. I couldn't find an original reference for the LP representation, but in the dissertation I cited "On the Sum of the Largest Eigenvalues of a Symmetric Matrix" by Overton and Womersley. http://epubs.siam.org/doi/abs/10.1137/0613006 ... there is a PS file out there if you Google for it...

