Minimum relevant variables in linear system explained

Minimum relevant variables in linear system (Min-RVLS) is a problem in mathematical optimization. Given a linear program, it is required to find a feasible solution in which the number of non-zero variables is as small as possible.

The problem is known to be NP-hard and even hard to approximate.

Definition

A Min-RVLS problem is defined by:[1]

The linear system is given by: A x R b. It is assumed to be feasible (i.e., satisfied by at least one x). Depending on R, there are four different variants of this system: A x = b, A x ≥ b, A x > b, A x ≠ b.

The goal is to find an n-by-1 vector x that satisfies the system A x R b, and subject to that, contains as few as possible nonzero elements.

Special case

The problem Min-RVLS[=] was presented by Garey and Johnson,[2] who called it "minimum weight solution to linear equations". They proved it was NP-hard, but did not consider approximations.

Applications

The Min-RVLS problem is important in machine learning and linear discriminant analysis. Given a set of positive and negative examples, it is required to minimize the number of features that are required to correctly classify them.[3] The problem is known as the minimum feature set problem. An algorithm that approximates Min-RVLS within a factor of

O(log(m))

could substantially reduce the number of training samples required to attain a given accuracy level.[4]

The shortest codeword problem in coding theory is the same problem as Min-RVLS[=] when the coefficients are in GF(2).

Related problems

In minimum unsatisfied linear relations (Min-ULR), we are given a binary relation R and a linear system A x R b, which is now assumed to be infeasible. The goal is to find a vector x that violates as few relations as possible, while satisfying all the others.

Min-ULR[≠] is trivially solvable, since any system with real variables and a finite number of inequality constraints is feasible. As for the other three variants:

O(nmn/2n-1)

.
log1-\varepsilonn
2
, for any

\varepsilon>0

, unless NP is contained in DTIME(

n\operatorname{polylog(n)}

).[8]

In the complementary problem maximum feasible linear subsystem (Max-FLS), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.

m\varepsilon

. With coefficients over GF[q], it is q-approximable.

Hardness of approximation

All four variants of Min-RVLS are hard to approximate. In particular all four variants cannot be approximated within a factor of

log1-\varepsilonn
2
, for any

\varepsilon>0

, unless NP is contained in DTIME(

n\operatorname{polylog(n)}

). The hardness is proved by reductions:

On the other hand, there is a reduction from Min-RVLS[=] to Min-ULR[=]. It also applies to Min-ULR[≥] and Min-ULR[>], since each equation can be replaced by two complementary inequalities.

Therefore, when R is in, Min-ULR and Min-RVLS are equivalent in terms of approximation hardness.

Notes and References

  1. Amaldi . Edoardo . Kann . Viggo . On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems . Theoretical Computer Science . December 1998 . 209 . 1–2 . 237–260 . 10.1016/S0304-3975(97)00115-1 . free .
  2. Book: Garey . Michael R. . Johnson . David S. . Computers and Intractability: A Guide to the Theory of NP-completeness . 1979 . W. H. Freeman . 978-0-7167-1044-8 .
  3. Koehler . Gary J. . Linear Discriminant Functions Determined by Genetic Search . ORSA Journal on Computing . November 1991 . 3 . 4 . 345–357 . 10.1287/ijoc.3.4.345 .
  4. Van Horn . Kevin S. . Martinez . Tony R. . The minimum feature set problem . Neural Networks . January 1994 . 7 . 3 . 491–494 . 10.1016/0893-6080(94)90082-5 .
  5. Amaldi . Edoardo . Kann . Viggo . The complexity and approximability of finding maximum feasible subsystems of linear relations . Theoretical Computer Science . August 1995 . 147 . 1–2 . 181–210 . 10.1016/0304-3975(94)00254-G . free .
  6. Sankaran . Jayaram K . A note on resolving infeasibility in linear programs by constraint relaxation . Operations Research Letters . February 1993 . 13 . 1 . 19–20 . 10.1016/0167-6377(93)90079-V .
  7. Book: Greer . R. . Trees and Hills: Methodology for Maximizing Functions of Systems of Linear Relations . 2011 . Elsevier . 978-0-08-087207-0 .
  8. Arora . Sanjeev . Babai . László . Stern . Jacques . Sweedyk . Z . The Hardness of Approximate Optima in Lattices, Codes, and Systems of Linear Equations . Journal of Computer and System Sciences . April 1997 . 54 . 2 . 317–331 . 10.1006/jcss.1997.1472 . free .