Pushing the boundaries of convex optimization

Convex optimization has many applications, ranging from operations research and machine learning to quantum information theory. Convex optimization problems, which involve minimizing a convex function over a convex set, can in theory be solved to any fixed precision in polynomial time. In practice, however, efficient algorithms are known only for special cases.

Some well-known special cases include linear programming, convex quadratic programming, and convex quadratically-constrained quadratic programming, which are the workhorses of operations research. They are also special cases of semidefinite programming (SDP), which optimizes a linear objective over a set of positive semidefinite matrices. SDP has numerous additional applications such as recommender systems [1] and power systems [2].
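
To make this concrete, here is a minimal sketch of an SDP in Python using the CVXPY modeling library. This generic example is not part of our work; it simply minimizes a linear objective tr(CX) over unit-trace positive semidefinite matrices:

```python
import cvxpy as cp
import numpy as np

# A minimal SDP: minimize the linear objective tr(C X) over positive
# semidefinite matrices X with unit trace (illustrative only).
n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
C = (A + A.T) / 2                      # a random symmetric cost matrix

X = cp.Variable((n, n), PSD=True)      # X is constrained to be PSD
problem = cp.Problem(cp.Minimize(cp.trace(C @ X)), [cp.trace(X) == 1])
problem.solve()
print("optimal value:", problem.value)
```

Modeling tools like this are convenient at small scale, but they do not reach the dimensions that arise in machine learning and quantum information.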

An important question is whether it is possible to develop algorithms for a broader class of convex optimization problems that are efficient in both theory and practice.

We begin to answer this question in our paper, “Entropy Penalized Semidefinite Programming” [3], at the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019). In our oral presentation on August 15th (9:30am session on Constraint Satisfaction), we present our work on SDPs with a convex non-linear objective. This generalization, sketched after the list below, has many applications:

  • In natural language processing, a popular approach to language modeling is to use word embeddings. The training of a word embedding can be cast as a semidefinite programming problem, and the training can be improved further by including an entropy term in the objective. However, modern giga-word corpora lead to large instances of SDP.
  • In quantum information theory, entropy-penalized SDPs improve upon traditional SDPs in quantum state tomography, where they can reduce the number of measurements required to estimate the state of a quantum system with a given confidence. The dimension of the quantum state grows exponentially with the number of qubits, however, so 50 qubits already pose a non-trivial challenge.
  • In maximum a posteriori (MAP) estimation in Markov random fields (MRF), including the entropy term likewise improves the rounding of SDP relaxations, but the SDP instances can grow very large.
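
Schematically, an entropy-penalized SDP augments the linear objective of an SDP with a matrix (von Neumann) entropy term. The following is only a sketch of the notation; the penalty weight lambda, the sign convention, and the linear constraint map \mathcal{A} are assumptions here, and the precise formulation appears in the paper [3]:

```latex
% A sketch of an entropy-penalized SDP (notation assumed here; see [3]
% for the precise formulation and sign conventions).
\min_{X \succeq 0} \quad \langle C, X \rangle
  + \lambda \, \operatorname{tr}(X \log X)
\quad \text{subject to} \quad \mathcal{A}(X) = b
```

Here tr(X log X) is the negative von Neumann entropy of X, so the sign of lambda determines whether entropy is rewarded or penalized, which differs across the applications above.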

Existing methods do not make it possible to solve large-scale instances of entropy-penalized SDPs (EP-SDP).

Existing methods for EP-SDP have mostly been based on specialized interior-point methods [4], which either handle the non-linear objective directly or approximate it and then apply standard methods for semidefinite optimization. Either way, the methods rely on inverting a large (Hessian) matrix. This allows interior-point methods to converge quickly, but a single iteration is tractable only for relatively modest matrix sizes, such as 200×200, rather than the large-scale matrices that are typical of machine-learning and quantum-computing applications.

Instead, we show that it is possible to extend so-called low-rank methods, which exploit a particular factorization of the matrix variable, and to compute the gradient of the per-factor Lagrangian, including the entropy term, in linear time. We have also established a convergence result for the method. This makes it possible to solve large instances both in theory and in practice, and pushes the boundaries of what can be done efficiently in convex optimization.
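
Below is a minimal illustration of the low-rank idea: a Burer-Monteiro-style projected-gradient step for a plain Max-Cut-type SDP with the factorization X = V V^T. It is a sketch only; it omits the entropy term and the per-factor Lagrangian updates of our method, and the dense matrix product here does not reflect the paper's linear-time gradient computation.

```python
import numpy as np

def lowrank_sdp_step(V, C, lr=0.01):
    """One projected-gradient step of a Burer-Monteiro-style method.

    Illustrative sketch for the Max-Cut-type SDP
        min <C, X>  s.t.  diag(X) = 1,  X PSD,
    using X = V V^T, so diag(X) = 1 becomes unit-norm rows of V.
    The gradient of <C, V V^T> with respect to V is 2 C V.
    """
    V = V - lr * (2.0 * C @ V)                     # gradient step
    V /= np.linalg.norm(V, axis=1, keepdims=True)  # re-project rows onto the sphere
    return V

# Toy usage: random symmetric cost matrix, rank-k factor.
rng = np.random.default_rng(0)
n, k = 50, 8
A = rng.standard_normal((n, n))
C = (A + A.T) / 2
V = rng.standard_normal((n, k))
V /= np.linalg.norm(V, axis=1, keepdims=True)
for _ in range(200):
    V = lowrank_sdp_step(V, C)
print("objective:", np.trace(C @ (V @ V.T)))
```

In this factorized form, one never stores, let alone inverts, an n×n Hessian, which is what makes low-rank methods attractive at the scales discussed above.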

Our Python code is freely available online at https://github.com/mkrechetov/epsdp and comes with examples of use, e.g., on a benchmark in MAP estimation.

Figure: An experimental comparison of the run time of an interior-point method for a linear semidefinite program (SDP), the Burer-Monteiro approach to the same problem with the linear objective (LR-SDP), and our approach to the entropy-penalized SDP in the same dimension (EP-SDP), as a function of the dimension of the relaxation. Details of the set-up are given in the paper [3]. Notice that the entropy penalty does not incur additional computational cost.

References:

[1] Jakub Marecek, Peter Richtarik, Martin Takac. Matrix Completion under Interval Uncertainty. European Journal of Operational Research (2017) 256 (1): 35-43.

[2] Bissan Ghaddar, Jakub Marecek, Martin Mevissen. Optimal Power Flow as a Polynomial Optimization Problem. IEEE Transactions on Power Systems 31(1), 539-546, 2016. http://arxiv.org/abs/1404.3626

[3] Mikhail Krechetov, Jakub Marecek, Yury Maximov, Martin Takac. Entropy Penalized Semidefinite Programming. The 28th International Joint Conference on Artificial Intelligence, Macau, 2019. https://arxiv.org/abs/1802.04332 and https://doi.org/10.24963/ijcai.2019/157 and https://github.com/mkrechetov/epsdp

[4] Cunlu Zhou. Entropy, optimization and coding theory. PhD thesis, University of Notre Dame, Notre Dame, Indiana, 2019.

Research Staff Member, IBM Research
