
# Pushing the boundaries of convex optimization

August 13, 2019 | Written by: Jakub Marecek


Convex optimization has many applications ranging from operations research and machine learning to quantum information theory. Convex optimization problems, which involve the minimization of a convex function over a convex set, can be approximated in theory to any fixed precision in polynomial time. However, practical algorithms are known only for special cases.

Some well-known special cases include linear programming, convex quadratic programming, and convex quadratically-constrained quadratic programming, which are the workhorses of operations research. They are also special cases of semidefinite programming (SDP), which optimizes a linear objective over a set of positive semidefinite matrices. SDP has numerous additional applications such as recommender systems [1] and power systems [2].
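To make the SDP standard form concrete, here is a minimal sketch (the matrix `C` is illustrative data of our own, not from the paper): minimizing the linear objective tr(CX) over positive semidefinite X with tr(X) = 1 has a known closed-form optimum, the smallest eigenvalue of C, attained at a rank-one matrix built from the corresponding eigenvector.

```python
import numpy as np

# A small symmetric cost matrix C (illustrative data, not from the paper).
C = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

# SDP: minimize tr(C @ X) subject to tr(X) = 1, X positive semidefinite.
# For this particular constraint set the optimum is known in closed form:
# the smallest eigenvalue of C, attained at X* = v v^T for the matching
# eigenvector v.
w, V = np.linalg.eigh(C)       # eigenvalues in ascending order
v = V[:, 0]                    # eigenvector of the smallest eigenvalue
X_star = np.outer(v, v)        # optimal PSD matrix, rank one

optimal_value = np.trace(C @ X_star)
print(optimal_value)           # equals w[0] = 2 - sqrt(2), about 0.586
```

General-purpose SDP solvers handle arbitrary linear constraints, of course; this instance is chosen only because its solution can be checked by hand.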

An important question is whether it is possible to develop algorithms for a broader subset of convex optimization problems that are efficient in both theory and practice.

We begin to answer this question in our paper, “Entropy Penalized Semidefinite Programming,” [3] at the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019). In our oral presentation on August 15th (9:30am session on Constraint Satisfaction), we present our work on SDPs with a convex non-linear objective. This generalization has many applications:

- In natural language processing, a popular approach for language modeling is to use word embeddings. The training of a word embedding can be cast as a semidefinite programming problem. This training can be improved further by the inclusion of an entropy term in the objective. However, modern giga-word corpora lead to large instances of SDP.
- In quantum information theory, entropy-penalized SDPs improve upon traditional SDPs in quantum state tomography. As a result, fewer measurements may suffice to estimate the state of a quantum system with a given confidence. The dimension of the quantum state grows exponentially with the number of qubits, however, and 50 qubits already pose a non-trivial challenge.
- In maximum a posteriori (MAP) estimates in Markov random fields (MRF), one can likewise improve upon the ability to round the SDP relaxations by including the entropy term, but the instances of SDPs can grow very large.

Existing methods, however, cannot solve large-scale instances of entropy-penalized SDPs (EP-SDPs).
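To make the penalty concrete, one natural form of it adds γ·tr(X log X), the negative von Neumann entropy, to the linear objective. Under the single constraint tr(X) = 1, the minimizer has a closed form: the Gibbs state exp(−C/γ) normalized to unit trace. The sketch below checks this numerically; the matrix `C` and weight `gamma` are illustrative choices of our own, and the paper's constraint sets are more general.

```python
import numpy as np

gamma = 0.5                        # entropy-penalty weight (illustrative)
C = np.array([[1.0, 0.2],
              [0.2, -1.0]])        # illustrative cost matrix

def objective(X):
    # tr(C X) + gamma * tr(X log X), with the entropy term evaluated
    # through the eigenvalues of X (clipped to avoid log of zero).
    lam = np.clip(np.linalg.eigvalsh(X), 1e-12, None)
    return np.trace(C @ X) + gamma * np.sum(lam * np.log(lam))

# Closed-form minimizer over {X >= 0, tr X = 1}: the Gibbs state
# exp(-C/gamma) / tr(exp(-C/gamma)), built via an eigendecomposition.
w, V = np.linalg.eigh(C)
g = np.exp(-w / gamma)
X_gibbs = (V * (g / g.sum())) @ V.T

# Sanity check against random feasible density matrices: none should
# achieve a lower objective value than the Gibbs state.
rng = np.random.default_rng(0)
vals = []
for _ in range(200):
    B = rng.standard_normal((2, 2))
    Y = B @ B.T
    vals.append(objective(Y / np.trace(Y)))
best_random = min(vals)

print(objective(X_gibbs) <= best_random + 1e-9)   # True
```

The entropy term pulls the optimum away from rank-one (pure) matrices toward higher-entropy ones, which is precisely the effect exploited in the applications above.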

Existing methods for EP-SDP have mostly been based on specialized interior-point methods [4], which either handle the non-linear objective directly or approximate it and then apply standard methods for semidefinite optimization. Either way, they rely on inverting a large (Hessian) matrix. This lets interior-point methods converge quickly, but a single iteration is feasible only for relatively modest matrix sizes, such as 200×200, rather than the large-scale matrices typical of machine-learning and quantum-computing applications.

Instead, we show that it’s possible to extend so-called low-rank methods, which utilize a particular factorization of the matrix variables, and compute the gradient of the per-factor Lagrangian, including the entropy term, in linear time. We have also developed a convergence result for the method. This makes it possible to solve large instances both in theory and practice and pushes the boundaries of what can be done efficiently in convex optimization.
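The low-rank idea can be sketched on a classic special case, the MaxCut-type SDP min tr(CX) subject to diag(X) = 1 and X ⪰ 0. Factoring X = VV^T with a tall-and-thin V, the gradient of tr(C VV^T) with respect to V is 2CV, which never requires forming the n×n matrix X. The sketch below omits the entropy term and is not the paper's algorithm; the data and step size are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 30, 5                      # matrix size and factorization rank (illustrative)
A = rng.standard_normal((n, n))
C = (A + A.T) / 2                 # symmetric cost matrix

# Low-rank (Burer-Monteiro-style) sketch for:
#   min tr(C X)  s.t.  diag(X) = 1,  X positive semidefinite.
# Factor X = V V^T and run projected gradient descent on V, renormalizing
# each row of V to unit norm so that diag(V V^T) = 1 holds exactly.
V = rng.standard_normal((n, k))
V /= np.linalg.norm(V, axis=1, keepdims=True)

def f(V):
    # Objective tr(C V V^T), evaluated without storing X explicitly.
    return np.trace(C @ (V @ V.T))

step = 0.01
history = [f(V)]
for _ in range(200):
    V = V - step * 2 * C @ V                       # gradient of tr(C V V^T) is 2 C V
    V /= np.linalg.norm(V, axis=1, keepdims=True)  # project rows back to the sphere
    history.append(f(V))

print(history[0], history[-1])    # the objective drops over the iterations
```

Each iteration costs O(n²k) dense (or O(nnz(C)·k) sparse), with no Hessian inversion, which is what makes the factored approach attractive at the scales discussed above.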

Our Python code is freely available online at https://github.com/mkrechetov/epsdp and comes with examples of use, e.g., on a benchmark in MAP estimation.

### References:

[1] Jakub Marecek, Peter Richtarik, Martin Takac. Matrix Completion under Interval Uncertainty. European Journal of Operational Research 256(1): 35-43, 2017.

[2] Bissan Ghaddar, Jakub Marecek, Martin Mevissen. Optimal Power Flow as a Polynomial Optimization Problem. IEEE Transactions on Power Systems 31(1), 539-546, 2016. http://arxiv.org/abs/1404.3626

[3] Mikhail Krechetov, Jakub Marecek, Yury Maximov, Martin Takac. Entropy Penalized Semidefinite Programming. The 28th International Joint Conference on Artificial Intelligence, Macau, 2019. https://arxiv.org/abs/1802.04332 and https://doi.org/10.24963/ijcai.2019/157 and https://github.com/mkrechetov/epsdp

[4] Cunlu Zhou. Entropy, optimization and coding theory. PhD thesis, University of Notre Dame, Notre Dame, Indiana, 2019.

**Jakub Marecek**

Research Staff Member, IBM Research
