Block Coordinate Update Method in Tensor Optimization
Doctor of Philosophy
Block alternating minimization (BAM) has been widely used since the 1950s. It partitions the variables into disjoint blocks and cyclically updates the blocks by minimizing the objective with respect to each block of variables, one at a time with all others fixed. A special case is the alternating projection method for finding a common point of two convex sets. The BAM method is often simple yet efficient, particularly when each block subproblem is easy to solve. However, for certain problems such as nonnegative tensor decomposition, the block subproblems can be difficult to solve; moreover, even when they are solved exactly or to high accuracy, BAM can perform poorly on the original problem, in particular on non-convex problems. On the other hand, the BAM method has mainly been analyzed in the literature for convex problems. Although it has been shown numerically to work well on many non-convex problems, theoretical results for BAM in non-convex optimization are still lacking. For these reasons, I propose different block update schemes and generalize the BAM method to non-smooth non-convex optimization problems. Which scheme is the most efficient depends on the specific application. In addition, I analyze the convergence of the generalized method, dubbed the block coordinate update (BCU) method, with different block update schemes for non-smooth optimization problems, in both the convex and non-convex cases.

BCU has found many applications, and the work in this dissertation is mainly motivated by tensor optimization problems, for which the BCU method is often the best choice due to their block convexity. I make contributions in modeling, algorithm design, and theoretical analysis. The first part concerns low-rank tensor completion, for which I propose a novel model based on parallel low-rank matrix factorization. The new model is non-convex, and it is difficult to guarantee globally optimal solutions.
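The cyclic block update just described can be sketched as follows. This is a minimal illustration only: the quadratic least-squares objective, the two-block partition, and the exact per-block solves are assumptions chosen for simplicity, not the tensor models studied in this dissertation.

```python
import numpy as np

def bam(A, b, blocks, x0, iters=100):
    """Block alternating minimization for min_x ||A x - b||^2.

    `blocks` is a list of index arrays partitioning the variables.
    Each block subproblem (least squares in that block, with all
    other blocks held fixed) is solved exactly, as in classic BAM.
    """
    x = x0.copy()
    for _ in range(iters):
        for idx in blocks:
            # Residual with block `idx` removed from the current iterate.
            r = b - A @ x + A[:, idx] @ x[idx]
            # Exact minimization over this block only.
            x[idx], *_ = np.linalg.lstsq(A[:, idx], r, rcond=None)
    return x
```

When each block subproblem has such a closed-form or cheap solution, this cyclic scheme is attractive; the abstract's point is that for problems like nonnegative tensor decomposition the per-block solves are themselves hard, which motivates the relaxed updates introduced later.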
However, the BAM method performs very well on this model. Global convergence in terms of KKT conditions is established, and numerical experiments demonstrate the superiority of the proposed model over several state-of-the-art ones.

The second part addresses nonnegative tensor decomposition. For this problem, each block subproblem is a nonnegative least squares problem, which is not simple to solve, so the BAM method may be inefficient. I propose a block proximal gradient (BPG) method. In contrast to BAM, which solves each block subproblem exactly, BPG solves relaxed block subproblems, which are often much simpler than the original ones and can thus make BPG converge faster. Through the Kurdyka-Lojasiewicz property, I establish its global convergence with a rate estimate in terms of the iterate sequence. Numerical experiments on sparse nonnegative Tucker decomposition demonstrate its superiority over the BAM method.

The last part is motivated by tensor regression problems, whose block partial gradients are expensive to evaluate. For such problems, BPG becomes inefficient, so I propose to use inexact partial gradients and generalize BPG to a block stochastic gradient method. Convergence results in expectation are established for the general non-convex case in terms of first-order optimality conditions, and a sublinear convergence rate is shown for the convex case. Numerical tests on tensor regression problems show that the block stochastic gradient method significantly outperforms its deterministic counterpart.
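The contrast between the two relaxed schemes above can be sketched on a single nonnegative least squares block. Everything concrete here is an illustrative assumption: the 1/L step size, the row mini-batch as the source of gradient inexactness, and the function names are mine, not the dissertation's algorithms.

```python
import numpy as np

def bpg_block_update(A, b, x, step=None):
    """One relaxed block update for min_{x >= 0} 0.5 * ||A x - b||^2.

    Instead of solving the nonnegative least squares subproblem exactly
    (as BAM would), take one gradient step and apply the proximal map of
    the nonnegativity indicator, i.e. project onto x >= 0.
    """
    if step is None:
        # 1/L with L the Lipschitz constant of the gradient (largest
        # eigenvalue of A^T A); an illustrative default choice.
        step = 1.0 / np.linalg.norm(A.T @ A, 2)
    grad = A.T @ (A @ x - b)
    return np.maximum(x - step * grad, 0.0)

def stochastic_block_update(A, b, x, batch, step):
    """The same update with an inexact partial gradient: only a sampled
    mini-batch of rows of A contributes to the gradient estimate,
    rescaled so the estimate is unbiased."""
    Ab = A[batch]
    grad = (A.shape[0] / len(batch)) * (Ab.T @ (Ab @ x - b[batch]))
    return np.maximum(x - step * grad, 0.0)
```

Each call costs only a matrix-vector product (or a mini-batch thereof) rather than an exact nonnegative least squares solve, which is the sense in which the relaxed subproblems are "much simpler"; the stochastic variant further avoids touching all the data per update, at the price of convergence guarantees holding only in expectation.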