Organizer: Center for Optimization and Applications, Academy of Mathematics and Systems Science, Chinese Academy of Sciences
Date: December 4, 2021
Tencent Meeting ID: 744 487 261
Date

Time

Program


08:50－09:00

Opening Ceremony:
Remarks by Ya-xiang Yuan, Honorary Director of the Center for Optimization and Applications
Remarks by Aihui Zhou, Director of the Institute of Computational Mathematics and Scientific/Engineering Computing, CAS

December 4 (Saturday)
Morning

09:00－10:00

Chair: Tiande Guo

09:00－09:30

Zhijun Wu（Iowa State University, USA）
Social Distancing as a Social Dilemma Game and Its Optimal Strategies

09:30－10:00

Wotao Yin (Alibaba DAMO Academy)
Closing the Gap: Alternating SGD Methods Work Well for Bilevel Problems

10:00－11:30

Chair: Xin Liu

10:00－10:30

Feng Ruan（University of California, Berkeley）
Adapting Maximum Likelihood Theory to Modern Applications

10:30－11:00

Xiao Li (The Chinese University of Hong Kong, Shenzhen)
Convergence of Random Reshuffling under the Kurdyka-Łojasiewicz Inequality

11:00－11:30

Chenglong Bao (Tsinghua University)
Anderson Acceleration for Training Neural Networks

11:30－14:00

Lunch Break

December 4 (Saturday)
Afternoon

14:00－16:00

Chair: Junjie Ma

14:00－14:30

Shuyang Ling (NYU Shanghai)
Generalized Power Method for Generalized Orthogonal Procrustes Problem

14:30－15:00

Bin Shi (Academy of Mathematics and Systems Science, CAS)
Understanding the Acceleration Phenomenon via High-Resolution Differential Equations

15:00－15:30

Guohan Zhang (Beijing University of Posts and Telecommunications)
Recent Advances in Extended Second-Order Cones

15:30－16:00

Ran Gu (Nankai University)
A Random Active Set Method for Strictly Convex Quadratic Problems with Simple Bounds

16:00－17:30

Chair: Ya-Feng Liu

16:00－16:30

Jiateng Yin (Beijing Jiaotong University)
The Benders Branch-and-Cut Method and Its Applications to Structured Integer Programming Models

16:30－17:00

Liang Chen (Academy of Mathematics and Systems Science, CAS)
An Exact Separation Algorithm for the Unsplittable Flow Capacitated Network Design Arc-Set Polyhedron

17:00－17:30

Houduo Qi（University of Southampton）
Proximal Distance Algorithms for Euclidean Distance Matrix Optimization

17:30－17:40

Closing Ceremony:
Remarks by Xinmin Yang, Deputy Director of the Academic Committee of the Center for Optimization and Applications
Remarks by Yu-Hong Dai, Director of the Center for Optimization and Applications

Appendix: Talk Titles and Abstracts
09:00－09:30
Speaker: Zhijun Wu (Iowa State University, USA)
Title: Social Distancing as a Social Dilemma Game and Its Optimal Strategies
Abstract: Since the outbreak of the global COVID-19 pandemic, social distancing has become known to everyone and is recommended almost everywhere, every day. Social distancing has been, and will remain, one of the most effective measures, and sometimes the only available one, for fighting epidemics and saving lives. However, it has not been so clear how social distancing should be practiced or managed, especially when it comes to regulating everyone's otherwise normal social activities. The debate on how to implement social distancing often turns into a heated political argument, while research on the subject is lacking. In this talk, I will discuss a theoretical framework for understanding the scientific nature of social distancing by treating it as a social dilemma game played by every individual against his or her population. From this perspective, every individual needs to decide how to engage in social distancing, or risk being trapped in a dilemma: either being exposed to a deadly disease or losing access to necessary social activities. As players of the game, individuals' decisions depend on the population's actions and vice versa, and an optimal strategy can be found when the game reaches an equilibrium. I will show how an optimal strategy can be determined for a population with either closely related or completely separated social activities and with either a single or multiple social groups, and how the collective behavior of social distancing can be simulated by following every individual's actions as the distancing game progresses. Simulation results for populations of varying sizes and complexities will be presented; they not only justify the choice of strategies based on the game-theoretic analysis, but also demonstrate the convergence of individual actions to an optimal distancing strategy in silico, and possibly in natura as well, if every individual makes rational distancing decisions.
09:30－10:00
Speaker: Wotao Yin (Alibaba DAMO Academy)
Title: Closing the Gap: Alternating SGD Methods Work Well for Bilevel Problems
Abstract: This talk briefly overviews a few stochastic nested optimization problems, including stochastic bilevel, min-max, and compositional optimization, which are quickly gaining popularity in machine learning (meta learning, reinforcement learning, etc.). They share the same nested structure, but existing works treat them separately, developing problem-specific algorithms and analyses. Although one may think SGD is slower on nested problems than on single-level problems, this is not the case. We develop a method called ALternating Stochastic gradient dEscenT (ALSET), which leverages hidden smoothness and achieves the same complexity of O(ε^{-2}). Under certain regularity conditions, applying our results to stochastic compositional, min-max, and reinforcement learning problems either improves upon or matches the best-known sample complexity in the respective cases.
This is joint work with Prof. Tianyi Chen (RPI) and Dr. Yuejiao Sun (recently graduated from UCLA).
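To make the alternating structure concrete, here is a minimal deterministic sketch. It is not ALSET itself: the toy objectives (lower level g(x, y) = ½(y − x)², upper level f(x, y) = ½(x − 1)² + ½y²), the step sizes, and the hypergradient formula are all assumptions for illustration. One gradient step on the lower-level variable alternates with one hypergradient step on the upper-level variable.

```python
import numpy as np

def inner_grad(x, y):
    # gradient in y of the lower-level objective g(x, y) = 0.5*(y - x)^2
    return y - x

def hypergrad(x, y):
    # hypergradient of F(x) = f(x, y*(x)); here y*(x) = x, so dy*/dx = 1
    # and dF/dx = (x - 1) + y evaluated at the current lower-level iterate
    return (x - 1.0) + y

x, y = 0.0, 0.0
alpha, beta = 0.1, 0.5
for _ in range(2000):
    y -= beta * inner_grad(x, y)   # one step on the lower-level problem
    x -= alpha * hypergrad(x, y)   # one step on the upper-level problem
# F(x) = 0.5*(x-1)^2 + 0.5*x^2 is minimized at x = 0.5, with y*(0.5) = 0.5
```

The fixed point of the alternation is exactly the bilevel solution: y tracks y*(x) = x, and the hypergradient 2x − 1 vanishes at x = 0.5.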
10:00－10:30
Speaker: Feng Ruan (University of California, Berkeley)
Title: Adapting Maximum Likelihood Theory to Modern Applications
Abstract: Maximum likelihood estimation (MLE) is influential because it can easily be applied to generate optimal, statistically efficient procedures for broad classes of estimation problems. Nonetheless, the theory does not apply to modern settings, such as problems with computational, communication, or privacy considerations, where our estimators face resource constraints. In this talk, I will introduce a modern maximum likelihood theory that addresses these issues, focusing specifically on procedures that must be computationally efficient or privacy-preserving. To do so, I first derive analogues of Fisher information for these applications, which allows a precise characterization of trade-offs between statistical efficiency, privacy, and computation. To complete the development, I also describe a recipe that generates optimal statistical procedures (analogues of the MLE) in the new settings, showing how to achieve the new Fisher information lower bounds.
10:30－11:00
Speaker: Xiao Li (The Chinese University of Hong Kong, Shenzhen)
Title: Convergence of Random Reshuffling under the Kurdyka-Łojasiewicz Inequality
Abstract: We study the random reshuffling (RR) method for smooth nonconvex optimization problems with a finite-sum structure. Though this method is widely utilized in practice, e.g., in the training of neural networks, its convergence behavior is only understood in several limited settings. In this paper, under the well-known Kurdyka-Łojasiewicz (KL) inequality, we establish strong limit-point convergence results for RR with appropriate diminishing step sizes; namely, the whole sequence of iterates generated by RR is convergent and converges to a single stationary point in an almost sure sense. In addition, we derive the corresponding rate of convergence, depending on the KL exponent and suitably selected diminishing step sizes. When the KL exponent lies in [0, 1/2], the convergence is at a rate of O(t^{-1}), with t counting the iteration number. When the KL exponent belongs to (1/2, 1), our derived convergence rate is of the form O(t^{-q}) with q ∈ (0, 1) depending on the KL exponent. The standard KL-inequality-based convergence analysis framework only applies to algorithms with a certain descent property. We conduct a novel convergence analysis for the non-descent RR method with diminishing step sizes based on the KL inequality, which generalizes the standard KL framework. We summarize our main steps and core ideas in an informal analysis framework, which is of independent interest. As a direct application of this framework, we also establish similar strong limit-point convergence results for the shuffled proximal point method.
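The sampling scheme that distinguishes RR from plain SGD fits in a few lines. The least-squares instance, the step-size schedule, and the epoch count below are illustrative assumptions, not the paper's setting: the point is only that each epoch visits every component exactly once, in a fresh random order.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 3
A = rng.standard_normal((n, d))
b = A @ np.ones(d)                    # noiseless least-squares target x* = (1,...,1)

x = np.zeros(d)
for epoch in range(300):
    step = 0.1 / (epoch + 1)          # diminishing step sizes, as in the analysis
    for i in rng.permutation(n):      # RR: sample WITHOUT replacement each epoch
        g = (A[i] @ x - b[i]) * A[i]  # gradient of the i-th component function
        x -= step * g
```

Plain SGD would instead draw `i = rng.integers(n)` independently at every step; the permutation is the entire difference between the two methods.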
11:00－11:30
Speaker: Chenglong Bao (Tsinghua University)
Title: Anderson Acceleration for Training Neural Networks
Abstract: Anderson mixing is a powerful method in scientific computing, but its convergence analysis has not been fully explored. In this talk, we add adaptive regularization and a damping term to the original Anderson mixing, and provide a detailed convergence analysis and iteration complexity results. Moreover, to reduce the memory burden, we propose a short-term version that can mimic long-term memory. Various experiments on deep neural networks demonstrate the advantages of this method.
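For readers unfamiliar with the baseline, here is a minimal sketch of classical Anderson mixing for a fixed-point problem x = g(x). The fixed Tikhonov term `reg` is only a crude stand-in for the adaptive regularization discussed in the talk, and the window size and stopping rule are assumptions.

```python
import numpy as np

def anderson(g, x0, m=5, iters=50, reg=1e-10):
    """Anderson mixing for the fixed-point problem x = g(x) with window m."""
    x = x0.copy()
    X, F = [], []                              # histories of iterates and residuals
    for _ in range(iters):
        fx = g(x) - x                          # residual f(x) = g(x) - x
        X.append(x.copy()); F.append(fx.copy())
        X, F = X[-(m + 1):], F[-(m + 1):]      # keep only the last m+1 entries
        if len(F) == 1:
            x = g(x)                           # plain fixed-point step to start
            continue
        dF = np.array([F[i + 1] - F[i] for i in range(len(F) - 1)]).T
        dX = np.array([X[i + 1] - X[i] for i in range(len(X) - 1)]).T
        # mixing weights from a small regularized least-squares problem
        gamma = np.linalg.solve(dF.T @ dF + reg * np.eye(dF.shape[1]), dF.T @ fx)
        x = x + fx - (dX + dF) @ gamma         # Anderson-mixed update
        if np.linalg.norm(fx) < 1e-10:
            break
    return x

# usage: the affine contraction g(x) = 0.5*x + b has fixed point 2*b
x_star = anderson(lambda x: 0.5 * x + np.array([1.0, 2.0]), np.zeros(2))
```

On this affine example the method recovers the fixed point [2, 4] after only a couple of iterations, which is the acceleration effect the talk builds on.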
14:00－14:30
Speaker: Shuyang Ling (NYU Shanghai)
Title: Generalized Power Method for Generalized Orthogonal Procrustes Problem
Abstract: Given a set of multiple point clouds, how do we find the rigid transformations (rotation, reflection, and shifting) such that these point clouds are well aligned? This problem, known as the generalized orthogonal Procrustes problem (GOPP), plays a fundamental role in several scientific disciplines, including statistics, imaging science, and computer vision. Despite its tremendous practical importance, it is still a challenging computational problem due to its inherent nonconvexity. In this work, we study the semidefinite programming (SDP) relaxation of the generalized orthogonal Procrustes problem and prove that the tightness of the SDP relaxation holds, i.e., the SDP estimator exactly equals the least squares estimator, if the signal-to-noise ratio (SNR) is relatively large. We also prove that an efficient generalized power method with a proper initialization enjoys global linear convergence to the least squares estimator. In addition, we analyze the Burer-Monteiro factorization and show that the corresponding optimization landscape is free of spurious local optima if the SNR is large. This explains why first-order Riemannian gradient methods with random initializations usually produce a satisfactory solution despite the nonconvexity. One highlight of our work is that the theoretical guarantees are purely algebraic and do not require any assumptions on the statistical properties of the noise. Our results partially resolve an open problem posed in [Bandeira, Khoo, Singer, 2015] on the tightness of the SDP relaxation in solving the generalized orthogonal Procrustes problem. Numerical simulations are provided to complement our theoretical analysis.
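A minimal sketch of a generalized power iteration for this problem (the data model, the anchor initialization, and the update rule are illustrative assumptions, not the paper's exact algorithm or its spectral initialization): each step averages the currently aligned clouds and re-projects each alignment onto the orthogonal group via a polar decomposition.

```python
import numpy as np

def polar(M):
    """Projection onto the orthogonal group via the polar factor of M."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(1)
d, m, n = 3, 50, 8
P = rng.standard_normal((d, m))                       # common point cloud
Q = [polar(rng.standard_normal((d, d))) for _ in range(n)]
A = [Q[i] @ P + 0.01 * rng.standard_normal((d, m)) for i in range(n)]

O = [polar(A[i] @ A[0].T) for i in range(n)]          # crude anchor initialization
for _ in range(50):
    Z = sum(O[i].T @ A[i] for i in range(n)) / n      # average aligned cloud
    O = [polar(A[i] @ Z.T) for i in range(n)]         # re-fit each transformation
```

The recovered transformations are only identifiable up to a global rotation, so success is checked on the relative transformations O[i] O[0]ᵀ against Q[i] Q[0]ᵀ.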
14:30－15:00
Speaker: Bin Shi (Academy of Mathematics and Systems Science, CAS)
Title: Understanding the Acceleration Phenomenon via High-Resolution Differential Equations
Abstract: Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms, Nesterov's accelerated gradient method for strongly convex functions (NAG-SC) and Polyak's heavy-ball method, we study an alternative limiting process that yields high-resolution ODEs. We show that these ODEs permit a general Lyapunov function framework for the analysis of convergence in both continuous and discrete time. We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and Polyak's heavy-ball method, but also allow the identification of a term that we refer to as the "gradient correction," which is present in NAG-SC but not in the heavy-ball method and is responsible for the qualitative difference in convergence between the two methods. We also use the high-resolution ODE framework to study Nesterov's accelerated gradient method for (non-strongly) convex functions (NAG-C), uncovering a hitherto unknown result: NAG-C minimizes the squared gradient norm at an inverse cubic rate. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the accelerated convergence rates of NAG-C for smooth convex functions.
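For reference, the high-resolution ODE for NAG-SC in this line of work takes the following form (as recalled from the literature; s denotes the step size and μ the strong-convexity parameter, so the formula should be checked against the talk's exact statement). The term √s ∇²f(X)Ẋ is the gradient correction; dropping it recovers the heavy-ball high-resolution ODE.

```latex
\ddot{X}(t) + 2\sqrt{\mu}\,\dot{X}(t)
  + \sqrt{s}\,\nabla^{2} f\bigl(X(t)\bigr)\,\dot{X}(t)
  + \bigl(1 + \sqrt{\mu s}\bigr)\,\nabla f\bigl(X(t)\bigr) = 0 .
```

In the low-resolution limit s → 0, both methods collapse to the same ODE, which is exactly why the classical limiting analysis cannot tell them apart.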
15:00－15:30
Speaker: Guohan Zhang (Beijing University of Posts and Telecommunications)
Title: Recent Advances in Extended Second-Order Cones
Abstract: The extended second-order cones (ESOC) were introduced in 2015 and appear to be the most natural extension of second-order cones. In this talk, I will first briefly review some basic properties of the ESOC. Then I will introduce the monotone extended second-order cone (MESOC), which is related to both the monotone cone and the second-order cone. Finally, I will show some applications of the ESOC and the MESOC in portfolio optimization.
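For orientation, one common formulation of the extended second-order cone (assuming the definition from the 2015 literature; e denotes the all-ones vector in R^p, and the inequality is componentwise):

```latex
L(p, q) \;=\; \bigl\{ (x, u) \in \mathbb{R}^{p} \times \mathbb{R}^{q}
  \;:\; x \ge \lVert u \rVert\, e \bigr\} .
```

Setting p = 1 recovers the usual second-order cone, which is the sense in which the extension is natural.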
15:30－16:00
Speaker: Ran Gu (Nankai University)
Title: A Random Active Set Method for Strictly Convex Quadratic Problems with Simple Bounds
Abstract: Active set methods aim to identify the active set of the optimal solution and are powerful tools for solving strictly convex quadratic problems with bound constraints. To guarantee finite-step convergence, existing active set methods all require strict conditions or additional strategies, which greatly affects their efficiency. In this work, we propose a random active set method that introduces randomness into the update of the active set. We prove that it converges in finitely many steps with probability one, without any conditions on the problem or any additional strategies. Numerical results show that the algorithm obtains the correct active set within a few iterations and that, compared with existing methods, it has good robustness and efficiency.
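The flavor of a randomized active-set update can be illustrated with a toy loop for min ½xᵀHx + cᵀx subject to x ≥ 0. This is a naive sketch of the general idea only; the flip rule, tolerances, and iteration cap are assumptions, not the authors' algorithm or its convergence mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_active_set_qp(H, c, iters=500):
    """Toy randomized active-set loop for min 0.5 x'Hx + c'x s.t. x >= 0."""
    n = len(c)
    active = np.zeros(n, dtype=bool)          # guess: all variables free
    x, lam = np.zeros(n), c.copy()
    for _ in range(iters):
        F = ~active
        x = np.zeros(n)
        if F.any():                           # solve the reduced KKT system
            x[F] = np.linalg.solve(H[np.ix_(F, F)], -c[F])
        lam = H @ x + c                       # multipliers of the bound constraints
        viol = (F & (x < -1e-12)) | (active & (lam < -1e-12))
        if not viol.any():
            break                             # primal and dual feasible: optimal
        flip = viol & (rng.random(n) < 0.5)   # randomized active-set update
        if not flip.any():
            flip = viol
        active ^= flip
    return x, lam

M = rng.standard_normal((8, 8))
H = M.T @ M + np.eye(8)                       # strictly convex Hessian
c = rng.standard_normal(8)
x, lam = random_active_set_qp(H, c)
```

At termination the KKT conditions hold: x ≥ 0, λ ≥ 0, and complementarity xᵀλ = 0.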
16:00－16:30
Speaker: Jiateng Yin (Beijing Jiaotong University)
Title: The Benders Branch-and-Cut Method and Its Applications to Structured Integer Programming Models
Abstract: Benders decomposition is a classical method commonly used to solve large-scale structured linear programming (LP) problems. However, for optimization models whose subproblems contain discrete integer variables, the recourse function of the subproblem is no longer piecewise linear, so the traditional Benders decomposition method cannot be applied. To solve structured integer programming models with discrete variables, we propose an approximation algorithm based on the Benders branch-and-cut framework and derive "Benders dual integer cuts" based on the shadow prices of the LP relaxation and on integer feasible solutions. By iteratively solving the LP relaxation of the subproblem and obtaining IP feasible (or optimal) solutions, Benders dual integer cuts are added dynamically according to the relaxation lower bound produced by the master problem and the value of the recourse function, until a high-quality feasible solution is obtained. Finally, numerical experiments are conducted on a real-world problem from the Beijing subway, an integrated optimization model for rolling stock allocation and timetabling. A comparison with the commercial MIP solver CPLEX verifies the advantages of the proposed algorithm in terms of computation time, solution accuracy, and the size of solvable models.
16:30－17:00
Speaker: Liang Chen (Academy of Mathematics and Systems Science, CAS)
Title: An Exact Separation Algorithm for the Unsplittable Flow Capacitated Network Design Arc-Set Polyhedron
Abstract: In this talk, we concentrate on generating cutting planes for the unsplittable capacitated network design problem. We use the unsplittable flow arc-set polyhedron of the considered problem as a substructure and generate cutting planes by solving the separation problem over it. To relieve the computational burden, we show that, in some special cases, a closed form of the separation problem can be derived. For the general case, a brute-force algorithm, called the exact separation algorithm, is employed to solve the separation problem over the considered polyhedron such that the constructed inequality is guaranteed to be facet-defining. Furthermore, a new technique is presented to accelerate the exact separation algorithm, which significantly decreases the number of iterations. Finally, a comprehensive computational study on the unsplittable capacitated network design problem is presented to demonstrate the effectiveness of the proposed algorithm.
17:00－17:30
Speaker: Houduo Qi (University of Southampton)
Title: Proximal Distance Algorithms for Euclidean Distance Matrix Optimization
Abstract: Proximal distance algorithms combine the classical quadratic penalty method of constrained minimization with distance majorization. These algorithms have been developed independently in various areas. For convex differentiable problems, they reduce to proximal gradient algorithms and therefore enjoy well-understood convergence properties. However, for nonconvex problems, it is challenging to establish satisfactory convergence results beyond generating a decreasing objective sequence. For Euclidean distance matrix optimization, which is nonconvex and nondifferentiable, strong convergence properties can be established for one such proximal distance algorithm by taking advantage of the structural properties of the problem. We demonstrate its efficiency on examples from supervised maximum variance unfolding.
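The basic recipe (quadratic penalty plus distance majorization) can be sketched on a toy constraint set. The objective, the unit-sphere constraint, and the penalty schedule below are illustrative assumptions, not the Euclidean distance matrix setting of the talk: each step anchors the penalty at the projection of the current iterate onto the constraint set, then minimizes the resulting majorizer exactly.

```python
import numpy as np

# toy problem: min 0.5*||x - a||^2 subject to x on the unit sphere (nonconvex set)
a = np.array([3.0, 4.0])

def proj_sphere(x):
    return x / np.linalg.norm(x)             # projection onto the constraint set

x = np.array([1.0, 0.0])
rho = 1.0
for _ in range(300):
    y = proj_sphere(x)                       # distance-majorization anchor P_C(x)
    x = (a + rho * y) / (1.0 + rho)          # minimizer of 0.5||x-a||^2 + (rho/2)||x-y||^2
    rho *= 1.05                              # slowly tighten the quadratic penalty
# the constrained minimizer is a/||a|| = [0.6, 0.8]
```

Because the penalized subproblem is solved exactly at each anchor, the objective sequence decreases for fixed ρ; the slow increase of ρ then drives the iterates toward feasibility.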