Convex Optimization Problems
Convex Optimization (Chinese translated edition)
I. Introduction
With the advance of science and technology and their applications, convex optimization plays an increasingly important role in many fields, and its applications in engineering, finance, computer science, and other areas keep expanding and deepening. Research on the theory and methods of convex optimization, as well as the translation and dissemination of the literature, has therefore become especially important. This article introduces and discusses some important topics in convex optimization, in the hope of providing a useful reference for researchers and readers in related fields.
II. Basic concepts of convex optimization
1. Convex sets and convex functions. Convex sets and convex functions are fundamental concepts in convex optimization. A convex set is a set that contains the line segment between any two of its points. A convex function is a real-valued function defined on a convex set such that the line segment joining any two points on its graph lies on or above the graph. The properties of convex sets and convex functions provide the foundation for the theory and methods of convex optimization.
2. General form of a convex optimization problem. A convex optimization problem can be written in the general form:

minimize   f(x)
subject to g_i(x) ≤ 0, i = 1, 2, …, m
           h_j(x) = 0, j = 1, 2, …, p

where f(x) is the objective function to be optimized, and g_i(x) and h_j(x) are the inequality and equality constraint functions, respectively. For the problem to be convex, the objective f and the inequality constraint functions g_i are required to be convex, and the equality constraint functions h_j to be affine.
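As an illustration, here is a minimal sketch of this standard form in code (assuming the Python package cvxpy; the specific objective and constraints are made-up examples, not taken from the text):

import cvxpy as cp
import numpy as np

x = cp.Variable(2)
f = cp.sum_squares(x - np.array([3.0, 2.0]))      # convex objective f(x)
constraints = [cp.norm(x, 2) - 2.0 <= 0,          # convex inequality g(x) <= 0
               x[0] + x[1] - 1.0 == 0]            # affine equality h(x) = 0
prob = cp.Problem(cp.Minimize(f), constraints)
prob.solve()
print(prob.value, x.value)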
III. Common algorithms in convex optimization
1. Gradient descent. Gradient descent is a widely used optimization algorithm that is particularly suitable for convex problems. Its basic idea is to compute the gradient of the objective function and iterate along the negative gradient direction, gradually approaching the optimal solution.
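A minimal sketch of this iteration in Python (the fixed step size and the example quadratic objective are illustrative assumptions):

import numpy as np

def gradient_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10000):
    # Iterate x <- x - step * grad(x) until the gradient is (nearly) zero.
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - step * g
    return x

# Example: minimize f(x) = (x1 - 1)^2 + 2*(x2 + 3)^2, whose minimizer is (1, -3).
grad_f = lambda x: np.array([2 * (x[0] - 1), 4 * (x[1] + 3)])
print(gradient_descent(grad_f, np.zeros(2)))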
2. The method of Lagrange multipliers. The method of Lagrange multipliers is mainly used to handle constrained optimization problems: one builds the Lagrangian function and optimizes it to obtain the optimal solution of the original problem. The method is widely applied in convex optimization.
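A small worked example of this construction (chosen here for illustration; it does not appear in the original text): minimize x² subject to 1 − x ≤ 0. The Lagrangian is L(x, λ) = x² + λ(1 − x) with λ ≥ 0; minimizing over x gives x = λ/2 and the dual function g(λ) = λ − λ²/4. Maximizing g over λ ≥ 0 gives λ* = 2 and g(2) = 1, which equals the primal optimal value f(x*) = 1 at x* = 1, so strong duality holds for this convex problem.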
3. Interior-point methods. Interior-point methods are a class of iterative methods used mainly to solve convex problems such as linear programs and quadratic programs. Their advantage is fast convergence to the optimal solution, which makes them particularly suitable for large-scale convex optimization problems.
IV. Applications of convex optimization in science and engineering
Convex optimization has wide applications in science and engineering, such as least-squares problems in signal processing, support vector machines in machine learning, and power allocation problems in communication systems. These applications have not only driven the development of convex optimization theory but also provided effective tools and methods for solving practical problems.
Convex Optimization: course details

Syllabus
1. Introduction to convex optimization, 3 class hours
2. Convex sets and convex functions, 3 class hours: definitions and criteria for convex sets and convex functions
3. Fundamentals of numerical linear algebra, 3 class hours: vectors, matrices, norms, subspaces, Cholesky factorization, QR factorization, eigenvalue decomposition, singular value decomposition
4. Convex optimization problems, 6 class hours: typical convex optimization problems; linear programming and semidefinite programming
5. Modeling languages and solvers for convex optimization, 3 class hours: modeling languages: AMPL, CVX, YALMIP; typical solvers: SDPT3, Mosek, CPLEX, Gurobi

With the development of science and engineering, research on the theory and methods of convex optimization has advanced rapidly, and convex optimization is now widely applied in scientific and engineering computing, data science, signal and image processing, management science, and many other fields. Through this course, students should master the basic concepts of convex optimization and duality theory, learn to recognize and solve several typical classes of convex optimization problems, and become familiar with the relevant software. The course is aimed at advanced undergraduates and graduate students.

References
Numerical Optimization, Jorge Nocedal and Stephen Wright, Springer, 2006, 2nd ed., 978-0-387-40065-5;
Optimization Theory and Methods (最优化理论与方法), Ya-xiang Yuan and Wenyu Sun, Science Press, 2003, 1st ed., 9787030054135.

Assessment
(2) Course project: 60%
Requirements: homework and course projects must be submitted on time; late submissions and plagiarized work receive no credit.

Instructor: Wen Zaiwen (文再文)
Course number: 00136660
Credits: 3
English name: Convex Optimization
Prerequisites: Mathematical Analysis (Advanced Calculus), Advanced Algebra (Linear Algebra)
Course description: Convex optimization is a broad discipline that is increasingly applied in scientific and engineering computing, economics, management, industry, and other fields. It involves building appropriate mathematical models to describe a problem, designing suitable computational methods to find the optimal solution, studying the theoretical properties of the models and algorithms, and evaluating the computational performance of the algorithms. This introductory course is suitable for advanced undergraduates and graduate students in mathematics, statistics, computer science, electronic engineering, operations research, and related disciplines. Topics include an introduction to convex sets, convex functions, and convex optimization problems; fundamentals of convex analysis; duality theory; gradient methods, proximal gradient methods, Nesterov's accelerated methods, and the alternating direction method of multipliers; interior-point methods; and applications in statistics, signal processing, and machine learning.
Course name: Optimization Methods (bilingual)
Course code: 7121101; credits: 3; class hours: 48; intended major: Information and Computing Science. Optimization Method (Bilingual) — teaching outline.
1. Nature and purpose of the course
(1) This is an elective course for students majoring in Information and Computing Science.
Optimization is the science of selecting the best among many possible alternatives so as to reach an optimal objective. As a young branch of applied mathematics, optimization methods have developed rapidly over the past two or three decades along with the use of computers, and they are now applied in every sector of the national economy and every field of science and technology.
(2) Through this course, students should master the basic theory and methods of mathematical programming, mainly linear programming, integer programming, transportation problems, goal programming, and nonlinear programming, laying a solid foundation for further study and research in this area. The course also cultivates the ability to analyze and solve practical problems: students learn to abstract a practical problem into a mathematical one, to analyze and interpret the optimal result, and to apply the result in practice.
2. Basic teaching content and requirements
This course mainly covers the basic theory and methods of linear programming, integer programming, transportation problems, goal programming, and nonlinear programming. Requirements are given at three levels — familiarity, understanding, and mastery — as follows:
(1) Introduction: master optimization models and their classification; master the basic concepts of convex sets, convex functions, and convex programming, and understand their properties.
(2) Basic properties of linear programming: master the standard form of a linear program and the graphical method.
(3) The simplex method: master its principle, the simplex tableau, the two-phase method, and the big-M method; be familiar with the degenerate case and the revised simplex method.
(4) Duality and sensitivity analysis: understand the duality theory of linear programming; master the dual simplex algorithm.
(5) Transportation problems: master the mathematical model of the transportation problem and the transportation tableau method.
(6) Integer programming: master typical integer programming models, the cutting-plane method, and branch-and-bound; be familiar with the implicit enumeration method for 0-1 programming.
(7) Unconstrained problems: master the concept of line search, the formulation of nonlinear programs, and basic concepts such as convex sets, convex functions, and optimality conditions; master the steepest descent method and Newton's method; understand direct search methods and feasible direction methods.
(8) Constrained problems: master the formulation of nonlinear programs and basic concepts such as optimality conditions.
A new asymptotic algorithm for unconstrained nonconvex optimization problems
Chen Rudong; Wu Chengyu (School of Science, Tianjin Polytechnic University, Tianjin 300387, China)
Abstract: For the optimization of nonconvex functions in mathematical programming, a new improved asymptotic algorithm is constructed according to known results and algorithms for convex function optimization. Using the Kurdyka-Łojasiewicz property, a convergence analysis is given for unconstrained nonconvex optimization problems with proper lower semicontinuous nonconvex functions: the sequence generated by the improved asymptotic algorithm has finite length and converges to a critical point of the function. A representation of the convergence rate of the sequence is also given.
Journal: Basic Sciences Journal of Textile Universities (纺织高校基础科学学报), 2018, 31(1): 55-62.
Keywords: asymptotic algorithm; Kurdyka-Łojasiewicz property; unconstrained nonconvex optimization problem; convergence rate
Language: Chinese; CLC: O177

0 Introduction
In mathematical programming, most studies concern the optimization of convex functions, while the optimization of nonconvex functions is rarely touched. Nonconvex optimization is still an emerging research direction that has developed relatively slowly, mainly through algorithms for nonconvex problems. Martinet [1] and Rockafellar [2] introduced the asymptotic (proximal) algorithm in the study of variational inequalities for maximal monotone operators and a proximal regularization method in convex optimization. References [3-8] treat nonmonotone operators in convex optimization, and references [9-13] give concepts of nonconvex functions and optimization theory under nonconvexity. Reference [14] studied the optimization problem in which f: X → (−∞, ∞] and g: Y → (−∞, ∞] are proper lower semicontinuous functions (not necessarily convex), where X ⊂ Rⁿ is a closed convex set; the aim is to find a critical point of

f(x) + g(y),   (1)

so as to achieve

min (f(x) + g(y)).   (2)

An alternating direction method was introduced to solve the nonconvex linearly constrained problem whose objective is a proper lower semicontinuous nonconvex function, with the iterative scheme (3). The convergence analysis showed that the sequence {(x_k, y_k)} generated by the alternating direction method converges to a critical point (x*, y*) of the objective and has finite length. However, the control conditions on the penalty parameter of the constructed iteration are rather strict, and no convergence rate result was given. Following [14], this paper studies the unconstrained optimization of nonconvex functions, constructs a new asymptotic algorithm, and verifies the convergence of the improved iterative algorithm under appropriate conditions.

1 Preliminaries
Let H be a Hilbert space with inner product ⟨·,·⟩ and norm ‖·‖.
Proposition 1: (i) dom f := {x ∈ Rⁿ : f(x) < +∞} denotes the domain of f. (ii) For a point x ∈ dom f, ∂̂f(x) denotes the Fréchet subdifferential of f at x; it is the set of vectors x* ∈ Rⁿ satisfying the defining limit inequality. (iii) ∂f(x) denotes the limiting subdifferential of f at x ∈ Rⁿ. Clearly ∂̂f(x) ⊂ ∂f(x); the first set is closed and convex, while the second is closed. crit f denotes the set of critical points of f: if 0 ∈ ∂f(x), then x ∈ crit f.
Proposition 2: Let f: Rⁿ → (−∞, +∞] be a proper lower semicontinuous function. If C is a closed subset of Rⁿ, the distance from x ∈ Rⁿ to C is dist(x, C) := inf{‖x − y‖ : y ∈ C}. (4) If C is empty, dist(x, C) = ∞ for all x ∈ Rⁿ; if dist(x, C) = 0, then x ∈ C.
Proposition 3: If C is a closed subset of Rⁿ, δ_C denotes its indicator function: for all x ∈ Rⁿ, δ_C(x) = 0 if x ∈ C and +∞ otherwise. (5) The projection onto C is P_C(x) := argmin{‖x − z‖ : z ∈ C}.
Proposition 4: Let η ∈ (0, +∞]. Φ_η denotes the class of concave continuous functions φ: [0, η) → R₊ satisfying: (i) φ(0) = 0; (ii) φ is C¹ on (0, η) and continuous at 0; (iii) φ′(x) > 0 for all x ∈ (0, η).
Lemma 1 [15]: Let f: Rⁿ → (−∞, +∞] be proper lower semicontinuous. If x ∈ Rⁿ is a local minimizer of f, then 0 ∈ ∂f(x).
Lemma 2 (KL property) [16]: Let f: Rⁿ → (−∞, +∞] be proper lower semicontinuous, and let dom ∂f := {x ∈ Rⁿ : ∂f(x) ≠ ∅}. (i) f has the Kurdyka-Łojasiewicz (KL) property at x̄ ∈ dom ∂f if there exist η ∈ (0, +∞], a neighborhood U of x̄, and a function φ ∈ Φ_η such that for all x ∈ U with f(x̄) < f(x) < f(x̄) + η, (6) one has
φ′(f(x) − f(x̄)) dist(0, ∂f(x)) ≥ 1.   (7)
(ii) If f satisfies the KL property at every point of dom ∂f, then f is called a KL function.

2 The iterative algorithm
First, an objective function Ψ(x, y) associated with (1) is constructed (8), and the iterative sequence of the method is generated by the scheme (9). This is the new asymptotic algorithm constructed here. For asymptotic algorithms for other nonconvex problems and related background, see [17-22]. The assumptions are as follows:
(H1) the solution set of (9) is nonempty;
(H2) f and g are KL functions bounded from below;
(H3) for all k ≥ 0, the sequences {λ_k} and {μ_k} lie in (λ₋, λ₊).
Lemma 3: Assume (H1)-(H3) hold, let {(x_k, y_k)} be generated by (9), and let (Δ_{x,k}, Δ_{y,k}) be defined from the optimality conditions of the subproblems. Then (Δ_{x,k}, Δ_{y,k}) ∈ ∂Ψ(x_k, y_k), and there exists a constant M > 0 such that ‖(Δ_{x,k}, Δ_{y,k})‖ ≤ M(‖x_k − x_{k−1}‖ + ‖y_k − y_{k−1}‖).
Proof: From (9) we obtain (10). Taking α_k ∈ ∂f(x_k), the optimality condition for (10) follows, and similarly (11). Since ∂_xΨ(x_k, y_{k−1}) and ∂_yΨ(x_k, y_k) are expressed through α_k and β_k, we obtain (Δ_{x,k}, Δ_{y,k}) ∈ ∂Ψ(x_k, y_k); the bound then follows from the triangle inequality. ∎

3 Convergence analysis
Theorem 1: Assume (H1)-(H3) hold and let {(x_k, y_k)} be generated by (9). Then: (i) the sequence {Ψ(x_k, y_k)} is nonincreasing, and there exists a constant M₁ > 0 such that
M₁(‖x_{k+1} − x_k‖² + ‖y_{k+1} − y_k‖²) ≤ Ψ(x_k, y_k) − Ψ(x_{k+1}, y_{k+1});   (12)
(ii) if {(x_k, y_k)} is bounded, then Σ_k ‖x_{k+1} − x_k‖ and Σ_k ‖y_{k+1} − y_k‖ are finite.
Proof: (i) From (9) we obtain (13) and (14); adding them for all k ≥ 1 gives (15), and by the definition of Ψ(x, y) we obtain (16), which shows that {Ψ(x_k, y_k)} is nonincreasing.
(ii) Summing inequality (16) from 0 to N (N ≥ 0) gives (17). Since {(x_k, y_k)} is bounded, for any ε₁ > 0 there exists N₁ > 0 such that dist((x_k, y_k), (x*, y*)) < ε₁ for all k > N₁. The lower semicontinuity of f gives (18); from (9), since (x_k, y_k) → (x*, y*) and {λ_k} is bounded, letting k → ∞ gives (19). Combining (18) and (19), and arguing similarly for g, yields (20): for any ε₂ > 0 there exists N₂ > 0 such that |Ψ(x_k, y_k) − Ψ(x*, y*)| < ε₂ for all k > N₂. (21) Since {Ψ(x_k, y_k)} is nonincreasing, Ψ(x*, y*) < Ψ(x_k, y_k) for all k ≥ 1. Let N = max{1, N₁, N₂}; then for all k > N,
(x_k, y_k) ∈ {(x, y) : dist((x, y), (x*, y*)) < ε₁} ∩ {(x, y) : Ψ(x*, y*) < Ψ(x, y) < Ψ(x*, y*) + ε₂}.
By the KL property,
φ′(Ψ(x_k, y_k) − Ψ(x*, y*)) dist((0, 0), ∂Ψ(x_k, y_k)) ≥ 1.   (22)
Lemma 3 gives (23). The concavity of φ gives
φ(Ψ(x_k, y_k) − Ψ(x*, y*)) − φ(Ψ(x_{k+1}, y_{k+1}) − Ψ(x*, y*)) ≥ φ′(Ψ(x_k, y_k) − Ψ(x*, y*))(Ψ(x_k, y_k) − Ψ(x_{k+1}, y_{k+1})).   (24)
For all k > N, from (22), (23), and the concavity of φ we obtain (25), where
Ω_{k,k+1} := φ(Ψ(x_k, y_k) − Ψ(x*, y*)) − φ(Ψ(x_{k+1}, y_{k+1}) − Ψ(x*, y*)).
Using (a + b)² ≤ 2a² + 2b², we get
2(‖x_{k+1} − x_k‖ + ‖y_{k+1} − y_k‖) ≤ M₂Ω_{k,k+1} + (‖x_k − x_{k−1}‖ + ‖y_k − y_{k−1}‖).   (26)
Summing (26) over k = N+1, N+2, …, n and simplifying, with Ω_{N+1,n+1} = Ω_{N+1,q} + Ω_{q,n+1} (q a positive integer), the definition of Ω_{N+1,n+1} and φ ∈ Φ_η give (27). Letting n → ∞ in (27) shows that Σ_k (‖x_{k+1} − x_k‖ + ‖y_{k+1} − y_k‖) is finite. ∎

4 Convergence results
Theorem 2 (convergence theorem): Assume (H1)-(H3) hold, let {(x_k, y_k)} be generated by (9), and let (x*, y*) denote the limit point associated with Ψ(x_k, y_k). Then {(x_k, y_k)} converges to a critical point (x*, y*).
Proof: For m > n > N we obtain (28), which shows that {(x_k, y_k)} is a Cauchy, hence convergent, sequence. By Theorem 1(ii) and Lemma 2, (Δ_{x,k}, Δ_{y,k}) ∈ ∂Ψ(x_k, y_k) with (Δ_{x,k}, Δ_{y,k}) → (0, 0) as k → ∞. Therefore, by the closedness of ∂Ψ, (0, 0) ∈ ∂Ψ(x*, y*), which shows that (x*, y*) is a critical point of Ψ. ∎
Corollary 1: Assume Ψ satisfies (H1)-(H3) and has the Kurdyka-Łojasiewicz property at a local minimizer (x*, y*). Then there exist ε₃ > 0 and υ > 0 such that for any starting point (x₀, y₀) with (i) ‖(x₀, y₀) − (x*, y*)‖ < ε₃ and (ii) min Ψ < Ψ(x₀, y₀) < min Ψ + υ, the sequence (x_k, y_k) started from (x₀, y₀) has the finite length property and converges to (x*, y*), with Ψ(x*, y*) = min Ψ.
Proof: By Theorem 2, (x_k, y_k) converges to a critical point (x*, y*) of Ψ satisfying min Ψ < Ψ(x₀, y₀) < min Ψ + υ for all k > 0. If the conclusion failed, Lemma 2 would yield a contradiction with (0, 0) ∈ ∂Ψ(x*, y*). ∎
Theorem 3 (convergence rate theorem): Assume Ψ(x, y) satisfies (H1)-(H3), (x_k, y_k) converges to (x_∞, y_∞), and Ψ(x, y) has the Kurdyka-Łojasiewicz property at (x_∞, y_∞) with φ(s) = c s^{1−θ}, θ ∈ [0, 1), c > 0, where θ is a Łojasiewicz exponent of Ψ at (x_∞, y_∞). Then the following hold:
(i) if θ = 0, the sequence (x_k, y_k) converges in a finite number of steps;
(ii) if θ ∈ (0, 1/2], there exist c > 0 and τ ∈ [0, 1) such that ‖(x_k, y_k) − (x_∞, y_∞)‖ ≤ cτ^k;
(iii) if θ ∈ (1/2, 1), there exists c > 0 such that ‖(x_k, y_k) − (x_∞, y_∞)‖ ≤ c k^{−(1−θ)/(2θ−1)}.
Proof: (i) Assume θ = 0. If Ψ(x_k, y_k) is eventually constant, then by Theorem 2, (x_k, y_k) converges in finitely many steps. If Ψ(x_k, y_k) is not eventually constant, then for all k sufficiently large the Kurdyka-Łojasiewicz inequality gives c · dist((0, 0), ∂Ψ(x_k, y_k)) ≥ 1, which contradicts (0, 0) ∈ ∂Ψ(x_k, y_k) in the limit.
(ii) Assume θ > 0. For k ≥ 0 set Δ_k := Σ_{i≥k} (‖x_{i+1} − x_i‖ + ‖y_{i+1} − y_i‖), which is finite by Theorem 1. Since Δ_k ≥ ‖x_k − x_∞‖ + ‖y_k − y_∞‖, it suffices to estimate Δ_k. One has Δ_k ≤ (Δ_{k−1} − Δ_k) + M₂Ω_{k,k+1}. The Kurdyka-Łojasiewicz inequality gives
φ′(Ψ(x_k, y_k) − Ψ(x*, y*)) dist((0, 0), ∂Ψ(x_k, y_k)) = c(1 − θ)(Ψ(x_k, y_k) − Ψ(x*, y*))^{−θ} dist((0, 0), ∂Ψ(x_k, y_k)) ≥ 1,
and therefore (Ψ(x_k, y_k) − Ψ(x*, y*))^θ ≤ c(1 − θ) dist((0, 0), ∂Ψ(x_k, y_k)). Moreover,
dist((0, 0), ∂Ψ(x_k, y_k)) ≤ ‖(Δ_{x,k}, Δ_{y,k})‖ ≤ M(‖x_{k−1} − x_k‖ + ‖y_{k−1} − y_k‖) ≤ M(Δ_{k−1} − Δ_k).
From the definition of Ω_{k,k+1},
Ω_{k,k+1} ≤ φ(Ψ(x_k, y_k) − Ψ(x*, y*)) = c(Ψ(x_k, y_k) − Ψ(x*, y*))^{1−θ}.
Combining these estimates with the results of [23] yields (ii) and (iii). ∎

5 Conclusion
An algorithm is studied for solving unconstrained nonconvex separable programming in which the objective function is proper lower semicontinuous but not necessarily convex. Since the objective has the KL property, the convergence of the algorithm is proved and convergence rate results are obtained; the rates are derived through the Łojasiewicz exponent associated with the function.

References:
[1] Martinet B. Régularisation d'inéquations variationnelles par approximations successives (French) [J]. Rev. Française Informat. Recherche Opérationnelle, 1970, 4(4): 154-158.
[2] Rockafellar R T. Augmented Lagrangians and applications of the proximal point algorithm in convex programming [J]. Mathematics of Operations Research, 1976, 1(2): 97-116.
[3] Combettes P, Pennanen T. Proximal methods for cohypomonotone operators [J]. SIAM Journal on Control and Optimization, 2004, 43(2): 731-742.
[4] Kaplan A, Tichatschke R. Proximal point methods and nonconvex optimization [J]. Journal of Global Optimization, 1998, 13(4): 389-406.
[5] Miettinen M, Mäkelä M M, Haslinger J. On numerical solution of hemivariational inequalities by nonsmooth optimization methods [J]. Journal of Global Optimization, 1995, 6(4): 401-425.
[6] Mifflin R, Sagastizábal C. νμ-smoothness and proximal point results for some nonconvex functions [J]. Optimization Methods & Software, 2004, 19(5): 463-478.
[7] Pennanen T. Local convergence of the proximal point algorithm and multiplier methods without monotonicity [J]. Mathematics of Operations Research, 2002, 27(1): 170-191.
[8] Spingarn J E. Submonotone mappings and the proximal point algorithm [J]. Numerical Functional Analysis & Optimization, 1982, 4(2): 123-150.
[9] Attouch H, Soubeyran A. Inertia and reactivity in decision making as cognitive variational inequalities [J]. Journal of Convex Analysis, 2006, 13(13): 207-224.
[10] Clarke F H, Stern R J, Ledyaev Y S, et al. Nonsmooth analysis and control theory [J]. Graduate Texts in Mathematics, 1998, 178(7): 137-151.
[11] Mordukhovich B S. Maximum principle in the problem of time optimal response with nonsmooth constraints [J]. Journal of Applied Mathematics and Mechanics, 1976, 40(6): 960-969.
[12] Mordukhovich B. Variational analysis and generalized differentiation [M]. Heidelberg: Springer, 1998.
[13] Rockafellar R T, Wets R. Variational analysis [M]. Heidelberg: Springer, 1998.
[14] Wang X Y, Li S J, Kou X P, et al. A new alternating direction method for linearly constrained nonconvex optimization problems [J]. Journal of Global Optimization, 2015, 62(4): 695-709.
[15] Nocedal J, Wright S J. Numerical optimization [M]. New York: Springer, 2006.
[16] Bolte J, Daniilidis A, Lewis A. The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems [J]. SIAM Journal on Optimization, 2006, 17(4): 1205-1223.
[17] Attouch H, Bolte J, Redont P, Soubeyran A. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality [J]. Mathematics of Operations Research, 2010, 35(2): 438-457.
[18] Bolte J, Sabach S, Teboulle M. Proximal alternating linearized minimization for nonconvex and nonsmooth problems [J]. Mathematical Programming, 2014, 146(1-2): 459-494.
[19] Lemaire B. The proximal algorithm [J]. New Methods in Optimization and Their Industrial Uses, International Series of Numerical Mathematics, 1987, 87: 73-87.
[20] Rockafellar R T. Monotone operators and the proximal point algorithm [J]. SIAM Journal on Control and Optimization, 1976, 14(5): 877-898.
[21] Spingarn J E. Submonotone mappings and the proximal point algorithm [J]. Numerical Functional Analysis & Optimization, 1982, 4(2): 123-150.
[22] Attouch H, Bolte J, Svaiter B F. Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods [J]. Mathematical Programming, 2013, 137(1-2): 91-129.
[23] Attouch H, Bolte J. On the convergence of the proximal algorithm for nonsmooth functions involving analytic features [J]. Mathematical Programming, 2009, 116(1-2): 5-16.
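The proximal (asymptotic) step that drives algorithms of this kind can be illustrated in a few lines of Python. This is only a sketch of the classical proximal-point iteration on a one-dimensional nonconvex function, not the paper's algorithm (9), whose exact form is not reproduced above; the double-well objective and the parameter λ = 0.5 are invented for the illustration:

import numpy as np
from scipy.optimize import minimize_scalar

def f(x):
    # A nonconvex double-well objective with critical points x = -1, 0, 1.
    return (x**2 - 1.0)**2

def prox_step(xk, lam):
    # x_{k+1} = argmin_x f(x) + (1/(2*lam)) * (x - xk)^2
    res = minimize_scalar(lambda x: f(x) + (x - xk)**2 / (2.0 * lam),
                          bounds=(xk - 2.0, xk + 2.0), method='bounded')
    return res.x

x = 3.0
for _ in range(50):
    x = prox_step(x, lam=0.5)
print(x)  # the iterates settle at the critical point x = 1

Each step decreases f(x_{k+1}) + (1/(2λ))(x_{k+1} − x_k)², which is the descent mechanism behind the finite-length and critical-point convergence results above.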
Xidian University quality course "Convex Optimization and Its Applications in Signal Processing": course syllabus
Course syllabus. Course code: G00TE1204. Course name: Convex Optimization and Its Applications in Signal Processing (凸优化及其在信号处理中的应用). English name: Convex Optimization and Its Applications in Signal Processing. Offering unit: School of Telecommunications Engineering. Syllabus written by: 苏文藻. Credits: 2. Class hours: 32. Level: master's/doctoral/professional degree course. Type: elective. Format: lectures. Assessment: homework and examination. Intended majors: Communication and Information Systems; Signal and Information Processing. Prerequisites: none listed.
Objectives: students should: 1. master basic techniques for formulating optimization models; 2. master basic convex analysis; 3. master optimality conditions and duality theory for convex optimization problems; 4. become acquainted with applications of convex optimization in signal processing.
English description: In this course we will develop the basic machinery for formulating and analyzing various optimization problems. Topics include convex analysis, linear and conic linear programming, nonlinear programming, optimality conditions, Lagrangian duality theory, and basics of optimization algorithms. Applications from signal processing will be used to complement the theoretical developments. No prior optimization background is required for this class. However, students should have workable knowledge in multivariable calculus, real analysis, linear algebra and matrix theory.
Main content:
Part I: Introduction — problem formulation; classes of optimization problems.
Part II: Theory — basics of convex analysis; conic linear programming and nonlinear programming: optimality conditions and duality theory; basics of combinatorial optimization.
Part III: Selected applications in signal processing — transmit beamforming; network localization; sparse/low-rank regression.
References:
1. Ben-Tal, Nemirovski: Optimization I-II: Convex Analysis, Nonlinear Programming Theory, Nonlinear Programming Algorithms, 2004.
2. Boyd, Vandenberghe: Convex Optimization, Cambridge University Press, 2004.
3. Luenberger, Ye: Linear and Nonlinear Programming (3rd Edition), 2008.
4. Nemirovski: Lectures on Modern Convex Optimization, 2005.
A Lagrangian neural network for nonsmooth nonconvex optimization problems with equality and inequality constraints
Yu Xin, Xu Zhijian, Chen Zhaorong, Xu Chenhua
Abstract: Nonconvex nonsmooth optimization problems are related to many fields of science and engineering applications and are a research hotspot. Addressing the shortcomings of earlier neural networks based on penalty functions for nonsmooth optimization problems, a recurrent neural network model using a Lagrange-multiplier penalty function is proposed to solve nonconvex nonsmooth optimization problems with equality and inequality constraints. Since the penalty factor in this network model is a variable, the network can guarantee convergence to an optimal solution without computing an initial penalty factor value, which makes it more convenient for network computing. In addition, unlike the traditional Lagrange method, the network model adds an equality-constraint penalty term, which improves the convergence ability of the network. A detailed analysis proves that the trajectory of the network model reaches the feasible region in finite time and finally converges to the critical point set. Finally, numerical experiments verify the effectiveness of the theoretical results.
Journal: Journal of Electronics & Information Technology, 2017, 39(8): 1950-1955.
Keywords: Lagrangian neural network; convergence; nonconvex nonsmooth optimization
Authors' affiliations: School of Computer and Electronic Information, Guangxi University, Nanning 530004 (Yu Xin, Xu Zhijian, Chen Zhaorong); School of Electrical Engineering, Guangxi University, Nanning 530004 (Xu Chenhua)
Language: Chinese; CLC: TP183
As a parallel computational model for solving optimization problems, recurrent neural networks have received a great deal of attention over the past few decades, and many neural network models have been proposed.
Convex optimization exercises and solutions (1): National Taiwan University past exam problems
Exam policy: Open book. You can bring any books, handouts, and any kinds of paper-based notes with you, but electronic devices (including cellphones, laptops, tablets, etc.) are strictly prohibited.
2. (18%) Determine whether each of the following functions is a convex function, quasi-convex
function, concave function. Write your answer as a table of 6 rows and 3 columns, with
⟨z, X₁z⟩ ≥ 1, ⟨z, X₂z⟩ ≥ 1.
Then, for 0 ≤ θ ≤ 1,
⟨z, (θX₁ + (1 − θ)X₂)z⟩ = θ⟨z, X₁z⟩ + (1 − θ)⟨z, X₂z⟩ ≥ θ · 1 + (1 − θ) · 1 = 1,
as required by the definition of S10. To see that it is not a cone, consider z = (1, 0, . . . , 0) and X = I ∈ Sⁿ (the symmetric matrices). Here ⟨z, Iz⟩ = 1, so I ∈ S10, but ⟨z, (1/2)Iz⟩ = 1/2 < 1, so the nonnegative multiple (1/2)I leaves the set. The reason that it is not affine is similar: since ⟨z, 2Iz⟩ = 2 ≥ 1, both I and 2I lie in S10, yet the affine combination 2 · I + (−1) · 2I = O (the all-0 matrix) does not, so the "line" containing I and 2I leaves the set. It follows that it is not a subspace.
11. S11 = {x ∈ Rⁿ : ‖Px + q‖₂ ≤ cᵀx + r}, given any P ∈ R^{m×n}, q ∈ Rᵐ, c ∈ Rⁿ, and r ∈ R. T, F, F, F. To show convexity, if x, y ∈ S11 and 0 ≤ θ ≤ 1, then by the triangle inequality
‖P(θx + (1 − θ)y) + q‖₂ ≤ θ‖Px + q‖₂ + (1 − θ)‖Py + q‖₂ ≤ θ(cᵀx + r) + (1 − θ)(cᵀy + r) = cᵀ(θx + (1 − θ)y) + r,
so θx + (1 − θ)y ∈ S11.
An introduction to LP, NLP, and MIP (in English)
The most important characteristic of an optimization problem is whether it is continuous or discrete. Continuous problems are those where the constraint set is infinite and has a continuous character, like linear optimization and nonlinear optimization.
1. Linear Programming
Linear programming, also known as linear optimization, is the problem of maximizing or minimizing a linear function over a convex polyhedron specified by linear and non-negativity constraints. Simplistically, linear programming is the optimization of an outcome based on some set of constraints using a linear mathematical model. Convexity is a special property of linear programming, which means that the line connecting any two points of the feasible set is completely contained within the set. A local optimum is therefore exactly a global optimum, and an optimal solution must lie on the boundary of the constraint set. So linear programming can be solved using the simplex method, which runs along the edges of the polytope of feasible solutions to find the best answer. A different type of method for linear programming problems is the interior point method, which achieves optimization by going through the middle of the solid defined by the problem rather than around its surface. Such methods can be more effective for problems with many variables, since the simplex method can be very computationally intensive; their practical performance is better than their theoretical complexity. An excellent toolbox such as YALMIP can be used when modeling and solving convex and nonconvex optimization problems. The "sdpvar" is the core object of YALMIP, representing the real decision variables in the optimization problem. The "set" is another key object in YALMIP, used to collect all the constraints of the optimization problem. The function "solvesdp" is used to solve the optimization problem; its form is s = solvesdp(F, f), where F is the sum of constraint "set"s and f is the objective function.
2. Nonlinear Programming
Nonlinear programming is another kind of optimization, in which some of the constraints or the objective function are nonlinear. Every constrained NLP has three basic components: a set of unknowns or variables to be determined, an objective function to be minimized or maximized, and a set of constraints to be satisfied. Solving such a problem amounts to finding values of the variables that optimize (minimize or maximize) the objective function while satisfying all the constraints. A special kind of nonlinear program is convex optimization, in which the feasible set is a convex set and the optimization goal and the constraints are convex/concave/linear functions. Several methods can be used to solve NLPs. Their mechanism is to generate sequences of feasible points by searching along descent directions; in this sense, they can be viewed as constrained versions of unconstrained descent algorithms. Gradient descent is a first-order optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point. If instead one takes steps proportional to the positive of the gradient, one approaches a local maximum of that function; the procedure is then known as gradient ascent. The constraint set of an optimization problem is usually specified in terms of equality and inequality constraints. If we take this structure into account, we obtain a sophisticated collection of optimality conditions, involving auxiliary variables called Lagrange multipliers. These variables facilitate the characterization of optimal solutions, and they also provide valuable sensitivity information, quantifying up to first order the variation of the optimal cost under variations in the problem data. Two basic lines of analysis connected with Lagrange multipliers are penalty methods and feasible direction methods.
3. Mixed-integer Linear/Nonlinear Programming
Mixed-integer linear/nonlinear programming is similar to linear/nonlinear programming except for the type of variables: MIP contains integer variables as well as continuous variables. If the variables are required to be 0 or 1 (binary), the problem is called a 0-1 linear program. Mixed-integer linear/nonlinear programming becomes NP-complete to solve once we are allowed to require that some variables be integers instead of real values, so no polynomial-time algorithm is known. For mixed-integer linear programming, the optimal solution is unlikely to lie on the boundary of the continuous feasibility region; as a consequence, the optimum of the problem can be no better than that of its continuous relaxation. If we first solve the MILP by treating it as a linear program, how do we get from this solution to an integer solution? Assume the integer variables have lower and upper bounds. If in the optimal solution of the LP all integer variables take integer values, then it is also an optimal solution to the MILP. Otherwise, rounding the LP solution may yield non-optimal or infeasible solutions. What is therefore needed is branch & bound: we build a binary tree of subproblems whose leaves correspond to pending problems still to be solved; termination is guaranteed because the integer variables have finite bounds and, at each split, the range of one variable becomes strictly smaller. Certainly, the pruning of the search tree has a huge impact on the efficiency of branch & bound. For mixed-integer programming, a classical problem in scheduling is the unit commitment problem. In this problem, our task is to turn power generating plants on and off in order to meet a forecasted future power demand, while minimizing the generation-side costs. There are several different power plants with different characteristics and running costs, and various constraints on how they can be used. The decisions determining whether generators are on or off are integer variables. If the generation cost is modeled as a nonlinear formulation, the problem is a mixed-integer nonlinear optimization problem. The constraints often include energy balance constraints, facility performance constraints, and facility capacity constraints. Some heuristic optimization algorithms can be used to solve such complicated optimization problems, such as GA, PSO, hybrid optimization strategies, and so on, often applied through MATLAB toolboxes. These tools may not find the best global solution, but rather a "good" solution. Nevertheless, such methods have the advantage of quick problem solving, so they are widely applied.
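As a small illustration of the integer-variable modeling described above (a sketch assuming the Python package cvxpy and an installed mixed-integer solver; the tiny problem data are invented for the example):

import cvxpy as cp

# Maximize 3x + 2y over nonnegative integers with x + y <= 4.
x = cp.Variable(integer=True)
y = cp.Variable(integer=True)
prob = cp.Problem(cp.Maximize(3 * x + 2 * y),
                  [x + y <= 4, x >= 0, y >= 0])
prob.solve()  # cvxpy dispatches this MILP to whichever MIP solver is available
print(prob.value, x.value, y.value)

A solver based on branch & bound would first solve the LP relaxation and then split on fractional variables, exactly as sketched in the text.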
Convex functions (1)
Figure 1: Examples of a convex set (a) and a nonconvex set (b).
• All of Rⁿ. It should be fairly obvious that given any x, y ∈ Rⁿ, θx + (1 − θ)y ∈ Rⁿ.
• The non-negative orthant, Rⁿ₊. The non-negative orthant consists of all vectors in Rⁿ whose elements are all non-negative: Rⁿ₊ = {x : xᵢ ≥ 0 ∀i = 1, …, n}. To show that this is a convex set, simply note that given any x, y ∈ Rⁿ₊ and 0 ≤ θ ≤ 1, (θx + (1 − θ)y)ᵢ = θxᵢ + (1 − θ)yᵢ ≥ 0 ∀i.
• Norm balls. Let ‖·‖ be some norm on Rⁿ (e.g., the Euclidean norm, ‖x‖₂ = (Σᵢ₌₁ⁿ xᵢ²)^{1/2}). Then the set {x : ‖x‖ ≤ 1} is a convex set. To see this, suppose x, y ∈ Rⁿ with ‖x‖ ≤ 1, ‖y‖ ≤ 1, and 0 ≤ θ ≤ 1. Then ‖θx + (1 − θ)y‖ ≤ ‖θx‖ + ‖(1 − θ)y‖ = θ‖x‖ + (1 − θ)‖y‖ ≤ 1, where we used the triangle inequality and the positive homogeneity of norms.
• Affine subspaces and polyhedra. Given a matrix A ∈ R^{m×n} and a vector b ∈ Rᵐ, an affine subspace is the set {x ∈ Rⁿ : Ax = b} (note that this could possibly be empty if b is not in the range of A). Similarly, a polyhedron is the (again, possibly empty) set {x ∈ Rⁿ : Ax ⪯ b}, where '⪯' here denotes componentwise inequality (i.e., all the entries of Ax are less than or equal to their corresponding element in b). To prove this, first consider x, y ∈ Rⁿ such that Ax = Ay = b. Then for 0 ≤ θ ≤ 1, A(θx + (1 − θ)y) = θAx + (1 − θ)Ay = θb + (1 − θ)b = b. Similarly, for x, y ∈ Rⁿ that satisfy Ax ⪯ b and Ay ⪯ b and 0 ≤ θ ≤ 1, A(θx + (1 − θ)y) = θAx + (1 − θ)Ay ⪯ θb + (1 − θ)b = b.
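A quick numerical spot-check of the norm-ball example (an illustrative sketch in Python, not part of the original notes):

import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    # Draw two points of the unit Euclidean ball and a random theta in [0, 1].
    x = rng.standard_normal(3); x /= max(1.0, np.linalg.norm(x))
    y = rng.standard_normal(3); y /= max(1.0, np.linalg.norm(y))
    th = rng.uniform()
    # Their convex combination must stay inside the ball.
    assert np.linalg.norm(th * x + (1 - th) * y) <= 1 + 1e-12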
Robust duality characterizations for a class of uncertain optimization problems
Sun Xiangkai; Zeng Jing; Guo Xiaole
Abstract: By introducing a class of optimization problems with uncertain information in both the objective functions and the constraints, and then using a robust-type subdifferential constraint qualification, we characterize Mond-Weir type robust duality between the uncertain optimization problem and its uncertain dual problem, in other words, the duality between the robust counterpart of the primal problem and the optimistic counterpart of its dual problem.
Journal: Journal of Jilin University (Science Edition), 2016, 54(4): 715-719.
Keywords: uncertain optimization problem; robust duality; constraint qualification
Authors' affiliations: College of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing 400067 (Sun Xiangkai, Zeng Jing); School of Economics, Southwest University of Political Science and Law, Chongqing 401120 (Guo Xiaole)
Language: Chinese; CLC: O221.6; O224

Let X and Y be locally convex spaces, C a nonempty closed convex subset of X, K a nonempty closed convex cone in Y, f: X → R a convex function, and g: X → Y a K-convex function. Consider the convex optimization problem

(P)  min f(x)  subject to  x ∈ C, g(x) ∈ −K,

with feasible set F₀ = {x ∈ C : g(x) ∈ −K}. As an important model of optimization, problem (P) has been widely studied [1-5]. However, because it does not take the effects of data uncertainty into account, its applicability is quite limited. In fact, in many practical problems it is hard to guarantee, for computational or operational reasons, that the data under consideration are exact: the maximum sales volume of a product in a production process is uncertain, future demand and market prices in risk management are uncertain, data collection introduces measurement errors from the measuring instruments, and model simplification introduces approximations. Therefore, optimization problems whose objective or constraint functions contain uncertain parameters, i.e., uncertain optimization problems, have wide applications [6-12].

For the convex problem (P), if both the objective function and the constraint function contain uncertain data, (P) becomes the uncertain optimization problem (UP), where U and V are uncertainty sets (subsets of locally convex spaces), u ∈ U and v ∈ V are uncertain parameters, f is a convex-concave function — that is, f(·, u) is convex for every u ∈ U and f(x, ·) is concave for every x ∈ X — and g: X × Z → Y is a K-convex-concave function, that is, g(·, v) is K-convex for every v ∈ V and g(x, ·) is K-concave for every x ∈ X.

At present there are many approaches to uncertain optimization problems, among which robust optimization is the principal one for problems with uncertain parameters. Its aim is to estimate a reasonable uncertainty set containing the uncertain parameters such that solutions of the problem are feasible for every element of this set, and then to build the worst-case robust counterpart of the problem and obtain its robust optimal solutions [13]. References [8, 10-12] used the robust optimization approach to establish various properties of uncertain optimization problems; reference [12] introduced a robust-type subdifferential constraint qualification, characterized the robust optimal solutions and the optimal solution set of (UP), and studied Wolfe-type robust duality between (UP) and its uncertain dual problem.

In this paper, using the robust optimization approach [13] and the robust-type subdifferential constraint qualification [12], we characterize Mond-Weir type robust duality between (UP) and its uncertain dual problem. To this end, we first introduce the robust counterpart of (UP):

(RUP)  min sup_{u∈U} f(x, u)  subject to  g(x, v) ∈ −K for every v ∈ V,

then introduce the uncertain Mond-Weir type dual problem of (UP), and characterize robust weak duality and robust strong duality between (UP) and its uncertain Mond-Weir type dual problem.

Let X be a locally convex topological vector space with dual space X*, endowed with the weak* topology w(X*, X), and let ⟨·,·⟩ denote the pairing between X and X*. For W ⊆ X* or W ⊆ X* × R, cl W denotes the weak* closure of W. For any set C ⊆ X, the indicator function δ_C: X → R ∪ {+∞} is defined by δ_C(x) = 0 if x ∈ C and +∞ otherwise, and the support function σ_C: X* → R ∪ {+∞} by σ_C(x*) = sup_{x∈C} ⟨x*, x⟩. Suppose l: X → R ∪ {+∞} is a given extended-real-valued function. Its effective domain and epigraph are dom l = {x ∈ X : l(x) < +∞} and epi l = {(x, r) ∈ X × R : l(x) ≤ r}. If the effective domain is nonempty, l is called proper. If for all x, y ∈ X and t ∈ [0, 1], l(tx + (1 − t)y) ≤ t l(x) + (1 − t) l(y), then l is called convex; if −l is convex, l is called concave. For any proper convex function l: X → R ∪ {+∞}, its conjugate function l*: X* → R ∪ {+∞} is defined by l*(x*) = sup_{x∈X} {⟨x*, x⟩ − l(x)}, and the subdifferential of l at x̄ ∈ dom l is defined by ∂l(x̄) = {x* ∈ X* : l(x) ≥ l(x̄) + ⟨x*, x − x̄⟩ for all x ∈ X}.

Let Y be another locally convex topological vector space with dual space Y*, endowed with the weak* topology w(Y*, Y). Let K ⊆ Y be a nonempty closed convex cone, defining the partial order ≤_K on Y, and let h: X → Y be a vector-valued function. If for all x, y ∈ X and t ∈ [0, 1], h(tx + (1 − t)y) ≤_K t h(x) + (1 − t) h(y), then h is called K-convex; if −h is K-convex, h is called K-concave. Further, for λ ∈ K*, λh denotes the composition of λ with h. Clearly, h is K-convex if and only if λh is convex for every λ ∈ K*, and similarly h is K-concave if and only if λh is concave for every λ ∈ K*.

Definition 1 [12]: The robust feasible set of (UP) is defined as F = {x ∈ C : g(x, v) ∈ −K for all v ∈ V}.
Definition 2 [12]: If x̄ ∈ F is an optimal solution of (RUP), then x̄ is called a robust optimal solution of (UP). The set of all robust optimal solutions of (UP) is called the robust optimal solution set of (UP).
Definition 3 [12] introduces the robust-type subdifferential constraint qualification (RSCQ) at a point x ∈ F, and Definition 4 [12] the subdifferential constraint qualification (SCQ) at a point x ∈ F₀.
Lemma 1 [12]: For (UP), let f be convex-concave with f(·, u) continuous at x̄ for every u ∈ U, and let g: X × Z → Y be a continuous K-convex-concave function. Then the following are equivalent: 1) the robust-type subdifferential constraint qualification (RSCQ) holds at x̄ ∈ F; 2) x̄ ∈ X is a robust optimal solution of (UP) if and only if there exist λ̄ ∈ K*, ū ∈ U, and v̄ ∈ V satisfying the corresponding subdifferential optimality condition.
Lemma 2 [12]: For (P), let f₀: X → R be convex and continuous at x̄, and let g₀: X → Y be continuous and K-convex. Then the following are equivalent: 1) the subdifferential constraint qualification (SCQ) holds at x̄ ∈ F₀; 2) x̄ ∈ X is an optimal solution of (P) if and only if there exists λ̄ ∈ K* satisfying the corresponding subdifferential optimality condition.

Using the robust optimization approach, we first introduce the Mond-Weir type robust dual problem (UDMW) of the uncertain optimization problem (UP), and then consider Mond-Weir type robust duality between (UP) and (UDMW), i.e., duality between the robust counterpart (RUP) of the primal problem and the optimistic counterpart (ODMW) of the dual problem. Let y ∈ C and λ ∈ K*. For any fixed u ∈ U and v ∈ V, the Mond-Weir type dual problem of (UP) maximizes f(y, u) subject to 0 ∈ ∂f(·, u)(y) + ∂δ_C(y) + ∂((λg)(·, v))(y) and (λg)(y, v) ≥ 0; its optimistic counterpart (ODMW) maximizes over (y, λ, u, v).
Remark 1: If the objective and the constraint functions contain no uncertain parameters, i.e., U and V are singletons, then (RUP) reduces to (P), and (ODMW) reduces to the Mond-Weir type dual problem (DMW) of (P). Here Mond-Weir type robust strong duality means that the optimal value of (RUP) equals the optimal value of (ODMW), and that optimal solutions of both (RUP) and (ODMW) exist.

We now establish Mond-Weir type robust weak duality and robust strong duality.
Theorem 1 (robust weak duality): For any feasible solution x of (RUP) and any feasible solution (y, λ, u, v) of (ODMW), f(x, u) ≥ f(y, u).
Proof: Let x be any feasible solution of (RUP) and (y, λ, u, v) any feasible solution of (ODMW). Then there exist y ∈ C, λ ∈ K*, u ∈ U, v ∈ V such that the dual constraints hold, so there exist η ∈ ∂f(·, u)(y), θ ∈ ∂δ_C(y), ξ ∈ ∂((λg)(·, v))(y) with η + θ + ξ = 0. Adding the three subgradient inequalities at y, evaluated at the point x ∈ C, and using (λg)(y, v) ≥ 0, we obtain
f(x, u) + (λg)(x, v) ≥ f(y, u) + (λg)(y, v) + ⟨η + θ + ξ, x − y⟩ = f(y, u) + (λg)(y, v) ≥ f(y, u).
Further, since x is a feasible solution of (RUP), g(x, v) ∈ −K, and hence (λg)(x, v) ≤ 0 because λ ∈ K*. Therefore f(x, u) ≥ f(y, u). ∎
Similarly, Mond-Weir type weak duality between (P) and (DMW) follows easily.
Corollary 1 (weak duality): For any feasible solution x of (P) and any feasible solution (y, λ) of (DMW), f₀(x) ≥ f₀(y).
Theorem 2 (robust strong duality): Let f be convex-concave with f(·, u) continuous at x̄ for every u ∈ U, and let g: X × Z → Y be a continuous K-convex-concave function. Then the following are equivalent: 1) the robust-type subdifferential constraint qualification (RSCQ) holds at x̄ ∈ F; 2) if x̄ ∈ X is a robust optimal solution of (UP), then there exist λ̄ ∈ K*, ū ∈ U, and v̄ ∈ V such that (x̄, λ̄, ū, v̄) is an optimal solution of (ODMW), and the optimal values of (RUP) and (ODMW) are equal.
Proof: 1) ⇒ 2). Let x̄ be a robust optimal solution of (UP). By Lemma 1 there exist λ̄ ∈ K*, ū ∈ U, v̄ ∈ V satisfying the optimality condition, so (x̄, λ̄, ū, v̄) is a feasible solution of (ODMW). Further, for any feasible solution (y, λ, u, v) of (ODMW), Theorem 1 gives f(x̄, ū) ≥ f(y, u); hence (x̄, λ̄, ū, v̄) is an optimal solution of (ODMW), and the optimal values of (RUP) and (ODMW) are equal. 2) ⇒ 1). The conclusion follows by an argument similar to the proof of Lemma 1. ∎
In particular, if U and V are singletons, then Lemma 2 easily yields Mond-Weir type strong duality between (P) and (DMW).
Corollary 2 (strong duality): Let f₀: X → R be convex and continuous at x̄, and let g₀: X → Y be continuous and K-convex. Then the following are equivalent: 1) the subdifferential constraint qualification (SCQ) holds at x̄ ∈ F₀; 2) if x̄ ∈ X is an optimal solution of (P), then there exists λ̄ ∈ K* such that (x̄, λ̄) is an optimal solution of (DMW), and the optimal values of (P) and (DMW) are equal.

References:
[1] Rockafellar R T. Convex Analysis [M]. Princeton: Princeton University Press, 1970.
[2] Boyd S, Vandenberghe L. Convex Optimization [M]. Cambridge: Cambridge University Press, 2004.
[3] Bot R I. Conjugate Duality in Convex Optimization [M]. Berlin: Springer-Verlag, 2010.
[4] Zhao Dan, Sun Xiangkai. Stable strong duality for a composed convex optimization problem [J]. Journal of Jilin University (Science Edition), 2013, 51(3): 441-443.
[5] Sun Xiangkai. Some characterizations of total duality for a composed convex optimization [J]. Journal of Jilin University (Science Edition), 2015, 53(1): 33-36.
[6] Ben-Tal A, Nemirovski A. Robust convex optimization [J]. Mathematics of Operations Research, 1998, 23(4): 769-805.
[7] Shapiro A, Dentcheva D, Ruszczynski A. Lectures on Stochastic Programming: Modeling and Theory [M]. Philadelphia: SIAM, 2009.
[8] Jeyakumar V, Li G Y. Strong duality in robust convex programming: complete characterizations [J]. SIAM Journal on Optimization, 2010, 20(6): 3384-3407.
[9] Bertsimas D, Brown D B, Caramanis C. Theory and applications of robust optimization [J]. SIAM Review, 2011, 53(3): 464-501.
[10] Kuroiwa D, Lee G M. On robust convex multiobjective optimization [J]. Journal of Nonlinear and Convex Analysis, 2014, 15(6): 1125-1136.
[11] Sun Xiangkai, Chai Yi. On robust duality for fractional programming with uncertainty data [J]. Positivity, 2014, 18(1): 9-28.
[12] Sun Xiangkai, Peng Zaiyun, Guo Xiaole. Some characterizations of robust optimal solutions for uncertain convex optimization problems [J/OL]. Optimization Letters, 2015-09-18. doi: 10.1007/s11590-015-0946-8.
[13] Ben-Tal A, El Ghaoui L, Nemirovski A. Robust Optimization [M]// Princeton Series in Applied Mathematics. Princeton: Princeton University Press, 2009.
A summary of MATLAB toolboxes for convex optimization and semidefinite programming (some written in C/C++)
Software
For some codes a benchmark on problems from SDPLIB is available at Arizona State University.
∙ CSDP 4.9, by Brian Borchers (report 1998, report 2001). He also maintains a problem library, SDPLIB.
∙ CVX, version 1.1, by M. Grant and S. Boyd. Matlab software for disciplined convex programming.
∙ DSDP 5.6, by S. J. Benson and Y. Ye, parallel dual-scaling interior point code in C (manual); source and executables available from Benson's homepages.
∙ GloptiPoly3, by D. Henrion, J.-B. Lasserre and J. Loefberg; a Matlab/SeDuMi add-on for LMI-relaxations of minimization problems over multivariable polynomial functions subject to polynomial or integer constraints.
∙ LMITOOL-2.0 of the Optimization and Control Group at ENSTA.
∙ MAXDET, by Shao-po Wu, L. Vandenberghe, and S. Boyd. Software for determinant maximization. (See also rmd.)
∙ NCSOStools, by K. Cafuta, I. Klep, and J. Povh. An open source Matlab toolbox for symbolic computation with polynomials in noncommuting variables, to be used in combination with SDP solvers.
∙ PENNON-1.1, by M. Kocvara and M. Stingl. It implements a penalty method for (large-scale, sparse) nonlinear and semidefinite programming (see their report), and is based on the PBM method of Ben-Tal and Zibulevsky.
∙ PENSDP v2.0 and PENBMI v2.0, by TOMLAB Optimization Inc., a MATLAB interface for PENNON.
∙ rmd, by the Geometry of Lattices and Algorithms group at University of Magdeburg, for making solutions of MAXDET rigorous by approximating primal and dual solutions by rationals and testing for feasibility.
∙ SBmethod (Version 1.1.3), by C. Helmberg. A C++ implementation of the spectral bundle method for eigenvalue optimization.
∙ SDLS, by D. Henrion and J. Malick. Matlab package for solving least-squares problems over convex symmetric cones.
∙ SDPA (version 7.1.2), initiated by the group around Masakazu Kojima.
∙ SDPHA does not seem to be available any more (it was a package by F. A. Potra, R. Sheng, and N. Brixius for use with MATLAB).
∙ SDPLR (version 1.02, May 2005), by Sam Burer, a C package for solving large-scale semidefinite programming problems.
∙ SDPpack is no longer supported, but still available. Version 0.9 BETA, by F. Alizadeh, J.-P. Haeberly, M. V. Nayakkankuppam, M. L. Overton, and S. Schmieta, for use with MATLAB.
∙ SDPSOL (version beta), by Shao-po Wu & Stephen Boyd (May 20, 1996). A parser/solver for SDP and MAXDET problems with matrix structure.
∙ SDPT3 (version 4.0), high quality MATLAB package by K.C. Toh, M.J. Todd, and R.H. Tütüncü. See the optimization online reference.
∙ SeDuMi, a high quality package with MATLAB interface for solving optimization problems over self-dual homogeneous cones, started by Jos F. Sturm. Now also available: SeDuMi Interface 1.04 by Dimitri Peaucelle.
∙ SOSTOOLS, by S. Prajna, A. Papachristodoulou, and P. A. Parrilo. A SeDuMi-based MATLAB toolbox for formulating and solving sums of squares (SOS) optimization programs (also available at Caltech).
∙ SP (version 1.1), by L. Vandenberghe, Stephen Boyd, and Brien Alkire. Software for semidefinite programming.
∙ SparseCoLO, by the group of M. Kojima, a Matlab package for conversion methods for LMIs having sparse chordal graph structure; see the Research Report B-453.
∙ SparsePOP, by H. Waki, S. Kim, M. Kojima and M. Muramatsu, is a MATLAB implementation of a sparse semidefinite programming relaxation method proposed for polynomial optimization problems.
∙ VSDP: Verified SemiDefinite Programming, by Christian Jansson. MATLAB software package for computing verified results of semidefinite programming problems. See the optimization online reference.
∙ YALMIP, free MATLAB toolbox by J. Löfberg for rapid optimization modeling with support for, e.g., conic programming, integer programming, bilinear optimization, moment optimization and sum of squares. Interfaces about 20 solvers, including most modern SDP solvers.

Reports on software:
∙ M. Yamashita, K. Fujisawa, M. Fukuda, K. Nakata and M. Nakata. "Parallel solver for semidefinite programming problem having sparse Schur complement matrix", Research Report B-463, Dept. of Math. and Comp. Sciences, Tokyo Institute of Technology, Oh-Okayama, Meguro, Tokyo 152-8552, September 2010. opt-online
∙ Hans D. Mittelmann. "The state-of-the-art in conic optimization software", Arizona State University, August 2010, written for the "Handbook of Semidefinite, Cone and Polynomial Optimization: Theory, Algorithms, Software and Applications". opt-online
∙ K.-C. Toh, M. J. Todd, and R. H. Tütüncü. "On the implementation and usage of SDPT3 -- a Matlab software package for semidefinite-quadratic-linear programming, version 4.0", Preprint, National University of Singapore, June 2010. opt-online
∙ K. Cafuta, I. Klep and J. Povh. "NCSOStools: A Computer Algebra System for Symbolic and Numerical Computation with Noncommutative Polynomials", University of Ljubljana, Faculty of Mathematics and Physics, Slovenia, May 2010. opt-online
∙ I. D. Ivanov and E. De Klerk. "Parallel implementation of a semidefinite programming solver based on CSDP on a distributed memory cluster", Optimization Methods and Software, Volume 25, Issue 3, June 2010, pages 405-420. OMS
∙ M. Yamashita, K. Fujisawa, K. Nakata, M. Nakata, M. Fukuda, K. Kobayashi and Kazushige Goto. "A high-performance software package for semidefinite programs: SDPA 7", Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, January 2010. opt-online
∙ Sunyoung Kim, Masakazu Kojima, Hayato Waki and Makoto Yamashita. "SFSDP: a Sparse Version of Full SemiDefinite Programming Relaxation for Sensor Network Localization Problems", Report B-457, Dept. of Mathematical and Computing Sciences, Tokyo Institute of Technology, July 2009. opt-online
∙ K. Fujisawa, S. Kim, M. Kojima, Y. Okamoto and M. Yamashita. "User's Manual for SparseCoLO: Conversion Methods for Sparse Conic-form Linear Optimization Problems", Research Report B-453, Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, 2-12-1 Oh-Okayama, Meguro-ku, Tokyo 152-8552 Japan, February 2009. opt-online
∙ Sunyoung Kim, Masakazu Kojima, Martin Mevissen, Makoto Yamashita. "Exploiting Sparsity in Linear and Nonlinear Matrix Inequalities via Positive Semidefinite Matrix Completion", Research Report B-452, Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Oh-Okayama, Meguro, Tokyo 152-8552, Japan, November 2008. opt-online
∙ D. Henrion, J. B. Lasserre, and J. Löfberg. "GloptiPoly 3: moments, optimization and semidefinite programming", LAAS-CNRS, University of Toulouse, 2007. opt-online
∙ Didier Henrion and Jérôme Malick. "SDLS: a Matlab package for solving conic least-squares problems", LAAS-CNRS, University of Toulouse, 2007. opt-online
∙ M. Grant and S. Boyd. "Graph Implementations for Nonsmooth Convex Programs", Stanford University, 2007. opt-online
∙ K. K. Sivaramakrishnan. "A PARALLEL interior point decomposition algorithm for block-angular semidefinite programs", Technical Report, Department of Mathematics, North Carolina State University, Raleigh, NC, 27695, December 2006. Revised in June 2007 and August 2007. opt-online
∙ Makoto Yamashita, Katsuki Fujisawa, Mituhiro Fukuda, Masakazu Kojima, Kazuhide Nakata. "Parallel Primal-Dual Interior-Point Methods for SemiDefinite Programs", Research Report B-415, Tokyo Institute of Technology, 2-12-1, Oh-okayama, Meguro-ku, Tokyo, Japan, March 2005. opt-online
∙ B. Borchers and J. Young. "How Far Can We Go With Primal-Dual Interior Point Methods for SDP?", New Mexico Tech, February 2005. opt-online
∙ H. Waki, S. Kim, M. Kojima and M. Muramatsu. "SparsePOP: a Sparse Semidefinite Programming Relaxation of Polynomial Optimization Problems", Research Report B-414, Dept. of Mathematical and Computing Sciences, Tokyo Institute of Technology, Oh-Okayama, Meguro 152-8552, Tokyo, Japan, March 2005. opt-online
∙ M. Kocvara and M. Stingl. "PENNON: A code for convex nonlinear and semidefinite programming", Optimization Methods and Software (OMS), Volume 18, Number 3, 317-333, June 2003.
∙ Brian Borchers. "CSDP 4.0 User's Guide", user's guide, New Mexico Tech, Socorro, NM 87801, 2002. opt-online
∙ M. Yamashita, K. Fujisawa, and M. Kojima. "SDPARA: SemiDefinite Programming Algorithm PARAllel Version", Parallel Computing, Vol. 29 (8), 1053-1067 (2003). opt-online
∙ J. Sturm. "Implementation of Interior Point Methods for Mixed Semidefinite and Second Order Cone Optimization Problems", Optimization Methods and Software, Volume 17, Number 6, 1105-1154, December 2002. optimization-online
∙ S. Benson and Y. Ye. "DSDP4 Software User Guide", ANL/MCS-TM-248; Mathematics and Computer Science Division; Argonne National Laboratory; Argonne, IL; March 2002. opt-online
∙ S. Benson. "Parallel Computing on Semidefinite Programs", Preprint ANL/MCS-P939-0302; Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Avenue, Argonne, IL, 60439; March 2002. opt-online
∙ D. Henrion and J. B. Lasserre. "GloptiPoly - Global Optimization over Polynomials with Matlab and SeDuMi", LAAS-CNRS Research Report, February 2002. opt-online
∙ M. Kocvara and M. Stingl. "PENNON - A Generalized Augmented Lagrangian Method for Semidefinite Programming", Research Report 286, Institute of Applied Mathematics, University of Erlangen, 2001. opt-online
∙ D. Peaucelle, D. Henrion, and Y. Labit. "User's Guide for SeDuMi Interface 1.01", Technical report number 01445, LAAS-CNRS, 7 av. du Colonel Roche, 31077 Toulouse Cedex 4, FRANCE, November 2001. opt-online
∙ Jos F. Sturm. "Using SEDUMI 1.02, a MATLAB Toolbox for Optimization Over Symmetric Cones (Updated for Version 1.05)", October 2001. opt-online
∙ Hans D. Mittelmann. "An Independent Benchmarking of SDP and SOCP Solvers", Technical Report, Dept. of Mathematics, Arizona State University, July 2001. opt-online
∙ K. Fujisawa, M. Fukuda, M. Kojima and K. Nakata. "Numerical Evaluation of SDPA", Research Report B-330, Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Oh-Okayama, Meguro-ku, Tokyo 152, September 1997. ps.Z-file (ftp) or dvi.Z-file (ftp)
∙ L. Mosheyev and M. Zibulevsky. "Penalty/Barrier Multiplier Algorithm for Semidefinite Programming: Dual Bounds and Implementation", Research Report #1/96, Optimization Laboratory, Technion, November 1996. ps-file (http)

Due to several requests I have asked G. Rinaldi for permission to put his graph generator on this page. Here it is: rudy (tar.gz-file)
Last modified: Tue Oct 26 15:10:14 CEST 2010
Sparse and low-rank theory (slides)
Robust PCA expresses an input data matrix M as a sum of a low-rank matrix L and a sparse matrix S:
minimize ‖L‖* + λ‖S‖₁  subject to  L + S = M.
Two noise-aware variants:
Basis pursuit denoising seeks a sparse near-solution to an underdetermined linear system:
minimize ‖x‖₁  subject to  ‖Ax − b‖₂ ≤ ε.
Noise-aware Robust PCA approximates an input data matrix as a sum of a low-rank matrix and a sparse matrix:
minimize ‖L‖* + λ‖S‖₁  subject to  ‖M − L − S‖_F ≤ δ.
Many possible applications …
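A minimal sketch of the Robust PCA program above (assuming the Python package cvxpy; the toy data and the λ = 1/√n weighting are illustrative choices, not from the slides):

import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n = 20
M = rng.standard_normal((n, 1)) @ rng.standard_normal((1, n))  # rank-1 part
M[rng.integers(0, n, 10), rng.integers(0, n, 10)] += 5.0       # sparse corruptions

L = cp.Variable((n, n))
S = cp.Variable((n, n))
lam = 1.0 / np.sqrt(n)
prob = cp.Problem(cp.Minimize(cp.normNuc(L) + lam * cp.sum(cp.abs(S))),
                  [L + S == M])
prob.solve()
print(np.linalg.matrix_rank(L.value, tol=1e-6))  # recovered low-rank component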
CHRYSLER SETS STOCK SPLIT, HIGHER DIVIDEND Chrysler Corp said its board declared a three-for-two stock split in the form of a 50 pct stock dividend and raised the quarterly dividend by seven pct. The company said the dividend was raised to 37.5 cts a share from 35 cts on a pre-split basis, equal to a 25 ct dividend on a post-split basis. Chrysler said the stock dividend is payable April 13 to holders of record March 23 while the cash dividend is payable April 15 to holders of record March 23. It said cash will be paid in lieu of fractional shares. With the split, Chrysler said 13.2 mln shares remain to be purchased in its stock repurchase program that began in late 1984. That program now has a target of 56.3 mln shares with the latest stock split. Chrysler said in a statement the actions "re°ect not only our outstanding performance over the past few years but also our optimism about the company's future."
An introduction to convex optimization
Convex optimization is a subfield of mathematical optimization that studies the minimization of convex functions defined over convex sets.
Informally, it is like dropping a small ball into a smooth, bowl-shaped pit: the ball comes to rest at the lowest point — that is "convex optimization".
By contrast, if the inside of the pit is bumpy, or even contains smaller holes, the ball may sometimes get stuck on the rough surface and only sometimes reach the lowest point — that is "nonconvex optimization".
Some of the theory and ideas of convex optimization extend to the whole field of optimization and even to other disciplines; one common way to attack a nonconvex problem is to transform it, to some degree, into a convex problem and then apply the methods and techniques of convex optimization.
Not only in mathematics, but also in computer science, engineering, and even finance and economics, convex optimization has become a course that many students and researchers need to study.
Convex Optimization 2017: Homework 4 and Solutions
SI 251 - Convex Optimization, Spring 2017. Homework 4. Due at 08:00 a.m., April 6, 2017, before class.
Note: Please compress your code into one file and send it to the TAs, and print your figures or results and answer the questions on A4 paper. Finish your simulation with the CVX package (MATLAB/Python/…). Initialize your program with commands that fix your randomized results so that they are repeatable; for example, in MATLAB you may add rng('default'); rng(1); in the preamble. You may need to reprogram the given MATLAB code segments in another programming language of your choice.

1. Feasibility
1) (Multiuser transmit beamforming.) Power minimization problem in wireless communication:

P: minimize_{w_1,…,w_K}  Σ_{k=1}^K ‖w_k‖²
   subject to  SINR_k ≥ γ_k, k = 1, …, K,   (1)

where w_1, …, w_K ∈ Cⁿ are the beamforming vectors for receivers k = 1, …, K. The signal-to-interference-plus-noise ratio for the k-th user is given by

SINR_k = |h_k^H w_k|² / (Σ_{i≠k} |h_k^H w_i|² + σ²),   (2)

where h_k ∈ Cⁿ is the channel coefficient vector between the transmitter and the k-th receiver and σ² is the noise power. In the simulation, consider the complex Gaussian channel, i.e., h_k ~ CN(0, s²I) with s = 1/√K. The noise power σ² can be set to 1 without loss of generality. Each target SINR γ_k ≥ 0 and is often expressed in dB, defined as 10 log γ_k.
(a) Consider the relationship between the target SINR and the feasibility of P. Draw the phase transition¹ figure whose X-axis is the target SINR in dB (γ_1 = … = γ_K = γ) and whose Y-axis is the ratio of channel realizations for which the problem is feasible, i.e.

R = #{P is feasible} / #{tests (channel realizations)}.   (3)

Assume K = 50, n = 3. Run 20 times and take the average. (5 points)
(b) Draw the phase transition figure for the relationship between the number of users K and the feasibility of P. Assume n = 3, γ = −15 dB. Run 20 times and take the average. (5 points)
(c) Draw the phase transition figure for the relationship between the number of antennas n and the feasibility of P. Assume K = 100, γ = −10 dB. Run 20 times and take the average. (5 points)
¹ For more about phase transitions, refer to Dennis Amelunxen et al.: Living on the edge: Phase transitions in convex programs with random data, Information and Inference, 2014, iau005.

2) (Second-order cone optimization problem.) Randomly generate a standard SOCP

P_SOCP: minimize_{x∈Rⁿ}  fᵀx
        subject to  ‖A_i x + b_i‖ ≤ c_iᵀx + d_i, i = 1, …, K,   (4)

where every entry of A_i ∈ R^{m×n}, b_i ∈ Rᵐ, c_i ∈ Rⁿ, d_i ∈ R is drawn i.i.d. from the standard Gaussian distribution N(0, 1). Draw the phase transition figure for the relationship between the number of constraints K and the feasibility of P_SOCP. Assume m = 20, n = 100. Run 20 times and take the average. (10 points)

2. Optimization problems.
(a) (LASSO.) We wish to recover a sparse vector x ∈ Rⁿ from measurements y ∈ Rᵐ. Our measurement model tells us that y = Ax + v, where A ∈ R^{m×n} is a known matrix and v ∈ Rᵐ is unknown measurement error whose entries are drawn i.i.d. from the distribution N(0, σ²). We can first try to recover x by solving the optimization problem

min_x ‖Ax − y‖₂² + γ‖x‖₂².   (5)

This problem is called ridge regression. A more successful approach is to solve the LASSO problem

min_x ‖Ax − y‖₂² + γ‖x‖₁.   (6)

Use the code given in the handout to define n, m, A, x, and y.
(a) Use CVX to estimate x from y using ridge regression and the LASSO problem, respectively. (15 points)
(b) Plot your result to compare the estimated x with the true x. (5 points)
(c) How many measurements m are needed to find an accurate x with ridge regression? How about with the LASSO? (5 points)

(b) (Portfolio optimization.) Find minimum-risk portfolios with the same expected return as the uniform portfolio (w = (1/n)1), with risk measured by portfolio return variance, and the following portfolio constraints (in addition to 1ᵀw = 1):
• No (additional) constraints.
• Long-only: w ⪰ 0.
• Limit on total short position: 1ᵀw₋ ≤ 0.5, where (w₋)_i = max{−w_i, 0}.
(a) Use CVX to compare the optimal risk in these portfolios with each other and with the uniform portfolio. (10 points)
(b) Plot the optimal risk-return trade-off curves for the long-only portfolio, and for the total short position limited to 0.5, in the same figure. Comment on the relationship between the two trade-off curves. (10 points)

(c) (Energy storage trade-offs.) We consider the use of a storage device (say, a battery) to reduce the total cost of electricity consumed over one day. We divide the day into T time periods, and let p_t denote the (positive, time-varying) electricity price, and u_t denote the (nonnegative) usage or consumption, in period t, for t = 1, …, T. Without the use of a battery, the total cost is pᵀu. Let q_t denote the (nonnegative) energy stored in the battery in period t. For simplicity, we neglect energy loss (although this is easily handled as well), so we have q_{t+1} = q_t + c_t, t = 1, …, T−1, where c_t is the charging of the battery in period t; c_t < 0 means the battery is discharged. We will require that q_1 = q_T + c_T, i.e., we finish with the same battery charge that we start with. With the battery operating, the net consumption in period t is u_t + c_t; we require this to be nonnegative (i.e., we do not pump power back into the grid). The total cost is then pᵀ(u + c). The battery is characterized by three parameters: the capacity Q, where q_t ≤ Q; the maximum charge rate C, where c_t ≤ C; and the maximum discharge rate D, where c_t ≥ −D. (The parameters Q, C, and D are nonnegative.)
(a) Explain how to find the charging profile c ∈ R^T (and associated stored energy profile q ∈ R^T) that minimizes the total cost, subject to the constraints. (5 points)

min_{q,c}  pᵀ(u + c)
s.t.  q_{t+1} = q_t + c_t, t = 1, …, T−1
      q_1 = q_T + c_T
      0 ≤ q_t ≤ Q, t = 1, …, T
      −D ≤ c_t ≤ C, t = 1, …, T
      0 ≤ u_t + c_t, t = 1, …, T

(b) Use CVX to solve the problem above with Q = 35, C = D = 3, as well as p and u defined by the code below. Plot u_t, p_t, c_t, and q_t versus t. (15 points)
(c) (Storage trade-offs.) Plot the minimum total cost versus the storage capacity Q, using the p and u below, and charge/discharge limits C = D = 3. Repeat for charge/discharge limits C = D = 1. (Put these two trade-off curves on the same plot.) Give an interpretation of the endpoints of the trade-off curves. (10 points)

Solution (problem 2(c)) — the demands u and prices p are generated and plotted as follows:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1)
T = 96
t = np.linspace(1, T, num=T).reshape(T, 1)
p = np.exp(-np.cos((t - 15) * 2 * np.pi / T) + 0.01 * np.random.randn(T, 1))
u = 2 * np.exp(-0.6 * np.cos((t + 40) * np.pi / T) -
               0.7 * np.cos(t * 4 * np.pi / T) + 0.01 * np.random.randn(T, 1))
plt.figure(1)
plt.plot(t / 4, p, 'g', label=r"$p$")
plt.plot(t / 4, u, 'r', label=r"$u$")
plt.ylabel("$")
plt.xlabel("t")
plt.legend()
plt.show()
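A sketch of how the ridge and LASSO estimators of problem 2(a) can be computed (assuming the Python package cvxpy; the data-generating code from the handout did not survive, so the dimensions and data below are stand-ins):

import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
m, n, gamma = 50, 200, 0.1
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, 10, replace=False)] = rng.standard_normal(10)  # sparse truth
y = A @ x_true + 0.1 * rng.standard_normal(m)

x = cp.Variable(n)
cp.Problem(cp.Minimize(cp.sum_squares(A @ x - y) + gamma * cp.sum_squares(x))).solve()
x_ridge = x.value   # ridge estimate: small but dense coefficients
cp.Problem(cp.Minimize(cp.sum_squares(A @ x - y) + gamma * cp.norm(x, 1))).solve()
x_lasso = x.value   # LASSO estimate: exactly sparse coefficients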
Convex Optimization: textbook exercise solutions
Stephen Boyd
Lieven Vandenberghe
January 4, 2006
Chapter 2
Convex sets
Exercises
Definition of convexity
2.1 Let C ⊆ Rⁿ be a convex set, with x₁, …, x_k ∈ C, and let θ₁, …, θ_k ∈ R satisfy θᵢ ≥ 0, θ₁ + · · · + θ_k = 1. Show that θ₁x₁ + · · · + θ_k x_k ∈ C. (The definition of convexity is that this holds for k = 2; you must show it for arbitrary k.) Hint. Use induction on k. Solution. This is readily shown by induction from the definition of a convex set. We illustrate the idea for k = 3, leaving the general case to the reader. Suppose that x₁, x₂, x₃ ∈ C, and θ₁ + θ₂ + θ₃ = 1 with θ₁, θ₂, θ₃ ≥ 0. We will show that y = θ₁x₁ + θ₂x₂ + θ₃x₃ ∈ C. At least one of the θᵢ is not equal to one; without loss of generality we can assume that θ₁ ≠ 1. Then we can write
y = θ₁x₁ + (1 − θ₁)(µ₂x₂ + µ₃x₃)
where µ₂ = θ₂/(1 − θ₁) and µ₃ = θ₃/(1 − θ₁). Note that µ₂, µ₃ ≥ 0 and µ₂ + µ₃ = (θ₂ + θ₃)/(1 − θ₁) = 1, so µ₂x₂ + µ₃x₃ ∈ C by convexity (the case k = 2), and hence y ∈ C, again by convexity.
Convex optimization exercises and solutions (2): National Taiwan University past exam problems
g(Z) = tr(GZ) if tr(F_iZ) + c_i = 0, i = 1, 2, 3, and g(Z) = −∞ otherwise.
And finally the dual problem is

maximize    tr(GZ)
subject to  tr(F_iZ) + c_i = 0, i = 1, 2, 3,
            Z ⪰ O.
2. (40%) Consider the equality constrained problem. The minimum of ⟨x, Ax⟩ + ⟨b, x⟩ is
−‖√(A⁻¹) b‖₂²/4,
for a self-adjoint (symmetric) positive definite linear operator A (which is assumed here). As a reminder, diagonalize A (why is that possible?) as UDUᵀ, where D has all-positive entries as assumed; then √A = U√D Uᵀ, and the generalization of "completion of the square" applies.
Setting ∂(tf₀ + φ)/∂x = 0, i.e.,
2xt − 1/(x − 1) − 1/(x − 3) = 0.
When this condition is met, x ← x*(t) is plugged in; after some direct rearrangement (valid for x ≠ 1, 3, where the function is singular),
Convex optimization
Least-squares problems and linear programming problems can both be viewed as special cases of convex optimization problems; unlike those two classes, however, solving general convex optimization problems cannot yet be considered a mature technology. There is usually no analytical formula for the solution of a convex optimization problem, but effective algorithms exist, the most representative being interior-point methods. If a practical problem can be formulated as a convex optimization problem, we may regard it as essentially solved. However, recognizing a convex optimization problem is often much harder than recognizing a least-squares problem, so more skill is required. There are also problems that are not convex, yet convex optimization can still play an important role in solving them; for example, relaxation methods and Lagrangian relaxation turn non-convex constraints into convex ones. Convex optimization comprises several levels. Quadratic optimization problems sit at the bottom level and can be solved by solving linear equations. Newton's method sits one level up: it solves unconstrained or equality-constrained problems, typically by reducing them to a sequence of quadratic optimization problems. Interior-point methods sit at the highest level and convert inequality-constrained problems into a sequence of unconstrained or equality-constrained problems.
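To illustrate the two lower levels of this hierarchy (our sketch, not part of the original text): each Newton step builds a local quadratic model and minimizes it by solving one linear system:

    import numpy as np

    def newton(grad, hess, x0, tol=1e-10, max_iter=50):
        """Newton's method for unconstrained minimization."""
        x = x0
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:
                break
            # Minimizing the quadratic model f + g^T d + (1/2) d^T H d
            # is the bottom level: a single linear solve H d = -g.
            d = np.linalg.solve(hess(x), -g)
            x = x + d
        return x

    # Example: minimize f(x) = sum(exp(x_i) - x_i), whose minimizer is x = 0.
    grad = lambda x: np.exp(x) - 1
    hess = lambda x: np.diag(np.exp(x))
    print(newton(grad, hess, np.array([2.0, -1.5])))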
Reference: Variety Identification of Corn Seed Based on the Split Bregman Method (English version)
Variety Identification of Corn Seed Based on the Split Bregman Method

Jiang Jingtao^1, Wang Yanyao^1, Yang Ranbing^1, Mei Shuli^2
(1. College of Mechanical and Electrical Engineering, Qingdao Agricultural University, Qingdao 266109, China; 2. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China)

Abstract: Corn seed purity is closely related to corn yield, so seed selection plays an important role in improving grain yield. The automatic seed selection procedure based on machine vision is usually divided into three steps: image segmentation, feature extraction, and classification. A variational model for image segmentation and the corresponding numerical technique, the split Bregman method, were introduced into the identification procedure; they offer advantages in feature extraction such as high accuracy and a closed continuous border. In addition, the adaptive wavelet collocation method was employed to solve the optimality conditions in the split Bregman method. Based on the improved method, the corn geometric features can be extracted more precisely. Nongda 108 and Ludan 981 were taken as examples to test the new method. With a classifier designed with SVM, the identification accuracies of Nongda 108 and Ludan 981 were 97.3% and 98%, respectively, better than the 95% of previous research.

Key words: image recognition, feature extraction, models, split Bregman method, multi-level wavelet interpolation operator

doi: 10.3969/j.issn.1002-6819.2012.z2.043    Article ID: 1002-6819(2012)-Supp.2-0248-05

Citation: Jiang Jingtao, Wang Yanyao, Yang Ranbing, et al. Variety identification of corn seed based on Bregman Split method[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(Supp. 2): 248-252.

Received date: 2012-05-09. Revised date: 2012-08-20.
Foundation items: Special Fund for Agro-scientific Research in the Public Interest, China (No. 201203028); the "Twelfth Five-Year" National Science and Technology Support Program, China (No. 2012BAD35B02).

0 Introduction

It is well known that grain seed purity is closely related to grain output [1-2]. Seed identification can be both a science and an art. Some seed scientists use "seed keys" to identify seeds [3-4], others rely on visualization, and most use both, depending upon what experience they have in the field and what they are trying to identify. Unfortunately, only the most common agricultural and weed seeds have been described, drawn, or photographed, and so it is hard to identify the less common seeds by this method. For any seed, some important characteristics need to be identified, such as size, shape, texture, and color [5-7]. When it comes to size, both the overall size of the seed and the size of each of the seed's individual parts [8] should
be considered. Corn identification needs such a large amount of time and effort that it is necessary to develop automatic identification of corn seeds based on machine vision. In general, the automatic identification procedure includes image acquisition and segmentation [9], extraction of seed geometric and color features, and seed classification. Obviously, the corn seed identification precision depends on the image segmentation precision. In fact, segmentation and object extraction is one of the most important tasks in image processing and computer vision [10]. Many of the most general and effective segmentation methods can be written as variational models, such as fuzzy connectedness, the watershed algorithm [11], Bayesian methods [12], and Otsu's method [13]. This category of variational models has proved very effective in many applications, especially in the processing and analysis of medical images [14]. While there are many disparate approaches to image segmentation, this paper focuses on recently proposed methods which can be cast in the form of totally convex optimization problems and on the corresponding numerical method, the split Bregman method [15]. Combined with a classifier based on the support vector machine (SVM) [16], a novel corn seed variety intelligent identification system is constructed.

1 The split Bregman method for globally convex segmentation

1.1 Convex methods for image segmentation

The globally convex segmentation (GCS) method, first proposed by Chan et al. [17], eliminates the difficulties associated with non-convex models by proposing an approach to segmentation based on convex energies. The GCS formulation based on the gradient flow can be described as follows:

    ∂u/∂t = ∇·(∇u/|∇u|) - μ[(c_1 - f)^2 - (c_2 - f)^2]        (1)

where u is the level set function, μ is a constant, t is the time parameter, f is the image intensity, and c_1 and c_2 represent the mean intensity inside and outside the segmented region, respectively [18]. The strength of the regularization can be controlled by the parameter μ. This simplified flow represents the gradient descent for minimizing the energy

    E(u) = |∇u| + μ⟨u, r⟩                                      (2)

where r = (c_1 - f)^2 - (c_2 - f)^2. To make the global minima well defined, we must constrain the solution to lie in the interval [0, 1]. This results in the optimization problem

    min_{0 ≤ u ≤ 1}  |∇u| + μ⟨u, r⟩                            (3)

Once this optimization problem is solved, the segmented region can be found by thresholding the level set function to get

    Ω = {x : u(x) > α}                                         (4)

for some α ∈ (0, 1).

1.2 The split Bregman method for GCS

In fact, it is difficult to minimize model (2) directly. Goldstein and Osher [15] proposed to enforce the inequality constraint using an exact penalty function. The convexified segmentation then reduces to a sequence of problems of the form

    min_{0 ≤ u ≤ 1}  |∇u|_g + μ⟨u, r⟩                          (5)

where r = (c_1 - f)^2 - (c_2 - f)^2. In order to apply the split Bregman method, an auxiliary variable d is introduced to take the place of ∇u. To weakly enforce the resulting equality constraint, a quadratic penalty function is added, which gives the unconstrained problem

    (u*, d*) = argmin_{0 ≤ u ≤ 1, d}  |d|_g + μ⟨u, r⟩ + (λ/2)‖d - ∇u‖^2      (6)
In order to strictly enforce the constraint d = ∇u, Bregman iteration is applied to the problem. The resulting sequence of optimization problems is

    (u^{k+1}, d^{k+1}) = argmin_{0 ≤ u ≤ 1, d}  |d|_g + μ⟨u, r⟩ + (λ/2)‖d - ∇u - b^k‖^2      (7)

    b^{k+1} = b^k + ∇u^{k+1} - d^{k+1}                                                       (8)

For the optimization problem in Eq. (7), the optimality condition for u is

    Δu = (μ/λ) r + ∇·(d - b)

If the solution of this equation lies in the interval [0, 1], then this global minimizer coincides with the minimizer of the constrained problem. If the solution lies outside this interval, then the energy is strictly monotonic inside [0, 1], and the minimizer lies at the endpoint closest to the unconstrained minimizer. This yields the element-wise minimization formulas

    α_{i,j} = d^x_{i-1,j} - d^x_{i,j} - b^x_{i-1,j} + b^x_{i,j} + d^y_{i,j-1} - d^y_{i,j} - b^y_{i,j-1} + b^y_{i,j}
    β_{i,j} = (1/4)(u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1} - (μ/λ) r_{i,j} + α_{i,j})
    u_{i,j} = max{min{β_{i,j}, 1}, 0}

Minimization with respect to d is performed using the shrinkage formula

    d^{k+1} = shrink_g(b^k + ∇u^{k+1}, 1/λ)

1.3 Maize image segmentation experiment

In order to examine the effectiveness of the split Bregman method for globally convex segmentation, it was applied to segment a maize image (Fig. 1: original maize image). The purpose of the method is to find the maize shape, the exact edge, and the color information. The segmentation results are shown in Fig. 2 (segmentation with the split Bregman method) and Fig. 3 (contour of the maize image with the split Bregman method).

Compared with the watershed method (Fig. 4), the segmentation result of the split Bregman method is more accurate and has a closed continuous border, which is helpful in measuring the geometric features of the maize images. It cannot, however, separate regions of different color; if those regions could be obtained, more features of the maize image would be available, which would help in identification of the maize seeds.
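To make the update concrete, here is a small sketch of one split Bregman sweep in Python/NumPy. It is our illustration under periodic-boundary assumptions, not the paper's code: shrink implements the formula for d^{k+1}, and the u-update discretizes the optimality condition with the clipped formula above:

    import numpy as np

    def grad(u):
        # forward differences (periodic boundary for simplicity)
        return np.roll(u, -1, axis=1) - u, np.roll(u, -1, axis=0) - u

    def div(px, py):
        # backward-difference divergence, the negative adjoint of grad
        return px - np.roll(px, 1, axis=1) + py - np.roll(py, 1, axis=0)

    def shrink(vx, vy, t):
        # isotropic shrinkage: (v / |v|) * max(|v| - t, 0), pixel-wise
        mag = np.sqrt(vx**2 + vy**2)
        s = np.maximum(mag - t, 0.0) / np.maximum(mag, 1e-12)
        return s * vx, s * vy

    def split_bregman_step(u, dx, dy, bx, by, r, mu, lam):
        # u-update: clipped Jacobi sweep for Δu = (μ/λ) r + ∇·(d − b)
        alpha = div(bx - dx, by - dy)            # equals −∇·(d − b)
        beta = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                       + np.roll(u, 1, 1) + np.roll(u, -1, 1)
                       - (mu / lam) * r + alpha)
        u = np.clip(beta, 0.0, 1.0)
        # d-update and Bregman variable update, Eqs. (7)-(8)
        ux, uy = grad(u)
        dx, dy = shrink(ux + bx, uy + by, 1.0 / lam)
        bx, by = bx + ux - dx, by + uy - dy
        return u, dx, dy, bx, by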
2 Modified split Bregman method based on morphological reconstruction

Morphological reconstruction is a useful but little-known method for extracting meaningful information about shapes in an image. The shapes could be just about anything: letters in a scanned text document, fluorescently stained cell nuclei, or galaxies in a far-infrared telescope image. We can use morphological reconstruction to extract marked objects, find bright regions surrounded by dark pixels, detect or remove objects touching the image border, detect or fill in object holes, filter out spurious high or low points, and perform many other operations. Essentially a generalization of flood-filling, morphological reconstruction processes one image, called the marker, based on the characteristics of another image, called the mask. The high points, or peaks, in the marker image specify where processing begins. The peaks spread out, or dilate, while being forced to fit within the mask image. The spreading continues until the image values stop changing. If G is the mask and F is the marker, the reconstruction of G from F, denoted R_G(F), is defined by the following iterative procedure:

1) Initialize h_1 to be the marker image F.
2) Create the structuring element B = ones(3).
3) Repeat h_{k+1} = (h_k ⊕ B) ∩ G until h_{k+1} = h_k.
4) R_G(F) = h_{k+1}.

Figs. 5 (modified regional maxima) and 6 (opening-closing by reconstruction) illustrate the preceding iterative procedure. Although this iterative formulation is useful conceptually, much faster computational algorithms exist. After the morphological reconstruction, we can segment the maize images with the split Bregman method; the result (Fig. 7: modified split Bregman method combined with morphological reconstruction) shows that the modified method identifies the different color regions exactly.
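A minimal sketch of the iterative procedure above, using SciPy; this is an assumed implementation, not the paper's code. Grayscale dilation plays the role of h ⊕ B, and a point-wise minimum plays the role of ∩ G:

    import numpy as np
    from scipy.ndimage import grey_dilation

    def morphological_reconstruction(marker, mask, max_iter=100000):
        """Reconstruction of mask G from marker F by iterative geodesic dilation."""
        h = np.minimum(marker, mask)        # h_1 = F, clipped under the mask
        footprint = np.ones((3, 3))         # structuring element B = ones(3)
        for _ in range(max_iter):
            h_next = np.minimum(grey_dilation(h, footprint=footprint), mask)
            if np.array_equal(h_next, h):   # stop when values no longer change
                break
            h = h_next
        return h

As the text notes, much faster algorithms exist; for example, skimage.morphology.reconstruction in scikit-image implements the same operation with a fast hybrid algorithm.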
3 Multi-object feature extraction

In order to quantitatively describe the color information of maize seeds, six color features were defined as the mean values of red, green, and blue and the mean values of hue, saturation, and intensity, combined with the geometric features measured from the segmentation results. Table 1 shows part of the geometric feature parameters of two varieties of maize seeds, Nongda 108 and Ludan 981; Table 2 shows the color features.

Table 1  Geometric feature parameters of maize seeds

    Geometric feature                  Nongda 108    Ludan 981
    Contour points amount              834           756
    Circumference                      952           878
    Area                               60034         48061
    Length of long axis                336           287
    Length of minor axis               256           241
    Maximum inscribed circle radius    162           134
    Minimum inscribed circle radius    121           112
    Largest span                       321           243
    Equivalent diameter                276           241

Table 2  Mean values of color feature parameters for maize seeds

    Color feature    Nongda 108 mean value    Ludan 981 mean value
    R                227                      218
    G                195                      181
    B                127                      132
    H                0.88                     0.66
    S                0.15                     0.036
    I                182                      176

4 Corn seed identification with SVM

The support vector machine (SVM) is a classification and regression prediction tool that uses machine learning theory to maximize predictive accuracy while automatically avoiding overfitting to the data. An SVM can be defined as a system which uses a hypothesis space of linear functions in a high-dimensional feature space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory. Here we present the QP formulation for SVM classification (a simple representation only).

SV classification:

    min_{f, ξ}  ‖f‖_K^2 + C Σ_{i=1}^{l} ξ_i
    s.t.        y_i f(x_i) ≥ 1 - ξ_i, for all i
                ξ_i ≥ 0

SVM classification, dual formulation:

    min_{α}  (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j K(x_i, x_j) - Σ_{i=1}^{l} α_i
    s.t.     0 ≤ α_i ≤ C, for all i
             Σ_{i=1}^{l} α_i y_i = 0

The variables ξ_i are called slack variables; they measure the error made at the point (x_i, y_i). Training an SVM becomes quite challenging when the number of training points is large, and a number of methods for fast SVM training have been proposed. Applying the feature parameters extracted in Section 3, we construct a corn seed identification classifier based on the SVM model. Using this variety identification classifier, single-variety and mixed-variety identification tests were carried out on the two maize seed varieties Nongda 108 and Ludan 981. The identification accuracies for Nongda 108 and Ludan 981 were 97.3% and 98%, respectively, an improvement over Shi [19], where the identification accuracies of Nongda 108 and Ludan 981 were both about 95%.
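To make the classification step concrete, here is a hedged sketch of training such a classifier with scikit-learn; it is our illustration, not the paper's code, and the synthetic X below merely stands in for a feature matrix whose columns would be the geometric and color features of Tables 1-2:

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    # X: one row per seed image; columns = geometric + color features
    # (contour points, circumference, area, ..., R, G, B, H, S, I).
    # y: 0 for Nongda 108, 1 for Ludan 981. Synthetic placeholders here.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 15))
    y = rng.integers(0, 2, size=200)

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    scores = cross_val_score(clf, X, y, cv=5)
    print("cross-validated accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))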
5 Conclusions

The variational models for image segmentation and the corresponding split Bregman method were first employed to identify the corn seed variety in this paper, and the results showed that the method has the advantages of high accuracy and a closed continuous border. In fact, the method improves the seed identification precision because the image segmentation results are more precise. Future research will focus on accelerating the split Bregman scheme in the case of fidelity parameters, allowing for faster coarse segmentation of large images and faster evolution of the GAC contour.

[References]
[1] Yan Xiaomei, Liu Shuangxi, Zhang Chunqing, et al. Purity identification of maize seed based on color characteristics[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2010, 26(Supp. 1): 46-50.
[2] Huang Yanyan, Zhu Liwei, Li Junhui, et al. Rapid and nondestructive discrimination of hybrid maize seed purity using near infrared spectroscopy[J]. Spectroscopy and Spectral Analysis, 2011, 31(3): 661-664.
[3] Bedane G M, Gupta M, George D, et al. Optimum harvest maturity for guayule seed[J]. Industrial Crops and Products, 2006, 24(1): 26-33.
[4] Bedane G M, Gupta M L, George D L, et al. Effect of plant population on seed yield, mass and size of guayule[J]. Industrial Crops and Products, 2009, 29(1): 139-144.
[5] Granitto P M, Verdes P, Ceccatto H. Large-scale investigation of weed seed identification by machine vision[J]. Computers and Electronics in Agriculture, 2005, 47(1): 15-24.
[6] Kovinich N, Saleem A, John A, et al. Identification of two anthocyanidin reductase genes and three red-brown soybean accessions with reduced anthocyanidin reductase 1 mRNA, activity, and seed coat proanthocyanidin amounts[J]. Journal of Agricultural and Food Chemistry, 2012, 60(2): 574-584.
[7] Liu Zhaoyan, Cheng Fang, Ying Yibin, et al. Identification of rice seed varieties using neural network[J]. Journal of Zhejiang University: Science, 2005, 6(11): 1095-1100.
[8] Yi S, Davis B J, Robb R A. A method for size estimation for small objects and its application in brachytherapy seed identification[J]. Proceedings of SPIE - The International Society for Optical Engineering, 2004, 5370(3): 1679-1684.
[9] Zhang Junxiong, Wu Kebin, Song Peng, et al. Image segmentation of maize haploid seeds based on BP neural network[J]. Journal of Jiangsu University (Natural Science Edition), 2011, 32(6): 621-625.
[10] Lin Haibo, Dong Shuliang, Qiu Yan, et al. Research of wheat precision seeding test system based on image processing[J]. Advanced Materials Research, 2011, 311-313: 1559-1563.
[11] Kuang Fangjun, Xu Weihong, Wang Yanhua. Novel watershed algorithm for touching rice image segmentation[J]. Advanced Materials Research, 2011, 271-273: 1-6.
[12] Ruben A, Inaki I, Pedro L. Detecting reliable gene interactions by a hierarchy of Bayesian network classifiers[J]. Computer Methods and Programs in Biomedicine, 2008, 91(2): 110-121.
[13] Long Mansheng, He Dongjian. Weed identification from corn seedling based on computer vision[J]. Transactions of the Chinese Society of Agricultural Engineering, 2007, 23(7): 139-144.
[14] Jonasson L, Bresson X, Hagmann P, et al. White matter fiber tract segmentation in DT-MRI using geometric flows[J]. Medical Image Analysis, 2005, 9(9): 223-236.
[15] Goldstein T, Bresson X, Osher S. Geometric applications of the split Bregman method: Segmentation and surface reconstruction[J]. Journal of Scientific Computing, 2010, 45: 272-293.
[16] Wu Di, Feng Lei, He Yong, et al. Variety identification of Chinese cabbage seeds using visible and near-infrared spectroscopy[J]. Transactions of the ASABE, 2008, 51(6): 2193-2199.
[17] Chan T F, Esedoglu S, Nikolova M. Algorithms for finding global minimizers of image segmentation and denoising models[J]. SIAM Journal on Applied Mathematics, 2006, 66: 1932-1948.
[18] Chan T F, Vese L. Active contours without edges[J]. IEEE Transactions on Image Processing, 2001, 10: 266-277.
[19] Shi Zhonghui. Research on corn seed varieties intelligent identification system[D]. Tai'an: Shandong Agricultural University, 2011.
Featured Course: Syllabus for "Convex Optimization and Its Applications in Signal Processing"
Appendix: Course Syllabus

Course number: G00TE1204
Course name: Convex Optimization and Its Applications in Signal Processing (凸优化及其在信号处理中的应用)
Offering unit: College of Communication Engineering
Syllabus author: Su Wenzao (苏文藻)
Credits: 2
In-class hours: 32
Course category: master's / doctoral / professional degree course
Course type: elective
Teaching format: lectures
Assessment: homework and examination
Applicable majors: Communication and Information Systems; Signal and Information Processing
Prerequisites: none listed

Learning objectives. Students should:
1. Master the skills of formulating basic optimization models
2. Master basic convex analysis
3. Master the optimality conditions and duality theory of convex optimization problems
4. Become familiar with some applications of convex optimization in signal processing

English description: In this course we will develop the basic machineries for formulating and analyzing various optimization problems. Topics include convex analysis, linear and conic linear programming, nonlinear programming, optimality conditions, Lagrangian duality theory, and basics of optimization algorithms. Applications from signal processing will be used to complement the theoretical developments. No prior optimization background is required for this class. However, students should have workable knowledge in multivariable calculus, real analysis, linear algebra and matrix theory.

Main course content:
Part I: Introduction - Problem formulation; Classes of optimization problems
Part II: Theory - Basics of convex analysis; Conic linear programming and nonlinear programming: optimality conditions and duality theory; Basics of combinatorial optimization
Part III: Selected Applications in Signal Processing - Transmit beamforming; Network localization; Sparse/low-rank regression

References:
1. Ben-Tal, Nemirovski. Optimization I-II: Convex Analysis, Nonlinear Programming Theory, Nonlinear Programming Algorithms, 2004.
2. Boyd, Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
3. Luenberger, Ye. Linear and Nonlinear Programming (3rd edition), 2008.
4. Nemirovski. Lectures on Modern Convex Optimization, 2005.
The Word Root "convex": A Reply

Convexity: Exploring the Concepts and Applications

Introduction:

The term "convex" is derived from the Latin word "convexus," which means "rounded or arched." This article explores the concept of convexity and its applications in fields such as mathematics, physics, economics, and computer science. Convexity is a fundamental concept that describes specific characteristics and properties of objects, functions, and spaces. By understanding convexity, we can gain insights into optimization problems, geometric analysis, and resource allocation.

What is Convexity?

Convexity refers to the property of being curved or rounded outward, like a convex lens. It is essentially a geometric concept that defines the shape of a curve or a surface. In mathematics, convexity describes a set or a function for which any line segment connecting two points in the set lies entirely within the set. This implies that the set does not contain any indentations or holes, and its boundary always curves outward.

Properties of Convex Sets:

A convex set has several defining properties that are crucial in understanding its characteristics. Firstly, any line segment joining two points within the set lies entirely within the set; this is known as the segment property. Secondly, if a set contains two points, then it also contains every point lying on the line segment joining these two points; this is called the line property. Thirdly, the intersection of two convex sets is also convex. Lastly, the convex combination property states that any convex combination of points of the set lies within the set.

Applications of Convexity:

1. Optimization:

Convexity plays a vital role in optimization problems, where the objective is to find the best possible solution given certain constraints. Convex optimization problems have well-defined properties, making them easier to solve. The convexity of the objective function ensures that the solution obtained is not merely a local optimum but a global optimum. This significantly simplifies the optimization process and allows efficient algorithms to be developed. Convex optimization finds applications in finance, engineering, machine learning, and many other fields.

2. Geometry and Spatial Analysis:

Convexity has extensive applications in geometry and spatial analysis. Convex polyhedra, such as cubes and pyramids, play a crucial role in solid geometry. These shapes have well-defined edges, vertices, and faces, making them easier to analyze and manipulate. Convex hulls, which are the smallest convex sets containing a given set of points, are employed in computer graphics, computational geometry, and image processing. Convexity also allows for the definition of convex curves and surfaces, which are used in various contexts, such as designing aerodynamic shapes, analyzing biological structures, and constructing mathematical models.

3. Economics and Game Theory:

Convexity finds applications in the study of economics and game theory. Convexity is often assumed in the utility functions of consumers and the production functions of firms. This assumption helps simplify economic models and aids in understanding the behavior of consumers and producers. Convexity ensures that indifference curves are convex, meaning that individuals have diminishing marginal rates of substitution. Convexity also plays a significant role in game theory, particularly in the study of convex games, where players' payoffs are determined by their own decisions and a convex combination of other players' decisions.
4. Resource Allocation and Fair Division:

Convexity is essential in the field of resource allocation and fair division. Convexity ensures that fair allocations can be obtained efficiently and without envy. The notion of a fair division problem arises in various real-world scenarios, such as dividing goods among individuals, allocating resources to projects, or distributing tasks among workers. Convexity allows for the definition of fair division algorithms that guarantee an equitable distribution without complaints from the participants.

Conclusion:

Convexity is a powerful concept with broad applications in mathematics, physics, economics, and computer science. It represents essential properties of sets, functions, and spaces, allowing for efficient optimization, geometric analysis, and resource allocation. Understanding and applying convexity can lead to significant advancements in various fields and pave the way for innovative solutions to complex problems. Whether it is finding global optima, analyzing geometric shapes, modeling economic behavior, or ensuring fair division, convexity remains a fundamental and indispensable concept.
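For instance, the convex hull of a point set mentioned above is simple to compute in practice; the following is a small illustration of ours, not part of the original article:

    import numpy as np
    from scipy.spatial import ConvexHull

    rng = np.random.default_rng(0)
    points = rng.random((30, 2))            # 30 random points in the unit square
    hull = ConvexHull(points)

    print("hull vertices:", hull.vertices)  # indices of the extreme points
    print("hull area:", hull.volume)        # in 2-D, .volume is the enclosed area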
Lecture Note 6. Convex Optimization Problems

July 24, 2009

1. Concave (Convex) Functions

Definition: A function f on a convex set C in R^n is concave if for any x, y ∈ C and any α ∈ (0, 1),

    f(αx + (1 - α)y) ≥ αf(x) + (1 - α)f(y).

It is called strictly concave if the above inequality is always strict.

Definition: A function f on a convex set C in R^n is convex if for any x, y ∈ C and any α ∈ (0, 1),

    f(αx + (1 - α)y) ≤ αf(x) + (1 - α)f(y).

It is called strictly convex if the above inequality is always strict.

Remark: A function f is concave if and only if -f is convex. Hence, we will state most results for concave functions only. They all have counterparts for convex functions.

Homework: Prove that when both f and g are concave, f + g is also concave.

Homework: Prove that if f is concave and increasing on R^1 and g is concave, then the composite function f ∘ g is also concave.

Homework: Prove that a function f on R^n is concave if and only if the set below its graph in R^{n+1}, i.e.,

    {(x, y) ∈ R^{n+1} | x ∈ R^n, y ∈ R, y ≤ f(x)},

is a convex set.

Theorem 1:
(1) If f defined on R^1 is differentiable, then f is concave if and only if its first-order derivative f'(t) is weakly decreasing.
(2) If f defined on R^1 is twice differentiable, then f is concave if and only if its second-order derivative f''(t) is non-positive.
(3) If f defined on R^1 is differentiable, then f is concave if and only if for any x, y ∈ R^1,

    f(y) ≤ f(x) + f'(x)(y - x).

Proof: (Homework)

Theorem 1 can be extended to functions of many variables using the following trick. For any convex set C and any pair x, y ∈ C, the line segment between x and y is parameterized by the interval [0, 1] ⊆ R^1.

Theorem 2: A function f defined on a convex set C ⊆ R^n is concave if and only if its restriction to any line segment in C is a concave function on [0, 1].

Proof: Suppose the restriction of f to any line segment within C is a concave function. Then, for any x, y ∈ C, let g(α) = f(αx + (1 - α)y) for α ∈ (0, 1). Then

    f(αx + (1 - α)y) = g(α) = g(α · 1 + (1 - α) · 0) ≥ αg(1) + (1 - α)g(0) = αf(x) + (1 - α)f(y).

Hence, f is concave on C. Reversing the argument proves the other half of the theorem. Q.E.D.

Theorem 3:
(1) If f defined on R^n is differentiable, then f is concave if and only if for any x, y ∈ R^n,

    f(y) ≤ f(x) + ∇f(x)(y - x).

(2) If f on R^n is twice differentiable, then f is concave if and only if its Hessian matrix (∂²f/∂x_i∂x_j) is non-positive definite.

The proof of Theorem 3 follows from Theorems 1 and 2 and the chain rule of differentiation.

2. Convex Optimization Problems

We call the following optimization problem a concave maximization problem:

    max_{x ∈ R^n}  f(x)
    s.t.           g_1(x) ≤ 0,
                   ...
                   g_m(x) ≤ 0,                                  (CP)

in which f is concave and the g_j are convex. For any concave maximization problem (CP), the first-order condition is also sufficient for optimal solutions.

Theorem: Suppose x* is feasible for (CP) and there exist λ_1, ..., λ_m ≥ 0 such that

    ∇f(x*) = λ_1 ∇g_1(x*) + ... + λ_m ∇g_m(x*),                 (FOC)
    λ_j g_j(x*) = 0,   j = 1, ..., m;

then x* is optimal for (CP).

Proof: For any feasible x for (CP),

    f(x) ≤ f(x*) + ∇f(x*)(x - x*)                                       (Theorem 3)
         = f(x*) + [λ_1 ∇g_1(x*) + ... + λ_m ∇g_m(x*)](x - x*)           (FOC)
         = f(x*) + λ_1 ∇g_1(x*)(x - x*) + ... + λ_m ∇g_m(x*)(x - x*)
         ≤ f(x*) + λ_1 (g_1(x) - g_1(x*)) + ... + λ_m (g_m(x) - g_m(x*)) (Theorem 3, applied to the convex g_j)
         = f(x*) + λ_1 g_1(x) + ... + λ_m g_m(x)                         (λ_j g_j(x*) = 0)
         ≤ f(x*).                                                        (λ_j ≥ 0, g_j(x) ≤ 0)

Hence, x* is optimal for (CP). Q.E.D.

The first-order condition for (CP) is also necessary under Slater's Condition: there exists an x̄ such that g_j(x̄) < 0 for all j = 1, ..., m.

More Homework Problems:
(1) Prove: If f and g are both concave functions, the function h = min{f, g} is also concave.
(2) Prove: A function f is concave if and only if, for all x and y, and for all α ≤ 0,

    f(αx + (1 - α)y) ≤ αf(x) + (1 - α)f(y).

(3) If f is concave on R^n, then for any x_1, ..., x_m ∈ R^n and real numbers α_j ≥ 0 with Σ_{j=1}^{m} α_j = 1,

    f(Σ_{j=1}^{m} α_j x_j) ≥ Σ_{j=1}^{m} α_j f(x_j).
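As a quick numerical sanity check of homework problem (3), here is our illustration using the concave function f(x) = -‖x‖²:

    import numpy as np

    rng = np.random.default_rng(0)

    # f(x) = -||x||^2 is concave: its Hessian -2I is non-positive definite
    # (Theorem 3(2)).
    f = lambda x: -np.sum(x**2)

    # Check f(sum_j a_j x_j) >= sum_j a_j f(x_j) for convex weights a_j.
    m, n = 5, 3
    X = rng.normal(size=(m, n))                  # points x_1, ..., x_m
    a = rng.random(m); a /= a.sum()              # a_j >= 0, sum to 1
    lhs = f(a @ X)
    rhs = sum(a[j] * f(X[j]) for j in range(m))
    print(lhs >= rhs, lhs, rhs)                  # prints True for any sample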