Convex Optimization
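Convex Optimization (Chinese translation)
II. Basic Concepts of Convex Optimization
1. Convex sets and convex functions
Convex sets and convex functions are the most basic and important concepts in convex optimization.
2. General form of a convex optimization problem
A convex optimization problem can be written in the general form:
    minimize    f(x)
    subject to  g_i(x) <= 0,  i = 1, 2, ..., m
                h_j(x) = 0,   j = 1, 2, ..., p
where f(x) is the objective function to be optimized, and g_i(x) and h_j(x) are the inequality and equality constraint functions, respectively. For the problem to be convex, f and the g_i must be convex and the h_j must be affine.
III. Common Algorithms in Convex Optimization
1. Gradient descent
Gradient descent is a widely used optimization algorithm that is particularly well suited to convex optimization problems.
2. Method of Lagrange multipliers
The method of Lagrange multipliers is mainly used for constrained optimization problems: one constructs the Lagrangian and optimizes it to obtain the optimal solution of the original problem.
3. Interior-point methods
Interior-point methods are a class of iterative methods, mainly used to solve convex optimization problems such as linear programs and quadratic programs.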
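As an illustration of the gradient descent method listed above, here is a minimal sketch in plain Python. The test function, the fixed step size 0.1, and the names `gradient_descent` and `grad_f` are assumptions chosen for this example and do not come from the source.

```python
def gradient_descent(grad, x0, step, iters=200):
    """Fixed-step gradient descent: x <- x - step * grad(x)."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical convex test problem: f(x) = (x0 - 1)^2 + 2 * (x1 + 3)^2,
# whose unique minimizer is (1, -3).
def grad_f(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


x_min = gradient_descent(grad_f, x0=[0.0, 0.0], step=0.1)
print(x_min)  # close to [1.0, -3.0]
```

With this step size the error in each coordinate shrinks by a constant factor per iteration (0.8 and 0.6 respectively), which is the kind of linear convergence expected for a strongly convex quadratic.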
Convex Optimization
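Convex optimization is a broad discipline that is increasingly applied in scientific and engineering computing, economics, management, industry, and other fields. It involves building appropriate mathematical models to describe a problem, designing suitable computational methods to find its optimal solution, studying the theoretical properties of models and algorithms, and examining the computational performance of algorithms. This introductory course is suitable for senior undergraduates and graduate students in mathematics, statistics, computer science, electrical engineering, operations research, and related disciplines. The course covers: an introduction to convex sets, convex functions, and convex optimization problems; the basics of convex analysis; duality theory; gradient methods, proximal gradient methods, Nesterov's accelerated method, and the alternating direction method of multipliers (ADMM); interior-point methods; and applications in statistics, signal processing, and machine learning.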
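Introduction to the Interior Point Method
For unconstrained optimization problems, we can use methods such as Newton's method.
The simplex method can be used to solve constrained linear programs (LP), and the analogous active set method can be used to solve constrained quadratic programs (QP). The interior point method is another approach to solving constrained optimization problems.
This article introduces two interior point methods: the barrier method and the primal-dual method.
The material on the barrier method is drawn mainly from Convex Optimization by Stephen Boyd and Lieven Vandenberghe; the material on the primal-dual method is drawn mainly from Numerical Optimization (2nd edition) by Jorge Nocedal and Stephen J. Wright.
Contents:
Barrier Method: Central Path; Example
Primal-Dual Interior Point Method: Central Path; Example
Some Questions
Barrier Method
For the barrier method, we consider a general optimization problem:
    minimize    f0(x)
    subject to  fi(x) ≤ 0,  i = 1, ..., m
                Ax = b                              (1)
Here f0, ..., fm : R^n → R are twice-differentiable convex functions.
We also require that the problem is solvable, i.e., an optimal solution x* exists, with corresponding objective value p*.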
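To make the barrier idea concrete, the following is a minimal sketch on a one-dimensional toy instance of problem (1): minimize x subject to 1 − x ≤ 0, with no equality constraint, so that x* = p* = 1. The toy problem, the parameter values (`t0`, `mu`, `eps`), and the function names are assumptions for illustration only. For each t the sketch minimizes t·x − log(x − 1) with a damped Newton iteration; the minimizer is 1 + 1/t, so the iterates trace the central path toward x*.

```python
def center(t, x, iters=50):
    """Damped Newton iteration for minimizing t*x - log(x - 1) over x > 1."""
    for _ in range(iters):
        d1 = t - 1.0 / (x - 1.0)          # first derivative
        d2 = 1.0 / (x - 1.0) ** 2         # second derivative (always positive)
        step = -d1 / d2
        s = 1.0
        while x + s * step <= 1.0:        # halve the step until strictly feasible
            s *= 0.5
        x += s * step
    return x


def barrier_method(x0=3.0, t0=1.0, mu=10.0, eps=1e-6, m=1):
    """Follow the central path; m is the number of inequality constraints."""
    x, t = x0, t0
    path = []
    while True:
        x = center(t, x)
        path.append((t, x))
        if m / t < eps:                   # m/t bounds the suboptimality f0(x) - p*
            break
        t *= mu
    return x, path


x_star, path = barrier_method()
print(x_star)  # slightly above the true optimum x* = 1
```

The stopping rule uses the standard bound that a point on the central path with parameter t is at most m/t suboptimal, which is why increasing t drives the iterates to the optimum.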
2. (18%) Determine whether each of the following sets is a convex function, quasi-convex
function, concave function. Write your answer as a table of 6 rows and 3 columns, with
⟨z, X1 z⟩ ≥ 1,  ⟨z, X2 z⟩ ≥ 1.
Then, for 0 ≤ θ ≤ 1,
⟨z, (θX1 + (1 − θ)X2) z⟩ = θ⟨z, X1 z⟩ + (1 − θ)⟨z, X2 z⟩ ≥ θ · 1 + (1 − θ) · 1 = 1,
as required in the definition of S10. To see that it is not a cone, consider z = (1, 0, . . . , 0) and X = I ∈ S^n (symmetric matrices). Here ⟨z, Xz⟩ = 1, but ⟨z, 2Iz⟩ = 2. The reason that it is not affine is the same, by considering 2I = 2 · I + (−1) · O, a point on the "line" containing O (the all-0 matrix) and I. It follows that it is not a subspace.
11. S11 = {x ∈ R^n : ||Px + q||_2 ≤ c^T x + r}, given any P ∈ R^{m×n}, q ∈ R^m, c ∈ R^n, and r ∈ R. T, F, F, F. To show convexity, if
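    minimize    f(x) = Σ_{i=1}^{N} f_i(x)
    subject to  x ∈ X
There is already some work on this type of problem. Some papers consider asynchronous algorithms: I. Lobel applied the distributed gradient algorithm to random networks [2]; M. Zhong and C. G. Cassandras considered distributed optimization for multi-agent networks with event-driven communication; and A. Nedich used a broadcast-based communication algorithm to achieve distributed optimization over networks.
Other studies focus purely on optimizing the convergence rate, that is, on designing a topology that maximizes the corresponding connectivity (i.e., makes λ_2 as small as possible). For example, S. Boyd at Stanford University used semidefinite programming to design the corresponding optimal (undirected) network topology [5].
Open problems
1. Most current optimization problems take the sum of the individual agents' objectives as the objective function.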
First and second order characterizations of convex functions
Theorem 2. Suppose f : R^n → R is twice differentiable over an open domain. Then, the following are equivalent:
(i) f is convex.
(ii) f(y) ≥ f(x) + ∇f(x)^T (y − x), for all x, y ∈ dom(f).
(iii) ∇²f(x) ⪰ 0, for all x ∈ dom(f).
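Condition (ii) of Theorem 2 can be probed numerically. The sketch below evaluates the gap f(y) − f(x) − ∇f(x)^T (y − x) on random pairs of points; the two test functions (a convex quadratic and a saddle) and all names are assumptions picked for this example. A negative gap certifies non-convexity, while nonnegative gaps on a sample are only evidence, not a proof.

```python
import random


def first_order_gap(func, grad, x, y):
    """func(y) - [func(x) + grad(x)^T (y - x)]; nonnegative whenever func is convex."""
    g = grad(x)
    return func(y) - func(x) - sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))


# Convex quadratic: Hessian [[2, 1], [1, 2]] is positive definite.
def f(x):
    return x[0] ** 2 + x[0] * x[1] + x[1] ** 2


def grad_f(x):
    return [2 * x[0] + x[1], x[0] + 2 * x[1]]


# Saddle: Hessian [[2, 0], [0, -2]] is indefinite, so h is not convex.
def h(x):
    return x[0] ** 2 - x[1] ** 2


def grad_h(x):
    return [2 * x[0], -2 * x[1]]


rng = random.Random(0)
pairs = [([rng.uniform(-5, 5), rng.uniform(-5, 5)],
          [rng.uniform(-5, 5), rng.uniform(-5, 5)]) for _ in range(1000)]

min_gap_f = min(first_order_gap(f, grad_f, x, y) for x, y in pairs)
min_gap_h = min(first_order_gap(h, grad_h, x, y) for x, y in pairs)
print(min_gap_f >= -1e-9, min_gap_h < 0)
```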
• The theorem simplifies many basic proofs in convex analysis, but it does not usually make verification of convexity that much easier, as the condition needs to hold for all lines (and there are infinitely many).
• Many algorithms for convex optimization iteratively minimize the function over lines. The statement above ensures that each subproblem is also a convex optimization problem.
Examples of multivariate convex functions
• Affine functions: f(x) = a^T x + b (for any a ∈ R^n, b ∈ R). They are convex, but not strictly convex; they are also concave: ∀λ ∈ [0, 1],
  f(λx + (1 − λ)y) = a^T (λx + (1 − λ)y) + b = λa^T x + (1 − λ)a^T y + λb + (1 − λ)b = λf(x) + (1 − λ)f(y).
  In fact, affine functions are the only functions that are both convex and concave.
• Some quadratic functions: f(x) = x^T Qx + c^T x + d.
  – Convex if and only if Q ⪰ 0.
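For a small symmetric Q, the condition Q ⪰ 0 can be checked by hand. The sketch below does this for the 2×2 case, using the fact that a symmetric 2×2 matrix is positive semidefinite exactly when both diagonal entries and the determinant are nonnegative. The function name and the example matrices are assumptions for illustration; the symmetry requirement is an assumption of this sketch (a quadratic form only depends on the symmetric part of Q).

```python
def is_psd_2x2(Q, tol=1e-12):
    """Return True if the symmetric 2x2 matrix Q is positive semidefinite.

    Uses the principal-minor test: Q[0][0] >= 0, Q[1][1] >= 0, det(Q) >= 0.
    """
    (a, b), (c, d) = Q
    if abs(b - c) > tol:
        raise ValueError("Q must be symmetric")
    return a >= -tol and d >= -tol and (a * d - b * c) >= -tol


print(is_psd_2x2([[2.0, 1.0], [1.0, 2.0]]))  # x^T Q x is convex
print(is_psd_2x2([[1.0, 2.0], [2.0, 1.0]]))  # determinant is -3, so not convex
```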
Implicit Constraints
The standard form optimization problem has an implicit constraint:
    x ∈ D = ⋂_{i=0}^{m} dom fi ∩ ⋂_{i=1}^{p} dom hi
D is the domain of the problem.
The constraints fi(x) ≤ 0, hi(x) = 0 are the explicit constraints.
A problem is unconstrained if it has no explicit constraints.
Example: minimize
Convex Optimization Problems
1 Optimization Problems 2 Convex Optimization 3 Quasi-Convex Optimization 4 Classes of Convex Problems: LP, QP, SOCP, SDP 5 Multicriterion Optimization (Pareto Optimality)
Global and Local Optimality
A feasible x is optimal if f0(x) = p*; Xopt is the set of optimal points. A feasible x is locally optimal if it is optimal within a ball, i.e., there is an R > 0 such that x is optimal for minimize
SI 251 - Convex Optimization, Spring 2017
Homework 4
Due at 08:00 a.m., April 6, 2017, before class

Note: Please compress your code into one file and send it to the TAs, and print your figures or results and answer the questions on A4 paper. Finish your simulation with the CVX package (MATLAB/Python/...). Initialize your program with commands that fix your randomized results so that they are repeatable. For example, if you are using MATLAB, you may add rng('default'); rng(1); in the preamble. You may need to port the given MATLAB code segments to the programming language of your choice.

1. Feasibility

1) (Multiuser transmit beamforming.) Power minimization problem in wireless communication:

    P:  minimize_{w_1, ..., w_K}   Σ_{k=1}^{K} ‖w_k‖²
        subject to                 SINR_k ≥ γ_k,  k = 1, ..., K,        (1)

where w_1, ..., w_K ∈ C^n are the beamforming vectors for receivers k = 1, ..., K. The signal-to-interference-plus-noise ratio of the k-th user, SINR_k, is given by

    SINR_k = |h_k^H w_k|² / ( Σ_{i≠k} |h_k^H w_i|² + σ² ),        (2)

where h_k ∈ C^n is the channel coefficient vector between the transmitter and the k-th receiver, and σ² is the noise power. In the simulation, consider the complex Gaussian channel, i.e., h_k ∼ CN(0, s²I) with s = 1/√K. The noise power σ² can be set to 1 without loss of generality. Each target SINR satisfies γ_k ≥ 0 and is often expressed in dB, defined as 10 log10 γ_k.

(a) Consider the relationship between the target SINR and the feasibility of P. Draw the phase transition¹ figure where the X-axis is the target SINR in dB (γ_1 = ··· = γ_K = γ) and the Y-axis is the fraction of channel realizations for which the problem is feasible, i.e.,

    R = #{P is feasible} / #{tests (channel realizations)}.        (3)

Assume K = 50, n = 3. Run 20 times and take the average. (5 points)
(b) Draw the phase transition figure for the relationship between the number of users K and the feasibility of P. Assume n = 3, γ = −15 dB. Run 20 times and take the average. (5 points)
(c) Draw the phase transition figure for the relationship between the number of antennas n and the feasibility of P. Assume K = 100, γ = −10 dB. Run 20 times and take the average. (5 points)

¹ For more about phase transitions, refer to Dennis Amelunxen et al.: Living on the edge: Phase transitions in convex programs with random data, in: Information and Inference 2014, iau005.

2) (Second-order cone optimization problem.) Randomly generate a standard SOCP

    P_SOCP:  minimize_{x ∈ R^n}   f^T x
             subject to           ‖A_i x + b_i‖ ≤ c_i^T x + d_i,  i = 1, ..., K,        (4)

where each entry of A_i ∈ R^{m×n}, b_i ∈ R^m, c_i ∈ R^n, d_i ∈ R is drawn i.i.d. from the standard Gaussian distribution N(0, 1). Draw the phase transition figure for the relationship between the number of constraints K and the feasibility of P_SOCP. Assume m = 20, n = 100. Run 20 times and take the average. (10 points)

2. Optimization problems.

(a) (LASSO.) We wish to recover a sparse vector x ∈ R^n from measurements y ∈ R^m. Our measurement model tells us that
    y = Ax + v,
where A ∈ R^{m×n} is a known matrix and v ∈ R^m is unknown measurement error. The entries of v are drawn IID from the distribution N(0, σ²). We can first try to recover x by solving the optimization problem

    min_x  ‖Ax − y‖_2² + γ‖x‖_2².        (5)

This problem is called ridge regression. A more successful approach is to solve the LASSO problem

    min_x  ‖Ax − y‖_2² + γ‖x‖_1.        (6)

Please use the code below to define n, m, A, x, and y.
[code listing missing from the extracted text]
(a) Use CVX to estimate x from y using ridge regression and the LASSO problem, respectively. (15 points)
(b) Plot your result to compare the estimated x with the true x. (5 points)
(c) How many measurements m are needed to find an accurate x with ridge regression? How about with the LASSO? (5 points)

(b) (Portfolio Optimization.) Find minimum-risk portfolios with the same expected return as the uniform portfolio (w = (1/n)1), with risk measured by portfolio return variance, and the following portfolio constraints (in addition to 1^T w = 1):
• No (additional) constraints.
• Long-only: w ⪰ 0.
• Limit on total short position: 1^T w₋ ≤ 0.5, where (w₋)_i = max{−w_i, 0}.
(a) Use CVX to compare the optimal risk in these portfolios with each other and with the uniform portfolio. (10 points)
(b) Plot the optimal risk-return trade-off curves for the long-only portfolio, and for total short position limited to 0.5, in the same figure. Comment on the relationship between the two trade-off curves. (10 points)

(c) (Energy Storage Trade-offs.) We consider the use of a storage device (say, a battery) to reduce the total cost of electricity consumed over one day. We divide the day into T time periods, and let p_t denote the (positive, time-varying) electricity price, and u_t denote the (nonnegative) usage or consumption, in period t, for t = 1, ..., T. Without the use of a battery, the total cost is p^T u. Let q_t denote the (nonnegative) energy stored in the battery in period t. For simplicity, we neglect energy loss (although this is easily handled as well), so we have q_{t+1} = q_t + c_t, t = 1, ..., T − 1, where c_t is the charging of the battery in period t; c_t < 0 means the battery is discharged. We will require that q_1 = q_T + c_T, i.e., we finish with the same battery charge that we start with. With the battery operating, the net consumption in period t is u_t + c_t; we require this to be nonnegative (i.e., we do not pump power back into the grid). The total cost is then p^T (u + c). The battery is characterized by three parameters: the capacity Q, where q_t ≤ Q; the maximum charge rate C, where c_t ≤ C; and the maximum discharge rate D, where c_t ≥ −D. (The parameters Q, C, and D are nonnegative.)
(a) Explain how to find the charging profile c ∈ R^T (and associated stored energy profile q ∈ R^T) that minimizes the total cost, subject to the constraints. (5 points)

    min_{q,c}  p^T (u + c)
    s.t.       q_{t+1} = q_t + c_t,  t = 1, ..., T − 1
               q_1 = q_T + c_T
               0 ≤ q_t ≤ Q,          t = 1, ..., T
               −D ≤ c_t ≤ C,         t = 1, ..., T
               0 ≤ u_t + c_t,        t = 1, ..., T

(b) Use CVX to solve the problem above with Q = 35, C = D = 3, as well as p and u defined by the following code:
[code listing missing from the extracted text]
Plot u_t, p_t, c_t, and q_t versus t. (15 points)
(c) (Storage trade-offs.) Plot the minimum total cost versus the storage capacity Q, using p and u below, and charge/discharge limits C = D = 3. Repeat for charge/discharge limits C = D = 1. (Put these two trade-off curves on the same plot.) Give an interpretation of the endpoints of the trade-off curves. (10 points)

Solution:

    # Here we plot the demands u and prices p.
    import numpy as np
    import matplotlib.pyplot as plt
    %matplotlib inline  # Jupyter magic; remove when running as a plain script

    np.random.seed(1)  # fix the random seed so the results are repeatable
    T = 96             # number of time periods (15-minute intervals over one day)
    t = np.linspace(1, T, num=T).reshape(T, 1)
    # Electricity prices p and usage u: smooth daily profiles plus small noise.
    p = np.exp(-np.cos((t - 15) * 2 * np.pi / T) + 0.01 * np.random.randn(T, 1))
    u = 2 * np.exp(-0.6 * np.cos((t + 40) * np.pi / T)
                   - 0.7 * np.cos(t * 4 * np.pi / T)
                   + 0.01 * np.random.randn(T, 1))
    plt.figure(1)
    plt.plot(t / 4, p, 'g', label=r"$p$")  # t/4 converts periods to hours
    plt.plot(t / 4, u, 'r', label=r"$u$")
    plt.ylabel("$")
    plt.xlabel("t")
    plt.legend()
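Topic: Convex Optimization
Assignment contents:
1. What is convex optimization?
Convex optimization is the problem of minimizing a convex function (or, equivalently, maximizing a concave function) over a convex set.
2. Application areas of convex optimization
Convex optimization problems arise in many fields, such as machine learning, control theory, and financial engineering.
3. Basic concepts and methods of convex optimization
A convex optimization problem can generally be written in the following standard form:
\[\min_x f(x)\]
\[s.t. \quad g_i(x) \le 0, \quad i = 1,2,...,m\]
\[\quad \quad \quad h_i(x) = 0, \quad i = 1,2,...,p\]
where \(f(x)\) is the objective function, and \(g_i(x)\) and \(h_i(x)\) are the inequality and equality constraint functions, respectively.
4. A practical example of convex optimization
Take linear programming as an example, and consider the following linear program:
\[\min_x c^Tx\]
\[s.t. \quad Ax \le b\]
\[\quad \quad \quad x \ge 0\]
where \(c\) is the coefficient vector of the objective function, \(A\) is the coefficient matrix of the inequality constraints, \(b\) is the right-hand-side vector of the inequality constraints, and \(x\) is the optimization variable.
5. Conclusion
Convex optimization is an important class of optimization problems with a wide range of practical applications.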
Convex sets
Definition of convexity
2.1 Let C ⊆ R^n be a convex set, with x1, . . . , xk ∈ C, and let θ1, . . . , θk ∈ R satisfy θi ≥ 0, θ1 + · · · + θk = 1. Show that θ1x1 + · · · + θkxk ∈ C. (The definition of convexity is that this holds for k = 2; you must show it for arbitrary k.) Hint. Use induction on k.
Solution. This is readily shown by induction from the definition of convex set. We illustrate the idea for k = 3, leaving the general case to the reader. Suppose that x1, x2, x3 ∈ C, and θ1 + θ2 + θ3 = 1 with θ1, θ2, θ3 ≥ 0. We will show that y = θ1x1 + θ2x2 + θ3x3 ∈ C. At least one of the θi is not equal to one; without loss of generality we can assume that θ1 ≠ 1. Then we can write
    y = θ1x1 + (1 − θ1)(µ2x2 + µ3x3),
where µ2 = θ2/(1 − θ1) and µ3 = θ3/(1 − θ1). Note that µ2, µ3 ≥ 0 and µ2 + µ3 = (θ2 + θ3)/(1 − θ1) = (1 − θ1)/(1 − θ1) = 1. Since C is convex and x2, x3 ∈ C, the point µ2x2 + µ3x3 lies in C; since this point and x1 both lie in C, so does y.
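The induction in the solution translates directly into a recursion that evaluates a k-point convex combination using only two-point combinations, which is exactly why the result stays inside a convex set. The sketch below is an illustration of that argument; the function names and the sample points are assumptions made for this example.

```python
def combine2(theta, x, z):
    """Two-point convex combination theta*x + (1 - theta)*z, with 0 <= theta <= 1."""
    return [theta * a + (1.0 - theta) * b for a, b in zip(x, z)]


def convex_combination(points, thetas):
    """Evaluate sum_i thetas[i] * points[i] via nested two-point combinations.

    Mirrors the proof: peel off x1 with weight theta1, then rescale the
    remaining weights by 1/(1 - theta1) so that they sum to one again.
    """
    theta1, x1 = thetas[0], points[0]
    if len(points) == 1 or theta1 == 1.0:
        return list(x1)                      # all the weight sits on x1
    mus = [th / (1.0 - theta1) for th in thetas[1:]]
    z = convex_combination(points[1:], mus)  # lies in C by the induction hypothesis
    return combine2(theta1, x1, z)


pts = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
y = convex_combination(pts, [0.5, 0.25, 0.25])
print(y)  # [1.0, 1.0]
```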
Σ_{(i,j)∈E} ( −a_ij f_ij + b_ij (f_ij − c_ij) ) + Σ_{k∈V\{s,t}} x_k ( Σ_{(i,k)∈E} f_ik − Σ_{(k,j)∈E} f_kj ) ≤ 0
for any a_ij, b_ij ≥ 0, (i, j) ∈ E, and x_k, k ∈ V \ {s, t}
• Rearrange as
Σ_{(i,j)∈E} M_ij(a, b, x) f_ij ≤ Σ_{(i,j)∈E} b_ij c_ij
min x + y
subject to  x + y ≥ 2
            x, y ≥ 0
What's a lower bound? Easy, take B = 2. But didn't we get "lucky"?

Try again: min x + 3y
subject to  x + y ≥ 2
            x, y ≥ 0

More generally: min px + qy
subject to  x + y ≥ 2
            x, y ≥ 0
Called primal LP

max 2a
subject to  a + b = p
            a + c = q
            a, b, c ≥ 0
Called dual LP
Note: number of dual variables is number of primal constraints

Try another one: min px + qy
subject to  x ≥ 0
            y ≤ 1
            3x + y = 2
Primal LP

max 2c − b
subject to  a + 3c = p
            −b + c = q
            a, b ≥ 0
Dual LP
Note: in the dual problem, c is unconstrained
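The lower-bound reasoning for min x + 3y subject to x + y ≥ 2, x, y ≥ 0 can be checked numerically. Multiplying the three constraints by a, b, c ≥ 0 and adding gives (a + b)x + (a + c)y ≥ 2a, so with a + b = p = 1 and a + c = q = 3 every feasible a yields the bound 2a. The sketch below compares these bounds against primal objective values on a coarse grid; the grid and the variable names are assumptions for illustration, and the dual objective 2a is the derivation just described rather than a quotation from the slides.

```python
p, q = 1.0, 3.0
grid = [i * 0.25 for i in range(41)]  # 0, 0.25, ..., 10 (exact in floating point)

# Primal feasible points: x + y >= 2, x >= 0, y >= 0.
primal_values = [p * x + q * y for x in grid for y in grid if x + y >= 2.0]

# Dual feasible points: b = p - a >= 0 and c = q - a >= 0, i.e. 0 <= a <= min(p, q).
dual_values = [2.0 * a for a in grid if a <= min(p, q)]

best_primal = min(primal_values)  # attained at (x, y) = (2, 0)
best_dual = max(dual_values)      # attained at a = 1
print(best_dual, best_primal)     # 2.0 2.0
```

Every dual value is a lower bound on every primal value (weak duality), and here the best bound matches the best primal value, so the bound B = 2 is tight.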
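Affine sets, convex sets, and cones

1. Affine sets and convex sets

1.1 Affine sets
Affine (definition): A set C ⊆ R^n is affine if the line through any two distinct points of C lies in C.
Affine set: An affine set contains every affine combination of its points. If C is an affine set, x1, ..., xk ∈ C, and θ1 + ··· + θk = 1, then the point θ1x1 + ··· + θkxk also belongs to C.
Affine hull: The affine hull of C is the smallest affine set that contains C, written
    aff C = {θ1x1 + ··· + θkxk : x1, ..., xk ∈ C, θ1 + ··· + θk = 1}.

1.2 Convex sets
Convex (definition): A set C ⊆ R^n is convex if the line segment between any two distinct points of C lies in C.
Convex set: A convex set contains every convex combination of its points.
Convex hull: The convex hull of C is the smallest convex set that contains C, written
    conv C = {θ1x1 + ··· + θkxk : xi ∈ C, θi ≥ 0, i = 1, ..., k, θ1 + ··· + θk = 1}.
Notes:
1) The convex hull is always convex.
2) If B is a convex set that contains C, then conv C ⊆ B.
In two-dimensional Euclidean space, the convex hull can be pictured as a rubber band stretched tightly around all the points.

1.3 Cones
Cone (definition): A set C is a cone if θx ∈ C for every x ∈ C and θ ≥ 0.
Conic hull: The set of all conic combinations of points in C, which is also the smallest convex cone that contains C, i.e.,
    {θ1x1 + ··· + θkxk : xi ∈ C, θi ≥ 0, i = 1, ..., k}.

2. Examples
• The empty set, a single point, and the whole space are affine, and hence also convex.
• Any line is affine; if it passes through the origin, it is also a convex cone.
• A line segment is convex, but not affine.
• A ray of the form {x0 + θv : θ ≥ 0} is convex, but not affine.
• Any subspace is affine and a convex cone.
• A hyperplane is an affine set.
• A halfspace is a convex set.
• Balls and ellipsoids are convex sets.
• Norm balls are convex sets, and norm cones are convex cones.
• Polyhedra are convex sets.
Reference: Convex Optimization [Stephen Boyd].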
Convex Optimization: Introduction (required course, Department of Electrical Engineering, Stanford University)
exceptions: certain problem classes can be solved efficiently and reliably
• least-squares problems
• linear programming problems
• convex optimization problems

using convex optimization
• often difficult to recognize
• many tricks for transforming problems into convex form
• surprisingly many problems can be solved via convex optimization
Convex optimization problem
    minimize    f0(x)
    subject to  fi(x) ≤ bi,  i = 1, . . . , m
• objective and constraint functions are convex:
    fi(αx + βy) ≤ αfi(x) + βfi(y)  if α + β = 1, α ≥ 0, β ≥ 0
• includes least-squares problems and linear programs as special cases
Course goals and topics
goals
1. recognize/formulate problems (such as the illumination problem) as convex optimization problems
2. develop code for problems of moderate size (1000 lamps, 5000 patches)
3. characterize optimal solution (optimal power distribution), give limits of performance, etc.
topics
1. convex sets, functions, optimization problems
2. examples and applications
3. algorithms
DOI: 10.19533/j.issn1000-3762.2020.06.012
RPSEMD Based on Convex Optimization and Its Application in Rolling Bearing Fault Diagnosis
ZHANG Yongqing 1,2, KE Wei 2,3, LIN Qingyun 2,4, YI Cancan 2, MA Yubo 2
(1. Daye Special Steel Co., Ltd., Huangshi 435001, China; 2. Wuhan University of Science and Technology, Wuhan 430081, China; 3. Taizhou Special Equipment Inspection and Testing Institute, Taizhou 318000, China; 4. Lishui Special Equipment Testing Institute, Lishui 323000, China)
Abstract: To improve the poor robustness of Regenerated Phase-Shifted Sinusoid Assisted EMD (RPSEMD) under the influence of noise, a Generalized Minimax-Concave (GMC) penalty function is introduced as an alternative to the l1 norm, and a denoising framework is established based on convex optimization. The convex optimization denoising method is used as a preprocessing step, and mode decomposition is then carried out on the preprocessed signals by RPSEMD. A numerical simulation signal, an actual measured bearing fault signal, and a comparison with EMD and EEMD show that the method eliminates the influence of the modal chaos phenomenon and effectively extracts the fault characteristic frequency of bearings.
Keywords: rolling bearing; fault diagnosis; convex optimization; RPSEMD; feature extraction
CLC number: TH133.33; TN911.7    Document code: B    Article ID: 1000-3762(2020)06-0051-07
Owing to the complexity of actual working conditions at industrial sites, the collected vibration signals of mechanical equipment are inevitably mixed with noise or other interference components, which makes it difficult to identify the required feature information effectively.
Figure 1: Examples of a convex set (a) and a non-convex set (b).
• All of R^n. It should be fairly obvious that given any x, y ∈ R^n, θx + (1 − θ)y ∈ R^n.
• The non-negative orthant, R^n_+. The non-negative orthant consists of all vectors in R^n whose elements are all non-negative: R^n_+ = {x : x_i ≥ 0 ∀i = 1, . . . , n}. To show that this is a convex set, simply note that given any x, y ∈ R^n_+ and 0 ≤ θ ≤ 1, (θx + (1 − θ)y)_i = θx_i + (1 − θ)y_i ≥ 0 ∀i.
• Norm balls. Let ‖·‖ be some norm on R^n (e.g., the Euclidean norm, ‖x‖_2 = sqrt(Σ_{i=1}^{n} x_i²)). Then the set {x : ‖x‖ ≤ 1} is a convex set. To see this, suppose x, y ∈ R^n, with ‖x‖ ≤ 1, ‖y‖ ≤ 1, and 0 ≤ θ ≤ 1. Then
  ‖θx + (1 − θ)y‖ ≤ ‖θx‖ + ‖(1 − θ)y‖ = θ‖x‖ + (1 − θ)‖y‖ ≤ 1,
  where we used the triangle inequality and the positive homogeneity of norms.
• Affine subspaces and polyhedra. Given a matrix A ∈ R^{m×n} and a vector b ∈ R^m, an affine subspace is the set {x ∈ R^n : Ax = b} (note that this could possibly be empty if b is not in the range of A). Similarly, a polyhedron is the (again, possibly empty) set {x ∈ R^n : Ax ⪯ b}, where '⪯' here denotes componentwise inequality (i.e., all the entries of Ax are less than or equal to their corresponding element in b). To prove this, first consider x, y ∈ R^n such that Ax = Ay = b. Then for 0 ≤ θ ≤ 1, A(θx + (1 − θ)y) = θAx + (1 − θ)Ay = θb + (1 − θ)b = b. Similarly, for x, y ∈ R^n that satisfy Ax ⪯ b and Ay ⪯ b and 0 ≤ θ ≤ 1, A(θx + (1 − θ)y) = θAx + (1 − θ)Ay ⪯ θb + (1 − θ)b = b.
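The examples above can be spot-checked numerically by sampling pairs of points from each set and testing whether their convex combinations stay inside. The sketch below does this in R^2; the particular polyhedron (A, b), the annulus used as a non-convex contrast, the sampling box, and all function names are assumptions made for illustration. Passing the check is only evidence of convexity, whereas a single failure is a proof of non-convexity.

```python
import random


def in_orthant(x, tol=0.0):
    return all(xi >= -tol for xi in x)


def in_unit_ball(x, tol=0.0):
    return sum(xi * xi for xi in x) ** 0.5 <= 1.0 + tol


A = [[1.0, 1.0], [-1.0, 2.0]]  # hypothetical polyhedron {x : Ax <= b}
b = [1.0, 2.0]


def in_polyhedron(x, tol=0.0):
    return all(sum(a * xj for a, xj in zip(row, x)) <= bi + tol
               for row, bi in zip(A, b))


def in_annulus(x, tol=0.0):
    r = sum(xi * xi for xi in x) ** 0.5
    return 0.5 - tol <= r <= 1.0 + tol  # ring with a hole: not convex


def stays_inside(member, trials=2000, seed=0):
    """Sample point pairs from the set and test random convex combinations."""
    rng = random.Random(seed)
    pts = []
    while len(pts) < 2 * trials:  # rejection sampling from the box [-2, 2]^2
        x = [rng.uniform(-2, 2), rng.uniform(-2, 2)]
        if member(x):
            pts.append(x)
    for x, y in zip(pts[::2], pts[1::2]):
        th = rng.random()
        z = [th * xi + (1.0 - th) * yi for xi, yi in zip(x, y)]
        if not member(z, tol=1e-9):  # small tolerance for rounding error
            return False
    return True


for name, member in [("orthant", in_orthant), ("unit ball", in_unit_ball),
                     ("polyhedron", in_polyhedron), ("annulus", in_annulus)]:
    print(name, stays_inside(member))
```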
Convex Optimization Overview
Zico Kolter October 19, 2007
Many situations arise in machine learning where we would like to optimize the value of some function. That is, given a function f : R^n → R, we want to find x ∈ R^n that minimizes (or maximizes) f(x). We have already seen several examples of optimization problems in class: least-squares, logistic regression, and support vector machines can all be framed as optimization problems. It turns out that in the general case, finding the global optimum of a function can be a very difficult task. However, for a special class of optimization problems, known as convex optimization problems, we can efficiently find the global solution in many cases. Here, "efficiently" has both practical and theoretical connotations: it means that we can solve many real-world problems in a reasonable amount of time, and it means that theoretically we can solve problems in time that depends only polynomially on the problem size. The goal of these section notes and the accompanying lecture is to give a very brief overview of the field of convex optimization. Much of the material here (including some of the figures) is heavily based on the book Convex Optimization [1] by Stephen Boyd and Lieven Vandenberghe (available for free online), and EE364, a class taught here at Stanford by Stephen Boyd. If you are interested in pursuing convex optimization further, these are both excellent resources.
• Intersections of convex sets. Suppose C1 , C2 , . . . , Ck are convex sets. Then their intersection
⋂_{i=1}^{k} Ci = {x : x ∈ Ci ∀i = 1, . . . , k}
is also a convex set. To see this, consider x, y ∈
Convex Sets
We begin our look at convex optimization with the notion of a convex set.
Definition 2.1 A set C is convex if, for any x, y ∈ C and θ ∈ R with 0 ≤ θ ≤ 1, θx + (1 − θ)y ∈ C.
Intuitively, this means that if we take any two elements in C, and draw a line segment between these two elements, then every point on that line segment also belongs to C. Figure 1 shows an example of one convex and one non-convex set. The point θx + (1 − θ)y is called a convex combination of the points x and y.
Convex Functions
A central element in convex optimization is the notion of a convex function.
Definition 3.1 A function f : R^n → R is convex if its domain (denoted D(f)) is a convex set, and if, for all x, y ∈ D(f) and θ ∈ R, 0 ≤ θ ≤ 1, f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y).
Intuitively, the way to think about this definition is that if we pick any two points on the graph of a convex function and draw a straight line between them, then the portion of the function between these two points will lie below this straight line. This situation is pictured in Figure 2. We say a function is strictly convex if Definition 3.1 holds with strict inequality for x ≠ y and 0 < θ < 1. We say that f is concave if −f is convex, and likewise that f is strictly concave if −f is strictly convex.
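Definition 3.1 suggests a simple randomized test: sample x, y, and θ and look for a point where the function rises above the chord. The sketch below does this for scalar functions; the interval bounds, the tolerance, and the name `find_violation` are assumptions made for this example. Finding a violation proves the function is not convex on the interval, while finding none is only supporting evidence.

```python
import math
import random


def find_violation(f, lo, hi, trials=5000, seed=0, tol=1e-9):
    """Search for (x, y, theta) with f(theta*x + (1-theta)*y) above the chord.

    Returns the first violating triple found, or None if no violation appears.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, th = rng.uniform(lo, hi), rng.uniform(lo, hi), rng.random()
        lhs = f(th * x + (1.0 - th) * y)
        rhs = th * f(x) + (1.0 - th) * f(y)
        if lhs > rhs + tol:
            return (x, y, th)
    return None


print(find_violation(math.exp, -3.0, 3.0))          # None: exp is convex
print(find_violation(lambda x: x ** 3, -2.0, 2.0))  # a triple: x^3 is concave for x < 0
```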