Book Report of System Optimization and
Scheduling
The conjugate gradient method and its application in solving
optimization problems
1. Introduction to the problem's background
Optimization theory and methods form a very active young discipline. It studies the characteristics of problems of making the best choice, constructs algorithms for seeking optimal solutions, and investigates the theoretical properties and practical computational performance of these algorithms. With the rapid development of high technology, computers, and information technology, optimization theory and methods have become more and more important and are now widely used throughout the natural sciences and engineering design. The conjugate gradient method is one of the most commonly used optimization methods. Among the optimization methods that require derivative calculations, the steepest descent method is the simplest, but its convergence is too slow. The quasi-Newton method converges quickly and is widely regarded as the most effective method for nonlinear programming, but it requires storing a matrix and solving a system of linear equations to compute the search direction, which makes it almost impossible to apply to large-scale problems.
The conjugate gradient method can transform an n-dimensional optimization problem into n equivalent one-dimensional problems. The algorithm is simple, has small storage requirements, converges faster than the steepest descent method, and is particularly suitable for large-scale problems, such as the optimization problems arising in electricity distribution, oil exploration, atmospheric modeling, and aerospace.
The conjugate gradient method was first proposed by Hestenes and Stiefel in 1952 for solving systems of linear equations with positive definite coefficient matrices. Their famous joint article, "Methods of conjugate gradients for solving linear systems" [1], is regarded as the founding paper on the conjugate gradient method; it discusses in detail the properties of the conjugate gradient method for solving linear systems and its relationship with other methods. On this basis, Fletcher and Reeves in 1964 first applied the conjugate gradient method to nonlinear optimization problems, making it an important optimization method. Subsequently, Beale, Fletcher, Powell, and other scholars studied it in depth and gave early convergence analyses of nonlinear conjugate gradient methods. Since the conjugate gradient method does not require matrix storage and has a fast convergence rate, quadratic termination, and other favorable properties, it is now widely used in practical problems.
2. Mathematical description of the problem
The method of starting from a point $x_0 \in \mathbb{R}^n$ and performing one-dimensional searches in turn along a group of conjugate directions to solve an unconstrained optimization problem is known as the conjugate direction method. The conjugate gradient method uses conjugate directions as its search directions and is a typical conjugate direction method: the search directions are mutually conjugate, and each is simply a combination of the negative gradient direction with the search direction of the previous iteration, so the method stores little and is convenient to compute. At the same time, the conjugate gradient method lies between the steepest descent method and Newton's method: it uses only first-order derivative information, yet it overcomes the slow convergence of the steepest descent method and avoids the need of Newton's method to store and invert the Hessian matrix.
The basic idea of the conjugate gradient method is to combine conjugacy with the steepest descent method: the gradients at the known points are used to construct a group of conjugate directions, and a search is carried out along each of these directions to find the minimum point of the objective function. By the basic properties of conjugate directions, this method has quadratic termination. If the initial search direction is taken to be $d_0 = -\nabla f(x_0)$, the subsequent conjugate direction $d_k$ is determined as a linear combination of the negative gradient $-\nabla f(x_k)$ at the k-th iterate and the previously obtained conjugate direction $d_{k-1}$, which yields a specific conjugate direction method:

$d_k = -\nabla f(x_k) + \alpha_{k-1} d_{k-1}, \quad k = 1, 2, \ldots, n-1$  (2-1)
Because each conjugate direction depends on the negative gradient at the iteration point, the method is called the conjugate gradient method.
3. The algorithm
(1) Linear conjugate gradient method
The linear conjugate gradient method [2] was proposed independently by Hestenes and Stiefel for solving the system of linear equations

$Ax = b, \quad x \in \mathbb{R}^n$  (3-1)
When $A$ is a symmetric positive definite matrix, the linear system is equivalent to the following quadratic optimization problem:

$\min_{x \in \mathbb{R}^n} \; \frac{1}{2} x^T A x - b^T x$  (3-2)

Therefore, Hestenes and Stiefel's approach can be viewed as a conjugate gradient method for minimizing a quadratic function.
The steps of the linear conjugate gradient method are as follows:
a. Select the initial point $x_0 \in \mathbb{R}^n$ and set $r_0 = Ax_0 - b$, $d_0 = -r_0$, $k = 0$.
b. If $\|r_k\| \le \varepsilon$, stop; otherwise compute the step factor $\alpha_k = \dfrac{r_k^T r_k}{d_k^T A d_k}$ and form the next iterate: $x_{k+1} = x_k + \alpha_k d_k$, $r_{k+1} = Ax_{k+1} - b$.
c. Compute the parameter $\beta_{k+1} = \dfrac{\|r_{k+1}\|^2}{\|r_k\|^2}$ and set $d_{k+1} = -r_{k+1} + \beta_{k+1} d_k$; set $k = k + 1$ and return to step b.
A notable feature of the linear conjugate gradient method is that the directions $d_k$, $k = 0, 1, \ldots$, generated by the algorithm are conjugate with respect to $A$, so the method terminates in a finite number of steps.
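As a concrete illustration of steps a–c, the following is a minimal Python/NumPy sketch of the linear conjugate gradient method; the function name `linear_cg` and the default tolerance are illustrative choices rather than part of the original text.

```python
import numpy as np

def linear_cg(A, b, x0, eps=1e-8, max_iter=None):
    """Solve Ax = b (A symmetric positive definite), i.e. minimize (1/2)x^T A x - b^T x."""
    x = np.asarray(x0, dtype=float)
    r = A @ x - b                    # step a: r0 = Ax0 - b is the gradient at x0
    d = -r
    k = 0
    max_iter = max_iter if max_iter is not None else len(b)
    while np.linalg.norm(r) > eps and k < max_iter:
        alpha = (r @ r) / (d @ A @ d)        # step b: exact step length
        x = x + alpha * d
        r_new = A @ x - b
        beta = (r_new @ r_new) / (r @ r)     # step c: direction update
        d = -r_new + beta * d
        r = r_new
        k += 1
    return x, k

# Usage on the small symmetric positive definite system of Example 1 below:
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, iters = linear_cg(A, b, np.array([2.0, 1.0]))
print(x, iters)   # converges to A^{-1} b = (1/11, 7/11) in at most 2 steps (up to rounding)
```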
(2) Nonlinear conjugate gradient method
Suppose $f: \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and let $g(x)$ denote the gradient of $f$ at the point $x$. The general form of the nonlinear conjugate gradient method for solving the unconstrained minimization problem $\min f(x), \; x \in \mathbb{R}^n$, is as follows:
$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, \ldots$  (3-3)
where $\alpha_k$ is obtained by some line search and the search direction $d_k$ is defined by

$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k > 0, \end{cases}$  (3-4)

in which $\beta_k$ is a parameter. Different choices of $\beta_k$ correspond to different nonlinear conjugate gradient methods. Some well-known choices of the parameter $\beta_k$ are:
$\beta_k^{FR} = \dfrac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{PRP} = \dfrac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{HS} = \dfrac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{CD} = -\dfrac{\|g_k\|^2}{d_{k-1}^T g_{k-1}}, \quad \beta_k^{DY} = \dfrac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{LS} = -\dfrac{g_k^T y_{k-1}}{d_{k-1}^T g_{k-1}}$  (3-5)

where $y_{k-1} = g_k - g_{k-1}$ and $g_k = g(x_k)$.
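For reference, the parameter choices in (3-5) translate directly into code. The helper below (its name and dictionary layout are our own) computes all six values from the current gradient, the previous gradient, and the previous search direction.

```python
import numpy as np

def beta_parameters(g_k, g_prev, d_prev):
    """Classical beta_k choices from (3-5), with y_prev = g_k - g_{k-1}."""
    y_prev = g_k - g_prev
    return {
        "FR":  (g_k @ g_k)    / (g_prev @ g_prev),
        "PRP": (g_k @ y_prev) / (g_prev @ g_prev),
        "HS":  (g_k @ y_prev) / (d_prev @ y_prev),
        "CD":  -(g_k @ g_k)   / (d_prev @ g_prev),
        "DY":  (g_k @ g_k)    / (d_prev @ y_prev),
        "LS":  -(g_k @ y_prev) / (d_prev @ g_prev),
    }
```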
By the nature of the gradient, at the trial point $x^{(k)}$ the negative gradient direction is the direction in which the function value decreases fastest, and it can be shown that for a positive definite quadratic function on n-dimensional Euclidean space the minimum point can be reached in at most n such searches. We now take the positive definite quadratic function $f(x) = \frac{1}{2} x^T A x + b^T x + c$ (where $A$ is an $n \times n$ symmetric positive definite matrix, $c$ is a constant, $b = (b_1, b_2, \ldots, b_n)^T$, and $x = (x_1, x_2, \ldots, x_n)^T$) as an example to explain what may also be called the FR conjugate gradient method. If the step length parameter is $h$, the iteration for the next point $x^{(k+1)}$ is

$x^{(k+1)} = x^{(k)} - h \nabla f(x^{(k)})$  (3-6)

Letting $\lambda_k$ denote the step obtained by a one-dimensional search from $x^{(k)}$ along the search direction $d^{(k)}$, we get

$x^{(k+1)} = x^{(k)} + \lambda_k d^{(k)}$  (3-7)
If the negative gradient is rotated by an angle so that the search direction becomes a conjugate direction, namely

$d^{(k+1)} = -\nabla f(x^{(k+1)}) + \beta_k d^{(k)} = -g^{(k+1)} + \beta_k d^{(k)}$  (3-8)
where $\beta_k$ is the conjugate coefficient. If the Hessian matrix is $A$, multiply both sides of the equation on the left by $(d^{(k)})^T A$ and require that $d^{(k)}$ and $d^{(k+1)}$ be conjugate with respect to $A$; then

$(d^{(k)})^T A d^{(k+1)} = (d^{(k)})^T A \left( -g^{(k+1)} + \beta_k d^{(k)} \right) = 0$  (3-9)

so

$\beta_k = \dfrac{(d^{(k)})^T A g^{(k+1)}}{(d^{(k)})^T A d^{(k)}}$  (3-10)

In order to avoid the trouble of calculating the Hessian matrix $A$, we now try to eliminate $A$ from this formula. Multiplying both sides of the one-dimensional search iteration $x^{(k+1)} = x^{(k)} + \lambda_k d^{(k)}$ on the left by $A$ and then adding $b = (b_1, b_2, \ldots, b_n)^T$, we get:
$A x^{(k+1)} + b = A x^{(k)} + b + \lambda_k A d^{(k)}$  (3-11)

Since $\nabla f(x) = Ax + b$, this can be rewritten as $\nabla f(x^{(k+1)}) = \nabla f(x^{(k)}) + \lambda_k A d^{(k)}$, or $g^{(k+1)} - g^{(k)} = \lambda_k A d^{(k)}$.  (3-12)

Substituting this into $\beta_k = \dfrac{(d^{(k)})^T A g^{(k+1)}}{(d^{(k)})^T A d^{(k)}}$ and using the orthogonality relations produced by the exact line search, we get:

$\beta_k = \dfrac{(g^{(k+1)})^T \left[ \nabla f(x^{(k+1)}) - \nabla f(x^{(k)}) \right]}{(d^{(k)})^T \left[ \nabla f(x^{(k+1)}) - \nabla f(x^{(k)}) \right]} = \dfrac{\|\nabla f(x^{(k+1)})\|^2}{\|\nabla f(x^{(k)})\|^2} = \dfrac{\|g^{(k+1)}\|^2}{\|g^{(k)}\|^2}$  (3-13)

For a non-quadratic function, one uses instead

$\beta_k = \dfrac{(g^{(k+1)})^T (g^{(k+1)} - g^{(k)})}{\|g^{(k)}\|^2}$  (3-14)
The above is the conjugate gradient method. It has a simple structure, requires storing only three vectors, occupies few storage units, and is convenient for iterative computation on a computer.
The iterative steps of the conjugate gradient method are as follows [3]:
a. Set the iteration counter $k = 0$, choose the tolerance $\varepsilon$, and pick the initial point $x^{(k)}$.
b. Compute the gradient $g^{(k)} = \nabla f(x^{(k)})$ of $f(x)$ at $x^{(k)}$. If $\|g^{(k)}\| \le \varepsilon$, stop: $x^{(k)}$ is the approximate optimal solution $x^*$; otherwise set $d^{(k)} = -g^{(k)}$.
c. Determine the best step length $\lambda_k$ by a one-dimensional search from $x^{(k)}$ along the direction $d^{(k)}$, i.e. $f(x^{(k)} + \lambda_k d^{(k)}) = \min_{\lambda} f(x^{(k)} + \lambda d^{(k)})$, and set $x^{(k+1)} = x^{(k)} + \lambda_k d^{(k)}$.
d. Compute $g^{(k+1)} = \nabla f(x^{(k+1)})$. If $\|g^{(k+1)}\| \le \varepsilon$, stop and set $x^* = x^{(k+1)}$; otherwise go to step e.
e. If $k = n$, all $n$ conjugate directions have been used up; set $x^{(0)} = x^{(k+1)}$, $k = 0$, and go to step b. Otherwise go to step f.
f. Compute $\beta_k = \dfrac{\|g^{(k+1)}\|^2}{\|g^{(k)}\|^2}$, or $\beta_k = \dfrac{(g^{(k+1)})^T (g^{(k+1)} - g^{(k)})}{\|g^{(k)}\|^2}$, and set $d^{(k+1)} = -g^{(k+1)} + \beta_k d^{(k)}$.
g. Set $k = k + 1$ and go to step c.
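The following Python sketch follows steps a–g for a general differentiable function, including the restart of step e. One deviation from the listed algorithm: the exact one-dimensional search of step c is replaced by a simple backtracking (Armijo) line search, which is an implementation convenience rather than part of the original description.

```python
import numpy as np

def fr_conjugate_gradient(f, grad, x0, eps=1e-6, max_outer=10000):
    """Fletcher-Reeves conjugate gradient with restart every n iterations (steps a-g)."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    g = grad(x)
    d = -g                                  # step b: initial steepest descent direction
    k = 0
    for _ in range(max_outer):
        if np.linalg.norm(g) <= eps:        # steps b/d: stopping test
            return x
        if g @ d >= 0:                      # safeguard: fall back to steepest descent
            d = -g
        # step c: backtracking line search instead of an exact search
        lam, fx, slope = 1.0, f(x), g @ d
        while f(x + lam * d) > fx + 1e-4 * lam * slope and lam > 1e-16:
            lam *= 0.5
        x_new = x + lam * d
        g_new = grad(x_new)
        if k + 1 >= n:                      # step e: restart after n directions
            d = -g_new
            k = 0
        else:                               # step f: FR update of the direction
            beta = (g_new @ g_new) / (g @ g)
            d = -g_new + beta * d
            k += 1
        x, g = x_new, g_new                 # step g
    return x

# Usage on the quadratic of Example 2 below, f(x) = 2*x1^2 + x2^2:
f = lambda x: 2.0 * x[0] ** 2 + x[1] ** 2
grad = lambda x: np.array([4.0 * x[0], 2.0 * x[1]])
print(fr_conjugate_gradient(f, grad, [2.0, 2.0]))   # approaches the minimizer (0, 0)
```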
The above is the unconstrained optimization algorithm for the objective function $f(x) = \frac{1}{2} x^T A x + b^T x + c$, and it can be generalized to a general differentiable function: starting from any point $x^{(0)}$, one performs a one-dimensional search along each of the search directions in turn (a total of n line searches), after which the optimal solution of $f(x)$ is obtained. For an n-dimensional quadratic objective function, the conjugate gradient method theoretically reaches the minimum point in at most n iterations; in actual computation, because of rounding errors, a few more iterations are usually needed to reach a satisfactory result, and for non-quadratic functions the number of iterations is larger still. However, since an n-dimensional function has at most n conjugate directions, continuing past n iterations loses its meaning, and the accumulation of error is unfavorable to convergence. Therefore the conjugate gradient method is usually restarted after n or n+1 iterations by resetting the search direction to the steepest descent direction $d \leftarrow -g$.

4. The numerical verification
Example 1:
To illustrate the conjugate gradient method, we work through a simple example. Consider the linear system $Ax = b$ given by

$A = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.$

We will perform two steps of the conjugate gradient method, beginning with the initial guess

$x_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix},$

in order to find an approximate solution to the system.
Solution:
Our first step is to calculate the residual vector $r_0$ associated with $x_0$. This residual is computed from the formula $r_0 = b - Ax_0$, and in our case equals

$r_0 = \begin{bmatrix} 1 \\ 2 \end{bmatrix} - \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} -8 \\ -3 \end{bmatrix}.$

Since this is the first iteration, we will use the residual vector $r_0$ as our initial search direction $p_0$; the method of selecting $p_k$ will change in further iterations.

We now compute the scalar $\alpha_0$ using the relationship

$\alpha_0 = \dfrac{r_0^T r_0}{p_0^T A p_0} = \dfrac{\begin{bmatrix} -8 & -3 \end{bmatrix} \begin{bmatrix} -8 \\ -3 \end{bmatrix}}{\begin{bmatrix} -8 & -3 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} -8 \\ -3 \end{bmatrix}} = \dfrac{73}{331}.$

We can now compute $x_1$ using the formula

$x_1 = x_0 + \alpha_0 p_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \dfrac{73}{331} \begin{bmatrix} -8 \\ -3 \end{bmatrix} = \begin{bmatrix} 0.2356 \\ 0.3384 \end{bmatrix}.$

This result completes the first iteration, the result being an "improved" approximate solution to the system, $x_1$. We may now move on and compute the next residual vector $r_1$ using the formula

$r_1 = r_0 - \alpha_0 A p_0 = \begin{bmatrix} -8 \\ -3 \end{bmatrix} - \dfrac{73}{331} \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} -8 \\ -3 \end{bmatrix} = \begin{bmatrix} -0.2810 \\ 0.7492 \end{bmatrix}.$

Our next step in the process is to compute the scalar $\beta_0$ that will eventually be used to determine the next search direction $p_1$:

$\beta_0 = \dfrac{r_1^T r_1}{r_0^T r_0} = \dfrac{\begin{bmatrix} -0.2810 & 0.7492 \end{bmatrix} \begin{bmatrix} -0.2810 \\ 0.7492 \end{bmatrix}}{\begin{bmatrix} -8 & -3 \end{bmatrix} \begin{bmatrix} -8 \\ -3 \end{bmatrix}} = 0.0088.$

Now, using this scalar $\beta_0$, we can compute the next search direction $p_1$ using the relationship

$p_1 = r_1 + \beta_0 p_0 = \begin{bmatrix} -0.2810 \\ 0.7492 \end{bmatrix} + 0.0088 \begin{bmatrix} -8 \\ -3 \end{bmatrix} = \begin{bmatrix} -0.3511 \\ 0.7229 \end{bmatrix}.$

We now compute the scalar $\alpha_1$ using our newly acquired $p_1$, by the same method as that used for $\alpha_0$:

$\alpha_1 = \dfrac{r_1^T r_1}{p_1^T A p_1} = \dfrac{\begin{bmatrix} -0.2810 & 0.7492 \end{bmatrix} \begin{bmatrix} -0.2810 \\ 0.7492 \end{bmatrix}}{\begin{bmatrix} -0.3511 & 0.7229 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} -0.3511 \\ 0.7229 \end{bmatrix}} = 0.4122.$

Finally, we find $x_2$ using the same method as that used to find $x_1$:

$x_2 = x_1 + \alpha_1 p_1 = \begin{bmatrix} 0.2356 \\ 0.3384 \end{bmatrix} + 0.4122 \begin{bmatrix} -0.3511 \\ 0.7229 \end{bmatrix} = \begin{bmatrix} 0.0909 \\ 0.6364 \end{bmatrix}.$

The result, $x_2$, is a "better" approximation to the system's solution than $x_1$ and $x_0$. If exact arithmetic were used in this example instead of limited precision, the exact solution would theoretically be reached after n = 2 iterations (n being the order of the system).
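The two hand-computed iterations above can be checked with a few lines of NumPy; the script below uses the residual/search-direction convention of this example ($r_k = b - Ax_k$, $p_0 = r_0$) and should reproduce $x_1 \approx (0.2356, 0.3384)^T$ and $x_2 \approx (0.0909, 0.6364)^T$.

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.array([2.0, 1.0])

r = b - A @ x                # r0 = (-8, -3)
p = r                        # initial search direction p0 = r0
for k in range(2):           # two steps suffice for a 2x2 system in exact arithmetic
    alpha = (r @ r) / (p @ A @ p)
    x = x + alpha * p
    r_new = r - alpha * (A @ p)
    beta = (r_new @ r_new) / (r @ r)
    p = r_new + beta * p
    r = r_new
    print(k + 1, x)

print(np.linalg.solve(A, b))   # exact solution (1/11, 7/11) ~ (0.0909, 0.6364)
```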
Example 2:
Use the FR method to solve the problem

$\min f(x) = 2x_1^2 + x_2^2,$

starting from the initial point $x^{(1)} = (2, 2)^T$. Here $\nabla f(x) = (4x_1, 2x_2)^T$ and the Hessian is $A = \begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}$.

The first iteration:
Set $d^{(1)} = -g_1 = (-8, -4)^T$. Then

$\lambda_1 = -\dfrac{g_1^T d^{(1)}}{(d^{(1)})^T A d^{(1)}} = \dfrac{80}{288} = \dfrac{5}{18},$

so

$x^{(2)} = x^{(1)} + \lambda_1 d^{(1)} = (2, 2)^T + \dfrac{5}{18}(-8, -4)^T = \left(-\dfrac{2}{9}, \dfrac{8}{9}\right)^T.$

The second iteration:

$g_2 = \left(-\dfrac{8}{9}, \dfrac{16}{9}\right)^T, \qquad \beta_1 = \dfrac{\|g_2\|^2}{\|g_1\|^2} = \dfrac{320/81}{80} = \dfrac{4}{81},$

$d^{(2)} = -g_2 + \beta_1 d^{(1)} = \left(\dfrac{8}{9}, -\dfrac{16}{9}\right)^T + \dfrac{4}{81}(-8, -4)^T = \dfrac{40}{81}(1, -4)^T,$

$\lambda_2 = -\dfrac{g_2^T d^{(2)}}{(d^{(2)})^T A d^{(2)}} = \dfrac{9}{20},$

$x^{(3)} = x^{(2)} + \lambda_2 d^{(2)} = \left(-\dfrac{2}{9}, \dfrac{8}{9}\right)^T + \dfrac{9}{20} \cdot \dfrac{40}{81}(1, -4)^T = (0, 0)^T.$

Since $g_3 = \nabla f(x^{(3)}) = (0, 0)^T$, the point $(0, 0)^T$ is the required minimum point.
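These two iterations can likewise be verified numerically. The short script below repeats the exact-line-search FR updates for this quadratic (Hessian $A = \mathrm{diag}(4, 2)$) and should reproduce $x^{(2)} = (-2/9, 8/9)^T$ and $x^{(3)} = (0, 0)^T$ up to rounding.

```python
import numpy as np

A = np.diag([4.0, 2.0])                 # Hessian of f(x) = 2*x1^2 + x2^2
grad = lambda x: A @ x                  # gradient (4*x1, 2*x2)
x = np.array([2.0, 2.0])

g = grad(x)
d = -g
for k in range(2):
    lam = -(g @ d) / (d @ A @ d)        # exact step length for a quadratic
    x = x + lam * d                     # x^(2) ~ (-2/9, 8/9), then x^(3) = (0, 0)
    g_new = grad(x)
    beta = (g_new @ g_new) / (g @ g)    # FR parameter
    d = -g_new + beta * d
    g = g_new
    print(k + 1, x)
```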
Example 3:
Consider the following system of nonlinear equations:

$\begin{cases} x_0^2 + x_1^2 + x_2^2 = 1, \\ 2x_0^2 + x_1^2 - 4x_2 = 0, \\ 3x_0^2 - 4x_1 + x_2^2 = 0. \end{cases}$

It is solved with the steepest descent method and with the conjugate gradient method; the results are as follows:
Table 1

                   Steepest descent method                 Conjugate gradient method
Initial value      (1, 1, 1)                               (1, 1, 1)
Accuracy           $\varepsilon = 10^{-8}$                 $\varepsilon = 10^{-8}$
Iterations         70                                      17
Terminal value     (0.78519694, 0.49661140, 0.36992283)    (0.78519693, 0.46991139, 0.36992283)
Clearly, for the same accuracy and starting from the same initial point, the conjugate gradient method is superior to the steepest descent method.
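The report does not state exactly how the system was recast as an optimization problem, so as one hedged possibility the sketch below minimizes the sum of squared residuals $F(x) = F_1(x)^2 + F_2(x)^2 + F_3(x)^2$ with a hand-rolled steepest descent loop and with SciPy's nonlinear conjugate gradient routine (`method='CG'`, a Polak–Ribière-type implementation). Iteration counts depend on the line search and need not match Table 1 exactly, but both runs should approach (0.7852, 0.4966, 0.3699).

```python
import numpy as np
from scipy.optimize import minimize

def residuals(x):
    x0, x1, x2 = x
    return np.array([
        x0**2 + x1**2 + x2**2 - 1.0,
        2.0 * x0**2 + x1**2 - 4.0 * x2,
        3.0 * x0**2 - 4.0 * x1 + x2**2,
    ])

def jacobian(x):
    x0, x1, x2 = x
    return np.array([
        [2.0 * x0, 2.0 * x1, 2.0 * x2],
        [4.0 * x0, 2.0 * x1, -4.0],
        [6.0 * x0, -4.0,      2.0 * x2],
    ])

F = lambda x: float(residuals(x) @ residuals(x))       # sum of squared residuals
grad_F = lambda x: 2.0 * jacobian(x).T @ residuals(x)  # its gradient

# Steepest descent with a backtracking line search.
x = np.array([1.0, 1.0, 1.0])
for k in range(100000):
    g = grad_F(x)
    if np.linalg.norm(g) <= 1e-8:
        break
    lam = 1.0
    while F(x - lam * g) > F(x) - 1e-4 * lam * (g @ g) and lam > 1e-16:
        lam *= 0.5
    x = x - lam * g
print("steepest descent:", k, x)

# Nonlinear conjugate gradient via SciPy.
res = minimize(F, np.array([1.0, 1.0, 1.0]), method="CG", jac=grad_F, tol=1e-10)
print("conjugate gradient:", res.nit, res.x)
```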
5. Summary
The conjugate gradient method is not only one of the most useful methods for solving large systems of linear equations, but also one of the most effective algorithms for solving large-scale nonlinear optimization problems. Among the various optimization algorithms, the conjugate gradient method is a very important one: it combines the advantages of the steepest descent method and Newton's method while overcoming their shortcomings. It requires only a small amount of memory, converges quickly, is highly stable, and does not require any external parameters.
6. References
[1] Hestenes M. R., Stiefel E. Methods of conjugate gradients for solving linear systems [J]. Journal of Research of the National Bureau of Standards, 1952, 49: 409-436.
[2] Dai Y. H., Yuan Y. Nonlinear Conjugate Gradient Methods [M]. Shanghai: Shanghai Science and Technology Publisher, 2000: 1-152.
[3] Yunfei Li, Meng. Using the conjugate gradient method to solve optimization problems [J]. Journal of Weinan Teachers College, 2003.
