Flexible decomposition algorithms for weakly coupled Markov decision problems


Fractal Kinematics of Crack Propagation in Ductile Hardening Materials (Tsinghua University)

CNAIS 2006 Symposium — Title (no more than 20 characters)*

Author One¹, Author Two², Author Three³, … (1. School and Department, City, Postal Code; 2. Affiliation 2, City, Postal Code)

Abstract: should cover purpose, methods, results, and conclusions in 200–220 characters, with concrete information.

Keywords: keyword 1 (corresponding to the classification code); keyword 2; keyword 3; …
CLC number: code 1; code 2
*Funding: funding category (grant number)
Author biography: first author's name (year of birth–), gender (ethnicity), place of origin, professional title.

Corresponding author: name, title, e-mail: … When the first author is a graduate student or postdoctoral researcher, the supervisor among the authors should be listed as the corresponding author; otherwise, append the e-mail directly after the author biography and omit the corresponding-author entry.

At present, the development of new processes that improve the formability of sheet metal has become a frontier topic in the global sheet-stamping field, and numerous researchers at home and abroad are pursuing it mainly along two directions [1].

The two directions are: (1) controlling and optimizing the blank-holder force curve; (2) multi-point controlled blank-holding technology.

To improve the formability of sheet metal when it is formed into automobile panels, the mechanics of the panel deep-drawing process must be analyzed theoretically in some depth.

Generally speaking, any non-axisymmetric panel of complex shape is composed of several straight or inclined wall faces, transition surfaces of roughly quarter cylinders or cones, and an outward-convex curved bottom face.

Taking the deep drawing of the rectangular box shown in Fig. 1 as the analysis model, the upper-bound method is used to investigate the mechanics of the deep drawing of a general panel.

To this end, the flange of the blank is divided into two kinds of regions: corner regions and straight-edge regions. The former is exemplified by region ABCD in Fig. 1 and the latter by region ABHE, and each region can in turn be divided into a flange part and a die-radius part.

This paper uses the upper-bound method to examine the mechanics of the panel deep-drawing process theoretically, under the following assumptions: (1) the sheet thickness δ remains constant during drawing; (2) the equivalent strain rate is computed with a thickness-anisotropic (normal anisotropy) material model; (3) friction on the contact surfaces is neglected when computing the kinematically admissible velocity field.

In fact, assumption (1) is also adopted in the principal-stress (slab) method [2].

If, during drawing, the motion and deformation of the corner regions and the straight-edge regions of the blank are mutually independent, with no transfer of material points between them, i.e., material in the corner regions moves and deforms as in cylindrical-cup drawing while material in the straight-edge regions moves and deforms as in plane-strain drawing, then the upper-bound method yields the following results.

Advantages and Disadvantages of Direct and Indirect Methods for Solving Multi-Objective Optimization Problems

A multi-objective optimization problem is one in which several conflicting objective functions coexist in the same optimization problem; since no single solution can in general optimize all objectives at once, the goal is to find a set of solutions in which each objective is made as good as possible.

Two different approaches can be used to solve such problems: the direct method and the indirect method.

This article introduces the direct and indirect methods in detail and analyzes their respective advantages and disadvantages.

Direct method

The direct method, also known as the weighting (aggregation) method, converts a multi-objective optimization problem into a single-objective one: the objective functions are combined, with weights reflecting their relative importance, into one composite objective function that is then solved.

Its basic idea is to form a linear combination of the objective functions, construct a composite objective, and solve the resulting single-objective optimization problem.

Advantages:

1. Simple and intuitive: by reducing the multi-objective problem to a single-objective one, the direct method is more straightforward and easier to understand than the indirect method.

2. Simplified mathematical model: the linear combination merges the objective functions into a single composite objective, which simplifies the model and reduces the computational difficulty.

3. Reflects the decision maker's preferences: weights must be set for each objective function, and adjusting them trades the objectives off against one another in line with the decision maker's intent.

Disadvantages:

1. Strong subjectivity: the weights are determined from expert experience or the decision maker's judgment, so the result may be influenced by subjective factors.

2. Sensitivity to the weights: the direct method is very sensitive to the weight settings; the choice of weights strongly affects the final result, and different choices may yield different solutions.

3. Possible loss of optimal solutions: because a single composite objective is optimized, the method may miss Pareto-optimal solutions and cannot enumerate all of them (in particular, a linear weighted sum cannot reach solutions on non-convex parts of the Pareto front).
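A minimal sketch of the direct (weighted-sum) method on two illustrative conflicting objectives. The objectives f1 and f2 and the grid search are assumptions for demonstration, not taken from the text; the point is that changing the weight moves the selected trade-off solution.

```python
def f1(x): return x * x            # illustrative objective 1 (assumed example)
def f2(x): return (x - 2.0) ** 2   # illustrative objective 2, conflicting with f1

def direct_method(w, xs):
    """Weighted-sum (direct) method: scalarize the two objectives with
    weight w in [0, 1] and minimize the composite over candidate points xs."""
    return min(xs, key=lambda x: w * f1(x) + (1.0 - w) * f2(x))

xs = [i / 1000.0 for i in range(2001)]   # candidate grid on [0, 2]
for w in (0.25, 0.5, 0.75):
    x_star = direct_method(w, xs)
    print(w, round(x_star, 3))  # for this pair, the analytic optimum is x* = 2*(1 - w)
```

Each weight picks out a different point on the trade-off curve, which is exactly why the result is sensitive to the weight setting.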

Indirect method

The indirect method discussed here is the non-dominated sorting genetic algorithm (NSGA), which applies the non-dominated sorting idea within a genetic algorithm to solve multi-objective optimization problems.

It builds a non-dominated ordering of the population, generates new populations through the genetic operators of selection, crossover, and mutation, and iterates until a set of non-dominated solutions is found.

Advantages:

1. Efficiency: by combining a genetic algorithm with non-dominated sorting, the indirect method converges quickly to a set of non-dominated solutions and handles multi-objective optimization problems effectively.

2. Diversity: selection, crossover, and mutation maintain the diversity of the population, so the method yields not just one good solution but a range of high-quality alternatives for the decision maker to choose from.

Embedded Systems: Chinese-English Translation

6.1 Conclusions

Autonomous control for small UAVs imposes severe restrictions on control algorithm development, stemming from the limitations of the on-board hardware and the requirement for on-line implementation. In this thesis we have proposed a new hierarchical control scheme for the navigation and guidance of a small UAV for obstacle avoidance. The multi-stage control hierarchy for a complete path control algorithm comprises several control steps: top-level path planning, mid-level path smoothing, and bottom-level path-following control. In each stage of the hierarchy, the limitation of the on-board computational resources has been taken into account to arrive at a practically feasible control solution. We have validated these developments in realistic, non-trivial scenarios.

In Chapter 2 we proposed a multiresolution path planning algorithm. At each step the algorithm computes a multiresolution representation of the environment using the fast lifting wavelet transform (FLWT). The main idea is to employ high resolution close to the agent (where it is needed most) and coarse resolution at large distances from the agent's current location. It has been shown that the proposed multiresolution path planning algorithm provides an on-line path solution that is most reliable close to the agent while ultimately reaching the goal. In addition, the connectivity relationship of the corresponding multiresolution cell decomposition can be computed directly from the approximation and detail coefficients of the FLWT. The path planning algorithm is scalable and can be tailored to the available computational resources of the agent.

The on-line path smoothing algorithm incorporating the path templates is presented in Chapter 3. The path templates comprise a set of B-spline curves obtained by solving an off-line optimization problem subject to channel constraints. The channel is closely related to the obstacle-free high-resolution cells along the path sequence calculated by the high-level path planner. Obstacle avoidance is dealt with implicitly, since each B-spline curve is constrained to stay inside the prescribed channel and thus avoids obstacles outside it. By the affine invariance property of B-splines, each component of the B-spline path templates can be adapted to the discrete path sequence obtained from the high-level path planner. We have shown that a smooth reference path over the entire route can be calculated on-line by utilizing the path templates and a path-stitching scheme. Simulation results with the D*-lite path planning algorithm validate the effectiveness of the on-line path smoothing algorithm. This approach has the advantage of minimal on-line computational cost, since most of the computation is done off-line.

In Chapter 4 a nonlinear path-following control law was developed for a small fixed-wing UAV. The kinematic control law realizes cooperative path following, so that the motion of a virtual target is controlled by an extra control input that helps the error variables converge. We applied backstepping to derive the roll command for a fixed-wing UAV from the heading-rate command of the kinematic control law. Furthermore, we applied parameter adaptation to compensate for the inaccurate time constant of the closed-loop roll dynamics. The proposed path-following control algorithm was validated through a high-fidelity 6-DOF simulation of a fixed-wing UAV using realistic sensor measurements, which verifies the applicability of the proposed algorithm to an actual UAV.

Finally, the complete hierarchical path control algorithm proposed in this thesis was validated through a high-fidelity hardware-in-the-loop simulation environment using the actual hardware platform. The simulation results demonstrate that the proposed hierarchical path control law can be successfully applied to path control of a small UAV equipped with an autopilot that has limited computational resources.

6.2 Future Research

In this section, several possible extensions of the work presented in this thesis are outlined.

6.2.1 Reusable graph structure

The proposed path planning algorithm involves calculating the multiresolution cell decomposition and the corresponding graph structure at each iteration. Hence, the connectivity graph G(t) changes as the agent proceeds toward the goal. Let x ∈ W be a state (location) that corresponds to nodes of two distinct graphs. By the respective A* searches on those graphs, the agent might be driven to visit x at two different time steps t_i and t_j, i ≠ j. As a result, a cyclic loop through x is formed, and the agent repeats this pathological loop without ever reaching the goal. Although it has been suggested that maintaining a visited set might be a means of avoiding such pathological situations [142], this turns out to be a trial-and-error scheme rather than a systematic approach. Instead, suppose that we could employ a unified graph structure over all iterations, one that retains the information from previous searches. As in the D*-lite path planning algorithm, incremental search over a graph that reuses previous information would not only overcome the pathological situation but also reduce the computation time. In contrast to the D* and D*-lite algorithms, where a uniform graph structure is employed, a challenge lies in building the unified graph structure from a multiresolution cell decomposition. Specifically, this requires a dynamic, multiresolution scheme for constructing the graph connectivity between nodes at different levels. The unified graph structure would evolve as the agent moves, updating the nodes and edges associated with the multiresolution cell decomposition from the FLWT. If this is achieved, we might be able to adapt the proposed path planning algorithm into an incremental search algorithm, thereby taking advantage of both the efficient multiresolution connectivity (due to the FLWT) and the fast computation (due to incremental search reusing previous information).

Entomology in English

Insects are an integral part of our world, playing vital roles in the intricate web of life. As a branch of zoology, entomology focuses on the study of these diverse and fascinating creatures. From the delicate butterfly to the industrious ant, the realm of insects offers a wealth of knowledge and wonder.

One of the most captivating aspects of entomology is the sheer diversity of insect species. It is estimated that there are over 1 million described species of insects, with many more yet to be discovered. This remarkable biodiversity encompasses a wide range of forms, from the graceful dragonfly to the resilient cockroach. Each species has evolved unique adaptations and behaviors that allow it to thrive in its particular niche within the ecosystem.

Insects are found in virtually every corner of the globe, from the lush rainforests to the harsh deserts. They have colonized a remarkable range of habitats, demonstrating their remarkable adaptability. Some insects, such as the monarch butterfly, undertake incredible migratory journeys, covering thousands of miles in search of the perfect breeding grounds. Others, like the carpenter ant, have developed intricate social structures and division of labor within their colonies.

The study of entomology not only provides insights into the lives of insects but also has far-reaching implications for our own understanding of the natural world. Insects play crucial roles in pollination, decomposition, and the maintenance of healthy ecosystems. Without the tireless efforts of bees, butterflies, and other pollinators, many of the plants we rely on for food and other resources would not exist.

Moreover, the study of insect behavior and physiology has yielded valuable insights that have contributed to advancements in fields such as medicine, engineering, and even computer science. For instance, the intricate compound eyes of dragonflies have inspired the development of advanced imaging technologies, while the problem-solving abilities of ants have informed the design of efficient algorithms for data processing and logistics.

In the realm of education, entomology offers a captivating gateway for students to explore the natural world. By studying the diverse forms and functions of insects, young learners can develop a deeper appreciation for the complexity and interconnectedness of life on our planet. Hands-on activities, such as insect collecting and observation, can foster a sense of wonder and curiosity that can inspire future generations of scientists and nature enthusiasts.

Beyond the academic sphere, entomology also plays a crucial role in addressing pressing global challenges. Insects are increasingly being recognized as a sustainable source of protein, with the potential to alleviate food insecurity in many parts of the world. Additionally, the study of insect-borne diseases, such as malaria and Zika, has led to the development of innovative strategies for disease prevention and control.

In conclusion, the field of entomology is a rich and multifaceted discipline that offers a window into the incredible diversity and complexity of the natural world. From the intricate social structures of ants to the mesmerizing flight patterns of butterflies, the study of insects continues to captivate and inspire people around the globe. As we delve deeper into the realm of entomology, we gain a greater understanding of our own place within the delicate balance of life on Earth.

Adjustment Methods for Non-Convergent Power Flow Calculations in Power Systems

Hong Feng

Abstract: Power flow adjustment is an important part of power system analysis and calculation, and non-convergence adjustment technology is key to raising the automation level of such analysis. The dynamic process by which a power flow calculation passes from convergence to non-convergence is analyzed, and a comprehensive index characterizing a grid at the non-convergence critical point is put forward. On the basis of this index, a new power flow adjustment algorithm is proposed. The method takes generator start-stop and output constraints into account and avoids shedding load during the adjustment. Case studies on the EPRI 36-node system and an actual power system verify the correctness and effectiveness of the algorithm.

Journal: Journal of Electric Power Science and Technology
Year (volume), issue: 2017, 32(3)
Pages: 6 (pp. 57-62)
Keywords: power system; power flow calculation; convergence critical point; power flow adjustment
Author: Hong Feng
Affiliation: Construction Department, Hunan Electric Power Company, Changsha 410004, Hunan, China
Language: Chinese
CLC number: TM74

In practical work, when a power flow fails to converge, it is difficult for engineers to tell whether the case is an ill-conditioned power flow or a power flow that genuinely has no solution.
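The critical-point behavior the abstract refers to can be illustrated with a toy Newton-Raphson power flow. This is a hypothetical 2-bus example for intuition only, not the paper's algorithm: once the demanded power exceeds the maximum transferable power, the equations have no solution and the iteration cannot converge.

```python
import math

def newton_power_flow(p_target, e=1.0, v=1.0, x=0.5, tol=1e-10, max_iter=50):
    """Toy 2-bus power flow: solve (e*v/x)*sin(delta) = p_target for the
    angle delta by Newton-Raphson. Returns (delta, converged). Mirrors how a
    power flow stops converging once loading passes the critical
    (maximum-transfer) point, where the Jacobian becomes singular."""
    delta = 0.0
    for _ in range(max_iter):
        mismatch = p_target - (e * v / x) * math.sin(delta)
        if abs(mismatch) < tol:
            return delta, True
        jac = (e * v / x) * math.cos(delta)   # dP/d(delta)
        if abs(jac) < 1e-12:                  # Jacobian singular at the critical point
            break
        delta += mismatch / jac
    return delta, False

# Maximum transferable power in this toy system is e*v/x = 2.0
print(newton_power_flow(1.5)[1])  # solvable loading -> True
print(newton_power_flow(2.5)[1])  # beyond the critical point -> False
```

Distinguishing "the iteration is ill-conditioned" from "no solution exists" is exactly the practical difficulty the paper's critical-point index is meant to address.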

Mechanical Proving for the Erdős–Szekeres Problem

Mechanical Proving for the Erdős–Szekeres Problem¹

Meijing Shan
Institute of Information Science and Technology, East China University of Political Science and Law, Shanghai, China, 201620

Keywords: Erdős–Szekeres problem, automated deduction, mechanical proving

Abstract: The Erdős–Szekeres problem has been an open problem in computational geometry and related fields since 1935, and many results about it have been published. The main concern of this paper is not only to show how to attack this problem with automated deduction methods and tools, but also to illustrate the significance of automated theorem proving in mathematics using advanced computing technology. The present work contributes toward proving or disproving the conjecture and thereby solving the problem. The key advantage of our method is that it uses mechanical proving instead of a traditional proof, which improves arithmetic efficiency.

Introduction

The following famous problem has attracted the attention of many mathematicians [3, 6, 12, 16] due to its beauty and elementary character. Finding the exact value of N(n) turns out to be a very challenging problem, although the problem itself is easy to state and understand.

The Erdős–Szekeres Problem 1.1 [4, 15]. For any integer n ≥ 3, determine the smallest positive integer N(n) such that any set of at least N(n) points in general position in the plane contains n points that are the vertices of a convex n-gon.

A set of points in the plane is said to be in general position if it contains no three points on a line. The problem was also called the Happy Ending Problem by Erdős, because the two investigators who first worked on it, Esther Klein and George Szekeres, became engaged and subsequently married [8, 17].

The interest of Erdős and Szekeres in this problem was initiated by Esther Klein (later Mrs. Szekeres), who observed that from any 5 points of the plane, no three of which lie on the same straight line, it is always possible to select 4 points determining a convex quadrilateral. There are three distinct types of configurations of five points in the plane, as shown in Figure 1.

Figure 1. Three cases for 5 points.

In each case of Figure 1, one can find at least one convex quadrilateral determined by the points. Klein [4] suggested the following more general problem: can we find, for a given n, a number N(n) such that from any set containing at least N points it is possible to select n points forming a convex polygon?

As observed by Erdős and Szekeres [4], there are two particular questions: (1) does the number N corresponding to n exist? (2) If so, how is the least N(n) determined as a function of n?

¹This work was financially supported by the Humanity and Social Science Youth Foundation of the Ministry of Education of China (No. 14YJCZH020).

They proved the existence of the number N(n) by two different methods. The first uses a combinatorial theorem of Ramsey; the second is based on geometrical and combinatorial considerations. They then formulated the following conjecture.

Conjecture 1.1. N(n) = 2^(n−2) + 1 for all n ≥ 3.

Despite its elementary character and the efforts of many researchers, the Erdős–Szekeres problem has been solved only for the values n = 3, 4, and 5. The case n = 3 is trivial, and n = 4 is due to Klein. The equality N(5) = 9 was proved by E. Makai, while the proof published is by Kalbfleisch [11]; Bonnice [2] and Lovász [13] later independently published much simpler proofs. The bottleneck of the problem now is, for n > 5, how to prove or disprove the conjecture.

For this problem, the best currently known bounds are

    2^(n−2) + 1 ≤ N(n) ≤ C(2n−5, n−2) + 1,    (1)

where C(2n−5, n−2) is a binomial coefficient. The lower bound was obtained by Erdős and Szekeres [4], and the upper bound is due to Tóth and Valtr [18]. The lower bound is supposed to be sharp, according to Conjecture 1.1.

In this paper we use automated deduction methods and tools [1, 19-21] to establish a mechanical method for proving the problem instead of a manual proof. We hope the method may substantially promote progress on this unsolved problem.

The rest of the paper is organized as follows: Section 2, preliminaries; Section 3, main results; Section 4, conclusion and remarks.

Preliminaries

In this section, we present some algorithms that help us develop our method in the next section.

Algorithm 2.1. Modified Cylindrical Algebraic Decomposition (CAD). By the problem statement, the proof has to consider all possible positions of the points in the plane. A Cylindrical Algebraic Decomposition [5, 14] of R^n adapted to a set of polynomials is a partition of R^n into cells (simply connected subsets of R^n) such that each input polynomial has a constant sign on each cell. Basically, the algorithm computes recursively at least one point in each cell (so that one can test the cells that verify a fixed sign condition). The sample points Pi produced by the original CAD may number more than one per cell. We modify the procedure to keep one sample point on each cell of the final cell decomposition, using the rule that Pi has a constant sign on each cell. We elaborate the main idea underlying our method by showing how our main algorithm evolved from the original one. We describe it as follows.

Algorithm MCAD
Input: a set F of polynomials.
Output: an F-sign-invariant CAD of R^n.
Step 1 (Projection). Compute the projection polynomials Q(F), obtaining some (n−1)-variate polynomials.
Step 2 (Recursion). Apply the algorithm recursively to compute a CAD of R^(n−1) which is Q(F)-sign-invariant.
Step 3 (Lifting).
Lift the Q(F)-sign-invariant CAD of R^(n−1) up to an F-sign-invariant CAD of R^n using the auxiliary polynomial Π(F) of degree no larger than d(F) (d is the maximum degree of any polynomial in F).
Step 4 (Choice). Using the strategy that {Pi} has a constant sign on each cell, choose one sample point in each cell.

Algorithm 2.2 (Graham Scan Algorithm) [9, 10, 22]. We present one of the simplest algorithms used to find the convex hull of a set of points; some basic definitions are assumed from computational geometry. The algorithm works in three phases.
Input: a set S of points.
Output: the convex hull of S.
Step 1 (Find an extreme point). The algorithm starts by picking a point in S known to be a vertex of the convex hull: the point with the smallest y coordinate, which is guaranteed to be on the hull. If several points share the smallest y coordinate, choose among them the point with the largest x coordinate. In other words, select the rightmost lowest point as the extreme point.
Step 2 (Sort the points). Having selected the base point P0, sort the other points P in S by the increasing counter-clockwise (ccw) angle that the line segment P0P makes with the x-axis. If two points have the same angle, discard the one closer to P0.
Step 3 (Construct the convex hull). Build the hull by marching around the star-shaped polygon, adding edges on a left turn and back-tracking on a right turn. We end up with a star-shaped polygon (one in which one special point, in this case the pivot, can "see" the whole polygon); see Figure 3.

For efficiency in Step 2, it is important to note that the comparison between two points can be made without actually computing their angles. Computing angles would use slow, inaccurate trigonometric functions, and doing these computations would be a mistake. Instead, one observes that P2 makes a greater angle than P1 if (and only if) P2 lies on the left side of the directed line segment P0P1; see Figure 2.

Figure 2. Sort the points. Figure 3. Graham scan.

We make full use of this algorithm to judge whether the polygon received in every recursive step is a convex polygon or not; it is the decision method in our algorithm. To state the algorithm clearly, we describe it in a style of pseudo-code.

Algorithm: Graham Scan
Input: a set S of points in the plane.
Output: a list containing the vertices of the convex hull.
  Select the rightmost lowest point P0 in S.
  Sort S angularly about P0 as a center; for ties, discard the closer points.
  Let S be the sorted array of points.
  Push S[1] = P0 and P1 onto a stack Ω.
  Let P1 = the top point on Ω.
  Let P2 = the second top point on Ω.
  while S[k] ≠ P0 do
    if S[k−1] is strictly left of the line from S[k] to S[k+1] then
      push S[k−1] onto Ω
    else
      pop the top point S[k] off the stack Ω
    fi
  od

Main Results

In contrast with a traditional proof, the method presented below can exhibit the convex polygons received at every step. The main idea of our algorithm is as follows. First, randomly give four points in general position in the plane, and use polynomials in the points' coordinates to represent the lines. We extend the 4-element set to some 5-element sets by establishing the corresponding Modified Cylindrical Algebraic Decomposition (MCAD) and designing an interactive program which allows the user to choose, among the candidates, one sample point in each cell. We use the Graham scan algorithm to determine whether or not there is a convex 5-gon (convex pentagon) in every set received. If some of the received 5-element sets have no convex 5-gon, they are extended to 6-element sets by the strategy above; simultaneously, we check whether each of the received 6-element sets has a convex hull containing at least a convex 5-gon. We run the program repeatedly until a convex n-gon (n ≥ 5) is found in every set. We trace the process of extension and decision, and then draw the conclusion that N(5) = 9.

To realize this approach, we write the following algorithm, named "conv5". This algorithm can generate short and readable mechanical proofs for the Erdős–Szekeres conjecture in the cases n = 3, 4, 5. It consists of two main algorithms (the Modified Cylindrical Algebraic Decomposition algorithm and the Graham scan algorithm) and some sub-algorithms such as collinear, pol, sam, ponlist, min0, isleft, ord, conhull, point5, convex, G5, G6, and Pmn.

Algorithm Conv5
Input: four points in general position in the plane.
Output: confirmation that any set of at least 9 points in general position in the plane contains a convex 5-gon.
Step 1 [collinear]. Write the line polynomials through the given four points (base points).
Step 2 [pol, sam, ponlist]. Use the CAD to find sample points in the cells built by the lines, and from the n base points obtain (n+1)-element sets.
Step 3 [min0, isleft, conhull, point5, convex, G5]. Decide whether there is a convex hull or a convex n-gon (n ≥ 5) in every set; if so, stop; else go to Step 4.
Step 4 [G6, Pmn]. Deal with the sets which have no convex n-gon (n ≥ 5). Recursively apply Steps 2 and 3 until there is a convex 5-gon in every set. End Conv5.

The key techniques of the algorithm are as follows:
1. To reduce the complexity of the computation and increase efficiency, when checking whether there is a convex 5-gon in a given point set, we use the following strategy:
   if there is a convex hull with at least 5 points in the point set, then pop this point set;
   elif there is a convex 5-gon, then pop this point set;
   else report "there is no convex 5-gon in this point set" and go to the next step.
2. Each convex n-gon (n ≥ 5) contains a convex 5-gon.
3. If there is no convex 4-gon, then there can be no convex 5-gon.

Conclusion and Remarks

With the Maple procedure we have implemented the mechanical method for the conjecture in certain cases. By tracing the whole computational process, we obtain a definite answer: any set of at least 9 points in general position in the plane contains a convex 5-gon. The method generalizes in an obvious way to arbitrary base points in the plane.

On one hand, the mechanical method proposed here provides a promising direction for proving or disproving Conjecture 1.1 (for n ≥ 6), and even for handling other unsolved problems in computational geometry. On the other hand, it gives an especially useful application of computer algebra and automated deduction. For further investigation, we consider the following problems:
1. Does every set of at least 17 points in general position in the plane contain 6 points which are the vertices of a convex hexagon? Can we prove the existence of N(6) and prove or disprove the corresponding conclusion by mechanical proving? The best known conclusion at present is N(6) ≥ 27, if N(6) exists.
2. Erdős posed a similar problem on empty convex polygons. Can we give an automated proof of this problem as well?

References
[1] M. de Berg, M. van Kreveld, M. Overmars and O. Schwarzkopf, Computational Geometry: Algorithms and Applications, 2nd ed., Springer-Verlag, Berlin, Heidelberg, New York, 1997.
[2] W. E. Bonnice, On convex polygons determined by a finite planar set, Amer. Math. Monthly.
[3] F. R. K. Chung and R. L. Graham, Forced convex n-gons in the plane, Discr. Comput. Geom. 19 (1998), 367-371.
[4] P. Erdős and G. Szekeres, A combinatorial problem in geometry, Compositio Mathematica 2 (1935), 463-470.
[5] G. E. Collins, Quantifier elimination for the elementary theory of real closed fields by cylindrical algebraic decomposition, Lecture Notes in Computer Science, vol. 33, Springer-Verlag, Berlin, pp. 134-183.
[6] Gyula Károlyi, An Erdős–Szekeres type problem in the plane.
[7] X. R. Hou and Z. B. Zeng, An efficient algorithm for finding sample points of algebraic decomposition of the plane, Computer Application, 1997 (in Chinese).
[8] P. Hoffman, The Man Who Loved Only Numbers, Hyperion, New York, 1998.
[9] /ah/alganim/version0/Graham.html.
[10] http://cgm.cs.mcgill.ca/beezer/cs507/main.html.
[11] J. D. Kalbfleisch, J. G. Kalbfleisch and R. G. Stanton, A combinatorial problem on convex regions, Proc. Louisiana Conf. Combinatorics, Graph Theory and Computing, Louisiana State Univ., Baton Rouge, La., Congr. Numer. 1 (1970), 180-188.
[12] D. Kleitman and L. Pachter, Finding convex sets among points in the plane, Discr. Comput. Geom. 19 (1998), 405-410.
[13] L. Lovász, Combinatorial Problems and Exercises, North-Holland, Amsterdam, 1979.
[14] Bhubaneswar Mishra, Algorithmic Algebra, Springer-Verlag, 2001.
[15] W. Morris and V. Soltan, The Erdős–Szekeres problem on points in convex position: a survey, Bulletin of the American Mathematical Society, vol. 37.
[16] Ronald L. Graham and Frances Yao, Finding the convex hull of a simple polygon, Report No. STAN-CS-81-887, 1998.
[17] B. Schechter, My Brain Is Open, Simon & Schuster, New York, 1998.
[18] G. Tóth and P. Valtr, Note on the Erdős–Szekeres theorem, Discr. Comput. Geom. 19 (1998), 457-459.
[19] W. T. Wu, Basic Principles of Mechanical Theorem Proving in Geometries, Science Press, Beijing, 1984 (part on elementary geometries, in Chinese).
[20] L. Yang and B. C. Xia, Automated Deduction in Real Geometry, Geometric Computation, World Scientific, 2004.
[21] L. Yang, Z. Z. Jing and X. R. Hou, Nonlinear Algebraic Equation System and Automated Theorem Proving, Shanghai Scientific and Technological Education Publishing House, Shanghai, 1996 (in Chinese).
[22] P. D. Zhou, Computational Geometry: Design and Analysis, Tsinghua University Press, 2005 (in Chinese).
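As a runnable companion to the pseudo-code of Algorithm 2.2 above, the Graham scan can be sketched in Python. This is a minimal illustrative version, not the paper's Maple code; for brevity it sorts by polar angle with atan2, whereas the paper recommends replacing the angle comparison by a cross-product test (the left-turn test in the loop below does use cross products).

```python
import math

def graham_scan(points):
    """Return the convex hull vertices of `points`, counter-clockwise,
    starting from the rightmost lowest point (the pivot of Algorithm 2.2)."""
    # Step 1: rightmost lowest point as the pivot P0
    p0 = min(points, key=lambda p: (p[1], -p[0]))

    def cross(o, a, b):
        # > 0 iff o -> a -> b makes a strict left (ccw) turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    # Step 2: sort the remaining points by ccw angle about P0 (ties: closer first,
    # so the closer duplicate gets popped by the scan below)
    rest = sorted((p for p in points if p != p0),
                  key=lambda p: (math.atan2(p[1] - p0[1], p[0] - p0[0]),
                                 (p[0] - p0[0]) ** 2 + (p[1] - p0[1]) ** 2))

    # Step 3: march around, popping on non-left turns
    hull = [p0]
    for p in rest:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull

square = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(graham_scan(square))  # the interior point (1, 1) is discarded
```

In the Conv5 procedure this convexity decision is exactly the role played by the isleft/conhull sub-algorithms: a candidate point set contains a convex 5-gon when its hull (or a sub-hull) has at least 5 vertices.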

Literature: Selected Recommendations

Profile of Xu Shengyuan: Xu Shengyuan, male, is a professor, Ph.D., and doctoral supervisor at the School of Automation, Nanjing University of Science and Technology.

He received his Ph.D. in Control Theory and Control Engineering from Nanjing University of Science and Technology.

Research interests: 1. robust control and filtering; 2. singular (descriptor) systems; 3. nonlinear systems.

2017 (SCI):
1. Relaxed conditions for stability of time-varying delay systems, TH Lee, HP Ju, S Xu, Automatica, 2017, 75:11-15.

2017 (EI):
1. Relaxed conditions for stability of time-varying delay systems, TH Lee, HP Ju, S Xu, Automatica, 2017, 75:11-15.
2. Adaptive tracking control for uncertain switched stochastic nonlinear pure-feedback systems with unknown backlash-like hysteresis, G Cui, S Xu, B Zhang, J Lu, Z Li, et al., Journal of the Franklin Institute, 2017.

2016 (SCI):
1. Finite-time output feedback control for a class of stochastic low-order nonlinear systems, L Liu, S Xu, Y Zhang, International Journal of Control, 2016:1-16.
2. Universal adaptive control of feedforward nonlinear systems with unknown input and state delays, X Jia, S Xu, Q Ma, Y Li, Y Chu, International Journal of Control, 2016, 89(11):1-19.
3. Robust adaptive control of strict-feedback nonlinear systems with unmodeled dynamics and time-varying delays, X Shi, S Xu, Y Li, W Chen, Y Chu, International Journal of Control, 2016:1-18.
4. Stabilization of hybrid neutral stochastic differential delay equations by delay feedback control, W Chen, S Xu, Y Zou, Systems & Control Letters, 2016, 88(1):1-13.
5. Multi-agent zero-sum differential graphical games for disturbance rejection in distributed control, Q Jiao, H Modares, S Xu, FL Lewis, KG Vamvoudakis, Automatica, 2016, 69(C):24-34.
6. Semiactive inerter and its application in adaptive tuned vibration absorbers, Y Hu, MZQ Chen, S Xu, Y Liu, IEEE Transactions on Control Systems Technology, 2016:1-7.
7. Decentralised adaptive output feedback stabilisation for stochastic time-delay systems via LaSalle-Yoshizawa-type theorem, T Jiao, S Xu, J Lu, Y Wei, Y Zou, International Journal of Control, 2016, 89(1):69-83.
8. Coverage control for heterogeneous mobile sensor networks on a circle, C Song, L Liu, G Feng, S Xu, Automatica, 2016, 63(3):349-358.

2016 (EI):
1. Finite-time output feedback control for a class of stochastic low-order nonlinear systems, L Liu, S Xu, Y Zhang, International Journal of Control, 2016:1-16.
2. Unified filters design for singular Markovian jump systems with time-varying delays, G Zhuang, S Xu, B Zhang, J Xia, Y Chu, et al., Journal of the Franklin Institute, 2016, 353(15):3739-3768.
3. Improvement on stability conditions for continuous-time T–S fuzzy systems, J Chen, S Xu, Y Li, Z Qi, Y Chu, Journal of the Franklin Institute, 2016, 353(10):2218-2236.
4. Universal adaptive control of feedforward nonlinear systems with unknown input and state delays, X Jia, S Xu, Q Ma, Y Li, Y Chu, International Journal of Control, 2016, 89(11):1-19.
5. H∞ control with transients for singular systems, Z Feng, J Lam, S Xu, S Zhou, Asian Journal of Control, 2016, 18(3):817-827.

2015 (SCI):
1. Pinning control for cluster synchronisation of complex dynamical networks with semi-Markovian jump topology, TH Lee, Q Ma, S Xu, HP Ju, International Journal of Control, 2015, 88(6):1223-1235.
2. Anti-disturbance control for nonlinear systems subject to input saturation via disturbance observer, Y Wei, WX Zheng, S Xu, Systems & Control Letters, 2015, 85:61-69.
3. Exact tracking control of nonlinear systems with time delays and dead-zone input, Z Zhang, S Xu, B Zhang, Automatica, 2015, 52(52):272-276.

2015 (EI):
1. Further studies on stability and stabilization conditions for discrete-time T–S systems with the order relation information of membership functions, J Chen, S Xu, Y Li, Y Chu, Y Zou, Journal of the Franklin Institute, 2015, 352(12):5796-5809.
2. Stability analysis of random systems with Markovian switching and its application, T Jiao, J Lu, Y Li, Y Chu, S Xu, Journal of the Franklin Institute, 2015, 353(1):200-220.
3. Exact tracking control of nonlinear systems with time delays and dead-zone input, Z Zhang, S Xu, B Zhang, Automatica, 2015, 52(52):272-276.
4. Event-triggered average consensus for multi-agent systems with nonlinear dynamics and switching topology, D Xie, S Xu, Y Chu, Y Zou, Journal of the Franklin Institute, 2015, 352(3):1080-1098.

Profile of Ge Shuzhi: Ge Shuzhi, male, Han ethnicity, was born on 20 September 1963 in Gejiapengwang Village, Jingzhi, Anqiu County, Shandong Province.

Modified PSO algorithm for solving planar graph coloring problem

4.3.1. Instance 1
4.3.2. Instance 2
4.3.3. Instance 3
5. Inter-cluster load balancing through self-organizing cluster approach
5.1. Performance evaluation
6.3.2. Liveness property
6.3.3. Deadlock
7. Conclusion
References
Research highlights
- A hybrid load balancing (HLB) approach in trusted clusters is proposed for HPC.
- HLB reduces network traffic by 80%–90% and increases CPU utilization by 40%–50%.
- The AWT and MRT of remote processes are reduced by 13%–26% using ReJAM.
- The stability analysis of JMM using PA ensures the finite sequences of transitions.
- On the basis of these properties, the JM model has been proved safe and reliable.

Evaluating the Effectiveness of Randomized SVD Algorithms in 3D Modeling


Randomized singular value decomposition (rSVD) is an efficient matrix factorization method that has in recent years been widely applied in the field of 3D modeling.

This paper evaluates the effectiveness of randomized SVD when applied to 3D modeling.

1. Introduction

3D modeling is an important research direction in computer graphics, with wide applications in film, games, virtual reality, and related fields.

In 3D modeling, large volumes of 3D point-cloud data frequently need to be processed and analyzed.

Because randomized SVD can factorize large-scale matrices efficiently, it has broad application prospects in 3D modeling.

2. The randomized SVD algorithm

Randomized SVD is a matrix factorization method based on sampling and iteration.

It constructs a low-rank approximation of the original matrix by random sampling, and then computes the singular value decomposition of that approximation.

Compared with the classical SVD algorithm, randomized SVD has lower computational complexity and faster running speed.
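The sampling-and-factorization scheme just described can be sketched in a few lines of NumPy. This is a generic Gaussian-sketch randomized SVD (with illustrative choices of rank, oversampling, and power-iteration count), not necessarily the exact variant evaluated in this paper:

```python
import numpy as np

def randomized_svd(A, k, oversample=10, n_power=2, seed=0):
    """Rank-k randomized SVD: sample the range of A, then factor a small matrix."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    omega = rng.standard_normal((n, k + oversample))  # random test matrix
    Y = A @ omega                                     # sketch of range(A)
    for _ in range(n_power):                          # power iterations sharpen
        Y = A @ (A.T @ Y)                             # the leading subspace
    Q, _ = np.linalg.qr(Y)                            # orthonormal range basis
    B = Q.T @ A                                       # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]

# Denoise a synthetic planar "point cloud" (rank-2 structure plus noise).
rng = np.random.default_rng(1)
plane = rng.standard_normal((1000, 2)) @ rng.standard_normal((2, 3))
cloud = plane + 0.01 * rng.standard_normal(plane.shape)
U, s, Vt = randomized_svd(cloud, k=2)
denoised = (U * s) @ Vt   # rank-2 reconstruction keeps the planar structure
```

With a couple of power iterations the sketch basis aligns closely with the leading singular subspace, which is what makes the low-rank reconstruction usable for denoising point-cloud data.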

3. Applications of randomized SVD in 3D modeling

One of the most common data representations in 3D modeling is the 3D point cloud.

Randomized SVD can reduce the dimensionality of, and fit, 3D point-cloud data, enabling rapid construction of 3D models.

By mapping point-cloud data into a low-dimensional space, randomized SVD extracts the principal features of a 3D model while removing noise and redundant information.

4. Experimental design and analysis of results

To evaluate the effectiveness of randomized SVD in 3D modeling, we designed experiments comparing its performance against the classical SVD algorithm.

The experiments used 3D point-cloud datasets of different sizes, each processed with both randomized SVD and classical SVD.

The results show that randomized SVD outperforms classical SVD in both running speed and dimensionality-reduction quality, enabling faster modeling and analysis of 3D models.

5. Case studies

In addition to the experimental evaluation, this paper analyzes the concrete effectiveness of randomized SVD in 3D modeling through application cases.

By processing 3D point-cloud data from real scenes, we demonstrate the potential and advantages of randomized SVD for building and analyzing 3D models.

MOEA/D


The Performance of a New Version of MOEA/D on CEC09 Unconstrained MOP Test Instances

Qingfu Zhang, Wudong Liu and Hui Li

Abstract—This paper describes the idea of MOEA/D and proposes a strategy for allocating the computational resource to different subproblems in MOEA/D. The new version of MOEA/D has been tested on all the CEC09 unconstrained MOP test instances.

Index Terms—MOEA/D, Test problems, Multiobjective optimization.

I. INTRODUCTION

A multiobjective optimization problem (MOP) can be stated as follows:

    minimize F(x) = (f1(x), ..., fm(x))^T    (1)
    subject to x ∈ Ω

where Ω is the decision (variable) space, F : Ω → R^m consists of m real-valued objective functions, and R^m is called the objective space. If x ∈ R^n, all the objectives are continuous and Ω is described by Ω = {x ∈ R^n | hj(x) ≤ 0, j = 1, ..., k}, where the hj are continuous functions, we call (1) a continuous MOP.

Very often, since the objectives in (1) contradict one another, no point in Ω can minimize all the objectives simultaneously. One has to balance them. The best tradeoffs among the objectives can be defined by Pareto optimality. Let u, v ∈ R^m; u is said to dominate v if and only if ui ≤ vi for every i ∈ {1, ..., m} and uj < vj for at least one index j ∈ {1, ..., m}. A point x* ∈ Ω is Pareto optimal if there is no point x ∈ Ω such that F(x) dominates F(x*). F(x*) is then called a Pareto optimal (objective) vector. In other words, any improvement in a Pareto optimal point in one objective must lead to deterioration in at least one other objective. The set of all the Pareto optimal points is called the Pareto set (PS) and the set of all the Pareto optimal objective vectors is the Pareto front (PF) [1].

Recent years have witnessed significant progress in the development of evolutionary algorithms (EAs) for dealing with MOPs. Multiobjective evolutionary algorithms (MOEAs) aim at finding a set of representative Pareto optimal solutions in a single run. Most MOEAs are Pareto dominance based; they adopt single objective evolutionary algorithm frameworks, and the fitness of each solution at each
generation is mainly determined by its Pareto dominance relations with other solutions in the population. NSGA-II [2], SPEA-II [3] and PAES [4] are among the most popular Pareto dominance based MOEAs.

(Q. Zhang and W. Liu are with the School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, U.K., {qzhang,wliui}@. H. Li is with the Department of Computer Science, University of Nottingham, Nottingham, NG8 1BB, U.K.)

A Pareto optimal solution to an MOP could be an optimal solution of a single objective optimization problem in which the objective is a linear or nonlinear aggregation function of all the individual objectives. Therefore, approximation of the PF can be decomposed into a number of single objective optimization problems. Some MOEAs such as MOGLS [5]-[7] and MSOPS [8] adopt this idea to some extent. MOEA/D [9] (MultiObjective Evolutionary Algorithm based on Decomposition) is a very recent evolutionary algorithm for multiobjective optimization using the decomposition idea. It has been applied for solving a number of multiobjective optimization problems [10]-[14].

The rest of this paper is organized as follows. Section II introduces a new version of MOEA/D. Section III presents experimental results of MOEA/D on the 13 unconstrained MOP test instances for the CEC 2009 MOEA competition [15]. Section IV concludes the paper.

II. MOEA/D

MOEA/D requires a decomposition approach for converting the problem of approximation of the PF into a number of scalar optimization problems. In this paper, we use the Tchebycheff approach.

A. Tchebycheff Approach [1]

In this approach, the scalar optimization problems are of the form

    minimize g^te(x | λ, z*) = max_{1≤i≤m} { λi |fi(x) − z*i| }    (2)
    subject to x ∈ Ω

where z* = (z*1, ..., z*m)^T is the reference point, i.e.
z*i = min{ fi(x) | x ∈ Ω } for each i = 1, ..., m. Under some mild conditions [1], for each Pareto optimal point x* there exists a weight vector λ such that x* is the optimal solution of (2), and each optimal solution of (2) is a Pareto optimal solution of (1). Therefore, one is able to obtain different Pareto optimal solutions by solving a set of single objective optimization problems defined by the Tchebycheff approach with different weight vectors.

B. MOEA/D with Dynamical Resource Allocation

Let λ1, ..., λN be a set of evenly spread weight vectors and z* be the reference point. As shown in Section II, the problem of approximation of the PF of (1) can be decomposed into N scalar optimization subproblems, and the objective function of the j-th subproblem is:

    g^te(x | λj, z*) = max_{1≤i≤m} { λji |fi(x) − z*i| }    (3)

where λj = (λj1, ..., λjm)^T and j = 1, ..., N.

MOEA/D minimizes all these N objective functions simultaneously in a single run. Neighborhood relations among these single objective subproblems are defined based on the distances among their weight vectors. Each subproblem is optimized by using information mainly from its neighboring subproblems. In the versions of MOEA/D proposed in [9] and [10], all the subproblems are treated equally; each of them receives about the same amount of computational effort. These subproblems, however, may have different computational difficulties; therefore, it is very reasonable to assign different amounts of computational effort to different problems. In MOEA/D with Dynamical Resource Allocation (MOEA/D-DRA), the version of MOEA/D proposed in this paper, we define and compute a utility πi for each subproblem i. Computational efforts are distributed to these subproblems based on their utilities.

During the search, MOEA/D-DRA with the Tchebycheff approach maintains:
- a population of N points x1, ..., xN ∈ Ω, where xi is the current solution to the i-th subproblem;
- FV1, ..., FVN, where FVi is the F-value of xi, i.e. FVi = F(xi) for each i = 1, ..., N;
- z = (z1, ..., zm)^T, where zi is the best (lowest) value found so far for objective fi;
- π1, ..., πN, where πi is the utility of subproblem i;
- gen: the current generation number.

The algorithm works as follows.

Input:
- MOP (1);
- a stopping criterion;
- N: the number of the subproblems considered in MOEA/D;
- a uniform spread of N weight vectors: λ1, ..., λN;
- T: the number of the weight vectors in the neighborhood of each weight vector.

Output: {x1, ..., xN} and {F(x1), ..., F(xN)}

Step 1 Initialization
  Step 1.1 Compute the Euclidean distances between any two weight vectors and then find the T closest weight vectors to each weight vector. For each i = 1, ..., N, set B(i) = {i1, ..., iT}, where λi1, ..., λiT are the T closest weight vectors to λi.
  Step 1.2 Generate an initial population x1, ..., xN by uniformly randomly sampling from the search space.
  Step 1.3 Initialize z = (z1, ..., zm)^T by setting zi = min{fi(x1), fi(x2), ..., fi(xN)}.
  Step 1.4 Set gen = 0 and πi = 1 for all i = 1, ..., N.

Step 2 Selection of Subproblems for Search: the indexes of the subproblems whose objectives are MOP individual objectives fi are selected to form the initial I. By using 10-tournament selection based on πi, select another [N/5] − m indexes and add them to I.

Step 3 For each i ∈ I, do:
  Step 3.1 Selection of Mating/Update Range: Uniformly randomly generate a number rand from (0,1). Then set P = B(i) if rand < δ, and P = {1, ..., N} otherwise.
  Step 3.2 Reproduction: Set r1 = i and randomly select two indexes r2 and r3 from P, then generate a solution ȳ from xr1, xr2 and xr3 by a DE operator, and then perform a mutation operator on ȳ with probability pm to produce a new solution y.
  Step 3.3 Repair: If an element of y is out of the boundary of Ω, its value is reset to a randomly selected value inside the boundary.
  Step 3.4 Update of z: For each j = 1, ..., m, if zj > fj(y), then set zj = fj(y).
  Step 3.5 Update of Solutions: Set c = 0 and then do the following:
    (1) If c = nr or P is empty, go to Step 4. Otherwise, randomly pick an index j from P.
    (2) If g(y | λj, z) ≤ g(xj | λj, z), then set xj = y, FVj = F(y) and c = c + 1.
    (3) Delete j from P and go to (1).

Step 4 Stopping Criteria: If the stopping criterion is satisfied, then stop and output {x1, ..., xN} and {F(x1), ..., F(xN)}.

Step 5 gen = gen + 1. If gen is a multiple of 50, then compute Δi, the relative decrease of the objective for each subproblem i during the last 50 generations, and update

    πi = 1 if Δi > 0.001;  πi = (0.95 + 0.05 Δi / 0.001) πi otherwise.

Go to Step 2.

In the 10-tournament selection in Step 2, the index with the highest πi value among 10 uniformly randomly selected indexes is chosen to enter I. We should do this selection [N/5] − m times. In Step 5, the relative decrease is defined as

    (old function value − new function value) / old function value

If Δi is smaller than 0.001, the value of πi will be reduced.

In the DE operator used in Step 3.2, each element ȳk in ȳ = (ȳ1, ..., ȳn)^T is generated as follows:

    ȳk = xr1,k + F × (xr2,k − xr3,k) with probability CR;  ȳk = xr1,k with probability 1 − CR    (4)

where CR and F are two control parameters. The mutation operator in Step 3.2 generates y = (y1, ..., yn)^T from ȳ in the following way:

    yk = ȳk + σk × (bk − ak) with probability pm;  yk = ȳk with probability 1 − pm    (5)

with

    σk = (2 × rand)^(1/(η+1)) − 1 if rand < 0.5;  σk = 1 − (2 − 2 × rand)^(1/(η+1)) otherwise,

where rand is a uniformly random number from [0,1]. The distribution index η and the mutation rate pm are two control parameters. ak and bk are the lower and upper bounds of the k-th decision variable, respectively.

III. EXPERIMENTAL RESULTS

MOEA/D has been tested on all the 13 unconstrained test instances in CEC 2009 [15]. The parameter settings are as follows:
- N: 600 for two objectives, 1000 for three objectives, and 1500 for five objectives;
- T = 0.1N and nr = 0.01N;
- δ = 0.9;
- in the DE and mutation operators: CR = 1.0, F = 0.5, η = 20 and pm = 1/n;
- stopping condition: the algorithm stops after 300,000 function evaluations for each test instance.

A set of N weight vectors W is generated as follows:
1) Uniformly randomly generate 5,000 weight vectors to form the set W1. W is initialized as the set containing all the weight vectors (1,0,...,0,0), (0,1,...,0,0), ..., (0,0,...,0,1).
2) Find the weight vector in W1 with the largest distance to W, add it to W and delete
it from W1.
3) If the size of W is N, stop and return W. Otherwise, go to 2).

In calculating the IGD values, 100 nondominated solutions selected from each final population were used in the case of two objectives, 150 in the case of three objectives and 800 in the case of five objectives. The final solution set A, selected from the output O = {F(x1), ..., F(xN)} of the algorithm, is obtained as follows:
- For the instances with two objectives, the set of 100 final solutions consists of the best solutions in O for the subproblems with weights (0,1), (1/99, 98/99), ..., (98/99, 1/99), (1,0).
- For the instances with more than two objectives:
  1) Randomly select an element e from O and set O1 = O \ {e} and A = {e}.
  2) Find the element in O1 with the largest distance to A, delete it from O1 and add it to A.
  3) If the size of A is 150 for three objectives or 800 for five objectives, stop. Otherwise, go to 2).

The experiments were performed on a 1.86 GHz Intel PC with 2 GB RAM. The programming languages are MATLAB and C++. The IGD values are listed in Table I.

TABLE I: THE IGD STATISTICS BASED ON 30 INDEPENDENT RUNS

Test Instances   Mean       Std       Best      Worst
UF01             0.00435    0.00029   0.00399   0.00519
UF02             0.00679    0.00182   0.00481   0.01087
UF03             0.00742    0.00589   0.00394   0.02433
UF04             0.06385    0.00534   0.05687   0.08135
UF05             0.18071    0.06811   0.08028   0.30621
UF06             0.00587    0.00171   0.00342   0.01005
UF07             0.00444    0.00117   0.00405   0.01058
UF08             0.05840    0.00321   0.05071   0.06556
UF09             0.07896    0.05316   0.03504   0.14985
UF10             0.47415    0.07360   0.36405   0.64948
R2DTLZ2M5        0.11032    0.00233   0.10692   0.11519
R2DTLZ3M5        146.7813   41.8281   66.1690   214.2261
WFG1M5           1.8489     0.0198    1.8346    1.8993

The distributions of the final populations with the lowest IGD values among the 30 runs for the first 10 test instances are plotted in Figures 1-10. For the seven biobjective instances, MOEA/D found good approximations to UF1, UF2, UF3, UF6 and UF7 but performed poorly on UF4 and UF5. For the three 3-objective instances, MOEA/D had better performance on UF8 than on the other two. For the three 5-objective instances, the IGD value found by MOEA/D on
R2-DTLZ3-M5 was very large while those on R2-DTLZ2-M5 and WFG1-M5 were smaller.

IV. CONCLUSION

This paper described the basic idea and framework of MOEA/D. A dynamic computational resource allocation strategy was proposed. It was tested on the 13 unconstrained instances for the CEC09 algorithm competition. The source code of the algorithm can be obtained from its authors.

Fig. 1. The best approximation to UF1
Fig. 2. The best approximation to UF2
Fig. 3. The best approximation to UF3
Fig. 4. The best approximation to UF4
Fig. 5. The best approximation to UF5
Fig. 7. The best approximation to UF7
Fig. 10. The best approximation to UF10

REFERENCES

[1] K. Miettinen, Nonlinear Multiobjective Optimization. Kluwer Academic Publishers, 1999.
[2] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, "A fast and elitist multiobjective genetic algorithm: NSGA-II," IEEE Trans. Evolutionary Computation, vol. 6, no. 2, pp. 182-197, 2002.
[3] E. Zitzler, M. Laumanns, and L. Thiele, "SPEA2: Improving the strength Pareto evolutionary algorithm for multiobjective optimization," in Evolutionary Methods for Design Optimization and Control with Applications to Industrial Problems, K. C. Giannakoglou, D. T. Tsahalis, J. Periaux, K. D. Papailiou, and T. Fogarty, Eds., Athens, Greece, 2001, pp. 95-100.
[4] J. D. Knowles and D. W. Corne, "The Pareto archived evolution strategy: A new baseline algorithm for multiobjective optimisation," in Proc. of Congress on Evolutionary Computation (CEC'99), Washington D.C., 1999, pp. 98-105.
[5] H. Ishibuchi and T. Murata, "Multiobjective genetic local search algorithm and its application to flowshop scheduling," IEEE Transactions on Systems, Man and Cybernetics, vol. 28, no. 3, pp. 392-403, 1998.
[6] A. Jaszkiewicz, "On the performance of multiple-objective genetic local search on the 0/1 knapsack problem - a comparative experiment," IEEE Trans. Evolutionary Computation, vol. 6, no. 4, pp. 402-412, Aug. 2002.
[7] H. Ishibuchi, T. Yoshida, and T. Murata, "Balance between genetic search and local search in memetic algorithms for multiobjective permutation flowshop scheduling," IEEE Trans. Evolutionary Computation, vol. 7, no. 2, pp. 204-223, Apr. 2003.
[8] pp. 2678-2684.
[9] Q. Zhang and H. Li, "MOEA/D: A multiobjective evolutionary algorithm based on decomposition," IEEE Transactions on Evolutionary Computation, vol. 11, no. 6, pp. 712-731, 2007.
[10] H. Li and Q. Zhang, "Multiobjective optimization problems with complicated Pareto set, MOEA/D and NSGA-II," IEEE Transactions on Evolutionary Computation, 2009, in press.
[11] P. C. Chang, S. H. Chen, Q. Zhang, and J. L. Lin, "MOEA/D for flowshop scheduling problems," in Proc. of Congress on Evolutionary Computation (CEC'08), Hong Kong, 2008, pp. 1433-1438.
[12] W. Peng, Q. Zhang, and H. Li, "Comparison between MOEA/D and NSGA-II on the multi-objective travelling salesman problem," in Multi-Objective Memetic Algorithms, ser. Studies in Computational Intelligence, C.-K. Goh, Y.-S. Ong, and K. C. Tan, Eds. Heidelberg, Berlin: Springer, 2009, vol. 171.
[13] Q. Zhang, W. Liu, E. Tsang, and B. Virginas, "Expensive multiobjective optimization by MOEA/D with Gaussian process model," Technical Report CES-489, the School of Computer Science and Electronic Engineering, University of Essex, 2009.
[14] H. Ishibuchi, Y. Sakane, N. Tsukamoto, and Y. Nojima, "Adaptation of scalarizing functions in MOEA/D: An adaptive scalarizing function-based multiobjective evolutionary algorithm," in Proc. of the 5th International Conference devoted to Evolutionary Multi-Criterion Optimization (EMO'09), Nantes, France, Apr. 2009.
[15] Q. Zhang, A. Zhou, S. Zhao, P. N. Suganthan, W. Liu, and S. Tiwari, "Multiobjective optimization test instances for the CEC 2009 special session and competition," Technical Report CES-487, The School of Computer Science and Electronic Engineering, University of Essex, 2008.
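The core machinery of the algorithm above (the Tchebycheff scalarization of Eq. (2), the DE variation of Eq. (4), the boundary repair of Step 3.3, and the neighbourhood replacement of Step 3.5) can be condensed into a short sketch. This is a deliberately stripped-down illustration on a toy biobjective problem (a Schaffer-style function, not a CEC09 instance); it omits the dynamic resource allocation of Step 5, the polynomial mutation of Eq. (5), and the replacement cap nr:

```python
import numpy as np

rng = np.random.default_rng(0)

def F(x):
    """Toy biobjective problem (illustrative, not from the paper)."""
    return np.array([x[0] ** 2, (x[0] - 2.0) ** 2])

def g_te(fx, lam, z):
    """Tchebycheff aggregation g^te(x | lambda, z*) from Eq. (2)."""
    return np.max(lam * np.abs(fx - z))

N, T, n_gen = 21, 5, 100
w = np.linspace(0.0, 1.0, N)
lams = np.clip(np.stack([w, 1.0 - w], axis=1), 1e-6, None)
# B(i): indexes of the T closest weight vectors (Step 1.1).
B = np.argsort(((lams[:, None] - lams[None]) ** 2).sum(-1), axis=1)[:, :T]
X = rng.uniform(-1.0, 3.0, (N, 1))         # Step 1.2: initial population
FV = np.array([F(x) for x in X])
z = FV.min(axis=0)                         # Step 1.3: reference point z*

for _ in range(n_gen):
    for i in range(N):
        r2, r3 = rng.choice(B[i], 2, replace=False)
        y = X[i] + 0.5 * (X[r2] - X[r3])   # DE variation, Eq. (4), CR = 1
        y = np.clip(y, -1.0, 3.0)          # boundary repair, Step 3.3
        fy = F(y)
        z = np.minimum(z, fy)              # Step 3.4: update z
        for j in B[i]:                     # Step 3.5: replace worse neighbours
            if g_te(fy, lams[j], z) <= g_te(FV[j], lams[j], z):
                X[j], FV[j] = y, fy
```

After the loop, the population spreads along the trade-off between the two objectives, one solution per weight vector, which is the decomposition idea in miniature.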


Algorithms for Non-negative Matrix Factorization


Daniel D. Lee
Bell Laboratories, Lucent Technologies
Murray Hill, NJ 07974

H. Sebastian Seung
Dept. of Brain and Cog. Sci.
Massachusetts Institute of Technology
Cambridge, MA 02138

Abstract

Non-negative matrix factorization (NMF) has previously been shown to be a useful decomposition for multivariate data. Two different multiplicative algorithms for NMF are analyzed. They differ only slightly in the multiplicative factor used in the update rules. One algorithm can be shown to minimize the conventional least squares error while the other minimizes the generalized Kullback-Leibler divergence. The monotonic convergence of both algorithms can be proven using an auxiliary function analogous to that used for proving convergence of the Expectation-Maximization algorithm. The algorithms can also be interpreted as diagonally rescaled gradient descent, where the rescaling factor is optimally chosen to ensure convergence.

1 Introduction

Unsupervised learning algorithms such as principal components analysis and vector quantization can be understood as factorizing a data matrix subject to different constraints. Depending upon the constraints utilized, the resulting factors can be shown to have very different representational properties. Principal components analysis enforces only a weak orthogonality constraint, resulting in a very distributed representation that uses cancellations to generate variability [1,2]. On the other hand, vector quantization uses a hard winner-take-all constraint that results in clustering the data into mutually exclusive prototypes [3].
We have previously shown that nonnegativity is a useful constraint for matrix factorization that can learn a parts representation of the data [4,5]. The nonnegative basis vectors that are learned are used in distributed, yet still sparse combinations to generate expressiveness in the reconstructions [6,7]. In this submission, we analyze in detail two numerical algorithms for learning the optimal nonnegative factors from data.

2 Non-negative matrix factorization

We formally consider algorithms for solving the following problem:

Non-negative matrix factorization (NMF): Given a non-negative matrix V, find non-negative matrix factors W and H such that:

    V ≈ WH    (1)

NMF can be applied to the statistical analysis of multivariate data in the following manner. Given a set of multivariate n-dimensional data vectors, the vectors are placed in the columns of an n × m matrix V, where m is the number of examples in the data set. This matrix is then approximately factorized into an n × r matrix W and an r × m matrix H. Usually r is chosen to be smaller than n or m, so that W and H are smaller than the original matrix V. This results in a compressed version of the original data matrix.

What is the significance of the approximation in Eq. (1)? It can be rewritten column by column as v ≈ Wh, where v and h are the corresponding columns of V and H. In other words, each data vector v is approximated by a linear combination of the columns of W, weighted by the components of h. Therefore W can be regarded as containing a basis that is optimized for the linear approximation of the data in V. Since relatively few basis vectors are used to represent many data vectors, good approximation can only be achieved if the basis vectors discover structure that is latent in the data.

The present submission is not about applications of NMF, but focuses instead on the technical aspects of finding non-negative matrix factorizations. Of course, other types of matrix factorizations have been extensively studied in numerical linear algebra, but the non-negativity constraint makes much of this previous work
inapplicable to the present case [8]. Here we discuss two algorithms for NMF based on iterative updates of W and H. Because these algorithms are easy to implement and their convergence properties are guaranteed, we have found them very useful in practical applications. Other algorithms may possibly be more efficient in overall computation time, but are more difficult to implement and may not generalize to different cost functions. Algorithms similar to ours, where only one of the factors is adapted, have previously been used for the deconvolution of emission tomography and astronomical images [9,10,11,12].

At each iteration of our algorithms, the new value of W or H is found by multiplying the current value by some factor that depends on the quality of the approximation in Eq. (1). We prove that the quality of the approximation improves monotonically with the application of these multiplicative update rules. In practice, this means that repeated iteration of the update rules is guaranteed to converge to a locally optimal matrix factorization.

3 Cost functions

To find an approximate factorization V ≈ WH, we first need to define cost functions that quantify the quality of the approximation. Such a cost function can be constructed using some measure of distance between two non-negative matrices A and B. One useful measure is simply the square of the Euclidean distance between A and B [13],

    ||A − B||² = Σij (Aij − Bij)²    (2)

This is lower bounded by zero, and clearly vanishes if and only if A = B. Another useful measure is the divergence

    D(A‖B) = Σij ( Aij log(Aij / Bij) − Aij + Bij )    (3)

We now consider two alternative formulations of NMF as optimization problems:

Problem 1: Minimize ||V − WH||² with respect to W and H, subject to the constraints W, H ≥ 0.

Problem 2: Minimize D(V‖WH) with respect to W and H, subject to the constraints W, H ≥ 0.

Although the functions ||V − WH||² and D(V‖WH) are convex in W only or H only, they are not convex in both variables together. Therefore it is unrealistic to expect an algorithm to solve Problems 1 and 2 in the sense of finding global minima. However, there are many techniques from numerical optimization that can be applied to find local minima.
Gradient descent is perhaps the simplest technique to implement,but convergence can be slow.Other methods such as conjugate gradient have faster convergence,at least in the vicinity of local minima,but are more complicated to implement than gradient descent [8].The convergence of gradient based methods also have the disadvantage of being very sensitive to the choice of step size,which can be very inconvenient for large applications.4Multiplicative update rulesWe have found that the following“multiplicative update rules”are a good compromise between speed and ease of implementation for solving Problems1and2.Theorem1The Euclidean distance is nonincreasing under the update rules(4)The Euclidean distance is invariant under these updates if and only if and are at a stationary point of the distance.Theorem2The divergence is nonincreasing under the update rules(5)The divergence is invariant under these updates if and only if and are at a stationary point of the divergence.Proofs of these theorems are given in a later section.For now,we note that each update consists of multiplication by a factor.In particular,it is straightforward to see that this multiplicative factor is unity when,so that perfect reconstruction is necessarily afixed point of the update rules.5Multiplicative versus additive update rulesIt is useful to contrast these multiplicative updates with those arising from gradient descent [14].In particular,a simple additive update for that reduces the squared distance can be written as(6) If are all set equal to some small positive number,this is equivalent to conventional gradient descent.As long as this number is sufficiently small,the update should reduce .Now if we diagonally rescale the variables and set(8) Again,if the are small and positive,this update should reduce.If we now setminFigure1:Minimizing the auxiliary function guarantees that for.Lemma2If is the diagonal matrix(13) then(15) Proof:Since is obvious,we need only show that.To do this,we 
compare(22)(23)is a positive eigenvector of with unity eigenvalue,and application of the Frobenius-Perron theorem shows that Eq.17holds.We can now demonstrate the convergence of Theorem1:Proof of Theorem1Replacing in Eq.(11)by Eq.(14)results in the update rule:(24) Since Eq.(14)is an auxiliary function,is nonincreasing under this update rule,accordingto Lemma1.Writing the components of this equation explicitly,we obtain(28)Proof:It is straightforward to verify that.To show that, we use convexity of the log function to derive the inequality(30) we obtain(31) From this inequality it follows that.Theorem2then follows from the application of Lemma1:Proof of Theorem2:The minimum of with respect to is determined by setting the gradient to zero:7DiscussionWe have shown that application of the update rules in Eqs.(4)and(5)are guaranteed to find at least locally optimal solutions of Problems1and2,respectively.The convergence proofs rely upon defining an appropriate auxiliary function.We are currently working to generalize these theorems to more complex constraints.The update rules themselves are extremely easy to implement computationally,and will hopefully be utilized by others for a wide variety of applications.We acknowledge the support of Bell Laboratories.We would also like to thank Carlos Brody,Ken Clarkson,Corinna Cortes,Roland Freund,Linda Kaufman,Yann Le Cun,Sam Roweis,Larry Saul,and Margaret Wright for helpful discussions.References[1]Jolliffe,IT(1986).Principal Component Analysis.New York:Springer-Verlag.[2]Turk,M&Pentland,A(1991).Eigenfaces for recognition.J.Cogn.Neurosci.3,71–86.[3]Gersho,A&Gray,RM(1992).Vector Quantization and Signal Compression.Kluwer Acad.Press.[4]Lee,DD&Seung,HS.Unsupervised learning by convex and conic coding(1997).Proceedingsof the Conference on Neural Information Processing Systems9,515–521.[5]Lee,DD&Seung,HS(1999).Learning the parts of objects by non-negative matrix factoriza-tion.Nature401,788–791.[6]Field,DJ(1994).What is the goal of 
sensory coding?Neural Comput.6,559–601.[7]Foldiak,P&Young,M(1995).Sparse coding in the primate cortex.The Handbook of BrainTheory and Neural Networks,895–898.(MIT Press,Cambridge,MA).[8]Press,WH,Teukolsky,SA,Vetterling,WT&Flannery,BP(1993).Numerical recipes:the artof scientific computing.(Cambridge University Press,Cambridge,England).[9]Shepp,LA&Vardi,Y(1982).Maximum likelihood reconstruction for emission tomography.IEEE Trans.MI-2,113–122.[10]Richardson,WH(1972).Bayesian-based iterative method of image restoration.J.Opt.Soc.Am.62,55–59.[11]Lucy,LB(1974).An iterative technique for the rectification of observed distributions.Astron.J.74,745–754.[12]Bouman,CA&Sauer,K(1996).A unified approach to statistical tomography using coordinatedescent optimization.IEEE Trans.Image Proc.5,480–492.[13]Paatero,P&Tapper,U(1997).Least squares formulation of robust non-negative factor analy-b.37,23–35.[14]Kivinen,J&Warmuth,M(1997).Additive versus exponentiated gradient updates for linearprediction.Journal of Information and Computation132,1–64.[15]Dempster,AP,Laird,NM&Rubin,DB(1977).Maximum likelihood from incomplete data viathe EM algorithm.J.Royal Stat.Soc.39,1–38.[16]Saul,L&Pereira,F(1997).Aggregate and mixed-order Markov models for statistical languageprocessing.In C.Cardie and R.Weischedel(eds).Proceedings of the Second Conference on Empirical Methods in Natural Language Processing,81–89.ACL Press.。

贝叶斯网络结构学习总结

贝叶斯网络结构学习总结

贝叶斯⽹络结构学习总结完备数据集下的贝叶斯⽹络结构学习:基于依赖统计分析的⽅法—— 通常利⽤统计或是信息论的⽅法分析变量之间的依赖关系,从⽽获得最优的⽹络结构对于基于依赖统计分析⽅法的研究可分为三种:基于分解的⽅法(V结构的存在)Decomposition of search for v-structures in DAGsDecomposition of structural learning about directed acylic graphsStructural learning of chain graphs via decomposition基于Markov blanket的⽅法Using Markov blankets for causal structure learningLearning Bayesian network strcture using Markov blanket decomposition基于结构空间限制的⽅法Bayesian network learning algorithms using structural restrictions(将这些约束与pc算法相结合提出了⼀种改进算法,提⾼了结构学习效率)(约束由Campos指出包括1、⼀定存在⼀条⽆向边或是有向边 2、⼀定不存在⼀条⽆向边或有向边 3、部分节点的顺序)常⽤的算法:SGS——利⽤节点间的条件独⽴性来确定⽹络结构的⽅法PC——利⽤稀疏⽹络中节点不需要⾼阶独⽴性检验的特点,提出了⼀种削减策略:依次由0阶独⽴性检验开始到⾼阶独⽴性检验,对初始⽹络中节点之间的连接进⾏削减。

此种策略有效地从稀疏模型中建⽴贝叶斯⽹络,解决了SGS算法随着⽹络中节点数的增长复杂度呈指数倍增长的问题。

TPDA——把结构学习过程分三个阶段进⾏:a)起草(drafting)⽹络结构,利⽤节点之间的互信息得到⼀个初始的⽹络结构;b)增厚(thickening)⽹络结构,在步骤a)⽹络结构的基础上计算⽹络中不存在连接节点间的条件互信息,对满⾜条件的两节点之间添加边;。

物流管理毕业论文参考文献范例

物流管理毕业论文参考文献范例

物流管理毕业论文参考文献范例的引用应当实事求是、科学合理,不可以为了凑数随便引用,以下是搜集整理的物流管理参考文献范例,供大家阅读查看.参考文献一:[1]李锦涛,郭俊波,罗海勇。

射频识别(RFID)技术及其应用[N].中科院计算所信息技术快报,2004(11):25-32.[2]李战怀,聂艳明,陈群等.RFID数据管理的研究进展[J]。

CommunicationsofCCF,2007(8):50—58。

[3]PalmerSPrinciplesofEffectiveRFIDDataManagement[Z]。

ProgressSoftware'sRealTimeDivision,2004.[4]FloerkemeierML.IssueswithRFIDUsageinUbiquitousComputingApplication[C]。

LectureNotesinComputerScience,2004:188—193.[5]JefferyM.AdaptivecleaningforRFIDDataStreams[S].InProc.ofthe32ndInternationalCon ferenceonVeryLargeDataBases。

Seoul,A:VLDBEndowment,2006:163-174。

[6]Roberts.RadioFrequencyIdentification(RFID)[C],Computers&Security,2006(25):18—26。

[7]仇建平,崔杜武。

基于射频识别的供应链管理系统[J]。

计算机应用,2005,25(3):734-736.[8]李斌,李文锋。

智能物流中面向RFID的信息融合研究[J]。

电子科技大学学报,2007,36(6):1329—1932。

[9]吴剑敏,腾少华,张巍.基于RFID技术的应用数据模型研究[J]。

微计算机信息,2007,23(9):234—236.[10]张昊,陈宇.应用RFID技术和无线通信的实时物流追踪系统[J]。

经验模态分解的单通道呼吸信号自动睡眠分期

经验模态分解的单通道呼吸信号自动睡眠分期

Advances in Applied Mathematics 应用数学进展, 2023, 12(6), 2788-2801 Published Online June 2023 in Hans. https:///journal/aam https:///10.12677/aam.2023.126280经验模态分解的单通道呼吸信号自动睡眠分期白雨欣,令狐荣乾北方工业大学理学院,北京收稿日期:2023年5月16日;录用日期:2023年6月9日;发布日期:2023年6月16日摘要睡眠是人体基本的生理需求,可以保证机体的生长发育、为机体储蓄能量、维持机体免疫等。

对睡眠质量的准确评估是认识睡眠障碍并采取有效干预措施的关键。

如果用经验丰富的睡眠专家进行人工睡眠分期是比较耗时并且主观的。

目前,研究人员提出了许多准确、有效、有针对性的睡眠分期方法。

比如,基于深度学习以及经验模态分解算法的单通道电脑信号自动睡眠分期方法,它被成功地用于呼吸信号(RESP)的睡眠分期,该方法为呼吸信号分解和睡眠阶段自动识别提供了新途径。

本文采用的呼吸信号数据集来自SHHS ,它是一个中心队列研究,用来确定睡眠与呼吸障碍的心血管和其他病症的数据库。

首先,我们对SHHS 数据库中的单通道呼吸信号进行了分析,以便更好地了解人类睡眠情况。

其次,利用经验模态分解算法(EMD)对预处理后的呼吸信号进行分解,从原始呼吸信号和分解出的6个简单信号中提取时域、非线性动力学、统计学等方面的9个特征。

最后,使用长短期记忆网络(LSTM)构建分类模型,将提取的呼吸信号特征进行分类识别,实现自动睡眠分期。

实验结果表明,在4类和5类睡眠分期任务中,SHHS 数据库的呼吸信号自动睡眠分期准确率分别为89.22%和88.43%。

实验结果表明,本文提出的自动睡眠分期模型具有较高的分类精度和效率,具有较强的适用性和稳定性。

关键词经验模态分解算法,长短期记忆网络LSTM ,呼吸信号,特征提取,睡眠阶段分类Empirical Modal Decompositionof Single-Channel Respiratory Signals for Automatic Sleep StagingYuxin Bai, Rongqian LinghuCollege of Science, North China University of Technology, BeijingReceived: May 16th , 2023; accepted: Jun. 9th , 2023; published: Jun. 16th , 2023AbstractSleep is a basic physiological need of the body to ensure growth and development, save energy for白雨欣,令狐荣乾the body, and maintain immunity of the body. Accurate assessment of sleep quality is the key to recognizing sleep disorders and taking effective interventions. Manual sleep staging is time con-suming and subjective when performed by experienced sleep specialists. Currently, researchers have proposed a number of accurate, effective, and targeted sleep staging methods. For example, a single-channel computer signal automatic sleep staging method based on deep learning and em-pirical modal decomposition algorithms has been successfully used for respiratory signal (RESP) sleep staging, which provides a new way to decompose respiratory signals and identify sleep stages automatically. The data set used in this paper is from SHHS, which is a central cohort study to identify sleep and breathing disorders in a database of cardiovascular and other conditions. First, we analyzed the single-channel respiratory signals from the SHHS database to better under-stand human sleep. Second, the pre-processed respiratory signals were decomposed using an em-pirical modal decomposition algorithm (EMD) to extract nine features in the time domain, nonli-near dynamics, and statistics from the original respiratory signals and the six simple signals that were decomposed. Finally, a classification model was constructed using a long short-term memory network (LSTM) to classify and identify the extracted respiratory signal features for automatic sleep staging. 
The experimental results show that the accuracy of automatic sleep staging of res-piratory signals from SHHS database is 89.22% and 88.43% in 4 and 5 categories of sleep staging tasks, respectively. The experimental results show that the automatic sleep staging model pro-posed in this paper has high classification accuracy and efficiency, and has strong applicability and stability.KeywordsEmpirical Modal Decomposition Algorithm, Long Short-Term Memory Network LSTM, Respiratory Signal, Feature Extraction, Sleep Stage Classification.This work is licensed under the Creative Commons Attribution International License (CC BY 4.0)./licenses/by/4.0/1. 介绍睡眠是评价人类生活质量和身体健康的标准之一,并且了解睡眠质量和结构对人类的健康至关重要。

一种低复杂度的MIMO正交缺陷门限减格预编码算法

一种低复杂度的MIMO正交缺陷门限减格预编码算法

一种低复杂度的MIMO正交缺陷门限减格预编码算法王伟;李勇朝;张海林【摘要】To reduce the complexity of lattice reduction aided ( LRA ) precoding , a low complexity LRA precoding based on the orthogonality defect threshold is proposed . We introduce the orthogonality defect (od) threshold as an early‐termination condition into the lattice reduction (LR) algorithm which can reduce computational complexity by adaptively early terminating the LR processing . And , sorted QR decomposition of the channel matrix is used to enhance the probability of the early termination which further reduces computational complexity . Moreover , to achieve a favorable tradeoff between performance and complexity , we define a power loss factor ( PLF) to optimize the od threshold . Simulation results show that the proposed algorithm can achieve significant complexity savings with nearly the same bit‐error‐rate (BER) performance as the traditional LRA precoding algorithm .%针对减格预编码算法复杂度较高的问题,提出了一种基于正交缺陷门限的低复杂度减格预编码算法。

SIMPLICIAL DECOMPOSITION ALGORITHMS

SIMPLICIAL DECOMPOSITION ALGORITHMS

formulated as minimize f (x( ^ ; ^)); (3a) ^ ; ^) 2 ^ : subject to ( (3b) Alternately, a pro table extreme point or direction of X is generated through the solution of an approximation of (1), in which f is replaced by its rst-order, linear approximation, y 7! f (x) + rf (x)T(y ? x), de ned at the solution, x, to the RMP (3), that is, by the problem minimize rf (x)T y; (4a) subject to y 2 X ; (4b) this approximate problem is a linear programming problem, which in general is much easier to solve than the original one. (This is called the column generation subproblem, and corresponds to the decomposition step in some descriptions of column generation methods.) If the solution to this problem lies within the current inner approximation, then the conclusion is that the current solution, x, is optimal in (1), since, then, rf (x)T(y ? x) 0 must hold for all y 2 X . Oth^ ^ erwise, P or D is augmented by a new element, the resulting inner approximation is improved (that is, enlarged), and the solution to the new RMP has a strictly lower objective value than the previous one; the latter result follows since the strict inequality rf (x)T d < 0 holds (that is, d de nes a direction of descent with respect to f at x), where d denotes either the direction d := y ?x towards the new extreme point y or an extreme direction. The iteration is then repeated with the solution of a new column generation subproblem de ned at the solution to the RMP. In the method of 31], Caratheodory's Theorem is utilized in the validation of a column dropping rule, according to which any extreme point or direction whose weight in the expression of the solution x to the RMP is zero is removed; thanks to the niteness of P and D and the strictly decreasing values of f , the convergence of the SD algorithm in the number of RMP is nite. (In the case of non-polyhedral sets, it was observed in 11] that von Hohenbalken's original procedure does not necessarily converge. 
Their remedy is the introduction of a safe-guarding step which

碳排放IDA模型的算法比较及应用研究

碳排放IDA模型的算法比较及应用研究

碳排放IDA模型的算法比较及应用研究程郁泰;张纳军【期刊名称】《统计与信息论坛》【年(卷),期】2017(032)005【摘要】在分析碳排放指数分解模型IDA的基本理论框架和各类型算法的结构、特点基础上,以中国1991-2014年相关数据的实证分析作为各算法应用的解读,并基于适用性、有效性综合评价提出算法选择的参考信息:LD算法分解的各因素作用易于理解,但存在分解余项问题;RLD与Shapley算法能够实现因素完全分解,且本质上具有一致性;GFI算法适合多因素效应完全分解,但计算过程复杂;AMDI与AWD算法受到分解余项和对数权重赋值限制的影响约束;LMDIⅠ算法具有灵活的分解形式及因素完全分解特征等.%The research shows the basic theoretical framework of index decomposition analysis model of carbon emissions and the structure and characteristics analysis of different types of algorithms;Unscramble the application of each algorithm based on the empirical analysis with related data of China during the period 1991 to 2014.By the comprehensive evaluation of applicability and effectiveness of each algorithm, this paper provides the reference information for the method selection: The decomposition of the LD algorithm which exists residual items is easy to understand;The calculation results of RLD and Shapley algorithm which achieves complete decomposition of factors are almost identical;And the algorithm of GFI is suitable for multi-factor effect of complete decomposition with the computational complexity;Thealgorithm of AMDI and AWD influenced by the restriction of logarithmic assignment problem results in residual items;The algorithm of LMDIⅠ possesses the flexible features of decomposition form and complete decomposition of factors.【总页数】8页(P10-17)【作者】程郁泰;张纳军【作者单位】天津财经大学统计系, 天津 300222;天津财经大学统计系, 天津300222【正文语种】中文【中图分类】F222.1;C812【相关文献】1.建筑施工过程碳排放量预测模型及应用研究 [J], 刘家林;马朋2.碳排放SDA模型的算法比较及应用研究 [J], 张纳军;程郁泰3.数学模型在碳排放测算与预测中的应用研究 [J], 李婉婷;宋男哲;慎英才;邢洁4.“十四五”期间我国碳排放总量及其结构预测——基于混频数据ADL-MIDAS 模型 [J], 赫永达;文红;孙传旺5.MIDAS模型与EQW模型预测精度的比较——以资产价格的经济增长效应为例[J], 王春枝;赵国杰;王维国;于扬因版权原因,仅展示原文概要,查看原文内容请购买。

  1. 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
  2. 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
  3. 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。

Ronald Parr
Computer Science Department
Stanford University
Stanford, CA 94305-9010
parr@

Abstract

This paper presents two new approaches to decomposing and solving large Markov decision problems (MDPs), a partial decoupling method and a complete decoupling method. In these approaches, a large, stochastic decision problem is divided into smaller pieces. The first approach builds a cache of policies for each part of the problem independently, and then combines the pieces in a separate, light-weight step. A second approach also divides the problem into smaller pieces, but information is communicated between the different problem pieces, allowing intelligent decisions to be made about which piece requires the most attention. Both approaches can be used to find optimal policies or approximately optimal policies with provable bounds. These algorithms also provide a framework for the efficient transfer of knowledge across problems that share similar structure.

1 Introduction

The Markov Decision Problem (MDP) framework provides a formal setting for modeling a large variety of stochastic, sequential decision problems. It is a well-understood framework with well-known on-line and off-line algorithms for determining optimal behavior (see, e.g.,
Puterman (1994)). The limitations of this framework are also well-known: compliance with the Markov property generally requires a very fine-grained description of the environment, i.e., a very large number of states.

One of the main research thrusts for MDPs has been the development of methods for large state spaces. A major complicating factor in this line of research is the apparent non-decomposability of MDPs: the utility or value of any state can, in general, be affected indirectly by the cost structure and the dynamics of any other state. This thwarts efforts to decompose MDPs into completely independent subproblems and complicates efforts to reduce computation time through parallelization. While some progress has been made on understanding some very special cases where MDPs may be decomposed into independent subproblems (Singh, 1992; Lin, 1997), much of the effort has focused on methods that decompose MDPs into "communicating" subproblems (Bertsekas & Tsitsiklis, 1989; Dean & Lin, 1995). In these iterative methods, information about subproblem solutions is communicated to neighboring subproblems. The solution for each subproblem may need to be updated many times until a globally optimal solution is obtained.

This paper considers a special, but fairly general, class of problem decompositions in which each subproblem is "weakly" coupled with the neighboring subproblems. This means that the number of states connecting two subproblems is small, a relationship that appears naturally in many problems. For example, the problem of moving from one's office to one's house has this structure: one's office is a small region that is connected by a much smaller region, the door, to an external corridor. Many other offices may be connected to this corridor, each with a similar structure. The corridor could be fairly large and connected to other corridors by relatively small intersection regions. Most buildings have a small number of doorways that connect them to the streets outside. Each street has a relatively
small number of points where it connects to other streets. One such street connects to the house one calls home, which is itself an aggregation of weakly connected pieces. An MDP is weakly coupled if it can be divided into two or more subproblems that are weakly coupled with each other. Figure 1 shows a simple navigation MDP divided into four rooms, each of which can be considered a subproblem.

This paper uses a similar approach to that used in communicating MDP solution methods, but aims to avoid iteratively updating solutions to subproblems by building a set of policies independently for each subproblem. Each set of policies is called a cache. The caches are constructed in such a way that they are guaranteed a priori to provide performance within a constant of the optimal, regardless of the structure of the other subproblems. This permits a complete decoupling of the MDP into independent subproblems that can be solved in parallel and then recombined in a light-weight step.

Figure 1: A weakly coupled MDP. There is a reward in one of the rooms, indicated with a $. Connecting states are identified with an X. Similar examples and pictures are used by Precup and Sutton (1997) and Hauskrecht et al. (1998).

The decoupling process is based upon the observation that any policy over a region of state space defines a linear function for the values of the states inside the region in terms of the values of the states outside the region (see, for example, Parr (1998)). The linear relationship is exploited by the algorithms in this paper to build caches for each region of the state space. The caches are built iteratively by constructing linear programs that discover the values of the states outside the region for which the cache performs the worst, then adding a new policy to the cache to cover the worst case.

The efficient manipulation of policy caches also provides a formal basis for the transfer of knowledge across problems with similar substructures. The simplest case of
this occurs when the reward structure for a problem changes. Suppose, for example, that the reward in the navigation problem is moved from one room to another. Policy caches devised for the unaffected rooms can be transferred to the new problem. Similarly, if one's destination is now a cafe instead of home, the policies designed for one's office and the containing building should transfer to the new problem.

Since the number of possible policies for a subproblem is exponential in the number of states in the subproblem, there may exist problems and accuracy requirements for which the size of the policy cache will be exponential. In these cases there still will be some benefit to constructing a small policy cache, even if it does not provide the desired accuracy guarantees. This paper presents an algorithm that augments standard communicating MDP algorithms with the use of a policy cache. The policy cache can be used to determine lower and upper bounds on the values that states in the subproblem can assume, and this provides a means of deciding when it is worth using a cached solution and when it is worth producing a new subproblem solution. This is particularly useful in determining if subproblem solutions from a related problem can be applied to a new one.

2 Markov Decision Problems

To review the basic MDP framework, an MDP is a 4-tuple (S, A, T, R), where S is a set of states, A is a set of actions, T is a transition model mapping S × A × S into probabilities in [0, 1], and R is a reward function mapping S × A into real-valued rewards. Algorithms for solving MDPs can return a policy π that maps from S to A, or a real-valued value function V. In this paper, the focus is on infinite-horizon MDPs with a discount factor γ. The aim in these problems is to find an optimal policy that maximizes the expected discounted total reward of the agent, or to find an approximately optimal policy that comes within some bound of optimal.

Value iteration, policy iteration or linear programming can be used to determine the optimal policy for an MDP. These algorithms all
use some form of the Bellman equation (Bellman, 1957):

V(s) = max_a [ R(s, a) + γ Σ_s' T(s, a, s') V(s') ]

When the Bellman equation is satisfied, the maximizing action for each state is the optimal action. For a particular policy π, the Bellman equation becomes a system of linear equations:

V_π(s) = R(s, π(s)) + γ Σ_s' T(s, π(s), s') V_π(s')

These can be solved to determine V_π, the value of following π from any state. The Bellman error for a particular policy at a particular state is the difference between the value function for that policy and the right-hand side of the Bellman equation:

BE_π(s) = V_π(s) - max_a [ R(s, a) + γ Σ_s' T(s, a, s') V_π(s') ]

For any policy, the maximum Bellman error over all states, max_s |BE_π(s)|, is a well-known bound on the distance from the optimal value function (Williams & Baird, 1993):

max_s |V*(s) - V_π(s)| ≤ max_s |BE_π(s)| / (1 - γ)

The assignment of policies to regions can be determined by solving a "high-level" reduced decision problem defined over only the states in the out-spaces of the regions. This reduced decision problem removes all but the out-space states from the problem. Actions in the reduced problem correspond to assignments of policies to regions in the original decision problem. This transformation is the basic insight of Forestier and Varaiya (1978), and it follows as a special case of the hierarchical results in Parr and Russell (1997). The approach is also investigated in Hauskrecht et al. (1998). This type of problem also can be viewed as a semi-Markov decision problem (SMDP), where each low-level policy becomes a primitive SMDP action, as in Parr (1998).

In Figure 1, the high-level problem would contain just the eight specially marked states. An action in the high-level problem would correspond to a decision to adopt some policy from the room's cache upon entering the room and to stay with this policy until the next out-space state is reached. The solution to the high-level problem may produce a non-stationary policy at the low level, which means that the actions taken in any room may depend upon the manner in which the room is entered. A non-stationary policy of this type can be converted easily to a stationary policy that is at least as good (Parr, 1998).

The relationship between the size of the out-spaces and the complexity of the high-level problem should make the importance of weak coupling clear. If the size of the out-spaces approaches the size of the original MDP, then the high-level decision problem that combines the cached subproblem solutions will be as difficult as the original MDP.
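The Bellman backup described above is easy to state in code. The sketch below is my own illustration, not the paper's implementation; the array layout and the toy two-state MDP are assumptions made for the example. It runs value iteration to a fixed point and recovers the greedy policy:

```python
import numpy as np

def value_iteration(T, R, gamma, tol=1e-10):
    """T[a, s, s'] are transition probabilities, R[s, a] are rewards.
    Iterates the Bellman backup
        V(s) <- max_a [ R(s, a) + gamma * sum_s' T(a, s, s') V(s') ]
    until successive iterates differ by less than tol."""
    n_actions, n_states, _ = T.shape
    V = np.zeros(n_states)
    while True:
        # Q[s, a] = R(s, a) + gamma * expected next-state value under action a
        Q = R + gamma * np.stack([T[a] @ V for a in range(n_actions)], axis=1)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)   # value function and greedy policy
        V = V_new

# Toy two-state MDP: action 0 stays put, action 1 swaps states;
# reward 1 for every step spent in state 1.
T = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 0.0], [1.0, 1.0]])
V, pi = value_iteration(T, R, gamma=0.9)
# V is approximately [9.0, 10.0]; the greedy policy moves from state 0 to state 1.
```

With γ = 0.9, the fixed point V(1) = 1 + 0.9 V(1) gives V(1) = 10 and V(0) = 0.9 · 10 = 9, matching the computed values.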
This type of problem also can be viewed as Semi-Markov decision problem(SMDP),where each low-level policy be-comes a primitive SMDP action,as in Parr(1998).In Figure1,the high level problem would contain just the eight specially marked states.An action in the high level problem would correspond to a decision to adopt some pol-icy from the room’s cache upon entering the room,and staying with this policy until the next out-space state is reached.The solution to the high-level problem may pro-duce a non-stationary policy at the low-level,which means that the actions taken in any room may depend upon the manner in which the room is entered.A non-stationary policy of this type can be converted easily to a stationary policy that is at least as good(Parr,1998).The relationship between the size of the out-spaces and the complexity of the high-level problem should make the importance of weak coupling clear.If the size of the out-spaces approaches the size of the original MDP,then the high-level decision problem that combines the cached sub-problem solutions will be as difficult as the original MDP. 
An algorithm that completely decomposed an MDP would produce a for each,combine these to produce an op-timal or approximately optimal overall solution,and never need to revise any of the.Unless the are chosen very carefully,or the caches are very large,combinations of policies in the initial policy caches may not suffice.There are several approaches to revising the policy caches.One extreme end of this spectrum is the approach in Sutton, Precup,and Singh(1998),where policies and low-level ac-tions are mixed together in the same SMDP.This sacrifices the reduction in computational complexity obtained from solving a reduced decision problem in favor of a guarantee of obtaining optimality.Another approach considered by Dean and Lin(1995)updates each directly.Dean and Lin considered a special case in which the old policies were discarded at each iteration,and a new policy was computed for each region based upon the high-level decision prob-lem’s current estimates for the value of the out-space states. 
The approach advocated by Dean and Lin is guaranteed to converge to the optimal policy.However,it is just one special case of a general class of methods that must con-verge.Any reasonable scheme that improves the policies in the regions and propagates those improvements through the high-level decision problem is guaranteed to produce an optimal policy as long as no regions“starve”,i.e.,never have their policies improved.This result follows directly from the observation that the high-level problem of assign-ing policies to rooms is really just an SMDP where the set of permitted actions for the SMDP are just the set of possible policies defined over regions.The algorithms in this paper all aim to minimize the number of policies that are computed for MDP subproblems.The extent to which this can be minimized is a measure of how effectively an MDP has been decomposed.If each sub-problem requires only a small cache of candidate solutions, this means that the subproblem solutions are relatively in-dependent.These are precisely the situations in which a large computational benefit is reaped from decomposition, since the MDP can be divided and conquered by solving a reasonable number of small subproblems.The size of the policy caches also gives some measure of the paralleliz-ability of the problem.If a region can be solved with a small cache of policies,this suggests that the entire cache could be constructed a priori as a completely independent subprocess.The following section describes several algorithms for con-structing policy caches for subproblems with minimal as-sumptions about the rest of the MDP.These algorithms aim to minimize the size of the cache,while ensuring that so-lutions using the cache will be within a bound of optimal. 
The succeeding section describes a scheme for working with policy caches for which optimality bounds have not been established a priori.This method efficiently estab-lishes bounds on the benefit of adding a new policy to a cache,based upon the current contents of the cache.4Complete decouplingThis section presents algorithms thatfind a policy cache, ,for a particular region,,such that is guaranteed to provide policies that are within a constant of optimal when a high level problem using for is solved.The only assumptions that are made about the regions to which connects is that the states assume values on. Define,as a vector of values that the states in the out-space of can take on(the subscript will be dropped when there can be no confusion about the region in question). The fan-out of a region is defined as the dimension of this vector.In addition to storing a cache of policies it is useful to store a cache of functions,for each.Eachis a linear function that provides the value of any state as a linear function of.For any policy these functions can be determined by solving a system of linear equations(see Parr(1998)).The goal in constructing a policy cache for a region is to produce a cache such that for every possible value of the corresponding out-space states,there is a policy in the region’s cache for which the performance in the region will be within a bound of optimal.A policy,,for region is optimal with respect to if is the solution to the MDP defined just over the states in,with the assumption that states in the out-space of are absorbing states with valueslocked at the value of the corresponding entry in.In room of the four-room example,the optimal policy for would be determined by solving an MDP with just the states in room and the two connecting states in room and room.The value of the connecting state in room would be treated as a constant with value and the value of the connecting state in room would be a constant with value.A policy,,is said to 
be ε-optimal with respect to V_o if its value function comes within ε of the optimal value function when the values of the states in the out-space of the region are fixed by V_o. For any state and any value of V_o, there must be one policy in the cache that appears at least as good as all of the others. A policy dominates at a state for a particular V_o if its linear value function at that state is at least as large as that of every other policy in the cache. This means that the low-level policy appears to be the best high-level action at that state for that V_o. A cache of policies is optimal at a point V_o if, for every state, the dominating policy is optimal. A cache of policies is optimal if it is optimal at all points V_o in the range of values for the in-space of the region.

Theorem 1 If an MDP is divided into regions, and an optimal cache of policies is determined for each region, these policies can be combined to produce a globally optimal policy.

A simple way to approximate an optimal cache is to cover value space with a grid of resolution δ and compute a policy that is optimal with respect to each grid point. The number of policies this requires is exponential in the fan-out of the region. This will be unmanageable unless the range of values is very small, the fan-out of the region is very small, or δ is very large.

4.2 Value Space Search

This section presents an algorithm that aims to avoid constructing an exponential number of policies by searching through value space to find a point at which the current policy cache is not adequate. If such a point is found, a new policy is added to the cache, and the process is repeated until no points can be found for which the current cache is inadequate. The following formal results are the basis of the value space search algorithm:

Lemma 1 For any state, the dominating policies at that state form a piecewise-linear convex function of V_o.

Proof: This follows from the observation that using the dominating policy means taking the maximum over a set of linear policy functions.

In the one-exit example, the top-left state is treated as if it were an in-space state even though there is no entrance to the room in that area. This keeps the value surfaces corresponding to policies displayable in two dimensions.

Figure 3 (value of the top-left state as a function of the value of the out-state): The optimal policy for V_o = 0 avoids the exit, making the value of the top-left state nearly independent of the value of the exit. The optimal policy for V_o = 2.0 goes directly for the exit and has a strong dependence on V_o.
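The linear policy functions and Lemma 1 can be made concrete with a small sketch. This is my own illustration; the partitioned-matrix notation is an assumption, not the paper's. For a fixed policy, write P_ii for the transition probabilities among internal states and P_io for transitions into the out-space. The internal values satisfy V_i = R_i + γ(P_ii V_i + P_io V_o), so V_i = A V_o + b with A = γ(I - γ P_ii)^(-1) P_io and b = (I - γ P_ii)^(-1) R_i. The dominating value at a state is then a max over linear functions of V_o, hence piecewise-linear and convex:

```python
import numpy as np

def policy_value_coeffs(P_ii, P_io, R_i, gamma):
    """Coefficients (A, b) such that the internal values of a fixed
    policy are V_i = A @ V_o + b, a linear function of the out-space values."""
    M = np.linalg.inv(np.eye(len(R_i)) - gamma * P_ii)
    return gamma * M @ P_io, M @ R_i

def dominating_value(linear_fns, v_out):
    """Lemma 1: the best cached value at a state is the pointwise max
    over linear functions of V_o, a piecewise-linear convex function."""
    return max(a @ v_out + b for a, b in linear_fns)

# Chain 0 -> 1 -> out under one fixed policy, zero internal reward.
P_ii = np.array([[0.0, 1.0], [0.0, 0.0]])
P_io = np.array([[0.0], [1.0]])
A, b = policy_value_coeffs(P_ii, P_io, np.zeros(2), gamma=0.9)
# A is [[0.81], [0.9]] and b is [0, 0]: state 0 is worth 0.81 * V_o.
```

Evaluating `dominating_value` over a cache of such (slope, intercept) pairs at a single state traces out exactly the kind of convex surface shown in Figure 3.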
of the exit.The optimal policy forgoes directly for the exit and has a strong dependence on .can be determined in time that is polynomial in ,,,and .Proof:This is achieved by means of a linear program.For all in the in-space of ,for all in ,for all ,and for all,the following linear program is solved:Maximize:Subject to:Note that the free variables in the system are the componentsof.The objective function maximizes the Bellman error at state under the assumption that action is taken.Thefirst set of constraints identifies the region inspace for which dominates at .If this region exists,it is guaranteed to be a single,continuous facet of a convex surface by Lemma 1.The last set of constraints bounds to be within the range of possible values.The largest value returned by the linear program over all ,,and provides the point at which the current cache of policies will have the largest Bellman error.The time bound is satisfied because linear programming is polynomial in the size of its inputs.point at which the error in a set of infinite horizon MDP value functions is the largest.The value space search algorithm was used tofind an optimal policy cache for room of Figure1using the same action model and discount factor as used for the one-exit problem.This subproblem contains states and has a fan-out of.Possible actions are right,left,up and down,but these actions are unreliable,resulting in move-ment in one of the three other axis-parallel directionsof the time.The discount factor used was.There are possible policies for this subproblem.Of course,many of these are unreasonable policies that,for example,move the agent in circles.However,as in the one-exit example,a variety of policies can still be induced by different values of the out-space states.If the values of the states are assumed to be on, then the-grid approach for this problem would require million policies for.The value space searchalgorithm produced a policy cache with the same optimality guarantees with just 
policies. For a smaller ε, the ε-grid approach would require even more millions of policies, while the value space search algorithm produced the same policies. In this particular case, the value space search algorithm has captured the intuition that this type of subproblem should not be that hard. A few seconds of computation has produced a small cache that will ensure a nearly optimal solution for this region no matter what happens in any connecting region. This small subproblem is now decoupled and completely solved, at least for this discount factor and for problems where the neighboring states assume values in the same bounded range. Any MDP satisfying these conditions and with an optimality requirement no stricter than ε can reuse the cache and still be guaranteed a solution within ε of optimal.

Note that these guarantees are stated with respect to the high-level states. This could be a problem, however, if the agent typically starts in some state that is not a high-level state. In such cases, the starting position of the agent can be treated as if it were a connecting state by adding it to the in-space of the enclosing region and constructing the policy cache accordingly. If desired, every state could be treated as if it were an in-space state, ensuring full low-level optimality as well.

4.3 Convex Hull Bounding

The algorithm presented in this section has run time that is exponential in f, the fan-out of the region, but unlike the ε-grid approach, it does not depend explicitly on ε, and unlike the value space search algorithm, it can avoid considering every state inside of a region if high-level ε-optimality is sufficient. The algorithm relies upon the following formal results:

Lemma 2. For any point Vo and any set of points Vo1, ..., Vok with associated policies π1, ..., πk, such that each πi is optimal with respect to Voi and such that the Voi form a convex hull around Vo, the optimal policy value with respect to Vo at any state is bounded from below by the dominating cached policy and from above by the hyperplane containing each of the points (Voi, Vπi(Voi)).

Proof: Bounding from below is obvious and follows from Lemma 1: the optimal policy at any point must do at least as well as the dominating policy in the cache. The bound from above is somewhat more subtle: Let H be the hyperplane containing the points (Voi, Vπi(Voi)). Suppose that there exists some Vo and a corresponding optimal policy whose value at some state lies above H. Let G be the hyperplane corresponding to the linear value function of this policy. There must exist some corner of the convex hull used to create H (some Voi) where G is above H, i.e., where this policy appears better than πi. However, πi is known to be optimal with respect to Voi, so this is a contradiction.

[Figure 4 plots the value of the top-left state against the value of the out-state, showing the lines "Optimal for Vo = 0" and "Optimal for Vo = 2.0" together with the "Upper bounding hull".]

Figure 4: Two policies, and an upper surface bounding their distance from optimality. For any Vo, the linear function for the optimal policy cannot cross the known optimal value at Vo = 0 or cross 1.84 at Vo = 2.0. Thus, the value of the optimal policy is bounded by the line shown.

Theorem 3. For a region with a cache of policies π1, ..., πk that are optimal at Vo1, ..., Vok, the optimal policy value for any Vo at any state is bounded from below by the convex surface formed by the maximum over the corresponding cached value functions, and bounded from above by the convex hull containing the points (Voi, Vπi(Voi)).

Proof: The bound from below is a direct consequence of Lemma 1. The bound from above follows from Lemma 2 and from noting that the lowest bounding hyperplane for any Vo must form a facet in the convex hull of these points.

A convex hull over the anchor points can have a number of facets exponential in the dimension of the value space, making this algorithm exponential in f. Still, the convex hull bounding algorithm is superior to the ε-grid approach since the grid approach has run time that depends directly on ε.

There remains the question of whether to continue using the existing policy cache for a region, or to generate a new policy that is optimal for the algorithm's current estimate of Vo. A straightforward way to answer this question would be to use the cached value functions to assign values to every state in the subproblem and then compute the Bellman error for each state.
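As a concrete illustration of these bounds, the following sketch works in one dimension, i.e., a single out-state, as in Figures 3 and 4. The slopes and intercepts of the cached value functions are hypothetical, chosen only to mimic the qualitative shape of the figures; they are not taken from the paper.

```python
# Sketch of Lemmas 1 and 2 in one dimension: each cached policy's value at an
# in-space state is a linear function of Vo, so the best cached value is the
# piecewise-linear convex max (Lemma 1), and the chord through the two points
# where the cache is known to be optimal bounds the optimal value from above
# (Lemma 2). The (intercept, slope) pairs below are hypothetical.

def lower_bound(vo, cached_lines):
    """Best cached policy value: max over linear value functions (Lemma 1)."""
    return max(b + a * vo for (b, a) in cached_lines)

def upper_bound(vo, lo_pt, hi_pt):
    """Chord through the two points where the cache is known optimal."""
    (v0, y0), (v1, y1) = lo_pt, hi_pt
    t = (vo - v0) / (v1 - v0)
    return y0 + t * (y1 - y0)

cache = [(0.60, 0.05),   # a policy that avoids the exit: nearly flat in Vo
         (0.10, 0.80)]   # a policy that heads for the exit: steep in Vo
lo = (0.0, lower_bound(0.0, cache))   # cache optimal at Vo = 0
hi = (2.0, lower_bound(2.0, cache))   # cache optimal at Vo = 2
for vo in (0.5, 1.0, 1.5):
    lb, ub = lower_bound(vo, cache), upper_bound(vo, lo, hi)
    assert lb <= ub                   # the optimal value lies in [lb, ub]
    print(vo, round(lb, 3), round(ub, 3))
```

In one dimension the upper-bounding "hull" of Lemma 2 degenerates to the single line through the two known-optimal endpoints, which is exactly the upper bounding line of Figure 4.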
However, this approach would require so much computation that it would essentially defeat the purpose of solving a high-level problem. Instead, high-level optimality can be checked quite efficiently by using the tools of the convex hull bounding algorithm. Starting with some policy cache π1, ..., πk, the elements of which are optimal at the corresponding Vo1, ..., Vok, the value of any state under the optimal policy with respect to any particular Vo is bounded from below by the dominating cached policy, and bounded from above by the convex hull of the anchor points (Theorem 3). The situation here is slightly different from the bounding algorithm in that Vo is fixed and known. Instead of solving a high-dimensional convex hull problem, the bounds for a particular Vo can be determined by solving a linear program. In the following, h is an unknown linear function, i.e., its coefficients and constant are free variables:

Maximize: h(Vo)
Subject to: h(Voi) ≤ Vπi(Voi) for each anchor point Voi, and h never exceeding the maximum attainable state value.

To reassure oneself that this is indeed a linear program, recall that in this context Vo, the Voi, and the coefficients and constants of the cached value functions are all known constants. The only variables are the components of h. The first set of constraints requires that h be no better than the optimal policy at the points in value space where the optimal policy is known.
This is, essentially, a restatement of Lemma 2. The second set of constraints requires that h never exceed the maximum value any state can assume in this problem. Thus, the objective function forces the linear program to find the highest hyperplane that does not violate Lemma 2 or the bound on state values. If Vo lies in the convex hull of the anchor points, then h will be the facet of the upper-bounding convex hull from Theorem 3. Note that if Vo does not lie in the convex hull, the maximum state value will be returned. This bound can be tightened by placing additional constraints on the constant of h and on the sum of its coefficients.

If the distance between the dominating policy and the upper bound returned by the above linear program is less than ε for every state in the in-space of the region, then the policy cache for the region is sufficient to produce a high-level ε-optimal policy for the current value of Vo. This means that a high-level decision problem can, for now, avoid updating the policy for that region and focus attention on other regions. This decision will need to be reevaluated as the values of the states in the out-space of the region change. One way to view this result is that it enables a form of high-level prioritized sweeping (Moore & Atkeson, 1993; Andre, Friedman, & Parr, 1997).
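To make the prioritized-sweeping flavor of this test concrete, here is a toy sketch of a high-level loop that re-solves a region's subproblem only when the bound gap exceeds ε. The bound values and region names are invented stand-ins; in the actual method the bounds would come from the linear program above.

```python
# Toy sketch of using the cache-sufficiency test to focus computation:
# a region's subproblem is re-solved only when the gap between the cached
# lower bound and the hull upper bound exceeds eps at the region's current
# out-space values. Bounds here are hypothetical numbers, not computed.

def needs_update(lower, upper, eps):
    """True when the cached policies cannot certify eps-optimality."""
    return upper - lower > eps

def sweep(region_bounds, eps):
    """region_bounds: name -> (lower, upper) bound at the current Vo."""
    return [name for name, (lb, ub) in region_bounds.items()
            if needs_update(lb, ub, eps)]

bounds = {"room1": (0.90, 0.93), "room2": (0.40, 0.70), "room3": (1.20, 1.21)}
print(sweep(bounds, eps=0.05))   # only room2's cache is inadequate
```

Regions whose gap is already below ε are skipped entirely, which is the sense in which parts of value space that have been mastered stop consuming CPU time.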
This result also has significant consequences for the transfer of knowledge across problems. Suppose, for example, that a particular model substructure appears in many different problems. Consider a larger version of the four-room problem with many interconnected rooms. Different tasks in this domain would correspond to different positions of the reward in different rooms. Every time a policy is produced for a room, it can be added to the room's policy cache. The above linear program can be used to determine quickly whether, for some new problem, the cache in a particular room is adequate. Thus, a form of cross-task learning is achieved in which the time required to plan for new objectives declines as experience is gained with the environment. Moreover, intelligent allocation of computational resources becomes possible, since parts of the value space that have already been mastered no longer drain CPU time.

6 Conclusion

This paper presented two approaches to decoupling MDPs: a complete decoupling approach and a partial decoupling approach. With complete decoupling, the problem is divided into independent subproblems, and the solutions to these subproblems are combined in a lightweight step. Two new algorithms for determining optimal policy caches for a subproblem are presented. The significance of the first algorithm is that it uses a polynomial-time test to decide when to add new policies to the cache. The second algorithm uses a computational geometry approach that can be exponential in the fan-out of the subproblem, but can be more efficient than the first algorithm if the fan-out is small.
Since complete decoupling may not always be possible, a method for partial decoupling is presented. This method assumes that an imperfect policy cache is used by a high-level asynchronous MDP algorithm. It uses the policy cache to bound the optimal values of states in the in-space of a region with respect to the values of the states in the out-space of the region. By providing upper and lower bounds, this permits intelligent decisions about when to update the policy cache for a region based upon the algorithm's current estimate of the values of the states in the out-space of the region.

Together, these results provide a framework for large-scale parallelization of MDPs and a formal framework for the transfer of knowledge across problems that share common structures. These results can be applied hierarchically, although the optimality requirements for the subproblems will become stricter with each division if the same level of optimality is to be maintained at the top level.

This work does not address the questions of state abstraction or value function approximation. Fortunately, these techniques will complement the results presented here. The decoupled MDP algorithms will benefit from any approach that compresses the state space, especially if the compression reduces the fan-out of the regions in some decomposition of the space.

A limitation of this work is that it applies mainly to a restricted class of MDPs, those that are weakly coupled.
Moreover, the efficiency of the methods described here will depend heavily upon the manner in which the MDP is decomposed into subproblems, and, in particular, upon the fan-out of the regions in the decomposition. The reader should keep in mind, however, that this type of aggressive decoupling of MDPs is a fairly new topic, and that while the algorithms involved are, admittedly, complex, the potential benefits in parallelization and knowledge transfer across problems resulting from this line of research are substantial.

7 Acknowledgments

This work was supported in part by DARPA contract DACA76-93-C-0025 under subcontract to Information Extraction and Transport, Inc., through the generosity of the Powell Foundation and the Sloan Foundation, and by DARPA prime contract IET-1004-96-009. Some of this work was done at the University of California at Berkeley, where it was supported, in part, by ONR grant N00014-97-1-0942 and ARO MURI grant DAAH04-96-1-0341. The author benefited from helpful discussions about this and related work with David Andre, Craig Boutilier, Mike Bowling, Tom Dean, Nir Friedman, Milos Hauskrecht, Daphne Koller, Uri Lerner, Stuart Russell, Mehran Sahami and Rich Sutton. The reviewers also provided some extremely helpful comments.

References

Andre, D., Friedman, N., & Parr, R. (1997). Generalized prioritized sweeping. In Advances in Neural Information Processing Systems 10: Proceedings of the 1997 Conference, Denver, Colorado. MIT Press.

Bellman, R. E. (1957). Dynamic Programming. Princeton University Press, Princeton, New Jersey.

Bertsekas, D. C., & Tsitsiklis, J. N. (1989). Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, Englewood Cliffs, New Jersey.

Cassandra, A. R., Kaelbling, L. P., & Littman, M. L. (1994). Acting optimally in partially observable stochastic domains.
In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), pp. 1023–1028, Seattle, Washington. AAAI Press.

Dean, T., & Lin, S.-H. (1995). Decomposition techniques for planning in stochastic domains. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95), pp. 1121–1127, Montreal, Canada. Morgan Kaufmann.

Forestier, J.-P., & Varaiya, P. (1978). Multilayer control of large Markov chains. IEEE Transactions on Automatic Control, AC-23, 298–304.

Hauskrecht, M. (1998). Planning with temporally abstract actions. Tech. rep. CS-98-01, Computer Science Department, Brown University, Providence, Rhode Island.

Hauskrecht, M., Meuleau, N., Boutilier, C., Kaelbling, L. P., & Dean, T. (1998). Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98). To appear.

Lin, S.-H. (1997). Exploiting Structure for Planning and Control. Ph.D. thesis, Computer Science Department, Brown University, Providence, Rhode Island.

Lovejoy, W. S. (1991). A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research, 28(1–4), 47–66.

Moore, A. W., & Atkeson, C. G. (1993). Prioritized sweeping: reinforcement learning with less data and less time. Machine Learning, 13, 103–130.

Parr, R. (1998). Hierarchical Control and Learning for Markov Decision Processes. Ph.D. thesis, Computer Science Division, University of California, Berkeley, California.

Parr, R., & Russell, S. (1997). Reinforcement learning with hierarchies of machines. In Advances in Neural Information Processing Systems 10: Proceedings of the 1997 Conference, Denver, Colorado. MIT Press.

Precup, D., & Sutton, R. S. (1997). Multi-time models for temporally abstract planning. In Advances in Neural Information Processing Systems 10: Proceedings of the 1997 Conference, Denver, Colorado. MIT Press.
Puterman, M. L. (1994). Markov Decision Processes. Wiley, New York.

Russell, S. J., & Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs, New Jersey.

Singh, S. P. (1992). Transfer of learning by composing solutions of elemental sequential tasks. Machine Learning, 8(3), 323–340.

Sutton, R. S., Precup, D., & Singh, S. P. (1998). Between MDPs and semi-MDPs: Learning, planning, and representing knowledge at multiple temporal scales. In preparation.

Williams, R. J., & Baird, L. C., III. (1993). Tight performance bounds on greedy policies based on imperfect value functions. Tech. rep. NU-CCS-93-14, College of Computer Science, Northeastern University, Boston, Massachusetts.
