Exact (exponential) algorithms for the dominating set problem
Handbook of Elemental Abundance Data for Applied Geochemistry (original edition)

Handbook of Elemental Abundance Data for Applied Geochemistry, compiled by Chi Qinghua and Yan Mingcai. Beijing: Geological Publishing House, December 2007. ISBN 978-7-116-05536-0.

Synopsis: This handbook compiles the chemical compositions and element abundances of igneous rocks, sedimentary rocks, metamorphic rocks, soils, stream sediments, floodplain sediments, shallow-sea sediments, and the continental crust as proposed by researchers in China and abroad, and also lists the standard values of the major Chinese geochemical reference materials commonly used in exploration geochemistry and environmental geochemistry. All of the contents are basic geochemical data on important geological media that geochemists need to know.

The book is intended for researchers in geochemistry, petrology, exploration geochemistry, ecological, environmental, and agricultural geochemistry, geological sample analysis and testing, mineral exploration, and basic geology; it may also be used by researchers in other fields of the earth sciences.

On the Handbook of Elemental Abundance Data for Applied Geochemistry (in lieu of a preface)

Elemental abundance data in geochemistry are statistics of the contents of many elements, in various media and at various scales, within the Earth's five geospheres. They are important source material when applied geochemistry is used to address resource and environmental problems. Compiling these data in one place saves researchers a great deal of the labor and time otherwise spent searching the literature. This small handbook was compiled with exactly that idea in mind.
Exact algorithms for the minimum latency problem
Bang Ye Wu*, Zheng-Nan Huang, Fu-Jie Zhan
Dept. of Computer Science and Information Engineering, Shu-Te University, YenChau, Kaohsiung, Taiwan 824, R.O.C.
*Corresponding author (bangye@.tw)

Key words: algorithms, minimum latency problem, dynamic programming, branch and bound

1 Introduction

Let G = (V, E, w) be an undirected graph with positive weight w(e) on each edge e ∈ E. Given a starting vertex s ∈ V and a subset U ⊂ V as the demand vertex set, the minimum latency problem (MLP) asks for a tour P starting at s and visiting each demand vertex at least once such that the total latency of all demand vertices is minimized, in which the latency of a vertex is the length of the path from s to the first visit of the vertex. The MLP is an important problem in computer science and operations research, and is also known as the delivery man problem or the traveling repairman problem.

Similar to the well-known traveling salesperson problem (TSP), in the MLP we are asked to find an "optimal" way of routing a server through the demand vertices. The difference is the objective function. The latency of a vertex can be thought of as the delay of the service. In the MLP we care about the total delay (service quality), while in the TSP it is the total length (service cost) that matters.

The MLP on a metric space is NP-hard and also MAX-SNP-hard [4]. Polynomial time algorithms are only known for very special graphs, such as paths [1,6], edge-unweighted trees [9], trees of diameter 3 [4], trees with a constant number of leaves [8], or graphs with similar structure [12]. Even for caterpillars (paths with edges sticking out), no polynomial time algorithm has been reported. In a recent work, it was shown that the MLP on edge-weighted trees is NP-hard [11]. Due to the NP-hardness, many works have been devoted to approximation algorithms [2,3,4,7,8], and the current best approximation ratio is 3.59 [5]. More references to exact and approximation algorithms can be found in those papers.

Dynamic programming (DP) and branch-and-bound (B&B) are two popular strategies used to exactly solve NP-hard problems without exhaustive search. As pointed out in [12], the MLP can be exactly solved by a dynamic programming algorithm. However, that algorithm is still very time-consuming. By designing non-trivial lower bound functions and using a technique combining the advantages of both DP and B&B, we developed a series of exact algorithms for the MLP. Experimental results on both random and real data are also reported in this paper. The results show that our algorithm is much more efficient than the DP algorithm and the B&B algorithm, and we believe that the technique can also be applied to some other problems.

2 Preliminaries

In this paper, a graph is a simple and connected graph with a nonnegative weight on each edge. Throughout this paper, the input graph is G, and n is the number of nodes of graph G. An origin (starting vertex) is a given vertex of G. A tour is a route starting at the origin and visiting each vertex at least once. A subtour is a partial or a complete tour starting at the origin. Let H be a subgraph or a subtour. The set of vertices of H is denoted by V(H). For u, v ∈ V(G), we use d_G(u,v) to denote the length of the shortest path between u and v on G. For a subtour P, d_P(u,v) denotes the distance from the first visit of u to the first visit of v in P, and w(P) denotes the length of P.

Definition 1: Let P be a subtour starting at s on graph G. For a demand vertex v visited by P, the latency of v is defined as d_P(s,v), which is the distance from the origin to the first visit of v on P. The latency of a tour P is defined by L(P) = Σ_{v∈U} d_P(s,v), in which U is the demand vertex set.

In general, the input graph of a MLP may be any simple connected graph with nonnegative edge weights, and the demand vertex set does not necessarily include all the vertices. A metric graph is a complete graph with edge weights satisfying the triangle inequality. By a simple reduction, we may assume that the input graph is always a metric graph and all the vertices are demand vertices. Let G = (V, E, w) be the underlying graph and U ⊂ V be the demand vertex set. We first compute the metric closure Ḡ = (U, U×U, w̄) of G, in which the weight on each edge is the shortest path length between the two endpoints in G. For any tour P̄ on Ḡ, we can construct a corresponding tour P on G by simply replacing each edge in P̄ with the corresponding shortest path on G. It is easy to see that L(P) ≤ L(P̄). Conversely, given any tour P on G, we can obtain a tour P̄ on Ḡ by eliminating all vertices not in U. Since the edge weight is the shortest path length, we have L(P̄) ≤ L(P). Consequently the minimum latencies of the two graphs are the same. Furthermore, if there exists an O(T(n))-time exact or approximation algorithm for the MLP on metric graphs, the MLP on general graphs can be solved in O(T(n) + f(n)) time with the same performance guarantee, in which f(n) is the time complexity of computing the all-pairs shortest path lengths. In the remaining paragraphs, we assume that the input graph G is a metric graph and each vertex is a demand vertex. It should also be noted that the optimal tour never visits the same vertex twice in a metric graph.
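The reduction just described is standard; the following is a minimal Python sketch of it (our illustration, not code from the paper). It assumes the input is an n×n matrix with 0 on the diagonal and float('inf') for missing edges, computes all-pairs shortest paths with Floyd-Warshall (the f(n) term above), and restricts the result to the demand set.

```python
def metric_closure(w, demand):
    """Build the metric closure restricted to the demand vertices.

    w: n x n matrix, w[u][v] = edge weight, float('inf') if no edge,
    0 on the diagonal.  demand: list of demand vertices (incl. origin).
    Returns the |U| x |U| distance matrix of the metric MLP instance.
    """
    n = len(w)
    d = [row[:] for row in w]
    for k in range(n):                      # Floyd-Warshall, O(n^3)
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return [[d[u][v] for v in demand] for u in demand]
```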
3 Algorithms

3.1 Pure dynamic programming

To find the optimal tour of a MLP, a brute force algorithm checking all permutations of the vertices except for the origin takes Ω((n−1)!) time. In [12], it was pointed out that the MLP can be solved in O(n²2ⁿ) time by a dynamic programming algorithm. For completeness, we briefly explain the algorithm in the following.

Definition 2: Let P be a subtour on graph G. Define a cost function c(P) = L(P) + (n − |V(P)|)·w(P), i.e., c(P) is the total latency of the visited vertices plus the length of P multiplied by the number of vertices not yet visited.

Let P1 and P0 be two routes such that the last vertex of P1 is the first vertex of P0. We use P1//P0 to denote the route obtained by concatenating P1 and P0. For a subtour P, we say that P has configuration (R, v), in which R = V(P) and v is the last vertex of P.

The dynamic programming algorithm is based on the following property, which can be easily shown by definition. It also explains why we define the cost function c in this way.

Claim 1: Let P1 and P2 be subtours with the same configuration and c(P1) ≤ c(P2). If Y2 = P2//P0 is a complete tour, i.e., P0 is a route starting at the last vertex of P2 and visiting all the remaining vertices, then Y1 = P1//P0 is also a tour and L(Y1) ≤ L(Y2).

To find the minimum latency, by Claim 1, we only need to keep one subtour for each possible configuration. The dynamic programming algorithm starts at the subtour containing only the origin and computes the best subtour for each configuration in order of increasing number of visited vertices. The time complexity then follows since there are O(n2ⁿ) configurations and we generate O(n) subtours when a subtour is extended by one vertex.
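A compact Python sketch of this O(n²2ⁿ) DP over configurations follows (our own rendering, not the paper's C program). It stores the vertex set R of a configuration as a bitmask and uses the identity c(P//(u)) = c(P) + (n − |V(P)|)·w(v,u), which follows directly from Definition 2; by Claim 1, only the best c-value per configuration need be kept, and for a complete tour c(P) = L(P).

```python
def mlp_pure_dp(w, s=0):
    """O(n^2 2^n) dynamic program for the MLP on a metric graph.

    w: symmetric n x n metric distance matrix; s: the origin.
    c[(mask, v)] is the best c(P) over subtours P with configuration
    (mask, v); extending P by u adds (n - |V(P)|) * w[v][u] to c(P).
    """
    n = len(w)
    full = (1 << n) - 1
    INF = float("inf")
    c = {(1 << s, s): 0}
    for mask in range(1 << n):              # subsets before supersets
        k = bin(mask).count("1")            # |V(P)|
        for v in range(n):
            key = (mask, v)
            if key not in c:
                continue
            for u in range(n):
                if not (mask >> u) & 1:
                    nkey = (mask | (1 << u), u)
                    cand = c[key] + (n - k) * w[v][u]
                    if cand < c.get(nkey, INF):
                        c[nkey] = cand
    # for a complete tour P, c(P) = L(P), so the minimum is the answer
    return min(c[(full, v)] for v in range(n) if (full, v) in c)
```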
3.2 Dynamic programming with pruning

To make the program more efficient, we introduce a pruning technique into the DP algorithm, similar to the one used in a typical branch-and-bound algorithm. While the program is running, we always record an upper bound (UB) on the optimum, which is the latency of some feasible tour. For each generated subtour P, we compute a lower bound of P, which is an underestimate of the latency of any complete tour containing P as a prefix. If the lower bound of a subtour is no less than UB, we can prune the subtour without affecting the optimality of the final solution. The key points are how we compute the UB and how we estimate the lower bound of a subtour.

A pure DP algorithm does not generate any complete tour until it reaches the configurations consisting of the set of all vertices. To get an upper bound, we employ a simple greedy algorithm to build a tour. The greedy algorithm uses the "nearest vertex first" strategy: beginning with a subtour containing only the origin, we repeatedly augment the subtour by one vertex until all vertices are included, at each iteration choosing the unvisited vertex nearest to the stopping vertex of the subtour. Obviously, such a tour can be computed in O(n²) time. In addition to the initial stage, our algorithm uses the greedy method to build a tour whenever a new subtour is generated, and keeps the current best solution. A sketch of this greedy completion is given below.
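The sketch below (our illustration; names and interface are ours) generalizes the greedy routine so it can also complete an arbitrary prefix, which is how the algorithm recomputes UB' for each newly generated subtour.

```python
def greedy_upper_bound(w, latency, length, v, unvisited):
    """Complete a subtour by the 'nearest vertex first' rule, O(n^2).

    latency, length: L(P) and w(P) of the current subtour ending at v;
    unvisited: vertices not yet on the subtour.  Returns the latency of
    the completed tour, a valid upper bound UB'.
    """
    todo = set(unvisited)
    while todo:
        u = min(todo, key=lambda x: w[v][x])   # nearest unvisited vertex
        length += w[v][u]
        latency += length                      # latency of u = new length
        todo.remove(u)
        v = u
    return latency
```

For the initial stage (Step 2 of the algorithm below), the call is simply greedy_upper_bound(w, 0, 0, s, set(range(len(w))) - {s}).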
Algorithm DPwP_MLP
Input: A metric graph G = (V, E, w) and an origin s ∈ V.
Output: The latency of the optimal tour.
// Q_i is a queue for storing the generated subtours consisting of i vertices.
1: Initiate Q_1, and insert subtour (s) into Q_1.
2: Get an upper bound UB of the optimal.
3: For i ← 1 to n−1 do
4:   For each subtour P in Q_i do
5:     compute an upper bound UB' from P;
6:     if UB' < UB
7:       UB ← UB';
8:     For each vertex v not in V(P) do
9:       generate a subtour P' = P//(v);
10:      if there exists a subtour with the same configuration in Q_{i+1}
11:        keep the one with better c(·) value;
12:      else
13:        compute a lower bound LB of P';
14:        if LB < UB then insert P' into Q_{i+1};
15: Output UB as the minimum latency.

At Step 10, we need to search for a configuration in Q_{i+1}. In a typical DP algorithm, such a step can be implemented by employing an array, of which each element is for one configuration. By suitably encoding the configurations, the search can then be done in only one memory access. However, such a simple method is not suitable for our algorithm, since it requires checking every configuration, and this is exactly what we want to avoid. Because of the large size of the queue, a good data structure should be used. In our program, we use an AVL tree. In the next section, we present experimental results showing that the improvement is very significant compared to a linked list implementation.

As in a typical B&B algorithm, the lower bound function is a key to the efficiency of the algorithm. The running time depends heavily on two factors: the number of generated subtours and the time to compute a lower bound of a subtour. A lower bound function eliminating many subtours may still be bad if it suffers from a long computation time. In the following, let G = (V, E, w) be the input metric graph and s the origin. Let P be a subtour stopping at a vertex r and Y = P//P0 the best tour having P as its prefix. Let V̄ = V − V(P), n̄ = |V̄|, and P0 = (v0 = r, v1, v2, ..., v_n̄). Remember that the best tour never visits a vertex twice in a metric graph. A function is a LB function of P if the latency of Y is lower bounded by the value of the function. We begin with a simple observation. For any 1 ≤ i ≤ n̄, by the triangle inequality, we have

d_Y(s, v_i) = w(P) + d_Y(r, v_i) ≥ w(P) + w(r, v_i).

Therefore,

L(Y) ≥ L(P) + Σ_{i=1}^{n̄} (w(P) + w(r, v_i)) = L(P) + n̄·w(P) + Σ_{i=1}^{n̄} w(r, v_i) = c(P) + Σ_{v∈V̄} w(r, v).
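In code, the bound just derived (B1 in Claim 2 below) is a one-liner; this is our illustrative helper, with c_P the value c(P), r the stopping vertex of P, and remaining the set V̄:

```python
def lower_bound_B1(c_P, w, r, remaining):
    """B1(P) = c(P) + sum over unvisited v of w(r, v); O(n) per subtour."""
    return c_P + sum(w[r][v] for v in remaining)
```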
The following property is obvious, and we omit the proof.

Claim 2: The function B1(P) = c(P) + Σ_{v∈V̄} w(r, v) is a LB function of P and can be computed in O(n) time.

Next, we generalize the simple idea. Let l_i(r, v) be the length of the shortest i-edge path between vertices r and v, where an i-edge path is a path consisting of exactly i different edges. We first show the following property.

Lemma 3: For any vertices r and v, l_i(r, v) ≤ l_j(r, v) if i < j.

Proof: It is sufficient to show that l_i(r, v) ≤ l_{i+1}(r, v). Let Q = (r, u1, u2, ..., u_{i+1} = v) be the shortest (i+1)-edge path. Then Q' = (r, u2, ..., u_{i+1}) is an i-edge path, and w(Q') ≤ w(Q) since w(r, u2) ≤ w(r, u1) + w(u1, u2) by the triangle inequality. By the definition of l_i, we have l_i(r, v) ≤ w(Q'), and this completes the proof.

Note that l_1(r, v) is exactly w(r, v) by definition. By the monotonicity of l_i, it is natural to use the more general l_i in the lower bound function. In the next theorem, we establish a family of lower bound functions. Note that the function B1 coincides with the one in Claim 2.

Theorem 4: Let k ≥ 1. The function

B_k(P) = c(P) + Σ_{v∈V̄} l_k(r, v) − Σ_{i=1}^{k−1} max_{v∈V̄} { l_k(r, v) − l_i(r, v) }

is a LB function of P and can be computed in O(kn) time if the value l_i(r, v) is available for any 1 ≤ i ≤ k and any v ∈ V̄.

Proof: Clearly l_i(r, v_i) ≤ d_Y(r, v_i), since d_Y(r, v_i) is the length of an i-edge path while l_i(r, v_i) is the minimum over all such paths. Furthermore, by Lemma 3, we have l_i(r, v_j) ≤ d_Y(r, v_j) for any j ≥ i, and therefore, for k ≥ 1,

L(Y) = c(P) + Σ_{i=1}^{n̄} d_Y(r, v_i) ≥ c(P) + Σ_{i=1}^{n̄} l_i(r, v_i) ≥ c(P) + Σ_{i=1}^{k−1} l_i(r, v_i) + Σ_{i=k}^{n̄} l_k(r, v_i).   (1)

For i < k, we rewrite l_i(r, v_i) = l_k(r, v_i) − (l_k(r, v_i) − l_i(r, v_i)) in Eq. (1), and obtain

L(Y) ≥ c(P) + Σ_{i=1}^{n̄} l_k(r, v_i) − Σ_{i=1}^{k−1} (l_k(r, v_i) − l_i(r, v_i)) ≥ c(P) + Σ_{v∈V̄} l_k(r, v) − Σ_{i=1}^{k−1} max_{v∈V̄} { l_k(r, v) − l_i(r, v) }.

Finally, the time complexity is obviously O(kn).

Although it is very time-consuming to compute l_k even for small k, we compute the values only once in a preprocessing stage. When a subtour is generated, we then need only O(kn) time to obtain a lower bound. We summarize the time complexity of the algorithm in the next theorem.

Theorem 5: The algorithm DPwP_MLP with lower bound function B_k runs in O(n^{k+1} + n²T) time, in which T is the number of generated subtours.

Proof: To employ B_k as the lower bound function, we compute l_i(u, v) for every 1 ≤ i ≤ k and each vertex pair (u, v) in a preprocessing stage. Since l_i(u, v) is the length of the shortest i-edge path and an i-edge path contains exactly i−1 intermediate vertices, all these values can be computed in O(n^{k+1}) time by exhaustively checking all possible permutations. For each generated subtour, at Steps 5-7, we compute a feasible tour and update the upper bound if necessary, which takes O(n²) time. For searching the configuration in Q_{i+1} at Step 10, by employing an AVL tree, we perform O(log|Q_{i+1}|) comparisons of configurations. Since there are at most n2ⁿ configurations, the number of comparisons is O(n). A configuration consists of a vertex and a set of up to n vertices, so comparing two configurations takes O(n) time. Therefore, the total time for searching the AVL trees is O(n²T), in which T is the total number of generated subtours. For Step 13, by Theorem 4, the time for computing the lower bounds of all subtours is O(knT). For Step 14, since inserting an element into the AVL tree has the same time complexity as searching, the total time for all insertions is also O(n²T). In summary, the time complexity of the algorithm is therefore O(n^{k+1} + n²T).
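For reference, here is a direct, unoptimized Python sketch of the preprocessing and of B_k (our illustration; the enumeration follows the exhaustive-permutation argument in the proof of Theorem 5 and assumes the remaining set is non-empty).

```python
from itertools import permutations

def precompute_l(w, k):
    """l[i][r][v] = length of the shortest i-edge path from r to v.

    Exhaustive enumeration over the i-1 intermediate vertices, as in
    the proof of Theorem 5: O(n^{k+1}) overall for fixed k.
    """
    n = len(w)
    INF = float("inf")
    l = [None] * (k + 1)
    l[1] = [[w[r][v] for v in range(n)] for r in range(n)]
    for i in range(2, k + 1):
        l[i] = [[INF] * n for _ in range(n)]
        for r in range(n):
            for v in range(n):
                if v == r:
                    continue
                others = [u for u in range(n) if u != r and u != v]
                for mid in permutations(others, i - 1):
                    seq = (r,) + mid + (v,)
                    length = sum(w[a][b] for a, b in zip(seq, seq[1:]))
                    if length < l[i][r][v]:
                        l[i][r][v] = length
    return l

def lower_bound_Bk(c_P, l, k, r, remaining):
    """Theorem 4's bound; remaining = the unvisited set V-bar (non-empty)."""
    value = c_P + sum(l[k][r][v] for v in remaining)
    for i in range(1, k):
        value -= max(l[k][r][v] - l[i][r][v] for v in remaining)
    return value
```

With k = 1 the second function reduces exactly to B1 above.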
4 The experimental results

We implemented the algorithms in the C language and investigated their practical performance. All the tests were performed on personal computers, each equipped with an Intel Pentium IV 2.4 GHz CPU and 256 Mbytes of memory. Two types of test data were used: random data and real data. For each test case, the running time includes all the steps except for generating or calculating the input distances.

4.1 Random data

The random data were generated artificially, with edge weights drawn from a uniform distribution. All the edge weights are integers between 1 and 1024. In Table 1, we summarize the maximum running time for each program in the tests on random data. Algorithm DPP(i) denotes the algorithm DPwP_MLP with lower bound function B_i. For the sake of comparison, we also implemented the brute-force method (labeled BF) and the branch-and-bound method (labeled B&B(1), using the lower bound function B1). BF computes the optimal solution by simply checking all possible permutations. The B&B(1) program is similar to DPP(1) except that it does not merge subtours with the same configuration; it uses the depth-first strategy to choose the subtour to be extended, and the chosen subtour is augmented by each of the vertices not yet visited. In fact, we also implemented the branch-and-bound method with B_i, i > 1, but their behaviors are similar, and we only list B&B(1) for comparison. Algorithm DPP_L is the same as DPP(1) but uses a linked list instead of an AVL tree as the data structure for storing the configurations.

Basically, at least one hundred data instances were used for each problem size, but for BF and DP only a few instances were tested, because their performance hardly varies across inputs with the same number of vertices. Cells marked "-" indicate that we did not complete the tests for these cases because some data instances took too long. A "*" in a cell indicates that the long running time is caused by "disk swap" in the virtual memory system. In Table 2, we list the maximum number of subtours generated by each program for some typical values of n. (The column headers of both tables, reconstructed here as n = 15 through 23, were garbled in extraction.)

Table 1: The maximum running time in the random data tests (seconds, K = 1000)

Algorithm  n=15   n=16   n=17   n=18   n=19   n=20   n=21    n=22   n=23
BF         10.5K  165K   -      -      -      -      -       -      -
DP         1.45   3.27   8.38   17.7   40.0   96.8   12.5K*  -      -
DPP_L      1.07   2.77   25.2   99.4   367    4.07K  8.41K   -      -
DPP(1)     0.30   0.50   2.03   4.19   11.3   43.7   81.0    180    11.7K*
DPP(2)     0.22   0.38   1.44   2.94   7.66   29.1   54.6    166    302
DPP(3)     0.17   0.28   0.91   2.03   5.27   17.7   37.2    128    247
DPP(4)     0.25   0.47   1.00   2.06   4.91   11.1   25.8    96.6   176
DPP(5)     1.45   2.56   4.92   8.03   13.7   22.7   40.5    105    165
B&B(1)     1.80   3.91   15.0   55.7   161    1.77K  2.97K   6.50K  -

Table 2: The maximum number of generated subtours in random data tests (M = 10^6)

       DP     DPP(1)  DPP(2)  DPP(3)  DPP(4)  DPP(5)  B&B(1)
n=18   1.1M   0.23M   0.12M   0.11M   78989   69855   14.8M
n=21   10.5M  2.96M   1.99M   1.32M   0.83M   0.51M   593M
n=23   -      9.17M   7.35M   6.51M   4.51M   3.26M   -

4.2 Real data

In addition to the random data, we also used real data to test the performance of the algorithms. The data instances were chosen from TSPLIB [10] for the sake of their problem sizes. The results are shown in Table 3; note that the number appearing in the name of each instance indicates its number of vertices. In fact, we also performed tests on partial data drawn from larger instances in TSPLIB, with similar results. Roughly speaking, problems with 25-26 vertices can be solved in reasonable time. Compared with the results on random data, the performance is much better; the reason may be that the real data are more structured, so that bad cases rarely happen.

Table 3: The running time in the real data tests (seconds)

           DPP(1)  DPP(2)  DPP(3)  DPP(4)  DPP(5)  B&B(1)
Ulysses16  0.09    0.08    0.09    0.13    0.33    0.45
Ulysses22  3.40    3.53    3.50    3.42    5.55    54.47
Gr24       54.47   51.54   43.64   34.41   30.23   285.17
Fri26      39.61   37.64   32.75   26.09   27.41   257.60
5 Discussion and concluding remarks

Based on the experimental results and some other observations made during development, we draw the following conclusions.

• The algorithm DPwP_MLP takes advantage of both the dynamic programming and the branch-and-bound strategies, and significantly improves the performance.
• Using a good data structure, such as the AVL tree in our program, is very important. The reason is obvious from the numbers of generated subtours (Table 2).
• For small integers j > i, DPP(j) is better than DPP(i) once n exceeds some value.
• Theoretically, we can improve the lower bound by requiring that the i-edge paths visit only vertices in V̄. But this suffers from a long computation time and therefore performs worse overall. In fact, we tried several other lower bound functions; some of them eliminate many more subtours than B1 but perform worse.

References

[1] F. Afrati, S. Cosmadakis, C. Papadimitriou, G. Papageorgiou, and N. Papakostantinou, The complexity of the traveling repairman problem, Theoretical Informatics and Applications, 20(1) (1986) 79-87.
[2] A. Archer and D. P. Williamson, Faster approximation algorithms for the minimum latency problem, in Proc. 14th ACM-SIAM Symposium on Discrete Algorithms (SODA 2003), 2003, pp. 88-96.
[3] S. Arora and G. Karakostas, Approximation schemes for minimum latency problems, SIAM J. Comput., 32(5) (2003) 1317-1337.
[4] A. Blum, P. Chalasani, D. Coppersmith, B. Pulleyblank, P. Raghavan, and M. Sudan, The minimum latency problem, in Proc. 26th ACM Symposium on the Theory of Computing (STOC'94), 1994, pp. 163-171.
[5] K. Chaudhuri, B. Godfrey, S. Rao, and K. Talwar, Paths, trees, and minimum latency tours, in Proc. 44th Symposium on Foundations of Computer Science (FOCS 2003), 2003, pp. 36-45.
[6] A. Garcia, P. Jodrá, and J. Tejel, A note on the traveling repairman problem, Networks, 40(1) (2002) 27-31.
[7] M. Goemans and J. Kleinberg, An improved approximation ratio for the minimum latency problem, Math. Program., 82 (1998) 114-124.
[8] E. Koutsoupias, C. Papadimitriou, and M. Yannakakis, Searching a fixed graph, in Proc. 23rd International Colloquium on Automata, Languages and Programming, Lecture Notes in Comput. Sci., Vol. 1099, 1996, pp. 280-289.
[9] E. Minieka, The delivery man problem on a tree network, Ann. Oper. Res., 18 (1989) 261-266.
[10] G. Reinelt, TSPLIB—a traveling salesman problem library, ORSA Journal on Computing, 3 (1991) 376-384. See also http://www.iwr.uni-heidelberg.de/groups/comopt/software/tsplib95/.
[11] R. Sitters, The minimum latency problem is NP-hard for weighted trees, in Proc. 9th International IPCO Conference, Lecture Notes in Comput. Sci., Vol. 2337, 2002, pp. 230-239.
[12] B. Y. Wu, Polynomial time algorithms for some minimum latency problems, Inf. Process. Lett., 75(5) (2000) 225-229.
A Sequential Algorithm for Generating Random Graphs
Mohsen Bayati1, Jeong Han Kim2, Amin Saberi1
arXiv:cs/0702124v4 [] 16 Jun 2007
1 Stanford University, {bayati,saberi}@  2 Yonsei University, jehkim@yonsei.ac.kr
(FPRAS) for generating random graphs; this we can do in almost linear time. An FPRAS provides an arbitrarily close approximation in time that depends only polynomially on the input size and the desired error. (For precise definitions, see Section 2.)

Recently, sequential importance sampling (SIS) has been suggested as a more suitable approach for designing fast algorithms for this and other similar problems [18, 13, 35, 6]. Chen et al. [18] used the SIS method to generate bipartite graphs with a given degree sequence. Later Blitzstein and Diaconis [13] used a similar approach to generate general graphs. Almost all existing work on the SIS method is justified only through simulations, and for some special cases counterexamples have been proposed [11]. However, the simplicity of these algorithms and their great performance in several instances suggest that further study of the SIS method is necessary.

Our Result. Let d_1, ..., d_n be non-negative integers given for the degree sequence, with Σ_{i=1}^n d_i = 2m. Our algorithm is as follows: start with an empty graph and sequentially add edges between pairs of non-adjacent vertices. In every step of the procedure, the probability that an edge is added between two distinct vertices i and j is proportional to d̂_i d̂_j (1 − d_i d_j / 4m), where d̂_i and d̂_j denote the remaining degrees of vertices i and j. We will show that our algorithm produces an asymptotically uniform sample with running time O(m·d_max) when the maximum degree is O(m^{1/4−τ}) and τ is any positive constant. Then we use a simple SIS method to obtain an FPRAS for any ε, δ > 0 with running time O(m·d_max·ε^{−2} log(1/δ)) for generating graphs with d_max = O(m^{1/4−τ}). Moreover, we show that for d = O(n^{1/2−τ}), our algorithm can generate an asymptotically uniform d-regular graph. Our results improve the bounds of Kim and Vu [34] and Steger and Wormald [45] for regular graphs.

Related Work. McKay and Wormald [37, 39] give asymptotic estimates for the number of graphs within the range d_max = O(m^{1/3−τ}). But the error terms in their estimates are larger than what is needed to apply Jerrum, Valiant and Vazirani's [25] reduction to achieve asymptotic sampling. Jerrum and Sinclair [26], however, use a random walk on the self-reducibility tree and give an FPRAS for sampling graphs with maximum degree o(m^{1/4}). The running time of their algorithm is O(m^3 n^2 ε^{−2} log(1/δ)) [44]. A different random walk studied by [27, 28, 10] gives an FPRAS for random generation for all degree sequences for bipartite graphs and almost all degree sequences for general graphs; however, the running time of these algorithms is at least O(n^4 m^3 d_max log^5(n^2/ε) ε^{−2} log(1/δ)). For the weaker problem of generating asymptotically uniform samples (not an FPRAS), the best algorithm was given by McKay and Wormald's switching technique on the configuration model [38]. Their algorithm works for graphs with d_max^3 = O(m^2 / Σ_i d_i^2), with average running time O(m + (Σ_i d_i^2)^2). This leads to O(n^2 d^4) average running time for d-regular graphs with d = o(n^{1/3}). Very recently and independently from our work, Blanchet [12] has used McKay's estimate and the SIS technique to obtain an FPRAS with running time O(m^2) for sampling bipartite graphs with given degrees.
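A direct Python sketch of the sequential procedure described in "Our Result" above (our illustration, not the authors' code). It assumes the degree sequence is graphical with d_max small relative to m^{1/4}, so the factor (1 − d_i d_j/4m) stays positive for admissible pairs; when the procedure gets stuck, a practical implementation simply restarts.

```python
import random

def sequential_random_graph(d, rng=random):
    """Sequentially add m edges; pair (i, j) is chosen with probability
    proportional to dh_i * dh_j * (1 - d_i * d_j / (4m)), where dh is
    the vector of remaining degrees.  Returns an edge set, or None if
    no admissible pair remains (restart in that case)."""
    n, m = len(d), sum(d) // 2
    dh = list(d)                      # remaining degrees
    edges = set()
    for _ in range(m):
        pairs, weights = [], []
        for i in range(n):
            if dh[i] == 0:
                continue
            for j in range(i + 1, n):
                if dh[j] > 0 and (i, j) not in edges:
                    wgt = dh[i] * dh[j] * (1.0 - d[i] * d[j] / (4.0 * m))
                    if wgt > 0:
                        pairs.append((i, j))
                        weights.append(wgt)
        if not pairs:
            return None
        i, j = rng.choices(pairs, weights=weights, k=1)[0]
        edges.add((i, j))
        dh[i] -= 1
        dh[j] -= 1
    return edges
```

This naive version re-scans all pairs at every step and so costs O(n^2) per edge; the paper's O(m·d_max) bound relies on a more careful implementation.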
Academic English in North American Mathematics

In North American mathematical academia, academic English follows certain conventions. The following are terms and expressions commonly used in mathematical writing and communication:

● Mathematical concepts and operations:
1. Theorem: A statement that has been proven to be true.
2. Lemma: A smaller result that is often used in the proof of a larger theorem.
3. Corollary: A result that follows directly from a theorem.
4. Conjecture: A statement believed to be true, but not yet proven.

● Proof and reasoning:
1. Proof: A logical argument that demonstrates the truth of a statement.
2. Lemma proof: A proof specifically for a lemma.
3. Contradiction: A proof technique where the assumption that the statement is false leads to a contradiction.
4. Induction: A proof technique that involves proving a statement for a base case and showing that if it holds for one case, it holds for the next.

● Equations and symbols:
1. Equation: A mathematical statement that asserts the equality of two expressions.
2. Variable: A symbol that can represent any element from a set.
3. Function: A relation between a set of inputs and a set of possible outputs.
4. Integral: The concept of an antiderivative.

● Statistics and probability:
1. Probability: The likelihood of a particular event occurring.
2. Random variable: A variable whose value is subject to variations due to chance.
3. Distribution: A function or curve that describes the likelihood of different outcomes.

● Graph theory and geometry:
1. Graph: A collection of nodes and edges connecting pairs of nodes.
2. Vertex: A point in a graph.
3. Edge: A line connecting two vertices in a graph.
4. Geometric: Related to the properties and relations of points, lines, surfaces, and solids.

● Academic writing style:
1. Precision: Clear and precise language is highly valued in mathematical writing.
2. Rigor: Mathematical arguments and proofs should be logically sound and rigorous.
3. Conciseness: Expressing ideas in a clear and concise manner is important in mathematical writing.

These are some of the English terms and expressions commonly used in North American mathematical academia.
IEEE Standard for Terminology and Test Methods for ADCs (IEEE Std 1241-2000)
IEEE Std 1241-2000

IEEE Standard for Terminology and Test Methods for Analog-to-Digital Converters

Sponsor: Waveform Measurement and Analysis Technical Committee of the IEEE Instrumentation and Measurement Society. Approved 7 December 2000 by the IEEE-SA Standards Board.

Abstract: IEEE Std 1241-2000 identifies analog-to-digital converter (ADC) error sources and provides test methods with which to perform the required error measurements. The information in this standard is useful both to manufacturers and to users of ADCs in that it provides a basis for evaluating and comparing existing devices, as well as providing a template for writing specifications for the procurement of new ones. In some applications, the information provided by the tests described in this standard can be used to correct ADC errors, e.g., correction for gain and offset errors. This standard also presents terminology and definitions to aid the user in defining and testing ADCs.

Keywords: ADC, A/D converter, analog-to-digital converter, digitizer, terminology, test methods
Introduction

(This introduction is not a part of IEEE Std 1241-2000, IEEE Standard for Terminology and Test Methods for Analog-to-Digital Converters.)

This standard defines the terms, definitions, and test methods used to specify, characterize, and test analog-to-digital converters (ADCs). It is intended for the following:

—Individuals and organizations who specify ADCs to be purchased
—Individuals and organizations who purchase ADCs to be applied in their products
—Individuals and organizations whose responsibility is to characterize and write reports on ADCs available for use in specific applications
—Suppliers interested in providing high-quality and high-performance ADCs to acquirers

This standard is designed to help organizations and individuals

—Incorporate quality considerations during the definition, evaluation, selection, and acceptance of supplier ADCs for operational use in their equipment
—Determine how supplier ADCs should be evaluated, tested, and accepted for delivery to end users

This standard is intended to satisfy the following objectives:

—Promote consistency within organizations in acquiring third-party ADCs from component suppliers
—Provide useful practices on including quality considerations during acquisition planning
—Provide useful practices on evaluating and qualifying supplier capabilities to meet user requirements
—Provide useful practices on evaluating and qualifying supplier ADCs
—Assist individuals and organizations judging the quality and suitability of supplier ADCs for referral to end users

Several standards have previously been written that address the testing of analog-to-digital converters either directly or indirectly. These include:

—IEEE Std 1057-1994, which describes the testing of waveform recorders. This standard has been used as a guide for many of the techniques described in this standard.
—IEEE Std 746-1984 [B16], which addresses the testing of analog-to-digital and digital-to-analog converters used for PCM television video signal processing.
—JESD99-1 [B21], which deals with the terms and definitions used to describe analog-to-digital and digital-to-analog converters. That standard does not include test methods.

IEEE Std 1241-2000 for analog-to-digital converters is intended to focus specifically on terms and definitions as well as test methods for ADCs for a wide range of applications. Information on references can be found in Clause 2; the numbers in brackets correspond to entries in the bibliography in Annex C.
Contents

1. Overview
  1.1 Scope
  1.2 Analog-to-digital converter background
  1.3 Guidance to the user
  1.4 Manufacturer-supplied information
2. References
3. Definitions and symbols
  3.1 Definitions
  3.2 Symbols and acronyms
4. Test methods
  4.1 General
  4.2 Analog input
  4.3 Static gain and offset
  4.4 Linearity
  4.5 Noise (total)
  4.6 Step response parameters
  4.7 Frequency response parameters
  4.8 Differential gain and phase
  4.9 Aperture effects
  4.10 Digital logic signals
  4.11 Pipeline delay
  4.12 Out-of-range recovery
  4.13 Word error rate
  4.14 Differential input specifications
  4.15 Comments on reference signals
  4.16 Power supply parameters
Annex A (informative) Comment on errors associated with word-error-rate measurement
Annex B (informative) Testing an ADC linearized with pseudorandom dither
Annex C (informative) Bibliography

1. Overview

This standard is divided into four clauses plus annexes. Clause 1 is a basic orientation. For further investigation, users of this standard can consult Clause 2, which contains references to other IEEE standards on waveform measurement and relevant International Organization for Standardization (ISO) documents. The definitions of technical terms and symbols used in this standard are presented in Clause 3. Clause 4 presents a wide range of tests that measure the performance of an analog-to-digital converter. Annexes, containing the bibliography and informative comments on the tests presented in Clause 4, augment the standard.

1.1 Scope

The material presented in this standard is intended to provide common terminology and test methods for the testing and evaluation of analog-to-digital converters (ADCs). This standard considers only those ADCs whose output values have discrete values at discrete times, i.e., they are quantized and sampled. In general, this quantization is assumed to be nominally uniform (the input-output transfer curve is approximately a straight line) as discussed further in 1.3, and the sampling is assumed to be at a nominally uniform rate. Some but not all of the test methods in this standard can be used for ADCs that are designed for non-uniform quantization.

This standard identifies ADC error sources and provides test methods with which to perform the required error measurements. The information in this standard is useful both to manufacturers and to users of ADCs in that it provides a basis for evaluating and comparing existing devices, as well as providing a template for writing specifications for the procurement of new ones. In some applications, the information provided by the tests described in this standard can be used to correct
ADC errors, e.g., correction for gain and offset errors.

The reader should note that this standard has many similarities to IEEE Std 1057-1994. Many of the tests and terms are nearly the same, since ADCs are a necessary part of digitizing waveform recorders.

1.2 Analog-to-digital converter background

This standard considers only those ADCs whose output values have discrete values at discrete times, i.e., they are quantized and sampled. Although different methods exist for representing a continuous analog signal as a discrete sequence of binary words, an underlying model implicit in many of the tests in this standard assumes that the relationship between the input signal and the output values approximates the staircase transfer curve depicted in Figure 1a. Applying this model to a voltage-input ADC, the full-scale input range (FS) at the ADC is divided into uniform intervals, known as code bins, with nominal width Q. The number of code transition levels in the discrete transfer function is equal to 2^N − 1, where N is the number of digitized bits of the ADC. Note that there are ADCs that are designed such that N is not an integer, i.e., the number of code transition levels is not an integral power of two. Inputs below the first transition or above the last transition are represented by the most negative and positive output codes, respectively.

Note, however, that two conventions exist for relating V_min and V_max to the nominal transition points between code levels: mid-tread and mid-riser. The dotted lines at V_min, V_max, and (V_min + V_max)/2 indicate what is often called the mid-tread convention, where the first transition is Q/2 above V_min and the last transition is 3Q/2 below V_max. This convention gets its name from the fact that the midpoint of the range, (V_min + V_max)/2, occurs in the middle of a code, i.e., on the tread of the staircase transfer function. The second convention, called the mid-riser convention, is indicated in the figure by dashed lines at V_min, V_max, and (V_min + V_max)/2.
In this convention, V_min is −Q from the first transition, V_max is +Q from the last transition, and the midpoint, (V_min + V_max)/2, occurs on a staircase riser. The difference between the two conventions is a displacement along the voltage axis by an amount Q/2. For all tests in this standard, this displacement has no effect on the results and either convention may be used. The one place where it does matter is when a device provides or expects user-provided reference signals. In this case the manufacturer must provide the necessary information relating the reference levels to the code transitions. In both conventions the number of code transitions is 2^N − 1 and the full-scale range, FSR, is from V_min to V_max.

Even in an ideal ADC, the quantization process produces errors. These errors contribute to the difference between the actual transfer curve and the ideal straight-line transfer curve, which is plotted as a function of the input signal in Figure 1b.

To use this standard, the user must understand how the transfer function maps its input values to output codewords, and how these output codewords are converted to the code bin numbering convention used in this standard. As shown in Figure 1a, the lowest code bin is numbered 0, the next is 1, and so on up to the highest code bin, numbered (2^N − 1). In addition to unsigned binary (Figure 1a), ADCs may use 2's complement, sign-magnitude, Gray, Binary-Coded-Decimal (BCD), or other output coding schemes. In these cases, a simple mapping of the ADC's consecutive output codes to the unsigned binary codes can be used in applying various tests in this standard. Note that in the case of an ADC whose number of distinct output codes is not an integral power of 2 (e.g., a BCD-coded ADC), the number of digitized bits N is still defined, but will not be an integer.

Real ADCs have other errors in addition to the nominal quantization error shown in Figure 1b. All errors can be divided into the categories of static and dynamic, depending on the rate of change of the input signal at the time of digitization. A slowly varying input can be considered a static signal if its effects are equivalent to those of a constant signal. Static errors, which include the quantization error, usually result from non-ideal spacing of the code transition levels. Dynamic errors occur because of additional sources of error induced by the time variation of the analog signal being sampled. Sources include harmonic distortion from the analog input stages, signal-dependent variations in the time of samples, dynamic effects in internal amplifier and comparator stages, and frequency-dependent variation in the spacing of the quantization levels.

1.3 Guidance to the user

1.3.1 Interfacing

ADCs present unique interfacing challenges, and without careful attention users can experience substandard results. As with all mixed-signal devices, ADCs perform as expected only when the analog and digital domains are brought together in a well-controlled fashion. The user should fully understand the manufacturer's recommendations with regard to proper signal buffering and loading, input signal connections, transmission line matching, circuit layout patterns, power supply decoupling, and operating conditions. Edge characteristics for start-convert pulse(s) and clock(s) must be carefully chosen to ensure that input signal purity is maintained with sufficient margin up to the analog input pin(s).

(Figure 1—Staircase ADC transfer function, having full-scale range FSR and 2^N − 1 levels, corresponding to N-bit quantization.)

Most manufacturers now provide excellent ADC evaluation boards, which demonstrate
recommended layout techniques, signal conditioning, and interfacing for their ADCs. If the characteristics of a new ADC are not well understood, then these boards should be analyzed or used before starting a new layout.

1.3.2 Test conditions

ADC test specifications can be split into two groups: test conditions and test results. Typical examples of the former are: temperature, power supply voltages, clock frequency, and reference voltages. Examples of the latter are: power dissipation, effective number of bits, spurious free dynamic range (SFDR), and integral non-linearity (INL). The test methods defined in this standard describe the measurement of test results for given test conditions.

ADC specification sheets will often give allowed ranges for some test condition (e.g., power supply ranges). This implies that the ADC will function properly and that the test results will fall within their specified ranges for all test conditions within their specified ranges. Since the test condition ranges are generally specified in continuous intervals, they describe an infinite number of possible test conditions, which obviously cannot be exhaustively tested. It is up to the manufacturer or tester of an ADC to determine from design knowledge and/or testing the effect of the test conditions on the test result, and from there to determine the appropriate set of test conditions needed to accurately characterize the range of test results. For example, knowledge of the design may be sufficient to know that the highest power dissipation (test result) will occur at the highest power supply voltage (test condition), so the power dissipation test need be run only at the high end of the supply voltage range to check that the dissipation is within the maximum of its specified range. It is very important that relevant test conditions be stated when presenting test results.

1.3.3 Test equipment

One must ensure that the performance of the test equipment used for these tests significantly exceeds the desired performance of the ADC under test. Users will likely need to include additional signal conditioning in the form of filters and pulse shapers. Accessories such as terminators, attenuators, delay lines, and other such devices are usually needed to match signal levels and to provide signal isolation to avoid corrupting the input stimuli.

Quality testing requires following established procedures, most notably those specified in ISO 9001:2000 [B18]. In particular, traceability of instrumental calibration to a known standard is important. Commonly used test setups are described in 4.1.1.

1.3.4 Test selection

When choosing which parameters to measure, one should follow the outline and hints in this clause to develop a procedure that logically and efficiently performs all needed tests on each unique setup.
The standard has been designed to facilitate the development of these test procedures. In this standard the discrete Fourier transform (DFT) is used extensively for the extraction of frequency domain parameters, because it provides numerous evaluation parameters from a single data record. DFT testing is the most prevalent technique used in the ADC manufacturing community, although the sine-fit test, also described in the standard, provides meaningful data. Nearly every user requires that the ADC meet or exceed a minimum signal-to-noise-and-distortion ratio (SINAD) limit for the application and that the nonlinearity of the ADC be well understood. Certainly, the extent to which this standard is applied will depend upon the application; hence, the procedure should be tailored for each unique characterization plan.

1.4 Manufacturer-supplied information

1.4.1 General information

Manufacturers shall supply the following general information:
a) Model number
b) Physical characteristics: dimensions, packaging, pinouts
c) Power requirements
d) Environmental conditions: safe operating, non-operating, and specified performance temperature range; altitude limitations; humidity limits, operating and storage; vibration tolerance; and compliance with applicable electromagnetic interference specifications
e) Any special or peculiar characteristics
f) Compliance with other specifications
g) Calibration interval, if required by ISO 10012-2:1997 [B19]
h) Control signal characteristics
i) Output signal characteristics
j) Pipeline delay (if any)
k) Exceptions to the above parameters where applicable

1.4.2 Minimum specifications

The manufacturer shall provide the following specifications (see Clause 3 for definitions):
a) Number of digitized bits
b) Range of allowable sample rates
c) Analog bandwidth
d) Input signal full-scale range with nominal reference signal levels
e) Input impedance
f) Reference signal levels to be applied
g) Supply voltages
h) Supply currents (max, typ)
i) Power dissipation (max, typ)

1.4.3 Additional specifications

a) Gain error
b) Offset error
c) Differential nonlinearity
d) Harmonic distortion and spurious response
e) Integral nonlinearity
f) Maximum static error
g) Signal-to-noise ratio
h) Effective bits
i) Random noise
j) Frequency response
k) Settling time
l) Transition duration of step response (rise time)
m) Slew rate limit
n) Overshoot and precursors
o) Aperture uncertainty (short-term time-base instability)
p) Crosstalk
q) Monotonicity
r) Hysteresis
s) Out-of-range recovery
t) Word error rate
u) Common-mode rejection ratio
v) Maximum common-mode signal level
w) Differential input impedance
x) Intermodulation distortion
y) Noise power ratio
z) Differential gain and phase

1.4.4 Critical ADC parameters

Table 1 is presented as a guide for many of the most common ADC applications. The wide range of ADC applications makes a comprehensive listing impossible. This table is intended to be a helpful starting point for users to apply this standard to their particular applications.

Table 1—Critical ADC parameters (typical application: critical parameters; performance issues)

Audio: SINAD, THD; power consumption, crosstalk and gain matching.
Automatic control: monotonicity, short-term settling, long-term stability; transfer function, crosstalk and gain matching, temperature stability.
Digital oscilloscope/waveform recorder: SINAD, ENOB, bandwidth, out-of-range recovery, word error rate; SINAD for wide bandwidth amplitude resolution, low thermal noise for repeatability, bit error rate.
Geophysical: THD, SINAD, long-term stability; millihertz response.
Image processing: DNL, INL, SINAD, ENOB, out-of-range recovery, full-scale step response; DNL for sharp-edge detection,
high resolution at switching rate, recovery for blooming.
Radar and sonar: SINAD, IMD, ENOB, SFDR, out-of-range recovery; SINAD and IMD for clutter cancellation and Doppler processing.
Spectrum analysis: SINAD, ENOB, SFDR; SINAD and SFDR for high linear dynamic range measurements.
Spread spectrum communication: SINAD, IMD, ENOB, SFDR, NPR, noise-to-distortion ratio; IMD for quantization of small signals in a strong interference environment, SFDR for spatial filtering, NPR for interchannel crosstalk.
Telecommunication personal communications: SINAD, NPR, SFDR, IMD, bit error rate, word error rate; wide input bandwidth channel bank, interchannel crosstalk, compression, power consumption.
Video: DNL, SINAD, SFDR, DG, DP; differential gain and phase errors, frequency response.
Wideband digital receivers (SIGINT, ELINT, COMINT): SFDR, IMD, SINAD; linear dynamic range for detection of low-level signals in a strong interference environment, sampling frequency.

Abbreviations used in Table 1: COMINT = communications intelligence; DNL = differential nonlinearity; ENOB = effective number of bits; ELINT = electronic intelligence; NPR = noise power ratio; INL = integral nonlinearity; DG = differential gain error; SIGINT = signal intelligence; SINAD = signal-to-noise and distortion ratio; THD = total harmonic distortion; IMD = intermodulation distortion; SFDR = spurious free dynamic range; DP = differential phase error.

2. References

This standard shall be used in conjunction with the following publications. When the following specifications are superseded by an approved revision, the revision shall apply.

IEC 60469-2 (1987-12), Pulse measurement and analysis, general considerations.
IEEE Std 1057-1994, IEEE Standard for Digitizing Waveform Recorders.

3. Definitions and symbols

For the purposes of this standard, the following terms and definitions apply. The Authoritative Dictionary of IEEE Standards Terms [B15] should be referenced for terms not defined in this clause.

3.1 Definitions

3.1.1 AC-coupled analog-to-digital converter: An analog-to-digital converter utilizing a network which passes only the varying ac portion, not the static dc portion, of the analog input signal to the quantizer.

3.1.2 alternation band: The range of input levels which causes the converter output to alternate between two adjacent codes. A property of some analog-to-digital converters, it is the complement of the hysteresis property.

3.1.3 analog-to-digital converter (ADC): A device that converts a continuous time signal into a discrete-time discrete-amplitude signal.

3.1.4 aperture delay: The delay from a threshold crossing of the analog-to-digital converter clock which causes a sample of the analog input to be taken to the center of the aperture for that sample.
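To make the two conventions of 1.2 concrete, here is a small illustrative model of an ideal N-bit quantizer (our sketch, not part of the standard). It assumes the nominal code bin width Q = FSR/2^N; with mid_tread=True the first transition sits Q/2 above V_min, otherwise Q above V_min, matching the mid-tread and mid-riser conventions described above.

```python
import math

def ideal_adc_code(v, v_min, v_max, n_bits, mid_tread=True):
    """Map input voltage v to a code bin number in 0 .. 2**n_bits - 1.

    Ideal staircase only: real ADCs add the static and dynamic errors
    whose measurement this standard defines.
    """
    levels = 2 ** n_bits
    q = (v_max - v_min) / levels            # nominal code bin width Q
    offset = 0.5 if mid_tread else 0.0      # shifts transitions by Q/2
    code = math.floor((v - v_min) / q + offset)
    return max(0, min(levels - 1, code))    # clamp out-of-range inputs
```

For example, with v_min = 0, v_max = 1, and n_bits = 3, an input of 0.07 maps to code 1 under mid-tread (first transition at Q/2 = 0.0625) but to code 0 under mid-riser (first transition at Q = 0.125).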
Research on Visual Inspection Algorithms for Defects in Textured Objects (graduate thesis)

Abstract

In the highly competitive environment of industrial automated production, machine vision plays a pivotal role in product quality control, and its application to defect inspection is becoming increasingly common. Compared with conventional inspection techniques, automated visual inspection systems are more economical, faster, more efficient, and safer. Textured objects are ubiquitous in industrial production: substrates used for semiconductor assembly and packaging, light-emitting diodes, printed circuit boards in modern electronic systems, and cloth and fabrics in the textile industry can all be regarded as objects with textured surfaces. This thesis is devoted to defect inspection techniques for textured objects, aiming to provide efficient and reliable detection algorithms for their automated inspection.

Texture is an important feature for describing image content, and texture analysis has been successfully applied to texture segmentation and texture classification. This work proposes a defect detection algorithm based on texture analysis and a reference-comparison scheme. The algorithm tolerates image registration errors caused by object distortion and is robust to the influence of texture. It is designed to provide rich and physically meaningful descriptions of the detected defect regions, such as their size, shape, brightness contrast, and spatial distribution. Moreover, when a reference image is available, the algorithm can be applied to both homogeneously and inhomogeneously textured objects, and it also achieves good results on non-textured objects.

Throughout the detection process, we adopt steerable-pyramid texture analysis and reconstruction. Unlike traditional wavelet texture analysis, we add a tolerance-control step in the wavelet domain to handle object distortion and texture influence, thereby achieving tolerance to object distortion and robustness to texture. Finally, steerable-pyramid reconstruction ensures that the physical attributes of the defect regions are recovered accurately. In the experimental stage, we tested a series of images of practical application value. The experimental results show that the proposed defect detection algorithm for textured objects is efficient and easy to implement.
Keywords: defect detection, texture, object distortion, steerable pyramid, reconstruction
Konrad-Zuse-Zentrum für Informationstechnik Berlin, Takustraße 7, D-14195 Berlin-Dahlem, Germany

On exact algorithms for treewidth

Hans L. Bodlaender, Fedor V. Fomin, Arie M. C. A. Koster, Dieter Kratsch, Dimitrios M. Thilikos

Abstract

We give experimental and theoretical results on the problem of computing the treewidth of a graph by exact exponential time algorithms using exponential space or using only polynomial space. We first report on an implementation of a dynamic programming algorithm for computing the treewidth of a graph with running time O*(2^n). This algorithm is based on the old dynamic programming method introduced by Held and Karp for the Traveling Salesman problem. We use some optimizations that do not affect the worst case running time but improve on the running time on actual instances and can be seen to be practical for small instances. However, our experiments show that the space used by the algorithm is an important factor in what input sizes the algorithm can handle. For this purpose, we settle the problem of computing treewidth under the restriction that the space used is only polynomial. In this direction we give a simple O*(4^n) algorithm that requires polynomial space. We also show that with a more complicated algorithm, using balanced separators, Treewidth can be computed in O*(2.9512^n) time and polynomial space.

1 Introduction

The use of treewidth in several application areas requires efficient algorithms for computing the treewidth and optimal width tree decompositions of given graphs. In the past years, a large number of papers appeared studying the problem to determine the treewidth of a graph, including both theoretical and experimental results; see e.g. [4] for an overview. Since the problem is NP-complete [1], there is little hope of finding an algorithm that can determine the treewidth of a graph in polynomial time. There are several exponential time (exact) algorithms known in the literature for the treewidth problem. (See the surveys [11, 26] for an introduction to the area of exponential algorithms.) Arnborg et al. [1] gave an algorithm that tests in O(n^{k+2}) time if a given graph has treewidth at most k. It is not hard to observe that the algorithm runs for variable k in O*(2^n) time (the O*-notation suppresses factors polynomial in n). See also [22]. In 2004, Fomin et al. [12] presented an O(1.9601^n) algorithm to compute the treewidth, based on minimal separators and potential maximal cliques of graphs, using the paradigms introduced by Bouchitté and Todinca [7, 6]. The analysis of the algorithm of Fomin et al. from [12] was improved by Villanger [25], who showed that the treewidth of a graph can be computed in O(1.8899^n) time. While the algorithms from [12, 25] provide the best known running time, they are based on computations of potential maximal cliques and are difficult to implement.

In this paper we try another approach to compute the treewidth, which seems to be much more suitable for implementations. While Treewidth is usually formulated as the problem to find a tree decomposition of minimum width, it is possible to formulate it as the problem of finding a linear ordering of (the vertices of) the graph such that a specific cost measure of the ordering is as small as possible. Several existing algorithms and heuristics for treewidth are based on this linear ordering characterization of treewidth, see e.g. [2, 8, 15].
In this paper, we exploit this characterization again, together with a lesser known property of the characterization. We show that an old dynamic programming method, introduced by Held and Karp for the Traveling Salesman problem [18] in 1962, can be adapted and used to compute the treewidth of given graphs. Suppressing polynomial factors, the time and space bounds of the algorithm for treewidth are the same as those of the algorithm of Held and Karp for TSP: O*(2^n) running time and O*(2^n) space. The Held-Karp algorithm tabulates some information for pairs (S,v), where S is a subset of the vertices and v is a vertex (from S); a small variation of the scheme allows us to save a factor O(n) on the space for the problems considered in this paper: we tabulate information for all subsets S ⊆ V of vertices. We have carried out experiments that show that the method works well to compute the treewidth of graphs with up to around forty to fifty vertices. For larger graphs, the space requirement of the algorithm appears to be the bottleneck. This raises the question: are there polynomial space algorithms to compute the treewidth with running time of the form O*(c^n) for some constant c? In this paper we answer this question in the affirmative. We show that there is an algorithm to compute the treewidth that uses O*(4^n) time and only polynomial space. This algorithm uses a simple recursive divide-and-conquer technique and is similar to the polynomial space algorithm of Gurevich and Shelah [17] for Hamiltonian Path. Finally, we provide further theoretical results improving upon the running time of the polynomial space algorithm for Treewidth. Using balanced separators, we obtain an algorithm for Treewidth that uses O*(2.9512^n) time and polynomial space. It should be noted that this result is only theoretical: the algorithm must consider many subsets of a specific size of the set of vertices. Thus, we did not carry out an experimental evaluation of the polynomial space algorithms.

2 Preliminaries

2.1 Definitions

We assume the reader to be familiar with standard notions from graph theory. Throughout this paper, n = |V| denotes the number of vertices of graph G = (V,E). A graph G = (V,E) is chordal if every cycle in G of length at least four has a chord, i.e., an edge connecting two non-consecutive vertices of the cycle. A triangulation of a graph G = (V,E) is a graph H = (V,F) that contains G as a subgraph (E ⊆ F) and is chordal. H = (V,F) is a minimal triangulation of G = (V,E) if H is a triangulation of G and there does not exist a triangulation H′ = (V,F′) of G with H′ a proper subgraph of H. For a graph G = (V,E) and a set of vertices W ⊆ V, the subgraph of G induced by W is the graph G[W] = (W, {{v,w} ∈ E | v,w ∈ W}).

Definition 1. A tree decomposition of a graph G = (V,E) is a pair ({X_i | i ∈ I}, T = (I,F)), with {X_i | i ∈ I} a collection of subsets of V, called bags, and T = (I,F) a tree, such that
• for all v ∈ V, there exists an i ∈ I with v ∈ X_i;
• for all {v,w} ∈ E, there exists an i ∈ I with v,w ∈ X_i;
• for all v ∈ V, the set I_v = {i ∈ I | v ∈ X_i} forms a connected subgraph (subtree) of T.
The width of tree decomposition ({X_i | i ∈ I}, T = (I,F)) equals max_{i∈I} |X_i| − 1. The treewidth of a graph G, tw(G), is the minimum width over all tree decompositions of G.

The following alternative characterization of treewidth is well known, see e.g. [3].
Proposition 2. Let G = (V,E) be a graph and k an integer. The following are equivalent.
1. G has treewidth at most k.
2. G has a triangulation H = (V,F) with the maximum size of a clique in H at most k+1.
3. G has a minimal triangulation H = (V,F) with the maximum size of a clique in H at most k+1.

2.2 Treewidth as a Linear Ordering Problem

It is well known that treewidth can be formulated as a linear ordering problem, and this is exploited in several algorithms for determining the treewidth, see e.g. [2,8,9,15]. A linear ordering of a graph G = (V,E) is a bijection π : V → {1,2,...,|V|}. For a linear ordering π and v ∈ V, we denote by π_{<,v} the set of vertices that appear before v in the ordering: π_{<,v} = {w ∈ V | π(w) < π(v)}. Likewise, we define π_{≤,v}, π_{>,v}, and π_{≥,v}. A linear ordering π of G is a perfect elimination scheme if, for each vertex, its higher numbered neighbors form a clique, i.e., for each i ∈ {1,2,...,|V|}, the set {π^{-1}(j) | {π^{-1}(i), π^{-1}(j)} ∈ E ∧ j > i} is a clique. It is well known that a graph has a perfect elimination scheme if and only if it is chordal, see [16, Chapter 4]. For an arbitrary graph G, a linear ordering π defines a triangulation H of G that has π as perfect elimination scheme. The triangulation of G with respect to π is built as follows: first set G_0 = G, and then, for i = 1 to n, obtain G_i from G_{i−1} by adding an edge between each pair of non-adjacent higher numbered neighbors of π^{-1}(i). One can observe that the resulting graph H = G_n is chordal, has π as perfect elimination scheme, and contains G as subgraph. For our algorithms, we want to avoid working with the triangulation explicitly. The following predicate allows us to 'hide' the triangulation. For a linear ordering π and two vertices v,w ∈ V, we say that P_π(v,w) holds if and only if there is a path v, x_1, x_2, ..., x_r, w from v to w in G such that for each i, 1 ≤ i ≤ r, π(x_i) < π(v) and π(x_i) < π(w). In other words, P_π(v,w) is true if and only if there is a path from v to w such that all internal vertices come before v and w in the ordering π. Note that the definition implies that P_π(v,w) is always true when v = w or when {v,w} ∈ E. With R_π(v) we denote the number of higher numbered vertices w ∈ V for which P_π(v,w) holds, i.e., R_π(v) = |{w ∈ V | π(w) > π(v) ∧ P_π(v,w)}|. The proof of the following proposition is an immediate consequence of a lemma of Rose et al. [21]. (See also [3,8,9].)

Proposition 3. Let G = (V,E) be a graph and k a non-negative integer. The treewidth of G is at most k if and only if there is a linear ordering π of G such that for each v ∈ V, R_π(v) ≤ k.
Proof: We use the following result from [21]. For a given graph G = (V,E) and a linear ordering π, we have for each pair of distinct vertices v,w ∈ V: {v,w} is an edge in the triangulation H = (V,E_H) with respect to π, if and only if P_π(v,w) is true. Also, we use the result of Fulkerson and Gross [13] that if π is a perfect elimination scheme of a chordal graph H = (V,E_H), then the maximum clique size of H is one larger than the maximum, over all v ∈ V, of the number of higher numbered neighbors |{w ∈ V | {v,w} ∈ E_H ∧ π(w) > π(v)}|. Now, the treewidth of G is at most k, if and only if there is a triangulation H = (V,E_H) of G with maximum clique size at most k+1 (Proposition 2), if and only if, for a perfect elimination scheme π of triangulation H, we have for each v ∈ V:

k ≥ |{w ∈ V | {v,w} ∈ E_H ∧ π(w) > π(v)}| = |{w ∈ V | P_π(v,w) ∧ π(w) > π(v)}| = R_π(v). ⊓⊔

Let Π(S) be the set of all permutations of a set S; thus Π(V) is the set of all linear orderings of G. Write Π(S,R) for the collection of permutations of S that end with the vertices in R, i.e., with the property that for each v ∈ R: π(v) ≥ |S| − |R| + 1. For a graph G = (V,E), a set of vertices S ⊆ V, and a vertex v ∈ V − S, we define

Q_G(S,v) = {w ∈ V − S − {v} | there is a path from v to w in G[S ∪ {v,w}]}.

If G is clear from the context, we drop the subscript G. Let us note that |Q(S,v)| can be computed in O(n+m) time by checking, for each w ∈ V − S − {v}, whether w has a neighbor in the component of G[S ∪ {v}] containing v. Also note that R_π(v) = |Q(π_{<,v}, v)| for any v ∈ V and any linear ordering π ∈ Π(V).

3 A dynamic programming algorithm for treewidth

The results of this section are based on the observation that the value R_π(v) only depends on v, G, and the set of vertices left of v in π. We define

TW_G(S) = min_{π∈Π(V)} max_{v∈S} |Q_G(π_{<,v}, v)|.

Again, G is usually clear from the context and dropped as subscript. The main idea of the algorithm in this section is to compute TW_G(S) for all subsets S ⊆ V using dynamic programming. The next lemma shows that this solves the treewidth problem.

Lemma 4. For each graph G = (V,E), the treewidth of G equals TW(V).

Proof: Using Proposition 3, we have

tw(G) = min_{π∈Π(V)} max_{v∈V} R_π(v) = min_{π∈Π(V)} max_{v∈V} |Q(π_{<,v}, v)| = TW(V). ⊓⊔

The following lemma gives the recursive formulation that allows us to compute the values TW(S) with dynamic programming.

Lemma 5. For any graph G = (V,E) and any set of vertices S ⊆ V, S ≠ ∅,

TW(S) = min_{v∈S} max{TW(S − {v}), |Q(S − {v}, v)|}.

Proof: Let π ∈ Π(V) be a permutation with TW(S) = max_{w∈S} |Q(π_{<,w}, w)|. Let v be the vertex from S with the largest index in π, i.e., the vertex with S ⊆ π_{≤,v}. From the definition of TW, it directly follows that TW(S) ≥ TW(S − {v}). Also, as S ⊆ π_{≤,v}, we have |Q(S − {v}, v)| ≤ |Q(π_{<,v}, v)|. Hence

TW(S) ≥ max{TW(S − {v}), max_{w∈S} |Q(π_{<,w}, w)|} ≥ max{TW(S − {v}), |Q(π_{<,v}, v)|} ≥ max{TW(S − {v}), |Q(S − {v}, v)|}.

Thus, TW(S) ≥ min_{v∈S} max{TW(S − {v}), |Q(S − {v}, v)|}.

For the other direction, let v be an arbitrary vertex from S. Suppose π ∈ Π(V) is a permutation with TW(S − {v}) = max_{w∈S−{v}} |Q(π_{<,w}, w)|. Let π′ ∈ Π(V) be the permutation obtained by first taking the vertices in S − {v} in the order in which they appear in π, then taking v, and then taking the vertices in V − S in an arbitrary order. Note that we have π′_{<,w} ⊆ π_{<,w} for all w ∈ S − {v}, and that π′_{<,v} = S − {v}. Now

TW(S) ≤ max_{w∈S} |Q(π′_{<,w}, w)| = max{max_{w∈S−{v}} |Q(π′_{<,w}, w)|, |Q(π′_{<,v}, v)|} ≤ max{max_{w∈S−{v}} |Q(π_{<,w}, w)|, |Q(S − {v}, v)|} = max{TW(S − {v}), |Q(S − {v}, v)|}. ⊓⊔
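The connectivity check behind Q(S − {v}, v) in Lemma 5's recurrence is simple to implement. Below is a minimal Python sketch (our own illustration, not the paper's implementation, which is in C++) that computes |Q_G(S,v)| for a graph given as an adjacency-set dictionary, following the O(n+m) procedure described above: find the component of G[S ∪ {v}] containing v, then count the vertices outside S ∪ {v} with a neighbor in that component.

from collections import deque

def Q(adj, S, v):
    """|Q_G(S, v)|: number of vertices w outside S ∪ {v} reachable from v
    by a path whose internal vertices all lie in S.
    adj: dict mapping each vertex to a set of neighbors; S: set of vertices."""
    # Breadth-first search restricted to S ∪ {v} finds the component
    # of G[S ∪ {v}] that contains v.
    allowed = S | {v}
    comp, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in allowed and w not in comp:
                comp.add(w)
                queue.append(w)
    # Count vertices outside S ∪ {v} adjacent to that component.
    return len({w for u in comp for w in adj[u] if w not in allowed})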
This gives us the following relatively simple algorithm for Treewidth with O*(2^n) worst case running time and space.

Theorem 6. The treewidth of a graph on n vertices can be determined in O*(2^n) time and O*(2^n) space.

Proof: By Lemma 5, we almost directly obtain a Held-Karp-like dynamic programming algorithm for the problem. In order of increasing sizes, we compute TW(S) for each S ⊆ V using Lemma 5. Below, we give pseudo-code for a simple form of the algorithm Dynamic-Programming-Treewidth. The algorithm uses O*(2^n) time, as we do polynomially many steps per subset of V. The algorithm also keeps all subsets of V and thus uses O*(2^n) space. ⊓⊔

Algorithm 1: Dynamic-Programming-Treewidth(G)
  Set TW(∅) = −∞.
  for i = 1 to n do
    for all sets S ⊆ V with |S| = i do
      Set TW(S) = min_{v∈S} max{TW(S − {v}), |Q(S − {v}, v)|}
    end for
  end for
  return TW(V)

For disjoint sets L, S ⊆ V, write TW_R(L,S) = min_{π∈Π(V)} max_{v∈S} |Q(L ∪ π_{<,v}, v)|; note that TW_R(∅,S) = TW(S). The recursive algorithm of this section is based on the following splitting lemma.

Lemma 7. Let G = (V,E) be a graph, let L, S ⊆ V be disjoint sets of vertices, S ≠ ∅, and let 1 ≤ k < |S|. Then TW_R(L,S) = min_{S′⊆S, |S′|=k} max{TW_R(L,S′), TW_R(L ∪ S′, S − S′)}.

Proof (of the direction TW_R(L,S) ≤ min ... ): Let π′ ∈ Π(V) be a permutation with TW_R(L,S′) = max_{v∈S′} |Q(L ∪ π′_{<,v}, v)|. Let π′′ ∈ Π(V) be a permutation with TW_R(L ∪ S′, S − S′) = max_{v∈S−S′} |Q(L ∪ S′ ∪ π′′_{<,v}, v)|. We now build a permutation π ∈ Π(V) in the following way. First, we take the elements of L, in some arbitrary order. Then, we take the elements of S′, in the order in which they appear in π′; i.e., for v,w ∈ S′, v has a smaller index than w in π if and only if v has a smaller index than w in π′. Then, we take the elements of S − S′, in the order in which they appear in π′′; i.e., for v,w ∈ S − S′, v has a smaller index than w in π if and only if v has a smaller index than w in π′′. Also, for all v ∈ S′, w ∈ S − S′, v has a smaller index than w in π. We end permutation π by taking the elements of V − S − L in some arbitrary order. For this ordering π, we have

TW_R(L,S) ≤ max_{v∈S} |Q(L ∪ π_{<,v}, v)|
= max{max_{v∈S′} |Q(L ∪ π_{<,v}, v)|, max_{v∈S−S′} |Q(L ∪ π_{<,v}, v)|}
≤ max{max_{v∈S′} |Q(L ∪ π′_{<,v}, v)|, max_{v∈S−S′} |Q(L ∪ S′ ∪ π′′_{<,v}, v)|}
= max{TW_R(L,S′), TW_R(L ∪ S′, S − S′)}.

This proves the result. ⊓⊔

By making use of Lemma 7 with k = ⌊|S|/2⌋, we obtain the following result.

Theorem 8. The treewidth of a graph on n vertices can be determined in O*(4^n) time and polynomial space.

Proof: Lemma 7 is used to obtain Algorithm 2, which computes TW_R(L,S) recursively. Algorithm 2 computes the treewidth of the graph G when called as Recursive-Treewidth(G, ∅, V): since tw(G) = TW_R(∅,V), this gives the answer to the problem. The algorithm clearly uses polynomial space: the recursion depth is O(log n), and per recursive step only polynomial space is used. To estimate the running time, suppose that Recursive-Treewidth(G,L,S) costs T(n,r) time, with n the number of vertices of G and r = |S|. All work, except the time for the recursive calls, is bounded by a polynomial p(n). As we make fewer than 2^{r+1} recursive calls, each with a set S′ with |S′| ≤ ⌈|S|/2⌉, we have

T(n,r) ≤ 2^{r+1} · T(n, ⌈r/2⌉) + p(n).   (1)

From this, it follows that there is a polynomial p′(n) such that

T(n,r) ≤ 4^r · p′(n).   (2)

As the algorithm is called with |S| = n, it uses O*(4^n) time. ⊓⊔

In Section 5, we report on an implementation of the O*(2^n) algorithm for Treewidth (with additional improvements to decrease the time on actual instances). So, while the O*(2^n) algorithm does not give a theoretical improvement, it can be seen to be of practical use. In Section 6, we improve upon the running time for the case of polynomial space.

Algorithm 2: Recursive-Treewidth(G, L, S)
  if |S| = 1 then
    Suppose S = {v}; return |Q(L,v)|
  end if
  Set Opt = ∞.
  for all sets S′ ⊆ S, |S′| = ⌊|S|/2⌋ do
    Compute v1 = Recursive-Treewidth(G, L, S′);
    Compute v2 = Recursive-Treewidth(G, L ∪ S′, S − S′);
    Set Opt = min{Opt, max{v1, v2}};
  end for
  return Opt
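For concreteness, here is a compact Python sketch of both algorithms (our own illustration under the definitions above, reusing the Q function from the earlier sketch; for n beyond roughly twenty the exponential-space version becomes impractical, consistent with the memory bottleneck discussed below):

from itertools import combinations

def treewidth_dp(adj):
    """O*(2^n) DP of Theorem 6: TW(S) for all subsets S, returns TW(V)."""
    V = list(adj)
    TW = {frozenset(): float("-inf")}
    for i in range(1, len(V) + 1):
        for combo in combinations(V, i):
            S = frozenset(combo)
            TW[S] = min(max(TW[S - {v}], Q(adj, set(S - {v}), v)) for v in S)
    return TW[frozenset(V)]

def recursive_treewidth(adj, L, S):
    """Polynomial-space recursion of Algorithm 2: computes TW_R(L, S)."""
    if len(S) == 1:
        (v,) = S
        return Q(adj, set(L), v)
    best = float("inf")
    for half in combinations(sorted(S), len(S) // 2):
        Sp = frozenset(half)
        v1 = recursive_treewidth(adj, L, Sp)
        v2 = recursive_treewidth(adj, L | Sp, S - Sp)
        best = min(best, max(v1, v2))
    return best

# Example: a 4-cycle has treewidth 2.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(treewidth_dp(cycle))                                        # -> 2
print(recursive_treewidth(cycle, frozenset(), frozenset(cycle)))  # -> 2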
Algorithm 3: Dynamic-Programming-Treewidth with upper bound pruning
  n = |V|. Compute some initial upper bound up on the treewidth of G (e.g., set up = n − 1).
  Let TW_0 be the set containing the pair (∅, −∞).
  for i = 1 to n do
    Set TW_i to be an empty set.
    for each pair (S,r) in TW_{i−1} do
      for each vertex x ∈ V − S do
        Compute q = |{w ∈ V − S − {x} | w ∼_S x}|.
        Set r′ = max{r, q}.
        if r′ < up then
          if there is a pair (S ∪ {x}, t) in TW_i for some t then
            Replace the pair (S ∪ {x}, t) in TW_i by (S ∪ {x}, min(t, r′)).
          else
            Insert the pair (S ∪ {x}, r′) in TW_i.
          end if
        end if
      end for
    end for
  end for
  if TW_n contains a pair (V,r) for some r then return r else return up end if

Here w ∼_S x denotes that there is a path from x to w in G[S ∪ {x,w}], so q = |Q(S,x)|.

Lemma 11. Let C ⊆ V induce a clique in graph G = (V,E). The treewidth of G equals max{TW(V − C), |C| − 1}.

Proof: Using the proof method of Proposition 3 and Proposition 10, we obtain that the treewidth of G is at most some non-negative integer k if and only if there is a linear ordering π ∈ Π(V,C) of G (i.e., with π ending with the vertices in C) such that for each v ∈ V, R_π(v) ≤ k. In a similar, slightly more complicated way as in the proof of Lemma 4, we have

tw(G) = min_{π∈Π(V,C)} max_{v∈V} R_π(v)
= min_{π∈Π(V,C)} max{max_{v∈V−C} R_π(v), max_{v∈C} R_π(v)}
= min_{π∈Π(V,C)} max{max_{v∈V−C} R_π(v), |C| − 1}
= max{|C| − 1, min_{π∈Π(V,C)} max_{v∈V−C} R_π(v)}
= max{|C| − 1, TW(V − C)}. ⊓⊔

By Lemma 11, we can restrict the sets S to subsets of V − C for some clique C, in particular for a maximum clique. Although it is NP-hard to compute a maximum clique in a graph, it can be computed extremely fast for the graphs considered here. In our program, a simple combinatorial branch-and-bound is used to compute all maximum cliques; it recursively extends a clique by each candidate vertex once.

The algorithm was implemented in C++, using the Boost graph library, as part of the Treewidth Optimization Library TOL, a package of algorithms for the treewidth of graphs. The package includes preprocessing, upper bound, and lower bound algorithms for treewidth. Experiments were carried out on a number of graphs taken from applications; several were used in other experiments. See [24] for the graphs used, information on the graphs, and other results of experiments to compute the treewidth. The experiments were carried out on a Sun computer with 4 AMD Dualcore Opteron 875, 2.2 GHz processors and at most 20 GB of internal memory available. The program did not use parallelism. In Table 1 the results of our experiments on a number of graphs are reported. Besides instance name, number of vertices, number of edges, and the computed treewidth, we report the CPU time in seconds and the maximum number of pairs (S,r) considered at once, max|TW| = max_{i=0,...,n} |TW_i|, in a number of cases. First, we report the CPU time and maximum number of sets for the case where no initial upper bound up is exploited.
Next, we report on the case where we use an initial upper bound, displayed in the column up. The last two columns report on the experiments in which the algorithm is aided by both an initial upper bound up and a maximum clique C of size ω. In several instances reported in [24], the best bound obtained from a few upper bound heuristics and the lower bound obtained by the LBP+(MMD+) heuristic match, and then we have obtained in a relatively fast way an exact bound on the treewidth of the instance graph. In other cases, these bounds do not match. Then, when the graph is not too large, the dynamic programming algorithm can be of good use. A nice example is the celar03 graph. This graph has 200 vertices and 721 edges. A combination of different preprocessing techniques yields an equivalent instance celar03-pp-001 which has 38 vertices and 238 edges. Existing upper bound heuristics gave a best upper bound of 15, while the lower bound of the LBP+(MMD+) heuristic was 13. With the dynamic programming algorithm with 15 as input for an upper bound, we obtained the exact treewidth of 14 for this graph, and hence also for celar03.

instance   |V|  |E|  tw  CPU    max|TW|   up  CPU      max|TW|   ω  CPU     max|TW|
myciel3    11   20   5   0.00   240       5   0.00     35        2  0.00    21
myciel4    23   71   10  7.64   296835    10  0.14     4422      2  0.12    4064
queen5-5   25   160  18  0.15   18220     18  0.02     944       5  0.02    392
queen6-6   36   290  25  36.43  2031716   26  1.16     18872     6  0.36    6994
queen7-7   49   476  35  --     --        37  1012.12  96517095  7  248.03  24410915

Table 1: Experimental results for some DIMACS vertex coloring graphs, some probabilistic networks, and the frequency assignment graph celar03-pp-001.

The algorithm can also be used as a lower bound heuristic: give the algorithm as 'upper bound' a conjectured lower bound ℓ. When it terminates, it either has found the exact treewidth, or we know that ℓ is indeed a lower bound for the treewidth of the input graph. In a few cases, we could thus increase the lower bound for the treewidth of considered instances; e.g., for the treewidth of the queen8-8 graph (the graph modeling the possible moves of a queen on an 8 by 8 chessboard), the lower bound could be improved from 27 to 35. For larger graphs, the above idea can be combined with an idea exploited earlier in various papers. Given a graph G and a minor G′ of G, tw(G′) ≤ tw(G). In [5,15,20], a lower bound on tw(G′) is computed to obtain a lower bound for G. With Algorithm 3, we can compute tw(G′) exactly to obtain a lower bound for tw(G). For the 1024-vertex graph pignet2-pp, we have generated a sequence of minors by repeatedly contracting a minimum degree vertex with a neighbor with the least number of common neighbors (see [5]). Figure 1 shows the treewidth (right y-scale) for the minors with 70 to 79 vertices; moreover, the maximum number of sets for three different upper bounds is reported (left y-scale, logarithmic). If the used upper bound is less than or equal to the treewidth, no feasible solution is found in the end. The best known lower bound for pignet2-pp is increased from 48 to 59 by the treewidth of the 79-vertex minor. Figure 1 also shows once more the impact of the upper bound on the memory consumption (and time consumption) of the algorithm.

Figure 1: Maximum number of subsets S during the algorithm for different upper bounds.
6 Improved polynomial space algorithms for treewidth

In this section, we give a faster exponential time algorithm with polynomial space for Treewidth. The algorithm is based on results of earlier sections, combined with techniques based upon balanced separators. We first derive a number of necessary lemmas.

Lemma 12. Suppose π is a linear ordering of G = (V,E) with tw(G) = max_{v∈V} R_π(v). Let 0 ≤ i < |V|, and let S = {v ∈ V | π(v) > i} be the set of the |V| − i highest numbered vertices. Then tw(G) = max{TW(V − S), TW_R(V − S, S)}.

Proof: Recall that TW_R(∅,S) = TW(S) for all S ⊆ V. By Lemma 7,

tw(G) = TW_R(∅,V) ≤ max{TW_R(∅, V − S), TW_R(V − S, S)}.

Clearly TW(V − S) ≤ TW(V) ≤ tw(G). Observing that for v ∈ S we have V − S ⊆ π_{<,v}, so V − S ∪ π_{<,v} = π_{<,v}, we get

TW_R(V − S, S) ≤ max_{v∈S} |Q(V − S ∪ π_{<,v}, v)| = max_{v∈S} R_π(v) ≤ tw(G).

Together, these inequalities give the claimed equality. ⊓⊔

Lemma 13. Let G = (V,E) be a graph. Let S ⊆ V be a set of vertices such that the treewidth of G equals the treewidth of the graph G′ = (V, E ∪ {{v,w} | v,w ∈ S, v ≠ w}) obtained from G by turning S into a clique. Then there is a linear ordering π ∈ Π(V,S) (i.e., π ends with the vertices in S) such that tw(G) = max_{v∈V} R_π(v).

Proof: Suppose H = (V,E_H) is a triangulation of G′ such that the maximum clique size of H equals tw(G′) + 1 = tw(G) + 1. H is also a triangulation of G, and S is a clique in H. By Lemma 10, there is a perfect elimination scheme π of H that ends with the vertices in S, i.e., with π ∈ Π(V,S). For this ordering π, we have that tw(G) = max_{v∈V} R_π(v), as we have for each v ∈ V that {v} ∪ Q(π_{<,v}, v) is a clique in H, and hence R_π(v) ≤ tw(G). ⊓⊔

The following lemma is a small variant of a folklore result. Its proof mostly follows the folklore proof.

Lemma 14. Let G = (V,E) be a graph with treewidth at most k. There is a set S ⊆ V with
• |S| = k + 1;
• each connected component of G[V − S] contains at most (|V| − k)/2 vertices;
• the graph G′ = (V, E ∪ {{v,w} | v,w ∈ S, v ≠ w}) obtained from G by turning S into a clique has treewidth at most k.

Proof: It is well known that if the treewidth of G is at most k, then G has a tree decomposition ({X_i | i ∈ I}, T = (I,F)) such that
• for all i ∈ I: |X_i| = k + 1;
• for all (i,j) ∈ F: |X_i − X_j| ≤ 1.
Take such a tree decomposition. Now, for each i ∈ I, consider the trees obtained when removing i from T. For each such tree, consider the union of the sets X_j − X_i with j in this tree. Each connected component W of G[V − X_i] has all its vertices in one such set, i.e., in one subtree of T − i. Suppose that for i ∈ I there is at least one component W of G[V − X_i] that contains more than (|V| − k)/2 vertices. Let i′ be the neighbor of i that belongs to the subtree that contains the vertices of W. Now, direct an arc from i to i′. In this way, each node in I has at most one outgoing arc.

Suppose first that there are two neighboring nodes i_1 and i_2, with i_1 having an arc to i_2 and i_2 having an arc to i_1. Let W_1 be the connected component of G[V − X_{i_1}] that contains more than (|V| − k)/2 vertices, and let W_2 be the connected component of G[V − X_{i_2}] that contains more than (|V| − k)/2 vertices.

Figure 2: Illustration to the proof of Lemma 14.

Now, W_1 and W_2 are disjoint sets. (See Figure 2. Note that if v ∈ W_1 ∩ W_2, then v must belong to a bag in the part of the tree marked with W_1 and to a bag in the part marked with W_2; but then also v ∈ X_{i_1}, a contradiction, as W_1 is a connected component of G[V − X_{i_1}].) Also, W_1 ∩ X_{i_1} = ∅. As W_2 ∩ X_{i_2} = ∅, we have W_2 ∩ X_{i_1} ⊆ X_{i_1} − X_{i_2}. Now, X_{i_1}, W_1, and W_2 − (X_{i_1} − X_{i_2}) are disjoint sets, which together contain at least (k+1) + ((|V|−k)/2 + 1) + ((|V|−k)/2 + 1 − 1) > |V| vertices, a contradiction.

Now, as there are no two neighboring nodes i_1 and i_2 with i_1 having an arc to i_2 and i_2 having an arc to i_1, there must be a node i_0 in T without outgoing arcs. (Start at any tree node and follow arcs; as the tree is finite and loopless, we end in a node
i_0 without outgoing arcs.) Now taking S = X_{i_0} gives the required set: ({X_i | i ∈ I}, T = (I,F)) is also a tree decomposition of G′, so G′ has treewidth at most k, and as i_0 has no outgoing arcs, each connected component of G[V − X_{i_0}] has at most (|V| − k)/2 vertices. ⊓⊔

Lemma 15. Let G = (V,E) be a graph with treewidth at most k. Let k + 1 ≤ r ≤ n. There is a set W ⊆ V with
• |W| = r;
• each connected component of G[V − W] contains at most (|V| − r + 1)/2 vertices;
• tw(G) = max{TW_R(∅, V − W), TW_R(V − W, W)}.

Proof: First, let S be the set implied by Lemma 14. Let π ∈ Π(V,S) be the linear ordering with tw(G) = max_{v∈V} R_π(v), see Lemma 13. If k + 1 = r, then we can take W = S, and we are done by Lemma 12. If k + 1 < r, then we construct W and an ordering π as follows. We start by setting W = S, and later add more vertices to W. Repeat the following steps until |W| = r: compute the connected components of G[V − W]. Suppose Z is the vertex set of a connected component of G[V − W] with the largest number of vertices. Let z ∈ Z be the vertex of Z with the largest index in π: π(z) = max_{v∈Z} π(v). Now, we do the following.
• Change the position of z in π as follows: move z to the first position before the elements of W, i.e., set π(z) = |V| − |W|; all other elements keep their relative position. Note that the sets Q(π_{<,v}, v) do not change, for all v ∈ V. So, for the new ordering π, we still have that tw(G) = max_{v∈V} R_π(v).
• Add z to W.
Note that we still have that π ends with the vertices in W. This procedure keeps as invariants that π ∈ Π(V,W), i.e., π ends with W, that tw(G) = max_{v∈V} R_π(v), and that each connected component of G[V − W] contains at most (|V| − |W| + 1)/2 vertices. (This can be seen as follows. The component that contained z became one smaller, while the term (|V| − |W| + 1)/2 decreases by 1/2. All but the largest component of G[V − W] contain at most (|V| − |W|)/2 vertices, which means that they are still of sufficiently small size when |W| increases by one.) By Lemma 12, the third condition holds for W; thus the set W obtained by the procedure fulfills the conditions stated in the lemma. ⊓⊔

For a graph G = (V,E) and a set W ⊆ V, let G+[W] be the fill-in graph obtained by eliminating the vertices of V − W, i.e., G+[W] = (W,F), with, for all v,w ∈ W, v ≠ w: {v,w} ∈ F if and only if there is a path from v to w in G that uses only vertices of V − W as internal vertices. The next lemma formalizes the intuition behind TW_R(V − W, W): when computing TW_R(V − W, W), we look for the best ordering of the vertices of W after all vertices of V − W have been eliminated, i.e., in the graph G+[W]. We also give a formal proof.

Lemma 16. Let G = (V,E) be a graph and W ⊆ V a set of vertices. Then tw(G+[W]) = TW_R(V − W, W).

Proof: Consider a linear ordering π ∈ Π(V) of the vertices of V. Let π′ be the linear ordering of the vertices of W obtained by restricting π to W, i.e., for v,w ∈ W: π(v) < π(w) ⇔ π′(v) < π′(w). Now, for all v ∈ W,

Q_{G+[W]}(π′_{<,v}, v) = Q_G(V − W ∪ π_{<,v}, v).
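The fill-in graph G+[W] is again just a connectivity computation. A small Python sketch of it (our own illustration under the definition above, not code from the paper):

from collections import deque

def fill_in_graph(adj, W):
    """G+[W]: v,w in W are adjacent iff some v-w path in G has all
    internal vertices outside W. adj: dict vertex -> set of neighbors."""
    W = set(W)
    H = {v: set() for v in W}
    for v in W:
        # BFS from v through V - W only; any W-vertex adjacent to the
        # explored region is a neighbor of v in the fill-in graph.
        seen, queue = {v}, deque([v])
        while queue:
            u = queue.popleft()
            for x in adj[u]:
                if x in W and x != v:
                    H[v].add(x)
                elif x not in seen:
                    seen.add(x)
                    queue.append(x)
    return H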
The principles behind eofs.standard
EOFs (Empirical Orthogonal Functions) are a mathematical technique used to analyze and decompose complex datasets into a set of orthogonal basis functions. These basis functions, also known as eigenfunctions, capture the dominant patterns of variability in the data. EOF analysis is widely applied in various fields, including meteorology, oceanography, and climate science, to study and understand the spatio-temporal patterns present in large datasets.

From a mathematical perspective, EOFs are derived from the Singular Value Decomposition (SVD) of a data matrix. The SVD breaks down the data matrix into three components: the left singular vectors (spatial patterns), the singular values (variance explained), and the right singular vectors (temporal coefficients). The left singular vectors form the EOFs, which represent the spatial patterns in the data. These patterns are ordered by their contribution to the total variance explained by the dataset.

EOF analysis provides valuable insights into the underlying processes and dynamics of the system under study. By decomposing the data into orthogonal basis functions, it allows researchers to identify the dominant modes of variability and their spatial distribution. This information is crucial for understanding the physical mechanisms driving the observed patterns and for making predictions about future behavior.

One of the key advantages of EOF analysis is its ability to reduce the dimensionality of complex datasets without losing important information. By retaining only the leading EOFs, which capture the most significant patterns of variability, researchers can represent the data in a lower-dimensional space. This simplification facilitates data interpretation and visualization, making it easier to identify coherent structures and relationships within the dataset.

Furthermore, EOF analysis can be used to identify and remove noise or unwanted variability from the data. By excluding the EOFs associated with noise or measurement errors, researchers can focus on the meaningful signals present in the dataset. This noise reduction step improves the signal-to-noise ratio and enhances the reliability of subsequent analyses or modeling efforts.

From a practical perspective, EOF analysis is implemented through various computational algorithms. These typically involve matrix manipulations and eigenvalue decompositions, which can be computationally intensive for large datasets. However, advances in computational resources and algorithms have made it feasible to apply EOF analysis to increasingly large and complex datasets.

In conclusion, EOF analysis is a powerful mathematical technique for analyzing and decomposing complex datasets. It enables researchers to identify dominant patterns of variability, understand underlying processes, reduce dimensionality, and remove noise. By providing valuable insights into the spatio-temporal patterns present in the data, EOF analysis plays a crucial role in advancing our understanding of various scientific disciplines.
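As an illustration of the SVD route described above, here is a minimal numpy sketch (our own example, not code from the eofs package; the synthetic array and variable names are invented) that computes EOFs, principal-component time series, and explained-variance fractions from a (time × space) anomaly matrix:

import numpy as np

# Synthetic example: 100 time steps over a 50-point spatial grid.
rng = np.random.default_rng(0)
data = rng.standard_normal((100, 50))

# Work with anomalies: remove the time mean at each grid point.
anom = data - data.mean(axis=0)

# SVD: rows of vt are the EOFs (spatial patterns), u * s gives the
# principal component time series, s**2 is proportional to variance.
u, s, vt = np.linalg.svd(anom, full_matrices=False)

eofs = vt                       # EOF k is vt[k]
pcs = u * s                     # PC time series, one column per mode
var_frac = s**2 / np.sum(s**2)  # fraction of total variance per mode

print(var_frac[:3])             # leading modes explain the most variance

The eofs.standard module exposes essentially this computation, with additional support for weighting and missing values, behind its solver interface.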
Outlier Detection Techniques
Ludwig-Maximilians-Universität München, Institute for Informatics, Database Systems Group
16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Outlier Detection Techniques
Hans-Peter Kriegel, Peer Kröger, Arthur Zimek
Ludwig-Maximilians-Universität München, Munich, Germany
http://www.dbs.ifi.lmu.de
{kriegel,kroegerp,zimek}@dbs.ifi.lmu.de

General Issues
1. Please feel free to ask questions at any time during the presentation.
2. Aim of the tutorial: get the big picture
   – NOT in terms of a long list of methods and algorithms
   – BUT in terms of the basic approaches to modeling outliers
   Sample algorithms for these basic approaches will be sketched.
   • The selection of the presented algorithms is somewhat arbitrary
   • Please don't mind if your favorite algorithm is missing
   • Anyway, you should be able to classify any other algorithm not covered here by means of which of the basic approaches it implements
3. The revised version of the tutorial notes will soon be available on our websites.

What is an outlier?
Definition of Hawkins [Hawkins 1980]: "An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism."
Statistics-based intuition:
– Normal data objects follow a "generating mechanism", e.g. some given statistical process
– Abnormal objects deviate from this generating mechanism

• Example: Hadlum vs. Hadlum (1949) [Barnett 1978]
  – The birth of a child to Mrs. Hadlum happened 349 days after Mr. Hadlum left for military service.
  – The average human gestation period is 280 days (40 weeks).
  – Statistically, 349 days is an outlier.
  – blue: statistical basis (13634 observations of gestation periods)
  – green: assumed underlying Gaussian process
    • Very low probability for the birth of Mrs. Hadlum's child being generated by this process
  – red: assumption of Mr. Hadlum (another Gaussian process responsible for the observed birth, where the gestation period starts later)
    • Under this assumption the gestation period has an average duration, and the specific birthday has the highest possible probability

• Sample applications of outlier detection
  – Fraud detection
    • The purchasing behavior of a credit card owner usually changes when the card is stolen
    • Abnormal buying patterns can characterize credit card abuse
  – Medicine
    • Unusual symptoms or test results may indicate potential health problems of a patient
    • Whether a particular test result is abnormal may depend on other characteristics of the patient (e.g. gender, age, …)
  – Public health
    • The occurrence of a particular disease, e.g.
tetanus, scattered across various hospitals of a city may indicate problems with the corresponding vaccination program in that city
    • Whether an occurrence is abnormal depends on different aspects like frequency, spatial correlation, etc.

• Sample applications of outlier detection (cont.)
  – Sports statistics
    • In many sports, various parameters are recorded for players in order to evaluate the players' performances
    • Outstanding (in a positive as well as a negative sense) players may be identified as having abnormal parameter values
    • Sometimes, players show abnormal values only on a subset or a special combination of the recorded parameters
  – Detecting measurement errors
    • Data derived from sensors (e.g. in a given scientific experiment) may contain measurement errors
    • Abnormal values could provide an indication of a measurement error
    • Removing such errors can be important in other data mining and data analysis tasks
    • "One person's noise could be another person's signal."
  – …

• Discussion of the basic intuition based on Hawkins
  – Data is usually multivariate, i.e., multi-dimensional
    => the basic model is univariate, i.e., 1-dimensional
  – There is usually more than one generating mechanism/statistical process underlying the "normal" data
    => the basic model assumes only one "normal" generating mechanism
  – Anomalies may represent a different class (generating mechanism) of objects, so there may be a large class of similar objects that are the outliers
    => the basic model assumes that outliers are rare observations

• Consequences:
  – A lot of models and approaches have evolved in the past years in order to exceed these assumptions
  – It is not easy to keep track of this evolution
  – New models often involve typical, sometimes new, though usually hidden assumptions and restrictions

• General application scenarios
  – Supervised scenario
    • In some applications, training data with normal and abnormal data objects are provided
    • There may be multiple normal and/or abnormal classes
    • Often, the classification problem is highly imbalanced
  – Semi-supervised scenario
    • In some applications, only training data for the normal class(es) (or only the abnormal class(es)) are provided
  – Unsupervised scenario
    • In most applications there are no training data available
  In this tutorial, we focus on the unsupervised scenario.

• Are outliers just a side product of some clustering algorithms?
  – Many clustering algorithms do not assign all points to clusters but account for noise objects
  – Look for outliers by applying one of those algorithms and retrieve the noise set
  – Problem:
    • Clustering algorithms are optimized to find clusters rather than outliers
    • The accuracy of outlier detection depends on how well the clustering algorithm captures the structure of clusters
    • A set of many abnormal data objects that are similar to each other would be recognized as a cluster rather than as noise/outliers

• We will focus on three different classification approaches
  – Global versus local outlier detection
    Considers the set of reference objects relative to which each point's "outlierness" is judged
  – Labeling
versus scoring outliers
    Considers the output of an algorithm
  – Modeling properties
    Considers the concepts based on which "outlierness" is modeled
  NOTE: we focus on models and methods for Euclidean data, but many of these can also be used for other data types (because they only require a distance measure)

• Global versus local approaches
  – Considers the resolution of the reference set w.r.t. which the "outlierness" of a particular data object is determined
  – Global approaches
    • The reference set contains all other data objects
    • Basic assumption: there is only one normal mechanism
    • Basic problem: other outliers are also in the reference set and may falsify the results
  – Local approaches
    • The reference set contains a (small) subset of data objects
    • No assumption on the number of normal mechanisms
    • Basic problem: how to choose a proper reference set
  – NOTE: Some approaches are somewhat in between
    • The resolution of the reference set is varied, e.g., from only a single object (local) to the entire database (global), automatically or by a user-defined input parameter

• Labeling versus scoring
  – Considers the output of an outlier detection algorithm
  – Labeling approaches
    • Binary output
    • Data objects are labeled either as normal or outlier
  – Scoring approaches
    • Continuous output
    • For each object an outlier score is computed (e.g. the probability of being an outlier)
    • Data objects can be sorted according to their scores
  – Notes
    • Many scoring approaches focus on determining the top-n outliers (parameter n is usually given by the user)
    • Scoring approaches can usually also produce binary output if necessary (e.g.
by defining a suitable threshold on the scoring values)

• Approaches classified by the properties of the underlying modeling approach
  – Model-based approaches
    • Rationale
      – Apply a model to represent normal data points
      – Outliers are points that do not fit to that model
    • Sample approaches
      – Probabilistic tests based on statistical models
      – Depth-based approaches
      – Deviation-based approaches
      – Some subspace outlier detection approaches
  – Proximity-based approaches
    • Rationale
      – Examine the spatial proximity of each object in the data space
      – If the proximity of an object considerably deviates from the proximity of other objects, it is considered an outlier
    • Sample approaches
      – Distance-based approaches
      – Density-based approaches
      – Some subspace outlier detection approaches
  – Angle-based approaches
    • Rationale
      – Examine the spectrum of pairwise angles between a given point and all other points
      – Outliers are points that have a spectrum featuring high fluctuation

Outline
1. Introduction
2. Statistical Tests (model-based)
3. Depth-based Approaches (model-based)
4. Deviation-based Approaches (model-based)
5. Distance-based Approaches (proximity-based)
6. Density-based Approaches (proximity-based)
7. High-dimensional Approaches (adaptation of different models to a special problem)
8. Summary

Statistical Tests
• General idea
  – Given a certain kind of statistical distribution (e.g., Gaussian)
  – Compute the parameters assuming all data points have been generated by such a statistical distribution (e.g., mean and standard deviation)
  – Outliers are points that have a low probability of being generated by the overall distribution (e.g., deviate more than 3 times the standard deviation from the mean)
  – See e.g. Barnett's discussion of Hadlum vs. Hadlum
• Basic assumption
  – Normal data objects follow a (known) distribution and occur in a high-probability region of this model
  – Outliers deviate strongly from this distribution

• A huge number of different tests are available, differing in
  – Type of data distribution (e.g. Gaussian)
  – Number of variables, i.e., dimensions of the data objects (univariate/multivariate)
  – Number of distributions (mixture models)
  – Parametric versus non-parametric (e.g. histogram-based)
• Example on the following slides
  – Gaussian distribution, multivariate, 1 model, parametric

• Probability density function of a multivariate normal distribution:
  N(x | μ, Σ) = 1/((2π)^{d/2} √det(Σ)) · exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)),
  where μ is the mean of the data, Σ is the covariance matrix, and the exponent (x − μ)ᵀ Σ⁻¹ (x − μ) is the squared Mahalanobis distance (MDist) of point x from μ.
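A minimal numpy/scipy sketch of this parametric test (our own illustration; the χ²-quantile cutoff is one common choice, not one prescribed by the tutorial):

import numpy as np
from scipy.stats import chi2

def mahalanobis_outliers(X, alpha=0.975):
    """Label points whose squared Mahalanobis distance exceeds the
    chi-square quantile with d degrees of freedom (Gaussian model)."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    inv = np.linalg.inv(cov)
    diff = X - mu
    # Squared Mahalanobis distance of every row at once.
    md2 = np.einsum('ij,jk,ik->i', diff, inv, diff)
    return md2 > chi2.ppf(alpha, df=X.shape[1])

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(200, 2)), [[6.0, 6.0]]])  # one planted outlier
print(np.where(mahalanobis_outliers(X))[0])  # likely flags index 200

Note that μ and Σ estimated this way are themselves influenced by the outliers; robust estimators such as the Minimum Covariance Determinant, discussed next, address exactly this.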
• Visualization (2D) [Tan et al. 2006] (figure omitted)

• Problems
  – Curse of dimensionality
    • The larger the degree of freedom, the more similar the MDist values for all points
    (figure omitted; x-axis: observed MDist values, y-axis: frequency of observation)
  – Robustness
    • Mean and standard deviation are very sensitive to outliers
    • These values are computed for the complete data set (including potential outliers)
    • The MDist is used to determine outliers, although the MDist values are influenced by these outliers
    => the Minimum Covariance Determinant [Rousseeuw and Leroy 1987] minimizes the influence of outliers on the Mahalanobis distance
• Discussion
  – Data distribution is fixed
  – Low flexibility (no mixture model)
  – Global method
  – Outputs a label but can also output a score

Depth-based Approaches
• General idea
  – Search for outliers at the border of the data space, independent of statistical distributions
  – Organize data objects in convex hull layers
  – Outliers are objects on outer layers (picture taken from [Johnson et al. 1998])
• Basic assumption
  – Outliers are located at the border of the data space
  – Normal objects are in the center of the data space

• Model [Tukey 1977]
  – Points on the convex hull of the full data space have depth = 1
  – Points on the convex hull of the data set after removing all points with depth = 1 have depth = 2
  – …
  – Points having a depth ≤ k are reported as outliers (picture taken from [Preparata and Shamos 1988])

• Sample algorithms
  – ISODEPTH [Ruts and Rousseeuw 1996]
  – FDC [Johnson et al. 1998]
• Discussion
  – Similar idea to classical statistical approaches (k = 1 distributions) but independent of the chosen kind of distribution
  – Convex hull computation is usually only efficient in 2D/3D spaces
  – Originally outputs a label but can be extended for scoring (e.g. take the depth as scoring value)
  – Uses a global reference set for outlier detection
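Hull peeling is easy to prototype in 2D with scipy (our own sketch; scipy.spatial.ConvexHull is the assumed helper, and the handling of tiny remainders is an arbitrary choice):

import numpy as np
from scipy.spatial import ConvexHull

def depth_outliers(points, k=1):
    """Tukey-style hull peeling: report points with depth <= k."""
    pts = np.asarray(points, dtype=float)
    idx = np.arange(len(pts))
    outliers = []
    for depth in range(1, k + 1):
        if len(idx) < 4:              # too few points left for a 2D hull
            outliers.extend(idx)
            break
        hull = ConvexHull(pts[idx])
        outliers.extend(idx[hull.vertices])   # current hull layer
        idx = np.delete(idx, hull.vertices)   # peel it off
    return sorted(outliers)

rng = np.random.default_rng(2)
data = rng.normal(size=(50, 2))
print(depth_outliers(data, k=1))      # indices on the outermost hull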
Deviation-based Approaches
• General idea
  – Given a set of data points (local group or global set)
  – Outliers are points that do not fit the general characteristics of that set, i.e., the variance of the set is minimized when removing the outliers
• Basic assumption
  – Outliers are the outermost points of the data set

• Model [Arning et al. 1996]
  – Given a smoothing factor SF(I) that computes, for each I ⊆ DB, how much the variance of DB is decreased when I is removed from DB
  – If two sets have an equal SF value, take the smaller set
  – The outliers are the elements of the exception set E ⊆ DB for which the following holds: SF(E) ≥ SF(I) for all I ⊆ DB
• Discussion:
  – Similar idea to classical statistical approaches (k = 1 distributions) but independent of the chosen kind of distribution
  – The naïve solution is in O(2^n) for n data objects
  – Heuristics like random sampling or best-first search are applied
  – Applicable to any data type (depends on the definition of SF)
  – Originally designed as a global method
  – Outputs a labeling

Distance-based Approaches
• General idea
  – Judge a point based on the distance(s) to its neighbors
  – Several variants have been proposed
• Basic assumption
  – Normal data objects have a dense neighborhood
  – Outliers are far apart from their neighbors, i.e., have a less dense neighborhood

• DB(ε,π)-outliers
  – Basic model [Knorr and Ng 1998]: a point p is a DB(ε,π)-outlier if at most a fraction π of all points lie within distance ε of p (illustration figure omitted)
  – Algorithms
    • Index-based [Knorr and Ng 1998]
      – Compute a distance range join using a spatial index structure
      – Exclude a point from further consideration if its ε-neighborhood contains more than Card(DB) · π points
    • Nested-loop based [Knorr and Ng 1998]
      – Divide the buffer in two parts
      – Use the second part to scan/compare all points with the points from the first part
    • Grid-based [Knorr and Ng 1998]
      – Build a grid such that any two points from the same grid cell have a distance of at most ε to each other
      – Points need only be compared with points from neighboring cells
  – Deriving intensional knowledge [Knorr and Ng 1999]
    • Relies on the DB(ε,π)-outlier model
    • Find the minimal subset(s) of attributes that explain the "outlierness" of a point, i.e., in which the point is still an outlier
    (example figure omitted: identified outliers and derived intensional knowledge)

• Outlier scoring based on kNN distances
  – General models (a sketch of both follows below)
    • Take the kNN distance of a point as its outlier score [Ramaswamy et al. 2000]
    • Aggregate the distances of a point to all its 1NN, 2NN, …, kNN as an outlier score [Angiulli and Pizzuti 2002]
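A compact numpy sketch of the two kNN-based scores and the DB(ε,π) labeling (our own illustration; a production version would use a spatial index instead of the full distance matrix):

import numpy as np

def knn_outlier_scores(X, k=5):
    """Per-point scores: (kNN distance [Ramaswamy et al.],
    aggregated 1..kNN distances [Angiulli and Pizzuti])."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)          # a point is not its own neighbor
    knn = np.sort(D, axis=1)[:, :k]      # distances to the k nearest neighbors
    return knn[:, -1], knn.sum(axis=1)

def db_outliers(X, eps, pi):
    """DB(eps, pi) labeling: outlier iff at most a fraction pi of the
    data lies within distance eps."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    return (D < eps).sum(axis=1) <= pi * len(X)

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(size=(100, 2)), [[8.0, 8.0]]])
kth, agg = knn_outlier_scores(X, k=5)
print(np.argmax(kth), db_outliers(X, eps=1.0, pi=0.05)[-1])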
  – Algorithms
    • General approaches
      – Nested loop
        » Naïve approach: for each object, compute its kNNs with a sequential scan
        » Enhancement: use index structures for kNN queries
      – Partition-based
        » Partition the data into micro-clusters
        » Aggregate information for each partition (e.g. minimum bounding rectangles)
        » Allows pruning micro-clusters that cannot qualify when searching for the kNNs of a particular point

  – Sample algorithms (computing top-n outliers)
    • Nested loop [Ramaswamy et al. 2000]
      – Simple NL algorithm with index support for kNN queries
      – Partition-based algorithm (based on a clustering algorithm that has linear time complexity)
      – Algorithm for the simple kNN-distance model
    • Linearization [Angiulli and Pizzuti 2002]
      – Linearization of a multi-dimensional data set using space-filling curves
      – The 1D representation is partitioned into micro-clusters
      – Algorithm for the average kNN-distance model
    • ORCA [Bay and Schwabacher 2003]
      – NL algorithm with randomization and simple pruning
      – Pruning: if a point has a score greater than the top-n outlier so far (cut-off), remove this point from further consideration
        => non-outliers are pruned
        => works well on randomized data (can be done in linear time)
        => worst case: naïve NL algorithm
      – Algorithm for both kNN-distance models and the DB(ε,π)-outlier model
2006]–ModelOutlineDATABASESYSTEMSGROUP1.Introduction√2.Statistical Tests√3.Depth-based Approaches√4.Deviation-based Approaches √5.Distance based Approaches √5Distance-based Approaches6.Density-based Approaches7High dimensional Approaches7.High-dimensional Approaches8.SummaryGROUP•General idea–Compare the density around a point with the density around its local neighborsThe relative density of a point compared to its neighbors is computed –The relative density of a point compared to its neighbors is computed as an outlier scoreApproaches essentially differ in how to estimate density–Approaches essentially differ in how to estimate density•Basic assumptionBasic assumption–The density around a normal data object is similar to the density around its neighborsg–The density around an outlier is considerably different to the density around its neighborsGROUP•Local Outlier Factor (LOF) [Breunig et al. 1999], [Breunig et al. 2000]–Motivation:C1qC2o2o1GROUP –ModelReachability distance•Reachability distance –Introduces a smoothing factordistance max{o dist o k o dist reach −=−•Local reachability distance (lrd) of point pInverse of the average reach dists of the NNs of )},(),({),(p p k –Inverse of the average reach-dists of the k NNs of pGROUP –PropertiesLOF 1:point is in a cluster•LOF ≈1: point is in a cluster (region with homogeneousdensity around the point andy p its neighbors)Data set •LOF >> 1: point is an outlier a a se LOFs (MinPts = 40)–Discussion•Choice of in the original paper)specifies the reference setChoice of k (MinPts in the original paper) specifies the reference set •Originally implements a local approach (resolution depends on the user’schoice for k )•Outputs a scoring (assigns an LOF value to each point)GROUP•Variants of LOF–Mining top-n local outliers [Jin et al. 2001]•Idea:–Usually a user is only interested in the top-Usually, a user is only interested in the top-n outliers–Do not compute the LOF for all data objects => save runtime•Method–Compress data points into micro clusters using the CFs of BIRCH [Zhang et al.1996]–Derive upper and lower bounds of the reachability distances, lrd-values, and LOF-values for points within a micro clusters–Compute upper and lower bounds of LOF values for micro clusters and sort results w.r.t. ascending lower bound–Prune micro clusters that cannot accommodate points among the top-noutliers (n highest LOF values)Iteratively refine remaining micro clusters and prune points accordingly –Iteratively refine remaining micro clusters and prune points accordinglyGROUP •Variants of LOF (cont.)–Connectivity-based outlier factor (COF) [Tang et al. 2002]•Motivation–In regions of low density it may be hard to detect outliersIn regions of low density, it may be hard to detect outliers –Choose a low value for k is often not appropriate •Solution–Treat “low density” and “isolation” differently•ExampleData set LOF COFGROUP•Influenced Outlierness (INFLO) [Jin et al. 
• Variants of LOF
  – Mining top-n local outliers [Jin et al. 2001]
    • Idea:
      – Usually, a user is only interested in the top-n outliers
      – Do not compute the LOF for all data objects => save runtime
    • Method
      – Compress data points into micro-clusters using the clustering features (CFs) of BIRCH [Zhang et al. 1996]
      – Derive upper and lower bounds of the reachability distances, lrd-values, and LOF-values for points within a micro-cluster
      – Compute upper and lower bounds of LOF values for micro-clusters and sort the results w.r.t. ascending lower bound
      – Prune micro-clusters that cannot accommodate points among the top-n outliers (the n highest LOF values)
      – Iteratively refine the remaining micro-clusters and prune points accordingly
  – Connectivity-based outlier factor (COF) [Tang et al. 2002]
    • Motivation
      – In regions of low density, it may be hard to detect outliers
      – Choosing a low value for k is often not appropriate
    • Solution
      – Treat "low density" and "isolation" differently
    (example figure omitted: data set, LOF vs. COF)
  – Influenced outlierness (INFLO) [Jin et al. 2006]
    • Motivation
      – If clusters of different densities are not clearly separated, LOF will have problems: point p will have a higher LOF than points q or r, which is counter-intuitive (figure omitted)
    • Idea
      – Take a symmetric neighborhood relationship into account
      – The influence space (kIS(p)) of a point p includes its kNNs (kNN(p)) and its reverse kNNs (RkNN(p)): kIS(p) = kNN(p) ∪ RkNN(p)
    • Model
      – Density is simply measured by the inverse of the kNN distance, i.e., den(p) = 1/k-distance(p)
      – The INFLO score of p relates den(p) to the average density of the objects in its influence space kIS(p)
    • Properties
      – INFLO ≈ 1: the point is in a cluster
      – INFLO >> 1: the point is an outlier
    • Discussion
      – Outputs an outlier score
      – Originally proposed as a local approach (the resolution of the reference set kIS can be adjusted by the user via the parameter k)
Exact solutions for a mean-field Abelian sandpile
where δ_ij is the Kronecker delta.
Because of the highly symmetric nature of the toppling matrix in our model, it is straightforward to calculate the determinant and determine that the number of recurrent configurations is (N+1)^{N−1}.
† Supported by an NSF Postdoctoral Research Fellowship, DMS 90-07206. ‡ Address after August 1993: Department of Mathematics, University of Texas, Austin, TX 78712
June 1993
Abstract
We introduce a model for a sandpile, with N sites, critical height N, and each site connected to every other site. It is thus a mean-field model in the spin-glass sense. We find an exact solution for the steady state probability distribution of avalanche sizes, and discuss its asymptotics for large N.
Supported in part by NSF Grant DMR89-18903
A sandpile model is basically a set of dynamical rules describing the way that grains of sand are added to a system, the conditions under which those grains can be redistributed inside the system, and the way they are removed from the system. Here we consider a system of N sites and define h(i) as the (integer) height of the column of sand at site i, i ∈ {1, …, N}. We drop a grain of sand on a site i chosen at random, thereby increasing its height by one: h(i) → h(i)+1. If this new height exceeds the maximum stable value h̄, then that column topples and gives 1 grain of sand to each of the N − 1 other sites, while one grain drops out of the system. (We take h̄ ≥ N so that h(i) ≥ 0; in fact we are primarily interested in h̄ = N.) We then examine the system to see if any site has a column exceeding h̄, in which case we topple that column also. We keep toppling until all the sites are stable (this characterizes an avalanche). We then repeat the procedure of adding a grain at a randomly chosen site.
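These rules are easy to simulate directly. Below is a small Python sketch of the dynamics just described (our own illustration, with h̄ = N as in the case the paper focuses on; recording the number of topplings per added grain is one common convention for the avalanche size, and may differ from the paper's precise observable):

import random

def mean_field_sandpile(N, steps, seed=0):
    """Simulate the mean-field sandpile with critical height h_bar = N:
    a toppling site sheds N grains, one to each other site, one out."""
    rng = random.Random(seed)
    h_bar = N
    h = [0] * N
    sizes = []                                # topplings per added grain
    for _ in range(steps):
        h[rng.randrange(N)] += 1              # drop a grain at random
        topplings = 0
        while True:
            unstable = [i for i in range(N) if h[i] > h_bar]
            if not unstable:
                break                         # stable: avalanche over
            i = unstable[0]
            h[i] -= N                         # N-1 grains to others, 1 leaves
            topplings += 1
            for j in range(N):
                if j != i:
                    h[j] += 1
        sizes.append(topplings)
    return sizes

print(mean_field_sandpile(N=10, steps=2000)[-10:])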
Resolution for Max-SAT
Artificial Intelligence 171 (2007) 606–618

Resolution for Max-SAT ✩

María Luisa Bonet (a), Jordi Levy (b,*), Felip Manyà (b)
(a) Dept. Llenguatges i Sistemes Informàtics (LSI), Universitat Politècnica de Catalunya (UPC), Jordi Girona 1-3, 08034 Barcelona, Spain
(b) Artificial Intelligence Research Institute (IIIA), Spanish Scientific Research Council (CSIC), Campus UAB, 08193 Bellaterra, Spain
Received 1 September 2006; received in revised form 26 February 2007; accepted 1 March 2007. Available online 12 March 2007.

Abstract

Max-SAT is the problem of finding an assignment minimizing the number of unsatisfied clauses in a CNF formula. We propose a resolution-like calculus for Max-SAT and prove its soundness and completeness. We also prove the completeness of some refinements of this calculus. From the completeness proof we derive an exact algorithm for Max-SAT and a time upper bound. We also define a weighted Max-SAT resolution-like rule, and show how to adapt the soundness and completeness proofs of the Max-SAT rule to the weighted Max-SAT rule. Finally, we give several particular Max-SAT problems that require an exponential number of steps of our Max-SAT rule to obtain the minimal number of unsatisfied clauses of the combinatorial principle. These results are based on the corresponding resolution lower bounds for those particular problems.

Keywords: Satisfiability; Resolution; Completeness; Saturation; Max-SAT; Weighted Max-SAT

1. Introduction

The Max-SAT problem for a CNF formula φ is the problem of finding an assignment of values to variables that minimizes the number of unsatisfied clauses in φ. Max-SAT is an optimization version of SAT which is NP-hard (see [25]). Competitive exact Max-SAT solvers—such as the ones developed by [2–4,17,22,23,30,32–34]—implement variants of the following branch and bound (BnB) schema: Given a CNF formula φ, BnB explores the search tree that represents the space of all possible assignments for φ in a depth-first manner. At every node, BnB compares the upper bound (UB), which is the best solution found so far for a complete assignment, with the lower bound (LB), which is the sum of the number of clauses unsatisfied by the current partial assignment plus an underestimation of the number of

✩ This research has been partially funded by the CICYT research projects iDEAS (TIN2004-04343) and Mulog (TIN2004-07933-C03-01/03). The first author also wants to thank the Isaac Newton Institute for Mathematical Sciences for hosting her while some of the ideas of this paper were thought out and presented.
* Corresponding author.
E-mail addresses: bonet@ (M.L. Bonet), levy@iiia.csic.es (J. Levy), felip@iiia.csic.es (F. Manyà).
URLs: /~bonet (M.L. Bonet), http://www.iiia.csic.es/~levy (J. Levy).
clauses that will become unsatisfied if the current partial assignment is completed. If LB ≥ UB, the algorithm prunes the subtree below the current node and backtracks to a higher level in the search tree. If LB < UB, the algorithm tries to find a better solution by extending the current partial assignment by instantiating one more variable. The solution to Max-SAT is the value that UB takes after exploring the entire search tree.

The amount of inference performed by BnB at each node of the proof tree is poor compared with the inference performed in DPLL-style SAT solvers. The inference rules that one can apply in Max-SAT have to transform the current instance φ into another instance φ′ in such a way that φ and φ′ have the same number of unsatisfied clauses for every possible assignment; in other words, the inference rules have to be sound. It is not enough to preserve satisfiability as in SAT. Unfortunately, unit propagation, which is the most powerful inference technique applied in SAT, is unsound for Max-SAT,¹ and many Max-SAT solvers apply rules which are far from being as powerful as unit propagation in SAT. A basic BnB algorithm, when it branches on literal l, enforces the following inference: it removes the clauses containing l and deletes the occurrences of l̄, but the new unit clauses derived as a consequence of deleting the occurrences of l̄ are not propagated as in unit propagation. Typically, that inference is enhanced by applying simple inference rules such as (i) the pure literal rule [13]; (ii) the dominating unit clause rule [24]; (iii) the almost common clause rule [8]; and (iv) the complementary unit clause rule [24]. All these rules, which are sound but not complete, have proved to be useful in a number of solvers [2,4,13,30,33]. A recent trend, which we believe will remain in future Max-SAT solvers, is to design solvers that incorporate resolution-like inference rules that can be applied efficiently at every node of the proof tree. This is the case of MaxSatz,² the best performing Max-SAT solver of the SAT-2006 Max-SAT Evaluation.³ For example, one of the derived resolution rules that MaxSatz implements is the star rule:

x,  y,  x̄ ∨ ȳ
—————————
□,  x ∨ y

where □ is the empty clause, and where we have added the truth table of the rule to verify its soundness (for every assignment, the number of falsified premises equals the number of falsified conclusions):

x  y  x̄∨ȳ  □  x∨y
0  0  1     0  0
0  1  1     0  1
1  0  1     0  1
1  1  0     0  1

Max-SAT inference rules like the star rule replace the premises of the rule by its conclusions, instead of adding the conclusions to the premises, which might increase the number of clauses unsatisfied by some assignment. The star rule preserves the number of unsatisfied clauses by replacing x, y, x̄ ∨ ȳ with □, x ∨ y. Because these rules substitute one set of clauses by another, in some articles they are called transformation rules (see [24]) instead of resolution rules. See also [20] for other examples of rules for Max-SAT.

The main objective of this paper is to make a step forward in the study of resolution inference rules for Max-SAT by defining a sound and complete resolution rule. We want a rule such that the existing inference rules for Max-SAT either are particular cases of our rule (like the complementary unit clause rule or the almost common clause rule) or are rules that can be derived from our rule (like the star rule). We also want our rule to provide a general framework for extending our results to Weighted Max-SAT, defining complete refinements of resolution, and devising faster Max-SAT solvers.
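The truth-table check above is mechanical, and any candidate Max-SAT rule can be verified the same way. A small Python sketch (our own illustration; clauses are frozensets of integer literals, negation by sign, and the empty clause is the empty frozenset):

from itertools import product

def falsified(clauses, assignment):
    """Number of clauses with no literal true under the assignment."""
    return sum(
        all(assignment[abs(l)] != (l > 0) for l in c)
        for c in clauses
    )

def sound_for_maxsat(premises, conclusions, variables):
    """Sound iff every assignment falsifies equally many premises
    and conclusions (as multisets of clauses)."""
    return all(
        falsified(premises, dict(zip(variables, bits)))
        == falsified(conclusions, dict(zip(variables, bits)))
        for bits in product([False, True], repeat=len(variables))
    )

x, y = 1, 2
star_premises = [frozenset({x}), frozenset({y}), frozenset({-x, -y})]
star_conclusions = [frozenset(), frozenset({x, y})]   # empty clause, x ∨ y
print(sound_for_maxsat(star_premises, star_conclusions, [x, y]))  # True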
The main objective of this paper is to make a step forward in the study of resolution inference rules for Max-SAT by defining a sound and complete resolution rule. We want a rule such that the existing inference rules for Max-SAT either are particular cases of our rule (like the complementary unit clause rule or the almost common clause rule) or are rules that can be derived from our rule (like the star rule). We also want our rule to provide a general framework for extending our results to Weighted Max-SAT, defining complete refinements of resolution, and devising faster Max-SAT solvers.

Firstly, we observe that the classical resolution rule

    x ∨ A,  ¬x ∨ B  ⊢  A ∨ B

is not sound for Max-SAT, because an assignment satisfying x and A, and falsifying B, would falsify one of the premises, but would satisfy the conclusion. So the number of unsatisfied clauses would not be preserved for every truth assignment.

¹ The set of clauses {a, ¬a ∨ b, ¬a ∨ ¬b, ¬a ∨ c, ¬a ∨ ¬c} has a minimum of one unsatisfied clause (setting a to false). However, performing unit propagation with a leads to a non-optimal assignment falsifying at least two clauses.
² URL: http://web.udl.es/usuaris/m4372594/software.html
³ URL: http://www.iiia.csic.es/~maxsat06/

Secondly, there is a natural extension to Max-SAT of the classical resolution rule in [21]:

    x ∨ A
    ¬x ∨ B
    ---------------
    A ∨ B
    x ∨ A ∨ ¬B
    ¬x ∨ ¬A ∨ B

In [21], Larrosa and Heras present this rule and ask whether it is complete for Max-SAT. However, two of the conclusions of this rule are not in clausal form, and the trivial application of distributivity results in an unsound rule:

    x ∨ a_1 ∨ ··· ∨ a_s
    ¬x ∨ b_1 ∨ ··· ∨ b_t
    -------------------------------
    a_1 ∨ ··· ∨ a_s ∨ b_1 ∨ ··· ∨ b_t
    x ∨ a_1 ∨ ··· ∨ a_s ∨ ¬b_1
    ···
    x ∨ a_1 ∨ ··· ∨ a_s ∨ ¬b_t
    ¬x ∨ b_1 ∨ ··· ∨ b_t ∨ ¬a_1
    ···
    ¬x ∨ b_1 ∨ ··· ∨ b_t ∨ ¬a_s

Therefore, our first objective was to modify the previous rule to obtain a sound and complete resolution rule in which the conclusions are in clausal form, as well as analyzing the complexity of applying the rule and finding out if there is some complete refinement. As we show in the next sections, we achieve our objective by providing a sound and complete resolution rule for Max-SAT in which both premises and conclusions are in clausal form. Moreover, we describe an exact algorithm for Max-SAT which is derived from the completeness proof. We also obtain an upper bound on the complexity of applying our rule and prove the completeness of the ordered resolution refinement.

In classical resolution, different copies of a clause are eliminated, leaving just one copy of each clause. In the context of the Max-SAT optimization problem, clearly this is not sound and we must keep repeated copies of a clause. This is why instead of working with sets of clauses we will work with multisets of clauses. A way to make the representation of these multisets more compact is to substitute several copies of a clause by a weighted clause, where the weight represents the number of times that the clause appears. So, our second objective was to extend our Max-SAT resolution rule to weighted clauses. As a result, we obtain a sound and complete resolution rule for Weighted Max-SAT.

Our third objective was to study the complexity of our calculus from the point of view of the number of steps it might need to tell us the minimal number of unsatisfied clauses. Since the Max-SAT problem is hard for the optimization problem corresponding to NP, we expect to find classes of instances that require an exponential number of steps to give the minimal number of unsatisfied clauses. As a result, we prove such lower bounds for various combinatorial principles.

Finally, in this paper we use the term Max-SAT meaning Min-SAT. This is because, with respect to exact computations, finding an assignment that minimizes the number of unsatisfied clauses is equivalent to finding an assignment that maximizes the number of satisfied clauses. This is not necessarily the case for approximability results (see [18]).

This paper proceeds as follows. First, in Section 2 we define Max-SAT resolution and prove its soundness.
Despite the similarity of the inference rule to the classical resolution rule, it is not clear how to simulate classical inferences with the new rule. To obtain a complete strategy, we need to apply the new rule repeatedly to get a saturated set of clauses, as described in Section 3. In Section 4 we prove the completeness of the new rule, and the extension to ordered resolution. In Section 5 we deduce an exact algorithm and give a worst-case time upper bound in Section 6. Section 7 contains a rule for weighted Max-SAT and the soundness and completeness of the rule. Section 8 has the lower bound results for our Max-SAT rule. Finally, we present some concluding remarks.

2. The Max-SAT resolution rule and its soundness

In Max-SAT we need to keep repeated clauses. Therefore, we use multisets of clauses instead of just sets. For instance, the multiset {a, ¬a, ¬a, a ∨ b, ¬b}, where a clause is repeated, has a minimum of two unsatisfied clauses.

Max-SAT resolution, like classical resolution, is based on a unique inference rule. In contrast to the resolution rule, the premises of the Max-SAT resolution rule are removed from the multiset after applying the rule. Moreover, apart from the classical conclusion where a variable has been cut, we also conclude some additional clauses that contain one of the premises as a subclause.

Definition 1. The Max-SAT resolution rule is defined as follows:

    x ∨ a_1 ∨ ··· ∨ a_s
    ¬x ∨ b_1 ∨ ··· ∨ b_t
    -------------------------------
    a_1 ∨ ··· ∨ a_s ∨ b_1 ∨ ··· ∨ b_t
    x ∨ a_1 ∨ ··· ∨ a_s ∨ ¬b_1
    x ∨ a_1 ∨ ··· ∨ a_s ∨ b_1 ∨ ¬b_2
    ···
    x ∨ a_1 ∨ ··· ∨ a_s ∨ b_1 ∨ ··· ∨ b_{t-1} ∨ ¬b_t
    ¬x ∨ b_1 ∨ ··· ∨ b_t ∨ ¬a_1
    ¬x ∨ b_1 ∨ ··· ∨ b_t ∨ a_1 ∨ ¬a_2
    ···
    ¬x ∨ b_1 ∨ ··· ∨ b_t ∨ a_1 ∨ ··· ∨ a_{s-1} ∨ ¬a_s

This inference rule is applied to multisets of clauses, and replaces the premises of the rule by its conclusions. We say that the rule cuts the variable x. The tautologies concluded by the rule are removed from the resulting multiset. Similarly, repeated literals in a clause are collapsed into one.

Definition 2. We write C ⊢ D when the multiset of clauses D can be obtained from the multiset C by applying the Max-SAT resolution rule finitely many times. We write C ⊢_x D when this sequence of applications only cuts the variable x.

The Max-SAT resolution rule may conclude more clauses than the classical resolution rule. Notice though that the number of conclusions of the rule is at most the number of literals in the premises. However, when the two premises share literals, some of the conclusions are tautologies, hence removed. In particular we have x ∨ A, ¬x ∨ A ⊢ A. Moreover, as we will see when we study the completeness of the rule, there is no need to cut the conclusions of a rule among themselves. Finally, we will also see that the size of the worst-case proof of a set of clauses is similar to the size for classical resolution.

Notice that an instance of the rule not only depends on the two clauses of the premise and the cut variable (like in resolution), but also on the order of the literals. Notice also that, like in classical resolution, this rule concludes a new clause not containing the variable x, except when this clause is a tautology.
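To make the shape of Definition 1 concrete, here is a small Python sketch (ours, not the authors' implementation) that, given two premises x ∨ A and ¬x ∨ B represented as tuples of literals, produces the conclusions of the rule, collapses repeated literals, and drops tautologies:

    def maxsat_resolve(x, A, B):
        """Apply the Max-SAT resolution rule cutting variable x.

        A and B are sequences of literals (variable, polarity) so that the
        premises are (x v A1 v ... v As) and (~x v B1 v ... v Bt).
        Returns the list of conclusions of Definition 1.
        """
        pos, neg = (x, True), (x, False)

        def clean(clause):
            clause = list(dict.fromkeys(clause))         # collapse repeated literals
            vars_pos = {v for v, p in clause if p}
            vars_neg = {v for v, p in clause if not p}
            return None if vars_pos & vars_neg else tuple(clause)   # None = tautology

        out = [clean(tuple(A) + tuple(B))]               # a1..as v b1..bt
        for j in range(len(B)):                          # x v A v b1..b_{j-1} v ~bj
            bj = (B[j][0], not B[j][1])
            out.append(clean((pos,) + tuple(A) + tuple(B[:j]) + (bj,)))
        for i in range(len(A)):                          # ~x v B v a1..a_{i-1} v ~ai
            ai = (A[i][0], not A[i][1])
            out.append(clean((neg,) + tuple(B) + tuple(A[:i]) + (ai,)))
        return [c for c in out if c is not None]

For instance, maxsat_resolve('x', [('a', True)], [('a', True)]) returns just [(('a', True),)]: both side conclusions are tautologies and get removed, matching the observation x ∨ A, ¬x ∨ A ⊢ A above.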
Example 3. The Max-SAT resolution rule removes clauses after using them in an inference step. Therefore, it could seem that it cannot simulate classical resolution when a clause needs to be used more than once, like in the example of Fig. 1 (left). However, this is not the case, as can be seen in the same figure (right). More precisely, we derive

    a, ¬a ∨ b, ¬a ∨ c, ¬b ∨ ¬c  ⊢  □, a ∨ ¬b ∨ ¬c, ¬a ∨ b ∨ c

where any truth assignment satisfying {a ∨ ¬b ∨ ¬c, ¬a ∨ b ∨ c} minimizes the number of falsified clauses in the original formula.

Notice that the structure of the classical resolution proof and the Max-SAT resolution proof is quite different. It seems difficult to adapt a classical resolution proof to get a Max-SAT resolution proof, and it is an open question whether this is possible without increasing substantially the size of the proof.

Fig. 1. An example of inference with classical resolution (left) and its equivalence with Max-SAT resolution (right). We put a box around the already used clauses.

Theorem 4 (Soundness). The Max-SAT resolution rule is sound; i.e., the rule preserves the number of unsatisfied clauses for every truth assignment.

Proof. For every assignment I, we will prove that the number of clauses that I falsifies in the premises of the inference rule is equal to the number of clauses that it falsifies in the conclusions. Let I be any assignment. I cannot falsify both premises, since it satisfies either x or ¬x.

Suppose I satisfies x ∨ a_1 ∨ ··· ∨ a_s but not ¬x ∨ b_1 ∨ ··· ∨ b_t. Then I falsifies all b_j's and sets x to true. Now, suppose that I satisfies at least one literal among {a_1, ..., a_s}. Say a_i is the first such literal. Then I falsifies ¬x ∨ b_1 ∨ ··· ∨ b_t ∨ a_1 ∨ ··· ∨ a_{i-1} ∨ ¬a_i and it satisfies all the others in the set of conclusions. Suppose now that I falsifies all a_i's. Then, it falsifies a_1 ∨ ··· ∨ a_s ∨ b_1 ∨ ··· ∨ b_t but satisfies all the other conclusions. If I satisfies the second premise but not the first, then by a similar argument we can show that I falsifies only one conclusion. Finally, suppose that I satisfies both premises. Suppose that I sets x to true. Then, for some j, b_j is true and I satisfies all the conclusions, since all of them have either b_j or x. The argument works similarly for I falsifying x. □

3. Saturated multisets of clauses

In this section we define saturated multisets of clauses. This definition is based on the classical notion of sets of clauses closed by (some restricted kind of) inference, in particular, on sets of clauses closed by cuts of some variable. In classical resolution, given a set of clauses and a variable, we can saturate the set by cutting the variable exhaustively, obtaining a superset of the given clauses. If we repeat this process for all the variables, we get a complete resolution algorithm, i.e. we obtain the empty clause whenever the original set was unsatisfiable. Our completeness proof is based on this idea. However, notice that the classical saturation of a set w.r.t. a variable is unique, whereas in Max-SAT it is not (see Remark 8). In fact, it is not even a superset of the original set. Moreover, in general, if we saturate a set w.r.t. a variable, and then w.r.t. another variable, we obtain a set that is not saturated w.r.t. both variables.

What we will do is to first saturate with respect to a variable x. This way we create two multisets of clauses: one with clauses that do not contain the variable x, and another with clauses that still contain x. We will then saturate with respect to the following variable only in the multiset of clauses that does not contain the first variable x. We will do the same with the rest of the variables. Also, the saturation procedure keeps a good property: given a multiset of clauses saturated w.r.t. a variable x, if there exists an assignment satisfying all the clauses not containing x, then it can be extended (by assigning x) to satisfy all the clauses (see Lemma 9).

Definition 5. A multiset of clauses C is said to be saturated w.r.t. x if for every pair of clauses C_1 = x ∨ A and C_2 = ¬x ∨ B of C, there is a literal l such that l is in A and ¬l is in B.
A multiset of clauses C' is a saturation of C w.r.t. x if C' is saturated w.r.t. x and C ⊢_x C'; i.e., C' can be obtained from C by applying the inference rule cutting x finitely many times.

Trivially, by the previous definition, a multiset of clauses C is saturated w.r.t. x if, and only if, every possible application of the inference rule cutting x only introduces clauses containing x (since tautologies get eliminated).

We assign a function P_C : {0,1}^n → {0,1} to every clause, and a function P_C : {0,1}^n → N to every multiset of clauses, as follows.

Definition 6. For every clause C = x_1 ∨ ··· ∨ x_s ∨ ¬x_{s+1} ∨ ··· ∨ ¬x_{s+t}, we define its characteristic function as

    P_C(x) = (1 − x_1) ··· (1 − x_s) x_{s+1} ··· x_{s+t}.

For every multiset of clauses C = {C_1, ..., C_m}, we define its characteristic function as P_C(x) = Σ_{i=1}^{m} P_{C_i}(x).

Notice that for every assignment I, P_C(I) is the number of clauses of C falsified by I. Also, by the soundness of our rule, a step of the Max-SAT resolution rule replaces a multiset of clauses by another with the same characteristic function.

Before stating and proving the following lemma, let us recall the usual order relation among functions: f ≤ g if for all x, f(x) ≤ g(x); and f < g if for all x, f(x) ≤ g(x) and for some x, f(x) < g(x). Since the functions have finite domain and the order relation on the range is well-founded, the order relation < on the functions is also well-founded.

Lemma 7. For every multiset of clauses C and variable x, there exists a multiset C' such that C' is a saturation of C w.r.t. x. Moreover, this multiset C' can be computed by applying the inference rule to any pair of clauses x ∨ A and ¬x ∨ B, with the restriction that A ∨ B is not a tautology, using any ordering of the literals, until we cannot apply the inference rule any longer with this restriction.

Proof. We proceed by applying nondeterministically the inference rule cutting x, until we obtain a saturated multiset. We only need to prove that this process terminates in finitely many inference steps, i.e. that there does not exist an infinite sequence C = C_0 ⊢ C_1 ⊢ ···, where at every inference step we cut the variable x and none of the sets C_i is saturated. At every step, we can divide C_i into two multisets: E_i with all the clauses that do not contain x, and D_i with the clauses that contain the variable x (in positive or negative form). When we apply the inference rule we replace two clauses of D_i by a multiset of clauses, where one of them, say A, does not contain x. Therefore, we obtain a distinct multiset C_{i+1} = D_{i+1} ∪ E_{i+1}, where E_{i+1} = E_i ∪ {A}. Since A is not a tautology, the characteristic function P_A is not the constant zero function. Then, since P_{C_{i+1}} = P_{C_i} and P_{E_{i+1}} = P_{E_i} + P_A, we obtain P_{D_{i+1}} = P_{D_i} − P_A and P_{D_{i+1}} < P_{D_i}. Therefore, the characteristic function of the multiset of clauses containing x strictly decreases after every inference step. Since the order relation between characteristic functions is well-founded, this proves that we cannot perform infinitely many inference steps. □

Remark 8. Although every multiset of clauses is saturable, its saturation is not unique. For instance, the multiset {a, ¬a ∨ b, ¬a ∨ c} has two possible saturations w.r.t. variable a: the multiset {b, ¬b ∨ c, a ∨ ¬b ∨ ¬c, ¬a ∨ b ∨ c} and the multiset {c, b ∨ ¬c, a ∨ ¬b ∨ ¬c, ¬a ∨ b ∨ c}.

Another difference with respect to classical resolution is that we cannot saturate a set of clauses simultaneously w.r.t. two variables by saturating w.r.t. one, and then w.r.t. the other.
For instance, if we saturate {¬a ∨ c, a ∨ b ∨ c} w.r.t. a, we obtain {b ∨ c, ¬a ∨ ¬b ∨ c}. This is the only possible saturation of the original set. If now we saturate this multiset w.r.t. b, we obtain again the original set {¬a ∨ c, a ∨ b ∨ c}. Therefore, it is not possible to saturate this multiset of clauses w.r.t. a and b simultaneously.

Lemma 9. Let C be a saturated multiset of clauses w.r.t. x. Let C' be the subset of clauses of C not containing x. Then, any assignment I satisfying C' (and not assigning x) can be extended to an assignment satisfying C.

Proof. We have to extend I to satisfy the whole C. In fact we only need to set the value of x. If x has a unique polarity in C \ C', then the extension is trivial (x = true if x always occurs positively, and x = false otherwise). If, for every clause of the form x ∨ A or ¬x ∨ A, the assignment I already satisfies A, then any choice of the value of x will work. Otherwise, assume that there is a clause x ∨ A (similarly for ¬x ∨ A) such that I sets A to false. We set x to true. All the clauses of the form x ∨ B will be satisfied. For the clauses of the form ¬x ∨ B, since C is saturated, there exists a literal l such that l ∈ A and ¬l ∈ B. This ensures that, since I falsifies A, I(l) = false and I satisfies B. □

4. Completeness of Max-SAT resolution

Now, we prove the main result of this paper, the completeness of Max-SAT resolution. The main idea is to prove that we can get a complete algorithm by successively saturating w.r.t. all the variables. However, notice that after saturating w.r.t. x_1 and then w.r.t. x_2, we get a multiset of clauses that is not saturated w.r.t. x_1 anymore. Therefore, we will use a variant of this basic algorithm: we saturate w.r.t. x_1, then we remove all the clauses containing x_1, and saturate w.r.t. x_2; we remove all the clauses containing x_2 and saturate w.r.t. x_3, and so on. Using Lemma 9, we prove that, if the original multiset of clauses was unsatisfiable, then with this process we get the empty clause. Even better, we get as many empty clauses as the minimum number of unsatisfied clauses in the original formula.

Theorem 10 (Completeness). For any multiset of clauses C, we have

    C ⊢ {□, ..., □} ∪ D     (m empty clauses)

where D is a satisfiable multiset of clauses, and m is the minimum number of unsatisfied clauses of C.

Proof. Let x_1, ..., x_n be any list of the variables of C. We construct two sequences of multisets C_0, ..., C_n and D_1, ..., D_n such that

(i) C = C_0,
(ii) for i = 1, ..., n, C_i ∪ D_i is a saturation of C_{i-1} w.r.t. x_i, and
(iii) for i = 1, ..., n, C_i is a multiset of clauses not containing x_1, ..., x_i, and D_i is a multiset of clauses containing the variable x_i.

By Lemma 7, these sequences can effectively be computed: for i = 1, ..., n, we saturate C_{i-1} w.r.t. x_i, and then we partition the resulting multiset into a subset D_i containing x_i, and another C_i not containing this variable. Notice that, since C_n does not contain any variable, it is either the empty multiset ∅, or it only contains (some) empty clauses {□, ..., □}.

Now we are going to prove that the multiset D = ∪_{i=1}^{n} D_i is satisfiable by constructing an assignment satisfying it. For i = 1, ..., n, let E_i = D_i ∪ ··· ∪ D_n, and let E_{n+1} = ∅. Notice that, for i = 1, ..., n,

(i) the multiset E_i only contains the variables {x_i, ..., x_n},
(ii) E_i is saturated w.r.t. x_i, and
(iii) E_i decomposes as E_i = D_i ∪ E_{i+1}, where all the clauses of D_i contain x_i and none of E_{i+1} contains x_i.

Claims (i) and (iii) are trivial. For claim (ii), notice that, since C_i ∪ D_i is saturated w.r.t. x_i, the subset D_i is also saturated. Now, since D_{i+1} ∪ ··· ∪ D_n does not contain x_i, the set E_i will be saturated w.r.t. x_i.
Now, we construct a sequence of assignments I_1, ..., I_{n+1}, where I_{n+1} is the empty assignment, hence satisfies E_{n+1} = ∅. I_i is constructed from I_{i+1} as follows. Assume by induction hypothesis that I_{i+1} satisfies E_{i+1}. Since E_i is saturated w.r.t. x_i, and decomposes into D_i and E_{i+1}, by Lemma 9, we can extend I_{i+1} with an assignment for x_i to obtain I_i satisfying E_i. Iterating, we get that I_1 satisfies E_1 = D = ∪_{i=1}^{n} D_i. Since the inference rule is sound (Theorem 4), and by the previous argument D is satisfiable, we conclude that m = |C_n| is the minimum number of unsatisfied clauses of C. □

In classical resolution we can assume a given total order on the variables x_1 > x_2 > ··· > x_n and restrict inferences x ∨ A, ¬x ∨ B ⊢ A ∨ B to satisfy that x is maximal in x ∨ A and in ¬x ∨ B. This refinement of resolution is complete, and has some advantages: the set of possible proofs is smaller, thus its search is more efficient. The same result holds for Max-SAT resolution:

Corollary 11. For any multiset of clauses C, and for every ordering x_1 > ··· > x_n of the variables, we have

    C ⊢_{x_1} C_1 ⊢_{x_2} ··· ⊢_{x_n} {□, ..., □} ∪ D     (m empty clauses)

where D is a satisfiable multiset of clauses, m is the minimum number of unsatisfied clauses of C, and in every inference step the cut variable is maximal.

Proof. The proof is similar to Theorem 10. First, given the ordering x_1 > x_2 > ··· > x_n, we start by computing the saturation w.r.t. x_1 and finish with x_n. Now, notice that, when we saturate C_0 w.r.t. x_1 to obtain C_1 ∪ D_1, we only cut x_1, and this is the biggest variable. Then, when we saturate C_1 w.r.t. x_2 to obtain C_2 ∪ D_2, we have to notice that the clauses of C_1, and the clauses that we could obtain from them, do not contain x_1, and we only cut x_2, which is the biggest variable in all the premises. In general, we can see that at every inference step performed during the computation of the saturations (no matter how they are computed) we always cut a maximal variable. We only have to choose the order in which we saturate the variables coherently with the given ordering of the variables. □

5. An algorithm for Max-SAT

From the proof of Theorem 10, we can extract the following algorithm:

    input: C
    C_0 := C
    for i := 1 to n
        C := saturation(C_{i-1}, x_i)
        (C_i, D_i) := partition(C, x_i)
    endfor
    m := |C_n|
    I := ∅
    for i := n downto 1
        I := I ∪ [x_i → extension(x_i, I, D_i)]
    output: m, I

Given an initial multiset of clauses C, this algorithm obtains the minimum number m of unsatisfied clauses and an optimal assignment I for C.

Function saturation(C, x) computes a saturation of C w.r.t. x. As we have already said, the saturation of a multiset is not unique, but the proof of Theorem 10 does not depend on which particular saturation we take. Therefore, this computation can be done with "don't care" non-determinism. Function partition(C, x) computes a partition of C into the subset of clauses containing x and the subset of clauses not containing x, D_i and C_i respectively. Function extension(x, I, D) computes a truth assignment for x such that, if I assigns the value true to all the clauses of D containing x, then the function returns false, and if I assigns true to all the clauses of D containing ¬x, then it returns true. According to Lemma 9 and the way the D_i's are computed, I evaluates to true all the clauses containing x or all the clauses containing ¬x. The order of saturation of the variables can also be freely chosen; i.e., the sequence x_1, ..., x_n can be any enumeration of the variables.
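The following Python sketch is our own illustration of this pseudocode (it is not the authors' code, and it assumes the maxsat_resolve helper shown in Section 2). It saturates on each variable in turn, splits off the clauses still containing the variable, and finally extends the assignment backwards, relying on Lemma 9 for the extension step:

    def lits_without(clause, x):
        """Drop the literals on variable x from a clause."""
        return tuple(l for l in clause if l[0] != x)

    def is_taut(lits):
        s = set(lits)
        return any((v, not p) in s for v, p in s)

    def saturate(clauses, x):
        """Saturate a clause multiset w.r.t. x (Lemma 7): resolve pairs
        x v A, ~x v B whose resolvent A v B is not a tautology."""
        cs = list(clauses)
        while True:
            pair = None
            for i, ci in enumerate(cs):
                if (x, True) not in ci:
                    continue
                for j, cj in enumerate(cs):
                    if (x, False) in cj and not is_taut(
                            lits_without(ci, x) + lits_without(cj, x)):
                        pair = (i, j)
                        break
                if pair:
                    break
            if pair is None:
                return cs
            i, j = pair
            A, B = lits_without(cs[i], x), lits_without(cs[j], x)
            cs = [c for k, c in enumerate(cs) if k not in pair]
            cs += maxsat_resolve(x, A, B)

    def maxsat_solve(clauses, variables):
        """Mirror of the pseudocode above: returns (m, optimal assignment)."""
        C, D = list(clauses), []
        for x in variables:
            C = saturate(C, x)
            D.append([c for c in C if any(v == x for v, _ in c)])   # D_i
            C = [c for c in C if all(v != x for v, _ in c)]         # C_i
        m = len(C)          # C_n contains only empty clauses
        I = {}
        for x, Dx in reversed(list(zip(variables, D))):
            # extension(x, I, Dx): set x true iff some clause x v A in Dx
            # has its A-part falsified by I; by Lemma 9 this choice works.
            I[x] = any(not any(I.get(v) == p for v, p in c if v != x)
                       for c in Dx if (x, True) in c)
        return m, I

For example, maxsat_solve([(('a', True),), (('a', False),)], ['a']) returns m = 1, as the multiset {a, ¬a} has exactly one unsatisfied clause under any assignment.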
6. Efficiency

In classical resolution, we know that there are formulas that require refutations that are exponentially long in the number of variables, and even in the size of the formula. On the other hand, no formula requires more than 2^n inference steps to be refuted, n being the number of variables. Fortunately, in many practical cases the number of resolution steps required is polynomial.

Obviously, we do not have a better situation in Max-SAT resolution. Moreover, since we can have repeated clauses, and we may need to generate more than one empty clause, the number of inference steps is not only bounded by the number of variables; it also depends on the number of original clauses. Again, in many practical cases of Max-SAT resolution, the number of resolution steps is also polynomial. In contrast, bucket elimination for soft constraints [29], which is also a complete procedure for Max-SAT, always requires exponential time and, even worse, exponential space.

The following theorem states an upper bound on the number of inference steps, using the strategy of saturating variable by variable:
Newton's Mathematical Principles of Natural Philosophy and the Rules of Reasoning in Philosophy (in English)
"Newton's Mathematical Principles of Natural Philosophy and the Rules of Reasoning in Philosophy"

Isaac Newton is widely regarded as one of the most influential scientists in history. His groundbreaking work in physics and mathematics laid the foundation for modern science and revolutionized our understanding of the natural world. One of his most significant contributions is the development of the mathematical principles of natural philosophy, which provided a systematic framework for explaining the motion of objects and the behavior of physical systems.

Newton's mathematical principles of natural philosophy, as articulated in his seminal work "Philosophiæ Naturalis Principia Mathematica" (Mathematical Principles of Natural Philosophy), laid the groundwork for the development of classical mechanics. Newton's laws of motion, which are based on mathematical principles, provide a quantitative description of the behavior of objects in motion and have been fundamental to the development of modern physics and engineering.

In addition to his mathematical principles of natural philosophy, Newton also made important contributions to the field of philosophy, particularly in the areas of logic and reasoning. Newton's work on the philosophy of science and his development of empirical methods for testing scientific hypotheses laid the groundwork for the scientific method, which remains the foundation of modern scientific inquiry.

The philosophical implications of Newton's work are also manifested in his development of inferential reasoning and his establishment of rules for logical deduction. Newton's emphasis on empirical evidence and his commitment to the use of mathematical and logical reasoning in scientific inquiry have had a lasting impact on the development of philosophical thought and scientific methodology.

Overall, Newton's mathematical principles of natural philosophy and his contributions to the development of inferential reasoning and logical deduction have had a profound impact on the development of modern science and philosophy. His work continues to be a source of inspiration and guidance for scientists and philosophers alike, and remains an essential part of the intellectual legacy of the Western tradition.
Exact Periodic-Wave Solutions for the (2+1)-Dimensional Boussinesq Equation and the (3+1)-Dimensional KP Equation
…tain derivatives (this is true for the equations considered here). In this process we take the integration constants to be zero. The next crucial step is to express the solutions of the resulting ODE by the Jacobi elliptic-function method of Ref. [12]: u(ξ) can be expressed as a finite power series of the Jacobi elliptic sine function sn ξ, i.e., the ansatz

    u(ξ) = Σ_{j=0}^{n} a_j sn^j ξ,    (4)
    d/dξ (cn ξ) = −sn ξ dn ξ,    d/dξ (dn ξ) = −m² sn ξ cn ξ.    (8)
In this article, for Jacobi elliptic functions, we use the notation sn ξ, cn ξ, dn ξ with argument ξ and modulus parameter m (0 < m < 1). The parameter n in Eq. (4) will be fixed by balancing the highest-order derivative term and the nonlinear term in the nonlinear ODE Eq. (3) by using Eq. (5). Substituting Eq. (4) (with the fixed value of n) into the reduced nonlinear ODE (3) and equating the coefficients of the various powers of sn ξ to zero, we get a set of algebraic equations for a_j, k, l, s, and ω. Solving them consistently, we obtain relations among the parameters.
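The derivative identities in (8) are easy to sanity-check numerically. The snippet below is our own illustration (not part of the paper); it uses mpmath's ellipfun, whose third argument is the parameter m in mpmath's convention (the square of the modulus used in the text), so the text's −m² sn cn becomes −m·sn·cn here:

    from mpmath import ellipfun, diff, mp

    mp.dps = 30
    m = 0.36        # mpmath parameter = (text's modulus 0.6) squared
    u = 1.3         # sample argument

    sn = lambda t: ellipfun('sn', t, m)
    cn = lambda t: ellipfun('cn', t, m)
    dn = lambda t: ellipfun('dn', t, m)

    # In parameter form: sn' = cn*dn, cn' = -sn*dn, dn' = -m*sn*cn.
    assert abs(diff(sn, u) - cn(u) * dn(u)) < 1e-15
    assert abs(diff(cn, u) + sn(u) * dn(u)) < 1e-15
    assert abs(diff(dn, u) + m * sn(u) * cn(u)) < 1e-15
    print("derivative identities (8) verified numerically")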
Predicates for the Planar Additively Weighted Voronoi Diagram ∗
Menelaos I. Karavelas † Ioannis Z. Emiris ‡
Abstract We consider the geometric predicates involved in an incremental algorithm for computing the additively weighted Voronoi diagram in the plane. These predicates correspond to certain algebraic operations, or subpredicates, whose efficient implementation calls for studying various algebraic tools. Our effort is to minimize the algebraic degree of the predicates, thus optimizing the required precision to perform exact arithmetic. We may also try to minimize the number of arithmetic operations; this twofold optimization corresponds to reducing bit complexity. The proposed algorithms are based on Sturm sequences of univariate polynomials and make use of geometric invariants to simplify calculations. Multivariate resultants are also used for a deeper understanding of the predicates and provide an alternative approach to evaluation. We expect that our techniques are sufficiently powerful and general to be applied to a number of analogous geometric problems on curved objects.
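To give a flavor of one of the algebraic tools mentioned in the abstract, the following Python sketch (ours, not the authors' implementation) builds the Sturm sequence of a univariate polynomial by repeated negated remainders and counts real roots in an interval via sign variations; it assumes a square-free polynomial and uses floating point, whereas the paper is concerned with exact arithmetic:

    def poly_div(a, b):
        """Divide polynomials given as coefficient lists, highest degree first."""
        a, q = list(a), []
        while len(a) >= len(b):
            c = a[0] / b[0]
            q.append(c)
            a = [x - c * y for x, y in zip(a, b + [0.0] * (len(a) - len(b)))][1:]
        return q, a or [0.0]

    def sturm_sequence(p):
        """p, p', then s_{k+1} = -rem(s_{k-1}, s_k), down to a constant."""
        dp = [c * (len(p) - 1 - i) for i, c in enumerate(p[:-1])]
        seq = [p, dp]
        while len(seq[-1]) > 1:
            _, r = poly_div(seq[-2], seq[-1])
            r = [-c for c in r]
            while len(r) > 1 and abs(r[0]) < 1e-12:   # strip leading zeros
                r = r[1:]
            if all(abs(c) < 1e-12 for c in r):        # gcd found: not square-free
                break
            seq.append(r)
        return seq

    def sign_variations(seq, x):
        vals = [sum(c * x ** (len(p) - 1 - i) for i, c in enumerate(p)) for p in seq]
        signs = [v for v in vals if abs(v) > 1e-12]
        return sum(1 for a, b in zip(signs, signs[1:]) if a * b < 0)

    def count_real_roots(p, lo, hi):
        seq = sturm_sequence(p)
        return sign_variations(seq, lo) - sign_variations(seq, hi)

    # (x-1)(x-2)(x+3) = x^3 - 7x + 6 has two roots in (0, 5)
    print(count_real_roots([1.0, 0.0, -7.0, 6.0], 0.0, 5.0))   # -> 2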
The deepest regression method
The regression depth of a fit θ ∈ ℝ^p relative to the dataset Z_n ⊂ ℝ^p is thus the smallest number of observations that need to be passed when tilting θ until it becomes vertical. Therefore, we always have 0 ≤ rdepth(θ; Z_n) ≤ n. In the special case of p = 1 there are no x-values, and Z_n is a univariate dataset. For any θ ∈ ℝ we then have

    rdepth(θ; Z_n) = min(#{y_i(θ) ≥ 0}, #{y_i(θ) ≤ 0})

which is the 'rank' of θ when we rank from the outside inwards. For any p ≥ 1, the regression depth of θ measures how balanced the dataset Z_n is about the linear fit determined by θ. It can easily be verified that regression depth is scale invariant, regression invariant and affine invariant according to the definitions in Rousseeuw and Leroy ([17], page 116).

Based on the notion of regression depth, Rousseeuw and Hubert [16] introduced the deepest regression estimator (DR) for robust linear regression. In Section 2 we give the definition of DR and its basic properties. We show that DR is a robust method with breakdown value that converges almost surely to 1/3 in any dimension, when the good data come from a large semiparametric model. Section 3 proposes the fast approximate algorithm MEDSWEEP to compute DR in higher dimensions (p ≥ 3). Based on the distribution of the regression depth function, inference for the parameters is derived in Section 4. Tests and confidence regions for the true unknown parameters θ_1, ..., θ_p are constructed. We also propose a test for linearity versus convexity of the dataset Z_n based on the maximal depth of Z_n. Applications of deepest regression to specific models are given in Section 5. First we consider polynomial regression, for which we update the definition of regression depth and then compute the deepest regression accordingly. We show that the deepest polynomial regression always has breakdown value at least 1/3. We also apply the deepest regression to the Michaelis-Menten model, where it provides a solution to the problem of ambiguous results obtained from the two commonly used parametrizations.
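For the univariate case (p = 1) the rank-from-the-outside formula above is a two-liner. The following Python sketch (our addition, using residuals y_i(θ) = y_i − θ) computes rdepth(θ; Z_n) directly from the definition:

    def rdepth_univariate(theta, y):
        """min(#{y_i >= theta}, #{y_i <= theta}): the rank of theta
        when ranking from the outside inwards."""
        ge = sum(1 for yi in y if yi >= theta)
        le = sum(1 for yi in y if yi <= theta)
        return min(ge, le)

    y = [1.0, 2.0, 3.5, 7.0, 9.0]
    print(rdepth_univariate(3.5, y))   # 3: the median attains maximal depth
    print(rdepth_univariate(9.5, y))   # 0: a fit outside the data has depth 0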
Exact Penalty Principle
One of the purposes of this paper is to extend Clarke's exact penalty principle to the case where f is vector-valued. Our result in the special case where the objective function f is scalar-valued yields the following improved Clarke exact penalty principle, which is a corollary of Theorem 3.1.

Theorem 1.2 (Improved Clarke's exact penalty principle). Let X be a normed space, C ⊂ S ⊂ X, and let f : X → ℝ be Lipschitz of rank L_f on S. Then for L > L_f, f attains a minimum over C at x if and only if the function g(y) = f(y) + L d_C(y) attains a minimum over S at x.

Unfortunately, when local optimal solutions are considered, the reverse statement of Clarke's exact penalty principle does not hold without additional conditions. In [17], Scholtes and Stöhr gave some conditions which ensure that the reverse statements hold for the distance function and the error bound function. In this paper we extend these results to the vector optimization case. Under the assumption that S is compact and the local (global) optimal solutions of the problem (P) lie in the interior of the set S, Di Pillo and Grippo [4, 5] showed that the extended Mangasarian-Fromovitz constraint qualification (EMFCQ) can be used to ensure that the local (global) minimizers of the penalized problem with penalty function ψ(x) := ρ(h(x)
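Stepping back to the scalar statement of Theorem 1.2, the principle is easy to visualize on a one-dimensional toy instance. In the sketch below (our example, not from the paper), f(x) = x is Lipschitz of rank 1, C = [0, 1] and S = [-2, 2]; any penalty weight L > 1 makes the minimizer of f + L·d_C over S coincide with the constrained minimizer of f over C, while a weight below the Lipschitz rank lets the minimizer escape C:

    def d_C(x, lo=0.0, hi=1.0):
        """Distance from x to the interval C = [lo, hi]."""
        return max(lo - x, 0.0, x - hi)

    def penalized(x, L):
        return x + L * d_C(x)          # f(x) = x has Lipschitz rank 1

    grid = [-2.0 + 4.0 * k / 100000 for k in range(100001)]   # S = [-2, 2]
    for L in (2.0, 5.0):
        x_star = min(grid, key=lambda x: penalized(x, L))
        print(L, round(x_star, 6))     # both print x* = 0, the minimizer of f over C

    # With L = 0.5 < L_f = 1 the penalty is too weak: the minimizer escapes to -2.
    print(min(grid, key=lambda x: penalized(x, 0.5)))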
Minimum cost flow (foreign-language literature)
the correctness of the parallel method, adapting the convergence proof given in [11] to the convex cost function case. In Section 4, we describe parallel implementations on a shared memory multiprocessor. Finally, in Section 5, we present and discuss some computational results collected on a large set of test problems.

2. The convex network flow problem

Given a directed graph G = (N, A), where N is the set of nodes with |N| = n and A is the set of arcs with |A| = m, the MCCF problem can be formulated as follows:

    min Σ_{(i,j)∈A} f_ij(x_ij)                                        (1)
    s.t. Σ_{j:(i,j)∈A} x_ij − Σ_{j:(j,i)∈A} x_ji = s_i,  ∀i ∈ N       (2)

where f_ij : ℝ → (−∞, +∞], ∀(i,j) ∈ A, is a convex, closed, proper function (extended real-valued, lower semicontinuous, not identically taking the value ∞ [4]); x_ij, ∀(i,j) ∈ A, represents the number of units of flow through the arc (i,j) from node i to node j. Furthermore, s_i, ∀i ∈ N, is the supply/demand of node i, depending on whether its value is greater/less than zero. We refer to constraints (2) as the conservation of flow constraints. For each function f_ij, we denote by l_ij and u_ij, respectively, the left and right endpoints of the effective domain C_ij = {ξ ∈ ℝ | f_ij(ξ) < ∞}. We make the following assumptions.

Assumption 2.1. The MCCF problem is feasible; that is, there exists at least one flow vector x satisfying the flow conservation constraints (2) whose components x_ij belong to C_ij, ∀(i,j) ∈ A.

Assumption 2.2. There exists at least one feasible flow vector x such that f⁻_ij(x_ij) < ∞ and f⁺_ij(x_ij) > −∞, ∀(i,j) ∈ A, where f⁻_ij(x_ij) and f⁺_ij(x_ij) denote, respectively, the left and the right directional derivative of f_ij at x_ij.

The ε-relaxation method for solving the MCCF problem can be viewed as a generalization of the method proposed in [13] for the linear minimum cost flow problem. The method is based on the satisfaction of the ε-complementary slackness conditions. Let p_i be the price of node i ∈ N. Given a scalar ε > 0, we say that a flow-price vector pair (x, p) satisfies the ε-complementary slackness conditions (ε-CS for short) if and only if

    f⁻_ij(x_ij) − ε ≤ p_i − p_j ≤ f⁺_ij(x_ij) + ε,  ∀(i,j) ∈ A.       (3)

It can be shown that, if a feasible flow-price vector pair (x, p) satisfies ε-CS, then the cost corresponding to x is optimal within a factor proportional to ε [8]. In the sequel, some terminology and computational operations are introduced.

Definition 2.1. Given a flow distribution x, the surplus g_i of a node i is the difference between the supply s_i and the net outflow from i, that is:

    g_i = Σ_{j:(j,i)∈A} x_ji − Σ_{j:(i,j)∈A} x_ij + s_i.              (4)

Definition 2.2. Given a flow-price vector pair (x, p) satisfying ε-CS, the push list L_i of node i, ∀i ∈ N, is defined as follows:

    L_i = {(i,j) | ε/2 < p_i − p_j − f⁺_ij(x_ij) ≤ ε}
        ∪ {(j,i) | −ε ≤ p_j − p_i − f⁻_ji(x_ji) < −ε/2}.              (5)

Definition 2.3. For each node i, the push list L_i contains unblocked arcs, that is, arcs (i,j) such that

    p_i − p_j ≥ f⁺_ij(x_ij + δ),                                      (6)

or arcs (j,i) such that

    p_j − p_i ≤ f⁻_ji(x_ji − δ),                                      (7)

where δ is a given positive scalar.

Definition 2.4. Given the push list L_i of node i and an unblocked arc a = (i,j) [or a = (j,i)], the flow margin of the arc a is the supremum of δ for which relation (6) [or (7)] holds.
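Conditions (3) and (4) translate directly into code. The following Python sketch is our illustration for the special case of linear arc costs f_ij(x) = a_ij·x on [0, u_ij], where the directional derivatives are a_ij in the interior of the domain and ±∞ at its endpoints; it checks ε-CS on every arc and computes node surpluses:

    import math

    def dplus(a, x, u):    # right derivative of f(x) = a*x on [0, u]
        return a if x < u else math.inf

    def dminus(a, x, u):   # left derivative of f(x) = a*x on [0, u]
        return a if x > 0 else -math.inf

    def satisfies_eps_cs(arcs, flow, price, eps):
        """arcs: dict (i, j) -> (a_ij, u_ij); checks condition (3) on every arc."""
        for (i, j), (a, u) in arcs.items():
            x = flow[(i, j)]
            if not (dminus(a, x, u) - eps <= price[i] - price[j]
                    <= dplus(a, x, u) + eps):
                return False
        return True

    def surplus(node, arcs, flow, supply):
        """g_i of Definition 2.1: inflow minus outflow plus supply."""
        inflow  = sum(flow[(j, i)] for (j, i) in arcs if i == node)
        outflow = sum(flow[(i, j)] for (i, j) in arcs if i == node)
        return inflow - outflow + supply[node]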
The ε-relaxation method starts with a flow vector x such that x_ij ∈ C_ij, ∀(i,j) ∈ A, and a price vector p such that the flow-price vector pair (x, p) satisfies ε-CS. At each iteration, a node i with positive surplus (referred to in the following as an active node) is selected, and one (or more) of the two following basic operations is performed on it.

1. A price rise, which consists of increasing the price p_i by the maximum amount that maintains ε-CS, whereas the flow vector x and the prices p_j, ∀j ∈ N − {i}, are left unchanged.
2. A flow push along an arc (i,j) [or along an arc (j,i)], which consists of increasing the flow on arc (i,j) [or decreasing the flow on arc (j,i)] by an amount δ ∈ (0, g_i], while leaving all other flows as well as the price vector unchanged.

The typical iteration on node i is as follows.

1. (Scan the push list L_i) If L_i = ∅, go to 3.
2. (Decrease the surplus of node i) Choose an arc a ∈ L_i and perform a δ-flow push along it, where δ = min{g_i, flow margin of a}. If g_i = 0, go to the next iteration; otherwise go to 1.
3. (Increase the price of node i) Execute a price rise operation on i. Go to the next iteration.
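Putting Definitions 2.1 to 2.4 together, the typical iteration is only a few lines of code. The following Python sketch (ours, again restricted to linear costs a_ij·x on [0, u_ij], and reusing the surplus helper from the previous sketch) processes one active node:

    def push_list(i, arcs, flow, price, eps):
        """Push list L_i of Definition 2.2; the capacity checks encode
        f+ = +inf at x = u and f- = -inf at x = 0."""
        L = []
        for (s, t), (a, u) in arcs.items():
            if s == i and flow[(s, t)] < u and eps / 2 < price[i] - price[t] - a <= eps:
                L.append(('out', (s, t)))
            if t == i and flow[(s, t)] > 0 and -eps <= price[s] - price[i] - a < -eps / 2:
                L.append(('in', (s, t)))
        return L

    def iterate_node(i, arcs, flow, price, supply, eps):
        """One 'typical iteration' on active node i: flow pushes, price rises."""
        while surplus(i, arcs, flow, supply) > 0:
            L = push_list(i, arcs, flow, price, eps)
            if not L:
                # price rise: largest increase of p_i that keeps eps-CS;
                # assumes some incident arc yields a finite bound.
                slack = []
                for (s, t), (a, u) in arcs.items():
                    if s == i and flow[(s, t)] < u:
                        slack.append(price[t] + a + eps - price[i])
                    if t == i and flow[(s, t)] > 0:
                        slack.append(price[s] - a + eps - price[i])
                price[i] += min(slack)
                continue
            direction, (s, t) = L[0]
            g = surplus(i, arcs, flow, supply)
            if direction == 'out':                         # push forward
                flow[(s, t)] += min(g, arcs[(s, t)][1] - flow[(s, t)])
            else:                                          # push backward
                flow[(s, t)] -= min(g, flow[(s, t)])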
For a feasible problem, the ε-relaxation method terminates in a finite number of iterations when g_i = 0, ∀i ∈ N, with a flow vector that is optimal within a factor proportional to ε [8].

3. The parallel ε-relaxation method

This section is devoted to the description of a parallel asynchronous version of the ε-relaxation method for the MCCF problem. The proposed parallel algorithm can be viewed as an extension of the method developed for the case of linear cost functions by Bertsekas and Tsitsiklis [11], who have also demonstrated its theoretical convergence properties (theoretical results for the linear case have also been discussed in a simpler form by Li and Zenios in [12]).

The ε-relaxation method described in the previous section has a quite simple structure, which makes it well suited to implementation on parallel systems. Perhaps the easiest way to parallelize the method consists of selecting, at each iteration, several non-adjacent active nodes and performing the two basic operations of the sequential method (price rise and flow push) concurrently. However, such an implementation could be not very efficient, since the number of non-adjacent active nodes could be relatively small in the course of the algorithm. Thus, it might not be possible to take full advantage of the parallelization. A much more efficient version can be obtained by allowing simultaneous iterations on adjacent active nodes. In this case, it may happen that two processors update simultaneously the flow value along the same arc. For this reason, in order to guarantee the correct termination of the parallel method, some synchronization mechanisms are needed.

In the following, we describe the details of this parallel asynchronous version and we prove its theoretical convergence properties. In order to design a parallel method valid for both shared and distributed memory systems, we refer to a theoretical computational model that consists of a set of N_P processors, each with its own local memory, that communicate through a global memory or an interconnection network. We assume that the computation proceeds in supersteps, each consisting of an input phase, a computational phase, and an output phase. In the input phase, a processor can receive information from other processors (i.e., it can read data from the shared memory, or receive data sent to it from other processors); in the computational phase, it performs local computation; in the output phase, it communicates data to other processors (i.e., by sending them through communication links or writing them into the global memory). For the sake of simplicity, in the sequel we assume that each node i is assigned to a separate processor P_i (i.e., the number of available processors N_P is equal to the number of nodes n). This makes no theoretical difference with respect to the most general case where N_P < n (in that case, several nodes have to be assigned to each processor in some way).

The parallel asynchronous scheme for the ε-relaxation method can be formally described as follows. Each processor P_i executes the typical iteration at node i. The input and output phases involve communication with the adjacent processors P_j, ∀j ∈ F_i ∪ B_i, where

    F_i = {j | (i,j) ∈ A}   and   B_i = {j | (j,i) ∈ A}.

At any time t, each processor P_i holds the following values:

- p_i(t): the price of node i;
- p_j(i,t): the price of node j ∈ F_i ∪ B_i communicated by P_j at some earlier time;
- x_ij(i,t): the estimate of the flow on the arc (i,j), j ∈ F_i, available at processor P_i at time t;
- x_ji(i,t): the estimate of the flow on the arc (j,i), j ∈ B_i, available at processor P_i at time t.

We assume that the price and the flow values can change only at an increasing sequence of times t_0, t_1, ..., t_m, with t_m → ∞. At each time t, the processor P_i can execute one of the following three phases.

1. Computational Phase. P_i computes the surplus g_i(t):

    g_i(t) = Σ_{j:(j,i)∈A} x_ji(i,t) − Σ_{j:(i,j)∈A} x_ij(i,t) + s_i.

If g_i(t) > 0, then the typical iteration is executed and the values p_i(t), x_ij(i,t), j ∈ F_i, and x_ji(i,t), j ∈ B_i, are updated.

2. Output Phase. The values of p_i(t), x_ij(t), x_ji(t), computed during the computational phase, are communicated to the adjacent processors P_j, j ∈ F_i ∪ B_i.

3. Input Phase. P_i receives the price p_j(t') and the arc flow x_ij(j,t') or x_ji(j,t'), computed by processor P_j, j ∈ F_i ∪ B_i, at some earlier time t' < t. On the basis of this information, P_i updates p_j(i,t) and x_ij(i,t) if j ∈ F_i (x_ji(i,t), if j ∈ B_i). If p_j(t') ≥ p_j(i,t), then p_j(i,t) = p_j(t'). In addition, if j ∈ F_i, the value of x_ij(i,t) is replaced by x_ij(j,t') if

    p_i(t) < p_j(t') + f⁺_ij(x_ij(j,t')) + ε   and   x_ij(j,t') < x_ij(i,t).

In the case of j ∈ B_i, the value of x_ji(i,t) is replaced by x_ji(j,t') if

    p_j(t') ≥ p_i(t) + f⁻_ji(x_ji(j,t')) − ε   and   x_ji(j,t') > x_ji(i,t).

Let T_i be the set of times at which the computational phase is executed by processor P_i, and let T_i(j) be the set of times at which P_i receives new data from the adjacent processor P_j, j ∈ F_i ∪ B_i. We make the following assumptions.

Assumption 3.1. T_i and T_i(j) have an infinite number of elements for all processors P_i and P_j, j ∈ F_i ∪ B_i.

Assumption 3.2. Old information is eventually purged from the system; that is, given any time t_k, there exists t_m ≥ t_k such that the
computing time of the price and flow information obtained at any node after t_m (i.e., the time t' in the input phase) exceeds t_k.

Assumption 3.3. For each processor P_i, the initial arc flows x_ij(i,t_0), j ∈ F_i, and x_ji(i,t_0), j ∈ B_i, satisfy ε-CS together with p_i(t_0) and p_j(i,t_0), j ∈ F_i ∪ B_i. Furthermore, p_i(t_0) ≥ p_i(j,t_0), ∀j ∈ F_i ∪ B_i, and x_ij(i,t_0) ≥ x_ij(j,t_0), ∀j ∈ F_i.

The sketch of the typical iteration of the parallel method, the updating rules and the initial conditions imply the following properties.

1. The price sequence is monotonically nondecreasing in t, and

    p_i(t) ≥ p_i(j,t'),  ∀j ∈ F_i ∪ B_i,  t' ≤ t.                     (8)

2. ε-CS are locally satisfied at each node i:

    f⁻_ij(x_ij(i,t)) − ε ≤ p_i(t) − p_j(i,t) ≤ f⁺_ij(x_ij(i,t)) + ε,  ∀(i,j), j ∈ F_i,   (9)
    f⁻_ji(x_ji(i,t)) − ε ≤ p_j(i,t) − p_i(t) ≤ f⁺_ji(x_ji(i,t)) + ε,  ∀(j,i), j ∈ B_i.

3. Processor P_i stores an estimate of the arc flow x_ij(i,t) which is greater than or equal to the value stored at processor P_j, that is:

    x_ij(i,t) ≥ x_ij(j,t),  ∀j ∈ F_i,  ∀t ≥ t_0.                      (10)

4. There exists a node which is never processed. This follows from the fact that the surplus of a node, once nonnegative, remains nonnegative, and from (10) we obtain:

    Σ_{i∈N} g_i(t) ≤ 0,  ∀t ≥ t_0.                                    (11)

This implies that at any time t, there is at least one node i with negative surplus if there is a node with positive surplus. At this node i, processor P_i must not have executed any iteration up to t, and, therefore, the price p_i(t) must be equal to the initial price p_i(t_0).

We say that the parallel asynchronous version of the ε-relaxation method terminates if there is a time t_k such that, for all t ≥ t_k, we have:

    g_i(t) = 0,  ∀i ∈ N,                                              (12)
    x_ij(i,t) = x_ij(j,t),  ∀(i,j) ∈ A,                               (13)
    p_j(t) = p_j(i,t),  ∀j ∈ F_i ∪ B_i.                               (14)

Now we are ready to show the correctness of the parallel algorithm. Our proof of convergence follows the same approach proposed for the linear case by Bertsekas and Tsitsiklis [11], taking into account, however, the different type of cost function (convex instead of linear).

Proposition 3.1. If the problem is feasible and Assumptions 3.1-3.3 hold, the algorithm terminates.

Proof: Suppose no iterations are executed at any node after some time t*. Then Eq. (12) must hold for large enough t. Because no iterations occur after t*, Assumption 3.1, Eq. (8), and the updating rules defined in the input phase imply Eq. (14). Furthermore, after t*, no flow estimates can change unless new data are available. Note that the updating rules, Eq. (10), and Assumptions 3.1 and 3.2 imply the consistency of the arc flow values as in Eq. (13). Assume now that iterations are executed indefinitely. In this case, for every t, there is a time t' > t and a node i such that g_i(t') > 0. But this is impossible, since the number of price increases and the number of δ-flow pushes performed by the parallel asynchronous algorithm are bounded. This fact can be demonstrated by following the same approach used in [8] to show the correctness of the sequential method. □

4. Parallel implementations on a shared memory multiprocessor

In this section we describe different parallel asynchronous implementations of the ε-relaxation method designed for a shared memory multiprocessor. A key issue of the proposed parallel method is the partitioning and allocation of the workload among the available processors, in such a way as to guarantee a good load balancing. In our case, the computational workload depends on the number of active nodes, since the method terminates when the surplus of all nodes is reduced to zero. In principle, it is possible to consider two different allocation strategies: static and dynamic. In the first case, the set of nodes is partitioned into N_P blocks of equal
size, each containing the same number of active nodes, using a procedure executed only once, at the beginning of the algorithm. Each processor extracts nodes only from its private subset; consequently, there is a reduction of the synchronization overhead due to access to shared data, generally performed through a specific mechanism such as a lock. On the other hand, the main drawback of this strategy is the impossibility of guaranteeing a priori, during the execution of the algorithm, a good load balancing among the available processors. Indeed, following flow push operations other nodes become active, and there is no way to ensure that their number remains roughly the same in each block.

This limitation can be overcome by considering a dynamic node allocation strategy. The active nodes are stored in a FIFO queue L (i.e., nodes are extracted from the top of L and inserted at the bottom), shared among all processors. The main drawback of this strategy is due to the synchronization overhead: a lock is used in order to guarantee that a node cannot be simultaneously selected by more than one processor. Empirical computational studies have revealed that, in the case of the ε-relaxation method, the dynamic allocation outperforms the static one [15]. For this reason, all the implementations presented in the sequel are based on dynamic allocation strategies.

A first parallel implementation of the ε-relaxation method can be easily derived from the parallel scheme introduced in the previous section. Each processor stores into its local memory a private flow-price vector pair on the basis of which it executes the typical iteration of the method. Processors exchange information through the shared memory, in which the global flow-price pair, the queue L of active nodes and a boolean variable flag are stored. At the beginning, each processor reads from the shared memory an initial flow-price vector pair computed in such a way that ε-CS are satisfied. The computation starts with the extraction of a node i from the queue L and proceeds with the execution of the basic operations on the extracted node (computational phase). Once the surplus of node i is reduced to zero, the processor writes the new flow and price values into the shared memory (output phase) and warns the adjacent processors of their availability, by using the flag. Then, each processor gets the new data and, eventually, updates the local flow-price vector pair according to the rules defined in the input phase.
The main drawback of the proposed parallel algorithm, when implemented on a shared memory multiprocessor, is that the shared memory is not exploited in the most efficient way: data are copied from the main memory into the local ones, and vice versa. More efficient implementations can be defined by using the main memory in a more appropriate way, avoiding the use of local copies and maintaining only a global flow-price vector pair, shared among all processors. Following this approach, we have considered two different parallel implementations of the asynchronous ε-relaxation method, which differ in the way of organizing the queue of active nodes. More specifically, our implementations are as follows.

- Single Queue Implementation. The active nodes are stored in a single queue L shared among all processors. Each processor P_i selects an active node i from L, computes the surplus of the node, stores its value in a local variable σ and executes the basic operations until σ is reduced to zero. This implementation allows the eventual simultaneous selection of adjacent nodes by two processors. This means that the flow values along the arcs between the two nodes could be updated in a non-predetermined order and, consequently, the value of σ could be inconsistent with the current flow distribution. For this reason, in order to guarantee the correct termination of the algorithm, when the queue L is found empty by all the processors, it is necessary to compute again the surplus of all the nodes (check phase) and, eventually, restart the computation if the optimality conditions are not satisfied at some nodes.

- Multiple Queues Implementation. In this implementation, the active nodes are organized in multiple queues; that is, there is a separate queue for each processor. Each processor extracts nodes from its own queue and uses a heuristic procedure for choosing the queue into which to insert nodes that become active after a δ-flow push operation. The queue chosen is the one with the minimum current number of nodes already inserted. The heuristic is easy to implement and ensures a good load balancing among the processors. The multiple queues implementation guarantees much less contention for queue access than the case of a single queue (we reduce the probability that several processors attempt to simultaneously insert a node into the same queue). In this case, we note that the termination condition is detected in a slightly different way from the single queue implementation. When a processor finds its queue empty, it switches to an idle state and, eventually, reawakens when a node is added to its queue.
When the idle condition is reached by all processors (this situation is detected by using specific procedures; see [15] for more details), a check phase is performed, in order to verify that the optimality conditions are satisfied by all the nodes.

We observe that both the proposed parallel implementations find the optimal solution in a finite number of iterations (the updating rules introduced in the algorithm guarantee that the ε-CS are satisfied at each iteration).

It is worth observing that the procedure used in our parallel algorithms for the current updating of the price and flow vectors resembles the computational scheme used in the parallel relaxation algorithms of Chajakis and Zenios [14]. However, our implementations are substantially different from the relaxation method, and not only because Chajakis and Zenios examined the case of strictly convex cost functions (quadratic in the numerical experiments). We cite, for example, that Chajakis and Zenios adopted a static node allocation procedure to split the workload among the processors, whereas we use a dynamic allocation, and, consequently, every asynchronous updating operation has been re-designed (in some sense, our method is much more "chaotic").

5. Computational experiments

It is well known that the theoretical and the practical performance of the ε-relaxation method can be improved by using the ε-scaling technique, which was first introduced for the linear minimum cost flow problem in [16] and [17]. The key idea of the ε-scaling technique is to apply the ε-relaxation method several times, starting with a large value of ε and reducing it to a final value corresponding to the desired degree of solution accuracy. Each application of the algorithm, called a scaling phase, provides good initial prices and flows for the next phase. In our implementations, the sequence {ε(k)} is defined by

    ε(k) = θ ε(k−1),  k = 1, 2, ...

where ε(0) (starting value) and θ ∈ (0,1) (scaling factor) are chosen by the user.

For our testing, we have considered convex linear/quadratic problems with cost function defined as follows:

    f_ij(x_ij) = a_ij x_ij + b_ij x_ij²   if 0 ≤ x_ij ≤ u_ij,
    f_ij(x_ij) = ∞                        otherwise,

and we have chosen θ = 0.5 and

    ε(0) = max_{(i,j)∈A} (a_ij + 2 b_ij u_ij).

We have considered scaled versions of the algorithms. All the issues introduced for the unscaled versions can also be used for the corresponding scaled counterparts, without loss of efficiency. We choose to terminate the algorithms at the scaling phase k̄ such that ε(k̄) ≤ 10⁻¹⁰.
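The outer scaling loop is straightforward to write down. The following Python sketch is our illustration of the schedule just described, using the paper's choices θ = 0.5 and ε(0) = max over arcs of a_ij + 2 b_ij u_ij; the function epsilon_relaxation, which would run one scaling phase and return an updated flow-price pair satisfying ε-CS, is a hypothetical placeholder and is not shown:

    def eps_scaling(arcs, supply, theta=0.5, eps_final=1e-10):
        """Outer epsilon-scaling loop: eps(k) = theta * eps(k-1).

        arcs: dict (i, j) -> (a_ij, b_ij, u_ij) for costs a*x + b*x^2 on [0, u].
        epsilon_relaxation(...) is an assumed single-phase solver (not shown).
        """
        eps = max(a + 2 * b * u for (a, b, u) in arcs.values())   # eps(0)
        flow = {arc: 0.0 for arc in arcs}
        price = {i: 0.0 for arc in arcs for i in arc}
        while eps > eps_final:
            flow, price = epsilon_relaxation(arcs, supply, flow, price, eps)
            eps *= theta                                          # next phase
        return flow, price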
The computational experiments have been carried out on two different sets of test problems, for which the percentage of arcs with quadratic costs is equal to fifty percent of the total number of arcs, and the remaining arcs have linear cost. All the problems have been generated by using the public domain NETGEN generator [18].

The first set (referred to as medium scale test problems) consists of twelve test problems, belonging to the suite designed by Klingman and Mote [18]. The corresponding classification number and the main characteristics are reported in Table 1. The second set (referred to as large scale test problems) consists of eight larger problems, having 5,000 and 10,000 nodes, with different numbers of arcs (Table 2). For all test problems the arc cost is chosen randomly, according to a uniform distribution, within the range [1,100], and the arc capacity in the range [1,1000].

The parallel algorithms have been implemented and tested by using an Origin 2000, a multiprocessor consisting of 4 nodes, each with a memory of 128 MB. Each node consists of two R10000 processors at 195 MHz, with a 4 MB cache memory and a hub device, which carries out duties similar to a bus in a bus-based system. The nodes are connected by two routers.

Table 1. Medium scale test problems.

    Problem   Nodes   Arcs     Sources   Sinks   Tsurplus
    101       5,000   25,000   2,500     2,500     250,000
    102       5,000   25,000   2,500     2,500   2,500,000
    103       5,000   25,000   2,500     2,500   6,250,000
    107       5,000   37,500   2,500     2,500     375,000
    108       5,000   50,000   2,500     2,500     500,000
    109       5,000   75,000   2,500     2,500     750,000
    111       5,000   35,500   2,500     2,500     250,000
    112       5,000   50,000   2,500     2,500     250,000
    113       5,000   75,000   2,500     2,500     250,000
    123       5,000   25,000     500       500     250,000
    124       5,000   25,000   1,000     1,000     250,000
    125       5,000   25,000   1,500     1,500     250,000

Table 2. Large scale test problems.

    Problem   Nodes    Arcs      Sources   Sinks   Tsurplus
    L1         5,000   200,000   2,500     2,500     500,000
    L2        10,000   200,000   5,000     5,000   1,000,000
    L3         5,000   400,000   2,500     2,500     500,000
    L4        10,000   400,000   5,000     5,000   1,000,000
    L5         5,000   600,000   2,500     2,500     500,000
    L6        10,000   600,000   5,000     5,000   1,000,000
    L7         5,000   800,000   2,500     2,500     500,000
    L8        10,000   800,000   5,000     5,000   1,000,000

The main feature of this system is that the hardware allows the physically distributed memory of the system to be shared, just as in a bus-based system; but since each hub is connected to its local memory, the bandwidth is proportional to the number of nodes, and so there is no inherent limit to the number of processors that can be effectively used in the system. On the other hand, the main drawback of this parallel system is related to the memory access time. Indeed, it is no longer uniform: it varies depending on how far away the memory being accessed is in the system. So, while the two processors in each node have quick access to their local memory through their hub, accessing remote memories through additional hubs adds an extra time overhead. The operating system used is IRIX 6.4, whereas the compiler is f77.

The performance of the parallel implementations has been evaluated by measuring the average execution times, obtained over 5 runs, as a function of the number of processors. Tables 3, 4 and Tables 5, 6 report the results for the single queue (SQ for short) and the multiple queues (MQ for short) implementations, respectively.

Table 3. Average execution time (in secs) required by the SQ implementation for the medium scale test problems.

    Problem   Seq      2-Proc   4-Proc   8-Proc
    101       333.69   230.13   128.34    66.47
    102       456.97   302.63   159.78    85.26
    103       525.69   318.60   174.07    95.75
    107       411.41   304.75   158.23    90.22
    108       496.10   359.49   187.92   107.15
    109       509.36   363.83   191.49   106.34
    111       354.29   266.38   157.46    80.16
    112       367.10   269.93   152.96    82.13
    113       431.45   303.20   167.24    91.22
    123       249.16   190.20   103.39    57.81
    124       291.54   217.57   115.69    62.43
    125       301.72   226.86   115.60    63.79

Table 4. Average execution time (in secs) required by the MQ implementation for the medium scale test problems.

    Problem   Seq      2-Proc   4-Proc   8-Proc
    101       333.69   222.46   119.18    65.17
    102       456.97   295.61   151.82    82.78
    103       525.69   311.68   153.71    89.25
    107       411.41   293.86   152.37    85.53
    108       496.10   346.92   181.72   100.83
    109       509.36   351.28   184.55   102.28
    111       354.29   256.73   150.76    76.69
    112       367.10   262.21   146.84    78.27
    113       431.45   294.72   161.48    87.62
    123       249.16   184.56    99.27    55.12
    124       291.54   211.26   110.85    61.38
    125       301.72   209.53   110.93    62.60

Table 5. Average execution time (in secs) required by the SQ implementation for the large scale test problems.

    Problem   Seq       2-Proc    4-Proc   8-Proc
    L1         585.67    385.31   196.53   117.60
    L2        1743.24   1131.97   579.15   347.26
    L3         793.93    508.93   254.46   155.67
    L4        1862.67   1164.17   585.75   349.47
    L5         998.67    608.94   310.15   183.91
    L6        2314.22   1377.51   712.07   408.15
    L7        1374.26    808.39   411.46   233.32
    L8        3299.17   1896.07   970.34   548.95

Table 6. Average execution time (in secs) required by the MQ implementation for the large scale test problems.

    Problem   Seq       2-Proc    4-Proc   8-Proc
    L1         585.67    370.68   187.71    99.77
    L2        1743.24   1089.52   544.76   290.06
    L3         793.93    484.10   244.29   129.52
    L4        1862.67   1089.28   564.44   299.95
    L5         998.67    567.43   294.59   157.77
    L6        2314.22   1285.96   670.79   343.36
    L7        1374.26    750.96   392.64   199.75
    L8        3299.17   1745.59   906.36   447.04

In order to evaluate the performance of the proposed parallel version of the ε-relaxation method, we have measured the speedup, computed as the average sequential execution time over the average multiple-processor execution time (see Figures 1-4). We observe that the speedup values are not proportional to the number of processors used. More specifically, the average speedup values are 1.49, 2.82 and 5.02 on 2, 4 and 8 processors, respectively, for the SQ implementation, and 1.56, 2.98 and 5.55 on 2, 4 and 8 processors for the MQ implementation. This numerical behaviour can be explained by several factors: (a) during the last scaling phases, the number of nodes with surplus greater than the user-defined threshold decreases; thus, some processors can remain in an idle state and, consequently, we have a loss of efficiency; (b) the non-uniform access to memories: this affects the performance of the method, especially when the number of processors is increased; (c) the synchronization overhead due to access with locking of the common data structure: it penalizes, in particular, the SQ implementation, as confirmed by comparing the results obtained with the MQ implementation (see Tables 4 and 6). In the latter case, each processor extracts nodes from its private queue and eventually inserts nodes into another queue, chosen by using a heuristic procedure; in the SQ implementation the locking data access is used by all processors for the same queue.

Other interesting considerations can be drawn by observing that the performance of the parallel implementations strongly depends on the characteristics of the test problems. In particular, we note that better speedup is achieved for the test problems with higher values of total surplus (see problems 101, 102 and 103 in Figures 1 and 2, and problems L7 and L8 in Figures 3 and 4). This behavior can be explained by observing that the higher the

Figure 1. Speedup values of the SQ implementation for the medium scale problems.
Example 5
6 Simplify each of the following by expanding and collecting like terms:
a (x − 4)²    b (2x − √3)²    d (x − 1/2)²    e (x − 5)²
Example 1
Simplify 2(x − 5) − 3(x + 5) by first expanding.
Solution
2(x − 5) − 3(x + 5) = 2x − 10 − 3x − 15    (expand each bracket)
                    = 2x − 3x − 10 − 15    (collect like terms)
                    = −x − 25
3 Simplify each of the following by expanding and collecting like terms:
a 8(2x − 3) − 2(x + 4)    b 2x(x − 4) − 3x    c 4(2 − 3x) + 4(6 − x)    d 4 − 3(5 − 2x)
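For readers checking their work with a CAS (the text uses the TI-Nspire; the minimal sketch below, our own illustration, uses Python's sympy instead), the expansions above can be verified mechanically:

# Verify the expand-and-collect answers with a CAS (sympy).
from sympy import symbols, expand

x = symbols('x')

# Example 1: 2(x - 5) - 3(x + 5) should simplify to -x - 25.
assert expand(2*(x - 5) - 3*(x + 5)) == -x - 25

# Question 3, parts a and d:
print(expand(8*(2*x - 3) - 2*(x + 4)))   # -> 14*x - 32
print(expand(4 - 3*(5 - 2*x)))           # -> 6*x - 11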
Using the TI-Nspire
REPORTS IN INFORMATICS, ISSN 0333-3590
Exact (exponential) algorithms for the dominating set problem
Fedor V. Fomin, Dieter Kratsch, and Gerhard J. Woeginger
REPORT NO 270, May 2004
Department of Informatics, University of Bergen, Bergen, Norway

This report has URL http://www.ii.uib.no/publikasjoner/texrap/ps/2004-270.ps. Reports in Informatics from the Department of Informatics, University of Bergen, Norway, are available at http://www.ii.uib.no/publikasjoner/texrap/. Requests for paper copies of this report can be sent to: Department of Informatics, University of Bergen, Høyteknologisenteret, P.O. Box 7800, N-5020 Bergen, Norway.

Abstract
We design fast exact algorithms for the problem of computing a minimum dominating set in undirected graphs. Since this problem is NP-hard, it comes as no big surprise that all our time complexities are exponential in the number n of vertices. The contributions of this paper are 'nice' exponential time complexities that are bounded by functions of the form c^n with reasonably small constants c < 2: for arbitrary graphs we get a time complexity of 1.93782^n, and for the special cases of split graphs, bipartite graphs, and graphs of maximum degree three, we reach time complexities of 1.41422^n, 1.73206^n, and 1.64515^n, respectively.

1 Introduction
Nowadays, it is a common belief that NP-hard problems cannot be solved in polynomial time. For a number of NP-hard problems, we even have strong evidence that they cannot be solved in sub-exponential time. For these problems the only remaining hope is to design exact algorithms with good exponential running times. How good can these exponential running times be? Can we reach 2^(n/2) for instances of size n? Can we reach 10^√n? Or even 2^√n? Or can we reach c^n for some constant c that is very close to 1? The last years have seen an emerging interest in attacking these questions for concrete combinatorial problems: there is an O*(1.2108^n) time algorithm for independent set (Robson [13]); an O*(2.4150^n) time algorithm for graph coloring (Eppstein [4]); an O*(1.4802^n) time algorithm for 3-Satisfiability (Dantsin et al. [2]). We refer to the survey paper [14] by Woeginger for an up-to-date overview of this field. In this paper, we study the dominating set problem from this exact (exponential) algorithms point of view.

(Author footnotes: ∗ fomin@ii.uib.no, Department of Informatics, University of Bergen, N-5020 Bergen, Norway; F. Fomin is supported by Norges forskningsråd project 160778/V30. † kratsch@sciences.univ-metz.fr, LITA, Université de Metz, 57045 Metz Cedex 01, France. ‡ g.j.woeginger@tue.nl, Department of Mathematics and Computer Science, TU Eindhoven, P.O. Box 513, 5600 MB Eindhoven, and Faculty of Mathematical Sciences, University of Twente, 7500 AE Enschede, The Netherlands.)

Basic definitions. Let G = (V, E) be an undirected, simple graph without loops. We denote by n the number of vertices of G. The open neighborhood of a vertex v is denoted by N(v) = {u ∈ V : {u, v} ∈ E}, and the closed neighborhood of v is denoted by N[v] = N(v) ∪ {v}. The degree of a vertex v is |N(v)|. For a vertex set S ⊆ V, we define N[S] = ∪_{v∈S} N[v] and N(S) = N[S] − S. The subgraph of G induced by S is denoted by G[S]. We will write G − S short for G[V − S]. A set S ⊆ V of vertices is a clique, if any two of its elements are adjacent; S is independent, if no two of its elements are adjacent; S is a vertex cover, if V − S is an independent set.
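To fix these conventions in code, here is a minimal sketch (our own illustration, not part of the report; all names are ours) of the neighborhood operators over a dict-of-sets graph representation:

# Basic definitions over an adjacency-set representation:
# G is a dict mapping each vertex to the set of its neighbors (simple, undirected).

def open_nbh(G, v):
    """N(v): all vertices adjacent to v."""
    return set(G[v])

def closed_nbh(G, v):
    """N[v] = N(v) ∪ {v}."""
    return set(G[v]) | {v}

def closed_nbh_set(G, S):
    """N[S]: the union of N[v] over all v in S."""
    return set().union(*(closed_nbh(G, v) for v in S)) if S else set()

def open_nbh_set(G, S):
    """N(S) = N[S] − S."""
    return closed_nbh_set(G, S) - set(S)

# Example: the path a-b-c.
G = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}
assert closed_nbh(G, 'b') == {'a', 'b', 'c'}
assert open_nbh_set(G, {'a'}) == {'b'}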
Throughout this paper we use the so-called big-Oh-star notation, a modification of the big-Oh notation that suppresses polynomially bounded terms: we write f = O*(g) for two functions f and g, if f(n) = O(g(n) poly(n)) holds with some polynomial poly(n). We say that a problem is solvable in sub-exponential time in n, if there is an effectively computable monotone increasing function g(n) with lim_{n→∞} g(n) = ∞ such that the problem is solvable in time O(2^(n/g(n))).

The dominating set problem. Let G = (V, E) be a graph. A set D ⊆ V with N[D] = V is called a dominating set for G; in other words, every vertex in G must either be contained in D or adjacent to some vertex in D. A set A ⊆ V dominates a set B ⊆ V if B ⊆ N[A]. The domination number γ(G) of a graph G is the cardinality of a smallest dominating set of G. The dominating set problem asks to determine γ(G) and to find a dominating set of minimum cardinality. The dominating set problem is one of the fundamental and well-studied classical NP-hard graph problems (Garey & Johnson [6]). For a large and comprehensive survey on domination theory, we refer the reader to the books [8, 9] by Haynes, Hedetniemi & Slater. The dominating set problem is also one of the basic problems in parameterized complexity (Downey & Fellows [3]); it is contained in the parameterized complexity class W[2]. Further recent investigations of the dominating set problem can be found in Albers et al. [1] and in Fomin & Thilikos [5].

Results and organization of this paper. What are the best time complexities for dominating set in n-vertex graphs that we can possibly hope for? Well, of course there is the trivial O*(2^n) algorithm that simply searches through all the 2^n subsets of V. But can we hope for a sub-exponential time algorithm, maybe with a time complexity of O*(2^√n)? Section 2 provides the answer to this question: no, probably not, unless some very unexpected things happen in computational complexity theory... Hence, we should only hope for time complexities of the form O*(c^n), with some small value c < 2. And indeed, Section 3 presents such an algorithm with a time complexity of O*(1.93782^n). This algorithm combines a recursive approach with a deep result from extremal graph theory. The deep result is due to Reed [12], and it provides an upper bound on the domination number of graphs of minimum degree three. Furthermore, we study exact exponential algorithms for dominating set on some special graph classes: in Section 4, we design an O*(1.41422^n) time algorithm for split graphs, and an O*(1.73206^n) time algorithm for bipartite graphs. In Section 5, we derive an O*(1.64515^n) time algorithm for graphs of maximum degree three. Note that for these three graph classes, the dominating set problem remains NP-hard (Garey & Johnson [6], Haynes, Hedetniemi & Slater [9]).

2 A negative observation
We will show that the existence of a sub-exponential time algorithm for the dominating set problem would be highly unlikely. Our (straightforward) argument exploits the structural similarities between the dominating set problem and the vertex cover problem: "Given a graph, find a vertex cover of minimum cardinality".

Proposition 2.1. Let G = (V, E) be a graph. Let G+ be the graph that results from G by adding for every edge e = {u, v} ∈ E a new vertex x(e) together with the two new edges {x(e), u} and {x(e), v}. Then the graph G has a vertex cover of size at most k, if and only if the graph G+ has a dominating set of size at most k.

Proposition 2.2. (Johnson & Szegedy [11]) If the vertex cover problem on graphs of maximum degree three can be solved in sub-exponential time, then also the vertex cover problem on arbitrary graphs can be solved in sub-exponential time.
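The construction of G+ in Proposition 2.1 is easy to write down explicitly; the following minimal sketch (our own illustration, with the graph representation used above) builds it:

# The G+ construction of Proposition 2.1: for every edge {u, v} of G,
# add a fresh vertex x(e) adjacent to exactly u and v (the edge {u, v} stays).

def g_plus(G):
    """G: dict vertex -> set of neighbors; returns G+ as a new dict."""
    H = {v: set(nbrs) for v, nbrs in G.items()}
    for u in G:
        for v in G[u]:
            if u < v:                 # handle each undirected edge once
                x = ('x', u, v)       # the new vertex x(e) for e = {u, v}
                H[x] = {u, v}
                H[u].add(x)
                H[v].add(x)
    return H

# A vertex cover of size k in G corresponds to a dominating set of size k in G+.
G = {1: {2}, 2: {1, 3}, 3: {2}}   # path 1-2-3; {2} is a minimum vertex cover
H = g_plus(G)                      # H gains the vertices ('x', 1, 2) and ('x', 2, 3)

For instance, in H above the single vertex 2 dominates 1, 3 and both new vertices, mirroring the fact that {2} covers every edge of G.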
Proposition 2.3. (Impagliazzo, Paturi & Zane [10]) If the vertex cover problem (on arbitrary graphs) can be solved in sub-exponential time, then the complexity classes SNP and SUBEXP satisfy SNP ⊆ SUBEXP (and this is considered a highly unlikely event in computational complexity).

Now suppose that the dominating set problem can be solved in sub-exponential time. Take an instance G = (V, E) of the vertex cover problem with maximum degree at most three, and construct the corresponding graph G+. Note that G+ has at most |V| + |E| ≤ 5|V|/2 vertices; hence, its size is linear in the size of G. Solve the dominating set problem for G+ in sub-exponential time. Proposition 2.1 then yields a sub-exponential time algorithm for vertex cover in graphs with maximum degree at most three. Propositions 2.2 and 2.3 yield that SNP ⊆ SUBEXP.

3 An exact algorithm for arbitrary graphs
In this section we present the main result of our paper. It is the first exact algorithm for the dominating set problem breaking the natural Ω(2^n) barrier for the running time: we present an O*(1.93782^n) time algorithm to compute a minimum dominating set on any graph. Our algorithm heavily relies on the following result of Reed to restrict the search space.

Proposition 3.1. (Reed [12]) Every graph on n vertices with minimum degree at least three has a dominating set of size at most 3n/8.

In fact, we will tackle the following generalization of the dominating set problem: an input for this generalization consists of a graph G = (V, E) and a subset X ⊆ V. We say that a set D ⊆ V dominates X, if X ⊆ N[D]. The goal is to find a dominating set D for X of minimum cardinality. (Obviously, setting X := V yields the classical dominating set problem.) We will derive an exact O*(1.93782^n) time algorithm for this generalization.

The algorithm is based on the so-called pruning the search tree technique. The idea is to branch into subcases and to remove all vertices of degree one and two, until we terminate with a graph in which every vertex has degree zero or at least three. Denote by V′ the set of all vertices of degree at least three in this final graph. Let t = |V′| and let G′ = G[V′]. Then Proposition 3.1 yields that there exists some vertex set in G′ with at most 3t/8 vertices that dominates all of G′; consequently, there also exists a dominating set for X′ = X ∩ V′ of size at most 3t/8 in G′. We simply test all possible subsets with up to 3t/8 vertices to find a minimum dominating set D′ for X′ in G′. By using Stirling's approximation x! ≈ x^x e^(−x) √(2πx) for factorials, and by suppressing some polynomial factors, we see that the number of tested subsets is at most

C(t, 3t/8) = t! / ((3t/8)! (5t/8)!) = O*(8^t · 3^(−3t/8) · 5^(−5t/8)) = O*(1.93782^t),

where 8/(3^(3/8) · 5^(5/8)) is approximately 1.9378192. This test can be done in time O*(Σ_{i=1}^{3t/8} C(t, i)) = O*(1.93782^t). Finally, we add all degree-zero vertices of X to the set D′ to obtain a minimum dominating set of G.
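The arithmetic behind this bound is easy to check numerically; the following minimal sketch (our own illustration) evaluates the base 8/(3^(3/8) · 5^(5/8)) and compares it with the exact count of subsets of size at most 3t/8 for a few values of t:

# Check the base of the subset-enumeration bound.
from math import comb

base = 8 / (3 ** (3 / 8) * 5 ** (5 / 8))
print(base)  # ~1.9378192

# The number of subsets of size at most 3t/8 grows like base**t up to
# polynomial factors; the per-vertex growth rate approaches the base from below.
for t in (40, 80, 160):
    count = sum(comb(t, i) for i in range(0, 3 * t // 8 + 1))
    print(t, count ** (1 / t))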
Now let us discuss the branching into subcases. While there is a vertex of degree one or two, we pick such a vertex, say v, and we recurse, distinguishing four cases depending on the degree of v and on whether v ∈ X or not.

Case A: The vertex v is of degree one and v ∈ V − X. In this case there is no need to dominate the vertex v, and there always exists a minimum dominating set for X that does not contain v. Then a minimum dominating set for X − {v} in G − {v} is also a minimum dominating set for X in G, and thus we recurse on G − {v} and X − {v}.

Case B: The vertex v is of degree one and v ∈ X. Let w be the unique neighbor of v. Then there always exists a minimum dominating set for X that contains w but does not contain v. If D′ is a minimum dominating set for X − N[w] in G − {v, w}, then D′ ∪ {w} is a minimum dominating set for X in G, and thus we recurse on G − {v, w} and X − N[w].

We need the following auxiliary result.

Lemma 3.2. Let v be a vertex of degree 2 in G, and let u1 and u2 be its two neighbors. Then for any subset X ⊆ V there is a minimum dominating set D for X such that one of the following holds: (i) u1 ∈ D and v ∉ D; (ii) v ∈ D and u1, u2 ∉ D; (iii) u1 ∉ D and v ∉ D.

Proof. If there exists a minimum dominating set D for X that contains u1, then there exists a minimum dominating set D′ for X that contains u1 but not v. In fact, if v ∈ D, then D′ = (D − {v}) ∪ {u2} is a dominating set for X and |D′| ≤ |D|. Similarly, if there exists a minimum dominating set for X that contains u2, then there exists a minimum dominating set for X that contains u2 but not v. Thus we are left with five possibilities of how v, u1, u2 might show up in a minimum dominating set D for X: (a) u1, u2, v ∉ D; (b) v ∈ D and u1, u2 ∉ D; (c) u1 ∈ D and v, u2 ∉ D; (d) u2 ∈ D and v, u1 ∉ D; (e) u1, u2 ∈ D and v ∉ D. Now (i) is equivalent to (c) or (e), (ii) is equivalent to (b), and (iii) is equivalent to (a) or (d). This concludes the proof.

Now consider a vertex v of degree two. Depending on whether v ∈ X or not, we branch in different ways. Additionally, the search is restricted to those minimum dominating sets D satisfying the conditions of Lemma 3.2.

Case C: The vertex v is of degree 2 and v ∈ V − X. Let u1 and u2 be the two neighbors of v in G. By Lemma 3.2, we can branch into three subcases for a minimum dominating set D:
(C.1): u1 ∈ D and v ∉ D. In this case, if D′ is a minimum dominating set for X − N[u1] in G − {u1, v}, then D′ ∪ {u1} is a minimum dominating set for X in G, and thus we recurse on G − {u1, v} and X − N[u1].
(C.2): v ∈ D and u1, u2 ∉ D. In this case, if D′ is a minimum dominating set for X − {u1, u2} in G − {u1, v, u2}, then D′ ∪ {v} is a minimum dominating set for X in G, and thus we recurse on G − {u1, v, u2} and X − {u1, u2}.
(C.3): u1 ∉ D and v ∉ D. In this case a minimum dominating set for X in G − {v} is also a minimum dominating set for X in G, and thus we recurse on G − {v} and X.

Case D: The vertex v is of degree 2 and v ∈ X. Let u1 and u2 denote the two neighbors of v in G. Again according to Lemma 3.2, we branch into three subcases for a minimum dominating set D:
(D.1): u1 ∈ D and v ∉ D. In this case, if D′ is a minimum dominating set for X − N[u1] in G − {u1, v}, then D′ ∪ {u1} is a minimum dominating set for X in G. Thus we recurse on G − {u1, v} and X − N[u1].
(D.2): v ∈ D and u1, u2 ∉ D. In this case, if D′ is a minimum dominating set for X − {u1, v, u2} in G − {u1, v, u2}, then D′ ∪ {v} is a minimum dominating set for X in G. Thus we recurse on G − {u1, v, u2} and X − {u1, v, u2}.
(D.3): u1 ∉ D and v ∉ D. Then v ∈ X implies u2 ∈ D. Now we use that if D′ is a minimum dominating set for X − N[u2] in G − {v, u2}, then D′ ∪ {u2} is a minimum dominating set for X in G. Thus we recurse on G − {v, u2} and X − N[u2].
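Before analysing the running time, here is a hedged sketch of the branching layer just described. It is our own rendering, not the authors' implementation: it recurses on (G, X) exactly as in cases A–D, and its base case is a plain brute force standing in for the 3t/8 subset search justified by Reed's bound.

# Degree-one/degree-two branching for "find a minimum set dominating X in G".
# Graphs are dicts vertex -> set of neighbors; returns the optimal size.
from itertools import combinations

def induced(G, S):
    """The graph G - S (vertices in S removed)."""
    return {u: G[u] - S for u in G if u not in S}

def closed(G, v):
    return G[v] | {v}

def mds_for(G, X):
    X = X & set(G)
    if not X:
        return 0
    v = next((u for u in G if 1 <= len(G[u]) <= 2), None)
    if v is None:  # every degree is 0 or >= 3: brute-force base case
        for k in range(1, len(G) + 1):
            for D in combinations(G, k):
                covered = set()
                for u in D:
                    covered |= closed(G, u)
                if X <= covered:
                    return k
    if len(G[v]) == 1:
        (w,) = G[v]
        if v not in X:                                            # Case A
            return mds_for(induced(G, {v}), X - {v})
        return 1 + mds_for(induced(G, {v, w}), X - closed(G, w))  # Case B
    u1, u2 = G[v]                                                 # v has degree two
    best = 1 + mds_for(induced(G, {u1, v}), X - closed(G, u1))    # (C.1)/(D.1)
    if v not in X:                                                # Case C
        best = min(best, 1 + mds_for(induced(G, {u1, v, u2}), X - {u1, u2}))      # (C.2)
        best = min(best, mds_for(induced(G, {v}), X))                             # (C.3)
    else:                                                         # Case D
        best = min(best, 1 + mds_for(induced(G, {u1, v, u2}), X - {u1, v, u2}))   # (D.2)
        best = min(best, 1 + mds_for(induced(G, {v, u2}), X - closed(G, u2)))     # (D.3)
    return best

# Example: the 5-cycle has domination number 2.
C5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(mds_for(C5, set(C5)))  # -> 2

In the paper's algorithm the base case only needs subsets of size at most 3t/8 of the degree-at-least-three vertices, which is where the O*(1.93782^n) bound comes from.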
To analyse the running time of our algorithm, we denote by T(n) the worst-case number of recursive calls performed by the algorithm for a graph on n vertices. Each recursive call can easily be implemented in time polynomial in the size of the graph passed to the recursive call. In cases A and B we have T(n) ≤ T(n−1), in case C we have T(n) ≤ T(n−1) + T(n−2) + T(n−3), and in case D we have T(n) ≤ 2·T(n−2) + T(n−3). Standard calculations yield that the worst behavior of T(n) is within a constant factor of α^n, where α is the largest root of α³ = α² + α + 1, which is approximately 1.8393. Thus T(n) = O*(1.8393^n). Therefore, the most time-consuming part of the algorithm is the procedure of checking all subsets of size at most 3t/8, where t ≤ n. As already discussed, this can be performed in O*(1.93782^n) steps by a brute-force algorithm. Summarizing, we have proved the following theorem.

Theorem 3.3. A minimum dominating set of a graph on n vertices can be computed in O*(1.93782^n) time. (The base of the exponential function in the running time is 8/(3^(3/8) · 5^(5/8)) ≈ 1.9378192.)

4 Split graphs and bipartite graphs
In this section we present an exponential algorithm for the minimum set cover problem obtained by dynamic programming. This algorithm will then be used as a subroutine in exponential algorithms for the NP-hard minimum dominating set problems on split graphs and on bipartite graphs.

Let X be a ground set of cardinality m, and let T = {T1, T2, ..., Tk} be a collection of subsets of X. We say that a subset T′ ⊆ T covers a subset S ⊆ X if every element of S belongs to at least one member of T′. A minimum set cover of (X, T) is a subset T′ of T of minimum cardinality that covers the whole set X. The minimum set cover problem asks to find a minimum set cover for a given (X, T). Note that a minimum set cover of X can trivially be found in time O*(2^k) by checking all possible subsets of T.

Lemma 4.1. There is an O(mk·2^m) time algorithm to compute a minimum set cover for an instance (X, T) with |X| = m and |T| = k.

Proof. Let (X, T) with T = {T1, T2, ..., Tk} be an instance of the minimum set cover problem over a ground set X with |X| = m. We present an exponential algorithm solving the problem by dynamic programming. For every nonempty subset S ⊆ X and for every j = 1, 2, ..., k, we define F[S; j] as the minimum cardinality of a subset of {T1, ..., Tj} that covers S. If {T1, ..., Tj} does not cover S, then we set F[S; j] := ∞. Now all values F[S; j] can be computed as follows. In the first step, for every subset S ⊆ X, we set F[S; 1] = 1 if S ⊆ T1, and F[S; 1] = ∞ otherwise. Then in step j + 1, for j = 1, 2, ..., k − 1, the value F[S; j + 1] is computed for all S ⊆ X in O(m) time as follows (with the convention F[∅; j] = 0):

F[S; j + 1] = min{ F[S; j], F[S − T_{j+1}; j] + 1 }.

This yields an algorithm to compute F[S; j] for all S ⊆ X and all j = 1, 2, ..., k with overall running time O(mk·2^m). In the end, F[X; k] is the cardinality of a minimum set cover for (X, T).
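The dynamic program of Lemma 4.1 translates almost line for line into code once subsets of the ground set are encoded as bitmasks. The sketch below is one such rendering (our own; the layer-0 initialization F[∅] = 0 is equivalent to the paper's first step):

# Lemma 4.1 dynamic program.  Subsets S of the ground set X (|X| = m) are
# m-bit masks; after processing T_1..T_j, F[S] holds the minimum number of
# those sets needed to cover S (infinity if S cannot be covered).
INF = float('inf')

def min_set_cover(m, sets):
    """sets: list of bitmasks T_1..T_k over the ground set {0, ..., m-1}.
    Returns the cardinality of a minimum cover of X = {0, ..., m-1}."""
    full = (1 << m) - 1
    F = [INF] * (1 << m)
    F[0] = 0                      # layer j = 0: only the empty set is covered
    for T in sets:                # step j -> j + 1
        G = [INF] * (1 << m)
        for S in range(1 << m):
            # F[S; j+1] = min(F[S; j], F[S - T_{j+1}; j] + 1)
            G[S] = min(F[S], F[S & ~T] + 1)
        F = G
    return F[full]

# Example: X = {0,1,2,3} and T = {0,1}, {1,2}, {2,3}; a minimum cover has size 2.
print(min_set_cover(4, [0b0011, 0b0110, 0b1100]))  # -> 2

Each of the k steps touches all 2^m masks, matching the O(mk·2^m) bound of the lemma (bitmask operations replace the O(m) set arithmetic).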
Now we shall use Lemma 4.1 to establish an exact exponential algorithm for the NP-hard minimum dominating set problem on split graphs. Let us recall that a graph G = (V, E) is a split graph if its vertex set can be partitioned into a clique C and an independent set I.

Theorem 4.2. There is an O(n²·2^(n/2)) = O*(1.41422^n) time algorithm to compute a minimum dominating set for split graphs.

Proof. If G is a complete graph or an empty graph, then the dominating set problem on G is trivial. If G = (V, E) is not connected, then all of its components are isolated vertices except possibly one, say G′ = (V′, E). If D′ is a minimum dominating set of the connected split graph G′, then D′ ∪ (V − V′) is a minimum dominating set of G. Thus we may assume that the input graph G = (V, E) is a connected split graph with a partition of its vertex set into a clique C and an independent set I, where |I| ≥ 1 and |C| ≥ 1. Such a partition can be found in linear time (Golumbic [7]).

A connected split graph has a minimum dominating set D such that D ⊆ C: consider a minimum dominating set D′ of G with |D′ ∩ I| as small as possible; then a vertex x ∈ D′ ∩ I can be replaced by a neighbor y ∈ C. Indeed, N[x] ⊆ N[y] implies that D″ := (D′ − {x}) ∪ {y} is a dominating set, and either |D″| < |D′| (if y ∈ D′), or |D″| = |D′| and |D″ ∩ I| < |D′ ∩ I| — both contradicting the choice of D′.

Let C = {v1, v2, ..., vk}. For every j ∈ {1, 2, ..., k} we define Tj = N(vj) ∩ I. Clearly, D ⊆ C is a dominating set in G if and only if {Ti : vi ∈ D} covers I. Hence the minimum dominating set problem for G can be reduced to the minimum set cover problem for (I, T) with |I| = n − k and |T| = k. For k ≤ n/2 this problem can be solved by trying all possible subsets in time O(n·2^k) = O(n·2^(n/2)). For k > n/2, by Lemma 4.1, the problem can be solved in time O((n − k)·k·2^(n−k)) = O(n²·2^(n/2)). Thus a minimum dominating set of G can be computed in time O(n²·2^(n/2)).
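Assuming the split partition (C, I) is already known (the paper obtains it in linear time via Golumbic [7]), the reduction in this proof can be sketched as follows, reusing the min_set_cover routine from the Lemma 4.1 sketch above:

# Theorem 4.2 reduction for a connected split graph: some minimum dominating
# set lies inside the clique C, so dominating G reduces to covering I by the
# sets T_j = N(v_j) ∩ I for v_j in C.
# (Assumes min_set_cover from the Lemma 4.1 sketch above is in scope.)

def split_mds_size(G, C, I):
    """G: dict vertex -> set of neighbors; (C, I) a split partition of G.
    Returns the domination number of the connected split graph G."""
    I = sorted(I)
    index = {u: i for i, u in enumerate(I)}   # ground set is {0, ..., |I|-1}
    masks = []
    for v in C:
        T = 0
        for u in G[v] & set(I):
            T |= 1 << index[u]
        masks.append(T)
    return min_set_cover(len(I), masks)

# The star K_{1,3} is a split graph with C = {c} and I = {a, b, d}: answer 1.
G = {'c': {'a', 'b', 'd'}, 'a': {'c'}, 'b': {'c'}, 'd': {'c'}}
print(split_mds_size(G, {'c'}, {'a', 'b', 'd'}))  # -> 1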
graphs have large domination numbers.One possible explanation is that the algorithm has to spend a lot of time on checking that no vertex subset of size γ(G )−1is dominating (even in case a true minimum dominating set is detected at an early stage).Since graphs of maximum degree three have high domination numbers,the algorithms for general graphs do not behave well on these graphs.In this section,we design a better exact algorithm for graphs of maximum degree three,by using the pruning a search tree technique and a structural property of minimum dominating sets in graphs of maximum degree three provided in the following lemma.Lemma 5.1.Let G =(V,E )be a graph of maximum degree three.Then there is a minimum dominating set D of G with the following two properties:(i)every connected component of G [D ]is either an isolated vertex,or an isolatededge,and(ii)if two vertices x,y ∈D form an isolated edge in G [D ],then x and y havedegree three in G ,and N (x )∩N (y )=∅.Proof.Let D be a minimum dominating set of G with the maximum number of isolated vertices in G [D ].If G [D ]has a vertex x of degree three,then D −{x }is asmaller dominating set of G,which is a contradiction.Thus the maximum degree of G[D]is two.Assume G[D]has a vertex y of degree two.If the degree of y in G is two,then D−{y}is a smaller dominating set of G,a contradiction.Otherwise let z be the unique neighbor of y in G that is not in D.If z∈N[D−{y}]then D−{y}is a smaller dominating set of G,another contradiction.Finally,if z/∈N[D−{y}] then D1:=(D∪{z})−{y}is another minimum dominating set in G with a larger number of isolated vertices in G[D1]than in G[D].This contradiction concludes the proof of property(i).To prove property(ii),let usfirst show that any two adjacent vertices x,y∈D have degree three in G.For the sake of contradiction,assume that y has degree less than three in G.Clearly y cannot have degree one,otherwise D−{y}is a dominating set,a contradiction.Suppose y has degree two,and let z=x be the second neighbor of y.If z∈N[D−{y}]then D−{y}is a dominating set of smaller size than D,a contradiction.If z/∈N[D−{y}],then D2:=(D−{y})∪{z}is a minimum dominating set in G with a larger number of isolated vertices in G[D2] than in G[D],another contradiction.Finally,we prove that N(x)∩N(y)=∅in G.For the sake of contradiction, assume that N(x)∩N(y)=∅.If N[x]⊆N[y]then D−{x}is a dominating set,and if N[y]⊆N[x]then D−{y}is a dominating set.In both cases this contradicts our choice of D.Hence N(x)={y,w,u}with N(x)−N(y)={w} and N(x)∩N(y)={u}.If w∈N[D−{x}]then D−{x}is a dominating set, another contradiction.If w/∈N[D−{x}]then D3:=(D−{x})∪{w}is a minimum dominating set in G with a larger number of isolated vertices in G[D3]than in G[D], thefinal contradiction.Now we construct a search tree algorithm using the restriction of the search space guaranteed by Lemma5.1,i.e.for a graph G=(V,E)of maximum degree three only vertex sets D⊆V satisfying the properties of of Lemma5.1have to be inspected.Theorem5.2.There is a O∗(1.64515n)time algorithm to compute a minimum dominating set on graphs of maximum degree three.(The base of the exponential function in the running time is the largest real rootα≈1.64515ofα6=4α2+9.) 
Proof.The algorithm is based on the pruning a search tree technique.The idea is to branch into subcases until we obtain a graph of maximum degree two,and for such a graph a minimum dominating set can be computed in linear time since each of its connected components is either an induced path P k(k≥1)or an induced cycle C k(k≥3).In this way we obtain all minimum dominating sets satisfying the properties of Lemma5.1.More precisely,the input graph G=(V,E)and D=∅correspond to the root of the search tree.To each node of the search tree corresponds an induced subgraph G[V ]of G and a partial dominating set D⊆V−V of G already chosen to be part of the dominating set obtained in any branching from this node.For each node of the search tree the algorithm proceeds as follows:It chooses a vertex of degree three to be inspected(called x below)and branches in various subcases.Suppose(G[V ],D) corresponds to a node of the search tree and that G[V ]has maximum degree two. Then a linear time algorithm will be invoked tofind a minimum dominating set D of G[V ],and thus D∪D is a dominating set of G.Finally the algorithm chooses a smallest set among all dominating sets of G obtained in this way and outputs it as a minimum dominating set of G.To show that this algorithm has running time O∗(1.64515n)we have to study its branching into subcases.We denote by T(n)the worst case number of recursive calls performed by the algorithm for a graph on n vertices.The algorithm will pick any vertex x of degree three,then for each subcase it chooses one or two vertices to be added to the partial dominating set D and recurses on some smaller induced subgraph.The number of neighbors of degree three of x will be denoted by t .Based on Lemma 5.1each connected component of G [D ]can be supposed to be a K 1or a K 2.Thus it suffices to distinguish the following three cases.Case A:x,y ∈D for some neighbor y of D .Let y be one of the three neighbors of x .If y has degree three then we add x and y to the partial dominating set D .Thus we branch into a subcase for each neighbor y of degree three in G ,and recurse on G −(N [x ]∪N [y ])with D :=D ∪{x,y }.By property (ii)of Lemma 5.1,we remove in each subcase 6vertices.Thus the number of recursive calls is at most t ·T (n −6).Case B:x ∈D isolated vertex in G [D ].We add x to the dominating set D and recurse on G −N [x ].Since x has degree three the number of recursive calls on the subcase is T (n −4).Case C:x ∈D .At least one of the neighbors y 1,y 2,y 3of x must be added to D .The algorithm picks one of these vertices,say y i .Thus we obtain three subcases “y i must be added to D ”.Each of these subcases will be treated depending on the degree of y i (quite similar to cases A and B).(C.1):y i has degree three.Clearly {y i ,x }⊆D is impossible in case C.Let z i 1and z i 2be the other two neighbors of y i .Then we obtain 2t subcases as follows:Add y i ,z i j ,i ∈{1,2,3},j ∈{1,2},to D and recurse on G −(N [y i ]∪N [z i j ]).As in Case A,we remove 6vertices for each subcase;hence the number of recursive calls for the 2t subcases is at most 2t ·T (n −6).Additionally we can choose any vertex y i and add it as a singleton of G [D ]to D .Thus there are t subcases with at most t ·T (n −4)recursive calls.(C.2):y i has degree two.Then by Lemma 5.1,y i can only be added to D as a singleton of G [D ].We recurse on G −N [y ]needing at most (3−t )·T (n −3)recursive calls.(C.3):y i has degree one.No minimum dominating set of G contains y i .Thus we obtain T (n )≤(3−t )·T (n −3)+(t +1)·T (n −4)+3t ·T (n 
References
[1] J. Alber, H. L. Bodlaender, H. Fernau, T. Kloks, and R. Niedermeier. Fixed parameter algorithms for dominating set and related problems on planar graphs. Algorithmica 33, 2002, pp. 461–493.
[2] E. Dantsin, A. Goerdt, E. A. Hirsch, R. Kannan, J. Kleinberg, C. Papadimitriou, P. Raghavan, and U. Schöning. A deterministic (2 − 2/(k+1))^n algorithm for k-SAT based on local search. Theoretical Computer Science 289, 2002, pp. 69–83.
[3] R. G. Downey and M. R. Fellows. Parameterized complexity. Monographs in Computer Science, Springer-Verlag, New York, 1999.
[4] D. Eppstein. Small maximal independent sets and faster exact graph coloring. Proceedings of the 7th Workshop on Algorithms and Data Structures (WADS 2001), LNCS 2125, Springer, 2001, pp. 462–470.
[5] F. V. Fomin and D. M. Thilikos. Dominating sets in planar graphs: branch-width and exponential speed-up. Proceedings of the 14th ACM-SIAM Symposium on Discrete Algorithms (SODA 2003), 2003, pp. 168–177.