Triangles with two given integral sides

1. Introduction
There are many Diophantine problems arising from the study of certain properties of triangles. Most people know the theorem, named after Pythagoras, on the lengths of the sides of right-angled triangles: a² + b² = c².
The present paper is motivated by the following two problems due to Zoltán Bertalan.
(i) How to choose x and y such that the distances of the clock hands at 2 o'clock and 3 o'clock are integers?

Corollary. Problem (i) has infinitely many solutions. A few solutions are given in the following table.

  x   8    1768    10130640    498993199440
  y   15   2415    8109409     136318711969
Intel Interview Questions
Introduction: Intel is a global technology company that designs and manufactures advanced integrated digital technology platforms. As part of its recruitment process, Intel conducts interviews to assess candidates' skills and suitability for various roles within the company. In this article, we explore some potential Intel interview questions and provide detailed answers.

1. Technical Knowledge Questions:
- What is the difference between a CPU and a GPU?
Answer: A CPU (Central Processing Unit) runs the instructions of a computer program, executing tasks at high speed on a small number of threads. A GPU (Graphics Processing Unit) is designed specifically for rendering and displaying graphics on a screen, performing parallel computations with many threads.
- Can you explain Moore's Law?
Answer: Moore's Law states that the number of transistors on a microchip doubles approximately every two years. It illustrates the exponential growth of computing power and the shrinking size of transistors, leading to increased performance and efficiency in electronic devices.
- What is the difference between DDR3 RAM and DDR4 RAM?
Answer: DDR3 and DDR4 are different generations of Random-Access Memory. DDR4 offers higher speed and lower power consumption than DDR3. DDR4 also provides higher bandwidth, enabling faster data transfer between the RAM and the CPU.

2. Problem-Solving Questions:
- How would you design a traffic light control system?
Answer: I would start by understanding the requirements and constraints of the system. Next, I would design a state machine to model the different states of the traffic lights (e.g., green, yellow, red) and the transitions between them based on inputs from sensors and timers. I would then implement the control logic using appropriate programming languages and algorithms.
- You have a list of numbers. How would you find the median without using built-in functions?
Answer: I would first sort the list of numbers in ascending order. If the number of elements in the list is odd, the median is the middle element. If the number of elements is even, the median is the average of the two middle elements, i.e., the average of the (n/2)-th and (n/2 + 1)-th elements.

3. Behavioral Questions:
- Tell me about a time when you faced a challenge at work and how you resolved it.
Answer: In my previous job, we had a tight deadline for a project, and the team was struggling to meet the requirements. To overcome this challenge, I facilitated effective communication within the team and identified key areas where we could optimize workflows. I also delegated tasks based on individual strengths and provided resources to support the team. Through collaboration and efficient project management, we delivered the project on time and achieved the desired results.
- Describe a situation where you had to work collaboratively with a team to achieve a common goal.
Answer: During a group project at university, we had to develop a mobile application within a limited timeframe. We divided the tasks among team members based on their skills and interests, established regular communication channels, and conducted frequent status updates to make sure everyone was on the same page. By leveraging each member's expertise and working together, we successfully developed the application and received positive feedback from our peers and professors.

Conclusion: Preparing for an Intel interview requires a strong foundation of technical knowledge, problem-solving skills, and the ability to present your experiences through behavioral questions. These sample questions give a glimpse of the types of inquiries you may encounter during an Intel interview.
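The median answer above can be turned into a short sketch. This is our own illustration of the described approach (sort, then take the middle element or the average of the two middle ones); `sorted()` stands in for a hand-rolled sort:

```python
def median(nums):
    """Median without statistics helpers: sort, then pick the middle
    element (odd length) or average the two middle ones (even length)."""
    if not nums:
        raise ValueError("empty list")
    s = sorted(nums)          # ascending order, as in the answer above
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    # even length: average of the (n/2)-th and (n/2 + 1)-th elements (1-indexed)
    return (s[mid - 1] + s[mid]) / 2
```

For example, `median([3, 1, 2])` picks the middle of the sorted list, while `median([4, 1, 3, 2])` averages the two middle elements.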
Remember to research the company, review the job requirements, and practice your responses to increase your chances of success.

4. Technical Skills Questions:
- Can you explain the difference between multi-core and multi-threaded processors?
Answer: A multi-core processor consists of multiple independent processing units called cores, each capable of executing instructions in parallel. This allows simultaneous processing of multiple tasks, improving overall system performance. A multi-threaded processor, on the other hand, uses a technique called multi-threading, which allows a single core to handle multiple threads of execution. This enables better utilization of the core's resources and can also improve performance.
- How does cache memory work in a computer system?
Answer: Cache memory is a small, high-speed memory located closer to the CPU than main memory. It stores frequently accessed data and instructions, allowing faster access than retrieving data from main memory. When the CPU needs data, it first checks the cache. If the data is present, a cache hit occurs and the data is retrieved from the cache. If not, a cache miss occurs, and the CPU retrieves the data from main memory and stores a copy in the cache for future access.
- Can you explain the concept of pipelining in processors?
Answer: Pipelining is a technique used in processors to overlap the execution of multiple instructions. It breaks instruction execution into several stages: fetch, decode, execute, memory access, and writeback. Each stage is handled by a different part of the processor, allowing instructions to flow through the pipeline simultaneously. This improves overall instruction throughput and can result in faster execution.

5. Problem-Solving Questions:
- How would you design a data structure to efficiently store and retrieve a large number of strings?
Answer: One possible approach is a trie (prefix tree). In a trie, each node represents a character, and the edges represent the next possible characters. This allows efficient storage and retrieval of strings based on their prefixes, as common prefixes are shared among different strings. Using trie operations such as insertion and search, the system can efficiently handle a large number of strings.
- How would you optimize the performance of a database query that is running slowly?
Answer: There are several approaches. First, ensure that appropriate indexes exist on the columns involved in the query, as indexes can significantly improve search speed. Second, analyze the query execution plan and identify performance bottlenecks, such as full table scans or inefficient joins; rewriting the query or adding conditions to limit the result set often improves performance. Additionally, revisiting the database schema design and partitioning the data can also improve query performance.

6. Behavioral Questions:
- Describe a situation where you demonstrated strong problem-solving skills.
Answer: In a previous job, we encountered a critical bug in the software that caused system crashes. To identify and resolve the issue, I gathered data from users, performed thorough testing, and analyzed log files. After identifying the root cause, I proposed a solution and worked closely with the development team to implement and test the fix. Through diligent problem-solving, we resolved the issue, ensuring system stability and avoiding further disruptions.
- Tell me about a time when you had to adapt to a rapidly changing work environment.
Answer: In a previous role, our company underwent a significant reorganization that changed team structures and project priorities, and my role and responsibilities shifted drastically. To adapt, I quickly assessed the changes and embraced a flexible mindset. I collaborated with my colleagues, sought out new opportunities for growth, and adjusted my workflow and priorities accordingly. By embracing change and staying adaptable, I was able to thrive in the new environment.

Conclusion: Preparing for an Intel interview requires not only a solid understanding of technical concepts but also the ability to apply problem-solving skills and communicate your experiences effectively. The additional questions in this section give further insight into the types of inquiries Intel may pose. Leverage your technical knowledge, showcase your problem-solving abilities, and demonstrate your adaptability to increase your chances of success.
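The trie answer in the problem-solving section can likewise be sketched in a few lines. This is a minimal illustration of the described idea (nodes hold characters, shared prefixes are stored once); the class names are our own:

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.is_word = False  # marks the end of an inserted string

class Trie:
    """Prefix tree: strings sharing a prefix share the nodes for it."""
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word
```

Both operations run in time proportional to the length of the string, independent of how many strings are stored.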

Research on the Maximum Covering Problem

Abstract: The maximum covering problem is a classic combinatorial optimization problem in operations research. It is typically the mathematical abstraction of real-world siting problems such as locating postal service stations, gas stations, and banks. It is generally stated as follows: given a number of demand points, select service points so as to serve all demand points at minimum cost. The problem has been proven NP-hard, i.e., a problem whose optimum cannot be computed in polynomial time (unless P = NP). Most existing work, both domestic and international, attacks it with heuristic searches such as genetic algorithms, ant colony optimization, and simulated annealing, which yield approximate solutions. This paper analyzes three approaches to the problem: exhaustive search, branch-and-prune search, and heuristic search. All three are tested, and their strengths, weaknesses, and ranges of applicability are compared. By computing a cost-effectiveness score for each candidate column, we design a heuristic function that searches for a near-optimal value; in the final tests the approximation stays within an average difference of 2 from the optimum.

Keywords: maximum covering problem; exhaustive search; branch-and-prune search; heuristic search

1 Problem description
1.1 The concrete problem
The maximum covering problem is the mathematical abstraction of a family of real-world facility siting problems: postal stations, gas stations, banks, and so on. It is generally stated as: given a number of demand points, select service points so as to serve all demand points at minimum cost.

1.2 Mathematical formulation
The problem is usually reduced to a matrix covering problem. A is an m×n 0/1 matrix in which a_ij indicates whether column j covers row i: a_ij = 0 means column j does not cover row i, and a_ij = 1 means it does. The array C holds column costs; C_j is the cost of selecting column j. The solution vector is Z, where Z_j = 0 means column j is not selected and Z_j = 1 means it is selected.

The objective is:
  min Σ_j Z_j · C_j
subject to:
  Σ_j Z_j · a_ij ≥ 1,  i = 1, 2, …, m
  Z_j ∈ {0, 1},  j = 1, 2, …, n

1.3 Solution space
An example matrix A (m = 5, n = 4) and cost array C:

  A =
    0 0 1 0
    0 0 0 1
    1 0 1 0
    0 1 0 1
    1 0 1 0

  C = (3, 2, 1, 2)

Since Z_j ∈ {0, 1} for j = 1, …, n, the solution space has size 2^n. As n grows, a computer can hardly enumerate its way to the optimum within bounded time, so computing approximate solutions is the usual way to handle this kind of NP-hard problem.
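The cost-effectiveness idea from the abstract can be sketched as a greedy heuristic on the matrix model above. This is our own minimal illustration, not the paper's exact heuristic function: at each step, pick the column with the lowest cost per newly covered row.

```python
def greedy_cover(A, C):
    """Greedy heuristic for the covering model: A[i][j] == 1 iff column j
    covers row i, C[j] is the cost of column j.  Repeatedly select the
    column with the best cost-effectiveness (lowest cost per newly
    covered row) until every row is covered."""
    m, n = len(A), len(C)
    uncovered = set(range(m))
    chosen = []
    while uncovered:
        best_j, best_ratio = None, None
        for j in range(n):
            newly = sum(1 for i in uncovered if A[i][j])
            if newly == 0:
                continue
            ratio = C[j] / newly
            if best_ratio is None or ratio < best_ratio:
                best_j, best_ratio = j, ratio
        if best_j is None:
            raise ValueError("some row cannot be covered by any column")
        chosen.append(best_j)
        uncovered = {i for i in uncovered if not A[i][best_j]}
    return chosen
```

On the example instance of Section 1.3 the heuristic first takes column 3 (cost 1, covering three rows) and then column 4 (1-indexed), matching the cheap cover of all five rows.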

A Hybrid Bat Algorithm for the P-Median Problem

Authors: Wang Tingting; Zhang Huizhen
Journal: Journal of University of Shanghai for Science and Technology, 2019, 41(4), pp. 344-349 (6 pages)
Keywords: P-median problem; bat algorithm; feasibility function; crossover
Affiliation: Business School, University of Shanghai for Science and Technology, Shanghai 200093, China
Language: Chinese. CLC classification: TP301.6

Facility location is one of the classic problems of operations research, with very wide applications in production, daily life, and logistics, such as siting factories, warehouses, emergency centers, fire stations, and logistics centers. The P-median problem (PMP) is a very common facility location problem, first proposed by Hakimi [1-2] in 1964. The goal is to select p (p < m) facilities to open from a given set of m uncapacitated candidate facilities so that the sum of the distances from all customers to these median points is minimized. Many methods exist for solving the problem, including classic exact approaches such as dynamic programming [3] and Lagrangian relaxation [4-5]. These classic methods, however, are effective only on small instances; as the instance size grows, their computational cost increases enormously, to the point where they cannot solve the problem at all. Since Garey and Johnson [6] proved via computational complexity theory in 1979 that the problem is NP-hard, no polynomial-time algorithm can solve it unless P = NP. Most researchers have therefore devoted themselves to heuristic algorithms, hoping to obtain satisfactory or near-optimal solutions. Many traditional optimization algorithms have been applied to the PMP, such as local search [7-8] and random search [9]. In addition, a variety of modern metaheuristics have been proposed, including simulated annealing [10], genetic algorithms [11-13], and particle swarm optimization [14]. These heuristics can produce good or near-optimal solutions; in particular, bio-inspired algorithms, represented by genetic algorithms, have been widely studied and applied to combinatorial optimization thanks to their good self-organizing and adaptive characteristics. This paper applies the bat algorithm, a bio-inspired algorithm with good optimization performance and promising prospects, to the PMP, in the hope of overcoming the limitations of existing algorithms and achieving better optimization performance.
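To make the PMP objective concrete, here is a minimal sketch (our own illustration, not the paper's bat algorithm): the objective assigns each customer to its nearest open facility, and an exact brute-force baseline enumerates all p-subsets, which is feasible only for the small instances the classic methods handle.

```python
from itertools import combinations

def pmedian_cost(dist, facilities):
    """Total assignment cost: each customer uses its nearest open facility.
    dist[i][j] = distance from customer i to candidate facility j."""
    return sum(min(row[j] for j in facilities) for row in dist)

def pmedian_brute_force(dist, p):
    """Exact solution by enumerating all p-subsets of candidates --
    exponential in p, which is why heuristics are used on large instances."""
    m = len(dist[0])
    return min(combinations(range(m), p),
               key=lambda fs: pmedian_cost(dist, fs))
```

For example, with three customers, two candidate facilities, and p = 1, the enumeration simply compares the two column sums of nearest distances.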

A Normalized Gray-Level Distribution Frame Difference Method for Video Cut Detection

Computer Engineering (计算机工程), Vol. 35, No. 3

(2) For the 256 values Diff(i, l), l = 0, …, 255, computed above, apply:
  IF Diff(i, l) < 1 THEN Diff(i, l) = 1 / Diff(i, l)
that is, take the reciprocal of every gray-level component ratio smaller than 1. This Diff is defined as the "absolute ratio" between the two values.
(3) Accumulate Diff(i, l) over l = 0, …, 255 to obtain the new inter-frame histogram difference:
  NewHistDiff(i) = Σ_{l=0}^{255} Diff(i, l)
which is the normalized gray-level distribution frame difference for frame i. Accumulating these absolute ratios over all gray levels keeps the normalized gray-level distribution frame difference flat when two consecutive frames have similar gray-level histogram distributions, and makes it rise sharply when they have differently shaped histogram distributions. Combined with an adaptive threshold algorithm, this yields a higher cut-detection accuracy.

4 Experimental results
Based on the normalized gray-level distribution frame difference method, the gray-level histograms of the video clips above were used. Figure 6 shows the normalized gray-level distribution frame difference over video 2. To verify that the method satisfies requirement (2) of Section 3, a further experiment is needed. The long video containing video 1 is shown in Figure 7; it contains 3 cuts, and the segment from the third cut to the end is video 1 (lens zoom). Figure 8 shows that the normalized gray-level distribution frame difference satisfies requirement (2) of Section 3, i.e., it still exhibits a clear response at the true cuts. Figure 9 shows the long video containing video 2; it contains 1 cut, and the segment after the cut is video 2 (camera motion). Figure 10 shows that the new histogram frame difference, as defined, also satisfies requirement (2) of Section 3.
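Steps (2)-(3) can be sketched directly. This is a minimal illustration of the definition above; the epsilon guard for empty histogram bins is our assumption, since the text does not specify how zero bins are handled:

```python
def new_hist_diff(hist_prev, hist_curr, eps=1e-6):
    """Normalized gray-level distribution frame difference (steps (2)-(3)).
    For each of the 256 gray levels, take the ratio of the two histogram
    bins; ratios below 1 are replaced by their reciprocal ("absolute
    ratio"), so every term is >= 1.  The sum stays near 256 for similar
    histograms and rises sharply across a cut.  eps guards empty bins
    (our assumption; the text does not specify zero handling)."""
    total = 0.0
    for l in range(256):
        ratio = (hist_prev[l] + eps) / (hist_curr[l] + eps)
        if ratio < 1.0:
            ratio = 1.0 / ratio
        total += ratio
    return total
```

Identical histograms give exactly 256 (one per gray level); a single heavily changed bin already pushes the sum well above that floor, which is what the adaptive threshold exploits.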

Monotone circuits for the majority function

Shlomo Hoory¹, Avner Magen†, Toniann Pitassi†
¹ Department of Computer Science, University of British Columbia, Vancouver, Canada.
† Department of Computer Science, University of Toronto, Toronto, Canada.

Abstract
We present a simple randomized construction of size O(n^3) and depth 5.3 log n + O(1) monotone circuits for the majority function on n variables. This result can be viewed as a reduction in the size, and a partial derandomization, of Valiant's construction of an O(n^5.3) monotone formula [15]. On the other hand, compared with the deterministic monotone circuit obtained from the sorting network of Ajtai, Komlós, and Szemerédi [1], our circuit is much simpler and has depth O(log n) with a small constant. The techniques used in our construction incorporate fairly recent results showing that expansion yields performance guarantees for the belief propagation message passing algorithms for decoding low-density parity-check (LDPC) codes [3]. As part of the construction, we obtain optimal-depth linear-size monotone circuits for the promise version of the problem, where the number of 1's in the input is promised to be either less than one third or greater than two thirds. We also extend these improvements to general threshold functions. At last, we show that the size can be further reduced at the expense of increased depth, and obtain a circuit for the majority of size about n^1.3.

1 Introduction
The complexity of monotone formulas/circuits for the majority function is a fascinating, albeit perplexing, problem in theoretical computer science. Without the monotonicity restriction, majority can be solved with simple linear-size circuits of depth O(log n), where the best known depth (over binary AND, OR, NOT gates) is 4.95 log n + O(1) [12]. There are two fundamental algorithms for the majority function that achieve logarithmic depth. The first is a beautiful construction obtained by Valiant in 1984 [15] that achieves monotone formulas of depth 5.3 log n + O(1) and size O(n^5.3). The second is obtained from the celebrated sorting network constructed in 1983 by Ajtai, Komlós, and Szemerédi [1]. Restricting to binary inputs and taking the middle output bit (the median) reduces this network to a monotone circuit for the majority function of depth K log n and size O(n log n). The advantage of the AKS sorting network for majority is that it is a completely uniform construction of small size. On the negative side, its proof is quite complicated and, more importantly, the constant K is huge: the best known constant K is about 5000 [11], and as observed by Paterson, Pippenger, and Zwick [12], this constant is important. Further converting the circuit to a formula yields a monotone formula of size O(n^K), which is roughly n^5000.

In order to argue about the quality of a solution to the problem, one should be precise about the different resources and the tradeoffs between them. We care about the depth, the size, the number of random bits for a randomized construction, and the formula vs. circuit question. Finally, the conceptual simplicity of both the algorithm and the correctness proof is also an important goal. Getting the best depth-size tradeoff is perhaps the most sought-after goal around this classical question, while achieving uniformity comes next.
An interesting aspect of the problem is the natural way it splits into two subproblems whose solutions combine to solve the original problem. Problem I takes as input an arbitrary n-bit binary vector and outputs an m-bit vector. If the input vector has a majority of 1's, then the output vector has at least a 2/3 fraction of 1's; if the input vector does not have a majority of 1's, then the output vector has at most a 1/3 fraction of 1's. Problem II is a promise problem that takes the m-bit output of Problem I as its input. The output of Problem II is a single bit that is 1 if the input has at least a 2/3 fraction of 1's, and 0 if the input has at most a 1/3 fraction of 1's. Obviously the composition of these two functions solves the original majority problem.

There are several reasons to consider monotone circuits constructed via this two-phase approach. First, Valiant's analysis uses this viewpoint. Boppana's later work [2] actually lower-bounds each of these subproblems separately (although failing to provide a lower bound for the entire problem). Finally, the second subproblem is of interest in its own right: Problem II can be viewed as an approximate counting problem, and thus plays an important role in many areas of theoretical computer science. Non-monotone circuits for this promise problem have been widely studied.

Results. The contribution of the current work is primarily in obtaining a new and simple construction of monotone circuits for the majority function of depth 5.3 log n and size O(n^3), hence significantly reducing the size of Valiant's formula while not compromising the depth parameter at all. Further, for subproblem II as defined above, we supply a construction of a circuit of linear size, and it too does not compromise the depth compared to Valiant's solution. A very appealing feature of this construction is that it is uniform, conditioned on a reasonable assumption about the existence of good enough expander graphs.
To this end we introduce a connection between this circuit complexity question and another domain, namely message passing algorithms. The depth we achieve for the promise problem nearly matches the 1954 lower bound of Moore and Shannon [10]. We further show how to generalize our solution to general threshold functions, and we explore another option in the tradeoffs between the different resources we use; specifically, we show that by allowing a depth of roughly twice that of Valiant's construction, we may get a circuit of size about n^1.3.

2 Definitions and amplification
For a monotone boolean function H on k inputs, we define its amplification function A_H : [0,1] → [0,1] as A_H(p) = Pr[H(X_1, ..., X_k) = 1], where the X_i are independent boolean random variables that are one with probability p. Valiant [15] considered the function H on four variables which is the OR of two AND gates, H(x_1, x_2, x_3, x_4) = (x_1 ∧ x_2) ∨ (x_3 ∧ x_4). The amplification function of H, depicted in Figure 1, is A_H(p) = 1 − (1 − p²)², and it has a non-trivial fixed point at β = (√5 − 1)/2 ≈ 0.618. Let H_k be the depth-2k binary tree with alternating layers of AND and OR gates, where the root is labeled OR. Valiant's construction uses the fact that A_{H_k} is the composition of A_H with itself k times. Therefore H_k probabilistically amplifies (β − ∆, β + ∆) to (β − (γ − ε)^k ∆, β + (γ − ε)^k ∆), where γ = A_H'(β), as long as (γ − ε)^k ∆ ≤ ∆_0. This implies that for any constant ε > 0 we can take 2k = 3.3 log n + O(1) to probabilistically amplify (β − Ω(1/n), β + Ω(1/n)) to (ε, 1 − ε), where 3.3 is any constant bigger than α = 2/log γ.

Definition 1. Let F be a boolean function F : {0,1}^n → {0,1}^m, and let S ⊆ {0,1}^n be some subset of the inputs. We say that F deterministically amplifies (p_l, p_h) to (q_l, q_h) with respect to S if for all inputs x ∈ S the following promise is satisfied (we denote by |x| the number of ones in the vector x):
  |F(x)| ≤ q_l m  if |x| ≤ p_l n,
  |F(x)| ≥ q_h m  if |x| ≥ p_h n.
Note that unlike probabilistic amplification, deterministic amplification has to work for all inputs or scenarios in the given set S. From here on, whenever we simply say "amplification" we mean deterministic amplification.

For an arbitrarily small constant ε > 0, the construction we give is composed of two independent phases that may be of independent interest.

A circuit C_1 : {0,1}^n → {0,1}^m, for m = O(n), that deterministically amplifies (β − Ω(1/n), β + Ω(1/n)) to (δ, 1 − δ) for an arbitrarily small constant δ > 0. This circuit has size O(n^3) and depth (α + ε) log n + O(1).

A circuit C_2 : {0,1}^m → {0,1}, such that C_2(x) = 0 if |x| ≤ δm and C_2(x) = 1 if |x| ≥ (1 − δ)m, where δ > 0 is a sufficiently small constant. This circuit has size O(m) and depth (2 + ε) log m + O(1).

The first circuit C_1 is achieved by a simple probabilistic construction that resembles Valiant's construction. We present two constructions for the second circuit C_2. The first construction is probabilistic; the second is a simulation of a logarithmic number of rounds of a certain message passing algorithm on a good bipartite expander graph. Its correctness is based on the analysis of a similar algorithm used to decode a low-density parity-check (LDPC) code on the erasure channel [3]. Combining the two circuits together yields a circuit C : {0,1}^n → {0,1} for the ⌈βn⌉-th threshold function. The circuit is of size O(n^3) and depth (α + 2 + 2ε) log n + O(1).

3 Monotone circuits for majority
In this section we give a randomized construction of a circuit C : {0,1}^n → {0,1} such that C(x) is one if the portion of ones in x is at least β and zero otherwise. The circuit C has size O(n^3) and depth (2 + α + ε) log n + O(1) for an arbitrarily small constant ε > 0. As described before, C is the composition of the circuits C_1 and C_2, whose parameters are given by the following two theorems.

Theorem 2. For every ε, c > 0 there exists a circuit C_1 : {0,1}^n → {0,1}^m, for m = O(n), of size O(n^3) and depth (α + ε) log n + O(1), that deterministically amplifies all inputs from (β − c/n, β + c/n) to (ε, 1 − ε).
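As a numerical sanity check (our own illustration, not part of the paper), the amplification function and its fixed point behave exactly as described: iterating A_H drives any point slightly above β toward 1 and any point slightly below β toward 0.

```python
import math

def amp(p):
    """Valiant's amplification function A_H(p) = 1 - (1 - p^2)^2 for
    H(x1..x4) = (x1 AND x2) OR (x3 AND x4)."""
    return 1.0 - (1.0 - p * p) ** 2

# non-trivial fixed point: beta = (sqrt(5) - 1) / 2
beta = (math.sqrt(5) - 1) / 2

def iterate(p, k):
    """Compose A_H with itself k times, as in the tree H_k (depth 2k)."""
    for _ in range(k):
        p = amp(p)
    return p
```

Near β the separation grows by a factor of roughly γ = A_H'(β) ≈ 1.53 per composition, which is the source of the 3.3 log n layer count: starting from β ± 0.01, about a dozen compositions already saturate to (≈0, ≈1).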
Theorem 3. For every ε > 0 there exist ε′ > 0 and a circuit C_2 : {0,1}^n → {0,1}, of size O(n) and depth (2 + ε) log n + O(1), that deterministically amplifies all inputs from (ε′, 1 − ε′) to (0, 1).

The two circuits use a generalization of the four-input function H used in Valiant's construction. For any integer d ≥ 2, we define the function H_d on d² inputs as the d-ary OR of d d-ary AND gates, i.e. ∨_{i=1}^d ∧_{j=1}^d x_{ij}. Note that Valiant's function H is just H_2. Each of the circuits C_1 and C_2 is a layered circuit, where layer zero is the input, and each value at the i-th layer is obtained by applying H_d to d² independently chosen inputs from layer i − 1. However, the values of d we choose for C_1 and C_2 are different: for C_1 we have d = 2, while for C_2 we choose a sufficiently large d = d(ε) to meet the depth requirement of the circuit.

We let F(n, m, F′) denote a random circuit mapping n inputs to m outputs, where F′ is a fixed monotone boolean circuit with k inputs, and each of the m output bits is calculated by applying F′ to k independently chosen random inputs. We start with a simple lemma that relates the deterministic amplification properties of such a random circuit to the probabilistic amplification function A_{F′}.

Lemma 4. For any ε, δ > 0, the random circuit F(n, m, F′) deterministically amplifies (p_l, p_h) to ((1 + δ) A_{F′}(p_l), (1 − δ) A_{F′}(p_h)) with respect to S ⊆ {0,1}^n with probability at least 1 − ε, if
  m = Ω((log|S| + log(1/ε)) / δ²).

Proof sketch of Theorem 2. The i-th layer amplifies a separation of Θ((γ − 2ε)^{i−1} c/n) around β to Θ((γ − 2ε)^i c/n). That is, we can choose δ as an increasing geometric sequence, starting from Θ(1/n) for i = 1, up to Θ(1) for i = log_{γ−2ε} n. The implied layer size for error probability 2^{−n} (which is much better than we need) is Θ(n/δ²); therefore it decreases geometrically from Θ(n^3) down to Θ(n). It is not difficult to see that after achieving the desired amplification from β ± c/n to β ± ∆_0, only a constant number of layers is needed to get down to (ε, 1 − ε). The corresponding value of δ in these last steps is a constant (depending on ε), and therefore the required layer sizes are all Θ(n).

Proof of Theorem 3. The circuit C_2 is a composition F(n, m_1, H_d) ∘ F(m_1, m_2, H_d) ∘ ... ∘ F(m_{t−1}, m_t, H_d), where d and the layer sizes n = m_0 ≥ m_1 ≥ ... ≥ m_t are suitably chosen parameters depending on ε. We prove that with high probability such a circuit deterministically amplifies all inputs from (ε′, 1 − ε′) to (0, 1). As before, we restrict our attention to the lower end of the promise problem and prove that C_2 outputs zero on all inputs with portion of ones smaller than ε′. As in the circuit C_1, the layer sizes must be sufficiently large to allow accurate computation. However, for the circuit C_2, accurate computation does not mean that the portion of ones in each layer is close to its expected value. Rather, our aim is to keep the portion of ones bounded by a fixed constant ε′, while making each layer smaller than the preceding one by approximately a factor of d. We continue this process until the layer size is constant, and then use a constant-size circuit to finish the computation. Therefore, since the number of layers of such a circuit is about log n / log d, and the depth of the circuit for H_d is 2 log d, the total depth is about 2 log n for large d. By the above discussion it suffices to prove the following: for every ε > 0 there exist a real number δ > 0 and two integers d, n_0 such that for all n ≥ n_0 the random circuit F(n, m, H_d) with m = (1 + ε) n / d deterministically amplifies δ to δ with respect to all inputs, with failure probability at most 1/n. Since A_{H_d}(δ) ≤ (1 − (1 − δ)^d)^d ≤ (dδ)^d, the probability of failure for any specific input with portion of ones at most δ is bounded by C(m, δm) · A_{H_d}(δ)^{δm} ≤ ((e/δ)(dδ)^d)^{δm}, which is exponentially small for sufficiently small δ.

Our second construction for C_2 builds on the work of Burshtein and Miller [3], who applied a similar amplification method to analyze the performance of a belief propagation message passing algorithm for decoding low-density parity-check (LDPC) codes. Today the use of belief propagation for decoding LDPC codes is one of the hottest topics in error-correcting codes [9, 14, 13].

Let G = (V_L, V_R; E) be a d-regular bipartite graph with n vertices on each side, |V_L| = |V_R| = n. Consider the following message passing algorithm, where we think of the left and right sides as two players. The left player "plays AND" and the right player "plays OR". At time zero the left player starts by sending one boolean message through each left-to-right edge, where the value of the message m_{uv} from u ∈ V_L to v ∈ V_R is the input bit x_u. Subsequently, the messages at time t > 0 are calculated from the messages at time t − 1. At odd times, given the left-to-right messages m_{uv}, the right player calculates the right-to-left messages m_{vw}, from v ∈ V_R to w ∈ V_L, by the formula m_{vw} = ∨_{u ∈ N(v)∖{w}} m_{uv}. That is, the right player sends a 1 along the edge from v ∈ V_R to w ∈ V_L if and only if at least one of the incoming messages (not including the incoming message from w) is 1. Similarly, at even times the algorithm calculates the left-to-right messages m_{vw}, v ∈ V_L, w ∈ V_R, from the right-to-left messages m_{uv}, by the formula m_{vw} = ∧_{u ∈ N(v)∖{w}} m_{uv}. That is, the left player sends a 1 along the edge from v ∈ V_L to w ∈ V_R if and only if all of the incoming messages (not including the incoming message from w) are 1.

We further need the following definitions. We call a left vertex bad at even time t if it transmits at least one message of value one at time t. Similarly, a right vertex is bad at odd time t if it transmits at least one message of value zero at time t. We let b_t be the number of bad vertices at time t. These definitions will be instrumental in providing a potential function measuring the progress of the message passing algorithm, which is expressed in Lemma 5. We say that a bipartite graph G = (V_L, V_R; E) is (λ, e)-expanding if for any vertex set S ⊆ V_L (or S ⊆ V_R) of size at most λn we have |N(S)| ≥ e|S|. It will be convenient to denote the expansion of a set S by e(S) = |N(S)|/|S|.
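The AND/OR message passing rules can be prototyped directly. The following is a minimal sketch of the mechanics (our own illustration, not the paper's construction): the adjacency list passed in is assumed to come from a suitable expander, and each outgoing message excludes the value received on that same edge, exactly as in the rules above.

```python
def run_message_passing(adj_L, x, rounds):
    """One boolean message per directed edge.  Left vertices 'play AND',
    right vertices 'play OR'.  adj_L[u] lists the right-neighbours of left
    vertex u; x holds one input bit per left vertex.  Each round performs
    the odd (OR) step followed by the even (AND) step, and each outgoing
    message ignores the message received on the same edge."""
    n = len(adj_L)
    # left-to-right messages, keyed by (u, v); at time 0 they carry x[u]
    lr = {(u, v): x[u] for u in range(n) for v in adj_L[u]}
    adj_R = {}
    for u in range(n):
        for v in adj_L[u]:
            adj_R.setdefault(v, []).append(u)
    for _ in range(rounds):
        # odd time: right player ORs incoming messages, excluding edge (w, v)
        rl = {(v, w): any(lr[(u, v)] for u in adj_R[v] if u != w)
              for v in adj_R for w in adj_R[v]}
        # even time: left player ANDs incoming messages, excluding edge (v, u)
        lr = {(u, v): all(rl[(w, u)] for w in adj_L[u] if w != v)
              for u in range(n) for v in adj_L[u]}
    return lr
```

On unanimous inputs the messages are trivially stable (all ones stay ones, all zeros stay zeros); the content of Lemma 5 is that on an expander the number of bad vertices also shrinks geometrically from near-unanimous inputs.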
Lemma 5. Consider the message passing algorithm using a d-regular (λ, e)-expander graph with d ≥ 4 and e > (d + 1)/2. If b_t ≤ λn/d², then b_{t+2} ≤ b_t/η for a constant η > 1 depending only on d and e; consequently b_t = 0 after O(log n) rounds.

The better the expansion e, the larger η gets, and the better the time guarantee above gets. How good are the expanders that we may use? One can show the existence of such expanders for sufficiently large d and e = d − c for an absolute constant c. The best known explicit construction that gets close to what we need is the result of [4]. However, that result does not suffice here for two reasons. The first is that it only achieves expansion (1 − ε)d for any ε > 0 and sufficiently large d depending on ε. The second is that it only guarantees left-to-right expansion, while our construction needs both left-to-right and right-to-left expansion. We refer the reader to the survey [6] for further reading and background. For such expanders, after (1 + ε) log n / log(d − 1) iterations all messages contain the right answer, where ε can be made arbitrarily small by choosing a sufficiently large d.

It remains to convert the algorithm into a monotone circuit, which introduces a depth blowup of log(d − 1), owing to the depth of a binary tree simulating a (d − 1)-ary gate. Thus we get a (2 + ε) log n depth circuit for arbitrarily small ε > 0. The size is obviously dn per round times the number of rounds, i.e. O(n log n). To get a linear-size circuit, further work is needed, which we now describe. The idea is to use a sequence of graphs G = G_0, G_1, G_2, ..., where each graph is half the size of its preceding graph but has the same degree and expansion parameters. We start the message passing algorithm using the graph G = G_0, and every t_0 rounds (each round consists of OR and then AND), we switch to the next graph in the sequence. Without the switch, the portion of bad vertices would decrease by a factor of η^{t_0} every t_0 rounds. We argue that each switch can be performed while losing at most a constant factor. To describe the switch from G_i to G_{i+1}, we identify V_L(G_{i+1}) with an arbitrary half of the vertices V_L(G_i), and start the message passing algorithm on G_{i+1} with the left-to-right messages from each vertex in V_L(G_{i+1}) being the same as in the last round of the algorithm on G_i. As the number of bad left vertices cannot increase at a switch, their portion at most doubles. For the right vertices the exact argument is slightly more involved, but it is clear that the portion of bad right vertices in the first round in G_{i+1} increases by at most a constant factor c′ compared with what it would have been had there been no switch (a precise calculation yields c′ = 2d/η). Therefore, to summarize: as the circuit consists of a geometrically decreasing sequence of blocks starting with a linear-size block, the total size is linear as well. As for the depth, the amortized reduction in the portion of bad vertices per round is by a factor of η̄ = η (c′)^{−1/t_0}, so the resulting circuit is only deeper than the one described in the previous paragraph by a factor of log η / log η̄. By choosing a sufficiently large value for t_0, we obtain:

Theorem 6. For any ε > 0 there exists a > 0 such that for any n there exists a monotone circuit of depth (2 + ε) log n + O(1) and size O(n) that solves the a-promise problem.

We note here that O(log n)-depth monotone circuits for the a-promise problem can also be obtained from ε-halvers, the building blocks used in the AKS network. However, our monotone circuits for the a-promise problem have two advantages. First, our algorithm relates this classical problem in circuit complexity to the recently popular message passing algorithms. Second, the depth that we obtain is nearly optimal: namely, Moore and Shannon [10] prove that any monotone formula/circuit for majority requires depth 2 log n − O(1), and the lower bound holds for the a-promise problem as well.

Proof of Lemma 5 (based on Burshtein and Miller [3]). We consider only the case of bad left vertices; the proof for bad right vertices follows from the same proof after exchanging ones with zeroes, ANDs with ORs, and lefts with rights. Let B ⊆ V_L be the set of bad left vertices at some even time t, with |B| ≤ λn/d², and let B′ be the set of bad left vertices at time t + 2. We bound the size of B′ by considering separately B′ ∩ B and B′ ∖ B; note that all sets considered in the proof have size at most λn, and therefore expansion at least e.

To bound B′ ∖ B, consider the set Q = N(B′ ∖ B) ∖ N(B). Since vertices in Q are not adjacent to B, at time t + 1 they send right-to-left messages valued zero. On the other hand, any vertex in B′ ∖ B can receive at most one such zero message (otherwise all its messages at time t + 2 would be valued zero and it could not be in B′). Therefore, since each vertex in Q must have at least one neighbour in B′ ∖ B, it follows that |Q| ≤ |B′ ∖ B|, and hence
  |N(B ∪ (B′ ∖ B))| ≤ |N(B)| + |Q| ≤ |N(B)| + |B′ ∖ B|.  (1)
On the other hand, by the expansion of B ∪ (B′ ∖ B),
  |N(B ∪ (B′ ∖ B))| ≥ e (|B| + |B′ ∖ B|).  (2)
Combining inequalities (1) and (2), and using |N(B)| ≤ d|B|, we get
  |B′ ∖ B| ≤ (d − e) |B| / (e − 1).
Since e > (d + 1)/2, so that e − 1 > (d − 1)/2, this yields the required bound:
  |B′ ∖ B| ≤ 2 (d − e) |B| / (d − 1).
A similar argument bounds |B′ ∩ B|, and combining the two bounds yields the lemma.

As noted before in Section 2, replacing the last 2 log n layers of Valiant's tree with 2 log_r n layers of r-ary AND/OR gates results in an arbitrarily small increase in the depth of the corresponding formula for a large value of r. It is interesting to compare the expected behavior of the suggested belief-propagation algorithm to the behavior of the (d − 1)-ary tree. Assume that the graph G is chosen at random (in the configuration model), and that the number of rounds k is sufficiently small, (d − 1)^{2k} ≪ n. Then, almost surely, the computation of all but an o(1) fraction of the k-th round messages is performed by evaluating (d − 1)-ary depth-k trees. Moreover, introducing an additional o(1) error, one may assume that the leaves are independently chosen boolean random variables that are one with probability p, where p is the portion of ones in the input. This observation sheds some light on the performance of the belief propagation algorithm.
However,our analysis proceeds far beyond the number of rounds for which a cycle free analysis can be applied.4Monotone formulas for threshold-k functionsConsider the case of the k-th threshold function,T k n,i.e.a function that is one on x01n if xk1and zero otherwise.We show that,by essentially the same techniques of Section3,we can construct monotone circuits to this more general problem.We assume henceforth that k n2,since otherwise, we construct the circuit T n1k n and switch AND with OR gates.For k nΘ1,the construction yields circuits of depth53log n O1and size O n3.However,when k o n,circuits are shallower and smaller (this not surprising fact is also discussed in[2]in the context of formulas).The construction goes as follows:(i)Amplify k n k1n toβΩ1kβΩ1k by randomly applying to the input a sufficiently large number of OR gates with arityΘn k(ii)AmplifyβΩ1kβΩ1k to O11O1using a variation of phase I,and(iii)Amplify O11O1to01using phase II.We now give a detailed description.For the sake of the section to follow,we require the following lemma which is more general than is needed for the results of this section.Lemma7.Let S01n,andε0.Then,for any k,there is a randomized construction of a monotone circuit that evaluates T k n correctly on all inputs from S and hasdepth log n23log k2εloglog S O1size O log S k nHere k min k n1k,and the constants of the O depend only onε.Proof.Let s log S,and let i be the OR function with arity i.Then An kk n11k n n k,while An k k1n11k1n n k.Hence An kk n is a constant bounded from zero andone.We further notice thatAn k k1nΘ1kIt is not hard to see that we can pick a constantρso that Aρn k knβΩ1k.Therefore,ρn k probabilistically amplify k n k1n toβΩ1kβΩingLemma4withδΘ1k and m sk2we get that F n mρn k amplifies k n k1n toβΩ1kβΩ1k with arbitrarily high probability.The depth required to implement the above circuit is log n k and the size is O skn.Next we apply a slight modification of phase I.The analysis there remains the same except that 
the starting point is a separation guarantee of Ω(1/k) instead of Ω(1/n), and log |S| is s instead of n. This leads to a circuit of depth α_ε log k + O(1) and of size O(sk²), for an arbitrarily small constant ε > 0. Also, we note that the output of this phase is of size Θ(s). Finally, we apply phase II, where the number of inputs is Θ(s) instead of Θ(n), to obtain an amplification from (Ω(1), 1 − Ω(1)) to (0, 1). This requires depth (2+ε) log s + O(1) and size O(s), for an arbitrarily small constant ε > 0.

To guarantee the correctness of a monotone circuit for T^n_k, it suffices to check its output on inputs of weight k, k+1 (as the circuit is monotone). Therefore |S| = C(n,k) + C(n,k+1), implying that log |S| = O(k̂ log(n/k̂)). Therefore, we have:

Theorem 8. There is a randomized construction of a monotone circuit for T^n_k with:

  depth ≤ log n + 4.3 log k̂ + O(log log(n/k̂)),   size O(k̂² n log(n/k̂)),

where k̂ = min(k, n+1−k), and the constants of the O are absolute.

5 Reducing the circuit size

The result obtained so far for the majority is a monotone circuit of depth 5.3 log n + O(1) and size O(n³). In this section, we would like to obtain a smaller circuit size, at the expense of increasing the depth somewhat.
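Step (i) of the threshold construction above, probabilistic amplification by OR gates of arity Θ(n/k), can be checked numerically. A minimal sketch (the concrete n and k below are made up for illustration and are not from the paper):

```python
# Sketch: an OR gate of arity r on i.i.d. p-biased inputs outputs 1 with
# probability A(OR_r, p) = 1 - (1 - p)**r.  For r = n/k, the images of
# p = k/n and p = (k+1)/n stay bounded away from 0 and 1, while their
# separation is Theta(1/k), as used in step (i).
def amplify_or(p, r):
    return 1.0 - (1.0 - p) ** r

n, k = 10**6, 1000                 # toy parameters
r = n // k
lo = amplify_or(k / n, r)          # close to 1 - 1/e
hi = amplify_or((k + 1) / n, r)
print(lo, hi, (hi - lo) * k)       # the gap, rescaled by k, is a constant
```

Running this for several values of k shows the rescaled gap (hi − lo)·k settling near 1/e, which is the Θ(1/k) separation the construction relies on.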
The crucial observation is that the size of our circuit depends linearly on the logarithm of the number of scenarios it has to handle. Therefore, applying a preprocessing stage to reduce the wealth of scenarios may save up to a factor of n in the circuit size. We propose a recursive construction that reduces the circuit size to about n^{1.3}. We choose α_i = 2σ_{i−1} to equate 1 + α_i σ_i with 3α_i; this determines σ_i in terms of σ_{i−1} and yields a recurrence for δ_i, resulting in a sequence of parameters (α_i, σ_i, δ_i) in which the exponents δ_i tend to 1.29896. Therefore, we have:

Theorem 9. There is a randomized construction of a monotone circuit for the majority of size n^{1.29896...}.

There are two central open problems related to this work. First, is the promise version really simpler than majority? A lower bound greater than 2 log n on the communication complexity of mMaj-Search would settle this question. Boppana [2] and more recent work [5] show lower bounds on a particular method for obtaining monotone formulas for majority. However, we are asking instead for lower bounds on the size/depth of unrestricted monotone formulas/circuits. Secondly, the original question remains unresolved.
Namely, we would like to obtain explicit uniform formulas for majority of optimal or near optimal size. A related problem is to come up with a natural (top-down) communication complexity protocol for mMaj-Search that uses O(log n) many bits.

References

[1] M. Ajtai, J. Komlós, and E. Szemerédi. Sorting in c log n parallel steps. Combinatorica, 3(1):1-19, 1983.
[2] R. B. Boppana. Amplification of probabilistic boolean formulas. IEEE Symposium on Foundations of Computer Science (FOCS), pages 20-29, 1985.
[3] D. Burshtein and G. Miller. Expander graph arguments for message-passing algorithms. IEEE Trans. Inform. Theory, 47(2):782-790, 2001.
[4] M. Capalbo, O. Reingold, S. Vadhan, and A. Wigderson. Randomness conductors and constant-degree expansion beyond the degree 2 barrier. In Proceedings 34th Symposium on Theory of Computing, pages 659-668, 2002.
[5] M. Dubiner and U. Zwick. Amplification by read-once formulas. SIAM J. Comput., 26(1):15-38, 1997.
[6] S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Survey article to appear in the Bulletin of the AMS.
[7] M. Karchmer and A. Wigderson. Monotone circuits for connectivity require super-logarithmic depth. In Proceedings of the Twentieth Annual ACM Symposium on Theory of Computing, pages 539-550, Chicago, IL, May 1988.
[8] M. Luby, M. Mitzenmacher, and A. Shokrollahi. Analysis of random processes via and-or tree evaluation. In ACM-SIAM Symposium on Discrete Algorithms (SODA), 1998.
[9] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. A. Spielman. Analysis of low density codes and improved designs using irregular graphs. ACM Symposium on Theory of Computing (STOC), 1998.
[10] E. F. Moore and C. E. Shannon. Reliable circuits using less reliable relays. I, II. J. Franklin Inst., 262:191-208, 281-297, 1956.
[11] M. S. Paterson. Improved sorting networks with O(log N) depth. Algorithmica, 5(1):75-92, 1990.
[12] M. S. Paterson, N. Pippenger, and U. Zwick. Optimal carry save networks. In Boolean function complexity (Durham, 1990), volume 169 of London Math. Soc. Lecture Note Ser., pages 174-201. Cambridge Univ. Press, Cambridge, 1992.
[13] T. Richardson and
R. Urbanke. Modern coding theory. Draft of a book.
[14] T. Richardson and R. Urbanke. The capacity of low-density parity-check codes under message-passing decoding. IEEE Trans. Inform. Theory, 47(2):599-618, 2001.
[15] L. G. Valiant. Short monotone formulae for the majority function. J. Algorithms, 5(3):363-366, 1984.


VESA Display Stream Compression Encoder IP v1.0 User Guide

Introduction

Display Stream Compression (DSC) is a visually lossless video compression standard targeted at display devices. As the demand for higher video resolutions and higher frame rates grows, the data bandwidth required to transmit the video keeps increasing. To transmit high video resolutions such as 4K and 8K, the source, the transmission path (that is, the display cable), and the display must all support higher data rates. These high data rates increase the cost of the source, cable, and display. DSC is used to reduce the data rate required to transmit high resolution video, thereby reducing the cost. DSC was first introduced by the Video Electronics Standards Association (VESA) in 2014. DSC compression is supported by the latest versions of popular protocols such as HDMI, DisplayPort, and MIPI DSI.

DSC implements compression by combining a group of pixels in a horizontal line. The compression algorithm uses several stages such as prediction, quantization, entropy encoding, and rate control. There are two prediction algorithms: Modified Median Adaptive Prediction (MMAP) and Mid-Point Prediction (MPP). The predicted data is quantized based on the rate control to achieve a constant bandwidth at the output. The quantized data is then passed to Variable Length Coding (VLC), which minimizes the bits used to represent the quantized output. These compression stages are implemented for the Y, Cb, and Cr components, and the outputs of these stages are combined at the end using a substream multiplexer.

DSC supports splitting a video frame horizontally into multiple slices of equal size. The slicing of a frame allows parallel processing of slices to handle high resolution video frames.
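As a rough illustration of the bandwidth saving (back-of-the-envelope arithmetic only; blanking intervals and protocol overhead are ignored, and the helper function is made up for this sketch), compressing an 8-bit-per-component YCbCr 4:4:4 stream (24 bpp) to 12 bpp halves the required link rate:

```python
# Illustrative pixel-rate arithmetic, not a DSC specification calculation.
def video_rate_bps(width, height, fps, bits_per_pixel):
    return width * height * fps * bits_per_pixel

uncompressed = video_rate_bps(1920, 1080, 60, 24)  # 8-bit YCbCr 4:4:4
compressed   = video_rate_bps(1920, 1080, 60, 12)  # DSC at 12 bpp
print(uncompressed / 1e9, compressed / 1e9)        # roughly 2.99 vs 1.49 Gbit/s
```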
The DSC IP supports two slices and uses MMAP and MPP predictions.

Features

DSC has the following features:
•VESA DSC 1.2a specification
•Implements compression on the YCbCr 4:4:4 video format
•Supports 12 bits per pixel (12 bpp) and 8 bits per component
•Standalone operation; CPU or processor assistance not required
•Supports compression for 432x240, 648x480, 960x540, 1296x720, and 1920x1080 resolutions at 60 frames per second (fps)
•Supports two slices

Supported Families

DSC supports the following families of products:
•PolarFire® SoC FPGA
•PolarFire FPGA

Table of Contents
Introduction
Features
Supported Families
1. Hardware Implementation
1.1. Inputs and Outputs
1.2. Configuration Parameters
1.3. Hardware Implementation of DSC IP
2. Testbench
2.1. Simulation
3. License
4. Installation Instructions
5. Resource Utilization
6. Revision History
Microchip FPGA Support
Microchip Information
The Microchip Website
Product Change Notification Service
Customer Support
Microchip Devices Code Protection Feature
Legal Notice
Trademarks
Quality Management System
Worldwide Sales and Service

1. Hardware Implementation

The following figure shows the DSC IP block diagram.

Figure 1-1. DSC Encoder IP Block Diagram

1.1 Inputs and Outputs

The following table lists the input and output ports of the DSC IP.

Table 1-1. Input and Output Ports of DSC IP

1.2 Configuration Parameters

The following figure shows the DSC Encoder IP configuration parameters.

Figure 1-2. DSC Encoder IP Configurator

1.3 Hardware Implementation of DSC IP

This section describes the different internal modules of the DSC Encoder IP. The data input to the IP must be in the form of a raster scan image in the YCbCr 4:4:4 format.

The following figure shows the DSC Encoder IP block diagram that divides the input image into two slices. The width of each slice is half of the input image width, and the slice height is the same as the input image height.

Figure 1-3.
DSC Encoder IP Block Diagram Slice 1

The following figure shows the DSC Encoder block diagram for each slice.

Figure 1-4. DSC Encoder Block Diagram Slice 2

1.3.1 Prediction and Quantization

Each group, consisting of three consecutive pixels, is predicted by using the MMAP and MPP algorithms. Predicted values are subtracted from the original pixel values, and the resulting residual pixels are quantized. In addition, a reconstruction step is performed in the encoder, wherein the inverse-quantized residuals are added to the predicted samples to ensure that both the encoder and the decoder have the same reference pixels.

The MMAP algorithm uses the current group's pixels, the previous line's adjacent pixels, and the reconstructed pixel immediately to the left of the group. This is the default prediction method.

The MPP predictor is a value at or near the midpoint of the range. The predictor depends on the rightmost reconstructed sample value of the previous group.

1.3.2 VLC Entropy Encoder

The size of each residual is predicted using the previous residual size and the change in the Quantization Parameter (QP). Variable length encoding effectively compresses the residual data.

1.3.3 Rate Control

The rate control block calculates the master Quantization Parameter (masterQP) used for prediction and VLC to ensure that the rate buffer neither underflows nor overflows. The masterQP value is not transmitted in the bitstream; the same rate control algorithm is mirrored in the decoder. The RC algorithm is designed to optimize subjective picture quality through its QP decisions: a lower QP on flat areas of the image and a higher QP on busy areas helps maintain constant quality for all pixels.

1.3.4 Decoder Model

The decoder model is an idealized model of the actual decoder. The decoder model dictates how the substreams Y, Cb, and Cr are multiplexed.
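The request-driven multiplexing idea can be pictured with a toy sketch (this only illustrates the request/muxword mechanism; it is not the DSC 1.2a algorithm, and all names and sizes below other than the 48-bit muxword are made up):

```python
# Toy demand-driven multiplexer: per-component bit queues are drained into
# fixed-size 48-bit muxwords in the order an (idealized) decoder model
# requests them; a component with too few buffered bits is skipped here.
MUXWORD_BITS = 48

def multiplex(requests, queues):
    """requests: component names in decoder-request order.
    queues: dict mapping component name -> list of bits (0/1)."""
    stream = []
    for comp in requests:
        if len(queues[comp]) >= MUXWORD_BITS:
            stream.append((comp, queues[comp][:MUXWORD_BITS]))
            queues[comp] = queues[comp][MUXWORD_BITS:]
    return stream

queues = {'Y': [1] * 96, 'Cb': [0] * 48, 'Cr': [0] * 48}
stream = multiplex(['Y', 'Cb', 'Y', 'Cr'], queues)
print([comp for comp, _ in stream])   # ['Y', 'Cb', 'Y', 'Cr']
```

In the real IP, the Balance FIFOs described next play the role of these queues, guaranteeing that a full muxword is always available when requested.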
The Balance FIFOs ensure that the multiplexer has at least one muxword's worth of data whenever the multiplexer receives a request signal from the decoder model.

1.3.5 Substream Multiplexer

The substream multiplexer multiplexes the Y, Cb, and Cr components into a single slice of data. Each muxword has 48-bit data. Muxwords are inserted in the bitstream depending on the size of their syntax elements.

1.3.6 Slice Multiplexer

Each picture is divided into two equal slices. Each slice is independently decoded without referencing other slices. The two slices are merged in the bitstream by the slice multiplexing process.

2. Testbench

A testbench is provided to check the functionality of the DSC IP.

2.1 Simulation

The simulation uses a 432x240 image in YCbCr 4:4:4 format, represented by three input files (one each for Y, Cb, and Cr), and generates a .txt file that contains one frame.

To simulate the core using the testbench, perform the following steps:

1. Go to the Libero® SoC Catalog tab, expand Solutions-Video, double-click DSC_Encoder, and then click OK.
Note: If you do not see the Catalog tab, navigate to the View > Windows menu and click Catalog to make it visible.
Figure 2-1. DSC Encoder IP Core in Libero SoC Catalog
2. Go to the Files tab, right-click simulation, and then click Import Files.
Figure 2-2. Import Files
3. Import the img_in_luma.txt, img_in_cb.txt, img_in_cr.txt, and DSC_out_ref.txt files from the following path: ..\<Project_name>\component\Microchip\SolutionCore\DSC_Encoder\<DSC IP version>\Stimulus.
The imported files are listed in the simulation folder as shown in the following figure.
Figure 2-3. Imported Files
4. Go to the Libero SoC Stimulus Hierarchy tab, select the testbench (DSC_Encoder_tb.v), right-click, and then click Simulate Pre-Synth Design > Open Interactively. The IP is simulated for one frame.
Note: If you do not see the Stimulus Hierarchy tab, navigate to the View > Windows menu, and then click Stimulus Hierarchy to make it visible.
Figure 2-4.
Simulating the Pre-Synthesis Design

ModelSim opens with the testbench file as shown in the following figure.

Figure 2-5. ModelSim Simulation Window

Note: If the simulation is interrupted due to the runtime limit specified in the DO file, use the run -all command to complete the simulation.

3. License

The VESA DSC IP is provided only in encrypted form. The encrypted RTL source code is license locked and needs to be purchased separately. You can perform simulation, synthesis, layout, and program the Field Programmable Gate Array (FPGA) silicon using the Libero design suite. An evaluation license is provided for free to explore the VESA DSC IP features. The evaluation license expires after an hour's use on the hardware.

4. Installation Instructions

The DSC IP core must be installed into the IP Catalog of the Libero SoC software. This is done automatically through the IP Catalog update function in the Libero SoC software, or the IP core can be manually downloaded from the catalog. Once the IP core is installed in the Libero SoC software IP Catalog, the core can be configured, generated, and instantiated within the SmartDesign tool for inclusion in Libero projects.

5. Resource Utilization

The following table lists the resource utilization of a sample DSC IP design made for a PolarFire FPGA (MPF300TS-1FCG1152I package) that generates compressed data using 4:4:4 sampling of the input data.

Table 5-1. Resource Utilization

6. Revision History

The revision history describes the changes that were implemented in the document. The changes are listed by revision, starting with the current publication.

Table 6-1. Revision History

Microchip FPGA Support

Microchip FPGA products group backs its products with various support services, including Customer Service, Customer Technical Support Center, a website, and worldwide sales offices.
Customers are encouraged to visit Microchip online resources prior to contacting support, as it is very likely that their queries have already been answered.

Contact the Technical Support Center through the website at /support. Mention the FPGA device part number, select the appropriate case category, and upload design files while creating a technical support case.

Contact Customer Service for non-technical product support, such as product pricing, product upgrades, update information, order status, and authorization.
•From North America, call 800.262.1060
•From the rest of the world, call 650.318.4460
•Fax, from anywhere in the world, 650.318.8044

Microchip Information

The Microchip Website

Microchip provides online support via our website at /. This website is used to make files and information easily available to customers. Some of the content available includes:
•Product Support – Data sheets and errata, application notes and sample programs, design resources, user's guides and hardware support documents, latest software releases and archived software
•General Technical Support – Frequently Asked Questions (FAQs), technical support requests, online discussion groups, Microchip design partner program member listing
•Business of Microchip – Product selector and ordering guides, latest Microchip press releases, listing of seminars and events, listings of Microchip sales offices, distributors and factory representatives

Product Change Notification Service

Microchip's product change notification service helps keep customers current on Microchip products.
Subscribers will receive email notification whenever there are changes, updates, revisions or errata related to a specified product family or development tool of interest.To register, go to /pcn and follow the registration instructions.Customer SupportUsers of Microchip products can receive assistance through several channels:•Distributor or Representative•Local Sales Office•Embedded Solutions Engineer (ESE)•Technical SupportCustomers should contact their distributor, representative or ESE for support. Local sales offices are also available to help customers. A listing of sales offices and locations is included in this document.Technical support is available through the website at: /supportMicrochip Devices Code Protection FeatureNote the following details of the code protection feature on Microchip products:•Microchip products meet the specifications contained in their particular Microchip Data Sheet.•Microchip believes that its family of products is secure when used in the intended manner, within operating specifications, and under normal conditions.•Microchip values and aggressively protects its intellectual property rights. Attempts to breach the code protection features of Microchip product is strictly prohibited and may violate the Digital Millennium Copyright Act.•Neither Microchip nor any other semiconductor manufacturer can guarantee the security of its code. Code protection does not mean that we are guaranteeing the product is “unbreakable”. Code protection is constantly evolving. Microchip is committed to continuously improving the code protection features of our products. Legal NoticeThis publication and the information herein may be used only with Microchip products, including to design, test,and integrate Microchip products with your application. Use of this information in any other manner violates these terms. Information regarding device applications is provided only for your convenience and may be supersededby updates. 
It is your responsibility to ensure that your application meets with your specifications. Contact yourlocal Microchip sales office for additional support or, obtain additional support at /en-us/support/ design-help/client-support-services.THIS INFORMATION IS PROVIDED BY MICROCHIP "AS IS". MICROCHIP MAKES NO REPRESENTATIONSOR WARRANTIES OF ANY KIND WHETHER EXPRESS OR IMPLIED, WRITTEN OR ORAL, STATUTORYOR OTHERWISE, RELATED TO THE INFORMATION INCLUDING BUT NOT LIMITED TO ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE, OR WARRANTIES RELATED TO ITS CONDITION, QUALITY, OR PERFORMANCE.IN NO EVENT WILL MICROCHIP BE LIABLE FOR ANY INDIRECT, SPECIAL, PUNITIVE, INCIDENTAL, OR CONSEQUENTIAL LOSS, DAMAGE, COST, OR EXPENSE OF ANY KIND WHATSOEVER RELATED TO THE INFORMATION OR ITS USE, HOWEVER CAUSED, EVEN IF MICROCHIP HAS BEEN ADVISED OF THE POSSIBILITY OR THE DAMAGES ARE FORESEEABLE. TO THE FULLEST EXTENT ALLOWED BY LAW, MICROCHIP'S TOTAL LIABILITY ON ALL CLAIMS IN ANY WAY RELATED TO THE INFORMATION OR ITS USE WILL NOT EXCEED THE AMOUNT OF FEES, IF ANY, THAT YOU HAVE PAID DIRECTLY TO MICROCHIP FOR THE INFORMATION.Use of Microchip devices in life support and/or safety applications is entirely at the buyer's risk, and the buyer agrees to defend, indemnify and hold harmless Microchip from any and all damages, claims, suits, or expenses resulting from such use. 
No licenses are conveyed, implicitly or otherwise, under any Microchip intellectual property rights unless otherwise stated.TrademarksThe Microchip name and logo, the Microchip logo, Adaptec, AVR, AVR logo, AVR Freaks, BesTime, BitCloud, CryptoMemory, CryptoRF, dsPIC, flexPWR, HELDO, IGLOO, JukeBlox, KeeLoq, Kleer, LANCheck, LinkMD, maXStylus, maXTouch, MediaLB, megaAVR, Microsemi, Microsemi logo, MOST, MOST logo, MPLAB, OptoLyzer, PIC, picoPower, PICSTART, PIC32 logo, PolarFire, Prochip Designer, QTouch, SAM-BA, SenGenuity, SpyNIC, SST, SST Logo, SuperFlash, Symmetricom, SyncServer, Tachyon, TimeSource, tinyAVR, UNI/O, Vectron, and XMEGA are registered trademarks of Microchip Technology Incorporated in the U.S.A. and other countries.AgileSwitch, APT, ClockWorks, The Embedded Control Solutions Company, EtherSynch, Flashtec, Hyper Speed Control, HyperLight Load, Libero, motorBench, mTouch, Powermite 3, Precision Edge, ProASIC, ProASIC Plus, ProASIC Plus logo, Quiet- Wire, SmartFusion, SyncWorld, Temux, TimeCesium, TimeHub, TimePictra, TimeProvider, TrueTime, and ZL are registered trademarks of Microchip Technology Incorporated in the U.S.A.Adjacent Key Suppression, AKS, Analog-for-the-Digital Age, Any Capacitor, AnyIn, AnyOut, Augmented Switching, BlueSky, BodyCom, Clockstudio, CodeGuard, CryptoAuthentication, CryptoAutomotive, CryptoCompanion, CryptoController, dsPICDEM, , Dynamic Average Matching, DAM, ECAN, Espresso T1S, EtherGREEN, GridTime, IdealBridge, In-Circuit Serial Programming, ICSP, INICnet, Intelligent Paralleling, IntelliMOS, Inter-Chip Connectivity, JitterBlocker, Knob-on-Display, KoD, maxCrypto, maxView, memBrain, Mindi, MiWi, MPASM, MPF, MPLAB Certified logo, MPLIB, MPLINK, MultiTRAK, NetDetach, Omniscient Code Generation, PICDEM, , PICkit, PICtail, PowerSmart, PureSilicon, QMatrix, REAL ICE, Ripple Blocker, RTAX, RTG4, SAM-ICE, Serial Quad I/O, simpleMAP, SimpliPHY, SmartBuffer, SmartHLS, SMART-I.S., storClad, SQI, SuperSwitcher, SuperSwitcher II, 
Switchtec, SynchroPHY, Total Endurance, Trusted Time, TSHARC, USBCheck, VariSense, VectorBlox, VeriPHY, ViewSpan, WiperLock, XpressConnect, and ZENA are trademarks of Microchip Technology Incorporated in the U.S.A. and other countries.

SQTP is a service mark of Microchip Technology Incorporated in the U.S.A.

The Adaptec logo, Frequency on Demand, Silicon Storage Technology, and Symmcom are registered trademarks of Microchip Technology Inc. in other countries.

GestIC is a registered trademark of Microchip Technology Germany II GmbH & Co. KG, a subsidiary of Microchip Technology Inc., in other countries.

All other trademarks mentioned herein are property of their respective companies.

© 2022, Microchip Technology Incorporated and its subsidiaries. All Rights Reserved.

ISBN: 978-1-6683-1273-5

Quality Management System

For information regarding Microchip's Quality Management Systems, please visit /quality.

Worldwide Sales and Service


arXiv:cs/0309023v1 [cs.DL] 14 Sep 2003

Efficient Algorithms for Citation Network Analysis

Vladimir Batagelj
University of Ljubljana, Department of Mathematics, Jadranska 19, 1111 Ljubljana, Slovenia
e-mail: vladimir.batagelj@uni-lj.si

Abstract

In the paper very efficient, linear in the number of arcs, algorithms for determining Hummon and Doreian's arc weights SPLC and SPNP in a citation network are proposed, and some theoretical properties of these weights are presented. The nonacyclicity problem in citation networks is discussed. An approach to identify, on the basis of arc weights, an important small subnetwork is proposed and illustrated on the citation networks of SOM (self organizing maps) literature and US patents.

Keywords: large network, acyclic, citation network, main path, CPM path, arc weight, algorithm, self organizing maps, patent

1 Introduction

The citation network analysis started with the paper of Garfield et al. (1964) [10], in which the introduction of the notion of citation network is attributed to Gordon Allen. In this paper, on the example of Asimov's history of DNA [1], it was shown that the analysis "demonstrated a high degree of coincidence between an historian's account of events and the citational relationship between these events". An early overview of possible applications of graph theory in citation network analysis was made in 1965 by Garner [13].

The next important step was made by Hummon and Doreian (1989) [14, 15, 16]. They proposed three indices (NPPC, SPLC, SPNP) – weights of arcs that provide us with an automatic way to identify the (most) important part of the citation network – the main path analysis.

In this paper we make a step further. We show how to efficiently compute the Hummon and Doreian weights, so that they can be used also for the analysis of very large citation networks with several thousands of vertices. Besides this, some theoretical properties of these weights are presented. The proposed methods are implemented in Pajek – a program, for Windows
(32 bit), for analysis of large networks. It is freely available, for noncommercial use, at its homepage [4]. For basic notions of graph theory see Wilson and Watkins [18].

Table 1: Citation network characteristics
[table rows garbled in the extraction; the listed networks are DNA, Small world, Cocitation, Kroto, Zewail, and Desalination, with columns giving the numbers of vertices and arcs and further characteristics]

Figure 1: Citation Network in Standard Form

Let I = {(u,u) : u ∈ U} be the identity relation on U and Q ∩ I = ∅. The relation Q⋆ = …

3 Analysis of Citation Networks

An approach to the analysis of a citation network is to determine for each unit/arc its importance or weight. These values are used afterward to determine the essential substructures in the network. In this paper we shall focus on the methods of assigning weights w : R → ℝ⁺₀ to arcs proposed by Hummon and Doreian [14, 15]:

• node pair projection count (NPPC) method: w_d(u,v) = |Rinv⋆(u)| · |R⋆(v)|
• search path link count (SPLC) method: w_l(u,v) equals the number of "all possible search paths through the network emanating from an origin node" through the arc (u,v) ∈ R [14, p. 50].
• search path node pair (SPNP) method: w_p(u,v) "accounts for all connected vertex pairs along the paths through the arc (u,v) ∈ R" [14, p. 51].

3.1 Computing NPPC weights

To compute w_d for sets of units of moderate size (up to some thousands of units) the matrix representation of R can be used and its transitive closure computed by the Roy-Warshall algorithm [9]. The quantities |R⋆(v)| and |Rinv⋆(u)| can be obtained from the closure matrix as row/column sums. An O(nm) algorithm for computing w_d can be constructed using Breadth First Search from each u ∈ U to determine |Rinv⋆(u)| and |R⋆(v)|. Since it is of order at least O(n²), this algorithm is not suitable for larger networks (several ten thousands of vertices).

3.2 Search path count method

To compute the SPLC and SPNP weights we introduce a related search path count (SPC) method, for which the weights N(u,v), uRv,
count the number of different paths from s to t (or from Min R to Max R) through the arc (u,v).

To compute N(u,v) we introduce two auxiliary quantities: let N−(v) denote the number of different s-v paths, and N+(v) the number of different v-t paths. Every s-t path π containing the arc (u,v) ∈ R can be uniquely expressed in the form π = σ ∘ (u,v) ∘ τ, where σ is an s-u path and τ is a v-t path. Since every pair (σ, τ) of s-u / v-t paths gives a corresponding s-t path, it follows that

  N(u,v) = N−(u) · N+(v),  (u,v) ∈ R,

where

  N−(u) = 1 if u = s, and N−(u) = Σ_{v: vRu} N−(v) otherwise,

and

  N+(u) = 1 if u = t, and N+(u) = Σ_{v: uRv} N+(v) otherwise.

This is the basis of an efficient algorithm for computing the weights N(u,v) – after the topological sort of the network [9] we can compute, using the above relations in topological order, the weights in time of order O(m). The topological order ensures that all the quantities on the right-hand sides of the above equalities are already computed when needed. The counters N(u,v) are used as SPC weights, w_c(u,v) = N(u,v).

3.3 Computing SPLC and SPNP weights

The description of the SPLC method in [14] is not very precise. Analyzing the table of SPLC weights from [14, p. 50] we see that we have to consider each vertex as an origin of search paths. This is equivalent to applying the SPC method on the extended network N_l = (U′, R_l):

  R_l := R′ ∪ {s} × (U \ R(s))

It seems that there are some errors in the table of SPNP weights in [14, p. 51]. Using the definition of the SPNP weights we can again reduce their computation to the SPC method applied on the extended network N_p = (U′, R_p):

  R_p := R ∪ {s} × U ∪ U × {t} ∪ {(t,s)}

in which every unit u ∈ U is additionally linked from the source s and to the sink t.

3.4 Computing the numbers of paths of length k

We could use also a direct approach to determine the weights w_p. Let L−(u) be the number of different paths terminating in u and L+(u) the number of different paths originating in u. Then for uRv it holds that w_p(u,v) = L−(u) · L+(v).

The procedure to determine L−(u) and L+(u) can be compactly described using two families of polynomial generating
functions

  P−(u; x) = Σ_{k=0}^{h(u)} p−(u,k) x^k  and  P+(u; x) = Σ_{k=0}^{h−(u)} p+(u,k) x^k,  u ∈ U,

where h(u) is the depth of vertex u in the network (U, R), and h−(u) is the depth of vertex u in the network (U, Rinv). The coefficient p−(u,k) counts the number of paths of length k to u, and p+(u,k) counts the number of paths of length k from u. Again, by the basic principles of combinatorics,

  P−(u; x) = 0 if u = s, and P−(u; x) = 1 + x · Σ_{v: vRu} P−(v; x) otherwise,

and

  P+(u; x) = 0 if u = t, and P+(u; x) = 1 + x · Σ_{v: uRv} P+(v; x) otherwise,

and both families can be determined using the definitions and computing the polynomials in the (reverse for P+) topological ordering of U. The complexity of this procedure is at most O(hm). Finally,

  L−(u) = P−(u; 1)  and  L+(v) = P+(v; 1).

In real-life citation networks the depth h is relatively small, as can be seen from Table 1. The complexity of this approach is higher than the complexity of the method proposed in subsection 3.3 – but we get more detailed information about paths. Maybe it would make sense to consider 'aging' of references by L−(u) = P−(u; α), for a selected α, 0 < α ≤ 1.

3.5 Vertex weights

The quantities used to compute the arc weights w can be used also to define the corresponding vertex weights t:

  t_d(u) = |Rinv⋆(u)| · |R⋆(u)|
  t_c(u) = N−(u) · N+(u)
  t_l(u) = N′−(u) · N′+(u)
  t_p(u) = L−(u) · L+(u)

They count the number of paths of the selected type through the vertex u.

3.6 Implementation details

In our first implementation of the SPNP method the values of L−(u) and L+(u) for some large networks (Zewail and Lederberg) exceeded the range of Delphi's LargeInt (20 decimal places). We decided to use the Extended real numbers (range = 3.6×10⁻⁴⁹⁵¹ .. 1.1×10⁴⁹³², 19-20 significant digits) for the counters. This range is safe also for very large citation networks. To see this, let us denote N∗(k) = max_{u: h(u)=k} N−(u). Note that h(s) = 0 and uRv ⇒ h(u) < h(v). Let u∗ ∈ U be a unit on which the maximum is attained, N∗(k) = N−(u∗). Then

  N∗(k) = Σ_{v: vRu∗} N−(v) ≤ Σ_{v: vRu∗} N∗(h(v)) ≤ Σ_{v: vRu∗} N∗(k−1) = deg_in(u∗) · N∗(k−1) ≤ ∆_in(k) · N∗(k−1),

where ∆_in(k) is the maximal input
degree at depth k. Therefore N∗(h) ≤ ∏_{k=1}^{h} ∆_in(k) ≤ ∆_in^h. A similar inequality holds also for N+(u). From both it follows that

  N(u,v) ≤ ∆_in^{h(u)} · ∆_out^{h−(v)} ≤ ∆^{H−1},

where H = h(t) and ∆ = max(∆_in, ∆_out). Therefore for H ≤ 1000 and ∆ ≤ 10000 we get

  N(u,v) ≤ ∆^{H−1} ≤ 10^{4000},

which is still in the range of Extended reals. Note also that in the derivation of this inequality we were very generous – in real-life networks N(u,v) will be much smaller than ∆^{H−1}.

Very large/small numbers that result as weights in large networks are not easy to use. One possibility to overcome this problem is to use the logarithms of the obtained weights – the logarithmic transformation is monotone and therefore preserves the ordering of weights (importance of vertices and arcs). The transformed values are also more convenient for visualization with line thickness of arcs.

4 Properties of weights

4.1 General properties of weights

Directly from the definitions of weights we get
form w′_s(u,v) = (1/n) · w_s(u,v).

4.3 SPC weights

For the flow N(u,v) Kirchhoff's node law holds: for every node v in a citation network in standard form it holds

incoming flow = outgoing flow = t_c(v)

Proof:

∑_{x: xRv} N(x,v) = ∑_{x: xRv} N⁻(x) · N⁺(v) = (∑_{x: xRv} N⁻(x)) · N⁺(v) = N⁻(v) · N⁺(v)

∑_{y: vRy} N(v,y) = ∑_{y: vRy} N⁻(v) · N⁺(y) = N⁻(v) · (∑_{y: vRy} N⁺(y)) = N⁻(v) · N⁺(v)  □

From Kirchhoff's node law it follows that the total flow through the citation network equals N(t,s). This gives us a natural way to normalize the weights:

w(u,v) = N(u,v) / N(t,s)

Figure 2: Preprint transformation

But new problems arise: What is the right value of the 'aging' factor? Is there an efficient algorithm to count the restricted trails? The other possibility, since a citation network is usually almost acyclic, is to transform it into an acyclic network

• by identification (shrinking) of cyclic groups (nontrivial strong components), or
• by deleting some arcs, or
• by transformations such as the 'preprint' transformation (see Figure 2), which is based on the following idea: each paper from a strong component is duplicated with its 'preprint' version. The papers inside the strong component cite preprints.

Large strong components in a citation network are unlikely – their presence usually indicates an error in the data. An exception from this rule is the citation network of the High Energy Particle Physics literature [20] from arXiv. In it, different versions of the same paper are treated as a unit. This leads to large strongly connected components. The idea of the preprint transformation can be used also in this case to eliminate cycles.

6 First Example: SOM citation network

The purpose of this example is not the analysis of the selected citation network on SOM (self-organizing maps) literature [12, 24, 23], but to present typical steps and results in citation network analysis. We made our analysis using the program Pajek. First we test the network for acyclicity. Since in the SOM network there are 11 nontrivial strong components of size 2, see Table 1, we have to transform the network into an acyclic one. We decided to do this by shrinking each
component into a single vertex. This operation produces some loops that should be removed.

Figure 3: Main path and CPM path in the SOM network with SPC weights

Now we can compute the citation weights. We selected the SPC (search path count) method. It returns the following results: the network with citation weights on arcs, the main path network, and the vector with vertex weights.

In a citation network, a main path (sub)network is constructed starting from the source vertex and selecting at each step, in the end vertex/vertices, the arc(s) with the highest weight, until a sink vertex is reached. Another possibility is to apply to the network N = (U, R, w) the critical path method (CPM) from operations research.

First we draw the main path network. The arc weights are represented by the thickness of arcs. To produce a nice picture of it we apply Pajek's macro Layers, which contains a sequence of operations for determining a layered layout of an acyclic network (used also in the analysis of genealogies represented by p-graphs). Some experiments with the settings of different options are needed to obtain the right picture, see the left part of Figure 3. In its right part the CPM path is presented. We see that the upper parts of both paths are identical, but they differ in the continuation.

Table 2: 15 Hubs and Authorities

Rank  Hub (weight)                          Authority (weight)
1     CLARK-JW-1991-V36-P1259 (0.06366)     HOPFIELD-JJ-1982-V79-P2554 (0.33427)
3     HUANG-SH-1994-V17-P212 (0.05721)      KOHONEN-T-1990-V78-P1464 (0.12398)
5     SHUBNIKOV-EI-1997-V64-P989 (0.05496)  GARDNER-E-1988-V21-P257 (0.09353)
7     VEMURI-V-1993-V36-P203 (0.05409)      MCELIECE-RJ-1987-V33-P461 (0.07656)
9     BUSCEMA-M-1998-V33-P17 (0.05258)      RUMELHART-DE-1985-V9-P75 (0.07271)
11    WELLS-DM-1998-V41-P173 (0.05233)      ANDERSON-JA-1977-V84-P413 (0.07033)
13    SMITH-KA-1999-V11-P15 (0.05149)       KOSKO-B-1987-V26-P4947 (0.05802)
15    KOHONEN-T-1990-V78-P1464              GROSSBERG-S-1987-V11-P23
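The SPC computation selected above can be sketched in a few generic lines (a toy acyclic network with made-up vertex names, not the SOM data and not Pajek code): two passes over a topological order count the source-to-u and u-to-sink paths, the flow through an arc is the product of the two counts, and a main path is read off by repeatedly following a heaviest outgoing arc (ties broken by list order in this sketch).

```python
from collections import defaultdict

# Toy acyclic citation network in standard form: single source "s",
# single sink "t". Vertices, arcs and the topological order are
# illustrative only.
arcs = [("s", "a"), ("s", "b"), ("a", "c"), ("b", "c"),
        ("c", "d"), ("c", "t"), ("d", "e"), ("d", "t"), ("e", "t")]

succ, pred = defaultdict(list), defaultdict(list)
for u, v in arcs:
    succ[u].append(v)
    pred[v].append(u)

order = ["s", "a", "b", "c", "d", "e", "t"]  # a topological order

# N_minus[u]: number of s-u paths; N_plus[u]: number of u-t paths.
N_minus = {u: (1 if u == "s" else 0) for u in order}
for u in order[1:]:
    N_minus[u] = sum(N_minus[p] for p in pred[u])
N_plus = {u: (1 if u == "t" else 0) for u in order}
for u in reversed(order[:-1]):
    N_plus[u] = sum(N_plus[q] for q in succ[u])

# Flow through an arc, normalized by the total flow (all s-t paths).
N = {(u, v): N_minus[u] * N_plus[v] for u, v in arcs}
total = N_minus["t"]          # equals N_plus["s"]
w = {arc: n / total for arc, n in N.items()}

# Kirchhoff's node law at the internal vertex "c": inflow == outflow.
assert sum(N[(p, "c")] for p in pred["c"]) == \
       sum(N[("c", q)] for q in succ["c"]) == N_minus["c"] * N_plus["c"]

# Main path: from the source repeatedly follow a heaviest outgoing arc.
path, u = ["s"], "s"
while succ[u]:
    u = max(succ[u], key=lambda v, u=u: w[(u, v)])
    path.append(u)
```

On this toy network the heaviest arc (c, d) carries 4 of the 6 source-sink paths, and the greedy walk passes through it.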
The arcs in the CPM path are thicker. We could display also the complete SOM network using essentially the same procedure as for displaying the main path. But the obtained picture would be too complicated (too many vertices and arcs). We have to identify some simpler and important subnetworks inside it.

Inspecting the distribution of the values of weights on arcs (lines) we select a threshold 0.007 and determine the corresponding arc-cut – delete all arcs with weights lower than the selected threshold and afterwards delete also all isolated vertices (degree = 0). Now we are ready to draw the reduced network. We first produce an automatic layout. We notice some small unimportant components. We preserve only the large main component, draw it, and improve the obtained layout manually. To preserve the level structure we use the option that allows only the horizontal movement of vertices. Finally we label the 'most important vertices' with their labels. A vertex is considered important if it is an endpoint of an arc with a weight above the selected threshold (in our case 0.05). The obtained picture of the SOM 'main subnetwork' is presented in Figure 4. We see that the SOM field evolved in two main branches. From CARPENTER-1987 the strongest (main path) arc is leading to the right branch that after some steps disappears. The left, more vital branch is detected by the CPM path. Further investigation of this is left to readers with additional knowledge about the SOM field.

As complementary information we can determine Kleinberg's hubs and authorities vertex weights [17]. Papers that are cited by many other papers are called authorities; papers that cite many other documents are called hubs. Good authorities are those that are cited by good hubs, and good hubs cite good authorities.

Figure 4: Main subnetwork at level 0.007

The 15 highest ranked hubs and authorities are presented in Table 2. We see that the main authorities are located in the eighties and the main hubs in the nineties.
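Kleinberg's mutually reinforcing definition can be illustrated with a plain power iteration (toy citation data with hypothetical paper names, not the SOM network and not Pajek's implementation): a paper's authority score sums the hub scores of the papers citing it, its hub score sums the authority scores of the papers it cites, and both vectors are normalized after each step.

```python
# Minimal HITS sketch: hubs cite good authorities, authorities are
# cited by good hubs. All names and links below are made up.
cites = {            # paper -> papers it cites
    "p1": ["a1", "a2"],
    "p2": ["a1"],
    "p3": ["a1", "a2"],
    "a1": [],
    "a2": [],
}
papers = list(cites)
hub = {p: 1.0 for p in papers}
auth = {p: 1.0 for p in papers}

for _ in range(50):  # power iteration with Euclidean normalization
    auth = {p: sum(hub[q] for q in papers if p in cites[q]) for p in papers}
    norm = sum(v * v for v in auth.values()) ** 0.5
    auth = {p: v / norm for p, v in auth.items()}
    hub = {p: sum(auth[q] for q in cites[p]) for p in papers}
    norm = sum(v * v for v in hub.values()) ** 0.5
    hub = {p: v / norm for p, v in hub.items()}

best_auth = max(auth, key=auth.get)  # a1: cited by all three citing papers
best_hub = max(hub, key=hub.get)     # p1 or p3 (tie): each cites both authorities
```

Here `a1`, cited by all three citing papers, emerges as the top authority, while `p1` and `p3`, which cite both authorities, tie as the best hubs.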
Note that, since we are using the relation uRv ≡ u is cited by v, we have to interchange the roles of hubs and authorities produced by Pajek. An elaboration of the hubs and authorities approach to the analysis of citation networks, complemented with visualization, can be found in Brandes and Willhalm (2002) [8].

7 Second Example: US patents

The network of US patents from 1963 to 1999 [21] is an example of a very large network (3774768 vertices and 16522438 arcs) that, using some special options in Pajek, can still be analyzed on a PC with at least 1 G of memory. The SPC weights are determined in a range of 1 minute. This shows that the proposed approach can be used also for very large networks. The obtained main path and CPM path are presented in Figure 5. Collecting from the United States Patent and Trademark Office [22] the basic data about the patents from both paths, see Tables 3–6, we see that they deal with 'liquid crystal displays'. But in this network there should be thousands of 'themes'. How to identify them? Using the arc weights we can define a theme as a connected small subnetwork of size in the interval k .. K (for example, between k = 1

Table 3: Patents on the liquid-crystal display – patent, author(s) and title (entries dated Mar 13, 1951 to Apr 12, 1977)

Table 4: Patents on the liquid-crystal display – patent, author(s) and title (entries dated Jun 14, 1977 to Sep 18, 1984)

Table 5: Patents on the liquid-crystal display – patent, author(s) and title (entries dated Sep 18, 1984 to Dec 29, 1992)

Table 6: Patents on the liquid-crystal display – patent, author(s) and title (entries dated Sep 7, 1993 to Dec 21, 1999)

Figure 6: Island size frequency distribution

Table 8: Some patents from the 'foam' island – patent, author(s) and title (entries dated Nov 29, 1977 to Sep 24, 1996)

Table 9: Some patents from the 'fiber optics and bags' island – patent, author(s) and title (entries dated Jul 24, 1984 to Nov 15, 1994)

The subnetworks approach only filters out the structurally important subnetworks, thus providing a researcher with smaller manageable structures which can be further analyzed using more sophisticated and/or substantial methods.

9 Acknowledgments

The search path count algorithm was developed during my visit in Pittsburgh in 1991 and presented at the Network seminar [2]. It was presented to the broader audience at EASST'94 in Budapest [3]. In 1997 it was included in the program Pajek [4]. The 'preprint' transformation was developed as a part of the contribution for the Graph drawing contest 2001 [5]. The algorithm for the path length counts was developed in August 2002 and the Islands algorithm in August 2003.

The author would like to thank Patrick Doreian and Norm Hummon for introducing him into the field of citation network analysis, Eugene Garfield for making available the data on real-life networks and providing some relevant references, and Andrej Mrvar and Matjaž
Zaveršnik for implementing the algorithms in Pajek.

This work was supported by the Ministry of Education, Science and Sport of Slovenia, Project 0512-0101.

References

[1] Asimov I.: The Genetic Code, New American Library, New York, 1963.
[2] Batagelj V.: Some Mathematics of Network Analysis. Network Seminar, Department of Sociology, University of Pittsburgh, January 21, 1991.
[3] Batagelj V.: An Efficient Algorithm for Citation Networks Analysis. Paper presented at EASST'94, Budapest, Hungary, August 28-31, 1994.
[4] Batagelj V., Mrvar A.: Pajek – program for analysis and visualization of large networks. http://vlado.fmf.uni-lj.si/pub/networks/pajek/ ; http://vlado.fmf.uni-lj.si/pub/networks/pajek/howto/extreme.htm
[5] Batagelj V., Mrvar A.: Graph Drawing Contest 2001 Layouts. http://vlado.fmf.uni-lj.si/pub/GD/GD01.htm
[6] Batagelj V., Zaveršnik M.: Generalized Cores. Submitted, 2002. /abs/cs.DS/0202039
[7] Batagelj V., Zaveršnik M.: Islands – identifying themes in large networks. In preparation, August 2003.
[8] Brandes U., Willhalm T.: Visualization of bibliographic networks with a reshaped landscape metaphor. Joint Eurographics – IEEE TCVG Symposium on Visualization, D. Ebert, P. Brunet, I. Navazo (Editors), 2002. http://algo.fmi.uni-passau.de/~brandes/publications/bw-vbnrl-02.pdf
[9] Cormen T.H., Leiserson C.E., Rivest R.L., Stein C.: Introduction to Algorithms, Second Edition. MIT Press, 2001.
[10] Garfield E., Sher I.H., and Torpie R.J.: The Use of Citation Data in Writing the History of Science. Philadelphia: The Institute for Scientific Information, December 1964. /papers/useofcitdatawritinghistofsci.pdf
[11] Garfield E.: From Computational Linguistics to Algorithmic Historiography, paper presented at the Symposium in Honor of Casimir Borkowski at the University of Pittsburgh School of Information Sciences, September 19, 2001. /papers/pittsburgh92001.pdf
[12] Garfield E., Pudovkin A.I., Istomin V.S.: Histcomp – (compiled Historiography program). /histcomp/guide.html ; /histcomp/index.html
[13] Garner R.: A computer oriented, graph theoretic analysis of citation index structures. Flood B. (Editor), Three Drexel information science studies, Philadelphia: Drexel University Press, 1967. /rgarner.pdf
[14] Hummon N.P., Doreian P.: Connectivity in a Citation Network: The Development of DNA Theory. Social Networks, 11 (1989) 39-63.
[15] Hummon N.P., Doreian P.: Computational Methods for Social Network Analysis. Social Networks, 12 (1990) 273-288.
[16] Hummon N.P., Doreian P., Freeman L.C.: Analyzing the Structure of the Centrality-Productivity Literature Created Between 1948 and 1979. Knowledge: Creation, Diffusion, Utilization, 11 (1990) 4, 459-480.
[17] Kleinberg J.: Authoritative sources in a hyperlinked environment. In Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998, p. 668-677. /home/kleinber/auth.ps ; /kleinberg97authoritative.html
[18] Wilson R.J., Watkins J.J.: Graphs: An Introductory Approach. New York: John Wiley and Sons, 1990.
[19] Pajek's datasets – citation networks: http://vlado.fmf.uni-lj.si/pub/networks/data/cite/
[20] KDD Cup 2003: /projects/kddcup/index.html/
[21] Hall B.H., Jaffe A.B. and Trajtenberg M.: The NBER U.S. Patent Citations Data File. NBER Working Paper 8498 (2001). /patents/
[22] The United States Patent and Trademark Office. /netahtml/srchnum.htm
[23] Bibliography on the Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ). a.de/bibliography/Neural/SOM.LVQ.html
[24] Neural Networks Research Centre: Bibliography of SOM papers. http://www.cis.hut.fi/research/refs/

AVL PUMA OPEN Manual


HEAVY DUTY ENGINE EMISSIONS
With dynamic testbeds for the engines of utility vehicles, the focus is on strict compliance with the legislation despite the flexible use of advanced functions and additional measurement methods for engine development. AVL's packages guarantee the correct test procedure and display the results in a legally compliant and clear manner after the test has been completed.
• specialized interfaces like ASAM-ODS and ASAM-CEA
• Graphical formula editor for crank-angle- and time-based calculations
• Full programming environment for user-specific formulas,
LIGHT DUTY ENGINE EMISSIONS
Increases in efficiency in the engine development process can be achieved by making use of the benefits of dynamic engine testbeds. It is essential in this area to comply with the basic conditions for exhaust gas measurements specified in the corresponding guidelines. AVL's packages contain suitable methods for different measurement equipment (e.g. undiluted raw exhaust gas measurement or diluted CVS measurement).

Solution of the Dirac Equation using the Lanczos Algorithm


Performing the diagonalizations in equation 4 reduces to finding the roots of the following characteristic polynomial

det(Ĥ_n − x · 1̂) := (−1)^n w_1 … w_n p_{n+1}(x)    (10)

after each iteration step. The generated sequence of eigenpairs (e^λ_n, |e^λ_n⟩) possesses the following convergence properties [14]:

e^λ_n → E^λ  and  |e^λ_n⟩ → |E^λ⟩  as n → ∞,  λ = 1, 2, 3, …    (11)
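Numerically, the roots of such a tri-diagonal characteristic polynomial are commonly located through the standard three-term recurrence and bisection on sign changes; a generic sketch with illustrative matrix entries (not the Dirac matrix elements w_k):

```python
# Characteristic polynomial of a symmetric tri-diagonal matrix with
# diagonal a[0..n-1] and off-diagonal b[0..n-2], via the standard
# three-term recurrence; its roots are the eigenvalues.
def char_poly(a, b, x):
    p_prev, p = 1.0, a[0] - x
    for k in range(1, len(a)):
        p_prev, p = p, (a[k] - x) * p - b[k - 1] ** 2 * p_prev
    return p

def bisect_root(a, b, lo, hi, tol=1e-12):
    # Assumes char_poly changes sign on [lo, hi].
    flo = char_poly(a, b, lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if flo * char_poly(a, b, mid) <= 0:
            hi = mid
        else:
            lo, flo = mid, char_poly(a, b, mid)
    return 0.5 * (lo + hi)

# Illustrative 2x2 matrix [[2, 1], [1, 2]] with eigenvalues 1 and 3.
a, b = [2.0, 2.0], [1.0]
lam = bisect_root(a, b, 0.0, 2.0)  # bracket containing the smaller eigenvalue
```

For the 2×2 example the recurrence gives (2 − x)² − 1, whose roots 1 and 3 are recovered by bisecting any bracketing interval.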
arXiv:0706.2236v1 [math-ph] 15 Jun 2007
Department of Information Management, WuFeng Institute of Technology, Minsyung, Jiayee 621-53, Taiwan
Abstract
Convergent eigensolutions of the Dirac Equation for a relativistic electron in an external Coulomb potential are obtained using the Lanczos Algorithm. A tri-diagonal matrix representation of the Dirac Hamiltonian operator is constructed iteratively and diagonalized after each iteration step to form a sequence of convergent eigenvalue solutions. Any spurious solutions which arise from the presence of continuum states can easily be identified. PACS 03.65.Ge, 02.60.Lj
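The iterative tri-diagonalization described in the abstract can be illustrated by a textbook Lanczos sketch on a small real symmetric matrix (illustrative only; the paper applies the method to the Dirac Hamiltonian):

```python
# Textbook Lanczos iteration on a small real symmetric matrix H:
# builds orthonormal vectors q_k and tri-diagonal entries (alpha, beta)
# such that Q^T H Q is tri-diagonal.
def matvec(H, v):
    return [sum(Hij * vj for Hij, vj in zip(row, v)) for row in H]

def lanczos(H, q0, steps):
    alphas, betas = [], []
    q_prev = [0.0] * len(q0)
    nrm = sum(x * x for x in q0) ** 0.5
    q = [x / nrm for x in q0]
    beta = 0.0
    for _ in range(steps):
        w = matvec(H, q)
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        # Orthogonalize against the two previous Lanczos vectors.
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        alphas.append(alpha)
        beta = sum(x * x for x in w) ** 0.5
        if beta < 1e-12:      # invariant subspace found: stop early
            break
        betas.append(beta)
        q_prev, q = q, [x / beta for x in w]
    return alphas, betas

H = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
alphas, betas = lanczos(H, [1.0, 0.0, 0.0], steps=3)
```

Starting from the first basis vector, the iteration reproduces the matrix's own tri-diagonal entries (α = 2, 2, 2 and β = 1, 1) and terminates when the residual vanishes.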

Statistics Toolbox


An Introduction to Dataset Arrays (5:31) – Improve productivity with statistical arrays.

Statistics Toolbox
Perform statistical modeling and analysis

Statistics Toolbox™ provides algorithms and tools for organizing, analyzing, and modeling data. You can use regression or classification for predictive modeling, generate random numbers for Monte Carlo simulations, use statistical plots for exploratory data analysis, and perform hypothesis tests.

For analyzing multidimensional data, Statistics Toolbox includes algorithms that let you identify key variables that impact your model with sequential feature selection, transform your data with principal component analysis, apply regularization and shrinkage, or use partial least squares regression.

Statistics Toolbox includes specialized data types for organizing and accessing heterogeneous data. Dataset arrays store numeric data, text, and metadata in a single data container. Built-in methods enable you to merge datasets using a common key (join), calculate summary statistics on grouped data, and convert between tall and wide data representations.
Categorical arrays provide a memory-efficient data container for storing information drawn from a finite, discrete set of categories. Statistics Toolbox is included in MATLAB and Simulink Student Version.

Key Features

▪ Statistical arrays for storing heterogeneous and categorical data
▪ Regression techniques, including linear, nonlinear, robust, and ridge, and nonlinear mixed-effects models
▪ Classification algorithms, including boosted and bagged decision trees, k-Nearest Neighbor, and linear discriminant analysis
▪ Analysis of variance (ANOVA)
▪ Probability distributions, including copulas and Gaussian mixtures
▪ Random number generation
▪ Hypothesis testing
▪ Design of experiments and statistical process control

Data Organization and Management
Statistics Toolbox provides two specialized arrays for storing and managing statistical data: dataset arrays and categorical arrays.

Dataset Arrays
Dataset arrays enable convenient organization and analysis of heterogeneous statistical data and metadata. Dataset arrays provide columns to represent measured variables and rows to represent observations. With dataset arrays, you can:

▪ Store different types of data in a single container.
▪ Label rows and columns of data and reference that data using recognizable names.
▪ Display and edit data in an intuitive tabular format.
▪ Use metadata to define units, describe data, and store information.

An Introduction to Joins (5:01) – Combine fields from two dataset arrays using a variable that is present in both.

Dataset array displayed in the MATLAB Variable Editor. This dataset array includes a mixture of cell strings and numeric information, with selected columns available in the Plot Selector Tool.

Statistics Toolbox provides specialized functions to operate on dataset arrays.
With these specialized functions, you can:

▪ Merge datasets by combining fields using common keys.
▪ Export data into standard file formats, including Microsoft® Excel® and comma-separated value (CSV).
▪ Calculate summary statistics on grouped data.
▪ Convert data between tall and wide representations.

Categorical Arrays
Categorical arrays enable you to organize and process nominal and ordinal data that uses values from a finite set of discrete levels or categories. With categorical arrays, you can:

▪ Decrease memory footprint by replacing repetitive text strings with categorical labels.
▪ Store nominal data using descriptive labels, such as red, green, and blue for an unordered set of colors.
▪ Store ordinal data using descriptive labels, such as cold, warm, and hot for an ordered set of temperature measurements.
▪ Manipulate categorical data using familiar array operations and indexing methods.
▪ Create logical indexes based on categorical data.
▪ Group observations by category.

Exploratory Data Analysis
Statistics Toolbox provides multiple ways to explore data: statistical plotting with interactive graphics, algorithms for cluster analysis, and descriptive statistics for large datasets.

Statistical Plotting and Interactive Graphics
Statistics Toolbox includes graphs and charts to explore your data visually. The toolbox augments MATLAB® plot types with probability plots, box plots, histograms, scatter histograms, 3D histograms, control charts, and quantile-quantile plots.
The toolbox also includes specialized plots for multivariate analysis, including dendrograms, biplots, parallel coordinate charts, and Andrews plots.

Group scatter plot matrix showing interactions between five variables. Compact box plot with whiskers providing a five-number summary of a dataset. Scatter histogram using a combination of scatter plots and histograms to describe the relationship between variables. Plot comparing the empirical CDF for a sample from an extreme value distribution with a plot of the CDF for the sampling distribution.

Cluster Analysis
Statistics Toolbox offers multiple algorithms to analyze data using hierarchical clustering, k-means clustering, and Gaussian mixtures.

Two-component Gaussian mixture model fit to a mixture of bivariate Gaussians. Output from applying a clustering algorithm to the same example. Dendrogram plot showing a model with four clusters.

Cluster Analysis (Example) – Use k-means and hierarchical clustering to discover natural groupings in data.

Descriptive Statistics
Descriptive statistics enable you to understand and describe potentially large sets of data quickly. Statistics Toolbox includes functions for calculating:

▪ Measures of central tendency (measures of location), including average, median, and various means
▪ Measures of dispersion (measures of spread), including range, variance, standard deviation, and mean or median absolute deviation
▪ Linear and rank correlation (partial and full)
▪ Results based on data with missing values
▪ Percentile and quartile estimates
▪ Density estimates using a kernel-smoothing function

These functions help you summarize values in a data sample using a few highly relevant numbers. In some cases, estimating summary statistics using parametric methods is not possible.
To deal with these cases, Statistics Toolbox provides resampling techniques, including:

▪ Generalized bootstrap function for estimating sample statistics using resampling
▪ Jackknife function for estimating sample statistics using subsets of the data
▪ bootci function for estimating confidence intervals

(4:07) Develop a predictive model without specifying a function that describes the relationship between variables.

Regression, Classification, and ANOVA

Regression
With regression, you can model a continuous response variable as a function of one or more predictors. Statistics Toolbox offers a wide variety of regression algorithms, including:

▪ Linear regression
▪ Nonlinear regression
▪ Robust regression
▪ Logistic regression and other generalized linear models

Fitting with MATLAB: Statistics, Optimization, and Curve Fitting (Webinar) – Apply regression algorithms with MATLAB.

You can evaluate goodness of fit using a variety of metrics, including:

▪ R² and adjusted R²
▪ Cross-validated mean squared error
▪ Akaike information criterion (AIC) and Bayesian information criterion (BIC)

With the toolbox, you can calculate confidence intervals for both regression coefficients and predicted values.

Statistics Toolbox supports more advanced techniques to improve predictive accuracy when the dataset includes large numbers of correlated variables. The toolbox supports:

▪ Subset selection techniques, including sequential feature selection and stepwise regression
▪ Regularization methods, including ridge regression, lasso, and elastic net

Computational Statistics: Feature Selection, Regularization, and Shrinkage with MATLAB (Webinar) – Learn how to generate accurate fits in the presence of correlated data.

Statistics Toolbox also supports nonparametric regression techniques for generating an accurate fit without specifying a model that describes the relationship between the predictor and the response.
Nonparametric regression techniques include decision trees as well as boosted and bagged regression trees. Additionally, Statistics Toolbox supports nonlinear mixed-effect (NLME) models in which some parameters of a nonlinear function vary across individuals or groups.

An Introduction to Classification (9:00) – Develop predictive models for classifying data.

Nonlinear mixed-effects model of drug absorption and elimination showing intrasubject concentration-versus-time profiles. The nlmefit function in Statistics Toolbox generates a population model using fixed and random effects.

Classification
Classification algorithms enable you to model a categorical response variable as a function of one or more predictors. Statistics Toolbox offers a wide variety of parametric and nonparametric classification algorithms, such as:

▪ Boosted and bagged classification trees, including AdaBoost, LogitBoost, GentleBoost, and RobustBoost
▪ Naïve Bayes classification
▪ k-Nearest Neighbor (kNN) classification
▪ Linear discriminant analysis

You can evaluate goodness of fit for the resulting classification models using techniques such as:

▪ Cross-validated loss
▪ Confusion matrices
▪ Performance curves/receiver operating characteristic (ROC) curves

ANOVA
Analysis of variance (ANOVA) enables you to assign sample variance to different sources and determine whether the variation arises within or among different population groups. Statistics Toolbox includes these ANOVA algorithms and related techniques:

▪ One-way ANOVA
Typical applications include:▪Transforming correlated data into a set of uncorrelated components using rotation and centering (principal component analysis)▪Exploring relationships between variables using visualization techniques, such as scatter plot matrices and classical multidimensional scaling▪Segmenting data with cluster analysisFitting an Orthogonal Regression Using Principal Component Analysis(Example)Implement Deming regression (total least squares).Feature TransformationFeature transformation techniques enable dimensionality reduction when transformed features can be more easily ordered than original features. Statistics Toolbox offers three classes of feature transformation algorithms:▪Principal component analysis for summarizing data in fewer dimensions▪Nonnegative matrix factorization when model terms must represent nonnegative quantities▪Factor analysis for building explanatory models of data correlationPartial Least Squares Regression and Principal Component Regression(Example)Model a response variable in the presence of highly correlated predictors.Multivariate VisualizationStatistics Toolbox provides graphs and charts to explore multivariate data visually, including:▪Scatter plot matrices▪Dendograms▪Biplots▪Parallel coordinate charts▪Andrews plots▪Glyph plotsGroup scatter plot matrix showing how model year impacts different variables.Biplot showing the first three loadings from a principal component analysis.Andrews plot showing how country of original impacts the variables.Cluster AnalysisStatistics Toolbox offers multiple algorithms for cluster analysis, including:▪Hierarchical clustering, which creates an agglomerative cluster typically represented as a tree.▪K-means clustering, which assigns data points to the cluster with the closest mean.▪Gaussian mixtures, which are formed by combining multivariate normal density components. 
Clusters are assigned by selecting the component that maximizes posterior probability.

Two-component Gaussian mixture model fit to a mixture of bivariate Gaussians. Output from applying a clustering algorithm to the same example. Dendrogram plot showing a model with four clusters.

Probability Distributions
Statistics Toolbox provides functions and graphical tools to work with parametric and nonparametric probability distributions. With these tools, you can:

▪ Fit distributions to data.
▪ Use statistical plots to evaluate goodness of fit.
▪ Compute key functions such as probability density functions and cumulative distribution functions.
▪ Generate random and quasi-random number streams from probability distributions.

(8:15) Fit distributions to empirical data, and visually explore the effects of changing parameters on the shape of a distribution.

The Distribution Fitting Tool in the toolbox enables you to fit data using predefined univariate probability distributions, a nonparametric (kernel-smoothing) estimator, or a custom distribution that you define. This tool supports both complete data and censored (reliability) data. You can exclude data, save and load sessions, and generate MATLAB code.

Visual plot of distribution data (left) and summary statistics (right). Using the Distribution Fitting Tool, you can estimate a normal distribution with mean and variance values (16.9 and 8.7, respectively, in this example).

You can estimate distribution parameters at the command line or construct probability distributions that correspond to the governing parameters. Additionally, you can create multivariate probability distributions, including Gaussian mixtures and multivariate normal, multivariate t, and Wishart distributions.
You can use copulas to create multivariate distributions by joining arbitrary marginal distributions using correlation structures. See the complete list of supported distributions.

Simulating Dependent Random Numbers Using Copulas (Example) – Create distributions that model correlated multivariate data.

With the toolbox, you can specify custom distributions and fit these distributions using maximum likelihood estimation.

Fitting Custom Univariate Distributions (Example) – Perform maximum likelihood estimation on truncated, weighted, or bimodal data.

Evaluating Goodness of Fit
Statistics Toolbox provides statistical plots to evaluate how well a dataset matches a specific distribution. The toolbox includes probability plots for a variety of standard distributions, including normal, exponential, extreme value, lognormal, Rayleigh, and Weibull. You can generate probability plots from complete datasets and censored datasets. Additionally, you can use quantile-quantile plots to evaluate how well a given distribution matches a standard normal distribution.

Statistics Toolbox also provides hypothesis tests to determine whether a dataset is consistent with different probability distributions. Specific tests include:

▪ Chi-square goodness-of-fit tests
▪ One-sided and two-sided Kolmogorov-Smirnov tests
▪ Lilliefors tests
▪ Ansari-Bradley tests
▪ Jarque-Bera tests

Analyzing Probability Distributions
Statistics Toolbox provides functions for analyzing probability distributions, including:

▪ Probability density functions
▪ Cumulative density functions
▪ Inverse cumulative density functions
▪ Negative log-likelihood functions

Generating Random Numbers
Statistics Toolbox provides functions for generating pseudo-random and quasi-random number streams from probability distributions.
You can generate random numbers from either a fitted or constructed probability distribution by applying the random method.

MATLAB code for constructing a Poisson distribution with a specific mean and generating a vector of random numbers that match the distribution.

Statistics Toolbox also provides functions for:

▪ Generating random samples from multivariate distributions, such as t, normal, copulas, and Wishart
▪ Sampling from finite populations
▪ Performing Latin hypercube sampling
▪ Generating samples from Pearson and Johnson systems of distributions

You can also generate quasi-random number streams. Quasi-random number streams produce highly uniform samples from the unit hypercube and can often accelerate Monte Carlo simulations because fewer samples are required to achieve complete coverage.

Hypothesis Testing
Random variation can make it difficult to determine whether samples taken under different conditions are different. Hypothesis testing is an effective tool for analyzing whether sample-to-sample differences are significant and require further evaluation or are consistent with random and expected data variation. Statistics Toolbox supports widely used parametric and nonparametric hypothesis testing procedures, including:

▪ One-sample and two-sample t-tests
▪ Nonparametric tests for one sample, paired samples, and two independent samples
▪ Distribution tests (Chi-square, Jarque-Bera, Lilliefors, and Kolmogorov-Smirnov)
▪ Comparison of distributions (two-sample Kolmogorov-Smirnov)
▪ Tests for autocorrelation and randomness
▪ Linear hypothesis tests on regression coefficients

Selecting a Sample Size (Example) – Calculate the sample size necessary for a hypothesis test.

Design of Experiments and Statistical Process Control

Design of Experiments
Functions for design of experiments (DOE) enable you to create and test practical plans to gather data for statistical modeling.
These plans show how to manipulate data inputs in tandem to generate information about their effect on data outputs. Supported design types include:
▪ Full factorial
▪ Fractional factorial
▪ Response surface (central composite and Box-Behnken)
▪ D-optimal
▪ Latin hypercube

You can use Statistics Toolbox to define, analyze, and visualize a customized DOE. For example, you can estimate input effects and input interactions using ANOVA, linear regression, and response surface modeling, then visualize results through main effect plots, interaction plots, and multi-vari charts.

[Figure: Fitting a decision tree to data. The fitting capabilities in Statistics Toolbox enable you to visualize a decision tree by drawing a diagram of the decision rule and group assignments.]

[Figure: Model of a chemical reaction for an experiment, built using the design-of-experiments (DOE) and surface-fitting capabilities of Statistics Toolbox.]

Statistical Process Control

Statistics Toolbox provides a set of functions that support statistical process control (SPC). These functions enable you to monitor and improve products or processes by evaluating process variability. With SPC functions, you can:
▪ Perform gage repeatability and reproducibility studies.
▪ Estimate process capability.
▪ Create control charts.
▪ Apply Western Electric and Nelson control rules to control chart data.

[Figure: Control charts showing process data and violations of Western Electric control rules. Statistics Toolbox provides a variety of control charts and control rules for monitoring and evaluating products or processes.]
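A full-factorial design enumerates every combination of factor levels, which makes it the easiest of the listed designs to sketch by hand. The following generic Python illustration is ours and is unrelated to the toolbox's own interface; the factor names and levels are made up:

```python
from itertools import product

def full_factorial(levels):
    """Enumerate every run of a full-factorial design.

    levels: dict mapping factor name -> list of its levels.
    Returns a list of dicts, one per experimental run.
    """
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[n] for n in names))]

# A 3 x 2 x 3 design: every combination of the three factors
design = full_factorial({
    "temperature": [25, 40, 70],
    "pH": [2.0, 6.7],
    "catalyst_mg": [2, 5, 10],
})
# 3 * 2 * 3 = 18 runs, each a dict of factor settings
```

Fractional-factorial and optimal designs then amount to choosing an informative subset of these runs rather than running all of them.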

Preparation of nanosized Mn3O4/SBA-15 catalyst for complete oxidation of low concentration EtOH in aqueous solution with H2O2

Yi-Fan Han *, Fengxi Chen, Kanaparthi Ramesh, Ziyi Zhong, Effendi Widjaja, Luwei Chen

Institute of Chemical and Engineering Sciences, 1 Pesek Road, Jurong Island 627833, Singapore

Received 11 May 2006; received in revised form 18 December 2006; accepted 29 May 2007. Available online 2 June 2007.

Abstract

A new heterogeneous Fenton-like system consisting of a nano-composite Mn3O4/SBA-15 catalyst has been developed for the complete oxidation of low concentration ethanol (100 ppm) by H2O2 in aqueous solution. A novel preparation method has been developed to synthesize nanoparticles of Mn3O4 by thermolysis of manganese(II) acetylacetonate on SBA-15. Mn3O4/SBA-15 was characterized by various techniques, including TEM, XRD, Raman spectroscopy and N2 adsorption isotherms. TEM images demonstrate that the Mn3O4 nanocrystals are located mainly inside the SBA-15 pores. The reaction rate for ethanol oxidation can be strongly affected by several factors, including reaction temperature, pH value, catalyst/solution ratio and concentration of ethanol. A plausible reaction mechanism has been proposed in order to explain the kinetic data. The rate of the reaction is thought to be associated with the concentration of intermediates (radicals: •OH, O2− and •HO2) derived from the decomposition of H2O2 during reaction. The complete oxidation of ethanol can be remarkably improved only when (i) the intermediates are stabilized, e.g. under strongly acidic conditions and at high temperature, or (ii) scavenging of those radicals is reduced, e.g. with a smaller amount of catalyst and a higher concentration of reactant. Nevertheless, the reactivity of the presented catalytic system is still lower compared with the conventional homogeneous Fenton process, Fe2+/H2O2. A possible reason is that the concentration of intermediates in the latter is relatively high.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Hydrogen
peroxide;Fenton catalyst;Complete oxidation of ethanol;Mn 3O 4/SBA-151.IntroductionRemediation of wastewater containing organic constitutes is of great importance because organic substances,such as benzene,phenol and other alcohols may impose toxic effects on human and animal anic effluents from pharmaceu-tical,chemical and petrochemical industry usually contaminate water system by dissolving into groundwater.Up to date,several processes have been developed for treating wastewater that contains toxic organic compounds,such as wet oxidation with or without solid catalysts [1–4],biological oxidation,supercritical oxidation and adsorption [5,6],etc.Among them,catalytic oxidation is a promising alternative,since it avoids the problem of the adsorbent regeneration in the adsorption process,decreases significantly the temperature and pressure in non-catalytic oxidation techniques [7].Generally,the disposalof wastewater containing low concentration organic pollutants (e.g.<100ppm)can be more costly through all aforementioned processes.Thus,catalytic oxidation found to be the most economical way for this purpose with considering its low cost and high efficiency.Currently,a Fenton reagent that consists of homogenous iron ions (Fe 2+)and hydrogen peroxide (H 2O 2)is an effective oxidant and widely applied for treating industrial effluents,especially at low concentrations in the range of 10À2to 10À3M organic compounds [8].However,several problems raised by the homogenous Fenton system are still unsolved,e.g.disposing the iron-containing waste sludge,limiting the pH range (2.0–5.0)of the aqueous solution,and importantly irreversible loss of activity of the reagent.To overcome these drawbacks raised from the homogenous Fenton system,since 1995,a heterogeneous Fenton reagent using metal ions exchanged zeolites,i.e.Fe/ZSM-5has proved to be an interesting alternative catalytic system for treating wastewater,and showed a comparable activity with the homogenous Fenton system 
[9].However,most reported heterogeneous Fenton reagents still need UV radiation during/locate/apcatbApplied Catalysis B:Environmental 76(2007)227–234*Corresponding author.Tel.:+6567963806.E-mail address:han_yi_fan@.sg (Y .-F.Han).0926-3373/$–see front matter #2007Elsevier B.V .All rights reserved.doi:10.1016/j.apcatb.2007.05.031oxidation of organic compounds.This might limit the application of homogeneous Fenton system.Exploring other heterogeneous catalytic system considering the above disadvantages,is still desirable for this purpose.Here,we present an alternative catalytic system for the complete oxidation of organic com-pounds in aqueous solution using supported manganese oxide as catalyst under mild conditions,which has rarely been addressed.Mn-containing oxide catalysts have been found to be very active for the catalytic wet oxidation of organic effluents (CWO)[10–14],which is operated at high air pressures(1–22MPa)and at high temperatures(423–643K)[15].On the other hand,manganese oxide,e.g.MnO2[16],is well known to be active for the decomposition of H2O2in aqueous solution to produce hydroxyl radical( OH),which is considered to be the most robust oxidant so far.The organic constitutes can be deeply oxidized by those radicals rapidly[17].The only by-product is H2O from decomposing H2O2.Therefore,H2O2is a suitable oxidant for treating the wastewater containing organic compounds.Due to the recent progress in the synthesis of H2O2 directly from H2and O2[18,19],H2O2is believed to be produced through more economical process in the coming future.So,the heterogeneous Fenton system is economically acceptable.In this study,nano-crystalline Mn3O4highly dispersed inside the mesoporous silica,SBA-15,has been prepared by thermolysis of organic manganese(II)acetylacetonate in air. We expect the unique mesoporous structure may provide add-itional function(confinement effect)to the catalytic reaction, i.e.occluding/entrapping large organic molecules inside pores. 
The catalyst as prepared has been examined for the complete oxidation of ethanol in aqueous solution with H2O2,or to say, wet peroxide oxidation.Ethanol was selected as a model organic compound because(i)it is one of the simplest organic compounds and can be easily analyzed,(ii)it has high solu-bility in water due to its strong hydrogen bond with water molecule and(iii)the structure of ethanol is quite stable and only changed through catalytic reaction.Presently,for thefirst time by using the Mn3O4/SBA-15catalyst,we investigated the peroxide ethanol oxidation affected by factors such as temperature,pH value,ratio of catalyst(g)and volume of solution(L),and concentration of ethanol in aqueous solution. In addition,plausible reaction mechanisms are established to explain the peroxidation of ethanol determined by the H2O2 decomposition.2.Experimental2.1.Preparation and characterization of Mn3O4/SBA-15 catalystSynthesis of SBA-15is similar to the previous reported method[20]by using Pluronic P123(BASF)surfactant as template and tetraethyl orthosilicate(TEOS,98%)as silica source.Manganese(II)acetylacetonate([CH3COCH C(O)CH3]2Mn,Aldrich)by a ratio of2.5mmol/gram(SBA-15)werefirst dissolved in acetone(C.P.)at room temperature, corresponding to ca.13wt.%of Mn3O4with respect to SBA-15.The preparation method in detail can be seen in our recent publications[21,22].X-ray diffraction profiles were obtained with a Bruker D8 diffractometer using Cu K a radiation(l=1.540589A˚).The diffraction pattern was taken in the Bragg angle(2u)range at low angles from0.68to58and at high angles from308to608at room temperature.The XRD patterns were obtained by scanning overnight with a step size:0.028per step,8s per step.The dispersive Raman microscope employed in this study was a JY Horiba LabRAM HR equipped with three laser sources(UV,visible and NIR),a confocal microscope,and a liquid nitrogen cooled charge-coupled device(CCD)multi-channel detector(256pixelsÂ1024pixels).The visible 514.5nm 
argon ion laser was selected to excite the Raman scattering.The laser power from the source is around20MW, but when it reached the samples,the laser output was reduced to around6–7MW after passing throughfiltering optics and microscope objective.A100Âobjective lens was used and the acquisition time for each Raman spectrum was approximately 60–120s depending on the sample.The Raman shift range acquired was in the range of50–1200cmÀ1with spectral resolution1.7–2cmÀ1.Adsorption and desorption isotherms were collected on Autosorb-6at77K.Prior to the measurement,all samples were degassed at573K until a stable vacuum of ca.5m Torr was reached.The pore size distribution curves were calculated from the adsorption branch using Barrett–Joyner–Halenda(BJH) method.The specific surface area was assessed using the BET method from adsorption data in a relative pressure range from 0.06to0.10.The total pore volume,V t,was assessed from the adsorbed amount of nitrogen at a relative pressure of0.99by converting it to the corresponding volume of liquid adsorbate. The conversion factor between the volume of gas and liquid adsorbate is0.0,015,468for N2at77K when they are expressed in cm3/g and cm3STP/g,respectively.The measurements of transmission electron microscopy (TEM)were performed at Tecnai TF20S-twin with Lorentz Lens.The samples were ultrasonically dispersed in ethanol solvent,and then dried over a carbon grid.2.2.Kinetic measurement and analysisThe experiment for the wet peroxide oxidation of ethanol was carried out in a glass batch reactor connected to a condenser with continuous stirring(400rpm).Typically,20ml of aqueous ethanol solution(initial concentration of ethanol: 100ppm)wasfirst taken in the round bottomflask(reactor) together with5mg of catalyst,corresponding to ca.1(g Mn)/30 (L)ratio of catalyst/solution.Then,1ml of30%H2O2solution was introduced into the reactor at different time intervals (0.5ml at$0min,0.25ml at32min and0.25ml at62min). 
The total molar ratio of H2O2/ethanol is about400/1. Hydrochloric acid(HCl,0.01M)was used to acidify the solution if necessary.NH4OH(0.1M)solution was used to adjust pH to9.0when investigating the effect of pH.The pH for the deionized water is ca.7.0(Oakton pH meter)and decreased to 6.7after adding ethanol.All the measurements wereY.-F.Han et al./Applied Catalysis B:Environmental76(2007)227–234 228performed under the similar conditions described above if without any special mention.For comparison,the reaction was also carried out with a typical homogenous Fenton reagent[17], FeSO4(5ppm)–H2O2,under the similar reaction conditions.The conversion of ethanol during reaction was detected using gas chromatography(GC:Agilent Technologies,6890N), equipped with HP-5capillary column connecting to a thermal conductive detector(TCD).There is no other species but ethanol determined in the reaction system as evidenced by the GC–MS. Ethanol is supposed to be completely oxidized into CO2and H2O.The variation of H2O2concentration during reaction was analyzed colorimetrically using a UV–vis spectrophotometer (Epp2000,StellarNet Inc.)after complexation with a TiOSO4/ H2SO4reagent[18].Note that there was almost no measurable leaching of Mn ion during reaction analyzed by ICP(Vista-Mpx, Varian).3.Results and discussion3.1.Characterization of Mn3O4/SBA-15catalystThe structure of as-synthesized Mn3O4inside SBA-15has beenfirst investigated with powder XRD(PXRD),and the profiles are shown in Fig.1.The profile at low angles(Fig.1a) suggests that SBA-15still has a high degree of hexagonal mesoscopic organization even after forming Mn3O4nanocrys-tals[23].Several peaks at high angles of XRD(Fig.1b)indicate the formation of a well-crystallized Mn3O4.All the major diffraction peaks can be assigned to hausmannite Mn3O4 structure(JCPDS80-0382).By N2adsorption measurements shown in Fig.2,the pore volume and specific surface areas(S BET)decrease from 1.27cm3/g and937m2/g for bare SBA-15to0.49cm3/g 
and 299m2/g for the Mn3O4/SBA-15,respectively.About7.7nm of mesoporous diameter for SBA-15decreases to ca.6.3nm for Mn3O4/SBA-15.The decrease of the mesopore dimension suggests the uniform coating of Mn3O4on the inner walls of SBA-15.This nano-composite was further characterized by TEM. Obviously,the SBA-15employed has typical p6mm hex-agonal morphology with the well-ordered1D array(Fig.3a). The average pore size of SBA-15is ca.8.0nm,which is very close to the value(ca.7.7nm)determined by N2adsorption. Along[001]orientation,Fig.3b shows that the some pores arefilled with Mn3O4nanocrystals.From the pore A to D marked in Fig.3b correspond to the pores from empty to partially and fullyfilled;while the features for the SBA-15 nanostructure remains even after forming Mn3O4nanocrys-tals.Nevertheless,further evidences for the location of Mn3O4inside the SBA-15channels are still undergoing in our group.Raman spectra obtained for Mn3O4/SBA-15is presented in Fig.4a.For comparison the Raman spectrum was also recorded for the bulk Mn3O4(97.0%,Aldrich)under the similar conditions(Fig.4b).For the bulk Mn3O4,the bands at310,365, 472and655cmÀ1correspond to the bending modes of Mn3O4, asymmetric stretch of Mn–O–Mn,symmetric stretch of Mn3O4Fig.1.XRD patterns of the bare SBA-15and the Mn3O4/SBA-15nano-composite catalyst.(a)At low angles:(A)Mn3O4/SBA-15,(B)SBA-15;and (b)at high angles of Mn3O4/SBA-15.Fig.2.N2adsorption–desorption isotherms:(!)SBA-15,(~)Mn3O4/SBA-15.Y.-F.Han et al./Applied Catalysis B:Environmental76(2007)227–234229groups,respectively [24–26].However,a downward shift ($D n 7cm À1)of the peaks accompanying with a broadening of the bands was observed for Mn 3O 4/SBA-15.For instance,the distinct feature at 655cm À1for the bulk Mn 3O 4shifted to 648cm À1for the nanocrystals.The Raman bands broadened and shifted were observed for the nanocrystals due to the effect of phonon confinement as suggested previously in the literature [27,28].Furthermore,a weak band at 940cm 
À1,which should associate with the stretch of terminal Mn O,is an indicative of the existence of the isolated Mn 3O 4group [26].The assignment of this unique band has been discussed in our previous publication [22].3.2.Kinetic study3.2.1.Blank testsUnder a typical reaction conditions,that is,20ml of 100ppm ethanol aqueous solution (pH 6.7)mixed with 1ml of 30%H 2O 2,at 343K,there is no conversion of ethanol was observed after running for 120min in the absence of catalyst or in the presence of bare SBA-15(5mg).Also,under the similar conditions in H 2O 2-free solution,ethanol was not converted for all blank tests even with Mn 3O 4/SBA-15catalyst (5mg)in the reactor.It suggests that a trace amount of oxygen dissolved in water or potential dissociation of adsorbed ethanol does not have any contribution to the conversion of ethanol under reaction conditions.To study the effect of low temperature evaporation of ethanol during reaction,we further examined the concentration of ethanol (100ppm)versus time at different temperatures in the absence of catalyst and H 2O 2.Loss of ca.5%ethanol was observed only at 363K after running for 120min.Hence,to avoid the loss of ethanol through evaporation at high temperatures,which may lead to a higher conversion of ethanol than the real value,the kinetic experiments in this study were performed at or below 343K.The results from blank tests confirm clearly that ethanol can be transformed only by catalytic oxidation during reaction.3.2.2.Effect of amount of catalystThe effect of amount of catalyst on ethanol oxidation is presented in Fig.5.Different amounts of catalyst ranging from 2to 10mg were taken for the same concentration of ethanol (100ppm)in aqueous solution under the standard conditions.It can be observed that the conversion of ethanol increases monotonically within 120min,reaching 15,20and 12%for 2,5and 10mg catalysts,respectively.On the other hand,Fig.5shows that the relative reaction rates (30min)decreased from 0.7to ca 
0.1mmol/g Mn min with the rise of catalyst amount from 2to 10mg.Apparently,more catalyst in the system may decrease the rate for ethanol peroxidation,and a proper ratio of catalyst (g)/solution (L)is required for acquiring a balance between the overall conversion of ethanol and reaction rate.In order to investigate the effects from other factors,5mg (catalyst)/20ml (solution),corresponding to 1(g Mn )/30(L)ratio of catalyst/solution,has been selected for the followedexperiments.Fig.4.Raman spectroscopy of the Mn 3O 4/SBA-15(a)and bulk Mn 3O 4(b).Fig.3.TEM images recorded along the [001]of SBA-15(a),Mn 3O 4/SBA-15(b):pore A unfilled with hexagonal structure,pores B and C partially filled and pore D completely filled.Y.-F .Han et al./Applied Catalysis B:Environmental 76(2007)227–2342303.2.3.Effect of temperatureAs shown in Fig.6,the reaction rate increases with increasing the reaction temperature.After 120min,the conversion of ethanol increases from 12.5to 20%when varying the temp-erature from 298to 343K.Further increasing the temperature was not performed in order to avoid the loss of ethanol by evaporation.Interestingly,the relative reaction rate increased with time within initial 60min at 298and 313K,but upward tendency was observed above 333K.3.2.4.Effect of pHIn the pH range from 2.0to 9.0,as illustrated in Fig.7,the reaction rate drops down with the rise of pH.It indicates that acidic environment,or to say,proton concentration ([H +])in the solution is essential for this reaction.With considering our target for this study:purifying water,pH approaching to 7.0in the reaction system is preferred.Because acidifying the solution with organic/inorganic acids may potentially causea second time pollution and result in surplus cost.Actually,there is almost no effect on ethanol conversion with changing pH from 5.5to 6.7in this system.It is really a merit comparing with the conventional homogenous Fenton system,by which the catalyst works only in the pH range of 
2.0–5.0.3.2.5.Effect of ethanol concentrationThe investigation of the effect of ethanol concentration on the reaction rate was carried out in the ethanol ranging from 50to 500ppm.The results in Fig.8show that the relative reaction rate increased from 0.07to 2.37mmol/g Mn min after 120min with increasing the concentration of ethanol from 50to 500ppm.It is worth to note that the pH value of the solution slightly decreased from 6.7to 6.5when raising the ethanol concentration from 100to 500ppm.paring to a typical homogenous Fenton reagent For comparison,under the similar reaction conditions ethanol oxidation was performed using aconventionalFig.5.The ethanol oxidation as a function of time with different amount of catalyst.Conversion of ethanol vs.time (solid line)on 2mg (&),5mg (*)and 10mg (~)Mn 3O 4/SBA-15catalyst,the relative reaction rate vs.time (dash line)on 2mg (&),5mg (*)and 10mg (~)Mn 3O 4/SBA-15catalyst.Rest conditions:20ml of ethanol (100ppm),1ml of 30%H 2O 2,708C and pH of6.7.Fig.6.The ethanol oxidation as a function of temperature.Conversion of ethanol vs.time (solid line)at 258C (&),408C (*),608C (~)and 708C (!),the relative reaction rate vs.time (dash line)at 258C (&),408C (*),608C (~)and 708C (5).Rest conditions:20ml of ethanol (100ppm),1ml of 30%H 2O 2,pH of 6.7,5mg ofcatalyst.Fig.7.The ethanol oxidation as a function of pH value.Conversion of ethanol vs.time (solid line)at pH value of 2.0(&),3.5(*),4.5(~),5.5(!),6.7(^)and 9.0("),the relative reaction rate vs.time (dash line)at pH value of 2.0(&),3.5(*),4.5(~),5.5(5),6.7(^)and 9.0(").Rest conditions:20ml of ethanol (100ppm),1ml of 30%H 2O 2,708C,5mg ofcatalyst.Fig.8.The ethanol oxidation as a function of ethanol concentration.Conver-sion of ethanol vs.time (solid line)for ethanol concentration (ppm)of 50(&),100(*),300(~),500(!),the relative reaction rate vs.time (dash line)for ethanol concentration (ppm)of 50(&),100(*),300(~),500(5).Condi-tions:20ml of ethanol,pH of 6.7,1ml of 30%H 2O 2,708C,5mg of 
catalyst.Y.-F .Han et al./Applied Catalysis B:Environmental 76(2007)227–234231homogenous reagent,Fe 2+(5ppm)–H 2O 2(1ml)at pH of 5.0.It has been reported to be an optimum condition for this system [17].As shown in Fig.9,the reaction in both catalytic systems exhibits a similar behavior,that is,the conversion of ethanol increases with extending the reaction time.Varying reaction temperature from 298to 343K seems not to impact the conversion of ethanol when using the homogenous Fenton reagent.Furthermore,the conversion of ethanol (defining at 120min)in the system of Mn 3O 4/SBA-15–H 2O 2is about 60%of that obtained from the conventional Fenton reagent.There are no other organic compounds observed in the reaction mixture other than ethanol suggesting that ethanol directly decomposing to CO 2and H 2O.3.2.7.Decomposition of H 2O 2In the aqueous solution,the capability of metal ions such as Fe 2+and Mn 2+has long been evidenced to be effective on the decomposition of H 2O 2to produce the hydroxyl radical ( OH),which is oxidant for the complete oxidation/degrading of organic compounds [9,17].Therefore,ethanol oxidation is supposed to be associated with H 2O 2decomposition.The investigation of H 2O 2decomposition has been performed under the reaction conditions (in an ethanol-free solution)with different amounts of catalyst.H 2O 2was introduced into the reaction system by three steps,initially 0.5ml followed by twice 0.25ml at 32and 62min,the pH of 6.7is set for all experiments except pH of 5.0for Fe 2+.As shown in Fig.10,H 2O 2was not converted in the absence of catalyst or presence of bare SBA-15(5mg);in contrast,by using the Mn 3O 4/SBA-15catalyst we observed that ca.Ninety percent of total H 2O 2was decomposed in the whole experiment.It can be concluded that that dissociation of H 2O 2is mainly caused by Mn 3O paratively,the rate of H 2O 2decomposition is relatively low with the homogenous Fenton reagent,total conversion of H 2O 2,was ca.50%after runningfor 
120 min. Considering that H2O2 decomposition can be significantly enhanced by raising the Fe2+ concentration, however, this does not seem to influence the rate of ethanol oxidation correspondingly. Similar behavior of H2O2 decomposition was also observed during ethanol oxidation. The rate of ethanol oxidation is lower for Mn3O4/SBA-15 compared with the conventional Fenton reagent; the possible reasons are discussed in the following section.

[Fig. 9. Comparison of ethanol oxidation in systems of a typical homogeneous Fenton catalyst (5 ppm of Fe2+, 20 ml of ethanol (100 ppm), 1 ml of 30% H2O2, pH of 5.0 acidified with HCl) at room temperature (~) and 70 °C (!), and the Mn3O4/SBA-15 catalyst (&) under conditions of 20 ml of ethanol (100 ppm), pH of 6.7, 1 ml of 30% H2O2, 70 °C, 5 mg of catalyst.]

[Fig. 10. An investigation of H2O2 decomposition under different conditions. One milliliter of 30% H2O2 was dropped into 20 ml of deionized water in three intervals, an initial 0.5 ml followed by twice 0.25 ml at 32 and 62 min. H2O2 concentration vs. time: by calculation (&), without catalyst (*), SBA-15 (~), 5 ppm of Fe2+ (!) and Mn3O4/SBA-15 (^). Rest conditions: 5 mg of solid catalyst, pH of 7.0 (5.0 for Fe2+), 70 °C.]

3.3. Plausible reaction mechanism for ethanol oxidation with H2O2

In general, the wet peroxide oxidation of organic constituents has been suggested to proceed via four steps [15]: activation of H2O2 to produce •OH, oxidation of organic compounds with •OH, recombination of •OH to form O2, and wet oxidation of organic compounds with O2. It can be further described by Eqs. (1)-(4):

H2O2 --(catalyst/temperature)--> 2 •OH  (1)
•OH + organic compounds --(temperature)--> products  (2)
2 •OH --(temperature)--> 1/2 O2 + H2O  (3)
O2 + organic compounds --(temperature/pressure)--> products  (4)

The reactive intermediates produced in step 1 (Eq. (1)) participate in the oxidation through step 2 (Eq. (2)). In fact, several kinds of radicals, including •OH, perhydroxyl radicals (•HO2) and superoxide anions (O2−), may be created during reaction. Previous studies [29-33] suggested that the process for producing radicals can be expressed by Eqs. (5)-(7) when H2O2 is catalytically decomposed by metal ions such as Fe and Mn:

S + H2O2 --> S+ + OH− + •OH  (5)
S+ + H2O2 --> S + •HO2 + H+  (6)
•HO2 <--> H+ + O2−  (7)

where S and S+ represent the reduced and oxidized metal ions. Both •HO2 and O2− are unstable and react further with H2O2 to form •OH through Eqs. (8) and (9):

•HO2 + H2O2 --> •OH + H2O + O2  (8)
O2− + H2O2 --> •OH + OH− + O2  (9)

Presently, the •OH radical has been suggested to be the main intermediate responsible for the oxidation/degradation of organic compounds. Therefore, the rate of ethanol oxidation in the studied system is supposed to depend on the concentration of •OH. Note that the oxidation may proceed via step four (Eq. (4)) in the presence of high-pressure O2, the so-called "wet oxidation", which usually occurs at air pressures of 1-22 MPa and at high temperatures of 423-643 K [15]. However, this is unlikely to happen under the present reaction conditions. Following Wolfenden's study [34], we envisage that the complete oxidation of ethanol may proceed through a route like Eq. (10):

C2H5OH + •OH --(-H2O)--> C2H4O --(•OH)--> CO2 + H2O  (10)

whereby it is believed that organic radicals bearing hydroxy groups alpha and beta to the carbon radical centre can eliminate water to form oxidizing species. With the organic intermediates being degraded step by step in the way described by Eq. (10), the final products should be CO2 and H2O. However, no species other than ethanol was detected by GC and GC-MS in the present study, possibly because the reaction is so rapid that the intermediates are unstable.

Fig. 5 indicates that a proper catalyst/solution ratio is necessary to attain a high conversion of ethanol. It can be understood that over-exposure of H2O2 to the catalyst will increase the rate of H2O2 decomposition; on the other hand, as the amount of catalyst increases, more of the •OH radicals produced may be scavenged by the catalyst and transformed into O2 and H2O as expressed in Eq. (3), instead of participating in the oxidation reaction. In terms of Eq. (11), the stoichiometric ethanol/H2O2 ratio should be 1/6 for the complete oxidation of ethanol; in the present system, however, the total molar ratio is 1/400. In other words, most intermediates were extinguished through scavenging during reaction. This may well explain the decrease of the reaction rate with the rise of the catalyst/solution ratio in the system. The same reason may also explain the decrease of the reaction rate with prolonged time. Actually, H2O2 decomposition (ca. 90%) may be completed within a few minutes over the Mn3O4/SBA-15 catalyst, as illustrated in Fig. 10, irrespective of the amount of catalyst (not shown for the sake of brevity); in contrast, the rate of H2O2 decomposition is sluggish for the Fe2+ catalyst. As a result, the homogeneous system presumably has a relatively high concentration of radicals. This may explain the superior reactivity of the conventional Fenton reagent over the presented system, as depicted in Fig. 9. Therefore, reducing scavenging, especially in the heterogeneous Fenton system [29], is crucial for enhancing the reaction rate.

C2H5OH + 6 H2O2 --> 2 CO2 + 9 H2O  (11)

On the other hand, as illustrated by Eqs. (1)-(4), all steps in the oxidation process are affected by the reaction temperature. Fig. 6 demonstrates that increasing the temperature remarkably boosts the reactivity of ethanol oxidation in the Mn3O4/SBA-15-H2O2 system, possibly due to the enhancement of the reactions in Eqs. (2) and (4) at elevated temperatures. In terms of Eqs. (6) and (7), acidic conditions may delay the H2O2 decomposition but enhance the formation of •OH (Eqs. (5), (8) and (9)). This "delay" is supposed to reduce the chance of radical scavenging and improve the efficiency of H2O2 in the reaction. Protons are believed to be capable of stabilizing H2O2, which has been well elucidated previously [18,19]. Consequently, it is understandable that the reaction is favored in a strongly acidic environment. Fig. 7 shows a maximum reactivity at pH 2.0 and the lowest at pH 9.0.

As depicted in Fig. 8, the reaction rate for ethanol oxidation is proportional to the concentration of ethanol in the range of 50-500 ppm. This suggests that at low ethanol concentration (100 ppm) most of the radicals might not take part in the reaction before being scavenged by the catalyst. With increasing ethanol concentration, the probability of collision between ethanol and radicals increases significantly; as a result, the relative rate of radical scavenging is reduced. Thus the faster rate observed at higher ethanol concentration is reasonable. Finally, it is noteworthy that, compared to bulk Mn3O4 (Aldrich, 98.0% purity), the reactivity of the nano-crystalline Mn3O4 on SBA-15 is increased by a factor of 20 under the same typical reaction conditions. Obviously, the Mn3O4 nanocrystal is an effective alternative for this catalytic system. The present study has shown that the unique structure of SBA-15 can act as a special "nanoreactor" for synthesizing Mn3O4 nanocrystals. Interestingly, a recent study has revealed that iron oxide nanoparticles can be immobilized on alumina-coated SBA-15, which also showed excellent performance as a Fenton catalyst [35]. However, the role of the pore structure of SBA-15 in this reaction is still unclear. We expect that during reaction SBA-15 may have the additional function of trapping larger organic molecules by adsorption; this may broaden its application in this field. Relevant study on the structure of nano-composites of various MnOx and their role in the Fenton-like reaction for remediation of organic compounds in aqueous solution is ongoing in our group.

4. Conclusions

In the present study, we have addressed a new catalytic system suitable for remediation of trivial organic compounds from contaminated water through a Fenton-like reaction with

Theil-Sen slope estimation


The Theil-Sen estimator, also known as Sen's slope estimator, is a non-parametric method for estimating the slope of a relationship between two variables. Unlike common alternatives such as ordinary least squares (OLS), the Theil-Sen estimator is less affected by outliers and is more resistant to violations of assumptions.

The Theil-Sen estimator is based on the concept of median pairwise slopes: it takes the median of the slopes computed over all possible pairs of data points. The basic steps involved in estimating the Theil-Sen slope are as follows:

1. Rank the data: Sort the data in ascending order according to the independent variable.
2. Compute the pairwise differences in the dependent variable: Calculate the differences between all pairs of data points in the dependent variable.
3. Compute the pairwise differences in the independent variable: Calculate the differences between all pairs of data points in the independent variable.
4. Calculate the median slope: For each pair, divide the difference in the dependent variable by the difference in the independent variable to obtain a slope, then find the median of all these slopes.

The Theil-Sen estimator has several advantages over other linear regression methods. First, it is robust to outliers, since it uses the median as a measure of central tendency instead of the mean; this avoids the influence of extreme values, which can strongly distort the slope estimate in OLS. Second, it does not require any distributional assumptions, making it very flexible for analyzing data with non-normal distributions. Moreover, it can be extended to handle multiple independent variables and is resistant to violations of the homoscedasticity assumption.

While the Theil-Sen estimator offers robustness and flexibility, it also has some limitations. One limitation is that it does not, by itself, provide an estimate of the intercept term, which can be important in certain situations.
However, this limitation can be overcome by estimating the intercept with a median-based method, for example the median of the residuals y − slope·x. Another limitation is that it can be computationally intensive for large numbers of data points, since the number of pairwise slopes grows quadratically; efficient algorithms designed for large data sets resolve this issue.

The Theil-Sen estimator has been widely applied because of its robustness and reliability: in the environmental sciences to estimate trends in climatic variables, in economics to analyze relationships between variables, in image processing for edge detection, and in finance to estimate the risk-return relationship of portfolios.

To summarize, the Theil-Sen estimator is a powerful non-parametric method for estimating the slope of a relationship between two variables. Its robustness to outliers and to violations of assumptions makes it a useful alternative to traditional linear regression methods. Despite some limitations, it has found many applications in different fields and remains an important tool for data analysis.
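As a concrete illustration, the pairwise-median procedure described above can be sketched in a few lines of Python. This is a minimal sketch for the simple one-variable case; the function name and the median-of-residuals intercept are our own illustrative choices, not part of any standard definition.

```python
from itertools import combinations
from statistics import median

def theil_sen(xs, ys):
    """Theil-Sen slope: the median of all pairwise slopes.

    The intercept is estimated separately, here as the median of the
    residuals y - slope * x (one common median-based choice).
    """
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
              if x2 != x1]  # skip pairs with equal x (undefined slope)
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept

# A single gross outlier barely moves the estimate:
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 2, 4, 6, 8, 100]        # last point is an outlier on y = 2x
print(theil_sen(xs, ys))          # → (2.0, 0.0)
```

An OLS fit on the same data would be pulled far away from slope 2 by the single outlier; the median of the 15 pairwise slopes is unaffected because only the 5 slopes involving the outlier are distorted.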

像素8邻域 (8-pixel neighborhood)

The English term for 像素8邻域 is "8-pixel neighborhood". It refers to the pixels surrounding a central pixel in an image. In digital image processing, the 8-pixel neighborhood is commonly used for operations such as edge detection, noise reduction, and image segmentation.

The 8-pixel neighborhood consists of the eight pixels adjacent to a central pixel in a 2-D image: the pixels to the north, south, east and west, and the four diagonal pixels. The concept is widely used in image processing algorithms to analyze local features and patterns within an image.

In image processing, the 8-pixel neighborhood appears in tasks such as image enhancement, feature extraction, and object recognition. In edge detection algorithms, for example, the gradients of intensity between the central pixel and its 8 neighbours are computed to identify edges and boundaries within the image. Similarly, in image segmentation, the 8-pixel neighborhood is used to group pixels with similar attributes together into distinct regions.

The 8-pixel neighborhood is also important in the context of image filtering and noise reduction. By considering the intensity values of the neighbouring pixels, various filtering techniques can be applied to remove noise and enhance the overall quality of the image. For example, the median filter, which replaces the central pixel with the median intensity value over the pixel and its 8-pixel neighborhood, is commonly used for noise reduction in digital images.
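To make the idea concrete, here is a small NumPy sketch (our own illustrative code, not taken from any particular library) that collects the 8-pixel neighborhood of a pixel and applies a median filter over the 3×3 window formed by the central pixel plus its 8 neighbours:

```python
import numpy as np

# Offsets of the 8 neighbours: N, S, E, W and the four diagonals.
OFFSETS_8 = [(-1, -1), (-1, 0), (-1, 1),
             ( 0, -1),          ( 0, 1),
             ( 1, -1), ( 1, 0), ( 1, 1)]

def neighbours_8(img, r, c):
    """Return the values of the 8 neighbours of pixel (r, c) inside img."""
    h, w = img.shape
    return [img[r + dr, c + dc] for dr, dc in OFFSETS_8
            if 0 <= r + dr < h and 0 <= c + dc < w]

def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 window
    (the pixel itself plus its 8-pixel neighborhood); borders are kept."""
    out = img.copy()
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            window = neighbours_8(img, r, c) + [img[r, c]]
            out[r, c] = np.median(window)
    return out

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 255                       # isolated "salt" noise pixel
print(median_filter_3x3(img)[2, 2])   # → 0: the noise spike is removed
```

Note how a corner pixel has only 3 valid neighbours and an edge pixel 5, which is why production filters must choose a border policy (clipping, padding, or leaving borders untouched as done here).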

Scientific literature (科学文献)

Output-Sensitive Algorithms for Computing Nearest-Neighbour Decision Boundaries⋆

David Bremner (1), Erik Demaine (2), Jeff Erickson (3), John Iacono (4), Stefan Langerman (5), Pat Morin (6), and Godfried Toussaint (7)

(1) Faculty of Computer Science, University of New Brunswick, bremner@unb.ca
(2) MIT Laboratory for Computer Science, edemaine@
(3) Computer Science Department, University of Illinois, jeffe@
(4) Polytechnic University, jiacono@
(5) Chargé de recherches du FNRS, Université Libre de Bruxelles, ngerman@ulb.ac.be
(6) School of Computer Science, Carleton University, morin@cs.carleton.ca
(7) School of Computer Science, McGill University, godfried@cs.mcgill.ca

Abstract. Given a set R of red points and a set B of blue points, the nearest-neighbour decision rule classifies a new point q as red (respectively, blue) if the closest point to q in R ∪ B comes from R (respectively, B). This rule implicitly partitions space into a red set and a blue set that are separated by a red-blue decision boundary. In this paper we develop output-sensitive algorithms for computing this decision boundary for point sets on the line and in R². Both algorithms run in time O(n log k), where k is the number of points that contribute to the decision boundary. This running time is the best possible when parameterizing with respect to n and k.

1 Introduction

Let S be a set of n points in the plane that is partitioned into a set of red points denoted by R and a set of blue points denoted by B. The nearest-neighbour decision rule classifies a new point q as the color of the closest point to q in S. The nearest-neighbour decision rule is popular in pattern recognition as a means of learning by example. For this reason, the set S is often referred to as a training set. Several properties make the nearest-neighbour decision rule quite attractive, including its intuitive simplicity and the theorem that the asymptotic error rate of the nearest-neighbour rule is bounded from above by twice the Bayes error rate [6,8,16]. (See [17] for an extensive survey of the nearest-neighbour 
de-cision rule and its relatives.)Furthermore,for point sets in small dimensions, there are efficient and practical algorithms for preprocessing a set S so that the nearest neighbour of a query point q can be found quickly.The nearest-neighbour decision rule implicitly partitions the plane into a red set and a blue set that meet at a red-blue decision boundary.One attractive as-pect of the nearest-neighbour decision rule is that it is often possible to reduce the size of the training set S without changing the decision boundary.To see this,consider the Vorono˘ıdiagram of S,which partitions the plane into convex (possibly unbounded)polygonal Vorono˘ıcells,where the Vorono˘ıcell of point p∈S is the set of all points that are closer to p than to any other point in S(see Figure1.a).If the Vorono˘ıcell of a red point r is completely surrounded by the Voronoi cells of other red points then the point r can be removed from S and this will not change the classification of any point in the plane(see Figure1.b). We say that these points do not contribute to the decision boundary,and the remaining points contribute to the decision boundary.(a)(b)Fig.1.The Vorono˘ıdiagram(a)before Vorono˘ıcondensing and(b)after Vorono˘ıcon-densing.Note that the decision boundary(in bold)is unaffected by Vorono˘ıcondensing. Note:In thisfigure,and all otherfigures,red points are denoted by white circles and blue points are denoted by black disks.The preceding discussion suggests that one approach to reducing the size of the training set S is to simply compute the Vorono˘ıdiagram of S and re-move any points of S whose Vorono˘ıcells are surrounded by Vorono˘ıcells of the same color.Indeed,this method is referred to as Vorono˘ıcondensing[18]. 
There are several O(n log n)time algorithms for computing the Vorono˘ıdiagram a set of points in the plane,so Vorono˘ıcondensing can be implemented to run in O(n log n)time.8However,in this paper we show that we can do significantly better when the number of points that contribute to the decision boundary is small.Indeed,we show how to do Vorono˘ıcondensing in O(n log k)time,where k is the number of points that contribute to the decision boundary(i.e.,the number of points of S that remain after Vorono˘ıcondensing).Algorithms,likethese,in which the size of the input and the size of the output play a role in the running time are referred to as output-sensitive algorithms.Readers familiar with the literature on output-sensitive convex hull algo-rithms may recognize the expression O(n log k)as the running time of optimal algorithms for computing convex hulls of n point sets with k extreme points, in2or3dimensions[2,4,5,13,19].This is no coincidence.Given a set of n points in R2,we can color them all red and add three blue points at infinity(see Figure2).In this set,the only points that contribute to the nearest-neighbour de-cision boundary are the three blue points and the red points on the convex hull of the original set.Thus,identifying the points that contribute to the nearest-neighbour decision boundary is at least as difficult as computing the extreme points of a set.Fig.2.The relationship between convex hulls and decision boundaries.Each vertex of the convex hull of R contributes to the decision boundary.Observe that,once the size of the training set has been reduced by Vorono˘ıcodensing,the condensed set can be preprocessed in O(k log k)time to answer nearest neighbour queries in O(log k)time per query.This makes it possible to do nearest-neighbour classifications in O(log k)time.Alternatively,the algo-rithm we describe for computing the nearest neighbour decision boundary ac-tually produces an explicit description of the boundary(of size O(k))that canbe 
preprocessed in O(k)time by Kirkpatrick’s point-location algorithm[12]to allow nearest neighbour classification in O(log k)time.The remainder of this paper is organized as follows:In Section2we describe an algorithm for computing the nearest-neighbour decision boundary of points on a line that runs in O(n log k)time.In Section3we present an algorithm for points in the plane that also runs in O(n log k)time.Finally,in Section4we summarize and conclude with open problems.2A1-Dimensional AlgorithmIn the1-dimensional version of the nearest-neighbour decision boundary prob-lem,the input set S consists of n real numbers.Imagine sorting S,so that S={s1,...,s n}where s i<s i+1for all1≤i<n.The decision boundary consists of all pairs(s i,s i+1)where s i is red and s i+1is blue,or vice-versa.Thus,this problem is solveable in linear-time if the points of S are sorted.Since sorting the elements of S can be done using any number of O(n log n)time sorting algo-rithms,this immediately implies an O(n log n)time algorithm.Next,we give an algorithm that runs in O(n log k)time and is similar in spirit to Hoare’s quicksort [11].Tofind the decision boundary in O(n log k)time,we begin by computing the median element m=s⌈n/2⌉in O(n)time using any one of the existing linear-time medianfinding algorithms(see[3]).Using an additional O(n)time,we split S into the sets S1={s1,...,s⌈n/2⌉−1}and S2={s⌈n/2⌉+1,...,s n}by comparing each element of S to the median element m.At the same time we alsofind s⌈n/2⌉−1and s⌈n/2⌉+1byfinding the maximum and minimum elements of S1and S2,respectively.We then check if(s⌈n/2⌉−1,m)and/or(m,s⌈n/2⌉+1) are part of the decision boundary and report them if necessary.At this point,a standard divide-and-conquer algorithm would recurse on both S1and S2to give an O(n log n)time algorithm.However,we can improve on this by observing that it is not necessary to recurse on a subproblem if it contains only elements of one color,since it will not contribute a pair to the de-cision 
boundary.Therefore,we recurse on each of S1and S2only if they contain at least one red element and one blue element.The correctness of the above algorithm is clear.To analyze its running time we observe that the running time is bounded by the recurrenceT(n,k)≤O(n)+T(n/2,l)+T(n/2,k−l),where l is the number of points that contribute to the decision boundary in S1 and where T(1,k)=O(1)and T(n,0)=O(n).An easy inductive argument that uses the concavity of the logarithm shows that this recurrence is maximized when l=k/2,in which case the recurrence solves to O(n log k)[5].Theorem1The nearest-neighbour decision boundary of a set of n real numbers can be computed in O(n log k)time,where k is the number of elements that con-tribute to the decision boundary.3A2-Dimensional AlgorithmIn the2-dimensional nearest-neighbour decision boundary problem the Vorono˘ıcells of S are(possibly unbounded)convex polygons and the goal is tofind all Vorono˘ıedges that bound two cells whose defining points have different col-ors.Throughout this section we will assume that the points of S are in general position so that no four points of S lie on a common circle.This assumption is not very restrictive,since general position can be simulated using infinitesmal perturbations of the input points.It will be more convenient to present our algorithm using the terminology of Delaunay triangulations.A Delaunay triangle in S is a triangle whose vertices (v1,v2,v3)are in S and such that the circle with v1,v2and v3on its boundary does not contain any point of S in its interior.A Delaunay triangulation of S is a partitioning of the convex hull of S into Delaunay triangles.Alternatively,a De-launay edge is a line segment whose vertices(v1,v2)are in S and such that there exists a circle with v1and v2on its boundary that does not contain any point of S in its interior.When S is in general position,the Delaunay triangulation of S is unique and contains all triangles whose edges are Delaunay edges(see[14]). 
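The 1-D divide-and-conquer algorithm of Section 2 can be sketched as follows. This is our own illustrative Python, not the authors' code: it uses `statistics.median_low` (an O(n log n) routine) in place of a linear-time selection algorithm, so only the recursion structure, with its pruning of monochromatic subproblems, matches the paper.

```python
import statistics

def decision_boundary_1d(points):
    """Return the boundary pairs (s_i, s_{i+1}) of adjacent, differently
    coloured points; points is a list of (value, colour) tuples with
    colour 'R' or 'B' and distinct values.

    Divide and conquer: split at the median, report the straddling pair
    if it is bichromatic, and recurse into a half only if it still
    contains both colours (a single-colour half contributes nothing).
    """
    out = []

    def rec(pts):
        colours = {c for _, c in pts}
        if len(pts) <= 1 or len(colours) < 2:
            return  # one colour only: no boundary pair inside
        m = statistics.median_low(v for v, _ in pts)
        left = [(v, c) for v, c in pts if v <= m]
        right = [(v, c) for v, c in pts if v > m]
        lmax, rmin = max(left), min(right)   # points adjacent to the split
        if lmax[1] != rmin[1]:
            out.append((lmax[0], rmin[0]))
        rec(left)
        rec(right)

    rec(points)
    return sorted(out)

pts = [(1, 'R'), (2, 'R'), (3, 'B'), (4, 'B'), (5, 'R')]
print(decision_boundary_1d(pts))  # → [(2, 3), (4, 5)]
```

The pruning step is what produces the O(n log k) behaviour in the paper's analysis: a recursion branch that contains only red or only blue values is cut off immediately at cost O(n) overall per level.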
It is well known that the Delaunay triangulation and the Voronoi diagram are dual in the sense that two points of S are joined by an edge in the Delaunay triangulation if and only if their Voronoi cells share an edge. We call a Delaunay triangle or Delaunay edge bichromatic if its set of defining vertices contains at least one red and at least one blue point of S. Thus, the problem of computing the nearest-neighbour decision boundary is equivalent to the problem of finding all bichromatic Delaunay edges.

3.1 The High Level Algorithm

In the next few sections, we will describe an algorithm that, given a value κ ≥ k, finds the set of all bichromatic Delaunay triangles in S in O((κ² + n) log κ) time, which for κ ≤ √n is O(n log κ). If κ > √n then we stop the entire algorithm and run an O(n log n) time algorithm to compute the entire Delaunay triangulation of S. The values of κ that we use are κ = 2^(2^i) for i = 0, 1, 2, ..., ⌈log log n⌉. Since the algorithm will terminate once κ ≥ k or κ ≥ √n, the total running time is O(n log k).

3.2 Pivots

A key subroutine in our algorithm is the pivot⁹ operation illustrated in Figure 3. A pivot in the set of points S takes as input a ray and reports the largest circle whose center is on the ray, has the origin of the ray on its boundary and has no point of S in its interior. We will make use of the following data structuring result, due to Chan [4]. For completeness, we also include a proof.

Fig. 3. A pivot operation.

Lemma 1 (Chan 1996) Let S be a set of n points in R². Then, for any integer 1 ≤ m ≤ n, there exists a data structure of size O(n) that can be constructed in O(n log m) time, and that can perform pivots in S in O((n/m) log m) time per pivot. Proof. Partition S into ⌈n/m⌉ groups of at most m points and build a pivot data structure for each group, at a construction cost of (n/m) × O(m log m) = O(n log m). To perform a query, we simply query each of the n/m data structures in O(log m) time per data structure and report the smallest circle found, for a query time of O((n/m) log m).

In the following, we will be using Lemma 1 with a value of m = κ², so that the time to construct the data structure is O(n log κ) and the query time is O((n/κ²) log κ).

⁹ The term pivot comes from linear programming. The relationship between a (polar dual) linear programming pivot and 
the circular pivot described here is evident when we consider the parabolic lifting that transforms the problem of computing a2-dimensional Delaunay triangulation to that of computing a3-dimensional convex hull of a set of points on the paraboloid z=x2+y2.In this case,the circle is the projection of the intersection of a plane with the paraboloid.point r and any blue point b.We then perform a pivot in the set B along the ray with origin r that contains b.This gives us a circle C that has no blue points in its interior and has r as well as some blue point b′(possibly b=b′)on its bound-ary.Next,we perform a pivot in the set R along the ray originating at b′and passing through the center of C.This gives us a circle C1that has no point of S in its interior and has b′and some red point r′(possibly r=r′)on its boundary. Therefore,(r′,b′)is a bichromatic edge in the Delaunay triangulation of S.rbrb(a)(b)Fig.4.The(a)first and(b)second pivot used tofind a bichromatic edge(r′,b′).The above argument shows how tofind a bichromatic Delaunay edge using only2pivots,one in R and one in B.The second part of the argument also implies the following useful lemma.Lemma2If there is a circle with a red point r and a blue point b on its bound-ary,and no red(respectively,blue)points in its interior,then r(respectively,b) contributes to the decision boundary.3.4Finding More PointsLet Q be the set of points that contribute to the decision boundary,i.e.,the set of points that are the vertices of bichromatic triangles in the Delaunay triangulation of S.Suppose that we have already found a set P⊆Q and we wish to either (1)find a new point p∈Q\P or(2)verify that P=Q.To do this,we will make use of the augmented Delaunay triangulation of P (see Figure5).This is the Delaunay triangulation of P∪{v1,v2,v3},where v1, v2,and v3are three black points“at infinity”(see Figure5).For any triangle t, we use the notation C(t)to denote the circle whose boundary contains the three vertices of t(note that if t 
contains a black point then C(t)is a halfplane).The following lemma allows us to tell when we have found the entire set of points Q that contribute to the decision boundary.Lemma3Let∅=P⊆Q.The following statements are equivalent:v 1v 2vFig.5.The augmented Delaunay triangulation of S .1.For every triangle t in the augmented Delaunay triangulation of P ,if t has a blue (respectively,red)vertex then C (t )does not have a red (respectively,blue)point of S in its interior .2.P =Q .Proof.First we show that if Statement 1of the lemma is not true,then State-ment 2is also not true,i.e.,P =Q .Suppose there is some triangle t in the augmented Delaunay triangulation of P such that t has a blue vertex b and C (t )contains a red point of S in its interior.Pivot in R along the ray originating at b and passing through the center of C (t )(see Figure 6).This will give a circle C with b and some red point r /∈P on its boundary and with no red points in its interior.Therefore,by Lemma 2,r contributes to the decision boundary and is therefore in Q ,so P =Q .A symmetric argument applies when t has a red vertex r and C (t )contains a blue vertex in its interior.Fig.6.If Statement 1of Lemma 3is not true then P =Q .Next we show that if Statement2of the lemma is not true then Statement1 is not true.Suppose that P=Q.Let r be a point in Q\P and,without loss of generality,assume r is a red point.Since r is in Q,there is a circle C with r and some other blue point b on its boundary and with no points of S in its interior.We will use r and b to show that the augmented Delaunay triangulation of P contains a triangle t such that either(1)b is a vertex of t and C(t)contains r in its interior,or(2)C(t)contains both r and b in its interior.In either case, Statement1of the lemma is not true because of triangle t.Refer to Figure7for what follows.Consider the largest circle C1that is con-centric with C and that contains no point of P in its interior(this circle is at least as large as C).The circle 
C1will have at least one point p1of P on its boundary (it could be that p1=b,if b∈P).Next,perform a pivot in P along the ray originating at p1and containing the center of C1.This will give a circle C2that contains C1and with two points p1and p2of P∪{v1,v2,v3}on its boundary and with no points of P∪{v1,v2,v3}in its interior.Therefore,(p1,p2)is an edge in the augmented Delaunay triangulation of P.The edge(p1,p2)partitions the interior of C2into two pieces,one that con-tains r and one that does not.It is possible to move the center of C2along the perpendicular bisector of(p1,p2)maintaining p1and p2on the boundary of C2.There are two directions in which the center of C2can be moved to ac-complish this.In one direction,say−→d,the part of the interior that contains r only increases,so move the center in this direction until a third point p3∈P∪{v1,v2,v3}is on the boundary of C2.The resulting circle has the points p1, p2,and p3on its boundary and no points of P in its interior,so p1,p2and p3are the vertices of a triangle t in the augmented Delaunay triangulation of P.The circumcircle C(t)contains r in its interior and contains b either in its interior or on its boundary.In either case,t contradicts Statement1,as promised.Note that thefirst paragraph in the proof of Lemma3gives a method of testing whether P=Q,and when this is not the case,offinding a point in Q\P.For each triangle t in the Delaunay triangulation of P,if t contains a blue vertex b then perform a pivot in R along the ray originating at b and passing through C(t).If the result of this pivot is C(t),then do nothing.Otherwise,the pivotfinds a circle C with no red points in its interior and that has one blue point b and one red point r/∈P on its boundary.By Lemma2,the point r must be in Q.If t contains a red vertex,repeat the above procedure swapping the roles of red and blue.If both pivots(from the red point and the blue point)find the circle C(t),then we have verified Statement1of Lemma3for the triangle t.The 
above procedure performs at most two pivots for each triangle t in the augmented Delaunay triangulation of P. Therefore, this procedure performs O(|P|) = O(κ) pivots. Since we repeat this procedure at most κ times before deciding that κ < k, we perform O(κ²) pivots, at a total cost of O(κ² × (n/κ²) log κ) = O(n log κ).

Fig. 7. If P ≠ Q then Statement 1 of Lemma 3 is not true. The left column (1) corresponds to the case where b ∈ P and the right column (2) corresponds to the case where b ∉ P.

In summary, we have an algorithm that, given S and κ, decides whether the condensed set Q of points in S that contribute to the decision boundary has size at most κ, and if so, computes Q. This algorithm runs in O((κ² + n) log κ) time. By trying increasingly large values of κ as described in Section 3.1 we obtain our main theorem.

Theorem 2 The nearest-neighbour decision boundary of a set of n points in R² can be computed in O(n log k) time, where k is the number of points that contribute to the decision boundary.

Remark: Theorem 2 extends to the case where there are more than 2 color classes and our goal is to find all Voronoi edges bounding two cells of different color. The only modification required is that, for each color class R, we use two pivoting data structures, one for R and one for S\R. When performing pivots from a point in R, we use the data structure for pivots in S\R. Otherwise, the details of the algorithm are identical.

Remark: In the pattern-recognition community pattern classification rules are often implemented as neural networks. In the terminology of neural networks, Theorem 2 states that it is possible, in O(n log k) time, to design a simple one-layer neural network that implements the nearest-neighbour decision rule and uses only k McCulloch-Pitts neurons (threshold logic units).

4 Conclusions

We have given O(n log k) time algorithms for computing nearest-neighbour decision boundaries in 1 and 2 dimensions, where k is the number of points that contribute to the decision boundary. A standard application of Ben-Or's lower-bound technique [1] shows that even 
the 1-dimensional algorithm is optimal in the algebraic decision tree model of computation. We have not studied algorithms for dimensions d ≥ 3. In this case, it is not even clear what the term "output-sensitive" means. Should k be the number of points that contribute to the decision boundary, or should k be the complexity of the decision boundary? In the first case, k ≤ n for any dimension d, while in the second case, k could be as large as Ω(n^⌈d/2⌉). To the best of our knowledge, both are open problems.

References

1. M. Ben-Or. Lower bounds for algebraic computation trees (preliminary report). In Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing, pages 80–86, 1983.
2. B. K. Bhattacharya and S. Sen. On a simple, practical, optimal, output-sensitive randomized planar convex hull algorithm. Journal of Algorithms, 25(1):177–193, 1997.
3. M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest, and R. E. Tarjan. Time bounds for selection. Journal of Computing and Systems Science, 7:448–461, 1973.
4. T. M. Chan. Optimal output-sensitive convex hull algorithms in two and three dimensions. Discrete & Computational Geometry, 16:361–368, 1996.
5. T. M. Chan, J. Snoeyink, and C. K. Yap. Primal dividing and dual pruning: Output-sensitive construction of four-dimensional polytopes and three-dimensional Voronoi diagrams. Discrete & Computational Geometry, 18:433–454, 1997.
6. T. M. Cover and P. E. Hart. Nearest neighbour pattern classification. IEEE Transactions on Information Theory, 13:21–27, 1967.
7. B. Dasarathy and L. J. White. A characterization of nearest-neighbour rule decision surfaces and a new approach to generate them. Pattern Recognition, 10:41–46, 1978.
8. L. Devroye. On the inequality of Cover and Hart. IEEE Transactions on Pattern Analysis and Machine Intelligence, 3:75–78, 1981.
9. D. P. Dobkin and D. G. Kirkpatrick. Fast detection of polyhedral intersection. Theoretical Computer Science, 27:241–253, 1983.
10. D. P. Dobkin and D. G. Kirkpatrick. A linear algorithm for determining the separation of convex polyhedra. Journal of Algorithms, 6:381–392, 1985.
11. C. A. R. Hoare. ACM Algorithm 64: Quicksort. Communications of the ACM, 4(7):321, 1961.
12. D. G. Kirkpatrick. Optimal search in planar subdivisions. SIAM Journal on Computing, 12(1):28–35, 1983.
13. D. G. Kirkpatrick and R. Seidel. The ultimate planar convex hull algorithm? SIAM Journal on Computing, 15(1):287–299, 1986.
14. F. P. Preparata and M. I. Shamos. Computational Geometry. Springer-Verlag, 1985.
15. M. I. Shamos. Geometric complexity. In Proceedings of the 7th ACM Symposium on the Theory of Computing (STOC 1975), pages 224–253, 1975.
16. C. Stone. Consistent nonparametric regression. Annals of Statistics, 8:1348–1360, 1977.
17. G. T. Toussaint. Proximity graphs for instance-based learning. Manuscript, 2003.
18. G. T. Toussaint, B. K. Bhattacharya, and R. S. Poulsen. The application of Voronoi diagrams to non-parametric decision rules. In Proceedings of Computer Science and Statistics: 16th Symposium of the Interface, 1984.
19. R. Wenger. Randomized quick hull. Algorithmica, 17:322–329, 1997.

Please address all correspondence to

Clinical Neurophysiology monitors 1500 operative cases/year and performs 5000 diagnostic studies/year. This effort is carried out with the service of 10 technicians, 3 secretarial staff, 2 computer support people, and 10 members of the medical school faculty. Monitoring and diagnostics are provided throughout the 3500 bed medical center (8 hospitals) including 60 operating rooms, 200 intensive care beds, and 8 laboratories operated by the center. Neurophysiological recordings are typically made from the scalp while stimulation is applied to a peripheral nerve, the eye or the ear. The consistency of the recorded response depends on the continued integrity of the corresponding neural pathways. For example, if the stimulus is applied to the median nerve at the wrist, the response at the scalp depends on the continued integrity of the proximal median nerve, the brachial plexus, the dorsal spinal roots primarily at the 5th and 6th cervical vertebrae, the dorsal spinal cord rostral to these roots, the posterior brainstem, and so on, proceeding rostrally to the cortical centers which subserve cutaneous sensation. A surgical intervention or untoward physiological event which interferes with the function of any of these structures will produce a decrement in the recorded response. Such a change can typically be detected and reported to the surgeon within 60 seconds, enabling him/her to modify the surgical technique to reduce potential injury to the nervous system. This work describes efforts to enhance the signal-to-noise ratio of these signals and display them in a manner which is most helpful in reducing surgical morbidity. In particular, our effort has been directed to reducing the time required to produce interpretable results and to distributing the results throughout our computer network to promote collaborative interaction.

Comparison of speckle-noise filtering algorithms in digital holography (数字全息技术中散斑噪声滤波算法比较)
Pan Yun, Pan Weiqing, Chao Mingju

[Abstract] In the recording process of digital holographic measurement, the hologram is easily polluted by speckle noise, which may decrease the resolution of the hologram. In addition, the reconstruction quality is seriously affected by speckle noise in digital reconstruction. Thus it is important to study speckle filtering algorithms suited to digital holography. The median filtering algorithm, Lee filtering algorithm, Kuan filtering algorithm and SUSAN filtering algorithm were applied to filter the speckle noise in the recorded hologram and in the numerically reconstructed image, and the four algorithms were then compared. The results showed that the SUSAN filtering algorithm performed best in digital holography: speckle noise was suppressed significantly while the information in the reconstructed images was well preserved.

Journal: Journal of Applied Optics (应用光学), 2011, 32(5): 883-887
Keywords: digital holography; filtering algorithm; algorithm comparison; speckle noise
Affiliations: School of Physical Engineering, Zhengzhou University, Zhengzhou 450001, China; School of Sciences, Zhejiang University of Science and Technology, Hangzhou 310023, China
CLC numbers: TN209; O438.1

Introduction: With the rapid development of materials science, the life sciences, micromachining and microelectronics, the demand for measuring the three-dimensional topography of micro-objects has become increasingly pressing.
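For reference, one common textbook formulation of the Lee filter (one of the four algorithms compared above) can be sketched as follows. This is our own minimal NumPy version of the local-statistics formula, not the authors' implementation; the window size and noise-variance parameter are illustrative choices.

```python
import numpy as np

def lee_filter(img, win=3, noise_var=0.01):
    """Minimal Lee speckle filter.

    Each pixel x is replaced by  m + w * (x - m), where m and v are the
    local window mean and variance and  w = v / (v + noise_var):
    flat regions (v small) are smoothed, edges (v large) are kept.
    """
    img = img.astype(float)
    h, w_ = img.shape
    r = win // 2
    out = np.empty_like(img)
    for i in range(h):
        for j in range(w_):
            # Window around (i, j), clipped at the image borders.
            block = img[max(i - r, 0):i + r + 1, max(j - r, 0):j + r + 1]
            m, v = block.mean(), block.var()
            wgt = v / (v + noise_var)
            out[i, j] = m + wgt * (img[i, j] - m)
    return out

flat = np.full((4, 4), 0.5)
print(np.allclose(lee_filter(flat), 0.5))  # → True: a flat region is unchanged
```

The adaptive weight is what distinguishes Lee (and Kuan) filtering from a plain mean or median filter: smoothing strength varies with local variance instead of being applied uniformly.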

Design of a data traffic scheduling algorithm under the SDN architecture (SDN架构下数据流量调度算法的设计)
Xu Wenqing, Yu Geng

[Abstract] Recently, cloud technology has been widely used in the Data Center Network (DCN). In order to ensure Quality of Service (QoS) in the DCN, Equal Cost Multipath Routing (ECMP) algorithms and Dynamic Load Balancing (DLB) algorithms are employed as solutions. However, these algorithms only obtain local optimization results; in particular, the transmission delay and bandwidth utilization are not satisfactory when they are applied in an overloaded DCN environment. Considering these limitations, this paper proposes a load-balancing scheduling mechanism for fat-tree network data traffic under the Software Defined Network (SDN) architecture from the perspective of global optimization. The mechanism evaluates SDN node and link load by monitoring global real-time parameters through the Ryu controller, and selects the optimal forwarding path that meets the demand of the traffic load. Experiments show that the proposed scheduling mechanism significantly improves the performance of DCN services compared with the ECMP and DLB algorithms.

Journal: Study on Optical Communications (光通信研究), 2018, (3): 5-8, 20
Keywords: software defined network; data center network; evaluation; balancing; simulation
Affiliations: Fuzhou Institute of Technology, Fuzhou 350506, China; Guomai Information College, Fujian University of Technology, Fuzhou 350014, China

0 Introduction: With the continuing convergence of cloud computing, Internet of Things and virtualization technologies, massive broadband data traffic is appearing on the Data Center Network (DCN) with exponential growth.

A corner matching algorithm based on scale, distance and rotation measures (基于尺度、距离、旋转测度的角点匹配算法)
By analysing the initial matching algorithm and the design of the measure functions, a corner matching procedure based on scale, distance and rotation measures is established.

First, the phase correlation method is used for the initial matching of the two images: the neighbourhoods of the corners in the different images are analysed by phase correlation. Then, the scale, distance and rotation measures are computed for the candidate matching corners, and false matches are removed by judging whether the values of the matching measure functions are reasonable. Finally, the 8 best matching points are used to compute the fundamental matrix of the binocular vision system and obtain the corresponding epipolar geometry parameters [5]. The procedure is shown in Fig. 3.
γ_i ,    (10)

where:

E_γ = (1/K) Σ_{i=1}^{K} (γ_i − γ_median)²    (11)

Here γ_k is the angle of the line joining the examined candidate matching corner to the k-th candidate matching corner pair in its neighbourhood; γ_median is the average of the angles of the lines joining the examined candidate matching corner to all the other candidate matching corners in the neighbourhood; and E_γ is the variance of these connecting-line angles.
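Equation (11) above is just a sample variance over the connecting-line angles. Under our reading of the partly garbled source formula (the surrounding text describes γ_median as an average, so the mean angle is used here), it can be computed as:

```python
import math

def angle_variance(corner, neighbours):
    """Variance E_gamma of the angles of the lines joining a candidate
    matching corner to the other candidate corners in its neighbourhood.

    corner: (x, y) of the examined corner; neighbours: list of (x, y).
    """
    angles = [math.atan2(y - corner[1], x - corner[0]) for x, y in neighbours]
    mean = sum(angles) / len(angles)          # gamma_median in Eq. (11)
    return sum((a - mean) ** 2 for a in angles) / len(angles)

# Two neighbours at 0 and 90 degrees: variance is (pi/4)^2.
print(angle_variance((0, 0), [(1, 0), (0, 1)]))
```

A true match and a false match can then be separated by thresholding this value, since the angular layout of a genuine corner relative to its neighbours is stable between the two images.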

Abstract: To remedy the poor rotation robustness of matching methods based on template grey-level similarity measures, and exploiting the fact that the positions of a true matching corner relative to the other corners in its neighbourhood are topologically invariant, a corner matching method based on scale, distance and rotation measures is proposed. The method first uses phase correlation for an initial matching of the corners; it then computes the scale, distance and rotation measures for each pair of candidate matching corners, and uses the resulting measure function values to judge whether an initial match is correct. Experiments verify that the matching results of the proposed method are clearly better than those of direct matching.

d_k is the distance between the examined candidate matching corner and the k-th candidate matching corner pair in its neighbourhood; d_median is the average of the ratios of the distances from the examined candidate matching corner to all the other candidate matching corners in the neighbourhood; and E_d is the variance of these distance ratios.

A survey of image stitching methods (图像拼接方法综述)
Luo Qunming, Shi Lin

[Abstract] The basic theory and the general process of image stitching are summarized. The principles, advantages and disadvantages of the various image registration methods are introduced and classified, with emphasis on feature-based image registration techniques. The application of the random sample consensus (RANSAC) algorithm in feature-based image registration is described in detail.

Journal: Transducer and Microsystem Technologies (传感器与微系统), 2017, 36(12): 4-6, 12
Keywords: image stitching; registration; feature matching; panoramic image
Affiliation: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
CLC number: TP391.4

In recent years, with the spread of imaging devices, more and more fields have applied image analysis techniques to study and handle a wide variety of problems, placing higher demands on the captured images [1].

Because of the physical limitations of imaging devices, the captured images can never fully satisfy the demand for a wide viewing angle and high resolution.

Since the hardware for acquiring panoramic images (panoramic cameras, wide-angle lenses, etc.) is generally expensive and unsuited to widespread use, image stitching on a computer has been proposed as a way of obtaining panoramas.

Image stitching is the technique of taking a set of mutually overlapping images and, through image transformations, image registration and image fusion, forming a wide-angle (even 360°) high-resolution panorama. It draws on computer graphics, computer vision, image processing and pattern recognition, and is widely applied in space exploration, medical imaging, video retrieval, virtual reality and many other fields [2,3].
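Feature-based registration pipelines of the kind surveyed here typically rely on random sample consensus (RANSAC) to reject mismatched feature pairs before estimating the alignment. The following toy sketch, with made-up data and names of our own, illustrates the idea on the simplest registration model, a pure 2-D translation; real stitching estimates a homography the same way, just with larger minimal samples.

```python
import random

def ransac_translation(matches, iters=200, tol=1.0, seed=0):
    """Estimate a 2-D translation (dx, dy) from point matches
    ((x, y), (x', y')), some of which are wrong (outliers).

    Repeatedly hypothesise a translation from one random match, count
    how many matches agree within tol, keep the hypothesis with the
    most inliers, and finally refit on the inlier set."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iters):
        (x, y), (u, v) = rng.choice(matches)
        dx, dy = u - x, v - y
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - dx) <= tol
                   and abs(m[1][1] - m[0][1] - dy) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit: average translation over the winning inlier set.
    dx = sum(u - x for (x, y), (u, v) in best_inliers) / len(best_inliers)
    dy = sum(v - y for (x, y), (u, v) in best_inliers) / len(best_inliers)
    return dx, dy

# Five correct matches shifted by (10, 5) plus two gross mismatches:
good = [((i, 2 * i), (i + 10, 2 * i + 5)) for i in range(5)]
bad = [((0, 0), (40, -3)), ((1, 1), (-7, 22))]
print(ransac_translation(good + bad))  # → (10.0, 5.0)
```

Averaging over all seven matches directly would be pulled off by the two mismatches; RANSAC recovers the exact shift because the two outliers never gather more support than the five consistent matches.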


Algorithms for central-median paths with bounded length on trees

Ronald I. Becker (a), Isabella Lari (b), Andrea Scozzari (c,*)
(a) Department of Mathematics, University of Cape Town, Rondebosch 7700, South Africa
(b) Dipartimento di Statistica, Probabilità e Statistiche Applicate, Università di Roma "La Sapienza", P.le A. Moro 5, 00185 Roma, Italy
(c) Dipartimento di Matematica per le Decisioni Economiche, Finanziarie ed Assicurative, Università di Roma "La Sapienza", Via del Castro Laurenziano 9, 00161 Roma, Italy

Received 15 December 2003; accepted 15 September 2005. Available online 2 May 2006.

Abstract

The location of path-shaped facilities on trees has been receiving growing attention in the specialized literature in recent years. Examples of such facilities include railroad lines, highways and public transit lines. Most papers deal with the problem of locating a path on a tree by minimizing either the maximum distance from the vertices of the tree to the facility, or the sum of the distances from all the vertices of the tree to the path. However, neither of the two criteria alone captures all the essential elements of a location problem. The sum-of-distances criterion alone may produce solutions that are unacceptable from the point of view of the service level for clients located far away from the facilities. On the other hand, the criterion of minimizing the maximum distance, if used alone, may lead to very costly service systems. In the literature there is just one paper that considers the problem of finding an optimal location of a path on a tree using combinations of the two criteria, and efficient algorithms are provided there: the cases where one criterion is optimized subject to a restriction on the value of the other are considered, and linear time algorithms are presented. However, those problems impose no bound on the length or cost of the facility. In this paper we consider the two following problems: find a path which
minimizes the sum of the distances such that the maximum distance from the vertices of the tree to the path is bounded by a fixed constant and such that the length of the path is not greater than a fixed value; find a path which minimizes the maximum distance with the sum of the distances being not greater than a fixed value and with bounded length. From an application point of view, the constraint on the length of the path may reflect a budget constraint for establishing the facility. The restriction on the length of the path complicates the two problems, but for both of them we give O(n log² n) divide-and-conquer algorithms.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Facility location; Central path; Median path

European Journal of Operational Research 179 (2007) 1208-1220. doi:10.1016/j.ejor.2005.09.049
0377-2217/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
* Corresponding author. E-mail addresses: rib@maths.uct.ac.za (R.I. Becker), lari@uniroma1.it (I. Lari), andrea.scozzari@uniroma1.it (A. Scozzari).

1. Introduction

Network facility location is concerned with the optimal selection of a site, or of a set of facility sites, in a network in order to supply a set of customers. The objective is either minimizing the distance from the furthest client in the network to the facilities (centre criterion) or minimizing the sum of the distances from the clients to the selected facilities (median criterion). These problems have a wide variety of applications in economics, such as locating bank accounts to optimize float [6] or locating warehouses and depots. However, they traditionally deal with the optimal location of points placed either at a vertex or along an arc of the network [7,8,12,13,15,23,25]. Several authors extended the theory to include sites that are not merely single points but paths or trees [16,24]. There are many practical applications, arising for instance in the design
of public transit lines, where the facility to be located is too large to be modelled as a single point. In fact, the public transit line problem is formulated as the problem of locating a path on a road transportation network so as to minimize either the centre or the median criterion. Path-shaped and tree-shaped facilities are also called extensive facilities.

Slater [24] first solved the problem of finding a core of a tree, that is, finding a path which minimizes the sum of the distances from all vertices of the tree to the path. Linear time algorithms for finding a core of a tree were presented in Morgan and Slater [22] and Becker [3]. Peng and Lo [20] and Becker et al. [5] both provide algorithms that solve an extended core problem: finding a core with specified length or cost. The best result for finding a core with bounded length on trees can be found in [1], and a parallel algorithm was proposed for this problem in [28]. More recently, some authors [27,29] studied the conditional location of a path-shaped facility of limited length on trees: the problem of locating a path of limited length on a tree which minimizes the sum of the distances, under the condition that some existing facilities are already located.

The problem of finding an optimal location of a path on a tree minimizing the maximal distance is considered in Hedetniemi et al. [14], where a linear time algorithm is developed. Minieka [21] later solved the problem with a restriction on the length of the path to be located. A comprehensive survey on the problem of locating paths on trees can be found in [11,19]. However, some authors have questioned the pertinence of the median and the centre criterion for these problems: neither of the two criteria alone captures all the essential elements of a location problem.
The median criterion alone may result in solutions which are unacceptable from the point of view of the service level for the clients who are located far away from the facilities. On the other hand, the centre criterion, if used alone, may lead to very costly service systems. Additionally, the median objective function tends to favor clients who are clustered in population centers to the detriment of clients who are spatially dispersed [18], while the centre criterion may yield a significant increase in the total distance travelled by the clients. This led Halpern [9,10] to model the corresponding trade-off as a bicriterion problem in which a combination of total distance and maximal distance is minimized. In [17,26], efficient algorithms for the location of central-median point-shaped facilities are provided.

In the case of a path-shaped facility, such a dual objective is appropriate in many real-world locational decisions. Indeed, fairness considerations in locating a public transit line may call for the service (i.e. the path) to be located not too far from any customer, without the average distance travelled by all prospective customers being too large. Averbakh and Berman [2] introduced the problem of finding a path-shaped facility on a tree network considering both the centre and the median criterion. They defined the following problems: find a path which minimizes the sum of the distances such that the maximum distance from the furthest vertex of the tree to the path is bounded by a fixed constant; find a path which minimizes the maximum distance with the sum of the distances being not greater than a fixed value; find the set of all the Pareto-optimal paths. They provided a two-phase procedure for all three problems. It first finds, in O(n) time, a set M of cardinality at most n, where n is the number of vertices in the tree, which includes the set of Pareto-optimal paths along with some extra paths. In a second phase, the procedure finds the optimal path for the first two
problems by searching among the paths in M. Since the cardinality of M is at most n, the overall two-phase procedure has time complexity O(n) for the first two problems; extracting the set of Pareto-optimal paths from M requires O(n log n) time instead.

In this paper we consider the first two problems introduced in [2] with an additional constraint, namely that the optimal paths must have length (or cost) bounded by a fixed constant. In real applications, this additional constraint on the length or on the cost may reflect a budget constraint for establishing the facility. In particular, we call the problem of finding a path which minimizes the sum of the distances, with the maximum distance not greater than a fixed constant and with bounded length, the Bounded Cent-Median path problem; and we call the problem of finding a path which minimizes the maximum distance, with the sum of the distances less than a fixed value and with bounded length, the Bounded Medi-Central path problem. The Bounded Cent-Median problem was first presented in [4], while, to the best of our knowledge, the Bounded Medi-Central problem does not seem to have been considered elsewhere in the literature. We note that both problems are NP-hard on general graphs: they contain as special cases, respectively, the problem of finding a path of bounded length which minimizes the sum of the distances and the problem of finding a path of bounded length which minimizes the maximum distance, and both of these are known to be NP-hard on general graphs [11].

For both problems we provide O(n log² n) divide-and-conquer algorithms. The idea behind both procedures is that a middle vertex of the tree is computed; then an optimal path through this vertex is found. If it is not the optimal path for the problem under consideration, then the optimal path must lie entirely in one of the subtrees rooted at the vertices adjacent to the middle vertex. The algorithm is recursively applied to these
subtrees. An appropriate choice of the middle vertex ensures that the depth of the recursion is O(log n) [5,20].

In Section 2 we provide notation and definitions, as well as an account of a preprocessing phase which calculates several quantities needed by the two algorithms. Section 3 gives the algorithm for the Bounded Cent-Median path problem, while Section 4 provides the procedure for the Bounded Medi-Central path problem. The Appendix presents some technical considerations related to the two algorithms.

2. Notation and definitions

Given a tree T = (V, E) with |V| = n, let a(e) be a positive weight representing the length of each edge e = (v, w) ∈ E, and suppose that each vertex v carries a non-negative weight h(v). Let P be a path in T; the length of P is L(P) = Σ_{e ∈ P} a(e). Given two vertices v and u, we denote the unique path from v to u by P_vu, and define the distance d(v, u) between v and u as the length of P_vu. Given a path P in T, the sum of the distances from P to all the vertices v ∈ V is D(P) = Σ_{v ∈ V} h(v) · d(v, P), where d(v, P) is the minimum distance from v ∈ V to a vertex of P (see [22]). We call D(P) the DISTSUM of P; if P = {v} we write D(v) instead of D({v}). A path P which minimizes DISTSUM in T is called a median path or, following the definition in [22,24], a core of the tree T. We define the ECCENTRICITY of P as E(P) = max_{v ∈ V} {d(v, P)}. The shortest path among those that minimize ECCENTRICITY is the central path of T (see [24]).

Let T be rooted at some vertex r, and denote the rooted tree by T_r. We define p(v) as the parent of v in T_r and Son(v) as the set of children of v. Denote by T^B_v = (V^B_v, E^B_v) the subtree of T_r rooted at vertex v and by T^U_v = (V^U_v, E^U_v) the subtree induced by the vertices (V − V^B_v) ∪ {v}. Let deg_B(v) be the degree of vertex v in T^B_v.

For both the problems presented in this paper, we need a preprocessing phase that computes the quantities that will be used
in the two algorithms. In the following we describe the recursive formulas calculated in this preprocessing phase. The quantities corresponding to T^B_v are referred to as below quantities, while those corresponding to T^U_v are referred to as upper quantities.

Let D_B(v) be the sum of the distances of the vertices in T^B_v to vertex v, and let sum_B(v) be the sum of the weights of the vertices in T^B_v. Using the standard bottom-up approach, proceeding level by level from the leaves to the root, we compute sum_B(v) and D_B(v) as follows (see also [5,4,16,25]):

    sum_B(v) = h(v)                                      if v is a leaf of T_r,
    sum_B(v) = h(v) + Σ_{w ∈ Son(v)} sum_B(w)            otherwise;            (1)

    D_B(v) = 0                                                         if v is a leaf of T_r,
    D_B(v) = Σ_{w ∈ Son(v)} [D_B(w) + sum_B(w) · a(v, w)]              otherwise.            (2)

The time needed for the computation of (1) and (2) is O(n). In the following algorithms we also need to calculate, for all v ∈ V, D(v), the sum of the distances of all the vertices of T_r to v. Then D(r) = D_B(r), and proceeding from the root r to the leaves we have

    D(v) = D(p(v)) + a(v, p(v)) · [H − 2 · sum_B(v)],            (3)

where H = Σ_{v ∈ V} h(v) (see also [16]).

Given a vertex v, we denote by E_B(v) and E_U(v) the ECCENTRICITY of v in T^B_v and in T^U_v, respectively. We proceed bottom-up and top-down to compute these quantities efficiently at each vertex of T_r. In a bottom-up visit of the tree we first associate to each vertex v the label E1_B(v), the maximum distance from v to a vertex u in T^B_v:

    E1_B(v) = 0                                              if v is a leaf of T_r,
    E1_B(v) = max_{w ∈ Son(v)} {E1_B(w) + a(v, w)}           otherwise.            (4)

Let u1_B(v) be a son of v that gives the value of E1_B(v). In the preprocessing phase we also compute the labels E2_B(v), the maximum distance from v to a vertex u in T^B_v \ T^B_{u1_B(v)}:

    E2_B(v) = 0                                                             if deg_B(v) < 2,
    E2_B(v) = max_{w ∈ Son(v)} {E1_B(w) + a(v, w) | w ≠ u1_B(v)}            otherwise.            (5)

Finally, we calculate E3_B(v), the maximum distance from v to a vertex u in T^B_v \ (T^B_{u1_B(v)} ∪ T^B_{u2_B(v)}), where u2_B(v) is a son of v
that gives E2_B(v):

    E3_B(v) = 0                                                                                if deg_B(v) < 3,
    E3_B(v) = max_{w ∈ Son(v)} {E1_B(w) + a(v, w) | w ≠ u1_B(v) and w ≠ u2_B(v)}               otherwise.            (6)

The above three formulas can be computed in O(n) time for all the vertices of T. Let us now consider the eccentricity E_U(v) of a vertex v in T^U_v. Proceeding from the root toward its leaves we distinguish the cases: if v = r then E_U(v) = 0; if v is adjacent to the root r:

    E_U(v) = E1_B(r) + a(r, v)    if v ≠ u1_B(r),
    E_U(v) = E2_B(r) + a(r, v)    if v = u1_B(r);            (7)

if v is not adjacent to r:

    E_U(v) = max {E_U(p(v)) + a(p(v), v), E1_B(p(v)) + a(p(v), v)}    if v ≠ u1_B(p(v)),
    E_U(v) = max {E_U(p(v)) + a(p(v), v), E2_B(p(v)) + a(p(v), v)}    if v = u1_B(p(v)).            (8)

In a top-down visit of the tree the above formulas can be computed in O(n) time. We now introduce a definition relevant to the formulation of the algorithms of the following sections.

Definition 1. Given a path P_vu and a path P_uw with edges disjoint from P_vu, the distance saving of P_uw with respect to P_vu is the reduction of DISTSUM obtained by adding P_uw to P_vu (see [22]), that is

    sav(P_vu, P_uw) = D(P_vu) − D(P_vw).            (9)

If the first path consists of only one vertex v, we simply write sav(v, P_vw); if v = w, then sav(v, P_vv) = 0. Given a path P_vw with w in T^B_v and w ≠ v, we can compute sav(v, P_vw) in linear time by proceeding top-down:

    sav(v, P_vw) = sav(v, P_vp(w)) + sum_B(w) · a(p(w), w).            (10)

For the problem of finding a path P which minimizes the ECCENTRICITY with bounded DISTSUM, one additional value is used: upecc(v), the eccentricity of the path P_rv excluding the vertices in T^B_v and in the other subtrees incident to the root r (see Fig. 1). The recursive equation for this value is

    upecc(v) = 0                                         if v = r or v is adjacent to the root r,
    upecc(v) = max {E1_B(p(v)), upecc(p(v))}             otherwise, if v ≠ u1_B(p(v)),
    upecc(v) = max {E2_B(p(v)), upecc(p(v))}             otherwise, if v = u1_B(p(v)).            (11)

Let T_r0 be the subtree of T_r rooted at vertex r0, and let P_vu be a path passing through r0. Let w_1 and w_2 be the two sons of r0
along the paths P_r0v and P_r0u, respectively. Define (see Fig. 2)

    E_r0(v, u) = max { E_U(r0), max_{x ∈ Son(r0)} {E1_B(x) + a(r0, x) | x ≠ w_1 and x ≠ w_2} }.            (12)

Notice that E_U(r) = 0 in the given tree T_r, while in a subtree T_r0 rooted at vertex r0, E_U(r0) may be different from 0. See the Appendix for further details.

Hence, referring to formulas (4), (11) and (12), the ECCENTRICITY of any path P_uv belonging to T_r0 and passing through r0, with u ≠ r0 and v ≠ r0, is (see Fig. 2)

    E(P_uv) = max {E1_B(u), E1_B(v), upecc(u), upecc(v), E_r0(v, u)}.

Finally, by a top-down scan of T_r0 we can find the lengths of the paths P_r0v from the root r0 to each vertex v:

    L(P_r0v) = L(P_r0p(v)) + a(p(v), v).            (13)

The length and the DISTSUM of any path P_uv passing through the root r0, with u ≠ r0 and v ≠ r0, are

    L(P_uv) = L(P_r0u) + L(P_r0v),
    D(P_uv) = D(r0) − sav(r0, P_r0u) − sav(r0, P_r0v).

The two recursive procedures presented in this paper are based on the following remark.

Remark 1 [5]. Given a tree T = (V, E), a path P and a vertex v ∈ V, we have two cases:
- P contains v;
- P is fully contained in one of the h subtrees T_1, T_2, ..., T_h obtained by removing v from T.

Definition 2 [5,23]. Given a weighted tree T, a middle vertex v of T is a vertex which minimizes the maximum of the numbers of vertices of the subtrees obtained by removing v.

Remark 2 [15]. A middle vertex has maximum subtree cardinality at most n/2; in fact, it is the centroid of the tree T if we assign weight one to all its vertices. Computing a middle vertex requires O(n) time with the algorithm in [7].

For both the following algorithms, we root the tree at a middle vertex m and check whether the optimal path passes through m; otherwise we search for the optimal path in one of the subtrees obtained by deleting m.
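The building blocks above — the bottom-up/top-down sweeps of equations (1)-(3) and the middle vertex of Definition 2 — can be sketched together in a few lines. This is an illustrative sketch, not the paper's implementation; the parent/children dictionary layout and the tiny example tree are invented for the demonstration.

```python
def preprocess(children, parent, a, h, root, order):
    # order lists the vertices so that every vertex follows its parent.
    # Bottom-up sweep: subtree sizes, weights and distance sums, eqs (1)-(2).
    sum_B, D_B, D, size = {}, {}, {}, {}
    for v in reversed(order):
        size[v] = 1 + sum(size[w] for w in children[v])
        sum_B[v] = h[v] + sum(sum_B[w] for w in children[v])
        D_B[v] = sum(D_B[w] + sum_B[w] * a[(v, w)] for w in children[v])
    H = sum_B[root]
    # Top-down rerooting sweep, eq (3): D(v) from D(p(v)).
    for v in order:
        D[v] = D_B[root] if v == root else \
            D[parent[v]] + a[(parent[v], v)] * (H - 2 * sum_B[v])
    return sum_B, D_B, D, size

def middle_vertex(children, parent, size, n):
    # Definition 2: minimize the largest component left by removing v.
    def worst(v):
        comps = [size[w] for w in children[v]]
        if parent[v] is not None:
            comps.append(n - size[v])
        return max(comps)
    return min(size, key=worst)

# Tiny example: the path 1 - 2 - 3 with a(1,2)=2, a(2,3)=1, unit weights.
children = {1: [2], 2: [3], 3: []}
parent = {1: None, 2: 1, 3: 2}
a = {(1, 2): 2, (2, 3): 1}
sum_B, D_B, D, size = preprocess(children, parent, a,
                                 {1: 1, 2: 1, 3: 1}, 1, [1, 2, 3])
```

On this example D(2) = 3 (distance 2 from vertex 1 plus 1 from vertex 3), and the middle vertex is 2, whose removal leaves components of one vertex each — exactly the ≤ n/2 guarantee of Remark 2.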
We notice that, when we find the subtrees T_1, T_2, ..., T_h, we need to update the below and upper quantities of these subtrees before the root m of the original T is deleted. See the Appendix for further details on the Update procedure.

3. Bounded Cent-Median problem

In this section we consider the problem of finding a path which minimizes DISTSUM with ECCENTRICITY at most R and with length less than or equal to ℓ.

Remark 3. Given a tree T, the following three cases with respect to the eccentricity may occur:

(1) There exists a vertex u in T such that either E2_B(u) > R and E_U(u) > R, or E3_B(u) > R. In this case no path in T is feasible with respect to the ECCENTRICITY.
(2) If case (1) does not hold, at most one vertex w ∈ T with E2_B(w) > R, E3_B(w) ≤ R and E_U(w) ≤ R may exist. In this case, all the paths feasible with respect to the ECCENTRICITY must contain the unique path P containing w such that E(P) ≤ R and of minimum length.
(3) If the vertex w of case (2) does not exist, then, given the middle vertex m and the tree T_m, we may have either E1_B(m) > R or E1_B(m) ≤ R. In the first case, all the feasible paths passing through m must contain a given path P with E(P) ≤ R and of minimum length, which has m as an endpoint. In the second case, all the paths passing through m are feasible with respect to the eccentricity.

In cases (1) and (2) the algorithm checks feasibility and finds the optimal path, if it exists, working on the whole tree T. In case (3), given a tree T rooted at a middle vertex, the algorithm searches for an optimal path by recursively visiting the subtrees of T rooted at the sons of the middle vertex.

Hence, in case (2) and at each iteration of the recursion in case (3), we have to solve the following problem: given a path P of ECCENTRICITY at most R, find a path P̄ containing P of minimum DISTSUM and length at most ℓ. For finding P̄ we use a modified version of the algorithm described in the paper by Becker et al. [5], which finds a
path of minimum DISTSUM and length less than or equal to ℓ containing a given vertex v. We give a description of this algorithm in the Appendix.

Algorithm 1 (Bounded Cent-Median)
Input: a weighted tree T
Output: a path in T of ECCENTRICITY ≤ R, length ≤ ℓ and of minimum DISTSUM, if such a path exists, or FAIL otherwise

begin
  find a middle vertex m of T and root T at m
  for all u ∈ T compute the quantities E1_B(u), E2_B(u), E3_B(u), E_U(u), L(P_mu), sav(m, P_mu) and D(u)   /* see Section 2 */
  CASE 1
  if there exists u ∈ T such that E3_B(u) > R or (E2_B(u) > R and E_U(u) > R) then
    output FAIL and Stop   /* the problem is infeasible */
  CASE 2
  else if there exists a vertex w ∈ T such that E2_B(w) > R, E3_B(w) ≤ R and E_U(w) ≤ R then
    /* the feasible paths in T pass through w */
    consider the two sons u1_B(w) and u2_B(w) of w
    let t := u1_B(w);  while E1_B(t) > R do t := u1_B(t)
    let t' := u2_B(w); while E1_B(t') > R do t' := u1_B(t')
    if L(P_tt') > ℓ then
      output FAIL and Stop   /* the problem is infeasible */
    else
      find P̄_tt' containing P_tt' (see the Appendix)
      output P* := P̄_tt' and D* := D(P̄_tt')
  CASE 3
  else
    D* := +∞, P* := ∅
    SUBTREE(T)
    output P* and D*
end

Procedure SUBTREE(T')
Input: a subtree T' = (V', E') of T with n' vertices and the best current DISTSUM D*
Output: the best current DISTSUM D* and the corresponding path P*

begin
  find a middle vertex m' of T' and root T' at m'
  for all u ∈ T' compute the quantities E1_B(u), E2_B(u), E3_B(u), E_U(u), L(P_m'u), sav(m', P_m'u)   /* see Section 2 and the Appendix */
  if E1_B(m') > R then
    consider the son u1_B(m') of m'
    let t := u1_B(m'); while E1_B(t) > R do t := u1_B(t)
    if L(P_m't) > ℓ then
      /* there is no feasible path passing through m' */
    else
      find P̄_m't containing P_m't (see the Appendix)
      if D(P̄_m't) < D* then D* := D(P̄_m't) and P* := P̄_m't
    if E_U(u1_B(m')) ≤ R then SUBTREE(T^B_{u1_B(m')})
    return P* and D*
  else   /* E1_B(m') ≤ R: in T' all the paths passing through m' are feasible with respect to the ECCENTRICITY */
    let P be the path formed by m' alone
    find P̄ containing P (see the Appendix)
    if D(P̄) < D* then D* := D(P̄) and P* := P̄
    for each son s of m' do
      if E_U(s) ≤ R then SUBTREE(T^B_s)
    return P* and D*
end

Theorem
1 (Correctness). Algorithm Bounded Cent-Median correctly finds an optimal solution, if it exists, or outputs FAIL otherwise.

Proof. An optimal path, if it exists, must contain a path P such that E(P) ≤ R; if every path in T has ECCENTRICITY greater than R, the problem is infeasible. Thus it is convenient first to search for such a path P, and then to try to find a path P̄ containing P of minimum DISTSUM with L(P̄) ≤ ℓ. This reduces the search for an optimal path to the three cases described in Remark 3. For each subtree T' considered in the recursion, a feasible path of minimum DISTSUM containing the middle vertex m' of T' is found; then the search for an optimal solution is carried out recursively in all the subtrees obtained by deleting m'. More precisely, the recursive algorithm proceeds on the subtrees rooted at the sons s of m' such that E_U(s) ≤ R; in particular, if E1_B(m') > R the algorithm proceeds only on the son u1_B(m'). In this way the algorithm considers only subtrees in which paths feasible with respect to the ECCENTRICITY can be found. Hence the algorithm correctly finds an optimal solution if it exists, and fails otherwise. □

Theorem 2 (Complexity). Algorithm Bounded Cent-Median terminates in O(n log² n) time.

Proof. The computation of the below and upper quantities in the main procedure requires O(n) time on T. While computing these quantities, it is possible to check directly which of the three cases listed in Remark 3 holds. Finding the vertices t and t' in case 2 requires O(n) time, and the path P̄_tt' can be found in O(n log n) time using the procedure described in the Appendix. Let us now consider the recursive procedure SUBTREE. At a given level of the recursion there are a number k of subproblems to be solved; let n_i, i = 1, ..., k, be the number of vertices of the i-th subproblem. As in the main procedure, in SUBTREE the computation of the below and upper quantities requires O(n_i) time (see the Appendix and Section 2), as well
as the time needed for the computation of t, which is also O(n_i). Finding P̄_m't or P̄ requires O(n_i log n_i) time. Since Σ_{i=1}^{k} n_i < n, we obtain, for each level of the recursion, a time complexity of O(n log n). The depth of the recursion is O(log n), since m is a middle vertex of the current subtree and the cardinalities of the subtrees obtained by removing m are at most n/2 (see Remark 2). Hence the overall time complexity of the algorithm is O(n log² n). □

Notice that if the weights associated to the edges of the tree are all equal to one, a feasible path of minimum DISTSUM containing the middle vertex m of T can be found in O(n) time; hence the overall time complexity of the algorithm becomes O(n log n).

4. Bounded Medi-Central problem

In this section we deal with the problem of finding a path P which minimizes ECCENTRICITY with DISTSUM at most k and with length less than or equal to ℓ. As in the algorithm for the Bounded Cent-Median problem, we have to search for a path passing through the middle vertex m of T that has minimum ECCENTRICITY among those with DISTSUM at most k and length at most ℓ; we call such a path a best path through the middle vertex m. By Remark 1, if it is not the optimal path, the optimal path lies entirely in one of the subtrees obtained by removing m, and the search proceeds recursively.
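Both algorithms share the same recursion shape: solve through a middle vertex, then recurse into the components it separates, with Remark 2 bounding the depth by O(log n). A generic sketch of that divide-and-conquer skeleton follows; the `solve_through` callback stands in for the path search through the middle vertex (which the paper performs with the Appendix procedures), and the centroid pick here is deliberately naive for brevity rather than the O(n) method of [7].

```python
def centroid_recursion(adj, solve_through):
    # adj: vertex -> list of neighbours of an undirected tree.
    # solve_through(m, comp) examines candidate paths through m inside the
    # component comp and returns a comparable candidate value (or None).
    best = [None]

    def component(start, banned):
        seen, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen and w not in banned:
                    seen.add(w)
                    stack.append(w)
        return seen

    def rec(comp, banned):
        if not comp:
            return
        # middle vertex of this component: minimize the largest piece
        # left by removing v (cf. Definition 2); naive O(|comp|^2) pick.
        m = min(comp, key=lambda v: max(
            [len(component(w, banned | {v})) for w in adj[v]
             if w in comp and w not in banned] or [0]))
        cand = solve_through(m, comp)
        if cand is not None and (best[0] is None or cand < best[0]):
            best[0] = cand
        for w in adj[m]:
            if w in comp and w not in banned:
                rec(component(w, banned | {m}), banned | {m})

    rec(component(next(iter(adj)), set()), set())
    return best[0]

# A path 1-2-3-4-5: the recursion first splits at vertex 3,
# then at the midpoints of {1, 2} and {4, 5}.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
```

Every vertex eventually serves as the middle vertex of some component, which is exactly why checking only the paths through each successive middle vertex loses no candidate path (Remark 1).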
