Methods for robust clustering of epileptic EEG spikes


Fuzzy clustering based on an improved multi-objective firefly algorithm

ZHU Shuwei; ZHOU Zhiping; ZHANG Daowen

Abstract: Traditional fuzzy clustering algorithms mostly optimize a single objective function and therefore cannot deliver comprehensive, accurate clustering results. To address this, a fuzzy clustering method based on an improved multi-objective firefly optimization algorithm is proposed. First, a dynamically adjusted mutation mechanism is introduced into the multi-objective firefly algorithm to obtain more uniformly distributed non-dominated solutions: individuals are selected with a dynamically decreasing probability and mutated with a strategy similar to the mutation operator of differential evolution, and the shrinkage factor is adjusted adaptively to improve mutation efficiency. Then, whenever the archive of non-dominated solutions is full, a number of solutions are selected from it and combined with the current population for the next generation, which makes the algorithm more efficient. Finally, the method is applied to fuzzy clustering by simultaneously optimizing the objective functions of two fuzzy clustering indices and selecting one solution from the final archive to determine the clustering result. Experiments on five data sets show that, compared with single-objective clustering methods, the proposed method improves the clustering validity indices by 2 to 8 percentage points, achieving higher clustering accuracy and better overall performance.

Journal: Journal of Computer Applications
Year (volume), issue: 2015, 35(3)
Pages: 6 (685-690)
Keywords: fuzzy clustering; multi-objective optimization; firefly algorithm; mutation; differential evolution
Authors: ZHU Shuwei; ZHOU Zhiping; ZHANG Daowen
Affiliation: School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
Language: Chinese
CLC classification: TP18; TP301.6

0 Introduction
Cluster analysis can usually be viewed as a complex optimization problem: it depends to a large extent on the optimization of clustering validity indices, and different optimization criteria give rise to different clustering problems.
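To make the optimization view concrete, here is a minimal Python/NumPy sketch (ours, not the paper's code) of the classical fuzzy c-means criterion J_m — the kind of single clustering objective that the method below generalizes by optimizing two indices at once:

```python
import numpy as np

def fcm_objective(data, centers, memberships, m=2.0):
    """Classical fuzzy c-means criterion J_m = sum_i sum_k u_ik^m ||x_k - v_i||^2.

    data: (n, d) points; centers: (c, d) cluster centers;
    memberships: (c, n) fuzzy membership matrix; m is the usual
    fuzzifier (m = 2 is a common choice).
    """
    # Squared distances d_ik = ||x_k - v_i||^2, shape (c, n).
    dists = np.linalg.norm(data[None, :, :] - centers[:, None, :], axis=2) ** 2
    return float(np.sum((memberships ** m) * dists))
```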

Most existing clustering methods optimize only one such index, and therefore cannot be applied effectively to data sets with different types of characteristics.

Multi-objective optimization algorithms can optimize several objective functions simultaneously and can be applied successfully to clustering: the clustering problem is transformed into the optimization of several clustering-index objective functions, which makes the approach applicable to a wider range of data types and yields more comprehensive overall performance. A sketch of the differential-evolution-style mutation used to keep the firefly population diverse is given below.
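The mutation mechanism described in the abstract can be sketched as follows (a minimal illustration assuming a standard DE/rand/1 mutation form; the selection-probability and shrinkage-factor schedules below are our assumptions, not the paper's published formulas):

```python
import numpy as np

def de_style_mutation(population, t, t_max, f0=0.5):
    """Apply a DE-like mutation to a firefly population (sketch).

    Individuals are selected with a dynamically decreasing probability,
    and the shrinkage (scale) factor F is adapted over the generations.
    NOTE: the exact schedules below are illustrative assumptions.
    population: (n, dim) array; t: current generation; t_max: budget.
    """
    n, dim = population.shape
    p_select = 0.5 * (1.0 - t / t_max)     # dynamically decreasing selection probability
    f_scale = f0 * (1.0 - 0.5 * t / t_max)  # adaptively shrinking scale factor
    mutated = population.copy()
    for i in range(n):
        if np.random.rand() < p_select:
            # DE/rand/1-style mutation: x_r1 + F * (x_r2 - x_r3), requires n >= 3.
            r1, r2, r3 = np.random.choice(n, size=3, replace=False)
            mutated[i] = population[r1] + f_scale * (population[r2] - population[r3])
    return mutated
```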

In recent years, clustering methods based on Multi-Objective Evolutionary Algorithms (MOEAs) [1] have gradually become a research hotspot, and many valuable results have already been obtained.

A Tutorial on Spectral Clustering

Ulrike von Luxburg, Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 Tübingen, Germany. ulrike.luxburg@tuebingen.mpg.de
2 Similarity graphs
Given a set of data points x1, . . . , xn and some notion of similarity sij ≥ 0 between all pairs of data points xi and xj, the intuitive goal of clustering is to divide the data points into several groups such that points in the same group are similar and points in different groups are dissimilar to each other. If we do not have more information than the similarities between data points, a nice way of representing the data is in the form of the similarity graph G = (V, E). Each vertex vi in this graph represents a data point xi. Two vertices are connected if the similarity sij between the corresponding data points xi and xj is positive or larger than a certain threshold, and the edge is weighted by sij. The problem of clustering can now be reformulated using the similarity graph: we want to find a partition of the graph such that the edges between different groups have very low weights (which means that points in different clusters are dissimilar from each other) and the edges within a group have high weights (which means that points within the same cluster are similar to each other). To be able to formalize this intuition we first want to introduce some basic graph notation and briefly discuss the kind of graphs we are going to study.
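As a concrete illustration of this construction (our sketch, not from the tutorial), the following builds a weighted similarity graph using the Gaussian similarity function; sigma and the threshold are assumed parameters:

```python
import numpy as np

def gaussian_similarity_graph(points, sigma=1.0, threshold=1e-3):
    """Build the weighted adjacency matrix W of a similarity graph (sketch).

    Uses the Gaussian similarity s_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))
    and connects two vertices only when s_ij exceeds a threshold, as in the
    thresholded similarity graph described above.
    points: (n, d) array of data points; returns an (n, n) weight matrix.
    """
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    weights[weights < threshold] = 0.0   # keep only sufficiently similar pairs
    np.fill_diagonal(weights, 0.0)       # no self-loops
    return weights
```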

A Fast and Accurate Plane Detection Algorithm for Large Noisy Point Clouds Using Filtered Normals

A Fast and Accurate Plane Detection Algorithm for Large Noisy Point Clouds Using Filtered Normals and Voxel Growing
Jean-Emmanuel Deschaud, François Goulette
Mines ParisTech, CAOR - Centre de Robotique, Mathématiques et Systèmes, 60 Boulevard Saint-Michel, 75272 Paris Cedex 06
jean-emmanuel.deschaud@mines-paristech.fr, francois.goulette@mines-paristech.fr

Abstract
With the improvement of 3D scanners, we produce point clouds with more and more points, often exceeding millions of points. We then need a fast and accurate plane detection algorithm to reduce data size. In this article, we present a fast and accurate algorithm to detect planes in unorganized point clouds using filtered normals and voxel growing. Our work is based on a first step that estimates better normals at the data points, even in the presence of noise. In a second step, we compute a score of local planarity at each point. Then, we select the best local seed plane and, in a third step, start a fast and robust region growing by voxels that we call voxel growing. We have evaluated and tested our algorithm on different kinds of point clouds and compared its performance to other algorithms.

1. Introduction
With the growing availability of 3D scanners, we are now able to produce large datasets with millions of points. It is necessary to reduce data size, to decrease the noise and at the same time to increase the quality of the model. It is interesting to model the planar regions of these point clouds by planes. In fact, plane detection is generally a first step of segmentation, but it can be used for many applications. It is useful in computer graphics to model the environment with basic geometry. It is used, for example, in modeling to detect building facades before classification. Robots do Simultaneous Localization and Mapping (SLAM) by detecting planes of the environment. In our laboratory, we wanted to detect small and large building planes in point clouds of urban environments with millions of points, for modeling. As mentioned in [6], the accuracy of the plane detection is important for later steps of the modeling pipeline. We also want to be fast, to be able to process point clouds with millions of points. We present a novel algorithm based on region growing, with improvements in normal estimation and in the growing process. Our method is generic enough to work on different kinds of data, such as point clouds from fixed scanners or from Mobile Mapping Systems (MMS). We also aim at detecting building facades in urban point clouds, or small planes like doors, even in very large data sets. Our input is an unorganized noisy point cloud and, with only three "intuitive" parameters, we generate a set of connected components of planar regions. We evaluate our method as well as explain and analyse the significance of each parameter.
2. Previous Works
Although there are many methods of segmentation in range images, as in [10] or [3], three have been thoroughly studied for 3D point clouds: region growing, the Hough transform from [14], and Random Sample Consensus (RANSAC) from [9].
The application of recognising structures in urban laser point clouds is frequent in the literature. Bauer in [4] and Boulaassal in [5] detect facades in dense 3D point clouds with a RANSAC algorithm. Vosselman in [23] reviews surface growing and 3D Hough transform techniques to detect geometric shapes. Tarsh-Kurdi in [22] detect roof planes in 3D building point clouds by comparing results of the Hough transform and of the RANSAC algorithm; they found that RANSAC is more efficient than the former. Chao Chen in [6] and Yu in [25] present segmentation algorithms in range images for the same application of detecting planar regions in an urban scene. The method in [6] is based on a region growing algorithm in range images and merges the results into one labelled 3D point cloud. [25] uses a method different from the three we have cited: they extract a hierarchical subdivision of the input image built like a graph, where leaf nodes represent planar regions.
There are also other methods, like Bayesian techniques. In [16] and [8], smoothed surfaces are obtained from noisy point clouds, with objects modeled by probability distributions, and it seems possible to extend this idea to point cloud segmentation. But techniques based on Bayesian statistics need to optimize a global statistical model, and it is then difficult to process point clouds larger than one million points.
We present below an analysis of the two main methods used in the literature: RANSAC and region growing. The Hough transform algorithm is too time-consuming for our application. To compare the complexity of the algorithms, we take a point cloud of size N with only one plane P of size n. We suppose that we want to detect this plane P, and we define n_min as the minimum size of the planes we want to detect. The size of a plane is its area. If the data density is uniform in the point cloud, then the size of a plane can be specified by its number of points.

2.1. RANSAC
RANSAC is an algorithm initially developed by Fischler and Bolles in [9] that allows the fitting of models without trying all possibilities. RANSAC is based on the probability of detecting a model using the minimal set required to estimate it. To detect a plane with RANSAC, we choose 3 random points (enough to estimate a plane) and compute the plane parameters from them. A score function is then used to determine how good the model is for the remaining points. Usually, the score is the number of points belonging to the plane. With noise, a point belongs to a plane if the distance from the point to the plane is less than a parameter γ. In the end, we keep the plane with the best score. The probability of getting the plane in the first trial is p = (n/N)^3, so the probability of getting it within T trials is p = 1 − (1 − (n/N)^3)^T. Using equation (1) and supposing n_min/N ≪ 1, we know the minimal number of trials T_min needed to have a probability p_t of getting planes of size at least n_min:

    T_min = log(1 − p_t) / log(1 − (n_min/N)^3) ≈ log(1/(1 − p_t)) · (N/n_min)^3.    (1)

For each trial, we test all data points to compute the score of a plane. The RANSAC algorithm complexity therefore lies in O(N · (N/n_min)^3) when n_min/N ≪ 1, and T_min → 0 when n_min → N. RANSAC is thus very efficient at detecting large planes in noisy point clouds, i.e. when the ratio n_min/N is close to 1, but very slow at detecting small planes in large point clouds, i.e. when n_min/N ≪ 1. After selecting the best model, another step is to extract the largest connected component of each plane: connected components mean that the minimum distance between each point of the plane and the other points is smaller than a fixed parameter.
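The following is a minimal sketch of this RANSAC plane detection in Python with NumPy (an illustration of Section 2.1, not the authors' implementation; the parameter values are assumptions):

```python
import numpy as np

def ransac_plane(points, gamma=0.05, p_t=0.99, n_min_ratio=0.1):
    """Minimal RANSAC plane detection (illustrative sketch).

    gamma is the point-to-plane distance threshold; the number of trials
    follows T_min ~ log(1/(1-p_t)) * (N/n_min)^3 from equation (1).
    points: (N, 3) array. Returns ((normal, d), inlier_indices).
    """
    n_pts = len(points)
    n_min = max(3, int(n_min_ratio * n_pts))
    trials = int(np.log(1.0 / (1.0 - p_t)) * (n_pts / n_min) ** 3)
    best_inliers, best_model = np.array([], dtype=int), None
    for _ in range(max(trials, 1)):
        # Estimate a plane from 3 random points.
        p0, p1, p2 = points[np.random.choice(n_pts, 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:          # degenerate (collinear) sample, retry
            continue
        normal /= norm
        d = -normal @ p0
        # Score = number of points within distance gamma of the plane.
        dist = np.abs(points @ normal + d)
        inliers = np.nonzero(dist < gamma)[0]
        if len(inliers) > len(best_inliers):
            best_inliers, best_model = inliers, (normal, d)
    return best_model, best_inliers
```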
Schnabel et al. [20] bring two optimizations to RANSAC: the point selection is done locally, and the score function is improved. An octree is first created from the point cloud, and the points used to estimate the plane parameters are chosen locally at a random depth of the octree. The score function also differs from RANSAC: instead of testing all points for one model, they test only a random subset and find the score by interpolation. The algorithm complexity lies in O(N · r · 4N/(d · n_min)), where r is the number of random subsets for the score function and d is the maximum octree depth. Their algorithm improves the plane detection speed, but its complexity lies in O(N²) and it becomes slow on large data sets. And again, we have to extract the largest connected component of each plane.

2.2. Region Growing
Region growing algorithms work well in range images, as in [18]. The principle of region growing is to start with a seed region and to grow it by neighborhood as long as the neighbors satisfy some conditions. In range images, we have the neighbors of each point from the pixel coordinates. In the case of unorganized 3D data, there is no information about the neighborhood in the data structure. The most common method to compute neighbors in 3D is to build a Kd-tree to search for the k nearest neighbors. The creation of a Kd-tree lies in O(N log N) and the search for the k nearest neighbors of one point lies in O(log N). The advantage of these region growing methods is that they are fast when there are many planes to extract, robust to noise, and extract the largest connected component immediately. But they only use the distance from point to plane to extract planes and, as we will see later, that is not accurate enough to detect correct planar regions.
Rabbani et al. [19] developed a method of smooth area detection that can be used for plane detection. They first estimate the normal of each point, as in [13]. The point with the minimum residual starts the region growing. They then test the k nearest neighbors of the last point added: if the angle between the normal of the point and the current normal of the plane is smaller than a parameter α, the point is added to the smooth region. With a Kd-tree for the k nearest neighbors, the algorithm complexity is in O(N + n log N).
The complexity seems to be low, but in the worst case, when n/N ≈ 1, for example for facade detection in point clouds, the complexity becomes O(N log N).

3. Voxel Growing
3.1. Overview
In this article, we present a new algorithm adapted to large data sets of unorganized 3D points and optimized to be accurate and fast. Our plane detection method works in three steps. In the first part, we compute a better estimation of the normal at each point by a filtered weighted plane fitting. In a second step, we compute the score of local planarity at each point. We select the best seed point, which represents a good seed plane, and in the third part we grow this seed plane by adding all points close to the plane. The growing step is based on a voxel growing algorithm. The filtered normals, the score function and the voxel growing are the innovative contributions of our method.
As input, we need dense point clouds related to the level of detail we want to detect. As output, we produce connected components of planes in the point cloud. This notion of connected components is linked to the data density. With our method, the connected components of the detected planes are linked to the parameter d of the voxel grid.
Our method has 3 "intuitive" parameters: d, area_min and γ; "intuitive" because they are linked to physical measurements. d is the voxel size used in voxel growing and also represents the connectivity of points in detected planes. γ is the maximum distance between a point of a plane and the plane model; it represents the plane thickness and is linked to the point cloud noise. area_min represents the minimum area of the planes we want to keep.

3.2. Details
3.2.1 Local Density of Point Clouds
In a first step, we compute the local density of the point cloud, as in [17]. For that, we find the radius r_i of the sphere containing the k nearest neighbors of point i. Then we calculate ρ_i = k / (π r_i²). In our experiments, we find that k = 50 is a good number of neighbors. It is important to know the local density because many laser point clouds are made with a fixed-resolution-angle scanner and are therefore not evenly distributed. We use the local density in Section 3.2.3 for the score calculation.

3.2.2 Filtered Normal Estimation
Normal estimation is an important part of our algorithm. The paper [7] presents and compares three normal estimation methods; it concludes that weighted plane fitting (WPF) is the fastest and the most accurate for large point clouds. WPF is an idea of Pauly et al. in [17]: the fitting plane of a point p must take the nearby points into consideration more than the distant ones. The least-squares normal is explained in [21] and is the minimizer of Σ_{i=1}^{k} (n_p · p_i + d)². The WPF normal is the minimizer of Σ_{i=1}^{k} ω_i (n_p · p_i + d)², where ω_i = θ(‖p_i − p‖) and θ(r) = e^(−2r²/r_i²). To solve for n_p, we compute the eigenvector corresponding to the smallest eigenvalue of the weighted covariance matrix C_w = Σ_{i=1}^{k} ω_i (p_i − b_w)(p_i − b_w)^T, where b_w is the weighted barycenter. For the three methods explained in [7], we get a good approximation of the normals in smooth areas, but we have errors at sharp corners. In Figure 1, we have tested the weighted normal estimation on two planes with uniform noise forming an angle of 90°: the normal is not correct on the corners of the planes and in the red circle.
To improve the normal calculation, which improves the plane detection especially on the borders of planes, we propose a filtering process in two phases. In a first step, we compute the weighted normals (WPF) of each point as described above, by minimizing Σ_{i=1}^{k} ω_i (n_p · p_i + d)². In a second step, we compute the filtered normal using an adaptive local neighborhood: we compute the new weighted normal with the same minimization, but keeping only the points of the neighborhood whose first-step normals satisfy |n_p · n_i| > cos(α). With this filtering step, we have the same results in smooth areas and better results at sharp corners. We call our normal estimation filtered weighted plane fitting (FWPF).
Figure 1. Weighted normal estimation of two planes with uniform noise and a 90° angle between them.
We have tested our normal estimation by computing normals on synthetic data with two planes, different angles between them, and different values of the parameter α. Figure 2 shows the mean error in normal estimation for WPF and FWPF with α = 20°, 30°, 40° and 90° (using α = 90° is the same as not doing the filtering step). We see in Figure 2 that α = 20° gives a smaller error in normal estimation when the angle between the planes is smaller than 60°, and α = 30° gives the best results when the angle between the planes is greater than 60°. We consider α = 30° the best value because it gives the smallest mean error in normal estimation when the angle between the planes varies from 20° to 90°. Figure 3 shows the normals of the planes with a 90° angle and better results in the red circle (the normals are at 90° to the plane).
Figure 2. Comparison of the mean error in normal estimation of two planes with α = 20°, 30°, 40° and 90° (= no filtering).
Figure 3. Filtered weighted normal estimation of two planes with uniform noise and a 90° angle between them (α = 30°).
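Below is a minimal Python/NumPy sketch of this two-pass FWPF normal estimation (an illustration of the formulas above, not the authors' code; the function names and array shapes are our assumptions):

```python
import numpy as np

def weighted_plane_normal(p, neighbors, r_i):
    """Weighted plane fitting (WPF) normal at point p (sketch of Sec. 3.2.2).

    Minimizes sum_i w_i (n.p_i + d)^2 with weights w_i = exp(-2||p_i - p||^2 / r_i^2):
    the normal is the eigenvector of the weighted covariance matrix C_w
    with the smallest eigenvalue. neighbors: (k, 3); r_i: k-NN sphere radius.
    """
    diffs = neighbors - p
    weights = np.exp(-2.0 * np.sum(diffs ** 2, axis=1) / r_i ** 2)
    barycenter = weights @ neighbors / weights.sum()   # weighted barycenter b_w
    centered = neighbors - barycenter
    cov = (weights[:, None] * centered).T @ centered   # weighted covariance C_w
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, 0]                               # smallest-eigenvalue eigenvector

def filtered_normal(p, neighbors, first_pass_normals, n_p, alpha_deg=30.0, r_i=1.0):
    """FWPF second pass (sketch): refit using only neighbors whose first-pass
    normals n_i satisfy |n_p . n_i| > cos(alpha), as described above."""
    keep = np.abs(first_pass_normals @ n_p) > np.cos(np.radians(alpha_deg))
    return weighted_plane_normal(p, neighbors[keep], r_i)
```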
3.2.3 The Score of Local Planarity
In many region growing algorithms, the criterion used for the score of the local fitting plane is the residual, as in [18] or [19], i.e. the sum of the squared distances from the points to the plane. We use a different score function to estimate local planarity. We first compute the neighbors N_i of a point p as the points whose normals n_i are close to the normal n_p; more precisely, N_i = {p_j among the k neighbors of p such that |n_i · n_p| > cos(α)}. This is a way to keep only the points that are probably on the local plane before the least-squares fitting. Then, we compute the local fitting plane of point p with the neighbors N_i by least squares, as in [21]. The set N'_i is the subset of N_i of points belonging to the plane, i.e. the points whose distance to the local plane is smaller than the parameter γ (to account for noise). The score s of the local plane is the area of the local plane, i.e. the number of points "in" the plane divided by the local density ρ_i (seen in Section 3.2.1): s = card(N'_i) / ρ_i. We take the area of the local plane as the score function, and not the number of points or the residual, in order to be more robust to the sampling distribution.

3.2.4 Voxel Decomposition
We use a data structure that is the core of our region growing method: a voxel grid that speeds up the plane detection process. Voxels are small cubes of side length d that partition the point cloud space. Every data point belongs to a voxel, and a voxel contains a list of points. We use the Octree Class Template in [2] to compute an octree of the point cloud; the leaf nodes of the graph built are voxels of size d. Once the voxel grid has been computed, we start the plane detection algorithm.

3.2.5 Voxel Growing
With the estimator of local planarity, we take the point p with the best score, i.e. the point with the maximum area of local plane. We have the model parameters of this best seed plane, and we start with an empty set E of points belonging to the plane. The initial point p is in a voxel v_0. All the points in the initial voxel v_0 whose distance from the seed plane is less than γ are added to the set E. Then, we compute new plane parameters by least-squares refitting with the set E. Instead of growing with the k nearest neighbors, we grow with voxels: we test the points in the 26 neighboring voxels. This is a way to search the neighborhood in constant time, instead of O(log N) per neighbor as with a Kd-tree. In a neighboring voxel, we add to E the points whose distance to the current plane is smaller than γ and for which the angle between the normal computed at the point and the normal of the plane is smaller than a parameter α: |cos(n_p, n_P)| > cos(α), where n_p is the normal of the point p and n_P is the normal of the plane P. We have tested different values of α and empirically found that 30° is a good value for all point clouds. If we added at least one point to E for this voxel, we compute new plane parameters from E by least-squares fitting and test its 26 neighboring voxels. It is important to perform the plane least-squares fitting at each voxel addition, because, with noise, the seed plane model is not good enough to be used for all the voxel growing, but only in the surrounding voxels. This growing process is faster than classical region growing because we do not compute a least-squares fit for each point added, but only for each voxel added.
The least-squares fitting step must be computed very fast. We use the same method as explained in [18], with an incremental update of the barycenter b and of the covariance matrix C as in equation (2). We know from [21] that the barycenter b belongs to the least-squares plane and that the normal n_P of the least-squares plane is the eigenvector associated with the smallest eigenvalue of C.

    b_0 = 0_{3×1},  C_0 = 0_{3×3},
    b_{n+1} = (1/(n+1)) (n b_n + p_{n+1}),
    C_{n+1} = C_n + (n/(n+1)) (p_{n+1} − b_n)(p_{n+1} − b_n)^T,    (2)

where C_n is the covariance matrix of a set of n points, b_n is the barycenter vector of the set, and p_{n+1} is the (n+1)-th point vector added to the set.
This voxel growing method leads to a connected component set E, because the points have been added through connected voxels. In our case, the minimum distance between one point and E is less than the parameter d of our voxel grid; that is why the parameter d also represents the connectivity of points in detected planes.
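A minimal sketch of the incremental update of equation (2) in Python/NumPy (illustrative; the function and variable names are ours, not the paper's):

```python
import numpy as np

def incremental_plane_update(b_n, c_n, n, p_new):
    """Incremental barycenter/covariance update of equation (2) (sketch).

    b_{n+1} = (n*b_n + p_{n+1}) / (n+1)
    C_{n+1} = C_n + n/(n+1) * (p_{n+1} - b_n)(p_{n+1} - b_n)^T
    The least-squares plane passes through b, and its normal is the
    eigenvector of C with the smallest eigenvalue [21].
    """
    d = p_new - b_n
    b_next = (n * b_n + p_new) / (n + 1)
    c_next = c_n + (n / (n + 1)) * np.outer(d, d)
    normal = np.linalg.eigh(c_next)[1][:, 0]   # plane normal after the update
    return b_next, c_next, normal
```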
3.2.6 Plane Detection
To get all planes with an area of at least area_min in the point cloud, we repeat these steps (best local seed plane choice and voxel growing) over all points, in descending order of their score. Once we have a set E whose area is bigger than area_min, we keep it and classify all the points in E.

4. Results and Discussion
4.1. Benchmark Analysis
To test the improvements of our method, we employed the comparative framework of [12], based on range images. For that, we converted all images into 3D point clouds; all the point clouds created have 260k points. After our segmentation, we project the labelled points onto a segmented image and compare it with the ground truth image. We chose our three parameters d, area_min and γ by optimizing the result on the 10 perceptron training image segmentations (the perceptron is a portable scanner that produces a range image of its environment). The best results were obtained with area_min = 200, γ = 5 and d = 8 (units are not provided in the benchmark). We show the results of the segmentation of the 30 perceptron images in Table 1.
GT Regions is the mean number of ground-truth planes over the 30 ground-truth range images. Correct detection, over-segmentation, under-segmentation, missed and noise are the mean numbers of correct, over-segmented, under-segmented, missed and noise planes detected by the methods. The tolerance of 80% is the minimum percentage of points that must be detected, compared to the ground truth, to count as a correct detection. More details are in [12]. UE is a method from [12]; UFPR is a method from [10]. It is important to notice that UE and UFPR are range image methods, while our method is suited not to range images but to 3D point clouds. Nevertheless, it is a good benchmark for comparison, and we see in Table 1 that the accuracy of our method is very close to the state of the art in range image segmentation.
To evaluate the different improvements of our algorithm, we tested several variants of our method: without normals (only with the distance from points to the plane), without voxel growing (with a classical region growing by k neighbors), without our FWPF normal estimation (with WPF normal estimation), and without our score function (with the residual score function). The comparison is visible in Table 2. We can see the difference in computing time between region growing and voxel growing. We tested our algorithm with and without normals and found that the accuracy cannot be achieved without normal computation. There is also a big difference in correct detection between WPF and our FWPF normal estimation, as can be seen in Figure 4: our FWPF normals bring a real improvement in the border estimation of planes. Black points in the figure are non-classified points.
Figure 5. Correct detection of our segmentation algorithm when the voxel size d changes.
We would like to discuss the influence of the parameters on our algorithm. We have three parameters: area_min, which represents the minimum area of the planes we want to keep; γ, which represents the thickness of the plane (it is generally closely tied to the noise in the point cloud, and especially to the standard deviation σ of the noise); and d, which is the minimum distance from a point to the rest of the plane. These three parameters depend on the point cloud features and on the desired segmentation. For example, if we have a lot of noise, we must choose a high γ value. If we want to detect only large planes, we set a large area_min value. We also focus our analysis on the robustness of the voxel size d in our algorithm, i.e. the ratio of points vs voxels. We can see in Figure 5 the variation of the correct detection when we change the value of d. The method seems to be robust when d is between 4 and 10, but the quality decreases when d is over 10. This is due to the fact that, for a large voxel size d, some planes from different objects are merged into one plane.

Table 1. Average results of different segmenters at 80% compare tolerance.
Method       GT Regions   Correct detection   Over-segmentation   Under-segmentation   Missed   Noise   Duration (s)
UE           14.6         10.0                0.2                 0.3                  3.8      2.1     -
UFPR         14.6         11.0                0.3                 0.1                  3.0      2.5     -
Our method   14.6         10.9                0.2                 0.1                  3.3      0.7     308

Table 2. Average results of variants of our segmenter at 80% compare tolerance.
Variant                      GT Regions   Correct detection   Over-segmentation   Under-segmentation   Missed   Noise   Duration (s)
Without normals              14.6         5.67                0.1                 0.1                  9.4      6.5     70
Without voxel growing        14.6         10.7                0.2                 0.1                  3.4      0.8     605
Without FWPF                 14.6         9.3                 0.2                 0.1                  5.0      1.9     195
Without our score function   14.6         10.3                0.2                 0.1                  3.9      1.2     308
With all improvements        14.6         10.9                0.2                 0.1                  3.3      0.7     308

4.1.1 Large Scale Data
We have tested our method on different kinds of data. We segmented the urban data in Figure 6 from our Mobile Mapping System (MMS) described in [11]. The mobile system generates 10k pts/s with a density of 50 pts/m² and very noisy data (σ = 0.3 m). For this point cloud, we want to detect building facades; we chose area_min = 10 m² and d = 1 m to get large connected components, and γ = 0.3 m to cope with the noise.
We tested our method on a point cloud from the Trimble VX scanner in Figure 7. It is a point cloud of 40k points with only 20 pts/m², with less noise because it is a fixed scanner (σ = 0.2 m). In that case, we also wanted to detect building facades and kept the same parameters except γ = 0.2 m, because there was less noise. We see in Figure 7 that we detected two facades; by setting a larger voxel size, such as d = 10 m, we detect only one plane. We choose d, like area_min and γ, according to the desired segmentation and to the level of detail we want to extract from the point cloud.
We also tested our algorithm on the point cloud from the LEICA Cyrax scanner in Figure 8. This point cloud was taken from the AIM@SHAPE repository [1]. It is a very dense point cloud from multiple fixed positions of the scanner, with about 400 pts/m² and very little noise (σ = 0.02 m). In this case, we wanted to detect all the small planes, to model the church as planar regions; that is why we chose d = 0.2 m, area_min = 1 m² and γ = 0.02 m.
In Figures 6, 7 and 8 we show, on the left, the input point cloud and, on the right, only the points detected in a plane (planes are in random colors). The red points in these figures are seed plane points. We can see in these figures that planes are very well detected, even with high noise.
Table 3 shows the information on the point clouds and the results, with the number of planes detected and the duration of the algorithm. The time includes the computation of the FWPF normals of the point cloud. We can see in Table 3 that our algorithm performs linearly in time with respect to the number of points. The choice of parameters has little influence on the computing time. The computation time is about one millisecond per point, whatever the size of the point cloud (we used a PC with a QuadCore Q9300 and 2 GB of RAM). The algorithm was implemented using only one thread and in-core processing. Our goal is to compare the improvement of plane detection between classical region growing and our region growing with better normals, for more accurate planes, and voxel growing, for faster detection. Our method seems to be compatible with out-of-core implementations like those described in [24] or [15].

Table 3. Results on different data.
                   MMS Street   VX Street   Church
Size (points)      398k         42k         7.6M
Mean density       50 pts/m²    20 pts/m²   400 pts/m²
Number of planes   20           21          42
Total duration     452 s        33 s        6900 s
Time/point         1 ms         1 ms        1 ms

5. Conclusion
In this article, we have proposed a new method of plane detection that is fast and accurate even in the presence of noise. We demonstrate its efficiency on different kinds of data and its speed on large data sets with millions of points. Our voxel growing method has a complexity of O(N); it is able to detect large and small planes in very large data sets and can extract them directly as connected components.
Figure 4. Ground truth, our segmentation without and with filtered normals.
Figure 6. Plane detection in a street point cloud generated by MMS (d = 1 m, area_min = 10 m², γ = 0.3 m).

References
[1] AIM@SHAPE repository.
[2] Octree Class Template. /code/octree.html.
[3] A. Bab-Hadiashar and N. Gheissari. Range image segmentation using surface selection criterion. 2006. IEEE Transactions on Image Processing.
[4] J. Bauer, K. Karner, K. Schindler, A. Klaus, and C. Zach. Segmentation of building models from dense 3D point-clouds. 2003. Workshop of the Austrian Association for Pattern Recognition.
[5] H. Boulaassal, T. Landes, P. Grussenmeyer, and F. Tarsha-Kurdi. Automatic segmentation of building facades using terrestrial laser data. 2007. ISPRS Workshop on Laser Scanning.
[6] C. C. Chen and I. Stamos. Range image segmentation for modeling and object detection in urban scenes. 2007. 3DIM 2007.
[7] T. K. Dey, G. Li, and J. Sun. Normal estimation for point clouds: a comparison study for a Voronoi based method. 2005. Eurographics Symposium on Point-Based Graphics.
[8] J. R. Diebel, S. Thrun, and M. Brunig. A Bayesian method for probable surface reconstruction and decimation. 2006. ACM Transactions on Graphics (TOG).
[9] M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. 1981. Communications of the ACM.
[10] P. F. U. Gotardo, O. R. P. Bellon, and L. Silva. Range image segmentation by surface extraction using an improved robust estimator. 2003. Proceedings of Computer Vision and Pattern Recognition.
[11] F. Goulette, F. Nashashibi, I. Abuhadrous, S. Ammoun, and C. Laurgeau. An integrated on-board laser range sensing system for on-the-way city and road modelling. 2007. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences.
[12] A. Hoover, G. Jean-Baptiste, et al. An experimental comparison of range image segmentation algorithms. 1996. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[13] H. Hoppe, T. DeRose, T. Duchamp, J. McDonald, and W. Stuetzle. Surface reconstruction from unorganized points. 1992. International Conference on Computer Graphics and Interactive Techniques.
[14] P. Hough. Method and means for recognizing complex patterns. 1962. US Patent.
[15] M. Isenburg, P. Lindstrom, S. Gumhold, and J. Snoeyink. Large mesh simplification using processing sequences. 2003.

Collaborative spectrum detection based on blind-sparsity matching pursuit

CHEN Xiaofang; ZHU Cuitao
Journal: Computer Engineering
Year (volume), issue: 2012, 38(15)
Pages: 3 (81-83)
Affiliation: School of Electronic Information Engineering, South-Central University for Nationalities, Wuhan 430074, China
Language: Chinese
CLC classification: TN929

Abstract: A collaborative spectrum detection method based on a backtracking, blind-sparsity matching pursuit algorithm is proposed for sparse signals of unknown sparsity received by cognitive radio users. The algorithm balances the speed and accuracy of spectrum detection: it chooses the candidate set automatically, adopts a staged process to estimate the sparsity, uses a backtracking mechanism to obtain the globally optimal support set, and selects the optimal collaborative users through SNR estimation for joint detection, so that the spectrum can be detected quickly. Experimental results show that, under the same test conditions, the algorithm outperforms comparable algorithms, and its detection probability is about 25% higher than that of a collaborative detection method without user selection.
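As an illustration of the family of algorithms the paper builds on, here is a minimal sketch of a sparsity-adaptive matching pursuit with stage switching and backtracking (a generic SAMP-style reconstruction in Python/NumPy, not the authors' exact method; parameters are illustrative):

```python
import numpy as np

def samp(phi, y, step=2, tol=1e-6, max_iter=100):
    """Sparsity-adaptive matching pursuit (sketch, NOT the paper's algorithm).

    Recovers a sparse x from y = phi @ x without knowing the sparsity:
    the support-size estimate grows stage by stage (stage switching), and
    a backtracking least-squares step keeps only the best candidate atoms.
    """
    m, n = phi.shape
    residual = y.copy()
    support = np.array([], dtype=int)
    size = step                              # current support-size estimate
    for _ in range(max_iter):
        # Preliminary test: merge current support with the best-correlated atoms.
        corr = np.abs(phi.T @ residual)
        candidates = np.union1d(support, np.argsort(corr)[-size:])
        # Backtracking: least squares on the candidates, keep `size` largest.
        x_cand, *_ = np.linalg.lstsq(phi[:, candidates], y, rcond=None)
        best = candidates[np.argsort(np.abs(x_cand))[-size:]]
        x_best, *_ = np.linalg.lstsq(phi[:, best], y, rcond=None)
        new_residual = y - phi[:, best] @ x_best
        if np.linalg.norm(new_residual) < tol:
            support = best
            break
        if np.linalg.norm(new_residual) >= np.linalg.norm(residual):
            size += step                     # stage switch: enlarge the support
        else:
            support, residual = best, new_residual
    x = np.zeros(n)
    if len(support):
        x[support], *_ = np.linalg.lstsq(phi[:, support], y, rcond=None)
    return x
```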

Journal of Loss Prevention in the Process Industries

HAZOP — Local approach in the Mexican oil & gas industry
M. Pérez-Marín (a), M.A. Rodríguez-Toral (b,*)
(a) Instituto Mexicano del Petróleo, Dirección de Seguridad y Medio Ambiente, Eje Central Lázaro Cárdenas Norte No. 152, 07730 México, D.F., México
(b) PEMEX, Dirección Corporativa de Operaciones, Gerencia de Análisis de Inversiones, Torre Ejecutiva, Piso 12, Av. Marina Nacional No. 329, 11311 México, D.F., México

Article history: Received 3 September 2012; received in revised form 26 March 2013; accepted 27 March 2013.
Keywords: HAZOP; risk acceptance criteria; oil & gas

Abstract
HAZOP (Hazard and Operability) studies began about 40 years ago, when the process industry and the complexity of its operations started to grow massively in different parts of the world. HAZOP has been successfully applied to hazard identification in process systems by operators, design engineers and consulting firms. Nevertheless, a few decades after its first applications, HAZOP studies are still not truly standard in worldwide industrial practice: it is common to find differences in their execution and in the format of their results. The aim of this paper is to show that in the Mexican oil and gas industry there exists, at the national level, an explicit risk acceptance criterion, which shapes the prioritization of risk scenarios. Although HAZOP in the Mexican oil & gas industry, based on the PEMEX corporate standard, does not differ significantly from HAZOP applied elsewhere, its precise acceptance criterion has the advantage of being fully transparent about the level of risk the local industry is willing to accept, and it helps to convey the extent of HAZOP application in the Mexican oil & gas sector. By contrast, the HAZOP ISO standard does not specify risk acceptance criteria and only mentions that HAZOP can consider scenario ranking. The paper concludes by indicating the major implications of ranking risk in HAZOP before or after safeguards identification.

1. Introduction
HAZOP (Hazard and Operability) studies appeared in a systematic way about 40 years ago (Lawley, 1974): a multidisciplinary group uses keywords on process variables to find potential hazards and operability troubles (Mannan, 2012, pp. 8-31). The basic principle is to have a full process description and to ask, at each node, what deviations from the design intent can occur, what causes produce them, and what consequences can result. This is done systematically by applying the guide words "Not", "More than", "Less than", etc., so as to generate a list of potential failures in equipment and process components.
The objective of this paper is to show that in the Mexican case, at the national level in the oil and gas industry, there is an explicit risk acceptance criterion, which shapes the prioritization of risk scenarios. Although the HAZOP methodology in the Mexican oil & gas industry, based on the PEMEX corporate standard, has a precise acceptance criterion, it does not differ significantly from HAZOP applied elsewhere; its advantage is full transparency about the level of risk the local industry is willing to accept, and it helps in understanding the degree of HAZOP application in the Mexican oil & gas sector. By contrast, the HAZOP ISO standard (ISO, 2000) does not specify risk acceptance criteria and only mentions that HAZOP can consider scenario ranking. The paper concludes by indicating the major implications of prioritizing risk in HAZOP before or after safeguards identification.
2. Previous work
HAZOP studies range from the original ICI method, with required actions only, to current applications based on computerized documentation, registering design intentions at nodes, guide words, causes, deviations, consequences, safeguards, cause frequencies, loss containment impact, risk reduction factors, scenario analysis, finding analysis, and many combinations of these.
Interesting and significant studies about HAZOP have been reported in the open literature, such as the differences between HAZOP and HAZAN (Gujar, 1996), where HAZOP was identified as a qualitative hazard identification technique while HAZAN was considered for quantitative risk determination. This difference is not strictly valid today, since there are now companies using HAZOP with risk analysis and its acceptance criteria (Goyal & Kugan, 2012). Other approaches include HAZOP execution optimization (Khan, 1997); the use of intelligent systems to automate HAZOP (Venkatasubramanian, Zhao, & Viswanathan, 2000); and the integration of HAZOP with Fault Tree Analysis (FTA) and Event Tree Analysis (ETA) (Kuo, Hsu, & Chang, 1997).
According to CCPS (2001), any qualitative method for hazard evaluation applied to identify scenarios in terms of their initial causes, event sequences, consequences and safeguards can be extended to register a Layer of Protection Analysis (LOPA). Since HAZOP scenario reports are typically presented in tabular form, columns can be added giving the frequency, in terms of order of magnitude, and the probability of occurrence identified in LOPA. The Independent and the non-Independent Protection Layers (IPL and non-IPL, respectively) should be identified; then the Probability of Failure on Demand (PFD) for IPLs and non-IPLs can be included, as well as IPL integrity.
Another approach consists of a combined HAZOP/LOPA analysis that includes risk magnitude to rank risk reduction actions (Johnson, 2010); a general method is shown there, without emphasis on any particular application. An extended HAZOP/LOPA analysis for Safety Integrity Level (SIL) is also presented, showing the quantitative benefit of applying risk reduction measures. In this way, a scenario can be compared with tolerable risk criteria, besides being able to compare each scenario according to its risk value.
A recent review paper has reported variations of the HAZOP methodology for several applications, including batch processes, laboratory operations, mechanical operations and programmable electronic systems (PES), among others (Dunjó, Fthenakis, Vílchez, & Arnaldos, 2010).
Wide and important contributions to HAZOP knowledge have thus been reported in the open literature and have promoted the usage and knowledge of HAZOP studies. However, even though the IEC standard on HAZOP studies, IEC-61882:2001, is available, there is no worldwide agreement on HAZOP methodology, and therefore a great variety of approaches to HAZOP studies exist. At the international level there is an ample number of approaches to HAZOP studies; even though the best advanced practices have been adopted by several expert groups around the world, there is no uniformity among different consulting companies or industry internal expert groups (Goyal & Kugan, 2012).
The Mexican case is no exception, but in the local oil and gas industry there is a national PEMEX corporate standard that is specific about HAZOP application; it includes the ranking of risk scenarios (PEMEX, 2008), qualitative hazard ranking, and the two approaches recognized in HAZOP: cause by cause (C×C) and deviation by deviation (D×D). Published work including risk criteria covers approaches in countries of the Americas, Europe and Asia (CCPS, 2009), but nothing about Mexico had been reported.

3. HAZOP variations
In the technical literature there is no consensus on the HAZOP study procedure. Among the several differences, the most important are the variations according to (D×D) or (C×C). Table 1 shows HAZOP variations, where (CQ×CQ) means consequence-by-consequence analysis.
The implication of choosing (C×C) is that this approach yields unique relationships of consequences, safeguards and recommendations for each specific cause of a given deviation. With (D×D), all causes, consequences, safeguards and recommendations are related only to one particular deviation, with the result that not all causes appear to produce all the consequences. (Illustrative data structures for the two approaches are sketched after the tables below.)
In practice, the (D×D) HAZOP approach can shorten the analysis time. However, its drawback appears when the HAZOP includes risk ranking, since it cannot easily be determined which cause to consider in the probability assignment. In (C×C) HAZOP there is no such problem, although the analysis may take more time. The HAZOP team leader should agree the HAZOP approach with the customer and communicate it to the HAZOP team. In our experience, the factors to consider when choosing the HAZOP approach are:
1. If the HAZOP will be followed by a Layer of Protection Analysis (LOPA) for Safety Integrity Level (SIL) selection, then choose (C×C).
2. If the HAZOP is going to be the only hazard identification study, it is worth doing it in greater detail using (C×C).
3. If the HAZOP is part of an environmental risk study that requires a consequence analysis, then use (D×D).
4. If the HAZOP is going to be done in limited time, or because the HAZOP team cannot spend much time on the analysis, then use (D×D), although this is not desirable, since it may compromise process safety.
Regarding risk ranking in HAZOP, the IEC standard (IEC, 2001) treats HAZOP studies as (D×D); it refers to (IEC, 1995) for ranking deviations according to their severity or their relative risk. One advantage of risk ranking is that the presentation of HAZOP results becomes very convenient, in particular when informing the management of the recommendations to be followed first, or with higher priority, as a function of the risk evaluated by the HAZOP team for the cause associated with a given recommendation. Tables 2 and 3 illustrate the convenience of risk ranking in HAZOP, showing recommendations without risk ranking in Table 2 and with risk ranking in Table 3. When a HAZOP presents a list of recommendations without ranking, the management may focus on the recommendations with perhaps the lowest resource needs, and not necessarily on those with the highest risk.

Table 1. Main approaches in HAZOP studies.
Source                                            HAZOP approach
(Crowl & Louvar, 2011)                            (D×D)
(ABS, 2004)                                       (C×C) & (D×D)
(Hyatt, 2003)                                     (C×C), (D×D) & (CQ×CQ)
(IEC, 2001)                                       (D×D)
(CCPS, 2008); (Crawley, Preston, & Tyler, 2008)   (D×D), (C×C)

Table 2. HAZOP recommendations without risk ranking.
Description
Recommendation 1
Recommendation 2
Recommendation 3
Recommendation 4
Recommendation 5

Table 3. HAZOP recommendations with risk ranking.
Scenario risk   Description
High            Recommendation 2
High            Recommendation 5
Medium          Recommendation 3
Low             Recommendation 1
Low             Recommendation 4
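As an illustration of the practical difference between the two worksheet styles (our sketch, not part of the paper or of any standard), the (C×C) and (D×D) rows can be modeled like this:

```python
from dataclasses import dataclass, field

@dataclass
class CxCRow:
    """One HAZOP worksheet row in the cause-by-cause (C×C) approach:
    consequences, safeguards and recommendations are tied to ONE cause
    of a deviation, so a frequency can be assigned unambiguously."""
    deviation: str
    cause: str
    consequences: list = field(default_factory=list)
    safeguards: list = field(default_factory=list)
    recommendations: list = field(default_factory=list)

@dataclass
class DxDRow:
    """One row in the deviation-by-deviation (D×D) approach: all causes
    and consequences hang off the deviation, so not every cause is
    linked to every consequence, which complicates risk ranking."""
    deviation: str
    causes: list = field(default_factory=list)
    consequences: list = field(default_factory=list)
    safeguards: list = field(default_factory=list)
    recommendations: list = field(default_factory=list)
```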
As can be seen from Tables 2 and 3, it is more useful for the management to receive HAZOP results as in Table 3, in order to plan its response according to the risk ranking.

4. HAZOP standard for the Mexican oil & gas industry
The worldwide recognized guidelines for hazard identification (ISO, 2000) mention that, when assigning qualitative risk to scenarios, a risk matrix may be used to compare the importance of the risk reduction measures of the different options, but no specific risk matrix with risk values is given. In Mexico there are two national standards where tolerable and intolerable risks are defined: one is the Mexican National Standard NOM-028 (NOM, 2005) and the other is the PEMEX corporate standard NRF-018 (PEMEX, 2008). Both Mexican standards use the matrix form to relate frequency and consequences.
Fig. 1 shows the risk matrix of (NOM, 2005); the nomenclature for the letters in this matrix is described in Tables 4-6. The risk matrix in (NOM, 2005) is optional for risk management in local chemical process plants. For the Mexican oil & gas industry there is a PEMEX corporate standard (NRF); Fig. 2 shows the corresponding risk matrix (PEMEX, 2008), and the nomenclature for the letters in this matrix is described in Tables 7-9 for risk concerning the community. It is important to mention that the PEMEX corporate standard also considers environmental risks, business risks, and corporate image risks; these are not shown here for space reasons. The Mexican National Standard (NOM), being of general applicability, gives single entities (like PEMEX) the possibility of determining their own risk criteria, as this company opted to do. The PEMEX risk matrix can be converted to the NOM's by grouping frequency categories, thus giving the same flexibility, but with risk acceptance criteria specific to the local industry.
One principal consideration in risk ranking is to define whether the ranking is done before or after safeguards definition. This definition is relevant to:
- the HAZOP kick-off presentation by the HAZOP leader, explaining the implications of risk ranking;
- the definition of the HAZOP schedule: risk ranking before safeguards takes a shorter time, since no time is spent estimating the risk reduction of each safeguard.
If the HAZOP is to be followed by a LOPA, it is advisable to ask the HAZOP leader to rank risk before safeguards definition, since LOPA has established rules for defining which safeguards are protections and the risk reduction they provide. Otherwise, if for time or resource limitations the HAZOP is not going to be followed by a LOPA, the HAZOP should rank risk after safeguards definition. The HAZOP leader should then give the HAZOP team, at the kick-off meeting, a concise explanation of the considerations necessary to identify safeguards, with criteria to distinguish them as Independent Protection Layers (IPL), as well as the risk reduction provided by each IPL. The HAZOP report should make clear all the assumptions and credits given to the protections identified by the HAZOP team.
Figs. 3 and 4 show the two kinds of HAZOP reports, for the cases of risk ranking before and after safeguards definition. In Figs. 3 and 4, "F" means frequency, "C" means consequence, and R is the risk as a function of "F" and "C".
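As an illustration of how such a matrix supports scenario ranking, here is a small Python sketch (ours, with a hypothetical cell assignment — the actual A-D layout is defined graphically in Fig. 2 of NRF-018 and is not reproduced in the text):

```python
# Frequency categories F1 (remote) .. F4 (high) and consequence categories
# C1 (minor) .. C4 (catastrophic) follow Tables 7 and 8.
# NOTE: the cell assignments below are HYPOTHETICAL placeholders, not the
# actual matrix published in NRF-018.
RISK_MATRIX = {
    ("F4", "C4"): "A", ("F4", "C3"): "A", ("F4", "C2"): "B", ("F4", "C1"): "C",
    ("F3", "C4"): "A", ("F3", "C3"): "B", ("F3", "C2"): "C", ("F3", "C1"): "C",
    ("F2", "C4"): "B", ("F2", "C3"): "C", ("F2", "C2"): "C", ("F2", "C1"): "D",
    ("F1", "C4"): "C", ("F1", "C3"): "C", ("F1", "C2"): "D", ("F1", "C1"): "D",
}

def rank_scenarios(scenarios):
    """Sort HAZOP scenarios by risk level, A (intolerable) .. D (acceptable)."""
    order = {"A": 0, "B": 1, "C": 2, "D": 3}
    ranked = [(RISK_MATRIX[(f, c)], desc) for desc, f, c in scenarios]
    return sorted(ranked, key=lambda rc: order[rc[0]])

# Example: recommendations ranked by risk, as in Table 3.
print(rank_scenarios([
    ("Recommendation 1", "F2", "C2"),
    ("Recommendation 2", "F4", "C4"),
    ("Recommendation 3", "F3", "C2"),
]))
```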
Fig. 1. Risk matrix in (NOM, 2005).

Table 4. Consequences description (X-axis of the matrix in Fig. 1) (NOM, 2005).
C4   One or more fatalities (on site). Injuries or fatalities in the community (off site).
C3   Permanent damage in a specific process or construction area. Several disability accidents or hospitalization.
C2   One disability accident. Multiple injuries.
C1   One injured person. Emergency response without injuries.

Table 5. Probability description (Y-axis of the matrix in Fig. 1) (NOM, 2005).
L4   1 in 10 years
L3   1 in 100 years
L2   1 in 1000 years
L1   <1 in 1000 years

Table 6. Risk description (within the matrix in Fig. 1) (NOM, 2005).
A   Intolerable: risk must be reduced.
B   Undesirable: risk reduction is required, or a more rigorous risk estimation.
C   Tolerable risk: risk reduction is needed.
D   Tolerable risk: risk reduction is not needed.

Fig. 2. Risk matrix as in (PEMEX, 2008).

Table 7. Probability description (Y-axis of the matrix in Fig. 2) (PEMEX, 2008).
Category   Type   Quantitative                                    Qualitative
High       F4     >10^-1 (>1 in 10 years)                         The event can occur within the next 10 years.
Medium     F3     10^-1 to 10^-2 (1 in 10 to 1 in 100 years)      It can occur at least once in the facility lifetime.
Low        F2     10^-2 to 10^-3 (1 in 100 to 1 in 1000 years)    Possible; it has never occurred in the facility, but has probably occurred in a similar facility.
Remote     F1     <10^-3 (<1 in 1000 years)                       Virtually impossible; its occurrence is not realistic.

One disadvantage of risk ranking before safeguards definition is that the resulting risks are usually found to be High, Intolerable or Unacceptable. This makes it difficult for the management to decide which recommendations should be carried out first and which can wait. One advantage of risk ranking after safeguards definition is that it shows the management the risk scenarios fully classified, without any tendency to identify most risks as High (Intolerable or Unacceptable); in this way, the management gets a good description of which scenarios need prompt attention, so that risks can be taken to tolerable levels.
There is commercial software for the HAZOP methodology, but it normally requires users to supply their own risk matrix, since the definition of a risk matrix represents extensive knowledge, resources and consensus to be recognized. The Mexican case is unique worldwide in HAZOP methodology, since it uses an agreed and recognized risk matrix and risk prioritizing criteria according to the local culture and risk understanding of the oil & gas sector. The risk matrix, with its corresponding risk levels, took political, economic and ethical values into account.
The advantages of using a risk matrix in HAZOP are that it is easy to understand and apply; that, once established and recognized, it is of low cost; and that it allows risk ranking, thus helping to set risk reduction requirements and limitations. Some disadvantages of risk matrix use are that it may sometimes be difficult to separate frequency categories — for instance, it may not be easy to separate Low from Remote in Table 7 — and that the risk matrix subdivision may carry important uncertainties, because there are qualitative considerations in its definition. Thus, it may be advantageous to update the PEMEX corporate HAZOP standard (PEMEX, 2008) to a 6×6 matrix instead of the current 4×4 matrix.

5. Conclusions
HAZOP studies are not a simple procedural application that assures safe process systems on its own; they are part of a global design cycle.
Thus, it is necessary to establish the HAZOP study scope beforehand; it should include at least: methodology, type ((C×C), (D×D), etc.), report format, risk acceptance criteria and expected results. Mexico belongs to the small number of places where risk acceptance criteria have been explicitly defined for HAZOP studies at the national level.

Table 8. Consequences description (X-axis of the matrix in Fig. 2) (PEMEX, 2008): event type and consequence category for effects on people.
Neighbors' health and safety:
  C1 (Minor): No impact on public health and safety.
  C2 (Moderate): Neighborhood alert; potential impact on public health and safety.
  C3 (Serious): Evacuation; minor injuries or moderate consequences for public health and safety; side-effect costs between 5 and 10 million MX$ (0.38-0.76 million US$).
  C4 (Catastrophic): Evacuation; injured people; one or more fatalities; severe consequences for public health and safety; injury and side-consequence costs over 10 million MX$ (0.76 million US$).
Health and safety of employees and service providers/contractors:
  C1 (Minor): No injuries; first aid.
  C2 (Moderate): Medical treatment; minor injuries without work disability; reversible health effects.
  C3 (Serious): Hospitalization; multiple injured people; total or partial disability; moderate health effects.
  C4 (Catastrophic): One or more fatalities; severe injuries with irreversible damage; permanent total or partial incapacity.

Table 9. Risk description (within the matrix in Fig. 2) (PEMEX, 2008).
A (Intolerable): Risk requires immediate action; cost should not be a limitation, and doing nothing is not an acceptable option. A level "A" risk represents an emergency situation, and immediate temporary controls should be implemented. Risk mitigation should be achieved by engineered controls and/or human factors until the risk is reduced to type "C", or preferably to type "D", in less than 90 days.
B (Undesirable): Risk should be reduced, and additional investigation is required. Corrective actions should be taken within 90 days; if the solution takes longer, immediate temporary on-site controls should be installed for risk reduction.
C (Acceptable with control): Significant risk, but it can be compensated for with corrective actions during programmed facility shutdowns, to avoid interrupting work plans and extra costs. Measures to resolve risk findings should be completed within 18 months. Mitigation actions should focus on operational discipline and the reliability of protection systems.
D (Reasonably acceptable): Risk requires control, but it is of low impact, and its attention can be carried out along with other operational improvements.

Fig. 3. Risk ranking before safeguards definition.
Fig. 4. Risk ranking after safeguards definition.

References
ABS. (2004). Process Safety Institute course 103, "Process hazard analysis leader training, using the HAZOP and what-if/checklist techniques". Houston, TX: American Bureau of Shipping.
CCPS (Center for Chemical Process Safety). (2001). Layer of protection analysis: Simplified process risk assessment. New York, USA: AIChE.
CCPS (Center for Chemical Process Safety). (2008). Guidelines for hazard evaluation procedures (3rd ed.). New York, USA: AIChE/John Wiley & Sons.
CCPS (Center for Chemical Process Safety). (2009). Guidelines for developing quantitative safety risk criteria, Appendix B: Survey of worldwide risk criteria applications. New York, USA: AIChE.
Crawley, F., Preston, M., & Tyler, B. (2008). HAZOP: Guide to best practice (2nd ed.). UK: Institution of Chemical Engineers.
Crowl, D. A., & Louvar, J. F. (2011). Chemical process safety: Fundamentals with applications (3rd ed.). New Jersey, USA: Prentice Hall.
Dunjó, J., Fthenakis, V., Vílchez, J. A., & Arnaldos, J. (2010). Hazard and operability (HAZOP) analysis: A literature review. Journal of Hazardous Materials, 173, 19-32.
Goyal, R. K., & Kugan, S. (2012). Hazard and operability studies (HAZOP) — best practices adopted by BAPCO (Bahrain Petroleum Company). Presented at the SPE Middle East Health, Safety, Security and Environment Conference and Exhibition, Abu Dhabi, UAE, 2-4 April.
Gujar, A. M. (1996). Myths of HAZOP and HAZAN. Journal of Loss Prevention in the Process Industries, 9(6), 357-361.
Hyatt, N. (2003). Guidelines for process hazards analysis, hazards identification and risk analysis (pp. 6-7 to 6-9). Ontario, Canada: CRC Press.
IEC. (1995). IEC 60300-3-9:1995. Dependability management — Part 3: Application guide — Section 9: Risk analysis of technological systems. Geneva: International Electrotechnical Commission.
IEC. (2001). IEC 61882. Hazard and operability studies (HAZOP studies) — Application guide. Geneva: International Electrotechnical Commission.
ISO. (2000). ISO 17776. Guidelines on tools and techniques for hazard identification and risk assessment. Geneva: International Organization for Standardization.
Johnson, R. W. (2010). Beyond-compliance uses of HAZOP/LOPA studies. Journal of Loss Prevention in the Process Industries, 23(6), 727-733.
Khan, F. I. (1997). OptHAZOP — an effective and optimum approach for HAZOP study. Journal of Loss Prevention in the Process Industries, 10(3), 191-204.
Kuo, D. H., Hsu, D. S., & Chang, C. T. (1997). A prototype for integrating automatic fault tree/event tree/HAZOP analysis. Computers & Chemical Engineering, 21(9-10), S923-S928.
Lawley, H. G. (1974). Operability studies and hazard analysis. Chemical Engineering Progress, 70(4), 45-56.
Mannan, S. (2012). Lee's loss prevention in the process industries: Hazard identification, assessment and control (Vol. 1, 3rd ed., pp. 8-31). Elsevier.
NOM. (2005). NOM-028-STPS-2004, Mexican National Standard ("Norma Oficial Mexicana"): Organización del trabajo — Seguridad en los procesos de sustancias químicas (in Spanish). Published January 2005.
PEMEX. (2008). Corporate standard ("Norma de Referencia") NRF-018-PEMEX-2007, "Estudios de Riesgo" (in Spanish). Published January 2008.
Venkatasubramanian, V., Zhao, J., & Viswanathan, S. (2000). Intelligent systems for HAZOP analysis of complex process plants. Computers & Chemical Engineering, 24(9-10), 2291-2302.

Cluster analysis

8 Cluster Analysis: Basic Concepts and Algorithms

Cluster analysis divides data into groups (clusters) that are meaningful, useful, or both. If meaningful groups are the goal, then the clusters should capture the natural structure of the data. In some cases, however, cluster analysis is only a useful starting point for other purposes, such as data summarization. Whether for understanding or utility, cluster analysis has long played an important role in a wide variety of fields: psychology and other social sciences, biology, statistics, pattern recognition, information retrieval, machine learning, and data mining.
There have been many applications of cluster analysis to practical problems. We provide some specific examples, organized by whether the purpose of the clustering is understanding or utility.

Clustering for Understanding. Classes, or conceptually meaningful groups of objects that share common characteristics, play an important role in how people analyze and describe the world. Indeed, human beings are skilled at dividing objects into groups (clustering) and assigning particular objects to these groups (classification). For example, even relatively young children can quickly label the objects in a photograph as buildings, vehicles, people, animals, plants, etc. In the context of understanding data, clusters are potential classes, and cluster analysis is the study of techniques for automatically finding classes. The following are some examples:
• Biology. Biologists have spent many years creating a taxonomy (hierarchical classification) of all living things: kingdom, phylum, class, order, family, genus, and species. Thus, it is perhaps not surprising that much of the early work in cluster analysis sought to create a discipline of mathematical taxonomy that could automatically find such classification structures. More recently, biologists have applied clustering to analyze the large amounts of genetic information that are now available. For example, clustering has been used to find groups of genes that have similar functions.
• Information Retrieval. The World Wide Web consists of billions of Web pages, and the results of a query to a search engine can return thousands of pages. Clustering can be used to group these search results into a small number of clusters, each of which captures a particular aspect of the query. For instance, a query of "movie" might return Web pages grouped into categories such as reviews, trailers, stars, and theaters. Each category (cluster) can be broken into subcategories (subclusters), producing a hierarchical structure that further assists a user's exploration of the query results.
• Climate. Understanding the Earth's climate requires finding patterns in the atmosphere and ocean. To that end, cluster analysis has been applied to find patterns in the atmospheric pressure of polar regions and areas of the ocean that have a significant impact on land climate.
• Psychology and Medicine. An illness or condition frequently has a number of variations, and cluster analysis can be used to identify these different subcategories. For example, clustering has been used to identify different types of depression. Cluster analysis can also be used to detect patterns in the spatial or temporal distribution of a disease.
• Business. Businesses collect large amounts of information on current and potential customers. Clustering can be used to segment customers into a small number of groups for additional analysis and marketing activities.
in-dividual data objects to the clusters in which those data objects reside.Ad-ditionally,some clustering techniques characterize each cluster in terms of a cluster prototype;i.e.,a data object that is representative of the other ob-jects in the cluster.These cluster prototypes can be used as the basis for a489 number of data analysis or data processing techniques.Therefore,in the con-text of utility,cluster analysis is the study of techniques forfinding the most representative cluster prototypes.•Summarization.Many data analysis techniques,such as regression or PCA,have a time or space complexity of O(m2)or higher(where m is the number of objects),and thus,are not practical for large data sets.However,instead of applying the algorithm to the entire data set,it can be applied to a reduced data set consisting only of cluster prototypes.Depending on the type of analysis,the number of prototypes,and the accuracy with which the prototypes represent the data,the results can be comparable to those that would have been obtained if all the data could have been used.•Compression.Cluster prototypes can also be used for data compres-sion.In particular,a table is created that consists of the prototypes for each cluster;i.e.,each prototype is assigned an integer value that is its position(index)in the table.Each object is represented by the index of the prototype associated with its cluster.This type of compression is known as vector quantization and is often applied to image,sound, and video data,where(1)many of the data objects are highly similar to one another,(2)some loss of information is acceptable,and(3)a substantial reduction in the data size is desired.•Efficiently Finding Nearest Neighbors.Finding nearest neighbors can require computing the pairwise distance between all points.Often clusters and their cluster prototypes can be found much more efficiently.If objects are relatively close to the prototype of their cluster,then we can use the prototypes to reduce the number of distance computations that are necessary tofind the nearest neighbors of an object.Intuitively,if two cluster prototypes are far apart,then the objects in the corresponding clusters cannot be nearest neighbors of each other.Consequently,to find an object’s nearest neighbors it is only necessary to compute the distance to objects in nearby clusters,where the nearness of two clusters is measured by the distance between their prototypes.This idea is made more precise in Exercise25on page94.This chapter provides an introduction to cluster analysis.We begin with a high-level overview of clustering,including a discussion of the various ap-proaches to dividing objects into sets of clusters and the different types of clusters.We then describe three specific clustering techniques that represent490Chapter8Cluster Analysis:Basic Concepts and Algorithms broad categories of algorithms and illustrate a variety of concepts:K-means, agglomerative hierarchical clustering,and DBSCAN.Thefinal section of this chapter is devoted to cluster validity—methods for evaluating the goodness of the clusters produced by a clustering algorithm.More advanced clustering concepts and algorithms will be discussed in Chapter9.Whenever possible, we discuss the strengths and weaknesses of different schemes.In addition, the bibliographic notes provide references to relevant books and papers that explore cluster analysis in greater depth.8.1OverviewBefore discussing specific clustering techniques,we provide some necessary background.First,we further define 
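To make the vector quantization idea concrete, here is a minimal encode/decode sketch (NumPy; the signal and codebook values are invented for illustration, and a real codec would learn the codebook by clustering):

```python
import numpy as np

# Hypothetical 1-D "signal" and a small codebook of cluster prototypes.
signal = np.array([1.1, 0.9, 5.2, 4.8, 1.0, 5.1])
codebook = np.array([1.0, 5.0])  # prototype table; index = code

# Encode: replace each value by the index of its nearest prototype.
codes = np.abs(signal[:, None] - codebook[None, :]).argmin(axis=1)

# Decode: look the prototypes back up by index (lossy reconstruction).
reconstructed = codebook[codes]

print(codes)          # [0 0 1 1 0 1] -- compact integer representation
print(reconstructed)  # [1. 1. 5. 5. 1. 5.]
```

Storing the small prototype table plus one integer per object is what yields the compression; the reconstruction error is exactly the within-cluster scatter of the quantizer.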
This chapter provides an introduction to cluster analysis. We begin with a high-level overview of clustering, including a discussion of the various approaches to dividing objects into sets of clusters and the different types of clusters. We then describe three specific clustering techniques that represent broad categories of algorithms and illustrate a variety of concepts: K-means, agglomerative hierarchical clustering, and DBSCAN. The final section of this chapter is devoted to cluster validity—methods for evaluating the goodness of the clusters produced by a clustering algorithm. More advanced clustering concepts and algorithms will be discussed in Chapter 9. Whenever possible, we discuss the strengths and weaknesses of different schemes. In addition, the bibliographic notes provide references to relevant books and papers that explore cluster analysis in greater depth.

8.1 Overview

Before discussing specific clustering techniques, we provide some necessary background. First, we further define cluster analysis, illustrating why it is difficult and explaining its relationship to other techniques that group data. Then we explore two important topics: (1) different ways to group a set of objects into a set of clusters, and (2) types of clusters.

8.1.1 What Is Cluster Analysis?

Cluster analysis groups data objects based only on information found in the data that describes the objects and their relationships. The goal is that the objects within a group be similar (or related) to one another and different from (or unrelated to) the objects in other groups. The greater the similarity (or homogeneity) within a group and the greater the difference between groups, the better or more distinct the clustering.

In many applications, the notion of a cluster is not well defined. To better understand the difficulty of deciding what constitutes a cluster, consider Figure 8.1, which shows twenty points and three different ways of dividing them into clusters. The shapes of the markers indicate cluster membership. Figures 8.1(b) and 8.1(d) divide the data into two and six parts, respectively. However, the apparent division of each of the two larger clusters into three subclusters may simply be an artifact of the human visual system. Also, it may not be unreasonable to say that the points form four clusters, as shown in Figure 8.1(c). This figure illustrates that the definition of a cluster is imprecise and that the best definition depends on the nature of data and the desired results.

[Figure 8.1. Different ways of clustering the same set of points: (a) original points; (b) two clusters; (c) four clusters; (d) six clusters.]

Cluster analysis is related to other techniques that are used to divide data objects into groups. For instance, clustering can be regarded as a form of classification in that it creates a labeling of objects with class (cluster) labels. However, it derives these labels only from the data. In contrast, classification in the sense of Chapter 4 is supervised classification; i.e., new, unlabeled objects are assigned a class label using a model developed from objects with known class labels. For this reason, cluster analysis is sometimes referred to as unsupervised classification. When the term classification is used without any qualification within data mining, it typically refers to supervised classification.

Also, while the terms segmentation and partitioning are sometimes used as synonyms for clustering, these terms are frequently used for approaches outside the traditional bounds of cluster analysis. For example, the term partitioning is often used in connection with techniques that divide graphs into subgraphs and that are not strongly connected to clustering. Segmentation often refers to the division of data into groups using simple techniques; e.g., an image can be split into segments based only on pixel intensity and color, or people can be divided into groups based on their income. Nonetheless, some work in graph partitioning and in image and market segmentation is related to cluster analysis.

8.1.2 Different Types of Clusterings

An entire collection of clusters is commonly referred to as a clustering, and in this section, we distinguish various types of clusterings: hierarchical (nested) versus partitional (unnested), exclusive versus overlapping versus fuzzy, and complete versus partial.

Hierarchical versus Partitional. The most commonly discussed distinction among different types of clusterings is whether the set of clusters is nested or unnested, or in more traditional
terminology, hierarchical or partitional. A partitional clustering is simply a division of the set of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset. Taken individually, each collection of clusters in Figures 8.1(b–d) is a partitional clustering.

If we permit clusters to have subclusters, then we obtain a hierarchical clustering, which is a set of nested clusters that are organized as a tree. Each node (cluster) in the tree (except for the leaf nodes) is the union of its children (subclusters), and the root of the tree is the cluster containing all the objects. Often, but not always, the leaves of the tree are singleton clusters of individual data objects. If we allow clusters to be nested, then one interpretation of Figure 8.1(a) is that it has two subclusters (Figure 8.1(b)), each of which, in turn, has three subclusters (Figure 8.1(d)). The clusters shown in Figures 8.1(a–d), when taken in that order, also form a hierarchical (nested) clustering with, respectively, 1, 2, 4, and 6 clusters on each level. Finally, note that a hierarchical clustering can be viewed as a sequence of partitional clusterings and a partitional clustering can be obtained by taking any member of that sequence; i.e., by cutting the hierarchical tree at a particular level.

Exclusive versus Overlapping versus Fuzzy. The clusterings shown in Figure 8.1 are all exclusive, as they assign each object to a single cluster. There are many situations in which a point could reasonably be placed in more than one cluster, and these situations are better addressed by non-exclusive clustering. In the most general sense, an overlapping or non-exclusive clustering is used to reflect the fact that an object can simultaneously belong to more than one group (class). For instance, a person at a university can be both an enrolled student and an employee of the university. A non-exclusive clustering is also often used when, for example, an object is "between" two or more clusters and could reasonably be assigned to any of these clusters. Imagine a point halfway between two of the clusters of Figure 8.1. Rather than make a somewhat arbitrary assignment of the object to a single cluster, it is placed in all of the "equally good" clusters.

In a fuzzy clustering, every object belongs to every cluster with a membership weight that is between 0 (absolutely doesn't belong) and 1 (absolutely belongs). In other words, clusters are treated as fuzzy sets. (Mathematically, a fuzzy set is one in which an object belongs to any set with a weight that is between 0 and 1. In fuzzy clustering, we often impose the additional constraint that the sum of the weights for each object must equal 1.) Similarly, probabilistic clustering techniques compute the probability with which each point belongs to each cluster, and these probabilities must also sum to 1. Because the membership weights or probabilities for any object sum to 1, a fuzzy or probabilistic clustering does not address true multiclass situations, such as the case of a student employee, where an object belongs to multiple classes.
Instead, these approaches are most appropriate for avoiding the arbitrariness of assigning an object to only one cluster when it may be close to several. In practice, a fuzzy or probabilistic clustering is often converted to an exclusive clustering by assigning each object to the cluster in which its membership weight or probability is highest.

Complete versus Partial. A complete clustering assigns every object to a cluster, whereas a partial clustering does not. The motivation for a partial clustering is that some objects in a data set may not belong to well-defined groups. Many times objects in the data set may represent noise, outliers, or "uninteresting background." For example, some newspaper stories may share a common theme, such as global warming, while other stories are more generic or one-of-a-kind. Thus, to find the important topics in last month's stories, we may want to search only for clusters of documents that are tightly related by a common theme. In other cases, a complete clustering of the objects is desired. For example, an application that uses clustering to organize documents for browsing needs to guarantee that all documents can be browsed.

8.1.3 Different Types of Clusters

Clustering aims to find useful groups of objects (clusters), where usefulness is defined by the goals of the data analysis. Not surprisingly, there are several different notions of a cluster that prove useful in practice. In order to visually illustrate the differences among these types of clusters, we use two-dimensional points, as shown in Figure 8.2, as our data objects. We stress, however, that the types of clusters described here are equally valid for other kinds of data.

Well-Separated. A cluster is a set of objects in which each object is closer (or more similar) to every other object in the cluster than to any object not in the cluster. Sometimes a threshold is used to specify that all the objects in a cluster must be sufficiently close (or similar) to one another. This idealistic definition of a cluster is satisfied only when the data contains natural clusters that are quite far from each other. Figure 8.2(a) gives an example of well-separated clusters that consists of two groups of points in a two-dimensional space. The distance between any two points in different groups is larger than the distance between any two points within a group. Well-separated clusters do not need to be globular, but can have any shape.

Prototype-Based. A cluster is a set of objects in which each object is closer (more similar) to the prototype that defines the cluster than to the prototype of any other cluster. For data with continuous attributes, the prototype of a cluster is often a centroid, i.e., the average (mean) of all the points in the cluster. When a centroid is not meaningful, such as when the data has categorical attributes, the prototype is often a medoid, i.e., the most representative point of a cluster. For many types of data, the prototype can be regarded as the most central point, and in such instances, we commonly refer to prototype-based clusters as center-based clusters. Not surprisingly, such clusters tend to be globular. Figure 8.2(b) shows an example of center-based clusters.
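The centroid/medoid distinction is easy to see in code. A minimal sketch (NumPy; the helper names are ours, not the book's):

```python
import numpy as np

def centroid(points):
    # Mean of the points: the usual prototype for continuous attributes.
    return points.mean(axis=0)

def medoid(points):
    # The actual data point minimizing total distance to all others:
    # usable whenever only pairwise proximities are defined.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return points[dists.sum(axis=1).argmin()]

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
print(centroid(pts))  # [1.5 1.5] -- need not be an actual data point
print(medoid(pts))    # [1. 0.]   -- always one of the input points
```

Note how the centroid (1.5, 1.5) is pulled toward the outlying point and is not itself a data object, while the medoid is by definition a member of the data set.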
Graph-Based. If the data is represented as a graph, where the nodes are objects and the links represent connections among objects (see Section 2.1.2), then a cluster can be defined as a connected component; i.e., a group of objects that are connected to one another, but that have no connection to objects outside the group. An important example of graph-based clusters are contiguity-based clusters, where two objects are connected only if they are within a specified distance of each other. This implies that each object in a contiguity-based cluster is closer to some other object in the cluster than to any point in a different cluster. Figure 8.2(c) shows an example of such clusters for two-dimensional points. This definition of a cluster is useful when clusters are irregular or intertwined, but can have trouble when noise is present since, as illustrated by the two spherical clusters of Figure 8.2(c), a small bridge of points can merge two distinct clusters.

Other types of graph-based clusters are also possible. One such approach (Section 8.3.2) defines a cluster as a clique; i.e., a set of nodes in a graph that are completely connected to each other. Specifically, if we add connections between objects in the order of their distance from one another, a cluster is formed when a set of objects forms a clique. Like prototype-based clusters, such clusters tend to be globular.

Density-Based. A cluster is a dense region of objects that is surrounded by a region of low density. Figure 8.2(d) shows some density-based clusters for data created by adding noise to the data of Figure 8.2(c). The two circular clusters are not merged, as in Figure 8.2(c), because the bridge between them fades into the noise. Likewise, the curve that is present in Figure 8.2(c) also fades into the noise and does not form a cluster in Figure 8.2(d). A density-based definition of a cluster is often employed when the clusters are irregular or intertwined, and when noise and outliers are present. By contrast, a contiguity-based definition of a cluster would not work well for the data of Figure 8.2(d) since the noise would tend to form bridges between clusters.

Shared-Property (Conceptual Clusters). More generally, we can define a cluster as a set of objects that share some property. This definition encompasses all the previous definitions of a cluster; e.g., objects in a center-based cluster share the property that they are all closest to the same centroid or medoid. However, the shared-property approach also includes new types of clusters. Consider the clusters shown in Figure 8.2(e). A triangular area (cluster) is adjacent to a rectangular one, and there are two intertwined circles (clusters). In both cases, a clustering algorithm would need a very specific concept of a cluster to successfully detect these clusters. The process of finding such clusters is called conceptual clustering. However, too sophisticated a notion of a cluster would take us into the area of pattern recognition, and thus, we only consider simpler types of clusters in this book.

Road Map

In this chapter, we use the following three simple, but important techniques to introduce many of the concepts involved in cluster analysis.

• K-means. This is a prototype-based, partitional clustering technique that attempts to find a user-specified number of clusters (K), which are represented by their centroids.

• Agglomerative Hierarchical Clustering. This clustering approach refers to a collection of closely related clustering techniques that produce a hierarchical clustering by starting with each point as a singleton cluster and then
repeatedly merging the two closest clusters until a single, all-encompassing cluster remains. Some of these techniques have a natural interpretation in terms of graph-based clustering, while others have an interpretation in terms of a prototype-based approach.

• DBSCAN. This is a density-based clustering algorithm that produces a partitional clustering, in which the number of clusters is automatically determined by the algorithm. Points in low-density regions are classified as noise and omitted; thus, DBSCAN does not produce a complete clustering.

[Figure 8.2. Different types of clusters as illustrated by sets of two-dimensional points: (a) well-separated clusters, where each point is closer to all of the points in its cluster than to any point in another cluster; (b) center-based clusters, where each point is closer to the center of its cluster than to the center of any other cluster; (c) contiguity-based clusters, where each point is closer to at least one point in its cluster than to any point in another cluster; (d) density-based clusters, which are regions of high density separated by regions of low density; (e) conceptual clusters, where points in a cluster share some general property that derives from the entire set of points (points in the intersection of the circles belong to both).]

8.2 K-means

Prototype-based clustering techniques create a one-level partitioning of the data objects. There are a number of such techniques, but two of the most prominent are K-means and K-medoid. K-means defines a prototype in terms of a centroid, which is usually the mean of a group of points, and is typically applied to objects in a continuous n-dimensional space. K-medoid defines a prototype in terms of a medoid, which is the most representative point for a group of points, and can be applied to a wide range of data since it requires only a proximity measure for a pair of objects. While a centroid almost never corresponds to an actual data point, a medoid, by its definition, must be an actual data point. In this section, we will focus solely on K-means, which is one of the oldest and most widely used clustering algorithms.

8.2.1 The Basic K-means Algorithm

The K-means clustering technique is simple, and we begin with a description of the basic algorithm. We first choose K initial centroids, where K is a user-specified parameter, namely, the number of clusters desired. Each point is then assigned to the closest centroid, and each collection of points assigned to a centroid is a cluster. The centroid of each cluster is then updated based on the points assigned to the cluster. We repeat the assignment and update steps until no point changes clusters, or equivalently, until the centroids remain the same.

K-means is formally described by Algorithm 8.1. The operation of K-means is illustrated in Figure 8.3, which shows how, starting from three centroids, the final clusters are found in four assignment-update steps. In these and other figures displaying K-means clustering, each subfigure shows (1) the centroids at the start of the iteration and (2) the assignment of the points to those centroids. The centroids are indicated by the "+" symbol; all points belonging to the same cluster have the same marker shape.

Algorithm 8.1. Basic K-means algorithm.
1: Select K points as initial centroids.
2: repeat
3:   Form K clusters by assigning each point to its closest centroid.
4:   Recompute the centroid of each cluster.
5: until Centroids do not change.

In the first step, shown in Figure 8.3(a), points are assigned to the initial centroids, which are all in the larger group of points. For this example, we use
the mean as the centroid. After points are assigned to a centroid, the centroid is then updated. Again, the figure for each step shows the centroid at the beginning of the step and the assignment of points to those centroids. In the second step, points are assigned to the updated centroids, and the centroids are updated again. In steps 2, 3, and 4, which are shown in Figures 8.3(b), (c), and (d), respectively, two of the centroids move to the two small groups of points at the bottom of the figures. When the K-means algorithm terminates in Figure 8.3(d), because no more changes occur, the centroids have identified the natural groupings of points.

[Figure 8.3. Using the K-means algorithm to find three clusters in sample data: (a) iteration 1; (b) iteration 2; (c) iteration 3; (d) iteration 4.]

For some combinations of proximity functions and types of centroids, K-means always converges to a solution; i.e., K-means reaches a state in which no points are shifting from one cluster to another, and hence, the centroids don't change. Because most of the convergence occurs in the early steps, however, the condition on line 5 of Algorithm 8.1 is often replaced by a weaker condition, e.g., repeat until only 1% of the points change clusters.

We consider each of the steps in the basic K-means algorithm in more detail and then provide an analysis of the algorithm's space and time complexity.

Assigning Points to the Closest Centroid

To assign a point to the closest centroid, we need a proximity measure that quantifies the notion of "closest" for the specific data under consideration. Euclidean (L2) distance is often used for data points in Euclidean space, while cosine similarity is more appropriate for documents. However, there may be several types of proximity measures that are appropriate for a given type of data. For example, Manhattan (L1) distance can be used for Euclidean data, while the Jaccard measure is often employed for documents.

Usually, the similarity measures used for K-means are relatively simple since the algorithm repeatedly calculates the similarity of each point to each centroid. In some cases, however, such as when the data is in low-dimensional Euclidean space, it is possible to avoid computing many of the similarities, thus significantly speeding up the K-means algorithm. Bisecting K-means (described in Section 8.2.3) is another approach that speeds up K-means by reducing the number of similarities computed.

Table 8.1. Table of notation.

  Symbol  Description
  x       An object.
  C_i     The ith cluster.
  c_i     The centroid of cluster C_i.
  c       The centroid of all points.
  m_i     The number of objects in the ith cluster.
  m       The number of objects in the data set.
  K       The number of clusters.

Centroids and Objective Functions

Step 4 of the K-means algorithm was stated rather generally as "recompute the centroid of each cluster," since the centroid can vary, depending on the proximity measure for the data and the goal of the clustering. The goal of the clustering is typically expressed by an objective function that depends on the proximities of the points to one another or to the cluster centroids; e.g., minimize the squared distance of each point to its closest centroid. We illustrate this with two examples. However, the key point is this: once we have specified a proximity measure and an objective function, the centroid that we should choose can often be determined mathematically. We provide mathematical details in Section 8.2.6, and provide a non-mathematical discussion of this observation here.

Data in Euclidean Space. Consider data whose proximity measure
is Euclidean distance. For our objective function, which measures the quality of a clustering, we use the sum of the squared error (SSE), which is also known as scatter. In other words, we calculate the error of each data point, i.e., its Euclidean distance to the closest centroid, and then compute the total sum of the squared errors. Given two different sets of clusters that are produced by two different runs of K-means, we prefer the one with the smallest squared error since this means that the prototypes (centroids) of this clustering are a better representation of the points in their clusters. Using the notation in Table 8.1, the SSE is formally defined as follows:

SSE = \sum_{i=1}^{K} \sum_{x \in C_i} dist(c_i, x)^2
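To connect Algorithm 8.1 with the SSE objective, here is a minimal NumPy sketch (ours, not the book's; it assumes Euclidean data and, for brevity, that no cluster becomes empty during the iterations):

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1: Select K points as initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):                      # 2: repeat
        # 3: Form K clusters by assigning each point to its closest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 4: Recompute the centroid (mean) of each cluster.
        #    (This sketch assumes no cluster ever becomes empty.)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 5: until centroids do not change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

def sse(X, centroids, labels):
    # Sum of squared Euclidean distances of points to their closest centroid.
    return float(((X - centroids[labels]) ** 2).sum())

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.3, size=(20, 2)) for m in (0.0, 3.0, 6.0)])
centroids, labels = kmeans(X, k=3)
print(sse(X, centroids, labels))
```

Comparing the SSE produced by runs with different seeds reproduces the model-selection use of SSE described above: the run with the smallest SSE has centroids that better represent the points in their clusters.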

Numerical simulation and experimental validation of the thermo-mechanical characteristics of phase change concrete energy piles


Transactions of the Chinese Society of Agricultural Engineering, Vol. 37, No. 2, Jan. 2021, pp. 268-277

Yang Weibo (1,2), Yang Binbin (1), Wang Feng (1)
(1. School of Electrical, Energy and Power Engineering, Yangzhou University, Yangzhou 225127, China; 2. Key Laboratory of Thermo-Fluid Science and Engineering of the Ministry of Education, Xi'an Jiaotong University, Xi'an 710049, China)

Abstract: To obtain the thermo-mechanical characteristics of phase change concrete energy piles under coupled thermal and mechanical loads, a three-dimensional numerical model was established. The thermo-mechanical behavior of conventional and phase change concrete energy piles was compared, and the influence of the buried-pipe leg spacing and the pile length-to-diameter ratio on the thermo-mechanical characteristics of phase change concrete energy piles was analyzed.

The results show that the solid-liquid transition of the phase change material (PCM) can increase the heat exchange per unit pile depth by 10.3% and reduce the amplitude of the pile temperature variation; the temperature-induced changes in pile displacement, axial force, and side friction resistance decrease accordingly.

As the leg spacing of the buried pipes increases, the heat exchange of the energy pile and the thermally affected region of the soil grow while the axial force in the pile decreases, and the pile displacement first increases and then decreases. Increasing the pile length-to-diameter ratio raises the total heat exchange, but lowers the heat exchange per unit pile depth and increases the pile-head displacement, which is unfavorable to the structural stability of the pile foundation.

Experimental validation shows that the numerical model can be used to simulate the thermo-mechanical characteristics of phase change concrete energy piles: the maximum relative errors of the predicted mid-height pile-wall temperature and pile-head displacement are within 5.1% and 12%, respectively, with mean relative errors of 4.2% and 9.9%.

These conclusions provide important guidance for the optimal design and operation of phase change concrete energy piles.

Keywords: energy piles; phase change; thermo-mechanical characteristics; numerical simulation; experimental validation

doi: 10.11975/j.issn.1002-6819.2021.2.031    CLC number: TU473, TU83    Document code: A    Article ID: 1002-6819(2021)-2-0268-10

Citation: Yang Weibo, Yang Binbin, Wang Feng. Numerical simulation and experimental validation of the thermo-mechanical characteristics of phase change concrete energy pile[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(2): 268-277. (in Chinese with English abstract) doi: 10.11975/j.issn.1002-6819.2021.2.031

0 Introduction

As one of the technologies for exploiting shallow geothermal energy, the ground-source heat pump has been widely promoted in building energy conservation owing to its energy-saving, high-efficiency, and environmental advantages [1].

Two-dimensional Quantum Field Theory, examples and applications


Abstract. The main principles of two-dimensional quantum field theories, in particular two-dimensional QCD and gravity, are reviewed. We study non-perturbative aspects of these theories which make them particularly valuable for testing ideas of four-dimensional quantum field theory. The dynamics of confinement and the theta vacuum are explained by using the non-perturbative methods developed in two dimensions. We describe in detail how the effective action of string theory in non-critical dimensions can be represented by Liouville gravity. By comparing the helicity amplitudes in four-dimensional QCD to those of the integrable self-dual Yang-Mills theory, we extract a four-dimensional version of two-dimensional integrability.
5 Four-dimensional analogies and consequences
6 Conclusions and Final Remarks

A vision-based method to detect PERCLOS features


A vision-based method to detect PERCLOS features
Wang Lei; Wu Xiaojuan; Ba Bendong; Dong Wenhui

Abstract: PERCLOS is an effective feature for detecting driver drowsiness. Building on previous work, this paper proposes a fast and effective algorithm for measuring the degree of eye opening. To cope with the changing illumination inside a vehicle, the grayscale face image produced by skin-color filtering is binarized with a cumulative-histogram threshold, which cleanly separates the facial features from the skin. A connected-component search then yields the opening size of both eyes, from which PERCLOS is finally extracted.

[Journal] Computer Engineering & Science
[Year (volume), issue] 2006, 28(6)
[Pages] 3 pages (52-54)
[Keywords] driving fatigue/drowsiness; PERCLOS; skin-color segmentation; cumulative histogram
[Authors] Wang Lei; Wu Xiaojuan; Ba Bendong; Dong Wenhui (School of Information Science and Engineering, Shandong University, Jinan 250100, China)
[Language] Chinese
[CLC] Industrial technology
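The abstract does not spell out the final PERCLOS formula. The sketch below implements the widely used P80 variant (fraction of frames in which the eyes are at least 80% closed), assuming per-frame eye-opening sizes, such as those produced by the connected-component search, are available:

```python
import numpy as np

def perclos(openness, open_baseline, threshold=0.8):
    """P80 PERCLOS: fraction of frames with eyes at least 80% closed.

    openness      -- per-frame eye-opening sizes (e.g., component heights)
    open_baseline -- eye-opening size when the eyes are fully open
    """
    openness = np.asarray(openness, dtype=float)
    closure = 1.0 - openness / open_baseline  # 0 = fully open, 1 = fully closed
    return float((closure >= threshold).mean())

# Ten frames, three of them nearly closed -> PERCLOS = 0.3
print(perclos([10, 9, 1, 10, 1, 8, 10, 1, 9, 10], open_baseline=10))
```

In practice the measure is computed over a sliding time window (e.g., one minute), and a PERCLOS above a calibrated threshold triggers the drowsiness alarm.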

Lam, W. and F. Bacchus (1994b). Using new data to refine a Bayesian network. In R. Lopez de


Learning Bayesian Networks from Data — AAAI 1998 Tutorial — Additional Readings

The following is a list of references to the material covered in the tutorial and to more advanced subjects mentioned at various points. This list is far from being comprehensive and is intended only to provide useful starting points.

Background Material

Bayesian Networks. A good reference on Bayesian networks is [Pearl 1988]. A more recent book, which covers Bayesian network inference in depth, is [Jensen 1996]. A short and gentle introduction can be found in [Charniak 1991].

Statistics, Pattern Recognition and Information Theory. There are many books on statistics. We find [DeGroot 1970] to be a good introduction to statistics and Bayesian statistics in particular. A more recent book [Gelman et al. 1995] is also a good introduction to this field and also discusses recent advances, such as hierarchical priors. Books in pattern recognition, including the classic [Duda and Hart 1973] and the more recent [Bishop 1995], cover basic issues in density estimation and their use for pattern recognition and classification. A good introduction to information theory, and notions such as KL divergence and mutual information, can be found in [Cover and Thomas 1991].

Tutorials and Surveys. [Heckerman 1995] provides an in-depth tutorial on Bayesian methods in learning Bayesian networks. [Buntine 1996] surveys the literature. [Jordan 1998] is a collection of introductory surveys and papers discussing recent advances.

Learning Parameters

Learning parameters from complete data is discussed in [Spiegelhalter and Lauritzen 1990]. A more recent discussion can be found in [Buntine 1994]. An introduction to the possible problems with incomplete data and MAR assumptions can be found in [Rubin 1976]. Learning parameters from incomplete data using gradient methods is discussed by [Binder et al. 1997; Thiesson 1995]. The original EM paper is [Dempster et al. 1977]; an elegant alternative explanation of EM can be found in [Neal and Hinton 1998]. [Lauritzen 1995] describes how to apply EM to Bayesian networks. [Bauer et al. 1997] describe methods for accelerating the convergence of EM. Learning using Gibbs sampling is discussed in [Thomas et al. 1992; Gilks et al. 1996]. (A small counting-based sketch of parameter estimation from complete data appears after the reference list below.)

Learning Structure

Complete Data. The Bayesian score is originally discussed in [Cooper and Herskovits 1992] and further developed in [Buntine 1991; Heckerman et al. 1995]. The MDL score is based on the Minimal Description Length principle of [Rissanen 1989]; the application of this principle to Bayesian networks was developed by several authors [Bouckaert 1994; Lam and Bacchus 1994a; Suzuki 1993]. The method for learning trees was initially introduced in [Chow and Liu 1968] (see also the description in [Pearl 1988]). Learning structure using greedy hill-climbing and other variants is discussed and evaluated in [Heckerman et al. 1995]. See [Chickering 1996b] for search over equivalence network classes. [Buntine 1991; Heckerman et al. 1995; Madigan and Raftery 1994] discuss methods for approximating the full Bayesian model averaging.

Incomplete Data. [Chickering and Heckerman 1997] discuss the problems with evaluating the score of networks in the presence of incomplete data and describe several approximations to the score. [Cheeseman and Stutz 1995] discuss Bayesian learning of mixture models with a single hidden variable. Recent works on learning structure in the presence of incomplete data include [Friedman 1997; Friedman 1998; Meila and Jordan 1998; Singh 1997; Thiesson et al. 1998].

Causal Discovery

For different views of the relation of causality and Bayesian networks see [Heckerman and Shachter 1995; Pearl 1993; Spirtes et al. 1993]. [Pearl and Verma 1991; Spirtes et al. 1993] describe constraint-based methods for learning causal relations from data. The Bayesian approach is discussed in [Heckerman et al. 1997].

Advanced Topics

Continuous Variables. See [Heckerman and Geiger 1995] for methods of learning a network that contains Gaussian distributions. [Hofmann and Tresp 1996; John and Langley 1995] discuss learning Bayesian networks with non-parametric representations of density functions. [Monti and Cooper 1997] use neural networks to represent the conditional densities. [Friedman and Goldszmidt 1996] learn Bayesian networks over continuous domains by discretizing the values of the continuous variables.

Local Structure. [Buntine 1991; Diez 1993] discuss learning the "noisy-or" conditional probability. [Meek and Heckerman 1997] discuss how to learn several extensions of this local model. [Friedman and Goldszmidt 1998] describe how to learn tree-like representations of local structure and why this helps in learning global structure. [Chickering et al. 1997] extend these results to richer representations and discuss more advanced search procedures for learning both global and local structure.

Online Learning & Updates. See [Buntine 1991; Friedman and Goldszmidt 1997; Lam and Bacchus 1994b] for discussion on how to sequentially update the structure of a network as more data is available.

Temporal Processes. Dynamic Bayesian networks [Dean and Kanazawa 1989] are an extension of Bayesian networks for representing stochastic models. [Smyth et al. 1997] discuss how this representation generalizes hidden Markov models, and how methods from both fields are related. [Ghahramani and Jordan 1997] describe methods for learning parameters of complex dynamic Bayesian networks with non-trivial unobserved state. [Friedman et al. 1998] describe methods for learning the structure of dynamic Bayesian networks.

Theory. [Chickering 1996a] shows that finding the structure that maximizes the Bayesian score is NP-hard. [Dasgupta 1997; Friedman and Yakhini 1996] discuss the sample complexity—that is, how many examples are required to achieve a desired accuracy—for learning parameters and structure.

Applications

The AutoClass system [Cheeseman and Stutz 1995] is an unsupervised clustering program that uses the simple "naive" Bayesian network. This program has been used in numerous applications. The "naive" Bayesian classifier has been used since the early days of pattern recognition [Duda and Hart 1973]. [Ezawa and Schuermann 1995; Friedman et al. 1997; Singh and Provan 1995] describe applications of more complex Bayesian network learning algorithms for classification. [Zweig and Russell 1998] use Bayesian networks for speech recognition. [Breese et al. 1998] discuss collaborative filtering methods that use Bayesian network learning algorithms. [Spirtes et al. 1993] describe several applications of causal learning in social sciences.

References

Bauer, E., D. Koller, and Y. Singer (1997). Update rules for parameter estimation in Bayesian networks. In D. Geiger and P. Shenoy (Eds.), Proc. Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI '97), San Francisco, Calif., pp. 3-13. Morgan Kaufmann.
Binder, J., D. Koller, S. Russell, and K. Kanazawa (1997). Adaptive probabilistic networks with hidden variables. Machine Learning 29, 213-244.
Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford, U.K.: Oxford University Press.
Bouckaert, R. R. (1994). Properties of Bayesian network learning algorithms. In R. López de
Mantarás and D. Poole (Eds.), Proc. Tenth Conference on Uncertainty in Artificial Intelligence (UAI '94), pp. 102-109. San Francisco, Calif.: Morgan Kaufmann.
Breese, J. S., D. Heckerman, and C. Kadie (1998). Empirical analysis of predictive algorithms for collaborative filtering. In G. F. Cooper and S. Moral (Eds.), Proc. Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI '98). San Francisco, Calif.: Morgan Kaufmann.
Buntine, W. (1991). Theory refinement on Bayesian networks. In B. D. D'Ambrosio, P. Smets, and P. P. Bonissone (Eds.), Proc. Seventh Annual Conference on Uncertainty in Artificial Intelligence (UAI '91), pp. 52-60. San Francisco, Calif.: Morgan Kaufmann.
Buntine, W. (1994). Operations for learning with graphical models. J. of Artificial Intelligence Research 2, 159-225.
Buntine, W. (1996). A guide to the literature on learning probabilistic networks from data. IEEE Trans. on Knowledge and Data Engineering 8, 195-210.
Charniak, E. (1991). Bayesian networks without tears. AI Magazine 12, 50-63.
Cheeseman, P. and J. Stutz (1995). Bayesian classification (AutoClass): Theory and results. In U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining. Menlo Park, California: AAAI Press.
Chickering, D. M. (1996a). Learning Bayesian networks is NP-complete. In D. Fisher and H.-J. Lenz (Eds.), Learning from Data: Artificial Intelligence and Statistics V. Springer Verlag.
Chickering, D. M. (1996b). Learning equivalence classes of Bayesian-network structures. In E. Horvitz and F. Jensen (Eds.), Proc. Twelfth Conference on Uncertainty in Artificial Intelligence (UAI '96). San Francisco, Calif.: Morgan Kaufmann.
Chickering, D. M. and D. Heckerman (1997). Efficient approximations for the marginal likelihood of incomplete data given a Bayesian network. Machine Learning 29, 181-212.
Chickering, D. M., D. Heckerman, and C. Meek (1997). A Bayesian approach to learning Bayesian networks with local structure. In D. Geiger and P. Shenoy (Eds.), Proc. Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI '97), San Francisco, Calif., pp. 80-89. Morgan Kaufmann.
Chow, C. K. and C. N. Liu (1968). Approximating discrete probability distributions with dependence trees. IEEE Trans. on Info. Theory 14, 462-467.
Cooper, G. F. and E. Herskovits (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning 9, 309-347.
Cover, T. M. and J. A. Thomas (1991). Elements of Information Theory. New York: John Wiley & Sons.
Dasgupta, S. (1997). The sample complexity of learning fixed-structure Bayesian networks. Machine Learning 29, 165-180.
Dean, T. and K. Kanazawa (1989). A model for reasoning about persistence and causation. Computational Intelligence 5, 142-150.
DeGroot, M. H. (1970). Optimal Statistical Decisions. New York: McGraw-Hill.
Dempster, A. P., N. M. Laird, and D. B. Rubin (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B 39, 1-39.
Diez, F. J. (1993). Parameter adjustment in Bayes networks: The generalized noisy or-gate. In D. Heckerman and A. Mamdani (Eds.), Proc. Ninth Conference on Uncertainty in Artificial Intelligence (UAI '93), pp. 99-105. San Francisco, Calif.: Morgan Kaufmann.
Duda, R. O. and P. E. Hart (1973). Pattern Classification and Scene Analysis. New York: John Wiley & Sons.
Ezawa, K. J. and T. Schuermann (1995). Fraud/uncollectable debt detection using a Bayesian network based learning system: A rare binary outcome with mixed data structures. In P. Besnard and S. Hanks (Eds.), Proc. Eleventh Conference on Uncertainty in Artificial Intelligence (UAI '95), pp. 157-166. San Francisco, Calif.: Morgan Kaufmann.
Friedman, N. (1997). Learning Bayesian networks in the presence of missing values and hidden variables. In D. Fisher (Ed.), Proceedings of the Fourteenth International Conference on Machine Learning. San Francisco, Calif.: Morgan Kaufmann.
Friedman, N. (1998). The Bayesian structural EM algorithm. In G. F. Cooper and S. Moral (Eds.), Proc. Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI '98). San Francisco, Calif.: Morgan Kaufmann.
Friedman, N., D. Geiger, and M. Goldszmidt (1997). Bayesian network classifiers. Machine Learning 29, 131-163.
Friedman, N. and M. Goldszmidt (1996). Discretization of continuous attributes while learning Bayesian networks. In L. Saitta (Ed.), Proceedings of the Thirteenth International Conference on Machine Learning, pp. 157-165. San Francisco, Calif.: Morgan Kaufmann.
Friedman, N. and M. Goldszmidt (1997). Sequential update of Bayesian network structure. In D. Geiger and P. Shenoy (Eds.), Proc. Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI '97). San Francisco, Calif.: Morgan Kaufmann. To appear.
Friedman, N. and M. Goldszmidt (1998). Learning Bayesian networks with local structure. In M. I. Jordan (Ed.), Learning in Graphical Models. Dordrecht, Netherlands: Kluwer. A preliminary version appeared in E. Horvitz and F. Jensen (Eds.), Proc. Twelfth Conference on Uncertainty in Artificial Intelligence, 1996, pp. 252-262.
Friedman, N., K. Murphy, and S. Russell (1998). Learning the structure of dynamic probabilistic networks. In G. F. Cooper and S. Moral (Eds.), Proc. Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI '98). San Francisco, Calif.: Morgan Kaufmann.
Friedman, N. and Z. Yakhini (1996). On the sample complexity of learning Bayesian networks. In E. Horvitz and F. Jensen (Eds.), Proc. Twelfth Conference on Uncertainty in Artificial Intelligence (UAI '96). San Francisco, Calif.: Morgan Kaufmann.
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin (1995). Bayesian Data Analysis. London: Chapman & Hall.
Ghahramani, Z. and M. I. Jordan (1997). Factorial hidden Markov models. Machine Learning 29, 245-274.
Gilks, W., S. Richardson, and D. Spiegelhalter (1996). Markov Chain Monte Carlo in Practice. Chapman and Hall.
Heckerman, D. (1995). A tutorial on learning with Bayesian networks. Technical Report MSR-TR-95-06, Microsoft Research, Redmond, Washington. Available from /research/dtg/heckerma/heckerma.html.
Heckerman, D. and D. Geiger (1995). Learning Bayesian networks: a unification for discrete and Gaussian domains. In P. Besnard and S. Hanks (Eds.), Proc. Eleventh Conference on Uncertainty in Artificial Intelligence (UAI '95), pp. 274-284. San Francisco, Calif.: Morgan Kaufmann.
Heckerman, D., D. Geiger, and D. M. Chickering (1995). Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning 20, 197-243.
Heckerman, D., C. Meek, and G. Cooper (1997). A Bayesian approach to causal discovery. Technical Report MSR-TR-97-05, Microsoft Research.
Heckerman, D. and R. Shachter (1995). Decision-theoretic foundations for causal reasoning. Journal of A.I. Research 3, 405-430.
Hofmann, R. and V. Tresp (1996). Discovering structure in continuous variables using Bayesian networks. In Advances in Neural Information Processing Systems 8, pp. 500-506.
Jensen, F. (1996). An Introduction to Bayesian Networks. Springer.
John, G. H. and P. Langley (1995). Estimating continuous distributions in Bayesian classifiers. In P. Besnard and S. Hanks (Eds.), Proc. Eleventh Conference on Uncertainty in Artificial Intelligence (UAI '95), pp. 338-345. San Francisco, Calif.: Morgan Kaufmann.
Jordan, M. I. (Ed.) (1998). Learning in Graphical Models. Dordrecht, Netherlands: Kluwer.
Lam, W. and F. Bacchus (1994a). Learning Bayesian belief networks: An approach based on the MDL principle. Computational Intelligence 10, 269-293.
Lam, W. and F. Bacchus (1994b). Using new data to refine a Bayesian network. In R. López de Mantarás and D. Poole (Eds.), Proc. Tenth Conference on Uncertainty in Artificial Intelligence (UAI '94), pp. 383-390. San Francisco, Calif.: Morgan Kaufmann.
Lauritzen, S. L. (1995). The EM algorithm for graphical association models with missing data. Computational Statistics and Data Analysis 19, 191-201.
Madigan, D. and A. Raftery (1994). Model selection and accounting for model uncertainty in graphical models using Occam's window. Journal of the American Statistical Association 89, 1535-1546.
Meek, C. and D. Heckerman (1997). Structure and parameter learning for causal independence and causal interaction models. In D. Geiger and P. Shenoy (Eds.), Proc. Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI '97), San Francisco, Calif., pp. 366-375. Morgan Kaufmann.
Meila, M. and M. I. Jordan (1998). Estimating dependency structure as a hidden variable. In NIPS 10.
Monti, S. and G. F. Cooper (1997). Learning Bayesian belief networks with neural network estimators. In Advances in Neural Information Processing Systems 9, pp. 579-584.
Neal, R. M. and G. E. Hinton (1998). A new view of the EM algorithm that justifies incremental and other variants. In M. I. Jordan (Ed.), Learning in Graphical Models. Dordrecht, Netherlands: Kluwer.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems. San Francisco, Calif.: Morgan Kaufmann.
Pearl, J. (1993). Graphical models, causality and intervention. Statistical Science 8, 266-273.
Pearl, J. and T. S. Verma (1991). A theory of inferred causation. In J. A. Allen, R. Fikes, and E. Sandewall (Eds.), Principles of Knowledge Representation and Reasoning: Proc. Second International Conference (KR '91), pp. 441-452. San Francisco, Calif.: Morgan Kaufmann.
Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry. River Edge, NJ: World Scientific.
Rubin, D. R. (1976). Inference and missing data. Biometrika 63, 581-592.
Singh, M. (1997). Learning Bayesian networks from incomplete data. In Proc. National Conference on Artificial Intelligence (AAAI '97), pp. 27-31. Menlo Park, CA: AAAI Press.
Singh, M. and G. M. Provan (1995). A comparison of induction algorithms for selective and non-selective Bayesian classifiers. In A. Prieditis and S. Russell (Eds.), Proceedings of the Twelfth International Conference on Machine Learning, pp. 497-505. San Francisco, Calif.: Morgan Kaufmann.
Smyth, P., D. Heckerman, and M. Jordan (1997). Probabilistic independence networks for hidden Markov probability models. Neural Computation 9(2), 227-269.
Spiegelhalter, D. J. and S. L. Lauritzen (1990). Sequential updating of conditional probabilities on directed graphical structures. Networks 20, 579-605.
Spirtes, P., C. Glymour, and R. Scheines (1993). Causation, Prediction and Search. Number 81 in Lecture Notes in Statistics. New York: Springer-Verlag.
Suzuki, J. (1993). A construction of Bayesian networks from databases based on an MDL scheme. In D. Heckerman and A. Mamdani (Eds.), Proc. Ninth Conference on Uncertainty in Artificial Intelligence (UAI '93), pp. 266-273. San Francisco, Calif.: Morgan Kaufmann.
Thiesson, B. (1995). Accelerated quantification of Bayesian networks with incomplete data. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), Montreal, Canada, pp. 306-311. AAAI Press.
Thiesson, B., C. Meek, D. M. Chickering, and D. Heckerman (1998). Learning mixtures of Bayesian networks. In G. F. Cooper and S. Moral (Eds.), Proc. Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI '98). San Francisco, Calif.: Morgan Kaufmann.
Thomas, A., D. Spiegelhalter, and W. Gilks (1992). BUGS: A program to perform Bayesian inference using Gibbs sampling. In J. Bernardo, J. Berger, A. Dawid, and A. Smith (Eds.), Bayesian Statistics 4, pp. 837-842. Oxford Univ. Press.
Zweig, G. and S. J. Russell (1998). Speech recognition with dynamic Bayesian networks. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98).
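As a concrete companion to the "Learning Parameters" pointers above, here is a minimal sketch of parameter estimation from complete data by smoothed counting; the function and variable names are ours (not from the tutorial), and alpha = 0 reduces to plain maximum likelihood:

```python
import itertools
import numpy as np

def estimate_cpt(data, child, parents, card, alpha=1.0):
    """Estimate P(child | parents) from complete data by smoothed counting.

    data  -- array of shape (n_samples, n_vars), integer-coded values
    card  -- list of cardinalities, one per variable
    alpha -- Dirichlet pseudo-count (alpha=0 gives maximum likelihood)
    """
    cpt = {}
    for pa_vals in itertools.product(*(range(card[p]) for p in parents)):
        # Select the samples matching this parent configuration.
        mask = (np.all(data[:, parents] == pa_vals, axis=1)
                if parents else np.ones(len(data), bool))
        # Count child values, add pseudo-counts, and normalize.
        counts = np.bincount(data[mask, child], minlength=card[child]) + alpha
        cpt[pa_vals] = counts / counts.sum()
    return cpt

# Tiny example: network X0 -> X1, both binary.
data = np.array([[0, 0], [0, 0], [0, 1], [1, 1], [1, 1], [1, 0]])
print(estimate_cpt(data, child=1, parents=[0], card=[2, 2]))
# {(0,): [0.6, 0.4], (1,): [0.4, 0.6]} with alpha = 1 smoothing
```

This is the complete-data, fully observed case discussed in [Spiegelhalter and Lauritzen 1990]; with missing values, the same counting step becomes the M-step of EM, applied to expected counts.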

Empirical software engineering methods (English)


Empirical Software Engineering (ESE) Methods

Empirical Software Engineering (ESE) is a research discipline within software engineering that focuses on the use of empirical methods to validate and improve software engineering processes, methods, tools, and theories. It provides a systematic and objective approach to software engineering that is based on the collection and analysis of data. ESE methods are used to evaluate the effectiveness of different software engineering practices and to identify areas for improvement.

There are a variety of ESE methods that can be used to evaluate software engineering practices. These methods can be classified into two main categories:

- Observational methods: These methods involve observing the behavior of software engineers and software systems without interfering with their operations. Observational methods can be used to collect data on a variety of topics, such as the time it takes to develop software, the number of defects that are introduced into software, and the satisfaction of software users.

- Experimental methods: These methods involve manipulating the variables that affect software engineering practices in order to measure the impact of those changes. Experimental methods can be used to compare the effectiveness of different software engineering practices, to identify the factors that lead to software defects, and to develop new software engineering tools and methods.

ESE methods have been used to evaluate a wide range of software engineering practices, including:

- Agile development methods: ESE methods have been used to compare the effectiveness of agile development methods with that of traditional software development methods.

- Code review: ESE methods have been used to evaluate the effectiveness of code review practices.

- Defect tracking: ESE methods have been used to evaluate the effectiveness of defect tracking systems.

- Software testing: ESE methods have been used to evaluate the effectiveness of software testing techniques.

- Software quality: ESE methods have been used to evaluate the quality of software products.

ESE methods have made a significant contribution to the field of software engineering. They have provided valuable insights into the effectiveness of different software engineering practices and have helped to identify areas for improvement. As the field of software engineering continues to evolve, ESE methods will continue to play an important role in the development of new and improved software engineering practices.

Here are some specific examples of how ESE methods have been used to improve software engineering practices:

- A study by Kitchenham et al. (2001) found that agile development methods were more effective than traditional software development methods in terms of productivity, quality, and customer satisfaction.

- A study by Fenton and Pfleeger (2008) found that code review practices were effective in reducing the number of defects in software products.

- A study by Graves et al. (2009) found that defect tracking systems were effective in helping software engineers to identify and fix defects.

- A study by Begel and Zimmermann (2010) found that software testing techniques were effective in identifying defects in software products.

- A study by Basili et al. (2012) found that software quality models were effective in predicting the quality of software products.

These are just a few examples of the many ways that ESE methods have been used to improve software engineering practices.

## Benefits of Using ESE Methods

There are many benefits to using ESE methods to evaluate software engineering practices. These benefits include:

- Increased objectivity: ESE methods provide a more objective way to evaluate software engineering practices than traditional approaches such as expert opinion or intuition.

- Greater accuracy: ESE methods can provide more accurate results than traditional approaches, because they are based on data that has been collected and analyzed in a systematic way.

- Improved decision-making: ESE methods can help software engineers make better decisions about which software engineering practices to use, because they provide objective evidence about the effectiveness of different practices.

- Increased efficiency: ESE methods can help software engineers be more efficient in their work, because they can identify areas for improvement and focus their efforts on the most effective practices.

- Enhanced customer satisfaction: ESE methods can help software engineers develop software products that are of higher quality and that meet the needs of customers, because they provide objective evidence about the effectiveness of different software engineering practices.

## Conclusion

ESE methods are a valuable tool for improving software engineering practices. They provide a systematic and objective approach to evaluating software engineering practices and identifying areas for improvement. ESE methods have been used to evaluate a wide range of software engineering practices, and they have made a significant contribution to the field of software engineering. As the field of software engineering continues to evolve, ESE methods will continue to play an important role in the development of new and improved software engineering practices.

A survey of the theory and applications of the Dempster-Shafer evidential reasoning method (College of Computer Science, Zhejiang University)

First draft: March 10, 2002; fourth revision: September 25, 2006
Outline



Main references for this chapter
A brief history of the development of evidence theory
The classical theory of evidence
Interpretations and theoretical models of evidence theory
Approaches to implementing evidence theory
Uncertainty reasoning based on D-S theory
Worked computational examples (a small sketch follows the references below)
Main references for this chapter
[1] Dempster, A. P. Upper and lower probabilities induced by a multivalued mapping. Annals of Mathematical Statistics, 1967, 38(2): 325-339. [The first paper to propose evidence theory]
[2] Dempster, A. P. Generalization of Bayesian inference. Journal of the Royal Statistical Society, Series B, 1968, 30: 205-247.
[3] Shafer, G. A Mathematical Theory of Evidence. Princeton University Press, 1976. [The first monograph on evidence theory, marking its establishment as a theory in its own right]
[4] Barnett, J. A. Computational methods for a mathematical theory of evidence. In: Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI-81), Vancouver, B.C., Canada, Vol. II, 1981: 868-875. [The first landmark paper to introduce evidence theory into the AI field]
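The outline closes with worked computational examples. As a minimal taste of those, here is a sketch of Dempster's rule of combination, the core operation of the theory [3], with invented mass assignments over a two-element frame of discernment:

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset focal elements
    to masses) by Dempster's rule: m(A) is proportional to the sum of
    m1(B) * m2(C) over all B, C with B intersect C = A, normalized by
    1 - K, where K is the total mass of conflict (empty intersections)."""
    combined, conflict = {}, 0.0
    for b, mb in m1.items():
        for c, mc in m2.items():
            a = b & c
            if a:  # non-empty intersection supports focal element A
                combined[a] = combined.get(a, 0.0) + mb * mc
            else:  # empty intersection contributes to the conflict K
                conflict += mb * mc
    if conflict >= 1.0:
        raise ValueError("Totally conflicting evidence; rule undefined.")
    return {a: v / (1.0 - conflict) for a, v in combined.items()}

# Frame of discernment {r, g}; two independent pieces of evidence.
m1 = {frozenset({"r"}): 0.6, frozenset({"r", "g"}): 0.4}
m2 = {frozenset({"g"}): 0.5, frozenset({"r", "g"}): 0.5}
print(dempster_combine(m1, m2))
# {r}: 0.4286, {g}: 0.2857, {r, g}: 0.2857  (conflict K = 0.3)
```

Belief and plausibility of any hypothesis can then be read off the combined masses by summing over subsets and over non-empty intersections, respectively.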

Standard template for paper references


Standard reference template

I. Reference entry formats

1) Journal article: [no.] Primary authors. Article title [J]. Journal name, year of publication, volume(issue): page range.
Example: [1] 袁庆龙,候文义. Ni-P合金镀层组织形貌及显微硬度研究[J]. 太原理工大学学报,2001,32(1):51-53.

2) Monograph: [no.] Primary authors. Book title [M]. Place of publication: publisher, year of publication, page range.
Example: [4] 王芸生. 六十年来中国与日本[M]. 北京:三联书店,1980,161-172.

3) Patent: [no.] Patent owner. Patent title [P]. Country of patent: patent number, publication date.
Example: [7] 姜锡洲. 一种温热外敷药制备方案[P]. 中国专利:881056078,1983-08-12.

4) Newspaper article: [no.] Primary authors. Article title [N]. Newspaper name, publication date (edition).
Example: [11] 谢希德. 创造学习的思路[N]. 人民日报,1998-12-25(10).

II. Document type codes

Journal article [J], monograph [M], conference proceedings [C], dissertation [D], patent [P], standard [S], newspaper article [N], report [R], compilation [G], other [Z].
IEEE Transactions on Signal Processing,2004,52(3): 582-591.。

The general model of the high-order HBAM method can also …

Keywords: High-order neural networks; Exponential stability; Bidirectional associative memory (BAM); Time delays; Linear matrix inequality; Lyapunov functional.

In recent years, Hopfield neural networks and their various generalizations have attracted the attention of many scientists (e.g., mathematicians, physicists, computer scientists and so on), due to their potential for the tasks of classification, associative memory and parallel computation, and their ability to solve difficult optimization problems; see for example [1–9]. For Hopfield neural networks characterized by first-order interactions, Abu-Mostafa and Jacques [10], McEliece et al. [11], and Baldi [12] presented their intrinsic limitations. As a consequence, different architectures with high-order interactions [13–17] have been successively introduced to design neural networks which have stronger approximation properties, faster convergence rates, greater storage capacity, and higher fault tolerance than lower-order neural networks, while the stability properties of these models for fixed weights have been studied in [18–21].

Noninvasive system and method for mapping epileptic networks and surgical planning


Patent title: Noninvasive system and method for mapping epileptic networks and surgical planning
Inventors: Fernando Vale, Elliot George Neal
Application No.: US16024020; Filing date: 2018-06-29
Publication No.: US10588561B1; Publication date: 2020-03-17
Abstract: System and method for processing non-concurrently collected electroencephalogram (EEG) data and resting state functional magnetic resonance imaging (rsfMRI) data, non-invasively, to create a patient-specific three-dimensional (3D) mapping of the patient's functional brain network. The mapping can be used to more precisely identify candidates for resective neurosurgery and to help create a targeted surgical plan for those patients. The methodology automatically maps the patient's unique brain network using non-concurrent EEG and resting state functional MRI (rsfMRI). Generally, the current invention merges non-concurrent EEG data and rsfMRI data to map the patient's epilepsy/seizure network.
Applicants: Fernando Vale, Elliot George Neal
Addresses: Tampa, FL, US; Tampa, FL, US; Nationality: US, US
Agent: Smith & Hopen, P.A.; Attorney: Molly L. Sauter

Possibilistic fuzzy clustering of categorical data arrays based on frequency prototypes and dissimilarity measures (IJISA-V9-N5-7)

Kharkiv National University of Radio Electronics, Kharkiv, Ukraine. E-mail: lehatish@, samitova@

Abstract—Fuzzy clustering procedures for categorical data are proposed in the paper. Most well-known conventional clustering methods face certain difficulties while processing this sort of data because a notion of similarity is missing in these data. A detailed description of a possibilistic fuzzy clustering method based on frequency-based cluster prototypes and dissimilarity measures for categorical data is given.

Index Terms—Computational Intelligence, Machine Learning, Categorical Data, Categorical Scale, Possibilistic Fuzzy Clustering, Frequency Prototype, Dissimilarity Measure.

I. INTRODUCTION

The problem of multi-dimensional data clustering is common to many Data Mining applications, and its solution may be useful for a variety of different approaches and algorithms [1-10]. The point of this problem is that an initial data set (each item of which is described by a multi-dimensional vector) should be split in a self-learning mode into homogeneous groups (clusters). A traditional approach to the clustering problem is based on the assumption that each vector may belong to only one class, which means that the formed clusters do not overlap in the multi-dimensional feature space. An initial data set for the task is N n-dimensional feature vectors X = {x1, x2, ..., xN} ⊂ R^n, which are given either in an interval scale or in a relational scale, wherein …
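To make the frequency-prototype idea concrete, here is a minimal sketch (not the paper's exact formulation, which is not shown in this excerpt): each cluster is summarized by per-feature category frequencies, and an object's dissimilarity to a prototype is taken as one minus the relative frequency of its category, summed over features. All names and the particular dissimilarity measure are illustrative assumptions.

```python
import numpy as np

def frequency_prototype(cluster_rows):
    """Per-feature category frequency tables for one cluster.
    cluster_rows : list of tuples of categorical values."""
    n_feats = len(cluster_rows[0])
    proto = []
    for f in range(n_feats):
        values, counts = np.unique([r[f] for r in cluster_rows],
                                   return_counts=True)
        proto.append(dict(zip(values, counts / counts.sum())))
    return proto

def dissimilarity(row, proto):
    """Dissimilarity of a categorical object to a frequency prototype:
    1 minus the relative frequency of the object's category, summed over
    features (one plausible measure, not necessarily the paper's)."""
    return sum(1.0 - proto[f].get(v, 0.0) for f, v in enumerate(row))
```

A possibilistic membership could then be derived from these dissimilarities instead of from Euclidean distances, which is what makes the scheme applicable to purely categorical data.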

A Pareto genetic simulated annealing algorithm for the multi-objective disassembly line balancing problem


Authors: WANG Kaipu, ZHANG Zeqiang, ZHU Lixia, ZOU Binsen (School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China)
Journal: Computer Integrated Manufacturing Systems, 2017, 23(6): 1277-1285
Keywords: disassembly line balancing; multi-objective optimization; genetic algorithm; simulated annealing algorithm; Pareto set

Abstract: Aiming at the deficiencies of traditional methods for the multi-objective disassembly line balancing problem, namely a single solution and a failure to balance the optimization objectives, a multi-objective genetic simulated annealing algorithm based on a Pareto set was proposed, which combined the rapid global search ability of the genetic algorithm with the strong local search capability of simulated annealing. The simulated annealing operation was performed on the results of the genetic operation to avoid local optima. An improved Metropolis rule was employed in view of the characteristics of multi-objective optimization problems. Crowding distance was adopted as an evaluation mechanism to filter the non-inferior solutions obtained from the Pareto dominance relationship, and the preserved non-inferior solutions were added to the population to speed up the convergence of the algorithm. On a 25-task disassembly case, the effectiveness of the proposed algorithm was verified by comparison with six existing single-objective algorithms. The algorithm was then applied to a disassembly line instance, yielding ten task assignment schemes; comparison of the results with a Pareto ant colony algorithm further indicated the superiority of the proposed algorithm.

The rapid development of technology has accelerated product upgrading and replacement, and with it the generation of large quantities of end-of-life products. Under economic and environmental pressure, recycling and reuse of end-of-life products has become a necessary path to sustainable development.
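As a concrete illustration of the two selection mechanisms named in the abstract, the sketch below implements Pareto dominance (for minimization) and an NSGA-II-style crowding distance. It is a generic sketch, not the authors' implementation, and all names are illustrative.

```python
import numpy as np

def dominates(a, b):
    """Pareto dominance for minimization objectives: a dominates b."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a <= b) and np.any(a < b))

def crowding_distance(front):
    """Crowding distance over a non-dominated front.
    front : (n, m) array of objective vectors; larger distance = less
    crowded, so such solutions are preferred when the archive is full."""
    front = np.asarray(front, dtype=float)
    n, m = front.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(front[:, j])
        dist[order[0]] = dist[order[-1]] = np.inf  # keep boundary solutions
        span = front[order[-1], j] - front[order[0], j]
        if span == 0:
            continue
        for idx in range(1, n - 1):
            dist[order[idx]] += (front[order[idx + 1], j]
                                 - front[order[idx - 1], j]) / span
    return dist
```

Filtering the archive by dominance and then ranking the survivors by crowding distance is the standard way to keep a bounded, well-spread Pareto set, which matches the mechanism the abstract describes.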

Dynamic clustering analysis based on an artificial immune particle swarm optimization algorithm


Article ID: 1006-4710(2008)04-0390-05
Authors: WANG Lei, JI Huan, XU Qingzheng (Faculty of Computer Science and Engineering, Xi'an University of Technology, Xi'an 710048, China)

Abstract: The fuzzy C-means (FCM) clustering algorithm is sensitive to initialization and easily falls into local minima when iterating. A novel dynamic clustering algorithm based on the combination of the FCM algorithm with particle swarm optimization (PSO) is proposed in this paper. An artificial immune mechanism is introduced to improve the particle swarm optimization process, so that premature convergence of both the PSO and FCM algorithms is largely avoided, giving the algorithm better global and local search capability than comparable methods. The search range of the cluster number k is determined according to the empirical rule k_max ≤ √n from clustering theory, and a new population is evolved on the basis of the best particles, which effectively increases the convergence speed. Experiments on two data sets show that, compared with the classical clustering method, the new algorithm improves clustering performance significantly in both convergence and solution precision.

Keywords: artificial immune system; particle swarm optimization algorithm; dynamic clustering; convergence

Received: 2008-06-16. Supported by the National Natural Science Foundation of China (60603026).
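A minimal sketch of the underlying PSO-plus-FCM idea follows: particles encode candidate sets of cluster centres and are scored by the FCM objective. The paper's immune-inspired operators and the dynamic search over k are omitted here, and all parameter values are illustrative defaults, not the authors' settings.

```python
import numpy as np

def fcm_objective(X, centers, m=2.0):
    """Fuzzy C-means objective J_m for a fixed set of centers."""
    D = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + 1e-12
    U = D ** (-2.0 / (m - 1.0))
    U /= U.sum(axis=0)                       # optimal memberships for D
    return float(((U ** m) * D ** 2).sum())

def pso_fcm(X, k, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain PSO searching for k FCM cluster centres in R^d."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    pos = rng.uniform(X.min(0), X.max(0), size=(n_particles, k, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([fcm_objective(X, p) for p in pos])
    g = int(np.argmin(pbest_f))
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([fcm_objective(X, p) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        g = int(np.argmin(pbest_f))
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    return gbest, gbest_f
```

The immune extension described in the abstract would add, for example, antibody-diversity-based suppression of crowded particles between iterations; that step is deliberately left out of this sketch.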


IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 47, NO. 7, JULY 2000

Methods for Robust Clustering of Epileptic EEG Spikes
Patrik Wahlberg* and Göran Lantz

Abstract—We investigate algorithms for clustering of epileptic electroencephalogram (EEG) spikes. Such a method is useful prior to averaging and inverse computations, since the spikes of a patient often belong to a few distinct classes. Data sets often contain outliers, which makes algorithms with robust performance desirable. We compare the fuzzy C-means (FCM) algorithm and a graph-theoretic algorithm. We give criteria for determination of the correct level of outlier contamination. The performance is then studied by aid of simulations, which show good results for a range of circumstances, for both algorithms. The graph-theoretic method gave better results than FCM for simulated signals. Also, when evaluating the methods on seven real-life data sets, the graph-theoretic method was the better method, in terms of closeness to the manual assessment by a neurophysiologist. However, there was some discrepancy between manual and automatic clustering, and we suggest as an alternative method a human choice among a limited set of automatically obtained clusterings. Furthermore, we evaluate geometrically weighted feature extraction and conclude that it is useful as a supplementary dimension for clustering.

Index Terms—Clustering, electrode geometry, epileptic EEG spikes, fuzzy C-means algorithm, graph theory.

[…] [we suggest delivering a limited set of clusterings] to a human who makes the decision. Also, with this principle the GTC method is preferable to FCM, and GTC works well for all real-life data sets. A comparison of the GTC algorithm with corresponding dipole estimates is contained in [14]. Further, we evaluate the idea of taking into account the geometry of the electrode configuration in order to identify clusters more easily. This method gives improved conditions for clustering for some data configurations, both simulated and real-life. Since improvement is not obtained generally, we conclude that it should be used as an exploratory tool, i.e., as an extra dimension which may give clusters that are more clearly separated.

The paper is organized as follows. Section II treats feature extraction. Methods for robust clustering are described in Section III; Section IV reports simulation results and Section V evaluates the algorithms on real-life data.

II. FEATURE EXTRACTION

The classification problem is often divided into feature extraction and clustering [9], [11], although there can be no clear border between the two stages. Feature extraction aims at transforming the input data into a form that makes it easier for the clustering algorithm to identify the clusters. It can be, for instance, a Karhunen–Loève (KL) transformation [15]. There are two fundamentally different types of input data a feature extraction algorithm may generate [11]: 1) vector space data, which are fed into the C-means algorithm and the maximum likelihood algorithm [9], and 2) metric data, i.e., distances between each pair of data samples, which are fed into the graph-theoretic algorithms. The vector space representation is more general than the metric representation, since the vector space norm can be used to generate a metric, but not vice versa, since the metric data are not necessarily contained in a linear manifold.

A. Vector Representation

A set of [spike] matrices […] is defined from the raw data. Their dimensions are given by the […] simultaneously recorded columns (channels) of length […]. [Projection onto a truncated KL basis gives a] d-dimensional vector of projection coefficients (1). Thus, a considerable reduction of dimensionality may be obtained.
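As an illustration of the KL projection in (1), the following sketch (an assumption-laden sketch, not the authors' code) vectorizes each multichannel spike matrix, estimates the sample correlation matrix, and keeps the coefficients along the p strongest eigenvectors; p and all variable names are illustrative.

```python
import numpy as np

def kl_features(spikes, p=5):
    """Karhunen-Loeve (PCA-style) feature extraction, cf. eq. (1).

    spikes : array of shape (N, channels, samples), N multichannel spikes
    p      : number of leading eigenvectors to keep (illustrative choice)
    Returns an (N, p) array of projection coefficients."""
    N = spikes.shape[0]
    X = spikes.reshape(N, -1)            # vectorize each spike matrix
    R = (X.T @ X) / N                    # sample correlation matrix
    eigval, eigvec = np.linalg.eigh(R)   # eigenvalues in ascending order
    E = eigvec[:, ::-1][:, :p]           # p strongest eigenvectors
    return X @ E                         # projection coefficients
```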
The vectors […] are compared with the metric (2), where the chosen norm is the Frobenius matrix norm [16]. […] is any differentiable parametrization of a curve in […], and ‖·‖ is the Euclidean norm in […]. The curve which fulfills the criterion is a geodesic, and it can be shown that it is part of a (generalized) great circle [17], […], where […] and […] are orthonormal vectors in […]. [The metric] as a function of […] follows from the monotonicity in two dimensions, using the law of cosines.

Fig. 1. Example of pairs of geometrically distributed multichannel spikes. The spike activity is (a) close and (b) distant geometrically.

[Geometrically distant spikes remain separated owing to] the fact that their active channels are too far apart to be mixed if the low-pass filtering is adequate, so the feature-space distance between them is left invariant. In order to implement spatial low-pass filtering, the approximately spherical geometry of the problem is taken into account. Incorporating a symmetric positive definite weighting matrix […] in the metric (5), a modified metric […] is obtained. [Normalizing the weighting matrix] to have unit diagonal elements, a calculation yields […], the […]th column vector of […], so the metric […]. […] is a positive exponent, and […] leads to less spatial mixing in the feature extraction, and increasing […] is a guarantee for […]. [Since the weighting matrix] is symmetric and positive definite, it can be factored as [W = EΛEᵀ, where Λ] is diagonal with positive elements. Hereby, the metric transformation [x̃ = Λ^{1/2}Eᵀx] is chosen, and the iteration number […] according to […].

III. METHODS FOR ROBUST CLUSTERING

A. The FCM Algorithm

[…] is the vector representation (1) of the matrix […] (11). The column […] contains the fuzzy membership values for object […]; […] will be almost one for one cluster […]. It can be shown [12] that [in a limiting case] the FCM algorithm reduces to the C-means algorithm. It is, thus, clear that FCM is a generalization of the ordinary C-means algorithm. The metric exponent […], i.e., the set of memberships is […]. When the result of FCM is to be compared with a clustering, decisions for […] and […] [are needed]; […] determines the degree of fuzziness imposed on the data. The choice is made heuristically; we use […] throughout. For the choice of [the number of clusters], […] and then decide for the value of […]. [The square cluster compactness is defined by] (14) and the square cluster separation by (15). A highly clustered set has small [compactness] and large [separation]. This is why the optimal […]. However, in practice there is a tendency that […] decreases monotonically with […], where […].

B. Identification of Outliers for FCM

Having made the decision for […] subsets, by […] and […], the vector which is least tightly connected to any cluster [is found]. This vector is the most prominent outlier. […] vectors. For the decision of the optimal number of outliers […]. This is due to the fact that exclusion of outliers significantly reduces compactness, so a small value of […] indicates compact clusters. However, a problem with this approach was discovered empirically, consisting of the fact that the minimization frequently makes a decision which is far from correct. Therefore, we define compactness with a generalized exponent […] (17) (the subscript F stands for FCM). The choice […] will result in a monotonically decreasing […] and larger […], whereby the minimum of […] will be affected. The choice of […] is replaced by computation of the centre of gravity of the function defined by pointwise inversion. This function gives smaller variance over different realizations than minimization of […] for estimation of […] (18), where ⌊·⌋ denotes integer part.

It is of interest in this application to cluster the vectors [in one-dimensional subspaces] with the projection distance, assuming […] has unit norm. The algorithm steps are modified according to the following. Step 1): beyond the existing processing, the vectors […]. [Step 2): the Euclidean distance] is replaced by the projection distance, whereby the decision region boundaries will be generalized conics [19].
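Since the exact forms of (11)–(18) are lost in this copy, the sketch below uses the standard FCM updates of Bezdek [12] together with the Xie–Beni validity index [18] and the paper's notion of the most prominent outlier (the vector least tightly connected to any cluster). The fuzzifier value m = 2 is a common default, assumed here rather than taken from the paper.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, tol=1e-6, seed=None):
    """Standard fuzzy C-means: alternate centre and membership updates."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((c, N))
    U /= U.sum(axis=0)                                   # fuzzy partition
    for _ in range(iters):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)     # cluster centres
        D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + 1e-12
        Unew = D ** (-2.0 / (m - 1.0))
        Unew /= Unew.sum(axis=0)                         # membership update
        if np.abs(Unew - U).max() < tol:
            U = Unew
            break
        U = Unew
    return U, V

def xie_beni(X, U, V, m=2.0):
    """Xie-Beni validity index [18]: compactness / separation."""
    D2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    compact = ((U ** m) * D2).sum() / X.shape[0]
    sep = min(((V[i] - V[j]) ** 2).sum()
              for i in range(len(V)) for j in range(len(V)) if i != j)
    return compact / sep

def most_prominent_outlier(U):
    """Index of the vector least tightly connected to any cluster."""
    return int(np.argmin(U.max(axis=0)))
```

Repeatedly removing the most prominent outlier and re-evaluating a compactness criterion gives the kind of outlier-count decision the section describes, although the paper's generalized exponent and centre-of-gravity estimate (17)–(18) are not reproduced here.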
C. A GTC Algorithm

The result of the computation of between-matrix metrics can be considered to be a graph consisting of [vertices] and [edges]. Such a graph is called complete [20] since it has edges between every pair of vertices, i.e., it has [N vertices] and [N(N−1)/2] edges denoted by […]. [A hierarchy of threshold graphs] can be created by eliminating edges of decreasing metric, starting with the edge of largest metric. The graph […]; [edges above the threshold] are not "visible" in the threshold graph [once the corresponding edge] has been eliminated. In each threshold subgraph, clusters can be identified with the aid of a clustering criterion. The result will be a class of subsets that constitute a partition of the vertices, i.e., a class of disjoint subsets whose union is all vertices of the threshold subgraph. The subsets that consist of solely one vertex are considered outliers, which gives robustness to the algorithm.

For identification of clusters in each threshold graph, the simplest criterion is to identify the free components. The free components are defined as the maximal subgraphs such that all vertices in each subgraph are connected, i.e., there is a sequence of edges joining them. It has the disadvantage that the objects of a cluster may be very unlike, i.e., the clusters may lack compactness [9], [10].

Fig. 2. (a) Proximity graph (incomplete) of data set, N = 6. (b) The corresponding threshold graph after elimination of the largest metric d. (c) The k-component criterion: any X-separating dichotomy (dashed subsets) has at least k edges between its parts.

In order to force clusters to be compact, one can require more than one sequence of edges joining two vertices of a cluster. One such criterion, which finds more compact clusters than the free components, is to identify the maximal subgraphs such that it takes the elimination of at least [k edges to disconnect them] [10]. These subgraphs are called the [k]-components of the threshold subgraph [21, p. 404], which is obtained from [Menger's theorem: the minimum number of edges whose removal separates a pair of vertices] equals the maximum number of edge-disjoint paths between the pair of vertices in […]. [Two vertices belong to the same k]-component if and only if it takes the elimination of at least [k edges] to separate them, i.e., if any dichotomy of the [component] has at least k edges between its parts; see Fig. 2(c). From the theorem, this is equivalent to the maximum number of edge-disjoint paths between [the vertices being at least k].

D. Identification of Outliers for GTC

At a given level in the edge-elimination hierarchy, i.e., at a given value of […], [a criterion function] (19) [is defined] (the subscript G stands for GTC), where […] denotes the number of clustered nodes, with respect to the […] replaced by […]. In the hierarchy, only the […] [levels need to be considered]. […] the optimal […], i.e., (20), analogously to the FCM estimate (18). The first variable of […] of the denominator of the criteria functions (17) and (19).
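A compact way to realize the GTC hierarchy is sketched below, assuming the networkx library is available: edges are removed in order of decreasing metric, and at each level the k-edge-connected components give the clusters while singletons are treated as outliers. The criterion functions (19)–(20) for choosing a level in the hierarchy are not reproduced; this sketch only enumerates the candidate clusterings.

```python
import itertools
import numpy as np
import networkx as nx

def gtc_clusterings(D, k=2):
    """Enumerate clusterings over the edge-elimination hierarchy.

    D : (N, N) symmetric matrix of between-spike metrics
    k : connectivity requirement; k >= 2 forces compact clusters
    Yields, after each elimination, the clusters (size >= 2) and the
    outliers (singletons), as described in Sections III-C and III-D."""
    N = D.shape[0]
    edges = sorted(
        (D[i, j], i, j) for i, j in itertools.combinations(range(N), 2))
    G = nx.Graph()
    G.add_nodes_from(range(N))
    G.add_weighted_edges_from((i, j, d) for d, i, j in edges)
    # Eliminate edges of decreasing metric, largest first
    for d, i, j in reversed(edges):
        G.remove_edge(i, j)
        comps = [set(c) for c in nx.k_edge_components(G, k=k)]
        clusters = [c for c in comps if len(c) >= 2]
        outliers = [next(iter(c)) for c in comps if len(c) == 1]
        yield d, clusters, outliers
```

A criterion in the spirit of (19) would then score each yielded level, e.g., by cluster compactness relative to the number of clustered nodes, and either pick one level or hand a short list of levels to a human, as Section V-C later suggests.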
IV. SIMULATIONS

A choice of suitable […] and […] [was made]. Hereby, different cluster separation was [obtained]. […]

Fig. 3. The spherical geometry with the coordinate angles latitude and azimuth (a). A simulated spike from a dipole with r = 0.6 and SNR = 3 dB at 21 electrodes (b); channels with spike amplitude > 50% of the maximal channel amplitude are underlined. Superposed noise-free spikes from all three r = 0.6 clusters (solid) and an example of an outlier (dashed) (c).

A. FCM Simulation Results

TABLE I. Results from FCM simulations.

The simulated signal sets consisted of […] and […] was ten. Table I summarizes the results, in terms of estimates of the number of clusters […]. Overestimation of […] and all SNR values, apart from small Type 2 and Type 3 errors. For […], there is a Type 2 error in the order of 0.2 for SNR […], since it gives the smallest Type 2 error. For […], the errors of all types are significant for SNR […]. […] should be chosen in order to minimize the Type 2 error for […]. We conclude that […] is a generally good choice, and that the FCM algorithm works well for […], acceptably for […], and only in the case of high SNR […].

B. GTC Simulation Results

TABLE II. Results from GTC simulations.

The simulated signal sets consisted of […] and […] was three. The reason for choosing a smaller […] was […]. […] was used for computation of the […] and […] dB, where GTC gives negligible errors while FCM gives considerable errors (in the order of 0.1 to 0.3). Further, there is no influence of the […].

Fig. 4. Results of geometric weighting simulation. The criteria D […] (o) for all pairs i, j […] (x) are given as a function of […]. Average dipole radius (a) r = 0.4 deteriorated clustering and (b) r = 0.8 improved clustering.

[…] and […], using exponent […] and […], respectively.

V. EVALUATION ON REAL-LIFE DATA

TABLE III. Properties of real-life data sets.

A. FCM Results

The FCM method both under- and overestimates […] and […]. We conclude that the FCM algorithm works adequately only for data set #1, and that the result is not sensitive to […].

B. GTC Results

[…] gives better estimates of […] for data sets #4 and #6. Also, there is a small difference between results for […] and […] for all data sets. For all data sets except #1, small […] and […] values are reached. Values of […] are large, about .5–.6, for three data sets, #4, #6, and #7, and are otherwise small. We conclude that GTC works well for data sets #2, #3, and #5, acceptably (with high error) for data sets #4, #6, and #7, and gives bad results for data set #1. The small clusters of data set #6 are not discovered by the algorithm.

C. Results Without Decision of Outlier Contamination

After much empirical experience, we are of the opinion that it seems difficult to find a decision function which works for the broad range of data sets likely to be met clinically. Therefore, we propose an alternative clustering strategy: to deliver as algorithm output not one clustering, but a (small) set of clusterings. The choice among this set is then made by a human. For the FCM algorithm, the set of clusterings is the clusterings corresponding to each of […], with […].

TABLE IV. FCM and GTC (using k = 2) results. Estimates of ℓ among a small set of algorithm output clusterings, for seven sets of real-life data.

It is seen that both methods obtain small […] and […] values using this choice, but for the FCM method it happens at the price of a very large […] value in most cases, whereas for the GTC method smaller values are obtained, albeit quite large for data sets #4 and #6. For data set #5, Fig. 5 shows the errors of three types for all sets of clusterings, with lines connecting Type 1, 2, and 3 errors of each clustering, for FCM [Fig. 5(a)] and GTC [Fig. 5(b)]. This result is representative for four of the seven data sets, and from this figure it is clear that the GTC method offers a better set of clusterings for a human to choose from. In fact, it is seen that there is a tradeoff between errors of Type 1 and Type 2 for FCM, whereby it is impossible to obtain small values of both. This is in contrast to GTC, for which small values of all three error types can be reached with a proper decision.

Fig. 5. Error results for data set #5. (a) FCM and (b) GTC. The results of each clustering are connected. All output clusterings are displayed.

In Fig. 6 we give an example of how signal clusters may look. The figure shows averages [± one empirical standard deviation].

Fig. 6. Signal clusters, data set #2, GTC method. Averages ± one empirical standard deviation. (a), (b) and (c): three clusters. (d): residual. The time axis spans 100 ms.
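Before the evaluation of geometric weighting below, a sketch of the transform itself may help. The paper factors the weighting matrix as W = EΛEᵀ and transforms the data with Λ^{1/2}Eᵀ; the exact construction of W from the electrode geometry is lost in this copy, so an exponential fall-off in geodesic electrode distance, with parameters alpha and beta, is assumed here purely for illustration.

```python
import numpy as np

def geodesic_distances(theta, phi):
    """Great-circle distances between electrodes on a unit sphere.
    theta, phi : latitude and azimuth angles (radians) per electrode."""
    xyz = np.stack([np.cos(theta) * np.cos(phi),
                    np.cos(theta) * np.sin(phi),
                    np.sin(theta)], axis=1)
    cosang = np.clip(xyz @ xyz.T, -1.0, 1.0)
    return np.arccos(cosang)

def weighting_transform(theta, phi, alpha=1.0, beta=2.0):
    """Spatial low-pass weighting transform T with x_tilde = T @ x.

    W has unit diagonal by construction; it is assumed (and numerically
    enforced) to be positive definite so that W = E Lambda E^T and
    T = Lambda^{1/2} E^T, as in the paper's factorization."""
    G = geodesic_distances(theta, phi)
    W = np.exp(-((G / alpha) ** beta))   # assumed fall-off, not the paper's
    lam, E = np.linalg.eigh(W)
    lam = np.clip(lam, 1e-12, None)      # guard against numerical issues
    return (E * np.sqrt(lam)).T          # rows are sqrt(lam_i) * e_i^T
```

Larger beta concentrates the weighting on nearby channels, which matches the text's remark that the exponent controls how much spatial mixing the feature extraction permits.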
[D. Results of Geometric Weighting]

[The first column of Table VI contains] (24), i.e., the maximum relative increase of compactness over all clusters. If this number is smaller than one, there is a pair of nonzero parameters that decreases compactness for all clusters. The second column contains (25), i.e., the minimum relative increase of cluster separation over cluster pairs. If this number is larger than one, there is a nonzero pair of parameters that augments separation for all cluster pairs. The third column contains […]. If this number is smaller than one, there is a pair of nonzero parameters that improves conditions for clustering as measured by that criterion. Positions in the table left blank indicate that there is no consistent improvement over the clusters of the corresponding criterion.

TABLE VI. Results using geometric weighting on real-life data.

It can be seen that Xie and Beni's criterion decreases, i.e., clustering improves, for five data sets, but only for one of them, data set #7, does the improvement consist of both compactness and separation improvement. For the other cases, it is either compactness or separation improvement, or neither of them (data set #6). Data set #7, for which geometric weighting seems most promising judging from Table VI, was transformed according to its optimal […] parameters and subjected to FCM clustering. The result was compared to FCM clustering of raw data, in both cases using the correct number of clusters. The error summed over all three types was computed for each possible choice of number of outliers […].

VI. CONCLUSION

In this paper we have evaluated two algorithms, the FCM and a GTC, for the task of robust clustering of epileptic EEG spikes, and also suggested a method for geometrically weighted feature extraction. We have suggested criteria for determination of the correct number of outliers, and adapted FCM to one-dimensional subspace clusters (amplitude invariance). Simulations gave good performance for both algorithms in a range of circumstances which are clinically probable, and GTC gave the best results. The geometric weighting procedure gave improved clustering results for simulated data clusters that were sufficiently separated geometrically.

Evaluation of both algorithms on seven real-life data sets, which had been manually clustered by a neurophysiologist, gave results which speak in favor of the GTC algorithm. The FCM algorithm gave bad results for five of the data sets, and there was a tradeoff between errors of Type 1 and Type 2: simultaneously low errors were impossible to reach. The GTC algorithm gave results that were acceptable for all seven data sets but one. Clinical data sets are likely to have varying characteristics. Criterion-based clustering of all possible data sets seems difficult, and as a remedy we suggest modifying the output of the clustering algorithm to a set of clusterings for a human operator to choose from. Using this idea, we obtained the result that GTC is preferable to FCM, and the GTC results were good for all seven data sets.

The geometrically weighted feature extraction is favorable for some data set configurations, where the clusters are well separated geometrically. It can then facilitate clustering, and we obtained successful results in five of the seven data sets. We suggest it should be used as a complementary dimension to raw-data clustering for exploratory cluster analysis.

REFERENCES

[1] P. Y. Ktonas, "Automated spike and sharp wave detection," in Handbook of Electroencephalography and Clinical Neurophysiology. Amsterdam, The Netherlands: Elsevier, 1987, vol. 1, pp. 211–238.
[2] D. H. Fender, "Source localization of brain electrical activity," in Handbook of Electroencephalography and Clinical Neurophysiology. Amsterdam, The Netherlands: Elsevier, 1987, vol. 1, pp. 355–403.
[3] G. Lantz, M. Holub, E. Ryding, and I. Rosén, "Simultaneous intracranial and extracranial recording of interictal epileptiform activity: Patterns of conduction and results from dipole reconstructions," Electroencephalogr. Clin. Neurophysiol., no. 99, pp. 69–78, 1996.
[4] J. S. Ebersole, "EEG dipole modeling in complex partial epilepsy," Brain Topogr., vol. 4, pp. 113–123, 1991.
[5] A. van Oosterom, "History and evolution of methods for solving the inverse problem," J. Clin. Neurophysiol., vol. 8, no. 4, pp. 371–380, 1991.
[6] A. A. Dingle, R. D. Jones, G. J. Carroll, and W. R. Fright, "A multistage system to detect epileptiform activity in the EEG," IEEE Trans. Biomed. Eng., vol. 40, pp. 1260–1268, Dec. 1993.
[7] P. J. Huber, Robust Statistics. New York: Wiley, 1981.
[8] J. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proc. 5th Berkeley Symp. Math. Stat. and Prob., L. M. LeCam and J. Neyman, Eds. Los Angeles, 1967, pp. 281–297.
[9] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[10] D. W. Matula, "Graph theoretic techniques for cluster analysis algorithms," in Classification and Clustering, J. Van Ryzin, Ed. New York: Academic, 1977, pp. 95–129.
[11] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[12] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
[13] G. Zouridakis, B. H. Jansen, and N. N. Boutros, "A fuzzy clustering approach to EP estimation," IEEE Trans. Biomed. Eng., vol. 44, pp. 673–680, Aug. 1997.
[14] G. Lantz, P. Wahlberg, G. Salomonsson, and I. Rosén, "Categorization of interictal epileptiform potentials using a graph-theoretic method," Electroencephalogr. Clin. Neurophysiol., vol. 107, pp. 323–331, 1998.
[15] K. Fukunaga, Statistical Pattern Recognition. New York: Academic, 1990.
[16] P. Lancaster, Theory of Matrices. New York: Academic, 1969.
[17] J. A. Thorpe, Elementary Topics in Differential Geometry. Berlin, Germany: Springer-Verlag, 1979.
[18] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Trans. Pattern Anal., vol. 13, no. 8, pp. 841–847, 1991.
[19] E. Oja, Subspace Methods of Pattern Recognition. New York: Wiley, 1983.
[20] F. Harary, Graph Theory. Reading, MA: Addison-Wesley, 1969.
[21] K. Thulasiraman and M. N. S. Swamy, Graphs: Theory and Algorithms. New York: Wiley, 1992.

Patrik Wahlberg received the M.Sc. and Ph.D. degrees in electrical engineering and signal processing from Lund University, Sweden, in 1991 and 1999, respectively. The thesis treated multichannel EEG signal analysis for epilepsy. He currently works as a Lecturer in Signal Processing at Lund University. His research interests include EEG signal processing, pattern recognition, and general applied mathematics.

Göran Lantz graduated from Medical School in 1987. He received certification as a Clinical Neurophysiologist in 1993 and the Ph.D. degree in clinical neurophysiology in 1997, both at Lund University, Sweden. He is currently working with the Plurifaculty Program of Cognitive Neuroscience, University of Geneva, Switzerland. His main research interests are noninvasive presurgical epilepsy evaluations using advanced methods for multichannel EEG analysis.
