Comparison of 3D algorithms for non-rigid motion and correspondence estimation

Modelling and Assimilation of Atmospheric Chemistry - ECMWF

Training Data assimilation and Satellite Data – Johannes Flemming
Why Atmospheric Chemistry at NWP centres ?
- or in a NWP Training Course?
Environmental concern Air pollution Ozone hole Climate change
ppt 1:10^12
Atmospheric Chemistry
Transport
Chemical Reactions
Photolysis
Catalytic Cycles
Emissions
Atmospheric Reservoir
Dr. Martin Schultz - Max-Planck-Institut für Meteorologie, Hamburg
Rodwell and Jung, published in Quart. J. Roy. Meteorol. Soc., 134, 1479–1497 (2008)
Another motivation …
Transport
wet & dry Deposition
Modelling atmospheric composition
Mass balance equation for chemical species (up to 150 in state-of-the-art

Information Visualization Design: Translated Foreign Literature (Chinese and English)


Abstract

Information visualization has become increasingly popular in recent years due to the growth of data analytics and the need for effective data presentation. This paper aims to introduce the concept of information visualization design and its importance as a tool for analyzing complex data sets. It also provides a comparative analysis of information visualization design strategies used in both Chinese and English literature, highlighting some key differences and similarities between the two.

Introduction

Information visualization design involves the graphical representation of data and information to facilitate understanding, analysis, and decision-making. As the amount of data being generated continues to grow exponentially, the need for effective information visualization design becomes critical. It enables users to explore, analyze, and interpret data in a more intuitive and interactive manner.

Importance of Information Visualization Design

Information visualization design plays a crucial role in transforming complex data into visually appealing and meaningful representations. It helps users identify patterns, trends, and relationships within data sets that might not be immediately apparent in raw data. Moreover, it allows users to customize and interact with the visual representations, enabling them to gain deeper insights and make more informed decisions.

Comparison of Information Visualization Design Strategies

Chinese Literature

In Chinese literature, there has been an emphasis on the use of color, shape, and layout to convey information effectively. Chinese researchers have explored various visualization techniques, including word clouds, treemaps, and interactive charts. They have also emphasized the importance of user-centered design, focusing on user needs, preferences, and cognitive processes.

English Literature

In English literature, there has been a focus on the use of data-driven design principles and algorithms. English researchers have developed sophisticated visualization techniques such as scatter plots, heatmaps, and network graphs. They have also stressed the importance of data preprocessing and transformation to ensure accurate and reliable visualizations.

Similarities and Differences

Although there are some distinct approaches in Chinese and English literature, there are also some common principles and techniques that both share. Both Chinese and English researchers recognize the importance of clear and concise visualizations, effective use of color, and interactivity. They also emphasize the importance of design evaluation and user testing to ensure the usability and effectiveness of visualizations.

However, there are some differences in terms of research goals and focus. Chinese literature tends to emphasize the use of storytelling and narrative in information visualization design, aiming to engage users emotionally and create meaningful experiences. English literature, on the other hand, focuses more on the technical aspects of visualization design, such as algorithm development and optimization.

Conclusion

Information visualization design is a crucial tool in analyzing and understanding complex data sets. Both Chinese and English literature provide valuable insights and techniques for effective information visualization design. By combining the strengths of both approaches, researchers and practitioners can create more impactful and user-friendly visualizations. As data continues to grow in volume and complexity, the field of information visualization design will continue to evolve, enabling us to derive greater insights and make better-informed decisions.

Segmentation - University of M


Robustness
– Outliers: Improve the model either by giving the noise “heavier tails” or allowing an explicit outlier model
– M-estimators
Assuming that somewhere in the collection of processes close to our model is the real process, and it just happens to be the one that makes the estimator produce the worst possible estimates
– Proximity, similarity, common fate, common region, parallelism, closure, symmetry, continuity, familiar configuration
Segmentation by clustering
Partitioning vs. grouping Applications
$\sum_i \rho(r_i(x_i, \theta); \sigma)$

$\rho(u; \sigma) = \frac{u^2}{\sigma^2 + u^2}$
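To illustrate how an M-estimator limits the influence of outliers, here is a minimal sketch. It assumes the robust loss ρ(u; σ) = u² / (σ² + u²) and minimizes it for a one-dimensional location parameter by iteratively reweighted least squares. The function names, the data and the choice of σ are illustrative assumptions, not taken from the slides.

```python
from statistics import median


def rho(u, sigma):
    """Saturating robust loss: about u^2/sigma^2 for small u, tends to 1 for large u."""
    return u * u / (sigma * sigma + u * u)


def irls_weight(u, sigma):
    """Weight rho'(u) / u = 2*sigma^2 / (sigma^2 + u^2)^2 used in the reweighting step."""
    s2 = sigma * sigma
    return 2.0 * s2 / (s2 + u * u) ** 2


def robust_location(xs, sigma, iters=50):
    """Estimate a 1-D location by iteratively reweighted least squares."""
    theta = median(xs)  # robust starting point (an assumption of this sketch)
    for _ in range(iters):
        weights = [irls_weight(x - theta, sigma) for x in xs]
        theta = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return theta


data = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]  # the last value is a gross outlier
plain_mean = sum(data) / len(data)         # pulled strongly toward the outlier
robust = robust_location(data, sigma=0.5)  # stays close to the inliers
```

Because the loss saturates, the outlier receives a weight several orders of magnitude smaller than the inliers, so it barely moves the estimate.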
Segmentation by fitting a model(3)
RANSAC (RANdom SAmple Consensus)
– Searching for a random sample that leads to a fit on which many of the data points agree
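The random-sample search can be sketched for the simplest case, fitting a 2D line. This is an illustrative sketch under stated assumptions, not the slides' own code: the model is a line through two sampled points, agreement is measured by vertical distance, and the names `fit_line` and `ransac_line`, the threshold and the iteration count are hypothetical choices.

```python
import random


def fit_line(p, q):
    """Return (slope, intercept) of the line through p and q, or None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def ransac_line(points, threshold, iterations, rng):
    """Keep the two-point line on which the largest number of points agree."""
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_line(*rng.sample(points, 2))  # minimal random sample
        if model is None:
            continue
        slope, intercept = model
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


inlier_points = [(float(x), 2.0 * x + 1.0) for x in range(10)]  # on y = 2x + 1
outlier_points = [(1.0, 12.0), (3.0, -5.0), (6.0, 30.0), (8.0, 0.0)]
model, inliers = ransac_line(inlier_points + outlier_points,
                             threshold=0.1, iterations=200,
                             rng=random.Random(0))
```

A common follow-up, omitted here, is to refit the model by least squares on the final inlier set.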
Allocate each data point to cluster whose center is nearest
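The allocation step above is one half of a k-means iteration; the other half recomputes each center as the mean of its members. The sketch below assumes squared Euclidean distance and user-supplied starting centers. The function names and the sample points are illustrative.

```python
def nearest_center(point, centers):
    """Index of the center with the smallest squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])))


def kmeans(points, centers, iterations=10):
    """Alternate the allocation step and the center update step."""
    labels = []
    for _ in range(iterations):
        # Allocation: each data point goes to the cluster whose center is nearest.
        labels = [nearest_center(p, centers) for p in points]
        # Update: each center becomes the mean of the points allocated to it.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(axis) / len(members)
                                         for axis in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center unchanged
        centers = new_centers
    return labels, centers


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```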

A Comparison of Algorithms for the


A Comparison of Algorithms for the Optimization of Fermentation Processes

Rui Mendes, Isabel Rocha, Eugénio C. Ferreira, Miguel Rocha

Rui Mendes and Miguel Rocha are with the Department of Informatics and the Centro de Ciências e Tecnologias da Computação, Universidade do Minho, Braga, Portugal (email: azuki@di.uminho.pt, mrocha@di.uminho.pt). Isabel Rocha and Eugénio Ferreira are with the Centro de Engenharia Biológica da Universidade do Minho (email: irocha@deb.uminho.pt, ecferreira@deb.uminho.pt).

Abstract - The optimization of biotechnological processes is a complex problem that has been intensively studied in the past few years due to the economic impact of the products obtained from fermentations. In fed-batch processes, the goal is to find the optimal feeding trajectory that maximizes the final productivity. Several methods, including Evolutionary Algorithms (EAs), have been applied to this task in a number of different fermentation processes. This paper performs an experimental comparison between Particle Swarm Optimization, Differential Evolution and a real-valued EA in three distinct case studies, taken from previous work by the authors and literature, all considering the optimization of fed-batch fermentation processes.

I. INTRODUCTION

A number of valuable products such as recombinant proteins, antibiotics and amino-acids are produced using fermentation techniques. Additionally, biotechnology has been replacing traditional manufacturing processes in many areas like the production of bulk chemicals, due to its relatively low requirements regarding energy and environmental costs. Consequently, there is an enormous economic incentive to develop engineering techniques that can increase the productivity of such processes. However, these are typically very complex, involving different transport phenomena, microbial components and biochemical reactions. Furthermore, the nonlinear behavior and time-varying properties, together with the lack of reliable sensors capable of providing direct and on-line measurements of the biological state variables, limit the application of traditional control and optimization techniques to bioreactors.

Under this context, there is the need to consider quantitative mathematical models, capable of describing the process dynamics and the interrelation among relevant variables. Additionally, robust global optimization techniques must deal with the model's complexity, the environment constraints and the inherent noise of the experimental process [3].

In fed-batch fermentations, process optimization usually encompasses finding a given nutrient feeding trajectory that maximizes productivity. Several optimization methods have been applied in this task. It has been shown that, for simple bioreactor systems, the problem can be solved analytically [24].

Numerical methods take a distinct approach to this dynamic optimization problem. Gradient algorithms are used to adjust the control trajectories in order to iteratively improve the objective function [4]. In contrast, dynamic programming methods discretize both time and control variables to a predefined number of values. A systematic backward search method in combination with the simulation of the system model equations is used to find the optimal path through the defined grid. However, in order to achieve a global optimum the computational burden is very high [23].

An alternative approach comes from the use of algorithms from the Evolutionary Computation (EC) field, which have been used in the past to optimize nonlinear problems with a large number of variables. These techniques have been applied with success to the optimization of feeding or temperature trajectories [14][1], and, when compared with traditional methods, usually perform better [20][6].

In this work, the performance of different algorithms belonging to three main groups - Evolutionary Algorithms (EA), Particle Swarm (PSO) and Differential Evolution (DE) - was compared, when applied to the task of optimizing the feeding trajectory of fed-batch fermentation
processes. Three test cases taken from literature and previous work by the authors were used. The algorithms were allowed to run for a given number of function evaluations that was deemed to be enough to achieve acceptable results. The comparison among the algorithms was based on their final result and on the convergence speed.

The paper is organized as follows: firstly, the fed-batch fermentation case studies are presented; next, PSO, DE and a real-valued EA are described; the results of the application of the different algorithms to the case studies are presented; finally, the paper presents a discussion of the results, conclusions and further work.

II. CASE STUDIES: FED-BATCH FERMENTATION PROCESSES

In fed-batch fermentations there is an addition of certain nutrients along the process, in order to prevent the accumulation of toxic products, allowing the achievement of higher product concentrations. During this process the system states change considerably, from a low initial to a very high biomass and product concentrations. This dynamic behavior motivates the development of optimization methods to find the optimal input feeding trajectories in order to improve the process. The typical input in this process is the substrate inflow rate time profile.

0-7803-9487-9/06/$20.00 ©2006 IEEE. 2006 IEEE Congress on Evolutionary Computation, Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada, July 16-21, 2006.

For the proper optimization of the process, a white box mathematical model is typically developed, based on differential equations that represent the mass balances of the relevant state variables.

A. Case study I

In previous work by the authors, a fed-batch recombinant Escherichia coli fermentation process was optimized by EAs [17][18]. This was considered as the first case study in this work and will be briefly described next. During the aerobic growth of the bacterium, with glucose as the only added substrate, the microorganism can follow three main different metabolic pathways:

• Oxidative
growth on glucose: $k_1 S + k_5 O \xrightarrow{\mu_1} X + k_8 C$ (1)
• Fermentative growth on glucose: $k_2 S + k_6 O \xrightarrow{\mu_2} X + k_9 C + k_3 A$ (2)
• Oxidative growth on acetic acid: $k_4 A + k_7 O \xrightarrow{\mu_3} X + k_{10} C$ (3)

where S, O, X, C, A represent glucose, dissolved oxygen, biomass, dissolved carbon dioxide and acetate components, respectively. In the sequel, the same symbols are used to represent the state variables' concentrations (in g/kg); $\mu_1$ to $\mu_3$ are time variant specific growth rates that nonlinearly depend on the state variables, and $k_i$ are constant yield coefficients. The associated dynamical model can be described by the following equations:

[equation (4) is not legible]
$\frac{dS}{dt} = (-k_1 \mu_1 - k_2 \mu_2) X + F_{in,S} S_{in} \, [\ldots]$ (5)
$\frac{dA}{dt} = (k_3 \mu_2 - k_4 \mu_3) X - D A$ (6)
[equation (7) is not legible]
$\frac{dC}{dt} = (k_8 \mu_1 + k_9 \mu_2 + k_{10} \mu_3) X - CTR - D C$ (8)
[equations (9) and (10) are not legible]

The relevant state variables are initialized with the following values: X(0) = 5, S(0) = 0, A(0) = 0, W(0) = 3. Due to limitations in the feeding pump capacity, the value of $F_{in,S}(t)$ must be in the range [0.0; 0.4]. Furthermore, the following constraint is defined over the value of W: W(t) ≤ 5. The final time ($T_f$) is set to 25 hours.

B. Case study II

This system is a fed-batch bioreactor for the production of ethanol by Saccharomyces cerevisiae, firstly studied by Chen and Hwang [5]. The aim is to find the substrate feed rate profile that maximizes the yield of ethanol. The model equations are the following:

[equations (11) to (23) are not legible; the span also appears to cover the end of case study II and the opening of case study III]

The aim of the optimization is to find the feeding profile (u) that maximizes the following PI:

$PI = x_1(T_f) \, x_5(T_f)$ (24)

The final time is set to $T_f = 15$ (hours) and the initial values for relevant state variables are the following: $x_1(0) = 0$, $x_2(0) = 0$, $x_3(0) = 1$, $x_4(0) = 5$ and $x_5(0) = 1$. The feed rate is constrained to the range $u(t) \in [0.0; 3.0]$.

III. THE ALGORITHMS

The optimization task is to find the feeding trajectory, represented as an array of real-valued variables, that yields the best performance index. Each variable will encode the amount of substrate to be introduced into the bioreactor, in a given time unit, and the
solution will be given by the temporal sequence of such values. In this case, the size of the genome would be determined based on the final time of the process ($T_f$) and the discretization step (d) considered in the numerical simulation of the model, given by the expression:

$\frac{T_f}{d \, I} + 1$ (25)

where I stands for the number of points within each interpolation interval. The value of d used in the experiments was d = 0.005, for case studies I, II and III.

The evaluation process, for each individual in the population, is achieved by running a numerical simulation of the defined model, given as input the feeding values in the genome. The numerical simulation is performed using ODEToJava, a package of ordinary differential equation solvers, using a linearly implicit implicit/explicit (IMEX) Runge-Kutta scheme used for stiff problems [2]. The fitness value is then calculated from the final values of the state variables according to the PI defined for each case.

A. Particle Swarm Optimization

A particle swarm optimizer uses a population of particles that evolve over time by flying through space. The particles imitate their most successful neighbors by modifying their velocity component to follow the direction of the most successful position of their neighbors. Each particle is defined by:

$P^{(i)}_t = \langle x_t, v_t, p_t, e_t \rangle$

$x_t \in R^d$ is the current position in the search space; $p_t \in R^d$ is the position visited by the particle in the past that had the best function evaluation; $v_t \in R^d$ is a vector that represents the direction in which the particle is moving, it is called the 'velocity'; $e_t$ is the evaluation of $p_t$ under the function being optimized, i.e. $e_t = f(p_t)$.

Particles are connected to others in the population via a predefined topology. This can be represented by the adjacency matrix of a directed graph $M = (m_{ij})$, where $m_{ij} = 1$ if there is an edge from particle i to particle j and $m_{ij} = 0$ otherwise.

At each iteration, a new population is produced by allowing each particle to move stochastically toward its previous best position and
at the same time toward the best of the previous best positions of all other particles to which it is connected. The following is an outline of a generic PSO.

1) Set the iteration counter, t = 0.
2) Initialize each $x^{(i)}_0$ and $v^{(i)}_0$ randomly. Set $p^{(i)}_0 = x^{(i)}_0$.
3) Evaluate each particle and set $e^{(i)}_0 = f(p^{(i)}_0)$.
4) Let t = t + 1 and generate a new population, where each particle i is moved to a new position in the search space according to: (i) $v^{(i)}_t$ = velocity update($v^{(i)}_{t-1}$) [...]

[...] (i.e., small perturbations will be preferred over larger ones), where $[min_i; max_i]$ is the range of values allowed for gene i. In both cases, an innovation is introduced: the mutation operators are applied to a variable number of genes (a value that is randomly set between 1 and 10 in each application).

p-value codes: 3: p < 0.001; 2: 0.001 ≤ p < 0.01; 1: 0.01 ≤ p < 0.05; N: p ≥ 0.05

CanPso    2.5154 ± 0.7123    2.5563 ± 0.7091    2.5641 ± 0.7168
DE        9.3693 ± 0.0570    9.4738 ± 0.0052    9.4770 ± 0.0028
DEBest    2.7077 ± 0.1921    2.7419 ± 0.2115    2.7936 ± 0.2176
DETourn   9.1044 ± 0.1983    9.2913 ± 0.1240    9.3596 ± 0.1114
EA        7.9371 ± 0.1355    8.5161 ± 0.0883    8.8121 ± 0.0673
Fips      9.1804 ± 0.1642    9.4280 ± 0.0551    9.4528 ± 0.0538

CanPso    7.1461 ± 1.1152    7.1461 ± 1.1152    7.1461 ± 1.1152
DE        9.4351 ± 0.0000    9.4351 ± 0.0000    9.4351 ± 0.0000
DEBest    7.6932 ± 0.8321    7.6937 ± 0.8321    7.6937 ± 0.8321
DETourn   9.4099 ± 0.0551    9.4099 ± 0.0551    9.4099 ± 0.0551
EA        8.7647 ± 0.1441    9.0137 ± 0.1421    9.1324 ± 0.1320
Fips      9.4351 ± 0.0000    9.4351 ± 0.0000    9.4351 ± 0.0000

Pairwise comparison (fragment): CanPso DE DEBest DETourn EA | DE N-N-N 3-3-1 | DETourn 3-3-1 3-3-2 3-3-1 3-3-1 | Fips

Algorithm   PI 40k NFEs        PI 100k NFEs       PI 200k NFEs
CanPso      19385.2 ± 284.3    19386.4 ± 284.3    19406.8 ± 272.5
DE          20379.4 ± 11.6     20397.2 ± 13.9     20406.9 ± 14.5
DEBest      19418.1 ± 290.0    19421.0 ± 290.4    19430.5 ± 293.5
DETourn     20362.7 ± 52.4     20380.4 ± 42.7     20394.3 ± 32.8
EA          20151.8 ± 69.7     20335.1 ± 54.1     20394.7 ± 23.1
Fips        19818.0 ± 160.7    19818.9 ± 161.1    19818.9 ± 161.1

TABLE IV. RESULTS FOR CASE II FOR I = 100 (109 VARIABLES), I = 200 (55 VARIABLES) AND I = 540 (21 VARIABLES) RESPECTIVELY.

Table V shows the comparison of the algorithms. As can be seen, CanPso continues to be the worst contender but DEBest is not a very bad choice when the number of variables is
small. EA is still beaten by DE and DETourn in most cases. Figure 2 presents the convergence curve of the algorithms. DE and DETourn converge fast (around 40,000 NFEs); Fips gets stuck in a plateau that is higher than the one of DEBest and CanPso; EA converges slowly but is steadily improving. It seems that, given enough time, EA finds similar solutions to either DE and DETourn.

Pairwise comparison (fragment): 3-3-1 | DEBest 3-3-1 N-N-N 3-3-N | EA N-N-N 3-3-N N-N-N 3-3-N 3-3-N

TABLE V. PAIRWISE T-TEST WITH THE HOLM P-VALUE ADJUSTMENT FOR THE ALGORITHMS OF CASE II. THE P-VALUE CODES CORRESPOND TO I = 100, I = 200 AND I = 540 RESPECTIVELY.

Fig. 2. Convergence of the algorithms for case II for I = 200 (PI versus NFEs; curves for CanPso, Fips, DE, DEBest, DETourn and EA).

E. Results for case III

Table VI presents the results of the algorithms on case III. This case seems to be quite simple and most algorithms find similar results. DE, Fips and EA are the best algorithms in this problem because of their reliability: they have narrow confidence intervals. DETourn seems to be a little less reliable, but its confidence intervals are still small enough.

Table VII shows the comparison of the algorithms for this problem. In this case, most algorithms are not statistically different. This is the case when we turn to the reliability of the algorithms to draw conclusions. As we stated before, most algorithms find similar solutions, which indicates that this case is probably not a good benchmark to compare algorithms.

Figure 3 presents the convergence curve of the algorithms for I = 100. In this case DE, DETourn and Fips converge very fast; EA has a slower convergence rate; CanPso and DEBest get stuck in local optima.

V. CONCLUSIONS AND FURTHER WORK

This paper compares canonical particle swarm (CanPso), fully informed particle swarm (Fips), a real-valued EA (EA) and three schemes of differential evolution (DE, DEBest and DETourn) in three test cases of optimizing the feeding trajectory in fed-batch fermentation. Each of
these problems was tackled with different numbers of points (i.e., different values for I) to interpolate the feeding trajectory. This is a trade off: the more variables we have, the more precise the curve is but the harder it is to optimize.

CanPso    27.069 ± 1.751    27.370 ± 1.836    27.579 ± 1.681
DE        32.641 ± 0.029    32.674 ± 0.002    32.680 ± 0.001
DEBest    30.774 ± 1.004    30.775 ± 1.004    30.775 ± 1.004
DETourn   32.624 ± 0.057    32.629 ± 0.056    32.631 ± 0.056
EA        32.526 ± 0.025    32.633 ± 0.013    32.670 ± 0.008
Fips      32.625 ± 0.100    32.629 ± 0.099    32.630 ± 0.099

CanPso    31.914 ± 0.662    31.914 ± 0.662    31.914 ± 0.662
DE        32.444 ± 0.000    32.444 ± 0.000    32.444 ± 0.000
DEBest    31.913 ± 0.700    31.914 ± 0.700    31.914 ± 0.700
DETourn   32.441 ± 0.005    32.441 ± 0.005    32.441 ± 0.005
EA        32.413 ± 0.012    32.439 ± 0.003    32.443 ± 0.001
Fips      32.444 ± 0.000    32.444 ± 0.000    32.444 ± 0.000

Pairwise comparison (fragment): CanPso DE DEBest DETourn EA | DE 1-N-N 1-N-N | DETourn 2-N-N N-3-N 1-N-N N-3-N | Fips

Fig. 3. Convergence of the algorithms for case III when I = 100 (PI versus NFEs; curves for CanPso, Fips, DE, DEBest, DETourn and EA).

[...] to choose DE instead.

Previous work by the authors [19] developed a new representation in EAs in order to allow the optimization of a time trajectory with automatic interpolation. It would be interesting to develop a similar approach within DE or Fips. Another area of future research is the consideration of on-line adaptation, where the model of the process is updated during the fermentation process. In this case, the good computational performance of DE is a benefit, if there is the need to re-optimize the feeding given a new model and values for the state variables are measured on-line.

ACKNOWLEDGMENTS

This work was supported in part by the Portuguese Foundation for Science and Technology under project POSC/EIA/59899/2004. The authors wish to thank Project SeARCH (Services and Advanced Research Computing with HTC/HPC clusters), funded by FCT under contract CONC-REEQ/443/2001, for the computational resources made available.

REFERENCES

[1] P. Angelov and R. Guthke. A Genetic-Algorithm-based Approach to Optimization of Bioprocesses Described by Fuzzy Rules. Bioprocess Engin., 16:299–303, 1997.
[2] Ascher, Ruuth, and Spiteri. Implicit-explicit Runge-Kutta methods for time-dependent partial differential equations. Applied Numerical Mathematics, 25:151–167, 1997.
[3] J. R. Banga, C. Moles, and A. Alonso. Global Optimization of Bioprocesses using Stochastic and Hybrid Methods. In C. A. Floudas and P. M. Pardalos, editors, Frontiers in Global Optimization - Nonconvex Optimization and its Applications, volume 74, pages 45–70. Kluwer Academic Publishers, 2003.
[4] A. E. Bryson and Y. C. Ho. Applied Optimal Control - Optimization, Estimation and Control. Hemisphere Publication Company, New York, 1975.
[5] C. T. Chen and C. Hwang. Optimal Control Computation for Differential-algebraic Process Systems with General Constraints. Chemical Engineering Communications, 97:9–26, 1990.
[6] J. P. Chiou and F. S. Wang. Hybrid Method of Evolutionary Algorithms for Static and Dynamic Optimization Problems with Application to a Fed-batch Fermentation Process. Computers & Chemical Engineering, 23:1277–1291, 1999.
[7] Maurice Clerc and James Kennedy. The particle swarm - explosion, stability, and convergence in a multidimensional complex space. IEEE Transactions on Evolutionary Computation, 6(1):58–73, 2002.
[8] George E. P. Box, William G. Hunter, and J. Stuart Hunter. Statistics for experimenters: An introduction to design and data analysis. NY: John Wiley, 1978.
[9] Cyril Harold Goulden. Methods of Statistical Analysis, 2nd ed. John Wiley & Sons Ltd., 1956.
[10] S. Holm. A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6:65–70, 1979.
[11] J. Kennedy and R. Mendes. Topological structure and particle swarm performance. In David B. Fogel, Xin Yao, Garry Greenwood, Hitoshi Iba, Paul Marrow, and Mark Shackleton, editors, Proceedings of the Fourth Congress on Evolutionary Computation (CEC-2002), Honolulu, Hawaii, May 2002. IEEE Computer Society.
[12] Rui Mendes, James Kennedy, and José Neves. The fully informed particle swarm: Simple, maybe better. IEEE Transactions on Evolutionary Computation, 8(3):204–210, 2004.
[13] Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, USA, third edition, 1996.
[14] H. Moriyama and K. Shimizu. On-line Optimization of Culture Temperature for Ethanol Fermentation Using a Genetic Algorithm. Journal Chemical Technology Biotechnology, 66:217–222, 1996.
[15] S. Park and W. F. Ramirez. Optimal Production of Secreted Protein in Fed-batch Reactors. AIChE J, 34(9):1550–1558, 1988.
[16] I. Rocha. Model-based strategies for computer-aided operation of recombinant E. coli fermentation. PhD thesis, Universidade do Minho, 2003.
[17] I. Rocha and E. C. Ferreira. On-line Simultaneous Monitoring of Glucose and Acetate with FIA During High Cell Density Fermentation of Recombinant E. coli. Analytica Chimica Acta, 462(2):293–304, 2002.
[18] M. Rocha, J. Neves, I. Rocha, and E. Ferreira. Evolutionary algorithms for optimal control in fed-batch fermentation processes. In G. Raidl et al., editor, Proceedings of the Workshop on Evolutionary Bioinformatics - EvoWorkshops 2004, LNCS 3005, pages 84–93. Springer, 2004.
[19] Miguel Rocha, Isabel Rocha, and Eugénio Ferreira. A new representation in evolutionary algorithms for the optimization of bioprocesses. In Proceedings of the IEEE Congress on Evolutionary Computation, pages 484–490. IEEE Press, 2005.
[20] J. A. Roubos, G. van Straten, and A. J. van Boxtel. An Evolutionary Strategy for Fed-batch Bioreactor Optimization: Concepts and Performance. Journal of Biotechnology, 67:173–187, 1999.
[21] Rainer Storn. On the usage of differential evolution for function optimization. In 1996 Biennial Conference of the North American Fuzzy Information Processing Society (NAFIPS 1996), pages 519–523. IEEE, 1996.
[22] Rainer Storn and Kenneth Price. Minimizing the real functions of the ICEC'96 contest by differential evolution. In IEEE International Conference on Evolutionary Computation, pages 842–844. IEEE, May 1996.
[23] A. Tholudur and W. F. Ramirez. Optimization of Fed-batch Bioreactors Using Neural Network Parameters. Biotechnology Progress, 12:302–309, 1996.
[24] V. van Breusegem and G. Bastin. Optimal Control of Biomass Growth in a Mixed Culture. Biotechnology and Bioengineering, 35:349–355, 1990.
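The generic particle swarm described in Section III-A (particles holding a position x, a velocity v, a previous best p and its evaluation e, each pulled toward its own best and the best of its neighbours) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a fully connected topology, the widely used constriction settings χ = 0.7298 and c1 = c2 = 2.05, and it minimizes a toy sphere function instead of maximizing a fermentation performance index. All names are illustrative.

```python
import random


def pso(f, dim, bounds, n_particles=20, iterations=200, seed=0):
    """Minimise f with a global-best particle swarm using constriction coefficients."""
    rng = random.Random(seed)
    lo, hi = bounds
    chi, c1, c2 = 0.7298, 2.05, 2.05  # assumed settings, not taken from the paper
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    ps = [x[:] for x in xs]   # previous best positions p
    es = [f(p) for p in ps]   # their evaluations e = f(p)
    for _ in range(iterations):
        # Best previous-best position among all particles (fully connected graph).
        g = ps[min(range(n_particles), key=es.__getitem__)]
        for i in range(n_particles):
            for d in range(dim):
                vs[i][d] = chi * (vs[i][d]
                                  + c1 * rng.random() * (ps[i][d] - xs[i][d])
                                  + c2 * rng.random() * (g[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            value = f(xs[i])
            if value < es[i]:  # keep the best position seen so far
                ps[i], es[i] = xs[i][:], value
    best = min(range(n_particles), key=es.__getitem__)
    return ps[best], es[best]


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(c * c for c in x)


best_position, best_value = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

To apply the same loop to a feeding trajectory, the position vector would hold the feed values and f would return the negated performance index from a model simulation.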

Essay Topic: Taking the Right Attitude Toward Algorithms


English answer:

When it comes to dealing with algorithms, it is important to approach them with a balanced perspective. On one hand, algorithms have greatly improved our lives by providing efficient solutions to complex problems. For example, search engines like Google use algorithms to quickly deliver relevant search results, saving us time and effort. Algorithms also play a crucial role in various industries, such as finance, healthcare, and transportation, where they help optimize processes and make informed decisions.

However, it is equally important to acknowledge the potential drawbacks and ethical concerns associated with algorithms. One major concern is the issue of bias. Algorithms are created by humans and can inadvertently reflect the biases and prejudices of their creators. For instance, facial recognition algorithms have been found to have higher error rates for people with darker skin tones, leading to potential discrimination. Another concern is the lack of transparency and accountability in algorithmic decision-making. When algorithms are used to make important decisions, such as in hiring or loan approvals, it is crucial to ensure that they are fair, unbiased, and explainable.

To address these concerns, it is necessary to have regulations and guidelines in place to govern the development and use of algorithms. Governments and organizations should promote transparency and accountability by requiring algorithmic systems to be auditable and explainable. Additionally, there should be diversity and inclusivity in the teams developing algorithms to minimize biases. Regular audits and evaluations of algorithms should be conducted to identify and rectify any biases or errors.

Moreover, it is essential to educate the public about algorithms and their impact. Many people are unaware of how algorithms work and the potential consequences of their use. By promoting digital literacy and providing accessible resources, individuals can make informed decisions and actively engage in discussions about algorithmic fairness and ethics.

In conclusion, algorithms have become an integral part of our lives, bringing numerous benefits and conveniences. However, we must approach them with caution and address the potential biases and ethical concerns they may pose. By implementing regulations, promoting transparency, and educating the public, we can ensure that algorithms are developed and used in a responsible and fair manner.

Chinese answer:

When it comes to dealing with algorithms, we need to approach them with a balanced attitude.

The development and comparison of robust methods for estimating the fundamental matrix


1. Introduction
In most computer vision algorithms it is assumed that a least squares framework is sufficient to deal with data corrupted by noise. However, in many applications, visual data are not only noisy, but also contain outliers, data that are in gross disagreement with a postulated model. Outliers, which are inevitably included in an initial fit, can so distort a fitting process that the fitted parameters become arbitrary. This is particularly severe when the veridical data are themselves degenerate or near-degenerate with respect to the model, for then outliers can appear to break the degeneracy. In such circumstances, the deployment of robust estimation methods is essential. Robust methods continue to recover meaningful descriptions of a statistical population even when the data contain outlying elements belonging to a different population. They are also able to perform when other assumptions underlying the estimation, say the noise model, are not wholly satisfied. Amongst the earliest to draw the value of such methods to the attention of computer vision researchers were Fischler and Bolles (1981). Figure 1 shows a table of x, y data from their paper which contains a gross outlier (Point 7). Fit 1 is the result of applying least squares, Fit 2 is the result of applying least squares after one robust method has removed the outlier, and the solid line is the result of applying their fully robust RANSAC algorithm to the data. The data set can also be used to demonstrate the failings of naive heuristics to remove outliers. For example, discarding

Several Common Optimization Methods (PPT slides)

fast, this is relatively unimportant because the time
required for integration is usually trivial in comparison to
the time required for the force calculations.
Example of integrator for MD simulation
• One of the most popular and widely used integrators is
the Verlet leapfrog method: positions and velocities of
Continuous time molecular dynamics
1. By calculating the derivative of a macromolecular force
field, we can find the forces on each atom
as a function of its position.
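Once forces are available as a function of position, a leapfrog-type integrator advances positions and velocities step by step. The sketch below uses the kick-drift-kick form of leapfrog (equivalent to velocity Verlet) on a single particle in a one-dimensional harmonic potential. The potential, the spring constant K and the two time steps are illustrative assumptions chosen to show how energy conservation depends on the step size.

```python
def leapfrog(x, v, force, mass, dt, steps):
    """Integrate one particle with the kick-drift-kick form of the leapfrog scheme."""
    a = force(x) / mass
    for _ in range(steps):
        v_half = v + 0.5 * dt * a  # half kick with the current force
        x = x + dt * v_half        # drift with the half-step velocity
        a = force(x) / mass        # force at the new position
        v = v_half + 0.5 * dt * a  # second half kick
    return x, v


K = 1.0  # spring constant of a 1-D harmonic "bond" (illustrative value)


def spring_force(x):
    """F = -dU/dx for the potential U = 0.5 * K * x^2."""
    return -K * x


def energy(x, v, mass=1.0):
    """Total energy: kinetic plus potential."""
    return 0.5 * mass * v * v + 0.5 * K * x * x


# Initial state x = 1, v = 0 has total energy 0.5.
x_small, v_small = leapfrog(1.0, 0.0, spring_force, 1.0, dt=0.01, steps=1000)
x_large, v_large = leapfrog(1.0, 0.0, spring_force, 1.0, dt=2.5, steps=50)
drift_small = abs(energy(x_small, v_small) - 0.5)  # stays tiny
drift_large = abs(energy(x_large, v_large) - 0.5)  # grows without bound
```

With the small step the energy error stays bounded and small; with a step longer than the stability limit of this oscillator the energy diverges, which is the failure mode that an overly long time step produces.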
Choosing the correct time step…
1. The choice of time step is crucial: too short and
phase space is sampled inefficiently, too long and the
energy will fluctuate wildly and the simulation may
– Rigid body dynamics
– Multiple time step algorithms

UNISINOS – Universidade do Vale do Rio dos Sinos


A Comparative Study of Algorithms for 3D Morphing

Tatiana Figueiredo Evers, Marcelo Walter
UNISINOS – Universidade do Vale do Rio dos Sinos
PIPCA - Mestrado em Computação Aplicada
tatiana,marcelow@exatas.unisinos.br

Abstract. We present a comparative study of two 3D morphing algorithms for polyhedral objects. A 3D morphing algorithm establishes a smooth transition between a source object and a target object. The two algorithms compared are the one presented by Hong et al. [1] and the one presented by Kent et al. [2]. The main conclusion is that, in general, the latter algorithm delivers morphing sequences that look more natural.

1 Introduction

Morphing techniques allow the transformation of a source object into a target object. One of the main goals of morphing is to convince the eye that the source object is smoothly transformed into the target object. In this study we implemented and compared two algorithms ([1] and [2]). We selected these two algorithms because they are among the first solutions presented for the 3D morphing problem and remain at the core of many morphing solutions today.

2 Algorithms

Both algorithms divide the problem into two steps. The first step deals with the correspondence between points, that is, how to establish a mapping between each point of the target and source objects. Once this correspondence is established, the second step deals with interpolation. The algorithms differ in how they establish the mapping. Hong uses the criterion of minimal dynamic distances, whereas Kent combines the topologies of the source and target objects, creating a new object. For the interpolation step, both solutions use linear interpolation between each pair of corresponding vertices over a user-given number of frames.

Figure 1: Visual comparison. (a) Hong's morphing. (b) Kent's morphing.

Table 1: Comparison between algorithms [1] and [2]. Recoverable entries: morphing "does not seem very real depending on the complexity of objects"; intermediate forms "soft and continuous"; polyhedral objects; connectivity maintained.

3 Conclusions

A summary of our findings is presented in Table 1 and a visual comparison between the two algorithms is shown in Figure 1. The technique presented by Kent et al. preserves the topology of intermediate objects, keeping the connectivity between the faces. This generates transformations with little distortion in the intermediate frames, that is, a soft and continuous morphing. The technique presented by Hong et al., on the other hand, ignores the topological information of the models, resulting in intermediate models where the faces seem "to fly separately" during the transformation. Therefore, in general, the solution proposed by Kent has better visual results.

References

[1] HONG, Tong Minh; MAGNENAT-THALMANN, N.; THALMANN, D. A General Algorithm for Interpolation in a Facet-Based Representation. In: Graphics Interface '88... Canada. 1988. p. 229-235.
[2] KENT, James R.; PARENT, Richard E.; CARLSON, Wayne E. Shape Transformation for Polyhedral Objects. In: SIGGRAPH '92..., United States. ACM Press. Vol. 26, n. 2, 1992, p. 47-54.
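The interpolation step shared by both algorithms (linear interpolation between corresponding vertices over a user-given number of frames) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `interpolate_vertices` is hypothetical, the correspondence is assumed to be already established (vertex k of the source maps to vertex k of the target), and including both end shapes among the frames is an assumption.

```python
def interpolate_vertices(source, target, num_frames):
    """Return a list of frames; each frame is a list of (x, y, z) vertices.

    Assumes source[k] corresponds to target[k] (the mapping step is done)
    and that the first and last frames are the source and target shapes.
    """
    if len(source) != len(target):
        raise ValueError("source and target need the same number of vertices")
    if num_frames < 2:
        raise ValueError("need at least two frames (source and target)")

    frames = []
    for k in range(num_frames):
        t = k / (num_frames - 1)  # interpolation parameter in [0, 1]
        frame = [
            tuple((1 - t) * s + t * g for s, g in zip(sv, tv))
            for sv, tv in zip(source, target)
        ]
        frames.append(frame)
    return frames


if __name__ == "__main__":
    src = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    dst = [(0.0, 2.0, 0.0), (1.0, 2.0, 4.0)]
    for frame in interpolate_vertices(src, dst, 3):
        print(frame)
```

Because each vertex moves independently, nothing in this step keeps faces connected; whether the intermediate shapes stay coherent depends entirely on the correspondence built in the first step, which is where the two algorithms differ.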

Statistics Papers (6 Selected)


1. "A Bayesian Approach to Modeling Mixed-Methods Survey Data"
This paper discusses the use of Bayesian methods for analyzing mixed-methods survey data, where both quantitative and qualitative data are collected from a sample. The authors present a hierarchical Bayesian model that allows for the incorporation of both types of data, and demonstrate its usefulness through simulations and a real-world application.

2. "Network Analysis of Financial Risk"
This paper uses network analysis to evaluate the interconnectedness of financial institutions and the potential for systemic risk. The authors construct a network of financial institutions based on their credit exposures, and analyze the network for patterns of vulnerability. The results suggest that the network is highly interconnected, and that some institutions are more central and influential than others.

3. "A Comparison of Machine Learning Algorithms for Prediction of Patient Outcomes"
This paper compares the performance of several machine learning algorithms for predicting patient outcomes, using data from electronic health records. The authors find that Random Forests and Support Vector Machines perform the best, and suggest that these models could be used in clinical decision-making.

4. "Spatial Analysis of Crime Rates"
This paper uses spatial analysis techniques to explore patterns of crime in a particular city. The authors use a spatial autocorrelation test to identify areas of high and low crime rates, and then conduct a regression analysis to identify factors that may be contributing to the patterns. The results suggest that socio-economic factors are strongly correlated with crime rates.

5. "Bayesian Inference for Time Series Forecasting"
This paper presents a Bayesian approach to forecasting time series data, using a state-space model and Markov Chain Monte Carlo techniques for parameter estimation. The authors demonstrate the method using data on monthly inflation rates, and show that it outperforms traditional methods such as ARIMA.

6. "Identifying Subpopulations with Latent Trait Models"
This paper presents a method for identifying subpopulations within a larger sample, based on latent trait models. The authors use a mixture model to identify subgroups with different characteristics, and then conduct a regression analysis to explore the factors that are associated with membership in each subgroup. The method is applied to data on adolescent substance use, and the results suggest that there are different patterns of substance use among subgroups of adolescents.

The Normalization Method in English


Normalization Method

Introduction

Normalization is a data preprocessing technique used to transform data into a common scale, allowing for easier comparison and analysis. It is an essential step in data preparation before applying various data mining algorithms. In this document, we will discuss the normalization method and its importance in data analysis.

What is Normalization?

Normalization is the process of rescaling numeric data to a common range without distorting the original distribution or losing information. It is particularly useful when dealing with datasets containing attributes with different scales or units. By normalizing the data, we can eliminate the bias caused by these differences and ensure fair comparisons between variables.

Why is Normalization Important?

1. Comparison: Normalizing data allows for fair and meaningful comparisons between variables. Without normalization, variables with larger scales or units may dominate the analysis, leading to inaccurate results. Normalization ensures that each attribute contributes equally to the analysis.

2. Data Mining: Many data mining algorithms, such as clustering and classification, rely on distance or similarity measures. Normalizing the data ensures that these algorithms are not influenced by the scale of the variables. It improves the accuracy and reliability of the results.

3. Interpretation: Normalized data is easier to interpret and understand. By bringing all variables to a common scale, we can directly compare their values and identify patterns or trends more effectively. This aids in decision-making and understanding the relationships between variables.

Normalization Methods
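The excerpt breaks off before listing specific methods. As an illustration of rescaling to a common range, here is a minimal sketch of two widely used choices, min-max scaling and z-score standardization. Picking these two is an assumption, since the source does not name its methods, and the function names are hypothetical.

```python
import statistics


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    heights_cm = [150, 160, 170, 180, 190]
    incomes = [20000, 35000, 50000, 65000, 80000]
    # After min-max scaling both attributes occupy the same [0, 1] range,
    # so neither dominates a distance computation because of its units.
    print(min_max_normalize(heights_cm))
    print(min_max_normalize(incomes))
    print(z_score_normalize(heights_cm))
```

Min-max scaling is sensitive to outliers because a single extreme value sets the range; z-scores are less so, but they do not bound the result to a fixed interval.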

Computer Organization and Design, Fifth Edition: Answers, CH06 Solution


Chapter 6 Solutions

6.1 There is no single right answer for this question. The purpose is to get students to think about parallelism present in their daily lives. The answer should have at least 10 activities identified.
6.1.1 Any reasonable answer is correct here.
6.1.2 Any reasonable answer is correct here.
6.1.3 Any reasonable answer is correct here.
6.1.4 The student is asked to quantify the savings due to parallelism. The answer should consider the amount of overlap provided through parallelism and should be less than or equal to (if no parallelism was possible) the original time computed if each activity was carried out serially.

6.2
6.2.1 For this set of resources, we can pipeline the preparation. We assume that we do not have to reheat the oven for each cake.
Preheat oven.
Mix ingredients in bowl for Cake 1.
Fill cake pan with contents of bowl and bake Cake 1. Mix ingredients for Cake 2 in bowl.
Finish baking Cake 1. Empty cake pan. Fill cake pan with bowl contents for Cake 2 and bake Cake 2. Mix ingredients in bowl for Cake 3.
Finish baking Cake 2. Empty cake pan. Fill cake pan with bowl contents for Cake 3 and bake Cake 3.
Finish baking Cake 3. Empty cake pan.
6.2.2 Now we have 3 bowls, 3 cake pans and 3 mixers. We will name them A, B, and C.
Preheat oven.
Mix ingredients in bowl A for Cake 1.
Fill cake pan A with contents of bowl A and bake for Cake 1. Mix ingredients for Cake 2 in bowl A.
Finish baking Cake 1. Empty cake pan A. Fill cake pan A with contents of bowl A for Cake 2. Mix ingredients in bowl A for Cake 3.
Finish baking Cake 2. Empty cake pan A. Fill cake pan A with contents of bowl A for Cake 3.
Finish baking Cake 3. Empty cake pan A.
The point here is that we cannot carry out any of these items in parallel because we either have one person doing the work, or we have limited capacity in our oven.
6.2.3 Each step can be done in parallel for each cake.
The time to bake 1 cake, 2 cakes or 3 cakes is exactly the same.
6.2.4 The loop computation is equivalent to the steps involved to make one cake. Given that we have multiple processors (or ovens and cooks), we can execute instructions (or cook multiple cakes) in parallel. The instructions in the loop (or cooking steps) may have some dependencies on prior instructions (or cooking steps) in the loop body (cooking a single cake).
Data-level parallelism occurs when loop iterations are independent (i.e., no loop-carried dependencies).
Task-level parallelism includes any instructions that can be computed on parallel execution units, and is similar to the independent operations involved in making multiple cakes.

6.3
6.3.1 While binary search has very good serial performance, it is difficult to parallelize without modifying the code. So part A asks to compute the speedup factor, but increasing X beyond 2 or 3 should have no benefits. While we can perform the comparison of low and high on one core, the computation for mid on a second core, and the comparison for A[mid] on a third core, without some restructuring or speculative execution, we will not obtain any speedup. The answer should include a graph, showing that no speedup is obtained after the values of 1, 2, or 3 (this value depends somewhat on the assumption made) for Y.
6.3.2 In this question, we suggest that we can increase the number of cores (to equal the number of array elements). Again, given the current code, we really cannot obtain any benefit from these extra cores. But if we create threads to compare the N elements to the value X and perform these in parallel, then we can get ideal speedup (Y times speedup), and the comparison can be completed in the amount of time to perform a single comparison.

6.4 This problem illustrates that some computations can be done in parallel if serial code is restructured.
But more importantly, we may want to provide for SIMD operations in our ISA, and allow for data-level parallelism when performing the same operation on multiple data items.
6.4.1 This is a straightforward computation. The first instruction is executed once, and the loop body is executed 998 times.
Version 1: 17,965 cycles
Version 2: 22,955 cycles
Version 3: 20,959 cycles
6.4.2 Array elements D[j] and D[j-1] will have loop-carried dependencies. These involve $f4 in the current iteration and $f0 in the next iteration.
6.4.3 This is a very challenging problem and there are many possible implementations for the solution. The preferred solution will try to utilize the two nodes by unrolling the loop 4 times (this already gives you a substantial speedup by eliminating many loop increment, branch and load instructions). The loop body running on node 1 would look something like this (the code is not the most efficient code sequence):

    addiu $s1, $zero, 996
    l.d   $f0, -16($s0)
    l.d   $f2, -8($s0)
    loop:
    add.d $f4, $f2, $f0
    add.d $f6, $f4, $f2
    Send (2, $f4)
    Send (2, $f6)
    s.d   $f4, 0($s0)
    s.d   $f6, 8($s0)
    Receive($f8)
    add.d $f10, $f8, $f6
    add.d $f0, $f10, $f8
    Send (2, $f10)
    Send (2, $f0)
    s.d   $f8, 16($s0)
    s.d   $f10, 24($s0)
    s.d   $f0, 32($s0)
    Receive($f2)
    s.d   $f2, 40($s0)
    addiu $s0, $s0, 48
    bne   $s0, $s1, loop
    add.d $f4, $f2, $f0
    add.d $f6, $f4, $f2
    add.d $f10, $f8, $f6
    s.d   $f4, 0($s0)
    s.d   $f6, 8($s0)
    s.d   $f8, 16($s0)

The code on node 2 would look something like this:

    addiu $s2, $zero, 0
    loop:
    Receive ($f12)
    Receive ($f14)
    add.d $f16, $f14, $f12
    Send(1, $f16)
    Receive ($f12)
    Receive ($f14)
    add.d $f16, $f14, $f12
    Send(1, $f16)
    Receive ($f12)
    Receive ($f14)
    add.d $f16, $f14, $f12
    Send(1, $f16)
    Receive ($f12)
    Receive ($f14)
    add.d $f16, $f14, $f12
    Send(1, $f16)
    addiu $s2, $s2, 1
    bne   $s2, 83, loop

Basically Node 1 would compute 4 adds each loop iteration, and Node 2 would compute 4 adds. The loop takes 1463 cycles, which is much better than close to 18K.
But the unrolled loop would run faster given the current send instruction latency.
6.4.4 The loop network would need to respond within a single cycle to obtain a speedup. This illustrates why using distributed message passing is difficult when loops contain loop-carried dependencies.

6.5
6.5.1 This problem is again a divide and conquer problem, but utilizes recursion to produce a very compact piece of code. In part A the student is asked to compute the speedup when the number of cores is small. When forming the lists, we spawn a thread for the computation of left in the MergeSort code, and spawn a thread for the computation of the right. If we consider this recursively, for m initial elements in the array, we can utilize 1 + 2 + 4 + 8 + 16 + … log2(m) processors to obtain speedup.
6.5.2 In this question, log2(m) is the largest value of Y for which we can obtain any speedup without restructuring. But if we had m cores, we could perform sorting using a very different algorithm. For instance, if we have greater than m/2 cores, we can compare all pairs of data elements, swap the elements if the left element is greater than the right element, and then repeat this step m times. So this is one possible answer for the question. It is known as parallel comparison sort. Various comparison sort algorithms include odd-even sort and cocktail sort.

6.6
6.6.1 This problem presents an "embarrassingly parallel" computation and asks the student to find the speedup obtained on a 4-core system. The computations involved are: (m × p × n) multiplications and (m × p × (n - 1)) additions. The multiplications and additions associated with a single element in C are dependent (we cannot start summing up the results of the multiplications for an element until two products are available).
So in this question, the speedup should be very close to 4.
6.6.2 This question asks about how speedup is affected due to cache misses caused by the 4 cores all working on different matrix elements that map to the same cache line. Each update would incur the cost of a cache miss, and so will reduce the speedup obtained by a factor of 3 times the cost of servicing a cache miss.
6.6.3 In this question, we are asked how to fix this problem. The easiest way to solve the false sharing problem is to compute the elements in C by traversing the matrix across columns instead of rows (i.e., using index-j instead of index-i). These elements will be mapped to different cache lines. Then we just need to make sure we process the matrix index that is computed (i, j) and (i + 1, j) on the same core. This will eliminate false sharing.

6.7
6.7.1
x = 2, y = 2, w = 1, z = 0
x = 2, y = 2, w = 3, z = 0
x = 2, y = 2, w = 5, z = 0
x = 2, y = 2, w = 1, z = 2
x = 2, y = 2, w = 3, z = 2
x = 2, y = 2, w = 5, z = 2
x = 2, y = 2, w = 1, z = 4
x = 2, y = 2, w = 3, z = 4
x = 3, y = 2, w = 5, z = 4
6.7.2 We could set synchronization instructions after each operation so that all cores see the same value on all nodes.

6.8
6.8.1 If every philosopher simultaneously picks up the left fork, then there will be no right fork to pick up. This will lead to starvation.
6.8.2 The basic solution is that whenever a philosopher wants to eat, she checks both forks. If they are free, then she eats. Otherwise, she waits until a neighbor contacts her. Whenever a philosopher finishes eating, she checks to see if her neighbors want to eat and are waiting. If so, then she releases the fork to one of them and lets them eat.
The difficulty is to first be able to obtain both forks without another philosopher interrupting the transition between checking and acquisition. We can implement this a number of ways, but a simple way is to accept requests for forks in a centralized queue, and give out forks based on the priority defined by being closest to the head of the queue. This provides both deadlock prevention and fairness.
6.8.3 There are a number of right answers here, but basically showing a case where the request of the head of the queue does not have the closest forks available, though there are forks available for other philosophers.
6.8.4 By periodically repeating the request, the request will move to the head of the queue. This only partially solves the problem unless you can guarantee that all philosophers eat for exactly the same amount of time, and can use this time to schedule the issuance of the repeated request.

6.9
A3B1, B4A1, A2B1, B4A1, A4B2A1B3A1A2A1A1B1B2B1A3A4B2B4
A1B1A1B1A1B2A2B3A3B4A4

6.10 This is an open-ended question.

6.11
6.11.1 The answer should include a MIPS program that includes 4 different processes that will compute ¼ of the sums. Assuming that memory latency is not an issue, the program should get linear speedup when run on the 4 processors (there is no communication necessary between threads). If memory is being considered in the answer, then the array blocking should consider preserving spatial locality so that false sharing is not created.
6.11.2 Since this program is highly data parallel and there are no data dependencies, an 8× speedup should be observed. In terms of instructions, the SIMD machine should have fewer instructions (though this will depend upon the SIMD extensions).

6.12 This is an open-ended question that could have many possible answers. The key is that the student learns about MISD and compares it to an SIMD machine.

6.13 This is an open-ended question that could have many answers.
The key is that the students learn about warps.

6.14 This is an open-ended programming assignment. The code should be tested for correctness.

6.15 This question will require the students to research on the Internet both the AMD Fusion architecture and the Intel QuickPath technology. The key is that students become aware of these technologies. The actual bandwidth and latency values should be available right off the company websites, and will change as the technology evolves.

6.16
6.16.1 For an n-cube of order N (2^N nodes), the interconnection network can sustain N - 1 broken links and still guarantee that there is a path to all nodes in the network.
6.16.2 The plot below shows the number of network links that can fail and still guarantee that the network is not disconnected.
[Plot: number of faulty links versus network order]

6.17
6.17.1 Major differences between these suites include:
Whetstone: designed for floating point performance specifically
PARSEC: these workloads are focused on multithreaded programs
6.17.2 Only the PARSEC benchmarks should be impacted by sharing and synchronization. This should not be a factor in Whetstone.

6.18
6.18.1 Any reasonable C program that performs the transformation should be accepted.
6.18.2 The storage space should be equal to (R + R) times the size of a single precision floating point number + (m + 1) times the size of the index, where R is the number of non-zero elements and m is the number of rows. We will assume each floating-point number is 4 bytes, and each index is a short unsigned integer that is 2 bytes.
For Matrix X this equals 111 bytes.
6.18.3 The answer should include results for both a brute-force and a computation using the Yale Sparse Matrix Format.
6.18.4 There are a number of more efficient formats, but their impact should be marginal for the small matrices used in this problem.

6.19
6.19.1 This question presents three different CPU models to consider when executing the following code:

    if (X[i][j] > Y[i][j])
        count++;

6.19.2 There are a number of acceptable answers here, but they should consider the capabilities of each CPU and also its frequency. What follows is one possible answer:
Since X and Y are FP numbers, we should utilize the vector processor (CPU C) to issue 2 loads, 8 matrix elements in parallel from A and 8 matrix elements from B, into a single vector register and then perform a vector subtract. We would then issue 2 vector stores to put the result in memory.
Since the vector processor does not have comparison instructions, we would have CPU A perform 2 parallel conditional jumps based on floating point registers. We would increment two counts based on the conditional compare. Finally, we could just add the two counts for the entire matrix. We would not need to use core B.
6.19.3 The point of the problem is to show that it is difficult to perform an operation on individual vector elements when utilizing a vector processor. What might be a nice instruction to add would be a vector comparison that would allow for us to compare two vectors and produce a scalar value of the number of elements where one vector was larger than the other. This would reduce the computation to a single instruction for the comparison of 8 FP number pairs, and then an integer computation for summing up all of these values.

6.20 This question looks at the amount of queuing that is occurring in the system given a maximum transaction processing rate, and the latency observed on average by a transaction.
The latency includes both the service time (which is computed by the maximum rate) and the queue time.
6.20.1 So for a max transaction processing rate of 5000/sec, and we have 4 cores contributing, we would see an average latency of 0.8 ms if there was no queuing taking place. Thus, each core must have 1.25 transactions either executing or in some amount of completion on average.
So the answers are:

    1 ms    5000/sec      1.25
    2 ms    5000/sec      2.5
    1 ms    10,000/sec    2.5
    2 ms    10,000/sec    5

6.20.2 We should be able to double the maximum transaction rate by doubling the number of cores.
6.20.3 The reason this does not happen is due to memory contention on the shared memory system.
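The numbers in 6.20.1 can be reproduced with a short calculation. The sketch below frames it through Little's law (average number in the system = throughput × average latency) and splits the total evenly across cores; that framing and the function names are assumptions for illustration, not text from the solutions manual.

```python
def service_time_ms(rate_per_sec, cores=4):
    """Latency per transaction if no queuing occurs: each core handles
    rate/cores transactions per second."""
    return 1000.0 * cores / rate_per_sec


def in_flight_per_core(latency_ms, rate_per_sec, cores=4):
    """Average transactions per core, executing or queued (Little's law)."""
    in_system = rate_per_sec * latency_ms / 1000.0  # whole-system average
    return in_system / cores


if __name__ == "__main__":
    print("service time at 5000/sec:", service_time_ms(5000), "ms")
    for latency_ms, rate in [(1, 5000), (2, 5000), (1, 10000), (2, 10000)]:
        print(latency_ms, "ms", rate, "/sec ->",
              in_flight_per_core(latency_ms, rate))
```

With a 1 ms observed latency against a 0.8 ms service time, the extra 0.2 ms is queue time, which is why the per-core average exceeds one transaction.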

A Comparison of Four Algorithms for Estimating 3-D Rigid Transformations

This work was funded by EC H.C.M. Project ERB40500PL921003 through the SMART network, and UK EPSRC Grant GR/H/86905.
using a unit quaternion. The use of dual quaternions to represent both transform components is the basis of the fourth technique of Walker, Shao and Volz [5]. The comparison presented here consists of three parts. First, the accuracy of each algorithm is examined as the coordinates of corresponding points are corrupted with increasing amounts of noise. Second, the stability is determined as original 3-D point sets degenerate into such forms as a plane, line and single point. Lastly, the relative efficiency, in terms of actual execution time, is reported for the above situations. Conclusions based on these results should make the choice of an appropriate algorithm for a given application simpler and more reliable.
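The accuracy experiment described above (corrupting corresponding points with increasing noise) can be set up along the following lines. This is only a sketch of the test harness, not of any of the four estimation algorithms: the Gaussian noise model, the function names, and the RMS error measure are all assumptions, since the excerpt does not specify them.

```python
import math
import random


def quat_rotate(q, p):
    """Rotate 3-D point p by the unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    # Standard rotation matrix built from a unit quaternion.
    r = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    return tuple(sum(r[i][j] * p[j] for j in range(3)) for i in range(3))


def apply_rigid(points, q, t):
    """Apply rotation q followed by translation t to every point."""
    return [tuple(c + d for c, d in zip(quat_rotate(q, p), t)) for p in points]


def add_noise(points, sigma, rng):
    """Corrupt each coordinate with zero-mean Gaussian noise (assumed model)."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in points]


def rms_error(a, b):
    """Root-mean-square distance between corresponding points."""
    total = sum(sum((u - v) ** 2 for u, v in zip(p, q)) for p, q in zip(a, b))
    return math.sqrt(total / len(a))


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
           for _ in range(100)]
    q = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))  # 90 deg about z
    moved = apply_rigid(pts, q, (1.0, 2.0, 3.0))
    for sigma in (0.0, 0.01, 0.1):
        print(sigma, rms_error(moved, add_noise(moved, sigma, rng)))
```

An estimator under test would receive `pts` and the noisy copy of `moved`, and its recovered rotation and translation would be compared against the known `q` and `t`.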

Replying to Reviewer Comments for English-Language Journals: Reference Template


Response to Reviewer 1 Comments

Point 1: Is the delay in ms estimated considering the same hardware and for all the algorithms involved in the comparison? This aspect is very important to estimate the complexity of the algorithm.

Response 1: Thank you for your comment. According to the method described in section 2.3.1, the total delay defined in this paper includes the VSF processing delay on the substrate node and the transmission delay on the substrate link. As for the substrate parameter configuration, the VSF processing delay on the substrate node is randomly distributed in [0.2, 0.5], which simulates the VSF processing delay on different hardware. Figure 5 shows the comparison of the average end-to-end delay using the proposed algorithm and the other three algorithms.

Point 2: Conclusions are too short. I suggest further summarizing results in the conclusion section.

Response 2: We are grateful for the suggestion. Those comments are all valuable and very helpful for revising and improving our paper. We have further summarized the results in the conclusion section (Lines 359-375, page 15).

Point 3: I suggest extending the references section.

Response 3: Thank you for your careful review. We have extended the references section.

Block Cyclic Reduction Solver for Tridiagonal Matrices


BCYCLIC: A parallel block tridiagonal matrix cyclic solver

S.P. Hirshman*, K.S. Perumalla, V.E. Lynch, R. Sanchez
Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, USA

Article history: Received 30 December 2009; received in revised form 27 April 2010; accepted 30 April 2010; available online 13 May 2010.

Keywords: Cyclic reduction; Block matrix; Dense blocks; Tridiagonal matrix; Thomas algorithm; Parallel computing

Abstract

A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor counts. Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.

© 2010 Elsevier Inc. All rights reserved.

1. Introduction

1.1. Motivation and scope

The solution of the matrix equation Ax = b, with possible multiple right-hand sides b, arises routinely in the solution of linear and nonlinear partial differential equations (PDEs). In three-dimensional systems where two coordinates are angular and one is radial, the Fourier-transformed PDEs result in A being block tridiagonal when the underlying equations involve at most second order derivatives. The block size M is the additive Fourier extents in the two angular directions, and the number of block rows N is
the number of discrete radial nodes.

Such matrix structure emerges naturally in numerical simulations of hot thermonuclear plasmas confined by a magnetic toroidal field, such as those in a tokamak or stellarator. The nonlinear fluid equations that describe the equilibrium, stability and evolution of these plasmas are often discretized spectrally in the two periodic directions, while finite differences are used in the radial direction. When the highest derivative in the radial direction is second order, the resulting linear sub-problems that emerge when solving the nonlinear equations via iterative Newton–Krylov methods exhibit a block tridiagonal structure.

Journal of Computational Physics 229 (2010) 6392–6404. 0021-9991/$ - see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.jcp.2010.04.049
This submission was sponsored by a contractor of the United States Government under contract DE-AC05-00OR22725 with the United States Department of Energy. The United States Government retains, and the publisher, by accepting this submission for publication, acknowledges that the United States Government retains, a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this submission, or allow others to do so, for United States Government purposes.
* Corresponding author. Tel.: +1 865 574 1289. E-mail address: hirshmansp@ (S.P. Hirshman).

Examples of physics codes which require efficient block solvers include the ideal magnetohydrodynamic (MHD) nonlinear solvers VMEC [1] and SIESTA [2], and the linear full-wave ion cyclotron radio frequency (ICRF) code TORIC [3]. All of these codes solve equations of the form Ax = b, with the block tridiagonal matrix A consisting of large, dense blocks. Usually multiple right sides b occur. Such a block solver is essential for the numerical efficiency of the SIESTA [2] code which computes high-resolution MHD equilibria in the presence of magnetic islands. Most of the computational time in SIESTA is spent
in the inversion of large block matrices (N > 100, M > 300) which are repeatedly applied as preconditioners to accelerate the convergence of the linearized MHD equations toward an equilibrium state. Eventual simulations for plasmas in the International Thermonuclear Experimental Reactor (ITER, with T ~ 15 keV, a ~ 2 m, B ~ 5 T) will require even larger spatial resolution resulting in greater block row numbers and sizes. Therefore the efficient parallel inversion and storage of the factors of A is essential in SIESTA and was the primary motivation for the present code development.

Fig. 1 shows a SIESTA equilibrium calculation for pressure contours (magnetic surfaces) in a tokamak plasma with an internal q = 1 surface near the center of the plasma. The block size for this problem was M = 273 and there were N = 101 block rows. For this small sized problem, the serial calculation can be done in under 5 min on a desktop computer using Compaq Visual Fortran. Using ScaLAPACK (see below) to factor and invert the blocks improves the performance for this problem by about a factor of 5 before communication bottlenecks saturate the performance gain. As we discuss in Section 4, the new [...]

Fig. 1. Example of a tokamak equilibrium plasma with a q = 1 tearing mode creating an m [...]

[...] with large (M ≫ 1), dense blocks. For example, the well-known BLKTRI code [10] uses cyclic reduction for the efficient solution of block tridiagonal matrices which arise from separable elliptic partial differential equations. It is however not well-suited for the present case of interest consisting of dense blocks. Other parallel block tridiagonal solvers [11] are also not optimized for these parameters.

ScaLAPACK [12,13] provides another method for efficiently solving dense block tridiagonal systems. Block factorization and solution based on ScaLAPACK are currently implemented in the SIESTA [2] MHD equilibrium code. This technique scales well with processor count only for very large matrix block sizes. For matrix blocks of
interest for small 3D MHD problems (M ~ 300), scalability was found to be limited to about 5–10 processors. For larger block sizes, scalability is expected to improve. To improve scalability for the current (small) block sizes, we have developed the BCYCLIC code. It uses a combination of cyclic reduction, for scalability in the block row dimension (N ≫ 1), together with GotoBLAS for scalability in the block size, to achieve overall good scalability. Enhancements to use ScaLAPACK instead of LAPACK, while retaining the use of threaded BLAS (in the layer under LAPACK), are being developed. This will relieve single-node memory constraints, allowing the blocks to be larger than the memory of a single node, and also permit idle processors to participate in later recursion levels of the cyclic reduction.

In addition to cyclic reduction, divide-and-conquer strategies have been used to parallelize the solution of tridiagonal equations [14]. As noted by Wang [15], this partition algorithm is unlikely to be more efficient than cyclic reduction unless P ≪ N (here, P is the number of MPI ranks, which would be nodes in the case of using OpenMP or GotoBLAS threads, but cores in the case of only MPI tasks). A parallel fast direct solver for block tridiagonal systems with separable matrices, based on divide-and-conquer, has been described in [16]. A parallel symmetric block tridiagonal divide-and-conquer algorithm was described in [17]. Divide-and-conquer has been used in conjunction with cyclic reduction by Lee [18] for the TORIC code [3].

The state-of-the-art sparse direct solver SuperLU [19] can be used to perform the necessary inversion for the block matrix A. In Appendix A, the processor scaling of SuperLU is compared to BCYCLIC for typical block size (M ~ 300) and block rows (N ~ 1000) encountered in present day 3D MHD simulations. The PETSc [20] library provides a suite of parallel routines for solving partial differential equations. However, our present focus is on developing a highly-optimized parallel implementation for dense
block-tridiagonal systems. Vendor-supplied libraries, such as CRAY Scientific Libraries [21] (CSL) and IBM Engineering and Scientific Subroutine Library [22] (ESSL), are commonly used for matrix inversion operations but are also not specifically tuned for the present problem.

1.3. New contributions

BCYCLIC uses cyclic reduction to obtain good processor scalability with respect to the number of block rows N. It is also optimized for speed in a parallel environment by using efficient non-blocking receives and sends to minimize inter-processor communication time. In this way, the overall computation time is not dominated by communication. In contrast to previous codes (and algorithms), BCYCLIC stores the necessary factored blocks of A with minimal fill-in so that solutions with multiple right sides can be efficiently computed, even when they are not known at factorization time. This is important, for example, when the matrix A is used repeatedly as a preconditioner as part of a nonlinear iterative scheme, as in the SIESTA [2] code. By using multithreading techniques to accelerate the computation of matrix–matrix products, parallelism is introduced to give additional scalability with respect to the block size M. This dual scalability, in both N and M, is unique to the present implementation. BCYCLIC can be used with arbitrary block rows N and MPI ranks P that are not restricted to powers of two as in the classical cyclic reduction algorithm.

1.4. Organization of paper

The rest of the paper is organized as follows. Our solution approach based on cyclic reduction is described in Section 2 with emphasis on efficient block storage and inter-processor communication optimization. Section 3 documents some additional implementation details, followed by Section 4 in which a performance study is presented for the solver executed on a parallel machine with multi-core nodes. The results are summarized in Section 5.

2. Our approach

In this section, we present our solution approach based on block cyclic reduction. First, we review
the well-known Thomas sequential method, to place the current approach in context. This is followed by our parallelization scheme for the block cyclic reduction algorithm. This scheme, when executed sequentially, is nearly as efficient as the Thomas algorithm in both computational FLOPS (floating point operations) and memory usage, but requires (approximately) three times as many matrix–matrix products. The computational cost makes it slower than Thomas by about a factor of 2–3 on a serial machine. However, we later show that the real gain is a significant reduction in run time on a parallel machine (by a factor of approximately $N/\log_2 N$), where it is assumed that P < N/2, N is the number of block rows, and P is the number of processors over which the matrix block rows are distributed. This estimate assumes the code is structured so that communication times between processors do not dominate the run time. We show that this can indeed be achieved and, in support of this observation, we present a detailed analysis of CPU time in a parallel execution environment.

2.1. Background: Thomas sequential recursion

The Thomas [4] algorithm is briefly reviewed for later comparison with the cyclic reduction method. The $i$th block row of the matrix A consists of the three M × M blocks denoted $L_i$, $D_i$, $U_i$, where $L_i$ is the lower block, $D_i$ is the diagonal block, and $U_i$ is the upper block. At the boundaries, $L_1 = 0$ and $U_N = 0$. The block structure of A for the $i$th row is:

$$L_i x_{i-1} + D_i x_i + U_i x_{i+1} = b_i. \tag{1}$$

The Thomas algorithm uses the boundary conditions at i = 1 and i = N to obtain a two-point recursion relation which can be iterated from one boundary to the other, followed by a "reverse sweep" to obtain the solution vector x. Generally, no block pivoting is done. First, $x_1$ is solved in terms of the unknown $x_2$, assuming $D_1$ is non-singular:

$$x_1 = D_1^{-1}\left(-U_1 x_2 + b_1\right). \tag{2}$$

Inserting this into the next block row equation, $x_2$ can be eliminated in terms of $x_3$. In general, the recurrence formula is:

$$x_i = \hat{D}_i\left(-U_i x_{i+1} + \hat{b}_i\right). \tag{3a}$$

Here, $\hat{D}_i$ is a matrix and
$\hat{b}_i$ is a vector. Inserting this into block row $i$ yields the recurrence relations:

$$\hat{D}_i = \left(D_i - L_i \hat{D}_{i-1} U_{i-1}\right)^{-1}, \qquad \hat{b}_i = b_i - L_i \hat{D}_{i-1} \hat{b}_{i-1}. \tag{3b}$$

Starting values for the recurrence relations in Eq. (3b) are $\hat{D}_0 = 0$ and $\hat{b}_0 = 0$. Note this equation is intrinsically serial, since the computation at level $i$ requires the previous level values. The recurrence continues until the other boundary at i = N is reached. There, $U_N = 0$ is used to initiate the back-solving process, via Eq. (3a):

$$x_N = \hat{D}_N \hat{b}_N. \tag{4}$$

Eq. (3a) is iterated backwards from i = N − 1 to i = 1 to complete the solution process. Once $\hat{D}_i$ has been determined (and stored) for a particular matrix A, Eq. (3b) for $\hat{b}_i$ can be iterated forward for multiple right sides and the back-substitution in Eq. (3a) performed to obtain the unknowns, without any further matrix inversions or matrix–matrix products. Also, there is no fill-in, since $D_i$ (and $U_i$) can be used to sequentially store $\hat{D}_i$ (and $\hat{D}_i U_i$). Note the Thomas algorithm requires one matrix inversion and two matrix–matrix multiplications per block row.

2.2. Cyclic reduction of block rows

Cyclic reduction [6,9] is a "period doubling" algorithm that can be applied recursively to reduce the number of coupled block rows in Eq. (1) to one or a very small number compared with N. This reduced block system can then be either solved trivially, with a single block matrix inversion, or efficiently using the Thomas algorithm. Each step of the method can be performed in a parallel fashion. At each period doubling "bifurcation", the number of coupled equations to be solved is reduced by a factor of two and the spacing (in the original block row index space) of remaining block rows is doubled: hence the terminology "period doubling". Assuming that there are P = N processors available, and inter-processor communication time is not a dominant factor, then the theoretical maximum speed-up factor due to cyclic reduction alone would be $R = N/\log_2 N$ on a parallel machine, compared to a single processor serial computation. The speed-up compared with the Thomas algorithm
is not quite as large as this, since there are more matrix–matrix calculations per cyclic reduction step compared with Thomas (so R in practice might only be in the range R/3 to R/2 compared to Thomas).

Following Ref. [9], note that cyclic reduction eliminates all the even-numbered block rows in Eq. (1) in terms of the odd-numbered rows. In Eq. (1), let i = 2k, for k = 1, …, N/2. For now, assume that $N = 2^p$ is even. (This limitation will be relaxed later.) There are p + 1 levels that correspond to effective decreasing system sizes (the number of remaining block rows to be processed) of N, N/2, N/4, …, $N/2^p = 1$. At the $l$th level (l = 0, …, p − 1), there are $N_l = 2^{p-l}$ rows, of which half (the even ones) can be eliminated using Eq. (1), which is rewritten below for the $k$th even row:

$$x_{2k} = \hat{b}_{2k} - \hat{L}_{2k} x_{2k-1} - \hat{U}_{2k} x_{2k+1}, \tag{5a}$$

$$\hat{b}_{2k} = D_{2k}^{-1} b_{2k}, \qquad \hat{L}_{2k} = D_{2k}^{-1} L_{2k}, \qquad \hat{U}_{2k} = D_{2k}^{-1} U_{2k}. \tag{5b}$$

In Eq. (5a), $k = (1, \ldots, N_l/2)$. The boundary condition $\hat{U}_{2N} = 0$ is implied by $U_{2N} = 0$. Several features of Eq. (5) are noteworthy:

- only the inverses $D_{2k}^{-1}$ for the even block row diagonal elements are required;
- the $D_{2k}^{-1}$ can be computed in parallel since at any level $l$ of the cyclic reduction, all the $D_{2k}$ are known. Furthermore, there is no inter-processor communication required at this computation stage;
- the two matrix–matrix products needed to compute $\hat{L}_{2k}$ and $\hat{U}_{2k}$ can also be done in parallel, once the inverse blocks are computed;
- values of $D_{2k}^{-1}$ (actually, its LU decomposition factors), $\hat{L}_{2k}$, and $\hat{U}_{2k}$ can overwrite the values of $D_{2k}$, $L_{2k}$, $U_{2k}$, respectively (since they are no longer needed for the back-substitution step) so that no fill-in occurs at this step;
- once the odd values of x are known (by a backwards recursion, which will be described below), then the even values are computed by efficient matrix–vector multiplications (no further expensive matrix inversions or matrix–matrix multiplications are required).

S.P. Hirshman et al. / Journal of Computational Physics 229 (2010) 6392–6404

The number of computed inverses at the first reduction level (l = 0) is only half of the total
Thomas inverses. When summed over all levels the number is the same as for the Thomas scheme. Likewise, the total number of matrix–matrix products for this step equals the number of Thomas products. Thus, in both time and memory utilization, the even part of the cyclic reduction equals the Thomas algorithm.

The next step in the cyclic reduction of Eq. (1) is to obtain an equation for the odd-indexed components of x, by letting i = 2k − 1, for k = 1, …, $N_l/2$, in Eq. (1) and using Eq. (5a) to eliminate the even components. The result is:

$$\hat{L}_{2k-1} x_{2k-3} + \hat{D}_{2k-1} x_{2k-1} + \hat{U}_{2k-1} x_{2k+1} = \hat{b}_{2k-1}. \tag{6}$$

Here, the reduced matrix blocks (for odd indices now) are

$$\hat{D}_{2k-1} = D_{2k-1} - L_{2k-1}\hat{U}_{2k-2} - U_{2k-1}\hat{L}_{2k}, \qquad \hat{L}_{2k-1} = -L_{2k-1}\hat{L}_{2k-2}, \qquad \hat{U}_{2k-1} = -U_{2k-1}\hat{U}_{2k}, \tag{7a}$$

and the reduced sources (for odd indices) are

$$\hat{b}_{2k-1} = b_{2k-1} - L_{2k-1}\hat{b}_{2k-2} - U_{2k-1}\hat{b}_{2k}. \tag{7b}$$

The boundary condition is $\hat{L}_1 = 0$ (with $\hat{U}_{2N-1} = 0$ a consequence of $\hat{U}_{2N} = 0$, as stated above). Note that no matrix inverses are required. Rather, four matrix–matrix products are needed to compute the odd-indexed reduced (hatted) matrix blocks. Inter-processor communication will now be required, since the calculation of the (2k − 1) reduced matrices requires the U and L blocks from both the adjacent 2k − 2 and 2k even rows. Communication can be minimized, if there are fewer than N processors, by storing even and odd block rows contiguously in memory on a given processor. Then only the "boundary" matrices from block rows on adjacent processors need to be communicated. While the effective odd diagonal elements can write over the original ones, we see from Eqs. (5a) and (7b) that both the original and reduced (hatted) U and L blocks must be retained for back-solving with multiple right sides. (The exception to this is when all the right sides are known at factorization time, which may not be the case when iterative matrix inversion is used as a preconditioner.) So this reduction produces fill-in of one extra block per level (since only the odd values of the original blocks U and L must be stored), and an overall (summing over all levels) increase in
storage of two blocks per row. Thus, the cyclic reduction increases the memory requirements compared to the Thomas algorithm by 66% (five blocks per row instead of three), so the fill-in is still deemed to be manageable.

There are now four additional matrix–matrix products to evaluate in Eq. (7a). Added to the two required for the even reduction equations, this is a total of six compared with the Thomas algorithm's two. However, as noted previously, the number of matrix inversions is the same as for Thomas, so the overall numerical efficiency for each step of the cyclic reduction is greater than 1/3 that of Thomas. The parallelism of the cyclic algorithm compensates for this performance reduction.

Eq. (6) is self-similar to the original equation (1) but in the "period-doubled" index space k of odd indices. Setting $x_{2k-1} \equiv y_k$, for k = 1, …, $N_l/2$, with $\hat{D}_{2k-1} \equiv \tilde{D}_k$, etc., Eq. (6) becomes:

$$\tilde{L}_k y_{k-1} + \tilde{D}_k y_k + \tilde{U}_k y_{k+1} = \tilde{b}_k. \tag{8}$$

This self-similarity, in the period-doubled (odd) index space, allows one to apply recursively the same reduction technique to each successive level (cycle) of $N_{l+1} = N_l/2$ equations. In this way, one iterates through a period doubling sequence in which the number of equations is halved at each level: N, N/2, N/4, …, 1. If at each level the communication time is dominated by the inversion and matrix–matrix multiplication times (implying an efficient use of parallel resources), then the time for the cyclic reduction will be p (in units of the time to compute a single inverse and matrix–matrix multiply step). This is compared with a time of N for only one processor. Thus, the theoretical parallel performance gain (ignoring communication time) is $R = N/p = N/\log_2 N$, which is the scalability enhancement factor with respect to the number of block rows that can be expected from this algorithm. If the number of processors P is less than, or equal to, N, the performance gain will be reduced by approximately P/N, so $R = P/\log_2 N$. A more complete analysis of the parallel performance time is presented in Section 4.

The final level of the reduction process (l = p) results in
a single diagonal block equation (L = U = 0) which can be trivially solved for the only remaining odd-indexed vector $x_1$. The solution $x_1 = \hat{D}_1^{-1}\hat{b}_1$ is used to initiate the back-substitution step. Back-substitution at any level l ≤ p proceeds as follows. The even-indexed values are determined by substituting the known odd values into Eq. (5a), which requires two additional matrix–vector multiplications per even index. Note from Eqs. (6) and (8) that all indices at a higher level of the cyclic reduction originate from only odd indices at the (previous) level, since even-indexed rows do not propagate to the next level. Conversely, the even and odd indices at level l uniquely populate all of the odd indices at the previous level l − 1. Therefore, once the solution at level l is known, Eq. (5a) can be used to fill in all the remaining even indices of level l − 1.

2.3. Parallel complexity of block cyclic reduction

In this performance analysis, both N (number of block rows) and P (number of processors) are assumed to be powers of two. Specifically, $N = 2^p$ and $P = 2^r$, for integers p and r. Execution time for values of N that are not powers of two can be bounded on the low (and high) side by using the nearest power of two that is less than (and greater than) N. A similar approach can be used for P that is not a power of two.

Recall that the algorithm first performs a series of forward "reduction" steps for each recursive bisection ("period doubling") until the sub-problem size has only one row block, and then unrolls the recursion via a corresponding series of backward "solve" steps. There are two zones within each of the reduction and solve phases, as shown schematically in Fig. 2. These zones correspond to N′ > P and N′ ≤ P, where N′ is the number of remaining block rows at a particular level l of the cyclic reduction. It is assumed that N > P initially (p > r), so the algorithm begins in the first execution
zone, where every processor contains two or more block rows to be processed. In general, in the first execution zone, each processor starts with $N/P = 2^{p-r}$ block rows and performs recursive bisections until each processor has exactly two row blocks.

[Fig. 2: (a) Schematic of the parallel work done over time; the schematic is not to scale, to show detail. (b) Actual (to-scale) parallel work … 1024.]

… There is no performance gain for P > N/2. Any additional processors should be used to improve the factorization and matrix–matrix product calculations.

3. Additional implementation details

In this section we generalize the previous discussion to show how the cyclic reduction algorithm can be extended to the case when the number of processors P and/or the number of block rows N are not powers of two (it is still assumed that P ≤ N).

3.1. Distribution of block rows to processors

The distribution of block rows to processors is similar to the distribution for exact powers of two, with the simple modification that the first few processors each get an extra row block mapped to them. Thus, the block rows are distributed to the processors according to the following rule (here, p refers to a specific processor):

for p = 1 to p = mod(N, P): distribute [N/P] + 1 block rows numbered consecutively;
for p = 1 + mod(N, P) to p = P: distribute [N/P] block rows numbered consecutively.

Here, [x] = floor(x) denotes the largest integer less than, or equal to, x.

3.2. Propagation of odd row indices between levels

According to the cyclic reduction at any level l, the current even-numbered block rows are stored and no longer propagate to the next level l + 1. The odd rows propagate to the next level and map into the new consecutive block row numbering by the rule:

at level l, odd row n → row (n + 1)/2 at level l + 1.

If $N_l$ is the number of block rows at level l, then $N_{l+1} = [(N_l + 1)/2]$. The inverse of this mapping is given by:

at level l + 1, row n′ → row 2n′ − 1 at level l.

Note that all indices at level l + 1 map to only the odd indices at the previous level l. That is, in
performing the back-substitution step at level l to compute the even-index solutions according to Eq. (5a), one first maps the solution from level l + 1 to obtain all the odd values of x at the previous level l. Then Eq. (5a) can be solved for the even-indexed x's to complete the back-solve at that level. This back-solution is iterated until the complete solution at level l = 0 is obtained.

[Fig. 3: … reduction time to serial Thomas time vs. number of processors, for small and moderate communication (M ≫ 1), based on Eqs. (9) and (10).]

4. Performance study

4.1. Software and hardware platforms

To prepare for future multi-core computers with many cores per node, we have chosen a machine that contains a large number of cores per node (16 cores per node, 80 nodes). We implemented BCYCLIC using Fortran 90 and optimized it on the SMOKY [23] computer, which is a machine hosted by the National Center for Computational Sciences (NCCS) as a developmental computational resource at Oak Ridge National Laboratory (ORNL). SMOKY's current configuration is an 80-node Linux cluster consisting of four quad-core 2.0 GHz AMD Opteron processors per node, 32 GB of memory (2 GB per core), a gigabit Ethernet network with Infiniband interconnect, and access to Spider, the center-wide LUSTRE-based file system.

Two levels of parallelism are used for the solver: (1) cyclic reduction is computed in parallel over the N block rows using separate MPI tasks; (2) GotoBLAS [24] (or OpenMP [25]) threads are used inside nodes (16 cores) to achieve parallelism in the factorization of the large (M × M) dense blocks. The first level of parallelism has been described in the previous sections.
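The bookkeeping behind this first level of parallelism, namely the distribution rule of Section 3.1 and the level-to-level row mapping of Section 3.2, can be sketched in Python. This is an illustration under stated assumptions rather than the BCYCLIC Fortran code: the function names are hypothetical, and processors and block rows are numbered from 1 as in the text.

```python
def distribute_block_rows(n_rows, n_procs):
    """Map each processor (1-based) to its consecutive block rows (1-based).

    The first mod(N, P) processors receive floor(N/P) + 1 rows and the
    remaining processors receive floor(N/P) rows, as in Section 3.1.
    """
    base, extra = divmod(n_rows, n_procs)
    owned = {}
    next_row = 1
    for p in range(1, n_procs + 1):
        count = base + 1 if p <= extra else base
        owned[p] = list(range(next_row, next_row + count))
        next_row += count
    return owned


def rows_at_next_level(n_l):
    """Number of block rows surviving to level l + 1: floor((N_l + 1) / 2)."""
    return (n_l + 1) // 2


def to_next_level(n):
    """Odd row n at level l becomes row (n + 1) / 2 at level l + 1."""
    if n % 2 == 0:
        raise ValueError("even rows are eliminated and do not propagate")
    return (n + 1) // 2


def to_previous_level(n_prime):
    """Row n' at level l + 1 corresponds to odd row 2 n' - 1 at level l."""
    return 2 * n_prime - 1


def level_sizes(n_rows):
    """System sizes N_0, N_1, ... until a single block row remains."""
    sizes = [n_rows]
    while sizes[-1] > 1:
        sizes.append(rows_at_next_level(sizes[-1]))
    return sizes
```

For example, `distribute_block_rows(10, 4)` gives the first two processors three rows each and the last two processors two rows each, and `level_sizes(10)` shows how a row count that is not a power of two still shrinks to a single row.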
The second level of parallelism involves the efficient calculation of matrix–matrix products and inverses. Note the four matrix–matrix products in Eq. (7a) are performed by the BLAS routine DGEMM. This is the single most CPU-intensive routine, consuming over 50% of the time at each cyclic reduction level. Optimization can be achieved by doing this part of the calculation in parallel when separate threads are available. (The remaining time is spent in the matrix inversion and the two matrix–matrix multiplications in Eq. (5b).)

4.2. Threaded GotoBLAS

On SMOKY, batch jobs can be submitted to use multiple threads. Thread usage is controlled by modifying the processors per node (ppn) count. For example, 1024 MPI tasks on 16 cores per node would be set by the command "nodes=64:ppn=16". Alternatively, for 64 MPI tasks, with 1 MPI task per node, the required command would be "nodes=64:ppn=1". In this latter case, sixteen threads can be assigned on each node. For OpenMP parallelism, "export OMP_NUM_THREADS=16" would set 16 OpenMP threads per node. For GotoBLAS threads, "export GOTO_NUM_THREADS=16" would set 16 GotoBLAS threads per node. OpenMPI is the library used for MPI tasks on SMOKY. The MPI communication in BCYCLIC is done by non-blocking sends …

[Figure: Speed-up factor on SMOKY [23] vs. total number of processors for block size M = 1000 and block …]
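The sequential Thomas recursion of Section 2.1 and the odd/even elimination of Section 2.2 can be compared on a small example. The sketch below is an illustration only, not the BCYCLIC implementation. It assumes scalar blocks (M = 1), so each block inverse becomes a reciprocal; it runs serially; it uses one self-consistent form of the Thomas recurrences; and the function names `thomas` and `cyclic_reduction` are hypothetical. Lists are 0-based, so the paper's even-numbered rows sit at odd list positions.

```python
def thomas(L, D, U, b):
    """Serial Thomas sweep for L[i] x[i-1] + D[i] x[i] + U[i] x[i+1] = b[i]."""
    n = len(D)
    d_hat = [0.0] * n
    b_hat = [0.0] * n
    for i in range(n):
        if i == 0:
            d_hat[i] = 1.0 / D[i]
            b_hat[i] = b[i]
        else:
            d_hat[i] = 1.0 / (D[i] - L[i] * d_hat[i - 1] * U[i - 1])
            b_hat[i] = b[i] - L[i] * d_hat[i - 1] * b_hat[i - 1]
    x = [0.0] * n
    x[n - 1] = d_hat[n - 1] * b_hat[n - 1]
    for i in range(n - 2, -1, -1):  # reverse sweep
        x[i] = d_hat[i] * (b_hat[i] - U[i] * x[i + 1])
    return x


def cyclic_reduction(L, D, U, b):
    """Recursive odd/even (cyclic) reduction; works for any n >= 1."""
    n = len(D)
    if n == 1:
        return [b[0] / D[0]]
    # Scale the rows to be eliminated (paper's even rows) by 1 / D.
    l_hat = {j: L[j] / D[j] for j in range(1, n, 2)}
    u_hat = {j: U[j] / D[j] for j in range(1, n, 2)}
    b_hat = {j: b[j] / D[j] for j in range(1, n, 2)}
    # Build the reduced system on the surviving rows (paper's odd rows).
    rL, rD, rU, rb = [], [], [], []
    for j in range(0, n, 2):
        d, lo, up, rhs = D[j], 0.0, 0.0, b[j]
        if j - 1 >= 0:
            d -= L[j] * u_hat[j - 1]
            lo = -L[j] * l_hat[j - 1]
            rhs -= L[j] * b_hat[j - 1]
        if j + 1 < n:
            d -= U[j] * l_hat[j + 1]
            up = -U[j] * u_hat[j + 1]
            rhs -= U[j] * b_hat[j + 1]
        rL.append(lo)
        rD.append(d)
        rU.append(up)
        rb.append(rhs)
    x = [0.0] * n
    x[0::2] = cyclic_reduction(rL, rD, rU, rb)
    # Back-substitution: recover the eliminated rows from their neighbours.
    for j in range(1, n, 2):
        right = x[j + 1] if j + 1 < n else 0.0
        x[j] = b_hat[j] - l_hat[j] * x[j - 1] - u_hat[j] * right
    return x
```

Both routines should return the same solution for a well-conditioned tridiagonal system; the cyclic variant does more arithmetic per row, but the work within each level is independent across rows, which is what the parallel code exploits.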

An Iterative Non-Rigid 3D Projective Reconstruction Method Based on the Minimal Eigenvalue

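An Iterative Non-Rigid 3D Projective Reconstruction Method Based on the Minimal Eigenvalue

Authors: Qiu Guoyong (裘国永); Liu Jingna (刘静娜); Liu Zhonghua (刘中华); Peng Yali (彭亚丽); Liu Shigang (刘侍刚)

Abstract: To obtain a 3D non-rigid projective reconstruction from an image sequence, an iterative projective reconstruction method for 3D non-rigid objects based on the minimal eigenvalue is presented. Using the property that all image points and depth factors together form a low-rank image matrix, the method replaces the projection solution with the solution of eigenvalues and eigenvectors. The depth factors are then obtained by iteration, and the 3D non-rigid projective reconstruction is realized. The method is guaranteed to converge to the global optimal solution. Experiments with both simulated and real data show that the proposed method converges quickly and has small error.

Journal: Acta Electronica Sinica (电子学报)
Year (volume), issue: 2017, 45(5)
Pages: 7 (pp. 1211–1217)
Keywords: non-rigid; projective reconstruction; eigenvalue
Affiliations: Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi'an, Shaanxi 710062; Engineering Laboratory of Teaching Information Technology of Shaanxi Province, Xi'an, Shaanxi 710119; School of Electronic and Information Engineering, Henan University of Science and Technology, Luoyang, Henan 471023
Language of the original: Chinese
Chinese Library Classification: TP391.41

Reconstructing the 3D structure of an object from an existing image sequence is one of the active research problems in computer vision [1,2]. Early work on 3D reconstruction addressed objects undergoing rigid motion [3]. However, most objects in the real world move non-rigidly. Non-rigid motion is more general and more varied than rigid motion, and its reconstruction is correspondingly harder [4,5]. To reconstruct 3D non-rigid objects, Bregler et al. first proposed that a 3D non-rigid shape can be written as a linear weighted combination of several shape bases [6] and reconstructed the structure of a 3D non-rigid object, but this was later shown to be an ill-posed, underdetermined problem [7]. Many later 3D non-rigid reconstruction methods build on Bregler's assumption [8,9]. Among them, Fragkiadaki used rank constraints and an iterative method to reconstruct non-rigid 3D structure and motion [10], and Bue et al. reconstructed 3D face structure and motion using SVD [11]. These methods, however, apply only to the orthographic projection model. Under the pinhole model, only a projective reconstruction of the object can be obtained from an image sequence [12,13]. To achieve projective reconstruction of non-rigid objects, some researchers treat the non-rigid part as outliers [14] and apply a rigid reconstruction method, but this requires the object to move approximately rigidly. In many cases the motion cannot be approximated as rigid, so the method is often not applicable.

To better match practical conditions, this paper proposes, under the pinhole camera model, an iterative non-rigid 3D projective reconstruction method based on the minimal eigenvalue. The method uses the property that all image points and depth factors form a low-rank image matrix, converts the projection problem into the computation of matrix eigenvalues and eigenvectors, and solves for the depth factors iteratively to achieve the 3D projective reconstruction of a non-rigid object. The method also guarantees, in theory, convergence to the global optimal solution.

Assume the camera follows the pinhole model, so that its imaging process can be expressed as […] Suppose there are F images and N 3D space points. For the i-th image, Eq. (1) gives […] When the object moves non-rigidly, $Z_i$ can be regarded as a linear combination of L shape bases [6], that is, […] Substituting Eq. (3) into Eq. (2) gives […] Expanding and rearranging this expression gives […] Eq. (6) shows that $M_{3F \times N}$ is a low-rank matrix of rank 3L + 1. It also shows that, for any non-singular matrix Π, […] Eq. (6) further shows that $M_{3F \times N}$ contains the unknown depth factors $\gamma_{i,j}$. If the depth factors $\gamma_{i,j}$ are known, SVD gives […] Since the rank of $M_{3F \times N}$ is 3L + 1, we have […] The projective reconstruction can therefore be taken as […] Hence, if the depth factors $\gamma_{i,j}$ are known, the projective reconstruction of the non-rigid object is easily obtained by SVD, and the projective reconstruction problem reduces to solving for the depth factors $\gamma_{i,j}$. How to solve for them is discussed next.

The projection matrix that projects any column vector onto the orthogonal complement of the subspace spanned by the column vectors is [15]: […] The projection of any column $c_j$ of $M_{3F \times N}$ onto the orthogonal complement of the column space is […] In practice the images contain noise, so the above expression can be solved by minimizing the residual, that is, […] To avoid the trivial solution, Eq. (17) is modified to […] Solving Eq. (18) can be converted into finding the eigenvector corresponding to the minimal eigenvalue of $H_{F \times F}$.

In Section 3.1, the depth factors $\gamma_j$ were solved using the column vector that each space point forms in the image matrix. Similarly, the depth factors $\gamma_i = (\gamma_{i,1}, \gamma_{i,2}, \ldots, \gamma_{i,N})^T$ are solved using the three row vectors that each image forms. The projection matrix that projects any row vector onto the orthogonal complement of the subspace spanned by the row vectors is [15]: […] Proof: In the matrix $M_{3F \times N}$, the three consecutive rows belonging to the same image share the same $\gamma_{i,j}$. Therefore, taking three consecutive rows $r_{3i-2}$, $r_{3i-1}$, and $r_{3i}$ (i = 1, 2, …, F) and expanding them gives […] Likewise, to avoid the trivial solution, the above expression can be written as […]

In solving for the depth factors $\gamma_j$ and $\gamma_i$ above, the two factor matrices obtained from Eq. (9) were assumed to be known in advance. They are obtained from $M_{3F \times N}$, however, and $M_{3F \times N}$ itself contains the depth factors $\gamma_{i,j}$. An iterative algorithm can therefore be constructed to solve for the depth factors $\gamma_{i,j}$. The flow of the algorithm is shown in Fig. 1, and the detailed steps are summarized as follows:

Step 1. Set all image depth factors $\gamma_{i,j} = 1$, set k = 1, and let ε be an arbitrarily small positive number.
Step 2. Perform singular value decomposition on $M_{3F \times N}$ and use Eq. (9) to obtain the two factor matrices.
Step 3. Construct the projection matrix using Eq. (11).
Step 4. Compute the matrix $H_{F \times F}$ using Eq. (17).
Step 5. Compute the minimal eigenvalue of $H_{F \times F}$ and its eigenvector, which give the minimal residual and the corresponding depth factors $\gamma_j$.
Step 6. Construct the projection matrix using Eq. (19).
Step 7. Compute the matrix $F_{N \times N}$ using Eq. (22).
Step 8. Compute the minimal eigenvalue of $F_{N \times N}$ and its eigenvector, which give the minimal residual and the corresponding depth factors $\gamma_i$.
Step 9. If […], go to Step 10; otherwise set k = k + 1 and go to Step 2.
Step 10. Use Eq. (10) to obtain the projective reconstruction of the non-rigid object.

In this algorithm, the cost of each iteration comes mainly from three parts: (1) the singular value decomposition of $M_{3F \times N}$ in Step 2, with complexity min(O(F²N), O(FN²)); (2) the minimal eigenvalue and eigenvector of $H_{F \times F}$ in Step 5, with complexity O(F³); (3) the minimal eigenvalue and eigenvector of $F_{N \times N}$ in Step 8, with complexity O(N³). The total complexity of these three parts is max(O(F³), O(N³)), which is therefore the complexity of one iteration of the algorithm.

To verify the performance of the method and the influence of various parameters on it, non-rigid motions with different numbers of shape bases are first generated, then different numbers of images are generated, and zero-mean Gaussian noise of varying variance is added to the images. From the simulated non-rigid image sequences, the non-rigid projective reconstruction is computed with the proposed method, and the performance of the algorithm is measured by the value of v(3L+2) and the size of the minimal residual er.

Experiment 1. To test the convergence of the method, the number of shape bases L is set to 6 and 8, and a non-rigid image sequence with 60 space points and 60 images is simulated. Zero-mean Gaussian noise with variance 0, 0.5, 1.0, 1.5, and 2.0 pixels is added to the images. The results are shown in Figs. 2 and 3. Both v(3L+2) and er decrease gradually as the number of iterations increases, and convergence is generally reached within 20 iterations, which indicates good convergence behavior. Figs. 2 and 3 also show that the smaller the noise, the better the convergence. Comparing Fig. 2(a) with Fig. 2(b) and Fig. 3(a) with Fig. 3(b) shows that the larger the number of shape bases L, the slower the convergence. The reason is that a larger L means a larger rank and more unknowns to solve for, while the number of equations is fixed.

Experiment 2. To test the influence of the number of shape bases L, 60 space points and 60 images are simulated, zero-mean Gaussian noise with variance 0, 0.5, 1.0, 1.5, and 2.0 pixels is added to the images, and L is varied from 1 to 12. Each case is run 50 times and the results are averaged. The results are shown in Figs. 4 and 5. As L increases, both v(3L+2) and er decrease, and the larger the image noise, the larger both values are. The reasons are: (1) for v(3L+2), the diagonal entries of $V_{3F \times N}$ in the SVD are sorted in descending order and the energy is concentrated in the leading singular values, so the larger L is, the further back v(3L+2) lies and the smaller its value; (2) for er, a larger L means more unknowns to solve for while the number of equations is fixed, so er is smaller.

Experiment 3. To test the influence of the number of space points, the number of images is kept at 60 while the number of space points is varied from 40 to 240. Zero-mean Gaussian noise with a variance of 1.0 pixel is added to the images, and the number of shape bases is set to 4, 6, 8, and 10. Each case is run 50 times and the results are averaged. The results are shown in Figs. 6 and 7. Fig. 6 shows that v(3L+2) increases with the number of space points, because more points give a larger matrix with relatively more energy. Fig. 7 shows that er also increases with the number of space points, because more points impose more constraints and the residual of the solution grows.

Experiment 4. To test the influence of the number of images, the setup is the same as in Experiment 3 except that the number of space points is kept at 60 while the number of images is varied from 20 to 200. The results are shown in Figs. 8 and 9. Both v(3L+2) and er increase with the number of images, for the same reason as in Experiment 3.

Experiment 5. To test and compare the noise robustness of the method, 60 space points and 60 images are simulated with 4, 6, and 8 shape bases, and zero-mean Gaussian noise with variance from 0 to 2 pixels is added to the image sequence. Projective reconstruction is performed with both the proposed method and the method of Dai [13]; each case is run 50 times and the results are averaged. The results are shown in Figs. 10 and 11. Fig. 10 shows that, with the proposed method, v(3L+2) grows approximately linearly with the image noise, which indicates good robustness. Dai's method considers only the column-vector constraints, so the proposed method achieves higher reconstruction accuracy. Fig. 11 shows that er increases with the image noise. Both v(3L+2) and er are smaller for the proposed method than for Dai's method, which indicates better noise robustness. Figs. 10 and 11 also show that the larger the number of bases, the smaller v(3L+2) and er, consistent with Experiment 2.

To verify the algorithm on real data, a dinosaur image sequence of 230 frames was used, with an image size of 570 × 338; two of the frames are shown in Fig. 12, which shows that the dinosaur moves non-rigidly. In this sequence, 49 feature points were extracted and tracked manually (marked * in the figure), the number of shape bases L was set to 10, and the feature points were projectively reconstructed with the proposed method. To measure the reconstruction accuracy, the reconstructed points were reprojected; the reprojected points are marked ○ in Fig. 12. The reprojected points essentially coincide with the original feature points. The mean distance from the reprojected points to the original feature points, that is, the mean reprojection error, is 0.6214 pixels, which indicates high reconstruction accuracy.

This paper has proposed an iterative non-rigid 3D projective reconstruction method based on the minimal eigenvalue. The method uses the property that all image points and depth factors form a low-rank image matrix, converts the projection problem into the computation of matrix eigenvalues and eigenvectors, solves for the depth factors iteratively, and finally achieves the projective reconstruction of the non-rigid object. The method guarantees convergence to the global optimal solution. Results on simulated and real data show that the method converges quickly and has small error. To further improve the accuracy of 3D projective reconstruction, future work will draw on metric learning theory [16~18], building a library of non-rigid 3D models through metric learning and using that library for non-rigid projective reconstruction.

Liu Jingna (刘静娜), female, born in February 1992 in Zhoukou, Henan. She received a bachelor's degree from Henan Normal University in 2014 and is now a master's student at Shaanxi Normal University. Her research covers image sequence analysis and camera calibration.

Liu Zhonghua (刘中华, corresponding author), male, born in March 1975 in Zhengzhou, Henan. He received a bachelor's degree from the First Aeronautical Institute of the Air Force in 1998, a master's degree from Xihua University in 2005, and a doctorate from Nanjing University of Science and Technology in 2011. He is now an associate professor at Henan University of Science and Technology. His research covers computer vision, pattern recognition, and image processing.

Liu Shigang (刘侍刚), male, born in November 1973 in Xiajiang, Jiangxi. He received bachelor's and master's degrees from Harbin Engineering University in 1997 and 2001, respectively, and a doctorate from Xidian University in 2005. He is now an associate professor at Shaanxi Normal University. His research covers computer vision and 3D reconstruction.

Qiu Guoyong (裘国永), male, born in June 1964 in Shaoxing, Zhejiang. He received a doctorate in computational mathematics from Zhejiang University in 1999. He is now an associate professor at Shaanxi Normal University. His research covers 3D reconstruction and camera calibration.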
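Steps 5 and 8 of the algorithm above each require the eigenvector belonging to the smallest eigenvalue of a symmetric matrix. The sketch below shows one simple way to obtain it and is not taken from the paper: it applies power iteration to the shifted matrix cI − A, where c is a Gershgorin upper bound on the eigenvalues of A, so that the smallest eigenvalue of A becomes the dominant eigenvalue of the shifted matrix. The function name `smallest_eigenpair` and the example matrix are hypothetical, and a production code would normally call a library eigensolver instead.

```python
import math


def smallest_eigenpair(a, iterations=5000, tol=1e-13):
    """Smallest eigenvalue and unit eigenvector of a symmetric matrix `a`.

    `a` is a list of rows. Power iteration is run on B = c*I - A, whose
    eigenvalues c - lambda are all non-negative when c bounds the spectrum.
    """
    n = len(a)
    shift = max(sum(abs(v) for v in row) for row in a)  # Gershgorin bound
    v = [float(i + 1) for i in range(n)]  # generic, non-symmetric start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [shift * v[i] - sum(a[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        w = [x / norm for x in w]
        change = max(abs(w[i] - v[i]) for i in range(n))
        v = w
        if change < tol:
            break
    av = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * av[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, v


# Hypothetical 3 x 3 symmetric example.
example = [[2.0, -1.0, 0.0],
           [-1.0, 2.0, -1.0],
           [0.0, -1.0, 2.0]]
lam, vec = smallest_eigenpair(example)
```

The convergence rate depends on the gap between the two smallest eigenvalues, so this approach suits small, well-separated problems; the O(F³) and O(N³) costs quoted above correspond to a full eigendecomposition.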

3D Source Detection

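$$\mathrm{MH}(x, y, t; \sigma_x, \sigma_y, \sigma_t) = \left[3 - \left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} + \frac{t^2}{\sigma_t^2}\right)\right] \exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} + \frac{t^2}{\sigma_t^2}\right)\right], \tag{2}$$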
where $\sigma_x$, $\sigma_y$, $\sigma_t$ are scale factors. The MH function has some significant advantages over the square-cell filters which are in general use: it is centrally peaked and is surrounded by a negative shell such that the integral of the function over all space is zero, so it serves as a natural background subtractor; the scale factors allow it to act as a filter, enhancing structures in the data that match these sizes; both the MH function and its Fourier Transform (FT) are limited in extent, which allows us to store a smaller fraction of the arrays (see below); and finally, it is analytically manipulable, which allows us to reduce computation times significantly. In our use of the MH function as a filter to detect sources, we follow Freeman et al. (1995) and Damiani et al. (1995). Because direct convolution is computationally expensive, we carry it out as multiplication in Fourier space. As pointed out above, it may not be possible for the entire data array to be stored in memory, so we compute the forward transform as follows:
1. From the data $I(x, y, t)$, generate a 2D sub-image at fixed $t$, and compute the FFT of this image along the x- and y-axes to obtain $F$ for that sub-image.
2. Store a subset of $F$ in a 2D "image".
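The zero-integral property that makes the MH function a natural background subtractor can be checked numerically. The sketch below is a minimal illustration, assuming the form of Eq. (2) with a prefactor of 3 minus the scaled squared radius; the function names, grid extent, and step size are choices made for this example only.

```python
import math


def mexican_hat_3d(x, y, t, sx, sy, st):
    """3D Mexican Hat value at (x, y, t) for scale factors (sx, sy, st)."""
    r2 = (x / sx) ** 2 + (y / sy) ** 2 + (t / st) ** 2
    return (3.0 - r2) * math.exp(-0.5 * r2)


def grid_integral(sx, sy, st, half_width=8.0, step=0.5):
    """Riemann-sum estimate of the integral of the MH function.

    The grid spans +/- half_width scale lengths along each axis, with a
    spacing of `step` scale lengths.
    """
    n = int(round(half_width / step))
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                total += mexican_hat_3d(i * step * sx, j * step * sy,
                                        k * step * st, sx, sy, st)
    cell_volume = (step * sx) * (step * sy) * (step * st)
    return total * cell_volume
```

The function is positive inside the ellipsoid where the scaled squared radius is below 3 and negative outside it, and the estimated integral is zero to within numerical precision, so a constant background contributes nothing after filtering.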

非光滑曲线等长样条逼近

非光滑曲线等长样条逼近

非光滑曲线等长样条逼近何国良;张勇【摘要】在工程计算中,常常需要对非光滑曲线进行光滑逼近.本文讨论了非光滑曲线在局部小区间上,可用保长度不变的光滑样条曲线来近似的问题.首先利用介值定理,从理论上证明:对定义在有限区间上的任意连续函数,只要曲线长度有限,则在该区间上可用光滑样条曲线来等长近似原曲线.在此基础上,通过选择恰当的插值节点,可以构造出一条唯一的光滑曲线,使其具有和原曲线形同的长度.同时,本文还给出具体构造光滑曲线的方法及详细计算过程.最后的数值结果表明这种方法既简单又行之有效,从而完整地解决了工程计算中曲线近似这一问题.【期刊名称】《工程数学学报》【年(卷),期】2016(033)005【总页数】15页(P480-494)【关键词】长度可测;介值定理;光滑曲线;等长近似【作者】何国良;张勇【作者单位】电子科技大学数学科学学院,成都611731;西南医科大学医学与信息工程学院,泸州646000【正文语种】中文【中图分类】O241.71 引言在数值传热学、计算流体力学、核反应堆模拟、油藏数值模拟、计算机几何学等学科中常常需要数值求解定义在曲线上的微分方程.在这其中一些精细分析和计算中,为达到比较好的计算效果,需要用光滑曲线来近似已知非光滑曲线,并且还要保持沿曲线的测度不变—保持定义在这上面的重要物理、化学或几何性质不变,比如浓度、压力、速度、通量、质量等[1-5].由于这些问题通常对应着复杂的微分或积分方程,需要用数值方法来求解.此外,由于物理模型要求具备守恒性,这就要求所采取的数值方法也能够保持这些守恒性质.这样一来,针对微分或积分算子方程本身的离散方法除要具备守恒特点而外,对网格也提出了很高的要求.比如在生成贴体网格的过程中,需要网格能够很好地近似曲线(面)随空间自变量的变化而变化的趋势[5,6].对于定义在曲线上的一些方程,通常采用的参数化目标曲线来生成网格的办法存在两个方面的不足:1) 参数化曲线后所导致的积、微分方程异常复杂,所以除简单问题外,一般在工程上很少使用[7,8];2) 对于一些特殊问题,达不到需要的精度,如图1所示.图1:曲线上的投影点不唯一曲线C代表一条管状结构(或来自一薄层的简化,详细介绍参考文献[9–12]),在A 处有一个不光滑点.实际应用中需要以该曲线为基线,生成贴近曲线走势的一组三角形(四边形)划分:为保证物理守恒,要求这些三角形或四边形单元节点的值等于相应函数在曲线C上的正交投影[8-10](这样在数值计算时,能保证我们关心的物理量守恒,从而保证数值方法和物理模型一样具有守恒性).对于曲线的光滑部分,划分没有障碍,但是在曲线不光滑部分,比如A点,则会遇到下面的问题:1) 在曲线不光滑的地方,因为曲线方程在这里不可导.对于角平分线上的任何一点pi来讲,其在曲线C上的投影点是不确定的:L i,R i都是在曲线上的投影点.所以,在图1中的点A附近,如果直接进行三角形(或四边形)划分,其计算过程中是不稳定的;2) 通常采取的另一种方法,“边角光滑化(边界部分以曲代直)”,在这里也变得不可行—为达到一定的计算精度,导致局部网格划分异常密集,否则近似误差较大,会污染邻接节点的计算,甚至导致计算失败.为克服参数化和局部简单的以曲代直的不足,本文提出在局部,用等长光滑曲线来代替原始曲线的方法.这里的曲线一方面满足足够的光滑性(如曲线关于自变量,比如x是可导的),另一方面更为重要的是,保证所得曲线在替代范围内(本文称为为链接区间)和被代替的原始曲线具有相同的测度(如:长度).由于该问题是在实际计算中新遇到的需要解决的一个基本问题,所以在本文中将作详细的分析讨论.在下面的讨论过程中,均假设曲线在包含不可导点的某一区间内有界且一维可测,如图2中的左图;但对于图2的右图的函数在x=0处附近震荡或其他分形曲线等复杂无界曲线,则不在本文讨论范围内,因为曲线段的一维测度(长度)为无穷大.注意,虽然实际是应用中可能不是直接使用长度的概念[6,10,18],但为简单将基本原理和方法叙述清楚,这里暂时抛开具体应用,仅以模型曲线来作为分析对象,并将抽象测度用长度来代替.本文的组织结构如下:首先分析满足上面所提及的光滑性和等测度的条件,然后建立包含一组数学关系和方程的数学模型;在此基础上分析该模型的可解性,最后通过数值方法来求解.图2: 在x=0附近曲线长度可测和不可测2 模型建立假设给定两点的坐标:p(x l,y l),q(x r,y r),以及链接区间[x l,x r]内的曲线长度(或其他测度)L m.根据光滑性和等长度的要求,需构造一连续函数S(x),满足如下三个条件:1) 尽可能只改变原曲线不光滑点附近的曲线;2) 链接区间内[x l,x r]光滑条件:S(x)至少是一阶光滑函数;3) 长度性一致条件:在区间[x 
l,x r]上的函数S(x),其对应曲线的长度L需满足指定的长度L m,即根据上面的基本要求,比较容易找到且计算稳定的函数是分段多项式函数[14-16].由于要求该函数在区间[x l,x r]上至少具有一阶连续导数(链接区间的端点用单侧导数连续来表示),且在两端端点满足相应函数值和导数值的插值条件.这样,包含原来的不光滑点处的光滑性在内,至少有七个约束条件.基于这些基本分析,自然的选择是利用分段3次样条函数来近似.考虑到这里不需要处理太多的插值节点,并且还需要能比较简单地计算出曲线的长度,达到和后续进一步插值计算或求解微分方程时空离散之间的简单对接,所以这里采用一般样条函数[15-18].设需求的函数具有如下形式其中x m为链接区间内任意一点,a1,b1,c1,d1,a2,b2,c2,d2是待定参数.此外,为满足长度相同的要求,这里假设y m也是未知的.根据S(x)需要满足的三个条件,我们可以得到如下的约束关系(∗):1) 插值条件2) 函数一阶光滑条件3) 曲线长度相等条件其中s表示弧元,L m为指定长度.带入S(x)的表达式,我们可以得到下面一组约束方程和这里k0,k1为原曲线在端点x l,x r处的导数值,y m为中间链接点处的函值,其值待定.由于约束方程(4)的引入,使该该问题变得比较复杂.下面分析求解情况.设A为方程组(3)的系数矩阵,行列顺序按照a1,b1,c1,d1,a2,b2,c2,d2的自然顺序进行排列,记系数向量X=[a1,b1,c1,d1,a2,b2,c2,d2]T,B=[y l,y r,ym,k1,k2,0,0,0]T.对于方程组(3),其中前三个为插值条件,只要x l,x m,x r三点互不相同,这三个方程则线性无关;中间两个是不同端点x l,x r的导数约束;后面两个为函数在链接点的一阶光滑性约束.这8个方程构成的线性方程组两两线性无关,即A可逆.于是,在给定y m的情形下,我们能够得到唯一的一组系数a1,b1,c1,d1,a2,b2,c2,d2,即S(x)和y m是一一对应的;并且下面特殊关系式成立即,若S(x)为直线时,y m定在pl pr的连线上(此时自然要求k1=k2).为方便起见,形式上令将(5)带入(4)式,有下面分析非线性方程(6)解的存在性情况.为方便起见,令注1 函数f(x m,y m)中的两个被积函数均隐含着x m,y m,所以是二元函数,其图像如图3所示.这里x m并不要求为原曲线的不光滑点.图3:f(x m,y m)的二维图像和等值线图3 理论结果为完整地解决该问题,我们首先从理论上讨论方程(6)中解的存在性.下面分几步来证明方程(6)的解:首先,证明在区间[−1,1]上,存在关于y轴对称的屋顶函数和长度与给定曲线相同的三次一阶光滑函数;其次,引入一个线性变换将一般的区间[c,d]映射至[−1,1]区间;最后,借助于介值定理来完成方程(6)的解的存在性证明.命题1 设δ∈R,且δ>0.对给定的对称区间[−δ,δ],以及在区间端左端点x=−δ处的斜率k0>0,在x= δ处的斜率k1= −k0.存数y m=和一个三次多项式P(x),使得该函数同时满足下面四条:1) 插值条件2) P(x)∈ C 1[−δ,δ],其中C 1[−δ,δ]表示闭区间[−δ,δ]的一阶连续光滑函数;3) 一阶导数单调下降:满足k0>P′(x)>−k0,x∈[−δ,δ];4) 函数P(x)在区间[−δ,δ]凸性不变.证明为方便起见,假设y m>0,对应的函数形式上为区间[−δ,δ]三次多次多项式满足下面的插值条件由于上述关于系数[a,b,c,d]的线性方程组两两之间线性无关,所以存在唯一解在区间(−δ,0)中,函数P1(x)的一阶,二阶导函数分别为且x∈(−δ,0),即函数P1(x)在区间(−δ,0)向上凸,所以在该区间(−δ,0)上单调下降,满足:0<P1(x)<k.另一方面,利用对称性可以证明在区间(0,δ)内,若在右端点x=δ处,曲线的斜率k1=k0,在x=0处的斜率等于0.同样存在一个三次多项式P2(x)函数,满足下面插值条件该函数且在区间(0,δ)向上凸,且一阶导函数单调下降,在x=δ取得最小−k0.将上述两方面结合起来,令则通过上面构造和分析,我们得到P(x)在区间(−δ,δ)连续的,满足插值条件1).由于所以P(x)在区间(−δ,δ)上具有连续的一阶导数,即P(x) ∈ C1[−δ,δ].命题1中的第2)条和第4)条,已经在证明过程中得到证明.注2 命题1在k0>0的条件下得到.对于k0<0,我们同样也可以得到类似的结论,只需要将命题1第3)条中的导数变成单调增加即可.该命题的关键在于能找到一个满足条件的函数即可.命题2 
在区间(−δ,δ)上存在一个具有一阶连续光滑的三次多项式函数P(x)∈P3(x),使得证明为方便起见,下面以k0>0为例来证明,对于k0<0,结论一致.令y m=k0/3,根据命题1,存在一个具有一阶连续光滑的函数P(x).该函数在区间[−δ,0]单调上升,一阶导数不变号,且0<P′(x)<k0,x∈[−δ,0].P(x)所代表的曲线在区间[δ,0]上的长度为同样,在区间[0,δ]上将上面两个积分加起来,有注3 上述命题的含义是:如图4所示,在区间[−1,1]上,多项式P(x)所对应的曲线(x,P(x))的长度小于线段AB和BC的长度之和,即图4:P(x)的长度小于三角形的两直角边的长度和命题3 对于定义在区间[a,b]上任意连续且具有可测长度的函数f(x),如果f(x)在子区间(c,d)∈[a,b]以外不存在不可导点,则存在一个如下形式的函数使得h(s)在区间上和f(x)在区间[a,b]具有相同的长度,并且h(s)在点x=δ和x=−δ处导数连续,且h′(−δ)= −h′(δ)=k0 ̸=0, ¯a< −δ< δ< ¯b.证明取实数x l,x r满足a<x l<c<d<x r<b.由于f(x)是在区间[a,b]上是长度有限的函数,故令其在区间[a,x l]上的长度为L L,在[x l,x r]上的长度为L m,在[x r,b r]上的长度为L R.由于在区间[a,x l],若L L=e−a,则令ˆa=a+ε0,可使得L L>x l−ˆa.为简单起见,以后均假设L L>x l−a,L m>x r−x l,L R>b−x r.记S(x,f(x))为曲线上任意一点,引入下面的线性变换:T:(x,f(x))7→(s,¯y),其中ˆS表示变换后曲线上的点,则对于定义区间[x l,x r]上的点集S(x,f(x)),在变换(11)下的像集ˆS(s,¯y)成为定义在区间[−δ,δ]上的连续曲线,曲线长度依然为L m.令则上面定义的h m(s)在区间[−δ,δ]上连续,对应的曲线长度依然为L m,且在左、右端点处的单侧导数存在,互为相反数:由于f(x)不光滑点附件的曲线长度有限,所以总可以适当移动区间[x l,x r],使得k0̸=0.设a¯<−δ在区间[¯a,−δ]上,定义从方程组(13)的前两个方程解出b1,c1,有b1=k0+2δa1,c1=k0+2δa1−3a1δ2.带入第三个方程得到关于参数¯a,a1的方程.由于第三个方程左边关于这两个参数均是连续函数,左边积分值的范围介于−(δ+¯a)到无穷大之间.所以总可以找到恰当的值,使得在[¯a,−δ]上,函数p1(s)所对应的曲线长度为L L,并且p1(x)=k0.类似地,在区间[δ,b]上,存在函数p2(s)=a2 s2+b2 s+c2,使得在该区间上,函数p2(s)所对应的曲线长度为L R,并且p2(x)=k0.最后,令即为所求函数.说明:1) 根据h(s)的构造方法,对任意的故对其作一个反向旋转变化得到原来区间[x l,x r]上的函数,在该区间以外也是光滑的.为避免引入太多记号,这里可将其记为h(x).2) 该命题的主要意义在于:从理论上通过平移和旋转变换,构造一个简单的屋顶函数h(x):这个函数只在x=[x l+x r]/2处不可导(不光滑点),而在其它地方是光滑的,并且在相应区间上保持曲线长度不变.构造这样的函数主要目的是为以后用介值定理做准备,实际的数值计算中不需要构造该函数.定理1 设f(x)为区间[a,b]上的连续且长度可测函数,其长度为L0,则存在满足约束关系(∗)的解.证明若f(x)在整个区间[a,b]上光滑连续,则自然满足要求.下面证明不光滑的情况.由于f(x)为区间[a,b]上的连续函数,我们可以找到一个区间[x l,x r],满足[c,d]∈[x l,x r]∈[a,b].利用命题3,存在一个连续函数h(s),使得h(s)除x=0以外均是光滑函数.对于区间[−δ,δ]以外的部分,h(s)所对应的曲线长度和f(x)的在区间[x l,x r]以外相同,所以这里只需再证明,可用等长光滑函数来代替h(s)在区间[−δ,δ]上的部分即可.设曲线(c,f(x))在区间[x l,x r]上的长度为L m.根据h(s)的构造,h(s)在区间[¯a,¯b]上光滑,在区间[−δ,δ]上的长度和f(x)在[x l,x r]的长度相同,即若L m=2δ,此时需k0=0,f′(x)≡c0,即f(x)为直线段,方程(6)有唯一的解若L m>2δ,f(x)不恒为常数,k0=h′(s)|s=−δ ̸=0.由于形如中的被积函数关于变量s是连续函数,从而方程(6)中的两项定积分值均存在的.令其中从式(7)可知,向量X的各分量关于y m均为非线性函数,且h1(s,y m),h2(s,y 
m)是关于y m连续函数,所以(15)式是关于y m的连续非线性函数.因k0̸=0,由命题2,存在数⌢y=k0/3和一个具有一阶连续光滑的三次多项式函数P(s)∈P3(s),使另一方面,对于任意的固定值L m,当不恒为零时,H(y m)是关于y m的增函数(0<c0<|y m|,c0为一常数),故于是存在一正数M满足M>L m,如图5所示.对该M,根据函数H(y m)的连续性,存在¯y m,使得H(¯y m)=M,即存在¯y m,使得将式(17),(18)结合起来,根据介质定理,存在y m∗,使得H(y m∗)=L m.再将其在区间[a,b]上进行适当延拓即可从而证明非线性方程(6)有解.从而定理得证.图5:不同的y m对应的光滑曲线说明:1) 定理说明对于任意一个连接区间[x l,x r],只要在这个区间上,曲线长度可测,则总可以用一条一阶光滑曲线来进行等长替代.2) 方程(6)的解一般不唯一.因为对于给定的x m和L m,由于积分关系中凡是含y m的项整体均为平方项,所以若y m满足方程(6),则在−y m附近也存在另一个满足方程(6)的解,如图6所示.3) 由于y m不唯一,实际应用中选择y m的一般方法是借助于图形的惯性方向来选取;比较简单的方法是借助重心坐标来算,方法如下:•计算曲线在区间[x l,x r]的部分,连接点(x l,f(x1))和点(x r,f(x r));对给定的x m 计算两直线•在区间[x l,x r]内的交点,选y m的初值为所得三角形的重心即可,如图6中的ym.图6: 方程(6)在图中有上下两个解在上面的定理中,我们证明了在含不光滑点的链接区间[x l,x r]上存在通过该区间中点x m,且在中点处的导数为0.实际上由关系式(9)可知所以我们同样可以构造出h(s),使得从而保证h(s)在区间(x l,x r)内二阶光滑.这样一来,由h(s)衍生的S(x)同样满足此外,由于在插值节点x m处,样条函数S(x),F(x m),H(y m)均是关于x m的连续函数.沿用定理1的证明方法,我们可得到下面更一般的结论.定理2 设f(x)为区间[x l,x r]上的连续且长度可测函数,有意义,则对任意的x∈[x l,x r],存在满足如下约束关系函数三次样条函数S(x):1) 插值条件2) 函数光滑条件3) 曲线长度相等条件其中s表示弧元,L m为原曲线在区间[x l,x r]上的长度.注4 定理2表明,在一个局部区间,不仅可以用简单曲线来不光滑曲线,也可以代替复杂曲线,比如高频震荡曲线.这在实际应用中是很有好处的,尤其在对曲线形状和长度比较敏感的场合.4 算法分析下面讨论光滑曲线S(x)的具体数值求解办法.在链接区间上[x l,x r],曲线长度L m可利用测量数据或用公式积分或数值积分计算而得到.由于一般形式积分关系(15)的结果,形式非常复杂,难于计算,所以这里考虑用高斯积分数值积分方法来计算.下面给出求解恰当y∗m的牛顿-拉夫逊算法(Newton-Raphson Method).4.1 算法1) 给定初始数据:x l,x r,f(x l),f(x r),k1,k2,及曲线的在区间[x l,x r]上的长度L m,容许误差ε;2) 计算几何区域的惯性方向(或重心),选取恰当的x m∈[x l,x r],计算y m相应的初值3) 计算对应的S(x):令n=1,2,3,···,(I) 用高斯积分方法计算积分式(15);(II) 用牛顿法计算方程(6)的第n此迭代值;(III) 利用算样条函数的系数X=[a1,b1,c1,d1,a2,b2,c2,d2]T,从而得到新的˜S(x);(IV)计算样条曲线˜S(x)在夹在区间[x l,x r]的长度L,并计算|L−L m|;(V) 若|L−L m|>ε,则返回3)继续计算;否则,终止跳出循环,结束计算;4) 利用最后的y nm计算曲线整个区间上S(x),从而得需要的光滑曲线.4.2 算法分析1) 虽然牛顿法对初值比较敏感,但从y m的角度上看,如果方程 (6)基本上还是和2次多项式方程接近,所以会很快收敛.只不过由于F(y m)存在最小值点,初值取为最小值点时,算法可能不稳定.所以选取初值时,需注意尽可能避开该点.2) 理论上讲,对任意的x m∈[x l,x r]都可以求出y m对应的近似值,不过从后续计算的稳定性和效率来看,在|k1|和|k2|比较接近的情况下,将x m选择靠近区间的中点还是不错的方案,这样得到的曲线,极性小些[14,19].4.3 数值结果下面以函数f(x)=|sin(x)|,x∈[−1.5,1.5]为例来计算.该函数在x=0处导数不存在,有一个不光滑点.在区间[−1,1]上面长度有限.图7为选择不同的插值节点x 
m, from which the smooth curves are computed. The three pairs of curves correspond to x_m = −2/3, 0 and 2/3. The dotted curves in the upper group pair with those in the lower group; the thick line is the original non-smooth curve, and the small circles mark the position of the interpolation node x_m on each curve. Every pair of curves meets the requirements on length and smoothness, so any of them can serve as a local approximation of the curve. The appropriate curve can therefore be chosen to suit the application, which keeps the method stable and reliable.

Figure 7: Continuous smooth curves for different x_m

The results in Table 1 were obtained with a Newton iteration stopping tolerance of 10^−5; the iteration took 5 to 6 steps. The true length of the curve on the interval [−1, 1] is 2.622884996431094. Computed values are reported to five decimal places. Table 1 shows that the computation is fast in practice and that, for every x_m, the result is correct and the algorithm is stable, so the equal-length curve approximation method proposed in this paper performs well. To obtain a unique curve in a simple way, it suffices to take the interpolation node x_m at the midpoint of the interval. This paper has mainly addressed the approximation of continuous curves. If an engineering application requires approximating a discontinuous curve, there are two options: smoothing segment by segment, or smoothing the curve as a whole. Neither case poses any difficulty for the present approach, because the proposed method only needs the length of the curve on the given interval and does not depend on the curve's specific geometric shape.

Table 1: Continuous smooth curves for different x_m
x_m     y_m       Iterations   Approximate curve length L   Relative error of length
−2/3    0.37778   5            2.62289                      1.06581×10^−14
−2/3    1.26377   6            2.62289                      5.28906×10^−14
0       0.02876   5            2.62289                      5.32907×10^−14
0       1.57628   6            2.62289                      3.24185×10^−14
2/3     0.37778   6            2.62289                      8.88178×10^−14
2/3     1.26377   5            2.62289                      1.55964×10^−14

References:
[1] Mehendale S S, Jacobi A M, Shah R K. Fluid flow and heat transfer at micro- and meso-scales with application to heat exchanger design[J]. Applied Mechanics Reviews, 2000, 53(7): 175-193
[2] Wu X H, Ye J G, Ma Y L, et al. SAGD numerical simulation with horizontal wells[J]. Chinese Journal of Computational Physics, 2002, 11(6): 549-592
[3] Fang C, Steinbrenner J E, Wang F M, et al. Impact of wall hydrophobicity on condensation flow and heat transfer in silicon microchannels[J]. Journal of Micromechanics and Microengineering, 2010, 20(4): 045018
[4] Louahlia-Gualous H, Mecheri B. Unsteady steam condensation flow patterns inside a miniature tube[J]. Applied Thermal Engineering, 2007, 27(8-9): 1225-1235
[5] Tao W Q. Numerical Heat Transfer[M]. Xi'an: Xi'an Jiaotong University Press, 2001
[6] Fan X G, Ma X H. Fluid flow and heat transfer of steam and steam-noncondensable gas mixtures condensation in microchannels[D]. Dalian: Dalian University of Technology, 2012
[7] Floater M, Hormann K. Surface parameterization: a tutorial and survey[C]//Advances in Multiresolution for Geometric Modelling, Springer, 2005
[8] Greer J B. An improvement of a recent Eulerian method for solving PDEs on general geometries[J]. Journal of Scientific Computing, 2006, 29(3): 321-352
[9] Salomon D. Curves and Surfaces for Computer Graphics[M]. New York: Springer Science+Business Media, 2006
[10] Reuter M, Wolter F E, Shenton M, et al. Laplace-Beltrami eigenvalues and topological features of eigenfunctions for statistical shape analysis[J]. Computer-Aided Design, 2009, 41(10): 739-755
[11] Bertalmio M, Bertozzi A L, Sapiro G. Navier-Stokes, fluid dynamics, and image and video inpainting[J]. IEEE Computer Society Conference on Computer Vision & Pattern Recognition, 2001, 1(1): 355-362
[12] Dziuk G, Elliott C M. Eulerian finite element method for parabolic PDEs on implicit surfaces[J]. Interfaces & Free Boundaries, 2008, 10(1): 119-138
[13] Macdonald C B, Ruuth S J. The implicit closest point method for the numerical solution of partial differential equations on surfaces[J]. SIAM Journal on Scientific Computing, 2009, 31(6): 4330-4350
[14] Burden R L, Faires J D. Numerical Analysis[M]. Boston: Cengage Learning Press, 2001
[15] Berrut J P, Trefethen L N. Barycentric Lagrange interpolation[J]. SIAM Review, 2004, 46(3): 501-517
[16] Liu X D, Osher S, Chan T. Weighted essentially non-oscillatory schemes[J]. Journal of Computational Physics, 1994, 115(1): 200-212
[17] Xu Z F, Shu C W. Anti-diffusive flux corrections for high order finite difference WENO schemes[J]. Journal of Computational Physics, 2005, 205(2): 458-485
[18] Li Q Y, Wang N C, Yi D Y. Numerical Analysis[M]. Wuhan: Huazhong University of Science and Technology Press, 2001
[19] Zhong E J, Huang T Z. Numerical Analysis[M]. Beijing: Higher Education Press, 2004

Machine-Learning-Based EUR/USD Exchange Rate Prediction (IJMSC-V8-N1-5)

I. J. Mathematical Sciences and Computing, 2022, 1, 44-48
Published Online February 2022 in MECS
DOI: 10.5815/ijmsc.2022.01.05

EUR/USD Exchange Rate Prediction Using Machine Learning

Md. Soumon Aziz Sarkar
Hajee Mohammad Danesh Science & Technology University, Dinajpur, 5200, Bangladesh
Email: soumon.sarkar72@

U.A. Md. Ehsan Ali
Hajee Mohammad Danesh Science & Technology University, Dinajpur, 5200, Bangladesh
Email: ehsan_cse@hstu.ac.bd

Received: 08 June 2021; Accepted: 20 July 2021; Published: 08 February 2022

Abstract: Artificial intelligence is now used in almost every sector of day-to-day life, including preventative maintenance, quality control, demand forecasting, rapid prototyping, and inventory management. Its use has also become widespread in financial markets, where it has contributed substantially to price forecasting in the currency and stock markets. This work explores and analyzes the use of a machine learning technique, linear regression, on the EUR/USD exchange rate in the global forex market to predict future movements and to compare forecasts made from daily and hourly data. For the comparison, linear regression was applied to hourly and daily EUR/USD data sets of nearly equal size, and the results differed. This opens a new line of research on this market. The accuracy of the daily forecast was found to be higher than that of the hourly forecast at the test stage.

Index Terms: Artificial Intelligence, Foreign Exchange Rate, EUR/USD, Forex Market, Machine Learning, Linear Regression.

1. Introduction

Machine learning is a means of data analysis in which automated models make decisions without human help. It is a branch of artificial intelligence in which models train themselves on previous data. Once trained, they can identify different patterns and, through further processing, make their own decisions.

Machine learning techniques are well known and widely used in financial markets. The foreign exchange market (FX, forex, or currency market) is a large part of the financial market, with an estimated daily trade volume of almost $6.6 trillion. Price movements in the forex market are driven by many factors that interact in complex ways, which is why forecasting this market matters. Financial market forecasts are crucial for traders, investors, economists, and analysts.

Many investors face losses every year simply because they do not receive accurate forecasts. Most traders still follow older forecasting methods that once worked fairly well but no longer give the expected results. As technology has advanced, the financial market has changed drastically, so it is no longer sound to judge the pace of the market in the old way. This is why machine learning has emerged in this field and changed the outline of the financial market [10].

Forecasting methods are continually being developed to gain more accuracy. This paper is an application of machine learning techniques to the currency market. EUR/USD is the most actively traded currency pair in the market. In this research, we use the regression technique to predict the EUR/USD data series in different time frames. The regression technique is commonly used on linear data sets but rarely on nonlinear data sets [1]. Historical data are collected, selected, and stored for later model design. This step, called data acquisition, gathers the measurements needed to evaluate the system.
The data are then taken through a few more steps in the machine learning model so that they can be processed efficiently [9].

The technique is applied to both hourly and daily time frames to forecast the EUR/USD exchange rate and to report the accuracy and error of the forecast. Following this method can give traders deeper knowledge of the market, help them avoid unwanted trading problems, and help investors run their business profitably.

2. Previous Studies

Swagat Ranjit, Shruti Shrestha, Sital Subedi, and Subarna Shakya compared algorithms for forecasting foreign exchange rates in 2018 [4]. They used machine learning techniques such as the Artificial Neural Network (ANN) and the Recurrent Neural Network (RNN) to develop prediction models for the Nepalese Rupee (NRs) against three major currencies: the Euro, the Pound Sterling, and the US Dollar. They built prediction models based on different RNN architectures and on a feed-forward ANN with the back-propagation algorithm, and then compared the accuracy of each model. The input data sets contained the Low, High, Open, and Closing prices for each currency. The study found that LSTM networks worked better than SRNN and GRU networks.

Dr Gu Wang and Dr Joerg Oesterrider analyzed currency risk management by forecasting the EUR/USD exchange rate in April 2018 [6]. They developed a linear regression model and corrected the error using motion signals. The predictions are compared with the future price to develop a hedging strategy for deciding whether to wait until the next month to make the transaction or to enter a forward contract. Lastly, they analyzed the return obtained with this strategy.

Dinesh K Sharma, H.S. Hota, and Richa Handa conducted a project on exchange rate forecasting in 2017 using regression techniques [1]. They compared plain regression with ensemble regression on non-linear data and observed the variation in MAPE values. Ensemble regression enhances the performance of the predicting model. Both techniques were applied to INR/USD and INR/EUR data to predict future movement. The comparison shows that a regression ensemble with Least Square Boost performs better than the other techniques for the one-day-ahead forecast.

Sitti Wetenriajeng Sidehabi, Indrabayu, and Sofyan Tandungan presented "Statistical and Machine Learning Approach in Forex Prediction Based on Empirical Data" at the 2016 International Conference on Computational Intelligence and Cybernetics [8]. They used two machine learning approaches, the support vector machine (SVM) and a hybrid genetic algorithm-neural network (GA-NN), and compared their results. The paper also used Adaptive Spline Threshold Auto Regression (ASTAR) as the statistical method. The comparison shows that ASTAR and GA-NN each have advantages in different time frames.

Konstantinos Theofilatos, Spiros Likothanassis, and Andreas Karathanasopoulos investigated the modelling and trading of EUR/USD exchange rates using machine learning techniques in 2012 [2]. They compared five supervised learning algorithms (K-Nearest Neighbors, Naïve Bayesian Classifier, Artificial Neural Networks, Support Vector Machines, and Random Forests) for predicting the EUR/USD exchange rate. In this case, the random forest algorithm gave a relatively satisfactory result.

Tadashi Iokibe, Shoji Murata and Masaya Koyama [7] analyzed foreign exchange rates with the local fuzzy reconstruction method (conference held October 22-25, 1995). They predicted time series data of the foreign exchange market using embedding and local reconstruction techniques.
In the end, they presented the forecast results.

3. Data Set and Methodology

Dukascopy Bank is a Swiss online bank that provides online trading platforms, banking, and financial services. These services make work in the financial market much more convenient. Dukascopy Bank SA provides historical price data feeds for a variety of forex instruments over a variety of time series. The one-hour data set was collected from June 8, 2018, to December 8, 2018, and the daily data set from December 31, 2007, to January 12, 2019. The data sets, with the features Open, High, Low, Close, and Volume, were collected in Excel sheets. Saturday and Sunday are weekly holidays in the foreign exchange market; for these two days the series carries forward Friday's price. Each data set (Excel sheet) contains about four thousand records.

All code for the prediction was written in Python. Python 3.7 (32-bit) was used through IDLE, Python's integrated development environment. The data were converted and manipulated as needed, and the features were defined.

Although forex and stock exchange data sets rarely contain missing values, any missing entries were replaced by the value −99,999. As preprocessing, the features were normalized to the range of 1 to −1 to speed up processing and obtain useful accuracy. The general formula is:

$$ z = \frac{x - \min(x)}{\max(x) - \min(x)} \qquad (1) $$

where x is the original value and z is the normalized value. A linear regression model from the Scikit-Learn machine learning library was trained on part of the data and then tested on held-out data. After training and testing, forecasts were generated from the data set as a whole.
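As a minimal sketch of the normalization in Eq. (1), assuming the closing prices are held in a plain Python list: the function name `min_max_normalize` and the sample prices are hypothetical and are not taken from the authors' code. Note that Eq. (1) as written maps values onto the interval [0, 1].

```python
def min_max_normalize(values):
    """Scale values with z = (x - min) / (max - min), as in Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: Eq. (1) is undefined here, so zeros are
        # returned. This fallback is an assumption of the sketch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up closing prices, for illustration only.
closes = [1.1712, 1.1698, 1.1750, 1.1731]
normalized = min_max_normalize(closes)
print(normalized)
```

The smallest price maps to 0 and the largest to 1; a library scaler could be used instead, but the pure-Python form keeps the arithmetic of Eq. (1) visible.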
The scale method is then applied to the forecast data, based on all known data, to standardize the range of the individual variables. Standardization first requires the standard deviation:

$$ \delta = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2} \qquad (2) $$

where \bar{x} is the mean of all x values and N is the number of values. Each value is then centred by subtracting the mean and divided by the standard deviation:

$$ Z = \frac{x - \bar{x}}{\delta} \qquad (3) $$

Model selection techniques for dynamic data segmentation were applied to make the forecast more precise. The forecast for the EUR/USD market is then completed for both the hourly and the daily series, and the various errors are discussed along with the forecasts. In this respect the machine learning model is a clear improvement: it is low in cost and also more accurate than conventional prediction techniques.

4. Experimental Results

Linear regression was applied to the EUR/USD exchange rate and produced a clear forecast and comparison. Tables 1 and 2 show the actual price, the estimated price, and the percentage error of the EUR/USD exchange rate on an hourly and a daily basis, respectively. Only a few rows of the whole data sheet are shown to explain the comparison.

Table 1. One Hour Base Price
Table 2. Daily Basis Price

Figures 1 and 2 show the forecast in the two time frames. In Figure 1, the data and predictions are based on the hourly time series; Figure 2 shows the daily series. The analysis found that the daily forecast is more accurate than the hourly forecast. However, the error measures behave differently: there are more errors in the daily time series.

Fig. 1. EUR/USD exchange movement and forecast for a one-hour time frame
Fig. 2. EUR/USD exchange movement and forecast for a one-day time frame

Table 3 compares the one-hour and daily time frame results using MAE (Mean Absolute Error), MSE (Mean Squared Error), RMSE (Root Mean Squared Error), and accuracy. The test was performed with self-written Python code.

Table 3. Comparative Result Analysis

5. Summary, Conclusion, and Future Works

In this research paper, first, we have addressed a few necessary terms to deeply understand the field of penetration detection. This was followed by a brief literature review of some intrusion detection methods, techniques, and procedures, as well as some of their notable weaknesses. Then, our proposed strategy was described. It was assessed on a real-world data set.

We set up a learning framework and normalized the EUR/USD exchange rate data sets. We then applied linear regression to different time series of EUR/USD exchange rates in the global forex market, compared accuracy and several error measures, and obtained different results for different time charts. The comparison further finds that a daily data chart, with its higher accuracy, is the better basis for trading.

In the future, our goal is to use machine learning strategies in other financial markets, such as the stock market, to trade better and more securely, and more broadly to improve daily life through the use of artificial intelligence. Researchers are always trying to increase accuracy as much as possible, and it is hoped that future research will take this field to a more advanced level.

References
[1] Dinesh K. Sharma, H.S. Hota, and Richa Handa, "Prediction of foreign exchange rate using regression techniques", Review of Business and Technology Research, Vol. 14, No. 1, ISSN 1941-9414, 2017.
[2] Konstantinos Theofilatos, Spiros Likothanassis, and Andreas Karathanasopoulos, "Modeling and Trading the EUR/USD Exchange Rate Using Machine Learning Techniques", ETASR - Engineering, Technology & Applied Science Research, Vol. 2, No. 5, 269-272, 2012.
[3] Kei Shioda, Shangkun Deng, and Akito Sakurai, "Prediction of Foreign Exchange Market States with Support Vector Machine", 10th International Conference on Machine Learning and Applications and Workshops, Dec 18-21, 2011.
[4] Swagat Ranjit, Shruti Shrestha, Sital Subedi, and Subarna Shakya, "Comparison of algorithms in Foreign Exchange Rate Prediction", IEEE 3rd International Conference on Computing, Communication and Security (ICCCS), Kathmandu (Nepal), 2018.
[5] Christian L. Dunis, and Mark Williams, "Modelling and Trading the EUR/USD Exchange Rate: Do Neural Network Models Perform Better", Derivatives Use, Trading and Regulation, February 2002.
[6] Dr Gu Wang, and Dr Joerg Oesterrider, "Currency Risk Management Predicting the EUR/USD Exchange Rate", Worcester Polytechnic Institute Zurich School of Applied Sciences (ZHAW), April 26, 2018.
[7] Tadashi Iokibe, Shoji Murata, and Masaya Koyama, "Prediction of Foreign Exchange Rate by Local Fuzzy Reconstruction Method", IEEE International Conference on Systems, Man and Cybernetics. Intelligent Systems for the 21st Century, Vancouver, BC, Canada, Oct 22-25, 1995.
[8] Sitti Wetenriajeng Sidehabi, Indrabayu, and Sofyan Tandungan, "Statistical and Machine Learning Approach in Forex Prediction Based on Empirical Data", International Conference on Computational Intelligence and Cybernetics, Makassar, Indonesia, Nov 22-24, 2016.
[9] Y. Lei, N. Li, S. Gontarz, J. Lin, S. Radkowski, and J. Dybala, "A Model-Based Method for Remaining Useful Life Prediction of Machinery", IEEE Trans. Reliab., vol. 65, no. 3, pp. 1314-1326, 2016.
[10] P.D. Yoo, M.H. Kim, and T. Jan, "Machine Learning Techniques and Use of Event Information for Stock Market Prediction: A Survey and Evaluation", International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06), Vienna, Austria, Nov 28-30, 2005.

Authors' Profiles

Md. Soumon Aziz Sarkar obtained his B.Sc. degree in Computer Science and Engineering from Hajee Mohammad Danesh Science and Technology University, Dinajpur, Bangladesh in 2017. He has gained extensive knowledge of coding, design, and software development cycles as well as proficiency in several programming languages. As a software developer, he has developed some systems on a freelance basis. He is currently interested in working on Artificial Intelligence, Machine Learning, Cryptocurrency, Data Mining, Database Systems, and Distributed Systems. In his spare time, he enjoys playing table tennis, reading books, and swimming.

U. A. Md. Ehsan Ali received his B.Sc. degree in Computer Science and Engineering from Hajee Mohammad Danesh Science and Technology University, Dinajpur, Bangladesh in 2013. He is now pursuing an M.Sc. degree in Computer Science and Engineering at Rajshahi University of Engineering & Technology (RUET), Rajshahi, Bangladesh. His main interests are Image Processing, Expanding the Applications of Artificial Intelligence, Machine Learning, Data Mining, and Data Security. He currently works as an Assistant Professor in the Dept. of Computer Science and Engineering at Hajee Mohammad Danesh Science and Technology University, Dinajpur, Bangladesh. He has several scientific research publications in various aspects of Computer Science and Engineering.

How to cite this paper: Md. Soumon Aziz Sarkar, U.A. Md. Ehsan Ali, "EUR/USD Exchange Rate Prediction Using Machine Learning", International Journal of Mathematical Sciences and Computing (IJMSC), Vol.8, No.1, pp.44-48, 2022. DOI: 10.5815/ijmsc.2022.01.05
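
The error measures that the article compares in its Table 3 (MAE, MSE, and RMSE, that is, mean absolute error, mean squared error, and root mean squared error) can be sketched as follows. This is an illustrative pure-Python version under stated assumptions, not the authors' self-written code: the function name `error_metrics` and the sample prices are hypothetical.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, MSE and RMSE for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
    mse = sum(e * e for e in errors) / len(errors)    # mean squared error
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Made-up actual and forecast prices, for illustration only.
metrics = error_metrics([1.171, 1.170, 1.175], [1.172, 1.168, 1.174])
print(metrics)
```

Because RMSE is the square root of MSE, it is expressed in the same units as the exchange rate itself, which makes it easier to compare against typical hourly or daily price moves.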

Accuracy Verification of the Collapsed Cone Convolution Algorithm for Photon Dose Calculation in Heterogeneous Media

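Authors: Guan Yumin (关玉敏); Zhou Linghong (周凌宏); Zhang Shuxu (张书旭); Zhen Xin (甄鑫); Lu Wenting (卢文婷); Yang Jun (杨俊)

Abstract: Objective: To verify the accuracy of the collapsed cone convolution (CCC) algorithm in predicting field dose distributions. Methods: The dose distribution in a homogeneous phantom was measured with the MatriXX system, and thermoluminescent dosimeters (TLD) were then used to verify experimentally the dose distributions in two different heterogeneous phantom configurations. The values calculated by the CCC and FC algorithms in the commercial Philips Pinnacle3 Version 8.0 TPS were compared with the measured values. All experiments were performed on a Varian 23EX linear accelerator with field sizes of 5 cm × 5 cm and 10 cm × 10 cm, a 6 MV photon beam, and a source-to-skin distance (SSD) of 100 cm. Results: In homogeneous media, both algorithms predicted the beam dose distribution accurately. In heterogeneous media, tissue density heterogeneity and field size had an important influence on the dose distribution, and the error between calculated and measured values was smaller for the CCC algorithm than for the FC algorithm. Conclusion: The CCC algorithm predicts the beam dose distribution more accurately.

Journal: Medical Journal of Chinese People's Liberation Army (《解放军医学杂志》)
Year (volume), issue: 2010 (035) 007
Total pages: 4 (P864-866, 874)
Keywords: dose calculation; radiotherapy treatment planning system; tissue heterogeneity correction; collapsed cone convolution algorithm
Author affiliations: School of Biomedical Engineering, Southern Medical University, Guangzhou 510515; School of Biomedical Engineering, Southern Medical University, Guangzhou 510515; Radiotherapy Center, Affiliated Cancer Hospital of Guangzhou Medical College, Guangzhou 510095; School of Biomedical Engineering, Southern Medical University, Guangzhou 510515; School of Biomedical Engineering, Southern Medical University, Guangzhou 510515; School of Biomedical Engineering, Southern Medical University, Guangzhou 510515
Language of the main text: Chinese
Chinese Library Classification: R815

Radiotherapy has become a very important means of treating tumors.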

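Radiotherapy dose calculation has traditionally relied on analytical methods, with standard data taken from measurements in homogeneous phantoms or standard water tanks. Real human tissue, however, is heterogeneous, and this heterogeneity affects the dose calculation, markedly so in some cases: ignoring tissue heterogeneity when irradiating lung tissue, for example, can lead to dose errors of 15% to 25% [1-2].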


flaskov,chandrag@ http: vims

Abstract
We address the problem of non-rigid motion and correspondence estimation in 3D images in the absence of prior domain information. A generic framework is utilized in which a solution is approached by hypothesizing correspondence and evaluating the motion models constructed under each hypothesis. We present and evaluate experimentally five algorithms that can be used in this approach. Our experiments were carried out on synthetic and real data with ground truth correspondence information.

Funding for this work was provided u9984842 and CISE CDA-9703088.

1 Introduction
Experimental comparison of algorithms for non-rigid motion and correspondence estimation is highly important. A vast amount of relevant work published in the last decade builds on heterogeneous ideas, yet no single algorithm is known to provide a robust solution under a variety of conditions. In this paper we attempt to cross-investigate a number of algorithms (some well-established, others promising recent approaches) with the aim of identifying the ideas that may lead to major improvements of current methods of non-rigid motion analysis.

Why is it desirable to further improve current motion analysis techniques? Some developing applications place far greater demands on motion analysis than current methods can meet. While specific application-dependent techniques, e.g., left-ventricular surface motion tracking [10] or cerebral cortical surface correspondence estimation [13], have proved to be very successful, in general very little is known about how to robustly recover unrestricted non-rigid motion from observations. The potential benefits of such knowledge are manifold. It may further help the analysis of 3D biomedical images by quantifying, as a motion field between corresponding points, what is currently perceived only visually. It may facilitate the segmentation of multiple motion by filtering out motions with distinct characteristics. Another interesting potential application is decreasing the bandwidth in transmission of dynamic image sequences: if a compact representation of the motion between successive images could be found, only this component would need to be transmitted instead of full images.

The goal of a robust non-rigid motion estimation algorithm can be stated as follows: in the absence of any prior information other than 3D images before and after motion, recover some meaningful compact representation of the observed motion. Let us point out the three essential requirements of this scenario:
1. Correspondence between points, or other features, in images is assumed unknown. As part of its job, the algorithm must recover the correspondence, but this is not the only objective of the algorithm.
2. No prior shape information, nor any information about the physical properties, is available.
3. The algorithm must not be limited to specific points in objects with some favorable properties.

The problem of unknown correspondence lies at the heart of non-rigid motion estimation. In some cases, it may be decoupled from the motion estimation, in that the results of an algorithm providing only correspondence can later be used by a motion estimation algorithm that assumes known correspondence. For this reason we also consider the "correspondence only" algorithms in the current investigation. The second requirement prompts us to leave out the physically-based methods as well as the methods utilizing global shape topology. The rationale here is that using a model from an inappropriate domain may lead to erroneous model estimation, which would severely hamper motion analysis. Finally, the requirement for applicability to arbitrary points rules out the techniques that look for "feature points", such as points with high curvature.

Comprehensive coverage of non-rigid motion estimation techniques can be found in two literature reviews published in the mid-90's [1, 9]. Experimental cross-evaluation of such techniques is, to our knowledge, the first of its kind. Due to space constraints and the heterogeneity of existing algorithms, we are only able to cover a small subset thereof. Nonetheless we hope that the findings of this work provide useful insight for the development of more advanced techniques.
2 Generic Framework for Non-rigid Motion and Correspondence Estimation