a competitive mean squared error approach to beamforming(翻译),波束形成
Contrast Analysis
In Neil Salkind(Ed.),Encyclopedia of Research Design.Thousand Oaks,CA:Sage.2010Contrast AnalysisHerv´e Abdi⋅Lynne J.WilliamsA standard analysis of variance(a.k.a.anova)provides an F-test,which is called an om-nibus test because it reflects all possible differences between the means of the groups analyzed by the anova.However,most experimenters want to draw conclusions more precise than “the experimental manipulation has an effect on participants’behavior.”Precise conclusions can be obtained from contrast analysis because a contrast expresses a specific question about the pattern of results of an anova.Specifically,a contrast corresponds to a prediction precise enough to be translated into a set of numbers called contrast coefficients which reflect the prediction.The correlation between the contrast coefficients and the observed group means directly evaluates the similarity between the prediction and the results.When performing a contrast analysis we need to distinguish whether the contrasts are planned or post hoc.Planned or a priori contrasts are selected before running the experiment. In general,they reflect the hypotheses the experimenter wanted to test and there are usually few of them.Post hoc or a posteriori(after the fact)contrasts are decided after the experiment has been run.The goal of a posteriori contrasts is to ensure that unexpected results are reliable.When performing a planned analysis involving several contrasts,we need to evaluate if these contrasts are mutually orthogonal or not.Two contrasts are orthogonal when their contrast coefficients are uncorrelated(i.e.,their coefficient of correlation is zero).The number of possible orthogonal contrasts is one less than the number of levels of the independent variable.Herv´e AbdiThe University of Texas at DallasLynne J.WilliamsThe University of Toronto ScarboroughAddress correspondence to:Herv´e AbdiProgram in Cognition and Neurosciences,MS:Gr.4.1,The University of Texas at Dallas,Richardson,TX75083–0688,USAE-mail:herve@ /∼herve2Contrast Analysis All contrasts are evaluated using the same general procedure.First,the contrast is for-malized as a set of contrast coefficients(also called contrast weights).Second,a specific F ratio(denoted Fψ)is computed.Finally,the probability associated with Fψis evaluated. This last step changes with the type of analysis performed.1How to express a research hypothesis as a contrastWhen a research hypothesis is precise,it is possible to express it as a contrast.A research hypothesis,in general,can be expressed as a shape,a configuration,or a rank ordering of the experimental means.In all of these cases,we can assign numbers which will reflect the predicted values of the experimental means.These numbers are called contrast coefficients when their mean is zero.To convert a set of numbers into a contrast,it suffices to subtract their mean from each of them.Often,for convenience we will express contrast coefficients with integers.For example,assume that for a4-group design,a theory predicts that thefirst and second groups should be equivalent,the third group should perform better than these two groups and the fourth group should do better than the third with an advantage of twice the gain of the third over thefirst and the second.When translated into a set of ranks this prediction gives:C1C2C3C4Mean11242After subtracting the mean,we get the following contrast:C1C2C3C4Mean−1−1020In case of doubt,a good heuristic is to draw the predicted configuration of results,and then to represent the position of the means by ranks.A BDI&W ILLIAMS3 2A priori(planned)orthogonal contrasts2.1How to correct for multiple testsWhen several contrast are evaluated,several statistical tests are performed on the same data set and this increases the probability of a Type I error(i.e.,rejecting the null hypothesis when it is true).In order to control the Type I error at the level of the set(a.k.a.the family) of contrasts one needs to correct theαlevel used to evaluate each contrast.This correction for multiple contrasts can be done using theˇSid`a k equation,the Bonferroni(a.k.a.Boole, or Dunn)inequality or the Monte-Carlo technique.2.1.1ˇSid`a k and BonferroniThe probability of making as least one Type I error for a family of orthogonal(i.e.,statisti-cally independent)C contrasts isα[P F]=1−(1−α[P C])C.(1) withα[P F]being the Type I error for the family of contrasts;andα[P C]being the Type I error per contrast.This equation can be rewritten asα[P C]=1−(1−α[P F])1 C.(2) This formula,called theˇSid`a k equation,shows how to correct theα[P C]values used for each contrast.Because theˇSid`a k equation involves a fractional power,ones can use an approximation known as the Bonferroni inequality,which relatesα[P C]toα[P F]byα[P C]≈α[P F]C.(3)ˇSid`a k and Bonferroni are related by the inequalityα[P C]=1−(1−α[P F])1 C≥α[P F]C.(4)They are,in general,very close to each other.As can be seen,the Bonferroni inequality is a pessimistic estimation.ConsequentlyˇSid`a k should be preferred.However,the Bonferroni inequality is more well known,and hence,is used and cited more often.2.1.2Monte-CarloThe Monte-Carlo technique can also be used to correct for multiple contrasts.The Monte Carlo technique consists of running a simulated experiment many times using random data,4Contrast Analysis Table1:Results of a Monte-Carlo simulation.Numbers of Type I errors when performing C=5contrasts for10,000analyses of variance performed on a6group design when the H0is true.How to read the table?For example,192families over10,000have2 Type I errors,this gives2×192=384Type I errors.Number of families X:Number of Type1Number ofwith X Type I errors errors per family Type I errors7,868001,90711,9071922384203601345205010,0002,403with the aim of obtaining a pattern of results showing what would happen just on the basis of chance.This approach can be used to quantifyα[P F],the inflation of Type I error due to multiple testing.Equation2can then be used to setα[P C]in order to control the overall value of the Type I error.As an illustration,suppose that6groups with100observations per group are created with data randomly sampled from a normal population.By construction,the H0is true(i.e.,all population means are equal).Now,construct5independent contrasts from these6groups. For each contrast,compute an F-test.If the probability associated with the statistical index is smaller thanα=.05,the contrast is said to reach significance(i.e.,α[P C]is used).Then have a computer redo the experiment10,000times.In sum,there are10,000experiments, 10,000families of contrasts and5×10,000=50,000contrasts.The results of this simulation are given in Table1.Table1shows that the H0is rejected for2,403contrasts over the50,000contrasts actually performed(5contrasts times10,000experiments).From these data,an estimation ofα[P C] is computed as:α[P C]=number of contrasts having reached significance total number of contrasts=2,40350,000=.0479.(5)This value falls close to the theoretical value ofα=.05.It can be seen also that for7,868experiments no contrast reached significance.Equiva-lently for2,132experiments(10,000−7,868)at least one Type I error was made.From these data,α[P F]can be estimated as:A BDI&W ILLIAMS5α[P F]=number of families with at least1Type I error total number of families=2,13210,000=.2132.(6)This value falls close to the theoretical value given by Equation1:α[P F]=1−(1−α[P C])C=1−(1−.05)5=.226.2.2Checking the orthogonality of two contrastsTwo contrasts are orthogonal(or independent)if their contrast coefficients are uncorrelated. Recall that contrast coefficients have zero sum(and therefore a zero mean).Therefore,two contrasts whose A contrast coefficients are denoted C a,1and C a,2,will be orthogonal if and only if:Aa=1C a,i C a,j=0.(7)2.3Computing sum of squares,mean square,and FThe sum of squares for a contrast can be computed using the C a coefficients.Specifically, the sum of squares for a contrast is denoted SSψ,and is computed asSSψ=S(∑C a M a.)2∑C2a(8)where S is the number of subjects in a group.Also,because the sum of squares for a contrast has one degree of freedom it is equal to the mean square of effect for this contrast:MSψ=SSψdfψ=SSψ1=SSψ.(9)The Fψratio for a contrast is now computed asFψ=MSψMS error.(10)6Contrast Analysis 2.4Evaluating F for orthogonal contrastsPlanned orthogonal contrasts are equivalent to independent questions asked to the data. Because of that independence,the current procedure is to act as if each contrast were the only contrast tested.This amounts to not using a correction for multiple tests.This procedure gives maximum power to the test.Practically,the null hypothesis for a contrast is tested by computing an F ratio as indicated in Equation10and evaluating its p value using a Fisher sampling distribution withν1=1andν2being the number of degrees of freedom of MS error[e.g.,in independent measurement designs with A groups and S observations per groupν2=A(S−1)].2.5An exampleThis example is inspired by an experiment by Smith(1979).The main purpose in this experi-ment was to show that being in the same mental context for learning and for test gives better performance than being in different contexts.During the learning phase,subjects learned a list of80words in a room painted with an orange color,decorated with posters,paintings and a decent amount of paraphernalia.Afirst memory test was performed to give subjects the impression that the experiment was over.One day later,subjects were unexpectedly re-tested for their memory.An experimenter asked them to write down all the words of the list they could remember.The test took place in5different experimental conditions.Fifty subjects(ten per group)were randomly assigned to one of thefive experimental groups.The five experimental conditions were:1.Same context.Subjects are tested in the same room in which they learned the list.2.Different context.Subjects are tested in a room very different from the one in which theylearned the list.The new room is located in a different part of the campus,is painted grey and looks very austere.3.Imaginary context.Subjects are tested in the same room as subjects from Group2.Inaddition,they are told to try to remember the room in which they learned the list.In order to help them,the experimenter asks them several questions about the room and the objects in it.4.Photographed context.Subjects are placed in the same condition as Group3,and,inaddition,they are shown photos of the orange room in which they learned the list.5.Placebo context.Subjects are in the same condition as subjects in Group2.In addition,before starting to try to recall the words,they are askedfirst to perform a warm-up task, namely,to try to remember their living room.The data and anova results of the replication of Smith’s experiment are given in the Tables2 and3.2.5.1Research hypotheses for contrast analysisSeveral research hypotheses can be tested with Smith’s experiment.Suppose that the exper-iment was designed to test these hypotheses:A BDI&W ILLIAMS7Table2:Data from a replication of an experiment by Smith(1979).The dependent variable is the number of words recalled.Experimental ContextGroup1Group2Group3Group4Group5Same Different Imagery Photo Placebo251114258262115152017929231015610217147121815171422247141214141204202717117221211211912114Y a.180110170190100M a.1811171910M a.−M..3−424−5∑(Y as−M a.)2218284324300314Table3:anova table for a replication of Smith’s experiment(1979).Source df SS MS F Pr(F)Experimental4700.00175.005.469∗∗.00119Error451,440.0032.00Total492,140.00–Research Hypothesis1.Groups for which the context at test matches the context during learning(i.e.,is the same or is simulated by imaging or photography)will perform better than groups with a different or placebo contexts.–Research Hypothesis2.The group with the same context will differ from the group with imaginary or photographed contexts.–Research Hypothesis3.The imaginary context group differs from the photographed con-text group.–Research Hypothesis4.The different context group differs from the placebo group.2.5.2ContrastsThe four research hypotheses are easily transformed into statistical hypotheses.For example, thefirst research hypothesis is equivalent to stating the following null hypothesis:The means of the population for groups1.,3.,and4.have the same value as the means of the population for groups2.,and5..8Contrast AnalysisTable4:Orthogonal contrasts for the replication of Smith(1979).contrast Gr.1Gr.2Gr.3Gr.4Gr.5∑C aψ1+2−3+2+2−30ψ2+20−1−100ψ300+1−100ψ40+100−10This is equivalent to contrasting groups1.,3.,4.and groups2.,5..Thisfirst contrast is denotedψ1:ψ1=2µ1−3µ2+2µ3+2µ4−3µ5.The null hypothesis to be tested isH0,1∶ψ1=0Thefirst contrast is equivalent to defining the following set of coefficients C a:C aGr.1Gr.2Gr.3Gr.4Gr.5a+2−3+2+2−30Note that the sum of the coefficients C a is zero,as it should be for a contrast.Table4 shows all4contrasts.2.5.3Are the contrast orthogonal?Now the problem is to decide if the contrasts constitute an orthogonal family.We check that every pair of contrasts is orthogonal by using Equation7.For example,Contrasts1and2 are orthogonal becauseA=5C a,1C a,2=(2×2)+(−3×0)+(2×−1)+(2×−1)+(−3×0)+(0×0)=0.a=12.5.4F testThe sum of squares and Fψfor a contrast are computed from Equations8and10.For example,the steps for the computations of SSψ1are given in Table5:A BDI&W ILLIAMS9Table5:Steps for the computation of SSψ1of Smith(1979).Group M a C a C a M a C2a118.00+2+36.004211.00−3−33.009317.00+2+34.004419.00+2+38.004510.00−3−30.009045.0030SSψ1=S(∑C a M a.)2∑C2a=10×(45.00)230=675.00MSψ1=675.00Fψ1=MSψ1MS error=675.0032.00=21.094.(11)The significance of a contrast is evaluated with a Fisher distribution with1and A(S−1)= 45degrees of freedom,which gives a critical value of4.06forα=.05(7.23forα=.01).The sum of squares for the remaining contrasts are SSψ.2=0,SSψ.3=20,and SSψ.4=5with1 and A(S−1)=45degrees of freedom.Therefore,ψ2,ψ3,andψ4are non-significant.Note that the sums of squares of the contrasts add up to SS experimental.That is:SS experimental=SSψ.1+SSψ.2+SSψ.3+SSψ.4=675.00+0.00+20.00+5.00=700.00.When the sums of squares are orthogonal,the degrees of freedom are added the same way as the sums of squares are.This explains why the maximum number of orthogonal contrasts is equal to number of degrees of freedom of the experimental sum of squares.3A priori(planned)non-orthogonal contrastsSo,orthogonal contrasts are relatively straightforward because each contrast can be evalu-ated on its own.Non-orthogonal contrasts,however,are more complex.The main problem is to assess the importance of a given contrast conjointly with the other contrasts.There are10Contrast Analysis currently two(main)approaches to this problem.The classical approach corrects for multi-ple statistical tests(e.g.,using aˇSid`a k or Bonferroni correction),but essentially evaluates each contrast as if it were coming from a set of orthogonal contrasts.The multiple regression (or modern)approach evaluates each contrast as a predictor from a set of non-orthogonal predictors and estimates its specific contribution to the explanation of the dependent vari-able.The classical approach evaluates each contrast for itself,whereas the multiple regression approach evaluates each contrast as a member of a set of contrasts and estimates the spe-cific contribution of each contrast in this set.For an orthogonal set of contrasts,the two approaches are equivalent.3.1The classical approachSome problems are created by the use of multiple non-orthogonal contrasts.Recall that the most important one is that the greater the number of contrasts,the greater the risk of a Type I error.The general strategy adopted by the classical approach to take this problem is to correct for multiple testing.3.1.1ˇSid`a k and Bonferroni corrections for non-orthogonal contrastsWhen a family of contrasts are nonorthogonal,Equation1gives a lower bound forα[P C] (cf.ˇSid`a k,1967;Games,1977).So,instead of having the equality,the following inequality, called theˇSid`a k inequality,holdsα[P F]≤1−(1−α[P C])C.(12) This inequality gives an upper bound forα[P F],therefore the real value ofα[P F]is smaller than its estimated value.As previously,we can approximate theˇSid`a k inequality by Bonferroni asα[P F]<Cα[P C].(13) And,as previously,ˇSid`a k and Bonferroni are linked to each other by the inequalityα[P F]≤1−(1−α[P C])C<Cα[P C].(14)3.1.2An example:Classical approachLet us go back to Smith’s(1979)study(see Table2).Suppose that Smith wanted to test these three hypotheses:–Research Hypothesis1.Groups for which the context at test matches the context during learning will perform better than groups with different contexts;Table6:Non-orthogonal contrasts for the replication of Smith(1979).contrast Gr.1Gr.2Gr.3Gr.4Gr.5∑C aψ12−322−30ψ233−2−2−20ψ31−41110Table7:Fψvalues for the nonorthogonal contrasts from the replication of Smith(1979).Fψp(Fψ)r Y.ψr2Y.ψψ1.9820.964321.0937<.0001ψ2−.1091.01190.2604.6123ψ3.5345.2857 6.2500.0161–Research Hypothesis2.Groups with real contexts will perform better than those with imagined contexts;–Research Hypothesis3.Groups with any context will perform better than those with no context.These hypotheses can easily be transformed into the set of contrasts given in Table6. The values of Fψwere computed with Equation10(see also Table3)and are in shown in Table7along with their p values.If we adopt a value ofα[P F]=.05,aˇSid`a k correction (from Equation2)will entail evaluating each contrast at theαlevel ofα[P C]=.0170 (Bonferroni will give the approximate value ofα[P C]=.0167).So,with a correction for multiple comparisons we will conclude that Contrasts1and3are significant.3.2Multiple regression approachAnova and multiple regression are equivalent if we use as many predictors for the multiple regression analysis as the number of degrees of freedom of the independent variable.An obvious choice for the predictors is to use a set of contrasts coefficients.Doing so makes contrast analysis a particular case of multiple regression analysis.When used with a set of orthogonal contrasts,the multiple regression approach gives the same results as the anova based approach previously described.When used with a set of non-orthogonal contrasts, multiple regression quantifies the specific contribution of each contrast as the semi-partial coefficient of correlation between the contrast coefficients and the dependent variable.We can use the multiple regression approach for non-orthogonal contrasts as long as the following constraints are satisfied:1.There are no more contrasts than the number of degrees of freedom of the independentvariable;2.The set of contrasts is linearly independent(i.e.,not multicollinear).That is,no contrastcan be obtained by combining the other contrasts.3.2.1An example:Multiple regression approachLet us go back once again to Smith’s(1979)study of learning and recall contexts.Suppose we take our three contrasts(see Table6)and use them as predictors with a standard multiple regression program.We willfind the following values for the semi-partial correlation between the contrasts and the dependent variable:ψ1∶r2Y.C a,1 C a,2C a,3=.1994ψ2∶r2Y.C a,2 C a,1C a,3=.0000ψ3∶r2Y.C a,3 C a,1C a,2=.0013,with r2Y.C a,1 C a,2C a,3being the squared correlation ofψ1and the dependent variable with theeffects ofψ2andψ3partialled out.To evaluate the significance of each contrast,we compute an F ratio for the corresponding semi-partial coefficients of correlation.This is done using the following formula:F Y.Ca,i C a,k C a, =r2Y.C a,i C a,k C a,1−r2Y.A×(df residual).(15)This results in the following F ratios for the Smith example:ψ1∶F Y.Ca,1 C a,2C a,3=13.3333,p=0.0007;ψ2∶F Y.Ca,2 C a,1C a,3=0.0000,p=1.0000;ψ3∶F Y.Ca,3 C a,1C a,2=0.0893,p=0.7665.These F ratios follow a Fisher distribution withν1=1andν2=45degrees of freedom.F critical=4.06whenα=.05.In this case,ψ1is the only contrast reaching significance(i.e., with Fψ>F critical).The comparison with the classic approach shows the drastic differences between the two approaches.4A posteriori(post-hoc)contrastsFor a posteriori contrasts,the family of contrasts is composed of all the possible contrasts even if they are not explicitly made.Indeed,because we choose the contrasts to be made a posteriori,this implies that we have implicitly made and judged uninteresting all the possible contrasts that have not been made.Hence,whatever the number of contrasts actually performed,the family is composed of all the possible contrasts.This number grows very fast: A conservative estimate indicates that the number of contrasts which can be made on A groups is equal to1+{[(3A−1) 2]−2A}.(16)So,using aˇSid`a k or Bonferroni approach will not have enough power to be useful.4.1Scheff´e’s testScheff´e’s test was devised to test all possible contrasts a posteriori while maintaining the overall Type I error level for the family at a reasonable level,as well as trying to have a conservative but relatively powerful test.The general principle is to insure that no discrepant statistical decision can occur.A discrepant decision would occur if the omnibus test would fail to reject the null hypothesis,but one a posteriori contrast could be declared significant.In order to avoid such a discrepant decision,the Scheff´e approachfirst tests any contrast as if it were the largest possible contrast whose sum of squares is equal to the experimental sum of squares(this contrast is obtained when the contrast coefficients are equal to the deviations of the group means to their grand mean);and second makes the test of the largest contrast equivalent to the anova omnibus test.So,if we denote by F critical,omnibus the critical value for the anova omnibus test(performed on A groups),the largest contrast is equivalent to the omnibus test if its Fψis tested against a critical value equal toF critical,Scheff´e=(A−1)×F critical,omnibus.(17) Equivalently,Fψcan be divided by(A−1)and its probability can be evaluated with a Fisher distribution withν1=(A−1)andν2being equal to the number of degrees of freedom of the mean square error.Doing so makes it impossible to reach a discrepant decision.4.1.1An example:Scheff´eSuppose that the Fψratios for the contrasts computed in Table7were obtained a posteriori. The critical value for the anova is obtained from a Fisher distribution withν1=A−1=4 andν2=A(S−1)=45.Forα=.05this value is equal to F critical,omnibus=2.58.In order to evaluate if any of these contrasts reaches significance,we need to compare them to the critical value ofF critical,Scheff´e=(A−1)×F critical,omnibus=4×2.58=10.32.With this approach,only thefirst contrast is considered significant.Related entriesAnalysis of Variance,Bonferonni correction,Post-Hoc comparisons.Further Readings1.Abdi,H.,Edelman,B.,Valentin,D.,&Dowling,W.J.(2009).Experimental Design and Analysis for Psychology.Oxford:Oxford University Press.2.Rosenthal,R.,&Rosnow,R.L.(2003).Contrasts and effect sizes in behavioral research:A correlational approach.Boston:Cambridge University Press.。
地质雷达和电法的英文文献3
Electrical resistivity tomography technique for landslide investigation:A reviewA.Perrone ⁎,penna,S.PiscitelliInstitute of Methodologies for Environmental Analysis,CNR,Italya b s t r a c ta r t i c l e i n f o Article history:Received 18September 2013Accepted 8April 2014Available online 18April 2014Keywords:ReviewElectrical resistivity tomography 2D 3DTime-lapse LandslidesIn the context of in-situ geophysical methods the Electrical Resistivity Tomography (ERT)is widely used for the near-surface exploration of landslide areas characterized by a complex geological setting.Over the last decade the technological improvements in field-data acquisition systems and the development of novel algorithms for tomographic inversion have made this technique more suitable for studying landslide areas,with a particular at-tention to the rotational,translational and earth-flow slides.This paper aims to present a review of the main re-sults obtained by applying ERT for the investigation of a wide spectrum of landslide phenomena which affected various geological formations and occurred in different geographic areas.In particular,signi ficant and represen-tative results obtained by applying 2D and 3D ERT are analyzed highlighting the advantages and drawbacks of this geophysical technique.Finally,recent applications of the time-lapse ERT (tl-ERT)for landslide investigation and the future scienti fic challenges to be faced are presented and discussed.©2014Elsevier B.V.All rights reserved.Contents 1.Introduction ...............................................................652.The ERT method for landslide investigation .................................................662.1.The 2D ERT imaging ........................................................672.2.The 3D ERT imaging ........................................................722.3.The time-lapse ERT monitoring ...................................................723.Discussion on the ERT advantages and drawbacks for landslide investigation ..................................774.Conclusions ...............................................................79Acknowledgments ..............................................................79References ................................ (79)1.IntroductionLandslides are complex geological phenomena with a high socio-economical impact also in terms of loss of lives and damage.Their inves-tigation usually requires a multidisciplinary approach,based on the in-tegration of satellite,airborne and ground-based sensing technologies.Each technique allows the study of speci fic triggering factors and/or particular physical features,characterizing the landslide body compared with the material not affected by the movement.Airborne and satellite methods (i.e.digital aerophotogrammetry,GPS,differential interferometric SAR,etc.)can provide information on the surface characteristics of the investigated slope,such as geomorpho-logical features,the areal extension of the landslide body,super ficial displacement and velocity (Catani et al.,2005;Squarzoni et al.,2005;Glenn et al.,2006;Lanari et al.,2007;Baldi et al.,2008;Roering et al.,2009;Cascini et al.,2010;Strozzi et al.,2010;Ventura et al.,2011;Guzzetti et al.,2012),without giving any information on subsoil characteristics.Direct ground-based techniques (i.e.piezometer,inclinometer,labo-ratory tests,etc.)give true information on the mechanical and hydraulic characteristics of the terrains affected by the landslide but in a speci fic point of the subsoil (Petley et al.,2005;Marcato et al.,2012).Earth-Science Reviews 135(2014)65–82⁎Corresponding author at:CNR-IMAA,c.da S.Loja,85050Tito Scalo PZ.Tel.:+390971427282.E-mail address:angela.perrone@r.it (A.Perrone)./10.1016/j.earscirev.2014.04.0020012-8252/©2014Elsevier B.V.All rightsreserved.Contents lists available at ScienceDirectEarth-Science Reviewsj o u r n a l h om e p a g e :w w w.e l s e v i e r.c o m /l o c a t e /e a r s c i r e vIn-situ geophysical techniques are able to measure physical param-eters directly or indirectly linked with the lithological,hydrological and geotechnical characteristics of the terrains related to the movement (McCann and Foster,1990;Hack,2000;Jongmans and Garambois, 2007).These techniques,less invasive than the previous ones,provide information integrated on a greater volume of the soil thus overcoming the point-scale feature of classic geotechnical measurements.Among the in-situ geophysical techniques,the Electrical Resistivity Tomogra-phy(ERT)has been increasingly applied for landslide investigation (McCann and Foster,1990;Hack,2000;Jongmans and Garambois, 2007;references in Table1,3and5).This technique is based on the measure of the electrical resistivity and can provide2D and3D images of its distribution in the subsoil.The frequent use of this method in the study of landslide areas is mainly due to the factors that can affect resistivity and its extreme var-iability in space and time domains.Indeed,this parameter is mostly in-fluenced by the mineralogy of the particles,the ground water content, the nature of electrolyte,the porosity and the intrinsic matrix resistivity with weathering and alteration(Archie,1942;Reynolds,1997;Park and Kim,2005;Bievre et al.,2012).Some of these factors,especially the change of water content and the consequent increase in pore water pressures,can play an important role in the triggering mechanisms of a landslide(Bishop,1960;Morgenstern and Price,1965).This paper aims at presenting the current state of-the-art on the ap-plication of ERT for landslide investigation,mainly considering the tech-nological and methodological improvements of this technique.The work is focused on the scientific papers published in international journals since2000and available online.In particular,this study pre-sents the results offield geophysical surveys based on2D,3D and time-lapse ERT carried out for the investigation of different typologies of landslide,also considering the acquisition systems and the inversion algorithms.The main advantages and drawbacks related to the applica-tion of the ERT method are identified and discussed.Finally,the future challenges for a better use of the ERT in the landslide investigation and monitoring are presented.2.The ERT method for landslide investigationThe Electrical Resistivity Tomography is an active geophysical meth-od that can provide2D or3D images of the distribution of the electrical resistivity in the subsoil.The analysis and interpretation of these electri-cal images allow the identification of resistivity contrasts that can be mainly due to the lithological nature of the terrains and the water con-tent variation.The in-field procedure includes the use of a multi-electrode cable, laid out on the ground,to which a number of steel electrodes are con-nected at afixed distance according to a specific electrode configuration. The electrodes are used both for the injection of the current(I)in the subsoil and the measurement of the voltage(V).Knowing the I and V values and the geometrical coefficient depending on the electrode con-figuration used,the apparent resistivity values characterizing the sub-soil investigated can be calculated.These values are positioned at pseudo-depths according to a geometrical reconstruction(Edwards, 1977),which results in a pseudo-section representing an approximate picture of the true subsurface resistivity distribution(Hack,2000).To obtain an electrical resistivity tomography,the apparent resistiv-ity values must be inverted by using inversion routines.The best known and most applied algorithm is Res2Dinv(Loke and Barker,1996;Loke et al.,2003)based on a smoothness-constrained least-squares method which allows to obtain two-dimensional sections throughfinite differ-ences orfinite elements computations,taking into account the topo-graphic corrections.To evaluate thefit of the resistivity model obtained,the root mean square error(RMS)can be considered.This error provides the percentage difference between the measured values and those calculated;so,the correspondence between thefield data and the ones of the model is higher when the error is lower.Although Res2Dinv is the most widely applied software,many other methods are currently available for the electrical resistivity data inversion(see Section2.1and Table1).Thefirst applications of ERT in the study of landslides(Gallipoli et al., 2000;Lapenna et al.,2003)involved the use of manual systems charac-terized by separated energization and measurement devices and single cables.Due to the absence of multi-core cables,the operators used four separate insulated cables connected to four metal electrodes,two of steel for the injection of current and the other two non-polarizable for the measurement of the voltage.The use of manual equipment resulted in rather slow data acquisition;moreover,the possibility or the necessi-ty to keep the energization and the measurement systems separate mainly favored the use of dipole–dipole configuration which is more suitable for the investigation of vertical boundaries(landslide lateral boundaries,source area,fault)than for the identification of the horizon-tal ones(sliding surface,lithological contact).Technological improvements,which produced more compact and portable equipments and faster acquisition systems,as well as the de-velopment of novel software for data processing and the creation of 2D and3D tomographic images of the resistivity distribution in the sub-soil have greatly increased the applicability of this technique for the study of landslide areas.Over the last15years the number of systems for the resistivity im-aging survey has considerably grown.Two categories of systems are now available,the static and the dynamic.In the static one many elec-trodes are connected to a multi-electrode cable and planted into the ground during the survey.The dynamic systems use a small number of nodes but move the entire equipment to obtain a wide coverage (Loke,2013).The static systems are usually used for the investigation of landslides.In particular,the introduction of static multi-electrode sys-tems(Barker,1981;Griffiths and Turnbull,1985;Griffiths et al.,1990;Li and Oldenburg.,1992;Dahlin,1993,1996;Dahlin and Bernstone,1997; Stummer,2002),mainly using single channel data acquisition,has greatly reduced the acquisition time and also improved some logistic as-pects.These systems allow the use of a large number of electrodes with an increase in the profile length and the automatic change of spatial res-olution and investigation depth.They have made it easier to carry out 2D ERT on landslides and obtain a3D geoelectrical model of the subsoil, particularly where the logistic conditions are advantageous(small-sized landslides and slightly steep slopes).The development of algorithms for the inversion of apparent resistiv-ity data(Dey and Morrison,1979;Barker,1992;Oldenburg et al.,1993; Oldenburg and Li,1994;Tsourlos,1995;LaBrecque et al.,1996;Loke and Barker,1996;Dahlin,2001and reference therein;Loke et al.,2003) made it easier to analyze the data and generate2D and3D images useful for the characterization of the slope investigated so as to obtain informa-tion on the geometry of a landslide body(i.e.the slide material thickness, the location of areas characterized by a higher water content,the pres-ence of potentially unstable areas,etc.).From a temporal point of view, the information obtained can be considered static being related only to the day of acquisition.Resistivity data are usually acquired after the occur-rence of an event and give an image of that moment,without providing any indications on the dynamic evolution affecting the slope investigated. Very recently,the development of static multi-channel measuring sys-tems,able to simultaneously acquire a number of potential measure-ments for a single pair of current electrodes,have significantly reduced the acquisition time.These systems can be set up to provide ERT at specif-ic times during the day,and they can also repeat the measurement in order to give ERT images at very close time intervals called time-lapse ERT(tl-ERT).This is extremely important as it allows the exploitation of ERT not only to define the geometrical characteristics of the landslide body or the slope investigated but also to monitor a potentially unstable area.The literature reports some examples of tl-ERT applications in land-slide areas with the main aim to obtain information on the water content change(see Section2.3).Obviously,although some software for the pro-cessing of data continuously acquired has already been developed,there66 A.Perrone et al./Earth-Science Reviews135(2014)65–82is still a need to improve this aspect and especially to quantify the rela-tionship between the variations of the electrical resistivity as a function of changes in hydrological parameters.2.1.The2D ERT imagingSince2000a lot of papers dealing with the application of2D ERT for landslide investigation have been published.For each paper Table1 specifies the year of publication,the name of the authors and the jour-nal,the typology of landslide investigated,the lithological nature of the material involved in the movement and the country affected by the event.The majority of the case histories considered(73%)are located in Europe,a lower percentage(24%)in Asia and a very low percentage (3%)in America(Fig.1).No example has been found for Oceania, while only few examples of the Vertical Electrical Sounding(VES)appli-cation for the investigation of unstable areas(Ayenew and Barbieri, 2005;Epada et al.,2012)are available for Africa.The65papers analyzed deal with different landslide typologies.Two of the papers are reviews(Hack,2000;Jongmans and Garambois,2007) and other three do not include information on the type of landslide (Otto and Sass,2006;Yilmaz,2007;Mondal et al.,2008),therefore, only60papers have been considered for the classification of the phe-nomenon typology.In particular,as also shown in Table2,twelve(20%)papers concern complex landslides(slides evolving in earth-flow;or retrogressive land-slides,etc.)(Gallipoli et al.,2000;Lapenna et al.,2003;Bichler et al., 2004;Perrone et al.,2004;Lapenna et al.,2005;Park and Kim,2005; Colangelo et al.,2008;Naudet et al.,2008;Panek et al.,2008;Sass et al.,2008;Jongmans et al.,2009;Ogusnsuyi,2010),nineteen(32%) study translational or rotational slides(Godio and Bottino,2001; Meric et al.,2005;Drahor et al.,2006;Friedel et al.,2006;Perrone et al.,2006;Göktürkler et al.,2008;Lee et al.,2008;Marescot et al., 2008;Schrott and Sass,2008;Erginal et al.,2009;Bekler et al.,2011; de Bari et al.,2011;Grandjean et al.,2011;Le Roux et al.,2011;Bièvre et al.,2012;Hibert et al.,2012;Ravindran and Ramanujam,2012; Sastry and Mondal,2013;Shan et al.,2013),six(10%)analyze rockfalls and rockslides(Batayneh et al.,2002;Godio et al.,2006;Ganerød et al., 2008;Heincke et al.,2010;Socco et al.,2010;Oppikofer et al.,2011), eight(13%)investigate deep seated landslides(Lebourg et al.,2005; Jomard et al.,2007a,b;Van Den Eeckhaut et al.,2007;Jomard et al., 2010;Migońet al.,2010;Tric et al.,2010;Zerathe and Lebourg,2012), twelve(20%)consider debris,earthflows or shallow landslides (Havenith et al.,2000;Jongmans et al.,2000;Demoulin et al.,2003; Grandjean et al.,2006;Perrone et al.,2008;Piegari et al.,2009; Schmutz et al.,2009;Chambers et al.,2011;Carpentier et al.,2012; Chang et al.,2012;Mainsant et al.,2012;Ravindran and Prabhu, 2012),and three(5%)focus on quick clay slides(Lundstrom et al., 2009;Donohue et al.,2012;Solberg et al.,2012).No examples of topples and lateral spread have been found.To define the resistive characteristics of the material involved in the landslides,63papers,excluding the reviews,have been analyzed.In par-ticular,in41case studies(65%)the slide material is conductive,in14case studies(22%)it is resistive and in the remaining8(13%)it is not well de-fined(Table2).This percentage distribution is mainly due both to the clayey andflyschoid nature of the material involved in the landslides and the high content of water that usually characterize landslide areas.Table1also reports the information related to the acquisition sys-tems,the electrode configuration and inversion software used by each team of authors.As for the acquisition systems,the different models of the IRIS-Instruments()are found to be the most widely used among the available commercial tools,in addi-tion to:i)ABEM Lund Imaging System(http://www.abem.se),ii) GeoTomo of Geolog(http://www.geolog2000.de),iii)AGI-SuperSting (),iv)OYO McOHM Profiler-4System(http:// www.oyo.co.jp/english.html),v)Campus Tigre(/files/index.html),vi)Multi Function Digital DC Resis-tivity IP/Meter(/fp745352/Multi-Funciton-Digital-DC-Resistivity-IP-Meter.html).They are static acquisition sys-tems usually working by using a multi-electrode cable and measuring a voltage only on a single pair of electrodes.As regards the arrays,dipole–dipole(DD)is the most used electrode configuration,followed by Wenner(W)and Wenner–Schlumberger (WS);only few examples using pole–pole(PP),pole–dipole(PD), multi-gradient(MG)and Wenner-alpha(W-α)electrode configuration can be found.In most cases,the authors use PP and PD array to study complex deep seated landslides(Lebourg et al.,2005;Jomard et al., 2007a,b,2010;Tric et al.,2010;Zerathe and Lebourg,2012)so as to reach deeper investigation depths.In order to highlight vertical structures some authors prefer to use the DD configuration(Godio and Bottino,2001;Lebourg et al.,2005; Godio et al.,2006;Perrone et al.,2006;Colangelo et al.,2008;Naudet et al.,2008;Perrone et al.,2008)also in combination with other config-urations to study deep and complex landslides(Lebourg et al.,2005; Jomard et al.,2007a,2010;Tric et al.,2010).The W and WS arrays are used to characterize horizontal discontinuities(Colangelo et al.,2008; Perrone et al.,2008;de Bari et al.,2011)and,in the last examples (since2012),especially to investigate shallow and non-complex landslides.Sometimes,different array configurations are used to measure resis-tivity data along the same profile in order to compare the resistivity im-ages obtained and overcome the intrinsic limitations of each configuration(Godio and Bottino,2001;Lebourg et al.,2005;Friedel et al.,2006;Godio et al.,2006;Perrone et al.,2006;Jomard et al., 2007a;Van Den Eeckhaut et al.,2007;Colangelo et al.,2008;Ganerod et al.,2008;Naudet et al.,2008;Perrone et al.,2008;Heincke et al., 2010;Jomard et al.,2010;Tric et al.,2010;de Bari et al.,2011; Grandjean et al.,2011).The resistivity distribution obtained with the different configurations is often proved to be comparable and the one showing the lowest RMS is generally reported(Van Den Eeckhaut et al.,2007).Friedel et al.(2006)show a quantitative comparison be-tween the results obtained using W,WS and DD configurations along the same profile.The authors point out the difference of resolution and sensitivity of each single array.All the models obtained have the same basic features,which indicates a high data quality and a stable in-version procedure.The authors conclude that in their specific case study,the best compromise between resolution and measurement time is represented by the joint inversion of WS+DD data set.Generally,to invert the data the authors mainly apply the RES2Dinv algorithm proposed by Loke and Barker(1996).For the same aim,Park and Kim(2005)use the DIPRO algorithm(Hee Song Geotek,2002),Meric et al.(2005)the DCIP2D(UBC-GIF2001), based on subspace methods(Oldenburg et al.,1993);Yilmaz (2007)the IP2DI(Wannamaker,1992);Gokturkler et al.(2008) the DC2DInvRes(Günther,2007);Heincke et al.(2010)the BERT al-gorithm(Günther et al.,2006)and Chang et al.(2012)the Earth Im-ager TM2D(AGI,2009).The main information obtained by applying the2D ERT technique helped the authors to define the geological setting of the investigated subsoil,to reconstruct the geometry of landslide body,to estimate the thickness of sliding material,to locate the possible sliding surface and lateral boundaries of the landslide,to characterize fractures or tectonic elements that could bring about an event,etc.(Fig.2).In some cases, ERT was also applied with the aim to evaluate the groundwater condi-tions,to locate areas with a high water content,to verify the network of water drainage,to study the groundwater circulation and storage within an unstable area(Perrone et al.,2004;Lapenna et al.,2005; Grandjean et al.,2006;Jomard et al.,2007a,b;Yilmaz,2007; Colangelo et al.,2008;Gokturkler et al.,2008;Marescot et al.,2008; Heickne et al.,2010;McClymont et al.,2010;Langston et al.,2011).In many of the case studies reported,ERT are compared with other geophysical methods,such as seismics and Ground Penetrating Radar67A.Perrone et al./Earth-Science Reviews135(2014)65–82GPR,or stratigraphical and hydrological data (Table 1last 2columns)in order to validate and calibrate the resistivity results.Among the geophysical methods,the ERT and seismic tomography combination proves to be the most successful (Fig.3).The joint applica-tion of GPR,ERT and seismic tomography seems to solve and overcome the resolution problems of each single method.Indeed,the GPR pro-vides more useful information on the shallowest layers (Sass et al.,2008),ERT on the intermediate layers and the seismic on the deepest ones (Bichler et al.,2004).If the investigated material is very wet,the seismic method can work better than ERT,providing information on the displacement material (Jongmans et al.,2009;Le Roux et al.,2011).Literature reports very few examples of ERT combined with In-duced Polarization (IP)used for the discrimination of clayey material from the matrix or for a better interpretation of ERT (Marescot et al.,2008;Sastry et al.,2013).2.2.The 3D ERT imagingLandslides are volumetric targets and their reconstruction and char-acterization should be carried out by means of 3D imaging and visuali-zation procedures.Although the introduction of multi-electrode and multi-channel systems has strongly increased the speed of data acquisi-tion,literature reports only very few cases of 3D ERT application in land-slide areas (Table 3).Very often,logistic conditions in these areas are not so conducive giving rise to problems in transporting and installing the instruments and equipment.The planning and the carrying out of a 3D geoelectrical campaign on landslides can be very tiring,exhausting and costly.In-deed,the slide material is usually strongly reworked,and,although the measurement equipment is now very compact and easy to carry,it still remains extremely dif ficult to be moved on the slope.Depending on the type of landslide and material involved,the slope can be very steep making it very dif ficult to install the cable network necessary to perform a 3D survey.Generally,landslides can present a large super fi-cial extension and,therefore,a very long multi-core cable could be nec-essary to cover the entire area investigated.A possible solution could beto use more instruments connected to each other and many multi-core cables.This would probably reduce the ef ficiency of the method and in-crease the electrical power required by the system.Despite all these problems,some authors have tried to perform a 3D investigation of a landslide (Bichler et al.,2004;Lebourg et al.,2005;Drahor et al.,2006;Friedel et al.,2006;Yilmaz,2007;Chambers et al.,2009;Heincke et al.,2010;Chambers et al.,2011;Grandjean et al.,2011;Di Maio and Piegari,2011;Udphuay et al.,2011;Di Maio and Piegari,2012).In all the cases reported,the acquisition has been carried out in a 2D way along parallel pro files whose direction is generally transversal to the dip of the slope and,sometimes,additional perpendic-ular pro files are also used.The acquisition systems of the IRIS-Instruments ( )and the DD electrode con figuration prove to be the most used also for the 3D applications.In one case,the authors apply a system (ALERT system of the British Geological Survey)that they themselves designed.Only few authors carry out a 3D inversion of the acquired data by applying some dedicated software (Fig.4),the others have used the 2D pro files in a graphical way to get a 3D fence diagram (Bichler et al.,2004;Drahor et al.,2006;Grandjean et al.,2011)(Fig.5).As reported in Table 4,the slides (63%)are the most studied type of landslide and,as in the case of 2D ERT applications,the material in-volved in the movement is essentially conductive (67%).The information obtained through 3D ERT,very similar to that ob-tained for the 2D ERT applications,allowed the de finition of a 3D geoelectrical model useful for the reconstruction of the subsoil geologi-cal setting and the identi fication of areas characterized by a high water content.2.3.The time-lapse ERT monitoringDespite the ERT technological and methodological development over the past 15years,2D and 3D ERT surveys have provided only static information.Generally,these investigations have been carried out after the occurrence of an event or in old landslide areas potentially subject to new activations.The information gathered is always related to the ac-quisition time without providing any indications on the possible evolu-tion of physical parameters in the slope investigated.Considering the in fluence that the water content change could have on the electrical re-sistivity and taking into account the role played by the water content in the triggering of some landslides,a continuous monitoring of the resis-tivity could give information on the dynamic behavior of the slope in-vestigated.This has led to the use of a new acquisition procedure known as time-lapse ERT (tl-ERT).These are usually acquire through multi-channel systems which allow the simultaneous potential mea-surement on many channels by means of a single pair of current elec-trodes.The Syscal PRO of IRIS Instruments proves to be the most popular.Systems like the GEOMON 4D (Supper et al.,2008),ALERT (Kuras et al.,2009;Wilkinson et al.,2010)and A-ERT (Hilbich et al.,2011)have also been developed in order to obtain tl-ERT.These systems can use local power generated by wind,solar and fuel cell technology,and can incorporate telemetric control and data transfer (Loke et al.,2013).To accommodate time-lapse resistivity in inverse models,different approaches such as the ratio method,the cascaded time-lapse inversion,the difference inversion and the differencing in dependent inversions have been proposed (Hayley et al.,2011;Loke et al.,2014).In the most common,the measured data,acquired at each monitoring phase,are independently inverted (Loke et al.,1999;Tsourlos,2003).This kind of approach mainly assumed that the time-lapse images are calcu-lated under the time-invariant static condition and that the changes of the ground properties during the acquisition can be ignored.However,the images obtained from this approach may be strongly contaminated with inversion artefacts due to both the presence of noise in the mea-surements and independent inversion errors.This generates false anomalies of ground condition changes.Furthermore,thetime-Fig.1.Geographical distribution of the case histories considered for the review.The graph shows that most of the examples considered are related to landslides located in Europe.Table 2Percentage distribution of landslide typologies studied by applying 2D ERT and of resistiv-ity values related to the material involved in these ndslide typology %Resistivity values %Slides32Conductive 65Debris and earth flows 20Resistive 22Complex landslides 20Mixed13Deep seated landslides 13Rock slides10Quick clay slides572 A.Perrone et al./Earth-Science Reviews 135(2014)65–82Fig.2.Varco d'Izzo landslide (Basilicata region,southern Italy):identi fication of the sliding surface and de finition of landslide shape by the comparison between the HH ′2D ERT and the stratigraphic data inferred from boreholes B22,B23and B24(redrawn from Lapenna et al.,(2005)).73A.Perrone et al./Earth-Science Reviews 135(2014)65–82。
OIML-R137-1-e06
Gas metersPart 1: RequirementsCompteurs de gaz Partie 1: ExigencesM L R 137-1 E d i t i o n 2006 (E )OIML R 137-1Edition 2006 (E)I NTERNATIONAL R ECOMMENDATIONContents Foreword (3)1 Scope (4)2 Terminology (4)2.1 Gas meter and its constituents (4)characteristics (6)2.2 Metrological2.3 Operating conditions (for definition, see 2.2.14) (7)conditions (8)2.4 Testequipment (9)2.5 Electronicrequirements (9)3 Constructional3.1 Construction (9)Direction (10)3.2 Flow3.3 Pressuretappings (11)conditions (11)3.4 Installation4 Seals and markings (12)units (12)4.1 Measurementandinscriptions (12)4.2 Markings4.3 Verification marks and protection devices (13)requirements (15)5 Metrologicaloperatingconditions (15)5.1 RatedofQ max, Q t and Q min (15)5.2 Values5.3 Accuracy classes and maximum permissible errors (15)5.4 Weighted mean error (WME) (16)5.5 Repair and damage of seals (16)requirements (17)6 Technical6.1 Indicatingdevice (17)element (18)6.2 Testdevices (19)6.3 Ancillarysources (19)6.4 Power6.5 Checks, limits and alarms for electronic gas meters (20)7 Requirements for metrological controls (21)results (21)7.1 Testconditions (21)7.2 Referenceapproval (21)7.3 Type7.4 Type examination tests (24)7.5 Initial verification and subsequent verification (31)7.6 Additional requirements for statistical verifications (32)7.7 Additional requirements for in-service inspections (33)Annex A: Environmental tests for electronic instruments or devices (34)Annex B: Flow disturbance tests (42)Annex C: Overview of tests applicable for different metering principles (45)Annex D:Bibliography (47)ForewordThe International Organization of Legal Metrology (OIML) is a worldwide, intergovernmental organization whose primary aim is to harmonize the regulations and metrological controls applied by the national metrological services, or related organizations, of its Member States. The main categories of OIML publications are:! International Recommendations (OIML R), which are model regulations that establish the metrological characteristics required of certain measuring instruments and which specify methods and equipment for checking their conformity. OIML Member States shall implement these Recommendations to the greatest possible extent;! International Documents (OIML D), which are informative in nature and which are intended to harmonize and improve work in the field of legal metrology;! International Guides (OIML G), which are also informative in nature and which are intended to give guidelines for the application of certain requirements to legal metrology; and! International Basic Publications (OIML B), which define the operating rules of the various OIML structures and systems.OIML Draft Recommendations, Documents and Guides are developed by Technical Committees or Subcommittees which comprise representatives from the Member States. Certain international and regional institutions also participate on a consultation basis. Cooperative agreements have been established between the OIML and certain institutions, such as ISO and the IEC, with the objective of avoiding contradictory requirements. Consequently, manufacturers and users of measuring instruments, test laboratories, etc. may simultaneously apply OIML publications and those of other institutions.International Recommendations, Documents, Guides and Basic Publications are published in English (E) and translated into French (F) and are subject to periodic revision.Additionally, the OIML publishes or participates in the publication of Vocabularies (OIML V) and periodically commissions legal metrology experts to write Expert Reports (OIML E). Expert Reports are intended to provide information and advice, and are written solely from the viewpoint of their author, without the involvement of a Technical Committee or Subcommittee, nor that of the CIML. Thus, they do not necessarily represent the views of the OIML.This publication - reference OIML R 137-1, Edition 2006 - was developed by Technical Subcommittee TC 8/SC 8 Gas meters. It was approved for final publication by the International Committee of Legal Metrology in 2006 and will be submitted to the International Conference of Legal Metrology in 2008 for formal sanction. It supersedes the previous editions of R 31 (1995) and R 32 (1989) and partially supersedes OIML R 6 (1989).OIML Publications may be downloaded from the OIML web site in the form of PDF files. Additional information on OIML Publications may be obtained from the Organization’s headquarters:Bureau International de Métrologie Légale11, rue Turgot - 75009 Paris - FranceTelephone: 33 (0)1 48 78 12 82Fax: 33 (0)1 42 82 17 27E-mail: biml@Internet: Gas metersPart 1: Requirements1 ScopeThis Recommendation applies to gas meters based on any principle, used to meter the quantity of gas in volume, mass or energy units that has passed through the meter at operating conditions. It applies also to gas meters intended to measure quantities of gaseous fuels or other gases, except gases in the liquefied state and steam.Dispensers for compressed natural gas (CNG dispensers) are also excluded from the scope of this Recommendation.This Recommendation also applies to correction devices, and other electronic devices that can be attached to the gas meter. However, provisions for conversion devices, either as part of the gas meter or as a separate instrument, or provisions for devices for the determination of the superior calorific value and gas metering systems consisting of several components, are defined in the draft OIML Recommendation on Measuring systems for gaseous fuel [8].2 TerminologyThe terminology used in this Recommendation conforms to the International Vocabulary of Basic and General Terms in Metrology (VIM - Edition 1993) [1] and the International Vocabulary of Terms in Legal Metrology (VIML - Edition 2000) [2]. In addition, for the purposes of this Recommendation, the following definitions apply.2.1 G AS METER AND ITS CONSTITUENTS2.1.1 Gas meterInstrument intended to measure, memorize and display the quantity of gas passing the flow sensor at operating conditions.2.1.2 Measurand (VIM 2.6)Particular quantity subject to measurement.2.1.3 Sensor (VIM 4.14)Element of a measuring instrument or measuring chain that is directly affected by the measurand.2.1.4 Measuring transducer (VIM 4.3)Device that provides an output quantity having a determined relationship to the input quantity.2.1.5 Mechanical output constant (mechanical gas meters only)Value of the quantity corresponding to one complete revolution of the shaft of the mechanical output. This value is determined by multiplying the value of the quantity corresponding to one complete revolution of the test element by the transmission ratio of the indicating device to this shaft.The mechanical output is an element to drive an ancillary device.2.1.6 CalculatorPart of the gas meter which receives the output signals from the measuring transducer(s) and, possibly, associated measuring instruments, transforms them and, if appropriate, stores the results in memory until they are used. In addition, the calculator may be capable of communicating both ways with ancillary devices.2.1.7 Indicating device (VIM 4.12 adapted)Part of the gas meter which displays the measurement results, either continuously or on demand. Note: A printing device, which provides an indication at the end of the measurement, is not an indicating device. 2.1.8 Adjustment deviceDevice incorporated in the gas meter that only allows the error curve to be shifted generally parallel to itself, with a view to bringing errors (of indication) within the limits of the maximum permissible error (MPE).2.1.9 Correction deviceDevice intended for correction of known errors as a function of e.g. flowrate, Reynolds number (curve linearization), or pressure and/or temperature.2.1.10 Ancillary deviceDevice intended to perform a particular function, directly involved in elaborating, transmitting or displaying measurement results.The main ancillary devices are:a) repeating indicating device;b) printing device;c) memory device; andd) communication device.Note 1: An ancillary device is not necessarily subject to metrological control.Note 2: An ancillary device may be integrated in the gas meter.2.1.11 Associated measuring instrumentInstrument connected to the calculator or the correction device for measuring certain gas properties, for the purpose of making a correction.2.1.12 Equipment under test (EUT)(Part of the) gas meter and/or associated devices which is exposed to one of the tests.2.1.13 Family of metersGroup of meters of different sizes and/or different flowrates, in which all the meters shall have the following characteristics:• the same manufacturer;• geometric similarity of the measuring part;• the same metering principle;• roughly the same ratios Q max/Q min and Q max/Q t;• the same accuracy class;• the same electronic device for each meter size;• a similar standard of design and component assembly; and•the same materials for those components that are critical to the performance of the meter.2.2 M ETROLOGICAL CHARACTERISTICS2.2.1 Quantity of gasTotal quantity of gas obtained by integrating the flow over time, expressed as volume V , mass m or energy E passed through the gas meter, disregarding the time taken. This is the measurand (see 2.1.2).2.2.2 Indicated value (of a quantity)Value Y i of a quantity, as indicated by the meter.2.2.3 Cyclic volume of a gas meter (positive displacement gas meters only)Volume of gas corresponding to one full revolution of the moving part(s) inside the meter (working cycle).2.2.4 True value (of a quantity) (VIM 1.19 + notes)Value consistent with the definition of a given particular quantity.2.2.5 Conventional true value (of a quantity) (VIM 1.20)Value Y ref attributed to a particular quantity and accepted, sometimes by convention, as having an uncertainty appropriate for a given purpose.2.2.6 Absolute error (of indication) (VIM3.10 + notes) Indicated value of a quantity Y i minus a true value of a quantity. 2.2.7 Relative error or error (of indication) e (VIM 3.12 + note) Error of measurement divided by a true value of the measurand. The error is expressed as a percentage, and is calculated by:%100)(×−=refref i Y Y Y e2.2.8 Weighted mean error (WME)The weighted mean error (WME) is calculated as follows:()()∑∑==⋅=ni ini iiQQ e QQ WME 1max1max /)/(where: • Q i /Q max is a weighting factor; • e i is the error at the flowrate Q i ; • at Q i > 0.9·Q max a weighting factor of 0.4 shall be used instead of 1. 2.2.9 Intrinsic errorError determined under reference conditions.2.2.10 Fault ∆e (OIML D 11,3.9)Difference between the error of indication and the intrinsic error of a measuring system or of its constituent elements.Note: In practice this is the difference between the error of the meter observed during or after a test, and the error of the meter prior to this test, performed under reference conditions.2.2.11 Maximum permissible error (MPE) (VIM 5.21)Extreme values permitted by the present Recommendation for an error.2.2.12 Accuracy class (VIM 5.19)Class of measuring instrument that meets certain metrological requirements that are intended to maintain errors within specified limits.2.2.13 Durability (OIML D 11,3.17)Ability of a measuring instrument to maintain its performance characteristics over a period of use.2.2.14 Operating conditionsConditions of the gas (temperature, pressure and gas composition) at which the quantity of gas is measured.2.2.15 Rated operating conditions (adapted from VIM 5.5)Conditions of use giving the range of values of the measurand and the influence quantities, for which the errors of the gas meter are required to be within the limits of the maximum permissible error.2.2.16 Reference conditions (adapted from VIM 5.7)Set of reference values, or reference ranges of influence quantities, prescribed for testing the performance of a gas meter, or for the intercomparison of the results of measurements.2.2.17 Base conditionsConditions to which the measured volume of gas is converted (examples: base temperature and base pressure).Note: Operating and base conditions relate to the volume of gas to be measured or indicated only and should not be confused with “rated operating conditions” and “reference conditions” (VIM 5.05 and 5.07) which refer to influence quantities.2.2.18 Test element of an indicating deviceDevice to enable precise reading of the measured gas quantity.2.2.19 Resolution (of an indicating device) (VIM 5.12)Smallest difference between indications of an indicating device that can be meaningfully distinguished.Note: For a digital device, this is the change in the indication when the least significant digit changes by one step.For an analogue device, this is half the difference between subsequent scale marks.2.2.20 Drift (VIM 5.16)Slow change of a metrological characteristic of a measuring instrument.2.3 O PERATING CONDITIONS (for definition, see 2.2.14)2.3.1 Flowrate, QQuotient of the actual quantity of gas passing through the gas meter and the time taken for this quantity to pass through the gas meter.2.3.2 Maximum flowrate, Q maxHighest flowrate at which a gas meter is required to operate within the limits of its maximum permissible error, whilst operated within its rated operating conditions.2.3.3 Minimum flowrate, Q minLowest flowrate at which a gas meter is required to operate within the limits of its maximum permissible error, whilst operated within its rated operating conditions.2.3.4 Transitional flowrate, Q tFlowrate which occurs between the maximum flowrate Q max and the minimum flowrate Q min that divides the flowrate range into two zones, the “upper zone” and the “lower zone”, each characterized by its own maximum permissible error.2.3.5 Working temperature, t wTemperature of the gas to be measured at the gas meter.2.3.6 Minimum and maximum working temperatures, t min and t maxMinimum and maximum gas temperature that a gas meter can withstand, within its rated operating conditions, without deterioration of its metrological performance.2.3.7 Working pressure, p wGauge pressure of the gas to be measured at the gas meter. The gauge pressure is the difference between the absolute pressure of the gas and the atmospheric pressure.2.3.8 Minimum and maximum working pressure, p min and p maxMinimum and maximum internal gauge pressure that a gas meter can withstand, within its rated operating conditions, without deterioration of its metrological performance.2.3.9 Static pressure loss or pressure differential, ∆pMean difference between the pressures at the inlet and outlet of the gas meter while the gas is flowing.2.3.10 Working density, ρwDensity of the gas flowing through the gas meter, corresponding to p w and t w2.4 T EST CONDITIONS2.4.1 Influence quantity (VIM 2.7)Quantity that is not the measurand but which affects the result of the measurement.2.4.2 Influence factor (OIML D 11,3.13.1)Influence quantity having a value within the rated operating conditions of the gas meter, as specified in this Recommendation.2.4.3 Disturbance (OIML D 11,3.13.2)Influence quantity having a value within the limits specified in this Recommendation, but outside the specified rated operating conditions of the gas meter.Note: An influence quantity is a disturbance if for that influence quantity the rated operating conditions are not specified.2.4.4 Overload conditionsExtreme conditions, including flowrate, temperature, pressure, humidity and electromagnetic interference that a gas meter is required to withstand without damage. When it is subsequently operated within its rated operating conditions, it must do so within the limits of its maximum permissible error.2.4.5 TestSeries of operations intended to verify the compliance of the equipment under test (EUT) with certain requirements.2.4.6 Test procedureDetailed description of the test operations.2.4.7 Test programDescription of a series of tests for a certain type of equipment.2.4.8 Performance testTest intended to verify whether the equipment under test (EUT) is capable of accomplishing its intended functions.2.5 E LECTRONIC EQUIPMENT2.5.1 Electronic gas meterGas meter equipped with electronic devices.Note: For the purposes of this Recommendation auxiliary equipment, as far as it is subject to metrological control, is considered part of the gas meter, unless the auxiliary equipment is approved and verified separately.2.5.2 Electronic deviceDevice employing electronic sub-assemblies and performing a specific function. Electronic devices are usually manufactured as separate units and are capable of being tested independently.2.5.3 Electronic sub-assemblyPart of an electronic device employing electronic components and having a recognizable function of its own.2.5.4 Electronic componentSmallest physical entity, which uses electron or gap conduction in semi-conductors, or conduction by means of electrons or ions in gases or in a vacuum.requirements3 Constructional3.1 C ONSTRUCTION3.1.1 MaterialsA gas meter shall be made of such materials and be so constructed to withstand the physical, chemical and thermal conditions to which it is likely to be subjected and to fulfil correctly its intended purposes throughout its life.3.1.2 Soundness of casesThe case of a gas meter shall be gas-tight up to the maximum working pressure of the gas meter. If a meter is to be installed in the open air it shall be impermeable to run-off water.3.1.3 Condensation/climate provisionsThe manufacturer may incorporate devices for the reduction of condensation, where condensation may adversely affect the performance of the device.3.1.4 Protection against external interferenceA gas meter shall be constructed and installed in such a way that mechanical interference capable of affecting its accuracy is either prevented, or results in permanently visible damage to the gas meter or to the verification marks or protection marks.3.1.5 Indicating deviceThe indicating device can be connected to the meter body physically or remotely. In the latter case the data to be displayed shall be stored in the gas meter.Note: National or regional requirements may contain provisions to guarantee access to the data stored in the meter for customers and consumers.3.1.6 Safety deviceThe gas meter may be equipped with a safety device that shuts off the gas flow in the event of calamities, such as an earthquake or a fire. A safety device may be connected to the gas meter, provided that it does not influence the metrological integrity of the meter.Note: A mechanical gas meter equipped with an earthquake sensor plus an electrical powered valve is not considered to be an electronic gas meter.3.1.7 Connections between electronic partsConnections between electronic parts shall be reliable and durable.3.1.8 ComponentsComponents of the meter may only be exchanged without subsequent verification if the type examination establishes that the metrological properties and especially the accuracy of the meter are not influenced by the exchange of the components concerned. Such components shall be identified at least by their own type indication.Note: National bodies may require components to be marked with the model(s) of the meter(s) to which they may be attached and may require such exchange to be carried out by authorized persons.3.1.9 Zero flowThe gas meter totalization shall not change when the flowrate is zero, while the installation conditions are free from pulsations and vibrations.Note: This requirement refers to stationary operating conditions. This condition does not refer to the response of the gas meter to changed flowrates.3.2 F LOW D IRECTION3.2.1 Direction of the gas flowOn a gas meter where the indicating device registers positively for only one direction of the gas flow, this direction shall be indicated by a method which is clearly understood, e.g. an arrow. This indication is not required if the direction of the gas flow is determined by the construction.3.2.2 Plus and minus signThe manufacturer shall specify whether or not the gas meter is designed to measure bi-directional flow. In the case of bi-directional flow a double-headed arrow with a plus and minus sign shall be used to indicate which flow direction is regarded as positive and negative respectively.3.2.3 Recording of bi-directional flowIf a meter is designed for bi-directional use, the quantity of gas passed during reverse flow shall either be subtracted from the indicated quantity or be recorded separately. The maximum permissible error shall be met for both forward and reverse flow.3.2.4 Reverse flowIf a meter is not designed to measure reverse flow, the meter shall either prevent reverse flow, or it shall withstand incidental or accidental reverse flow without deterioration or change in its metrological properties.3.2.5 Indicating deviceA gas meter may be provided with a device to prevent the indicating device from functioning whenever gas is flowing in an unauthorized direction.3.3 P RESSURE TAPPINGS3.3.1 GeneralIf a gas meter is designed to operate above an absolute pressure of 0.15 MPa, the manufacturer shall either equip the meter with pressure tappings, or specify the position of pressure tappings in the installation pipe work.3.3.2 BoreThe bore of the pressure tappings shall be large enough to allow correct pressure measurements.3.3.3 ClosurePressure tappings shall be provided with a means of closure to make them gas-tight.3.3.4 MarkingsThe pressure tapping on the gas meter for measuring the working pressure (2.3.7) shall be clearly and indelibly marked “p m” (i.e. the pressure measurement point) or “p r” (i.e. the pressure reference point) and other pressure tappings “p”.3.4 I NSTALLATION CONDITIONSThe manufacturer shall specify the installation conditions (as applicable) with respect to:- the position to measure the working temperature of the gas (2.3.5);- filtering;- levelling and orientation;disturbances;- flow- pulsations or acoustic interference;changes;pressure- rapid- absence of mechanical stress (due to torque and bending);- mutual influences between gas meters;instructions;- mounting- maximum allowable diameter differences between the gas meter and connecting pipework; and- other relevant installation conditions.4 Seals and markings4.1 M EASUREMENT UNITSAll quantities shall be expressed in SI units [3] or as other legal units of measurement [4], unless a country’s legal units are different. In the next section the unit corresponding to the quantity indicated is expressed by <unit>.4.2 M ARKINGS AND INSCRIPTIONSAll markings prescribed in 4.2 shall be visible, easily legible and indelible under rated conditions of use.Any marking other than those prescribed in the type approval document shall not lead to confusion.4.2.1 General applicable markings for gas metersAs relevant, the following information shall be marked on the casing or on an identification plate, or clearly and unambiguously visible via the indicating device:a) Type approval mark (according to national or regional regulation);b) Name or trade mark of the manufacturer;c) Type designation;d) Serial number of the gas meter and its year of manufacture;e) Accuracy class;f) Maximum flowrate Q max = … <unit>;g) Minimum flowrate Q min = … <unit>;h) Transition flowrate Q t = … <unit>;i) Gas temperature range and pressure range for which the errors of the gas meter shall bewithin the limits of the maximum permissible error, expressed as:t min – t max = … - … <unit>;p min – p max = … - … <unit> gauge pressure.j) The density range within which the errors shall comply with the limits of the maximum permissible error may be indicated, and shall be expressed as:ρ = … - … <unit>This marking may replace the range of working pressures (i) unless the working pressure marking refers to a built-in conversion device;k) Pulse values of HF and LF frequency outputs (imp/<unit>, pul/<unit>, <unit>/imp);Note: The pulse value is given to at least six significant figures, unless it is equal to an integer multiple or decimal fraction of the used unit.l) Letter V or H, as applicable, if the meter can be operated only in the vertical or horizontal position;m) Indication of the flow direction, e.g. an arrow (if applicable, see 3.2.1 and 3.2.2);n) Measurement point for the working pressure according to 3.3.4; ando) Environmental temperatures, if they differ from the gas temperature as mentioned in i).4.2.2 Additional markings for mechanical gas meters with a built-in mechanical conversiondevice having only one indicating devicep) Base temperature t b = … <unit>;q) Temperature t sp = … <unit> specified by the manufacturer according to 5.3.4.4.2.3 Additional markings for gas meters with output drive shaftsr) Gas meters fitted with output drive shafts or other facilities for operating detachable additional devices shall have each drive shaft or other facility characterized by anindication of its constant (C) in the form “1 rev = … <unit>” and the direction ofrotation. “rev” is the abbreviation of the word “revolution”;s) If there is only one drive shaft the maximum permissible torque shall be marked in the form “M max = … N.mm”;t) If there are several drive shafts, each shaft shall be characterized by the letter M with a subscript in the form “M1, M2, … M n”;u) The following formula shall appear on the gas meter:k1M1 + k2M2 + … + k n M n≤ A N.mm,where:A is the numerical value of the maximum permissible torque applied to the drive shaftwith the highest constant, where the torque is applied only to this shaft; this shaft shallbe characterised by the symbol M1,k i (i = 1, 2, … n) is a numerical value determined as follows: k i = C1 / C i,M i (i = 1, 2, … n) represents the torque applied to the drive shaft characterized by thesymbol M i,C i(i = 1, 2, … n) represents the constant for the drive shaft characterized by thesymbol M i.4.2.4 Additional markings for gas meters with electronic devicesv) For an external power supply: the nominal voltage and nominal frequency;w) For a non-replaceable or replaceable battery: the latest date by which the battery is to be replaced, or the remaining battery capacity.x) Software identification of the firmware4.3 V ERIFICATION MARKS AND PROTECTION DEVICES4.3.1 General provisionProtection of the metrological properties of the meter is accomplished via hardware (mechanical) sealing or via electronic sealing devices.In any case, memorized quantities of gas shall be protected by means of a hardware seal.The design of verification marks and hardware seals is subject to national or regional legislation. Seals shall be able to withstand outdoor conditions.4.3.2 Verification marksVerification marks indicate that the gas meter has successfully passed the initial verification (7.5). Verification marks shall be realized as hardware seals.。
Generalizations of the BiasVariance Decomposition for Prediction Error
2 Generalizing the de nitions
Often squared error is a very convenient loss function to use. It possesses well known mathematical properties such as the bias/variance decomposition (1) that make it very attractive to use. However there are situations where squared error is clearly not the most appropriate loss function. This is especially true in classi cation problems where a loss function like 0-1 loss seems much more realistic. So how might we extend these concepts of variance and bias to general loss functions? There is one obvious requirement that it seems natural for any generalization to ful ll. When using squared error loss our general de nitions must reduce to the standard ones. 3
The bias and variance of a real valued random variable, using squared error loss, are well understood. However because of recent developments in classi cation techniques it has become desirable to extend these concepts to general random variables and loss functions. The 0-1 (misclassi cation) loss function with categorical random variables has been of particular interest. We explore the concepts of variance and bias and develop a decomposition of the prediction error into functions of the systematic and variable parts of our predictor. After providing some examples we conclude with a discussion of the various de nitions that have been proposed.
chapter 3 notes from book
MSE squares errors, thusgiving more weight to larger errors, whichcausesmore problems.
MAPE should be used when there is a need to put errors in perspective.
Time-series forecasts:
Simply attempt to project past experience into the future.Thesetechniques use historical data with the assumption that the future will be like the past. Somemodels merely attempt to smooth out random variations in historical data; others attempt toidentify specific patterns in the data and project or extrapolate those patterns into the future,without trying to identify causes of the patterns.
MAD weights all errors evenly,
MSE weights errors according to theirsquaredvalues, and
MAPE weights according to relative error.
Contents Explanations and error diagnosis
Explanations and error diagnosisLIFOG´e rard Ferrand,Willy Lesaint,Alexandre Tessierpublic,rapport de rechercheD3.2.2Contents1Introduction3 2Preliminary notations and definitions42.1Notations (4)2.2Constraint Satisfaction Problem (4)2.3Constraint Satisfaction Program (5)2.4Links between CSP and program (7)3Expected Semantics83.1Correctness of a CSP (8)3.2Symptom and Error (8)4Explanations94.1Explanations (10)4.2Computed explanations (12)5Error Diagnosis125.1From Symptom to Error (13)5.2Diagnosis Algorithms (13)6Conclusion141AbstractThe report proposes a theoretical approach of the debugging of constraint programs based on the notion of explanation tree(D1.1.1and D1.1.2part 2).The proposed approach is an attempt to adapt algorithmic debugging to constraint programming.In this theoretical framework for domain reduction, explanations are proof trees explaining value removals.These proof trees are defined by inductive definitions which express the removals of values as con-sequence of other value removals.Explanations may be considered as the essence of constraint programming.They are a declarative view of the com-putation trace.The diagnosis consists in locating an error in an explanation rooted by a symptom.keywords:declarative diagnosis,algorithmic debugging,CSP,local consis-tency operator,fix-point,closure,inductive definition21IntroductionDeclarative diagnosis[15](also known as algorithmic debugging)have been success-fully used in different programming paradigms(e.g.logic programming[15],func-tional programming[10]).Declarative means that the user has no need to consider the computational behavior of the programming system,he only needs a declarative knowledge of the expected properties of the program.This paper is an attempt to adapt declarative diagnosis to constraint programming thanks to a notion of explanation tree.Constraint programs are not easy to debug because they are not algorithmic programs[14]and tracing techniques are revealed limited in front of them.More-over it would be incoherent to use only low level debugging tools whereas for these languages the emphasis is on declarative semantics.Here we are interested in a wide field of applications of constraint programming:finite domains and propagation.The aim of constraint programming is to solve Constraint Satisfaction Problems (CSP)[17],that is to provide an instantiation of the variables which is solution of the constraints.The solver goes towards the solutions combining two different methods.Thefirst one(labeling)consists in partitioning the domains.The second one(domain reduction)reduces the domains eliminating some values which cannot be correct according to the constraints.In general,the labeling alone is very expen-sive and domain reduction only provides a superset of the solutions.Solvers use a combination of these two methods until to obtain singletons and test them.The formalism of domain reduction given in the paper is well-suited to define ex-planations for the basic events which are“the withdrawal of a value from a domain”. It has already permitted to prove the correctness of a large family of constraint re-traction algorithms[6].A closed notion of explanations have been proved useful in many applications:dynamic constraint satisfaction problems,over-constrained problems,dynamic backtracking,...Moreover,it has also been used for failure anal-ysis in[12].The introduction of labeling in the formalism has already been proposed in[13].But this introduction complicates the formalism and is not really necessary here(labeling can be considered as additional constraints).The explanations de-fined in the paper provide us with a declarative view of the computation and their tree structure is used to adapt algorithmic debugging to constraint programming.From an intuitive viewpoint,we call symptom the appearance of an anomaly dur-ing the execution of a program.An anomaly is relative to some expected properties of the program,here to an expected semantics.A symptom can be a wrong answer or a missing answer.A wrong answer reveals a lack in the constraints(a missing constraint for example).This paper focuses on the missing answers.Symptoms are caused by erroneous constraints.Strictly speaking,the localization of an erroneous constraint,when a symptom is given,is error diagnosis.It amounts to search for a kind of minimal symptom in the explanation tree.For a declarative diagnostic system,the input must include at least(1)the actual program,(2)the symptom and(3)a knowledge of the expected semantics.This knowledge can be given by the3programmer during the diagnosis session or it can be specified by other means but,from a conceptual viewpoint,this knowledge is given by an oracle .We are inspired by GNU-Prolog [7],a constraint programming language over finite domains,because its glass-box approach allows a good understanding of the links between the constraints and the rules used to build explanations.But this work can be applied to all solvers over finite domains using propagation whatever the local consistency notion used .Section 2defines the basic notions of CSP and program.In section 3,symptoms and errors are described in this framework.Section 4defines explanations.An algorithm for error diagnosis of missing answer is proposed in section 5.2Preliminary notations and definitionsThis section gives briefly some definitions and results detailed in [9].2.1NotationsLet us assume fixed:•a finite set of variable symbols V ;•a family (D x )x ∈V where each D x is a finite non empty set,D x is the domain of the variable x .We are going to consider various families f =(f i )i ∈I .Such a family can be identified with the function i →f i ,itself identified with the set {(i,f i )|i ∈I }.In order to have simple and uniform definitions of monotonic operators on a power-set,we use a set which is similar to an Herbrand base in logic programming:we define the domain by D = x ∈V ({x }×D x ).A subset d of D is called an environment .We denote by d |W the restriction of d to a set of variables W ⊆V ,that is,d |W ={(x,e )∈d |x ∈W }.Note that,with d,d ⊆D ,d = x ∈V d |{x },and (d ⊆d ⇔∀x ∈V,d |{x }⊆d |{x }).A tuple (or valuation )t is a particular environment such that each variable ap-pears only once:t ⊆D and ∀x ∈V,∃e ∈D x ,t |{x }={(x,e )}.A tuple t on a set of variables W ⊆V ,is defined by t ⊆D |W and ∀x ∈W,∃e ∈D x ,t |{x }={(x,e )}.2.2Constraint Satisfaction ProblemA Constraint Satisfaction Problem (CSP)on (V,D )is made of:•a finite set of constraint symbols C ;•a function var :C →P (V ),which associates with each constraint symbol the set of variables of the constraint;4•a family(T c)c∈C such that:for each c∈C,T c is a set of tuples on var(c),T c is the set of solutions of c.Definition1A tuple t is a solution of the CSP if∀c∈C,t|var(c)∈T c.From now on,we assumefixed a CSP(C,var,(T c)c∈C)on(V,D)and we denote by Sol its set of solutions.Example1The conference problem[12]Michael,Peter and Alan are organizing a two-day seminar for writing a report on their work.In order to be efficient,Peter and Alan need to present their work to Michael and Michael needs to present his work to Alan and Peter.So there are four variables, one for each presentation:Michael to Peter(MP),Peter to Michael(PM),Michael to Alan(MA)and Alan to Michael(AM).Those presentations are scheduled for a whole half-day each.Michael wants to know what Peter and Alan have done before presenting his own work(MA>AM,MA>PM,MP>AM,MP>PM).Moreover,Michael would prefer not to come the afternoon of the second half-day because he has got a very long ride home(MA=4,MP=4,AM=4,PM=4).Finally,note that Peter and Alan cannot present their work to Michael at the same time(AM=PM).The solutions of this problem are:{(AM,2),(MA,3),(MP,3),(PM,1)}and{(AM,1),(MA,3),(MP,3),(PM,2)}.The set of constraints can be written in GNU-Prolog[7]as:conf(AM,MP,PM,MA):-fd_domain([MP,PM,MA,AM],1,4),MA#>AM,MA#>PM,MP#>AM,MP#>PM,MA#\=4,MP#\=4,AM#\=4,PM#\=4,AM#\=PM.2.3Constraint Satisfaction ProgramA program is used to solve a CSP,(i.e tofind the solutions)thanks to domain reduction and beling can be considered as additional constraints,so we concentrate on the domain reduction.The main idea is quite simple:to remove from the current environment some values which cannot participate to any solution of some constraints,thus of the CSP.These removals are closely related to a notion of local consistency.This can be formalized by local consistency operators.Definition2A local consistency operator r is a monotonic function r:P(D)→P(D).Note that in[9],a local consistency operator r have a type(in(r),out(r))with in(r),out(r)⊆D.Intuitively,out(r)is the set of variables whose environment is reduced(values are removed)and these removals only depend on the environments of the variables of in(r).But this detail is not necessary here.5Example2The GNU-Prolog solver uses local consistency operators following the X in r scheme[4]:for example,AM in0..max(MA)-1.It means that the values of AM must be between0and the maximal value of the environment of MA minus1.As we want contracting operator to reduce the environment,next we will consider d→d∩r(d).But in general,the local consistency operators are not contracting functions,as shown later to define their dual operators.A program on(V,D)is a set R of local consistency operators.Example3Following the X in r scheme,the GNU-Prolog conference problem is implemented by the following program:AM in1..4,MA in1..4,PM in1..4,MP in1..4,MA in min(AM)+1..infinity,AM in0..max(MA)-1,MA in min(PM)+1..infinity,PM in0..max(MA)-1,MP in min(AM)+1..infinity,AM in0..max(MP)-1,MP in min(PM)+1..infinity,PM in0..max(MP)-1,MA in-{val(4)},AM in-{val(4)},PM in-{val(4)},MP in-{val(4)},AM in-{val(PM)},PM in-{val(AM)}.From now on,we assumefixed a program R on(V,D).We are interested in particular environments:the commonfix-points of the re-duction operators d→d∩r(d),r∈R.Such an environment d verifies∀r∈R, d =d ∩r(d ),that is values cannot be removed according to the operators.Definition3Let r∈R.We say an environment d is r-consistent if d⊆r(d).We say an environment d is R-consistent if∀r∈R,d is r-consistent.Domain reduction from a domain d by R amounts to compute the greatestfix-point of d by R.Definition4The downward closure of d by R,denoted by CL↓(d,R),is the greatest d ⊆D such that d ⊆d and d is R-consistent.In general,we are interested in the closure of D by R(the computation starts from D),but sometimes we would like to express closures of subset of D(environments, tuples).It is also useful in order to take into account dynamic aspects or labeling [9,6].Example4The execution of the GNU-Prolog program provides the following clo-sure:{(AM,1),(AM,2),(MA,2),(MA,3),(MP,2),(MP,3),(PM,1),(PM,2)}.By definition4,since d⊆D:Lemma1If d is R-consistent then d⊆CL↓(D,R).62.4Links between CSP and programOf course,the program is linked to the CSP.The operators are chosen to “imple-ment”the CSP.In practice,this correspondence is expressed by the fact that the program is able to test any valuation.That is,if all the variables are bounded,the program should be able to answer to the question:“is this valuation a solution of the CSP ?”.Definition 5A local consistency operator r preserves the solutions of a set of con-straints C if,for each tuple t ,(∀c ∈C ,t |var(c )∈T c )⇒t is r -consistent.In particular,if C is the set of constraints C of the CSP then we say r preserves the solutions of the CSP.In the well-known case of arc-consistency,a set of local consistency operators R c is chosen to implement each constraint c of the CSP.Of course,each r ∈R c preserves the solutions of {c }.It is easy to prove that if r preserves the solutions of C and C ⊆C ,then r preserves the solutions C .Therefore ∀r ∈R c ,r preserves the solutions of the CSP.To preserve solutions is a correction property of operators.A notion of com-pleteness is used to choose the set of operators “implementing”a CSP.It ensures to reject valuations which are not solutions of constraints.But this notion is not necessary for our purpose.Indeed,we are only interested in the debugging of miss-ing answers,that is in locating a wrong local consistency operators (i.e.constraints removing too much values).In the following lemmas,we consider S ⊆Sol,that is S a set of solutions of the CSP and S (= t ∈S t )its projection on D .Lemma 2Let S ⊆Sol ,if r preserves the solutions of the CSP then S is r -consistent.Proof.∀t ∈S,t ⊆r (t )so S ⊆ t ∈S r (t ).Now,∀t ∈S,t ⊆ S so∀t ∈S,r (t )⊆r ( S ).Extending definition 5,we say R preserves the solutions of C if for each r ∈R ,r preserves the solutions of C .From now on,we consider that the fixed program R preserves the solutions of the fixed CSP.Lemma 3If S ⊆Sol then S ⊆CL ↓(D ,R ).Proof.by lemmas 1and 2.Finally,the following corollary emphasizes the link between the CSP and the program.Corollary 1 Sol ⊆CL ↓(D ,R ).The downward closure is a superset (an “approximation”)of Sol which is itself the projection (an “approximation”)of Sol.But the downward closure is the most accurate set which can be computed using a set of local consistency operators in the framework of domain reduction without splitting the domain (without search tree).73Expected SemanticsTo debug a constraint program,the programmer must have a knowledge of the problem.If he does not have such a knowledge,he cannot say something is wrong in his program!In constraint programming,this knowledge is declarative.3.1Correctness of a CSPAt first,the expected semantics of the CSP is considered as a set of tuples:the expected solutions .Next definition is motivated by the debugging of missing answer.Definition 6Let S be a set of tuples.The CSP is correct wrt S if S ⊆Sol .Note that if the user exactly knows S then it could be sufficient to test each tuple of S on each local consistency operator or constraint.But in practice,the user only needs to know some members of S and some members of D \ S .We consider the expected environment S ,that is the approximation of S .By lemma 2:Lemma 4If the CSP is correct wrt a set of tuples S then S is R -consistent.3.2Symptom and ErrorFrom the notion of expected environment,we can define a notion of symptom.A symptom emphasizes a difference between what is expected and what is actually computed.Definition 7h ∈D is a symptom wrt an expected environment d if h ∈d \CL ↓(D ,R ).It is important to note that here a symptom is a symptom of missing solution (an expected member of D is not in the closure).Example 5From now on,let us consider the new following CSP in GNU-Prolog :conf(AM,MP,PM,MA):-fd_domain([MP,PM,MA,AM],1,4),MA #>AM,MA #>PM,MP #>AM,PM #>MP,MA #\=4,MP #\=4,AM #\=4,PM #\=4,AM #\=PM.As we know,a solution of the conference problem contains (AM,1).But,the execution provides an empty closure.So,in particular,(AM,1)has been removed.Thus,(AM,1)is a symptom.8Definition 8R is approximately correct wrt d if d ⊆CL ↓(D ,R ).Note that R is approximately correct wrt d is equivalent to there is no symptom wrt d .By this definition and lemma 1we have:Lemma 5If d is R -consistent then R is approximately correct wrt d .In other words,if d is R -consistent then there is no symptom wrt d .But,our purpose is debugging (and not program validation),so:Corollary 2Let S be a set of expected tuples.If R is not approximately correct wrt S then S is not R -consistent,thus the CSP is not correct wrt S .The lack of an expected value is caused by an error in the program,more precisely a local consistency operator.If an environment d is not R -consistent,then there exists an operator r ∈R such that d is not r -consistent.Definition 9A local consistency operator r ∈R is an erroneous operator wrt d if d ⊆r (d ).Note that d is R -consistent is equivalent to there is no erroneous operator wrt d in R .Theorem 1If there exists a symptom wrt d then there exists an erroneous operator wrt d (the converse does not hold).When the program is R = c ∈C R c with each R c a set of local consistency operators preserving the solutions of c ,if r ∈R c is an erroneous operator wrt Sthen it is possible to say that c is an erroneous constraint.Indeed,there exists a value (x,e )∈ S \r ( S ),that is there exists t ∈S such that (x,e )∈t \r (t ).So t is not r -consistent,so t |var(c )∈T c i.e.c rejects an expected solution.4ExplanationsThe previous theorem shows that when there exists a symptom there exists an erroneous operator.The goal of error diagnosis is to locate such an operator from a symptom.To this aim we now define explanations of value removals as in [9],that is a proof tree of a value removal.If a value has been wrongly removed then there is something wrong in the proof of its removal,that is in its explanation.94.1ExplanationsFirst we need some notations.Let d=D\d.In order to help the understanding,we always use the notation d for a subset of D if intuitively it denotes a set of removed values.Definition10Let r be an operator,we denote by r the dual of r defined by:∀d⊆D, r(d)=r(d).We consider the set of dual operators of R:let R={ r|r∈R}.Definition11The upward closure of d by R,denoted by CL↑(d, R)exists and is the least d such that d⊆d and∀r∈R, r(d )⊆d (see[9]).Next lemma establishes the correspondence between downward closure of local consistency operators and upward closure of their duals.Lemma6CL↑(d, R)=CL↓(d,R).Proof.CL↑(d, R)=min{d |d⊆d ,∀ r∈ R, r(d )⊆d }=min{d |d⊆d ,∀r∈R,d ⊆r(d )}=max{d |d ⊆d,∀r∈R,d ⊆r(d )} Now,we associate rules in the sense of[1]with these dual operators.These rules are natural to build the complementary of an environment and well suited to provide proof(trees)of value removals.Definition12A deduction rule is a rule h←B such that h∈D and B⊆D.Intuitively,a deduction rule h←B can be understood as follow:if all the elements of B are removed from the environment,then h does not participate in any solution of the CSP and it can be removed.A very simple case is arc-consistency where theB corresponds to the well-known notion of support of h.But in general(even for hyper arc-consistency)the rules are more intricate.Note that these rules are only a theoretical tool to define explanations and to justify the error diagnosis method.But in practice,this set does not need to be given.The rules are hidden in the algorithms which implement the solver.For each operator r∈R,we denote by R r a set of deduction rules which defines r,that is,R r is such that: r(d)={h∈D|∃B⊆d,h←B∈R r}.For each operator,this set of deduction rules exists.There possibly exists many such sets, but for classical notions of local consistency one is always natural[9].The deduction rules clearly appear inside the algorithms of the solver.In[3]the proposed solver is directly something similar to the set of rules(it is not exactly a set of deduction rules because the heads of the rules do not have the same shape that the elements of the body).10(AM,1)(MA,3)(MA,4)(MA,2)(PM,1)(PM,1)(PM,2)(MP,1)MA >PM MA >PM MA =4PM >MPPM >MP PM >MP MP >AMMA >AMFigure 1:An explanation for (AM,1)Example 6With the GNU-Prolog operator AM in 0..max(MA)-1are associated the deduction rules:•(AM,1)←(MA,2),(MA,3),(MA,4)•(AM,2)←(MA,3),(MA,4)•(AM,3)←(MA,4)•(AM,4)←∅Indeed,for the first one,the value 1is removed from the environment of AM only when the values 2,3and 4are not in the environment of MA.From the deduction rules,we have a notion of proof tree [1].We consider the set of all the deduction rules for all the local consistency operators of R :let R = r ∈R R r .We denote by cons(h,T )the tree defined by:h is the label of its root and T the set of its sub-trees.The label of the root of a tree t is denoted by root(t ).Definition 13An explanation is a proof tree cons(h,T )with respect to R ;it is in-ductively defined by:T is a set of explanations with respect to R and (h ←{root(t )|t ∈T })∈R .Example 7The explanation of figure 1is an explanation for (AM,1).Note that the root (AM,1)of the explanation is linked to its children by the deduction rule (AM,1)←(MA,2),(MA,3),(MA,4).Here,since each rule is associated with an operator which is itself associated with a constraint (arc-consistency case),the constraint is written at the right of the rule.Finally we prove that the elements removed from the domain are the roots of the explanations.Theorem 2CL ↓(D ,R )is the set of the roots of explanations with respect to R .Proof.Let E the set of the roots of explanations wrt to R .By induction on explanations E ⊆min {d |∀ r ∈ R, r (d )⊆d }.It is easy to check thatr (E )⊆E .Hence,min {d |∀ r ∈ R, r (d )⊆d }⊆E .So E =CL ↑(∅, R).11In[9]there is a more general result which establishes the link between the closure of an environment d and the roots of explanations of R∪{h←∅|h∈d}.But here, to be lighter,the previous theorem is sufficient because we do not consider dynamic aspects.All the results are easily adaptable when the starting environment is d⊂D.4.2Computed explanationsNote that for error diagnosis,we only need a program,an expected semantics,a symptom and an explanation for this symptom.Iterations are briefly mentioned here only to understand how explanations are computed in concrete terms,as in the PaLM system[11].For more details see[9].CL↓(d,R)can be computed by chaotic iterations introduced for this aim in[8].The principle of a chaotic iteration[2]is to apply the operators one after the other in a“fairly”way,that is such that no operator is forgotten.In practice this can be implemented thanks to a propagation queue.Since⊆is a well-founded ordering (i.e.D is afinite set),every chaotic iteration is stationary.The well-known result of confluence[5,8]ensures that the limit of every chaotic iteration of the set of local consistency operators R is the downward closure of D by R.So in practice the computation ends when a commonfix-point is reached.Moreover,implementations of solvers use various strategies in order to determine the order of invocation of the operators.These strategies are used to optimize the computation,but this is out of the scope of this paper.We are interested in the explanations which are“computed”by chaotic iterations, that is the explanations which can be deduced from the computation of the closure.A chaotic iteration amounts to apply operators one after the other,that is to apply sets of deduction rules one after another.So,the idea of the incremental algorithm [9]is the following:each time an element h is removed from the environment by a deduction rule h←B,an explanation is built.Its root is h and its sub-trees are the explanations rooted by the elements of B.Note that the chaotic iteration can be seen as the trace of the computation, whereas the computed explanations are a declarative vision of it.The important result is that CL↓(d,R)is the set of roots of computed explana-tions.Thus,since a symptom belongs to there always exists a computed explanation for each symptom.5Error DiagnosisIf there exists a symptom then there exists an erroneous operator.Moreover,for each symptom an explanation can be obtained from the computation.This section describes how to locate an erroneous operator from a symptom and its explanation.125.1From Symptom to ErrorDefinition14A rule h←B∈R r is an erroneous rule wrt d if B∩d=∅and h∈d.It is easy to prove that r is an erroneous operator wrt d if and only if there exists an erroneous rule h←B∈R r wrt d.Consequently,theorem1can be extended into the next lemma.Lemma7If there exists a symptom wrt d then there exists an erroneous rule wrt d.We say a node of an explanation is a symptom wrt d if its label is a symptom wrt d.Since,for each symptom h,there exists an explanation whose root is labeled by h,it is possible to deal with minimality according to the relation parent/child in an explanation.Definition15A symptom is minimal wrt d if none of its children is a symptom wrt d.Note that if h is a minimal symptom wrt d then h∈d and the set of its children B is such that B⊆d.In other words h←B is an erroneous rule wrt d.Theorem3In an explanation rooted by a symptom wrt d,there exists at least one minimal symptom wrt d and the rule which links the minimal symptom to its children is an erroneous rule.Proof.Since explanations arefinite trees,the relation parent/child iswell-founded.To sum up,with a minimal symptom is associated an erroneous rule,itself as-sociated with an erroneous operator.Moreover,an operator is associated with,a constraint(e.g.the usual case of hyper arc-consistency),or a set of constraints. Consequently,the search for some erroneous constraints in the CSP can be done by the search for a minimal symptom in an explanation rooted by a symptom.5.2Diagnosis AlgorithmsThe error diagnosis algorithm for a symptom(x,e)is quite simple.Let E the computed explanation of(x,e).The aim is tofind a minimal symptom in E by asking the user with questions as:“is(y,f)expected?”.Note that different strategies can be used.For example,the“divide and conquer”strategy:if n is the number of nodes of E then the number of questions is O(log(n)), that is not much according to the size of the explanation and so not very much compared to the size of the iteration.13Example8Let us consider the GNU-Prolog CSP of example5.Remind us that its closure is empty whereas the user expects(AM,1)to belong to a solution.Let the explanation offigure1be the computed explanation of(AM,1).A diagnosis session can then be done using this explanation tofind the erroneous operator or constraint of the CSP.Following the“divide and conquer”strategy,first question is:“Is(MA,3)a symptom ?”.According to the conference problem,the knowledge on MA is that Michael wants to know other works before presenting is own work(that is MA>2)and Michael cannot stay the last half-day(that is MA is not4).Then,the user’s answer is:yes.Second question is:“Is(PM,2)a symptom?”.According to the conference prob-lem,Michael wants to know what Peter have done before presenting his own work to Alan,so the user considers that(PM,2)belongs to the expected environment:its answer is yes.Third question is:“Is(MP,1)a symptom?”.This means that Michael presents his work to Peter before Peter presents his work to him.This is contradicting the conference problem:the user answers no.So,(PM,2)is a minimal symptom and the rule(PM,2)←(MP,1)is an erroneous one.This rule is associated to the operator PM in min(MP)+1..infinite,associated to the constraint PM>MP.Indeed,Michael wants to know what Peter have done before presenting his own work would be written PM<MP.Note that the user has to answer to only three questions whereas the explanation contains height nodes,there are sixteen removed values and eighteen operators for this problem.So,it seems an efficient way tofind an error.Note that it is not necessary for the user to exactly know the set of solutions,nor a precise approximation of them.The expected semantics is theoretically considered as a partition of D:the elements which are expected and the elements which are not. For the error diagnosis,the oracle only have to answer to some questions(he has to reveal step by step a part of the expected semantics).The expected semantics can then be considered as three sets:a set of elements which are expected,a set of elements which are not expected and some other elements for which the user does not know.It is only necessary for the user to answer to the questions.It is also possible to consider that the user does not answer to some questions, but in this case there is no guarantee tofind an error[16].Without such a tool,the user is in front of a chaotic iteration,that is a wide list of events.In these conditions, it seems easier tofind an error in the code of the program than tofind an error in this wide trace.Even if the user is not able to answer to the questions,he has an explanation for the symptom which contains a subset of the CSP constraints.6ConclusionOur theoretical foundations of domain reduction have permitted to define notions of expected semantics,symptom and error.14。
试题英文数理统计
一、填空(一)各章节的introduction1、Continuous variables or interval data can assume any value in some interval of real numbers.连续变量或间隔数据可以假设在某个实数间隔中的任意值。
(measurement)Discrete variables assume only isolated values.离散变量只假定孤立的值。
(counting)11、The lower or first quartile is the 25th percentile and the upper or third quartile is the 75th percentile.12、The fist qurtile Q1 is the median of the observations falling below the median of the entire sample and the third quartile Q3 is the median of the observations falling above the median of the entire sample.The interquartile range is defined as IQR=Q3-Q1.第一个四分位数Q1是低于整个样本中位数的观测值的中位数,第三个四分位数Q3是高于整个样本中位数的观测值的中位数。
四分位数范围定义为IQR=Q3-Q1。
2、Statistics applied to the life sciences in often called biostatistics or biometry.统计学应用于生命科学,通常称为生物统计学或生物计量学。
3、A descriptive measure associated with a random variable when it is considered over the entire population is called a parameter.当在整个总体中考虑一个随机变量时,与它相关的描述性度量称为参数4、One is forced to examine a subset or sample of the population and make inferences about the entire variable of a sample is called a statistic.人们被迫检查总体中的一个子集或样本,并对样本中的整个变量做出推断,这被称为统计量。
莫里莫里泡沫胶带产品说明书
Table.Spearman Correlations between CRISS and individual components at12months and Comparison of ABA and PBO using CRISS index and individual components at12 months;Outcome ABAN=44PBON=44TreatmentDifference(ABA-PBO)P-value^ACR CRISS(0.0-1.0) median(IQR)0.68(1.00)0.01(0.86)0.03 SpearmanCorrelationLSmean(SE)LSmean(SE)LS mean(SE)P-value^^D mRSS(0-51)-0.75*-6.7(1.30)-3.8(1.23)-2.9(1.75)0.10D FVC%predicted0.36*-1.4(1.30)-3.1(1.20)1.7(1.72)0.32D PTGA(0-10)-0.17-0.50(0.392)-0.30(0.385)-0.20(0.557)0.73D MDGA(0-10)-0.47*-1.34(0.282)-0.18(0.284)-1.16(0.403)0.004D HAQ-DI(0-3)-0.21-0.11(0.079)0.11(0.076)-0.22(0.108)0.05^p-value for treatment comparisons based on Van Elteren test^^p-value for treatment comparisons based on ANCOVA model with treatment,duration of SSc and baseline value as covariates*p<0.01using Spearman correlation coefficientNegative score denotes improvement,except for FVC%where negative score denotes worsening;LS mean=least squares mean;SE=standard errorFRI0328BRANCHED CHAIN AMINO ACIDSIN THE TREATMENTOF POLYMYOSITIS AND DERMATOMYOSITIS:RESULTSFROM THE BTOUGH STUDYNaoki Kimura,Hitoshi Kohsaka.Tokyo Medical and Dental University(TMDU), Department of Rheumatology,Tokyo,JapanBackground:Muscle functions of patients with polymyositis and dermato-myositis(PM/DM)remain often impaired even after successful control of the immune-mediated muscle injury by immunosuppressive therapy.The only effort at the present to regain muscle functions except for the immu-nosuppression is rehabilitation,which is carried out systematically in lim-ited institutes.No medicines for rebuilding muscles have been approved. Branched chain amino acids(BCAA)promote skeletal muscle protein syn-thesis and inhibit muscle atrophy.They thus have positive effects on muscle power,but have never been examined for the effects on PM/DM patients.Objectives:To assess the efficacy and safety of BCAA in the treatment of PM/DM for official approval of their use in Japan.Methods:Untreated adults with PM/DM were enrolled in a randomized, double-blind trial to receive either TK-98(drug name of BCAA)or pla-cebo in addition to the conventional immunosuppressive agents.One package of TK-98(4.15g)contained L-isoleucine952mg,L-leucine 1904mg,and L-valine1144mg(molar ratio is1:2:1.35),and6packages were administered daily in3divided doses.After12weeks,patients with average manual muscle test(MMT)score less than9.5were enrolled in an open label extension study for12weeks.The primary end point was the change of the MMT score at12weeks.The secondary end points were the disease activity evaluated with myositis disease activity core set (MDACS)and the change of functional index(FI),which evaluates dynamic repetitive muscle functions.Results:Forty-seven patients were randomized to the TK-98(24patients [12with PM and12with DM])and placebo(23patients[11with PM and12with DM])groups.The baseline MMT scores were equivalent (7.97±0.92[mean±SD]in the TK-98group and7.84±0.86in the placebo group).The change of MMT scores at12weeks were0.70±0.19(mean ±SEM)and0.69±0.18,respectively(P=0.98).Thirteen patients from the TK-98group and12from the placebo group were enrolled in the exten-sion study.The MMT scores in both groups improved comparably throughout the extension study.The increase of the FI scores of the shoulder flexion at12weeks was significantly larger in the TK-98group (27.9±5.67and12.8±5.67in the right shoulder flexion[P<0.05],27.0±5.44and13.4±5.95in the left shoulder flexion[P<0.05]).The improvement rate of the average FI scores of all tested motions(head lift,shoulder flexion,and hip flexion)through the first12weeks was larger in the TK-98group.No difference was found in the disease activ-ity throughout the study period.Frequencies of the adverse events until 12weeks were comparable.Conclusion:Although BCAA exerted no effects in the improvement of the muscle strength evaluated with MMT,they were effective in the improve-ment of dynamic repetitive muscle functions in patients with PM/DM with-out significant increase of adverse events.Disclosure of Interests:None declaredDOI:10.1136/annrheumdis-2019-eular.5235FRI0329ANALYSIS OF11CASES OF ANTI-PL-7ANTIBODYPOSITIVE PATIENTS WITH IDIOPATHIC INFLAMMATORYMYOPATHIES.MALIGNANCY MAY NOT BE UNCOMMONCOMPLICATION IN ANTI-PL-7ANTIBODY POSITIVEMYOSITIS PATIENTSTaiga Kuga,Yoshiyuki Abe,Masakazu Matsushita,Kurisu Tada,Ken Yamaji, Naoto Tamura.Juntendo University School of Medicine,Department of Internal Medicine and Rheumatology,Tokyo,JapanBackground:Various autoantibodies are known to be related to idiopathic inflammatory myopathies(IIM).Anti-PL-7antibody is anti-threonyl-tRNA synthetase antibody associated with antisynthetase syndrome(ASS).Since anti-PL-7antibody is rare(mostly1-4%of myositis,while a Japanese study reported17%),little is known as to clinical characteristics of it(1). Objectives:To analyze clinical characteristics of anti-PL-7positive IIM patients.Methods:Anti-PL-7antibody was detected by EUROLINE Myositis Profile 3.IIM diagnosis was made by the2017EULAR/ACR classification criteria (2)and/or Bohan And Peter classification(3).Eleven anti-PL-7antibody positive adult patients(all female),age at onset(61.5±12.6years)were enrolled in this study between2009and2018.Clinical manifestations, laboratory and instrumental data were reviewed in this single centre retro-spective cohort.Results:Characteristic symptoms were identified at diagnosis:skin mani-festations(7/11cases,63.6%),muscle weakness(8/11cases,72.7%), arthralgia(5/11cases,45.5%)and Raynaud’s phenomenon(4/11cases, 36.4%).Myogenic enzymes were elevated in most cases(10/11cases, 90.9%).ILD was detected in all patients(11/11cases,100%)and2 patients(18.2%)developed rapidly progressive rgest IIM subtype was polymyositis(PM,5/11cases),followed by dermatomyositis(DM,3/ 11cases)and amyopathic dermatomyositis(ADM,3/11cases).Five patients(45.5%)complicated with malignancy within3years from the diagnosis of IIM.Though clinical manifestations and laboratory data showed any difference between malignancy group and non-malignancy group,all3ADM cases but no DM cases complicated with malignancy in this study.Conclusion:Anti-PL-7antibody positive IIM patients frequently complicated with ILD.Frequency of cancer in ASS patients within three years from diagnosis was 1.7%and not much different from the general population in previous report from France(4).Though this study only included IIM patients and may have selection bias,careful malignancy survey may be essential in Anti-PL-7antibody positive IIM patients.REFERENCES:[1]Y Yamazaki,et al.Unusually High Frequency of Autoantibodies to PL-7Associated With Milder Muscle Disease in Japanese Patients With Poly-myositis/DermatomyositisARTHRITIS&RHEUMATISM Vol.54,No.6, June2006,pp2004–2009[2]Lundberg IE,Tjärnlund A,Bottai M,et al.EULAR/ACR classification crite-ria for adult and juvenile idiopathic inflammatory myopathies and their Major Subgroups.Ann Rheum Dis.2017;76:1955–64.[3]Bohan A,Peter J.Polymyositis and dermatomyositis.N Engl J Med1975,292:344-347;403-407.[4]Hervier B,et al.Hierarchical cluster and survival analyses of antisynthe-tase syndrome:phenotype and outcome are correlated with anti-tRNA syn-thetase antibody specificity.Autoimmunity reviews.2012;12:210–217. Disclosure of Interests:Taiga Kuga:None declared,Yoshiyuki Abe:None declared,Masakazu Matsushita:None declared,Kurisu Tada Grant/ research support from:Eli Lilly,Ken Yamaji:None declared,Naoto Tamura Grant/research support from:Astellas Pharma Inc.,Asahi Kasei Pharma,AYUMI Pharmaceutical Co.,Chugai Pharmaceutical Co.LTD, Eisai Inc.,:Takeda Pharmaceutical Company Ltd.,Speakers bureau:Jans-sen Pharmaceutical K.K.,Bristol-Myers Squibb K.K.,:Mitsubishi Tanabe Pharma Co.DOI:10.1136/annrheumdis-2019-eular.4150846Friday,14June2019Scientific Abstractson December 25, 2023 by guest. Protected by copyright./Ann Rheum Dis: first published as 10.1136/annrheumdis-2019-eular.5235 on 27 May 2019. Downloaded from。
标准误差standarderro...
标准差(Standard Deviation),也称均方差(mean squar e error),是各数据偏离平均数的距离的平均数,它是离均差平方和平均后的方根,用σ表示。
标准差是方差的算术平方根。
标准差能反映一个数据集的离散程度。
平均数相同的,标准差未必相同。
简介标准差也被称为标准偏差,或者实验标准差,公式如图。
简单来说,标准差是一组数据平均值分散程度的一种度量。
一个较大的标准差,代表大部分数值和其平均值之间差异较大;一个较小的标准差,代表这些数值较接近平均值。
例如,两组数的集合 {0, 5, 9, 14} 和 {5, 6, 8, 9} 其平均值都是 7 ,但第二个集合具有较小的标准差。
标准差可以当作不确定性的一种测量。
例如在物理科学中,做重复性测量时,测量数值集合的标准差代表这些测量的精确度。
当要决定测量值是否符合预测值,测量值的标准差占有决定性重要角色:如果测量平均值与预测值相差太远(同时与标准差数值做比较),则认为测量值与预测值互相矛盾。
这很容易理解,因为如果测量值都落在一定数值范围之外,可以合理推论预测值是否正确。
标准差应用于投资上,可作为量度回报稳定性的指标。
标准差数值越大,代表回报远离过去平均数值,回报较不稳定故风险越高。
相反,标准差数值越细,代表回报较为稳定,风险亦较小。
例如,A、B两组各有6位学生参加同一次语文测验,A组的分数为95、85、75、65、55、45,B组的分数为73、72、71、69、68、67。
这两组的平均数都是70,但A组的标准差为17.07分,B组的标准差为2.37分(此数据时在R统计软件中运行获得),说明A组学生之间的差距要比B组学生之间的差距大得多。
如是总体,标准差公式根号内除以n如是样本,标准差公式根号内除以(n-1)因为我们大量接触的是样本,所以普遍使用根号内除以(n-1)公式意义所有数减去其平均值的平方和,所得结果除以该组数之个数(或个数减一),再把所得值开根号,所得之数就是这组数据的标准差。
interval
Using small bias nonparametric density estimators for confidenceinterval estimationMarco Di MarzioDipartimento di Metodi Quantitativi e Teoria Economica,Universit`a di Chieti-Pescara,Viale Pindaro42,65127Pescara,Italydimarzio@dmqte.unch.itCharles C.TaylorDepartment of Statistics,University of Leeds,Leeds LS29JT,UKc.c.taylor@SummaryConfidence intervals for densities built on the basis of standard nonparametric theory are doomed to have poor coverage rates due to bias.Studies on coverage improvement exist,but reasonably behaved interval estimators are needed.We explore the use of small bias kernel–based methods to construct confidence intervals,in particular using a geometric density estimator that seems particularly suited for this purpose.Some key words:Bootstrap;Coverage rate;Geometric density estimators;Higher order bias estimators; Kernel density estimators.1IntroductionNonparametric density estimation is plagued by the bias problem,and much effort has been devoted to obtain modified estimators with a smaller bias.Jones&Signorini(1997)perform an extensive,MISE based simulation study where many of these small bias,kernel-based estimators are compared.Their final advice favours the use of the standard kernel method in many situations.For confidence sets bias obviously imparts poor coverage,and bootstrapping methods do not provide a remedy because the bootstrap expectation of a linear nonparametric estimator is the estimate itself. Hall(1992)accurately treats bootstrap confidence intervals for kernel density estimation,and concludes that undersmoothing is preferable to explicit bias estimation.After observing that Hall’s undersmoothing deteriorates the variance estimate,and consequently is unable to guarantee the promised coverage,Chen (1996)uses empirical likelihood to avoid this reportedflaw.To date,it seems that off-the-shelf methods for confidence interval estimation of densities are still needed.Actually,the above studies do not give rules for practical bandwidth selection,and little account is taken of the expected width.In this paper we explore the feasibility of confidence intervals on the basis of small bias density estimators.Apart from Hall(1992),who studies how undersmoothing of higher-order kernel estimators influences the coverage,this strategy has not been fully explored.A reason could be that often many small bias estimators neither have small theoretical minimum MISEs(as pointed out by Jones&Signorini (1997)),nor possess efficient data driven bandwidth selection.Another reason could be these methods produce estimates that are not densities.But notice that here the coverage is our main target,therefore an estimator’s performance relies primarily on integrated squared bias;much less on MISE.In addition,since ourfinal goal is a confidence interval,the fact that the estimate does not integrate to one is of secondary importance.However,the non-negativity constraint—violated by higher-order kernel estimators—remains relevant when estimating in the tails.In this paper we focus on a geometric density estimator enjoying substantial bias reduction,efficient variance estimation and simple data-driven bandwidth selection.Our simulation study compares other known reduced bias estimators for which it is straightforward to obtain a bandwidth selector.Having said that there is an edge for our method,it seems that all the bias-reduction estimators tried give reasonable performance,even for small samples.Note that,in contrast to other studies on bias reduction,we use specific data-driven bandwidths,therefore our results are of practical relevance.In Section2we present the estimator and derive its main properties.In Section3we develop simple tools to build confidence intervals,such as a variance estimator and a simple method to approximate bootstrap distributions.In Section4we obtain normal-based bandwidth selectors for our method as well as for a number of estimators included in the unifying paper of Jones&Signorini(1997).Section 5illustrates the simulation study.Finally,some concluding remarks are given in Section6.Some preliminaries follow.Given an independent,identically distributed sample X1,...,X n from some unknown density f,the usual kernel density estimate of f at x isˆf(x;h):=1nhni=1Kx−X ih,(1·1)the function K,measurable and integrating to1,is the kernel,the positive real number h is the bandwidth. If f has at least p>1derivatives in a neighbourhood of x,a Taylor series expansion givesE{ˆf(x;h)}−f(x)=p−1k=1h k(−1)kk!f(k)(x)µk(K)+O(h p),(1·2)whereµk(K):=u k K(u)du.If K is a density–as in the standard case–then the bias is O(h2).Ifan estimator has bias of order O(h p)with p>2,we refer to it as a small(or also higher-order)bias estimator.K is said to have order p ifµk(K)=0,0<k<p,and0<µp(K)<+∞.2A Geometric EstimatorOur estimator of f at x isˆf M (x;h):=ˆf2(x;h)ˆf(x;21/2h);it was originally motivated by the observation that the expectation of the smoothed bootstrap normal-kernel estimator is simplyˆf(x;21/2h).Hence,if we suppose thatˆf(x;21/2h)is an estimate ofˆf(x;h) in the same way thatˆf(x;h)is an estimate of f(x),we could obtain the additive estimatorˆf A(x;h):= 2ˆf(x;h)−ˆf(x;21/2h)or the multiplicative estimatorˆf M.ˆf A amounts to(1·1)equipped with the fourth order kernel2K−K∗K,i.e.the“twicing”kernel proposed infixed design regression by Stuetzle& Mittal,(1979).Althoughˆf A is simpler to analyze and,to the best of our knowledge,has not been studied, we will concentrate onˆf M because it cannot take negative values.As pointed out by Jones&Foster(1993),ˆf M is a specific example within a family of special cases of a more general technique.This family has been referred to by them as“generalized jackknifing on thelog scale”,andˆf M is thus an example of a multiplicative form,akin to that of Terrell&Scott(1980). Although this is a multiplicative estimator,we note that it is distinct from that of Jones et al.(1995).As it will be seen below,our estimator has a smaller bias thanˆf,but at the price that it does not integrate to one.2·1Moments calculationTo approximate expectation and variance,we use these standard second order approximationsE{ˆf M(x;h)}E{ˆf2(x;h)}E{ˆf(x;21/2h)}−cov{ˆf2(x;h),ˆf(x;21/2h)}E{ˆf(x;21/2h)}2+E{ˆf2(x;h)}E{ˆf(x;21/2h)}3var{ˆf(x;21/2h)},(2·1)var{ˆf M(x;h)}E{ˆf2(x;h)}E{ˆf(x;21/2h)}2var{ˆf2(x;h)}E{ˆf2(x;h)}2+var{ˆf(x;21/2h)}E{ˆf(x;21/2h)}2−2cov{ˆf2(x;h),ˆf(x;21/2h)}E{ˆf2(x;h)}E{ˆf(x;21/2h)}.(2·2)To make both the estimator and the above expansions well defined,we require the various denom-inators to be strictly positive everywhere in the support.This appears a good reason for using gaus-sian kernels.Now we calculate the exact expressions of the terms involved in(2·1)and(2·2)in the case of gaussian kernels.For brevity,we use notation which suppresses the dependence on x,and let φ(h):=E{K h(x−X1)}.We straightforwardly obtainE{ˆf2(x;h)}=1n(n−1)φ(h)2+φ(h/21/2)2π1/2h,E{ˆf(x;21/2h)}=φ(21/2h),var{ˆf(x;21/2h)}=1nφ(h)81/2π1/2h−φ(21/2h)2.Moreover,tedious but simple algebra shows thatn4var{ˆf2(x)}=nφ(h/2)321/2π3/2h3/2−φ(h/21/2)24πh2+2n!(n−2)!φ(h/21/2)24πh2−φ(h)4+4n!(n−3)!φ(h)2φ(h/21/2)2π1/2h−φ(h)2+φ(h)n−3φ(h/31/2)121/2πh2−φ(h/21/2)2π1/2hφ(h);cov{ˆf2(x;h),ˆf(x;21/2h)}=n!(n−2)!(n−2)φ(h)2φ(21/2h)+φ(h/21/2)2π1/2hφ(21/2h)+2φ((2/3)1/2h)61/2π1/2hφ(h)+n201/2πh2φ((2/5)1/2h)−1n(n−1)φ(h)2+φ(h/21/2)2π1/2h.For both var{ˆf2(x;h)}and cov{ˆf2(x;h),ˆf(x;21/2h)}very simple O{(nh)−1}approximations are imme-diate.2·2Mean-squared errorExpression(2·1)can be used to obtain an expression for the bias.Firstly note that the variance and the covariance terms have the same magnitude and partially cancel out.Now,assuming that f iv(x)exists, from the above theory and using Equation(1·2)with p=2,we haveE{ˆf M(x;h)}=φ(h)2+O{(nh)−1}φ(21/2h)+O(nh)−1=f(x){f(x)+h2f (x)µ2(K)}f(x)+h2f (x)µ2(K)+O{h4+(nh)−1},and so,as expected,the bias is higher-order.In particular,if h=O(n−c),c=1/5,the bias order is O(h4).In addition,if we use a gaussian kernel then we obtain the bias as h4/4×(f 2/f−f iv)+o(h4).Now consider this large sample expression of expansion(2·2)avar{ˆf M(x;h)}=φ(h)4nhπ1/2φ(21/2h)22φ(h/21/2)φ(h)2+φ(h)81/2φ(21/2h)2−4φ((2/3)1/2h)61/2φ(h)φ(21/2h).If we assume a gaussian kernel,a Taylor series expansion gives the above variance as0.72×f(x)/(nhπ1/2)+ O(h/n),which closely recalls the corresponding bootstrap estimator,i.e.0.50׈f(h;x)/(nhπ1/2).Hence,the asymptotic mean-squared error isAMSE{ˆf M(x;h)}=h816f 2(x)f(x)−f iv(x)2+0.72f(x)nhπ.Concerningˆf A,similar calculations giveAMSE{ˆf A(x;h)}=h816f iv(x)2+0.72f(x)nhπ1/2.Both of thesefit into the general form of MSE expressions for small bias estimators in Jones&Signorini (1997).3Confidence Interval EstimationTo construct a100(1−α)%confidence interval,we consider normal(I N)(also considered by Chaudhuri &Marron,1999)and bootstrap percentile(I B)methods:I N:=ˆfM−zα/2ˆvar(ˆf M)1/2,ˆf M+zα/2ˆvar(ˆf M)1/2,I B:=(F∗−1M (α/2),F∗−1M(1−α/2)),where F∗−1M (u):=inf{x:F∗M(x)≥u},with F∗M(x)the bootstrap distribution ofˆf M(x),andˆvar{ˆf M}derived below.An alternative to the Normal-based confidence interval,which also avoids the resamplingprocess,follows.By analogy with the estimation of a spectral density (Tukey,1949),approximate by a χ2distribution:F ∗M (x )=x 0a (u )χ2b (u )(u )duwith a (u ),b (u )chosen to match the mean and variance of f ∗M (u ):a (x ):=var ∗f ∗M (x )2E ∗f ∗M (x ),b (x ):=2{E ∗f ∗M (x )}2var ∗f ∗M (x ),this leads to a third method denoted as I χ2.To get an estimator of the variance of ˆf M ,we replace f (x )with ˆf (x ;h )in each of the momentsexpressions of Section 2·1,then calculate the resulting convolutions.We obtain:ˆE {ˆf 2(x ;h )}=(n −1)ˆf 2(x ;21/2h )+ˆf (x ;(3/2)1/2h )/(2π1/2h )n 2;ˆE{ˆf (x ;21/2h )}=ˆf (x ;31/2h );ˆvar {ˆf (x ;21/2h )}=1n ˆf (x ;21/2h )81/2π1/2h−ˆf 2(x ;31/2h ) ;n 3ˆvar {ˆf 2(x )}=4(n −1)(n −2)ˆf 2(x ;21/2h ) ˆf (x ;(3/2)1/2h )2π1/2h −ˆf 2(x ;21/2h ) + ˆf (x ;(5/4)1/2h )321/2π3/2h 3−ˆf 2(x ;(3/2)1/2h )4πh 2 +2(n −1) ˆf 2(x ;(3/2)1/2h )4πh 2−ˆf 4(x ;21/2h ) +4(n −1)ˆf (x ;21/2h ) ˆf (x ;(4/3)1/2h )12πh −ˆf (x ;21/2h )ˆf (x ;(3/2)1/2h )2πh;n 2ˆcov ˆf 2(x ;h ),ˆf (x ;21/2h ) =2(n −1)61/2π1/2h ˆf (x ;21/2h )ˆf (x ;(5/3)1/2h )−ˆf (x ;(3/2)1/2h )ˆf (x ;31/2h )2π1/2h−2(n −1)ˆf 2(x ;21/2h )ˆf (x ;31/2h )+ˆf (x ;(7/5)1/2h )201/2πh 2.Finally,we combine the above estimators using formula (2·2).We notice that ˆvar {ˆf M }is made ofratios with the same bias order both in the numerator and in the denominator.So the bias of ˆvar {ˆf M }isstrongly reduced just like the bias of ˆf M .We note that,in our simulations with small sample sizes,thisestimate of the variance is not always positive in the tails of the distributions.4Normal Reference Bandwidth SelectionA very simple bandwidth selector for ˆfis the normal scale rule .It results from assuming f ∼N (µ,σ2),then minimizing the asymptotic MISE;this gives h NS =1.06ˆσn −1/5.We now give similar rules formany small bias bias estimators.Jones&Signorini(1997)give the approximated AMSE of many O(h4)-bias estimators.All of these,together with the corresponding results forˆf M andˆf A,can be integrated under the normal assumption,then optimized over h.This leads to bandwidth selectors of the form h AMISE=cˆσn−1/9.The coefficients c are summarized in Table1.[Table1about here]Clearly,if f is not very smooth the above selectors have poor performance.However,very importantly, these selectors appear particularly appropriate for confidence intervals.In fact,differently fromˆf,these estimators are O(h4)biased,so here undersmoothing(e.g.Hall,1992)for coverage is less crucial.5SimulationsIn all of our experiments the kernels are gaussian;the bandwidths are given by the normal-based plug-in rules specified in Table1;the confidence levels are1−α=0.95,and,finally,the number of bootstrap samples is1000.5·1Interval estimationAs afirst case study,we use the setup of Hall(1992):estimate the standard normal,and a symmetric, bimodal,normal mixture at0,0.75and1.5;use n=50and n=100.Also Chen(1996)estimates the standard normal density at0with n=50,at predetermined smoothing levels.Our results–contained in Table2–are averages over100000simulations.It can be seen that,although computationally much faster,I N and Iχ2are not quite as good as I B,but all of them favourably compare with Hall(1992)and Chen(1996)methods.The coverage at x=1.5for the bimodal density is poor.The density is at a local minimum here,and so any smoother will have a positive bias.This is compounded by the fact thatˆf M typically integrates to more than one,so is already positvely biased.Finally,matters are made even worse by the fact that the plug-in rule for h is based on a(unimodal)normal density;a measure based on the interquartile range(rather than s)could ameliorate the problem somewhat. Of course,if the goal was estimation only at a point,then we could explore more adaptive choices of smoothing parameter as well.[Table2about here]The second case study is more general:we estimate models#1(Gaussian),#2(Skewed Unimodal)and#6(Bimodal)of Marron&Wand(1992)at300equispaced points in[−3,3];sample sizes are50 e bootstrap percentile confidence intervals based on various estimators.In particular,two well-established small bias methods—the proposal of Jones et al.(1995)(ˆf JLN)and the fourth-order kernel estimator in Section2.1of Jones&Signorini(1997)(ˆf FO)—along withˆf are included.The performance indexes are:P:=p(x)f(x)dx,W:=w(x)f(x)dx,and O:=|1−α−p(x)|w(x)f(x)dx where p is the coverage,and w is the average width.Strictly,narrower intervals are of importance only when the desired coverage is attained,nevertheless practical usefulness is often attained by a convenient coverage/width trade-off.With this objective,we formulated the omnibus index O as a “substantial performance”indicator,not without difficulties.Table3gives the results for each of these measures(P,W,O)calculated on10000samples.[Table3about here]It can be seen from Table3that small bias methods give much better coverage thanˆf,recalling that the bandwidth is always automatically selected.The results forˆf A(not shown)were quite similar to those ofˆf M,but not quite as good.In order to investigate whyˆf M seems to outperformˆf JLN andˆf FO, we now consider a more typical analysis of performance in point estimation.5·2Point estimationFor the same models and estimators as before,we have calculated the usual L2integrated discrepancies on[-3,3].For each model10000samples were drawn.Each column of Table4gives the ratio of MISE, integrated variance and integrated bias-squared between an element of{ˆf M,ˆf JLN,ˆf FO}and those ofˆf. As can be seen from Table4,ˆf M is the best in bias reduction,even though it is not so good at minimizing MISE.Interestingly,it seems more robust to less smooth densities like model#6.[Table4about here]6Concluding RemarksIt is well known that small bias methods produce estimates that do not integrate to1,and/or take negative values.A big number of techniques that transform these estimates into densities have been proposed;see Glad et al.(2003).Nevertheless,we have preferred to not involve estimate corrections. This is simply to avoid linking estimators’performances–both absolute and relative–to a subjectivechoice of the correction method.It would be a subjective choice exactly because the formal properties of these estimators refer to the uncorrected versions.The only fair alternative could have been to select the best correction method for each pair{estimator,model},but this seems a long way from practical usage.However,we note that the correction subject seems problematic,for example,Glad et al.(2003) show that the simple dividing by the integral of the estimate–inappropriate for correcting higher-order kernel methods–could even deteriorate the performance,depending on the model to estimate,and with no way to know this in advance from the data.Higher-order bias methods have been much studied in kernel density estimation,but are much less used.In this paper,there seems to be justification for using such methods when the goal is confidence interval estimation rather than point estimation.In this case,it seems that the strength of any method lies mainly in its ability to reduce bias with the availability of a suitable plug-in rule for the smoothing parameter.Further work could extend these methods to nonparametric regression,which could also be incorporated in hypothesis testing,for example,in tools such as SiZer(Chaudhuri&Marron,1999).Finally,we note that our data-based smoothing parameters are chosen to minimize AMISE(under a normal assumption),whereas practical selectors“optimizing”the coverage do not exist at all.How-ever,these AMISE-bandwidth selectors may nevertheless provide a good trade-offbetween coverage and expected width in many situations.ReferencesChaudhuri,P.and Marron,J.S.(1999).SiZer for exploration of structures in curves.Journal of the American Statistcal Association,94,807–823.Chen,S.X.(1996).Empirical likelihood confidence intervals for nonparametric density estimation.Biometrika,83,329–341.Glad,I.K.,Hjort,N.L.and Ushakov,N.(2003).Correction of density estimators that are not densities.Scandinavian Journal of Statistics.30,415–427.Hall,P.(1992).Effect of bias estimation on coverage accuracy of bootstrap confidence intervals for a probability density.The Annals of Statistics,20,675–694.H¨o ssjer,O.and Ruppert,D.(1995).Asymptotics for the transformation kernel density estimator.The Annals of Statistics,23,1198–1222.Jones,M.C.and Foster,P.J.(1993).Generalized Jackknifing and higher order kernels.Nonparametric Statistics,3,81–94.Jones,M.C.,Linton,O.and Nielsen,J.P.(1995).A simple bias reduction method for density estimation.Biometrika,82,327–38.Jones,M.C.and Signorini,D.F.(1997).A comparison of higher-order bias kernel density estimators.Journal of the American Statistical Association,92,1063–1073.Marron,J.S.and Wand,M.P.(1992).Exact mean integrated squared error.The Annals of Statistics, 20,712–736.Stuetzle,W.and Mittal,Y.(1979).Some comments on the asymptotic behavior of robust smoothers.In Smoothing Techniques for Curve Estimation.Proceedings,Heidelberg1979.Berlin:Springer-Verlag. Terrell,G.R.and Scott,D.W.(1980).On improving convergence rates for nonnegative kernel density estimators.The Annals of Statistics.8,1160–1163.Terrell,G.R.and Scott,D.W.(1992).Variable kernel density estimation.The Annals of Statistics,20, 1236–1265.Tukey,J.W.(1949).The sampling theory of power spectrum estimates.Proc.Symp.on Applications of Autocorrelation Analysis to Physical Problems,NAVEXOS–P-735,47–67.Office of Naval Research, Department of the Navy,Washington,USA.Wand,M.P.and Jones.M.C.(1995).Kernel Smoothing.London:Chapman&Hall.List of Tables1Coefficients of h AMISE=cˆσn−1/9for various small bias estimators:ˆf M andˆf A are given in Section2;ˆf FO is the fourth-order kernel estimator;ˆf JF is an estimator(explicitly givenby(4)in Jones&Signorini,1997)of Jones&Foster(1993);ˆf JLN is that of Jones et al.(1995);ˆf HR is an estimator from H¨o ssjer&Ruppert(1995),andˆf TS indicates the variablebandwidth estimator of Terrell&Scott(1992) (12)2Coverages and average widths for various95%confidence interval estimators at x= 0,0.75,1.5of a standard normal,and a bimodal density.Methods are:bootstrap per-centile;normal approximation;χ2approximation.Averages over100000simulations (13)3Integrated performance measures P,W,O for a variety of bootstrap95%percentile confi-dence intervals(indicated by the corresponding point estimator symbol) (14)4Integrated performance measures for a variety of estimators and models.Ratios of esti-mators( f∈{ˆf M,ˆf JLN,ˆf FO}),for MISE,integrated variance,and integrated bias-squared,relative to those ofˆf,i.e.MISE[ f]/MISE[ˆf],Ivar[ f]/Ivar[ˆf],and IBIAS[ f]/IBIAS[ˆf].Quan-tities are averages over10000simulations (15)Methodˆf Mˆf Aˆf FOˆf JFˆf JLNˆf HRˆf TSc0.89280.9126 1.08340.90550.96420.86170.8124Table1:Coefficients of h AMISE=cˆσn−1/9for various small bias estimators:ˆf M andˆf A are given in Section2;ˆf FO is the fourth-order kernel estimator;ˆf JF is an estimator(explicitly given by(4)in Jones &Signorini,1997)of Jones&Foster(1993);ˆf JLN is that of Jones et al.(1995);ˆf HR is an estimator from H¨o ssjer&Ruppert(1995),andˆf TS indicates the variable bandwidth estimator of Terrell&Scott(1992).observed coverage rates(average width)N(0,1)0.5N(0,1)+0.5N(3,1) x00.75 1.500.75 1.5I B93.3(0.222)94.3(0.204)94.1(0.151)81.9(0.122)94.4(0.109)65.6(0.093) n=50I N91.6(0.202)94.0(0.193)93.9(0.152)74.1(0.107)95.2(0.107)81.6(0.107) Iχ292.4(0.202)94.4(0.192)93.2(0.150)78.0(0.106)94.8(0.107)76.0(0.107)I B93.6(0.163)94.7(0.151)94.2(0.112)77.7(0.090)94.6(0.081)52.3(0.070) n=100I N92.3(0.153)94.3(0.145)94.7(0.114)69.8(0.081)95.0(0.081)71.0(0.080) Iχ292.9(0.153)94.5(0.145)93.8(0.114)73.0(0.081)94.6(0.081)66.0(0.080) Table2:Coverages and average widths for various95%confidence interval estimators at x=0,0.75,1.5 of a standard normal,and a bimodal density.Methods are:bootstrap percentile;normal approximation;χ2approximation.Averages over100000simulations.Performance Measures:100P,W,100O#1#2#6n=50ˆf89.4,0.167,0.9486.7,0.214,1.9081.0,0.140,2.00ˆf93.3,0.191,0.2892.4,0.243,.05587.0,0.162,1.26Mˆf90.3,0.167,0.7187.2,0.215,1.6974.8,0.138,2.86JLNˆf92.8,0.183,0.3691.5,0.233,0.7884.3,0.153,1.63FOn=500ˆf90.3,0.0722,0.3585.5,0.0904,0.9373.6,0.0610,1.30ˆf94.0,0.0713,0.0691.4,0.0891,0.3172.6,0.0607,1.30Mˆf92.5,0.0643,0.1586.1,0.0801,0.7954.3,0.0536,2.26JLNˆf93.5,0.0684,0.0989.5,0.0853,0.5066.9,0.0579,1.62FOTable3:Integrated performance measures P,W,O for a variety of bootstrap95%percentile confidence intervals(indicated by the corresponding point estimator symbol).#1#2#6MISE Ivar IBIAS MISE Ivar IBIAS MISE Ivar IBIASˆf1.0346 1.26920.19170.8971 1.24530.3241 1.0205 1.31840.7668Mn=50ˆf JLN0.85330.98170.39200.82940.98850.5676 1.06950.9510 1.1703ˆf0.9800 1.19560.20570.8904 1.18640.4035 1.0271 1.20790.8730FOˆf0.79030.96430.20500.72230.95830.4178 1.01960.9844 1.0392Mn=500ˆf JLN0.67340.78450.29970.71630.78290.6303 1.28710.7652 1.5774ˆf0.75620.91260.23000.74450.91110.5294 1.11820.9155 1.2309FOTable4:Integrated performance measures for a variety of estimators and models.Ratios of estimators ( f∈{ˆf M,ˆf JLN,ˆf FO}),for MISE,integrated variance,and integrated bias-squared,relative to those ofˆf,i.e.MISE[ f]/MISE[ˆf],Ivar[ f]/Ivar[ˆf],and IBIAS[ f]/IBIAS[ˆf].Quantities are averages over10000 simulations.。
How to Calculate MSE in Excel
How to Calculate MSE in ExcelBy Stephanie Ellen, eHow Contributor| updated July 31, 2011Mean squared error (MSE) is used in statistics to give a numerical value to the difference between values indicated by an estimation and the actual value of the quantity.The larger the MSE, the further away the estimation is from the true data points. To calculate the MSE by hand, you would have to make several calculations that opens the process to error. Using a spreadsheet format, such as Microsoft Excel, cuts down on errors and allows for faster calculationThe mean square error (MSE) is the average of the squared errors between actual and estimated readings in a data sample. Squaring the difference removes the possibility of dealing with negative numbers. It also gives bigger differences more weight than smaller differences in the result. Mean square error is widely used in signal processing applications, such as assessing signal quality,comparing competing signal processing methods and optimizing signal processing algorithmsIn statistics, the mean square error (MSE) is one way to evaluate the difference between an estimator and the true value of the quantity being estimated. MSE measures the average of the square of the "error," with the error being the amount by which the estimator differs from the quantity to be estimatedInstructionso 1 Type the data points in column A, starting in cell A1.2 Type the estimated data points in column B, starting in cell B2.o3Type "=A1-B1" in cell C1, then grab the fill handle, which is the little black square at the bottom right of the cell. Drag the fill handle down the column to match the last row of data you entered in columns A and B.o 4 Type "=C1^2" into cell D2 then grab the fill handle. Pull the fill handle down the column to match the last row you filled in columns A through C.o5Click the first empty cell in column D, then click the sigma symbol in the Ribbon. This action adds the sum of the square of the errors.o 6 Calculate the MSE from the entered data. Click cell E1, then type "=." Click the summation cell, then type "/." Type the number of data points that you entered in column A. Press Enter to get the MSE.。
(刚体点配准错误率的分布)Distribution of Fiducial Registration Error in
IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 28, NO. 11, NOVEMBER 20091791Distribution of Fiducial Registration Error in Rigid-Body Point-Based RegistrationMehdi Hedjazi Moghari and Purang Abolmaesumi*Abstract—Rigid-body point-based registration is frequently used in computer assisted surgery to align corresponding points, or fiducials, in preoperative and intraoperative data. This alignment is mostly achieved by assuming the same homogeneous error distribution for all the points; however, due to the properties of the medical instruments used in measuring the point coordinates, the error distribution might be inhomogeneous and different for each point. In this paper, in an effort to understand the effect of error distribution in the localized points on the performed registration, we derive a closed-form solution relating the error distribution of each point with the performed registration accuracy. The solution uses maximum likelihood estimation to calculate the probability density function of registration error at each fiducial point. Extensive numerical simulations are performed to validated the proposed solution. Index Terms—Fiducial localization error, fiducial registration error, inhomogeneous and anisotropic zero-mean noise, maximum likelihood, registration, target registration error.I. INTRODUCTIONPOINT-BASED rigid-body registration algorithms derive a transformation that maps preoperative data sets to the physical location of a patient anatomy in intraoperative images. This transformation is generally calculated from a set of points (hereafter referred to as fiducials) which are either attached to the patient anatomy, or are extracted from the imaging data. The fiducials can be easily localized in the preoperative and intraoperative data sets; however, their location is generally perturbed by measurement noise which is referred to as fiducial localization error (FLE) in the literature. Due to FLE, the performed registration between preoperative and intraoperative data sets is usually imperfect, leading to an error in fiducial alignments that is referred to as fiducial registration error (FRE). Another important error in rigid-body point-based registration is called target registration error (TRE). TRE is a distance error, after registration, between a pair of fiducial markers in the preoperative and intraoperative data sets which are not used in the registration process.Manuscript received January 27, 2009; revised May 19, 2009. Current version published October 28, 2009. This work was supported in part by the Canadian Institutes of Health Research (CIHR), and in part by the Natural Sciences and Engineering Research Council of Canada (NSERC). Asterisk indicates corresponding author. M. H. Moghari is with the Harvard Medical School, Boston, MA 02215 USA (e-mail: mhedjazi@). *P. Abolmaesumi is with the Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, BC, V6T 1Z4 Canada (e-mail: purang@ece.ubc.ca). Digital Object Identifier 10.1109/TMI.2009.2024208The properties of FRE and TRE, and their relations with FLE have been under investigation for a few decades. Sibson [1], in 1979, for the first time demonstrated that FLE, FRE, and TRE are related to each other. Inspired by Sibson, Fitzpatrick and West [2], [3], in 1998 and 2001, derived the first closed-form solution that estimated the mean squared value of TRE in terms of FLE and the configuration of the registering data sets. Their derivation was based on the assumption that FLE had an identical and isotropic zero-mean Gaussian distribution. Wiles et al. [4], in 2007, extended the Fitzpatrick and West formulation to estimate the mean squared value of TRE when FLE had an identical and anisotropic zero-mean Gaussian distribution. Their method, similar to the Fitzpatrick and West’s, utilized a least mean squares (LMS) algorithm, such as Horn’s [5] or Arun’s [6], to register the data sets and estimate the characteristics of TRE. Ma et al. [7] and Sielhorst et al. [8], in 2007, derived new formulations that estimated the properties of TRE in the presence of inhomogeneous and anisotropic zero-mean Gaussian FLE. In 2008, Moghari and Abolmaesumi [9] utilized the maximum likelihood (ML) algorithm to derive a general solution that estimated the properties of TRE when FLE had an arbitrary zero-mean distribution. The same authors [10], in 2008, mathematically compared Sielhorst [8], Ma [7], Wiles [4], and Fitzpatrick and West [3] algorithms. They demonstrated that all the algorithms, based on their assumptions for the distribution of FLE, converged to the their proposed ML solution. All the above algorithms generate a formulation correlating TRE to FLE. If the distribution of FLE is available, then the derived formulations can be used to estimate the TRE at a desired target location. But, in real clinical applications, the distribution of FLE is usually unknown and it should be somehow computed. To estimate the characteristics of FLE, it maybe possible to calculate the true value of TRE, and utilize the proposed relationships between FLE and TRE to estimate the properties of FLE. Yet, it is difficult to calculate the actual value of TRE in real clinical situations. To overcome this difficulty, Balachendran and Fitzpatrick [11] proposed to use the characteristics of FRE, rather than TRE, to estimate the properties of FLE. These authors, for the first time, derived a formulation relating the characteristics of FRE and FLE by assuming that FLE had an identical and isotropic zero-mean Gaussian distribution. In this article, we generalize the Balachendran and Fitzpatrick algorithm to estimate the properties of FRE in terms of FLE when FLE has an arbitrary zero-mean distribution. We present novel solutions correlating FRE to FLE when FLE has an identical and isotropic, inhomogeneous and isotropic, identical and anisotropic, and inhomogeneous and anisotropic zeromean Gaussian distribution, respectively.0278-0062/$26.00 © 2009 IEEE1792IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 28, NO. 11, NOVEMBER 2009The rest of this article is organized as follows. Section II briefly describes how the ML algorithm can be used to estimate the registration parameters and their variances. In Section III, the ML algorithm is used to calculate the distribution of FRE in terms of FLE. The numerical simulations supporting the derived formulations are presented in Section IV. Finally, the discussion and conclusions are presented in Sections V and VI, respectively. II. ESTIMATING THE REGISTRATION PARAMETERS BASED ON THE ML ALGORITHM In the remainder of this paper, please note that we use nonbold fonts for scalars, bold lowercase fonts for vectors that are assumed to always be column vectors, and capital bold fonts for matrices. Also, we indicate the number of points in each and dimension of the space containing the points space by by —which is assumed to be 3. While three-dimensional data sets are the subject of this paper, the proposed formulation is . The points in the preoperative data set are still valid for matrix whose columns correspond to the shown by as a position vectors of points in the preoperative coordinate frame. matrix, represents the corresponding points in As a , and , the intraoperative data set. The th columns of represent a pair of corresponding points in the preoperative and intraoperative spaces that are related to each other by the following equation [3], [11]: (1) where is the rotation matrix in terms of . is the vector , and , the rotation angles about , consisting of is the translation and axes1, respectively. , and . models the FLE in both the prevector along operative and intraoperative data sets, i.e., and , which can have a different distribution from a point to another. Also, for simplicity, we assumed that the precise rotation and translation is aligned parameters are very small or zero, and data set along its principal axes and its centroid is set at the origin. Using (1), one can use the ML algorithm to accurately estimate the registration parameters and their variances when FLE has an arbitrary distribution. The registration parameters can be calculated by concurrently solving the following equations [9], [12]: (2) (3) where is the likelihood function generated from the distribution of FLE at the fiducial markers. To estimate the variance of the transformation parameters, one needs to derive the second derivative of the log-likelihood function as follows: (4)1Please note that for consistency with [3] and [11], we changed the notation of ; and defined in our previous paper [9], to R ; R , and R , respectively.(5) (6) Equations (4)–(6) can then be used in the Carmer-Rao bound [13], to obtain the covariance matrix of the registration parameters (7) Since rotation matrix is a nonlinear function in terms of , no closed-form solution has yet been proposed to derive the registration parameters from (2) and (3) in the presence of FLE with an arbitrary distribution. Therefore, (2) and (3) should be numerically calculated to obtain the registration parameters. The results can then be used in (4), (5), and (6) to find the covariance matrices of the registration parameters [14]. On the other hand, if it is assumed that the first-order assumption introduced by Fitzpatrick and West [3] is valid, i.e.,(8)then can be linearized with the accuracy of the first-order Taylor series expansion, and closed-form solutions for calcuand can be obtained. For example, as it is shown lating in [9], if FLE at the th fiducial marker is assumed to be independent Gaussian with distribution , the registration parameters can be derived by solving the following six linear equations:(9)(10) Then, (4)–(6) can be calculated as follows, and substituted in (7) to derive the variance of the registration parameters [9]: (11)(12)MOGHARI AND ABOLMAESUMI: DISTRIBUTION OF FIDUCIAL REGISTRATION ERROR IN RIGID-BODY POINT-BASED REGISTRATION1793FRE at the th fiducial marker can be written as(19) (13) Since and are unbiased estimates and FLE has a zero-mean Gaussian distribution, it can simply be shown from (16) that . Using (16), (19) can be simplified as follows:III. ESTIMATING THE DISTRIBUTION OF FRE BASED ON THE ML ALGORITHM As defined in [11], FRE is the distance error, after registration, between a pair of fiducial markers in the preoperative and intraoperative data sets which are used in the registration process. Therefore, FRE at the th fiducial marker can be mathematically defined as follows: (14) where and are the pair of fiducial markers in the preoperative and intraoperative data sets that are used in the registration; and are the registration parameters estimated from (9) and (10), respectively. By using (1), (14) can be written as (15) Substituting (8) in (15) results to (21) where , and can be computed from (11), (12), (13) which are substituted in (7) as follows. Without loss of generality, if the weighted mean of is set to , then we have the origin such that (22) (16) where . From (9) and (10), one can simply show that (17) Substituting (22), (23), and (24) in (21), one can write and (24) (23) (20) Equation (20) can be rewritten as(18) Equations (17) and (18) illustrate that if FLE has a zero-mean Gaussian distribution, the estimation of the registration paramand are zero. eters is unbiased, i.e., Now, to estimate the distribution of FRE, the covariance mashould be calculated. Covariance matrix of trix of FRE at(25)1794IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 28, NO. 11, NOVEMBER 2009Now, we use (17) and (18) to calculate as follows:andWith more simplification of (29), the covariance matrix of FRE at the th fiducial marker, when FLE has an arbitrary zero-mean Gaussian distribution, can be derived by the following closedform formula:(30) (26) where is the dirac function (27) Also which, by using (22) and (24), can be rewritten as (31) Here, it should be emphasized that if , then (21) cannot be simplified as (31). In that case, (21) should be used to calculate the the covariance matrix of FRE at the th fiducial marker. Finally, the distribution of FRE at the th fiducial marker can be computed from either (31) or (21) and the TRE distribution derived in [15]:(32) , and 2 are the square root of the eigenvalues of , and is the modified Bessel function. Since the mean of FRE at the th fiducial marker is zero, the mean squared value of FRE at can, also, be computed from (31) as where (33) where trace is the summation of diagonal elements of a matrix. Our derivations that calculate the distribution of FRE and its mean squared value when FLE has an arbitrary zero-mean Gaussian distribution is now complete. But, (31) can be further simplified for the special cases where FLE has an identical and isotropic, inhomogeneous and isotropic, and identical and anisotropic Gaussian distributions. In what follows, we make these simplifications and show that our derivations converge to the previous formulations in the literature for these special cases. A. FLE Has an Identical and Isotropic Zero-Mean Gaussian Distribution If FLE has an identical and isotropic zero-mean Gaussian distribution in the data sets, it can be modeled as , where is the identity matrix. In this scenario, every fiducial marker . has the same distribution with covariance matrix , Therefore, If and are centered at the origin, then2The order of ; , and is not theoretically important for solving (32). But, in our performed simulations, we assumed that .(28) By substituting (26) and (28) in (25), the covariance matrix of FLE at the th fiducial marker can be calculated as(29)where to obtain (29), we should note that metric matrix, andis a skew sym.MOGHARI AND ABOLMAESUMI: DISTRIBUTION OF FIDUCIAL REGISTRATION ERROR IN RIGID-BODY POINT-BASED REGISTRATION1795and (30) can be used to calculateas follows:which exactly matches the one derived by Balachandran and Fitzpatrick [11]. A more interesting result can be found from (38) by computing the average of the mean squared value of FRE over all the fiducial markers as (34)As it is shown in [9], if thenis aligned along its principal axes,(35)where, and . Therefore, by using (35) in (34), one can write(39) which exactly matches the one derived by Sibson [1], when , the dimension of points in data sets, is 3. Finally, the distribution of FRE at the th fiducial marker can be obtained by using the eigenvalues of (37), i.e., (36) (40) where (36) is fully written in (37). From (37), the mean squared value of FRE at the th fiducial marker can be computed as (41) (42) in (32). B. FLE Has an Inhomogeneous and Isotropic Zero-Mean Gaussian Distribution (38) If FLE has an inhomogeneous and isotropic zero-mean is . Here, Gaussian distribution in the data sets, then(37)1796IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 28, NO. 11, NOVEMBER 2009every fiducial marker is perturbed by a different zero-mean Gaussian distribution which is isotropic. In this scenario, if the , and (30) weighted center of is set at origin, then can be employed to compute asAlso, the average of the computed mean squared value of FRE over all the fiducial markers is(43)is chosen As shown in [9], if the orientation of such that , then By using (44) in (43), (43) can be further simplified as (45). From (45), the mean squared value of FRE at the th fiducial marker can be computed as follows:(47) Equation (47) is the extension of Sibson [1] and Fitzpatrick et al. [2] derivations for the case where FLE has an inhomogeneous and isotropic zero-mean Gaussian distribution. The distribution of FRE at the th fiducial marker can be obtained by using the eigenvalues of (45) in (32). C. FLE Has an Identical and Anisotropic Zero-Mean Gaussian Distribution If FLE has an identical and anisotropic zero-mean Gaussian , where distribution, its distribution can be modeled as(46) Equation (46) is the extension of Balachandran and Fitzpatrick formulation [11] that calculates the mean squared value of FRE when FLE has an inhomogeneous and identical zero-mean Gaussian distribution.(48) In this case, every fiducial marker has the same distribution which has distinct property along different directions, i.e., is is centered at the origin, then becomes anisotropic. If(44)MOGHARI AND ABOLMAESUMI: DISTRIBUTION OF FIDUCIAL REGISTRATION ERROR IN RIGID-BODY POINT-BASED REGISTRATION1797zero, and therefore, (30) can be used to estimateasFinally, the distribution of FRE at the th fiducial marker can be obtained by using the eigenvalues of (51) in (32). D. FLE Has an Inhomogeneous and Anisotropic Zero-Mean Gaussian Distribution(49) As shown in the Appendix, if axes, we have is aligned along its principal(50) where and are defined in the Appendix. By using (50) in (49), (49) can be fully written in (51). From (51), the mean squared value of FRE at the th fiducial marker can be computed as follows:When FLE has an inhomogeneous and anisotropic zero-mean Gaussian distribution, it is difficult to derive a closed-form solution for the mean squared value of FRE; however, such solution can be numerically calculated: One can use (21) to calculate the covariance matrix of FRE at the desired fiducial marker. The eigenvalues of the resultant covariance matrix can then be utilized in (32) to derive the distribution of FRE. In addition, the summation of those eigenvalues gives the mean squared value of FRE. In a very general scenario, when FLE has an arbitrary distribution, which may not necessarily be Gaussian, one needs to first calculate the variance of the registration parameters from (7), and use the result in (19) to compute the covariance of the FRE at the desired fiducial location. Then, the estimated covariance of FRE can be used in (32), and (33), as before, to calculate the distribution of FRE and its means squared value. IV. NUMERICAL SIMULATIONS To support our derivations, we use four different simulations to numerically and analytically calculate the characteristics of FRE when FLE has different zero-mean Gaussian distributions. First, we run the simulations on the randomly generated fiducial markers which are perturbed by identical and isotropic zero-mean Gaussian distribution, to derive the distribution of FRE and its mean squared value. We then repeat this experiment in the second simulation, where the generated fiducial markers are perturbed by an inhomogeneous and isotropic zero-mean Gaussian distribution. Finally, in the third and fourth simulations, the FRE characteristics are estimated when FLE perturbing the fiducial markers, has an identical and anisotropic, and inhomogeneous and anisotropic zero-mean Gaussian distribution, respectively. 1) FLE Has an Identical and Isotropic Zero-Mean Gaussian Distribution: We choose , the number of fiducial markers, to be 3. For this value of , we randomly generate the preoperative data set within a cube with sides of mm. The randomly are listed in generated fiducials in the preoperative data set Table I. The same data set is used in the remaining simulations. To generate the intraoperative data set , we perturb each point in by a zero-mean Gaussian random noise with covariance matrix 1 mm . is then registered to by using (9) and (10), and FRE at all the fiducial markers are calculated. One simulation consists of 100 000 repetitions of perturbation, registration and FRE calculation which allows us to numerically(52) Equation (52) is the extension of Balachandran and Fitzpatrick formulation [11] when FLE has an identical and anisotropic zero-mean Gaussian distribution. Also, one can simply calculate the average of the mean squared value of FRE over all the fiducial markers as(53) Equation (53) is the extension of Sibson [1] and Fitzpatrick et al. [2] derivations that estimates the overall mean squared value of FRE for the case where FLE has an identical and anisotropic zero-mean Gaussian distribution.1798IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 28, NO. 11, NOVEMBER 2009TABLE I RANDOMLY GENERATED FIDUCIAL MARKERS IN THE PREOPERATIVE DATA SET THAT ARE USED IN ALL THE SIMULATIONSXFig. 1. Probability density functions of the estimated distribution of the fiducial using numerical simulations and the proposed algorithm, localization error at when FLE has an identical and isotropic zero-mean Gaussian distribution.xestimate the distribution of FRE and its mean squared value at each fiducial marker. Also, we use (32) and (33) to analytically estimate the distribution of FRE and its mean squared value, respectively. Fig. 1 shows the estimated distribution of FRE using the proposed formulations at the first fiducial marker and numerical simulations. Furthermore, Table II displays the estimated average of the mean squared value of FRE over all the fiducial markers and the using numerical simulations mean squared value of FRE at and the proposed algorithm, respectively. 2) FLE Has an Inhomogeneous and Isotropic Zero-Mean Gaussian Distribution: In this simulation, the same data set as in the previous simulation is used. But, to generate the fixed data set , each point in is perturbed by an inhomogeneous and isotropic zero-mean Gaussian noise with covariance . is uniformly drawn between 0.001 of to by using (9) and (10), and 2 mm . Then, we register and calculate FRE at each fiducial marker. As before, this simulation is repeated 100 000 times to numerically estimate the distribution of FRE and its mean squared value. Furthermore, we use (32) and (33) to analytically estimate the distribution of FRE and its mean squared value. Fig. 2 shows the estimated distribution of FRE at using numerical simulations and the proposed algorithm for one simulation run. Table II displays the average of the mean squared values of FRE over all fiducial markers and the mean squared value of FRE at . 3) FLE Has an Identical and Anisotropic Zero-Mean is produced as Gaussian Distribution: In this simulation, before. To generate , each individual point in is perturbed by an identical and anisotropic, zero-mean Gaussian noisewith covariance matrix diag (0.5 mm , 1.5 mm , 1 mm ). As the first simulation, the two data sets are registered by using (9) and (10), and FRE at each fiducial marker is numerically calculated. Equations (32) and (33) are also used to analytically calculate the characteristics of FRE. Fig. 3 shows the estimated using numerical simulations and the distribution of FRE at proposed algorithm. The mean squared value of FRE over all the fiducial markers and the mean squared value of FRE at using the proposed derivations and numerical simulations are also listed in Table II. 4) FLE Has an Inhomogeneous and Anisotropic Zero-Mean Gaussian Distribution: In this simulation, to generate , which is produced as before, is perturbed by each point in inhomogeneous and anisotropic zero-mean Gaussian noise . The matrix is assumed to be with covariance matrix , where and are independently and randomly drawn between 0.001 and 2 mm . is then registered to by using (9) and (10), and FRE at each fiducial marker is numerically calculated. Equations (32) and (33) are utilized to estimate the distribution of FRE and its mean squared value at . Fig. 4 shows the estimated using the proposed algorithm and distribution of FRE at numerical simulations. Furthermore, the computed average of the mean squared value of FRE over all the fiducial markers and are reported in Table II. the mean squared value of FRE at 5) Effect of the Number of Fiducial Markers and Fiducial Configurations: In the last simulation, we repeated Simulations 1, 2, 3, and 4 for 50 trials (each trial containing 100 000 simulations for a given fiducial configuration) with random configuration of fiducial markers where , the number of fiducial markers, is 3, 5, and 10, respectively, and the fiducials are drawn mm. Data sets and , and within a cube with sides of the noise modeling FLE are generated as explained in the above simulations, and the mean squared value of FRE is numerically and analytically calculated. Table III displays the mean and variance of error difference between numerically and analytically computed overall mean squared values of FRE; and the mean and variance of error difference between the numerically and analytically computed mean squared values of FRE at . The mean of the normalized estimation error of FRE for the first and all the fiducials is also reported in Table III. Finally, we repeated the previous simulations with these modimm to fications: The size of data sets was increased from mm; and the mean of overall variance of the noise, rather than 1 mm , was assumed to be 3 mm . Table IV displays the results. V. SUMMARY AND DISCUSSION We derived the first closed-form solutions for estimating the distribution of FRE and its mean squared value when FLE has an arbitrary zero-mean Gaussian distribution. We analytically illustrated that when FLE has an identical and isotropic zeromean Gaussian distribution, the proposed derivations match the ones derived by Sibson [1] and Balachandran and Fitzpatrick [11]. In addition, we used numerical simulations to verify our derivations. As shown in Fig. 1 and Table II, the proposed formulations’ results accurately match the ones numerically calculated.MOGHARI AND ABOLMAESUMI: DISTRIBUTION OF FIDUCIAL REGISTRATION ERROR IN RIGID-BODY POINT-BASED REGISTRATION1799TABLE II ESTIMATED AVERAGE (MM) OF THE ROOT MEAN SQUARED VALUES OF FRE OVER ALL FIDUCIAL MARKERS AND THE ROOT MEAN SQUARED VALUES OF FRE AT THE FIRST FIDUCIAL MARKER USING THE NUMERICAL SIMULATIONS AND THE PROPOSED ALGORITHM, WHEN FLE HAS A NONIDENTICAL ZERO-MEAN GAUSSIAN DISTRIBUTION. PLEASE NOTE THAT THE RESULTS FOR NUMERICAL SIMULATIONS AND PROPOSED DERIVATIONS ARE LABELED WITH AND , RESPECTIVELY00Fig. 2. Probability density functions of the estimated distribution of the fiducial localization error at x using numerical simulations and the proposed algorithm, when FLE has an inhomogeneous and isotropic zero-mean Gaussian distribution.Fig. 4. Probability density functions of the estimated distribution of the fiducial localization error at x using numerical simulations and the proposed algorithm, when FLE has an inhomogeneous and anisotropic zero-mean Gaussian distribution.distribution. In this scenario, as Fig. 4 and Table II display, the proposed formulations’ results accurately match the distribution of FRE and its mean squared value which are numerically calculated. Finally, Tables III and IV demonstrate the accuracy of the derived formulations for different number of fiducials, different fiducial configurations, and different variances of the noise. VI. CONCLUSION We have presented a general solution that estimates the distribution of FRE for the cases where FLE has an arbitrary zeromean Gaussian distribution. With (7), (30), (32), and (33), we have derived approximate expressions for the distribution of FRE and its mean squared value in the presence of different types of FLE distribution at each fiducial marker. That our proposed algorithm closely agrees with the performed numerical simulations demonstrates its high level of accuracy in estimating the distribution of FRE and its mean squared value in the presence of identical or inhomogeneous and isotropic or anisotropic zero-mean Gaussian noise. We should emphasize that the proposed derivations remain accurate as long as the assumptions under which they were derived are valid. First, all noise modeling FLE at each point of the data sets should have an independent zero-mean Gaussian distribution. If the distribution of the noise should transpire to be nonGaussian, then that distribution, instead of Gaussian distribution, should be considered in the likelihood function . Second, the proposed algorithm is optimum as long as the registered dataFig. 3. Probability density functions of the estimated distribution of the fiducial localization error at x using numerical simulations and the proposed algorithm, when FLE has an identical and anisotropic zero-mean Gaussian distribution.We then extended Sibson [1] and Balachandran and Fitzpatrick [11] derivations for the cases where FLE has either inhomogeneous and isotropic or identical and anisotropic zero-mean Gaussian noise distribution. As it is demonstrated in Figs. 2 and 3 and Table II, the proposed derivations as accurately as numerical simulations estimate the distribution of FRE and its mean squared value, when FLE has inhomogeneous and isotropic, or identical and anisotropic zero-mean Gaussian distribution. Finally, we examined the proposed derivations for the case where FLE has inhomogeneous and anisotropic zero-mean Gaussian。
mean square error
mean square error
mean square error的意思是:均方误差;中误差;均方差。
mean square error例句:
1、The method focuses on minimizing the ensemble mean square error of the estimation。
该访法可以使估计的总体均方误差最小。
2、Kalman filtering is an optimum filtering employing the norm of minimum mean square error。
卡尔曼滤波采用最小均方误差准则,是-种最优滤波。
3、And both the mean and the mean square error of the coordinate series are testified this method。
通过算例分析,无论从坐标序列的均值还是中误差都验证了本方法的有效性。
4、The chip level linear minimum mean square error (LMMSE)equalizer in CDMA systems is introduced。
介绍了码分多址CD MA系统中码片级线性最小均方误差均衡器。
5、Usually, excess mean square error (MSE) and convergence speed are two criterions used to evaluate performance。
剩余均方误差(MSE)和收敛速度是衡量均衡算法性能的两个常用标准。
DISTA, Universita del Piemonte Orientale,
The Scale Factor:A New Degree of Freedom in Phase Type ApproximationAndrea BobbioDISTA,Universit`a del Piemonte Orientale, Alessandria,Italy,bobbio@unipmn.itAndr´a s Horv´a th,Mikl´o s TelekDept.of Telecommunications, Budapest University of Technology and Economics, Hungary,horvath,telek@webspn.hit.bme.huAbstractThis paper introduces a unified approach to phase-type approximation in which the discrete and the continuous phase-type models form a common model set.The models of this common set are assigned with a non-negative real parameter,the scale factor.The case when the scale factor is strictly positive results in Discrete phase-type distribu-tions and the scale factor represents the time elapsed in one step.If the scale factor is0,the resulting class is the class of Continuous phase-type distributions.Applying the above view,it is shown that there is no qualitative difference be-tween the discrete and the continuous phase-type models.Based on this unified view of phase-type models one can choose the best phase-type approximation of a stochastic model by optimizing the scale factor.Keywords:Discrete and Continuous Phase type distri-butions,Phase type expansion,approximate analysis.1IntroductionThis paper presents new comparative results on the use of Discrete Phase Type(DPH)distributions[11]and of Continuous Phase Type(CPH)distributions[12]in applied stochastic modeling.DPH distributions of order are defined as the time to absorption in a Discrete-State Discrete-Time Markov Chain (DTMC)with transient states and one absorbing state. CPH distributions of order are defined,similarly,as the distribution of the time to absorption in a Discrete-State Continuous-Time Markov Chain(CTMC)with transient states and one absorbing state.The above definition im-plies that the properties of a DPH distribution are computed over the set of the natural numbers while the properties of a CPH distribution are defined as a function of a continuous time variable.When DPH distributions are used to model timed activities,the set of the natural numbers must be re-lated to a time measure.Hence,a new parameter need to be introduced that represents the time span associated to each step.This new parameter is the scale factor of the DPH dis-tribution,and can be viewed as a new degree of freedom, since its choice largely impacts the shape and properties of a DPH distribution over the continuous time axes.When DPH distributions are used to approximate a given continu-ous distribution,the scale factor affects the goodness of the fit.The paper starts discussing to what extent DPH or CPH distributions can be utilized tofit a given continuous distri-bution.It is shown that a DPH distribution of any order con-verges to a CPH distribution of the same order as the scale factor goes to zero.Even so,the DPH class contains dis-tributions whose behavior differs substantially from the one of the corresponding distributions in the CPH class.Two main peculiar points differentiate the DPH class from the CPH class.Thefirst point concerns the coefficient of varia-tion:indeed,while in the continuous case the minimum co-efficient of variation is a function of the order only and its lower bound is given by the well known theorem of Aldous and Shepp[1],in the discrete case the minimum coefficient of variation is proved to depend both on the order and on the mean(and hence on the scale factor)[13].Furthermore, it is easy to see that for any order,there exist members of the DPH class that represent a deterministic value with a coefficient of variation equal to zero.Hence,for any order (greater than1),the coefficient of variation of the DPH class spans from zero to infinity.The second peculiar point that differentiate the DPH class is the support of the distributions.While a CPH dis-tribution(of any order)has always an infinite support,there exist members of the DPH class of any order that have a finite support(between a minimum non-negative value and a maximum)or have a mass equal to one concentrated in a single value(deterministic distribution).It turns out that the possibility oftuning the scale factor to optimize the goodness of the fit,having distributions with coefficient of variation span-ning from0to infinity,representing deterministic values exactly,coping withfinite support distributions,makes the DPH class a very interesting and challenging class of distributions to be explored in applied stochastic models.The purpose of this paper is to show how these fa-vorable properties can be exploited in practice,and to pro-vide guidelines to the modeler to a reasonably good choice of the distributions to be used.Indeed,since a DPH dis-tribution tends to a CPH distribution as the scale factor ap-proaches zero,considering the scale factor as a new decision variable in afitting experiment,andfinding the value of the optimal scale factor(with respect to some error measure) provides a valuable tool to decide whether to use a discrete or a continuous approximation to the given problem.Thefitting problem for the CPH class has been exten-sively studied and reported in the literature by resorting to a variety of structures and numerical techniques(see[10]for a survey).Conversely,thefitting problem for the DPH class has received very little attention[4].In recent years,a considerable effort has been devoted to define models with generally distributed timings and to merge in the same model random variables and determin-istic duration.Analytical solutions are possible in special cases,and the approximation of the original problems by means of CPH distributions is a rather well known tech-nique[7].This paper is aimed at emphasizing that DPH approximation may provide a more convenient alternative with respect to CPH approximation,and also to provide a way to quantitatively support this choice.Furthermore,the use of DPH approximation can be extended from stochas-tic models to functional analysis where time intervals with nondeterministic choice are considered[3].Finally,dis-cretization techniques for continuous problems[8]can be restated in terms of DPH approximations.The rest of the paper is organized as follows.After defin-ing the notation to be used in the paper in Section2,Section 3discusses the peculiar properties of the DPH class with re-spect to the CPH class.Some guidelines for bounding the parameters of interest and extensive numerical experiments to show how the goodness of thefit is influenced by the op-timal choice of the scale factor are reported in Section4. Section5discusses the quality of the approximation when passing from the analysis of a single distribution to the anal-ysis of performance measures in complete non-Markovian stochastic models.The paper is concluded in Section6.2Definition and NotationA DPH distribution[11,12]is the distribution of the time to absorption in a DTMC with transient states,and one absorbing state numbered.The one-step transition probability matrix of the corresponding DTMC can be par-titioned as:(1)where is the matrix collecting the transi-tion probabilities among the transient states,is the column vector of length grouping the probabilities from any state to the absorbing one,and is the zero vector.The initial probability vectoris of length,with.In the present paper,we consider only the class of DPH distribu-tions for which,but the extension to the case when is straightforward.The tuple is called the representation of the DPH distribution,and the order.Similarly,a CPH distribution[12]is the distribution of the time to absorption in a CTMC with transient states, and one absorbing state numbered.The infinites-imal generator of the CTMC can be partitioned in the following way:(2) where,is a matrix that describes the tran-sient behavior of the CTMC and is the column vector grouping the transition rates to the absorbing state.Letbe the initial probability(row) vector with.The tuple is called the representation of the CPH distribution,and the order.It has been shown in[4]for the discrete case and in[6] for the continuous case that the representations in(1)and (2),because of their too many free parameters,do not pro-vide a convenient form for running afitting algorithm.In-stead,resorting to acyclic phase-type distributions,the num-ber of free parameters is reduced significantly since both in the discrete and the continuous case a canonical form can be used.The canonical form and its constraints for the discrete case[4]is depicted in Figure1.Figure2gives the canonical form and associated constraints for the continuous case.In both cases the canonical form corresponds to a mixture of Hypo-exponential distributions.Afitting algorithm that provides acyclic CPH,acyclic DPH distributions has been provided in[2]and[4],respec-tively.Experiments suggests(an exhaustive comparison of fitting algorithms can be found in[10])that,from the point of view of applications,the Acyclic phase-type class is as flexible as the whole phase-type class.3Comparing properties of CPH and DPH distributionsCTMC are defined as a function of a continuous time variable,while DTMC are defined over the set of the nat-ural numbers.In order to relate the number of jumps in a DTMC with a time measure,a time span must be assigned to each step.Let be(in some arbitrary units)the scaleFigure2.Canonical representation of acyclicCPH distributions and its constraintsfactor,i.e.the time span assigned to each step.The valueof establishes an equivalence between the sentence”prob-ability at the-th step”and”probability at time”,and hence,defines the time scale on which the properties of theDTMC are measured.The consideration of the scale factor introduces a new parameter,and consequently a new de-gree of freedom,in the DPH class with respect to the CPHclass.In the following,we discuss how this new degree of freedom impacts the properties of the DPH class and how it can be exploited in practice.Let be an”unscaled”DPH distributed random variable (r.v.)of order with representation,defined over the set of the non-negative natural numbers.Let us consider a scale factor;the scaled r.v.is defined over the dis-crete set of time points,being a non-negative natural number.For the unscaled and the scaled DPH r.v.the following equations hold.(3)where is the column vector of ones,and is the -th moment calculated from the factorial moments of:.It is evident from(3)that the mean of the scaled r.v.is times the mean of the unscaled r.v..While is an invariant of the representation,is a free parame-ter;adjusting,the scaled r.v.can assume any mean value .On the other hand,one can easily infer from(3) that the coefficients of variation of and are equal.A consequence of the above properties is that one can easily provide a scaled DPH of order with arbitrary mean and arbitrary coefficient of variation with an appropriate scale factor.Or more formally:the unscaled DPH r.v.of any order can exhibit a coefficient of variation between .For the coefficient of variation ranges between.As mentioned earlier,an important property of the DPH class with respect to the CPH class is the possibility of exactly representing a deterministic delay.A determinis-tic distribution with value can be realized by means of a scaled DPH distribution with phases with scale factor if is integer.In this case,the structure of the DPH distribution is such that phase is connected with probabil-ity1only to phase(),and with an initial probability concentrated in state1.If is not inte-ger for the given,the deterministic behavior can only be approximated.3.1First order discrete approximation of CTMCsGiven a CTMC with infinitesimal generator,the tran-sition probability matrix over an interval of length can be written as:hence thefirst order approximation of is matrix.is a proper stochastic matrix if,where.is the exact transition probability matrix of the CTMC assumed that at most one transition occurs in the interval of length.We can approximate the behavior of the CTMC at timeusing the DTMC with transition probability matrix.The approximate transition prob-ability matrix at time is:.Since matrices and commute we can obtain the matrix version of the same expression as followsAn obvious consequence of Theorem1for PH distribu-tions is given in the following corollary.Corollary1Given a scaled DPH distribution of order, representation and scale factor,the limiting behavior as is the CPH distribution of order with representation.3.2The minimum coefficient of variationIt is known that one of the main limitation in approx-imating a given distribution by a PH one is the attainable minimal coefficient of variation,.In order to discuss this point,we recall two theorems that state the for the class of CPH and DPH distributions.Theorem2(Aldous and Shepp[1])The of a CPH distributed r.v.of order is and is attained by the Erlang()distribution independent of its mean or of its parameter.The corresponding theorem for the unscaled DPH class has been proved in[13].In the following,denotes the integer part and denotes the fractional part of. Theorem3The of an unscaled DPH r.v.of order and mean is:In this particular case,when the structure of the bestfit-ting scaled DPH and CPH distributions are known,we can show that the distribution of the bestfitting scaled DPH dis-tribution converges to the distribution of the bestfitting CPH distribution when.Unfortunately,the same conver-gence property cannot be proved in general,since the struc-tural properties of the bestfitting PH distributions are not known and they depend on the chosen(arbitrary)optimiza-tion criterion.Instead,in Section4we provide an extensive experimental study on the behavior of the bestfitting scaled DPH and CPH distributions as a function of the scale factor .3.4DPH distributions withfinite supportAnother peculiar characteristic of the DPH class is to contain distributions withfinite support.A DPH distribu-tion hasfinite support if its structure does not contain cycles and self-loops(any cycle or self loop implies an infinite sup-port).Let be thefinite support of a given distribution,with and(when thefinite support distri-bution reduces to a deterministic distribution with mass1at ).If and are both integers,it is possible to construct a scaled DPH of order for which the probabil-ity mass function has non-zero elements only for the values.As an example,the discrete uniform distribution between and is reported in Figure 5,for scale factor.0.10.20.30.40.50.60.70.80.910.40.60.811.2 1.41.61.82c d fxOriginalScale factor: 0.01Scale factor: 0.06Scale factor: 0.1CPH 00.511.522.50.40.60.81 1.2 1.4 1.6 1.82p d fxOriginalScale factor: 0.01Scale factor: 0.06Scale factor: 0.1CPH Figure 6.Approximating the L3distribution with -phase PH approximationsWhen is less than its lower bound the required can-not be attained;when becomes too large the wide separa-tion of the discrete steps increases the approximation error;when is in the proper range (e.g.)a reasonably good fit is achieved.This example also suggests that an optimal value of exists that minimizes the chosen distance measure in (6).In order to display the goodness of fit for the L3distribu-tion,Figure 7shows the distance measure as a function of for various values of the order .A minimum value of is attained in the range where the parameters fit the bounds of Table 1.Notice also that,as increases,the advantage of having more phases disappears,according to Theorem 3.The circles in the left part of this figure (as well as in all the successive figures)indicate the corresponding distance measure obtained from CPH fitting.The figure (and the subsequent ones as well)suggests that the distance measure obtained from DPH fitting converges to the distance mea-sure obtained by the CPH approximation as tends to .upper bound of equation (7)0.20920.07920.04250.021700.010.020.030.040.050.060.070.080.090.100.050.10.150.20.250.3d i s t a n ce m e a s u r escale factor2 phases 4 phases 6 phases 8 phases 10 phases 12 phasesFigure 9.Distance measure as the function of the scale factor for Uniform(1,2)(U2)00.0020.0040.0060.0080.010.0120.0140.01600.050.10.150.20.250.3d i s t a n ce m e a s u r escale factor2 phases 4 phases 6 phases 8 phases 10 phases 12 phasesFigure 10.Distance measure as the function of the scale factor for Uniform(0,1)(U1)be stressed that the chosen distance measure in (6)can be considered as not completely appropriate in the case of finite support,since it does not force the approximating PH to have its mass confined in the finite support and 0outside.Let be a uniform r.v.over the interval ,withand (this is the distributionU2taken from the benchmark in [5,4]).Figure 9shows the distance measure as a function of for various orders .It is evident that,for each ,a minimal value of is obtained,that provides the best approximation according to the chosen distance measure.As a second example,let be a uniform r.v.over theinterval,with and (this is the distribution U1taken from the benchmark in [5,4]).Figure 10shows the distance measure as a function of forvarious orders .Since,in this example,an order is large enough for a CPH to attain the coefficient of variation of the distribution.Nevertheless,the optimal in Figure (10),which minimizes the distance mea-sure for high order PH (),ranges between and ,thus leading to the conclusion that a DPH provides a better fit.This example evidences that the coef-ficient of variation is not the only factor which influences the optimal value.The shape of the distribution plays an essential role as well.Our experiments show that a discon-00.20.40.60.810.20.40.60.811.21.4c d fxOriginalScale factor: 0.03Scale factor: 0.1CPH00.20.40.60.811.21.41.600.20.40.60.811.21.4p d fxOriginalScale factor: 0.03Scale factor: 0.1CPHFigure 11.Approximating the Uniform ()distribution (U1)tinuity in the pdf (or in the cdf)is hard to approximate with CPH,hence in the majority of these cases DPH provides a better approximation.Figure 11shows the cdf and the pdf of the U1distribu-tion,compared with the best fit PH approximations of order,and various scale factors .In the case of DPH ap-proximation,the values are calculated as in (9).With respect to the chosen distance measure,the best approxi-mation is obtained for,which corresponds to a DPH distribution with infinite support .Whenthe approximate distribution has a finite support.Hence,the value (for )provides a DPH able to rep-resent the logical property that the random variable is less than .Another fitting criterion may,of course,stress this property.5Approximating non-Markovian modelsSection 4has explored the problem of how to find the best fit among either a DPH or a CPH distribution by tuning the scale factor .When dealing with a stochastic model of a system that incorporates non exponential distributions,a well know solution technique consists in a markovianiza-tion of the underlying non-Markovian process by substi-tuting the non exponential distribution with a best fit PH distribution,and then expanding the state space.A natural question arises also in this case,on how to decide among a discrete (using DPH)or a continuous (using CPH)approx-imation,in order to minimize the error in the performances2s4s3Figure12.The state space of the consideredM/G/1/2/2queuemeasures we are interested in for the overall model.One possible way to handle this problem could consist infinding the best PHfits for any single distribution and to plug them in the model.In the present paper,we only consider the case where the PH distributions are either all discrete(and with the same scale factor)or they are all continuous.Various embedding techniques have been ex-plored in the literature for mixing DPH(with different scale factors)and CPH([8,9]),but these techniques are out of the scope of the paper.In order to quantitatively evaluate the influence of the scale factor on some performance measures defined at the system level,we have considered a preemptive M/G/1/2/2 queue with two classes of customers.We have chosen this example because accurate analytical solutions are available both in transient condition and in steady-state using the methods presented in e.g.[8].The general distribution is taken from the set of distributions(L1,L3,U1,U2)already considered in the previous section.Customers arrive at the queue with rate in both classes.The service time of a higher priority job is exponen-tially distributed with parameter.The service time distribution of the lower priority job is either L1,L3,U1 or U2.Arrival of a higher priority job preempts the lower priority one.The policy associated to the preemption of the lower priority job is preemptive repeat different(prd),i.e. after the departure of the higher priority customer the ser-vice of the low priority customer starts from the beginning with a new service time sample.The system has4states(Figure12):in state s1the server is empty,in state s2a higher priority customer is under ser-vice with no lower priority customer in the system,in state s3a higher priority customer is under service with a lower priority customer waiting,in state s4a lower priority job is under service(in this case there cannot be a higher priority job).Let denote the steady state probability of the M/G/1/2/2queue obtained from an exact analytical solution.In order to evaluate the correctness of the PH approxima-0.020.040.060.080.10.120.140.1600.020.040.060.080.10.120.140.160.180.2sumoferrorsscale factor2 phases4 phases6 phases8 phases10 phases12 phasesFigure13.with scale factor and distri-bution L30.010.020.030.040.050.0600.020.040.060.080.10.120.140.160.180.2sumoferrorsscale factor2 phases4 phases6 phases8 phases10 phases12 phasesFigure14.with scale factor and distri-bution L3tion we have solved the model by substituting the original general distribution(either L1,L3,U1or U2)with approx-imating DPH or CPH distributions.Letdenote the steady state probability of the M/PH/1/2/2queue with the PH approximation.The overall approximation error is measured in terms of the difference between the exact steady state probabilities and the approximate steady state probabilities.Two error measures are defined:andThe evaluated numerical values for and are reported in Figures13and14for the distribution L3.Since the behavior of is very similar to the behavior of in all the cases,for the other distributions we reportonly(Figures15,16,17).Thefigures,which re-fer to the error measure in a performance index of a global stochastic model,show a behavior similar to the one ob-tained for a single distributionfitting.Depending on the coefficient of variation and on the shape of the considered non-exponential distributions an optimal value of is found which minimizes the approximation error.In this example, the optimal value of is close to the one obtained for the single distributionfitting.Based on our experiments,we guess that the observed0.050.10.150.20.2500.020.040.060.080.10.120.140.160.180.2s u m o f e r r o r sscale factor2 phases 4 phases 6 phases 8 phases 10 phases 12 phasesFigure 15.with scale factorand distri-bution L100.020.040.060.080.10.120.140.1600.050.10.150.20.250.3s u m o f e r r o r sscale factor2 phases 4 phases 6 phases 8 phases 10 phases 12 phasesFigure 16.with scale factorand distri-bution U1property is rather general.If the stochastic model under study contains a single non-exponential distribution,then the approximation error in the evaluation of the perfor-mance indices of the global model can be minimized by re-sorting to a PH type approximation (and subsequent DTMC or CTMC expansion)with the optimal of the single distri-bution.The same should be true if the stochastic model under study contains more than one general distribution,whose best PH fit provides the same optimal .In order to investigate the approximation error in the transient behavior,we have considered distribution U2for the service time and we have computed the transient proba-bility of state with two different initial conditions.Figure 18depicts the transient probability of state with initial state .Figure 19depicts the transient probability of the same state,,when the service of a lower priority job starts at time 0(the initial state is ).All approximations are with DPH distributions of order .Only the DPH ap-proximations are depicted because the CPH approximationis very similar to the DPH one with scale factor.In the first case,(Figure 18),the scale factor,which was the optimal one from the point of view of fitting the single distribution in isolation,provides the most accu-rate results for the transient analysis as well.Instead,in the second case,the approximation with a scale factor0.020.040.060.080.10.120.140.160.180.200.050.10.150.20.250.3s u m o f e r r o r sscale factor2 phases 4 phases 6 phases 8 phases 10 phases 12 phasesFigure 17.with scale factorand distri-bution U20.10.20.30.40.50.60.70.80.91012345t r a n s i e n t p r o b a b i l i t ytimeTransient behaviour Scale factor: 0.03Scale factor: 0.1Scale factor: 0.2Figure 18.Approximating transient probabili-tiescaptures better the sharp change in the transient probability.Moreover,this value of is the only one among the values reported in the figure that results in 0probability for time points smaller than 1.In other words,the second example depicts the advantage given by DPH distributions to model durations with finite support.This example suggests also that DPH approximation can be of importance when pre-serving reachability properties is crucial (like in modeling time-critical systems)and,hence,DPH approximation can be seen as a bridge between the world of stochastic model-ing and the world of functional analysis and model checking [3].6Concluding remarksThe main result of this paper has been to show that the DPH and CPH classes of distributions of the same order can be considered a single model set as a function of a scalefactor .The optimal value of ,,determines the best distribution in a fitting experiment.When the bestchoice is a CPH distribution,while whenthe best choice is a DPH distribution.This paper has also shown that the transition from DPH class to CPH class is continu-ous with respect to several properties,like the distance (de-noted by in 6)between the original and the approximate0.050.10.150.20.250.3012345t r a n s i e n t p r o b a b i l i t ytimeTransient behaviour Scale factor: 0.03Scale factor: 0.1Scale factor: 0.2Figure 19.Approximating transient probabili-tiesdistributions.The paper presents limit theorems for special cases;however,extensive numerical experiments show that the limiting behavior is far more general than the special cases considered in the theorems.The numerical examples have also evidenced that for very small values of ,the diagonal elements of the tran-sition probability matrix become very close to ,rendering numerically unstable the DPH fitting procedure.A deep analytical and numerical sensitivity analysis is required to draw more general conclusions for the model level “optimal value”and its dependence on the consid-ered performance measure than the ones presented in this work.It is definitely a field of further research.Finally,we summarize the advantages and the disadvan-tages of applying approximate DPH models (even with op-timal value)with respect to using CPH approximations.Advantages of using DPH:An obvious advantage of the ap-plication of DPH distributions is that one can have a closer approximate of distributions with low coefficient of varia-tion.An other important quantitative property of the DPH class is that it can capture distributions with finite support and deterministic values.This property allows to capture the periodic behavior of a complex stochastic model,while any CPH based approximation of the same model tends to a steady state.Numerical experiments have also shown that DPH can better approximate distributions with some abrupt or sharp changes in the CDF or in the PDF.Disadvantages of using DPH:There is a definite disad-vantage of discrete time approximation of continuous time models.In the case of CPH approximation,coincident events do not have to be considered (they have zero proba-bility of occurrence).Instead,when applying DPH approxi-mation coincident events have to be handled,and their con-sideration may burden significantly the complexity of the analysis.AcknowledgmentsThis work has been performed under the Italian-Hungarian R&D program supported by the Italian Ministry of Foreign Affairs and the Hungarian Ministry of Education.A.Bob-bio was partially supported by the MURST Under Grant ISIDE;M.Telek was partially supported by Hungarian Sci-entific Research Fund (OTKA)under Grant No.T-34972.References[1] D.Aldous and L.Shepp.The least variable phase type dis-tribution is Erlang.Stochastic Models ,3:467–473,1987.[2] A.Bobbio and A.Cumani.ML estimation of the param-eters of a PH distribution in triangular canonical form.In G.Balbo and G.Serazzi,editors,Computer Performance Evaluation ,pages 33–46.Elsevier Science Publishers,1992.[3] A.Bobbio and A.Horv´a th.Petri nets with discrete phasetype timing:A bridge between stochastic and functional analysis.Electronic Notes in Theoretical Computer Science ,52(3),2001.[4] A.Bobbio,A.Horv´a th,M.Scarpa,and M.Telek.Acyclicdiscrete phase type distributions:Properties and a parameter estimation algorithm.Technical Report of Budapest Uni-versity of Technology and Economics,Submitted to Perfor-mance Evaluation,2000.[5] A.Bobbio and M.Telek.A benchmark for PH estima-tion algorithms:results for Acyclic-PH.Stochastic Models ,10:661–677,1994.[6] A.Cumani.On the canonical representation of homoge-neous Markov processes modelling failure-time distribu-tions.Microelectronics and Reliability ,22:583–602,1982.[7] A.Cumani.Esp -A package for the evaluation of stochas-tic Petri nets with phase-type distributed transition times.In Proceedings International Workshop Timed Petri Nets ,pages 144–151,Torino (Italy),1985.[8]R.German.Performance Analysis of Communication Sys-tems:Modeling with Non-Markovian Stochastic Petri Nets .John Wiley and Sons,2000.[9]R.Jones and G.Ciardo.On phased delay stochastic Petrinets:Definition and application.In Proceedings 9th Inter-national Workshop on Petri Nets and Performance Models -PNPM01.IEEE Computer Society,2001.[10] ng and J.L.Arthur.Parameter approximation forphase-type distributions.In Matrix-analytic methods in stochastic models ,Lecture notes in pure and applied mathe-matics,pages 151–206.Marcel Dekker,Inc.,1996.[11]M.Neuts.Probability distributions of phase type.In LiberAmicorum Prof.Emeritus H.Florin ,pages 173–206.Uni-versity of Louvain,1975.[12]M.Neuts.Matrix Geometric Solutions in Stochastic Models .Johns Hopkins University Press,Baltimore,1981.[13]M.Telek.Minimal coefficient of variation of discretephase type distributions.In 3rd International Conference on Matrix-Analitic Methods in Stochastic models,MAM3,pages 391–400,Leuven,Belgium,2000.Notable Publica-tions Inc.。
Floating Point Division and Square Root Algorithms and Implementation in the AMD-K7 TM Micr
Floating Point Division and Square Root Algorithms and Implementationin the AMD-K7TM MicroprocessorStuart F.ObermanCalifornia Microprocessor DivisionAdvanced Micro DevicesSunnyvale,CA94088stuart.oberman@AbstractThis paper presents the AMD-K7IEEE754and x87 compliantfloating point division and square root algo-rithms and implementation.The AMD-K7processor em-ploys an iterative implementation of a series expansion to converge quadratically to the quotient and square root. Highly accurate initial approximations and a high perfor-mance sharedfloating point multiplier assist in achieving low division and square root latencies at high operating frequencies.A novel time-sharing technique allows inde-pendentfloating point multiplication operations to proceed while division or square root computation is in progress. Exact IEEE754rounding for all rounding modes and target precisions has been verified by conventional directed and random testing procedures,along with the formulation of a mechanically-checked formal proof using the ACL2theorem prover.1IntroductionThe AMD-K7is an out-of-order,three-way superscalar x86microprocessor with a15-stage pipeline,organized to allow500+MHz operation.The processor can fetch,de-code,and retire up to three x86instructions per cycle,with independent integer andfloating-point schedulers.Up to three operations per clock are dispatched from the36-entry floating-point scheduler to threefloating-point execution pipelines.The AMD-K7floating point unit[1]is an x87compati-ble FPU.This extension defined a stack architecture,along with IEEE754[2]compliance for addition,multiplication, division,and square root operations.All four rounding modes of the standard are supported,along with three work-ing precisions:single,double,and extended.All of the fundamental computations are calculated with an extended precision significand and exponent.Only at the conclusionLatency/Throughput(cycles) Operation Single Double Extended/InternalDivision16/1320/1724/21Square Root19/1627/2435/32 Table1.AMD-K7division and square root per-formanceof an operation is the significand conditionally rounded to a lower precision.The AMD-K7processor also supports a wider internal precision format,with an18bit exponent and a68bit significand.Exactly rounded-to-nearest division is supported at this precision as a means for computing highly accurate transcendental functions.In order to achieve high overallfloating point perfor-mance with high operating frequencies,we determined it was necessary to minimize the latencies of the fundamen-tal operations.Reducing the number of logic levels per pipeline stage to increase operating frequency can easily cause functional unit latencies to increase by a proportional amount,with little net gain in overall performance.To real-ize a true performance gain,improvements in the algorithms for the operations are required to reduce the latency of the fundamental operations.This paper discusses the division and square root imple-mentation in the AMD-K7FPU.These algorithms reduce the latency for these operations in a high clock-rate imple-mentation,yielding the performance shown in Table1.Al-lowing out-of-order execution of operations provides some amount of latency tolerance,since the machine can be kept busy with independent work whenever it is present.How-ever,we determined that to achieve high system perfor-mance,the latency of all of the fundamental operations must be minimized.Many software applications have longdependency chains in their computation,and thus perfor-mance becomes heavily dependent upon the latency of the functional units themselves.While division and square root are typically infrequent operations,it has been shown that ignoring their implementations can result in significant sys-tem performance degradation for many applications[3].We confirmed this result in our target workloads,and we there-fore designed high performance division and square root al-gorithms which could exploit our fast FP multiplier.We discuss the theory of division and square root by functional iteration and present the algorithms used in the AMD-K7FPU in Section2.In Section3,we examine the design of the shared FP multiplier and demonstrate how the division and square root algorithms use this hardware.We analyze the various design issues that arose throughout the project in Section4.We discuss our verification strategies in Section5.We present optimizations to the implementa-tion that improve overall FP performance in Section6.We drawfinal conclusions in Section7.2Functional Iteration2.1MotivationThe most common algorithm used for division and square root in modernfloating point units is digit recur-rence.One implementation of digit recurrence is SRT, which has been used in many microprocessors.SRT di-vision and square root use subtraction as the fundamental operator to retire afixed number of quotient bits in every iteration.Unfortunately,it is this linear convergence to the result that makes this algorithm class undesirable for our ap-plication.For rapid computation of wide-precision results, we desire better than linear convergence.Unlike digit recurrence,division and square root by functional iteration utilize multiplication as the fundamen-tal operation.Multiplicative algorithms are able to take ad-vantage of high-speed multipliers to converge to a result quadratically.Rather than retiring afixed number of quo-tients and square root bits in every cycle,multiplication-based algorithms are able to at least double the number of correct result bits in every iteration.This class of algo-rithm has been widely discussed in the literature,including Flynn[4]and Goldschmidt[5].There are two main forms of multiplicative iterations. The Newton-Raphson division and square root iteration uses a dependent chain of multiplications to converge to the reciprocal/reciprocal square root.This algorithm is self-correcting,in that any error in computing the approxima-tion in a given iteration can be corrected in the subsequent iteration,since all operations are dependent.The series ex-pansion iteration,often called Goldschmidt’s algorithm,re-orders the operations in the Newton-Raphson iteration to increase the parallelism and reduce the latency.However, in this implementation the result is computed as the product of independent terms,and the error in one of them is not corrected.To account for this error,the iteration multipli-cations require extra bits of precision to guard against the accumulation of uncorrected rounding error.The K7implements a variant of Goldschmidt’s al-gorithm to compute division and square root.Unlike the previously-mentioned commercial implementations of functional iteration,the K7FPU has additional features to support the constraints imposed by IEEE754and x87com-patibility,as well as extra K7functionality.These include 1)Exactly-rounded IEEE754extended-precision results, where the result may be also down-rounded to single or dou-ble precision,2)Exactly rounded-to-nearest internal preci-sion division,with a68bit significand,and3)Producing the correct C1bit to indicate whether roundup occurred. 2.2AMD-K7Division and Square Root Algo-rithmsA division or square root operation in the K7can be de-fined by the functionDIV-SQRT(op,pc,rc,a,b,z),with inputs as follows:(a)op∈{OP-DIV,OP-SQRT};(b)pc is an external precision control specifier;(c)rc is a rounding control specifier;(d)a and b arefloating point input operands.In the case op=OP-DIV,the output z represents an appro-priately rounded approximation of the quotient a/b;when op=OP-SQRT,b is ignored and an approximation of√a is returned.Afloating point representation for an arbitrary num-ber x comprises three bit vectors,corresponding to the sign,significand,and exponent of x.A format is de-fined by the number of bits allocated to the significand, sig(x),and the exponent,expo(x),expressed together as (sig(x),expo(x)).The formats that are supported by the K7floating point division and square root operations in-clude(24,8),(53,11),and(64,15),which correspond to single,double,and extended precision as specified by IEEE 754.A wider internal precision,(68,18),is also supported for division.A combination of op and pc determines the target precision for the operation.The rounding modes supported in the K7are RN,RZ, RM,and RP,which correspond to the rounding modes round-to-nearest-even,truncation,round-to-minus-infinity, and round-to-positive-infinity,respectively.A combinationProgram Division:BEGIN DIVISION:Input=(a,b,pc,rc)Output=(q f)x0=recip estimate(b)d0=ITERMUL(x0,b),r0=comp1(d0)n0=ITERMUL(x0,a)if(PC==SINGLE)n f=n0,r f=r0goto END DIVISIONd1=ITERMUL(d0,r0),r1=comp1(d1)n1=ITERMUL(n0,r0)if(PC==DOUBLE)n f=n1,r f=r1goto END DIVISIONd2=ITERMUL(d1,r1),r2=comp1(d2)n2=ITERMUL(n1,r1)n f=n2,r f=r2END DIVISION:q i=LASTMUL(n f,r f,pc)rem=BACKMUL(q i,b,a),q f=round(q i,rem,pc,rc)Figure1.Program Divisionof op and rc determines the rounding mode for the opera-tion.The K7algorithm for division and square root is repre-sented by the programs Division and Square Root,shown in Figures1and2.The division and square root programs employ several K7-specific hardware functions which are discussed in detail in the next section.3Hardware OrganizationThe division and square root programs are implemented in hardware within a single execution pipeline.A state ma-chine within the FP multiplier pipeline detects if a valid di-vision or square root opcode is dispatched to the pipeline. Upon receipt of an appropriate opcode,the state machine begins sequencing through the appropriate states to realize the programs of Figures1and2.The state machine controls several muxes within the multiplier pipeline,along with enable signals to conditionally advance several registers. These programs could be implemented in K7microcode in-stead of through a hardware state machine.However,all of the microcode ops would need to go through the decoders and central scheduler,greatly reducing the throughput of the Program Square Root:BEGIN SQRT:Input=(a,pc,rc)Output=(s f)x0=recipsqrt estimate(a)t0=ITERMUL(x0,x0)if(PC==SINGLE)d f=ITERMUL(x0,a)n0=ITERMUL(t0,a),r f=comp3(n0)GOTO END SQRTn0=ITERMUL(t0,a),r0=comp3(n0)d0=ITERMUL(x0,a)t1=ITERMUL(r0,r0)d1=ITERMUL(d0,r0)n1=ITERMUL(n0,t1),r1=comp3(n1)if(PC==DOUBLE)d f=d1,r f=r1GOTO END SQRTt2=ITERMUL(r1,r1)d2=ITERMUL(d1,r1)n2=ITERMUL(n1,t2),r2=comp3(n2)d f=d2,r f=r2END SQRT:s i=LASTMUL(d f,r f,pc)rem=BACKMUL(s i,s i,a),s f=round(s i,rem,pc,rc)Figure2.Program Square Root machine for independent operations.Thus,we chose to sup-port maximumfloating point throughput by implementing a dedicated hardware state machine which only requires a sin-gle opcode to be decoded and scheduled.Figure3shows a block diagram of the FP significand multiplier pipeline in-cluding additional hardware to support division and square root computation.3.1Multiplier OverviewThe multiplier pipeline is designed in static CMOS logic to expedite design-time and circuit verification.The multi-plier has a latency of four cycles,and it is fully-pipelined.It is configured to operate on a maximum of76bit operands. This maximum width is required to support exactly-roundedEX1EX2EX3EX4Figure3.Multiplier pipeline68bit division and this choice is discussed later.The mul-tiplier is also configured to compute all AMD3DNow!TM SIMD FP multiplications[6];the algorithms to support this are not discussed here.In thefirst cycle of the multiplier,EX1,overlapping groups of four bits of the multiplier operand are inspected as per the Booth3multiplier algorithm[7].In parallel,a special78bit adder generates the3x multiple of the mul-tiplicand.The Booth encoders generate26control signals which control the selection of the26Booth muxes to form the appropriately signed multiples of the multiplicand.In the second cycle,EX2,the26partial products are re-duced to two through a binary tree of4-2compressors.The individual4-2compressors are designed in static CMOS with single-rail inputs and outputs.While a static dual-rail compressor with dual-rail inputs and outputs is typi-cally faster than single-rail,it is also larger and requires twice the routing resources.In our implementation,the in-creased routing congestion and larger cells increased the wire-lengths between compressors,increasing overall de-lay compared with the single-rail design.The reduction tree is fundamentally a parallelogram.However,we imple-mented several folding and interleaving techniques which resulted in a rectangular tree and left internal wire lengths constant.Thefirst portion of the multiplier rounding al-gorithm is then applied.This algorithm involves adding a rounding constant to the sum and carry outputs of the tree. Since the normalization of the assimilated result is unknown at this point,the addition is performed both assuming no overflow occurs and assuming overflow occurs.The addi-tions themselves are implemented using an additional level of(3,2)carry-save adders.These(3,2)adders are also used as part of the back-multiply and subtract operation BACK-MUL which forms the remainder required for quotient and square root rounding.The rounding constants used for the different precisions and rounding modes are shown in Ta-ble2.Thefirst four rows correspond to constants for reg-ular multiplication operations,while the last three are for division/square root specific operations.In this table,!x implies the bit inversion of x,and24’b0implies24zero bits.The third cycle,EX3,forms the three carry-assimilated results.Two of these results are rounded results assuming that either overflow or no overflow occurs.The third is the raw unrounded result.In parallel,sticky-bit logic examines the low order bits of the sum and carry vectors,as a function of pc,to determine whether or not S+C=0,using an op-timized technique similar to those previously reported[8].In the fourth cycle,EX4,the bits below the target pre-cision are appropriately cleared.While the rounding con-stant applied in EX2insures that the result is correctly-rounded,it does not guarantee that the bits below the tar-get LSB are cleared,as required for x87compatibility.The LSB is then conditionally inverted for regular multiplication operations under RN as a function of the sticky-bit to cor-rectly handle the nearest-even case.Finally,the MSB of the unrounded result determines whether or not overflow has occurred,and it selects between the no overflow and with overflow rounded results.For division and square root iter-ations,an extra result R i is also provided which is the one’s complement of the regular unrounded multiply result for di-vision and an approximation to3−N2for square root.Both of the results are available for local storage and bypassing for use within the division and square root iterations.In the case of the last cycle of a division or square root BACK-MUL op,the appropriately rounded result is chosen from the previously-computed target results:Q,Q+1,or Q−1.The unrounded result is also used in the case of IEEE tiny results with the underflow exception masked;i.e.tini-ness occurs when the computed rounded result is below the minimum extended precision normal number.In this in-stance the unrounded result is passed to a microcode han-dler which can properly denormalize and round the result as required by x87compatibility.The unrounded result is also used to determine whether roundup has occurred for proper setting of the C1condition code.Roundup occurs when the rounded result differs from the unrounded result, and the C1bit can be set appropriately given this informa-SINGLE DOUBLE EXTENDED INTERNAL RN(24’b0,1’b1,126’b0)(53’b0,1’b1,97’b0)(64’b0,1’b1,86’b0)(68’b0,1’b1,82’b0) RZ151’b0151’b0151’b0-RM(24’b0,(127(Sign)))(53’b0,(98(Sign)))(64’b0,(87(Sign)))-RP(24’b0,(127(!Sign)))(53’b0,(98(!Sign)))(64’b0,(87(!Sign)))-LASTMUL(25’b0,1’b1,125’b0)(54’b0,1’b1,96’b0)(65’b0,1’b1,85’b0)(69’b0,1’b1,81’b0) ITERMUL(76’b0,1’b1,74’b0)BACKMUL(!Dividend[67:0],(83(1’b1)))Table2.Rounding constantstion.For division and square root,the unrounded result is synthesized by appropriately choosing between Q−1and Q.The extra multiplier hardware required to support divi-sion and square root iterations that actually impacted total area includedflip-flops to store intermediate results,an in-crementer,and the state machine.We estimate that division and square root support accounts for about10%of the total area of the multiplier unit.Designing a parallel SRT divider with similar performance would have required substantially more area and circuit design than our implementation re-quired.We therefore conclude that this implementation rep-resents a good balance of performance and area.3.2Special Iteration Operations3.2.1Recip Estimate and RecipSqrt EstimateThe initial approximation x0to the reciprocal of b,in the case op=OP-DIV,is derived from a pair of tables,each containing210entries,which we represent by the func-tions recip-rom-p and recip-rom-q.For op=OP-SQRT, a separate pair of tables,each containing211entries,repre-sented by the functions sqrt-rom-p and sqrt-rom-q,is simi-larly used to derive an initial approximation to the recipro-cal square root of a.The basic theory of compressing reciprocal ROM ta-bles using interpolation is discussed in[9].We imple-mented an optimized and simplified reciprocal and recip-rocal square root estimate interpolation unit which provides the best compression and performance for the requirements of our algorithms.This interpolation unit is also used to form the values for the AMD3DNow!reciprocal and re-ciprocal square root functions PFRCP and PFRSQRT.Each p table entry is16bits,while the q table entries are each7 bits.The total storage required for reciprocal and reciprocal square root initial approximations is69Kbits.To form an estimate,the appropriate set of p and q tables is accessed in parallel.The results are added and the16 bit sum is appended to a leading1to produce the recipro-cal or reciprocal square root estimate.The latency for this process is three cycles.By exhaustive test we determined that the reciprocal estimate is accurate to at least14.94bits, while the reciprocal square root estimate is accurate to at least15.84bits.3.2.2ITERMUL(x,y)This is a multiplication operation of x and y that forces the rounding to be round-to-nearest.It assumes that each of the input operands are76bits wide,and the152bit interme-diate result is rounded to76bits.This wider precision is required to accommodate the uncorrected rounding errors which accumulate throughout the iterations.3.2.3LASTMUL(x,y,pc)This is a multiplication operation of x and y that forces the rounding to be round-to-nearest.It performs the rounding to a precision one bit wider than the target precision specified by pc.For division,just prior to the rounding of this opera-tion,the double-width product Q is required to be accurate to at least−2−(pc+2)<Q−Q <2−(pc+2)(1) where Q is the infinitely precise quotient,and pc is the tar-get precision in bits,where1ulp for such a pc bit number is 2−(pc−1)[10].Exact rounding requires that the result have an error no worse than±0.5ulp,or2−pc.Our rounding algorithm requires the result to be computed to an accuracy of at least1more bit.Further,since thefinal quotient result can be in the range of(0.5,2),1bit of normalization may be required,requiring1more bit of accuracy.For square root, just prior to the rounding of this operation,the double-width product S is required to be accurate to at least−2−(pc+1)<S−S <2−(pc+1)(2) where S is the infinitely precise square root,and pc is the target precision in bits.Since the square root result is in the range[1,2),it has a looser constraint on the accuracy of the input to this operation.After rounding and normalizing to pc+1bits through the LASTMUL op,the resulting value R satisfies−2−pc<R−R <2−pc,(3)where R is either the infinitely precise quotient or square root as appropriate.Thus,the value can have an error of (-0.5,+0.5)ulp with respect to thefinal pc bit number.3.2.4BACKMUL(b,q,a)This op is a multiplication operation that operates on the two source operands b and q,and it also accepts a third operand a.The152bit intermediate product of the two sources in carry-save form is added with an inverted ver-sion of the third operand,with the low-order bitsfilled with 1’s as shown in Table2.These three values are then in-put into the rounding carry-save adders in EX2,with the unused lsb carry bit set to one,realizing the function of b×q+T wosComp(a).This implements the negative ver-sion of the back multiply and subtraction operation to form the remainder,i.e.b×q−ais implemented instead of the desireda−b×q.The sign of this remainder is thus the negative of the true remainder sign.This operation returns two bits of status: whether the sign of the remainder is negative,taken from a high order bit of the result,and whether the remainder is exactly zero,using fast sticky-bit logic.Since q could be rounded to any of four precisions,the high order bit is chosen high enough to allow it to suffice for all of the pre-cisions.3.2.5comp1(x)This operation returns the one’s complement by bit inver-sion of the unrounded version of the currently-computed product.The unbiased exponent is forced to either-1or0 depending upon whether the result should be in the binade [0.5,1)or[1,2).Using the one’s complement instead of the two’s complement in the iterations adds a small amount of error.However,this error is taken into account when de-signing the width of the multiplier and the required accu-racy of the initial approximations.3.2.6comp3(x)This operation returns an approximation to3−x2of the un-rounded version of the currently-computed product.This approximation is formed by bit inversion and shifting.3.2.7round(q i,rem,pc,rc)The rounding function assumes that a biased trial result q i has been computed with pc+1bits,which is known to have an error of(-0.5,+0.5)ulp with respect to pc bits.The extra bit,or guard bit,is used along with the sign of the remain-der,a bit stating whether the remainder is exactly zero,andGuard Rem-RN RP RM RZ Bit ainder(+/-)(+/-)0=0trunc trunc trunc trunc 0-trunc trunc/dec dec/trunc dec 0+trunc inc/trunc trunc/inc trunc 1=0RNE inc/trunc trunc/inc trunc 1-trunc inc/trunc trunc/inc trunc 1+inc inc/trunc trunc/inc truncTable3.Action table for round functionrc to choose from three possible results,either q,q−1,or q+1which are equal to q i truncated to pc bits and decre-mented or incremented appropriately.The rounding details are shown in Table3.For RN,in the case of an exact halfway case,it is nec-essary to inspect L,the lsb of q to determine the action.If L=0,then the result is correctly rounded to nearest-even. Otherwise,the result is incremented to the closest even re-sult.It should be noted that for division where the preci-sion of the input operands is the same as that of the result, the exact halfway case can not occur.Only when the result precision is smaller than the inputs can such a result occur and rounding be required.For the directed rounding modes RP and RM,the action may depend upon the sign of the quotient estimate.Those entries that contain two operations such as pos/neg are for the sign of thefinal result itself being positive and negative respectively.This function,along with the computation of q−1and q+1,is implemented in parallel with the BACKMUL op-eration.The status information from the BACKMUL op is used as input to the ROUND function to quickly choose and return the correctly-rounded result.3.3PerformanceThe AMD-K7implementation of the programs Division and Square Root through a hardware state machine yields the performance summarized in Table1.The latencies in-clude the time to form the initial approximation and the ap-propriate number of iterations.The scheduling of the spe-cial iteration multiplications is optimized to simultaneously minimize the latency of the operations and the number of extra storage registers required to hold intermediate results, as well as maximize the throughput.4Design IssuesThe two primary design variables for the implementation of iterative division and square root are the minimum accu-racy of the initial approximation and the minimum multi-plier dimensions.We performed an initial loose error anal-ysis using Mathematica to bound the worst case errors,andwe based the design upon this analysis.We used formal verification later to confirm the validity of our results.To determine the accuracy of the Goldschmidt iterations, it is necessary to consider all of the sources of error in the final result.When converging to a result using Goldschmidt iterations,there are three possible sources of error in the pre-rounded result:•Error due to initial approximation= 0•Error due to use of one’s complement,rather than true two’s complement in the iterations= ones•Error due to using rounded results in the iteration mul-tiplications= mul4.1Division4.1.1Multiplier DimensionsIn this analysis,we consider the worst case output precision, internal precision with a68bit significand,such that three iterations of the algorithm are required to form an approxi-mation to Q=ab .Further,we assume that for performancereasons,the one’s complement is used rather than the two’s complement.The FP multiplier used in the iterations takes two inputs each of width n bits and produces an n bit result, such that1ulp for the result is2−(n−1).We wrote a Mathematica program that implemented the Division program with three iterations.The errors due to multiplier rounding were separated in the iterations.This is to correctly account for the fact that the rounding error in the successive D i refinements are self-correcting,while those in the N i refinements are not.Rather,the N i refinements suffer the additive error of the multiplies intrinsic to the N i step itself as well as the error in the D i step.Also,at each point where the rounding error is multiplied by the ratio of the two input operands,the worse case operand values in the range[1,2)were substituted in order to maximize the worst-case possible error.From Mathematica,thefinal error in thefinal2n bit pre-rounded quotient estimate q after three iterations is:q=−ab7 80+10 mul+2 ones(4) The initial approximation error is treated later.However, it is readily apparent that with an approximation accurate to at least2−14,the error in thefinal result due to the initial approximation after three iterations is on the order of2−104, and thus is not of concern.Instead,obtaining the required accuracy for exact rounding of internal precision results is dominated by the errors due to the multiplier dimension, mul and ones.The value of ones is a constant value of-1ulp.The ulp error of multiplication rounding is±0.5ulp.However,Precision Iterations ErrorSingle1ab 20Double2ab3 40Extended/Internal3ab7 80 Table4.Initial approximation error for recip-rocaldue to possible overflow before rounding,the mathemati-cal value,with respect to the binade of the input operands for the specific multiplication,can be as much as±1ulp. To provide the most conservative constraints,the±1ulp value is used in the analysis for all of the rounding errors due to multiplication.Substituting these values into the ex-pression for the pre-rounded quotient estimate yields a loose constraint on thefinal error in the result:−2−(n−5)< q<2−(n−5)(5) The design challenge becomes determining the mini-mum number of bits required for the multiplier.For the K7 FPU,the widest precision requiring exact division rounding is internal precision with pc=68.For internal precision1 ulp is2−67.The constraint for exact rounding of this pre-cision is that thefinal double-width quotient estimate N3 formed in the LASTMUL op have error N3given by−2−70< N3<2−70(6) as per Equation1.By equating the required precision with the precision obtained through the iterations using an n by n bit multiplier:2−(n−5)=2−70(7)n=75Thus,the minimum required number of bits n to guaran-tee exactly-rounded internal precision quotients is75.To provide even more margin in the design,we implemented a 76x76bit multiplier.4.1.2Table DesignMathematica analysis was used to determine the contribu-tion of the initial approximation error of the reciprocal to thefinal pre-rounded result for each of the target precisions. This is shown in Table4.For single precision division,the constraint on the error in thefinal double-width quotient estimate formed in the LASTMUL op for exact rounding as per Equation1is given by:−2−26< N1<2−26(8)。
计量经济学中英文词汇对照
Controlled experiments Conventional depth Convolution Corrected factor Corrected mean Correction coefficient Correctness Correlation coefficient Correlation index Correspondence Counting Counts Covaห้องสมุดไป่ตู้iance Covariant Cox Regression Criteria for fitting Criteria of least squares Critical ratio Critical region Critical value
Asymmetric distribution Asymptotic bias Asymptotic efficiency Asymptotic variance Attributable risk Attribute data Attribution Autocorrelation Autocorrelation of residuals Average Average confidence interval length Average growth rate BBB Bar chart Bar graph Base period Bayes' theorem Bell-shaped curve Bernoulli distribution Best-trim estimator Bias Binary logistic regression Binomial distribution Bisquare Bivariate Correlate Bivariate normal distribution Bivariate normal population Biweight interval Biweight M-estimator Block BMDP(Biomedical computer programs) Boxplots Breakdown bound CCC Canonical correlation Caption Case-control study Categorical variable Catenary Cauchy distribution Cause-and-effect relationship Cell Censoring
Use and misuse of the reduced major axis for line-fitting.
Use and Misuse of the Reduced Major Axis for Line-FittingRichard J.Smith*Department of Anthropology,Washington University,St.Louis,MO63130KEY WORDS regression;error variance;least-squares;statisticsABSTRACT Many investigators use the reduced major axis(RMA)instead of ordinary least squares (OLS)to define a line of bestfit for a bivariate relation-ship when the variable represented on the X-axis is measured with error.OLS frequently is described as requiring the assumption that X is measured without error while RMA incorporates an assumption that there is error in X.Although an RMAfit actually involves a very specific pattern of error variance,investigators have prioritized the presence versus the absence of error rather than the pattern of error in selecting between the two methods.Another difference between RMA and OLS is that RMA is symmetric,meaning that a single line defines the bivariate relationship,regardless of which variable is X and which is Y,while OLS is asymmetric,so that the slope and resulting interpretation of the data are changed when the variables assigned to X and Y are reversed.The concept of error is reviewed and expanded from previous discussions,and it is argued that the sym-metry-asymmetry issue should be the criterion by which investigators choose between RMA and OLS.This is a biological question about the relationship between varia-bles.It is determined by the investigator,not dictated by the pattern of error in the data.If X is measured with error but OLS should be used because the biological question is asymmetric,there are several methods avail-able for adjusting the OLS slope to reflect the bias due to error.RMA is being used in many analyses for which OLS would be more appropriate.Am J Phys Anthropol 140:476–486,2009.V C2009Wiley-Liss,Inc.One of the more common statistical procedures in com-parative biology is thefitting of a straight line to repre-sent the pattern of association between two continuous variables—a‘‘trend line.’’Ordinary least squares(OLS) regression is the only criterion forfitting such a line dis-cussed in most introductory statistics textbooks and also is the default method in virtually all general statistical software packages.However,it is now widely recognized that an OLSfit sometimes is not appropriate.Unfortu-nately,a proper alternative to OLS is not a simple mat-ter.For many data sets,the information needed for a complete solution to the problem of identifying a valid line is unknowable(e.g.,Jolicoeur and Heusner,1971; Moran,1971;Ricker,1975;Sprent and Dolby,1980; Carroll and Ruppert,1996),and strong disagreements characterize proposed solutions(Jolicoeur,1975,1990; Ricker,1975;Jungers,1984;Harvey and Pagel,1991; Kimura,1992).In response to this literature,it has become common in applied comparative studies to acknowledge the need for an alternative to OLS,but to simplify the process of choosing one.Two generalizations characterize the applied literature today:1.The most commonly recommended alternative to OLSis the reduced major axis(RMA)(e.g.,Hofman et al., 1986;Leduc,1987;LaBarbera,1989;Aiello,1992;Niklas,1994,2004;Griffiths,1998;Blackburn and Gaston,1998;White and Seymour,2005).RMA is now so well-known that often it is employed with no mention of why it was selected(e.g.,Bonduriansky and Rowe,2005;Jeffrey,2005;Carlson and Patel, 2006;Hulsey,2006;Potter et al.,2006).2.The signal indicating that RMA rather than OLSshould be used is the presence of‘‘error’’in both the X-axis and Y-axis variables.In contrast,OLS is indi-cated when the values for the X-axis are known with-out error(i.e.,all error can be attributed to the Y-axis measurements).A few recent examples of the large number of studies from a variety of major journals in which error in both X and Y is indicated to be the nec-essary and sufficient indication for use of RMA include Levinton and Allen(2005),Zimmerman et al.(2005),Bullimore and Burn(2006),Sla´dek et al.(2006),and Vincent and Lailvaux(2006).Rayner (1985,p416)is representative of the information that has been presented to biologists about the conse-quences of error in X on the validity of OLS:‘‘It cannot be overemphasized that when both vari-ates consist of uncontrolled measurements of biologi-cal quantities,linear regression models the error dis-tributions unrealistically and should not be used.’’RMA also is known as the standardized major axis (MA)(Warton et al.,2006),the line of organic correlation (Kermack and Haldane,1950),la relation d’allome´trie (Tessier,1948)and the geometric mean(GM)regression (Ricker,1973).Astronomers sometimes call it‘‘Stro¨m-berg’s impartial line’’(Feigelson and Babu,1992),and in physics it may be classified as a type of standard weight-*Correspondence to:Richard J.Smith,Graduate School of Arts and Sciences,Campus Box1187,Washington University,One Brookings Drive,St.Louis,MO63130.E-mail:rjsmith@Received3February2009;accepted17March2009DOI10.1002/ajpa.21090Published online7May2009in Wiley InterScience().V C2009WILEY-LISS,INC.AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY140:476–486(2009)ing model(Macdonald and Thompson,1992).It also qualifies as one type of‘‘model II regression’’in the widely cited terminology of Sokal and Rohlf(1995). While the reduction of linefitting to a choice between OLS or RMA is a simplification of complex issues(e.g., Kuhry and Marcus,1977;Sprent and Dolby,1980;Seim and Saether,1983;Jolicoeur,1990;Prairie et al.,1995; Konigsberg et al.,1998;Isler et al.,2002;Warton et al., 2006),it nevertheless is a major advance over the exclu-sive use of OLS.Given that many investigators are going to choose only between OLS and RMA,the purpose of this study is to demonstrate that this choice often is being made incorrectly,resulting in the use of RMA in many situations in which OLS would be more appropri-ate.It should be noted that this focus on RMA vs.OLS does not signal agreement with the idea that RMA is the method of choice as an alternative to OLS.That is an entirely different issue and will not be addressed in this study.The focus is on RMA because of its widespread use.Much of this discussion is relevant to other alterna-tives to OLS,including MA(Jolicoeur,1975,1990).A COMPARISON OF OLS AND RMAOLS is such a foundational concept in traditional sta-tistics that many readers are skeptical when introduced to the idea that there are problems with it.Philosophers of sciences have found much to question with the idea that minimizing the square of vertical deviations some-how allows identification of the‘‘best’’or‘‘true’’relation-ship between two variables(e.g.,Urbach,1992).Statisti-cal problems with OLS were recognized by Karl Pearson (1901),among others.Essentially,there has always been discussion in the statistics literature of alternate linefit-ting criteria and the limitations of OLS regression.An RMAfit was described briefly by Adcock(1878), developed in detail by Dent(1935),and independently by Roos(1937)and Jones(1937).Guidelines for the calcula-tion of RMA regressions and associated statistics(such as confidence intervals and standard errors)can be found in many sources,including Imbrie(1956),Miller and Kahn(1962),Hayami and Matsukuma(1970),Clarke(1980),Ricker(1984),Hofman et al.(1986),and Niklas(1994).For any given data set,the difference between an OLS and RMA slope is straightforward.One formula for the OLS slope is:r[sd(Y)/sd(X)].(See Table1for the defini-tion of statistical symbols used in this study.)The RMA slope is simply[sd(Y)/sd(X)],with the sign determined by the sign of the correlation.When r is1.00,the two slopes are identical.As r decreases,the OLS slope decreases(flattens)and the RMA slope is unchanged. (The correlation coefficient and its square(r and r2)are defined independently of the linefitting criterion and do not differ for RMA and OLS lines.)Both lines pass through the point defined by(X;Y)and intersect at this point when plotted for the same data.OLS and RMA arefit by minimizing two different defi-nitions of scatter around a line(Angleton and Bonham, 1995):OLS minimizes:RðYÀYÞ25the sum of vertical devia-tionsRMA minimizes:RðXÀXÞðYÀYÞ5the sum of the product of X and Y deviations,which is equivalent to the area of triangles formed by the deviation of a point from the line in the X and Y directions(see Fig.1).Thus,both OLS and RMA are least squares methods, in that they minimize the variance(sum of squares of residuals)about the line,using different definitions for the residual(Warton et al.,2006).LIMITATIONS OF OLSTwo issues with OLS bear on the argument to use RMA instead:1)the presence of error in the data,and2) an OLS equation is not symmetrical,meaning that the slope of the line differs depending upon which variable is identified as X and which as Y.The measurement error problemError,defined as any and all deviations from a perfect fit between X and Y,may be due to measurement tech-TABLE1.Definition of symbols used in statistical formulas X The mean value of the X-axis traitY The mean value of the Y-axis traita The constant(intercept)of the line of bestfitb The slope of the line of bestfite The error in thefit of Yu The error in thefit of Xsd(X)The standard deviation of the X-axis traitsd(Y)The standard deviation of the Y-axis traitvar(X)The variance of the X-axis traitvar(Y)The variance of the Y-axis traitvar(e)The variance of the errors in the Y-axis trait var(u)The variance of the errors in the X-axis trait cov(XY)The covariance between X and Yr The Pearson product-moment correlationcoefficient between X and Ysign(r)The sign(i.e.,positive or negative)of thecorrelation coefficientk The ratio of error variances[var(e)/var(u)]q The portion of error[var(e)and/or var(u)]dueto natural biological variation(see text fordiscussion)var(q)The variance of the natural biologicalvariationFig.1.For an OLS line,error is defined as the vertical devi-ation of a point from the line(distance A-B)and the quantity minimized is the sum of squares of these linear distances.With RMA,error is defined as the area of the triangle C-D-E,and the quantity minimized is the sum of these areas.477USE AND MISUSE OF THE REDUCED MAJOR AXISAmerican Journal of Physical Anthropologynique,sampling variation,or intrinsic natural variation in the data(e.g.,Riska,1991).These will be discussed in detail later,but it is virtually always the case with bio-logical data that some error is present in both the X-axis and Y-axis variables(McArdle,2003).The presence of error in X biases the slope of the OLS line,as follows. The well-known equation for an OLS regression of Y on X is:Y¼aþbXþeð1ÞIn Eq.(1),all variation not explained by the line,Y5 a1bX,is represented by e,error in thefit of Y values. The value of e for each subject is identical to the residual for that subject.An equation allowing for error in both X and Y has the form:Yþu¼aþbXþeð2Þwith the added term u representing errors in the observed values of X.Among the many formulas for the OLS slope(b)of Y on X is:b¼covðXYÞvarðXÞð3ÞIf X is measured with error,then the X-axis trait has an added component of variation and should be expressed as(X1u).The slope becomes(Calbet andPrairie,2003):b observed¼covðXYÞvarðXÞþvarðuÞð4ÞThe error in X causes the slope of Eq.(4)to be shal-lower(because it has a larger denominator)than the OLS slope calculated if X is measured without error in Eq.(3).The bias in the slope calculated with Eq.(4)is known as attenuation or regression dilution,and the magnitude of the effect depends on the magnitude of var(u)relative to var(X).In theory,var(u)also could affect the numerator of Eq.(4),which would make the consequences of error in X difficult to define.In the presence of var(u),the numera-tor becomes cov(XY)1cov(uY)(Kelly,2007).However, there is no reason to expect the errors in X and the observed values of Y to covary,so that this second term can be expected to have a value of zero.Thus,error in X and error in Y have different consequences on an OLS regression.Each reduces the precision of thefit(low-ering the correlation between X and Y,increasing the magnitude of residuals,increasing the standard error of estimate,and reducing the F ratio for the analysis of variance of regression).However,error in Y does not bias the slope of the equation.Error in X biases the slope,and the direction of the bias is to cause the calcu-lated slope to be shallower than the true slope.The symmetry problemA second source of concern about OLS,unrelated to error,is based on the lack of symmetry between the OLS regressions of Y on X and of X on Y.With a single set of data,the biological relationship represented by the equa-tion differs depending upon which variable is assigned to the X axis and which is assigned to Y.It is generally agreed that if the purpose offitting the line is to identify an underlying pattern between two mutually codepend-ent variables arbitrarily identified as X and Y only for the purpose of statistical computations,then this prop-erty of OLS presents a dilemma(Jolicoeur and Heusner, 1971;Brace,1977).For example,as discussed by Ricker (1973),it is hard to argue for a cause-effect relationship between body length and body mass,or between the cor-responding heights of sets of brothers and sisters.There should be one line describing the pattern of covariation in these cases.Neither variable is dependent upon the other,and the biological interpretation of the results should be identical regardless of which variable is on each axis.For a single data set,the difference between the two OLS lines increases as the correlation coefficient decreases.The difference between the two OLS lines is difficult to interpret if they are examined on two sepa-rate graphs with reversed axes.However,the X on Y regression can be inverted byfitting Y on X but minimiz-ing the horizontal deviations from the line(Ricker, 1973).This allows the two OLS regressions to be exam-ined on a single set of axes.The two lines will intercept at X;Y.The regression of Y on X will be the shallower of the two,with a slope(as noted previously)of:r[sd(Y)/ sd(X)].The inverted slope of X on Y will be:[1/r][sd(Y)/ sd(X)](Helsel and Hirsch,2002).Consider as an example a data set in which s y52.00,s x55.00,and r50.80. The Y on X slope is0.8(2/5)50.32.The inverted X on Y slope is:(1/.8)(2/5)50.50.The RMA slope is the GM of these two OLS slopes,in this example,0.40,and can be calculated directly from them:b RMA¼½signðrÞffiffiffiffiffiffiffiffiffiffiffiffiffiffiffib y:x b x:yqð5ÞIf the axes are inverted for two RMA regressions,the slopes are exact reciprocals of each other,and therefore maintain a single position with respect to the data. Thus,there is only one RMA regression line.It also fol-lows that the two OLS slopes(Y on X and the inverse of X on Y)define the extremes of possible slope values for a bivariate data set.The OLS regression of Y on X will be the shallowest slope and the inverse of X on Y will be the steepest slope.RMA and other solutions to the prob-lem of error in X all fall between these two OLS values (see Fig.2).The two OLS regressions define the range of uncertainty and the maximum possible error in choosing a line-fitting criterion to calculate a slope(Ricker,1973; McArdle,2003).Another important perspective on the difference between the two OLS equations is found in the literature on classical and inverse calibration(Krutchkoff,1967; Konigsberg et al.,1998).Consider two variables such as body mass(X)and molar tooth area(Y).Across mamma-lian species,we expect a correlation between these traits,and have little difficulty accepting some causal basis for this correlation.The direction of causation is that animals with larger body sizes tend to have larger skeletal structures and organs,including molar teeth. Thus,an equation in which tooth size is dependent and body mass is independent,would be,if one were‘‘cali-brating’’the relationship between the two variables,a ‘‘classical calibration.’’However,regressions between these variables often are calculated to predict an unknown body mass from tooth size.Generally,predic-478R.J.SMITH American Journal of Physical Anthropologytion equations are calculated in the form in which they will be used,with body mass as the dependent measure and tooth area as the independent variable.Since this inverts the direction of cause and effect,it is appropri-ately known as ‘‘inverse calibration.’’As discussed in the calibration literature,prediction of body mass can be accomplished with either equation,directly from the equation constructed by inverse calibration or by taking the classical calibration equation and solving for body mass.Since these are two different equations,they will result in two different predictions of body mass from a single value for molar tooth area.This provides a second insight into the dilemma created by the asymmetry of OLS.We have one data set,presumably one true rela-tionship between tooth size and body mass,but two dif-ferent predictions,depending on which variable is X and which is Y .THE ERROR STRUCTURE OF VARIABLESAn understanding of error is central to an informed decision regarding the use of RMA and OLS.To discuss error,it is helpful to envision a hypothetical data set.I will use interspecific allometry as a model system (Martin,1989).Data are collected on two biological vari-ables,typically body mass on the X -axis and some other trait,such as basal metabolic rate,brain size,or home range area,on the Y -axis.These random variables (i.e.,not under experimental control)are measured with error.Each data point is a species mean;one or more indi-viduals of a species are measured and averaged for each trait.Data usually are transformed to logarithms,but that is not germane to this discussion.One issue that probably does make a difference,but that will not be addressed in this study,is the issue of line-fitting when data are treated as phylogenetically independent con-trasts (Felsenstein,1985;Garland et al.,1992).Ives et al.(2007)and Felsenstein (2008)recently have explored the consequences of within-species variation on contrasts regressions,and Nunn and Barton (2000)havediscussed the peculiarities of error variance in contrasts regressions.For example,the process of ‘‘positivizing’’the X -axis reduces the range of the X -axis and thereby arbitrarily alters the ratio of measurement error in y rel-ative to the measurement error in x .As will be dis-cussed,this ratio has a central role in the choice between RMA and OLS,and manipulation of the ratio by procedures particular to the independent contrasts method is a clear signal that these analyses should be evaluated as highly specialized and specific.The present discussion is in the context of the more general least-squares regression model.Comments on earlier drafts of this manuscript made it clear that for many readers who analyze data but who are not particularly interested in statistical questions,any discussion of statistical methods becomes uncomfort-able when the term ‘‘error variance’’is introduced.It may be of value to briefly review some essential con-cepts.Much of the perspective in statistics regarding the fit-ting of a line developed with data from physics,where it is often the case that except for very small measurement errors,data points are related to a true underlying phys-ical law and are represented by a relationship that is close to a perfect straight line.It makes sense in these cases to call all deviations from a perfect fit ‘‘error.’’With biological data,values are expected to deviate from any line of best fit if for no other reason than that variation is inherent in the evolutionary processes that underlie the development of traits.Bivariate plots of biological data will more often resemble an ellipse rather than a perfect straight line.Although it does not make intuitive sense to call these deviations from a line ‘‘error,’’this is the vocabulary that has become standard in statistics (Hills and Wood,1984).When data are best represented as an ellipse,there are different criteria that might be applied when a single line is used to represent the general pattern of the rela-tionship between X and Y .This is the source of the debate about RMA vs.OLS.For any given line,whatever the criteria,most individ-ual data points will fall some distance from the line.These distances are residuals.If the residuals are organ-ized as a column of a spreadsheet,their mean,standard deviation,and variance can be calculated,just as they could be for any trait.The variance of the set of resid-uals is the ‘‘error variance’’and their standard deviation usually is the ‘‘standard error.’’Technique errorFor any single measurement,the first source of error is true inaccuracy and bias because of measurement technique.There are many ways to take bad measure-ments.Examples of this category of error include improper laboratory conditions while measuring basal metabolic rate,an incomplete survey when calculating home range area,or any flawed measuring instrument.Sometimes,these errors can be evaluated by repeated sampling of individual values (Sprent,1990)and quanti-fied as the ‘‘within-subjects standard deviation.’’When repeated measurements on each subject are available,this value is the square root of the residual mean square from a one-way analysis of variance (Bland and Altman,1996).Appendix C in Warton et al.(2006)provides addi-tional information regarding the calculation of this source oferror.Fig.2.Illustrated are three lines of best fit:1)the standard OLS Y on X regression,2)RMA,and 3)the inverse of OLS X on Y .The Y on X regression is the shallowest,RMA is intermedi-ate,and the inverse of X on Y is the steepest.The lines all intersect at X ;Y ,and become more divergent as the correlation decreases.479USE AND MISUSE OF THE REDUCED MAJOR AXIS American Journal of Physical AnthropologyA different type of technique error results from non-random sampling.A biased or nonrepresentative sample shares with errors in measurement technique the fact that it represents a true mistake in the collection of a representative sample.Snedecor and Cochran(1967, p165)point out that some variables may have no satis-factory method of measurement(perhaps‘‘intelligence’’is an example),and therefore error will result from applying an incorrect concept.Although these various sources of error are diverse and worth study in their own right,conceptually they can all be treated as one problem when contrasted with other sources of error. They will be identified as‘‘technique error.’’Sampling variationThe second source of error represents a fundamental concept of inferential statistics,‘‘sampling error.’’Sup-pose that all technique error is controlled,so that metic-ulous measurements are made on an excellent random sample of individuals.It is still the case that the data represent a sample of individuals from some larger popu-lation(e.g.,a species)and the sample is being used to estimate a population parameter.The estimate can devi-ate from the true population value.The potential for this error is a function of the variability of the trait and of sample size,and is expressed as the standard error of estimate(or confidence interval)for the calculated mean value.A small sample of a highly variable trait may pro-vide a very good estimate of the population mean,but the probability of obtaining an estimate that deviates widely from the true value increases when the sample size is small and the trait exhibits high variability.Sam-pling variation is partially under the control of the in-vestigator to the extent that larger sample sizes can be collected.Some data sets allow for estimation of sampling error. The implications of intraspecific variation on the calcu-lation of interspecific allometric equations have been examined by Pagel and Harvey(1988),Riska(1991), and Kelly and Price(2004).Pagel and Harvey(1988) illustrate the consequences of sampling variation in an interspecific study by calculating an intraspecific stand-ard deviation for each trait in each species(with log transformed data,so that the means and variances are not correlated).They then calculate the average intra-specific standard deviation for each trait.When con-verted to variances,the ratio of these values indicates the proportion of error due to sampling variation on each axis,and can be used to estimate this component of the error variance against the assumptions of OLS and RMA.Ricker(1984),who has contributed as much as anyone to the discussion of linefitting with biological comparative data,seems to omit all consideration of sampling variation,suggesting that if data are‘‘mea-sured carefully,’’measurement errors are negligible, leaving only variation that he calls‘‘mutual natural variability.’’Natural variationThe third source of error,Ricker’s‘‘mutual natural variability,’’is true biological variation.Other terms for it include‘‘intrinsic variation’’(McArdle,2003),‘‘biologi-cal error’’(Riska,1991),and‘‘natural error’’(Prairie et al.,1995).Consider a set of species in which meticu-lous measurements of brain mass and body mass are taken on a very large well-defined random sample.At least in theory,it is a data set in which technique error and sampling error are negligible.Nevertheless,when plotted,we expect points to deviate from the bestfitting line.This occurs because of biological differences among species in the relationship between brain size and body size.Assuming that intelligence is related to brain size, and that humans are an intelligent species,we expect that even if all data are ideally measured,brain size is relatively larger when compared to body size in humans than in other species.Among the causes for biological differences are function,biomechanics,ecology,and phy-logeny.This error in predicting Y from X is inherent in the biology of the variables and cannot be controlled or reduced by the investigator.This natural variation is what distinguishes the statistical considerations infit-ting a line to many biological data sets from line-fitting in much of chemistry and physics.There is one source of error that although clearly a form of technique error,probably must be included with the concept of natural variation,because it cannot be distinguished from it.It is common in interspecific com-parative studies for data to be taken from the literature, and therefore,for the values reported to represent dif-ferent individuals.X and Y data may be accurately measured on large samples—and thus have little tech-nique or sampling error—but it is often the case that a species occupies a variety of habitats,and that these habitats have an effect on the traits examined.If the X-axis and Y-axis traits are measured on individuals in different habitats,there will be error in thefit of this point to the line that cannot be attributed to either the X or Y value.The validity of an X-Y pairing also can be compromised by the use of different protocols(the defi-nition of adult specimens,number of males versus females,etc).I do not see any alternative to the fact that this error will be masked as noise in the evaluation of natural variation.Summary:sources of errorFor many purposes,what has been identified as tech-nique error and as sampling variation can be combined into a single entity,‘‘measurement error.’’Measurement error can then be combined with natural variation to explain all scatter around a line:Measurement error¼technique errorþsampling variationEquation errorðtotal lack of fitÞ¼natural variationþmeasurement error It is worth noting that‘‘the notation in measurement error models is notorious for its lack of consistency’’(Carroll and Ruppert,1996).In some literature,the term ‘‘equation error’’is used to represent what is called here natural variation—the lack offit that would exist if tech-nique error and sampling variation were eliminated (e.g.,Carroll and Ruppert,1996;McArdle,2003; Fernandes and Leblanc,2005;Warton et al.,2006). Fuller(1987,p106),which many statisticians refer to as the standard source on measurement error problems, identifies what we are calling natural variation as q,‘‘error in the equation.’’Unfortunately,this terminology480R.J.SMITH American Journal of Physical Anthropology。
描述误差的指标 -回复
描述误差的指标-回复"描述误差的指标"是一个数据分析和统计模型评估中的重要主题。
在本文中,我们将一步一步地回答关于这个主题的问题,探讨不同的描述误差的指标及其在实际应用中的意义和应用。
1. 什么是描述误差的指标?描述误差的指标是用来评估统计模型的拟合程度或数据分析结果的准确性的度量指标。
它们帮助我们了解模型的预测误差或数据的可靠性,可以用来比较不同模型或评估模型的质量。
2. 常见的描述误差的指标有哪些?常见的描述误差的指标包括均方误差(Mean Squared Error,MSE)、平均绝对误差(Mean Absolute Error,MAE)、均方根误差(Root Mean Squared Error,RMSE)等。
这些指标通常用于评估连续变量的预测模型。
此外,还有其他针对分类变量或非连续变量的评估指标,如准确率、召回率、F1分数等。
3. 均方误差(MSE)是如何计算的?它有什么优缺点?均方误差是评估预测模型拟合程度的常用指标。
它计算了观测值与模型预测值之间的差异的平方平均值。
具体而言,MSE的计算公式如下:MSE = Σ(yᵢ - ŷᵢ)²/ n其中,yᵢ是观测值,ŷᵢ是模型的预测值,n是样本数量。
均方误差的优点是:计算简单,保留了误差的绝对值,并且对大误差给予更高的惩罚,这在一些应用领域中是合理的。
然而,MSE也有一些缺点。
首先,均方误差不直观,其值无法与原始观测值进行比较。
其次,MSE对异常值敏感,高误差的观测值会对指标产生较大影响。
此外,MSE在模型比较时没有进行标准化,因此不易进行跨模型或跨数据集的比较。
4. 平均绝对误差(MAE)是如何计算的?它相对于MSE有什么特点?平均绝对误差是衡量预测模型拟合程度的另一种常用指标。
它计算了观测值与模型预测值之间的绝对差异的平均值。
具体而言,MAE的计算公式如下:MAE = Σyᵢ - ŷᵢ/ n与MSE相比,平均绝对误差更直观,其值与原始观测值的单位相同。
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
一个竞争性均方误差波束形成方法摘要:我们对待信号估计的波束的问题,其目的是从数组中观察中估计设置一个信号幅度。
常规波束形成方法通常着眼于将信干噪比(SINR)最大化。
然而,这并不能保证小均方误差(MSE),因此,平均产生的信号的估计会和真实信号相差甚远。
在这里,我们考虑的策略是,以尽量减少估计值和未知之间的信号波形的MSE。
我们建议的所有的方法都是去最大限度地提高SINR,但在同一时间里,它们都被设计为具有良好的MSE性能。
由于MSE依赖于未知的的信号功率,我们开发出了具有竞争力的波束形成方法,最小化鲁棒的MSE估计。
两种设计策略被提出:极小化最大MSE,极小化最大遗憾。
通过数值例子表明,在一个很大的SNR范围内,我们所建议的极小化最大波束形成方法可以超越现有的一些标准鲁棒的的方法。
.最后,我们应用我们的子带波束形成技术,并说明了宽带信号估计他们的优势。
关键词:极小化最大均方误差,极小化最大遗憾,稳健波束形成,子带波束形成。
Ⅰ简介波束形成是一个为了时间估计,干扰消除,源的定位,经典谱估计的处理时间传感器阵列测量的经典方法。
它已被应用于广泛的领域,无所不在,如雷达,声纳,无线通讯,语音处理和医疗成像等领域(详见,参考文献[1-4])。
依赖波束形成器而设计数据的传统方法通常试图极小化最大信号与干扰加噪声比(SINR)。
最大化SINR需要干扰加噪声的协方差矩阵和阵列导向矢量的知识。
由于协方差通常是未知的,它往往是由测量样本的协方差所代替,当信号在训练数据中时,这就导致了在高信噪比(SNR)的情况下,性能下降。
有些波束形成技术是设计来减轻这种影响[5-8],而另一些也需要去克服导向向量的不确定性[9-13],[14]。
在这里,我们假定导向矢量是确切的知道的,我们的目标是为信号估计设计一个波束形成器。
尽管事实上SINR已被用来作为衡量性能的标准和在许多波束形成设计方法的准则,最大化SINR或许也无法保证的一个很好的信号估计。
在估计的环境下,我们的目标是设计一个波束形成,以获得一个信号振幅接近其真实价值的估计,使它会更有意义,选择权,并尽量减少相关的是客观的估计错误,即真实之间的信号和它的估计之间的区别,而不是SINR 的差值。
此外,它可能会更翔实考虑把估计错误作为比较不同的波束形成方法的性能尺度。
如果信号功率是已知的,那么最小均方误差(MMSE)的波束可以被设计。
由此产生的波束可以表示为依赖功率的常数,这个常数乘以一个固定的权重向量是在SINR内的最佳值。
由于信干噪比在缩放时不敏感,最小均方误差的方法也能最大化SINR。
如果比例是固定的,那么缩放选择不影响信号的波形,而只是影响它的大小。
在一些应用中,实际幅度值可能是非常重要的。
在子带波束形成的背景下[15-20],这些就特别重要,由于它能够减少传统的宽带战略的复杂性,近些年获得很大的关注。
在这种情况下,独立执行波束形成在锐减频段和信道的输出相结合。
由于不同的尺度系数在每个通道使用,MMSE的战略一般会造成信号的波形和基于SINR为的方法所产生的不同。
因此,一个不错的选择缩放因子可以显著影响的估计波形。
通常情况下,信号功率为不明,MMSE波束就无法实现。
在这种情况下,其他的设计标准是需要选择缩放因子。
一个常用的方法是选择不会使信号失真的缩放因子,这相当于减少约束下的波束形成不偏不倚的均方误差(MSE)。
这就导致了著名的最小方差无失真响应(MVDR)波束的形成。
然而,就像我们解析和模拟的事实同时显示的那样,尽管MVDR方法是在一间不带偏见的MSE意义上是最优的技术,它往往会带来一个大的估计错误。
另一种策略就是从数据中来估计信号的功率,并使用结合的MMSE 的波束形成器。
这方法是密切相关的盲目极小化最大MSE 技术,最近在开发[21]。
以[21]中的结果为基础,它可以证明,如果估计信号功率和噪声的协方差是已知的,那么这种方法可以比MVDR 方法更加能改进MSE 。
然而,正如我们在第五节中的仿真展示那样,在典型的情况下,协方差是未知的,使用这种方法的性能就会大大恶化。
当信号功率和噪声方差是未知时,为了开发具有良好的MSE 性能的波束形成器,在本文中我们提出了两种设计战略,开发了上,下限形式的信号功率波束形成的先验知识。
这两种方法在可行信号地区内,优化了最坏的情况下的MSE 标准。
这些技术的优势在于两方面:首先,它们允许在信号的功率纳入先验知识,这在很多情况下是可用的,而且这种方式比其他方法更能改进的MSE 的性能。
第二,在没有先验知识可用的情况下,功率可以很容易地从数据中估算出,从而导致实际的波束形成技术。
的确,我们通过模拟演示得到,利用估计边界并联合所推荐的策略可以比先前的波束形成提升MSE 性能。
我们选择的标准,是在对线性模型[22-24]的估计的背景下发展起来的最近的观念为基础的。
在第一个方法中,我们尽量减少幅度(或方差的零均值随(12)机信号的情况下)范围为常数的所有信号的最坏情况的MSE 。
在第二种方法中,我们减少了在所有范围内的信号的最坏情况的遗憾[23],[25],[26],其中的遗憾被定义为在不确定因素存在的情况下波束形成器输出的MSE 和当功率是确定时的最小到达MSE 之间的差别。
这个策略考虑到了信号幅度的上界约束和下界约束两个方面。
为了说明我们的方法的优点,我们提出了一些数据实例,以此来比较传统的基于SINR 的策略,依据MSE 来缩放的策略,以及目前提出的健壮的方法[10]-[12]之间的波束形成器。
我们也在子带波束形成技术中运用了我们的技术,并说明了它们估计宽带信号的优势。
这里提到的理论观点在[13]中也会提及。
然而,在这里我们特别介绍了方法的实际方面,即,和缩放SINR 技术比较时,它们的性能特点,以及,它们在波束形成子带的影响。
本文的结构如下。
在第二节中,我们制定我们的问题并复习现有的方法。
极小化最大MSE 和极小化最大遗憾波束形成器的开发在第三节。
在第IV 及V ,我们讨论的实际的考虑并给出了数据实例,包括一个子带波束形成的应用。
ⅡSINR 的和MMSE 波束形成我们分别用黑体小写字母M c 表示向量,黑体大写字母⨯N M C表示矩阵。
I 是相应维度的单位矩阵,*(.)是对应的矩阵埃尔米特共轭,∧(.)表示一个估计向量。
A,SINR 波束形成对波束形成的主要任务之一是估计从源阵列组()s t 观察信号的幅度,t N ≤≤a i e y(t)=s(t)+(t)+(t),1 (1) 这里,()M y t C ∈是在t 时刻观测数组的复合矢量,M 是传感器的数量。
()s t 是待估测的信号幅度,a 是已知的并取决于和()s t 相关的平面波到到达方向的导向向量,i (t)是干扰信号,e (t)是高斯噪声矢量,N 是快照的数量(is the number of snapshots )我们的目的是通过利用一组波束形成权()w t ,从观测结果()y t 中估计信号的的幅度()s t ,输出的窄带波束如下:*()()(),1s t w t y t t N ∧=≤≤(2)一般来说,波束形成权()w t w =是用来使SINR 最大化*22*()s w a SINR w w Rw σ= (3) 其中,*{()()}R E i e i e =++是干扰和噪声协方差矩阵,2s σ是在确定情况下,由22|()|s s t σ=说给定的信号功率,并且当()s t 为零均值平稳随机过程时,22{()}s E s t σ=。
在实际情况下,协方差矩阵R 通常是不可用的并会被一个估计值所代替,最简单的方法就是使用一个样本协方差 *11()()N sm t R y t y t N ∧==∑ (4)由此导致了Capon 波束形成[27]。
一个可以转换的方式就是使用对角加载估计,由下面公式给出: *11()()Ndt sm t R R I y t y t I N ξξ∧∧==+=+∑ (5)当载入因素ξ发生变化是,将导致Capon 波束形成。
通常是选择210ξσ≈,2σ是但传感器中的噪声功率。
另外一个通用的方法就是特征空间波束形成,它的相反的协方差矩阵估计值是:11()()eig sm s R R P ∧∧--= (6)s P 是正交投影到信号子空间。
B ,MSE 波束形成α为任意值,都有()()SINR w SINR w α=,权重向量的SINR 最大化指定到一个常数,虽然缩放波束形成的权重将不会影响的SINR ,它可以对其他性能措施的影响,其中的波束形成器()s t 是用来估计信号波形应用,扩展成为尤为重要的子带的背景下的波束形成。
一种确定α流行的设计策略是要求*w a =1,导致MVDR 波束形成器 11MVDR R R --=w *1a a a (7)这或者可以得到波束形成的以解决方案**w min w Rw subjec to w a =1 (8)其中,下面我们显示,具有最小的MSE 受约束的波束形成是无偏的,虽然这种方法有几个最优性能的解释,它并不必然导致一个好的信号估计。
相反,我们可以尽量选择α最小化,而不需要直接输出的MSE 持平。
假设()s t s =是确定的,为简便起见,我们在那里省略了指数,MSE 和s 之间以及其估计为:222*{||}()|()||||1|E s s V s B s Rw s ∧∧∧-=+=+-*w a w (9)这里,2(){|{}|}V s E s E s ∧∧∧=-,是s ∧的方差,(){}B s E s s ∧∧=-是s ∧的偏方差。
当s 是一个零均值的随机变量,2s σ为方差,2||s 被2s σ替换。
对于具体,在讨论的其余部分,我们承担的确定性模型;但是,所有的结果在随机设置有效的地方,2||s 被2s σ替换。
21(||)R s β-+2*MVDR w(s)=|s |aa a=(s)w (10)其中最后一个等式的应用矩阵反演定理,我们定义了信号的常数: 22||()1||s s s β=+-1*-1aR a a R a (11)()s β满足0()1s β≤<,在2||s 范围内是单调增加的,所以对于所有的s ,都满足||()||||||MVDR w s w <。
把()w s 代入(9),得到最小均方误差MSE ,我们通过MSEB OPT 表示,由下式给出: 22||1||OPT s MSE s =+*-1a R a(12) MVDR 波形发生器的MSE ,如下式给出: 1MVDR MSE =*-1a R a (13)比较(12)和(13),我们发现,在||0s >的情况下,都有MSEB OPT <MSEB MVDR 。
所以,在具有相同的SINR 的情况下,MMSE 比 MVD 造成更小的的MSE 。