Tolerance-of-IMRT-QA-and-New-Paradigm-of-QA
Problem Definition
Example-Based Motion CloningMin Je Park Sung Yong ShinKorea Advanced Institute of Science and TechnologyPhone:+82428693528Fax:+82428693510Email:{mjpark,syshin}@jupiter.kaist.ac.krIn this paper,we pose a motion cloning problem of retargeting the motion of a source character to a target character with a different structure.Based on scattered data interpolation,an example-based approach to motion cloning is proposed.Provided with a set of example motions,our method automatically extracts a small number of representative postures called source key-postures.The animator then creates the corresponding key-postures of the target character,breathing his/her imagination and creativity into the output animation.Exploiting this correspondence,each input posture is cloned frame by frame to the target character to produce an initial animation,which is further adjusted in space and time for retargeting and timewarping and thenfinalized with some interactivefine tuning.With rich animation data available,our motion cloning method aims at rapid prototyping of an animation to verify an animator’s concept at an early stage.Keywords:character animation,motion cloning,scattered data interpolation,posture clusteringIntroductionProblem DefinitionIt is burdensome and time-consuming to create realistic motion from scratch.As rich repertoires have become available in motion databases, motion synthesis by example has arose as a key issue in character animation.A rather wide variety of solutions have been proposed to address motion synthesis issues in different contexts,e.g.,motion retargeting[1,2,3],motion rearrangement[4,5,6,7],and motion blending[8,9],to name a few.Formulated by Gleicher[1],the motion retargeting problem concerns how to apply a captured motion of one character to another with the same structure but different segment proportions.To synthesize an intended motion,Gleicher’s solution relies on constraints specified by an animator.This method addressed commonly-observed motion artifacts such as foot sliding and penetration rather well.In this paper,we generalize the solution in two different directions while taking a completely different approach:First,we remove the restriction on target characters requiring them to have identical structure in order to retarget captured motions to more diverse characters. Second,we provide a simple user interface for the animator to express his/her intentions directly.To distinguish our problem from motion retargeting,we refer to this problem as“motion cloning”,which is named after“facial expression cloning”in[10,11].Our solution is mainly based on posture blending rather than constrained optimization.Given a set of example motions of the source character,a small number of representative key-postures are extracted.The key-postures are chosen to represent the characteristics of the example motions.After extracting the key-postures of the source character,the animator prepares the corresponding key-postures of the target character together with the(geometric)character model to properly express the intended(time-varying)character shape.Provided with an input motion consisting of a stream of source character postures,every input posture is cloned frame by frame to the target character.This is achieved by blending the target key-postures with their weight values derived from the relationship between the input posture and the source key-postures.Finally,the animator makes some adjustments in space and time as well as minor posture editing.In motion cloning,the main issue is how to manifest animator’s intentions on the target animation with a minimum number of key-postures.More key-postures require accordingly greater effort to create target key-postures.On the other hand,more key-postures yield accordingly better quality in the resulting animation.Our technical contributions are two-fold:We provide a novel scheme for extractingmotionstargetan inputmotionFigure1:Overview of our example-based motion cloninga set of key-postures automatically,while trading off these conflicting requirements.Based on this scheme,we provide a framework of motion cloning.Related WorkSynthesis by imitation is a popular paradigm in computer graphics including shape modelling[12,13],image and texture synthesis[14, 15,16,17,18,19],facial animation[10,11,20],and motion generation[1,3,4,5,6,7,21,22].In concept,motion cloning originates from motion retargeting,which can be classified into motion generation.Technically,however,the origin of our method is facial expression cloning[11,23,24]based on scattered data interpolation[9,13],together with cartoon motion capture and retargeting[25].Motion Generation:Rose et al.[9]and Sloan et al.[13]proposed a framework of motion blending employing scattered data interpo-lation with radial basis functions.Park et al.[8]enhanced this framework for on-line locomotion generation.Bregler et al.[25]proposed an example-based approach to cartoon motion capture and retargeting.Assuming affine transformations between source and target shapes, their approachfirst captured both transformation parameters and blending weights of the source key-shapes at each frame of the input car-toon animation,and then synthesized an output by applying both the parameters and the weights to the corresponding target key-shapes.In a technical sense,the framework of this approach is similar to that of facial expression cloning[11,23,24]although there are well-known, generic key-expressions in facial expression cloning unlike cartoon motion capture and retargeting.Motion Retargeting:Gleicher[1]formulated motion retargeting as a constrained optimization problem.Lee and Shin[2]provided an interactive method to solve this problem by using a hierarchical displacement mapping scheme based on multi-level B-spline approximation. Shin et al.[3]proposed an importance-based approach to on-line puppetry.In addition to their conceptual contribution,motion retargeting techniques are also used in motion cloning to remove artifacts such as foot sliding and penetration in the post-processing stage.Facial Expression Cloning:Motivated by motion retargeting,Noh and Neumann[10]posed a facial expression cloning problem. Based on3D morphing between a pair of source and target face models,their method transfers the facial motion vectors from the source model to the target model to clone a facial expression.Similar example-based methods were proposed to retarget facial expressions from 2D videos to2D drawing[23]or3D models[24].Pyun et al.proposed a similar method by recasting the work of Bregler et al.within the framework of scattered data interpolation[11].We further enhance this framework for motion cloning.OverviewAs illustrated in Figure1,our example-based method for motion cloning consists of two parts:motion analysis and motion synthesis.The motion analysis part is the preprocessing stage consisting of three tasks:parameterization,cluster analysis,and key-posture extraction. These tasks are used in combination to extract a set of source key-postures.Given an input motion together with the parameterized source key-postures and their corresponding target key-postures,the motion synthesis part is to perform actual cloning.This part includes three tasks:initialization,posture blending,and motion retargeting and time adjustment.Thefirst two tasks are for key-posture initialization and scattered data interpolation,respectively,and the last is for postprocessing.The remainder of this paper is organized as follows:In Motion Analysis,we present a method to parameterize the postures of a source character for key-posture extraction.We describe actual motion cloning in Motion Synthesis and show experimental results in Experimental Results.Finally,Discussion and Conclusions provide discussion and conclusions,respectively.Motion AnalysisIn this section,we describe thefirst part of our method that extracts the key-postures from the example motions. ParameterizationWefirst describe how to linearize the postures of a source character for further analysis,and then explain how to obtain their parameter vectors.Motion data of an articulatedfigure can be considered as a vector-valued function in time that provides the posture of thefigure.The function is sampled at regular time instances to form their corresponding frames.The sampled vector at each frame determines the posture of thefigure at that time.Let a motion M be given as follows:M=(M(1),M(2),···,M(n))T,(1) where M(i),1≤i≤n is the posture of the source character at frame i.For an articulatedfigure with m joints including the root,we represent the posture M(i)at frame i byM(i)=(p(i),q1(i),···,q m(i))T,(2) where p(i)∈R3and q1(i)∈S3denote the position of the root and its orientation,respectively,and q j(i)∈S3the orientation of joint j for2≤j≤m.The localized version¯M of the motion M is given by¯M=(¯M(1),¯M(2),···,¯M(n))T,(3) where¯M(i),1≤i≤n is a posture specified in the root coordinate frame.¯M(i)is obtained by nullifying p(i)and q1(i),that is,¯M(i)=(0,1,q2(i),···,q m(i))T.(4) Since the unit quaternion space S3is highly non-linear,we linearize quaternions for later analysis such as principal components analysis (PCA)and k-means clustering:Wefirstfind a reference orientations q j∗for each joint j by minimizing the sum of angular distances between q j(i)and q j∗for all i as proposed in Park et al.[8].As preprocessing,we make all joint orientations q j(i)lie in the same hemisphere of S3for each frame,that is,if||log(q−1j∗q j(i)||>π/2for any j,we replace q j(i)by−q j(i).Given the reference orientation q j∗,we linearize¯M(i)to obtainˆM(i)=(0,0,vT,(5)2(i),···,v m(i))where v j(i)=log(q−1j∗q j(i)),2≤j≤m is the rotation vector of q j(i)with respect to q j∗.By assemblingˆM(i),1≤i≤n,we have the linearized versionˆM:ˆM=(ˆM(1),ˆM(2),···,ˆM(n))T.(6) The local posture vectorˆM contains redundant information since individual joints are correlated with each other.We employ PCA to reduce the dimensionality of the posture vector space to remove the redundancy while sacrificing some accuracy[26].This improves both efficiency and robustness in the remaining part of motion cloning.In particular,we express a posture as a linear combination of the key-postures.However,if the dimensionality of parameters is too higher than the number of key-postures,the scattered data interpolation problem is severely over-constrained,which results in a large approximation error called the“curse of dimensionality”.We avoid this problem by adopting PCA.Given the linearized motionˆM,we compute the set of eigenvectors,e i,1≤i≤m of the covariance matrix of posture elements.We choose the most significant r eigenvectors for r<m to form a space spanned by them.The parameter vectorˇM(i),1≤i≤n of a posture M(i)is obtained by projectingˆM(i)onto this space,that is,ˇM(i)=FˆM(i),1≤i≤n,(7) where F is a matrix formed by the chosen eigenvectors.The parameterized versionˇM is represented byˇM=(ˇM(1),ˇM(2),···,ˇM(n))T.(8)Cluster AnalysisIn this section,we present a scheme for clustering the postures of a source character in the example motions.Fraley and Raftery provide a comprehensive survey on cluster analysis [27].In particular,Berkhin [28]gives an excellent exposition on k -means clustering.The most distinguishing feature of our scheme is automatic computation of the number of clusters provided with the conditions to be satisfied by the resulting clusters.We start with describing these conditions.The objective of posture clustering is to choose proper key-postures such that each key-posture represents a unique cluster.Therefore,the postures in the same cluster should be similar,and a pair of distinct clusters should have different kinds of postures.To guarantee these requirements,we provide two threshold values,γand δ,which restrict the radius of a cluster and the inter-center distance between clusters,respectively.Letting C i ,1≤i ≤k denote a cluster,we define its (geometric)center c i as follows:c i =S j ∈C iS j /n i ,1≤i ≤k,(9)where S j is ˇMs (k )for some k ,and n i is the number of postures in C i .Then,the radius R (C i )of a cluster C i is given as follows:R (C i )=max S j ∈C i {||S j −c i ||},1≤i ≤k,(10)where ||·||gives a Euclidean norm.Similarly,the inter-center distance D (C i ,C j )between a pair of clusters,C i and C j isD (C i ,C j )=||c i −c j ||.(11)Using R (C i )and D (C i ,C j ),we provide the clustering conditions as follows:γ>max i {R (C i )}and δ<min i,j {D (C i ,C j )}.(12)These conditions force each cluster to have sufficiently similar postures (γcondition)and the resulting clusters to be sufficiently different from each other (δcondition).Now,we are ready to describe our clustering scheme:function γδ-Cluster(M ){1¯M ←LocalizeMotions(M );2ˆM ←LinearizeMotions(¯M );3ˇM ←ParameterizeMotions(ˆM );4C ←SeedClusters(ˇM);5γF lag ←TRUE;δF lag ←TRUE;6while (γF lag ∨δF lag ){7K =|C |;8C ←k -MeanCluster(K,C,ˇM);9C ←C ;10γCandidates ←{C i :R (C i )>γ};11δCandidates ←{(C i ,C j ):D (C i ,C j )<δ};12if γCandidates =Øthen {13C ←AddCluster(C,γCandidates );14C ←C ;15γF lag ←TRUE;}16else γF lag ←FALSE;17if δCandidates =Øthen {18C ←DeleteCluster(C,δCandidates );19C ←C ;20δF lag ←TRUE;}21else δF lag ←FALSE;}22return(C );}Initially,an example motion M is preprocessed to be parameterized (steps 1-3).Then,the pair of postures with the maximum Euclideandistances are chosen to classify the postures in ˇMaccording to their distances to each of the chosen postures (step 4).Here,the set C contains the estimation for clusters,that is,the centers of clusters and their members,and thus they are properly initialized.The clusters are then populated iteratively in the while loop (steps 6-21).At step 7,the cardinality of the set C is counted to employ the well-known k -means clustering scheme [28](steps 8-9).With the set C ,the while loop is repeated until the clustering conditions are satisfied.To do this,γCandidates and δCandidates are computed (stepsFigure2:Splitting and merging:splitting the cluster with the largest radius r>γ(left);merging the two distinct clusters with the smallest distance d<δ(right)10-11).These sets provide the information on the clusters that violate our clustering conditions.The former set contains the clusters of which the radii are greater than the given thresholdγ,and the latter consists of pairs of clusters of which the distances are less thanδ.If γCandidates is not empty,then the cluster with the largest radius is chosen from the set and is split(steps13-16),as shown in Figure2(a). IfδCandidates is not empty,the pair of clusters with the smallest distance is chosen from the set and are merged into one(steps17-21), as illustrated in Figure2(b).Specifically,our scheme works well withδ=2γ=0.09-0.1.Key-Posture ExtractionThe source key-postures are instrumental in motion cloning.Therefore,they should satisfy the following requirements:1.The source key postures faithfully reflect the postures in the example animation.2.The number of source key-postures is minimized as long as the animator’s intention on the cloned animation can be expressed.To achieve thefirst requirement,the source key postures should be the representatives of various sample postures.Thus,the postures should be as diverse as possible.Moreover,every posture in the sample animation needs to be expressed as a linear combination of the key-postures since our motion cloning method is based on scattered data interpolation.The second requirement is intended to minimize the animator’s effort in creating the target key-postures corresponding to the source key-postures.Diversity of key-postures for thefirst requirement may aid in expression of the animator’s intended design.However,fidelity of reconstruction quality conflicts with minimality of key-postures for the second condition.Theoretically,every posture can be reconstructed exactly if the degrees of freedom of the source character are the same as the number of linearly-independent key-postures.However,a large number of key-postures imposes a burden on the animator in preparing the corre-sponding target key-postures.Our strategy is to minimize the number of key-postures while forcing every sample posture to be interpolated within a user-specified error toleranceε.For each reconstructed postureˆM∗,its error eˆM is defined as follows:eˆM=||ˆM−ˆM∗||||ˆM||,(13)whereˆM is the linearized posture of a sample posture M.Now,we present our scheme for key-posture extraction in a pseudo-code and explain its major steps: function ExtractKeyPostures(M){1¯M←LocalizeMotions(M);2ˆM←LinearizeMotions(¯M);3ˇM←ParameterizeMotions(ˆM);4C←γδ-Cluster(M);5ContinueF lag←TRUE;6while(ContinueF lag){7KeyP ostures←ExtractKeys(C);8MaxError←0;9GetNextF lag←TRUE;10while(GetNextF lag){11ˆM←GetNextPosture(ˆM);12ˇM←GetNextPosture(ˇM);13ifˆM=øthen GetNextF lag←FALSE;14else{Figure3:MaxError and MaxS(left)after augmenting C with MaxS(right)15W eights←ComputeWeights(KeyP ostures,ˇM);16ˆM∗←BlendPostures(W eights,KeyP ostures);17if MaxError>||ˆM−ˆM∗||/||ˆM||then{18MaxError←||ˆM−ˆM∗||/||ˆM||;19MaxS←ˇM;}}}20if MaxError≤εthen ContinueF lag←FALSE;21else{22C←C∪{MaxS};23K←|C|;24C ←k-MeanCluster(K,C,ˇM);25C←C ;}}26C ←ExtractKeys(C);27ˆC←DeParameterizeMotions(C );28KeyP ostures←DeLinearizeMotions(ˆC);29return(KeyP ostures);}Steps1-3are to process an example motion M to obtain¯M,ˆM,andˇM(See Equations(3),(6),and(8)).Provided with a sequence of sample postures,an initial collection C of clusters is obtained(step4).This collection is then augmented by adding a new cluster at each iteration in the main loop(steps5-25),until every sample posture can be expressed as a linear combination of the chosen key-postures with a given thresholdε.Steps7-9are to prepare for the inner loop(steps10-19).Specifically,step7chooses a key-posture per cluster,which is closest to the center of each cluster,to give the initial key-postures.The task of the inner loop is tofind the maximum error MaxError between a linearized postureˆM and its reconstructed oneˆM∗over all sample postures,together with the sample posture MaxS that yields this error. Steps15and16are for scattered data interpolation,which is explained in Scattered Data Interpolation.After computing MaxError and MaxS,steps20-25are performed:MaxError is compared with a given error thresholdε.If MaxError is less thanε,then the process isfinished since all sample postures can be constructed using the chosen key-postures withinε. Otherwise,the cluster collection is augmented with MaxS and the next iteration is performed with the new collection C.In steps26-28, the source key-postures are sent back to the root coordinate frame so that the animator can refer to it to create the corresponding target key-postures.We empirically setε=2∼5%,which works well for our experiments in Experimental Results.As shown in Figure3,the error for the posture MaxS is reduced to zero after augmenting C with it,which also decreases the errors of other postures near MaxS in the parameter space.Motion SynthesisPreliminariesIn this section,we describe the second part of our method,that is,how to perform the actual cloning.Wefirst provide definitions and symbols to be used,and then present the overall structure of the actual cloning in a pseudocode.Definitions and Symbols:Let M s and M t be input and output motions as follows:M s=(M s(1),M s(2),···,M s(n))T,andM t=(M t(1),M t(2),···,M t(n))T.(14)¯M s,ˆM s,andˇM s denote the same input motion M s represented in different manners as given in Equations(3),(6),and(8),respectively. Similarly,¯M t,ˆM t,andˇM t represent different versions of the same output motion,respectively.Let¯P s and¯P t denote the set of source key-postures and that of their corresponding target key-postures specified in their root coordinate frames,respectively,that is,¯P s=(¯P s1,¯P s 2,···,¯P sk)T,and¯P t=(¯P t1,¯P t2,···,¯P tk)T,(15)where¯P s j in¯P s corresponds to¯P t j in¯P t for all1≤j≤k.¯P s is extracted from the sample postures by function ExtractKeyP ostures given in Key-Posture Extraction,and¯P t is supplied by the animator.For the source key-postures¯P s,its different versionsˆP s andˇP s are built as explained in Equation(6).ˆP t is obtained from the target key-postures using Equation(8).Overall Structure:Now,we are ready to present the overall structure of actual cloning:function SynthesizeMotion(M s,¯P s,¯P t){1(ˆP s,ˆP t)←LinearizeMotions(¯P s,¯P t);2ˇP s←ParameterizeMotions(ˆP s);3M t←ø;4GetNextF lag←TRUE;5while(GetNextF lag){6M s←GetNextPosture(M s);7if M=øthen GetNextF lag←FALSE;8else{9¯M s←LocalizePosture(M s);10ˆM s←LinearizePosture(¯M s);11ˇM s←ParameterizePosture(ˆM s);12W eights←ComputeWeights(ˇP s,ˇM s);13ˆM t←BlendPosture(W eights,ˆP t);14¯M t←DeLinearizePosture(ˆM t);15M t←DeLocalizePosture(¯M t);16M t←M t∪{M t};}}17M←Retarget(M t);18M t←Adjust(M);19return(M t);}Our motion synthesis part consists of three major tasks:key-posture processing(steps1-2),scattered data interpolation(steps3-16),and postprocessing(steps17-18).Thefirst task is to initialize the source and target key-postures for weight computation and posture blending. The second task is to perform the actual motion cloning,and the last task is for motion retargeting and interactivefine tuning performed by the animator.We explain these tasks in the next three sections.InitializationWhen the target key-postures are created,the relationship of the target character with the environment is not available.Thus,they have to be specified in the root coordinate frame of the target character.Since every source key-posture is used as a reference in creating its corresponding target key-posture,the source key-posture should also be specified in the root coordinate frame of the source character. Therefore,we assume that both kinds of key-postures are in their corresponding characters’local coordinate frames,respectively(See Equation(4)).However,since the joint angles are specified in unit quaternions in the local coordinate frames,they should be linearized for scattered data interpolation.In addition,the source key-postures go through PCA for their parameterization(See Equation(7)),which is needed for weight computation.Scattered Data InterpolationGiven an input animation M s,the while loop(steps3-16)clones a sequence of postures in M s to the target character frame by frame to create an output animation M t.The core of motion cloning is scattered data interpolation based on cardinal basis functions[9,13].The loop body can be divided into three groups of steps,that is,input posture parameterization(steps9-11),weight computation and posture blending(steps12-13),and output animation construction(steps14-16).Every posture M s in M sfirst goes through three stages in sequence as described by Equations(4),(5),and(7)to locate the positionˇM s in the parameter space,where the linearized postureˆM s is defined.Then,based on cardinal basis functions,the weight w i(ˆP s j),1≤j≤k of each input key-postureˆP s i inˆP s is computed atˇM s in the parameter space,that is,w i(ˇM s)=rj=1l ij L j(ˇM s)+kj=1r ij R j(ˇM s),1≤i≤k,(16)where L j(·)and l ij are linear basis functions and their coefficients,R j(·)and r ij are radial basis functions and their coefficients,and r and k are the dimensionality of the parameter space and the number of key-postures,respectively.We use cubic B-spline functions as radial basis functions,for which the dilation factor is set to the maximum of cluster radii for all clusters.For details on basis functions and their coefficients,we refer readers to the work in[9,13].Each w i(ˇM s j),1≤j≤k is applied to its corresponding linearized target key-postureˆP t i to give the output postureˆM t corresponding to the input key-postureˆM s,that is,ˆM t=ki=1w iˆP t i.(17)Finally,ˆM t is converted back to the original posture space by negating the sequence of transformations to(experienced in steps9-10)in the reverse order(See Equations(4)and(5)).In particular,letˆM t=(0,0,v t2,···,v tm)T,(18)where v t j is the rotation vector of joint j inˆM s computed in steps12-13.Then,¯M t=(0,1,q t2,···,q tm)T,(19)where q t j=q j∗exp(v t j),and q j∗is the reference orientation for joint j(see Parameterization).Finally,M t=(0,q s1,q t2,···,q t m)T.(20) Here,intact q s1is transferred from M s to M t.Motion Retargeting and AdjustmentNow,we construct the proper position of the root to complete the synthesis process.Since the target key-postures,which are created by the animator,do not contain information about the global frame,we try to construct a plausible root trajectory while satisfying the user-specified constraints.The input postures of the source character are promising guesses for the output postures of the target character.If the source and target characters are structurally similar,we can transfer the root trajectory into the target character as described in[1]after applying some global scaling to the trajectory.However,the target character generally looks quite different from the source character,not only in segment proportions but also in structures.In such a case,to prevent artifacts such as foot sliding or penetration,the animator should specify the target root trajectory in advance so that an on-line motion retargeting technique can be employed[3,8].We also provide a user interface that facilitates dynamic time-warping to control motion styles,exploiting a set of keytimes in the motion, that is,the moments of interaction between the character and its surrounding environment such as instances of heel-strikes and toe-offs in human locomotion.These instances can be detected automatically by employing the technique described in[29].The animator can change motion styles by interactively specifying the temporal correspondence between the keytimes of the input motion and those of the cloned motion.After establishing the correspondence,the cloned motion is resampled and interactively edited forfine tuning to complete motion cloning.Figure4:Three different character models in structures and segment proportionsTarget Key-Posture ValidityWe describe how to check the validity of target key-postures specified by the animator.Suppose that the key-postures are linearized.By the way in which we choose the source key-postures,every linearized postureˆM s in the input animationˆM s can be expressed as a linear combination of the source key-postures within a given error tolerance.Thus,ˆM s≈kj=1w jˆP s j for w j∈R.(21)The corresponding output postureˆM t can be expressed as a function of the input postureˆM s,that is,ˆM t=f(ˆM s)=f(kj=1w jˆP s j),(22)assuming thatˆM s= kj=1w jˆP s j.The function f interpolates the target key-postures at their corresponding source postures,that is,ˆP tj=f(ˆP s j)for all1≤j≤k.(23)Since the target postureˆM t can also be obtained by blending the target key-postures with the same weights,ˆM t j =kj=1w jˆP t j=kj=1w j f(ˆP s j).(24)From Equations(22)and(23),f(kj=1w jˆP s j)=kj=1w j f(ˆP s j),(25)which states that f is a linear function.In other words,the animator should specify the linearized target key-postures such that they are linearly related to their corresponding source key-postures.In order to check the linearity between the two key-posture sets,ˆP s andˆP t, we employ a statistical technique called CCA[22,30],which is a multi-dimensional generalization of correlation analysis.Experimental ResultsWe used a human model of51DOFs for body configuration(6DOFs for the root position and orientations,6DOFs for the spline,3DOFs for the neck,and9DOFs for each limb)as a source character.Three kinds of target characters were prepared for our experiments,as shown in Figure4.The model on the left has the same number of DOFs as that of the source character but has different segment proportions.The middle model has78DOFs and looks quite different not only in segment proportions but also in structures.The model on the right is a third target character with36DOFs.The motions were sampled at a rate of60frames per second.The experiments were performed on a standard PC environment(Intel P42.2GHz processor and1GB memory)with MS Windows XP.Two sets of example motions were captured:Thefirst set of example motions was composed of locomotive motions including walking, jogging,and running.To obtain smooth transitions between these motions,we captured a continuous sequence of those motions as a single motion clip consisting of5112frames.The second set was composed of a sequence of dynamic motions of the same character consisting of4700frames,including soccer motions such as kicking,dribbling,feinting,and heading.For key-posture extraction,our example motions werefirst parameterized as described in Parameterization by employing PCA.To keep 95%of the variance in motion data,we selected12and16eigenvectors of the covariance matrices of posture elements in the two example motion sets,respectively.To apply our key-posture extraction scheme to thefirst set of example motions,we empirically set our thresholds for the radius and the inter-center distance to0.05and0.10,respectively.For the second set of motions,they were set to0.045and0.09. When the scattered data interpolation was performed with an error tolerance of2%,15and24key-postures were obtained for the example。
Tolerance_stack-up[17P][290KB]
– Set every parameter to its spec limit – Measure the response – This is the same as adding tolerances
g g g g x1 x 2 x 3 x1 x 2 x 3
2 2 2
2
17-10
Sensitivity Analysis
• The sensitivity of the output to changes in the input can be calculated from the derivatives. • The sensitivity of I to changes in V
– An equation for what you are simulating – The distribution of the parameters so they can vary over time
17-7
Circuit Example
I
V R ( 2fL )
2 2
Mean Std Dev V = voltage 100 5 r = resistance 10 1 f = capacitance 50 5 L= inductance .004 .0008
17-8
Monte Carlo Analysis
Circuit Example
I
V R ( 2fL )
2 2
Mean Std Dev V = voltage 100 5 r = resistance 10 1 f = capacitance 50 5 L= inductance .004 .0008
Cubature Kalman Filters
1254IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 6, JUNE 2009Cubature Kalman FiltersIenkaran Arasaratnam and Simon Haykin, Life Fellow, IEEEAbstract—In this paper, we present a new nonlinear filter for high-dimensional state estimation, which we have named the cubature Kalman filter (CKF). The heart of the CKF is a spherical-radial cubature rule, which makes it possible to numerically compute multivariate moment integrals encountered in the nonlinear Bayesian filter. Specifically, we derive a third-degree spherical-radial cubature rule that provides a set of cubature points scaling linearly with the state-vector dimension. The CKF may therefore provide a systematic solution for high-dimensional nonlinear filtering problems. The paper also includes the derivation of a square-root version of the CKF for improved numerical stability. The CKF is tested experimentally in two nonlinear state estimation problems. In the first problem, the proposed cubature rule is used to compute the second-order statistics of a nonlinearly transformed Gaussian random variable. The second problem addresses the use of the CKF for tracking a maneuvering aircraft. The results of both experiments demonstrate the improved performance of the CKF over conventional nonlinear filters. Index Terms—Bayesian filters, cubature rules, Gaussian quadrature rules, invariant theory, Kalman filter, nonlinear filtering.• Time update, which involves computing the predictive density(3)where denotes the history of input; is the measurement pairs up to time and the state transition old posterior density at time is obtained from (1). density • Measurement update, which involves computing the posterior density of the current stateI. INTRODUCTIONUsing the state-space model (1), (2) and Bayes’ rule we have (4) where the normalizing constant is given byIN this paper, we consider the filtering problem of a nonlinear dynamic system with additive noise, whose statespace model is defined by the pair of difference equations in discrete-time [1] (1) (2)is the state of the dynamic system at discrete where and are time ; is the known control input, some known functions; which may be derived from a compensator as in Fig. 1; is the measurement; and are independent process and measurement Gaussian noise sequences with zero and , respectively. means and covariances In the Bayesian filtering paradigm, the posterior density of the state provides a complete statistical description of the state at that time. On the receipt of a new measurement at time , we in update the old posterior density of the state at time two basic steps:Manuscript received July 02, 2008; revised July 02, 2008, August 29, 2008, and September 16, 2008. First published May 27, 2009; current version published June 10, 2009. This work was supported by the Natural Sciences & Engineering Research Council (NSERC) of Canada. Recommended by Associate Editor S. Celikovsky. The authors are with the Cognitive Systems Laboratory, Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1, Canada (e-mail: aienkaran@grads.ece.mcmaster.ca; haykin@mcmaster. ca). Color versions of one or more of the figures in this paper are available online at . Digital Object Identifier 10.1109/TAC.2009.2019800To develop a recursive relationship between the predictive density and the posterior density in (4), the inputs have to satisfy the relationshipwhich is also called the natural condition of control [2]. has sufficient This condition therefore suggests that information to generate the input . To be specific, the can be generated using . Under this condiinput tion, we may equivalently write (5) Hence, substituting (5) into (4) yields (6) as desired, where (7) and the measurement likelihood function obtained from (2). is0018-9286/$25.00 © 2009 IEEEARASARATNAM AND HAYKIN: CUBATURE KALMAN FILTERS1255Fig. 1. Signal-flow diagram of a dynamic state-space model driven by the feedback control input. The observer may employ a Bayesian filter. The label denotes the unit delay.The Bayesian filter solution given by (3), (6), and (7) provides a unified recursive approach for nonlinear filtering problems, at least conceptually. From a practical perspective, however, we find that the multi-dimensional integrals involved in (3) and (7) are typically intractable. Notable exceptions arise in the following restricted cases: 1) A linear-Gaussian dynamic system, the optimal solution for which is given by the celebrated Kalman filter [3]. 2) A discrete-valued state-space with a fixed number of states, the optimal solution for which is given by the grid filter (Hidden-Markov model filter) [4]. 3) A “Benes type” of nonlinearity, the optimal solution for which is also tractable [5]. In general, when we are confronted with a nonlinear filtering problem, we have to abandon the idea of seeking an optimal or analytical solution and be content with a suboptimal solution to the Bayesian filter [6]. In computational terms, suboptimal solutions to the posterior density can be obtained using one of two approximate approaches: 1) Local approach. Here, we derive nonlinear filters by fixing the posterior density to take a priori form. For example, we may assume it to be Gaussian; the nonlinear filters, namely, the extended Kalman filter (EKF) [7], the central-difference Kalman filter (CDKF) [8], [9], the unscented Kalman filter (UKF) [10], and the quadrature Kalman filter (QKF) [11], [12], fall under this first category. The emphasis on locality makes the design of the filter simple and fast to execute. 2) Global approach. Here, we do not make any explicit assumption about the posterior density form. For example, the point-mass filter using adaptive grids [13], the Gaussian mixture filter [14], and particle filters using Monte Carlo integrations with the importance sampling [15], [16] fall under this second category. Typically, the global methods suffer from enormous computational demands. Unfortunately, the presently known nonlinear filters mentioned above suffer from the curse of dimensionality [17] or divergence or both. The effect of curse of dimensionality may often become detrimental in high-dimensional state-space models with state-vectors of size 20 or more. The divergence may occur for several reasons including i) inaccurate or incomplete model of the underlying physical system, ii) informationloss in capturing the true evolving posterior density completely, e.g., a nonlinear filter designed under the Gaussian assumption may fail to capture the key features of a multi-modal posterior density, iii) high degree of nonlinearities in the equations that describe the state-space model, and iv) numerical errors. Indeed, each of the above-mentioned filters has its own domain of applicability and it is doubtful that a single filter exists that would be considered effective for a complete range of applications. For example, the EKF, which has been the method of choice for nonlinear filtering problems in many practical applications for the last four decades, works well only in a ‘mild’ nonlinear environment owing to the first-order Taylor series approximation for nonlinear functions. The motivation for this paper has been to derive a more accurate nonlinear filter that could be applied to solve a wide range (from low to high dimensions) of nonlinear filtering problems. Here, we take the local approach to build a new filter, which we have named the cubature Kalman filter (CKF). It is known that the Bayesian filter is rendered tractable when all conditional densities are assumed to be Gaussian. In this case, the Bayesian filter solution reduces to computing multi-dimensional integrals, whose integrands are all of the form nonlinear function Gaussian. The CKF exploits the properties of highly efficient numerical integration methods known as cubature rules for those multi-dimensional integrals [18]. With the cubature rules at our disposal, we may describe the underlying philosophy behind the derivation of the new filter as nonlinear filtering through linear estimation theory, hence the name “cubature Kalman filter.” The CKF is numerically accurate and easily extendable to high-dimensional problems. The rest of the paper is organized as follows: Section II derives the Bayesian filter theory in the Gaussian domain. Section III describes numerical methods available for moment integrals encountered in the Bayesian filter. The cubature Kalman filter, using a third-degree spherical-radial cubature rule, is derived in Section IV. Our argument for choosing a third-degree rule is articulated in Section V. We go on to derive a square-root version of the CKF for improved numerical stability in Section VI. The existing sigma-point approach is compared with the cubature method in Section VII. We apply the CKF in two nonlinear state estimation problems in Section VIII. Section IX concludes the paper with a possible extension of the CKF algorithm for a more general setting.1256IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 6, JUNE 2009II. BAYESIAN FILTER THEORY IN THE GAUSSIAN DOMAIN The key approximation taken to develop the Bayesian filter theory under the Gaussian domain is that the predictive density and the filter likelihood density are both Gaussian, which eventually leads to a Gaussian posterior den. The Gaussian is the most convenient and widely sity used density function for the following reasons: • It has many distinctive mathematical properties. — The Gaussian family is closed under linear transformation and conditioning. — Uncorrelated jointly Gaussian random variables are independent. • It approximates many physical random phenomena by virtue of the central limit theorem of probability theory (see Sections 5.7 and 6.7 in [19] for more details). Under the Gaussian approximation, the functional recursion of the Bayesian filter reduces to an algebraic recursion operating only on means and covariances of various conditional densities encountered in the time and the measurement updates. A. Time Update In the time update, the Bayesian filter computes the mean and the associated covariance of the Gaussian predictive density as follows: (8) where is the statistical expectation operator. Substituting (1) into (8) yieldsTABLE I KALMAN FILTERING FRAMEWORKB. Measurement Update It is well known that the errors in the predicted measurements are zero-mean white sequences [2], [20]. Under the assumption that these errors can be well approximated by the Gaussian, we write the filter likelihood density (12) where the predicted measurement (13) and the associated covariance(14) Hence, we write the conditional Gaussian density of the joint state and the measurement(15) (9) where the cross-covariance is assumed to be zero-mean and uncorrelated Because with the past measurements, we get (16) On the receipt of a new measurement , the Bayesian filter from (15) yielding computes the posterior density (17) (10) where is the conventional symbol for a Gaussian density. Similarly, we obtain the error covariance where (18) (19) (20) If and are linear functions of the state, the Bayesian filter under the Gaussian assumption reduces to the Kalman filter. Table I shows how quantities derived above are called in the Kalman filtering framework. The signal-flow diagram in Fig. 2 summarizes the steps involved in the recursion cycle of the Bayesian filter. The heart of the Bayesian filter is therefore how to compute Gaussian(11)ARASARATNAM AND HAYKIN: CUBATURE KALMAN FILTERS1257Fig. 2. Signal-flow diagram of the recursive Bayesian filter under the Gaussian assumption, where “G-” stands for “Gaussian-.”weighted integrals whose integrands are all of the form nonGaussian density that are present in (10), linear function (11), (13), (14) and (16). The next section describes numerical integration methods to compute multi-dimensional weighted integrals. III. REVIEW ON NUMERICAL METHODS FOR MOMENT INTEGRALS Consider a multi-dimensional weighted integral of the form (21) is some arbitrary function, is the region of where for all integration, and the known weighting function . In a Gaussian-weighted integral, for example, is a Gaussian density and satisfies the nonnegativity condition in the entire region. If the solution to the above integral (21) is difficult to obtain, we may seek numerical integration methods to compute it. The basic task of numerically computing the integral (21) is to find a set of points and weights that approximates by a weighted sum of function evaluations the integral (22) The methods used to find can be divided into product rules and non-product rules, as described next. A. Product Rules ), we For the simplest one-dimensional case (that is, may apply the quadrature rule to compute the integral (21) numerically [21], [22]. In the context of the Bayesian filter, we mention the Gauss-Hermite quadrature rule; when the is in the form of a Gaussian density weighting functionis well approximated by a polynomial and the integrand in , the Gauss-Hermite quadrature rule is used to compute the Gaussian-weighted integral efficiently [12]. The quadrature rule may be extended to compute multidimensional integrals by successively applying it in a tensorproduct of one-dimensional integrals. Consider an -point per dimension quadrature rule that is exact for polynomials of points for functional degree up to . We set up a grid of evaluations and numerically compute an -dimensional integral while retaining the accuracy for polynomials of degree up to only. Hence, the computational complexity of the product quadrature rule increases exponentially with , and therefore , suffers from the curse of dimensionality. Typically for the product Gauss-Hermite quadrature rule is not a reasonable choice to approximate a recursive optimal Bayesian filter. B. Non-Product Rules To mitigate the curse of dimensionality issue in the product rules, we may seek non-product rules for integrals of arbitrary dimensions by choosing points directly from the domain of integration [18], [23]. Some of the well-known non-product rules include randomized Monte Carlo methods [4], quasi-Monte Carlo methods [24], [25], lattice rules [26] and sparse grids [27]–[29]. The randomized Monte Carlo methods evaluate the integration using a set of equally-weighted sample points drawn randomly, whereas in quasi-Monte Carlo methods and lattice rules the points are generated from a unit hyper-cube region using deterministically defined mechanisms. On the other hand, the sparse grids based on Smolyak formula in principle, combine a quadrature (univariate) routine for high-dimensional integrals more sophisticatedly; they detect important dimensions automatically and place more grid points there. Although the non-product methods mentioned here are powerful numerical integration tools to compute a given integral with a prescribed accuracy, they do suffer from the curse of dimensionality to certain extent [30].1258IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 6, JUNE 2009C. Proposed Method In the recursive Bayesian estimation paradigm, we are interested in non-product rules that i) yield reasonable accuracy, ii) require small number of function evaluations, and iii) are easily extendable to arbitrarily high dimensions. In this paper we derive an efficient non-product cubature rule for Gaussianweighted integrals. Specifically, we obtain a third-degree fullysymmetric cubature rule, whose complexity in terms of function evaluations increases linearly with the dimension . Typically, a set of cubature points and weights are chosen so that the cubature rule is exact for a set of monomials of degree or less, as shown by (23)Gaussian density. Specifically, we consider an integral of the form (24)defined in the Cartesian coordinate system. To compute the above integral numerically we take the following two steps: i) We transform it into a more familiar spherical-radial integration form ii) subsequently, we propose a third-degree spherical-radial rule. A. Transformation In the spherical-radial transformation, the key step is a change of variable from the Cartesian vector to a radius and with , so direction vector as follows: Let for . Then the integral (24) can be that rewritten in a spherical-radial coordinate system as (25) is the surface of the sphere defined by and is the spherical surface measure or the area element on . We may thus write the radial integral (26) is defined by the spherical integral with the unit where weighting function (27) The spherical and the radial integrals are numerically computed by the spherical cubature rule (Section IV-B below) and the Gaussian quadrature rule (Section IV-C below), respectively. Before proceeding further, we introduce a number of notations and definitions when constructing such rules as follows: • A cubature rule is said to be fully symmetric if the following two conditions hold: implies , where is any point obtainable 1) from by permutations and/or sign changes of the coordinates of . on the region . That is, all points in 2) the fully symmetric set yield the same weight value. For example, in the one-dimensional space, a point in the fully symmetric set implies that and . • In a fully symmetric region, we call a point as a generator , where if , . The new should not be confused with the control input . zero coordinates and use • For brevity, we suppress to represent a complete fully the notation symmetric set of points that can be obtained by permutating and changing the sign of the generator in all possible ways. Of course, the complete set entails where; are non-negative integers and . Here, an important quality criterion of a cubature rule is its degree; the higher the degree of the cubature rule is, the more accurate solution it yields. To find the unknowns of the cubature rule of degree , we solve a set of moment equations. However, solving the system of moment equations may be more tedious with increasing polynomial degree and/or dimension of the integration domain. For example, an -point cubature rule entails unknown parameters from its points and weights. In general, we may form a system of equations with respect to unknowns from distinct monomials of degree up to . For the nonlinear system to have at least one solution (in this case, the system is said to be consistent), we use at least as many unknowns as equations [31]. That is, we choose to be . Suppose we obtain a cu. In this case, we solve bature rule of degree three for nonlinear moment equations; the re) sulting rule may consist of more than 85 ( weighted cubature points. To reduce the size of the system of algebraically independent equations or equivalently the number of cubature points markedly, Sobolev proposed the invariant theory in 1962 [32] (see also [31] and the references therein for a recent account of the invariant theory). The invariant theory, in principle, discusses how to restrict the structure of a cubature rule by exploiting symmetries of the region of integration and the weighting function. For example, integration regions such as the unit hypercube, the unit hypersphere, and the unit simplex exhibit symmetry. Hence, it is reasonable to look for cubature rules sharing the same symmetry. For the case considered above and ), using the invariant theory, we may con( cubature points struct a cubature rule consisting of by solving only a pair of moment equations (see Section IV). Note that the points and weights of the cubature rule are in. Hence, they can be computed dependent of the integrand off-line and stored in advance to speed up the filter execution. where IV. CUBATURE KALMAN FILTER As described in Section II, nonlinear filtering in the Gaussian domain reduces to a problem of how to compute integrals, whose integrands are all of the form nonlinear functionARASARATNAM AND HAYKIN: CUBATURE KALMAN FILTERS1259points when are all distinct. For example, represents the following set of points:Here, the generator is • We use . set B. Spherical Cubature Rule. to denote the -th point from theWe first postulate a third-degree spherical cubature rule that takes the simplest structure due to the invariant theory (28) The point set due to is invariant under permutations and sign changes. For the above choice of the rule (28), the monomials with being an odd integer, are integrated exactly. In order that this rule is exact for all monomials of degree up to three, it remains to require that the rule is exact , 2. Equivalently, to for all monomials for which find the unknown parameters and , it suffices to consider , and due to the fully symmonomials metric cubature rule (29) (30) where the surface area of the unit sphere with . Solving (29) and (30) , and . Hence, the cubature points are yields located at the intersection of the unit sphere and its axes. C. Radial Rule We next propose a Gaussian quadrature for the radial integration. The Gaussian quadrature is known to be the most efficient numerical method to compute a one-dimensional integration [21], [22]. An -point Gaussian quadrature is exact and constructed as up to polynomials of degree follows: (31) where is a known weighting function and non-negative on ; the points and the associated weights the interval are unknowns to be determined uniquely. In our case, a comparison of (26) and (31) yields the weighting function and and , respecthe interval to be tively. To transform this integral into an integral for which the solution is familiar, we make another change of variable via yielding. The integral on the right-hand side of where (32) is now in the form of the well-known generalized GaussLaguerre formula. The points and weights for the generalized Gauss-Laguerre quadrature are readily obtained as discussed elsewhere [21]. A first-degree Gauss-Laguerre rule is exact for . Equivalently, the rule is exact for ; it . is not exact for odd degree polynomials such as Fortunately, when the radial-rule is combined with the spherical rule to compute the integral (24), the (combined) spherical-radial rule vanishes for all odd-degree polynomials; the reason is that the spherical rule vanishes by symmetry for any odd-degree polynomial (see (25)). Hence, the spherical-radial rule for (24) is exact for all odd degrees. Following this argument, for a spherical-radial rule to be exact for all third-degree polyno, it suffices to consider the first-degree genermials in alized Gauss-Laguerre rule entailing a single point and weight. We may thus write (33) where the point is chosen to be the square-root of the root of the first-order generalized Laguerre polynomial, which is orthogonal with respect to the modified weighting function ; subsequently, we find by solving the zeroth-order moment equation appropriately. In this case, we , and . A detailed account have of computing the points and weights of a Gaussian quadrature with the classical and nonclassical weighting function is presented in [33]. D. Spherical-Radial Rule In this subsection, we describe two useful results that are used to i) combine the spherical and radial rule obtained separately, and ii) extend the spherical-radial rule for a Gaussian weighted integral. The respective results are presented as two propositions: Proposition 4.1: Let the radial integral be computed numer-point Gaussian quadrature rule ically by theLet the spherical integral be computed numerically by the -point spherical ruleThen, an by-point spherical-radial cubature rule is given(34) Proof: Because cubature rules are devised to be exact for a subspace of monomials of some degree, we consider an integrand of the form(32)1260IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 6, JUNE 2009where are some positive integers. Hence, we write the integral of interestwhereFor the moment, we assume the above integrand to be a mono. Making the mial of degree exactly; that is, change of variable as described in Section IV-A, we getWe use the cubature-point set to numerically compute integrals (10), (11), and (13)–(16) and obtain the CKF algorithm, details of which are presented in Appendix A. Note that the above cubature-point set is now defined in the Cartesian coordinate system. V. IS THERE A NEED FOR HIGHER-DEGREE CUBATURE RULES? In this section, we emphasize the importance of third-degree cubature rules over higher-degree rules (degree more than three), when they are embedded into the cubature Kalman filtering framework for the following reasons: • Sufficient approximation. The CKF recursively propagates the first two-order moments, namely, the mean and covariance of the state variable. A third-degree cubature rule is also constructed using up to the second-order moment. Moreover, a natural assumption for a nonlinearly transformed variable to be closed in the Gaussian domain is that the nonlinear function involved is reasonably smooth. In this case, it may be reasonable to assume that the given nonlinear function can be well-approximated by a quadratic function near the prior mean. Because the third-degree rule is exact up to third-degree polynomials, it computes the posterior mean accurately in this case. However, it computes the error covariance approximately; for the covariance estimate to be more accurate, a cubature rule is required to be exact at least up to a fourth degree polynomial. Nevertheless, a higher-degree rule will translate to higher accuracy only if the integrand is well-behaved in the sense of being approximated by a higher-degree polynomial, and the weighting function is known to be a Gaussian density exactly. In practice, these two requirements are hardly met. However, considering in the cubature Kalman filtering framework, our experience with higher-degree rules has indicated that they yield no improvement or make the performance worse. • Efficient and robust computation. The theoretical lower bound for the number of cubature points of a third-degree centrally symmetric cubature rule is given by twice the dimension of an integration region [34]. Hence, the proposed spherical-radial cubature rule is considered to be the most efficient third-degree cubature rule. Because the number of points or function evaluations in the proposed cubature rules scales linearly with the dimension, it may be considered as a practical step for easing the curse of dimensionality. According to [35] and Section 1.5 in [18], a ‘good’ cubature rule has the following two properties: (i) all the cubature points lie inside the region of integration, and (ii) all the cubature weights are positive. The proposed rule equal, positive weights for an -dimensional entails unbounded region and hence belongs to a good cubature family. Of course, we hardly find higher-degree cubature rules belonging to a good cubature family especially for high-dimensional integrations.Decomposing the above integration into the radial and spherical integrals yieldsApplying the numerical rules appropriately, we haveas desired. As we may extend the above results for monomials of degree less than , the proposition holds for any arbitrary integrand that can be written as a linear combination of monomials of degree up to (see also [18, Section 2.8]). Proposition 4.2: Let the weighting functions and be and . such that , we Then for every square matrix have (35) Proof: Consider the left-hand side of (35). Because a positive definite matrix, we factorize to be , we get Making a change of variable via is .which proves the proposition. For the third-degree spherical-radial rule, and . Hence, it entails a total of cubature points. Using the above propositions, we extend this third-degree spherical-radial rule to compute a standard Gaussian weighted integral as follows:ARASARATNAM AND HAYKIN: CUBATURE KALMAN FILTERS1261In the final analysis, the use of higher-degree cubature rules in the design of the CKF may marginally improve its performance at the expense of a reduced numerical stability and an increased computational cost. VI. SQUARE-ROOT CUBATURE KALMAN FILTER This section addresses i) the rationale for why we need a square-root extension of the standard CKF and ii) how the square-root solution can be developed systematically. The two basic properties of an error covariance matrix are i) symmetry and ii) positive definiteness. It is important that we preserve these two properties in each update cycle. The reason is that the use of a forced symmetry on the solution of the matrix Ricatti equation improves the numerical stability of the Kalman filter [36], whereas the underlying meaning of the covariance is embedded in the positive definiteness. In practice, due to errors introduced by arithmetic operations performed on finite word-length digital computers, these two properties are often lost. Specifically, the loss of the positive definiteness may probably be more hazardous as it stops the CKF to run continuously. In each update cycle of the CKF, we mention the following numerically sensitive operations that may catalyze to destroy the properties of the covariance: • Matrix square-rooting [see (38) and (43)]. • Matrix inversion [see (49)]. • Matrix squared-form amplifying roundoff errors [see (42), (47) and (48)]. • Substraction of the two positive definite matrices present in the covariant update [see (51)]. Moreover, some nonlinear filtering problems may be numerically ill-conditioned. For example, the covariance is likely to turn out to be non-positive definite when i) very accurate measurements are processed, or ii) a linear combination of state vector components is known with greater accuracy while other combinations are essentially unobservable [37]. As a systematic solution to mitigate ill effects that may eventually lead to an unstable or even divergent behavior, the logical procedure is to go for a square-root version of the CKF, hereafter called square-root cubature Kalman filter (SCKF). The SCKF essentially propagates square-root factors of the predictive and posterior error covariances. Hence, we avoid matrix square-rooting operations. In addition, the SCKF offers the following benefits [38]: • Preservation of symmetry and positive (semi)definiteness of the covariance. Improved numerical accuracy owing to the fact that , where the symbol denotes the condition number. • Doubled-order precision. To develop the SCKF, we use (i) the least-squares method for the Kalman gain and (ii) matrix triangular factorizations or triangularizations (e.g., the QR decomposition) for covariance updates. The least-squares method avoids to compute a matrix inversion explicitly, whereas the triangularization essentially computes a triangular square-root factor of the covariance without square-rooting a squared-matrix form of the covariance. Appendix B presents the SCKF algorithm, where all of the steps can be deduced directly from the CKF except for the update of the posterior error covariance; hence we derive it in a squared-equivalent form of the covariance in the appendix.The computational complexity of the SCKF in terms of flops, grows as the cube of the state dimension, hence it is comparable to that of the CKF or the EKF. We may reduce the complexity significantly by (i) manipulating sparsity of the square-root covariance carefully and (ii) coding triangularization algorithms for distributed processor-memory architectures. VII. A COMPARISON OF UKF WITH CKF Similarly to the CKF, the unscented Kalman filter (UKF) is another approximate Bayesian filter built in the Gaussian domain, but uses a completely different set of deterministic weighted points [10], [39]. To elaborate the approach taken in the UKF, consider an -dimensional random variable having with mean and covariance a symmetric prior density , within which the Gaussian is a special case. Then a set of sample points and weights, are chosen to satisfy the following moment-matching conditions:Among many candidate sets, one symmetrically distributed sample point set, hereafter called the sigma-point set, is picked up as follows:where and the -th column of a matrix is denoted ; the parameter is used to scale the spread of sigma points by from the prior mean , hence the name “scaling parameter”. Due to its symmetry, the sigma-point set matches the skewness. Moreover, to capture the kurtosis of the prior density closely, it is sug(Appendix I of [10], gested that we choose to be [39]). This choice preserves moments up to the fifth order exactly in the simple one-dimensional Gaussian case. In summary, the sigma-point set is chosen to capture a number as correctly as of low-order moments of the prior density possible. Then the unscented transformation is introduced as a method that are related to of computing posterior statistics of by a nonlinear transformation . It approximates the mean and the covariance of by a weighted sum of projected space, as shown by sigma points in the(36)(37)。
Finding State Solutions to Temporal Logic Queries. www.cs.toronto.edumgstateqc.ps
Finding State Solutions to Temporal Logic QueriesMihaela Gheorghiu,Arie Gurfinkel,and Marsha ChechikDepartment of Computer Science,University of Toronto,Toronto,ON M5S3G4,Canada.Email:mg,arie,chechik@Abstract.Different analysis problems for state-transition models can be uni-formly treated as instances of temporal logic query-checking,where only statesare sought as solutions to the queries.In this paper,we propose a symbolic query-checking algorithm thatfinds exactly the state solutions to any query.We showthat our approach generalizes previous ad-hoc techniques,and this generality al-lows us tofind new and interesting applications,such asfinding stable states.Ouralgorithm is linear in the size of the state space and in the cost of model checking,and has been implemented on top of the model checker NuSMV,using the latteras a black box.We show the effectiveness of our approach by comparing it,on agene network example,to the naive algorithm in which all possible state solutionsare checked separately.1IntroductionIn the analysis of state-transition models,many problems reduce to questions of the type:“What are all the states that satisfy a property?”.Symbolic model checking can answer some of these questions,provided that the property can be formulated in an appropriate temporal logic.For example,suppose the erroneous states of a program are characterized by the program counter()being at a line labeled.Then the states that may lead to error can be discovered by model checking the property ,formalized in the branching temporal logic CTL[10].There are many interesting questions which are not readily expressed in temporal logic and require specialized algorithms.One example isfinding the reachable states, which is often needed in a pre-analysis step to restrict further analysis only to those states.These states are typically found by computing a forward transitive closure of the transition relation[8].Another example is the computation of“procedure summaries”.A procedure summary is a relation between states,representing the input/output behav-ior of a procedure.The summary answers the question of which inputs lead to which outputs as a result of executing the procedure.They are computed in the form of“sum-mary edges”in the control-flow graphs of programs[21,2].Yet another example is the algorithm forfinding dominators/postdominators in program analysis,proposed in[1].A state is a postdominator of a state if all paths from eventually reach,and is a dominator of if all paths to pass through.Although these problems are similar,their solutions are quite different.Unifying them into a common framework allows reuse of specific techniques proposed for each problem,and opens a way for creating efficient implementations to other problems ofa similar kind.We see all these problems as instances of model exploration,where properties of a model are discovered,rather than checked.A common framework for model exploration has been proposed under the name of query checking[5].Query checkingfinds which formulas hold in a model.For instance,a query is intended tofind all propositional formulas that hold in the reachable states.In general,a CTL query is a CTL formula with a missing propositional subformula,designated by a placeholder(“”).A solution to the query is any propositional formula that,when sub-stituted for the placeholder,makes a CTL formula that holds in the model.The general query checking problem is:given a CTL query on a model,find all of its propositional solutions.For example,consider the model in Figure1(a),where each state is labeled by the atomic propositions that hold in it.Here,some solutions to are, representing the reachable state,and,representing the set of states. On the other hand,is not a solution:does not hold,since no states whereis false are reachable.Query checking can be solved by repeatedly substituting each possible propositional formula for the placeholder,and returning those for which the resulting CTL formula holds.In the worst case,this approach is exponential in the size of the state space and linear in the cost of CTL model checking.Each of the analysis questions described above can be formulated as a query.Reach-able states are solutions to.Procedure summaries can be obtained by solvingholds in the return statement of the procedure.Dominators/postdominators are solutions to the query(i.e.,what propositional formulas eventually hold on all paths).This insight gives us a uniform formulation of these problems and allows for easy creation of solutions to other,sim-ilar,problems.For example,a problem reported in genetics research[4,12]called for finding stable states of a model,that are those states which,once reached,are never left by the system.This is easily formulated as,meaning“what are the reachable states in which the system will remain forever?”.These analysis problems further require that solutions to their queries be states of the model.For example,a query on the model in Figure1(a)has solutionsand.Thefirst corresponds to the state and is a state solution.The second cor-responds to a set of states but neither nor is a solution by itself.When only state solutions are needed,we can formulate a restricted state query-checking prob-lem by constraining the solutions to be single states,rather than arbitrary propositional formulas(that represent sets of states).A naive state query checking algorithm is to repeatedly substitute each state of the model for the placeholder,and return those for which the resulting CTL formula holds.This approach is linear in the size of the state space and in the cost of CTL model checking.While of significantly more efficient than general query checking,this approach is not“fully”symbolic,since it requires many runs of a model-checker.While several approaches have been proposed to solve general query checking,none are effective for solving the state query-checking problem.The original algorithm of Chan[5]was very efficient(same cost as CTL model checking),but was restricted to valid queries,i.e.,queries whose solutions can be characterized by a single propo-sitional formula.This is too restrictive for our purposes.For example,neither of the queries,,nor the stable states query are valid.Bruns and Gode-2froid[3]generalized query checking to all CTL queries by proposing an automata-basedCTL model checking algorithm over a lattice of sets of all possible solutions.This al-gorithm is exponential in the size of the state space.Gurfinkel and Chechik[15]havealso provided a symbolic algorithm for general query checking.The algorithm is basedon reducing query checking to multi-valued model checking and is implemented in atool TLQSolver[7].While empirically faster than the corresponding naive approach of substituting every propositional formula for the placeholder,this algorithm still has the same worst-case complexity as that in[3],and remains applicable only to modest-sized query-checking problems.An algorithm proposed by Hornus and Schnoebelen[17]finds solutions to any query,one by one,with increasing complexity:afirst solution is found in time linear in the size of the state space,a second,in quadratic time,and so on. However,since the search for solutions is not controlled by their shape,finding all state solutions can still take exponential time.Other query-checking work is not directly ap-plicable to our state query-checking problem,as it is exclusively concerned either with syntactic characterizations of queries,or with extensions,rather than restrictions,of query checking[23,25].In this paper,we provide a symbolic algorithm for solving the state query-checking problem,and describe an implementation using the state-of-the-art model-checker NuSMV[8]. The algorithm is formulated as model checking over a lattice of sets of states,but its implementation is done by modifying only the interface of NuSMV.Manipulation ofthe lattice sets is done directly by NuSMV.While the running time of this approach isthe same as in the corresponding naive approach,we show empirical evidence that our implementation can perform better than the naive,using a case study from genetics[12].The algorithms proposed for the program analysis problems described above are special cases of ours,that solve only and queries,whereas our algorithm solves any CTL query.We prove our algorithm correct by showing that it approximates general query checking,in the sense that it computes exactly those solutions,amongall given by general query checking,that are states.We also generalize our results toan approximation framework that can potentially apply to other extensions of model checking,e.g.,vacuity detection,and point to further applications of our technique,e.g.,to querying XML documents.There is a also a very close connection between query-checking and sanity checkssuch as vacuity and coverage[19].Both problems require checking several“mutants”ofthe property to obtain thefinal solution.In fact,the algorithm for solving state-queries presented in this paper bears many similarities to the coverage algorithms describedin[19].Since query-checking is a more general approach,we believe it can provide a uniform framework for studying all these problems.The rest of the paper is organized as follows.Section2provides the model checking background.Section3describes the general query-checking algorithm.We formallydefine the state query-checking problem and describe our implementation in Section4. Section5presents the general approximation technique for model checking over latticesof sets.We present our case study in Section6,and conclude in Section7.3(a)(b),for true false,for,forFig.1.(a)A simple Kripke structure;(b)CTL semantics.2BackgroundIn this section,we review some notions of lattice theory,minterms,CTL model check-ing,and multi-valued model checking.Lattice theory.Afinite lattice is a pair(,),where is afinite set and is a partial order on,such that everyfinite subset has a least upper bound(called join and written)and a greatest lower bound(called meet and written).Since the lattice isfinite,there exist and,that are the maximum and respectively minimum elements in the lattice.When the ordering is clear from the context,we simply refer to the lattice as.A lattice if distributive if meet and join distribute over each other.In this paper,we work with lattices of propositional formulas. For a set of atomic propositions,let be the set of propositional formulas over .For example,true false.This set forms afinite lattice ordered by implication(see Figure2(a)).Since true,is under true in this lattice.Meet and join in this lattice correspond to logical operators and,respectively.A subset is called upward closed or an upset,if for any,if and,then.In that case,can be identified by the set of its minimal elements(is minimal if),and we write.For example,for the lattice shown in Figure2(a),true. The set is not an upset,whereas true is.For singletons,we write for.We write for the set of all upsets of,i.e.,iff.is closed under union and intersection,and therefore forms a lattice ordered by set inclusion.We call the upset lattice of.The upset lattice of is shown in Figure2(b).An element in a lattice is join-irreducible if and cannot be decomposed as the join of other lattice elements,i.e.,for any and in,impliesor[11].For example,the join-irreducible elements of the lattice in Figure2(a) are and,and of the one in Figure2(b)—true,,,and false.4false(a)(b)(c)ttices for:(a);(b);(c). Minterms.In the lattice of propositional formulas,a join-irreducible element is aconjunction in which every atomic proposition of appears,positive or negated.Such conjunctions are called minterms and we denote their set by.For example,CTL Model Checking.CTL model checking is an automatic technique for verifying temporal properties of systems expressed in a propositional branching-time temporal logic called Computation Tree Logic(CTL)[9].A system model is a Kripke structure ,where is a set of states,is a(left-total)transition relation,is the initial state,is a set of atomic propositions,andis a labeling function,providing the set of atomic propositions that are true in each state.CTL formulas are evaluated in the states of.Their semantics can be described in terms of infinite execution paths of the model.For instance,a formula holds in a state if holds in every state,on every infinite execution path start-ing at;()holds in if holds in some state,on every(some)infi-nite execution path.The formal semantics of CTL is given in Figure1(b). Without loss of generality we consider only CTL formulas in negation normal form, where negation is applied only to atomic propositions[9].In Figure1(b),the function true false indicates the result of checking a formula in state;the set of successors for a state is;and are least and greatestfixpoints of,respectively,where false andtrue.Other temporal operators are derived from the given ones, for example:true,true.The operators in pairsare duals of each other.A formula holds in a Kripke structure,written,if it holds in the initial state,i.e.,true.For example,on the model in Figure1(a),where ,properties and are true,whereas is not.The complexity of model-checking a CTL formula on a Kripke structure is, where.Multi-valued model checking.Multi-valued CTL model checking[6]is a general-ization of model checking from a classical logic to an arbitrary De Morgan algebra ,where is afinite distributive lattice and is any operation that is an involution()and satisfies De Morgan laws.Conjunction and disjunction are the meet and join operations of,respectively.When the ordering and the negation5operation of an algebra are clear from the context,we refer to it as.In this paper,we only use a version of multi-valued model checking where the model remains classical,i.e.,both the transition relation and the atomic propositions are two-valued, but properties are specified in a multi-valued extension of CTL over a given De Morgan algebra,called CTL().The logic CTL()has the same syntax as CTL,except that the allowed constants are all.Boolean values true and false are replaced by the and of,respectively.The semantics of CTL()is the same as of CTL, except is extended to and the interpretation of constants is:for all ,.The other operations are defined as their CTL counterparts(see Fig-ure1(b)),where and are interpreted as lattice operators and,respectively.The complexity of model checking a CTL()formula on a Kripke structure is still ,provided that meet,join,and quantification can be computed in constant time[6],which depends on the lattice.3Query CheckingIn this section,we review the query-checking problem and a symbolic method for solv-ing it.Background.Let be a Kripke structure with a set of atomic propositions.A CTL query,denoted by,is a CTL formula containing a placeholder“”for a proposi-tional subformula(over the atomic propositions in).The CTL formula obtained by substituting the placeholder in by a formula is denoted by.A for-mula is a solution to a query if its substitution into the query results in a CTL formula that holds on,i.e.,if.For example,and are among the solutions to the query on the model of Figure1(a),whereas is not.In this paper,we consider queries in negation normal form where negation is ap-plied only to the atomic propositions,or to the placeholder.We further restrict our attention to queries with a single placeholder,although perhaps with multiple occur-rences.For a query,a substitution means that all occurrences of the place-holder are replaced by.For example,if,then.We assume that occurrences of the placeholder are ei-ther non-negated everywhere,or negated everywhere,i.e.,the query is either positive or negative,respectively.Here,we limit our presentation to positive queries;see Section5 for the treatment of negative queries.The general CTL query-checking problem is:given a CTL query on a model,find all its propositional solutions.For instance,the answer to the query on the model in Figure1(a)is the set consisting of,and every other formula implied by these,including,,and true.If is a solution to a query,then any such that(i.e.,any weaker)is also a solution,due to the monotonicity of positive queries[5].Thus,the set of all possible solutions is an upset;it is sufficient for the query-checker to output the strongest solutions,since the rest can be inferred from them.One can restrict a query to a subset[3].We then denote the query by, and its solutions become formulas in.For instance,checking on the model of Figure1(a)should result in and as the strongest solutions,together with all those implied by them.We write for.6If consists of atomic propositions,there are possible distinct solutions to .A“naive”method forfinding all solutions would model check for every possible propositional formula over,and collect all those’s for which holds in the model.The complexity of this naive approach is times that of usual model-checking.Symbolic Algorithm.A symbolic algorithm for solving the general query-checking problem was described in[15]and has been implemented in the TLQSolver tool[7]. We review this approach below.Since an answer to is an upset,the upset lattice is the space of all possible answers[3].For instance,the lattice for is shown in Figure2(b).In the model in Figure1(a),the answer to this query is true,encoded as,since is the strongest solution.Symbolic query checking is implemented by model checking over the upset lattice. The algorithm is based on a state semantics of the placeholder.Suppose query is evaluated in a state.Either holds in,in which case the answer to the query should be,or holds,in which case the answer is.Thus we have:if,if.This case analysis can be logically encoded by the formula.Let us now consider a general query in a state(where ranges over a set of atomic propositions).We note that the case analysis corresponding to the one above can be given in terms of minterms.Minterms are the strongest formulas that may hold in a state;they also are mutually exclusive and complete—exactly one minterm holds in any state,and then is the answer to at.This semantics is encoded in the following translation of the placeholder:The symbolic algorithm is defined as follows:given a query,first obtain ,which is a CTL formula(over the lattice),and then model check this formula.The semantics of the formula is given by a function from to, as described in Section 2.Thus model checking this formula results in a value from .That value was shown in[15]to represent all propositional solutions to .For example,the query on the model of Figure1(a)becomesThe result of model-checking this formula is.The complexity of this algorithm is the same as in the naive approach.In practice, however,TLQSolver was shown to perform better than the naive algorithm[15,7].74State Solutions to QueriesLet be a Kripke structure with a set of atomic propositions.In general query check-ing,solutions to queries are arbitrary propositional formulas.On the other hand,in state query checking,solutions are restricted to be single states.To represent a single state,a propositional formula needs to be a minterm over.In symbolic model checking,any state of is uniquely represented by the minterm that holds in.For example,in themodel of Figure1(a),state is represented by,by,etc.Thus,for state query checking,an answer to a query is a set of minterms,rather than an upset of propositional formulas.For instance,for the query,on the model of Figure1(a), the state query-checking answer is,whereas the general query checking one is.While it is still true that if is a solution,everything in is also a solution,we no longer view answers as upsets,since we are interested only in minterms,and is the only minterm in the set(minterms are incomparable by implication).We can thus formulate state query checking as minterm query checking: given a CTL query on a model,find all its minterm solutions.We show how to solve this for any query,and any subset.When,the minterms obtained are the state solutions.Given a query,a naive algorithm would model check for every minterm .If is the number of atomic propositions in,there are possible minterms, and this algorithm has complexity times that of model-checking.Minterm query checking is thus much easier to solve than general query checking.Of course,any algorithm solving general query checking,such as the symbolic approach described in Section3,solves minterm query checking as well:from all solu-tions,we can extract only those which are minterms.This approach,however,is much more expensive than needed.Below,we propose a method that is tailored to solve just minterm query checking,while remaining symbolic.4.1Solving minterm query checkingSince an answer to minterm query checking is a set of minterms,the space of all answers is the powerset that forms a lattice ordered by set inclusion.For example,the lattice is shown in Figure2(c).Our symbolic algorithm evaluates queries over this lattice.Wefirst adjust the semantics of the placeholder to minterms.Suppose we evaluate in a state.Either holds in,and then the answer should be,or holds,and then the answer is.Thus,we haveif,if.This is encoded by the formula.In general,for a query ,exactly one minterm holds in,and in that case is the answer to the query. This gives the following translation of placeholder:8Our minterm query-checking algorithm is now defined as follows:given a query on a model,compute,and then model check this over.For example,for,on the model of Figure1(a),we model checkand obtain the answer,that is indeed the only minterm solution for this model.To prove our algorithm correct,we need to show that its answer is the set of all minterm solutions.We prove this claim by relating our algorithm to the general al-gorithm in Section3.We show that,while the general algorithm computes the set of all solutions,ours results in the subset that consists of only the minterms from.Wefirst establish an“approximation”mapping fromto that,for any upset,returns the subset of minterms. Definition1(Minterm approximation).Let be a set of atomic propositions.Minterm approximation is,for any .With this definition,is obtained from by replacing with .The minterm approximation preserves set operations;this can be proven using the fact that any set of propositional formulas can be partitioned into minterms and non-minterms.Proposition1.The minterm approximation is a lattice ho-momorphism,i.e.,it preserves the set operations:for any,and.By Proposition1,and since model checking is performed using only set operations, we can show that the approximation preserves model-checking results.Model check-ing is the minterm approximation of checking.In other words, our algorithm results in set of all minterm solutions,which concludes the correctness argument.Theorem1(Correctness of minterm approximation).For any state of,In summary,for,we have the following correct symbolic state query-checking algorithm:given a query on a model,translate it to,and then model check this over.The worst-case complexity of our algorithm is the same as that of the naive ap-proach.With an efficient encoding of the approximate lattice,however,our approach can outperform the naive one in practice,as we show in Section6.94.2ImplementationAlthough our minterm query-checking algorithm is defined as model checking over a lattice,we can implement it using a classical symbolic model checker.This in done by encoding the lattice elements in such that lattice operations are already imple-mented by a symbolic model checker.The key observation is that the latticeis isomorphic to the lattice of propositional formulas.This can be seen, for instance,by comparing the lattices in Figures2(a)and2(c).Thus,the elements of can be encoded as propositional formulas,and the operations become proposi-tional disjunction and conjunction.A symbolic model checker,such as NuSMV[8], which we used in our implementation,already has data structures for representing propositional formulas and algorithms to compute their disjunction and conjunction —BDDs[24].The only modifications we made to NuSMV were parsing the input and reporting the result.While parsing the queries,we implemented the translation defined in Sec-tion4.1.In this translation,for every minterm,we give a propositional encoding to.We cannot simply use to encode.The lattice elements need to be con-stants with respect to the model,and is not a constant—it is a propositional formula that contains model variables.We can,however,obtain an encoding for,by renam-ing to a similar propositional formula over fresh variables.For instance,we encode as.Thus,our query translation results in a CTL formula with double the number of propositional variables compared to the model.For example,the translation of isWe input this formula into NuSMV,and obtain the set of minterm solutions as a propo-sitional formula over the encoding variables.For,on the model in Figure1(a),we obtain the result,corresponding to the only minterm solution .4.3Exactness of minterm approximationIn this section,we address the applicability of minterm query checking to general query checking.When the minterm solutions are the strongest solutions to a query,minterm query checking solves the general query-checking problem as well,as all solutions to that query can be inferred from the minterms.In that case,we say that the minterm approximation is exact.We would like to identify those CTL queries that admit exact minterm approximations,independently of the model.The following can be proven using the fact that any propositional formula is a disjunction of minterms. Proposition2.A positive query has an exact minterm approximation in any model iff is distributive over disjunction,i.e.,.10An example of a query that admits an exact approximation is;its strongest solu-tions are always minterms,representing the reachable states.In[5],Chan showed that deciding whether a query is distributive over conjunction is EXPTIME-complete.We obtain a similar result by duality.Theorem2.Deciding whether a CTL query is distributive over disjunction is EXPTIME-complete.Since the decision problem is hard,it would be useful to have a grammar that is guaran-teed to generate queries which distribute over disjunction.Chan defined a grammar for queries distributive over conjunction,that was later corrected by Samer and Veith[22]. We can obtain a grammar for queries distributive over disjunction,from the grammar in[22],by duality.5ApproximationsThe efficiency of model checking over a lattice is determined by the size of the lattice. In the case of query checking,by restricting the problem and approximating answers, we have obtained a more manageable lattice.In this section,we show that our minterm approximation is an instance of a more general approximation framework for reasoning over any lattice of sets.Having a more general framework makes it easier to accom-modate other approximations that may be needed in query checking.For example,we use it to derive an approximation to negative queries.This framework may also apply to other analysis problems that involve model checking over lattices of sets,such as vacuity detection[14].Wefirst define general approximations that map larger lattices into smaller ones. Let be anyfinite set.Its powerset lattice is.Let be any sublattice of the powerset lattice,i.e.,.Definition2(Approximation).A function is an approximation if:1.it satisfies for any(i.e.,is an under-approximation of),and2.it is a lattice homomorphism,i.e.,it respects the lattice operations:,and.From the definition of,the image of through is a sublattice of,having and as its maximum and minimum elements,respectively.We consider an approximation to be correct if it is preserved by model checking: reasoning over the smaller lattice is the approximation of reasoning over the larger one.Let be a CTL()formula.We define its translation into to be the CTL(formula obtained from by replacing any constant occurring in by.The following theorem simply states that the result of model checking is the approximation of the result of model checking.Its proof follows by structural induction from the semantics of CTL,and uses the fact that approximations are homomorphisms.[18]proves a similar result,albeit in a somewhat different context.11。
tolerance design
1 Introduction Tolerances have tremendous influence on both manufacturing cost and quality loss of a mechanical product. A dimension with loose tolerance will cost less for its manufacturing, but its possible variation range will be great and will result in unacceptable loss of quality and high scrap rate, leading to customer dissatisfaction. On the contrary, a dimension with relatively tight tolerance will cost more for its manufacturing, but its possible variation range will be small and lead to lower quality loss [1–3]. It is obvious that the tolerance has become the bridge to balance manufacturing cost and quality loss of the product; an optimal assignment of dimensional tolerances usually is the result of a trade-off between these two factors. Taguchi defined product quality as a loss that a product imparts to society from the time the product is shipped [4]. He further proposed several kinds of quality loss functions to assess the loss caused by the quality value deviating from its design target. In order to minimize the total assembly cost, which is the sum of manufacturing cost and quality loss, Cheng and Maghsoodloo [5] determined the optimal tolerances for individual components by incorporating Taguchi’s quality loss function and the cost-tolerance function in an assembly process. Wu and Tang [6] presented a design method for allocating dimensional tolerances of products with asymmetric quality losses, the design strategies proposed include the adjustment of manufacturing target and the approach of robust tolerance design based on the balance between manufacturing cost and quality loss expected. Jeang [2] developed an analysis model that includes manufacturing cost and quality loss simultaneously to determine the optimal values of design tolerance, process mean, and process tolerance. Jeang et al. [3] extended this work for manufacturing process planning with asymmetrical loss function. Based on the sum of tolerance cost, asymmetrical quality loss, and failure cost a
Sooyong Nam
Tolerance analysis using Zemax, the case for the small optics.OPTI521, Sooyong Nam AbstractA systematic procedure to perform tolerance analysis of the small optics has presented. To implement thisof this procedure, a typical cell phone camera lens has been chosen to evaluate performance degradation ofthe optical system. This outline could provide a starting point to understand small optics tolerancing and the instructions to use Zemax as a design and tolerancing tool.1. IntroductionA few obvious trends in current consumer electronic devices are small and compact size but performinghigh with very reasonable price. Optics inside those products are not the exception. It is getting thinner buthas to be performing much better than before with higher pixel resolution. As a result of those tight requirements, the tolerances are very tight and challenging. In this article, a systematic procedure for the tolerance analysis by Zemax has presented. The details about the design of three plastic lens elements arenot covered in this article.2. Procedure to perform tolerance analysisThe basic outline of the optical tolerancing procedure has presented in Fig. 1). In this analysis, as small cellphone camera lenses don’t have complicated mechanical focusing features, only those blue-boxed tasks are completed. Pink boxed tasks are done but will not be discussed in this article.(1)Table 1) is specification of required lens. Its F# is 2.8, pixel size is 5.6μm, and required resolution at centeris 125lp/mm. This is typical specification of fixed focus VGA cell phone camera lens. As a merit functionof the sensitivity analysis, 20% MTF drop at 45 lp/mm frequency is used.Table 1) Design specificationSpecification item Required performanceField of view 60 degreesF# 2.8 Number of lens elements 3 plastic lensesPixel size 5.6 μmImage sensor size 3.58mm x 2.69mm@center Resolution 125lp/mm3. Optical layout for cell phone cameraFig. 2) is optical layout of cell phone camera. It has three aspheric plastic lenses, both sides are aspheres and there are two aperture stops between surface 3 and 6. As shown in Fig. 3), the materials are Zeonex E48R and Polycarbonate. Their indices are around 1.45~1.5. It is very clear and stable compare to conventional optical plastic. The total length from first surface to image plane is about 5.3mm and the lens diameters are from 2.6mm to 4.6mm.Fig. 2) Optical layout with 3 plastic aspheric lensesFig. 3) Lens data in Zemax4. Optomechanical layoutFig. 4) is optomechanical layout. Due to dimensional restriction and cost reduction requirement, these are manufactured from plastic injection molding process. All lenses have their own individual mount so that the flange surfaces are stacked together with respect to the bore surface of the barrel. A plastic injection molded aperture stop is located between lens1 and lens2. Lens3 will be located after lens2. Retainer will be glued to the barrel to maintain lens assembly and survived from the shock and keep all required tolerance remained.Fig. 4) Optomechanical layout2 36978Fig. 5) is the part of the drawing that has all the necessary tolerances are specified. At the top left corner, surface power and irregularity tolerances are specified and the note below is the index of refraction. Aspheric equation and their coefficients have described in the box at the top right corner. On the lens drawing, center thickness, element tilt, element decenter, and element spacing tolerances are specified. All these specific numbers are come from tolerance analysis.Fig. 5) Error analysis from the lens element5. Sensitivity table and error budgetFig. 6 and 7 are input data to execute tolerance analysis from Zemax. Perturbed amounts are 1 fringe for radius of each surface(power), 0.01mm for thickness, 0.05 degree for surface tilt, 0.001 for index change, 0.05 degree for element tilt, and 0.010mm for element decenter. Zemax allows us to tolerate surface irregularity by assigning tolerance operand TEZI which is simulating surface by adjusting Zernike coefficient. 0.00005mm is used as RMS surface irregularity.Fig. 6) Tolerance input to the Zemax Fig.7) Tolerance analysis routine in ZemaxAfter completed to define tolerances, the Zemax will generate tolerance table in Fig. 8). Then we can run tolerance analysis routine. From the command box shown in Fig. 7), we can calculate sensitivity to the given merit function.As a merit function to evaluate performance degradation, percentile MTF drop at Ny/2 has selected. Because resolution and contrast is one of the main category to evaluate image quality of the camera. Since this system uses 5.6μm pixel for the image sensor, the Nyquist frequency is 89lp/mm and Ny/2 is 45lp/mm.Fig. 8) Tolerance data editor in ZemaxTable 2 is the result sensitivity table and error budget. Presented percentile MTF drop is from the 0 degree field at 0.632μm wave. Optical schematic has been attached on top the table to identify effects easily. From the analysis result, we can have following results:- Lens 1 and following space : The dominant variables are index change and power tolerance for the surface 2 and surface 3 are second dominant. Thickness and decenter tolerance is not as sensitive as previous three parameters. The effect of surface irregularity, lens1 and 2 spacing, and element tilt are almost negligible. Surface wedge is negligible through the entire system. So it doesn’t need to be taken into consideration for this system.- Lens 2 and following space : Dominant tolerances are power of two surface and spacing between lens 2 and lens 3. Thickness is next dominant one and after then other tolerances are all insensitive. Lens 2 is the least sensitive element.- Lens 3 and space between image plane : Center thickness and space tolerance between lens and image plane are most dominant variables. Power tolerance of the surface 8 is second most sensitive tolerance than others. The rest of the other tolerances are quite forgiving compare to other lenses.The RSS results have been calculated after each lens error and total RSS has calculated based on the error of each lens. The total RSS of system degradation due to tolerated error is 17.7% of MTF drop. Therefore the design specification less then 20% at Ny/2 is achieved.comment3. Nominalvalue4.Perturbation5. MTF6.Sensitivity7.Tolerancesrefraction 1.5312 0.0010 0.0550 55.030.00100 2-3 Thickness(mm) 0.8822 0.0100 0.0283 2.830.01000 2.830.0000 0.0500 0.0001 0.000.10000Decenter(mm) 0.0000 0.0100 0.0131 1.310.02000 1.3941 0.0011 0.0333 30.250.00220 0.0001 0.0001 2.800.00005 0.0500 0.0002 0.000.10000 2.3987 0.0066 0.0410 6.200.01320 0.0001 0.0035 69.360.00005 0.0500 0.0000 0.000.10000 Spacing0.9750 0.0100 0.0006 0.060.01000 2369786. ConclusionAfter completing tolerance analysis for the cell phone camera lens using Zemax, we have following conclusions,a.From the given optical design, optomechanical layout has been presented.b.Actual drawing with tolerance for the lens fabrication has been presented.c.Dominant errors and negligible errors have been clarified after tolerance analysis. Lens thickness,space between lens and power of each surfaces are dominant factors to affect performance.d.Over all tolerance guide line has been met with 17.7% MTF drop at 45lp/mm7. References(1) Robert H. Ginsberg, “Outline of tolerancing (from performance specification to tolerance drawing)”, Optical Engineering, March/April 1981, Vol. 20, No. 2, pp. 175-180(2) Zemax manual and Zemax knowledge base website.。
Communicating X-machines A practical approach for formal and modular specification of large
Communicating X-Machines: A Practical Approach for Formal and Modular Specification of Large Systems Petros K EFALAS1, George E LEFTHERAKIS1, and Evangelos K EHRIS21City Liberal Studies,Affiliated College of the University of Sheffield, Computer Science Department,13 Tsimiski Str., 54624 Thessaloniki, Greece {kefalas,eleftherakis}@city.academic.gr2Technological Educational Instituteof SerresBusiness Adminitration Dept. Terma Magnisias, 62124 Serres, Greecekehris@teiser.grAbstract. An X-machine is a general computational machine that can model: (a) non-trivial data structuresas a typed memory tuple and (b) the dynamic part of a system by employing transitions, which are not la-beled with simple inputs but with functions that operate on inputs and memory values. The X-machine formalmethod is valuable to software engineers since it is rather intuitive, while at the same time formal descrip-tions of data types and functions can be written in any known mathematical notation. These differences allowthe X-machines to be more expressive and flexible than a Finite State Machine. In addition, a set of X-machines can be viewed as components, which communicate with each other in order to specify larger sys-tems. This paper describes a methodology as well as an appropriate notation, namely XMDL, for buildingcommunicating X-machines from existing stand-alone X-machine models. The proposed methodology is ac-companied by an example model of a traffic light junction, which demonstrates the use of communicating X-machines towards the incremental modeling of large-scale systems. It is suggested that through XMDL, thepractical development of such complex systems can be split into two separate activities: (a) the modeling ofstand-alone X-machine components and (b) the description of the communication between these components.The approach is disciplined, practical, modular and general in the sense that it subsumes the existing meth-ods for communicating X-machines.1. IntroductionAlthough many software engineering methods and methodologies are devised in order to deal with the development of complex software systems, there is still no evidence to suggest that, apart from formal methods, any of them leads towards “correct” systems. In the last few decades, academics and practi-tioners adopted extreme positions for or against formal methods [1], with the truth lying somewhere between but the necessity of formal methods in software engineering of industrial systems still appar-ent [2]. Software system specification has centred on the use of models of data types, either functional or relational models such as Z [3] or VDM [4] or axiomatic ones such as OBJ [5]. Although these have led to some considerable advances in software design, they lack the ability to express the dynam-ics of the system. Also, transforming an implicit formal description into an effective working system is not straightforward. Other formal methods, such as Finite State Machines [6] or Petri Nets [7] cap-ture the dynamics of a system, but fail to describe the system completely, since there is little or no reference at all to the internal data and how this data is affected by each operation in the state transi-tion diagram. Other methods, like Statecharts [8], capture the requirements of both the dynamic be-haviour and modelling of data but are rather informal with respect to clarity and semantics, thus being susceptible to many interpretations. So far, little attention has been paid in formal methods that could facilitate all crucial stages of “correct” system development, namely modelling, verification and test-ing.X-machines is a formal method that is able to deal with all these crucial stages. An X-machine is a general computational machine introduced by Eilenberg [9] and extended by Holcombe [10], that resembles a Finite State Machine (FSM) but with two significant differences: (a) there is an underly-ing data set attached to the machine, and (b) the transitions are not labeled with simple inputs but with functions that operate on inputs and data set values. These differences allow the X-machines to bemore expressive and flexible than the FSM. In this paper, we use X-machines for modeling communi-cating systems.As described above, the majority of formal languages facilitate the modeling of either the data processing or the control of a system. A particular interesting class of X-machines is the Stream X-machines that can model non-trivial data structures as a typed memory tuple. Stream X-machines em-ploy a diagrammatic approach of modeling the control by extending the expressive power of the FSM. They are capable of modeling both the data and the control by integrating methods, which describe each of these aspects in the most appropriate way. Transitions between states are performed through the application of functions, which are written in a formal notation and model the processing of the data, which is held in the memory. Functions receive input symbols and memory values, and produce output while modifying the memory values (Fig. 1). The machine, depending on the current state of control and the current values of the memory, consumes an input symbol from the input stream and determines the next state, the new memory state and the output symbol, which will be part of the out-put stream. The formal definition of a deterministic stream X-machine [11] is an 8-tuple M = (Σ, Γ, Q, M, Φ, F, q0, m0), where:!"Σ, Γ is the input and output finite alphabet respectively, i.e. two sets of symbols,!"Q is the finite set of states,!"M is the (possibly) infinite set called memory, i.e. an n-tuple of typed symbols,!"Φ is the type of the machine M, a finite set of partial functions φ that map an input and a mem-ory state to an output and a new memory state, φ: Σ× M →Γ× M,!" F is the next state partial function that given a state and a function from the type Φ, denotes the next state. F is often described as a transition state diagram, F: Q ×Φ→ Q, and!"q0 and m0 are the initial state and memory respectively.Fig. 1. An abstract example of an X-machine; φi: functions operating on inputs and memory, S i: states.The general format of functions is: φ(σ,m) = (γ,m’) if condition.Stream X-machines can be thought to apply in similar cases where Statecharts and other similar no-tations, such as SDL, do. However, apart from being formal as well as proved to possess the computa-tional power of Turing machines [11], X-machines have other significant advantages. Firstly, they provide a mathematical modeling formalism for a system. Consequently, they offer a strategy to test the implementation against the model [12,13], which is a generalization of W-method for FSM testing [14]. It is proved that this testing method is guaranteed to determine correctness if certain assumptions in the implementation hold [11]. Finally, a model checking methodology for X-machines is devised [15] that facilitates the verification of safety properties of a model. Therefore, the X-machines can be used as a core notation around which an integrated formal methodology of developing correct systems is built, ranging from model checking to testing [15,16]. In principle, X-machines are considered a generalization of models written in similar formalisms. Concepts devised and findings proven for X-machines form a solid theoretical framework, which can be adapted to other, more tool-oriented methods, such as Statecharts or SDL.In this paper, we explore another dimension of X-machine modeling, i.e. the ability to specify large-scale systems in terms of components that communicate with each other. In addition, we demon-strate how formal specification can also be practical through an appropriately devised notation and an incremental modular development methodology. In section 2 of this paper, a review of the communi-cating X-machine approaches is presented and the motivation of our work is given by identifying the limitations of existing approaches. The main contribution of this work is analytically discussed in section 3 where our methodology as well as the appropriate notation is described. A concrete example is given in order to accompany the theory and demonstrate the applicability of the approach. In sec-tion 4, the advantages of the current approach over the alternatives are discussed. Finally, in the last section, current as well as further work concludes this paper.2. Review of Communicating X-machine TheoryA number of different communicating X-machine approaches have been proposed [17,18,19]. We describe in more detail the approach proposed in [17], since, it is more general than [18], and unlike [19], it is sound and preserves the main advantage of X-machines, i.e. the testing method for stand-alone X-machines can be applied. All the approaches are compared to the proposed methodology of this paper in the last section. A Communicating Stream X-machine with n components is a system [17]:CSXM n = ( (XMC i) i=1..n, CM, C0)where:!"XMC i is an X-machine Component of the system,!"CM is a n×n matrix, namely the Communication Matrix,!"C0 is the initial communication matrix.In the following, we will refer to components of the communicating stream X-machine as XMC rather than X-machine, since the definition of XMC is different from the one of a stand-alone X-machine. In the described model, the communication of XMCs is established through the communica-tion matrix. The matrix cells contain “messages” from one XMC to another. The (i,j) cell contains a “message” from XMC i to XMC j, i.e. XMC i reads only from i th column and writes only in i th row. The matrix cells may contain an undefined value λ which stands for “no message” while all the (i,i) cells remain empty. The “messages” can be of any type defined in all XMC memories. In order to deliver or receive “messages”, an XMC i requires an input port IP i and an output port OP i, which contain individual elements (Fig. 2). The type of these elements is IN i and OUT i respectively, which are sets of values from the memory M i or the undefined value λ, that is IN i, OUT i ⊆ M i∪{λ} and λ∉M i.In order to utilize the IP and OP ports, there exist special kind of states and functions, which are called Communicating States and Communicating Functions respectively. Therefore, inside an XMC, there exist:!"Processing Functions, which affect the contents of IP and OP ports, emerge from processing states, and do not affect the communication matrix, and!"Communicating Functions, which either read an element from the matrix and put it in the IP port, or write an element from the OP port to the matrix.Fig. 2. Two X-machines Components, XMC i and XMC j can communicate through their IP and OP ports andthe Communication Matrix.The communicating functions emerge only from communication states, accept the empty symbol εas input, and produce ε as output, while not affecting the memory. The communicating functions can write to the matrix only if the cell contains the special value λ. After the communicating functions read from the matrix, the cell is assigned the value λ. If a communication function is not applicable, it“waits” until it becomes applicable (Fig. 3). Formally, the processing functions of an XMC are de-fined as:pf(σ, m, in, out) = (γ, m’, in’, out’) where: σ∈Σ, m,m’∈M, in,in’∈IN, out,out’∈OUT , γ∈Γand, the communication functions are defined as:cf(ε, m, in, out, c) = (ε, m, in’, out’, c’) where: m∈M, in,in’∈IN, out,out’∈OUT , c,c’∈CM Therefore, the new Φi set for an XMC i is the union of all the processing functions pf∈Φpi with all the communicating functions cf∈Φci , i.e. Φi= Φpi∪Φci and Φpi∩Φci = ∅, where:pf : Σi× M i× IN i× OUT i→Γi× M i× IN i× OUT icf: Σi× M i× IN i× OUT i× CM →Γi× M i× IN i× OUT i× CMεFig.3. An abstract view of Processing and Communicating Functions and their parameters.For example, assume a producer and a consumer combination. The producer generates items, while the consumer accepts two items and operates on them. The two XMCs, together with their processing as well as communicating states and functions are illustrated in Fig. 4. The producer is an XMC that, given a stream of inputs, activates the processing functions begin and continue, which in turn store a partially constructed item into the memory. When the item construction is complete, the item is trans-ferred to the OP port and the machine is at transmitting state. The function send is activated in order to send the item from the OP port to the communication matrix. On the other hand, the consumer is an XMC that is at waiting state until an item appears in the communication matrix. When it does, the function get is activated, putting the item into the IP port. The processing function accept first is re-sponsible for moving the item from the IP port to the memory. When the second item is read, consume operates on both items existing in the memory. A formal definition of the two XMCs exists but falls outside the scope of this section.It is apparent that the above XMCs are different from stand-alone X-machines that someone could specify without the intention to communicate, since:!"there might be different initial states, e.g. q0=waiting for XMC consumer while for the stand-alone X-machine would have been q0=consuming_first,!"there are additional communicating functions in all XMCs,!"the processing functions are different, e.g. in the XMC producer complete is required to write to the IP port,!"the state transition diagrams are different.The above approach to building communicating systems is sound and preserves the ability to gen-erate a complete test set for the system, thus guarantying its correctness. However, it suffers one ma-jor drawback, that is, a system should be conceived as a whole and not as a set of independent com-ponents. As a consequence, one should start from scratch in order to specify a new component as part of the large system. In addition, specified components cannot be re-used as stand-alone X-machines or as components of other systems, since the formal definition of a stand-alone X-machine differs sig-nificantly from the definition of an XMC. Moreover, the semantics of the functions affecting the communication matrix impose a limited asynchronous operation of an XMC. For example, the opera-tion of the XMC producer is restricted, since it is not allowed to construct a second item if the first one is not received by the XMC consumer (the cell in the communication matrix is not empty and therefore the producer keeps on being at state transmitting).Fig.4. A producer and a consumer as XMCs of a communicating system.A different methodology, namely COMX, of constructing communicating X-machines is described in [19]. The communication is also based in nominated IN and OUT ports and described as a separate diagram. The methodology follows a top-down approach and the intention is mainly to verify that certain properties of the communicating system are satisfied, such as reachability, boundness, dead-locks etc. A complex formal notation is used, which however is far from being standard in order to lead to the construction of appropriate tools. In addition, no effort is made to preserve the semantics of stand-alone X-machines, and therefore existing techniques for component testing as well as poten-tial communicating system testing are unusable.3. Building Systems from stand-alone X-machinesOur intention is to present an alternative approach for communicating X-machines and at the same time aim towards practical and disciplined development. Instead of redefining X-machines as XMCs above, we conform to the standard definition, so that X-machine specifications can be used as compo-nents of large-scale communicating systems as they are, without changes. As opposed to the previ-ously defined approach as well as COMX, the new approach has several advantages for the developer who:!"does not need to model a communicating system from scratch,!"can re-use existing models,!"can consider modeling and communication as two separate distinct activities in the development ofa communicating system,!"can use existing tools for both stand-alone and communicating X-machines.Together with keeping the standard definition of X-machines, we suggest a bottom-up methodology for developing communicating systems, which consists of three steps:!"developing X-machine models independently of the target system, or using existing models as they are, as components of the new system,!"determining the way in which the independent models communicate,!"extending the system to accommodate more instances of already defined models.The steps of the methodology are described in the next section through an example.3.1 Modeling Two Independent X-machines (Step 1)Consider a junction with three traffic lights and three corresponding queues of cars (Fig.5). The first step of the proposed methodology implies the modeling of two of system components, i.e. an X-machine for a traffic light and an X-machine for a queue of cars. The memory of a queue X-machine holds a sequence of cars arrived at the traffic light. Functions are activated by the arrival of a car or by a signal car_leaves when the car leaves the junction and consequently the queue. A second X-machine models a traffic light, which has two colours (red and green). The memory of the traffic lightX-machine holds the total number of ticks elapsed since the last change of colour as well as the num-ber of ticks that a colour should be displayed (duration). An optional feature, namely the time delay, i.e. the number of ticks that should elapse before the normal operation of the traffic light may also be included in the memory. Such feature may not be specified, but, either way, it can be implemented later on in the communicating system using a different approach, as it will be shown later. The func-tions are activated by an input signal tick, which simulates a clock tick. The next state functions for the models queue and traffic light are shown diagrammatically in Fig. 6.Fig.5. A junction with three traffic lights and corresponding queues.The formal definitions of the two X-machines are presented below using the notation of X-machine Description Language [20], which is intended to be an ASCII-based interchange language. The use of XMDL makes formal specification more practical, since models that are specified in this language can be processed by various tools, such as an animator, a test set generator, a model checker etc. [21,22]. Briefly, XMDL is a non-positional notation based on tags, used for the declaration of X-machine parts, e.g. types, set of states, memory, input and output symbols, functions etc. The functions take two parameter tuples, i.e. an input symbol and a memory value, and return two new parameter tuples, i.e. an output and a new memory value. A function may be applicable under conditions (if-then) or unconditionally. Variables are denoted by ?. The informative where in combination with the operator <- is used to describe operations on memory values. Therefore the functions are of the form: #fun <function name> ( <input tuple> , <memory tuple> )=[if <condition expression> then]( <output tuple>, <memory tuple> )[where<informative expression>].The full syntax and semantics of XMDL can be found in [23]. In the following, only the XMDL code of the queue X-machine will be analytically described, while the code of the traffic light will be just listed. Firstly, an XMDL code includes the declarations of the model name as well as the basic and user-defined types. In this case, a CAR is a basic type, i.e. anything, car_queue is defined as a sequence of cars. The types inputset and messages will be used later on.#model queue.#basic_type [CAR].#type car_queue = sequence_of CAR.#type inputset = CAR union {car_leaves}.#type messages={FirstArrived,NextArrived,CarLeft,LastCarLeft,NoCarInQueue}.The memory of the queue X-machine should hold the sequence of cars. The initial memory is de-clared as an instance of memory, in which the queue is empty.#memory (car_queue).#init_memory (nil).Fig.6. The state transition diagrams of the queue and traffic light X-machines Accordingly, the set of states and the initial state are declared:#states = {empty, queuing}.#init_state {empty}.The input and output symbols are based on the types previously declared by the specifier. The input is declared as any car or the signal car_leaves, whereas the output indicates the operation triggered: #input (inputset).#output (messages).The set of transitions shown in Fig. 6 are listed:#transition (empty, first_arrives) = queuing.#transition (queuing, arrives) = queuing.#transition (queuing, leaves) = queuing.#transition (queuing, last_leaves) = empty.#transition (empty, reject) = empty.The function first_arrives is triggered by an input, i.e. a car, when the queue is empty. As a result a car is put in the queue sequence:#fun first_arrives( (?c), (nil)) =if ?c belongs CAR then ((FirstArrived), (<?c>)).In order to complete the specification, the rest of the functions are defined in the same manner: #fun arrives( (?c), (?queue)) =if ?c belongs CAR then ((NextArrived),(?newqueue))where?newqueue <- ?c addatendof ?queue.#fun leaves( (car_leaves), (<?c :: ?rest>)) = ((CarLeft), (?rest)).#fun last_leaves( (car_leaves), (<?c>)) = ((LastCarLeft), (nil)).#fun reject( (car_leaves), (nil)) = ((NoCarInQueue), (nil)).Accordingly the traffic light X-machine is defined as follows:#model traffic_light.#type ticks_elapsed = natural0.#type delay = natural0.#type duration_green = natural.#type duration_red = natural.#type inputsignal = {tick}.#type output = {RedColour, GreenColour, StartUp}.#memory (ticks_elapsed, delay, duration_green, duration_red).#states = {red, green}.#init_memory (0,10,30,20).#init_state = {red}.#input (inputsignal).#output (outputset).#transition (red, keep_red) = red.#transition (red, change_green) = green.#transition (red, delay) = red.#transition (green, keep_green) = green.#transition (green, change_red) = red.#transition (green, delay) = green.#fun delay ((tick), (?te, ?delay, ?dg, ?dr)) =if ?d>0 then ( (StartUp), (?te, ?new_delay, ?dg, ?dr) )where ?new_delay <- ?delay - 1.#fun keep_red ((tick), (?te,0,?dg, ?dr)) =if ?new_te<?dr then ( (RedColour), (?new_te, 0, ?dg, ?dr) ) where ?new_te <- ?te + 1.#fun keep_green ((tick), (?te,0,?dg,?dr)) =if ?new_te<?dg then ( (GreenColour), (?new_te, 0, ?dg, ?dr) ) where ?new_te <- ?te + 1.#fun change_green ((tick), (?dr,0,?dg,?dr)) = ((GreenColour), (0,0,?dg,?dr)).#fun change_red ((tick), (?dg,0,?dg,?dr)) = ((RedColour), (0,0,?dg,?dr)).3.2 Building a Communicating System (Step 2)In our approach, we have replaced the communication matrix by several input streams associated with each X-machine component. Although, this may look only as a different conceptual view of the same entity, it will serve both exposition purposes as well as asynchronous operation of the individual ma-chines. X-machines have their own standard input stream but when they are used as components of a large-scale system more streams may be added whenever it is necessary. The number of streams asso-ciated with one X-machine depends on the number of other X-machines, from which it receives mes-sages (Fig. 7).messages from X2 to X131Fig. 7. Three X-machines X1, X2, and X3 and the resulting communicating system where X2 communicates (writes) with X1 and X3, while X3 communicates (writes) with X1. Therefore X1 has three input streams, X3 has two input streams, while X2 has only its own standard input streamThe previously defined X-machines can communicate in the following way; when the traffic light becomes green, the queue can be notified to leave the cars depart one by one, providing that there is at least one car in the queue. For simplicity, we can assume that one car leaves the queue on each tick of the clock. This can be adjusted, as we shall see later. While this happens, more cars may arrive join-ing the queue, waiting for a signal to depart. The above interaction would mean that the X-machine traffic light should send a message to X-machine queue, which will act as an input.If so annotated, the functions accept input from the communicating steam instead of the standard input stream. Also, the functions may write to a communicating input stream of another X-machine. The normal output of the functions is not affected. The annotation used to indicate communication isdepicted in Table 1. Formally speaking, the definition of type Φ in the X-machine changes from thatin the definition of M. The formal definition of the functions in Φ of a communicating X-machine component can be found in [24]. In this paper, we aim to demonstrate the practicality of the approach.Table 1. Annotation for diagrams of communicating X-machine specifications.Annotation SemanticsFunction reads an input from standard input stream andwrites to output stream.model nameFunction reads an input from a communication inputstream, i.e. a “message” sent by the annotated modelname.Function writes a “message” to the communicationinput stream of machine with the annotated modelname. The “message” is sent after all the output pa-rameters are instantiated.The models queue and traffic_light become as illustrated in Fig. 8. In order to incorporate the above semantics, the syntax of XMDL is enhanced by the following annotation:#communication of <model name>:<function name> reads from <model name><function name> writes <message> to <model name>[where <expression> from (memory|input|output) <tuple>].The developer needs only to write is the XMDL code referring to communication, while the rest ofthe specification, i.e. the part referring to the X-machine components and described earlier, remains unchanged:#communication of queue:leaves reads from traffic_light.last_leaves reads from traffic_light.reject reads from traffic_light.#communication of traffic_light:change_green writes (car_leaves) to queue.keep_green writes (car_leaves) to queue.Fig.8. The state transition diagrams of the queue and traffic light X-machine specifications as a partof a communicating system3.3 Extending the System (Step 3)Developing larger models as communicating systems from existing building blocks implies the need for some more features, which can be included in communicating X-machines. For example, if there are more traffic lights in the system, then there must be an additional X-machine that does the sched-uling.In the approaches presented by other researchers [17,18,19], the developer should start from scratch by re-defining all XMCs. In the current approach, re-building the system includes only the modifica-tion of the communication part and one new specification, i.e. the scheduler X-machine. In the follow-ing example, there are three traffic lights and therefore three queues of cars, which operated in a round robin fashion (Fig.5). The scheduler is responsible to synchronise and allocate a time-share to each device (e.g. traffic light) to operate on a certain mode (e.g. become green).The scheduler model can either be created from scratch as an X-machine, or it may already exist as a component of some other system that needed the same type of device scheduling. The model of the scheduler is general and does not refer to any of the other X-machines, i.e. queue or traffic light. The complete model of the scheduler in XMDL is the following:#model scheduler.#type InputSet = {clock_pulse, switch_device}.#type AccumulatedTime=natural0.#memory (AccumulatedTime).#states = {scheduling_device_1, scheduling_device_2, scheduling_device_3}.#init_state = {scheduling_device_1}.#init_memory (0).#input (InputSet).#output (AccumulatedTime).#transition (scheduling_device_1, operate) = scheduling_device_1.#transition (scheduling_device_1, switch) = scheduling_device_2.#transition (scheduling_device_2, operate) = scheduling_device_2.#transition (scheduling_device_2, switch) = scheduling_device_3.#transition (scheduling_device_3, operate) = scheduling_device_3.#transition (scheduling_device_3, switch) = scheduling_device_1.#fun operate ((clock_pulse), (?t)) = ((?nt), (?nt))where?nt <- ?t+1.#fun switch ((switch_device), (?t)) = ((?t), (?t)).The complete system is illustrated in Fig.9. The scheduler communicates with the three traffic lights by sending the “message” tick to them. The traffic lights communicate with the corresponding queues as shown earlier by sending the “message” car_leaves. Finally, a traffic light sends the “mes-sage” switch_device to the scheduler after the appropriate time has elapsed and changed from green to red or vice versa.Nothing in the original models of the X-machine components that describe the queue and the traffic light need to change. However, several instances of the models queue and traffic_light should be cre-ated, each one with different initial state and initial memory. A syntax that can serve this purpose is the following:#model <instance name> instance_of <model name>[with:#init_state = <instance initial state>.#init_memory <instance initial memory tuple>].。
Multi-objective Optimization
Chapter2Multi-objective OptimizationAbstract In this chapter,we introduce multi-objective optimization,and recall some of the most relevant research articles that have appeared in the international litera-ture related to these topics.The presented state-of-the-art does not have the purpose of being exhaustive;it aims to drive the reader to the main problems and the ap-proaches to solve them.2.1Multi-objective ManagementThe choice of a route at a planning level can be done taking into account time, length,but also parking or maintenance facilities.As far as advisory or,more in general,automation procedures to support this choice are concerned,the available tools are basically based on the“shortest-path problem”.Indeed,the problem tofind the single-objective shortest path from an origin to a destination in a network is one of the most classical optimization problems in transportation and logistic,and has deserved a great deal of attention from researchers worldwide.However,the need to face real applications renders the hypothesis of a single-objective function to be optimized subject to a set of constraints no longer suitable,and the introduction of a multi-objective optimization framework allows one to manage more informa-tion.Indeed,if for instance we consider the problem to route hazardous materials in a road network(see,e.g.,Erkut et al.,2007),defining a single-objective function problem will involve,separately,the distance,the risk for the population,and the transportation costs.If we regard the problem from different points of view,i.e.,in terms of social needs for a safe transshipment,or in terms of economic issues or pol-11122Multi-objective Optimizationlution reduction,it is clear that a model that considers simultaneously two or more such objectives could produce solutions with a higher level of equity.In the follow-ing,we will discuss multi-objective optimization and related solution techniques.2.2Multi-objective Optimization and Pareto-optimal SolutionsA basic single-objective optimization problem can be formulated as follows:min f(x)x∈S,where f is a scalar function and S is the(implicit)set of constraints that can be defined asS={x∈R m:h(x)=0,g(x)≥0}.Multi-objective optimization can be described in mathematical terms as follows:min[f1(x),f2(x),...,f n(x)]x∈S,where n>1and S is the set of constraints defined above.The space in which the objective vector belongs is called the objective space,and the image of the feasible set under F is called the attained set.Such a set will be denoted in the following withC={y∈R n:y=f(x),x∈S}.The scalar concept of“optimality”does not apply directly in the multi-objective setting.Here the notion of Pareto optimality has to be introduced.Essentially,a vector x∗∈S is said to be Pareto optimal for a multi-objective problem if all other vectors x∈S have a higher value for at least one of the objective functions f i,with i=1,...,n,or have the same value for all the objective functions.Formally speak-ing,we have the following definitions:•A point x∗is said to be a weak Pareto optimum or a weak efficient solution for the multi-objective problem if and only if there is no x∈S such that f i(x)<f i(x∗) for all i∈{1,...,n}.2.2Multi-objective Optimization and Pareto-optimal Solutions13•A point x∗is said to be a strict Pareto optimum or a strict efficient solution for the multi-objective problem if and only if there is no x∈S such that f i(x)≤f i(x∗) for all i∈{1,...,n},with at least one strict inequality.We can also speak of locally Pareto-optimal points,for which the definition is the same as above,except that we restrict attention to a feasible neighborhood of x∗.In other words,if B(x∗,ε)is a ball of radiusε>0around point x∗,we require that for someε>0,there is no x∈S∩B(x∗,ε)such that f i(x)≤f i(x∗)for all i∈{1,...,n}, with at least one strict inequality.The image of the efficient set,i.e.,the image of all the efficient solutions,is called Pareto front or Pareto curve or surface.The shape of the Pareto surface indicates the nature of the trade-off between the different objective functions.An example of a Pareto curve is reported in Fig.2.1,where all the points between(f2(ˆx),f1(ˆx))and (f2(˜x),f1(˜x))define the Pareto front.These points are called non-inferior or non-dominated points.f1(xFig.2.1Example of a Pareto curveAn example of weak and strict Pareto optima is shown in Fig.2.2:points p1and p5are weak Pareto optima;points p2,p3and p4are strict Pareto optima.142Multi-objective Optimization2Fig.2.2Example of weak and strict Pareto optima2.3Techniques to Solve Multi-objective Optimization ProblemsPareto curves cannot be computed efficiently in many cases.Even if it is theoreti-cally possible tofind all these points exactly,they are often of exponential size;a straightforward reduction from the knapsack problem shows that they are NP-hard to compute.Thus,approximation methods for them are frequently used.However, approximation does not represent a secondary choice for the decision maker.Indeed, there are many real-life problems for which it is quite hard for the decision maker to have all the information to correctly and/or completely formulate them;the deci-sion maker tends to learn more as soon as some preliminary solutions are available. Therefore,in such situations,having some approximated solutions can help,on the one hand,to see if an exact method is really required,and,on the other hand,to exploit such a solution to improve the problem formulation(Ruzica and Wiecek, 2005).Approximating methods can have different goals:representing the solution set when the latter is numerically available(for convex multi-objective problems);ap-proximating the solution set when some but not all the Pareto curve is numerically available(see non-linear multi-objective problems);approximating the solution set2.3Techniques to Solve Multi-objective Optimization Problems15when the whole efficient set is not numerically available(for discrete multi-objective problems).A comprehensive survey of the methods presented in the literature in the last33 years,from1975,is that of Ruzica and Wiecek(2005).The survey analyzes sepa-rately the cases of two objective functions,and the case with a number of objective functions strictly greater than two.More than50references on the topic have been reported.Another interesting survey on these techniques related to multiple objec-tive integer programming can be found in the book of Ehrgott(2005)and the paper of Erghott(2006),where he discusses different scalarization techniques.We will give details of the latter survey later in this chapter,when we move to integer lin-ear programming formulations.Also,T’Kindt and Billaut(2005)in their book on “Multicriteria scheduling”,dedicated a part of their manuscript(Chap.3)to multi-objective optimization approaches.In the following,we will start revising,following the same lines of Erghott (2006),these scalarization techniques for general continuous multi-objective op-timization problems.2.3.1The Scalarization TechniqueA multi-objective problem is often solved by combining its multiple objectives into one single-objective scalar function.This approach is in general known as the weighted-sum or scalarization method.In more detail,the weighted-sum method minimizes a positively weighted convex sum of the objectives,that is,minn∑i=1γi·f i(x)n∑i=1γi=1γi>0,i=1,...,nx∈S,that represents a new optimization problem with a unique objective function.We denote the above minimization problem with P s(γ).It can be proved that the minimizer of this single-objective function P(γ)is an efficient solution for the original multi-objective problem,i.e.,its image belongs to162Multi-objective Optimizationthe Pareto curve.In particular,we can say that if theγweight vector is strictly greater than zero(as reported in P(γ)),then the minimizer is a strict Pareto optimum,while in the case of at least oneγi=0,i.e.,minn∑i=1γi·f i(x)n∑i=1γi=1γi≥0,i=1,...,nx∈S,it is a weak Pareto optimum.Let us denote the latter problem with P w(γ).There is not an a-priori correspondence between a weight vector and a solution vector;it is up to the decision maker to choose appropriate weights,noting that weighting coefficients do not necessarily correspond directly to the relative impor-tance of the objective functions.Furthermore,as we noted before,besides the fact that the decision maker cannot be aware of which weights are the most appropriate to retrieve a satisfactorily solution,he/she does not know in general how to change weights to consistently change the solution.This means also that it is not easy to develop heuristic algorithms that,starting from certain weights,are able to define iteratively weight vectors to reach a certain portion of the Pareto curve.Since setting a weight vector conducts to only one point on the Pareto curve,per-forming several optimizations with different weight values can produce a consid-erable computational burden;therefore,the decision maker needs to choose which different weight combinations have to be considered to reproduce a representative part of the Pareto front.Besides this possibly huge computation time,the scalarization method has two technical shortcomings,as explained in the following.•The relationship between the objective function weights and the Pareto curve is such that a uniform spread of weight parameters,in general,does not producea uniform spread of points on the Pareto curve.What can be observed aboutthis fact is that all the points are grouped in certain parts of the Pareto front, while some(possibly significative)portions of the trade-off curve have not been produced.2.3Techniques to Solve Multi-objective Optimization Problems17•Non-convex parts of the Pareto set cannot be reached by minimizing convex combinations of the objective functions.An example can be made showing a geometrical interpretation of the weighted-sum method in two dimensions,i.e., when n=2.In the two-dimensional space the objective function is a liney=γ1·f1(x)+γ2·f2(x),wheref2(x)=−γ1·f1(x)γ2+yγ2.The minimization ofγ·f(x)in the weight-sum approach can be interpreted as the attempt tofind the y value for which,starting from the origin point,the line with slope−γ1γ2is tangent to the region C.Obviously,changing the weight parameters leads to possibly different touching points of the line to the feasible region.If the Pareto curve is convex then there is room to calculate such points for differentγvectors(see Fig.2.3).2 f1(xFig.2.3Geometrical representation of the weight-sum approach in the convex Pareto curve caseOn the contrary,when the curve is non-convex,there is a set of points that cannot be reached for any combinations of theγweight vector(see Fig.2.4).182Multi-objective Optimization f1(xFig.2.4Geometrical representation of the weight-sum approach in the non-convex Pareto curve caseThe following result by Geoffrion(1968)states a necessary and sufficient condi-tion in the case of convexity as follows:If the solution set S is convex and the n objectives f i are convex on S,x∗is a strict Pareto optimum if and only if it existsγ∈R n,such that x∗is an optimal solution of problem P s(γ).Similarly:If the solution set S is convex and the n objectives f i are convex on S,x∗is a weak Pareto optimum if and only if it existsγ∈R n,such that x∗is an optimal solution of problem P w(γ).If the convexity hypothesis does not hold,then only the necessary condition re-mains valid,i.e.,the optimal solutions of P s(γ)and P w(γ)are strict and weak Pareto optima,respectively.2.3.2ε-constraints MethodBesides the scalarization approach,another solution technique to multi-objective optimization is theε-constraints method proposed by Chankong and Haimes in 1983.Here,the decision maker chooses one objective out of n to be minimized; the remaining objectives are constrained to be less than or equal to given target val-2.3Techniques to Solve Multi-objective Optimization Problems19 ues.In mathematical terms,if we let f2(x)be the objective function chosen to be minimized,we have the following problem P(ε2):min f2(x)f i(x)≤εi,∀i∈{1,...,n}\{2}x∈S.We note that this formulation of theε-constraints method can be derived by a more general result by Miettinen,that in1994proved that:If an objective j and a vectorε=(ε1,...,εj−1,εj+1,...,εn)∈R n−1exist,such that x∗is an optimal solution to the following problem P(ε):min f j(x)f i(x)≤εi,∀i∈{1,...,n}\{j}x∈S,then x∗is a weak Pareto optimum.In turn,the Miettinen theorem derives from a more general theorem by Yu(1974) stating that:x∗is a strict Pareto optimum if and only if for each objective j,with j=1,...,n, there exists a vectorε=(ε1,...,εj−1,εj+1,...,εn)∈R n−1such that f(x∗)is the unique objective vector corresponding to the optimal solution to problem P(ε).Note that the Miettinen theorem is an easy implementable version of the result by Yu(1974).Indeed,one of the difficulties of the result by Yu,stems from the uniqueness constraint.The weaker result by Miettinen allows one to use a necessary condition to calculate weak Pareto optima independently from the uniqueness of the optimal solutions.However,if the set S and the objectives are convex this result becomes a necessary and sufficient condition for weak Pareto optima.When,as in problem P(ε2),the objective isfixed,on the one hand,we have a more simplified version,and therefore a version that can be more easily implemented in automated decision-support systems;on the other hand,however,we cannot say that in the presence of S convex and f i convex,∀i=1,...,n,all the set of weak Pareto optima can be calculated by varying theεvector.One advantage of theε-constraints method is that it is able to achieve efficient points in a non-convex Pareto curve.For instance,assume we have two objective202Multi-objective Optimization functions where objective function f1(x)is chosen to be minimized,i.e.,the problem ismin f1(x)f2(x)≤ε2x∈S,we can be in the situation depicted in Fig.2.5where,when f2(x)=ε2,f1(x)is an efficient point of the non-convex Pareto curve.f1(xf 2(x)£e2x)f1(xFig.2.5Geometrical representation of theε-constraints approach in the non-convex Pareto curve caseTherefore,as proposed in Steurer(1986)the decision maker can vary the upper boundsεi to obtain weak Pareto optima.Clearly,this is also a drawback of this method,i.e.,the decision maker has to choose appropriate upper bounds for the constraints,i.e.,theεi values.Moreover,the method is not particularly efficient if the number of the objective functions is greater than two.For these reasons,Erghott and Rusika in2005,proposed two modifications to improve this method,with particular attention to the computational difficulties that the method generates.2.3Techniques to Solve Multi-objective Optimization Problems21 2.3.3Goal ProgrammingGoal Programming dates back to Charnes et al.(1955)and Charnes and Cooper (1961).It does not pose the question of maximizing multiple objectives,but rather it attempts tofind specific goal values of these objectives.An example can be given by the following program:f1(x)≥v1f2(x)=v2f3(x)≤v3x∈S.Clearly we have to distinguish two cases,i.e.,if the intersection between the image set C and the utopian set,i.e.,the image of the admissible solutions for the objectives,is empty or not.In the former case,the problem transforms into one in which we have tofind a solution whose value is as close as possible to the utopian set.To do this,additional variables and constraints are introduced.In particular,for each constraint of the typef1(x)≥v1we introduce a variable s−1such that the above constraint becomesf1(x)+s−1≥v1.For each constraint of the typef2(x)=v2we introduce a surplus two variables s+2and s−2such that the above constraint be-comesf2(x)+s−2−s+2=v2.For each constraint of the typef3(x)≤v3we introduce a variable s+3such that the above constraint becomesf3(x)−s+3≤v3.222Multi-objective OptimizationLet us denote with s the vector of the additional variables.A solution(x,s)to the above problem is called a strict Pareto-slack optimum if and only if a solution (x ,s ),for every x ∈S,such that s i≤s i with at least one strict inequality does not exist.There are different ways of optimizing the slack/surplus variables.An exam-ple is given by the Archimedean goal programming,where the problem becomes that of minimizing a linear combination of the surplus and slack variables each one weighted by a positive coefficientαas follows:minαs−1s−1+αs+2s+2+αs−2s−2+αs+3s+3f1(x)+s−1≥v1f2(x)+s−2−s+2=v2f3(x)−s+3≤v3s−1≥0s+2≥0s−2≥0s+3≥0x∈S.For the above problem,the Geoffrion theorem says that the resolution of this prob-lem offers strict or weak Pareto-slack optimum.Besides Archimedean goal programming,other approaches are the lexicograph-ical goal programming,the interactive goal programming,the reference goal pro-gramming and the multi-criteria goal programming(see,e.g.,T’kindt and Billaut, 2005).2.3.4Multi-level ProgrammingMulti-level programming is another approach to multi-objective optimization and aims tofind one optimal point in the entire Pareto surface.Multi-level programming orders the n objectives according to a hierarchy.Firstly,the minimizers of thefirst objective function are found;secondly,the minimizers of the second most important2.3Techniques to Solve Multi-objective Optimization Problems23objective are searched for,and so forth until all the objective function have been optimized on successively smaller sets.Multi-level programming is a useful approach if the hierarchical order among the objectives is meaningful and the user is not interested in the continuous trade-off among the functions.One drawback is that optimization problems that are solved near the end of the hierarchy can be largely constrained and could become infeasi-ble,meaning that the less important objective functions tend to have no influence on the overall optimal solution.Bi-level programming(see,e.g.,Bialas and Karwan,1984)is the scenario in which n=2and has received several attention,also for the numerous applications in which it is involved.An example is given by hazmat transportation in which it has been mainly used to model the network design problem considering the government and the carriers points of view:see,e.g.,the papers of Kara and Verter(2004),and of Erkut and Gzara(2008)for two applications(see also Chap.4of this book).In a bi-level mathematical program one is concerned with two optimization prob-lems where the feasible region of thefirst problem,called the upper-level(or leader) problem,is determined by the knowledge of the other optimization problem,called the lower-level(or follower)problem.Problems that naturally can be modelled by means of bi-level programming are those for which variables of thefirst problem are constrained to be the optimal solution of the lower-level problem.In general,bi-level optimization is issued to cope with problems with two deci-sion makers in which the optimal decision of one of them(the leader)is constrained by the decision of the second decision maker(the follower).The second-level de-cision maker optimizes his/her objective function under a feasible region that is defined by thefirst-level decision maker.The latter,with this setting,is in charge to define all the possible reactions of the second-level decision maker and selects those values for the variable controlled by the follower that produce the best outcome for his/her objective function.A bi-level program can be formulated as follows:min f(x1,x2)x1∈X1x2∈argmin g(x1,x2)x2∈X2.242Multi-objective OptimizationThe analyst should pay particular attention when using bi-level optimization(or multi-level optimization in general)in studying the uniqueness of the solutions of the follower problem.Assume,for instance,one has to calculate an optimal solu-tion x∗1to the leader model.Let x∗2be an optimal solution of the follower problem associated with x∗1.If x∗2is not unique,i.e.,|argmin g(x∗1,x2)|>1,we can have a sit-uation in which the follower decision maker can be free,without violating the leader constraints,to adopt for his problem another optimal solution different from x∗2,i.e.,ˆx2∈argmin g(x∗1,x2)withˆx2=x∗2,possibly inducing a f(x∗1,ˆx2)>f(x∗1,x∗2)on the leader,forcing the latter to carry out a sensitivity analysis on the values at-tained by his objective function in correspondence to all the optimal solutions in argmin g(x∗1,x2).Bi-level programs are very closely related to the van Stackelberg equilibrium problem(van Stackelberg,1952),and the mathematical programs with equilibrium constraints(see,e.g.,Luo et al.1996).The most studied instances of bi-level pro-gramming problems have been for a long time the linear bi-level programs,and therefore this subclass is the subject of several dedicated surveys,such as that by Wen and Hsu(1991).Over the years,more complex bi-level programs were studied and even those including discrete variables received some attention,see,e.g.,Vicente et al.(1996). Hence,more general surveys appeared,such as those by Vicente and Calamai(1994) and Falk and Liu(1995)on non-linear bi-level programming.The combinatorial nature of bi-level programming has been reviewed in Marcotte and Savard(2005).Bi-level programs are hard to solve.In particular,linear bi-level programming has been proved to be strongly NP-hard(see,Hansen et al.,1992);Vicente et al. (1996)strengthened this result by showing thatfinding a certificate of local opti-mality is also strongly NP-hard.Existing methods for bi-level programs can be distinguished into two classes.On the one hand,we have convergent algorithms for general bi-level programs with the-oretical properties guaranteeing suitable stationary conditions;see,e.g.,the implicit function approach by Outrata et al.(1998),the quadratic one-level reformulation by Scholtes and Stohr(1999),and the smoothing approaches by Fukushima and Pang (1999)and Dussault et al.(2004).With respect to the optimization problems with complementarity constraints, which represent a special way of solving bi-level programs,we can mention the pa-pers of Kocvara and Outrata(2004),Bouza and Still(2007),and Lin and Fukushima2.4Multi-objective Optimization Integer Problems25(2003,2005).Thefirst work presents a new theoretical framework with the im-plicit programming approach.The second one studies convergence properties of a smoothing method that allows the characterization of local minimizers where all the functions defining the model are twice differentiable.Finally,Lin and Fukushima (2003,2005)present two relaxation methods.Exact algorithms have been proposed for special classes of bi-level programs, e.g.,see the vertex enumeration methods by Candler and Townsley(1982),Bialas and Karwan(1984),and Tuy et al.(1993)applied when the property of an extremal solution in bi-level linear program plementary pivoting approaches(see, e.g.,Bialas et al.,1980,and J´u dice and Faustino,1992)have been proposed on the single-level optimization problem obtained by replacing the second-level optimiza-tion problem by its optimality conditions.Exploiting the complementarity structure of this single-level reformulation,Bard and Moore(1990)and Hansen et al.(1992), have proposed branch-and-bound algorithms that appear to be among the most effi-cient.Typically,branch-and-bound is used when the lower-level problem is convex and regular,since the latter can be replaced by its Karush–Kuhn–Tucker(KKT) conditions,yielding a single-level reformulation.When one deals with linear bi-level programs,the complementarity conditions are intrinsically combinatorial,and in such cases branch-and-bound is the best approach to solve this problem(see,e.g., Colson et al.,2005).A cutting-plane approach is not frequently used to solve bi-level linear programs.Cutting-plane methods found in the literature are essentially based on Tuy’s concavity cuts(Tuy,1964).White and Anandalingam(1993)use these cuts in a penalty function approach for solving bi-level linear programs.Marcotte et al.(1993)propose a cutting-plane algorithm for solving bi-level linear programs with a guarantee offinite termination.Recently,Audet et al.(2007),exploiting the equivalence of the latter problem with a mixed integer linear programming one, proposed a new branch-and-bound algorithm embedding Gomory cuts for bi-level linear programming.2.4Multi-objective Optimization Integer ProblemsIn the previous section,we gave general results for continuous multi-objective prob-lems.In this section,we focus our attention on what happens if the optimization problem being solved has integrality constraints on the variables.In particular,all262Multi-objective Optimizationthe techniques presented can be applied in these situations as well,with some lim-itations on the capabilities of these methods to construct the Pareto front entirely. Indeed,these methods are,in general,very hard to solve in real applications,or are unable tofind all efficient solutions.When integrality constraints arise,one of the main limits of these techniques is in the inability of obtaining some Pareto optima; therefore,we will have supported and unsupported Pareto optima.f 2(x)f1(xFig.2.6Supported and unsupported Pareto optimaFig.2.6gives an example of these situations:points p6and p7are unsupported Pareto optima,while p1and p5are supported weak Pareto optima,and p2,p3,and p4are supported strict Pareto optima.Given a multi-objective optimization integer problem(MOIP),the scalarization in a single objective problem with additional variables and/or parameters tofind a subset of efficient solutions to the original MOIP,has the same computational complexity issues of a continuous scalarized problem.In the2006paper of Ehrgott“A discussion of scalarization techniques for mul-tiple objective integer programming”the author,besides the scalarization tech-niques also presented in the previous section(e.g.,the weighted-sum method,the ε-constraint method),satisfying the linear requirement imposed by the MOIP for-mulation(where variables are integers,but constraints and objectives are linear),2.4Multi-objective Optimization Integer Problems27presented more methods like the Lagrangian relaxation and the elastic-constraints method.By the author’s analysis,it emerges that the attempt to solve the scalarized prob-lem by means of Lagrangian relaxation would not lead to results that go beyond the performance of the weighted-sum technique.It is also shown that the general linear scalarization formulation is NP-hard.Then,the author presents the elastic-constraints method,a new scalarization technique able to overcome the drawback of the previously mentioned techniques related tofinding all efficient solutions,com-bining the advantages of the weighted-sum and theε-constraint methods.Further-more,it is shown that a proper application of this method can also give reasonable computing times in practical applications;indeed,the results obtained by the author on the elastic-constraints method are applied to an airline-crew scheduling problem, whose size oscillates from500to2000constraints,showing the effectiveness of the proposed technique.2.4.1Multi-objective Shortest PathsGiven a directed graph G=(V,A),an origin s∈V and a destination t∈V,the shortest-path problem(SPP)aims tofind the minimum distance path in G from o to d.This problem has been studied for more than50years,and several polynomial algorithms have been produced(see,for instance,Cormen et al.,2001).From the freight distribution point of view the term shortest may have quite dif-ferent meanings from faster,to quickest,to safest,and so on,focusing the attention on what the labels of the arc set A represent to the decision maker.For this reason, in some cases we willfind it simpler to define for each arc more labels so as to represent the different arc features(e.g.,length,travel time,estimated risk).The problem tofind multi-objective shortest paths(MOSPP)is known to be NP-hard(see,e.g.,Serafini,1986),and the algorithms proposed in the literature faced the difficulty to manage the large number of non-dominated paths that results in a considerable computational time,even in the case of small instances.Note that the number of non-dominated paths may increase exponentially with the number of nodes in the graph(Hansen,1979).In the multi-objective scenario,each arc(i,j)in the graph has a vector of costs c i j∈R n with c i j=(c1i j,...,c n i j)components,where n is the number of criteria.。
tolerance must be a positive number
tolerance must be a positive numberIntroductionTolerance is a concept that plays a crucial role in various aspects of our lives. It refers to the ability to accept, understand, and respect the beliefs, opinions, behaviors, and cultural differences of others. Tolerance promotes harmony, diversity, and peaceful coexistence among individuals and communities. In this article, we will explore the significance of tolerance and delve into why it is essential for a positive society. Furthermore, we will discuss why tolerance must be a positive number, highlighting the benefits it brings and the negative consequences that can arise when tolerance is lacking.The Significance of ToleranceTolerance is the cornerstone of a harmonious society. It allows people from different backgrounds to coexist peacefully, fostering inclusivity and respect. When individuals and communities practice tolerance, they create an environment that embraces diversity and encourages open-mindedness. Tolerance promotes understanding, empathy, and compassion, which are essential for building strong relationships and resolving conflicts peacefully.Tolerance in a Globalized WorldIn today’s interconnected world, tolerance has become more critical than ever. Globalization has increased cultural exchange, migration, and the mixing of diverse ideas and perspectives. Tolerance plays a vital role in ensuring that these interactions are positive and constructive. It allows individuals to appreciate and learn from different cultures, traditions, and experiences. Tolerance enables us to embrace diversity as a strength and not a threat, fostering social cohesion and minimizing misunderstandings between people from different backgrounds.Benefits of a Positive Tolerance:1. Social Cohesion•Tolerance enhances social cohesion by promoting mutual respect and understanding among individuals and communities.•It reduces discrimination, prejudice, and stereotypes, creating a more inclusive society where everyone feels valued and accepted.2. Conflict Resolution•Tolerance facilitates peaceful conflict resolution by encouraging dialogue, compromise, and negotiation.•It prevents conflicts from escalating into violence and promotes the development of long-term solutions that benefit all partiesinvolved.3. Personal Growth•Tolerance fosters personal growth by exposing individuals to diverse perspectives and challenging their own beliefs andassumptions.•It encourages critical thinking, empathy, and the ability to adapt to new situations, leading to personal development and a broaderunderstanding of the world.4. Economic Prosperity•Tolerance has positive economic implications. In diverse and tolerant societies, individuals from different backgrounds cancontribute their unique skills, ideas, and experiences, drivinginnovation and economic growth.•It attracts international talent, businesses, and investments, creating a diverse and vibrant economy.The Negative Consequences of IntoleranceIntolerance has severe repercussions for individuals, communities, and society as a whole. When tolerance is lacking, various negative consequences arise, undermining social harmony and hindering progress.1. Social Division•Intolerance creates social divisions, pitting different groups against each other and leading to hostility, prejudice, andmarginalization.•It hampers societal progress by impeding collaboration and cooperation among groups with differing perspectives and ideas.2. Violence and Conflict•Intolerance can escalate into violence and conflict, leading to human rights abuses, destruction, and loss of life.•It perpetuates cycles of hatred, revenge, and animosity, further dividing communities and perpetuating a vicious cycle of violence.3. Stifled Creativity and Innovation•Lack of tolerance stifles creativity and innovation by discouraging individuals from expressing their unique ideas andperspectives.•It limits diversity in thought and hinders the development of groundbreaking solutions to societal challenges.4. Economic Consequences•Intolerance has negative economic consequences as it deters foreign investments and hampers economic growth.•It creates an environment of uncertainty and instability, discouraging businesses and stifling entrepreneurial endeavors. ConclusionTolerance is not merely about being polite or respectful; it is a fundamental value that underpins the fabric of a positive society. Embracing and practicing tolerance leads to valuable benefits such as social cohesion, conflict resolution, personal growth, and economic prosperity. On the other hand, intolerance brings about negative consequences, including social division, violence, stifled creativity, and economic setbacks. To create a thriving and inclusive society, we must recognize that tolerance must be a positive number, and it is ourcollective responsibility to cultivate and promote a culture of tolerance in all aspects of our lives.。
Time Evolution of Two-Level Systems Driven by Periodic Fields
Although (2) gives a straightforward manner to compute U (t), the series in the r.h.s. is not generally uniformly convergent in time. For practical purposes this gives rise to difficulties when one is interested in the large-time behaviour of the system. For instance, if one considers a periodic Hamiltonian of the form H (t) = m Hm eimωt , with H0 = 0, two successive integrations in (2) would produce a linear term in t. Higher order terms in t would appear with further integrations. These polynomial terms are known as secular terms and they plague the expansion of U (t) in such a way that its uniform convergence is spoiled. Of particular interest is the situation where the Schr¨ odinger equation (1) takes the form i d |Ψ = H1 (t) |Ψ , dt with H1 (t) := ǫσ3 − f (t)σ1 , (3)
High Quality Participant Recruitment
Structured writing
Structured Writing as a Paradigmby Robert E. HornA chapter from Instructional Development: State of the Art edited by Alexander Romiszowski and Charles Dills, Englewood Cliffs, N. J., Educational Technology Publications, 1998IntroductionThomas Kuhn (1962) suggests that "normal science" consists of "research based upon one or more past scientific achievements that some particular community acknowledges for a time as supplying the foundation for its further practice." These achievements were (l) " sufficiently unprecedented to attract an enduring group of adherents away from competing modes of scientific activity," and (2) "sufficiently open-ended to leave all sorts of problems for the redefined group of practitioners to solve." He then states that this is his definition of a paradigm: "Achievements that share these two characteristics I shall henceforth refer to as paradigms." Although Kuhn goes on in the same book to use the word "paradigm" in at least twenty-one distinct meanings (as cataloged by Masterman, 1970), this is the only place where he explicitly defines the term. Others have broadened the meaning of "paradigm" and still others have used the term as a metaphor for "any theory or method or approach, large or small."If any writing or instructional design approach can be called a paradigm within Kuhn's definition, I will claim that structured writing most certainly qualifies. And if Kuhn's concept of paradigm can be metaphorically extended beyond the sciences to the realm of practical methodology of communication, then structured writing surely qualifies there as well. My approach in this chapter will be to describe what I believe to be the salient characteristics of structured writing and to describe the "past achievements" that supply "the foundation for further practice." Then I will demonstrate briefly how these achievements have been "sufficiently unprecedented to attract an enduring group of adherents away from competing modes of scientific activity" and finally to describe some of the sorts of issues in the research and evaluation that structured writing focuses us on today.1. What are Some of the Problems that Structured Writing Addresses?Structured writing has been developed to address many of the perennial problems most people have when working on a complex written communication task. Instructional design certainly qualifies as such a complex task. Some of these perennial problems are: - How should I organize the mass of subject-matter material?-How can I keep track of the structure? How can the reader keep track?-How can I make the structure of the document and the subject matter more obvious? -How do I analyze the subject so that I am sure that I have covered all of the bases?-How do I know the coverage is complete? How will the reader understand this scope? -In large analytic and communication tasks, how do I track multiple inputs, different levels of reader competence and rapidly multiplying and increasingly demanding maintenance requirements?-If I am working in an organization with a large number of writers, how do I provide the plan for a group of writers and how do I manage the group -- efficiently --so that it will appear to the reader that there is a unity or organization, structure, analysis, style, graphic display and format?-How do I sequence the final document so that it will present the information to different levels of readers in the most useful manner?-How do I organize the linkages so that different readers with different backgrounds can get what they want from it easily and quickly?-What formats are optimum to enable users to make sense of the document as a whole and through the window of the current display?-How do we make instructional writing optimally effective and efficient?-These problems are not unique to instructional design. They are addressed one way or another by every person who writes a document. But they are the major issues faced by the paradigm of structured writing. The remainder of this chapter will examine how structured writing helps writers tackle these questions.2. What are Some of the Presuppositions of Structured Writing?In this section I will present several of the major presuppositions of structured writing to provide the background that I used to formulate the paradigm of structured writing.I have used these presuppositions without entering current cognitive science debates as to whether or not we really use some kinds of representations within our minds and brains. Rather, I simply observe that when we communicate, we do use representations. Presuppositions about Subject Matter.I began with what seemed obvious, namely that, since we communicate with each other using physical mediums we have to represent what we do in sentences and images. Thus, any subject matter consists of all the sentences and images used by human beings to communicate about that subject matter. So, with sentences and images, we have all we need to fully analyze a subject matter. I acknowledge that subject matters exist that can only be learned by intense observation, practice and nonverbal feedback (such as an exotic martial arts). I acknowledge the issues raised by Polanyi in his concept of tacit knowledge, i.e. that certain knowledge is learned by observation of fine motor movements and unvoiced values, which go beyond the sentences that represent a subject matter. But I sidestep them. Structured writing only deals with that which can be written. Practical communication in commerce, science, and technology teaches, documents or communicates something. Therefore, I assume that what is important enough to learn is capable of being rendered in sentences (or diagrams).I also assumed that the most important regularities to understand in a subject matter are those that exist between sentences. Many of the studies of language begin and end with the study of words and sentences in isolation. Subject matters are tight relationships between many clusters of sentences and images. So, if we are to analyze subject matters properly (i.e. efficiently and effectively) for communication and training, we must understand the relationships between sentences. Why is it that certain sentences should be"close" to each other in an instructional document in order to convey the subject matter easily to a new learner?Presuppositions about readers. There were a number of assumptions about the users (or readers). I took it as axiomatic that different readers and learners may want to use a given document in a variety of ways. Readers may use any of the following approaches to a given document: scanning to decide whether to read the communication at all, browsing to find interesting or relevant material, analyzing critically the contents, studying to be able to remember the subject matter, etc. And, in general, it is difficult to predict what learners and readers will do with a given piece of instruction or communication. Documents often have hundreds or even thousands of users. Each document has a different interest and relevance to each user. Each must therefore serve many people having many purposes. If possible, it is important to optimize among several functions in the same document.Presuppositions about Writing.When I developed structured writing, I also introduced what turned out to be a fairly radical assumption: A new paradigm in communication and learning requires a new basic unit of communication. Revolutions in paradigms in physical theory have in part come about from the different concepts of the most basic particle (the atom as a singular unit, Bohr model of atom as a subvisible solar system, electrons as rings of probability, to the discovery of subatomic particles, etc.). Revolutions in linguistic theory came about with the invention of grammar as a unit of analysis. The behavioral paradigm in instructional design came after Skinner's invention of the stimulus - response unit. Similarly, the invention of the information block (discussed below) qualifies as a major turning point in the history of the conception of basic units.Most training is not formal training. It does not take place in the classroom with documents called training manuals. It takes place on the job with whatever documentation is at hand. I have heard that only one-tenth of training is formal classroom training. Nine-tenths never gets accounted for in the financial or other reports of a company as training. Thus, in my list of major assumptions is this one: Anything that is written is potentially instructional. Therefore, in so far as possible: A writer should design each communication to potentially be "instructional" even if its ostensible job may be as a memo or a report or as documentation.Another focus on the structured writing presuppositions began is giving importance to the scientific research on how much people forget. We forget, as I am sure you remember, most of what we learn within three weeks of learning it. At that time, I noted that we must build "learning - reference systems" in order to deal with these problems. (Horn, et. al., 1969) . Since then we have used the term "reference based training" (Horn, 1989a) to cover this area. Others have invented the delightful term "just in time training" to cover an essential aspect of this training need. And later I specified the domains of memos and reports as another arena in which writing with instructional properties takes place. (Horn, 1977)With this survey of the assumptions underlying the paradigm, let us take a look at the actual components of the structured writing approach.3. What Are the Components of the Structured Writing Paradigm?My early (Horn, 1965) analyses began with the detailed examination of actual sentences, illustrations, and diagrams that appeared in textbooks and training manuals. My investigation involved trying to establish a relatively small set of chunks of information that are (1) similar in that they cluster sentences (and diagrams) that have strong relationships with each other and (2) that frequently occur in various kinds of subject matter. This analysis focused, thus, on the relationship between the sentences in subject matter. The result of this analysis was the invention of the information block as a substitute for the paragraph. The taxonomy that resulted is now known as the information blocks taxonomy for relatively stable subject matter ( shown in Figure 1).Definition: Information BlocksInformation blocks are the basic units of subject matter in structured writing analysis. They replace the paragraph as the fundamental unit of analysis and the form of presentation of that analysis. They are composed of one or more sentences and/or diagrams about a limited topic. They usually have no more than nine sentences. They are always identified clearly by a label. Three examples of information blocks are shown in Figure 2. Information blocks are normally part of a larger structure of organization called an information map (see below for explanation of maps). In short, they are a reader-focused unit of basic or core parts of a subject matter.Example of an Information BlockWhat do information blocks look like? It is important to notice that different types of blocks vary widely in appearance and construction. For example, below is one of the most simple-looking types of blocks (but one that has standards for construction more stringent than most), a definition block:Definition. The Master Payroll File is a group of records containing all of the payroll and employee information used by the weekly payroll program.How is a Block Different from a Paragraph?Let us examine some of the characteristics of this example of an information block and see how it differs from paragraphs. First, we must note that there is no topic sentence in the information block. Topic sentences are absent or irrelevant in much of structured writing, so much so that they are not taught in a structured writing course.Second, it is worth observing that there is no "nice to know" but irrelevant information in the information block. Note that the only information it contains is information that is relevant to defining the term Master Payroll File. Paragraphs typically have a lot of nice to know information.Third, note that the block has a label. One of the mandatory requirements for blocks is that they always have a distinguishing label, chosen according to systematic criteria (Horn, 1989a). Paragraphs have no such requirement, although they may be randomly labeled, depending upon the taste of the writer.Fourth, All definitions in a given structured document would be consistent with these characteristics. Paragraphs have no requirement for consistency within or between documents. These are some of the main characteristics that distinguish the block from a paragraph.This first example is a very simple block. While this block is one sentence long, many types of blocks contain several sentences. diagrams, tables or illustrations, depending upon their information type. (See Figure 2) Typical blocks are several sentences in length, and might contain different kinds of tables: Diagrams comprise other kinds of information blocks.How Does Structured Writing Handle Cohesion and Transition?There is no "transitional" information in the information block, but principles for writing prose encourage or require it. The need for coherence, cohesion, and transition is handled in a completely different manner in structured writing. While this is a huge topic, (Halliday and Hasan, 1976) suffice it to say that much of the burden of coherence is placed on the labeling structure and much of the transition requirement is placed on one type of block, the introduction block, that frequently appears at the beginning of information maps.The Four PrinciplesAll information blocks are constrained by four principles used to guide structured writing:The first of these is the chunking principle. It derives from George Miller's basic research (Miller, 1957. See also note 2) which suggest that we can hold only 7 plus or minus 2 chunks of information in human short-term memory. Our formulation of the principle states: Group all information into small, manageable units, called information blocks and information maps. Small (in information blocks) is defined as usually not more than 7 plus or minus 2 sentences. While others lately (e.g. Walker, 1988) have recommended modularity (i.e. dividing information into labeled chunks) as a principle of structured writing, I have insisted that information blocks turn out to be "precision modularity" (Horn 1989a, 1993) because of the operation of three other principles with the chunking principle and because I believe we have shown that blocks sorted using our taxonomy (see below) offer much greater efficiency and effectiveness of composition and retrieval.The second principle we use in helping to define the information block is the labeling principle. It says: Label every chunk and group of chunks according to specific criteria. It is beyond the scope of this paper to get into all of these criteria. They consist of guidelines and standards some of which cover all blocks, some of which cover only specific types of blocks or even parts of blocks. I have claimed elsewhere (Horn, 1992a) that it is the precise specification of different kinds of blocks that permits the identification of context and limits for these criteria, thus saving them from being bland and overly abstract, and, thus, largely useless, guidelines.The third principle used in developing the information block is the relevance principle. It says: Include in one chunk only the information that relates to one main point, based upon that information's purpose or function for the reader. In effect it says, if you have information that is nice to know, or contains examples or commentary, the relevance principle demands that you put it some place else and label it appropriately: but do not put it in the definition block.The fourth principle is the consistency principle. It says: For similar subject matters, use similar words, labels, formats, organizations, and sequences.Answering Some Objections to BlocksSome have commented that information blocks are not particularly unique or novel. They say, for example, that information blocks are only what paragraphs, when written properly, should be. I have answered many of these claims elsewhere (Horn, 1992a). If extraneous information is excluded from an information block (as it should be, following the relevance principle) the discourse is changed radically. If the materials for cohesiveness and transition in paragraph-oriented writing are put into the labels and if the labeling system is relevant and consistent, the appearance and usefulness of the whole piece of writing is changed tremendously. If the subject matter is divided into appropriate-size of chunks (using the chunking principle and the taxonomy of information blocks), the form of discourse is changed decisively. If all of these changes are made together in the same document, the text usually has much less intertwined prose with multiple threads and allusions. It is a far more usable text to scan, to read and to memorize.Blocks by Themselves Qualify Structured Writing as a Paradigm.By itself, the invention of the information block might qualify structured writing as a separate and new paradigm for analysis and writing. But we used these new distinctions to build a powerful analytic tool for gathering information about and specifying the subject matter in instructional or documentation writing. Simply revising the basic unit has radically shifted the rhetoric of exposition in the documents in which structured writing is used.Topic - Block Matrix of a Subject MatterTo aid analysis of subject matter for instructional and documentation purposes, I conceptualized the subject matter as a topic - block matrix, shown schematically in Figure 3.The reader will note that the topics from the subject matter are arranged along the top of the matrix. This is done according to a group of guidelines provided as a part of the structured writing system. The block types (see Figure 1 for list) are arranged along the vertical axis. The resulting cells in the matrix represent information blocks into which thesentences and diagrams from the subject matter are placed. Examination of the blank spaces show the analyst what information may still be not written down and hence perhaps not known. Specific templates have been developed which permit the analysts to know with a high degree of certainty which block should be filled in for a specific topic. An example of this would be a template which would specify these three block types for a concept: definition, example, and (optionally) non-example.Systematic LabelingAnother key component of structured writing was the development of a system of consistent labeling of parts of a document. Obviously labeling is not unique to structured writing. Many books follow a more or less systematic labeling guideline. But when combined with the new units of communication, the information block and the information map, the systematic labeling becomes a powerful communication device. In a recent article (Horn, 1993) I summarized the benefits of such a systematic approach to labeling. Systematic labeling:- enables readers to scan content to see what they want to read- enables readers/learners to to find what they are looking for in a consistent, relevant, complete manner;- enables the analyst/writer to manage the intermediate stages of information gathering and analysis in a more efficient way;-enables learners to anticipate learning problems by showing the structure of the subject matter to them."-Definition: Information MapsInformation Maps are a collection of more than one but usually not more than nine information blocks about a limited topic. In general, one can think of an information map as approximately one to two pages in length, but some maps (of certain well-specified types) run several pages in length and some maps are composed of only one information block. Maps both (1) aid the writer in organizing large amounts of information during the analysis phase and (2) help the reader to understand the structure of the subject matter and the document. Maps may be sequenced hierarchically or in other clearly defined ways such as task or prerequisite order. Maps are assembled during the sequencing phase of the writing process, into parts, sections, chapters and documents depending upon communication purpose and reader needs. (For an example, see Figure 4.)Discourse DomainsCommunication in business takes place in some fairly routinized forms. This fact enables us to identify some major domains of discourse. We begin such an analysis by asking questions about specific domains such as: How does a report of a scientific experiment differ from a sales presentation or a policy manual? They differ in many ways. They differ as to who the authors are, how the authors have come to know the subject matter, what can be assumed about the audience of the communication, what level of detail is used, what content is communicated.In addition to the "what are the differences" questions, we can ask the "what are the similarities" questions. How are all reports of scientific experiments alike? How are all sales presentations alike? The analysis of these similarities and differences is what is called domain analysis in structured writing. It involves examining the relationships between author and reader of different kinds of documents and the "stances" and points of view that can be seen as a result. This analysis yields specific block types that can be expected in specific kinds of documents. The domain of relatively stable subject matter has already been introduced in this chapter as the one that comprises the subject matter used in training and documentation writing (see also below).So, in the Information Mapping method, a domain of discourse is defined as the specification of information blocks of a particular class of documents, all of which share the same type of author-reader assumptions and the same stance or point of view towards subject matter.Some examples of domains of discourse (Horn, 1989a) have been studied extensively. They are:- the domain of relatively stable subject matter, which is that domain of subject matter which we think we know well enough to teach it in a course or write an introductory training material about it.-the domain of disputed discourse, which is that subject matter about which we know enough to chart its disagreements.-Other domains such as those of business report and memo writing have been studied (Horn, 1977) Still others remain to be carefully identified and mapped.Information TypesBlocks in the domain of relatively stable subject matter can be sorted into seven basic classifications, which we call the "information types."The seven information types are:- Procedure- Process- Concept- Structure- Classification- Principle- FactThis is a key set of categories for specifying and describing how human beings think, especially about what we have called relatively stable discourse domains. Structured writing guidelines have been developed that permit the information blocks to be assigned to one or more of these information types. An example would be the assignment of definitions and examples to the information type "concept" or the assignment of a flowchart to the information type "procedure." This permits the identification of what has come to be called "key block" information, the information which you must have to fully analyze an individual topic of a subject matter. Key blocks enable writers to anchor their writing firmly and reliably to the centrally important structure of a subject matter. (For further information, see Horn, 1989a, Chapt. 3)The information types theory is used to help the analyst/writer identify specific information that is needed for each topic. These information-type templates specify the key information blocks needed to ensure completeness and accuracy of the analysis. Systematic Integration of GraphicsFrom its conception, the structured writing paradigm recognized the importance of graphics (illustrations, diagrams, photographs) as an integral part of any writing with a practical purpose. This meant that we had to specify exactly where such graphics would communicate better than words by themselves. And this led to the identification of specific blocks within the overall scheme which are required to have some kind of graphic, because the communication was likely to be better than if the same message were conveyed only by words. This is also paradigmatic change. Certainly in the past words and images had been used together. But graphics were regarded either as a "tacked on" afterthought or as decorations, not as a mandatory and integral part of the message. (see Horn, 1993 for a fuller treatment of this point).Systematic FormattingMuch reading in the Age of Information Overload is actually scanning. We must continually identify that which we don't have to read. We are always looking for the salient parts. This makes the requirement for aiding scanning paramount in the specification of formatting. There have been a variety of formats identified that meet this criteria. Structured writing is most often associated with a single format: that of having the map title at the top and the block labels on the left-hand side of the page. But this is only one of the many possible formats of structured writing that aid scanning (see Horn, 1989 for others). The topic of formatting is also the one that has produced the most confusion about structured writing. Many people have observed only the strongly formatted versions of documents written according to the analytic methods of structured writing and have concluded that "it is only a format." Since the analysis and structuring of the document is part of the process of producing the document, much of the highly disciplined thinking that goes into producing the documents is not immediately visible. But the number of trained writers of structured writing has grown to over 150,000 world-wide, and the discernment that something more than format goes into structured writing has gradually become the norm rather than the exception.Systematic Life Cycle Approach to Document DevelopmentDocumentation and training materials often last a long time. The amount of time from drafting to final discard of a document can be years and sometimes decades. Many business documents are frequently revised and updated. This means that a methodology for writing must have in place a facility for rapid revision and updating as well as cost-effective initial development. The structured writing paradigm has made paradigmatic changes in how documents can be updated and revised. Because the basic units of organization, the information blocks, are easily isolatable from each other (unlike other paradigms of writing and formatting), they can be much more easily removed, changed or replaced. Previous and more literary rhetorics provide a great deal of difficulty to the writer managing the life cycle of the document, because such literary rhetorics have an intricate and highly interwoven approach to organization. Managers involved in preparing foreign language translations also report major efficiencies of translation because of the simplification of rhetorical structure. Needless to say that the structured writing approach propagates rapidly in a business environment in which costs of publication are closely watched.What Structured Writing Shares with Other ParadigmsNot all of the components of structured writing are novel. Such total novelty is not a requirement for a paradigm. Structured writing shares the use of words and sentences with other forms of writing. Many of the conventions and rhetorical guidelines for good, clear writing of sentences are incorporated without change. Moreover, when serving a purely instructional function, most of the guidelines regarding the design of practice exercises, tests, criteria-based instruction, etc. are used wholesale.Behavioral research from instructional design, such as Merrill's and Tennyson's (1977) work on teaching concepts, has also been incorporated into the structured writing paradigm. This research serves to strengthen the instructional properties of documents whose initial or primary use is not instructional, but which at some time in the life cycle of the document must provide formal or informal training. Moreover, as another example, much of the collection of research-based design imperatives in Fleming's and Levie's (1978) work on message design supported and strengthened the research foundations of the structured writing paradigm, as have many individual pieces of research since then. 4. What Makes Structured Writing "Structured?"One of the claims of structured writing is that there exist particular dimensions along which technical or functional writing can be described (if observing it from the point of view of an outside observer) or composed (if attempting to develop some document using it). It seems to me that we can describe several scales of structure or dimensions along which a piece of prose would be placed. Some of these scales are:The Chunking Scale. This scale might be described by the question "to what degree can the between-sentence units be clearly chunked into separate chunks, each of which serves only one purpose?"。
入离转调的英文表达
入离转调的英文表达《Entry, Departure, and Transfer》Entry, departure, and transfer are all critical stages in any aspect of life. These stages can apply to various scenarios such as entering or leaving a country, a job, a relationship, or even a mindset. Each stage requires careful consideration and preparation to ensure a smooth transition and successful outcome.When it comes to entry, one must make an effort to adapt to new environments. Whether it's a new job, a new country, or a new relationship, there's always a period of adjustment. It involves learning the rules, understanding the culture, and building rapport with others. It's an opportunity to start afresh and embrace new opportunities.On the other hand, departure signifies leaving behind something familiar. This could be a job, a place, or even a relationship. It's a time for reflection, gratitude, and bidding farewell. Departure can be bittersweet, as it involves letting go of the past and preparing for new beginnings.Transfer, on the other hand, refers to the process of moving from one place to another, be it physicalor psychological. This can include changing jobs, moving to a new city, or transitioning from one state of mind to another. Transfer involves careful planning, organizing, and adapting to change.In essence, entry, departure, and transfer are interconnected stages that link moments of change. These stages can be challenging, exciting, and transformative. Each signifies a transition from one phase of life to another, offering opportunities for growth, learning, and self-discovery. It's important to approach these stages with an open mind and a willingness to embrace new opportunities and experiences.。
CAP超详细名词解释
CAP超详细名词解释⽬录引⾔2000年7⽉,加州⼤学伯克利分校的Eric Brewer教授在ACM PODC会议上提出CAP猜想。
2年后,⿇省理⼯学院的Seth Gilbert和Nancy Lynch从理论上证明了CAP。
之后,CAP理论正式成为分布式计算领域的公认定理。
概述CAP 理论对分布式系统的特性做了⾼度抽象,形成了三个指标:⼀致性(Consistency)可⽤性(Availability)分区容错性(Partition Tolerance)CAP定理我们常见的描述是⼀个分布式系统最多只能同时满⾜⼀致性(Consistency)、可⽤性(Availability)和分区容错性(Partition tolerance)这三项中的两项。
但布鲁尔在提出 CAP 猜想的时候,并没有详细定义 Consistency、Availability、Partition Tolerance 三个单词的明确定义。
因此如果初学者去查询 CAP 定义的时候会感到⽐较困惑,因为不同的资料对 CAP 的详细定义有⼀些细微的差别。
这⾥不展开讲述这些细微差别,本⽂选取Robert Greiner的⽂章来作为参考。
需要注意CAP Theorem: Explained这篇⽂章已经被标明outdated(已过时)。
⽹上说的⼤多数定义是根据第⼀篇⽂章来的。
下⾯带你逐步理解CAP定理中分布式,⼀致性,可⽤性,分区容错性这4个关键词。
分布式我们都知道CAP定理说的是分布式系统下的定理,但是分布式系统有很多类型,有异构的,⽐如节点之间是上下游依赖的关系,有同构的,⽐如分区/分⽚型的、副本型的(主从、多主)。
那么CAP说的分布式包含以上所有吗,⽹上很少有提到CAP说的分布式系统是否有具体的类型。
在CAP Theorem: Explained⽂中描述CAP定理⽤的是Distributed systems,也没有具体指明CAP理论下分布式系统的特征,但是在CAP Theorem: Revisited是这样描述CAP的:in a distributed system (a collection of interconnected nodes that share data), you can only have two out of the following three guarantees across awrite/read pair: Consistency, Availability, and Partition Tolerance - one of them must be sacrificed.翻译过来就是:在⼀个分布式系统(指互相连接并共享数据的节点的集合)中,当涉及读写操作时,只能保证⼀致性(Consistence)、可⽤性(Availability)、分区容错性(Partition Tolerance)三者中的两个,另外⼀个必须被牺牲。
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
- AAPM QA Task Group Review
September 11, 2015 Changsha, China
Jie Shi, Ph.D. Board Member at Large, NACMPA, Sun Nuclear Corporation
- Quality Assurance and Outcome Improvement SC
Work Group on Clinical Trials Working Group on Recommendations for Radiotherapy External Beam
Quality Assurance (TG-142, TG-119, TG-218) Work Group on Information Technology (TG-201) Work Group on Prevention of Errors in Radiation Oncology (TG-100)
(%)
gamma passing
rate Vendor D
Not acceptable
0.5-0.6 > 0.6
1.5-2.0 > 2.0
5-10% > 10%
Table 7.5 Criteria for acceptability of gamma evaluations of pretreatment verification of IMRT beams (from Stock et al., PMB, 2005).
Standard
Maximum Minimum
Deviation (σ)
MultiTarget Prostate Head and Neck CShape (easier) CShape (harder) Overall combined Confidence limit = (100 – mean) + 1.96σ
A
B
C
D
E
F
H
Measurement
Diode Diode EPID
Diode Diode Film
Diode
device
array
array
array
array
array
Mean
98.9
93.3
99.4
99.2
98.6
99.6
96.8
Stan1.5
0.3
2.5
deviation(σ)
Local confidence 3.9
9.5
1.3
3.4
4.3
1.0
8.1
limit
(96.1%) (90.5%) (98.7%) (96.6%) (95.7%) (99.0%) (91.9%)
(100 – mean) +
1.96σ
Number of studies
5
5
5
5
4
4
5
Table 14 Per field measurements: Average percent of points passing the gamma criteria of 3%/3 mm, averaged over the test plans, with associated confidence limits
IMRT QA Tolerance
- TG-119 (2009)
IMRT Commissioning: Multiple Institution Planning and Dosimetry Comparisons, A Report from AAPM Task Group 119
Test
Mean
IMRT QA Tolerance
- AAPM TG-218 (to be published)
Tolerance Limits and Methodologies for IMRT Verification QA: A Report of the AAPM TG-218
gamma
passing Diff
14 years after the tragedy
September 11, 2001
August 3, 2015
AAPM Organization Trees for QA
- Board of Directors
- Science Council
- Therapy Physics Committee
Dose
DTA
Dose
Gold
rate (%)
Tolerance Tolerance Threshol standard Vendor
(%)
(mm)
d (%)
(%)
A (%)
gamma
passing Diff
rate
(%)
Vendor B
(%)
gamma
passing Diff
rate
(%)
Vendor C
IMRT QA Tolerance
- TG-119 (2009)
IMRT Commissioning: Multiple Institution Planning and Dosimetry Comparisons, A Report from AAPM Task Group 119
Institution
97.8
3.5
99.8
90.8
98.6
2.4
100
93.3
98.1
2.0
100
94.2
97.4
2.8
99.8
93.0
97.5
2.6
99.9
94.0
97.9
2.5
7.0 (i.e. 93.0% passing)
Table 13 Per field measurements: Average percent of points passing the gamma criteria of 3%/3 mm, averaged over the institutions, with associated confidence limits
IMRT QA Tolerance
- ESTRO Booklet #9 (2008)
GUIDELINES FOR THE VERIFICATION OF IMRT
Approach
Average gamma
Max gamma P>1
Acceptable < 0.5
< 1.5
0-5 %
Need further evaluation