Estimation of N for the Two-Scale Gamma Raindrop Size Distribution Model and Its Statistical Properties
The Arithmetic Mean Newton's Method in English
Arithmetic-Geometric Mean Newton's Method. The arithmetic-geometric mean (AGM) Newton's method is an iterative algorithm used in numerical analysis to approximate the solution of equations, particularly those involving transcendental functions. This method is a variant of the classical Newton's method, which uses the tangent line to the function at a given point to approximate the root of the function. The AGM Newton's method incorporates the arithmetic-geometric mean (AGM) iteration, which is itself a rapidly (quadratically) converging iteration. Background on Newton's Method: Newton's method is based on the Taylor series expansion of a function. Given a function f(x) and its derivative f'(x), the method starts with an initial guess x_0 and iteratively updates the approximation using the formula: x_{n+1} = x_n - f(x_n) / f'(x_n).
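As a concrete illustration of the update rule just stated, here is a minimal Python sketch of the classical Newton iteration; the target function, derivative, starting point, and tolerance are illustrative choices, not part of the text above.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Classical Newton iteration: x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:      # stop once the update is negligible
            break
    return x

# Example: the positive root of f(x) = x^2 - 2, i.e. an approximation of sqrt(2)
print(newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0))
```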
English-language literature
Antonio ArmillottaGiovanni MoroniDipartimento di Meccanica,Politecnico di Milano,Via La Masa1,20156Milano,ItalyWilma Polini Dipartimento di Ingegneria Industriale, Universitàdegli Studi di Cassino,Via Di Biasio43,03043Cassino(FR),Italy Quirico SemeraroDipartimento di Meccanica,Politecnico di Milano,Via La Masa1,20156Milano,Italy A Unified Approach to Kinematic and Tolerance Analysis of Locating FixturesA workholdingfixture should ensure a stable and precise positioning of the workpiece with respect to the machine tool.This requirement is even more important when modular fixtures are used for the sake of efficiency and reconfigurability.They include standard locating elements,which set the part in a predefined spatial orientation by contacting its datum surfaces.In the computer-based design of afixture,the layout of locators must be tested against two main sources of problems.Kinematic analysis verifies that any relative motion between the part and the worktable is constrained.Tolerance analysis evaluates the robustness of part orientation with respect to manufacturing errors on datum sur-faces.We propose a method to carry out both tests through a common set of geometric parameters of thefixture configuration.These derive from the singular value decompo-sition of the matrix that represents positioning constraints in screw coordinates.For a poorly designedfixture,the decomposition allows us tofind out either unconstrained degrees of freedom of the part or a possible violation of tolerance specifications on machined features due to geometric errors on datum surfaces.In such cases,the analysis provides suggestions to plan the needed corrections to the locating scheme.This paper describes the procedure for kinematic and tolerance analysis and demonstrates its sig-nificance on a sample case offixture design.͓DOI:10.1115/1.3402642͔Keywords:fixture design,kinematic analysis,tolerance analysis,screw theory1IntroductionModularfixtures are the key to exploit the inherentflexibility and reconfigurability offlexible manufacturing systems.They are built from standard components that are readily mounted on a base plate and easily adapted to changing part types and sizes͓1͔. The layout offixture components is customarily designed around computer aided design͑CAD͒descriptions of workpieces with the help of3D catalogs͓2͔.Integrated software support to this task is pursued through the extraction of geometric information from CAD models in order to simulate the kinematic,static,and dy-namic behaviors offixtures͓3͔.Basically,afixture constrains the relative motion between part and worktable by two different mechanisms:–deterministic positioning,which sets a spatial orientation of the part by form closure;–total restraint,which allows part orientation to be maintained by force closure during machining operationsIn this paper,we focus on deterministic positioning,with the aim of proposing a method to check the correctness of geometric constraints imposed to the workpiece.Thefirst requirement to be satisfied by the system of constraints is of a kinematic type:The part cannot be allowed to move in any way relative to thefixture. 
Possible residual degrees of freedom͑DOFs͒in part motion must be detected in order to allow new constraints to be added.The second requirements for a kinematically correctfixture are related to precision:Tolerances on machined features must be satisfied despite manufacturing errors on bothfixture and workpiece.In literature,a kinematic analysis offixtures has been dealt with by description models of feasible motions for constrained rigid bodies.The objective is to check whether a given layout offixture components constrains all DOFs of part motion.Some approachesderive from early research topics of geometric modeling,such assymbolic spatial relationships͓4͔and spatial occupancy represen-tations͓5,6͔.Apart from them,most studies rely upon a commondescription of motion constraints based on the screw theory ofkinematics͓7͔.We recall its basic results in Sec.2of this paper.Based on previous applications of the theory to the study ofmechanisms͓8,9͔,earlier attempts to use its basic results in thecontext offixtures have led to a compact formulation,which ismore easily applied to real cases and implemented in a CADenvironment͓10,11͔.Although not explicitly citing the screwtheory,other studies have proposed a similar description,high-lighting new properties useful for modelingfixture kinematics ͓12–14͔.A similar approach has been recently applied to the analysis and optimization offixturing schemes with redundantconstraints͓15,16͔.In Ref.͓17͔,a mathematical procedure is pro-posed to analyze kinematically unconstrainedfixtures and calcu-late residual degrees of freedom for the workpiece.Solving thelatter problem is critical to allow corrective actions to a poorlydesignedfixture.As treated in Sec.3,we develop a different kindof manipulation on the screw-based description to achieve thesame objective.A kinematic analysis is not sufficient to ensure precise position-ing.Errors onfixtures and part surfaces cause uncertainty onmachine-workpiece referencing parameters,which can result inthe stack-up of manufacturing errors.To control these deviations,afixture layout should be carefully chosen according to part ge-ometry and tolerances.Some studies have proposed guiding rulesand algorithms based on tolerance charting techniques to selectpositioning surfaces on the workpiece in order to control tolerancestacks on functional dimensions͓18–20͔.To compare alternative fixture configurations,Ref.͓21͔investigates on precision issues related to different types offixture components and provides rules to evaluate their combined effect on positioning uncertainty.In Ref.͓22͔,the uncertainty propagation problem is addressed by introducing probabilistic terms in the calculation of workpiece-machine transformation from contact points.Calculation proce-Contributed by the Computational Metrology/Reverse Engineering Committee ofASME for publication in the J OURNAL OF C OMPUTING I NFORMATION S CIENCE AND E NGI-NEERING.Manuscript received February26,2008;final manuscript received March11,2010;published online June8,2010.Assoc.Editor:A.Fischer.Journal of Computing and Information Science in Engineering JUNE2010,Vol.10/021009-1Copyright©2010by ASMEdures on the constraints description based on the screw theory have also been proposed to address tolerance analysis.They esti-mate either displacements in selected points on the workpiece ͓23–26͔or geometric errors on machined features ͓27–29͔as a result of fixture errors.Most of these approaches also include the search for a minimum-error layout of the fixture:at a lower level of 
computer support,guidelines for this design task have been proposed in Ref.͓30͔.In Sec.4,we demonstrate a method to detect possible conditions on the fixture layout in which error propagation from fixture to machined features can be critical with respect to tolerance specifications on the part.The solution we propose is based on a unified approach for the two subproblems of kinematic and tolerance analyses.It consists of a simple calculation procedure based on the description of po-sitioning constraints according to the screw theory.The output of the procedure allows us to validate the configuration of a fixture by detecting either a possible lack of constraints or negative ef-fects on machining accuracy due to part-fixture -pared with existing approaches,we attempt to streamline the analysis of deterministic positioning by using a reduced set of parameters easily extracted by available geometric data.A discus-sion of an application example in Secs.5and 6will allow us to better clarify the types of decisions that can be supported by the method.2Description of Positioning ConstraintsA fixture holds a workpiece in a given spatial configuration ͑position and orientation ͒relative to the reference frame of a ma-chine tool.This task includes two different functions.–Positioning :Remove all DOFs of part motion and allow each part of a batch to assume the same configuration within a given tolerance.–Clamping :Withstand forces acting on the part during the machining process without excessive deformation and vibration.In a modular fixture,positioning is usually done before clamp-ing by means of highly accurate fixture components called loca-tors .As shown in Fig.1,they are grouped into a limited number of functional types ͓31,32͔:–support pins and blocks with flat,round,conical,or vee shape,in contact with external resting surfaces of the part –sleeved support pins and blocks,providing both vertical sup-port and side positioning–locating pins,horizontal flat or vee blocks,providing only side positioning on lateral surfaces–center pins,in contact with surfaces of holes and other inter-nal featuresIn most cases,contact between locators and parts occurs on either a point,a straight line segment,or a planar surface area.Line and surface contacts constrain part movement more than point contacts do,and each of them can be replaced by two or more kinematically equivalent point contacts,as shown in Fig.2͑a ͒.However,it is not safe to rely on this property in the pres-ence of a small contact length or area,which is better approxi-mated by a simple point contact ͑Fig.2͑b ͒͒.A proper number of equivalent point contacts can align the part to the reference frame of the worktable.The completeness of such alignment is often related to the number of DOFs of part move-ment that are restricted by the fixture.Since a rigid body has 6DOF ͑translations and rotations along the x ,y ,and z axes of the machine reference frame ͒and each of them can have either sense,12“bidirectional”DOFs are conventionally considered.A basic condition for a deterministic positioning test could check that a given number ͑say,nine ͒of these DOF is restricted by the loca-tors.However,such criterion would not work whenever locators restrict translations or rotations along directions not parallel to x ,y ,and z axes.A more general condition,which will be assumed throughout the paper,is the following:Provided that part is held in contact with locators,translation and rotation along any direc-tion must be restricted ͑with the only exception of 
rotations along axes of fully symmetric parts ͒.The screw theory provides an effective representation of geo-metrical constraints on the part due to point contacts with locators.Each contact is defined by the direction of the reaction force at the locator.Any set of forces and couples is equivalent to a force f and a couple c along the same direction:They can be joined in a wrench w ,defined by either f and the pitch h =c /f .The direction of f can be expressed in line coordinates by the column vector͓x ,y ,z ,x ,y ,z ͔T͑1͒where ͑x ,y ,z ͒and ͑x ,y ,z ͒are the force itself and its mo-ment about the origin of the coordinate system xyz ,and x x +y y +z z =0.Similarly,the wrench can be expressed in screw coordinates by the vectorw =͓x ,y ,z ,x −h x ,y −h y ,z −h z ͔T͑2͒The reaction force at the i th frictionless point constraint is directed along the normal to the contact surface and can be represented by a wrench with a zero pitchw i =͓xi ,yi ,zi ,xi ,yi ,zi ͔T͑3͒where conventionally xi 2+yi 2+zi 2=1.Then,the 6ϫp matrix of contact wrenchesW =͓w i ͔͑4͒represents the constraints at the p locators of the fixture.ItisFig.1Types of modular locatingelementsFig.2Point contacts kinematically equivalent to line and plane contacts021009-2/Vol.10,JUNE 2010Transactions of the ASMEusually referred to as the locating matrix and can be used to check the deterministic positioning of the part.The six equations of translational and rotational equilibrium under an arbitrary set of forces are expressed by the matrix equationWF=−w E͑5͒where F=͓f1,...,f p͔T represents the constraint reactions and w E is the wrench of the resultant of external loads acting on the part. If W has rank6,Eq.͑5͒has a unique solution F,which means that any external action is balanced by a sum of reaction forces anddoes not cause any displacement of the part.If the rank of W is less than6,Eq.͑5͒has no solution whenever w E does not belong to the range of W͓10,11͔.An equivalent condition,based on the rank of the Jacobian matrix associated with the constraints,is proposed in Ref.͓12͔and applied in Ref.͓13͔:It can be shown that the Jacobian matrix is the transpose of the locating matrix as defined before.As a consequence of this property of W,the sim-plest way of obtaining the deterministic positioning of a part is through six locators.Less constraints fail to locate the part,while more are redundant.A similar problem at a reduced dimension is planar determinis-tic positioning,where reaction forces at locators are parallel to the xy plane,and only planar motion of the part is allowed.The test condition is based on the3ϫp matrix defined as in Eq.͑4͒,withw i=͓xi,yi,zi͔T͑6͒andxi2+yi2=1.Deterministic positioning is accomplished if at least three equivalent point contacts are used and W has rank3. 3Kinematic AnalysisAlthough the rank of the locating matrix is an index of deter-ministic positioning,it does not allow full kinematic characteriza-tion of afixture.Specifically,–it does not explain the cause of a nondeterministic position-ing,nor does it suggest any corrective action on thefixture design;–even in the full-rank case,it does not guarantee that part positioning is unaffected by manufacturing errors on parts and locators.Some properties of the above description of positioning con-straints can help to fully exploit its information content.For this purpose,we propose a method based on a matrix factorization technique known as singular value decomposition͑SVD͓͒33͔. 
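Before turning to the SVD, the following numpy sketch makes Eqs. (3) and (4) concrete: it assembles a locating matrix column by column (unit contact normal stacked on its moment about the origin, i.e. a zero-pitch wrench) and checks the rank-6 condition for deterministic positioning. This is an illustrative sketch, not the authors' implementation; the 3-2-1 layout on a unit cube is an assumed example.

```python
import numpy as np

def locating_matrix(points, normals):
    """6 x p locating matrix W of Eq. (4): each column is the zero-pitch wrench
    of a frictionless point contact, [n; p x n] in screw (line) coordinates."""
    cols = []
    for p, n in zip(points, normals):
        n = np.asarray(n, float)
        n = n / np.linalg.norm(n)                        # unit contact normal
        cols.append(np.concatenate([n, np.cross(np.asarray(p, float), n)]))
    return np.array(cols).T

# Assumed 3-2-1 scheme on a unit cube: three supports under the base,
# two side locators on the y = 0 face, one on the x = 0 face.
points  = [(0.2, 0.2, 0), (0.8, 0.2, 0), (0.5, 0.8, 0),
           (0.2, 0, 0.5), (0.8, 0, 0.5), (0, 0.5, 0.5)]
normals = [(0, 0, 1)] * 3 + [(0, 1, 0)] * 2 + [(1, 0, 0)]

W = locating_matrix(points, normals)
print(np.linalg.matrix_rank(W))   # 6 -> deterministic positioning
```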
The SVD of an mϫn matrix A isA=USV T͑7͒where U is an mϫm orthogonal matrix,V is an nϫn orthogonal matrix,and S=diag͑1,...,k͒is an mϫn matrix with elements iՆ0such that k=min͑m,n͒.Thei are the singular values of A,while the columns of U and V are,in turn,the left and right singular vectors of A.The SVD is especially helpful in solving ill-conditioned sets of linear equations in the form Ax=b.The rank of A equals the number of nonzero singular values.The columns of U correspond-ing to theiϾ0are an orthonormal basis for the range of A, while the columns of V corresponding to thei=0are an ortho-normal basis for the null space of A.Low values of somei may denote a linear dependency among equations,which can befixed by setting the lowi to zero.These properties have suggested several uses of the SVD in the solution of linear regression problems by the least-squares method and in other matrix manipulation problems͓34,35͔,as well as in the analysis of kinematic and dynamic properties of robot manipu-lators͓36͔.In our problem,since W is the coefficient matrix of the set͑Eq.͑5͒͒of equilibrium equations,the SVD is likely to be a better tool for checking deterministic positioning than simple rank inspection.Specifically,it allows us to draw additional infor-mation on motion constraints when W is rank deficient.A singularity of W means that the part is not positioned in a well defined spatial configuration.That is,the part can translate or rotate from the desired position although keeping contact with locators.Therefore,all DOFs of the guided movement of the part need to be determined in order to make corrections to the design of the locatingfixture.The problem can befirst solved in the xy plane,where deter-ministic positioning requires three contact points.The SVD of W provides its rank r,equal to the number of nonzero singular val-ues.If r=3͑Fig.3͑a͒͒,we have correct positioning.Otherwise, the part could either rotate about the z axis͑Fig.3͑b͒͒or translate along some direction in the plane͑Fig.3͑c͒͒.To recognize the two cases,we can build a translation subma-trix W T from thefirst two rows of W.W T is associated with a set of equilibrium equations similar to Eq.͑5͒,where couples and rotations are not considered.We can now apply the SVD to W T, thusfinding its rank r T.If r T=2,any set of external forces is balanced by the con-straints,and the part cannot translate;then,the residual DOF is a rotation about the z axis.This is the case depicted in Fig.3͑b͒, where the normals to part surfaces at the locators’contact points meet at a center of instantaneous rotation.In such a condition,the fixture only allows small rotations͑arbitrarily close to zero if part boundaries are perfectly straight lines͒,yet sufficient to hinder deterministic positioning.If r T=1,the residual DOF is a translation,and we canfind the motion direction from the base of the range of W T,given by the left singular vector corresponding to its only nonzero singular value.In fact,the range of W T is the set of the resultants of the external forces acting in the plane that do not affect the transla-tional equilibrium of the part.In the latter case,possible resultants can have only one direction,whose unit vector is given͑regard-less of the orientation͒by the base of the range:This direction is perpendicular to the unconstrained translation.In the example of Fig.3͑c͒,the base of the range of W T is the unit vector of the y axis,resulting in a translational DOF along the x axis.In the three-dimensional case,similarly,we apply the SVD to W 
and,if it is rank deficient,to the translational submatrix W T, including thefirst three rows of W.From the inspection of singu-lar values,we get the ranks r and r T of the matrices W and W T, with rՅ5and r TՅ3.With six locators,these two parameters provide information on the residual DOF of the part.–If r TϽ3,the part has3−r T translational DOF,as we can infer by a similar consideration to those applying on the2D case about the set of the translational equilibrium conditions.–If r−r TϽ3,the part has3−͑r−r T͒rotational degrees of freedom.The two above conditions can be satisfied simultaneously since the part could translate and rotate at the same time.However,six distinct contact points guarantee that rՆ2and r TՆ1,which means at most four total DOFs,not more than two translational. Figure4shows sample locatingfixtures for a prismatic work-piece,representative of all applicable combinations of r and r T. Each configuration includes six locators in contact with part sur-faces͑locators denoted with“2”are in contact with parallelsur-Fig.3Examples of planar locating schemesJournal of Computing and Information Science in Engineering JUNE2010,Vol.10/021009-3faces and may have either coincident or opposite normals ͒.The first case ͑Fig.4͑a ͒͒corresponds to a 3-2-1scheme with determin-istic positioning.In the other cases,the part is allowed one or more DOF,which are calculated from the properties of some sub-matrices of W .If r T Ͻ3,we can determine the ϱ2−r T free directions of transla-tion,as in the 2D case,from the range of W T :–If r T =2,the base of the range of W T includes two orthogonal unit vectors defining a plane perpendicular to the single translation direction ͑Figs.4͑c ͒,4͑e ͒,4͑g ͒,and 4͑i ͒͒.–If r T =1,the base of the range of W T consists of a single unit vector,whose normal plane contains a set of feasible trans-lation directions ͑Figs.4͑h ͒and 4͑j ͒͒.If r −r T Ͻ3,we need to solve the problem of deterministic po-sitioning in the plane to find the unconstrained rotation axes.For example,to detect a rotation about the x axis,we can build the submatrix W x from the rows of W associated with the equations of equilibrium to either translation along y and z and rotation about x ͑the second,the third,and the fourth one ͒.If the rank of W x is less than 3and is equal to that of its first two rows ͑corre-sponding to translations ͒,x is a free rotation axis.In this case,we have ϱ2−͑r −r T ͒feasible rotation directions.Specifically,–if r −r T =2,there is a single rotation axis ͑Figs.4͑b ͒,4͑e ͒,and 4͑h ͒͒;–if r −r T =1,there is a set of rotation axes,defined by two orthogonal directions ͑Figs.4͑d ͒,4͑g ͒,and 4͑j ͒͒;–r −r T =0,any direction is a feasible rotation axis;in this case,all the contact normals converge in a single rotation center for the part ͑Figs.4͑f ͒and 4͑i ͒͒.We can find the unrestricted rotation axes even if they are not parallel to the x ,y ,and z axes.For a generic unit vector t ,we can apply a coordinate transformation such that t is parallel to the unit vector k of the z axis.The same transformation is also applied to the two 3ϫp submatrices of W associated with translational and rotational equilibrium equations,resulting in a new matrix W Јand in the corresponding submatrix W z Ј.If the part has a single rotational DOF,we can search for the direction t for which the submatrix W z Јis rank deficient.The third singular value of W z Јis a continuous function of the angular parameters,which appear in the transformation,and has a unique global minimum of zero value in either 
Cartesian half-space.Therefore,we can do the search by any technique that is able to recognize and rule out possible local minima ͑direction set algorithms,simulated anneal-ing ͒.If a set of feasible rotation directions exists,we find two distinct directions t ,which define the plane containing the free rotation axes.4Tolerance AnalysisThe second problem in the analysis of deterministic positioning consists in detecting proximity to incorrect locating conditions.As a result of all its contacts with locators,a fully constrained part may still be allowed significant displacements from the nominal position due to form errors on datum surfaces.Although the ma-trix W carries all information required to detect such situations,a method is needed to properly recognize them.In the following,we show how the SVD can be helpful for this task.As it has been said before,low singular values of the locating matrix are related to a quasi-singularity of W ,which we associate with a possible lack of positioning accuracy.For instance,in the basic planar case of Fig.5͑a ͒,a displacement ␦of locator 2along its normal direction would force the part to rotate from its nominal configuration by an angle depending on ␦/a .Such an angle,which would result in geometric errors on machined features,can be relatively high if the distance a takes a small value.It can be verified that a takes a special meaning with respect to the SVD of the locating matrix.Specifically,with the coordinate system as in Fig.5͑a ͒,it isW =΄0011100−a 0΅͑8͒Singular values of W equal the square roots of the eigenvalues of W T ·W ,which can be easily derived by solving the characteristic equation of the latter matrix.We find that1=1Fig.4Examples of three-dimensional locatingschemesFig.5Quasi-singular locating conditions021009-4/Vol.10,JUNE 2010Transactions of the ASME2=ͱa 2+2+ͱa +423=ͱa 2+2−ͱa 4+42͑9͒andP =123=a͑10͒With a proper choice of the coordinate system,Eq.͑10͒applies to the general planar case depicted in Fig.5͑b ͒.The product P equals the distance a between the contact normal of locator 3and the intersection point of the contact normals of locators 1and 2͑or the corresponding distance for any permutation of locators ͒.As in the previous case,a displacement at locator 3causes a rotation of the workpiece by an angle that is inversely propor-tional to a .Similarly,in the 3-2-1scheme of Fig.5͑c ͒,distances a ,b ,and c should be long enough to avoid undesired rotations relative to the nominal configuration.Again,we have that P =12,...,6equals the product abc of the critical distances.The product of singular values of W can thus provide the information we need to detect a lack of positioning “robustness.”It is difficult to provide a mathematical proof for the geometric meaning of the quasi-singularity index P .In the following,how-ever,we will try to strengthen the conjecture that it is inversely related to geometric errors,which can result on machined fea-tures.Meanwhile,we will investigate on the influence of specified tolerances and geometric parameters of the fixture.We will only consider planar locating schemes as in Fig.5͑a ͒,which can be regarded as approximations of three-dimensional cases where the support on a base plane ͑locators 1–3in Fig.5͑b ͒͒is more accu-rate than the side positioning on lateral datum surfaces ͑locators 4–6in the same figure ͒.We will neglect any error sources that are not related to part and fixture geometry.They include uncertainties in tool positioning and workpiece set-in due to clamping and machining 
forces.We also assume that locators are exactly in their nominal position,which is reasonable when considering that tolerances on locating fixtures are usually tighter than workpiece errors.As a result of these assumptions,worst-case position errors will be underesti-mated by an amount depending on specific machine configura-tions.Let us suppose that a hole is to be drilled on a part as in Fig.6where specified by basic dimensions x and y .A straightness tol-erance t A on the primary datum A and a perpendicularity tolerance t B on the secondary datum B are assigned,as well as a position tolerance on the hole.The locating fixture for the workpiece con-sists in two locating pins on the primary datum and one pin on the secondary,spaced according to dimensions l 1,l 2,and l 3.Hole position will be checked by a functional gauge,whose datum simulators are put in contact with locating surfaces.Theoretically,contact points of locators with datum surfaces lie on the reference planes of the machine tool,which thus match exactly the datum simulators of the gauge.Under the assumption that the hole is drilled in its theoretical position relative to the machine,hole position is perfect also relative to the gauge,and there is no position error.Actually,as illustrated in Fig.7,locating surfaces do not coincide with gauge planes due to form and ori-entation errors.Therefore,contact with locators occurs in points that do not lie on datum simulators anymore.As contact points determine the geometric transformation of the part relative to the machine,the hole turns out to be displaced relative to its checking position on the gauge.The position error is equal to the distance between theoretical and actual ͑i.e.,after part-machine transfor-mation ͒hole axes.Figure 8͑a ͒shows that locators are assumed to be in their nomi-nal positions,and gauge planes are determined by geometric er-rors on locating surfaces.Point-to-point distances of part surfaces from datum simulators could be estimated by one of the available computational models of contacts between surfaces with an im-perfect form.As an example,in Ref.͓37͔,a constrained optimi-zation problem is solved to calculate the actual mating position between two imperfect planes.We prefer to simply calculate part-fixture transformation from a limited number of displacements at locators,to be treated as random variables.For this purpose,as shown in Fig.8͑b ͒,we imagine locating surfaces as perfect planes determined by equivalent displacements of locators.The set of displacement at locators⌬p =͓␦1,␦2,␦3͔T͑11͒transforms the workpiece coordinate system by⌬x =͓⌬x ,⌬y ,⌬␣͔T͑12͒where ͑⌬x ,⌬y ͒is the displacement of the origin and ⌬a is the rotation angle of x and y axes.According to results of Ref.͓12͔,the above parameters are related by the following equation:⌬p =W T ⌬x͑13͒Therefore,workpiece transformation can be found from locator displacements by inverting the transpose of the locating matrix.Following the equivalence described in Fig.8,locator displace-ments can take values less than or equal to the tolerances on corresponding data.Displacement values are negative ͑i.e.,theyFig.6Reference problem in theplaneFig.7Geometric transformation between machine tool and functionalgageFig.8Transformation based on locator displacementJournal of Computing and Information Science in Engineering JUNE 2010,Vol.10/021009-5。
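As a small numerical sketch of the two ingredients discussed in this section (again illustrative, with arbitrary values of the critical distance a and of the locator displacement): the product of the singular values of the planar locating matrix in Eq. (8) reproduces the quasi-singularity index P = a of Eq. (10), and inverting the transpose of W as in Eq. (13) shows the workpiece rotation growing as a shrinks.

```python
import numpy as np

a = 0.05                                    # assumed critical distance
W = np.array([[0.0, 0.0, 1.0],              # planar locating matrix of Eq. (8)
              [1.0, 1.0, 0.0],
              [0.0,  -a, 0.0]])

s = np.linalg.svd(W, compute_uv=False)
P = s.prod()                                # quasi-singularity index of Eq. (10); equals a

# Eq. (13): dp = W^T dx, so the workpiece transformation follows by inversion.
dp = np.array([0.0, 0.01, 0.0])             # assumed displacement at locator 2 only
dx, dy, dalpha = np.linalg.solve(W.T, dp)
print(P, dalpha)                            # |dalpha| ~ displacement / a: large when a is small
```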
Pavel Bělík and Mitchell Luskin
In general, the analysis of stability is more difficult for transformations with N = 4 (such as the tetragonal to monoclinic transformations studied in this paper) and N = 6, since the additional wells give the crystal more freedom to deform without the cost of additional energy. In fact, we show here that there are special lattice constants for which the simply laminated microstructure for the tetragonal to monoclinic transformation is not stable. The stability theory can also be used to analyze laminates with varying volume fraction [24] and conforming and nonconforming finite element approximations [25, 27]. We also note that the stability theory was used to analyze the microstructure in ferromagnetic crystals [29]. Related results on the numerical analysis of nonconvex variational problems can be found, for example, in [7-12, 14-16, 18, 19, 22, 26, 30-33]. We give an analysis in this paper of the stability of a laminated microstructure with infinitesimal length scale that oscillates between two compatible variants. We show that for any other deformation satisfying the same boundary conditions as the laminate, we can bound the perturbation of the volume fractions of the variants by the perturbation of the bulk energy. This implies that the volume fractions of the variants for a deformation are close to the volume fractions of the laminate if the bulk energy of the deformation is close to the bulk energy of the laminate. This concept of stability can be applied directly to obtain results on the convergence of finite element approximations and guarantees that any finite element solution with sufficiently small bulk energy gives reliable approximations of the stable quantities such as volume fraction. In Section 2, we describe the geometrically nonlinear theory of martensite. We refer the reader to [2, 3] and to the introductory article [28] for a more detailed discussion of the geometrically nonlinear theory of martensite. We review the results given in [34, 35] on the transformation strains and possible interfaces for tetragonal to monoclinic transformations corresponding to the shearing of the square and rectangular faces, and we then give the transformation strain and possible interfaces corresponding to the shearing of the plane orthogonal to a diagonal in the square base. In Section 3, we give the main results of this paper, which give bounds on the volume fraction of the crystal in which the deformation gradient is in energy wells that are not used in the laminate. These estimates are used in Section 4 to establish a series of error bounds in terms of the elastic energy of deformations: for the L2 approximation of the directional derivative of the limiting macroscopic deformation in any direction tangential to the parallel layers of the laminate, for the L2 approximation of the limiting macroscopic deformation, for the approximation of volume fractions of the participating martensitic variants, and for the approximation of nonlinear integrals of deformation gradients. Finally, in Section 5 we give an application of the stability theory to the finite element approximation of the simply laminated microstructure.
A generalized beta copula with applications in modeling multivariate long-tailed data
A Generalized Beta Copula with Applications inModeling Multivariate Long-tailed DataXipei Yang,Edward W.Frees,Zhengjun ZhangNovember4,2010AbstractThis work proposes a new copula class that we call the MGB2copula.The new cop-ula originates from extracting the dependence function of the multivariate GB2distribution(MGB2)whose marginals follow the univariate generalized beta distribution of the second kind(GB2).The MGB2copula can capture non-elliptical and asymmetric dependencies amongmarginal coordinates and provides a simple formulation for multi-dimensional applications.The new class features positive tail dependence in the upper tail and tail independence in thelower tail.Furthermore,it includes some well-known copula classes,such as the Gaussiancopula,as special or limiting cases.The validation of the MGB2copula can be assessed by agraphical tool of the so-called“conditional plots”.To illustrate the usefulness of the MGB2copula in practice,we build a trivariate model to analyze a data set that contains rich information on bodily injury liability claims closed withintwo-week period in years1987,1992,and1997.Reparametrized log-F(EGB2)distributionsare chosen to accommodate the right-skewness and the long-tailedness of the outcome vari-ables,while continuous predictors arefitted by non-linear curves in the marginal regressionmodels.The pairwise dependence structures exhibited motivate the application of the MGB2copula.For comparison purposes we also consider the alternative Gumbel copula and t copulafor the adaption of the upper tail dependence.The quantitative and graphical assessment forgoodness-of-fit demonstrates the comparative advantage of the MGB2copula over the othertwo copulas,which practically establishes the necessity for the development of this new copulaclass.1IntroductionThe past decade has seen an incredible evolution in copula theory and applications.The copula, a tool for understanding relationships among multi-dimensional outcomes,has been applied in awide scope of areas such as biostatistics,finance,insurance,economics,and hydrological studies. For example,Zheng and Klein(1995)estimated survival functions for competing survival and cen-sored times based on assumed copulas.Cherubini et al.(2004)addressed copula applications in pricing exotic derivative instruments and evaluation of counter-party risk in derivative transactions. 
copula methods have attracted substantial concern in quantitative integrated risk management(Em-brechts et al.,2003).Frees and Valdez(1998)brought this concept to actuarial practices with a thorough list of literature references.Grimaldi and Serinaldi(2006)applied nested asymmetric Archimedean copulas to studyflood events variables.An introduction to copulas can be found in Nelson(1999)and while Kolev et al.(2006)outlined some recent contributions to this important concept.Sklar’s theorem(1959)establishes the grounds for separate investigations of marginal distri-bution and the dependence structure that empowers copula as aflexible modeling technique than the conventional multivariate approach.Formally,a copula is a multivariate joint distribution func-tion defined on the unit hypercube such that each marginal distribution is uniform on the interval [0,1].Any joint distribution can be uniquely expressed as a copula function of individual marginal distributions provided they are continuous.Conversely,copulas can be constructed from multivari-ate distributions without any constraints on marginal distributions.There exists a large variety of copula families(Joe,1997)among which elliptical copulas and Archimedean copulas arise as two major classes.However,elliptical copulas are restricted to elliptical dependence,whereas most Archimedean copulas are exchangeable,thus implying symmetric dependence.A handful papers have discussed some approaches for introducing skewness to elliptical copulas(Demarta and Mc-Neil,2005)or introducing asymmetry to Archimedean copulas(Tawn,1988;Joe,1997;Nelson, 1999;Jones,2004;Liebscher,2008).However the formulations are often analytically intractable and become considerately complicated when the number of dimension increases.Infinance and insurance modeling,it is widely recognized that thefinancial data are heavy-tailed and extreme returns tend to cluster across investment portfolios.An underestimation of the joint extreme behavior of portfolio components may induce catastrophic outcomes,which has been exemplified in the current credit crisis.Furthermore,financial instruments are observed to exhibit asymmetric dependence.Hence,copulas that are able to accommodate tail dependence, capture dependence asymmetry,and provide adequateflexibility to be used in high dimensions are highly desirable.The main objective of this paper is to propose a new copula class,namely the MGB2copula,that is constructed by extracting the dependence function from a multidimen-sional version of the generalized beta distribution of the second kind(MGB2).In the economics literature,the four-parameter GB2model is known to provide an excellent description for long-tailed and highly skewed unemployment duration and income data(McDonald and Butler1990; McDonald and Xu1995).It has also appeared in the actuarial science literature in studies ofthe size-of-loss distribution(Venter,1983;Kleiber and Kotz,2003).Recent GB2applications include Sun et al.(2008),Frees and Valdez(2008),and Frees et al.(2009).Inheriting fat-tail features from the GB2and its multivariate extension,the MGB2copula proves to be able to ac-count for joint extreme events based on a positive asymptotic upper-tail index.Furthermore,the new copula can adapt non-elliptical and asymmetric dependence,and can be easily formulated for multi-dimensional applications.In addition,it includes many important copulas,such as Gaussian copula,and the dependence functions in the multivariate Singh-Maddala(Takahasi,1965;Vinh et al.,2010),multivariate 
Dagum(Rodriguez,1980;Domma,2009),multivariate Pareto(Mardia, 1962),and multivariate logistic distributions(Satterthwaite and Hutchinson,1978),as special or limiting cases.This article is organized as follows.Section2constructs the new copula,and studies associ-ated properties such as rank-based measures of association and asymptotic tail dependent indices. Relationships between the MGB2copula and some well-known copula classes are established. Simulations are carried out in Section3in order to highlight features of this new class.We present a graphical method so called”conditional-plot(s)”to test goodness-of-fit of the MGB2copula. Triviariate bodily injury data analysis is shown in Section4in which t,Gumbel,and MGB2copu-las are chosen as candidates for modeling the asymmetric dependence,while reparametrized log-F (EGB2)distributions arefit to the marginals.We provide concluding remarks in Section5.2The MGB2Copula and Related CopulasWe start this section by introducing the MGB2distribution.The construction of MGB2relies on the mixture representation of the univariate GB2distribution.This distribution provides adaption of a variety of distributions that are commonly used for handling the multivariate long-tailed data. In order to disentangle from the limitations faced by all families of multivariate distributions,we derive the MGB2copula by extracting the dependence function from the MGB2distribution.We study the copula properties and demonstrate the features of this new class.The establishment of the connections to the well-known copula classes addresses theflexibility of MGB2copula for modeling a wide variety of dependence structures.2.1The MGB2DistributionLet X′=(X1,...,X d)be a real d-dimensional random vector on(0,∞]d such that each X i given θfollows a generalized gamma distribution GG(a i,b iθ1/a i,p i)with the probability density function (p.d.f.)(x i|θ)=a if Xi|θSuppose that X1,...,X d are conditionally independent givenθ.Further assume that the pa-rameterθfollows an inverse gamma distribution with shape parameter q and a unit scale,i.e.θ∼InvGa(q,1).Then the unconditional p.d.f.of X isf X(x)=Γ( d i=1p i+q)[1+ d i=1(x i/b i)a i]q+ d i=1p i,(1)where x i>0with parameters(p1,...,p d,q)>0.The above density implies unconditional GB2 marginal distributions.For i=1,...,d,X i follows GB2(a i,b i,p i,q)with densityf Xi (x i;a i,b i,p i,q)=a iΓ(q)di=1b t i iΓ(p i+t i/a i)Next,we show that MGB2includes a transformed class of the multivariate t distribution.Trans-formations are performed to remove the off-diagonal entries of dispersion matrixes of multivariate t variables.A d-vector X=(X1,...,X d)′is said to follow a multivariate t distribution if it has p.d.f.f T(t)=Γ(v+dΓ(v2 1+(t−µ)′Σ−1(t−µ)2,−∞<t<∞with v>0degree of freedom,mean vectorµ,and positive-definite dispersion matrixΣ.Let X=Σ−1/2|T−µ|.Then,the density of X is given byf X(x)=2dΓ(v2)2) Γ(1v d1+di=1x iv 2−(v2),x i>0.This is a MGB2distribution with(a i=2,b i=√2,q=vB(p,q) z0t p−1(1−t)q−1dt,0≤z≤1,where B(p,q)is the beta function;and we use G p to denote the c.d.f.of a gamma variable with shape p and a unit scale,i.e.G p(z)=1struction of the MGB2distribution underlies the following formulation of the MGB2copulaC (u 1,...,u d )= di =1F X i |θ F −1X i (u i ) dG (θ),(u 1,...,u d )∈[0,1]d .Inserting the c.d.f.of GB2for F X i ,the c.d.f.of GG for F X i |θ,and the c.d.f.of InvGa for G (θ),we deliver the d -dimensional MGB2copulaC MGB2p 1,...,p d ,q (u 1,...,u d )=∞0d i =1G p i B −1p i ,q (u i )Γ(q )dθ,(u 1,...,u d )∈[0,1]d .(4)Jointly controlled 
by d +1parameters (p 1,...,p d ,q )>0a d -MGB2copula can provide adequate flexibility for modeling asymmetric dependence whenever there exist p m ,p n such that p m =p n ,(m,n )∈{1,...,d }.When all p i are equal,the copula tends to represent non-elliptically symmet-ric dependence.The corresponding density function c MGB2p 1,...,p d ,q is given byc p 1,...,pd ,q (u 1,...,u d )=Γ(q )d −1Γ( d i =1p i +q )(1+d i =1x i ) d i =1p i +q ,(5)in which x i =B −1p i ,q (u i )/ 1−B −1p i ,q (u i ) .One can always transform uniform [0,1]margins of (4)to yield multivariate distributions with specified marginal distributions,whereas the dependence structures are characterized by MGB2copulas.An assignment of a set of GB2margins with shape parameters (p 1,q ),...,(p d ,q )respectively for each dimension leads to an MGB2distribution de-fined in (1).When p 1=...,=p d =1,the MGB2copula yields a closed form representationC (u 1,...,u d )= d i =1u i −(d −1) + i 1<i 2 (1−u i 1)−1/q +(1−u i 2)−1/q −1 −q − i 1<i 2<i 3 (1−u i 1)−1/q +(1−u i 2)−1/q +(1−u i 3)−1/q −2 −q (6)+...+(−1)d d i =1(1−u i )−1/q −(d −1) −q ,(u 1,...,u d )∈[0,1]d .This is the copula associated with a multivariate Singh-Maddala (Burr XII)distribution;Hussaini and Ateya (2006)discovered it as the copula corresponding to a F class of multivariate distribu-tions.Let V i =1−U i for i =1,...,d .The joint distribution function of V i defines the survivalcopula ˆCwith respect to C in the sense that ˆC (v 1,...,v d )=Pr (U 1>u 1,...,U d >u d ).Thesurvival copula of(6)isC(u1,...,u d)= d i=1u−1/q i−(d−1) −q,(u1,...,u d)∈[0,1]d.This is the copula presented in Cook and Johnson(1981)(Clayton copula with strict generator) where they restricted the analysis to the bivariate case due to the equi-correlation structure.2.2.1Properties of the MGB2CopulaKendall’s tau and Spearman’s rho are two well-known rank based measures of association.Un-like Pearson correlation coefficient,Kendall’s tau and Spearman’s rho solely depend the copula function C of H(Frees,1998;Nelson,1999)in the sense thatτ(C)=4 [0,1]2C(u,v)dC(u,v)−1,ρ(C)=12 [0,1]2C(u,v)dudv−3.For a bivariate MGB2copula,the correlations can be evaluated byτ C MGB2p1,p2,q =2 1z2=0 z2z∗2=0 1z1=0B p1,q∗1 1−z∗2z−11−z2 dB p2,q(z∗2)dB p2,q(z2)−1,ρ C MGB2p1,p2,q =12 10 10B q∗1,p1 1−B−1p1,q(u)Definition2.1.The indices of the lower and upper tail dependence of a copula C are defined bylim u→0C(u,u)/u=λl,limu→1(1−2u+C(u,u))/(1−u)=λu,provided the limitsλl,λu∈[0,1]exits.The copula C is said to be lower(upper)tail asymptotically dependent ifλl(λu)∈(0,1],and lower(upper)tail asymptotically independent ifλl(λu)=0.By definition,these coefficients are limiting conditional probabilities for jointly exceeding a common threshold u given one margin does.Copulas of elliptically symmetric distributions have λl=λu.A standard bivariate t copula with v degree of freedom and correlation coefficientρ>−1 has positive tail indices given byλ=2t v+1 −√1−ρ/√B(p1,q)1/q+B(p2,q)1/q +B q∗2,p2B(p2,q)1/qinfinity as shown in the third identity.lim p1→0C MGB2p1,p2,q(u1,u2)=limp1,p2→0C MGB2p1,p2,q(u1,u2)=u1u2,limp1,p2→∞C MGB2p1,p2,q(u1,u2)=min(u1,u2).(10)Forfixed p2,q>0,let p1→∞,a new copula C p2,q is reached,and its density function isc p2,q (u1,u2)=Γ(q)x p21e−x2/x1,(11)where x1=1/G−1q(1−u1),and x2=B−1p2,q (u2)/ 1−B−1p2,q(u2) .The new copula has a zerolower tail index t l(C p2,q)=0,and a positive upper tail index given byt u(C p2,q )=1−G p2 (Γ(q)/B(p2,q))1/q +G q+p2 (Γ(q)/B(p2,q))1/q .(12)It is clear that the q parameter and the p parameters drive the 
level of association in opposite directions in the sense that a smaller value of q or a larger value of p1(or p2)leads to stronger de-pendence,and vice versa.Therefore,the limiting copula in(11)characterizes stronger dependence than the original MGB2copula.In addition,the role of q dominates the role of a single p,say p1,when driving towards inde-pendence becauselim p1,q→∞C MGB2p1,p2,q(u1,u2)=u1u2.(13)Suppose that p1,p2,q→∞such that p1/q→c1,p2/q→c2as q→∞,where c1,c2are two positive constants.Then,lim q→∞C MGB2p1,p2,q(u1,u2)=Φρ(Φ−1(u1),Φ−1(u2)),(14)whereρ= (1+c1)(1+c2)>0,Φρdenotes the bivariate Gaussian distribution function with corre-lation coefficientρ,andΦdenotes the distribution function of a standard normal variable.Thus,a bivariate Gaussian copula with a positive correlation is the limiting copula of the bivariate MGB2 copula.The proof of(14)is provided in the appendix A.2.The statements above can be easily replicated for multi-dimensional MGB2copulas with d>2.Figure1presents a graphical demonstration of the conversion between the bivariate MGB2 copula and its related copulas.Panel(1)displays an asymmetric MGB2copula with p1=0.1,p2= 6,q=2.5.In Panel(2)and(4)we let q and p1approach infinity respectively while leaving other parameters unchanged.Theflat hyperplanes represent the independence copula with density c(u1,u2)=1.Panel(3)shows the singular copula as q→0;a stronger dependence of concordance is obtained in Panel(5)where the mass concentrates along the line u1=u2.In Panel(6)thedensity of c p2,qreaches a level of40in the joint upper tail,much larger than the level of4in(1). The dominating effect of q over a single p1towards independence is confirmed in Panel(7).Panel (8)and(9)present the limiting Gaussian copulas withρ=0.5andρ=0.61respectively.The variation in the resulting correlation coefficients is attributed to the variation in the ratios of p1/q and p2/q.Figure1about hereExample One can use the copulas together with different marginals to form new bivariate distri-butions.To illustrate,suppose that X1is a random variable from InvGG(a1,b1,q)with p.d.f.given byf X1(x1;a1,b1,q)=a1Γ(q)Γ(p2)x1x2 x1b2 a2p2exp − x1b2 a2 .(15)The marginals of(15)belong to different distribution families.The GB2(a,b,p,q)density is known to be regularly varying at infinity with index−aq−1and regularly varying at origin with index−ap−1(Kleiber and Kotz,2003).Since the InvGG(GG)can be obtained by allowing p (q)in the GB2to approach infinity,its density exhibits shorter left(right)tail than the GB2coun-terpart.Thus,the distribution in(15)may be useful in situations where the marginal distributions appear to have different heaviness in tails.3Data AnalysisIn this section,we establish a comprehensive model to estimate the joint distribution of bodily injury(BI)liability payments and the time-to-settlement using auto injury data from the Insurance Research Council’s(IRC)Closed Claim Survey.In insurance terminology,the bodily injury li-ability coverage pays for the insured driver’s legal liability for bodily injury caused to the third party through the usage of the vehicle,up to the policy limit specified.From this survey,we have extensive claim characteristics such as the claimant demographics,the claimant’s degree of fault in causing the accident,the severity of the claimant’s bodily injury,state statutory rule,and the involvement of attorney.One can use our models based on these extensive characteristics to de-velop procedures to detect fraud and process claims.The joint modeling of the claim payment andthe 
claim duration reflects the complicated association structure embedded,thus facilitating the indemnity procedures that may involve both of the correlated variables.In our analysis,the total payment is further split into a specific payment(compaid)and a general payment(genpaid).The specific payment,also known as the economic loss,covers quantifiable losses including medical treatment,prescriptions and direct losses of income.In the contrast,the general payment com-pensates for pains,sufferings,mental anguish,loss of ability to work because of the damage to the mental health,and any other psychological problems caused by the accident.By nature,pains and sufferings are very difficult to evaluate and the damage to the mental health is not readily quantifiable.The literature has seen only a handful of regression applications on handling multi-dimensional insurance outcomes by copula.Frees and Valdez(1998)analyzed the loss and the accompanying allocated loss adjustment expense using the general liability claims data supplied by the Insurance Services Office.Klugman and Parsa(1999)fitted a bivariate model to the same data with different marginal distributions and a different copula family.Sun et al.(2008)applied the copula method in the longitudinal data context for health insurance.Frees and Valdez(2008)used t copulas to account for the dependence in severity among the different automobile claim types.Our work implements the analogous modeling strategy,but with trivariate response variables (comppaid,genpaid,duration)and the new MGB2copula.Empirical work on the BI data with a sample size much smaller than ours is provided in Doerpinghaus et al.(2008)where ordinary lin-ear models were used for testing the hypothesis that demographic characteristics affect the claim payment.The study found empirical evidence supporting the assertion that age and gender are discrimination factors that result in lower settlement amounts for female,elderly,and youthful claimants subject to similar degree of bodily injuries.Explanations for the differences in payouts were provided from the perspectives of variations in risk attitudes and variations in negotiating costs.Although the results make economic sense,the model they used failed to accommodate the features of the data.The exhibition of long tails and positive skewness challenges the appropriate-ness for assuming a Gaussian distribution,thus,weakening the power of the empirical evidence.3.1Data DescriptionThe data contain information on automobile bodily injury liability claims closed within two-week period in years1987,1992,and1997.Some34leading writers of auto insurance participated in the survey,collectively representing60%of the country-wide volume of private passenger auto-mobile insurance written in the United States.We focus on the third party claims and remove non-automobile claimants.Claimants of other types,such as motorcycle/bicyclist/moped driver and passenger or pedestrian,may yield non-comparable claim liabilities.Claimants less than14years old are also omitted from the sample.After removing and correcting corrupt and inaccurate records,a sample of71,022observations is retained,among which2,943claimants have the total payment subject to the coverage limit.In those cases,the insurance companies were not liable forthe amount of loss exceeding the coverage limit.For the sake of the joint estimation of paymentsand duration,another3,505claimants that miss the values of both the specific payment and the general payment are lost,leaving a sample 
of67,517for the trivariate model.Furthermore,not all of the claimants were indemnified from both the sources of payment;6,388claimants only received payments for their specific loss,whereas11,332only received payments for their general loss.However,there is no missing record in the duration component. These observations suggest three types of combinations of outcomes,M=1,2,3.We summarizethe structure in Table1.Table1:Types of Response CombinationsM=1M=3(comppaid,duration)(compaid,genpaid,duration) Number of Claimants11,332Table2:Descriptive Statistics of Dependent Variables by YearYear19871992199719871992199712,99421,40326,7324,2165,7695,3472,0002,7561,9378,48911,65917,2021101244,900353,0001,884,000Table2provides the descriptive statistics for each dependent variable by year.The dramatic differences between the means and the medians imply that the data are highly skewed to the right. The median payments and duration tend to increase from1987to1992followed by a decrease from 1992to1997,although the maximum values and the standard deviations appear to rise significantlyas time elapses,suggesting an increase in the cost of the extremal events.The explanatory variables provide comprehensive information regarding the policy specifics,the accident nature,insured driver’s demographics,claimant’s demographics,the availability ofother payment sources,the degree of injury,and the early-stage claim settlements.Table4-Table6 describe the major categorical predictors of interest,while Table7describes the continuous predic-tors.The explanatory variables vehicles2,drage2,and clmage2are created to mark the cases where missing values occur in the corresponding continuous variables vehicles1,drage1,and clmage1. The missing values are imputed using the medians of the observations in the same year.About 88.8%of the claims were reported within7days,and81.4%of the accidents involved only2vehi-cles.We observe that the claimants subject to the policy limits entertain higher amount of payments and require longer time period for closing the claims.In addition,serious car accidents are most likely to occur in metropolitan areas and often involve large compensation to the third party.No clear trend is detected for payments against the driver’s age;however,male drivers who are likely to be more aggressive drivers than their female counterpart are associated with larger payments for both the specific loss and the general loss.Female claimants on average receive lower settlements for the specific loss than male claimants although little difference is observed in the general loss. Compared with the married claimants,single claimants who might be armed with less argument power receive lower settlement amounts,while divorced/separated claimants claim the largest pay-offs.Full time employees and unemployed workers yield larger comppaid than the rest;the latter receiving larger genpaid than the former.The group consisting of nonemployed spouse,student, minor,and retiree accounts for the least payouts.One can observe larger indemnity payments in states that have enacted the“no-fault”law than in states that implement the“add-on”law or the “tort”law.The involvement of attorney and the resort to lawsuit indicates divergence in opinion between the drivers and the third party claimants and may end up with larger settlement amounts. 
The availability of other sources of payment seems to be a controversial predictor.People covered by health insurance or medicare claim for less payments,whereas the availability of wageplan and workcomp reduce the payoffs.The type of injuries and the extent of treatment may also be influen-tial.Severe injury is expected to require higher payments and length of stay in hospital may reflect the severity caused by the injuries.Given the large number of explanatory variables,highly complicated multi-dimensional struc-ture,and the fact that only4%of data are censored,we will treat the censoring indicator as a regular explanatory variable.In fact,one can take a couple of approaches for handling censored observa-tions.The simplest approach is to completely delete the censored data which generally leads to biased estimations.The magnitude of the biasedness depends on the type of censoring and the proportion of the censored observations.An alternatively approach is to specify a full information likelihood using copulas.The derivative forms of a copula function can accommodate all possible combinations of censored and non-censored marginal observations.However,the maintenance of full information is usually at the cost of a loss offlexibility in marginal structures.For instance, it cannot effectively adapt the additive structures which require a form of penalization to preventfrom over-fitting.Provided ourflavor forfitting additive regression models,we will proceed the copula modeling using the inference functions for margins(IFM)method,which we think is a convenient compromise between efficiency and complications.3.2Fitting Marginal DistributionsTo accommodate the long-tail features of the data,we employ the GB2marginal distribution for each dependent variable.Adopting the convention introduced by the relationship between the log-normal and normal distribution,if X∼GB2(a,b,p,q),then the variable Y=log X is said to be distributed as an exponential generalized beta of the second kind(EGB2).The four parameter GB2model is known to include a number of important special and limiting models such as the generalized gamma(GG),Singh-Maddala(Burr XII),Dagum(Burr III),log(t)(LT),lognormal (LN),Weibull(W),chi-square(χ2),and exponential(Exp).Naturally,the log-version of these models are special or limiting cases of the EGB2model.McDonald(1995)summarizes the inter-relationship between the GB2/EGB2family distributions in form of distribution trees.Yang(2011) proposes to reparametrize the EGB2model to form a reparametrized log-F model that can be used for discriminating the EGB2members at the boundaries of the parameter space(see also Prentice 1974,1975;Lawless,1980).A standardized reparametrized log-F variable W is characterized by two parameters(s,t) where s measures the skewness and t measures the overall tail-length.In order to incorporate the information reflected by the explanatory variables we consider the following additive regression modelY i=µi+κW i,i=1,...,n,(16)whereµi is the claimant-specific location parameter,κis a common scale parameter,and W i are i.i.d.log-F errors.In regression models with location-scale components,covariates are usually included in the location parameter,although one can also structure covariates into the scale param-eter.We only formulate the location parameter asµi=x′i1β+f(x i2),subject toni=1f(x i2)=0,where x i1is a k1-vector of categorical predictors including the term of intercept while x i2is a k2-vector of continuous predictors,andβis a vector of linear coefficients 
associated with x_{i1}. We impose the constraint to ensure the identifiability of the non-linear smooth function f. Since the mode of the error W is at 0, μ_i represents the mode of the distribution of Y_i, while the parameters (κ, s, t) control the scale, the skewness, and the kurtosis, respectively. Intuitively, regression models that parameterize up to the fourth moment amount to a better characterization of
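As a rough illustration of the location-scale regression in equation (16), the sketch below fits Y_i = x_i'β + κ W_i by maximum likelihood, taking the log-F(p, q) error to be the logit of a Beta(p, q) variable. The (s, t) reparametrization discussed above, the additive smooth term f, and the censoring treatment are not reproduced; the simulated design and starting values are illustrative assumptions, not the paper's data or estimates.

import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln

def logf_logpdf(w, p, q):
    # Log-density of W = logit(X) with X ~ Beta(p, q), a standard log-F form.
    return p * w - (p + q) * np.logaddexp(0.0, w) - betaln(p, q)

def negloglik(theta, y, X):
    k = X.shape[1]
    beta = theta[:k]
    kappa, p, q = np.exp(theta[k:])
    w = (y - X @ beta) / kappa
    return -(logf_logpdf(w, p, q) - np.log(kappa)).sum()

rng = np.random.default_rng(0)
n, k = 2000, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([8.0, 0.5, -0.3])
u = rng.beta(2.0, 4.0, size=n)
w = np.log(u / (1 - u))                      # log-F(2, 4) errors (illustrative)
y = X @ beta_true + 0.7 * w

theta0 = np.concatenate([[y.mean()], np.zeros(k - 1),
                         [np.log(y.std()), 0.0, 0.0]])
fit = minimize(negloglik, theta0, args=(y, X), method="Nelder-Mead",
               options={"maxiter": 20000})
print("beta_hat:", fit.x[:k])
print("kappa_hat, p_hat, q_hat:", np.exp(fit.x[k:]))

A penalized smooth of the continuous predictors could be added to the location term in the same likelihood, which is the additive structure the IFM approach is meant to preserve.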
The Newcastle-Ottawa Scale (NOS) for Assessing the …
Development: Identifying Items
• Identify ‘high’ quality choices with a ‘star’
• A maximum of one ‘star’ for each item within the ‘Selection’ and ‘Exposure/Outcome’ categories; maximum of two ‘stars’ for ‘Comparability’
Selection
1. Is the case definition adequate? a) yes, with independent validation ♦ b) yes, e.g. record linkage or based on self reports c) no description
2. Representativeness of the cases a) consecutive or obviously representative series of cases ♦ b) potential for selection biases or not stated
3. Selection of Controls a) community controls ♦ b) hospital controls c) no description

Bias and Confounding
A Comprehensive Survey of Multiagent Reinforcement Learning
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 38, NO. 2, MARCH 2008
A multiagent system [1] can be defined as a group of autonomous, interacting entities sharing a common environment, which they perceive with sensors and upon which they act with actuators [2]. Multiagent systems are finding applications in a wide variety of domains, including robotic teams, distributed control, resource management, collaborative decision support systems, data mining, etc. [3], [4]. They may arise as the most natural way of looking at a system, or may provide an alternative perspective on systems that are originally regarded as centralized. For instance, in robotic teams, the control authority is naturally distributed among the robots [4]. In resource management, while resources can be managed by a central authority, identifying each resource with an agent may provide a helpful, distributed perspective on the system [5].
Peking University Econometrics Lecture Notes: Instrumental Variables and Two-Stage Least Squares
Intermediate Econometrics, Yan Shen
When an IV is Available: Estimation
When assumptions (15.4) and (15.5) hold, one can show that the IV estimator of the slope is

β̂₁ = Σᵢ (zᵢ − z̄)(yᵢ − ȳ) / Σᵢ (zᵢ − z̄)(xᵢ − x̄),

the sample analogue of Cov(z, y) / Cov(z, x).
Suppose the true model regresses log(wage) on education (educ) and ability (abil).
Now ability is unobserved, and the proxy, IQ, is not available.
Sometimes we refer to this regression as the first-stage regression.
Example: wage determination
In this context, identification means that we can write β₁ in terms of population moments that can be estimated using a sample of data.
Atellica NEPH 630 System User Manual
Atellica NEPH 630 System: Smart. Simple. Secure.

Siemens Healthineers has helped shape and advance the protein testing market with more than 50 years of experience and five generations of dedicated nephelometric systems. We've pioneered advanced technologies and innovative assays in order to address the continuous pressure that labs are under to streamline workflow and processes, maximize service quality, improve cost efficiency, and drive better outcomes for clinicians and patients.

Pairing our commitment to innovation with increasingly higher standards to analyze proteins quickly, accurately, and in various body fluids, we have developed an easy-to-use system backed by a comprehensive protein testing menu that provides accurate, secure, and reliable results to support better patient outcomes. The result is the Atellica NEPH 630 System,* specifically designed for low- to mid-volume labs in need of reliable protein testing on cerebrospinal fluid, serum, urine, and plasma.

†Connectivity to Atellica NEPH 630 System is under development.
‡Data represents mean time between failure of current Siemens BN ProSpec® System.
§Requires syngo® Lab Connectivity Manager.

Labs seek control, simplicity, and better outcomes in protein testing.

The Atellica NEPH 630 System is a dedicated nephelometric system that simplifies lab operations by offering the broadest menu of protein tests on multiple sample types. Part of the Atellica family, it streamlines workflow with highly intelligent software and provides smart, simple, secure testing.

Simplify your protein testing. Integrate your disease-state quantification. Advance your capabilities.

Siemens Healthineers developed the Atellica portfolio of products to empower labs with stronger operational control and streamlined workflows for improved business and clinical outcomes. The Atellica NEPH 630 System combines the high-performance features customers expect from Siemens Healthineers nephelometric systems (including innovative assays, continuous access, flexible sample processing, easy calibration, and reliable results) with advanced software capabilities and IT connectivity. Smart, simple, and secure: the Atellica NEPH 630 System stands out for its onboard capabilities, familiar Atellica user interface, and integration with end-to-end lab solutions.

*Not available for sale in the U.S. Product availability varies by country.

Gain more control over your protein workflow. Simplify lab operations. Improve financial and clinical outcomes.

*Connectivity to Atellica NEPH 630 System is under development.
**Requires virtual network computing (VNC) or remote desktop capability.
††Depending on assay profile.‡‡ P ROTIS Assessment Software not available for sale in the U.S. Product availability varies by country. Please contact your local representative for availability.123456• F eatures the industry’s largest nephelometric menu of plasma protein assays.• F lexible testing options enable testing on serum, urine, plasma, and CSF.• M ore than 50 years of experience in protein analysis has led to longtime cooperation with IFCC in developing widely accepted standards.Expand your testing capabilities with the broadest menuof protein tests on multiple sample types.1• F lexible use of different sample types in random-access mode allows convenient operation.• O nboard reagent and control storage provides 24/7 operation, long onboard stability, and minimized operator intervention.• H igh sample-loading capacity allows load-and-go processing.• P ositive bar-code ID of primary sample tubes minimizes manual steps and avoids sample mismatch.• A utomatic dilutions and repeat measurement of out-of-range high or low samples are performed without user intervention.Minimize hands-on time so laboratory staff can focuson high-value responsibilities.2Streamline your lab workflows with highly intelligent software and IT connectivity.3• P roven nephelometric technology offers high precision and reproducibility.• S ophisticated antigen-excess pre-reaction protocols provide more accurate results and fewer repeats.• W ide initial measuring ranges reduce the need for retesting.• S ystem detects specimen and reagent levels prior to processing to ensure accuracy of results.• G raphical display of kinetic curves allows additional insight and advanced troubleshooting.• S tate-of-the-art software provides a high level of cyber security.Deliver accurate and secure results.4Simplicity• P ROTIS® Assessment Software ‡‡ consolidates patient test results intoone simple report with algorithm-based interpretation of results to support physicians in clinical decision making.• L ot-to-lot consistency ensures concordance of results, providing physicians with reliable insight into the progression of disease.Empower clinicians to improve patient outcomes.6Better Outcomes• I ntegration with Atellica PM 1.0 Software * simplifies lab management via aggregated system data, alerts, and remote control capability.**• S iemens Remote Service identifies and diagnoses potential hardware issues, enabling rapid technical intervention and quicker resolution.• A dvanced QC interface provides comprehensive statistics, allowing the transmission of control results to the LIS.Save time and reduce waste with continuous access and smart technology.5• F lexible dilutions and automatic retesting can often be performed without re-accessing the primary sample.• F ast average throughput of 65 tests †† per hour. Nominal: 100 tests/hour.• S ophisticated sample dilutions enable you to add tests or retest samples quickly.• S amples and reagents can be loaded and reloaded continuously without interrupting the assay run.6N Latex FLC kappa and N Latex FLC lambda Assays §§ are designed for improved management of patients with monoclonal gammopathies.• Excellent lot-to-lot reproducibility • H igh specificity based on monoclonal antibodies• P re-reaction protocols provide high antigen-excess security• R eagents, supplementary reagents, standards, and controls available separately and can be freely combinedExpand your protein testing menu. Get a complete disease-state picture. 
Drive better outcomes.N Latex BTP Assay §§is a fast and accurate screening method fordetection of cerebrospinal fluid (CSF) and estimation of residual renal function (RRF).• T wo applications—one convenient, fully automated assay • Fast detection of CSF leakage • E asy and reliable determination of RRF in dialysis patients • High specificity and sensitivity • Low incidence of false-positive resultsN Latex CDT Assay provides a highly specific method for the detection of chronic alcohol abuse.• H ighly specific monoclonal antibody directly detects CDT• A utomatically calculates %CDT by running CDT and Transferrin assays simultaneously• F ast results—within 20 minutes (total assay time)• Random-access capability• R eliable results—excellent recovery between labs, systems, and lotsIgG Subclass 1–4 Immunoassays offer a comprehensive solution for IgG determination. The determination is indicated for diagnostic clarification in patients suffering from a broad range of conditions associated with bothimmune and nonimmune abnormalities. The ability to measure all four subclasses provides insight into deficiencies that may be masked in a total IgGmeasurement or predominant IgG1 levels.• C omprehensive solution from one source• I nnovative assays that align with the trusted BN ™ Systems • P re-reaction protocols for antigen-excess securityProtein quantification has many applications across a range of disease states. Siemens Healthineers offers the industry’s largest nephelometric menu of plasma protein assays, including cardiac risk assessment, kidney diseases, neurological disorders, nutritional assessment, and iron and anemiaassessment, supported by innovative assays such as free light chains (FLC), carbohydrate-deficient transferrin (CDT), and beta-trace protein (BTP).The Atellica NEPH 630 System is a powerfuladdition to Siemens Healthineers plasma protein products, delivering accurate results and providing a comprehensive disease-state picture for better clinical outcomes. With the broadest range ofplasma protein assays across multiple sample types on an easy-to-use and reliable nephelometric system, Siemens Healthineers delivers diagnostic performance for confident decision making.1α2-Macroglobulin urine §§β2-Microglobulin urine β-trace protein §§Albumin urine Cystatin C serum FLC kappa urine §§FLC lambda urine §§Ig/Light Chain, type kappa urine §§Ig/Light Chain, type lambda urine §§IgG urine NGAL §§Transferrin urineAssay Menu§§Not available for sale in the U.S.。
Advanced A/B Testing, Series 1: Audience Targeting and Individual Treatment Effect Heterogeneity (HTE) / Uplift Models
A long-standing problem that machine learning hopes to solve is "what if", that is, decision guidance: if I send a user a coupon, will the user stay? If a patient takes this drug, will their blood pressure drop? If the app adds this feature, will users spend more time in it? If this monetary policy is implemented, will it effectively boost the economy? These questions are hard because the ground truth is unobservable in reality: for a patient who has already taken the drug and whose blood pressure dropped, we have no way of knowing whether the blood pressure would also have dropped at that same moment had they not taken it.
At this point the analysts will say: run an A/B test! We estimate the overall difference; if it is significant the treatment works, if not it does not.
But is that all we can do? Of course not, because every individual is different. An overall null effect does not mean the effect is null for every subgroup. If only 5% of users respond to coupons, can we reach only those users? If users differ in the coupon threshold they respond to, how can we adjust the threshold to attract more of them? If the antihypertensive drug only works for patients with particular symptoms, how do we find those patients? If some users dislike a new app feature while others love it, can we compare these users to find directions for improving the feature? The methods below attack this problem from different angles, but the basic idea is the same: we cannot observe each user's individual treatment effect, but we can find a group of similar users and estimate the effect of the treatment on them.
In later posts, starting from the second Causal Tree paper, "Recursive partitioning for heterogeneous causal effects", I will walk through the similarities and differences of the methods listed below.
The field is still evolving and several open-source implementations were released only recently, so this post will be updated continuously.
If you come across good papers or engineering implementations, feel free to share them in the comments below~

Uplift Modelling / Causal Tree
1. Nicholas J Radcliffe and Patrick D Surry. Real-world uplift modelling with significance-based uplift trees. White Paper TR-2011-1, Stochastic Solutions, 2011.
2. Rzepakowski, P. and Jaroszewicz, S., 2012. Decision trees for uplift modeling with single and multiple treatments. Knowledge and Information Systems, 32(2), pp. 303-327.
3. Yan Zhao, Xiao Fang, and David Simchi-Levi. Uplift modeling with multiple treatments and general response types. Proceedings of the 2017 SIAM International Conference on Data Mining, SIAM, 2017.
4. Athey, S., and Imbens, G. W. 2015. Machine learning methods for estimating heterogeneous causal effects. stat 1050(5).
5. Athey, S., and Imbens, G. 2016. Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences.
6. C. Tran and E. Zheleva. Learning triggers for heterogeneous treatment effects. In Proceedings of the AAAI Conference on Artificial Intelligence, 2019.

Forest Based Estimators
1. Wager, S. & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association.
2. M. Oprescu, V. Syrgkanis and Z. S. Wu. Orthogonal Random Forest for Causal Inference. Proceedings of the 36th International Conference on Machine Learning (ICML), 2019.

Double Machine Learning
1. V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, and W. Newey. Double Machine Learning for Treatment and Causal Parameters. ArXiv e-prints.
2. V. Chernozhukov, M. Goldman, V. Semenova, and M. Taddy. Orthogonal Machine Learning for Demand Estimation: High Dimensional Causal Inference in Dynamic Panels. ArXiv e-prints, December 2017.
3. V. Chernozhukov, D. Nekipelov, V. Semenova, and V. Syrgkanis. Two-Stage Estimation with a High-Dimensional Second Stage. 2018.
4. X. Nie and S. Wager. Quasi-Oracle Estimation of Heterogeneous Treatment Effects. arXiv preprint arXiv:1712.04912, 2017.
5. D. Foster and V. Syrgkanis. Orthogonal Statistical Learning. arXiv preprint arXiv:1901.09036, 2019.

Meta Learner
1. C. Manahan, 2005. A proportional hazards approach to campaign list selection. In SAS User Group International (SUGI) 30 Proceedings.
2. Green DP, Kern HL (2012). Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees. Public Opinion Quarterly 76(3): 491-511.
3. Sören R. Künzel, Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 2019.

Deep Learning
1. Fredrik D. Johansson, U. Shalit, D. Sontag. ICML (2016). Learning Representations for Counterfactual Inference.
2. Shalit, U., Johansson, F. D., & Sontag, D. ICML (2017). Estimating individual treatment effect: generalization bounds and algorithms. Proceedings of the 34th International Conference on Machine Learning.
3. Christos Louizos, U. Shalit, J. Mooij, D. Sontag, R. Zemel, M. Welling. NIPS (2017). Causal Effect Inference with Deep Latent-Variable Models.
4. Alaa, A. M., Weisz, M., & van der Schaar, M. (2017). Deep Counterfactual Networks with Propensity-Dropout.
5. Shi, C., Blei, D. M., & Veitch, V. NeurIPS (2019). Adapting Neural Networks for the Estimation of Treatment Effects.

Uber Special
It was Uber's blog that first helped me find my direction in the vast sea of papers. Now that I hear their AI Lab is being disbanded I am a little sad; as the open-source contributor with the most stars in the HTE space, they deserve a part of their own.
1. Shuyang Du, James Lee, Farzin Ghaffarizadeh, 2017, Improve User Retention with Causal Learning.
2. Zhenyu Zhao, Totte Harinen, 2020, Uplift Modeling for Multiple Treatments with Cost Optimization.
3. Will Y. Zou, Smitha Shyam, Michael Mui, Mingshi Wang, 2020, Learning Continuous Treatment Policy and Bipartite Embeddings for Matching with Heterogeneous Causal Effects.
4. Will Y. Zou, Shuyang Du, James Lee, Jan Pedersen, 2020, Heterogeneous Causal Learning for Effectiveness Optimization in User Marketing.

If you want to see more papers on causal inference and A/B testing, watch this space; it is continuously updated~
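Before walking through the papers, here is a minimal sketch of the simplest meta-learner idea mentioned in the list above (a T-learner): fit one outcome model on the treated group and one on the control group, and score the difference as the estimated individual effect. The simulated data, feature layout, and use of gradient boosting are illustrative assumptions and are not taken from any paper in the list.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# T-learner sketch: estimate heterogeneous treatment effects (uplift)
# with separate outcome models for treated and control users.
rng = np.random.default_rng(42)
n = 20000
x = rng.normal(size=(n, 5))                    # user features (illustrative)
t = rng.integers(0, 2, size=n)                 # randomized treatment (e.g., coupon)
tau = np.where(x[:, 0] > 1.0, 2.0, 0.0)        # only ~16% of users actually respond
y = x[:, 1] + tau * t + rng.normal(size=n)     # observed outcome

m1 = GradientBoostingRegressor().fit(x[t == 1], y[t == 1])
m0 = GradientBoostingRegressor().fit(x[t == 0], y[t == 0])
uplift = m1.predict(x) - m0.predict(x)         # estimated individual effect

# Target only the users with the largest predicted uplift.
top = np.argsort(uplift)[-int(0.2 * n):]
print("average true effect overall          :", tau.mean())
print("average true effect in top 20% scored:", tau[top].mean())

The overall average effect is small here, yet the users ranked highest by the model capture most of the true effect, which is exactly the "overall null does not mean null for every subgroup" point made above.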
Estimation of Deterministic and Stochastic IMU Error Parameters

Derya UNSAL, Department of Guidance and Control Design, Roketsan Missiles Industries Inc., Ankara, Turkey, dunsal@.tr
Kerim DEMIRBAS, Department of Electrical and Electronics Engineering, Middle East Technical University, Ankara, Turkey, demirbas@.tr

Abstract: Inertial Measurement Units,

signals, and this is the major drawback of GPS. However, an INS uses IMU outputs to construct position, velocity, and attitude by processing the navigation equations. Therefore IMUs are the major part of inertial navigation systems. An IMU is a device used to measure linear acceleration and angular rate. Inertial measurement units contain two types of sensors: accelerometers and gyroscopes. An accelerometer measures linear acceleration about its sensitivity axis, and integrated acceleration measurements are used to calculate velocity and position. A gyroscope measures angular rate about its sensitivity axis, and gyroscope outputs are used to maintain orientation in space. The cost of an IMU increases when the sensor performance requirements increase. The major reasons for the cost increase can be explained in two ways. The first reason is the requirement for a highly skilled production line, and the second is the decrease in the percentage of utilizable sensors in a batch. Therefore, in order to improve the performance of inertial sensors, calibration algorithms and error compensation models have been researched and developed, so that both low-cost and high-performance IMUs can be produced. The main objective of this article is to develop methods to estimate the deterministic and stochastic error parameters of MEMS-based inertial measurement units. Additionally, the aim is to improve the performance of IMUs by using these estimated parameters. Therefore an error calibration algorithm is implemented and the estimated parameters are used in this algorithm.

II. IMU ERROR MODEL

Inertial navigation systems need acceleration and angular rate measurements in the x, y, and z directions to calculate attitude, position, and velocity. Therefore, inertial measurement units contain three accelerometers and three gyroscopes. For this reason, the IMU error model is determined by equations (1) and (2) [3,4].
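Equations (1) and (2) are not reproduced in this excerpt. A common generic form of a single-axis inertial sensor error model combines a constant bias, a scale-factor error, and white measurement noise; the sketch below simulates that generic form and recovers the deterministic parameters by least squares. The structure and numerical values are illustrative assumptions, not the specific model or parameters estimated in the paper.

import numpy as np

# Generic IMU error model sketch for one accelerometer axis:
#   measured = (1 + scale_factor_error) * true + bias + white_noise
rng = np.random.default_rng(7)
fs = 100.0                                     # sample rate [Hz]
t = np.arange(0, 60, 1 / fs)
true_accel = 0.5 * np.sin(2 * np.pi * 0.2 * t) # known reference motion [m/s^2]

bias = 0.05                                    # deterministic bias [m/s^2]
scale_error = 0.01                             # 1% scale-factor error
noise_std = 0.02                               # white-noise standard deviation

measured = (1 + scale_error) * true_accel + bias + rng.normal(0, noise_std, t.size)

# With a known reference signal, bias and scale factor follow from
# the linear fit: measured ~ a * true_accel + b.
A = np.column_stack([true_accel, np.ones_like(true_accel)])
a_hat, b_hat = np.linalg.lstsq(A, measured, rcond=None)[0]
print("estimated scale-factor error:", a_hat - 1)
print("estimated bias              :", b_hat)

Stochastic terms such as bias instability or random walk, which the paper also estimates, would be added on top of this deterministic part and characterized with methods such as Allan variance analysis.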
Bioinformatics
Lecture 10
• Finding signals and motifs in DNA and proteins
• Expectation Maximization Algorithm
• MEME
• The Gibbs sampler

Finding signals and motifs in DNA and proteins
• An alignment of sequences is intrinsically connected with another essential task, which is finding certain signals and motifs (highly conserved ungapped blocks) shared by some sequences.
• A motif is a sequence pattern that occurs repeatedly in a group of related protein or DNA sequences. Motifs are represented as position-dependent scoring matrices that describe the score of each possible letter at each position in the pattern.
• Another related task is searching biological databases for sequences that contain one or more known motifs.
• These objectives are critical in the analysis of genes and proteins, as any gene or protein contains a set of different motifs and signals. Complete knowledge about the locations and structure of such motifs and signals leads to a comprehensive description of a gene or protein and points to a potential function.

eMOTIF
• eMOTIF: search of sequences with a certain emotif in the DB
• True positives

Expectation Maximization (EM) Algorithm
• This algorithm is used to identify conserved areas in unaligned DNA and proteins.
• Assume that a set of sequences is expected to have a common sequence pattern.

EM Algorithm, 1st expectation step: calculations
• Assume that seq1 is 20 bases long and the length of the site is 20 bases.
• Suppose that the site starts in column 1 and the first two positions are A and T.
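As a concrete illustration of the expectation step sketched in the EM slides above, the code below scores every possible start position of a fixed-width motif in each sequence under a current position weight matrix (PWM) and a background model, normalizes the scores into start-site probabilities (E step), and re-estimates the PWM from the expected counts (M step). The toy sequences, motif width, pseudocounts, and uniform initial PWM are illustrative assumptions; MEME's actual implementation is considerably more elaborate.

import numpy as np

BASES = "ACGT"
IDX = {b: i for i, b in enumerate(BASES)}

def e_step_start_probs(seq, pwm, background):
    # P(site starts at position j | seq, PWM): the E step for one sequence.
    w = pwm.shape[1]                           # motif width
    like = np.empty(len(seq) - w + 1)
    for j in range(like.size):
        window = seq[j:j + w]
        p_motif = np.prod([pwm[IDX[b], k] for k, b in enumerate(window)])
        p_back = np.prod([background[IDX[b]] for b in seq[:j] + seq[j + w:]])
        like[j] = p_motif * p_back
    return like / like.sum()

def m_step_pwm(seqs, gammas, w):
    # Re-estimate the PWM from expected base counts (the M step).
    counts = np.full((4, w), 0.1)              # small pseudocount
    for seq, gamma in zip(seqs, gammas):
        for j, g in enumerate(gamma):
            for k, b in enumerate(seq[j:j + w]):
                counts[IDX[b], k] += g
    return counts / counts.sum(axis=0, keepdims=True)

# Toy example: two short sequences sharing an ATG-containing site (illustrative).
seqs = ["CCATGACC", "GGATGTTT"]
w = 4
pwm = np.full((4, w), 0.25)                    # uniform initial motif model
background = np.full(4, 0.25)

for _ in range(10):                            # a few EM iterations
    gammas = [e_step_start_probs(s, pwm, background) for s in seqs]
    pwm = m_step_pwm(seqs, gammas, w)

print(np.round(pwm, 2))

After a few iterations the PWM concentrates on the shared site, which is the behavior the lecture's expectation-step example is meant to demonstrate.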
Norm estimates for the ℓ2-inverses
growth. Thus φ possesses a Fourier transform in the sense of Schwartz (1966), which we shall assume is almost everywhere equal to a Lebesgue measurable function on R^d. Given a nonzero real sequence (y_j)_{j∈Z^d} of finite support and points (x_j)_{j∈Z^d} in R^d, we introduce the function F: R^d → R given by

F(x) = Σ_{j,k∈Z^d} y_j y_k φ(x + x_j − x_k),   x ∈ R^d.   (2.1)
Σ_{j∈Z^d} [y_j^{(n)}]² = (2π)^{-d} ∫_{[0,2π]^d} P(θ) K_n(ξ − θ) dθ,

and the approximate identity property of the Fejér kernel (Zygmund (1988), p. 86) implies that

Σ_{k∈Z^d} φ̂(ξ + 2πk),   a.e. ξ ∈ R^d,   (2.6)

ess inf_ξ Σ_{j∈Z^d} φ̂(ξ + 2πj) · Σ_{j∈Z^d} y_j²  ≤  Σ_{j,k∈Z^d} y_j y_k φ(j − k)  ≤  ess sup_ξ Σ_{j∈Z^d} φ̂(ξ + 2πj) · Σ_{j∈Z^d} y_j².   (2.7)
Let V_φ be the vector space of real sequences (y_j)_{j∈Z^d} of finite support for which the function F̂ of (2.3) is absolutely integrable. Note that whenever P is a trigonometric polynomial whose
R language, mgcv package, gam() function help documentation
Generalized additive models with integrated smoothness estimation

Description

Fits a generalized additive model (GAM) to data, the term "GAM" being taken to include any quadratically penalized GLM. The degree of smoothness of model terms is estimated as part of fitting. gam can also fit any GLM subject to multiple quadratic penalties (including estimation of degree of penalization). Isotropic or scale invariant smooths of any number of variables are available as model terms, as are linear functionals of such smooths; confidence/credible intervals are readily available for any quantity predicted using a fitted model; gam is extendable: users can add smooths.
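The help text above documents R's mgcv::gam(). The same basic idea (a GLM with quadratically penalized smooth terms) can be sketched in Python with statsmodels' much more limited GAM support; the simulated data, basis dimensions, and penalty weights below are illustrative assumptions, and mgcv's automatic smoothness estimation is not reproduced here.

import numpy as np
import pandas as pd
from statsmodels.gam.api import GLMGam, BSplines

rng = np.random.default_rng(3)
n = 500
x = rng.uniform(0, 1, size=(n, 2))
y = np.sin(2 * np.pi * x[:, 0]) + 0.5 * x[:, 1] ** 2 + rng.normal(0, 0.2, n)
data = pd.DataFrame(x, columns=["x0", "x1"])

# B-spline bases for the two smooth terms; alpha are the quadratic penalty weights.
bs = BSplines(data[["x0", "x1"]], df=[10, 10], degree=[3, 3])
gam = GLMGam(y, smoother=bs, alpha=[1.0, 1.0])
res = gam.fit()
print(res.summary())

In mgcv the equivalent call would estimate the penalty weights (the degree of smoothness) as part of fitting rather than taking them as fixed inputs.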
Statistical analysis of multiple optical flow values for estimation of unmanned aerial vehicle height above groundPaul Merrell, Dah-Jye Lee, and Randal BeardDepartment of Electrical and Computer EngineeringBrigham Young University, 459 CBProvo, Utah 84602ABSTRACTFor a UAV to be capable of autonomous low-level flight and landing, the UAV must be able to calculate its current height above the ground. If the speed of the UAV is approximately known, the height of the UAV can be estimated from the apparent motion of the ground in the images that are taken from an onboard camera. One of the most difficult aspects in estimating the height above ground lies in finding the correspondence between the position of an object in one image frame and its new position in succeeding frames. In some cases, due to the effects of noise and the aperture problem, it may not be possible to find the correct correspondence between an object’s position in one frame and in the next frame. Instead, it may only be possible to find a set of likely correspondences and each of their probabilities. We present a statistical method that takes into account the statistics of the noise, as well as the statistics of the correspondences. This gives a more robust method of calculating the height above ground on a UAV.Keywords: Unmanned Aerial Vehicle, Height above Ground, Optical Flow, Autonomous Low-level Flight, Autonomous Landing1.INTRODUCTIONOne of the fundamental problems of computer vision is how to reconstruct a 3D scene using a sequence of 2D images taken from a moving camera in the scene. A robust and accurate solution to this problem would have many important applications for UAVs and other mobile robots. One key application is height above ground estimation. With an accurate estimate of a UAV’s height, the UAV would be capable of autonomous low-level flight and autonomous landing. It would also be possible to reconstruct an elevation map of the ground with a height above ground measurement. The process of recovering 3D structure from motion is typically accomplished in two separate steps. First, optical flow values are calculated for a set of feature points using image data. From this set of optical flow values, the motion of the camera is estimated, as well as the depth of the objects in the scene.Noise from many sources prevents us from finding a completely accurate optical flow estimate. A better understanding of the noise could provide a better end result. Typically, a single optical flow value is calculated at each feature point, but this is not the best approach because it ignores important information about the noise. A better approach would be to examine each possible optical flow value and calculate the probability that each is the correct optical flow based on the image intensities. The result would be a calculation for not just one optical flow value, but an optical flow distribution. This new approach has several advantages. It allows us to quantify the accuracy of each feature point and then rely more heavily upon the more accurate feature points. The more accurate feature points will have lower variances in their optical flow distributions. A feature point also may have a lower variance in one direction over another, meaning the optical flow estimate is more accurate in that direction. All of this potentially valuable information is lost if only a single optical flow value is calculated at each feature point.Another advantage is that this new method allows us to effectively deal with the aperture problem. 
The aperture problem occurs at points of the image where there is a spatial gradient in only one direction. At such edge points, it is impossible to determine the optical flow in the direction parallel to the gradient. However, it is often possible to obtain a precise estimate of the optical flow in the direction perpendicular to the gradient. These edge points cannot be used by any method which uses only a single optical flow value, because the true optical flow is unknown. However, this problem can easily be avoided if multiple optical flow values are allowed. Even though edge points are typically ignored, they do contain useful information that can produce more accurate results. In fact, it is possible to reconstruct a scene that contains only edges with no corner points at all. Consequently, this method is more robust because it does not require the presence of any corners in the image.

2. RELATED WORK

A significant amount of work has been done to try to use vision for a variety of applications on a UAV, such as terrain-following [1], navigation [2], and autonomous landing on a helicopter pad [3]. Vision-based techniques have also been used for obstacle avoidance [4,5] on land robots. We hope to provide a more accurate and robust vision system by using multiple optical flow values.

Dellaert et al. explore a similar idea [6] to the one presented here. Their method is also based upon the principle that the exact optical flow or correspondence between feature points in two images is unknown. They attempt to calculate structure from motion without a known optical flow. However, they do assume that each feature point in one image corresponds to one of a number of feature points in a second image. The idea we are proposing is less restrictive because it allows a possible correspondence between any nearby pixels and then calculates the probability of each correspondence.

Langer and Mann [7] discuss scenarios in which the exact optical flow is unknown, but the optical flow is known to be one of a 1D or 2D set of optical flows. Unlike the method described here, their method does not compute the probability of each possible optical flow in the set.

3. METHODOLOGY

3.1. Optical flow probability

The image intensity at position x and at time t can be modeled as a signal plus white Gaussian noise,

I(x, t) = S(x, t) + N(x, t),   (1)

where I(x, t) represents the image intensity, S(x, t) represents the signal, and N(x, t) represents the noise. Over a sufficiently small range of positions and with a sufficiently small time step, the change in the signal can be expressed as a simple translation,

S(x + U, t + dt) = S(x, t),   (2)

where U is the optical flow vector between the two frames. The shifted difference between the images,

I(x + U, t + dt) − I(x, t) = N(x + U, t + dt) − N(x, t),   (3)

is a Gaussian random process. The probability that a particular optical flow, u, is the correct optical flow value based on the image intensities is proportional to

P[U = u | I(x, t), I(x, t + dt)] ∝ (1/√(2πσ²)) exp( −(I(x + u, t + dt) − I(x, t))² / (2σ²) ),   (4)

for some σ².
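To make the per-pixel probability of equation (4) concrete, the sketch below evaluates the unnormalized probability of a small set of candidate integer displacements at one pixel of a pair of synthetic frames, multiplying the per-pixel term over a small window of neighbors (the windowed form derived next, in equation (5)). The synthetic frames, window size, candidate set, and noise level are illustrative assumptions, not the paper's data.

import numpy as np

def flow_log_probability(frame0, frame1, x, y, candidates, sigma, half_win=2):
    # Unnormalized log-probability of each candidate displacement u = (du, dv)
    # at pixel (x, y): the shifted-difference Gaussian model of Eq. (4),
    # multiplied over a small window as in Eq. (5).
    logp = []
    for du, dv in candidates:
        sq_err = 0.0
        for wy in range(-half_win, half_win + 1):
            for wx in range(-half_win, half_win + 1):
                i0 = frame0[y + wy, x + wx]
                i1 = frame1[y + wy + dv, x + wx + du]
                sq_err += (i1 - i0) ** 2
        logp.append(-sq_err / (2.0 * sigma ** 2))
    return np.array(logp)

# Synthetic example: frame1 is frame0 shifted by 2 pixels in x and 1 in y, plus noise.
rng = np.random.default_rng(0)
frame0 = rng.random((64, 64))
frame1 = np.roll(frame0, shift=(1, 2), axis=(0, 1)) + rng.normal(0, 0.05, (64, 64))

candidates = [(du, dv) for du in range(-3, 4) for dv in range(-3, 4)]
logp = flow_log_probability(frame0, frame1, x=32, y=32,
                            candidates=candidates, sigma=0.05)
probs = np.exp(logp - logp.max())
probs /= probs.sum()                     # normalized distribution over candidate flows
print("most probable flow:", candidates[int(np.argmax(probs))])

The normalized output is exactly the kind of optical flow distribution, rather than a single flow value, that the method carries forward into the rotation and translation estimates.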
Repeating the same analysis over a window of neighboring positions, x ∈ W, and assuming that the noise is white, the optical flow probability can be calculated as

P[U = u | I] ∝ Π_{x∈W} (1/√(2πσ²)) exp( −(I(x + u, t + dt) − I(x, t))² / (2σ²) ).   (5)

The value of I(x + u, t) may need to be estimated through some kind of interpolation, since the image data usually comes in discrete samples or pixels, but the value of u in general is not an integer value.

Simoncelli et al. [8] describe an alternative gradient-based method for calculating the probability distribution. This method calculates optical flow based on the spatial gradient and temporal derivative of the image. Noise comes in the form of error in the temporal derivative, as well as a breakdown of the planarity assumption in equation (2). The optical flow probability distribution ends up always being Gaussian. The probability distribution calculated from equation (5) can be more complex than a Gaussian distribution.

3.2 Rotation and translation estimation

After having established an equation that relates the probability of an optical flow value to the image data, this relationship is used to calculate the probability of a particular camera rotation, Ω, and translation, T, given the image data, I. This can be accomplished by taking the expected value over all possible optical flow values, u, and then applying Bayes' rule:

P(Ω, T | I) = ∫ P(Ω, T, u | I) du = ∫ P(Ω, T | u, I) P(u | I) du.   (6)

Our estimate of Ω and T is based only on the optical flow value, so P(Ω, T | u, I) = P(Ω, T | u). Using Bayes' rule twice more yields

P(Ω, T | I) = ∫ P(Ω, T | u) P(u | I) du = ∫ [ P(Ω, T) P(u | Ω, T) / P(u) ] P(u | I) du.   (7)

The value of u is a continuous variable, so it does not come in discrete quantities. However, since we do not have a closed-form solution to the integral in equation (7), the integral is approximated by computing a summation over discrete samples of u. In the following sections, each of the terms in equation (7) will be examined. Once a solution has been found for each term, the final step will be to find the most likely rotation and translation. The optimal rotation and translation can be found using a genetic algorithm.

3.3. Calculating P(u | Ω, T)

The optical flow vector, u = [u_x u_y]^T, at the image position (x, y) for a given rotation, Ω = [ω_x ω_y ω_z]^T, and translation, T = [t_x t_y t_z]^T, is approximately equal to

u_x = (−t_x f + x t_z)/Z + (xy/f) ω_x − (f + x²/f) ω_y + y ω_z,
u_y = (−t_y f + y t_z)/Z + (f + y²/f) ω_x − (xy/f) ω_y − x ω_z,   (8)

where f is the focal length of the camera and Z is the depth of the object in the scene [9]. For each possible camera rotation and translation, we can come up with a list of possible optical flow vectors. While the exact optical flow vector is unknown, since the depth, Z, is unknown, we do know from rearranging the terms in equation (8) that the optical flow vector lies somewhere along an epipolar line. The epipolar line is given by

u_y = m u_x + b,   where
m = (−t_y f + y t_z) / (−t_x f + x t_z),
b = (f + y²/f) ω_x − (xy/f) ω_y − x ω_z − m [ (xy/f) ω_x − (f + x²/f) ω_y + y ω_z ].   (9)

Furthermore, since the depth, Z, is positive (because it is not possible to see objects behind the camera), we also know that

u_x > (xy/f) ω_x − (f + x²/f) ω_y + y ω_z   when   x t_z − t_x f > 0,   and
u_x < (xy/f) ω_x − (f + x²/f) ω_y + y ω_z   when   x t_z − t_x f < 0.   (10)

If an expression for the probability of the depth, P(Z), is known, then the probability of an optical flow vector for a given rotation and translation is given by

P(u | Ω, T) = P( Z = (−t_x f + x t_z) / ( u_x − (xy/f) ω_x + (f + x²/f) ω_y − y ω_z ) )   if u_y = m u_x + b,
P(u | Ω, T) = 0   if u_y ≠ m u_x + b.   (11)

3.4.
Calculating P (,T) and P (u)The two remaining terms in equation (7) for which a solution must be found are P(,T ) and P(u ). P(,T ) depends upon the characteristics of the UAV and how it is flown. We assume that the rotation is distributed normally with a variance of 2x σ,2y σ , and 2z σ for each of the three components of rotation, roll, pitch, and yaw. For height above ground estimation, the optimal position for the camera is facing directly at the ground. We assume that the motion of the UAV is usually perpendicular to the motion of the optical center of the camera and is usually along the y -axis of the camera coordinate system.=Ω222000000,000~z y x z y x N σσσϖϖϖ, =222000000,010~tz ty tx z y x N t t t σσσT .(12)Using these distributions, an expression for P (u ) can also be obtained. Equation (8) can be separated in two parts that depend upon the rotation and translation of the camera.Ω+=Ω+=2211R T Q R T Q y x u u (13)In this case and T are both random vectors with the distribution given in equation (12).++++++++ ΩΩ++=++=Ω=Ω222222222221221221221221221221221221212212212212111211,00~])[(])[(0][z z y y x x z z z y y y x x x z z z y y y x x x z z y y x x z z y y x x z z y y x x r r r r r r r r r r r r r r r r r r N r r r r r r E E E σσσσσσσσσσσσσσσωωωR R R R (14)For simplicity, we will assume in the next equation that the variation in the translation, 2tx σ,2ty σ , and 2tz σ, isnegligible. Now an expression for the value of P (u ) is obtained by using the probability density function in (14) and a probability density function for the depth, P (Z ))(21z Z P z f u u P u u P y x z y x = −= ΩΩ= R R . (15)3.5. Depth estimationOnce the rotation and translation of the camera between two frames have been estimated, we are now able to estimate the depth of the objects in the scene. For a given rotation, translation, and depth, an optical flow value can be calculated using equation (8). By using equation (5), we can find a probability for the depth Z xy at image position (x ,y ). The depth at position (x+1,y ) is likely to be close to the depth at position (x ,y ), for the simple reason that neighboring pixels are likely to be part of the same object and that object is likely to be smooth. Using this information, we can obtain a better estimate of the depth at one pixel by examining those pixels around it.∏ ∏ ∏∈∈∈===W j i ijij ij ij xy W j i ij ij ij xyW j i ij xy xy dZ I Z P Z Z P dZ I Z Z P I Z P Z P ,,,)|()|()|,()|()|(I (16) where I ij is the image data at position (i ,j ) and W is a set of positions close to the position (x ,y ). This approach adds smoothness and reduces noise. There is the possibility that this method could smooth over depth discontinuities, but it does not smooth over depth discontinuities if we have a high confidence that one exists. This method has the effect that if we are fairly confident that a pixel has a certain depth, it is left unchanged, but if we have very little depth information at that pixel we change the depth to a value closer to one of its neighbors. An additional way to improve our result is to use image data from more than just two frames to calculate the value of P (Z ij |I ij ).4. RESULTSFigures 1 and 2 show results obtained from synthetic data. The data used in Figure 1 was created by texture mapping an aerial photograph onto a sloped landscape. The advantage of using synthetic data instead of real-camera footage is that the true 3D structure of the scene is known, so the recovered depth can be directly compared with its true value. 
In Figures 1 through 5, the results are displayed in the form of inverse depth or one over the actual depth. A darker color indicates that the objects in the scene at that part of the image are further away from the camera. A lighter color indicates they are closer. In Figures 1 and 2, the recovered depth is fairly close to its true value. In each case, the depth is slightly overestimated.Figure 1: One frame from a sequence of images is shown (left). The recovered depth from this sequence of images (middle) is shown next to the actual depth (right). The camera is moving perpendicular to the direction it is facing.Figure 2: One frame from a sequence of images is shown (left). The recovered depth from this sequence of images (middle) is shown next to the actual depth (right). The camera is moving towards the direction it is facing.Figure 3: Two frames (left, middle) from a sequence of images demonstrating the aperture problem. The recovered depth is shown (right).Figure 4: Two frames (left, middle) from a sequence of images taken directly from a camera onboard a UAV while it is flying towards a tree. The recovered depth is shown (right).Figure 5: Two frames (left, middle) from a sequence of images taken from a camera onboard a UAV while it is flying low to the ground. The recovered depth is shown (right).Figure 3 shows two frames from a sequence of images taken from a rotating camera moving towards a scene that has a black circle in the foreground and a white oval in the background. The scene is a bit contrived, but its purpose is to demonstrate that the aperture problem can be overcome. Every point in the image, when examined closely, is either completely blank or contains a single approximately straight edge. Essentially, every point in the image is affected by the aperture problem. However, with our new method, the aperture problem can be effectively dealt with. The camera was rotated in the z -direction one degree per frame. The camera rotation was estimated to be correct with an error of only 0.034˚. The left image in Figure 3 shows the recovered depth. The gray part of this image is where there was not enough information to detect the depth (since it is impossible to obtain depth information from a blank wall). The white oval was correctly determined to be behind the black circle. This demonstrates that it is possible to extract useful information about the camera movement, as well as the object’s depth from a scene with no corners in it.Figures 4 and 5 show results from real data taken from a camera onboard a UAV. In Figure 4, the camera is moving towards a tree. The results are fairly noisy, but they do show a close object near where the tree should be. In Figure 5, the camera facing the ground and moving perpendicular to it. The results are fairly good with one exception. The UAV is so close to the ground that its shadow appears in the image. The shadow moves forward along with theUAV which violates our assumption that the scene is static. Consequently, the recovered depth at the border of the shadow is overestimated. This error only occurs in a small part of the image and is not a significant problem.The results are fairly accurate and appear to be satisfactory for both autonomous low-level flight and landing. However, this level of accuracy does not come without a price. This method is fairly computationally intense. 
We have not yet been able to run this algorithm in real time, but we hope to do so in the near future.

5. CONCLUSIONS

In future research, we hope to investigate several potential improvements to this method. First, a more sophisticated method of calculating the optical flow distributions may be necessary and could be very valuable. Second, there are many other structure-from-motion algorithms that perform very well [10, 11, 12] besides the statistical method described here. Any one of these methods could be extended to allow multiple optical flow values.

We have presented a novel method to compute structure from motion. This method is unique in its ability to quantitatively describe the noise in the optical flow estimate from image data and use that information to its advantage. The resulting algorithm is more robust and, in many cases, more accurate than methods that use only a single optical flow value.

REFERENCES
1. Netter, T. and N. Franceschini. "A robotic aircraft that follows terrain using a neuromorphic eye," Conf. Intelligent Robots and Systems, vol. 1, pp. 129-134, 2002.
2. B. Sinopoli, M. Micheli, G. Donato, and T. J. Koo, "Vision based navigation for an unmanned aerial vehicle," Proc. Conf. Robotics and Automation, pp. 1757-1764, 2001.
3. S. Saripalli, J. F. Montgomery, and G. S. Sukhatme. "Vision-based autonomous landing of an unmanned aerial vehicle," Proc. Conf. Robotics and Automation, vol. 3, pp. 2799-2804, 2002.
4. L. M. Lorigo, R. A. Brooks, and W. E. L. Grimson. "Visually-guided obstacle avoidance in unstructured environments," Proc. Conf. Intelligent Robots and Systems, vol. 1, pp. 373-379, 1997.
5. M. T. Chao, T. Braunl, and A. Zaknich. "Visually-guided obstacle avoidance," Proc. Conf. Neural Information Processing, vol. 2, pp. 650-655, 1999.
6. F. Dellaert, S. M. Seitz, C. E. Thorpe, and S. Thrun, "Structure From Motion without Correspondence," Proc. Conf. Computer Vision and Pattern Recognition, pp. 557-564, 2000.
7. M. S. Langer and R. Mann. "Dimensional analysis of image motion," Proc. Conf. Computer Vision, pp. 155-162, 2001.
8. E. P. Simoncelli, E. H. Adelson, and D. J. Heeger, "Probability distributions of optical flow," Proc. Conf. Computer Vision and Pattern Recognition, pp. 310-315, 1991.
9. Z. Duric and A. Rosenfeld. "Shooting a smooth video with a shaky camera," Machine Vision and Applications, vol. 13, pp. 303-313, 2003.
10. S. Soatto and R. Brockett, "Optimal Structure from Motion: Local Ambiguities and Global Estimates," Proc. Conf. Computer Vision and Pattern Recognition, pp. 282-288, 1998.
11. I. Thomas and E. Simoncelli. Linear Structure from Motion. Technical Report IRCS 94-26, University of Pennsylvania, 1994.
12. J. Weng, N. Ahuja, and T. Huang. "Motion and structure from two perspective views: algorithms, error analysis, and error estimation." IEEE Trans. Pattern Anal. Mach. Intell. 11(5): 451-476, 1989.
Threshold Autoregression with a Unit Root
Ž.Econometrica,Vol.69,No.6November,2001,1555᎐1596THRESHOLD AUTOREGRESSION WITH A UNIT ROOTB Y M EHMETC ANER AND B RUCE E.H ANSEN1This paper develops an asymptotic theory of inference for an unrestricted two-regimeŽ.threshold autoregressive TAR model with an autoregressive unit root.Wefind that theasymptotic null distribution of Wald tests for a threshold are nonstandard and differentfrom the stationary case,and suggest basing inference on a bootstrap approximation.Wealso study the asymptotic null distributions of tests for an autoregressive unit root,andfind that they are nonstandard and dependent on the presence of a threshold effect.Wepropose both asymptotic and bootstrap-based tests.These tests and distribution theoryŽ.Žallow for the joint consideration of nonlinearity thresholds and nonstationary unit .roots.Our limit theory is based on a new set of tools that combine unit root asymptotics with empirical process methods.We work with a particular two-parameter empirical processthat converges weakly to a two-parameter Brownian motion.Our limit distributionsinvolve stochastic integrals with respect to this two-parameter process.This theory isentirely new and mayfind applications in other contexts.We illustrate the methods with an application to the U.S.monthly unemployment rate.Wefind strong evidence of a threshold effect.The point estimates suggest that thethreshold effect is in the short-run dynamics,rather than in the dominate root.While theconventional ADF test for a unit root is insignificant,our TAR unit root tests are arguablysignificant.The evidence is quite strong that the unemployment rate is not a unit rootprocess,and there is considerable evidence that the series is a stationary TAR process.K EYWORDS:Bootstrap,nonlinear time series,identification,nonstationary,Brownian motion,unemployment rate.1.INTRODUCTIONŽ.Ž. T HE THRESHOLD AUTOREGRESSIVE TAR MODEL was introduced by Tong1978Žand has since become quite popular in nonlinear time series.See Tong1983, .1990for reviews.A sampling theory of inference has been quite slow toŽ. develop,however.Among the more important contributions,Chan1991and Ž.Hansen1996describe the asymptotic distribution of the likelihood ratio testŽ.for a threshold,Chan1993showed that the least squares estimate of the threshold is super-consistent and found its asymptotic distribution,Hansen Ž.1997b,2000developed an alternative approximation to the asymptotic distribu-Ž.tion,and Chan and Tsay1998analyzed the related continuous TAR model and found the asymptotic distribution of the parameter estimates in this model.In all of the papers listed above,an important maintained assumption is that the data are stationary,ergodic,and have no unit roots.This makes it impossible to discriminate nonstationarity from nonlinearity.To aid in the analysis of possibly nonstationary and r or nonlinear time series,we provide thefirst rigor-1Caner thanks TUBITAK and Hansen thanks the National Science Foundation and the Alfred P. Sloan Foundation for research support.We thank Frank Diebold,Peter Pedroni,Pierre Perron, Simon Potter,four referees and the co-editor for stimulating comments on earlier drafts.15551556M.CANER AND B.E.HANSENous treatment of statistical tests that simultaneously allow for both effects.Ž.Specifically,we examine a two-regime TAR k with an autoregressive unit root.Ž. 
Within this model,we study Wald tests for a threshold effect for nonlinearityŽ.and Wald and t tests for unit roots for nonstationarity.We allow for general autoregressive orders,and do not artificially restrict the coefficients across regimes.Wefind that the Wald test for a threshold has a nonstandard asymptotic null distribution.This is partially due to the presence of a parameter that is notŽŽ.Ž. identified under the null see Davies1987,Andrews and Ploberger1994,and Ž..Hansen1996,and partially due to the assumption of a nonstationary autore-gression.The asymptotic null distribution has two components,one that reflects the unit root and deterministic trends but is otherwise free of nuisance parame-ters,and the other component that is identical to the empirical process found in the stationary case,and is nuisance-parameter dependent.Hence the asymptotic distribution is nonsimilar and cannot be tabulated.We propose bootstrap procedures to approximate the sampling distribution.Wefind that Wald tests for a unit root have asymptotic null distributions that depend on whether or not there is a true threshold effect,and construct bounds that are free of nuisance parameters.Our simulations suggest that these asymptotic approximations are inferior to bootstrap methods,which we recom-mend for empirical ing simulations,we show that our threshold unitŽroot tests have better power than the conventional ADF unit root test Said and Ž..Dickey1984when the true process is nonlinear.Our distribution theory is based on a new set of asymptotic tools utilizing a double-indexed empirical process that converges weakly to a two-parameter Brownian motion,and we establish weak convergence to a stochastic integral defined with respect to this two-parameter process.This theory may have applications beyond those presented here.The results presented here relate to a growing literature on threshold autore-gressions with unit roots.In a Monte Carlo experiment,Pippenger and Goering Ž.Ž.1993document that the power of the Dickey-Fuller1979unit root test fallsŽ. dramatically within one class of TAR models.Balke and Fomby1997introduce a multivariate model of threshold cointegration,but offer no rigorous distribu-Ž.tion theory.Tsay1997introduces a univariate unit root test when the innova-tions follow a threshold process.Hefinds the asymptotic distribution when the threshold is known,and provides simulations for the case of estimated thresh-old.His model requires the leading autoregressive lag to be constant across threshold regimes,and is a special case of the model we consider.Gonzalez and Ž.Ž.Gonzalo1998carefully examine a TAR1model allowing for a unit root.They provide conditions under which the process is stationary and geometricallyŽ.ergodic,and discuss testing for a threshold in the TAR1model.We illustrate our proposed techniques through an application to the monthly U.S.unemployment rate among adult males.There is a substantial literature documenting nonlinearities and threshold effects in the U.S.unemploymentŽ.Ž.rate.A partial list includes Rothman1991,Chen and Lee1995,Montgomery,THRESHOLD AUTOREGRESSION1557Ž.Ž.Zarnowitz,Tsay,and Tiao1998,Altissimo and Violante1996,Chan and Tsay Ž.Ž.Ž.1998,Hansen1997b,and Tsay1997.This literature is connected to a broader literature studying nonlinearities in the business cycle,which includesŽ.Ž.Ž. 
contributions by Neftci1984,Hamilton1989,Beaudry and Koop1993,Ž.Ž.Potter1995,and Galbraith1996.Empirical researchers are faced with the fact that the conventional unit root tests are unable to reject the hypothesis that the post-war unemployment rate is nonstationary.Prior statistical methods cannot disentangle nonstationarity from nonlinearity because of the joint model-ing problem of unit roots and thresholds.With our new methods,we are able to rigorously address these issues.In our application,wefind very strong evidence that the unemployment rate has a threshold nonlinearity.Furthermore,wefindŽstrong evidence against the unit root hypothesis,and fairly strong although not .conclusive evidence in favor of a stationarity threshold specification.Our methods point to the conclusion that the unemployment rate is a stationary nonlinear process.This paper is organized as follows.Section2presents the TAR model.Section 3introduces a new set of asymptotic tools that are useful for the study of threshold processes with possible unit roots.Section4presents the distribution theory for the threshold test,including a Monte Carlo study of size and power. Section5presents the distribution theory for the unit root test,including critical values and a simulation study.Section6is the empirical application to the U.S. unemployment rate.The mathematical proofs are presented in the Appendix.A GAUSS program that replicates the empirical work is available from the webpage r;bhansen.2.TAR MODELŽ.The model is the following threshold autoregression TAR:Ž.X X1⌬y sx1qx1q e,t1t y1ÄZ-42t y1ÄZ G4tt y1t y1ŽX.t s1,...,T,where x s y r⌬yиии⌬yЈ,1is the indicator function,t y1t y1t t y1t y kÄи4e is an iid error,Z s y y y for some m G1,and r is a vector of determinis-t t t t y m ttic components including an intercept and possibly a linear time trend.Thew x thresholdis unknown.It takes on values in the intervalg⌳s,where12Ž.Ž.andare picked so that P Z Fs)0and P Z Fs-1.It is 12t11t22typical to treatandsymmetrically so thats1y,which imposes the 1221restriction that no‘‘regime’’has less than%of the total sample.The1particular choice foris somewhat arbitrary,and in practice must be guided1by the consideration that each‘‘regime’’needs to have sufficient observations to adequately identify the regression parameters.This choice is discussed in more detail at the end of Section4.2.The particular specification for the threshold variable Z is not essential tot y1the analysis.In general,what is necessary for our results is that Z bet y1 predetermined,strictly stationary,and ergodic with a continuous distributionM .CANER AND B .E .HANSEN1558function.Our particular choice Z s y y y is convenient because it is en-t t t y m Ž.Ž.sured to be stationary under the alternative assumptions that y is I 1and I 0.t For some of our analysis,it will be convenient to separately discuss the components of and .Partition these vectors as1212s ,s ,1212 0 0␣␣12where and are scalar,and have the same dimension as r ,and ␣1212t 1Ž.Ž.and ␣are k -vectors.Thus ,are the slope coefficients on y ,,212t y 112Ž.are the slopes on the deterministic components,and ␣,␣are the slope 12Ž.coefficients on ⌬y ,...,⌬y in the two regimes.t y 1t y k Ž.Our model 1specifies that all the slope coefficients switch between the regimes,but in some applications it may be desirable for only a subset of the coefficients to depend on the regime.There is nothing essential in this choice and other parameterizations may be used in other contexts.For the theoretical Ž.presentation,we retain the 
general unrestricted model (1) for ease of exposition. We impose the following maintained conditions on the model:

ASSUMPTION 1: e_t is an iid mean-zero sequence with a bounded density function and E|e_t|^{2γ} < ∞ for some γ > 2. For some matrix δ_T and continuous vector function r(s), δ_T r_{[Ts]} → r(s). The following parameter restrictions apply: ρ1 = ρ2 = 0; β1′r_t = μ1 and β2′r_t = μ2 for constants μ1 and μ2; and ι′α1 < 1 and ι′α2 < 1, where ι is a k-vector of ones.

The assumption that e_t is an independent sequence is essential for our asymptotic distribution theory and for our bootstrap approximations, and appears to be a meaningful restriction on the model. The parameter restrictions ensure that the time series Δy_t is stationary and ergodic, so that y_t is integrated of order one and can be described as a unit root process. The restriction that β1′r_t = μ1 and β2′r_t = μ2 implies that the only "trend" component that enters the true process is the intercept. This restriction is standard in the unit root testing literature, and guarantees that there are no quadratic trends in y_t.

An important question in applications is how to specify the deterministic component r_t. If the series y_t is nontrended it would seem natural to set r_t = 1, while if the series is highly trended then a natural option is to set r_t = (1 t)′. The inclusion of the linear trend will be necessary to ensure that the unit root tests we discuss in Section 5 have power against trend-stationary alternatives.

The coefficient restrictions on α1 and α2 given in Assumption 1 are sufficient to ensure that the series Δy_t is stationary and ergodic (see Pham and Tran (1985)), which is the only role of these restrictions. While these are a known set of sufficient conditions, they are not necessary (the region of ergodicity is larger than these assumptions); it is ergodicity that is essential for our results.

The TAR model (1) is estimated by least squares (LS). To implement LS estimation, it is convenient to use concentration. For each λ ∈ Λ, (1) is estimated by ordinary least squares (OLS):

(2)  Δy_t = θ̂1(λ)′x_{t−1} 1{Z_{t−1} < λ} + θ̂2(λ)′x_{t−1} 1{Z_{t−1} ≥ λ} + ê_t(λ).

Let

σ̂²(λ) = T^{−1} Σ_{t=1}^{T} ê_t(λ)²

be the OLS estimate of σ² for fixed λ. The least-squares estimate of the threshold λ is found by minimizing σ̂²(λ):

λ̂ = argmin_{λ∈Λ} σ̂²(λ).

The LS estimates of the other parameters are then found by plugging in the point estimate λ̂, viz. θ̂1 = θ̂1(λ̂) and θ̂2 = θ̂2(λ̂). We write the estimated model (1) as

(3)  Δy_t = θ̂1′x_{t−1} 1{Z_{t−1} < λ̂} + θ̂2′x_{t−1} 1{Z_{t−1} ≥ λ̂} + ê_t,

which also defines the LS residuals ê_t. Let σ̂² = T^{−1} Σ_{t=1}^{T} ê_t² denote the residual variance from the LS estimation.

The estimates (3) can be used to conduct inference concerning the parameters of (1) using standard Wald and t statistics. While the statistics are standard, their sampling distributions are nonstandard, due to the presence of possibly unidentified parameters and nonstationarity. We explore large-sample approximations in the following sections.

3. UNIT ROOT ASYMPTOTICS FOR THRESHOLD PROCESSES

The sampling distributions for our proposed statistics will require some new asymptotic tools. Rather than develop these tools for our specific model, we first develop the needed results under a set of more general conditions. Let "⇒" denote weak convergence as T → ∞ with respect to the uniform metric on [0,1]².

ASSUMPTION 2: For the sequence {U_t, e_t, X_t, w_t}, let F_t denote the natural filtration.
1. {U_t, e_t, w_t} is strictly stationary and ergodic and strong mixing with mixing coefficients α_m satisfying Σ_{m=1}^{∞} α_m^{1/2 − 1/r} < ∞ for some r > 2;
2. U_t has a marginal U[0,1] distribution;
3. e_t is independent of F_{t−1}, E e_t = 0, and E|e_t|^4 < ∞;
4. there exists a nonrandom matrix δ_T such that the array X_{Tt} = δ_T X_t satisfies X_{T[Ts]} ⇒ X(s) on s ∈ [0,1], where X(s) is continuous almost surely;
5. E|w_t|^{2+ε} < ∞ for some ε > 0.

The two most natural examples of processes X_t that satisfy condition 4 are integrated processes and polynomials in time. First, if X_t is an I(1) process, then δ_T = T^{−1/2} and X(s) is a scaled Brownian motion. Second, if X_t = (1 t)′ (a constant and linear trend), then

δ_T = [1 0; 0 T^{−1}]  and  X(s) = (1 s)′.

Other polynomials in time, or higher-order integrated processes, can be handled similarly.

Define 1_t(u) = 1{U_t < u}, the partial-sum process

W_i(u) = Σ_{t=1}^{i} 1_{t−1}(u) e_t,

and the scaled array

W_T(s,u) = (1/(σ√T)) W_{[Ts]}(u) = (1/(σ√T)) Σ_{t=1}^{[Ts]} 1_{t−1}(u) e_t,

where σ² = E e_t² < ∞.

DEFINITION 1: W(s,u) is a two-parameter Brownian motion on (s,u) ∈ [0,1]² if W(s,u) ~ N(0, su) and E(W(s1,u1)W(s2,u2)′) = (s1 ∧ s2)(u1 ∧ u2).

THEOREM 1: Under Assumption 2,

(4)  W_T(s,u) ⇒ W(s,u)

on (s,u) ∈ [0,1]² as T → ∞, where W(s,u) is a two-parameter Brownian motion.

It may be helpful to think of Theorem 1 as a two-parameter generalization of the usual functional limit theorem. We now define stochastic integration with respect to the two-parameter process W(s,u). Let

J(u) = ∫_0^1 X(s) dW(s,u) ≡ plim_{N→∞} Σ_{j=1}^{N} X((j−1)/N) (W(j/N, u) − W((j−1)/N, u)),

where plim denotes convergence in probability. The integration is over the first argument of W(s,u), holding the second argument constant. We will call the process J(u) a stochastic integral process.

THEOREM 2: Under Assumption 2,

(1/(σ√T)) Σ_{t=1}^{T} X_{T,t−1} 1_{t−1}(u) e_t = ∫_0^1 X_T(s) dW_T(s,u) ⇒ J(u) = ∫_0^1 X(s) dW(s,u)

on u ∈ [0,1] as T → ∞, and J(u) is almost surely continuous.

This result is a natural extension of the theory of weak convergence to stochastic integrals (see Hansen (1992)). Finally, we need to describe the asymptotic covariances between stationary processes and the nonstationary process X_t when interacted with the indicator function 1_{t−1}(u). Define the moment functionals h(u) = E(1_{t−1}(u) w_{t−1}) and H(u) = E(1_{t−1}(u) w_{t−1} w_{t−1}′).

THEOREM 3: Under Assumption 2, on u ∈ [0,1] as T → ∞,
1. (1/T) Σ_{t=1}^{T} 1_{t−1}(u) w_{t−1} X_{T,t−1}′ ⇒ h(u) ∫_0^1 X(s)′ ds;
2. (1/T) Σ_{t=1}^{T} 1_{t−1}(u) w_{t−1} w_{t−1}′ ⇒ H(u);
3. (1/T) Σ_{t=1}^{T} 1_{t−1}(u) X_{T,t−1} X_{T,t−1}′ ⇒ u ∫_0^1 X(s) X(s)′ ds.

Theorems 1-3 will serve as the building blocks for the subsequent theory developed in this paper.

4. TESTING FOR A THRESHOLD EFFECT

4.1. Wald Test Statistic

In model (1) a question of particular interest is whether or not there is a threshold effect. The threshold effect disappears under the joint hypothesis

(5)  H_0: θ1 = θ2.

Our test of (5) is the standard Wald statistic W_T for this restriction.² This statistic can be written as

W_T = T (σ̂_0²/σ̂² − 1),

where σ̂² is defined above as the residual variance from (3), and σ̂_0² is the residual variance from OLS estimation of the null linear model. The following relationship may be of some interest. Let

W_T(λ) = T (σ̂_0²/σ̂²(λ) − 1)

denote the Wald statistic of the hypothesis (5) for fixed λ from regression (2). Then since W_T(λ) is a decreasing function of σ̂²(λ) we see that

(6)  W_T = W_T(λ̂) = sup_{λ∈Λ} W_T(λ).

Thus the Wald statistic for H_0 is often called the "Sup-Wald" statistic.

² In applications it may also be useful to consider statistics that focus on subvectors of θ1 and θ2. See Section 4.5.

4.2. Asymptotic Distribution

Under the null hypothesis (5) of no threshold effect the parameter λ is not identified, rendering the testing problem nonstandard. The asymptotic distribution of W_T for stationary data has been investigated by Davies (1987), Chan (1991), Andrews and Ploberger (1994), and Hansen (1996). Our concern is with the case of a unit root, which has not been studied previously.

Let G(·) denote the marginal distribution function of Z_t, set π1 = G(λ1) and π2 = G(λ2), and define 1_t(u) = 1{G(Z_{t−1}) < u} and w_t = (Δy_{t−1}, ..., Δy_{t−k})′.

THEOREM 4: Under Assumption 1 and H_0: θ1 = θ2,

W_T ⇒ T ≡ sup_{π1 ≤ u ≤ π2} T(u),

where

(7)  T(u) = Q1(u) + Q2(u),

and Q1(u) and Q2(u) are the independent stochastic processes defined in (8) and (9) below.

1. Let W(s,u) be a two-parameter Brownian motion; set W(s) = W(s,1). Set X(s) = (W(s) r(s)′)′ and J1(u) = ∫_0^1 X(s) dW(s,u), a stochastic integral process as defined in Section 3, and set J1*(u) = J1(u) − uJ1(1). Then

(8)  Q1(u) = J1*(u)′ ( u(1−u) ∫_0^1 X(s)X(s)′ ds )^{−1} J1*(u).

2. Let J2(u) be a zero-mean Gaussian process, independent of W(s,u), with covariance kernel E(J2(u1)J2(u2)′) = Ω(u1 ∧ u2), where Ω(u) = H(u) − u^{−1} h(u)h(u)′, H(u) = E(1_{t−1}(u) w_{t−1} w_{t−1}′), and h(u) = E(1_{t−1}(u) w_{t−1}). Then

(9)  Q2(u) = J2*(u)′ ( Ω(u) − Ω(u)Ω(1)^{−1}Ω(u) )^{−1} J2*(u).

Theorem 4 gives the large-sample distribution of the conventional Wald statistic for a threshold for the nonstationary autoregression (1) under the unit root restriction ρ1 = ρ2 = 0. Notice that the distribution T can be written as the supremum of the sum of two independent processes Q1(u) and Q2(u). The process Q2(u) is a chi-square process, taking the same form as found by Hansen (1996) for threshold tests applied to stationary data. The process Q1(u) takes a very different form, and is a reflection of the nonstationary regressors. We see that the presence of nonstationarity in the data changes the asymptotic distribution of the threshold test, and this will need to be taken into consideration for correct asymptotic inference.

The case of stationary data can be deduced from Theorem 4 by removing W(s) from the definition of X(s), which is the result of omitting y_{t−1} from the regression model (1). The asymptotic distribution then corresponds to that found by Hansen (1996).

In general, the asymptotic distribution T is nonpivotal and depends upon the nuisance parameter function Ω(u). The dependence on the data structure is quite complicated, so as a result, critical values cannot be tabulated. In the following section, we discuss a bootstrap method to approximate the null distribution of W_T.

It is also helpful to observe that Theorem 4 shows that the critical values of T will increase as π1 decreases and/or π2 increases. This means that larger values of W_T will be needed to reject the null when extreme values of π1 and/or π2 are used. In analogy to the discussion in Andrews (1993) concerning the choice of trimming in tests for structural change, the distribution of T diverges to positive infinity as π1 → 0 or π2 → 1. Thus setting π1 = 0 or π2 = 1 renders the test inconsistent. It follows that it is necessary to select values of π1 and π2 in the interior of (0,1), and values too close to the endpoints reduce the power of the test. On the other hand, it is desirable to select π1 and π2 so that the true value of G(λ0) lies in the interval [π1, π2] under the alternative hypothesis; otherwise the test may have difficulty in detecting the presence of the threshold effect. Andrews (1993) suggests that setting π1 = .15 and π2 = .85 provides a reasonable trade-off between these considerations, and these are the values we select in our simulations and applications. Since the particular choice is somewhat arbitrary, it appears sensible in practical applications to explore the robustness of the results to this choice.

4.3. Bootstrap

In this section, we discuss two bootstrap approximations to the asymptotic distribution of W_T, one based on the unrestricted estimates, and the other enforcing the restriction of a unit root. These bootstrap approximations can be used to calculate critical values and p-values.

Under the null hypothesis, θ1 = θ2 = θ, say, so for simplicity we omit subscripts on the coefficients for the remainder of this section. Under H_0 and the assumption that the only deterministic component is the intercept (see Assumption 1) the model simplifies to Δy_t = ρ y_{t−1} + μ + α′Δỹ_{t−1} + e_t, where Δỹ_{t−1} = (Δy_{t−1} ··· Δy_{t−k})′. As the distribution of the test is invariant to level shifts, we can set μ = 0 so the model simplifies to Δy_t = ρ y_{t−1} + α′Δỹ_{t−1} + e_t. Since this is entirely determined by ρ, α, and the distribution F of the error e_t, we can use a model-based bootstrap.

We first describe the unrestricted bootstrap estimate. Let (ρ̃, α̃, F̃) be estimates of (ρ, α, F). The bootstrap distribution of W_T^b is a conditional distribution determined by the random inputs (ρ̃, α̃, F̃). It is determined as follows. Let e_t^b be a random draw from F̃, and let y_t^b be generated as Δy_t^b = ρ̃ y_{t−1}^b + α̃′Δỹ_{t−1}^b + e_t^b, where Δỹ_{t−1}^b = (Δy_{t−1}^b ··· Δy_{t−k}^b)′. Initial values for the recursion can be set to sample values of the demeaned series. The distribution of y_t^b is the bootstrap distribution of the data. Let W_T^b be the threshold Wald test calculated from the series y_t^b. The distribution of W_T^b is the bootstrap distribution of the Wald test. Its bootstrap p-value is p_T = P(W_T^b > W_T | F_T), where conditioning on F_T denotes that this probability is conditional on the observed data. Typically, the bootstrap p-value is calculated by simulation, where a large number of independent Wald tests W_T^b are simulated, and the p-value p_T is approximated by the frequency of simulated W_T^b that exceed W_T.

To implement the bootstrap we need estimates (ρ̃, α̃, F̃). For (ρ, α) we need an estimate that imposes the null hypothesis; an obvious choice is the estimate (ρ̃, α̃) obtained by regressing Δy_t on x_{t−1}. An estimator for F is the empirical distribution of the OLS residuals ẽ_t. In typical statistical contexts (when the asymptotic distribution is a smooth function of the model parameters and the parameter estimates are consistent) bootstrap distributions will converge in probability to the correct asymptotic distribution (denoted W_T^b ⇒_p T),³ implying that the bootstrap p-value will be first-order asymptotically correct. In our model, this convergence depends on the true value of ρ. If the time series is stationary, then the bootstrap will achieve the correct first-order asymptotic distribution, since the model parameters are consistently estimated and the asymptotic distribution is a smooth function of these parameters (a similar formal argument is presented in Hansen (1996)). If the time series has a unit root, however, this will not be the case. The asymptotic distribution is discontinuous in the parameters at the boundary ρ = 0, so the bootstrap distribution will not be consistent for the correct sampling distribution.

We can achieve the correct asymptotic distribution by imposing the true unit root. This is done by imposing the constraint ρ = 0: set the estimates of (ρ, α, F) to (0, α̃, F̃), where (α̃, F̃) are defined above. Then generate random samples y_t^b from the model Δy_t^b = α̃′Δỹ_{t−1}^b + e_t^b with e_t^b drawn randomly from F̃. These samples are unit root processes. For each sample y_t^b, calculate the test statistic W_T^b. The estimated bootstrap p-value is the percentage of simulated W_T^b that exceed the observed W_T.

³ The symbol "⇒_p" denotes "weak convergence in probability" as defined in Gine and Zinn (1990), which is the appropriate convergence definition for bootstrap distributions.

This constrained bootstrap is first-order correct under H_0 if the true parameter values satisfy ρ = 0. If the true process is stationary, however, the constrained bootstrap will be incorrect. We see that we have introduced two bootstrap methods, one appropriate for the stationary case, and the other appropriate for the unit root case. If the true order of integration is unknown (as is likely in applications), then it appears prudent to calculate the bootstrap p-values both ways, and base inference on the more conservative (the larger) p-value.

4.4. A Monte Carlo Experiment

In order to examine the size and power of the proposed test a small sample study is conducted. The model used is equation (1) with k = 1, a linear time trend, and z_{t−1} = Δy_{t−1}:

(10)  Δy_t = (ρ1 y_{t−1} + β1 t + μ1 + α1 Δy_{t−1}) 1{Δy_{t−1} < λ} + (ρ2 y_{t−1} + β2 t + μ2 + α2 Δy_{t−1}) 1{Δy_{t−1} ≥ λ} + e_t,

and e_t iid N(0,1). The sample size we use is T = 100. We examine nominal 5% size tests based on estimation of model (10) using bootstrap critical values, the latter calculated using 500 bootstrap replications. All calculations are empirical rejection frequencies from 10,000 Monte Carlo replications, and in all experiments the tests are based on least-squares estimation of the unrestricted model (10).

We first examined the size of the bootstrap tests (used on the unconstrained estimates and the unit-root-constrained estimates). Under the null hypothesis of no threshold, data are generated by the AR(1) process Δy_t = ρ y_{t−1} + α Δy_{t−1} + e_t. We explored how the size is affected by the parameters ρ and α. The results are presented in Table I. For all cases considered, the size of both tests is excellent. Interestingly, the two bootstrap procedures have near identical size in our simulations, with the unit-root-constrained bootstrap being slightly liberal in some cases, and the unconstrained bootstrap being slightly more conservative in some cases. This evidence suggests that it might not matter much which procedure is used; however, our recommendation is to compute both procedures and take the more conservative results.

Next, we explore the power of the test against local alternatives. Because of the minor differences between the two bootstrap procedures, we calculate the power using the unconstrained bootstrap method. We consider three alternatives, allowing μ1 ≠ μ2, ρ1 ≠ ρ2, and α1 ≠ α2 separately. The first alternative allows a switching intercept:

Δy_t = ρ y_{t−1} + μ1 1{Δy_{t−1} < λ} + μ2 1{Δy_{t−1} ≥ λ} + α Δy_{t−1} + e_t.

TABLE I
SIZE OF 5% BOOTSTRAP THRESHOLD TESTS

                    Unconstrained Bootstrap                 Constrained Bootstrap
            ρ=−.25   ρ=−.15   ρ=−.05   ρ=0        ρ=−.25   ρ=−.15   ρ=−.05   ρ=0
α = −.5      .038     .055     .051    .048        .060     .054     .041    .059
α = .5       .040     .050     .049    .042        .043     .047     .044    .058

Note: T = 100. Nominal size 5%. Rejection frequencies from 10,000 replications.
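The estimation and bootstrap steps above translate into a short program. Below is a minimal Python sketch of the sup-Wald statistic (6) and the two model-based bootstraps, assuming an intercept-only deterministic component, the threshold variable Z_{t−1} = Δy_{t−1}, trimming at (.15, .85), and a grid of candidate thresholds at the observed values of Z; the function names and implementation details are illustrative, not the authors' code.

python
import numpy as np

def sup_wald(y, k=1, trim=(0.15, 0.85)):
    # Regress dy_t on x_{t-1} = (y_{t-1}, 1, dy_{t-1}, ..., dy_{t-k}), split by the
    # threshold variable Z_{t-1} = dy_{t-1}, and concentrate out the threshold.
    dy = np.diff(y)
    T = len(dy) - k
    Y = dy[k:]
    lags = np.column_stack([dy[k - j:-j] for j in range(1, k + 1)])
    X = np.column_stack([y[k:-1], np.ones(T), lags])
    Z = dy[k - 1:-1]
    e0 = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    s0 = e0 @ e0 / T                                   # null (linear model) variance
    best = 0.0
    for lam in np.sort(Z)[int(trim[0] * T): int(trim[1] * T)]:
        d = (Z < lam).astype(float)[:, None]
        Xs = np.column_stack([X * d, X * (1.0 - d)])   # regime-specific regressors
        e = Y - Xs @ np.linalg.lstsq(Xs, Y, rcond=None)[0]
        s = e @ e / T
        best = max(best, T * (s0 / s - 1.0))           # W_T(lam) = T(s0/s(lam) - 1)
    return best

def bootstrap_pvalue(y, k=1, B=500, unit_root=False, rng=None):
    # Model-based bootstrap under the null of no threshold:
    # fit dy_t = rho*y_{t-1} + mu + alpha'(dy_{t-1},...,dy_{t-k}) + e_t,
    # resample the residuals, and regenerate the series (mu is dropped in the
    # recursion since the test is invariant to level shifts).
    # unit_root=True imposes rho = 0, the constrained bootstrap.
    rng = rng or np.random.default_rng(0)
    dy = np.diff(y)
    T = len(dy) - k
    Y = dy[k:]
    lags = np.column_stack([dy[k - j:-j] for j in range(1, k + 1)])
    X = np.column_stack([y[k:-1], np.ones(T), lags])
    beta = np.linalg.lstsq(X, Y, rcond=None)[0]
    resid = Y - X @ beta
    rho, alpha = (0.0 if unit_root else beta[0]), beta[2:]
    W = sup_wald(y, k)
    exceed = 0
    for _ in range(B):
        e = rng.choice(resid, size=len(y))             # iid draws from the residual EDF
        yb = np.empty(len(y))
        yb[:k + 1] = y[:k + 1] - y.mean()              # start from demeaned sample values
        for t in range(k + 1, len(y)):
            dlags = np.array([yb[t - j] - yb[t - j - 1] for j in range(1, k + 1)])
            yb[t] = yb[t - 1] + rho * yb[t - 1] + dlags @ alpha + e[t]
        exceed += sup_wald(yb, k) > W
    return exceed / B

In practice one would compute bootstrap_pvalue(y, unit_root=False) and bootstrap_pvalue(y, unit_root=True) and, as recommended above, base inference on the larger of the two p-values.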
Estimation of scale parameters in scipy

Scipy is a powerful scientific computing library in Python that provides various tools and functions for a wide range of mathematical and statistical operations. One of its essential capabilities is the estimation of scale parameters, which is crucial in many data analysis and modeling tasks. In this article, we will explore and explain the process of estimating scale parameters using Scipy, step by step.

Introduction to Scale Parameters

Before diving into the estimation process, let's first understand what scale parameters are and why they are important in statistical analysis. In probability theory and statistics, a scale parameter is a characteristic of a probability distribution that governs the spread or dispersion of the data. It provides information about the variability or level of uncertainty in the distribution.

For example, in a normal distribution, the scale parameter is the standard deviation, which measures the typical distance between each data point and the mean. The larger the standard deviation, the more spread out the data points are from the mean, indicating higher variability. As another example, in the exponential distribution the scale parameter is the reciprocal of the rate parameter and corresponds to the average time between events.

Estimating Scale Parameters using Scipy

Scipy provides several functions and methods for estimating scale parameters from data. We will focus on three commonly used methods: maximum likelihood estimation (MLE), method of moments (MoM), and least squares.

Maximum Likelihood Estimation (MLE)

Maximum likelihood estimation is a widely used statistical method for estimating parameters. In Scipy, the `fit` method of the distributions in the `scipy.stats` module can be used for MLE estimation. Let's see an example using a normal distribution:

python
from scipy.stats import norm

# Generate a random sample from a normal distribution
data = norm.rvs(loc=5, scale=2, size=100)

# Estimate the mean and standard deviation using MLE
mu, sigma = norm.fit(data)

In this example, we first generate a random sample of size 100 from a normal distribution with a mean of 5 and a standard deviation of 2. Then, we use the `norm.fit` function to estimate the mean (`mu`) and standard deviation (`sigma`) using MLE. The `fit` function returns the estimated parameters.

Method of Moments (MoM)

The method of moments is another statistical technique for parameter estimation. Scipy provides the `moment` function in the `scipy.stats` module, which computes sample central moments. Continuing the previous example, the MoM estimates of the mean and standard deviation are the sample mean and the square root of the second central moment:

python
import numpy as np
from scipy.stats import moment

# Estimate the mean and standard deviation using MoM
mu_mom = np.mean(data)                       # first raw moment: the sample mean
sigma_mom = moment(data, moment=2) ** 0.5    # square root of the second central moment

Here, `scipy.stats.moment` computes moments about the sample mean, so the second moment is the (biased) sample variance and its square root is the MoM estimate of the standard deviation. Note that the first central moment is always zero, so the mean itself is estimated by the sample average.

Least Squares Estimation

Least squares estimation is a curve-fitting method that minimizes the sum of the squared differences between the observed and predicted values. Scipy provides the `curve_fit` function in the `scipy.optimize` module for least squares estimation.
Let's see an example using an exponential distribution:

python
import numpy as np
from scipy.stats import expon
from scipy.optimize import curve_fit

# Generate a random sample from an exponential distribution
data_expon = expon.rvs(scale=2, size=100)

# Define the exponential density as a function of the scale parameter
def exponential(x, scale):
    return expon.pdf(x, scale=scale)

# Build an empirical density from a histogram of the sample
heights, edges = np.histogram(data_expon, bins=20, density=True)
centers = (edges[:-1] + edges[1:]) / 2

# Estimate the scale parameter by least squares on the empirical density
popt, _ = curve_fit(exponential, centers, heights, p0=[1.0])
scale = popt[0]

In this example, we first generate a random sample of size 100 from an exponential distribution with a scale parameter of 2. We then form an empirical density by normalizing a histogram of the sample, and use `curve_fit` to choose the scale parameter whose density best matches it in the least-squares sense. The `curve_fit` function returns the estimated parameters (`popt`) and their covariance estimates.

Conclusion

In this article, we have explored the estimation of scale parameters using Scipy. We discussed three commonly used methods: maximum likelihood estimation (MLE), method of moments (MoM), and least squares. Scipy provides convenient functions and methods like `fit`, `moment`, and `curve_fit` for estimating scale parameters from data. Understanding and estimating scale parameters is essential in various statistical analyses and modeling tasks, allowing us to gain insight into the dispersion and variability of data distributions. With Scipy's estimation capabilities, we can effectively analyze and model data, making informed decisions based on statistical evidence.
The Transformation Technique
By Matt Van Wyhe, Tim Larsen, and Yongho Choi

If one is trying to find the distribution of a function (statistic) of a given random variable X with known distribution, it is often helpful to use the Y = g(X) transformation. This transformation can be used to find the distributions of many statistics of interest, such as the distribution of a sample variance or of a sample mean. The following notes explain conceptually how this is done for both discrete and continuous random variables, in the univariate and multivariate cases. Because these derivations can be difficult or impossible to carry out in practice, the third section shows a technique for estimating the distributions of statistics by simulation.

1. Univariate Random Variables

Given X, we can define Y as the random variable generated by transforming X by some function g(·), i.e. Y = g(X). The distribution of Y = g(X) then depends both on the known distribution of X and on the transformation itself.

1.1 Discrete Univariate Random Variables

In the discrete case, for a random variable X that can take on values x1, x2, x3, ..., xn with probabilities Pr(x1), Pr(x2), Pr(x3), ..., Pr(xn), the possible values of Y are found by simply plugging x1, x2, x3, ..., xn into the transformation Y = g(X). These values need not be unique for each xi; several values of X may yield identical values of Y. Think for example of g(X) = X^2, g(X) = |X|, g(X) = max{X, 10}, etc., each of which has multiple values of x that can generate the same value of y. The density function of Y is then the sum over the x's that yield the same value of y:

Pr(y_j) = Σ_{i: g(x_i) = y_j} Pr(x_i),

or, in other terms,

f_Y(y_j) = Σ_{i: g(x_i) = y_j} f_X(x_i),

and the cumulative distribution function is

F_Y(y) = Pr(Y ≤ y) = Σ_{i: g(x_i) ≤ y} Pr(x_i).

Example 1-1
Let X have possible values -2, 1, 2, 3, 6, each with probability .20. Define the function Y = g(X) = (X − X̄)^2, the squared deviation from the mean for each x_j. To find the distribution of Y = g(X), first calculate X̄ = 2 for these x_j. With a mean of 2, (X − X̄)^2 can take on the values 16, 1, and 0 with Pr(16) = .4, Pr(1) = .4, and Pr(0) = .2, the sums of the probabilities of the corresponding x_j's. This is the density function for Y = g(X). Formally:

f_Y(y) = 0.2 if y = 0;  0.4 if y = 1 or y = 16;  0 otherwise.

1.2 Continuous Univariate Random Variables

To derive the density function for a continuous random variable, use the following theorem from MGB:

Theorem 11 (p. 200 in MGB)
Suppose X is a continuous random variable with probability density function f_X(·), and let X = {x : f_X(x) > 0} be the set of possible values of X. Assume that:
1. y = g(x) defines a one-to-one transformation of X onto its domain D.
2. The derivative of x = g^{-1}(y) is continuous and nonzero for y ∈ D, where g^{-1}(y) is the inverse function of g(x); that is, g^{-1}(y) is that x for which g(x) = y.
Then Y = g(X) is a continuous random variable with density

f_Y(y) = |d/dy g^{-1}(y)| · f_X(g^{-1}(y)) if y ∈ D;  0 otherwise.

Proof—see MGB p. 200.

Example 1-2
Let's now define a function Y = g(X) = 4X − 2 and assume X has an exponential distribution, so f_X(x) = λe^{−λx} with λ > 0, and X = g^{-1}(Y) = (Y + 2)/4. Since the mean of the exponential is 1/λ, the simplifying assumption of a mean of 2 gives the parameter λ = 1/2.
Applying Theorem 11, we get the following distribution for Y = g(X):

f_Y(y) = |d/dy (y + 2)/4| · (1/2) e^{−(1/2)(y + 2)/4} = (1/8) e^{−(y + 2)/8} if (y + 2)/4 ≥ 0;  0 otherwise.

[Figure: Density function for X (exponential distribution with λ = 1/2)]
[Figure: Density function of the transformation Y = g(X) = 4X − 2 with X distributed exponentially with λ = 1/2]

Clearly from this example, we could also solve the transformation more generally without a restriction on λ.

If y = g(x) does not define a one-to-one transformation of X onto D (and hence the inverse x = g^{-1}(y) does not exist), we can still use the transformation technique; we just need to split the domain into sections (disjoint sets) on which y = g(x) is monotonic, so that an inverse exists on each section.

Example 1-3
Let Y = g(X) = sin X on the interval [0, π]. To ensure that inverse functions exist, split the domain into [0, π/2) and [π/2, π]. Also, let X again have an exponential distribution, this time with mean 1 and λ = 1. On [0, π/2) the inverse is x = sin^{-1} y, and on [π/2, π] it is x = π − sin^{-1} y. Applying Theorem 11 on each piece and summing, we have

f_Y(y) = |d/dy sin^{-1} y| e^{−sin^{-1} y} + |d/dy sin^{-1} y| e^{−(π − sin^{-1} y)} for y ∈ [0, 1];  0 otherwise,

which becomes

f_Y(y) = (1/√(1 − y²)) (e^{−sin^{-1} y} + e^{−(π − sin^{-1} y)}) for y ∈ [0, 1];  0 otherwise.

[Figure: Density function for X (exponential distribution with λ = 1)]
[Figure: Splitting the function Y into invertible halves (both halves monotonic): Y = sin(X) for X ∈ [0, π/2) and Y = sin(X) for X ∈ [π/2, π]]
[Figure: Density function for Y = g(X) = sin(X) with X distributed exponentially with λ = 1]

1.3 Probability Integral Transform
As noted previously in class, a common application of the Y = g(X) transformation is creating random samples from various continuous distributions. The idea is to set Y = F_X(X), the CDF of X evaluated at X, which has a unit uniform distribution. Solving for X in terms of Y and applying Theorem 11 as above gives the transformed density. Then, starting with a random sample from the unit uniform distribution, we can plug in values for Y and generate a random sample from the distribution of X.

2. Multivariate Transformations
The procedure given in the book for finding the distribution of transformations of discrete random variables is notation heavy, and it is hard to see how to apply it to an actual example. Here we will instead explain the procedure with an example.

2.1 Discrete Multivariate Transformations

Example 2-1
Mort has three coins: a penny, a dime, and a quarter. These coins are not necessarily fair. The penny has only a 1/3 chance of landing on heads, the dime has a 1/2 chance of landing on heads, and the quarter has a 4/5 chance of landing on heads. Mort flips each of the coins once in a round.

Let 1 mean "heads" and let 0 mean "tails". Also, we will refer to a possible outcome of the experiment as {penny, dime, quarter}, so that {1,1,0} means the penny and the dime land on heads while the quarter lands on tails. The joint distribution of the variables "P", "D", and "Q" assigns to each of the eight outcomes the product of the corresponding head and tail probabilities; for example, Pr({1,1,0}) = (1/3)(1/2)(1/5) = 1/30.

Let's say we care about two functions of the above random variables: how many heads there are in a round (Y1), and how many heads there are between the penny and the quarter, ignoring the dime (Y2). We can either find the distributions individually or find the joint distribution of Y1 and Y2.
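As a quick sanity check, the eight outcomes can be enumerated directly; the short Python sketch below reproduces the distributions that are derived by hand in what follows (the head probabilities 1/3, 1/2, and 4/5 come from the example; the variable names are just for illustration).

python
from itertools import product

# Head probabilities for the penny, dime, and quarter
p_head = {"P": 1/3, "D": 1/2, "Q": 4/5}

dist_y1, dist_y2, joint = {}, {}, {}
for outcome in product([0, 1], repeat=3):          # (penny, dime, quarter)
    prob = 1.0
    for coin, flip in zip("PDQ", outcome):
        prob *= p_head[coin] if flip == 1 else 1 - p_head[coin]
    y1 = sum(outcome)                               # heads among all three coins
    y2 = outcome[0] + outcome[2]                    # heads among penny and quarter
    dist_y1[y1] = dist_y1.get(y1, 0) + prob
    dist_y2[y2] = dist_y2.get(y2, 0) + prob
    joint[(y1, y2)] = joint.get((y1, y2), 0) + prob

print(dist_y1)   # e.g. Pr(Y1 = 0) = (2/3)(1/2)(1/5) = 1/30
print(dist_y2)
print(joint)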
Lets start with finding them individually.To do this, we find all of the cases of the pre-transformed joint distribution that apply to all the possible outcomes of Y1 and Y2 and add them up. Distribution of Y 1, f Distribution of Y2, fWe will do this same process for finding the joint distribution of Yand Y.To Simplify: fTo see something interesting, let Y3 be the random variable of the number of heads of the dime in a round. The distribution of Y3, f Y3(y3)=The joint distribution of Y and Y is then fBecause Y2 and Y3 are independent, f Y2,Y3(y2,y3)=f Y2(y2)∗f Y3(y3), which is really easy to see here.To apply the process used above to different problems can be straightforward or very hard, but most of the problems we have seen in the book or outside the book have been quite solvable. The key to doing this type of transformation is to make sure you know exactly which sample point from the untransformed distribution corresponds to which sample point from the transformed distribution. Once you have figured this out, solving for the transformed distribution is straightforward.We think the process outlined in this example is applicable to all discrete multivariate transformations, except perhaps in the case where the number of sample points with positive probability is infinite.2.2 Continuous Multivariate TransformationsExtending the transformation process from a single continuous random variable to multiple random variables is conceptually intuitive, but often very difficult to implement.The general method for solving continuous multivariate transformations is given below. In this method, we will transform multiple random variables into multiple functions of these random variables. Below is the theorem we will use in the method.This is Theorem 15 directly from Mood, Graybill, and Boes;Let X1,X2,…,X n be jointly continuous random variables with density function f X1,X2,…,X n(x1,x2,…,x n). We want to transform these variables into Y1,Y2,…,Y n where Y1,Y2,…,Y n∈N. Let X={(x1,x2,…,x n): f X1,X2,…,X n(x1,x2,…,x n)>0}. Assume that X can be decomposed into sets X1,…,X m such that y1=g1(x1,…,x n),…,y n=g n(x1,…,x n)is a one to one transformation of X i onto N, i=1,…,m. Let x1=g1i−1(y1,y2,…,y n),… ,x n=g1n−1(y1,y2,…,y n) denote the inverse transformation of N onto X i, i=1,…,m. Define J i as the determinant of the matrix of the derivatives of inverse transformations with respect to the transformed variables, y1,y2,…,y n, for all i=1,…,m. Assuming that all the partial derivatives in J i are continuous over N and the determinant J i is non-zero for all the i’s. Thenf Y1,Y2,…,Y n(y1,y2,…,y n)=�|m i=1J i|f X1,X2,…,X n�g1i−1(y1,y2,…,y n),…,g1n−1(y1,y2,…,y n)�for (y1,y2,…,y n)∈N.|J| in this theorem is the absolute value of the determinate of the Jacobian matrix. Using the notation of the theorem, the Jacobian matrix is the matrix of thefirst-order partial derivatives of all of the X ’s with respect to all of the Y ’s. For a transformation from two variables, X and Y , to two functions of the two variables, Z and U , the Jacobian matrix is:Jacobian Matrix =�dx dz dx du dy dz dy du � The determinate of the Jacobian Matrix, often called simply a “Jacobian”, is thenJ =�dx dz dx du dy dz dy du �=dx dz dy du −dy dz dx du The reason we need the Jacobian in Theorem 15 is that we are doing whatmathematicians refer to as a coordinate transformation. 
It has been proven that an absolute value of a Jacobian must be used in this case (refer to a graduate level calculus book for a proof).Notice from Theorem 15 that we take n number of X ’s and transform them into n number of Y ’s. This must be the case for the theorem to work, even if we only care about k number of Y ’s where k ≤n .Let’s say that you have a set of random variables, X 1,X 2,…,X n and you would like to know the distributions of functions of these random variables. Let’s say the functions you would like to know are called Y i (X 1,X 2,…,X n ) for i ϵ (0,k ). In order to make the dimensions the same (for the theorem to work), we will need to add transformations Y i+1 to Y n , even though we do not care about them. Note the choice of these extra Y’s can greatly affect the complexity of the problem, although we have no intuition about what choices will make thecalculations the easiest. After the joint density is found with this method, you can integrate out the variables you do not care about to make a joint density of only the 0 through k Y i ′s you do care about.Another important point in this theorem is an extension of a problem we saw above in the univariate case. If our transformations are not 1 to 1, then we must divide up the space into areas where we do have 1 to 1 correspondence, do our calculations, and then sum them up, which is exactly what the theorem does. Example 2-2Let (X ,Y ) have a joint density function of f X ,Y (x ,y )=�xe −x e −y for x ,y >00 if elseNote that f X (x )=xe −x for x >0 and f Y (y )=e −y for y >0.{Also note: ∫∫xe −x e −y dx dy =1∞0∞0, ∫xe −x dx =1∞0, and ∫e −y dy =1∞0} We want to find the distribution for the variables Z and U , where Z =X +Y and U =Y /X . We first find the Jacobian. Note: Z =X +Y and U =Y /X and therefore:X =g X −1(z ,u )=Z U +1 and Y =g Y −1(z ,u )= UZ U +1So: dx dz= 1U +1 and dx du =−Z (U +1)2 And dy dz =U U +1and dy du = Z (U +1)2 Plugging these into the definition: J =�dx dz dx du dy dz dy du �=dx dz dy du −dy dz dx duSo the determinate of the Jacobian Matrix is J =z (u+1)2 Z is the sum of only positive numbers and U is a positive number divided by a positive number, so the absolute value of J is just J :|J |=�z (u +1)2�=z (u +1)2 Notice that in this example, the transformations are always one-to-one, so we donot need to worry about segmenting up the density function.So using the theorem:f Z ,U (z ,u )=|J |f X ,Y �g X −1(z ,u ),g Y −1(z ,u )�Substituting in | J |=z (u+1)2, g X −1(z ,u )=X =Z U+1, and g Y −1(z ,u )= Y =UZ U+1,we get: f Z ,U (z ,u )=z (u +1)2f X ,Y �Z U +1,UZ U +1� Remember from above that f X ,Y (x ,y )=�xe −x e −y for x ,y ≥00 otℎerwise So our joint density becomes:f Z ,U (z ,u )=z (u +1)2∗z u +1e −z (u+1)⁄e −uz (u+1)⁄ for z ,u ≥0f Z ,U (z ,u )=z 2(u +1)3e −z (u+1)⁄e −uz (u+1)⁄ for z ,u ≥0 If we wanted to know the density functions of Z and U separately, just integrate out the other variable in the normal way:f Z (z )=�f Z ,U (z ,u )du ∞−∞=�f Z ,U (z ,u )du 0−∞+�f Z ,U (z ,u )du ∞0But U cannot be less than zero so:�f Z ,U (z ,u )du 0−∞=0 Simplifying:f Z (z )=�f Z ,U (z ,u )du ∞0 =z 22e −z for z >0. Andf U (u )=�f Z ,U (z ,u )dz ∞0=2(1+u )3 for u >0.3. SimulatingOften obtaining an analytical solution to these sorts of transformations is nearly impossible. This may be because the integral has no closed form solution or that the transformed bounds of the integral are very hard to translate. In thesesituations, a different approach to estimating the distribution of a new statistic is to run simulations. 
We have used Matlab for the following examples because there are some cool functions you can download that will randomly draw from many different kinds of distributions. (If you don’t have access to this function, you can do the trick where you randomly sample over the uniform distribution between 0 and 1 and then back out the random value through the CDF. This technique was covered earlier in the class notes).A boring example: (We started with boring to illustrate the technique, below this are more interesting examples. Also, we have tried to solve this simple example using the method above and found it very difficult).Let’s say we have two uniformly distributed variables, X1 and X2 such thatX1~U(0,1) and X2~U(0,1). We want to find the distribution of Y1 whereY1=X1+X2.Choose how many simulation points you would like to have and then take that many draws from the uniform distribution for X1. When we say take a draw from a particular distribution, that means asking Matlab to return a realization of a random variable according to its distribution, so Matlab will give us a number. The number could potentially be any number where its distribution function is positive. Each draw that we take is a single simulated observation from that distribution.Let’s say we have chosen to take a sample with 100 simulated observations, so we now have a 100 element vector of numbers that are the realized values for X1 that we observed. We will call these x1. Do the same thing for X2 to get a 100 element vector of simulated observations from X2 that we will call x2. Now we create the vector for realized values for Y1. The first element of y1 is the first element of x1 plus the first element of x2. We now have a realized y1 vector that is 100 elements long. We treat each element in this vector as an observation of the random variable Y1.We now graph these observations of Y1 by making a histogram. The realized values of y1 will be on the horizontal axis while the frequency of the different values will be along the vertical axis. You can alter how good the graph looks by adjusting the number of different bins you have in your histogram.Below is a graph of 100 draws and 10 bins:We will try to make the graph look better by trying 100,000 simulated observations with 100 bins:This graph looks a lot better, but it is not yet an estimation of the distribution of Y1 because the area underneath the curve is not even close to one. To fix this problem, add up the area underneath and divide each frequency bar in the histogram by this number. Here, our area is 2000, so dividing the frequency levels of each bin by 200 gives us the following graph with an area of 1:It looks as if the distribution here is Y1=�y12−y1if 0≤y1≤1if 1≤y1≤2This seems to make intuitive sense and is close to what we would expect.We will not go further into estimating the functional form of this distribution because that is a nuanced art that seems beyond the scope of these notes. It is good to know however that there are many different ways the computer can assist here, whether it is fitting a polynomial or doing something calledinterpolating which doesn’t give an explicit functional form, but will return an estimate of the dependent variable given different inputs.Using the above frame work, the graph of the estimated distribution ofY2=X1/X2 is:We can extend this approach further to estimate the distributions of statistics that would be nearly impossible to calculate any other way. 
For example, let's say we have a variable Y ~ N(3,1), so Y is normally distributed with a mean of 3 and a standard deviation of 1. We also have a variable Z that has a Poisson distribution with parameter λ = 2. We want to know the distribution of X = Y·Z, so we are looking for a statistic that is a function of a continuous and a discrete random variable. Using the process from above, we simulated 10,000 observations of this statistic, and the estimated distribution can be read off the resulting histogram.

Let's see a more exotic example. Let the variable Z have an extreme-value distribution with mean 0 and scale 1, and let Y be a draw from the uniform distribution between 0 and 1. We want to find the distribution of the statistic X where

X = (cot^{-1} Z)^{1/2} · Y.

Using 10,000 simulated observations and 100 bins, an estimate of the distribution of X is again given by the normalized histogram.
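The same workflow (draw, transform, histogram, normalize) carries over directly to Python. Here is a minimal sketch using numpy in place of the Matlab session described above; the Y1 = X1 + X2 example, the 100,000 draws, and the 100 bins follow the text, and everything else is illustrative.

python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Draw simulated observations of X1 and X2 and transform them
x1 = rng.uniform(0.0, 1.0, size=n)
x2 = rng.uniform(0.0, 1.0, size=n)
y1 = x1 + x2                      # the statistic of interest, Y1 = X1 + X2

# density=True rescales the bar heights so the total area is one,
# which is the normalization step described in the notes
heights, edges = np.histogram(y1, bins=100, density=True)
centers = (edges[:-1] + edges[1:]) / 2

# The same recipe works for other statistics, e.g. Y2 = X1 / X2
y2 = x1 / x2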
Mathematical Statistics and Data Analysis
Example
Calculate the hazard function for the exponential distribution:
F(t) = 1 − e^{−λt} for t ≥ 0, and F(t) = 0 for t < 0.
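Working this out with the definition h(t) = f(t)/(1 − F(t)) and f(t) = F′(t) = λe^{−λt}:

h(t) = λe^{−λt} / e^{−λt} = λ for t ≥ 0,

so the exponential distribution has a constant hazard equal to its rate parameter λ.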
Let f denote the density function and h the hazard function of a nonnegative random variable. Show that f(t) = h(t) e^{−∫_0^t h(s) ds}.
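One way to show this, starting from the definition h(t) = f(t)/(1 − F(t)) together with F(0) = 0:

h(t) = f(t)/(1 − F(t)) = −(d/dt) log(1 − F(t)),

so ∫_0^t h(s) ds = −log(1 − F(t)), hence 1 − F(t) = e^{−∫_0^t h(s) ds}, and therefore

f(t) = h(t)(1 − F(t)) = h(t) e^{−∫_0^t h(s) ds}.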
F_n(x) = 0 for x < x_(1);  F_n(x) = k/n for x_(k) ≤ x < x_(k+1);  F_n(x) = 1 for x ≥ x_(n), where x_(1) ≤ x_(2) ≤ ··· ≤ x_(n) are the ordered observations.
Properties of the Empirical Cumulative Distribution Function
Theorem 1
E(F_n(x)) = F(x)
Var(F_n(x)) = (1/n) F(x)(1 − F(x))
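Both properties are easy to check by simulation; the following short Python sketch does so for a standard normal sample (the sample size n = 50, the number of replications, and the evaluation point x = 0.5 are arbitrary choices for illustration).

python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, reps, x = 50, 5000, 0.5

# Empirical CDF at x for each of `reps` simulated samples of size n
samples = rng.standard_normal((reps, n))
Fn_x = (samples <= x).mean(axis=1)

# Compare simulated mean and variance of F_n(x) with the theoretical values
F_x = norm.cdf(x)
print(Fn_x.mean(), F_x)                       # E(F_n(x)) = F(x)
print(Fn_x.var(), F_x * (1 - F_x) / n)        # Var(F_n(x)) = F(x)(1 - F(x)) / n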
For example, with n = 59 observations and bin width 0.3, the histogram density heights are (count/n)/0.3: (1/59)/0.3 = 0.0565, (8/59)/0.3 = 0.452, (24/59)/0.3 = 1.3559, (15/59)/0.3 = 0.8475, (6/59)/0.3 = 0.339, and (5/59)/0.3 = 0.2825.
SAS (Statistical Analysis System)
Chapter 1 Summarizing Data
Methods Based on the Cumulative Distribution Function
Histograms, Density Curves and Stem-and-Leaf Plots
Measures of Location
Measures of Dispersion
3 Economic value of celebrity endorsement (Tiger Woods)
Economic Value of Celebrity Endorsement:Tiger Woods’Impact on Sales of Nike Golf Balls∗Kevin YC Chung†Timothy Derdenger‡Kannan Srinivasan§First Version:November2010Second Version:January2011Third Version:February2011This Version:May2011(Prelminary draft,do not distribute without permission)AbstractWe study the economic value of celebrity endorsement.Despite the size and the long his-tory,few have attempted to quantify the economic worth of celebrity endorsers because itis terribly difficult to identify an endorser’s effect on afirm’s profit.By developing and es-timating the consumer demand model for the golf ball market,wefind that after controllingfor brand advertisement level and taking into account the inherent quality of the endorser,the endorsement effect leads not only to a significant number of existing customers switch-ing toward the endorsed products but also has a primary demand effect.∗Acknowledgments:We thank Catherine Tucker,Avi Goldfarb and Kory Koft for their comments.The paper also benefited from thefirst author’s attendance at the Columbia-Duke-UCLA Workshop on Quantitative Marketing and Structural Econometrics2010.The author thanks the participants for their comments and Wesley Hartmann for featuring the work in“Developing Research Question”session.†Doctoral Candidate,Tepper School of Business,Carnegie Mellon University.Kevinchung@‡Assistant Professor in Economics&Strategy,Tepper School of Business,Carnegie Mellon University.Der-denge@§Rohet Tolani Distinguished Professor in International Business;H.J.Heinz II Professor of Manage-ment,Marketing and Information Systems,Tepper School of Business,Carnegie Mellon University.Kan-nans@1IntroductionCelebrity endorsements are ubiquitous and are a multimillion dollar business.Corporations scour the likes of movie stars and professional athletes in search of the perfect celebrity to effectively endorse their products and increase sales;and in return celebrities receive millions of dollars.1While tens or even hundreds of millions of dollars in endorsement contracts are a relatively recent phenomenon,endorsements have been around for over200years.[15]The long historic practice reveals that celebrity endorsements are accepted as an effective strategy to increase sales and profit.However,this begs the question of just how profitable are they?Few have attempted to quantify the economic worth of celebrity endorsers.Those that have, have done so only in an indirect manner using the event study methodology.2Identifying and measuring the extent to which an endorser affectsfirms’retail sales is difficult for two major reasons.First,the endorsement variables must be properly defined and the researcher must identify the endorsement effect amongst many other confounding events.Thus,the researcher must distinguish from other phenomena that may give rise to the same outcome.3Second, to attribute the endorsement effect to changes in consumer preferences,the interpretation of the estimate is only valid under the condition of a stationary customer base.In a market that has rapid change in demographics,a researcher can erroneously conclude that there exists a significant effect when in fact the natural evolution of customer base has resulted in the outcome.We have unique golf dataset that satisfy both conditions.4Studying the golf industry in the context of celebrity endorsement isfitting because golf has been the leading sports industry 1For instance,in2000and again in2005Nike signed professional golfer Tiger Woods to a5-year$100million dollar 
contract to endorse its then nascent Nike Golf division.Other such examples include the current Miami Heat superstar LeBron James earning$28million from the likes of Coca-Cola and State Farm and Indianapolis Colts’quarterback Peyton Manning securing roughly$13million from Sprint,MasterCard,Reebok and Gatorade in2008 alone.2See literature review for detail3Thefirms that are able to afford the large endorsement contracts and use it as part of the marketing strategy are often complex in structure with numerous endorsements and advertisement campaigns occurring simultaneously.4Direct data that satisfy both condition is typically unavailable to the researcher,which is why previous studies had to resort to the event study analysis methodology.in the endorsement business.56[29][30]Structurally,the golf industry is a relatively insulated industry that has had a steady number of participants.It is estimated that over the past10years, the number of golfers remained steady at26to30million.[26]The avid golfers7make up only 23%of all golfers yet account for63%of all golf related spending in2002.8[19]In terms of the endorsement variables,we are presented with favorable circumstances in which the Nike firm is a new entrant in the industry that decides to sign just one endorser.9This presents us with several advantages in answering our research questions.First,the newness of Nike Golf takes care of other past advertisement events or endorsements that can potentially affect current phenomena.10Also,while it is usually impossible to attribute a celebrity endorsement to the growth or total revenue of the sponsoringfirm,the single endorsement strategy of Nike allows us to causally attribute the rise of the golf brand on the basis of a single endorser Tiger Woods.11 The identification of the endorsement effect comes from the detailed data that captures the sales of all major golf brands in the period when the golferfirst endorsed the Titleist brand(1997-2000)and switched to the Nike brand(2000-2010).We posit that endorsements play a direct role in a consumer’s utility function when consum-ing the endorsed brand.We take the complementary view of celebrity endorsement where the 5In“The Fortunate50”,a list of50top earning American athletes in salary,endorsement and appearance fees compiled by the Sports Illustrated,in2008and2009,Tiger Woods and Phil Mickelson came1st and2nd respectively./more/specials/fortunate50/2009/index.html/more/specials/fortunate50/2008/index.html6It is documented that Tiger Woods has consistently earned significantly more off the course than on the course by a variety of endorsers.In fact,it was believed in2008that Tiger Woods was on his way to become thefirst$1 billion athlete.In2007,his earning from on course was$23million while endorsement deals totaled$100million. /magazine/2008-02/gd507Those who play25or more rounds of golf per year.8$4.7billion dollars were spent on equipment(clubs,balls,bags,gloves,shoes)in2002.9Tiger Woods’career began with an endorsement from the Nike Golf brand in1996.In the beginning,Nike golf only endorsed Tiger Woods with apparel and shoes.Nike golf was a new player in the golf industry where they would eventually end up producing golf equipments(ball and golf clubs).Tiger Woods was the one of thefirst players to switch from the Titleist golf ball to the Nike Golf ball in2000with the100million dollar5year contract. 
(Nike golf ball was introduced in1999)10For example,firms with long history may have loyal customers that lead to persistent purchase.11Towards the latter half of the decade under Tiger Woods’endorsement,it is not entirely true that Tiger Woods was the only endorser of Nike Golf products.However,it is a general consensus that Nike golf built its brand around Tiger Woods.Not only was Tiger Woods the most widely recognized golfer that endorsed Nike products from the very beginning of his career,but he was one of thefirst golfer to switch to the Nike golf equipments when it was made available in2000.consumption process can either be enhanced or worsened through additional or negative utility attached on the endorsed product.With this view,all else equal,we predict that endorsement in and of itself can alter demand and increase or decrease12afirm’s market share.By developing and estimating the consumer demand model for the golf ball market,wefind that the celebrity endorsement effect on consumers can create product differentiation and gen-erate shift in market share.After implementing several counterfactual scenarios wefind,from 2000-2010,the Nike golf ball division reaped an additional profit of$60million through an acquisition of4.5million customers from Tiger Woods’endorsement effect.As a result,ap-proximately33%of Nike’s investment on the golfer’s endorsement was recovered just in US golf ball sales alone.We alsofind that the recent scandal had a negative effect which resulted in a loss of approximately$1.2million in profit with94,000customers switching away from Nike. However,we conclude that Nike’s decision to stand by the golfer was the right decision because even in the midst of the scandal,the endorsement effect was strong enough that had Nike termi-nated its relationship with Tiger Woods,the overall profit would have been less by an additional $1.6million.We alsofind that endorsed products have a primary demand effect.We empirically find that not only does celebrity endorsement take customers away from its competitors,but also attracts customers from the outside who would have otherwise not purchased the product in the absence of celebrity endorsements.The paper is organized as follows:First,to motivate our empirical study,we provide a brief background on the celebrity endorsement and golf industry with a focus on the golf ball market 12The negative effect of celebrity endorsement is especially relevant today asfirms have not been successful in staying away from celebrities who bring“negative”publicity.Recently it has been documented that Brett Favre has behaved inappropriately towards a reporter during his career with the New York Jets and Wrangler jeans company has yet to make a decision on keeping its ties with the athlete.(/id/39616665/)Other than Brett Favre,there are many more athletes throughout history who have not behaved as thefirms would have liked. To name a few,Nike:Kobe Bryant when charged with rape,Pepsi:Mike Tyson when charged with beating his wife,Hertz:OJ Simpson and hisfirst degree murder charge.Prince tennis racket:Jennifer Capriati when charged with marijuana possession.In our case,November2009was the beginning of a tumultuous and embarrassing year for Tiger Woods,in which his infidelity was revealed to the public.Since then,endorsers,one by one,began to cut ties with Tiger Woods. 
The earnings he made off the course-an estimated$100million a year-dwindled as a result of endorsers like Accenture,AT&T and Gatorade terminating its contracts.In the midst of all this,Nike announced that it would stand by Tiger Woods.We investigate whether his scandal has had any negative impact on Nike’s sales.in section2before providing the literature review in section3.In section4,we provide the data used for empirical estimation before proposing the model that captures the endorsement effect in section5.In section6,empirical results are provided followed by the counterfactual in section 7.We conclude in section8with discussions and limitations with directions for future research.2Background Information2.1Celebrity EndorsementsAccording to Frank Presbrey’s1929book History and Development of Advertising,one of the earliest testimonials in advertising appeared in an advertisement for teething preparationin1711.[15]In1770,the London Chronicle newspaper published an advertisement contain-ing the endorsement of Mary Graham,testifying the healing power of Dr.Rysseeg’s Balsamic Tincture.[11]While the strategy of endorsement traces back hundreds of years to the18th cen-tury,the wake of World War I was the beginning of modern endorsements.In the1920s,so popular was the practice of endorsement that Famous Names Inc.was founded that linked celebrities to national manufacturers.In the late1950s and onward,athletes became more im-portant as endorsers than the non athletes,and the fees paid to them rose significantly.Frank Scott,an athlete agent is credited for blossoming of athlete endorsements.Through him,manu-facturers from cigarettes to cement products scrambled to sign endorsement deals with athletes. Mickey Mantle,who led both baseball leagues in amount of money earned through endorse-ment,is said to have endorsed both Camel cigarettes and Bantron,an anti-smoking pill.In 1956,Mantles salaries was$30,000while his endorsement earning was$70,000.13In1969, Forbes reported that endorsement was the single most important source of outside income for many celebrities,while Sports Illustrated reported that athletes had risen to the top position as endorsers.14[18,12]In fact,the use of athletes had increased so much that the70s became to be 13/1998/06/30/sports/frank-scott-80-baseball-s-first-player-agent. 
html14This is attributed to the growth in endorsement industry in general and the growth of sport agents like Frank Scott.known in the marketing world as the“Decade of the Athlete”.[29]While the60s and the70s saw significant increases in the contribution of sports athletes in a variety of product endorsements,it was not until the1980s when Nike changed the landscape of sports endorsement.It is reported that Nike shoes were endorsed by135of273players in the NBA in1983.This dominance was said to have resulted from Nike’s promotional strategy of paying the athletes handsomely.15In this era of endorsements,it is reported that most top stars derive most of their income not through their winning but through endorsement deals.Today, endorsements are a multimillion dollar business.2.2The Golf IndustryThe golf industry in the United States generated direct revenues of$76billion in2005up from $62billion in2000.At$76billion,the golf industry is larger than the motion picture and the video industries.With the industry consisting of7main parts ranging from facility operations to real estate,golfer equipment/supplies and golf endorsements combined made up to be a$ 7.8billion industry in2005.We present a general overview of the golf equipment used in the sport of golf.There are3main categories in golf equipment;bags,clubs and balls.Given that our paper assesses the impact of endorsement on sales of golf ball equipment,we include the overview of the other two categories in the appendix.15The budget for this promotional strategy was around10times the amount spent on promotion by the next top ten sneaker makers combined.[6]Nike was also active in endorsing players who were yet to prove themselves at the professional level.For example,Michael Jordan had afive year$2.5million contract with Nike in hisfirst year as a Chicago Bulls player which rose to$18million dollar by1993.Similarly,in1996,a Nike commercial starring Tiger Woods was broadcast less than12hours after the20year old golf sensation announced that he would turn professional.Nike signed an endorsement deal with Tiger Woods for5years for an estimated$40million.Nike Golf division in1996which consisted of apparel and footwear had a total sales of approximately$120million. 
In year2000,Tiger signed a new deal worth$100million overfive years which was then said to be the largest ever offered to an active athlete.[20,29]Lastly,even before he became the number1overall pick,Nike signed Lebron James at an estimated$90million which was believed to be the largest initial shoe contract ever offered to an athlete.[10]Golf BallsGolf balls are estimated to generate$500million dollars in annual sales with production of over 850million golf balls per year.[4,17]There are1,051models of golf balls that are listed on the United States Golf Association’s list of conforming golf balls.It is believed by many experts in golf that the golf ball has more engineering per cubic centimeter than in any recreational product in the market.[32]Golf balls are usually white,weighing no more than1.62ounces with a diameter of no less than1.68inches.[33]In today’s golf ball,there are3main components; the number of layers,the type of outer cover and the number of dimples.Golf balls can have layers that ranges anywhere from2to5.Most golf balls for amateurs (least expensive)are2layered golf balls consisting only the outer cover material and a core.16 Depending on the number of layers,2layered golf balls are often called the“two piece”ball,3 layered a“three piece”and so on.The more layers a golf ball the higher cost of production and thus a higher retail price.17The type of outer cover on a golf ball determines how the golf ball“feels”under impact from the golf club.There are two main type of covers that are most widely used in the golf ball industry.The most popular is the ionomer/surlyn cover which is durable and resilient material made up of a blend of plastic resin.On the other hand,the urethane is a softer and a more elastic material that is more expensive to manufacture.18Most non premium golf balls are made of ionomer material while most premium golf balls that professional golfer use are made up of urethane cover.Lastly,today’s golf balls are characterized by the dimples on the surface.These are small 16The core of a ball is the resilient rubber compound located in the center of a ball that provides the transfer of energy from the golf club to the ball at impact.17In three piece golf balls,there is an extra layer of material between the core and the cover.This is usu-ally a“mantle”,which is a layer of polymer that are used both to control spin off of high speed impact and provide“feel”.Four piece balls either have2mantles or2cores.There is only one5piece golf ball in the market today.The TaylorMade Penta is priced at$45.99in golf retail stores.For more information on the ball, /mainlevel/golfshop/balls/Penta-TP.html#3018Urethane is about twice as thin as the surlyn cover and during the casting process it is known that urethane goes from a liquid to a solid in30seconds,leaving no room of error for the manufacturer.On the other hand,producing surlyn balls are known to be straightforward.It is said that in the time that160surlyn balls are produced,only1 multilayer urethane cover ball can be produced.identically shaped indents that are usually circular.The main purpose of a dimple is that it creates the necessary aerodynamic forces for the ball tofly further and longer.Depending on the depth of the dimples,the trajectory of theflight differs,with shallow dimples creating higher flights while deeper dimples creating lowerflights.Most golf balls today have250-450dimples. 
There have been differently shaped dimples to increase the number of coverage of the ball’s surface.It is understood that covering the golf ball with more dimples is generally more difficult to manufacture and are reflected in the retail price.The characteristics of the golf ball is important in this paper as they are the inherent part of the product characteristics that differentiate the products.In estimation,these characteristics become valuable instruments for the endogenous price variable.3Literature ReviewFew have attempted to quantify the economic worth of celebrity endorsers.To best of our knowledge,there is no literature that directly assesses the impact of celebrity endorsements on sales and market share.Rather,those that have studied this domain have done so in an indirect manner-by using the event study methodology and looking into thefluctuation of stock prices during the time of the announcement of celebrity endorsement.Specifically,Agrawal and Kamakura(1995)study110celebrity endorsement contracts andfind that,on average,the market reacts positively on the announcement of celebrity endorsement contracts.Based on this result,they conclude that celebrity endorsements are viewed as a profitable advertising strategy.More recently,Knittel and Stango(2009)study the negative impact of Tiger Woods’scandal.By looking at the stock prices of thefirms that Tiger Woods endorses,they estimate that,after the event in November2009,shareholders of Tiger Woods’sponsors lost$5-12billion relative to thosefirms that Woods did not endorse.Furthermore,theyfind that sports related sponsors suffered more than his other sponsors.To the best of our knowledge,these two papers are the closest in terms of what we are trying to study in our paper.Even so,event studyanalysis does not capture the true effects of celebrity endorsement.Our main concern with this methodology is that the study is an“event”study that takes a single event of an endorsement announcement and assesses the economic value.Also,event study analysis takes the behavior of the investors and their reaction to the endorsement announcement to assess the economic worth.This can be potentially misleading and problematic since we would like to directly study the consumers’behavior,not investors’behavior,and their change in preferences in products due to the endorsement.For example,when Tiger Woods initially signed a multimillion dollar deal in1996with Nike before turning professional,it is documented that stock price for Nike declined5%.Despite this negative market valuation,looking back,it is difficult to argue against the fact that Tiger has been one of the biggest reasons why Nike has rose to such heights in the golf business(an industry in which they were not very well known for at the time).Nike rose to become a formidable golf company from producing not only apparel and shoes but also golf equipment,transforming a$120million business to a$500million business(sales)from1996 to2006.[7]Since celebrity endorsements occur over time,in order to assess the economic value of a celebrity,one must look at the time period in which the celebrity was under contract.We do this in our paper by explicitly tracking the sales of golf balls with celebrity endorsements.The underlying theory behind our model construct originate from Stigler and Becker(1977) and Becker and Murphy(1993)in which they analyze models that incorporate a brand’s adver-tising level into a consumer’s utility function.When such an interaction is positive theyfind that the likelihood of 
consumption increases.Moreover,“advertising can in itself create prestige, differentiation,or association that may change the utility a consumer obtains from consuming a product”[1](Ackerberg2001).This line of literature is closely related to our study in that one may think of the quality of a celebrity endorser as the analog to their advertising levels-a higher quality celebrity endorser increases the prestige associated with the endorsed product which thus leads to higher utility and sales.It must be noted however that we make a clear distinction be-tween endorsement and advertisement.We define the endorsement effect as the overall effect the endorser has on the company during the time period in which he is under contract.For an ad-vertisement effect,we define it as the overall brand exposure effect in the media at a given time. To distinguish the two effects,we explicitly take into account both the celebrity endorsement effect and the endorsing brand’s advertising level in our consumer utility function.4DataThe data used in this study is aggregated monthly golf ball sales in the United States from February1997to April2010.This data represents the total sales for the US for on course(green grass)and off course golf specialty stores.19There are a total of669unique products represented by a total of26different brands.Below are the summary statistics and plot of sales over time.20On Course Off Course OverallAverage Price$23.09[$18.21,$26.37]$18.50[$14.97,$22.33]$20.81[$14.97,$26.37]Units Sold12,582[4,415,25,441]12,573[5,681,29,461]12,577[4,415,29,461] No.of Products Available61[35,90]71[41,102]66[35,102]No.of Brands Available13[9,17]14[10,17]14[9,17]Table1:Summary Statistics for each market(Feb1997-April2010)Looking at the total sales of golf balls over time,it is apparent that the golf ball market exhibit seasonality and time trend.Seasonality is expected as golf is a seasonal sport that takes place in warm climates.To take this into account,we include the month of year indicator variables in our estimation.Also to account for the general sales trend from1997to2010,we include up to a cubic time trend to account for what is observed.19For on course shops,the sales represent a mix of public and private course golf shops.For off course,a mix of single owner and chains stores are represented.Thefigures are made up of over550on course shops and over250 off course shops.20The average price(1997dollars)of on course shops are higher in comparison to the golf balls available in off course golf shops.(identical products are on average more expensive on on course shops for each month) In the modeling procedure,we explicitly take this into account by using it as an instrument for prices,which is endogenous.Interestingly,while the number of products available in off course shops are on average10more than the on course golf shops with an average price that is lower by$4.59,the number of golf balls sold in both markets are strikingly similar.We come back to this later when we discuss the elasticities in on and off course shops.Figure1:Total Golf Ball Sales(Dozens)for on and off course shops4.1Nike Golf Ball SalesIn this section,we explore the data further by looking at the sales of Tiger Woods’endorsed brand Nike.By doing so,we believe that the readers will be able to see the motivation of the questions we pose and the approach we take in answering them.Below is the sales of Nike golf balls from its introduction in February1999until the end of our data period April2010.The red vertical line represents 
19 For on course shops, the sales represent a mix of public and private course golf shops. For off course shops, a mix of single-owner stores and chains is represented. The figures are based on over 550 on course shops and over 250 off course shops.

20 The average price (in 1997 dollars) at on course shops is higher than that of the golf balls available at off course golf shops; identical products are, on average, more expensive at on course shops in each month. In the modeling procedure, we explicitly take this into account by using it as an instrument for prices, which are endogenous. Interestingly, while the number of products available at off course shops is on average 10 larger than at on course golf shops, with an average price that is lower by $4.59, the numbers of golf balls sold in the two markets are strikingly similar. We come back to this later when we discuss the elasticities at on and off course shops.

Figure 1: Total golf ball sales (dozens) for on and off course shops

4.1 Nike Golf Ball Sales

In this section, we explore the data further by looking at the sales of Tiger Woods' endorsed brand, Nike. By doing so, we believe that readers will be able to see the motivation for the questions we pose and the approach we take in answering them. Below are the sales of Nike golf balls from their introduction in February 1999 until the end of our data period, April 2010. The red vertical line marks June of 2000, when Tiger Woods made an official switch to the Nike golf ball.21 It is not difficult to see that, even after taking into account the seasonality of sales over months, there appears to be a "jump" in Nike golf ball sales after Tiger Woods' switch. While we are not yet able to make any causality argument, this observation leaves us wondering: 1) Is there an endorsement effect on the endorsed product? That is, does an endorsement strategy actually increase sales and profit? 2) If so, how profitable is it?

21 Tiger Woods signed a clothing and shoe contract from 1996 to 2000, and it was not until 2000 that he started his endorsement of Nike golf equipment. It is documented that Nike signed the golfer to a 5-year, $100 million contract to endorse its then nascent golf equipment in 2000. He extended the contract in 2005 by signing what was reported as another 5-year, $100 million contract running to 2010.

Figure 2: Total sales of Nike golf balls (dozens) pre and post Tiger Woods' endorsement

Our motivation to answer these questions led us to take a structural econometric approach that relies on economic and marketing theories in modeling consumer behavior. By doing so, not only are we able to statistically assess the "existence" of an endorsement effect, but by recovering the primitives we also obtain predictions of the effects of strategy changes, enabling us to address our second question of quantifying the profitability of the endorsement strategy.

5 The Model

Given the nature of the data structure, our approach is to jointly estimate the demand and supply equations by following the methodology of Berry, Levinsohn and Pakes (1995).

5.1 The Demand Side

We define a market as the national golf market for each month from February 1997 to April 2010, for both on and off course golf shops. The indirect utility of consumer i from consuming golf ball j in market t is characterized by the golf ball price p_{jt}, the endorsement vector En_{jt}, and the product advertisement vector AD_t. We include a set of indicator variables, including Tiger Woods' scandal Sc_t, product-specific fixed effects PD, and the month-of-year indicator variables MD. We also include the time trend vector Tr, in which we include up to the cubic power, as observed in the data section. We interact Sc_t with the time trend variable because we would like to observe the persistence of the scandal effect on sales. Lastly, the utility is characterized by the product characteristics \Delta\xi_{jt} that are unobservable to the econometrician and the individual taste parameter \varepsilon_{ijt}, distributed i.i.d. type 1 extreme value across i, j, and t. Consumer i's indirect utility for golf ball j in market t is

u_{ijt} = \alpha_i p_{jt} + En_{jt}\Gamma + AD_{t-1}\lambda + (Tr \times Sc_t)\Upsilon + \phi PD + \kappa MD + Tr\,\Xi + \Delta\xi_{jt} + \varepsilon_{ijt},    (1)

\alpha_i = \alpha + v_i\Sigma,  v_i \sim N(0, I).    (2)

The golf ball price p_{jt} is adjusted to 1997 dollars. Here \alpha_i has a unidimensional distribution, and \Sigma is the estimate of the standard deviation of our random coefficient. The model parameters of interest consist of both linear and nonlinear parameters: \theta = (\theta_1, \theta_2), where the vector \theta_1 = (\alpha, \Gamma, \lambda, \Upsilon, \phi, \kappa, \Xi) contains the linear parameters while \theta_2 = \Sigma is the nonlinear parameter. Consumers are assumed to purchase in each period one unit of the good that gives the highest utility, including the outside option, which is normalized to zero.
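To make the random-coefficient specification in Eqs. (1) and (2) concrete, the sketch below simulates draws of v_i and averages logit choice probabilities over them to obtain predicted market shares. This is a simplified illustration under made-up parameter values and mean utilities, not the authors' estimation routine.

```python
import numpy as np

def market_shares(delta, prices, alpha, sigma, n_draws=500, seed=0):
    """Simulated market shares for a random-coefficients logit demand model.

    delta  : mean utilities of the J products in one market (all terms in
             Eq. (1) other than the price term and the taste shock epsilon)
    prices : prices of the J products
    alpha  : mean price coefficient
    sigma  : standard deviation of the random price coefficient, as in Eq. (2)
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n_draws)              # v_i ~ N(0, 1)
    alpha_i = alpha + sigma * v                   # individual price coefficients
    # Utility of each product for each simulated consumer (n_draws x J).
    u = delta[None, :] + alpha_i[:, None] * prices[None, :]
    # Logit choice probabilities, outside option normalized to zero utility.
    expu = np.exp(u)
    probs = expu / (1.0 + expu.sum(axis=1, keepdims=True))
    return probs.mean(axis=0)                     # average over consumers

# Hypothetical example: three golf balls in one market.
delta = np.array([1.2, 0.8, 0.5])
prices = np.array([24.0, 20.0, 16.0])
print(market_shares(delta, prices, alpha=-0.15, sigma=0.05))
```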
5.1.1 Celebrity Endorsement Variable

For the direct effect of celebrity endorsements on the sales of the endorsed product, we define the (row) vector En_{jt} = [E_{1jt}, E_{2jt}, E_{3jt}, ..., E_{Gjt}] for golfers g = 1, ..., G and product j in market t, where

E_{gjt} = \begin{cases} 1/\mathrm{rank}_{gt} & \text{if } D_{gjt} = 1 \\ 0 & \text{if } D_{gjt} = 0 \end{cases}    (3)

That is, for each golfer g who endorses product j, we define E_{gjt} as a function of the golfer's skill level at time t. Here, the skill level of player g is assumed to be exogenous.
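The following is a minimal sketch of how this endorsement variable could be constructed from an endorsement indicator D_{gjt} and monthly player rankings; the golfers, products, and rankings shown are made-up examples, not data from the study.

```python
import pandas as pd

# Hypothetical endorsement data: D = 1 if golfer g endorses product j in month t.
endorsements = pd.DataFrame({
    "golfer":  ["Woods", "Woods", "Mickelson"],
    "product": ["Nike Tour", "Nike Tour", "Callaway HX"],
    "month":   ["2000-06", "2000-07", "2000-06"],
    "D":       [1, 1, 1],
})

# Hypothetical world ranking of each golfer in each month.
rankings = pd.DataFrame({
    "golfer": ["Woods", "Woods", "Mickelson"],
    "month":  ["2000-06", "2000-07", "2000-06"],
    "rank":   [1, 1, 2],
})

# E_gjt = 1/rank_gt when D_gjt = 1, and 0 otherwise, as in Eq. (3).
merged = endorsements.merge(rankings, on=["golfer", "month"], how="left")
merged["E"] = merged["D"] / merged["rank"]
print(merged[["golfer", "product", "month", "E"]])
```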
JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY, VOLUME 49

Estimation of N0* for the Two-Scale Gamma Raindrop Size Distribution Model and Its Statistical Properties at Several Locations in Asia

TOSHIAKI KOZU
Shimane University, Matsue, Shimane, Japan

KAZUHIRO MASUZAWA
KOA Corporation, Ina, Nagano, Japan

TOYOSHI SHIMOMAI
Shimane University, Matsue, Shimane, Japan

NOBUHISA KASHIWAGI
The Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan

(Manuscript received 14 May 2009, in final form 2 September 2009)

An automatic estimation method is developed to detect stepwise changes in the amplitude parameter of the normalized raindrop size distribution (DSD) N0*. To estimate N0*, it is also assumed that the variation of three DSD parameters follows the two-scale gamma DSD model; this is defined as a DSD model in which one DSD parameter is fixed, the second is allowed to vary rapidly, and the third is constant over a certain space or time domain and sometimes exhibits stepwise transitions. For this study, it is assumed that N0* is the third DSD parameter. To estimate this stepwise-varying parameter automatically, a non-Gaussian state-space model is used for the time series of log10 N0*. The smoothed time series of log10 N0* fit well to the stepwise transition of log10 N0* when it was assumed that the state transition probability follows a Cauchy distribution. By analyzing the long-term disdrometer data using this state-space model, statistical properties for log10 N0* are obtained at several Asian locations. It is confirmed that the N0* thus estimated is useful to improve the rain-rate estimation from the measurement of radar reflectivity factor.

Corresponding author address: Toshiaki Kozu, 1060 Nishikawatsu, Matsue, Shimane 690-8504, Japan. E-mail: kozu@ecs.shimane-u.ac.jp

1. Introduction

Studies of raindrop size distribution (DSD) properties are important to understand the microphysical processes in rain and to improve the active and passive microwave remote sensing of rain. There have been a number of studies on DSD since the beginning of radar meteorology. In the last two decades, multiparameter radars such as dual-polarization radars (e.g., Bringi and Chandrasekar 2001, chapter 8) and dual-frequency radars (e.g., Meneghini et al. 1992; Tanelli et al. 2004), which can be used to estimate some parameters of the DSD, have been developed mainly for research purposes. However, it is generally difficult to extract DSD information independently for each radar resolution volume because of the difficulty in satisfying the measurement accuracy requirements needed to make DSD estimations. For example, the differential attenuation between two radar frequencies and the differential specific phase shift between two radar-wave polarizations can only be measured at a relatively coarse range resolution, within which a DSD parameter has to be assumed constant or needs to be range smoothed to eliminate range-by-range fluctuations, which effectively degrades the range resolution of those parameters. Downward-looking spaceborne radars utilizing the surface reference technique (Meneghini et al. 2000) can estimate a DSD parameter but require the assumption of a constant DSD parameter over a radar beam (Iguchi et al. 2000). Although recent advances in dual-frequency algorithms to estimate the range profiles of two DSD ...

The DSD is often modeled with a gamma distribution:

N(D) = N_0 D^{\mu} \exp(-\Lambda D) = \frac{N_T \Lambda^{\mu+1}}{\Gamma(\mu+1)} D^{\mu} \exp(-\Lambda D),    (1)

where D is the drop diameter, the groups (N_0, \mu, \Lambda) or (N_T, \mu, \Lambda) are parameters of the gamma model, and \Gamma(\cdot) is the complete gamma function (Ulbrich 1983; Chandrasekar and Bringi 1987). The shape parameter \mu is often fixed for simplicity and to make it possible to estimate the DSD from a dual-parameter radar measurement.

In the two-scale gamma DSD model, one DSD parameter is fixed, the second is allowed to vary rapidly (e.g., for each range bin or from moment to moment), and the third is constant over a certain space or time domain (e.g., one radar beam or one rain event). This type of DSD model has been used for extracting DSD-related information from the multiparameter radar mentioned above and in TRMM measurements. It has been shown that a DSD parameter (e.g., N_0 or \Lambda with a fixed \mu) is not necessarily independent from one resolution volume to another; the spatial or temporal vari...
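As a small illustration of the gamma DSD in Eq. (1), the sketch below evaluates N(D) under both parameterizations and checks that they coincide when N0 = NT Λ^(μ+1)/Γ(μ+1); the parameter values are arbitrary examples chosen for the illustration, not measured DSD parameters.

```python
import math
import numpy as np

def gamma_dsd_n0(D, N0, mu, lam):
    """Gamma DSD, (N0, mu, Lambda) form of Eq. (1): N(D) = N0 D^mu exp(-Lambda D)."""
    return N0 * D**mu * np.exp(-lam * D)

def gamma_dsd_nt(D, NT, mu, lam):
    """Gamma DSD, (NT, mu, Lambda) form of Eq. (1)."""
    return NT * lam**(mu + 1) / math.gamma(mu + 1) * D**mu * np.exp(-lam * D)

# Arbitrary example parameters: mu (dimensionless), Lambda (mm^-1), NT (m^-3).
mu, lam, NT = 3.0, 2.5, 1000.0
N0 = NT * lam**(mu + 1) / math.gamma(mu + 1)   # equivalent intercept parameter

D = np.linspace(0.1, 6.0, 60)                  # drop diameters (mm)
assert np.allclose(gamma_dsd_n0(D, N0, mu, lam), gamma_dsd_nt(D, NT, mu, lam))
```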