Generalized (s-Parameterized) Weyl Transformation
Discriminating large extra dimensions at the ILC with polarized beams

arXiv:hep-ph/0604047v1 5 Apr 2006

2005 ALCPG & ILC Workshops - Snowmass, U.S.A.

Discriminating Large Extra Dimensions at the ILC with Polarized Beams

A. A. Pankov, A. V. Tsytrinov
Pavel Sukhoi Technical University, Gomel 246746, Belarus

N. Paver
University of Trieste and INFN, 34100 Trieste, Italy

Non-standard scenarios described by effective interactions can manifest themselves indirectly, via corrections to the Standard Model cross sections. It is desirable to identify, at a given confidence level, the source of such deviations among the different possible explanations. We here discuss the identification reach on gravity in extra dimensions against the four-fermion compositeness-inspired contact interactions, and vice versa, using as basic observable the differential cross section of $e^+e^- \to \bar{f}f$ at the ILC, and emphasize the role of beam polarization in enhancing the identification sensitivity.

1. INTRODUCTION

New-physics (NP) scenarios based on very heavy virtual quanta exchanges can be described, below the direct production threshold, by effective contact interactions. These can have only indirect signatures, contributing corrective terms to the Standard Model (SM) amplitudes that are suppressed by some power of the ratio between the collider c.m. energy and the above-mentioned characteristic high mass scales. These corrections will reveal themselves via deviations of the measured observables from the SM predictions or, in a few specific cases, by the observation of processes forbidden by the SM. In principle, different kinds of NP interactions may produce similar deviations and, consequently, it would be desirable to assess, for each non-standard model, not only the "discovery reach", represented by the maximal value of the relevant mass scale below which a deviation can be observed at a given C.L. within the experimental accuracy but, also, the "identification reach", defined as the upper limit of the range of mass-scale values for which the model can be discriminated from the other potentially 
competing scenarios.

We will focus on the discrimination reach on the ADD models of gravity in large, compactified, extra spatial dimensions [1], with respect to the four-fermion contact interactions inspired by compositeness [2], and vice versa, looking at the differential cross sections of

$$e^+ + e^- \to \bar{f} + f, \qquad (1)$$

with $f = l, q$ ($l = \mu, \tau$; $q = c, b$), at the ILC with longitudinally polarized beams [3, 4]. In Ref. [5], the identification reach on individual contact interactions was studied by applying a Monte Carlo technique to lepton-pair production with unpolarized beams. An approach based on the polarized differential distributions for lepton-pair production processes was proposed in Ref. [6]. We here discuss the benefits of longitudinal beam polarization in improving the identification reaches, and also consider quark-pair production channels.

2. DIFFERENTIAL CROSS SECTIONS AND DEVIATIONS FROM THE SM

Neglecting all fermion masses with respect to the c.m. energy $\sqrt{s}$, the polarized differential cross section can be written as

$$\frac{d\sigma}{dz} = \frac{1}{4}\left[(1-P_1)(1+P_2)\left(\frac{d\sigma_{LL}}{dz} + \frac{d\sigma_{LR}}{dz}\right) + (1+P_1)(1-P_2)\left(\frac{d\sigma_{RR}}{dz} + \frac{d\sigma_{RL}}{dz}\right)\right], \qquad (2)$$

where $z = \cos\theta$, with $\theta$ the angle between the incoming and outgoing fermions in the c.m. frame, and ($\alpha, \beta = L, R$):

$$\frac{d\sigma_{\alpha\beta}}{dz} = \frac{3}{8}\,\sigma_{pt}\,|M_{\alpha\beta}|^2\,(1 \pm z)^2. \qquad (3)$$

Here $P_1$ and $P_2$ are the degrees of longitudinal polarization of the electron and positron beams, respectively, and the '±' signs apply to the cases LL, RR and LR, RL, respectively (for quark final states the right-hand side of Eq. (3) carries an additional colour factor). According to Sec. 1, the reduced helicity amplitudes appearing in Eq. (3) can be expanded into the SM part, represented by $\gamma$ and $Z$ exchanges, plus corrections depending on the considered NP model:

$$M_{\alpha\beta} = M^{SM}_{\alpha\beta} + \Delta_{\alpha\beta}(NP). \qquad (4)$$

The examples explicitly considered here are the following:

a) The ADD large extra dimensions scenario [1], where only gravity can propagate in extra dimensions, and correspondingly a tower of graviton KK states occurs in the four-dimensional space [8, 9]. In the parameterization of Ref. [10], the ($z$-dependent) deviations can be expressed as [11]:

$$\Delta_{LL}(ADD) = \Delta_{RR}(ADD) = f_G\,(1 - 2z), \qquad \Delta_{LR}(ADD) = \Delta_{RL}(ADD) = -f_G\,(1 + 2z), \qquad (5)$$

where $f_G = \lambda s^2/(4\pi\alpha_{e.m.}\Lambda_H^4)$, $\lambda = \pm 1$, $\Lambda_H$ being a phenomenological cut-off on the integration over the KK spectrum.

b) Gravity in TeV$^{-1}$-scale extra 
dimensions, where the SM gauge bosons can also propagate, parameterized by the "compactification scale" $M_C$ [12, 13]:

$$\Delta_{\alpha\beta}(TeV) = -\left(Q_e Q_f + g^e_\alpha\, g^f_\beta\right)\frac{\pi^2 s}{3 M_C^2}. \qquad (6)$$

c) The four-fermion contact-interaction scenario (CI) [2] where, with $\Lambda_{\alpha\beta}$ the "compositeness" mass scales ($\eta_{\alpha\beta} = \pm 1$):

$$\Delta_{\alpha\beta}(CI) = \eta_{\alpha\beta}\,\frac{s}{\alpha_{e.m.}\Lambda_{\alpha\beta}^2}. \qquad (7)$$

In cases b) and c) the deviations are $z$-independent, whereas in case a) they introduce extra $z$-dependence in the angular distributions. The consequence is that the ADD contribution to the integrated cross sections is tiny, because the interference with the SM amplitudes vanishes in these observables. Current experimental lower bounds on the mass scales $M_H$ and $M_C$ are reviewed, e.g., in Ref. [14] ($M_H > 1.1$-$1.3$ TeV, $M_C > 6.8$ TeV), while those on the $\Lambda$s, of the order of 10 TeV, are detailed in Ref. [15].

3. DERIVATION OF THE IDENTIFICATION REACHES

Let us assume one of the models, for example the ADD model (5), to be the "true" one, i.e., to be consistent with data for some value of $\Lambda_H$. To estimate the level at which it may be discriminated from other, in principle competing, NP scenarios ("tested" models) for any values of the relevant mass parameters, say for example one of the four-fermion CI models (7), we introduce the relative deviations of the differential cross section (denoted by $O$) from the ADD predictions due to the CI in each angular bin, and a corresponding $\chi^2$ function:

$$\Delta(O) = \frac{O(CI) - O(ADD)}{O(ADD)}, \qquad \chi^2(O) = \sum_{bins}\left(\frac{\Delta(O)^{bin}}{\delta O^{bin}}\right)^2. \qquad (8)$$

Here, the $\delta O$ represent the expected relative uncertainties, which combine statistical and systematic ones, the former being related to the ADD model prediction. Consequently, the $\chi^2$ of Eq. (8) is a function of $\lambda/\Lambda_H^4$ and of the considered $\eta/\Lambda^2$, and we can determine the "confusion" region in this parameter plane where the corresponding CI model may also be considered as consistent with the ADD predictions at the chosen confidence level, so that an unambiguous identification of ADD cannot be made. We choose $\chi^2 < 3.84$ for 95% C.L.

ALCPG0112

For the numerical analysis, we consider an ILC with …

Figure 2: 95% CL identification reach on the cutoff scale $M_C$ in 
the TeV model (left panel) and $\Lambda_{VV}$ in the VV model (right panel) as a function of the integrated luminosity, obtained from the fermion pair production processes with unpolarized and both polarized beams at the ILC (0.5 TeV).

References

[1] N. Arkani-Hamed, S. Dimopoulos and G. R. Dvali, Phys. Lett. B 429, 263 (1998); N. Arkani-Hamed, S. Dimopoulos and G. R. Dvali, Phys. Rev. D 59, 086004 (1999); I. Antoniadis, N. Arkani-Hamed, S. Dimopoulos and G. R. Dvali, Phys. Lett. B 436, 257 (1998).
[2] E. Eichten, K. D. Lane and M. E. Peskin, Phys. Rev. Lett. 50, 811 (1983); R. Rückl, Phys. Lett. B 129, 363 (1983).
[3] J. A. Aguilar-Saavedra et al. [ECFA/DESY LC Physics Working Group Collaboration], "TESLA Technical Design Report Part III: Physics at an e+e− Linear Collider," DESY-01-011, arXiv:hep-ph/0106315; T. Abe et al. [American Linear Collider Working Group Collaboration], "Linear collider physics resource book for Snowmass 2001. 1: Introduction," in Proc. of the APS/DPF/DPB Summer Study on the Future of Particle Physics (Snowmass 2001), SLAC-R-570, arXiv:hep-ex/0106055.
[4] G. Moortgat-Pick et al., arXiv:hep-ph/0507011.
[5] G. Pasztor and M. Perelstein, in Proc. of the APS/DPF/DPB Summer Study on the Future of Particle Physics (Snowmass 2001), ed. N. Graf, arXiv:hep-ph/0111471.
[6] A. A. Pankov, N. Paver and A. V. Tsytrinov, arXiv:hep-ph/0512131.
[7] B. Schrempp, F. Schrempp, N. Wermes and D. Zeppenfeld, Nucl. Phys. B 296, 1 (1988).
[8] T. Han, J. D. Lykken and R. J. Zhang, Phys. Rev. D 59, 105006 (1999) [arXiv:hep-ph/9811350].
[9] G. F. Giudice, R. Rattazzi and J. D. Wells, Nucl. Phys. B 544, 3 (1999) [arXiv:hep-ph/9811291].
[10] J. L. Hewett, Phys. Rev. Lett. 82, 4765 (1999) [arXiv:hep-ph/9811356].
[11] S. Cullen, M. Perelstein and M. E. Peskin, Phys. Rev. D 62, 055012 (2000) [arXiv:hep-ph/0001166].
[12] K. M. Cheung and G. Landsberg, Phys. Rev. D 65, 076003 (2002) [arXiv:hep-ph/0110346].
[13] T. G. Rizzo and J. D. Wells, Phys. Rev. D 61, 016007 (2000) [arXiv:hep-ph/9906234].
[14] For a review see, e.g., K. Cheung, arXiv:hep-ph/0409028.
[15] S. Eidelman et al. [Particle Data Group], Phys. Lett. B 592, 1 (2004).
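As a rough numerical companion to Eqs. (5) and (8) of Sections 2 and 3, the following Python sketch evaluates the ADD deviations and the binned $\chi^2$ between a "tested" and a "true" model. It is only an illustration under stated assumptions, not the analysis code of the paper: the function names, the two-bin toy input, and the value used for $\alpha_{e.m.}$ are hypothetical placeholders.

```python
import math

ALPHA_EM = 1.0 / 128.0  # assumed illustrative value of the e.m. coupling, not taken from the text


def f_g(s, lambda_h, lam=1, alpha=ALPHA_EM):
    """f_G = lam * s^2 / (4 pi alpha Lambda_H^4); s and Lambda_H in consistent units."""
    return lam * s**2 / (4.0 * math.pi * alpha * lambda_h**4)


def delta_add(z, fg):
    """Helicity-dependent ADD deviations of Eq. (5) at z = cos(theta)."""
    same = fg * (1.0 - 2.0 * z)        # LL and RR
    opposite = -fg * (1.0 + 2.0 * z)   # LR and RL
    return {"LL": same, "RR": same, "LR": opposite, "RL": opposite}


def chi2(tested, true, rel_unc):
    """Eq. (8): sum over bins of (relative deviation / relative uncertainty)^2."""
    total = 0.0
    for o_test, o_true, d_o in zip(tested, true, rel_unc):
        rel_dev = (o_test - o_true) / o_true
        total += (rel_dev / d_o) ** 2
    return total


# Hypothetical two-bin example: +-10% deviations, each known to 10%.
value = chi2([1.1, 0.9], [1.0, 1.0], [0.1, 0.1])
confused = value < 3.84  # True means the tested model cannot be excluded at 95% C.L.
```

In this toy input the two bins each contribute about one unit to $\chi^2$, so the tested model stays inside the confusion region; a realistic study would fill the bins from Eqs. (2)-(4) for each choice of the mass scales.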
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL

SUMMARY

In this paper, a modular modification of the adaptive robust control (ARC) technique is presented. The modular design retains all of the original ARC properties, but uses an estimation-based update law instead of a Lyapunov-based update law. In this design, the controller is divided into two modules: a control module and an identification module. A key new idea is to set a priori bounds on the time derivatives of the estimates, to be maintained by the update law. As a result, their effects on the system tracking accuracy can be dominated by the control law. A modification is proposed for the standard gradient and least-squares update laws to guarantee the bounds. This modification also makes the controller robust against the generalized (unparameterized) uncertainties considered in the ARC formulation, while allowing asymptotic output tracking in the absence of the generalized uncertainties. Both the ARC and the modular ARC techniques are applied to a force control problem for an active suspension system. Simulations and experimental results are provided to show that the update law of the modular design is less sensitive to measurement noise, which results in smaller force tracking error and smaller control gain. Copyright © 2004 John Wiley & Sons, Ltd.
Finding State Solutions to Temporal Logic Queries. www.cs.toronto.edumgstateqc.ps

Finding State Solutions to Temporal Logic Queries

Mihaela Gheorghiu, Arie Gurfinkel, and Marsha Chechik
Department of Computer Science, University of Toronto, Toronto, ON M5S 3G4, Canada.
Email: mg,arie,chechik@

Abstract. Different analysis problems for state-transition models can be uniformly treated as instances of temporal logic query-checking, where only states are sought as solutions to the queries. In this paper, we propose a symbolic query-checking algorithm that finds exactly the state solutions to any query. We show that our approach generalizes previous ad-hoc techniques, and this generality allows us to find new and interesting applications, such as finding stable states. Our algorithm is linear in the size of the state space and in the cost of model checking, and has been implemented on top of the model checker NuSMV, using the latter as a black box. We show the effectiveness of our approach by comparing it, on a gene network example, to the naive algorithm in which all possible state solutions are checked separately.

1 Introduction

In the analysis of state-transition models, many problems reduce to questions of the type: "What are all the states that satisfy a property?". Symbolic model checking can answer some of these questions, provided that the property can be formulated in an appropriate temporal logic. For example, suppose the erroneous states of a program are characterized by the program counter ([…]) being at a line labeled […]. Then the states that may lead to error can be discovered by model checking the property […], formalized in the branching temporal logic CTL [10].

There are many interesting questions which are not readily expressed in temporal logic and require specialized algorithms. One example is finding the reachable states, which is often needed in a pre-analysis step to restrict further analysis only to those states. These states are typically found by computing a forward transitive closure of the transition relation [8]. Another example is the computation of "procedure summaries". A procedure 
summary is a relation between states, representing the input/output behavior of a procedure. The summary answers the question of which inputs lead to which outputs as a result of executing the procedure. Summaries are computed in the form of "summary edges" in the control-flow graphs of programs [21, 2]. Yet another example is the algorithm for finding dominators/postdominators in program analysis, proposed in [1]. A state $t$ is a postdominator of a state $s$ if all paths from $s$ eventually reach $t$, and $t$ is a dominator of $s$ if all paths to $s$ pass through $t$.

Although these problems are similar, their solutions are quite different. Unifying them into a common framework allows reuse of specific techniques proposed for each problem, and opens a way for creating efficient implementations for other problems of a similar kind. We see all these problems as instances of model exploration, where properties of a model are discovered, rather than checked.

A common framework for model exploration has been proposed under the name of query checking [5]. Query checking finds which formulas hold in a model. For instance, a query $AG\,?$ is intended to find all propositional formulas that hold in the reachable states. In general, a CTL query is a CTL formula with a missing propositional subformula, designated by a placeholder ("?"). A solution to the query is any propositional formula that, when substituted for the placeholder, makes a CTL formula that holds in the model. The general query-checking problem is: given a CTL query on a model, find all of its propositional solutions. For example, consider the model in Figure 1(a), where each state is labeled by the atomic propositions that hold in it. Here, some solutions to […] are […], representing the reachable state […], and […], representing the set of states […]. 
On the other hand, […] is not a solution: […] does not hold, since no states where […] is false are reachable. Query checking can be solved by repeatedly substituting each possible propositional formula for the placeholder, and returning those for which the resulting CTL formula holds. In the worst case, this approach is exponential in the size of the state space and linear in the cost of CTL model checking.

Each of the analysis questions described above can be formulated as a query. Reachable states are solutions to $EF\,?$. Procedure summaries can be obtained by solving […] holds in the return statement of the procedure. Dominators/postdominators are solutions to the query $AF\,?$ (i.e., what propositional formulas eventually hold on all paths). This insight gives us a uniform formulation of these problems and allows for easy creation of solutions to other, similar, problems. For example, a problem reported in genetics research [4, 12] called for finding stable states of a model, which are those states that, once reached, are never left by the system. This is easily formulated as $EF\,AG\,?$, meaning "what are the reachable states in which the system will remain forever?".

These analysis problems further require that solutions to their queries be states of the model. For example, a query […] on the model in Figure 1(a) has solutions […] and […]. The first corresponds to the state […] and is a state solution. The second corresponds to a set of states […], but neither […] nor […] is a solution by itself. When only state solutions are needed, we can formulate a restricted state query-checking problem by constraining the solutions to be single states, rather than arbitrary propositional formulas (that represent sets of states). A naive state query-checking algorithm is to repeatedly substitute each state of the model for the placeholder, and return those for which the resulting CTL formula holds. This approach is linear in the size of the state space and in the cost of CTL model checking. While significantly more efficient than general query checking, this approach is 
not "fully" symbolic, since it requires many runs of a model checker.

While several approaches have been proposed to solve general query checking, none are effective for solving the state query-checking problem. The original algorithm of Chan [5] was very efficient (same cost as CTL model checking), but was restricted to valid queries, i.e., queries whose solutions can be characterized by a single propositional formula. This is too restrictive for our purposes. For example, neither of the queries […], […], nor the stable states query are valid. Bruns and Godefroid [3] generalized query checking to all CTL queries by proposing an automata-based CTL model checking algorithm over a lattice of sets of all possible solutions. This algorithm is exponential in the size of the state space. Gurfinkel and Chechik [15] have also provided a symbolic algorithm for general query checking. The algorithm is based on reducing query checking to multi-valued model checking and is implemented in a tool TLQSolver [7]. While empirically faster than the corresponding naive approach of substituting every propositional formula for the placeholder, this algorithm still has the same worst-case complexity as that in [3], and remains applicable only to modest-sized query-checking problems. An algorithm proposed by Hornus and Schnoebelen [17] finds solutions to any query, one by one, with increasing complexity: a first solution is found in time linear in the size of the state space, a second, in quadratic time, and so on. 
However, since the search for solutions is not controlled by their shape, finding all state solutions can still take exponential time. Other query-checking work is not directly applicable to our state query-checking problem, as it is exclusively concerned either with syntactic characterizations of queries, or with extensions, rather than restrictions, of query checking [23, 25].

In this paper, we provide a symbolic algorithm for solving the state query-checking problem, and describe an implementation using the state-of-the-art model checker NuSMV [8]. The algorithm is formulated as model checking over a lattice of sets of states, but its implementation is done by modifying only the interface of NuSMV. Manipulation of the lattice sets is done directly by NuSMV. While the running time of this approach is the same as in the corresponding naive approach, we show empirical evidence that our implementation can perform better than the naive one, using a case study from genetics [12].

The algorithms proposed for the program analysis problems described above are special cases of ours, that solve only […] and […] queries, whereas our algorithm solves any CTL query. We prove our algorithm correct by showing that it approximates general query checking, in the sense that it computes exactly those solutions, among all given by general query checking, that are states. We also generalize our results to an approximation framework that can potentially apply to other extensions of model checking, e.g., vacuity detection, and point to further applications of our technique, e.g., to querying XML documents.

There is also a very close connection between query checking and sanity checks such as vacuity and coverage [19]. Both problems require checking several "mutants" of the property to obtain the final solution. In fact, the algorithm for solving state queries presented in this paper bears many similarities to the coverage algorithms described in [19]. Since query checking is a more general approach, we believe it can provide a uniform 
framework for studying all these problems.

The rest of the paper is organized as follows. Section 2 provides the model checking background. Section 3 describes the general query-checking algorithm. We formally define the state query-checking problem and describe our implementation in Section 4. Section 5 presents the general approximation technique for model checking over lattices of sets. We present our case study in Section 6, and conclude in Section 7.

Fig. 1. (a) A simple Kripke structure; (b) CTL semantics.

2 Background

In this section, we review some notions of lattice theory, minterms, CTL model checking, and multi-valued model checking.

Lattice theory. A finite lattice is a pair $(L, \sqsubseteq)$, where $L$ is a finite set and $\sqsubseteq$ is a partial order on $L$, such that every finite subset $B \subseteq L$ has a least upper bound (called join and written $\sqcup B$) and a greatest lower bound (called meet and written $\sqcap B$). Since the lattice is finite, there exist $\top = \sqcup L$ and $\bot = \sqcap L$, that are the maximum and respectively minimum elements in the lattice. When the ordering is clear from the context, we simply refer to the lattice as $L$. A lattice is distributive if meet and join distribute over each other.

In this paper, we work with lattices of propositional formulas. For a set of atomic propositions $P$, let $F(P)$ be the set of propositional formulas over $P$. For example, $F(\{p\}) = \{true, p, \neg p, false\}$. This set forms a finite lattice ordered by implication (see Figure 2(a)). Since $p \Rightarrow true$, $p$ is under $true$ in this lattice. Meet and join in this lattice correspond to logical operators $\wedge$ and $\vee$, respectively.

A subset $B \subseteq L$ is called upward closed or an upset, if for any $a, b \in L$, if $a \sqsubseteq b$ and $a \in B$, then $b \in B$. In that case, $B$ can be identified by the set $M$ of its minimal elements ($b \in B$ is minimal if, for all $a \in B$, $a \sqsubseteq b$ implies $a = b$), and we write $B = \uparrow M$. For example, for the lattice shown in Figure 2(a), $\uparrow\{p, \neg p\} = \{p, \neg p, true\}$. 
The set $\{p, \neg p\}$ is not an upset, whereas $\{p, \neg p, true\}$ is. For singletons, we write $\uparrow a$ for $\uparrow\{a\}$. We write $U(L)$ for the set of all upsets of $L$, i.e., $A \in U(L)$ iff $A \subseteq L$ and $A$ is an upset. $U(L)$ is closed under union and intersection, and therefore forms a lattice ordered by set inclusion. We call $(U(L), \subseteq)$ the upset lattice of $L$. The upset lattice of $F(\{p\})$ is shown in Figure 2(b).

An element $j$ in a lattice $L$ is join-irreducible if $j \neq \bot$ and $j$ cannot be decomposed as the join of other lattice elements, i.e., for any $x$ and $y$ in $L$, $j = x \sqcup y$ implies $j = x$ or $j = y$ [11]. For example, the join-irreducible elements of the lattice in Figure 2(a) are $p$ and $\neg p$, and those of the one in Figure 2(b) are $\uparrow true$, $\uparrow p$, $\uparrow \neg p$, and $\uparrow false$.

Fig. 2. Lattices for $P = \{p\}$: (a) $F(P)$; (b) $U(F(P))$; (c) $2^{M(P)}$.

Minterms. In the lattice of propositional formulas $F(P)$, a join-irreducible element is a conjunction in which every atomic proposition of $P$ appears, positive or negated. Such conjunctions are called minterms and we denote their set by $M(P)$. For example, $M(\{p, q\}) = \{p \wedge q,\ p \wedge \neg q,\ \neg p \wedge q,\ \neg p \wedge \neg q\}$.

CTL Model Checking. CTL model checking is an automatic technique for verifying temporal properties of systems expressed in a propositional branching-time temporal logic called Computation Tree Logic (CTL) [9]. A system model is a Kripke structure $K = (S, R, s_0, A, I)$, where $S$ is a set of states, $R \subseteq S \times S$ is a (left-total) transition relation, $s_0 \in S$ is the initial state, $A$ is a set of atomic propositions, and $I : S \to 2^A$ is a labeling function, providing the set of atomic propositions that are true in each state. CTL formulas are evaluated in the states of $K$. Their semantics can be described in terms of infinite execution paths of the model. For instance, a formula $AG\,\varphi$ holds in a state $s$ if $\varphi$ holds in every state, on every infinite execution path starting at $s$; $AF\,\varphi$ ($EF\,\varphi$) holds in $s$ if $\varphi$ holds in some state, on every (some) infinite execution path starting at $s$. The formal semantics of CTL is given in Figure 1(b). 
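The path-based reading of $EF$ and $AG$ given above coincides with a fixpoint computation over sets of states. The following Python sketch shows this on an explicit successor map; it is a minimal illustration, not the symbolic (BDD-based) procedure used by a model checker, and the three-state model and all function names are hypothetical, since Figure 1(a) is not reproduced here.

```python
def ex(succ, target):
    """EX: states with at least one successor in target."""
    return {s for s, nxt in succ.items() if nxt & target}


def ax(succ, target):
    """AX: states whose successors all lie in target."""
    return {s for s, nxt in succ.items() if nxt <= target}


def ef(succ, target):
    """EF target as the least fixpoint of Z = target | EX Z, iterated from the empty set."""
    z = set()
    while True:
        new = target | ex(succ, z)
        if new == z:
            return z
        z = new


def ag(succ, target):
    """AG target as the greatest fixpoint of Z = target & AX Z, iterated from all states."""
    z = set(succ)
    while True:
        new = target & ax(succ, z)
        if new == z:
            return z
        z = new


# Hypothetical left-total model: s0 -> {s1, s2}, s1 -> {s1}, s2 -> {s0}.
MODEL = {"s0": {"s1", "s2"}, "s1": {"s1"}, "s2": {"s0"}}
can_reach_s1 = ef(MODEL, {"s1"})
always_in_s1 = ag(MODEL, {"s1"})
```

A formula holds in the model when the initial state belongs to the computed set; for instance, with `s0` as the initial state, `"s0" in can_reach_s1` plays the role of checking $EF$ of the proposition labelling `s1`.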
Without loss of generality we consider only CTL formulas in negation normal form, where negation is applied only to atomic propositions [9]. In Figure 1(b), the function $[\![\varphi]\!] : S \to \{true, false\}$ indicates the result of checking a formula $\varphi$ in state $s$; the set of successors for a state $s$ is $R(s) = \{t \mid R(s, t)\}$; $\mu Z.f(Z)$ and $\nu Z.f(Z)$ are least and greatest fixpoints of $f$, respectively, where $\mu Z.f(Z) = \bigvee_{i>0} f^i(false)$ and $\nu Z.f(Z) = \bigwedge_{i>0} f^i(true)$. Other temporal operators are derived from the given ones, for example: $EF\,\varphi = E[true\ U\ \varphi]$, $AF\,\varphi = A[true\ U\ \varphi]$. The operators in pairs such as $(AX, EX)$ and $(AG, EF)$ are duals of each other. A formula $\varphi$ holds in a Kripke structure $K$, written $K \models \varphi$, if it holds in the initial state, i.e., $[\![\varphi]\!](s_0) = true$. For example, on the model in Figure 1(a), where […], properties […] and […] are true, whereas […] is not. The complexity of model checking a CTL formula $\varphi$ on a Kripke structure $K$ is $O(|K| \times |\varphi|)$, where $|K| = |S| + |R|$.

Multi-valued model checking. Multi-valued CTL model checking [6] is a generalization of model checking from a classical logic to an arbitrary De Morgan algebra $(L, \sqsubseteq, \neg)$, where $(L, \sqsubseteq)$ is a finite distributive lattice and $\neg$ is any operation that is an involution ($\neg\neg\ell = \ell$) and satisfies De Morgan laws. Conjunction and disjunction are the meet and join operations of $(L, \sqsubseteq)$, respectively. When the ordering and the negation operation of an algebra $(L, \sqsubseteq, \neg)$ are clear from the context, we refer to it as $L$. In this paper, we only use a version of multi-valued model checking where the model remains classical, i.e., both the transition relation and the atomic propositions are two-valued, but properties are specified in a multi-valued extension of CTL over a given De Morgan algebra $L$, called $\chi$CTL($L$). The logic $\chi$CTL($L$) has the same syntax as CTL, except that the allowed constants are all $\ell \in L$. Boolean values true and false are replaced by the $\top$ and $\bot$ of $L$, respectively. The semantics of $\chi$CTL($L$) is the same as that of CTL, except that $[\![\varphi]\!]$ is extended to $[\![\varphi]\!] : S \to L$ and the interpretation of constants is: for all $\ell \in L$, $[\![\ell]\!](s) = \ell$. The other operations are defined as their CTL counterparts (see Figure 1(b)), where $\vee$ and $\wedge$ are interpreted as lattice operators $\sqcup$ and $\sqcap$, respectively. The complexity of model checking a $\chi$CTL($L$) formula $\varphi$ on a Kripke structure $K$ is still $O(|K| \times |\varphi|)$, provided that meet, join, and quantification can be 
computed in constant time [6], which depends on the lattice.

3 Query Checking

In this section, we review the query-checking problem and a symbolic method for solving it.

Background. Let $K$ be a Kripke structure with a set $A$ of atomic propositions. A CTL query, denoted by $\varphi[?]$, is a CTL formula containing a placeholder "?" for a propositional subformula (over the atomic propositions in $A$). The CTL formula obtained by substituting the placeholder in $\varphi[?]$ by a formula $\alpha \in F(A)$ is denoted by $\varphi[\alpha]$. A formula $\alpha$ is a solution to a query $\varphi[?]$ if its substitution into the query results in a CTL formula that holds on $K$, i.e., if $K \models \varphi[\alpha]$. For example, […] and […] are among the solutions to the query […] on the model of Figure 1(a), whereas […] is not.

In this paper, we consider queries in negation normal form where negation is applied only to the atomic propositions, or to the placeholder. We further restrict our attention to queries with a single placeholder, although perhaps with multiple occurrences. For a query $\varphi[?]$, a substitution $\varphi[\alpha]$ means that all occurrences of the placeholder are replaced by $\alpha$. For example, if […], then […]. We assume that occurrences of the placeholder are either non-negated everywhere, or negated everywhere, i.e., the query is either positive or negative, respectively. Here, we limit our presentation to positive queries; see Section 5 for the treatment of negative queries.

The general CTL query-checking problem is: given a CTL query on a model, find all its propositional solutions. For instance, the answer to the query […] on the model in Figure 1(a) is the set consisting of […], and every other formula implied by these, including […], […], and true. If $\alpha$ is a solution to a query, then any $\beta$ such that $\alpha \Rightarrow \beta$ (i.e., any weaker $\beta$) is also a solution, due to the monotonicity of positive queries [5]. Thus, the set of all possible solutions is an upset; it is sufficient for the query checker to output the strongest solutions, since the rest can be inferred from them.

One can restrict a query to a subset $P \subseteq A$ [3]. We then denote the query by […], and its solutions become formulas in $F(P)$. For instance, checking […] on the model of 
Figure 1(a) should result in […] and […] as the strongest solutions, together with all those implied by them. We write […] for […].

If $P$ consists of $n$ atomic propositions, there are $2^{2^n}$ possible distinct solutions to a query restricted to $P$. A "naive" method for finding all solutions would model check $\varphi[\alpha]$ for every possible propositional formula $\alpha$ over $P$, and collect all those $\alpha$'s for which $\varphi[\alpha]$ holds in the model. The complexity of this naive approach is $2^{2^n}$ times that of usual model checking.

Symbolic Algorithm. A symbolic algorithm for solving the general query-checking problem was described in [15] and has been implemented in the TLQSolver tool [7]. We review this approach below.

Since an answer to a query restricted to $P$ is an upset, the upset lattice $U(F(P))$ is the space of all possible answers [3]. For instance, the lattice for […] is shown in Figure 2(b). In the model in Figure 1(a), the answer to this query is […], encoded as […], since […] is the strongest solution. Symbolic query checking is implemented by model checking over the upset lattice.

The algorithm is based on a state semantics of the placeholder. Suppose the placeholder, restricted to $P = \{p\}$, is evaluated in a state $s$. Either $p$ holds in $s$, in which case the answer to the query should be $\uparrow p$, or $\neg p$ holds, in which case the answer is $\uparrow \neg p$. Thus the value of the placeholder in $s$ is $\uparrow p$ if $p$ holds in $s$, and $\uparrow \neg p$ if $\neg p$ holds in $s$. This case analysis can be logically encoded by the formula $(p \wedge \uparrow p) \vee (\neg p \wedge \uparrow \neg p)$.

Let us now consider a general placeholder in a state $s$ (where the placeholder ranges over a set of atomic propositions $P$). We note that the case analysis corresponding to the one above can be given in terms of minterms. Minterms are the strongest formulas that may hold in a state; they are also mutually exclusive and complete: exactly one minterm $m$ holds in any state $s$, and then $\uparrow m$ is the answer to the placeholder at $s$. This semantics is encoded in the following translation of the placeholder:

$$T(?) = \bigvee_{m \in M(P)} (m \wedge \uparrow m).$$

The symbolic algorithm is defined as follows: given a query $\varphi[?]$, first obtain $\varphi[T(?)]$, which is a $\chi$CTL formula (over the lattice $U(F(P))$), and then model check this formula. The semantics of the formula is given by a function from $S$ to $U(F(P))$, as described in Section 2. Thus model checking this formula results in a value from $U(F(P))$. That value was shown in [15] to represent all 
propositional solutions to the query. For example, the query […] on the model of Figure 1(a) becomes […]. The result of model checking this formula is […].

The complexity of this algorithm is the same as in the naive approach. In practice, however, TLQSolver was shown to perform better than the naive algorithm [15, 7].

4 State Solutions to Queries

Let $K$ be a Kripke structure with a set $A$ of atomic propositions. In general query checking, solutions to queries are arbitrary propositional formulas. On the other hand, in state query checking, solutions are restricted to be single states. To represent a single state, a propositional formula needs to be a minterm over $A$. In symbolic model checking, any state $s$ of $K$ is uniquely represented by the minterm that holds in $s$. For example, in the model of Figure 1(a), state […] is represented by […], […] by […], etc. Thus, for state query checking, an answer to a query is a set of minterms, rather than an upset of propositional formulas. For instance, for the query […], on the model of Figure 1(a), the state query-checking answer is […], whereas the general query-checking one is […]. While it is still true that if a minterm $m$ is a solution, everything in $\uparrow m$ is also a solution, we no longer view answers as upsets, since we are interested only in minterms, and $m$ is the only minterm in the set $\uparrow m$ (minterms are incomparable by implication). We can thus formulate state query checking as minterm query checking: given a CTL query on a model, find all its minterm solutions. We show how to solve this for any query, and any subset $P \subseteq A$. When $P = A$, the minterms obtained are the state solutions.

Given a query $\varphi[?]$, a naive algorithm would model check $\varphi[m]$ for every minterm $m$. If $n$ is the number of atomic propositions in $P$, there are $2^n$ possible minterms, and this algorithm has complexity $2^n$ times that of model checking. Minterm query checking is thus much easier to solve than general query checking.

Of course, any algorithm solving general query checking, such as the symbolic approach described in Section 3, solves minterm query checking as well: from all solutions, we can extract only those which are 
minterms. This approach, however, is much more expensive than needed. Below, we propose a method that is tailored to solve just minterm query checking, while remaining symbolic.

4.1 Solving minterm query checking

Since an answer to minterm query checking is a set of minterms, the space of all answers is the powerset $2^{M(P)}$ of the set $M(P)$ of minterms over $P$, which forms a lattice ordered by set inclusion. For example, the lattice $2^{M(\{p\})}$ is shown in Figure 2(c). Our symbolic algorithm evaluates queries over this lattice. We first adjust the semantics of the placeholder to minterms. Suppose we evaluate the placeholder, restricted to $P = \{p\}$, in a state $s$. Either $p$ holds in $s$, and then the answer should be $\{p\}$, or $\neg p$ holds, and then the answer is $\{\neg p\}$. Thus, the value of the placeholder in $s$ is $\{p\}$ if $p$ holds in $s$, and $\{\neg p\}$ if $\neg p$ holds in $s$. This is encoded by the formula $(p \wedge \{p\}) \vee (\neg p \wedge \{\neg p\})$. In general, for a placeholder over $P$, exactly one minterm $m$ holds in $s$, and in that case $\{m\}$ is the answer to the query. This gives the following translation of the placeholder:

$$T_M(?) = \bigvee_{m \in M(P)} (m \wedge \{m\}).$$

Our minterm query-checking algorithm is now defined as follows: given a query $\varphi[?]$ on a model $K$, compute $\varphi[T_M(?)]$, and then model check this over $2^{M(P)}$. For example, for […], on the model of Figure 1(a), we model check […] and obtain the answer […], which is indeed the only minterm solution for this model.

To prove our algorithm correct, we need to show that its answer is the set of all minterm solutions. We prove this claim by relating our algorithm to the general algorithm in Section 3. We show that, while the general algorithm computes the set $B$ of all solutions, ours results in the subset $M \subseteq B$ that consists of only the minterms from $B$. We first establish an "approximation" mapping from the upset lattice of propositional formulas to $2^{M(P)}$ that, for any upset $B$, returns the subset $B \cap M(P)$ of minterms. 
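The evaluation over the powerset lattice described in Section 4.1 can be sketched on an explicit-state model as follows. This is a minimal illustration under stated assumptions, not the NuSMV-based implementation: states stand in for their minterms, only the $EF$ and $AG$ operators are covered, and the four-state model and all function names are hypothetical. The placeholder evaluates to the singleton of the current state, join is set union, and meet is set intersection.

```python
def placeholder(succ):
    """Value of the placeholder in each state: the singleton holding that state's minterm."""
    return {s: frozenset([s]) for s in succ}


def ef_sets(succ, val):
    """EF over the powerset lattice: least fixpoint of Z(s) = val(s) | union of Z(t), t in succ(s)."""
    z = {s: frozenset() for s in succ}
    while True:
        new = {s: val[s] | frozenset().union(*(z[t] for t in succ[s])) for s in succ}
        if new == z:
            return z
        z = new


def ag_sets(succ, val):
    """AG over the powerset lattice: greatest fixpoint of Z(s) = val(s) & intersection of Z(t)."""
    top = frozenset(succ)
    z = {s: top for s in succ}
    while True:
        new = {s: val[s] & top.intersection(*(z[t] for t in succ[s])) for s in succ}
        if new == z:
            return z
        z = new


def naive_ef(succ, init):
    """Naive alternative: one classical reachability check per candidate state."""
    solutions = set()
    for candidate in succ:
        reach, frontier = {init}, [init]
        while frontier:
            for t in succ[frontier.pop()]:
                if t not in reach:
                    reach.add(t)
                    frontier.append(t)
        if candidate in reach:
            solutions.add(candidate)
    return solutions


# Hypothetical model; s3 is not reachable from the initial state s0.
MODEL = {"s0": {"s1", "s2"}, "s1": {"s1"}, "s2": {"s0"}, "s3": {"s0"}}
reachable = ef_sets(MODEL, placeholder(MODEL))["s0"]
stable = ef_sets(MODEL, ag_sets(MODEL, placeholder(MODEL)))["s0"]
```

Here `reachable` is the answer to a query of the form $EF\,?$ obtained in a single lattice-valued run, and it agrees with `naive_ef`, which needs one run per candidate. Composing the two operators, as in `stable`, corresponds to one reading of the stable-states question from the introduction, namely a query of the form $EF\,AG\,?$.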
Definition 1 (Minterm approximation). Let $P$ be a set of atomic propositions. Minterm approximation $f_m : U(F(P)) \to 2^{M(P)}$, from the upset lattice of propositional formulas over $P$ to the powerset of minterms over $P$, is $f_m(B) = B \cap M(P)$, for any upset $B \in U(F(P))$.

With this definition, the minterm translation $T_M(?)$ of the placeholder is obtained from the general translation $T(?)$ by replacing $\uparrow m$ with $f_m(\uparrow m) = \{m\}$. The minterm approximation preserves set operations; this can be proven using the fact that any set of propositional formulas can be partitioned into minterms and non-minterms.

Proposition 1. The minterm approximation $f_m$ is a lattice homomorphism, i.e., it preserves the set operations: for any $B, B' \in U(F(P))$, $f_m(B \cup B') = f_m(B) \cup f_m(B')$ and $f_m(B \cap B') = f_m(B) \cap f_m(B')$.

By Proposition 1, and since model checking is performed using only set operations, we can show that the approximation preserves model-checking results. Model checking $\varphi[T_M(?)]$ is the minterm approximation of checking $\varphi[T(?)]$. In other words, our algorithm results in the set of all minterm solutions, which concludes the correctness argument.

Theorem 1 (Correctness of minterm approximation). For any state $s$ of $K$, $f_m([\![\varphi[T(?)]]\!](s)) = [\![\varphi[T_M(?)]]\!](s)$.

In summary, for $P = A$, we have the following correct symbolic state query-checking algorithm: given a query $\varphi[?]$ on a model $K$, translate it to $\varphi[T_M(?)]$, and then model check this over $2^{M(A)}$.

The worst-case complexity of our algorithm is the same as that of the naive approach. With an efficient encoding of the approximate lattice, however, our approach can outperform the naive one in practice, as we show in Section 6.

4.2 Implementation

Although our minterm query-checking algorithm is defined as model checking over a lattice, we can implement it using a classical symbolic model checker. This is done by encoding the lattice elements in $2^{M(P)}$ such that lattice operations are already implemented by a symbolic model checker. The key observation is that the lattice $(2^{M(P)}, \subseteq)$ is isomorphic to the lattice of propositional formulas $(F(P), \Rightarrow)$. This can be seen, for instance, by comparing the lattices in Figures 2(a) and 2(c). Thus, the elements of $2^{M(P)}$ can be encoded as propositional formulas, and the operations become propositional disjunction and conjunction. A symbolic model checker, such as NuSMV [8], which we used in our implementation, already has data structures for representing propositional formulas and algorithms to compute their 
disjunction and conjunction (BDDs [24]). The only modifications we made to NuSMV were parsing the input and reporting the result. While parsing the queries, we implemented the translation defined in Section 4.1. In this translation, for every minterm, we give a propositional encoding to. We cannot simply use to encode. The lattice elements need to be constants with respect to the model, and is not a constant: it is a propositional formula that contains model variables. We can, however, obtain an encoding for, by renaming to a similar propositional formula over fresh variables. For instance, we encode as. Thus, our query translation results in a CTL formula with double the number of propositional variables compared to the model. For example, the translation of is

We input this formula into NuSMV, and obtain the set of minterm solutions as a propositional formula over the encoding variables. For, on the model in Figure 1(a), we obtain the result, corresponding to the only minterm solution.

4.3 Exactness of minterm approximation

In this section, we address the applicability of minterm query checking to general query checking. When the minterm solutions are the strongest solutions to a query, minterm query checking solves the general query-checking problem as well, as all solutions to that query can be inferred from the minterms. In that case, we say that the minterm approximation is exact. We would like to identify those CTL queries that admit exact minterm approximations, independently of the model. The following can be proven using the fact that any propositional formula is a disjunction of minterms.
Proposition 2. A positive query has an exact minterm approximation in any model iff is distributive over disjunction, i.e.,.

An example of a query that admits an exact approximation is; its strongest solutions are always minterms, representing the reachable states. In [5], Chan showed that deciding whether a query is distributive over conjunction is EXPTIME-complete. We obtain a similar result by duality.

Theorem 2. Deciding whether a CTL query is distributive over disjunction is EXPTIME-complete.

Since the decision problem is hard, it would be useful to have a grammar that is guaranteed to generate queries which distribute over disjunction. Chan defined a grammar for queries distributive over conjunction, which was later corrected by Samer and Veith [22]. We can obtain a grammar for queries distributive over disjunction, from the grammar in [22], by duality.

5 Approximations

The efficiency of model checking over a lattice is determined by the size of the lattice. In the case of query checking, by restricting the problem and approximating answers, we have obtained a more manageable lattice. In this section, we show that our minterm approximation is an instance of a more general approximation framework for reasoning over any lattice of sets. Having a more general framework makes it easier to accommodate other approximations that may be needed in query checking. For example, we use it to derive an approximation to negative queries. This framework may also apply to other analysis problems that involve model checking over lattices of sets, such as vacuity detection [14].

We first define general approximations that map larger lattices into smaller ones.
Let be any finite set. Its powerset lattice is. Let be any sublattice of the powerset lattice, i.e.,.

Definition 2 (Approximation). A function is an approximation if:
1. it satisfies for any (i.e., is an under-approximation of), and
2. it is a lattice homomorphism, i.e., it respects the lattice operations:, and.

From the definition of, the image of through is a sublattice of, having and as its maximum and minimum elements, respectively.

We consider an approximation to be correct if it is preserved by model checking: reasoning over the smaller lattice is the approximation of reasoning over the larger one. Let be a CTL() formula. We define its translation into to be the CTL( formula obtained from by replacing any constant occurring in by. The following theorem simply states that the result of model checking is the approximation of the result of model checking. Its proof follows by structural induction from the semantics of CTL, and uses the fact that approximations are homomorphisms. [18] proves a similar result, albeit in a somewhat different context.
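The two conditions of Definition 2 can be checked mechanically on a small case. The following is a minimal sketch, not the authors' implementation: it assumes two atomic propositions, identifies each propositional formula with the set of assignments that satisfy it, and takes the minterm approximation to be "keep only the minterms of a set of formulas", as described in Section 4.1. All names (`upset`, `minterm_approx`, `is_approximation`) are hypothetical.

```python
from itertools import combinations

# The four truth assignments to two atomic propositions (p, q).
ASSIGNMENTS = [(p, q) for p in (False, True) for q in (False, True)]

# All 16 propositional formulas over {p, q}, up to logical equivalence.
# Each formula is identified with the frozenset of assignments satisfying it.
FORMULAS = [frozenset(c)
            for r in range(len(ASSIGNMENTS) + 1)
            for c in combinations(ASSIGNMENTS, r)]

# A minterm is satisfied by exactly one assignment.
MINTERMS = {f for f in FORMULAS if len(f) == 1}


def upset(f):
    """All formulas implied by f, i.e. whose models include the models of f."""
    return frozenset(g for g in FORMULAS if f <= g)


def minterm_approx(s):
    """Assumed minterm approximation: keep only the minterms in the set s."""
    return frozenset(f for f in s if f in MINTERMS)


def is_approximation(approx, elements):
    """Check both conditions of Definition 2 on the given lattice elements."""
    under = all(approx(a) <= a for a in elements)          # condition 1
    homomorphic = all(                                      # condition 2
        approx(a | b) == approx(a) | approx(b)
        and approx(a & b) == approx(a) & approx(b)
        for a in elements for b in elements
    )
    return under and homomorphic


# The upsets generated by single formulas serve as sample lattice elements.
UPSETS = [upset(f) for f in FORMULAS]
print(is_approximation(minterm_approx, UPSETS))
```

Because the approximation here is an intersection with a fixed set, preservation of union and intersection holds by construction; the check only makes that concrete on one small lattice.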
Ergodic solenoidal homology

arXiv:math/0702501v1 [math.DG] 16 Feb 2007
VICENTE MUÑOZ AND RICARDO PÉREZ MARCO

Abstract. We define generalized currents associated with immersions of abstract solenoids with a transversal measure. We realize geometrically the full real homology of a compact manifold with these generalized currents, and more precisely with immersions of minimal uniquely ergodic solenoids. This makes precise and geometric De Rham's realization of the real homology by only using a restricted geometric subclass of currents. These generalized currents extend Ruelle-Sullivan and Schwartzman currents. We extend Schwartzman theory beyond dimension 1 and provide a unified treatment of Ruelle-Sullivan and Schwartzman theories via Birkhoff's ergodic theorem for the class of immersions of controlled solenoids. We develop some intersection theory of these new generalized currents that explains why the realization theorem cannot be achieved only with Ruelle-Sullivan currents.
IMPLICATURE (Pragmatics)

A: Shall we hold the football match tomorrow?
B: It is raining.
Semantic (literal) meaning: B's answer is unrelated to the question. Intended meaning: the football match will be canceled because the ground will be wet and slippery after the rain.
(c) Carmen: I hear you've invited Mat and Chris. Dave: I didn't invite Mat. (Did Dave invite Chris?) → Dave invited Chris.
The notion of implicature provides some explicit account of how it is possible to mean more than what is actually “said” , or more than what is literally expressed.
a) Tom: Are you going to Mark's party tonight? Annie: My parents are in town. (No.) Shared knowledge: Annie's relation with her parents.
b) Tom: Where's the salad dressing? Gabriel: We've run out of olive oil. (There isn't any salad dressing.) Shared knowledge: Olive oil is a possible ingredient in salad dressing, and they only use salad dressing made from olive oil.
Interpreting generalized linear model results: overview and explanation

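1. Introduction

1.1 Overview

The overview section can include a brief introduction to generalized linear models and the importance of interpreting their results.
One possible way to write it is as follows. In statistics and machine learning, the generalized linear model (GLM) is a commonly used statistical model for describing the relationship between a dependent variable and independent variables.
Unlike the traditional linear regression model, a GLM allows the dependent variable (also called the response variable) to follow a non-normal distribution, which makes it better suited to non-normally distributed data.
The theoretical basis of the GLM is the generalized linear equation, which introduces the concepts of a link function and an error distribution so that the model can adapt to different types of data.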
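To make the roles of the link function and the error distribution concrete, here is a minimal pure-Python sketch. It assumes a Poisson error distribution with a log link and a single predictor, and fits the model by iteratively reweighted least squares (IRLS). The function name `fit_poisson_glm` and the synthetic data are illustrative assumptions, not part of the article.

```python
import math


def fit_poisson_glm(xs, ys, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by iteratively reweighted least squares."""
    # Start from the intercept-only model.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for x, y in zip(xs, ys):
            eta = b0 + b1 * x            # linear predictor
            mu = math.exp(eta)           # inverse of the log link
            w = mu                       # IRLS weight for Poisson with log link
            z = eta + (y - mu) / mu      # working response
            s_w += w
            s_wx += w * x
            s_wxx += w * x * x
            s_wz += w * z
            s_wxz += w * x * z
        # Solve the 2x2 weighted normal equations for (b0, b1).
        det = s_w * s_wxx - s_wx * s_wx
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
    return b0, b1


# Synthetic, noise-free data whose mean follows exp(0.5 + 0.3 * x).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [math.exp(0.5 + 0.3 * x) for x in xs]

b0, b1 = fit_poisson_glm(xs, ys)
rate_ratio = math.exp(b1)  # multiplicative change in the mean per unit of x
print(b0, b1, rate_ratio)
```

Under a log link the coefficient is read on the multiplicative scale: a one-unit increase in x multiplies the expected response by `exp(b1)`, which is the kind of statement that result interpretation aims at.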
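Interpreting results is an important task in GLM analysis.
By interpreting the model's results, we can understand the relationship between the independent and dependent variables in depth and obtain information about the influencing factors.
Result interpretation helps us understand the importance and direction of each independent variable and the size of its effect on the dependent variable.
By interpreting the results, we can infer which factors are crucial to the observed outcome and so gain a deeper understanding of the nature of the problem.
This article focuses on how to interpret the results of a GLM.
We introduce the basic concepts and principles of GLMs and point out what to watch for when interpreting results.
In addition, we provide practical cases and worked examples to help readers better understand the methods and process of result interpretation.
After reading this article, readers will have a fuller understanding of GLM result interpretation and will have the relevant techniques and methods in hand.
The aim of this article is to help readers better understand and apply GLMs and thereby improve their skills in statistical analysis and machine learning.
In the following sections, we describe GLMs and the key points of interpreting their results in detail, and we hope readers will benefit from them.

1.2 Article structure

The article-structure section should briefly introduce and outline the structure of the whole article.
This section usually includes the following. Content of the article-structure section: this article is divided into three parts: introduction, main body, and conclusion.
The introduction outlines the background and importance of GLMs and states the purpose of the article.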
On the Interactions of Light Gravitinos

T.E. Clark (1), Taekoon Lee (2), S.T. Love (3), Guo-Hong Wu (4)

Department of Physics, Purdue University, West Lafayette, IN 47907-1396

Abstract

In models of spontaneously broken supersymmetry, certain light gravitino processes are governed by the coupling of its Goldstino components. The rules for constructing SUSY and gauge invariant actions involving the Goldstino couplings to matter and gauge fields are presented. The explicit operator construction is found to be at variance with some previously reported claims. A phenomenological consequence arising from light gravitino interactions in supernova is reexamined and scrutinized.

(1) e-mail address: clark@
(2) e-mail address: tlee@
(3) e-mail address: love@
(4) e-mail address: wu@

In the supergravity theories obtained from gauging a spontaneously broken global N=1 supersymmetry (SUSY), the Nambu-Goldstone fermion, the Goldstino [1,2], provides the helicity $\pm\frac{1}{2}$ degrees of freedom needed to render the spin $\frac{3}{2}$ gravitino massive through the super-Higgs mechanism. For a light gravitino, the high energy (well above the gravitino mass) interactions of these helicity $\pm\frac{1}{2}$ modes with matter will be enhanced according to the supersymmetric version of the equivalence theorem [3]. The effective action describing such interactions can then be constructed using the properties of the Goldstino fields. Currently studied gauge mediated supersymmetry breaking models [4] provide a realization of this scenario as do certain no-scale supergravity models [5]. In the gauge mediated case, the SUSY is dynamically broken in a hidden sector of the theory by means of gauge interactions resulting in a hidden sector Goldstino field. The spontaneous breaking is then mediated to the minimal supersymmetric standard model (MSSM) via radiative corrections in the standard model gauge interactions involving messenger fields which carry standard model vector representations. In such models, the supergravity contributions to the SUSY breaking mass splittings are small compared to these
gauge mediated contributions. Being a gauge singlet, the gravitino mass arises only from the gravitational interaction and is thus far smaller than the scale $\sqrt{F}$, where $F$ is the Goldstino decay constant. Moreover, since the gravitino is the lightest of all hidden and messenger sector degrees of freedom, the spontaneously broken SUSY can be accurately described via a non-linear realization. Such a non-linear realization of SUSY on the Goldstino fields was originally constructed by Volkov and Akulov [1].

The leading term in a momentum expansion of the effective action describing the Goldstino self-dynamics at energy scales below $\sqrt{4\pi F}$ is uniquely fixed by the Volkov-Akulov effective Lagrangian [1] which takes the form

$$\mathcal{L}_{AV} = -\frac{F^2}{2}\det A. \tag{1}$$

Here the Volkov-Akulov vierbein is defined as $A_\mu{}^\nu = \delta_\mu^\nu + \frac{i}{F^2}\,\lambda\overset{\leftrightarrow}{\partial}_\mu\sigma^\nu\bar\lambda$, with $\lambda\,(\bar\lambda)$ the Goldstino Weyl spinor field. This effective Lagrangian provides a valid description of the Goldstino self interactions independent of the particular (non-perturbative) mechanism by which the SUSY is dynamically broken. The supersymmetry transformations are nonlinearly realized on the Goldstino fields as $\delta^Q(\xi,\bar\xi)\lambda^\alpha = F\xi^\alpha + \Lambda^\rho\partial_\rho\lambda^\alpha$; $\delta^Q(\xi,\bar\xi)\bar\lambda_{\dot\alpha} = F\bar\xi_{\dot\alpha} + \Lambda^\rho\partial_\rho\bar\lambda_{\dot\alpha}$, where $\xi^\alpha$, $\bar\xi_{\dot\alpha}$ are Weyl spinor SUSY transformation parameters and $\Lambda^\rho \equiv -\frac{i}{F}\left(\lambda\sigma^\rho\bar\xi - \xi\sigma^\rho\bar\lambda\right)$ is a Goldstino field dependent translation vector. Since the Volkov-Akulov Lagrangian transforms as the total divergence $\delta^Q(\xi,\bar\xi)\mathcal{L}_{AV} = \partial_\rho(\Lambda^\rho\mathcal{L}_{AV})$, the associated action $I_{AV} = \int d^4x\,\mathcal{L}_{AV}$ is SUSY invariant.

The supersymmetry algebra can also be nonlinearly realized on the matter (non-Goldstino) fields, generically denoted by $\phi^i$, where $i$ can represent any Lorentz or internal symmetry labels, as

$$\delta^Q(\xi,\bar\xi)\phi^i = \Lambda^\rho\partial_\rho\phi^i. \tag{2}$$

This is referred to as the standard realization [6]-[9]. It can be used, along with space-time translations, to readily establish the SUSY algebra. Under the non-linear SUSY standard realization, the derivative of a matter field transforms as $\delta^Q(\xi,\bar\xi)(\partial_\nu\phi^i) = \Lambda^\rho\partial_\rho(\partial_\nu\phi^i) + (\partial_\nu\Lambda^\rho)(\partial_\rho\phi^i)$. In order to eliminate the second term on the right hand side and thus restore
the standard SUSY realization, a SUSY covariant derivative is introduced and defined so as to transform analogously to $\phi^i$. To achieve this, we use the transformation property of the Volkov-Akulov vierbein and define the non-linearly realized SUSY covariant derivative [9]

$$\mathcal{D}_\mu\phi^i = (A^{-1})_\mu{}^\nu\,\partial_\nu\phi^i, \tag{3}$$

which varies according to the standard realization of SUSY: $\delta^Q(\xi,\bar\xi)(\mathcal{D}_\mu\phi^i) = \Lambda^\rho\partial_\rho(\mathcal{D}_\mu\phi^i)$.

Any realization of the SUSY transformations can be converted to the standard realization. In particular, consider the gauge covariant derivative,

$$(D_\mu\phi)^i \equiv \partial_\mu\phi^i + T^a_{ij}A^a_\mu\phi^j, \tag{4}$$

with $a = 1, 2, \ldots, \mathrm{Dim}\,G$. We seek a SUSY and gauge covariant derivative $(\mathcal{D}_\mu\phi)^i$, which transforms as the SUSY standard realization. Using the Volkov-Akulov vierbein, we define

$$(\mathcal{D}_\mu\phi)^i \equiv (A^{-1})_\mu{}^\nu\,(D_\nu\phi)^i, \tag{5}$$

which has the desired transformation property, $\delta^Q(\xi,\bar\xi)(\mathcal{D}_\mu\phi)^i = \Lambda^\rho\partial_\rho(\mathcal{D}_\mu\phi)^i$, provided the vector potential has the SUSY transformation $\delta^Q(\xi,\bar\xi)A_\mu \equiv \Lambda^\rho\partial_\rho A_\mu + \partial_\mu\Lambda^\rho A_\rho$. Alternatively, we can introduce a redefined gauge field

$$V^a_\mu \equiv (A^{-1})_\mu{}^\nu A^a_\nu, \tag{6}$$

which itself transforms as the standard realization, $\delta^Q(\xi,\bar\xi)V^a_\mu = \Lambda^\rho\partial_\rho V^a_\mu$, and in terms of which the standard realization SUSY and gauge covariant derivative then takes the form

$$(\mathcal{D}_\mu\phi)^i \equiv (A^{-1})_\mu{}^\nu\,\partial_\nu\phi^i + T^a_{ij}V^a_\mu\phi^j. \tag{7}$$

Under gauge transformations parameterized by $\omega^a$, the original gauge field varies as $\delta^G(\omega)A^a_\mu = (D_\mu\omega)^a = \partial_\mu\omega^a + g f_{abc}A^b_\mu\omega^c$, while the redefined gauge field $V^a_\mu$ has the Goldstino dependent transformation: $\delta^G(\omega)V^a_\mu = (A^{-1})_\mu{}^\nu(D_\nu\omega)^a$. For all realizations, the gauge transformation and SUSY transformation commutator yields a gauge variation with a SUSY transformed value of the gauge transformation parameter,

$$\left[\delta^G(\omega),\,\delta^Q(\xi,\bar\xi)\right] = \delta^G\!\left(\Lambda^\rho\partial_\rho\omega - \delta^Q(\xi,\bar\xi)\omega\right). \tag{8}$$

If we further require the local gauge transformation parameter to also transform under the standard realization so that $\delta^Q(\xi,\bar\xi)\omega^a = \Lambda^\rho\partial_\rho\omega^a$, then the gauge and SUSY transformations commute.

In order to construct an invariant kinetic energy term for the gauge fields, it is convenient for the gauge covariant anti-symmetric tensor field strength to also be brought into the standard realization. The
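usual field strength $F^a_{\alpha\beta} = \partial_\alpha A^a_\beta - \partial_\beta A^a_\alpha + i f_{abc}A^b_\alpha A^c_\beta$ varies under SUSY transformations as $\delta^Q(\xi,\bar\xi)F^a_{\mu\nu} = \Lambda^\rho\partial_\rho F^a_{\mu\nu} + \partial_\mu\Lambda^\rho F^a_{\rho\nu} + \partial_\nu\Lambda^\rho F^a_{\mu\rho}$. A standard realization of the gauge covariant field strength tensor, $\mathcal{F}^a_{\mu\nu}$, can then be defined as

$$\mathcal{F}^a_{\mu\nu} = (A^{-1})_\mu{}^\alpha (A^{-1})_\nu{}^\beta F^a_{\alpha\beta}, \tag{9}$$

so that $\delta^Q(\xi,\bar\xi)\mathcal{F}^a_{\mu\nu} = \Lambda^\rho\partial_\rho\mathcal{F}^a_{\mu\nu}$.

These standard realization building blocks consisting of the gauge singlet Goldstino SUSY covariant derivatives, $\mathcal{D}_\mu\lambda$, $\mathcal{D}_\mu\bar\lambda$, the matter fields, $\phi^i$, their SUSY-gauge covariant derivatives, $\mathcal{D}_\mu\phi^i$, and the field strength tensor, $\mathcal{F}^a_{\mu\nu}$, along with their higher covariant derivatives can be combined to make SUSY and gauge invariant actions. These invariant action terms then dictate the couplings of the Goldstino which, in general, carries the residual consequences of the spontaneously broken supersymmetry.

A generic SUSY and gauge invariant action can be constructed [9] as

$$I_{\rm eff} = \int d^4x\,\det A\;\mathcal{L}_{\rm eff}(\mathcal{D}_\mu\lambda, \mathcal{D}_\mu\bar\lambda, \phi^i, \mathcal{D}_\mu\phi^i, \mathcal{F}_{\mu\nu}) \tag{10}$$

where $\mathcal{L}_{\rm eff}$ is any gauge invariant function of the standard realization basic building blocks. Using the nonlinear SUSY transformations $\delta^Q(\xi,\bar\xi)\det A = \partial_\rho(\Lambda^\rho\det A)$ and $\delta^Q(\xi,\bar\xi)\mathcal{L}_{\rm eff} = \Lambda^\rho\partial_\rho\mathcal{L}_{\rm eff}$, it follows that $\delta^Q(\xi,\bar\xi)I_{\rm eff} = 0$.

It proves convenient to catalog the terms in the effective Lagrangian, $\mathcal{L}_{\rm eff}$, by an expansion in the number of Goldstino fields which appear when covariant derivatives are replaced by ordinary derivatives and the Volkov-Akulov vierbein appearing in the standard realization field strengths are set to unity.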
So doing, we expand

$$\mathcal{L}_{\rm eff} = \mathcal{L}_{(0)} + \mathcal{L}_{(1)} + \mathcal{L}_{(2)} + \cdots, \tag{11}$$

where the subscript $n$ on $\mathcal{L}_{(n)}$ denotes that each independent SUSY invariant operator in that set begins with $n$ Goldstino fields.

$\mathcal{L}_{(0)}$ consists of all gauge and SUSY invariant operators made only from light matter fields and their SUSY covariant derivatives. Any Goldstino field appearing in $\mathcal{L}_{(0)}$ arises only from higher dimension terms in the matter covariant derivatives and/or the field strength tensor. Taking the light non-Goldstino fields to be those of the MSSM and retaining terms through mass dimension 4, then $\mathcal{L}_{(0)}$ is well approximated by the Lagrangian of the minimal supersymmetric standard model which includes the soft SUSY breaking terms, but in which all derivatives have been replaced by SUSY covariant ones and the field strength tensor replaced by the standard realization field strength:

$$\mathcal{L}_{(0)} = \mathcal{L}_{\rm MSSM}(\phi, \mathcal{D}_\mu\phi, \mathcal{F}_{\mu\nu}). \tag{12}$$

Note that the coefficients of these terms are fixed by the normalization of the gauge and matter fields, their masses and self-couplings; that is, the normalization of the Goldstino independent Lagrangian.

The $\mathcal{L}_{(1)}$ terms in the effective Lagrangian begin with direct coupling of one Goldstino covariant derivative to the non-Goldstino fields. The general form of these terms, retaining operators through mass dimension 6, is given by

$$\mathcal{L}_{(1)} = \frac{1}{F}\left[\mathcal{D}_\mu\lambda^\alpha\,Q^\mu_{{\rm MSSM}\,\alpha} + \bar Q^\mu_{{\rm MSSM}\,\dot\alpha}\,\mathcal{D}_\mu\bar\lambda^{\dot\alpha}\right], \tag{13}$$

where $Q^\mu_{{\rm MSSM}\,\alpha}$ and $\bar Q^\mu_{{\rm MSSM}\,\dot\alpha}$ contain the pure MSSM field contributions to the conserved gauge invariant supersymmetry currents with once again all field derivatives being replaced by SUSY covariant derivatives and the vector field strengths in the standard realization. That is, it is this term in the effective Lagrangian which, using the Noether construction, produces the Goldstino independent piece of the conserved supersymmetry current. The Lagrangian $\mathcal{L}_{(1)}$ describes processes involving the emission or absorption of a single helicity $\pm\frac{1}{2}$ gravitino.

Finally the remaining terms in the effective Lagrangian all contain two or more Goldstino fields. In particular, $\mathcal{L}_{(2)}$ begins with the
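coupling of two Goldstino fields to matter or gauge fields. Retaining terms through mass dimension 8 and focusing only on the $\lambda$-$\bar\lambda$ terms, we can write

$$\mathcal{L}_{(2)} = \frac{1}{F^2}\,\mathcal{D}_\mu\lambda^\alpha\,\mathcal{D}_\nu\bar\lambda^{\dot\alpha}\,M^{\mu\nu}_{1\,\alpha\dot\alpha} + \frac{1}{F^2}\,\mathcal{D}_\mu\lambda^\alpha\overset{\leftrightarrow}{\mathcal{D}}_\rho\mathcal{D}_\nu\bar\lambda^{\dot\alpha}\,M^{\mu\nu\rho}_{2\,\alpha\dot\alpha} + \frac{1}{F^2}\,\mathcal{D}_\rho\mathcal{D}_\mu\lambda^\alpha\,\mathcal{D}_\nu\bar\lambda^{\dot\alpha}\,M^{\mu\nu\rho}_{3\,\alpha\dot\alpha}, \tag{14}$$

where the standard realization composite operators that contain matter and gauge fields are denoted by the $M_i$. They can be enumerated by their operator dimension, Lorentz structure and field content. In the gauge mediated models, these terms are all generated by radiative corrections involving the standard model gauge coupling constants.

Let us now focus on the pieces of $\mathcal{L}_{(2)}$ which contribute to a local operator containing two gravitino fields and is bilinear in a Standard Model fermion $(f, \bar f)$. Those lowest dimension operators (which involve no derivatives on $f$ or $\bar f$) are all contained in the $M_1$ piece. After application of the Goldstino field equation (neglecting the gravitino mass) and making prodigious use of Fierz rearrangement identities, this set reduces to just 1 independent on-shell interaction term. In addition to this operator, there is also an operator bilinear in $f$ and $\bar f$ and containing 2 gravitinos which arises from the product of $\det A$ with $\mathcal{L}_{(0)}$. Combining the two independent on-shell interaction terms involving 2 gravitinos and 2 fermions results in the effective action

$$I_{f\bar f\tilde G\tilde G} = \int d^4x\left[-\frac{1}{2F^2}\left(\lambda\overset{\leftrightarrow}{\partial}_\mu\sigma^\nu\bar\lambda\right)\left(f\overset{\leftrightarrow}{\partial}_\nu\sigma^\mu\bar f\right) + \frac{C_{ff}}{F^2}\,(f\partial^\mu\lambda)\left(\bar f\partial_\mu\bar\lambda\right)\right], \tag{15}$$

where $C_{ff}$ is a model dependent real coefficient. Note that the coefficient of the first operator is fixed by the normalization of the MSSM Lagrangian.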
This result is in accord with a recent analysis [10] where it was found that the fermion-Goldstino scattering amplitudes depend on only one parameter which corresponds to the coefficient $C_{ff}$ in our notation.

In a similar manner, the lowest mass dimension operator contributing to the effective action describing the coupling of two on-shell gravitinos to a single photon arises from the $M_1$ and $M_3$ pieces of $\mathcal{L}_{(2)}$ and has the form

$$I_{\gamma\tilde G\tilde G} = \int d^4x\left[\frac{C_\gamma}{F^2}\,\partial^\mu\lambda\,\sigma^\rho\,\partial^\nu\bar\lambda\,\partial_\mu F_{\rho\nu} + {\rm h.c.}\right], \tag{16}$$

with $C_\gamma$ a model dependent real coefficient and $F_{\mu\nu}$ the electromagnetic field strength. Note that the operator in the square bracket is odd under both parity (P) and charge conjugation (C). In fact any operator arising from a gauge and SUSY invariant structure which is bilinear in two on-shell gravitinos and contains only a single photon is necessarily odd in both P and C. Thus the generation of any such operator requires a violation of both P and C. Using the Goldstino equation of motion, the analogous term containing $\tilde F_{\mu\nu}$ reduces to Eq. (16) with $C_\gamma \to -iC_\gamma$.

Recently, there has appeared in the literature [11] the claim that there is a lower dimensional operator of the form $\frac{\tilde M^2}{F^2}\,\partial^\nu\lambda\,\sigma^\mu\bar\lambda\,F_{\mu\nu}$ which contributes to the single photon, 2 gravitino interaction. Here $\tilde M$ is a model dependent SUSY breaking mass parameter which is roughly an order(s) of magnitude less than $\sqrt{F}$. From our analysis, we do not find such a term to be part of a SUSY invariant action piece and thus it should not be included in the effective action. Such a term is also absent if one employs the equivalent formalism of Wess and Samuel [6]. We have also checked that such a term does not appear via radiative corrections by an explicit graphical calculation using the correct non-linearly realized SUSY invariant action. This is also contrary to the previous claim.

There have been several recent attempts to extract a lower bound on the SUSY breaking scale using the supernova cooling rate [11,12,13]. Unfortunately, some of these estimates [11,13] rely on the existence of the non-SUSY invariant
dimension 6 operator referred to above. Using the correct low energy effective Lagrangian of gravitino interactions, the leading term coupling 2 gravitinos to a single photon contains an additional suppression factor of roughly $C_\gamma\, s/\tilde M^2$. Taking $\sqrt{s} \simeq 0.1$ GeV for the processes of interest and using $\tilde M \sim 100$ GeV, this introduces an additional suppression of at least $10^{-12}$ in the rate and obviates the previous estimates of a bound on $F$.

Assuming that the mass scales of gauginos and the superpartners of light fermions are above the core temperature of supernova, the gravitino cooling of supernova occurs mainly via gravitino pair production. It is interesting to compare the gravitino pair production cross section to that of the neutrino pair production, which is the main supernova cooling channel. We have seen that for low energy gravitino interactions with matter, the amplitude for gravitino pair production is proportional to $1/F^2$. A simple dimensional analysis then suggests the ratio of the cross sections is:

$$\frac{\sigma_{\chi\chi}}{\sigma_{\nu\nu}} \sim \frac{s^2}{F^4 G_F^2} \tag{17}$$

where $G_F$ is the Fermi coupling and $\sqrt{s}$ is the typical energy scale of the particles in a supernova. Even with the most optimistic values for $F$, the gravitino production is too small to be relevant. For example, taking $\sqrt{F} = 100$ GeV, $\sqrt{s} = 0.1$ GeV, the ratio is of $O(10^{-11})$. It seems, therefore, that such an astrophysical bound on the SUSY breaking scale is untenable in models where the gravitino is the only superparticle below the scale of supernova core temperature.

We thank T.K. Kuo for useful conversations. This work was supported in part by the U.S. Department of Energy under grant DE-FG02-91ER40681 (Task B).

References

[1] D.V. Volkov and V.P. Akulov, Pis'ma Zh. Eksp. Teor. Fiz. 16 (1972) 621 [JETP Lett. 16 (1972) 438].
[2] P. Fayet and J. Iliopoulos, Phys. Lett. B51 (1974) 461.
[3] R. Casalbuoni, S. De Curtis, D. Dominici, F. Feruglio and R. Gatto, Phys. Lett. B215 (1988) 313.
[4] M. Dine and A.E. Nelson, Phys. Rev. D48 (1993) 1277; M. Dine, A.E. Nelson and Y. Shirman, Phys. Rev. D51 (1995) 1362; M. Dine, A.E. Nelson, Y. Nir and
Y. Shirman, Phys. Rev. D53 (1996) 2658.
[5] J. Ellis, K. Enqvist and D.V. Nanopoulos, Phys. Lett. B147 (1984) 99.
[6] S. Samuel and J. Wess, Nucl. Phys. B221 (1983) 153.
[7] J. Wess and J. Bagger, Supersymmetry and Supergravity, second edition (Princeton University Press, Princeton, 1992).
[8] T.E. Clark and S.T. Love, Phys. Rev. D39 (1989) 2391.
[9] T.E. Clark and S.T. Love, Phys. Rev. D54 (1996) 5723.
[10] A. Brignole, F. Feruglio and F. Zwirner, hep-th/9709111.
[11] M.A. Luty and E. Ponton, hep-ph/9706268.
[12] J.A. Grifols, R.N. Mohapatra and A. Riotto, Phys. Lett. B400 (1997) 124; J.A. Grifols, R.N. Mohapatra and A. Riotto, Phys. Lett. B401 (1997) 283.
[13] J.A. Grifols, E. Masso and R. Toldra, hep-ph/970753. D.S. Dicus, R.N. Mohapatra and V.L. Teplitz, hep-ph/9708369.
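The order-of-magnitude estimate quoted with Eq. (17) can be reproduced with a few lines of arithmetic. The sketch below is only a numerical check of the dimensional estimate, not a cross-section calculation. The value used for the Fermi coupling is the standard one and is an assumption here, since the text does not quote it, and the function name `cross_section_ratio` is hypothetical.

```python
# Fermi coupling in GeV^-2 (standard value; assumed, not quoted in the text).
G_FERMI = 1.1664e-5


def cross_section_ratio(sqrt_f_gev, sqrt_s_gev, g_fermi=G_FERMI):
    """Dimensional estimate s^2 / (F^4 G_F^2) of Eq. (17), in natural units."""
    f = sqrt_f_gev ** 2   # F has mass dimension 2, so F = (sqrt F)^2
    s = sqrt_s_gev ** 2   # squared typical energy scale
    return s ** 2 / (f ** 4 * g_fermi ** 2)


# Values used in the text: sqrt(F) = 100 GeV, sqrt(s) = 0.1 GeV.
ratio = cross_section_ratio(sqrt_f_gev=100.0, sqrt_s_gev=0.1)
print(f"{ratio:.1e}")  # of order 1e-11
```

Since the ratio scales as $F^{-4}$, raising $\sqrt{F}$ by a factor of 10 lowers the estimate by a further eight orders of magnitude.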
Submanifolds of generalized complex manifolds

arXiv:math/0309013v1 [math.DG] 1 Sep 2003
OREN BEN-BASSAT AND MITYA BOYARCHENKO Abstract. The main goal of our paper is the study of several classes of submanifolds of generalized complex manifolds. Along with the generalized complex submanifolds defined by Gualtieri and Hitchin in [4], [8] (we call these “generalized Lagrangian submanifolds” in our paper), we introduce and study three other classes of submanifolds. For generalized complex manifolds that arise from complex (resp., symplectic) manifolds, all three classes specialize to complex (resp., symplectic) submanifolds. In general, however, all three classes are distinct. We discuss some interesting features of our theory of submanifolds, and illustrate them with a few nontrivial examples. We then support our “symplectic/Lagrangian viewpoint” on the submanifolds introduced in [4], [8] by defining the “generalized complex category”, modelled on the constructions of Guillemin-Sternberg [5] and Weinstein [14]. We argue that our approach may be useful for the quantization of generalized complex manifolds.
An-iterative-algorithm-for-the-reflexive-solutions-of-the-generalized-coupled-Sylvestermatrixequatio

An iterative algorithm for the reflexive solutions of the generalized coupled Sylvester matrix equations and its optimal approximation

Mehdi Dehghan*, Masoud Hajarian

Department of Applied Mathematics, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, No. 424, Hafez Avenue, Tehran 15914, Iran

Abstract

The generalized coupled Sylvester matrix equations $(AY - ZB,\ CY - ZD) = (E, F)$ with unknown matrices $Y, Z$ are encountered in many systems and control applications. Also these matrix equations have several applications relating to the problem of computing stable eigendecompositions of matrix pencils. In this work, we construct an iterative algorithm to solve the generalized coupled Sylvester matrix equations over reflexive matrices $Y, Z$. When the matrix equations are consistent, for any initial matrix pair $[Y_0, Z_0]$, a reflexive solution pair can be obtained within finite iteration steps in the absence of roundoff errors, and the least Frobenius norm reflexive solution pair can be obtained by choosing a special kind of initial matrix pair. Also we obtain the optimal approximation reflexive solution pair to a given matrix pair $[\bar Y, \bar Z]$ in the reflexive solution pair set of the generalized coupled Sylvester matrix equations $(AY - ZB,\ CY - ZD) = (E, F)$. Moreover, several numerical examples are given to show the efficiency of the presented iterative algorithm.

© 2008 Elsevier Inc. All rights reserved.

Keywords: The generalized coupled Sylvester matrix equations; Generalized reflection matrix; Kronecker matrix product; Reflexive matrix; Optimal approximation reflexive solution pair

1. Introduction

We first give some notations which are used in this paper. The notation $\mathbb{R}^{m\times n}$ denotes the set of all $m \times n$ real matrices. The unit matrix of order $n$ is denoted by $I_n$. $1_n$ denotes the matrix of order $n$ all of whose elements are 1. We use $A^T$, $\operatorname{tr}(A)$ and $\mathcal{R}(A)$ to denote the transpose, the trace and the column space of the matrix $A$, respectively. For a matrix $A \in \mathbb{R}^{m\times n}$, the so-called stretching
function $\operatorname{vec}(A)$ is defined by the following:

$$\operatorname{vec}(A) = \left(a_1^T\ a_2^T\ \cdots\ a_n^T\right)^T,$$

where $a_k$ is the $k$th column of $A$. $A \otimes B$ stands for the Kronecker product of matrices $A = (a_{ij})_{m\times n}$ and $B$ which is defined as

$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{pmatrix}.$$

In addition, $\langle A, B\rangle = \operatorname{tr}(B^TA)$ is defined as the inner product of the two matrices, which generates the Frobenius norm, i.e. $\langle A, A\rangle = \|A\|^2$ [1,8,15].

An $n \times n$ real matrix $P$ is said to be a real generalized reflection matrix if $P^T = P$ and $P^2 = I_n$. An $n \times n$ real matrix $A$ is said to be a reflexive (anti-reflexive) matrix with respect to the generalized reflection matrix $P$ if $A = PAP$ ($A = -PAP$). $\mathbb{R}_r^{n\times n}(P)$ ($\mathbb{R}_a^{n\times n}(P)$) denotes the subspace of reflexive (anti-reflexive) matrices with respect to the $n \times n$ generalized reflection matrix $P$. The reflexive and anti-reflexive matrices with respect to a generalized reflection matrix $P$ have applications in system and control theory, in engineering, in scientific computations and various other fields [3–5].

In this paper we consider the reflexive solutions of the linear matrix equations

$$AY - ZB = E, \qquad CY - ZD = F, \tag{1}$$

where $A, B, C, D, E, F \in \mathbb{R}^{n\times n}$, that is, we will find $Y \in \mathbb{R}_r^{n\times n}(P)$ and $Z \in \mathbb{R}_r^{n\times n}(Q)$ which satisfy (1). Also we consider the reflexive solutions of the matrix pair nearness problem

$$\min_{[Y,Z]\in S_{YZ}} \left\{\|Y - \bar Y\|^2 + \|Z - \bar Z\|^2\right\}, \tag{2}$$

where $\bar Y \in \mathbb{R}_r^{n\times n}(P)$ and $\bar Z \in \mathbb{R}_r^{n\times n}(Q)$ are given reflexive matrices, and $S_{YZ}$ is the reflexive solution pair set of the generalized coupled Sylvester matrix equations (1).

0096-3003/$ - see front matter. doi:10.1016/j.amc.2008.02.035. Applied Mathematics and Computation 202 (2008) 571–588.
* Corresponding author. E-mail addresses: mdehghan@aut.ac.ir (M. Dehghan), mhajarian@aut.ac.ir, masoudhajarian@ (M. Hajarian).

A large number of papers have been written for solving matrix equations [17,19,24,27,30–33]. Chu [6] studied the linear matrix equation

$$AXB = C, \tag{3}$$

with
an unknown symmetric matrix $X$. Peng and Hu in [26] established the necessary and sufficient conditions for the existence of a solution, and the expressions for the solutions that are reflexive and anti-reflexive with respect to a generalized reflection matrix $P$, of the matrix equation

$$AX = B.$$

In [7], the existence of a solution of the matrix equation (3) that is reflexive with respect to the generalized reflection matrix $P$ is presented. By extending the well-known Jacobi and Gauss–Seidel iterations for $Ax = b$, Ding et al. in [14] derived iterative solutions of matrix equations $AXB = F$ and generalized Sylvester matrix equations $AXB + CXD = F$. Navarra et al. [25] studied a representation of the general common solution $X$ to the matrix equations

$$A_1XB_1 = C_1, \qquad A_2XB_2 = C_2. \tag{4}$$

Peng et al. [29] presented an algorithm which is constructed to find the solution, reflexive with respect to the generalized reflection matrix $P$, of the minimum Frobenius norm residual problem

$$\left\|\begin{pmatrix} A_1XB_1 \\ A_2XB_2 \end{pmatrix} - \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}\right\| = \min.$$

In [28] an iterative algorithm is reported to solve the matrix equation

$$AXB + CYD = E.$$

The Sylvester matrix equations are closely related to many problems in the linear control theory of descriptor systems, and they have important applications in stability analysis, in observer design, in output regulation with internal stability, and in eigenvalue assignment. A large number of papers have presented methods to solve these matrix equations [2,16,20–23]. The generalized coupled Sylvester matrix equations (1) are an active research topic among the Sylvester matrix equations, and have been widely applied in various areas. In [23] Kågström and Poromaa introduced LAPACK-style error bounds for the generalized coupled Sylvester matrix equations, and presented their software that implements algorithms for solving this matrix equation.

In [9,10,13,14], to solve (coupled) matrix equations, the iterative methods are given which are based on the
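hierarchical identification principle [11,12]. The gradient-based iterative (GI) algorithms [9,14] and least squares based iterative algorithm [10] for solving (coupled) matrix equations are innovational and computationally efficient numerical algorithms and were presented based on the hierarchical identification principle [11,12] which regards the unknown matrix as the system parameter matrix to be identified. Also Ding and Chen [13], applying the gradient search principle and the hierarchical identification principle, presented the gradient-based iterative algorithms for generalized Sylvester equation and general coupled matrix equations.

This paper is organized as follows: In Section 2, we propose an iterative algorithm and its properties to obtain the reflexive solutions of the generalized coupled Sylvester matrix equations (1). When the matrix equations (1) are consistent over reflexive matrices, we show using the introduced iterative algorithm that, for any (special) initial matrix pair $[Y_1, Z_1]$, a reflexive solution pair (the minimal Frobenius norm reflexive solution pair) can be obtained within finite steps. Also the optimal approximation reflexive solution to a given matrix pair can be derived by finding the least norm reflexive solution of new matrix equations $(A\tilde Y - \tilde ZB,\ C\tilde Y - \tilde ZD) = (\tilde E, \tilde F)$. Several numerical examples are given in Section 3 to illustrate the application of the new iterative algorithm.

2. Iterative algorithm to solve (1) and (2)

In this section, we first introduce an iterative algorithm, then we propose some properties of this iterative algorithm which are essential tools for finding the reflexive solution of matrix equations (1).

Algorithm 1

Step 1. Input matrices $A, B, C, D, E, F \in \mathbb{R}^{n\times n}$;

Step 2. Choose arbitrary $Y_1 \in \mathbb{R}_r^{n\times n}(P)$, $Z_1 \in \mathbb{R}_r^{n\times n}(Q)$ where $P$ and $Q$ are two $n \times n$ arbitrary generalized reflection matrices;

Step 3. Calculate

$$
\begin{aligned}
R_1 &= \begin{pmatrix} E - AY_1 + Z_1B & 0 \\ 0 & F - CY_1 + Z_1D \end{pmatrix};\\
U_1 &= \tfrac{1}{2}\left[A^T(E - AY_1 + Z_1B) + C^T(F - CY_1 + Z_1D) + PA^T(E - AY_1 + Z_1B)P + PC^T(F - CY_1 + Z_1D)P\right];\\
V_1 &= \tfrac{1}{2}\left[-(E - AY_1 + Z_1B)B^T - (F - CY_1 + Z_1D)D^T - Q(E - AY_1 + Z_1B)B^TQ - Q(F - CY_1 + Z_1D)D^TQ\right];\\
k &:= 1;
\end{aligned}
$$

Step 4. If $R_k = 0$, then stop; else go to step 5;

Step 5. Calculate

$$
\begin{aligned}
Y_{k+1} &= Y_k + \frac{\|R_k\|^2}{\|U_k\|^2 + \|V_k\|^2}\,U_k;\\
Z_{k+1} &= Z_k + \frac{\|R_k\|^2}{\|U_k\|^2 + \|V_k\|^2}\,V_k;\\
R_{k+1} &= \begin{pmatrix} E - AY_{k+1} + Z_{k+1}B & 0 \\ 0 & F - CY_{k+1} + Z_{k+1}D \end{pmatrix}
= R_k - \frac{\|R_k\|^2}{\|U_k\|^2 + \|V_k\|^2}\begin{pmatrix} AU_k - V_kB & 0 \\ 0 & CU_k - V_kD \end{pmatrix};\\
U_{k+1} &= \tfrac{1}{2}\left[A^T(E - AY_{k+1} + Z_{k+1}B) + C^T(F - CY_{k+1} + Z_{k+1}D) + PA^T(E - AY_{k+1} + Z_{k+1}B)P + PC^T(F - CY_{k+1} + Z_{k+1}D)P\right] + \frac{\|R_{k+1}\|^2}{\|R_k\|^2}\,U_k;\\
V_{k+1} &= \tfrac{1}{2}\left[-(E - AY_{k+1} + Z_{k+1}B)B^T - (F - CY_{k+1} + Z_{k+1}D)D^T - Q(E - AY_{k+1} + Z_{k+1}B)B^TQ - Q(F - CY_{k+1} + Z_{k+1}D)D^TQ\right] + \frac{\|R_{k+1}\|^2}{\|R_k\|^2}\,V_k;
\end{aligned}
$$

Step 6. If $R_{k+1} = 0$, then stop; else, let $k := k + 1$, go to step 5.

From the above algorithm, we can easily see that $Y_k, U_k \in \mathbb{R}_r^{n\times n}(P)$ and $Z_k, V_k \in \mathbb{R}_r^{n\times n}(Q)$. Now we introduce some properties of the above algorithm.

Lemma 1. Assume that the sequences $R_i$, $U_i$ and $V_i$ ($i = 1, 2, \ldots, s$, $R_i \neq 0$) are generated by Algorithm 1, then we have

$$\operatorname{tr}(R_j^TR_i) = 0 \quad\text{and}\quad \operatorname{tr}(U_j^TU_i) + \operatorname{tr}(V_j^TV_i) = 0, \qquad i, j = 1, 2, \ldots, s,\ i \neq j. \tag{5}$$

Proof. It is obvious that $\operatorname{tr}(R_j^TR_i) = \operatorname{tr}(R_i^TR_j)$, $\operatorname{tr}(U_j^TU_i) = \operatorname{tr}(U_i^TU_j)$ and $\operatorname{tr}(V_j^TV_i) = \operatorname{tr}(V_i^TV_j)$, hence we need only to show that

$$\operatorname{tr}(R_j^TR_i) = 0 \quad\text{and}\quad \operatorname{tr}(U_j^TU_i) + \operatorname{tr}(V_j^TV_i) = 0 \quad\text{for } 1 \le i < j \le s. \tag{6}$$

We use induction to prove (6), and we do it in two steps.

Step 1. We first show

$$\operatorname{tr}(R_{i+1}^TR_i) = 0 \quad\text{and}\quad \operatorname{tr}(U_{i+1}^TU_i) + \operatorname{tr}(V_{i+1}^TV_i) = 0, \qquad i = 1, 2, \ldots, s. \tag{7}$$

We also prove (7) by induction. Because all matrices in Algorithm 1 are real, for $i = 1$ we can write

$$
\begin{aligned}
\operatorname{tr}(R_2^TR_1)
&= \operatorname{tr}\!\left(\left[R_1 - \frac{\|R_1\|^2}{\|U_1\|^2 + \|V_1\|^2}\begin{pmatrix} AU_1 - V_1B & 0 \\ 0 & CU_1 - V_1D \end{pmatrix}\right]^T R_1\right)\\
&= \|R_1\|^2 - \frac{\|R_1\|^2}{\|U_1\|^2 + \|V_1\|^2}\operatorname{tr}\!\left(\begin{pmatrix} AU_1 - V_1B & 0 \\ 0 & CU_1 - V_1D \end{pmatrix}^T \begin{pmatrix} E - AY_1 + Z_1B & 0 \\ 0 & F - CY_1 + Z_1D \end{pmatrix}\right)\\
&= \|R_1\|^2 - \frac{\|R_1\|^2}{\|U_1\|^2 + \|V_1\|^2}\operatorname{tr}\!\left((AU_1 - V_1B)^T(E - AY_1 + Z_1B) + (CU_1 - V_1D)^T(F - CY_1 + Z_1D)\right)\\
&= \|R_1\|^2 - \frac{\|R_1\|^2}{\|U_1\|^2 + \|V_1\|^2}\operatorname{tr}\!\left(U_1^TA^T(E - AY_1 + Z_1B) + U_1^TC^T(F - CY_1 + Z_1D) - B^TV_1^T(E - AY_1 + Z_1B) - D^TV_1^T(F - CY_1 + Z_1D)\right)\\
&= \|R_1\|^2 - \frac{\|R_1\|^2}{\|U_1\|^2 + \|V_1\|^2}\operatorname{tr}\!\Bigg(U_1^T\bigg[\frac{A^T(E - AY_1 + Z_1B) + C^T(F - CY_1 + Z_1D)}{2} + \frac{A^T(E - AY_1 + Z_1B) + C^T(F - CY_1 + Z_1D)}{2}\\
&\qquad\qquad + \frac{PA^T(E - AY_1 + Z_1B)P + PC^T(F - CY_1 + Z_1D)P}{2} - \frac{PA^T(E - AY_1 + Z_1B)P + PC^T(F - CY_1 + Z_1D)P}{2}\bigg]\\
&\qquad + V_1^T\bigg[\frac{-(E - AY_1 + Z_1B)B^T - (F - CY_1 + Z_1D)D^T}{2} + \frac{-(E - AY_1 + Z_1B)B^T - (F - CY_1 + Z_1D)D^T}{2}\\
&\qquad\qquad + \frac{-Q(E - AY_1 + Z_1B)B^TQ - Q(F - CY_1 + Z_1D)D^TQ}{2} - \frac{-Q(E - AY_1 + Z_1B)B^TQ - Q(F - CY_1 + Z_1D)D^TQ}{2}\bigg]\Bigg)\\
&= \|R_1\|^2 - \frac{\|R_1\|^2}{\|U_1\|^2 + \|V_1\|^2}\operatorname{tr}\!\Bigg(U_1^T\bigg[\frac{A^T(E - AY_1 + Z_1B) + C^T(F - CY_1 + Z_1D)}{2} + \frac{PA^T(E - AY_1 + Z_1B)P + PC^T(F - CY_1 + Z_1D)P}{2}\bigg]\\
&\qquad + V_1^T\bigg[\frac{-(E - AY_1 + Z_1B)B^T - (F - CY_1 + Z_1D)D^T}{2} + \frac{-Q(E - AY_1 + Z_1B)B^TQ - Q(F - CY_1 + Z_1D)D^TQ}{2}\bigg]\Bigg)\\
&= \|R_1\|^2 - \frac{\|R_1\|^2}{\|U_1\|^2 + \|V_1\|^2}\operatorname{tr}\!\left(U_1^TU_1 + V_1^TV_1\right) = 0.
\end{aligned}
\tag{8}
$$

Also we have

$$
\begin{aligned}
&\operatorname{tr}(U_2^TU_1) + \operatorname{tr}(V_2^TV_1)\\
&= \operatorname{tr}\!\Bigg(\bigg[\frac{A^T(E - AY_2 + Z_2B) + C^T(F - CY_2 + Z_2D)}{2} + \frac{PA^T(E - AY_2 + Z_2B)P + PC^T(F - CY_2 + Z_2D)P}{2} + \frac{\|R_2\|^2}{\|R_1\|^2}U_1\bigg]^T U_1\Bigg)\\
&\quad + \operatorname{tr}\!\Bigg(\bigg[\frac{-(E - AY_2 + Z_2B)B^T - (F - CY_2 + Z_2D)D^T}{2} + \frac{-Q(E - AY_2 + Z_2B)B^TQ - Q(F - CY_2 + Z_2D)D^TQ}{2} + \frac{\|R_2\|^2}{\|R_1\|^2}V_1\bigg]^T V_1\Bigg)\\
&= \operatorname{tr}\!\Bigg(\bigg[\frac{A^T(E - AY_2 + Z_2B) + C^T(F - CY_2 + Z_2D)}{2} + \frac{A^T(E - AY_2 + Z_2B) + C^T(F - CY_2 + Z_2D)}{2}\\
&\qquad - \frac{PA^T(E - AY_2 + Z_2B)P + PC^T(F - CY_2 + Z_2D)P}{2} + \frac{PA^T(E - AY_2 + Z_2B)P + PC^T(F - CY_2 + Z_2D)P}{2} + \frac{\|R_2\|^2}{\|R_1\|^2}U_1\bigg]^T U_1\Bigg)\\
&\quad + \operatorname{tr}\!\Bigg(\bigg[\frac{-(E - AY_2 + Z_2B)B^T - (F - CY_2 + Z_2D)D^T}{2} + \frac{-(E - AY_2 + Z_2B)B^T - (F - CY_2 + Z_2D)D^T}{2}\\
&\qquad - \frac{-Q(E - AY_2 + Z_2B)B^TQ - Q(F - CY_2 + Z_2D)D^TQ}{2} + \frac{-Q(E - AY_2 + Z_2B)B^TQ - Q(F - CY_2 + Z_2D)D^TQ}{2} + \frac{\|R_2\|^2}{\|R_1\|^2}V_1\bigg]^T V_1\Bigg)\\
&= \operatorname{tr}\!\left(U_1^T\left[A^T(E - AY_2 + Z_2B) + C^T(F - CY_2 + Z_2D)\right] + V_1^T\left[-(E - AY_2 + Z_2B)B^T - (F - CY_2 + Z_2D)D^T\right]\right) + \frac{\|R_2\|^2}{\|R_1\|^2}\left(\|V_1\|^2 + \|U_1\|^2\right)\\
&= \operatorname{tr}\!\left((E - AY_2 + Z_2B)^TAU_1 + (F - CY_2 + Z_2D)^TCU_1 - (E - AY_2 + Z_2B)^TV_1B - (F - CY_2 + Z_2D)^TV_1D\right) + \frac{\|R_2\|^2}{\|R_1\|^2}\left(\|V_1\|^2 + \|U_1\|^2\right)\\
&= \operatorname{tr}\!\left(\begin{pmatrix} E - AY_2 + Z_2B & 0 \\ 0 & F - CY_2 + Z_2D \end{pmatrix}^T \begin{pmatrix} AU_1 - V_1B & 0 \\ 0 & CU_1 - V_1D \end{pmatrix}\right) + \frac{\|R_2\|^2}{\|R_1\|^2}\left(\|V_1\|^2 + \|U_1\|^2\right)\\
&= \frac{\|U_1\|^2 + \|V_1\|^2}{\|R_1\|^2}\operatorname{tr}\!\left(R_2^T(R_1 - R_2)\right) + \frac{\|R_2\|^2}{\|R_1\|^2}\left(\|V_1\|^2 + \|U_1\|^2\right) = 0.
\end{aligned}
\tag{9}
$$

Assume that (7) holds for $i = d - 1$. Now let $i = d$. Similar to the proofs of (8) and (9), we can obtain

$$
\begin{aligned}
\operatorname{tr}(R_{d+1}^TR_d)
&= \|R_d\|^2 - \frac{\|R_d\|^2}{\|U_d\|^2 + \|V_d\|^2}\operatorname{tr}\!\left(\begin{pmatrix} AU_d - V_dB & 0 \\ 0 & CU_d - V_dD \end{pmatrix}^T \begin{pmatrix} E - AY_d + Z_dB & 0 \\ 0 & F - CY_d + Z_dD \end{pmatrix}\right)\\
&= \|R_d\|^2 - \frac{\|R_d\|^2}{\|U_d\|^2 + \|V_d\|^2}\operatorname{tr}\!\left(U_d^TA^T(E - AY_d + Z_dB) + U_d^TC^T(F - CY_d + Z_dD) - B^TV_d^T(E - AY_d + Z_dB) - D^TV_d^T(F - CY_d + Z_dD)\right)\\
&= \|R_d\|^2 - \frac{\|R_d\|^2}{\|U_d\|^2 + \|V_d\|^2}\operatorname{tr}\!\Bigg(U_d^T\bigg[\frac{A^T(E - AY_d + Z_dB) + C^T(F - CY_d + Z_dD)}{2} + \frac{PA^T(E - AY_d + Z_dB)P + PC^T(F - CY_d + Z_dD)P}{2}\bigg]\\
&\qquad + V_d^T\bigg[\frac{-(E - AY_d + Z_dB)B^T - (F - CY_d + Z_dD)D^T}{2} + \frac{-Q(E - AY_d + Z_dB)B^TQ - Q(F - CY_d + Z_dD)D^TQ}{2}\bigg]\Bigg)\\
&= \|R_d\|^2 - \frac{\|R_d\|^2}{\|U_d\|^2 + \|V_d\|^2}\operatorname{tr}\!\left(U_d^T\left(U_d - \frac{\|R_d\|^2}{\|R_{d-1}\|^2}U_{d-1}\right) + V_d^T\left(V_d - \frac{\|R_d\|^2}{\|R_{d-1}\|^2}V_{d-1}\right)\right)\\
&= \|R_d\|^2 - \frac{\|R_d\|^2}{\|U_d\|^2 + \|V_d\|^2}\left(\|U_d\|^2 + \|V_d\|^2\right) + \frac{\|R_d\|^4}{\left(\|U_d\|^2 + \|V_d\|^2\right)\|R_{d-1}\|^2}\left(\operatorname{tr}(U_d^TU_{d-1}) + \operatorname{tr}(V_d^TV_{d-1})\right) = 0.
\end{aligned}
\tag{10}
$$

And we have

$$
\begin{aligned}
&\operatorname{tr}(U_{d+1}^TU_d) + \operatorname{tr}(V_{d+1}^TV_d)\\
&= \operatorname{tr}\!\left(U_d^T\left[A^T(E - AY_{d+1} + Z_{d+1}B) + C^T(F - CY_{d+1} + Z_{d+1}D)\right] + V_d^T\left[-(E - AY_{d+1} + Z_{d+1}B)B^T - (F - CY_{d+1} + Z_{d+1}D)D^T\right]\right) + \frac{\|R_{d+1}\|^2}{\|R_d\|^2}\left(\|V_d\|^2 + \|U_d\|^2\right)\\
&= \operatorname{tr}\!\left((E - AY_{d+1} + Z_{d+1}B)^TAU_d + (F - CY_{d+1} + Z_{d+1}D)^TCU_d - (E - AY_{d+1} + Z_{d+1}B)^TV_dB - (F - CY_{d+1} + Z_{d+1}D)^TV_dD\right) + \frac{\|R_{d+1}\|^2}{\|R_d\|^2}\left(\|V_d\|^2 + \|U_d\|^2\right)\\
&= \operatorname{tr}\!\left(\begin{pmatrix} E - AY_{d+1} + Z_{d+1}B & 0 \\ 0 & F - CY_{d+1} + Z_{d+1}D \end{pmatrix}^T \begin{pmatrix} AU_d - V_dB & 0 \\ 0 & CU_d - V_dD \end{pmatrix}\right) + \frac{\|R_{d+1}\|^2}{\|R_d\|^2}\left(\|V_d\|^2 + \|U_d\|^2\right)\\
&= \frac{\|U_d\|^2 + \|V_d\|^2}{\|R_d\|^2}\operatorname{tr}\!\left(R_{d+1}^T(R_d - R_{d+1})\right) + \frac{\|R_{d+1}\|^2}{\|R_d\|^2}\left(\|V_d\|^2 + \|U_d\|^2\right) = 0.
\end{aligned}
\tag{11}
$$

Hence, (7) holds for $i = d$. Then, by (8)–(11), (7) holds by the principle of induction.

Step 2. In this step, we assume $\operatorname{tr}(R_{i+t}^TR_i) = 0$ and $\operatorname{tr}(U_{i+t}^TU_i) + \operatorname{tr}(V_{i+t}^TV_i) = 0$ for $1 \le i \le t$ and $1 < t < s$. Now we show $\operatorname{tr}(R_{i+t+1}^TR_i) = 0$ and $\operatorname{tr}(U_{i+t+1}^TU_i) + \operatorname{tr}(V_{i+t+1}^TV_i) = 0$. By using step 1 and similar to the proofs of (8)–(11), we can write tr R T i þt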
!þk R 2k 2k R 1kðk V 1k 2þk U 1k 2Þ¼k U 1k 2þk V 1k 2k R 1ktr ðR T 2ðR 1ÀR 2ÞÞþk R 2k 2k R 1kðk V 1k 2þk U 1k 2Þ¼0:ð9ÞAssume that (7)holds for i ¼d À1.Now let i ¼d .Similar to the proofs of (8)and (9),we can obtaintr R T d þ1R d ÀÁ¼k R d k 2Àk R d k 2k U d k þk V d ktr AU d ÀV d B 00CU d ÀV d D T ÂE ÀAY d þZ d B00F ÀCY d þZ d D¼k R d k 2Àk R d k 2k U d k 2þk V d k2tr U T d A T ðE ÀAY d þZ d B ÞþU T d C TðF ÀCY d þZ d D ÞÀÀB T V T d ðE ÀAY d þZ d B ÞÀD T V TdðF ÀCY d þZ d D ÞÁ¼k R d k 2Àk R d k 2k U d k 2þk V d k2tr U T d A T ðE ÀAY d þZ d B ÞþC TðF ÀCY d þZ d D Þ2 þPA T ðE ÀAY d þZ d B ÞP þPC T ðF ÀCY d þZ d D ÞP 2!þV T dÀðE ÀAY d þZ d B ÞB T ÀðF ÀCY d þZ d D ÞD T 2þÀQ ðE ÀAY d þZ d B ÞB TQ ÀQ ðF ÀCY d þZ d D ÞD T Q 2! ¼k R d k 2Àk R d k 2k U d k þk V d k tr U T d U d Àk R d k 2k R d À1k U d À1 !þV Td V d Àk R d k 2k R d À1kV d À1! !¼k R d k 2Àk R d k 2k U d k þk V d kk U d k 2þk V d k 2 þk R d k 4k U d k 2þk V d k 2 k R d À1k 2tr U T d U d À1ÀÁþtr V T d V d À1ÀÁ1A ¼0ð10ÞAnd we havetr U T d þ1U d ÀÁþtr V T d þ1V d ÀÁ¼tr U T dA T ðE ÀAY d þ1þZ d þ1B ÞþC TðF ÀCY d þ1þZ d þ1D ÞÂÃÀþV Td ÀðE ÀAY d þ1þZ d þ1B ÞB T ÀðF ÀCY d þ1þZ d þ1D ÞD TÂÃÁþk R d þ1k 2k R d k 2k V d k 2þk U d k 2¼tr E ÀAY d þ1þZ d þ1B ðÞT AU d þðF ÀCY d þ1þZ d þ1D ÞT CU dÀðE ÀAY d þ1þZ d þ1B ÞT V d B ÀðF ÀCY d þ1þZ d þ1D ÞTV d D576M.Dehghan,M.Hajarian /Applied Mathematics and Computation 202(2008)571–588þk R d þ1k2k R d k 2ðk V d k 2þk U d k 2Þ¼tr ðE ÀAY d þ1þZ d þ1B ÞT 0ðF ÀCY d þ1þZ d þ1D ÞT!ÂAU d ÀV d B00CU d ÀV d Dþk R d þ1k 2k R d k2ðk V d k 2þk U d k 2Þ¼k U d k 2þk V d k2k R d ktr ðR T d þ1ðR dÀR d þ1ÞÞþk R d þ1k 2k R d kðk V d k 2þk U d k 2Þ¼0:ð11ÞHence,(7)holds for i ¼d .Then since (8)–(11),(7)holds by principal of induction.Step 2.In this step,we assume tr R T i þt R i ÀÁ¼0,and tr ðU Ti þt U i Þþtr ðV T i þt V i Þ¼0for 16i 6t and 1<t <s .Now we show tr R T i þt þ1R i ÀÁ¼0,and tr U T i þt þ1U i ÀÁþtr ðV Ti þt þ1V i Þ¼0.By using step 1and similar to the proofs of (8)–(11),we can writetr R T i þt 
þ1R i ÀÁ¼tr R i þt Àk R i þt k 2k U i þt k þk V i þt kAU i þt ÀV i þt B 00CU i þt ÀV i þt D "#T R i 0@1A ¼tr R T i þt R i ÀÁÀk R i þt k 2k U i þt k 2þk V i þt k2tr AU i þt ÀV i þt B 00CU i þt ÀV i þt D T ÂE ÀAY i þZ i B 00F ÀCY i þZ i D¼Àk R i þt k 2k U i þt k þk V i þt ktr U T i þt A T ðE ÀAY i þZ i B ÞþU T i þt C TðF ÀCY i þZ i D ÞÀÀB T V T i þt ðE ÀAY i þZ i B ÞÀD T V Ti þtðF ÀCY i þZ i D ÞÁ¼Àk R i þt k 2k U i þt k 2þk V i þt k 2tr U Ti þt A T ðE ÀAY i þZ i B ÞþC T ðF ÀCY i þZ i D Þ2 þA T ðE ÀAY i þZ iB ÞþC T ðF ÀCY i þZ iD Þ2þPA T ðE ÀAY i þZ i B ÞP þPC T ðF ÀCY i þZ i D ÞP 2ÀPA T ðE ÀAY i þZ i B ÞP þPC T ðF ÀCY i þZ i D ÞP 2!þV Ti þtÀðE ÀAY i þZ i B ÞB T ÀðF ÀCY i þZ i D ÞD T 2þÀðE ÀAY i þZ i B ÞB T ÀðF ÀCY i þZ i D ÞD T 2þÀQ ðE ÀAY i þZ i B ÞB T Q ÀQ ðF ÀCY i þZ i D ÞD T Q 2ÀÀQ ðE ÀAY i þZ i B ÞB T Q ÀQ ðF ÀCY i þZ i D ÞD T Q 2!¼Àk R i þt k 2k U i þt k 2þk V i þt k2tr U Ti þtA T ðE ÀAY i þZ iB ÞþC T ðF ÀCY i þZ iD Þ2 þPA T ðE ÀAY i þZ i B ÞP þPC T ðF ÀCY i þZ i D ÞP !M.Dehghan,M.Hajarian /Applied Mathematics and Computation 202(2008)571–588577þV Tiþt ÀðEÀAY iþZ i BÞB TÀðFÀCY iþZ i DÞD T2þÀQðEÀAY iþZ i BÞB T QÀQðFÀCY iþZ i DÞD T Q2!¼Àk R iþt k2k U iþt kþk V iþt ktr U TiþtU iÀk R i k2k R iÀ1kU iÀ1!þV TiþtV iÀk R i k2k R iÀ1kV iÀ1!!¼Àk R iþt k2k U iþt kþk V iþt ktr U TiþtU iÀÁþtr V TiþtV iÀÁÂÃþk R iþt k2k R i k2ðk U iþt kþk V iþt kÞk R iÀ1kþtr U Tiþt U iÀ1ÀÁþtr V Tiþt V iÀ1ÀÁÂü0:ð12ÞNothing that we have tr R Tiþtþ1R iÀÁ¼0,and tr R Tiþtþ1R iþ1ÀÁ¼0,hence we can obtaintr U Tiþtþ1U iÀÁþtr V Tiþtþ1V iÀÁ¼trA T EÀAY iþtþ1þZ iþtþ1BðÞþC TðFÀCY iþtþ1þZ iþtþ1DÞ2þPA TðEÀAY iþtþ1þZ iþtþ1BÞPþPC TðFÀCY iþtþ1þZ iþtþ1DÞP2þk R iþtþ1k2k R iþt k2U iþt#TU i!þtr ÀðEÀAY iþtþ1þZ iþtþ1BÞB TÀðFÀCY iþtþ1þZ iþtþ1DÞD T2þÀQðEÀAY iþtþ1þZ iþtþ1BÞB T QÀQðFÀCY iþtþ1þZ iþtþ1DÞD T Q2þk R iþtþ1k2k R iþt kV iþt#TV i1A¼tr U TiA T EÀAY iþtþ1þZ iþtþ1BðÞþC TðFÀCY iþtþ1þZ iþtþ1DÞÂÃþV TiÀðEÀAY iþtþ1þZ iþtþ1BÞB TÂÀÀðFÀCY iþtþ1þZ iþtþ1DÞD T ÃÁþk R iþtþ1k2k R iþt k2tr U TiþtU iÀÁþtr V TiþtV iÀÁÀÁ¼trðEÀAY 
iþtþ1þZ iþtþ1BÞT AU iþðFÀCY iþtþ1þZ iþtþ1DÞT CU iÀðEÀAY iþtþ1þZ iþtþ1BÞT V i BÀðFÀCY iþtþ1þZ iþtþ1DÞT V i Dþk R iþtþ1k2k R iþt k2tr U TiþtU iÀÁþtr V TiþtV iÀÁÀÁ¼tr ðEÀAY iþtþ1þZ iþtþ1BÞT00ðFÀCY iþtþ1þZ iþtþ1DÞT!AU iÀV i B00CU iÀV i D!!þk R iþtþ1k2k R iþt k2tr U TiþtU iÀÁþtr V TiþtV iÀÁÀÁ¼k U i k2þk V i k2k R i k2tr R Tiþtþ1R iÀR iþ1ðÞÀÁþk R iþtþ1k2k R iþt k2tr U TiþtU iÀÁþtr V TiþtV iÀÁÀÁ¼0:ð13ÞBy steps1and2,the conclusion(5)holds by the principal of induction.hLemma2.Suppose that the matrix equations(1)are consistent over reflexive matrices,and½YÃ;Zà is an arbi-trary reflexive solution pair of the matrix equations(1).Then,for any initial reflexive matrix pair½Y1;Z1 trððYÃÀY iÞT U iþðZÃÀZ iÞT V iÞ¼k R i k2ð14Þfor i¼1;2;...,where the sequences f Y i g,f Z i g,f U i g,f V i g and f R i g are generated by Algorithm1.578M.Dehghan,M.Hajarian/Applied Mathematics and Computation202(2008)571–588Proof.We prove the conclusion (14)by induction.If i ¼1,we havetr ðY ÃÀY 1ÞT U 1þðZ ÃÀZ 1ÞTV 1¼tr Y ÃÀY 1ðÞT A T ðE ÀAY 1þZ 1B ÞþC TðF ÀCY 1þZ 1D Þ2þPA T ðE ÀAY 1þZ 1B ÞP þPC T ðF ÀCY 1þZ 1D ÞP 2!þðZ ÃÀZ 1ÞT ÀðE ÀAY 1þZ 1B ÞB T ÀðF ÀCY 1þZ 1D ÞD T 2þÀQ ðE ÀAY 1þZ 1B ÞB T Q ÀQ ðF ÀCY 1þZ 1D ÞD T Q 2!¼tr ðY ÃÀY 1ÞT A T ðE ÀAY 1þZ 1B ÞþC TðF ÀCY 1þZ 1D Þ2þA T ðE ÀAY 1þZ 1B ÞþC T ðF ÀCY 1þZ 1D ÞÀPA T ðE ÀAY 1þZ 1B ÞP þPC T ðF ÀCY 1þZ 1D ÞP þPA T ðE ÀAY 1þZ 1B ÞP þPC TðF ÀCY 1þZ 1D ÞP 2!þðZ ÃÀZ 1ÞT ÀðE ÀAY 1þZ 1B ÞB T ÀðF ÀCY 1þZ 1D ÞD T 2 þÀðE ÀAY 1þZ 1B ÞB T ÀðF ÀCY 1þZ 1D ÞD T 2ÀÀQ ðE ÀAY 1þZ 1B ÞB T Q ÀQ ðF ÀCY 1þZ 1D ÞD T Q2þÀQ ðE ÀAY 1þZ 1B ÞB T Q ÀQ ðF ÀCY 1þZ 1D ÞD T Q2! 
¼tr Y ÃÀY 1ðÞT A T ðE ÀAY 1þZ 1B ÞþC T ðF ÀCY 1þZ 1D ÞÂÃþðZ ÃÀZ 1ÞT ÀðE ÀAY 1þZ 1B ÞB TÂÀðF ÀCY 1þZ 1D ÞD TÃÁ¼tr ðE ÀAY 1þZ 1B ÞT A ðY ÃÀY 1ÞþðF ÀCY 1þZ 1D ÞTC ðY ÃÀY 1ÞÀðE ÀAY 1þZ 1B ÞT ðZ ÃÀZ 1ÞB ÀðF ÀCY 1þZ 1D ÞT ðZ ÃÀZ 1ÞD¼tr ðE ÀAY 1þZ 1B ÞT0ðF ÀCY 1þZ 1D ÞT!ÂA ðY ÃÀY 1ÞÀðZ ÃÀZ 1ÞB00C ðY ÃÀY 1ÞÀðZ ÃÀZ 1ÞD0B @1C A 1C A¼trE ÀAY 1þZ 1B 00F ÀCY 1þZ 1D T E ÀAY 1þZ 1B 00F ÀCY 1þZ 1D!¼k R 1k 2:ð15ÞNow suppose the conclusion (14)holds for 16i 6d .Similar to the proof of (15),for i ¼d þ1we can obtaintr ðY ÃÀY d þ1ÞT U d þ1þðZ ÃÀZ d þ1ÞT V d þ1¼tr ðY ÃÀY d þ1ÞT A TðE ÀAY d þ1þZ d þ1B ÞþC T ðF ÀCY d þ1þZ d þ1D Þ2þPA T ðE ÀAY d þ1þZ d þ1B ÞP þPC TðF ÀCY d þ1þZ d þ1D ÞP 2þk R d þ1k2k R d k 2U d#M.Dehghan,M.Hajarian /Applied Mathematics and Computation 202(2008)571–588579。
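As a rough illustration of Algorithm 1, the following pure-Python sketch implements the update formulas for $Y_k$, $Z_k$, $R_k$, $U_k$ and $V_k$ as reconstructed above. Everything beyond those formulas is an assumption made for the example: the helper names (`matmul`, `lin`, `reflexive_part`, `solve_reflexive`), the stopping tolerance `tol` (the algorithm itself tests $R_k = 0$ exactly), the iteration cap, and the small $2 \times 2$ test problem with $P = Q = \operatorname{diag}(1, -1)$.

```python
def matmul(X, Y):
    """Plain-Python matrix product for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def lin(a, X, b, Y):
    """Entrywise a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sqnorm(X):
    """Squared Frobenius norm."""
    return sum(x * x for row in X for x in row)

def reflexive_part(M, P):
    """(M + P M P) / 2, the symmetrisation used in U_k and V_k."""
    return lin(0.5, M, 0.5, matmul(matmul(P, M), P))

def solve_reflexive(A, B, C, D, E, F, P, Q, Y, Z, tol=1e-10, max_iter=200):
    """Sketch of Algorithm 1 for AY - ZB = E, CY - ZD = F (tol is an assumption)."""
    def residuals(Y, Z):
        # Diagonal blocks of R_k: E - AY + ZB and F - CY + ZD.
        R1 = lin(1.0, lin(1.0, E, -1.0, matmul(A, Y)), 1.0, matmul(Z, B))
        R2 = lin(1.0, lin(1.0, F, -1.0, matmul(C, Y)), 1.0, matmul(Z, D))
        return R1, R2

    def gradients(R1, R2):
        # Bracketed terms of U_k and V_k (without the ||R_{k+1}||^2/||R_k||^2 part).
        GU = lin(1.0, matmul(transpose(A), R1), 1.0, matmul(transpose(C), R2))
        GV = lin(-1.0, matmul(R1, transpose(B)), -1.0, matmul(R2, transpose(D)))
        return reflexive_part(GU, P), reflexive_part(GV, Q)

    R1, R2 = residuals(Y, Z)
    U, V = gradients(R1, R2)
    r2 = sqnorm(R1) + sqnorm(R2)          # ||R_k||^2 of the block-diagonal residual
    for _ in range(max_iter):
        d2 = sqnorm(U) + sqnorm(V)        # ||U_k||^2 + ||V_k||^2
        if r2 ** 0.5 <= tol or d2 == 0.0:
            break
        alpha = r2 / d2
        Y = lin(1.0, Y, alpha, U)
        Z = lin(1.0, Z, alpha, V)
        R1, R2 = residuals(Y, Z)
        r2_new = sqnorm(R1) + sqnorm(R2)
        GU, GV = gradients(R1, R2)
        U = lin(1.0, GU, r2_new / r2, U)
        V = lin(1.0, GV, r2_new / r2, V)
        r2 = r2_new
    return Y, Z, r2 ** 0.5

# Hypothetical consistent test problem: with P = Q = diag(1, -1) the
# reflexive matrices are exactly the diagonal ones.
A = [[2.0, 1.0], [0.0, 1.0]]
B = [[1.0, 0.0], [1.0, 1.0]]
C = [[1.0, 0.0], [1.0, 2.0]]
D = [[0.0, 1.0], [1.0, 0.0]]
P = Q = [[1.0, 0.0], [0.0, -1.0]]
Y_true = [[1.0, 0.0], [0.0, 2.0]]
Z_true = [[3.0, 0.0], [0.0, -1.0]]
E = lin(1.0, matmul(A, Y_true), -1.0, matmul(Z_true, B))
F = lin(1.0, matmul(C, Y_true), -1.0, matmul(Z_true, D))
zero = [[0.0, 0.0], [0.0, 0.0]]
Y, Z, res = solve_reflexive(A, B, C, D, E, F, P, Q, zero, zero)
print(res)
```

In floating-point arithmetic the exact test $R_k = 0$ is rarely met, which is why the sketch stops on a small residual norm instead; the finite-step property itself is a statement about exact arithmetic.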
A Review on Multi-Label Learning Algorithms

Index Terms: Machine learning, multi-label learning, evaluation metrics, label correlations, problem transformation, algorithm adaptation.
I. INTRODUCTION

Traditional supervised learning is one of the most-studied machine learning paradigms, where each real-world object (example) is represented by a single instance (feature vector) and associated with a single label. Formally, let X denote the instance space and Y denote the label space; the task of traditional supervised learning is to learn a function f : X → Y from the training set {(x_i, y_i) | 1 ≤ i ≤ m}. Here, x_i ∈ X is an instance characterizing
Min-Ling Zhang is with the School of Computer Science and Engineering, and the MOE Key Laboratory of Computer Network and Information Integration, Southeast University, Nanjing 210096, China. Email: zhangml@. Zhi-Hua Zhou is with the National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China. Email: zhouzh@. (Corresponding author)
Generalized Quantifiers in Declarative and Interrogative Sentences

Few referees read every short abstract
Few referees read every excellent short abstract

(a) [Three > Few]    (b) [Few > Three]
Three good referees read few abstracts
Three good Dutch referees read few abstracts

where (a) would be logically correct in the interpretation [Three > Few] and (b) in the wide scope reading [Few > Three]. However, while (a) is a correct natural reasoning inference, (b) is not. This difference at the reasoning level is a side effect of some different properties of the quantifier phrases few referees and every abstract [3]. The example shows that natural language structures contribute to natural reasoning and illustrates the need to account for this information when aiming to model natural reasoning. Moreover, it sheds light on the importance of some differences holding among items of the same semantic type, e.g. within the class of quantifiers, which are irrelevant for the meaning assembly but affect the form composition. In this paper, we focus on this preliminary task to be carried out by a formal system employed to account for natural reasoning inferences.

Lambek calculi [15] are well known for being able to properly account for the natural language syntax-semantics interface by means of the Curry-Howard correspondence between proofs and lambda terms [11,4]. Thanks to this relation, proofs of the grammaticality of a string correspond to lambda terms. The form/meaning assembly is carried out in parallel by means of function application and abstraction. However, the example above shows that expressions with the same meaning can have different syntactic distribution. This difference between the syntactic and semantic levels cannot be expressed by a system in a one-to-one correspondence with the lambda calculus. Syntactic types must encode some features which are not visible in the semantic types. In [12], Kurtonina and Moortgat extended the logical language of the Lambek calculi with unary operators, obtaining Multimodal Categorial Logics (MMCL). In this paper, we show how the latter have the right expressivity to encode fine-grained distinctions among expressions of the same semantic type.

The paper is divided into two main parts. In Section 2 and Section 3 we give a brief presentation of the linguistic data concerning scope ambiguity phenomena and we introduce MMCL, showing how it can be used to account for these linguistic data. When building the lexicon in this part, we concentrate on the type language of the system, hiding the corresponding semantic representation since the constraints are purely syntactic. Finally, in Section 4 we show how the results at the syntactic level contribute to giving definitions for the semantic representations of polarity and constituent questions.

2 Scope Ambiguity

Quantifiers offer interesting challenges for the treatment of the syntax/semantics interface. First of all, they can take scope wider than where they occur overtly, as illustrated by the object wide scope reading assigned to Few referees read every short abstract. Moreover, quantifiers differ with respect to the ways of scope taking, as shown by the non-validity of the inference derived from the object wide scope reading of Three good referees read few abstracts.

For quite a long time, linguists have concentrated only on the first problem exhibited by GQs. In the generative tradition since the pioneering work of May [16], all GQs have been treated as having the same scope possibilities. We can refer to this approach as the Uniformity of Quantifier Scope Assignment. Beghelli and Stowell [3] present evidence against this approach and propose a move to a more flexible theory which explains how and why different types of GQs can have different scoping possibilities.
In [3] scope is seen as the by-product of agreement processes, and mismatches in agreement give rise to ungrammatical sentences. Beghelli and Stowell distinguish five classes of GQs. Membership in any of the GQ types is indicated by some syntactic properties which are morphologically encoded in the determiner position. They claim that for certain combinations of quantifier types the grammar simply excludes certain logically possible scope construals. We refer the reader interested in the linguistic details of the theory to [3]; we just summarise their data in Table 1.

Table 1 (only partly legible in the source)

Sentence                                      Scope
[…]                                           ?∀
(b) *How didn't every actor behave?
[…]                                           ¬∃
(b) Coppola didn't direct a movie.
[…]                                           ∃¬
3. (a) *Any actor didn't like Kubrick.
[…]                                           ∃¬ and ¬∃
(c) Some actor didn't like Kubrick.

3 Quantifier Scope in Multimodal Categorial Logic

[…] exhibiting different distributional behaviour. A detailed comparison of the categorial logic approach introduced here with the minimalist analysis proposed by Beghelli and Stowell is given in [5].

In our system, different scope possibilities of a sentence correspond to different proofs of the parsed string. Syntactic and semantic information is stored in the lexicon and propagated through the proof by means of logical rules. We will use these characteristics of the system to account for scope ambiguity phenomena, and the unary operators of MMCL to account for the different ways of scope taking identified by Beghelli and Stowell. In this section we first briefly present the system, and then we show how we can infer the linguistic data given in Table 1 from the rules of the system, starting from the lexical assignments.

Derivable objects in MMCL are of the form Γ : A, where Γ is a structure (typically a tree representation of a sentence) and A is a formula indicating the syntactic type of this structure.

Definition 3.1 (Formulas) Over a finite set of atomic formulas A, we define the set of formulas F as follows:

F ::= A | ◇F | □↓F | F/F | F•F | F\F

Definition 3.2 (Structures) Over a countably infinite set of structural variables V, we define the set of structure terms as follows:

S ::= V | ⟨S⟩ | (S◦S)

To make our proofs a bit more readable, we will typically use lexical words as structural variables. A structure term is then a tree of words.

The natural deduction calculus for MMCL tells us how to combine proofs from an initial set of lexical assignments to produce phrases of different types. See [21] for a more detailed explanation and many linguistic applications of the Fitch-style natural deduction calculus presented below.

In the logical rules below X, Y, Z range over structure terms, A, B, C range over formulas, and x, y are structural variables not occurring elsewhere in the proof. Finally, Z[X] denotes a structure term Z with a distinguished subterm occurrence X. Tables 2 and 3 list the logical rules of type-logical grammar. The [/E] rule tells us that whenever line n of our proof shows structure X to be of type A/B and line m of this same proof shows structure Y to be of type B, then the combined structure (X◦Y) is of type A. The [/I] rule indicates that, once we hypothesise a B formula with a fresh structural variable x, we can discharge this hypothesis once we derive a formula A with structure (X◦x), producing a formula A/B with structure X. Note that we mark the scope of a hypothesis with a vertical bar and that all hypotheses should be discharged at the end of the proof.

According to the Curry-Howard interpretation, natural deduction proofs correspond to typed lambda terms. For our current applications, we are only interested in the semantics of the implications. When we want to compute the meaning of a syntactic expression, it is often convenient to add semantic labels to the proof steps. Derivable objects are then of the form Γ : A − t, where Γ is a structure, A is a formula and t is the semantics of the expression. Table 4 lists the semantically annotated rules for the implications; the other connectives have their own Curry-Howard interpretations, but they are not relevant for our current applications.

Table 2. The logical rules of type-logical grammar

Lexicon
  n.    x : A               Lex

Hypothesis
  n.    x : B               Hyp

[/E]
  n.    X : A/B
  m.    Y : B
        (X◦Y) : A           /E(n, m)

[/I]
  n.    x : B               Hyp
  m.    (X◦x) : A
        X : A/B             /I(n, m)

[\E]
  n.    Y : B
  m.    X : B\A
        (Y◦X) : A           \E(n, m)

[\I]
  n.    x : B               Hyp
  m.    (x◦X) : A
        X : B\A             \I(n, m)

[•E]
  n.    X : A•B
  m.    x : A               Hyp
  m+1.  y : B               Hyp
  p.    Z[(x◦y)] : C
        Z[X] : C            •E(n, m, m+1, p)

[•I]
  n.    X : A
  m.    Y : B
        (X◦Y) : A•B         •I(n, m)

Table 3. The logical rules for the unary connectives

[◇E]
  n.    X : ◇A
  m.    x : A               Hyp
  p.    Z[⟨x⟩] : C
        Z[X] : C            ◇E(n, m, p)

[◇I]
  n.    X : A
        ⟨X⟩ : ◇A            ◇I(n)

[□↓E]
  n.    X : □↓A
        ⟨X⟩ : A             □↓E(n)

[□↓I]
  n.    ⟨X⟩ : A
        X : □↓A             □↓I(n)

Table 4. The logical rules for the implications with semantics

[/E]
  n.    X : A/B − t
  m.    Y : B − u
        (X◦Y) : A − (t u)   /E(n, m)

[/I]
  n.    x : B − x           Hyp
  m.    (X◦x) : A − t
        X : A/B − λx.t      /I(n, m)

[\E]
  n.    Y : B − u
  m.    X : B\A − t
        (Y◦X) : A − (t u)   \E(n, m)

[\I]
  n.    x : B − x           Hyp
  m.    (x◦X) : A − t
        X : B\A − λx.t      \I(n, m)

The semantically annotated [/E] rule now tells us that whenever we combine an A/B formula with semantics t with a B formula with semantics u, the resulting A formula has the term (t u) as its semantics. For the [/I] rule, the hypothesis B is initially assigned a fresh variable x as its semantics; then we continue our proof until we derive a formula A with some semantics t. We can now withdraw our B hypothesis by abstracting over its variable, producing λx.t as the semantics of the expression A/B. Note that without the structural labelling these are just the logical rules of implication in intuitionistic logic. Also note that on the semantic level we are not interested in the difference between the two implications: both have the same semantic content.

The reader may wonder about the complexity of this system and about how proof search would proceed. Natural deduction proofs have the pleasant property that for finding proofs of a given logical statement we only need to consider the subformulas of this logical statement, thereby bounding the search space for proof search. With respect to the complexity, de Groote [8] presents a polynomial algorithm for the system as described above, but adding more complex structural possibilities, as we need for our treatment of generalised quantifiers, will increase the complexity and can make the system PSPACE-complete in the worst case [20].

We illustrate how the logical system can be used to reason with linguistic signs by showing a simple proof consisting only of elimination rules.

Example 3.3 Given the lexicon below, we have specified that tarantino and pulp fiction are lexical expressions of type np, that oscar and movie are of type n, and that directed and won are of type (np\s)/np. The latter complex type means that the expression combines first with an np to the immediate right, then with an np to the immediate left, to produce an s, which is the atomic type assigned to sentences. We will discuss the correct type assignments to determiners like a and some after this example.

Lexicon
pulp fiction : np − pulp         tarantino : np − tarantino
oscar : n − oscar                won : (np\s)/np − λx.λy.((win x) y)
movie : n − movie                directed : (np\s)/np − λx.λy.((direct x) y)

Using this lexicon, together with the logical rules above, we can prove that tarantino directed pulp fiction is a well-formed expression of type s and that its meaning representation is (direct pulp) tarantino. For the sake of simplicity, we replace the structural formulas with their corresponding linguistic expressions.

1. tarantino : np − tarantino                                          Lex
2. directed : (np\s)/np − λx.λy.((direct x) y)                         Lex
3. pulp fiction : np − pulp                                            Lex
4. (directed◦pulp fiction) : np\s − λx.((direct pulp) x)               /E(2,3)
5. (tarantino◦(directed◦pulp fiction)) : s − (direct pulp) tarantino   \E(1,4)

The proof starts from the lexical assumptions. The elimination of the main connective of the complex type assigned to the transitive verb, namely /E, is applied to the premises 2 and 3, yielding a structure of type np\s. Similarly, the connective \ is eliminated, composing the final structure, which is proved to be of type s. Furthermore, the example shows that the elimination rules of \ and / correspond to functional application. Note that steps 4 and 5 hide the application of β-reduction.

According to the Montagovian tradition, GQs are denoted by functions which take scope at the sentence level [2]. A type suitable for a subject generalised quantifier would be s/(np\s), that is, a type which produces a sentence when it finds to its right a sentence missing an np to its left. Similarly, an object generalised quantifier would be assigned the type (s/np)\s. The reader might complain that there are no good motivations for such a duplicate type assignment. A solution to this problem is given in [17,18], where it is shown how MMCL can be extended in such a way that a single type assignment suffices for each quantifier, regardless of its position in the sentence. However, for the sake of simplicity, we will abstract over this issue and adopt the general notation (np→s)→s. Doing so, we can focus on the problem we are interested in, namely the different scope possibilities of quantifiers.

The use of a uniform logical type assignment (np→s)→s for all GQs could be seen as a deductive version of May's [16] Scope Uniformity thesis, and would fail to account for the different scope possibilities of GQs discussed in the previous section. In order to diversify the way GQs scope over sentences, we refine this type assignment further, distinguishing three different sentential levels, s1, s2 and s3, to which three logically related types are assigned. We consider the standard sentential type s to be the type of the medium sentential level s2, and we derive the other two types as shown below.

Suppose that we have a proof that a structure X is of type s; then we can prove it is of type □↓◇s as follows.

1. X : s
2. ⟨X⟩ : ◇s          ◇I(1)
3. X : □↓◇s          □↓I(2)

Similarly, we can prove that ◇□↓s derives (⇒) s, as follows.

1. X : ◇□↓s
2. x : □↓s           Hyp
3. ⟨x⟩ : s           □↓E(2)
4. X : s             ◇E(1,2,3)

The converse derivability relations do not hold. The reader can verify this by trying all possible proofs using only subformulas of the logical types. Summing up, the logical derivability relation connecting the three types is:

sentential levels:   s1     ⇒   s2   ⇒   s3
logical types:       ◇□↓s   ⇒   s    ⇒   □↓◇s

Moreover, to get the desired interaction between negation and generalised quantifiers, we follow Carpenter's raising strategy [6]. Instead of considering didn't as a standard verb phrase modifier vp/vp, where vp = np\s, we lift its goal formula to ((s/vp)\s)/vp, i.e. a type which takes an s incomplete for an np to produce an s incomplete for a GQ. We enrich this type with the information on the sentential levels while abstracting over the directionality of the implication operators, resulting in the following lexical entry.

didn't : (np→s2)→(((np→s2)→s2)→s2)

Example 3.4 Given the lexicon below

Lexicon
coppola : np
the godfather : np
oscar : n
movie : n
directed : (np\s1)/np
won : (np\s1)/np
any : ((np→s1)→s1)/n
a : ((np→s2)→s2)/n
some : ((np→s3)→s3)/n
didn't : (np→s2)→(((np→s2)→s2)→s2)

sentences 2-3 of Table 1 can be correctly parsed. We simplify the logical types using the sentential levels instead of their corresponding types, while we abbreviate subproofs of the relations between the sentence levels by (derived) rules we name s_{i,j}.
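As a rough illustration of how the level indices in this lexicon constrain scope, the sketch below encodes only two facts stated above: s_i derives s_j exactly when i ≤ j (s1 ⇒ s2 ⇒ s3, with no converse), and didn't both consumes and returns level s2. The function names and the reading labels `neg>exists` and `exists>neg` are hypothetical, and the check is a simplification of the full natural deduction proofs rather than a parser.

```python
# Sentential level k of each determiner typed ((np -> s_k) -> s_k)/n in the lexicon.
LEVEL = {"any": 1, "a": 2, "some": 3}
NEG_LEVEL = 2  # didn't : (np -> s2) -> (((np -> s2) -> s2) -> s2)

def derives(i, j):
    """s_i => s_j holds exactly when i <= j (s1 => s2 => s3, no converse)."""
    return i <= j

def readings(det):
    """Scope readings available for 'Coppola didn't direct <det> movie'."""
    k = LEVEL[det]
    available = []
    # Negation wide: the quantified clause (s_k) must be usable as the
    # s2 argument that didn't expects.
    if derives(k, NEG_LEVEL):
        available.append("neg>exists")
    # Quantifier wide: the negated clause (s2) must be liftable to the
    # level s_k that the quantifier takes scope over.
    if derives(NEG_LEVEL, k):
        available.append("exists>neg")
    return available

for det in ("any", "a", "some"):
    print(det, readings(det))
```

Under these assumptions, any only allows negation to take wide scope, some only allows the quantifier to take wide scope, and a allows both.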
We can see in Figure 1 that direct first combines with didn't and then the result combines with a movie. Besides the application of the elimination rules, which as seen before correspond to functional application, the proof contains applications of the introduction rules of the functional connectives. As marked by the labels of steps 2 and 6, these rules correspond to hypothetical reasoning. For instance, the hypothesis v of type np→s2 assumed in step 2 is discharged at step 4 by means of →I, lifting the noun phrase type of Coppola to the GQ type. This order of composition produces the reading where the existential quantifier has wide scope (∃¬): it takes the built structure (Coppola◦(didn't◦direct)) in its scope.

1.  Coppola : np                                                 Lex
2.  v : np→s2                                                    Hyp
3.  (Coppola◦v) : s2                                             →E(1,2)
4.  Coppola : (np→s2)→s2                                         →I(2,3)
5.  direct : (np\s1)/np                                          Lex
6.  y : np                                                       Hyp
7.  (direct◦y) : np\s1                                           /E(5,6)
8.  x : np                                                       Hyp
9.  (x◦(direct◦y)) : s1                                          \E(7,8)
10. (x◦(direct◦y)) : s2                                          s_{1,2}(9)
11. (direct◦y) : np→s2                                           →I(8,10)
12. didn't : (np→s2)→(((np→s2)→s2)→s2)                           Lex
13. (didn't◦(direct◦y)) : ((np→s2)→s2)→s2                        →E(11,12)
14. (Coppola◦(didn't◦(direct◦y))) : s2                           →E(4,13)
15. (Coppola◦(didn't◦direct)) : np→s2                            →I(6,14)
16. a : ((np→s2)→s2)/n                                           Lex
17. movie : n                                                    Lex
18. (a◦movie) : (np→s2)→s2                                       /E(16,17)
19. ((Coppola◦(didn't◦direct))◦(a◦movie)) : s2                   →E(15,18)
20. ((Coppola◦(didn't◦direct))◦(a◦movie)) : s3                   s_{2,3}(19)

Fig. 1. Coppola didn't direct a movie, ∃¬ reading.

In order to understand the way the different lexical type assignments of the GQs properly account for their different scope possibilities, attention has to be drawn to step 14 in the proof, where we have a structure of type s2. For the case at hand, with a movie in object position, the proof proceeds simply by means of the logical rules of the binary operators. On the other hand, if we consider the sentence 2c, where a is replaced by some, we first have to lift s2 to s3 and then proceed as before. Finally, since we cannot derive the type s1 from s2, the reading (∃¬) is disallowed in case a is replaced by any, as in 2a.

An alternative proof exists for sentence 2b, as shown in Figure 2, giving the second reading. Here, instead, direct combines first with a movie and then with didn't. In other words, in this reading the negation has wide scope (¬∃). Now, in order to get the required argument np→s2 at step 16, it is essential that we have an s1 or s2 result at step 15, which we can only have when the quantifier is any or a. The failure to derive s3 ⇒ s2 blocks this derivation for the quantifier some.

1.  Coppola : np                                                 Lex
2.  v : np→s2                                                    Hyp
3.  (Coppola◦v) : s2                                             →E(1,2)
4.  Coppola : (np→s2)→s2                                         →I(2,3)
5.  direct : (np\s1)/np                                          Lex
6.  x : np                                                       Hyp
7.  y : np                                                       Hyp
8.  (direct◦y) : np\s1                                           /E(5,7)
9.  (x◦(direct◦y)) : s1                                          \E(6,8)
10. (x◦(direct◦y)) : s2                                          s_{1,2}(9)
11. (x◦direct) : np→s2                                           →I(7,10)
12. a : ((np→s2)→s2)/n                                           Lex
13. movie : n                                                    Lex
14. (a◦movie) : (np→s2)→s2                                       /E(12,13)
15. ((x◦direct)◦(a◦movie)) : s2                                  →E(11,14)
16. (direct◦(a◦movie)) : np→s2                                   →I(6,15)
17. didn't : (np→s2)→(((np→s2)→s2)→s2)                           Lex
18. (didn't◦(direct◦(a◦movie))) : ((np→s2)→s2)→s2                →E(16,17)
19. (Coppola◦(didn't◦(direct◦(a◦movie)))) : s2                   →E(4,18)
20. (Coppola◦(didn't◦(direct◦(a◦movie)))) : s3                   s_{2,3}(19)

Fig. 2. Coppola didn't direct a movie, ¬∃ reading.

Before considering the interrogatives, we give some comments on how the small fragment we have given can be extended. For example, when a generalised quantifier like every actor combines with other GQs, we have to account for multiple readings. In other words, we need to assign to every a type which will allow for more scope possibilities than any of the assignments we have seen so far. A proper type for this kind of GQ is (np→s3)→s1, which will allow every to have both wide and narrow scope with respect to a second GQ. So, using heterogeneous combinations of sentential types, we can account for GQs with more complex behaviour than some and a.

Another interesting example is given by the negative polarity items (NPIs), such as any: items which are required to be in the scope of a negative operator [14]. The type assigned to any above will satisfy the requirement that, when the negation occurs, the only possible reading will be the one where the negation has wide scope. However, using this type, any can still occur in positive contexts, contrary to linguistic reality. We refer the reader to [5] for a solution to this problem and for the discussion of a larger English fragment with GQs.

4 Interrogatives

Now that the three types of GQs have been introduced and the criteria for differentiating them have been explained, we can discuss the last type of GQ we are interested in, namely wh-phrases. This moves us from a syntactic approach dealing with sentential inference to a more semantic one, which involves the discourse level. Therefore, instead of considering only the syntactic types of our lexical items, we will discuss their semantic representation as well.

In natural language we can distinguish two basic categories of questions: polarity questions (also known as yes/no questions) and constituent questions (also known as wh-questions). We will first discuss the first type, then we will present the second one.

In our framework we will consider a string of words which forms a question to be of a different category and to have a different meaning than a declarative sentence.
However, their type and denotation will be a logical consequence of the one attributed to sentences. The distinction of levels described above will therefore reappear. In particular, we will distinguish two levels of questions, q1 and q2, deriving their types from the ones assigned to sentences in s1 and s2. For reasons which will become clear later, we also need a third question level q3, derivable from the types of both q1 and q2. The logical relations among the types in these three question levels and in the sentential ones are as shown below. For the sake of simplicity we abbreviate the logical types using their corresponding sentential/question level:

s1 ⇒ s2 ⇒ s3
s1 ⇒ q1,   s2 ⇒ q2
q1 ⇒ q3,   q2 ⇒ q3

where the relation between types of different levels is the logical derivability relation discussed above, and the one between sentences and questions is the lifting theorem [19]. Hence, q1 stands for s1/(s1\s1), which in turn abbreviates ◇□↓s/(◇□↓s\◇□↓s), and q2 stands for s2/(s2\s2), viz. s/(s\s). Reading out these logical types, a yes/no question is seen as a function which takes a sentential modifier and yields a sentence.
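The semantic side of this lifting step can be sketched in a few lines, assuming the usual lambda term for lifting, λY.(Y p), for a sentence meaning p. The tuple encoding of propositions and the two sample modifiers (an identity modifier and a negating one) are illustrative assumptions, not part of the type-logical system itself.

```python
def lift(p):
    """Lift a sentence meaning p to a question meaning: lambda Y. Y(p)."""
    return lambda modifier: modifier(p)

# Two sample sentential modifiers, chosen for illustration only.
identity = lambda p: p            # leaves the proposition unchanged
negate = lambda p: ("not", p)     # wraps the proposition in a negation

# A proposition encoded as a nested tuple standing for a lambda term.
proposition = ("direct", "titanic", "tarantino")
question = lift(proposition)

print(question(identity))
print(question(negate))
```

Applying the lifted meaning to a modifier simply hands the underlying proposition to that modifier, which is the sense in which a question of type s/(s\s) takes a sentential modifier and yields a sentence.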
As might be clear,the two categories q1,q2are at the level of positive and negative yes/no questions,respectively.Their type arePositive:32↓s/(32↓s\32↓s)Negative:s/(s\s)A question to investigate further is whether the type obtained from lifting the type in s3,i.e.23s,can also play a role in a type logical approach to questions.In the logical,philosophical and linguistic literature several frameworks have been proposed for the meaning of questions[7].The logical type we have assigned to the yes/no questionsfinds its semantic motivation in the structured meaning approach (also called“categorial”),which traces back to Ajdukiewicz[1],as noticed in[10],and has been developed in[23,13].The basic idea which characterizes this approach is that:Question meanings are functions that,when applied to the meaning of the answer,yield a proposition.Yes/no questions expect answers like yes,no.In[13]it is suggested that no can be considered as a propositional operator that reverses the truth value,λp.¬p,and yes as a propositional operator that retains the truth value,i.e.the identity function:λp.p.Before going to discuss some examples,it is worth to notice the contribution the proof theoretical approach here assumed gives to the semantic investigation on ques-tions.The results we describe bring evidence to the correctness of the categorial approach and complete the framework with the syntatical conterpart.As pointed out in the beginning of the paper,at the heart of any categorial grammar analysis there is the Curry-Howard isomorphism between lambda terms and types.The former are used as semantic representation of the natural language expressions,and the latter as their sytactic type.Since the studies on questions are mostely related with their interpretation in this section we discuss the lambda term representation of the lexical items as well as their type assignments.We start from thefinal semantic representa-tion of yes-no questions and we then build the lexicon behind it.Let 
usfirst present the theory intuitively by means of an example with a polarity question.Example4.1Q Did Tarantino direct Titanic?λY.(Y((direct titanic)tarantino))A No.λp.¬pQ(A)By twice beta-reduction.¬((direct titanic)tarantino)Translating into the type language what we have treated so far,an auxiliary as did or didn’t will be a function which takes a sentence and yields a question.More specifically,did yields a question of thefirst level,whereas didn’t of the second one.4.INTERROGATIVES431 Now that the theory is clear,we can present the lambda terms formally.Wefirst show the desired lambda terms representing a wh-question,and then we give the lex-icon displaying both types and lambda terms for the items involved.Additionally,we introduce wh-phrases which give rise to the second type of questions when combined with thefirst one.As is explained in[24]wh-phrases differ in the way they behave with respect to negation.This fact can be easily accounted for in our framework,thanks to the distinction between the two levels q1,q2for positive and negative questions.Following the criterion given in[13]and quoted above,we consider wh-questions to be functions taking an answer to yield a sentence.The type of the question therefore depends on the type of its possible answer.Let us consider what as an example.Example4.2Q What did Cameron direct?λY.(Yλx.((direct x)cameron))A TitanicλP.P(titanic)Q(A)By twice beta-reduction.((direct titanic)cameron)Translating this into the type language,we have that a wh-question is a function which takes a GQ and yields a sentence.We abbreviate this category with wh, knowing that wh=s/GQ,and add the following lexical entries to our lexicon:LexiconTarantino:np−tarantinoCameron:np−cameronTitanic:np−titanicno:s2→s2−λp.¬pdirect:(np\s)/np−λx.λy.((direct x)y)did:GQ→((np→s1)→q1)−λP.λQ.λR.(R(P Q))didn’t:GQ→((np→s1)→q2)−λP.λQ.λR.(R¬(P Q))what:wh/(np→q?)−λZ.λP.(Pλx.((Z x)(λU.U)))Notice that the lexical type of the auxiliary selects for a generalized 
quantifier and a vp to produce a question. The GQ can be any of the three different types we treated before. Therefore, the type selected by the auxiliary has to be general enough to satisfy this request. In other words, GQ is such that GQ1 ⇒ GQ, GQ2 ⇒ GQ and GQ3 ⇒ GQ are all derivable. These logical properties are assured by GQ = (np → s1) → s3.

The type assigned to what simply means that a wh-phrase combined with a yes/no question missing an np results in a wh-question. In our language we have two different types of yes/no questions. Before discussing which of them is required by a wh-phrase, we show a derivation of a wh-question. Notice that the type assigned to wh-phrases can account for cases where the answer is a simple proper name (e.g. Titanic), a set of proper names (e.g. Terminator, Aliens, Titanic) or a GQ (e.g. Several famous movies).

Example 4.3 We give a proof of 'What did Cameron direct?' in Figure 3.

1. Cameron: np − cameron   Lex
2. (Cameron ◦ v): s1 − (V cameron)   →E(1,2)
4. z: np − z   Hyp
11. (direct ◦ z): np\s1 − λy.((direct z) y)   β(11)
13. − (λQ.λR.(R(Q cameron)) λy.((direct z) y))   →E(8,12)
14. − λR.(R((direct z) cameron))   β(13)
15. − λR.(R((direct z) cameron))   q1,3(14)
16. ((did ◦ Cameron) ◦ direct): np → q3 − λz.λR.(R((direct z) cameron))   →I(10,15)
17. what: wh/(np → q?) − λZ.λP.(P λx.((Z x)(λU.U)))   Lex
18. (what ◦ ((did ◦ Cameron) ◦ direct)): wh − (λZ.λP.(P λx.((Z x)(λU.U))) λz.λR.(R((direct z) cameron)))   /E(16,17)
19. (what ◦ ((did ◦ Cameron) ◦ direct)): wh − λP.(P λx.((λz.λR.(R((direct z) cameron))) x)(λU.U))   β(18)
20. (what ◦ ((did ◦ Cameron) ◦ direct)): wh − λP.(P λx.((λR.(R((direct x) cameron)))(λU.U))   β(19)
21. (what ◦ ((did ◦ Cameron) ◦ direct)): wh − λP.(P λx.((λU.U)((direct x) cameron)))   β(20)
22. (what ◦ ((did ◦ Cameron) ◦ direct)): wh − λP.(P λx.((direct x) cameron))   β(21)

Fig. 3. Derivation of what did Cameron direct

As shown in the proof, the type assigned to Cameron, viz. np, has to be lifted to that of GQ. Having chosen the type GQ derivable from GQ1,2,3 allows 'did' to be combined with either an arbitrary generalized quantifier or a
simple noun phrase.

We are now ready to answer the open question of the previous paragraph: what is the type of the question taken as an argument by a wh-phrase? In [24] it is shown that wh-phrases differ from each other in the way they behave with respect to negation. Szabolcsi and Zwarts give algebraic motivations for this linguistic phenomenon, which fits naturally into our framework. Having distinguished positive and negative questions enables us to deal with contrasting pairs like what and how presented in Section 2 and repeated here.
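The beta-reductions of Examples 4.1 and 4.2 can be replayed mechanically. The following Python sketch is only an illustration: the tiny fact base `FACTS` and all function names are assumptions introduced here, and Python evaluates the terms to truth values in that toy model rather than producing the reduced lambda terms themselves.

```python
# Toy model of the world: (director, film) pairs that hold.
# This data is an illustrative assumption, not part of the paper.
FACTS = {("cameron", "titanic")}


def direct(x):
    """Curried predicate mirroring ((direct x) y): true when y directed x."""
    return lambda y: (y, x) in FACTS


# Answers to yes/no questions: propositional operators.
yes = lambda p: p        # λp.p
no = lambda p: not p     # λp.¬p

# Example 4.1: "Did Tarantino direct Titanic?"
# λY.(Y((direct titanic) tarantino))
did_tarantino_direct_titanic = lambda Y: Y(direct("titanic")("tarantino"))

# Example 4.2: "What did Cameron direct?"
# λY.(Y λx.((direct x) cameron))
what_did_cameron_direct = lambda Y: Y(lambda x: direct(x)("cameron"))

# The answer "Titanic" lifted to a generalized quantifier: λP.P(titanic)
titanic_gq = lambda P: P("titanic")

# Q(A): applying the question meaning to the answer meaning yields a proposition.
answer_4_1 = did_tarantino_direct_titanic(no)      # ¬((direct titanic) tarantino)
answer_4_2 = what_did_cameron_direct(titanic_gq)   # ((direct titanic) cameron)
```

In this toy model both applications evaluate to a true proposition, which matches the structured meaning idea that a question applied to its answer yields a proposition.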
The Cherednik kernel and generalized exponents

arXiv:math/0311258v1 [math.RT] 17 Nov 2003

THE CHEREDNIK KERNEL AND GENERALIZED EXPONENTS

BOGDAN ION

Abstract. We show how the knowledge of the Fourier coefficients of the Cherednik kernel leads to combinatorial formulas for generalized exponents. We recover known formulas for generalized exponents of irreducible representations parameterized by dominant roots, and obtain new formulas for the generalized exponents for irreducible representations parameterized by the dominant elements of the root lattice which are sums of two orthogonal short roots.

Introduction

Let g be a complex simple Lie algebra of rank n and denote by G its adjoint group. The algebra S(g) of complex valued polynomial functions on g becomes a graded representation for G. It is known from the work of Kostant [7] that if I denotes the subring of G-invariant polynomials on g then S(g) is free as an I-module and is generated by H, the space of G-harmonic polynomials on g (the polynomials annihilated by all G-invariant differential operators with constant complex coefficients and no constant term), or equivalently S(g) = I ⊗ H. The space of harmonic polynomials thus becomes a graded, locally finite representation of G; it can equivalently be thought of as the ring of regular functions on the cone of nilpotent elements in g. If we denote by $H^i$ its i-th graded piece, and by $V_\lambda$ the irreducible representation of G with highest weight λ, we can consider the graded multiplicity of $V_\lambda$ in H

$$E(V_\lambda) := \sum_{i \ge 0} \dim_{\mathbb{C}} \operatorname{Hom}_G(V_\lambda, H^i)\, t^i$$

As a polynomial with positive integer coefficients, $E(V_\lambda)$ can be written in the form

$$E(V_\lambda) = \sum_{i=1}^{v_\lambda} t^{e_i(\lambda)}$$

such that $e_1(\lambda) \le e_2(\lambda) \le \cdots \le e_{v_\lambda}(\lambda)$ and $v_\lambda$ is the multiplicity of the 0-th weight space of $V_\lambda$. The positive integers $e_i(\lambda)$ were called by Kostant the generalized exponents of $V_\lambda$. The terminology is justified by the fact that the classical exponents of G, the numbers $e_1 \le \cdots \le e_n$ which appear in the factorization of the Poincaré polynomial of G

$$p_G(t) = \prod_{i=1}^{n} \bigl(1 + t^{2e_i+1}\bigr)$$

coincide with the generalized exponents of the adjoint representation of G.

To further motivate the importance of generalized exponents note that by [8] and [4] the polynomials $E(V_\lambda)$ are particular examples of Kazhdan–Lusztig polynomials (for the affine Weyl group associated to the Weyl group of G) and therefore of considerable combinatorial complexity. The results of Lusztig and Hesselink describe $E(V_\lambda)$ as a t-analogue of the 0-th weight multiplicity of $V_\lambda$ via a deformation of Kostant's weight multiplicity formula introduced by Lusztig.

The problem of computing the classical exponents of G was initially motivated by the problem of computing the Betti numbers of G. It turns out that the classical exponents admit another description quite different from the one alluded to above. It was observed independently by A. Shapiro (unpublished) and R. Steinberg [12] that if we denote by h(k) the number of positive roots of height k in the root system associated to G then the number of times k occurs as an exponent of G is h(k) − h(k+1). This very simple procedure for computing the classical exponents was justified by Coleman [3] modulo the empirically observed fact that 2N = nh (N is the number of reflexions in the Weyl group of G and h is the order of a special element of the Weyl group called the Coxeter transformation) and by Kostant [6] who gave a uniform proof by studying the decomposition of g into submodules for the action of a principal three dimensional subalgebra of g. There is also a proof of this fact directly from Macdonald's factorization of the Poincaré polynomial of the Weyl group of G [9] [5, Section 3.20].

The main goal of this paper is to explain how the above description of the classical exponents and similar descriptions of generalized exponents can be obtained by analyzing the Fourier coefficients of the Cherednik kernel, a certain continuous function on a maximal torus of G. Besides recovering the formulas for generalized exponents of irreducible representations parameterized by
dominant roots, our main result, Theorem 4.5, describes combinatorially the generalized exponents for irreducible representations parameterized by dominant elements λ of the root lattice of g which are sums of two orthogonal short roots.

To describe this result we need the following notation. Let λ be a dominant element of the root lattice of g which can be written as a sum of two orthogonal short roots and which is not itself a root. For any γ in the same Weyl group orbit as λ let n(γ) be the number of (unordered) pairs of positive short orthogonal roots which sum up to γ. Let $h_\lambda(k) := h'_\lambda(k) - h''_\lambda(k)$, where $h'_\lambda(k)$ is the number of weights of $V_\lambda$ which have height k and $h''_\lambda(k)$ is the number of weights γ of $V_\lambda$ in the same Weyl group orbit as λ and whose height is k + n(γ).

Theorem 1. Let λ be a dominant element of the root lattice of g which can be written as a sum of two orthogonal short roots and which is not itself a root. With the notation above, the multiplicity of $V_\lambda$ in $H^k$ equals $h_\lambda(k) - h_\lambda(k+1)$.

Our result suggests that similar formulas for generalized exponents for other classes of irreducible representations of G are also possible if one explicitly describes the Fourier coefficients of the Cherednik kernel parametrized by all weights of the irreducible representation under consideration. A general technique of inductively computing the Fourier coefficients of the Cherednik kernel is described in Theorem 4.1. Another closely related method for computing the Fourier coefficients of the Cherednik kernel was introduced by Bazlov [1]. It is based on Cherednik operators and was successfully applied to compute the Fourier coefficients parametrized by roots, but this method seems to be less efficient in general because of the complexity of Cherednik operators.

1. Preliminaries

1.1. Let g be a complex simple Lie algebra of rank n and denote by G its adjoint group. Let h and b be a Cartan subalgebra, respectively a Borel subalgebra, of g such that h ⊂ b, fixed once and for all. The maximal torus of G corresponding
to h is denoted by H. We have H = TA where T is a compact torus and A is a real split torus. The volume one Haar measure on T is denoted by ds.

Let $R \subset h^*$ be the set of roots of g with respect to h, let $R^+$ be the set of roots of b with respect to h and denote $R^- = -R^+$. Of course, $R = R^+ \cup R^-$; the roots in $R^+$ are called positive and those in $R^-$ negative. The set of positive simple roots determined by $R^+$ is denoted by $\{\alpha_1, \ldots, \alpha_n\}$. We know that the roots in R have at most two distinct lengths. We will use the notation $R_s$ and $R_\ell$ to refer respectively to the short roots and the long roots in R. If the root system is simply laced we consider all the roots to be short. The dominant element of $R_s$ is denoted by $\theta_s$ and the dominant element of $R_\ell$ is denoted by $\theta_\ell$. Any element α of R can be written uniquely as a sum of simple roots $\sum_{i=1}^{n} a_i \alpha_i$. The height of the root α is defined to be

$$\operatorname{ht}(\alpha) = \sum_{i=1}^{n} a_i$$

The root of R which has the largest height is denoted by θ. By the above convention, if R is simply laced then $\theta = \theta_s$ and if R is not simply laced then $\theta = \theta_\ell$. Denote by r the maximal number of laces in the Dynkin diagram associated to g.

There is a canonical positive definite bilinear form (·,·) on $h^*_{\mathbb{R}}$ (the real vector space spanned by the roots) normalized such that (α,α) = 2 for long roots and (α,α) = 2/r for short roots. For any root α define $\alpha^\vee = 2\alpha/(\alpha,\alpha)$. We know from the axioms of a root system that $(\alpha,\beta^\vee)$ is an integer for any roots α and β. In fact, the only possible values for $|(\alpha,\beta^\vee)|$ are 0, 1 or 2 if the length of α does not exceed the length of β (the value 2 is attained only if α = ±β) and 0, r if the length of α is strictly larger than the length of β. Define ρ=1 eλ=e−λ. If we set $e^\delta = q$, for q a fixed complex number, the affine Weyl group acts naturally on C[Q]. For example, $e^{s_0(\lambda)} = q^{(\lambda,\theta)} e^{s_\theta(\lambda)}$. The subalgebra of Z[Q] consisting of W-invariant elements is denoted by $Z[Q]^W$.
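The height function defined above is what drives the Shapiro–Steinberg count quoted in the Introduction: k occurs as a classical exponent exactly h(k) − h(k+1) times. The Python sketch below checks this on small root systems. The Cartan matrices, the function names and the orbit-closure method used to list the roots are illustrative assumptions made here, not constructions taken from the paper.

```python
from collections import Counter


def positive_roots(cartan):
    """List positive roots as coefficient tuples in the simple-root basis.

    cartan[i][j] is assumed to be (alpha_i, alpha_j^vee). All roots are found
    by closing the simple roots under the simple reflections; the positive
    ones are those with non-negative coefficients.
    """
    n = len(cartan)
    simple = [tuple(int(i == j) for i in range(n)) for j in range(n)]
    seen = set(simple)
    todo = list(simple)
    while todo:
        beta = todo.pop()
        for j in range(n):
            # (beta, alpha_j^vee), computed by linearity in beta
            pairing = sum(beta[i] * cartan[i][j] for i in range(n))
            image = list(beta)
            image[j] -= pairing  # s_j(beta) = beta - (beta, alpha_j^vee) alpha_j
            image = tuple(image)
            if image not in seen:
                seen.add(image)
                todo.append(image)
    return sorted(r for r in seen if min(r) >= 0)


def classical_exponents(cartan):
    """Exponent k is listed h(k) - h(k+1) times, h(k) = #positive roots of height k."""
    h = Counter(sum(root) for root in positive_roots(cartan))
    exponents = []
    for k in range(1, max(h) + 1):
        exponents.extend([k] * (h[k] - h[k + 1]))
    return exponents


# Rank-two and rank-three examples (Cartan matrices are assumptions of this sketch).
A2 = [[2, -1], [-1, 2]]
B2 = [[2, -2], [-1, 2]]
G2 = [[2, -1], [-3, 2]]
A3 = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
```

For these inputs `classical_exponents` returns [1, 2] for A2, [1, 3] for B2, [1, 5] for G2 and [1, 2, 3] for A3, in agreement with the known exponents of these types.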
The irreduciblefinite dimensional representations of G are parameterized by the5dominant elements of the root lattice.For a dominantλwe denote byχλthe character of the corresponding irreducible representation of G.Restricting the characters to T we will regard them as elements of Z[Q].A basis of Z[Q]W is then given by the all the irreducible charactersχλof G.For any continuous function f on the torus T,its Fourier coefficients are param-eterized by Q and are given byfλ:= T fe−λds.The coefficient f0is called the constant term of f;it will be also denoted by[f].2.2.Let us consider the following function on the torus1∆=g∆dsmakes the charactersχλorthonormal.Assume q and t are complex numbers of small absolute value and let∇(q,t)= α∈R i≥01−q i eα[∇(q,t)]is a W–invariant continuous function on the torus with constant term equal to one. It is also invariant under the transformation which sends eλ,q and t to their inverses and therefore well defined also for q and t in a neighborhood of infinity.We can define the following non–degenerate scalar product on Z[Q]Wf,g ∆q,t:= T f6BOGDAN ION2.3.Let us consider also the continuous function on T given byK(q,t)= α∈R+ i≥0(1−q i eα)(1−q i+1e−α)1−teαFor t=q k and positive integral k this functionfirst appeared in Cherednik’s work [2]on the Macdonald constant term conjecture.Unlike∇(q,t)it is not invariant under the Weyl group.The functionK(q,t)C(q,t)=q=q−1andgC(q,t)dsFor example cλ(q,t)= 1,eλ C q,t.The scalar product has the property thatg,f C q,t=72.4.For each simple affine root consider the following operator,called Demazure–Lusztig operator,acting on F[Q]as followsT i(eλ)=e s i(λ)+(1−t)eλ−e s i(λ)1−teαIfχλdenotes the character of the irreducible representation of G with highest weightλ,then the graded multiplicity of Vλinside S(g)can be computed asch S(g)(t),χλ =1χλ(3)As mentioned in Introduction if I denotes the subring of G–invariant polynomials on g and H the space of G–harmonic polynomials on g then S(g)=I⊗H as graded 
G–modules.If follows that if we want to compute E(Vλ),the graded multiplicity of Vλinside H,then we would have to factor out in formula(3)the graded multiplicity of the trivial representation inside S(g),or equivalently the constant term of∇(0,t). We can conclude thatE(Vλ)= 1,χλ ∆0,t(4)By formula(2)we can thus express Eλas a sum of weight multiplicities of Vλtimes values of Fourier coefficients of the Cherednik kernel at q=0.The non–symmetry of the Cherednik kernel allows various Fourier coefficients parameterized by elements in the same Weyl group orbit to behave differently and therefore to contribute differently to the above scalar product.This feature is not present for the Macdonald kernel∆(q,t).We will return to the problem of computing the Fourier coefficients of the Cherednik kernel after some combinatorial considerations which will allow us to describe them in simple terms for elements of several Weyl group orbits.8BOGDAN ION3.The height function and the Bruhat order3.1.For each w in W let ℓ(w )be the length of a reduced (i.e.shortest)decompo-sition of w in terms of the s i .We have ℓ(w )=|Π(w )|whereΠ(w )={α∈R +|w (α)∈R −}.We also denote by c Π(w )={α∈R +|w (α)∈R +}.If w =s j p ···s j 1is a reducedexpression of w ,thenΠ(w )={α(i )|1≤i ≤p },with α(i )=s j 1···s j i −1(αj i ).For each element λof Q define λ+to be the unique dominant element in W λ,the orbit of λ.Let w λ∈W be the unique minimal length element such that w λ(λ+)=λ.Lemma 3.1.With the notation above,we haveΠ(w λ−1)={α∈R +|(λ,α)<0}.Proof.Let αbe an element of Π(w λ−1).Then w −1λ(α)is a negative root and inconsequence 0≥ λ+,w −1λ(α) =(w λ(λ+),α)=(λ,α).(5)Let us see that above we cannot have equality.If w −1λ=s j p ···s j 1is a reducedexpression,thenα∈Π(w −1λ)={α(i )|1≤i ≤p },with α(i )=s j 1···s j i −1(αj i ).Suppose that0=(λ,α(i ))=(λ,s j 1···s j i −1(αj i ))=(s j i −1···s j 1(λ),αj i )thens j i s j i −1···s j 1(λ)=s j i −1···s j 1(λ),fact which contradicts the minimality of w −1λ.Conversely,if 
the inequality (λ,α)<0holds for a positive root αthen equation (5)shows that w −1λ(α)is a negative root.3.2.The Bruhat order is a partial order on any Coxeter group defined in way compatible with the length function.For an element w we put w <s i w if and only if ℓ(w )<ℓ(s i w ).The transitive closure of this relation is called the Bruhat order.The terminology is motivated by the way this ordering arises for Weyl groups in connection with inclusions among closures of Bruhat cells for a corresponding semisimple algebraic group.For the basic properties of the Bruhat order we refer to Chapter 5in [5].Let us list a few of them (the first two properties completely characterize the Bruhat order):9(1)For eachα∈R+we have sαw<w if and only ifαis inΠ(w−1);(2)w′<w if and only if w′can be obtained by omitting some factors in afixedreduced decomposition of w;(3)if w′≤w then either s i w′≤w or s i w′≤s i w(or both).We can use the Bruhat order on W do define a partial order on each orbit of the Weyl group action on Q as follows.Definition 3.2.Letλandµbe two elements of the root lattice which lie in the same orbit of W.By definitionλ<µif and only if wλ<wµ.By the above Definition the dominant element of a W–orbit is the minimal element of that orbit with respect to the Bruhat order.Lemma3.3.Letλbe an element of the root lattice such that s i(λ)=λfor some1≤i≤n.Then w si (λ)=s i wλ.Proof.Becauseℓ(s i wλ)=ℓ(wλ)±1andℓ(s i w si (λ))=ℓ(w si(λ))±1we have fourpossible situations depending on the choice of the signs in the above relations.Thechoice of a plus sign in both relations translates intoαi∈Π(w−1λ)andαi∈Π(w−1s i·λ)which by Lemma3.1and our hypothesis implies that(αi,λ)>0and(αi,s i(λ))>0 (contradiction).The same argument shows that the choice of a minus sign in both relations is impossible.Now,we can assume thatℓ(s i wλ)=ℓ(wλ)+1andℓ(s i w si (λ))=ℓ(w si(λ))−1,the other case being treated ing the minimallength properties of wλand w si (λ)we can writeℓ(wλ)+1=ℓ(s i wλ)≥ℓ(w si (λ))=ℓ(s i 
w si(λ))+1≥ℓ(wλ)+1which shows thatℓ(s i wλ)=ℓ(w si (λ)).Our conclusion now follows from the unique-ness of the element w si (λ).An immediate consequence is the followingLemma 3.4.Letλbe a weight such that s i(λ)=λfor some1≤i≤n.Then s i(λ)>λif and only if(αi,λ)>0.If the equivalent conditions hold we also haveΠ(w si (λ))=Π(wλ)∪{w−1λ(αi)}.Lemma3.5.For an elementλin the root lattice we haveht(λ+)−ht(λ)= α∈Π(wλ)(λ+,α∨) Moreover,the number ht(λ+)−ht(λ)−ℓ(wλ)is a positive integer.Proof.Since ht(λ)=(λ,ρ)=(λ+,w−1λ(ρ))we obtain thatht(λ+)−ht(λ)=(λ+,ρ−w−1λ(ρ))10BOGDAN IONIf we writeρ=12 α∈cΠ(w−1λ)α∨using the equalitiesw−1λ Π(w−1λ) =−Π(wλ)and w−1λ cΠ(w−1λ) =cΠ(wλ)(6) wefind thatw−1λ(ρ)=−12 α∈cΠ(wλ)α∨Ourfirst claim then immediately follows.Regarding the second claim,note that forα∈Π(wλ)we always have(λ+,α∨)≥1.Indeed,from the equality(6)we knowthatα=−w−1λ(β)withβ∈Π(w−1λ)and therefore by Lemma3.1(λ+,α∨)=−(λ,β∨)>0In conclusion,ht(λ+)−ht(λ)−ℓ(wλ)= α∈Π(wλ)((λ+,α∨)−1)is a sum of positive integers and hence a positive integer.For any elementλof the root lattice we will use the notationDλ=ht(λ+)−ht(λ)−ℓ(wλ)As we will see Dλencodes a certain type of combinatorial information aboutλ.If the root system is not simply laced it will be convenient to considerDλ(ℓ)= α∈Πℓ(wλ)((λ+,α∨)−1)andDλ(s)= α∈Πs(wλ)((λ+,α∨)−1)whereΠℓ(wλ),respectivelyΠs(wλ),is used to denote the long roots,respectively short roots,inΠ(wλ).3.3.Let us describe Dλin a few cases.Assume thatλis a short root.Then λ+=θs andDλ= α∈Π(wλ)((θs,α∨)−1)Sinceθs is a short root,it follows that the scalar product(θs,α∨)equals2ifα=θs and equals1otherwise.Therefore Dλtakes the value1or0depending on whether θs is inΠ(wλ)or not.But since wλ(θs)=λwe obtain thatθs is inΠ(wλ)if and only ifλis a negative root.Therefore we have proved the following result. 
Lemma 3.6. If λ is a short root then $D_\lambda = 0$ if λ is a positive root and $D_\lambda = 1$ if λ is a negative root.

3.4. In the case of non-simply laced root systems we can investigate $D_\lambda$ for λ a long root. Denote first by $N(\theta_\ell)$ the number of unordered pairs {α,β} of short roots such that $\theta_\ell = \alpha + \beta$. For any other long root λ the number of unordered pairs {α,β} of short roots such that λ = α + β is still $N(\theta_\ell)$, since $w_\lambda$ provides a bijection between the two sets of pairs. If α and β are short roots such that $\theta_\ell = \alpha + \beta$ then $(\theta_\ell, \alpha^\vee) = 2 + (\beta, \alpha^\vee)$. We remark that $(\theta_\ell, \alpha^\vee)$ cannot be zero and hence it equals r. It follows that always $(\beta, \alpha^\vee) = r - 2$. The same is true for the scalar product of pairs of short roots associated in a similar way to any long root. Denote by n(λ) the number of negative roots appearing in all unordered pairs of short roots such that λ = α + β. The following result describes $D_\lambda$ in combinatorial terms.

Lemma 3.7. For a non-simply laced root system $D_\lambda(\ell) = 0$ if λ is a positive long root and $D_\lambda(\ell) = 1$ if λ is a negative long root. Also, $D_\lambda(s) = (r-1)\,n(\lambda)$.

Proof. As before, by examining the scalar products we find that $D_\lambda(\ell) = 0$ if $\theta_\ell$ is not in $\Pi(w_\lambda)$ and $D_\lambda(\ell) = 1$ if $\theta_\ell$ is in $\Pi(w_\lambda)$. But since $w_\lambda(\theta_\ell) = \lambda$ this translates precisely into our first claim.

Regarding the second claim we use the fact that $(\theta_\ell, \alpha^\vee) = r$ for all $\alpha \in \Pi_s(w_\lambda)$ to write $D_\lambda(s) = (r-1)\,|\Pi_s(w_\lambda)|$. Therefore, it will be enough to show that $n(\lambda) = |\Pi(w_\lambda)|$. Remark first that all the unordered pairs of short roots which sum up to λ are of the form $\{w_\lambda(\alpha), w_\lambda(\beta)\}$ with α and β positive short roots such that $\theta_\ell = \alpha + \beta$. If, for example, $w_\lambda(\alpha)$ is a negative root then $\alpha \in \Pi_s(w_\lambda)$.
We have shown that n(λ)≤|Π(wλ)|.For the converse inequality note that if(α)is a short root and α∈Πs(wλ)then(θℓ,α)=1and henceθℓ−α=−sθℓθℓ=α+(θℓ−α).Thenλ=wλ(α)+wλ(θℓ−α)and wλ(α)is a negative root.In conclusion|Π(wλ)|≤n(λ)and our statement is proved.3.5.We will give a combinatorial description of Dλfor a few more Weyl group orbits.Let us describefirst the orbits we wish to consider.DefineS:={γ=α+β|α,β∈R s,(α,β)=0}The Weyl group acts on S and the number of orbits of this action is given by the number of dominant elements of S.It is useful to note that the set S is empty for the root system of type G2and that for all the other non–simply laced root systems the long roots belong to S.Let us consider J the set of connected components of the diagram obtained from the Dynkin diagram of R by removing the nodes corresponding to those simple roots for which(θs,α∨i)=1and which contain at least one node associated to a short simple root of R.Note that each connected component as above is itself a Dynkin diagram and therefore we can associate its Weyl group W j,root system R j and highest short rootθs,j.12BOGDAN IONLemma3.8.The dominant elements of S areθs+θs,j for all j in J.Proof.We know(see e.g.[5])that for any element x of h∗R the stabilizer stab W(x) is generated by the simple reflexions whichfix x.Therefore the stabilizer ofθs is the group generated by the simple reflexions s i for which(θs,αi)=0and using the notation above we obtain thatstab W(θs)= j∈J W jIf for a simple rootαi we have(θs,αi)=0thenαi belongs to one of the root systems R j and therefore(θs,j,αi)≥0for any j∈J.Hence,(θs+θs,j,αi)≥0 for any j∈J.Ifαi is a simple root such that(θs,α∨i)=1,since(θs,j,α∨i)≥−1 we obtain again that(θs+θs,j,αi)≥0for any j∈J.In conclusion,the elements θs+θs,j are all dominant.Tofinish the proof we will show that any elementγof S is in fact conjugate to one of theθs+θs,j.Fix an elementγ=α+βof S such thatα,β∈R s and(α,β)=0.We can find a Weyl group element w such that w(α)=θs and therefore 
w(γ)=θs+w(β). Moreover,(θs,w(β))=0.This means in particular that s w(β)is an element of W whichfixesθs.The element s w(β)of stab W(θs)being a reflexion it follows that w(β)is a short root in one of the R j.If we denote byθs,j the highest short root of R j,we canfind an element w′of W j such that w′w(β)=θs,j.Of course,since w′fixesθs we obtain thatw′w(γ)=θs+θs,jTherefore,we have proved that each element of S is in the same orbit with one of the elementsθs+θs,j,j∈J.3.6.We will investigate the possible values of the scalar products(θs+θ′s,α∨)for positive rootsα.We wish to study the cases which were not already accounted for. Hence wefix j∈J such thatθs+θs,j=θℓ.The possible values of the scalar product(θs,α∨)are0,1and2and the possible values of the scalar product(θs,j,α∨)are0,±1and2.Note that if one on the scalar products is2then the other one is necessarily0sinceαis eitherθs orθs,j. Therefore(θs+θs,j,α∨)=2only ifα=θs,α=θs,j or(θs,α∨)=(θ′s,α∨)=1.The other possible values of the scalar product are0and1sinceθs+θs,j is dominant andαpositive their scalar product has to be positive.The most interesting situation is when we have(θs,α∨)=(θs,j,α∨)=1.In the situation when we have two distinct root lengthsαcan potentially be a long root.In such a case(θs,α)=(θs,j,α)=1and therefore sθs sθs,j(α)=α−rθs−rθs,j isa long root.The scalar product(rθs+rθs,j−α,α)=2(r−1).If r=3this leads to a contradiction and if r=2then we obtain thatα=θs+θs,j.Hence,αbeing a dominant long root it must equalθℓand henceθs+θs,j=θℓ(contradiction).13We have shown that ifαis a positive root and(θs,α∨)=(θs,j,α∨)=1thenαis necessarily short.Let A:={α∈R+s|(θs,α∨)=(θs,j,α∨)=1}.Denote byϕ=−sθs sθs,j.We will show thatϕis an involution of A withoutfixed points.Indeed,ϕ(α)=θs+θs,j−αis a short root and(θs,θs+θs,j−α)=2/r−1/r=1/r and similarly (θs,j,θs+θs,j−α)=1/r,showing thatϕ(α)is an element of A.Obviously,ϕ2is the identity.Ifαisfixed byϕthenθs+θs,j=2α.Computing the scalar product withαwe obtain2/r=4/r which is a 
contradiction.Therefore,the involutionϕdoes not havefixed points.3.7.One consequence of the above considerations is that A has an even number of elements.For our j∈J(chosen such thatθs+θs,j=θℓ)denote by n(j)the number of unordered pairs{α,β}of short orthogonal roots such thatθs+θs,j=α+β. Also,forλ∈W(θs+θs,j)denote by n(λ)the number of negative roots appearing in all unordered pairs of short roots such thatλ=α+β.Lemma3.9.With the notation above n(j)=1+|A|/2.Proof.If{α,β}is a pair of short orthogonal roots for whichθs+θs,j=α+β, then(θs+θs,j,α∨)=(α+β,α∨)=2/p.Such a root must necessarily be positive since otherwise ht(α+β)<ht(θs).From previous considerations we know that eitherα∈A,eitherα∈{θs,θs,j}.Therefore,the pair{α,β}is{θs,θs,j}or the pair{α,ϕ(α)}for someα∈A.It is easy to see that the number of such pairs is 1+|A|/2.The next result describes Dλin combinatorial terms.Lemma3.10.For an elementλ∈W(θs+θs,j)as above we haveDλ=Dλ(s)=n(λ)Proof.As we have argued before,there is no long rootαsuch that(θs+θs,j,α∨)=2 and therefore Dλ(ℓ)=0.Furthermore,Dλ=Dλ(s)= α∈Πs(wλ)((θs+θs,j,α∨)−1)and since the scalar product(θs+θs,j,α∨)is at most2we obtain that Dλis the number ofα∈Πs(wλ)for which(θs+θs,j,α∨)=2.We know from Lemma3.9 that a short positive rootαsuch that(θs+θs,j,α∨)=2gives rise an expression θs+θs,j=α+βwithαandβshort positive roots Thereforeλ=wλ(α)+wλ(β) and the short root wλ(α)is negative.We have shown that n(λ)≥Dλ.For the converse inequality we argue as in the proof of Lemma3.7.14BOGDAN IONThe following result is an immediate consequence of the combinatorial descrip-tion of Dλ.Lemma3.11.Letλ∈W(θs+θs,j)as above such that ht(λ)=0.Then Dλ=n(j).Proof.The claim is clear since if{α,β}is a pair of orthogonal short roots such that λ=α+βthen because ht(λ)=0precisely one ofαorβis a positive root and the other is a negative root.In conclusion n(λ)=n(j).The next result will be useful later.Lemma3.12.Letλ=sθ(θs+θs,j).Then Dλ=2n(j)−1is the root system R is simply laced and Dλ=n(j)if the 
root system is not simply laced.Proof.Consider a pair{α,β}of short orthogonal positive roots such thatα+β=θs+θs,j.Then{sθ(α),sθ(β)}is pair of short orthogonal roots such that sθ(α)+sθ(β)=λand all the pairs with this property arise in this way.Assumefirst that R is simply laced.If{α,β}={θs,θs,j}then{sθ(α),sθ(β)}= {−θs,θs,j}.In all the other cases{sθ(α),sθ(β)}={α−θs,β−θs}.Sinceθs is the highest root of R the number of negative roots appearing in the n(j)unordered pairs of short roots which sum up toλequals2n(j)−1.If R is non–simply laced then(θs+θs,j,θ)=1which forces of course(α+β,θ)=1.Becauseθis dominant both(α,θ)and(β,θ)are positive integers and therefore one of them equals0(say,thefirst one)and the other equals1.Hence {sθ(α),sθ(β)}={α−θ,β}.In conclusion the number of negative roots appearing in the n(j)unordered pairs of short roots which sum up toλequals n(j).3.8.For the root system R we denote by N(R)the number of positive roots in R. Similarly we denote by N(R s)the number of positive short roots in R and we use corresponding notation for the root systems R j.Lemma3.13.With the notation above,there are exactly N(R s)N(R s,j)/n(j)el-ements in the orbit W(θs+θs,j).Proof.For afixed short root there are exactly2N(R s,j)short roots orthogonal to it and with the sum in the prescribed orbit(since this is the situation forθs). Therefore,the total number of pairs of orthogonal short roots is4N(R s)N(R s,j). 
From all these pairs by taking their sum we obtain each element of the orbit W(θs+θs,j)exactly2n(j)times.In conclusion the number of elements in the orbit W(θs+θs,j)has the predicted value.154.Fourier coefficientsIn this section we will describe a general inductive procedure for computing the Fourier coefficients of the Cherednik kernel and then we apply it tofind explicit formulas for the coefficients corresponding to elements in a few Weyl group orbits.4.1.Letλbe an element of the root lattice andαi a simple root.If(λ,α∨i)=k>0 thenT i(eλ)=e s i(λ)+(1−t)(eλ+···+eλ−(k−1)αi)Note that for1<j<k the elementλ−jαi is a convex combination ofλand s i(λ). Indeedλ−jαi=(1−j/k)λ+j/ks i(λ)In consequence they lie in Weyl group orbits strictly closer to the origin than the elements in Wλ.The same is true if(λ,θ)=k>0T0(eλ)=tq k e sθ(λ)+(t−1)(qeλ−θ+···+q k−1eλ−(k−1)θ)and for1<j<k the elementλ−jθis a convex combination ofλand sθ(λ). Using the unitarity of the Demazure–Lusztig operators we obtain relations between Fourier coefficients of the Cherednik kernel.Using the equality T i(1),T i(eλ) q,t= 1,eλ q,t we obtain the following relationstc si (λ)(q,t)−cλ(q,t)=(1−t) cλ−αi(q,t)+···+cλ−(k−1)αi(q,t) (7)for all1≤i≤n such that s i(λ)>λ.Also if(λ,θ)=k>0we havetq k cλ(q,t)−c sθ(λ)(q,t)=(1−t) q k−1cλ−θ(q,t)+···+qcλ−(k−1)θ(q,t) (8)Fix a non–zero dominant elementλ+∈Q and consider the homogeneous system associated to the above equations.The unknowns are xλfor allλ∈Wλ+and the equationstx si (λ)−xλ=0if1≤i≤n and s i(λ)>λ(9)tq k xλ−x sθ(λ)=0if(λ,θ)=k>0(10)It is easy to see that from equation(9)we obtain that xλ=t−ℓ(wλ)xλ+.We also have k:=(λ+,θ)>0and from equation(10)we gettq k xλ+−t−ℓ(w sθ(λ))xλ+=0which implies that xλ+=0and therefore xλ=0for allλ∈Wλ+.Theorem 4.1.The system given by equations(7)and(8)and c0(q,t)=1has unique solution.16BOGDAN IONProof.We know that the system has at least one solution.Fix now a non–zero dominantλ+∈Q.Since the homogeneous system given by equations(9)and(10) has a 
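unique solution it follows that the system given by equations (7) and (8) has at most one solution. Combining these two remarks it follows that for any $\lambda \in W\lambda_+$, $c_\lambda(q,t)$ is uniquely expressible in terms of the $c_\mu(q,t)$, where μ lies in an orbit of W closer to the origin than $\lambda_+$. Using induction on the distance of $\lambda_+$ to the origin we obtain that our system has a unique solution.

4.2. We will now apply the inductive procedure described in the proof of Theorem 4.1 to find the Fourier coefficients of the Cherednik kernel corresponding to a few Weyl group orbits. The orbit of W on Q closest to the origin is $W\theta_s$. If the root system is simply laced then $\theta_s = \theta$, the highest root of R. Let $X_{\theta_s}$ be the element of F for which

$$c_{\theta_s}(q,t) = t^{\operatorname{ht}(\theta_s)} X_{\theta_s} \qquad (11)$$

Theorem 4.2. For any short root λ we have

$$c_\lambda(q,t) = t^{\operatorname{ht}(\lambda)+D_\lambda} X_{\theta_s} + \bigl(t^{\operatorname{ht}(\lambda)} - t^{\operatorname{ht}(\lambda)+D_\lambda}\bigr) \qquad (12)$$

and $X_{\theta_s} = (1 - t^{-1})/(1 - q\,t^{\operatorname{ht}(\theta)})$.

Proof. We will show that the above formula is valid by induction on the order on the orbit $W\theta_s$ induced from the Bruhat order. For the minimal element of the orbit we have $D_{\theta_s} = 0$ (by Lemma 3.6) and the predicted formula coincides with (11).

Assuming that the predicted formula is true for λ we will show that it is true for $s_i(\lambda) > \lambda$. As explained in Lemma 3.4 this means that $(\lambda, \alpha_i^\vee) > 0$. In fact the possible values of the scalar product are 1 or 2 (the latter only if $\lambda = \alpha_i$). If $(\lambda, \alpha_i^\vee) = 1$ then $\operatorname{ht}(s_i(\lambda)) = \operatorname{ht}(\lambda) - 1$ and $\ell(w_{s_i(\lambda)}) = \ell(w_\lambda) + 1$. It follows that $D_{s_i(\lambda)} = D_\lambda$, therefore using equation (7) we get

$$c_{s_i(\lambda)}(q,t) = t^{-1} c_\lambda(q,t) = t^{\operatorname{ht}(\lambda)+D_\lambda-1} X_{\theta_s} + \bigl(t^{\operatorname{ht}(\lambda)-1} - t^{\operatorname{ht}(\lambda)+D_\lambda-1}\bigr) = t^{\operatorname{ht}(s_i(\lambda))+D_{s_i(\lambda)}} X_{\theta_s} + \bigl(t^{\operatorname{ht}(s_i(\lambda))} - t^{\operatorname{ht}(s_i(\lambda))+D_{s_i(\lambda)}}\bigr)$$

If $(\lambda, \alpha_i^\vee) = 2$ then $\lambda = \alpha_i$ and $s_i(\lambda) = -\alpha_i$. In consequence $D_\lambda = 0$ and $D_{s_i(\lambda)} = 1$. Again by equation (7) we get

$$c_{s_i(\lambda)}(q,t) = t^{-1} c_\lambda(q,t) + (t^{-1} - 1) = X_{\theta_s} + (t^{-1} - 1) = t^{\operatorname{ht}(s_i(\lambda))+D_{s_i(\lambda)}} X_{\theta_s} + \bigl(t^{\operatorname{ht}(s_i(\lambda))} - t^{\operatorname{ht}(s_i(\lambda))+D_{s_i(\lambda)}}\bigr)$$

We have thus shown that the formula (12) is valid for all short roots. To show that $X_{\theta_s}$ has the predicted value we use the equation (8).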
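As a sanity check of Theorem 4.2, one can substitute formula (12) into relations (7) and (8) for the smallest simply laced example. The Python sketch below does this for type A2 with exact rational arithmetic. The coordinates, helper names and sample values of q and t are assumptions made for illustration; $D_\lambda$ is computed from its definition $\operatorname{ht}(\lambda_+) - \operatorname{ht}(\lambda) - \ell(w_\lambda)$, with $\ell(w_\lambda)$ counted via Lemma 3.1.

```python
from fractions import Fraction

# Type A2 in simple-root coordinates; GRAM[i][j] = (alpha_i, alpha_j).
# All roots have (alpha, alpha) = 2, so (x, alpha) = (x, alpha^vee).
GRAM = [[2, -1], [-1, 2]]
SIMPLE = [(1, 0), (0, 1)]
THETA = (1, 1)
POSITIVE = SIMPLE + [THETA]
ROOTS = POSITIVE + [tuple(-c for c in r) for r in POSITIVE]


def pair(x, y):
    """Bilinear form (x, y) in simple-root coordinates."""
    return sum(x[i] * GRAM[i][j] * y[j] for i in range(2) for j in range(2))


def ht(lam):
    """Height: sum of the coefficients in the simple-root basis."""
    return sum(lam)


def shift(lam, alpha, k):
    """Return lam - k * alpha."""
    return tuple(l - k * a for l, a in zip(lam, alpha))


def d_stat(lam):
    """D_lambda for a root lambda (so lambda_+ = theta); by Lemma 3.1 the
    length of w_lambda is the number of positive alpha with (lambda, alpha) < 0."""
    length = sum(1 for alpha in POSITIVE if pair(lam, alpha) < 0)
    return ht(THETA) - ht(lam) - length


def c(lam, q, t):
    """Fourier coefficient from formula (12); c_0 = 1."""
    if lam == (0, 0):
        return Fraction(1)
    x = (1 - 1 / t) / (1 - q * t ** ht(THETA))
    e, d = ht(lam), d_stat(lam)
    return t ** (e + d) * x + (t ** e - t ** (e + d))


def relations_hold(q, t):
    """Check relations (7) and (8) on every root of A2."""
    for lam in ROOTS:
        for alpha in SIMPLE:  # relation (7), needs (lam, alpha_i^vee) = k > 0
            k = pair(lam, alpha)
            if k > 0:
                lhs = t * c(shift(lam, alpha, k), q, t) - c(lam, q, t)
                rhs = (1 - t) * sum(c(shift(lam, alpha, j), q, t) for j in range(1, k))
                if lhs != rhs:
                    return False
        k = pair(lam, THETA)  # relation (8), needs (lam, theta) = k > 0
        if k > 0:
            lhs = t * q ** k * c(lam, q, t) - c(shift(lam, THETA, k), q, t)
            rhs = (1 - t) * sum(
                q ** (k - j) * c(shift(lam, THETA, j), q, t) for j in range(1, k)
            )
            if lhs != rhs:
                return False
    return True


ok = relations_hold(Fraction(1, 3), Fraction(2, 5))
```

The same script also confirms Lemma 3.6 in this example: `d_stat` is 0 on the positive roots and 1 on the negative ones.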
English for Automation: Summary of End-of-Lesson Vocabulary and Sentences

P3U1

architecture n. 体系结构
instruction set 指令集
binary-coded adj. 二进制编码的
central processing unit (CPU) 中央处理器
processor n. 处理器
location n. (存储)单元
word length 字长
access v. 存取,接近
fetch v., n. 取来
field n. 域,字段
opcode n. 操作码
operand n. 操作数
address n. 寻址
single-precision adj. 单精度的
floating-point adj. 浮点的
terminal n. 终端
complement v. 补充,求补
decode v. 解码,译码
request n. 请求
inactive n. 不活动,停止
I/O-mapped adj. 输入/输出映射的(单独编址)
memory-mapped adj. 存储器映射的(统一编址)

Translation of difficult sentences

[1] …how the instruction execution cycle is broken down into its various components.
……指令执行周期怎样分解成不同的部分。
[2] One way to achieve meaningful patterns is to divide up the bits into fields…
一种得到(指令)有效形式的方法是将(这些)位分成段……

[3] The majority of computer tasks involve the ALU, but a great amount of data movement is required in order to make use of the ALU instructions.
计算机的大多数工作涉及到ALU(算术逻辑单元),但为了使用ALU指令,需要传送大量的数据。
Focused information criterion and model averaging for generalized

The Annals of Statistics
2011, Vol. 39, No. 1, 174–200
DOI: 10.1214/10-AOS832
© Institute of Mathematical Statistics, 2011

FOCUSED INFORMATION CRITERION AND MODEL AVERAGING FOR GENERALIZED ADDITIVE PARTIAL LINEAR MODELS

BY XINYU ZHANG¹ AND HUA LIANG²

Chinese Academy of Sciences and University of Rochester

We study model selection and model averaging in generalized additive partial linear models (GAPLMs). Polynomial splines are used to approximate nonparametric functions. The corresponding estimators of the linear parameters are shown to be asymptotically normal. We then develop a focused information criterion (FIC) and a frequentist model average (FMA) estimator on the basis of the quasi-likelihood principle and examine theoretical properties of the FIC and FMA. The major advantages of the proposed procedures over the existing ones are their computational expediency and theoretical reliability. Simulation experiments have provided evidence of the superiority of the proposed procedures. The approach is further applied to a real-world data example.

1. Introduction. Generalized additive models, which are a generalization of the generalized linear models and involve a summand of one-dimensional nonparametric functions instead of a summand of linear components, have been widely used to explore the complicated relationships between a response to treatment and predictors of interest [Hastie and Tibshirani (1990)]. Various attempts are still being made to balance the interpretation of generalized linear models and the flexibility of generalized additive models such as generalized additive partial linear models (GAPLMs), in which some of the additive component functions are linear, while the remaining ones are modeled nonparametrically [Härdle et al. (2004a, 2004b)]. A special case of a GAPLM with a single nonparametric component, the generalized partial linear model (GPLM), has been well studied in the literature; see, for example, Severini and Staniswalis (1994), Lin and Carroll (2001), Hunsberger (1994), Hunsberger et
al. (2002) and Liang (2008). The profile quasi-likelihood procedure has generally been used, that is, the estimation of GPLM is made computationally feasible by the idea that estimates of the parameters can be found for a known nonparametric function, and an estimate of the nonparametric function can be found for the estimated parameters.

Received February 2010; revised May 2010.
¹ Supported in part by the National Natural Science Foundation of China Grants 70625004 and 70933003.
² Supported in part by NSF Grant DMS-08-06097.
AMS 2000 subject classifications. Primary 62G08; secondary 62G20, 62G99.
Key words and phrases. Additive models, backfitting, focus parameter, generalized partially linear models, marginal integration, model average, model selection, polynomial spline, shrinkage methods.

Severini and Staniswalis (1994) showed that the resulting estimators of the parameter are asymptotically normal and that estimators of the nonparametric functions are consistent in supremum norm. The computational algorithm involves searching for maxima of global and local likelihoods simultaneously. It is worthwhile to point out that studying GPLM is easier than studying GAPLMs, partly because there is only one nonparametric term in GPLM. Correspondingly, implementation of the estimation for GPLM is simpler than for GAPLMs. Nevertheless, the GAPLMs are more flexible and useful than GPLM because the former allow several nonparametric terms for some covariates and parametric terms for others, and thus it is possible to explore more complex relationships between the response variables and covariates. For example, Shiboski (1998) used a GAPLM to study AIDS clinical trial data and Müller and Rönz (2000) used a GAPLM to carry out credit scoring. However, few theoretical results are available for GAPLMs, due to their general flexibility. In this article, we shall study estimation of GAPLMs using polynomial spline, establish asymptotic normality for the estimators of the linear
parameters and develop a focused information criterion (FIC) for model selection and a frequentist model averaging (FMA) procedure in construction of the confidence intervals for the focus parameters with improved coverage probability.

We know that traditional model selection methods such as the Akaike information criterion [AIC, Akaike (1973)] and the Bayesian information criterion [BIC, Schwarz (1978)] aim to select a model with good overall properties, but the selected model is not necessarily good for estimating a specific parameter under consideration, which may be a function of the model parameters; see an inspiring example in Section 4.4 of Claeskens and Hjort (2003). Exploring the data set from the Wisconsin epidemiologic study of diabetic retinopathy, Claeskens, Croux and van Kerckhoven (2006) also noted that different models are suitable for different patient groups. This occurrence has been confirmed by Hand and Vinciotti (2003) and Hansen (2005). Motivated by this concern, Claeskens and Hjort (2003) proposed a new model selection criterion, FIC, which is an unbiased estimate of the limiting risk for the limit distribution of an estimator of the focus parameter, and systematically developed a general asymptotic theory for the proposed criterion.
More recently, FIC has been studied in several models. Hjort and Claeskens (2006) developed the FIC for the Cox hazard regression model and applied it to a study of skin cancer; Claeskens, Croux and van Kerckhoven (2007) introduced the FIC for autoregressive models and used it to predict the net number of new personal life insurance policies for a large insurance company.

Existing model selection methods arrive at a single model that is thought to capture the main information in the data, and that model is then treated as if it had been decided in advance of the data analysis. Such an approach may lead to the ignoring of uncertainty introduced by model selection. Thus, the reported confidence intervals are too narrow or shift away from the correct location, and the corresponding coverage probabilities of the resulting confidence intervals can substantially deviate from the nominal level [Danilov and Magnus (2004) and Shen, Huang and Ye (2004)]. Model averaging, as an alternative to model selection, not only provides a kind of insurance against selecting a very poor model, but can also avoid model selection instability [Yang (2001) and Leung and Barron (2006)] by weighting/smoothing estimators across several models, instead of relying entirely on a single model selected by some model selection criterion. As a consequence, analysis of the distribution of model averaging estimators can improve coverage probabilities. This strategy has been adopted and studied in the literature, for example, Draper (1995), Buckland, Burnham and Augustin (1997), Burnham and Anderson (2002), Danilov and Magnus (2004) and Leeb and Pötscher (2006). A seminal work, Hjort and Claeskens (2003), developed asymptotic distribution theories for estimation and inference after model selection and model averaging across parametric models. See Claeskens and Hjort (2008) for a comprehensive survey on FIC and model averaging.

FIC and FMA have been well studied for parametric models. However, few efforts have been made to study FIC and FMA for
semiparametric models. To the best of our knowledge, only Claeskens and Carroll (2007) studied FMA in semiparametric partial linear models with a univariate nonparametric component. The existing results are hard to extend directly to GAPLMs, for the following reasons: (i) there exist nonparametric components in GAPLMs, so the ordinary likelihood method cannot be directly used in estimation for GAPLMs; (ii) unlike the semiparametric partial linear models in Claeskens and Carroll (2007), GAPLMs allow for multivariate covariate consideration in nonparametric components and also allow for the mean of the response variable to be connected to the covariates by a link function, which means that the binary/count response variable can be considered in the model. Thus, to develop FIC and FMA procedures for GAPLMs and to establish asymptotic properties for these procedures are by no means straightforward to achieve. Aiming at these two goals, we first need to appropriately estimate the coefficients of the parametric components (hereafter, we call these coefficients "linear parameters").

There are two commonly used estimation approaches for GAPLMs: the first is local scoring backfitting, proposed by Buja, Hastie and Tibshirani (1989); the second is an application of the marginal integration approach on the nonparametric component [Linton and Nielsen (1995)]. However, theoretical properties of the former are not well understood since it is only defined implicitly as the limit of a complicated iterative algorithm, while the latter suffers from the curse of dimensionality [Härdle et al. (2004a)], which may lead to an increase in the computational burden and which also conflicts with the purpose of using a GAPLM, that is, dimension reduction. Therefore, in this article, we apply polynomial spline to approximate nonparametric functions in GAPLMs. After the spline basis is chosen, the nonparametric components are replaced by a linear combination of spline basis, then the coefficients can be estimated by an efficient
one-step maximizing procedure. Since the polynomial-spline-based method solves much smaller systems of equations than kernel-based methods that solve larger systems (which may lead to identifiability problems), our polynomial-spline-based procedures can substantially reduce the computational burden. See a similar discussion about this computational issue in Yu, Park and Mammen (2008), in the generalized additive models context.

The use of polynomial spline in generalized nonparametric models can be traced back to Stone (1986), where the rate of convergence of the polynomial spline estimates for the generalized additive model was first obtained. Stone (1994) and Huang (1998) investigated the polynomial spline estimation for the generalized functional ANOVA model. In a widely discussed paper, Stone et al. (1997) presented a completely theoretical setting of polynomial spline approximation, with applications to a wide array of statistical problems, ranging from least-squares regression, density and conditional density estimation, and generalized regression such as logistic and Poisson regression, to polychotomous regression and hazard regression. Recently, Xue and Yang (2006) studied estimation in the additive coefficient model with continuous response using polynomial spline to approximate the coefficient functions. Sun, Kopciuk and Lu (2008) used polynomial spline in partially linear single-index proportional hazards regression models. Fan, Feng and Song (2009) applied polynomial spline to develop nonparametric independence screening in sparse ultra-high-dimensional additive models. Few attempts have been made to study polynomial spline for GAPLMs, due to the extreme technical difficulties involved.

The remainder of this article is organized as follows. Section 2 sets out the model framework and provides the polynomial spline estimation and asymptotic normality of estimators. Section 3 introduces the FIC and FMA procedures and constructs confidence
intervals for the focus parameters on a basis of FMA estimators. A simulation study and real-world data analysis are presented in Sections 4 and 5, respectively. Regularity conditions and technical proofs are presented in the Appendix.

2. Model framework and estimation. We consider a GAPLM where the response $Y$ is related to covariates $X=(X_1,\ldots,X_p)^T\in R^p$ and $Z=(Z_1,\ldots,Z_d)^T\in R^d$. Let the unknown mean response $u(x,z)=E(Y|X=x,Z=z)$ and the conditional variance function be defined by a known positive function $V$, $\mathrm{var}(Y|X=x,Z=z)=V\{u(x,z)\}$. In this article, the mean function $u$ is defined via a known link function $g$ by an additive linear function
\[ g\{u(x,z)\}=\sum_{\alpha=1}^{p}\eta_\alpha(x_\alpha)+z^T\beta, \tag{2.1} \]
where $x_\alpha$ is the $\alpha$th element of $x$, $\beta$ is a $d$-dimensional regression parameter and the $\eta_\alpha$'s are unknown smooth functions. To ensure identifiability, we assume that $E\{\eta_\alpha(X_\alpha)\}=0$ for $1\le\alpha\le p$.

Let $\beta=(\beta_c^T,\beta_u^T)^T$ be a vector with $d=d_c+d_u$ components, where $\beta_c$ consists of the first $d_c$ parameters of $\beta$ (which we certainly wish to be in the selected model) and $\beta_u$ consists of the remaining $d_u$ parameters (for which we are unsure whether or not they should be included in the selected model). In what follows, we call the elements of $z$ corresponding to $\beta_c$ and $\beta_u$ the certain and exploratory variables, respectively. As in the literature on FIC, we consider a local misspecification framework where the true value of the parameter vector $\beta$ is $\beta_0=(\beta_{c,0}^T,\delta^T/\sqrt{n})^T$, with $\delta$ being a $d_u\times 1$ vector; that is, the true model is away from the deduced model with a distance $O(1/\sqrt{n})$. This framework indicates that squared model biases and estimator variances are both of size $O(1/n)$, the most possible large-sample approximations. Some arguments related to this framework appear in Hjort and Claeskens (2003, 2006).

Denote by $\beta_S=(\beta_c^T,\beta_{u,S}^T)^T$ the parameter vector in the $S$th submodel, in the same sense as $\beta$, with $\beta_{u,S}$ being a $d_{u,S}$-subvector of $\beta_u$. Let $\pi_S$ be the projection matrix of size $d_{u,S}\times d_u$ mapping $\beta_u$ to $\beta_{u,S}$. With $d_u$ exploratory covariates, our
setup allows $2^{d_u}$ extended models to choose among. However, it is not necessary to deal with all $2^{d_u}$ possible models and one is free to consider only a few relevant submodels (not necessarily nested or ordered) to be used in the model selection or averaging. A special example is the James–Stein-type estimator studied by Kim and White (2001), which is a weighted sum of the estimators based on the reduced model ($d_{u,S}=0$) and the full model ($d_{u,S}=d_u$). So, the covariates in the $S$th submodel are $X$ and $\Pi_S Z$, where $\Pi_S=\mathrm{diag}(I_{d_c},\pi_S)$.

To save space, we generally ignore the dimensions of zero vectors/matrices and identity matrices, simply denoting them by $0$ and $I$, respectively. If necessary, we will write their dimensions explicitly. In the remainder of this section, we shall investigate polynomial spline estimation for $(\beta_{c,0}^T,0)$ based on the $S$th submodel and establish a theoretical property for the resulting estimators.

Let $\eta_0=\sum_{\alpha=1}^{p}\eta_{0,\alpha}(x_\alpha)$ be the true additive function and the covariate $X_\alpha$ be distributed on a compact interval $[a_\alpha,b_\alpha]$. Without loss of generality, we take all intervals $[a_\alpha,b_\alpha]=[0,1]$ for $\alpha=1,\ldots,p$. Noting (A.7) in Appendix A.2, under some smoothness assumptions in Appendix A.1, $\eta_0$ can be well approximated by spline functions. Let $S_n$ be the space of polynomial splines on $[0,1]$ of degree $\varrho\ge 1$. We introduce a knot sequence with $J$ interior knots,
\[ k_{-\varrho}=\cdots=k_{-1}=k_0=0<k_1<\cdots<k_J<1=k_{J+1}=\cdots=k_{J+\varrho+1}, \]
where $J\equiv J_n$ increases when sample size $n$ increases and the precise order is given in condition (C6). Then, $S_n$ consists of functions $\varsigma$ satisfying the following: (i) $\varsigma$ is a polynomial of degree $\varrho$ on each of the subintervals $[k_j,k_{j+1})$, $j=0,\ldots,J_n-1$, and the last subinterval is $[k_{J_n},1]$; (ii) for $\varrho\ge 2$, $\varsigma$ is $(\varrho-1)$-times continuously differentiable on $[0,1]$. For simplicity of proof, equally spaced knots are used. Let $h=1/(J_n+1)$ be the distance between two consecutive knots.

Let $(Y_i,X_i,Z_i)$, $i=1,\ldots,n$, be independent copies of $(Y,X,Z)$. In the $S$th submodel, we consider the additive
spline estimates of $\eta_0$ based on the independent random sample $(Y_i,X_i,\Pi_S Z_i)$, $i=1,\ldots,n$. Let $G_n$ be the collection of functions $\eta$ with the additive form $\eta(x)=\sum_{\alpha=1}^{p}\eta_\alpha(x_\alpha)$, where each component function $\eta_\alpha\in S_n$. We would like to find a function $\eta\in G_n$ and a value of $\beta_S$ that maximize the quasi-likelihood function
\[ L(\eta,\beta_S)=\frac{1}{n}\sum_{i=1}^{n}Q[g^{-1}\{\eta(X_i)+(\Pi_S Z_i)^T\beta_S\},Y_i],\qquad \eta\in G_n, \tag{2.2} \]
where $Q(m,y)$ is the quasi-likelihood function satisfying $\partial Q(m,y)/\partial m=(y-m)/V(m)$.

For the $\alpha$th covariate $x_\alpha$, let $b_{j,\alpha}(x_\alpha)$ be the B-spline basis function of degree $\varrho$. For any $\eta\in G_n$, one can write $\eta(x)=\gamma^T b(x)$, where $b(x)=\{b_{j,\alpha}(x_\alpha),\ j=-\varrho,\ldots,J_n,\ \alpha=1,\ldots,p\}^T$ are the spline basis functions and $\gamma=\{\gamma_{j,\alpha},\ j=-\varrho,\ldots,J_n,\ \alpha=1,\ldots,p\}^T$ is the spline coefficient vector. Thus, the maximization problem in (2.2) is equivalent to finding values of $\beta_S^*$ and $\gamma^*$ that maximize
\[ \frac{1}{n}\sum_{i=1}^{n}Q[g^{-1}\{\gamma^{*T}b(X_i)+(\Pi_S Z_i)^T\beta_S^*\},Y_i]. \tag{2.3} \]
We denote the maximizers as $\hat\beta_S^*$ and $\hat\gamma_S^*=\{\hat\gamma_{S,j,\alpha}^*,\ j=-\varrho,\ldots,J_n,\ \alpha=1,\ldots,p\}^T$. The spline estimator of $\eta_0$ is then $\hat\eta_S^*=\hat\gamma_S^{*T}b(x)$ and the centered spline estimators of each component function are
\[ \hat\eta_{S,\alpha}^*(x_\alpha)=\sum_{j=-\varrho}^{J_n}\hat\gamma_{S,j,\alpha}^*b_{j,\alpha}(x_\alpha)-\frac{1}{n}\sum_{i=1}^{n}\sum_{j=-\varrho}^{J_n}\hat\gamma_{S,j,\alpha}^*b_{j,\alpha}(X_{i\alpha}),\qquad \alpha=1,\ldots,p. \]
The above estimation approach can be easily implemented with commonly used statistical software since the resulting model is a generalized linear model.

For any measurable functions $\phi_1,\phi_2$ on $[0,1]^p$, define the empirical inner product and the corresponding norm as
\[ \langle\phi_1,\phi_2\rangle_n=n^{-1}\sum_{i=1}^{n}\{\phi_1(X_i)\phi_2(X_i)\},\qquad \|\phi\|_n^2=n^{-1}\sum_{i=1}^{n}\phi^2(X_i). \]
If $\phi_1$ and $\phi_2$ are $L_2$-integrable, define the theoretical inner product and the corresponding norm as $\langle\phi_1,\phi_2\rangle=E\{\phi_1(X)\phi_2(X)\}$, $\|\phi\|_2^2=E\phi^2(X)$, respectively. Let $\|\phi\|_{n\alpha}^2$ and $\|\phi\|_{2\alpha}^2$ be the empirical and theoretical norms, respectively, of a function $\phi$ on $[0,1]$, that is,
\[ \|\phi\|_{n\alpha}^2=n^{-1}\sum_{i=1}^{n}\phi^2(X_{i\alpha}),\qquad \|\phi\|_{2\alpha}^2=E\phi^2(X_\alpha)=\int_0^1\phi^2(x_\alpha)f_\alpha(x_\alpha)\,dx_\alpha, \]
where $f_\alpha(x_\alpha)$ is the density function of $X_\alpha$.

Define the centered version spline basis for any $\alpha=1,\ldots,p$ and $j=-\varrho+1,\ldots,J_n$, $b_{j,\alpha}^*(x_\alpha)=b_{j,\alpha}(x_\alpha)-(\|b_{j,\alpha}\|_{2\alpha}/\|b_{j-1,\alpha}\|_{2\alpha})\,b_{j-1,\alpha}(x_\alpha)$, with the standardized version given by
\[ B_{j,\alpha}(x_\alpha)=\frac{b_{j,\alpha}^*(x_\alpha)}{\|b_{j,\alpha}^*\|_{2\alpha}}. \tag{2.4} \]
Note that to find $(\gamma^*,\beta_S^*)$ that maximizes (2.3) is mathematically equivalent to finding $(\gamma,\beta_S)$ that maximizes
\[ \ell(\gamma,\beta_S)=\frac{1}{n}\sum_{i=1}^{n}Q[g^{-1}\{\gamma^T B(X_i)+(\Pi_S Z_i)^T\beta_S\},Y_i], \tag{2.5} \]
where $B(x)=\{B_{j,\alpha}(x_\alpha),\ j=-\varrho+1,\ldots,J_n,\ \alpha=1,\ldots,p\}^T$. Similarly to $\hat\beta_S^*$, $\hat\gamma_S^*$, $\hat\eta_S^*$ and $\hat\eta_{S,\alpha}^*$, we can define $\hat\beta_S$, $\hat\gamma_S$, $\hat\eta_S$ and the centered spline estimators of each component function $\hat\eta_{S,\alpha}(x_\alpha)$. In practice, the basis $\{b_{j,\alpha}(x_\alpha),\ j=-\varrho,\ldots,J_n,\ \alpha=1,\ldots,p\}^T$ is used for data analytic implementation and the mathematically equivalent expression (2.4) is convenient for asymptotic derivation.

Let $\rho_l(m)=\{dg^{-1}(m)/dm\}^l/V\{g^{-1}(m)\}$, $l=1,2$. Write $T=(X^T,Z^T)^T$, $m_0(T)=\eta_0(X)+Z^T\beta_0$ and $\varepsilon=Y-g^{-1}\{m_0(T)\}$. $T_i$, $m_0(T_i)$ and $\varepsilon_i$ are defined in the same way after replacing $X$, $Z$ and $T$ by $X_i$, $Z_i$ and $T_i$, respectively. Write
\[ \Gamma(x)=\frac{E[Z\rho_1\{m_0(T)\}|X=x]}{E[\rho_1\{m_0(T)\}|X=x]},\qquad \psi(T)=Z-\Gamma(X),\qquad G_n=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\varepsilon_i\rho_1\{m_0(T_i)\}\psi(T_i), \]
$D=E[\rho_1\{m_0(T)\}\psi(T)\{\psi(T)\}^T]$ and $\Sigma=E[\rho_1^2\{m_0(T)\}\varepsilon^2\psi(T)\{\psi(T)\}^T]$.

The following theorem shows that the estimators $\hat\beta_S$ on the basis of the $S$th submodel are asymptotically normal.

THEOREM 1. Under the local misspecification framework and conditions (C1)–(C11) in the Appendix,
\[ \sqrt{n}\{\hat\beta_S-(\beta_{c,0}^T,0)^T\}=-(\Pi_S D\Pi_S^T)^{-1}\Pi_S G_n+(\Pi_S D\Pi_S^T)^{-1}\Pi_S D\begin{pmatrix}0\\ \delta\end{pmatrix}+o_p(1)\ \xrightarrow{d}\ -(\Pi_S D\Pi_S^T)^{-1}\Pi_S G+(\Pi_S D\Pi_S^T)^{-1}\Pi_S D\begin{pmatrix}0\\ \delta\end{pmatrix} \]
with $G_n\xrightarrow{d}G\sim N(0,\Sigma)$, where "$\xrightarrow{d}$" denotes convergence in distribution.

REMARK 1. If the link function $g$ is the identity and there is only one nonparametric component (i.e., $p=1$), then the result of Theorem 1 will simplify to those of Theorems 3.1–3.4 of Claeskens and Carroll (2007) under the corresponding submodels.

REMARK 2. Assume that $d_u=0$. Theorem 1 indicates that the polynomial-spline-based estimators of the linear parameters are asymptotically normal. This is the first explicit theoretical result on asymptotic normality for estimation of the linear parameters in GAPLMs and is of independent interest and importance.
This theorem also indicates that although there are several nonparametric functions and their polynomial approximation introduces biases for the estimators of each nonparametric component, these biases do not make the estimators of $\beta$ biased under condition (C6) imposed on the number of knots.

3. Focused information criterion and frequentist model averaging. In this section, based on the asymptotic result in Section 2, we develop an FIC model selection for GAPLMs, an FMA estimator, and propose a proper confidence interval for the focus parameters.

3.1. Focused information criterion. Let $\mu_0=\mu(\beta_0)=\mu(\beta_{c,0},\delta/\sqrt{n})$ be a focus parameter. Assume that the partial derivatives of $\mu(\beta_0)$ are continuous in a neighborhood of $\beta_{c,0}$. Note that, in the $S$th submodel, $\mu_0$ can be estimated by $\hat\mu_S=\mu([I_{d_c},0_{d_c\times d_u}]\Pi_S^T\hat\beta_S,\ [0_{d_u\times d_c},I_{d_u}]\Pi_S^T\hat\beta_S)$. We now show the asymptotic normality of $\hat\mu_S$. Write $R_S=\Pi_S^T(\Pi_S D\Pi_S^T)^{-1}\Pi_S$,
\[ \mu_c=\left.\frac{\partial\mu(\beta_c,\beta_u)}{\partial\beta_c}\right|_{\beta_c=\beta_{c,0},\,\beta_u=0},\qquad \mu_u=\left.\frac{\partial\mu(\beta_c,\beta_u)}{\partial\beta_u}\right|_{\beta_c=\beta_{c,0},\,\beta_u=0} \]
and $\mu_\beta=(\mu_c^T,\mu_u^T)^T$.

THEOREM 2. Under the local misspecification framework and conditions (C1)–(C11) in the Appendix, we have
\[ \sqrt{n}(\hat\mu_S-\mu_0)=-\mu_\beta^T R_S G_n+\mu_\beta^T(R_S D-I)\begin{pmatrix}0\\ \delta\end{pmatrix}+o_p(1)\ \xrightarrow{d}\ \Lambda_S\equiv-\mu_\beta^T R_S G+\mu_\beta^T(R_S D-I)\begin{pmatrix}0\\ \delta\end{pmatrix}. \]

Recall $G\sim N(0,\Sigma)$. A direct calculation yields
\[ E(\Lambda_S^2)=\mu_\beta^T\left\{R_S\Sigma R_S+(R_S D-I)\begin{pmatrix}0\\ \delta\end{pmatrix}\begin{pmatrix}0\\ \delta\end{pmatrix}^T(R_S D-I)^T\right\}\mu_\beta. \tag{3.1} \]
Let $\hat\delta$ be the estimator of $\delta$ by the full model. Then, from Theorem 1, we know that
\[ \hat\delta=-[0,I]D^{-1}G_n+\delta+o_p(1). \]
If we define $\Delta=-[0,I]D^{-1}G+\delta\sim N(\delta,[0,I]D^{-1}\Sigma D^{-1}[0,I]^T)$, then $\hat\delta\xrightarrow{d}\Delta$. Following Claeskens and Hjort (2003) and (3.1), we define the FIC of the $S$th submodel as
\[ \mathrm{FIC}_S=\mu_\beta^T\left\{R_S\Sigma R_S+(R_S D-I)\begin{pmatrix}0\\ \hat\delta\end{pmatrix}\begin{pmatrix}0\\ \hat\delta\end{pmatrix}^T(R_S D-I)^T-(R_S D-I)\begin{pmatrix}0&0\\ 0&I_{d_u}\end{pmatrix}D^{-1}\Sigma D^{-1}\begin{pmatrix}0&0\\ 0&I_{d_u}\end{pmatrix}(R_S D-I)^T\right\}\mu_\beta, \tag{3.2} \]
which is an approximately unbiased estimator of the mean squared error when $\sqrt{n}\mu_0$ is estimated by $\sqrt{n}\hat\mu_S$. This FIC can be used for choosing a proper submodel relying on the parameter of interest.

3.2. Frequentist model averaging. As mentioned previously, an average estimator is an alternative to a model selection estimator. There are at least two advantages to the use of an average
estimator. First, an average estimator often reduces mean square error in estimation because it avoids ignoring useful information from the form of the relationship between response and covariates and it provides a kind of insurance against selecting a very poor submodel. Second, model averaging procedures can be more stable than model selection, for which small changes in the data often lead to a significant change in model choice. Similar discussions of this issue appear in Bates and Granger (1969) and Leung and Barron (2006).

By choosing a submodel with the minimum value of FIC, the FIC estimators of $\mu$ can be written as $\hat\mu_{\mathrm{FIC}}=\sum_S I(\text{FIC selects the } S\text{th submodel})\,\hat\mu_S$, where $I(\cdot)$, an indicator function, can be thought of as a weight function depending on the data via $\hat\delta$, yet it just takes value either 0 or 1. To smooth estimators across submodels, we may formulate the model average estimator of $\mu$ as
\[ \hat\mu=\sum_S w(S|\hat\delta)\,\hat\mu_S, \tag{3.3} \]
where the weights $w(S|\hat\delta)$ take values in the interval $[0,1]$ and their sum equals 1. It is readily seen that smoothed AIC, BIC and FIC estimators investigated in Hjort and Claeskens (2003) and Claeskens and Carroll (2007) share this form. The following theorem shows an asymptotic property for the general model average estimators $\hat\mu$ defined in (3.3) under certain conditions.

THEOREM 3. Under the local misspecification framework and conditions (C1)–(C11) in the Appendix, if the weight functions have at most a countable number of discontinuities, then
\[ \sqrt{n}(\hat\mu-\mu_0)=-\mu_\beta^T D^{-1}G_n+\mu_\beta^T\left\{Q(\hat\delta)\begin{pmatrix}0\\ \hat\delta\end{pmatrix}-\begin{pmatrix}0\\ \hat\delta\end{pmatrix}\right\}+o_p(1)\ \xrightarrow{d}\ \Lambda\equiv-\mu_\beta^T D^{-1}G+\mu_\beta^T\left\{Q(\Delta)\begin{pmatrix}0\\ \Delta\end{pmatrix}-\begin{pmatrix}0\\ \Delta\end{pmatrix}\right\}, \]
where $Q(\cdot)=\sum_S w(S|\cdot)R_S D$ and $\Delta$ is defined in Section 3.1.

Referring to the above theorems, we construct a confidence interval for $\mu$ based on the model average estimator $\hat\mu$, as follows. Assume that $\hat\kappa^2$ is a consistent estimator of $\mu_\beta^T D^{-1}\Sigma D^{-1}\mu_\beta$. It is easily seen that
\[ \left[\sqrt{n}(\hat\mu-\mu_0)-\mu_\beta^T\left\{Q(\hat\delta)\begin{pmatrix}0\\ \hat\delta\end{pmatrix}-\begin{pmatrix}0\\ \hat\delta\end{pmatrix}\right\}\right]\Big/\hat\kappa\ \xrightarrow{d}\ N(0,1). \]
If we define the lower bound ($\mathrm{low}_n$) and upper bound ($\mathrm{up}_n$) by
\[ \hat\mu-\mu_\beta^T\left\{Q(\hat\delta)\begin{pmatrix}0\\ \hat\delta\end{pmatrix}-\begin{pmatrix}0\\ \hat\delta\end{pmatrix}\right\}\Big/\sqrt{n}\ \mp\ z_j\hat\kappa/\sqrt{n}, \tag{3.4} \]
where $z_j$ is the $j$th standard
normal quantile, then we have $\Pr\{\mu_0\in(\mathrm{low}_n,\mathrm{up}_n)\}\to 2\Phi(z_j)-1$, where $\Phi(\cdot)$ is a standard normal distribution function. Therefore, the interval $(\mathrm{low}_n,\mathrm{up}_n)$ can be used as a confidence interval for $\mu_0$ with asymptotic level $2\Phi(z_j)-1$.

REMARK 3. Note that the limit distribution of $\sqrt{n}(\hat\mu-\mu_0)$ is a nonlinear mixture of several normal variables. As argued in Hjort and Claeskens (2006), a direct construction of a confidence interval based on Theorem 3 may not be easy. The confidence interval based on (3.4) is better in terms of coverage probability and computational simplicity, as promoted in Hjort and Claeskens (2003) and advocated by Claeskens and Carroll (2007).

REMARK 4. A referee has asked whether the focus parameter can depend on the nonparametric function $\eta_0$. Our answer is "yes." For instance, we consider a general focus parameter, $\eta_0(x)+\mu_0$, a sum of $\mu_0$, which we have studied, and a nonparametric value at $x$. We may continue to get an estimator of $\eta_0(x)+\mu_0$ by minimizing (3.2) and then model-averaging estimators by weighting the estimators of $\mu_0$ and $\eta_0$ as in (3.3). However, the underlying FMA estimators are not root-$n$ consistent because the bias of these estimators is proportional to the bias of the estimators of $\eta_0$, which is larger than $n^{-1/2}$, whereas we can establish their rates of convergence using easier arguments than those employed in the proof of Theorem 3. Even though the focus parameters generally depend on $\mu_0$ and $\eta_0$ of form $H(\mu_0,\eta_0)$ for a given function $H(\cdot,\cdot)$, the proposed method can be still applied. However, to develop asymptotic properties for the corresponding FMA estimators depends on the form of $H(\cdot,\cdot)$ and will require further investigation.
We omit the details. Our numerical studies below follow these proposals when the focus parameters are related to the nonparametric functions.

4. Simulation study. We generated 1000 data sets consisting of $n=200$ and 400 observations from the GAPLM
\[ \mathrm{logit}\{\Pr(Y_i=1)\}=\eta_1(X_{i,1})+\eta_2(X_{i,2})+Z_i^T\beta=\sin(2\pi X_{i,1})+5X_{i,2}^4+3X_{i,2}^2-2+Z_i^T\beta,\qquad i=1,\ldots,n, \]
where: the true parameter $\beta=\{1.5,2,r_0(2,1,3)/\sqrt{n}\}^T$; $X_{i,1}$ and $X_{i,2}$ are independently uniformly distributed on $[0,1]$; $Z_{i,1},\ldots,Z_{i,5}$ are normally distributed with mean 0 and variance 1; when $\ell_1\neq\ell_2$, the correlation between $Z_{i,\ell_1}$ and $Z_{i,\ell_2}$ is $\rho^{|\ell_1-\ell_2|}$ with $\rho=0$ or $\rho=0.5$; $Z_i$ is independent of $X_{i,1}$ and $X_{i,2}$. We set the first two components of $\beta$ to be in all submodels. The other three may or may not be present, so we have $2^3=8$ submodels to be selected or averaged across. $r_0$ varies from 1 or 4 to 7. Our focus parameters are (i) $\mu_1=\beta_1$, (ii) $\mu_2=\beta_2$, (iii) $\mu_3=0.75\beta_1+0.05\beta_2-0.3\beta_3+0.1\beta_4-0.06\beta_5$ and (iv) $\mu_4=\eta_1(0.86)+\eta_2(0.53)+0.32\beta_1-0.87\beta_2-0.33\beta_3-0.15\beta_4+0.13\beta_5$.

The cubic B-splines have been used to approximate the two nonparametric functions. We propose to select $J_n$ using a BIC procedure. Based on condition (C6), the optimal order of $J_n$ can be found in the range $(n^{1/(2\upsilon)},n^{1/3})$. Thus, we propose to choose the optimal knot number, $J_n$, from a neighborhood of $n^{1/5.5}$. For our numerical examples, we have used $[2/3\,N_r,\ 4/3\,N_r]$, where $N_r=\mathrm{ceiling}(n^{1/5.5})$ and the function $\mathrm{ceiling}(\cdot)$ returns the smallest integer not less than the corresponding element. Under the full model, let the log-likelihood function be $l_n(N_n)$. The optimal knot number, $N_n^{\mathrm{opt}}$, is then the one which minimizes the BIC value. That is,
\[ N_n^{\mathrm{opt}}=\arg\min_{N_n\in[2/3\,N_r,\ 4/3\,N_r]}\{-2l_n(N_n)+q_n\log n\}, \tag{4.1} \]
where $q_n$ is the total number of parameters.

Four model selection or model averaging methods are compared in this simulation: AIC, BIC, FIC and the smoothed FIC (S-FIC). The smoothed FIC weights we have used are
\[ w(S|\hat\delta)=\exp\left\{-\frac{\mathrm{FIC}_S}{\mu_\beta^T D^{-1}\Sigma D^{-1}\mu_\beta}\right\}\Big/\sum_{\text{all } S'}\exp\left\{-\frac{\mathrm{FIC}_{S'}}{\mu_\beta^T D^{-1}\Sigma D^{-1}\mu_\beta}\right\}, \]
a case of expression
(5.4) in Hjort and Claeskens (2003). When using the FIC or S-FIC method, we estimate $D^{-1}\Sigma D^{-1}$ by the covariance matrix of $\hat\beta_{\mathrm{full}}$ and estimate $D$ by its sample mean, as advocated by Hjort and Claeskens (2003) and Claeskens and Carroll (2007). Thus, $\Sigma$ can be calculated straightforwardly. Note that the subscript "full" denotes the estimator using the full model.

In this simulation, one of our purposes is to see whether the traditional selection methods like AIC and BIC lead to an overly optimistic coverage probability (CP) of a claimed confidence interval (CI). We consider a claimed 95% confidence interval. The other purpose is to check the accuracy of estimators in terms of their
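The knot-count range and the smoothed FIC weights described in this simulation section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the inward rounding of the knot range to integers, and every number in the usage lines are hypothetical and not taken from the paper; `kappa2` stands for an estimate of the scaling term $\mu_\beta^T D^{-1}\Sigma D^{-1}\mu_\beta$.

```python
import math


def knot_range(n):
    """Candidate knot counts in [2/3 N_r, 4/3 N_r], with N_r = ceiling(n**(1/5.5)).

    Assumption: the interval endpoints are rounded inward to integers.
    """
    n_r = math.ceil(n ** (1 / 5.5))
    low = math.ceil(2 * n_r / 3)
    high = math.floor(4 * n_r / 3)
    return list(range(low, high + 1))


def sfic_weights(fic, kappa2):
    """Smoothed FIC weights: w_S proportional to exp(-FIC_S / kappa2)."""
    scores = [-f / kappa2 for f in fic]
    shift = max(scores)  # subtracting the max leaves the ratios unchanged and avoids overflow
    expo = [math.exp(s - shift) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]


def model_average(mu_hat, weights):
    """Model average estimate: sum over submodels of w_S * mu_hat_S, as in (3.3)."""
    return sum(w * m for w, m in zip(weights, mu_hat))


# Hypothetical FIC values and submodel estimates, for illustration only.
fic_values = [1.0, 2.0, 4.0]
mu_estimates = [1.4, 1.6, 1.9]
weights = sfic_weights(fic_values, kappa2=2.0)
mu_bar = model_average(mu_estimates, weights)
```

Submodels with smaller FIC receive larger weight, and the plain FIC estimator corresponds to putting weight 1 on the submodel with the smallest FIC value.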
Higher Spin Gauge Theories in Various Dimensions

arXiv:hep-th/0401177v5 13 Apr 2004

Higher Spin Gauge Theories in Various Dimensions

M.A. Vasiliev

I.E. Tamm Department of Theoretical Physics, Lebedev Physical Institute, Leninsky prospect 53, 119991, Moscow, Russia

Abstract: Properties of nonlinear higher spin gauge theories of totally symmetric massless higher spin fields in anti-de Sitter space of any dimension are discussed with the emphasis on the general aspects of the approach.

1 Introduction

As shown by Fronsdal [1], an integer-spin massless spin-$s$ field is described by a totally symmetric tensor $\varphi_{n_1\ldots n_s}$ ($m,n,\ldots=0,\ldots,d-1$ are $d$-dimensional vector indices) subject to the double tracelessness condition $\varphi^{r}{}_{r}{}^{k}{}_{k n_5\ldots n_s}=0$, which is nontrivial for $s\ge 4$. The quadratic action for a free spin $s$ field $\varphi_{n_1\ldots n_s}$ is fixed up to an overall factor in the form $S_s=\varphi L\varphi$ with some second order differential operator $L$ by the condition of gauge invariance under the Abelian gauge transformations $\delta\varphi_{n_1\ldots n_s}=\partial_{\{n_1}\varepsilon_{n_2\ldots n_s\}}$ with symmetric traceless tensor parameters $\varepsilon_{n_1\ldots n_{s-1}}$, $\varepsilon^{r}{}_{r n_3\ldots n_{s-1}}=0$. It is the higher spin (HS) gauge symmetry principle that makes the HS gauge theories interesting and perhaps fundamental.

The free HS gauge theories extend the linearized theories of electromagnetism (spin 1) and gravity (spin 2) in a uniform way. The original Fronsdal theory and its more geometric versions [2,3] generalize the metric formulation of gravity. The HS generalization of Cartan formulation of gravity with the HS fields described in terms of the frame-like 1-forms was proposed in [4,5]. Uniformity of geometric formulations of HS fields raises the question whether there exists an underlying nonlinear HS gauge theory which in the free field limit gives rise to the free Fronsdal Lagrangians.

There are a number of motivations for studying HS gauge theories. From the supergravity perspective, this is interesting because theories with HS fields may have more supersymmetries than the "maximal" supergravities with 32 supercharges. Recall
that the limitation that the number of supercharges is $\le 32$ is a direct consequence of the requirement that $s\le 2$ for all fields in a supermultiplet (see e.g. [6]). From the superstring perspective, the most obvious motivation is due to Stueckelberg symmetries in the string field theory [7], which have a form of spontaneously broken HS gauge symmetries. Whatever a symmetric phase of the superstring theory is, Stueckelberg symmetries are expected to become unbroken HS symmetries in such a phase and the superstring field theory has to become one or another version of the HS gauge theory. An important indication in the same direction is that string amplitudes exhibit certain symmetries in the high-energy limit equivalent to the string mass parameter tending to zero [8].

An unusual feature of interacting HS gauge theories is that unbroken HS gauge symmetries do not allow flat space-time as a vacuum solution, requiring nonzero curvature [9]. Anti-de Sitter (AdS) space is the most symmetric vacuum of this type. This property may admit interpretation [10,11,12] in the context of the AdS/CFT correspondence conjecture [13].
In particular, it was conjectured [10,11] that HS gauge theories in AdS bulk are dual to some conformal models on the AdS boundary in the large $N$ limit with $g^2N\to 0$ where $g^2$ is the boundary coupling constant. Again, this indicates that HS gauge theory has a good chance to be related to a symmetric phase of superstring theory. On the other hand, a reason why the HS gauge theory may be hard to observe in superstring theory may be that a quantum formulation of the latter is still not available in the AdS background despite the progress achieved at the classical level [14].

Whatever a motivation is, the HS problem is to find any nonlinear theory such that

• In the free field limit it contains a set of Fronsdal fields (with correct signs of kinetic terms) plus, maybe, some other fields that admit consistent quantization (e.g., mixed symmetry massless fields which exist in $d>4$ [15]).

• HS gauge symmetries are unbroken in a nonlinear HS theory and are deformed to some non-Abelian symmetry.

The first condition rules out ghosts¹. The condition that a HS symmetry is non-Abelian avoids a trivial possibility of building a nonlinear theory with undeformed Abelian HS symmetries by adding powers of gauge invariant HS field strengths to the Fronsdal action like, for example, adding powers of the Maxwell field strengths to a collection of free Maxwell actions instead of deforming it to the Yang-Mills action. For the HS models we discuss, this condition is satisfied as a result of the manifest invariance under diffeomorphisms. A structure of the HS symmetry is one of the key elements of the theory.

Although being absolutely minimal, these conditions are so restrictive that they were believed for a long period to admit no solution at all. One argument was due to the Coleman-Mandula type no-go theorems [17] which state that any S matrix in flat space-time that has a symmetry larger than a (semi)direct product of usual space-time (super)symmetries and inner symmetries is trivial ($S=\mathrm{Id}$). Since no scattering means no
interactions, this sounds like no theory with non-Abelian HS global symmetries can exist.

An alternative test was provided by the attempt [18] to introduce interactions of a HS gauge field with gravity. It is straightforward to see that the standard covariantization procedure $\partial\to D=\partial-\Gamma$ breaks down the invariance under the HS gauge transformations because, in order to prove invariance of the action $S_s$, one has to commute derivatives, while the commutator of the covariant derivatives is proportional to the Riemann tensor, $[D_{\ldots},D_{\ldots}]=\mathcal{R}_{\ldots}$. As a result, the gauge variation of the covariantized action $S_s^{cov}$ is different from zero, having the structure
\[ \delta S_s^{cov}=\int\mathcal{R}_{\ldots}(\varepsilon_{\ldots}D\varphi_{\ldots})\neq 0. \tag{1} \]
It seems difficult to cancel these terms because for $s>2$ they contain the Weyl tensor part of $\mathcal{R}$ which cannot be compensated by a transformation of the gravitational field.

On the other hand, a number of indications on the existence of some consistent interactions of the HS fields were found both in the light-cone [19] and in the covariant approach [20], giving strong evidence that some fundamental HS gauge theory must exist. In these works, the problem was considered in flat space. Somewhat later it was realized [9] that the situation improves drastically once the problem is analyzed in the AdS space with nonzero curvature $\Lambda$. This allowed constructing consistent 4d HS-gravitational interactions in the cubic order at the action level [9] and, later, in all orders in interactions at the level of equations of motion [21]. Recently, the 4d results of [21] were generalized to any space-time dimension [22].

The role of AdS background in HS gauge theories is important in many respects. In particular it cancels the Coleman-Mandula argument because AdS space admits no S-matrix, and fits naturally the AdS/CFT correspondence conjecture. From the technical side, the dimensionful cosmological constant allows new types of HS interactions with higher derivatives, which resolve the problem with HS-gravitational interactions as follows.
The Riemann tensor R nm,kl is not small near AdS background butR nm,kl=R nm,kl−Λ(g nk g ml−g nl g mk),(2) whereΛis the cosmological constant,g mn is the background AdS metric tensor and R is a deviation of the Riemann tensor from the AdS background curvature.Expanding around the AdS geometry is therefore equivalent to expanding in powers of R rather than in powers of the Riemann tensor R.The crucial difference compared to theflat space is that the commutator of covariant derivatives in the AdS space[D n,D m]∼Λ(3) is not small for generalΛ.With nonzeroΛone can add to the action some cubic terms schematically written in the formS int= p,qα(p,q)Λ−12D p and D q denote here some combinations of derivatives of orders p and q,respectively.3Properties of HS gauge theories are to large extent determined by HS global symmetries of their most symmetric vacua.HS symmetry restricts interactions andfixes spectra of spins of masslessfields in HS theories as ordinary supersymmetry does in supergravity. To elucidate the structure of a global HS(super)algebra h it is useful to use the approach in whichfields,action and transformation laws are formulated in terms of the gaugefields of h.An attractive feature of this approach,which generalizes the MacDowell-Mansouri-Stelle-West approach[23,24]to gravity,is that it treats allfields as differential forms with the gravitationalfield being on equal footing with other masslessfields in a HS multiplet. 
The only special property of the metric tensor is that it has a nonzero vacuum expectation value, allowing a meaningful linearized approximation for all fields in the model.

In section 2 we recall the MacDowell-Mansouri-Stelle-West approach to gravity. Then in section 3 we show, following [5], how free HS fields can be reformulated in terms of differential forms to be interpreted as gauge connections. The non-Abelian HS algebra is defined in section 4 in terms of a star-product algebra. We then describe the unfolded formulation of free field HS dynamics in section 5 and formulate nonlinear HS equations in section 6. Some conclusions and perspectives are summarized in the Conclusion.

2 Gravity as o(d−1,2) gauge theory

Our approach to higher spins generalizes the MacDowell-Mansouri-Stelle-West [23,24] formulation of gravity as an o(d−1,2) gauge theory. The key observation is that the frame 1-form $e^a(x) = dx^n e_n{}^a(x)$ and the Lorentz connection $\omega^{ab}(x) = dx^n \omega_n{}^{ab}(x)$ can be interpreted as components of the o(d−1,2) connection 1-form $\omega^{AB}(x) = dx^n \omega_n{}^{AB}(x)$ (a, b, ... = 0, ..., d−1 are fiber Lorentz vector indices and A, B, ... = 0, ..., d are fiber o(d−1,2) vector indices). The Lorentz subalgebra o(d−1,1) ⊂ o(d−1,2) is identified with the stability subalgebra of some vector $V^A$. Since Lorentz symmetry is local, this vector can be chosen differently at different points of space-time, thus becoming a field $V^A = V^A(x)$.
It is convenient to relate its norm to the cosmological constant so that V A has dimension of length3V A V A=−Λ−1.(5) This allows for a covariant definition of the framefield and Lorentz connection[24]E A=D(V A)≡dV A+ωAB V B,ωL AB=ωAB+Λ(E A V B−E B V A).(6) According to these definitions,E A V A≡0,D L V A=dV A+ωL AB V B≡0.The theory is formulated in a way independent of a particular choice of V A.The simplest choice is with V A being a constant vector pointing at the(d+1)th direction,i.e.,V A=|Λ|−1/2δA d.(7) The Lorentz directions are those orthogonal to V A:V A A A=0→A A=A a.In this “standard gauge”the Lorentz connection isωab,and e a=ωaB V B.When the frame E A n has the maximal rank d,it gives rise to the nondegenerate metric tensor g nm=E A n E B mηAB.The o(d−1,2)Yang-Millsfield strength isR AB=dωAB+ωA C∧ωCB.(8) It can be decomposed into the torsion part R A≡DE A≡R AB V B and the V–transversal Lorentz part.In the standard gauge(7)they identify with the torsion tensor and Riemann tensor shifted by the terms bilinear in the frame1-formR a=de a+ωa b∧e b,R ab=dωab+ωa c∧ωcb+Λe a∧e b.(9) The zero-torsion condition R a=0expresses the Lorentz connection via the framefield in the usual manner.Provided that the metric tensor is nondegenerate,anyfieldωsatis-fying the zero-curvature equation R AB=0describes(A)dS d space with the cosmological constantΛ,AdS d:R AB=0,rank|E A n|=d.(10) The action of Stelle and West[24]for4d gravity is1S=−o(d−1,2)/o(d−1,1).The resulting theory turns out to be expressed in terms of the frame field and Lorentz connection contained inωAB and,as it should,is manifestly invariant under local Lorentz transformations and diffeomorphisms.Alternatively,expressing the covariantized diffeomorphism parameters from(15)ξn g nm=E mAεAB V B(16) and using(14),one can interpret the mixture of diffeomorphisms and transformations from o(d−1,2)/o(d−1,1),that leave invariant V A,as a deformation of the transformation law(12)for the connectionωAB by some R-dependent 
terms.With the help of V A it is straightforward to write a d–dimensional generalization[5]of the MacDowell-Mansouri-Stelle-West action1S=−3Higher spin gauge fieldsIn the spin two case,the formulation in terms of gauge connections results from the extension g nm −→{e a n ,ωab n }−→ωAB n .It has the following generalization to any spin s ≥1:ϕn 1...n s −→{e n a 1...a s −1,ωn a 1...a s −1,b 1...b tt =1,2...s −1}−→ωA 1...A s −1,B 1...B s −1n .(19)The first arrow is for the equivalent free field reformulation [4]of the Fronsdal dynamics in terms of the set of 1-forms dx n ωn a 1...a s −1,b 1...b t (0≤t ≤s −1)which contain the frame type dynamical field e n a 1...a s −1=ωn a 1...a s −1(t =0)and the generalized Lorentz connections ωa 1...a s −1,b 1...b t (t >0).The connections ωa 1...a s −1,b 1...b t are symmetric in the fiber Lorentz vector indices a i and b j separately,satisfy the antisymmetry conditionωn a 1...a s −1,a s b 2...b t =0(20)implying that symmetrization over any s fiber indices gives zero,and are traceless with respect to the fiber indices ωn a 1...a s −3c c ,b 1...b t =0,ωn a 1...a s −2c,c b 2...b t =0,ωn a 1...a s −1,c cb 3...b t =0.The HS gauge fields associated with the spin s massless field therefore take values in the direct sum of all irreducible representations of the d -dimensional massless Lorentz group o (d −1,1)described by the Young diagrams with at most two rows such that the longest row has length s −1s −1t .Analogously to the relationship between metric and frame formulations of the linearized gravity,the totally symmetric double traceless HS fields used to describe the HS dynamics in the metric type formalism [1,2]identify with the symmetrized part ϕa 1...a s =ω{a 1...a s }of the frame type field ωn a 1...a s −1.The antisymmetric part in ωn a 1...a s −1can be gauge fixed to zero with the aid of the generalized HS Lorentz symmetries with the parameter ǫa 1...a s −1,b .That ϕa 1...a s is double traceless is a consequence of the tracelessness 
of ωn a 1...a s −1in the indices a i .The generalized Lorentz connections ωn a (s −1),b (t )with t >0are auxiliary fields expressed through order-t derivatives of the dynamical frame-like field by certain constrains,ωn a (s −1),b (t )∼ 1Λ∂whereωAB0is the background AdS d gaugefield satisfying theflatness condition D20=0 (10)which guarantees that the linearized curvature(22)is invariant under the Abelian HS gauge transformations with the HS gauge parametersǫA1...A s−1,B1...B s−1δω1A1...A s−1,B1...B s−1=D0ǫA1...A s−1,B1...B s−1.(23) The o(d−1,2)covariant form of the free action for a massless spin sfield is[5] S s2=1(s−p−2)!.(25) The coefficients a(s,p)arefixed up to an overall spin-dependent factor˜a(s)by the“extra field decoupling condition”that the variation of the free action(24)is different from zeroonly for thefieldsωnA1...A s−1,B1=ωnA1...A s−1,B1...B s−1V B2...V B s−1which contain the frametype dynamical HSfieldωn a1...a s−1and the Lorentz type auxiliaryfieldωn a1...a s−1,b,which is expressed in terms of the frame typefield by virtue of its equation of motion equiva-lent to the“zero torsion condition”R1A1...A s−1,B1...B s−1V B1...V B s−1=0.Insertion of theexpression forωn a1...a s−1,b into(24)gives rise to the HS action expressed entirely(mod-ulo total derivatives)in terms ofωn a1...a s−1and itsfirst derivatives.Since the linearized curvature(22)is invariant under the Abelian HS gauge transformations(23)the result-ing action has necessary HS gauge symmetries and,because of the extrafield decoupling condition,describes correctly the freefield HS dynamics in AdS d.In particular,the gen-eralized Lorentz-like transformations with the gauge parameterǫA1...A s−1,B1(x)guaranteethat only the totally symmetric Fronsdal partϕa1...a s =ω{a1...a s}of the frame type gaugefield contributes to the action.4Higher spin algebrasOnce dynamics of totally symmetric HS gaugefields is shown to be described by1-forms taking values in the two-row rectangular Young tableaux of 
o(d−1,2),this suggests that aAdS d HS algebra h admits a basis formed by a set of elements T A1...A n,B1...B n ,which satisfythe properties analogous to(21),T{A1...A s−1,A s}B2...B s−1=0,T A1...A s−3C C,B1...B s−1=0,andcontains the o(d−1,2)basis elements T A,B=−T B,A such that[T C,D,T A1...A s−1,B1...B s−1]=ηDA1T CA2...A s−1,B1...B s−1+ (26)The question is whether there exists a non-Abelian algebra h with these properties.If yes,the Abelian curvatures R1(22)can be understood as resulting from the linearization of the non-Abelianfield curvatures R of h with the h gauge connection˜ω=ω0+ω,where ω0is somefixedflat zero-order connection of the AdS d subalgebra o(d−1,2)⊂h andωis thefirst-order dynamical part which describes masslessfields of various spins.According8to the discussion of section 2,anyh of this class is a candidate for a global HS algebra of the symmetric vacuum of a HS theory.The existence of such an algebra was indicated by the results of [25]where conserved currents (and therefore charges)in the free massless scalar field theory were shown to be described by various traceless two-row rectangular Young tableaux of the conformal algebra.A formal definition of h as a conformal HS algebra of symmetries of a scalar field theory in d −1dimension was given by Eastwood in [26].In this paper we use a slightly different definition of h which is more suitable for the analysis of the HS interactions.Consider oscillators Y A i with i =1,2satisfying the commutation relations[Y A i ,Y B j ]∗=εij ηAB ,εij =−εji ,ε12=1,(27)where ηAB is the invariant metric of o (d −1,2).(These oscillators can be interpreted as conjugated coordinates and momenta Y A 1=P A ,Y B 2=Y B .)ηAB and εij are used to raise and lower indices in the usual manner A A =ηAB A B ,a i =εij a j ,a i =a j εji .We use the Weyl (Moyal)star product(f ∗g )(Y )=12Y iA Y B i ,t ij =t ji =Y A i Y jA [T A,B ,t ij ]∗=0.(30)Consider the subalgebra S ∈A d +1spanned by the sp (2)singlets f (Y )f ∈S :[t ij ,f (Y 
)]∗=0.(31)Eq.(31)is equivalent to Y Ai ∂Y Ai f (Y )=0.For the expansion (29)this con-dition implies that the coefficients f A 1...A m ,B 1...B n are nonzero only if n =m and that symmetrization over any m +1indices of f A 1...A m ,B 1...B m gives zero,i.e.f A 1...A m ,B 1...B m has the symmetry properties of a two-row rectangular Young tableau.The algebra S is not simple.It contains the two-sided ideal I spanned by the elements of the form g =t ij ∗g ij ,where g ij transforms as a symmetric tensor with respect to sp (2),i.e.,[t ij ,g kl ]∗=δk i g j l +δk j g i l +δl i g j k +δl j g i k .(Note that t ij ∗g ij =g ij ∗t ij .)Actually,from (31)it follows that f ∗g,g ∗f ∈I ∀f ∈S ,g ∈I .Due to the definition (30)of t ij ,the ideal I contains all traces of the two-row Young tableaux.As a result,the algebra S/I has only traceless two-row tableaux in the expansion (29).9Now consider the Lie algebra with the commutator in S/I as the product law.Its real form corresponding to a unitary HS theory in AdS d is called hu(1/sp(2)[d−1,2])[22]. 
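The sp(2) singlet condition (31) can be illustrated with a small check. The sketch below is only an illustration under stated assumptions: it assumes that the star-commutator with the bilinear generators $t_{ij}$ acts on a polynomial f(Y) as a first-order operator that rotates $Y_1$ and $Y_2$ into each other (the usual behaviour of bilinears under a Weyl star product), and it represents polynomials as plain dictionaries. The names `monomial`, `combine`, `rotate` and `is_sp2_singlet`, and the choice of three index values, are hypothetical and not taken from the text.

```python
from collections import defaultdict

D = 3  # illustrative number of values taken by the vector index A (assumption)

def var(i, A):
    """Position of the oscillator Y_i^A (i = 1, 2) in an exponent tuple."""
    return (i - 1) * D + A

def monomial(*factors):
    """Monomial built from factors (i, A), stored as {exponents: coefficient}."""
    exps = [0] * (2 * D)
    for i, A in factors:
        exps[var(i, A)] += 1
    return {tuple(exps): 1}

def combine(p, q, sign=1):
    """Return p + sign * q, dropping terms whose coefficient cancels."""
    out = defaultdict(int)
    for e, c in p.items():
        out[e] += c
    for e, c in q.items():
        out[e] += sign * c
    return {e: c for e, c in out.items() if c != 0}

def rotate(p, i, j):
    """Apply the operator sum_A Y_i^A d/dY_j^A to the polynomial p."""
    out = defaultdict(int)
    for e, c in p.items():
        for A in range(D):
            n = e[var(j, A)]
            if n:
                new = list(e)
                new[var(j, A)] -= 1
                new[var(i, A)] += 1
                out[tuple(new)] += c * n
    return {e: c for e, c in out.items() if c != 0}

def is_sp2_singlet(p):
    """True if p is annihilated by the three sl(2) = sp(2) operators."""
    raising = rotate(p, 1, 2)
    lowering = rotate(p, 2, 1)
    weight = combine(rotate(p, 1, 1), rotate(p, 2, 2), sign=-1)
    return not (raising or lowering or weight)

# Antisymmetric bilinear Y_1^0 Y_2^1 - Y_1^1 Y_2^0 (the shape of an o(d-1,2) generator)
T01 = combine(monomial((1, 0), (2, 1)), monomial((1, 1), (2, 0)), sign=-1)
# Symmetric bilinear Y_1^0 Y_2^1 + Y_1^1 Y_2^0 (fails the Young symmetry condition)
S01 = combine(monomial((1, 0), (2, 1)), monomial((1, 1), (2, 0)))
# Bilinear in Y_1 only (unequal numbers of Y_1 and Y_2, i.e. n != m)
Y1Y1 = monomial((1, 0), (1, 1))

print(is_sp2_singlet(T01), is_sp2_singlet(S01), is_sp2_singlet(Y1Y1))
```

In this toy setting only the antisymmetric combination survives, which matches the statement that singlets carry equal numbers of $Y_1$ and $Y_2$ and vanish under symmetrization over m+1 indices.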
Note that,by construction,the AdS d algebra o(d−1,2)with the generators T A,B is the subalgebra of hu(1/sp(2)[n,m]).The gaugefields of hu(1/sp(2)[d−1,2])areω(Y|x)=∞l=0ωA1...A l,B1...B l(x)Y A11...Y A l1Y B12...Y B l2(32)with the component gaugefieldsωA1...A l,B1...B l (x)taking values in all traceless two-rowrectangular Young tableaux of o(d−1,2).Note that,because dt ij=0,the sp(2)invariance condition,which imposes the Young symmetry properties,can be written in the covariant formD(t ij)≡dt ij+[ω,t ij]∗=0.(33) The HS curvatures and gauge transformations have the standard Yang-Mills formR(Y|x)=dω(Y|x)+ω(Y|x)∧∗ω(Y|x),(34)δω(Y|x)=Dε(Y|x),Dε(Y|x)=dε(Y|x)+[ω(Y|x),ε(Y|x)]∗.(35) Different spins correspond to irreducible representations of o(d−1,2)spanned by homogeneous polynomialsω(µY|x)=µ2(s−1)ω(Y|x).(36) (Note that one unit of spin is carried by the1-form index).In particular,spin1is described by a1-formω(x)=dx nωn(x).The algebra hu(1/sp(2)[d−1,2])is infinite dimensional.It contains o(d−1,2)⊕u(1)as the maximalfinite dimensional subalgebra with the generators T AB(30)for o(d−1,2)and constants for u(1).The corresponding gaugefields carry spins2and1,respectively.Taking two HS symmetry parametersεs1 andεs2,being polynomials of degrees s1−1and s2−1,respectively,one obtains[εs1,εs2]∗=s1+s2−2t=|s1−s2|+1εt.(37)Thus,once a spin s>2gaugefield appears,the HS symmetry algebra requires an infinite tower of HS gaugefields to be present.The barrier s≤2separates theories with infinite dimensional gauge symmetries from those with usual lower spin symmetries.It is tempting to speculate that the latter result from the spontaneous breaking of infinite dimensional HS symmetries down to usual lower spin symmetries.In that case,the HS gaugefields should acquire masses as a result of this spontaneous HS symmetry breaking.The formula(37)manifests the quantum-mechanical nonlocality of the oscillator alge-bra(27)(equivalently,the star product(28)).Because bilinear terms in the HS curvatures 
(34)describe interactions,one concludes that lower spins form sources for higher spins and vice versa.A less obvious fact,which follows from the HSfield equations,is that the non-local character of the star product algebra results in the appearance of higher space-time derivatives in the HS interactions.Thus the star-product origin of the HS algebra links10together such seemingly different properties of the HS theories as the relevance of the AdS background,necessity of introducing infinitely many spins and space-time non-locality of the HS interactions.Note that these properties make the HS theories reminiscent of the superstring theory with the analogy between the cosmological constant andα′.To introduce inner symmetries,one considers following[27]matrix-valued gaugefields ω(Y|x)−→ωνµ(Y|x),µ,ν...=1,...,p,imposing the reality condition[ωνµ(Y|x)]†=−ωνµ(Y|x),(38) where the involution†combines matrix hermitian conjugation with the involution of the star product algebra(Y A j)†=iY A j.The resulting real Lie algebra is called hu(p|sp(2)|[d−1,2]).Its gaugefields describe the set of masslessfields of all spins s≥1which take values in the adjoint representation of u(p).In particular,spin1gaugefields are u(p) Yang-Millsfields.Combining the antiautomorphism of the star product algebraρ(f(Y))=f(iY)with some antiautomorphism of the matrix algebra generated by a nondegenerate formραβone can impose the conditions[27]ωαβ(Y|x)=−ρβγρδαωγδ(iY|x),(39) which truncate the original system to the one with the Yang-Mills gauge group USp(p) or O(p)depending on whether the formραβis antisymmetric or symmetric,respectively. 
The corresponding global HS symmetry algebras are called husp(p|sp(2)[d−1,2]) and ho(p|sp(2)[d−1,2]), respectively. In this case all fields of odd spins take values in the adjoint representation of the Yang-Mills group, while fields of even spins take values in the second rank tensor representation of the opposite symmetry (i.e., symmetric for O(p) and antisymmetric for USp(p)), which contains a singlet. The graviton is always a color singlet. For general p, however, colored spin 2 particles also appear. Note that this does not contradict the no-go results of [28] because the theory under consideration does not allow a flat limit with unbroken HS and color spin 2 symmetries.

The minimal HS theory is based on the algebra ho(1|sp(2)[n,m]). It describes even spin particles, each in one copy. (Odd spins do not appear because the adjoint representation of o(1) is trivial.)

5 Unfolded higher spin dynamics

An efficient approach to HS dynamics consists of reformulating the field equations as generalized covariant constancy conditions, first at the linear and then at the nonlinear level. This "unfolded formulation", originally introduced for the description of 4d HS dynamics in [29], allows one to control simultaneously the formal consistency of field equations, gauge symmetries and the invariance under diffeomorphisms. The unfolded formulation treats higher derivatives of the dynamical fields uniformly and is well suited to the analysis of HS dynamics because HS symmetries mix higher derivatives of the dynamical fields. At the same time, the unfolded formulation is a universal tool applicable to any dynamical system, although it may look unusual from the perspective of standard field theory because it operates in terms of infinite dimensional modules that describe all degrees of freedom of the system in question.

To illustrate the idea, let us recall the unfolded formulation of Einstein gravity, using the compensator formalism. As explained in section 2, the Riemann tensor and torsion
tensor are components of the o(d−1,2)field strength(8)R AB=dx n∧dx m R nm AB.The components of the Riemann tensor,that can be nonzero when Einstein equations and zero-torsion constraints are satisfied,belong to the Weyl tensor,i.e.Einstein equations with the cosmological term can be rewritten asR A,B o.m.s.=E C∧E D C AC,BD,(40) where C AC,BD is treated as an independent tensorfield variable that has the symmetry properties of the window Young tableau and describes the Weyl tensor.For our purpose it is convenient to use the symmetric basis with C AC,BD=C CA,BD=C AC,DB. In addition,C AC,BD has the following propertieszero-torsion constraint:V A C AB,CD=0,(41) Einstein equations:C B B,CD=0,(42) Bianchi identities for(41):C{A1A2,A3}D=0:.(43) Field equations for free totally symmetric integer spin s≥2massless HSfields in AdS d [4]can analogously be rewritten in the form[4]R A1...A s−1,B1...B s−11 o.m.s.=E0A s∧E0B s C A1...A s,B1...B s.(44) The generalized Weyl tensors C A1...A s,B1...B s are described by the traceless V A–transversaltwo-row rectangular Young tableaux of length s,i.e.,V A1C A1...A s,B1...B s=0,ηA1A2C A1...A s,B1...B s=0and C{A1...A s,A s+1}B2...B s=0.The equation(44)referred to as First On-Mass-Shell Theo-rem is a consequence of the masslessfield equations along with the constraints on auxiliary and extrafields imposed by requiring appropriate components of HS curvature to vanish [4].Let us note that although the extrafieldsωn a1...a s−1,b1...b t with t≥2do not contribute to the free action,they do contribute at the interaction level.To make such interactions meaningful,one has to express the extrafields in terms of the dynamical ones modulo pure gauge degrees of freedom.This is achieved by imposing appropriate constraints[4] contained in(44)like the torsion constraint in gravity is contained in(40).The Bianchi identities D0(R1)=0along with the equation(44)impose some differen-tial restrictions on the generalized Weyl tensor C A1...A s,B1...B s.The trick 
is to denote the components of thefirst derivatives of C A1...A s,B1...B s,that are allowed to be non-zero by the Bianchi identities,by a new tensorfield C1,writing symbolically D L0C=E0∧C1,where D L0is the Lorentz derivative and E0is the AdS frame1-form.The Bianchi identities for this equation impose differential conditions on C1to be written as D L0C1=E0∧C2, etc.It turns out that the full set of the0-forms C i consists of all two-row traceless V A−transversal Young tableaux C A1...A u,B1...B s with the second row of length s,i.e.,V A1C A1...A u,B1...B s=0,ηA1A2C A1...A u,B1...B s=0,C{A1...A u,A u+1}B2...B s=0.(45)Thefields C A1...A u,B1...B s form a basis of the space of on-mass-shell nontrivial derivatives of order u−s of the spin s generalized Weyl tensor C A1...A s,B1...B s.The full set of the12compatibility conditions of the equations(44)can be written in the form of the covariant constancy condition[22]˜DC A1...A u,B1...B s=0u≥s,(46) where˜D0is the o(d−1,2)covariant derivative in the so called twisted adjoint represen-tation.To define the twisted adjoint representation it is useful to observe that the set offields C A1...A u,B1...B s satisfying(45)spans the space isomorphic to the space of star-product func-tions C(Y|x)satisfying(31)and with the ideal I factored out to impose the tracelessness condition.Then˜D0is[22]˜D=D L0−2ΛE A0V B ⊥Y i A Y Bi−1∂⊥Y Aj∂ Y Bi ,(47)where the Lorentz covariant derivative is D L0=d+ωL AB⊥Y Ai∂V2V A V B A B i,⊥A A i=A A i−1∂⊥Y Ai− Y Ai∂2E A0∧E B0∂2。
Generalized blocks for symmetric groups

The study of the modular representation theory of symmetric groups was initiated in the 1940's. One of the first highlights was the proof of the so-called Nakayama conjecture, which describes the distribution of the irreducible characters into p-blocks in terms of a combinatorial condition on the partitions labelling them. More specifically, two irreducible characters are in the same p-block if and only if the partitions labelling them have the same p-core. There is also a comprehensive literature on decomposition numbers, Cartan matrices and other block-theoretic invariants of symmetric groups. The representation theory of symmetric groups has served as a source of inspiration for the study of representations of other classes of groups and algebras. As an example we may refer to the book [9]. Corollary 5.38 in that book presents an analogue of the Nakayama conjecture for Iwahori-Hecke algebras of the symmetric group $S_n$ at an $\ell$-th root of unity. Donkin [4] has presented a direct link between the representation theory of these algebras and an $\ell$-analogue of the modular representation theory of the symmetric groups. It thus seems a natural problem to study "$\ell$-blocks" of $S_n$. We attempt to do this here, based primarily on the ordinary character theory of symmetric groups and on some very general ideas from the character theory of finite groups. We study analogues of blocks, of the second main theorem on blocks, of decomposition matrices and of Cartan matrices in this context, and prove an $\ell$-analogue of the Nakayama conjecture. We believe that this approach may provide additional insight, e.g. concerning the invariant factors of Cartan matrices. For instance, we show that these calculations for a given block of weight w may be performed inside the wreath product $\mathbb{Z}_\ell \wr S_w$. It should be mentioned that Brundan and Kleshchev [3] have recently given a formula for the determinant of the Cartan matrix of an $\ell$-block for the Hecke algebras.
In view of [4] this is also the determinant of the Cartan matrix of an $\ell$-block of $S_n$ (see Proposition 6.10 for details).

The paper is organized as follows. The first two sections present a very general theory of contributions, perfect isometries, sections and blocks, suitable for our purposes. These sections may have independent interest beyond the questions at hand. In section 3 we introduce $\ell$-sections and $\ell$-blocks in symmetric groups and prove an analogue of the second main theorem on blocks. Then in section 4 we construct "basic sets", i.e. integral bases for the restrictions of the generalized
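The combinatorial condition in the Nakayama conjecture quoted above (equal p-cores) can be made concrete with a short computation. The following is a minimal sketch, assuming the standard beta-number (abacus) description of rim-hook removal, which the text itself does not spell out; the function name `p_core` is hypothetical. The same routine applies verbatim with an arbitrary integer in place of a prime p.

```python
def p_core(partition, p):
    """Return the p-core of a partition (list of positive parts).

    Uses beta-numbers: removing a rim p-hook corresponds to replacing a
    beta-number b by b - p, provided b - p >= 0 is not already present.
    """
    parts = sorted((x for x in partition if x > 0), reverse=True)
    n = len(parts)
    # beta_i = lambda_i + (n - 1 - i): first-column hook lengths
    beta = {parts[i] + (n - 1 - i) for i in range(n)}
    moved = True
    while moved:
        moved = False
        for b in sorted(beta):
            if b >= p and (b - p) not in beta:
                beta.remove(b)
                beta.add(b - p)
                moved = True
                break
    ordered = sorted(beta, reverse=True)
    core = [b - (n - 1 - i) for i, b in enumerate(ordered)]
    return [x for x in core if x > 0]

# Characters of S_5 labelled by (5) and (2,2,1) share the 3-core (2),
# whereas (4,1) has 3-core (1,1) and so lies in a different 3-block.
print(p_core([5], 3), p_core([2, 2, 1], 3), p_core([4, 1], 3))
```

Under the stated condition, two partitions of n label characters in the same block exactly when this function returns the same list for both.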
MSGUT From Bloom to Doom

Abstract

By a systematic survey of the parameter space we confirm our surmise [1] that the Minimal Supersymmetric GUT (MSGUT) based on the $\mathbf{210} \oplus \mathbf{126} \oplus \overline{\mathbf{126}} \oplus \mathbf{10}$ Higgs system is incompatible with the generic Type I and Type II seesaw mechanisms. The incompatibility of the Type II seesaw mechanism with this MSGUT is due to its generic extreme sub-dominance with respect to the Type I contribution. The Type I mechanism, although dominant over Type II, is itself unable to provide neutrino masses larger than $\sim 10^{-3}$ eV anywhere in the parameter space. Our Renormalization Group based analysis shows the origin of these difficulties to lie in a conflict between baryon stability and neutrino oscillation. The MSGUT completed with a 120-plet Higgs is the natural next-to-minimal candidate. We propose a scenario where the 120-plet collaborates with the 10-plet to fit the charged fermion masses. The freed 126-plet couplings can then give sub-dominant contributions to charged fermion masses and enhance the Type I seesaw masses sufficiently to provide a viable seesaw mechanism. We give formulae required to verify this scenario.
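For orientation, the scale of a Type I seesaw mass can be estimated with the textbook one-generation relation $m_\nu \sim m_D^2 / M_R$. This relation and the input values below are illustrative assumptions, not the MSGUT formulae referred to in the abstract; the function name is hypothetical.

```python
def type1_seesaw_mass_eV(m_dirac_GeV, m_right_GeV):
    """Order-of-magnitude light neutrino mass m_D^2 / M_R, converted to eV."""
    GEV_TO_EV = 1e9
    return (m_dirac_GeV ** 2 / m_right_GeV) * GEV_TO_EV

# Assumed inputs: Dirac mass of 100 GeV, right-handed Majorana mass of 1e14 GeV
print(type1_seesaw_mass_eV(100.0, 1e14))
```

With these assumed inputs the estimate is of order 0.1 eV, which gives a sense of how small a bound of about $10^{-3}$ eV is.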
arXiv:quant-ph/0208055v3 11 Nov 2003

Generalized (s-Parameterized) Weyl Transformation

Alex Granik

Abstract

A general canonical transformation of the mechanical operators of position and momentum is considered. It is shown that it automatically generates a parameter s, which leads to a generalized (or s-parameterized) Wigner function. This allows one to derive generalized (s-parameterized) Moyal brackets for any number of dimensions. In the classical limit the s-parameterized Wigner averages of the momentum and its square yield the respective classical values. Interestingly enough, in the latter case the classical Hamilton-Jacobi equation emerges as a consequence of such a transition only if there is a non-zero parameter s.

1 Introduction

The Moyal transformation in the context of the Weyl transformation [1] (the former being a particular case of the latter) was addressed by B. Leaf [2], who, departing directly from quantum mechanics, derived the Weyl transform $A_w(\mathbf{Q}, \mathbf{K})$ of the quantum operator $A(\mathbf{Q}, \mathbf{K})$ and investigated the properties of such a transform. Here $\mathbf{Q}$ and $\mathbf{K}$ denote the eigenvalues of the coordinate operator $\mathbf{q}$ and the momentum operator $\mathbf{k}$, respectively. The well-known Moyal formula for the phase-space distribution function [3] readily followed from the derived expressions [2]. However, the work [2] did not yield a more general s-parameterized transformation unifying different quantization rules. Usually this transformation is introduced "by hand" (cf. [4], [5], [6]). A closer inspection of Leaf's approach shows that it allows one to arrive naturally at the generalized (in the sense of s-parameterization) Weyl-Wigner-Groenewold-Moyal transformation without the need for an a priori introduction of the parameter s. This is associated with the fact that a shift of the operators $\mathbf{K}$ and $\mathbf{Q}$ ($\mathbf{K}, \mathbf{Q} \to \mathbf{K}', \mathbf{Q}'$), under the sole condition that the resulting transformation be canonical, automatically generates an arbitrary parameter s entering the resulting transformation. In Ref. [2] this shift was chosen in such a
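way as to satisfy canonicity through a special choice of the numerical coefficients entering the transformation $\mathbf{K}, \mathbf{Q} \to \mathbf{K}', \mathbf{Q}'$, which fixes the value of the parameter s = 0.

2 Generalized Weyl Transformation

Let us consider the Hilbert space of a quantum-mechanical system having n degrees of freedom. This space is spanned by the eigenkets $|\mathbf{Q}\rangle$ and $|\mathbf{K}\rangle$ of the Cartesian coordinate operator $\mathbf{q} = (q_1, q_2, \dots, q_n)$ and the conjugate momentum $\mathbf{p} = (\hbar/i)(\partial/\partial\mathbf{q}) = 2\pi\hbar\,\mathbf{k}$, with $\mathbf{k} = (k_1, k_2, \dots, k_n)$ [and the commutation relations $q_i k_j - k_j q_i = (i/2\pi)\,\delta_{ij}$; i, j = 1, 2, ..., n]:

$$\mathbf{k}\,|\mathbf{K}\rangle = (2\pi\hbar)^{-1}\,\frac{\hbar}{i}\,\frac{\partial}{\partial\mathbf{q}}\,|\mathbf{K}\rangle = \mathbf{K}\,|\mathbf{K}\rangle \tag{1}$$

which means

$$|\mathbf{K}\rangle = e^{2\pi i\,\mathbf{q}\cdot\mathbf{K}}; \qquad |\mathbf{Q}\rangle = \delta(\mathbf{q} - \mathbf{Q}) \tag{2}$$

The respective completeness relations are

$$\int d\mathbf{Q}\,|\mathbf{Q}\rangle\langle\mathbf{Q}| = 1, \qquad \int d\mathbf{K}\,|\mathbf{K}\rangle\langle\mathbf{K}| = 1 \tag{3}$$

Employing (1) and (2) we find that the scalar product $\langle\mathbf{Q}|\mathbf{K}\rangle$ is

$$\langle\mathbf{Q}|\mathbf{K}\rangle = \int e^{2\pi i\,\mathbf{q}\cdot\mathbf{K}}\,\delta(\mathbf{q} - \mathbf{Q})\,d\mathbf{q} = e^{2\pi i\,\mathbf{Q}\cdot\mathbf{K}} \tag{4}$$

We represent an arbitrary quantum-mechanical operator A in the Hilbert space using the completeness relations (3):

$$A \equiv \int\!\cdots\!\int d\mathbf{Q}'\,d\mathbf{Q}''\,d\mathbf{K}'\,d\mathbf{K}''\;|\mathbf{Q}''\rangle\langle\mathbf{Q}''|\mathbf{K}''\rangle\langle\mathbf{K}''|A|\mathbf{K}'\rangle\langle\mathbf{K}'|\mathbf{Q}'\rangle\langle\mathbf{Q}'| \tag{5}$$

Now we perform a linear transformation from the variables $\mathbf{Q}', \mathbf{Q}'', \mathbf{K}', \mathbf{K}''$ to new variables $\mathbf{K}, \mathbf{Q}, \mathbf{u}, \mathbf{v}$ according to

$$Q''_i = Q_i + \alpha_i v_i, \qquad Q'_i = Q_i + \beta_i v_i \tag{6}$$

$$K''_i = K_i + \gamma_i u_i, \qquad K'_i = K_i + \delta_i u_i; \qquad i = 1, 2, \dots, n \tag{7}$$

where $\alpha_i$, $\beta_i$, $\gamma_i$ and $\delta_i$ are constants to be determined from an additional condition. If we require this transformation to be canonical, then its Jacobian must be 1, yielding the following relation:

$$\prod_{i=1}^{n} (\alpha_i - \beta_i)(\gamma_i - \delta_i) = 1 \tag{8}$$

Note that in [2] the coefficients are chosen from the very beginning so as to satisfy the canonicity condition identically: $\alpha_i = \gamma_i = 1/2$; $\beta_i = \delta_i = -1/2$. We rewrite identity (5) taking into account the transformation of variables (6), (7), where $\alpha\mathbf{v}$ denotes the vector with components $\alpha_i v_i$ (and similarly for $\beta\mathbf{v}$, $\gamma\mathbf{u}$, $\delta\mathbf{u}$):

$$A \equiv \int\!\cdots\!\int d\mathbf{Q}\,d\mathbf{K}\,d\mathbf{u}\,d\mathbf{v}\;|\mathbf{Q}+\alpha\mathbf{v}\rangle\langle\mathbf{Q}+\alpha\mathbf{v}|\mathbf{K}+\gamma\mathbf{u}\rangle \times \langle\mathbf{K}+\gamma\mathbf{u}|A|\mathbf{K}+\delta\mathbf{u}\rangle\langle\mathbf{K}+\delta\mathbf{u}|\mathbf{Q}+\beta\mathbf{v}\rangle\langle\mathbf{Q}+\beta\mathbf{v}| \tag{9}$$

where, according to (4),

$$\langle\mathbf{Q}+\alpha\mathbf{v}|\mathbf{K}+\gamma\mathbf{u}\rangle\langle\mathbf{K}+\delta\mathbf{u}|\mathbf{Q}+\beta\mathbf{v}\rangle = \exp\Big\{2\pi i \sum_{i=1}^{n}\big[(\gamma_i - \delta_i)\,u_i Q_i + (\alpha_i - \beta_i)\,v_i K_i + u_i v_i\,(\alpha_i\gamma_i - \beta_i\delta_i)\big]\Big\}$$

Inserting this expression into Eq. (9) we get A ≡ ∫...∫ dQ dK du dv ⟨K+γu|A|K+δu⟩ |Q+αv⟩⟨Q+βv| exp{2πi[Σ_{i=1}^{n} (γi−δi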
)u i Q i +(αi −βi )v i K i +u i v i (αi γi −βi δi )]}(10)By a simple change of variables we can incorporate u i v i (αi γi −βi δi )into variables Q i .To this end we represent the power of the exponent in (10)as3follows(γi−δi)u i Q i+(αi−βi)v i K i+u i v i(αiγi−βiδi)=(γi−δi)u i[Q i+v iαiγi−βiδiγi−δi,dropping the superscript o,and using the fact thatexp{−2πk•a}|Q>=|Q+a>we obtain the following representation of the operator A:A= ... d Q d K A w(γ,δ,Q,K)F(α,β,Q,K)(12) whereA w(γ,δ,Q,K)= d u e2πi n k=1(γk−δk)u k Q k K+γu|A|K+δu ,(13) F(α,β,r,Q,K)= d v e−2πi(α−β)v(k−K)|Q+rv Q+rv|(14)andγir=r i=(r1,r2,...,r n)=Because both q and k are hermitian the last expression demonstrates that F(α,β,r,Q,K)is also Hermitian.Since∂()m d v e−2πi (αj−βj)v j(k j−K j)=∂K jd v(v j)m e−2πi (αj−βj)v j(k j−K j)(2πi)m(αj−βj)mEq.(16)yieldsF(α,β,r,Q,K)=e1αj−βj∂∂K j d v e−2πi (αj−βj)v j(k j−K j) d w e2πi w j(q j−Q j)(17)By introducing new variables k o j=k j(αj−βj),K o j=K j(αj−βj)the param-eters(αj−βj)are”absorbed”by these variables,which means that without any loss of generality we can set(αj−βj)=1.Therefore the operator F(α,β,r,Q,K)becomesF(r,Q,K)=e1∂Q j∂2πi r j∂2The last equation allows us tofind an explicit expression for the Weyl-transform of the quantum operator A w(γ,Q,K)which we represent as fol-lows:A w(r,Q′,K′)= d Q d K A w(r,Q,K)δ(Q−Q′)δ(K−K′)(20) Using(19)we calculate<Q′|F(r,Q,K)|K′>:Q′|F(r,Q,K)|K′ = Q′|K′ e1∂Q j∂K jδ(Q′−Q)δ(K′−K)(21) where we use the following identity|Q Q|Q′ ≡δ(Q′−Q)|Q′ ≡δ(q−Q)|Q′With the help of another identity Q′|K′ K′|Q′ ≡1we get from(21)δ(Q′−Q)δ(K′−K)=e−i∂Q′j∂K′j Q′|F(r,Q,K)|K′ K′|Q′ (22) Substitution of(22)into(20)yieldsA w(r,Q′,K′)=d Q d K A w(r,Q,K)e−i∂Q′j∂K′j Q′F(r,Q,K)|K′ K′|Q′ (23)On the other hand,from(12)follows thatQ′|A|K′ K′|Q′ =d Q d K A w(r,Q,K) Q′|F(r,Q,K)|K′ K′|Q′ . 
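Therefore

$$e^{-i\frac{r_j}{2\pi}\frac{\partial^2}{\partial Q'_j\partial K'_j}}\langle Q'|A|K'\rangle\langle K'|Q'\rangle=\int dQ\,dK\,A_w(r,Q,K)\,e^{-i\frac{r_j}{2\pi}\frac{\partial^2}{\partial Q'_j\partial K'_j}}\langle Q'|F(r,Q,K)|K'\rangle\langle K'|Q'\rangle\qquad(24)$$

Combining (23) and (24) we get the following expression for $A_w(r,Q,K)$:

$$A_w(r,Q,K)=e^{-i\frac{r_j}{2\pi}\frac{\partial^2}{\partial Q'_j\partial K'_j}}\langle Q'|A|K'\rangle\langle K'|Q'\rangle\qquad(25)$$

If we replace $K_j\to P_j/2\pi\hbar$ and take into account that for a system with $N$ degrees of freedom $|K\rangle=(2\pi\hbar)^{N/2}|P\rangle$, then (25) takes the following form:

$$A_w(r,Q',P')=(2\pi\hbar)^N\,e^{-i\hbar r_j\frac{\partial^2}{\partial Q'_j\partial P'_j}}\langle Q'|P'\rangle\langle P'|A|Q'\rangle\qquad(26)$$

To express Eq. (26) in the Schroedinger representation we consider an orthonormal set of eigenkets $|\psi_m\rangle$ and expand the operator $A$ in terms of these eigenkets (eigenbras):

$$A=\sum_{mn}w_m|\psi_m\rangle\,w_n^*\langle\psi_n|\qquad(27)$$

where $w_m$ are the respective coefficients of the expansion. Upon substitution of (27) into (26) we obtain the following expression:

$$A_w(r,Q,P)=e^{i\hbar r_j\frac{\partial^2}{\dots}}\dots$$

$$\dots\int dv\,dw\,e^{-\pi i(1+2r_j)w_jv_j}\,e^{-2\pi i(w_jQ_j-v_jK_j)}\,e^{2\pi i(w_jq_j-v_jk_j)}$$

This expression yields

$$F(r,Q,K)\,\delta(Q'-Q)\,\delta(K'-K)=\int dv\,dw\,dv'\,dw'\,e^{2\pi i(w_jq_j-v_jk_j)}\,e^{-2\pi i[Q_j(w'_j+w_j/2)+K_j(v'_j-v_j/2)]}\times e^{2\pi i[Q'_j(w'_j-w_j/2)+K'_j(v'_j+v_j/2)]}\qquad(29)$$

Introducing new variables

$$v''=v'+v/2;\quad v'''=-(v'-v/2);\quad w''=-(w'-w/2);\quad w'''=w'+w/2$$

we obtain after some (rather lengthy) algebra

$$F(r,Q,K)\,\delta(Q'-Q)\,\delta(K'-K)=e^{-\frac{i}{4\pi}\left[\frac{\partial^2}{\partial Q_j\partial K'_j}-\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}\times e^{-\frac{i}{4\pi}(1+2r_j)\left[\frac{\partial^2}{\partial Q_j\partial K'_j}+\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}F(r,Q,K)\,F(r,Q',K')$$

Using this expression we readily obtain that

$$F(r,Q,K)\,F(r,Q',K')=e^{\frac{i}{4\pi}\left[\frac{\partial^2}{\partial Q_j\partial K'_j}-\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}\times e^{\frac{i}{4\pi}(1+2r_j)\left[\frac{\partial^2}{\partial Q_j\partial K'_j}+\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}F(r,Q,K)\,\delta(Q'-Q)\,\delta(K'-K)$$

Now we can write the product $AB$ of two operators as follows:

$$AB=\int dQ\,dK\,dQ'\,dK'\,A_w(r,Q,K)\,B_w(r,Q',K')\,F(r,Q,K)\,F(r,Q',K')$$

$$=\int dQ\,dK\,dQ'\,dK'\,A_w(r,Q,K)\,B_w(r,Q',K')\,e^{\frac{i}{4\pi}\left[\frac{\partial^2}{\partial Q_j\partial K'_j}-\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}\times e^{\frac{i}{4\pi}(1+2r_j)\left[\frac{\partial^2}{\partial Q_j\partial K'_j}+\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}F(r,Q,K)\,\delta(Q'-Q)\,\delta(K'-K)$$

$$=\int dQ\,dK\,dQ'\,dK'\,F(r,Q,K)\,\delta(Q'-Q)\,\delta(K'-K)\,e^{\frac{i}{4\pi}\left[\frac{\partial^2}{\partial Q_j\partial K'_j}-\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}\times e^{\frac{i}{4\pi}(1+2r_j)\left[\frac{\partial^2}{\partial Q_j\partial K'_j}+\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}A_w(r,Q,K)\,B_w(r,Q',K')\qquad(30)$$

Quite similarly we obtain that $BA$ is

$$BA=\int dQ\,dK\,dQ'\,dK'\,A_w(r,Q,K)\,B_w(r,Q',K')\,F(r,Q,K)\,F(r,Q',K')$$

$$=\int dQ\,dK\,dQ'\,dK'\,A_w(r,Q,K)\,B_w(r,Q',K')\,e^{\frac{i}{4\pi}\left[\frac{\partial^2}{\partial Q_j\partial K'_j}-\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}\times e^{\frac{i}{4\pi}(1+2r_j)\left[\frac{\partial^2}{\partial Q_j\partial K'_j}+\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}F(r,Q,K)\,\delta(Q'-Q)\,\delta(K'-K)$$

$$=\int dQ\,dK\,dQ'\,dK'\,F(r,Q,K)\,\delta(Q'-Q)\,\delta(K'-K)\,e^{-\frac{i}{4\pi}\left[\frac{\partial^2}{\partial Q_j\partial K'_j}-\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}\times e^{\frac{i}{4\pi}(1+2r_j)\left[\frac{\partial^2}{\partial Q_j\partial K'_j}+\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}A_w(r,Q,K)\,B_w(r,Q',K')\qquad(31)$$

Therefore the commutator $[A,B]$ is

$$[A,B]=2i\int dQ\,dK\,dQ'\,dK'\,F(r,Q,K)\,\delta(Q'-Q)\,\delta(K'-K)\,e^{\frac{i}{4\pi}(1+2r_j)\left[\frac{\partial^2}{\partial Q_j\partial K'_j}+\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}\sin\left\{\frac{1}{4\pi}\left[\frac{\partial^2}{\partial Q_j\partial K'_j}-\frac{\partial^2}{\partial K_j\partial Q'_j}\right]\right\}A_w(r,Q,K)\,B_w(r,Q',K')\qquad(32)$$

$$\dots\frac{\partial\rho}{\partial t}=[\rho,H]\qquad(33)$$

where the operator $\rho$ is given by Eq. (12):

$$\rho=\int dQ\,dK\,\rho_w(t,r,Q,K)\,F(r,Q,K)\qquad(34)$$

Inserting (32) and (34) into (33) we obtain

$$\frac{\partial\rho_w(r,Q,P)}{\partial t}=\frac{\dots}{\hbar}\int dQ'\,dK'\,\delta(Q'-Q)\,\delta(K'-K)\,e^{\frac{i}{4\pi}(1+2r_j)\left[\frac{\partial^2}{\partial Q_j\partial K'_j}+\frac{\partial^2}{\partial K_j\partial Q'_j}\right]}\sin\left\{\frac{1}{4\pi}\left[\frac{\partial^2}{\partial Q_j\partial K'_j}-\frac{\partial^2}{\partial K_j\partial Q'_j}\right]\right\}H_w(Q',K')\,\rho_w(t,Q,K)\qquad(35)$$

Performing the integration and replacing the parameter $r$ by

$$r_j=-\tfrac{1}{2}(1+s_j)$$

we obtain the generalization of the Moyal bracket [3]:

$$\frac{\partial\rho_w(r,Q,P)}{\partial t}=\frac{\dots}{\hbar}\,e^{-\frac{i\hbar}{2}s_j\left(\frac{\partial^2}{\partial Q_j\partial P'_j}+\frac{\partial^2}{\partial P_j\partial Q'_j}\right)}\times\sin\left\{\frac{\hbar}{2}\left[\frac{\partial^2}{\partial Q_j\partial P'_j}-\frac{\partial^2}{\partial P_j\partial Q'_j}\right]\right\}H_w(Q',P')\,\rho_w(t,Q,P)$$

The same result was presented in [6]. There, however, the authors introduced the parameter $s$ by hand, without relating it to any transformation of the quantum states, simply treating it as a means to achieve a unified approach to different quantization rules. Our approach (based on [2]), on the other hand, explicitly shows that such a parameter results from a linear transformation from one quantum state to another. Since pure states are represented by rays, it is natural to expect that this transformation results in the appearance of a phase, which is clearly seen in the exponential operator.

3 On a Physical Meaning of the s-Parameter

It is therefore interesting to investigate what role this parameter plays in the transition to the classical case. To this end we restrict our attention to the 1-D case and consider (following Moyal) the space-conditional moments $\langle p^n\rangle_w$ (the Wigner averages of the powers $p^n$ of the momentum):

$$\langle p^n\rangle_w=\int A_w(p,q,s)\,p^n\,dp$$

As a next step, we find the Fourier transform of $A_w(p,q,\sigma)$, Eq. (28). We denote this transform by $M(\tau,\theta,\sigma)$:

$$M(\tau,\theta,\sigma)=\int\!\!\int e^{i(\tau p+\theta q)}\,e^{-i\frac{1-\sigma}{2}\frac{\partial^2}{\partial p\,\partial q}}\left[\Psi^*(q)\Psi(p)e^{ipq}\right]dp\,dq\qquad(38)$$

Integration of (38) by parts yields:

$$M(\tau,\theta,\sigma)=\int\Psi^*(q)\,e^{i\theta\left[q+\frac{\tau}{2}(1-\sigma)\right]}\left\{\int e^{ip(q+\tau)}\Psi(p)\,dp\right\}dq\qquad(39)$$

Using $q_1=q+\frac{\tau}{2}(1-\sigma)$ in (39) we obtain:

$$M(\tau,\theta,\sigma)=\int\left\{e^{i\theta q_1}\Psi^*\!\left[q_1-\tfrac{\tau}{2}(1-\sigma)\right]\int e^{ip\left[q_1+\frac{\tau}{2}(1+\sigma)\right]}\Psi(p)\,dp\right\}dq_1\qquad(40)$$

Since

$$\frac{1}{\sqrt{2\pi\hbar}}\int e^{ip\left[q_1+\frac{\tau}{2}(1+\sigma)\right]}\Psi(p)\,dp=\Psi\!\left[q_1+\tfrac{\tau}{2}(1+\sigma)\right]$$

relation (40) takes the following form:

$$M(\tau,\theta,\sigma)=\int e^{i\theta q_1}\,\Psi^*\!\left[q_1-\tfrac{\tau}{2}(1-\sigma)\right]\Psi\!\left[q_1+\tfrac{\tau}{2}(1+\sigma)\right]dq_1\qquad(41)$$

The inverse Fourier transform of $M(\tau,\theta,\sigma)$ gives us the desired integral form of the phase-space distribution (an s-parameterized Wigner function) $A_w(p,q,\sigma)$:

$$A_w(p,q,\sigma)=\frac{1}{2\pi}\int\Psi^*\!\left[q-\tfrac{\tau}{2}(1-\sigma)\right]e^{-i\tau p}\,\Psi\!\left[q+\tfrac{\tau}{2}(1+\sigma)\right]d\tau\qquad(42)$$

(A simplified derivation of $A_w(p,q,\sigma)$ is given in the Appendix.)

In general, since the parameter $\sigma$ (or $s$) is complex-valued, the $\sigma$-parameterized Wigner function $A_w(p,q,\sigma)$ is also complex-valued, in contradistinction to its conventional counterpart (with $\sigma=0$). However, for purely imaginary values of the parameter $\sigma$, the $\sigma$-parameterized Wigner function becomes real-valued again:

i) $A_w^*(p,q,\sigma)=A_w(q,p,-\sigma)$

ii) $A_w^*(p,q,\sigma)=A_w(q,p,\sigma),\quad \mathrm{Re}(\sigma)=0$

For the following we rewrite $A_w(p,q,\sigma)$ in terms of the momentum wave function $\Phi(p)$. After some algebra we obtain

$$A_w(p,q,\sigma)=\int dp'\,dp''\,e^{-iq(p''-p')}\,\Phi^*(p'')\Phi(p')\,\delta\!\left[p-\frac{p''(1+\sigma)+p'(1-\sigma)}{2}\right]$$

so that the moments become

$$\langle p^n\rangle_w=\int dp'\,dp''\,e^{-iq(p''-p')}\,\Phi^*(p'')\Phi(p')\left[\frac{p''(1+\sigma)+p'(1-\sigma)}{2}\right]^n=\dots=\left\{\frac{1}{2i}\left[(1-\sigma)\frac{\partial}{\partial q_2}-(1+\sigma)\frac{\partial}{\partial q_1}\right]\right\}^n\Psi^*(q_1)\Psi(q_2)\Big|_{q_2\to q_1}$$

Returning to the units with $\hbar$ and using (43), we calculate the first two moments $\langle p\rangle_w$ and $\langle p^2\rangle_w$:

$$\langle p\rangle_w=\frac{1}{2i}\left\{\left[(1-\sigma)\frac{\partial}{\partial q_2}-(1+\sigma)\frac{\partial}{\partial q_1}\right]\Psi^*(q_1)\Psi(q_2)\right\}\Big|_{q_2\to q_1}=\hbar\dots\frac{\partial}{\partial q}\left\{\ln(\Psi\dots\right.$$

$$\langle p^2\rangle_w=\dots\frac{\dots}{4}\left[(1-\sigma)^2\,\frac{\Psi''}{\dots}\dots\frac{\partial\ln\Psi^*}{\partial q}\dots\right]\qquad(48)$$

Let us consider a semi-classical limit

$$\Psi(q,t)=\sqrt{\rho}\,e^{iS/\hbar}\qquad(49)$$

$$\dots\frac{\dots}{2}\nabla(\ln\rho)\Big\}=\nabla S=p_{classical}\qquad(50)$$

If we use the Schroedinger equation

$$i\hbar\frac{\partial\ln\Psi}{\partial t}=\dots\frac{1}{2m}\frac{\nabla^2\Psi}{\dots}\dots$$

we obtain

$$\langle p^2\rangle_w=\dots-V+\dots\frac{1}{8m}(\nabla\ln\rho)^2+\dots+\frac{\sigma}{2}\left[-\frac{\partial S}{\partial t}\dots\frac{1}{2m}(\nabla S)^2-\frac{\hbar^2}{2}\dots\right]\dots=m\left\{-\frac{\partial S}{\partial t}\dots\frac{1}{2m}(\nabla S)^2+\hbar^2\dots-V-\dots\frac{1}{8m}(\nabla\ln\rho)^2\right\}\qquad(51)$$

This limit must yield the classical value of the square of the momentum,

$$\lim_{\hbar\to 0}\langle p^2\rangle_w=p^2_{classical}=(\nabla S)^2$$

which is independent of the parameter $\sigma$. This is possible if the factor multiplying $\sigma$ in (51) becomes 0, that is

$$-\frac{\partial S}{\partial t}=\frac{1}{2m}(\nabla S)^2+V$$

Remarkably, this condition is nothing more than the classical Hamilton-Jacobi equation. Thus the emergence of the parameter $\sigma$ in the Wigner function is tied to the emergence of the classical Hamilton-Jacobi equation in the transition to the classical regime.

4 Conclusion

We have demonstrated that the s-parameterized Wigner function emerges as a result of a linear transformation from one quantum state to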
another. The respective change of the phase-space coordinates accompanying such a transformation must necessarily be canonical. This allows one to arrive at the s-parameterized Wigner function in a natural way, without introducing the s-parameter into the transformation "by hand", either through a suitably chosen displacement operator or as a "missing link" between normal and anti-normal operator ordering. The transition to the classical regime demonstrates that the parameter $s$ plays an important role: the classical Hamilton-Jacobi equation emerges as the condition for the disappearance of this parameter in classical mechanics.

5 Appendix

Since the momentum representation $\Phi(p)$ and the coordinate representation $\Psi(q)$ of the wave function (for simplicity we consider a 1-D case) are related as follows:

$$\Phi(p)=\frac{1}{\sqrt{2\pi\hbar}}\int e^{-ipq'/\hbar}\,\Psi(q')\,dq'\qquad(52)$$

the momentum probability density is

$$|\Phi(p)|^2=\frac{1}{2\pi\hbar}\int\!\!\int e^{-ip(q'-q'')/\hbar}\,\Psi(q')\,\Psi^*(q'')\,dq'\,dq''\qquad(53)$$

We introduce new variables $q$ and $\tau$:

$$q'=q+\alpha\tau;\qquad q''=q+\beta\tau$$

where $\alpha$ and $\beta$ are constants such that the Jacobian of the transformation from $q'',q'$ to $q,\tau$ is $\hbar$, which means

$$\alpha-\beta=\hbar$$

As a result, Eq. (53) yields

$$|\Phi(p)|^2=\frac{1}{2\pi}\int\!\!\int e^{-ip\tau}\,\Psi(q+\alpha\tau)\,\Psi^*(q+\beta\tau)\,dq\,d\tau\qquad(54)$$

The condition on $\alpha$ and $\beta$ is satisfied by

$$\alpha=\frac{\hbar(1+s)}{2},\qquad\beta=-\frac{\hbar(1-s)}{2}$$

Upon substitution of this expression into (54) we obtain

$$|\Phi(p)|^2=\int A_w(p,q,s)\,dq$$

where

$$A_w(p,q,s)=\frac{1}{2\pi}\int\Psi^*\!\left[q-\frac{\hbar(1-s)\tau}{2}\right]e^{-ip\tau}\,\Psi\!\left[q+\frac{\hbar(1+s)\tau}{2}\right]d\tau\qquad(56)$$

is the s-parameterized Wigner function found earlier, Eq. (42).

6 Acknowledgement

The author expresses his gratitude to C. McCallum for his help in the preparation of this paper.

References

[1] H. Weyl, The Theory of Groups and Quantum Mechanics, Dover Publ., 1950
[2] B. Leaf, J. Math. Phys. 9, No. 1, 65 (1968)
[3] J. E. Moyal, Proc. Cambridge Phil. Soc. 45, 99 (1949)
[4] K. E. Cahill, R. J. Glauber, Phys. Rev. 177, 1857 (1969); Phys. Rev. 177, 1882 (1969)
[5] N. L. Balazs and B. K. Jennings, Physics Reports 104, 347 (1984)
[6] T. Dereli, A. Vercin, J. Math. Phys. 38, 5515 (1997)
[7] E. Wigner, Phys. Rev. 40, 749 (1932)