An Optimality-Theoretic Alternative to the Apparent Wh-Movement in Old Japanese
Final Term Paper on the History of American Literature (Final Version)
Contents

Abstract (Chinese)
Abstract (English)

Chapter 1 American Romanticism (1810-1865)
1. Background reasons
1.1 Politically this period was ripe
1.2 Economically America had never been wealthier
1.3 Culturally America's own values emerged
2. Basic features and styles
2.1 Expressiveness
2.2 Imagination
2.3 Worship of nature
2.4 Simplicity
2.5 Cultural nationalism
2.6 Liberty, freedom, democracy and individualism
3. Influence

Chapter 2 American Realism (1865-1914)
1. Background changes
1.1 Politics
1.2 Economics
1.3 Cultural and social changes
2. Basic features and styles
2.1 Truthful description of the actualities of real life and material
2.2 Focus on ordinariness
3. Three dominant figures
4. Influence

Chapter 3 American Naturalism (1890-1914)
1. Background information
1.1 Cultural and social background
1.2 Religion and theoretical basis
2. Major ideas and features of Naturalism
2.1 Determinism
2.2 World: godless, indifferent, hostile
2.3 Style: scientific objectivity
2.4 Subjects and themes
3. A representative work that shows the ideas and features above
4. Influence

Chapter 4 American Modernism (1914-1945)
1. Background information
1.1 Politics
1.2 Economy
1.3 Cultural and social background
2. Characteristics and features of Modernism
3. Major genres and a representative of each one
3.1 Modern poetry: Ezra Pound
3.2 Modern fiction: Ernest Hemingway
4. Influence

Chapter 5 American Postmodernism (1914-1945)
1. Background information
1.1 Politics
1.2 Economics
1.3 Social and international background
2. Characteristics and major features
2.1 Experimental writing techniques
2.3 Irony, playfulness and black humor
3. Influence

Bibliographies

Abstract

The emergence of a new literature with its own distinctive character is the mark that a nation has truly taken shape.
Microeconomic Theory II
Econ 752
Microeconomic Theory II

Professor: Douglas Nelson
Office: Tilton 108 (Murphy Institute)
Phone: 865-5317
Office Hours: Tuesday and Thursday, 3:30-5:30
email: ******************
Webpage: /~dnelson/

This course provides an overview of equilibrium analysis for competitive markets. The course is organized in four sections. An introductory section illustrates the main themes of the course in simple partial and general equilibrium environments. The second part of the course develops the main positive results from abstract general equilibrium theory. The third and fourth parts of the course introduce students to the analysis of general equilibrium systems. Specifically, part III introduces positive analysis in terms of comparative statics, while part IV introduces students to welfare economics.

Evaluation: Your performance in this course will be evaluated on the basis of two examinations (worth 100 points each). All students are expected to do all the expected reading and actively participate in all classes.

Readings and exercises for the course will be drawn from the following core texts:

Andreu Mas-Colell, Michael Whinston, and Jerry Green (1995). Microeconomic Theory. New York: Oxford University Press. [MWG]
Hal Varian (1992). Microeconomic Analysis. New York: Norton. [Varian]
Eugene Silberberg and Wing Suen (2001). The Structure of Economics: A Mathematical Analysis. Boston: Irwin/McGraw Hill. [Silberberg and Suen]
Alan Woodland (1982). International Trade and Resource Allocation. Amsterdam: North Holland.
Gareth Myles (1995). Public Economics. Cambridge: Cambridge University Press.

In addition, there will be a large number of articles available electronically.

The main substantive material of this course has been covered in a number of excellent texts. On pure general equilibrium theory, at a relatively elementary level the following are excellent:

Peter Newman (1965). The Theory of Exchange. Englewood Cliffs: Prentice-Hall.
James Quirk and Rubin Saposnik (1968). Introduction to General Equilibrium Theory and Welfare Economics. New York: McGraw Hill.
Werner Hildenbrand and Alan Kirman (1988). Equilibrium Analysis: Variations on Themes by Edgeworth and Walras. Amsterdam: North-Holland.
Ross Starr (1997). General Equilibrium Theory: An Introduction. Cambridge: CUP.
Bryan Ellickson (1993). Competitive Equilibrium: Theory and Applications. Cambridge: CUP.
Alan Kirman, ed. (1998). Elements of General Equilibrium Analysis. Oxford: Blackwell.

At a more advanced level, the following are excellent:

Kenneth Arrow and Frank Hahn (1971). General Competitive Analysis. Amsterdam: North-Holland.
Lionel McKenzie (2002). Classical General Equilibrium Theory. Cambridge: MIT Press.
Andreu Mas-Colell (1985). The Theory of General Equilibrium: A Differentiable Approach. Cambridge: CUP/Econometric Society.
Yves Balasko (1988). Foundations of the Theory of General Equilibrium. San Diego: Academic Press.
C. Aliprantis, D. Brown, and O. Burkinshaw (1990). Existence and Optimality of Competitive Equilibrium. Berlin: Springer-Verlag.

On the application to public economics, texts emphasizing modern general equilibrium methods include:

David Starrett (1988). Foundations of Public Economics. Cambridge: Cambridge University Press.
Jean-Jacques Laffont (1988). Fundamentals of Public Economics. Cambridge: MIT Press.
Roger Guesnerie (1995). A Contribution to the Pure Theory of Taxation. Cambridge: Cambridge University Press.

On the application to trade:

Avinash Dixit and Victor Norman (1980). Theory of International Trade. Cambridge: Cambridge University Press.
Kar-yiu Wong (1995). International Trade in Goods and Factor Mobility. Cambridge: MIT Press.

Those interested in computational methods of general equilibrium analysis may want to consult:

John Shoven and John Whalley (1992). Applying General Equilibrium. Cambridge: Cambridge University Press.
Victor Ginsburgh and Michiel Keyzer (1997). The Structure of Applied General Equilibrium Models. Cambridge: MIT Press.
Joseph Francois and Kenneth Reinert, eds. (1997). Applied Methods for Trade Policy Analysis: A Handbook. Cambridge: Cambridge University Press.

Finally, for those with an interest in the historical and philosophical background to general equilibrium theory, the place to start is a series of excellent books by E. Roy Weintraub:

E.R. Weintraub (1979). Microfoundations. Cambridge: CUP.
E.R. Weintraub (1986). General Equilibrium Analysis: Essays in Appraisal. Cambridge: CUP.
E.R. Weintraub (1991). Stabilizing Dynamics: Constructing Economic Knowledge. Cambridge: CUP.
E.R. Weintraub (2002). How Economics Became a Mathematical Science. Durham: Duke University Press.

Examination format. Both exams will be made up of problems drawn from material covered in the lectures and reading. These problems will generally be in the nature of extensions of that material, not simply replication of the relevant content. Exams must be written in blue books, which you must supply.

Policy on examinations. The midterm exam will be given on tba. Unless you have a standard university accepted excuse for missing the exam (e.g. health with standard university form), you must take the exams at their scheduled time. The final examination will only be given on the scheduled date: tba (there will be no exceptions so do not make travel plans that conflict with this).

Policy on examinations. The midterm exam will be given on 9 March. Unless you have a standard university accepted excuse for missing the exam (e.g. health with standard university form), you must take the exams at their scheduled time. The final examination will only be given on the scheduled date: 10 May, 8:00-12:00 (there will be no exceptions so do not make travel plans that conflict with this).

Econ 752 SYLLABUS Fall 2006

Topic I. Introduction

• Partial Equilibrium Analysis of Competitive Equilibrium
  - Varian, Chapter 13.
  - MWG, Chapter 10 a-d and f.
• General Equilibrium: Applying Microeconomic Tools to Macroeconomic Questions, Pure Exchange
  - MWG, Chapter 15, sections a-b
  - Varian, Chapter 17 and section 21.1.
  - Shapley and Shubik (1977). “An Example of a Trading Economy with Three Competitive Equilibria”. Journal of Political Economy; V.85-#4, pp. 873-875.
  - Debreu and Scarf (1963). “A Limit Theorem on the Core of an Economy”. International Economic Review; V.4-#3, pp. 235-246.
  - Aumann (1964). “Markets with a Continuum of Traders”. Econometrica; V.32-#1/2, pp. 39-50.
  - Shubik (1984). “Two-Sided Markets: The Edgeworth Game”. Chapter 10 of A Game Theoretic Approach to Political Economy. Cambridge: MIT Press, pp. 252-285. [optional]
  - Hildenbrand and Kirman (1988). “Introduction”. In Equilibrium Analysis. Amsterdam: North-Holland, pp. 1-49. [optional]

• General Equilibrium: Applying Microeconomic Tools to Macroeconomic Questions, Simple Economies with Production
  - MWG, Chapter 15, section c.
  - Varian, Chapter 18.
  - Koopmans (1957). “Allocation of Resources and the Price System”. Essay 1 of Three Essays on the State of Economic Science. New York: McGraw Hill, pp. 3-126. [especially pp. 1-66.]

Topic II. Pure General Equilibrium Theory

• Characterizing Equilibrium and Proving Existence
  - MWG, Chapter 17, sections a-d and f, appendix B.
  - Geanakoplos (2003). “Nash and Walras Equilibrium via Brouwer”. Economic Theory; V.21-#2/3, pp. 585-603.
  - Debreu (1998). “Existence”. Chapter 2 in A.P. Kirman, ed., Elements of General Equilibrium Analysis. Oxford: Basil Blackwell, pp. 10-37. [optional]

• Problems/Extensions: Nonconvexities
  - MWG, Chapter 17, section I
  - Chipman (1970). “External Economies of Scale and Competitive Equilibrium”. Quarterly Journal of Economics; V.84-#3, pp. 347-363.
  - Mayer (1974). “Homothetic Production Functions and the Shape of the Production Possibility Locus”. Journal of Economic Theory; V.8-#2, pp. 101-110. [ERes]
  - Starrett (1971). “Fundamental Non-Convexities in the Theory of Externalities”. Journal of Economic Theory; V.4-#2, pp. 180-199. [ERes]
  - Cornes (1980). “External Effects: An Alternative Formulation”. European Economic Review; V.14-#3, pp. 307-321. [optional]

• Problems/Extensions: Uncertainty
  - Varian, Chapter 20.
  - MWG, Chapter 19
  - Hens (1998). “Incomplete Markets”. Chapter 5 in A.P. Kirman, ed., Elements of General Equilibrium Analysis. Oxford: Basil Blackwell, pp. 139-210. [optional]

• General Equilibrium Comparative Statics?: The Sonnenschein-Debreu-Mantel Result
  - MWG, Chapter 17, section d-f.
  - Saari (1995). “The Mathematical Complexity of Simple Economies”. Notices of the American Mathematical Society; V.42-#2, pp. 222-230.
  - Kirman (1989). “The Intrinsic Limits of Modern Economic Theory: The Emperor Has No Clothes”. Economic Journal; V.99-#395, pp. 126-139.
  - Sonnenschein (1973). “Do Walras' Identity and Continuity Characterize the Class of Community Excess Demand Functions?”. Journal of Economic Theory; V.6-#4, pp. 345-354.
  - Debreu (1974). “Excess Demand Functions”. Journal of Mathematical Economics; V.1-#1, pp. 15-23. [optional]
  - Mantel (1979). “Homothetic Preferences and Community Excess Demand Functions”. Journal of Economic Theory; V.12-#2, pp. 197-201. [optional]
  - Mas-Colell (1977). “On the Equilibrium Price Set of an Exchange Economy”. Journal of Mathematical Economics; V.4-#2, pp. 117-126. [optional]
  - Kemp and Shimomura (2002). “The Sonnenschein-Debreu-Mantel Proposition and the Theory of International Trade”. Review of International Economics; V.10-#4, pp. 671-679. [optional]
  - Brown and Matzkin (1996). “Testable Restrictions on the Equilibrium Manifold”. Econometrica; V.64-#?, pp. 1249-1262. [optional]
  - Nachbar (2002). “General Equilibrium Comparative Statics”. Econometrica; V.70-#5, pp. 2065-2074. [optional]

Midterm: Tuesday, 9 March.
No Class: Thursday, 11 March

Topic III. Applied General Equilibrium Theory: Positive Analysis

• Introduction to Comparative Statics for Applied GE
  - Silberberg and Suen, Chapter 18. [ERes]
  - Woodland (1982). “The Production Sector”. Chapter 3 of International Trade and Resource Allocation. Amsterdam: North-Holland, pp. 39-65. [ERes]
  - MWG, Chapter 15, section d
  - Jones (1965). “The Structure of Simple General Equilibrium Models”. Journal of Political Economy; V.73-#6, pp. 557-572. [optional]
  - Mussa (1979). “The Two Sector Model in Terms of Its Dual: A Geometric Exposition”. Journal of International Economics; V.9-#4, pp. 513-526. [optional]
  - Hale, Lady, Maybee, and Quirk (1999). “The Competitive Equilibrium: Comparative Statics”. Chapter 7 in Nonparametric Comparative Statics and Stability. Princeton: Princeton University Press, pp. 170-205. [optional]

• Maximum Value Functions and Comparative Statics for General Equilibrium Analysis
  - MWG, Chapter 17, section g
  - Woodland (1982). “Comparative Statics of the Production Sector”. Chapter 4 of International Trade and Resource Allocation. Amsterdam: North-Holland, pp. 67-103. [ERes]
  - Woodland (1982). “Intermediate Inputs and Joint Outputs”. Chapter 5 of International Trade and Resource Allocation. Amsterdam: North-Holland, pp. 105-146. [ERes]

• Applied General Equilibrium Theory: The Stolper-Samuelson Theorem, from 2 × 2 to m × n.
  - Chipman (1969). “Factor Price Equalization and the Stolper-Samuelson Theorem”. International Economic Review; V.10-#3, pp. 399-406.
  - Jones and Scheinkman (1977). “The Relevance of the Two-Sector Production Model in Trade Theory”. Journal of Political Economy; V.85-#5, pp. 909-935.
  - Ethier (1982). “The General Role of Factor Intensity in the Theorems of International Trade”. Economics Letters; V.10-#3/4, pp. 337-342. [ERes]
  - Ethier (1984). “Higher Dimensional Issues in Trade Theory”. In R. Jones and P. Kenen, eds. Handbook of International Economics--Vol.1. Amsterdam: North-Holland, 131-184. [optional]

Topic IV. Welfare Economics: Pure and Applied

• Fundamental Theorems of Welfare Economics
  - Silberberg and Suen, Chapter 19, sections 1-3. [ERes]
  - MWG, Chapter 16
  - Hammond (1998). “The Efficiency Theorems and Market Failure”. Chapter 6 in A.P. Kirman, ed., Elements of General Equilibrium Analysis. Oxford: Basil Blackwell, pp. 211-240. [ERes]

• Applied Welfare Economics, 1: Introduction
  - MWG, Chapter 10 e.
  - Silberberg and Suen, Chapter 19, section 7. [ERes]
  - Blackorby and Donaldson (1985). “Consumers’ Surpluses and Consistent Cost-Benefit Tests”. Social Choice and Welfare; V.1-#4, pp. 251-262. [optional]
  - Blackorby and Donaldson (1990). “The Case Against the Use of the Sum of Compensating Variations in Cost-Benefit Analysis”. Canadian Journal of Economics; V.23-#3, pp. 471-494.
  - Blackorby and Donaldson (1999). “Market Demand Curves and Dupuit-Marshall Consumers’ Surpluses: A General Equilibrium Analysis”. Mathematical Social Sciences; V.37-#2, pp. 139-163.
  - Ahlheim (1998). “Measures of Economic Welfare”. In Barberà, Hammond, and Seidl, eds. Handbook of Utility Theory. Dordrecht: Kluwer, pp. 483-568. [Optional: covers one person theory]

• Applied Welfare Economics, 2: Commodity Taxation
  - Myles, Chapter 4. [ERes]
  - Diamond and Mirrlees (1971). “Optimal Taxation and Public Production, I: Production Efficiency”. American Economic Review; V.61-#1, pp. 8-27. [optional]
  - Diamond and Mirrlees (1971). “Optimal Taxation and Public Production, II: Tax Rules”. American Economic Review; V.61-#3, pp. 261-278. [optional]
  - Diamond and McFadden (1974). “Some Uses of the Expenditure Function in Public Finance”. Journal of Public Economics; V.3-#1, pp. 3-21.
  - Greenberg and Denzau (1988). “Profit and Expenditure Functions in Basic Public Finance: An Expository Note”. Economic Inquiry; V.26-#1, pp. 145-158.
  - Deaton (1981). “Optimal Taxes and the Structure of Preferences”. Econometrica; V.49-#5, pp. 1245-1260.
  - Stern (1986). “A Note on Commodity Taxation: The Choice of Variable and the Slutsky, Hessian and Antonelli Matrices (SHAM)”. Review of Economic Studies; V.53-#2, pp. 293-299.
• Applied Welfare Economics, 3: Distortions, Second-best, and Policy
  - Silberberg and Suen, Chapter 19, sections 5 and 6. [ERes]
  - MWG, Chapter 22, sections a-d
  - Myles, Chapter 10. [ERes]
  - Hammond (1998). “The Efficiency Theorems and Market Failure”. Chapter 6 in A.P. Kirman, ed., Elements of General Equilibrium Analysis. Oxford: Basil Blackwell, pp. 240-260. [ERes]
  - Hurwicz (1999). “Revisiting Externalities”. Journal of Public Economic Theory; V.1-#2, pp. 225-245.

• Social Choice Theory: A (Very) Brief Introduction
  - MWG, Chapter 21
  - Fleurbaey and Mongin (2005). “The News of the Death of Welfare Economics is Greatly Exaggerated”. Social Choice & Welfare; V.25-#2/3, pp. 381-418.
  - Mongin and d’Aspremont (1998). “Utility Theory and Ethics”. In Barberà, Hammond, and Seidl, eds. Handbook of Utility Theory. Dordrecht: Kluwer, pp. 371-481. [optional]

Final Examination: 10 May, 8:00-12:00.
English Essay: Following the Objective Laws by Which Things Develop
Embracing the Immutable Laws of Nature

The universe operates according to an intricate tapestry of laws and principles, meticulously woven into the fabric of existence. These laws, like the unwavering force of gravity or the relentless passage of time, guide all that transpires, shaping the trajectories of both the celestial and the mundane. To live in harmony with the world around us, it is crucial that we align our actions and aspirations with these immutable laws.

One of the most fundamental laws of nature is the principle of cause and effect. Every action, every thought, and every decision sets in motion a chain of consequences that reverberate through time. By understanding this law, we can act with greater foresight, weighing the potential outcomes of our choices and striving to sow seeds that will bear positive fruit.

Another inviolable law is the law of impermanence. All things in the universe undergo constant flux and transformation, from the ceaseless cycles of day and night to the birth and death of civilizations. By embracing this principle, we can learn to accept change with grace and resilience, recognizing that even in the face of adversity, there is always the potential for renewal and growth.

The law of attraction is another potent force that shapes our lives. This law states that like attracts like, and that the energy we emit into the world will inevitably return to us. By cultivating positive thoughts, emotions, and intentions, we attract more of the same into our lives. Conversely, dwelling on negativity and fear can lead to a self-fulfilling cycle of unhappiness and misfortune.

Understanding the laws of nature can also empower us to live more sustainable lives. The law of conservation of energy teaches us that energy can neither be created nor destroyed, only transformed from one form to another. By embracing this principle, we can make conscious choices to conserve resources and reduce our impact on the environment.

The law of interdependence reminds us that all living beings are interconnected and interdependent. Our actions have consequences not only for ourselves but also for the wider ecosystem. By respecting nature and recognizing our role as stewards of the planet, we can help to create a more harmonious and sustainable world for ourselves and for generations to come.

Embracing the laws of nature is not about denying free will or conforming to some predetermined destiny. Rather, it is about aligning our actions and intentions with the underlying principles that govern the universe. By doing so, we can live more meaningful, fulfilling, and sustainable lives, maximizing our potential while respecting the immutable forces that shape our existence.

In the words of the ancient sage Lao Tzu, "The highest good is like water. Water gives life to the ten thousand things and does not strive. It flows in places men reject and so is like the Tao." May we all strive to embody the wisdom of water, flowing effortlessly with the currents of life, in harmony with the immutable laws that guide our journey.
Yaqui nominal paradigms and the theory of paradigmatic structure*

D. Terence Langendoen, University of Arizona
Constantino Martínez Fabian, University of Sonora

The problem

This paper deals with a deceptively simple problem.1 Yaqui nouns are inflected for Case and Number. The language has the two nominal inflectional paradigms illustrated in (1) and (2). In (1) there are three distinct morphosyntactic forms:

• the nominative singular form, which is unmarked (unsuffixed);
• the accusative singular form, which is suffixed with −ta;
• the nominative/accusative plural form, which is suffixed with −(i)m.2

In (2) there is only one morphosyntactic form:

• the nominative/accusative/singular/plural form, which is suffixed with −(i)m.

The question is why the suffix −(i)m is used in the paradigm in (2).

(1) Paradigm for miisi ‘cat’

Case \ Number   Singular     Plural
Nominative      miisi      │
                ───────────┤  miisim
Accusative      miisita    │

(2) Paradigm for supe ‘shirt’

Case \ Number   Singular     Plural
Nominative
                       supem
Accusative

* The Spanish version of this paper, entitled Paradigmas nominales de yaqui y la teoría de estructura paradigmática, has been accepted for publication in the proceedings of the VIII Encuentro Lingüística en el Noroeste, held at the University of Sonora in November 2004.

1 We thank Heidi Harley for her enthusiastic reception of the first version of this paper, and particularly for her discretion in pointing out that we had entirely overlooked alternatives to our original Optimality Theoretic analysis. As a result we have been able not only to provide a comparison, but also to improve our original account, which she was able to convince us in about thirty seconds was inferior to an account she formulated within Distributed Morphology.

2 This is an example of syncretism (Williams 1994) in which a single form represents a (partially) neutralized opposition, and is therefore compatible with two or more distinct feature specifications. Traditionally, syncretic forms are repeated in paradigms, with each occurrence representing a distinct specification. However, that mode of representation conflates syncretism with homonymy, in which morphosyntactically distinct forms are realized identically. Deciding between neutralization and homonymy in particular cases can be difficult.

Types of inflectional paradigms

To answer this question, we need to describe in some detail the nature of inflectional paradigms. An inflectional paradigm is a nonempty set of inflections of a linguistic form or class of forms for a nonempty set of inflectional features. Abstracting away from the morphosyntactic and morphophonological realization of these inflections, we obtain the notion of a schema for an inflectional paradigm, in which the members of the schema represent the various values of those features. Such a schema may be complete or defective. It is complete if all possible values for those features are represented by a member of the schema; otherwise it is defective.

Complete schemas for inflectional paradigms

For example, suppose, as in Yaqui, there is a class of forms that is inflected for the features Case and Number, where Case takes the binary values [Nominative] and [Accusative], and Number the binary values [Singular] and [Plural]. Then there are 2^4 − 1 = 15 schemas of complete inflectional paradigms for those features, depending on whether any of the feature-value distinctions are neutralized, and if so which ones. The 15 schemas are shown in (3) through (17).
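The count of 15 can be checked by enumeration. The sketch below rests on an assumption that the text does not state explicitly: each complete schema is taken to be one way of grouping the four Case-Number cells into merged cells, that is, one set partition of the four cells. The cell labels and the helper name `partitions` are illustrative only.

```python
from itertools import combinations

# The four cells of a binary Case x Number paradigm (labels are illustrative).
CELLS = ["Nom.Sg", "Nom.Pl", "Acc.Sg", "Acc.Pl"]

def partitions(items):
    """Yield every way of grouping `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or let `first` form a block of its own.
        yield [[first]] + smaller

# Complete schemas: groupings that cover all four cells.
complete = list(partitions(CELLS))

# Schemas with gaps: groupings of a proper, non-empty subset of the cells.
defective = [p
             for k in range(1, len(CELLS))
             for subset in combinations(CELLS, k)
             for p in partitions(list(subset))]

print(len(complete))   # 15
print(len(defective))  # 36
```

Under this reading, the groupings with a single block and with four singleton blocks correspond to full neutralization and to the non-neutralized schema respectively.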
Yaqui manifests two of these 15 schemas; the paradigm in (1) is an instance of the schema in (4), and the paradigm in (2) is an instance of the schema in (17).

(3) Non-neutralized (full) complete paradigmatic schema for binary Case and Number features

Case \ Number   Singular                      Plural
Nominative      [Nominative] & [Singular]   │ [Nominative] & [Plural]
                ────────────────────────────┼────────────────────────────
Accusative      [Accusative] & [Singular]   │ [Accusative] & [Plural]

(4) Neutralization of Case with [Plural]

Case \ Number   Singular                      Plural
Nominative      [Nominative] & [Singular]   │
                ────────────────────────────┤ [Plural]
Accusative      [Accusative] & [Singular]   │

(5) Neutralization of Case with [Singular]

Case \ Number   Singular                      Plural
Nominative                                  │ [Nominative] & [Plural]
                [Singular]                  ├────────────────────────────
Accusative                                  │ [Accusative] & [Plural]

(6) Neutralization of Number with [Accusative]

Case \ Number   Singular                      Plural
Nominative      [Nominative] & [Singular]   │ [Nominative] & [Plural]
                ────────────────────────────┴────────────────────────────
Accusative                           [Accusative]

(7) Neutralization of Number with [Nominative]

Case \ Number   Singular                      Plural
Nominative                           [Nominative]
                ────────────────────────────┬────────────────────────────
Accusative      [Accusative] & [Singular]   │ [Accusative] & [Plural]

(8) Partial neutralization of Case and Number along NW−SE diagonal

Case \ Number   Singular                      Plural
Nominative      ([Nominative] & [Singular]) │ [Nominative] & [Plural]
                | ([Accusative] & [Plural]) │
                ────────────────────────────┼────────────────────────────
Accusative      [Accusative] & [Singular]   │ ([Nominative] & [Singular])
                                            │ | ([Accusative] & [Plural])

(9) Partial neutralization of Case and Number along SW−NE diagonal

Case \ Number   Singular                      Plural
Nominative      [Nominative] & [Singular]   │ ([Accusative] & [Singular])
                                            │ | ([Nominative] & [Plural])
                ────────────────────────────┼────────────────────────────
Accusative      ([Accusative] & [Singular]) │ [Accusative] & [Plural]
                | ([Nominative] & [Plural]) │

(10) Complete neutralization of Case

Case \ Number   Singular                      Plural
Nominative                                  │
                [Singular]                  │ [Plural]
Accusative                                  │

(11) Complete neutralization of Number

Case \ Number   Singular                      Plural
Nominative                           [Nominative]
                ─────────────────────────────────────────────────────────
Accusative                           [Accusative]

(12) Neutralization through negation of [Nominative] & [Singular]

Case \ Number   Singular                      Plural
Nominative      [Nominative] & [Singular]   │ ↑
Accusative      ←                             ~([Nominative] & [Singular])

(13) Neutralization through negation of [Accusative] & [Singular]

Case \ Number   Singular                      Plural
Nominative      ←                             ~([Accusative] & [Singular])
Accusative      [Accusative] & [Singular]   │ ↓

(14) Neutralization through negation of [Nominative] & [Plural]

Case \ Number   Singular                      Plural
Nominative      ↑                           │ [Nominative] & [Plural]
Accusative      ~([Nominative] & [Plural])    →

(15) Neutralization through negation of [Accusative] & [Plural]

Case \ Number   Singular                      Plural
Nominative      ~([Accusative] & [Plural])    →
Accusative      ↓                           │ [Accusative] & [Plural]

(16) Neutralization along both diagonals

Case \ Number   Singular                      Plural
Nominative      ([Nominative] & [Singular]) │ ([Accusative] & [Singular])
                | ([Accusative] & [Plural]) │ | ([Nominative] & [Plural])
                ────────────────────────────┼────────────────────────────
Accusative      ([Accusative] & [Singular]) │ ([Nominative] & [Singular])
                | ([Nominative] & [Plural]) │ | ([Accusative] & [Plural])

(17) Full neutralization of Case and Number

Case \ Number   Singular                      Plural
Nominative
                                  [ ]
Accusative

Defective schemas for inflectional paradigms

A defective schema for inflectional paradigms, on the other hand, is one whose members do not cover the space of all possible values for the features involved, i.e. one that leaves a “gap”. For example, corresponding to the full complete schema in (3), there is the defective schema in (18) that has no provision for the [Accusative] & [Plural] combination of values. In general there are many more defective schemas for inflectional paradigms than complete ones; for example, there are 36 defective schemas for two binary features compared to 15 complete ones, but the occurrence of defective paradigms in natural language descriptions is comparatively rare. We leave the explanation for this fact for another occasion; for now we simply declare that grammars abhor defective paradigms.

(18) Defective schema for an inflectional paradigm, lacking [Accusative] & [Plural]

Case \ Number   Singular                      Plural
Nominative      [Nominative] & [Singular]   │ [Nominative] & [Plural]
                ────────────────────────────┼────────────────────────────
Accusative      [Accusative] & [Singular]   │ −−

The realization of complete paradigm schemas

Different languages manifest different paradigm schemas for given sets of features, but certain preferences are clear. For example, we are aware of no cases in which the schemas involving the “diagonal” neutralizations such as (8), (9) and (16) are realized.
In addition, schemas involving the negation of a particular combination of features such as (12)−(15) are unusual; an example is the Person and Number paradigm for the present tense of verbs (other than be) in standard English. On the other hand, schemas involving the neutralization of one or more features such as (4)−(7), (10) and (11) are quite commonly manifested, with preferences for which feature(s) to neutralize being dictated by markedness considerations. Finally, full complete schemas such as (3) are also very common, at least when the number of feature-value combinations is relatively small, as are fully neutralized complete schemas such as (17).

Accounting for the Yaqui nominal paradigms

There are two classes of morphological theories that account for paradigmatic patterns such as those observed in Yaqui nominal inflection: those that are paradigm-based and those that are vocabulary-based (Bobaljik 2001: 53-54).3 An example of a vocabulary-based morphological theory is Distributed Morphology (DM) (Halle & Marantz 1993), which Bobaljik also espouses. An example of a paradigm-based theory is one developed by Edwin Williams, according to which a paradigm is “a real object, and not the epiphenomenal product of various rules” (Williams 1994: 22).

3 Bobaljik (2001: 78, fn. 1) points out that certain morphological theories, such as that of Wunderlich (1995) and Stump (2001), may not be easily placed within one or the other of these classes.

A Distributed Morphology account

An elegant DM account of the Yaqui nominal paradigms in (1) and (2) was suggested to us by Heidi Harley (see fn. 1). It goes as follows. Assume, as we have already done, that Yaqui nouns are inflected for Case and Number, that the values for Case are [Nominative] and [Accusative] and that the values for Number are [Singular] and [Plural]. Assume also that there are two classes of nouns: Class1, the miisi class, and Class2, the supe class. Then the ordered list of morpheme realization rules in (19) derives the paradigms in (1) and (2), i.e. treats them precisely as epiphenomenal products.

(19) Morpheme realization rules that derive the Yaqui nominal paradigms

    -ta    ⇔  [Accusative] & [Singular] / Class1 ___
    -∅     ⇔  [Singular] / Class1 ___
    -(i)m  ⇔  elsewhere

There are two noteworthy properties of this account. First, a zero affix must be postulated, since the rule for its insertion is ordered after that of -ta insertion and before the default insertion of -(i)m. Second, -(i)m has no inherent features; in particular it is not specified [Plural].

An Optimality Theory account

DM is a theory that ranks rules. On the other hand, Optimality Theory (OT), which ranks constraints rather than rules, can be used within the paradigm-based framework to account for the forms that appear in the paradigms in (1) and (2), but without the use of zero affixes or default (elsewhere) conditions. The suffix -(i)m may be assumed to be specified [Plural] and the suffix -ta specified as [Accusative]. Then, assuming that the entries in the paradigm schema in (4) appear in inputs together with a Class1 noun such as miisi, we correctly account for the choice of affix in accordance with a faithfulness constraint we call FaithFS (FS for “feature specifications”), as shown in the tableaux in (20)-(22).

(20) Choice of miisi to represent miisi [Nominative] & [Singular]

miisi [Nominative] & [Singular]   │ FaithFS
──────────────────────────────────┼────────
⇒  miisi                          │ **
   miisi-ta [Accusative]          │ ***!
   miisi-m [Plural]               │ ***!

(21) Choice of miisi-ta to represent miisi [Accusative] & [Singular]

miisi [Accusative] & [Singular]   │ FaithFS
──────────────────────────────────┼────────
   miisi                          │ **!
⇒  miisi-ta [Accusative]          │ *
   miisi-m [Plural]               │ **!*

(22) Choice of miisi-m to represent miisi [Plural]

miisi [Plural]                    │ FaithFS
──────────────────────────────────┼────────
   miisi                          │ *!
   miisi-ta [Accusative]          │ *!*
⇒  miisi-m [Plural]               │

However, FaithFS by itself does not predict that the affix -(i)m appears in instances of the paradigm schema (17). Instead, as (23) shows, it predicts that no affix appears.

(23) False prediction that supe represents supe [ ]

supe [ ]                          │ FaithFS
──────────────────────────────────┼────────
/⇒ supe                           │
   supe-ta [Accusative]           │ *!
   supe-m [Plural]                │ *!

To force the choice of supem, several additional constraints are required. First, we need a constraint that prefers outputs of inflected forms that have affixes. Call that constraint HaveAff. Clearly FaithFS >> HaveAff, since otherwise the choice of miisi as the expression of miisi [Nominative] & [Singular] would be prevented. However, for Class2 nouns, we require in effect that HaveAff outrank FaithFS. Whether this is a case of local reranking or an additional constraint expressed as a conditional is not our concern here. We assume the latter, calling the constraint HaveAff2, and proposing the ranking HaveAff2 >> FaithFS >> HaveAff. Now supe is not the winning candidate for expressing supe [ ], but as (24) shows, we are still left with no basis for choosing between supe-ta and supe-m.

(24) Failure to choose between supe-ta and supe-m as representing supe [ ]

supe [ ]                          │ HaveAff2 │ FaithFS │ HaveAff
──────────────────────────────────┼──────────┼─────────┼────────
   supe                           │ *!       │         │ *
/⇒ supe-ta [Accusative]           │          │ *       │
⇒  supe-m [Plural]                │          │ *       │

To account for the choice of supe-m, we propose two additional markedness filters: *Case, which indicates an aversion to marking Case, and *Number, which indicates an aversion to marking Number, and the ranking FaithFS >> *Case >> *Number. Then, as (25) shows, we obtain the result that supem instantiates paradigm schema (17) in Yaqui.

(25) Choice of supe-m to represent supe [ ]

supe [ ]                          │ HaveAff2 │ FaithFS │ *Case │ *Number │ HaveAff
──────────────────────────────────┼──────────┼─────────┼───────┼─────────┼────────
   supe                           │ *!       │         │       │         │ *
   supe-ta [Accusative]           │          │ *       │ *!    │         │
⇒  supe-m [Plural]                │          │ *       │       │ *       │

Comparison of DM and OT accounts of Yaqui nominal paradigmatic structure

From our presentation so far of the DM and OT accounts of the paradigmatic structure of Yaqui nominals, one might conclude that the DM account is to be preferred on grounds of simplicity.
As Bobaljik (2001) points out, a vocabulary-based account such as DM is conceptually simpler than paradigm-based accounts of morphological structure, so is to be preferred for that reason, all things being equal. Since we are interested not so much in the comparison between vocabulary-based and paradigm-based accounts as in the comparison of DM and OT accounts of paradigmatic structure, we now convert the paradigm-based OT account given above to a vocabulary-based one, so as to level the playing field for evaluating those two theories in this arena. To effect this conversion, we replace the inputs with elements that represent all possible combinations of the case and number feature values that Yaqui nouns can express and determine the constraint rankings that yield the correct outputs. For example, we consider an input such as miisi [Nominative] & [Plural] and determine what constraint ranking yields the desired miisi-m as output. For Class1 nouns, we determine immediately that the ranking FAITHFS >> *CASE >> *NUMBER yields the desired outputs for all combinations of feature values. In (26) and (27), we give two illustrative tableaux.

(26) Choice of miisi to express miisi [Nominative] & [Singular]
Input: miisi [Nominative] & [Singular] | FAITHFS | *CASE | *NUMBER
⇒ miisi                                | **      |       |
   miisi-ta [Accusative]               | ***!    | *     |
   miisi-m [Plural]                    | ***!    |       | *

(27) Choice of miisi-m to express miisi [Accusative] & [Plural]
Input: miisi [Accusative] & [Plural]   | FAITHFS | *CASE | *NUMBER
   miisi                               | **!     |       |
   miisi-ta [Accusative]               | *       | *!    |
⇒ miisi-m [Plural]                     | *       |       | *

However, this ranking gives the same results for Class2 nouns as for Class1 nouns.
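The evaluation behind tableaux such as (26) and (27) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the analysis itself: the function names are hypothetical, and the way FAITHFS violations are counted (one per feature value on which input and candidate disagree) is an assumption inferred from the violation marks in the tableaux.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

Features = FrozenSet[str]
Candidate = Tuple[str, Features]  # (surface form, features carried by its affix)

CASE_VALUES = {"Nominative", "Accusative"}
NUMBER_VALUES = {"Singular", "Plural"}


def faith_fs(inp: Features, cand: Candidate) -> int:
    # Assumed counting: one violation per feature value present in the input
    # but not the candidate, or in the candidate but not the input.
    return len(inp ^ cand[1])


def star_case(inp: Features, cand: Candidate) -> int:
    # *CASE: one violation per Case value marked on the candidate.
    return len(cand[1] & CASE_VALUES)


def star_number(inp: Features, cand: Candidate) -> int:
    # *NUMBER: one violation per Number value marked on the candidate.
    return len(cand[1] & NUMBER_VALUES)


# Ranking FAITHFS >> *CASE >> *NUMBER, highest-ranked constraint first.
RANKING: List[Callable[[Features, Candidate], int]] = [faith_fs, star_case, star_number]


def candidates(stem: str) -> List[Candidate]:
    # The three candidates considered in the tableaux: bare stem, -ta, -m.
    return [
        (stem, frozenset()),
        (stem + "-ta", frozenset({"Accusative"})),
        (stem + "-m", frozenset({"Plural"})),
    ]


def evaluate(inp: Features, cands: Sequence[Candidate], ranking=RANKING) -> str:
    # Strict domination amounts to lexicographic comparison of violation profiles.
    return min(cands, key=lambda c: tuple(con(inp, c) for con in ranking))[0]


if __name__ == "__main__":
    for feats in (
        {"Nominative", "Singular"},
        {"Accusative", "Singular"},
        {"Nominative", "Plural"},
        {"Accusative", "Plural"},
    ):
        print(sorted(feats), "->", evaluate(frozenset(feats), candidates("miisi")))
```

Because none of the three constraints refers to noun class, calling `evaluate` with `candidates("supe")` selects the corresponding forms of supe, which is the problem the class-specific constraints are meant to address.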
In order that supe-m is always selected as output, no matter what feature value combinations are associated with the input stem supe, we require a version of the *CASE constraint, call it *CASE2, that is specific to Class2 nouns, with the ranking *CASE2 >> FAITHFS; (28) shows that it does not matter how *CASE2 is ranked with respect to HAVEAFF2.

(28) Choice of supe-m to express supe [Accusative] & [Singular]
Input: supe [Accusative] & [Singular]  | HAVEAFF2 | *CASE2 | FAITHFS | *CASE | *NUMBER
   supe                                | *!       |        |         |       |
   supe-ta [Accusative]                |          | *!     | *       | *     |
⇒ supe-m [Plural]                      |          |        | ***     |       | *

The OT analysis presented in this section, like the DM analysis in the preceding section, is vocabulary-based, and derives the two Yaqui nominal paradigm schemas in (4) and (17). However, unlike the DM analysis, it assigns feature content to the suffix -(i)m, namely [Plural]; assigns only one feature value to -ta instead of two and does not explicitly restrict its occurrence to Class1 nouns; and does not posit a zero-affix, much less assign feature content to it. Moreover, the association of features with Yaqui affixes is lexical, as proposed in Lieber (1982) and DiSciullo & Williams (1987), as opposed to realizational as in DM theories generally, and also in Williams (1994); see Bobaljik (2001: 56) for discussion. In all these respects, we believe that the OT analysis is closer to the 'truth' regarding Yaqui (and universal) grammar than the DM analysis. On the other hand, the DM analysis is simpler, inasmuch as it posits only three rules as opposed to the five constraints in the OT analysis.
However, the DM theory suffers from the fact that there is an equally simple analysis in which the first two rules are reordered, given in (29), and there is no basis for choosing between them.

Langendoen & Martínez, Yaqui nominal paradigms and the theory of paradigmatic structure

(29) Another list of morpheme realization rules that derives the Yaqui nominal paradigms
-∅ ⇔ [Nominative] & [Singular] / Class1 ___
-ta ⇔ [Singular] / Class1 ___
-(i)m ⇔ elsewhere

Finally, another advantage we see to the OT analysis is that it provides the beginning of a basis for the analysis of the class of possible paradigms within the enormous space of paradigm schemas provided by the free combination of morphosyntactic feature values. Paradigm schema (4) is derived, as we have already seen, from the ranking FAITHFS >> *CASE >> *NUMBER. Paradigm schema (17) with the [Plural] affix used throughout is derived from the ranking HAVEAFF >> *CASE >> FAITHFS >> *NUMBER. The need to double the *CASE and HAVEAFF constraints in the analysis of Yaqui results from having two coexisting nominal paradigms in the language.

References

Bobaljik, Jonathan David (2001). Syncretism without paradigms: Remarks on Williams 1981, 1994. Yearbook of Morphology 2001: 53-85.
DiSciullo, Anna Marie & Edwin Williams (1987). On the Definition of Word. Cambridge, MA: MIT Press.
Halle, Morris & Alec Marantz (1993). Distributed morphology and the pieces of inflection. In Ken Hale & Samuel Jay Keyser (eds.), The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, MA: MIT Press, 111-176.
Lieber, Rochelle (1982). Allomorphy. Linguistic Analysis 10: 27-52.
Stump, Gregory T. (2001). Inflectional Morphology: A Theory of Paradigm Structure. Cambridge: Cambridge University Press.
Williams, Edwin (1994). Remarks on lexical knowledge. Lingua 92: 7-34.
Wunderlich, Dieter (1995). Minimalist morphology: The role of paradigms. Yearbook of Morphology 1995: 93-114.
Deterministic Policy Gradient Algorithms
David Silver DAVID@
DeepMind Technologies, London, UK
Guy Lever GUY.LEVER@
University College London, UK
Nicolas Heess, Thomas Degris, Daan Wierstra, Martin Riedmiller *@
DeepMind Technologies, London, UK

Abstract
In this paper we consider deterministic policy gradient algorithms for reinforcement learning with continuous actions. The deterministic policy gradient has a particularly appealing form: it is the expected gradient of the action-value function. This simple form means that the deterministic policy gradient can be estimated much more efficiently than the usual stochastic policy gradient. To ensure adequate exploration, we introduce an off-policy actor-critic algorithm that learns a deterministic target policy from an exploratory behaviour policy. We demonstrate that deterministic policy gradient algorithms can significantly outperform their stochastic counterparts in high-dimensional action spaces.

1. Introduction
Policy gradient algorithms are widely used in reinforcement learning problems with continuous action spaces. The basic idea is to represent the policy by a parametric probability distribution $\pi_\theta(a|s) = \mathbb{P}[a|s;\theta]$ that stochastically selects action $a$ in state $s$ according to parameter vector $\theta$.
Policy gradient algorithms typically proceed by sampling this stochastic policy and adjusting the policy parameters in the direction of greater cumulative reward.

In this paper we instead consider deterministic policies $a = \mu_\theta(s)$. It is natural to wonder whether the same approach can be followed as for stochastic policies: adjusting the policy parameters in the direction of the policy gradient. It was previously believed that the deterministic policy gradient did not exist, or could only be obtained when using a model (Peters, 2010). However, we show that the deterministic policy gradient does indeed exist, and furthermore it has a simple model-free form that simply follows the gradient of the action-value function. In addition, we show that the deterministic policy gradient is the limiting case, as policy variance tends to zero, of the stochastic policy gradient.

Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014. JMLR: W&CP volume 32. Copyright 2014 by the author(s).

From a practical viewpoint, there is a crucial difference between the stochastic and deterministic policy gradients. In the stochastic case, the policy gradient integrates over both state and action spaces, whereas in the deterministic case it only integrates over the state space. As a result, computing the stochastic policy gradient may require more samples, especially if the action space has many dimensions.

In order to explore the full state and action space, a stochastic policy is often necessary. To ensure that our deterministic policy gradient algorithms continue to explore satisfactorily, we introduce an off-policy learning algorithm. The basic idea is to choose actions according to a stochastic behaviour policy (to ensure adequate exploration), but to learn about a deterministic target policy (exploiting the efficiency of the deterministic policy gradient). We use the deterministic policy gradient to derive an off-policy actor-critic algorithm that estimates the action-value function
using a differentiable function approximator, and then updates the policy parameters in the direction of the approximate action-value gradient. We also introduce a notion of compatible function approximation for deterministic policy gradients, to ensure that the approximation does not bias the policy gradient.

We apply our deterministic actor-critic algorithms to several benchmark problems: a high-dimensional bandit; several standard benchmark reinforcement learning tasks with low dimensional action spaces; and a high-dimensional task for controlling an octopus arm. Our results demonstrate a significant performance advantage to using deterministic policy gradients over stochastic policy gradients, particularly in high dimensional tasks. Furthermore, our algorithms require no more computation than prior methods: the computational cost of each update is linear in the action dimensionality and the number of policy parameters. Finally, there are many applications (for example in robotics) where a differentiable control policy is provided, but where there is no functionality to inject noise into the controller. In these cases, the stochastic policy gradient is inapplicable, whereas our methods may still be useful.

2. Background
2.1. Preliminaries
We study reinforcement learning and control problems in which an agent acts in a stochastic environment by sequentially choosing actions over a sequence of time steps, in order to maximise a cumulative reward. We model the problem as a Markov decision process (MDP) which comprises: a state space $\mathcal{S}$, an action space $\mathcal{A}$, an initial state distribution with density $p_1(s_1)$, a stationary transition dynamics distribution with conditional density $p(s_{t+1}|s_t,a_t)$ satisfying the Markov property $p(s_{t+1}|s_1,a_1,\ldots,s_t,a_t) = p(s_{t+1}|s_t,a_t)$, for any trajectory $s_1,a_1,s_2,a_2,\ldots,s_T,a_T$ in state-action space, and a reward function $r : \mathcal{S} \times \mathcal{A} \to \mathbb{R}$.

A policy is used to select actions in the MDP. In general the policy is stochastic and denoted by $\pi_\theta : \mathcal{S} \to \mathcal{P}(\mathcal{A})$, where $\mathcal{P}(\mathcal{A})$ is the set of probability measures on $\mathcal{A}$ and $\theta \in \mathbb{R}^n$ is a vector of $n$ parameters, and $\pi_\theta(a_t|s_t)$ is the conditional probability density at $a_t$ associated with the policy. The agent uses its policy to interact with the MDP to give a trajectory of states, actions and rewards, $h_{1:T} = s_1,a_1,r_1,\ldots,s_T,a_T,r_T$ over $\mathcal{S} \times \mathcal{A} \times \mathbb{R}$. The return $r_t^\gamma$ is the total discounted reward from time-step $t$ onwards, $r_t^\gamma = \sum_{k=t}^{\infty} \gamma^{k-t} r(s_k,a_k)$ where $0 < \gamma < 1$. Value functions are defined to be the expected total discounted reward, $V^\pi(s) = \mathbb{E}[r_1^\gamma \mid S_1 = s; \pi]$ and $Q^\pi(s,a) = \mathbb{E}[r_1^\gamma \mid S_1 = s, A_1 = a; \pi]$.[1] The agent's goal is to obtain a policy which maximises the cumulative discounted reward from the start state, denoted by the performance objective $J(\pi) = \mathbb{E}[r_1^\gamma \mid \pi]$.

We denote the density at state $s'$ after transitioning for $t$ time steps from state $s$ by $p(s \to s', t, \pi)$. We also denote the (improper) discounted state distribution by $\rho^\pi(s') := \int_{\mathcal{S}} \sum_{t=1}^{\infty} \gamma^{t-1} p_1(s)\, p(s \to s', t, \pi)\, \mathrm{d}s$. We can then write the performance objective as an expectation,

$$J(\pi_\theta) = \int_{\mathcal{S}} \rho^\pi(s) \int_{\mathcal{A}} \pi_\theta(s,a)\, r(s,a)\, \mathrm{d}a\, \mathrm{d}s = \mathbb{E}_{s \sim \rho^\pi, a \sim \pi_\theta}[r(s,a)] \tag{1}$$

where $\mathbb{E}_{s \sim \rho}[\cdot]$ denotes the (improper) expected value with respect to discounted state distribution $\rho(s)$.[2] In the remainder of the paper we suppose for simplicity that $\mathcal{A} = \mathbb{R}^m$ and that $\mathcal{S}$ is a compact subset of $\mathbb{R}^d$.

[1] To simplify notation, we frequently drop the random variable in the conditional density and write $p(s_{t+1}|s_t,a_t) = p(s_{t+1}|S_t = s_t, A_t = a_t)$; furthermore we superscript value functions by $\pi$ rather than $\pi_\theta$.
[2] The results in this paper may be extended to an average reward performance objective by choosing $\rho(s)$ to be the stationary distribution of an ergodic MDP.

2.2. Stochastic Policy Gradient Theorem
Policy gradient algorithms are perhaps the most popular class of continuous action reinforcement learning algorithms. The basic idea behind these algorithms is to adjust the parameters $\theta$ of the policy in the direction of the performance gradient $\nabla_\theta J(\pi_\theta)$. The fundamental result underlying
these algorithms is the policy gradient theorem (Sutton et al., 1999),

$$\nabla_\theta J(\pi_\theta) = \int_{\mathcal{S}} \rho^\pi(s) \int_{\mathcal{A}} \nabla_\theta \pi_\theta(a|s)\, Q^\pi(s,a)\, \mathrm{d}a\, \mathrm{d}s = \mathbb{E}_{s \sim \rho^\pi, a \sim \pi_\theta}\left[\nabla_\theta \log \pi_\theta(a|s)\, Q^\pi(s,a)\right] \tag{2}$$

The policy gradient is surprisingly simple. In particular, despite the fact that the state distribution $\rho^\pi(s)$ depends on the policy parameters, the policy gradient does not depend on the gradient of the state distribution.

This theorem has important practical value, because it reduces the computation of the performance gradient to a simple expectation. The policy gradient theorem has been used to derive a variety of policy gradient algorithms (Degris et al., 2012a), by forming a sample-based estimate of this expectation. One issue that these algorithms must address is how to estimate the action-value function $Q^\pi(s,a)$. Perhaps the simplest approach is to use a sample return $r_t^\gamma$ to estimate the value of $Q^\pi(s_t,a_t)$, which leads to a variant of the REINFORCE algorithm (Williams, 1992).

2.3. Stochastic Actor-Critic Algorithms
The actor-critic is a widely used architecture based on the policy gradient theorem (Sutton et al., 1999; Peters et al., 2005; Bhatnagar et al., 2007; Degris et al., 2012a). The actor-critic consists of two eponymous components. An actor adjusts the parameters $\theta$ of the stochastic policy $\pi_\theta(s)$ by stochastic gradient ascent of Equation 2. Instead of the unknown true action-value function $Q^\pi(s,a)$ in Equation 2, an action-value function $Q^w(s,a)$ is used, with parameter vector $w$. A critic estimates the action-value function $Q^w(s,a) \approx Q^\pi(s,a)$ using an appropriate policy evaluation algorithm such as temporal-difference learning.

In general, substituting a function approximator $Q^w(s,a)$ for the true action-value function $Q^\pi(s,a)$ may introduce bias. However, if the function approximator is compatible such that i) $Q^w(s,a) = \nabla_\theta \log \pi_\theta(a|s)^\top w$ and ii) the parameters $w$ are chosen to minimise the mean-squared error $\epsilon^2(w) = \mathbb{E}_{s \sim \rho^\pi, a \sim \pi_\theta}\left[(Q^w(s,a) - Q^\pi(s,a))^2\right]$, then there is no bias (Sutton et al., 1999),

$$\nabla_\theta J(\pi_\theta) = \mathbb{E}_{s \sim \rho^\pi, a \sim \pi_\theta}\left[\nabla_\theta \log \pi_\theta(a|s)\, Q^w(s,a)\right] \tag{3}$$

More intuitively, condition i) says that compatible function approximators are linear in "features" of the stochastic policy, $\nabla_\theta \log \pi_\theta(a|s)$, and condition ii) requires that the parameters are the solution to the linear regression problem that estimates $Q^\pi(s,a)$ from these features. In practice, condition ii) is usually relaxed in favour of policy evaluation algorithms that estimate the value function more efficiently by temporal-difference learning (Bhatnagar et al., 2007; Degris et al., 2012b; Peters et al., 2005); indeed if both i) and ii) are satisfied then the overall algorithm is equivalent to not using a critic at all (Sutton et al., 2000), much like the REINFORCE algorithm (Williams, 1992).

2.4. Off-Policy Actor-Critic
It is often useful to estimate the policy gradient off-policy from trajectories sampled from a distinct behaviour policy $\beta(a|s) \neq \pi_\theta(a|s)$. In an off-policy setting, the performance objective is typically modified to be the value function of the target policy, averaged over the state distribution of the behaviour policy (Degris et al., 2012b),

$$J_\beta(\pi_\theta) = \int_{\mathcal{S}} \rho^\beta(s)\, V^\pi(s)\, \mathrm{d}s = \int_{\mathcal{S}} \int_{\mathcal{A}} \rho^\beta(s)\, \pi_\theta(a|s)\, Q^\pi(s,a)\, \mathrm{d}a\, \mathrm{d}s$$

Differentiating the performance objective and applying an approximation gives the off-policy policy-gradient (Degris et al., 2012b)

$$\nabla_\theta J_\beta(\pi_\theta) \approx \int_{\mathcal{S}} \int_{\mathcal{A}} \rho^\beta(s)\, \nabla_\theta \pi_\theta(a|s)\, Q^\pi(s,a)\, \mathrm{d}a\, \mathrm{d}s \tag{4}$$

$$= \mathbb{E}_{s \sim \rho^\beta, a \sim \beta}\left[\frac{\pi_\theta(a|s)}{\beta_\theta(a|s)} \nabla_\theta \log \pi_\theta(a|s)\, Q^\pi(s,a)\right] \tag{5}$$

This approximation drops a term that depends on the action-value gradient $\nabla_\theta Q^\pi(s,a)$; Degris et al. (2012b) argue that this is a good approximation since it can preserve the set of local optima to which gradient ascent converges. The Off-Policy Actor-Critic (OffPAC) algorithm (Degris et al., 2012b) uses a behaviour policy $\beta(a|s)$ to generate trajectories. A critic estimates a state-value function, $V^v(s) \approx V^\pi(s)$, off-policy from these trajectories, by gradient temporal-difference learning (Sutton et al., 2009). An actor updates the policy parameters $\theta$, also off-policy from these trajectories, by
stochastic gradient ascent of Equation 5. Instead of the unknown action-value function $Q^\pi(s,a)$ in Equation 5, the temporal-difference error $\delta_t$ is used, $\delta_t = r_{t+1} + \gamma V^v(s_{t+1}) - V^v(s_t)$; this can be shown to provide an approximation to the true gradient (Bhatnagar et al., 2007). Both the actor and the critic use an importance sampling ratio $\frac{\pi_\theta(a|s)}{\beta_\theta(a|s)}$ to adjust for the fact that actions were selected according to $\beta$ rather than $\pi$.

3. Gradients of Deterministic Policies
We now consider how the policy gradient framework may be extended to deterministic policies. Our main result is a deterministic policy gradient theorem, analogous to the stochastic policy gradient theorem presented in the previous section. We provide several ways to derive and understand this result. First we provide an informal intuition behind the form of the deterministic policy gradient. We then give a formal proof of the deterministic policy gradient theorem from first principles. Finally, we show that the deterministic policy gradient theorem is in fact a limiting case of the stochastic policy gradient theorem. Details of the proofs are deferred until the appendices.

3.1. Action-Value Gradients
The majority of model-free reinforcement learning algorithms are based on generalised policy iteration: interleaving policy evaluation with policy improvement (Sutton and Barto, 1998). Policy evaluation methods estimate the action-value function $Q^\pi(s,a)$ or $Q^\mu(s,a)$, for example by Monte-Carlo evaluation or temporal-difference learning. Policy improvement methods update the policy with respect to the (estimated) action-value function. The most common approach is a greedy maximisation (or soft maximisation) of the action-value function, $\mu^{k+1}(s) = \operatorname{argmax}_a Q^{\mu^k}(s,a)$.

In continuous action spaces, greedy policy improvement becomes problematic, requiring a global maximisation at every step. Instead, a simple and computationally attractive alternative is to move the policy in the direction of the gradient of $Q$, rather than
globally maximising $Q$. Specifically, for each visited state $s$, the policy parameters $\theta^{k+1}$ are updated in proportion to the gradient $\nabla_\theta Q^{\mu^k}(s, \mu_\theta(s))$. Each state suggests a different direction of policy improvement; these may be averaged together by taking an expectation with respect to the state distribution $\rho^\mu(s)$,

$$\theta^{k+1} = \theta^k + \alpha\, \mathbb{E}_{s \sim \rho^{\mu^k}}\left[\nabla_\theta Q^{\mu^k}(s, \mu_\theta(s))\right] \tag{6}$$

By applying the chain rule we see that the policy improvement may be decomposed into the gradient of the action-value with respect to actions, and the gradient of the policy with respect to the policy parameters.

$$\theta^{k+1} = \theta^k + \alpha\, \mathbb{E}_{s \sim \rho^{\mu^k}}\left[\nabla_\theta \mu_\theta(s)\, \nabla_a Q^{\mu^k}(s,a)\big|_{a=\mu_\theta(s)}\right] \tag{7}$$

By convention $\nabla_\theta \mu_\theta(s)$ is a Jacobian matrix such that each column is the gradient $\nabla_\theta [\mu_\theta(s)]_d$ of the $d$th action dimension of the policy with respect to the policy parameters $\theta$. However, by changing the policy, different states are visited and the state distribution $\rho^\mu$ will change. As a result it is not immediately obvious that this approach guarantees improvement, without taking account of the change to distribution. However, the theory below shows that, like the stochastic policy gradient theorem, there is no need to compute the gradient of the state distribution; and that the intuitive update outlined above is following precisely the gradient of the performance objective.

3.2. Deterministic Policy Gradient Theorem
We now formally consider a deterministic policy $\mu_\theta : \mathcal{S} \to \mathcal{A}$ with parameter vector $\theta \in \mathbb{R}^n$. We define a performance objective $J(\mu_\theta) = \mathbb{E}[r_1^\gamma \mid \mu]$, and define probability distribution $p(s \to s', t, \mu)$ and discounted state distribution $\rho^\mu(s)$ analogously to the stochastic case. This again lets us write the performance objective as an expectation,

$$J(\mu_\theta) = \int_{\mathcal{S}} \rho^\mu(s)\, r(s, \mu_\theta(s))\, \mathrm{d}s = \mathbb{E}_{s \sim \rho^\mu}\left[r(s, \mu_\theta(s))\right] \tag{8}$$

We now provide the deterministic analogue to the policy gradient theorem. The proof follows a similar scheme to (Sutton et al., 1999) and is provided in Appendix B.

Theorem 1 (Deterministic Policy Gradient Theorem).
Suppose that the MDP satisfies conditions A.1 (see Appendix; these imply that $\nabla_\theta \mu_\theta(s)$ and $\nabla_a Q^\mu(s,a)$ exist and that the deterministic policy gradient exists). Then,

$$\nabla_\theta J(\mu_\theta) = \int_{\mathcal{S}} \rho^\mu(s)\, \nabla_\theta \mu_\theta(s)\, \nabla_a Q^\mu(s,a)\big|_{a=\mu_\theta(s)}\, \mathrm{d}s = \mathbb{E}_{s \sim \rho^\mu}\left[\nabla_\theta \mu_\theta(s)\, \nabla_a Q^\mu(s,a)\big|_{a=\mu_\theta(s)}\right] \tag{9}$$

3.3. Limit of the Stochastic Policy Gradient
The deterministic policy gradient theorem does not at first glance look like the stochastic version (Equation 2). However, we now show that, for a wide class of stochastic policies, including many bump functions, the deterministic policy gradient is indeed a special (limiting) case of the stochastic policy gradient. We parametrise stochastic policies $\pi_{\mu_\theta,\sigma}$ by a deterministic policy $\mu_\theta : \mathcal{S} \to \mathcal{A}$ and a variance parameter $\sigma$, such that for $\sigma = 0$ the stochastic policy is equivalent to the deterministic policy, $\pi_{\mu_\theta,0} \equiv \mu_\theta$. Then we show that as $\sigma \to 0$ the stochastic policy gradient converges to the deterministic gradient (see Appendix C for proof and technical conditions).

Theorem 2. Consider a stochastic policy $\pi_{\mu_\theta,\sigma}$ such that $\pi_{\mu_\theta,\sigma}(a|s) = \nu_\sigma(\mu_\theta(s), a)$, where $\sigma$ is a parameter controlling the variance and $\nu_\sigma$ satisfy conditions B.1 and the MDP satisfies conditions A.1 and A.2. Then,

$$\lim_{\sigma \downarrow 0} \nabla_\theta J(\pi_{\mu_\theta,\sigma}) = \nabla_\theta J(\mu_\theta) \tag{10}$$

where on the l.h.s. the gradient is the standard stochastic policy gradient and on the r.h.s. the gradient is the deterministic policy gradient.

This is an important result because it shows that the familiar machinery of policy gradients, for example compatible function approximation (Sutton et al., 1999), natural gradients (Kakade, 2001), actor-critic (Bhatnagar et al., 2007), or episodic/batch methods (Peters et al., 2005), is also applicable to deterministic policy gradients.

4. Deterministic Actor-Critic Algorithms
We now use the deterministic policy gradient theorem to derive both on-policy and off-policy actor-critic algorithms. We begin with the simplest case (on-policy updates, using a simple Sarsa critic) so as to illustrate the ideas as clearly as possible. We then consider the off-policy case, this time using a simple Q-learning critic to illustrate
the key ideas. These simple algorithms may have convergence issues in practice, due both to bias introduced by the function approximator, and also the instabilities caused by off-policy learning. We then turn to a more principled approach using compatible function approximation and gradient temporal-difference learning.

4.1. On-Policy Deterministic Actor-Critic
In general, behaving according to a deterministic policy will not ensure adequate exploration and may lead to suboptimal solutions. Nevertheless, our first algorithm is an on-policy actor-critic algorithm that learns and follows a deterministic policy. Its primary purpose is didactic; however, it may be useful for environments in which there is sufficient noise in the environment to ensure adequate exploration, even with a deterministic behaviour policy.

Like the stochastic actor-critic, the deterministic actor-critic consists of two components. The critic estimates the action-value function while the actor ascends the gradient of the action-value function. Specifically, an actor adjusts the parameters $\theta$ of the deterministic policy $\mu_\theta(s)$ by stochastic gradient ascent of Equation 9. As in the stochastic actor-critic, we substitute a differentiable action-value function $Q^w(s,a)$ in place of the true action-value function $Q^\mu(s,a)$. A critic estimates the action-value function $Q^w(s,a) \approx Q^\mu(s,a)$, using an appropriate policy evaluation algorithm. For example, in the following deterministic actor-critic algorithm, the critic uses Sarsa updates to estimate the action-value function (Sutton and Barto, 1998),

$$\delta_t = r_t + \gamma Q^w(s_{t+1}, a_{t+1}) - Q^w(s_t, a_t) \tag{11}$$
$$w_{t+1} = w_t + \alpha_w \delta_t \nabla_w Q^w(s_t, a_t) \tag{12}$$
$$\theta_{t+1} = \theta_t + \alpha_\theta \nabla_\theta \mu_\theta(s_t)\, \nabla_a Q^w(s_t, a_t)\big|_{a=\mu_\theta(s)} \tag{13}$$

4.2. Off-Policy Deterministic Actor-Critic
We now consider off-policy methods that learn a deterministic target policy $\mu_\theta(s)$ from trajectories generated by an arbitrary stochastic behaviour policy $\pi(s,a)$. As before, we modify the performance objective to be the value function of the target policy, averaged over the state
distribution of the behaviour policy,

$$J_\beta(\mu_\theta) = \int_{\mathcal{S}} \rho^\beta(s)\, V^\mu(s)\, \mathrm{d}s = \int_{\mathcal{S}} \rho^\beta(s)\, Q^\mu(s, \mu_\theta(s))\, \mathrm{d}s \tag{14}$$

$$\nabla_\theta J_\beta(\mu_\theta) \approx \int_{\mathcal{S}} \rho^\beta(s)\, \nabla_\theta \mu_\theta(s)\, \nabla_a Q^\mu(s,a)\big|_{a=\mu_\theta(s)}\, \mathrm{d}s = \mathbb{E}_{s \sim \rho^\beta}\left[\nabla_\theta \mu_\theta(s)\, \nabla_a Q^\mu(s,a)\big|_{a=\mu_\theta(s)}\right] \tag{15}$$

This equation gives the off-policy deterministic policy gradient. Analogous to the stochastic case (see Equation 4), we have dropped a term that depends on $\nabla_\theta Q^{\mu_\theta}(s,a)$; justification similar to Degris et al. (2012b) can be made in support of this approximation.

We now develop an actor-critic algorithm that updates the policy in the direction of the off-policy deterministic policy gradient. We again substitute a differentiable action-value function $Q^w(s,a)$ in place of the true action-value function $Q^\mu(s,a)$ in Equation 15. A critic estimates the action-value function $Q^w(s,a) \approx Q^\mu(s,a)$, off-policy from trajectories generated by $\beta(a|s)$, using an appropriate policy evaluation algorithm. In the following off-policy deterministic actor-critic (OPDAC) algorithm, the critic uses Q-learning updates to estimate the action-value function.

$$\delta_t = r_t + \gamma Q^w(s_{t+1}, \mu_\theta(s_{t+1})) - Q^w(s_t, a_t) \tag{16}$$
$$w_{t+1} = w_t + \alpha_w \delta_t \nabla_w Q^w(s_t, a_t) \tag{17}$$
$$\theta_{t+1} = \theta_t + \alpha_\theta \nabla_\theta \mu_\theta(s_t)\, \nabla_a Q^w(s_t, a_t)\big|_{a=\mu_\theta(s)} \tag{18}$$

We note that stochastic off-policy actor-critic algorithms typically use importance sampling for both actor and critic (Degris et al., 2012b). However, because the deterministic policy gradient removes the integral over actions, we can avoid importance sampling in the actor; and by using Q-learning, we can avoid importance sampling in the critic.

4.3. Compatible Function Approximation
In general, substituting an approximate $Q^w(s,a)$ into the deterministic policy gradient will not necessarily follow the true gradient (nor indeed will it necessarily be an ascent direction at all). Similar to the stochastic case, we now find a class of compatible function approximators $Q^w(s,a)$ such that the true gradient is preserved. In other words, we find a critic $Q^w(s,a)$ such that the gradient $\nabla_a Q^\mu(s,a)$ can be replaced by $\nabla_a Q^w(s,a)$, without affecting the
deterministic policy gradient. The following theorem applies to both on-policy, $\mathbb{E}[\cdot] = \mathbb{E}_{s \sim \rho^\mu}[\cdot]$, and off-policy, $\mathbb{E}[\cdot] = \mathbb{E}_{s \sim \rho^\beta}[\cdot]$,

Theorem 3. A function approximator $Q^w(s,a)$ is compatible with a deterministic policy $\mu_\theta(s)$, $\nabla_\theta J_\beta(\theta) = \mathbb{E}\left[\nabla_\theta \mu_\theta(s)\, \nabla_a Q^w(s,a)\big|_{a=\mu_\theta(s)}\right]$, if
1. $\nabla_a Q^w(s,a)\big|_{a=\mu_\theta(s)} = \nabla_\theta \mu_\theta(s)^\top w$ and
2. $w$ minimises the mean-squared error, $MSE(\theta,w) = \mathbb{E}\left[\epsilon(s;\theta,w)^\top \epsilon(s;\theta,w)\right]$ where $\epsilon(s;\theta,w) = \nabla_a Q^w(s,a)\big|_{a=\mu_\theta(s)} - \nabla_a Q^\mu(s,a)\big|_{a=\mu_\theta(s)}$

Proof. If $w$ minimises the MSE then the gradient of $\epsilon^2$ w.r.t. $w$ must be zero. We then use the fact that, by condition 1, $\nabla_w \epsilon(s;\theta,w) = \nabla_\theta \mu_\theta(s)$,

$$\nabla_w MSE(\theta,w) = 0$$
$$\mathbb{E}\left[\nabla_\theta \mu_\theta(s)\, \epsilon(s;\theta,w)\right] = 0$$
$$\mathbb{E}\left[\nabla_\theta \mu_\theta(s)\, \nabla_a Q^w(s,a)\big|_{a=\mu_\theta(s)}\right] = \mathbb{E}\left[\nabla_\theta \mu_\theta(s)\, \nabla_a Q^\mu(s,a)\big|_{a=\mu_\theta(s)}\right] = \nabla_\theta J_\beta(\mu_\theta) \text{ or } \nabla_\theta J(\mu_\theta)$$

For any deterministic policy $\mu_\theta(s)$, there always exists a compatible function approximator of the form $Q^w(s,a) = (a - \mu_\theta(s))^\top \nabla_\theta \mu_\theta(s)^\top w + V^v(s)$, where $V^v(s)$ may be any differentiable baseline function that is independent of the action $a$; for example a linear combination of state features $\phi(s)$ and parameters $v$, $V^v(s) = v^\top \phi(s)$ for parameters $v$. A natural interpretation is that $V^v(s)$ estimates the value of state $s$, while the first term estimates the advantage $A^w(s,a)$ of taking action $a$ over action $\mu_\theta(s)$ in state $s$. The advantage function can be viewed as a linear function approximator, $A^w(s,a) = \phi(s,a)^\top w$ with state-action features $\phi(s,a) \overset{\text{def}}{=} \nabla_\theta \mu_\theta(s)(a - \mu_\theta(s))$ and parameters $w$. Note that if there are $m$ action dimensions and $n$ policy parameters, then $\nabla_\theta \mu_\theta(s)$ is an $n \times m$ Jacobian matrix, so the feature vector is $n \times 1$, and the parameter vector $w$ is also $n \times 1$. A function approximator of this form satisfies condition 1 of Theorem 3.

We note that a linear function approximator is not very useful for predicting action-values globally, since the action-value diverges to $\pm\infty$ for large actions. However, it can still be highly effective as a local critic. In particular, it represents the local advantage of deviating from the current policy, $A^w(s, \mu_\theta(s) + \delta) = \delta^\top \nabla_\theta \mu_\theta(s)^\top w$, where $\delta$ represents a
small deviation from the deterministic policy. As a result, a linear function approximator is sufficient to select the direction in which the actor should adjust its policy parameters.

To satisfy condition 2 we need to find the parameters $w$ that minimise the mean-squared error between the gradient of $Q^w$ and the true gradient. This can be viewed as a linear regression problem with "features" $\phi(s,a)$ and "targets" $\nabla_a Q^\mu(s,a)\big|_{a=\mu_\theta(s)}$. In other words, features of the policy are used to predict the true gradient $\nabla_a Q^\mu(s,a)$ at state $s$. However, acquiring unbiased samples of the true gradient is difficult. In practice, we use a linear function approximator $Q^w(s,a) = \phi(s,a)^\top w$ to satisfy condition 1, but we learn $w$ by a standard policy evaluation method (for example Sarsa or Q-learning, for the on-policy or off-policy deterministic actor-critic algorithms respectively) that does not exactly satisfy condition 2. We note that a reasonable solution to the policy evaluation problem will find $Q^w(s,a) \approx Q^\mu(s,a)$ and will therefore approximately (for smooth function approximators) satisfy $\nabla_a Q^w(s,a)\big|_{a=\mu_\theta(s)} \approx \nabla_a Q^\mu(s,a)\big|_{a=\mu_\theta(s)}$.

To summarise, a compatible off-policy deterministic actor-critic (COPDAC) algorithm consists of two components. The critic is a linear function approximator that estimates the action-value from features $\phi(s,a) = a^\top \nabla_\theta \mu_\theta(s)^\top$. This may be learnt off-policy from samples of a behaviour policy $\beta(a|s)$, for example using Q-learning or gradient Q-learning. The actor then updates its parameters in the direction of the critic's action-value gradient. The following COPDAC-Q algorithm uses a simple Q-learning critic.

$$\delta_t = r_t + \gamma Q^w(s_{t+1}, \mu_\theta(s_{t+1})) - Q^w(s_t, a_t) \tag{19}$$
$$\theta_{t+1} = \theta_t + \alpha_\theta \nabla_\theta \mu_\theta(s_t) \left(\nabla_\theta \mu_\theta(s_t)^\top w_t\right) \tag{20}$$
$$w_{t+1} = w_t + \alpha_w \delta_t \phi(s_t, a_t) \tag{21}$$
$$v_{t+1} = v_t + \alpha_v \delta_t \phi(s_t) \tag{22}$$

It is well-known that off-policy Q-learning may diverge when using linear function approximation. A more recent family of methods, based on gradient temporal-difference learning, are true gradient descent
algorithm and are there-fore sure to converge (Sutton et al.,2009).The basic idea of these methods is to minimise the mean-squared projected Bellman error (MSPBE)by stochastic gradient descent;full details are beyond the scope of this paper.Similar to the OffPAC algorithm (Degris et al.,2012b ),we use gradi-ent temporal-difference learning in the critic.Specifically,we use gradient Q-learning in the critic (Maei et al.,2010),and note that under suitable conditions on the step-sizes,αθ,αw ,αu ,to ensure that the critic is updated on a faster time-scale than the actor,the critic will converge to the pa-rameters minimising the MSPBE (Sutton et al.,2009;De-gris et al.,2012b ).The following COPDAC-GQ algorithm combines COPDAC with a gradient Q-learning critic,δt =r t +γQ w (s t +1,µθ(s t +1))−Q w (s t ,a t )(23)θt +1=θt +αθ∇θµθ(s t ) ∇θµθ(s t ) w t(24)w t +1=w t +αw δt φ(s t ,a t )−αw γφ(s t +1,µθ(s t +1)) φ(s t ,a t ) u t(25)v t +1=v t +αv δt φ(s t )−αv γφ(s t +1) φ(s t ,a t ) u t(26)u t +1=u t +αu δt −φ(s t ,a t ) u tφ(s t ,a t )(27)Like stochastic actor-critic algorithms,the computational complexity of all these updates is O (mn )per time-step.Finally,we show that the natural policy gradient (Kakade ,2001;Peters et al.,2005)can be extended to deter-ministic policies.The steepest ascent direction of our performance objective with respect to any metric M (θ)is given by M (θ)−1∇θJ (µθ)(Toussaint ,2012).The natural gradient is the steepest ascent direction with respect to the Fisher information metric M π(θ)=E s ∼ρπ,a ∼πθ ∇θlog πθ(a |s )∇θlog πθ(a |s );this metric is invariant to reparameterisations of the policy (Bagnell and Schneider ,2003).For deterministic policies,we use the metric M µ(θ)=E s ∼ρµ∇θµθ(s )∇θµθ(s ) which can be viewed as the limiting case of the Fisher informa-tion metric as policy variance is reduced to zero.By com-bining the deterministic policy gradient theorem with com-patible function approximation we see that ∇θJ (µθ)=E s ∼ρµ ∇θµθ(s )∇θµθ(s )w 
and so the steepest ascent direction is simply $M_\mu(\theta)^{-1} \nabla_\theta J_\beta(\mu_\theta) = w$. This algorithm can be implemented by simplifying Equations 20 or 24 to $\theta_{t+1} = \theta_t + \alpha_\theta w_t$.

5. Experiments

5.1. Continuous Bandit

Our first experiment focuses on a direct comparison between the stochastic policy gradient and the deterministic policy gradient. The problem is a continuous bandit problem with a high-dimensional quadratic cost function, $-r(a) = (a - a^*)^\top C (a - a^*)$. The matrix $C$ is positive definite with eigenvalues chosen from $\{0.1, 1\}$, and $a^* = [4, \ldots, 4]^\top$. We consider action dimensions of $m = 10, 25, 50$. Although this problem could be solved analytically, given full knowledge of the quadratic, we are interested here in the relative performance of model-free stochastic and deterministic policy gradient algorithms.

For the stochastic actor-critic in the bandit task (SAC-B) we use an isotropic Gaussian policy, $\pi_{\theta,y}(\cdot) \sim \mathcal{N}(\theta, \exp(y))$, and adapt both the mean and the variance of the policy. The deterministic actor-critic algorithm is based on COPDAC, using a target policy, $\mu_\theta = \theta$, and a fixed-width Gaussian behaviour policy, $\beta(\cdot) \sim \mathcal{N}(\theta, \sigma_\beta^2)$. The critic $Q(a)$ is simply estimated by linear regression from the compatible features to the costs: for SAC-B the compatible features are $\nabla_\theta \log \pi_\theta(a)$; for COPDAC-B they are $\nabla_\theta \mu_\theta(a)(a - \theta)$; a bias feature is also included in both cases. For this experiment the critic is recomputed from each successive batch of $2m$ steps; the actor is updated once per batch. To evaluate performance we measure the average cost per step incurred by the mean (i.e. exploration is not penalised for the on-policy algorithm). We performed a parameter sweep over all step-size parameters and variance parameters (initial $y$ for SAC; $\sigma_\beta^2$ for COPDAC). Figure 1 shows the performance of the best performing parameters for each algorithm, averaged over 5 runs. The results illustrate a significant performance advantage to the deterministic update, which grows larger with increasing dimensionality.

We also ran an experiment in
which the stochastic actor-critic used the same fixed variance $\sigma_\beta^2$ as the deterministic actor-critic, so that only the mean was adapted. This did not improve the performance of the stochastic actor-critic: COPDAC-B still outperforms SAC-B by a very wide margin that grows larger with increasing dimension.
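The deterministic side of the bandit experiment can be sketched in a few lines. The following is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `run_copdac_bandit`, the starting point $\theta = 0$, the step size, the behaviour width, the number of batches and the random eigenbasis of $C$ are all hypothetical choices, and the critic here regresses on rewards $r = -\text{cost}$ so that the actor step $\theta \leftarrow \theta + \alpha_\theta w$ is an ascent step.

```python
import numpy as np


def run_copdac_bandit(m=10, sigma_beta=0.1, alpha_theta=0.1, n_batches=300, seed=0):
    """COPDAC-style actor-critic on a quadratic continuous bandit (illustrative).

    Returns (initial_cost, final_cost), both evaluated at the policy mean theta.
    All default values are assumptions chosen for the sketch.
    """
    rng = np.random.default_rng(seed)

    # Positive-definite C with eigenvalues drawn from {0.1, 1} in a random eigenbasis.
    basis, _ = np.linalg.qr(rng.standard_normal((m, m)))
    eigenvalues = rng.choice([0.1, 1.0], size=m)
    C = basis @ np.diag(eigenvalues) @ basis.T
    a_star = np.full(m, 4.0)

    def cost(a):
        d = a - a_star
        return float(d @ C @ d)

    theta = np.zeros(m)          # assumed starting point for the target policy mu_theta = theta
    initial_cost = cost(theta)
    batch_size = 2 * m           # the critic is refit from each batch of 2m steps

    for _ in range(n_batches):
        # Behaviour policy: fixed-width Gaussian centred on the target action.
        actions = theta + sigma_beta * rng.standard_normal((batch_size, m))
        rewards = -np.array([cost(a) for a in actions])

        # Compatible features (a - theta), since grad_theta mu_theta is the identity,
        # plus a bias feature.
        features = np.hstack([actions - theta, np.ones((batch_size, 1))])
        coef, *_ = np.linalg.lstsq(features, rewards, rcond=None)
        w = coef[:m]

        # Actor update: with an identity Jacobian, Equation 20 and the natural
        # gradient update both reduce to a step along w.
        theta = theta + alpha_theta * w

    return initial_cost, cost(theta)


if __name__ == "__main__":
    start, end = run_copdac_bandit()
    print(f"cost at mean: start={start:.3f}, end={end:.5f}")
```

Because the target policy is $\mu_\theta = \theta$, the Jacobian $\nabla_\theta \mu_\theta$ is the identity, which is why the critic weights $w$ can be used directly as the update direction in this sketch.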
English Sample Essays on Humans and Nature
Essay 1

Nature and Human Beings: An Inseparable Bond

Oh, how closely intertwined are nature and human beings! We rely on nature for our very existence. Think about it! We depend on natural resources such as water, air, and land to sustain our lives. The food we consume, the energy we use, all come from nature. But have we truly appreciated this?

Sadly, in our pursuit of progress and development, we have often overexploited nature. For instance, deforestation has led to the loss of countless species' habitats. Hasn't this affected the ecological balance? And what about the excessive mining that has caused soil erosion and pollution? The consequences are terrifying!

We must realize that when we harm nature, we are ultimately harming ourselves. The deterioration of the natural environment brings about disasters like floods and droughts, which seriously impact our lives and livelihoods. Don't we understand this?

It's high time we took action to protect nature. We should use resources sustainably and develop environmentally friendly technologies. Let's not wait until it's too late! We have the responsibility to leave a beautiful and healthy natural world for future generations. Can we afford to ignore this?

Nature and human beings are inseparable. Let's cherish and protect nature, for our own sake and for the sake of our planet!

Essay 2

Oh, dear friends! Let's think deeply about the relationship between human beings and nature. We must admit that humans have a huge responsibility and obligation to protect nature.

Look at some enterprises! They only care about profits and ignore the damage they cause to the environment. They discharge pollutants without any hesitation, destroying the balance of ecosystems. On the other hand, there are also many enterprises that take positive environmental protection measures. They invest in research and development of green technologies, and strive to minimize their negative impact on nature.

And what about us as individuals?
We can also make a difference in our daily lives. For instance, we can choose to walk or ride a bike instead of driving a car to reduce carbon emissions. We can save water and electricity at home. We can also refuse to use disposable products. Every small action counts!

Shouldn't we all take responsibility for protecting nature? Can we just stand by and watch it being destroyed? The answer is definitely no! We must act now, because the future of our planet depends on our choices and actions. Let's work together to protect our beautiful nature and leave a green and sustainable world for future generations!

Essay 3

One summer vacation, I decided to go on a hiking trip in the forest. The moment I stepped into that green world, I was immediately embraced by nature. The tall trees stood like guardians of a secret realm, their leaves rustling in the gentle breeze. The air was filled with the sweet scent of wildflowers and the earthy smell of the soil. How wonderful it was!

I walked along the narrow path, listening to the chirping of birds and the gurgling of the nearby stream. The sunlight filtered through the leaves, creating patches of light and shadow on the ground. Every step I took felt like a dance with nature.

Another time, I spent my holidays by the sea. The vast ocean stretched out before me as far as the eye could see. The waves crashed against the shore, each one a powerful display of nature's might. I couldn't help but wonder at the immensity and mystery of the sea. How could such a force exist?

These experiences have made me fall deeply in love with nature. It is a world full of beauty and miracles. We should cherish and protect it, shouldn't we?

Essay 4

One day, a terrifying earthquake struck our peaceful town. The ground shook violently as if the world was coming to an end. Buildings collapsed, roads cracked, and a cloud of dust filled the air. Panic spread among people in an instant.

In the midst of this chaos, stories of human kindness emerged.
Neighbors helped each other escape from the debris. Strangers joined hands to rescue those trapped. Volunteers rushed to the disaster area, bringing food, water and hope.

However, as the dust settled and the reality of the damage became clear, we couldn't help but reflect. Why did this happen? Was it nature's wrath or our own neglect of the environment? We started to think about our future. Should we build stronger and more resilient structures? How could we better prepare for such disasters?

This earthquake was a harsh lesson. It made us realize that we are not separate from nature, but a part of it. We must respect and protect it, or we will face more disasters. Only by doing so can we ensure a safer and more harmonious future for ourselves and the generations to come!

Essay 5

In today's rapidly evolving world, the influence of technological advancements on the relationship between humanity and nature is a topic of paramount significance! How has technology shaped this delicate balance?

The development of new energy sources, such as solar and wind power, has undoubtedly brought about positive changes to our environment. These clean and renewable energies have reduced our reliance on fossil fuels, thereby lessening the emission of greenhouse gases. Isn't this a remarkable step forward in protecting our planet?

However, on the flip side, certain technological products have led to a significant depletion of natural resources. For instance, the mass production of electronic devices requires vast amounts of rare metals and minerals, which are extracted from the earth at an alarming rate. How can we ignore such detrimental effects?

Technology is a double-edged sword. It has the potential to either heal or harm our natural world. Shouldn't we be more cautious and wise in our pursuit of technological progress? We must ensure that our innovations are in harmony with nature, not at its expense. After all, nature is not something we can afford to lose.
So, how can we strike the right balance and ensure a sustainable future for both humanity and nature?
Optimal Capital Allocation Principles
Jan Dhaene†‡ Andreas Tsanakas § Emiliano A. Valdez ¶ Steven Vanduffel
April 20, 2010
Abstract This paper develops a unifying framework for allocating the aggregate capital of a financial firm to its business units. The approach relies on an optimisation argument, requiring that the weighted sum of measures for the deviations of the business units' losses from their respective allocated capitals be minimised. The approach is fair insofar as it requires capital to be close to the risk that necessitates holding it. The approach is additionally very flexible in the sense that different forms of the objective function can reflect alternative definitions of corporate risk tolerance. Owing to this flexibility, the general framework reproduces several capital allocation methods that appear in the literature and allows for alternative interpretations and possible extensions.

Keywords: Capital allocation; risk measure; comonotonicity; Euler allocation; default option; optimisation.
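To make the optimisation argument concrete, here is a minimal sketch of one simple special case; it is an assumption for illustration and is not taken from the abstract. Suppose the deviation measure is the squared difference and the objective is $\sum_j \mathbb{E}[(X_j - K_j)^2] / v_j$ subject to $\sum_j K_j = K$, with weights $v_j$ summing to one. The first-order conditions then give $K_j = \mathbb{E}[X_j] + v_j \left( K - \sum_k \mathbb{E}[X_k] \right)$. The function name `quadratic_allocation` and the use of simulated loss scenarios are hypothetical.

```python
import numpy as np


def quadratic_allocation(losses, total_capital, weights):
    """Allocate total_capital under an assumed quadratic deviation criterion.

    losses: array of shape (n_scenarios, n_units) with simulated unit losses.
    weights: positive exposure weights v_j; they are normalised to sum to one.
    Returns K_j = E[X_j] + v_j * (K - sum_k E[X_k]).
    """
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    expected = losses.mean(axis=0)               # E[X_j], estimated per unit
    surplus = total_capital - expected.sum()     # capital held above expected loss
    return expected + weights * surplus


if __name__ == "__main__":
    scenarios = [[1.0, 2.0], [3.0, 6.0]]
    print(quadratic_allocation(scenarios, total_capital=10.0, weights=[1.0, 1.0]))
```

In this special case each unit receives its expected loss plus a weight-proportional share of the remaining capital, so the allocations always add up to the aggregate capital.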
Yantai, Shandong Province: Senior One English Final Examination, Second Semester 2023-2024 (July), with Answers
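2023-2024 Academic Year, Second Semester, End-of-Term Academic Proficiency Assessment

Senior One English

Notes:
1. Before answering, candidates must write their name, candidate number and other details in the designated places on the answer sheet and the test paper.
2. When answering multiple-choice questions, choose the answer to each question and use a pencil to blacken the corresponding answer label on the answer sheet. To change an answer, erase it cleanly and then mark another answer label. When answering non-multiple-choice questions, write the answers on the answer sheet; answers written on this test paper are invalid.
3. At the end of the examination, hand in the answer sheet only.

Part I Listening (two sections, 30 points in total)

When answering, first mark your answers on the test paper. After the recording for this part ends, you will have two minutes to transfer your answers to the objective-question answer sheet.

Section 1 (5 questions; 1.5 points each, 7.5 points in total)

Listen to the following 5 conversations. Each conversation is followed by one question. Choose the best answer from the three options A, B and C, and mark it in the corresponding place on the test paper. After each conversation, you will have 10 seconds to answer the question and read the next question. Each conversation is read only once.

1. What is the woman going to buy?
A. A pair of boots.
B. A new bag.
C. A new car.

2. What is the man doing now?
A. Having lunch.
B. Repairing a printer.
C. Working on a computer.

3. What is the man going to do next?
A. Say goodbye to everyone.
B. Run to the airport.
C. Find a taxi.

4. What is the conversation mainly about?
A. Foods for dinner.
B. Gifts for the birthday.
C. Arrangements for the holiday.

5. Who is the man talking with?
A. A doctor.
B. His teacher.
C. His mother.

Section 2 (15 questions; 1.5 points each, 22.5 points in total)

Listen to the following 5 conversations or monologues.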
English Essay: Developing Critical Thinking to Make Rational Choices
Developing Critical Thinking for Making Rational Choices

Introduction

In today's fast-paced and information-saturated world, making rational choices has become increasingly important. Whether it is deciding what to eat for lunch, where to invest your money, or who to vote for in an election, the ability to think critically and make informed decisions is crucial. In this essay, we will explore the importance of developing critical thinking skills and how it can help us make rational choices in various aspects of our lives.

What is Critical Thinking?

Critical thinking is a process of analyzing, evaluating, and interpreting information in order to make informed decisions. It involves questioning assumptions, challenging beliefs, and examining evidence before coming to a conclusion. Critical thinkers are able to see beyond the surface of an issue and consider multiple perspectives before making a decision.

Why is Critical Thinking Important?

Critical thinking is important because it allows us to make rational choices based on evidence and logical reasoning. In a world where misinformation and fake news are rampant, being able to critically evaluate sources of information is crucial. By developing critical thinking skills, we can avoid falling prey to manipulation and propaganda and make decisions that are in our best interest.

Moreover, critical thinking enables us to solve problems more effectively. By breaking down complex issues into smaller components and analyzing them systematically, we can come up with creative solutions that may not be immediately obvious. This can be especially useful in the business world, where the ability to think critically can give us a competitive edge.

How to Develop Critical Thinking Skills

There are several ways to develop critical thinking skills. One approach is to practice active listening and engage with others in thoughtful discussions.
By listening attentively to different points of view and challenging our own beliefs, we can broaden our perspective and become more open-minded.

Another technique is to ask probing questions and seek out reliable sources of information. By questioning assumptions and verifying facts, we can avoid making hasty decisions based on faulty reasoning. This can help us avoid common cognitive biases that can lead us astray.

In addition, reading widely and exposing ourselves to different ideas can help us develop critical thinking skills. By exploring diverse viewpoints and learning about different cultures and perspectives, we can expand our thinking and become more empathetic towards others.

Making Rational Choices

Once we have developed our critical thinking skills, we can apply them to make rational choices in various aspects of our lives. For example, when faced with a decision about where to invest our money, we can carefully research different options, consider the risks and potential returns, and make an informed choice based on evidence and logic.

Similarly, when it comes to choosing a political candidate to support, we can critically evaluate their policies, track record, and character before casting our vote. By being informed and thoughtful in our decision-making, we can contribute to a more democratic and just society.

Conclusion

In conclusion, developing critical thinking skills is essential for making rational choices in today's complex and fast-changing world. By questioning assumptions, challenging beliefs, and seeking out reliable information, we can avoid falling prey to misinformation and make decisions that are in our best interest. By cultivating our critical thinking skills, we can become more effective problem solvers, better decision-makers, and responsible citizens.
CET-6 Past Reading Questions and Answers
June 2010 Reading

Section B

Directions: There are 2 passages in this section. Each passage is followed by some questions or unfinished statements. For each of them there are four choices marked A), B), C) and D). You should decide on the best choice and mark the corresponding letter on Answer Sheet 2 with a single line through the centre.

Passage One

Questions 51 to 56 are based on the following passage.

Only two countries in the advanced world provide no guarantee for paid leave from work to care for a newborn child. Last spring one of the two, Australia, gave up the dubious distinction by establishing paid family leave starting in 2011. I wasn't surprised when this didn't make the news here in the United States; we're now the only wealthy country without such a policy.

The United States does have one explicit family policy, the Family and Medical Leave Act, passed in 1993. It entitles workers to as much as 12 weeks' unpaid leave for care of a newborn or dealing with a family medical problem. Despite the modesty of the benefit, the Chamber of Commerce and other business groups fought it bitterly, describing it as "government-run personnel management" and a "dangerous precedent". In fact, every step of the way, as (usually) Democratic leaders have tried to introduce work-family balance measures into the law, business groups have been strongly opposed.

As Yale law professor Anne Alstott argues, justifying parental support depends on defining the family as a social good that, in some sense, society must pay for. In her book No Exit: What Parents Owe Their Children and What Society Owes Parents, she argues that parents are burdened in many ways in their lives: there is "no exit" when it comes to children. "Society expects, and needs, parents to provide their children with continuity of care, meaning the intensive, intimate care that human beings need to develop their intellectual, emotional and moral capabilities.
And society expects, and needs, parents to persist in their roles for 18 years, or longer if needed."

While most parents do this out of love, there are public penalties for not providing care. What parents do, in other words, is of deep concern to the state, for the obvious reason that caring for children is not only morally urgent but essential for the future of society. The state recognizes this in the large body of family laws that govern children's welfare, yet parents receive little help in meeting the life-changing obligations society imposes. To classify parenting as a personal choice for which there is no collective responsibility is not merely to ignore the social benefits of good parenting; really, it is to steal those benefits because they accrue (accumulate continuously) to the whole of society as today's children become tomorrow's productive citizenry (citizens). In fact, by some estimates, the value of parental investments in children, investments of time and money (including lost wages), is equal to 20-30% of gross domestic product. If these investments generate huge social benefits, as they clearly do, the benefits of providing more social support for the family should be that much clearer.

Note: Answer the questions in this part on Answer Sheet 2.
Initial Allocation of Pollution Permits
Environ Resource Econ (2008) 39:265–282
DOI 10.1007/s10640-007-9125-4

ORIGINAL PAPER

The optimal initial allocation of pollution permits: a relative performance approach

Ian A. Mackenzie · Nick Hanley · Tatiana Kornienko

Received: 24 May 2006 / Accepted: 30 March 2007 / Published online: 3 May 2007
© Springer Science+Business Media B.V. 2007

Abstract The initial allocation of pollution permits is an important aspect of emissions trading schemes. We generalize the analysis of Böhringer and Lange (2005, Eur Econ Rev 49(8):2041–2055) to initial allocation mechanisms that are based on inter-firm relative performance comparisons (including grandfathering and auctions, as well as novel mechanisms). We show that using firms’ historical output for allocating permits is never optimal in a dynamic permit market setting, while using firms’ historical emissions is optimal only in closed trading systems and only for a narrow class of allocation mechanisms. Instead, it is possible to achieve social optimality by allocating permits based only on an external factor, which is independent of output and emissions. We then outline sufficient conditions for a socially optimal relative performance mechanism.

Keywords Relative performance · Initial allocation · Pollution permits · Auctions · Rank-order contests

JEL Classification Q53 · Q58 · C72

1 Introduction

Tradable permit markets have become an important policy tool in the control of pollution. Schemes such as RECLAIM and the SO2 market in the US have shown that tradable permits are a viable and cost effective market-based mechanism (e.g. Stavins 1998; Schmalensee et al.
1998). Yet there is still an active debate about how to allocate permit endowments among the participating firms at the beginning of each trading period. As Böhringer and Lange (2005) argue, some initial allocation mechanisms may create inter-temporal distortions and result in socially suboptimal outcomes.

I. A. Mackenzie (B) · N. Hanley · T. Kornienko
Department of Economics, University of Stirling, Stirling FK9 4LA, UK
e-mail: i.a.mackenzie@

In this paper, we extend the results of Böhringer and Lange (2005) to accommodate most of the existing dynamic initial allocation mechanisms (including grandfathering and auctions, as well as novel mechanisms). We show that using firms’ historical outputs for allocating permits is never optimal, while using firms’ historical emissions is optimal only in closed trading systems and only for a narrow class of allocation mechanisms. Instead, it is possible to achieve social optimality by allocating permits based only on an external factor, which is independent of output and emissions. We outline sufficient conditions for a socially optimal relative performance mechanism and discuss the issues related to the choice of a suitable mechanism for initial allocation.

In our analysis, we discuss two types of mechanisms that are commonly considered for allocating initial endowments of permits. The first mechanism, which we call an Absolute Performance Mechanism (APM), involves permit allocations based on the levels of individual firm activity. The second mechanism, which we call a Relative Performance Mechanism (RPM), involves permit allocations based on how the levels of a firm’s activity compare to the levels of other firms’ activities, or on inter-firm relative comparisons. The distinction between these two mechanisms is crucial as firms’ behaviour in the permit market is subject to whether firms believe they are obtaining permits individually or, as under a RPM, as part of a game where a firm’s allocation is dependent on other firms’ actions. We show in this paper that
a mechanism that allocates permits based on firms’ absolute performance (APM), as used by Böhringer and Lange (2005), is a special case of a generalized relative performance mechanism (RPM), and thus that the two mechanisms share a number of optimality properties in a dynamic setting. We however argue that mechanisms which are based on relative performance might be superior to those based on absolute performance and offer a promising alternative to auctioning and grandfathering, namely a rank-order contest.

Both types of mechanisms have had important applications in existing tradable permit markets. Absolute performance mechanisms have been advocated in the form of relative emissions or intensity-based emissions caps (Fischer 2001, 2003; Ellerman and Wing 2003; Kuik and Mulder 2004; Pizer 2005; Newell and Pizer 2006).¹ In such a scheme intra-firm relative comparisons exist, where the performance of a given firm is evaluated relative to its own activity, but not relative to the activity of other firms. Rather than having a cap on absolute levels of emissions, an intensity-based cap involves a ceiling on the emissions intensity (i.e. emissions per one unit of output). This type of approach is becoming increasingly common; for example, Bode (2005) notes that a number of participants in the UK emissions trading scheme were given an intensity target. Furthermore, the Bush administration in the U.S. has strongly advocated this type of approach to tackle climate change (Kolstad 2005; Pizer 2005). When a trading system is based on emissions intensity, each firm can unilaterally increase both its output and emissions without changing emissions intensity and without any effect on other firms (the permit allocation is an adjustable grandfathering mechanism).

However, the majority of distribution rules which have been discussed are relative performance mechanisms. The two most common RPMs include auctions (where firms are allocated permits based on their relative bids) and grandfathering with a fixed cap (where firms are
allocated permits based on their relative emissions levels with respect to some fixed cap) (see Hahn and Noll 1982; Lyon 1982, 1986; Oehmke 1987; Milliman and Prince 1989; Van Dyke 1991; Franciosi et al. 1993; Parry 1995; Parry et al. 1999; Cramton and Kerr 2002).

¹ We make a distinction between intensity-based caps and output-based allocation (although they do both act as an implicit output subsidy). In intensity rate-based mechanisms the emission cap is adjusted to maintain a constant emissions intensity and hence allocation is not dependent on other firms’ behaviour (e.g. the levels of other firms’ emissions and output choices). In contrast, output-based mechanisms alter the average allocation per unit of output to maintain a fixed emissions cap (allocation is dependent on firms’ behaviour).

However, there is a large selection of RPMs that have not been extensively considered in the literature. For example, yardstick competition, where each firm’s performance is assessed relative to the performance of other firms, has been suggested (Shleifer 1985; Franckx et al. 2005; Nalebuff and Stiglitz 1983a, b). Moreover, a novel RPM that could be envisaged to allocate permits is the use of contests or tournaments where firms spend resources in order to ‘win’ a proportion of the permit allocation (Moldovanu and Sela 2001, 2006).

Inter-firm comparisons using relative performance mechanisms have a number of general regulatory advantages which have been widely documented in the literature (Lazear and Rosen 1981; Holmström 1982; Green and Stokey 1983; Nalebuff and Stiglitz 1983a, b; Mookherjee 1984; Shleifer 1985; Moldovanu and Sela 2001, 2006). Relative performance mechanisms can also be advantageous in an environmental setting. Govindasamy et al.
(1994) suggested the use of a tournament to control non-point pollution, and found that a RPM results in a number of desirable outcomes. Franckx et al. (2005) extended the work of Govindasamy et al. (1994) by using a different RPM, yardstick competition, and conducted the analysis in a more general environmental regulatory setting. They find that this RPM will be desirable when a large number of firms participate and common shocks (such as similar technology shocks or oil price changes) are experienced by all firms.

Rather fewer authors have focused on relative performance issues in emissions trading. Using a rent-seeking model, Malueg and Yates (2006) examine the effects of citizen participation in a permit market to determine the endowment and price of permits. They find that citizens’ choice of lobbying and permit purchases in a market depends on the initial allocation mechanism chosen (auctioning or grandfathering). Finally, Groenenberg and Blok (2002) outline an initial allocation mechanism for a permit market that bases distribution on benchmarking the production process of each firm and find it eliminates a large number of the problems associated with existing allocation mechanisms.

For a number of decades the free allocation (grandfathering) of permits has been discussed as a feasible method of allocation (e.g. Tietenberg 1985). Indeed, the majority of actual emissions trading schemes to date use grandfathering as the primary allocation mechanism due to its political viability: market participants will always lobby for the free allocation of permits (Stavins 1998). Grandfathering might also be seen as offering a closer fit to existing regulatory approaches, since it does not involve any fundamental change in property rights compared with, for instance, a system of performance standards for polluting emissions.
Grandfathering might also be preferred by governments on competition grounds, since the avoidance of a lump-sum distribution from industry to government can avoid disadvantaging domestic firms relative to their international competitors. On the negative side, grandfathering could be seen as rewarding firms who have engaged in relatively low pollution control efforts in the past. As grandfathering is a commonly used tool, the discussions regarding the effects of the mechanism have been widespread. In particular, Requate and Unold (2003) have shown that substantial innovation incentives exist for firms in a grandfathered emissions scheme. However, Goulder et al. (1997) found grandfathering to be a rather inefficient allocation mechanism compared to alternative allocation procedures. Recently, grandfathering has been adapted to include a dynamic element (Bode 2006; Böhringer and Lange 2005). In particular, Böhringer and Lange (2005) have discussed updated grandfathering which continually updates the free allocation of permits based on historical emissions and output.² They found that the dynamic allocation has to be carefully considered to reduce distortions in the product and permit market.

² See Fischer (2001) for static analysis of output-based permit allocations.

Another important aspect of the mechanisms in question involves multi-period choice problems in pollution permit markets. Several studies have focused on general design considerations for multi-period permit markets (Cronshaw and Kruse 1996; Rubin 1996; Kling and Rubin 1997; Schennach 2000; Leiby and Rubin 2001; Yates and Cronshaw 2001), yet only a few studies have focused on the initial allocation of permits in this setting. In the context of the electricity sector, Bode (2006) finds considerable variation in the distributional impacts among different allocation mechanisms within a dynamic emissions trading scheme.
Jensen and Rasmussen (2000) model a number of allocation mechanisms in a dynamic setting and find that welfare and employment vary drastically across allocation mechanisms.

The work which is the most relevant to our paper is by Böhringer and Lange (2005), who compare the efficiency of dynamic permit allocations based on output, emissions and a lump-sum transfer. In comparing efficiency, they make a distinction between markets that are open (i.e. when firms can trade outside the domestic market) and closed (i.e. when participating firms cannot trade in permits outside the domestic market). This distinction is important to policy analysis as tradable permit markets are becoming increasingly varied in size and scope and have the potential to have either an open or closed market structure. They find that in a closed market it is optimal to allocate permits on criteria not related to output, whereas for an open market, an efficient allocation occurs when the permits are distributed using a lump-sum approach. However, in their treatment of the initial allocation mechanism, Böhringer and Lange (2005) assume that the permit distribution to a firm is based only on firms’ absolute levels of output and emissions, so that other firms’ actions do not affect the allocation of a given firm. Yet, given the fixed emission cap considered by Böhringer and Lange (2005), the permit allocation to a firm is also crucially dependent on the behaviour of rival firms.
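A small numerical sketch may help to see the dependence on rivals under a fixed cap. It assumes, purely for illustration, a proportional rule in which each firm's permits equal the cap times its share of last period's aggregate emissions; the function name `fixed_cap_allocation` and this specific rule are hypothetical and are only one simple instance of a relative performance mechanism.

```python
def fixed_cap_allocation(cap, past_emissions):
    """Split a fixed permit cap in proportion to each firm's past emissions.

    This proportional rule is an assumed example, not the paper's general mechanism.
    """
    total = sum(past_emissions)
    if total <= 0:
        raise ValueError("aggregate past emissions must be positive")
    return [cap * e / total for e in past_emissions]


if __name__ == "__main__":
    # Firm 0 keeps its emissions at 10 in both cases; only its rivals change.
    before = fixed_cap_allocation(100.0, [10.0, 10.0, 10.0])
    after = fixed_cap_allocation(100.0, [10.0, 20.0, 20.0])
    print(before[0], after[0])
```

In the usage example, firm 0 leaves its own emissions unchanged, yet its allocation shrinks once its rivals emit more, because the cap itself does not move.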
This is because a fixed emissions cap implies that if in the current period rival firms, say, increase their output and emissions relative to a given firm, then the current-period aggregate output and emissions increase, thus decreasing the proportion of future permits that each firm can receive per each unit of current output and emissions. As a result, even if a given firm does not alter its own choices, its own future allocation of permits will change. Thus we argue that the initial allocation process considered by Böhringer and Lange (2005) should take into account other firms’ actions and thus should be modelled as a relative performance mechanism.

Our paper therefore attempts to extend Böhringer and Lange (2005) by implementing a more general design of a dynamic initial allocation mechanism, which allows for the allocation of permits to be based on each firm’s choices relative to other firms. Following Böhringer and Lange (2005), we consider allocation mechanisms which are based on choices of output and emissions, but in addition we consider possible permit allocations based on an “external” factor which is independent of output and emissions. This allows us to create an encompassing model for most existing types of initial allocation mechanisms such as grandfathering, auctioning and contests. We show that a RPM can efficiently (socially optimally) allocate pollution permits if the criteria used to compare firms is based on such an external factor, in a contest. Given the variety of potential external factors, we suggest a number of criteria that a regulator may take into account when choosing a suitable factor. We also argue in favour of a new mechanism, which involves an inter-firm contest designed to achieve two goals simultaneously, that is, the primary goal of efficiency and some secondary goal, such as generating revenue, achieving health and safety targets, noise reduction, reduction of other pollutants, etc. Given the political economy problems with both auctions and grandfathering as a way
of initially allocating permits, this new mechanism may well be of interest to policy makers.

Our contribution is thus twofold. First, we extend the results of Böhringer and Lange (2005) to a wider class of mechanisms, so-called relative performance mechanisms, such as grandfathering with fixed cap, yardsticks, auctions, contests, etc. Although such mechanisms create a situation where firms’ choices are interdependent, the general intuition of Böhringer and Lange (2005) holds in the Nash equilibrium of the ensuing game. That is, for a wide range of mechanisms, for the initial allocation to be cost-efficient, it should not depend on firms’ outputs, and may depend on firms’ emissions only in limited circumstances. Second, we propose that the lump-sum distribution advocated by Böhringer and Lange (2005) can be implemented better with a relative performance mechanism based on an external factor. Such a cost-efficient mechanism allows the regulator to achieve a secondary target, such as raising revenue, thus “killing two birds with one stone”. To the best of our knowledge, this is the first paper to introduce a generalised RPM into a permit market which allows us to model most existing relative-based mechanisms and has the added advantage of encompassing APMs.

The paper is organised as follows: Sect. 2 outlines our model and presents the social optimality conditions and firm’s optimisation problem. A socially optimal dynamic initial allocation mechanism, when the market experiences both exogenous and endogenous permit prices, is considered in Sect. 3. Section 4 discusses the external factor, while Sect. 5 concludes.

2 The model

We follow Böhringer and Lange (2005) and consider a multi-period partial equilibrium model. The technology of a firm $i$ ($i = 1, 2, \ldots, n$) at time $t$ ($t = 1, 2, \ldots$) is given by a cost function $c_{it}(e_{it}, q_{it})$, where $q_{it}$ is the firm’s output level, and $e_{it}$ the firm’s emissions resulting from production. Costs $c_{it}$ are assumed to be
twice differentiable and convex, with $\partial c_{it}/\partial e_{it} \le 0$, $\partial c_{it}/\partial q_{it} > 0$, $\partial^2 c_{it}/\partial e_{it}^2,\ \partial^2 c_{it}/\partial q_{it}^2,\ -\partial^2 c_{it}/\partial e_{it}\partial q_{it} \ge 0$ and $\dfrac{\partial^2 c_{it}}{\partial q_{it}^2} \cdot \dfrac{\partial^2 c_{it}}{\partial e_{it}^2} - \left(\dfrac{\partial^2 c_{it}}{\partial e_{it}\partial q_{it}}\right)^2 > 0$. The firm sells its output in a competitive product market at a price of $p_t$. Finally, the firm is regulated by a competitive emissions-trading program and receives an initial allocation of permits $A_{it}$.

We further assume that each firm $i$ also “produces” a factor $z_{it}$ which has no direct relevance in the product and emissions market, and thus is outside the regulator’s interests and/or jurisdiction. This “external” factor is “produced” by each firm independently of output and emissions at a cost $v_{it}(z_{it})$ (possibly zero), with $dv_{it}/dz_{it} \ge 0$. While this external factor is irrelevant to the product and emissions market, it may determine firms’ permit allocations $A_{it}$ in a manner to be specified later.

2.1 The generalised allocation mechanism

Böhringer and Lange (2005) considered a mechanism whereby pollution permits are allocated based on the levels of a firm’s historical production $q_{it}$ and emissions $e_{it}$.3 We first extend this mechanism by assuming that in addition to output and emissions, some “external” factor may play a role in how many permits will be allocated to a given firm, but this factor has no relevance to the product and emissions market, and thus is beyond the interest or jurisdiction of the regulator (and it is this factor which determines the lump-sum allocations in the model of Böhringer and Lange 2005).

3 Böhringer and Lange (2005) considered a number of historical observation periods, $l = (1, 2, \ldots, s)$. For expositional simplicity, we restrict our model to $l = 1$ (the historical period is simply the previous period). It is straightforward to generalise our model to $l > 1$ historical observation periods.

Ian A. Mackenzie et al.

Examples of a possible external factor include population size in a firm’s locality, a firm’s socially responsible activities, a firm’s emissions of other pollutants, a random event such as a lottery draw and so on. We denote such
external factors as $z_{it}$. While we will discuss the external factor more in Sect. 4, it is worth noting here that the nature of the external factor determines both the cost of this factor to the firm and the degree of the firm’s control over this factor. For example, population size is both beyond the firm’s control and “free” to the firm. On the other hand, lottery tickets can be bought by firms, or can be allocated to firms by the regulator (and thus are beyond firms’ control). In contrast, in a permit auction, both the success and the cost of each firm’s bid depend on the bids of other participating firms.

Thus, the allocation mechanism based on absolute performance (APM) is given by

$$A^{APM}_{it} = \lambda^{t-1}_{q,it}\,\tilde{h}(q_{i(t-1)}) + \lambda^{t-1}_{e,it}\,\tilde{g}(e_{i(t-1)}) + \lambda^{t-1}_{z,it}\,\tilde{f}(z_{i(t-1)}) \qquad (1)$$

where $\tilde{h}, \tilde{g}, \tilde{f}$ are increasing and continuously differentiable functions, and $\lambda^{t-1}_{q,it}, \lambda^{t-1}_{e,it}, \lambda^{t-1}_{z,it} \ge 0$ are the weights (in period $t$) placed on period $t-1$’s performance. The weights reflect the relative importance of a particular activity, and can vary across time periods and across firms.

We extend Eq. 1 by allowing for firms’ performance to be evaluated in comparison to other firms, i.e. how a given firm $i$’s performance at time $t$ in production $q_{it}$, emissions $e_{it}$, and an external factor $z_{it}$ compares to the performance of every other firm $-i = \{1, \ldots, i-1, i+1, \ldots, n\}$. Formally, firm $i$’s performance at time $t$ in output relative to other firms’ output $q_{-it}$ is given by a relative performance function $h = h(q_{i(t-1)}, q_{-i(t-1)})$. Similarly, relative performance in emissions and the external factor are given by $g = g(e_{i(t-1)}, e_{-i(t-1)})$ and $f = f(z_{i(t-1)}, z_{-i(t-1)})$, respectively. We assume $h_i = \partial h/\partial q_{it},\ g_i = \partial g/\partial e_{it},\ f_i = \partial f/\partial z_{it} > 0$ so that, for given levels of other firms’ performance, higher levels of emissions, output, and the external factor result in a larger permit allocation. We also assume that $h_{-i} = \partial h/\partial q_{-it},\ g_{-i} = \partial g/\partial e_{-it},\ f_{-i} = \partial f/\partial z_{-it} \le 0$, so that for a given level of firm’s
performance, its allocation does not increase if other firms increase their levels of emissions, output, or the external factor.4

We take a rather general view of the relative allocation functions. That is, to allow for uncertainty over allocations, we treat these functions as expectations over possible realisations. Thus allocations can be distributed using deterministic rules (such as yardstick competitions) devised by the regulator, as well as by lotteries, auctions, or contests. For analytical tractability, we assume that the relative allocation functions $h, g, f$ are continuously differentiable.5 For example, a firm’s relative allocation can be determined continuously based on how its own output compares to aggregate output, e.g. $h(q_{i(t-1)}, q_{-i(t-1)}) = \alpha \dfrac{q_{it}}{q_{it} + \sum_{-i} q_{-it}}$. Another example of a continuous relative allocation function is a Tullock-type (winner takes all) contest allocation, where a firm’s expected amount of permits is given by all participating firms’ outputs as follows: $h(q_{i(t-1)}, q_{-i(t-1)}) = \beta \dfrac{q^r_{it}}{q^r_{it} + \sum_{-i} q^r_{-it}}$, i.e. the size of the permit lot $\beta$ multiplied by the probability of winning the contest (see Skaperdas 1996).

4 Instead, one can assume that $h_i$ and $g_i$ are negative.

5 Our argument will not change if we relax the assumption of continuity to include relative performance mechanisms such as winner-pay and all-pay auctions involving discontinuities in firms’ payoff functions. To deal with such discontinuities, one typically assumes that all firms face a commonly known continuously differentiable distribution of firms’ “types”, and that all firms follow a symmetric, strictly increasing and differentiable strategy, so that each firm’s expected payoff function becomes continuously differentiable.

Thus, the permit allocation for firm $i$ at time $t$, according to the generalized Relative Performance Mechanism, is

$$A^{RPM}_{it} = \lambda^{t-1}_{q,it}\,h(q_{i(t-1)}, q_{-i(t-1)}) + \lambda^{t-1}_{e,it}\,g(e_{i(t-1)}, e_{-i(t-1)}) + \lambda^{t-1}_{z,it}\,f(z_{i(t-1)}, z_{-i(t-1)}) \qquad (2)$$

Comparing this relative performance allocation mechanism to that based on absolute performance (1), one can observe the following:

Remark 1 If $h_{-i} \equiv g_{-i} \equiv f_{-i} \equiv 0$ then a relative performance allocation mechanism reduces to an absolute performance allocation mechanism.

In other words, the absolute performance mechanism considered by Böhringer and Lange (2005) is a special case of a relative performance mechanism in which firm $i$’s allocation is independent of the remaining firms’ actions. In this case, the remaining firms’ actions have no impact on firm $i$’s allocation, and firm $i$ can obtain permits by optimally choosing $q_{it}$, $e_{it}$ and $z_{it}$, without considering other firms’ actions.

Note that Böhringer and Lange (2005) implicitly assume that the grandfathering mechanism is an absolute performance mechanism. However, with a fixed emission cap, for a given behaviour of other firms, if a particular firm increases or decreases its output and/or emissions, that would affect the aggregate output and emissions of domestic firms, ultimately affecting how many permits both that firm and all other firms will receive. Thus, it is implicit in Böhringer and Lange (2005) that the factor weights will change each period to reflect changes in the aggregate activities. To see this, suppose that at time $t$ a fixed amount of permits $\bar{E}_t$ is allocated among $n$ firms proportionally to each firm’s output $q_{it}$. In other words, each firm $i$ receives an allocation $\gamma_t q_{it}$, where $\gamma_t = \dfrac{\bar{E}_t}{q_{it} + \sum_{-i} q_{-it}}$. Thus, the output weight $\gamma_t$ has to be adjusted each period to reflect changes in aggregate production. It is easy to see that such a fixed cap grandfathering mechanism is a RPM with $h(q_{i(t-1)}, q_{-i(t-1)}) = \bar{E}_t \dfrac{q_{it}}{q_{it} + \sum_{-i} q_{-it}}$.

When a relative performance mechanism is used, firm $i$’s choices affect the number of permits allocated to firm $j \ne i$, and thus affect firm $j$’s profits, and vice versa. In other words, a RPM creates a situation where firms’ choices are
interdependent. In such a situation, a rational firm will make its choices strategically, by taking into account the anticipated actions of its rivals. The relative performance permit allocation mechanism thus results in a game among participating firms, which leads firms’ behaviour to be typically different from their behaviour when faced with an APM. To explore the distortionary effect of such behaviour, we first need to consider the socially optimal situation.

2.2 The socially optimal outcome

We now consider the regulator’s point of view. Following Böhringer and Lange (2005) we assume that the regulator cares about profits and costs associated with the production of output and emissions of the specific pollutant, as well as the trade in the pollution permits, but is not interested in the external factors such as population size, lottery draws, or auction bids (we will come back to this assumption in Sect. 4). Thus, the regulator’s objective is to maximise (minimise) the aggregate profit (cost) that all the domestic firms incur while producing the product of the regulator’s interests or jurisdiction whilst being constrained by the emissions program.

When trade in emissions permits is not restricted to the regulator’s jurisdiction, firms can import/export emissions across the system’s borders. From a regulator’s point of view, this is a (small) open emissions trading system, where the permit price is exogenously determined, and the aggregate emissions in the jurisdiction are not capped. This may occur when the market is open to transactions from other (possibly larger) schemes. For example, in the European Union Emissions Trading Scheme (EU-ETS), member states allocate permits domestically, but firms in each member state can trade permits with firms in other member states. In such a system, the regulator’s objective takes into account the balance of the trade in the emission permits. Thus, given the set of prices $(\sigma_t, p_{it})$, the regulator’s objective is to

$$\max_{q_{it},\,e_{it}} \sum_t \left[ \sum_{i=1}^{n} \big( p_{it} q_{it} - c_{it}(e_{it}, q_{it}) \big) - \sigma_t \left( \sum_{i=1}^{n} e_{it} - E_t \right) \right] \qquad (3)$$

where $\sigma_t$ is the exogenous permit price determined by the (international) demand and supply of permits in the open market and $E_t$ is the domestic emissions cap at time $t$. For each firm $i$ and each of its rivals $-i = \{1, \ldots, i-1, i+1, \ldots, n\}$, the socially optimal conditions are as follows:6

$$p_{it} = \frac{\partial c_{it}}{\partial q_{it}} \qquad (4)$$

$$-\frac{\partial c_{it}}{\partial e_{it}} = -\frac{\partial c_{jt}}{\partial e_{jt}} \ (= \sigma_t) \qquad (5)$$

for all $i$, $j \ne i$, $t$. That is, at period $t$ all firms will simultaneously equate their marginal production costs to their firm-specific product price (4). Also, in the equilibrium, firms’ marginal abatement costs will be equalized (5), and will be equal to the (exogenously determined) common permit price.

In contrast, in a closed emissions trading system, a single regulator distributes the total supply of permits, and thus ensures that the aggregate emissions are capped: $\sum_i e_{it} = E_t$. The emissions permit price is endogenously determined by the (domestic) demand and supply in the closed market. The regulator’s objective function is thus:

$$\max_{q_{it},\,e_{it}} \sum_t \sum_{i=1}^{n} \big[ p_{it} q_{it} - c_{it}(e_{it}, q_{it}) \big] \quad \text{subject to} \quad \sum_{i=1}^{n} e_{it} = E_t \qquad (6)$$

The socially optimal conditions are identical to the conditions (4–5), except that firms’ marginal abatement costs will be equal to the shadow price of abatement.

2.3 Firm optimisation

We first extended the allocation model of Böhringer and Lange (2005) by allowing for evaluations based on an independent external factor such as population size, socially responsible activities, emissions of other pollutants, lottery draw, and so on. We now focus our attention on the firm-specific problem. Given the profile of other firms’ actions, the set of prices $(\sigma_t, p_{it})$, and its permit allocation $A_{it}$ for the target pollutant, a firm $i$ will choose a level of emissions, output and an external factor, $(q^*_{it}, e^*_{it}, z^*_{it})$, to maximise its total stream of profits:

$$\max_{q_{it},\,e_{it},\,z_{it}} \sum_{t=1} \big[ p_{it} q_{it} - c_{it}(e_{it}, q_{it}) - v_{it}(z_{it}) \big] - \sigma_t (e_{it} - A_{it})$$

6 We follow the language of Böhringer and Lange (2005) and refer to the least-cost outcome and corresponding conditions as socially optimal.
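The interdependence that motivates the relative performance approach can be illustrated numerically. The following is a minimal sketch and is not taken from the paper: it assumes the two continuous relative allocation rules quoted in Sect. 2.1 (proportional fixed-cap grandfathering and a Tullock-type contest), and the function names and all numbers are hypothetical. It shows that a firm's allocation changes when a rival changes its output, even though the firm's own output is unchanged.

```python
from typing import List, Sequence


def fixed_cap_allocation(outputs: Sequence[float], cap: float) -> List[float]:
    """Fixed-cap grandfathering: firm i receives cap * q_i / sum_j q_j."""
    total = sum(outputs)
    if total <= 0:
        raise ValueError("aggregate output must be positive")
    return [cap * q / total for q in outputs]


def tullock_expected_allocation(
    outputs: Sequence[float], lot: float, r: float = 1.0
) -> List[float]:
    """Tullock-type contest: expected permits = lot * q_i**r / sum_j q_j**r."""
    powered = [q ** r for q in outputs]
    total = sum(powered)
    if total <= 0:
        raise ValueError("aggregate (powered) output must be positive")
    return [lot * p / total for p in powered]


# Hypothetical three-firm example with a cap of 100 permits.
baseline = fixed_cap_allocation([10.0, 10.0, 20.0], cap=100.0)

# Only the third firm changes its output; firms 0 and 1 do nothing.
after_rival_expands = fixed_cap_allocation([10.0, 10.0, 30.0], cap=100.0)

# Firm 0's allocation falls from 25.0 to 20.0 purely because of its rival.
print(baseline[0], after_rival_expands[0])
```

Under these assumptions the cap is always fully distributed, so any gain by one firm is a loss to the others; with `r = 1` the Tullock rule coincides with the proportional rule, while larger `r` shifts expected permits further towards the largest producer.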
The Principle of Balancing Legal Interests (in English)
The principle of utilitarianism, also known as the principle of maximization of overall well-being, is a fundamental ethical principle that has been widely discussed and debated by philosophers and scholars in various fields. Developed by philosophers such as Jeremy Bentham and John Stuart Mill, utilitarianism posits that the morally right action is the one that maximizes overall utility or happiness and minimizes overall harm or suffering. This principle has been applied in various areas of human life, including politics, economics, and healthcare.

The principle of utilitarianism focuses on the consequences of an action rather than the intentions or motives behind it. According to utilitarianism, an action is morally right if it leads to the greatest amount of happiness or well-being for the greatest number of people affected by it. Conversely, an action is morally wrong if it leads to harm or suffering for a large number of people. The moral value of an action is determined by the net balance of pleasure or happiness created.

One of the key features of utilitarianism is its focus on collective well-being rather than individual interests. It considers the interests and happiness of all individuals who may be impacted by a particular action, and seeks to maximize overall well-being. This principle can be applied in various contexts, such as decision-making in public policy, business ethics, and healthcare ethics.

In the realm of public policy, utilitarianism can serve as a guiding principle in determining the best course of action. For example, when making decisions about allocating resources for public projects, utilitarianism would suggest choosing the option that maximizes overall well-being, even if it means sacrificing some individual interests. By considering the consequences of each option on the overall welfare of society, decision-makers can make more informed and ethical choices.

Utilitarianism also has implications for healthcare ethics.
In medical decision-making, the principle of utilitarianism can help guide healthcare providers in determining the best course of action for their patients. The focus is on promoting the greatest overall well-being for the patient and society, taking into account factors such as the effectiveness of treatments and the allocation of scarce resources.

Despite these criticisms, the principle of utilitarianism continues to be influential in ethical discussions and decision-making processes. Its focus on maximizing overall well-being provides a framework for considering the consequences of actions and guiding ethical behavior. While its applications may vary across contexts, the principle of utilitarianism serves as a valuable tool for balancing the interests of individuals and society in the pursuit of the greatest happiness for the greatest number.
Abstract
Optimal Inflation Targeting Rules∗

Marc P. Giannoni, Columbia University
Michael Woodford, Princeton University

February 24, 2003

Abstract

This paper characterizes optimal monetary policy for a range of alternative economic models, applying the general theory developed in Giannoni and Woodford (2002a). The rules computed here have the advantage of being optimal regardless of the assumed character of exogenous additive disturbances, though other aspects of model specification do affect the form of the optimal rule.

In each case, optimal policy can be implemented through a flexible inflation targeting rule, under which the central bank is committed to adjust its interest-rate instrument so as to ensure that projections of inflation and other variables satisfy a target criterion. The paper shows which additional variables should be taken into account, in addition to the inflation projection, and to what extent, for any given parameterization of the structural equations. It also explains what relative weights should be placed on projections for different horizons in the target criterion, and the manner and degree to which the target criterion should be history-dependent.

The likely quantitative significance of the various factors considered in the general discussion is then assessed by estimating a small, structural model of the U.S. monetary transmission with explicit optimizing foundations. An optimal policy rule is computed for the estimated model, and shown to correspond to a multi-stage inflation-forecast targeting procedure. The degree to which actual U.S. policy over the past two decades has conformed to the optimal target criteria is then considered.

∗We would like to thank Jean Boivin, Rick Mishkin, Ed Nelson, and Lars Svensson for helpful discussions, Brad Strum for research assistance, and the National Science Foundation for research support through a grant to the NBER.

An increasingly popular approach to the conduct of monetary policy, since the early 1990s, has been inflation-forecast targeting. Under
this general approach, a central bank is committed to adjust short-term nominal interest rates periodically so as to ensure that its projection for the economy’s evolution satisfies an explicit target criterion: for example, in the case of the Bank of England, the requirement that the RPIX inflation rate be projected to equal 2.5 percent at a horizon two years in the future (Vickers, 1998). Such a commitment can overcome the inflationary bias that is likely to follow from discretionary policy guided solely by a concern for social welfare, and can also help to stabilize medium-term inflation expectations around a level that reduces the output cost to the economy of maintaining low inflation.

Another benefit that is claimed for such an approach (e.g., King, 1997; Bernanke et al., 1999), and an important advantage, at least in principle, of inflation targeting over other policy rules, such as a k-percent rule for monetary growth, that should also achieve a low average rate of inflation, is the possibility of combining reasonable stability of the inflation rate (especially over the medium to long term) with optimal short-run responses to real disturbances of various sorts. Hence Svensson (1999) argues for the desirability of “flexible” inflation targeting, by which it is meant1 that the target criterion involves not only the projected path of the inflation rate, but one or more other variables, such as a measure of the output gap, as well.

We here consider the question of what sort of additional variables ought to matter (and with what weights, and what dynamic structure) in a target criterion that is intended to implement optimal policy. We wish to use economic theory to address questions such as which measure of inflation is most appropriately targeted (an index of goods prices only, or wage inflation as well?), which sort of output gap, if any, should justify short-run departures of projected inflation from the long-run target rate (a departure of real GDP from a smooth trend path, or from a “natural rate” that varies in response to a variety of disturbances?), and how large a modification of the acceptable inflation projection should result from a given size of projected output gap. We also consider how far in the future the inflation and output projections should extend upon which the current interest-rate decision is based, and the degree to which an optimal target criterion should be history-dependent, i.e., should depend on recent conditions, and not simply on the projected paths of inflation and other target variables from now on.

1 Svensson discusses two alternative specifications of an inflation-targeting policy rule, one of which (a “general targeting rule”) involves specification of a loss function that the central bank should use to evaluate alternative paths for the economy, and the other of which (a “specific targeting rule”) involves specification of a target criterion. We are here concerned solely with policy prescriptions of the latter sort. On the implementation of optimal policy through a “general targeting rule,” see Svensson and Woodford (2003).

In a recent paper (Giannoni and Woodford, 2002a), we expound a general approach to the design of an optimal target criterion. We show, for a fairly general class of linear-quadratic policy problems, how it is possible to choose a target criterion that will satisfy several desiderata. First, the target criterion has the property that insofar as the central bank is expected to ensure that it holds at all times, this expectation will imply the existence of a determinate rational-expectations equilibrium. Second, that equilibrium will be optimal, from the point of view of a specified quadratic loss function, among all possible rational-expectations equilibria, given one’s model of the monetary transmission mechanism.2 Thus the policy rule implements the optimal state-contingent evolution of the economy, in the sense of giving it a reason to occur if the private sector is convinced of the central bank’s commitment to the rule and fully understands
its implications.

Third, the rule is robustly optimal, in the sense that the same target criterion brings about an optimal state-contingent evolution of the economy regardless of the assumed statistical properties of the exogenous disturbances, despite the fact that the target criterion makes no explicit reference to the particular types of disturbances that may occur (except insofar as these may be involved in the definition of the target variables, that is, the variables appearing in the loss function which defines the stabilization objectives). This robustness greatly increases the practical interest in the computation of a target criterion that is intended to implement optimal state-contingent responses to disturbances; for actual economies are affected by an innumerable variety of types of disturbances, and central banks always have a great deal of specific information about the ones that have most recently occurred.

2 Technically, the state-contingent evolution that is implemented by commitment to the policy rule is optimal from a “timeless perspective” of the kind proposed in Woodford (1999b), which means that it would have been chosen as part of an optimal commitment at a date sufficiently far in the past for the policymaker to fully internalize the implications of the anticipation of the specified policy actions, as well as their effects at the time that they are taken. This modification of the concept of optimality typically used in Ramsey-style analyses of optimal policy commitments allows a time-invariant policy rule to be judged optimal, and eliminates the time inconsistency of optimal policy. See Giannoni and Woodford (2002a) and Svensson and Woodford (2003) for further discussion.

The demand that the target criterion be robustly optimal also allows us to obtain much sharper conclusions as to the form of an optimal target criterion. For while there would be a very large number of alternative relations among the paths of inflation and other variables that are equally consistent with the
optimal state-contingent evolution in the case of a particular type of assumed disturbances, only relations of a very special sort continue to describe the optimal state-contingent evolution even if one changes the assumed character of the exogenous disturbances affecting the economy.

Our general characterization in Giannoni and Woodford (2002a) is in terms of a fairly abstract notation, involving eigenvectors and matrix lag polynomials. Here we offer examples of the specific character of the optimally flexible inflation targets that can be derived using that theory. Our results are of two sorts.

First, we illustrate the implications of the theory in the context of a series of simple models that incorporate important features of realistic models of the monetary transmission mechanism. Such features include wage and price stickiness, inflation inertia, habit persistence, and predeterminedness of pricing and spending decisions. In the models considered, there is a tension between two or more of the central bank’s stabilization objectives, which cannot simultaneously be achieved in full; in the simplest case, this is a tension between inflation and output-gap stabilization, but we also consider models in which it is reasonable to seek to stabilize interest rates or wage inflation as well. These results in the context of very simple models are intended to give insight into the way in which the character of the optimal target criterion should depend on one’s model of the economy, and should be of interest even to readers who are not persuaded of the empirical realism of our estimated model.

Second, we apply the theory to a small quantitative model of the U.S. monetary transmission mechanism, the numerical parameters of which are fit to VAR estimates of the impulse responses of several aggregate variables to identified monetary policy shocks. While the model remains an extremely simple one, this exercise makes an attempt to judge the likely quantitative significance of the types of effects that
have previously been discussed in more general terms. It also offers a tentative evaluation of the extent to which U.S. policy over the past two decades has differed from what an optimal inflation-targeting regime would have called for.

1 Model Specification and Optimal Targets

Here we offer a few simple examples of the way in which the optimal target criterion will depend on the details of one’s model of the monetary transmission mechanism. (The optimal target criterion also depends, of course, on one’s assumed stabilization objectives. But here we shall take the view that the appropriate stabilization objectives follow from one’s assumptions about the way in which policy affects the economy, though the welfare-theoretic stabilization objectives implied by our various simple models are here simply asserted rather than derived.) The examples that we select illustrate the consequences of features that are often present in quantitative optimizing models of the monetary transmission mechanism. They are also features of the small quantitative model presented in section 2; hence our analytical results in this section are intended to provide intuition for the numerical results presented for the empirical model in section 3.

The analysis of Giannoni and Woodford (2002a) derives a robustly optimal target criterion from the first-order conditions that characterize the optimal state-contingent evolution of the economy. Here we illustrate this method by directly applying it to our simple examples, without any need to recapitulate the general theory.

1.1 An Inflation-Output Stabilization Tradeoff

We first consider the central issue addressed in previous literature on flexible inflation targeting, which is the extent to which a departure from complete (and immediate) stabilization of inflation is justifiable in the case of real disturbances that prevent joint stabilization of both inflation and the (welfare-relevant) output gap.3 We illustrate how this question would be answered in the case of a simple
optimizing model of the monetary transmission mechanism that allows for the existence of such “cost-push shocks” (to use the language of Clarida et al., 1999).

As is well known, a discrete-time version of the optimizing model of staggered price-setting proposed by Calvo (1983) results in a log-linear aggregate supply relation of the form

$$\pi_t = \kappa x_t + \beta E_t \pi_{t+1} + u_t, \qquad (1.1)$$

sometimes called the “New Keynesian Phillips curve” (after Roberts, 1995).4 Here $\pi_t$ denotes the inflation rate (rate of change of a general index of goods prices), $x_t$ the output gap (the deviation of log real GDP from a time-varying “natural rate”, defined so that stabilization of the output gap is part of the welfare-theoretic stabilization objective5), and the disturbance term $u_t$ is a “cost-push shock”, collecting all of the exogenous shifts in the equilibrium relation between inflation and output that do not correspond to shifts in the welfare-relevant “natural rate” of output. In addition, $0 < \beta < 1$ is the discount factor of the representative household, and $\kappa > 0$ is a function of a number of features of the underlying structure, including both the average frequency of price adjustment and the degree to which Ball-Romer (1990) “real rigidities” are important.

We shall assume that the objective of monetary policy is to minimize the expected value of a loss function of the form

$$W = E_0 \sum_{t=0}^{\infty} \beta^t L_t, \qquad (1.2)$$

where the discount factor $\beta$ is the same as in (1.1), and the loss each period is given by

$$L_t = \pi_t^2 + \lambda (x_t - x^*)^2, \qquad (1.3)$$

for a certain relative weight $\lambda > 0$ and optimal level of the output gap $x^* > 0$. Under the same microfoundations as justify the structural relation (1.1), one can show (Woodford, 2003, chap. 6) that a quadratic approximation to the expected utility of the representative household is a decreasing function of (1.2), with

$$\lambda = \kappa/\theta \qquad (1.4)$$

(where $\theta > 1$ is the elasticity of substitution between alternative differentiated goods) and $x^*$ a function of both the degree of market power and the size of tax distortions. However, we here offer an analysis of the optimal target criterion in the case of any loss function of the form (1.3), regardless of whether the weights and target values are the ones that can be justified on welfare-theoretic grounds or not. (In fact, a quadratic loss function of this form is frequently assumed in the literature on monetary policy evaluation, and is often supposed to represent the primary stabilization objectives of actual inflation-targeting central banks in positive characterizations of the consequences of inflation targeting.)

3 Possible sources of disturbances of this sort are discussed in Giannoni (2000), Steinsson (2002), and Woodford (2003, chap. 6).

4 See Woodford (2003, chap. 3) for a derivation in the context of an explicit intertemporal general equilibrium model of the transmission mechanism. Equation (1.1) represents merely a log-linear approximation to the exact equilibrium relation between inflation and output implied by this pricing model; however, under circumstances discussed in Woodford (2003, chap. 6), such an approximation suffices for a log-linear approximate characterization of the optimal responses of inflation and output to small enough disturbances. Similar remarks apply to the other log-linear models presented below.

5 See Woodford (2003, chaps. 3 and 6) for discussion of how this variable responds to a variety of types of real disturbances. Under conditions discussed in chapter 6, the “natural rate” referred to here corresponds to the equilibrium level of output in the case that all wages and prices were completely flexible. However, our results in this section apply to a broader class of model specifications, under an appropriate definition of the “output gap”.

The presence of disturbances of the kind represented by $u_t$ in (1.1) creates a tension between the two stabilization goals reflected in (1.3) of inflation stabilization on the one hand and output-gap stabilization (around the value $x^*$) on the other; under an optimal policy, the paths of both variables will
be affected by cost-push shocks. The optimal responses can be found by computing the state-contingent paths $\{\pi_t, x_t\}$ that minimize (1.2) with loss function (1.3) subject to the sequence of constraints (1.1).6 The Lagrangian for this problem, looking forward from any date $t_0$, is of the form

$$\mathcal{L}_{t_0} = E_{t_0} \sum_{t=t_0}^{\infty} \beta^{t-t_0} \left\{ \tfrac{1}{2}\left[\pi_t^2 + \lambda (x_t - x^*)^2\right] + \varphi_t \left[\pi_t - \kappa x_t - \beta \pi_{t+1}\right] \right\}, \qquad (1.5)$$

where $\varphi_t$ is a Lagrange multiplier associated with constraint (1.1) on the possible inflation-output pairs in period $t$. In writing the constraint term associated with the period $t$ AS relation, it does not matter that we substitute $\pi_{t+1}$ for $E_t \pi_{t+1}$; for it is only the conditional expectation of the term at date $t_0$ that matters in (1.5), and the law of iterated expectations implies that

$$E_{t_0}[\varphi_t E_t \pi_{t+1}] = E_{t_0}[E_t(\varphi_t \pi_{t+1})] = E_{t_0}[\varphi_t \pi_{t+1}]$$

for any $t \ge t_0$.

Differentiating (1.5) with respect to the levels of inflation and output each period, we obtain a pair of first-order conditions

$$\pi_t + \varphi_t - \varphi_{t-1} = 0, \qquad (1.6)$$

$$\lambda (x_t - x^*) - \kappa \varphi_t = 0, \qquad (1.7)$$

for each period $t \ge t_0$. These conditions, together with the structural relation (1.1), have a unique non-explosive solution7 for the inflation rate, the output gap, and the Lagrange multiplier (a unique solution in which the paths of these variables are bounded if the shocks $u_t$ are bounded), and this solution (which therefore satisfies the transversality condition) indicates the optimal state-contingent evolution of inflation and output.

6 Note that the aggregate-demand side of the model does not matter, as long as a nominal interest-rate path exists that is consistent with any inflation and output paths that may be selected. This is true if, for example, the relation between interest rates and private expenditure is of the form (1.15) assumed below, and the required path of nominal interest rates is always non-negative. We assume here that the non-negativity constraint never binds, which will be true, under the assumptions of the model, in the case of any small enough real disturbances $\{u_t, r^n_t\}$.

As an example, Figure 1 plots the impulse responses to a positive
cost-push shock, in the simple case that the cost-push shock is purely transitory, and unforecastable before the period in which it occurs (so that $E_t u_{t+j} = 0$ for all $j \ge 1$). Here the assumed values of $\beta$, $\kappa$, and $\lambda$ are those given in Table 1,[8] and the shock in period zero is of size $u_0 = 1$; the periods represent quarters, and the inflation rate is plotted as an annualized rate, meaning that what is plotted is actually $4\pi_t$. As one might expect, in an optimal equilibrium inflation is allowed to increase somewhat in response to a cost-push shock, so that the output gap need not fall as much as would be required to prevent any increase in the inflation rate. Perhaps less intuitively, the figure also shows that under an optimal commitment, monetary policy remains tight even after the disturbance has dissipated, so that the output gap returns to zero only much more gradually. As a result of this, while inflation overshoots its long-run target value at the time of the shock, it is held below its long-run target value for a time following the shock, so that the unexpected increase in prices is subsequently undone. In fact, as the bottom panel of the figure shows, under an optimal commitment, the price level eventually returns to exactly the same path that it would have been expected to follow if the shock had not occurred.

[Figure 1: ... of Calvo pricing.]

[7] Obtaining a unique solution requires the specification of an initial value for the Lagrange multiplier $\varphi_{t_0-1}$. See Woodford (2003, chap. 7) for the discussion of alternative possible choices of this initial condition and their significance. Here we note simply that regardless of the value chosen for $\varphi_{t_0-1}$, the optimal responses to cost-push shocks in period $t_0$ and later are the same.

[8] These parameter values are based on the estimates of Rotemberg and Woodford (1997) for a slightly more complex variant of the model used here and in section 1.3. The coefficient $\lambda$ here corresponds to $\lambda_x$ in the table. Note also that the value of .003 for that coefficient refers to a loss function in which $\pi_t$ represents the quarterly change in the log price level. If we write the loss function in terms of an annualized inflation rate, $4\pi_t$, as is conventional in numerical work, then the relative weight on the output-gap stabilization term would actually be $16\lambda_x$, or about .048. Of course, this is still quite low compared to the relative weights often assumed in the ad hoc stabilization objectives used in the literature on the evaluation of monetary policy rules.

Table 1: Calibrated parameter values for the examples in section 1.

Structural parameters
  β          0.99
  κ          0.024
  θ^{-1}     0.13
  σ^{-1}     0.16
Shock processes
  ρ_u        0
  ρ_r        0.35
Loss function
  λ_x        0.003
  λ_i        0.236

This simple example illustrates a very general feature of optimal policy once one takes account of forward-looking private-sector behavior: optimal policy is almost always history-dependent. That is, it depends on the economy’s recent history and not simply on the set of possible state-contingent paths for the target variables (here, inflation and the output gap) that are possible from now on. (In the example shown in the figure, the set of possible rational-expectations equilibrium paths for inflation and output from period $t$ onward depends only on the value of $u_t$; but under an optimal policy, the actually realized inflation rate and output gap depend on past disturbances as well.) This is because a commitment to respond later to past conditions can shift expectations at the earlier date in a way that helps to achieve the central bank’s stabilization objectives. In the present example, if price-setters are forward-looking, the anticipation that a current increase in the general price level will predictably be “undone” soon gives suppliers a reason not to increase their own prices currently as much as they otherwise would. This leads to smaller equilibrium deviations from the long-run inflation target at the time of the cost-push shock, without requiring such a large change in the output gap as would be required to stabilize inflation to the same degree without a change in expectations regarding
future inflation. (The impulse responses under the best possible equilibrium that does not involve history-dependence are shown by the dashed lines in the figure.[9] Note that a larger initial output contraction is required, even though both the initial price increase and the long-run price increase caused by the shock are greater.)

It follows that no purely forward-looking target criterion (one that involves only the projected paths of the target variables from the present time onward, like the criterion that is officially used by the Bank of England) can possibly determine an equilibrium with the optimal responses to disturbances. Instead, a history-dependent target criterion is necessary, as stressed by Svensson and Woodford (2003).

A target criterion that works is easily derived from the first-order conditions (1.6)-(1.7). Eliminating the Lagrange multiplier, one is left with a linear relation

$$\pi_t + \phi (x_t - x_{t-1}) = 0, \qquad (1.8)$$

with a coefficient $\phi = \lambda/\kappa > 0$, that the state-contingent evolution of inflation and the output gap must satisfy. Note that this relation must hold in an optimal equilibrium regardless of the assumed statistical properties of the disturbances. One can also show that a commitment to ensure that (1.8) holds each period from some date $t_0$ onward implies the existence of a determinate rational-expectations equilibrium,[10] given any initial output gap $x_{t_0-1}$. In this equilibrium, inflation and output evolve according to the optimal state-contingent evolution characterized above.

[9] See Woodford (2003, chap. 7) for derivation of this “optimal non-inertial plan.” In the example shown in Figure 1, this optimal non-inertial policy corresponds to the Markov equilibrium resulting from discretionary optimization by the central bank. That equivalence would not obtain, however, in the case of serially correlated disturbances.

[10] The characteristic equation that determines whether the system of equations consisting of (1.1) and (1.8) has a unique non-explosive solution is the same as for the system of equations solved above for the optimal state-contingent evolution.

This is the optimal target criterion that we are looking for: it indicates that deviations of the projected inflation rate $\pi_t$ from the long-run inflation target (here equal to zero) should be accepted that are proportional to the degree to which the output gap is projected to decline over the same period that prices are projected to rise. Note that this criterion is history-dependent, because the acceptability of a given projection $(\pi_t, x_t)$ depends on the recent past level of the output gap; it is this feature of the criterion that will result in the output gap’s returning only gradually to its normal level following a transitory cost-push shock, as shown in Figure 1.

How much of a projected change in the output gap is needed to justify a given degree of departure from the long-run inflation target? If $\lambda$ is assigned the value that it takes in the welfare-theoretic loss function, then $\phi = \theta^{-1}$, where $\theta$ is the elasticity of demand faced by the typical firm. The calibrated value for this parameter given in Table 1 (based on the estimates of Rotemberg and Woodford, 1997) implies that $\phi = .13$. If we express the target criterion in terms of the annualized inflation rate ($4\pi_t$) rather than the quarterly rate of price change, the relative weight on the projected quarterly change in the output gap will instead be $4\phi$, or about 0.51. Hence a projection of a decline in real GDP of two percentage points relative to the natural rate of output over the coming quarter would justify an increase in the projected (annualized) rate of inflation of slightly more than one percentage point.

1.2 Inflation Inertia

A feature of the “New Keynesian” aggregate-supply relation (1.1) that has come in for substantial criticism in the empirical literature is the fact that past inflation rates play no role in the determination of current equilibrium inflation. Instead, empirical models of the kind used in central banks for policy evaluation often imply that the path of the output gap required in
order to achieve a particular path for the inflation rate from now onward depends on what rate of inflation has already been recently experienced; and this aspect of one’s model is of obvious importance for the question of how rapidly one should expect that it is optimal to return inflation to its normal level, or even to “undo” past unexpected price-level increases, following a cost-push shock.

A simple way of incorporating inflation inertia of the kind that central-bank models often assume into an optimizing model of pricing behavior is to assume, as Christiano et al. (2001) propose, that individual prices are indexed to an aggregate price index during the intervals between re-optimizations of the individual prices, and that the aggregate price index becomes available for this purpose only with a one-period lag. When the Calvo model of staggered price-setting is modified in this way, the aggregate-supply relation (1.1) takes the more general form[11]

$$\pi_t - \gamma \pi_{t-1} = \kappa x_t + \beta E_t[\pi_{t+1} - \gamma \pi_t] + u_t, \qquad (1.9)$$

where the coefficient $0 \le \gamma \le 1$ indicates the degree of automatic indexation to the aggregate price index. In the limiting case of complete indexation ($\gamma = 1$), the case assumed by Christiano et al. and the case found to best fit US data in our own estimation results below, this relation is essentially identical to the aggregate-supply relation proposed by Fuhrer and Moore (1995), which has been widely used in empirical work.

The welfare-theoretic stabilization objective corresponding to this alternative structural model is of the form (1.2) with the period loss function (1.3) replaced by

$$L_t = (\pi_t - \gamma \pi_{t-1})^2 + \lambda (x_t - x^*)^2, \qquad (1.10)$$

where $\lambda > 0$ is again given by (1.4), and $x^* > 0$ is similarly the same function of underlying microeconomic distortions as before.[12] (The reason for the change is that with the automatic indexation, the degree to which the prices of firms that re-optimize their prices and those that do not are different depends on the degree to which the current overall inflation rate $\pi_t$ differs from the rate at which the
automatically adjusted prices are increasing, i.e., from $\gamma \pi_{t-1}$.) If we consider the problem of minimizing (1.2) with loss function (1.10) subject to

[11] See Woodford (2003, chap. 3) for a derivation from explicit microeconomic foundations.

[12] See Woodford (2003, chap. 6) for derivation of this loss function as an approximation to expected utility.
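The optimal commitment responses described in section 1.1 can be reproduced with a short numerical sketch. This is an illustration rather than the authors' own computation, and it rests on several assumptions: the basic supply relation (1.1) is taken to be (1.9) with $\gamma = 0$, that is $\pi_t = \kappa x_t + \beta E_t \pi_{t+1} + u_t$; all variables are deviations from the long-run steady state, with the initial output gap at its long-run level; and the parameters are the rounded Table 1 values $\beta = 0.99$, $\kappa = 0.024$, $\lambda = 0.003$. Writing $\pi_t = p_t - p_{t-1}$ for the log price level $p_t$, the target criterion (1.8) implies $x_t = -p_t/\phi$, and substituting into the supply relation gives a second-order difference equation in $p_t$ whose stable root $\mu$ yields $p_t = \mu p_{t-1} + \mu u_t$ for a purely transitory shock. The function name and the dictionary keys are hypothetical.

```python
import math


def optimal_commitment_irf(beta=0.99, kappa=0.024, lam=0.003, u0=1.0, periods=40):
    """Responses to a one-time cost-push shock u0 in period 0 under criterion (1.8).

    Assumes pi_t = kappa*x_t + beta*E_t[pi_{t+1}] + u_t (relation (1.9) with gamma = 0)
    and starts from the steady state (p_{-1} = 0, x_{-1} = 0).
    """
    phi = lam / kappa                     # coefficient phi = lambda / kappa in (1.8)
    a = 1.0 + beta + kappa / phi          # equals 1 + beta + kappa**2 / lam
    # Stable root of beta*mu**2 - a*mu + 1 = 0 (the other root exceeds 1).
    mu = (a - math.sqrt(a * a - 4.0 * beta)) / (2.0 * beta)

    path = []
    p_prev = 0.0
    for t in range(periods):
        u = u0 if t == 0 else 0.0
        p = mu * p_prev + mu * u          # log price level
        x = -p / phi                      # output gap implied by (1.8)
        pi = p - p_prev                   # quarterly inflation
        path.append({"t": t, "u": u, "pi": pi, "x": x, "p": p})
        p_prev = p
    return mu, phi, path


if __name__ == "__main__":
    mu, phi, path = optimal_commitment_irf()
    print(f"stable root mu = {mu:.4f}, phi = {phi:.4f}")
    for row in path[:6]:
        print(row["t"], round(4 * row["pi"], 4), round(row["x"], 4), round(row["p"], 4))
```

Under these assumptions the stable root is roughly 0.65, inflation is positive on impact and negative afterwards, and the price level decays back to its pre-shock path, which is the pattern the text attributes to Figure 1. Because the sketch uses the rounded value of $\lambda$, its $\phi$ is 0.125 rather than the .13 quoted in the text.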
The joint design of unemployment insurance and employment protection. A first pass
Olivier Blanchard† Jean Tirole‡

September 11, 2006

Abstract

Unemployment insurance and employment protection are typically discussed and studied in isolation. In this paper, we argue that they are tightly linked, and we focus on their joint optimal design in a simple model, with risk averse workers, risk neutral firms, and random shocks to productivity.

We show that, in the “first best”, unemployment insurance comes with employment protection, in the form of layoff taxes; indeed, optimality requires that layoff taxes be equal to unemployment benefits. We then explore the implications of four broad categories of deviations from first best: limits on insurance, limits on layoff taxes, ex-post wage bargaining, and ex-ante heterogeneity of firms or workers. We show how the design must be modified in each case.

Finally, we draw out the implications of our analysis for current policy debates and reform proposals, from the financing of unemployment insurance, to the respective roles of severance payments and unemployment benefits.

Keywords: Unemployment insurance, employment protection, unemployment benefits, layoff taxes, layoffs, severance payments, experience rating.

JEL classifications: D60, E62, H21, J30, J32, J38, J65.

∗ We are grateful to Daron Acemoglu, Suman Basu, Larry Katz, Javier Ortega, John Hassler, Aleh Tsyvinski, the editor and the two referees for helpful comments.
† MIT and NBER.
‡ IDEI and GREMAQ (UMR 5604 CNRS), Toulouse, and MIT.

Introduction

Unemployment insurance and employment protection are typically discussed and studied in isolation. In this paper, we argue that they are tightly linked, and we focus on their optimal joint design.

To show this, we start our analysis in Section 1 with a simple benchmark. Workers are risk averse; entrepreneurs run firms and are risk neutral. The productivity of any worker-firm match is random. If productivity is low enough, the worker and the firm may separate, in which case the worker becomes unemployed.

In that benchmark, a
simple way to achieve the optimum is for the state to pay unemployment benefits so as to insure workers, and to levy layoff taxes so as to lead firms to internalize the cost of unemployment and take an efficient layoff decision. The optimum has two further characteristics: The first is that layoff taxes are equal to unemployment benefits: This common level delivers both full insurance and production efficiency. Thus, the benchmark shows the tight conceptual relation between unemployment insurance and employment protection, defined as layoff taxes. The second is that state intervention is not needed: The same allocation is achieved by having firms voluntarily pay severance payments to their workers; in effect, severance payments act both as unemployment insurance and layoff taxes.

Using this benchmark as a starting point, we then examine, in Sections 2 to 5, how these conclusions are affected by the introduction of four empirically relevant deviations from the benchmark, namely: limits on unemployment insurance, limits on layoff taxes, ex-post wage bargaining, and ex-ante heterogeneity of either workers or firms. In each case, we ask two questions: The first is how the distortion affects the optimal combination of unemployment insurance and layoff taxes. The second is how the distortion affects the need and the scope for state intervention.

Reforms of both unemployment insurance and employment protection are high on the policy agendas of many European and Latin American governments. Proposals range from the creation of unemployment accounts, to changes in the financing of the unemployment insurance system, to changes in the form of employment protection. These are complex issues, but we feel that our analysis can help think about the answers. This is what we do in Section 6.

1 A benchmark

In approaching the issue, we make two methodological choices.

First, we use a static, one-period, model. As such, this represents a large step back from recent dynamic models of either unemployment insurance or employment
protection. We do so for two reasons. First, we want to focus on the joint design of unemployment insurance and employment protection, which makes things more difficult. Second, we want to explore a number of deviations, starting from as simple a benchmark as feasible. We believe that the basic insights we get from this analysis will extend to more dynamic frameworks; but this obviously remains to be shown.

Second, we use a mechanism design approach to the characterization of the optimum. This approach may appear unnecessarily heavy, especially in the benchmark itself (where the solution is straightforward), but we believe it pays off: It shows most clearly, in each case, first the characteristics of the optimal allocation, and then the role of unemployment benefits, taxes, and severance payments, in achieving this allocation.

1.1 Assumptions

Tastes and technology are as follows:
• The economy is composed of a continuum of mass 1 of workers, a continuum of mass (at least) 1 of entrepreneurs, and the state.
• Entrepreneurs are risk neutral. Each entrepreneur can start and run a firm. There is a fixed cost of creating a firm, $I$, which is the same for all entrepreneurs. If a firm is created, a worker is hired, and the productivity of the match is then revealed.
Productivity is given by $y$ from cdf $G(y)$, with density $g(y)$ on $[0,1]$. The firm can either keep the worker and produce, or lay the worker off, who then becomes unemployed. Realizations are iid across firms; there is no aggregate risk.
• The firm, but not the worker (or for that matter third parties such as an insurance company or the state) observes $y$.
• Workers are risk averse, with utility function $U(\cdot)$. Absent unemployment benefits, utility if unemployed is given by $U(b)$ (so $b$ is the wage equivalent of being unemployed).

1.2 The optimal allocation

Let $\bar{y}$ be the threshold level of productivity below which workers are laid off. Let $w$ be the payment to the workers who remain employed, and $\mu$ be the payment to the workers who are laid off. The optimal allocation maximizes expected worker utility subject to the economy’s resource constraint:[1]

$$\max_{\{w,\mu,\bar{y}\}} V_W \equiv G(\bar{y})\, U(b+\mu) + (1-G(\bar{y}))\, U(w)$$

subject to:

$$V \equiv -G(\bar{y})\mu + \int_{\bar{y}}^{1} y\, dG(y) - (1-G(\bar{y}))\, w = I$$

From the first-order conditions, it follows that:

$$w^* = b + \mu^* \qquad (1)$$

$$\bar{y}^* = b \qquad (2)$$

Given $\bar{y}^*$, the levels of $w^*$ and $\mu^*$ are determined by the resource constraint.

Condition (1) is an insurance condition: Workers achieve the same level of utility, whether employed or laid off and unemployed.

Condition (2) is an efficiency condition: From the point of view of total output, it is efficient for firms to produce so long as productivity exceeds the wage equivalent of being unemployed (we shall call $b$ the production-efficient threshold level).

[1] We derive the first best allocation ignoring the assumption that $y$ is observed only by the firm. We shall show below that this optimal allocation can indeed be implemented.

1.3 Implementation

Consider now the following implementation of the optimal allocation:
• Stage 1. The state chooses a payroll tax rate $\tau$, a layoff tax rate $f$, and unemployment benefits $\mu$.
• Stage 2. Entrepreneurs decide whether to start firms and pay the fixed cost. They offer contracts to workers. Contracts are characterized, explicitly, by a wage $w$, and, implicitly (since $y$ is not contractable), a threshold productivity level $\bar{y}$ below
which the worker is laid off. As all firms face the same cost and distribution of productivity, in equilibrium, all workers are initially hired.
• Stage 3. The productivity of each job is realized. Firms decide whether to keep or dismiss workers.

To show how the optimal allocation can be implemented, we work backwards in time.

At Stage 3, the cutoff $\bar{y}$ is such that the firm is indifferent between keeping the worker and paying $w + \tau$ in wage and payroll tax and dismissing the worker and paying layoff tax $f$, so:

$$\bar{y} = w + \tau - f. \qquad (3)$$

If $y > \bar{y}$, the firm keeps the worker, produces $y$, pays $w$ to the worker, and $\tau$ to the state. If $y < \bar{y}$, the firm lays the worker off, pays $f$ to the state; the state pays $\mu$ to the worker.

At Stage 2, firms’ wage offer $w$ satisfies the free entry condition:

$$V_F \equiv -G(\bar{y}) f + \int_{\bar{y}}^{1} y\, dG(y) - (1-G(\bar{y}))(w+\tau) = I. \qquad (4)$$

Consider now the problem faced by the government in choosing taxes and unemployment benefits at Stage 1. Condition (3) implies that to induce firms to take the production-efficient layoff decision $\bar{y}^* = b$, the following condition must hold:

$$w + \tau - f = b. \qquad (5)$$

Because optimal insurance further requires that $w = b + \mu$, the state’s policy must satisfy:

$$f - \tau = \mu. \qquad (6)$$

The net fiscal cost to the firm of laying off a worker must be equal to the unemployment benefits paid to the worker by the state. Note that this condition implies a positive relation between the layoff and the payroll tax rates: For given unemployment benefits, the higher the payroll tax, the higher the layoff tax needed to induce the firm to take the production-efficient decision.

The government budget constraint implies a second relation between taxes and benefits:

$$V_G \equiv -G(\bar{y})(\mu - f) + (1-G(\bar{y}))\tau = 0 \qquad (7)$$

This constraint implies a negative relation between the layoff and the payroll tax rates: For given unemployment benefits, the higher the payroll tax, the lower the layoff tax required to balance the budget. Combining the two conditions gives:

$$f = \mu, \qquad \tau = 0 \qquad (8)$$

The layoff tax must be equal to unemployment benefits, and the payroll tax rate is equal to zero. We summarize our results in Proposition 1.

Proposition 1. In
the benchmark, the optimal allocation is such that workers are fully insured ($b + \mu^* = w^*$), and the threshold productivity is equal to the production-efficient level ($\bar{y}^* = b$). Implementation is achieved through unemployment benefits equal to $\mu^*$, and layoff taxes $f = \mu^*$. Payroll taxes are equal to zero. Put another way, the contribution rate, defined as the ratio of layoff taxes to unemployment benefits, is equal to one.

1.4 Interpretation and discussion

The result that layoff taxes must be equal to unemployment benefits is a classic case of Pigovian internalization: To the extent that the state pays unemployment benefits to laid-off workers, layoff taxes lead firms to internalize these costs. Indeed, it is the rationale behind the experience rating systems in place in the different states in the United States.[2]

[2] See Baicker, Goldin, and Katz (1998) for a description of the politics and the arguments pro- and con-experience rating, presented in the 1920s and 1930s when these systems were put in place.

Indeed, within the assumptions of the benchmark, there is an even simpler way of making sure that firms internalize the costs of unemployment benefits: It is to have them provide unemployment benefits themselves rather than through the state. It is straightforward to see that the optimal allocation can also be implemented by simply letting firms pay severance payments. Firms will then want to offer severance payments equal to $\mu$.
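The benchmark implementation in Section 1.3 can be checked with a small numerical sketch. It is only an illustration under added assumptions that are not in the paper: productivity is uniform on $[0,1]$, so $G(y) = y$ and $\int_{\bar{y}}^{1} y\, dG(y) = (1 - \bar{y}^2)/2$, and the values $b = 0.2$ and $I = 0.1$ are arbitrary. The code imposes conditions (1) and (2), solves the resource constraint for $\mu$, applies the policy (8), and then evaluates the firm's threshold (3), the free entry condition (4), and the government budget (7). The function name and the dictionary keys are hypothetical.

```python
def benchmark_uniform(b=0.2, I=0.1):
    """First-best benchmark with y ~ Uniform[0, 1] (an illustrative assumption)."""
    y_bar = b                                # condition (2): production-efficient threshold
    G = y_bar                                # layoff probability G(y_bar) under the uniform cdf
    output = (1.0 - y_bar ** 2) / 2.0        # integral of y dG(y) from y_bar to 1

    # Resource constraint with full insurance w = b + mu (condition (1)):
    #   -G*mu + output - (1 - G)*(b + mu) = I   =>   mu = output - (1 - G)*b - I
    mu = output - (1.0 - G) * b - I
    w = b + mu

    f, tau = mu, 0.0                         # policy (8): layoff tax equals benefits, no payroll tax

    return {
        "mu": mu,
        "w": w,
        "firm_threshold": w + tau - f,                              # equation (3)
        "free_entry": -G * f + output - (1.0 - G) * (w + tau),      # left side of (4)
        "gov_budget": -G * (mu - f) + (1.0 - G) * tau,              # left side of (7)
    }


if __name__ == "__main__":
    for key, value in benchmark_uniform().items():
        print(f"{key}: {value:.4f}")
```

With these inputs the firm's own layoff threshold coincides with $b$, the free entry condition returns $I$, and the government budget balances, which is what Proposition 1 states for $f = \mu^*$ and $\tau = 0$.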
There is no need for the state to intervene.

The financing of unemployment benefits through layoff taxes, and the lack of a rationale for state intervention hold in the benchmark. But do they hold more generally? What happens for example if, for moral hazard or other reasons, laid-off workers cannot be fully insured? Is it still optimal to finance unemployment benefits only through layoff taxes? What happens if firms face financial constraints and are sometimes unable to pay the layoff taxes? Is it still optimal to fully insure workers? What happens if wages are renegotiated ex-post, and unemployment insurance increases the reservation wage of workers in negotiations? What happens if some workers or some firms are more exposed to the risk of low productivity than others? Isn’t there a risk that higher layoff taxes will affect them adversely? And, in all these cases, is it the case that firms can do it on their own, perhaps pooling resources through a private unemployment agency, or must the state intervene? These are the questions we take up in the next four sections.

2 Limits to insurance

In our benchmark, workers could be and were fully insured. There are various reasons why this may not be feasible. Workers may require incentives not to shirk when employed, or incentives to search when unemployed. Or there may be a non-pecuniary loss associated with becoming unemployed. We explore the implications of this last assumption, and return to a discussion of other potential reasons later.[3]

Assume that the utility of workers is now given by $U(c)$ if employed, and by $U(c) - B$ if unemployed, so $B > 0$ is the utility cost of being unemployed.[4] All other assumptions are the same as in the benchmark.

[3] Empirical evidence suggests that non-pecuniary losses associated with becoming unemployed are indeed large (see for example Winkelmann and Winkelmann (1998)).

[4] The derivation below goes through whatever the sign of $B$. But the substantive implications are obviously different.

2.1 The optimal allocation

The optimal allocation is the
solution to:[5]

$$\max_{\{w,\mu,\bar{y}\}} V_W \equiv G(\bar{y})\,(U(b+\mu) - B) + (1-G(\bar{y}))\, U(w),$$

subject to the resource constraint:

$$V \equiv -G(\bar{y})\mu + \int_{\bar{y}}^{1} y\, dG(y) - (1-G(\bar{y}))\, w = I.$$

From the first-order conditions, it follows that:

$$w^* = b + \mu^*. \qquad (9)$$

$$\bar{y}^* = b - \frac{B}{U'(w^*)}. \qquad (10)$$

Given $\bar{y}^*$, the levels of $w^*$ and $\mu^*$ are determined by the condition that the resource constraint holds with equality.

Condition (9) shows that marginal utility is equalized across employment and unemployment. Because $B > 0$ however, this implies that utility is lower when unemployed.

Condition (10) shows that the threshold level of productivity, $\bar{y}^*$, is lower than the production-efficient level $b$.

2.2 Implementation

As before, assume that the state first chooses taxes and benefits, the firms then enter and offer a wage to workers, and, finally, productivity is realized. Consider the following implementation of the optimal allocation, working backwards in time.

At Stage 3, the threshold productivity below which the firm lays a worker off is given by:

$$\bar{y} = w + \tau - f. \qquad (3)$$

[5] In deriving the optimal allocation, we again ignore the constraint that $y$ is only observed by the firm.
Again, we show below that this allocation can be implemented.

At Stage 2, the wage must satisfy the free entry condition:

$$V_F \equiv -G(\bar{y}) f + \int_{\bar{y}}^{1} y\, dG(y) - (1-G(\bar{y}))(w+\tau) = I.$$

Consider thus the problem faced by the government in choosing taxes and unemployment benefits at Stage 1. From equations (3) and (10), it follows that, to induce firms to take the socially optimal layoff decision $\bar{y}^* = b - B/U'(w)$, the following condition must hold:

$$f - \tau = \mu + \frac{B}{U'(w)}.$$

The net fiscal cost to the firm of laying off a worker must exceed the unemployment benefits paid to the worker by an amount which depends on the cost of becoming unemployed.

The other condition on taxes and benefits comes from the government budget constraint:

$$-G(\bar{y})(\mu - f) + (1-G(\bar{y}))\tau = 0.$$

Combining these two conditions gives:

$$f = \mu + (1-G(\bar{y}))\frac{B}{U'(w)} > \mu, \qquad \tau = -G(\bar{y})\frac{B}{U'(w)} < 0 \qquad (11)$$

The layoff tax must exceed unemployment benefits, implying, for budget balance, a negative payroll tax. We summarize our results in Proposition 2.

Proposition 2. (i) In the presence of limits to insurance, the threshold productivity in the socially efficient allocation is lower than the production-efficient level ($\bar{y}^* < b$), leading to a lower layoff rate than in the benchmark. (ii) Unemployment benefits, $\mu$, must be financed by a combination of layoff taxes which exceed these benefits ($f > \mu$) and of negative payroll taxes ($\tau < 0$). Put another way, the contribution rate must now be greater than one.

2.3 Interpretation and discussion

• The intuition for the two parts of Proposition 2 is straightforward: To the extent that unemployment implies a loss in utility, it is optimal to reduce its incidence, and thus to have a lower productivity threshold than the production-efficient level. This is achieved by increasing the cost of layoffs for firms, thus by having higher layoff taxes. Thus, the higher the utility cost of unemployment, the lower the layoff rate, the larger the layoff taxes: The lower layoff rate serves as a partial substitute for unemployment insurance.[6]
• We have examined the case where unemployment leads to a non-pecuniary loss in utility.
The limits to insurance may come instead from incentives.

Consider for example a modification of the benchmark based on shirking. Once hired, but before productivity is revealed, the worker decides whether to shirk or not. Shirking brings private benefits $B$ but results in zero productivity and thus a layoff. Shirking is unobservable. Thus, to prevent shirking, the following condition must hold:

$$(1-G(\bar{y}))\,(U(w) - U(b+\mu)) \ge B. \qquad (12)$$

The expected utility gain from being employed relative to being unemployed must exceed some value $B$. In that case, the optimal threshold is given by:

$$\bar{y} = b + \left[ w - (b+\mu) - \frac{U(w) - U(b+\mu)}{U'(w)} \right]$$

so $\bar{y}$ is lower than the production-efficient level if workers are risk averse. Thus, again, limits to insurance lead to lower layoffs and higher layoff taxes.[7]

An alternative rationalization comes from the need to motivate the unemployed to search. While we cannot formally analyze this case in our one-period model, search incentive constraints are likely, however, to lead to results similar to those we have derived. The difference in utility between unemployment and employment has to be sufficient to induce search effort. A full treatment would however require a dynamic model, and we cannot provide it here.[8]

• The result that payments by firms in case of layoffs must be larger than payments to the laid-off workers implies that the optimal allocation cannot be achieved just by severance payments, which imply equal payments by firms and payments to workers. Thus, implementation requires the presence of a third party.[9] We have taken this third party to be the state, collecting layoff taxes and (negative) payroll taxes, and paying unemployment benefits to workers. Formally, what is needed is a pooling or insurance agency, collecting payments from firms that lay off, paying unemployment benefits to workers, and distributing the difference to the remaining firms. In this case, firms who join are better off. The agency may therefore be private and participation voluntary.

The issue arises however of whether the optimal allocation can be implemented through a combination of unemployment benefits (paid either by the state or by a pooling agency) and severance payments from firms. This was indeed the case in the benchmark. It is no longer the case here. We now examine this issue more closely.

[6] This “overemployment” result is closely related to the conclusions of the “implicit contract” literature, in particular Baily (1974), Azariadis (1975), Akerlof and Miyazaki (1980).

[7] Under the more general assumption that shirking does not yield zero productivity but instead shifts the distribution of $y$ from $G(\cdot)$ to $H(\cdot)$, with $G(\cdot)$ stochastically dominating $H(\cdot)$, results are however less clear cut. In the absence of further restrictions, it is not necessarily the case that $\bar{y}$ is less than $b$, equivalently that $f$ is greater than $\mu$.

[8] The challenge here is to extend the research on optimal unemployment insurance, which focuses on the optimal size and timing of benefits (in particular Shavell and Weiss (1979), Hopenhayn and Nicolini (1997), Werning (2002)), to a model where the destruction margin is endogenous.

2.4 Severance payments versus unemployment insurance

A recurrent theme of the insurance literature is that the insurer must be wary of the externality imposed by supplemental insurance contracts (Pauly (1974)). For this reason, insurance companies often demand exclusivity and managerial compensation contracts prevent executives from undoing their incentives through insider trading or derivatives contracts with financial institutions. In the context of this paper, this raises the issue of whether insurance can be delivered through a combination of unemployment benefits and severance payments.

In the benchmark, the optimal allocation provided full insurance to workers, so the issue did not arise. Intuition suggests, and analysis confirms, that, in that case, firms would not want to undo the full insurance provided by the state by overinsuring the worker (severance pay) or underinsuring her (asking the worker to return some of the
unemployment benefits, assuming this were feasible).

But the issue arises here. To see this, return to Stage 2 and allow firms to offer contracts which specify both a wage $w$ and a severance payment, $\mu_F$. The expected utility of workers is given by:

$$V_W \equiv G(\bar{y})\,(U(b + \mu_F + \mu) - B) + (1-G(\bar{y}))\, U(w),$$

and the free entry condition is given by:

$$V_F \equiv -G(\bar{y})(f + \mu_F) + \int_{\bar{y}}^{1} y\, dG(y) - (1-G(\bar{y}))(w+\tau) = I,$$

with $\bar{y} = w + \tau - f - \mu_F$.

[9] This is an example of the general proposition (for example Holmström (1982)) that, when parties in a “team” are subject to incentive problems, there is typically a need for a “budget breaker”, such as an insurance company or the state.

Now, starting from $\mu_F = 0$, and assuming the economy is at the optimal allocation, so $w = b + \mu$, consider the effects of a small increase in $\mu_F$, $d\mu_F$, together with a decrease in the wage $dw = -(G(\bar{y})/(1-G(\bar{y})))\, d\mu_F$ so as to satisfy the free entry condition. It follows that

$$dV_W = \frac{g(\bar{y})}{1-G(\bar{y})}\, B\, d\mu_F > 0.$$

Firms therefore have an incentive to offer more insurance than required in the optimal allocation. The reason is that increasing $\mu_F$ has two effects on the expected utility of workers. First, it creates a wedge between marginal utility when employed and unemployed; starting from the optimal allocation, this effect is of second order. The second is that it reduces the probability of a layoff; because the loss in utility from becoming unemployed is equal to $U(w) - U(b+\mu) + B = B$, this effect is of first order and dominates the first. When firms increase $\mu_F$ however, they decrease layoffs, and given that layoff taxes exceed unemployment benefits paid by the state, they impose a negative externality on the state.
This is why, in the end, letting firms freely choose severance payments is suboptimal.

To summarize, in the presence of limits to insurance, the optimal allocation can be implemented by a state or by a private unemployment agency. But this agency must demand exclusivity, or else, mandate a ceiling for severance payments by firms. Otherwise, there will be overprovision of insurance, and a suboptimal allocation.

3 Shallow pockets

In our benchmark, firms were risk neutral and had deep pockets. These assumptions are again too strong. Even in the absence of aggregate risk, the owners of many firms, especially small ones, are not fully diversified, and thus are likely to act as if they were risk averse. And, even if entrepreneurs are risk neutral, information problems in financial markets are likely to lead to restrictions on the funds available to firms. In this section, we focus on the implications of limited funds.

Perhaps the simplest way of capturing the idea that firms have limited funds is to assume that each entrepreneur starts with assets $I + \bar{f}$, where $\bar{f} \ge 0$ is therefore the free cash flow available to the firm after investment. We explore the implications of this assumption, and discuss a number of extensions later.[10]

3.1 The optimal allocation

The government budget constraint (7), the threshold condition (3), and the condition that payments by the firm in case of layoff cannot exceed free cash flow ($f \le \bar{f}$), can be combined to give the following constraint on $\bar{y}$, $w$ and $\mu$:[11][12]

$$G(\bar{y})\mu - (1-G(\bar{y}))(\bar{y} - w) \le \bar{f}.$$

Therefore, the optimal allocation is the solution to:

$$\max_{\{w,\mu,\bar{y}\}} V_W \equiv G(\bar{y})\, U(b+\mu) + (1-G(\bar{y}))\, U(w),$$

subject to the resource constraint:

$$V \equiv -G(\bar{y})\mu + \int_{\bar{y}}^{1} y\, dG(y) - (1-G(\bar{y}))\, w = I,$$

and the additional constraint:

$$G(\bar{y})\mu - (1-G(\bar{y}))(\bar{y} - w) \le \bar{f}.$$

From the first-order conditions, it follows that the worker still receives full insurance:

$$w^* = b + \mu^*.$$

Furthermore, if the second constraint is binding (that is, if $\bar{f}$ is less than the layoff tax in the optimal allocation derived in Section 1), threshold productivity is given by:

$$\bar{y}^* = b + \frac{\mu^* - \bar{f}}{1-G(\bar{y}^*)} > b. \qquad (13)$$

By limiting payments by firms in case of layoff, the shallow pocket constraint prevents the state from achieving the production-efficient threshold, and the layoff rate is now higher than the production-efficient level. The tighter the shallow pocket constraint (i.e., the lower $\bar{f}$), the larger $(\mu^* - \bar{f})$, the higher $\bar{y}^*$, and so, the larger the layoff rate.

The levels of $\bar{y}^*$, $w^*$, and $\mu^*$ are determined by (13), the full insurance condition, and the condition that the resource constraint holds with equality.

[10] In this section, it is important that $y$ be cash (rather than, say, learning experience and so on), so it can be used to pay wages and taxes.

[11] One may wonder whether allowing for job creation subsidies/taxes in addition to payroll and layoff taxes might alleviate the shallow pocket constraint, and improve the allocation. This is not the case. Subsidies, even if allowed in the government budget constraint, would not appear in the equation below.

[12] This constraint is derived as follows. First rewrite the threshold condition as $\tau = \bar{y} - w + f$ and replace $\tau$ in the government budget constraint to get $-G(\bar{y})(\mu - f) + (1-G(\bar{y}))(\bar{y} - w + f) = 0$: For a given $\bar{y} - w$, the lower $f$, the lower is $\mu$. Reorganize and use $f \le \bar{f}$ to get the equation in the text.

3.2 Implementation

If the shallow-pocket constraint is binding, the state chooses the highest feasible layoff tax $f = \bar{f}$. Given unemployment benefits $\mu^*$, the government budget constraint then implies:

$$\tau = \frac{G(\bar{y}^*)}{1-G(\bar{y}^*)}\,(\mu^* - \bar{f}) > 0.$$

As unemployment benefits exceed layoff taxes, payroll taxes must be positive.

The threshold productivity chosen by firms is therefore given by:

$$\bar{y}^* = b + \mu^* + \tau - f = b + \mu^* + \frac{G(\bar{y}^*)}{1-G(\bar{y}^*)}\,(\mu^* - \bar{f}) - \bar{f} = b + \frac{\mu^* - \bar{f}}{1-G(\bar{y}^*)}.$$

This is the same expression as in (13), and so, layoff and payroll taxes indeed implement the optimal allocation. The derivation shows that we can think of the shallow pocket constraint as affecting the threshold productivity level directly (through the limit on the layoff tax) and indirectly (through the need for positive payroll taxes); both the lower layoff tax and the higher payroll tax reduce
thefiscal cost of layoffs forfirms,and thus lead to a layoffrate higher than the production-efficient level.By the same argument as before,the resource constraint implies that workers receive the optimal w∗andµ∗.We summarize the results in Proposition3.Proposition3.In the presence of shallow pockets,workers remain fully insured(w∗= b+µ∗).The threshold productivity is higher than the production-efficient level(¯y>b),14leading to a higher layoffrate than in the benchmark.This allocation can be implemented by the government choosing unemployment benefits µ∗,andfinancing them partly through layofftaxes¯f,and partly through payroll taxes,τ>0.Put another way,the implementation implies now a contribution rate smaller than one.3.3Interpretation and discussion•Proposition3has two important aspects:Thefirst is that the presence of limited funds does not prevent full insurance.The reason is that the state can raise the required funds through higher payroll taxes,and by implication a lower equilibrium wage,without violating the shallow pocket constraint.13 The second is that the presence of limited funds prevents the state from achieving the production-efficient layoffrate.Limits on layofftaxes affect layoffs in two ways,directly, and indirectly,through the higher payroll taxes required tofinance benefits.•Given that layofftaxes are less than unemployment benefits,it follows that,again,the optimal allocation cannot be achieved by just relying on severance payments byfirms.As in Section2,implementation can be achieved by a pooling agency,receiving contributions ¯f G(¯y∗)fromfirms that layoffand contributionsτ(1−G(¯y∗)from those that do not,and paying unemployment benefits G(¯y∗)µ∗to laid–offworkers.Firms have an incentive to join,and the agency may therefore be private.Also,in this case,because workers are fully insured,the coinsurance problem we looked at in the previous section does not arise: Insurance can be provided by a mix of severance payments byfirms,and insurance 
benefits from the state or the pooling agency.The assumption of exogenous free cashflow associated with each job is clearly too strong:•One reason why¯f may be endogenous is that thefirms may not want to have deep pockets even if they can.This arises for example,if,in contrast to the maintained assumption of13To see this,consider an allocation where w>b+µ.Now,consider a decrease in the wage of∆w<0 and an equal increase in payments byfirms to the state,∆τ.This change affects neither the threshold condition nor thefirm’s profie these increased payments to increase unemployment benefits by−[(1−G(¯y))/G(¯y)]∆w.Together,these changes imply a change in utility of[−(1−G(¯y))U (w)+(1−G(¯y))U (b+µ)](−∆w)>0.Thus,welfare can be improved until workers are fully insured.15。
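The implementation argument of Section 3.2 can be checked numerically. The sketch below is illustrative only. It assumes a uniform productivity distribution, G(y) = y on [0, 1], which the text does not impose, and it treats b, µ* and f̄ as given numbers, whereas in the model µ* is determined jointly with ȳ* and w*. The function name and the parameter values are hypothetical.

```python
import math


def threshold_uniform(b, mu, f_bar):
    """Threshold solving y = b + (mu - f_bar) / (1 - y), i.e. expression (13),
    under the ASSUMED uniform distribution G(y) = y on [0, 1].

    Rearranging gives y^2 - (1 + b) * y + (b + mu - f_bar) = 0;
    the smaller root is the one that stays close to b.
    """
    gap = mu - f_bar  # benefits not covered by the layoff tax
    disc = (1 + b) ** 2 - 4 * (b + gap)
    if disc < 0:
        raise ValueError("no interior threshold for these parameters")
    return ((1 + b) - math.sqrt(disc)) / 2


# Hypothetical parameter values, chosen only for illustration.
b, mu, f_bar = 0.2, 0.15, 0.05

y_bar = threshold_uniform(b, mu, f_bar)

# Payroll tax implied by the government budget constraint when f = f_bar.
G = y_bar  # uniform case: G(y_bar) = y_bar
tau = G / (1 - G) * (mu - f_bar)

# Threshold chosen by firms facing w = b + mu, payroll tax tau, layoff tax f_bar.
y_firm = b + mu + tau - f_bar

print(f"threshold from (13):       {y_bar:.6f}")
print(f"threshold chosen by firms: {y_firm:.6f}")
```

Under these assumptions the two thresholds coincide and both exceed b, which is the content of Proposition 3: the payroll tax is positive and the layoff rate is above the production-efficient level.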
Inflammatory markers in depression: literature review
REVIEW

Inflammation and clinical response to treatment in depression: A meta-analysis

R. Strawbridge^(a,*), D. Arnone^(a), A. Danese^(b,c), A. Papadopoulos^(a), A. Herane Vives^(a,e), A.J. Cleare^(a,d)

a Affective Disorders Research Group, Centre for Affective Disorders, Psychological Medicine, Institute of Psychiatry, King's College London, London, UK
b Social, Genetic & Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, London, UK
c Department of Child & Adolescent Psychiatry, Institute of Psychiatry, King's College London, London, UK
d National Institute for Health Research (NIHR) Biomedical Research Centre for Mental Health at South London and Maudsley NHS Foundation Trust and Institute of Psychiatry, King's College London, London, UK
e Psychiatric University Clinic, University of Chile, Santiago, Chile

Received 24 February 2015; accepted 12 June 2015

KEYWORDS: Depression; Inflammation; Biological markers; Depressive disorder; Treatment-resistant

Abstract

The depressive state has been characterised as one of elevated inflammation, which holds promise for better understanding treatment-resistance in affective disorders as well as for future developments in treatment stratification. Aiming to investigate alterations in the inflammatory profiles of individuals with depression as putative biomarkers for clinical response, we conducted meta-analyses examining data from 35 studies that investigated inflammation before and after treatment in depressed patients together with a measure of clinical response. There were sufficient data to analyse IL-6, TNFα and CRP. Levels of IL-6 decreased with antidepressant treatment regardless of outcome, whereas persistently elevated TNFα was associated with prospectively determined treatment resistance. Treatment non-responders tended to have higher baseline inflammation, using a composite measure of inflammatory markers. Our findings suggest that elevated levels of inflammation are contributory to treatment resistance. Combining inflammatory biomarkers might prove a useful tool to improve diagnosis
and detection of treatment refractoriness, and targeting persistent inflammation in treatment-resistant depression may offer a potential target for the development of novel intervention strategies.

© 2015 Elsevier B.V. and ECNP. All rights reserved.
/locate/euroneuro
/10.1016/j.euroneuro.2015.06.007
0924-977X

* Correspondence to: Affective Disorders Research Group, Centre for Affective Disorders, Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London SE5 8AZ, UK. Tel.: +44 20 7848 5305.
E-mail address: Becci.strawbridge@ (R. Strawbridge).

European Neuropsychopharmacology (2015) 25, 1532–1543

1. Introduction

An aberrant inflammatory profile has been widely demonstrated in depressive disorders and is believed to contribute to some of the biological mechanisms associated with disease onset and treatment response (Dowlati et al., 2010; Miller et al., 2009; Smith, 1991). Recent evidence suggests that levels of inflammation might be modifiable with pharmacological treatment (Hannestad et al., 2011; Hiles et al., 2012; Janssen et al., 2010) and preliminary evidence indicates that treatment resistance might be associated with heightened inflammation. Additionally, non-steroidal anti-inflammatory drugs might be beneficial as adjunctive treatments in unipolar (Akhondzadeh et al., 2009; Muller et al., 2006) and bipolar (Nery et al., 2008) disorders and the TNFα antagonist infliximab may particularly benefit depressed individuals with a history of treatment resistance and high inflammation (Raison et al., 2013).
Treatment non-response contributes greatly to the burden of affective illnesses (Gibson et al., 2010); it is common, affecting at least a third of patients (Warden et al., 2007), and is generally associated with poorer long-term outcomes (Fekadu et al., 2009). To improve the rate and robustness of clinical response in depression there is a need for novel treatment strategies (Kupfer et al., 2012), including enhancing the personalisation of treatment provision using stratification. As such, research has been increasingly focusing on the importance of effectively screening for predictors of response across depressed populations, and using putative biomarker signatures prior to treatment provision may help to identify objective biological differences between patients who do or do not respond to treatments. Measuring ‘panels’ of biomarkers may assist with the discovery of biological signatures for disorders such as depression (Schmidt et al., 2011), which also may be supported using meta-analytic techniques that provide greater statistical power than individual studies. Combining these two approaches may be useful for identifying inflammatory relationships with depressed state and response to treatment, particularly as studies measuring different (but similar) data points cannot otherwise be compared in a high-powered analysis. We describe a new methodology of combining inflammatory data from different biomarkers together to enable a substantially higher statistical power.

Another important factor in this relationship is whether inflammatory profiles within a depressed state might differ between individuals with unipolar and bipolar diagnoses: although this has not been established, there is some indicative evidence that inflammation is not elevated in bipolar depressed state (Munkholm et al., 2013), as opposed to mania and euthymia.

1.1. Aim of the study

With the aim of expanding on previous work, we investigated studies measuring inflammatory biomarkers in depression in relation to treatment response
and hypothesised that (a) non-responsive patients would have higher levels of inflammation at baseline than responders; (b) patients would show a decrease in levels of inflammation after a course of treatment, but that; (c) treatment refractoriness would be characterised by persistently high levels of inflammation.

2. Experimental procedures

2.1. Criteria for study inclusion

A systematic search of the literature was conducted to obtain all studies that measured inflammatory responses in depression at baseline and following a course of treatment, and that also assessed treatment response. A priori inclusion criteria required eligible studies to be in English, measure in vivo at least one peripheral biomarker purporting to measure inflammation in human subjects classified as being in a depressive episode according to a clinician-rated standardised measure of depression symptomatology (e.g. HRSD, MADRS, IDS) alongside a standardised measure of clinical response to a treatment (and where relevant, a comparison of inflammation between responder and non-responder groups at one timepoint or more). To ensure we measured naturally occurring inflammation we excluded any studies which included a psychological or physiological stressor, or induced inflammation either by a targeted agent or by specific immunomodulatory drugs (e.g. non-steroidal anti-inflammatory drugs would be excluded, but not psychotropic medications). For this reason we also excluded papers reporting relevant comparisons in specifically physically ill samples (though we included studies which did not necessarily exclude individuals who had physical illnesses). Subjects were required to be of any adult age to be considered eligible.

2.2. Systematic search

We searched the databases PubMED (1960-), EMBASE (1974-), and PsycINFO (1967-), with the aim of eliciting all studies measuring peripheral markers of inflammation in patients with unipolar or bipolar depression and in relation to treatment response and/or clinical improvement, fulfilling our
inclusion criteria. The full search process is depicted in Figure 1. Studies were retrieved by RS and inclusion/exclusion of studies agreed by consensus (with AC, AP). Studies were also scrutinised for potentially relevant citations. In case of incomplete information, study authors were contacted to request additional data not available in the original manuscript.

2.3. Assessment of quality

Research reports were assessed using seven criteria, adapted from those developed by the Evidence-Based Medicine Working Group that had been modified for use in prognostic investigations (Fekadu et al., 2009) and the Cochrane Collaboration's Risk of Bias tool for trial designs (Higgins et al., 2011). Studies can score either positively (+1), negatively (−1) or neutrally (no score change) on each of the following domains: cohort formation, sample size, trial/follow-up length, collection of biological data, study completion data, design of treatment provision, objective clinical assessment. This resulted in a ranking from −7 to +7 (see Table 1), which we used as a brief indicator of methodological rigour in individual studies, within the limitations of this approach.

2.4. Composite biomarker calculation

It was clear that the variation between studies of inflammatory biomarkers investigated would lead to low-powered meta-analyses of individual biomarkers. Based on the consideration that all selected biomarkers should measure the same latent construct (inflammation) and thus be correlated, we planned analyses to incorporate all possible available data. This novel method should at present be considered a preliminary test of the predictive validity of a combination of biomarkers as a measure of overall inflammatory response. The ‘composite measure’ provides a preliminary and perhaps coarse representation of inflammation and its relationship with response to treatment in affective disorders, and will therefore require consideration when interpreting results. However, this method not only permits a
higher powered meta-analysis, but also enables a broader perspective to be taken on the putative relationship between inflammatory profiles and clinical response to antidepressant treatments in people with depression. To prevent bias in the composite inflammation analysis towards studies measuring multiple biomarkers, one entry per eligible study was required for each analysis. It was also important not to bias our results towards particular biomarkers. We therefore employed a method to utilise the maximum data available by averaging together all relevant biomarkers within each study prior to entering into the meta-analysis. Eligible markers were defined as pro-inflammatory cytokines (Cameron and Kelvin, 2000; Hodge-Dufour et al., 1998), as follows: tumour necrosis factor (TNFα), interferon-α or β (IFNα/IFNβ), interleukins 1 (IL-1α/IL-1β) or 6 (IL-6), and C-reactive protein (CRP), which was also included as a direct marker of inflammation. For each included study, mean data values for each eligible biomarker were first converted into pg/ml (except for CRP, which was converted into mg/L), and then all relevant variables pooled to create the ‘composite’ measure using a pooling method embedded in the software Comprehensive Meta-analysis (version 2.2.021), for merging multiple data points within subjects (using the mean of the selected outcomes). The composite data calculated for each study, roughly representing the levels of inflammation for each comparison, provided a single entry per study into each meta-analysis.
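The averaging step described above can be sketched as follows. This is a minimal illustration and not the routine implemented in Comprehensive Meta-analysis: it assumes each study has already been reduced to one standardised effect size per biomarker, it takes a plain unweighted mean, and the marker labels, function names and numbers are all hypothetical.

```python
from statistics import mean

# Eligible markers as listed in the text (labels here are illustrative).
ELIGIBLE = {"TNFa", "IFNa", "IFNb", "IL-1a", "IL-1b", "IL-6", "CRP"}


def composite_effect(study_effects):
    """Collapse one study's per-biomarker effect sizes into a single entry.

    Markers outside the eligible set are ignored, so a study contributes
    exactly one value regardless of how many biomarkers it measured.
    """
    used = [d for marker, d in study_effects.items() if marker in ELIGIBLE]
    if not used:
        raise ValueError("study reports no eligible biomarker")
    return mean(used)


def composite_table(studies):
    """Map each study name to its single composite effect size."""
    return {name: composite_effect(effects) for name, effects in studies.items()}


# Hypothetical effect sizes, for illustration only.
studies = {
    "study_A": {"IL-6": 0.50, "TNFa": 0.25, "IL-10": -0.30},  # IL-10 is not eligible
    "study_B": {"CRP": 0.50},
}

composite = composite_table(studies)
for name, value in composite.items():
    print(name, round(value, 3))
```

Averaging within studies first means a study that measured several markers carries no more entries than one that measured a single marker, which is the bias the text sets out to avoid.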
2.5. Statistical analysis

Meta-analyses were conducted where sufficient data were available in at least 3 studies for each primary comparison. For all possible biomarkers, the comparisons conducted were as follows:

1. Responders vs. non-responders at baseline (pre-treatment).
2. Inflammatory changes alongside treatment in responders.
3. Inflammatory changes alongside treatment in non-responders.

Additional secondary comparisons that were conducted on the above biomarkers were patients vs. controls at baseline and inflammatory change in all patients over treatment (not distinguishing between responder and non-responder groups).

Aside from the effect-size calculations for the composite analyses, statistical analyses were conducted using Stata 11.0 (Stata Corp, College Station, Texas) and supplemented by ‘Metan’ software downloadable from the Centre for Statistics in Medicine, Oxford, UK, as reported previously (Arnone et al., 2009). Standardised mean differences were calculated using Cohen's d statistic and standardised effect sizes were then combined using the inverse variance method. Random effects analyses (DerSimonian and Laird, 1986) were used throughout to weight each study. The presence of heterogeneity was tested using the Q-test and its magnitude estimated using I², which can be interpreted as the proportion of effect size variance due to heterogeneity (Higgins et al., 2003). Publication bias, which describes the tendency of small studies to report large effect sizes, was examined using Egger's test (Egger et al., 1997) with the significance level set at p < 0.05. To further investigate causes for heterogeneity, meta-regression analyses were performed in the primary analyses (outlined below). Potential confounders considered were: sex (%), age, baseline symptom severity, clinical setting (inpatient/outpatient), medication status on study-entry, standardised/naturalised treatment in study, length of treatment, study year, and study quality assessment. The STATA module "metareg" was used throughout and
the REML (restricted maximum likelihood) method used to estimate the model parameters.

3. Results

The literature search yielded a total of 2053 articles, of which 35 met inclusion criteria (see Figure 1 and Table 1 for details and reasons for exclusion). All included studies investigated unipolar major depression except for one that only included bipolar diagnosed patients in a depressive episode (Tsai et al., 2014), and three that included both bipolar and unipolar depression (Himmerich et al., 2006; Landmann et al., 1997; Maes et al., 1995) but did not compare inflammation between the two groups. Three biomarkers were sufficiently researched to be included in primary analyses: interleukin-6 (IL-6) in 12 studies (Basterzi et al., 2005; Carvalho et al., 2012; Fornaro et al., 2011; Frommberger et al., 1997; Kubera et al., 2000; Lanquillon et al., 2000; Maes et al., 1997a; Maes et al., 1995; Marques-Deak et al., 2007; Mikova et al., 2001; Yoshimura et al., 2009; Yoshimura et al., 2013), TNFα in 11 studies (Eller et al., 2008, 2009; Fornaro et al., 2013; Himmerich et al., 2006; Landmann et al., 1997; Lanquillon et al., 2000; Mikova et al., 2001; Piletz et al., 2009; Song et al., 2009; Tuglu et al., 2003; Yoshimura et al., 2009) and CRP in 8 studies (Chang et al., 2012; Harley et al., 2010; Lanquillon et al., 2000; O'Brien et al., 2006; Piletz et al., 2009; Tsai et al., 2014; Tuglu et al., 2003; Uher et al., 2014).

Figure 1. Flow chart of selection process for inclusion of studies.

3.1. Description of studies

All 35 studies were longitudinal in design, measuring inflammatory markers at baseline and following up patients over the course of treatment. All but two studies (Carvalho et al., 2012; Uher et al., 2014) repeated inflammation measurements after treatment. Most articles dichotomised patients at study-end into responders and non-responders (Basterzi et al., 2005; Basterzi et al., 2010; Carvalho et al., 2012; Chang et al., 2012; Eller et al., 2008, 2009; Fornaro et al., 2011; Fornaro et al., 2013; Frank et
al., 2004; Himmerich et al., 2006; Kook et al., 1995; Landmann et al., 1997; Lanquillon et al., 2000; Maes et al., 1997a; Maes et al., 1997b; Mikova et al., 2001; O'Brien et al., 2006; Pariante and Miller, 1995; Seidel et al., 1996; Song et al., 2009; Yoshimura et al., 2009; Yoshimura et al., 2013). For these studies, the criterion for response was ≥50% reduction of score on the adopted depression severity rating scale. Seven studies reported results in responders only (Frommberger et al., 1997; Hernandez et al., 2008; Maes et al., 1995; Marques-Deak et al., 2007; Piletz et al., 2009; Tsai et al., 2014; Tuglu et al., 2003), and seven studies described clinical improvements using a continuous outcome measure (Harley et al., 2010; Himmerich et al., 2010; Kubera et al., 2000; Mizruchin et al., 1999; O'Brien et al., 2006; Schleifer et al., 1999; Uher et al., 2014). Studies were heterogeneous in terms of inflammatory biomarkers measured and patient samples, including the presence of psychiatric comorbidity, the degree of baseline treatment refractoriness and medication status at baseline. All studies investigated only pharmacological treatment, except one that compared pharmacological with psychological interventions (Harley et al., 2010); this found that high CRP was associated with good clinical response in antidepressant therapy but with poor response after psychotherapy. There were insufficient studies to compare inflammatory markers in unipolar and bipolar depression. Meta-analyses largely demonstrated significant levels of heterogeneity (I²) and lack of publication bias (all p > 0.05); see figures and Table 2.

3.2. Baseline inflammation and subsequent treatment-response

Elevated baseline inflammation was found in depression vs. healthy controls with all three inflammatory markers: IL-6 (p = 0.003), TNFα (p = 0.02) and CRP (p < 0.0001), as well as the composite analysis (p = 0.017). However, no significant differences in levels of baseline inflammation were identified between those subsequently responding
or not responding to treatment: this was shown in TNFα (p = 0.57), CRP (p = 0.76), and IL-6 (p = 0.19), though the latter was numerically higher in non-responders. The composite measure of inflammation at baseline showed higher levels were present in people subsequently not responding to treatment, which approached statistical significance (p = 0.073). This finding remained when confining the analysis solely to unipolar patients (i.e. removing the study which contained some bipolar depressed patients (Himmerich et al., 2006); p = 0.071). We performed a meta-regression on the composite measure which showed that the effect of elevated inflammation on treatment non-response was more accentuated in outpatient vs. inpatient settings (b = −0.494, p = 0.012), and in studies with a higher quality rating (b = 0.137, p = 0.009).

Figure 2. TNFα change in responders (Fig. 2A) vs. non-responders (Fig. 2B). Egger's test for publication bias: p = 0.07, I² test for heterogeneity: p < 0.001 and Egger's test for publication bias: p = 0.27, I² test for heterogeneity: p < 0.001.

3.3. Effects of treatment and treatment-response on inflammation

There was no change evident in TNFα levels when simply looking at the effects of treatment, i.e. when responders and non-responders were grouped together (p = 0.42). However, there was a differential effect when treatment response was taken into account: levels of TNFα significantly decreased in treatment responders (p = 0.008) but not in non-responders (p = 0.9); see Figure 2. These analyses included one study where both unipolar and bipolar patients were included (Himmerich et al., 2006); exclusion of this study did not alter TNFα results (responders, p = 0.008; non-responders, p = 0.66). Meta-regression analyses suggested that decreased levels of TNFα in responders positively correlated with year of publication, suggesting a
stronger effect in more recent studies (b = 0.205, p = 0.026) with a trend in non-responders (b = 0.21, p = 0.056).

In the studies measuring IL-6, there was an overall reduction following treatment irrespective of treatment response (p = 0.03; see Figure 3). When separate analyses were conducted for responders and non-responders, however, non-significant decreases were seen after treatment in both responders (p = 0.53) and non-responders (p = 0.11). IL-6 analyses included 8 patients diagnosed with bipolar depression (Maes et al., 1995); exclusion of the study containing these patients (as within-study bipolar/unipolar patient data was unavailable) from the analyses did not change the responders' subgroup results (p = 0.8), and no non-responders were included in this article, but the overall analysis showed a slightly lowered significance value (k = 9, ES = −0.54, CI −1.12/0.03, p = 0.06).

Meta-regressions indicated a correlation in responders between age and IL-6 change over treatment; studies with a higher mean age report smaller reductions in IL-6 levels with treatment (b = 0.113, p = 0.011). In non-responders the degree of change in measured IL-6 levels was more significant in older studies (b = 0.236, p = 0.024).

There was no effect of treatment, or of treatment-response, on levels of CRP or on the composite inflammation measure. However, meta-regressions run on the composite analyses unanimously suggested that studies in which not all subjects were unmedicated at baseline showed greater variance in inflammatory changes alongside treatment (for all depressed subjects: b = 1.23, p = 0.017; for responders only: b = 1.101, p = 0.432; for non-responders only: b = 1.215, p = 0.053).

3.4. Unipolar and bipolar depression

Only Tsai et al. (2014) included solely bipolar diagnosed patients who were in a depressive episode and it was not possible to undertake a meta-analysis comparing unipolar vs. bipolar depression in the four studies identified (Himmerich et al., 2006; Landmann et al., 1997; Maes et al., 1995) due to disparate study methodologies or insufficient
information being available. Tsai et al. (2014) identified a non-significant increase in levels of inflammation from acute depression to euthymia. In Maes et al. (1995), eight patients of the 61 included were bipolar patients in a depressed mood state, and the authors reported no correlations between bipolarity, IL-6 and depression severity (Maes et al., 1995). The other two articles including bipolar depressed patients did not report results separately nor any comparisons between the unipolar and bipolar diagnosed subjects. As can be seen above, removal of bipolar subjects from primary meta-analyses did not substantially affect the results.

4. Discussion

To our knowledge this is the first meta-analysis to investigate systematically the relationship between inflammation and treatment resistance in depression, both as a predictive marker and in maintenance of the illness. We found that prospectively-determined treatment resistance is associated with continued elevations in inflammation, in that there is a decrease in TNFα levels over time in treatment-responsive but not in treatment-resistant patients. We also examined a novel method for merging related inflammatory biomarker data, and its relation to treatment-resistance in affective disorders, finding a trend towards higher inflammation being associated with a poorer response to antidepressant treatment.

4.1. Inflammation and major depression

Although not the primary focus of the study, we have replicated previous findings that depression as a whole is associated with increased inflammation. Inflammatory elevations in depression have been reliably demonstrated across numerous reviews (Dowlati et al., 2010; Hannestad et al., 2011; Hiles et al., 2012; Miller et al., 2009) and there exist many plausible mechanisms by which this may occur.
The causative effect of psychological and physiological stress on the inflammatory response has been well documented and this system interacts bidirectionally with other systems implicated in mood disorders, including HPA-axis activity and cortisol release (Miller et al., 1999), serotonergic pathways (Maes et al., 2011), neurogenesis and neuroinflammation (Harry and Kraft, 2012). There is additional evidence that inflammation is a causal factor in the onset of depression, supported by replicated findings that administration of inflammatory cytokines (particularly IFNα treatment for hepatitis C) can induce depressive symptoms or clinical depression in many patients (Raison et al., 2005).

An important area of uncertainty is the degree to which depression occurring as part of a bipolar disorder may differ compared to a unipolar disorder. There is a paucity of research to this end, and we were not able to identify sufficient studies to test the hypothesis that inflammatory markers may differentiate between unipolar and bipolar disorder. There is clear evidence of differential treatment strategies being appropriate in unipolar and bipolar depression (Pacchiarotti et al., 2013) and due to the unanswered question of whether raised levels of inflammation are more specific to unipolar depression, therapeutic intervention in this domain may not be appropriate in bipolar depression. This is clearly an area which requires further investigation.

4.2. Inflammation and treatment resistance

The results of the meta-analysis demonstrate a role of inflammation in treatment-resistant depression: there were significant decreases in TNFα (towards control levels) seen only in treatment responders, whereas treatment resistance was associated with persistently elevated TNFα. This implies that maintenance of heightened levels of inflammation may at least contribute to treatment refractoriness, and thus that anti-inflammatory agents might provide a mechanism for treatment resistance in
individuals with persistent high levels of TNFα. This is strengthened by recent preliminary findings that a TNFα antagonist, infliximab, can improve depression in some treatment-resistant patients (Raison et al., 2013); when stratified by pre-treatment levels of inflammation, infliximab appears to be most antidepressant in those with higher pre-treatment inflammation. This association between TNFα modification and response may account for the lack of significant findings in a previous meta-analysis (Hannestad et al., 2011) which did not consider differential patterns of alteration in responders vs. non-responders.

We also found that, regardless of treatment response, antidepressant treatment can have anti-inflammatory effects, notably a reduction in IL-6. This may also occur in bipolar depression as indicated by the reduced significance found when removing Maes et al. (1995), whose sample was partly comprised of bipolar patients. The anti-inflammatory effects of antidepressants have been reported in preclinical (Connor et al., 1999) and in vitro (Xia et al., 1996) studies as well as clinical samples (Hannestad et al., 2011; Hiles et al., 2012). Indeed, it has been suggested that these anti-inflammatory effects may be one of the many mechanisms by which antidepressants exert their therapeutic effect (Janssen et al., 2010). It may be, therefore, that this anti-inflammatory effect of antidepressants is sufficient in many cases to reverse the overall inflammatory response seen in depression. However, in those with more severe or chronic illnesses, this effect may not in itself be sufficient to normalise the inflammation, which may then in turn act as a maintaining factor in the illness. It should also be noted that psychological interventions alone have also been reported to reduce inflammation alongside depressive symptoms (Thornton et al., 2009).

4.3. Composite biomarker measurement

While meta-regressions conducted on individual biomarkers may have been insufficiently powered to illustrate
factors important in modifying the comparisons, the composite meta-regressions highlight the potential importance of medication status in the relationship between inflammation and treatment-resistance in depression. Our findings may also suggest that specific medications and their mechanisms might explain some of the heterogeneity within results, something that we were not able to explore further. A comprehensive understanding of pharmacological effects on inflammation, and treatment-response, will require substantially larger samples of depressed individuals before, during and after treatment with a range of separate antidepressant medications.

Figure 3. IL-6 alterations over treatment. Egger's test for publication bias: p = 0.47, I² test for heterogeneity: p < 0.001.

The composite measures showed that patients with higher levels of inflammation responded less well to subsequent treatment, though this finding did not reach statistical significance. Despite the lack of significant results from the composite analyses, we suggest this approach is still worthwhile; due to the complexity of interactions between human biomarkers (as well as the heterogeneity of affective disorders), it is arguable that such methods will be more likely to detect robust and clinically useful biological indicators to predict the likelihood of treatment successes. There may be a number of methods for calculating this composite measurement, and identification of an optimal approach requires further investigation. We particularly highlight the difficulty surrounding which biomarkers should be classified as those representing inflammation, and inconsistencies within the literature on this subject. We believe that this can evolve through the use of large datasets, advanced modelling techniques, and/or new discoveries made in biochemical mechanisms.

4.4. Clinical implications

As outlined earlier, treatment
resistance is a common clinical problem in affective disorders, and it is likely that there are several contributing factors in each individual patient. An important approach is to rule out alternative diagnoses that may explain the depressive symptoms, and to evaluate organic factors that may be of relevance. The results of this meta-analysis add to the suggestion that it may also be important to evaluate the presence of raised levels of inflammation. We have shown that elevated inflammatory markers predict a poorer response to antidepressants, and that those who do not respond to antidepressant treatments show persistently elevated inflammation. We suggest that there is now a clear imperative for research to investigate whether targeting this elevated inflammation will improve the outcome in treatment-resistant depression, and if so, in which particular groups of patients. Studies have rarely measured all potential inflammatory markers, and we do not yet know whether there are specific aspects of the inflammatory response that are relevant to depression or whether an approach such as that taken here of combining measures of inflammation is most likely to be of clinical relevance. We also suggest that inflammation is likely to represent just one of several potential novel treatment targets in these difficult-to-treat cases of depression, and that other approaches based upon other maintaining factors such as HPA axis disturbance (Juruena et al., 2009; Markopoulou et al., 2009) may also suggest differential treatment approaches on an individual level. Indeed, combining a range of inflammatory and other markers might be useful in enhancing treatment personalisation and diagnostic accuracy in the future, and complements current strategies to link clinical syndromes more closely to underlying neurobiological and other substrates (Insel, 2014). The benefits of this approach have been comprehensively outlined by Schmidt et al. (2011), who advocate the investigation of 'panels' of
biomarkers (including for inflammatory, neurogenesis, endocrine and other systems) in order to improve the recognition of different patient subtypes and ultimately increase treatment response.

4.5. Limitations

There are several limitations in the interpretation of findings from this work. Our assessments using Egger's test indicate that our analyses are not likely to have been influenced by publication bias. However, due to the relatively small number of studies included in this work it is not possible to fully exclude the possibility of selective publication of positive studies. It is also notable that there was a wide range of treatments and inflammatory markers, and variation in patient characteristics, between included studies, limiting the conclusiveness and generalisability of the present findings. In particular, the treatments studied were almost exclusively pharmacological, and therefore the results may not apply to other forms of treatment.

In addition, depression is a highly diverse condition and this was evident in the significant levels of heterogeneity present in analyses, with factors including severity, depressive subtype, and degree of treatment resistance likely contributing to variation in inflammatory profiles. We explored possible sources of heterogeneity with meta-regression analyses and found some associations with effect sizes, notably those present in the composite analyses, and that IL-6 reductions with treatment were more prominent in younger samples. This may be a proxy for an earlier stage within the longitudinal course of affective illness or a representation of treatment naivety; both of these factors are associated with improved clinical response (Kornstein and Schneider, 2001). However, it is important to bear in mind that heterogeneity could only partially be explained by the confounders we considered in meta-regressions; indeed, it is likely that there is significant heterogeneity due to the very nature of depression itself. Moreover, this reinforces our
message that further progress will be facilitated by defining more homogeneous groups for study, for example those with raised inflammatory markers and/or specific symptom profiles.

Utilising standardised treatment approaches, and the inclusion of psychological treatments as well as pharmacological, could improve our understanding of how different treatments can resolve inflammation. Furthermore, the relationship between inflammatory and other biological systems is clearly complex and multifaceted. Concurrent assessment of some of the parameters interacting with the inflammatory response in depression, such as the endocrine system, might prove useful in providing a fuller understanding of neurobiological dysfunction and treatment in depression.

Author disclosures

Role of funding source

This paper presents independent research partly funded by the National Institute for Health Research (NIHR) and the NIHR

R. Strawbridge et al.
How to Choose an English Essay Topic Rationally

When it comes to choosing an English essay prompt, there are a few things to consider. First, it's important to choose a prompt that you're interested in and passionate about. This will make the writing process more enjoyable and easier to complete. Additionally, consider the level of difficulty of the prompt. If it's too easy, it may not challenge you enough, but if it's too difficult, you may struggle to write a coherent essay.

Another factor to consider is the relevance of the prompt to your personal experiences or interests. For example, if you're interested in environmental issues, you may want to choose a prompt related to climate change or sustainability. This will allow you to bring your own knowledge and experiences to the essay, making it more unique and personal.

Finally, consider the length and format of the essay prompt. Some prompts may require a specific type of essay, such as a persuasive or argumentative essay, while others may be more open-ended. Make sure you understand the requirements of the prompt before starting to write.
The Principle of Optimality

The principle of optimality is a fundamental concept in various fields of study, including economics, decision theory, and computer science. This principle, also known as Bellman's principle, states that an optimal policy has the following property: whatever the initial state and initial decision, the remaining decisions must form an optimal policy with respect to the state resulting from the first decision. In other words, the best way to continue from any point depends only on the current state, not on how that state was reached.

The principle of optimality has wide-ranging applications and implications. In economics, it is used to analyze the behavior of individuals and firms in making optimal decisions, such as maximizing profits or minimizing costs. In decision theory, it is used to develop algorithms and strategies for making optimal decisions in complex situations, such as in game theory or in the design of artificial intelligence systems.

One of the key applications of the principle of optimality is in the field of dynamic programming. Dynamic programming is a problem-solving technique that breaks a complex problem into smaller, interconnected subproblems and then solves each subproblem in a way that optimizes the overall solution. The principle of optimality is a fundamental component of dynamic programming, because it allows the problem to be solved recursively: the optimal solution to the overall problem is built up from the optimal solutions to the smaller subproblems.

Another important application of the principle of optimality is in the design of efficient algorithms and data structures.
In computer science, the principle of optimality is used to develop algorithms that can solve complex problems in an optimal or near-optimal way, such as in the design of routing algorithms for computer networks or in the optimization of database queries.

The principle of optimality is also relevant in the field of control theory, where it is used to develop optimal control strategies for systems with multiple inputs and outputs. In this context, it is used to determine the optimal control actions that should be taken at each stage of the control process in order to achieve the desired outcome.

One of the key advantages of the principle of optimality is its ability to simplify complex decision-making processes. By breaking down a problem into smaller, interconnected subproblems and solving each subproblem in an optimal way, the overall problem can be solved more efficiently and effectively. This is particularly important in situations where the decision-making process involves a large number of variables or where the consequences of decisions can be far-reaching and difficult to predict.

However, the principle of optimality is not without its limitations. It holds only when a problem has optimal substructure. Where stages are tied together by complex interdependencies, or where the problem involves significant uncertainty or risk, a decision that looks optimal at one stage may not be the best decision for the overall problem. In these cases, the principle of optimality may need to be supplemented with other decision-making frameworks or techniques, such as game theory or robust optimization.

Despite these limitations, the principle of optimality remains a powerful and widely used tool in a variety of fields. Its ability to simplify complex decision-making processes and to identify optimal solutions has made it an indispensable tool in a wide range of applications, from economics and decision theory to computer science and control theory.
As our understanding of complex systems and decision-making processes continues to evolve, the principle of optimality is likely to remain an important and influential concept in the years to come.
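
The dynamic-programming use of the principle described in this essay can be made concrete with a small sketch. The network, its node names and its edge costs are invented purely for illustration; the function names `best_cost` and `best_path` are likewise hypothetical. The recursion simply says that the cheapest route from a node is the cheapest first step plus the cheapest route from wherever that step leads.

```python
from functools import lru_cache

# Hypothetical network: COSTS[node] maps each successor to the cost of that edge.
COSTS = {
    "A": {"B": 2, "C": 5},
    "B": {"C": 1, "D": 7},
    "C": {"D": 3},
    "D": {},
}


@lru_cache(maxsize=None)
def best_cost(node, goal="D"):
    """Bellman recursion: optimal cost from `node` equals the best first step
    plus the optimal cost of the remaining subproblem."""
    if node == goal:
        return 0
    options = [cost + best_cost(nxt, goal) for nxt, cost in COSTS[node].items()]
    # A dead end that is not the goal has no finite cost.
    return min(options) if options else float("inf")


def best_path(node, goal="D"):
    """Rebuild an optimal route by repeatedly taking the step that attains best_cost."""
    path = [node]
    while node != goal:
        node = min(COSTS[node], key=lambda nxt: COSTS[node][nxt] + best_cost(nxt, goal))
        path.append(node)
    return path


if __name__ == "__main__":
    print(best_cost("A"), best_path("A"))
```

Memoising `best_cost` means each subproblem is solved once, which is the practical payoff of the principle: the tail of an optimal route is itself optimal, so it never has to be recomputed for a different prefix.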
Maintaining Rational Thinking
In today's fast-paced and complex world, it is crucial for individuals to maintain rational thinking. Rational thinking allows us to make sound decisions, solve problems effectively, and navigate through life's challenges with clarity and objectivity. However, in the face of various influences and pressures, it can be challenging to stay rational. Therefore, it is essential to understand the importance of rational thinking and develop strategies to cultivate and preserve it.

Firstly, rational thinking enables us to analyze situations objectively and make informed decisions. When we encounter a problem or dilemma, it is natural for emotions to cloud our judgment. However, by stepping back and assessing the situation from a rational standpoint, we can separate our emotions from the facts. This allows us to consider various perspectives, evaluate potential outcomes, and make decisions based on logical reasoning rather than impulsive reactions. For example, when faced with a job offer, rational thinking will help us weigh the pros and cons, consider long-term implications, and make a choice that aligns with our goals and values.

Furthermore, rational thinking helps us navigate through the vast amount of information available to us today. With the advent of the internet and social media, we have access to an overwhelming amount of information, opinions, and news. However, not all information is accurate or reliable. Rational thinking allows us to critically evaluate information, discern fact from fiction, and make informed judgments. By questioning the credibility of sources, verifying information through multiple channels, and considering different viewpoints, we can avoid falling into the trap of misinformation and make well-informed decisions.

Moreover, maintaining rational thinking enables us to handle conflicts and disagreements with grace and composure. In today's polarized society, it is common to encounter differing opinions and conflicting ideologies.
Instead of reacting impulsively or becoming defensive, rational thinking allows us to approach conflicts with an open mind and engage in constructive dialogue. By listening attentively, considering alternative perspectives, and presenting our own arguments based on logic and evidence, we can foster understanding, find common ground, and work towards resolution. Rational thinking helps us transcend personal biases and prejudices, promoting tolerance, empathy, and cooperation.

To cultivate and preserve rational thinking, it is essential to practice self-awareness and mindfulness. By being aware of our own emotions, biases, and thought patterns, we can recognize when our thinking becomes clouded and prone to irrationality. Regular self-reflection, meditation, and journaling can help us gain insights into our own cognitive processes and identify areas for improvement. Additionally, seeking diverse perspectives, engaging in intellectual discussions, and exposing ourselves to different cultures and ideas can broaden our horizons and enhance our rational thinking abilities.

In conclusion, maintaining rational thinking is crucial in today's complex and fast-paced world. It allows us to make informed decisions, navigate through the vast amount of information available, and handle conflicts with grace. By cultivating self-awareness and practicing mindfulness, we can develop and preserve rational thinking skills. In doing so, we can approach life's challenges with clarity and objectivity, leading to personal growth and success.
What is a Normative Goal?

Mehdi Dastani
Dept of Computer Science, Utrecht University
email: mehdi@cs.uu.nl

Leendert van der Torre
Dept of Artificial Intelligence, Vrije Universiteit Amsterdam
email: torre@cs.vu.nl

August 5, 2002

Abstract

In this paper we are interested in developing goal-based normative agent architectures. We ask ourselves the question what a normative goal is. To answer this question we introduce a qualitative normative decision theory based on belief (B) and obligation (O) rules. We show that every agent which makes optimal decisions, which we call a BO rational agent, acts as if it is maximizing its achieved normative goals. This is the basis of our design of goal-based normative agents.

1 Introduction

Simon [10] interpreted goals as utility aspiration levels, in planning goals have a notion of desirability as well as intentionality, and in the BDI approach [4,8] goals have been identified with desires. Moreover, several approaches have recently been proposed to extend decision making and planning with goal generation, such as Thomason's BDP logic [11] and Broersen et al.'s BOID architecture [2]. But what is this thing called goal? Although there are many uses of goals in planning and more recently in agent theory, the ontological status of goals seems to have received little attention.

In this paper we ask ourselves what a normative goal is. We draw inspiration from Savage's classical decision theory [9]. The popularity of this theory is due to the fact that Savage shows that a rational decision maker, which satisfies some innocent-looking properties, acts as if it is maximizing its expected utility function. This is called a representation theorem. In other words, Savage does not assume that an agent has a utility function and probability distribution which the agent uses to make decisions.
However, he shows that if an agent bases his decisions on preferences and some properties of these preferences, then we can assume that the agent bases his decisions on these utilities and probabilities together with the decision rule which maximizes its expected utility. The main advantage is that Savage does not have to explain what a utility function is, an ontological problem which had haunted decision theory for ages.

Likewise, we want to develop a qualitative normative decision theory in which a normative agent acts as if it is trying to maximize achieved normative goals. This is what we call a goal-based representation theorem. It implies that agents can be formalized or verified as goal-based reasoners even when the agent does not reason with goals at all. In other words, goal-based representations do not have to be descriptive. A consequence of this indirect definition of goals is that the theory tells us what a goal is, such that we do not have to explain its ontological status separately.

Our problem is the development of such a normative decision theory. In this paper we introduce a rule-based decision theory, based on belief (B) and obligation (O) rules. Our problem thus breaks down into two subproblems:

• How to develop a normative decision theory based on belief and obligation rules?
• How to define a notion of normative goals in this theory?

We call an agent which minimizes its unreached obligations a BO rational agent, and we define goals as a set of formulas which can be derived by beliefs and obligations in a certain way. The distinction between the decision theory and the goal generation is the way in which the obligation rules are used. In the decision theory obligation rules are only used to evaluate the consequences of decisions, whereas they are applied during goal generation. Our central result thus says that BO rational agents act as if they maximize the set of achieved goals.

Like classical decision theory but in contrast to several proposals in the BDI approach [4,8], the
theory does not incorporate temporal reasoning and scheduling.The layout of this paper is as follows.We£rst develops a normative logic of de-cision.This logic tells us what the optimal decision is,but it does not tell us how to £nd this optimal decision.We then consider the AI solution to this problem[10]:break down the decision problem into goal generation and goal based decisions.2A normative decision theoryThe qualitative decision theory introduced in this section is based on sets of belief and obligation rules.There are several choices to be made,where our guide is to choose the simplest option available.2.1Logic of rulesThe starting point of any theory of decision is a distinction between choices made by the decision maker and choices imposed on it by its environment.We therefore assume the two disjoint sets of propositional atoms A={a,b,c,...}(the agent’s decision variables[6]or controllable propositions[1])and W={p,q,r,...}(the world parameters or uncontrollable propositions).We write:•L A,L W and L AW for the propositional languages built up from these atoms in the usual way,and x,y,...for any sentences of these languages.•Cn A,Cn W and Cn AW for the consequence sets,and|=A,|=W and|=AW for satis£ability,in any of these propositional logics.•x⇒y for an ordered pair of propositional sentences called a rule.•E R(S)for the R extension of S,as de£ned in De£nition1below.Belief and obligation rules are interpreted as inference rules.This is formalized in the following de£nition.De£nition1(Extension)Let R⊆L AW×L AW be a set of rules and S⊆L AW a set of sentences.The consequents of the S-applicable rules are:R(S)={y|x⇒y∈R,x∈S}and the R extension of S is the set of the consequents of the iteratively S-applicable rules:E R(S)=∩S⊆X,R(CnAW(X))⊆XXThe following proposition shows that E R(S)is the smallest superset of S closed under the rules R interpreted as inference rules.Proposition1Let•E0R(S)=S•E i R(S)=E i−1R (S)∪R(Cn AW(E i−1R(S)))for i>0We have E R(S)=∪∞0E i 
R(S).The following proposition shows that E R(S)is monotonic.Proposition2We have R(S)⊆R(S∪T)and E R(S)⊆E R(S∪T).Monotonicity is illustrated by the following example.Example1Let R={⊤⇒p,a⇒¬p}and S={a},where⊤stands for any tautology like p∨¬p.We have E R(S)={a,p,¬p},i.e.the R extension of S is inconsistent.We do not have that for example the speci£c rule overrides the more general one such that E R(S)={a,¬p}.2.2Decision speci£cationA decision speci£cation given in De£nition2is a description of a decision problem.It contains a set of belief and obligation rules,as well as a set of facts and an initial decision(or prior intentions).A belief rule‘the agent believes y in context x’is an ordered pair x⇒y withx∈L AW and y∈L W,and a obligation rule‘the agent is ought y in context x’is an ordered pair x⇒y with x∈L AW and y∈L AW.It implies that the agent’s beliefsare about the world(x⇒p),and not about the agent’s decisions.These beliefs canbe about the effects of decisions made by the agent(a⇒p)as well as beliefs aboutthe effects of parameters set by the world(p⇒q).Moreover,the agent’s obligationscan be about the world(x⇒p,obligation-to-be),but also about the agent’s decisions(x⇒a,obligation-to-do).These obligations can be triggered by parameters set by the world(p⇒y)as well as by decisions made by the agent(a⇒y).De£nition2(Decision speci£cation)A decision speci£cation is a tuple DS= F,B,O,d0 that contains a consistent set of facts F⊆L W,£nite set of belief rules B⊆L AW×L W,£nite set of obligation rules O⊆L AW×L AW and an initial decision d0⊆L A.2.3DecisionsThe belief rules are used to determine the expected consequences of a decision,where a decision d is any subset of L A that implies the initial decision d0,and the set of expected consequences of this decision d is the belief extension of F∪d.A decision does not imply a contradiction.De£nition3(Decisions)Let DS= F,B,O,d0 be a decision speci£cation.The set of DS decisions is∆={d|d0⊆d⊆L A,E B(F∪d)is consistent}A decision 
d∈∆is a DS decision.The following example illustrates decisions.Example2Let A={a,b,c},W={p,q,r}and DS= F,B,O,d0 with F= {p→r},B={p⇒q,b⇒¬q,c⇒p},O={⊤⇒r,⊤⇒a,a⇒b}and d0={a}.The initial decision d0re¤ects that the agent has already decided in an earlier stage to reach the obligation⊤⇒a.Note that the consequents of all B rules are sentences of L W,whereas the antecedents of the B rules as well as the antecedents and consequents of the O rules are sentences of L AW.We have due to the de£nition of E R(S):E B(F∪{a})={p→r,a}E B(F∪{a,b})={p→r,a,b,¬q}E B(F∪{a,c})={p→r,a,c,p,q}E B(F∪{a,b,c})={p→r,a,b,c,p,q,¬q}Therefore{a,b,c}is not a DS decision,because its extension is inconsistent.There are two ways in which we continue.In the following we introduce a norma-tive theory,which determines the interpretation of the elements of the decision speci-£cation.Thereafter we introduce a new element,called goals,in the decision theory. The distinction between the normative decision theory and the goal-based decision the-ory is how the obligation rules are used.In the normative theory obligation rules are only used to evaluate the consequences of decisions,they are never applied.In the goal based decision theory the obligation rules are applied during the generation of goals.2.4Optimal decisionsThe obligation rules are used to compare the decisions.The comparison is based on the set of unreached obligations and not on the set of violated or reached obligations, where a obligation x⇒y is unreached by a decision if the expected consequences of this decision imply x but not y,and it is violated or reached if these consequences imply respectively x∧¬y or x∧y.Note that the set of unreached desires is a superset of the set of violated desires.11In earlier work such as[12]we used the set of violated and reached obligations to order states,in the sense that we minimized violations and maximized reached obligations.The present de£nition has the advantage that it is simpler because it is based on a 
single minimization process only.Note that in the present circumstances we cannot minimize violations only,because it would lead to the counterintuitive situation that the minimal decision d=d0is always optimal.De£nition4(Comparing decisions)Let DS= F,B,O,d0 be a decision speci£-cation and d be a DS decision.The unreached obligations of decision d are: U(d)={x⇒y∈O|E B(F∪d)|=x and E B(F∪d)|=y} Decision d1is at least as good as decision d2,written as d1≥U d2,iffU(d1)⊆U(d2)Decision d1dominates decision d2,written as d1>U d2,iffd1≥U d2and d2≥U d1Decision d1is as good as decision d2,written as d1∼U d2,iffd1≥U d2and d2≥U d1The following continuation of Example2illustrates the comparison of decisions. Example3(Continued)We have:U({a})={⊤⇒r,a⇒b},U({a,b})={⊤⇒r},U({a,c})={a⇒b}.We thus have that the decisions{a,b}and{a,c}both dominate the initial decision {a},i.e.{a,b}>U{a}and{a,c}>U{a},but the decisions{a,b}and{a,c}do not dominate each other nor are they as good as each other,i.e.{a,b}≥U{a,c}and {a,c}≥U{a,b}.The following proposition shows that the binary relation on decisions is transitive and we can thus interpret it as a preference relation.Proposition3The binary relation≥U is transitive.The following proposition shows that obligations only matter as long as they are different from beliefs.Proposition4The decision set of decision speci£cation DS= F,B,O,d0 is exactly the decision set of DS′= F,B,O\B,d0 .Moreover,we have d1≥U d2in DS if and only if d1≥U d2in DS′.The decision theory prescribes a decision maker to select the optimal or best deci-sion,which is de£ned as a decision that is not dominated.De£nition5(Optimal decision)Let DS be a decision speci£cation.A DS decision d is U-optimal iff there is no DS decision d′that dominates it,i.e.d′>U d.The following example illustrates that the minimal decision d0is not necessarily an optimal decision.Example4Let A={a},W=∅and DS= ∅,∅,{⊤⇒a},∅ .We have U(∅)= {⊤⇒a}and U({a})=∅.Hence,doing a is better than doing nothing.The following 
example illustrates optimal decisions.Example5Let A={a,b},W=∅and DS= ∅,∅,{a⇒b},∅ .We have:U(d)={a⇒b}if d|=AW a and d|=AW b,U(d)=∅otherwiseThe U-optimal decisions are the decisions d that either do not imply a or that imply a∧b.The following proposition shows that for each decision speci£cation,there is at least one optimal decision.This is important,because agents have to act in some way.Proposition5Let DS be a decision speci£cation.There is at least one U-optimal DS decision.Proof.Since the facts F are consistent,there exists at least one DS decision.Since the set of desire rules is£nite there do not exist in£nite ascending chains in≥U,and thus there is an optimal D decision.An alternative to our notion of optimality is to introduce a notion of minimality in the de£nition of optimal decisions.The following De£nition6introduces a distinction between smaller and larger decisions.A smaller decision implies that the agent com-mits itself to less choices.A minimal optimal decision is an optimal decision such that there is no smaller optimal decision.De£nition6(Minimal optimal decision)A decision d is a minimal U-optimal DS decision iff it is an U-optimal DS decision and there is no DS decision d′such that d|=d′and d′|=d.The following example illustrates the distinction between optimal and minimal op-timal decisions.Example6Let DS= F,B,O,d0 with F=∅,B={a⇒x,b⇒x},O={⊤⇒x},d0=∅.Optimal decisions are{a},{b}and{a,b},of which the former two are minimal.The following proposition illustrates in what sense a decision theory based on op-timal decisions and one based on minimal optimal decisions are equivalent.Proposition6For every U-optimal DS decision d,there is a minimal U-optimal DS decision d′such that d∼U d′.We£nally de£ne what a BO rational agent is.De£nition7A BO rational agent is an agent that,confronted with a decision speci-£cation DS,selects an U-optimal DS decision.A BO parsimoneous agent is a BO rational agent that selects a minimal U-optimal DS decision.Thus far,we 
have not considered the notion of goals.This concept is introduced in the following section.3Goal-based decision theoryIn this section we show that every rational agent,in the sense of De£nition7,can be understood as a goal planning agent[7].3.1Goal-based optimal decisionsGoal-based decisions in De£nition8combine decisions in De£nition3and the notion of goal,which is a set of propositional sentences.Note that a goal set can contain decision variables(which we call to-do goals)as well as parameters(which we call to-be goals).De£nition8(Goal-based decision)Let DS= F,B,O,d0 be a decision speci£ca-tion and the goal set G⊆L AW a set of sentences.A decision d is a G decision if E B(F∪d)|=AW G.What is the goal set of an optimal decision?One way to start is to consider all derivable goals from an initial decision and a maximal set of obligations.De£nition9(Derivable goal set)Let DS= F,B,O,d0 be a decision speci£cation.A set of formulas G⊆L AW is a derivable goal set of DS ifG=E B∪O′(F∪d0)\Cn AW(E B(F∪d0))where O′⊆O is a maximal(with respect to set inclusion)set such that1.E B∪O′(F∪d0)is consistent and2.there is a DS decision d that is a G decision.However,the following proposition shows that for some derivable goal set G,not all G decisions are optimal.Proposition7For a derivable goal set G of DS,a G decision does not have to be an U-optimal decision.Proof.Consider the decision speci£cation DS= ∅,{a⇒p},{⊤⇒p,a⇒b},∅ .The set G={p}is the only derivable goal set(based on O′={⊤⇒p,a⇒b}).The DS decisions d1={a}and d2={a,b}are both G decisions,but only d2is an U-optimal decision.We therefore de£ne goals with respect to an optimal decision.De£nition10(Achievable goal set)Let DS= F,B,O,d0 be a decision speci£-cation.A set of formulas G⊆L AW is an achievable goal set of DS if there is an U-optimal DS decision d such thatG={x∧y|x⇒y∈O′,E B∪O′(F∪d)|=AW x∧y}whereO′={x⇒y∈O|E B(F∪d)|=AW x or E B(F∪d)|=AW x∧y}Proposition8Let DS= F,B,O,d0 be a decision speci£cation and let a set of formulas 
G⊆L AW be an achievable goal set of DS.There exists an U-optimal DS decision d such thatG={x∧y|x⇒y∈O,E B(F∪d)|=AW x∧y} Proof.Follows directly from E B∪O′(F∪d)=E B(F∪d),where O′is as de£ned in De£nition10.The following proposition shows that we can de£ne one half of the representation theorem for achievable goal sets.Proposition9For an U-optimal decision d of DS there is an achievable goal set G of DS such that d is a G decision.Proof.Follows directly from E B∪O′(F∪d)=E B(F∪d).However,the following proposition shows that the other half of the representation theorem still fails.Proposition10For an achievable goal set G of DS,a G decision does not have to be an U-optimal decision.Proof.Consider the decision speci£cation DS= {¬q},{a⇒p,b⇒p},{⊤⇒p,b⇒q},∅ .The set G={p}is the only achievable goal set(based on O′={⊤⇒p,b⇒q}).The DS decisions d1={a}and d2={b}are both(minimal)G decisions, but only d1is an optimal decision.The counterexample in Proposition10also shows that we cannot prove the second half of the representation theorem,because we only consider positive goals(states the agent wants to reach)and not negative goals(states the agents wants to evade).The theory is extended with positive and negative goals in the following subsection.3.2Positive and negative goalsIn this section we show that the representation theorem works both ways if we add negative goals,which are de£ned in the following de£nition as states the agent has to avoid.They function as constraints on the search process of goal-based decisions.De£nition11(Goal-based decision)Let DS= F,B,O,d0 be a decision speci£-cation,and the so-called positive goal set G+and negative goal set G−subsets of L AW.A decision d is a G+,G− decision if E B(F∪d)|=AW G+and for each g∈G−we have E B(F∪d)|=AW g.The de£nition of achievable goal set is extended with negative goals.De£nition12(Positive and negative achievable goal set)Let DS= F,B,O,d0 be a decision speci£cation.The two sets of formulas G+,G−⊆L AW are respec-tively a 
positive and negative achievable goal sets of DS if there is an optimal DS decision d such thatG+={x∧y|x⇒y∈O′,E B(F∪d)|=AW x∧y}G−={x|x⇒y∈O′,E B∪O′(F∪d)|=AW x}whereO′={x⇒y∈O|E B(F∪d)|=AW x or E B(F∪d)|=AW x∧y} The following example illustrates the distinction between optimal decisions and minimal optimal ones.Example7Let A={a,b},W=∅and DS= ∅,∅,{a⇒b},∅ .The opti-mal decision is∅or{a,b},and the related goal sets are G+,G− = ∅,{a} and G+,G− = {a∧b},∅ .The only minimal optimal decision is the former.The following example illustrates a con¤ict.Example8Let W={p},A={a},DS= F,B,O,d0 with F=∅,B=∅, O={⊤⇒a∧p,⊤⇒¬a},d0=∅.We have optimal decision{¬a}with goal set G+,G− = {¬a},∅ .The decision{a}does not derive goal set G+,G− = {a∧p},∅ .One of the possible choices is{a},which is however sub-optimal since we cannot guarantee that the£rst obligation is reached.The£rst part of the representation theorem is analogous to Proposition9. Proposition11For an U-optimal decision d of DS there is an achievable goal set G+,G− of DS such that d is a G+,G− decision.Proof.See Proposition9.Proposition12For an achievable goal set G+,G− of DS,a G+,G− decision is an U-optimal decision.Proof. G+,G− is achievable and thus there there is an U-optimal DS decision such that E B(F∪d)|=AW G+and for all g∈G−we have E B(F∪d)|=AW g. Let d be any decision d such that E B(F∪d)|=AW G+and for all g∈G−we have E B(F∪d)|=AW g.Suppose d is not U-optimal.This means that there exists a d′such that d′>U d,i.e.such that there exists an obligation x⇒y∈O with E B(F∪d)|=AW x,E B(F∪d)|=AW y and either:•E B(F∪d′)|=AW x∧y;•E B(F∪d′)|=AW y;However,the£rst option is not possible due to the negative goals and the second option is not possible due to the positive goals.Contradiction,so d has to be U-optimal.The representation theorem is a combination of Proposition11and12. 
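
The notions that the propositions above rely on (rule extension, expected consequences of a decision, unreached obligations) can be checked mechanically on the decision specification of Example 2. What follows is only a rough sketch under several assumptions that are not in the paper: formulas are encoded as Python predicates over truth assignments, entailment is decided by brute-force truth tables, decisions are restricted to sets of positive decision atoms, and all names (`extension`, `expected`, `unreached`, the rule labels) are made up for illustration.

```python
from itertools import product

ATOMS = ["a", "b", "c", "p", "q", "r"]  # decision variables a, b, c; parameters p, q, r


def models(formulas):
    """Yield every truth assignment over ATOMS that satisfies all formulas."""
    for values in product([False, True], repeat=len(ATOMS)):
        world = dict(zip(ATOMS, values))
        if all(f(world) for f in formulas):
            yield world


def entails(formulas, goal):
    """Classical entailment by truth table (an inconsistent set entails everything)."""
    return all(goal(w) for w in models(formulas))


def consistent(formulas):
    return any(True for _ in models(formulas))


def extension(rules, sentences):
    """Smallest superset of `sentences` closed under `rules` read as inference rules."""
    closed = list(sentences)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules.values():
            if consequent not in closed and entails(closed, antecedent):
                closed.append(consequent)
                changed = True
    return closed


def atom(name):
    return lambda w: w[name]


def neg(name):
    return lambda w: not w[name]


TOP = lambda w: True

# Example 2: F = {p -> r}, B = {p=>q, b=>~q, c=>p}, O = {T=>r, T=>a, a=>b}.
FACTS = [lambda w: (not w["p"]) or w["r"]]
BELIEFS = {"p=>q": (atom("p"), atom("q")),
           "b=>~q": (atom("b"), neg("q")),
           "c=>p": (atom("c"), atom("p"))}
OBLIGATIONS = {"T=>r": (TOP, atom("r")),
               "T=>a": (TOP, atom("a")),
               "a=>b": (atom("a"), atom("b"))}


def expected(decision):
    """Belief extension of the facts together with the chosen decision atoms."""
    return extension(BELIEFS, FACTS + [atom(n) for n in sorted(decision)])


def unreached(decision):
    """Obligations whose antecedent is expected but whose consequent is not."""
    consequences = expected(decision)
    return {name for name, (x, y) in OBLIGATIONS.items()
            if entails(consequences, x) and not entails(consequences, y)}


if __name__ == "__main__":
    for d in ({"a"}, {"a", "b"}, {"a", "c"}):
        print(sorted(d), sorted(unreached(d)))
```

Under these assumptions the sketch gives the same unreached sets as Example 3 and also finds the expected consequences of {a, b, c} inconsistent, as Example 2 states. Comparing decisions then reduces to set inclusion between the returned sets.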
Theorem 1 A decision $d$ is a $U$-optimal decision if and only if there is an achievable goal set $\langle G^+, G^- \rangle$ of $DS$ such that $d$ is a $\langle G^+, G^- \rangle$ decision.

4 Agent specification and design

In this section we discuss how the proposed qualitative normative decision and goal theory can be used to guide the design and specification of rational BO agents in a compositional way. The general idea of compositional design and specification is to build agents using components. They may be either primitive components or composed of other components, such that the specification of agents can be broken down into the specification of components and their relations. Here we give some preliminary ideas and explain how the proposed qualitative normative decision and goal theory supports a specific compositional design for rational BO agents.

The qualitative decision theory, as proposed in section 2, specifies the decision making of an agent in terms of its observations and its mental attitudes such as beliefs and obligations. The specified agent can therefore be considered as consisting of components that represent the agent’s beliefs and obligations and a reasoning component that generates the agent’s decisions based on its observations and mental attitudes. The abstract design of such a BO agent is illustrated in Figure 1. For this design of BO agents, notions such as optimal decisions and minimal optimal decisions can be used to specify the reasoning component and thus the decision making mechanism of the agent.

[Figure 1: Agent. Labels in the diagram: observation, agent, B, O, reasoner, decision.]

The following example illustrates an agent with certain beliefs and obligations, the possible decisions that the agent can make, and how the notions from qualitative normative decision theory can be used to specify the subset of decisions that the agent can make.

Example 9 Consider an agent who believes that he works and that if he sets an alarm clock he can wake up early to arrive in time at his work, i.e. $B = \{\top \Rightarrow \mathit{Work},\ \mathit{SetAlarm} \Rightarrow \mathit{InTime}\}$. The agent has also the obligation to
arrive early at his work and he has to inform his boss when he does not work, i.e. $O = \{\mathit{Work} \Rightarrow \mathit{InTime},\ \neg \mathit{Work} \Rightarrow \mathit{InformBoss}\}$. In this example, the propositions SetAlarm and InformBoss are assumed to be decision variables (the agent has control over setting the alarm clock and informing his boss), while Work and InTime are assumed to be world parameters (the agent has no control over its working status and the starting time). Moreover, we assume that the agent has no observation and no intentions. One can specify the agent as a rational BO agent in the sense that it makes optimal decisions. Being specified as a rational BO agent, he will decide to use the alarm clock, though he has in principle many possible decisions including $\emptyset$, $\{\mathit{SetAlarm}\}$, $\{\mathit{InformBoss}\}$, and $\{\mathit{SetAlarm}, \mathit{InformBoss}\}$.

The goal based decision theory, as proposed in section 3, explains the decision making of a rational BO agent as if it aims at maximizing achieved normative goals. In particular, the goal based decision theory explains how normative goals of an agent can be specified based on its decision specification. The specified reasoning component of the rational BO agent can therefore be decomposed and designed as consisting of two reasoning components: one which generates normative goals and one which generates decisions to achieve those goals. This decomposition suggests an agent design as illustrated in Figure 2. According to this agent design, a BO agent first generates its normative goals based on its observation, its beliefs, obligations and its intentions. The generated goals are subsequently the input of the decision generation component.

[Figure 2: Goal-based agent. Labels in the diagram: observation, agent, B, O, goal generation, goal set, decision generation, decision.]

Following the design decomposition, the specification of a BO agent can now also be decomposed and defined in terms of the specification of its goal and decision generation mechanisms. In particular, the goal generation mechanism can be specified in terms of the agent’s observations and its mental state on
the one hand and its goals on the other hand. The decision generation component can then be specified in terms of the agent’s goals and mental state on the one hand and its decisions on the other hand. For example, consider again the working agent that may have in principle many goal sets consisting of $\emptyset$, Work, InTime, SetAlarm, and InformBoss. This implies that the goal generation component may generate one of these possible goal sets. Using the notions from goal based decision theory one may specify the goal generation mechanism so that it generates achievable goal sets which, when planned by the decision generation component, will result in optimal decisions.

In summary, we believe that the qualitative normative decision theory and goal based decision theory can be used to provide compositional specification and design of rational BO agents. This leads to a transparent agent specification and design structure. Moreover, it leads to support for reuse and maintainability of components and generic models. The compositional specification and design of agents enable us to specify and design agents at various levels of abstraction, leaving out many details such as representation issues and reasoning schemes. For our rational BO agents we did not explain how decisions are generated; we only specified what decisions should be generated. At one lower level we decomposed the reasoning mechanism and specified goal and decision generation mechanisms. We also did not discuss the representation of individual components such as the belief or the obligation components. The conditional rules in these components specify the input/output relation.

5 Related research

The theories in Thomason’s BDP [11] and Broersen et al.’s BOID [2] are different, because they allow multiple belief sets. This introduces the new problem of blocking wishful thinking discussed extensively in [3].

6 Concluding remarks

In this paper we have given an interpretation for goals in a qualitative decision theory based on beliefs and obligation rules, and we
have shown that any agent which makes optimal decisions acts as if it is maximizing its achieved goals. Our motivation comes from the analysis of goal-based architectures, which have recently been introduced. However, the results of this paper may be relevant for a much wider audience. For example, Dennett argues that automated systems can be analyzed using concepts from folk psychology like beliefs, obligations, and goals. Our work may be used in the formal foundations of this ‘intentional stance’ [5].

There are several topics for further research. The most interesting question is whether belief and obligation rules are fundamental, or whether they in turn can be represented by some other construct. Other topics for further research are a generalization of our representation theorem to other choices in our theory, the development of an incremental approach to goals, and the development of computationally attractive fragments of the logic, and heuristics of the optimization problem.

References

[1] C. Boutilier. Toward a logic for qualitative decision theory. In Proceedings of the KR’94, pages 75–86, 1994.
[2] J. Broersen, M. Dastani, J. Hulstijn, and L. van der Torre. Goal generation in the BOID architecture. Cognitive Science Quarterly, to appear.
An Optimality-Theoretic Alternative to the Apparent Wh-Movement in Old Japanese

Chiharu Uda Kikuta
cuda@mail.doshisha.ac.jp

This paper proposes an Optimality-Theoretic alternative to the wh-movement analysis of Old Japanese (OJ: ?-8C). Diachronic aspects of Japanese have recently aroused strong interest among theoretical linguists, both functional and formal. Among the most remarkable works is Watanabe’s (2002) Minimalist analysis, which claims that OJ was in fact a wh-movement language. Japanese has always been a head-final (SOV), free-word-order (scrambling) language, and it was never expected to have gone through a dramatic change in parameter setting. For this reason, Watanabe (2002) had a great impact on Japanese syntacticians, and it has been so influential that its claim is almost taken as a “discovery” of an unknown “fact” about the Japanese language (but cf. Tonoike (2002, 2003)). However, Watanabe’s (2002) analysis has some serious problems. This paper argues that Watanabe’s (2002) claim is wrong and that OJ did not have overt wh-movement. The observed word order pattern results from the interplay of several constraints, which basically have to do with morphological case marking.

The proposed analysis not only accommodates the facts covered by the wh-movement analysis, but also offers a more comprehensive picture of the development of the Japanese language during 8-10C.

1. Word order restriction in OJ and Watanabe’s (2002) claim

Watanabe’s (2002) claim is based on the word order restriction in OJ, which disappeared in early Middle Japanese (MJ: 9-12C). Through a thorough examination of the data of Man’yosyu (c.750?: the only available text in OJ) regarding the distribution of the interrogative particle [ka], the topic marker [wa], and the subject marker [no/ga], Nomura (1993) and Sasaki (1992) conclude that [ka] almost always preceded [no/ga], while [wa] almost always preceded [ka]:

(1)                                                              number of data
    I.  Nominative subject:   XP [ka] . . . Subj [no/ga] . . .   approximately 90
                              Subj [no/ga] . . . XP [ka] . . .   4 (or 5)
    II. Topicalized subject:  XP [ka] . . . Subj [wa] . . .      2 (or 3)
                              Subj [wa] . . . XP [ka] . . .      approximately 50

Based on this, Watanabe (2002) proposes a clause structure with a split C system, which follows the insight of Rizzi (1997), and claims that XP[ka] is obligatorily moved to the Spec of FocP, a type of wh-movement:

(2) [TopP Spec Top [FocP Spec Foc [IP Subj VP I ] ] ]

So Watanabe’s (2002) wh-movement is in fact Focus-movement. The particle [ka] has a dual function: (1) it attaches to either a wh- or a non-wh-word to make the clause interrogative, and (2) it places a focus on the phrase. As regards the second function, [ka] patterns with other focus-marking particles such as [ya] and [so] in triggering kakarimusubi, a special inflection form of the predicate (see below). All these focus particles obey the above word order restriction. Watanabe (2002) claims that the movement is triggered by the [-Interpretable] feature of the focus particle; kakarimusubi is a sign of wh-agreement, indicating that the clause includes a wh-trace in it.

The wh-movement analysis suggests a new scenario of the decline of kakarimusubi. It is an established fact that kakarimusubi completely disappeared around 14-15C, while the word order restriction in question was lifted around 9-10C. The difference of several hundred years may seem to reduce the credibility of the analysis. However, Watanabe demonstrates that the genuine interrogative function of [ka] was rapidly lost during early MJ, becoming confined to rhetorical questions. So it is suggested that the decline of kakarimusubi in fact started much earlier, along with the loss of wh-movement.

2. Problems with the wh-movement analysis

In spite of its appeal, Watanabe’s (2002) analysis has the following problems.

1. The pre-posing of a phrase does not necessarily require the [-Interpretable] feature of [ka].
Many instances of the pre-posed [ka]-phrase are a whole subordinate clause (conditional, etc.). However, the subordinate clause must precede the matrix clause in Japanese (=SOV) anyway, unless it is center-embedded, as evidenced by abundant data without [ka].

2. Some of the pre-posed FocP cannot involve a wh-movement, if the predicate takes the attributive form. The [ka] phrases in (3), for instance, seem to be independent “presentational” phrases with no apparent grammatical function w.r.t. the matrix clause (i.e., considered as juxtaposition), which are unlikely to have moved out of the matrix clause. This type of [ka] example is not uncommon:

(3) Tahagoto-ka, oyodore-ka, komorikuno Hatuse-no yama-ni komori-seri-to-ihu.
    insane-word-Q, false-word-Q, (epigraph) Hatuse-no mountain-loc hide-do-C-say[attri.]
    ‘(lit) A joke? Or a lie? They say that he has hidden himself in Mt. Hatuse.’

3. The attributive form is not limited to kakarimusubi; the focus particle is not a crucial condition for the availability of the predicate form. It is the predicate form of a nominal clause and a noun-modifying clause. It was also common for the nominal clause to occur independently (expressing some kind of non-assertiveness).

4. The change in the behavior of [ka] may very well be lexical. As the particle [ka] lost its interrogative force, it was replaced by [ya] (for non-wh-words) and the bare wh-word, which continued to trigger kakarimusubi through MJ. In other words, the word order restriction (and its loss) uniformly covered focus phrases ([so][ya][ka]), and kakarimusubi apparently flourished in MJ involving all focus phrases (with different membership), [so][ya] and the bare wh-word. The proposed scenario fits only the history of [ka]. Besides, according to Watanabe (2002), only the kakarimusubi in OJ is the “real” one, while the one in MJ is merely a stylistic imitation, which somehow lasted very long.
This distinction, however, has no support outside the theory.

Due to the limitation of OJ data, such syntactic tests as unbounded dependency and crossover are not available. Given that Japanese has always allowed scrambling, the only solid motivation for the wh-movement analysis is that the word order pattern is (almost) obligatory. I will argue, however, that the word order in question is obligatory for a different reason.

3. Other important facts: for the proposal

I take the following observations of the history of Japanese as significant.

1. In OJ and MJ, both [no] and [ga] marked either nominative or genitive (underspecified or homonymous). More importantly, [no] and [ga] as nominative markers occurred only in the clause headed by the attributive predicate in OJ; otherwise, the subject was marked either with the topic marker [wa] or without any marker. There is good reason to believe that the attributive form is a nominal/verbal mixed category; namely, the predicate in the attributive form can directly head a nominal clause. The unique distribution of the nominative [no/ga] has traditionally been ascribed in philology to the nominal nature of the attributive form, and has found a clearer Stochastic OT account along similar lines in Kikuta (2003).

2. It has recently been observed that the word order [Subj[no/ga] . . . Obj[o] . . . V(attri.)] is prohibited in OJ (Kinsui 2001, Yanagida 2003). The [o]-marked object has to be pre-posed or has to occur as a bare NP. This restriction apparently parallels the one in (1); Subj[no/ga] and V(attri.) do not allow an intervening phrase with a particle. However, the wh-movement analysis cannot be extended to this pattern, in which the predicate form is not conditioned by a movement of any sort.

3. It is well-documented that MJ saw the development of morphological case markers, particularly the accusative [o].
Although details are somewhat controversial, a general consensus is that in OJ: (1) bare NP was very common both for subject and object; (2) the marker [o] for a direct object often accompanied a special semantic overtone; (3) [o] marked a non-object, and even the subject in a special construction called the mi-construction. It is also agreed that [o] was established as the accusative case marker during MJ; the mi-construction also disappeared in MJ.

4. Proposal

In view of the above facts, I claim that the word order restriction is a reflection of the nominal nature of the phrase. The reason for the attributive form of the predicate in kakarimusubi is semantic rather than syntactic (wh-agreement), having to do with the presupposition (as opposed to focus) and the non-assertiveness of a nominal clause.

Adopting the idea of Malouf (2000), I assume that the attributive form predicate is a [+n, +v] mixed category, and that a [no/ga]-marked argument appears at the left edge of a [+nominal] phrase. I propose the four constraints in (4), with the dominance relation in (5):

(4) (A) Argument Realization Faithfulness: lexically specified arguments should be syntactically realized.
    (B) Nominal Phrase Surface Tightness: *[YP NP[+Ncase] XP Y[+N]] where XP = XP[+mark]
    (C) Case Transparency: the morphological case should properly reflect the abstract case.
    (D) Case Type Consistency: Case type of coarguments should be consistent (+N-type case, +V-type case),
        where [+Vcase] includes: nominative [no/ga], accusative [o], dative [ni]; and [+Ncase]: genitive [no/ga].

(5) OJ: (A) >> (B) >> (C) >> (D)
    MJ: (A) >> (C) >> (D) >> (B)

Namely, what caused the word order restriction in OJ is the dominance of the constraint (B), which means that in an endocentric nominal phrase, the link between the Spec and head should not be interrupted by a non-modificational phrase. The actual form of this constraint can vary depending on the type of language, regarding exactly what the [+Ncase] is.
In most cases, it is equivalent to the genitive case. In OJ, where the abstract case and the morphological case are yet to find a complete correspondence, it is the morphological realization of case. Note that this word order is very natural for an NP. The situation with OJ is rather unique because YP is partly verbal, taking arguments as other verbs do. I assume that the abstract case system (as some kind of argument licensing) is universal, while morphological case realization is not only language-particular, but is subject to diachronic change within a language (cf. Sigurdsson 2003).

The apparent loss of wh-movement results from the relative demotion of the constraint (B). The demotion is caused by the promotion of (C) and (D), which reflects the development of the morphological case system in MJ. The establishment of the accusative marker [o], mentioned above, cannot merely be an idiosyncratic lexical change; it (re)organized the overall system of case marking. A relevant well-known fact is that the “nominative” [ga], which was sensitive to the nominality of the predicate in OJ, gradually came to be used with predicates in any inflectional form in MJ, indicating its gradual establishment as the nominative marker. Concomitantly, OJ’s rather unique mixture of case types is lost, and the nominal word order restriction is lifted from the verbal clause. As a consequence, the clause is free to behave just as other clauses.

Thus the proposed analysis explains the word order restriction in OJ and its loss in MJ, without making the radical claim that Japanese was once a wh-movement language. The benefits of this analysis are that it accommodates the fact that kakarimusubi flourished both in OJ and MJ, yet showed a difference; and that it relates the change in case marking morphemes, which is well-documented but has not been given a systematic explanation, to the global change in Japanese syntax.
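The effect of reranking in (5) can be sketched in a few lines of Python. The fragment below only illustrates strict domination: candidates are compared lexicographically on their violation counts, read off in ranking order. The two rankings are those in (5); the violation profiles assigned to the two candidate word orders are hypothetical placeholders chosen for illustration and are not taken from this paper, and the names `profile`, `optimal` and `CANDIDATES` are likewise assumptions.

```python
from typing import Dict, List, Tuple

# Rankings from (5), highest-ranked constraint first.
OJ_RANKING = ["A", "B", "C", "D"]
MJ_RANKING = ["A", "C", "D", "B"]

# Hypothetical violation counts per candidate (illustrative only).
# The in-situ order violates (B); the pre-posed order is assumed,
# purely for the sake of the example, to incur one violation of (C).
CANDIDATES: Dict[str, Dict[str, int]] = {
    "XP[ka] ... Subj[no/ga] ... V(attri.)": {"C": 1},
    "Subj[no/ga] ... XP[ka] ... V(attri.)": {"B": 1},
}


def profile(violations: Dict[str, int], ranking: List[str]) -> Tuple[int, ...]:
    """Violation counts ordered by the ranking; unlisted constraints count as 0."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)


def optimal(candidates: Dict[str, Dict[str, int]], ranking: List[str]) -> List[str]:
    """Candidates whose profile is lexicographically minimal (strict domination)."""
    best = min(profile(v, ranking) for v in candidates.values())
    return sorted(name for name, v in candidates.items()
                  if profile(v, ranking) == best)


print("OJ:", optimal(CANDIDATES, OJ_RANKING))
print("MJ:", optimal(CANDIDATES, MJ_RANKING))
```

With these placeholder profiles, the pre-posed order is selected under the OJ ranking and the in-situ order under the MJ ranking, which is the kind of shift that the demotion of (B) is meant to capture. A fuller model would need violation profiles motivated by actual tableaux.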
Other points of discussion include: the relation between morphological case and abstract case; and the role of morphological case in syntax, which may be more significant than has been assumed (cf. Sigurdsson 2003, Vogel 2003).

Important References

Kikuta, C.U. 2003. Subject Case Marking in Early Classical Japanese; A New Perspective from Stochastic Optimality Theory. In Japanese/Korean Linguistics 12, ed. W. McClure, 152-164. Stanford, CA: CSLI Publications & SLA.
Malouf, R. P. 2000. Mixed Categories in the Hierarchical Lexicon. Stanford, CA: CSLI Publications.
Watanabe, A. 2002. Loss of Overt Wh-Movement in Old Japanese. In Syntactic Effects of Morphological Change, ed. D.W. Lightfoot, 179-195. Oxford: Oxford UP.