

Professional English


Definition of polymers

A simple understanding of polymers can be gained by imagining them to be like a chain or, perhaps, a string of pearls, where the individual pearls represent small molecules that are chemically bonded together. Therefore, a polymer is a molecule made up of smaller molecules that are joined together by chemical bonds. The word polymer means "many parts or units." The parts or units are the small molecules that combine. The result of the combination is, of course, a chainlike molecule (polymer). Usually the polymer chains are long, often consisting of hundreds of units, but polymers consisting of only a few units linked together are also known and can be commercially valuable.
Figure 1.1 Diagram illustrating the definition of plastics.
As Figure 1.1 shows, all materials can be classified as gases, simple liquids, or solids, with the understanding that most materials can be converted from one state to another through heating or cooling. If only materials that are structural solids at normal temperatures are examined, three major types of materials are encountered: metals, polymers, and ceramics. The polymer materials can be further divided into synthetic polymers and natural polymers. Most synthetic polymers are those that do not occur naturally and are represented by materials such as nylon, polyethylene, and polyester. Some synthetic polymers could be manufactured copies of naturally occurring materials (such as

Professor Fornell's Classic Customer Satisfaction Paper 1


TOTAL QUALITY MANAGEMENT, VOL. 11, NO. 7, 2000, S869-S882

EUGENE W. ANDERSON & CLAES FORNELL
National Quality Research Center, University of Michigan Business School, Ann Arbor, MI 48109-1234, USA

ABSTRACT How do we know if an economy is performing well? How do we know if a company is performing well? The fact is that we have serious difficulty answering these questions today. The economy—for nations and for corporations—has changed much more than our theories and measurements. The development of national customer satisfaction indices (NCSIs) represents an important step towards addressing the gap between what we know and what we need to know. This paper describes the methodology underlying one such measure, the American Customer Satisfaction Index (ACSI). ACSI represents a uniform system for evaluating, comparing, and, ultimately, enhancing customer satisfaction across firms, industries and nations. Other nations are now adopting the same approach. It is argued that a global network of NCSIs based on a common methodology is not simply desirable, but imperative.

Introduction

How do we know if an economy is performing well? How do we know if a company is performing well? The fact is that we have serious difficulty answering these questions today. It is even more difficult to tell where we are going. Why is this? A good part of the explanation is that the economy—for nations and for corporations—has changed much more than our theories and measurements. One can easily make the case that the measures on which we rely for determining corporate and national economic performance have not kept pace. For example, the service sector and information technology play a dominant role in the modern economy. An implication of this change is that economic assets today reside heavily in intangibles—knowledge, systems, customer relationships, etc. (see Fig. 1). The building of shareholder wealth is no longer a matter of the management of financial and physical assets. The same is true with the wealth of nations.

Figure 1. Tangible versus intangible sources of value, 1970-99 (Dow Jones Industrials price-to-book ratios, 1970 and 1999; source: Business Week, March 9, 1999).

As a result, one cannot continue to apply models of measurement and theory developed for a 'tangible' manufacturing economy to the economy we have today. How important is it to know about coal production, rail freight, textile mill or pig-iron production in the modern economy? Such measures are still collected in the US and reported in the media as if they had the same importance now as they did over 50 years ago.

The problem gets worse when we take all these measures, add them up and draw conclusions. For example, in early 1999, the US stock market set an all-time record high, with the Dow Jones Index passing 11 000 points, unemployment was at record lows, the economy expanded and inflation was almost non-existent. These statistics suggested a strong economy, which was also what was reported in the press and in most commentary by economists. As always, however, the real question is: Are we better off? How well are the actual experiences of people captured by the reported measures?
Do the things economists and Governments choose to measure correspond with how people feel about their economic well-being? A closer inspection of the numbers and their underlying statistics reveals a somewhat different picture of the US economy than that typically held up as an example.

• Corporate earnings growth for 1997 and 1998 was much lower than in the previous 2 years, with negative growth for 1998.
• The major portion of the earnings growth in 1995 and 1996 was due to cost-cutting rather than revenue growth.
• The trade deficit in 1999 was at a record high and growing.
• Wages have been stagnant in the last 15 years (although there were small increases in 1997 and 1998).
• The proportion of stock market capitalization versus GDP was about 150% of GDP in 1998 (the historical average is 48%; the proportion before the 1929 stock market crash was 82%).
• Consumer and business debt were high and rising.
• Even though many new jobs were created, 70% of those who lost their jobs got new jobs that paid less.
• The number of bankruptcies was high and growing.
• Worker absenteeism was at record highs.
• Household savings were negative.

Add the above to the fact that there is a great deal of worker anxiety over job security and lower levels of customer satisfaction than 5 years ago, and the question of whether we are better off is cast in a different light. How much does it matter if we increase productivity, that the economy is growing or that the stock market is breaking records, if customers are not satisfied? The basic idea behind a market economy is that businesses exist and compete in order to create a satisfied customer. Investors will flock to the companies that are expected to do this well. It is not possible to increase economic prosperity without also increasing customer satisfaction. In a market economy, where suppliers compete for buyers, but buyers do not compete for products, customer satisfaction defines the meaning of economic activity, because what matters in the final analysis is not how much we produce or consume, but how well our economy satisfies its consumers.

Together with other economic objectives—such as employment and growth—the quality of what is produced is a part of standard of living and a source of national competitiveness. Like other objectives, it should be subjected to systematic and uniform measurement. This is why there is a need for national indices of customer satisfaction. A national index of customer satisfaction contributes to a more accurate picture of economic output, which in turn leads to better economic policy decisions and improvement of standard of living. Neither productivity measures nor price indices can be properly calibrated without taking quality into account. It is difficult to conduct economic policy without accurate and comprehensive measures. Customer satisfaction is of considerable value as a complement to the traditional measures. This is true for both macro and micro levels. Because it is derived from consumption data (as opposed to production) it is also a leading indicator of future profits. Customer satisfaction leads to greater customer loyalty (Anderson & Sullivan, 1993; Bearden & Teel, 1983; Bolton & Drew, 1991; Boulding et al., 1993; Fornell, 1992; LaBarbera & Mazursky, 1983; Oliver, 1980; Oliver & Swan, 1989; Yi, 1991).
Through increasing loyalty, customer satisfaction secures future revenues (Bolton, 1998; Fornell, 1992; Rust et al., 1994, 1995), reduces the cost of future transactions (Reichheld & Sasser, 1990), decreases price elasticities (Anderson, 1996), and minimizes the likelihood customers will defect if quality falters (Anderson & Sullivan, 1993). Word-of-mouth from satisfied customers lowers the cost of attracting new customers and enhances the firm's overall reputation, while that of dissatisfied customers naturally has the opposite effect (Anderson, 1998; Fornell, 1992). For all these reasons, it is not surprising that empirical work indicates that firms providing superior quality enjoy higher economic returns (Aaker & Jacobson, 1994; Anderson et al., 1994, 1997; Bolton, 1998; Capon et al., 1990).

Satisfied customers can therefore be considered an asset to the firm and should be acknowledged as such on the balance sheet. Current accounting-based measures are probably more lagging than leading—they say more about past decisions than they do about tomorrow's performance (Kaplan & Norton, 1992). If corporations did incorporate customer satisfaction as a measurable asset, we would have a better accounting of the relationship between the enterprise's current condition and its future capacity to produce wealth.

If customer satisfaction is so important, how should it be measured? It is too complicated and too important to be casually implemented via standard market research surveys. The remainder of this article describes the methodology underlying the American Customer Satisfaction Index (ACSI) and discusses many of the key findings from this approach.

Nature of the American Customer Satisfaction Index

ACSI measures the quality of goods and services as experienced by those that consume them. An individual firm's customer satisfaction index (CSI) represents its served market's—its customers'—overall evaluation of total purchase and consumption experience, both actual and anticipated (Anderson et al., 1994; Fornell, 1992; Johnson & Fornell, 1991).

The basic premise of ACSI, a measure of overall customer satisfaction that is uniform and comparable, requires a methodology with two fundamental properties. (For a complete description of the ACSI methodology, please see the 'American Customer Satisfaction Index: Methodology Report' available from the American Society for Quality Control, Milwaukee, WI.) First, the methodology must recognize that CSI is a customer evaluation that cannot be measured directly. Second, as an overall measure of customer satisfaction, CSI must be measured in a way that not only accounts for consumption experience, but is also forward-looking.

Direct measurement of customer satisfaction: observability with error

Economists have long expressed reservations about whether an individual's satisfaction or utility can be measured, compared, or aggregated (Hicks, 1934, 1939a,b, 1941; Pareto, 1906; Ricardo, 1817; Samuelson, 1947). Early economists who believed it was possible to produce a 'cardinal' measure of utility (Bentham, 1802; Marshall, 1890; Pigou, 1920) have been replaced by ordinalist economists who argue that the structure and implications of utility-maximizing economics can be retained while relaxing the cardinal assumption. However, cardinal or direct measurement of such judgements and evaluations is common in other social sciences.
For example, in marketing, conjoint analysis is used to measure individual utilities (Green & Srinivasan, 1978, 1990; Green & Tull, 1975). Based on what Kenneth Boulding (1972) referred to as Katona's Law (the summation of ignorance can produce knowledge due to the self-canceling of random factors), the recent advances in latent variable modeling and the call from economists such as the late Jan Tinbergen (1991) for economic science to address better what is required for economic policy, scholars are once again focusing on the measurement of subjective (experience) utility. The challenge is not to arrive at a measurement system according to a universal system of axioms, but rather one where fallibility is recognized and error is admitted (Johnson & Fornell, 1991).

The ACSI draws upon considerable advances in measurement technology over the past 75 years. In the 1950s, formalized systems for prediction and explanation (in terms of accounting for variation around the mean of a variable) started to appear. Before then, research was essentially descriptive, although the single correlation was used to depict the degree of a relationship between two variables. Unfortunately, the correlation coefficient was often (and still is) misinterpreted and used to infer much more than what is permissible. Even though it provides very little information about the nature of a relationship (any given value of the correlation coefficient is consistent with an infinite number of linear relationships), it was sometimes inferred as having both predictive and causal properties. The latter was not achieved until the 1980s with the advent of the second generation of multivariate analysis and associated software (e.g. Lisrel).

It was not until very recently, however, that causal networks could be applied to customer satisfaction data. What makes customer satisfaction data difficult to analyze via traditional methods is that they are associated with two aspects that play havoc with most statistical estimation techniques: (1) distributional skewness; and (2) multicollinearity. Both are extreme in this type of data. Fortunately, there has been methodological progress on both fronts, particularly from the field of chemometrics, where the focus has been on robust estimation with small sample sizes and many variables.

Not only is it now feasible to measure that which cannot be observed, it is also possible to incorporate these unobservables into systems of equations. The implication is that the conventional argument for limiting measurement to that which is numerical is no longer all that compelling. Likewise, simply because consumer choice, as opposed to experience, is publicly observable does not mean that it must be the sole basis for utility measurement. Such reasoning only diminishes the influence of economic science in economic policy (Tinbergen, 1991). Hence, even though experience may be a private matter, it does not follow that it is inaccessible to measurement or irrelevant for scientific inquiry, for cardinalist comparisons of utility are not mandatory for meaningful interpretation. For something to be 'meaningful,' it does not have to be 'flawless' or free of error. Even though (experience) utility or customer satisfaction cannot be directly observed, it is possible to employ proxies (fallible indicators) to capture empirically the construct.
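Schematically (in generic latent variable notation, not the paper's own), treating customer satisfaction as an unobserved construct $\xi$ captured by several fallible survey indicators $x_1, \dots, x_J$ amounts to a measurement model of the form

$$x_j = \lambda_j \xi + \varepsilon_j, \qquad j = 1, \dots, J,$$

where the loading $\lambda_j$ reflects how strongly indicator $j$ tracks the construct and $\varepsilon_j$ is measurement error. An index built as a weighted combination $\hat{\xi} = \sum_j w_j x_j$ then lets the random errors partially cancel, in the spirit of Katona's Law mentioned above.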
In the final analysis, success or failure will depend on how well we explain and predict.

Forward-looking measurement of customer satisfaction: explanation and prediction

For ACSI to be forward-looking, it must be embedded in a system of cause-and-effect relationships as shown in Fig. 2, making CSI the centerpiece in a chain of relationships running from the antecedents of customer satisfaction—expectations, perceived quality and value—to its consequences—voice and loyalty. The primary objective in estimating this system or model is to explain customer loyalty. It is through this design that ACSI captures the served market's evaluation of the firm's offering in a manner that is both backward- and forward-looking.

Customer satisfaction (ACSI) has three antecedents: perceived quality, perceived value and customer expectations. Perceived quality or performance, the served market's evaluation of recent consumption experience, is expected to have a direct and positive effect on customer satisfaction. The second determinant of customer satisfaction is perceived value, or the perceived level of product quality relative to the price paid. Adding perceived value incorporates price information into the model and increases the comparability of the results across firms, industries and sectors. The third determinant, the served market's expectations, represents both the served market's prior consumption experience with the firm's offering—including non-experiential information available through sources such as advertising and word-of-mouth—and a forecast of the supplier's ability to deliver quality in the future.

Figure 2. The American Customer Satisfaction Index model.

Following Hirschman's (1970) exit-voice theory, the immediate consequences of increased customer satisfaction are decreased customer complaints and increased customer loyalty (Fornell & Wernerfelt, 1988). When dissatisfied, customers have the option of exiting (e.g. going to a competitor) or voicing their complaints. An increase in satisfaction should decrease the incidence of complaints. Increased satisfaction should also increase customer loyalty. Loyalty is the ultimate dependent variable in the model because of its value as a proxy for profitability (Reichheld & Sasser, 1990).

ACSI and the other constructs are latent variables that cannot be measured directly; each is assessed by multiple measures, as indicated in Fig. 2. To estimate the model requires data from recent customers on each of these 15 manifest variables (for an extended discussion of the survey design, see Fornell et al., 1996). Based on the survey data, ACSI is estimated as shown in Appendix B.

Customer satisfaction index properties: the case of the American Customer Satisfaction Index

At the most basic level the ACSI uses the only direct way to find out how satisfied or dissatisfied customers are—that is, to ask them. Customers are asked to evaluate products and services that they have purchased and used. A straightforward summary of what customers say in their responses to the questions may have certain simplistic appeal, but such an approach will fall short on any other criterion. For the index to be useful, it must meet criteria related to its objectives.
If the ACSI is to contribute to more accurate and comprehensive measurement of economic output, predict economic returns, provide useful information for economic policy and become an indicator of economic health, it must satisfy certain properties in measurement. These are: precision; validity; reliability; predictive power; coverage; simplicity; diagnostics; and comparability.

Precision

Precision refers to the degree of certainty of the estimated value of the ACSI. ACSI results show that the 90% confidence interval (on a 0-100 scale) for the national index is ± 0.2 points throughout its first 4 years of measurement. For each of the six measured private sectors, it is an average ± 0.5 points and for the public administration/government sector, it is ± 1.3 points. For industries, the confidence interval is an average ± 1.0 points for manufacturing industries, ± 1.7 points for service industries and ± 2.5 points for government agencies. For the typical company, it is an average ± 2.0 points for manufacturing firms and ± 2.6 points for service companies and agencies. This level of precision is obtained as a result of great care in data collection, careful variable specification and latent variable modeling. Latent variable modeling produces an average improvement of 22% in precision over use of responses from a single question, according to ACSI research.

Validity

Validity refers to the ability of the individual measures to represent the underlying construct customer satisfaction (ACSI) and to relate effects and consequences in an expected manner. Discriminant validity, which is the degree to which a measured construct differs from other measured constructs, is also evidenced. For example, there is not only an important conceptual distinction between perceived quality and customer satisfaction, but also an empirical distinction. That is, the covariance between the questions measuring the ACSI is higher than the covariances between the ACSI and any other construct in the system.

The nomological validity of the ACSI model can be checked by two measures: (1) latent variable covariance explained; and (2) multiple correlations (R²). On average, 94% of the latent variable covariance structure is explained by the structural model. The average R² of the customer satisfaction equation in the model is 0.75. In addition, all coefficients relating the variables of the model have the expected sign. All but a few are statistically significant.

In measures of customer satisfaction, there are several threats to validity. The most serious of these is the skewness of the frequency distributions. Customers tend disproportionately to use the high scores on a scale to express satisfaction. Skewness is addressed by using a fairly high number of scale categories (1-10) and by using a multiple indicator approach (Fornell, 1992, 1995). It is a well established fact that validity typically increases with the use of more categories (Andrews, 1984), and it is particularly so when the respondent has good knowledge about the subject matter and when the distribution of responses is highly skewed. An index of satisfaction is much to be preferred over a categorization of respondents as either 'satisfied' or 'dissatisfied'. Satisfaction is a matter of degree—it is not a binary concept. If measured as binary, precision is low, validity is suspect and predictive power is poor.

Reliability

Reliability of a measure is determined by its signal-to-noise ratio.
That is, the extent to which the variation of the measure is due to the 'true' underlying phenomenon versus random effects. High reliability is evident if a measure is stable over time or equivalent with identical measures (Fornell, 1992). Signal-to-noise in the items that make up the index (in terms of variances) is about 4 to 1.

Predictive power and financial implications of ACSI

An important part of the ACSI is its ability to predict economic returns. The model, of which the ACSI is a part, uses two proxies for economic returns as criterion variables: (1) customer retention (estimated from a non-linear transformation of a measure of repurchase likelihood); and (2) price tolerance (reservation price). The items included in the index are weighted in such a way that the proxies and the ACSI are maximally correlated (subject to certain constraints). Unless such weighting is done, the index is more likely to include matters that may be satisfying to the customer, but for which he or she is not willing to pay.

The empirical evidence for predictive power is available from both the Swedish data and the ACSI data. Using data from the Swedish Barometer, a one-point increase in the SCSB each year over 5 years yields, on the average, a 6.6% increase in current return-on-investment (Anderson et al., 1994). Of the firms traded on the Stockholm Stock Market Exchange, it is also evident that changes in the SCSB have been predictive of stock returns.

A basic tenet underlying the ACSI is that satisfied customers represent a real, albeit intangible, economic asset to a firm. By definition, an economic asset generates future income streams to the owner of that asset. Therefore, if customer satisfaction is indeed an economic asset, it should be possible to use the ACSI for prediction of company financial results. It is, of course, of considerable importance that the financial consequences of the ACSI are specified and documented. If it can be shown that the ACSI is related to financial returns, then the index demonstrates external validity.

The University of Michigan Business School faculty have done considerable research on the linkage between ACSI and economic returns, analyzing both accounting and stock market returns from measured companies. The pattern from all of these studies suggests a statistically strong and positive relationship. Specifically:

• There is a positive and significant relationship between ACSI and accounting return-on-assets (Fornell et al., 1995).
• There is a positive and significant relationship between the ACSI and the market value of common equity (Ittner & Larcker, 1996). When controlling for accounting book values of total assets and liabilities, a one-unit change (on the 0-100-point scale used for the ACSI) is associated with an average of US$646 million increase in market value. There are also significant and positive relationships between ACSI and market-to-book values and price/earnings ratios. There is a negative relationship between ACSI and risk measures, implying that firms with high loyalty and customer satisfaction have less variability and stronger financial positions.
• There is a positive and significant relationship between the ACSI and the long-term adjusted financial performance of companies. Tobin's Q is generally accepted as the best measure of long-term performance. It is defined as the ratio of a firm's present value of expected cash flows to the replacement costs of its assets.
Controlling for other factors, ACSI has a significant relationship to Tobin's Q (Mazvancheryl et al., 1999).
• Since 1994, changes in the ACSI have correlated with the stock market (Martin, 1998). The current market value of any security is the market's estimate of the discounted present value of the future income stream that the underlying asset will generate. If the most important asset is the satisfaction of the customer base, changes in ACSI should be related to changes in stock price. Until 1997, the stock market went up, whereas ACSI went down. However, in quarters following a sharp drop in ACSI, the stock market has slowed. Conversely, when the ACSI has gone down only slightly, the following quarter's stock market has gone up substantially. For the 6 years of ACSI measurement, the correlation between changes in the ACSI and changes in the Dow Jones industrial average has been quite strong. The interpretation of this relationship suggests that stock prices have responded to downsizing, cost cutting and productivity improvements, and that the deterioration in quality (particularly in the service sectors) has not been large enough to offset the positive effects. It also suggests that there is a limit beyond which it is unlikely that customers will tolerate further decreases in satisfaction. Once that limit is reached (which is now estimated to be approximately a -1.4% quarterly decline in ACSI), the stock market will not go up further.

ACSI scores of approximately 130 publicly traded companies display a statistically positive relationship with the traditional performance measures used by firms and security analysts (i.e. return-on-assets, return-on-equity, price-earnings ratio and the market-to-book ratio). In addition, the companies with the higher ACSI scores display stock price returns above the market adjusted average (Ittner & Larcker, 1996). The ACSI is also positively correlated with 'market value added'. This evidence indicates that the ACSI methodology produces a reliable and valid measure for customer satisfaction that is forward-looking and relevant to a company's economic performance.

Coverage

The ACSI measures a substantial portion of the US economy. In terms of sales dollars, it is approximately 30% of the GDP. The measured companies produce over 40%, but the ACSI measures only the sales of these companies to household consumers in the domestic market. The economic sectors and industries covered are discussed in Chapter III. Within each industry, the number of companies measured varies from 2 to 22.

The national index and the indices for each industry and sector are reflective of the total value (quality times sales) of products and services provided by the firms at each respective level of aggregation. Relative sales are used to determine each company's or agency's contribution to its respective industry index. In turn, relative sales by each industry are used to determine each industry's contribution to its respective sector index. To calculate the national index, the percentage contributions of each sector to the GDP are used to top-weight the sector indices. Mathematically, this is defined as:

Index for industry i in sector s at time t: $I_{ist} = \sum_f S_{fist}\, I_{fist} / S_{ist}$

Index for sector s at time t: $I_{st} = \sum_i S_{ist}\, I_{ist} / S_{st}$

where

$S_{fist}$ = sales by firm f, industry i, sector s at time t
$I_{fist}$ = index for firm f, industry i, sector s at time t

and

$S_{ist} = \sum_f S_{fist}$ = total sales for industry i at time t
$S_{st} = \sum_i S_{ist}$ = total sales for sector s at time t

The index is updated on a quarterly basis.
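To make the aggregation concrete, a minimal sketch in R (with made-up firm indices and sales, not ACSI data) computes an industry index as the sales-weighted average of its firms' indices:

```r
# Hypothetical firm-level indices and sales within one industry
firm_index <- c(78, 81, 74)     # I_fist: firm-level satisfaction indices
firm_sales <- c(120, 60, 20)    # S_fist: firm sales in the same period

# Industry index = sales-weighted average of the firms' indices
industry_index <- sum(firm_sales * firm_index) / sum(firm_sales)
industry_index
# Sector and national indices are built the same way, with industry sales
# and sector GDP shares serving as the respective weights.
```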
For each quarter, new indices are estimated for one or two sectors, with total replacement of all data annually at the end of the third calendar quarter. The national index is comprised of the most recent estimate for each sector:

National index at time t = $\sum_{\tau = t-3}^{t} \sum_s W_s\, I_{s\tau}$

where $W_s$ is the GDP weight of sector s, $I_{s\tau} = 0$ for all quarters $\tau$ in which the index for a sector is not estimated, and $I_{s\tau}$ equals the sector's estimated index for the quarters in which an index is estimated. In this way, the national index represents company, industry and sector indices for the prior year.

Simplicity

Given the complexity of model estimation, the ACSI maintains reasonable simplicity. It is calibrated on a 0-100 scale. Whereas the absolute values of the ACSI are of interest, much of the index's value, as with most other economic indicators, is found in changes over time, which can be expressed as percentages.

Diagnostics

The ACSI methodology estimates the relationships between customer satisfaction and its causes as seen by the customer: customer expectations, perceived quality and perceived value. Also estimated are the relationships between the ACSI, customer loyalty (as measured by customer retention and price tolerance (reservation prices)) and customer complaints.

COMP-627 Project Belief State Space Compression for Bayes-Adaptive


1 Introduction
In real-world systems, uncertainty generally arises in both the prediction of the system's behaviour under different controls and the observability of the current system state. Partially Observable Markov Decision Processes (POMDPs) take both kinds of uncertainty into account and provide a powerful model for sequential decision making under these conditions. However, most real-world problems have huge state and observation spaces, such that exact solution approaches are completely intractable (finite-horizon POMDPs are PSPACE-complete [1] and infinite-horizon POMDPs are undecidable [2]). This has motivated most researchers to focus on developing approximate solution approaches in order to solve ever larger POMDPs.
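For reference (standard POMDP background rather than anything specific to this project), the agent summarizes its action-observation history in a belief state $b$, a probability distribution over states, which after taking action $a$ and observing $o$ is updated by

$$b'(s') = \frac{O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s)}{\Pr(o \mid b, a)}.$$

Because beliefs live in a continuous simplex over the state space, their dimensionality grows with $|S|$, which is what makes compressing the belief state space attractive.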
Stéphane Ross, ID: 260237342, May 9, 2007

Wooldridge Econometrics: English Chapter-by-Chapter Summaries


CHAPTER 1 TEACHING NOTES

You have substantial latitude about what to emphasize in Chapter 1. I find it useful to talk about the economics of crime example and the wage example so that students see, at the outset, that econometrics is linked to economic reasoning, even if the economics is not complicated theory. I like to familiarize students with the important data structures that empirical economists use, focusing primarily on cross-sectional and time series data sets, as these are what I cover in a first-semester course. It is probably a good idea to mention the growing importance of data sets that have both a cross-sectional and time dimension.

I spend almost an entire lecture talking about the problems inherent in drawing causal inferences in the social sciences. I do this mostly through the agricultural yield, return to education, and crime examples. These examples also contrast experimental and nonexperimental (observational) data. Students studying business and finance tend to find the term structure of interest rates example more relevant, although the issue there is testing the implication of a simple theory, as opposed to inferring causality. I have found that spending time talking about these examples, in place of a formal review of probability and statistics, is more successful (and more enjoyable for the students and me).

CHAPTER 2 TEACHING NOTES

This is the chapter where I expect students to follow most, if not all, of the algebraic derivations. In class I like to derive at least the unbiasedness of the OLS slope coefficient, and usually I derive the variance. At a minimum, I talk about the factors affecting the variance. To simplify the notation, after I emphasize the assumptions in the population model, and assume random sampling, I just condition on the values of the explanatory variables in the sample. Technically, this is justified by random sampling because, for example, E(u_i | x_1, x_2, …, x_n) = E(u_i | x_i) by independent sampling. I find that students are able to focus on the key assumption and subsequently take my word about how conditioning on the independent variables in the sample is harmless. (If you prefer, the appendix to Chapter 3 does the conditioning argument carefully.) Because statistical inference is no more difficult in multiple regression than in simple regression, I postpone inference until Chapter 4. (This reduces redundancy and allows you to focus on the interpretive differences between simple and multiple regression.)

You might notice how, compared with most other texts, I use relatively few assumptions to derive the unbiasedness of the OLS slope estimator, followed by the formula for its variance. This is because I do not introduce redundant or unnecessary assumptions. For example, once E(u|x) = 0 is assumed, nothing further about the relationship between u and x is needed to obtain the unbiasedness of OLS under random sampling.

CHAPTER 3 TEACHING NOTES

For undergraduates, I do not work through most of the derivations in this chapter, at least not in detail. Rather, I focus on interpreting the assumptions, which mostly concern the population. Other than random sampling, the only assumption that involves more than population considerations is the assumption about no perfect collinearity, where the possibility of perfect collinearity in the sample (even if it does not occur in the population) should be touched on. The more important issue is perfect collinearity in the population, but this is fairly easy to dispense with via examples.
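As one such classroom illustration (my own simulated sketch, not an example from the text), perfect collinearity in the sample is caught immediately by regression software, which must drop one of the offending regressors:

```r
set.seed(1)
n <- 100
educ  <- rnorm(n, 13, 2)
exper <- rnorm(n, 10, 5)
# 'total' is an exact linear combination of educ and exper: perfect collinearity
total <- educ + exper
wage  <- 1 + 0.5 * educ + 0.2 * exper + rnorm(n)

summary(lm(wage ~ educ + exper + total))
# R reports NA for the coefficient on 'total' because it is perfectly
# collinear with educ and exper; the three partial effects cannot be
# separately identified.
```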
These examples come from my experiences with the kinds of model specification issues that beginners have trouble with. The comparison of simple and multiple regression estimates – based on the particular sample at hand, as opposed to their statistical properties – usually makes a strong impression. Sometimes I do not bother with the “partialling out” interpretation of multiple regression. As far as statistical properties, notice how I treat the problem of including an irrelevant variable: no separate derivation is needed, as the result follows from Theorem . I do like to derive the omitted variable bias in the simple case. This is not much more difficult than showing unbiasedness of OLS in the simple regression case under the first four Gauss-Markov assumptions. It is important to get the students thinking about this problem early on, and before too many additional (unnecessary) assumptions have been introduced.

I have intentionally kept the discussion of multicollinearity to a minimum. This partly indicates my bias, but it also reflects reality. It is, of course, very important for students to understand the potential consequences of having highly correlated independent variables. But this is often beyond our control, except that we can ask less of our multiple regression analysis. If two or more explanatory variables are highly correlated in the sample, we should not expect to precisely estimate their ceteris paribus effects in the population. I find extensive treatments of multicollinearity, where one “tests” or somehow “solves” the multicollinearity problem, to be misleading, at best. Even the organization of some texts gives the impression that imperfect multicollinearity is somehow a violation of the Gauss-Markov assumptions: they include multicollinearity in a chapter or part of the book devoted to “violation of the basic assumptions,” or something like that. I have noticed that master’s students who have had some undergraduate econometrics are often confused on the multicollinearity issue. It is very important that students not confuse multicollinearity among the included explanatory variables in a regression model with the bias caused by omitting an important variable.

I do not prove the Gauss-Markov theorem. Instead, I emphasize its implications. Sometimes, and certainly for advanced beginners, I put a special case of Problem on a midterm exam, where I make a particular choice for the function g(x). Rather than have the students directly compare the variances, they should appeal to the Gauss-Markov theorem for the superiority of OLS over any other linear, unbiased estimator.

CHAPTER 4 TEACHING NOTES

At the start of this chapter is a good time to remind students that a specific error distribution played no role in the results of Chapter 3. That is because only the first two moments were derived under the full set of Gauss-Markov assumptions. Nevertheless, normality is needed to obtain exact normal sampling distributions (conditional on the explanatory variables). I emphasize that the full set of CLM assumptions are used in this chapter, but that in Chapter 5 we relax the normality assumption and still perform approximately valid inference. One could argue that the classical linear model results could be skipped entirely, and that only large-sample analysis is needed. But, from a practical perspective, students still need to know where the t distribution comes from because virtually all regression packages report t statistics and obtain p-values off of the t distribution.
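For instance (a simulated sketch of my own, not one of the text's data sets), a single summary() call already confronts students with t statistics and their p-values:

```r
set.seed(42)
n <- 200
educ  <- rnorm(n, 13, 2)
exper <- rpois(n, 10)
log_wage <- 0.6 + 0.09 * educ + 0.02 * exper + rnorm(n, sd = 0.4)

fit <- lm(log_wage ~ educ + exper)
summary(fit)  # each coefficient is reported with its t statistic and p-value,
              # computed from the t distribution with n - 3 degrees of freedom
```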
I then find it very easy to cover Chapter 5 quickly, by just saying we can drop normality and still use t statistics and the associated p-values as being approximately valid. Besides, occasionally students will have to analyze smaller data sets, especially if they do their own small surveys for a term project.

It is crucial to emphasize that we test hypotheses about unknown population parameters. I tell my students that they will be punished if they write something like $H_0: \hat{\beta}_1 = 0$ on an exam or, even worse, $H_0: .632 = 0$.

One useful feature of Chapter 4 is its illustration of how to rewrite a population model so that it contains the parameter of interest in testing a single restriction. I find this is easier, both theoretically and practically, than computing variances that can, in some cases, depend on numerous covariance terms. The example of testing equality of the return to two- and four-year colleges illustrates the basic method, and shows that the respecified model can have a useful interpretation. Of course, some statistical packages now provide a standard error for linear combinations of estimates with a simple command, and that should be taught, too.

One can use an F test for single linear restrictions on multiple parameters, but this is less transparent than a t test and does not immediately produce the standard error needed for a confidence interval or for testing a one-sided alternative. The trick of rewriting the population model is useful in several instances, including obtaining confidence intervals for predictions in Chapter 6, as well as for obtaining confidence intervals for marginal effects in models with interactions (also in Chapter 6).

The major league baseball player salary example illustrates the difference between individual and joint significance when explanatory variables (rbisyr and hrunsyr in this case) are highly correlated. I tend to emphasize the R-squared form of the F statistic because, in practice, it is applicable a large percentage of the time, and it is much more readily computed. I do regret that this example is biased toward students in countries where baseball is played. Still, it is one of the better examples of multicollinearity that I have come across, and students of all backgrounds seem to get the point.

CHAPTER 5 TEACHING NOTES

Chapter 5 is short, but it is conceptually more difficult than the earlier chapters, primarily because it requires some knowledge of asymptotic properties of estimators. In class, I give a brief, heuristic description of consistency and asymptotic normality before stating the consistency and asymptotic normality of OLS. (Conveniently, the same assumptions that work for finite sample analysis work for asymptotic analysis.) More advanced students can follow the proof of consistency of the slope coefficient in the bivariate regression case. Section contains a full matrix treatment of asymptotic analysis appropriate for a master’s level course.

An explicit illustration of what happens to standard errors as the sample size grows emphasizes the importance of having a larger sample. I do not usually cover the LM statistic in a first-semester course, and I only briefly mention the asymptotic efficiency result. Without full use of matrix algebra combined with limit theorems for vectors and matrices, it is very difficult to prove asymptotic efficiency of OLS. I think the conclusions of this chapter are important for students to know, even though they may not fully grasp the details.
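One way to build such an illustration (again a simulated sketch, not an example from the text) is to refit the same model at increasing sample sizes and watch the standard error of the slope shrink at roughly the $1/\sqrt{n}$ rate:

```r
set.seed(7)
se_of_slope <- function(n) {
  x <- rnorm(n)
  y <- 1 + 0.5 * x + rnorm(n)
  coef(summary(lm(y ~ x)))["x", "Std. Error"]
}
sapply(c(50, 200, 800, 3200), se_of_slope)
# The standard error falls by roughly half each time n quadruples.
```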
On exams I usually include true-false type questions, with explanation, to test the students’ understanding of asymptotics. [For example: “In large samples we do not have to worry about omitted variable bias.” (False). Or “Even if the error term is not normally distributed, in large samples we can still compute approximately valid confidence intervals under the Gauss-Markov assumptions.” (True).]

CHAPTER 6 TEACHING NOTES

I cover most of Chapter 6, but not all of the material in great detail. I use the example in Table to quickly run through the effects of data scaling on the important OLS statistics. (Students should already have a feel for the effects of data scaling on the coefficients, fitted values, and R-squared because it is covered in Chapter 2.) At most, I briefly mention beta coefficients; if students have a need for them, they can read this subsection.

The functional form material is important, and I spend some time on more complicated models involving logarithms, quadratics, and interactions. An important point for models with quadratics, and especially interactions, is that we need to evaluate the partial effect at interesting values of the explanatory variables. Often, zero is not an interesting value for an explanatory variable and is well outside the range in the sample. Using the methods from Chapter 4, it is easy to obtain confidence intervals for the effects at interesting x values.

As far as goodness-of-fit, I only introduce the adjusted R-squared, as I think using a slew of goodness-of-fit measures to choose a model can be confusing to novices (and does not reflect empirical practice). It is important to discuss how, if we fixate on a high R-squared, we may wind up with a model that has no interesting ceteris paribus interpretation.

I often have students and colleagues ask if there is a simple way to predict y when log(y) has been used as the dependent variable, and to obtain a goodness-of-fit measure for the log(y) model that can be compared with the usual R-squared obtained when y is the dependent variable. The methods described in Section are easy to implement and, unlike other approaches, do not require normality.

The section on prediction and residual analysis contains several important topics, including constructing prediction intervals. It is useful to see how much wider the prediction intervals are than the confidence interval for the conditional mean. I usually discuss some of the residual-analysis examples, as they have real-world applicability.

CHAPTER 7 TEACHING NOTES

This is a fairly standard chapter on using qualitative information in regression analysis, although I try to emphasize examples with policy relevance (and only cross-sectional applications are included). In allowing for different slopes, it is important, as in Chapter 6, to appropriately interpret the parameters and to decide whether they are of direct interest. For example, in the wage equation where the return to education is allowed to depend on gender, the coefficient on the female dummy variable is the wage differential between women and men at zero years of education. It is not surprising that we cannot estimate this very well, nor should we want to. In this particular example we would drop the interaction term because it is insignificant, but the issue of interpreting the parameters can arise in models where the interaction term is significant.

In discussing the Chow test, I think it is important to discuss testing for differences in slope coefficients after allowing for an intercept difference.
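A compact way to make that point in class (a simulated sketch, not one of the text's data sets) is to compare the intercept-shift model with the model that also interacts the group dummy with the slope, using an F test:

```r
set.seed(3)
n <- 400
female <- rbinom(n, 1, 0.5)
educ   <- rnorm(n, 13, 2)
# True model: intercept difference only, identical slopes for both groups
log_wage <- 0.8 - 0.25 * female + 0.08 * educ + rnorm(n, sd = 0.4)

restricted   <- lm(log_wage ~ female + educ)   # common slope, shifted intercept
unrestricted <- lm(log_wage ~ female * educ)   # slope allowed to differ by group
anova(restricted, unrestricted)  # F test of the slope interaction: a Chow-type
                                 # comparison that already allows an intercept difference
```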
In many applications, a significant Chow statistic simply indicates intercept differences. (See the example in Section on student-athlete GPAs in the text.) From a practical perspective, it is important to know whether the partial effects differ across groups or whether a constant differential is sufficient.

I admit that an unconventional feature of this chapter is its introduction of the linear probability model. I cover the LPM here for several reasons. First, the LPM is being used more and more because it is easier to interpret than probit or logit models. Plus, once the proper parameter scalings are done for probit and logit, the estimated effects are often similar to the LPM partial effects near the mean or median values of the explanatory variables. The theoretical drawbacks of the LPM are often of secondary importance in practice. Computer Exercise is a good one to illustrate that, even with over 9,000 observations, the LPM can deliver fitted values strictly between zero and one for all observations.

If the LPM is not covered, many students will never know about using econometrics to explain qualitative outcomes. This would be especially unfortunate for students who might need to read an article where an LPM is used, or who might want to estimate an LPM for a term paper or senior thesis. Once they are introduced to the purpose and interpretation of the LPM, along with its shortcomings, they can tackle nonlinear models on their own or in a subsequent course.

A useful modification of the LPM estimated in equation is to drop kidsge6 (because it is not significant) and then define two dummy variables, one for kidslt6 equal to one and the other for kidslt6 at least two. These can be included in place of kidslt6 (with no young children being the base group). This allows a diminishing marginal effect in an LPM. I was a bit surprised when a diminishing effect did not materialize.

CHAPTER 8 TEACHING NOTES

This is a good place to remind students that homoskedasticity played no role in showing that OLS is unbiased for the parameters in the regression equation. In addition, you probably should mention that there is nothing wrong with the R-squared or adjusted R-squared as goodness-of-fit measures. The key is that these are estimates of the population R-squared, 1 – [Var(u)/Var(y)], where the variances are the unconditional variances in the population. The usual R-squared, and the adjusted version, consistently estimate the population R-squared whether or not Var(u|x) (and hence Var(y|x)) depends on x. Of course, heteroskedasticity causes the usual standard errors, t statistics, and F statistics to be invalid, even in large samples, with or without normality.

By explicitly stating the homoskedasticity assumption as conditional on the explanatory variables that appear in the conditional mean, it is clear that only heteroskedasticity that depends on the explanatory variables in the model affects the validity of standard errors and test statistics. The version of the Breusch-Pagan test in the text, and the White test, are ideally suited for detecting forms of heteroskedasticity that invalidate inference obtained under homoskedasticity. If heteroskedasticity depends on an exogenous variable that does not also appear in the mean equation, this can be exploited in weighted least squares for efficiency, but only rarely is such a variable available. One case where such a variable is available is when an individual-level equation has been aggregated.
I discuss this case in the text but I rarely have time to teach it.

As I mention in the text, other traditional tests for heteroskedasticity, such as the Park and Glejser tests, do not directly test what we want, or add too many assumptions under the null. The Goldfeld-Quandt test only works when there is a natural way to order the data based on one independent variable. This is rare in practice, especially for cross-sectional applications.

Some argue that weighted least squares estimation is a relic, and is no longer necessary given the availability of heteroskedasticity-robust standard errors and test statistics. While I am sympathetic to this argument, it presumes that we do not care much about efficiency. Even in large samples, the OLS estimates may not be precise enough to learn much about the population parameters. With substantial heteroskedasticity we might do better with weighted least squares, even if the weighting function is misspecified. As discussed in the text on pages 288-289, one can, and probably should, compute robust standard errors after weighted least squares. For asymptotic efficiency comparisons, these would be directly comparable to the heteroskedasticity-robust standard errors for OLS.

Weighted least squares estimation of the LPM is a nice example of feasible GLS, at least when all fitted values are in the unit interval. Interestingly, in the LPM examples in the text and the LPM computer exercises, the heteroskedasticity-robust standard errors often differ by only small amounts from the usual standard errors. However, in a couple of cases the differences are notable, as in Computer Exercise .

CHAPTER 9 TEACHING NOTES

The coverage of RESET in this chapter recognizes that it is a test for neglected nonlinearities, and it should not be expected to be more than that. (Formally, it can be shown that if an omitted variable has a conditional mean that is linear in the included explanatory variables, RESET has no ability to detect the omitted variable. Interested readers may consult my chapter in Companion to Theoretical Econometrics, 2001, edited by Badi Baltagi.) I just teach students the F statistic version of the test.

The Davidson-MacKinnon test can be useful for detecting functional form misspecification, especially when one has in mind a specific alternative, nonnested model. It has the advantage of always being a one degree of freedom test.

I think the proxy variable material is important, but the main points can be made with Examples and . The first shows that controlling for IQ can substantially change the estimated return to education, and the omitted ability bias is in the expected direction. Interestingly, education and ability do not appear to have an interactive effect. Example is a nice example of how controlling for a previous value of the dependent variable – something that is often possible with survey and nonsurvey data – can greatly affect a policy conclusion. Computer Exercise is also a good illustration of this method.

I rarely get to teach the measurement error material, although the attenuation bias result for classical errors-in-variables is worth mentioning. The result on exogenous sample selection is easy to discuss, with more details given in Chapter 17. The effects of outliers can be illustrated using the examples.
I think the infant mortality example, Example , is useful for illustrating how a single influential observation can have a large effect on the OLS estimates. With the growing importance of least absolute deviations, it makes sense to at least discuss the merits of LAD, at least in more advanced courses. Computer Exercise is a good example to show how mean and median effects can be very different, even though there may not be “outliers” in the usual sense.

CHAPTER 10 TEACHING NOTES

Because of its realism and its care in stating assumptions, this chapter puts a somewhat heavier burden on the instructor and student than traditional treatments of time series regression. Nevertheless, I think it is worth it. It is important that students learn that there are potential pitfalls inherent in using regression with time series data that are not present for cross-sectional applications. Trends, seasonality, and high persistence are ubiquitous in time series data. By this time, students should have a firm grasp of multiple regression mechanics and inference, and so you can focus on those features that make time series applications different from cross-sectional ones.

I think it is useful to discuss static and finite distributed lag models at the same time, as these at least have a shot at satisfying the Gauss-Markov assumptions. Many interesting examples have distributed lag dynamics. In discussing the time series versions of the CLM assumptions, I rely mostly on intuition. The notion of strict exogeneity is easy to discuss in terms of feedback. It is also pretty apparent that, in many applications, there are likely to be some explanatory variables that are not strictly exogenous. What the student should know is that, to conclude that OLS is unbiased – as opposed to consistent – we need to assume a very strong form of exogeneity of the regressors. Chapter 11 shows that only contemporaneous exogeneity is needed for consistency.

Although the text is careful in stating the assumptions, in class, after discussing strict exogeneity, I leave the conditioning on X implicit, especially when I discuss the no serial correlation assumption. As this is a new assumption I spend some time on it. (I also discuss why we did not need it for random sampling.) Once the unbiasedness of OLS, the Gauss-Markov theorem, and the sampling distributions under the classical linear model assumptions have been covered – which can be done rather quickly – I focus on applications. Fortunately, the students already know about logarithms and dummy variables. I treat index numbers in this chapter because they arise in many time series examples.

A novel feature of the text is the discussion of how to compute goodness-of-fit measures with a trending or seasonal dependent variable. While detrending or deseasonalizing y is hardly perfect (and does not work with integrated processes), it is better than simply reporting the very high R-squareds that often come with time series regressions with trending variables.

CHAPTER 11 TEACHING NOTES

Much of the material in this chapter is usually postponed, or not covered at all, in an introductory course. However, as Chapter 10 indicates, the set of time series applications that satisfy all of the classical linear model assumptions might be very small. In my experience, spurious time series regressions are the hallmark of many student projects that use time series data. Therefore, students need to be alerted to the dangers of using highly persistent processes in time series regression equations.
(The spurious regression problem and the notion of cointegration are covered in detail in Chapter 18.)

It is fairly easy to heuristically describe the difference between a weakly dependent process and an integrated process. Using the MA(1) and the stable AR(1) examples is usually sufficient. When the data are weakly dependent and the explanatory variables are contemporaneously exogenous, OLS is consistent. This result has many applications, including the stable AR(1) regression model. When we add the appropriate homoskedasticity and no serial correlation assumptions, the usual test statistics are asymptotically valid.

The random walk process is a good example of a unit root (highly persistent) process. In a one-semester course, the issue comes down to whether or not to first difference the data before specifying the linear model. While unit root tests are covered in Chapter 18, just computing the first-order autocorrelation is often sufficient, perhaps after detrending. The examples in Section illustrate how different first-difference results can be from estimating equations in levels.

Section is novel in an introductory text, and simply points out that, if a model is dynamically complete in a well-defined sense, it should not have serial correlation. Therefore, we need not worry about serial correlation when, say, we test the efficient market hypothesis. Section further investigates the homoskedasticity assumption, and, in a time series context, emphasizes that what is contained in the explanatory variables determines what kind of heteroskedasticity is ruled out by the usual OLS inference. These two sections could be skipped without loss of continuity.

CHAPTER 12 TEACHING NOTES

Most of this chapter deals with serial correlation, but it also explicitly considers heteroskedasticity in time series regressions. The first section allows a review of what assumptions were needed to obtain both finite sample and asymptotic results. Just as with heteroskedasticity, serial correlation itself does not invalidate R-squared. In fact, if the data are stationary and weakly dependent, R-squared and adjusted R-squared consistently estimate the population R-squared (which is well-defined under stationarity).

Equation is useful for explaining why the usual OLS standard errors are not generally valid with AR(1) serial correlation. It also provides a good starting point for discussing serial correlation-robust standard errors in Section . The subsection on serial correlation with lagged dependent variables is included to debunk the myth that OLS is always inconsistent with lagged dependent variables and serial correlation. I do not teach it to undergraduates, but I do to master’s students.

Section is somewhat untraditional in that it begins with an asymptotic t test for AR(1) serial correlation (under strict exogeneity of the regressors). It may seem heretical not to give the Durbin-Watson statistic its usual prominence, but I do believe the DW test is less useful than the t test. With nonstrictly exogenous regressors I cover only the regression form of Durbin’s test, as the h statistic is asymptotically equivalent and not always computable.

Section , on GLS and FGLS estimation, is fairly standard, although I try to show how comparing OLS estimates and FGLS estimates is not so straightforward.
Unfortunately, at the beginning level (and even beyond), it is difficult to choose a course of action when they are very different.

I do not usually cover Section in a first-semester course, but, because some econometrics packages routinely compute fully robust standard errors, students can be pointed to Section if they need to learn something about what the corrections do. I do cover Section for a master’s level course in applied econometrics (after the first-semester course).

I also do not cover Section in class; again, this is more to serve as a reference for more advanced students, particularly those with interests in finance. One important point is that ARCH is heteroskedasticity and not serial correlation, something that is confusing in many texts. If a model contains no serial correlation, the usual heteroskedasticity-robust statistics are valid. I have a brief subsection on correcting for a known

Analysis of Consumer Buying Behavior with the NBD-Dirichlet Model (R package manual)

Package‘NBDdirichlet’October12,2022Type PackageTitle NBD-Dirichlet Model of Consumer Buying Behavior for MarketingResearchVersion1.4Date2022-05-26Author Feiming ChenMaintainer Feiming Chen<*********************>URL https:///~fchen/statistics/R-package-NBDdirichlet/ how-to-use-Dirichlet-marketing-model.htmlDescription The Dirichlet(aka NBD-Dirichlet)model describes the purchase inci-dence and brand choice of consumer products.We estimate the model and summarize vari-ous theoretical quantities of interest to marketing researchers.Also provides functions for mak-ing tables that compare observed and theoretical statistics.License GPL(>=3)NeedsCompilation noRepository CRANDate/Publication2022-05-2914:00:02UTCR topics documented:NBDdirichlet-package (2)dirichlet (3)plot.dirichlet (6)print.dirichlet (8)summary.dirichlet (9)Index1312NBDdirichlet-package NBDdirichlet-package NBD-Dirichlet of Model Consumer Buying BehaviorDescriptionThe Dirichlet(aka NBD-Dirichlet)model describes the probability distributions of the consumer purchase incidences and brand choices.We estimate the model and calculate various theoretical quantities of interest to marketing researchers.Author(s)Feiming ChenReferencesThe Dirichlet:A Comprehensive Model of Buying Behavior.G.J.Goodhardt,A.S.C.Ehrenberg,C.Chatfield.Journal of the Royal Statistical Society.Series A(General),V ol.147,No.5(1984),pp.621-655Repeat-Buying:Facts,Theory and Applications,2nd edn.A.S.C.Ehrenberg,1988,London,Charles Griffin,ISBN0852642873Book Review:Repeat-Buying:Facts,Theory and Applications by A.S.C.Ehrenberg.Norman Pigden.The Statistician,V ol.40,No.3,Special Issue(1991),pp.349-350See Alsodirichlet,print.dirichlet,summary.dirichlet,plot.dirichletExamplescat.pen<-0.56#Category Penetrationcat.buyrate<-2.6#Category Buyer s Average Purchase Rate in a given period.brand.share<-c(0.25,0.19,0.1,0.1,0.09,0.08,0.03,0.02)#Brands Market Share brand.pen.obs<-c(0.2,0.17,0.09,0.08,0.08,0.07,0.03,0.02)#Brand Penetration<-c("Colgate DC","Macleans","Close Up","Signal","ultrabrite","Gibbs SR","Boots bel","Sainsbury b.")dobj<-dirichlet(cat.pen,cat.buyrate,brand.share,brand.pen.obs,)print(dobj)summary(dobj)#plot(dobj)dirichlet Estimation of the Dirichlet modelDescriptionGiven consumer purchase summary data,it estimates the parameters of the Dirichlet model,which describes the consumer repeat-buying behavior of branded products.It also returns with several probability functions for users to calculate various theoretical agedirichlet(cat.pen,cat.buyrate,brand.share,brand.pen.obs, =NA,cat.pur.var =NA,nstar =50,max.S =30,max.K =30,check =F)Argumentscat.pen Product category penetration,which is the observed proportion of category buy-ers over a specific time period.cat.buyrateCategory buyers’average purchase rate in a given period.This is derived as the total number of category purchase occasions divided by the total number of category buyers during a time period.brand.share A vector of brand market share.We typically define it as the proportions of purchase occasions that belong to different brands during the time period.brand.pen.obs A vector of observed brand penetration,which is the proportion of buyers for each brand during the time period. 
A character vector of the brand names.If not given (default),use "B1","B2",etc.cat.pur.varThe observed variance of category purchase rates across individuals.It is used for the method of moment estimation of the parameter K in the Dirichlet model.If it is not given (default),then estimate K by "mean and zeros"(see reference).nstarMaximum number of category purchases in the time period considered in the calculation.Any number larger than nstar is assumed to have occurred with probability zero.By default,it is 50.For higher frequently purchased category and longer study time period,it is necessary to increase nstar to the level (say,100,300,etc.)where nstar n =1P (n )>0.9999.We did not use the truncation procedure (suggested by the reference authors)in order to simplify coding.max.S Upper bound for the model parameter S in the optimization procedure to solve for S.Default to 30.max.K Upper bound for the model parameter K in the optimization procedure to solve for K.Default to 30.checkA logical value.If T,print some diagnostic information.Defaul to F.DetailsThe Dirichlet model and its estimation can be found in the reference paper.It is found tofit and reproduce the patterns of repeat buying of branded products quite well.Specifically,the dirichlet model is a mixture of distributions at four levels:1.Each consumer’s purchase incidences in a product category follow the Poisson process.2.The purchase rates of the category by different consumers follow a Gamma distribution.3.Each consumer’s choices among the available brands follow a multinomial distribution,and4.These brand choice probabilities follow a multivariate Beta or"Dirichlet"distribution acrossdifferent consumers.There are three structural parameters to be estimated:M Mean purchase rate of the category.K Measures the diversity of the overal category purchase frequency across consumers(smaller K implies more diversity).S Measures the diversity of the brand purchase propensity across consumers(smaller S implies more diversity).To estimate M and K,we use the observed category penetration(cat.pen)and purchase rate (cat.buyrate).To estimate S,we use additionally the observed brand penetrations(brand.pen.obs) and brand market shares(brand.share).Note however once these three parameters are estimated, only the brand market shares are needed by the Dirichlet model to compute various repeat-buying theoretical statistics.The estimated parameters,along with several probability functions that can access the object data, are passed back in a list,which is assigned a"dirichlet"class attribute.The result can be used by the print.dirichlet,summary.dirichlet,and plot.dirichlet method.The study period(where we report the model result)is assumed to be4times of the observation period(input data).So if we use quarterly data,the model output is annulized.This multiple(4) can be changed using the member function period.set.ValueA list with the following components:M Estimated Dirichlet model parameter:mean purchase rate of the category.K Estimated Dirichlet model parameter:it measures the diversity of the overal category purchase frequency(smaller K implies more diversity).S Estimated Dirichlet model parameter:it measures the diversity of the brand purchase propensity(smaller S implies more diversity).nbrand Number of brands being considered in the produt category.nstar Input parameter:Maximum number of category purchases considered.cat.pen Input parameter:Category penetration in a given time period.cat.buyrate Input parameter:Category buyers’average 
purchase rate in a given time period.brand.share Input parameter:A vector of brand market share.brand.pen.obs Input parameter:A vector of observed brand penetration. Input parameter:A character vector of the brand names.check A logicalflag that indicates whether to print the intermediate information in the model estimation.Default to F.error A logicalflag that indicates if nstar is too small to sufficently cover the support of category purchase probabilities(calculated by Pn,see below).If it is returnedT,then nstar should be increased and the Dirichlet model be re-estimated.period.set A member function of the"dirichlet"class object with one required parameter (t),which can be any positive real number.It resets the study time period to bet times of the assumed base time period in the sample.period.print A member function of the"dirichlet"class object with no parameter.It indicates the current time period by printing the multiple t of the base time period.p.rj.n A member function of the"dirichlet"class object with three required parameters (r_j,n,j).It calculates the conditional probability of buying brand j forexactly r_j times given that the consumer has made n category purchases.Pn A member function of the"dirichlet"class object with one required parameter (n).It calculates the probability that a consumer has made n category purchasesin the study time period.brand.pen A member function of the"dirichlet"class object with one required and one optional parameter(j,limit=c(0:nstar)).It calculates the probability that aconsumer makes at least one purchase of the brand j(theoretical penetration)in the study time period.The optional vector limit enumerates the exact fre-quencies that a consumer will be buying in the category and is used to indexthe summation of the probabilities of not buying brand j given those categorypurchases in limit.brand.buyrate A member function of the"dirichlet"class object with one required and one optional parameter(j,limit=c(0:nstar)).It calculates the expected numberof brand j purchases given that the consumer is a buyer of the brand j in thetime period(theoretical brand buying rate).The limit parameter has the samemeaning as that in the function brand.pen.wp A member function of the"dirichlet"class object with one required and one optional parameter(j,limit=c(0:nstar)).It calculates the expected numberof the product category purchases given that the consumer is a buyer of thebrand j in the time period(theoretical category buying rate for brand buyer).The limit parameter has the same meaning as that in the function brand.pen. 
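A brief usage sketch (not part of the manual) of the member functions listed above, assuming dobj has been created as in the Examples section that follows and that brands are indexed by their position in brand.share:
dobj$period.set(4)      # report on an annual basis (4 times the quarterly base period)
dobj$period.print()     # show the current multiple of the base time period
dobj$Pn(0)              # probability of making no category purchase in the period
dobj$brand.pen(1)       # theoretical penetration of brand 1
dobj$brand.buyrate(1)   # expected brand 1 purchases per brand 1 buyer
dobj$wp(1)              # expected category purchases per brand 1 buyer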
Author(s)Feiming ChenReferencesThe Dirichlet:A Comprehensive Model of Buying Behavior.G.J.Goodhardt,A.S.C.Ehrenberg,C.Chatfield.Journal of the Royal Statistical Society.Series A(General),V ol.147,No.5(1984),pp.621-655See Alsoprint.dirichlet,summary.dirichlet,plot.dirichlet,NBDdirichlet-packageExamples#The following data comes from the example in section3of#the reference paper.They are Toothpaste purchase data in UK#in1st quarter of1973from the AGB panel(5240static panelists).cat.pen<-0.56#Category Penetrationcat.buyrate<-2.6#Category Buyer s Average Purchase Rate in a given period.brand.share<-c(0.25,0.19,0.1,0.1,0.09,0.08,0.03,0.02)#Brands Market Share brand.pen.obs<-c(0.2,0.17,0.09,0.08,0.08,0.07,0.03,0.02)#Brand Penetration<-c("Colgate DC","Macleans","Close Up","Signal","ultrabrite","Gibbs SR","Boots bel","Sainsbury b.")dobj<-dirichlet(cat.pen,cat.buyrate,brand.share,brand.pen.obs,)print(dobj)plot.dirichlet Plot of theoretical penetration growth and buying rate growthDescriptionThis function plots a’dirichlet’object.It is a method for the generic function plot for objects of the class’dirichlet’.It plots the theoretical penetration growth and buying rate growth across multiple brands according to the Dirichlet model over a specified time sequence.Usage##S3method for class dirichletplot(x,t=4,brand=1:x$nbrand,incr=1,result=NULL,...)Argumentsx An object of"dirichlet"class.t Maximum of the projection time period,which is specified as a multiple of the base time period.For example,if the base time period is quarterly,then t=4would mean annually.brand A vector specifying the subset of brands to be ploted.incr Increment for the time sequence that starts at0.Its unit is one base time period.Can be a fractional number such as0.1.result A list returned from the previous run of the plot.dirichlet.It is used to avoid repeating the computation when incr is changed....Other parameters passing to the generic function.DetailsA time sequence will be made from0up to t with increment incr,against each component of whichthe theoretical penetration and brand buying rate will be plotted.Each plotted point represents the cumulated penetration or buying rate from time0to its current time point(expressed as the multiple of the base time period).ValueA list with two components:pen A matrix with the penetration values.Its number of rows is the number of brands,and its number of columns is the length of the time sequence used forplotting the X coordinates of the points.buy A matrix with the buying rate values.Its dimension is the same as that of pen. 
Author(s)Feiming ChenReferencesThe Dirichlet:A Comprehensive Model of Buying Behavior.G.J.Goodhardt,A.S.C.Ehrenberg,C.Chatfield.Journal of the Royal Statistical Society.Series A(General),V ol.147,No.5(1984),pp.621-655See Alsodirichlet,summary.dirichlet,print.dirichlet,NBDdirichlet-packageExamplescat.pen<-0.56#Category Penetrationcat.buyrate<-2.6#Category Buyer s Average Purchase Rate in a given period.brand.share<-c(0.25,0.19,0.1,0.1,0.09,0.08,0.03,0.02)#Brands Market Share brand.pen.obs<-c(0.2,0.17,0.09,0.08,0.08,0.07,0.03,0.02)#Brand Penetration<-c("Colgate DC","Macleans","Close Up","Signal","ultrabrite","Gibbs SR","Boots bel","Sainsbury b.")dobj<-dirichlet(cat.pen,cat.buyrate,brand.share,brand.pen.obs,)plot(dobj)8print.dirichlet print.dirichlet Print Dirichlet model informationDescriptionThis function prints a’dirichlet’object.It is a method for the generic function print of class’dirich-let’.It prints descriptive information on the product category,brand,and the estimated Dirichlet model parameters.Usage##S3method for class dirichletprint(x,...)Argumentsx An object of"dirichlet"class....Other parameters passing to the generic functionDetailsThe following information is printed:•Number of brands in the category•Brand list•Brands’market shares.•Brands’penetrations.•The multiple of the base time period that indicates the study time period,and the correspond-ing M value.•Category penetration and category buying rate.•Estimated dirichlet model parameters:M(for base period),K,and S.ValueNULLAuthor(s)Feiming ChenReferencesThe Dirichlet:A Comprehensive Model of Buying Behavior.G.J.Goodhardt,A.S.C.Ehrenberg,C.Chatfield.Journal of the Royal Statistical Society.Series A(General),V ol.147,No.5(1984),pp.621-655See Alsodirichlet,summary.dirichlet,plot.dirichlet,NBDdirichlet-packageExamplescat.pen<-0.56#Category Penetrationcat.buyrate<-2.6#Category Buyer s Average Purchase Rate in a given period.brand.share<-c(0.25,0.19,0.1,0.1,0.09,0.08,0.03,0.02)#Brands Market Share brand.pen.obs<-c(0.2,0.17,0.09,0.08,0.08,0.07,0.03,0.02)#Brand Penetration<-c("Colgate DC","Macleans","Close Up","Signal","ultrabrite","Gibbs SR","Boots bel","Sainsbury b.")dobj<-dirichlet(cat.pen,cat.buyrate,brand.share,brand.pen.obs,)print(dobj)#YOU WILL SEE THE FOLLOWING:#Number of Brands in the Category=8#Brand List:Colgate DC:Macleans:Close Up:Signal:ultrabrite:#Gibbs SR:Boots bel:Sainsbury b.#Brands Market Shares:0.250.190.10.10.090.080.030.02#Brands Penetration:0.20.170.090.080.080.070.030.02#Multiple of Base Time Period:1,Current M=1.456#Channel Penetration=0.56,with Shopping Rate=2.6#Estimated Dirichlet Model Parameters:#NBD:M=1.46,K=0.78;Dirichlet:S=1.55summary.dirichlet Theoretical summary statistics from the Dirichlet model.DescriptionThis function summarizes a’dirichlet’object.It is a method for the generic function summary of class’dirichlet’.It calculate four types of theoretical summary statistics,which can be compared with the corresponding observed statistics.Usage##S3method for class dirichletsummary(object,t=1,type=c("buy","freq","heavy","dup"),digits=2,freq.cutoff=5,heavy.limit=1:6,dup.brand=1,...)Argumentsobject An object of"dirichlet"class.t Multiple of the base time period.For example,if the assumed base time period is quarterly,then t=4would mean annually.Default to one.type A character vector that specifies which types of theoretical statistics(during the time period indicated by t)are to be calculated.Four character strings can belisted:buy Theoretical brand penetration,buying rate,and the buying rate of the 
cate-gory per brand buyer.freq Distribution of the number of brand purchases.heavy Theoretical brand penetration and buying rate among category buyerswith a specific frequency range.dup Brand duplication (proportion of buyers of a particular brand also buyingother brand).digits Number of decimal digits to control the rounding precision of the reported statis-tics.Default to 2.freq.cutoff For the type="freq"table,where to cut off and lump the tail of the frequency distribution.heavy.limit For the type="heavy"table,a vector containing the specific purchase frequency range of the category buyers.The upper-bound of the frequency is nstar .dup.brand For the type="dup"table,the focal brand.Defaul to the first brand in the brand list....Other parameters passing to the generic function.DetailsThe output corresponds to the theoretical portion of the Table 3,4,5,6in the reference paper.We also have another set of functions (available upon request)that put observed and theoretical statistics together for making tables that resemble those in the reference.Let P n be the probability of a consumer buying the product category n times.Then P n has a Negative Binomial Distribution (NBD).Let p (r j |n )be the probability of making r j purchases of brand j ,gien that n purchases of the category having been make (r j ≤n ).Then p (r j |n )has a Beta-Binomial distribution.The theoretical brand penetration b is thenb =1−n =0P n p (0|n )The theoretical brand buying rate w isw =n =1{P nn r =1rp (r |n )}band the category buying rate per brand buyer w P isw P =n =1{nP n [1−p (0|n )]}b The brand purchase frequency distribution isf (r )=n ≥rP n p (r |n )The brand penetration given a specific category purchase frequency range R ={i 1,i 2,i 3, (i)1−n∈R P (n )p (0|n ) n ∈R P (n )The brand buying rate given a specific category purchase frequency range R={i1,i2,i3, (i)n∈R P(n)[nr=1rp(r|n)]n∈RP(n)[1−p(0|n)]To calculate the brand duplication measure,wefirst get the penetration b(j+k)of the"composite"brand of two brands j and k as:b(j+k)=1−nP n p k(0|n)p j(0|n)Then the theoretical proportion b jk of the population buying both brands at least once is:b jk=b j+b k−b(j+k)and the brand duplication b j/k(where brand k is the focal brand)is:b j/k=b jk/b kValueA list with those components that are specified by the input type parameter.buy A data frame with three components:pen.brand(Theoretical brand penetra-tion),pur.brand(buying rate of the brand),pur.cat(buying rate of the cate-gory per brand buyer).freq A matrix that lists the distribution of brand purchases.The number of rows is the number of brands.heavy A matrix with two columns.Thefirst column(Penetration)is the theoretical brand penetration among category buyers with a specific frequency range.Thesecond column(Avg Purchase Freq)is the brand buying rate of those categorybuyers.The number of rows is the number of brands.dup A vector with dimension as the number of brands.It reports the brand duplica-tion(proportion of buyers of a particular brand also buying other brand)of thefocal brand(dup.brand).Author(s)Feiming ChenReferencesThe Dirichlet:A Comprehensive Model of Buying Behavior.G.J.Goodhardt,A.S.C.Ehrenberg,C.Chatfield.Journal of the Royal Statistical Society.Series A(General),V ol.147,No.5(1984),pp.621-655See Alsodirichlet,print.dirichlet,plot.dirichlet,NBDdirichlet-packageExamplescat.pen<-0.56#Category Penetrationcat.buyrate<-2.6#Category Buyer s Average Purchase Rate in a given period.brand.share<-c(0.25,0.19,0.1,0.1,0.09,0.08,0.03,0.02)#Brands Market Share 
brand.pen.obs<-c(0.2,0.17,0.09,0.08,0.08,0.07,0.03,0.02)#Brand Penetration<-c("Colgate DC","Macleans","Close Up","Signal","ultrabrite","Gibbs SR","Boots bel","Sainsbury b.")dobj<-dirichlet(cat.pen,cat.buyrate,brand.share,brand.pen.obs,)##Not run:summary(dobj)summary(dobj,t=4,type="freq")summary(dobj,t=4,type="heavy",heavy.limit=c(7:50))summary(dobj,t=1,type="dup",dup.brand=2)Index∗methodsplot.dirichlet,6print.dirichlet,8summary.dirichlet,9∗modelsdirichlet,3∗packageNBDdirichlet-package,2dirichlet,2,3,7,9,11NBDdirichlet-package,2plot.dirichlet,2,4,6,6,9,11print.dirichlet,2,4,6,7,8,11 summary.dirichlet,2,4,6,7,9,913。
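As a cross-check on the theoretical quantities defined for summary.dirichlet above, the following sketch (not from the manual) recomputes the penetration of brand 1 directly from the member functions Pn and p.rj.n, using the formula b = 1 - sum over n of P(n) p(0 | n), and compares it with brand.pen. It assumes p.rj.n(0, 0, j) returns 1 for n = 0, matching the default limit = 0:nstar used by brand.pen.
library(NBDdirichlet)
cat.pen <- 0.56
cat.buyrate <- 2.6
brand.share <- c(0.25, 0.19, 0.1, 0.1, 0.09, 0.08, 0.03, 0.02)
brand.pen.obs <- c(0.2, 0.17, 0.09, 0.08, 0.08, 0.07, 0.03, 0.02)
dobj <- dirichlet(cat.pen, cat.buyrate, brand.share, brand.pen.obs)
j <- 1                                           # focal brand
pen.formula <- 1 - sum(sapply(0:dobj$nstar, function(n) dobj$Pn(n) * dobj$p.rj.n(0, n, j)))
pen.member <- dobj$brand.pen(j)
c(formula = pen.formula, member = pen.member)    # the two values should agree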

chapter_2

2Deformation:Displacements and Strains We begin development of the basicfield equations of elasticity theory byfirst investigating thekinematics of material deformation.As a result of applied loadings,elastic solids will changeshape or deform,and these deformations can be quantified by knowing the displacements ofmaterial points in the body.The continuum hypothesis establishes a displacementfield at allpoints within the elastic ing appropriate geometry,particular measures of deformationcan be constructed leading to the development of the strain tensor.As expected,the straincomponents are related to the displacementfield.The purpose of this chapter is to introduce thebasic definitions of displacement and strain,establish relations between these twofieldquantities,andfinally investigate requirements to ensure single-valued,continuous displace-mentfields.As appropriate for linear elasticity,these kinematical results are developed underthe conditions of small deformation theory.Developments in this chapter lead to two funda-mental sets offield equations:the strain-displacement relations and the compatibility equa-tions.Furtherfield equation development,including internal force and stress distribution,equilibrium and elastic constitutive behavior,occurs in subsequent chapters.2.1General DeformationsUnder the application of external loading,elastic solids deform.A simple two-dimensionalcantilever beam example is shown in Figure2-1.The undeformed configuration is taken withthe rectangular beam in the vertical position,and the end loading displaces material points tothe deformed shape as shown.As is typical in most problems,the deformation varies frompoint to point and is thus said to be nonhomogenous.A superimposed square mesh is shown inthe two configurations,and this indicates how elements within the material deform locally.It isapparent that elements within the mesh undergo extensional and shearing deformation.Anelastic solid is said to be deformed or strained when the relative displacements between pointsin the body are changed.This is in contrast to rigid-body motion where the distance betweenpoints remains the same.In order to quantify deformation,consider the general example shown in Figure2-2.In the undeformed configuration,we identify two neighboring material points P o and P connected withthe relative position vector r as shown.Through a general deformation,these points are mappedand P0in the deformed configuration.Forfinite or large deformation theory,the to locations P0o27undeformed and deformed configurations can be significantly different,and a distinction between these two configurations must be maintained leading to Lagrangian and Eulerian descriptions;see,for example,Malvern(1969)or Chandrasekharaiah and Debnath(1994). 
However,since we are developing linear elasticity,which uses only small deformation theory, the distinction between undeformed and deformed configurations can be dropped.Using Cartesian coordinates,define the displacement vectors of points P o and P to be u o and u,respectively.Since P and P o are neighboring points,we can use a Taylor series expansion around point P o to express the components of u asu¼u oþ@u@xr xþ@u@yr yþ@u@zr zv¼v oþ@v@xr xþ@v@yr yþ@v@zr zw¼w oþ@w@xr xþ@w@yr yþ@w@zr z(2:1:1)FIGURE2-1Two-dimensional deformation example.(Undeformed)(Deformed) FIGURE2-2General deformation between two neighboring points.28FOUNDATIONS AND ELEMENTARY APPLICATIONSNote that the higher-order terms of the expansion have been dropped since the components of r are small.The change in the relative position vector r can be written asD r¼r0Àr¼uÀu o(2:1:2) and using(2.1.1)givesD r x¼@u@xr xþ@u@yr yþ@u@zr zD r y¼@v@xr xþ@v@yr yþ@v@zr zD r z¼@w@xr xþ@w@yr yþ@w@zr z(2:1:3)or in index notationD r i¼u i,j r j(2:1:4) The tensor u i,j is called the displacement gradient tensor,and may be written out asu i,j¼@u@x@u@y@u@z@v@x@v@y@v@z@w@x@w@y@w@z2666666437777775(2:1:5)From relation(1.2.10),this tensor can be decomposed into symmetric and antisymmetric parts asu i,j¼e ijþ!ij(2:1:6) wheree ij¼12(u i,jþu j,i)!ij¼12(u i,jÀu j,i)(2:1:7)The tensor e ij is called the strain tensor,while!ij is referred to as the rotation tensor.Relations (2.1.4)and(2.1.6)thus imply that for small deformation theory,the change in the relative position vector between neighboring points can be expressed in terms of a sum of strain and rotation bining relations(2.1.2),(2.1.4),and(2.1.6),and choosing r i¼dx i, we can also write the general result in the formu i¼u o iþe ij dx jþ!ij dx j(2:1:8) Because we are considering a general displacementfield,these results include both strain deformation and rigid-body motion.Recall from Exercise1-14that a dual vector!i canDeformation:Displacements and Strains29be associated with the rotation tensor such that !i ¼À1=2e ijk !jk .Using this definition,it is found that!1¼!32¼12@u 3@x 2À@u 2@x 3 !2¼!13¼12@u 1@x 3À@u 3@x 1 !3¼!21¼12@u 2@x 1À@u 1@x 2 (2:1:9)which can be expressed collectively in vector format as v ¼(1=2)(r Âu ).As is shown in the next section,these components represent rigid-body rotation of material elements about the coordinate axes.These general results indicate that the strain deformation is related to the strain tensor e ij ,which in turn is a related to the displacement gradients.We next pursue a more geometric approach and determine specific connections between the strain tensor components and geometric deformation of material elements.2.2Geometric Construction of Small Deformation TheoryAlthough the previous section developed general relations for small deformation theory,we now wish to establish a more geometrical interpretation of these results.Typically,elasticity variables and equations are field quantities defined at each point in the material continuum.However,particular field equations are often developed by first investigating the behavior of infinitesimal elements (with coordinate boundaries),and then a limiting process is invoked that allows the element to shrink to a point.Thus,consider the common deformational behavior of a rectangular element as shown in Figure 2-3.The usual types of motion include rigid-body rotation and extensional and shearing deformations as illustrated.Rigid-body motion does not contribute to the strain field,and thus also does not affect the 
stresses.We therefore focus our study primarily on the extensional and shearing deformation.Figure 2-4illustrates the two-dimensional deformation of a rectangular element with original dimensions dx by dy .After deformation,the element takes a rhombus form as shown in the dotted outline.The displacements of various corner reference points areindicated(Rigid Body Rotation)(Undeformed Element)(Horizontal Extension)(Vertical Extension)(Shearing Deformation)FIGURE 2-3Typical deformations of a rectangular element.30FOUNDATIONS AND ELEMENTARY APPLICATIONSin the figure.Reference point A is taken at location (x,y ),and the displacement components of this point are thus u (x,y )and v (x,y ).The corresponding displacements of point B are u (x þdx ,y )and v (x þdx ,y ),and the displacements of the other corner points are defined in an analogous manner.According to small deformation theory,u (x þdx ,y )%u (x ,y )þ(@u =@x )dx ,with similar expansions for all other terms.The normal or extensional strain component in a direction n is defined as the change in length per unit length of fibers oriented in the n -direction.Normal strain is positive if fibers increase in length and negative if the fiber is shortened.In Figure 2-4,the normal strain in the x direction can thus be defined bye x ¼A 0B 0ÀAB From the geometry in Figure 2-4,A 0B 0¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffidx þ@u @x dx 2þ@v @x dx 2s ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1þ2@u @x þ@u @x 2þ@v @x 2dx s %1þ@u @xdx where,consistent with small deformation theory,we have dropped the higher-order ing these results and the fact that AB ¼dx ,the normal strain in the x -direction reduces toe x ¼@u@x (2:2:1)In similar fashion,the normal strain in the y -direction becomese y ¼@v@y (2:2:2)A second type of strain is shearing deformation,which involves angles changes (see Figure 2-3).Shear strain is defined as the change in angle between two originally orthogonalx FIGURE 2-4Two-dimensional geometric strain deformation.Deformation:Displacements and Strains 31directions in the continuum material.This definition is actually referred to as the engineering shear strain.Theory of elasticity applications generally use a tensor formalism that requires a shear strain definition corresponding to one-half the angle change between orthogonal axes; see previous relation(2:1:7)1.Measured in radians,shear strain is positive if the right angle between the positive directions of the two axes decreases.Thus,the sign of the shear strain depends on the coordinate system.In Figure2-4,the engineering shear strain with respect to the x-and y-directions can be defined asg xy¼p2ÀffC0A0B0¼aþbFor small deformations,a%tan a and b%tan b,and the shear strain can then be expressed asg xy¼@v@xdxdxþ@u@xdxþ@u@ydydyþ@v@ydy¼@u@yþ@v@x(2:2:3)where we have again neglected higher-order terms in the displacement gradients.Note that each derivative term is positive if lines AB and AC rotate inward as shown in thefigure.By simple interchange of x and y and u and v,it is apparent that g xy¼g yx.By considering similar behaviors in the y-z and x-z planes,these results can be easily extended to the general three-dimensional case,giving the results:e x¼@u@x,e y¼@v@y,e z¼@w@zg xy¼@u@yþ@v@x,g yz¼@v@zþ@w@y,g zx¼@w@xþ@u@z(2:2:4)Thus,we define three 
normal and three shearing strain components leading to a total of six independent components that completely describe small deformation theory.This set of equations is normally referred to as the strain-displacement relations.However,these results are written in terms of the engineering strain components,and tensorial elasticity theory prefers to use the strain tensor e ij defined by(2:1:7)1.This represents only a minor change because the normal strains are identical and shearing strains differ by a factor of one-half;for example,e11¼e x¼e x and e12¼e xy¼1=2g xy,and so forth.Therefore,using the strain tensor e ij,the strain-displacement relations can be expressed in component form ase x¼@u@x,e y¼@v@y,e z¼@w@ze xy¼1@uþ@v,e yz¼1@vþ@w,e zx¼1@wþ@u(2:2:5)Using the more compact tensor notation,these relations are written ase ij¼12(u i,jþu j,i)(2:2:6)32FOUNDATIONS AND ELEMENTARY APPLICATIONSwhile in direct vector/matrix notation as the form reads:e¼12r uþ(r u)TÂÃ(2:2:7)where e is the strain matrix and r u is the displacement gradient matrix and(r u)T is its transpose.The strain is a symmetric second-order tensor(e ij¼e ji)and is commonly written in matrix format:e¼[e]¼e x e xy e xze xy e y e yze xz e yz e z2435(2:2:8)Before we conclude this geometric presentation,consider the rigid-body rotation of our two-dimensional element in the x-y plane,as shown in Figure2-5.If the element is rotated through a small rigid-body angular displacement about the z-axis,using the bottom element edge,the rotation angle is determined as@v=@x,while using the left edge,the angle is given byÀ@u=@y. These two expressions are of course the same;that is,@v=@x¼À@u=@y and note that this would imply e xy¼0.The rotation can then be expressed as!z¼[(@v=@x)À(@u=@y)]=2, which matches with the expression given earlier in(2:1:9)3.The other components of rotation follow in an analogous manner.Relations for the constant rotation!z can be integrated to give the result:u*¼u oÀ!z yv*¼v oþ!z x(2:2:9)where u o and v o are arbitrary constant translations in the x-and y-directions.This result then specifies the general form of the displacementfield for two-dimensional rigid-body motion.We can easily verify that the displacementfield given by(2.2.9)yields zero strain.xFIGURE2-5Two-dimensional rigid-body rotation.Deformation:Displacements and Strains33For the three-dimensional case,the most general form of rigid-body displacement can beexpressed asu*¼u oÀ!z yþ!y zv*¼v oÀ!x zþ!z xw*¼w oÀ!y xþ!x y(2:2:10)As shown later,integrating the strain-displacement relations to determine the displacementfield produces arbitrary constants and functions of integration,which are equivalent to rigid-body motion terms of the form given by(2.2.9)or(2.2.10).Thus,it is important to recognizesuch terms because we normally want to drop them from the analysis since they do notcontribute to the strain or stressfields.2.3Strain TransformationBecause the strains are components of a second-order tensor,the transformation theorydiscussed in Section1.5can be applied.Transformation relation(1:5:1)3is applicable forsecond-order tensors,and applying this to the strain givese0ij¼Q ip Q jq e pq(2:3:1)where the rotation matrix Q ij¼cos(x0i,x j).Thus,given the strain in one coordinate system,we can determine the new components in any other rotated system.For the general three-dimensional case,define the rotation matrix asQ ij¼l1m1n1l2m2n2l3m3n32435(2:3:2)Using this notational scheme,the specific transformation relations from equation(2.3.1)becomee0x¼e x l21þe y m21þe z n21þ2(e 
xy l1m1þe yz m1n1þe zx n1l1)e0y¼e x l22þe y m22þe z n22þ2(e xy l2m2þe yz m2n2þe zx n2l2)e0z¼e x l23þe y m23þe z n23þ2(e xy l3m3þe yz m3n3þe zx n3l3)e0xy¼e x l1l2þe y m1m2þe z n1n2þe xy(l1m2þm1l2)þe yz(m1n2þn1m2)þe zx(n1l2þl1n2)e0yz¼e x l2l3þe y m2m3þe z n2n3þe xy(l2m3þm2l3)þe yz(m2n3þn2m3)þe zx(n2l3þl2n3)e0zx¼e x l3l1þe y m3m1þe z n3n1þe xy(l3m1þm3l1)þe yz(m3n1þn3m1)þe zx(n3l1þl3n1)(2:3:3)For the two-dimensional case shown in Figure2-6,the transformation matrix can be ex-pressed asQ ij¼cos y sin y0Àsin y cos y00012435(2:3:4)34FOUNDATIONS AND ELEMENTARY APPLICATIONSUnder this transformation,the in-plane strain components transform according toe 0x ¼e x cos 2y þe y sin 2y þ2e xy sin y cos ye 0y ¼e x sin 2y þe y cos 2y À2e xy sin y cos ye 0xy ¼Àe x sin y cos y þe y sin y cos y þe xy (cos 2y Àsin 2y )(2:3:5)which is commonly rewritten in terms of the double angle:e 0x ¼e x þe y 2þe x Àe y 2cos 2y þe xy sin 2y e 0y ¼e x þe y Àe x Àe y cos 2y Àe xy sin 2y e 0xy ¼e y Àe x 2sin 2y þe xy cos 2y (2:3:6)Transformation relations (2.3.6)can be directly applied to establish transformations between Cartesian and polar coordinate systems (see Exercise 2-6).Additional applications of these results can be found when dealing with experimental strain gage measurement systems.For example,standard experimental methods using a rosette strain gage allow the determination of extensional strains in three different directions on the surface of a ing this type of data,relation (2:3:6)1can be repeatedly used to establish three independent equations that can be solved for the state of strain (e x ,e y ,e xy )at the surface point under study (see Exercise 2-7).Both two-and three-dimensional transformation equations can be easily incorporated in MATLAB to provide numerical solutions to problems of interest.Such examples are given in Exercises 2-8and 2-9.2.4Principal StrainsFrom the previous discussion in Section 1.6,it follows that because the strain is a symmetric second-order tensor,we can identify and determine its principal axes and values.According to this theory,for any given strain tensor we can establish the principal value problem and solvey ′FIGURE 2-6Two-dimensional rotational transformation.Deformation:Displacements and Strains 35the characteristic equation to explicitly determine the principal values and directions.The general characteristic equation for the strain tensor can be written asdet[e ijÀe d ij]¼Àe3þW1e2ÀW2eþW3¼0(2:4:1) where e is the principal strain and the fundamental invariants of the strain tensor can be expressed in terms of the three principal strains e1,e2,e3asW1¼e1þe2þe3W2¼e1e2þe2e3þe3e1W3¼e1e2e3(2:4:2)Thefirst invariant W1¼W is normally called the cubical dilatation,because it is related to the change in volume of material elements(see Exercise2-11).The strain matrix in the principal coordinate system takes the special diagonal forme ij¼e1000e2000e32435(2:4:3)Notice that for this principal coordinate system,the deformation does not produce anyshearing and thus is only extensional.Therefore,a rectangular element oriented alongprincipal axes of strain will retain its orthogonal shape and undergo only extensional deform-ation of its sides.2.5Spherical and Deviatoric StrainsIn particular applications it is convenient to decompose the strain tensor into two parts calledspherical and deviatoric strain tensors.The spherical strain is defined by~e ij¼13e kk d ij¼13Wd ij(2:5:1)while the deviatoric strain is specified as^e ij¼e ijÀ13e kk d ij(2:5:2)Note that the total strain is then simply the sume 
ij¼~e ijþ^e ij(2:5:3)The spherical strain represents only volumetric deformation and is an isotropic tensor,being the same in all coordinate systems(as per the discussion in Section1.5).The deviatoricstrain tensor then accounts for changes in shape of material elements.It can be shownthat the principal directions of the deviatoric strain are the same as those of the straintensor.36FOUNDATIONS AND ELEMENTARY APPLICATIONS2.6Strain CompatibilityWe now investigate in more detail the nature of the strain-displacement relations (2.2.5),and this will lead to the development of some additional relations necessary to ensure continuous,single-valued displacement field solutions.Relations (2.2.5),or the index notation form (2.2.6),represent six equations for the six strain components in terms of three displacements.If we specify continuous,single-valued displacements u,v,w,then through differentiation the resulting strain field will be equally well behaved.However,the converse is not necessarily true;that is,given the six strain components,integration of the strain-displacement relations (2.2.5)does not necessarily produce continuous,single-valued displacements.This should not be totally surprising since we are trying to solve six equations for only three unknown displacement components.In order to ensure continuous,single-valued displacements,the strains must satisfy additional relations called integrability or compatibility equations .Before we proceed with the mathematics to develop these equations,it is instructive to consider a geometric interpretation of this concept.A two-dimensional example is shown in Figure 2-7whereby an elastic solid is first divided into a series of elements in case (a).For simple visualization,consider only four such elements.In the undeformed configuration shown in case (b),these elements of course fit together perfectly.Next,let us arbitrarily specify the strain of each of the four elements and attempt to reconstruct the solid.For case (c),the elements have been carefully strained,taking into consideration neighboring elements so that the system fits together thus yielding continuous,single-valued displacements.However,for(b) Undeformed Configuration(c) Deformed ConfigurationContinuous Displacements (a) Discretized Elastic Solid (d) Deformed Configuration Discontinuous DisplacementsFIGURE 2-7Physical interpretation of strain compatibility.case(d),the elements have been individually deformed without any concern for neighboring deformations.It is observed for this case that the system will notfit together without voids and gaps,and this situation produces a discontinuous displacementfield.So,we again conclude that the strain components must be somehow related to yield continuous,single-valued displacements.We now pursue these particular relations.The process to develop these equations is based on eliminating the displacements from the strain-displacement relations.Working in index notation,we start by differentiating(2.2.6) twice with respect to x k and x l:e ij,kl¼12(u i,jklþu j,ikl)Through simple interchange of subscripts,we can generate the following additional relations:e kl,ij¼12(u k,lijþu l,kij)e jl,ik¼12(u j,likþu l,jik)e ik,jl¼12(u i,kjlþu k,ijl)Working under the assumption of continuous displacements,we can interchange the order of differentiation on u,and the displacements can be eliminated from the preceding set to gete ij,klþe kl,ijÀe ik,jlÀe jl,ik¼0(2:6:1) These are called the Saint Venant compatibility equations.Although the system would lead to 
81individual equations,most are either simple identities or repetitions,and only6are meaningful.These six relations may be determined by letting k¼l,and in scalar notation, they become@2e x @y2þ@2e y@x2¼2@2e xy@x@y@2e y @z2þ@2e z@y2¼2@2e yz@y@z@2e z @x2þ@2e x@z2¼2@2e zx@z@x@2e x @y@z ¼@@xÀ@e yz@xþ@e zx@yþ@e xy@z@2e y @z@x ¼@@yÀ@e zx@yþ@e xy@zþ@e yz@x@2e z @x@y ¼@@zÀ@e xy@zþ@e yz@xþ@e zx@y(2:6:2)It can be shown that these six equations are equivalent to three independent fourth-order relations(see Exercise2-14).However,it is usually more convenient to use the six second-order equations given by(2.6.2).In the development of the compatibility relations,we assumed that the displacements were continuous,and thus the resulting equations (2.6.2)are actually only a necessary condition.In order to show that they are also sufficient,consider two arbitrary points P and P o in an elastic solid,as shown in Figure 2-8.Without loss in generality,the origin may be placed at point P o .The displacements of points P and P o are denoted by u P i and u o i ,and the displacement ofpoint P can be expressed asu P i ¼u o i þðC du i ¼u o i þðC @u i @x j dx j (2:6:3)where C is any continuous curve connecting points P o and P .Using relation (2.1.6)for the displacement gradient,(2.6.3)becomesu P i ¼u o i þðC (e ij þ!ij )dx j (2:6:4)Integrating the last term by parts givesðC !ij dx j ¼!P ij x P j ÀðC x j !ij ,k dx k (2:6:5)where !P ij is the rotation tensor at point P .Using relation (2:1:7)2,!ij ,k ¼12(u i ,jk Àu j ,ik )¼12(u i ,jk Àu j ,ik )þ12(u k ,ji Àu k ,ji )¼12@@x j (u i ,k þu k ,i )À12@@x i(u j ,k þu k ,j )¼e ik ,j Àe jk ,i (2:6:6)Substituting results (2.6.5)and (2.6.6)into (2.6.4)yieldsu P i¼u o i þ!P ij x P j þðC U ik dx k (2:6:7)where U ik ¼e ik Àx j (e ik ,j Àe jk ,i ).P oFIGURE 2-8Continuity of displacements.Now if the displacements are to be continuous,single-valued functions,the line integral appearing in(2.6.7)must be the same for any curve C;that is,the integral must be independent of the path of integration.This implies that the integrand must be an exact differential,so that the value of the integral depends only on the end points.Invoking Stokes theorem,we can show that if the region is simply connected(definition of the term simply connected is postponed for the moment),a necessary and sufficient condition for the integral to be path independent is for U ik,l¼U il,ing this result yieldse ik,lÀd jl(e ik,jÀe jk,i)Àx j(e ik,jlÀe jk,il)¼e il,kÀd jk(e il,jÀe jl,i)Àx j(e il,jkÀe jl,ik) which reduces tox j(e ik,jlÀe jk,ilÀe il,jkþe jl,ik)¼0Because this equation must be true for all values of x j,the terms in parentheses must vanish, and after some index renaming this gives the identical result previously stated by the compati-bility relations(2.6.1):e ij,klþe kl,ijÀe ik,jlÀe jl,ik¼0Thus,relations(2.6.1)or(2.6.2)are the necessary and sufficient conditions for continuous, single-valued displacements in simply connected regions.Now let us get back to the term simply connected.This concept is related to the topology or geometry of the region under study.There are several places in elasticity theory where the connectivity of the region fundamentally affects the formulation and solution method. The term simply connected refers to regions of space for which all simple closed curves drawn in the region can be continuously shrunk to a point without going outside the region. 
Domains not having this property are called multiply connected.Several examples of such regions are illustrated in Figure2-9.A general simply connected two-dimensional region is shown in case(a),and clearly this case allows any contour within the region to be shrunk to a point without going out of the domain.However,if we create a hole in the region as shown in case(b),a closed contour surrounding the hole cannot be shrunk to a point without going into the hole and thus outside of the region.Thus,for two-dimensional regions,the presence of one or more holes makes the region multiply connected.Note that by introducing a cut between the outer and inner boundaries in case(b),a new region is created that is now simply connected. Thus,multiply connected regions can be made simply connected by introducing one or more cuts between appropriate boundaries.Case(c)illustrates a simply connected three-dimensional example of a solid circular cylinder.If a spherical cavity is placed inside this cylinder as shown in case(d),the region is still simply connected because any closed contour can still be shrunk to a point by sliding around the interior cavity.However,if the cylinder has a through hole as shown in case(e),then an interior contour encircling the axial through hole cannot be reduced to a point without going into the hole and outside the body.Thus,case(e)is an example of the multiply connected three-dimensional region.It was found that the compatibility equations are necessary and sufficient conditions for continuous,single-valued displacements only for simply connected regions.However, for multiply connected domains,relations(2.6.1)or(2.6.2)provide only necessary but。

Postgraduate Entrance Exam English Reading Comprehension: Detailed Analysis of Long and Difficult Sentences from Past Papers (with Translations)

考研英语阅读理解长难句真题分析详解含翻译1. The Bilski case involves a claimed patent on a method for hedging risk in the energy market .结构:•The Bilski case [主] involves [谓] a claimed patent on a method [宾] for hedging risk in the energy market [状]单词:Hedge vi.防备n.树篱防备手段词组:Hedging risk 规避风险Energy market 能源市场直译:比尔斯基案涉及一个声称的专利方法,为了在能源市场规避风险译文:比尔斯基案涉及到一项已申请的关于能源市场风险规避方法的专利重点:•Hedging risk 规避风险•Energy market 能源市场2. The Federal Circuit issued an unusual order stating that the case would be heard by all 12 of the court’s judges , rather than a typical panel of three , and that one issue it wants to evaluate is whether it would “reconsider”its State Street Bank ruling .结构:•The Federal Circuit [主] issued [谓] an unusual order [宾] stating•宾语从句:that the case [主] would be heard [谓(被动)] by all 12 of the court’s judges , rather than a typical panel of three , and that one issue [主] it [主] wants [谓] to evaluate is [谓]•表语从句:whether it [主] would “reconsider”[谓] its State Street Bank ruling [宾]单词:Judge n.法官裁判员v.判断猜测评价Justice n.司法争议法官审判员Panel n.小组面板evaluate vt.评估评价n.评估评价词组:Rather than 而不是宁可...也不愿...直译:联邦巡回法庭颁布了一个不同寻常的命令,这个命令说比尔斯基案将会由全部的12名法官听证,而不是传统的3人小组;而且他们想要评估的问题是他们是否会重新考虑美国道富银行的判决解析:•两个That并列,引导宾语从句,共同作非谓语动词stating的宾语•It wants to evaluate是定语从句修饰issue,省略了连接词that •whether引导表语从句,作is的表语译文:联邦巡回法庭发布的一项不寻常的指令声明,该案件将由全部的12名法官共同听审,而非通常的3人小组,另外,联邦挥发法庭还宣布,它想要评估的另一件事是,是否会应该“重审”其对道富银行的判决重点:•两个That并列,引导宾语从句,共同作非谓语动词stating的宾语•It wants to evaluate是定语从句修饰issue,省略了连接词that •whether引导表语从句,作is的表语3. The Federal Circuit’s action comes in the wake of a series of recent decisions by the Supreme Court that has narrowed the scope of protections for patent holders .结构:•The Federal Circuit’s action [主] comes [谓] in the wake of a series of recent decisions by the Supreme Court•定语从句:that [主] has narrowed [谓] the scope of protections for patent holders [宾]单词:Supreme a.最高的最重要的n.至高霸权词组:In the wake of 紧随...之后直译:联邦巡回法庭的行动到来了,紧随着最高法院最近发布一系列的命令之后,这个命令是缩减专利持有者的保护范围解析:•that引导定语从句修饰decisions,并在句子中作主语译文:联邦巡回法庭的上述行为是在最高法院最近做出的一系列判决之后开始的,这些判决缩小了对专利持有者的保护范围重点:•In the wake of 紧随...之后•that引导定语从句修饰decisions,并在句子中作主语4. Last April , for example , the justices signaled that too many patents were being upheld for “inventions”that are obvious .结构:•Last April , for example , the justices [主] signaled [谓]•宾语从句:that too many patents [主] were being [系] upheld for “inventions”•定语从句:that [主] are [系] obvious [表]单词:Upheld vt.维护维持invention n.发明直译:例如,去年四月法官们发出了一个信号,有太多的专利用来维持那些明显的发明解析:•too表示程度时,用来表达讽刺意味•That引导宾语从句作signaled的宾语•that引导定义从句修饰inventions,在句子中做主语译文:比如,去年4月,最高法院的法官表示,太多显而易见的发明被授予了专利权重点:•too表示程度时,用来表达讽刺意味•That引导宾语从句作signaled的宾语•that引导定义从句修饰inventions,在句子中做主语5. The judges on the Federal Circuit are “reacting to the anti-patent trend at the Supreme Court ,”says Harold C. Wegner , a patent attorney and professor at George Washington University Law School .结构:•The judges on the Federal Circuit [主] are “reacting[谓] to the anti-patent trend at the Supreme Court ,”[宾] says HaroldC. Wegner , a patent attorney and professor at GeorgeWashington University Law School单词:attorney n.律师直译:联邦巡回法庭的法官正在做出反应,对最高法院的反专利趋势。

Overview of the LDA Model

This understanding of the LDA model starts from the definition in the original paper (Latent Dirichlet Allocation, David M. Blei, Andrew Y. Ng, Michael I. Jordan):

Latent Dirichlet allocation (LDA) is a generative probabilistic model of a corpus. The basic idea is that documents are represented as random mixtures over latent topics, where each topic is characterized by a distribution over words.

LDA assumes the following generative process for each document w in a corpus D:
1. Choose N ~ Poisson(ξ).
2. Choose θ ~ Dir(α).
3. For each of the N words wn:
(a) Choose a topic zn ~ Multinomial(θ).
(b) Choose a word wn from p(wn | zn, β), a multinomial probability conditioned on the topic zn.

1. N follows a Poisson distribution.

The Poisson distribution is a discrete distribution, mainly suited to describing the number of times a random event occurs in a unit of time. A plot of the distribution can be seen here. Here N is the length of the document. The paper notes that the Poisson assumption is not critical and can be replaced by other discrete distributions.

2. θ is a k-dimensional vector.

This k-dimensional vector follows a Dirichlet distribution. The Dirichlet distribution is a continuous distribution over multiple random variables. To understand the Dirichlet distribution, one needs to understand conjugate priors. (This point still needs checking.) Its specific properties can be seen here. Here k is a number fixed in advance; how it is chosen is not stated, but in any case a k-dimensional vector of this kind is to be generated.
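To make the generative process above concrete, here is a minimal simulation sketch in R (mine, not from the paper or these notes); the vocabulary size V, the number of topics k, the Poisson mean xi, and the values of alpha and beta are all illustrative choices:

rdirichlet <- function(a) { g <- rgamma(length(a), shape = a); g / sum(g) }  # draw from Dir(a)
set.seed(1)
V <- 10                                # vocabulary size (illustrative)
k <- 3                                 # number of topics, fixed in advance
xi <- 8                                # Poisson mean for the document length
alpha <- rep(0.5, k)                   # Dirichlet parameter for theta
beta <- matrix(runif(k * V), k, V)     # word weights per topic (rows = topics)
beta <- beta / rowSums(beta)           # normalise each row into a distribution over words
N <- rpois(1, xi)                      # 1. choose the document length N ~ Poisson(xi)
theta <- rdirichlet(alpha)             # 2. choose topic proportions theta ~ Dir(alpha)
doc <- replicate(N, {
  z <- sample(1:k, 1, prob = theta)    # 3(a). choose a topic zn ~ Multinomial(theta)
  sample(1:V, 1, prob = beta[z, ])     # 3(b). choose a word wn from p(w | z, beta)
})
doc                                    # word indices of the generated document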

Postgraduate Entrance Exam English I: the Elizabeth Denham passage

DeepMind is one of the leading artificial intelligence (AI) companies in the world. The potential of this work applied to healthcare is very great, but it could also lead to further concentration of power in the tech giants. It is against that background that the information commissioner, Elizabeth Denham, has issued her damning verdict against the Royal Free hospital trust under the NHS, which handed over to DeepMind the records of 1.6 million patients in 2015 on the basis of a vague agreement which took far too little account of the patients' rights and their expectations of privacy. DeepMind has almost apologized. The NHS trust has mended its ways. Further arrangements - and there may be many - between the NHS and DeepMind will be carefully scrutinised to ensure that all necessary permissions have been asked of patients and all unnecessary data has been cleaned. There are lessons about informed patient consent to learn. But privacy is not the only angle in this case and not even the most important. Ms Denham chose to concentrate the blame on the NHS trust, since under existing law it "controlled" the data and DeepMind merely "processed" it. But this distinction misses the point that it is processing and aggregation, not the mere possession of bits, that gives the data value. The great question is who should benefit from the analysis of all the data that our lives now generate. Privacy law builds on the concept of damage to an individual from identifiable knowledge about them. That misses the way the surveillance economy works. The data of an individual there gains its value only when it is compared with the data of countless millions more. The use of privacy law to curb the tech giants in this instance feels slightly maladapted. This practice does not address the real worry. It is not enough to say that the algorithms DeepMind develops will benefit patients and save lives. What matters is that they will belong to a private monopoly which developed them using public resources. If software promises to save lives on the scale that drugs now can, big data may be expected to behave as big pharma has done. We are still at the beginning of this revolution and small choices now may turn out to have gigantic consequences later. A long struggle will be needed to avoid a future of digital feudalism. Ms Denham's report is a welcome start.

Essence of Decision

Essence of DecisionEssence of Decision: Explaining the Cuban Missile Crisis is an analysis, by political scientist Graham T. Allison, of the Cuban Missile Crisis. Allison used the crisis as a case study for future studies into governmental decision-making. The book became the founding study of the John F. Kennedy School of Government, and in doing so revolutionized the field of international relations. Allison originally published the book in 1971. In 1999, because of new materials available (including tape recordings of the U.S. government's proceedings), he rewrote the book with Philip Zelikow.The title is based on a speech by John F. Kennedy, in which he said, "The essence of ultimate decision remains impenetrable to the observer - often, indeed, to the decider himself."ThesisWhen he first wrote the book, Allison contended that political science and the study of international relations were saturated with rational expectations theories inherited from the field of economics. Under such a view, the actions of states are analyzed by assuming that nations consider all options and act rationally to maximize their utility.Allison attributes such viewpoints to the dominance of economists such as Milton Friedman, statesmen such as Robert McNamara and Henry Kissinger, disciplines such as game theory, and organizations such as the RAND Corporation. However, as he puts it:It must be noted, however, that an imaginative analyst can construct an account of value-maximizing choice for any action or set of actions performed by a government.Or, to put it bluntly, this approach (which Allison terms the "Rational Actor Model") violates the law of falsifiability. Also, Allison notes that "rational" analysts must ignore a lot of facts in order to make their analysis fit their models.In response, Allison constructed three different ways (or "lenses") through which analysts can examine events: the "Rational Actor" model, the "Organizat ional Behavior" model, and the "Governmental Politics" model.To illustrate the models, Allison poses the following three questions in each section:Why did the Soviet Union decide to place offensive missiles in Cuba?Why did the United States respond to the missile deployment with a blockade?Why did the Soviet Union withdraw the missiles?The "Rational Actor" ModelThe origin of Allison's first model is explained above. Basically, under this theory: Governments are treated as the primary actor.The government examines a set of goals, evaluates them according to their utility, then picks the one that has the highest "payoff."Under this theory, Allison explains the crisis like this:John F. Kennedy, in 1961, revealed that the Soviet Union, despite rhetoric, had far fewer ICBMs than it claimed. In response, Nikita Khrushchev ordered nuclear missiles with shorter rangesinstalled in Cuba. In one move, the Soviets bridged the "missile gap" while scoring points in the Cold War. Based on Kennedy's failure to back up the Bay of Pigs Invasion, they believed the U.S. wouldn't respond harshly.Kennedy and his advisors (EXCOMM) evaluated a number of options, ranging from doing nothing to a full invasion of Cuba. A blockade of Cuba was chosen because it wouldn't necessarily escalate into war, and because it forced the Soviets to make the next move.Because of mutually assured destruction by a nuclear war, the Soviets had no choice but to bow to U.S. 
demands and remove the weapons.The Organizational Process ModelAllison noted there were many facts that the rational model had to ignore, such as why the Soviets failed to camouflage the nuclear sites during construction, but did so only after U-2flights pinpointed their locations.He cited work by James G. March and Herbert Simon, which argue that existing governmental bureaucracy places limits on a nation's actions, and often dictates the final outcome. He then proposed the following "organizational process" model propositions:When faced with a crisis, government leaders don't look at it as a whole, but break it down and assign it according to pre-established organizational lines.Because of time and resource limitations, rather than evaluating all possible courses of action to see which one is most likely to work, leaders settle on the first proposal that adequately addresses the issue, which Simon termed "satisficing."Leaders gravitate towards solutions that limit short-term uncertainty (emphasis on "short-term"). Organizations follow set "repertoires" and procedures when taking actions.Because of the large resources and time required to fully plan and mobilize actions within a large organization (or government), leaders are effectively limited to pre-existing plans.Under this theory, the crisis is explained thus:Because the Soviets never established nuclear missile bases outside of their country at the time, they assigned the tasks to established departments, which in turn followed their own set procedures. However, their procedures were not adapted to Cuban conditions, and as a result, mistakes were made that allowed the U.S. to quite easily learn of the program's existence. Such mistakes included such gaffes as supposedly undercover Soviet troops decorating their barracks with Red Army Stars viewable from above.Kennedy and his advisors never really considered any other options besides a blockade or air strikes, and initially, were almost unanimously in favor of the air strikes. However, such attacks created massive uncertainty because the U.S. Air Force couldn't guarantee it would disable all the nuclear missiles. Additionally, although Kennedy wanted a "surgical" air strike that would destroy the missiles without inflicting extensive damage, the existing Air Force plan required extensive bombing that would have created more collateral damage than Kennedy desired. Because the U.S. Navy already had considerable strength in the field, because there was a pre-existing plan in place for a blockade, and because Kennedy was able to communicate directly with the fleet's captains, members fell back on the blockade as the only safe option.The Soviets simply did not have a plan to follow if the U.S. took decisive action against their missiles. Khrushchev's communications indicated a high degree of desperation. Without anyback-up plan, the Soviets had to withdraw.The "Governmental Politics" ModelAfter reading works by Richard Neustadt and Samuel P. Huntington, among others, Allison proposed a third model, which takes account of court politics(or "palace politics"). While statesmen don't like to admit they play politics to get things done, especially in h igh-stakes situations such as the Cuban missile crisis, they nonetheless do.Allison proposed the following propositions for this model:A nation's actions are best understood as the result of politicking and negotiation by its top leaders. 
Even if they share a goal, leaders differ in how to achieve it because of such factors as personal interests and background.Even if a leader holds absolute power (i.e., the President of the United States is technically the commander-in-chief), the leader must gain a consensus with his underlings or risk having his order misunderstood or, in some cases, ignored.Related to the above proposition, the make-up of a leader's entourage will have a large effect on the final decision (i.e., an entourage of "yes men" will create a different outcome than a group of advisors who are willing to voice disagreement).Leaders have different levels of power based on charisma, personality, skills of persuasion, and personal ties to decision-makers.If a leader is certain enough, they will not seek input from their advisors, but rather, approval. Likewise, if a leader has already implic itly decided on a particular course of action, an advisor wishing to have influence must work within the framework of the decision the leader has already made.If a leader fails to reach a consensus with his inner circle (or, at least, the appearance of a consensus), opponents may take advantage of these disagreements. Therefore, effective leaders must create a consensus.Because of the possibilities of miscommunication, misunderstandings, and downright disagreements, different leaders may take actions that the group as a whole would not approve of. Allison had to admit that, because the Soviets were not as open with their internal affairs as the Americans, he simply didn't have enough data to fully interpret the crisis with this model. Nonetheless, he made the following attempt:Khrushchev came under increasing fire from the Presidium because of Kennedy's revelation of the Soviet lack of ICBMs, as well as American successes in the Berlin Airlift. Also, the Soviet economy was being stretched, and military leaders were unhappy with Khrushchev's decision to cut the size of the Red Army. Placing missiles in Cuba was a cheap and quick way for him to secure his political base.Because of the failure of the Bay of Pigs invasion, Republicans in the Congress made Cuban policy into a major issue for the upcoming congressional elections later in 1962. Therefore, Kennedy immediately decided on a strong response rather than a diplomatic one. Although a majority of EXCOMM initially favored air strikes, those closest to the president - such as his brother and Attorney General, Robert Kennedy, and special counsel Theodore Sorensen - favored the blockade. At the same time, Kennedy got into arguments with proponents of the air strikes, such as Air Force General Curtis LeMay. After the Bay of Pigs Invasion fiasco, Kennedy alsodistrusted the CIA and its advice. This combination of push and pull led to the implication of a blockade.With his plans thwarted, Khrushchev tried to save face by pointing to American missiles in Turkey, a position similar to the Cuban missiles. While Kennedy refused to move these missiles "under duress," he allowed Robert Kennedy to reach a deal with Soviet ambassador Anatoly Dobrynin, in which the Turkish missiles (which Kennedy ordered removed prior to the crisis) would be quietly removed several months later. Publicly, Kennedy also agreed never to invade Cuba.ImplicationsWhen the book was first published, Allison's primary message was that the concept of mutually assured destruction as a barrier to nuclear war was unfounded. 
By looking at organizational and political models, such an outcome was quite possible: nations, against what was predicted by the rational viewpoint, could indeed "commit suicide." He pointed to several incidents in history that seemed to back this assertion. His most salient point: prior to the attack at Pearl Harbor, Japanese military and civilian leaders, including those responsible for making the decision, were fully aware that they lacked the industrial capacity and military might to win a war against the U.S. They went ahead and attacked anyway.

He also believed that the organizational model explained otherwise inexplicable gaffes in military history. To return to 1941, he noted that the U.S. intercepted enough evidence to indicate that Japan was about to attack Pearl Harbor, yet the commander did not prepare. The answer, Allison revealed, was not some conspiracy, but that what the intelligence community viewed as a "threat of attack," the commander interpreted as a "threat of sabotage." This miscommunication, due to different viewpoints, allowed the attack to be pulled off successfully; as Allison sarcastically noted, having U.S. planes lined up wing-to-wing and surrounded by armed guards was a good plan for preventing sabotage, but not for surviving an aerial attack.

Likewise, the political process model explained otherwise confusing affairs. Allison pointed to the decision by General Douglas MacArthur to defy his orders during the Korean War and march too far north. The reason was not a "rational" change in U.S. intentions but, rather, MacArthur's disagreements with Harry Truman and other policymakers, and the fact that officials allowed MacArthur to make what they considered unwise moves because of concerns over political backlash due to the general's public popularity.

Above all, he described the use of rational actor models as dangerous. By relying on such models (and modes of thinking), people make unreliable assumptions about reality, which can have disastrous consequences. Part of what allowed the attack on Pearl Harbor to be pulled off was the assumption that, since Japan would lose such a war, it would never dare attack. The assumption under MAD is that nobody will ever start a nuclear war because of its consequences. However, humans are not inextricably bound to act in a rational manner, as history has proven time and time again.

While Allison did not claim that either of his two additional models could fully explain anything, he noted that policymakers and analysts alike would benefit from stepping away from the traditional model and exploring alternate viewpoints (although this last remark could be viewed as facetious on Allison's part).

Criticism

The book is part of an ongoing argument between supporters of rational expectations theories and analysts who look for alternative explanations. Milton Friedman, in particular, countered that, even if rational expectations theories do not describe reality per se, they should be kept since they provide accurate predictions (instrumentalism). Allison countered that Friedman had not provided enough evidence to demonstrate that his theories actually predict anything, and criticized his arguments as unscientific. Another argument (again, made by Friedman) is that the information needed for Allison's bureaucratic and political models is so large that it is impractical to use in such a crisis. Allison conceded this is true, but argued that it does not mean a person should automatically revert to the rational actor worldview.

Moreover, Allison pointed out that the "rational actor" model continues to be applied even in long-term analyses (i.e., analyses that take place long after the event or "crisis" is past). In Essence of Decision, Allison suggests that one reason for the popularity of rational actor models is that, compared to other models, they require relatively little data and provide researchers with an "inexpensive approximation" of the situation. Allison also quotes Thomas Schelling's description of rationalistic thinking and vicarious problem solving: "You can sit in your armchair and try to predict how people will behave by asking how you would behave if you had your wits around you. You get, free of charge, a lot of vicarious, empirical behavior."

Finally, in Allison's first edition (1971), he was unable to fully explore his theories because much of the information was still classified. As a result, he made a number of assumptions of his own. Following the collapse of the Soviet Union and the release of American recordings of EXCOMM, this new information (included in the revised 1999 edition) sometimes agreed with Allison's assumptions, but sometimes didn't. For example, in 1971 he guessed that Kennedy must have made an "under the table" agreement concerning the Turkish missiles, probably using his brother as a liaison. The American tapes confirmed this. However, Allison also guessed, in 1971, that Khrushchev must have formed his own "EXCOMM," or committee of advisors, to aid him during the crisis, and even named the Russian leaders he believed were with Khrushchev at the time. The Soviet records revealed that these individuals were not present, and that Khrushchev was effectively stuck alone in his office during the crisis, without the type of support Kennedy had.

In a theoretical model, decision making

In a theoretical model, decision making

In a theoretical model of decision making, individuals are assumed to gather and process information in a rational and systematic manner in order to arrive at the best possible choice. This framework typically involves weighing the costs and benefits of different options, considering the probabilities of various potential outcomes, and evaluating the impact of each decision on one's goals and objectives.

One common theoretical model of decision making is expected utility theory, which posits that individuals make choices based on the expected value of each option, taking into account both the potential gains and losses associated with each possible outcome. This model assumes that individuals are able to accurately assess the probabilities of different outcomes and to make decisions that maximize their expected utility, or satisfaction.

Another influential theoretical perspective on decision making is bounded rationality, which suggests that individuals do not always have the capacity to gather and process all relevant information when making decisions, owing to cognitive limitations and time constraints. Instead, individuals use heuristics and shortcuts to simplify the decision-making process and "satisfice," that is, choose the first option that meets their criteria, rather than exhaustively searching for the best possible choice.

Additionally, some theoretical models of decision making incorporate emotional and motivational factors, recognizing that individuals' choices are often influenced by their desires, fears, and emotional responses to different options. This may involve considering the role of affective forecasting, or predicting one's emotional reactions to different outcomes, in the decision-making process.

Overall, theoretical models of decision making provide valuable frameworks for understanding and predicting how individuals make choices in various contexts, and they serve as the foundation for further empirical research and practical applications in fields such as economics, psychology, and management.
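To make the contrast between these two decision rules concrete, here is a minimal Python sketch. It is purely illustrative: the options, probabilities, payoffs, and the aspiration threshold are invented for the example and are not drawn from any particular study discussed above.

# Illustrative sketch: expected-utility maximization vs. satisficing.
# All options, probabilities, payoffs, and the aspiration level below are
# made up purely to demonstrate the two decision rules described in the text.

options = {
    # option name: list of (probability, payoff) pairs
    "A": [(0.6, 60), (0.4, -15)],   # expected utility 30
    "B": [(0.5, 120), (0.5, -30)],  # expected utility 45
    "C": [(1.0, 20)],               # expected utility 20
}

def expected_utility(lottery):
    """Expected value of a list of (probability, payoff) pairs."""
    return sum(p * u for p, u in lottery)

# Expected utility theory: evaluate every option, pick the maximum.
maximizer_choice = max(options, key=lambda o: expected_utility(options[o]))

# Bounded rationality (satisficing): accept the first option whose expected
# utility clears an aspiration level, without comparing the remaining options.
ASPIRATION = 25
satisficer_choice = next(o for o in options
                         if expected_utility(options[o]) >= ASPIRATION)

for name, lottery in options.items():
    print(f"{name}: EU = {expected_utility(lottery):.1f}")
print("Maximizer chooses:", maximizer_choice)    # B (highest expected utility)
print("Satisficer chooses:", satisficer_choice)  # A (first to clear the threshold)

The maximizer compares every option and picks the one with the highest expected utility, while the satisficer stops at the first option that clears its aspiration level; that stopping rule is exactly the difference between expected utility theory and bounded rationality described above.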

Two-dimensional Quantum Field Theory, examples and applications

Two-dimensional Quantum Field Theory, examples and applications

Abstract: The main principles of two-dimensional quantum field theories, in particular two-dimensional QCD and gravity, are reviewed. We study non-perturbative aspects of these theories which make them particularly valuable for testing ideas of four-dimensional quantum field theory. The dynamics of confinement and the theta vacuum are explained using the non-perturbative methods developed in two dimensions. We describe in detail how the effective action of string theory in non-critical dimensions can be represented by Liouville gravity. By comparing the helicity amplitudes in four-dimensional QCD to those of the integrable self-dual Yang-Mills theory, we extract a four-dimensional version of two-dimensional integrability.
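For orientation, the Liouville theory mentioned in the abstract is usually defined by an action of roughly the following form, written here in one common convention; the normalization and couplings used in the review itself may differ:

S_L[\phi] \;=\; \frac{1}{4\pi}\int d^2\sigma\,\sqrt{\hat g}\,\Big(\hat g^{ab}\,\partial_a\phi\,\partial_b\phi \;+\; Q\,\hat R\,\phi \;+\; 4\pi\mu\, e^{2b\phi}\Big),
\qquad Q = b + \frac{1}{b}, \qquad c_L = 1 + 6Q^2,

where \hat g is a fixed reference metric, \hat R its scalar curvature, and \mu the cosmological constant; in non-critical string theory the central charge c_L is tuned so that the total conformal anomaly of matter, Liouville field, and ghosts cancels.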
Contents (excerpt): 5 Four-dimensional analogies and consequences; 6 Conclusions and Final Remarks

EngelWestJPE

EngelWestJPE

485[Journal of Political Economy,2005,vol.113,no.3]᭧2005by The University of Chicago.All rights reserved.0022-3808/2005/11303-0002$10.00Exchange Rates and FundamentalsCharles Engel and Kenneth D.WestUniversity of Wisconsin and National Bureau of Economic ResearchWe show analytically that in a rational expectations present-value model,an asset price manifests near–random walk behavior if fun-damentals are I(1)and the factor for discounting future fundamentals is near one.We argue that this result helps explain the well-known puzzle that fundamental variables such as relative money supplies,outputs,inflation,and interest rates provide little help in predicting changes in floating exchange rates.As well,we show that the data do exhibit a related link suggested by standard models—that the exchange rate helps predict these fundamentals.The implication is that exchange rates and fundamentals are linked in a way that is broadly consistent with asset-pricing models of the exchange rate.I.IntroductionA long-standing puzzle in international economics is the difficulty of tying floating exchange rates to macroeconomic fundamentals such as money supplies,outputs,and interest rates.Our theories state that the exchange rate is determined by such fundamental variables,but floating exchange rates between countries with roughly similar inflation rates are in fact well approximated as random walks.Fundamental variables do not help predict future changes in exchange rates.Meese and Rogoff (1983a ,1983b )first established this result.They evaluated the out-of-sample fit of several models of exchange rates,using We thank Shiu-Sheng Chen,Akito Matsumoto,Benjamin T.West,and Yu Yuan for research assistance;the National Science Foundation for financial support;and two anon-ymous referees,the editor,and many seminar audiences for helpful comments.Portions of this paper were completed while West was a Houblon-Norman Fellow at the Bank of England and the Professorial Fellow in Monetary Economics at Victoria University and the Reserve Bank of New Zealand.486journal of political economy data from the1970s.They found that by standard measures of forecast accuracy,such as the mean-squared deviation between predicted and actual exchange rates,accuracy generally increased when one simply forecast the exchange rate to remain unchanged compared to when one used the predictions from the exchange rate models.While a large number of studies have subsequently claimed tofind success for various versions of fundamentals-based models,sometimes at longer horizons and over different time periods,the success of these models has not proved to be robust.A recent comprehensive study by Cheung,Chinn, and Pascual(2002,19)concludes that“the results do not point to any given model/specification combination as being very successful.On the other hand...,it may be that one model will do well for one exchange rate,and not for another.”In this paper,we take a new line of attack on the question of the link between exchange rates and fundamentals.We work with a conventional class of asset-pricing models in which the exchange rate is the expected present discounted value of a linear combination of observable fun-damentals and unobservable shocks.Linear driving processes are pos-ited for fundamentals and shocks.Wefirst present a theorem concerning the behavior of an asset price determined in a present-value model.We show analytically that in the class of present-value models we consider,asset prices will follow a pro-cess arbitrarily close to a 
random walk if(1)at least one forcing variable (observable fundamental or unobservable shock)has a unit autore-gressive root and(2)the discount factor is near unity.So,in the limit, as the discount factor approaches unity,the change in the time t assettϪ1 price will be uncorrelated with information known at time.We explain below that our result is not an application of the simple efficient markets model of Samuelson(1965)and others.When that model is applied to exchange rates,it implies that cross-country interest rate differentials will predict exchange rate changes and thus that exchange rates will not follow a random walk.Intuitively,as the discount factor approaches unity,the model puts relatively more weight on fundamentals far into the future in explaining the asset price.Transitory movements in the fundamentals become rel-atively less important than the permanent components.Imagine per-forming a Beveridge-Nelson decomposition on the linear combination of fundamentals that drive the asset price,expressing it as the sum of a random walk component and a transitory component.The class of theoretical models we are considering then expresses the asset price as the discounted sum of the current and expected future fundamentals. As the discount factor approaches one,the variance of the change of the discounted sum of the random walk component approaches infinity, whereas the variance of the change of the stationary component ap-exchange rates and fundamentals487 proaches a constant.So the variance of the change of the asset price is dominated by the change of the random walk component as the dis-count factor approaches one.We view as unexceptionable the assumption that a forcing variable has a unit root,at least as a working hypothesis for our study.The assumption about the discount factor is,however,open to debate.We note that in reasonable calibrations of some exchange rate models,this discount factor in fact is quite near unity.Of course our analytical result is a limiting one.Whether a discount factor of0.9or0.99or0.999is required to deliver a process statistically indistinguishable from a random walk depends on the sample size used to test for random walk behavior and the entire set of parameters of the model.Hence we present some correlations calculated analytically in a simple stylized model.We assume a simple univariate process for fundamentals,with parameters chosen to reflect quarterly data from the recentfloating period.Wefind that discount factors above0.9suffice to yield near-zero correlations between the period t exchange rate and tϪ1period information.We do not attempt to verify our theoretical conclusion that large discount factors account for random walk behavior in exchange rates using any particular fundamentals model from the literature.That is,we do not pick specific models that we claim satisfy the conditions of our theorem and then estimate them and verify that they produce random walks.But if the present-value models of exchange rates imply random walk behavior,so that exchange rate changes are unpredictable,how then can we validate the models?We ask instead if these conventional models have implications for whether the exchange rate helps predict funda-mentals.It is plausible to look in this direction.Surely much of the short-termfluctuation in exchange rates is driven by changes in expec-tations about the future.If the models are good approximations and expectations reflect information about future fundamentals,the ex-change rate changes will likely be useful in 
forecasting these funda-mentals.So these models suggest that exchange rates Granger-cause the ing quarterly bilateral dollar exchange rates,1974–2001,for the dollar versus the currencies of the six other Group of Seven countries,wefind some evidence of such causality,especially for nominal variables.The statistical significance of the predictability is not uniform and suggests a link between exchange rates and fundamentals that perhaps is modest in comparison with the links between other sets of economic variables.But in our view,the statistical predictability is notable in light of the far weaker causality from fundamentals to exchange rates.For countries and data series for which there is statistically significant evidence of Granger causality,we next gauge whether the Granger cau-488journal of political economy sality results are consistent with our models.We compare the correlation of exchange rate changes with two estimates of the change in the present discounted value of fundamentals.One estimate uses only the lagged value of fundamentals.The other uses both the exchange rate and own lags.Wefind that the correlation is substantially higher when the exchange rate is used in estimating the present discounted value.To prevent confusion,we note that ourfinding that exchange rates predict fundamentals is distinct from ourfinding that large discount factors rationalize a random walk in exchange rates.It may be reasonable to link the twofindings.When expectations of future fundamentals are very important in determining the exchange rate,it seems natural to pursue the question of whether exchange rates can forecast those fun-damentals.But one can be persuaded that exchange rates Granger-cause fundamentals and still argue that the approximate random walk in exchange rates is not substantially attributable to a large discount factor. In the class of models we consider,all our empirical results are consistent with at least one other explanation,namely,that exchange rate move-ments are dominated by unobserved shocks that follow a random walk. The plausibility of this explanation is underscored by the fact that we generally fail tofind cointegration between the exchange rate and ob-servable fundamentals,a failure that is rationalized in our class of models by the presence of an I(1)(though not necessarily random walk)shock. 
As well,the random walk also can arise in models that fall outside the class we consider.It does so in models with small-sample biases,perhaps combined with nonlinearities/threshold effects(see Taylor,Peel,and Sarno2001;Kilian and Taylor2003;Rossi2003).Exchange rates will still predict fundamentals in such models,though a nonlinear fore-casting process may be required.Our suggestion that the exchange rate will nearly follow a random walk when the discount factor is close to unity means that forecasting changes in exchange rates is difficult but perhaps still possible.Some recent studies have found success at forecasting changes in exchange rates at longer horizons or using nonlinear methods,and further re-search along these lines may prove fruitful.MacDonald and Taylor (1994),Chinn and Meese(1995),and Mark(1995)have all found some success in forecasting exchange rates at longer horizons imposing long-run restrictions from monetary models.Groen(2000)and Mark and Sul(2001)find greater success using panel methods.Kilian and Taylor (2003)suggest that models that incorporate nonlinear mean reversion can improve the forecasting accuracy of fundamentals models,though it will be difficult to detect the improvement in out-of-sample forecasting exercises.The paper is organized as follows.Section II presents the theorem that the random walk in asset prices may result from a discount factorexchange rates and fundamentals 489near one in a present-value model.Section III demonstrates how the theorem applies to some models of exchange rates.Section IV presents evidence that changes in exchange rates help predict fundamentals.Section V presents conclusions.The Appendix has some algebraic de-tails.An additional appendix containing empirical results omitted from the paper to save space is available on request.II.Random Walk in Asset Prices as the Discount Factor Goes toOneWe consider models in which an asset price,,can be expressed as a s t discounted sum of current and expected future “fundamentals.”We examine asset-pricing models of the formϱϱj j s p (1Ϫb )b E (a x )ϩb b E (a x ),0!b !1,(1)͸͸t t 1t ϩj t 2t ϩj j p 0j p 0where is the vector of fundamentals,b is a discount factor,and x n #1t and are vectors.For example,the model for stock prices a a n #112considered by Campbell and Shiller (1987)and West (1988)has this form,where is the level of the stock price,the dividend (a scalar),s x t t ,and .The log-linearized model of the stock price of a p 0a p 112Campbell and Shiller (1988)also has this form,where is the log of s t the stock price,is the log of the dividend,,and .The x a p 1a p 0t 12term structure model of Campbell and Shiller also is a present-value model,where is the yield on a consol,is the short-term rate,s x t t ,and .In Section III,we review models in which is the a p 1a p 0s 12t log of the exchange rate and contains such variables as interest rates x t and logs of prices,money supplies,and income.We spell out here the sense in which the asset price should follow a random walk for a discount factor b that is near one.Assume that at least one element of the vector is an I(1)process,whose Wold in-x t novation is the vector .Our result requires that either (1)n #1e t and or (2),with the order of integration a x ∼I(1)a p 0a x ∼I(1)1t 22t of essentially unrestricted (I(0),I(1),or identically zero).In either a x 1t case,for b near one,will be well approximated by a linear combi-D s t nation of the elements of the unpredictable innovation .In a sense e t made precise in the Appendix,this 
approximation is arbitrarily good for b arbitrarily near one.This means,for example,that all autocor-relations of will be very near zero for b very near one.D s t Of course,there is continuity in the autocorrelations in the following sense:for b near one,the autocorrelations of will be near zero if the D s t previous paragraph’s condition that certain variables are I(1)is replaced with the condition that those variables are I(0)but with an autoregres-490journal of political economy TABLE 1Population Autocorrelations and Cross Correlations of D s tb (1)J 1(2)J (3)Correlation of with :D s t D s t Ϫ1(4)D s t Ϫ2(5)D s t Ϫ3(6)D x t Ϫ1(7)D x t Ϫ2(8)D x t Ϫ3(9)1..50 1.0.3.15.05.01.16.05.012..5.27.14.07.28.14.073..8.52.42.34.56.44.364..90 1.0.3.03.01.00.03.01.005..5.05.03.01.06.03.016..8.09.07.06.13.11.097..95 1.0.3.02.01.00.02.01.008..5.03.01.01.03.01.019..8.04.04.03.07.05.0410..90.90.5.04Ϫ.01Ϫ.03.02Ϫ.03Ϫ.0511..90.95.5.05.01Ϫ.01.04Ϫ.00Ϫ.0212..95.95.5.02Ϫ.00Ϫ.01.01Ϫ.02Ϫ.0313..95.99.5.02.01.00.03.01Ϫ.00Note.—The model is or .The scalar variable follows an AR(2)process with ϱϱj j s p (1Ϫb )͸b E x s p b ͸b E x x t t t ϩj t t t ϩj t j p 0j p 0autoregressive roots and J .When ,with parameter J .The correlations in cols.4–9were computed J J p 1.0D x ∼AR(1)11t analytically.If ,as in rows 1–9,then in the limit,as ,each of these correlations approaches zero.J p 1.0b r 11sive root very near one.For a given autoregressive root less than one,the autocorrelations will not converge to zero as b approaches one.But they will be very small for b very near one.Table 1gives an indication of just how small “small”is.The table gives correlations of with time information when follows a scalar D s t Ϫ1x t t univariate AR(2).(One can think of and or and a p 0a p 1a p 1121.One can consider these two possibilities interchangeably since,a p 02for given ,the autocorrelations of are not affected by whether b !1D s t or not a factor of multiplies the present value of fundamentals.)1Ϫb Rows 1–9assume that —specifically,with parameter x ∼I(1)D x ∼AR(1)t t J .We see that for the autocorrelations in columns 4–6and the b p 0.5cross correlations in columns 7–9are appreciable.Specifically,suppose that one uses the conventional standard error of .Then when ͱ1/T ,a sample size larger than 55will likely suffice to reject the null J p 0.5that the first autocorrelation of is zero (since row 2,col.5,gives D s t and ).(In this argument,ͱcorr(D s ,D s )p 0.2690.269/[1/55]≈2.0t t Ϫ1we abstract from sampling error in estimation of the autocorrelation.)But for ,the autocorrelations are dramatically smaller.For b p 0.9and ,a sample size larger than 1,600will be required,b p 0.9J p 0.5since .Finally,in connection with the previous ͱ0.051/(1/1,600)≈2.0paragraph’s reference to autoregressive roots less than one,we see in rows 10–13in the table that if the unit root in is replaced by an x t autoregressive root of 0.9or higher,the autocorrelations and cross cor-relations of are not much changed.D s texchange rates and fundamentals 491To develop intuition on this result,consider the following example.Suppose that the asset price is determined by a simple equation:s p (1Ϫb )m ϩb r ϩbE (s ).t t t t t ϩ1The “no-bubbles”solution to this expectational difference equation is a present-value model like (1):ϱϱj j s p (1Ϫb )b E m ϩb b E r .͸͸t t t ϩj t t ϩj j p 0j p 0Assume that the first differences of the fundamentals follow first-order autoregressions:D m p fD m ϩe ;Dr p gDr ϩe .t t Ϫ1mt t t Ϫ1r t Then we can write the solution asf (1Ϫb )1bg b D s p D m ϩe 
ϩDr ϩe .t t Ϫ1mt t Ϫ1r t 1Ϫb f 1Ϫb f 1Ϫb g (1Ϫb )(1Ϫb g )Consider first the special case of .Then as ,r p 0b r 1D s ≈[1/(1Ϫt t .In this case,the variance of the change in the exchange rate is f )]e mt finite as .If ,then as ,.In this case,b r 1r (0b r 1D s ≈constant #e t t r t as b increases,the variance of the change in the exchange rate gets large,but the variance is dominated by the independently and identi-cally distributed term .e r t In Section III,we demonstrate the applicability of this result to exchange rates.III.Exchange Rate ModelsExchange rate models since the 1970s have emphasized that nominal exchange rates are asset prices and are influenced by expectations about the future.The “asset market approach to exchange rates”refers to models in which the exchange rate is driven by a present discounted sum of expected future fundamentals.Obstfeld and Rogoff (1996,529)say that “one very important and quite robust insight is that the nominal exchange rate must be viewed as an asset price .Like other assets,the exchange rate depends on expectations of future variables”(italics in the original).Frenkel and Mussa’s (1985)survey explains the asset market approach:These facts suggest that exchange rates should be viewed as prices of durable assets determined in organized markets (like stock and commodity exchanges)in which current prices re-flect the market’s expectations concerning present and future492journal of political economyeconomic conditions relevant for determining the appropriate values of these durable assets,and in which price changes are largely unpredictable and reflect primarily new information that alters expectations concerning these present and future economic conditions.(726)A variety of models relate the exchange rate to economic fundamen-tals and to the expected future exchange rate.We write this relationship ass p (1Ϫb )(f ϩz )ϩb (f ϩz )ϩbE s .(2)t 1t 1t 2t 2t t t ϩ1Here,we define the exchange rate as the log of the home currency s t price of foreign currency (dollars per unit of foreign currency if the United States is the home country).The terms and ()are f z i p 1,2it it economic fundamentals that ultimately drive the exchange rate,such as money supplies,money demand shocks,productivity shocks,and so forth.We differentiate between fundamentals that are observable to the econometrician,,and those that are not observable,.One possibility f z it it is that the true fundamental is measured with error,so that is the f it measured fundamental and the include the measurement error;an-z it other is that the are unobserved shocks.z it Upon imposing the “no-bubbles”condition that goes to zero j b E s t t ϩj as ,we have the present-value relationshipj r ϱϱϱj j s p (1Ϫb )b E (f ϩz )ϩb b E (f ϩz ).(3)͸͸t t 1t ϩj 1t ϩj t 2t ϩj 2t ϩj j p 0j p 0This equation has the form of equation (1),where we have a x p1t ϩj and .We now outline some models thatf ϩz a x p f ϩz 1t ϩj 1t ϩj 2t ϩj 2t ϩj 2t ϩj fit into this framework.A.Money Income ModelConsider first the familiar monetary models of Frenkel (1976),Mussa (1976),and Bilson (1978)and their close cousins,the sticky-price mon-etary models of Dornbusch (1976)and Frankel (1979).Assume that in the home country there is a money market relationship given bym p p ϩg y Ϫa i ϩv .(4)t t t t mt Here,is the log of the home money supply,is the log of the home m p t t price level,is the level of the home interest rate,is the log of output,i y t t and is a shock to money demand.Here and throughout we use the v mt term “shock”in a somewhat unusual 
sense.Our “shocks”potentially include constant and trend terms,may be serially correlated,and mayexchange rates and fundamentals493 include omitted variables that in principle could be measured.Assume that a similar equation holds in the foreign country.The analogousm*p*i*y*v*foreign variables are,,,,and,and the parameters of thet t t t mtforeign money demand are identical to the home country’s parameters. The nominal exchange rate equals its purchasing power parity(PPP) value plus the real exchange rate:s p pϪp*ϩq.(5)t t t tInfinancial markets,the interest parity relationship isE sϪs p iϪi*ϩr.(6)t tϩ1t t t trHere is the deviation from rational expectations uncovered interest tparity.It can be interpreted as a risk premium or an expectational error. Putting these equations together and rearranging,we get1s p[mϪm*Ϫg(yϪy*)ϩqϪ(vϪv*)Ϫar]t t t t t t mt mt t1ϩaaϩE s.(7)t tϩ11ϩaThis equation takes the form of equation(2)when the discount factorb p a/(1ϩa)is given by,the observable fundamentals are given by f p mϪm*Ϫg(yϪy*)z p qϪ(vϪ,and the unobservables are1t t t t t1t t mtv*)z pϪrand.As in Mark(1995),our empirical work in Section IV mt2t tg p1f p sets.We also investigate a version of this model setting1t and moving to.We do so largely because we wish to mϪm*yϪy*zt t t t1tconduct a relatively unstructured investigation into the link between exchange rates and various measures of fundamentals.But we couldmϪm*argue that we focus on becausefinancial innovation has madet tstandard income measures poor proxies for the level of transactions.s yϪy* Similarly,we investigate the relationship between and.t t t Equation(7)is implied by both theflexible-price and sticky-price versions of the monetary model.In theflexible-price monetarist modelsyof Frenkel(1976),Mussa(1976),and Bilson(1978),output,,and thetreal exchange rate,,are exogenous.In the sticky-price models ofqtDornbusch(1976)and Frankel(1979),these two variables are endog-enous.Because nominal prices adjust slowly,the real exchange rate is influenced by changes in the nominal exchange rate.Output is demand determined and may respond to changes in the real exchange rate, income,and real interest rates.Nonetheless,since equations(4)(and its foreign counterpart),(5),and(6)hold in the Dornbusch-Frankel model,one can derive relationship(7)in those models.Dornbusch and Frankel each consider special cases for the exogenous monetary pro-cesses(in Dornbusch’s model,all shocks to the money supply are per-494journal of political economy manent;Frankel considers permanent shocks to the level and to the growth rate of money).As a result of their assumption that all shocks are permanent,they each can express the exchange rate purely in terms of current fundamentals,which may obscure the general implication that exchange rates depend on expected future fundamentals.We note here that some recent exchange rate models developed from the “new open economy macroeconomics”yield relationships very sim-ilar to the ones we describe in this section.For example,in Obstfeld and Rogoff (2003),the exchange rate is given by (their eq.[30])ϱj s p b E [(1Ϫb )(m Ϫm *)Ϫb r ],(8)͸t t t ϩj t ϩj t ϩj j p 0where we have translated their notation to be consistent with ours.Equation (8)is in fact the forward solution to a special case of equation(7)above.The discount factor,b ,in Obstfeld and Rogoff’s model is related to the semi-elasticity of money demand exactly as in equation(7).However,their money demand function is derived from a utility-maximizing framework in which real balances 
appear in the utility func-tion,and their risk premium is derived endogenously from first r t principles.B.Taylor Rule ModelHere we draw on the burgeoning literature on Taylor rules.Let p p t denote the inflation rate and be the “output gap.”We assume g p Ϫp y t t Ϫ1t that the home country (the United States in our empirical work)follows a Taylor rule of the formg i p b y ϩb p ϩv .(9)t 1t 2t t In (9),,,and the shock contains omitted terms.1b 10b 11v 12t The foreign country follows a Taylor rule that explicitly includes exchange rates:g ¯i *p Ϫb (s Ϫs *)ϩb y *ϩb p *ϩv *.(10)t 0t t 1t 2t t In (10),,and is a target for the exchange rate.We shall ¯0!b !1s*0t 1Much of the Taylor rule literature—wisely,in our view—puts expected inflation in the monetary policy rule.Among other benefits,this facilitates thinking of the monetary authority as setting an ex ante real rate.We use actual inflation for notational simplicity.If expected inflation is in the monetary rule,then inflation in the formulas below is replaced by expected inflation.exchange rates and fundamentals495 assume that monetary authorities target the PPP level of the exchange rate:¯s*p pϪp*.(11)t t tsSince is measured in dollars per unit of foreign currency,the rule tindicates that,ceteris paribus,the foreign country raises interest rates when its currency depreciates relative to the target.Clarida,Gali,and Gertler(1998)estimate monetary policy reaction functions for Germany and Japan(using data from1979–94)of a form similar to equation(10). Theyfind that a1percent real depreciation of the mark relative to the dollar led the Bundesbank to increase interest rates(expressed in an-nualized terms)byfive basis points,whereas the Bank of Japan increased rates by nine basis points in response to a real yen depreciation relative to the dollar.As the next equation makes clear,our argument still follows if the United States were also to target exchange rates.We omit the exchange rate target in(9)on the interpretation that U.S.monetary policy has virtually ignored exchange rates except,perhaps,as an indicator. 
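Since the PDF extraction above has scrambled the displayed equations, the following is a cleaned-up restatement of the two policy rules and the exchange-rate target just introduced, reconstructed from the surrounding prose (numbering follows the paper; subscripts and coefficient restrictions are given as far as they can be recovered):

i_t = b_1\, y^g_t + b_2\, \pi_t + v_t, \qquad b_1 > 0,\; b_2 > 1, \tag{9}

i^*_t = -b_0\,(s_t - \bar{s}^*_t) + b_1\, y^{g*}_t + b_2\, \pi^*_t + v^*_t, \qquad 0 < b_0 < 1, \tag{10}

\bar{s}^*_t = p_t - p^*_t. \tag{11}

Here y^g denotes the output gap, \pi the inflation rate, v the monetary policy shock, and starred variables refer to the foreign country; subtracting (10) from (9) and combining the result with interest parity and the target (11) is what yields the present-value expressions derived next.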
Subtracting the foreign from the home money rule,we obtaing g¯iϪi*p b(sϪs*)ϩb(yϪy*)ϩb(pϪp*)ϩvϪv*.(12)t t0t t1t t2t t t tiϪi*Use interest parity(6)to substitute out for and(11)to sub-t tstitute out for the exchange rate target:b10g gs p(pϪp*)Ϫ[b(yϪy*)ϩb(pϪp*)t t t1t t2t t1ϩb1ϩb001ϩvϪv*ϩr]ϩE s.(13) t t t t tϩ11ϩbThis equation has the general form of(2)of the expected discounted1/(1ϩb) present-value models.The discount factor is equal to.Wef p pϪp*have.In our empirical work(in Sec.IV),we shall treat the 1t t tremaining variables as unobservable,so we haveg gz pϪ[b(yϪy*)ϩb(pϪp*)ϩvϪv*ϩr].2t1t t2t t t t tEquation(12)can be expressed another way,again using interest parity(6)and the equation for the target exchange rate(11):g gs p b(iϪi*)ϩb(pϪp*)Ϫb(yϪy*)Ϫb(pϪp*)t0t t0t t1t t2t tϪvϩv*Ϫ(1Ϫb)rϩ(1Ϫb)E s.(14) t t0t0t tϩ1This equation is very much like(13),except that it incorporates the interest differential,,as a“fundamental.”The discount factor iniϪi*t t496journal of political economythis formulation is given by .The observed fundamental is given1Ϫb 0by .In our empirical work,we treat the remainingf p i Ϫi *ϩp Ϫp *1t t t t t period t variables in equation (14)as unobserved.C.DiscussionWe begin by noting that the classic efficient markets model of Samuelson(1965)and others does not predict a random walk in exchange rates.The essence of this model is that there are no predictable profit op-portunities for a risk-neutral investor to exploit.If the U.S.interest rateis higher than the foreign interest rate by x percent,then the U.S.i i *t t dollar must be expected to fall by x percent over the period of theinvestment if there are to be no such opportunities.In terms of equation(6),then,the classic efficient markets model says that the risk premiumis zero and that a population regression of on will yieldr D s i Ϫi *t t ϩ1t t a coefficient of one.(For equities,the parallel prediction is that on theday on which a stock goes ex-dividend,its price should fall by the amountof the dividend [e.g.,Elton and Gruber 1970].)Our explanation yields a random walk approximation even when,asin the previous paragraph,uncovered interest parity holds.The readermay wonder how the data can simultaneously satisfy the following con-ditions:(1)a regression of on yields a nonzero coefficient,D s i Ϫi *t ϩ1t t and (2)is arbitrarily well approximated as a random walk (i.e.,s t is arbitrarily well approximated as white noise).The answer is thatD s t ϩ1when b is arbitrarily close to one,the of the regression of on2R D s t ϩ1will be arbitrarily close to zero and the correlation of withi Ϫi *D s t t t ϩ1will be arbitrarily small.It is in those senses that the random walki Ϫi *t t approximation will be arbitrarily good.The key question is not the logic of our result but its empirical validity.The result does not require uncovered interest parity,which was main-tained in the previous two paragraphs merely to clarify the relation ofour result to the standard efficient markets result.Instead,two condi-tions are required.The first is that fundamentals variables be very per-sistent—I(1)or nearly so.This is arguably the case with our data onthe observed fundamentals.We shall present evidence in Section IV thatwe cannot reject the null of a unit root in any of our data.Further,there is evidence in other research that the unobservable variables arevery persistent.For the money income model (eq.[7]),this is suggestedfor ,,and by the literature on money demand (e.g.,Sriram 2000),v q r mt t t PPP (e.g.,Rogoff 1996),and interest parity (e.g.,Engel 
1996).(Werecognize that theory suggests that a risk premium like is I(0);ourr t interpretation is that if is I(0),it has a very large autoregressive root.)r t We are not concerned if or other variables are highly persistent I(0)r t。

Information hiding, anonymity and privacy a modular approach

Information hiding, anonymity and privacy a modular approach

Information Hiding,Anonymity and Privacy:A Modular ApproachDominic Hughes S TANFORD U NIVERSITY Vitaly Shmatikov SRI I NTERNATIONALAbstractWe propose a new specification framework for information hiding properties suchas anonymity and privacy.The framework is based on the concept of a functionview,which is a concise representation of the attacker’s partial knowledge about afunction.We describe system behavior as a set of functions,and formalize differ-ent information hiding properties in terms of views of these functions.We presentan extensive case study,in which we use the function view framework to system-atically classify and rigorously define a rich domain of identity-related properties,and to demonstrate that privacy and anonymity are independent.The key feature of our approach is its modularity.It yields precise,formal specifi-cations of information hiding properties for any protocol formalism and any choiceof the attacker model as long as the latter induce an observational equivalence rela-tion on protocol instances.In particular,specifications based on function views aresuitable for any cryptographic process calculus that defines some form of indistin-guishability between processes.Our definitions of information hiding propertiestake into account any feature of the security model,including probabilities,ran-dom number generation,timing,etc.,to the extent that it is accounted for by theformalism in which the system is specified.Keywords:security,information hiding,logic,knowledge,Kripke structure,ver-ification,anonymity,privacy1IntroductionSecurity requirements for computer systems often involve hiding information from an outside observer.Secrecy,anonymity,privacy,and non-interference each require that it should be impossible or infeasible to infer the value of a particular system attribute—a transmitted credit card number,the identity of a website visitor,the sender and recip-ient of an email message—by observing the system within the constraints of a givenobserver model.If formal analysis techniques are to be used for the analysis and veri-fication of computer security,they must provide support for the formal specification of information hiding properties as well as formal reasoning about information leaked to the attacker by various events and actions occurring in the system.In this paper we adopt an epistemological tradition that can be traced back to the seminal works of Kripke[Kri63]and Hintikka[Hin62]:hiding information,modeled as the attacker’s lack of knowledge about the system,corresponds to indistinguishability of system states.As the starting point,we assume that we are given a set of system configurations equipped with an observational equivalence relation.Consequently, our methods apply to any computational model of the attacker that partitions the space of all possible system configurations into observational equivalence classes.A typical example is the specification of a security protocol in a cryptographic process calculus whose notion of equivalence is testing equivalence of processes in some attacker model.The following informal example illustrates the way in which we shall obtain formal definitions of security properties,parametrically in.For ease of presentation,in this example we restrict to the case where a communication or exchange between agents consists of a single message(for example,an email).Thus we have in mind a Kripke structure whose possible worlds(states)are all possible email exchanges,and for which represents the attacker’s inability to 
distinguish between possible worlds and.Below is a natural-language expansion of the predicate we shall later obtain(in Table2)for defining absolute sender anonymity:A BSOLUTE SENDER ANONYMITY holds if:for every possible world of the Kripke structure,for every message sent in,for every agent,there exists a possible world indistinguishable from(i.e.,)such that in,is the sender of.Thus,from the attacker’s perspective,the lineup of candidates for the sender of any given message is the entire set of agents.(More generally,would denote a full exchange or‘conversation’between agents,potentially consisting of more than one message,and transmitted through any medium.)This example,though informal,should convey the idea behind our formulation of security properties parametrically in.A key advantage of this modularity with respect to is the resulting leveraging of the expressive power of the underlying formalism(e.g.,process calculus)in which a protocol is specified.Depending on the formalism,the equivalence relation,which represents the attacker’s inability to distinguish certain system states,may take into ac-count features such as probability distributions,generation of nonces and new names, timing,etc.In this case,our framework and the formal specifications of information hiding properties derived using it will also take these features into account.Our rmation hiding properties can be formalized naturally using modal logics of knowledge[FHMV95,SS99].Such logics can be used to formulate di-2rect statements about the limits of an observer’s knowledge.Verifying whether a system satisfies a given information hiding property is more difficult,since in order to reason about informationflows,it is necessary to formalize the behavior of all agents com-prising the system as“knowledge-based programs.”This is often a non-trivial exercise, requiring expertise in the chosen logic.On the other end of the formal methods spectrum,approaches based on process algebras[Sch96,AG99,LMMS99,BNP99]are well suited to formalizing concurrent systems.A process algebra may include a formal model of cryptographic primitives—needed,e.g.,for the analysis of cryptographic protocols—and typically comes equipped with an equivalence relation such as bisimulation or testing equivalence that models an observer’s inability to distinguish between certain processes.Process algebras also pro-vide proof techniques for process equivalence.The disadvantage of the process algebra approach is that stating information hiding properties in terms of process equivalence is very subtle and error-prone,especially for complicated properties such as anonymity and privacy.We introduce a modular framework for reasoning about information hiding prop-erties,independent of any particular system specification formalism or epistemic logic (see Figure1).The cornerstone of our approach is the concept of a function view.A function view is a foundational domain-theoretic notion.It represents partial informa-tion about a function,and thus models an observer’s incomplete knowledge thereof. Remarkably,just the three attributes of a function view—graph,kernel,and image—suffice to model many kinds of partial knowledge an observer may have about the func-tion of interest.We demonstrate how any system specification formalism that provides an equivalence relation on system configurations induces function views,and how in-formation hiding properties can be stated naturally in terms of opaqueness of this view. 
Therefore,security properties specified in our framework may be amenable for formal verification using a wide variety of formal methods and techniques.Applications to anonymity and privacy.The application of our specification frame-work to anonymity and privacy is especially significant.Identity protection is an active area of computer security research.Many systems have been proposed that implement different,and sometimes even contradictory,notions of what it means to be“anony-mous.”Instead of a single“anonymity”or“privacy”property,there are dozens of dif-ferentflavors of anonymity,and understanding them in a systematic way is a major challenge in itself.There is also a need for rigorous formal specification and verification of identity-related security properties since such properties are often difficult to model using conventional formal analysis techniques.Structure of paper.The structure of the paper is as follows.In section2,we introduce function views.In section3,we show how opaqueness of function views can be used to formalize information hiding properties.In section4,we use our theory of function views and opaqueness to demonstrate how most notions of anonymity proposed in the research literature can be formalized in a uniform way and represented as predicates3Systemspecification Property specification(1)Previous approach:Process algebra easyhard easyParticular logicSystemspecification Property specification(3)Modularapproach easy easyAny logicAny processalgebra Interface layer Figure 1:Modular approach to formalizing information hiding properties.See sec-tion 3.6for a detailed explanation.4on observational equivalence classes,thus facilitating their verification in any crypto-graphic process algebra.Perhaps our most important practical result is a crisp distinc-tion between anonymity and privacy(the latter understood as“relationship anonymity”),which has implications for public policy.2Theory of Function ViewsWhat can one know about a function?One might know its output on acertain input,or that a certain point of lies in the image of but that another pointdoes not.One may know distinct inputs and such that,without necessarily knowing the value.One approach to modeling partial knowledge of a function is to use domain theoreticideas[Sco72,AJ94],defining an approximation of a function to be anypartial function.This traditional notion of an approximation as asubset of input-output behavior has been very successful in research into semantics ofprogramming languages[Sto77,Gun92].In this paper we introduce a new notion of partial knowledge of a function.A viewof a function comprises a non-deterministic approximation of its graph(a binary re-lation containing),a subset of its image,and an equivalence relation contained in itskernel.Function views form a distributive lattice whose maximal consistent elementscorrespond to fully determined functions and whose bottom element represents absenceof any knowledge.In section2.1we define three primitive forms of opaqueness,one foreach of component of a view,each formalizing an observer’s inability to discern certaininformation about.In section3we show how to formalize information hiding properties in terms ofopaqueness of functions defining system behavior.The most important aspect of thisformalization is that any Kripke structure representing an observer gives rise to func-tion views.In sections3.2and3.4,we show how function views are constructed auto-matically from the equivalence relation of a Kripke structure.We then show how anyopaqueness 
property can be formalized as a predicate on equivalence relations,henceon Kripke structures.In particular,if the system is specified in a process algebra that supports a notion ofobservational equivalence,we demonstrate how opaqueness-based security propertiesof the system can be expressed as predicates on.This conversion is parametric inin the sense that it works for any computational model of the attacker that partitionsthe space of all possible system configurations into observational equivalence classes.Therefore,we do not require technical machinery for reasoning about how the partialknowledge represented by function views is obtained.This knowledge is implicit in theequivalence relation induced by the attacker model.Since opaqueness properties canbe expressed as predicates on the equivalence classes of any process algebra,a user isfree to employ his or her favorite algebra and preferred technique for verifying suchpredicates.Our chosen definition of function view is simple yet expressive enough to formalize5a host of security properties and the relationships between them,as shown in section4.It is interesting and surprising that attacker knowledge in this complex setting can be reduced to triples consisting of an approximation of the graph,image and kernel of a function.In section2.5,we discuss the scope of our framework,including its limitations and possible extensions.2.1Partial knowledge of functionsWe re-examine our original question:what can one know about a function?Consider the following properties of a function.Its graph.The input-output behavior of,typically coded as a subset of.Its image.The subset of.Its kernel.The quotient induced by,i.e.,the equivalence relation ongiven by iff.This list is by no means exhaustive.These three properties suffice,however,for formal-izing many information hiding properties such as anonymity and privacy,and we focus on them for the rest of this paper.We define the following corresponding notions of knowledge of a function.Graph knowledge.A binary relation such that.Thusis a non-deterministic approximation of:is a set of candidates for the output of on,and is always a candidate.Image knowledge.A subset of such that.The fact that is an assertion that is an output of,without necessarily knowing any specific input that produces.Kernel knowledge.An equivalence relation on such that,i.e., only if.Thus the edges of are assertions about the equality of.Note that the second and third forms of knowledge are positive in the sense that each point or is a definite observation of the image or kernel.By contrast, graph knowledge is negative,since is a definite observation that, whereas does not imply(unless is a singleton).D EFINITION1Function knowledge of type is a triple where1.is a binary relation between and,2.is a subset of,and3.is an equivalence relation on.We say that is consistent with,denoted,if61.,2.,and3..From a theoretical or foundational perspective,the choice of these three particular com-ponents is somewhat ad hoc.The interesting and surprising point is that these three components alone allowed us to express a rich spectrum of security-related properties: see Figure2,and the taxonomy in section4.3.A lineup of type is a set of functions of type.Given function knowledge of type,define the associated lineup as the set of functions with which is consistent:Under the intuition that is an observer’s view of some unknown function,is the set of candidates for.Given function knowledge and of type, define(approximates,or refines),by(conjunction on the 
right).Upon identifying with function knowledge of type,this extends our earlier definition of consistency.Meet and join are pointwise:where is the transitive closure of.With top and bottom ,function knowledge of type thus forms a distributive lattice.2.2Knowledge closureOur motivation for the notion of function knowledge is to have a way to represent the knowledge of an observer who is trying to discern some properties of a hidden function.Ideally,we would use to formulate assertions about the observer’s inability to discern certain properties of.For example,if the cardinality of is at least two elements for each,we can assert that“the observer cannot determine the value of on any input.”In general,however,we cannot make direct assertions about information hiding in terms of a single component of the knowledge triple because the three components are not independent.For example,in the case above,suppose that is the whole of7(so the observer knows that produces the same output value on all inputs),and that is the singleton.Then from and the observer can infer that for all ,even if thefirst coordinate is such that is always of size at least2.In other words,from and,the observer can refine his or her knowledge of the graph of. In order to make sound assertions about information hiding using,we mustfirst take an appropriate form of deductive or inference closure.The correct definition arises as the closure operator induced by a Galois connection.Given a lineup of functions from to,define the associated function knowledge of type by.Thus is the-maximal function knowledge consistent with each.(In making this definition we identify a function with function knowledge.)Therefore if,then,and.P ROPOSITION1(G ALOIS CONNECTION)The maps and con-stitute a contravariant Galois connection between the lattice of function knowledge of type and the powerset-lattice of lineups of type,i.e.,iff .Proof.We must show that(1)iff(2)Suppose(1)holds.Then each satisfies,so(2)follows immediately from the fact that is a meet.Conversely,suppose(2)holds.Writing,(2)is equivalent to the conjunction of(3)(4)(5)We must show that for each,we have,i.e.,,,and .These follow from(3),(4)and(5)respectively.Given function knowledge define the closure2.3Function views and opaquenessD EFINITION2A view of a function is any closed function knowledge of type that is consistent with.To formalize information hiding,we shall require the following predicates on views.D EFINITION3(O PAQUENESS)Let be a view of.We define the following forms of opaqueness of under:Value opaqueness:–Given,is-value opaque if for all.In otherwords,the lack of information in the view is such that there are at leastcandidates for the output of on any given input.–Given,is-value opaque if for all.In otherwords,the lack of information in the view is such that any element of is apossibility for the value of on any input.–is absolutely value opaque if is-value opaque.In other words,theview gives away nothing about the input-output behavior of:for all,every is a candidate for.is image opaque if is empty.In other words,the view is such that for no element can one definitely assert that lies in the image of.is kernel opaque if is equality on.In other words,beyond the trivial case ,with this view of no equalities can be inferred.P ROPOSITION2Let be a function of type.The various forms of opaque-ness of a view of can be characterized logically as follows,with reference to the corresponding lineup:-Value opaqueness-Value opaquenessImage opaquenessKernel opaquenessFor absolute 
value opaqueness,take.Proof.Each property is simply a restatement of the corresponding property defined in Definition3.92.4Composition of function viewsCompositionality is a notoriously difficult problem in computer security.It is well known that security properties typically do not compose,and reasoning about what an observer can learn upon putting together two views of the system is very involved.Al-though composition of function views is of no particular interest from the perspective of security property specification(we assume that all information available in the attacker is already embodied in the indistinguishability relation that induces his function view), for the sake of completeness and mathematical curiosity,we remark upon it.The only sensible definition of composition of function views is in terms of the underlying lineups:.One must take closure,for in general the pairwise composite of closed lineups does not yield a closed lineup.Since in composing at type,image and kernel information in is lost,it is not surprising that composition is not associative,as witnessed by the following example.Let,,and. Then for,and,the constant function is inbut is not in,hence.Since image and kernel information in is lost upon composing, the only route to a compositional notion of function view would be to somehow encode this information elsewhere,either by enriching the image or kernel components,or by adding one or more additional components.For example,the above non-associativity of the composition of,and arises because in closing we lost the information that is in the image of.The fact that implies the following condition():at least one of and is in the image of.Since is lost,the closure of is“too big,”including as it does the constant function,which does not satisfy.Motivated by this example,one avenue towards a compositional enrichment of the notion of function view might begin with extending the definition of the image component from a subset of the range to a set of subsets of the range,with the semantics that each such set contains at least one point of the image.Then the lost condition above can be encoded as the set,and we eliminate the undesirable function from the closure of.A similar extension may be necessary for the kernel.2.5Scope of the frameworkDepending on the application,one might consider incorporating additional properties, for example the cardinality of for each,or a subset of the complement of the kernel.Because of our description of“inference closure”between components of func-tion knowledge as a Galois connection with function lineups(Proposition1),the theory of function views extends to any conceivable property of a function.In addition,one can easily generalize from views of functions to views of partial functions or relations.As we apply the function view framework to reasoning about information hiding in systems,we assume that the equivalence relation associated with the observer is non-probabilistic.For example,we do not consider scenarios where the observer does not know with100%certainty the value of,but may be able to determine that10with probability90%,and with probability10%.While it re-stricts applicability of the framework,this approach is common in formal reasoningabout security properties.It is justified by assuming that cryptographic primitives hidedistributions of the underlying data,and it is therefore sufficient to consider only non-probabilistic observers such as the so called Dolev-Yao attacker[DY83].Any notionof process equivalence with sound 
cryptographic semantics[AR02,LMMS99]providesa non-probabilistic equivalence relation suitable for applying the function view frame-work.Nonetheless,future extensions may consider probability distributions over func-tion(and hence attribute)lineups in a manner similar to Halpern and Tuttle[HT93].Inthe current setup,a lineup of type is equivalent to a function.This has an obvious generalization to,where is the closed unit interval on the real line.Our theory of function views does not directly incorporate any temporal features.We are primarily concerned with reasoning about information that an observer may ex-tract from the system given a particular static equivalence relation that models his orher inability to distinguish certain system configurations.As the system evolves overtime,the observer may accumulate more observations,possibly narrowing the equiva-lence relation and allowing extraction of more information.This can be modeled in ourframework by the corresponding change in the observer’s function view.If temporal inferences,such as those involved in timing attacks,can be modeledin the underlying process specification formalism,they will be reflected in the inducedobservational equivalence relation and,in a modular fashion,in the function views rep-resenting the attacker’s view of the system.Therefore,function views can be used toreason about time-related security properties.Function views can also be used to rep-resent partial knowledge about relations between unknown entities.Therefore,they aresufficiently expressive to model“forward security”of systems.For example,kernelopaqueness of the mapping from email messages to senders models the attacker’s in-ability to determine whether two emails originated from the same source.Even if oneof the messages is compromised,the attacker will not be able to automatically infer thesender of the other.3Opaqueness and Information HidingIn this section,we establish the relationship between opaqueness of function views andobservational equivalence of system configurations.We then demonstrate how functionviews can be used to formalize information hiding properties and derive verificationconditions stated in terms of observational equivalence.3.1Possible-worlds modelWe will use the theory of function views developed in section2.1to reason about infor-mation hiding properties of systems,i.e.,whether the attacker is prevented from know-ing properties of the functions defining system behavior.We will follow the standardapproach of epistemic logic[Hin62,FHMV95]and formalize any system of interest11as a Kripke structure[Kri63].Here is the set of all possible configurations of system(we may also refer to elements of as possible worlds or states of),is an interpretation that defines configurations by assigning values to all attributes of,and is an equivalence relation on that models an observer’s inability to dis-tinguish certain states of.In general,Kripke structures can be used to model multiple observers with their corresponding equivalence relations on.Since our primary goal in this paper is to reason about security properties,we will assume a single attacker that incorporates the abilities of all hostile observers.In the rest of this section,we demonstrate how any Kripke structure induces function views.In particular,any computational model of the attacker,including those implicit in cryptographic process algebras,imposes an equivalence relation on and thus induces function views.Any information hiding property can be defined in two ways:as opaque-ness 
of the induced function view(following section2.3),or as a logical predicate on the underlying equivalence relation.The two definitions are equivalent,as demonstrated by proposition3.3.2Attribute opaquenessLet be a system with a set of configurations.An attribute of of typeis a function for each configuration,i.e.,a-indexed family of functions.In general,may have a variety of attributes.Such a representation is akin to the object-oriented view of the world,with behavior modeled by a set of methods.Security properties of computer systems often involve hiding information about functions defining the behavior of the system.For example,suppose is a bank with a set of customers.Writing customer’s bank balance in configuration as,we have defined an attribute of of type,where is the set of real numbers.Then the secrecy property of customers’balances can be formalized as the requirement that an observer should not be able to infer the value of attribute.A richer example is noninterference[GM82],which requires that the observable low(un-classified)behavior of the system hide all information about high(classified)functions inside the system.Define a view family for an attribute of to be a function view of for each.Opaqueness lifts pointwise to attributes as follows.D EFINITION4Let be a system with set of configurations and an attribute of type.Let be a view family for.For any form of opaqueness(e.g.,=kernel or=image),we say that is-opaque under if,for all configurations, is-opaque under the function view.3.3Observational equivalenceIntuitively,two processes are observationally equivalent if no context can distinguish them.In formal models of security protocols based on process calculi[AG99,LMMS99, BNP99],it is common to define security properties such as secrecy and authentication in12terms of observational equivalence(or related concepts such as may-testing equivalence or computational indistinguishability)between several instances of the protocol,or an instance of the protocol and that of an“ideal”protocol which is secure by design and serves as the specification of the desired property.Proof techniques for observational equivalence depend on the choice of a particular process calculus and an attacker model. 
Some of the formalisms are amenable to mechanized verification (e.g., via trace-by-trace comparison). A related approach involves behavioral equivalences proved via logical relations [SP01].

Given system configurations σ and σ', we will say that σ ≡_C σ' iff an outside observer (attacker), acting in coalition with all agents from the set C (e.g., with access to their secret keys), cannot distinguish σ and σ', for whatever notion of indistinguishability is supported by the chosen formalism and the attacker model. We emphasize that any notion of observational equivalence automatically provides a relation ≡ for the Kripke structure defining the attacker's knowledge of the system. Therefore, once the system is formalized in a suitable process algebra, there is no need to reason about how function views are obtained or where attacker knowledge "comes from." As demonstrated in section 3.4, any observational equivalence relation induces a particular function view, and its opaqueness can be characterized logically in terms of predicates on the equivalence classes, as described in section 3.5.

3.4 Opaqueness and observational equivalence

Suppose that the set of system configurations is equipped with an observational equivalence ≡, an equivalence relation where σ ≡ σ' represents the inability of an observer to distinguish between configurations σ and σ'. Such an equivalence relation naturally induces a view family for any attribute, as follows.

DEFINITION 5. Let S be a system with set of configurations Σ equipped with an observational equivalence ≡, and let A be an attribute of S of type X → Y. Every configuration σ defines a function lineup F_σ = { A_σ' | σ' ≡ σ }; hence we obtain an attribute view family of A, given by the closures ⟨F_σ⟩. Note that ⟨F_σ⟩ is indeed a view of A_σ, because A_σ ∈ F_σ and ⟨F⟩ is closed for any function lineup F. Since any observational equivalence induces an attribute view family, any form of opaqueness lifts to a predicate on observational equivalence.

DEFINITION 6. Let S be a system with set of configurations Σ and let A be an attribute of S of type X → Y. Let ≡ be an observational equivalence on Σ. For any form of opaqueness Φ, we say that A is Φ-opaque under ≡ if the attribute view family of A induced by ≡ is Φ-opaque.

3.5 Logical characterization of attribute opaqueness

Proposition 2 generalises to attribute opaqueness in the obvious way.
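Since all the objects above (configurations, equivalence classes, attributes, lineups) are finite, the opaqueness predicates can be checked mechanically on small examples. The following Python sketch is our own illustration, not code from the paper, and all identifiers in it are made up. For a single equivalence class of configurations it builds the induced lineup { A_σ' | σ' ≡ σ } of Definition 5 and tests k-value, image, and kernel opaqueness in the logical form suggested by Proposition 2.

from itertools import combinations

def lineup(attr, equiv_class):
    """Lineup induced at a configuration: the attribute functions of all
    configurations the observer cannot distinguish from it (Definition 5)."""
    return [attr[c] for c in equiv_class]

def k_value_opaque(fs, X, k):
    """At least k candidates for f(x) at every input x."""
    return all(len({f[x] for f in fs}) >= k for x in X)

def image_opaque(fs, Y):
    """No y can be asserted to lie in the image of the hidden function."""
    return all(any(y not in set(f.values()) for f in fs) for y in Y)

def kernel_opaque(fs, X):
    """No equality f(x) = f(x') with x != x' can be inferred."""
    return all(any(f[x] != f[x2] for f in fs) for x, x2 in combinations(X, 2))

# Toy example (ours): configurations c1..c3, all observationally equivalent,
# and an attribute mapping senders {a, b} to messages {m1, m2}.
X, Y = ["a", "b"], ["m1", "m2"]
attr = {
    "c1": {"a": "m1", "b": "m2"},
    "c2": {"a": "m2", "b": "m1"},
    "c3": {"a": "m1", "b": "m1"},
}
fs = lineup(attr, ["c1", "c2", "c3"])
print("2-value opaque:", k_value_opaque(fs, X, 2))   # True: both outputs possible
print("image opaque:  ", image_opaque(fs, Y))        # False: m1 is in every image
print("kernel opaque: ", kernel_opaque(fs, X))       # True: a, b not always equal

Kernel opaqueness of a sender attribute, for instance, says the attacker can never conclude that two messages came from the same sender, which is the style of property discussed for email forwarding in section 2.5.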

2022 Postgraduate Entrance Exam English Reading: Body of evidence

Body of evidence (abundant evidence on the financial crisis)
Is a concentration of wealth at the top to blame for financial crises?

In the search for the villain behind the global financial crisis, some have pointed to inequality as a culprit. In his 20xx book Fault Lines, Raghuram Rajan of the University of Chicago argued that inequality was a cause of the crisis, and that the American government served as a willing accomplice. From the early 1980s the wages of working Americans with little or no university education fell ever farther behind those with university qualifications, he pointed out. Under pressure to respond to the problem of stagnating incomes, successive presidents and Congresses opened a flood of mortgage credit.

Reference translation:

Is the financial crisis to be blamed on the concentration of wealth at the top? Who, after all, is the hidden hand behind the global financial crisis?

Inquiries into this question have never stopped, and one view holds that income inequality is the chief culprit behind the crisis.

Professor Raghuram Rajan of the University of Chicago wrote in his 20xx book Fault Lines that the principal offender behind the crisis was income inequality, while the government acted as a willing accomplice.

Generalized Maxwell Model

1. Introduction
The Maxwell model is a linear viscoelastic model used to describe the rheological behavior of viscoelastic materials. It consists of a spring and a dashpot connected in series, and is commonly used to model the behavior of polymers, gels, and other complex fluids. In this article, we will explore the generalized Maxwell model, which is an extension of the original Maxwell model and provides a more accurate representation of the viscoelastic properties of materials.

2. The Maxwell model
The Maxwell model, first proposed by James Clerk Maxwell in the 19th century, consists of a spring and a dashpot connected in series. The spring represents the elastic behavior of the material, while the dashpot represents the viscous behavior. The constitutive equation of the Maxwell model is

dε(t)/dt = (1/E) dσ(t)/dt + σ(t)/η,

where σ(t) is the stress, ε(t) is the strain, E is the elastic modulus, η is the viscosity, and dε(t)/dt is the rate of strain. The Maxwell model is simple and easy to understand, but a single element has only one relaxation time τ = η/E, so it fails to capture the broad relaxation spectrum observed in many real materials.

3. The generalized Maxwell model
To overcome the limitations of the original Maxwell model, the generalized Maxwell model connects several Maxwell elements (each a spring and a dashpot in series) in parallel, often together with a lone equilibrium spring, each element having its own elastic modulus and viscosity. This allows for a more accurate representation of the complex viscoelastic behavior of materials. The total stress is the sum of the stresses carried by the individual elements,

σ(t) = Σ_i σ_i(t),   with   σ_i(t) + (η_i/E_i) dσ_i(t)/dt = η_i dε(t)/dt,

where E_i and η_i are the elastic moduli and viscosities of the individual elements. Equivalently, the relaxation modulus is a Prony series, G(t) = E_∞ + Σ_i E_i exp(−t/τ_i) with τ_i = η_i/E_i. By including multiple elements with different relaxation times, the generalized Maxwell model can accurately describe materials whose relaxation spreads over many decades in time.

4. Applications of the generalized Maxwell model
The generalized Maxwell model has found wide applications in various fields, including polymer science, biomedical engineering, and materials science. It has been used to study the viscoelastic behavior of polymers, gels, and foams, and to design materials with specific viscoelastic properties. In biomedical engineering, the model has been used to study the mechanical behavior of soft tissues and to develop new biomaterials for tissue engineering. In materials science, the model has been used to characterize the viscoelastic properties of composites and to optimize their performance.

5. Comparison with other viscoelastic models
The generalized Maxwell model is just one of many viscoelastic models used to describe the rheological behavior of materials. Other popular models include the Kelvin-Voigt model, the Burgers model, and the Zener model. Each of these models has its own advantages and limitations, and the choice of model depends on the specific material and the behavior of interest. The generalized Maxwell model is particularly useful for materials with complex relaxation behavior, as it allows for a more detailed description of the relaxation processes.

6. Conclusion
In conclusion, the generalized Maxwell model is a powerful tool for describing the linear viscoelastic behavior of materials. By extending the original Maxwell model to a parallel array of elements with different relaxation times, it provides a much more accurate representation of the viscoelastic properties of real materials. It has found wide applications in various fields and has contributed to our understanding of the mechanical behavior of complex fluids and solids.
As our knowledge of viscoelastic materials continues to grow, the generalized Maxwell model will undoubtedly remain an important tool for researchers and engineers alike.
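As a concrete illustration of the relaxation behaviour described above, here is a short Python sketch of a generalized Maxwell (Prony series) model. The moduli, viscosities, and step strain are made-up numbers chosen only for illustration, not values for any particular material.

import numpy as np

# Illustrative (made-up) parameters for three Maxwell branches:
# each branch is a spring E_i in series with a dashpot eta_i.
E = np.array([1.0e6, 5.0e5, 2.0e5])      # branch moduli [Pa]
eta = np.array([1.0e6, 5.0e6, 2.0e7])    # branch viscosities [Pa.s]
tau = eta / E                            # relaxation times [s]
E_inf = 1.0e5                            # optional equilibrium spring [Pa]

def relaxation_modulus(t):
    """Prony series G(t) = E_inf + sum_i E_i * exp(-t / tau_i)."""
    t = np.asarray(t, dtype=float)[..., None]
    return E_inf + np.sum(E * np.exp(-t / tau), axis=-1)

# Stress relaxation after a step strain eps0 applied at t = 0:
# sigma(t) = eps0 * G(t), since each branch stress decays as exp(-t/tau_i).
eps0 = 0.01
t = np.logspace(-2, 3, 6)                # sample times [s]
sigma = eps0 * relaxation_modulus(t)

for ti, si in zip(t, sigma):
    print(f"t = {ti:8.2f} s   sigma = {si:10.1f} Pa")

The printed stress decays over several decades in time, behaviour that a single Maxwell element with its single relaxation time cannot reproduce; that is precisely the motivation for the generalized model.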

Mimicking the QCD equation of state with a dual black hole

An earlier study [7] of thermodynamic properties of putative holographic duals to QCD starts with a lagrangian including an unspecified matter term.
a particular V(φ) whose corresponding c_s(T) curve closely mimics that of QCD. We close with a discussion in section 6. The results in this paper are based in large part on [8], and aspects of them will also be summarized in [9].
1 Introduction
In the supergravity approximation, the near-extremal D3-brane has equation of state s ∝ T^3, with a constant of proportionality that is 3/4 of the free-field value for the dual N = 4 super-Yang-Mills theory [1]. The speed of sound is c_s = 1/√3, as required by conformal invariance.
On the other hand, the speed of sound of a thermal state in quantum chromodynamics (QCD) has an interesting and phenomenologically important dependence on temperature, with a minimum near the cross-over temperature T_c. Lattice studies of the equation of state are too numerous to cite comprehensively, but they include [2] (for pure glue), [3] (a review article), and [4, 5] (recent studies with 2+1 flavors). We would like to find a five-dimensional gravitational theory that has black hole solutions whose speed of sound as a function of temperature mimics that of QCD. We will not try to include chemical potentials or to account for chiral symmetry breaking. We will not try to include asymptotic freedom, but instead will limit our computation to T ≲ 4T_c and assume conformal behavior in the extreme UV. We will not even try to give an account of confinement, except insofar as the steep rise in the number of degrees of freedom near the cross-over temperature T_c is recovered in our setup, corresponding to a minimum of c_s near T_c. We will not try to embed our construction in string theory, but instead adjust parameters in a five-dimensional gravitational action to recover approximately the dependence c_s(T) found from the lattice. That action is

    S = \frac{1}{2\kappa_5^2} \int d^5x \, \sqrt{-g} \left[ R - \frac{1}{2}(\partial\phi)^2 - V(\phi) \right].   (1)
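As a side note on how c_s(T) is extracted in practice: at zero chemical potential c_s^2 = dp/dε = s dT/(T ds) = d ln T / d ln s, so a tabulated entropy density s(T), whether from the lattice or from a family of black hole solutions, is enough to reconstruct the speed of sound numerically. The Python sketch below does this for a made-up, merely QCD-like parametrization of s(T); the numbers are illustrative, not lattice data.

import numpy as np

Tc = 0.19  # GeV, illustrative cross-over scale (not a lattice value)

def entropy_density(T):
    """Made-up, QCD-like s(T): interpolates between a low-T and a high-T
    conformal branch, producing a rapid rise in degrees of freedom near Tc."""
    g_low, g_high = 2.0, 16.0          # illustrative effective dof counts
    g = g_low + 0.5 * (g_high - g_low) * (1.0 + np.tanh((T - Tc) / (0.1 * Tc)))
    return g * T**3                     # conformal form in each regime

# Speed of sound squared: cs^2 = d ln T / d ln s, by finite differences.
T = np.linspace(0.5 * Tc, 4.0 * Tc, 400)
s = entropy_density(T)
cs2 = np.gradient(np.log(T), np.log(s))

i_min = np.argmin(cs2)
print(f"minimum cs^2 ~ {cs2[i_min]:.3f} at T/Tc ~ {T[i_min]/Tc:.2f}")
print(f"cs^2 at T = 4 Tc ~ {cs2[-1]:.3f} (conformal value is 1/3)")

Running it shows c_s^2 dipping well below the conformal value 1/3 near T_c and approaching 1/3 again at high temperature, which is the qualitative shape the potential V(φ) is tuned to reproduce.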

Uncertainty measures on probability intervals from the imprecise Dirichlet model

Uncertainty measures on probability intervals from theimprecise Dirichlet modelJ.ABELLA´N*Department of Computer Science and Artificial Intelligence,University of Granada,Granada 18071,Spain(Received 6February 2006;in final form 3March 2006)When we use a mathematical model to represent information,we can obtain a closed and convex set of probability distributions,also called a credal set.This type of representation involves two types of uncertainty called conflict (or randomness )and non-specificity ,respectively.The imprecise Dirichlet model (IDM)allows us to carry out inference about the probability distribution of a categorical variable obtaining a set of a special type of credal set (probability intervals).In this paper,we shall present tools for obtaining the uncertainty functions on probability intervals obtained with the IDM,which can enable these functions in any application of this model to be calculated.Keywords :Imprecise probabilities;Credal sets;Uncertainty;Entropy;Conflict;Imprecise Dirichlet model1.IntroductionSince the amount of information obtained by any action is measured by a reduction in uncertainty,the concept of uncertainty is intricately connected to the concept of information.The concept of ‘information-based uncertainty’(Klir and Wierman 1998)is related to information deficiencies such as the information being incomplete,imprecise,fragmentary,not fully reliable,vague,contradictory or deficient,and this may result in different types of uncertainty.This paper is solely concerned with the information conceived in terms of uncertainty reduction,unlike the term ‘information’as it is used in the theory of computability or in terms of logic.In classic information theory,Shannon’s entropy (1948)is the tool used to quantify uncertainty.This function has certain desirable properties and has been used as the starting point when looking for another function to measure the amount of uncertainty in situations in which a probabilistic representation is not suitable.Many mathematical imprecise probability theories for representing information-based uncertainty are based on a generalization of the probability theory:e.g.Dempster–Shafer’s theory (DST)(Dempster 1967,Shafer 1976),interval-valued probabilities (de Campos et al.1994),order-2capacities (Choquet 1953/1954),upper–lower probabilities (Suppes 1974,Fine 1983,International Journal of General SystemsISSN 0308-1079print/ISSN 1563-5104online q 2006Taylor &Francis/journalsDOI:10.1080/03081070600687643*Email:jabellan@decsai.ugr.esInternational Journal of General Systems ,Vol.35,No.5,October 2006,509–5281988)or general convex sets of probability distributions (Good 1962,Levi 1980,Walley 1991,Berger 1994).Each of these represents a type of credal set that is a closed and convex set of probability distributions with a finite set of extreme points.In the DST,Yager (1983)distinguishes between two types of uncertainty:one is associated with cases where the information focuses on sets with empty intersections,and the other is associated with cases where the information focuses on sets where the cardinality is greater than one.These are called conflict and non-specificity ,respectively.The study of uncertainty measures in the DST is the starting point for the study of these measures on more general theories.In any of these theories,it is justifiable that a measure capable of measuring the uncertainty represented by a credal set must quantify the parts of conflict and non-specificity.More recently,Abella´n and Moral (2005b)and Klir and 
Smith (2001)justified the use of maximum entropy on credal sets as a good measure of total uncertainty.The problem lies in separating these functions into others that really do measure the conflict and non-specificityparts by using a credal set to represent the information.Abella´n et al.(2006)managed to split maximum entropy into functions that are capable of coherently measuring the conflict and non-specificity of a credal set P ;and also as algorithms in order to facilitate their calculationin order-2capacities (Abella´n and Moral 2005a,2006)so that S *ðP Þ¼S *ðP ÞþðS *2S *ÞðP Þ;where S *represents maximum entropy and S *represents minimum entropy on a credal set P ;with S *ðP Þcoherently quantifying the conflict part of a credal set and ðS *2S *ÞðP Þthe non-specificity part of a credal set.A natural way of representing knowledge is with probability intervals (Campos et al.1994).In this paper,we shall work with a special type of probability intervals obtained using the imprecise Dirichlet model (IDM).The main use of IDM is to infer about acategorical variable.Abella´n and Moral (2003b,2005b)recently used IDM to join uncertainty measures in classification (an important problem in the field of machine learning ).In this paper,we shall study IDM probability intervals and we shall prove that,while they can be represented by belief functions,they are not the only type of credal set belonging to belief functions and probability intervals.In addition,we shall present an algorithm that obtains the maximum entropy for this type of interval;we shall demonstrate a property that will enable us rapidly to obtain the minimum entropy for this type of interval;and using the fact that they represent a special type of belief function,we shall directly obtain the value of the Hartley measure on them.In Section 2of this paper,we shall introduce the most important imprecise probability theories and distinguish between probability intervals and belief functions.In Section 3,we shall present the IDM and its main properties and shall also examine the situation of IDM probability intervals in relation to other imprecise probability theories.In Section 4,we shall explore uncertainty measures on credal sets.In Section 5,we shall outline some procedures and algorithms for obtaining the values of the main uncertainty measures on IDM probability intervals and practical examples.Conclusions are presented in Section 6.J.Abella´n 5102.Theories of imprecise probabilities2.1Credal setsAll theories of imprecise probabilities that are based on classical set theory share some common characteristics(see Walley1991,Klir2006).One of them is that evidence within each theory is fully described by a lower probability function P*on afinite set X or, alternatively,by an upper probability function P*on X.These functions are always regular monotone measures(Wang and Klir1992)that are superadditive and subadditive, respectively,andX x[X P*ð{x}Þ<1;Xx[XP*ð{x}Þ>1:ð1ÞIn the various special theories of uncertainty,they possess additional special properties. 
When evidence is expressed(at the most general level)in terms of an arbitrary credal set,P of probability distribution functions p,on afinite set X(Kyburg1987),functions P*and P* associated with P are determined for each set A#X by the formulaeP*ðAÞ¼infp[P Xx[Apð{x}Þ;P*ðAÞ¼supp[PXx[Apð{x}Þ:ð2ÞSince for each p[P and each A#X,it follows thatP*ðAÞ¼12P*ðX2AÞ:ð3ÞOwing to this property,functions P*and P*are called dual(or conjugate).One of them is sufficient for capturing given evidence;the other one is uniquely determined by equation(3). It is common to use the lower probability function to capture the evidence.As is well known (Chateauneuf and Jaffray1989,Grabisch2000)any given lower probability function P*is uniquely represented by a set-valued function m for which mðYÞ¼0andXA[‘ðXÞmðAÞ¼1;ð4Þwhere we note‘(X)as the power set of X.Any set A#X for which mðAÞ–0is often called a focal element,and the set of all focal elements with the values assigned to them by function m is called a body of evidence.Function m is called a Mo¨bius representation of P*when it is obtained for all A#X via the Mo¨bius transformmðAÞ¼XB j B#Að21Þj A2B j P*ðBÞ:ð5ÞThe inverse transform is defined for all A#X by the formulaP*ðAÞ¼XB j B#AmðBÞ:ð6ÞIt follows directly from equation(5)P*ðAÞ¼XB j B>A–YmðBÞ;ð7Þfor all A#X.Assume now that evidence is expressed in terms of a given lower probability function P*.Then,the set of probability distribution functions that are consistent with P*,Uncertainty measures on IDM511P ðP *Þ;which is always closed and convex,is defined as followsP ðP *Þ¼p j x [X ;p ðx Þ[½0;1 ;Xx [X p ðx Þ¼1P *ðA Þ<X x [A p ðx Þ;A #X ():ð8Þ2.2Choquet capacities of various ordersA well-defined category of theories of imprecise probabilities is based on Choquet capacities of various orders (Choquet 1953/1954).The most general theory in this category is the theory based on capacities of order 2.Here,the lower and upper probabilities,P *and P *,are monotone measures for whichP *ðA <B Þ>P *ðA ÞþP *ðB Þ2P *ðA >B Þ;P *ðA >B Þ<P *ðA ÞþP *ðB Þ2P *ðA <B Þ;ð9Þfor all A ,B #X .Less general uncertainty theories are then based on capacities of order k .For each k .2,the lower and upper probabilities,P *and P *,satisfy the inequalitiesP *[k j ¼1A j !>X K #N k ;K –Y ð21Þj K jþ1P \j [KA j !;P *\k j ¼1A j!<X K #N k ;K –Y ð21Þj K jþ1P [j [K A j !;ð10Þfor all families of k subsets of X ,where N k ¼{1;2;...;k }:Clearly,if k 0.k ,then the theory based on capacities of order k 0is less general than the one based on capacities of order k .The least general of all these theories is the one in which the inequalities are required to hold for all k >2(the underlying capacity is said to be of order 1).This theory,which was extensively developed by Shafer (1976),is usually referred to as evidence theory or DST.In this theory,lower and upper probabilities are called belief and plausibility measures,noted asBel and Pl,respectively.An important feature of DST is that the Mo¨bius representation of evidence m (usually called a basic probability assignment function in this theory)is a non-negative function (m (A )[[0,1]).Hence,we can obtain Bel and Pl function from m as the following wayBel ðA Þ¼X B j B #A m ðB Þ;Pl ðA Þ¼X B j B >A –Ym ðB Þ:ð11ÞDST is thus closely connected with the theory of random sets (Molchanov 2004).When we work with nested families of focal elements,we obtain a theory of graded possibilities,which is a generalization of classical possibility theory (De Cooman 1997,Klir 2006).2.3Probability intervalsIn this theory,lower and upper 
probabilities P *and P *are determined for all sets A #X by intervals [l (x ),u (x )]of probabilities on singletons (x [X ).Clearly,l ðx Þ¼P *ð{x }ÞandJ.Abella´n 512u ðx Þ¼P *ð{x }Þand inequalities (1)must be satisfied.Each given set of probability intervals I ¼{½l ðx Þ;u ðx Þ j x [X }is associated with a credal set,P ðI Þ;of probability distribution functions,p ,defined as followsP ðI Þ¼p j x [X ;p ðx Þ[½l ðx Þ;u ðx Þ ;X x [Xp ðx Þ¼1():ð12ÞSets defined in this way are clearly special cases of sets defined by equation (8).Their special feature is that they always form an (n 21)-dimensional polyhedron,where n ¼j X j :In general,the polyhedron may have c vertices (corners),wheren <c <n ðn 21Þ;and each probability distribution function contained in the set can be expressed as a linearcombination of these vertices (Weichselberger and Po¨hlmann 1990,de Campos et al.1994).A given set I of probability intervals may be such that some combinations of values taken from the intervals do not correspond to any probability distribution function.This indicates that the intervals are unnecessarily broad.To avoid this deficiency,the concept of reachability was introduced in the theory (Campos et al.1994).A given set I is called reachable (or feasible)if and only if for each x [X and every value v (x )[[l (x ),u (x )]there exists a probability distribution function p for which p ðx Þ¼v ðx Þ:The reachability of any given set I can be easily checked:the set is reachable if and only if it passes the following testsX x [X l ðx Þþu ðy Þ2l ðy Þ<1;;y [X ;Xx [X u ðx Þþl ðy Þ2u ðy Þ>1;;y [X :ð13ÞIf I is not reachable,it can be converted to the set I 0¼{½l 0ðx Þ;u 0ðx Þ j x [X }of reachable intervals by the formulae l 0ðx Þ¼max l ðx Þ;12X y –x u ðy Þ();u 0ðx Þ¼min u ðx Þ;12Xy –x l ðy Þ();ð14Þfor all x [X .Given a reachable set I of probability intervals,the lower and upper probabilities are determined for each A #X by the formulae P *ðA Þ¼maxX x [A l ðx Þ;12X x ÓA u ðx Þ();P *ðA Þ¼min Xx [A u ðx Þ;12X x ÓA l ðx Þ():ð15ÞThe theory based on reachable probability intervals and DST are not comparable in terms of their generalities.However,they both are subsumed under a theory based on Choquet capacities of order 2as we can see in the following subsection.Uncertainty measures on IDM 5132.4Choquet capacities of order 2Although Choquet capacities of order 2do not capture all credal sets,they subsume all the other special uncertainty theories that are examined in this paper.They are thus quite general.Their significance is that they are computationally easier to handle than arbitrary credal sets.In particular,it is easier to compute P ðP *Þdefined by equation (8)when P *is a Choquet capacity of order 2.Let X ¼{x 1;x 2;...;x n }and let s ¼ðs ðx 1Þ;s ðx 2Þ;...;s ðx n ÞÞdenote a permutation bywhich elements of X are reordered.Then,it is established (de Campos and Bolan˜os 1989)that for any given Choquet capacity of order 2,P ðP *Þis determined by its extreme points,which are probability distributions p s computed as followsp s ðs ðx 1ÞÞ¼P *ð{s ðx 1Þ}Þ;p s ðs ðx 2ÞÞ¼P *ð{s ðx 1Þ;s ðx 2Þ}Þ2P *ð{s ðx 1Þ}Þ;.........p s ðs ðx n ÞÞ¼P *ð{s ðx 1Þ;...;s ðx n Þ}Þ2P *ð{s ðx 1Þ;...;s ðx n 21Þ}Þ:ð16ÞEach permutation defines an extreme point of P ðP *Þ;but different permutations can give rise to the same point.The set of distinct probability distributions p s is often called an interaction representation of P *(Grabisch2000).Figure 1.Main uncertainty theories ordered by their generalities.J.Abella´n 514Uncertainty measures on IDM515 Belief 
functions and reachable probability intervals represent special types of capacities of order2,as we can see in Figure1.However,belief functions are not generalizations of reachable probability intervals and the inverse is also not verified as we can see in Examples 1and2,respectively:Example1.We consider the set X¼{x1;x2;x3}and the following set of probability intervals on XL¼{½0;0:5 ;½0;0:5 ;½0;0:5 }:This set of probability intervals L has associated a credal set,P L;with vertices{ð0:5;0:5;0Þ;ð0:5;0;0:5Þ;ð0;0:5;0:5Þ}:There does not exist any basic probability assignment for this credal set.To prove this we suppose the contrary condition.Using equation(16)it can be proved that the credal set associated with a basic probability assignment on X has the vertices that we can see in Table1,where m i¼mð{x i}Þ;m ij¼mð{x i;x j}Þ;m123¼mðXÞ;i;j[{1;2;3}:Then,a basic probability assignment m with the same credal set,P L;must verify thatm1þm12þm13þm123¼0:5;m2þm12þm23þm123¼0:5;m3þm13þm23þm123¼0:5;m1¼m2¼m3¼0;m2þm23¼0;m3þm23¼0;m1þm13¼0;m3þm13¼0;m1þm12¼0;m2þm12¼0;where any other option give us a contradiction.Hence,we have that m i¼0;m ij¼0(i,j[{1,2,3})and m123¼0:5;implying that m is not a basic probability assignment.Example2.We consider the following basic probability assignment m on thefinite set X¼{x1;x2;x3;x4}defined bymð{x1;x2}Þ¼0:5;mð{x3;x4}Þ¼0:5:Table1.Set of vertices associated with a basic probability assignment on a set of3elements.s p1p2p3(1,2,3)m1þm12þm13þm123m2þm23m3(1,3,2)m1þm12þm13þm123m2m3þm23(2,1,3)m1þm13m2þm12þm23þm123m3(2,3,1)m1m2þm12þm23þm123m3þm13(3,1,2)m1þm12m2m3þm13þm23þm123 (3,2,1)m1m2þm12m3þm13þm23þm123Computing the upper and lower probability values for every x i ,we have the following set of probability intervals compatible with m :L ¼{½0;0:5 ;½0;0:5 ;½0;0:5 ;½0;0:5 };but this set contains the following probability distribution p 0¼ð0:5;0:5;0;0Þon X ,that not belongs to the credal set associated with m0¼p 0ð{x 3;x 4}Þ,Bel ð{x 3;x 4}Þ¼0:5;1¼p 0ð{x 1;x 2}Þ.Pl ð{x 1;x 2}Þ¼0:5:However,it is easy to obtain a set of reachable probability intervals that represents the same credal set that a belief function,as we can see in the following example.Example 3.We consider the set X ¼{x 1;x 2;x 3}and the following set of reachable probability intervals on XL ¼{½0:3;0:65 ;½0:2;0:55 ;½0:15;0:3 }:This set of probability intervals L has associated a credal set,P L ;with vertices{ð0:65;0:2;0:15Þ;ð0:3;0:55;0:15Þ;ð0:5;0:2;0:3Þð0:3;0:4;0:3Þ}:Using Table 1,it can be obtained that also this credal set is represented by the belief function associated with the basic probability assignment (has the same set of vertices)m ð{x 1}Þ¼0:3;m ð{x 2}Þ¼0:2;m ð{x 3}Þ¼0:15;m ð{x 1;x 2}Þ¼0:2;m ð{x 1;x 2;x 3}Þ¼0:15:3.IDM probability intervalsThe IDM was introduced by Walley (1996)to draw an inference about the probability distribution of a categorical variable.Let us assume that Z is a variable taking values on a finite set X and that we have a sample of size N of independent and identically distributed outcomes of Z .If we want to estimate the probabilities,u x ¼p ðx Þ;with which Z takes its values,a common Bayesian procedure consists in assuming a prior Dirichlet distribution for the parameter vector (u x )x [X ,and then taking the posterior expectation of the parameters given the sample.The Dirichlet distribution depends on the parameters s ,a positive real value,and t ,a vector of positive real numbers t ¼ðt x Þx [X ;verifying P x [X t x ¼1:The density takes the formf ððu x Þx [X Þ¼G ðs ÞQ x [X G ðs ·t x ÞY x [Xu s 
·t x 21x ;where G is the gamma function.If r (x )is the number of occurrences of value x in the sample,the expected posterior value of parameter u x is (r (x )þs ·t x )/(N þs ),which is also the Bayesian estimate of u x (under quadratic loss).J.Abella´n 516The IDM(Walley1996)only depends on parameter s and assumes all the possible values of t.This defines a non-closed convex set of prior distributions.It represents a much weaker assumption than a precise prior model,but it is possible to make useful inferences.In our particular case,where the IDM is applied to a single variable,we obtain a credal set for this variable Z that can be represented by a system of probability intervals.For each parameter, u x,we obtain a probability interval given by the lower and upper posterior expected values of the parameter given the sample.These intervals can be easily computed and are given by [r(x)/(Nþs),(r(x)þs)/(Nþs)].The associated credal set on X is given by all the probability distributions p0on X,such that p0(x)[[r(x)/(Nþs),(r(x)þs)/(Nþs)],;x.The intervals are coherent in the sense that if they are computed by taking infimum and supremum in the credal set,then the same set of intervals is again obtained.Parameter s determines how quickly the lower and upper probabilities converge as more data become available;larger values of s produce more cautious inferences.Walley(1996) does not give a definitive recommendation,but he advocates values between s¼1and s¼2. We can define a generalization of a set of IDM probability intervals,considering that the frequencies r(x i)are non-negative real numbers.For the sake of simplicity,we use the same name for this type of probability interval.Formally:Definition1.Let X¼{x1;...;x n}be afinite set.Then a set of IDM probability intervals on X can be defined as the setL¼½l i;u i j l i¼rðx iÞNþs;u i¼rðx iÞþsNþs;i¼1;2;...;n;X ni¼1rðx iÞ¼N();where r(x i)are non-negative numbers and not all are equal to zero,and s is non-negative parameter.3.1PropertiesUsing the notation in definition1,we can express the following properties:1.Sets of IDM probability intervals generalize probability distributions.For a probabilitydistribution p on afinite set X¼{x1;...;x n};it is only necessary to consider s¼0and rðx iÞ¼pð{x i}Þ;for all i¼1;...;n:2.The credal set associated with a set L of IDM probability intervals,P L;has the followingset of vertices{v1,...,v n}v1¼rðx1ÞþsNþs;rðx2ÞNþs;...;rðx nÞNþsv2¼rðx1ÞNþs;rðx2ÞþsNþs;...;rðx nÞNþs.........v n¼rðx1ÞNþs;rðx2ÞNþs;...;rðx nÞþsNþsð17ÞUncertainty measures on IDM5173.Denoting as P s L the credal set associated with a set L of IDM probability intervals for a valueof the parameter s and a fixed array of values r ¼ðr ðx 1Þ;...;r ðx n ÞÞ;it can be verified thats 1<s 2,P s 1L #P s 2L4.Every set of IDM probability intervals represents a set of reachable probability intervals.InSection 2.4,we see that belief functions are not generalizations of probability intervals and the inverse is also not verified.However,the credal set associated with a set of IDM probability intervals L can also be expressed by a belief function.Proposition 1.Let L be a set of IDM probability intervals as in Definition 1.The credal set associated with L is the credal set associated with the belief function associated with the basic probability assignment m Lm L ð{x i }Þ¼r ðx i ÞN þs ;i ¼1;2;...;n m L ðX Þ¼sN þsm L ðA Þ¼0;;A ,X ;1,j A j ,n :ð18ÞProof .Using that the lower probability associated with L verifies thatP *ð{x i ;...;x j }Þ¼r ðx i Þþ···þr ðx j ÞN þs;via the Mo¨bius 
transform,we can obtain the following values m L ð{x i }Þ¼r ðx i ÞN þs ;m L ð{x i ;x j }Þ¼r ðx i Þþr ðx j ÞN þs 2r ðx i ÞN þs 2r ðx j ÞN þs ¼0;m L ð{x i ;x j ;x k }Þ¼r ðx i Þþr ðx j Þþr ðx k ÞN þs 2r ðx i Þþr ðx j ÞN þs 2r ðx j Þþr ðx k ÞN þs 2r ðx j Þþr ðx k ÞN þs þr ðx i ÞN þs þr ðx j ÞN þs þr ðx k ÞN þs ¼0;......;ð19Þfor all i ,j ,k [{1,2,...,n }.For a general set A such that 1,j A j ¼w ,n ;we have m L ð{A }Þ¼X B #A ð21Þj A 2B j P *ðB Þ¼X B #A ð21Þj A 2B j P x i [B r ðx i ÞN þs ¼X x i [Aw 2100@1A 2w 2210@1A þw 2330@1A 2···þð21Þw 21w 21w 210@1A 2435:r ðx i ÞN þs :ð20ÞJ.Abella´n 518Taking into account that0¼ð121Þw21¼w21!2w221!þw233!2···þð21Þw21w21w21!;ð21Þthenm Lð{A}Þ¼0;ð22ÞNowm Lð{X}Þ¼12Xx i[Xrðx iÞNþs¼sNþs:ð23ÞTherefore,m L obtained is a basic probability assignment on X.Now,let P L be the credal set associated with L and let P mLbe the credal set associated with m L.Then,P L¼P mL:i)Let p[P L be a probability distribution.ThenBel mL ðAÞ¼Xx i[Arðx iÞNþs<pðAÞ<Px i[Arðx iÞþsNþs¼Pl mLðAÞ;for all A#X.Hence,p[P mL;ii)Let p[P mLbe a probability distribution.Thenrðx iÞNþs ¼Bel mLð{x i}Þ<pð{x i}Þ<Pl mLð{x i}Þ¼rðx iÞþsNþs;for all x i[X.Hence,p[P L:A Sets of IDM probability intervals are not the only credal sets that can be expressed jointly by reachable probability intervals and belief functions.As we can observe in example3,it is possible for a credal set to be represented by a set of reachable probability intervals and by a belief function,although this credal set cannot be represented by a set of IDM probability intervals.We only need to consider in example3the value s/(Nþs):it must be0.35,using l1 and u1and0.15using l3and u3.However,the description of the credal sets belonging to reachable probability intervals and belief functions is still an open problem.In Figure1,we can see where the sets of IDM probability intervals are placed in relation to other theories of imprecise probabilities using a generality order.4.An overview of uncertainty measuresIt has well been established that uncertainty in classical possibility theory is quantified by the Hartley measure(Hartley1928).For each nonempty andfinite set A#X of possible alternatives,the Hartley measure,H(A),is defined by the formulaHðAÞ¼log2j A j;ð24ÞUncertainty measures on IDM519。
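To illustrate the quantities defined in this excerpt, here is a small Python sketch that builds the IDM probability intervals [r(x)/(N+s), (r(x)+s)/(N+s)] from observed counts, runs the reachability test of equation (13), and evaluates the lower and upper probabilities of an event via equation (15). The counts and the choice s = 1 are illustrative.

from fractions import Fraction

def idm_intervals(counts, s=1):
    """IDM probability intervals [l(x), u(x)] = [r(x)/(N+s), (r(x)+s)/(N+s)]."""
    N = sum(counts.values())
    return {x: (Fraction(r, N + s), Fraction(r + s, N + s))
            for x, r in counts.items()}

def is_reachable(intervals):
    """Reachability test (13): for every y,
    sum_x l(x) - l(y) + u(y) <= 1  and  sum_x u(x) - u(y) + l(y) >= 1."""
    L = sum(l for l, _ in intervals.values())
    U = sum(u for _, u in intervals.values())
    return all(L - l + u <= 1 and U - u + l >= 1
               for l, u in intervals.values())

def lower_upper(intervals, A):
    """Lower/upper probability of an event A via (15)."""
    lA = sum(intervals[x][0] for x in A)
    uA = sum(intervals[x][1] for x in A)
    l_out = sum(intervals[x][0] for x in intervals if x not in A)
    u_out = sum(intervals[x][1] for x in intervals if x not in A)
    return max(lA, 1 - u_out), min(uA, 1 - l_out)

# Example: counts r(x) for a categorical variable with 3 values, s = 1.
counts = {"x1": 5, "x2": 3, "x3": 2}
iv = idm_intervals(counts, s=1)
print({x: (str(l), str(u)) for x, (l, u) in iv.items()})
print("reachable:", is_reachable(iv))
print("P_*({x1,x2}), P^*({x1,x2}) =",
      tuple(str(v) for v in lower_upper(iv, {"x1", "x2"})))

For these counts the intervals come out reachable, in line with property 4 above: a set of IDM probability intervals always represents a set of reachable probability intervals.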

DSGE classic paper: CEE (2005)


Nominal Rigidities and the Dynamic Effects of a Shock to Monetary Policy
Author(s): Lawrence J. Christiano, Martin Eichenbaum, and Charles L. Evans
Source: Journal of Political Economy, Vol. 113, No. 1 (February 2005), pp. 1-45

Lawrence J. Christiano and Martin Eichenbaum: Northwestern University, National Bureau of Economic Research, and Federal Reserve Bank of Chicago. Charles L. Evans: Federal Reserve Bank of Chicago.

We present a model embodying moderate amounts of nominal rigidities that accounts for the observed inertia in inflation and persistence in output. The key features of our model are those that prevent a sharp rise in marginal costs after an expansionary shock to monetary policy. Of these features, the most important are staggered wage contracts that have an average duration of three quarters and variable capital utilization.

* The first two authors are grateful for the financial support of a National Science Foundation grant to the National Bureau of Economic Research. We would like to acknowledge helpful comments from Lars Hansen and Mark Watson. We particularly want to thank Levon Barseghyan for his superb research assistance, as well as his insightful comments on various drafts of the paper. This paper does not necessarily reflect the views of the Federal Reserve Bank of Chicago or the Federal Reserve System.

I. Introduction

This paper seeks to understand the observed inertial behavior of inflation and persistence in aggregate quantities. To this end, we formulate and estimate a dynamic, general equilibrium model that incorporates staggered wage and price contracts. We use our model to investigate the mix of frictions that can account for the evidence of inertia and persistence. For this exercise to be well defined, we must characterize inertia and persistence precisely. We do so using estimates of the dynamic response of inflation and aggregate variables to a monetary policy shock. With this characterization, the question we ask reduces to the following: Can models with moderate degrees of nominal rigidities generate inertial inflation and persistent output movements in response to a monetary policy shock?1 Our answer to this question is yes.

The model that we construct has two key features. First, it embeds Calvo-style nominal price and wage contracts. Second, the real side of the model incorporates four departures from the standard textbook, one-sector dynamic stochastic growth model. These departures are motivated by recent research on the determinants of consumption, asset prices, investment, and productivity. The specific departures that we include are habit formation in preferences for consumption, adjustment costs in investment, and variable capital utilization. In addition, we assume
thatfirms must borrow working capital tofinance their wage bill. Our keyfindings are as follows.First,the average duration of price and wage contracts in the estimated model is roughly two and three quarters,respectively.Despite the modest nature of these nominal ri-gidities,the model does a very good job of accounting quantitatively for the estimated response of the U.S.economy to a policy shock.In addition to reproducing the dynamic response of inflation and output, the model also accounts for the delayed,hump-shaped response in con-sumption,investment,profits,and productivity and the weak response of the real wage.2Second,the critical nominal friction in our model is wage contracts,not price contracts.A version of the model with only nominal wage rigidities does almost as well as the estimated model.In contrast,with only nominal price rigidities,the model performs very poorly.Consistent with existing results in the literature,the version of the model with only price rigidities cannot generate persistent move-ments in output unless we assume price contracts of extremely long duration.The model with only nominal wage rigidities does not have this problem.Third,we document how inference about nominal rigidities varies across different specifications of the real side of our model.3Estimated 1This question is the focus of a large and growing literature.See,e.g.,Rotemberg and Woodford(1999),Chari,Kehoe,and McGrattan(2000),Mankiw(2001),and the refer-ences therein.2In related work,Sbordone(2000)argues that,when aggregate real variables are taken as given,a model with staggered wages and prices does well at accounting for the time-series properties of wages and prices.See also Ambler,Guay,and Phaneuf(1999)and Huang and Liu(2002)for interesting work on the role of wage contracts.3For early discussions about the impact of real frictions on the effects of nominal rigidities,see Blanchard and Fischer(1989),Ball and Romer(1990),and Romer(1996). For more recent quantitative discussions,see Sims(1998),McCallum and Nelson(1999), Chari et al.(2000),Edge(2000),and Fuhrer(2000).nominal rigidities3 versions of the model that do not incorporate our departures from the standard growth model imply implausibly long price and wage contracts. Fourth,wefind that if one wants to generate inertia in inflation and persistence in output in a model while imposing only moderate wage and price stickiness,then it is crucial to allow for variable capital util-ization.To understand why this feature is so important,note that in our model,firms set prices as a markup over marginal costs.The major components of marginal costs are wages and the rental rate of capital. By allowing the services of capital to increase after a positive monetary policy shock,variable capital utilization helps dampen the large rise in the rental rate of capital that would otherwise occur.This in turn damp-ens the rise in marginal costs and,hence,prices.The resulting inertia in inflation implies that the rise in nominal spending that occurs after a positive monetary policy shock produces a persistent rise in real out-put.Similar intuition explains why sticky wages play a critical role in allowing our model to explain inflation inertia and output persistence. 
It also explains why our assumption about working capital plays a useful role:Other things equal,a decline in the interest rate lowers marginal costs.Fifth,although investment adjustment costs and habit formation do not play a central role with respect to inflation inertia and output per-sistence,they do play a critical role in accounting for the dynamics of other variables.Sixth,the major role played by the working capital channel is to reduce the model’s reliance on sticky prices.Specifically, if we estimate a version of the model that does not allow for working capital,the average duration of price contracts increases dramatically. Finally,wefind that our model embodies strong internal propagation mechanisms.The impact of a monetary policy shock on aggregate ac-tivity continues to grow and persist even beyond the time at which the typical contract that was in place at the time of the shock has been reoptimized.In addition,the effects on real variables persist well beyond the effects of the shock on the interest rate and the growth rate of money.We pursue a particular limited information econometric strategy to estimate and evaluate our model.To implement this strategy wefirst estimate the impulse response of eight key macroeconomic variables to a monetary policy shock,using an identified vector autoregression (VAR).We then choose six model parameters to minimize the difference between the estimated impulse response functions and the analogous objects in our model.4The remainder of this paper is organized as follows.In Section II,we 4Rotemberg and Woodford(1997),Christiano,Eichenbaum,and Evans(1998),and Edge(2000)have also applied this strategy in the context of monetary policy shocks.4journal of political economy briefly describe our estimates of the way the U.S.economy responds to a monetary policy shock.Section III displays our economic model.In Section IV,we discuss our econometric methodology.Our empirical results are reported in Section V and analyzed in Section VI.Concluding comments are contained in Section VII.II.The Consequences of a Monetary Policy ShockThis section begins by describing how we estimate a monetary policy shock.We then report estimates of how major macroeconomic variables respond to a monetary policy shock.Finally,we report the fraction of the variance in these variables that is accounted for by monetary policy shocks.The starting point of our analysis is the following characterization of monetary policy:R p f(Q)ϩe.(1)t t tR QHere,is the federal funds rate,f is a linear function,is an infor-t temation set,and is the monetary policy shock.We assume that the Fed tallows money growth to be whatever is necessary to guarantee that(1)eholds.Our basic identifying assumption is that is orthogonal to thetelements in.Below,we describe the variables in and elaborate on Q Qt tthe interpretation of this orthogonality assumption.We now discuss how we estimate the dynamic response of key mac-Y roeconomic variables to a monetary policy shock.Let denote thetYvector of variables included in the analysis.We partition as follows:tY p[Y,R,Y].t1t t2tYThe vector is composed of the variables whose time t values are 1tQcontained in and that are assumed not to respond contemporaneously tYto a monetary policy shock.The vector consists of the time t values2tQ Yof all the other variables in.The variables in are real gross domestict1tproduct,real consumption,the GDP deflator,real investment,the realYwage,and labor productivity.The variables in are real profits and2tthe growth rate of M2.All 
these variables,except money growth,haveRbeen logged.We measure the interest rate,,using the federal fundstrate.The data sources are in an appendix,available from the authors. With one exception(the growth rate of money),all the variables in Yare included in levels.In Altig et al.(2003),we adopt an alternative tYspecification of in which we impose cointegrating relationships among tthe variables.For example,we include the growth rate of GDP and the log difference between labor productivity and the real wage.The key properties of the impulse responses to a monetary policy shock are insensitive to this alternative specification.nominal rigidities5YThe ordering of the variables in embodies two key identifying as-tsumptions.First,the variables in do not respond contemporaneouslyY1tto a monetary policy shock.Second,the time t information set of the monetary authority consists of current and lagged values of the variables in and only past values of the variables in.Y Y1t2tOur decision to include all variables except for the growth rate of M2and real profits in reflects a long-standing view that many mac-Y1troeconomic variables do not respond instantaneously to policy shocks (see Friedman1968).We refer the reader to Christiano et al.(1999) for a discussion of sensitivity of inference to alternative assumptionsYabout the variables included in.While our assumptions are certainly1tdebatable,the analysis is internally consistent in the sense that we make the same assumptions in our economic model.To maintain consistencyY with the model,we place profits and the growth rate of money in.2t The VAR contains four lags of each variable,and the sample period is1965:3–1995:3.5When the constant term is ignored,the VAR can be written as follows:…Y p A YϩϩA YϩC h,(2)t1tϪ14tϪ4t9#9where C is a lower triangular matrix with diagonal terms equal hto unity,and is a nine-dimensional vector of zero-mean,serially un-tcorrelated shocks with a diagonal variance-covariance matrix.SinceY ethere are six variables in,the monetary policy shock,,is the seventh1t th eelement of.A positive shock to corresponds to a contractionary t tmonetary policy shock.We estimate the parameters,,C,A i p1,…,4ihand the variances of the elements of using standard least-squaresting these estimates,we compute the dynamic path of Ytefollowing a one-standard-deviation shock in,setting initial conditionstto zero.This path,which corresponds to the coefficients in the impulse response functions of interest,is invariant to the ordering of the vari-Y Yables within and within(see Christiano et al.1999).1t2tYThe impulse response functions of all variables in are displayed intfigure1.Lines marked with a plus sign correspond to the point esti-mates.The shaded areas indicate95percent confidence intervals about the point estimates.6The solid lines pertain to the properties of our structural model,which will be discussed in Section III.The results suggest that after an expansionary monetary policy shock,1.output,consumption,and investment respond in a hump-shapedfashion,peaking after about one and a half years and returning to preshock levels after about three years;5This sample period is the same as in Christiano et al.(1999).6We use the method described in Sims and Zha(1999).6Fig.1.—Model-and VAR-based impulse responses.Solid lines are benchmark model impulse responses;solid lines with plus signs are VAR-based impulse responses.Grey areas are 95percent confidence intervals about VAR-based estimates.Units on the horizontal axis are quarters.An asterisk 
indicates the period of policy shock.The vertical axis units are deviations from the unshocked path.Inflation,money growth,and the interest rate are given in annualized percentage points (APR);other variables are given in percentages.7Fig.1.—Continued8journal of political economyTABLE1Percentage Variance Due to Monetary Policy Shocks4Quarters Ahead 8QuartersAhead20QuartersAheadOutput15(4,26)38(15,48)27(9,35)Inflation1(0,8)4(1,11)7(3,18)Consumption14(4,26)21(5,37)14(4,26)Investment10(2,21)26(7,39)23(6,32)Real wage2(0,8)2(0,14)4(0,15)Productivity15(3,25)14(3,26)10(3,20)Federal funds rate32(18,44)19(8,27)18(5,27)M2growth19(8,29)19(8,26)19(8,24)Real profits13(5,25)18(6,31)7(2,20)Note.—Numbers in parentheses are the boundaries of the associated95percent confidence interval.2.inflation responds in a hump-shaped fashion,peaking after abouttwo years;3.the interest rate falls for roughly one year;4.real profits,real wages,and labor productivity rise;and5.the growth rate of money rises immediately.Interestingly,these results are consistent with the claims in Friedman (1968).For example,Friedman argued that an exogenous increase in the money supply leads to a drop in the interest rate,which lasts one to two years,and a rise in output and employment,which lasts two to five years.Finally,the robustness of the qualitative features of ourfind-ings to alternative identifying assumptions and sample subperiods,as well as the use of monthly data,is discussed in Christiano et al.(1999). Our strategy for estimating the parameters of our model focuses on only a component of thefluctuations in the data,namely the portion that is caused by a monetary policy shock.It is natural to ask how large that component is,since ultimately we are interested in a model that can account for all of the variation in the data.With this question in mind,table1reports variance decompositions.In particular,it displays the percentage of variance of the k-step-ahead forecast error in the elements of due to monetary policy shocks,for,8,and20.Y k p4tNumbers in parentheses are the boundaries of the associated95percentnominal rigidities 9confidence intervals.7Notice that policy shocks account for only a small fraction of inflation.At the same time,with the exception of real wages,monetary policy shocks account for a nontrivial fraction of the variation in the real variables.This last conclusion should be treated with caution.The confidence intervals about the point estimates are rather large.Also,while the impulse response functions are robust to the various perturbations discussed in Christiano et al.(1999)and Altig et al.(2003),the variance decompositions can be sensitive.For example,the analogous point estimates reported in Altig et al.are substantially smaller than those reported in table 1.III.The Model EconomyIn this section we describe our model economy and display the problems solved by firms and households.In addition,we describe the behavior of financial intermediaries and the monetary and fiscal authorities.The only source of uncertainty in the model is a shock to monetary policy.A.Final-Good FirmsAt time t ,a final consumption good,,is produced by a perfectly Y t competitive,representative firm.The firm produces the final good by combining a continuum of intermediate goods,indexed by ,j ෈(0,1)using the technology11/l f Y p Y dj ,(3)()t ͵jt 0where ,and denotes the time t input of intermediate good 1≤l !ϱY f jt j .The firm takes its output price,,and its input prices,,as given P P t jt and beyond its control.Profit maximization 
III. The Model Economy

In this section we describe our model economy and display the problems solved by firms and households. In addition, we describe the behavior of financial intermediaries and the monetary and fiscal authorities. The only source of uncertainty in the model is a shock to monetary policy.

A. Final-Good Firms

At time t, a final consumption good, Y_t, is produced by a perfectly competitive, representative firm. The firm produces the final good by combining a continuum of intermediate goods, indexed by j ∈ (0, 1), using the technology

    Y_t = [ ∫_0^1 Y_{jt}^{1/λ_f} dj ]^{λ_f},    (3)

where 1 ≤ λ_f < ∞, and Y_{jt} denotes the time t input of intermediate good j. The firm takes its output price, P_t, and its input prices, P_{jt}, as given and beyond its control. Profit maximization implies the Euler equation

    (P_t / P_{jt})^{λ_f/(λ_f − 1)} = Y_{jt} / Y_t.    (4)

Integrating (4) and imposing (3), we obtain the following relationship between the price of the final good and the price of the intermediate good:

    P_t = [ ∫_0^1 P_{jt}^{1/(1 − λ_f)} dj ]^{1 − λ_f}.    (5)

B. Intermediate-Goods Firms

Intermediate good j ∈ (0, 1) is produced by a monopolist who uses the following technology:

    Y_{jt} = k_{jt}^α L_{jt}^{1−α} − φ   if k_{jt}^α L_{jt}^{1−α} ≥ φ,
    Y_{jt} = 0                            otherwise,    (6)

where 0 < α < 1. Here, L_{jt} and k_{jt} denote the time t labor and capital services used to produce the j-th intermediate good. Also, φ denotes the fixed cost of production. We rule out entry into and exit out of the production of intermediate good j.

Intermediate firms rent capital and labor in perfectly competitive factor markets. Profits are distributed to households at the end of each time period. Let R^k_t and W_t denote the nominal rental rate on capital services and the wage rate, respectively. Workers must be paid in advance of production. As a result, the j-th firm must borrow its wage bill, W_t L_{jt}, from the financial intermediary at the beginning of the period. Repayment occurs at the end of time period t at the gross interest rate, R_t.

The firm's real marginal cost is s_t = ∂S_t(Y)/∂Y, where S_t(Y) = min_{k,l} { r^k_t k + w_t R_t l : Y given by (6) }, with r^k_t = R^k_t/P_t and w_t = W_t/P_t. Given our functional forms, we have

    s_t = ( 1/(1 − α) )^{1−α} ( 1/α )^α ( r^k_t )^α ( w_t R_t )^{1−α}.    (7)

Apart from fixed costs, the firm's time t profits are [ (P_{jt}/P_t) − s_t ] P_t Y_{jt}, where P_{jt} is firm j's price.

We assume that firms set prices according to a variant of the mechanism spelled out in Calvo (1983). This model has been widely used to characterize price-setting frictions. A useful feature of the model is that it can be solved without explicitly tracking the distribution of prices across firms. In each period, a firm faces a constant probability, 1 − ξ_p, of being able to reoptimize its nominal price. The ability to reoptimize its price is independent across firms and time. If a firm can reoptimize its price, it does so before the realization of the time t growth rate of money. Firms that cannot reoptimize their price simply index to lagged inflation:

    P_{jt} = π_{t−1} P_{j,t−1},    (8)

where π_t = P_t/P_{t−1}. We refer to this price-setting rule as lagged inflation indexation.

Let P̃_t denote the value of P_{jt} set by a firm that can reoptimize at time t. Our notation does not allow P̃_t to depend on j. We do this in anticipation of the well-known result that, in models like ours, all firms that can reoptimize their price at time t choose the same price (see Woodford 1996; Yun 1996). The firm chooses P̃_t to maximize

    E_{t−1} Σ_{l=0}^{∞} (β ξ_p)^l v_{t+l} ( P̃_t X_{tl} − s_{t+l} P_{t+l} ) Y_{j,t+l},    (9)

subject to (4), (7), and

    X_{tl} = π_t × π_{t+1} × ⋯ × π_{t+l−1}   for l ≥ 1,
    X_{tl} = 1                                for l = 0.    (10)

In (9), v_t is the marginal value of a dollar to the household, which is treated as exogenous by the firm. Later, we show that the value of a dollar, in utility terms, is constant across households. Also, E_{t−1} denotes the expectations operator conditioned on lagged growth rates of money, m_{t−l}, l ≥ 1. This specification of the information set captures our assumption that the firm chooses P̃_t before the realization of the time t growth rate of money.
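Since equations (7) and (10) are purely mechanical, a small numerical check is easy to write down. The Python sketch below evaluates the real marginal cost s_t and the cumulative indexation factor X_{tl}; the parameter and price values are placeholders for illustration, not calibrated values from the paper.

    import numpy as np

    def real_marginal_cost(r_k, w, R, alpha):
        # Real marginal cost s_t from equation (7): r_k is the real rental rate on
        # capital, w the real wage, R the gross loan rate (the wage bill is borrowed
        # in advance), and alpha the capital share.
        return ((1.0 / (1.0 - alpha)) ** (1.0 - alpha)
                * (1.0 / alpha) ** alpha
                * r_k ** alpha
                * (w * R) ** (1.0 - alpha))

    def indexation_factor(pi_path, l):
        # Cumulative indexation factor X_{tl} from equation (10): the product
        # pi_t * pi_{t+1} * ... * pi_{t+l-1} for l >= 1, and 1 for l = 0.
        return float(np.prod(pi_path[:l])) if l > 0 else 1.0

    # Placeholder values, for illustration only.
    print(real_marginal_cost(r_k=0.04, w=1.0, R=1.01, alpha=0.36))
    print(indexation_factor(pi_path=[1.010, 1.012, 1.008], l=3))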
To understand (9), note that P̃_t influences firm j's profits only as long as it cannot reoptimize its price. The probability that this happens for l periods is (ξ_p)^l, in which case P_{j,t+l} = P̃_t X_{tl}. The presence of (ξ_p)^l in (9) has the effect of isolating future realizations of idiosyncratic uncertainty in which P̃_t continues to affect the firm's profits.

C. Households

There is a continuum of households, indexed by j ∈ (0, 1). The j-th household makes a sequence of decisions during each period. First, it makes a consumption decision and a capital accumulation decision, and it decides how many units of capital services to supply. Second, it purchases securities, whose payoffs are contingent on whether it can reoptimize its wage decision. Third, it sets its wage rate after finding out whether it can reoptimize or not. Fourth, it receives a lump-sum transfer from the monetary authority. Finally, it decides how much of its financial assets to hold in the form of deposits with a financial intermediary and how much to hold in the form of cash.

Since the uncertainty faced by the household over whether it can reoptimize its wage is idiosyncratic in nature, households work different amounts and earn different wage rates. So, in principle, they are also heterogeneous with respect to consumption and asset holdings. A straightforward extension of arguments in Woodford (1996) and Erceg, Henderson, and Levin (2000) establishes that the existence of state-contingent securities ensures that, in equilibrium, households are homogeneous with respect to consumption and asset holdings. Reflecting this result, our notation assumes that households are homogeneous with respect to consumption and asset holdings but heterogeneous with respect to the wage rate they earn and the hours they work.

The preferences of the j-th household are given by

    E^j_{t−1} Σ_{l=0}^{∞} β^{l−t} [ u(c_{t+l} − b c_{t+l−1}) − z(h_{j,t+l}) + v(q_{t+l}) ].    (11)

Here, E^j_{t−1} is the expectation operator, conditional on aggregate and household j's idiosyncratic information up to, and including, time t − 1; c_t denotes time t consumption; h_{jt} denotes time t hours worked; q_t ≡ Q_t/P_t denotes real cash balances; and Q_t denotes nominal cash balances. When b > 0, (11) allows for habit formation in consumption preferences.

The household's asset evolution equation is given by

    M_{t+1} = R_t [ M_t − Q_t + (m_t − 1) M^a_t ] + A_{j,t} + Q_t + W_{j,t} h_{j,t}
              + R^k_t u_t k̄_t + D_t − P_t [ i_t + c_t + a(u_t) k̄_t ].    (12)

Here, M_t is the household's beginning of period t stock of money and W_{j,t} h_{j,t} is time t labor income. In addition, k̄_t, D_t, and A_{j,t} denote, respectively, the physical stock of capital, firm profits, and the net cash inflow from participating in state-contingent security markets at time t. The variable m_t represents the gross growth rate of the economywide per capita stock of money, M^a_t. The quantity (m_t − 1) M^a_t is a lump-sum payment made to households by the monetary authority. The quantity M_t − P_t q_t + (m_t − 1) M^a_t is deposited by the household with a financial intermediary, where it earns the gross nominal rate of interest, R_t.

The remaining terms in (12), aside from P_t c_t, pertain to the stock of installed capital, which we assume is owned by the household. The household's stock of physical capital, k̄_t, evolves according to

    k̄_{t+1} = (1 − δ) k̄_t + F(i_t, i_{t−1}).    (13)

Here, δ denotes the physical rate of depreciation, and i_t denotes time t purchases of investment goods. The function F summarizes the technology that transforms current and past investment into installed capital for use in the following period. We discuss the properties of F below.
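The capital block in (13) is straightforward to prototype. The sketch below iterates k̄_{t+1} = (1 − δ) k̄_t + F(i_t, i_{t−1}) for a hypothetical installation technology F(i_t, i_{t−1}) = [1 − S(i_t/i_{t−1})] i_t with a quadratic S; the text above only states that F maps current and past investment into installed capital, so this functional form, and every number used, is an assumption of the sketch rather than the paper's specification.

    def S(x, kappa=2.5):
        # Hypothetical quadratic adjustment-cost function with S(1) = S'(1) = 0.
        return 0.5 * kappa * (x - 1.0) ** 2

    def F(i_now, i_prev, kappa=2.5):
        # Assumed installation technology F(i_t, i_{t-1}) = [1 - S(i_t/i_{t-1})] * i_t.
        return (1.0 - S(i_now / i_prev, kappa)) * i_now

    def capital_path(k0, investment, delta=0.025):
        # Iterate equation (13): kbar_{t+1} = (1 - delta) * kbar_t + F(i_t, i_{t-1}).
        k = [k0]
        for t in range(1, len(investment)):
            k.append((1.0 - delta) * k[-1] + F(investment[t], investment[t - 1]))
        return k

    # Illustrative path: a one-period 10 percent burst in investment.
    print(capital_path(k0=10.0, investment=[1.0, 1.0, 1.1, 1.0, 1.0]))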
Capital services, k_t, are related to the physical stock of capital by k_t = u_t k̄_t. Here, u_t denotes the utilization rate of capital, which we assume is set by the household. In (12), R^k_t u_t k̄_t represents the household's earnings from supplying capital services. The increasing, convex function a(u_t) k̄_t denotes the cost, in units of consumption goods, of setting the utilization rate to u_t.

[Footnote 8] Our assumption that households make the capital accumulation and utilization decisions is a matter of convenience. At the cost of more complicated notation, we could work with an alternative decentralization scheme in which firms make these decisions.

D. The Wage Decision

As in Erceg et al. (2000), we assume that the household is a monopoly supplier of a differentiated labor service, h_{jt}. It sells this service to a representative, competitive firm that transforms it into an aggregate labor input, L_t, using the following technology:

    L_t = [ ∫_0^1 h_{jt}^{1/λ_w} dj ]^{λ_w}.

The demand curve for h_{jt} is given by

    h_{jt} = ( W_t / W_{jt} )^{λ_w/(λ_w − 1)} L_t,   1 ≤ λ_w < ∞.    (14)

Here, W_t is the aggregate wage rate, that is, the price of L_t. It is straightforward to show that W_t is related to W_{jt} via the relationship

    W_t = [ ∫_0^1 (W_{jt})^{1/(1 − λ_w)} dj ]^{1 − λ_w}.    (15)

The household takes L_t and W_t as given.

Households set their wage rate according to a variant of the mechanism used to model price setting by firms. In each period, a household faces a constant probability, 1 − ξ_w, of being able to reoptimize its nominal wage. The ability to reoptimize is independent across households and time. If a household cannot reoptimize its wage at time t, it sets W_{jt} according to

    W_{j,t} = π_{t−1} W_{j,t−1}.    (16)

E. Monetary and Fiscal Policy

We assume that monetary policy is given by

    m_t = m + v_0 ε_t + v_1 ε_{t−1} + v_2 ε_{t−2} + ⋯.    (17)

Here, m denotes the mean growth rate of money, and v_j is the response of E_t m_{t+j} to a time t monetary policy shock. We assume that the government has access to lump-sum taxes and pursues a Ricardian fiscal policy. Under this type of policy, the details of tax policy have no impact on inflation and other aggregate economic variables. As a result, we need not specify the details of fiscal policy.

[Footnote 9] See Sims (1994) or Woodford (1994) for a further discussion.

F. Loan Market Clearing, Final-Goods Clearing, and Equilibrium

Financial intermediaries receive M_t − Q_t from households and a transfer, (m_t − 1) M_t, from the monetary authority. Our notation here reflects the equilibrium condition, M^a_t = M_t. Financial intermediaries lend all their money to intermediate-goods firms, which use the funds to pay for L_t. Loan market clearing requires

    W_t L_t = m_t M_t − Q_t.    (18)

The aggregate resource constraint is

    c_t + i_t + a(u_t) ≤ Y_t.

We adopt a standard sequence-of-markets equilibrium concept. In our appendix, available on request, we discuss our computational strategy for approximating that equilibrium. This strategy involves taking a linear approximation about the nonstochastic steady state of the economy and using the solution method discussed in Christiano (2002). For details, see the previous version of this paper (Christiano et al. 2001). In principle, the nonnegativity constraint on intermediate-goods output in (6) is a problem for this approximation. It turns out that the constraint is not binding for the experiments that we consider, and so we ignore it.
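A minimal simulation of the policy rule in (17) illustrates what the v_j coefficients do: they trace out how money growth responds, period by period, to a single policy shock. The coefficients, the mean growth rate, and the shock size below are arbitrary placeholders, not estimates from the paper.

    import numpy as np

    def money_growth_path(mean_growth, v, shocks):
        # Simulate equation (17): m_t = mean_growth + sum_j v[j] * eps_{t-j},
        # treating v as a finite list of moving-average coefficients.
        T = len(shocks)
        m = np.full(T, float(mean_growth))
        for t in range(T):
            for j, vj in enumerate(v):
                if t - j >= 0:
                    m[t] += vj * shocks[t - j]
        return m

    # One expansionary shock at t = 0; coefficients and sizes are placeholders.
    eps = np.zeros(12)
    eps[0] = 0.3
    print(money_growth_path(mean_growth=1.017, v=[1.0, 0.8, 0.5, 0.25, 0.1], shocks=eps))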
Finally, it is worth noting that since profits are stochastic, the fact that they are zero, on average, implies that they are often negative. As a consequence, our assumption that firms cannot exit is binding. Allowing for firm entry and exit dynamics would considerably complicate our analysis.


Coolen, Augustin: A nonparametric predictive alternative to the Imprecise Dirichlet Model: the case of a known number of categories

Sonderforschungsbereich 386, Paper 489 (2006)
Online unter: http://epub.ub.uni-muenchen.de/

a Durham University, Department of Mathematical Sciences, Science Laboratories, Durham, DH1 3LE, UK
b Ludwig-Maximilians University, Department of Statistics, Ludwigstr. 33, D-80539 Munich, Germany

Email addresses: frank.coolen@ (F.P.A. Coolen), thomas@stat.uni-muen
Preprint submitted to International Journal of Approximate Reasoning, 6 September 2006
Abstract

Nonparametric Predictive Inference (NPI) is a general methodology to learn from data in the absence of prior knowledge and without adding unjustified assumptions. This paper develops NPI for multinomial data where the total number of possible categories for the data is known. We present the general upper and lower probabilities and several of their properties. We also comment on differences between this NPI approach and corresponding inferences based on Walley’s Imprecise Dirichlet Model.

Key words: Imprecise Dirichlet Model, imprecise probabilities, interval probability, known number of categories, lower and upper probabilities, multinomial data, nonparametric predictive inference, probability wheel.
variety of different applications (see, in particular, the survey by Bernard [4] and this special issue of International Journal of Approximate Reasoning). Our approach relies on the general framework of ‘Nonparametric Predictive Inference’ (NPI) [3,8], which is based on Hill’s assumption A(n) [21]. By using the same variation of this assumption as presented in [9], called ‘circular-A(n)’, our inference is closely related to our approach sketched in [9], where we explicitly do not assume any knowledge about the number of possible categories, apart from the information in the available data. A detailed and extensive presentation of NPI for multinomial data, considering all relevant aspects and containing detailed proofs and discussions of principles of general interval probabilistic statistical inference, is in preparation [10]. In the current paper, we present related results for the practically important case of a known number of possible categories, which is closer in nature to the traditional use of multinomial distributions. In comparison to the results without such knowledge [9], the inferences in this paper are either the same or have less imprecision; in the latter case the lower and upper probabilities are nested in the logical manner. In Section 2, we give brief introductions to A(n), circular-A(n), interval probability and NPI, and to the model underlying our inferences [9,10]. The main results, NPI-based lower and upper probabilities for the next observation on the basis of multinomial data with a known number of possible categories, are presented in Section 3, where we also formulate some general properties of these inferences. In Section 4 these results are compared to the IDM and numerical examples are used to illustrate particular features of these inferences. In Section 5 some additional issues are discussed. An explanation of the derivation of the lower and upper probabilities is provided in an Appendix.
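As background for the comparison with the IDM discussed above and in Section 4, recall that under Walley's Imprecise Dirichlet Model with hyperparameter s, the lower and upper predictive probabilities that the next observation falls in category j, after n_j observations in that category out of n in total, are n_j/(n + s) and (n_j + s)/(n + s). The short Python sketch below only restates this standard formula with an arbitrary example data set; it is included for orientation and is not part of the results of this paper.

    def idm_bounds(counts, s=2):
        # Lower and upper predictive probabilities for the next observation under
        # the Imprecise Dirichlet Model with hyperparameter s:
        #   lower_j = n_j / (n + s),  upper_j = (n_j + s) / (n + s).
        n = sum(counts.values())
        return {cat: (n_j / (n + s), (n_j + s) / (n + s))
                for cat, n_j in counts.items()}

    # Arbitrary example: 10 observations spread over 3 known categories.
    print(idm_bounds({"red": 5, "green": 3, "blue": 2}, s=2))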
2 Nonparametric Predictive Inference and the underlying model

Hill [21] introduced the assumption A(n) as a basis for predictive inference in case of real-valued observations. In his setting, suppose we have n observations ordered as z_1 < z_2 < . . . < z_n, which partition the real line into n + 1 intervals (z_{j−1}, z_j) for j = 1, . . . , n + 1, where we use notation z_0 = −∞ and z_{n+1} = ∞. Hill’s assumption A(n) is that a future observation, represented by a random quantity Z_{n+1}, falls into any such interval with equal probability, so we have P(Z_{n+1} ∈ (z_{j−1}, z_j)) = 1/(n + 1) for j = 1, . . . , n + 1. This assumption implies that the rank of Z_{n+1} amongst the n observed data has equal probability to be any value in {1, . . . , n + 1}. This clearly is a post-data assumption, related to exchangeability [17], which provides direct posterior predictive probabilities [18]. Hill [21,22] argued that A(n) is a reasonable basis for inference in the absence of any further process information beyond the data set, when actually
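To see what A(n) delivers numerically, the following minimal Python sketch takes a real-valued sample, assigns probability 1/(n + 1) to each of the n + 1 intervals created by the ordered data, and derives the lower and upper probabilities implied by A(n) for an event such as Z_{n+1} > c. The sample values and the threshold c are arbitrary illustrations, not data from the paper.

    def a_n_interval_probabilities(data):
        # Under Hill's A(n), the next observation Z_{n+1} falls in each of the
        # n + 1 open intervals created by the ordered data with probability 1/(n + 1).
        z = sorted(data)
        n = len(z)
        endpoints = [float("-inf")] + z + [float("inf")]
        return {(endpoints[j - 1], endpoints[j]): 1.0 / (n + 1)
                for j in range(1, n + 2)}

    def a_n_bounds_exceeding(data, c):
        # Lower and upper probabilities for the event Z_{n+1} > c implied by A(n):
        # intervals lying entirely above c contribute to the lower probability,
        # and intervals reaching above c contribute to the upper probability.
        z = sorted(data)
        n = len(z)
        endpoints = [float("-inf")] + z + [float("inf")]
        lower = sum(endpoints[j - 1] >= c for j in range(1, n + 2)) / (n + 1)
        upper = sum(endpoints[j] > c for j in range(1, n + 2)) / (n + 1)
        return lower, upper

    # Arbitrary sample of n = 5 observations and an arbitrary threshold.
    sample = [2.1, 3.4, 4.0, 5.2, 7.8]
    print(a_n_interval_probabilities(sample))
    print(a_n_bounds_exceeding(sample, c=4.5))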