Beyond k-Anonymity: A Decision-Theoretic Framework for Assessing Privacy Risk


Club Convergence


Applied Economics Letters, 2006, 13, 569-574

Club convergence in European regions

Rita De Siano (a) and Marcella D'Uva (b,*)
(a) Department of Economic Studies, University of Naples 'Parthenope', Via Medina 40, 80133 Naples, Italy
(b) Department of Social Sciences, University of Naples L'Orientale, Largo S. Giovanni Maggiore 30, 80134 Naples, Italy

This study investigates the 'club convergence' hypothesis, applying the stochastic notion of convergence to groups of European regions. In order to avoid the group-selection bias problem, the innovative regression tree technique was applied to select endogenously the most important variables in achieving the best identification of groups on the basis of per capita income and productive specialization. Tests of stochastic convergence in each group show strong convergence among the wealthiest regions of the European Union and a trend of weak convergence among the remaining groups, confirming Baumol's hypothesis of convergence.

I. Introduction

Over the past decade many authors have explored the evolution of output discrepancies at both national and regional levels. In particular, starting with Baumol (1986), it has been widely hypothesized that convergence may hold not for all economies but within groups of them showing similar characteristics (Azariadis and Drazen, 1990). This evidence is referred to as the 'club convergence' hypothesis, which implies that a set of economies may converge with each other, in the sense that in the long run they tend towards a common steady-state position, but there is no convergence across different sets. In seeking to test the club convergence hypothesis (Qing Li, 1999; Feve and Le Pen, 2000; Su, 2003, for example) two main questions arise: (a) which framework of convergence to use, and (b) how to identify the economies belonging to each club. Initially, a cross-section notion of convergence was used in order to verify the existence of a negative relationship between initial per capita income and its growth rate.
In contrast with this notion, a stochastic definition of convergence (Carlino and Mills, 1993) was proposed and explored using time-series analyses. According to this framework, there is stochastic convergence if per capita income disparities between economies follow a stationary process. Bernard and Durlauf (1996) found that when economies show multiple long-run equilibria, cross-sectional tests tend spuriously to reject the null hypothesis of no convergence and, as a consequence, represent a weaker notion of convergence than that of the time series. As regards the second point, two methods can be used to create different groups of economies. The first sorts economies according to some a priori criteria (initial level of GDP, education, technology, capital accumulation, etc.), while the second follows an endogenous selection method (Durlauf and Johnson, 1995). Finally, the switching regression with the contribution of additional information on the sample separation, followed by Feve and Le Pen (2000), can be mentioned as an intermediate method of modelling convergence clubs.
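The stationarity notion just described can be illustrated with a small sketch. This is not the authors' code: the data are simulated, the lag length is illustrative, the 5% critical value (-3.645, quoted in the paper's tables) is the only number taken from the text, and the KPSS side of the combined reading is represented by a boolean input rather than a full implementation.

```python
# Sketch of the ADF step of a stochastic-convergence test: regress the
# differenced relative income on a constant, a trend, the lagged level and
# k lagged differences, then compare the t-statistic on the lagged level
# with the 5% critical value (-3.645). Data are simulated.
import numpy as np

def adf_tstat(y, k=1):
    """t-statistic on the lagged level in an ADF regression with trend."""
    dy = np.diff(y)
    rows = []
    for t in range(k, len(dy)):
        # regressors: constant, trend, lagged level, k lagged differences
        rows.append([1.0, t, y[t], *dy[t - k:t][::-1]])
    X = np.array(rows)
    z = dy[k:]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    sigma2 = resid @ resid / (len(z) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[2] / np.sqrt(cov[2, 2])

def classify(adf_rejects, kpss_rejects):
    # Combined reading of the ADF and KPSS outcomes (after Qing Li, 1999).
    if adf_rejects and not kpss_rejects:
        return "strong convergence"
    if not adf_rejects and not kpss_rejects:
        return "weak convergence"
    if kpss_rejects and not adf_rejects:
        return "no convergence"
    return "further analysis needed"

rng = np.random.default_rng(0)
# Stationary AR(1) relative income: the unit root should be rejected.
ri = np.zeros(200)
for t in range(1, 200):
    ri[t] = 0.3 * ri[t - 1] + rng.normal()
stat = adf_tstat(ri, k=4)
print(classify(stat < -3.645, False))
```

The four-way `classify` rule mirrors the combined ADF/KPSS scheme the paper attributes to Qing Li (1999); a production analysis would use a tested library implementation of both tests rather than this hand-rolled regression.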
This study investigates the 'club convergence' hypothesis, applying the stochastic notion of convergence to groups of European regions sorted according to their initial levels of per capita income and productive specialization (De Siano and D'Uva, 2004, 2005), through the application of an innovative methodology known as Classification and Regression Tree analysis (CART). Unlike other partitioning methods, CART allows a regression to be performed together with a classification analysis on the same 'learning' dataset, without requiring a particular specification of the functional form for the predictor variables, which are selected endogenously. The importance of similarities in the initial productive specialization has been highlighted by several theoretical contributions (Jacobs, 1969; Marshall, 1980; Romer, 1986; Lucas, 1988; Helg et al., 1995; Brülhart, 1998; Ottaviano and Puga, 1998), which found that it can be crucial in determining both the nature and size of responses to external shocks. The paper is organized as follows: Section II introduces the methodology of the empirical analysis, Section III describes the dataset, Section IV shows the results of the econometric analysis and Section V concludes.

* Corresponding author. E-mail: mduva@unior.it
Applied Economics Letters ISSN 1350-4851 print / ISSN 1466-4291 online © 2006 Taylor & Francis. DOI: 10.1080/13504850600733473

II. Methodology

The empirical analysis is carried out in two parts: first, regions are grouped through classification and regression tree analysis (CART); then convergence is tested within 'clubs' using time-series analysis. The CART methodology (Breiman et al., 1984) provides binary recursive partitioning using non-parametric approaches in order to construct homogeneous groups of regions, using as predictors the splitting variables which minimize the intra-group 'impurity'.
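As a rough illustration of that partitioning step, here is a minimal regression-tree sketch. It is a stand-in under stated assumptions, not the paper's CART implementation (pruning and stopping rules are simplified away), and the data are simulated stand-ins for initial GDP and a specialization index.

```python
# Minimal sketch of CART-style recursive binary partitioning for regression:
# each split picks the variable and threshold that minimize the summed
# squared deviation ("impurity") of the two child nodes; a terminal node
# predicts the node mean. Data are simulated, not the paper's dataset.
import numpy as np

def best_split(X, y):
    """Return (feature, threshold, impurity) of the best binary split."""
    best = (None, None, np.inf)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j])[:-1]:
            left, right = y[X[:, j] <= thr], y[X[:, j] > thr]
            imp = (((left - left.mean()) ** 2).sum()
                   + ((right - right.mean()) ** 2).sum())
            if imp < best[2]:
                best = (j, thr, imp)
    return best

def grow(X, y, depth=2, min_leaf=5):
    """Recursively split; leaves store the node-mean prediction."""
    if depth == 0 or len(y) < 2 * min_leaf:
        return {"predict": y.mean()}
    j, thr, _ = best_split(X, y)
    mask = X[:, j] <= thr
    if mask.sum() < min_leaf or (~mask).sum() < min_leaf:
        return {"predict": y.mean()}
    return {"feature": j, "thr": thr,
            "left": grow(X[mask], y[mask], depth - 1, min_leaf),
            "right": grow(X[~mask], y[~mask], depth - 1, min_leaf)}

rng = np.random.default_rng(1)
gdp0 = rng.uniform(4000, 9000, 123)            # initial per capita GDP
spec = rng.uniform(0.5, 3.5, 123)              # one specialization index
X = np.column_stack([gdp0, spec])
growth = 180 - 0.01 * gdp0 + rng.normal(0, 5, 123)  # poorer regions grow faster
tree = grow(X, growth)
```

With this simulated catch-up pattern the root split lands on initial GDP, which is the flavour of grouping the paper describes; real CART software adds cost-complexity pruning and cross-validation on top of this core.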
The final outcome is a tree with branches and 'terminal nodes', as homogeneous as possible, where the average value of the node represents the predicted value of the dependent variable. In this analysis the regression is carried out through the least squares method, using the regional GDP growth rate as the dependent variable and initial GDP and the specialization indexes as explanatory variables. In the second part of the study, Carlino and Mills' (1993) notion of stochastic convergence is applied to each group identified by the CART methodology. It follows that if the logarithm of a region's per capita income relative to the group's average does not contain a unit root, the region converges. The model (Ben-David, 1994; Qing Li, 1999) is the following:

$y^j_{i,t} = \alpha_i + \beta_i t + \varphi\, y^j_{i,t-1} + \varepsilon_{i,t}$   (1)

where $y^j_{i,t}$ is the log of region $i$'s per capita income in year $t$, $j$ is the region's group and $\varepsilon$ is a white-noise error with mean 0. Summing Equation 1 over the regions of each group and dividing the outcome by the number of regions within the group, the following equation is obtained:

$\bar{y}_t = \bar{\alpha} + \bar{\beta} t + \varphi\, \bar{y}_{t-1} + \bar{\varepsilon}_t$   (2)

where $\bar{y}_t$ is the group's average per capita income in year $t$ (the group superscript is suppressed). Subtracting Equation 2 from Equation 1, one has:

$RI_{i,t} = A + B t + \varphi\, RI_{i,t-1} + \varepsilon_t$   (3)

where $RI_{i,t}$ is the logarithm of region $i$'s per capita income relative to the group's average at time $t$ ($y^j_{i,t} - \bar{y}_t$). For each region of the sample we apply the Augmented Dickey-Fuller (ADF) test (Dickey and Fuller, 1979), using the ADF regression of Equation 3:

$\Delta RI_t = \mu + \lambda t + \gamma\, RI_{t-1} + \sum_{j=1}^{k} c_j\, \Delta RI_{t-j} + \varepsilon_t$   (4)

At this point, considering the low power of the ADF test in the case of short time series, we also run the Kwiatkowski et al. (1992) test (KPSS) for trend stationarity. The null hypothesis of the KPSS test is trend stationarity against the unit-root alternative. If the KPSS statistic is larger than the critical values, the null hypothesis is rejected. The combined analysis of the KPSS and ADF test results leads to the following possibilities (Qing Li, 1999):

- rejection by the ADF test and failure to reject by KPSS: strong convergence;
- failure to reject by both ADF and KPSS: weak convergence;
- rejection by the KPSS test and failure to reject by ADF: no convergence;
- rejection by both the ADF and KPSS tests invites further analysis.

III. Data Description

This section presents the dataset used both to group the sample regions and to run the econometric analysis. Data for GDP and employment are from the Eurostat New Cronos Regio database at NUTS2 level.[1] Annual values for GDP per inhabitant in terms of Purchasing Power Parity (PPP) and the number of employees in the NACE92 productive branches from 1981 to 2000 are used. The sample consists of 123 regions belonging to nine countries: 11 Belgian, 8 Dutch, 29 German, 22 French,[2] 20 Italian, 18 Spanish, 5 Portuguese, 2 Greek,[3] 8 British.[4] For each region ($i$) the following initial productive specialization indexes (SP) were built for all the considered branches[5] ($j$):

$SP_{ij} = \dfrac{E_{ij} \,/\, \sum_{j=1}^{n} E_{ij}}{\sum_{i=1}^{m} E_{ij} \,/\, \sum_{i=1}^{m}\sum_{j=1}^{n} E_{ij}}$   (5)

where $E$ indicates the number of employees.

[1] According to EC Regulation No. 1059/2003.

IV. Empirical Results

The main purpose of the study is to test the 'club convergence' hypothesis across the European regions. In particular, the study aims to investigate whether a region's per capita income converges to the average of the group to which it belongs. In order to avoid the group-selection bias problem, the regression tree technique was applied to select endogenously the most important variables in achieving the best identification of groups (De Siano and D'Uva, 2005). If the majority of regions in a group converge, the group may be considered a convergence 'club'. The CART method allowed a tree to be built with four terminal nodes, including regions showing a more homogeneous behaviour of per capita GDP growth rate and productive specialization. Results of the CART analysis, together with the stochastic convergence tests for each group, are presented in what follows.

The first group consists of 11 regions (from Spain, Greece and Portugal), characterized by: the
highest estimated mean value of the GDP growth rate (126.08%) despite the lowest initial income level (an average equal to 4144.3); strong specialization in the agriculture sector (the highest, equal to 3.75), the construction branch (2.09) and the food and beverages compartment (1.93); the minimum specialization in the chemical, energy and machinery branches and the highest in food-beverages-tobacco, minerals and construction. More than 80% of these regions display 'weak' convergence, while the remaining regions show 'strong' convergence (Table 1).

The second group includes 23 regions (mainly from Belgium, Spain, Italy and the United Kingdom), characterized by: an average GDP growth rate equal to 111.36% and the second-highest initial income level (5788.78); strong specialization in the agriculture sector (2.68) and in the food and beverage (1.26), construction (1.52) and energy (1.20) compartments; the highest specialization in chemical products (0.98); the second-highest level of specialization in agriculture, construction and energy. Almost all these regions present 'weak' convergence (Table 2).

The third group is formed by 21 regions from Belgium, France, Germany, the Netherlands, Spain, the UK and Italy (only Abruzzo), characterized by: an estimate for the GDP growth rate of 106% and an average initial level of income equal to 6920.6; main specializations in manufacturing (1.03), mineral products (1.13), construction (1.22), food and beverage (1.45) and energy (1.21); the highest specialization in the energy and manufacturing branches. Except for Abruzzo and Noord-Brabant, which do not converge, all the other regions 'weakly' converge to the group's average (Table 3).

The fourth group contains 68 regions (almost all German, French and Italian (North-Centre), and some Belgian and Dutch), characterized by: the lowest estimate of the GDP growth rate (97.8%) despite the highest initial GDP level (8893.9); the highest specialization in the branches of the services sector (1.16 and 1.07, respectively) and in machinery (1.01); the lowest specialization in agriculture, food and beverages, textile and construction activities. These regions present the highest percentage of 'strong' convergence to the group's average (more than 60%; Table 4).

Table 5 presents the summary of the convergence test results (percentages are in parentheses). The main outcome of this study is the evidence of strong convergence among the wealthiest regions of the European Union. Besides, there appears to be a trend of weak convergence among the remaining groups as well (percentages are considerably over 80%). Therefore, Baumol's hypothesis of convergence within clubs showing similar characteristics is confirmed.

[2] The analysis starts from 1984 due to the lack of data in the respective regional labour statistics.
[3] During the period 1983-1987 there was a different aggregation of Greek regions at NUTS2 level. Kriti and Thessalia are the only regions which present data for the period 1984-2000.
[4] The geographic units for the UK are at NUTS1 level of the Eurostat classification because of the lack of data for NUTS2 units.
[5] Agriculture-forestry and fishery; manufacturing; fuel and power products; non-metallic minerals and minerals; food-beverages-tobacco; textiles-clothing-leather and footwear; chemical products; metal products; machinery-equipment and electrical goods; various industries; building and construction; transport and communication; credit and insurance services.

Table 1. Convergence test results of group 1 (ADF statistics; KPSS statistics, l = 4)

Castilla-la Mancha   -2.978    0.099
Extremadura          -3.32     0.097
Andalucia            -2.63     0.094
Ceuta y Melilla      -1.77     0.123
Canarias             -1.94     0.121
Thessalia            -1.76     0.137
gr43 Kriti           -4.05**   0.080
Pt11 Norte           -4.03**   0.126
Pt12 Centro (P)      -2.29     0.123
Pt14 Alentejo        -2.77     0.104
Pt15 Algarve         -2.01     0.086

Notes: ** denotes statistical significance using unit-root critical values at the 5% level (-3.645).

Table 2. Convergence test results of group 2 (ADF statistics; KPSS statistics, l = 4)

Vlaams Brabant           -1.22     0.100
Brabant Wallon           -1.60     0.111
Luxembourg                1.19     0.122
Lüneburg                 -0.28     0.114
Galicia                  -1.69     0.140
Principado de Asturias   -1.55     0.146**
Cantabria                -1.08     0.133
Aragon                   -1.58     0.142
Comunidad de Madrid      -1.38     0.091
Castilla y Leon          -2.58     0.138
Cataluna                 -1.55     0.097
Comunidad Valenciana     -1.42     0.105
Murcia                   -1.53     0.124
Molise                   -2.17     0.078
Campania                 -3.22     0.078
Puglia                   -2.82     0.115
Basilicata               -2.10     0.140
Calabria                 -5.07***  0.106
Sicilia                  -2.98     0.142
Sardegna                 -2.21     0.141
Lisboa e Vale do Tejo    -2.62     0.141
Wales                    -2.12     0.098
Northern Ireland         -1.79     0.120

Notes: ** and *** denote statistical significance using KPSS stationarity critical values at the 5% level (0.146) and 1% level (0.216) respectively, and unit-root critical values at the 5% (-3.645) and 1% (-4.469) levels.

Table 3. Convergence test results of group 3 (ADF statistics; KPSS statistics, l = 4)

Limburg                      -1.68   0.116
Hainaut                      -0.80   0.091
Namur                        -1.84   0.094
Niederbayern                 -1.27   0.104
Oberpfalz                    -1.40   0.097
Trier                        -1.43   0.119
Comunidad Foral de Navarra   -2.75   0.071
La Rioja                     -1.77   0.119
Baleares                     -2.96   0.108
Limousin                     -2.41   0.083
Languedoc-Roussillon         -3.39   0.105
Abruzzo                       2.60   0.153**
Friesland                    -3.62   0.142
Noord-Brabant                -2.59   0.148**
Limburg (NL)                 -2.98   0.128
Yorkshire and The Humber     -1.61   0.085
East Midlands                -2.19   0.091
West Midlands                -1.92   0.080
East Anglia                  -2.15   0.134
South West                   -1.95   0.091
Scotland                      2.22   0.093

Notes: ** denotes statistical significance using KPSS stationarity critical values at the 5% level (0.146).

Table 4. Convergence test results of group 4 (ADF statistics; KPSS statistics, l = 4)

Région Bruxelles capitale    -2.65     0.112
Antwerpen                    -2.77     0.102
Oost-Vlaanderen              -3.15     0.078
West-Vlaanderen              -3.03     0.097
Liège                        -3.06     0.089
Stuttgart                    -4.22**   0.123
Karlsruhe                    -4.51***  0.088
Freiburg                     -5.11***  0.092
Tübingen                     -4.94***  0.104
Oberbayern                   -4.17**   0.094
Mittelfranken                -3.79**   0.089
Unterfranken                 -0.42     0.140
Schwaben                     -4.11**   0.084
Bremen                       -3.76**   0.121
Hamburg                      -3.35     0.097
Darmstadt                    -3.15     0.125
Gießen                       -3.02     0.088
Kassel                       -3.012    0.094
Braunschweig                 -3.82**   0.116
Hannover                     -3.96**   0.083
Weser-Ems                    -3.40     0.084
Düsseldorf                   -3.94**   0.097
Köln                         -3.96**   0.084
Münster                      -4.04**   0.087
Detmold                      -4.06**   0.099
Arnsberg                     -3.98**   0.096
Koblenz                      -3.88**   0.113
Rheinhessen-Pfalz            -4.18**   0.107
Saarland                     -4.35**   0.090
Schleswig-Holstein           -3.36     0.089
Pais Vasco                   -3.63     0.159**
Île de France                -4.61***  0.110
Champagne-Ardenne            -3.79**   0.157**
Picardie                     -4.44**   0.142
Haute-Normandie              -4.11**   0.102
Centre (FR)                  -5.13***  0.099
Basse-Normandie              -3.86**   0.101
Bourgogne                    -5.03***  0.113
Nord-Pas-de-Calais           -4.37**   0.130
Lorraine                     -4.41**   0.139
Alsace                       -4.13**   0.094
Franche-Comté                -5.20***  0.145
Pays de la Loire             -4.34**   0.116
Bretagne                     -4.41**   0.124
Poitou-Charentes             -4.74***  0.102
Aquitaine                    -3.29     0.104
Midi-Pyrénées                -5.48***  0.103
Rhône-Alpes                  -4.93***  0.104
Auvergne                     -4.43**   0.135
Provence-Alpes-Côte d'Azur   -5.10***  0.109
Corse                        -2.56     0.166**
Piemonte                     -3.46     0.112
Valle d'Aosta                -4.36**   0.080
Liguria                      -4.26**   0.117
Lombardia                    -4.04**   0.101
Trentino-Alto Adige          -3.84**   0.109
Veneto                       -3.68**   0.106
Friuli-Venezia Giulia        -4.20**   0.116
Emilia-Romagna               -3.12     0.136
Toscana                      -3.19     0.121
Umbria                       -3.56     0.146**
Marche                       -3.25     0.136
Lazio                        -3.96**   0.098
Drenthe                      -1.85     0.134
Utrecht                      -2.40     0.155**
Noord-Holland                -1.99     0.137
Zuid-Holland                 -2.20     0.138
Zeeland                      -3.78**   0.093

Notes: ** and *** denote statistical significance using KPSS stationarity critical values at the 5% level (0.146) and 1% level (0.216) respectively, and unit-root critical values at the 5% (-3.645) and 1% (-4.469) levels.

Table 5. Convergence test results (percentages in parentheses)

Group   No. of regions   Strong convergence   Weak convergence   No convergence
1       11               2 (18.19)            9 (81.81)          -
2       23               1 (4.35)             21 (91.3)          1 (4.35)
3       21               -                    19 (90.48)         2 (9.52)
4       68               43 (63.23)           20 (29.41)         4 (5.88)

V. Conclusion

This study tests the 'club convergence' hypothesis, applying the stochastic notion of convergence to groups of European regions. In order to avoid the group-selection bias problem, the innovative regression tree technique was applied to select endogenously the most important variables in achieving the best identification of groups. Tests of stochastic convergence in each group identified by CART show strong convergence among the wealthiest regions of the European Union and a trend of weak convergence among the remaining groups.

References

Azariadis, C. and Drazen, A. (1990) Threshold externalities in economic development, Quarterly Journal of Economics, 105, 501-26.
Baumol, W. J. (1986) Productivity growth, convergence and welfare: what the long run data show, American Economic Review, 76, 1072-85.
Ben-David, D. (1994) Convergence clubs and diverging economies, unpublished manuscript, University of Houston, Ben-Gurion University and CEPR.
Bernard, A. B. and Durlauf, S. N. (1996) Interpreting tests of the convergence hypothesis, Journal of Econometrics, 71, 161-73.
Breiman, L., Friedman, J. L., Olshen, R. A. and Stone, C. J. (1984) Classification and Regression Trees, Wadsworth, Belmont, CA.
Brülhart, M. (1998) Economic geography, industrial location and trade: the evidence, World Economy, 21, 775-801.
Carlino, G. A. and Mills, L. O. (1993) Are US regional incomes converging? A time series analysis, Journal of Monetary Economics, 32, 335-46.
De Siano, R. and D'Uva, M. (2004) Specializzazione e crescita: un'applicazione alle regioni dell'Unione Monetaria Europea, Rivista Internazionale di Scienze Sociali, 4, 381-98.
De Siano, R. and D'Uva, M. (2005) Regional growth in Europe: an analysis through CART methodology, Studi Economici, 87, 115-28.
Dickey, D. A. and Fuller, W. A. (1979) Distribution of the estimators for autoregressive time series with a unit root, Journal of the American Statistical Association, 74, 427-31.
Durlauf, S. N. and Johnson, P. A. (1995) Multiple regimes and cross-country growth behaviour, Journal of Applied Econometrics, 10, 365-84.
Feve, P. and Le Pen, Y. (2000) On modelling convergence clubs, Applied Economics Letters, 7, 311-14.
Helg, R., Manasse, P., Monacelli, T. and Rovelli, R. (1995) How much (a)symmetry in Europe? Evidence from industrial sectors, European Economic Review, 39, 1017-41.
Jacobs, J. (1969) The Economy of Cities, Jonathan Cape, London.
Kwiatkowski, D., Phillips, P. C. B., Schmidt, P. and Shin, Y. (1992) Testing the null hypothesis of stationarity against the alternative of a unit root: how sure are we that economic time series have a unit root?, Journal of Econometrics, 54, 159-78.
Lucas, R. E. (1988) On the mechanics of economic development, Journal of Monetary Economics, 22, 3-42.
Marshall, A. (1980) Principles of Economics, Macmillan, London.
Ottaviano, I. and Puga, D. (1998) Agglomeration in the global economy: a survey of the 'new economic geography', World Economy, 21, 707-31.
Qing, L. (1999) Convergence clubs: some further evidence, Review of International Economics, 7, 59-67.
Romer, P. M. (1986) Increasing returns and long run growth, Journal of Political Economy, 94, 1002-37.
Su, J. J. (2003) Convergence clubs among 15 OECD countries, Applied Economics Letters, 10, 113-18.

Selected Questions, Chapter 1, Question 7


Answer in English: In the realm of artificial intelligence, questions such as "What is it, anyway?" and "What are its capabilities?" arise. To unravel the true nature of AI, we explore its various dimensions.

Cognitive abilities: AI systems exhibit impressive cognitive capabilities that mimic human intelligence. They possess learning algorithms that enable them to process vast amounts of data, detect patterns, and make informed predictions. Moreover, they can engage in natural language processing, understanding and generating human language with remarkable accuracy.

Machine learning: At the heart of AI's cognitive prowess lies machine learning. This technology empowers AI systems to learn from data without explicit programming. Supervised learning trains models on labeled data, while unsupervised learning uncovers hidden patterns in unlabeled data. Reinforcement learning rewards desired behaviors, fostering efficient decision-making.

Computer vision: AI's ability to "see" and interpret visual information is known as computer vision. Convolutional neural networks (CNNs) enable AI systems to extract meaningful features from images, empowering them with object recognition, facial detection, and scene analysis capabilities.

Natural language processing: Natural language processing (NLP) bridges the gap between humans and machines by allowing AI systems to understand and generate human language. NLP tasks encompass machine translation, text summarization, and sentiment analysis, facilitating seamless communication between humans and AI.

Robotics: AI plays a pivotal role in robotics, enabling machines to navigate their surroundings, manipulate objects, and interact with the physical world.
AI-powered robots possess autonomous navigation, object manipulation, and perception capabilities, paving the way for advancements in manufacturing, healthcare, and exploration.

Applications and impacts: AI's versatility extends across a wide range of disciplines, leaving an indelible mark on fields such as healthcare, finance, and transportation. In healthcare, AI assists in medical diagnosis, drug discovery, and personalized treatment plans. In finance, it automates fraud detection, risk assessment, and portfolio management. In transportation, AI powers self-driving cars, traffic optimization, and logistics planning.

Ethical considerations: As AI's capabilities continue to expand, so too must we consider its ethical implications. Concerns regarding privacy, bias, and accountability arise, necessitating the development of ethical guidelines and responsible AI practices.

The future of AI: The future of AI holds boundless possibilities, with ongoing research and development promising even more advanced and multifaceted capabilities. AI's integration into various industries will continue to revolutionize our lives, ushering in an era of unprecedented technological progress.

Answer in Chinese (translated): What is artificial intelligence? Artificial intelligence (AI) is a branch of computer science that aims to create computer systems capable of performing tasks that normally require human intelligence.
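The supervised/unsupervised distinction described in the machine-learning paragraph above can be made concrete with a small, entirely illustrative sketch: a nearest-centroid classifier learns from labeled examples, while k-means recovers the same two groups with the labels withheld (data and parameters are invented).

```python
# Supervised vs. unsupervised learning on the same toy 2-D dataset.
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(loc=(-2.0, 0.0), scale=0.5, size=(50, 2))
b = rng.normal(loc=(2.0, 0.0), scale=0.5, size=(50, 2))
X = np.vstack([a, b])
labels = np.array([0] * 50 + [1] * 50)

# Supervised: fit one centroid per class from the labeled examples.
centroids = np.array([X[labels == k].mean(axis=0) for k in (0, 1)])

def predict(x):
    """Classify a point by its nearest class centroid."""
    d = np.linalg.norm(centroids - x, axis=1)
    return int(np.argmin(d))

# Unsupervised: k-means finds the two clusters without seeing any labels.
centers = X[[0, -1]].copy()            # crude initialization
for _ in range(10):
    assign = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([X[assign == k].mean(axis=0) for k in (0, 1)])
```

The supervised model needs the `labels` array to fit; the clustering loop never touches it, yet on well-separated data both end up describing the same two groups.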

Contemporary Graduate English, Unit 7, Text B: Translation


Profits at a price: biotech companies are snapping up patents on anything that can alter animal DNA sequences.

This is a blow that threatens to hold back the progress of medical research.

Carpenters take the tools of their trade for granted.

Once they have bought their wood and their hammers, they can use them to make anything they choose.

Nobody from the lumber yard or the tool shop turns up years later to claim a share of the profits.

For the scientists who are crafting tomorrow's medicines, such independence is a rare luxury.

The companies that develop or discover the tools and raw materials of the biotechnology trade keep a tight watch on the others who use them.

These tools include the DNA sequences of key genes; fragments of the genes of humans, animals, plants and some viruses, such as HIV; cloned cells; enzymes; gene deletions; and DNA chips used for the rapid scanning of DNA samples.

To get their hands on these key resources, medical researchers often have to sign agreements that restrict how they may use them, or that guarantee the company that made the discovery a share of whatever comes out of the work.

Many academics say this is stifling progress towards understanding and curing disease.

Such complaints alarmed Harold, director of the US National Institutes of Health near Washington, who earlier in the year set up a working group to look into the matter.

Thanks to his early inquiry, a preliminary report is due out next month.

Rebecea Eisenberg, a law professor at the University of Michigan in Ann Arbor who chairs the working group, says it has heard complaints from many researchers, among them a weighty dossier submitted by the US association of university technology managers.

To help gather evidence, the NIH has set up a website where researchers can anonymously report cases in which they believe restrictive licences have seriously hampered their work.

Confidentiality clauses, and agreements that force researchers to show their manuscripts to a company before publication, are among the most common causes of complaint.

Another problem is that some companies insist on retaining the right to an automatic licence on any future product discovered using their materials, together with clauses giving them a claim on any profits earned with their tools.

"If you have to sign many such agreements, it really is a big headache," says Eisenberg.

Empirical processes of dependent random variables


2 Preliminaries

... the marginal and empirical distribution functions. Let G be a class of measurable functions from R to R. The centered G-indexed empirical process is given by

$(P_n - P)g = \frac{1}{n} \sum_{i=1}^{n} \left[ g(X_i) - E\, g(X_i) \right].$

Empirical processes that have been discussed include linear processes and Gaussian processes; see Dehling and Taqqu (1989) and Csörgő and Mielniczuk (1996) for long- and short-range dependent subordinated Gaussian processes, and Ho and Hsing (1996) and Wu (2003a) for long-range dependent linear processes. A collection of recent results is presented in Dehling, Mikosch and Sorensen (2002). In that collection Dedecker and Louhichi (2002) made an important generalization of Ossiander's (1987) result.

Here we investigate the empirical central limit problem for dependent random variables from another angle that avoids strong mixing conditions. In particular, we apply a martingale method and establish a weak convergence theory for stationary, causal processes. Our results are comparable with the theory for independent random variables in that the imposed moment conditions are optimal or almost optimal. We show that, if the process is short-range dependent in a certain sense, then the limiting behavior is similar to that of iid random variables: the limiting distribution is a Gaussian process and the norming sequence is $\sqrt{n}$. For long-range dependent linear processes, one needs to apply asymptotic expansions to obtain $\sqrt{n}$-norming limit theorems (Section 6.2.2).

The paper is structured as follows. In Section 2 we introduce some mathematical preliminaries necessary for the weak convergence theory and illustrate the essence of our approach. Two types of empirical central limit theorems are established. Empirical processes indexed by indicators of left half-lines, absolutely continuous functions, and piecewise differentiable functions are discussed in Sections 3, 4 and 5 respectively. Applications to linear processes and iterated random functions are made in Section 6. Section 7 presents some integral and maximal inequalities that may be of independent interest.
Some proofs are given in Sections 8 and 9.
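For concreteness, here is a small numerical sketch of the centered empirical process over half-line indicators, using a simulated AR(1) causal process as a stand-in for the dependent sequences studied here. The model, parameters, and the rough boundedness check on the $\sqrt{n}$-scaled values are all illustrative assumptions, not results from the paper.

```python
# The centered empirical process (P_n - P)g for half-line indicators
# g_s(x) = 1{x <= s} is the empirical CDF minus the true CDF. Data are a
# simulated short-range dependent AR(1) process x_t = 0.5 x_{t-1} + e_t.
import numpy as np
from math import erf

rng = np.random.default_rng(0)
n = 5000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()

# Stationary marginal distribution is N(0, 1 / (1 - 0.5**2)).
sd = (1 - 0.5 ** 2) ** -0.5

def true_cdf(s):
    return 0.5 * (1 + erf(s / (sd * 2 ** 0.5)))

def centered_process(s):
    """(P_n - P)g for g = 1{. <= s}, i.e. F_n(s) - F(s)."""
    return (x <= s).mean() - true_cdf(s)

# Under short-range dependence the sqrt(n)-scaled fluctuations stay O(1),
# in line with the iid-like behavior described in the text.
vals = [abs(centered_process(s)) * n ** 0.5 for s in (-1.0, 0.0, 1.0)]
```

This only illustrates the object being studied; the paper's actual content is the weak convergence of this process, as a function of $s$, to a Gaussian limit.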

Academic English for the Social Sciences, Unit A: Translation


1. Losing a job can be the most distressing economic event in a person's life.

Most people rely on their labour earnings to maintain their standard of living, and many get from their work not only income but also a sense of personal accomplishment.

A job loss means a lower living standard in the present, anxiety about the future, and reduced self-esteem.

It is not surprising, therefore, that politicians campaigning for office often speak about how their proposed policies will help create jobs.

2. Although some unemployment is inevitable in a complex economy with thousands of firms and millions of workers, the amount of unemployment varies substantially over time and across countries.

When a country keeps its workers as fully employed as possible, it achieves a higher level of GDP than it would if it left many of its workers standing idle.

3. Problems of unemployment are usually divided into two categories: the long-run problem and the short-run problem.

The economy's natural rate of unemployment usually refers to the rate of unemployment under full employment.

Cyclical unemployment refers to the year-to-year fluctuations in unemployment around its natural rate, and it is closely associated with the short-run ups and downs of economic activity.

4. In judging how serious the problem of unemployment is, one question to consider is whether unemployment is typically a short-term or long-term condition.

If unemployment is short-term, one might conclude that it is not a big problem.

Workers may need a few weeks between jobs to find the openings best suited to their tastes and skills.

Yet if unemployment is long-term, one might conclude that it is a serious problem.

Workers unemployed for many months are more likely to suffer economic and psychological hardship.

5. One reason economies always experience some unemployment is job search.

Job search is the process of matching workers with appropriate jobs.

If all workers and all jobs were the same, so that all workers were equally well suited to all jobs, job search would not be a problem.

Laid-off workers would quickly find new jobs that suited them well.

But in fact workers differ in their tastes and skills, jobs differ in their attributes, and information about job candidates and job vacancies spreads slowly among the economy's many firms and households.

6. Frictional unemployment is often the result of changes in the demand for labour among different firms.

When consumers decide that they prefer Fujitsu to Acer, Fujitsu increases employment and Acer lays off workers.

The former Acer workers must now search for new jobs, and Fujitsu must decide which new workers to hire for the various jobs that have opened up.

The result of this transition is a period of unemployment.

7. Similarly, because different regions of the country produce different goods, employment can rise in one region while it falls in another.

CDMP P-Level Question Bank


The problem at hand relates to the CDMP P-Level question bank. To address it, it is essential to consider various perspectives and provide a comprehensive analysis.

From an educational perspective, the CDMP P-Level question bank plays a crucial role in assessing the knowledge and skills of individuals in the field of data management. It serves as a tool to evaluate the competency of professionals and to ensure that they possess the expertise needed to handle complex data-related challenges. The question bank should encompass a wide range of topics, including data modeling, data governance, data quality, and data integration, among others. By covering a diverse array of subjects, it can effectively evaluate the proficiency of candidates and provide a reliable measure of their capabilities.

Another perspective to consider is that of the candidates who will be using the question bank. For these individuals, it represents an opportunity to showcase their knowledge and skills in data management. It is essential that the questions are well constructed, challenging, and reflective of real-world scenarios, so that candidates can demonstrate their ability to apply theoretical knowledge to practical situations. In addition, the question bank should be updated regularly to keep pace with the evolving nature of data management practices and technologies.

From an industry standpoint, the CDMP P-Level question bank serves as a valuable resource for organizations seeking to hire competent data management professionals.
By using the question bank as part of the recruitment process, companies can ensure that they select candidates who possess the skills needed to manage and leverage data effectively. This, in turn, can contribute to improved data-driven decision-making, enhanced data governance and, ultimately, better business outcomes. It is therefore crucial that the question bank accurately reflects the knowledge and expertise required in the industry.

On the other hand, it is important to acknowledge the potential challenges and limitations of the question bank. One concern is the risk of its becoming outdated, given the rapid pace of technological change in data management. To address this, regular updates and revisions should be conducted so that it remains relevant and aligned with current industry practice. The question bank should also strike a balance between theoretical concepts and practical application, so as to provide a comprehensive assessment of candidates' abilities.

Furthermore, it is essential to consider the ethical implications of the question bank. The questions must be fair, unbiased, and inclusive, avoiding any form of discrimination or prejudice. They should assess candidates on their knowledge and skills rather than on their personal background or characteristics, and the question bank should be accessible to people from diverse backgrounds, creating no barriers to entry for under-represented groups in data management.

In conclusion, the CDMP P-Level question bank plays a vital role in evaluating the knowledge and skills of data management professionals. It serves as a tool for assessing candidates, aids the recruitment process, and promotes the effective use of data in organizations.
However, the challenges of outdated content and the ethical considerations above must be addressed so that the question bank remains relevant, fair, and inclusive. By taking multiple perspectives into account and continuously improving it, the question bank can contribute effectively to the growth and development of the data management field.

Kaoshichong (Wang Ruoping): Mastering Long and Difficult Sentences for the Postgraduate English Exam, Part One


Reading Fundamentals: Mastering Difficult Sentences (Wang Ruoping). Part One: Difficult Sentences Classified and Analysed. Chapter 1: Attributive Clauses. What an attributive (relative) clause modifies is a problem readers constantly meet and find hard to pin down: whether the relative word refers back to a single word, a phrase, or the whole preceding clause must be decided partly by linguistic knowledge and partly by logical judgement from the context.

Attributive clauses may be restrictive or non-restrictive.

In addition, an attributive clause may stand in an adverbial relation to the main clause, expressing reason, purpose, concession, condition, and so on.

But he did not talk at length about the matter, which was not considered by the White House to be a particularly important question. (expressing reason) Gloss: He did not discuss the matter at length, because the White House did not regard it as a particularly important question.

Anyone who thinks that rational knowledge need not be derived from perceptual knowledge is an idealist. (expressing condition) Gloss: If anyone thinks that rational knowledge need not be derived from perceptual knowledge, he is an idealist.

So my chances of getting to revolutionary China are pretty slim, although I have not given up my efforts to get a passport, which will enable me to visit the countries of Socialism. (expressing purpose) Gloss: My chances of reaching revolutionary China are therefore rather slim, although I have not given up my efforts to obtain a passport that will enable me to visit the socialist countries.

He insisted on buying a car, which he could not afford and had no use for. (expressing concession) Gloss: He insisted on buying a car, even though he could not afford one and had no use for it.

1. Libraries made education possible, and education in its turn added to libraries; the growth of knowledge followed a kind of compound-interest law, which was greatly enhanced by the invention of printing. Key point: judged by the content, "which" modifies "the growth of knowledge". Gloss: The appearance of libraries made the development of education possible, and the development of education in turn kept enlarging and enriching libraries; the growth of knowledge followed a kind of compound-interest law, and this growth was greatly accelerated by the invention of printing.

A Model of Reference-Dependent Preferences


THE QUARTERLY JOURNAL OF ECONOMICS
Vol. CXXI, Issue 4, November 2006

A MODEL OF REFERENCE-DEPENDENT PREFERENCES*

BOTOND KŐSZEGI AND MATTHEW RABIN

We develop a model of reference-dependent preferences and loss aversion where "gain–loss utility" is derived from standard "consumption utility" and the reference point is determined endogenously by the economic environment. We assume that a person's reference point is her rational expectations held in the recent past about outcomes, which are determined in a personal equilibrium by the requirement that they must be consistent with optimal behavior given expectations. In deterministic environments, choices maximize consumption utility, but gain–loss utility influences behavior when there is uncertainty. Applying the model to consumer behavior, we show that willingness to pay for a good is increasing in the expected probability of purchase and in the expected prices conditional on purchase. In within-day labor-supply decisions, a worker is less likely to continue work if income earned thus far is unexpectedly high, but more likely to show up as well as continue work if expected income is high.

*We thank Paige Skiba and Justin Sydnor for research assistance, and Daniel Benjamin, B. Douglas Bernheim, Stefano DellaVigna, Xavier Gabaix, Edward Glaeser, George Loewenstein, Wolfgang Pesendorfer, Charles Plott, Andrew Postlewaite, Antonio Rangel, Jacob Sagi, Christina Shannon, Jean Tirole, Peter Wakker, seminar participants at the University of California—Berkeley, the Harvard University/Massachusetts Institute of Technology Theory Seminar, Princeton University, Rice University, the Stanford Institute for Theoretical Economics, and the University of New Hampshire, graduate students in courses at the University of California—Berkeley, Harvard University, and the Massachusetts Institute of Technology, and anonymous referees for helpful comments. We are especially grateful to Erik Eyster, Daniel Kahneman, and Nathan Novemsky for discussions on related topics at the formative stage of this project. Kőszegi thanks the Hellman Family Faculty Fund, and Rabin thanks the MacArthur Foundation and the National Science Foundation for financial support.

©2006 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology.

I. INTRODUCTION

How a person assesses the outcome of a choice is often determined as much by its contrast with a reference point as by intrinsic taste for the outcome itself. The most notable manifestation of such reference-dependent preferences is loss aversion: losses resonate more than same-sized gains. In this paper we build on the essential intuitions in Kahneman and Tversky's [1979] prospect theory and subsequent models of reference dependence, but flesh out, extend, and modify these models to develop a more generally applicable theory. We illustrate such applicability by establishing some implications of loss aversion for consumer behavior and labor effort.

We present the basic framework in Section II. A person's utility depends not only on her K-dimensional consumption bundle c but also on a reference bundle r. She has an intrinsic "consumption utility" m(c) that corresponds to the outcome-based utility classically studied in economics. Overall utility is given by u(c|r) ≡ m(c) + n(c|r), where n(c|r) is "gain–loss utility." Both consumption utility and gain–loss utility are separable across dimensions, so that m(c) ≡ Σ_k m_k(c_k) and n(c|r) ≡ Σ_k n_k(c_k|r_k).
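To make this additive structure concrete, here is a minimal sketch of u(c|r) = m(c) + n(c|r) with both parts separable across dimensions. The piecewise-linear gain–loss function and the specific consumption utilities below are my own illustrative assumptions, not the paper's specification (the paper's conditions on the gain–loss function appear in Section II):

```python
from typing import Callable, Sequence

LAMBDA = 2.0  # illustrative loss-aversion coefficient (any lambda > 1)

def mu(x: float) -> float:
    """Illustrative piecewise-linear gain-loss function: losses weighted by LAMBDA."""
    return x if x >= 0 else LAMBDA * x

def utility(c: Sequence[float], r: Sequence[float],
            m: Sequence[Callable[[float], float]]) -> float:
    """u(c|r) = m(c) + n(c|r); both parts are additively separable over K dimensions."""
    consumption = sum(m_k(c_k) for m_k, c_k in zip(m, c))
    gain_loss = sum(mu(m_k(c_k) - m_k(r_k)) for m_k, c_k, r_k in zip(m, c, r))
    return consumption + gain_loss

# Two dimensions: money (linear consumption utility) and a mug worth 3 in money terms.
m = [lambda x: x, lambda x: 3.0 * x]
r_own = [0.0, 1.0]  # reference point of an owner who expects to keep the mug

u_keep = utility([0.0, 1.0], r_own, m)  # keep the mug: no gains or losses felt
u_sell = utility([4.0, 0.0], r_own, m)  # sell at a price above the mug's worth
```

With these numbers the owner declines $4 for a mug whose consumption value is only 3 (u_keep = 3 > u_sell = 2): the lost mug is weighted by λ while the money gained is not, which is the loss-aversion account of the endowment effect discussed below.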
Because the sensation of gain or loss due to a departure from the reference point seems closely related to the consumption value attached to the goods in question, we assume that n_k(c_k|r_k) ≡ μ(m_k(c_k) − m_k(r_k)), where μ(·) satisfies the properties of Kahneman and Tversky's [1979] value function. Our model allows for both stochastic outcomes and stochastic reference points, and assumes that a stochastic outcome F is evaluated according to its expected utility, with the utility of each outcome being the average of how it feels relative to each possible realization of the reference point G: U(F|G) = ∫∫ u(c|r) dG(r) dF(c).

In addition to the widely investigated question of how people react to departures from a posited reference point, predictions of reference-dependent theories also depend crucially on the understudied issue of what the reference point is. In Section III we propose that a person's reference point is the probabilistic beliefs she held in the recent past about outcomes. Although existing evidence is instead generally interpreted by equating the reference point with the status quo, virtually all of this evidence comes from contexts where people plausibly expect to maintain the status quo. But when expectations and the status quo are different—a common situation in economic environments—equating the reference point with expectations generally makes better predictions. Our theory, for instance, supports the common view that the "endowment effect" found in the laboratory, whereby random owners value an object more than nonowners, is due to loss aversion—since an owner's loss of the object looms larger than a nonowner's gain of the object. But our theory makes the less common prediction that the endowment effect among such owners and nonowners with no predisposition to trade will disappear among sellers and buyers in real-world markets who expect to trade. Merchants do not assess intended sales as loss of inventory, but do assess failed sales as loss of money; buyers do not assess intended
expenditures as losses, but do assess failures to carry out intended purchases or paying more than expected as losses. Equating the reference point with expectations rather than the status quo is also important for understanding financial risk: while an unexpected monetary windfall in the lab may be assessed as a gain, a salary of $50,000 to an employee who expected $60,000 will not be assessed as a large gain relative to status-quo wealth, but rather as a loss relative to expectations of wealth. And in nondurable consumption—where there is no object with which the person can be endowed—a status-quo-based theory cannot capture the role of reference dependence at all: it would predict, for instance, that a person who misses a concert she expected to attend would feel no differently than somebody who never expected to see the concert.

While alternative theories of expectations formation could be used, in this paper we complete our model by assuming rational expectations, formalizing (in an extreme way) the realistic assumption that people have some ability to predict their own behavior. Using the framework of Kőszegi [2005] to determine rational expectations when preferences depend on expectations, we define a "personal equilibrium" as a situation where the stochastic outcome implied by optimal behavior conditional on expectations coincides with expectations.¹ We also define a notion of "preferred personal equilibrium," which selects the (typically unique) personal equilibrium with highest expected utility.

We show in Section III that in deterministic environments preferred personal equilibrium predicts that decision-makers maximize consumption utility, replicating the predictions of classical reference-independent utility theory. Our analyses in Sections IV and V of consumer and labor-supply behavior, however, demonstrate a central implication of our theory: when there is uncertainty, a decision-maker's preferences over consumption bundles will be influenced by her environment.

Footnote 1: Expectations have been mentioned by many researchers as a candidate for the reference point. With the exception of Shalev's [2000] game-theoretic model, however, to our knowledge our paper is the first to formalize the idea that expectations determine the reference point and to specify a rule for deriving them endogenously in any environment.

Section IV shows that a consumer's willingness to pay a given price for shoes depends on the probability with which she expected to buy them and the price she expected to pay. On the one hand, an increase in the likelihood of buying increases a consumer's sense of loss of shoes if she does not buy, creating an "attachment effect" that increases her willingness to pay. Hence, the greater the likelihood she thought prices would be low enough to induce purchase, the greater is her willingness to buy at higher prices. On the other hand, holding the probability of getting the shoes fixed, a decrease in the price a consumer expected to pay makes paying a higher price feel like more of a loss, creating a "comparison effect" that lowers her willingness to pay the high price. Hence, the lower the prices she expected among those prices that induce purchase, the lower is her willingness to buy at higher prices.

Our application in Section V to labor supply is motivated by some recent empirical research beginning with Camerer et al. [1997] on flexible work hours that finds some workers seem to have a daily "target" income. We develop a model where, after earning income in the morning and learning her afternoon wage, a taxi driver decides whether to continue driving in the afternoon.
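The attachment effect previewed above can be seen in a small numeric sketch. It uses the mixed-feelings evaluation the paper formalizes in Section II (gain–loss in each dimension averaged over the reference lottery's realizations); the piecewise-linear μ with λ = 2, linear consumption utility, and all the numbers are my own illustrative choices:

```python
LAMBDA = 2.0  # illustrative loss-aversion coefficient

def mu(x):
    """Piecewise-linear gain-loss function (losses weighted by LAMBDA)."""
    return x if x >= 0 else LAMBDA * x

def eval_outcome(c, ref_lottery):
    """u(c|G) for one outcome c against a discrete reference lottery G:
    per-dimension gain-loss is averaged over the reference realizations."""
    consumption = sum(c)  # linear consumption utility in both dimensions
    gain_loss = sum(prob * sum(mu(ck - rk) for ck, rk in zip(c, r))
                    for prob, r in ref_lottery)
    return consumption + gain_loss

def prefers_to_buy(price, q, expected_price, value):
    """Reference: buy at expected_price with probability q, else no purchase.
    Dimensions are (money change, shoe consumption value)."""
    ref = [(q, (-expected_price, value)), (1.0 - q, (0.0, 0.0))]
    buy = eval_outcome((-price, value), ref)
    not_buy = eval_outcome((0.0, 0.0), ref)
    return buy > not_buy

# Attachment effect: same posted price, different expected probabilities of buying.
high_q = prefers_to_buy(price=5.0, q=0.9, expected_price=4.0, value=4.0)
low_q = prefers_to_buy(price=5.0, q=0.1, expected_price=4.0, value=4.0)
```

With these numbers the consumer buys at a price of 5 only if she expected to buy with probability above 7/8: not buying then means losing shoes she counted on, so a higher expected probability of purchase raises her willingness to pay.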
In line with the empirical results of the target-income literature, our model predicts that when drivers experience unexpectedly high wages in the morning, for any given afternoon wage they are less likely to continue work. Yet expected wage increases will tend to increase both willingness to show up to work, and to drive in the afternoon once there. Our model therefore replicates the key insight of the literature that exceeding a target income might reduce effort. But in addition, it both provides a theory of what these income targets will be, and—through the fundamental distinction between unexpected and expected wages—avoids the unrealistic prediction that generically higher wages will lower effort.

Beyond improvements in specific substantive predictions, our general approach has an attractive methodological feature: because a full specification of μ(·) allows us to derive both gain–loss utility and the reference point itself from consumption utility and the economic environment, it moves us closer to a universally applicable, zero-degrees-of-freedom way to translate any existing reference-independent model into the corresponding reference-dependent one. Although straightforward to apply in most cases, our model falls short of providing a recipe for entirely formulaic application of the principles of reference dependence. Psychological and economic judgment is needed, for instance, in choosing the appropriate notion of "recent expectations." And there are also settings where the same principles motivating our approach suggest an alternative to our reduced-form model. We discuss such shortcomings and gaps, some possible resolutions, as well as further economic applications, in Section VI.

II. REFERENCE-DEPENDENT UTILITY

We specify a person's utility for a riskless outcome as u(c|r), where c = (c_1, c_2, ..., c_K) ∈ ℝ^K is consumption and r = (r_1, r_2, ..., r_K) ∈ ℝ^K is a "reference level" of consumption. If c is drawn according to the probability measure F, the person's utility is given by²

(1)  U(F|r) = ∫ u(c|r) dF(c).

Footnote 2: Despite the clear evidence that people's evaluation of prospects is not linear in probabilities, our model simplifies things by assuming preferences are linear.

As is clearly necessary in our framework developed below, where we assume that the reference point is beliefs about outcomes, we allow for the reference point itself to be stochastic. Suppose that the person's reference point is the probability measure G over ℝ^K, and her consumption is drawn according to the probability measure F. Then, her utility is

(2)  U(F|G) = ∫∫ u(c|r) dG(r) dF(c).

This formulation captures the notion that the sense of gain or loss from a given consumption outcome derives from comparing it with all outcomes possible under the reference lottery. For example, if the reference lottery is a gamble between $0 and $100, an outcome of $50 feels like a gain relative to $0, and like a loss relative to $100, and the overall sensation is a mixture of these two feelings.³ That a person's utility depends on a reference lottery in addition to the actual outcome is similar to several previous theories.⁴ With the exception of the model in Gul [1991]—which can be interpreted as saying that the certainty equivalent of a chosen lottery becomes the decision-maker's reference point—none of these theories provide a theory of reference-point determination, as we do below.

While preferences are reference-dependent, gains and losses are clearly not all that people care about. The sensation of gain or avoided loss from having more money does significantly affect our utility—but so does the absolute pleasure of consumption we purchase with the money. Therefore, in contrast to prior formulations based on a "value function" defined solely over gains and losses, our approach makes explicit the way preferences also depend on absolute levels. We assume that overall utility has two components: u(c|r) ≡ m(c) + n(c|r), where m(c) is "consumption
utility" typically stressed in economics, and n(c|r) is "gain–loss utility." For simplicity—and for further reasons discussed in Kőszegi and Rabin [2004]—we assume that consumption utility is additively separable across dimensions: m(c) ≡ Σ_{k=1}^{K} m_k(c_k), with each m_k(·) differentiable and strictly increasing. We also assume that gain–loss utility is separable: n(c|r) ≡ Σ_{k=1}^{K} n_k(c_k|r_k). Thus, in evaluating an outcome, the decision-maker assesses gain–loss utility in each dimension separately. In combination with loss aversion, this separability is at the crux of many implications of reference-dependent utility, including the endowment effect.

Footnote 3: See Larsen et al. [2004] for some evidence that subjects have mixed emotions for outcomes that compare differently with different counterfactuals. Given the features below, our formula also implies that losses relative to a stochastic reference point count more than gains, so that a person who gets $50 is more distressed by how it compares with a possible $100 than she is pleased by how it compares with a possible $0. We are unaware of evidence on whether this, or an alternative specification where the relief of avoiding $0 outweighs the disappointment of not getting the $100, is closer to reality. We believe few of the results stressed in this paper depend qualitatively on our exact formulation. We discuss this assumption in more detail in Kőszegi and Rabin [2005] in the context of risk preferences, where it is crucial.

Footnote 4: Our utility function is most closely related to Sugden's [2003]. The main difference is in the way a given consumption outcome is compared with the reference lottery. In our model, each outcome is compared with all outcomes in the support of the reference lottery. In Sugden [2003], an outcome is compared only with the outcome that would have resulted from the reference lottery in the same state. See also the axiomatic theories of Gul [1991], Masatlioglu and Ok [2005], and Sagi [2005].

Beyond saying that a person cares about both consumption and gain–loss utility, we propose a strong relationship between the two. While it surely exaggerates the tightness of the connection, our model assumes that how a person feels about gaining or losing in a dimension depends in a universal way on the changes in consumption utility associated with such gains or losses:⁵

n_k(c_k|r_k) ≡ μ(m_k(c_k) − m_k(r_k)),

where μ(·) is a "universal gain–loss function."⁶ We assume that μ(·) satisfies the following properties:

A0. μ(x) is continuous for all x, twice differentiable for x ≠ 0, and μ(0) = 0.
A1. μ(x) is strictly increasing.
A2. If y > x > 0, then μ(y) + μ(−y) < μ(x) + μ(−x).
A3. μ″(x) ≤ 0 for x > 0, and μ″(x) ≥ 0 for x < 0.
A4. μ′₋(0)/μ′₊(0) ≡ λ > 1, where μ′₊(0) ≡ lim_{x→0} μ′(|x|) and μ′₋(0) ≡ lim_{x→0} μ′(−|x|).

Footnote 5: In a single-dimensional model, Köbberling and Wakker [2005] also assume that the evaluation of gains and losses is related to consumption utility. Our formulation extends their insight to multiple dimensions. As one way to motivate that gain–loss utility is related to changes in consumption utility in each dimension, consider a person choosing between two gambles: a 50–50 chance of gaining a paper clip or losing a paper clip, and the comparable gamble involving $10 bills. It seems likely that she would risk losing the paper clip rather than the money, and do so because her sensation of gains and losses is smaller for a good whose consumption utility is smaller. Yet since m(·) is approximately linear for such small stakes, the choice depends almost entirely on the comparison of n_k(·) across dimensions, so that any model that does not relate gain–loss assessments to consumption utility is not equipped to provide guidance in this or related examples.

Footnote 6: Note that once μ(·) is fixed, unless A3′ below holds, affine transformations of m(·) will not in general result in affine transformations of our model's overall utility function. While this raises no problem in applying our model once the full utility function u(·|·) is specified (or empirically estimated), it does mean that when deriving our model from a reference-independent model based on consumption utility alone, the specification of μ(·) must be sensitive to the scaling of consumption utility.

Assumptions A0–A4, first stated by Bowman, Minehart, and Rabin [1999], correspond to Kahneman and Tversky's [1979] explicit or implicit assumptions about their "value function" defined on c − r. Loss aversion is captured by A2 for large stakes and A4 for small stakes. Assumption A3 captures another important feature of gain–loss utility, diminishing sensitivity: the marginal change in gain–loss sensations is greater for changes that are close to one's reference level than for changes that are further away. We shall sometimes be interested in characterizing the implications of reference dependence with loss aversion but without diminishing sensitivity as a force on behavior. For doing so, we define an alternative to A3:

A3′. For all x ≠ 0, μ″(x) = 0.

This utility function replicates a number of properties commonly associated with reference-dependent preferences. Proposition 1 establishes that, fixing the outcome, a lower reference point makes a person happier (Part 1); and preferences exhibit a status quo bias (Parts 2 and 3):

PROPOSITION 1. If μ satisfies Assumptions A0–A4, then the following hold.
1. For all F, G, G′ such that the marginals of G′ first-order stochastically dominate the marginals of G in each dimension, U(F|G) ≥ U(F|G′).
2. For any c, c′ ∈ ℝ^K, c ≠ c′, u(c|c′) ≥ u(c′|c′) ⟹ u(c|c) > u(c′|c).
3. Suppose that μ satisfies A3′. Then, for any F, F′ that do not generate the same distribution of outcomes in all dimensions, U(F|F′) ≥ U(F′|F′) ⟹ U(F|F) > U(F′|F).

Parts 2 and 3 mean that if a person is willing to abandon her reference point for an alternative, then she strictly prefers the alternative if that is her reference point. Under Assumptions A0–A4, this is always true for riskless consumption bundles, but counterexamples can be constructed to
show that the analogous statement for lotteries requires a more restrictive assumption such as A3′. Proposition 2 establishes that in the special case where m(·) is linear, our utility function u(c|r) exhibits the same properties as μ(·):

PROPOSITION 2. If m is linear and μ satisfies Assumptions A0–A4, then there exist {v_k}_{k=1}^{K} satisfying Assumptions A0–A4 such that, for all c and r,

(3)  u(c|r) − u(r|r) = Σ_{k=1}^{K} v_k(c_k − r_k).

Insofar as for local changes m(·) can be taken to be much closer to linear than μ(·), Proposition 2 says that for small changes our utility function shares the qualitative properties of standard formulations of prospect theory.⁷ This equivalence does not hold when the changes are large or marginal consumption utilities change quickly. This is a good thing. If, for instance, a person's reference level of water is a quart below the level needed for survival, loss aversion in μ(·) will not induce loss aversion in u(c|r): she would be much happier about a one-quart increase in water consumption than she would be unhappy about a one-quart decrease. More importantly, when large losses in consumption or wealth are involved, diminishing marginal utility of wealth as economists conventionally conceive of it is likely to counteract the diminishing sensitivity in losses emphasized in prospect theory.⁸

III. THE REFERENCE POINT AS (ENDOGENOUS) EXPECTATIONS

In comparison to the extensive research on preferences over departures from posited reference points, research on the nature of reference points themselves is quite limited. While we hope that experiments and other empirical work will shed light on this topic, our model makes the extreme assumption that the reference point is fully determined by the expectations a person held in the recent past. Specifically, a person's reference point is her probabilistic beliefs about the relevant consumption outcome held between the time she first focused on the decision determining the outcome and shortly before consumption occurs.⁹ While some evidence indicates
that expectations are important in determining sensations of gain and loss, our primary motivation for this assumption is that it helps unify and reconcile existing interpretations and corresponds to readily accessible intuition in many examples.¹⁰

Footnote 7: The proof of Proposition 2 shows that, quantitatively, the degree of loss aversion observed in u(c|r) is less than the degree assumed in μ(·).

Footnote 8: The tension between consumption utility and gain–loss utility in the evaluation of losses has been emphasized by prior researchers; see, e.g., Kahneman [2003] and Köbberling, Schwieren, and Wakker [2004].

Footnote 9: Our theory posits that preferences depend on lagged expectations, rather than expectations contemporaneous with the time of consumption. This does not assume that beliefs are slow to adjust to new information or that people are unaware of the choices that they have just made—but that preferences do not instantaneously change when beliefs do. When somebody finds out five minutes ahead of time that she will for sure not receive a long-expected $100, she would presumably immediately adjust her expectations to the new situation, but she will still five minutes later assess not getting the money as a loss.

Footnote 10: For examples of some of the more direct evidence of expectations-based counterfactuals affecting reactions to outcomes, see Mellers, Schwartz, and Ritov [1999], Breiter et al. [2001], and Medvec, Madey, and Gilovich [1995].

The most common assumption, of course, has been that the reference point is the status quo. But in virtually all experiments interpreted as supporting this assumption, subjects plausibly expect to keep the status quo, so these studies are also consistent with the reference point being expectations. For instance, procedures that have generated the classic endowment effect might plausibly have induced a disposition of subjects to believe that their current ownership status is indicative of their ensuing
ownership status, so that our expectations-based theory makes the same prediction as a status-quo-based theory. Indeed, our theory may be useful for understanding instances where the endowment effect has not been found. One interpretation of the rare exceptions to laboratory findings of the effect, such as Plott and Zeiler [2005], is that they have successfully decoupled subjects' expectations from their initial ownership status. Similarly, the field experiment by List [2003], which replicates the effect for inexperienced sports card collectors but finds that experienced collectors show a much smaller, insignificant effect, is consistent with our theory if more experienced traders come to expect a high probability of parting with items they have just acquired. And the important limits on the endowment effect noted by Tversky and Kahneman [1991] and Novemsky and Kahneman [2005]—that budgeted spending by buyers and successful reduction of inventory by sellers are not coded as losses—are clearly also predicted by our model: parties in market settings expect to exchange money for goods.¹¹

Consider also an instance of reference dependence commonly discussed in economics: employees' aversion to wage cuts. A wise inconsistency has pervaded the application of the reference-point-as-status-quo perspective to the laboratory versus the labor market: a decrease in salary is not a reduction in the status-quo level of wealth—it is a reduction from the expected rate of increase in wealth. While good judgment and obfuscatory language can be used to variously deem aversion to losses in current wealth (when we reject unexpected gambles) versus aversion to losses in increases in current wealth (when we are bothered by wage cuts) versus aversion to losses in increases in the rate of increase of current wealth (when we are bothered by not getting an expected pay raise) as the relevant notion of loss aversion, our model not only accommodates all these scenarios—but predicts which is the appropriate notion as a function of the environment.

Footnote 11: Indeed, while researchers such as Novemsky and Kahneman [2005] seem to frame these examples as determining some "boundaries of loss aversion," we view them more narrowly as determining the boundaries of the endowment effect. As we demonstrate in Section IV, loss aversion does have important implications in markets—but the endowment effect is not among them.

Finally, a status-quo theory of the reference point is especially unsatisfying when applied to the many economic activities—such as food, entertainment, and travel—that involve fleeting consumption opportunities and no ownership of physical assets. If a person expects to undergo a painful dental procedure, finding out that it is not necessary after all may feel like a gain. Yet there is no meaningful way in which her status-quo endowment of dental procedures is different from somebody's who never expected the procedure, so irrespective of expectations a status-quo theory would always predict the same gain–loss utility of zero from this experience.

Our model of how utility depends on expectations could be combined with any theory of how these expectations are formed. But as a disciplined and largely realistic first pass, we assume that expectations are fully rational. To illustrate with an example analyzed in more detail in Section IV, suppose that a consumer had long known that she would have the opportunity to buy shoes, and faced with price uncertainty, had formed plans whether to buy at each price. If, given the reference point based on her expectation to carry through with these plans, there is some price where she would in fact prefer not to carry through with her plans, our theory says that she should not have expected these plans in the first place. More generally, our notion of personal equilibrium assumes that a person correctly predicts both the environment she faces—here, the market distribution of prices—and her own reaction to this environment—here, her behavior in reaction to market prices—and taking the reference point generated by these expectations as given, in each contingency maximizes expected utility.

Formally, suppose that the decision-maker has probabilistic beliefs described by the distribution Q over ℝ capturing a distribution over possible choice sets {D_l}_{l∈ℝ} she might face, where each D_l ⊂ Δ(ℝ^K). In the first and weaker of two solution concepts we consider, rational expectations is the only restriction we impose:

DEFINITION 1. A selection {F_l ∈ D_l}_{l∈ℝ} is a personal equilibrium (PE) if for all l ∈ ℝ and F′_l ∈ D_l, U(F_l | ∫F_l dQ(l)) ≥ U(F′_l | ∫F_l dQ(l)).

If the person expects to choose F_l from choice set D_l, then given her expectations over possible choice sets she expects the
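The personal-equilibrium condition of Definition 1 can be illustrated numerically for the shoe example: a plan assigns buy/not-buy to each possible price, the plan induces the reference lottery, and the plan is a PE when no deviation at any price raises U given that reference. The parameterization below (two equally likely prices, a good worth 4, piecewise-linear μ with λ = 2, linear consumption utility) is my own illustrative assumption, not the paper's:

```python
from itertools import product

LAMBDA = 2.0  # illustrative loss-aversion coefficient

def mu(x):
    """Piecewise-linear gain-loss function (losses weighted by LAMBDA)."""
    return x if x >= 0 else LAMBDA * x

def eval_outcome(c, ref):
    """u(c|G): linear consumption utility plus per-dimension gain-loss
    averaged over the discrete reference lottery G."""
    return sum(c) + sum(p * sum(mu(ck - rk) for ck, rk in zip(c, r))
                        for p, r in ref)

VALUE = 4.0
PRICES = [(0.5, 3.0), (0.5, 5.0)]  # (probability, price)

def outcome(buy, price):
    # Dimensions: (money change, consumption value of the good)
    return (-price, VALUE) if buy else (0.0, 0.0)

def personal_equilibria():
    """A plan maps each price to buy/not-buy; it induces the reference lottery.
    The plan is a PE if the planned action is optimal at every price given
    that reference (Definition 1)."""
    pes = []
    for plan in product([True, False], repeat=len(PRICES)):
        ref = [(prob, outcome(b, price)) for b, (prob, price) in zip(plan, PRICES)]
        if all(eval_outcome(outcome(b, price), ref) >=
               eval_outcome(outcome(not b, price), ref)
               for b, (_, price) in zip(plan, PRICES)):
            pes.append(plan)
    return pes

pes = personal_equilibria()
```

With these numbers three of the four plans are personal equilibria (always buy, buy only at the low price, never buy), while "buy only at the high price" is not. This multiplicity is what motivates the preferred-personal-equilibrium refinement: the person follows whichever expectations she holds, so a selection among equilibria is needed.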

Chinese translation: What Constitutes a Theoretical Contribution?


What constitutes a theoretical contribution? Since becoming an editor, I have been trying to find a simple way to convey the essential ingredients of a theoretical contribution.

There are several excellent papers on this subject, but they typically involve terminology and concepts that are difficult to bring into everyday exchanges with authors and reviewers.

My experience is that the existing frameworks, even as they clarify meaning, can just as easily obscure it.

Moreover, beyond exposure to Kaplan's work, familiarity with the writings of Dubin and others varies widely across the academic community.

This article, written for AMR, is a preliminary effort to fill this void: its purpose is not to create a new conceptualization of theory, but to propose some simple concepts for discussing the process of theory development.

These are personal reflections arising from my daily editorial activities.

My motivation is to ease communication problems about expectations and standards that stem from the lack of a broadly accepted framework for discussing the merits of conceptual writing in the organizational sciences.

Finally, my comments should not be interpreted as official dogma or as ironclad rules guiding the evaluation process.

Every submission is unique and is judged on its own merits; nevertheless, my thinking has clearly been shaped by the several hundred exchanges I have read during the first half of my term.

This article is organized around three key questions: (a) What are the building blocks of theory development? (b) What counts as a legitimate, value-added contribution to theory development? (c) What factors are considered in judging a conceptual paper? The first section describes the constituent elements of a theory.

The second section uses this framework to establish criteria for the theory-development process.

The third section summarizes reviewers' expectations regarding the substance and appropriateness of submitted papers.

What are the building blocks of theory development? According to authorities on theory development (e.g., Dubin, 1978), a complete theory must contain four essential elements, described in the following paragraphs.

What?

Which factors (variables, constructs, concepts) should logically be considered part of the explanation of the social or individual phenomena of interest? Two criteria exist for judging the extent to which we have included the "right" factors: comprehensiveness (i.e., are all relevant factors included?) and parsimony (i.e., should some factors be deleted because they add little to our understanding?). When authors begin to map the conceptual landscape of a topic, they should err in favor of including too many factors, recognizing that over time their ideas will be refined.

It is generally easier to delete unnecessary or invalid elements than it is to justify additions.

However, this should not be interpreted as license to throw in the kitchen sink.

Sensitivity to the competing virtues of parsimony and comprehensiveness is the hallmark of a good theorist.

CET-4/6 takers: if you want to raise your score, take a look at this


In reading and vocabulary/grammar questions, options containing these words are said to always be the answer: beyond, entitle, available, bargain, lest, except for. In "natural science" reading passages, options containing these words should be eliminated: all, only, totally, completely, unlimitedly. In "attitude" questions, options with these two words should be eliminated: indifferent, subjective.

Vocabulary (high priority for the final sprint): come, go, keep, hold, get, put, make, turn, bring, look, call, ask, stand, lay, run, live. Several questions will test these verbs in combination with prepositions!

Key words to memorize (the meaning in parentheses is the one to be tested): bargain (choose it on sight), except for (choose it on sight), offer (letter of admission), effects (personal belongings), gap (deficiency, disparity), mark (stain; to mark), mind (to look after, watch over), moment (tested eight times), present (to produce, bring out), inquire, deliberate, advisable, accuse, anything but, but for, consume with, extensive, at intervals, origin, preferable to, procedure, profitable, property, pace, point, range, refuse, refer to, relief, religion, relatively, release, rise, single, sole, spoil, stick, suit, surprise, urgent, vary, tense, tolerant, trace, vacant, weaken, wear off. (These are words you always see but can never quite pin down, yet they really do get tested, so memorize them.)

Pairs requiring discrimination: 1. call off (cancel, abandon) vs. call up (summon, evoke); 2. adapt to vs. adopt; 3. arise vs. arouse; 4. count on = rely on; 5. cope with = deal with; 6. no doubt vs. in doubt; 7. employee vs. employer; 8. general vs. generous; 9. instant vs. constant; 10. lie (intransitive) vs. lay (transitive); 11. regulate vs. regular; 12. supply (provision for a purpose) vs. offer (provision without a specific purpose).

Grammar (small point value): 1. Subjunctive mood. Words expressing suggestion: wish, would rather, had rather; "it is time that" + past tense; "it is high time that" + past tense; but for, lest, as if, as though, would, should, could, might + base form of the verb.

Thoughts on BEYOND (new advances with Avastin)


Slide excerpt, BEYOND study (bevacizumab plus carboplatin/paclitaxel in NSCLC):

- Treatment arms: bevacizumab + CP every 3 weeks (n = 138) followed by bevacizumab monotherapy until disease progression (PD), versus CP + placebo (Pl) followed by placebo monotherapy until PD.
- Primary endpoint: PFS, confirmed through consistency with the E4599 study.
- Secondary endpoints: OS, ORR, duration of response, safety, plasma biomarkers (VEGF-A, VEGFR-2). Exploratory biomarkers: tissue and plasma EGFR mutation status. Stratification factors: sex, smoking status, age.
- Efficacy in the Chinese population (HR threshold ≤ 0.83).
- Among EGFR mutation-positive patients treated after progression, 26% in the B+CP arm and 35% in the Pl+CP arm received an EGFR TKI.
- After unblinding at progression, only the bevacizumab arm could opt to continue bevacizumab combined with an approved second- or third-line therapy.
- Data cutoff: April 30, 2014. Source: Caicun Zhou, et al. 2015 JCO.
- Abbreviations: PD = progressive disease; R = randomization; ORR = objective response rate; HR = hazard ratio; VEGF-A = vascular endothelial growth factor A; VEGFR-2 = vascular endothelial growth factor receptor 2; EGFR = epidermal growth factor receptor; ECOG PS = Eastern Cooperative Oncology Group performance status.
- [Table of other secondary endpoints and patient characteristics (B+CP, n = 136) not recoverable from the extraction; only scattered cell values remain, e.g. 54 (45.4–62.9).]

Another Definition


"In water you see only your own face, but in wine you can see the garden of the heart." (an ancient Egyptian proverb)

Whether at Tongji Medical University or Beijing Medical University, Xu Tao always took the initiative to push himself, studying tirelessly; it was on the strength of that solid professional foundation that he seized the opportunity to go to Germany.

Bacon said: "Only fools wait for opportunities, while the wise create them." Do not sit and wait for opportunity; only with a solid foundation already laid can you grasp an opportunity when it arrives. Xu Tao's inside-out reasoning reminds everyone to hold a proactive view of opportunity and to understand and handle the relationship between learning and opportunity correctly.

On the questions students care about most (university, major, and career development), Xu Tao spoke from personal experience. Through layer-by-layer analysis and mutual illustration, he reminded everyone to take learning as the foundation and the self as the compass: to combine choice with effort, interest with ability, and self-development with society's needs, and to walk one's own path in life one step at a time.


The Myopic Effect in Risky Investment and Its Theoretical Explanations


Authors: Jiang Chengming; Dong Huahua; Zhang Wenxiu; Hu Fengpei

Abstract: The phenomenon whereby investors find it difficult to take a long view of multiple investments as a whole, and instead tend to split them into single decisions and thus display risk aversion at each decision, is called the myopic effect in risky investment. Myopic versus far-sighted risky investment is usually studied with single-decision versus repeated-decision paradigms. Most studies find that under the single-decision condition the proportion of investors who accept the investment, or the amount they invest, is lower than under the repeated-decision condition. Moderators of the myopic effect include feedback frequency, investment flexibility, choice bracketing, and the risk profile. Researchers have proposed myopic loss aversion (MLA) theory and myopic prospect theory (MPT) to explain the phenomenon, but these theories have been challenged by research based on the equate-to-differentiate theory. After summarizing the existing research on the myopic effect, this article points out directions that merit deeper exploration.

Journal: Applied Psychology
Year (volume), issue: 2014, 20(2), pp. 180–186 (7 pages)
Keywords: myopia; prospect theory; equate-to-differentiate theory; loss aversion; risk aversion
Affiliations: School of Economics and Management and Brain Science Research Center, Zhejiang University of Technology, Hangzhou 310023; Department of Psychology, Zhejiang Sci-Tech University, Hangzhou 310018
Language: Chinese
Classification codes: B849; B842; F832.48

1. Introduction
Investment always involves risk or uncertainty, which is why investment decisions are also called risky investment decisions.

一般而言,投资决策很少仅涉及一次投资。

比如,股票投资者终其一生会多次买卖股票。

但是一个普遍的现象是投资者常常不能将多次投资当做整体看待,而是倾向于视为一个个独立的决策。

研究者把“投资者难以长远地将多重投资看作整体,而更倾向于分割成单一决策进而表现出每次决策风险规避的现象”称为短视效应。

短视作为一个实用的经济学概念,其理论解释以及心理机制得到了研究者的广泛关注。

Beyond k-Anonymity: A Decision Theoretic Framework for Assessing Privacy Risk

Guy Lebanon (1), Monica Scannapieco (2)*, Mohamed R. Fouad (1), and Elisa Bertino (1)
(1) Purdue University, USA; lebanon@, {mrf,bertino}@
(2) ISTAT and Università di Roma "La Sapienza", Italy; scannapi@istat.it

Abstract. An important issue any organization or individual has to face when managing data containing sensitive information is the risk that can be incurred when releasing such data. Even though data may be sanitized before being released, it is still possible for an adversary to reconstruct the original data by using additional information that may be available, for example, from other data sources. To date, however, no comprehensive approach exists to quantify such risks. In this paper we develop a framework, based on statistical decision theory, to assess the relationship between the disclosed data and the resulting privacy risk. We relate our framework to the k-anonymity disclosure method; we make the assumptions behind k-anonymity explicit, quantify them, and extend them in several natural directions.

1 Introduction

The problem of data privacy is today a pressing concern for many organizations and individuals. The release of data may have some important advantages in terms of improved services and business, and also for the society at large, such as in the case of homeland security. However, unauthorized data disclosures can lead to violations of the privacy of individuals, can result in financial and business damages, as in the case of data pertaining to enterprises, or can result in threats to national security, as in the case of sensitive GIS data [6]. Preserving the privacy of such data is a complex task driven by various goals and requirements.
Two important privacy goals are: (i) preventing identity disclosure, and (ii) preventing sensitive information disclosure. Identity disclosure occurs when the released information makes it possible to identify entities either directly (e.g., by publishing identifiers like SSNs) or indirectly (e.g., by linkage with other sources). Sensitive information typically includes information that must be protected by law, for example medical data, or is required by the subjects described by the data. In the latter case, data sensitivity is a subjective measure and may differ across entities.

* This work was performed while a visiting research assistant at CERIAS and the Department of Computer Sciences, Purdue University, USA. It is partially supported by the project "ESTEEM" (http://www.dis.uniroma1.it/~esteem/index.html).

J. Domingo-Ferrer and L. Franconi (Eds.): PSD 2006, LNCS 4302, pp. 217–232, 2006. (c) Springer-Verlag Berlin Heidelberg 2006

To date, an important practical requirement for any privacy solution is the ability to quantify the privacy risk that can be incurred in the release of certain data. However, most of the work related to data privacy has focused on how to transform the data so that no sensitive information is disclosed or linked to specific entities. Because such techniques are based on data transformations that modify the original data with the purpose of preserving privacy, they are mainly focused on the tradeoff between data privacy and data quality (see e.g. [9,2,3,5]).
Conversely, few approaches exist to quantify privacy risks and thus to support informed decisions. Duncan et al. [4] describe a framework, called the Risk-Utility (R-U) confidentiality map, which addresses the tradeoff between data utility and disclosure risk. Lakshmanan et al. [8] propose an approach to risk analysis for disclosed anonymized data that models a database as a series of transactions and the attacker's knowledge as a belief function. Our model is fundamentally different from both works; indeed, we deal exactly with relational instances, rather than with generic files or data frequencies; also, we incorporate the concept of data sensitivity into our framework and we consider generic disclosure procedures, not only anonymization as in [8].

The goal of the work presented in this paper is to propose a comprehensive framework for the estimation of privacy risks. The framework is based on statistical decision theory and introduces the notion of a disclosure rule, that is, a function representing the data disclosure policy. Our framework estimates the privacy risk by means of a function that takes into account a given disclosure rule and (possibly) the knowledge that can be exploited by the attacker. It is important to point out that our framework is able to assess privacy risks also when no information is available about the knowledge, referred to as a dictionary, that the adversary may exploit. The privacy risk function incorporates both identity disclosure and sensitive information disclosure. We introduce and analyze different shapes of the privacy risk function. Specifically, we define the risk in the classical decision theory formulation and in the Bayesian formulation. We prove several interesting results within our framework: we show that, under reasonable hypotheses, the estimated privacy risk is an upper bound for the true privacy risk; we analyze the computational complexity of evaluating the privacy risk function; and we propose an algorithm for efficiently finding the disclosure rule that minimizes the privacy risk. We finally gain insight by showing that the privacy risk is a quantitative framework for exploring the assumptions and consequences of k-anonymity.

2 Privacy Risk Framework

As private information in databases is being disclosed, undesired effects occur such as privacy breaches and financial loss due to identity theft. To proceed with a quantitative formalism we assume that we obtain a numeric description, referred to as loss, of that undesired effect. The loss may be viewed as a function of (i) whether the disclosed information enables identification and (ii) the sensitivity of the disclosed information. The first argument of the loss function encapsulates whether the disclosed data can be tied to a specific entity or not. Consider for example the case of a hospital disclosing a list of the ages of patients, together with data indicating whether they are healthy or not. Even though this data is sensitive, if there is little chance that the disclosed information can be tied to specific individuals, no privacy loss occurs as the data is anonymous. The second argument of the loss function, the sensitivity of the disclosed information, may be high, as is often the case for sensitive medical data. On the other hand, other disclosed information, such as gender, may be only marginally private or not private at all. It is important to note that a precise quantification of the sensitivity of the disclosed information may depend on the entity to whom the data relates. For example, data such as annual income and past medical history may be very sensitive to some and only marginally sensitive to others.

Let T be a relation with a relational scheme T(A_1, ..., A_m), where each attribute A_i is defined over the domain Dom_i ∪ {⊥, §}, with the only exception of A_1 as detailed later. The relation T stores the records that are considered for disclosure and has some values either missing or suppressed for privacy preservation. Specifically, a null
value is denoted by ⊥, whereas a suppressed value is denoted by §. Furthermore, we denote the different attribute values of a specific record x in T using a vector notation (x_1, ..., x_m). The first attribute x_1 corresponds to a unique record identifier that can be neither ⊥ nor §. The set of all possible records may be written as

  X = (Dom_1) × (Dom_2 ∪ {⊥, §}) × ··· × (Dom_m ∪ {⊥, §}).

If T has cardinality n, it can be seen as a subset of X^n, which we may think of as a matrix whose rows are the different records. We refer to the i-th record in such a relation as x_i and its j-th attribute as x_ij. (1)

2.1 Disclosure Rules and Privacy Risk

Statistical decision theory [10] offers a natural framework for measuring the quantitative effect of the information disclosure phenomenon. The uncertainty is encoded by a parameter θ abstractly called "a state of nature" which is typically unknown. However, it is known that θ belongs to a set Θ, usually a finite or infinite subset of R^l. The decisions are being made based on a sample of observations (x_1, ..., x_n), x_i ∈ X, and are represented via a function δ : X^n → A, where A is an abstract action space. The function δ is referred to as a decision policy or decision rule. A key element of statistical decision theory is that the state of nature θ governs the distribution p_θ that generates the observed data. Instead of decision rules δ : X^n → A, we introduce disclosure rules defined as follows.

(1) Note that throughout the paper, records and vectors are denoted by bold italic symbols whereas variables and attributes are denoted by only italic symbols.

Definition 1. A disclosure rule δ is a function δ : X → X such that

  [δ(z)]_j = ⊥ if z_j = ⊥; = § if the j-th attribute is suppressed; = z_j otherwise.

The state of nature θ that influences the disclosure outcome is the side information used by the attacker in his identification attempt. Such side information θ is often a public data resource composed of identities and their attributes, for example a phone book. The distribution over records p_θ is taken to be the empirical distribution p̃ over the data that is to be disclosed, x_1, ..., x_n, defined below.

Definition 2. The empirical distribution p̃ on X associated with a set of records x_1, ..., x_n is p̃(z) = (1/n) Σ_{i=1}^n 1{z = x_i}, where 1{z = x_i} is 1 if z = x_i and 0 otherwise.

The empirical distribution is used for defining the risk associated with a disclosure rule δ using the mechanism of expectation. Note that the expectation with respect to p̃ is simply the empirical mean E_p̃(f(x, θ)) = (1/n) Σ_{i=1}^n f(x_i, θ). The loss and risk functions in the privacy adaptation of statistical decision theory are defined below.

Definition 3. The loss function ℓ : X × Θ → [0, ∞] measures the loss incurred by disclosing the data δ(z) ∈ X due to possible identification based on θ ∈ Θ.

Definition 4. The risk of the disclosure rule δ in the presence of side information θ is the average loss of disclosing the records x_1, ..., x_n: R(δ, θ) = E_p̃(z)(ℓ(δ(z), θ)) = (1/n) Σ_{i=1}^n ℓ(δ(x_i), θ).

Definition 5. The Bayes risk of the disclosure rule δ is R(δ) = E_p(θ)(R(δ, θ)), where p(θ) is a prior probability distribution on Θ.

It is instructive at this point to consider in detail the identification process and its possible relations to the loss function. We use the term identification attempt to refer to the process of trying to identify the entity represented by the record. We refer to the subject performing the identification attempt as the attacker. The attacker performs an identification attempt based on the disclosed record y = δ(x_i) and additional side information θ referred to as a dictionary. The role of the dictionary is to tie a record y to a list of possible candidate identities consistent with the record y, i.e. having the same values on common fields. For example, consider y being (first-name, surname, phone#) and the dictionary being a phone book. The attacker needs only consider dictionary entities that are consistent with the disclosed record. Recall that some of the attributes (first-name, surname, phone#) may be replaced with ⊥ or § symbols due to missing information or due to the disclosure process, respectively. In this example, if all the attribute values are revealed and the available side information is an up-to-date phone book, it is likely that only one entity exists in the dictionary that is consistent with the revealed information. On the other hand, if the attribute value for phone# is suppressed, by replacing it with the § symbol, the phone book θ may or may not yield a single consistent entity, depending on the popularity of the (first-name, surname) combination. From the attacker's standpoint, missing values are perceived the same way as suppressed values. Thus, in the rest of the paper and for the sake of notational simplicity, both missing and suppressed values will be denoted by the symbol ⊥.

Note that the loss function ℓ(δ(x_i), θ) measures the loss due to disclosing δ(x_i) in the presence of the side information, in this case the dictionary θ. Specifying the loss is typically entity and problem dependent. We can, however, make some progress by decomposing the loss into two parts: (i) the ability to identify the entity represented by δ(x_i) based on the side information θ, and (ii) the sensitivity of the information in δ(x_i). The identification part is formalized by the random variable Z defined as follows.

Definition 6. Let ρ(δ(x_i), θ) denote the set of individuals in the dictionary θ consistent with the record δ(x_i). Moreover, let the random variable Z(δ(x_i)) be a binary variable that takes value 1 if δ(x_i) is identified and 0 otherwise.
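To make Definition 6 concrete, here is a small illustrative sketch (our own toy example, not code from the paper). A disclosed record is represented as a dict, with None standing in for the combined missing/suppressed symbol ⊥, and ρ collects the dictionary entries consistent with it; the reciprocal of |ρ| is the chance of identification if the attacker picks uniformly among the candidates.

```python
# Illustrative sketch of Definition 6 (hypothetical names and data).
# A dictionary θ is a list of fully specified records, phone-book style.

def consistent(disclosed, entry):
    """An entry is consistent with a disclosed record if it agrees on every
    attribute the disclosed record actually reveals (every non-⊥ value)."""
    return all(v is None or entry.get(a) == v for a, v in disclosed.items())

def rho(disclosed, theta):
    """ρ(δ(x_i), θ): dictionary entries consistent with the disclosed record."""
    return [entry for entry in theta if consistent(disclosed, entry)]

theta = [
    {"first": "Mary", "surname": "Smith", "phone": "555-0100"},
    {"first": "Mary", "surname": "Jones", "phone": "555-0101"},
    {"first": "John", "surname": "Smith", "phone": "555-0102"},
]

# Disclosing only first-name = Mary (surname and phone suppressed):
y = {"first": "Mary", "surname": None, "phone": None}
candidates = rho(y, theta)
print(len(candidates))       # 2 consistent identities
print(1 / len(candidates))   # 0.5 identification chance under a uniform pick
```

Suppressing more attributes can only grow ρ and hence shrink the identification probability, which is the mechanism the loss function below builds on.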
Assuming a uniform selection of entries in the dictionary by the attacker, we have

  p_{Z(δ(x_i))}(1) = |ρ(δ(x_i), θ)|^{-1} if ρ(δ(x_i), θ) ≠ ∅, and 0 otherwise,

and p_{Z(δ(x_i))}(0) = 1 − p_{Z(δ(x_i))}(1).

2.2 Sensitivity

The sensitivity of disclosed data is formalized by the following definition.

Definition 7. The sensitivity of a record is measured by a function Φ : X → [0, +∞], where higher values indicate higher sensitivity.

We allow Φ to take on the value +∞ in order to model situations where the information in the record is so private that its disclosure is prohibited under any positive identification chance. The sensitivity Φ(δ(x_i)) measures the adverse effect of disclosing the record δ(x_i) if the attacker correctly identifies it. We make the assumption (whose relaxation is straightforward) that if the attacker does not correctly identify the disclosed record, there is no adverse effect. The adverse effect is therefore a random variable with two possible outcomes: Φ(δ(x_i)) with probability p_{Z(δ(x_i))}(1) and 0 with probability p_{Z(δ(x_i))}(0). It is therefore natural to account for the uncertainty resulting from possible identification by defining the loss ℓ(y, θ) as the expectation of the adverse effect resulting from disclosing y = δ(x_i):

  ℓ(y, θ) = E_{p_Z(y)}(Φ(y) Z(y)) = p_{Z(y)}(1) · Φ(y) + p_{Z(y)}(0) · 0 = Φ(y) / |ρ(y, θ)|,

where the last equality holds if the dictionary selection probabilities are uniform and ρ(y, θ) ≠ ∅. The risk R(δ, θ) with respect to the distribution p̃ that governs the record set x_1, ..., x_n becomes

  R(δ, θ) = E_p̃(z)(ℓ(δ(z), θ)) = (1/n) Σ_{i=1}^n Φ(δ(x_i)) / |ρ(δ(x_i), θ)|,

and the Bayes risk under the prior p(θ) becomes (if Θ is discrete, replace the integral below by a sum)

  R(δ) = E_p(θ)(R(δ, θ)) = (1/n) Σ_{i=1}^n Φ(δ(x_i)) ∫_Θ (p(θ) / |ρ(δ(x_i), θ)|) dθ.

We now provide more details concerning records x_i and their space X that will be useful in the following. As introduced, a record x_i ∈ X has attribute values (x_i1, ..., x_im), where each attribute x_ij, j = 2, ..., m, either takes values in a domain Dom_j or is unavailable, in which case we denote it by ⊥. The first attribute is x_i1 ∈ Dom_1. We assume that [δ(x_i)]_1 = x_i1, i.e. [δ(x_i)]_1 cannot have ⊥ values. This assumption is for notational purposes only, and in reality the disclosed data should be taken to be [δ(x_i)]_2, ..., [δ(x_i)]_m. Notice also that the primary key of the relation can be distinct from the introduced record identifier, and can be one or more attributes defined over Dom_2, ..., Dom_m. We make the assumption [δ(x_i)]_1 = x_i1 in order to allow a possible dependency of Φ(δ(x_i)) on the identifier x_i1 = [δ(x_i)]_1, which enables the flexibility needed to treat attribute values related to different entities differently. For example, a certain entity, such as a specific person, may wish to protect certain attributes such as religion or age that may be less private for a different person. Possible expressions for the Φ function are provided in the Appendix.

3 Tradeoff Between Disclosure Rules and Privacy Risk

In evaluating disclosure rules δ we have to balance the following tradeoff. On one hand, disclosing private information incurs the privacy risk R(δ, θ). On the other hand, disclosing information serves some purpose, or else no information would ever be disclosed. Such disclosure benefit may arise from various reasons such as increased productivity due to the sharing of commercial data.

We choose to represent this tradeoff by specifying a set of disclosure rules Δ that are acceptable in terms of their disclosure benefit. From this set, we seek to choose the rule that incurs the least privacy risk, δ* = arg min_{δ∈Δ} R(δ, θ). Notice that this framework is not symmetric in its treatment of the disclosure benefit and privacy risk, and emphasizes the increased importance of privacy risk in the tradeoff.

It is difficult to provide a convincing example of a set Δ without specifying in detail the domain and the disclosure benefit. Nevertheless, we specify below several sets of rules that serve to illustrate the decision theoretic framework of this paper. The basic principle behind these rules is
that the more attribute values are being disclosed, the greater the disclosure benefit. The details of the specific application will eventually determine which set of rules is most appropriate. The three sets of rules below are parameterized by a positive integer k. The set Δ1 consists of rules that disclose a total of k attribute values for all records combined:

  Δ1 = {δ : δ(x_1), ..., δ(x_n) contain a total of k non-⊥ entries}.

The second set Δ2 consists of rules that disclose a certain number of attribute values for each record:

  Δ2 = {δ : ∀i, δ(x_i) contains k non-⊥ entries}.

The third set Δ3 consists of rules that disclose a certain number of attribute values for each attribute:

  Δ3 = {δ : ∀j, {[δ(x_i)]_j}_{i=1}^n contains k non-⊥ entries}.

The set Δ1 may be applicable in situations where the disclosure benefit is influenced simply by the number of disclosed attribute values. Such a situation may arise if there is a need for computing statistics on the joint space of represented entities-attributes without an emphasis on either dimension. The set Δ2 may be applicable when the disclosure benefit is tied to per-entity data, for example discovering association rules in grocery store transactions. A rule δ ∈ Δ2 guarantees that there are sufficient attributes disclosed for each entity to obtain meaningful conclusions. Similarly, the set Δ3 may be useful in cases where there is an emphasis on per-attribute data.

Disclosure rules δ ∈ Δ are evaluated on the basis of the risk functions R(δ, θ), R(δ). In some cases, the attacker's dictionary is publicly available. We can then treat the "true" side information θ_true as known, and the optimal disclosure rule is the minimizer of the risk

  δ* = arg min_{δ∈Δ} R(δ, θ_true).    (1)

If the attacker's side information is not known, but we can express a prior belief p(θ) describing the likelihood of θ_true ∈ Θ, we may use the Bayesian approach and select the minimizer of the Bayes risk

  δ*_B = arg min_{δ∈Δ} E_p(θ)(R(δ, θ)).    (2)
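To make the selection in (1) concrete, here is a brute-force sketch (our own illustration, with made-up data and a toy additive sensitivity, not the paper's code). A rule in Δ2 is represented, per record, by the set of k attribute indices it discloses; the sketch evaluates R(δ, θ) for every such rule and returns the minimizer.

```python
from itertools import combinations, product

# Brute-force δ* = argmin_{δ∈Δ2} R(δ, θ), eq. (1). Records are tuples;
# None plays the role of ⊥. All concrete names and data are hypothetical.

def rho_size(y, theta):
    """|ρ(y, θ)|: dictionary records consistent with the disclosed record y."""
    return sum(all(v is None or v == t[j] for j, v in enumerate(y)) for t in theta)

def phi(y):
    """Toy additive sensitivity Φ: one unit per disclosed attribute value."""
    return float(sum(v is not None for v in y))

def disclose(x, keep):
    """Keep the attribute indices in `keep`, suppress the rest."""
    return tuple(v if j in keep else None for j, v in enumerate(x))

def risk(per_record_keep, records, theta):
    """R(δ, θ) = (1/n) Σ Φ(δ(x_i)) / |ρ(δ(x_i), θ)|; every ρ is non-empty
    here because each record also appears in θ."""
    return sum(
        phi(disclose(x, keep)) / rho_size(disclose(x, keep), theta)
        for x, keep in zip(records, per_record_keep)
    ) / len(records)

records = [("Mary", "Smith"), ("John", "Jones")]
theta = records + [("Mary", "Jones")]   # the (known) dictionary θ
k = 1                                   # Δ2: disclose exactly k attributes per record

best_risk, best_keep = min(
    (risk(keep, records, theta), keep)
    for keep in product(combinations(range(2), k), repeat=len(records))
)
print(best_keep, best_risk)
```

On this toy instance the minimizer discloses, for each record, the attribute value that is more common in the dictionary (the larger candidate set dilutes identification). This is exactly the naive enumeration whose cost is analyzed later in this section, and which the MinRisk algorithm avoids under the independence assumption.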
minimax ruleδ∗M that achieves the least worst risk,i.e.δ∗Msatisfiessup θ∈ΘR(δ∗M,θ)=inf supδ∈Δθ∈ΘR(δ,θ).(3)Notice that in all cases above we try to pick the best disclosure rule in terms of privacy risk,out of a setΔof disclosure rules that are acceptable in terms ofthe amount of revealed data.The rulesδ∗,δ∗B ,δ∗Mare useful,respectively,if weknowθtrue,we have a prior over it,or we have no knowledge whatsoever.An alternative situation to the one above is that the database is trying to estimate(or minimize)the privacy risk R(δ,θtrue)based on side information ˆθ=θtrue available to the database.In such cases we can use R(δ,ˆθ)as anestimate for R(δ,θtrue)but we need tofind a way to connect the two risks above by leveraging on a relation betweenˆθandθtrue.A reasonable assumption is that the database dictionaryˆθis specific to the database while the attacker’s dictionaryθtrue is a more general-purpose dictio-nary.We can then say thatθtrue contains the records inˆθas well as additional records.Following the same reasoning we can also assume that for each record that exists in both dictionaries,ˆθwill have more attribute values that are not ⊥.For example,consider a database of employee records for some company.ˆθwould be the database dictionary andθtrue would be a general-purpose dic-tionary such as a phone-book.It is natural to assume thatθtrue will contain additional records over the records inˆθand that the non-⊥attributes inθtrue (e.g.first-name,surname,phone#)will be more limited than the non-⊥at-tributes inˆθ.After all,some of the record attributes are private and would not be disclosed in order tofind their way into the attacker’s dictionary(resulting in more⊥symbols in theθtrue).Under the conditions specified above we can show that the true risk is bounded from above by R(δ,ˆθ)and that the chosen rule arg minδ∈ΔR(δ,ˆθ)has a risk that is guaranteed to bound the true privacy risk.This is formalized below.We consider dictionariesθas relational 
tables,whereθi=(θi1,...,θiq)is a record of a relation Tθ(A1,...,A q),with A1corresponding to the record identifier.Definition8.We define the relation between dictionariesθ=(θ1,...,θl1)andη=(η1,...,ηl2)by saying thatθ ηif for everyθi,∃ηv such thatηv1=θi1andηvk=⊥⇒θik=ηvk.The relation constitutes a partial ordering on the set of dictionariesΘ.Theorem1.Ifˆθcontains records that correspond to x1,...,x n andˆθ θtrue, then∀δR(δ,θtrue)≤R(δ,ˆθ).Proof.For every disclosed recordδ(x i)there exists a record inˆθthat corresponds to it and sinceˆθ θtrue there is also a record inθtrue that corresponds to it.As a result,ρ(δ(x i),ˆθ)andρ(δ(x i),θtrue)are non-empty sets.Beyond k -Anonymity:A Decision Theoretic Framework 225For an arbitrary a ∈ρ(δ(x i ),ˆθ)we have a =ˆθv for some v and since ˆθ θtrue there exists a corresponding record θtrue k .The record θtrue k will have the same or more ⊥symbols as a and therefore θtrue k ∈ρ(δ(x i ),θtrue ).The same argument can be repeated for every a ∈ρ(δ(x i ),ˆθ)thus showing that ρ(δ(x i ),ˆθ)⊆ρ(δ(x i ),θtrue )or |ρ(δ(x i ),θtrue )|−1≤|ρ(δ(x i ),ˆθ)|−1.The probability of identifying δ(x i )by the attacker is thus smaller than the identification probability based on ˆθ.It then follows that for all i , (δ(x i ),θtrue )≤ (δ(x i ),ˆθ)as well as R (δ,θtrue )≤R (δ,ˆθ).Solving (1)in the general case requires evaluating R (δ,θtrue )for each δ∈Δand selecting the minimum.The reason is that the dictionary θcontrolling the identification distribution p Z (δ(x i ))(1)=|ρ(δ(x i ),θ)|−1is of arbitrary shape.A practical assumption,that is often made for high dimensional distributions,is that the distribution underlying θfactorizes into a product form|ρ(δ(x i ),θ)|N = j |ρj ([δ(x i )]j ,θ)|N or|ρ(δ(x i ),θ)|= j αj ([δ(x i )]j ,θ)for some appropriate functions αj .In other words,the appearances of y j for all j =1,...,m in θare independent random variables.Returning to the phone-book example,the above assumption implies that the popularity of first names does 
not depend on the popularity of last names,e.g.,p (first-name =Mary |surname =Smith )=p (first-name =Mary |surname =Johnson )=p (first-name =Mary ).The independence assumption does not hold in general,as attribute values may be correlated,for instance,by integrity constraints;we plan to relax it in future work.First we analyze the complexity of evaluating the risk function R (δ,θ).This would depend on the complexity of computing Φ,denoted by C (Φ),and the complexity of computing |ρ(δ(x i ),θ)|which is O (Nm ),where N is the number of records in the dictionary θ.Solving arg min δ∈ΔR (δ,θ)by enumeration requires O (n ) C (Φ)+O (Nm ) ·|Δ|computations.We have |Δ1|= nm k ,|Δ2|= m k n ,|Δ3|= n km and C (Φ)typically being O (m )for the additive and multiplicative forms.In a typical setting where k m we have for Δ2and a linear or multiplicative Φ,a minimization com-plexity of O (nNm kn +1).The complexities above are computed for the naive enumeration algorithm.A much more efficient algorithm for obtaining arg min δ∈ΔR (δ,θ)for Δ2and Φ5under the assumption of dictionary independence is described below.226G.Lebanon et al.If we define C 1(y )={j :j >1,y j =⊥},C 2(y )={j :j >1,y j =⊥}we have (y ,θ)= j ∈C 2(y )e w j,y 1|ρ(y ,θ)|= j ∈C 2(y )e w j,y 1 k>1αk (y k ,θ)= j ∈C 2(y )e w j,y 1αj (y j ,θ)· l ∈C 1(y )1αl (⊥,θ)=j ∈C 2(y )e w j,y 1αj (⊥,θ)αj (y j ,θ)·m l =21αl (⊥,θ).To select the disclosure of k attributes that minimizes the above loss it remains to select the set C 2(y )of k indices that minimizes the loss.This set correspondsto the k smallest {e w j,y 1αj (⊥,θ)αj (y j ,θ)}m j =2and leads to the following algorithm.Algorithm 1.MinRisk(1)foreach i =1,...,n (2)foreach j =2,...,m (3)set γj :=e w j,x i 1αj (⊥,θ)αj (x ij ,θ)(4)identify the k smallest elements in {γj }m j =2(5)set δ(x i )to disclose the attributes corresponding to these k ele-mentsTheorem 2.The algorithm MinRisk for solving arg min δ∈ΔR (δ,θ)requires O (nNm )computations.Proof.For each record y =x i we compute 
the following. The set {γ_j = e^{w_{j,y_1}} α_j(⊥, θ)/α_j(y_j, θ)}_{j=2}^{m} can be obtained in O(Nm). Moreover, the set corresponding to the k smallest elements in {γ_j}_{j=2}^{m} can be obtained in two steps: (i) get the k-th smallest element γ* in {γ_j}_{j=2}^{m} (this requires O(m) computations), then (ii) scan the set {γ_j}_{j=2}^{m} for elements < γ* (again, this requires O(m) computations). Hence the overall complexity of the above procedure is O(n)(O(Nm) + O(m)) = O(nNm).

4 Privacy Risk and k-Anonymity

k-Anonymity [9] has recently received considerable attention by the research community [11, 1]. Given a relation T, k-anonymity ensures that each disclosed record can be indistinctly matched to at least k individuals in T. It is enforced by considering a subset of the attributes called quasi-identifiers, and forcing the disclosed values of these attributes to appear at least k times in the database. k-Anonymity uses two operators to accomplish this task: suppression and generalization. We ignore the role of generalization operators in this paper, as our privacy framework is cast solely in terms of suppression at the attribute level. However, it is straightforward to extend the privacy risk framework to include generalization operators, leading to a more complete analogy with k-anonymity, and we plan to do so in future work.

In its original formulation, k-anonymity does not seem to make any assumptions on the possible external knowledge that could be used for entity identification, and does not characterize the privacy loss. However, k-anonymity does make strong implicit assumptions whose absence eliminates any motivation it might possess. Following the formal presentation of k-anonymity in the privacy risk context, we analyze these assumptions and possible relaxations.

Since the k-anonymity requirement is enforced on the relation T, the anonymization algorithm considers the attacker's dictionary as equal to the relation, θ = T. Representing the k-anonymity
rule by δ*_k, we have that the k-anonymity constraints may be written as

  ∀i  |ρ(δ*_k(x_i), T)| ≥ k.    (4)

The sensitivity function is taken to be constant, Φ ≡ c, as k-anonymity considers only the constraints (4) and treats all attributes and entities in the same way. As a result, the loss incurred by the k-anonymity rule δ*_k is bounded by ℓ(δ*_k(x_i), T) ≤ c/k, where equality is achieved if the constraint |ρ(δ*_k(x_i), T)| = k is met. On the other hand, any rule δ_0 that violates the k-anonymity requirement for some x_i will incur a loss higher (under θ = T and Φ ≡ c) than the k-anonymity rule: ℓ(δ_0(x_i), T) = c/|ρ(δ_0(x_i), T)| ≥ ℓ(δ*_k(x_i), T). We thus have the following result, presenting k-anonymity as optimal in terms of the privacy risk framework.

Theorem 3. Let δ*_k be a k-anonymity rule and δ_0 be a rule that violates the k-anonymity constraint, both with respect to x_i ∈ T. Then ℓ(δ*_k(x_i), T) ≤ c/k < ℓ(δ_0(x_i), T).

As the above theorem implies, the k-anonymity rule minimizes the privacy loss per example x_i and may be seen as arg min_{δ∈Δ} R(δ, T), where Δ is a set of rules that includes both k-anonymity rules and rules that violate the k-anonymity constraints. The assumptions underlying k-anonymity, in terms of the privacy risk framework, are:

1. θ_true = T
2. Φ ≡ c
3. Δ is under-specified.

The first assumption may be taken as an indication that k-anonymity does not assume any additional information regarding the attacker's dictionary. As we showed earlier, the resulting risk R(δ*_k, T) ≤ c/k may be seen as a bound on the true risk R(δ*_k, θ_true) under some assumptions. Alternatively, the privacy framework also introduces the mechanisms of the minimax rule and the Bayes rule if additional information is available, such as the set Θ of possible dictionaries or even a prior on Θ. Moreover, the attacker's dictionary θ is often a standard
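To make the selection rule concrete, here is a small Python sketch of Algorithm 1 (MinRisk) under the dictionary-independence assumption. The helper names (`alpha`, `min_risk`) and the toy dictionary are illustrative, and the weights are simplified to one value per attribute rather than the identity-dependent w_{j,y_1} of the text.

```python
import math

BOTTOM = None  # stands in for the suppression symbol ⊥


def alpha(theta, j, value):
    """alpha_j(value, theta): number of records in the dictionary theta whose
    j-th attribute equals `value`; a suppressed value ⊥ matches every record."""
    if value is BOTTOM:
        return len(theta)
    return sum(1 for rec in theta if rec[j] == value)


def min_risk(records, theta, w, k):
    """Sketch of Algorithm 1 (MinRisk).

    For each record x_i, compute gamma_j = exp(w_j) * alpha_j(⊥) / alpha_j(x_ij)
    for the non-identifier attributes j = 1..m-1 (index 0 is the identifier),
    then disclose the k attributes with the smallest gamma_j and suppress the rest.
    """
    disclosed = []
    for x in records:
        gammas = sorted(
            (math.exp(w[j]) * alpha(theta, j, BOTTOM) / alpha(theta, j, x[j]), j)
            for j in range(1, len(x))
        )
        keep = {j for _, j in gammas[:k]}  # indices of the k smallest gamma_j
        disclosed.append(tuple(
            v if (j == 0 or j in keep) else BOTTOM for j, v in enumerate(x)
        ))
    return disclosed
```

Intuitively, disclosing a frequent attribute value leaves the record hidden among many dictionary entries, so its γ_j is small; rare (highly identifying) values are suppressed first.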

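For contrast with the risk-based rule, the k-anonymity constraint (4) can be checked directly for suppression-only disclosure rules: a disclosure satisfies it when every disclosed record is consistent with at least k rows of T on its non-suppressed attributes. A minimal sketch (function names hypothetical):

```python
BOTTOM = None  # suppression symbol ⊥


def match_count(disclosed_record, table):
    """|rho(d, T)|: number of rows of T consistent with the disclosed record
    on every non-suppressed attribute (identifier at index 0 excluded)."""
    return sum(
        1
        for rec in table
        if all(d is BOTTOM or d == r
               for d, r in zip(disclosed_record[1:], rec[1:]))
    )


def is_k_anonymous(disclosed, table, k):
    """Constraint (4): every disclosed record matches at least k rows of T."""
    return all(match_count(d, table) >= k for d in disclosed)
```

When the check passes and Φ ≡ c, the per-record loss is bounded by c/k, matching the bound discussed above.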