Logarithmic singularity of the Szegő kernel and a global invariant of strictly pseudoconvex domains
Twin Peaks
(arXiv:hep-ph/0008122v2, 23 Oct 2000)

Mischa Sallé, Jan Smit and Jeroen C. Vink
Institute for Theoretical Physics, University of Amsterdam, Valckenierstraat 65, 1018 XE Amsterdam, the Netherlands
(Presented by J. Smit.)

The on-shell imaginary part of the retarded selfenergy of massive φ⁴ theory in 1+1 dimensions is logarithmically infrared divergent. This leads to a zero in the spectral function, separating its usual bump into two. The twin peaks interfere in time-dependent correlation functions, which causes oscillating modulations on top of exponential-like decay, while the usual formulas for the decay rate fail. We see similar modulations in our numerical results for a mean field correlator, using a Hartree ensemble approximation.

In our numerical simulations of 1+1 dimensional φ⁴ theory using the Hartree ensemble approximation¹ we found funny modulations in a time-dependent correlation function. Fig. 1 shows such modulations on top of a roughly exponential decay. The correlation function is the time average of the zero momentum mode X(t) of the mean field,

F_mf(t) = ⟨X(t′)X(t′ + t)⟩ − ⟨X(t′)⟩⟨X(t′ + t)⟩,  ⟨·⟩ = ∫_{t₁}^{t₂} dt′ (·)/(t₂ − t₁),

taken after waiting a long time t₁ for the system to be in approximate equilibrium. This equilibrium is approximately thermal, and F_mf(t) is analogous to the symmetric correlation function of the quantum field theory at finite temperature,

F(t) = ∫ (dp⁰/2π) e^{−ip⁰t} [n(p⁰) + 1/2] ρ(p⁰),   (1)

and the latter can be expressed in turn in terms of the retarded selfenergy Σ(p⁰),

ρ(p⁰) = −2 Im Σ(p⁰) / {[p⁰² − m² − Re Σ(p⁰)]² + [Im Σ(p⁰)]²},   (2)

with m the temperature dependent mass in the propagators of the diagrams in Fig. 2, after adding a counterterm that sets the real part of Σ to zero at p⁰ = m.

[Figure 1: Numerically computed correlation ln |F_mf(t)| versus time t in units of the inverse temperature dependent mass m, for t′m = 31·10³ ... 61·10³; N = 64, Lm = 14.8 (fit −2.63 − tm/233) and N = 128, Lm = 29.1 (fit −4.05 − tm/105). The coupling is weak, λ/m² = 0.11, and the temperature T/m ≈ 1.4 for the smaller volume (with significant deviations from the Bose–Einstein distribution) and ≈ 1.6 for the larger volume (reasonable BE).]
[Figure 2: Diagrams leading to thermal damping.]

The one loop diagram is present only in the 'broken phase' (for which ⟨φ̂⟩ ≠ 0; there is really only a symmetric phase in 1+1 dimensions, but this is due to symmetry restoration by nonperturbative effects which will not obliterate the one-loop damping). The corresponding selfenergy has been calculated in ², for example. It only leads to damping for frequencies p⁰² > 4m², which are irrelevant for the quasiparticle damping at p⁰² = m². So from now on we concentrate on the two-loop diagram. After analytic continuation to real time one finds that it is given by the sum of two terms, Σ₁ + Σ₂ (see e.g. ³). The first has an imaginary part corresponding to 1 ↔ 3 processes requiring p⁰² > 9m², so it does not contribute to plasmon damping. The second has the structure

Σ₂ = −9λ² ∫ [⋯/(E₁E₂E₃)] [(1 + n₁)n₂n₃ − n₁(1 + n₂)(1 + n₃)] ⋯,  E₁² = m² + (p₂ + p₃)²,  E_i = √(m² + p_i²),

and its imaginary part at p⁰ = m takes the form 9λ² (e^{m/T} − 1)⁻² ⋯ ln(m/⋯), which diverges logarithmically on shell. The spectral function (2) therefore vanishes at p⁰ = m, and its usual quasiparticle bump is split into two peaks. The resulting correlator indeed shows an oscillating modulation on top of the roughly exponential decay. The decay corresponding to exp(−γt), with γ given by (6), is also indicated in the plot: it does not do a good job in describing the average decay beyond the first interference minimum. The 'Twin Peaks' phenomenon implies that the usual definition of damping rate (5) is unreliable in 1+1 dimensions.

Acknowledgements. We thank Gert Aarts for useful conversations. This work is supported by FOM/NWO.

1. J. C. Vink, these proceedings.
2. H. A. Weldon, Phys. Rev. D 28, 2007 (1983).
3. E. Wang and U. Heinz, Phys. Rev. D 53, 899 (1996).

[Figure 3: The spectral function ρ(p⁰) near p⁰ = m = 1 corresponding to the selfenergy shown in Figs. 4, 5 (T = m, λ = 0.4m²).]

[Figure 4: Plot of ln |F(t)| versus mt for T = m, λ = 0.4m². The straight line represents exp(−γt).]
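To see how a logarithmically divergent Im Σ produces the twin-peak structure, one can evaluate the spectral function (2) numerically. The following sketch is illustrative only: it assumes a simple model Im Σ(p⁰) = −c ln(m/|p⁰ − m|) near the pole, with a hypothetical coefficient c standing in for the thermal prefactor, and Re Σ = 0 at p⁰ = m as fixed by the counterterm.

```python
import numpy as np

# Minimal illustration of eq. (2): a log-divergent Im Sigma suppresses
# rho(p0) at p0 = m, splitting the quasiparticle bump in two.
m, c = 1.0, 0.05   # c is a hypothetical stand-in for the thermal prefactor

def rho(p0):
    eps = np.maximum(np.abs(p0 - m), 1e-12)   # regulate the exact pole
    im_sigma = -c * np.log(m / eps)           # -> -infinity as p0 -> m
    return -2.0 * im_sigma / ((p0**2 - m**2)**2 + im_sigma**2)

p0 = np.linspace(0.9, 1.1, 2001)
r = rho(p0)
left = p0[p0 < m][np.argmax(r[p0 < m])]       # peak below the mass shell
right = p0[p0 > m][np.argmax(r[p0 > m])]      # peak above the mass shell
dip = r[np.argmin(np.abs(p0 - m))]            # suppressed value at p0 = m
print(f"twin peaks near p0 = {left:.4f} and {right:.4f}; dip at m: {dip:.3g}")
```

The dip becomes a true zero only in the strict limit p⁰ → m, since the suppression is logarithmic, but the two interfering peaks are already visible at this resolution.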
Toric singularities revisited
Comprehensive surveys from various perspectives can be found in Danilov [2], Mumford et al. [19], [1] as well as [24, 25]. In [18], Kato extended the theory of toric geometry over a field to an absolute theory, without base. This is achieved by replacing the notion of a toroidal embedding introduced in [19] with the notion of a log structure. A toroidal embedding is a pair (X, U) consisting of a scheme X locally of finite type and an open subscheme U ⊂ X such that (X, U) is isomorphic, locally in the étale topology, to a pair consisting of a toric variety and its algebraic torus. Toroidal embeddings are particularly nice locally Noetherian schemes with distinguished log structures. A log structure on a scheme X, in the sense of Fontaine and Illusie, is a morphism of sheaves of monoids α : M_X → O_X restricting to an isomorphism α⁻¹(O_X^*) ≅ O_X^*. The theory of log structures on schemes is developed by Kato in [16]. Log structures were developed to give a unified treatment of the various constructions of de Rham complexes with logarithmic poles. In [13] Illusie recalls the question that motivated their definition: Let me briefly recall what the main motivating question was. Suppose S is the spectrum of a complete discrete valuation ring A, with closed (resp. generic) point s (resp. η), and X/S is a scheme with semi-stable reduction, which means that, locally for the étale topology, X is isomorphic to the closed subscheme of Aⁿ_S defined by the equation x₁ ··· x_n = t, where x₁, ..., x_n are coordinates on Aⁿ and t is a uniformizing parameter of A. Then X is regular, X_η is smooth, and Y = X_s is a divisor with normal crossings on X. In this situation, one can consider, with Hyodo, the relative de Rham complex of X over S with logarithmic poles along Y, ω˙_{X/S} = Ω˙_{X/S}(log Y) ([10]; see also [11, 12]). Its restriction to the generic fiber is the usual de Rham complex Ω˙_{X_η/η}, and it induces on Y a complex ω˙_Y = O_Y ⊗ Ω˙_{X/S}(log Y).
Chapter 4 Exponential and Logarithmic Functions
log 100 = log 10² = 2
ln 1 = 0
log 0.1 = log 10⁻¹ = −1
d. Find ln e⁻¹.
d. Find log₃₆ 6.
ln e⁻¹ = −1 · ln e = −1
4.2 Logarithmic Functions
Example 3 – Graph of a Logarithmic Function with b > 1
Chapter Objectives
• To introduce exponential functions and their applications.
• To introduce logarithmic functions and their graphs.
Example 11 – Radioactive Decay
A radioactive element decays such that after t days the number of milligrams present is given by N = 100e^(−0.062t). a. How many milligrams are initially present? Solution: For t = 0, N = 100e^(−0.062(0)) = 100 mg.
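A quick numerical check of this decay formula (a minimal sketch; the half-life computation is an added illustration, not part of the original example):

```python
import math

def N(t):
    """Milligrams remaining after t days: N = 100 e^(-0.062 t)."""
    return 100 * math.exp(-0.062 * t)

print(N(0))                   # 100.0 mg initially present
print(round(N(10), 1))        # amount remaining after 10 days
print(math.log(2) / 0.062)    # half-life in days: solve 100 e^(-0.062 t) = 50
```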
log₃₆ 6 = log₆ 6 / log₆ 36 = 1/2, since log₆ 6 = 1 and log₆ 36 = 2
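These evaluations can be verified numerically (a minimal sketch using Python's standard math module):

```python
import math

print(math.log10(100))         # log 100 = 2
print(math.log(1))             # ln 1 = 0
print(math.log10(0.1))         # log 0.1 = -1
print(math.log(math.e**-1))    # ln e^-1 = -1
print(math.log(6, 36))         # log_36 6 = 0.5
```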
Recommended Reading List for Contemporary University Students
Recommended readings for the "Principles of Economics" course (revised October 2009). Most of the books listed were written by well-known economists; they apply basic economic principles to real-world questions and are entertaining and easy to read.
The purpose of the list is to cultivate students' interest in learning and applying economics.
These books are of no direct help with homework or exams, and are especially suitable for leisure-time reading (during vacations, for example).
The list is based on the Suggested Summer Readings in Mankiw, Principles of Economics (3rd ed.),
supplemented with Chinese translations and a number of additional titles.
The ordering roughly parallels the content of the textbook.
Books with similar content, or by the same author, are grouped together; readers short on time may pick just one from each group.
Freakonomics: A Rogue Economist Explores the Hidden Side of Everything, by Steven D. Levitt and Stephen J. Dubner, William Morrow, 2005. Chinese edition: 《魔鬼经济学》 (trans. 刘祥亚), 广东经济出版社, 2006.
Philosophy of Science: A Very Short Introduction, by Samir Okasha, Oxford University Press, 2002. Chinese edition: 《科学哲学》 (trans. 韩广忠), 凤凰出版传媒集团/译林出版社, 2009.
The Fatal Equilibrium, by Marshall Jevons, MIT Press, 1985. Chinese edition: 《致命的均衡》 (trans. 罗全喜, 叶凯), 机械工业出版社, 2005.
Murder at the Margin, by Marshall Jevons, Princeton University Press, 1993. Chinese edition: 《边际谋杀》 (trans. 王红夏), 机械工业出版社, 2006.
A Deadly Indifference: A Henry Spearman Mystery, by Marshall Jevons, Carroll & Graf, 1995. Chinese edition: 《夺命的冷漠》 (trans. 石北燕, 赵保国), 机械工业出版社, 2008.
Game theory, maximum entropy, minimum discrepancy, and robust Bayesian decision theory
The Annals of Statistics, 2004, Vol. 32, No. 4, 1367–1433. DOI 10.1214/009053604000000553. © Institute of Mathematical Statistics, 2004.

GAME THEORY, MAXIMUM ENTROPY, MINIMUM DISCREPANCY AND ROBUST BAYESIAN DECISION THEORY¹

By Peter D. Grünwald and A. Philip Dawid
CWI Amsterdam and University College London

We describe and develop a close relationship between two problems that have customarily been regarded as distinct: that of maximizing entropy, and that of minimizing worst-case expected loss. Using a formulation grounded in the equilibrium theory of zero-sum games between Decision Maker and Nature, these two problems are shown to be dual to each other, the solution to each providing that to the other. Although Topsøe described this connection for the Shannon entropy over 20 years ago, it does not appear to be widely known even in that important special case.

We here generalize this theory to apply to arbitrary decision problems and loss functions. We indicate how an appropriate generalized definition of entropy can be associated with such a problem, and we show that, subject to certain regularity conditions, the above-mentioned duality continues to apply in this extended context. This simultaneously provides a possible rationale for maximizing entropy and a tool for finding robust Bayes acts. We also describe the essential identity between the problem of maximizing entropy and that of minimizing a related discrepancy or divergence between distributions. This leads to an extension, to arbitrary discrepancies, of a well-known minimax theorem for the case of Kullback–Leibler divergence (the "redundancy-capacity theorem" of information theory).

For the important case of families of distributions having certain mean values specified, we develop simple sufficient conditions and methods for identifying the desired solutions. We use this theory to introduce a new concept of "generalized exponential family" linked to the specific decision problem under consideration, and we demonstrate that this shares many of the properties of standard exponential families.

Finally, we show that the existence of an equilibrium in our game can be rephrased in terms of a "Pythagorean property" of the related divergence, thus generalizing previously announced results for Kullback–Leibler and Bregman divergences.

Received February 2002; revised May 2003.

¹ Supported in part by the EU Fourth Framework BRA NeuroCOLT II Working Group EP 27150, the European Science Foundation Programme on Highly Structured Stochastic Systems, Eurandom and the Gatsby Charitable Foundation. A four-page abstract containing an overview of part of this paper appeared in the Proceedings of the 2002 IEEE Information Theory Workshop [see Grünwald and Dawid (2002)].

AMS 2000 subject classifications. Primary 62C20; secondary 94A17.

Key words and phrases. Additive model, Bayes act, Bregman divergence, Brier score, convexity, duality, equalizer rule, exponential family, Gamma-minimax, generalized exponential family, Kullback–Leibler divergence, logarithmic score, maximin, mean-value constraints, minimax, mutual information, Pythagorean property, redundancy-capacity theorem, relative entropy, saddle-point, scoring rule, specific entropy, uncertainty function, zero–one loss.

1. Introduction. Suppose that, for purposes of inductive inference or choosing an optimal decision, we wish to select a single distribution P* to act as representative of a class Γ of such distributions. The maximum entropy principle [Jaynes (1989), Csiszár (1991) and Kapur and Kesavan (1992)] is widely applied for this purpose, but its rationale has often been
controversial [see, e.g., van Fraassen (1981), Shimony (1985), Skyrms (1985), Jaynes (1985), Seidenfeld (1986) and Uffink (1995, 1996)]. Here we emphasize and generalize a reinterpretation of the maximum entropy principle [Topsøe (1979), Walley (1991), Chapter 5, Section 12, and Grünwald (1998)]: that the distribution P* that maximizes the entropy over Γ also minimizes the worst-case expected logarithmic score (log loss). In the terminology of decision theory [Berger (1985)], P* is a robust Bayes, or Γ-minimax, act, when loss is measured by the logarithmic score. This gives a decision-theoretic interpretation of maximum entropy.

In this paper we extend this result to apply to a generalized concept of entropy, tailored to whatever loss function L is regarded as appropriate, not just logarithmic score. We show that, under regularity conditions, maximizing this generalized entropy constitutes the major step toward finding the robust Bayes ("Γ-minimax") act against Γ with respect to L. For the important special case that Γ is described by mean-value constraints, we give theorems that in many cases allow us to find the maximum generalized entropy distribution explicitly. We further define generalized exponential families of distributions, which, for the case of the logarithmic score, reduce to the usual exponential families.

We extend generalized entropy to generalized relative entropy and show how this is essentially the same as a general decision-theoretic definition of discrepancy. We show that the family of divergences between probability measures known as Bregman divergences constitutes a special case of such discrepancies. A discrepancy can also be used as a loss function in its own right: we show that a minimax result for relative entropy [Haussler (1997)] can be extended to this more general case. We further show that a "Pythagorean property" [Csiszár (1991)] known to hold for relative entropy and for Bregman divergences in fact applies much more generally; and we give a precise characterization of those discrepancies for which it holds.

Our analysis is game-theoretic, a crucial concern being the existence and properties of a saddle-point, and its associated minimax and maximin acts, in a suitable zero-sum game between Decision Maker and Nature.

1.1. A word of caution. It is not our purpose either to advocate or to criticize the maximum entropy or robust Bayes approach: we adopt a philosophically neutral stance. Rather, our aim is mathematical unification. By generalizing the concept of entropy beyond the standard Shannon framework, we obtain a variety of interesting characterizations of maximum generalized entropy and display its connections with other known concepts and results. The connection with Γ-minimax might be viewed, by those who already regard robust Bayes as a well-founded principle, as a justification for maximizing entropy, but it should be noted that Γ-minimax, like all minimax approaches, is not without problems of its own [Berger (1985)]. We must also point out that some of the more problematic aspects of maximum entropy inference, such as the incompatibility of maximum entropy with Bayesian updating [Seidenfeld (1986) and Uffink (1996)], carry over to our generalized setting: in the words of one referee, rather than resolving this problem, we "spread it to a new level of abstraction and generality." Although these dangers must be firmly held in mind when considering the implications of this work for inductive inference, they do not undermine the mathematical connections established.

2. Overview. We start with an overview of our
results. For ease of exposition, we make several simplifying assumptions, such as a finite sample space, in this section. These assumptions will later be relaxed.

2.1. Maximum entropy and game theory. Let X be a finite sample space, and let Γ be a family of distributions over X. Consider a Decision Maker (DM) who has to make a decision whose consequences will depend on the outcome of a random variable X defined on X. DM is willing to assume that X is distributed according to some P ∈ Γ, a known family of distributions over X, but he or she does not know which such distribution applies. DM would like to pick a single P* ∈ Γ to base decisions on. One way of selecting such a P* is to apply the maximum entropy principle [Jaynes (1989)], which advises DM to pick that distribution P* ∈ Γ maximizing H(P) over all P ∈ Γ. Here H(P) denotes the Shannon entropy of P, H(P) := −∑_{x∈X} p(x) log p(x) = E_P{−log p(X)}, where p is the probability mass function of P. However, the various rationales offered in support of this advice have often been unclear or disputed.

Here we shall present a game-theoretic rationale, which some may find attractive. Let A be the set of all probability mass functions defined over X. By the information inequality [Cover and Thomas (1991)], we have that, for any distribution P, inf_{q∈A} E_P{−log q(X)} is achieved uniquely at q = p, where it takes the value H(P). That is, H(P) = inf_{q∈A} E_P{−log q(X)}, and so the maximum entropy can be written as

sup_{P∈Γ} H(P) = sup_{P∈Γ} inf_{q∈A} E_P{−log q(X)}.   (1)

Now consider the "log loss game" [Good (1952)], in which DM has to specify some q ∈ A, and DM's ensuing loss if Nature then reveals X = x is measured by −log q(x). Alternatively, we can consider the "code-length game" [Topsøe (1979) and Harremoës and Topsøe (2001)], wherein we require DM to specify
Thus DM’s objective is to minimize expected code-length.Basic results of coding theory[see,e.g.,Dawid(1992)]imply that we can associate withσa probability mass function q having q(x)=2−κ(x).Then,up to a constant,−log q(x)becomes identical with the code-lengthκ(x),so that the log loss game is essentially equivalent to the code-length game.By analogy with minimax results of game theory,one might conjecture thatsup P∈ infq∈AE P{−log q(X)}=infq∈AsupP∈E P{−log q(X)}.(2)As we have seen,P achieving the supremum on the left-hand side of(2)is a maximum entropy distribution in .However,just as important,q achieving the infimum on the right-hand side of(2)is a robust Bayes act against ,or a -minimax act[Berger(1985)],for the log loss decision problem.Now it turns out that,when is closed and convex,(2)does indeed hold under very general conditions.Moreover the infimum on the right-hand side is achieved uniquely for q=p∗,the probability mass function of the maximum entropy distribution P∗.Thus,in this game between DM and Nature,the maximum entropy distribution P∗may be viewed,simultaneously,as defining both Nature’s maximin and—in our view more interesting—DM’s minimax strategy.In other words, maximum entropy is robust Bayes.This decision-theoretic reinterpretation might now be regarded as a plausible justification for selecting the maximum entropy distribution.Note particularly that we do not restrict the acts q available to DM to those corresponding to a distribution in the restricted set :that the optimal act p∗does indeed turn out to have this property is a consequence of,not a restriction on, the analysis.The maximum entropy method has been most commonly applied in the setting where is described by mean-value constraints[Jaynes(1989)and Csiszár (1991)]: ={P:E P(T)=τ},where T=t(X)∈R k is some given real-or vector-valued statistic.As pointed out by Grünwald(1998),for such constraints the property(2)is particularly easy to show.By the general theory of exponential families[Barndorff-Nielsen(1978)],under some mild conditions onτthere will exist a distribution P∗satisfying the constraint E P∗(T)=τand having probability mass function of the form p∗(x)=exp{α0+αT t(x)}for someα∈R k,α0∈R. 
Then, for any P ∈ Γ,

E_P{−log p*(X)} = −α₀ − αᵀE_P(T) = −α₀ − αᵀτ = H(P*).   (3)

We thus see that p* is an "equalizer rule" against Γ, having the same expected loss under any P ∈ Γ. To see that P* maximizes entropy, observe that, for any P ∈ Γ,

H(P) = inf_{q∈A} E_P{−log q(X)} ≤ E_P{−log p*(X)} = H(P*),   (4)

by (3). To see that p* is robust Bayes and that (2) holds, note that, for any q ∈ A,

sup_{P∈Γ} E_P{−log q(X)} ≥ E_{P*}{−log q(X)} ≥ E_{P*}{−log p*(X)} = H(P*),   (5)

where the second inequality is the information inequality [Cover and Thomas (1991)]. Hence

H(P*) ≤ inf_{q∈A} sup_{P∈Γ} E_P{−log q(X)}.   (6)

However, it follows trivially from the "equalizer" property (3) of p* that

sup_{P∈Γ} E_P{−log p*(X)} = H(P*).   (7)

From (6) and (7), we see that the choice q = p* achieves the infimum on the right-hand side of (2) and is thus robust Bayes. Moreover, (2) holds, with both sides equal to H(P*). The above argument can be extended to much more general sample spaces (see Section 7). Although this game-theoretic approach and result date back at least to Topsøe (1979), they seem to have attracted little attention so far.
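The mean-value construction above is easy to check numerically. The following sketch (an illustration with assumed values, not code from the paper) fits α so that the tilted distribution p*(x) ∝ exp{α t(x)} satisfies E_{P*}(T) = τ on a small finite X, and then verifies the equalizer property (3): every P with E_P(T) = τ gives p* the same expected log loss H(P*).

```python
import numpy as np
from scipy.optimize import brentq

x = np.arange(5.0)            # sample space X = {0,1,2,3,4}, statistic t(x) = x
tau = 1.3                     # assumed mean-value constraint E_P(T) = tau

def tilted(alpha):            # p*(x) = exp{alpha0 + alpha x}, alpha0 = -log Z
    w = np.exp(alpha * x)
    return w / w.sum()

alpha = brentq(lambda a: tilted(a) @ x - tau, -10.0, 10.0)
p_star = tilted(alpha)
H_star = -(p_star @ np.log(p_star))          # Shannon entropy H(P*)

# Equalizer property (3): every P in Gamma gives p* the same expected
# log loss H(P*).  Test along a line inside the constraint set.
q0 = np.array([0.0, 0.7, 0.3, 0.0, 0.0])     # another distribution with mean 1.3
for lam in (0.0, 0.5, 1.0):
    P = (1 - lam) * q0 + lam * p_star
    print(P @ -np.log(p_star), "should equal", H_star)
```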
2.2. This work: generalized entropy. The above robust Bayes view of maximum entropy might be regarded as justifying its use in those decision problems, such as discrete coding and Kelly gambling [Cover and Thomas (1991)], where the log loss is clearly an appropriate loss function to use. But what if we are interested in other loss functions? This is the principal question we address in this paper.

2.2.1. Generalized entropy and robust Bayes acts. We first recall, in Section 3, a natural generalization of the concept of "entropy" (or "uncertainty inherent in a distribution"), related to a specific decision problem and loss function facing DM. The generalized entropy thus associated with the log loss problem is just the Shannon entropy. More generally, let A be some space of actions or decisions and let X be the (not necessarily finite) space of possible outcomes to be observed. Let the loss function be given by L : X × A → (−∞, ∞], and let Γ be a convex set of distributions over X. In Sections 4–6 we set up a statistical game G based on these ingredients and use this to show that, under a variety of broad regularity conditions, the distribution P* maximizing, over Γ, the generalized entropy associated with the loss function L has a Bayes act a* ∈ A [achieving inf_{a∈A} L(P*, a)] that is a robust Bayes (Γ-minimax) decision relative to L, thus generalizing the result for the log loss described in Section 2.1. Some variations on this result are also given.

2.2.2. Generalized exponential families. In Section 7 we consider in detail the case of mean-value constraints, of the form Γ = {P : E_P(T) = τ}. For fixed loss function L and statistic T, as τ varies we obtain a family of maximum generalized entropy distributions, one for each value of τ. For Shannon entropy, this turns out to coincide with the exponential family having natural sufficient statistic T [Csiszár (1975)]. In close analogy we define the collection of maximum generalized entropy distributions, as we vary τ, to be the generalized exponential family determined by L and T, and we give several examples of such generalized exponential families. In particular, Lafferty's "additive models based on Bregman divergences" [Lafferty (1999)] are special cases of our generalized exponential families (Section 8.4.2).

2.2.3. Generalized relative entropy and discrepancy. In Section 8 we describe how generalized entropy extends to generalized relative entropy and show how this in turn is intimately related to a discrepancy or divergence function. Maximum generalized relative entropy then becomes a special case of the minimum discrepancy method. For the log loss, the associated discrepancy function is just the familiar Kullback–Leibler divergence, and the method then coincides with the "classical" minimum relative entropy method [Jaynes (1989); note that, for Jaynes, "relative entropy" is the same as Kullback–Leibler divergence; for us it is the negative of this].

2.2.4. A generalized redundancy-capacity theorem. In many statistical decision problems it is more natural to seek minimax decisions with respect to the discrepancy associated with a loss, rather than with respect to the loss directly. With any game we thus associate a new "derived game," in which the discrepancy constructed from the loss function of the original game now serves as a new loss function. In Section 9 we show that our minimax theorems apply to games of this form too: broadly, whenever the conditions for such a theorem hold for the original game, they also hold for the derived game. As a special case, we reprove a minimax theorem for the Kullback–Leibler divergence [Haussler (1997)], known in information theory as the redundancy-capacity theorem [Merhav and Feder (1995)].

2.2.5. The Pythagorean property. The Kullback–Leibler divergence has a celebrated property reminiscent of squared Euclidean distance: it satisfies an analogue of the Pythagorean theorem [Csiszár (1975)]. It has been noted [Csiszár (1991), Jones and Byrne (1990) and Lafferty (1999)] that a version of this property is shared by the broader class of Bregman divergences. In Section 10 we show that a "Pythagorean inequality" in fact holds for the discrepancy based on an arbitrary loss function L, so long as the game G has a value; that is, an analogue of (2) holds. Such decision-based discrepancies include Bregman divergences as special cases. We demonstrate that, even for the case of mean-value constraints, the Pythagorean inequality for a Bregman divergence may be strict.

2.2.6. Finally, Section 11 takes stock of what has been achieved and presents some suggestions for further development.

3. Decision problems. In this section we set out some general definitions and properties we shall require. For more background on the concepts discussed here, see Dawid (1998).

A DM has to take some action a selected from a given action space A, after which Nature will reveal the value x ∈ X of a quantity X, and DM will then suffer a loss L(x, a) in (−∞, ∞]. We suppose that Nature takes no account of the action chosen by DM. Then this can be considered as a zero-sum game between Nature and DM, with both players moving simultaneously, and DM paying Nature L(x, a) after both moves are revealed. We call such a combination G := (X, A, L) a basic game.

Both DM and Nature are also allowed to make randomized moves, such a move being described by a probability distribution P over X (for Nature) or ζ over A (for DM). We assume that suitable σ-fields, containing all singleton sets, have been specified in X and A, and that any probability distributions considered are defined over the relevant σ-field; we denote the family of all such probability distributions on X by P₀. We further suppose that the loss
function L is jointly measurable.

3.1. Expected loss. We shall permit algebraic operations on the extended real line [−∞, ∞], with definitions and exceptions as in Rockafellar (1970), Section 4. For a function f : X → [−∞, ∞], and P ∈ P₀, we may denote E_P{f(X)} [i.e., E_{X∼P}{f(X)}] by f(P). When f is bounded below, f(P) is construed as ∞ if P{f(X) = ∞} > 0. When f is unbounded, we interpret f(P) as f⁺(P) − f⁻(P) ∈ [−∞, +∞], where f⁺(x) := max{f(x), 0} and f⁻(x) := max{−f(x), 0}, allowing either f⁺(P) or f⁻(P) to take the value ∞, but not both. In this last case f(P) is undefined, else it is defined (either as a finite number or as ±∞).

If DM knows that Nature is generating X from P or, in the absence of such knowledge, DM is using P to represent his or her own uncertainty about X, then the undesirability to DM of any act a ∈ A will be assessed by means of its expected loss,

L(P, a) := E_P{L(X, a)}.   (8)

We can similarly extend L to randomized acts: L(x, ζ) := E_{A∼ζ}{L(x, A)}, L(P, ζ) = E_{(X,A)∼P×ζ}{L(X, A)}. Throughout this paper we shall mostly confine attention to probability measures P ∈ P₀ such that L(P, a) is defined for all a ∈ A, and we shall denote the family of all such P by P. We further confine attention to randomized acts ζ such that L(P, ζ) is defined for all P ∈ P, denoting the set of all such ζ by Z. Note that any distribution degenerate at a point x ∈ X is in P, and so L(x, ζ) is defined for all x ∈ X, ζ ∈ Z.

LEMMA 3.1. For all P ∈ P, ζ ∈ Z,

L(P, ζ) = E_{X∼P}{L(X, ζ)} = E_{A∼ζ}{L(P, A)}.   (9)

PROOF. When L(P, ζ) is finite this is just Fubini's theorem. Now consider the case L(P, ζ) = ∞. First suppose L ≥ 0 everywhere. If L(x, ζ) = ∞ for x in a subset of X having positive P-measure, then (9) holds, both sides being +∞. Otherwise, L(x, ζ) is finite almost surely [P]. If E_P{L(X, ζ)} were finite, then by Fubini it would be the same as L(P, ζ). So once again E_P{L(X, ζ)} = L(P, ζ) = +∞. This result now extends easily to possibly negative L, on noting that L⁻(P, ζ) must be finite; a parallel result holds when L(P, ζ) = −∞. Finally the whole argument can be repeated after interchanging the roles of x and a and of P and ζ.

COROLLARY 3.1. For any P ∈ P,

inf_{ζ∈Z} L(P, ζ) = inf_{a∈A} L(P, a).   (10)

PROOF. Clearly inf_{ζ∈Z} L(P, ζ) ≤ inf_{a∈A} L(P, a). If inf_{a∈A} L(P, a) = −∞ we are done. Otherwise, for any ζ ∈ Z, L(P, ζ) = E_{A∼ζ} L(P, A) ≥ inf_{a∈A} L(P, a).

We shall need the fact that, for any ζ ∈ Z, L(P, ζ) is linear in P in the following sense.

LEMMA 3.2. Let P₀, P₁ ∈ P, and let P_λ := (1 − λ)P₀ + λP₁. Fix ζ ∈ Z, such that the pair {L(P₀, ζ), L(P₁, ζ)} does not contain both the values −∞ and +∞. Then, for any λ ∈ (0, 1), L(P_λ, ζ) is finite if and only if both L(P₁, ζ) and L(P₀, ζ) are. In this case L(P_λ, ζ) = (1 − λ)L(P₀, ζ) + λL(P₁, ζ).

PROOF. Consider a bivariate random variable (I, X) with joint distribution P* over {0, 1} × X specified by the following: I = 1, 0 with respective probabilities λ, 1 − λ; and, given I = i, X has distribution P_i. By Fubini we have

E_{P*}{L(X, ζ)} = E_{P*}[E_{P*}{L(X, ζ) | I}],

in the sense that, whenever one side of this equation is defined and finite, the same holds for the other, and they are equal. Noting that, under P*, the distribution of X is P_λ marginally, and P_i conditional on I = i (i = 0, 1), the result follows.

3.2. Bayes act. Intuitively, when X ∼ P an act a_P ∈ A will be optimal if it minimizes L(P, a) over all a ∈ A. Any such act a_P is a Bayes act against P. More generally, to allow for the possibility that L(P, a) may be infinite as well as to take into account randomization, we call ζ_P ∈ Z a (randomized) Bayes act, or simply Bayes, against P (not necessarily in P) if

E_P{L(X, ζ) − L(X, ζ_P)} ∈ [0, ∞]   (11)

for all ζ ∈ Z. We denote by A_P (resp. Z_P) the set of all nonrandomized (resp.
randomized) Bayes acts against P. Clearly A_P ⊆ Z_P, and L(P, ζ_P) is the same for all ζ_P ∈ Z_P.

The loss function L will be called Γ-strict if, for each P ∈ Γ, there exists a_P ∈ A that is the unique Bayes act against P; L is Γ-semistrict if, for each P ∈ Γ, A_P is nonempty, and a, a′ ∈ A_P ⇒ L(·, a) ≡ L(·, a′). When L is Γ-strict, and P ∈ Γ, it can never be optimal for DM to choose a randomized act; when L is Γ-semistrict, even though a randomized act can be optimal there is never any point in choosing one, since its loss function will be identical with that of any nonrandomized optimal act.

Semistrictness is clearly weaker than strictness. For our purposes we can replace it by the still weaker concept of relative strictness: L is Γ-relatively strict if for all P ∈ Γ the set of Bayes acts A_P is nonempty and, for all a, a′ ∈ A_P, L(P′, a) = L(P′, a′) for all P′ ∈ Γ.

3.3. Bayes loss and entropy. Whether or not a Bayes act exists, the Bayes loss H(P) ∈ [−∞, ∞] of a distribution P ∈ P is defined by

H(P) := inf_{a∈A} L(P, a).   (12)

It follows from Corollary 3.1 that it would make no difference if the infimum in (12) were extended to be over ζ ∈ Z. We shall mostly be interested in Bayes acts of distributions P with finite H(P). In the context of Section 2.1, with L(x, q) the log loss −log q(x), H(P) is just the Shannon entropy of P.

PROPOSITION 3.1. Let P ∈ P and suppose H(P) is finite. Then the following hold:

(i) ζ_P ∈ Z is Bayes against P if and only if

E_P{L(X, a) − L(X, ζ_P)} ∈ [0, ∞]   (13)

for all a ∈ A.

(ii) ζ_P is Bayes against P if and only if L(P, ζ_P) = H(P).

(iii) If P admits some randomized Bayes act, then P also admits some nonrandomized Bayes act; that is, A_P is not empty.

PROOF. Items (i) and (ii) follow easily from (10) and finiteness. To prove (iii), let f(P, a) := L(P, a) − H(P). Then f(P, a) ≥ 0 for all a, while E_{A∼ζ_P} f(P, A) = L(P, ζ_P) − H(P) = 0. We deduce that {a ∈ A : f(P, a) = 0} has probability 1 under ζ_P and so, in particular, must be nonempty.

We express the well-known concavity property of the Bayes loss [DeGroot (1970), Section 8.4] as follows.

PROPOSITION 3.2. Let P₀, P₁ ∈ P, and let P_λ := (1 − λ)P₀ + λP₁. Suppose that H(P_i) < ∞ for i = 0, 1. Then H(P_λ) is a concave function of λ on [0, 1] (and thus, in particular, continuous on (0, 1) and lower semicontinuous on [0, 1]). It is either bounded above on [0, 1] or infinite everywhere on (0, 1).

PROOF. Let B be the set of all a ∈ A such that L(P_λ, a) < ∞ for some λ ∈ (0, 1), and thus, by Lemma 3.2, for all λ ∈ [0, 1]. If B is empty, then H(P_λ) = ∞ for all λ ∈ (0, 1); in particular, H(P_λ) is then concave on [0, 1]. Otherwise, taking any fixed a ∈ B we have H(P_λ) ≤ L(P_λ, a) ≤ max_i L(P_i, a), so H(P_λ) is bounded above on [0, 1]. Moreover, as the pointwise infimum of the nonempty family of concave functions {L(P_λ, a) : a ∈ A}, H(P_λ) is itself a concave function of λ on [0, 1].

COROLLARY 3.2. If for all a ∈ A, L(P_λ, a) < ∞ for some λ ∈ (0, 1), then for all λ ∈ [0, 1], H(P_λ) = lim{H(P_µ) : µ ∈ [0, 1], µ → λ} [it being allowed that H(P_λ) is not finite].

PROOF. In this case B = A, so that H(P_λ) = inf_{a∈B} L(P_λ, a). Each function L(P_λ, a) is finite and linear, hence a closed concave function of λ on [0, 1].
This last property is then preserved on taking the infimum. The result now follows from Theorem 7.5 of Rockafellar (1970).

COROLLARY 3.3. If in addition H(P_i) is finite for i = 0, 1, then H(P_λ) is a bounded continuous function of λ on [0, 1].

Note that Corollary 3.3 will always apply when the loss function is bounded.

Under some further regularity conditions [see Dawid (1998, 2003) and Section 3.5.4 below], a general concave function over P can be regarded as generated from some decision problem by means of (12). Concave functions have been previously proposed as general measures of the uncertainty or diversity in a distribution [DeGroot (1962) and Rao (1982)], generalizing the Shannon entropy. We shall thus call the Bayes loss H, as given by (12), the (generalized) entropy function or uncertainty function associated with the loss function L.

3.4. Scoring rule. Suppose the action space A is itself a set Q of distributions for X. Note we are not here considering Q ∈ Q as a randomized act over X, but rather as a simple act in its own right (e.g., a decision to quote Q as a description of uncertainty about X). We typically write the loss as S(x, Q) in this case and refer to S as a scoring rule or score. Such scoring rules are used to assess the performance of probability forecasters [Dawid (1986)]. We say S is Γ-proper if Γ ⊆ Q ⊆ P and, for all P ∈ Γ, the choice Q = P is Bayes against X ∼ P. Then for P ∈ Γ,

H(P) = S(P, P).   (14)

Suppose now we start from a general decision problem, with loss function L such that Z_Q is nonempty for all Q ∈ Q. Then we can define a scoring rule by

S(x, Q) := L(x, ζ_Q),   (15)

where for each Q ∈ Q we suppose we have selected some specific Bayes act ζ_Q ∈ Z_Q. Then for P ∈ Q, S(P, Q) = L(P, ζ_Q) is clearly minimized when Q = P, so that this scoring rule is Q-proper. If L is Q-semistrict, then (15) does not depend on the choice of Bayes act ζ_Q. More generally, if L is Q-relatively strict, then S(P, Q) does not depend on such a choice, for all P, Q ∈ Q. We see that, for P ∈ Q, inf_{Q∈Q} S(P, Q) = S(P, P) = L(P, ζ_P) = H(P). In particular, the generalized entropy associated with the constructed scoring rule (15) is identical with that determined by the original loss function L. In this way, almost any decision problem can be reformulated in terms of a proper scoring rule.

3.5. Some examples. We now give some simple examples, both to illustrate the above concepts and to provide a concrete focus for later development. Further examples may be found in Dawid (1998) and Dawid and Sebastiani (1999).

3.5.1. Brier score. Although it can be generalized, we restrict our treatment of the Brier score [Brier (1950)] to the case of a finite sample space X = {x₁, ..., x_N}. A distribution P over X can be represented by its probability vector p = (p(1), ..., p(N)), where p(x) := P(X = x). A point x ∈ X may also be represented by the N-vector δ_x corresponding to the point-mass distribution on {x}, having entries δ_x(j) = 1 if j = x, 0 otherwise. The Brier scoring rule is then defined by

S(x, Q) := ‖δ_x − q‖²   (16)
        = ∑_{j=1}^{N} {δ_x(j) − q(j)}² = ∑_j q(j)² − 2q(x) + 1.   (17)

Then

S(P, Q) = ∑_j q(j)² − 2∑_j p(j)q(j) + 1,   (18)

which is uniquely minimized for Q = P, so that this is a P-strict proper scoring rule. The corresponding entropy function is (see Figure 1)

H(P) = 1 − ∑_j p(j)².   (19)
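The propriety of the Brier score, and the entropy (19) it induces, can be checked directly (a minimal numerical sketch, with an arbitrary example distribution):

```python
import numpy as np

def brier(p, q):
    """Expected Brier score S(P, Q) = sum_j q(j)^2 - 2 sum_j p(j) q(j) + 1."""
    return np.sum(q**2) - 2 * np.sum(p * q) + 1

p = np.array([0.5, 0.3, 0.2])            # an arbitrary "true" distribution P
rng = np.random.default_rng(1)
scores = [brier(p, rng.dirichlet(np.ones(3))) for _ in range(10_000)]

# Q = P is Bayes: no random quote beats quoting P itself ...
print(min(scores) >= brier(p, p))                    # True
# ... and the Bayes loss is the entropy (19), H(P) = 1 - sum_j p(j)^2.
print(np.isclose(brier(p, p), 1 - np.sum(p**2)))     # True
```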
Experiencing English: Viewing, Listening and Speaking (Book 3) - 2.
3. People do not differ in behavior the way they differ in skin pigment. Extroverts, introverts, optimists, pessimists, criminals, liberals, etc. are found in all societies and cultures. Even identical twins (whose genes are 100% the same) and fraternal twins (whose genes are 50% the same) behave differently in most cases.
Answer: Is it nature or is it nurture?
2. According to the passage, what’s the definition of genius?
Answer: Geniuses are those who have the intelligence, enthusiasm and endurance to acquire the needed expertise in a broadly valued domain of achievement and who then make contributions to that field that are considered by peers to be original.
Mysteries of the Universe
The mysteries of the universe are vast and awe-inspiring, encompassing everything from the nature of dark matter and dark energy to the origins of the cosmos and the possibility of extraterrestrial life. Some of the most intriguing mysteries include:

1. Dark Matter and Dark Energy: These are two of the most enigmatic components of the universe, comprising the majority of its mass and energy. Yet, their true nature and properties remain largely elusive, challenging our understanding of the fundamental forces at play in the universe.

2. The Big Bang: The origin of the universe itself is a profound mystery, with the Big Bang theory providing a framework for understanding the rapid expansion of space and the subsequent evolution of galaxies, stars, and planets. However, many questions remain about what preceded the Big Bang and what lies beyond the observable universe.

3. Black Holes: These enigmatic cosmic phenomena have captivated scientists and the public alike, as their extreme gravitational pull and mysterious interiors defy our current understanding of physics. The nature of the singularity at the heart of a black hole and the potential links to other cosmic mysteries are subjects of ongoing research.

4. Exoplanets and the Search for Life: The discovery of thousands of exoplanets beyond our solar system has fueled speculation about the potential for life elsewhere in the universe. Understanding the conditions necessary for life to exist and the likelihood of finding extraterrestrial civilizations are among the most tantalizing mysteries in astronomy.

5. Quantum Mechanics and Gravity: The quest to reconcile the principles of quantum mechanics with the force of gravity represents a major frontier in theoretical physics, with profound implications for understanding the behavior of matter at the smallest and largest scales.

These mysteries, among many others, continue to inspire scientists and philosophers to push the boundaries of human knowledge and imagination, offering a glimpse into the profound complexities of the cosmos.
Logarithmic transform coefficient histogram matching with spatial equalization

Blair Silver◊, Sos Agaian*, and Karen Panetta◊
◊ Department of Electrical and Computer Engineering, Tufts University, 161 College Avenue, Medford, MA 02155
* College of Engineering, The University of Texas at San Antonio, 6900 North Loop 1604 West, San Antonio, TX 78249-0669

ABSTRACT

In this paper we propose an image enhancement algorithm based on histogram data gathered from transform domain coefficients, which improves on the limitations of the histogram equalization method. Traditionally, classical histogram equalization has had some problems due to its inherent dynamic range expansion. Many images with data tightly clustered around certain intensity values can be over-enhanced by standard histogram equalization, leading to artifacts and overall tonal change of the image. In the transform domain, one has control over subtle image properties such as low and high frequency content with their respective magnitudes and phases. However, due to the nature of many of these transforms, the coefficients' histograms may be so tightly packed that distinguishing them from one another may be impossible. By placing the transform coefficients in the logarithmic transform domain, it is easy to see the difference between different quality levels of images based upon their logarithmic transform coefficient histograms. Our results demonstrate that combining the spatial method of histogram equalization with logarithmic transform domain coefficient histograms achieves a much more balanced enhancement that outperforms classical histogram equalization.

Keywords: Image enhancement, transform coefficient histogram, histogram equalization, histogram matching.

1. INTRODUCTION

Image enhancement techniques strive for one major purpose: to improve some characteristic of an image. These enhancement techniques can be broken up into two major classifications: spatial domain enhancement and transform domain enhancement.

Spatial domain techniques deal with the raw image data, altering the intensity values based on a specific algorithm's set of criteria. These techniques can range from local filtering to global algorithms. A common example of a spatial technique is histogram equalization, which attempts to alter the spatial histogram of an image to closely match a uniform distribution. Histogram equalization treats the image globally and because of this is poorly suited for retaining local detail. It is also common for the equalization to over-enhance the image, resulting in an undesired loss of visual data, of quality, and of intensity scale [6].

Transform domain enhancement techniques involve transforming the image intensity data into a specific domain by using such methods as the DCT, Fourier, and Hartley transforms [2,7-9,11]. These transforms are used to alter the frequency content of an image to improve desired traits, such as high frequency content. Many enhancement techniques have been proposed that attempt to enhance the image based upon other transform domains and their characteristics [2,7-9,11]. Each of these methods has its strong points and its weak points. This leads to the question: is there a way to combine these styles of enhancement to return even better results? This paper will explore a new method by which a transform domain based technique and a spatial technique can be combined to enhance images.
The proposed algorithm will address visualizing and altering the transform coefficient histograms through histogram mapping and histogram equalization using the Discrete Cosine Transform (DCT). This paper will also demonstrate a quantitative measurement based upon contrast entropy to determine the efficacy and the optimization of the method.

The paper is organized as follows: Section 1 lays out the difference between spatial and transform domain enhancement and briefly states the proposed algorithm. Section 2 defines the measure of algorithm performance, the choice of optimal parameters, the logarithmic transform domain, and histogram equalization. Section 3 explains the logarithmic transform domain histogram matching with histogram equalization algorithm (LTHMHE) as well as LTHMHE combined with alpha-rooting, and Section 4 is an analysis of the experimental results using this method. Section 5 is a discussion of the results, and some concluding comments are made.

2. BACKGROUND

In this section, background topics necessary to understand the proposed method are discussed. The measure of performance is explored first, followed by a method for choosing optimal parameters, a definition of the logarithmic transform domain, a definition of histogram equalization, and a definition of alpha-rooting.

2.1 Measure of performance

Measuring the performance of a given enhancement algorithm is a key step in understanding how effective a given method is. However, defining a proper measure of enhancement has proven to be a difficult task. It is important when implementing an image enhancement technique to create a suitable image enhancement measure; however, the improvement resulting from the enhancement is often difficult to measure. This problem becomes more apparent when the enhancement algorithms are parameter based and one needs: a) to choose the best parameters; b) to choose the best transform among a class of unitary transforms; c) to automate the image enhancement procedures. The problem becomes especially difficult when an image enhancement procedure is used as a preprocessing step for other image processing purposes such as object detection, classification, and recognition. For this reason it becomes apparent that a measure must be designed based on a specific trait of the image.

In the past, there have been many differing definitions of an adequate measure of performance based on contrast [1,3,4,11]. Gordon and Rangayyan used local contrast defined by the mean gray values of two rectangular windows centered on a current pixel [4]. Beghdadi and Le Négrate defined an improved version of the aforementioned measure by basing their method on local edge information of the image [1]. In the past, attempts at statistical measures of gray level distribution of local contrast enhancement (for example mean, variance, or entropy) have not been particularly useful or meaningful. A number of images, which show an obvious contrast improvement, showed no consistency, as a class, when using these statistical measurements. Morrow introduced a measure based on the contrast histogram, which has a much greater consistency than statistical measures [11].

For simple patterns, two definitions of contrast measure have also been often used. One is the Michelson contrast measure; the other is the Weber contrast measure.
The Michelson contrast measure is used to measure the contrast of a periodic pattern such as a sinusoidal grating, while the Weber contrast measure assumes a large uniform luminance background with a small test target. Both measures are therefore unsuitable for measuring the contrast in complex images [10]. Many modifications of the Weber contrast have been proposed [7-9]. Note that Fechner's law gives a relationship between brightness and light intensity, of the form

B = k′ ln(f_max) − k′ ln(f_min) = k′ ln(f_max / f_min),   (1)

where k′ is a constant, and f_max and f_min are the maximum and minimum luminance values in a block of the image. Fechner's law provides the basis for the contrast measure based on contrast entropy which was proposed and later modified by Agaian [8,9].

Definition [9]: Let an image I be split into k₁ × k₂ blocks B(k, l) with center (k, l) of size M₁ × M₂. An image enhancement or contrast measure with respect to a transform Φ is

AWC(Φ) = max_{parameters} { min_{k,l} [ I^w_{max;k,l}(Φ, par) / I^w_{min;k,l}(Φ, par) ] },   (2)

where Φ is a given transform from the class of fast unitary transforms (including wavelets), I^w_{max;k,l} and I^w_{min;k,l} are the maximum and minimum luminance values in a block B(k, l) of the image, and the parameters are the processing enhancement algorithm parameters.

Definition [8]: Modified image enhancement measure:

EME_{α,k₁,k₂}(Φ) = (1/(k₁k₂)) ∑_{l=1}^{k₂} ∑_{k=1}^{k₁} 20 log [ I^w_{max;k,l}(Φ, par) / I^w_{min;k,l}(Φ, par) ].   (3)

EME_{α,k₁,k₂}(Φ) is called a measure of image enhancement or contrast measure with respect to transform Φ. Therefore, the optimal transform Φ is relative to the measure of enhancement, EME = EME₀(Φ).

A simple modification to the above definition leads to another powerful measure of enhancement, as proposed also by Agaian [8]:

EME_{k₁,k₂}(Φ) = (1/(k₁k₂)) ∑_{l=1}^{k₂} ∑_{k=1}^{k₁} [ I^w_{max;k,l}(Φ) / I^w_{min;k,l}(Φ) ] log [ I^w_{max;k,l}(Φ) / I^w_{min;k,l}(Φ) ].   (4)

This is known as the measure of enhancement by entropy, or EME [8]. This measure averages contrast entropy over a given image using a specified block size. To be as accurate as possible, it is reasonable to use a smaller block size to make the data as representative as possible. For this paper, the measure of enhancement by entropy was used along with a block size of 4 by 4.

We also wish to introduce the Michelson law based contrast measure:

EME = max_Φ { EME(Φ) } = max_Φ { (1/(k₁k₂)) ∑_{l=1}^{k₂} ∑_{k=1}^{k₁} 20 log [ (I^w_{max;k,l} − I^w_{min;k,l}) / (I^w_{max;k,l} + I^w_{min;k,l}) ] }.   (5)

These definitions use the Michelson contrast, or modulation, definition: the relation between the spread and the sum of the two luminances can be represented as

Modulation = (L_max − L_min) / (L_max + L_min).   (6)

The main idea behind this measure is to use the relationship between the spread and the sum of the two luminance values found in a small block. It then takes the average modulation in each block over the entire image. In the context of vision, such a relationship could be caused by scattered light introduced into the view path by a translucent object.

AME_{α,k₁,k₂} = max_Φ { AME_{α,k₁,k₂}(Φ) }.   (7)

Another possible modification would be adaptation of the Michelson law based contrast measure to include contrast entropy:

AME_{α,k₁,k₂}(Φ) = (1/(k₁k₂)) ∑_{l=1}^{k₂} ∑_{k=1}^{k₁} [ (I^w_{max;k,l} − I^w_{min;k,l}) / (I^w_{max;k,l} + I^w_{min;k,l}) ]^α log [ (I^w_{max;k,l} − I^w_{min;k,l}) / (I^w_{max;k,l} + I^w_{min;k,l}) ].   (8)

An example of the AWC, EME, EME of Entropy, Michelson Law EME, and AME plotted versus an enhancement parameter, alpha, can be found in Figure 1.
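As a concrete reference for the measures above, here is a small sketch of the block-based EME of entropy (4). It is an illustrative implementation, not the authors' code, and it assumes a grayscale image as a 2-D array with values shifted by 1 to avoid a zero block minimum:

```python
import numpy as np

def eme_entropy(img, block=4):
    """EME of entropy, eq. (4): average of (Imax/Imin) * log(Imax/Imin)
    over non-overlapping block x block tiles of a grayscale image."""
    img = np.asarray(img, dtype=np.float64) + 1.0   # avoid Imin = 0
    h, w = img.shape
    total, count = 0.0, 0
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            tile = img[i:i + block, j:j + block]
            ratio = tile.max() / tile.min()
            total += ratio * np.log(ratio)
            count += 1
    return total / count

# Higher local contrast should give a higher score:
flat = np.full((16, 16), 100.0)
noisy = flat + np.random.default_rng(0).uniform(-50, 50, (16, 16))
print(eme_entropy(flat), "<", eme_entropy(noisy))
```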
Depending on the measure of enhancement, different measures will be more useful than others. Our proposed method works best with the measures based upon entropy, such as the EME of Entropy and the AME. For this paper, the EME of Entropy, equation (4), will be used to measure results.

[Figure 1: (a) Original Pentagon image; (b-f) graphs of alpha vs. AWC, EME, EME of Entropy, Michelson Law EME, and AME.]

2.3 Logarithmic transform domain

The transform domain affords us the ability to view the frequency content of an image. It conveniently breaks up the data into regions of lower and higher frequency. However, the histogram of this data is usually less telling and may require another type of transformation. This is because a plot of the histogram of a typical image is compact and uninformative, as shown in Figure 2a.

[Figure 2: (a) DCT-2 transform domain histogram; (b) logarithmic DCT-2 transform domain histogram.]

By taking the logarithm of the data, this problem can be avoided. This is done in primarily two steps. The first step requires the creation of a matrix to preserve the phase of the transform image, which is given by the equation

θ(i, j) = angle(X(i, j)),   (8)

where the angle function returns the angle of the coefficient. This will be used to restore the phase of the transform coefficients. The next step is to take the logarithm of the modulus of the coefficients, as shown by the equation

X̂(i, j) = γ ln(η |X(i, j)| + λ),   (9)

where η, γ, and λ are enhancement parameters, usually set to 1. The shifting coefficient, λ, is needed to avoid taking the logarithm of zero, which is undefined. The shifting, in itself, enhances the contrast of the image, though only slightly. This results in a much more visible version of the histogram, as shown in Figure 2b. To return the coefficients to the standard transform domain, the signal is exponentiated and the phase is restored as shown by

X′(i, j) = e^{X̂(i,j)} · e^{jθ(i,j)}.   (10)

This preserves the overall image characteristics, ensuring that the returned image is visually similar to the original image, and that the enhancement only plays upon the magnitude of the transform coefficients.

This process works with real and complex orthogonal transforms. It is important to keep the phase information unchanged, because the angle contains most of the image's underlying characteristic information. The coefficients in equation (9), η and γ, can be utilized as additional enhancement parameters. By changing their values one can find other optimal enhancement points, though for simplicity these coefficients can usually be set to 1. Graphs showing how to use these coefficients along with the EME to locate optimal values are shown in Figures 3d and 3e.

[Figure 3: (a) Original Pentagon image; (b) Pentagon image enhanced using LTHMHE with alpha-rooting, using k = 0.78; (c) EME vs. k, where k is a parameter of enhancement. Other examples of using plots of the EME to find optimal parameter values by picking the maximum: (d) Pentagon image enhanced using LTHMHE with η = 1.2 and λ = 3; (e) EME vs. η vs. λ; (f) EME vs. γ vs. λ.]
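A minimal sketch of the round trip in equations (8)-(10), assuming a real 2-D DCT-II of a grayscale image and the default parameters η = γ = λ = 1 (illustrative code, not the authors' implementation; note it inverts (9) exactly for a reconstruction check, whereas (10) as written retains the λ shift for its slight enhancement effect):

```python
import numpy as np
from scipy.fft import dctn, idctn

def to_log_domain(img, eta=1.0, gamma=1.0, lam=1.0):
    X = dctn(img, norm="ortho")                      # 2-D DCT-II coefficients
    theta = np.sign(X)                               # eq. (8): "phase" of real coefficients
    X_hat = gamma * np.log(eta * np.abs(X) + lam)    # eq. (9)
    return X_hat, theta

def from_log_domain(X_hat, theta, eta=1.0, gamma=1.0, lam=1.0):
    mag = (np.exp(X_hat / gamma) - lam) / eta        # exact inverse of eq. (9)
    return idctn(theta * mag, norm="ortho")          # eq. (10): restore phase, invert

img = np.random.default_rng(0).uniform(0, 255, (8, 8))
X_hat, theta = to_log_domain(img)
print(np.allclose(from_log_domain(X_hat, theta), img))   # True: round trip recovers the image
```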
2.2 Choosing optimal parameters

Since we have defined our measure of enhancement, it becomes necessary to define the method of choosing optimal parameters based upon that measure. Utilizing the proposed measure of enhancement based upon entropy affords a simple mathematical basis for determining the optimal parameters of our enhancement. By plotting the EME versus the coefficients of the enhancement on a specific image we can return a descriptive graph, as shown in Figure 3c. Interpreting this graph depends on the enhancement method. Since the method which we will be using for optimization involves alpha-rooting, we shall use the simple rule that the maximum EME value returns the optimal point. An example of an original image and its resulting optimized enhancement can also be found in Figure 3, as can examples of choosing optimal parameter values using multiple parameters (Figures 3d and 3e).

2.4 Histogram equalization

Histogram equalization maps the input image's intensity values to best approximate a uniform distribution. This technique is a useful tool for quick and easy image enhancement. In many cases, equalization successfully balances an image, returning an increase in contrast.

Given an image A(x, y) and a desired output image B(x, y), there is some transformation function, f, which maps A to B. All the pixels in A in the region a_n to a_n + da_n will have their values mapped to a corresponding region in B in the range b_n to b_n + db_n. Each of these images will have a probability density function (PDF), p_A(a_n) and p_B(b_n). Assuming a 1-1 mapping, it is easy to show that

p_B(b_n) db_n = p_A(a_n) da_n.   (11)

Using this relationship, it can be shown that the mapping function from A to B is

f(a_n) = ∫₀^{a_n} p_A(u) du = F_A(a_n),   (12)

where F_A(a_n) is the cumulative probability distribution function of the original image. Therefore, to return a histogram equalized image, an image must be transformed using its cumulative probability function.

Histogram equalization succeeds at image enhancement because it expands the dynamic range of intensity values while flattening the overall spatial histogram. This leads to a more even representation of the whole spectrum of intensities, which can be used to bring out otherwise subtle details. This is usually a quick and effective method for image enhancement. On many images, histogram equalization provides satisfactory to good results, but there are a number of images where it fails to properly enhance the test image. The shortcomings and pitfalls of histogram equalization can be easily shown [6].

As an example, Figure 4a shows an image of a helicopter. The resulting image after 256-level histogram equalization was applied is shown in Figure 4b. Figure 4c compares the spatial histograms of both images. Notice the loss of information on the body of the helicopter; you can no longer see the windows, or the details of the tail. The main focus of the image has become more of a silhouette than a picture. The background has been over-emphasized as well. Other problems with histogram equalization can be artifacts and overall brightness change in the resulting image [6].

[Figure 4: (a) Original image of Copter; (b) resulting image after basic histogram equalization of Copter; (c) comparison of the spatial histograms before and after histogram equalization.]
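For reference, the CDF mapping (12) for an 8-bit image can be sketched as follows (an illustrative implementation under the usual discrete approximation, not the paper's code):

```python
import numpy as np

def equalize(img):
    """Discrete version of eq. (12): map each gray level through the
    empirical CDF of the image, rescaled to the 8-bit range."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist) / img.size               # F_A(a_n)
    return np.round(255 * cdf[img]).astype(np.uint8)

img = np.clip(np.random.default_rng(0).normal(180, 20, (64, 64)),
              0, 255).astype(np.uint8)             # tightly clustered intensities
eq = equalize(img)
print(img.min(), img.max(), "->", eq.min(), eq.max())   # dynamic range expands
```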
2.5 Alpha-rooting

Alpha-rooting is a simple method that can be used in combination with many different orthogonal transforms such as the Fourier, Hartley, Haar wavelet and cosine transforms. The method is based upon two simple ideas: a) any signal, or image, is comprised of two elements, a magnitude and a phase; b) upon transformation, the high frequency coefficients of an image will have smaller magnitudes than the low frequency coefficients. The first property can be shown in the equation below:

X(p, s) = |X(p, s)| e^{jθ(p,s)}.   (13)

It can be shown that the phase of an image contains most of the information needed to reconstruct the image, while the magnitude only contains the intensity of the point. This can be demonstrated by combining the magnitude of one image with the phase of another, which will return almost a perfect reconstruction of the second image. It is then possible to change the magnitude information of an image without altering the basic layout of the image.

The second concept behind this method is magnitude reduction. The main idea is that the magnitudes of the lower frequency coefficients of a transform will have higher values than the higher frequency components. By raising the magnitude of an image to some value α, where 0 < α < 1, the higher valued low frequency components of an image are reduced more in proportion than the lower valued high frequency components. This proportional reduction of magnitudes leads to an emphasizing of the high frequency content of an image. The mathematical form of this operation can be seen below:

X̂(p, s) = X(p, s) |X(p, s)|^{α−1} = |X(p, s)|^α e^{jθ(p,s)},   (14)

where X(p, s) are the transform coefficients of the image x(p, s). Taking the inverse transform of the result returns the enhanced image. The resulting output shows an emphasis on the high frequency content of the image; leaving the phase unchanged results in an overall contrast enhancement of the entire image.
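A minimal sketch of alpha-rooting per equations (13)-(14), using a 2-D FFT and an assumed value α = 0.8 (illustrative only):

```python
import numpy as np

def alpha_root(img, alpha=0.8):
    """Eq. (14): scale each transform magnitude to |X|^alpha, keep the phase."""
    X = np.fft.fft2(img)
    X_hat = X * (np.abs(X) + 1e-12) ** (alpha - 1)   # small eps guards |X| = 0
    return np.real(np.fft.ifft2(X_hat))

img = np.random.default_rng(0).uniform(0, 255, (64, 64))
out = alpha_root(img, alpha=0.8)
print(out.shape, float(out.min()), float(out.max()))
```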
3. METHODOLOGY

3.1 Logarithmic transform histogram matching with histogram equalization

Traditionally, histogram matching is applied to spatial-domain data, adjusting the range of values to match a specified histogram. In this paper we discuss the application of transform histogram matching, a new take on the old concept. While investigating images of differing quality and their respective transform coefficient histograms, it became apparent that the visually better images returned distinctly different transform histograms from their worse counterparts. This is the basis for our exploration of transform histograms together with histogram equalization, a spatial technique that suffers from extreme dynamic-range expansion, which can result in ugly artifacts, as previously shown. By combining this basic technique with transform enhancement methods, the end results can be surprisingly better in both visual quality and quantitative measurement.

Figure 5: Block diagram of logarithmic transform histogram matching with histogram equalization.

The first proposed algorithm attempts to enhance the image using a histogram-equalized image as a baseline. Logarithmic transform histogram matching with histogram equalization (LTHMHE) is detailed in Figure 5 and by the steps listed below; a code sketch of the steps is given after Figure 6.

Input: original image
Step 1: transform the image (DCT, Fourier, or another orthogonal transform)
Step 2: equalize the histogram of the image
Step 3: take the logarithm of the magnitude coefficients
Step 4: calculate the coefficient histogram
Step 5: take the logarithm of the original transform data
Step 6: map the data to the equalized histogram
Step 7: exponentiate the data
Step 8: restore the phase and inverse-transform
Output: enhanced image

The first step is to take an image and apply histogram equalization to it. This equalized image then has its logarithmic transform histogram calculated as previously discussed. The original image then has its logarithmic transform coefficients mapped so that their histogram matches the equalized image's transform coefficient histogram, as shown in Figure 6b. The result of this enhancement is an overall flattening of the spatial histogram, as shown in Figure 6a.

Figure 6: (a) Comparison of spatial histograms of an original image, histogram equalization, and LTHMHE; (b) comparison of the original, histogram-equalized, and LTHMHE images.
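The following compact sketch is our reading of Steps 1-8; the transform, the log base, and the quantile-based histogram-matching routine are implementation choices rather than details given in the paper:

```python
import numpy as np

def match_histogram(values, reference):
    """Quantile-map 'values' (1-D) onto the empirical distribution of 'reference'."""
    order = np.argsort(values)
    matched = np.empty_like(values)
    matched[order] = np.quantile(reference, np.linspace(0.0, 1.0, values.size))
    return matched

def lthmhe(img, equalize):
    """LTHMHE sketch: match the log-magnitude transform histogram of the
    original image to that of its histogram-equalized version, keeping
    the original phase. 'equalize' is a spatial histogram-equalization
    routine such as the one sketched in Section 2.4."""
    X = np.fft.fft2(img)                            # Step 1: transform the image
    eq = equalize(img)                              # Step 2: equalize spatially
    log_ref = np.log1p(np.abs(np.fft.fft2(eq)))     # Steps 3-4: reference log-magnitudes
    log_src = np.log1p(np.abs(X))                   # Step 5: log of original magnitudes
    matched = match_histogram(log_src.ravel(),      # Step 6: map to equalized histogram
                              log_ref.ravel()).reshape(log_src.shape)
    new_mag = np.expm1(matched)                     # Step 7: exponentiate
    Y = new_mag * np.exp(1j * np.angle(X))          # Step 8: restore phase...
    return np.real(np.fft.ifft2(Y))                 # ...and inverse-transform
```

np.log1p is used instead of a bare logarithm only as a guard against zero-magnitude coefficients; np.expm1 is its exact inverse.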
3.2 Logarithmic transform histogram matching with histogram equalization with alpha-rooting

Building on the foundation of logarithmic transform histogram matching with histogram equalization, the question arose whether the process could be improved by incorporating already established methods into the algorithm. Alpha-rooting is a simple addition that can easily be inserted into our enhancement algorithm; the new algorithm takes the form of the flow graph shown in Figure 7. The addition of alpha-rooting makes it possible to use the EME to pick an optimal value for the alpha-rooting coefficient built into the algorithm. This built-in feedback loop allows the enhancement variables to be tuned until desirable results are returned. It should be noted that this is not the only place alpha-rooting could have been inserted into the algorithm, only the most obvious one.

4. EXPERIMENTAL RESULTS

Two new methods were tested in this paper, proving to be powerful and fast enhancement techniques. For the purposes of this paper, three images are shown. A table of results can be found in Table 1, and example images can be found in Figures 8, 9, and 10.

Figure 8: (a) Original image, EME = 0.01201; image enhanced using (b) histogram equalization, EME = 2.9321, (c) alpha-rooting, EME = 0.2856, (d) LTHMHE, EME = 9.9116, and (e) LTHMHE with alpha-rooting, EME = 40.2839. (f) Graph of EME versus α used to optimize LTHMHE with alpha-rooting, with peak at α = 0.65.

The first test image was of an Arctic hare, chosen because of its strong concentration of data points around the intensity level 255. When enhanced, this type of image can have its dynamic range expanded to the point of changing the overall tone of the picture, along with creating ugly artifacts [6]. The second image chosen was the Copter image, which has the interesting characteristic of a relatively dark central area and a lighter, textured background. This image is usually difficult to enhance because of its unbalanced nature, which leads most methods to enhance either the helicopter or the background at the sacrifice of the other. The third image chosen was the Plane image, the direct opposite of the Arctic hare image in that its data points are concentrated around the lower end of the intensity spectrum. It has hidden contour lines in the background along with prevalent film grain, neither of which can be seen in the original image.

The first image's overall tone is almost perfectly white, with very little variation, making it a hard image to enhance without altering it drastically. The original EME had an extremely low value of 0.01201. After applying our logarithmic transform histogram matching with histogram equalization algorithm to the image, we obtained an EME of 9.9116, a large increase in contrast. Compared to straight histogram equalization, which caused artifacts and a tonal change and yielded an EME of 2.9321, LTHMHE enhanced the image better and avoided the undesirable side effects. Alpha-rooting performed respectably, raising the EME to 0.2856 by itself, although the image had the characteristic gray cast of the method. When we used the modified LTHMHE algorithm that includes alpha-rooting, the resulting EME improved to 40.2839. Both LTHMHE and LTHMHE with alpha-rooting returned visually better results than alpha-rooting alone or histogram equalization. These results can be seen in Figure 8.

Figure 9: (a) Original image, EME = 0.03593; image enhanced using (b) histogram equalization, EME = 1.3027, (c) alpha-rooting, EME = 2.3506, (d) LTHMHE, EME = 9.015, and (e) LTHMHE with alpha-rooting, EME = 127.8699. (f) Graph of EME versus α used to optimize LTHMHE with alpha-rooting, with peak at α = 0.04.

The second image, the Copter, is difficult to enhance because of the tendency to enhance either the background or the helicopter but not both. The original image had an EME of 0.03593. Histogram equalization returned questionable results, with a complete loss of detail in the helicopter, returning an EME of 1.3027; this is characteristic of the extreme dynamic-range expansion problem that plagues histogram equalization. Our LTHMHE algorithm corrected this, returning visually pleasing results and an EME of 9.015. Alpha-rooting alone improved the image noticeably, returning an EME of 2.3506, again with the characteristic graying of the image. Inserting alpha-rooting into our LTHMHE algorithm returned the best results, conveying much more detail in both the helicopter and the background, with an EME of 127.8699. Again, LTHMHE and LTHMHE with alpha-rooting returned visually better results than histogram equalization and alpha-rooting alone. These results can be seen in Figure 9.

The third image, the Plane, is characteristically dark and dull. Our enhancement technique brought out the subtle details on the wings of the plane and in the background without overemphasizing any specific part of the image. The original image had an EME of 0.33396. Histogram equalization brought the EME up to 23.3756, at the sacrifice of much of the image detail; this image shows the over-enhancement tendency of histogram equalization well, with much of the image lost in grain and noise and the subtle background ripples overemphasized. After our LTHMHE process, the enhanced image returned an EME of 33.8779, and it is visually clearer, more detailed, and generally better than the histogram-equalized version. Alpha-rooting returned a slightly sharper image than the original, with an EME of 55.1932, while our LTHMHE with alpha-rooting returned an even higher EME of 138.578. As before, LTHMHE and LTHMHE with alpha-rooting returned results that were much more visually appealing than those produced by histogram equalization and alpha-rooting. These results can be seen in Figure 10.
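The EME-versus-α searches shown in the (f) panels of Figures 8-10 can be reproduced schematically as below. The exact EME formula is the one defined earlier in the paper; the 20·log10(max/min) block form used in this sketch is the common Agaian-style variant, which we assume here for illustration only:

```python
import numpy as np

def eme(img, k=8, eps=1e-4):
    """Block contrast measure: mean over k-by-k blocks of 20*log10(max/min).
    Assumes a non-negative image (shift/rescale enhanced output first)."""
    h, w = img.shape
    scores = []
    for i in range(0, h - h % k, k):
        for j in range(0, w - w % k, k):
            block = img[i:i + k, j:j + k]
            scores.append(20 * np.log10((block.max() + eps) / (block.min() + eps)))
    return float(np.mean(scores))

def best_alpha(img, enhance, alphas=np.linspace(0.02, 1.0, 50)):
    """Grid-search the enhancement coefficient, keeping the alpha whose
    output maximizes the EME (the maximum-EME rule stated earlier)."""
    best_score, best_a = -np.inf, None
    for a in alphas:
        out = enhance(img, a)
        out = out - out.min()      # shift to non-negative for the measure
        score = eme(out)
        if score > best_score:
            best_score, best_a = score, a
    return best_a
```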
Figure 10: (a) Original image, EME = 0.33396; image enhanced using (b) histogram equalization, EME = 23.3756, (c) alpha-rooting, EME = 55.1932, (d) LTHMHE, EME = 33.8779, and (e) LTHMHE with alpha-rooting, EME = 138.578. (f) Graph of EME versus α used to optimize LTHMHE with alpha-rooting, with peak at α = 0.08.

Table 1: Comparison of resulting EMEs of the different enhancement methods (the α values are the optimization peaks used for the alpha-rooted results).

Image         Original    Histogram equalized    Alpha-rooting        LTHMHE     LTHMHE with alpha-rooting
Arctic Hare   0.012008    2.9321                 (α=0.80) 0.2856      9.9116     (α=0.65) 40.2839
Copter        0.035928    1.3027                 (α=0.80) 2.3506      9.015      (α=0.04) 127.8699
Moon          0.8681      6.6359                 (α=0.70) 156.249     31.8327    (α=0.56) 91.8347
Pentagon      0.21835     41.5252                (α=0.74) 110.482     86.8147    (α=0.78) 331.0331
Plane         0.33396     23.3756                (α=0.79) 55.1932     33.8779    (α=0.08) 138.578

5. CONCLUDING REMARKS

This paper proposed a new method of image enhancement based upon the logarithmic transform coefficient histogram, using contrast entropy as a measure of performance and of optimization. Our results demonstrated the power of the logarithmic transform histogram matching with histogram equalization method, showing it to outperform classical histogram equalization. We also showed the modular nature of the proposed algorithm through the addition of alpha-rooting as a performance booster. As a benchmark, the performance of this algorithm was compared to established enhancement techniques: histogram equalization and alpha-rooting.

Measuring enhancement is not a perfect science; there is no universal measure of image enhancement. In choosing a measure, it is necessary to choose which qualities of the image are to be measured, and in our case we chose contrast. The measure was used to find optimal parameter values as well as to show image improvement numerically.
Logarithmic Sobolev Inequalities and Spectral Gaps
Eric Carlen and Michael Loss
School of Mathematics, Georgia Tech, Atlanta, GA 30332
January 24, 2004

(Work partially supported by U.S. National Science Foundation grant DMS 03-00349. © 2004 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.)

Abstract. We prove a simple entropy inequality and apply it to the problem of determining log-Sobolev constants.

1 Introduction

Let $\mu$ be a probability measure on $\mathbf{R}^n$ of the form $d\mu = e^{-V(x)}\,dx$. One says that $\mu$ admits a logarithmic Sobolev inequality with constant $b$ in case for all functions $f$ on $\mathbf{R}^n$ with $\int_{\mathbf{R}^n} f^2\,d\mu = 1$,

$$\int_{\mathbf{R}^n} f^2 \ln f^2\,d\mu \le b \int_{\mathbf{R}^n} |\nabla f|^2\,d\mu. \qquad (1.1)$$

There is an extensive literature devoted to the specification of conditions under which there is a logarithmic Sobolev inequality associated to $\mu$, and to the determination of the best constant $b$ when such an inequality does hold. A number of years ago, it was shown by Bakry and Emery [2] that when $V$ is uniformly strictly convex, i.e., when the Hessian of $V$ satisfies

$$\mathrm{Hess}\,V(x) \ge 2cI \qquad (1.2)$$

for all $x$, then (1.1) holds with $b = 1/c$. Since the method of Bakry and Emery requires that (1.2) hold uniformly, it does not apply in many cases of interest.

One says that $\mu$ admits a spectral gap with constant $\lambda$ in case for all functions $u$ on $\mathbf{R}^n$ with $\int_{\mathbf{R}^n} u\,d\mu = 0$,

$$\int_{\mathbf{R}^n} u^2\,d\mu \le \frac{1}{\lambda} \int_{\mathbf{R}^n} |\nabla u|^2\,d\mu. \qquad (1.3)$$

Taking $f$ of the form $f = \sqrt{1-\alpha^2} + \alpha u$, where $u$ is orthogonal to the constants, a simple Taylor expansion yields

$$\int_{\mathbf{R}^n} f^2 \ln f^2\,d\mu = \alpha^2\left(3\int_{\mathbf{R}^n} u^2\,d\mu - 1\right) + O(\alpha^3), \qquad (1.4)$$

at least when $u$ is bounded. Since in this case we also have $\int_{\mathbf{R}^n} |\nabla f|^2\,d\mu = \alpha^2 \int_{\mathbf{R}^n} |\nabla u|^2\,d\mu$, it follows that whenever $\mu$ admits a logarithmic Sobolev inequality with constant $b$, it admits a spectral gap with constant $\lambda \ge 2/b$. This useful fact was observed by Rothaus [7].

Often it is much easier to prove a spectral gap than it is to prove a log-Sobolev inequality. Indeed, there are many examples of measures $\mu$ that admit a spectral gap but do not admit a log-Sobolev inequality. At this meeting, several problems have been discussed in which a spectral gap has been proved, but a log-Sobolev inequality has not, or at least not with useful constants. It can, however, be relatively easy to directly establish a restricted log-Sobolev inequality:

Definition. In case for some finite $b$, (1.1) is satisfied whenever

$$\int_{\mathbf{R}^n} f\,d\mu = 0 \quad\text{and}\quad \int_{\mathbf{R}^n} f^2\,d\mu = 1, \qquad (1.5)$$

then $\mu$ satisfies a restricted log-Sobolev inequality.

Our aim here is to show that whenever $\mu$ has the form $d\mu = e^{-V(x)}\,dx$ and $\mu$ admits a spectral gap, then under broad, easy-to-check conditions, $\mu$ admits a restricted log-Sobolev inequality with an explicit constant. From this, we then deduce an unrestricted log-Sobolev constant for $\mu$.
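For completeness, the short computation behind (1.4) and the Rothaus bound runs as follows; this verification is our addition, assuming $u$ bounded. Write $f^2 = 1 + \varepsilon$ with

$$\varepsilon = 2\alpha u + \alpha^2(u^2 - 1) + O(\alpha^3), \qquad (1+\varepsilon)\ln(1+\varepsilon) = \varepsilon + \tfrac12\varepsilon^2 + O(\varepsilon^3).$$

Using $\int u\,d\mu = 0$, this gives

$$\int f^2\ln f^2\,d\mu = \alpha^2\Big(\int u^2\,d\mu - 1\Big) + 2\alpha^2\int u^2\,d\mu + O(\alpha^3) = \alpha^2\Big(3\int u^2\,d\mu - 1\Big) + O(\alpha^3).$$

Inserting this and $\int|\nabla f|^2\,d\mu = \alpha^2\int|\nabla u|^2\,d\mu$ into (1.1), normalizing $\int u^2\,d\mu = 1$, and letting $\alpha \to 0$ gives $2 \le b\int|\nabla u|^2\,d\mu$, which is (1.3) with $\lambda = 2/b$.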
The passage from the restricted to the unrestricted inequality is based on a simple a priori entropy inequality. We explain this in the second section, and then in the third section we show how restricted log-Sobolev inequalities may be easily established.

The restricted log-Sobolev inequalities considered here are closely related to what are sometimes called defective log-Sobolev inequalities. These take the form

$$\int_{\mathbf{R}^n} f^2 \ln f^2\,d\mu \le b \int_{\mathbf{R}^n} |\nabla f|^2\,d\mu + a \int_{\mathbf{R}^n} f^2\,d\mu \qquad (1.6)$$

for all $f$ with $\int_{\mathbf{R}^n} f^2\,d\mu = 1$. If $\mu$ admits a spectral gap with constant $\lambda$ and (1.6) is satisfied, then whenever $u$ satisfies (1.5),

$$b \int_{\mathbf{R}^n} |\nabla u|^2\,d\mu + a \int_{\mathbf{R}^n} u^2\,d\mu \le \left(b + \frac{a}{\lambda}\right) \int_{\mathbf{R}^n} |\nabla u|^2\,d\mu,$$

so that a defective log-Sobolev inequality, together with a spectral gap, implies a restricted log-Sobolev inequality.

There have been many investigations of log-Sobolev inequalities in the setting of diffusion semigroups, starting from the groundbreaking work of Gross [6] demonstrating the equivalence of hypercontractivity of a diffusion semigroup with the validity of a log-Sobolev inequality for the associated Dirichlet form. See [1] for an insightful recent survey and, of particular relevance here, recent work of Cattiaux [3]. In particular, Cattiaux [3] has recently obtained log-Sobolev inequalities under conditions similar to those in Theorem 2.2 below; however, his methods are considerably more complicated and do not provide explicit constants.

Many earlier researchers have also relied on diffusion semigroup arguments. For example, it is well known that a defective log-Sobolev inequality together with a spectral gap implies a log-Sobolev inequality. The standard proof uses Gross's theorem and an argument of Glimm [5]. Gross's theorem assures that if $\mu$ satisfies a defective log-Sobolev inequality, then the diffusion semigroup $P_t$ associated to the Dirichlet form $\int_{\mathbf{R}^n} |\nabla u|^2\,d\mu$ is bounded from $L^2$ to $L^4$ for some $t_0 > 0$. Next, the argument of Glimm is used to show that if a diffusion semigroup $P_t$ is bounded from $L^2$ to $L^4$ at some time $t_0$, and its generator has a spectral gap, then for some $t_1 > t_0$, $P_{t_1}$ is a contraction from $L^2$ to $L^4$. This contractivity, together with an interpolation argument and Gross's theorem once again, gives the log-Sobolev inequality. This sort of argument is well known, but rather indirect. Our aim here is to provide a simple, direct passage from defective log-Sobolev inequalities to log-Sobolev inequalities via spectral gaps, and also to provide a simple and direct criterion for the validity of defective log-Sobolev inequalities. Because the arguments are simple and direct, they lead to sharper, more explicit results in many cases.

2 A convexity inequality for entropy

Our goal in this section is to prove the following inequality:

2.1 THEOREM. Let $\mu$ be any probability measure on a sigma algebra $\mathcal{S}$ of subsets of some set $\Omega$. Let $u$ be any measurable real-valued function with $\int_\Omega u^2\,d\mu = 1$ and $\int_\Omega u\,d\mu = 0$. For all $\alpha$ with $0 \le \alpha \le 1$, define

$$f(x) = \sqrt{1-\alpha^2} + \alpha u(x).$$

Then

$$\int_\Omega f^2 \ln f^2\,d\mu \le 2\alpha^2 + \alpha^4 + \alpha^2 \int_\Omega u^2 \ln u^2\,d\mu. \qquad (2.1)$$

Before proving the theorem, we make several remarks. First, because $u$ is normalized in $L^2$ and is orthogonal to the constants, $f$ is also normalized in $L^2$; that is, both $u^2\,d\mu$ and $f^2\,d\mu$ are probability measures. Second, the Taylor expansion (1.4) shows that the constant 2 is the best possible constant multiplying $\alpha^2$ in (2.1), for if $u(x) = \pm 1$ for almost every $x$, then the right hand side of (2.1) reduces to $2\alpha^2 + \alpha^4$. Third, it might seem natural to try to prove (2.1) by controlling the remainder terms in the Taylor expansion. However, extracting an estimate on the remainder involving only $\alpha$ and $\int_\Omega u^2 \ln u^2\,d\mu$ does not seem to be straightforward. Instead, we first prove an $L^p$ inequality that is an identity at $p = 2$; differentiation in $p$ will then yield Theorem 2.1.
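Before giving the proof, Theorem 2.1 can also be sanity-checked numerically. The following snippet (our illustration in Python, not part of the original argument) samples a random discrete measure and verifies (2.1):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random discrete probability measure mu on N atoms.
N = 10_000
mu = rng.random(N)
mu /= mu.sum()

# Build u with integral u d(mu) = 0 and integral u^2 d(mu) = 1.
u = rng.normal(size=N)
u -= (mu * u).sum()
u /= np.sqrt((mu * u ** 2).sum())

def entropy(g):
    """Integral of g^2 ln g^2 d(mu), with the convention 0 ln 0 = 0."""
    g2 = g ** 2
    safe = np.where(g2 > 0, g2, 1.0)          # avoid log(0); 1 gives ln 1 = 0
    return float((mu * g2 * np.log(safe)).sum())

for alpha in np.linspace(0.0, 1.0, 21):
    f = np.sqrt(1.0 - alpha ** 2) + alpha * u
    lhs = entropy(f)
    rhs = 2 * alpha ** 2 + alpha ** 4 + alpha ** 2 * entropy(u)
    assert lhs <= rhs + 1e-9, (alpha, lhs, rhs)
print("(2.1) verified on this random sample")
```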
The $L^p$ inequality is the following:

2.2 THEOREM. Let $\mu$ be any probability measure on a sigma algebra $\mathcal{S}$ of subsets of some set $\Omega$. Let $u$ be any measurable real-valued function with $\int_\Omega u^2\,d\mu = 1$ and $\int_\Omega u\,d\mu = 0$. For all $\alpha$ with $0 \le \alpha \le 1$, define $f(x) = \sqrt{1-\alpha^2} + \alpha u(x)$. Then for $p \ge 2$,

$$\|f\|_p^p \le (1-\alpha^2)^{p/2} + \frac{p(p-1)}{2}\, \|f\|_p^{p-2}\, \alpha^2\, \|u\|_p^2. \qquad (2.2)$$

Proof: For $t$ real, define $\varphi(t)$ by

$$\varphi(t) = \int_\Omega |c + tu|^p\,d\mu, \quad\text{where } c = \sqrt{1-\alpha^2}.$$

Differentiating, we find

$$\varphi'(t) = p \int_\Omega \big((c+tu)^2\big)^{p/2-1} (c+tu)\, u\,d\mu.$$

In particular,

$$\varphi'(0) = p\, |c|^{p-2} c \int_\Omega u\,d\mu = 0. \qquad (2.3)$$

Differentiating once more, we find

$$\varphi''(t) = p(p-1) \int_\Omega |c+tu|^{p-2} u^2\,d\mu. \qquad (2.4)$$

Applying Hölder's inequality with indices $p/(p-2)$ and $p/2$, we obtain

$$\varphi''(t) \le p(p-1)\, \|c+tu\|_p^{p-2}\, \|u\|_p^2 = p(p-1)\, \varphi(t)^{(p-2)/p}\, \|u\|_p^2. \qquad (2.5)$$

Together, (2.3) and (2.4) show that $\varphi$ is increasing in $t \ge 0$, and thus, for all $t$ with $0 \le t \le \alpha$, we have from (2.5) that

$$\varphi''(t) \le p(p-1)\, \varphi(\alpha)^{(p-2)/p}\, \|u\|_p^2. \qquad (2.6)$$

Then, once again using (2.3),

$$\varphi(\alpha) = \varphi(0) + \int_0^\alpha \int_0^t \varphi''(s)\,ds\,dt. \qquad (2.7)$$

Using the estimate (2.6) in (2.7) yields the result.

Proof of Theorem 2.1: We notice first that (2.2) holds as an equality at $p = 2$. We may therefore differentiate both sides in $p$ at $p = 2$ to obtain a new inequality. We first compute

$$\frac{d}{dp}\,\|f\|_p^{p-2} = \left[\frac{2}{p^2}\ln\!\left(\int_\Omega |f|^p\,d\mu\right) + \frac{p-2}{p}\,\frac{1}{\|f\|_p^p}\int_\Omega |f|^p \ln|f|\,d\mu\right]\|f\|_p^{p-2}.$$

Since $\|f\|_2 = 1$, this vanishes at $p = 2$. Next, for $g = \alpha u$,

$$\frac{d}{dp}\,\|g\|_p^2 = \left[-\frac{2}{p^2}\ln\|g\|_p^p + \frac{2}{p}\,\frac{1}{\|g\|_p^p}\int_\Omega |g|^p \ln|g|\,d\mu\right]\|g\|_p^2.$$

At $p = 2$, this reduces to

$$\frac12 \int_\Omega g^2 \ln\!\left(\frac{g^2}{\|g\|_2^2}\right) d\mu = \frac{\alpha^2}{2}\int_\Omega u^2 \ln u^2\,d\mu.$$

Finally, the derivative of $p(p-1)/2$ at $p = 2$ is $3/2$, and the derivative of $c^p$ at $p = 2$ is $c^2 \ln c$. On the left hand side, the derivative of $\|f\|_p^p$ at $p = 2$ is $\int_\Omega f^2 \ln|f|\,d\mu$. Altogether, we have

$$\int_\Omega f^2 \ln f^2\,d\mu \le (1-\alpha^2)\ln(1-\alpha^2) + 3\alpha^2 + \alpha^2 \int_\Omega u^2 \ln u^2\,d\mu. \qquad (2.8)$$

However, by concavity of the logarithm, $(1-\alpha^2)\ln(1-\alpha^2) \le -(1-\alpha^2)\alpha^2$, so that

$$(1-\alpha^2)\ln(1-\alpha^2) + 3\alpha^2 \le 2\alpha^2 + \alpha^4.$$

Using this in (2.8) gives us (2.1).

3 Application to logarithmic Sobolev inequalities

In this section we consider $\Omega = \mathbf{R}^n$ and $d\mu = e^{-V(x)}\,d^n x$, and are concerned with the following question: Suppose $V$ is such that $\mu$ admits a spectral gap with constant $\lambda$. What further conditions on $V$ then ensure that $\mu$ also admits a logarithmic Sobolev inequality for some finite constant $b$? The following lemmas provide a positive answer. The first says that if $\mu$ admits a spectral gap and a restricted log-Sobolev inequality, then it admits an unrestricted log-Sobolev inequality, and it provides a simple estimate for the constant.

3.1 LEMMA. Suppose that $\mu$ admits a spectral gap $\lambda > 0$, and for some finite $b$,

$$\int_{\mathbf{R}^n} u^2 \ln u^2\,d\mu \le b \int_{\mathbf{R}^n} |\nabla u|^2\,d\mu \quad\text{whenever}\quad \int_{\mathbf{R}^n} u\,d\mu = 0 \ \text{and}\ \int_{\mathbf{R}^n} u^2\,d\mu = 1.$$

Then $\mu$ admits a logarithmic Sobolev inequality with constant no larger than $b + 3/\lambda$.

Proof: Consider any $f$ with $\int_{\mathbf{R}^n} f^2\,d\mu = 1$ and write it in the form considered in Theorem 2.1. Then, by Theorem 2.1 (using $\alpha^4 \le \alpha^2$),

$$\int_{\mathbf{R}^n} f^2 \ln f^2\,d\mu \le 3\alpha^2 + \alpha^2 \int_{\mathbf{R}^n} u^2 \ln u^2\,d\mu.$$

By the spectral gap inequality,

$$\alpha^2 = \alpha^2 \int_{\mathbf{R}^n} u^2\,d\mu \le \frac{\alpha^2}{\lambda} \int_{\mathbf{R}^n} |\nabla u|^2\,d\mu = \frac{1}{\lambda} \int_{\mathbf{R}^n} |\nabla f|^2\,d\mu.$$

By hypothesis,

$$\alpha^2 \int_{\mathbf{R}^n} u^2 \ln u^2\,d\mu \le b\,\alpha^2 \int_{\mathbf{R}^n} |\nabla u|^2\,d\mu = b \int_{\mathbf{R}^n} |\nabla f|^2\,d\mu.$$

This yields the result.

The next lemma gives conditions under which a restricted log-Sobolev inequality may be proven.

3.2 LEMMA. Suppose that $V$ is $C^2$, that $\mu$ admits a spectral gap $\lambda > 0$, and that

$$-C = \inf_x \left\{ \frac14 |\nabla V(x)|^2 - \frac12 \Delta V(x) - \pi e^2 V(x) \right\} > -\infty. \qquad (3.1)$$

Then for all $u$ satisfying $\int_{\mathbf{R}^n} u\,d\mu = 0$ and $\int_{\mathbf{R}^n} u^2\,d\mu = 1$,

$$\int_{\mathbf{R}^n} u^2 \ln u^2\,d\mu \le \frac{\lambda + |C|}{\lambda\,\pi e^2} \int_{\mathbf{R}^n} |\nabla u|^2\,d\mu.$$

Before proving Lemma 3.2, we recall a special case of the family of logarithmic Sobolev inequalities on $\mathbf{R}^n$ equipped with Lebesgue measure: for all functions $g$ on $\mathbf{R}^n$ with $\int_{\mathbf{R}^n} g^2\,d^n x = 1$,

$$\int_{\mathbf{R}^n} g^2 \ln g^2\,d^n x \le \frac{1}{\pi e^2} \int_{\mathbf{R}^n} |\nabla g|^2\,d^n x.$$
Proof of Lemma 3.2: Consider any function $u$ with $\int_{\mathbf{R}^n} u\,d\mu = 0$ and $\int_{\mathbf{R}^n} u^2\,d\mu = 1$. Then for any $t$ with $0 < t < 1$,

$$\int |\nabla u|^2\,d\mu - t\pi e^2 \int u^2 \ln u^2\,d\mu = (1-t)\int |\nabla u|^2\,d\mu + t\left[\int |\nabla u|^2\,d\mu - \pi e^2 \int u^2 \ln u^2\,d\mu\right] \ge (1-t)\lambda \int u^2\,d\mu + t\left[\int |\nabla u|^2\,d\mu - \pi e^2 \int u^2 \ln u^2\,d\mu\right]. \qquad (3.2)$$

Next, define $g(x) = u(x)\,e^{-V(x)/2}$, so that $\int_{\mathbf{R}^n} u^2\,d\mu = \int_{\mathbf{R}^n} g^2\,d^n x$. When $u$ is smooth with compact support and $V$ is $C^2$, a simple computation reveals

$$\int_{\mathbf{R}^n} |\nabla u|^2\,d\mu = \int_{\mathbf{R}^n} |\nabla g|^2\,d^n x + \int_{\mathbf{R}^n} W g^2\,d^n x, \quad\text{where}\quad W(x) = \frac14|\nabla V(x)|^2 - \frac12 \Delta V(x).$$

A standard approximation argument shows that this identity, the so-called "ground state transformation," is generally valid. An even simpler computation reveals

$$\int_{\mathbf{R}^n} u^2 \ln u^2\,d\mu = \int_{\mathbf{R}^n} g^2 \ln g^2\,d^n x + \int_{\mathbf{R}^n} V g^2\,d^n x.$$

Therefore,

$$\int |\nabla u|^2\,d\mu - \pi e^2 \int u^2 \ln u^2\,d\mu = \int |\nabla g|^2\,d^n x - \pi e^2 \int g^2 \ln g^2\,d^n x + \int \big(W - \pi e^2 V\big) g^2\,d^n x \ge \int \big(W - \pi e^2 V\big) g^2\,d^n x. \qquad (3.3)$$

Therefore,

$$\int |\nabla u|^2\,d\mu - t\pi e^2 \int u^2 \ln u^2\,d\mu \ge \int \left[(1-t)\lambda + t\big(W - \pi e^2 V\big)\right] g^2\,d^n x.$$

The integrand is non-negative provided that

$$\frac14 |\nabla V(x)|^2 - \frac12 \Delta V(x) - \pi e^2 V(x) + \frac{1-t}{t}\,\lambda \ge 0$$

for all $x$. Define $C$ by (3.1). Then, provided $C$ is finite, we can choose $t = \lambda/(\lambda + |C|)$, and the integrand will be positive.

Lemmas 3.1 and 3.2 immediately yield the following theorem:

3.3 THEOREM. Suppose that $V$ is $C^2$, that $\mu$ admits a spectral gap $\lambda > 0$, and that

$$-C = \inf_x\left\{\frac14|\nabla V(x)|^2 - \frac12\Delta V(x) - \pi e^2 V(x)\right\} > -\infty.$$

Then $\mu$ admits a logarithmic Sobolev inequality with constant $b$ no larger than

$$\frac{\lambda + |C|}{\lambda\,\pi e^2} + \frac{3}{\lambda}.$$

Notice that if $V(x) \sim |x|^\gamma$ and $|\nabla V(x)| \sim |x|^{\gamma-1}$, then the condition that $C$ be finite requires $\gamma \ge 2$. This is consistent with the fact that whenever $\mu$ admits a logarithmic Sobolev inequality with some finite constant $b$, there is a number $\beta > 0$ so that $\int_{\mathbf{R}^n} e^{\beta|x|^2}\,d\mu < \infty$. Thus, concerning qualitative growth conditions on $V$, Theorem 3.3 is sharp. It is, however, surprising that the Laplacian of $V$ enters $C$ with a negative sign, given that the Bakry-Emery condition implies a logarithmic Sobolev inequality for $\mu$ whenever the Hessian of $V$ is uniformly bounded below.

To end the discussion, let us note that in one dimension estimates of the gap are relatively easy to come by. It was shown before that with $u = e^{V/2} g$, (1.3) reduces to

$$\int \left[|\nabla g(x)|^2 + \left(\frac{|\nabla V|^2}{4} - \frac{\Delta V}{2}\right)|g|^2\right] d^n x \ge \lambda \int |g|^2\,d^n x,$$

which must hold for all functions $g$ satisfying the condition $\int g\, e^{-V/2}\,d^n x = 0$. In other words, the best possible value for $\lambda$ is given by the gap of the Schrödinger operator

$$-\Delta g + \left(\frac{|\nabla V|^2}{4} - \frac{\Delta V}{2}\right) g = \lambda g. \qquad (3.4)$$

Clearly, the function $g_0 = e^{-V/2}$ is the ground state of this Schrödinger equation, with corresponding eigenvalue zero. By an elementary calculation it is easily seen that $\lambda$ is the second eigenvalue of the operator (on $L^2(\mathbf{R}, dx)$)

$$\left(-\frac{d}{dx} - \frac{g_0'}{g_0}\right)\left(\frac{d}{dx} - \frac{g_0'}{g_0}\right). \qquad (3.5)$$

By the well-known commutation formula (see, e.g., [4]), the operator

$$\left(\frac{d}{dx} - \frac{g_0'}{g_0}\right)\left(-\frac{d}{dx} - \frac{g_0'}{g_0}\right)$$

has $\lambda$ as its lowest eigenvalue; in fact it has the same spectrum as (3.4) except for the lowest eigenvalue zero. Thus the gap $\lambda$ is now the lowest eigenvalue of the operator

$$-\frac{d^2}{dx^2} + \frac{V''}{2} + \frac{(V')^2}{4},$$

which is given by an unconstrained minimization. Notice also that here the second derivative of the potential shows up with the "right" sign.
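As a quick illustration of the last formula (our example, not in the original): for the one-dimensional standard Gaussian, $V(x) = x^2/2 + \ln\sqrt{2\pi}$, we have $V'' = 1$ and $(V')^2 = x^2$, so the operator becomes

$$-\frac{d^2}{dx^2} + \frac12 + \frac{x^2}{4},$$

a harmonic oscillator whose eigenvalues are $n + 1$ for $n \ge 0$. Its lowest eigenvalue is $1$, recovering the well-known spectral gap $\lambda = 1$ of the standard Gaussian measure; the additive constant in $V$ plays no role, since only $V'$ and $V''$ enter.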
4 Acknowledgements

This paper grew out of discussions between the authors while both were visiting Cedric Villani at E.N.S. Lyon in June 2003. We thank Cedric for hosting us, and for many interesting discussions on a problem of proving a family of log-Sobolev inequalities on $\mathbf{R}^n$ with constants independent of the dimension. This problem arises in a large deviations problem considered by several authors, in particular Otto and Villani, and is explained in these proceedings.

References
[1] C. Ane et al., Sur les inégalités de Sobolev Logarithmiques, Soc. Math. de France, Panoramas et Synthèses, No. 10, 2000.
[2] D. Bakry and M. Emery, Hypercontractivité de semi-groupes de diffusion, C. R. Acad. Sci. Paris Sér. I Math. 299, 775-778 (1984).
[3] P. Cattiaux, Hypercontractivity for perturbed diffusion semigroups, preprint, 2003.
[4] P. A. Deift, Applications of a commutation formula, Duke Math. J. 45, 267-310 (1978).
[5] J. Glimm, Boson fields with nonlinear self interaction in two dimensions, Commun. Math. Phys. 8, 12-25 (1968).
[6] L. Gross, Logarithmic Sobolev inequalities, Amer. Jour. Math. 97, 1061-1083 (1975).
[7] O. S. Rothaus, Lower bounds for eigenvalues of regular Sturm-Liouville operators and the logarithmic Sobolev inequality, Duke Math. Jour. 45, 351-362 (1978).
Possible Negative Pressure States in the Evolution of the Universe
…ible matter: the vacuum energy (also known by such names as dark energy, quintessence, x-matter, …)

Meanwhile, it is convenient to express the mean densities ρ_i of various quantities in the Universe in terms of their fractions relative to the critical density: Ω_i = ρ_i / ρ_crit. The theory of cosmological …

… system of equations. Their results showed that even for very slow growth of Λ (which satisfies all the conditions on the variation of G_N), in the distant future the gravitationally bound systems become unbound, while the nongravitationally bound systems remain bound.

… term have been presented before. For example, we can start from the Einstein action describing the gravitational forces in the presence of the cosmological constant (Padmanabhan 2003) …

1 E-mail: chukh0581@ …
2 Second address: P.O. Box 30-15, Shanghai 200030, PR China.
The Singularity and AI
The concept of the Singularity and AI has been a topic of discussion and speculation in the fields of technology and science for many years. The Singularity refers to the hypothetical point in the future when artificial intelligence (AI) surpasses human intelligence and capabilities. This idea has captured the imagination of many visionaries, scientists, and futurists, raising both excitement and concern about the potential implications of such a technological leap.

One of the biggest questions surrounding the Singularity and AI is whether it will bring about a utopian future or a dystopian one. Proponents of the Singularity argue that AI has the potential to solve some of humanity's most pressing issues, such as poverty, disease, and climate change. They believe that AI could usher in an era of abundance, where robots and AI systems take over menial and dangerous tasks, allowing humans to focus on more creative and fulfilling pursuits.

On the other hand, skeptics of the Singularity warn of the risks and dangers associated with the exponential growth of AI. They fear that AI systems could become uncontrollable and autonomous, leading to unintended consequences and potential threats to human civilization. The prospect of superintelligent AI surpassing human cognitive abilities raises concerns about the loss of control and the potential for AI to act in ways that are harmful to humanity.

Ethical considerations are also at the forefront of discussions surrounding the Singularity and AI. Questions about the moral implications of creating machines with intelligence and consciousness raise concerns about the treatment of AI entities and their rights. Issues such as bias in AI algorithms, the potential for AI to replicate human behaviors and emotions, and the impact of AI on employment and society as a whole are all important considerations in the development of AI technologies.

Despite the uncertainties and risks associated with the Singularity and AI, it is clear that AI technologies have the potential to revolutionize many aspects of our lives. From healthcare and transportation to entertainment and education, AI is already making significant advancements and transforming industries. The key to harnessing the power of AI for the benefit of humanity lies in responsible development, ethical guidelines, and thoughtful consideration of the implications of AI technologies.

In conclusion, the Singularity and AI present both exciting possibilities and significant challenges for the future of humanity. As we continue to progress in the development of AI technologies, it is crucial that we approach the potential of the Singularity with caution and foresight. By addressing ethical concerns, considering the social and environmental impacts of AI, and prioritizing responsible innovation, we can work towards a future where AI enhances human capabilities and creates a more equitable and sustainable world.
Cognitive Science 27 (2003) 285-298

Short communication

Lexical effects on compensation for coarticulation: the ghost of Christmash past

James S. Magnuson (a,*), Bob McMurray (b), Michael K. Tanenhaus (b), Richard N. Aslin (b)
a Department of Psychology, Columbia University, 1190 Amsterdam Ave., MC 5501, New York City, NY 10027, USA
b University of Rochester, New York, NY, USA
* Corresponding author. Tel.: +1-212-854-5667; fax: +1-212-854-3609. E-mail address: magnuson@ (J.S. Magnuson).

Received 9 September 2002; received in revised form 16 December 2002; accepted 25 December 2002

Abstract
The question of when and how bottom-up input is integrated with top-down knowledge has been debated extensively within cognition and perception, and particularly within language processing. A long-running debate about the architecture of the spoken-word recognition system has centered on the locus of lexical effects on phonemic processing: does lexical knowledge influence phoneme perception through feedback, or post-perceptually in a purely feedforward system? Elman and McClelland (1988) reported that lexically restored ambiguous phonemes influenced the perception of the following phoneme, supporting models with feedback from lexical to phonemic representations. Subsequently, several authors have argued that these results can be fully accounted for by diphone transitional probabilities in a feedforward system (Cairns et al., 1995; Pitt & McQueen, 1998). We report results strongly favoring the original lexical feedback explanation: lexical effects were present even when transitional probability biases were opposite to those of lexical biases.
© 2003 Cognitive Science Society, Inc. All rights reserved.

Keywords: Psychology; Language understanding; Neural networks

...Scrooge, having his key in the lock of the door, saw in the knocker, without its undergoing any intermediate process of change: not a knocker, but Marley's face.
- A Christmas Carol, Charles Dickens (1843).

A central question in cognitive science is when and how information sources are integrated. Fodor (1983) argued that modularity between perceptual systems would allow gains in processing efficiency and maximize veridical perception. Similar arguments have been made for purely modular or feedforward stages within systems, perhaps most notably in language processing (e.g., Frazier & Clifton, 1996; Norris, McQueen, & Cutler, 2000). On this view, protection of the bottom-up signal from top-down knowledge is necessary to prevent our days from being filled with hallucinations and ghostly apparitions like Marley's face in the knocker (the reality of Scrooge's perception notwithstanding).

An alternative view is that because signals occur in noise, immediate use of top-down knowledge makes processing more reliable by allowing knowledge and the processing context to constrain interpretation of the signal (e.g., McClelland, 1987, 1996; McClelland & Elman, 1986). The present report focuses on a particular question within this broader debate: does lexical knowledge affect sublexical processing?

In many spoken language tasks, the lexical status of a carrier sequence (i.e., whether it is a word or not) influences phonemic judgments. For example, lexical status affects response times in phoneme monitoring and judgments about whether a phoneme is present in a carrier sequence containing noise (phoneme restoration; e.g., Pitt & Samuel, 1995; Samuel, 1981,
1996, 1997, 2001; Warren, 1970; Warren & Warren, 1970). Lexical status also affects the category boundary in an identification task for consonants that vary along a continuum: when only one endpoint forms a word (e.g., a dash-tash or dask-task continuum), the category boundary shifts significantly toward the lexical end of the continuum (the "Ganong" effect; Fox, 1984; Ganong, 1980; Pitt, 1995).

Although lexical effects on phoneme identification are well documented, the theoretical explanation has been widely debated, primarily because it bears on long-standing debates about models of hierarchically organized cognitive architectures. According to one class of model, lexical knowledge affects phonemic processing via feedback. Initially, an ambiguous phoneme will equally activate both candidate phonemes, which will in turn activate relevant lexical representations. When one potential match to an ambiguous phoneme would make the stimulus conform to a word and another would not (e.g., dash vs. tash given an ambiguous alveolar stop consonant), activation of the word feeds back and boosts activation of its corresponding phoneme (dash will send feedback to /d/, boosting its activation relative to /t/). This explanation was implemented in the influential TRACE model (McClelland & Elman, 1986). Proponents of lexical feedback hold that it would also compensate for noise inherent in speech by providing constraints on the interpretation of a bottom-up signal: lexical feedback serves as an implicit encoding of the probability that a phoneme will occur in a given context.

An alternative class of model eschews feedback in favor of a purely feedforward system. For example, in the Race model (Cutler & Norris, 1979), phonemic decisions can be based on the output of either lexical or purely phonemic processing routes; the one to reach a threshold first, or the one attended to based on task constraints, provides the basis for the decision. In the Merge model (Norris et al., 2000), phoneme decisions are based on post-perceptual phoneme decision units that receive input from perceptual phoneme units and lexical units, avoiding the need for lexical feedback. Thus, the crucial difference between feedforward and feedback accounts is the locus of lexical effects on phonemes. In feedback accounts, lexical knowledge influences phonemic perception. In feedforward accounts, prior knowledge influences phonemic perception indirectly via one of two mechanisms. In one class of feedforward models, lexical knowledge affects post-perceptual decisions (Norris et al., 2000). In another, apparent lexical effects result from precompiled sublexical knowledge, such as diphone transitional probabilities (Cairns, Shillcock, Chater, & Levy, 1995; Pitt & McQueen, 1998).

Elman and McClelland (1988) provided an apparently crucial test of these accounts by demonstrating lexical effects on compensation for coarticulation. Compensation for coarticulation (Mann & Repp, 1981; Repp & Mann, 1981, 1982) occurs when category boundaries in a phoneme identification task are shifted by the preceding coarticulatory context. Mann and Repp found that following a segment with an alveolar place of articulation (e.g., /s/), categorization of non-endpoint steps on an immediately following alveolar-velar continuum (/t/-/k/) was biased towards the velar place of articulation. The opposite shift was observed following a context with a palatal place of articulation (e.g., /ʃ/). Thus, subjects are more likely to respond /k/ if the target immediately follows /s/, and /t/ if it follows /ʃ/.
Mann and Repp's interpretation of these effects was that, in natural production, when a velar or palatal segment must be produced immediately following an alveolar segment, the articulators are unlikely to reach the ideal target for the second segment. The result is a realization of the velar or palatal segment that is acoustically more similar than normal to its alveolar counterpart. They proposed that the speech perception system is tuned to dynamically shift category boundaries depending on context through perceptual learning, compensating for effects of coarticulation. Given an ambiguous segment in the compensation for coarticulation experimental paradigm, the perceptual system attributes the non-ideal realization of midpoint steps along the continuum to coarticulation due to the preceding segment.

Elman and McClelland (1988) combined compensation for coarticulation with the Ganong (1980) effect. If the Ganong effect results from lexical feedback to a perceptual phonemic level, then an ambiguous segment disambiguated and restored based on lexical status ought to drive compensation for coarticulation. Elman and McClelland presented subjects with auditory contexts such as fooliX and christmaX, where X was a segment that, in isolation, was perceptually halfway between /s/ and /ʃ/. This ambiguous fricative was immediately followed by a word from a tapes to capes continuum. The expectation was that X would be perceived as /ʃ/ given fooliX and as /s/ given christmaX because of the Ganong effect. Then, if the restoration were due to true lexical influence on phonemic perception, the lexically restored /s/ or /ʃ/ percept should modulate the perception of the following /t/-/k/ continuum; that is, the lexically restored fricative percept should drive perceptual compensation for coarticulation.

Elman and McClelland's results supported this lexical-feedback prediction. Responses on the tapes/capes continuum were shifted towards capes following lexical contexts biased towards /s/ (e.g., christmaX), and towards tapes following contexts biased towards /ʃ/ (e.g., fooliX). These results have been challenged by claims that a purely feedforward model based on transitional probabilities among phonemes can account for the results without lexical representations. Cairns et al. (1995) analyzed the London-Lund corpus (Svartvik & Quirk, 1980) and reported that Elman and McClelland's items confounded diphone transitional probability (TP) with lexical status. That is, across a corpus of British English, they found the sequence /ʌs/ to be more likely than /ʌʃ/, and /ɪʃ/ more likely than /ɪs/ (we discuss the specific statistic they used in detail later). This allows an explanation of the lexical effects on compensation for coarticulation that does not invoke lexical representations: sensitivity to diphone TPs could explain the lexical bias in the ambiguous fricative. Moreover, Shillcock, Lindsey, Levy, and Chater (1992; see also Norris, 1993) trained a recurrent network to output the previous, current, and predicted next phoneme given phoneme-by-phoneme transcriptions of conversations from the same corpus and found that the network exhibited lexical effects on compensation for coarticulation.

These results from British English prompted Pitt and McQueen (1998) to devise an empirical test of the TP hypothesis in American English. They used two lexical contexts (juice and bush) in which the vowels were equally predictive of /s/ and /ʃ/. They contrasted these lexical contexts with nonword contexts with TP biases: nai-, biased towards /ʃ/, and
der-, biased towards /s/. Pitt and McQueen predicted that if TP is the true basis for the lexical effects reported by Elman and McClelland (1988), then compensation for coarticulation should be observed in the nonword contexts (where TPs differed) but not in the lexical contexts (where TPs were equated). This is precisely what they found.

There are several reasons why we felt it was important to revisit whether there are lexical effects on compensation for coarticulation. First, Pitt and McQueen's (1998) result with equi-biased lexical contexts is a null effect, and must be interpreted with caution. Second, the tested lexical contexts and TPs were based on only two items, and may not generalize to other contexts. Third, as we later discovered, lexical status and TP were not confounded for all of the Elman and McClelland (1988) items in corpora of American English.

1. Experiment

We designed materials using the same /ʌ/ and /ɪ/ vowels that Elman and McClelland used, but we embedded them in contexts with opposite lexical biases. They found lexical effects on compensation for coarticulation with Christmas; we used brush to embed the same vowel in a context with the opposite (lexically-based) fricative bias. Elman and McClelland found lexical effects with foolish; we used bliss to embed the same vowel in a context with the opposite fricative bias. Using opposite lexical biases creates a strong test of whether there are lexical effects beyond effects of diphone TP. We hypothesize that lexical status might have more powerful effects than diphone TPs because of the increased redundancy afforded by lexical information. We agree that diphone transitions could guide perception of ambiguous segments by combining the bottom-up signal with acquired knowledge of the most likely segments to follow. We argue that words, however, are more predictive because they span multiple segments and provide a compact, implicit representation of context-specific statistics.

1.1. Method

1.1.1. Materials

We created an /s/-/ʃ/ continuum by recording natural, isolated utterances of the two fricatives. Samples were excised from the center of each fricative to make their durations 233 ms. These served as the endpoints of the continuum. We then created intermediate steps using a waveform averaging technique similar to that used by Pitt and McQueen (1998) for their tapes/capes continuum (see also McQueen, 1991; Repp, 1981). We created weighted averages of matrix representations of the /s/ and /ʃ/ waveforms in 2.5% steps. Thus, the /s/ endpoint was 100% /s/, 0% /ʃ/; the next step was 97.5% /s/ and 2.5% /ʃ/. We created 39 steps between the /s/ and /ʃ/ endpoints.
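For concreteness, the weighted-average construction can be sketched as follows. This is our illustration; the published procedure operated on matrix representations of the waveforms, and recording details (windowing, amplitude equalization) are omitted:

```python
import numpy as np

def fricative_continuum(x_s, x_sh, n_steps=39):
    """Return the intermediate steps of an /s/-/sh/ continuum.

    x_s, x_sh : equal-length 1-D waveform arrays for the two endpoints
    (already excised to the same duration, e.g. 233 ms at a common rate).
    Step k is the weighted average (1 - w)*x_s + w*x_sh; a 2.5% grid of
    weights gives 39 intermediate stimuli between the pure endpoints.
    """
    assert x_s.shape == x_sh.shape
    weights = np.linspace(0.0, 1.0, n_steps + 2)[1:-1]  # exclude endpoints
    return [(1.0 - w) * x_s + w * x_sh for w in weights]
```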
Consistent with previous reports, our pilot identification tests with the /s/-/ʃ/ continuum revealed substantial individual differences in the maximally ambiguous token. Rather than finding the maximally ambiguous token for each participant in our study (the procedure Pitt and McQueen used), and potentially alerting participants to our interest in the fricative component of the stimuli, we used two intermediate fricatives: the 50 and 60% /s/ tokens. Both were ambiguous for six pilot participants (mean for the 50% stimulus in isolation was 51% "s" responses; mean for the 60% stimulus was 58%, though the ranges overlapped). To avoid coarticulatory cues in the vowels, the lexical contexts were naturally produced tokens of the first three phonemes of the words bliss and brush, i.e., /blɪ/ (318 ms) and /brʌ/ (314 ms).¹ Acoustic analyses comparing these with complete productions of bliss and brush did not reveal inadvertent coarticulatory cues.

The two word-initial CCV contexts were combined with three fricative segments by appending them to the CCV: the appropriate endpoint (bli- + /s/, or bru- + /ʃ/), and the two ambiguous fricatives, 50 and 60% /s/. A tapes/capes continuum was constructed in a similar fashion, except that the /t/-/k/ endpoints were recorded in their full lexical contexts. The initial portion of each endpoint through the fourth pitch period in the vowel /eɪ/ was cut from the full lexical context. The resulting stimuli were 89 (/keɪ/) and 83 ms long (/teɪ/). To make the endpoints the same length, 6 ms were excised from the interior of the noise burst in /keɪ/. Pilot tests on the tapes/capes continuum indicated that the following seven steps on the continuum would yield non-ceiling/floor levels of "t" responses and a graded change from mainly "t" to mainly "k" responses: 45, 47.5, 50, 52.5, 55, 57.5, and 60% /t/. The auditory stimuli were recorded and presented with 16-bit resolution and a 22.05 kHz sampling rate.

1.1.2. Procedure

The experiment was conducted using PsyScope 1.2.5 (Cohen, MacWhinney, Flatt, & Provost, 1993). On each trial, participants heard one of six fricative-final stimuli (bliss, brush, bli-50, bli-60, bru-50, or bru-60, where 50 and 60 indicate the percentage /s/ for the ambiguous tokens) immediately followed by an item from the tapes/capes continuum. The task was similar to that used by Pitt and McQueen (1998): participants pressed one of four buttons, labeled "s t," "s k," "sh t," and "sh k," indicating the sequence of segments they heard at the word boundary. On each trial, the four orthographic labels appeared on the screen, aligned with correspondingly labeled keys on the keyboard. After an 800 ms delay, one of the bliss/brush fricative stimuli was presented, followed immediately by one of the nine tokens from the tapes/capes continuum.

The experiment began with 36 practice trials to familiarize participants with the task. The practice trials consisted of two repetitions of each pairing of the endpoint bliss/brush stimuli with each of the nine continuum steps (the tapes and capes endpoints along with the seven intermediary steps) in random order. These practice trials were not included in the analyses. Following the practice trials, there were 324 experimental trials, consisting of six repetitions of each of the 2 (lexical context) × 3 (fricative stimuli) × 9 (tapes/capes steps) combinations of the stimulus elements. These were presented in random order in six blocks of 54 trials.

1.1.3. Participants

Seventeen volunteers with normal hearing were paid for their participation. One participant's data were excluded because he always perceived the ambiguous fricative tokens as /s/.

1.2. Results and discussion

To determine whether the lexical contexts influenced responses to the ambiguous fricatives, we examined the proportion of "s" responses to the endpoint and ambiguous fricative stimuli at each tapes/capes step. The endpoints (bliss and brush) were responded to at ceiling and floor levels of /s/ response (96% [ranging from 93 to 98% across steps] and 4% [ranging from 2 to 7%], respectively). Lexical context strongly affected responses to the ambiguous fricatives. Across all tapes/capes stimuli, the 50% /s/ token was labeled "s" 84% of the time (range: 79-88%) given the bli- context and labeled "sh" 93% of the time (range: 89-97%) given the bru- context; the 60% /s/ token was labeled "s" 86% of the time (range: 83-91%) in the bli- context, and labeled "sh" 92% of the time (range: 86-97%) given the bru- context.
We conducted two ANOVAs (one each for the 50 and 60% /s/ fricatives), with 2 × 9 levels (lexical context × tapes/capes steps), on the "s"-response proportions to verify that the pattern held across participants and tapes/capes level. The effect of lexical context was significant for both the 50% /s/ stimulus (F(1,15) = 110.9, p < .001, ω² = .77) and the 60% /s/ stimulus (F(1,15) = 121.2, p < .001, ω² = .79). The effect of step was not reliable for either ambiguous fricative, nor was the interaction of context and step. Thus, the lexical contexts were effective at shifting responses to the ambiguous fricatives, and the stimuli exhibit the prerequisite lexical effect on the fricative (the Ganong, 1980, effect) for examining whether lexical bias influences compensation for coarticulation.

We next asked whether fricative perception affected compensation for coarticulation in the "t/k" responses. Fig. 1 shows the effect of lexical context on the "t/k" responses. Proportions of "k" responses are plotted at each tapes/capes continuum step, with separate curves for bli- and bru- contexts and separate plots for each fricative type. We conducted ANOVAs on the proportion of "k" responses as a function of lexical context and ambiguous tapes/capes steps (steps 2-8; we excluded steps 1 and 9 because there is no reason to expect effects on unambiguous consonants). Thus, we conducted two (50 and 60% /s/) 2 × 7 ANOVAs (lexical context × tapes/capes continuum step).

For the endpoint stimuli (top panel), there were significant effects of context (bli- = 69% "k," bru- = 49%; F(1,15) = 41.4, p < .001, ω² = .56), step (ranging from 21 to 93%; F(6,90) = 103.9, p < .001, ω² = .85), and a significant interaction of context and step (F(6,90) = 2.5, p < .05, ω² = .04). This weak interaction depended on the relatively small effect at step 8; with step 8 removed, the interaction was not reliable (F(5,75) = 1.4, p = .25).
In the case of the 50% /s/ fricative stimuli, there were significant effects of context (49% "k" responses given bliss, 43% given brush; F(1,15) = 5.5, p < .05, ω² = .12) and step (ranging from 7 to 90%; F(6,90) = 97.2, p < .001, ω² = .84), but the interaction was not significant (F < 1). The pattern was the same for the 60% /s/ stimuli: there were significant effects of context (bli- = 52% "k" responses, bru- = 46%; F(1,15) = 4.6, p < .05, ω² = .19) and step (ranging from 10 to 90%; F(6,90) = 92.3, p < .001, ω² = .83), but the interaction was not reliable (F < 1).

Fig. 1. Proportion of tapes responses at each step along the tapes/capes continuum as a function of the preceding fricative context. Top panel: endpoints (bliss, 100% /s/, and brush, 0% /s/). Middle panel: lexical contexts with ambiguous final fricative (60% /s/). Bottom panel: lexical contexts with ambiguous final fricative (50% /s/).

The most important result in each case is the main effect of lexical context, indicating that compensation for coarticulation was modulated by lexical status.² This is consistent with a recent study by Samuel and Pitt (2003), who tested a number of items with the same vocalic context but opposite lexical biases and found lexical effects on compensatory coarticulation for most of their items. Since the ambiguous fricative was always preceded by the same vocalic context, Samuel and Pitt's results cannot be accounted for by diphone TPs.

How can we explain the divergence between Pitt and McQueen (1998), who did not find lexical compensation with TP-neutral contexts, and Elman and McClelland (1988), Samuel and Pitt (2003), and our study (all of which found lexically mediated compensation, even when TPs were at odds with lexical bias)? One possibility is that differences in stimulus preparation techniques are responsible. For example, our ambiguous fricatives were highly pliable (cf. the large changes in "s" responses depending on lexical context). Another possibility is that there is something unusual about the two lexical items Pitt and McQueen used, and another factor overrode the lexical bias. Indeed, Samuel and Pitt found that the compensation effect is strongly influenced by perceptual grouping phenomena, and, in a separate test of perceptual grouping, that fricatives cohered more strongly with Pitt and McQueen's lexical than nonword contexts, making them less susceptible to compensation. A third possibility is that the corpus analyses used to determine diphone TPs were sufficiently different in the various studies that the resultant selection of stimulus items was unbalanced.

According to the corpus analyses reported by Cairns et al. (1995), the lexical biases used in our stimulus materials were opposite to the TP biases for our chosen vowel-fricative contexts. However, there may be higher-order TPs correlated with lexical context. In order to test whether a more elaborate phoneme-based TP explanation might account for our results, we conducted a series of corpus analyses.

2. Corpus analyses

We analyzed two pronunciation dictionaries, Moby and MIT,³ weighted by the frequency counts in Francis and Kucera (1982), and one phonemically transcribed corpus of spoken, conversational American English (the CALLHOME corpus; Kingsbury, Strassel, McLemore, & McIntyre, 1997a).⁴ Forward and backward TPs based on the MIT and CALLHOME corpora are shown in Table 1 (results with the MIT and Moby lexicons are nearly identical). The only contexts strongly biased towards /ʃ/ were /ʊ/ and /eɪ/. Even if we compute the statistic used by Cairns et al. (1995), backward TP (which we hold is the incorrect statistic⁵), we find a similar pattern.
Across all corpora, five contexts are biased towards /ʃ/, none of which corresponds to a context lexically biased towards /ʃ/ used by Elman and McClelland or by us. Thus, the claim that TP was confounded with lexical status in the crucial contexts in Elman and McClelland's materials does not hold (at least not for American English).

We also tested whether larger n-phone TPs might account for the results of Experiment 1. Table 2 presents forward and backward TPs for all possible preceding TP contexts (V, CV, CCV) and targets (fricative [F], VF, CVF, CCVF) based on the MIT lexicon (similar patterns hold for Moby and CALLHOME). None of the context/target combinations account for both the bliss and brush effects. Thus, neither simple diphone TPs nor more complex TPs account for our results.⁶

Table 1. Forward and backward diphone transitional probabilities for the occurrence of /s/ and /ʃ/ given vowels, based on a frequency-weighted written lexicon (MIT) and one spoken corpus (CALLHOME) of American English. "Bias" for /s/ or /ʃ/ was operationalized as one TP being 1.5 times greater than the other (bold in the original; given here in the Bias columns). Vowel symbols shown as "?" were lost in transcription.

MIT (frequency-weighted written lexicon)
Vowel   p(ʃ|V)   p(s|V)   Bias     p(V|ʃ)   p(V|s)   Bias
æ       .0068    .0288    s        .0359    .0293    eq
ε       .0162    .0759    s        .0510    .0457    eq
ɪ       .0164    .0804    s        .1487    .1401    eq
ɑw      .0000    .1808    s        .0000    .0035    s
?       .0044    .0892    s        .0094    .0366    s
?       .0032    .0576    s        .0085    .0298    s
ʊ       .0048    .0005    ʃ        .0032    .0001    ʃ
ɔi      .0000    .0309    s        .0000    .0044    s
ɑi      .0000    .0297    s        .0000    .0089    s
ɑ       .0002    .0411    s        .0003    .0128    s
ɔ       .0024    .0312    s        .0036    .0090    s
eɪ      .1295    .0695    ʃ        .2316    .0238    ʃ
i       .0013    .0209    s        .0049    .0151    s
o       .0122    .0447    s        .0313    .0220    eq
u       .0081    .0084    eq       .0219    .0044    ʃ
?       .0014    .0211    s        .0136    .0390    s

CALLHOME (spoken telephone conversation corpus)
Vowel   p(ʃ|V)   p(s|V)   Bias     p(V|ʃ)    p(V|s)   Bias
æ       .0033    .0931    s        .02091    .0913    s
ε       .0070    .0779    s        .02660    .0465    s
ɪ       .0107    .0610    s        .09343    .0837    eq
ɑw      .0057    .0625    s        .00045    .0007    s
?       .0048    .0382    s        .03001    .0372    eq
?       .0103    .0985    s        .02137    .0319    eq
ʊ       .0041    .0000    ʃ        .00341    .0000    ʃ
ɔi      .0072    .0107    eq       .00705    .0016    ʃ
ɑi      .0038    .0508    s        .01910    .0399    s
ɑ       .0089    .0149    s        .01637    .0043    ʃ
ɔ       .0064    .0305    s        .00796    .0059    eq
eɪ      .0229    .0557    s        .08707    .0332    ʃ
i       .0083    .0484    s        .05024    .0457    eq
o       .0139    .0305    s        .07229    .0248    ʃ
u       .0071    .0579    s        .02842    .0363    eq
?       .0026    .253     s        .00796    .0122    s
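Diphone statistics like those in Table 1 are straightforward to compute from a frequency-weighted pronunciation lexicon. The sketch below is ours, in Python; the lexicon format, the symbol set, and this exact definition of the statistic are assumptions of the sketch, not the authors' code:

```python
from collections import Counter

def diphone_tps(lexicon):
    """Frequency-weighted forward and backward diphone TPs.

    lexicon : iterable of (phones, freq) pairs, where 'phones' is a tuple
    of phoneme symbols and 'freq' a corpus frequency.

    forward[(a, b)]  = p(b | a) = count(ab) / count(a as first member)
    backward[(a, b)] = p(a | b) = count(ab) / count(b as second member)
    """
    pair, first, second = Counter(), Counter(), Counter()
    for phones, freq in lexicon:
        for a, b in zip(phones, phones[1:]):
            pair[(a, b)] += freq
            first[a] += freq
            second[b] += freq
    forward = {ab: n / first[ab[0]] for ab, n in pair.items()}
    backward = {ab: n / second[ab[1]] for ab, n in pair.items()}
    return forward, backward

# Hypothetical mini-lexicon, ASCII placeholders for IPA:
# fwd, bwd = diphone_tps([(('b','l','I','s'), 120), (('b','r','V','S'), 85)])
# fwd[('I','s')] -> probability of /s/ following the vowel in this toy lexicon
```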
Table 2. All possible forward and backward fricative transitional probabilities based on the MIT corpus. None accounts for both the bliss and brush compensation for coarticulation effects.

Context   p(ʃ|ctx)    p(s|ctx)    Bias     p(ctx|ʃ)    p(ctx|s)    Bias
ɪ         .0164       .0804       s        .1487       .1401       eq
ʌ         .0044       .0892       s        .0094       .0366       s
lɪ        .0405       .0956       s        .0174       .0079       ʃ
lʌ        .0220       .1199       s        .0012       .0012       eq
blɪ       .1043       .0735       eq       .0030       .0004       ʃ
brʌ       .0131       .0131       eq       .0001       <.0001      eq

Context   p(ɪʃ|ctx)   p(ɪs|ctx)   Bias     p(ctx|ɪʃ)   p(ctx|ɪs)   Bias
l         .0050       .0118       s        .1172       .0562       ʃ
bl        .0269       .0190       eq       .0201       .0029       ʃ

Context   p(ʌʃ|ctx)   p(ʌs|ctx)   Bias     p(ctx|ʌʃ)   p(ctx|ʌs)   Bias
l         .0003       .0018       s        .1250       .0336       ʃ
br        .0009       .0009       eq       .0104       .0005       ʃ

Context   p(lɪʃ|ctx)  p(lɪs|ctx)  Bias     p(ctx|lɪʃ)  p(ctx|lɪs)  Bias
b         .0012       .0009       eq       .1718       .0514       ʃ

Context   p(rʌʃ|ctx)  p(rʌs|ctx)  Bias     p(ctx|rʌʃ)  p(ctx|rʌs)  Bias
b         <.0001      <.0001      eq       .0833       .0153       ʃ

Perhaps the explanation does not lie with lexical knowledge per se, but with higher-order statistics than simple TPs. While we explored a large set of TPs, it is possible that another analysis exists that would account for the results. However, as one invokes higher- and higher-order statistics, several problems will emerge. For example, what is the proper order? If diphones do not suffice, a fixed n-phone TP model (where n > 2) will not be able to account for lexical effects with two-segment words.

Critics of feedback in spoken word recognition have appealed to the fact that simple recurrent networks can simulate the Elman and McClelland (1988) effects without explicit lexical knowledge (Cairns et al., 1995; Norris, 1993), at least when TP and lexical biases are correlated in the training corpus, which Cairns et al. (1995) reported to be true of English.⁷ However, such models do learn to represent word-specific statistics. Recurrent networks have the potential to make use of recent history (e.g., by making the states of some units at time t part of the input at time t+1), representing context-specific statistics over potentially large temporal windows of context-dependent size. In this case, the learned dependencies are regularities controlled by words (e.g., lexical items control inter-phoneme statistical dependencies, with the weakest TPs at word boundaries; e.g., Harris, 1955). Thus, these networks encode a dynamic statistical representation, such that for a given word, the crucial "n-phone" resolves to word length. A fixed n-phone model cannot work, since the effective size of n is context dependent; indeed, the relevant context is best described as lexical. In other words, lexical knowledge subsumes the relevant statistics (cf. McClelland & Elman, 1986).

Feedback between explicit lexical and phonemic representations provides an efficient way of instantiating this knowledge for processing. Recurrent networks may represent such knowledge and feedback via context or history units. However, to date, models without explicit lexical representations have not simulated lexical biases on compensation for coarticulation when TP and lexical biases are at odds (i.e., lexical bias and TP bias have been correlated in the training corpora, which has been the basis for arguing that lexical knowledge is unnecessary for explaining the phenomenon). Our corpus analysis demonstrates that both we and Elman and McClelland (1988) have found this result. The challenge now is for models without lexical representations to simulate the dominance of lexical bias over TP bias. We expect this is possible with models like those used by Cairns et al. (1995) and Norris et al. (2000). However, we predict that this will not be possible with a truly bottom-up model, one whose interpretation
Singularities in Inflationary Cosmology: A Review
arXiv:gr-qc/9612036v1 15 Dec 1996
To appear in the Proceedings of the Sixth Quantum Gravity Seminar, Moscow.

Arvind Borde and Alexander Vilenkin
Institute of Cosmology, Department of Physics and Astronomy, Tufts University, Medford, MA 02155, USA.

Abstract: We review here some recent results that show that inflationary cosmological models must contain initial singularities. We also present a new singularity theorem. The question of the initial singularity re-emerges in inflationary cosmology because inflation is known to be generically future-eternal. It is natural to ask, therefore, if inflationary models can be continued into the infinite past in a non-singular way. The results that we discuss show that the answer to the question is "no." This means that we cannot use inflation as a way of avoiding the question of the birth of the Universe. We also argue that our new theorem suggests, in a sense that we explain in the paper, that the Universe cannot be infinitely old.

I. Introduction

Inflationary cosmological models appear, at first glance, to admit the possibility that the Universe might be described by a version of the steady-state picture. The possibility seems to arise because inflation is generically future-eternal: in a large class of inflationary cosmological models the Universe consists of a number of isolated thermalized regions embedded in an always-inflating background [1]. The boundaries of the thermalized regions expand into this background, but the inflating domains that separate them expand even faster, and the thermalized regions do not, in general, merge. As previously created regions expand, new ones come into existence, but the Universe does not fill up entirely with thermalized regions [2-4]. A cosmological model in which the inflationary phase has no global end and continually produces new "islands of thermalization" naturally leads to this question: can the model be extended in a non-singular way into the infinite past, avoiding in this way the problem of the initial singularity? The Universe would then be in a steady state of eternal inflation without a beginning.

Assuming that some rather general conditions are met, we have recently shown [5-8] that the answer to this question is "no": generic inflationary models necessarily contain initial singularities. This is significant, for it forces us in inflationary cosmologies (as in the standard big-bang ones) to face the question of what, if anything, came before. This paper reviews what is known about the existence of singularities in inflationary cosmology. A partial answer to the singularity question was previously given by Vilenkin [9], who showed the necessity of a beginning in a two-dimensional spacetime and gave a plausibility argument for four dimensions. The broad question was also previously addressed by Borde [10], who sketched a general proof using the Penrose-Hawking-Geroch global techniques. We will not discuss this earlier work here, concentrating instead on more recent results.

The paper is organized as follows: Section II outlines some mathematical background (see Hawking and Ellis [11] for details). Section III describes our first theorem, applicable to open Universes with a simple causal structure. Section IV sketches how the theorem may be extended to closed Universes.
Section V presents a new theorem: here we drop the assumption that the causal structure of the Universe is simple. Instead, we introduce a new condition, which we call the limited influence condition. We argue that this condition is likely to hold in many inflationary models. Our new theorem makes no assumptions about whether the Universe is open or closed, thus providing a unified treatment of the two cases. Section VI offers some concluding comments.

II. Mathematical Preliminaries

Spacetime is represented by a manifold M with a time-oriented [12] Lorentz metric g_ab of signature (−,+,+,+). We do not assume any specific field equation for g_ab. Instead, we impose an inequality on the Ricci curvature (called a convergence condition), and our conclusions are valid in any theory of gravity (such as Einstein's, with a physically reasonable source) in which such a condition is satisfied.

A curve is called causal if it is everywhere either timelike or null. The causal and chronological pasts of a point p, denoted respectively by J−(p) and I−(p), are defined as follows:

J−(p) = {q : ∃ a future-directed causal curve from q to p},
I−(p) = {q : ∃ a future-directed timelike curve from q to p}.

The futures J+(p) and I+(p) are defined similarly. The sets I±(p) are open: i.e., if x ∈ I±(p), then all points in some neighborhood of x also lie in I±(p).

The past light cone of p is defined [6] as E−(p) = J−(p) − I−(p). It follows that E−(p) is achronal (i.e., no two points on it can be connected by a timelike curve) and that E−(p) ⊂ ˙I−(p) (where ˙I−(p) is the boundary of I−(p)). In general, however, E−(p) ≠ ˙I−(p) (see fig. 1). These definitions of futures, pasts, and light cones can be extended from single points p to arbitrary spacetime sets in a straightforward manner. Spacetimes in which E−(p) = ˙I−(p), for all points p, are called past causally simple. We tighten this definition by further requiring that E−(p) ≠ ∅ (this rules out certain causality violations).

Figure 1: An example of the causal complications that can arise in an unrestricted spacetime. Light rays travel along 45° lines in this diagram, and the two thick horizontal lines are identified. This allows the point q to send a signal to the point p along the dashed line, as shown, even though q lies outside what is usually considered the past light cone of p. The boundary of the past of p, ˙I−(p), then consists of the past light cone of p, E−(p), plus a further piece. Such a spacetime is not "causally simple."

A timelike curve is maximally extended in the past direction if it has no past endpoint. (Such a curve is often called past-inextendible.) The idea behind this is that such a curve is fully extended in the past direction, and is not merely a segment of some other curve. We define a closed Universe as one that contains a compact, edgeless, achronal hypersurface, and an open Universe as one that contains no such surface.

The strong causality condition holds on M if there are no closed or "almost-closed" timelike or null curves through any point of M. If µ is any timelike curve in a spacetime that obeys the strong causality condition and x is any point not on µ, then there must be some neighborhood N of x that does not intersect µ. (Otherwise, µ would accumulate at x, and thereby give an almost-closed timelike curve.)

Finally, consider a congruence [13] of null geodesics with affine parameter v and tangent V^a. The expansion of the geodesics may be defined as θ ≡ D_a V^a, where D_a is the covariant derivative. The propagation equation for θ leads to this inequality:

dθ/dv ≤ −(1/2)θ² − R_ab V^a V^b.    (1)
Suppose that (i) R_ab V^a V^b ≥ 0 for all null vectors V^a (this is called the null convergence condition), (ii) the expansion θ is negative at some point v = v₀ on a geodesic γ, and (iii) γ is complete in the direction of increasing v (i.e., γ is defined for all v ≥ v₀). Then θ → −∞ along γ a finite affine parameter distance from v₀ [11,14].

III. Open Universes

Our first result [5,7] applies to open, causally simple spacetimes:

Theorem 1: A spacetime M cannot be null-geodesically complete to the past if it satisfies the following conditions:
A. It is past causally simple.
B. It is open.
C. It obeys the null convergence condition.
D. It has at least one point p such that for every point q to the past of p the volume of the difference of the pasts of p and q is finite.

Assumptions A–C are conventional as far as work on singularity theorems goes. But assumption D is new and is inflation-specific. A slightly different version has been discussed in detail elsewhere [9,7], but here is a rough, short explanation: it may be shown that if a point r lies in a thermalized region, then all points in I+(r) also lie in that thermalized region [5]. Therefore, given a point p in the inflating region, all points in its past must lie in the inflating region. Further, it seems plausible that there is zero probability for no thermalized regions to form in an infinite spacetime volume. Then assumption D follows.

Proof: The full proof of this result is available elsewhere [5,7], but here is a sketch. Suppose that M is null-complete to the past. We show that a contradiction follows. Let q be a point to the past of the point p of assumption D. Then every past-directed null geodesic from q must leave E−(q) at some point and enter I−(q) (i.e., it must leave the past null cone of q and enter the interior of the past of q). For, let γ be a past-directed null geodesic from q, and suppose that γ lies in E−(q) throughout. Choose a small "triangle" of null geodesics neighboring γ in E−(q) and construct a volume "wedge" by moving the triangle so that its vertex moves from q to a point q′ (still in I−(p)), an infinitesimal distance to the future of q. The volume of this region may be expressed [5,7] as

Δ ∫₀^∞ A(v) dv,

where Δ is a constant, A is the cross-sectional area of E−(q) in the wedge, and v is an affine parameter along the geodesic (chosen to increase in the past direction). From assumption D, this volume (being a part of the volume of I−(p) − I−(q)) must be finite. This can happen only if A decreases somewhere. But the cross-sectional area can decrease only if the expansion θ of the neighboring null geodesics is negative somewhere along γ; conditions (i)–(iii) above then force θ → −∞ a finite affine distance along γ, which is impossible on a complete geodesic. This contradiction establishes the theorem.
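The focusing claim used in this proof (θ → −∞ within a finite affine distance once θ < 0) follows by integrating inequality (1). The computation below is standard and is spelled out here for convenience, in our notation; the curvature term is dropped, which is legitimate because the null convergence condition makes it nonpositive on the right-hand side:

\[
\frac{d\theta}{dv} \le -\frac{1}{2}\theta^{2}
\;\Longrightarrow\;
\frac{d}{dv}\!\left(\frac{1}{\theta}\right) = -\frac{1}{\theta^{2}}\frac{d\theta}{dv} \ge \frac{1}{2}
\;\Longrightarrow\;
\frac{1}{\theta(v)} \ge \frac{1}{\theta_{0}} + \frac{v - v_{0}}{2}.
\]

Since dθ/dv ≤ 0, once θ₀ = θ(v₀) < 0 the expansion stays negative, so 1/θ(v) remains below zero as long as θ is finite; but the lower bound above reaches zero at v = v₀ + 2/|θ₀|. Hence θ must diverge to −∞ at some v ≤ v₀ + 2/|θ₀|.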
Figure 2: A closed Universe in which the past light cone of any point q is compact (and the volume of the difference of the pasts of any two points is finite). The past-directed null geodesics from q start off initially in E−(q); but, once they recross at r ("at the back"), they enter I−(q) (because there are timelike curves between q and points on these null geodesics past r), and they thus leave E−(q).

IV. Closed Universes

In a closed Universe, past-directed null geodesics can leave the past light cone E−(q) simply by wrapping around the Universe, as in fig. 2, so the argument of the previous section fails without further assumptions. The situation is different, however, in inflationary cosmological models, which are "spatially large" in the sense that they contain many different regions that are not in causal communication. We define a localized light cone as one that does not wrap around the Universe. More precisely, we say that a past light cone is localized if from every spacetime point p not on the cone there is at least one timelike curve, maximally extended in the past direction, that does not intersect the cone [16]. It turns out that the conclusion of our theorem still holds if we replace assumption B by the assumption that past light cones are localized [6].

V. Causally Complicated Universes

The assumption of causal simplicity – made in our first result in order to simplify the proof – can be dropped, as long as we are willing to make a replacement assumption about the causal structure of inflating spacetimes. The new theorem embraces topologically and causally complicated spacetimes, and it allows us to give a unified treatment of open and closed Universes.

Theorem 2: A spacetime M cannot be null-geodesically complete to the past if it satisfies the following conditions:
A. It obeys the null convergence condition.
B. It obeys the strong causality condition.
C. It has at least one point p such that
   i. for every point q to the past of p the volume of the difference of the pasts of p and q is finite (i.e., Ω(I−(p) − I−(q)) < ∞), and
   ii. there is a timelike curve µ, maximally extended to the past of p, such that the boundary of the future of µ has a non-empty intersection with the past of p (i.e., ˙I+(µ) ∩ I−(p) ≠ ∅).

Part (ii) of assumption C is new. It is related to certain other causal and topological properties of spacetimes [8], and there are also physical reasons for believing that the assumption is reasonable. Consider, for instance, a point r in the inflating region. Suppose that its past, I−(r), has the property that it "swallows the Universe," in the sense that every timelike curve that is maximally extended in the past direction eventually enters I−(r). (This is related to the issue of localization of light cones discussed above.) Assuming that there are thermalization events arbitrarily far in the past, it seems likely, then, that there is a thermalization event somewhere in I−(r). This contradicts the fact that r lies in the inflating region [5]. It is plausible, therefore, that inflating spacetimes will, in general, have the property that there exist maximally extended (in the past direction) timelike curves whose futures do not encompass the whole inflating region. (If no timelike curve has a future that encompasses the entire inflating region, this guarantees that the Universe never completely thermalizes – so one may view a condition of this sort as a sufficient condition for inflation to be future-eternal.)

Another piece of evidence for the reasonableness of part (ii) of assumption C is that the spacetime in the past light cone of any point in the inflating region is locally approximately de Sitter. It is similar to the spacetime in the future light cone of a point in an inflating universe where there is no thermalization. Thus "past infinity" in inflating regions might be expected to be similar to that of de Sitter space, where the
sort of behavior we are talking about does occur [11]. We are arguing, in other words, that a typical maximally extended past-directed curve ought not to influence the entire inflating region: there must be portions of the region that do not lie to the future of such a curve. This is illustrated in fig. 3.

Let V be a spacetime region. We call a timelike curve µ a curve of limited influence in V if its future does not engulf all of V. If V is the inflating region of a spacetime M, and if all timelike curves in M are of limited influence in V, we say that the spacetime obeys the limited influence condition.

Figure 3: These figures each represent the inflating region of some spacetime. The shaded region in each case represents the future of the curve µ. In (a) µ can influence the entire inflating region, whereas in (b) it cannot.

Proof: Suppose that M is null-complete to the past. We show that this leads to a contradiction. Let q be a point to the past of the point p of assumption C. We have seen in Theorem 1 that every past-directed null geodesic from q must leave E−(q) at some point and enter I−(q) (i.e., it must leave the past null cone of q and enter the interior of the past of q). Let the point q belong to ˙I+(µ) ∩ I−(p) (see fig. 4). Let γ be a null geodesic through q that lies on ˙I+(µ). From assumption B it follows that this geodesic cannot leave ˙I+(µ) when followed in the past direction. For, suppose it does so at some point x. This point cannot lie on µ itself (because then it, and all points to its causal future, including q, would lie to the chronological future of some point on µ, i.e., in I+(µ) and not on its boundary). Pick a neighborhood N of x that does not intersect µ anywhere (see the discussion of strong causality in Section II). There will be some null geodesic in N, past-directed from x, that lies on the boundary ˙I+(µ). If this geodesic, λ, is other than the continuation of γ, there will be a timelike curve from it to a point on γ (see fig. 5), violating the achronal nature of the boundary ˙I+(µ).

Figure 4: The null geodesic γ through q lies on ˙I+(µ) and on E−(q). It must lie on ˙I+(µ) throughout when followed into the past (the hollow circle at the "past end" of µ is not part of the spacetime). But q ∈ I−(p), so γ must enter I−(q), contradicting the fact that it lies on ˙I+(µ) throughout.

Figure 5: If the geodesic λ is other than the continuation of γ on ˙I+(µ), then there will be a timelike curve – shown by the dashed line – between the two.

Now, we have seen that γ must leave E−(q) and enter I−(q); i.e., there must be a point r to the past of q on γ such that r ∈ I−(q). This means that every point in some neighborhood of r must also lie in I−(q). Some of these points must belong to I+(µ). (The point r lies on γ, and so belongs to ˙I+(µ), the boundary of the future of µ. Therefore, there must be points close to r that lie in I+(µ).) This means that there is a timelike curve that starts in the past at some point on µ, passes through a point close to r, and then continues on to q. This contradicts the fact that q ∈ ˙I+(µ).

VI. Discussion

The theorems in this paper show that inflation does not seem to remove the problem of the initial singularity (although it does move the singularity back into an indefinite past). In fact, our analysis of the assumptions of the theorems suggests that almost all points in the inflating region have a singularity somewhere in their pasts. In this sense, our results are stronger than most of the usual singularity theorems, which – in general – predict the existence of just one incomplete geodesic [17]. Indeed, Theorem 2 is even stronger than that, since it appears to suggest that the Universe cannot be infinitely old, in the sense that the inflating region of spacetime can contain no timelike curve infinitely
long (in proper time) in the past direction [18]. For, suppose such a curve, µ, does exist. It seems reasonable to suppose that the null geodesics that lie on the boundary of the future of µ are also complete in the past direction [19]. If this is the case, and if µ is of limited influence in the inflating region, we arrive at the same contradiction as the one in our theorem [20].

The existence of initial singularities in inflationary models means that we cannot use inflation as a way of avoiding the question of the birth of the Universe. The question will probably have to be answered quantum mechanically, i.e., by describing the Universe by a wave function, and not by a classical spacetime.

Acknowledgements

One of the authors (A.V.) acknowledges partial support from the National Science Foundation. The other author (A.B.) thanks the Institute of Cosmology at Tufts University and Dean Al Siegel and Provost Tim Bishop of Southampton College of Long Island University for their continued support.

References

1. The inflationary expansion is driven by the potential energy of a scalar field ϕ, while the field slowly "rolls down" its potential V(ϕ). When ϕ reaches the minimum of the potential this vacuum energy thermalizes, and inflation is followed by the usual radiation-dominated expansion. The evolution of the field ϕ is influenced by quantum fluctuations, and as a result thermalization does not occur simultaneously in different parts of the Universe.
2. A. Vilenkin, Phys. Rev. D 27, 2848 (1983); A. D. Linde, Phys. Lett. B 175, 395 (1986).
3. M. Aryal and A. Vilenkin, Phys. Lett. B 199, 351 (1987); A. S. Goncharov, A. D. Linde and V. F. Mukhanov, Int. J. Mod. Phys. A 2, 561 (1987); K. Nakao, Y. Nambu and M. Sasaki, Prog. Theor. Phys. 80, 1041 (1988).
4. A. Linde, D. Linde and A. Mezhlumian, Phys. Rev. D 49, 1783 (1994).
5. A. Borde and A. Vilenkin, Phys. Rev. Lett. 72, 3305 (1994).
6. A. Borde, Phys. Rev. D 50, 3392 (1994).
7. A. Borde and A. Vilenkin, in Relativistic Astrophysics: The Proceedings of the Eighth Yukawa Symposium, edited by M. Sasaki, Universal Academy Press, Japan (1995).
8. A. Borde, Tufts Institute of Cosmology preprint (1995).
9. A. Vilenkin, Phys. Rev. D 46, 2355 (1992).
10. A. Borde, Class. Quantum Grav. 4, 343 (1987).
11. S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Space-Time, Cambridge University Press, Cambridge, England (1973).
12. This means that the notions of "past" and "future" are globally well-defined.
13. A congruence is a set of curves in an open region of spacetime, one through each point of the region.
14. A weakening of the conditions under which θ diverges was discussed by F. J. Tipler, J. Diff. Eq. 30, 165 (1978); Phys. Rev. D 17, 2521 (1978); these results were extended in [10].
15. Similar behavior occurs, for instance, in the Einstein Universe, but it does not in the de Sitter Universe, nor in some closed Robertson-Walker Universes [11,6].
16. We actually need to impose a further causality requirement, called the stable causality condition, in order for this definition to be meaningful; see ref. [6] for the details.
17. Our results are also stronger than many standard singularity theorems – such as the Hawking-Penrose theorem [11] – because we do not assume the strong energy condition. This is crucially important when discussing the structure of inflationary spacetimes, because the condition is explicitly violated there [7].
18. The existence, or not, of such a curve is related to issues raised in A. D. Linde, D. Linde and A. Mezhlumian, Phys. Rev. D 49, 1783 (1994).
19. It is possible to contrive examples in which this is not true – where, for instance, a timelike curve avoids singularities in the past, but null curves in the boundary of its future do not. In a physically reasonable
spacetime, however, one would expect singularities to be visible to timelike and null curves alike.
20. The question of whether or not the Universe is infinitely old is sometimes posed as the question of whether or not there exists an upper bound to the length of timelike curves when followed into the past. This formulation does not, however, get to the essence of the question. Consider, for example, two-dimensional Minkowski space with the region t ≤ 0 removed. This truncated spacetime has a "global beginning" at t = 0, and is thus not infinitely old at any (finite) positive time t. When viewed from the spacelike hypersurface S given by t = √(1 + x²), however, there is no upper bound on the proper lengths of past-directed timelike curves: a curve starting at a point of S with coordinate x can have proper length approaching √(1 + x²), which grows without bound as |x| → ∞.
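A quick check of the example in footnote 20 (our computation, on the truncated Minkowski space described there):

\[
ds^{2} = -dt^{2} + dx^{2} \quad (t > 0), \qquad S:\; t = \sqrt{1 + x^{2}}.
\]
\[
\text{For the vertical curve } \gamma_{x_{0}}(s) = (s,\, x_{0}),\; 0 < s \le \sqrt{1 + x_{0}^{2}}:\qquad
\tau(\gamma_{x_{0}}) = \int_{0}^{\sqrt{1 + x_{0}^{2}}} dt = \sqrt{1 + x_{0}^{2}}
\]
(as a supremum, since t = 0 is not part of the spacetime). Each past-directed timelike curve from S thus has finite proper time, but the supremum over points of S is infinite, so the "bounded past lengths" criterion fails even though the spacetime has a global beginning.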
Senior-High English Periodical Reading 27: Publishing Negative Results Benefits Scientific Progress (2)
The main strengths of the article are: (1) the view that the scientific community should value the publication of negative or failed research results is unconventional and invites reflection within the scientific community; (2) the argumentation is varied (contrast, cause and effect, quotation, etc.); (3) the structure is clear, following the pattern of stating the viewpoint (paragraph 1), arguing for it (paragraphs 2–5), analyzing the causes of the mistaken practice it addresses (paragraph 6), and concluding by restating the viewpoint (paragraph 7).
[Original text]

Ⅰ Hypothesis-driven research is at the heart of scientific endeavor, and it is often the positive, confirmatory data that get the most attention and guide further research. But many studies produce non-confirmatory data—observations that refute current ideas and carefully constructed hypotheses. And it can be argued that these "negative data," far from having little value in science, are actually an integral part of scientific progress that deserves more attention.

Ⅱ At first glance, this may seem a little nonsensical; after all, how can non-confirmatory results help science to progress when they fail to substantiate anything? But in fact, in a philosophical sense, only negative data resulting in rejection of a hypothesis represent real progress. As philosopher of science Karl Popper stated: "Every refutation should be regarded as a great success; not merely a success of the scientist who refuted the theory, but also of the scientist who created the refuted theory and who thus in the first instance suggested, if only indirectly, the refuting experiment."

Ⅲ On a more practical level, the Journal of Negative Results in Biomedicine (JNRBM) was launched on the premise that scientific progress depends not only on the accomplishments of individuals but requires teamwork and open communication of all results—positive and negative. After all, the scientific community can only learn from negative results if the data are published.

Ⅳ Though not every negative result will turn out to be of groundbreaking significance, it is imperative to be aware of the more balanced perspective that can result from the publication of non-confirmatory findings. The first and most obvious benefits of publishing negative results are a reduction in the duplication of effort between researchers, leading to the acceleration of scientific progress, and greater transparency and openness.

Ⅴ More broadly, publication of negative data might also contribute to a more realistic appreciation of the "messy" nature of science. Scientific endeavors rarely result in perfect discoveries of elements of "truth" about the world. This is largely because they are frequently based on methods with real limitations and hypotheses based on uncertain premises.

Ⅵ It is perhaps this "messy" aspect of science that contributes to a hesitation within the scientific community to publish negative data. In an ever more competitive environment, it may be that scientific journals prefer to publish studies with clear and specific conclusions. Indeed, Daniele Fanelli of the University of Edinburgh suggests that results may be distorted by a "publish or perish" culture in which the progress of scientific careers depends on the frequency and quality of citations. This leads to a situation in which data that support a hypothesis may be perceived in a more positive light and receive more citations than data that only generate more questions and uncertainty.

Ⅶ Despite the effects of this competitive environment, however, a willingness to publish negative data is emerging among researchers.
Publications that emphasize positive findings are of course useful, but a more balanced presentation of all the data, including negative or failed experiments, would also make a significant contribution to scientific progress.

[Vocabulary and phrases]
1. hypothesis [haɪˈpɒθəsɪs] n. hypothesis, assumption
2. confirmatory [kənˈfɜ:məˌtərɪ] a. confirming, corroborating
3. refute [rɪˈfju:t] v. to rebut, disprove
4. negative [ˈnegətɪv] a. negative, disconfirming
5. integral [ˈɪntɪgrəl] a. essential to the whole
6. nonsensical [nɒnˈsensɪkl] a. absurd
7. substantiate [səbˈstænʃieɪt] v. to prove, verify
8. rejection [rɪˈdʒekʃn] n. refusal, dismissal
9. launch [lɔ:ntʃ] v. to launch, initiate, roll out
10. premise [ˈpremɪs] n. premise
11. accomplishment [əˈkʌmplɪʃmənt] n. achievement
12. community [kəˈmju:nəti] n. group, circle (e.g., the scientific community)
13. groundbreaking [ˈgraʊndbreɪkɪŋ] a. innovative, pioneering
14. significance [sɪgˈnɪfɪkəns] n. meaning, importance
15. imperative [ɪmˈperətɪv] a. necessary, essential
16. duplication [ˌdju:plɪˈkeɪʃn] n. doubling, repetition
17. acceleration [əkˌseləˈreɪʃn] n. speeding up
18. transparency [trænsˈpærənsi] n. transparency
19. contribute to: to benefit, to help bring about
20. appreciation [əˌpri:ʃiˈeɪʃn] n. appreciation, assessment
21. messy [ˈmesi] a. untidy, disordered
22. endeavor [ɪnˈdevə] n. effort, enterprise
23. hesitation [ˌhezɪˈteɪʃn] n. reluctance, hesitancy
24. distort [dɪˈstɔ:t] v. to twist, misrepresent
25. perish [ˈperɪʃ] v. to be destroyed, die
26. frequency [ˈfri:kwənsi] n. frequency, rate
27. emerge [iˈmɜ:dʒ] v. to appear, come to light

[Translation notes]
Ⅰ ① Hypothesis-driven research is at the heart of scientific endeavor, and it is often the positive, confirmatory data that get the most attention and guide further research. ② But many studies produce non-confirmatory data—observations that refute current ideas and carefully constructed hypotheses. ③ And it can be argued that these "negative data," far from having little value in science, are actually an integral part of scientific progress that deserves more attention.

Translation of ①: Hypothesis-driven research is central to the scientific enterprise, and it is often the positive, confirmatory data that receive the most attention and guide further research.
arXiv:math/0309176v1 [math.CV] 10 Sep 2003

LOGARITHMIC SINGULARITY OF THE SZEGŐ KERNEL AND A GLOBAL INVARIANT OF STRICTLY PSEUDOCONVEX DOMAINS

KENGO HIRACHI

1. Introduction

This paper is a continuation of Fefferman's program [7] for studying the geometry and analysis of strictly pseudoconvex domains. The key idea of the program is to consider the Bergman and Szegő kernels of the domains as analogs of the heat kernel of Riemannian manifolds. In Riemannian (or conformal) geometry, the coefficients of the asymptotic expansion of the heat kernel can be expressed in terms of the curvature of the metric; by integrating the coefficients one obtains index theorems in various settings. For the Bergman and Szegő kernels, much progress has been made on the description of their asymptotic expansions based on invariant theory ([7], [1], [15]); we now seek invariants that arise from the integrals of the coefficients of the expansions.

We here prove that the integral of the coefficient of the logarithmic singularity of the Szegő kernel gives a biholomorphic invariant of a domain Ω, or a CR invariant of the boundary ∂Ω, and moreover that the invariant is unchanged under perturbations of the domain (Theorem 1). We also show that the same invariant appears as the coefficient of the logarithmic term of the volume expansion of the domain with respect to the Bergman volume element (Theorem 2). This second result is an analogue of the derivation of a conformal invariant from the volume expansion of conformally compact Einstein metrics which arises in the AdS/CFT correspondence – see [10] for a discussion and references.

Let Ω be a relatively compact, smoothly bounded, strictly pseudoconvex domain in a complex manifold M. We take a pseudohermitian structure θ, or a contact form, of ∂Ω and define a surface element dσ = θ ∧ (dθ)^{n−1}. Then we may define the Hardy space A(∂Ω, dσ) consisting of the boundary values of holomorphic functions on Ω that are L² in the norm ‖f‖² = ∫_{∂Ω} |f|² dσ. The Szegő kernel S_θ(z, w) is defined as the reproducing kernel of A(∂Ω, dσ), which can be extended to a holomorphic function of (z, w) ∈ Ω × Ω and has a singularity along the boundary diagonal. If we take a smooth defining function ρ of the domain, which is positive in Ω and satisfies dρ ≠ 0 on ∂Ω, then (by [6] and [2]) we can expand the singularity as

(1.1)    S_θ(z, z) = ϕ_θ(z) ρ(z)^{−n} + ψ_θ(z) log ρ(z),

where ϕ_θ and ψ_θ are functions on Ω that are smooth up to the boundary. Note that ψ_θ|_{∂Ω} is independent of the choice of ρ and is shown to give a local invariant of the pseudohermitian structure θ.

Theorem 1. (i) The integral

L(∂Ω, θ) = ∫_{∂Ω} ψ_θ θ ∧ (dθ)^{n−1}

is independent of the choice of a pseudohermitian structure θ of ∂Ω. Thus we may write L(∂Ω) = L(∂Ω, θ).

(ii) Let {Ω_t}_{t∈R} be a C^∞ family of strictly pseudoconvex domains in M. Then L(∂Ω_t) is independent of t.

In case n = 2, we have shown in [13] that ψ_θ|_{∂Ω} is given by an explicit local expression in the curvature and torsion of the Webster connection for θ. Thus the integrand ψ_θ θ ∧ dθ is nontrivial and does depend on θ, but it also turns out that L(∂Ω) = 0 by Stokes' theorem. For higher dimensions, we can still give examples of (∂Ω, θ) for which ψ_θ|_{∂Ω} ≢ 0. However, the evaluation of the integral is not easy and, so far, we can only give examples with trivial L(∂Ω) – see Proposition 3 below.

We were led to consider the integral of ψ_θ by the works of Branson-Ørsted [4] and Parker-Rosenberg [20] on the construction of conformal invariants from the heat kernel k_t(x, y) of the conformal Laplacian, and their CR analogue, for the CR invariant sub-Laplacian, by Stanton [22]. For a conformal manifold of even dimension 2n (resp. a CR manifold of dimension 2n − 1), the integral of the coefficient a_n of the asymptotic expansion

k_t(x, x) ∼ t^{−n} Σ_{j=0}^{∞} a_j(x) t^j

is shown to be a conformal (resp. CR) invariant, while the integrand a_n dv_g does depend on the choice of a scale g ∈ [g] (resp. a contact form θ). This is a natural consequence of the variational formula for the kernel k_t(x, y) under conformal scaling, which follows from the heat equation. Our Theorem 1 is also a consequence of a variational formula for the Szegő kernel.

The proofs of these results are based on Kashiwara's microlocal analysis of the Bergman kernel in [17], where he showed that the reproducing property of the Bergman kernel on holomorphic functions can be "quantized" to a reproducing property of the microdifferential operators (i.e., classical analytic pseudodifferential operators). It provides a system of microdifferential equations that characterizes the singularity of the Bergman kernel (which can be formulated as a microfunction) up to a constant multiple; such an argument can be equally applied to the Szegő kernel. These systems of equations are used to overcome one of the main difficulties, when we consider the analogy to the heat kernel, that the Bergman and Szegő kernels are not defined as solutions to differential equations.
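For orientation, here is a standard model example illustrating the expansion (1.1); it is not taken from this paper, and the constant depends on the normalization chosen for dσ. For the unit ball B^n ⊂ C^n with defining function ρ(z) = 1 − |z|² and dσ the standard surface measure on the unit sphere, the Szegő kernel is known in closed form:

\[
S(z, w) = \frac{(n-1)!}{2\pi^{n}}\,\bigl(1 - z\cdot\bar{w}\bigr)^{-n},
\qquad\text{so}\qquad
S(z, z) = \frac{(n-1)!}{2\pi^{n}}\,\rho(z)^{-n}.
\]

In (1.1) one may therefore take ϕ_θ to be the constant (n−1)!/(2πⁿ) and ψ_θ ≡ 0: the logarithmic coefficient vanishes identically for the ball, and hence L(∂B^n) = 0 in this model case.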