Andrew Ng's Lecture Notes (Chinese Summary)
Abstract:
1. Introduce Andrew Ng and his contributions to the field of machine learning
2. Explain the basic concepts of deep learning
3. Review the history and development of deep learning
4. Survey applications of deep learning in computer vision, natural language processing, and other fields
5. Discuss future trends and challenges for deep learning
Main text:
Andrew Ng is a well-known expert in machine learning. He directed the Stanford Artificial Intelligence Laboratory and later co-founded the Google Brain project, making major contributions to the advancement of deep learning.
In his lecture notes, Andrew Ng first introduces the basic concepts of deep learning.
Deep learning is a neural-network-based machine learning approach: data is passed through a network of many layers that automatically extracts features from it, enabling complex pattern recognition and prediction.
Next, the notes review the history of deep learning.
From the earliest neural network research to the later deep learning revival, Andrew Ng walks through the field's major milestones and key techniques.
He stresses the importance of core concepts such as the backpropagation algorithm, activation functions, and optimization methods.
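To make these core ideas concrete, here is a minimal sketch (not taken from the notes themselves) of a two-layer network trained by backpropagation with plain NumPy; the toy dataset, layer sizes, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data (illustrative only): XOR-like labels.
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)

# Two-layer network: 2 -> 8 -> 1, tanh hidden activation, sigmoid output.
W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)
lr = 0.5

for step in range(2000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)                  # hidden activations
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # output probabilities

    # Backward pass: gradients of the mean cross-entropy loss.
    dlogits = (p - y) / len(X)                # dL/d(output pre-activation)
    dW2 = h.T @ dlogits; db2 = dlogits.sum(0)
    dh = dlogits @ W2.T * (1 - h**2)          # chain rule through tanh
    dW1 = X.T @ dh; db1 = dh.sum(0)

    # Gradient descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("train accuracy:", ((p > 0.5) == y).mean())
```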
In the second half of the notes, Andrew Ng focuses on applications of deep learning in computer vision, natural language processing, and other fields.
Using concrete tasks such as image recognition, speech recognition, and machine translation as examples, he vividly demonstrates how deep learning solves real problems, with remarkable results.
Finally, Andrew Ng discusses future trends and challenges for deep learning.
He argues that although deep learning has succeeded in many domains, open problems remain, such as model interpretability and heavy dependence on data.
He calls on researchers to keep exploring new methods to address these problems and push deep learning further forward.
A Survey of Neural Networks and Deep Learning (Deep Learning, 15 May 2014)
Draft: Deep Learning in Neural Networks: An Overview
Technical Report IDSIA-03-14 / arXiv:1404.7828 (v1.5) [cs.NE]
Jürgen Schmidhuber, The Swiss AI Lab IDSIA, Istituto Dalle Molle di Studi sull'Intelligenza Artificiale, University of Lugano & SUPSI, Galleria 2, 6928 Manno-Lugano, Switzerland
15 May 2014

Abstract
In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

PDF of earlier draft (v1): http://www.idsia.ch/~juergen/DeepLearning30April2014.pdf
LaTeX source: http://www.idsia.ch/~juergen/DeepLearning30April2014.tex
Complete BibTeX file: http://www.idsia.ch/~juergen/bib.bib

Preface
This is the draft of an invited Deep Learning (DL) overview. One of its goals is to assign credit to those who contributed to the present state of the art. I acknowledge the limitations of attempting to achieve this goal. The DL research community itself may be viewed as a continually evolving, deep network of scientists who have influenced each other in complex ways. Starting from recent DL results, I tried to trace back the origins of relevant ideas through the past half century and beyond, sometimes using "local search" to follow citations of citations backwards in time. Since not all DL publications properly acknowledge earlier relevant work, additional global search strategies were employed, aided by consulting numerous neural network experts. As a result, the present draft mostly consists of references (about 800 entries so far). Nevertheless, through an expert selection bias I may have missed important work. A related bias was surely introduced by my special familiarity with the work of my own DL research group in the past quarter-century. For these reasons, the present draft should be viewed as merely a snapshot of an ongoing credit assignment process. To help improve it, please do not hesitate to send corrections and suggestions to juergen@idsia.ch.

Contents
1 Introduction to Deep Learning (DL) in Neural Networks (NNs)
2 Event-Oriented Notation for Activation Spreading in FNNs/RNNs
3 Depth of Credit Assignment Paths (CAPs) and of Problems
4 Recurring Themes of Deep Learning
  4.1 Dynamic Programming (DP) for DL
  4.2 Unsupervised Learning (UL) Facilitating Supervised Learning (SL) and RL
  4.3 Occam's Razor: Compression and Minimum Description Length (MDL)
  4.4 Learning Hierarchical Representations Through Deep SL, UL, RL
  4.5 Fast Graphics Processing Units (GPUs) for DL in NNs
5 Supervised NNs, Some Helped by Unsupervised NNs
  5.1 1940s and Earlier
  5.2 Around 1960: More Neurobiological Inspiration for DL
  5.3 1965: Deep Networks Based on the Group Method of Data Handling (GMDH)
  5.4 1979: Convolution + Weight Replication + Winner-Take-All (WTA)
  5.5 1960-1981 and Beyond: Development of Backpropagation (BP) for NNs
    5.5.1 BP for Weight-Sharing Feedforward NNs (FNNs) and Recurrent NNs (RNNs)
  5.6 Late 1980s-2000: Numerous Improvements of NNs
    5.6.1 Ideas for Dealing with Long Time Lags and Deep CAPs
    5.6.2 Better BP Through Advanced Gradient Descent
    5.6.3 Discovering Low-Complexity, Problem-Solving NNs
    5.6.4 Potential Benefits of UL for SL
  5.7 1987: UL Through Autoencoder (AE) Hierarchies
  5.8 1989: BP for Convolutional NNs (CNNs)
  5.9 1991: Fundamental Deep Learning Problem of Gradient Descent
  5.10 1991: UL-Based History Compression Through a Deep Hierarchy of RNNs
  5.11 1992: Max-Pooling (MP): Towards MPCNNs
  5.12 1994: Contest-Winning Not So Deep NNs
  5.13 1995: Supervised Recurrent Very Deep Learner (LSTM RNN)
  5.14 2003: More Contest-Winning/Record-Setting, Often Not So Deep NNs
  5.15 2006/7: Deep Belief Networks (DBNs) & AE Stacks Fine-Tuned by BP
  5.16 2006/7: Improved CNNs/GPU-CNNs/BP-Trained MPCNNs
  5.17 2009: First Official Competitions Won by RNNs, and with MPCNNs
  5.18 2010: Plain Backprop (+ Distortions) on GPU Yields Excellent Results
  5.19 2011: MPCNNs on GPU Achieve Superhuman Vision Performance
  5.20 2011: Hessian-Free Optimization for RNNs
  5.21 2012: First Contests Won on ImageNet & Object Detection & Segmentation
  5.22 2013-: More Contests and Benchmark Records
    5.22.1 Currently Successful Supervised Techniques: LSTM RNNs/GPU-MPCNNs
  5.23 Recent Tricks for Improving SL Deep NNs (Compare Sec. 5.6.2, 5.6.3)
  5.24 Consequences for Neuroscience
  5.25 DL with Spiking Neurons?
6 DL in FNNs and RNNs for Reinforcement Learning (RL)
  6.1 RL Through NN World Models Yields RNNs With Deep CAPs
  6.2 Deep FNNs for Traditional RL and Markov Decision Processes (MDPs)
  6.3 Deep RL RNNs for Partially Observable MDPs (POMDPs)
  6.4 RL Facilitated by Deep UL in FNNs and RNNs
  6.5 Deep Hierarchical RL (HRL) and Subgoal Learning with FNNs and RNNs
  6.6 Deep RL by Direct NN Search/Policy Gradients/Evolution
  6.7 Deep RL by Indirect Policy Search/Compressed NN Search
  6.8 Universal RL
7 Conclusion

1 Introduction to Deep Learning (DL) in Neural Networks (NNs)

Which modifiable components of a learning system are responsible for its success or failure? What changes to them improve performance? This has been called the fundamental credit assignment problem (Minsky, 1963). There are general credit assignment methods for universal problem solvers that are time-optimal in various theoretical senses (Sec. 6.8). The present survey, however, will focus on the narrower, but now commercially important, subfield of Deep Learning (DL) in Artificial Neural Networks (NNs). We are interested in accurate credit assignment across possibly many, often nonlinear, computational stages of NNs. Shallow NN-like models have been around for many decades if not centuries (Sec. 5.1). Models with several successive nonlinear layers of neurons date back at least to the 1960s (Sec. 5.3) and 1970s (Sec. 5.5).
An efficient gradient descent method for teacher-based Supervised Learning (SL) in discrete, differentiable networks of arbitrary depth called backpropagation (BP) was developed in the 1960s and 1970s, and applied to NNs in 1981 (Sec. 5.5). BP-based training of deep NNs with many layers, however, had been found to be difficult in practice by the late 1980s (Sec. 5.6), and had become an explicit research subject by the early 1990s (Sec. 5.9). DL became practically feasible to some extent through the help of Unsupervised Learning (UL) (e.g., Sec. 5.10, 5.15). The 1990s and 2000s also saw many improvements of purely supervised DL (Sec. 5). In the new millennium, deep NNs have finally attracted wide-spread attention, mainly by outperforming alternative machine learning methods such as kernel machines (Vapnik, 1995; Schölkopf et al., 1998) in numerous important applications. In fact, supervised deep NNs have won numerous official international pattern recognition competitions (e.g., Sec. 5.17, 5.19, 5.21, 5.22), achieving the first superhuman visual pattern recognition results in limited domains (Sec. 5.19). Deep NNs also have become relevant for the more general field of Reinforcement Learning (RL) where there is no supervising teacher (Sec. 6).

Both feedforward (acyclic) NNs (FNNs) and recurrent (cyclic) NNs (RNNs) have won contests (Sec. 5.12, 5.14, 5.17, 5.19, 5.21, 5.22). In a sense, RNNs are the deepest of all NNs (Sec. 3): they are general computers more powerful than FNNs, and can in principle create and process memories of arbitrary sequences of input patterns (e.g., Siegelmann and Sontag, 1991; Schmidhuber, 1990a). Unlike traditional methods for automatic sequential program synthesis (e.g., Waldinger and Lee, 1969; Balzer, 1985; Soloway, 1986; Deville and Lau, 1994), RNNs can learn programs that mix sequential and parallel information processing in a natural and efficient way, exploiting the massive parallelism viewed as crucial for sustaining the rapid decline of computation cost observed over the past 75 years.

The rest of this paper is structured as follows. Sec. 2 introduces a compact, event-oriented notation that is simple yet general enough to accommodate both FNNs and RNNs. Sec. 3 introduces the concept of Credit Assignment Paths (CAPs) to measure whether learning in a given NN application is of the deep or shallow type. Sec. 4 lists recurring themes of DL in SL, UL, and RL. Sec. 5 focuses on SL and UL, and on how UL can facilitate SL, although pure SL has become dominant in recent competitions (Sec. 5.17-5.22).
Sec. 5 is arranged in a historical timeline format with subsections on important inspirations and technical contributions. Sec. 6 on deep RL discusses traditional Dynamic Programming (DP)-based RL combined with gradient-based search techniques for SL or UL in deep NNs, as well as general methods for direct and indirect search in the weight space of deep FNNs and RNNs, including successful policy gradient and evolutionary methods.

2 Event-Oriented Notation for Activation Spreading in FNNs/RNNs

Throughout this paper, let i, j, k, t, p, q, r denote positive integer variables assuming ranges implicit in the given contexts. Let n, m, T denote positive integer constants.

An NN's topology may change over time (e.g., Fahlman, 1991; Ring, 1991; Weng et al., 1992; Fritzke, 1994). At any given moment, it can be described as a finite subset of units (or nodes or neurons) N = {u_1, u_2, ...} and a finite set H ⊆ N × N of directed edges or connections between nodes. FNNs are acyclic graphs, RNNs cyclic. The first (input) layer is the set of input units, a subset of N. In FNNs, the k-th layer (k > 1) is the set of all nodes u ∈ N such that there is an edge path of length k−1 (but no longer path) between some input unit and u. There may be shortcut connections between distant layers. The NN's behavior or program is determined by a set of real-valued, possibly modifiable, parameters or weights w_i (i = 1, ..., n). We now focus on a single finite episode or epoch of information processing and activation spreading, without learning through weight changes. The following slightly unconventional notation is designed to compactly describe what is happening during the runtime of the system.

During an episode, there is a partially causal sequence x_t (t = 1, ..., T) of real values that I call events. Each x_t is either an input set by the environment, or the activation of a unit that may directly depend on other x_k (k < t) through a current NN topology-dependent set in_t of indices k representing incoming causal connections or links. Let the function v encode topology information and map such event index pairs (k, t) to weight indices. For example, in the non-input case we may have x_t = f_t(net_t) with real-valued net_t = Σ_{k∈in_t} x_k w_{v(k,t)} (additive case) or net_t = Π_{k∈in_t} x_k w_{v(k,t)} (multiplicative case), where f_t is a typically nonlinear real-valued activation function such as tanh. In many recent competition-winning NNs (Sec. 5.19, 5.21, 5.22) there also are events of the type x_t = max_{k∈in_t}(x_k); some network types may also use complex polynomial activation functions (Sec. 5.3). x_t may directly affect certain x_k (k > t) through outgoing connections or links represented through a current set out_t of indices k with t ∈ in_k. Some non-input events are called output events.

Note that many of the x_t may refer to different, time-varying activations of the same unit in sequence-processing RNNs (e.g., Williams, 1989, "unfolding in time"), or also in FNNs sequentially exposed to time-varying input patterns of a large training set encoded as input events. During an episode, the same weight may get reused over and over again in topology-dependent ways, e.g., in RNNs, or in convolutional NNs (Sec. 5.4, 5.8). I call this weight sharing across space and/or time. Weight sharing may greatly reduce the NN's descriptive complexity, which is the number of bits of information required to describe the NN (Sec. 4.3).

In Supervised Learning (SL), certain NN output events x_t may be associated with teacher-given, real-valued labels or targets d_t yielding errors e_t, e.g., e_t = 1/2 (x_t − d_t)². A typical goal of supervised NN training is to find weights that yield episodes with small total error E, the sum of all such e_t.
The hope is that the NN will generalize well in later episodes, causing only small errors on previously unseen sequences of input events. Many alternative error functions for SL and UL are possible.

SL assumes that input events are independent of earlier output events (which may affect the environment through actions causing subsequent perceptions). This assumption does not hold in the broader fields of Sequential Decision Making and Reinforcement Learning (RL) (Kaelbling et al., 1996; Sutton and Barto, 1998; Hutter, 2005) (Sec. 6). In RL, some of the input events may encode real-valued reward signals given by the environment, and a typical goal is to find weights that yield episodes with a high sum of reward signals, through sequences of appropriate output actions.

Sec. 5.5 will use the notation above to compactly describe a central algorithm of DL, namely, backpropagation (BP) for supervised weight-sharing FNNs and RNNs. (FNNs may be viewed as RNNs with certain fixed zero weights.) Sec. 6 will address the more general RL case.
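As an illustration of this notation, the sketch below (with an invented topology, weights, and inputs) steps through one episode of a tiny FNN, computing each non-input event via the additive rule x_t = f_t(net_t) with net_t = Σ_{k∈in_t} x_k w_{v(k,t)} and f_t = tanh:

```python
import math

# Events x_1..x_5: x_1, x_2 are inputs; x_3, x_4 hidden; x_5 output.
# in_t maps each non-input event to its incoming event indices,
# and w[(k, t)] plays the role of w_v(k,t) in the survey's notation.
inputs = {1: 0.5, 2: -1.0}
in_t = {3: [1, 2], 4: [1, 2], 5: [3, 4]}
w = {(1, 3): 0.7, (2, 3): -0.2,
     (1, 4): 0.1, (2, 4): 0.9,
     (3, 5): 1.5, (4, 5): -0.6}

x = dict(inputs)
for t in sorted(in_t):                                   # events in causal order
    net_t = sum(x[k] * w[(k, t)] for k in in_t[t])       # additive case
    x[t] = math.tanh(net_t)                              # f_t = tanh
    print(f"x_{t} = tanh({net_t:.3f}) = {x[t]:.3f}")
```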
3 Depth of Credit Assignment Paths (CAPs) and of Problems

To measure whether credit assignment in a given NN application is of the deep or shallow type, I introduce the concept of Credit Assignment Paths or CAPs, which are chains of possibly causal links between events.

Let us first focus on SL. Consider two events x_p and x_q (1 ≤ p < q ≤ T). Depending on the application, they may have a Potential Direct Causal Connection (PDCC) expressed by the Boolean predicate pdcc(p, q), which is true if and only if p ∈ in_q. Then the 2-element list (p, q) is defined to be a CAP from p to q (a minimal one). A learning algorithm may be allowed to change w_{v(p,q)} to improve performance in future episodes.

More general, possibly indirect, Potential Causal Connections (PCC) are expressed by the recursively defined Boolean predicate pcc(p, q), which in the SL case is true only if pdcc(p, q), or if pcc(p, k) for some k and pdcc(k, q). In the latter case, appending q to any CAP from p to k yields a CAP from p to q (this is a recursive definition, too). The set of such CAPs may be large but is finite. Note that the same weight may affect many different PDCCs between successive events listed by a given CAP, e.g., in the case of RNNs, or weight-sharing FNNs.

Suppose a CAP has the form (..., k, t, ..., q), where k and t (possibly t = q) are the first successive elements with modifiable w_{v(k,t)}. Then the length of the suffix list (t, ..., q) is called the CAP's depth (which is 0 if there are no modifiable links at all). This depth limits how far backwards credit assignment can move down the causal chain to find a modifiable weight. (An alternative would be to count only modifiable links when measuring depth. In many typical NN applications this would not make a difference, but in some it would, e.g., Sec. 6.1.)

Suppose an episode and its event sequence x_1, ..., x_T satisfy a computable criterion used to decide whether a given problem has been solved (e.g., total error E below some threshold). Then the set of used weights is called a solution to the problem, and the depth of the deepest CAP within the sequence is called the solution's depth. There may be other solutions (yielding different event sequences) with different depths. Given some fixed NN topology, the smallest depth of any solution is called the problem's depth.

Sometimes we also speak of the depth of an architecture: SL FNNs with fixed topology imply a problem-independent maximal problem depth bounded by the number of non-input layers. Certain SL RNNs with fixed weights for all connections except those to output units (Jaeger, 2001; Maass et al., 2002; Jaeger, 2004; Schrauwen et al., 2007) have a maximal problem depth of 1, because only the final links in the corresponding CAPs are modifiable. In general, however, RNNs may learn to solve problems of potentially unlimited depth.

Note that the definitions above are solely based on the depths of causal chains, and agnostic of the temporal distance between events. For example, shallow FNNs perceiving large "time windows" of input events may correctly classify long input sequences through appropriate output events, and thus solve shallow problems involving long time lags between relevant events.

At which problem depth does Shallow Learning end, and Deep Learning begin? Discussions with DL experts have not yet yielded a conclusive response to this question. Instead of committing myself to a precise answer, let me just define for the purposes of this overview: problems of depth > 10 require Very Deep Learning.

The difficulty of a problem may have little to do with its depth. Some NNs can quickly learn to solve certain deep problems, e.g., through random weight guessing (Sec. 5.9) or other types of direct search (Sec. 6.6) or indirect search (Sec. 6.7) in weight space, or through training an NN first on shallow problems whose solutions may then generalize to deep problems, or through collapsing sequences of (non)linear operations into a single (non)linear operation; but see an analysis of non-trivial aspects of deep linear networks (Baldi and Hornik, 1994, Section B). In general, however, finding an NN that precisely models a given training set is an NP-complete problem (Judd, 1990; Blum and Rivest, 1992), also in the case of deep NNs (Síma, 1994; de Souto et al., 1999; Windisch, 2005); compare a survey of negative results (Síma, 2002, Section 1).

Above we have focused on SL. In the more general case of RL in unknown environments, pcc(p, q) is also true if x_p is an output event and x_q any later input event: any action may affect the environment and thus any later perception. (In the real world, the environment may even influence non-input events computed on a physical hardware entangled with the entire universe, but this is ignored here.) It is possible to model and replace such unmodifiable environmental PCCs through a part of the NN that has already learned to predict (through some of its units) input events (including reward signals) from former input events and actions (Sec. 6.1). Its weights are frozen, but can help to assign credit to other, still modifiable weights used to compute actions (Sec. 6.1). This approach may lead to very deep CAPs though.

Some DL research is about automatically rephrasing problems such that their depth is reduced (Sec. 4).
In particular, sometimes UL is used to make SL problems less deep, e.g., Sec. 5.10. Often Dynamic Programming (Sec. 4.1) is used to facilitate certain traditional RL problems, e.g., Sec. 6.2. Sec. 5 focuses on CAPs for SL, Sec. 6 on the more complex case of RL.

4 Recurring Themes of Deep Learning

4.1 Dynamic Programming (DP) for DL

One recurring theme of DL is Dynamic Programming (DP) (Bellman, 1957), which can help to facilitate credit assignment under certain assumptions. For example, in SL NNs, backpropagation itself can be viewed as a DP-derived method (Sec. 5.5). In traditional RL based on strong Markovian assumptions, DP-derived methods can help to greatly reduce problem depth (Sec. 6.2). DP algorithms are also essential for systems that combine concepts of NNs and graphical models, such as Hidden Markov Models (HMMs) (Stratonovich, 1960; Baum and Petrie, 1966) and Expectation Maximization (EM) (Dempster et al., 1977), e.g., (Bottou, 1991; Bengio, 1991; Bourlard and Morgan, 1994; Baldi and Chauvin, 1996; Jordan and Sejnowski, 2001; Bishop, 2006; Poon and Domingos, 2011; Dahl et al., 2012; Hinton et al., 2012a).

4.2 Unsupervised Learning (UL) Facilitating Supervised Learning (SL) and RL

Another recurring theme is how UL can facilitate both SL (Sec. 5) and RL (Sec. 6). UL (Sec. 5.6.4) is normally used to encode raw incoming data such as video or speech streams in a form that is more convenient for subsequent goal-directed learning. In particular, codes that describe the original data in a less redundant or more compact way can be fed into SL (Sec. 5.10, 5.15) or RL machines (Sec. 6.4), whose search spaces may thus become smaller (and whose CAPs shallower) than those necessary for dealing with the raw data. UL is closely connected to the topics of regularization and compression (Sec. 4.3, 5.6.3).
4.3 Occam's Razor: Compression and Minimum Description Length (MDL)

Occam's razor favors simple solutions over complex ones. Given some programming language, the principle of Minimum Description Length (MDL) can be used to measure the complexity of a solution candidate by the length of the shortest program that computes it (e.g., Solomonoff, 1964; Kolmogorov, 1965b; Chaitin, 1966; Wallace and Boulton, 1968; Levin, 1973a; Rissanen, 1986; Blumer et al., 1987; Li and Vitányi, 1997; Grünwald et al., 2005). Some methods explicitly take into account program runtime (Allender, 1992; Watanabe, 1992; Schmidhuber, 2002, 1995); many consider only programs with constant runtime, written in non-universal programming languages (e.g., Rissanen, 1986; Hinton and van Camp, 1993). In the NN case, the MDL principle suggests that low NN weight complexity corresponds to high NN probability in the Bayesian view (e.g., MacKay, 1992; Buntine and Weigend, 1991; De Freitas, 2003), and to high generalization performance (e.g., Baum and Haussler, 1989), without overfitting the training data. Many methods have been proposed for regularizing NNs, that is, searching for solution-computing, low-complexity SL NNs (Sec. 5.6.3) and RL NNs (Sec. 6.7). This is closely related to certain UL methods (Sec. 4.2, 5.6.4).

4.4 Learning Hierarchical Representations Through Deep SL, UL, RL

Many methods of Good Old-Fashioned Artificial Intelligence (GOFAI) (Nilsson, 1980) as well as more recent approaches to AI (Russell et al., 1995) and Machine Learning (Mitchell, 1997) learn hierarchies of more and more abstract data representations. For example, certain methods of syntactic pattern recognition (Fu, 1977) such as grammar induction discover hierarchies of formal rules to model observations. The partially (un)supervised Automated Mathematician/EURISKO (Lenat, 1983; Lenat and Brown, 1984) continually learns concepts by combining previously learnt concepts. Such hierarchical representation learning (Ring, 1994; Bengio et al., 2013; Deng and Yu, 2014) is also a recurring theme of DL NNs for SL (Sec. 5), UL-aided SL (Sec. 5.7, 5.10, 5.15), and hierarchical RL (Sec. 6.5). Often, abstract hierarchical representations are natural by-products of data compression (Sec. 4.3), e.g., Sec. 5.10.

4.5 Fast Graphics Processing Units (GPUs) for DL in NNs

While the previous millennium saw several attempts at creating fast NN-specific hardware (e.g., Jackel et al., 1990; Faggin, 1992; Ramacher et al., 1993; Widrow et al., 1994; Heemskerk, 1995; Korkin et al., 1997; Urlbe, 1999), and at exploiting standard hardware (e.g., Anguita et al., 1994; Muller et al., 1995; Anguita and Gomes, 1996), the new millennium brought a DL breakthrough in form of cheap, multi-processor graphics cards or GPUs. GPUs are widely used for video games, a huge and competitive market that has driven down hardware prices. GPUs excel at fast matrix and vector multiplications required not only for convincing virtual realities but also for NN training, where they can speed up learning by a factor of 50 and more. Some of the GPU-based FNN implementations (Sec. 5.16-5.19) have greatly contributed to recent successes in contests for pattern recognition (Sec. 5.19-5.22), image segmentation (Sec. 5.21), and object detection (Sec. 5.21-5.22).

5 Supervised NNs, Some Helped by Unsupervised NNs

The main focus of current practical applications is on Supervised Learning (SL), which has dominated recent pattern recognition contests (Sec. 5.17-5.22). Several methods, however, use additional Unsupervised Learning (UL) to facilitate SL (Sec. 5.7, 5.10, 5.15). It does make sense to treat SL and UL in the same section: often gradient-based methods, such as BP (Sec. 5.5.1), are used to optimize objective functions of both UL and SL, and the boundary between SL and UL may blur, for example, when it comes to time series prediction and sequence classification, e.g., Sec. 5.10, 5.12.
A historical timeline format will help to arrange subsections on important inspirations and technical contributions (although such a subsection may span a time interval of many years). Sec. 5.1 briefly mentions early, shallow NN models since the 1940s, Sec. 5.2 additional early neurobiological inspiration relevant for modern Deep Learning (DL). Sec. 5.3 is about GMDH networks (since 1965), perhaps the first (feedforward) DL systems. Sec. 5.4 is about the relatively deep Neocognitron NN (1979) which is similar to certain modern deep FNN architectures, as it combines convolutional NNs (CNNs), weight pattern replication, and winner-take-all (WTA) mechanisms. Sec. 5.5 uses the notation of Sec. 2 to compactly describe a central algorithm of DL, namely, backpropagation (BP) for supervised weight-sharing FNNs and RNNs. It also summarizes the history of BP 1960-1981 and beyond. Sec. 5.6 describes problems encountered in the late 1980s with BP for deep NNs, and mentions several ideas from the previous millennium to overcome them. Sec. 5.7 discusses a first hierarchical stack of coupled UL-based Autoencoders (AEs); this concept resurfaced in the new millennium (Sec. 5.15). Sec. 5.8 is about applying BP to CNNs, which is important for today's DL applications. Sec. 5.9 explains BP's Fundamental DL Problem (of vanishing/exploding gradients) discovered in 1991. Sec. 5.10 explains how a deep RNN stack of 1991 (the History Compressor) pre-trained by UL helped to solve previously unlearnable DL benchmarks requiring Credit Assignment Paths (CAPs, Sec. 3) of depth 1000 and more. Sec. 5.11 discusses a particular WTA method called Max-Pooling (MP) important in today's DL FNNs. Sec. 5.12 mentions a first important contest won by SL NNs in 1994. Sec. 5.13 describes a purely supervised DL RNN (Long Short-Term Memory, LSTM) for problems of depth 1000 and more. Sec. 5.14 mentions an early contest of 2003 won by an ensemble of shallow NNs, as well as good pattern recognition results with CNNs and LSTM RNNs (2003). Sec. 5.15 is mostly about Deep Belief Networks (DBNs, 2006) and related stacks of Autoencoders (AEs, Sec. 5.7) pre-trained by UL to facilitate BP-based SL. Sec. 5.16 mentions the first BP-trained MPCNNs (2007) and GPU-CNNs (2006). Sec. 5.17-5.22 focus on official competitions with secret test sets won by (mostly purely supervised) DL NNs since 2009, in sequence recognition, image classification, image segmentation, and object detection.
Many RNN results depended on LSTM (Sec. 5.13); many FNN results depended on GPU-based FNN code developed since 2004 (Sec. 5.16, 5.17, 5.18, 5.19), in particular, GPU-MPCNNs (Sec. 5.19).

5.1 1940s and Earlier

NN research started in the 1940s (e.g., McCulloch and Pitts, 1943; Hebb, 1949); compare also later work on learning NNs (Rosenblatt, 1958, 1962; Widrow and Hoff, 1962; Grossberg, 1969; Kohonen, 1972; von der Malsburg, 1973; Narendra and Thathatchar, 1974; Willshaw and von der Malsburg, 1976; Palm, 1980; Hopfield, 1982). In a sense NNs have been around even longer, since early supervised NNs were essentially variants of linear regression methods going back at least to the early 1800s (e.g., Legendre, 1805; Gauss, 1809, 1821). Early NNs had a maximal CAP depth of 1 (Sec. 3).

5.2 Around 1960: More Neurobiological Inspiration for DL

Simple cells and complex cells were found in the cat's visual cortex (e.g., Hubel and Wiesel, 1962; Wiesel and Hubel, 1959). These cells fire in response to certain properties of visual sensory inputs, such as the orientation of edges. Complex cells exhibit more spatial invariance than simple cells. This inspired later deep NN architectures (Sec. 5.4) used in certain modern award-winning Deep Learners (Sec. 5.19-5.22).

5.3 1965: Deep Networks Based on the Group Method of Data Handling (GMDH)

Networks trained by the Group Method of Data Handling (GMDH) (Ivakhnenko and Lapa, 1965; Ivakhnenko et al., 1967; Ivakhnenko, 1968, 1971) were perhaps the first DL systems of the Feedforward Multilayer Perceptron type. The units of GMDH nets may have polynomial activation functions implementing Kolmogorov-Gabor polynomials (more general than traditional NN activation functions). Given a training set, layers are incrementally grown and trained by regression analysis, then pruned with the help of a separate validation set (using today's terminology), where Decision Regularisation is used to weed out superfluous units. The numbers of layers and units per layer can be learned in problem-dependent fashion.
This is a good example of hierarchical representation learning (Sec. 4.4). There have been numerous applications of GMDH-style networks, e.g. (Ikeda et al., 1976; Farlow, 1984; Madala and Ivakhnenko, 1994; Ivakhnenko, 1995; Kondo, 1998; Kordík et al., 2003; Witczak et al., 2006; Kondo and Ueno, 2008).

5.4 1979: Convolution + Weight Replication + Winner-Take-All (WTA)

Apart from deep GMDH networks (Sec. 5.3), the Neocognitron (Fukushima, 1979, 1980, 2013a) was perhaps the first artificial NN that deserved the attribute deep, and the first to incorporate the neurophysiological insights of Sec. 5.2. It introduced convolutional NNs (today often called CNNs or convnets), where the (typically rectangular) receptive field of a convolutional unit with given weight vector is shifted step by step across a 2-dimensional array of input values, such as the pixels of an image. The resulting 2D array of subsequent activation events of this unit can then provide inputs to higher-level units, and so on. Due to massive weight replication (Sec. 2), relatively few parameters may be necessary to describe the behavior of such a convolutional layer.

Competition layers have WTA subsets whose maximally active units are the only ones to adopt non-zero activation values. They essentially "down-sample" the competition layer's input. This helps to create units whose responses are insensitive to small image shifts (compare Sec. 5.2).

The Neocognitron is very similar to the architecture of modern, contest-winning, purely supervised, feedforward, gradient-based Deep Learners with alternating convolutional and competition layers (e.g., Sec. 5.19-5.22). Fukushima, however, did not set the weights by supervised backpropagation (Sec. 5.5, 5.8), but by local unsupervised learning rules (e.g., Fukushima, 2013b), or by pre-wiring. In that sense he did not care for the DL problem (Sec. 5.9), although his architecture was comparatively deep indeed. He also used Spatial Averaging (Fukushima, 1980, 2011) instead of Max-Pooling (MP, Sec. 5.11), currently a particularly convenient and popular WTA mechanism. Today's CNN-based DL machines profit a lot from later CNN work (e.g., LeCun et al., 1989; Ranzato et al., 2007) (Sec. 5.8, 5.16, 5.19).

5.5 1960-1981 and Beyond: Development of Backpropagation (BP) for NNs

The minimisation of errors through gradient descent (Hadamard, 1908) in the parameter space of complex, nonlinear, differentiable, multi-stage, NN-related systems has been discussed at least since the early 1960s (e.g., Kelley, 1960; Bryson, 1961; Bryson and Denham, 1961; Pontryagin et al., 1961; Dreyfus, 1962; Wilkinson, 1965; Amari, 1967; Bryson and Ho, 1969; Director and Rohrer, 1969; Griewank, 2012), initially within the framework of Euler-Lagrange equations in the Calculus of Variations (e.g., Euler, 1744). Steepest descent in such systems can be performed (Bryson, 1961; Kelley, 1960; Bryson and Ho, 1969) by iterating the ancient chain rule (Leibniz, 1676; L'Hôpital, 1696) in Dynamic Programming (DP) style (Bellman, 1957). A simplified derivation of the method uses the chain rule only (Dreyfus, 1962).

The methods of the 1960s were already efficient in the DP sense. However, they backpropagated derivative information through standard Jacobian matrix calculations from one "layer" to the previous one, explicitly addressing neither direct links across several layers nor potential additional efficiency gains due to network sparsity (but perhaps such enhancements seemed obvious to the authors).
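For reference, a compact statement of the recursion that such chain-rule iteration yields; this is the standard textbook formulation for the total error E = Σ_t e_t with e_t = 1/2 (x_t − d_t)², stated in conventional unit-indexed notation rather than the survey's event notation:

  for an output unit j:  δ_j = f′(net_j) (x_j − d_j)
  for a hidden unit j:   δ_j = f′(net_j) Σ_{k∈out_j} w_{jk} δ_k
  weight gradient:       ∂E/∂w_{ij} = x_i δ_j

Backpropagation computes the δ_j in reverse topological order, so each partial result is reused by all predecessors of j; this reuse is exactly the Dynamic-Programming-style efficiency discussed above.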
Promoter Prediction
BIOINFORMATICS Vol. 25, ISMB 2009, pages i313-i320, doi:10.1093/bioinformatics/btp191

Toward a gold standard for promoter prediction evaluation

Thomas Abeel (1,2), Yves Van de Peer (1,2) and Yvan Saeys (1,2)
1 Department of Plant Systems Biology, VIB and 2 Department of Plant Biotechnology and Genetics, Ghent University, Technologiepark 927, B-9052 Gent, Belgium

ABSTRACT

Motivation: Promoter prediction is an important task in genome annotation projects, and during the past years many new promoter prediction programs (PPPs) have emerged. However, many of these programs are compared inadequately to other programs. In most cases, only a small portion of the genome is used to evaluate the program, which is not a realistic setting for whole genome annotation projects. In addition, a common evaluation design to properly compare PPPs is still lacking.

Results: We present a large-scale benchmarking study of 17 state-of-the-art PPPs. A multi-faceted evaluation strategy is proposed that can be used as a gold standard for promoter prediction evaluation, allowing authors of promoter prediction software to compare their method to existing methods in a proper way. This evaluation strategy is subsequently used to compare the chosen promoter predictors, and an in-depth analysis on predictive performance, promoter class specificity, overlap between predictors and positional bias of the predictions is conducted.

Availability: We provide the implementations of the four protocols, as well as the datasets required to perform the benchmarks, to the academic community free of charge on request.

Contact: yves.vandepeer@psb.ugent.be

Supplementary information: Supplementary data are available at Bioinformatics online.

1 INTRODUCTION

Promoter prediction programs (PPPs) aim to identify promoter regions in a genome using computational models. In early work, promoter prediction focused on identifying the promoter of (protein-coding) genes (Fickett and Hatzigeorgiou, 1997), but more recently it has become clear that transcription initiation does not always result in proteins, and that transcription occurs all over the genome (Carninci et al., 2006; Frith et al., 2008; Sandelin et al., 2007).
One important question is what the different PPPs are actually trying to predict. Some programs aim to predict the exact location of the promoter region of known protein-coding genes, while others focus on finding the transcription start site (TSS). Recent research has shown that there is often no single TSS, but rather a whole transcription start region (TSR) containing multiple TSSs that are used at different frequencies (Frith et al., 2008). This article analyzes the performance of 17 programs on two tasks: (i) genome-wide identification of the start of genes and (ii) genome-wide identification of TSRs.

Most PPPs that are published make use of a tailored evaluation protocol that almost always proclaims the new PPP outperforming all others. Our aim is to provide an objective benchmark that allows us to test and compare PPPs. In the past few years, a number of papers have evaluated promoter prediction software. The earliest work indicated that many of the early PPPs predicted too many false positives (FPs) (Fickett and Hatzigeorgiou, 1997). A later genome-wide review included a completely new set of promoter predictors and introduced an evaluation protocol based on gene annotation (Bajic et al., 2004). This protocol has later been used to validate promoter predictions for the ENCODE pilot project (Bajic et al., 2006). Sonnenburg et al. (2006) proposed a more rigorous machine-learning-inspired validation method that uses experimentally determined promoters from DBTSS, a database of promoters. The most recent large-scale validation of PPPs included more programs than any of the earlier studies and introduced for the first time an evaluation based on all experimentally determined TSSs in the human genome (Abeel et al., 2008a,b). While many issues have been solved, there is still a large number of challenges that remain open for debate in evaluating the performance of promoter prediction software.

Generally, we can distinguish two main approaches in promoter prediction. The first approach assigns scores to all single nucleotides to identify TSSs or TSRs. Usually, the scoring is done with a classification algorithm that is typically validated using cross-validation. This cross-validation provides a first insight into the performance of the model and can be used to optimize the model parameters on a training set. The scores obtained from these techniques can be used as input for a genome annotation pipeline, where they will be aggregated in gene models. Because of their design, this type of promoter predictors will always work on a genome-wide scale. Programs using this approach include ARTS (Sonnenburg et al., 2006), ProSOM (Abeel et al., 2008b) and EP3 (Abeel et al., 2008a).
The second approach identifies a promoter region without providing scores for all nucleotides. Typically, this type of programs will output a start coordinate and a stop coordinate of the promoter, and a score that indicates the confidence in the prediction. In rare cases, only one coordinate is given as TSS. For two programs no score is provided (Wu-method and PromoterExplorer). Within this approach, we can distinguish two subclasses of programs: the ones that work on a genomic scale and the ones that do not. The latter are used to identify the promoter of a single gene. In this work we will not consider these programs, because they are usually distributed as a website and are thus not suited for large-scale analyses.

PPPs can be applied to identify the promoter of known genes, or they can be used to identify the start of any transcription event, regardless of what the final fate of the transcribed sequence is. For each application, we propose two evaluation protocols that can be used to assess the performance of a program for that particular application. Each application has an associated reference dataset which the protocol will use to evaluate a PPP. We use the same type of reference datasets that have previously been used to validate promoter predictions (see Section 2 for details).

Several methods have been proposed to validate promoter predictions. Cross-validation on a small set of promoter and non-promoter sequences is sometimes used to validate a PPP (Xie et al., 2006), but the results are often an overestimation of the performance on a complete genome (Bajic et al., 2004). Other methods make use of gene annotation to evaluate promoter predictions, based on the rationale that the start of a gene corresponds with a promoter (Bajic et al., 2004, 2006). However, it is clear that not all promoters are associated with protein-coding genes and, furthermore, not all transcription events start at the beginning of a gene. TSSs have been observed at the start of internal exons or at the 3' end of a gene (Carninci et al., 2006). More recently, two large resources for promoter research in the human genome have been used to validate promoter predictions. The first source is the DBTSS database, containing a large set of experimentally determined promoters (Wakaguri et al., 2008). The second source is a genome-wide screening of the human genome using the CAGE technique (Shiraki et al., 2003), providing all TSSs in the genome. The latter source is the most valuable as it is an exhaustive screening for all possible TSSs.

The remainder of this work proposes a set of protocols and datasets to use when validating promoter prediction software. To illustrate our methods, we analyzed 17 PPPs with the proposed validation schemes. While the methods are applicable to any genome, we focus in the current article on the human genome.
Finally, we highlight some challenges that arise in selecting the best PPP for a particular task.

2 MATERIALS AND METHODS

2.1 Datasets

We used release hg18 of the human genome for all analyses. For the validation protocols, we use the RefSeq genes downloaded from the UCSC table browser. This set includes 23799 unique gene models and is further referred to as the gene set. We also use the CAGE tag dataset from Carninci et al. (2006). The latter was preprocessed to aggregate all overlapping tags into clusters, resulting in 180413 clusters containing a total of 4874272 CAGE tags. A cluster is considered to be a TSR if it contains at least two tags. Singleton clusters are removed as these could be transcriptional noise. This dataset will be referred to as the CAGE dataset.
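A minimal sketch of the tag-aggregation step just described, assuming each CAGE tag is represented as a (chromosome, start, end) interval; overlapping tags on the same chromosome are merged into clusters, and singleton clusters are discarded:

```python
from collections import defaultdict

def cluster_tags(tags, min_tags=2):
    """Merge overlapping (chrom, start, end) tags; keep clusters with >= min_tags tags."""
    by_chrom = defaultdict(list)
    for chrom, start, end in tags:
        by_chrom[chrom].append((start, end))
    clusters = []
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        cur_start, cur_end, n = ivs[0][0], ivs[0][1], 1
        for start, end in ivs[1:]:
            if start <= cur_end:              # tag overlaps the current cluster
                cur_end = max(cur_end, end)
                n += 1
            else:                             # close the cluster, start a new one
                if n >= min_tags:
                    clusters.append((chrom, cur_start, cur_end, n))
                cur_start, cur_end, n = start, end, 1
        if n >= min_tags:
            clusters.append((chrom, cur_start, cur_end, n))
    return clusters

tsrs = cluster_tags([("chr1", 100, 120), ("chr1", 110, 130), ("chr1", 500, 520)])
print(tsrs)   # [('chr1', 100, 130, 2)] -- the singleton at 500 is dropped
```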
2.2 Promoter prediction software

We used two criteria to select the PPPs to include in this analysis: (i) the program or predictions should be available without charge for academic use, and (ii) the program should be able to process the complete human genome or predictions should be available for the complete genome. At least 17 programs (Table 1) fulfilled these criteria and have been included. Details for settings and prediction extraction methods for each program are included in the Supplementary Material.

Table 1. Overview of all the programs analyzed

Name | References
ARTS | Sonnenburg et al. (2006)
CpGcluster | Hackenberg et al. (2006)
CpGProD | Ponger and Mouchiroud (2002)
DragonGSF | Bajic and Brusic (2003)
DragonPF | Bajic et al. (2002)
EP3 | Abeel et al. (2008a)
Eponine | Down and Hubbard (2002)
FirstEF | Davuluri et al. (2001)
McPromoter | Ohler et al. (2000)
NNPP2.2 | Reese (2001)
Nscan | Gross and Brent (2006)
Promoter2.0 | Knudsen (1999)
PromoterExplorer | Xie et al. (2006)
PromoterScan | Prestridge (1995)
ProSOM | Abeel et al. (2008b)
PSPA | Wang and Hannenhalli (2006)
Wu-method | Wu et al. (2007)

2.3 Evaluation protocols

In this article, we propose four protocols to evaluate the quality of predictions made by PPPs. The first two protocols are bin-based protocols, inspired by Sonnenburg et al. (2006). The latter two are distance based, inspired by Abeel et al. (2008b). Figure 1 shows a schematic overview of how each protocol determines the prediction performance. For the explanation of each protocol we assume that we have a set of predictions. Furthermore, we have a reference set (the gene set or the CAGE set) that is considered to be the ground truth. The binning protocols (1A and 1B) are more machine-learning oriented. Each bin has two labels: one provided by the reference set and the other provided by the PPP. Performance can be assessed based on these labels. The distance protocols (2A and 2B) calculate the distance between a reference item and the closest prediction and will use this to calculate the performance. Protocols ending in A use the CAGE data as reference, while the ones ending in B use the gene set. Note that the B protocols discard all intergenic predictions from the evaluation. Intergenic predictions are removed because the gene set only contains known genes, so we have no idea which of the intergenic predictions are related to unknown genes or other types of transcription (Bajic et al., 2004).

2.3.1 Bin-based validation

Evaluation protocol 1A: this protocol uses the CAGE dataset as reference. We divide the genome in bins of 500 nt. Next, we check for each bin whether it overlaps with the center of a TSR. If it does, we label this bin as a positive TSR. With this labeling we can determine the number of true positives (TPs), FPs, false negatives (FNs) and true negatives (TNs). Each bin that is both labeled by a prediction and a TSR is considered a TP. A TN is a bin that is not labeled as predicted nor labeled as TSR. A FP is a bin that is labeled as predicted but not labeled as TSR. Finally, a FN is a bin that is not labeled as predicted but is labeled as TSR. From these we calculate the precision and recall with the following formulas:

precision = TP / (TP + FP)
recall = TP / (TP + FN)

Evaluation protocol 1B: this protocol is a variant of protocol 1A, but it uses the gene set as reference instead of the CAGE dataset. This protocol resembles the one used in Sonnenburg et al. (2006). We label all the bins overlapping the start of a gene as a positive gene start bin. All bins that overlap with the gene, but not with the start of that gene, are labeled as negative gene start bins. Bins that do not overlap with a gene or gene start are ignored in the analysis. A TP is a bin labeled as predicted and as a positive gene start. A TN is a bin not labeled as predicted and labeled as a negative gene start. A FP is a bin labeled as predicted and as a negative gene start. Finally, a FN is a bin not labeled as predicted and labeled as a positive gene start. The calculation of precision and recall are the same as in protocol 1A. Note that this protocol ignores intergenic predictions that are not close to a gene start.
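A sketch of protocol 1A's bin labeling, assuming predictions and TSR centers are given as coordinates on a single chromosome (bin size 500 nt as above; the example coordinates are invented):

```python
def protocol_1a(pred_positions, tsr_centers, bin_size=500):
    """Bin-based precision/recall against TSR centers (protocol 1A sketch)."""
    predicted = {p // bin_size for p in pred_positions}   # bins labeled by the PPP
    positive = {c // bin_size for c in tsr_centers}       # bins holding a TSR center
    tp = len(predicted & positive)
    fp = len(predicted - positive)
    fn = len(positive - predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Bins of 500 nt: predictions at 480 and 5100 land in bins 0 and 10,
# TSR centers at 490 and 5200 land in bins 0 and 10 -> perfect agreement here.
print(protocol_1a([480, 5100], [490, 5200]))   # (1.0, 1.0)
```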
Fig. 1. Visual representation of how the different protocols work. The panel numbers refer to the protocol identifiers. Protocols starting with 1 are based on binning, the ones starting with 2 on distance. Protocols ending in A use the CAGE data as reference, and those ending in B the gene set. More details can be found in the main text. (Figure not reproduced here.)

2.3.2 Distance-based validation

Evaluation protocol 2A: this protocol aims to validate predictions with the CAGE dataset as a reference. We determine three scores: (i) the number of predictions (totalPredictions); (ii) how many of these predictions are correct (correctPredictions); and (iii) how many TSRs are discovered by the predictions (discoveredTSR). A prediction is correct if the distance to the closest TSR is smaller than 500 nt. We use 500 nt as this is the same value as the binning approach and the value has been used in the past for this type of analysis (Abeel et al., 2008a,b). A TSR is considered discovered if there is at least one prediction less than 500 nt away from the TSR. The CAGE dataset has 180413 TSRs (totalTSR). We then define precision and recall as follows:

precision = correctPredictions / totalPredictions
recall = discoveredTSR / totalTSR

Evaluation protocol 2B: this is a modification of protocol 2A to check the agreement between TSR predictions and gene annotation. This method resembles the method used in the EGASP pilot-project (Bajic et al., 2006). We determine three scores: (i) number of predictions (totalPredictions); (ii) how many of these predictions are correct (correctPredictions); and (iii) how many genes are discovered by the predictions (discoveredGenes). All predictions that are not near the start of a gene or do not overlap with a gene are discarded. A prediction is correct if the distance to the closest start of a gene is smaller than 500 nt. A start of a gene is considered discovered if there is at least one prediction less than 500 nt away from the TSS. Predictions that overlap a gene, but are not within 500 nt of the start, are wrong predictions. There are 23799 genes in the reference set (totalGenes).

precision = correctPredictions / totalPredictions
recall = discoveredGenes / totalGenes

As in protocol 1B, this method ignores intergenic predictions that are not close to a gene start.

2.4 Performance measures

Precision and recall have been defined for each protocol as their definition is dependent on the protocol. Unfortunately, it is impossible to compare two precision-recall pairs from different programs as there is a trade-off between the precision and recall. A solution that is often used in machine learning is the use of ROC curves. We will use a variant of this method called PRCs. Instead of plotting the TP rate against the FP rate, we plot the recall against the precision. The resulting graphs are comparable and provide a full overview of the potential of the PPP. So, to fairly assess the performance of each PPP, we need to calculate all possible precision-recall pairs. This can be done by a moving threshold on the score of the predictions made by a program. We use 500 thresholds equally spaced between the minimum and maximum score for each PPP. The area under the PRC (auPRC) is calculated using the trapezoid method on all precision-recall pairs for each algorithm.

To quantify the performance of a PPP over all protocols with a single metric we introduce the PPP score, which is the harmonic mean of the auPRC of the four protocols:

PPP score = 4 / (1/auPRC(1A) + 1/auPRC(1B) + 1/auPRC(2A) + 1/auPRC(2B))

Table 2. Overview of the results of all protocols on all PPPs

# | Name | 1A | 1B | 2A | 2B | Number of predictions | Threshold | F-score | PPP score
1 | ARTS | 0.19 | 0.36 | 0.47 | 0.64 | 432117 | 0.56362 | 0.47 | 0.34
2 | CpGcluster | 0.09 | 0.22 | 0.28 | 0.44 | 227774 | 2.24167 | 0.38 | 0.18
3 | CpGProD | 0.06 | 0.16 | 0.32 | 0.04 | 20810 | 0.25473 | 0.45 | 0.08
4 | DragonGSF | 0.06 | 0.16 | 0.25 | 0.42 | 100046 | 0.26 | 0.45 | 0.14
5 | DragonPF | 0.05 | 0.08 | 0.18 | 0.26 | 747571 | 0.34 | 0.32 | 0.09
6 | EP3 | 0.18 | 0.23 | 0.42 | 0.51 | 67807 | -0.048 | 0.44 | 0.28
7 | Eponine | 0.14 | 0.29 | 0.41 | 0.57 | 1320964 | 0.986 | 0.45 | 0.27
8 | FirstEF | 0.08 | 0.23 | 0.28 | 0.52 | 44818 | 0.92938 | 0.28 | 0.18
9 | McPromoter | 0.04 | 0.10 | 0.12 | 0.23 | 43818 | -0.01347 | 0.25 | 0.08
10 | NNPP2.2 | 0.01 | 0.01 | 0.01 | 0.01 | 1962552 | 0.99 | 0.08 | 0.01
11 | Nscan | 0.07 | 0.27 | 0.22 | 0.51 | 2336020 | 0.558 | 0.34 | 0.17
12 | Promoter2.0 | 0.01 | 0.01 | 0.02 | 0.01 | 1923610 | 0.5 | 0.10 | 0.01
13 | PromoterExplorer | 0.02 | 0.05 | 0.07 | 0.12 | 134282 | NA | 0.25 | 0.04
14 | PromoterScan | 0.02 | 0.05 | 0.06 | 0.13 | 24867 | 157.51 | 0.20 | 0.04
15 | ProSOM | 0.18 | 0.25 | 0.42 | 0.51 | 63228 | 0.65302 | 0.44 | 0.29
16 | PSPA | 0.05 | 0.17 | 0.16 | 0.33 | 256028 | 5.20467 | 0.28 | 0.11
17 | Wu-method | 0.04 | 0.10 | 0.13 | 0.24 | 23934 | NA | 0.31 | 0.08

The first two columns provide the index and the name of the PPPs. The third through sixth columns show the area under the precision-recall curve (auPRC) for each of the protocols.
The seventh column displays the number of predictions for the optimal threshold as determined by protocol 2A. The eighth column shows the optimal threshold determined with protocol 2A and the next column the corresponding F-score. The tenth column gives the final score for the promoter predictor as the harmonic mean of the auPRC scores for the four protocols. PPP scores over 25% are indicated in bold in the original table; these are the programs we used for in-depth analysis.

The harmonic mean is used as it reduces the effect of high outliers, while at the same time it increases the effect of low scores. As such it will favor programs that provide a stable performance over all protocols.

For the in-depth analysis, we can only consider the predictions at one threshold. The optimal threshold is thus determined by calculating the F-score, i.e. the harmonic mean of precision and recall, for each precision-recall pair, and selecting the threshold for which the F-score is maximal:

F = (2 × precision × recall) / (precision + recall)

Determining the optimal threshold is done on the precision-recall pairs obtained by protocol 2A. We used protocol 2A because it can be considered the most comprehensive and correct protocol: it uses the CAGE dataset (most comprehensive), and it uses the actual overlap and distance between TSRs and predictions (most correct).

2.5 Classes of promoters

We classify promoters in so-called shape classes using the method described in Carninci et al. (2006). Single peak (SP) promoters are TSRs that have all tags closely grouped together (the majority of TSSs are not >4 nt apart). The second category contains the promoters that have a broad distribution of TSSs (BR). To differentiate between different cases in the broad category, two additional classes were defined, referred to as 'broad distribution with a dominant peak (PB)' and 'promoters with a multi-modal distribution of TSSs (MU)'. The shape class of a tag cluster is determined by testing a condition for each shape class in a particular order. The first test that succeeds indicates the shape class. We first test for SP, next for PB and finally for MU. If none of the tests succeeds, the promoter is assigned the BR label. A TSR has the SP shape if over 50% of all individual tags start no further than 4 nt apart. The PB shape is defined as any TSR for which the ratio of the number of tags at the two most commonly used locations exceeds 2. A TSR has a multi-modal distribution if the distance of any two subsequent 5% percentiles of the tag distribution exceeds 25% of the total length of the TSR. We consider only clusters with at least 100 tags. When applied to our pre-processed CAGE dataset, 5570 clusters have at least 100 tags. Of these clusters, 944 have a sharp peak (SP), 498 have a broad dominant peak (PB), 3188 clusters have a multi-modal distribution (MU) and 940 do not fit in any of the other classes (BR).

Another subdivision of TSRs was made to assess the bias of PPPs toward rare and common transcription initiation events. To assess the performance on TSRs that are rarely used and TSRs that are commonly used, we create two datasets. The set with rarely used TSRs contains all TSRs that have exactly 2 tags, while the commonly used TSRs have at least 25 associated tags. This results in 14363 common TSRs and 85519 rare TSRs.
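A small sketch of the two scalar summaries defined above: the PPP score as the harmonic mean of the four auPRC values (fed here with the ARTS row of Table 2), and the optimal threshold chosen by maximizing the F-score over precision-recall pairs (the pairs below are invented):

```python
def harmonic_mean(values):
    return len(values) / sum(1.0 / v for v in values)

def f_score(precision, recall):
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# PPP score from the four per-protocol auPRC values (ARTS row of Table 2).
print(round(harmonic_mean([0.19, 0.36, 0.47, 0.64]), 2))   # 0.34

# Optimal threshold: the (threshold, precision, recall) triple with maximal F-score.
pairs = [(0.2, 0.10, 0.90), (0.5, 0.40, 0.55), (0.8, 0.70, 0.20)]
best = max(pairs, key=lambda tpr: f_score(tpr[1], tpr[2]))
print(best)   # (0.5, 0.4, 0.55)
```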
3 RESULTS

3.1 Benchmarking PPPs

We have applied the four protocols described in the previous section to 17 PPPs that have been published in the literature, and for which we were able to procure genome-wide predictions on the human genome or for which the software is available for free for academic use. We ran the latter programs ourselves on a grid, requiring over 30000 CPU hours to complete the human genome. For 15 programs this resulted in predictions with scores, while for 2 programs we only have predictions without a score (Wu-method and PromoterExplorer). Results of this analysis are reported in Table 2.

In earlier work, we used the F-score to identify the PPP that performs best on a number of datasets. However, there are some drawbacks to using the F-score as a single criterion. First of all, to compare programs fairly, one has to optimize the threshold of the program on the validation set. Even when this is done properly, the optimized F-score is only a single point on the PRC that can be obtained with the program. Hence, the F-score does not provide any insight into the full potential of the PPP under investigation. For some applications one would be more interested in how the PPP behaves under very high precision conditions while other researchers could be interested in the behavior at very high recall rates. As suggested before (Sonnenburg et al., 2006), the fairest way to compare PPPs is by calculating the complete PRC and then computing the area under this curve.

Fig. 2. PRCs for all PPPs when evaluated with protocol 2A. (Figure not reproduced here; it plots precision against recall for all programs.)

Figure 2 shows the PRCs for all 17 PPPs for protocol 2A, and the remaining protocols result in similar plots (data not shown). In a PRC, graphs most to the top-right indicate the better performing programs. We see that there are three graphs that dominate the first part of the plot; these are the graphs corresponding to the ARTS, EP3 and ProSOM programs. At about 20% precision, the graph of Eponine starts dominating, but ARTS, EP3 and ProSOM remain close by. PromoterExplorer and the Wu-method do not have a full graph, as they do not provide scores; they are represented by a single point in the plot. To be able to calculate the full area under the curve, we included one extra point to close the curve. This added point has the same recall as the point with the lowest precision in the curve, but has precision value 0. Adding this point allows the auPRC to be calculated for each PPP (including those with only one precision-recall pair) and it will put programs that do not cover the complete precision spectrum on equal footing with programs that do cover it. The graphs of Eponine and DragonPF indicate that the auPRC for those programs may be underestimated. However, we ran the programs at the lowest threshold that would work on our system. So it seems that Eponine and DragonPF do not allow us to explore them in an extreme setting with very low precision. On the other extreme of the plot, we see that the graph of some programs drops to 0 from a relatively high recall score. This indicates that some programs do not allow us to explore extreme high precision scores.

The area under the curve is reported in Table 2 in the columns marked with a protocol identifier. Each of the four protocols assigns the highest auPRC to ARTS.
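The auPRC computation described above can be sketched as follows: collect the precision-recall pairs produced by the moving threshold and integrate with the trapezoid rule (the pairs below are invented):

```python
import numpy as np

# Precision-recall pairs from a moving threshold, one pair per threshold.
precision = np.array([1.00, 0.80, 0.60, 0.45, 0.30])
recall    = np.array([0.05, 0.25, 0.50, 0.70, 0.90])

# Area under the precision-recall curve by the trapezoid rule,
# integrating precision along increasing recall.
order = np.argsort(recall)
auprc = np.trapz(precision[order], recall[order])
print(round(float(auprc), 3))   # 0.535
```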
To aggregate the results of the four protocols in one measure, we calculate the harmonic mean of the auPRC of the four protocols and report it as the PPP score in the last column of Table 2. This score is an indication of the overall performance of the PPP on different tasks and using different evaluation algorithms. Four programs have a PPP score over 0.25: ARTS, Eponine, EP3 and ProSOM. ARTS clearly performs best with 34%, while the other three programs are closely together around 28%. All further analyses were performed on all 17 PPPs, but we only report results for the four best PPPs as these are the most interesting.

The two methods for detection of CpG islands (CpGcluster and CpGProD) work relatively well with protocol 2A, especially since they have not been designed to predict promoters, but rather to detect CpG islands. This again indicates that CpG islands are a very strong signal for promoter detection and that the presence of a CpG island is often sufficient for promoter identification. FirstEF and NScan are two methods that try to predict more than just the core promoter. FirstEF tries to identify the structure of the first exon and NScan tries to construct a complete gene model. This additional gene-oriented modeling clearly improves the performance of the programs under the 1B and 2B protocols. In the 1A and 2A protocols, these programs have lower scores than programs that have a comparable performance on 1B and 2B. Promoter2.0 and NNPP2.2 obtained total scores of <1%, indicating that these programs are not suited to identify promoters. Striking is that Eponine, which has been around since 2001, is still one of the only four promoter predictors that obtain a total score above 20%.

3.2 Positional distribution of predictions

Because the evaluation protocols allow a certain distance between the prediction and the actual TSR, one should always check how well the predictions are positioned around the target site. In this section, we analyze the positional specificity with respect to the closest TSR for the four top performing programs. For the positional specificity to the closest TSR we use the optimal threshold as determined by protocol 2A. Figure 3 shows the positional distribution of predictions relative to the closest TSRs. Note that all TSRs that overlap with a prediction have distance 0, which explains the peak at position 0 in the graph. The x-axis represents the distance to the TSR. The y-axis shows the number of tags (logarithmic scale). We can see that all programs have by far the largest fraction of the tags overlapping with predictions. ARTS and Eponine make more predictions that are not overlapping with the TSR than EP3 and ProSOM, but the predictions are mostly in the vicinity of the TSR. Further from the TSR there is little difference between the four programs. Overall, all four programs have well-localized predictions with respect to the annotated TSRs.

3.3 Classes of promoters

To analyze the bias of promoter predictors to particular shape classes, we analyzed the recall obtained by each program for each of the classes. We use the optimal threshold as determined by protocol 2A. For this threshold, we determine the number of tags of the shape class that is discovered. For these analyses only the recall is informative. The precision of a method can only be calculated on the complete reference set and for this analysis we only use a subset of the reference. Table 3 shows the fraction of TSRs of each class that is identified at the optimal threshold. The scores in the table are the fraction of tags marked as SP, PB, MU or BR that is recovered by the program.
Single-peak (SP) TSRs are recovered less well by PPPs than any of the broad categories (BR, PB and MU). The TATA motif is known to be overrepresented in the SP class, and these promoters are commonly associated with tissue-specific genes, while the BR, PB and MU classes are strongly associated with CpG islands, commonly found in housekeeping genes (Carninci et al., 2006). This indicates that the current state of the art in promoter prediction is biased toward housekeeping genes that contain CpG islands.

[Fig. 3. Positional specificity for predictions around TSRs (count versus position relative to TSR), determined using the optimal threshold from protocol 2A.]

Table 3. Recall score for each of the top four PPPs on each of the four promoter classes and on the Rare and Common TSR sets

Name     SP    PB    MU    BR    Rare  Common
ARTS     0.58  0.90  0.93  0.95  0.23  0.81
EP3      0.52  0.82  0.85  0.84  0.23  0.74
Eponine  0.69  0.92  0.94  0.96  0.24  0.80
ProSOM   0.51  0.83  0.81  0.83  0.21  0.71

The recall is calculated with the optimal threshold as determined with protocol 2A.

One caveat with the last analysis is that although we make a distinction between different TSR shapes, we still look only at TSRs that have at least 100 associated tags, which means that these TSRs have a high initiation rate. To compare the performance of the four PPPs on less common TSRs, we use the sets of rarely used and commonly used TSRs (see Section 2). The fraction of identified TSRs for these two sets is shown in the last two columns of Table 3. All four PPPs have a strong bias toward strong TSRs covered by many tags.

3.4 Pair-wise prediction overlap

To calculate the overlap between predictions made by different programs, we divided the genome into chunks of 500 nt. The predictions for each program are the predicted regions with a score higher than the optimal threshold determined by protocol 2A. Table 4 shows the fraction of predictions that is shared between two PPPs. In this table, we only included the four PPPs that obtained a PPP score over 0.25 in the benchmark analysis presented in Table 2. The value in a cell with column title A and row title B should be interpreted as the fraction of predictions of program A that are contained in the predictions of program B. For example, the value in row 2, column 1 is the fraction of predictions...
Introduction to the ND-2006 Dataset

The ND-2006 dataset is a large multi-label dataset widely used in machine learning and deep learning. It consists of multiple files, about 13 GB of data in total, including text files and image files. The dataset is mainly aimed at image classification and contains many categories and a large number of samples. This article describes the dataset's background, data format, preprocessing, feature extraction, and experimental results.

1. Background. The ND-2006 dataset is a large-scale multi-label image dataset published by the National Data Archive on the Web. It is intended to give machine learning and deep learning researchers a rich data resource for image classification tasks. The dataset covers many categories, including but not limited to animals, plants, buildings, and vehicles, and provides a large number of samples for experiments and analysis.

2. Data format. The dataset comprises multiple files, each containing a set of images and their corresponding labels. Images are typically stored as JPEG or PNG files, and labels are given as text. The dataset also includes metadata files describing, for example, the categories and image dimensions. When loading and using the dataset, the prescribed format and naming conventions must be followed.

3. Preprocessing. Before working with the ND-2006 dataset, some preprocessing is required, such as image loading, label cleaning, image cropping, and resizing. These steps improve data quality and model performance. Images can be loaded with image-processing libraries such as OpenCV or PIL, converted to a suitable data type, and stored as NumPy arrays for use in a program. Text labels must be converted to numeric labels, for example with one-hot encoding or TF-IDF. During preprocessing, pay attention to the range and distribution of the data so that appropriate normalization can be applied; a minimal sketch of these steps follows.
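The sketch below covers the loading and label-encoding steps just described, using PIL and NumPy. The file path and the class vocabulary are hypothetical placeholders, since the dataset's actual layout is not specified here.

import numpy as np
from PIL import Image

def load_image(path, size=(224, 224)):
    # Load, resize, and scale one image to a float32 array in [0, 1].
    img = Image.open(path).convert("RGB").resize(size)
    return np.asarray(img, dtype=np.float32) / 255.0

def one_hot(labels, vocab):
    # Map text labels to one-hot rows over a fixed class vocabulary.
    index = {name: i for i, name in enumerate(vocab)}
    out = np.zeros((len(labels), len(vocab)), dtype=np.float32)
    for row, name in enumerate(labels):
        out[row, index[name]] = 1.0
    return out

# Hypothetical usage; both the path and the class names are placeholders.
# x = load_image("nd2006/images/000001.jpg")
y = one_hot(["animal"], ["animal", "plant", "building", "vehicle"])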
4. Feature extraction. The ND-2006 dataset supports many kinds of features, such as color, texture, and shape. When extracting features, choose a feature combination and extraction method suited to the task and the model. Common approaches include image transforms, filtering, and region-based feature extraction. These methods describe an image from different angles and can improve a model's ability to generalize; a short example follows.
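As one concrete instance of a hand-crafted feature of the kind mentioned above, the sketch below computes a per-channel color histogram; texture and shape descriptors would be added in the same spirit.

import numpy as np

def color_histogram(image, bins=8):
    # image: HxWx3 array with values in [0, 1]; returns a 3*bins feature vector.
    feats = [np.histogram(image[..., c], bins=bins, range=(0.0, 1.0),
                          density=True)[0]
             for c in range(image.shape[-1])]
    return np.concatenate(feats)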
A Brief History of NLP Models

The development of natural language processing (NLP) models has gone through several stages and technical shifts: rule-based models, statistical models, and deep learning models.

1. Rule-based models (1950s-1990s). Early NLP systems were rule-based: human experts hand-wrote grammar rules and vocabularies to tackle problems such as parsing, machine translation, and information retrieval. Because of the complexity of language and the limited accuracy of such rules, these systems became hard to maintain and extend, and they could not handle the polysemy and ambiguity of natural language.

2. Statistical models (1990s-2010s). With the rapid growth of computing power and data volumes, statistical modeling became increasingly popular in NLP. These models learn regularities of language from previously collected corpora and then apply those regularities to new data. Common algorithms include hidden Markov models (HMM), conditional random fields (CRF), and maximum-entropy models (MaxEnt). They achieved a degree of success in many NLP tasks, including speech recognition, part-of-speech tagging, named-entity recognition, and sentiment analysis.

3. Deep learning models (2010s-present). Deep learning models apply artificial neural networks to NLP. They use deep neural networks to learn linguistic features automatically and can handle large amounts of language data and complex linguistic regularities. Deep learning models are now widely used for text classification, machine translation, question answering, semantic analysis, and other tasks; the best-known architectures include convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory networks (LSTM), and the Transformer. With these models, NLP systems have made great progress, and the area remains one of the most active research fields.
MINDgpt Parameters

MINDgpt is a model for natural language processing based on OpenAI's GPT-3 model. Its parameters include the following:

1. Model depth: the number of layers, i.e., how many Transformer encoder layers the model has. More layers give the model more expressive power but increase the computational cost.
2. Sequence length: the maximum length of the text sequence fed to the model. A larger sequence length captures more context but also increases the computational cost.
3. Hidden units: the dimensionality of the hidden layer in each Transformer encoder layer. Higher dimensionality increases expressive power, again at higher computational cost.
4. Embedding dimension: the dimensionality of the input token representations. Higher dimensionality lets the model capture more semantic information, but costs more compute.
5. Batch size: the number of samples used to update the model parameters in each training iteration. Larger batches improve training efficiency but require more memory and compute.

These parameters can be tuned to the task and the available compute to balance model quality against resource use; a hypothetical configuration sketch follows.
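The knobs listed above are conveniently grouped in a configuration object. The sketch is illustrative only: the field names and default values are assumptions, not MINDgpt's published settings.

from dataclasses import dataclass

@dataclass
class MindGPTConfig:
    # All defaults are illustrative placeholders, not MINDgpt's real settings.
    n_layers: int = 24        # model depth: number of Transformer encoder layers
    seq_len: int = 2048       # maximum input sequence length
    hidden_units: int = 4096  # hidden dimension inside each encoder layer
    embed_dim: int = 4096     # dimension of the input token embeddings
    batch_size: int = 32      # samples per parameter update

print(MindGPTConfig())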
The GPTs Concept

GPTs is short for Generative Pre-trained Transformers. It is an artificial intelligence technology introduced by OpenAI, used mainly for natural language processing tasks such as dialogue generation, text summarization, and machine translation. Through pre-training on large amounts of text data, GPTs acquire the ability to understand and generate natural language.

At the first OpenAI developer conference, OpenAI announced that users may build custom ChatGPTs for specific personal and professional tasks. Users can quickly create their own dedicated version of ChatGPT, for example to help teach children math or to explain the rules of a board game.

In addition, in a recent paper OpenAI refers to GPTs as general-purpose technologies (GPTs). A general-purpose technology is one that meets three core criteria: it improves over time, it pervades the economy, and it spawns complementary innovations. GPT has kept improving over time, so it meets the first criterion.
Likelihood Score Compensation Transforms for Speaker Recognition under Model Mismatch

Abstract (recovered in part): when speaker characteristics keep changing and the environment affects the recognition system, the recognition model becomes mismatched, and the model's likelihood scores need to be compensated. Keywords: speaker recognition; Gaussian mixture model; likelihood score compensation transform. CLC number: TN912.34; document code: A.

1. Introduction

Speech is one of a person's natural attributes. Because of physiological differences in each speaker's vocal organs and acquired behavioral differences such as speaking habits, every person's speech carries distinctive personal characteristics. Speaker recognition aims at extracting the individual characteristics contained in the speech signal.

For text-independent speaker recognition, a speaker's individual characteristics vary over long time spans, and pronunciation is closely related to the environment and to the speaker's mood and health; in practice, background noise and other interference may also be introduced. These are the main factors that keep the recognition rate of text-independent speaker recognition from improving further. To reduce their influence, a large amount of research has been carried out, which falls into three areas. The first, speech denoising, is a research hotspot; spectral subtraction is most effective against stationary noise.
The nuScenes NDS Metric: A Q&A

In the development of autonomous driving technology, evaluating and comparing the performance of different systems is essential. The nuScenes (nuTonomy scenes) metric NDS (described here as "nuScenes Detection and Segmentation") is an indicator for evaluating object detection and segmentation algorithms. This article answers some questions about the NDS metric step by step.

Question 1: What is the NDS metric? NDS is the official evaluation metric of the nuScenes dataset, used to assess object detection and segmentation algorithms. nuScenes is a large-scale autonomous driving dataset containing high-resolution sensor data from a wide range of urban environments. The NDS metric helps researchers and developers measure how an autonomous driving system performs in different scenes and conditions.

Question 2: How is NDS evaluated? The evaluation has two parts: object detection and instance segmentation. In the detection stage, predicted bounding boxes are compared against ground-truth boxes, and the precision is computed at different IoU (Intersection over Union) thresholds. In the segmentation stage, predicted masks are compared against ground-truth masks, and the mean intersection over union (Mean Intersection over Union, MIoU) is computed.

Question 3: What is the formula for NDS? As given in this Q&A:

NDS = 0.5 * NDS_detection + 0.5 * NDS_segmentation

where NDS_detection is the object detection score and NDS_segmentation is the instance segmentation score.

Question 4: What is the formula for the detection score? As given here:

NDS_detection = AP_0.5 + AP_0.7 + AP_0.9

where AP_0.5, AP_0.7, and AP_0.9 are the average precision (Average Precision) at IoU thresholds of 0.5, 0.7, and 0.9, respectively. A small calculator following these formulas is sketched below.
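Taken literally, the two formulas quoted above combine as in the sketch below. Note that this follows the Q&A's formulas as stated; the official nuScenes benchmark defines its detection score (NDS) somewhat differently, as a weighted combination of mAP and true-positive error terms.

def nds(ap_05, ap_07, ap_09, nds_segmentation):
    # Formulas as quoted in this Q&A (not the official nuScenes definition).
    nds_detection = ap_05 + ap_07 + ap_09
    return 0.5 * nds_detection + 0.5 * nds_segmentation

print(nds(0.6, 0.5, 0.3, 0.55))   # hypothetical AP values and segmentation score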
"Understand Deep Learning in One Day" (presentation slides)
[Slide "1-2 Basic idea": a neural network produces ten outputs y1 ... y10 through a softmax layer; given a set of parameters, the outputs are compared with a one-hot target vector (e.g., class "1" is (1, 0, 0, ...)) using the cross-entropy loss.]
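A minimal NumPy sketch of the computation the slide depicts: a softmax over ten outputs compared with a one-hot target via cross entropy.

import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, target):
    # Cross entropy between predicted probabilities and a one-hot target.
    return -np.sum(target * np.log(probs + 1e-12))

logits = np.random.randn(10)   # network outputs y1..y10 before softmax
target = np.zeros(10)
target[0] = 1.0                # one-hot target for class "1"
print(cross_entropy(softmax(logits), target))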
[Slide "2-1 Machine vision: key technologies and applications". Pipeline fragments from the slide: image capture, image compression, image storage; image preprocessing, image segmentation; feature extraction, object classification, match judgment; model building, behavior recognition; object recognition, object analysis.]

A) Biometric identification: widely used in the security field. Biometric identification verifies a person's identity by detecting and recognizing biological characteristics. Statistically, human physiological traits such as fingerprints and irises are unique, so they can serve as a basis for identifying a user. At present, biometric technology is used mainly for identity recognition, including voice, fingerprint, face, vein, and iris recognition.
A brief timeline of neural networks (from the slides):
1958: Perceptron (linear model)
1969: Perceptron shown to have limitations
1980s: Multi-layer perceptron (not significantly different from today's DNNs)
1986: Backpropagation (usually more than 3 hidden layers was not helpful)
1989: One hidden layer is "good enough", so why deep?
2006: RBM initialization
2009: GPUs
2011: Deep learning starts to be popular in speech recognition
2012: Wins the ILSVRC image competition
2015.2: Image recognition surpasses human-level performance
2016.3: AlphaGo beats Lee Sedol
2016.10: Speech recognition systems as good as humans
OECD Risk Awareness Tool for Multinational Enterprises in Weak Governance Zones

ORGANISATION FOR ECONOMIC CO-OPERATION AND DEVELOPMENT
Foreword
The OECD Risk Awareness Tool for Multinational Enterprises in Weak Governance Zones aims to help companies that invest in countries where governments are unwilling or unable to assume their responsibilities. It addresses risks and ethical dilemmas that companies are likely to face in such weak governance zones, including obeying the law and observing international instruments, heightened care in managing investments, knowing business partners and clients, dealing with public sector officials, and speaking out about wrongdoing.
© OECD 2006
Pattern Recognition, Assignment 4: Answers

(6) Check whether the preset stopping criterion is met: if so, the algorithm terminates; otherwise return to step (2) and enter the next round of learning.

4. Answer:
(1) Image size: 400 × 400
First convolutional layer filter size: 5 × 5
Number of feature maps in the first convolutional layer: 20
Number of weights in the first convolutional layer: 5 × 5 × 20
Image size after the first convolutional layer: 396 × 396
Pooling window size in the first convolutional layer: 2 × 2
Image size after pooling: 198 × 198
These sizes follow from the standard output-size formula, as checked in the sketch below.
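The arithmetic behind these numbers is the standard convolution/pooling output-size formula; a small sketch that checks them:

def conv_out(size, kernel, stride=1, padding=0):
    # Spatial output size: floor((size + 2*padding - kernel) / stride) + 1.
    return (size + 2 * padding - kernel) // stride + 1

assert conv_out(400, 5) == 396            # 400x400 image, 5x5 filters
assert conv_out(396, 2, stride=2) == 198  # 2x2 pooling with stride 2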
"Downloading and Copying" New Knowledge, The Matrix Style

(By 唯一, 2011-12-15 12:30:13.) Remember the plot of The Matrix? Simply by physically connecting a computer to the brain, one could download new knowledge and skills and internalize them as one's own abilities. Now something like this seems about to play out. Recently, researchers "decoded" the brain's process of learning a certain behavior or skill, in the hope that replaying this process in another brain would let that brain acquire the corresponding skill. This sounds thrilling, but the results so far are only at the earliest stage. The researchers say that for the next few years the most direct application will remain clinical brain rehabilitation.

This is a joint project of Boston University and the ATR Computational Neuroscience Laboratories in Kyoto, Japan. Its aim is to understand how the brain learns different skills, mainly using fMRI (functional magnetic resonance imaging). The team chose adult visual perceptual learning as the target of "decoding and copying"; this ability can reportedly be improved markedly by repeated training.

Boston University neuroscientist Takeo Watanabe explained that the study relied mainly on "decoded fMRI neurofeedback" to reproduce activity patterns in the visual cortex, using Gabor patch images, a stimulus frequently used in visual neuroscience. A subject's ability to discriminate such images is said to indicate how quickly the visual areas of the cortex receive and respond to image information, and can also probe black-white contrast sensitivity in visual processing. Clinically, such images are also used in training to improve the visual system's processing of images and to sharpen visual acuity.

The experiment involved 16 participants with normal vision, 11 men and 5 women, aged 20 to 38. [Figure: Gabor patch images at different spatial frequencies.]
APS (Antiphospholipid Syndrome)
Evaluation of the Clinical Performance of a Novel Chemiluminescent Immunoassay for Detection of Anticardiolipin and Anti-Beta2-Glycoprotein 1 Antibodies in the Diagnosis of Antiphospholipid Syndrome

Shulan Zhang, MD, Ziyan Wu, MD, Ping Li, MD, Yina Bai, BS, Fengchun Zhang, MD, and Yongzhe Li, MD (Department of Rheumatology and Clinical Immunology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China). Medicine 94(46):e2059; DOI: 10.1097/MD.0000000000002059.

Abstract: Detection of antiphospholipid antibodies represents the first-line approach for diagnosis of antiphospholipid syndrome (APS). In this study, we evaluated the clinical performance of a novel chemiluminescence assay (CIA) for the detection of IgG/IgM/IgA anticardiolipin (aCL) and IgG/IgM/IgA anti-beta2-glycoprotein 1 (aβ2GP1) antibodies and compared it with commercial enzyme-linked immunosorbent assay (ELISA) kits from the same manufacturer. A total of 227 sera were tested, including 84 samples from patients with APS, 104 samples from patients with non-APS diseases as disease controls, and 39 healthy controls. Serum IgG/IgM/IgA aCL and IgG/IgM/IgA aβ2GP1 were determined by both ELISA (QUANTA Lite ELISA) and CIA (QUANTA Flash assays). Significant quantitative correlations were identified between ELISA and CIA for IgG/IgM/IgA aCL and IgG/IgM/IgA aβ2GP1 autoantibody detection (P < 0.001), with rho values ranging from 0.51 to 0.87. In addition, ELISA and CIA demonstrated good qualitative agreement for IgG/IgM/IgA aCL and IgM/IgA aβ2GP1 determination, with kappa coefficients ranging from 0.52 to 0.77. In contrast, ELISA and CIA showed only moderate qualitative agreement for IgG aβ2GP1, with a kappa value of 0.2. Notably, significantly more IgG aβ2GP1-positive sera were detected by CIA than by ELISA in both primary APS (52.9% vs. 8.8%) and APS associated to other diseases (70.0% vs. 8.0%). For the diagnosis of APS, IgG aβ2GP1 detection by CIA (IgG aβ2GP1 CIA) demonstrated the highest sensitivity (63.1%), followed by IgG aCL CIA (48.8%). More importantly, IgG aβ2GP1 CIA demonstrated the highest ability to predict thrombotic events in patients with APS, with an OR of 3 (95% CI: 1.1-7.9). Our data suggest that this novel CIA assay performs well in detecting aCL and aβ2GP1 antibodies, especially IgG aβ2GP1 antibodies. Our findings shed light on the application of CIA in the laboratory diagnosis of APS in China.

Abbreviations: aβ2GP1 = anti-beta2-glycoprotein 1, aCL = anticardiolipin, aPL = antiphospholipid antibodies, APS = antiphospholipid syndrome, CIA = chemiluminescence assay, ELISA = enzyme-linked immunosorbent assay, LA = lupus anticoagulant, OR = odds ratio, ROC = receiver-operating characteristic, SLE = systemic lupus erythematosus.

INTRODUCTION

Antiphospholipid syndrome (APS) is a heterogeneous group of autoimmune diseases characterized by recurrent arterial/venous thrombosis and/or pregnancy morbidity, together with the presence of antiphospholipid (aPL) antibodies. Primary APS (PAPS) is defined by the absence of any underlying systemic autoimmune disorder, while APS associated to other diseases occurs together with other systemic autoimmune syndromes, especially systemic lupus erythematosus (SLE).1 Since aPLs are a hallmark feature of APS, detection of aPLs represents the first-line approach for its diagnosis. According to the 2006 updated consensus criteria, the diagnosis of APS requires the persistent presence of at least one of the following aPLs: lupus anticoagulant (LA), IgG and/or IgM anticardiolipin (aCL), and IgG and/or IgM anti-beta2-glycoprotein 1 (aβ2GP1) antibodies.1,2 IgA aCL and IgA aβ2GP1 antibodies are not currently included in
the laboratory criteria for APS, but are suggested as "noncriteria" antibodies for seronegative patients with clinical suspicion of APS.2,3 Of note, the diagnosis of APS relies predominantly on laboratory findings, as the characteristic clinical features of thrombosis and pregnancy morbidity also occur in many other diseases. In addition, these laboratory results are critical for predicting and stratifying the risk of developing the clinical manifestations of the syndrome. Unfortunately, the assays routinely used in clinical settings, particularly enzyme-linked immunosorbent assays (ELISA), lack standardized kits, resulting in substantial variations in antibody positivity between laboratories.4-6

Recently, chemiluminescence technology has been applied to autoantibody testing.7-15 Several studies indicated that this novel assay performs similarly to commercial ELISAs and yields good agreement of results among laboratories for the detection of IgG/IgM aCL and IgG/IgM aβ2GP1 autoantibodies,9-15 suggesting that the chemiluminescence assay (CIA) could be a promising tool to improve reproducibility and reduce interlaboratory variation. However, those studies were performed in heterogeneous groups of patients with different ethnic/geographic backgrounds, and most compared the HemosIL AcuStar CIA system or the Zenit RA CIA system with either a homemade ELISA11,12-15 or an ELISA kit from another manufacturer.9,13,14 With the introduction of the QUANTA Flash system, it is possible to compare the CIA system with an ELISA kit from the same manufacturer. In addition, with CIA emerging as a promising alternative, it is of paramount importance to evaluate this novel fully automated assay for aCL and aβ2GP1 autoantibody detection in Chinese patients with APS. In the present study, we evaluated the analytical and clinical
performance of a novel CIA assay (QUANTA Flash assays) for the detection of IgG/IgM/IgA aCL and IgG/IgM/IgA aβ2GP1 antibodies and compared it with commercial ELISA kits from the same manufacturer.

MATERIALS AND METHODS

Sera. A total of 227 sera were tested in this study: 84 samples from patients with APS, 104 samples from patients with non-APS diseases as disease controls (non-APS), and 39 healthy controls (HC). The APS samples included 34 from patients with PAPS and 50 from patients with APS associated to other diseases. The non-APS samples included 30 from patients with non-APS thrombosis, 32 from patients with non-APS pregnancy-related morbidity (PRM), and 42 from patients with SLE. HC included subjects without any signs of infection, inflammation, or other significant illnesses. The diagnosis of APS was determined according to the Sydney revised Sapporo guidelines.2 Specifically, subjects were diagnosed with APS based on a combination of one positive clinical criterion and one positive laboratory criterion (LA, aCL, or aβ2GP1 antibodies determined by ELISA) on 2 different occasions separated by 12 weeks.2 The demographics and clinical characteristics of all subjects are shown in Table 1. All samples were tested for LA. Study protocols were reviewed and approved by the Ethical Committee of Peking Union Medical College Hospital (PUMCH), and informed consent was obtained from all participants. All sera were stored at −20°C until analysis.

Serum antibody determination. Serum aCL autoantibodies (IgG, IgM, and IgA) and aβ2GP1 (IgG, IgM, and IgA) were determined by both ELISA (QUANTA Lite ELISAs, INOVA Diagnostics, Inc., San Diego, CA) and CIA (QUANTA Flash assays, INOVA Diagnostics, Inc.) according to the manufacturer's instructions. The QUANTA Flash assays were performed on the BIO-FLASH instrument (Biokit S.A., Barcelona, Spain). The principle of the QUANTA Flash assay system was previously described by Mahler et al7 and Bentow et al.16 The cut-off values for positivity were set according to the manufacturer's recommendations.

Statistical analysis. The SPSS 20.0 statistical software package (SPSS, Inc., Chicago, IL) and Prism 5.02 (GraphPad Software, San Diego, CA) were used for all statistical tests. Cohen's kappa agreement test and the Spearman correlation test were performed to analyze the qualitative and quantitative agreement between ELISA and CIA. Serial receiver-operating characteristic (ROC) curves were used to calculate the area under the ROC curve (AUC) for defining optimal cut-off values and analyzing the performance of the different assays. Average linkage clustering with the Heml 1.0 Heatmap Illustrator (The CUCKOO Workgroup, Hefei, Anhui, China) was used for cluster analysis; hierarchical clustering was used to illustrate the relationships between the different assays and to display the reactivity patterns of the patients. P values of less than 0.05 were considered statistically significant.

RESULTS

Clinical characteristics. The clinical characteristics and laboratory findings of all subjects are listed in Table 1. The incidence of arterial thrombosis in patients with PAPS, APS associated to other diseases, non-APS thrombosis, non-APS PRM, and SLE was 26.5%, 36.0%, 16.7%, 0%, and 2.3%, respectively. The presence of venous thrombosis in these groups was 41.2%, 52.0%, 86.7%, 3.0%, and 0%, respectively. For the calculation of the incidence of obstetric complications, we excluded male patients and unmarried female patients. Thus, the calibrated incidence of obstetric complications in
patients with PAPS, APS associated to other diseases, non-APS thrombosis, non-APS PRM, and SLE was 50.0%, 53.1%, 0%, 100%, and 0%, respectively. LA was detected in 73.5% of PAPS patients, 80% of patients with APS associated to other diseases, 6.7% of patients with non-APS thrombosis, 3% of patients with non-APS PRM, and 11.9% of SLE patients.

aCL (IgG, IgM, and IgA) and aβ2GP1 (IgG, IgM, and IgA) autoantibody detection by ELISA and CIA. Table 1 shows the results of IgG/IgM/IgA aCL and IgG/IgM/IgA aβ2GP1 autoantibody detection by ELISA and CIA for all tested sera. Except for IgM/IgA aβ2GP1 autoantibodies, all HC samples were negative by both assays. Similar percentages of positive results for IgG/IgM/IgA aCL and IgM/IgA aβ2GP1 autoantibodies were found with both assays. However, significantly more IgG aβ2GP1-positive sera were detected by CIA than by ELISA in both PAPS (52.9% vs. 8.8%, P < 0.001) and APS associated to other diseases (70.0% vs. 8.0%, P < 0.001). IgA aCL and IgA aβ2GP1 autoantibodies have been considered "noncriteria" antibodies for seronegative patients with clinical suspicion of APS.2 Importantly, both IgA aCL and IgA aβ2GP1 antibodies detected by either assay were significantly more frequent in patients with APS than in non-APS disease controls or HC (Table 1).

Qualitative and quantitative agreement between ELISA and CIA for aCL (IgG, IgM, and IgA) and aβ2GP1 (IgG, IgM, and IgA) determination. Generally, ELISA and CIA demonstrated good overall agreement (>90%) for IgG/IgM/IgA aCL and IgM/IgA aβ2GP1 determination. The positive and negative agreement between ELISA and CIA for these autoantibodies ranged from 40.9% to 67.4% and from 90.7% to 96.8%, respectively (Table 2). The kappa coefficient was calculated to assess the qualitative agreement between ELISA and CIA; for these antibodies it ranged from 0.52 to 0.77 (Table 2). Interestingly, ELISA and CIA showed only moderate overall agreement (76.7%) for IgG aβ2GP1 detection, with positive agreement, negative agreement, and kappa values of 14.8%, 75.7%, and 0.2, respectively (Table 2). Quantitative agreement between ELISA and CIA was determined by the Spearman correlation test. Importantly, significant quantitative correlations were identified between ELISA and CIA for IgG/IgM/IgA aCL and IgG/IgM/IgA aβ2GP1 detection (P < 0.001), with rho values ranging from 0.51 to 0.87 (Figure 1).

Clinical performance characteristics of ELISA and CIA for aCL (IgG, IgM, and IgA) and aβ2GP1 (IgG, IgM, and IgA) determination. Assay performance characteristics for the detection of IgG/IgM/IgA aCL and IgG/IgM/IgA aβ2GP1 autoantibodies were evaluated for both ELISA and CIA; the results are summarized in Table 3. For the diagnosis of APS, IgG aβ2GP1 detection by CIA (IgG aβ2GP1 CIA) demonstrated the highest sensitivity (63.1%), followed by IgG aCL CIA (48.8%), IgG aCL ELISA (36.9%), IgA aβ2GP1 ELISA (22.6%), and IgA aCL CIA (22.6%). Interestingly, IgG aβ2GP1 ELISA exhibited the lowest sensitivity (8.3%). Of note, the high sensitivity of IgG aβ2GP1 CIA did not come at the cost of specificity (93.7%) (Table 3). In addition, IgA aCL ELISA showed the highest positive predictive value (100%), and IgG aβ2GP1 CIA showed the highest negative predictive value (81.2%) (Table 3).

Table 1. Demographic characteristics and antibody profiles of APS patients and controls

                                  PAPS (n=34)           APS assoc. (n=50)     Non-APS thromb. (n=30)  Non-APS PRM (n=32)  SLE (n=42)          HC (n=39)
Sex (female/male)                 24/10                 42/8                  10/20                   32/0                39/3                14/25
Median age (max, min)             34 (9, 76)            33.5 (11, 86)         53.5 (14, 85)           35 (24, 41)         30 (12, 68)         39 (25, 65)
Arterial thrombosis, n (%)        9 (26.5)              18 (36.0)             5 (16.7)                0 (0.0)             1 (2.3)             0 (0.0)
Venous thrombosis, n (%)          14 (41.2)             26 (52.0)             26 (86.7)               1 (3.0)             0 (0.0)             0 (0.0)
Obstetric complications, n (%)*   9 (50.0)              17 (53.1)             0 (0.0)                 32 (100.0)          0 (0.0)             0 (0.0)
LA, n (%)                         25 (73.5)             40 (80.0)             2 (6.7)                 1 (3.0)             5 (11.9)            0 (0.0)
aPL** ELISA/CIA, n (%)            19 (55.9)/21 (61.8)***  29 (58.0)/37 (74.0)***  1 (3.3)/0 (0.0)     3 (9.4)/3 (9.4)     9 (21.4)/7 (16.7)   3 (7.7)/1 (2.6)
aCL IgG ELISA/CIA, n (%)          16 (47.1)/18 (52.9)   15 (30.0)/23 (46.0)   0 (0.0)/0 (0.0)         1 (3.1)/1 (3.1)     1 (2.4)/2 (4.8)     0 (0.0)/0 (0.0)
aCL IgM ELISA/CIA, n (%)          3 (8.8)/4 (11.8)      12 (24.0)/9 (18.0)    1 (3.3)/0 (0.0)         2 (6.3)/1 (3.1)     1 (2.4)/1 (2.4)     0 (0.0)/0 (0.0)
aCL IgA ELISA/CIA, n (%)          6 (17.6)/9 (26.5)***  4 (8.0)/10 (20.0)***  0 (0.0)/0 (0.0)         0 (0.0)/1 (3.1)     0 (0.0)/1 (2.4)     0 (0.0)/0 (0.0)
aβ2GP1 IgG ELISA/CIA, n (%)       3 (8.8)/18 (52.9)**** 4 (8.0)/35 (70.0)**** 0 (0.0)/0 (0.0)         1 (3.1)/2 (6.3)     1 (2.4)/7 (16.7)    0 (0.0)/0 (0.0)
aβ2GP1 IgM ELISA/CIA, n (%)       3 (8.8)/1 (2.9)       11 (22.0)/7 (14.0)    0 (0.0)/0 (0.0)         1 (3.1)/0 (0.0)     7 (16.7)/0 (0.0)    1 (2.6)/1 (2.6)
aβ2GP1 IgA ELISA/CIA, n (%)       5 (14.7)/7 (20.6)***  14 (28.0)/9 (18.0)*** 0 (0.0)/0 (0.0)         1 (3.1)/1 (3.1)     0 (0.0)/0 (0.0)     2 (5.1)/0 (0.0)

aCL = anticardiolipin; aPL = antiphospholipid antibodies; APS = antiphospholipid syndrome; aβ2GP1 = anti-beta2-glycoprotein 1; CIA = chemiluminescence assay; ELISA = enzyme-linked immunosorbent assay; LA = lupus anticoagulant; PRM = pregnancy-related morbidity; SLE = systemic lupus erythematosus. * Percentage among married women of reproductive age. ** Any positive aPL test. *** P < 0.01 (APS vs. non-APS/HC). **** P < 0.01 (ELISA vs. CIA).

ROC analysis was performed to evaluate the power of ELISA and CIA to discriminate patients with APS from controls. IgG aβ2GP1 CIA exhibited the best discrimination, with an area under the curve (AUC) of 0.86, followed by IgG aCL CIA (AUC 0.85) and IgM aβ2GP1 CIA (AUC 0.78) (Figure 2 and Table 3). Interestingly, IgM aCL ELISA, IgA aCL ELISA, and IgG aβ2GP1 ELISA showed poor discrimination, with AUCs of 0.57, 0.57, and 0.58, respectively (Figure 2 and Table 3).

Odds ratios (OR) were calculated to evaluate the performance of each autoantibody, tested by either ELISA or CIA, in predicting APS. All of the autoantibodies tested by both ELISA and CIA showed high ORs for predicting APS, ranging from 3.8 for IgA aβ2GP1 ELISA to 44.5 for IgG aCL CIA. Importantly, IgG aβ2GP1 CIA demonstrated the highest ability to predict thrombotic events in patients with APS, with an OR of 3 (95% CI: 1.1-7.9) (Table 3). However, none of the autoantibodies detected by either ELISA or CIA had much power to predict obstetric risk (Table 3).

Cluster analysis. To further illustrate the distribution of each autoantibody tested by either ELISA or CIA in patients with APS and controls, and to illustrate the relationships among these autoantibodies, a supervised cluster analysis with a dendrogram was performed. The analysis indicates that the majority of PAPS patients and patients with APS associated to other diseases were positive by IgG aβ2GP1 CIA and IgG aCL CIA (Figure 3). Some controls also showed positive results for some autoantibodies by different assays. In addition, the IgG aCL CIA cluster was found to lie closer to IgG aβ2GP1 CIA than to IgG aCL ELISA (Figure 3).
More importantly, the dendrogram shows that the IgG aβ2GP1 CIA and IgG aCL CIA clusters, and to a lesser extent the IgG aCL ELISA cluster, were the most closely related to APS of all the autoantibodies tested by either ELISA or CIA (Figure 3).

Table 2. Qualitative agreement between ELISA and CIA for aCL (IgG, IgM, and IgA) and aβ2GP1 (IgG, IgM, and IgA) detection*

ELISA vs. CIA  Overall agreement (95% CI)  Positive agreement (95% CI)  Negative agreement (95% CI)  Kappa (95% CI)
aCL IgA        94.3 (91.2-96.9)            40.9 (22.7-59.1)             94 (90.8-96.8)               0.55 (0.31-0.76)
aCL IgG        93.4 (89.9-96.5)            67.4 (52.2-80.4)             92.3 (88.8-95.9)             0.77 (0.64-0.87)
aCL IgM        95.6 (93.0-97.8)            54.5 (31.9-72.7)             95.3 (92.6-97.7)             0.68 (0.47-0.85)
aβ2GP1 IgA     91.2 (87.2-94.7)            41.9 (24.2-59.6)             90.7 (86.7-94.6)             0.52 (0.32-0.69)
aβ2GP1 IgG     76.7 (71.4-82.8)            14.8 (5.8-23.7)              75.7 (70.0-81.4)             0.20 (0.09-0.32)
aβ2GP1 IgM     96.9 (94.7-99.1)            56.3 (31.1-81.4)             96.8 (94.4-99.1)             0.71 (0.46-0.88)

* Agreements are given in percent, followed by kappa statistics (95% confidence intervals in parentheses).

[Figure 1. Quantitative correlation between ELISA and CIA for IgA/IgG/IgM aCL (panels A-C) and IgA/IgG/IgM aβ2GP1 (panels D-F), calculated with the Spearman correlation test.]

Table 3. Clinical performance characteristics of the ELISA and CIA assays for aCL (IgG, IgM, IgA) and aβ2GP1 (IgG, IgM, IgA) detection at the manufacturer's cut-off (≥20 units for every assay). The original table is transposed here; 95% CIs for specificity, PPV, and NPV are omitted for legibility.

Assay             Sensitivity % (95% CI)  Specificity %  PPV %  NPV %  AUC   OR for APS (95% CI)  OR thrombosis  OR obstetric
aCL IgG ELISA     36.9 (26.6-48.1)        98.6           93.9   72.7   0.73  41.2 (9.5-178.4)     2.4            0.9
aCL IgG CIA       48.8 (37.7-60)          97.9           93.2   76.5   0.85  44.5 (13.1-150.9)    2.4            1.2
aCL IgM ELISA     17.9 (10.4-27.7)        97.2           79.0   66.8   0.57  7.6 (2.4-23.6)       0.6            0.3
aCL IgM CIA       15.5 (9.5-25)           98.6           88.7   66.5   0.76  12.9 (2.8-58.8)      0.7            0.3
aCL IgA ELISA     11.9 (5.9-20.8)         100            100.0  65.9   0.57  40.5 (2.3-700.3)     1.2            1.2
aCL IgA CIA       22.6 (14.2-33.1)        98.6           90.5   68.5   0.76  20.6 (4.7-91.1)      1.6            1.0
aβ2GP1 IgG ELISA  8.3 (3.4-16.4)          98.6           77.8   64.7   0.58  6.4 (1.3-31.6)       1.3            0.4
aβ2GP1 IgG CIA    63.1 (51.9-73.4)        93.7           85.5   81.2   0.86  25.5 (11.4-57.1)     3.0 (1.1-7.9)  1.2
aβ2GP1 IgM ELISA  16.7 (9.4-26.4)         98.6           87.5   66.8   0.62  14.1 (3.1-63.8)      0.5            0.3
aβ2GP1 IgM CIA    9.5 (4.2-17.9)          99.3           88.9   65.1   0.78  15.0 (1.8-121.8)     1.3            0.4
aβ2GP1 IgA ELISA  22.6 (14.2-33.1)        93.0           65.5   67.2   0.62  3.8 (1.7-8.8)        1.1            0.4
aβ2GP1 IgA CIA    19.1 (11.3-29.8)        99.3           94.1   67.6   0.74  33.4 (4.3-257.3)     1.7            1.0

At cut-offs reset to give 95.0% specificity, sensitivities in APS ranged from 16.7% (IgG aβ2GP1 ELISA, cut-off 9.76 SGU) to 85.7% (IgA aβ2GP1 CIA, cut-off 1.05 CU); the original table also reports the corresponding odds ratios for APS, thrombosis, and obstetric complications at those cut-offs.

DISCUSSION

The major findings of this study are: CIA strikingly increased the sensitivity of IgG aβ2GP1 antibody detection without loss of specificity in patients with APS, compared with ELISA; IgG aβ2GP1 detected by CIA predicted thrombotic events in patients with APS, while IgG aβ2GP1 detected by ELISA, and the other antibodies detected by either method, showed poor ability to predict thrombotic risk; ELISA and CIA exhibited good overall qualitative agreement (>90%) for IgG/IgM/IgA aCL and IgM/IgA aβ2GP1, but only moderate overall agreement (76.7%) for IgG aβ2GP1; and the two methods demonstrated significant quantitative correlations for IgG/IgM/IgA aCL and IgG/IgM/IgA aβ2GP1 detection. Our findings support CIA as a promising alternative to ELISA for the detection of aCL and aβ2GP1 autoantibodies, especially IgG aβ2GP1.

We found that CIA strikingly increased the sensitivity of IgG aβ2GP1 detection compared with ELISA. Importantly, the increased sensitivity did not sacrifice specificity, PPV, or NPV. Several factors may contribute, such as differences in the detection system and antigens: the CIA uses full-length recombinant aβ2GP1 expressed in insect cells as the antigen on the BIO-FLASH instrument. Mondejar et al10 reported that IgG aβ2GP1 detected by CIA and by ELISA had comparable sensitivity in APS patients from Spain using the same CIA and ELISA systems, although a trend toward higher sensitivity with CIA was observed. Interestingly, the sensitivity of IgG aβ2GP1 detected by CIA was higher in our study than in Mondejar's,10 although the sensitivity of IgG aβ2GP1 detected by ELISA was much lower.

Despite the widespread use of ELISA for detecting aCL and aβ2GP1 autoantibodies in clinical settings, several limitations, such as low reproducibility and substantial
interlaboratory variation, have challenged the role of ELISA in accurately evaluating the risk of developing APS-related complications.4,18 aβ2GP1 autoantibodies have been recognized as the main pathogenic subset among aPLs, especially with respect to thrombotic events.12,17 However, we did not observe any association between IgG aβ2GP1 autoantibodies detected by ELISA and thrombotic events; in contrast, we did identify such an association using the CIA assay. Interestingly, Moerloose et al12 reported that IgG aβ2GP1 determined by both ELISA (QUANTA Lite, INOVA) and CIA (HemosIL AcuStar) correlated significantly with thrombotic events in European patients with APS. As we used the same ELISA kit, the discrepancies may be due to different ethnic/geographic backgrounds or to the substantial interlaboratory variation in ELISA testing mentioned earlier.

Our results revealed good qualitative and quantitative agreement between CIA and ELISA for IgG/IgM aCL and IgM aβ2GP1 determination. For IgG aCL detection, CIA and ELISA showed good overall, positive, and negative agreement of 93.4%, 67.4%, and 92.3%, respectively, similar to what was previously described by Mondejar et al10 (90.1%, 68.4%, and 95.1%). For IgM aCL and IgM aβ2GP1 detection, however, the overall and positive agreement between CIA and ELISA in our study were higher than those reported there.

[Figure 2. Receiver-operating characteristic (ROC) analysis of aCL (IgA, IgG, IgM) and aβ2GP1 (IgA, IgG, IgM) antibodies detected by ELISA or CIA for distinguishing patients with APS (n = 84) from controls (n = 143).]

[Figure 3. Supervised cluster analysis by disease cohort (PAPS, APS associated to other diseases, non-APS thrombosis, non-APS PRM, SLE, and healthy controls), with a dendrogram illustrating the relationships among the antibodies in the diagnosis of APS.]
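For readers who want to reproduce this style of agreement analysis, the sketch below computes the two statistics used throughout the paper (Spearman's rho for quantitative agreement, Cohen's kappa for qualitative agreement) on made-up paired titers; the cut-off of 20 units mirrors the manufacturer's cut-offs quoted in Table 3.

import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import cohen_kappa_score

# Made-up paired titers for one antibody measured on the same six sera.
elisa = np.array([5.0, 12.0, 25.0, 40.0, 8.0, 60.0])
cia = np.array([4.0, 15.0, 30.0, 35.0, 6.0, 80.0])

rho, p = spearmanr(elisa, cia)                     # quantitative agreement
kappa = cohen_kappa_score(elisa >= 20, cia >= 20)  # qualitative agreement at cut-off 20
print(rho, p, kappa)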
Kleibergen-Paap rk Wald F Critical Values: A Q&A

The Kleibergen-Paap rk Wald F statistic is a statistical test used in econometrics. Applied to the right problems, it provides more precise and reliable results. This article answers questions about the statistic step by step and explains its application in econometrics.

Step 1: understand the principle and purpose. The Kleibergen-Paap rk Wald F statistic was proposed by the econometricians Kleibergen and Paap in 2006. It is used in instrumental-variable estimation of causal effects, where common problems in applied econometrics, such as endogeneity and selection bias, must be addressed.

Step 2: define the question and state the hypotheses. Before applying the statistic, we must fix the research question and state the corresponding hypotheses. For example, suppose we want to study the effect of education on income, and we hypothesize that education has a positive causal effect on income.

Step 3: collect data and run the regression. Next, we collect the relevant data and run a linear regression, with education as the explanatory variable and income as the dependent variable, while controlling for other potential influences.

Step 4: test for endogeneity. Econometric analyses often face endogeneity, i.e., correlation between a regressor and the error term. To address it, the Kleibergen-Paap rk Wald F statistic can be used within a two-stage least squares setup, in which a constructed (instrumented) variable replaces the endogenous regressor in the regression equation.

Step 5: compute the critical value. The critical values are derived from econometric theory and mathematical statistics and are provided by econometric software. Comparing the statistic against the critical value tells us whether the instruments are strong enough for the endogeneity correction to be reliable.

Step 6: interpret and validate the result. Finally, we interpret the computed result. If the Kleibergen-Paap rk Wald F statistic falls below the relevant critical value, the instruments are weak and the model needs further adjustment. A simulation sketch follows.
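Below is a sketch of steps 3-5 on simulated data with statsmodels. In the just-identified case shown here (one endogenous regressor, one instrument), the Kleibergen-Paap rk Wald F reduces to the heteroskedasticity-robust first-stage F, i.e., the squared robust t-statistic on the instrument; the general statistic and its critical values come from dedicated software such as Stata's ivreg2.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)                       # instrument
u = rng.normal(size=n)                       # unobserved confounder
educ = 0.5 * z + u + rng.normal(size=n)      # endogenous regressor
income = 1.0 * educ + u + rng.normal(size=n)

# First stage: endogenous regressor on the instrument, robust covariance.
first = sm.OLS(educ, sm.add_constant(z)).fit(cov_type="HC0")
f_robust = float(first.tvalues[1]) ** 2      # robust first-stage F
print("robust first-stage F:", f_robust)     # compare against the critical value

# Second stage of 2SLS: replace educ by its first-stage fitted values.
# (Coefficient only; manual 2SLS standard errors are not valid.)
second = sm.OLS(income, sm.add_constant(first.fittedvalues)).fit()
print("2SLS estimate of the effect of educ:", second.params[1])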
Classic GPT Q&A

GPT is a deep-learning-based natural language processing model that can handle question answering, text generation, dialogue, and other tasks. Below are some classic questions and answers about GPT.

1. What is the GPT model? GPT is a Transformer-based language model capable of natural language tasks. Through large-scale pre-training and fine-tuning, it can generate coherent, logically structured text.

2. How is GPT pre-trained? GPT is pre-trained with unsupervised learning on large amounts of text data, such as Wikipedia and web articles, learning the probability distribution of language. Concretely, it works autoregressively: it predicts the probability distribution of the next word.

3. What are GPT's application scenarios? GPT can be applied to many natural language processing tasks, such as text generation, machine translation, question answering, and dialogue systems. It excels at text generation and can be used for automatic writing, intelligent customer service, and similar scenarios.

4. What are GPT's strengths? First, it generates coherent text with strong expressive power; second, it has some logical reasoning ability and can answer complex questions; third, the model structure is simple and easy to implement and use.

5. What are GPT's limitations? First, its understanding of long texts is limited and it tends to forget earlier content; second, it may in some cases generate inaccurate or incorrect information; third, its understanding of texts from new domains is weaker.

6. How has GPT been improved? To address some of these problems, researchers have made a series of improvements, for example strengthening the model's grasp of context through attention mechanisms, and using pre-training plus fine-tuning to boost performance on specific tasks.

7. How does GPT differ from BERT? Both are Transformer-based language models, but they differ in how they are pre-trained and fine-tuned. GPT is pre-trained autoregressively, whereas BERT is pre-trained as an autoencoder. Moreover, GPT is better suited to generative tasks, and BERT to discriminative tasks.

8. How is GPT used in question-answering systems? GPT can serve as the answer-generation module of a question-answering system.
Notable Applications of Generative AI

Generative artificial intelligence is an AI approach based on machine learning and neural networks: by learning from large collections of samples and using pattern recognition and generative models, it can generate and create in domains such as natural language, images, and audio. Generative AI has already achieved remarkable results in several fields; some well-known application cases are described below.

1. Natural language generation. Natural language generation (NLG) is an important branch of generative AI. Its core task is to produce natural-language text that obeys grammatical, logical, and semantic rules, based on a language model and background knowledge. The technology is already widely used in many fields; the best-known example is OpenAI's GPT family of models.

GPT-3. OpenAI's GPT-3 (Generative Pre-trained Transformer 3) is widely regarded as one of the most advanced natural language generation models. It uses a very large neural network architecture with large-scale pre-trained parameters and performs well on many natural language processing tasks. GPT-3 can generate high-quality articles, news reports, and story lines, and can also assist with writing, translation, and dialogue generation; it has shown great potential in many application scenarios.

2. Image generation. Generative AI has also achieved a series of striking results in image generation. The relevant techniques enable style transfer, image inpainting, and image synthesis, and can even create unprecedented artworks. Well-known image-generation applications include DeepDream and StyleGAN.

DeepDream. Google's DeepDream is an image-generation algorithm based on convolutional neural networks. Its core idea is to apply many iterations of gradient ascent to an image in order to amplify the specific features the network detects in it. DeepDream can generate distinctive artistic images and can also be used for image enhancement and artistic creation.

StyleGAN. StyleGAN is an image-generation model based on generative adversarial networks (GANs); its innovation lies in generating diverse and highly realistic face images.
D-S Evidence Theory

1. Origins. D-S evidence theory originated in the 1960s, when the Harvard mathematician A. P. Dempster used upper and lower probabilities to treat multivalued mappings; the series of papers he published from 1967 onward marks the formal birth of evidence theory. Dempster's student G. Shafer developed the theory further, introducing the concept of belief functions and shaping a mathematical framework that handles uncertain reasoning through "evidence" and "combination". D-S theory extends Bayesian inference: Bayesian reasoning relies on conditional probability and requires prior probabilities, whereas D-S evidence theory needs no priors, can represent "uncertainty" well, and is widely used for handling uncertain data. Typical applications: information fusion, expert systems, intelligence analysis, legal case analysis, and multi-attribute decision analysis.

2. Basic concepts.

Definition 1 (basic probability assignment, BPA). Let U be a frame of discernment. A function m: 2^U -> [0, 1] is a basic probability assignment if (1) m(empty set) = 0 and (2) the sum of m(A) over all A contained in U equals 1. m(A) is the basic belief assigned to A; m is also called the mass function.

Definition 2 (belief function). Bel: 2^U -> [0, 1], with Bel(A) = sum of m(B) over all B contained in A; the belief in A is the sum of the basic probability assignments of all subsets of A.

Definition 3 (plausibility function). Pl(A) = 1 - Bel(not A) = sum of m(B) over all B with B intersect A nonempty. The plausibility of A expresses the degree to which A is not doubted: the sum of the masses of all sets intersecting A.

Definition 4 (belief interval). [Bel(A), Pl(A)] is the belief interval of proposition A, with the belief function Bel(A) as the lower bound and the plausibility function Pl(A) as the upper bound. For example, (0.25, 0.85) means there is 0.25 belief that A is true, 0.15 belief that A is false, and 0.6 uncertainty about A.

3. Dempster's combination rule. For n mass functions m1, ..., mn, the combined mass is

m(A) = (1/(1-K)) * sum over A1 intersect ... intersect An = A of m1(A1) m2(A2) ... mn(An), for A nonempty,

where K = sum over A1 intersect ... intersect An = empty of m1(A1) m2(A2) ... mn(An) reflects the degree of conflict among the pieces of evidence, and 1/(1-K) is the normalization factor.

4. Decision rule. Suppose A1, A2 contained in U satisfy m(A1) = max{m(Ai), Ai contained in U} and m(A2) = max{m(Ai), Ai contained in U and Ai != A1}. If m(A1) - m(A2) > eps1, m(Theta) < eps2, and m(A1) > m(Theta), then A1 is the decision result, where eps1 and eps2 are preset thresholds and Theta is the uncertain set.

5. Known problems of D-S evidence theory.

(1) It cannot handle highly or completely conflicting evidence. Take the frame of discernment {Peter, Paul, Mary} with basic probability assignments m{Peter}, m{Paul}, m{Mary}. Analysis with the basic concepts and the combination rule shows that although witnesses W1 and W2 assign 0.99 to Peter and to Mary respectively, the severe conflict between them drives the combined belief for both to 0, which clearly contradicts the actual situation. In the more extreme case where W1 gives m{Peter} = 1 and W2 gives m{Mary} = 1, the normalization is impossible (K = 1) and Dempster's rule cannot be applied. A numerical sketch of this example follows.

(2) It is difficult to judge degrees of fuzziness, since the fuzziness of evidence in the theory mainly comes from the fuzziness of the individual subsets.
Predicting hardness of dense C3N4 polymorphs

Julong He, Licong Guo, Xiaoju Guo, Riping Liu, and Yongjun Tian (Key Laboratory of Metastable Materials Science and Technology, Yanshan University, Qinhuangdao 066004, China); Huitian Wang (National Laboratory of Solid State Microstructures, Nanjing University, Nanjing 210093, China); Chunxiao Gao (State Key Laboratory for Superhard Materials, Jilin University, Changchun 130023, China)

Applied Physics Letters 88, 101906 (2006); DOI: 10.1063/1.2182109. Received 3 December 2005; accepted 23 January 2006; published online 6 March 2006.

We report calculations of the Vickers hardness of five predicted C3N4 polymorphs using the microscopic model of hardness. The hardest phase, cubic C3N4, has a hardness of 92.0 GPa, softer than diamond, although its modulus is higher than that of diamond. The densest phase, cubic spinel C3N4, has the lowest hardness of the five polymorphs, 62.3 GPa. Our analysis suggests that the hardness of simple-structured covalent materials might not exceed that of diamond.

Superhard materials are widely used in industry for fast machining and drilling. Intense theoretical and experimental efforts have been focused on the possibility of finding novel superhard materials. Because of the short C-N bond length, novel carbon nitrides are expected to be candidates in the superhard materials family.1,2 Recent particular interest in superhard materials arises from a theoretical prediction that the bulk modulus of hypothetical beta-C3N4 is very close to that of diamond.2 This result has motivated theoretical studies3-6 and experimental synthesis7-15 of carbon nitride compounds.
Further theoretical calculations proposed five dense polymorphs of C3N4, of which cubic C3N4 has a bulk modulus greater than that of diamond.16,17 Cubic C3N4 is therefore expected to be a material harder than diamond. Early claims of obtaining alpha- and beta-C3N4 phases were regarded with suspicion, but a recent experimental result gave more convincing evidence for the existence of beta-C3N4,18 which renews the hope of synthesizing superhard C3N4 crystals. Unfortunately, its hardness and bulk properties could not be determined, because the single crystals are only about 60-80 nm in size. There has been no report of a hardness measurement on single-crystalline carbon nitrides up to now.

An empirical correlation has been proposed between hardness and Young's modulus19-21 or the shear modulus.22 Brazhkin et al.23 also discussed the correlation of hardness with shear modulus and other properties in detail. However, the dependence of hardness on modulus or other properties is, for a covalent material, neither unequivocal nor monotonic.24 The hardness values of the five dense polymorphs remain puzzling, because moduli and other properties may not be the best indicators of hardness.25 The speculation3,26 that some C3N4 polymorphs might be harder than diamond, based on their moduli, is therefore questionable.25 To clarify this argument, we calculate here in detail the hardness of the dense C3N4 polymorphs.

Hardness was first defined as the ability of one mineral to scratch another.27 In a general sense, hardness is the resistance offered by a given material to external mechanical action.22 When an indenter is pressed into the surface of a covalent material, the chemical bonds in the material around the indenter suffer the combined actions of compression, shear, and tension. Hardness is a complicated mechanical property of a material, distinct from bulk modulus: bulk modulus is the resistance of a material to volume change, whereas hardness is the resistance of its chemical bonds to breaking, corresponding microscopically to the transition of valence electrons. Based on this idea, we recently presented a microscopic model of hardness for covalent and polar covalent crystals.24 The hardness formula is accurate to within 10%; for superhard materials (diamond and c-BN) with Vickers hardness above 60 GPa, the accuracy is within 5%.24 Because the dense C3N4 polymorphs are typical polar covalent solids, the calculated hardness values for these phases are valid.

In this letter, we consider the five dense C3N4 polymorphs proposed in the literature: beta-C3N4, alpha-C3N4, pseudocubic C3N4 (p-C3N4), cubic C3N4 (c-C3N4), and cubic spinel C3N4 (cs-C3N4).2,16,17 To obtain the physical properties and bond parameters of the five polymorphs needed for the hardness calculation within our microscopic model, we performed first-principles calculations in the framework of density functional theory as implemented in the CASTEP code, using the same technique as in our previous work for Mulliken population calculations.28 To obtain the overlap population of a chemical bond accurately, unit cells or supercells of 56 atoms were constructed. The beta-C3N4 structure [Fig. 1(a)] is based on the hexagonal beta-Si3N4 structure of space group P3, with C substituted for Si. The alpha-C3N4 structure [Fig. 1(b)] differs from the beta-C3N4 structure, exhibits the space group P31c, and can be described as an ABAB... stacking sequence of layers of beta-C3N4 (A) and its mirror image (B). The p-C3N4 structure [Fig. 1(c)] is based on the pseudocubic alpha-CdIn2Se4 structure
and has the space group P-42m, with C substituted for Cd and In, and N substituted for Se. The c-C3N4 structure [Fig. 1(d)] is the high-pressure willemite-II structure of Zn2SiO4, with C substituted for Zn and Si, and N substituted for O, and has the space group I-43d. The above four structures consist of fourfold-coordinated carbon linked by threefold-coordinated nitrogen atoms in CN4 tetrahedra. Similar to the synthesized cubic spinel cs-Si3N4,29 the cs-C3N4 structure [Fig. 1(e)] is constructed with C substituted for Fe and N substituted for O in the spinel structure of Fe3O4. There are two coordination configurations of C and N in the cs-C3N4 structure of space group Fd-3m: the first consists of fourfold-coordinated carbon linked by fourfold-coordinated nitrogen atoms in CN4 tetrahedra, and the second consists of sixfold-coordinated carbon linked by fourfold-coordinated nitrogen atoms in CN6 octahedra.

Among the five polymorphs of C3N4 shown in Table I, alpha-C3N4 has the lowest energy; beta-C3N4, c-C3N4, and p-C3N4 have slightly higher energies, and cs-C3N4 has the highest energy. The stability sequence of the five polymorphs is therefore alpha-, beta-, c-, p-, and then cs-C3N4. The tetrahedrally coordinated structures beta-C3N4, alpha-C3N4, p-C3N4, and c-C3N4 have one type of C-N bond and exhibit similar bond lengths (see Table II). In the cs-C3N4 structure, the C atoms are four- and sixfold coordinated by N in a 1:2 ratio, so there are two types of C-N bond lengths (see Table III): 1.5183 and 1.6419 Å for tetrahedrally and octahedrally coordinated C atoms, respectively.

For beta-, alpha-, p-, and c-C3N4, which have one type of C-N bond with C atoms tetrahedrally coordinated as the Si atoms are in beta-Si3N4, our microscopic model24 gives the Vickers hardness as

H_V = 350 N_e^(2/3) e^(-1.191 f_i) / d^2.5,   (1)

where d is the C-N bond length, N_e is the valence electron density, which can be calculated from Eq. (2) in Ref. 30, and f_i is the ionicity of the C-N bond. For cs-C3N4, which has two types of chemical bonds in the unit cell [8 C(t)-N and 24 C(o)-N bonds, with C atoms tetrahedrally and octahedrally coordinated, respectively], the Vickers hardness is

H_V = [(H_V^t)(H_V^o)^3]^(1/4),   (2)

where H_V^x = 350 (N_e^x)^(2/3) e^(-1.191 f_i^x) / (d^x)^2.5 is the hardness of a hypothetical binary compound composed only of C(t)-N or C(o)-N bonds, and x stands for t or o. The valence electron densities N_e^t and N_e^o of the hypothetical compounds composed of C(t)-N and C(o)-N bonds can be calculated using Eqs. (4) and (5) in Ref. 30. According to our generalized ionicity scale,28 the ionicity f_i (or f_i^x) of a C-N bond can be calculated as

f_i = (f_h)^0.735 = [1 - exp(-|P_c - P|/P)]^0.735,   (3)

where f_h is the ionicity scale of a bond based on bond overlap population, P is the overlap population of a C-N bond, and P_c is the overlap population of the corresponding bond in a pure covalent crystal with the same type of coordination. Because bonds in differently coordinated structures have different P_c values,28 we select the known crystals beta-Si3N4 and cs-Si3N4 to determine the P_c values of the C-N bonds in the five C3N4 polymorphs. For beta-Si3N4, the ionicity f_i of the Si-N bond equals 0.4,30 and the calculated average overlap population of the Si-N bonds in beta-Si3N4 is 0.68. Using Eq. (3), we find P_c = 0.91 for the A-B bonds in A3B4 polymorphs with the same coordination as beta-Si3N4. From the data for the C-N bonds in Table II, we calculated the Vickers hardness of the four tetrahedral C3N4 polymorphs listed there. In cs-Si3N4, P_c remains 0.91 for the Si(t)-N bonds; from the experimental Vickers hardness of 35 GPa for cs-Si3N4,31 we obtain P_c = 0.57 for the Si(o)-N bonds using Eqs. (2) and (3). In other words, the P_c values are 0.91 and 0.57 for A(t)-B and A(o)-B bonds in the A3B4 spinel structure. From the data for the C(t)-N and C(o)-N bonds in Table III, we calculated the Vickers hardness of cs-C3N4 listed there.

All five C3N4 polymorphs studied here are superhard materials. The c-C3N4 phase, whose modulus is greater than that of diamond, is the hardest one; however, its Vickers hardness of 92.0 GPa is about 5.4% smaller than the calculated value of 97.3 GPa for diamond.

[Fig. 1. The C3N4 unit cells for (a) beta-C3N4 (two C3N4 units); (b) alpha-C3N4 (four units); (c) p-C3N4 (one unit); (d) c-C3N4 (four units); (e) cs-C3N4 (eight units). Carbon and nitrogen atoms are shown as black and white spheres, respectively.]
polymorphs of C 3N 4.For the -Si 3N 4,the ionicity f i of Si–N bond is equal to 0.4,30and the calculated average overlap population of Si–N bonds is 0.68in the -Si 3N ing Eq.͑3͒,we find that P c =0.91for the A–B bonds in the polymorphs of A 3B 4with the same coordinate as the -Si 3N 4.From the data of C–N bonds in Table II,we calcu-lated the Vickers hardness of the four C 3N 4polymorphs and listed the results in Table II.In the c s -Si 3N 4,the value of the P c is still equal to 0.91for the Si ͑t ͒–N bonds.From the experimental Vickers hardness of 35GPa for the c s -Si 3N 4,31we obtain P c =0.57for the Si ͑o ͒–N bonds calculated by us-ing Eqs.͑2͒and ͑3͒.In other words,the P c values are 0.91and 0.57for A ͑t ͒–B and A ͑o ͒–B bonds in spinel structure of A 3B 4.From the data of C ͑t ͒–N and C ͑o ͒–N bonds in Table III,we calculated the Vickers hardness of c s -C 3N 4and listed it in Table III.For the five polymorphs of C 3N 4that we studied,all of them are superhard materials.The c -C 3N 4with the modulus greater than diamond is the hardest one,however,its Vickers hardness of 92.0GPa is about 5.4%smaller than thecalcu-FIG.1.The C 3N 4unit cells for:͑a ͒-C 3N 4containing two C 3N 4units;͑b ͒␣-C 3N 4containing four C 3N 4units;͑c ͒p -C 3N 4containing one C 3N 4unit;͑d ͒c -C 3N 4containing four C 3N 4units;and ͑e ͒c s -C 3N 4containing eight C 3N 4units.The carbon and nitrogen atoms are represented as black and white spheres,respectively.TABLE I.Total energy E tot ,lattice parameters a and c ,density ,and bulk modulus B 0of diamond and five C 3N 4polymorphs.The data in parentheses are experimental values.Parameters Diamond-C 3N 4␣-C 3N 4p C 3N 4c -C 3N 4c s -C 3N 4Space group Fd 3¯m P 3P 31c p 4¯2m I 4¯3d Fd 3¯m E tot ͑eV/atom ͒−155.825−221.846−221.984−221.650−221.674−220.490a ͑Å͒ 3.5363͑3.567a ͒ 6.4032 6.4678 3.4331 5.4094 6.7138c ͑Å͒— 2.4053 4.7117———V ͑Å3/unit cell ͒44.2285.40170.7040.46158.30302.60͑g/cm 3͒ 3.607 3.579 3.582 3.778 3.862 4.041B 0͑GPa ͒438.8͑443b ͒419.1378.7393.2449.2379.2a Reference 35.bReference 22.Downloaded 27 Jun 2006 to 202.206.252.10. 
This suggests that it would be impossible to find novel superhard materials with hardness exceeding that of diamond among the carbon nitrides. Going from the lowest Vickers hardness of 62.3 GPa to the highest value of 92.0 GPa, the corresponding structures are cs-, p-, alpha-, beta-, and c-C3N4. An interesting fact is that hardness also has no obvious relation to bulk modulus or density within the same material, C3N4, across its polymorphs: although c-C3N4 has both the highest bulk modulus (449.2 GPa) and the highest hardness (92.0 GPa), alpha-C3N4, whose bulk modulus is lower, is harder than p-C3N4; similarly, the densest structure, cs-C3N4, has the lowest hardness.32

From the traditional point of view, the conditions a covalent crystal must fulfill to be superhard are high bulk modulus and high shear modulus,27 in particular high bulk modulus. On the basis of the calculations presented here, the traditional method of searching for potential superhard materials from the modulus of a covalent solid is not accurate enough: the hardness of all C3N4 polymorphs is smaller than that of diamond. Is it possible to find other simple-structured covalent materials harder than diamond? To address this issue, one has to focus particularly on light-element compounds. Consider possible light-element compounds with bond lengths (in Å):32 C-N: 1.47; B-O: 1.48; B-N: 1.57; B-C: 1.57; Si-O: 1.61; Be-O: 1.67; Si-N: 1.74; Si-C: 1.88. Our theoretical model indicates that the hardness of a covalent material is determined by bond density, bond length, and the ionicity of the covalent bond.24 Even assuming that the bond densities of these binary compounds equal that of diamond, which has the smallest atomic volume among known materials, the hardness of B-N, B-C, Si-O, Be-O, Si-N, and Si-C compounds would not exceed that of diamond, because the hardness of a covalent crystal is inversely proportional to d^2.5. According to the positions of B and O in the periodic table, the ionicity of the B-O bond should be greater than the 0.256 of the B-N bond,33,34 so the hardness of covalent B-O compounds would also not exceed that of diamond. Other possible candidates for compounds harder than diamond are compact structures composed of both heavy and light atoms;27 however, the longer chemical bonds and greater ionicity in such structures lead to lower hardness.

In summary, we have calculated the Vickers hardness of five predicted C3N4 polymorphs using the microscopic model of hardness. Our results show that all five predicted C3N4 polymorphs are superhard materials; the c-C3N4 polymorph is the hardest, and its Vickers hardness is lower than that of diamond. Our analysis suggests that it is unlikely that novel superhard materials harder than diamond will be found among simple-structured covalent materials.

This work was supported by the National Natural Science Foundation of China (Grant Nos. 50225207, 50372055, and 50472051) and by the National Basic Research Program of China (Grant No. 2005CB724400).

1. M. L. Cohen, Phys. Rev. B 32, 7988 (1985).
2. A. Y. Liu and M. L. Cohen, Science 245, 841 (1989).
3. A. Y. Liu and M. L. Cohen, Phys. Rev. B 41, 10727 (1990).
4. J. L. Corkill and M. L. Cohen, Phys. Rev. B 48, 17622 (1993).
5. J. E. Lowther, Phys. Rev. B 57, 5724 (1998).
6. I. Alves, G. Demazeau, B. Tanguy, and F. Weill, Solid State Commun. 109, 697 (1999).
7. C. M. Niu, Y. Z. Lu, and C. M. Lieber, Science 261, 334 (1993).
8. K. M. Yu, M. L. Cohen, E. E. Haller, W. L. Hansen, A. Y. Liu, and I. C.
Wu, Phys. Rev. B 49, 5034 (1994).
9. D. Marton, K. J. Boyd, A. H. Al-Bayati, S. S. Todorov, and J. W. Rabalais, Phys. Rev. Lett. 73, 118 (1994).
10. H. Sjöström, S. Stafström, M. Boman, and J.-E. Sundgren, Phys. Rev. Lett. 75, 1336 (1995).
11. M. R. Wixom, J. Am. Ceram. Soc. 73, 1973 (1996).
12. Y. Peng, T. Ishigaki, and S. Horiuchi, Appl. Phys. Lett. 73, 3671 (1998).
13. H. Montigaud, B. Tanguy, I. Demazeau, I. Alves, and S. Courjault, J. Mater. Sci. 35, 2547 (2000).
14. Z. Zhang, K. Leinenweber, M. Bauer, L. A. J. Garvie, P. F. McMillan, and G. H. Wolf, J. Am. Chem. Soc. 123, 7788 (2001).
15. C. B. Cao, Q. Lv, and H. S. Zhu, Diamond Relat. Mater. 12, 1070 (2002).
16. D. M. Teter and R. J. Hemley, Science 271, 53 (1996).
17. S.-D. Mo, L. Ouyang, and W. Y. Ching, Phys. Rev. Lett. 83, 5046 (1999).
18. L. W. Yin, Y. Bando, M. S. Li, Y. X. Liu, and Y. X. Qi, Adv. Mater. (Weinheim, Ger.) 15, 1840 (2003).
19. J. J. Gilman, J. Appl. Phys. 39, 6086 (1968).
20. A. P. Gerk, J. Mater. Sci. 12, 735 (1977).
21. D. G. Clerc and H. M. Ledbetter, J. Phys. Chem. Solids 59, 1071 (1998).
22. D. M. Teter, MRS Bull. 23, 22 (1998).
23. V. V. Brazhkin, A. G. Lyapin, and R. J. Hemley, Philos. Mag. A 82, 231 (2002).
24. F. M. Gao, J. L. He, E. Wu, S. M. Liu, D. L. Yu, D. C. Li, S. Y. Zhang, and Y. J. Tian, Phys. Rev. Lett. 91, 015502 (2003).
25. G. Ceder, Science 280, 1099 (1998).
26. J. V. Badding, Adv. Mater. (Weinheim, Ger.) 11, 877 (1997).
27. J. M. Léger and J. Haines, Endeavour 21, 121 (1997).
28. J. L. He, E. Wu, H. T. Wang, R. P. Liu, and Y. J. Tian, Phys. Rev. Lett. 94, 015504 (2005).
29. A. Zerr, G. Miehe, G. Serghiou, M. Schwarz, E. Kroke, R. Riedel, H. Fueß, P. Kroll, and R. Boehler, Nature (London) 400, 340 (1999).
30. J. L. He, L. C. Guo, D. L. Yu, R. P. Liu, Y. J. Tian, and H. T. Wang, Appl. Phys. Lett. 85, 5571 (2004).
31. J. Z. Jiang, F. Kragh, D. J. Frost, K. Ståhl, and H. Lindelov, J. Phys.: Condens. Matter 13, L515 (2001).
32. J. Haines, J. M. Léger, and G. Bocquillon, Annu. Rev. Mater. Res. 31, 1 (2001).
33. J. C. Phillips and J. A. V. Vechten, Phys. Rev. Lett. 23, 1115 (1969); J. A. V. Vechten, Phys. Rev. 187, 1007 (1969).
34. B. F. Levine, J. Chem. Phys. 59, 1463 (1973).
35. E. Knittle, R. B. Kaner, R. Jeanloz, and M. L. Cohen, Phys. Rev. B 51, 12149 (1995).
36. R. A. Andrievski, Int. J. Refract. Met. Hard Mater. 19, 447 (2001).

Table II. Bond parameters and calculated Vickers hardness of diamond and four C3N4 polymorphs. The experimental hardness of diamond is given in parentheses.

Parameter     Diamond         beta-C3N4  alpha-C3N4  p-C3N4  c-C3N4
d (Å)         1.5313          1.4519     1.4522      1.4749  1.4624
P             n/a             0.79       0.77        0.75    0.81
f_i           0               0.237      0.267       0.297   0.205
N_e (1/Å^3)   0.724           0.749      0.750       0.791   0.809
H_V (GPa)     97.3 (96±5)^a   85.7       82.7        79.6    92.0

^a Reference 36.

Table III. Bond parameters and calculated Vickers hardness of the cs-C3N4 polymorph.

Coordination  d (Å)   N_e (1/Å^3)  P     P_c   f_i   H_V^x (GPa)  H_V (GPa)
Tetrahedral   1.5183  1.14         0.69  0.91  0.38  85.5         62.3
Octahedral    1.6419  0.73         0.46  0.57  0.32  56.1         62.3
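As a check on the numbers above, Eqs. (1) and (2) can be evaluated directly from the listed bond parameters; the sketch below reproduces several Table II and Table III entries (small differences are rounding).

import math

def vickers_hardness(d, n_e, f_i):
    # Eq. (1): H_V = 350 * Ne^(2/3) * exp(-1.191 * f_i) / d^2.5, in GPa.
    return 350.0 * n_e ** (2.0 / 3.0) * math.exp(-1.191 * f_i) / d ** 2.5

# Reproduce Table II entries (d in angstrom, Ne in 1/angstrom^3):
for name, d, n_e, f_i in [("diamond", 1.5313, 0.724, 0.0),
                          ("beta-C3N4", 1.4519, 0.749, 0.237),
                          ("c-C3N4", 1.4624, 0.809, 0.205)]:
    print(name, round(vickers_hardness(d, n_e, f_i), 1))  # ~97.3, 85.7, 92.0

# Eq. (2) for spinel cs-C3N4: geometric mean over the two bond types.
h_t = vickers_hardness(1.5183, 1.14, 0.38)   # C(t)-N bonds, ~85.5 GPa
h_o = vickers_hardness(1.6419, 0.73, 0.32)   # C(o)-N bonds, ~56.1 GPa
print(round((h_t * h_o ** 3) ** 0.25, 1))    # ~62.3 GPa, as in Table III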