Ontology language extensions to support localized semantics, modular reasoning, and collabo
2000 【ng0500_25】 Gene Ontology-tool for the unification of biology

Gene Ontology: tool for the unification of biology

The Gene Ontology Consortium

Michael Ashburner 1, Catherine A. Ball 3, Judith A. Blake 4, David Botstein 3, Heather Butler 1, J. Michael Cherry 3, Allan P. Davis 4, Kara Dolinski 3, Selina S. Dwight 3, Janan T. Eppig 4, Midori A. Harris 3, David P. Hill 4, Laurie Issel-Tarver 3, Andrew Kasarskis 3, Suzanna Lewis 2, John C. Matese 3, Joel E. Richardson 4, Martin Ringwald 4, Gerald M. Rubin 2 & Gavin Sherlock 3

1 FlyBase (http://www.fl). 2 Berkeley Drosophila Genome Project (http://fruitfl). 3 Saccharomyces Genome Database (). 4 Mouse Genome Database and Gene Expression Database (). Correspondence should be addressed to J.M.C. (e-mail: cherry@) and D.B. (e-mail: botstein@), Department of Genetics, Stanford University School of Medicine, Stanford, California, USA.

The accelerating availability of molecular sequences, particularly the sequences of entire genomes, has transformed both the theory and practice of experimental biology. Where once biochemists characterized proteins by their diverse activities and abundances, and geneticists characterized genes by the phenotypes of their mutations, all biologists now acknowledge that there is likely to be a single limited universe of genes and proteins, many of which are conserved in most or all living cells. This recognition has fuelled a grand unification of biology; the information about the shared genes and proteins contributes to our understanding of all the diverse organisms that share them. Knowledge of the biological role of such a shared protein in one organism can certainly illuminate, and often provide strong inference of, its role in other organisms.

Progress in the way that biologists describe and conceptualize the shared biological elements has not kept pace with sequencing. For the most part, the current systems of nomenclature for genes and their products remain divergent even when the experts appreciate the underlying similarities.
Interoperability of genomic databases is limited by this lack of progress, and it is this major obstacle that the Gene Ontology (GO) Consortium was formed to address.

Functional conservation requires a common language for annotation

Nowhere is the impact of the grand biological unification more evident than in the eukaryotes, where the genomic sequences of three model systems are already available (budding yeast, Saccharomyces cerevisiae, completed in 1996 (ref. 1); the nematode worm Caenorhabditis elegans, completed in 1998 (ref. 2); and the fruitfly Drosophila melanogaster, completed earlier this year (ref. 3)) and two more (the flowering plant Arabidopsis thaliana (ref. 4) and fission yeast Schizosaccharomyces pombe) are imminent. The complete genomic sequence of the human genome is expected in a year or two, and the sequence of the mouse (Mus musculus) will likely follow shortly thereafter.

The first comparison between two complete eukaryotic genomes (budding yeast and worm; ref. 5) revealed that a surprisingly large fraction of the genes in these two organisms displayed evidence of orthology. About 12% of the worm genes (∼18,000) encode proteins whose biological roles could be inferred from their similarity to their putative orthologues in yeast, comprising about 27% of the yeast genes (∼5,700). Most of these proteins have been found to have a role in the ‘core biological processes’ common to all eukaryotic cells, such as DNA replication, transcription and metabolism. A three-way comparison among budding yeast, worm and fruitfly shows that this relationship can be extended; the same subset of yeast genes generally have recognizable homologues in the fly genome (ref. 6).

Estimates of sequence and functional conservation between the genes of these model systems and those of mammals are less reliable, as no mammalian genome sequence is yet known in its entirety.
Nevertheless, it is clear that a high level of sequence and functional conservation will extend to all eukaryotes, with the likelihood that genes and proteins that carry out the core biological processes will again be probable orthologues. Furthermore, since the late 1980s, many experimental confirmations of functional conservation between mammals and model organisms (commonly yeast) have been published (refs 7–12).

This astonishingly high degree of sequence and functional conservation presents both opportunities and challenges. The main opportunity lies in the possibility of automated transfer of biological annotations from the experimentally tractable model organisms to the less tractable organisms based on gene and protein sequence similarity. Such information can be used to improve human health or agriculture. The challenge lies in meeting the requirements for a largely or entirely computational system for comparing or transferring annotation among different species. Although robust methods for sequence comparison are at hand (refs 13–15), many of the other elements for such a system remain to be developed.

Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web () are being constructed: biological process, molecular function and cellular component.

© 2000 Nature America Inc. • http://genetics.nature.com

A dynamic gene ontology

The GO Consortium is a joint project of three model organism databases: FlyBase (ref. 16), Mouse Genome Informatics (MGI; refs 17,18) and the Saccharomyces Genome Database (SGD; ref. 19). It is expected that other organism databases will join in the near future. The goal of the Consortium is to produce a structured, precisely defined, common, controlled vocabulary for describing the roles of genes and gene products in any organism.

Early considerations of the problems posed by the diversity of activities that characterize the cells of yeast, flies and mice made it clear that extensions of standard indexing methods (for example, keywords) are likely to be both unwieldy and, in the end, unworkable. Although these resources remain essential, and our proposed system will continue to link to and depend on them, they are not sufficient in themselves to allow automatic transfers of annotation. Each node in the GO ontologies will be linked to other kinds of information, including the many gene and protein keyword databases such as SwissPROT (ref. 20), GenBank (ref. 21), EMBL (ref. 22), DDBJ (ref. 23), PIR (ref. 24), MIPS (ref. 25), YPD & WormPD (ref. 26), Pfam (ref. 27), SCOP (ref. 28) and ENZYME (ref. 29). One reason for this is that the state of biological knowledge of what genes and proteins do is very incomplete and changing rapidly. Discoveries that change our understanding of the roles of gene products in cells are published on a daily basis. To illustrate this, consider annotating two different proteins. One is known to be a transmembrane receptor serine/threonine kinase involved in p53-induced apoptosis; the other is known only to be a membrane-bound protein. In one case, the knowledge about the protein is substantial, whereas in the other it is minimal. We need to be able to organize, describe, query and visualize biological knowledge at vastly different stages of completeness. Any system must be flexible and tolerant of this constantly changing level of knowledge and allow updates on a continuing basis.

Similar considerations suggested that a static hierarchical system, such as the Enzyme Commission (EC; ref. 30) hierarchy, although computationally tractable, was also likely to be inadequate to describe the role of a gene or a protein in biology in a manner that would be either intuitive or helpful for biologists. The hierarchical EC numbering system for enzymes is the standard resource for classifying enzymatic chemical reactions. The EC system does not address the classification of non-enzymatic proteins or the ability to describe the role of a gene product within a cell; also, the system has little facility for describing diverse protein interactions. The vagueness of the term ‘function’ when applied to genes or proteins emerged as a particular problem, as this term is colloquially used to describe biochemical activities, biological goals and cellular structure. It is commonplace today to refer to the function of a protein such as tubulin as ‘GTPase’ or ‘constituent of the mitotic spindle’. For all these reasons, we are constructing three independent ontologies.

Three categories of GO

Biological process refers to a biological objective to which the gene or gene product contributes. A process is accomplished via one or more ordered assemblies of molecular functions. Processes often involve a chemical or physical transformation, in the sense that something goes into a process and something different comes out of it. Examples of broad (high level) biological process terms are ‘cell growth and maintenance’ or ‘signal transduction’.
Examples of more specific (lower level) process terms are ‘translation’, ‘pyrimidine metabolism’ or ‘cAMP biosynthesis’.

Molecular function is defined as the biochemical activity (including specific binding to ligands or structures) of a gene product. This definition also applies to the capability that a gene product (or gene product complex) carries as a potential. It describes only what is done without specifying where or when the event actually occurs. Examples of broad functional terms are ‘enzyme’, ‘transporter’ or ‘ligand’. Examples of narrower functional terms are ‘adenylate cyclase’ or ‘Toll receptor ligand’.

Cellular component refers to the place in the cell where a gene product is active. These terms reflect our understanding of eukaryotic cell structure. As is true for the other ontologies, not all terms are applicable to all organisms; the set of terms is meant to be inclusive. Cellular component includes such terms as ‘ribosome’ or ‘proteasome’, specifying where multiple gene products would be found. It also includes terms such as ‘nuclear membrane’ or ‘Golgi apparatus’.

Ontologies have long been used in an attempt to describe all entities within an area of reality and all relationships between those entities. An ontology comprises a set of well-defined terms with well-defined relationships. The structure itself reflects the current representation of biological knowledge as well as serving as a guide for organizing new data. Data can be annotated to varying levels depending on the amount and completeness of available information. This flexibility also allows users to narrow or widen the focus of queries. Ultimately, an ontology can be a vital tool enabling researchers to turn data into knowledge.
Computer scientists have made significant contributions to linguistic formalisms and computational tools for developing complex vocabulary systems using reason-based structures, and we hope that our ontologies will be useful in providing a well-developed data set for this community to test their systems. The Molecular Biology Ontology Working Group (/projects/bio-ontology/) is actively attempting to develop standards in this general field.

Biological process, molecular function and cellular component are all attributes of genes, gene products or gene-product groups. Each of these may be assigned independently and, indeed, we believe that simply recognizing that biological process, molecular function and cellular location represent independent attributes is by itself clarifying in many situations, as in the annotation of gene-expression data. The relationships of a gene product (or gene-product group) to biological process, molecular function and cellular component are one-to-many, reflecting the biological reality that a particular protein may function in several processes, contain domains that carry out diverse molecular functions, and participate in multiple alternative interactions with other proteins, organelles or locations in the cell.

The ontologies are developed for a generic eukaryotic cell; accordingly, specialized organs or body parts are not represented. Full integration of the ontologies with anatomical structures will occur as the ontologies are incorporated into each species’ database and are related to anatomical data within each database. GO terms are connected as nodes of a network, so the connections between each term and its parents and children are known; together they form what are technically described as directed acyclic graphs.
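The directed-acyclic-graph structure and the one-to-many annotation relationship described above can be sketched in a few lines of Python. This is only an illustration, not the Consortium's file format or software: the term names come from the DNA metabolism example in this paper, the placement of the three intermediate terms directly under ‘DNA metabolism’ is an assumption, and the names `PARENTS`, `ANNOTATIONS`, `ancestors` and `annotated_to` are hypothetical.

```python
# Child term -> set of parent terms. A term may have several parents,
# which is what makes the structure a DAG rather than a tree.
# Assumption: the three middle terms sit directly under "DNA metabolism".
PARENTS = {
    "DNA metabolism": set(),
    "DNA replication": {"DNA metabolism"},
    "DNA repair": {"DNA metabolism"},
    "DNA recombination": {"DNA metabolism"},
    "DNA ligation": {"DNA replication", "DNA repair", "DNA recombination"},
}

# Gene product -> terms it is directly annotated to (one-to-many).
ANNOTATIONS = {"Cdc9p": {"DNA ligation"}}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent edges."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def annotated_to(term, annotations=ANNOTATIONS):
    """Gene products annotated to `term` itself or to any of its descendants."""
    return {
        gene
        for gene, terms in annotations.items()
        if any(t == term or term in ancestors(t) for t in terms)
    }
```

Under these assumptions, a query at a broad term such as ‘DNA repair’ or ‘DNA metabolism’ also retrieves a gene product annotated only to the narrower term ‘DNA ligation’, which is how widening or narrowing a query by moving up or down the graph would work.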
The ontologies are dynamic, in the sense that they exist as a network that is changed as more information accumulates, but have sufficient uniqueness and precision so that databases based on the ontologies can automatically be updated as the ontologies mature. The ontologies are flexible in another way, so that they can reflect the many differences in the biology of the diverse organisms, such as the breakdown of the nucleus during mitosis. In this way the GO Consortium has built up a system that supports a common language with specific, agreed-on terms with definitions and supporting documentation (the GO ontologies) that can be understood and used by a wide biological community.

Examples of GO annotation

As one example, consider DNA metabolism, a biological process carried out by largely (but not entirely) shared elements in eukaryotes. The part of the process ontology (with selected gene names from S. cerevisiae, Drosophila and M. musculus) shown is largely one parent to many children (Fig. 1a). One notable exception is the process of DNA ligation, which is a child of three processes, DNA replication, DNA repair and DNA recombination. The yeast gene product Cdc9p is able to carry out the ligation step for all three processes, whereas it is uncertain whether the same enzyme is used in the other species. From the point of view of the ontology, it matters not, and a computer (or a human searcher) will find the appropriate nodes in either case using as the query either the enzyme, the gene name(s) or the GO term (or, if available, the unique GO identifier, in this case, GO:0003910).

Also shown are the molecular function ontology for the MCM protein complex members that are known to regulate initiation of DNA replication in the three organisms (Fig. 1b), and a portion of the cellular component ontology for these proteins (Fig. 1c).
These ontologies reflect the finding that Mcm2–7 proteins are components of the pre-replicative complex in several model organisms, as well as sometimes localizing to the cytoplasm (ref. 31). The ontology supports both biological realities, and yet the molecular functions and the biological processes of the MCM homologues are conserved nevertheless.

The usefulness of the GO ontologies for annotation received its first major test in the annotation of the recently completed sequence of the Drosophila genome. Little human intervention was required to annotate 50% of the genes to the molecular function and biological process ontologies using the GO method. Another use for GO ontologies that is gaining rapid adherence is the annotation of gene-expression data, especially after these have been clustered by similarities in pattern of gene expression (refs 32,33). The results of clustering about 100 yeast experiments (of which about half are shown; Fig. 2) grouped together a subset of genes which, by name alone, convey little to most biologists. When the full short GO annotations for process, molecular function and location are added, however, the biological reason and import of the co-expression of these genes becomes evident.

The GO project is currently using a flat file format to store the ontologies, definitions of terms and gene associations. The ontologies, gene associations, definitions and documentation are available from the GO web site (), which also describes the principles and objectives used by the project. The ontologies are by no means complete. They are being expanded during the association of gene products from the collaborating databases and we expect them to continue to evolve for many years. GO requires that all gene associations to the ontologies must be attributed to the literature; for each citation the type of evidence will be encoded. As of early April 2000 there were 1,923, 2,094 and 490 nodes in the process, function and component ontologies, respectively.
The three organism databases have made substantial progress to link gene products. Thus far the process, function and component ontologies have associations with 1,624, 1,602 and 1,577 yeast genes; 741, 2,334 and 1,061 fly genes; and 1,933, 2,896 and 1,696 mouse genes, respectively. A running table of these statistics can be found at the web site.

The GO concept is intended to make possible, in a flexible and dynamic way, the annotation of homologous gene and protein sequences in multiple organisms using a common vocabulary that results in the ability to query and retrieve genes and proteins based on their shared biology. The GO ontologies produce a controlled vocabulary that can be used for dynamic maintenance and interoperability between genome databases. The ontologies are a work in progress. They can be consulted at any time on the World-Wide Web; indeed, their availability to human and machine alike is essential to maintain their flexibility and allow their evolution along with increased understanding of the underlying biology. It is hoped that the GO concepts, especially the distinctions between biological process, molecular function and cellular component, will find favour among biologists so that we can all facilitate, in our writing as well as our thinking, the grand unification of biology that the genome sequences portend.

Acknowledgements

We thank K. Fasman and M. Rebhan for useful discussions, and Astra Zeneca for financial support. SGD is supported by a P41, National Resources, grant from National Human Genome Research Institute (NHGRI) grant HG01315; MGD by a P41 from NHGRI grant HG00330; GXD by National Institute of Child Health and Human Development grant HD33745; and FlyBase by a P41 from NHGRI grant HG00739 and the Medical Research Council, London.

Received 20 March; accepted 5 April 2000.

1. Goffeau, A. et al. Life with 6000 genes. Science 274, 546 (1996).
2. Worm Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. The C. elegans Sequencing Consortium. Science 282, 2012–2018 (1998).
3. Adams, M.D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).
4. Meinke, D.W. et al. Arabidopsis thaliana: a model plant for genome analysis. Science 282, 662–682 (1998).
5. Chervitz, S.A. et al. Using the Saccharomyces Genome Database (SGD) for analysis of protein similarities and structure. Nucleic Acids Res. 27, 74–78 (1999).
6. Rubin, G.M. et al. Comparative genomics of the eukaryotes. Science 287, 2204–2215 (2000).
7. Tang, Z., Kuo, T., Shen, J. & Lin, R.J. Biochemical and genetic conservation of fission yeast Dsk1 and human SR protein-specific kinase 1. Mol. Cell. Biol. 20, 816–824 (2000).
8. Vajo, Z. et al. Conservation of the Caenorhabditis elegans timing gene clk-1 from yeast to human: a gene required for ubiquinone biosynthesis with potential implications for aging. Mamm. Genome 10, 1000–1004 (1999).
9. Ohi, R. et al. Myb-related Schizosaccharomyces pombe cdc5p is structurally and functionally conserved in eukaryotes. Mol. Cell. Biol. 18, 4097–4108 (1998).
10. Bassett, D.E. Jr et al. Genome cross-referencing and XREFdb: implications for the identification and analysis of genes mutated in human disease. Nature Genet. 15, 339–344 (1997).
11. Kataoka, T. et al. Functional homology of mammalian and yeast RAS genes. Cell 40, 19–26 (1985).
12. Botstein, D. & Fink, G.R. Yeast: an experimental organism for modern biology. Science 240, 1439–1443 (1988).
13. Tatusov, R.L., Galperin, M.Y., Natale, D.A. & Koonin, E.V. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28, 33–36 (2000).
14. Andrade, M.A. et al. Automated genome sequence analysis and annotation. Bioinformatics 15, 391–412 (1999).
15. Fleischmann, W., Moller, S., Gateau, A. & Apweiler, R. A novel method for automatic functional annotation of proteins. Bioinformatics 15, 228–233 (1999).
16. The FlyBase Consortium. The FlyBase database of the Drosophila Genome Projects and community literature. Nucleic Acids Res. 27, 85–88 (1999).
17. Blake, J.A. et al. The Mouse Genome Database (MGD): expanding genetic and genomic resources for the laboratory mouse. Nucleic Acids Res. 28, 108–111 (2000).
18. Ringwald, M. et al. GXD: a gene expression database for the laboratory mouse: current status and recent enhancements. Nucleic Acids Res. 28, 115–119 (2000).
19. Ball, C.A. et al. Integrating functional genomic information into the Saccharomyces Genome Database. Nucleic Acids Res. 28, 77–80 (2000).
20. Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28, 45–48 (2000).
21. Benson, D.A. et al. GenBank. Nucleic Acids Res. 28, 15–18 (2000).
22. Baker, W. et al. The EMBL Nucleotide Sequence Database. Nucleic Acids Res. 28, 19–23 (2000).
23. Tateno, Y. et al. DNA Data Bank of Japan (DDBJ) in collaboration with mass sequencing teams. Nucleic Acids Res. 28, 24–26 (2000).
24. Barker, W.C. et al. The Protein Information Resource (PIR). Nucleic Acids Res. 28, 41–44 (2000).
25. Mewes, H.W. et al. MIPS: a database for genomes and protein sequences. Nucleic Acids Res. 28, 37–40 (2000).
26. Costanzo, M.C. et al. The Yeast Proteome Database (YPD) and Caenorhabditis elegans Proteome Database (WormPD): comprehensive resources for the organization and comparison of model organism protein information. Nucleic Acids Res. 28, 73–76 (2000).
27. Bateman, A. et al. The Pfam protein families database. Nucleic Acids Res. 28, 263–266 (2000).
28. Lo Conte, L. et al. SCOP: a structural classification of proteins database. Nucleic Acids Res. 28, 257–259 (2000).
29. Bairoch, A. The ENZYME database in 2000. Nucleic Acids Res. 28, 304–305 (2000).
30. Enzyme Nomenclature. Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the Nomenclature and Classification of Enzymes. NC-IUBMB. (Academic, New York, 1992).
31. Tye, B.K. MCM proteins in DNA replication. Annu. Rev. Biochem. 68, 649–686 (1999).
32. Eisen, M., Spellman, P.T., Brown, P.O. & Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA 95, 14863–14868 (1998).
33. Spellman, P.T. et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9, 3273–3297 (1998).
Ontology Research

The Classification of Ontology
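• Five ontologies are currently in wide use:
• WordNet: an English dictionary based on psycholinguistic principles that organizes information in units called synsets. A synset is a set of synonyms that are interchangeable in a particular context.
• FrameNet: also an English dictionary. It uses a descriptive framework called Frame Semantics, provides strong semantic analysis capabilities, and has since developed into FrameNet II.
• GUM: supports multilingual processing. It contains basic concepts and a way of organizing concepts that is independent of any particular language.
• SENSUS: provides a conceptual structure for machine translation and contains more than 70,000 concepts.
• Mikrokosmos: also supports multilingual processing. It represents knowledge in TMR, a language-neutral interlingua.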
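(1) Ontologies can be translated and mapped between different modeling methods, paradigms, languages and software tools, which enables interoperation and integration between different systems.
(2) Functionally, an ontology resembles a database in some respects, but an ontology expresses far richer knowledge. First, the languages used to define ontologies are much richer, in both syntax and semantics, than what a database can represent. Most importantly, an ontology provides a rigorous and rich theory of a domain, not merely a structure for storing data.
(3) An ontology is the basis for a formal description of the important entities, attributes and processes in a domain and of the relationships among them. Such formal descriptions can become reusable, shareable components of software systems.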
The Modeling Primitive of Ontology
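• An ontology contains five basic modeling primitives, or five elements:
• Classes/concepts: a concept is understood very broadly and can refer to anything, such as a job description, a function, a behaviour, a strategy or a reasoning process.
• Relations: a relation represents an interaction between concepts in the domain. Formally, it is defined as a subset of an n-ary Cartesian product: R ⊆ C1 × C2 × … × Cn. An example is the subclass relation (subclass-of).
• Functions: a function is a special kind of relation in which the first n − 1 elements uniquely determine the n-th element. Formally: F : C1 × C2 × … × Cn−1 → Cn. For example, Mother-of is a function, where Mother-of(x, y) means that y is the mother of x; clearly x uniquely determines his or her mother y.
• Axioms: an axiom is an assertion that is always true, for example that concept B falls within the scope of concept A.
• Instances: an instance represents an element.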
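The difference between a relation and a function in the list above can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the individuals (`ann`, `bob`, `cat`, `dan`), the example relations and the helper names `is_relation` and `is_function` are all hypothetical and do not come from the slides.

```python
from itertools import product


def is_relation(tuples, *domains):
    """True if every tuple lies in the Cartesian product C1 x C2 x ... x Cn."""
    return set(tuples) <= set(product(*domains))


def is_function(tuples):
    """True if the first n-1 elements of each tuple determine the n-th uniquely."""
    seen = {}
    for *args, value in tuples:
        key = tuple(args)
        if key in seen and seen[key] != value:
            return False  # same arguments mapped to two different values
        seen[key] = value
    return True


# Hypothetical individuals of a single concept "Person".
people = {"ann", "bob", "cat", "dan"}

# (x, y) means "y is the mother of x": each x has exactly one y.
mother_of = {("bob", "ann"), ("cat", "ann")}

# (x, y) means "y is a sibling of x": one x may have several y.
sibling_of = {("bob", "cat"), ("bob", "dan")}
```

Both sets are relations over Person × Person, but only `mother_of` passes `is_function`, because `bob` is paired with two different siblings in `sibling_of`.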
A Course in English Language Teaching (Wang Qiang): Review Materials for the Graduate Entrance Examination

I. Multiple Choice

1. ________ relates to the truthfulness of the data.
A. Validity B. Reliability C. Subject D. Object
2. Which one is not an area of the institution?
A. restrictions B. time, length, frequency C. classroom management skills D. syllabus
3. English is described as a foreign language in all of the countries except ________.
A. France B. Japan C. China D. Australia
4. What Krashen and Terrell emphasize in their approach is the primacy of ________.
A. form B. vocabulary C. meaning D. phonetics
5. There are many situations in which we use more than one language skill, so it is valuable to integrate the four skills, to ________.
A. enhance the students’ communicative competence
B. combine pronunciation, vocabulary and grammar
C. use body language and pictures
D. use mechanical practice and meaningful practice
6. According to Wang Qiang, the way a language teacher learned a language will influence the way he ________ to some extent.
A. learns a language B. learns his mother tongue C. teaches a language D. obtains linguistic knowledge
7. If a teacher wants to control what the students do as much as possible, it’s best to do ________.
A. whole class work B. team activities C. pair work D. group work
8. With regard to syllabus design, the Communicative Approach lays special emphasis on ________.
A. authentic materials B. learners’ needs C. meaningful drills D. teachers’ roles
9. The generative-transformational school of linguistics emerged through the influence of ________.
A. Noam Chomsky B. J. Piaget C. D. Ausubel D. J.B. Bruner
10. According to the behaviorist, a ________ is formed when a correct response to a stimulus is consistently rewarded.
A. meaning B. word C. habit D. reaction
11. Another linguistic theory of communication favored in Communicative Language Teaching is ________ functional account of language use.
A. Chomsky’s B. Hymes’s C. Candlin’s D. Halliday’s
12. What Krashen and Terrell emphasize in their approach is the primacy of ________.
A. form B. vocabulary C. meaning D. phonetics
13. The ultimate goal of learning a foreign language in a Grammar-Translation classroom is to enable the students to ________ its literature.
A. translate and write B. read C. read and write D. read and translate
14. The Natural Approach believes that the teaching of ________ should be delayed until comprehension skills are established.
A. listening B. speaking C. reading D. writing
15. Many proponents of the Communicative Approach advocate the use of ________ materials in the language classrooms.
A. classic B. authentic C. modern D. oral
16. Of the three procedures followed in a cognitive classroom, which can be viewed as the performance stage?
A. Exercises. B. Application activities. C. Introduction of new materials. D. None of the above.
17. From the mid-1970s the key concept in educational linguistics and language pedagogy is that of ________.
A. communication or communicative competence
B. motivation in learning a foreign language
C. independence and autonomy in learning
D. language acquisition through the use of active trial
18. To ________, it is advocated that we adopt a communicative approach to writing.
A. motivate students B. demotivate students C. free students from too much work D. keep students busy
19. According to Willis the conditions for language learning are exposure to a rich but comprehensible language input, use of the language to do things, ________ to process and use the exposure, and instruction in language.
A. chances B. context C. motivation D. knowledge
20. As far as school assessment is concerned, we have teacher’s assessment, continuous assessment, ________, and portfolios.
A. students’ self-assessment B. relative’s assessment C. informal assessment D. formal assessment
21. For most people the term “curriculum” includes those activities that educators have devised for ________, which are represented in the form of a written document.
A. teachers B. designers C. learners D. students
22. ________ is the author of the book Syntactic Structures.
A. Edward Sapir B. Noam Chomsky C. J. R. Firth D. M.A.K. Halliday
23. Traditional behaviorists believed that language learning is simply a matter of imitation and ________ formation.
A. learning B. habit C. practice D. knowledge
24. The term “interlanguage” was first coined by the American linguist ________.
A. Noam Chomsky B. Bloomfield C. B.F. Skinner D. Larry Selinker
25. According to the records available, human beings have been engaged in the study of language for ________ centuries.
A. 10 B. 15 C. 20 D. 25
26. Views on language and ________ both influence theories on how language should be taught.
A. views on language learning B. views on culture learning C. values of life D. styles of life
27. One of the disadvantages of traditional pedagogy is ________.
A. the learners are able to use all skills, including the receptive skills and the productive skills
B. the learners are not able to use the language in an integrated way
C. the learners are not able to write
D. the learners perform well in class, but they cannot read out of class
28. If you ask students to translate the meaning of new words, you are ________.
A. checking spelling B. checking memorizing C. checking pronunciation D. checking understanding
29. Krashen believes that acquisition of a language refers to the ________ process leading to the development of competence and is not dependent on the teaching of grammatical rules.
A. conscious B. unconscious C. overconscious D. subconscious
30. In the 19th century, the strategy in language teaching usually adopted by foreign language teachers was the ________ of grammar rules with translation.
A. introduction B. interpretation C. comprehension D. combination
31. Krashen believes that acquisition of a language refers to the ________ process leading to the development of competence and is not dependent on the teaching of grammatical rules.
A. conscious B. unconscious C. overconscious D. subconscious
32. Halliday advocates that the social context of language use can be analyzed in terms of the field, tenor and mode of ________.
A. context B. discourse C. content D. situation
33. In the Natural Approach, the teacher can make use of various ways except ________ in order to help the students to be successful.
A. keeping their attention on key lexical items
B. explaining grammatical rules
C. using appropriate gestures
D. using context to help them understand
34. According to Palmer and some other linguists of his time, ________ played one of the most important roles in foreign language learning.
A. grammar B. phonetics C. vocabulary D. rhetoric
35. ________ refers to the interpretation of individual message elements in terms of their interconnectedness and of how meaning is represented in relationship to the text.
A. Grammatical competence B. Sociolinguistic competence C. Discourse competence D. Strategic competence
36. Students’ mistakes are ________ corrected in the classrooms of the Direct Method.
A. never B. immediately C. seldom D. carelessly
37. ________ is particularly interested in the relationship between sentences and the contexts and situation in which they are used.
A. Transformational Grammar B. Pragmatics C. Structuralism D. The Situational Approach
38. What do the three approaches (the Silent Way, Community Language Learning, and Suggestopaedia) have in common?
A. All stress the intrusion of the teacher into the learning process.
B. All lay emphasis on the individual and on personal learning strategies.
C. All view the learning of a second language the same as the learning of the first.
D. All three are deductive in the initial stage of the language learning process.
39. In English teaching classrooms very often writing is seen as “writing as language learning”, and it is believed to be ________.
A. writing for communication B. writing for real needs C. pseudo writing D. authentic writing
40. Which of the following is NOT among the features of process writing?
A. Help students to understand their own composing process.
B. Let students discover what they want to say as they write.
C. Encourage feedback from both teacher and peers.
D. Emphasize the form rather than the content.
41. Which of the following is true of second language learning?
A. Natural language exposure. B. Informal learning context. C. Structured input. D. Little error correction.
42. What type of learners can benefit most from real object instruction?
A. Individual learners. B. Tactile learners. C. Auditory learners. D. Visual learners.
43. What type of intelligence is cooperative learning best suited for?
A. Interpersonal intelligence. B. Intrapersonal intelligence. C. Logical intelligence. D. Linguistic intelligence.
44. What does the following practise?
*Peter and I went to the cinema yesterday.
Peter and *I went to the cinema yesterday.
Peter and I went to the *cinema yesterday.
Peter and I went to the cinema *yesterday.
A. Stress. B. Articulation. C. Liaison. D. Intonation.
45. What learning strategy can the following help to train?
Match the adjectives on the left with the nouns on the right.
Heavy    Day
Nice     Baby
Close    Building
Light    Rain
Tall     Friend
Cute     Smoker
A. Grouping. B. Collocation. C. Imitation. D. Imagery.
46. Which of the following is a communication game?
A. Bingo. B. Word chain. C. Rearranging and describing. D. Cross-word puzzle.
47. Which of the following can help train speaking?
A. Listen and follow instructions. B. Simon says. C. Pairs finding. D. Match captions with pictures.
48. Which of the following activities is most appealing to children’s characteristics?
A. Cross-word puzzle. B. Formal grammar instruction. C. Reciting texts. D. Role-play.
49. What’s the teacher doing by saying “Who wants to have a try?”?
A. Controlling discipline. B. Giving prompt. C. Evaluating students’ work. D. Directing students’ attention to the lesson.
50. Which of the following activities is the most suitable for group work?
A. Guessing game. B. Story telling. C. Information-gap. D. Drama performance.
51. Which of the following belongs to learning outcomes?
A. Role-plays. B. Sequencing pictures. C. Surveys. D. Worksheets.
52. Which of the following best describes first language acquisition?
A. Care-taker talk. B. Minimal pair practice. C. Selected input. D. Timely error correction.
53. Which of the following seating arrangements is most suitable for a whole class discussion?
54. What is the teacher doing in terms of error correction?
S: I go to the theatre last night.
T: You GO to the theatre last night?
A. Correcting the student’s mistake.
B. Hinting that there is a mistake.
C. Encouraging peer correction.
D. Asking the student whether he really went to the theatre.
55. Which of the following questions can be used in the questionnaire for assessing participation?
A. Did you get all the questions right in today’s class?
B. Did you finish the task on time?
C. Can you use the strategies we have learned today?
D. What did you do in your group work today?
56. One of the disadvantages of traditional pedagogy is ________.
A. it focuses on form rather than on functions
B. language is used to perform certain communicative functions
C. learners are not able to make sentences
D. learners are not able to do translation

II. Definitions of Terms

1. Scaffolding: the technique of changing the level of support over the course of a teaching session; a more-skilled person (teacher or more-advanced peer of the child) adjusts the amount of guidance to fit the student’s current performance. When the task the student is learning is new, the teacher might use direct instruction. As the student’s competence increases, less guidance is provided.
2. The ultimate goal of ELT: the ultimate goal of foreign language teaching is to enable students to use the foreign language in work or life when necessary. Thus we should teach the part of the language that will be used (rather than all parts of the language).
3. Definition of task: a piece of classroom work which involves learners in comprehending, manipulating, producing or interacting in the target language while their attention is principally focused on meaning rather than form.
(Nunan 1989:8)A lesson plan is a framework of a lesson in which teachers make advance decision about what they hope to achieve and how they would like to achieve it. In other words, teachers need to think about the aims to be achieved, materials to be covered, activities to be organized, and techniques and resources to be used in order to achieve the aims of the lesson.4.Classroom management is the way teachers organize what goes on in the classroom. It contributes directly to the efficiency of teaching and learning as the most effective activities can be made almost useless if the teacher does not organize them efficiently. As the goal of classroom management is to create an atmosphere conductive to interacting in English in meaningful ways.5.Deductive method: The Deductive method relies on reasoning, analyzing and comparing. First, the T writes an example on board or draws attention to an example in the textbook. Second, the T explains the underlying rules regarding the forms and positions of certain structural words. The explanations are often done in the S’s native language and use grammatical terms. Sometimes, comparisons are made between the native language and the target language or between the newly presented structure and previously learned structures. Finally, the Ss practice applying the rule to produce sentences withgiven prompts.6.Inductive method: the T provides learners with authentic language data and induces the learners to realize grammar rules without any forms of explicit explanation.7. Language:” Language is a system of arbitrary vocal symbols used for human communication.” It can be understood in the following six aspects:Language as system;Language as symbolic;Language as arbitrary;Language as vocal;Language as human;Language as communication8.Bottom-up modelSome teachers teach reading by introducing new vocabulary and new structures first andthen going over the text sentence by sentence. 
This way of teaching reading reflects thebelief that reading comprehension is based on the understanding and mastery熟练of all the new words, new phrases, and new structures as well as a lot of reading aloud practice. Also, this reading follows a linear process from the recognition of letters, to words, to phrases, to sentences, to paragraphs, and then to the meaning of the whole text. This way of teachingreading is said to follow a bottom-up model.9. Top-down modelIt is believed that in teaching reading, the teacher should teach the background knowledge first so that students equipped with such knowledge will be able to guess meaning from the printed page. This process of reading is said to follow the top-down model of teachingreading just as Goodman(1970) once said that reading was “a psycholinguistic guessinggame”10. Structural view:The structural view sees language as a linguistic system made up of various subsystems: from phonological, morphological, lexical, etc. to sentences.11. The functional view:The functional view sees language as a linguistic system but also as a means for doing things.Most of our day-to-day language use involves functional activities: greetings; offering,suggesting, advising, apologizing, etc.The communicative view of languageThe communicative, or functional view of language is the view that language is a vehicle for the expression of functional meaning. The semantic and communicative dimensions of language are more emphasized than the grammatical characteristics, although these are also included.12. 
The interactional view:The interactional view considers language as a communicative tool, whose main use is to build up and maintain social relations between people.13.The behaviorist theory( Skinne r)-- a stimulus-response theory of psychologyThe key point of the theory of conditioning is that "you can train an animal to do anything (within reason) if you follow a certain procedure which has three major stages, stimulus,response, and reinforcement"14.Cognitive theory( Noam Chomsky):The term cognitive is to describe loosely methods in which students are asked to think rather than simply repeat.15.The goal of CLTThe goal of CLT is to develop students' communicative competence16.Lesson planning means making decisions in advance about what techniques, activities and materials will be used in the class.17Teaching stages and procedures:Teaching stages are the major steps that language teachers go through in the classroom.Procedures are the detailed steps in each teaching stage.18. Three P's model: presentation, practice and production.19.SkimmingSkimming means reading quickly to get the gist,i.e. the main idea of the text.20.ScanningScanning means to read to locate/get specific information.21. DiscussionA discussion is often used for a) exchange of personal opinions. This sort of discussion canstart with a question like "What do you think of?"b) stating of personal opinions ongeneral issues. c) problem-solving.d) the ranking(分类;顺序)of alternatives e) deciding upon priorities(先;前)etc.22. 
Role-playRole-play is a very common language learning activity where students play differentroles and interact from the point of view of the roles they play.23.What’s called A process approach to writingDefinitionWhat really matters or makes a difference is the help that the teacher provides to guide the students through the process that they undergo when they are writing.24.What’s the assessmentAssessment in ELT means to discover what the learners know and can do at a certain stage of the learning process.25.Grammar Translation:The Grammar Translation method started around the time of Erasmus (1466-1536). Its primaryfocus is on memorization of verb paradigms, grammar rules, and vocabulary. Application of this knowledge was directed on translation of literary texts--focusing of developing students' appreciation of the target language's literature as well as teaching the language. Activities utilized in today's classrooms include: questions that follow a reading passage; translating literary passages from one language to another; memorizing grammar rules; memorizing native-language equivalents of target language vocabulary. (Highly structured class work with the teacher controlling all activities.)26. Direct Method:The Direct Method was introduced by the German educator Wilhelm Viëtor in the early 1800's. Focusing on oral language, it requires that all instruction be conducted in the target language with no recourse to translation. Reading and writing are taught from the beginning, although speaking and listening skills are emphasized--grammar is learned inductively. It has a balanced, four-skill emphasis.27. The Silent Way:The teacher is active in setting up classroom situations while the students do most of the talking and interaction among themselves. All four skills (listening, speaking, reading & writing) are taught from the beginning. 
Student errors are expected as a normal part of learning; the teacher's silence helps to foster self-reliance and student initiative.28. Community Language Learning:Teachers recognize that learning can be threatening and by understanding and accepting students' fears, they help their students feel secure and overcome their fears of language learning--ultimately providing students with positive energy directed at language learning. Students choose what they want to learn in the class and the syllabus is learner-generated.29. Natural Approach:Introduced by Gottlieb Henese and Dr. L. Sauveur in Boston around 1866. The Natural Approach is similar to the Direct Method, concentrating on active demonstrations to convey meaning by associating words and phrases with objects and actions. Associations are achieved via mime, paraphrase and the use of manipulatives. Terrell (1977) focused on the principles of meaningful communication, comprehension before production, and indirect error correction. Krashen's (1980) input hypothesis is applied in the Natural30. Reading Method:The reading method was prominent in the U.S. following the Committee of Twelve in 1900 and following the Modern Foreign Language Study in 1928. The earlier method was similar to the traditional Grammar/Translation method and emphasized the transference of linguistic understanding to English. Presently, the reading method focuses more on silent reading for comprehension purposes.31. ASTP and the Audiolingual Method:This approach is based on the behaviorist belief that language learning is the acquisition of a set of correct language habits. The learner repeats patterns and phrases in the language laboratory until able to reproduce them spontaneously.ASTP (Army Specialized Training Program)was an intensive, specialized approach to language instruction used in during the 1940's. 
In the postwar years, the civilian version of ASTP and the audiolingual method featured memorization of dialogues, pattern drills, and emphasis on pronunciation.32. Cognitive Methods:Cognitive methods of language teaching are based on meaningful acquisition of grammar structures followed by meaningful practice.33. Communicative Methods:The goal of communicative language approaches is to create a realistic context for language acquisition in the classroom. The focus is on functional language usage and the ability to learners to express their own ideas, feelings, attitudes, desires and needs. Open ended questioning and problem-solving activities and exchanges of personal information are utilized as the primary means of communication. Students usually work with authentic materials (authentic realia) in small groups on communication activities, during which they receive practice in negotiating meaning.34. Total Physical Response Method:This approach to second language teaching is based on the belief that listening comprehension should be fully developed before any active oralparticipation from students is expected (just as it is with children when theyare learning their native language) .35.What is the Grammar-Translation Method?The Grammar-Translation Method is designed around grammatical structures.36.The Functional-Notional ApproachUnlike the Grammar-Translation Method, which is based on the grammar structures, it thinks thata general learner should take part in the language activities, the functions of language involved inthe real and normal life are most important. For example, the learners have to learn how to give directions, buy goods, ask a price, claim ownership of something and so on. 
It holds that it is not enough to know the forms of the language; it is also important to know the functions and situations, so that the learner can practice real-life communication.
37. Communicative competence: both knowledge about the language and knowledge about how to use the language appropriately in communicative situations.
38. Critical Period Hypothesis: this hypothesis states that if humans do not learn a foreign language before a certain age, then, owing to changes such as the maturation of the brain, it becomes impossible to learn the foreign language like a native speaker.
39. Process-oriented theories (emphasis on process) are concerned with how the mind organizes new information, such as habit formation, induction, making inferences, hypothesis testing and generalization.
40. Condition-oriented theories (emphasis on conditions) emphasize the nature of the human and physical context in which language learning takes place, such as the number of students, the kind of input learners receive, and the atmosphere.
41. Behaviorist theory (Skinner; Watson and Rayner):
A. The key point of the theory of conditioning is that "you can train an animal to do anything if you follow a certain procedure which has three major stages: stimulus, response, and reinforcement".
B. The idea of this method is that language is learned by constant repetition and reinforcement by the teacher. Mistakes are immediately corrected, and correct utterances are immediately praised.
42. Cognitive theory (Chomsky) holds that language is not a form of behavior; it is an intricate rule-based system, and a large part of language acquisition is the learning of this system. There is a finite number of grammatical rules in the system, and with knowledge of these an infinite number of sentences can be produced.
43. Constructivist theory (John Dewey): learning is a process in which the learner constructs meaning based on his/her own experiences and what he/she already knows.
44. Socio-constructivist theory (Vygotsky) emphasizes interaction and engagement with the target language in a social context, based on the concepts of the "Zone of Proximal Development" (ZPD) and scaffolding.
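A Concise Semantic Web Tutorial, SW5-ONTOLOGY

Components of an ontology (continued)
▪ In practice, an ontology does not have to be built strictly from all five elements
– some kinds of element may be absent
▪ Relations between concepts are not limited to the four basic kinds
– for example, the synonym and near-synonym relations described in a dictionary
▪ The choice should depend on the specific situation
Example: representing an ontology as a directed graph
Differences between Ontology and object orientation
– Identify the basic terms, the relations between the terms, and the corresponding rules
– Give definitions of these terms and relations
5.3 Components of an ontology
▪ Characteristics of the objective world:
– objects (Object) exist in the world;
– objects can be abstracted into classes (Class);
– objects have properties (Property), and properties can be assigned values (Value);
– different relations (Relation) hold between objects;
– objects can be decomposed into parts (Part);
– objects can be in different states (State);
– properties and relations change over time;
– different events (Event) occur at different moments;
– events can cause other events or change states;
– processes (Process) exist over intervals of time, and objects take part in them.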
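The characteristics listed above can be made concrete with a small data model. The following is only an illustrative sketch: the names `OntClass`, `Obj`, `Relation` and `apply_event`, and all sample data, are assumptions introduced here and do not come from the tutorial.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OntClass:
    """A class abstracted from a set of similar objects."""
    name: str


@dataclass
class Obj:
    """An object: an instance of a class with valued properties, parts and a state."""
    name: str
    cls: OntClass
    properties: dict = field(default_factory=dict)  # property name -> value
    parts: list = field(default_factory=list)       # objects this one decomposes into
    state: str = "unknown"


@dataclass(frozen=True)
class Relation:
    """A named, directed relation between two objects (referenced by name)."""
    name: str
    source: str
    target: str


def apply_event(obj: Obj, new_state: str) -> None:
    """Model an event as something that changes the state of an object."""
    obj.state = new_state


# Hypothetical sample data.
engine = Obj("engine-1", OntClass("Engine"))
car = Obj("car-1", OntClass("Car"), properties={"color": "red"},
          parts=[engine], state="parked")
owns = Relation("owns", source="alice", target=car.name)

apply_event(car, "moving")
print(car.state, [p.name for p in car.parts], owns.name)
```

Processes and time-dependent properties are left out to keep the sketch short; they could be added as timestamped sequences of events.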
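Applications of ontologies in information retrieval
▪ Information retrieval
– full-text retrieval (Text Retrieval)
– data retrieval (Data Retrieval)
– knowledge retrieval (Knowledge Retrieval)
▪ Ontology-based information retrieval
– Build a domain ontology: with the help of domain experts, build an ontology of the relevant domain.
– Build the retrieval source: collect the data in the information sources and, with reference to the ontology already built, the collected…
Classification schemes
▪ Classification is the most important knowledge-organization tool of the traditional library and is widely used for
– document indexing
– shelving of books
– catalogue organization
– retrieval services
▪ Classification schemes used internationally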
New Concise English Linguistics, Chapter 10: Language acquisition

Chapter 10 Language acquisition
Key points:
1. *Definitions: language acquisition; overextension; telegraphic speech
2. *Theories of child language acquisition: behaviorist view; innatist view; interactionist view
3. Cognitive factors in child language development
4. The Critical Period Hypothesis
5. *Stages in child language development
Assessment objectives:
Recall: definitions of language acquisition, overextension and telegraphic speech
Comprehension: cognitive factors in child language development; the Critical Period Hypothesis; stages in child language development
Basic application: theories of child language acquisition (behaviorist view; innatist view; interactionist view)
I. Definitions
1. Language acquisition refers to the child's acquisition of his mother tongue, i.e. how the child comes to understand and speak the language of his community.
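Ontology and the problem of language

The determinacy of Being in ontology
The most fundamental difference between Being as a category of ontology and to be in everyday language is that the meaning of Being is determined logically; hence we often meet the expression "the determinacy of Being". Nothing of the kind is found in everyday language. In everyday language, besides serving as a copula and as a verb expressing existence, to be can express various other concrete meanings; for example, the line "To be or not to be" in Shakespeare's play means "to live or to die". When to be expresses such concrete meanings, it is often used together with other words. Heidegger once gave examples of the various meanings that the German sein, the counterpart of to be, can express (English translations of some of the sentences) [10]. Original sentence / Heidegger's explanation of to be in it:
(This book is mine)
8. Red is the port side. / It stands for the port (left) side
9. The dog is in the garden. / The dog is running about in the garden
In the examples listed by Heidegger, to be is used not only as a verb of existence (examples 1 and 2) but, far more often, as a copula; as a copula it links not only nouns (example 8) but also pronouns (example 7), prepositional phrases (examples 3, 4, 5 and 9) and infinitives (example 6). It is generally held that when to be serves as a copula its own meaning is indeterminate, or that it has no concrete meaning. Heidegger, however, holds that, seen from another angle, it is precisely because is remains intrinsically indeterminate and empty of meaning that it can have so many different uses and can realize and determine itself according to the needs of the situation [11]. All this shows that when people use to be without reflection, it appears on the surface merely to link subject and predicate, while in fact they understand the many different meanings it takes on when used with different words; they simply do not, as a rule, give a reflective account of the different meanings of the word in its different uses.
Earlier attempts at a linguistic analysis of ontology
The special relationship between ontology and language, and its special use of language, attracted attention long ago. As comparative studies of philosophy in different languages developed, the special relationship between ontology and the Indo-European languages was brought to light. In this research, however, there is also a tendency to reduce philosophical problems simply to problems of language, or to replace philosophical inquiry with the analysis of everyday language. This forgets that language is the tool by which thought is expressed. The point of studying ontology in connection with language is to use this tool to reveal the philosophical content of ontology and its mode of thinking; conversely, a linguistic analysis of ontology must take into account its special mode of thinking and its special use of language. A purely linguistic analysis that fails to grasp this point, especially one that stays at the level of everyday language, only serves to obscure the essence of this form of philosophy. Let us first look at the results that such analysis has reached.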
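Ontology theory and the construction of domain ontologies

Chapter 2 Ontology theory and the construction of domain ontologies
2.1 Ontology theory
2.1.1 Basic concepts of ontology
The concept of ontology originated in philosophy, as a branch of metaphysics, and stands in contrast to epistemology.
Epistemology studies the nature and sources of human knowledge, that is, subjective cognition, whereas ontology studies objective existence.
Ontology studies, on the one hand, the nature of existence and, on the other, the theoretical definition of objects, that is, the basic characteristics of the real world as a whole.
In Chinese philosophical writing the term is now mostly rendered as "本体论".
After years of evolution, and after the concept was reinterpreted and repositioned, the theory and methods of ontology have long since been adopted in the information field for the organization, representation, sharing and reuse of knowledge.
The use of ontology in computer science can be traced back to the 1980s; an article published by Alexander in 1986 is regarded as the starting point of ontology research in computing as distinct from philosophy.
Ontology then developed steadily in artificial intelligence and gradually acquired new meanings [8-9].
In 1991, Neches et al. gave the earliest definition of ontology in artificial intelligence [10]: "An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary."
That is, an ontology specifies the basic terms and relations that make up the vocabulary of a subject area, together with the rules for combining those terms and relations to define extensions of the vocabulary [11].
In 1993, Gruber of the Knowledge Systems Laboratory (KSL) at Stanford University gave the definition of ontology that has been widely accepted in information science: "An ontology is an explicit specification of a conceptualization" [12].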
Research on the Application of Ontology in Semantic Web

Received: 2003-04-12; revised: 2003-07-03
DENG Fang
(College of Computer Science & Technology, Beijing University of Posts & Telecommunications, Beijing 100876, China)
Abstract: The technology of ontology and the Semantic Web is surveyed, and the role of ontology in the Semantic Web is described. Two applications, information retrieval and B2B electronic business, are used to study the role that ontology plays in each, and issues that need attention in implementation are explained.
Key words: Ontology; Semantic Web; Information Search; B2B
CLC number: TP301.2  Document code: A  Article ID: 1001-3695(2004)06-0097-02
1 Semantic Web
The Internet and the Web have become indispensable means and tools for obtaining and publishing information, but the vast information network they form also causes users many problems and frustrations.
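The concept of ontology

1 About Ontology
1.1 Definition of Ontology
Ontology was originally a philosophical category; later, with the development of artificial intelligence, the AI community gave it a new definition.
At first, however, the understanding of Ontology was incomplete, and the definitions kept developing and changing. A list of the more representative definitions follows. The explanation of the last definition brings out four layers of meaning in Ontology:
▪ Conceptualization: a model obtained by abstracting the relevant concepts of some phenomenon in the objective world; its meaning is independent of any particular state of the environment.
▪ Explicit: the concepts used, and the constraints on their use, are explicitly defined.
▪ Formal: the Ontology is machine-readable.
▪ Shared: an Ontology embodies commonly accepted knowledge and reflects the set of concepts recognized in the relevant domain; it concerns a community rather than an individual.
The goal of Ontology is to capture knowledge of the relevant domain, provide a common understanding of that knowledge, determine the vocabulary commonly accepted in the domain, and give explicit definitions of these terms and of the relations between them at different levels of formalization.
1.2 Modeling primitives of Ontology
Perez et al. organized Ontology taxonomically and identified five basic modeling primitives:
▪ Classes or concepts: these refer to anything, such as job descriptions, functions, behaviors, strategies and reasoning processes.
Semantically, a class denotes a set of objects; its definition generally uses a frame structure comprising the name of the concept, the set of its relations to other concepts, and a natural-language description of the concept.
▪ Relations: interactions between concepts in the domain, formally defined as a subset of an n-ary Cartesian product, R: C1 × C2 × … × Cn.
An example is the subclass-of relation.
Semantically, a relation corresponds to a set of tuples of objects.
▪ Functions: a special kind of relation.
In such a relation the first n−1 elements uniquely determine the n-th element.
Formally, F: C1 × C2 × … × Cn−1 → Cn.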
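The formal definitions of relations and functions can be checked mechanically on small finite examples. The sketch below is only an illustration under stated assumptions: the helper names `is_relation` and `is_function` and the sample data (`people`, `mother_of`, `parent_of`) are hypothetical and do not come from the text. A relation is stored as a set of tuples; it is a function when the first n−1 components of each tuple determine the last one.

```python
from itertools import product


def is_relation(tuples, *concepts):
    """True if every tuple lies in the Cartesian product C1 x C2 x ... x Cn."""
    return set(tuples) <= set(product(*concepts))


def is_function(tuples):
    """True if the first n-1 elements of each tuple uniquely determine the n-th."""
    seen = {}
    for *args, value in tuples:
        # setdefault stores the first value seen for these arguments;
        # a different value later means the relation is not functional.
        if seen.setdefault(tuple(args), value) != value:
            return False
    return True


# Hypothetical sample data: (child, mother) and (child, parent) pairs.
people = {"ann", "bob", "eve", "tom"}
mother_of = {("bob", "ann"), ("eve", "ann")}
parent_of = {("bob", "ann"), ("bob", "tom")}

print(is_relation(mother_of, people, people))          # both are subsets of people x people
print(is_function(mother_of), is_function(parent_of))  # only mother_of is functional
```

Here `parent_of` is a valid binary relation but not a function, because the same first element ("bob") is paired with two different second elements.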
School English proficiency requirements and language support

Textbooks and workbooks
Contain materials for students to learn and practice English, including reading comprehension, grammar exercises, and writing prompts
Progression requirements
Students may be required to demonstrate increasing levels of English proficiency as they progress through their program. This ensures that they can handle more complex academic content and participate actively in higher-level courses.
2024-01-06
English proficiency requirements and language supp
CONTENTS
English proficiency requirements
Language Support Policy
English language training
Implementation and effectiveness of language support
Future English proficiency requirements and language support planning
Application of Ontology in the construction of domain dictionaries

领域特征概念。领域特征属性构成领域特征属性 讨了如何用 O t o 思想建立 “ nl og y 领域词典”的问 基于 O t o 思想建立领域词典 , nl og y 不仅可以 清 层。 最后用领域特征属性和部分手工构建的领域 题。 特征概念作为种子 , 采用 Bo t p i 的机器学 晰地描述领域词典中的领域特征概念及其关系 , ot r p g sa n 习技术, 从大规模无标注真实语料中, 动学习再 还可以实现领域知识的共享和重用 ,有利于领域 自 通过少量的人工校对的方法获取更多的 领域特征 词典的维护。在构建领域词典的过程中面临的困 概念 , 不断地扩充领域词典。 具体构建瓴唆词典层 难和问题主要有 : 领域特征屙 l的 生 提取及其组织 、 和描述语言、手工挑选 次分类体系的步骤如下 :
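Keywords: domain knowledge; Ontology; domain dictionary
1 Domain knowledge
"Domain knowledge" is a term that originated in artificial intelligence, where it is mainly applied in knowledge-based expert systems and natural language understanding systems. Domain knowledge is the set of concepts within a domain, the relations between those concepts, and the constraints on them. The definition of the term varies with the domain and the application. In natural language processing research, domain knowledge is the basic knowledge applied to the analysis of the topic and content of texts. It is knowledge that is oriented to the computer, that an ordinary person does not need to make an effort to acquire, and that describes the domain feature concepts of a domain and the relations between them. Domain knowledge has all the attributes and characteristics of knowledge in general.
"Oriented to the computer" and "requiring no effort for an ordinary person to acquire" are two important characteristics that domain knowledge shows when applied to the analysis of the topic and content of texts. To describe domain feature concepts and their relations better, we introduce the concept of a "domain feature attribute". A domain feature attribute is itself a domain feature concept: it is a category formed by further abstracting and generalizing domain feature concepts. Three principles should be followed in determining domain feature attributes: (1) a domain feature attribute can describe the domain feature concepts of a domain and is not easily subdivided further; (2) the domain feature attributes must be able to describe all the domain feature concepts of a domain; (3) domain feature attributes are stable and must be definite.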
Reviewing the design of DAML+OIL An ontology language for the semantic web

Reviewing the Design of DAML+OIL: An Ontology Language for the Semantic Web

Ian Horrocks
University of Manchester
Manchester, UK
horrocks@

Peter F. Patel-Schneider
Bell Labs Research
Murray Hill, NJ, U.S.A.
pfps@

Frank van Harmelen
Vrije Universiteit
Amsterdam, the Netherlands
Frank.van.Harmelen@cs.vu.nl

Abstract

In the current "Syntactic Web", uninterpreted syntactic constructs are given meaning only by private off-line agreements that are inaccessible to computers. In the Semantic Web vision, this is replaced by a web where both data and its semantic definition are accessible and manipulable by computer software. DAML+OIL is an ontology language specifically designed for this use in the Web; it exploits existing Web standards (XML and RDF), adding the familiar ontological primitives of object oriented and frame based systems, and the formal rigor of a very expressive description logic. The definition of DAML+OIL is now over a year old, and the language has been in fairly widespread use. In this paper, we review DAML+OIL's relation with its key ingredients (XML, RDF, OIL, DAML-ONT, Description Logics), we discuss the design decisions and trade-offs that were the basis for the language definition, and identify a number of implementation challenges posed by the current language. These issues are important for designers of other representation languages for the Semantic Web, be they competitors or successors of DAML+OIL, such as the language currently under definition by W3C.

Introduction

In the short span of its existence, the World Wide Web has resulted in a revolution in the way information is transferred between computer applications. It is no longer necessary for humans to set up channels for inter-application information transfer; this is handled by TCP/IP and related protocols. It is also no longer necessary for humans to define the syntax and build parsers used for each kind of information transfer; this is handled by HTML, XML and related standards. However, it is still not possible for applications to interoperate with
other applications without some pre-existing, human-created, and outside-of-the-web agreements as to the meaning of the information being transferred.

The next generation of the Web aims to alleviate this problem, making Web resources more readily accessible to automated processes by adding information that describes Web content in a machine-accessible and manipulable fashion. This coincides with the vision that Tim Berners-Lee calls the Semantic Web in his recent book "Weaving the Web" (Berners-Lee 1999).

… they are to be used effectively by automated processes, e.g., to determine the semantic relationships between syntactically different terms.

DAML+OIL is the result of merging DAML-ONT (an early result of the DARPA Agent Markup Language (DAML) programme) and OIL (the Ontology Inference Layer) (Fensel et al. 2001), developed by a group of (largely European) researchers, several of whom were members of the European-funded On-To-Knowledge consortium.

Until recently, the development of DAML+OIL has been undertaken by a committee largely made up of members of the two language design teams (and rather grandly titled the Joint EU/US Committee on Agent Markup Languages). More recently, DAML+OIL has been submitted to W3C as a proposal for the basis of the W3C Web Ontology language.

As it is an ontology language, DAML+OIL is designed to describe the structure of a domain. DAML+OIL takes an object oriented approach, with the structure of the domain being described in terms of classes and properties. An ontology consists of a set of axioms that assert characteristics of these classes and properties. Asserting that resources are instances of DAML+OIL classes or that resources are related by properties is left to RDF, a task for which it is well suited.

Since the definition of DAML+OIL is available elsewhere, we will not repeat it here. Instead, in the following sections, we will review a number of fundamental design choices that were made for DAML+OIL: foundations in Description
Logic, XML datatypes, layering on top of RDFS, comparison with its predecessor OIL, and the role of inference for a Semantic Web ontology language.

Foundations in Description Logic

DAML+OIL is, in essence, equivalent to a very expressive Description Logic (DL), with a DAML+OIL ontology corresponding to a DL terminology. As in a DL, DAML+OIL classes can be names (URIs in the case of DAML+OIL) or expressions, and a variety of constructors are provided for building class expressions. The expressive power of the language is determined by the class (and property) constructors provided, and by the kinds of axioms allowed.

Figure 1 summarises the constructors in DAML+OIL. The standard DL syntax is used in this paper for compactness as the RDF syntax is rather verbose. In the RDF syntax, for example, Human ⊓ Male would be written as

    <daml:Class>
      <daml:intersectionOf rdf:parseType="daml:collection">
        <daml:Class rdf:about="#Human"/>
        <daml:Class rdf:about="#Male"/>
      </daml:intersectionOf>
    </daml:Class>

    Constructor        DL Syntax         Example
    intersectionOf     C1 ⊓ … ⊓ Cn       Human ⊓ Male
    unionOf            C1 ⊔ … ⊔ Cn       Doctor ⊔ Lawyer
    complementOf       ¬C                ¬Male
    oneOf              {x1 … xn}         {john, mary}
    toClass            ∀P.C              ∀hasChild.Doctor
    hasClass           ∃P.C              ∃hasChild.Lawyer
    hasValue           ∃P.{x}            ∃citizenOf.{USA}
    minCardinalityQ    ≥n P.C            ≥n hasChild.Lawyer
    maxCardinalityQ    ≤n P.C            ≤n hasChild.Male
    cardinalityQ       =n P.C            =n hasParent.Female

    Figure 1: DAML+OIL class constructors

The meanings of the first three constructors from Figure 1 are just the standard boolean operators on classes. The oneOf constructor allows classes to be defined by enumerating their members.

The toClass and hasClass constructors correspond to slot constraints in a frame-based language. The class ∀P.C is the class all of whose instances are related via the property P only to resources of type C, while the class ∃P.C is the class all of whose instances are related via the property P to at least one resource of type C. The hasValue constructor is just shorthand for a combination of hasClass and oneOf.
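The constructors described so far can be read as operations on sets of individuals. The sketch below illustrates that reading over one small finite interpretation; it is not the paper's formal semantics and not a reasoner. All names (`DOMAIN`, `HAS_CHILD`, `to_class`, and so on) and the sample individuals are assumptions made for illustration.

```python
# Hypothetical finite interpretation: a domain, some class extensions, one property.
DOMAIN = {"ann", "bob", "cat", "dan"}
HUMAN = {"ann", "bob", "dan"}
MALE = {"bob", "dan"}
DOCTOR = {"ann"}
LAWYER = {"dan"}
HAS_CHILD = {("ann", "dan"), ("bob", "ann"), ("bob", "dan")}  # (parent, child)


def intersection_of(*classes):
    """intersectionOf: individuals that belong to every class."""
    return set.intersection(*classes)


def union_of(*classes):
    """unionOf: individuals that belong to at least one class."""
    return set.union(*classes)


def complement_of(cls):
    """complementOf: individuals of the domain that are not in the class."""
    return DOMAIN - cls


def successors(prop, x):
    """All y such that x is related to y via the property."""
    return {o for (s, o) in prop if s == x}


def to_class(prop, cls):
    """toClass: every property value is in cls (holds vacuously with no values)."""
    return {x for x in DOMAIN if successors(prop, x) <= cls}


def has_class(prop, cls):
    """hasClass: at least one property value is in cls."""
    return {x for x in DOMAIN if successors(prop, x) & cls}


def has_value(prop, value):
    """hasValue: shorthand for hasClass applied to the one-element class {value}."""
    return has_class(prop, {value})


print(sorted(intersection_of(HUMAN, MALE)))
print(sorted(to_class(HAS_CHILD, DOCTOR)))
print(sorted(has_class(HAS_CHILD, LAWYER)))
```

One point the sketch makes visible is that `to_class` includes individuals with no `HAS_CHILD` values at all, since a universal restriction is satisfied vacuously, whereas `has_class` requires at least one matching value.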
The minCardinalityQ, maxCardinalityQ and cardinalityQ constructors (known in DLs as qualified number restrictions) are generalisations of the hasClass and hasValue constructors. The class ≥n P.C (≤n P.C, =n P.C) is the class all of whose instances are related via the property P to at least (at most, exactly) n different resources of type C. The emphasis on different is because there is no unique name assumption with respect to resource names (URIs): it is possible that many URIs could name the same resource.

Note that arbitrarily complex nesting of constructors is possible. The formal semantics of the class constructors is given by DAML+OIL's model-theoretic semantics or can be derived from the specification of a suitably expressive DL (e.g., see (Horrocks & Sattler 2001)).

Figure 2 summarises the axioms allowed in DAML+OIL. These axioms make it possible to assert subsumption or equivalence with respect to classes or properties, the disjointness of classes, the equivalence or non-equivalence of individuals (resources), and various properties of properties.

A crucial feature of DAML+OIL is that subClassOf and sameClassAs axioms can be applied to arbitrary class expressions. This provides greatly increased expressive power with respect to standard frame-based languages where such axioms are invariably restricted to the form where the left hand side is a class name, there is only one such axiom per name, and there are no cycles (the class on the right hand side of an axiom cannot refer, either directly or indirectly, to the class name on the left hand side).

A consequence of this expressive power is that all of the class and individual axioms, as well as the uniqueProperty and unambiguousProperty axioms, can be reduced to subClassOf and sameClassAs axioms (as can be seen from the DL syntax).

    Axiom                      Example
    differentIndividualFrom    {john} ⊑ ¬{peter}
    inverseOf                  hasChild ≡ hasParent⁻
    transitiveProperty         ancestor⁺ ⊑ ancestor
    uniqueProperty             ⊤ ⊑ ≤1 hasMother
    unambiguousProperty        ⊤ ⊑ ≤1 isMotherOf⁻

    Figure 2: DAML+OIL axioms

As we have seen, DAML+OIL also
allows properties of properties to be asserted. It is possible to assert that a property is unique (i.e., functional) and unambiguous (i.e., its inverse is functional). It is also possible to use inverse properties and to assert that a property is transitive.

XML Datatypes in DAML+OIL

DAML+OIL supports the full range of datatypes in XML Schema: the so called primitive datatypes such as string, decimal or float, as well as more complex derived datatypes such as integer sub-ranges. This is facilitated by maintaining a clean separation between instances of "object" classes (defined using the ontology language) and instances of datatypes (defined using the XML Schema type system). In particular, the domain of interpretation of object classes is disjoint from the domain of interpretation of datatypes, so that an instance of an object class (e.g., the individual "Italy") can never have the same denotation as a value of a datatype (e.g., the integer 5), and that the set of object properties (which map individuals to individuals) is disjoint from the set of datatype properties (which map individuals to datatype values).

The disjointness of object and datatype domains was motivated by both philosophical and pragmatic considerations:

• Datatypes are considered to be already sufficiently structured by the built-in predicates, and it is, therefore, not appropriate to form new classes of datatype values using the ontology language (Hollunder & Baader 1991).
The simplicity and compactness of the ontology language are not compromised: even enumerating all the XML Schema datatypes would add greatly to its complexity, while adding a logical theory for each datatype, even if it were possible, would lead to a language of monumental proportions.

The semantic integrity of the language is not compromised: defining theories for all the XML Schema datatypes would be difficult or impossible without extending the language in directions whose semantics would be difficult to capture within the existing framework.

The "implementability" of the language is not compromised: a hybrid reasoner can easily be implemented by combining a reasoner for the "object" language with one capable of deciding satisfiability questions with respect to conjunctions of (possibly negated) datatypes (Horrocks & Sattler 2001).

From a theoretical point of view, this design means that the ontology language can specify constraints on data values, but as data values can never be instances of object classes they cannot apply additional constraints to elements of the object domain. This allows the type system to be extended without having any impact on the ontology language, and vice versa. Similarly, the formal properties of hybrid reasoners are determined by those of the two components; in particular, the combined reasoner will be sound and complete if both components are sound and complete.

From a practical point of view, DAML+OIL implementations can choose to support some or all of the XML Schema datatypes. For supported datatypes, they can either implement their own type checker/validator or rely on some external component. The job of a type checker/validator is simply to take zero or more data values and one or more datatypes, and determine if there exists any data value that is equal to every one of the specified data values and is an instance of every one of the specified datatypes.

Extending RDF Schema

DAML+OIL is tightly integrated with RDFS: RDFS is used to express
DAML+OIL's machine readable specification (footnote 9), and RDFS provides the only serialisation for DAML+OIL. While the dependence on RDFS has some advantages in terms of the re-use of existing RDFS infrastructure and the portability of DAML+OIL ontologies, using RDFS to completely define the structure of DAML+OIL is quite difficult as, unlike XML, RDFS is not designed for the precise specification of syntactic structure. For example, there is no way in RDFS to state that a restriction (slot constraint) should consist of exactly one property (slot) and one class.

The solution to this problem adopted by DAML+OIL is to define the semantics of the language in such a way that they give a meaning to any (parts of) ontologies that conform to the RDFS specification, including "strange" constructs such as restrictions with multiple properties and classes. The meaning given to strange constructs may, however, include strange "side effects". For example, in the case of a restriction with multiple properties and classes, the semantics interpret this in the same way as a conjunction of all the constraints that would result from taking the cross product of the specified properties and classes, but with the added (and probably unexpected) effect that all these restrictions must have the same interpretation (i.e., are equivalent).

DAML+OIL's dependence on RDFS may also have consequences for the decidability of the language. Decidability is lost when cardinality constraints can be applied to properties that are transitive, or that have transitive sub-properties (Horrocks, Sattler, & Tobies 1999). There is no way to formally capture this constraint in RDFS, so decidability in DAML+OIL depends on an informal prohibition of cardinality constraints on non-simple properties.

DAML+OIL vs. OIL

From the point of view of language constructs, the differences between OIL and DAML+OIL are relatively trivial.
Although there is some difference in "keyword" vocabulary, there is usually a one to one mapping of constructors, and in the cases where the constructors are not completely equivalent, simple translations are possible.

OIL also uses RDFS for its serialisation (although it also provides a separate XML-based syntax). Consequently, OIL's RDFS based syntax would seem to be susceptible to the same difficulties as described above for DAML+OIL. However, in the case of OIL there does not seem to be an assumption that any ontology conforming to the RDFS meta-description should be a valid OIL ontology: presumably ontologies containing unexpected usages of the meta-properties would be rejected by OIL processors, as the semantics do not specify how these could be translated into SHIQ. Thus, OIL and DAML+OIL take rather different positions with regard to the layering of languages on the Semantic Web.

Another effect of DAML+OIL's tight integration with RDFS is that the frame structure of OIL's syntax is much less evident: a DAML+OIL ontology is more DL-like in that it consists largely of a relatively unstructured collection of subsumption and equality axioms. This can make it more difficult to use DAML+OIL with frame based tools such as Protégé (Grosso et al. 1999) or OilEd (Bechhofer et al. 2001), because the axioms may be susceptible to many different frame-like groupings (Bechhofer, Goble, & Horrocks 2001).

The treatment of individuals in OIL is also very different from that in DAML+OIL. In the first place, DAML+OIL relies wholly on RDF for assertions involving the type (class) of an individual or a relationship between a pair of objects.
In the second place, DAML+OIL treats individuals occurring in the ontology (in oneOf constructs or hasValue restrictions) as true individuals (i.e., interpreted as single elements in the domain of discourse) and not as primitive concepts, as is the case in OIL. OIL's weak treatment of the oneOf construct is a well known technique for avoiding the reasoning problems that arise with extensionally defined classes, and is also used, e.g., in the CLASSIC knowledge representation system (Borgida & Patel-Schneider 1994). Moreover, DAML+OIL makes no unique name assumption: it is possible to explicitly assert that two individuals are the same or different, or to leave their relationship unspecified.

DAML+OIL's treatment of individuals is very powerful, and justifies intuitive inferences that would not be valid for OIL, e.g., that persons all of whose countries of residence are Italy are kinds of person that have at most one country of residence:

$\mathit{Person} \sqcap \forall \mathit{residence}.\{\mathit{Italy}\} \sqsubseteq\ \leqslant 1\,\mathit{residence}$

Inference in DAML+OIL

As we have seen, DAML+OIL is equivalent to a very expressive DL. More precisely, DAML+OIL is equivalent to the SHIQ DL (Horrocks, Sattler, & Tobies 1999) with the addition of extensionally defined classes (i.e., the oneOf constructor) and datatypes (often called concrete domains in DLs (Baader & Hanschke 1991)). This equivalence allows DAML+OIL to exploit the considerable existing body of description logic research: to define the semantics of the language and to understand its formal properties, in particular the decidability and complexity of key inference problems (Donini et al. 1997); as a source of sound and complete algorithms and optimised implementation techniques for deciding key inference problems (Horrocks, Sattler, & Tobies 1999; Horrocks & Sattler 2001); and to use implemented DL systems in order to provide (partial) reasoning support (Horrocks 1998a; Patel-Schneider 1998; Haarslev & Möller 2001).

An important consideration in the design of DAML+OIL was that key inference problems in the language, in particular class
consistency/subsumption, to which most other inference problems can be reduced, should be decidable, as this facilitates the provision of reasoning services. Moreover, the correspondence with DLs facilitates the use of DL algorithms that are known to be amenable to optimised implementation and to behave well in realistic applications in spite of their high worst case complexity (Horrocks 1998b; Haarslev & Möller 2001).

Maintaining the decidability of the language requires certain constraints on its expressive power that may not be acceptable to all applications. However, the designers of the language decided that reasoning would be important if the full power of ontologies was to be realised, and that a powerful but still decidable ontology language would be a good starting point.

Reasoning can be useful at many stages during the design, maintenance and deployment of ontologies.

Reasoning can be used to support ontology design and to improve the quality of the resulting ontology. For example, class consistency and subsumption reasoning can be used to check for logically inconsistent classes and (possibly unexpected) implicit subsumption relationships (Bechhofer et al. 2001). This kind of support has been shown to be particularly important with large ontologies, which are often built and maintained over a long period by multiple authors. Other reasoning tasks, such as "matching" (Baader et al. 1999) and/or computing least common subsumers (Baader & Küsters 1998), could also be used to support "bottom up" ontology design, i.e., the identification and description of relevant classes from sets of example instances.

Like information integration (Calvanese et al. 1998), ontology integration can also be supported by reasoning. For example, integration can be performed using inter-ontology assertions specifying relationships between classes and properties, with reasoning being used to compute the integrated hierarchy and to highlight any problems/inconsistencies. Unlike some other integration
techniques, this method has the advantage of being non-intrusive with respect to the original ontologies.

Reasoning with respect to deployed ontologies will enhance the power of "intelligent agents", allowing them to determine if a set of facts is consistent w.r.t. an ontology, to identify individuals that are implicitly members of a given class, etc. A suitable service ontology could, for example, allow an agent seeking secure services to identify a service requiring a userid and password as a possible candidate.

Challenges

Class consistency/subsumption reasoning in DAML+OIL is known to be decidable (as it is contained in the C2 fragment of first order logic (Grädel, Otto, & Rosen 1997)), but many challenges remain for implementors of "practical" reasoning systems, i.e., systems that perform well with the kinds of reasoning problem generated by realistic applications.

Individuals. Unfortunately, the combination of DAML+OIL individuals with inverse properties is so powerful that it pushes the worst case complexity of the class consistency problem from ExpTime (for SHIQ/OIL) to NExpTime. No "practical" decision procedure is currently known for this logic, and there is no implemented system that can provide sound and complete reasoning for the whole DAML+OIL language. In the absence of inverse properties, however, a tableaux algorithm has been devised (Horrocks & Sattler 2001), and in the absence of individuals (in extensionally defined classes), DAML+OIL can exploit implemented DL systems via a translation into SHIQ (extended with datatypes) similar to the one used by OIL. It would, of course, also be possible to translate DAML+OIL ontologies into SHIQ using OIL's weak treatment of individuals, but in this case reasoning with individuals would not be complete with respect to the semantics of the language. This approach is taken by some existing applications, e.g., OilEd (Bechhofer et al. 2001).

Scalability. Even without the oneOf constructor, class consistency reasoning is still a hard problem. Moreover, Web ontologies can be expected to
grow very large, and with deployed ontologies it may also be desirable to reason w.r.t. large numbers of class/property instances.

There is good evidence of empirical tractability and scalability for implemented DL systems (Horrocks 1998b; Haarslev & Möller 2001), but this is mostly w.r.t. logics that do not include inverse properties (see, e.g., (Horrocks, Sattler, & Tobies 1999)). Adding inverse properties makes practical implementations more problematical, as several important optimisation techniques become much less effective. Work is required in order to develop more highly optimised implementations supporting inverse properties, and to demonstrate that they can scale as well as implementations for logics without inverse properties. It is also unclear if existing techniques will be able to cope with large numbers of class/property instances (Horrocks, Sattler, & Tobies 2000).

Finally, it is an inevitable consequence of the high worst case complexity that some problems will be intractable, even for highly optimised implementations. It is conjectured that such problems rarely arise in practice, but the evidence for this conjecture is drawn from a relatively small number of applications, and it remains to be seen if a much wider range of Web application domains will demonstrate similar characteristics.

New Reasoning Tasks. So far we have mainly discussed class consistency/subsumption reasoning, but this may not be the only reasoning problem that is of interest. Other tasks could include querying, explanation, matching, computing least common subsumers, etc. Querying in particular may be important in Semantic Web applications. Some work on query languages for DLs has already been done (Calvanese, De Giacomo, & Lenzerini 1999; Horrocks & Tessaris 2000), and work is underway on the design of a DAML+OIL query language, but the computational properties of such a language, either theoretical or empirical, have yet to be determined.

Explanation may also be an important problem, e.g., to help an ontology designer to rectify problems identified by
reasoning support, or to explain to a user why an application behaved in an unexpected manner. As discussed above, reasoning problems such as matching and computing least common subsumers could also be important in ontology design.

Discussion

There are other concerns with respect to the place DAML+OIL has in the Semantic Web. After DAML+OIL was developed, the W3C RDF Core Working Group devised a model theory for RDF and RDFS (footnote 10), which is incompatible with the semantics of DAML+OIL, an undesirable state of affairs. Also, in late 2001 W3C initiated the Web Ontology working group (footnote 11), a group tasked with developing an ontology language for the Semantic Web. DAML+OIL has been submitted to this working group as a starting point for a W3C recommendation on ontology languages.

A W3C ontology language needs to fit in with other W3C recommendations even more than an independent DAML+OIL would. Work is thus needed to develop a semantic web ontology language, which the Web Ontology working group has tentatively named OWL, that layers better on top of RDF and RDFS.

Unfortunately, the obvious layering (that is, using the same syntax as RDF and extending its semantics, just as RDFS does) is not possible. Such an extension results in semantic paradoxes, which are variants of the Russell paradox. These paradoxes arise from the status of all classes (including DAML+OIL restrictions) as individuals, which requires that many restrictions be present in all models; from the status of the class membership relationship as a regular property (rdf:type); from the ability to make contradictory statements; and from the ability to create restrictions that refer to themselves. In an RDFS-compliant version of DAML+OIL, a restriction that states that its instances have no rdf:type relationships to itself is not only possible to state, but exists in all models, resulting in an ill-formed logical formalism.
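The self-referential restriction just described can be written out as a short derivation. This is only an illustrative sketch: the symbol $R$ for the restriction and the notation $\langle x, y \rangle \in \mathit{type}$ for an rdf:type relationship are introduced here and do not appear in the original text.

```latex
% R is the restriction whose instances have no rdf:type relationship to R itself.
R = \{\, x \mid \langle x, R \rangle \notin \mathit{type} \,\}

% Class membership is itself expressed by rdf:type:
x \in R \iff \langle x, R \rangle \in \mathit{type}

% R is an individual in every model, so x may be instantiated with R:
\langle R, R \rangle \in \mathit{type}
  \iff R \in R
  \iff \langle R, R \rangle \notin \mathit{type}
```

The final line is a contradiction, and because $R$ must exist in every model, no model can satisfy the semantics.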
The obvious way around this problem, that of using non-RDF syntax for DAML+OIL restrictions, appears to be meeting with considerable resistance, so either further education or some other solution is needed.

Conclusion

We have discussed a number of fundamental design decisions underlying the design of DAML+OIL, in particular its foundation in Description Logic, its use of datatypes from XML Schema, its sometimes problematic layering on top of RDF Schema, and its deviations from its predecessor OIL. We have also described how various aspects of the language are motivated by the desire for tractable reasoning facilities.

Although a number of challenges remain, DAML+OIL has considerable merits. In particular, the basic idea of having a formally-specified web language that can represent ontology information will go a long way towards allowing computer programs to interoperate without pre-existing, outside-of-the-web agreements. If this language also has an effective reasoning mechanism, then computer programs can manipulate this interoperability information themselves, and determine whether a common meaning for the information that they pass back and forth is present.

References

Baader, F., and Hanschke, P. 1991. A schema for integrating concrete domains into concept languages. In Proc. of IJCAI-91, 452–457.
Baader, F., and Küsters, R. 1998. Computing the least common subsumer and the most specific concept in the presence of cyclic ALN-concept descriptions. In Proc. of KI'98, 129–140. Springer-Verlag.
Baader, F.; Küsters, R.; Borgida, A.; and McGuinness, D. L. 1999. Matching in description logics. J. of Logic and Computation 9(3):411–447.
Bechhofer, S.; Horrocks, I.; Goble, C.; and Stevens, R. 2001.
OilEd: a reason-able ontology editor for the semantic web. In Proc. of the Joint German/Austrian Conf. on Artificial Intelligence (KI 2001), 396–408. Springer-Verlag.
Bechhofer, S.; Goble, C.; and Horrocks, I. 2001. DAML+OIL is not enough. In Proc. of the First Semantic Web Working Symposium (SWWS'01), 151–159. CEUR Electronic Workshop Proceedings.
Berners-Lee, T. 1999. Weaving the Web. San Francisco: Harper.
Borgida, A., and Patel-Schneider, P. F. 1994. A semantics and complete algorithm for subsumption in the CLASSIC description logic. J. of Artificial Intelligence Research 1:277–308.
Calvanese, D.; De Giacomo, G.; Lenzerini, M.; Nardi, D.; and Rosati, R. 1998. Information integration: Conceptual modeling and reasoning support. In Proc. of CoopIS'98, 280–291.
Calvanese, D.; De Giacomo, G.; and Lenzerini, M. 1999. Answering queries using views in description logics. In Proc. of DL'99, 9–13. CEUR Electronic Workshop Proceedings, Vol-22.
Decker, S.; van Harmelen, F.; Broekstra, J.; Erdmann, M.; Fensel, D.; Horrocks, I.; Klein, M.; and Melnik, S. 2000. The semantic web: The roles of XML and RDF. IEEE Internet Computing 4(5).
Donini, F. M.; Lenzerini, M.; Nardi, D.; and Nutt, W. 1997. The complexity of concept languages. Information and Computation 134:1–58.
Fensel, D.; van Harmelen, F.; Horrocks, I.; McGuinness, D. L.; and Patel-Schneider, P. F. 2001. OIL: An ontology infrastructure for the semantic web. IEEE Intelligent Systems 16(2):38–45.
Grädel, E.; Otto, M.; and Rosen, E. 1997. Two-variable logic with counting is decidable. In Proc. of LICS-97, 306–317. IEEE Computer Society Press.
Grosso, W. E.; Eriksson, H.; Fergerson, R. W.; Gennari, J. H.; Tu, S. W.; and Musen, M. A. 1999. Knowledge modelling at the millennium (the design and evolution of Protégé-2000). In Proc. of Knowledge Acquisition Workshop (KAW-99).
Haarslev, V., and Möller, R. 2001. High performance reasoning with very large knowledge bases: A practical case study. In Proc. of IJCAI-01.
Hollunder, B., and Baader, F. 1991. Qualifying number restrictions in concept languages. In Proc. of KR-91, 335–346.
Horrocks, I., and Sattler, U. 2001. Ontology reasoning in the SHOQ(D) description logic. In Proc. of IJCAI-01. Morgan Kaufmann.
Horrocks, I., and Tessaris, S. 2000. A conjunctive query language for description logic Aboxes. In Proc. of AAAI 2000, 399–404.
Horrocks, I.; Sattler, U.; and Tobies, S. 1999. Practical reasoning for expressive description logics. In Ganzinger, H.; McAllester, D.; and Voronkov, A., eds., Proc. of LPAR'99, 161–180. Springer-Verlag.
Horrocks, I.; Sattler, U.; and Tobies, S. 2000. Reasoning with individuals for the description logic SHIQ. In Proc. of CADE-17, LNAI, 482–496.
Horrocks, I. 1998a. The FaCT system. In de Swart, H., ed., Proc. of TABLEAUX-98, 307–312. Springer-Verlag.
Horrocks, I. 1998b. Using an expressive description logic: FaCT or fiction? In Proc. of KR-98, 636–647.
McGuinness, D. L. 1998. Ontological issues for knowledge-enhanced search. In Proc. of FOIS, Frontiers in Artificial Intelligence and Applications. IOS-press.
McIlraith, S.; Son, T.; and Zeng, H. 2001. Semantic web services. IEEE Intelligent Systems 16(2):46–53.
Patel-Schneider, P. F. 1998. DLP system description. In Proc. of DL'98, 87–89. CEUR Electronic Workshop Proceedings, Vol-11.
A Survey of Ontology Concepts, Description Languages, and Methodologies

1. The Concept of Ontology

The concept of ontology originated in philosophy and can be traced back to the ancient Greek philosopher Aristotle (384-322 B.C.).
In philosophy it is defined as "a systematic description of the things that objectively exist in the world, i.e., the theory of being"; it is a systematic account or explanation of objective existence and is concerned with the abstract essence of objective reality [1].
In the artificial intelligence community, the earliest definition of ontology was given by Neches et al., who defined an ontology as a definition of "the basic terms and relations that make up the vocabulary of a domain, together with the rules, built from those terms and relations, that determine the extension of that vocabulary" [1].
In Neches's original wording: "An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary." [6]
Later, in fields such as information systems and knowledge systems, more and more researchers studied ontologies and proposed many different definitions.
The best known and most widely cited definition is Gruber's: "an ontology is an explicit specification of a conceptualization". The original passage reads: "An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an Ontology is a systematic account of Existence. For AI systems, what 'exists' is that which can be represented. When the knowledge of a domain is represented in a declarative formalism, the set of objects that can be represented is called the universe of discourse. This set of objects, and the describable relationships among them, are reflected in the representational vocabulary with which a knowledge-based program represents knowledge. Thus, in the context of AI, we can describe the ontology of a program by defining a set of representational terms. In such an ontology, definitions associate the names of entities in the universe of discourse (e.g., classes, relations, functions, or other objects) with human-readable text describing what the names mean, and formal axioms that constrain the interpretation and well-formed use of these terms. Formally, an ontology is the statement of a logical theory." [2, 3]
Ontology: Theoretical Research and Application Modeling

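Ontology: Theoretical Research and Application Modeling. A summary of "A Survey of Ontology Research", the W3C ontology working group documents, and Jena programming applications

1 About Ontology

1.1 Definition of Ontology

Ontology was originally a philosophical category; later, with the development of artificial intelligence, the AI community gave it a new definition.
However, the early understanding of ontology was incomplete, and the definitions kept evolving. Representative definitions are listed below:

[Table of definitions not preserved in the source text.]

The explanation of the last definition brings out four aspects of the meaning of ontology:
● Conceptualization: a model obtained by abstracting the relevant concepts of some phenomena in the real world; its meaning is independent of any particular state of the environment.
● Explicit: the concepts used, and the constraints on their use, are explicitly defined.
● Formal: the ontology is machine-readable.
● Shared: an ontology captures commonly agreed knowledge and reflects the set of concepts accepted in the relevant domain; it is aimed at a community rather than an individual.
The goal of an ontology is to capture the knowledge of a domain, provide a shared understanding of that knowledge, identify the vocabulary commonly agreed on in the domain, and give explicit definitions of these terms and of the relationships among them at different levels of formalization.

1.2 Modeling Primitives of Ontology

Perez et al. organized ontologies taxonomically and identified five basic modeling primitives:
● Classes or concepts: anything at all, such as job descriptions, functions, behaviours, strategies, and reasoning processes.
Semantically, a concept denotes a set of objects. Its definition generally uses a frame structure, comprising the concept's name, the set of its relations to other concepts, and a natural-language description of the concept.
● Relations: the interactions between concepts in the domain, formally defined as a subset of an n-ary Cartesian product, R: C1 × C2 × … × Cn.
An example is the subclass relation (subclass-of).
Semantically, a relation corresponds to a set of tuples of objects.
● Functions: a special kind of relation.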
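The relation and function primitives above can be sketched in Python. This is a minimal illustration under stated assumptions: the concept sets (PERSON, COURSE), the relation TEACHES, and the helper names are hypothetical, and "function" is read in the usual way, as a relation whose last component is uniquely determined by the preceding ones, which the text above does not spell out.

```python
from itertools import product

# Hypothetical concepts, each modelled as a set of objects.
PERSON = {"ann", "bob"}
COURSE = {"logic", "databases"}

# A binary relation: a set of (person, course) tuples.
TEACHES = {("ann", "logic"), ("ann", "databases"), ("bob", "logic")}


def is_relation(rel, *domains):
    """True if every tuple of rel lies in the Cartesian product of domains."""
    return rel <= set(product(*domains))


def is_function(rel):
    """True if the first n-1 components determine the last one uniquely."""
    seen = {}
    for *args, value in rel:
        # setdefault stores the first value seen for these arguments;
        # a different later value means rel is not functional.
        if seen.setdefault(tuple(args), value) != value:
            return False
    return True
```

Here TEACHES is a relation over PERSON × COURSE but not a function, because "ann" is related to two different courses.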
Focus on Language

The method is now greatly improved…
Focus on language - Body - Methods
Evaluation of methods: The procedure we followed has certain advantages over the existing method.
Focus on language - Body - Expository
Our theory is based on the assumption that…
This theory proceeds from the idea (principle) of…
The underlying concept of the theory is as follows.
Focus on language - Body - Derivation
… is given by:
… as follows: … as in the following:
The following equation is obtained.
This becomes… Therefore, we have…
The work presented in this paper focuses on several aspects of…
The principal purpose (objective, task) of the present
(further, preliminary) work is to investigate the features of (mechanisms involved in, effects produced by…)
Ontology Development and Query Retrieval using Protégé Tool (IJISA-V5-N9-8)

I.J. Intelligent Systems and Applications, 2013, 09, 67-75
Published Online August 2013 in MECS
DOI: 10.5815/ijisa.2013.09.08

Ontology Development and Query Retrieval using Protégé Tool

Vishal Jain
Research Scholar, Computer Science and Engineering Department, Lingaya's University, Faridabad, India
E-mail: vishaljain83@

Dr. Mayank Singh
Associate Professor, Krishna Engineering College, Ghaziabad, India
E-mail: mayanksingh2005@

Abstract: This paper describes the concept of ontology, with a focus on the development process and methodology involved in building an ontology. Ontologies have contributed to the development of the Semantic Web, an extension of the current World Wide Web in which information is given a well-defined meaning, so that unstructured data is translated into a knowledge representation that enables computers and people to work in cooperation. The Semantic Web can thus be described as information in machine-understandable form. It is also called the Global Information Mesh (GIM). Semantic Web technology can be used to address the limitations of traditional search engines and retrieval techniques within organizations, or for e-commerce applications whose initial focus is on professional users. An ontology represents information in such a way that machines can use it not only for display, but also for automating, integrating, and reusing the same information across applications, including Artificial Intelligence, Information Retrieval (IR) and many more. An ontology is defined as a set of concepts, their definitions and the relationships among them, represented in a hierarchical manner termed a taxonomy. Various tools are available for developing ontologies, such as Hozo, DOML, and Altova SemanticWorks.
We have used Protégé, one of the most widely used ontology development editors, which defines ontology concepts (classes), properties, taxonomies, various restrictions and class instances. It also supports several ontology representation languages, including OWL. Several versions of Protégé are available, such as WebProtégé 2.0 beta, Protégé 3.4.8 and Protégé 4.1. In this paper, we illustrate ontology development using Protégé 3.1 through an example of the Computer Science Department of a university system. It may be useful for future researchers building ontologies with Protégé version 3.1.

Index Terms: Semantic Web, Ontology Development, OWL, Protégé 3.1

I. Introduction

The World Wide Web is the largest database in the universe, and it is mostly understandable by human users rather than by machines. The WWW is a human-focused web: it discovers documents for people. It lacks a semantic structure that maintains the interdependency and scalability of its components. It returns the results of a query by following hyperlinks between resources, and produces a large number of results that may or may not satisfy the user's query, so irrelevant information is presented to the user. In the current web, resources are accessible through hyperlinks to web content spread throughout the world. The content is machine readable but not machine understandable. The current WWW does not support the concept of ontologies, and users cannot make inferences because complete data is unavailable. The enormous collection of unstructured data on the web makes it difficult to extract information about a particular domain. Hence information extraction is a logical step towards retrieving relevant data. Information Retrieval is defined as the process of extracting relevant results in the context of a given query.
It is described as the task of identifying documents on the basis of properties assigned to the documents by the various users requesting retrieval. There are many Information Retrieval techniques for extracting keywords, such as NLP-based extraction techniques. Content-based image retrieval systems require users to adopt new and challenging search strategies based on the visual content of images [1]. Multimedia information retrieval provides retrieval capabilities for text and images along different dimensions such as form, content and structure. When text annotation is nonexistent or incomplete, content-based methods must be used. Retrieval accuracy can be improved by content-based methods [2].

The remaining sections of the paper are as follows. Section 2 introduces the Semantic Web, including its architecture and its importance as a future web technology. In this section, we also discuss ontology and its components, and list the differences between relational databases and ontologies. Section 3 describes the development of an ontology for a "Computer Science Department" using the Protégé tool, by way of a case study.

II. Semantic Web

2.1 Importance

The futuristic concept of the Semantic Web is needed to make the present web more precise and effective by increasing the structure and size of the current web. The Semantic Web (SW) uses Semantic Web documents (SWDs) that must be combined with Web-based indexing. The idea of the Semantic Web (SW) as envisioned by Tim Berners-Lee came into existence in 1996 with the aim of translating given information into machine-understandable form.

2.2 Definition

The Semantic Web is the new-generation Web that tries to represent information such that it can be used by machines not just for display purposes, but for automation, integration, and reuse across applications [3]. The emerging Semantic Web technology has revolutionized the way we use the Web to find and organize information.
It is defined as a framework for expressing information, on which various languages and approaches can be developed to increase IR effectiveness. The Semantic Web (SW) uses Semantic Web documents (SWDs) that are written in SW languages such as OWL and DAML+OIL. Semantic Web documents are thus the means of information exchange in the Semantic Web (SW).

The Semantic Web (SW) is an extension of the current WWW in which documents are annotated in a machine-understandable markup language. Semantic Web technology can be used first to address efficiency, productivity and scalability challenges within enterprises or for e-commerce applications, with the initial focus on professional users [4].

Tim Berners-Lee (inventor of the Web, HTTP, and HTML) says that the Semantic Web will be the next generation of the current Web and the next IT revolution [6, 7, 8]. It is treated as a future concept or technology. In Fig. 1, at the bottom of the architecture we find XML, a language that lets us write structured documents according to predefined guidelines or syntax. XML is particularly suitable for sending documents across the Web [9]. RDF is a basic data model for writing simple statements about Web objects (resources). The RDF model has three components: Resource, Property and Statement. RDF statements are commonly written in an XML-based syntax; RDF is therefore located on top of the XML layer [10]. RDF Schema (RDFS) provides modeling primitives for organizing Web objects into hierarchies. Its key primitives are classes and properties, subclass and subproperty relationships, and domain and range restrictions [11]. RDF Schema is based on RDF. It is the RDF vocabulary description language, and represents relationships between groups of resources. The Logic layer is used in the development of ontologies and in producing knowledge representation documents written in either XML or RDF.
The Proof layer involves the actual deductive process as well as the representation of proofs in Web languages (from lower levels) and proof validation [12]. Finally, the Trust layer will emerge through the use of digital signatures and other kinds of knowledge, based on recommendations. The Semantic Web is envisioned as a collection of information linked in a way that can be easily processed by machines. This whole vision depends on agreeing upon common standards, something that is used and extended everywhere [13, 14].

Fig. 1: Semantic Web layered architecture [5]

Berners-Lee outlined the architecture of the Semantic Web in the following three layers [15]:

The metadata layer: It contains the concepts of resource and property. RDF (Resource Description Framework) is the most popular data model for the metadata layer.
The schema layer: Web ontology languages (OWL) are introduced here to define a hierarchical description of concepts (is-a hierarchy) and properties. RDFS (RDF Schema) is a popular schema layer language.
The logical layer: A set of web ontology languages is introduced at this layer to provide a richer set of modeling primitives.

The Semantic Web plays a very important role in replacing slow, ineffective, inefficient and non-intelligent web processes with fast, effective and inexpensive automatic processes. We can make the web more precise and increase retrieval capacity by adding annotations to documents. The Semantic Web will allow both humans and machines to find and make use of data in ways that were not previously possible with the WWW.

The Semantic Web (SW) and the World Wide Web (WWW) differ from each other in various aspects, which are summarized in Table 1.

Table 1: Comparison between Web and Semantic Web [16]

The WWW consists primarily of content for human consumption. Content links to other content on the WWW via the Uniform Resource Locator (URL).
The URL relies on surrounding context (if any) to communicate the purpose of the link that it represents; usually the user infers the semantics. Web content typically contains formatting instructions for a nice presentation, again for human consumption [17]. WWW content does not have any formal logical constructs. Correspondingly, the Semantic Web consists primarily of statements for application consumption. The statements link together via constructs that can form semantics, the meaning of the link. Thus, link semantics provide a defined meaningful path rather than a user-interpreted one. The statements may also contain logic that allows further interpretation and inference of the statements.
2.3 Ontology
The term ontology can be defined in many different ways. Genesereth and Nilsson defined an ontology as an explicit specification of a set of objects, concepts and other entities that are presumed to exist in some area of interest, together with the relationships that hold among them. Ontologies allow Web software components to be supported through Semantic Web technologies [18]. An ontology helps people understand the concepts of a domain and helps machines interpret the definitions of those concepts and the relations between them. Ontologies can be broadly divided into two main types: lightweight and heavyweight. Lightweight ontologies involve a taxonomy (or class hierarchy) that contains classes, subclasses, attributes and values. Heavyweight ontologies model domains in a deeper way and include axioms and constraints [19]. The Ontology layer consists of a hierarchical arrangement of the important concepts in the domain and describes the ontology concepts, relationships and constraints. Fig. 2 displays an ontology and its constituent parts.
Fig. 2: “Ontology and its components [20]”
Advantages
There are many advantages of using ontologies in Semantic Web technology.
Some of them are as follows [21, 22]:
∙ Sharing a common understanding of the structure of information among people or software agents is one of the more common goals in developing ontologies [23].
∙ An ontology enables reuse of domain knowledge in representing concepts and their relationships.
∙ Making explicit the domain assumptions underlying an implementation makes it possible to change these assumptions easily if our knowledge about the domain changes [24].
∙ Separating the domain knowledge from the operational knowledge is another common use of ontologies. We can describe the task of configuring a product from its components according to a required specification and implement a program that does this configuration independently of the products and components themselves [25].
∙ Use of an ontology makes it possible to analyze domain knowledge on the basis of the terms declared in a document.
∙ Each user has defined attributes and relationships with other users.
∙ Ontology is considered the backbone of the Semantic Web, since SW translates the given data into a machine-understandable language using the concept of ontologies [26].
∙ Ontology development is a cooperative process; it allows different people to express their views on a given domain.
∙ Ontology language editors help to build the SW.
2.4 Ontology Languages and Editors
An ontology language is a formal language used to encode an ontology. Various languages are listed below:
∙ DAML+OIL: DAML stands for DARPA Agent Markup Language, where DARPA is the Defense Advanced Research Projects Agency. OIL stands for Ontology Interchange Language. DAML+OIL is based on Description Logic (DL).
∙ SWRL: It stands for Semantic Web Rule Language. It adds rules to OWL DL.
∙ OWL: It stands for Web Ontology Language. It is used to represent relations between entities by using formal semantics and vocabulary.
Ontology Editors: These are applications designed to assist in the creation and modification of ontologies.
Various editors are listed below:
∙ Protégé: It is a free, open-source knowledge acquisition system. It is written in Java and uses Swing to create its complex user interface.
∙ DOME: It stands for DERI Ontology Management Environment. It is designed for effective management of ontologies.
∙ Ontolingua: It is an ontology development environment from Stanford University's Knowledge Systems Laboratory. It supports the ontology construction process.
∙ Altova SemanticWorks: It is an RDF document editor and ontology development IDE. It creates and edits RDF documents, RDF Schema and OWL ontologies.
Table 2: “Comparison between RDBMS and Ontology”
III. Case Study
The Computer Science Department Ontology describes various terms used in a computer science department. It shows the terms and their inheritance but not other relationships. For example, Professor inherits from Teaching, which inherits from Staff, which is a specialization of Person. Similarly, Assistant inherits from Non-Teaching, which in turn inherits from Staff, which in turn inherits from Person. A screenshot of the Computer Science Department Ontology is shown in Fig. 3.
Computer Science Department Ontology
Computer Science
    Person
        Staff
            Teaching (faculty)
                Professor
                Reader
                Lecturer
            Non-Teaching
                Assistant
                Technician
        Student
            Post Graduate
            Graduate
    Publication
        Books
        Journals
3.1 Ontology Development
Tool: Protégé is an open-source tool for editing and managing ontologies. It is the most widely used domain-independent, freely available, platform-independent technology for developing and managing terminologies, ontologies, and knowledge bases in a broad range of application domains. Various versions of Protégé are available, of which the frequently used ones are Protégé 2000, Protégé 3.1, Protégé 3.4 beta, Protégé 3.4 (released recently) and Protégé 4.0 beta. It provides a rich set of knowledge modeling structures. We have used Protégé version 3.1 to develop our ontology of the Computer Science Department.
It provides multi-user support; class trees on different tabs are synchronized by default; the standard maximum memory allocation is 100 MB; the RDF backend validates frame names; handling of subslots is improved; and the database backend correctly identifies MS SQL Server and optimizes table creation accordingly.
Fig. 3: “Computer Science Department Ontology”
Fig. 3 shows the ontology of the Computer Science Department built with the Protégé tool.
3.2 Code Snippets
The following code snippets are taken from the Computer Science Department Ontology developed in Protégé 3.1.
XML Code Snippet
    <knowledge_base xmlns="/xml" xmlns:xsi="/2001/XMLSchema-instance" xsi:schemaLocation="/xml /xml/schema/protege.xsd">
      <class>
        <name>:SYSTEM-CLASS</name>
        <type>:STANDARD-CLASS</type>
        <own_slot_value>
          <slot_reference>:ROLE</slot_reference>
          <value value_type="string">Abstract</value>
        </own_slot_value>
        <superclass>:THING</superclass>
      </class>
      <class>
        <name>Staff</name>
        <type>:STANDARD-CLASS</type>
        <own_slot_value>
          <slot_reference>:ROLE</slot_reference>
          <value value_type="string">Concrete</value>
        </own_slot_value>
        <superclass>Person</superclass>
        <template_slot>ID</template_slot>
        <template_slot>Sal</template_slot>
      </class>
      <class>
        <name>Teaching</name>
        <type>:STANDARD-CLASS</type>
        <own_slot_value>
          <slot_reference>:ROLE</slot_reference>
          <value value_type="string">Concrete</value>
        </own_slot_value>
        <superclass>Staff</superclass>
        <template_slot>specialisation</template_slot>
      </class>
      <class>
        <name>Professor</name>
        <type>:STANDARD-CLASS</type>
        <own_slot_value>
          <slot_reference>:ROLE</slot_reference>
          <value value_type="string">Concrete</value>
        </own_slot_value>
        <superclass>Teaching</superclass>
      </class>
      <class>
        <name>Lecturer</name>
        <type>:STANDARD-CLASS</type>
        <own_slot_value>
          <slot_reference>:ROLE</slot_reference>
          <value value_type="string">Concrete</value>
        </own_slot_value>
        <superclass>Teaching</superclass>
      </class>
      <class>
        <name>TeachingAssistant</name>
        <type>:STANDARD-CLASS</type>
        <own_slot_value>
          <slot_reference>:ROLE</slot_reference>
          <value value_type="string">Concrete</value>
        </own_slot_value>
        <superclass>Teaching</superclass>
      </class>
    </knowledge_base>
RDF Code Snippet
    <?xml version='1.0' encoding='UTF-8'?>
    <!DOCTYPE rdf:RDF [
      <!ENTITY rdf '/1999/02/22-rdf-syntax-ns#'>
      <!ENTITY a '/system#'>
      <!ENTITY rdf_ '/rdf'>
      <!ENTITY rdfs '/2000/01/rdf-schema#'>
    ]>
    <rdf:RDF xmlns:rdf="&rdf;" xmlns:rdf_="&rdf_;" xmlns:a="&a;" xmlns:rdfs="&rdfs;">
      <rdfs:Class rdf:about="&rdf_;Academic" rdfs:label="Academic">
        <rdfs:subClassOf rdf:resource="&rdf_;Nonteaching"/>
      </rdfs:Class>
    </rdf:RDF>
OWL Code Snippet
    <?xml version="1.0"?>
    <rdf:RDF
        xmlns:xsp="http://www.owl-/2005/08/07/xsp.owl#"
        xmlns:swrlb="/2003/11/swrlb#"
        xmlns:swrl="/2003/11/swrl#"
        xmlns:protege="/plugins/owl/protege#"
        xmlns:rdf="/1999/02/22-rdf-syntax-ns#"
        xmlns:xsd="/2001/XMLSchema#">
      <owl:Ontology rdf:about=""/>
      <owl:Class rdf:ID="UndergraduateStudent">
        <rdfs:subClassOf>
          <owl:Class rdf:ID="Student"/>
        </rdfs:subClassOf>
      </owl:Class>
    </rdf:RDF>
In this paper, we have described the use of the Semantic Web in Information Retrieval with the help of ontologies. Information Retrieval over a collection of such documents offers new challenges and opportunities. The paper argues that the Semantic Web (SW) improves on the current World Wide Web (WWW) by setting out various differences between them. It gives a brief overview of ontologies and their role in the Semantic Web.
3.3 Class-Subclass
Fig. 4: “Ontology on Computer Science Department in Protégé 3.1 (Sub Class)”
3.4 Query Retrieval
Fig. 5: “Query retrieval: Staff Salary Greater than 25000”
Fig. 5 shows the result of a query given to the ontology-based system.
IV. Conclusion
An ontology represents information in such a manner that the information can be used by machines not only for display, but also for automating, integrating, and reusing the same information across various applications.
We have developed an ontology of the Computer Science and Engineering Department using the well-known ontology editor Protégé 3.1. Protégé is an open-source tool for editing and managing ontologies. It is the most widely used domain-independent, freely available, platform-independent technology for developing and managing ontologies. This paper should help new researchers develop an ontology for the Semantic Web using Protégé 3.1. The ontology can also be used by any university system to make relevant searches on the Web, and it can be extended further to improve the performance of Internet technology.
Acknowledgement
I, Vishal Jain, would like to give my sincere thanks to Prof. M. N. Hoda, Director, Bharati Vidyapeeth’s Institute of Computer Applications and Management (BVICAM), New Delhi, for giving me the opportunity to do a Ph.D. at Lingaya’s University, Faridabad.
References
[1] Carlo Meghini, Fabrizio Sebastiani and Umberto Straccia, “A Model of Multimedia Information Retrieval”.
[2] Henning Muller, Nicolas Michoux, David Bandon and Antoine Geissbuhler, “A Review of Content Based Image Retrieval Systems in Medical Applications - Clinical Benefits and Future Directions”.
[3] http://lpt.fri.uni-lj.si/research/15-semantic-web-and-ontologies/6-semantic-web-and-ontologies
[4] Harold Boley, Said Tabet and Gerd Wagner, “Design Rationale of RuleML: A Markup Language for Semantic Web Rules”, /papers/DesignRationaleRuleML-SWWS01paper20.pdf
[5] Gagandeep Singh, Vishal Jain, “Information Retrieval (IR) through Semantic Web (SW): An Overview”, in Proceedings of CONFLUENCE 2012 - The Next Generation Information Technology Summit, September 2012, pp. 23-27.
[6] Christoph Bussler, Dieter Fensel, Alexander Maedche, “A Conceptual Architecture for Semantic Web Enabled Web Services”, NSF-EU Workshop on Database and Information Systems Research for Semantic Web and Enterprises, April 3-5, 2002, Amicalola Falls and State Park, Georgia.
[7] P. Lambrix, “Towards a Semantic Web for Bioinformatics using Ontology-based Annotation”, in Proceedings of the 14th IEEE International Workshops on Enabling Technologies: Infrastructures for Collaborative Enterprises, 2005, pp. 3-7.
[8] Vladan Devedzic, “Semantic Web Education”, Springer, 2006, pp. 33-50.
[9] /staff/fh/CM3028/index.php
[10] Dario Bonino, “Architectures and Algorithms for Intelligent Web Applications”, December 2005.
[11] Zhaohui Wu, Huajun Chen, “Semantic Grid: Model, Methodology and Applications”, Springer, 2008, pp. 26-32.
[12] Junhua Qu, Chao Wei, Wenjuan Wang, Fei Liu, “Research on a Retrieval System Based on Semantic Web”, 2011 IEEE International Conference on Internet Computing and Information Services. /10.1109/ICICIS.2011.142
[13] Grigoris Antoniou and Frank van Harmelen, “Web Ontology Language: OWL”.
[14] Thomas B. Passin, “Explorer's Guide to the Semantic Web”, Manning Publications Co., 2004.
[15] Grigoris Antoniou and Frank van Harmelen, “A Semantic Web Primer”, The MIT Press, Cambridge, Massachusetts; London, England.
[16] /column/uploads/1/article_4.txt
[17] Ee-Peng Lim and Aixin Sun, “Web Mining - The Ontology Approach”.
[18] /article.cfm?id=the-semantic-web
[19] Sergey Sosnovsky, Darina Dicheva, “Ontological technologies for user modeling”, Int. J. Metadata, Semantics and Ontologies, Vol. 5, No. 1, 2010.
[20] /wiki/Semantic_Web
[21] Noy and McGuinness, “Ontology Development 101: A Guide to Creating Your First Ontology”, Stanford University.
[22] Sugumaran and Storey, “The Role of Domain Ontologies in Database Design: An Ontology Management and Conceptual Modeling Environment”, ACM Transactions on Database Systems, Vol. 31, No. 3, September 2006, pp. 1064-1094.
[23] Chandrasekaran, Josephson, Benjamins, “What are Ontologies and why do we need them”, IEEE Intelligent Systems, Jan/Feb 1999.
[24] Tim Berners-Lee, “The Semantic Web Revisited”, IEEE Intelligent Systems, 2006.
[25] Lina Tankelevičienė, “Ontology and Ontology Engineering: Analysis of Concepts, Classifications and Potential Use in E-Learning Context”, Technical Report MII-SED-08-01, February 2008.
[26] Daniel L. Rubin, Natalya F. Noy and Mark A. Musen, “Protégé: A Tool for Managing and Using Terminology in Radiology Applications”, Journal of Digital Imaging, 2007 Nov; 20(Suppl 1): 34-46.
Authors’ Profiles
Vishal Jain completed his M.Tech (CSE) at USIT, Guru Gobind Singh Indraprastha University, Delhi, and is pursuing a PhD in the Computer Science and Engineering Department, Lingaya’s University, Faridabad. Presently he is working as Assistant Professor at Bharati Vidyapeeth’s Institute of Computer Applications and Management (BVICAM), New Delhi. His research areas include Web Technology, Semantic Web and Information Retrieval. He is also associated with CSI and ISTE.
Dr. Mayank Singh completed his M.E. in software engineering at Thapar University and his PhD at Uttarakhand Technical University. His research areas include Software Engineering, Software Testing, Wireless Sensor Networks and Data Mining. Presently he is working as Associate Professor at Krishna Engineering College, Ghaziabad. He is associated with CSI, IE (I), IEEE Computer Society India and ACM.
Ontology-Based Definition of Term

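Published in Terminology Standardization & Information Technology (《术语标准化与信息技术》), 2009, No. 2
Ontology-Based Definition of Term
Kit Chunyu, Feng Zhiwei
Abstract (translated from the Chinese): This paper comprehensively reviews the historical background and theoretical basis of the traditional definition of term, and fully affirms its positive role in the founding and development of terminology, particularly in existing work such as terminology standardization. Through examples, however, we point out the limitations of the traditional definition and propose a new ontology-based definition of term. The aim is to invite further discussion and to extend traditional terminology, long confined to nominal terms, into a terminological theory that covers the broader range of linguistic expressions of all kinds of specialized knowledge.
Keywords: term; definition; ontology
Ontology-based definition of term
Kit Chunyu, Feng Zhiwei
Abstract: This paper reviews the history and theoretical background of the traditional definition of term and recognizes its positive role in previous terminology work such as terminology standardization. Based on available evidence from real text data, however, we point out the limitations of this definition and propose an ontology-based definition as a call for the development of ontological terminology.
Key words: term; definition; ontology
This paper attempts to examine the linguistic properties of terms from the perspective of ontology, and puts forward some new views that differ from the traditional definition of term. By reviewing and analyzing the theoretical background and role of the traditional definition, and on the basis of ample evidence, we point out its limitations and attempt to give a new ontology-based definition of term, together with an account of why this attempt is necessary and feasible.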
Language Competence and Language Performance

An essay on linguistics
Language Competence and Language Performance
Name: xxx  Class: xxx  Number: xxx
[Abstract] Language is a tool of communication, and the fundamental purpose of mastering a language is to communicate. This article presents Chomsky’s and Hymes’ definitions of language competence and performance, analyzes the distinction and relationship between them, and draws implications for foreign language learning, in the hope of providing some help to learners.
[Key words] language competence, language performance, learning.
In discussions of language learning, people often talk about competence and performance, two aspects of language ability. They rely on each other and promote each other. Understanding their meanings and differences correctly has a positive influence on our foreign language learning.
Ⅰ. The Differences between Competence and Performance
In order to understand the two concepts of language competence and language performance, it is worth presenting the points of view of two linguists.
1.1 Chomsky’s standpoint
Chomsky is a U.S. linguist famous for his important ideas about language, including the idea that everyone is born with knowledge about grammar.
According to Chomsky, language competence and language performance are completely different. Competence is the ability to recognize and understand sentences, including the ability to generalize from language material and derive language rules and well-formed sentences. In his view, language refers to the internal knowledge of speakers and listeners. Competence is the speaker’s underlying linguistic knowledge and abstract mastery of the language system. Performance is the actual use of that competence in behavior. Although performance is derived from competence, it is not equal to competence. Performance is unique and diverse, and always dependent on the environment. Competence is an abstract concept that is rich in detail but cannot be directly observed. The two are fundamentally different phenomena.
To attach importance to language is to attach importance to its most basic element, namely language competence.
1.2 Hymes’ standpoint
Hymes is an American anthropologist and linguist famous for his idea of communicative competence.
According to Hymes, language competence means not only producing sentences that obey the grammatical rules, but also using the language appropriately. He put forward the notion of communicative competence in opposition to Chomsky's language competence. Hymes held that Chomsky’s account involves only language rules and ignores the social rules that are an important characteristic of the communicative function of language as a social phenomenon. In his view, competence is the ability to communicate fluently in a particular environment. That is to say, one’s communicative competence includes not only language knowledge, but also the ability to use language appropriately in a social environment. Training in the four basic language skills (listening, speaking, reading and writing) should reflect the interdependence of language and communication.
Ⅱ. The Relationship between Competence and Performance
Language competence and language performance are interdependent and complementary.
On the one hand, language competence is the premise of language performance. Competence is to performance what a composer’s skills are to composing or improvising music. Without solid language skills, one cannot gain a good ability to use the language.
On the other hand, improving language performance can enhance and consolidate the level of language competence. Competence shows itself through performance, but having language competence is not the same as being able to use the language. Performance can neither be replaced by competence nor improves automatically in step with it; it has to be gained through training and practice.
Ⅲ. Enlightenments to Foreign Language Learning
The ultimate purpose of foreign language learning is not only to learn grammatical knowledge, but also to cultivate students' ability to use the language. If we emphasize only the ability to use the language and neglect basic knowledge of the language, it is not possible to complete the task of foreign language learning. In other words, ignoring either competence or performance makes it impossible to reach the purpose of foreign language learning. The ideal outcome is for language competence and language performance to develop at the same pace.
Ⅳ. Conclusion
I think that for students of a foreign language, competence needs to be nurtured in the process of enhancing performance; competence calls for more effort at the beginning stage, while success in performance may motivate the acquisition of competence.
Reflections on Ontology-Based E-Commerce

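1. Introduction
With the development of global informatization, e-commerce (EC) has become an indispensable part of the world economy. Forecasts indicate that the total volume of global e-commerce will exceed USD 1.8 trillion in 2009. With the vigorous growth of e-commerce, EC models show new trends toward automation, intelligence and mobility. Traditional HTML-based EC platforms, however, lack semantic information, cannot separate presentation from data, and struggle to meet these new EC needs. Web meta-information processing and the move toward semantics, together with mature AI theories, methods and techniques, will play a key role in the development of EC.
An e-commerce platform is a place where the demand and supply information of both trading parties can be communicated and exchanged well, so the first requirement for building a good platform is that the two parties' information can be fully exchanged and can interoperate. However, because the trading parties differ in the computer systems they use, among other respects, interoperability problems arise, especially when business is conducted over the Internet. The problems that arise when computer systems interact fall broadly into four types: system heterogeneity, syntactic heterogeneity, structural heterogeneity and semantic heterogeneity. System heterogeneity covers incompatibilities between hardware and operating systems; syntactic heterogeneity refers to different languages and different data representations; structural heterogeneity refers to the use of different data models; semantic heterogeneity refers to differences in the meaning of the terms used when systems exchange information, as with synonyms. As technology has developed, the first three types of problem have gradually been addressed better, for example with CORBA, DCOM and various middleware products. The emergence of XML has, to some extent, addressed the last type.
XML's many advantages have brought major changes to e-commerce, particularly B2B. For example, XML supports the international character encoding standard Unicode and separates business rules from data content and structure, so users need only agree on data content and structure and can freely define and implement their own business rules, and enterprises can flexibly and conveniently establish many-to-many connections. XML [1], a metalanguage focused on describing data content and structure, is becoming the standard format for representing and exchanging information on the Internet thanks to its powerful ability to define and represent markup languages. Using XML as its representation syntax, the W3C developed RDF(S) [2], a standard for processing and exchanging metadata on the Web. RDF(S) makes it possible to apply ontology modeling techniques to EC content definition and information exchange, and thus to develop automated, intelligent and mobile EC platforms.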
Ontology Language Extensions to Support Localized Semantics, Modular Reasoning, and Collaborative Ontology Design and Ontology Reuse
Jie Bao and Vasant Honavar
Artificial Intelligence Research Laboratory
Computer Science Department
Iowa State University
Ames IA USA 50010
{baojie,honavar}@
Abstract. Modular approaches to the design and use of ontologies are essential to the success of the Semantic Web enterprise. We describe P-OWL (Package-based OWL), an extension of the widely used ontology language OWL that supports modular design, adaptation, use, and reuse of ontologies. P-OWL localizes the semantics of entities and relationships in OWL to modules called packages. P-OWL and the associated tools will greatly facilitate collaborative ontology construction, use, and reuse.
Keywords: Modular Ontology, Contextual Ontology, Package-extended Ontology, Semantic Web, OWL, P-OWL
1 Introduction
The Semantic Web [BL2001] aims to support seamless and flexible access to, and use of, semantically heterogeneous, networked data, knowledge, and services. The success of the Semantic Web enterprise relies on the availability of a large collection of domain-specific or application-specific ontologies, and of mappings between ontologies, to allow integration of data [RCH2003; BDS2003] as well as of components of complex workflows [PCH2004]. The increasing need for sharing of information and services between autonomous organizations has led to major efforts aimed at the construction of ontologies in many domains, e.g., the Gene Ontology [A2000] in biology.
By its very nature, ontology construction is a collaborative process. It involves direct cooperation among individuals or groups of domain experts or knowledge engineers, or indirect cooperation through reuse or adaptation of previously published, autonomously developed and, very likely, semantically heterogeneous ontologies. Despite this, relatively little attention has been paid to formalisms and tools for collaborative construction in such settings.
This state of affairs in ontology languages and ontology engineering is reminiscent of the early programming languages and first attempts at software engineering, when uncontrolled use of global variables, spaghetti code, and the absence of well-defined modules led to unwanted and uncontrolled interactions between code fragments.
Hence, there is an acute need for approaches and tools that facilitate collaborative modular design, adaptation, use, and reuse of ontologies. The lack of such tools is a major barrier to realizing the full potential of the Semantic Web. Against this background, this paper describes P-OWL (Package-based OWL), an extension of the widely used ontology language OWL that supports modular design, adaptation, use, and reuse of ontologies, and Ontomill, a collaborative ontology-building tool that includes an ontology editor and a reasoner. The rest of the paper is organized as follows. Section 2 describes some of the requirements of collaborative ontology design tools to motivate the work described in this paper. Section 3 presents the basic definitions and semantics of package-extended ontologies. Section 4 describes the reasoning algorithm for package-extended ontologies. Section 5 describes the syntax specifications for package-extended ontologies and a possible extension to OWL/RDF, called P-OWL. Section 6 concludes with a brief summary, a discussion of related work, and some directions for ongoing and future work.
2 Desiderata of Collaborative Ontology Tools
Consider the task of building an ontology for a large state university system. Typically, multiple relatively autonomous groups (faculty, programs, departments, colleges) contribute the parts of such an ontology that pertain to their domains of expertise or responsibility. The ontology for the university system should be a semantically coherent integration of the constituent ontologies developed by the individual groups. Hence, there is a need for collaborative ontology construction tools.
We enumerate below some desiderata of such collaborative ontology construction tools.
Local Terminology: Terms used in different ontologies, e.g., department names, research topics, and graduate student status, should be given unique identifiers. This is necessary to avoid name conflicts when merging two independently developed ontologies and to avoid unwanted interactions among modules. For example, one individual might define TurkeyStudy (the study of the country Turkey) as AsianStudy (the study of Asia) with (inRegion = Turkey), whereas another individual or group may unknowingly define Turkey as a subclass of Asia. Manual resolution of such name conflicts does not scale with increases in the size, number, and complexity of ontologies.
Localized Semantics: Collaborative ontology construction requires different groups to adapt or use ontologies that were independently developed by other groups. However, unrestricted use of entities and relationships from different ontologies can result in serious semantic conflicts, especially when the ontologies in question represent local views of the world from the respective points of view of the ontology producers. For example, university A may define A:AsianStudy and A:EuropeanStudy as two disjoint concepts in its ontology:
$\text{A:AsianStudy} \sqcap \text{A:EuropeanStudy} = \bot$
whereas university B may define
$\text{B:AsianStudy} \equiv \text{A:AsianStudy}$
$\text{B:EuropeanStudy} \equiv \text{A:EuropeanStudy}$
$\text{B:TurkeyStudy} \sqsubseteq \text{B:AsianStudy} \sqcap \text{B:EuropeanStudy}$
This will lead to obvious semantic conflicts if both ontologies have global semantics.
Ontology Evolution: Ontology construction is usually an iterative process. This is especially true in emerging areas of science in which there is little consensus concerning the basic entities and the assumed relationships among entities (i.e., ontological commitments). A small change in one part of an ontology may be propagated in an unintended and hence undesirable manner across the entire ontology.
For example, two universities A and B initially define ontologies that satisfy the following axioms:
$\text{A:AsianStudy} \sqcap \text{A:EuropeanStudy} = \bot$
$\text{A:TurkeyStudy} \sqsubseteq \text{A:AsianStudy}$
$\text{B:AsianStudy} \equiv \text{A:AsianStudy}$
$\text{B:EuropeanStudy} \equiv \text{A:EuropeanStudy}$
$\text{B:TurkeyStudy} \equiv \text{A:TurkeyStudy}$
But now university B decides that TurkeyStudy should be viewed as a kind of EuropeanStudy, and adds a new axiom:
$\text{B:TurkeyStudy} \sqsubseteq \text{A:EuropeanStudy}$
B:TurkeyStudy is now subsumed by both A:AsianStudy (through its equivalence with A:TurkeyStudy) and A:EuropeanStudy, which are disjoint. This has the unintended effect of making B:TurkeyStudy an empty concept, i.e., one with no members or instances.
Distinction between Organizational and Semantic Hierarchies: Ontologies are often organized in the form of subsumption (ISA, subclassOf) hierarchies defined over classes and properties. For example, suppose we are given
$\text{HistoryDepartment} \sqsubseteq \text{AcademicDepartments}$
$\text{HistoryDepartmentHall} \in \text{Building}$
$\text{HistoryStudentClubs} \sqsubseteq \text{StudentOrganization}$
Suppose that it is now desired to state that the above three concepts are all about the history department. If we were to introduce a new common superclass, say HistoryDepartmentRelated, for the classes that correspond to the three concepts, then instead of clarifying the semantics associated with the concepts in question, it would introduce logical ambiguity. This is because HistoryDepartmentHall, which is declared to be an instance of the Building concept, would (through subsumption) also become an instance of HistoryDepartmentRelated. This problem is even worse for properties because of the distinction between datatype properties (whose range is a predefined datatype) and object properties (whose range is a class or an instance of a class) in ontology languages such as OWL. It is usually hard to design a superproperty when both datatype extension and object type extension are possible in the future. When we examine ontologies in many application domains, we find that properties are much less organized than classes.
A related problem has to do with organizing instances into a hierarchy.
For example, if there are a dozen HistoryStudentClubs instances such as CivilWarClub, WarOf1812Club, RussianHistoryClub, EnglandHistoryClub and so on, it would be clearer to organize them into a hierarchy, such as
AmericanHistoryClubs
    CivilWarClub
    WarOf1812Club
EuropeanHistoryClubs
    RussianHistoryClub
    EnglandHistoryClub
However, this is hard to do in ontology languages such as OWL without modifying the ontology schema (T-Box), which is not always possible or safe.
In short, an organizational hierarchy of ontology entities may be different in structure from the semantic hierarchy. Furthermore, there might be a need for an organizational hierarchy even when a semantic hierarchy is missing.
Ontology Reuse: In collaborative design of ontologies, it often makes sense to reuse parts of existing ontologies. However, the lack of modularity and localized semantics in ontologies forces an all-or-nothing choice with regard to reuse of an existing ontology. For example, a university library may want to reuse part of the Library of Congress catalog ontology in creating its own ontology. It cannot do so directly, because ontology languages such as OWL do not support the import and reuse of only a part of an existing ontology. Modular ontologies facilitate more flexible and efficient reuse of existing ontologies.
Knowledge Hiding: In many applications, the provider of an ontology may not wish, because of copyright considerations or privacy or security concerns, to make the entire ontology visible to the outside, while being willing to expose certain parts of the ontology to certain subsets of users.
For example, if an ontology provider reuses a licensed commercial ontology, such as a part of the CYC ontology, the provider may not be able to reveal that part of the ontology to all users.
Proposed Approach
Current ontology languages such as DAML+OIL and OWL offer some degree of modularization by restricting ontology segments to separate XML namespaces, but they fail to fully support localized semantics, ontology evolution, the distinction between semantic and organizational hierarchies over concepts and properties, ontology reuse, and knowledge hiding. In this paper, we argue for package-based ontology language extensions to overcome these limitations. A package is an ontology module with a clearly defined access interface. Mapping between packages is performed by views, which define a set of queries on the referred packages. Semantics are localized by hiding the semantic details of a package behind appropriately defined interfaces (special views). Packages provide an attractive way to compromise between the need for knowledge sharing and the need for knowledge hiding in collaborative design and use of ontologies. The structured organization of ontology entities (classes, properties, instances) in packages brings to ontology design and reuse the same benefits as those provided by packages in software design and reuse in software engineering.
3 Syntax and Semantics of Package-extended Ontologies
Current ontology languages are based on description logics (DL), and so are the syntax and semantics of package-extended ontologies. Description logics are a family of knowledge representation languages that can be used to represent the knowledge of an application domain in a structured and precise fashion [BHS2003]. The interested reader is referred to [DCM2003] for details of description logic.
In this section, we define the syntax and semantics of package-extended ontologies.

Package

Definition 1 (Ontology Entity): An ontology entity is an axiom $e = [C \mid P \mid I]$, where $C$ is a class (concept) definition axiom, $P$ is a property (relation) definition axiom and $I$ is an instance (object) definition axiom.

Definition 2 (Scope Limitation Modifier, SLM): The scope limitation modifier of an ontology entity $e$ is a Boolean function $V_e(r)$, where $r$ is the identifier of a model that refers to $e$. Model($r$) can access $e$ if and only if $V_e(r) = \text{True}$.

Possible SLMs include, but are not limited to, public, protected, and private. They provide a controllable way to define the access interface of a package. Detailed semantics of the SLMs will be given later.

Definition 3 (Basic Package): A basic package is a logic model $P_b = \langle E, V \rangle$, where $E = \{e_i\}$ is a set of entities and $V = \{v_i\}$ is the set of their SLMs.

Definition 4 (Compositional Package): A compositional package is a logic model $P_c = \langle E, V, P \rangle$, where $E = \{e_i\}$ and $V = \{v_i\}$ are sets of entities and their SLMs, and $P$ is a set of basic or compositional packages. For all $P_i \in P$, we say $P_i \in_N P_c$ ($P_i$ is NestedIn $P_c$). We define $\in_N$ as a transitive property over packages such that

$P_1 \in_N P_2 \wedge P_2 \in_N P_3 \rightarrow P_1 \in_N P_3$

Packages can be recursively nested to form a package hierarchy. One advantage of a package hierarchy is that both the T-Box and the A-Box of a logic model (see below for precise definitions) can be structured in an organizational hierarchy, while their semantics may have a different hierarchy or no hierarchy at all.

Given a basic or compositional package $P$ with entity set $E$ and SLM set $V$, we have Definitions 5-7:

Definition 5 (SLM-member): Each $e_i \in E$ is called a $v_i$-member of $P$, denoted as $e_i \in_{v_i} P$.

Definition 6 (Home Package): $P$ is called the home package of $e_i$, denoted as $P = \text{HomePackage}(e_i)$.
For a compositional package, $P = \text{HomePackage}(P_i)$ for all $P_i \in P$.

Definition 7 (T-Box and A-Box): The subset of all class definitions and property definitions in $E$ is called the T-Box of $P$; the subset of all instance definitions in $E$ is called the A-Box of $P$.

Definition 8 (Default SLMs): Three default SLMs are specified as follows:
• $\text{Public}_e(r) := \text{True}$
• $\text{Protected}_e(r) := (r = \text{identifier of HomePackage}(e)) \vee (\text{Model}(r) \in_N \text{HomePackage}(e))$
• $\text{Private}_e(r) := (r = \text{identifier of HomePackage}(e))$

Definition 9 (Signature of Package): The signature of a package $P$ is a triple $\langle CN, PN, IN \rangle$, where $CN$, $PN$ and $IN$ refer to the sets of all names of classes, properties, and instances with $P$ as their home package, respectively.

Definition 10 (Entity Scope): The scope $S$ of an ontology entity $e$ in package $P$ is the set of models from which $e$ is visible:

$S(e) = \{\text{model}(r) \mid SLM_e(r) = \text{True}\}$

If $e$ is a public-member of $P$, $S(e)$ is the whole universe; if $e$ is a protected-member of $P$, $S(e)$ is $P$ and all its offspring packages; if $e$ is a private-member of $P$, $S(e)$ is only $P$.

Definition 11 (Default Interface): The Shallow Default Interface $I_s$ of a package $P$ is a subset of $P$'s signature such that

$EN_i \in I_s$ iff $e_i \in_{v_i} P$ and $V_i(r) = \text{True}$ for all $r$,

where $EN_i$ and $V_i(r)$ are the name and SLM of entity $e_i$. In other words, the shallow default interface consists of the names of all public entities in that package. The Deep Default Interface, or Default Interface for short, $I_d$ of a package $P$ is the union of its own shallow default interface $I_s$ and the deep default interface of its home package:

$I_d(P) = I_s(P) \cup I_d(\text{HomePackage}(P))$

Note that the definition of the deep default interface is recursive, which means that all the visible entities in its parent path are also in $P$'s own default interface. If a package has no home package, its shallow default interface is also its (deep) default interface.
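Definitions 4, 8 and 11 can be illustrated with a minimal Python sketch. It is not part of the proposed formalism: the class name `Package`, its fields, and the example package and entity names are hypothetical, and SLMs are modeled as plain strings rather than Boolean functions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

PUBLIC, PROTECTED, PRIVATE = "public", "protected", "private"


@dataclass
class Package:
    """Hypothetical model of a package: entity names mapped to their SLM."""
    name: str
    home: Optional["Package"] = None                        # home package, None for a root
    members: Dict[str, str] = field(default_factory=dict)   # entity name -> SLM

    def nested_in(self, other: "Package") -> bool:
        """Transitive NestedIn relation of Definition 4."""
        p = self.home
        while p is not None:
            if p is other:
                return True
            p = p.home
        return False

    def can_access(self, entity: str, requester: "Package") -> bool:
        """Evaluate the default SLMs of Definition 8 for a requesting package."""
        slm = self.members[entity]
        if slm == PUBLIC:
            return True
        if slm == PROTECTED:
            return requester is self or requester.nested_in(self)
        return requester is self  # private

    def shallow_interface(self) -> Set[str]:
        """Names of all public entities of this package (Definition 11)."""
        return {n for n, slm in self.members.items() if slm == PUBLIC}

    def deep_interface(self) -> Set[str]:
        """Own shallow interface plus the deep interface of the home package."""
        own = self.shallow_interface()
        return own if self.home is None else own | self.home.deep_interface()


# Hypothetical example packages.
clubs = Package("StudentClubs",
                members={"Club": PUBLIC, "Budget": PROTECTED, "Ledger": PRIVATE})
history = Package("HistoryStudentClubs", home=clubs,
                  members={"CivilWarClub": PUBLIC, "Roster": PRIVATE})
library = Package("Library")
```

In this sketch, `history` can see the protected entity `Budget` of its home package while the unrelated package `library` cannot, and `history.deep_interface()` collects the public names along the parent path.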
Theorem 1: The default interface of package $P$ corresponds to the set of all entities that are visible from $P$:

$I_d(P) \subseteq \{\text{name of } e \mid P \in S(e)\}$

Proof: We have

$I_d(P) = I_s(P) \cup I_d(\text{HomePackage}(P))$.

Suppose the ancestors of $P$ are $P_1, \dots, P_m$. Then we have

$I_d(P) = I_s(P) \cup I_s(P_1) \cup \dots \cup I_s(P_m)$.

For all $EN \in I_d(P)$, we have

$EN \in I_s(P)$ or $EN \in I_s(P_i)$, $i = 1, \dots, m$.

Suppose $EN$ is the name of $e$. Then either

$e \in_{\text{public}} P$, or $e \in_{\text{public}} P_i$, $i = 1, \dots, m$.

Both cases imply that, for all $r$,

$SLM_e(r) = \text{public}_e(r) = \text{True}$
$\Rightarrow SLM_e(\text{identifier of } P) = \text{True}$
$\Rightarrow P \in S(e)$.

Hence, $I_d(P) \subseteq \{\text{name of } e \mid P \in S(e)\}$. $\square$

Definition 12 (Horizon of a Package): The horizon $\mathcal{H}$ of a package $P$ is the set of all ontology entities that can be "seen" from $P$:

$\mathcal{H}(P) = \{e \mid P \in S(e)\}$

$\mathcal{H}(P)$ includes all members of $P$ and all public and protected members of all its ancestor packages.

Query, View and Interface

Packages provide a way to modularize an ontology and to localize knowledge. Now we turn to connecting the modules by specifying mappings between them.

A common way to connect ontology modules is one-to-one name mapping between modules. This is also supported by ontology languages such as OWL via the assertions owl:equivalentClass, owl:equivalentProperty and owl:sameIndividualAs. However, this approach to mapping between ontologies is rather limited in terms of the types of mappings that can be specified. In addition, such mappings are reflexive, which is not always desirable. We argue that to maintain the local semantics of a package, query-based or view-based mappings provide a better alternative.
We introduce such mappings in what follows.

Definition 13 (Local Interpretation of Package): A local interpretation of a package $P$ is a pair $\mathcal{I} = \langle \Delta^{\mathcal{I}}, (\cdot)^{\mathcal{I}} \rangle$, where the concept space $\Delta^{\mathcal{I}}$ contains a nonempty set of objects and the role space $(\cdot)^{\mathcal{I}}$ is a function over $\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ such that
• $C_i \in_{v_i} P$ iff $C_i^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$
• $P_i \in_{v_i} P$ iff $P_i^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$
• $I_i \in_{v_i} P$ iff $I_i^{\mathcal{I}} \in \Delta^{\mathcal{I}}$

Definition 14 (Distributed Interpretation of Packages): A distributed interpretation of a set of packages $\{P_i\}$, $i = 1, \dots, m$, is a family $\mathcal{I}_d = \{\mathcal{I}_i\}$, where $\mathcal{I}_i = \langle \Delta^{\mathcal{I}_i}, (\cdot)^{\mathcal{I}_i} \rangle$ is the local interpretation of $P_i$. The union of all $\Delta^{\mathcal{I}_i}$ is the distributed concept space $\Delta^{\mathcal{I}_d}$, and $(\cdot)^{\mathcal{I}_d} = \{\text{functions over } \Delta^{\mathcal{I}_d} \times \Delta^{\mathcal{I}_d}\}$ is the distributed role space.

Definition 15 (Query): Let $\{P_i\}$ be a set of packages, let $e_1, \dots, e_m$ be entity names in $\{I_d(P_i)\}$, and let $\mathcal{I}_d$ be the distributed interpretation of $\{P_i\}$. A query over $\{P_i\}$ is an expression of one of the following forms:
• Class Query: $C_q(x) := f_c(e_1, \dots, e_m) \subseteq \Delta^{\mathcal{I}_d}$, where $f_c$ is a unary (one free variable) logic construction function for classes.
• Property Query: $P_q(x, y) := f_p(e_1, \dots, e_m) \subseteq (\cdot)^{\mathcal{I}_d}$, where $f_p$ is a binary (two free variables) logic construction function for properties.
• Instance Query: $I_q := f_i(e_1, \dots, e_m) \in \Delta^{\mathcal{I}_d}$, where $f_i$ is a logic construction function with no free variable for instances.

The left-hand side of the expression is the definiendum of the query and the right-hand side is the definiens of the query.

Definition 16 (View): A view $W$ over a set of packages $\{P_i\}$ is a set of queries over $\{P_i\}$. $\{P_i\}$ is called the domain of the view.

If we do not limit the expressiveness of queries, a view can be as complex as any package. In practice, the expressiveness of queries allowed by Definition 15 should be restricted in order to ensure tractability of inference.
Possible candidates include conjunctive queries [SK2003] or disjunctions of conjunctive queries [CGL2001].

Definition 17 (Interface): An interface $F$ over package $P$ is a view over, and only over, $P$.

One module can have multiple interfaces, as shown in Figure 1, which enables multiple ways to reuse a package. For example, to reuse part of an existing ontology such as the Congress Library Catalog, multiple interfaces, such as "History" and "Computer Science", could be defined. The resulting interfaces would allow efficient and flexible reuse of the Congress Library Catalog ontology. Views also offer a reusable mechanism to connect packages if they (the views) are defined over multiple packages. Figure 2 shows two packages $P_3$ and $P_4$ reusing a view $V_1$ over two packages $P_1$ and $P_2$.

Definition 18 (Signature of View): The signature of a view $W$ is the set of names of all query definienda in that view.

Figure 1. A package with multiple interfaces.

Figure 2. A view can be built upon multiple packages and can be referred to by multiple modules.

Since an interface is a special kind of view, the signature of an interface can be defined in the same fashion.

Note that the default interface $I_d(P)$ of a package $P$ is the signature of a simple interface $F$ of $P$ with only equivalency assertions such as

$F{:}e_i \equiv P{:}e_i$

where $e_i$ is an ontology entity in $P$.

Package-extended Ontology

Views and interfaces control how knowledge in one or more packages can be referred to (exported to) by other packages. To complete the connection among packages, we should also specify how knowledge is imported into a package.

Definition 19 (Imported): A package $P_1$ is said to be imported into a package $P_2$ if the default interface of $P_1$, $I_d(P_1)$, is used in some entity definition axioms in $P_2$. A view $W$ is said to be imported into package $P_2$ if a subset of the signature of $W$ is used in some entity definition axioms in $P_2$.
The set of all imported packages and views of a package $P$ is called the domain of $P$.

Note that when a package $P$ or view $W$ is imported into a package, only the signature of $P$ or $W$ is used; the entity definition axioms are not exposed. The referring package deals only with the set of referred names, while the semantics of its domain remain intact. When reasoning over the semantics of its domain is needed, a reasoning request should be propagated to the referred package or view and resolved locally. Thus, locality of semantics is maintained while allowing global reasoning.

Definition 20 (Importing Closure): An importing closure $\Omega$ of a set of packages and views ensures that the domains of all views and packages in $\Omega$ are also in $\Omega$.

Definition 21 (Package-extended Ontology): A package-extended ontology is a pair $O = \langle P, W \rangle$, where $P$ is a set of packages and $W$ is a set of views defined on $P$. $P$ and $W$ constitute an importing closure.

4 Reasoning over Package-extended Ontology

We now briefly discuss reasoning over a package-extended ontology.

Reasoning in a package-extended ontology can be seen as distributed reasoning among autonomous ontology modules where no global semantics is guaranteed. Therefore, the whole reasoning process has to be built on local reasoning offered by individual modules.

We focus on the subsumption problem, that is, the problem of determining whether a class is a subclass of another class.
Many other reasoning problems can be reduced to subsumption, for example:
1. $C$ and $D$ are equivalent $\Leftrightarrow$ $C$ is subsumed by $D$ and $D$ is subsumed by $C$.
2. $C$ and $D$ are disjoint $\Leftrightarrow$ $C \sqcap D$ is subsumed by $\bot$ (the bottom concept).
3. $a$ is a member of $C$ $\Leftrightarrow$ $\{a\}$ is subsumed by $C$.

First we give the definition of subsumption reasoning in a package-extended ontology:

Definition 22 (Interpretation of Package-extended Ontology): An interpretation $\mathcal{I} = \langle \Delta^{\mathcal{I}}, (\cdot)^{\mathcal{I}} \rangle$ of a package-extended ontology $O = \langle P, W \rangle$ is the distributed interpretation of $\{P, W\}$, where the views in $W$ are treated as packages.

Definition 23 (Subsumption Reasoning Request, based on [SK2003]): A subsumption reasoning request over a package-extended ontology $O$ with respect to some interpretation $\mathcal{I}$ involves checking whether a class $C$ is subsumed by another class $D$ with respect to $\mathcal{I}$, denoted as $C \sqsubseteq D$ iff $\mathcal{I} \models C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$. We say $C \sqsubseteq D$ if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ for all possible $\mathcal{I}$.

We describe an extension of the Tableau algorithm in description logic [BCM2003, p78] for subsumption reasoning over a package-extended ontology. We restrict our discussion to the language ALCN. The general idea of the standard Tableau algorithm is to reduce the subsumption problem to an (un)satisfiability problem and to try to construct a possible interpretation for the given terminology. The reduction is easy to understand, since $C \sqsubseteq D$ iff $C \sqcap \neg D$ is unsatisfiable. $C \sqcap \neg D$ is first transformed into negation normal form (NNF), in which negation occurs only in front of concept names. Denoting the transformed expression as $C_0$, the algorithm starts with an ABox $A_0 = \{C_0(x_0)\}$ and applies consistency-preserving transformation rules [BCM2003, p81] to the ABox as far as possible. If a possible ABox is found, $C_0$ is satisfiable and the subsumption does not hold. If no possible ABox can be found, the subsumption holds.

The algorithm for distributed subsumption reasoning is as follows:

SubsumptionAnswer (C, D, O)
Input: Concepts C and D, ontology O = <P, W>
Return: True or False
    Construct an ABox A = {(C $\sqcap \neg$ D)(x)}; transform A into NNF.
    FOR all packages/views P referred to in A
        RETURN Satisfiable ({A}, P);
    END FOR

Satisfiable (S, P)
Input: Initial ABox set S, package/view P
Return: True or False
    FOR all ABoxes A_i in S
        Transform the concepts in A_i into NNF with respect to the entities visible from P;
        Apply ABox transformations as in the standard tableau algorithm, resulting in an augmented set of ABoxes S_i; S' = S $\cup$ S_i.
        IF $\exists$ A $\in$ S_i that is complete and consistent
            RETURN True;
        ELSE
            FOR all imported packages/views P'
                IF Satisfiable (S', P') = True
                    RETURN True;
                END IF
            END FOR
        END IF
    END FOR
    RETURN False;

An ABox is called complete if none of the transformation rules applies to it. An ABox is called consistent if no logic clash is found.

The basic idea of the Satisfiable algorithm is that a package or view can answer a Satisfiable request if a possible interpretation is found locally; otherwise it consults the packages and views in its domain. Although no global semantics is available, an interpretation of the "global" model is incrementally constructed by the queries among packages and views.

Provided that the domain of each module (package or view) is finite, the expanded importing path of every package has finite length, and no cyclic importing is allowed, the sequence of calls to Satisfiable is PSPACE-complete. It is easy to prove from the properties of the Tableau algorithm that the SubsumptionAnswer algorithm is sound, complete and terminating, and hence that the problem is decidable, given that all modules are limited to ALCN concept descriptions. Since satisfiability of ALCN concept descriptions is known to be PSPACE-complete in each of the packages, SubsumptionAnswer is also PSPACE-complete in this case.

5 Specifications of P-OWL Language

In this section, we show how the basic formalism of package-extended ontology can be incorporated into ontology languages such as OWL. Also, to keep backward compatibility with legacy systems, we want the extended ontology language to be as syntactically compatible as possible with existing ontology languages.
Hence, instead of introducing new syntax, a large part of this specification is given in OWL, RDF and RDFS to extend OWL. The part of the specification that cannot be given in OWL and RDF is specified using rules.

To allow a tradeoff between expressiveness and complexity, the proposed solution is offered in two versions. The Lite version enables basic package, view and interface functionalities, thereby providing support for modularity and information hiding; the Full version supports composition of packages.

P-OWL Lite specifications

Spec 1. A package is defined inside an XML namespace; one namespace can hold multiple packages.

Spec 2. Package is a special OWL class.
• $\text{Package} \sqsubseteq \text{owl:Class}$

Spec 3. There is one and only one global package $P_0$.
• $P_0 \in \text{Package}$

Spec 4. Each term belongs to a unique package. If not explicitly stated,
a) owl:Thing and owl:Nothing are assumed to be in package $P_0$;
b) for a class, the home package of its superclass is assumed as its home package; if it has no superclass, $P_0$ is assumed;
c) for a property, the home package of its superproperty is assumed as its home package; if it has no superproperty, $P_0$ is assumed;
d) for an instance, the home package of its class type is assumed as its home package.
• $\text{inPackage} \sqsubseteq \text{rdf:Property}$
• $\text{range}(\text{inPackage}) = \text{Package}$
• $x \in \{\text{owl:Thing}, \text{owl:Nothing}\} \rightarrow \text{inPackage}(x, P_0)$
• $\text{SubClassOf}(x, y) \wedge \text{inPackage}(x, \emptyset) \wedge \text{inPackage}(y, z) \rightarrow \text{inPackage}(x, z)$
• $\text{SubPropertyOf}(x, y) \wedge \text{inPackage}(x, \emptyset) \wedge \text{inPackage}(y, z) \rightarrow \text{inPackage}(x, z)$
• $x \in y \wedge \text{inPackage}(x, \emptyset) \wedge \text{inPackage}(y, z) \rightarrow \text{inPackage}(x, z)$

Spec 5. No two entities in one package can have identical local names; no two packages in one namespace can have identical names.

To remain compatible with the OWL language, an entity is given a unique storage name with its package name as prefix. The translation from package/local name to storage name should be supported by the ontology editor and reasoner.
For example, an entity named "OWL" defined in package "Language" could be stored as "Language_OWL" and an entity named "OWL" defined in package "Animal" could be stored as "Animal_OWL" in the same ontology.

Spec 6. Each term has an SLM. Possible SLMs include
a) Public: the term is visible to the whole universe;
b) Private: the term is visible inside this package only.
The default modifier is public.
• $\text{InPackagePublic} \sqsubseteq \text{inPackage}$
• $\text{InPackagePrivate} \sqsubseteq \text{inPackage}$
• $\text{InPackage}(x, p) \wedge \neg \text{InPackagePrivate}(x, p) \rightarrow \text{InPackagePublic}(x, p)$

Spec 7. The default interface of a package is the set of public entities in that package.
• $\text{DefaultInterface}(p) := \{x \mid \text{InPackagePublic}(x, p)\}$
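The rules of Specs 4 to 7 can be sketched in Python as follows. This is only an illustration under stated assumptions, not a P-OWL implementation: the function names, the dictionary-based representation, and the example terms are hypothetical, and a single `parent` map stands in for the superclass, superproperty and class-type relations alike.

```python
from typing import Dict, Iterable, Optional, Set

GLOBAL_PACKAGE = "P0"  # the unique global package of Spec 3


def storage_name(package: str, local_name: str) -> str:
    """Spec 5: prefix the local name with its package name."""
    return f"{package}_{local_name}"


def home_package(term: str,
                 explicit: Dict[str, str],
                 parent: Dict[str, str]) -> str:
    """Spec 4: an explicit package wins; otherwise inherit from the parent
    (superclass, superproperty or class type); otherwise fall back to P0."""
    seen: Set[str] = set()
    current: Optional[str] = term
    while current is not None and current not in seen:
        if current in explicit:
            return explicit[current]
        seen.add(current)                 # guard against cyclic parent links
        current = parent.get(current)
    return GLOBAL_PACKAGE


def default_interface(package: str,
                      terms: Iterable[str],
                      explicit: Dict[str, str],
                      parent: Dict[str, str],
                      private: Set[str]) -> Set[str]:
    """Specs 6 and 7: every term of the package that is not private is public."""
    return {t for t in terms
            if t not in private and home_package(t, explicit, parent) == package}


# Hypothetical example data, using storage names as term identifiers.
terms = ["Animal_Bird", "Animal_OWL", "Animal_Nest", "Language_OWL", "Orphan"]
explicit = {"Animal_Bird": "Animal", "Animal_Nest": "Animal", "Language_OWL": "Language"}
parent = {"Animal_OWL": "Animal_Bird"}   # Animal_OWL is a subclass of Animal_Bird
private = {"Animal_Nest"}

animal_interface = default_interface("Animal", terms, explicit, parent, private)
```

Here `Animal_OWL` has no explicit package and inherits "Animal" from its superclass, `Orphan` falls back to the global package, and the private term `Animal_Nest` is left out of the default interface of "Animal".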