Ontology integration in a multilingual e-retail system
Chinese-English Glossary of Corpus Linguistics Terms
Aboutness 所言之事
Absolute frequency 绝对频数
Alignment (of parallel texts) (平行或对应)语料的对齐
Alphanumeric 字母数字类的
Annotate 标注(动词)
Annotation 标注(名词)
Annotation scheme 标注方案
ANSI/American National Standards Institute 美国国家标准学会
ASCII/American Standard Code for Information Interchange 美国信息交换标准码
Associate (of keywords) (主题词的)联想词
AWL/Academic word list 学术词表
Balanced corpus 平衡语料库
Base list 底表、基础词表
Bigram 二元组、二元序列、二元结构
Bi-hapax 两次词
Bilingual corpus 双语语料库
CA/Contrastive Analysis 对比分析
Case-sensitive 大小写敏感、区分大小写
Chi-square (χ²) test 卡方检验
Chunk 词块
CIA/Contrastive Interlanguage Analysis 中介语对比分析
CLAWS/Constituent Likelihood Automatic Word-tagging System CLAWS词性赋码系统
Clean text policy 干净文本原则
Cluster 词簇、词丛
Colligation 类联接、类连接、类联结
Collocate n./v. 搭配词;搭配
Collocability 搭配强度、搭配力
Collocation 搭配、词语搭配
Collocational strength 搭配强度
Collocational framework/frame 搭配框架
Comparable corpora 类比语料库、可比语料库
ConcGram 同现词列、框合结构
Concordance (line) 索引(行)
Concordance plot (索引)词图
Concordancer 索引工具
Concordancing 索引生成、索引分析
Context 语境、上下文
Context word 语境词
Contingency table 连列表、联列表、列连表、列联表
Co-occurrence/Co-occurring 共现
Corpora 语料库(复数)
Corpus Linguistics 语料库语言学
Corpus 语料库
Corpus-based 基于语料库的
Corpus-driven 语料库驱动的
Corpus-informed 语料库指导的、参考了语料库的
Co-select/Co-selection/Co-selectiveness 共选(机制)
Co-text 共文
DDL/Data Driven Learning 数据驱动学习
Diachronic corpus 历时语料库
Discourse 话语、语篇
Discourse prosody 话语韵律
Documentation 备检文件、文检报告
EAGLES/Expert Advisory Groups on Language Engineering Standards EAGLES文本规格
Empirical Linguistics 实证语言学
Empiricism 经验主义
Encoding 字符编码
Error-tagging 错误标注、错误赋码
Extended unit of meaning 扩展意义单位
File-based search/concordancing 批量检索
Formulaic sequence 程式化序列
Frequency 频数、频率
General (purpose) corpus 通用语料库
Granularity 颗粒度
Hapax legomenon/hapax 一次词
Header/Text head 文本头、头标、头文件
HMM/Hidden Markov Model 隐马尔科夫模型
Idiom Principle 习语原则
Index/Indexing (建)索引
In-line annotation 文内标注、行内标注
Key keyword 关键主题词
Keyness 主题性、关键性
Keyword 主题词
KWIC/Key Word in Context 语境中的关键词、语境共现(方式)
Learner corpus 学习者语料库
Lemma 词目、原形词、词元
Lemma list 词形还原对应表
Lemmata 词目、原形词、词元(复数)
Lemmatization 词形还原、词元化
Lemmatizer 词形还原(词元化)工具
Lexical bundle 词束
Lexical density 词汇密度
Lexical item 词项、词语项目
Lexical priming 词汇触发理论
Lexical richness 词汇丰富度
Lexico-grammar/Lexical grammar 词汇语法
Lexis 词语、词项
LL/Log likelihood (ratio) 对数似然比、对数似然率
Longitudinal/Developmental corpus 跟踪语料库、发展语料库、历时语料库
Machine-readable 机读的
Markup 标记、置标
MDA/Multi-dimensional approach 多维度分析法
Metadata 元信息
Meta-metadata 元元信息
MF/MD (Multi-feature/Multi-dimensional) approach 多特征/多维度分析法
Mini-text 微型文本
Misuse 误用
Monitor corpus (动态)监察语料库
Monolingual corpus 单语语料库
Multilingual corpus 多语语料库
Multimodal corpus 多模态语料库
MWU/Multiword unit 多词单位
MWE/Multiword expression 多词单位
MI/Mutual information 互信息、互现信息
N-gram N元组、N元序列、N元结构、N元词、多词序列
NLP/Natural Language Processing 自然语言处理
Node 节点(词)
Normalization 标准化
Normalized frequency 标准化频率、标称频率、归一频率
Observed corpus 观察语料库
Ontology 知识本体、本体
Open Choice Principle 开放选择原则
Overuse 超用、过多使用、使用过度、过度使用
Paradigmatic 纵聚合(关系)的
Parallel corpus 平行语料库、对应语料库
Parole linguistics 言语语言学
Parsed corpus 句法标注的语料库
Parser 句法分析器
Parsing 句法分析
Pattern/patterning 型式
Pattern grammar 型式语法
Pedagogic corpus 教学语料库
Phraseology 短语、短语学
POSgram 赋码序列、码串
POS tagging/Part-of-Speech tagging 词性赋码、词性标注、词性附码
POS tagger 词性赋码器、词性赋码工具
Prefab 预制语块
Probabilistic (基于)概率的、概率性的、盖然的
Probability 概率
Rationalism 理性主义
Raw text/Raw corpus 生文本(语料)
Reference corpus 参照语料库
Regex/RE/RegExp/Regular Expressions 正则表达式
Register variation 语域变异
Relative frequency 相对频率
Representative/Representativeness 代表性(的)
Rule-based 基于规则的
Sample n./v. 样本;取样、采样、抽样
Sampling 取样、采样、抽样
Search term 检索项
Search word 检索词
Segmentation 切分、分词
Semantic preference 语义倾向
Semantic prosody 语义韵
SGML/Standard Generalized Markup Language 标准通用标记语言
Skipgram 跨词序列、跨词结构
Span 跨距
Special purpose corpus 专用语料库、专门用途语料库、专题语料库
Specialized corpus 专用语料库
Standardized TTR/Standardized type-token ratio 标准化类符/形符比、标准化类/形比、标准化型次比
Stand-off annotation 分离式标注
Stop list 停用词表、过滤词表
Stop word 停用词、过滤词
Synchronic corpus 共时语料库
Syntagmatic 横组合(关系)的
Tag 标记、码、标注码
Tagger 赋码器、赋码工具、标注工具
Tagging 赋码、标注、附码
Tag sequence 赋码序列、码串
Tagset 赋码集、码集
Text 文本
TEI/Text Encoding Initiative 文本编码计划
The Lexical Approach 词汇中心教学法
The Lexical Syllabus 词汇大纲
Token 形符、词次
Token definition 形符界定、单词界定
Tokenization 分词
Tokenizer 分词工具
Transcription 转写
Translational corpus 翻译语料库
Treebank 树库
Trigram 三元组、三元序列、三元结构
T-score T值
Type 类符、词型
TTR/Type-token ratio 类符/形符比、类/形比、型次比
Underuse 少用、使用不足
Unicode 通用码
Unit of meaning 意义单位
WaC/Web as Corpus 网络语料库
Wildcard 通配符
Word definition 单词界定
Word form 词形
Word family 词族
Word list 词表
XML/Extensible Markup Language 可扩展标记语言
Zipf's Law 齐夫定律
Z-score Z值
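More on the Translation of "Ontology"
Pang Xuequan
Abstract: How the important concept of Ontology is translated and understood bears heavily on the study of theories of being in Western philosophy, and the understanding of this concept ultimately comes down to the understanding of "to be".
Scholars of Western philosophy in China have long disagreed on this point, and in recent years some researchers have argued that “是” (shi, "is") is the most accurate way to translate and understand "to be".
Drawing on the research of the American scholar Kahn, a noted contemporary expert in linguistics and the history of philosophy, this paper discusses the multiple uses and multiple meanings that "to be" originally has.
It holds that one principal and basic use of "to be" is as a copula, expressing the sense of “是”. Even when it is used as a copula, it can also carry the senses of “是者” (shizhe, "that which is") and “存在” (cunzai, "existence"). Whether it expresses “是”, “存在”, or some other sense depends on the era, the context and the philosopher using it. Moreover, the sense of “存在” contained in "to be" differs across metaphysical theories: in some it means existence in the sense of fundamental substance (本体), in some existence in the sense of actual existence (实存), and in others existence in the sense of self-manifestation.
The translation and understanding of Ontology should therefore also be decided case by case.
Keywords: Western philosophy; the concept of Ontology; translation and understanding
Author: Pang Xuequan, male, PhD in philosophy, professor and doctoral supervisor at the College of Humanities, Zhejiang University (Hangzhou 310027).
Any discussion of the theory of being must first address the concept and the translation of Ontology.
The most common rendering of Ontology in Chinese philosophical circles used to be “本体论” (bentilun). In recent years a new rendering, “存在论” (cunzailun), has become popular, and views on the accuracy of these two prevailing renderings have always differed; some researchers hold that the correct rendering should be “是论” (shilun).
The disagreement stems from the etymological understanding of the word Ontology.
In terms of its root, Ontology is composed of the stem ont and the suffix logy, which means "doctrine".
ont is an inflected form of the Greek on, so the word denotes the study of on.
on is the neuter participle of the Greek einai, and einai corresponds to the English to be and the German sein.
In other words, the Greek on corresponds in meaning to the English Being, and Ontology is the study of being.
As the participle of to be, Being takes its meaning from to be.
The understanding of Ontology is therefore ultimately reduced to the understanding of to be.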
Ontology enrichment and indexing process
Ingénierie des Connaissances
RESEARCH REPORT No 03.05, May 2003
E. Desmontils, C. Jacquin, L. Simon ({desmontils,jacquin,simon}@irin.univ-nantes.fr)
Institut de Recherche en Informatique de Nantes, 2, rue de la Houssinière, B.P. 92208, 44322 NANTES CEDEX 3
18 p.
Research reports from the Institut de Recherche en Informatique de Nantes are available in PostScript® and PDF® formats at the URL: http://www.sciences.univ-nantes.fr/irin/Vie/RR/indexGB.html
© May 2003 by E. Desmontils, C. Jacquin, L. Simon

Abstract
Within the framework of Web information retrieval, this paper presents some methods to improve an indexing process which uses terminology-oriented ontologies specific to a field of knowledge. Techniques to enrich ontologies using specialization processes are proposed in order to manage pages which have to be indexed but which are currently rejected by the indexing process. This ontology specialization process is supervised, so as to offer the domain expert a decision-making aid concerning its field of application. The proposed enrichment is based on some heuristics to manage the specialization of the ontology, which can be controlled using a graphic tool for validation.
Categories and Subject Descriptors: H.3.1 [Content Analysis and Indexing]
General Terms: Abstracting methods, Dictionaries, Indexing methods, Linguistic processing, Thesauruses
Additional Key Words and Phrases: Ontology, Enrichment, Supervised Learning, Thesaurus, Indexing Process, Information Retrieval in the Web

1 Introduction
Search engines like Google or Altavista help us to find information on the Internet. These systems use a centralized database to index information and a simple keyword-based query interface to reach it. With such systems, recall is often satisfactory. Conversely, precision is weak, because these systems rarely take the content of documents into account when indexing them.
Two major approaches exist for taking the semantics of documents into account. The first approach concerns annotation techniques based on the use of ontologies. They consist in manually annotating documents using ontologies. The annotations are then used to retrieve information from the documents. They are mostly dedicated to request/answer systems (KAON ...).
The second approach for taking Web document content into account consists of information retrieval techniques based on the use of domain ontologies [8]. They are usually dedicated to retrieving documents which concern a specific request. For this type of system, the index structure of the web pages is given by the ontology structure. Thus, the document indexes belong to the concept set of the ontology. A problem encountered is that many concepts which are extracted from documents and which belong to the domain are not present in the domain ontology. Indeed, the domain coverage of the ontology may be too small.
In this paper, we first present the general indexing process based on the use of a domain ontology (section 2). Then, we present an analysis of experiment results which leads us to propose improvements of the indexing process based on ontology enrichment. They make it possible to increase the rate of indexed concepts (section 3). Finally, we present a visualisation tool which enables an expert to control the indexing process and the ontology enrichment.

2 Overview of the indexing process
The main goal is to build a structured index of Web pages according to an ontology. This ontology provides the index structure. Our indexing process can be divided into four steps (figure 1) [8]:
1. For each page, a flat index of terms is built. Each term of this index is associated with its weighted frequency. This coefficient depends on each HTML marker that describes each term occurrence.
2. A thesaurus makes it possible to generate all candidate concepts which can be labeled by a term of the previous index. In our implementation, we use the Wordnet thesaurus ([14]).
3. Each candidate concept of a page is studied to determine its representativeness of this page content. This evaluation is based on its weighted frequency and on the relations with the other concepts. It makes it possible to choose the best sense (concept) of a term in relation to the context. Therefore, the stronger the relationships a concept has with other concepts of its page, the more significant this concept is in its page. This contextual relation minimizes the role of the weighted frequency by increasing the weight of the strongly linked concepts and by weakening the isolated concepts (even those with a strong weighted frequency).
4. Among these candidate concepts, a filter is produced via the ontology and the representativeness of the concepts. Namely, a selected concept is a candidate concept that belongs to the ontology and has a high representativeness of the page content (the representativeness exceeds a threshold of sensitivity). Next, the pages which contain such a selected concept are assigned to this concept in the ontology.
Some measures are evaluated to characterize the indexing process. They determine the adequacy between the Web site and the ontology. These measures take into account the number of pages selected by the ontology (the Ontology Cover Degree or OCD), the number of concepts included in the pages (the Direct Indexing Degree or DID and the Indirect Indexing Degree or IID), and so on. The global evaluation of the indexing process (OSAD: Ontology-Site Adequacy Degree) is a linear combination of the previous measures (weighted means) over different thresholds from 0 to 1. The measure enables us to quantify the “quality” of our indexing process (see [8] for more details).
[Table: number of processed candidate concepts for “/” (1315 HTML pages), broken down into concepts in Wordnet and not in Wordnet, concepts with a representativeness degree greater than 0.3, and concepts valid and indexed (representativeness degree greater than 0.3); the individual cell values did not survive extraction.]
Footnote 5: like http://www.acronymfi, an online database that contains more than 277000 acronyms.
Footnote 6: For instance, [...], a search engine that allows keywords like AND, OR, NOT or NEAR.
[Table 2: Results of the indexing process concerning 1000 pages of the site of the CSE department of the University of Washington (with a threshold of 0.3). Columns: initial indexing process / with the pruning process. Recoverable value pairs: 84.33% / 98.86%, 58.75% / 87.04%, 56.84% / 81.5%, 0.62% / 11.5%.]
[...] phases!). This phenomenon is due to the enrichment algorithm, which authorizes the systematic addition of any representative concept (i.e. one above the threshold of representativeness) to the ontology of the domain. By contrast, the second enrichment method, which operates with pruning rules (see sub-section 3.3), adds only 136 concepts to the ontology. Notice also that this method keeps the rate of coverage (98.86%) of the enrichment method without pruning. Indeed, during this pruning phase, concepts which do not index enough pages (according to the threshold) are removed from the ontology. Their pages are then linked to the concepts that subsume them. Next, the number of concepts that index pages grows. This is not surprising, because we add only concepts that index a minimal number of pages. Finally, the rate of accepted concepts goes from 0.62% to 11.5%! So our process uses more of the available concepts that the pages contain.

4 OntologyManager: a user interface for ontology validation
A tool which makes it possible to control the ontology enrichment has been developed (see Figure 7). This tool, implemented in Java, proposes a tree-like view of the ontology. On the one hand, it proposes a general view of the ontology which enables the expert to easily navigate through the ontology; on the other hand, it proposes a more detailed view which informs the expert about the coefficients associated with concepts and pages. Notice that, in this last case, concepts are represented with different colours according to their associated coefficient, so a human expert can easily compare them. Moreover, some parts of the ontology graph can be masked in order to focus the expert's attention on a specific part of the ontology. We are now developing a new functionality for the visualisation tool. It enables the user to have a hyperbolic view of the ontology graph (like the OntoRama tool [9] or like H3Viewer [16]). In this context, the user can work with bigger ontologies.
The user interface also makes it possible to visualise the indexed pages (see Figure 8) and the ontology enrichment (by a colour system which can be customized). It will be easy for the human expert to validate or invalidate the added concepts, to obtain the indexing rate of a particular concept and to dynamically reorganize the ontology (by a drag and drop system). The concept validation process is divided into 4 steps defining 4 classes of concepts:
• bronze concepts: concepts proposed by our learning process and accepted by an expert just “to see”;
• silver concepts: concepts accepted by the expert for all indexing processes he/she does;
• gold concepts: concepts proposed by an expert to its community for testing;
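The four-step indexing process of section 2 can be illustrated with a small Python sketch. It is only a simplified reading of steps 1 and 4, not the authors' implementation: the HTML marker weights in `TAG_WEIGHTS` are hypothetical (the report does not give them), each term is treated directly as a candidate concept (the Wordnet lookup of step 2 is skipped), and the normalised weighted frequency stands in for the representativeness of step 3, ignoring the contextual relations between concepts. The function names and the `coverage` measure are likewise assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical weights per HTML marker; the report does not state actual values.
TAG_WEIGHTS = {"title": 10.0, "h1": 5.0, "b": 2.0, "p": 1.0}


def weighted_frequencies(occurrences):
    """Step 1 (simplified): build a flat index term -> weighted frequency.

    `occurrences` is a list of (term, html_tag) pairs for one page.
    Frequencies are normalised so that the strongest term scores 1.0.
    """
    totals = defaultdict(float)
    for term, tag in occurrences:
        totals[term] += TAG_WEIGHTS.get(tag, 1.0)
    if not totals:
        return {}
    top = max(totals.values())
    return {term: weight / top for term, weight in totals.items()}


def select_concepts(scores, ontology, threshold=0.3):
    """Step 4: keep candidates that belong to the ontology and whose
    score exceeds the sensitivity threshold."""
    return {c for c, s in scores.items() if c in ontology and s > threshold}


def index_site(pages, ontology, threshold=0.3):
    """Assign each page to the ontology concepts selected for it."""
    index = defaultdict(set)
    for url, occurrences in pages.items():
        scores = weighted_frequencies(occurrences)
        for concept in select_concepts(scores, ontology, threshold):
            index[concept].add(url)
    return dict(index)


def coverage(index, pages):
    """Fraction of pages indexed by at least one concept (an assumed,
    simplified stand-in for a coverage-style measure)."""
    indexed = set().union(*index.values()) if index else set()
    return len(indexed) / len(pages) if pages else 0.0


pages = {
    "a.html": [("ontology", "title"), ("index", "p")],
    "b.html": [("cat", "p")],
}
index = index_site(pages, ontology={"ontology", "index"})
print(index, coverage(index, pages))
```

In this toy run the page `b.html` is rejected because its only strong term is absent from the ontology, which is the situation the enrichment process described in the report is meant to address.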
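The Meaning and Translation of "Ontology"
Author: Zou Shipeng
In recent years the question of Ontology has once again become a focus of scholarly research. The question still centers on how to understand and translate Ontology. Most opinions hold that “本体” (benti) and “本体论” (bentilun) should be abandoned in favor of “存在” (cunzai, "existence") and “存在论” (cunzailun), or simply “是” (shi, "is") and “是论” (shilun).
Whether the right choice is “存在” and “存在论”, or “是” and “是论” (or “是态论”, shitailun), has become the crux of the debate.
In substance, this debate reflects the academic community's demand for quality in the study of Western scholarship, and it also expresses the community's perplexity and reflection over whether Chinese and Western cultures can communicate at their roots.
I. Ontology and the complex problem of its Chinese translation
Ontology (存在论) is the core field of philosophy.
As the name implies, it is the theory of "being": the theory of what being is and how being exists.
Although ontology was only named in the 17th century by the German scholastic Goclenius, and then refined and theoretically systematized by Wolff, the basic framework and theoretical content of this discipline had long since been established by ancient Greek philosophy.
In fact, ontology is itself the thematic form of ancient Greek philosophy.
Ontology, however, is not a theoretical system settled once and for all.
For the Western philosophical tradition, which constantly seeks to surpass its own theories, later Western philosophy clearly has reason to construct an Ontology that breaks with, or even differs fundamentally from, the "Ontology" of ancient Greek philosophy.
Etymologically, the complexity of Ontology stems from the complexity of its core concept to on (to be) in the evolution of Western thought; essentially, it stems from the differing philosophical conceptions of philosophers. This situation inevitably leads to different understandings of Ontology.
In particular, because Ontology, in the course of cultural transmission, collides with, merges with and is acculturated by other cultural traditions and their linguistic habits, the cross-cultural translation of Ontology, already highly complex within Western philosophy itself, becomes more complex still.
The Chinese translation of Ontology demonstrates this fully.
Over the past hundred years, Ontology has successively been translated as “物性学” and “万有学” (Richard Wilhelm), “实体论” (Chen Danian), “本体学” (Chang Shouyi), “万有论” (Chen Kang), “凡有论” and “至有论” (Zhang Junmai), “存有论” (Tang Junyi), “有根论” (Zhang Dainian), “是论” (Chen Kang, Wang Zisong, Wang Taiqing and others) and “是态论” (Chen Kang), among others.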
Ontologies
– Data exchange and archiving
• CRM = also a technical reference for use in comparing and evaluating information systems, data schema, &c.
• a basis for data transfer between incompatible systems
• XML doesn't do this on its own
• but CRM can be used for designing a common XML schema
• CRM = also a basis for archiving of data (cf. troubles with the 1986 Domesday Project…)
Tea
Water
Milk
Lemon
Sugar
– Are we sure all of us understand these “things” the same way?
• What is an “ingredient”? Please define: …
• What is a “required ingredient”? Please define: …
• What is an “optional ingredient”? Please define: …
Most of the available definitions are of little help to the layman:
“An ontology is an explicit specification of a conceptualization” (Thomas R. Gruber)
“An ontology is a logical theory accounting for the intended meaning of a formal vocabulary” (Nicola Guarino)
“A theory concerning the kinds of entities and specifically the kinds of abstract entities that are to be admitted to a language system” (Webster’s Third New International Dictionary)
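On the Duality of Ontology in Information Systems Research
About the authors: Wang Zhijin (1947-), male, professor and doctoral supervisor, has published more than 300 papers and 29 books; Jin Xin (1986-), male, master's student in information science (class of 2009), has published 4 papers; Wang Wenshuang (1985-), male, master's student in information science (class of 2009), has published 4 papers.
1 The concept and characteristics of Ontology
... an important component of sharing.
The term Ontology first arose in the 17th century and was used in philosophy, where it is synonymous with metaphysics and "first philosophy". Within philosophy, Ontology can be translated as “本体论” (bentilun). This theory is a systematic explanation or account of objective existence; it is concerned with the abstract essence of objective reality and is a theory that studies "being". It focuses on the cause of the existence of things, not on the result of their existence. Ontology established a path of philosophical inquiry that pursues first origins, sufficient reasons, ultimate identity and the principle of highest value [1].
Having once been a purely philosophical concept, Ontology was first used outside philosophy in artificial intelligence. It is now widely applied in knowledge engineering, knowledge representation, information retrieval, information summarization, knowledge management and other fields. Research on ontology abroad is very active, and it has even been applied to enterprise integration, natural language translation, medicine, e-commerce, geographic information systems, legal information systems, biological information systems and so on [2].
In practice, an Ontology outlines the basic body of knowledge and the descriptive language of a domain through a standardized description of concepts, terms and their mutual relations. The goal of an Ontology is to capture the knowledge of the relevant domain, provide a common understanding of that knowledge, determine the vocabulary commonly accepted within the domain, and give explicit definitions of these terms and of the relations between them in formal models at different levels.
Ontology has the following characteristics:
(1) Its range of use is very wide. An Ontology can be converted and mapped between different modeling methods, paradigms, languages and tools, and offers inheritability and interoperability between different systems.
(2) It resembles a database in function to some degree, but it is far richer than a database in the knowledge it can express. On the one hand, the languages used to define an Ontology can express much richer information than a database at both the lexical and the semantic level; on the other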
Motivic integration over Deligne-Mumford stacks
arXiv:math/0312115v5 [math.AG] 16 Dec 2004
TAKEHIKO YASUDA
Abstract. The aim of this article is to develop the theory of motivic integration over Deligne-Mumford stacks and to apply it to the birational geometry of Deligne-Mumford stacks.
Contents
1. Introduction
1.1. Notation and convention
1.2. Acknowledgments
2. Stacks of twisted jets
2.1. Short review of the Deligne-Mumford stacks
2.2. Stacks of twisted jets
2.3. Morphism of stacks of twisted jets
3. Motivic integration
3.1. Convergent stacks
3.2. Convergent spaces and coarse moduli spaces
3.3. Cohomology realization
3.4. Cylinders and motivic measure
3.5. Integrals of measurable functions
3.6. Motivic integration over singular varieties
3.7. Tame proper birational morphisms and twisted arcs
3.8. Fractional Tate objects
3.9. Shift number
3.10. Transformation rule
4. Birational geometry of Deligne-Mumford stacks
4.1. Divisors and invariants of pairs
4.2. Homological McKay correspondence and discrepancies
4.3. Orbifold cohomology
4.4. Convergence and normal crossing divisors
4.5. Generalization to singular stacks
4.6. Invariants for varieties
References

1. Introduction
In this article, we study the motivic integration over Deligne-Mumford stacks, which was started in [Yas1]. The motivic integration was introduced by Kontsevich [Kon] and developed by Denef and Loeser [DL1], [DL2] etc. It is now well-known that the motivic integration is effective in the study of birational geometry. For example, Batyrev [Bat] has applied it to the study of stringy E-functions and Mustaţă [Mus] to one of the singularities appearing in the minimal model program.
We first recall the motivic integration over varieties. Thanks to Sebag [Seb], we can work over an arbitrary perfect field $k$. Let $X$ be a variety over $k$, that is, a separated algebraic space of finite type over $k$. For a non-negative integer $n$, an $n$-jet of $X$ over a $k$-algebra $R$ is an $R[[t]]/t^{n+1}$-point of $X$. For each $n$, there exists an algebraic space $J_n X$ parameterizing $n$-jets. For example, $J_0 X$ is $X$ itself and $J_1 X$ is the tangent bundle of $X$. The spaces $J_n X$, $n \in \mathbb{Z}_{\ge 0}$, constitute a projective system and the limit $J_\infty X := \varprojlim J_n X$ exists. We can define a measure $\mu_X$ and construct an integration theory on $J_\infty X$ with values in some ring (or semiring) in which we can add and multiply the classes $\{V\}$ of varieties $V$ and some class of infinite sums are defined. For example, we can use a completion of the Grothendieck ring of mixed Hodge structures ($k = \mathbb{C}$) or mixed Galois representations ($k$ a finite field). If $X$ is smooth, then we have
$$\int_{J_\infty X} 1 \, d\mu_X = \mu_X(J_\infty X) = \{X\}.$$
To generalize the theory to Deligne-Mumford stacks, it is not sufficient to consider only $R[[t]]/t^{n+1}$-points of a stack. Inspired by a work of Abramovich and Vistoli [AV], the author introduced the notion of twisted jets in [Yas1]. Let $\mathcal{X}$ be a separated Deligne-Mumford stack of finite type over $k$ and $\mu_{l,k}$ be the group scheme of $l$-th roots of unity for a positive integer $l$ prime to the characteristic of $k$. A twisted $n$-jet over $\mathcal{X}$ is a representable morphism from a quotient stack $[(\operatorname{Spec} R[[t]]/t^{nl+1})/\mu_{l,k}]$ to $\mathcal{X}$. We will prove that the category $\mathcal{J}_n\mathcal{X}$ of twisted $n$-jets is a Deligne-Mumford stack. If $k$ is algebraically closed and $\mathcal{X}$ is a quotient stack $[M/G]$, then we have
$$\mathcal{J}_0\mathcal{X} \cong \coprod_{g \in \operatorname{Conj}(G)} [M^g/C_g].$$
Here $\operatorname{Conj}(G)$ is a representative set of conjugacy classes, $M^g$ the fixed point locus of $g$ and $C_g$ is the centralizer of $g$. The right hand side often appears in the study of McKay correspondence. There exists also the projective limit $\mathcal{J}_\infty\mathcal{X} := \varprojlim \mathcal{J}_n\mathcal{X}$. When $\mathcal{X}$ is smooth, we define a measure $\mu_{\mathcal{X}}$ and construct an integration theory on the point set $|\mathcal{J}_\infty\mathcal{X}|$.
Let $\mathbb{L}$ be the class $\{\mathbb{A}^1_k\}$ of an affine line. To a variety $X$ and an ideal sheaf $\mathcal{I} \subset \mathcal{O}_X$, we can associate a function $\operatorname{ord} \mathcal{I} : J_\infty X \to \mathbb{Z}_{\ge 0} \cup \{\infty\}$ and a function $\mathbb{L}^{\operatorname{ord} \mathcal{I}}$. Consider a proper birational morphism $f : Y \to X$ of varieties with $Y$ smooth. The Jacobian ideal sheaf $\operatorname{Jac}_f \subset \mathcal{O}_Y$ is defined to be the 0-th Fitting ideal of $\Omega_{Y/X}$. If $X$ is also smooth, then this is identical with the ideal sheaf of the relative canonical divisor $K_{Y/X} := K_Y - f^*K_X$. Let $f_\infty : J_\infty Y \to J_\infty X$ be the morphism induced by $f$. The relation of the measures $\mu_X$ and $\mu_Y$ is described by the following transformation rule:
$$\int_{J_\infty X} F \, d\mu_X = \int_{J_\infty Y} (F \circ f_\infty)\, \mathbb{L}^{-\operatorname{ord} \operatorname{Jac}_f} \, d\mu_Y.$$
This formula was proved by Kontsevich [Kon], Denef and Loeser [DL1], and Sebag [Seb]. Using this, we obtain many results in the birational geometry. For instance, Kontsevich proved the following: If $f : Y \to X$ and $f' : Y \to X'$ are proper birational morphisms of smooth proper varieties over $\mathbb{C}$, and if $K_{Y/X} = K_{Y/X'}$, then the Hodge structure of $H^i(X,\mathbb{Q})$ and that of $H^i(X',\mathbb{Q})$ are isomorphic.
We generalize the transformation rule to Deligne-Mumford stacks. If we consider only representable morphisms, no interesting phenomenon appears. A morphism of Deligne-Mumford stacks is said to be birational if it induces an isomorphism of open dense substacks. For example, if $M$ is a variety with an effective action of a finite group $G$, then the natural morphism from the quotient stack $[M/G]$ to the quotient variety $M/G$ is birational. A morphism $f : \mathcal{Y} \to \mathcal{X}$ is said to be tame if for every geometric point $y$ of $\mathcal{Y}$, $\operatorname{Ker}(\operatorname{Aut}(y) \to \operatorname{Aut}(f(y)))$ is of order prime to the characteristic of $k$. The transformation rule is generalized to tame, proper and birational morphisms.
Let $\tilde{x}$ be a geometric point of $\mathcal{J}_0\mathcal{X}$ and $x$ its image in $\mathcal{X}$. A $\mu_l$-action on the tangent space $T_x\mathcal{X}$ derives from $\tilde{x}$. If for suitable basis, $\zeta \in \mu_l$ acts by $\operatorname{diag}(\zeta^{a_1}, \ldots, \zeta^{a_d})$, $1 \le a_i \le l$, then we define
$$\operatorname{sht}(\tilde{x}) := d - \frac{1}{l}\sum_{i=1}^{d} a_i.$$
[...] $\mathcal{V}$, $\operatorname{sht}(\mathcal{V}) \in \mathbb{Q}$ is well-defined. If $D = 0$ and $W = |\mathcal{X}|$, then the invariant is equal to
$$\sum_{\mathcal{V} \subset \mathcal{J}_0\mathcal{X}} \{\mathcal{V}\}\, \mathbb{L}^{\operatorname{sht}(\mathcal{V})}.$$
If $k = \mathbb{C}$ and $\mathcal{X}$ is proper, this has the information of the Hodge structure of the orbifold cohomology defined below.
In characteristic zero, we can generalize the invariant $\Sigma_W(\mathcal{X},D)$ to the case where $\mathcal{X}$ is singular: A log Deligne-Mumford stack is defined to be the pair $(\mathcal{X},D)$ of a normal Deligne-Mumford stack $\mathcal{X}$ of finite type and a $\mathbb{Q}$-divisor $D$ on $\mathcal{X}$ such that $K_{\mathcal{X}} + D$ is $\mathbb{Q}$-Cartier. For a log Deligne-Mumford stack $(\mathcal{X},D)$ and a constructible subset $W \subset |\mathcal{X}|$, if $f : \mathcal{Y} \to \mathcal{X}$ is a proper birational morphism with $\mathcal{Y}$ smooth, then we define
$$\Sigma_W(\mathcal{X},D) := \Sigma_{f^{-1}(W)}(\mathcal{Y}, f^*(K_{\mathcal{X}} + D) - K_{\mathcal{Y}}).$$
This invariant is a common generalization and refinement of the stringy E-function and the orbifold E-function. By a calculation, we will see that $\Sigma_W(\mathcal{X},D) \ne \infty$ if and only if $(\mathcal{X},D)$ is Kawamata log terminal around $W$ (for the definition, see Definition 4.17). The following is the direct consequence of the transformation rule and viewed as a generalization of Batyrev's result and Denef and Loeser's one.
Theorem 1.3. Let $(\mathcal{X},D)$ and $(\mathcal{X}',D')$ be log Deligne-Mumford stacks. Assume that there exist a smooth DM stack $\mathcal{Y}$ and proper birational morphisms $f : \mathcal{Y} \to \mathcal{X}$ and $f' : \mathcal{Y} \to \mathcal{X}'$ such that $f^*(K_{\mathcal{X}} + D) = (f')^*(K_{\mathcal{X}'} + D')$ and $f^{-1}(W) = (f')^{-1}(W')$. In positive characteristic, assume in addition that $\mathcal{X}$ and $\mathcal{X}'$ are smooth and that $f$ and $f'$ are tame. Then we have
$$\Sigma_W(\mathcal{X},D) = \Sigma_{W'}(\mathcal{X}',D').$$
Remark 1.4. Kawamata [Kaw] obtained a closely related result in terms of the derived category.
Finally we give corollaries of this theorem. Let $G \subset GL_d(\mathbb{C})$ be a finite subgroup and $X := \mathbb{C}^d/G$ the quotient variety. For $g \in G$, we define a rational number $\operatorname{age}(g)$ as follows: Let $l$ be the order of $g$ and $\zeta := \exp(2\pi\sqrt{-1}/l)$. Writing $g = \operatorname{diag}(\zeta^{a_1}, \ldots, \zeta^{a_d})$ in a suitable basis with $0 \le a_i \le l-1$, we put
$$\operatorname{age}(g) := \frac{1}{l}\sum_{i=1}^{d} a_i.$$
If $g \in SL_d(\mathbb{C})$, then $\operatorname{age}(g)$ is an integer. The following was called the Homological McKay correspondence. It was proved by Y. Ito and Reid [IR] for dimension three and by Batyrev for arbitrary dimension [Bat].
(See also [Rei2].)
Corollary 1.5. Suppose that $G \subset SL_d(\mathbb{C})$ and that there is a crepant resolution $Y \to X$. For an even integer $i$, put
$$n_i := \sharp\{g \in \operatorname{Conj}(G) \mid \operatorname{age}(g) = i/2\}.$$
Then we have
$$H^i(Y,\mathbb{Q}) \cong \begin{cases} 0 & (i \text{ odd}) \\ \mathbb{Q}(-i/2)^{\oplus n_i} & (i \text{ even}). \end{cases}$$
Since $X = \mathbb{C}^d/G$ has only quotient singularities, $K_X$ is $\mathbb{Q}$-Cartier and its pull-back by arbitrary morphism is defined. For a resolution $f : Y \to X$ and for each exceptional prime divisor $E \subset Y$, there is a rational number $a(E,X)$ such that
$$K_Y \equiv f^*K_X + \sum_{E \subset Y} a(E,X)\,E.$$
The discrepancy of $X$ is defined to be the infimum of $a(E,X)$ for all resolutions $Y \to X$ and all exceptional divisors $E \subset Y$. The following is a refinement of the Reid–Shepherd-Barron–Tai criterion for canonical (or terminal) quotient singularities (see [Rei1, §4.11]).
Corollary 1.6. For a finite group $G \subset GL_d(\mathbb{C})$ without reflection, the discrepancy of $X = \mathbb{C}^d/G$ is equal to
$$\min\{\operatorname{age}(g) \mid 1 \ne g \in G\} - 1.$$
Chen and Ruan [CR] defined a new cohomology for topological orbifolds (Satake's V-manifolds), called orbifold cohomology. We give its algebraic version. Let $\mathcal{X}$ be a smooth Deligne-Mumford stack over $\mathbb{C}$.
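For $i \in \mathbb{Q}$, we define
$$H^i_{\mathrm{orb}}(\mathcal{X},\mathbb{Q}) := \bigoplus_{\mathcal{V} \subset \mathcal{J}_0\mathcal{X}} H^{i - 2\operatorname{sht}(\mathcal{V})}(\bar{\mathcal{V}},\mathbb{Q}) \otimes \mathbb{Q}(-\operatorname{sht}(\mathcal{V})).$$
Here $\bar{\mathcal{V}}$ is the coarse moduli space of $\mathcal{V}$. If $\mathcal{X}$ is proper, then $H^i_{\mathrm{orb}}(\mathcal{X},\mathbb{Q})$ is a pure Hodge structure of weight $i$. (We define Hodge structure with fractional weights in the trivial fashion.) The following was conjectured by Ruan [Rua] and a weak version was proved by Lupercio-Poddar [LP] and the author [Yas1] independently. This is a generalization of Kontsevich's theorem stated above.
Corollary 1.7. Let $\mathcal{X}$ and $\mathcal{X}'$ be proper and smooth Deligne-Mumford stacks of finite type over $\mathbb{C}$. Suppose that there exist a smooth Deligne-Mumford stack $\mathcal{Y}$ and proper birational morphisms $f : \mathcal{Y} \to \mathcal{X}$ and $f' : \mathcal{Y} \to \mathcal{X}'$ such that $K_{\mathcal{Y}/\mathcal{X}} = K_{\mathcal{Y}/\mathcal{X}'}$. Then for every $i \in \mathbb{Q}$, there is an isomorphism of Hodge structures
$$H^i_{\mathrm{orb}}(\mathcal{X},\mathbb{Q}) \cong H^i_{\mathrm{orb}}(\mathcal{X}',\mathbb{Q}).$$
We also define the $p$-adic orbifold cohomology. Let $\mathcal{X}$ be a smooth Deligne-Mumford stack over a finite field $k$ and $p$ a prime number different from the characteristic of $k$. If necessary, replacing $k$ with its finite extension, we define
$$H^i_{\mathrm{orb}}(\mathcal{X} \otimes \bar{k},\mathbb{Q}_p) := \bigoplus_{\mathcal{V} \subset \mathcal{J}_0\mathcal{X}} H^{i - 2\operatorname{sht}(\mathcal{V})}(\bar{\mathcal{V}} \otimes \bar{k},\mathbb{Q}_p) \otimes \mathbb{Q}_p(-\operatorname{sht}(\mathcal{V})).$$
Replacing $k$ is necessary to ensure that fractional Tate twists $\mathbb{Q}_p(-\operatorname{sht}(\mathcal{V}))$ exist.
Corollary 1.8. Let $\mathcal{X}$ and $\mathcal{X}'$ be proper and smooth Deligne-Mumford stacks of finite type over a finite field $k$. Suppose that there exist a smooth Deligne-Mumford stack $\mathcal{Y}$ and tame proper birational morphisms $f : \mathcal{Y} \to \mathcal{X}$ and $f' : \mathcal{Y} \to \mathcal{X}'$ such that $K_{\mathcal{Y}/\mathcal{X}} = K_{\mathcal{Y}/\mathcal{X}'}$. Suppose that the $p$-adic orbifold cohomology groups of $\mathcal{X}$ and $\mathcal{X}'$ are defined.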
Then for every $i \in \mathbb{Q}$, there is an isomorphism of Galois representations
$$H^i_{\mathrm{orb}}(\mathcal{X} \otimes_k \bar{k},\mathbb{Q}_p)^{ss} \cong H^i_{\mathrm{orb}}(\mathcal{X}' \otimes_k \bar{k},\mathbb{Q}_p)^{ss}.$$
Here the superscript “ss” means the semisimplification.
For varieties, T. Ito [Ito1] and Wang [Wan] obtained a similar result over number fields.
1.1. Notation and convention. Throughout this paper, we work over a perfect base field $k$. A Deligne-Mumford stack (DM stack for short) is supposed to be separated. What we mean by a variety is a separated algebraic space of finite type over $k$.
• $\mathbb{N}$, $\mathbb{Z}_{\ge 0}$: the set of positive integers and that of non-negative integers
• $[M/G]$: quotient stack
• $|\mathcal{X}|$: the set of points of $\mathcal{X}$
• $\bar{\mathcal{X}}$: the coarse moduli space of a DM stack $\mathcal{X}$
• $\mathcal{D}^l_{n,S} := [D_{nl,S}/\mu_{l,k}]$
• $D_{n,S} := \operatorname{Spec} R[[t]]/t^{n+1}$ ($S = \operatorname{Spec} R$)
• $\mu_l \subset \bar{k}$: the group of $l$-th roots of unity
• $\mu_{l,k} := \operatorname{Spec} k[x]/(x^l - 1)$: the group scheme of $l$-th roots of unity over $k$
• $\operatorname{Conj}(G)$: a representative set of conjugacy classes $[g]$ of $g \in G$
• $\operatorname{Conj}(\mu_l, G)$: a representative set of conjugacy classes of $\mu_l \hookrightarrow G$
• $J_n X$: $n$-jet space
• $J^{(a)}_n X$: for a scheme with $G$-action and $a : \mu_l \hookrightarrow G$, $J^{(a)}_n X \subset J_n X$ is the locus where the two $\mu_l$-actions on $J_n X$ coincide
• $\mathcal{J}^l_n\mathcal{X}$: the stack of twisted $n$-jets of order $l$
• $\mathcal{J}_n\mathcal{X} := \coprod_{\operatorname{char}(k) \nmid l} \mathcal{J}^l_n\mathcal{X}$: the stack of twisted $n$-jets
• $\pi_n : \mathcal{J}_\infty\mathcal{X} \to \mathcal{J}_n\mathcal{X}$, $\pi : \mathcal{J}_\infty\mathcal{X} \to \mathcal{X}$: natural projections
• $f_n : \mathcal{J}_n\mathcal{Y} \to \mathcal{J}_n\mathcal{X}$: the morphism induced by $f : \mathcal{Y} \to \mathcal{X}$
• R, S: the semirings of equivalence classes of convergent stacks and convergent spaces
• $\mathbb{L} := \{\mathbb{A}^1_k\}$
• MHS and MHS$^{1/r}$: the category of mixed Hodge structures and the category of [...]
2.1.1. We first review the Deligne-Mumford (DM) stack very briefly. We mention the book of Laumon and Moret-Bailly [LMB] as a reference on stacks. We will sometimes use results from it.
Fix a base field $k$. Let $(\mathrm{Aff}/k)$ be the category of affine schemes over $k$. A DM stack $\mathcal{X}$ is a category equipped with a functor $\mathcal{X} \to (\mathrm{Aff}/k)$ which satisfies several conditions. It should be a fibered category over $(\mathrm{Aff}/k)$ and is usually best understood in terms of the fiber categories $\mathcal{X}(S)$, for $S \in (\mathrm{Aff}/k)$, and the pull-back functors $f^* : \mathcal{X}(T) \to \mathcal{X}(S)$ for $f : S \to T$. The $\mathcal{X}(S)$ are groupoids with, at least for $S$ of finite type, finite automorphism groups.
The DM stacks constitute a 2-category. In terms of the fiber categories, a 1-morphism (or simply morphism) $f : \mathcal{Y} \to \mathcal{X}$ is the data of functors $f_S : \mathcal{Y}(S) \to \mathcal{X}(S)$, compatible with pull-backs, and a 2-morphism $f \to g$ is a system of morphisms of functors $f_S \to g_S$, compatible with pull-backs. A scheme, or more generally an algebraic space $X$, is identified with the DM stack with fibers the discrete categories with sets of objects the $X(S) := \operatorname{Hom}(S,X)$. A diagram of stacks consisting of morphisms $f : \mathcal{X} \to \mathcal{Y}$, $g : \mathcal{Y} \to \mathcal{Z}$ and $h : \mathcal{X} \to \mathcal{Z}$ is said to be commutative if a 2-isomorphism $g \circ f \cong h$ has been given. The strict identity $g \circ f = h$ is not required.
A morphism $f : \mathcal{Y} \to \mathcal{X}$ of DM stacks is called representable if for every morphism $M \to \mathcal{X}$ with $M$ an algebraic space, the fiber product $M \times_{\mathcal{X}} \mathcal{Y}$ is also an algebraic space. This is equivalent to the condition that for every object $\xi \in \mathcal{Y}$, the natural map $\operatorname{Aut}(\xi) \to \operatorname{Aut}(f(\xi))$ is injective. We can generalize many properties of a morphism of schemes to DM stacks: étale, smooth, proper etc. By a condition in the definition, for every DM stack $\mathcal{X}$, there exist an algebraic space $M$ and an étale surjective morphism $M \to \mathcal{X}$, which is called an atlas. We say that $\mathcal{X}$ is smooth, normal etc. if an atlas is so.
The diagonal morphism $\Delta : \mathcal{X} \to \mathcal{X} \times \mathcal{X}$ of a DM stack $\mathcal{X}$ is, by definition, representable. We say that $\mathcal{X}$ is separated if $\Delta$ is finite, that is, quasi-finite and proper. Note that $\Delta$ is not an immersion unless $\mathcal{X}$ is an algebraic space. In this paper, every DM stack is supposed to be separated.
2.1.2. Points and coarse moduli space. A point of a DM stack $\mathcal{X}$ is an equivalence class of morphisms $\operatorname{Spec} K \to \mathcal{X}$ with $K \supset k$ a field by the following equivalence relation: morphisms $\operatorname{Spec} K_1 \to \mathcal{X}$ and $\operatorname{Spec} K_2 \to \mathcal{X}$ are equivalent if there is another field $K_3 \supset K_1, K_2 \supset k$ making the square formed by $\operatorname{Spec} K_3 \to \operatorname{Spec} K_1 \to \mathcal{X}$ and $\operatorname{Spec} K_3 \to \operatorname{Spec} K_2 \to \mathcal{X}$ commutative. We denote the set of points by $|\mathcal{X}|$. It carries a Zariski topology: $A \subset |\mathcal{X}|$ is an open subset if $A = |\mathcal{Y}|$ for some open immersion $\mathcal{Y} \hookrightarrow \mathcal{X}$.
(see [LMB] for details). If $X$ is a scheme, then $|X|$ is equal to the underlying topological space as sets.

A coarse moduli space of a DM stack $\mathcal{X}$ is an algebraic space $X$ equipped with a morphism $\mathcal{X} \to X$ such that every morphism $\mathcal{X} \to Y$ with $Y$ an algebraic space uniquely factors through $X$ and, for every algebraically closed field $K \supset k$, the map $\mathcal{X}(K)/\mathrm{isom} \to X(K)$ is bijective. By the definition, it is clear that the coarse moduli space is unique up to isomorphism. Keel and Mori [KM] proved that for every DM stack, the coarse moduli space exists. If $X$ is the coarse moduli space of $\mathcal{X}$, then the map $|\mathcal{X}| \to |X|$ is a homeomorphism.

2.1.3. Quotient stack. One of the simplest examples is the quotient stack. Let $M$ be an algebraic space and $G$ a finite group (or an étale finite group scheme over $k$) acting on $M$. Then we can define the quotient stack $[M/G]$ as follows: an object over a scheme $S$ is a pair of a $G$-torsor $P \to S$ and a $G$-equivariant morphism $P \to M$, and a morphism of $(P \to S, P \to M)$ to $(Q \to T, Q \to M)$ over a morphism $S \to T$ is a $G$-equivariant morphism $P \to Q$ compatible with the other morphisms. This stack has the canonical atlas $M \to [M/G]$. There is also a natural morphism $[M/G] \to M/G$ which makes $M/G$ the coarse moduli space. The composition $M \to [M/G] \to M/G$ is the quotient map.

2.2. Stacks of twisted jets.

2.2.1. In the article [Yas1], the author introduced the notion of twisted jets. There, only twisted jets over fields were considered and the stack of twisted jets was constructed as a closed substack of another stack.
By contrast, in this paper, we consider the category of twisted jets parameterized by an arbitrary affine scheme and verify that it is actually a DM stack.

We first recall jets and arcs over a variety. Here we mean a separated algebraic space of finite type by a variety. Let $X$ be a variety and $n$ a non-negative integer. The functor
$$(\mathrm{Aff}/k) \to (\mathrm{Sets}), \qquad \operatorname{Spec} R \mapsto \operatorname{Hom}(\operatorname{Spec} R[[t]]/t^{n+1}, X)$$
is representable by a variety $J_n X$, called the $n$-jet space. The natural surjection $k[[t]]/t^{n+2} \twoheadrightarrow k[[t]]/t^{n+1}$ induces a natural projection $J_{n+1} X \to J_n X$. Since they are all affine morphisms, the projective limit $J_\infty X := \varprojlim J_n X$ exists. This is an algebraic space, but not generally of finite type. We call this the arc space. For every field extension $K \supset k$, there is an identification
$$\operatorname{Hom}(\operatorname{Spec} K, J_\infty X) = \operatorname{Hom}(\operatorname{Spec} K[[t]], X).$$
An arc of $X$ is a point of $J_\infty X$, that is, a morphism $\operatorname{Spec} K[[t]] \to X$.

For $S = \operatorname{Spec} R \in (\mathrm{Aff}/k)$ and a non-negative integer $n$, we put
$$D_{n,S} := \operatorname{Spec} R[[t]]/t^{n+1}.$$
For $l$ a positive integer prime to the characteristic of $k$, we denote by $\mu_l \subset \bar{k}$ the cyclic group of $l$-th roots of unity. We define also the group scheme of $l$-th roots of unity over $k$,
$$\mu_{l,k} := \operatorname{Spec} k[x]/(x^l - 1).$$
When $\mu_{l,k}$ is a constant group scheme, we identify it with the group $\mu_l$. The natural action of $\mu_{l,k}$ on $D_{n,S}$ is defined by $t \mapsto x \otimes t$.
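As a sanity check on the jet and arc spaces recalled above, one can work out the affine line. This worked example is an addition for illustration and is not part of the original text; the coordinates $a_0, a_1, \dots$ are names chosen here.

```latex
% R-points of the n-jet space of the affine line are truncated power series:
\[
\operatorname{Hom}\bigl(\operatorname{Spec} R[[t]]/t^{n+1}, \mathbb{A}^1_k\bigr)
  = R[[t]]/t^{n+1}
  = \{\, a_0 + a_1 t + \dots + a_n t^n \mid a_i \in R \,\}
  \cong R^{n+1}.
\]
% Hence the jet space is an affine space, and the projection forgets the top coefficient:
\[
J_n \mathbb{A}^1_k \cong \mathbb{A}^{n+1}_k = \operatorname{Spec} k[a_0, \dots, a_n],
\qquad
J_{n+1} \mathbb{A}^1_k \to J_n \mathbb{A}^1_k, \quad
(a_0, \dots, a_{n+1}) \mapsto (a_0, \dots, a_n).
\]
% The arc space is the limit, a polynomial ring in infinitely many variables:
\[
J_\infty \mathbb{A}^1_k = \varprojlim_n \mathbb{A}^{n+1}_k
  = \operatorname{Spec} k[a_0, a_1, a_2, \dots].
\]
```

The last scheme is affine but not of finite type over $k$, which agrees with the remark that the arc space is not generally of finite type.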
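We put
$$\mathcal{D}^l_{n,S} := [D_{nl,S}/\mu_{l,k}].$$
Also for $n = \infty$, and for a field $K \supset k$, we put
$$D_{\infty,K} := \operatorname{Spec} K[[t]] \quad \text{and} \quad \mathcal{D}^l_{\infty,K} := [D_{\infty,K}/\mu_{l,k}].$$

Definition 2.1. Let $\mathcal{X}$ be a DM stack. A twisted $n$-jet of order $l$ of $\mathcal{X}$ over $S$ is a representable morphism $\mathcal{D}^l_{n,S} \to \mathcal{X}$. For a field $K \supset k$, a twisted arc (or twisted infinite jet) of order $l$ of $\mathcal{X}$ over $K$ is a representable morphism $\mathcal{D}^l_{\infty,K} \to \mathcal{X}$.

Definition 2.2. Let $\mathcal{X}$ be a DM stack. Suppose $n < \infty$. We define the stack of twisted $n$-jets of order $l$, denoted $\mathcal{J}^l_n \mathcal{X}$, as follows: an object over $S \in (\mathrm{Aff}/k)$ is a representable morphism $\mathcal{D}^l_{n,S} \to \mathcal{X}$; a morphism from $\gamma: \mathcal{D}^l_{n,S} \to \mathcal{X}$ to $\gamma': \mathcal{D}^l_{n,T} \to \mathcal{X}$ over $f: S \to T$ is a 2-morphism from $\gamma$ to $f' \circ \gamma'$, where $f': \mathcal{D}^l_{n,S} \to \mathcal{D}^l_{n,T}$ is the morphism naturally induced by $f$. We will prove that it is actually a DM stack.

Definition 2.3. We define the stack of twisted $n$-jets of $\mathcal{X}$ by
$$\mathcal{J}_n \mathcal{X} := \coprod_{\mathrm{char}(k) \nmid l} \mathcal{J}^l_n \mathcal{X}.$$
If $\mathcal{X}$ is of finite type, then $\mathcal{J}^l_n \mathcal{X}$ is empty for sufficiently large $l$ and $\mathcal{J}_n \mathcal{X}$ is in fact the disjoint sum of only finitely many $\mathcal{J}^l_n \mathcal{X}$.

Lemma 2.4. The category $\mathcal{J}^l_n \mathcal{X}$ is a stack.

Proof. For an object $\gamma: \mathcal{D}^l_{n,S} \to \mathcal{X}$ of $\mathcal{J}^l_n \mathcal{X}$ and for a morphism $f: T \to S$, we have a "pull-back", $\gamma_T := f' \circ \gamma$, which is unique up to 2-isomorphisms. Here $f': \mathcal{D}^l_{n,T} \to \mathcal{D}^l_{n,S}$ is the natural morphism induced by $f$. Hence $\mathcal{J}^l_n \mathcal{X}$ is a groupoid.

We first show that for two objects $\gamma, \gamma': \mathcal{D}^l_{n,S} \to \mathcal{X}$, the functor
$$\mathcal{I}som(\gamma, \gamma'): (\mathrm{Aff}/S) \to (\mathrm{Sets}), \qquad (T \to S) \mapsto \operatorname{Hom}_{(\mathcal{J}^l_n \mathcal{X})(T)}(\gamma_T, \gamma'_T)$$
is a sheaf. Consider a morphism $T \to S$ and an étale cover $\coprod T_i \to T$. Let $T_{ij} := T_i \times_T T_j$. For every object $\alpha$ of $\mathcal{D}^l_{n,T}$, we have the pull-backs $\alpha_i$ and $\alpha_{ij}$ to $\mathcal{D}^l_{n,T_i}$ and $\mathcal{D}^l_{n,T_{ij}}$ respectively. Since $\mathcal{X}$ is a prestack, the sequence
$$0 \to \operatorname{Hom}_{\mathcal{X}(T)}(\gamma_T(\alpha), \gamma'_T(\alpha)) \to \prod \operatorname{Hom}_{\mathcal{X}(T_i)}(\gamma_{T_i}(\alpha_i), \gamma'_{T_i}(\alpha_i)) \rightrightarrows \prod \operatorname{Hom}_{\mathcal{X}(T_{ij})}(\gamma_{T_{ij}}(\alpha_{ij}), \gamma'_{T_{ij}}(\alpha_{ij}))$$
is exact. Since a morphism of twisted jets is a natural transformation of functors, it implies that the sequence
$$0 \to \operatorname{Hom}_{(\mathcal{J}^l_n \mathcal{X})(T)}(\gamma_T, \gamma'_T) \to \prod \operatorname{Hom}_{(\mathcal{J}^l_n \mathcal{X})(T_i)}(\gamma_{T_i}, \gamma'_{T_i}) \rightrightarrows \prod \operatorname{Hom}_{(\mathcal{J}^l_n \mathcal{X})(T_{ij})}(\gamma_{T_{ij}}, \gamma'_{T_{ij}})$$
is also exact, and the functor $\mathcal{I}som(\gamma, \gamma')$ is a sheaf.

It remains to show that one can glue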
objects.Let T i →T be an ´e tale cover,let γi :D l n,T i →X be twisted jets and let h ij :(γi )T ij →(γj )T ij be a morphism in (J l n X )(T ij ).Assume that they satisfy thecocycle condition.Then for every object αof D l n,T ,we can glue theobjects γi (αi )of X ,because X is a stack.Therefore we can determine the image of αand obtain a functor γ:D l n,T →X which is clearly representable.Thus we have verified all conditions.MOTIVIC INTEGRATION OVER DELIGNE-MUMFORD STACKS13 2.2.2.Let Y→X be a representable morphism of DM stacks.Then for a twisted jet D l n,S→Y,composing the morphisms,we obtain atwisted jet D l n,S→X.Thus we have a natural morphism J l n Y→J l n X. In[Yas1],we defined a barely faithful morphism to be a morphism f:Y→X of DM stacks such that for every objectξof Y,the map Aut(ξ)→Aut(f(ξ))is bijective.Thus all barely faithful morphisms are representable.Barely faithful morphisms are stable under base change[Yas1,Lemma4.21].Lemma2.5.Let Y→X be a barely faithful and formally´e tale mor-phism of DM stacks.Then the naturally induced diagramJ l n Y J l n XXis cartesian.Proof.Consider a commutative diagramS YXwhere the bottom arrow is representable and the left arrow is a natural one.Then we claim that there exists a unique morphism D l n,S→Y whichfits into the diagram.The lemma easily follows from it. Without loss of generality,we can assume that S is connected.Let U⊂D l n,S×X Y be the connected component containing the image of S.Then the natural morphism U→D l n,S is barely faithful,formally ´e tale and bijective,hence an isomorphism.It shows our claim.Thus we obtain an equivalence of categories,Y×X J l n X∼=J l n Y. 
For every DM stack X,there arefinite groups G i,schemes M i with G i-action and a morphism i[M i/G i]→X which is´e tale,surjective and barely faithful.Hence thanks to Lemma2.5,in proving that J l n X is a DM stack,we may assume that X is a quotient stack[M/G].Let k′/k be thefield extension by adding all l-th roots of unity for the order l of elements of G prime to the characteristic of k.Replacing k with k′and M with M⊗k k′,we may assume thatµl,k is a constant group scheme for l such that there is a twisted jet D l n,S→[M/G].The action µl D n,S induces an actionµl J n M.On the other hand,for each embedding a:µl֒→G,µl acts on M as a subgroup of G and on J n M.14TAKEHIKO YASUDADefinition 2.6.We define J (a )n M to be the closed subscheme of J n Mwhere the two actions µl J n M are identical.Definition 2.7.We define Conj(µl ,G )to be a representative set of the conjugacy classes of embeddings µl ֒→G .Proposition 2.8.For 0≤n ≤∞,there is an isomorphismJ l n X ∼=a ∈Conj(µl ,G )[J (a )nl M/C a ].Here C a is the centralizer of a .By this isomorphism,[J (a )nl M/C a ]cor-responds to twisted jets D l n,S →X inducing a :µl ֒→G .Proof.Let m :=nl .Choose a primitive l -th root ζ∈µl of unity.Let γ:D l n,S →X be an object over S of J l n X .The canonical atlas D m,S →D l n,S corresponds to the object αof D l n,SD m,S ×µl µl-action D m,S D m,S.The morphismθ:=ζ×ζ−1:D m,S ×µl →D m,S ×µlis an automorphism of αover ζ:D m,S →D m,S ,whose order is l .Any other object of D l n,S is a pull-back of αand any automorphism is a pull-back of a power of θ.Therefore the twisted jet γis determined by the images of αand θin X .Let the diagramPh MD m,Sbe the object over D m,S of X which is the image of αby γ.Let λbe its automorphism over ζ:D m,S →D m,S which is the image of θ.Because γis representable,the order of λis also l .Let Q :=P ×D m,S S .Then P →D m,S is isomorphic as torsors to D m,k ×k Q →D m,k ×k S .Since we have chosen a primitive l -th root ζ,we can identify Conj(µl ,G )with a 
representative set Conj l (G )of the conjugacy classes of elements of order l .MOTIVIC INTEGRATION OVER DELIGNE-MUMFORD STACKS15 Claim:If S is connected,then there are open and closed subsets Q′⊂Q and P′⊂P which,for some g∈Conj l(G),are stable under C g-action and C g-torsors over S.Take an´e tale cover T→S such that Q T:=Q×S T is isomorphic to the trivial G-torsor T×G→T with a right action.Then the pull-back of the automorphismλis a left action of some g−1∈G over each connected component of T.If necessary,replacing the isomorphism Q T∼=T×G,we can assume that the automorphism is given by unique g−1∈Conj l(G).Letφ:T×G→Q be the natural morphism.Then we see thatφ(T×C g)∩φ(T×(G\C g))=∅,as follows:Let a∈C g, b∈G\C g,x∈T×C g and y∈T×(G\C g).Ifφ(x)=φ(y),then φ(x)=φ(gxg−1)=λφ(x)g−1=λφ(y)g−1=φ(gyg−1)=φ(y).It is a contradiction.Similarly P decomposes also.Since(h◦λ)|P′=h|P′and(h◦g)|P′=(g◦h)|P′,we haveh◦(ζ×id Q′)=h◦(λ◦g)|P′=(g◦h)|P′.It means that the morphism D m,k×k Q′→M corresponds to a mor-phism Q′→(J m M)ζ◦g−1and that the morphism Q′→(J m M)ζ◦g−1 and a C g-torsor Q′→S determine an object over S of a quotient stack [(J m M)ζ◦g−1/C g].Note that(J m M)ζ◦g−1=J(a)m M.Thus we have a morphism J l n X→ [J(a)m M/C a].The inverse morphism can be con-structed by following the argument conversely. Theorem2.9.Let X be a DM stack.(1)For n∈Z≥0,J l n X and J n X are DM stacks.(2)If X is offinite type(resp.smooth),then for n∈Z≥0,thenJ l n X and J n X are also offinite type(resp.smooth). (3)For every m≥n,the natural projection J m X→J n X is anaffine morphism.Proof.1:There is an´e tale,surjective and barely faithful morphism i[M/G i]→X such that each M i is a scheme and G i is afinite group.From Lemma2.5,Proposition2.8and[LMB,Lemme4.3.3], the morphism J l n X→X is representable.From[LMB,Proposition 4.5],J l n X is a DM stack.J n X is also a DM stack.The morphism J l n X→X is also separated and so is J l n X.2and3:These also result from Lemma2.5and Proposition2.8. 
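In general, for a projective system $\{\mathcal{X}_i, \rho_i: \mathcal{X}_{i+1} \to \mathcal{X}_i\}_{i \geq 0}$ of DM stacks such that every $\rho_i$ is representable and affine, there exists a projective limit $\mathcal{X}_\infty = \varprojlim \mathcal{X}_i$. In fact, for each $i$, there is an $\mathcal{O}_{\mathcal{X}_0}$-algebra $\mathcal{A}_i$ such that $\mathcal{X}_i \cong \operatorname{Spec} \mathcal{A}_i$ (see [LMB, §14.2]) and the $\mathcal{A}_i$'s constitute an inductive system. We can see that $\mathcal{X}_\infty := \operatorname{Spec}(\varinjlim \mathcal{A}_i)$ is the projective limit of the given projective system.

From Theorem 2.9, the projective system $\{\mathcal{J}_n \mathcal{X}\}_n$ (resp. $\{\mathcal{J}^l_n \mathcal{X}\}_n$) has the projective limit
$$\mathcal{J}_\infty \mathcal{X} := \varprojlim \mathcal{J}_n \mathcal{X} \quad (\text{resp. } \mathcal{J}^l_\infty \mathcal{X} := \varprojlim \mathcal{J}^l_n \mathcal{X}).$$
Then the point set $|\mathcal{J}_\infty \mathcal{X}|$ is identified with the set of the equivalence classes of the twisted arcs $\mathcal{D}^l_{\infty,K} \to \mathcal{X}$ with respect to the following equivalence relation: let $\gamma_i: \mathcal{D}^l_{\infty,K_i} \to \mathcal{X}$, $i = 1, 2$, be twisted arcs. If for a field $K_3 \supset K_1, K_2$ and natural morphisms $\mathcal{D}^l_{\infty,K_3} \to \mathcal{D}^l_{\infty,K_1}, \mathcal{D}^l_{\infty,K_2}$, the diagram
$$\begin{array}{ccc}
\mathcal{D}^l_{\infty,K_3} & \longrightarrow & \mathcal{D}^l_{\infty,K_2} \\
\downarrow & & \downarrow{\scriptstyle \gamma_2} \\
\mathcal{D}^l_{\infty,K_1} & \xrightarrow{\ \gamma_1\ } & \mathcal{X}
\end{array}$$
is commutative, then $\gamma_1$ and $\gamma_2$ are equivalent.

Remark 2.10. For two stacks $\mathcal{X}$ and $\mathcal{Y}$, we can define a Hom-stack $\mathcal{H}om(\mathcal{X}, \mathcal{Y})$ which parameterizes morphisms from $\mathcal{X}$ to $\mathcal{Y}$, and its substack $\mathcal{H}om^{\mathrm{rep}}(\mathcal{X}, \mathcal{Y})$ which parameterizes representable morphisms. Olsson [Ols] proved that if $\mathcal{X}$ and $\mathcal{Y}$ are Deligne-Mumford stacks satisfying certain conditions, then $\mathcal{H}om(\mathcal{X}, \mathcal{Y})$ is a Deligne-Mumford stack and $\mathcal{H}om^{\mathrm{rep}}(\mathcal{X}, \mathcal{Y})$ is its open substack. Then, Aoki [Aok] proved that $\mathcal{H}om(\mathcal{X}, \mathcal{Y})$ is an Artin stack if $\mathcal{X}$ and $\mathcal{Y}$ are Artin stacks satisfying certain conditions. The stack $\mathcal{J}^l_n \mathcal{X}$ of twisted $n$-jets of order $l$ ($n < \infty$) is identical with $\mathcal{H}om^{\mathrm{rep}}(\mathcal{D}^l_n, \mathcal{X})$.

2.2.3. Inertia stack.

Definition 2.11. To each DM stack $\mathcal{X}$, we associate the inertia stack $I\mathcal{X}$ defined as follows: an object of $I\mathcal{X}$ is a pair $(x, \alpha)$ with $x$ an object of $\mathcal{X}$ and $\alpha \in \operatorname{Aut}(x)$, and a morphism $(x, \alpha) \to (y, \beta)$ in $I\mathcal{X}$ is a morphism $\varphi: x \to y$ in $\mathcal{X}$ with $\varphi\alpha = \beta\varphi$.

It is known that $I\mathcal{X}$ is isomorphic to $\mathcal{X} \times_{\Delta, \mathcal{X} \times \mathcal{X}, \Delta} \mathcal{X}$, where $\Delta: \mathcal{X} \to \mathcal{X} \times \mathcal{X}$ is the diagonal morphism. Then the forgetting morphism $I\mathcal{X} \to \mathcal{X}$ is isomorphic to the first projection $\mathcal{X} \times_{\Delta, \mathcal{X} \times \mathcal{X}, \Delta} \mathcal{X} \to \mathcal{X}$.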
Since we have supposed that $\mathcal{X}$ is separated, the diagonal morphism is finite and unramified. Hence the forgetting morphism $I\mathcal{X} \to \mathcal{X}$ is so as well.

Definition 2.12. Let $l$ be a positive integer prime to $\mathrm{char}(k)$. We define $I^l \mathcal{X} \subset I\mathcal{X}$ to be the open and closed substack of objects $(x, \alpha)$ such that the order of $\alpha$ is $l$.

Proposition 2.13. Assume that $k$ contains all $l$-th roots of unity. Then for each choice of a primitive $l$-th root $\zeta$ of unity, there is a natural isomorphism $\mathcal{J}^l_0 \mathcal{X} \cong I^l \mathcal{X}$.

Proof. The assertion follows from the fact that giving a representable morphism $\mathcal{D}^l_0 \times S \to \mathcal{X}$ is equivalent to giving an object $x$ over $S$ of $\mathcal{X}$ and an embedding $\mu_l \hookrightarrow \operatorname{Aut}(x)$, which is equivalent to giving the image of $\zeta \in \mu_l$.

The inertia stack is the algebraic counterpart of the twisted sector of an analytic orbifold, which was used to define the orbifold cohomology in [CR]. Since for $k = \mathbb{C}$, there is a canonical choice exp(2π
Ontology and the Problem of Language
The Determinacy of Being in Ontology
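The most fundamental difference between Being as a category of ontology and to be in everyday language is this: the meaning of Being is determined logically, which is why one so often meets the expression "the determinacy of Being". Nothing of the kind is found in everyday language. In everyday language, besides serving as a copula and as a verb expressing existence, to be can carry various other concrete meanings. For example, the line "To be or not to be" in Shakespeare's play means to live or to die. When to be expresses such concrete meanings, it is usually combined with other words. Heidegger once illustrated the various meanings that the German sein, which resembles to be, can express (the sentences below are given in English translation) [10]:

Original sentence | Heidegger's interpretation of to be in it
7. [...] (gloss: "This book is mine")
8. Red is the port side. | It stands for the left. (gloss: "Red is the left")
9. The dog is in the garden. | The dog is roaming about in the garden. (gloss: "That dog is in the garden")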
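In the example sentences Heidegger lists, to be is used not only as a verb meaning to exist (examples 1 and 2); for the most part it is used as a copula, and as a copula it links not only nouns (example 8) but also pronouns (example 7), prepositional phrases (examples 3, 4, 5, and 9), and infinitives (example 6). It is generally held that to be as a copula has no determinate meaning of its own, or has no concrete meaning at all. Heidegger, however, holds that, seen from another angle, it is precisely because "is" remains intrinsically indeterminate and devoid of meaning that it can have so many different uses and can realize and determine itself as the situation requires [11]. All this shows that when people use to be without reflection, it appears merely to connect subject and predicate, yet in fact they understand the many different meanings it takes on when used with different expressions. They simply do not usually give a reflective account of what the word means in each of these uses.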
Earlier Attempts at a Linguistic Analysis of Ontology
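The special relationship between ontology and language, and ontology's special use of language, attracted attention long ago. As comparative studies of philosophy in different languages developed, the special relationship between ontology and the Indo-European languages was brought to light. This line of research, however, also shows a tendency to reduce philosophical problems simply to problems of language, or to replace philosophical inquiry with the analysis of everyday language. That forgets that language is the instrument through which thought is expressed. The point of studying ontology in connection with language is to use this instrument to reveal ontology's philosophical content and its mode of thinking. Conversely, any linguistic analysis of ontology must take into account its special mode of thinking and its special use of language. An analysis that fails to grasp this and remains purely linguistic, especially one that stays at the level of everyday language, serves only to obscure the essence of ontology as a form of philosophy. Let us first look at the results such analysis has reached.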
A Good Role Model for Ontologies Collaborations
Michael Pradel, Jakob Henriksson, and Uwe Aßmann
Fakultät für Informatik, Technische Universität Dresden
michael@binaervarianz.de, {jakob.henriksson|uwe.assmann}@tu-dresden.de

Abstract. Ontologies are today used to annotate web data with machine processable semantics and for domain modeling. As the use of ontologies increases and the ontologies themselves grow larger, the need to construct ontologies in a component-based manner is becoming more and more important. In object-oriented software development, the notions of roles and role modeling have been known for many years. We argue that role models constitute attractive ontological units, that is, components. Role models, among other things, provide separation of concerns in ontological modeling. This paper introduces roles to ontologies and discusses relevant issues related to transferring these techniques to ontologies. Examples of role models enabling separation of concerns and reuse are provided and discussed.

1 Introduction

Ontology languages are emerging as the de facto standard for capturing semantics on the web. One of the most important ontology languages today is the Web Ontology Language OWL, standardized and recommended by W3C [12]. One issue currently addressed in the research community is how to define reusable ontologies or ontology parts. In more general terms, how to construct an ontology from possibly independently developed components?

OWL natively provides some facilities for reusing ontologies and ontology parts.
First, a feature inherited from RDF [7] (upon which OWL is layered) is linking: loosely referencing distributed web content and other ontologies using URIs. Second, OWL provides an owl:imports construct which syntactically includes the complete referenced ontology into the importing ontology. The linking mechanism is convenient from a modeling perspective, but is semantically not well-defined: there is no guarantee that the referenced ontology or web content exists. Furthermore, the component (usually an ontology class) is small and often hard to detach from the surrounding ontology in a semantically well-defined way. Usually a full ontology import is required since it is unclear which other classes the referenced class depends on. The owl:imports construct can only handle complete ontologies and does not allow for partial reuse. This can lead to inconsistencies in the resulting ontology due to conflicting modeling axioms. Overall, OWL seems to be inflexible in the kind of reuse provided, especially regarding the granularity of components.

Existing approaches addressing these issues often refer to modular ontologies and, in general terms, aim at enabling the reuse of ontology parts or fragments in a well-defined way (for some work in this direction, see [4–6, 11]). That is, they investigate how only certain parts of an ontology can be reused and deployed elsewhere. While it is interesting work and allows for reuse, we believe that such extracted ontological units fail to provide an intuitive meaning of why those units should constitute components: they were not designed as such.

The object-oriented software community has long discussed new ways of modeling software. One interesting result of this research is the notion of role modeling [13].
The main argument is that today’s class-oriented modeling mixes two related but ulti-mately different notions:natural types and role types.Natural types capture the identity of its instances,while a role type describes their interactions.Intuitively,an object can-not discard its natural type without losing its identity while a role type can be changed depending on the current context of the object.Person for example,is a natural type while Parent is a role type.Parent is a role that can be played by persons.A role type thus only models one specific aspect of its related natural types.Related role types can be joined together into a role model to capture and separate one specific concern of the modeled whole.In this paper we introduce role modeling to ontologies.Role modeling can bring several benefits to ontologies and ontological modeling.Roles provide:–More natural ontological modeling by separating roles from classes–An appropriate notion and size of reusable ontological components—role models –Separation of concerns by capturing a single concern in a role modelWe believe that role models constitute useful and natural units for component-based ontology engineering.Role models are developed as components and intended to be de-ployed as such,in contrast to existing approaches aimed at extracting ontological units from ontologies not necessarily designed to be modular.While we argue that modeling with roles is beneficial to ontological modeling and provides a new kind of component not previously considered for ontologies,the transition from object-orientation is not straightforward.The contribution of this paper is the introduction of modeling prim-itives to support roles in ontologies and a discussion of the main differences for role modeling between ontologies and object-oriented models.1The semantics of the new modeling primitives is provided by translation into the assumed underlying ontologi-cal formalism of Description Logics(DLs)[3].That way,existing tools 
can be reused for modeling with roles.To convince the reader of the usefulness of role models,we demonstrate their use on two examples.Thefirst example shows separation of concerns and the second example demonstrates reuse of role models in different contexts.The remaining part of the paper is structured as follows.Section2introduces roles as used and understood in object-orientation and discusses what the main differences are between models and ontologies.Section3introduces role models to ontologies and gives examples of their use.Section4discusses related work to component-based ontology modeling and Section5concludes the paper and discusses open issues.1When we simply say model,we shall mean a model in the object-oriented sense.2From Roles in Software Modeling to OntologiesThe OOram software engineering method[13]was thefirst to introduce roles in object-orientation.The innovative idea was that objects can actually be abstracted in two ways: classifying them according to their inherent properties,and focusing on how they work together with other objects(collaborate).While the use of classes as an object abstrac-tion is a cornerstone in object-oriented modeling,focusing on object collaborations using roles has not been given the attention it deserves(however,for some work ad-dressing these issues,see CaesarJ[1]and ObjectTeams[8]).There are different views in the object-oriented community[15,16]on what roles really are.However,some basic concepts seem to be accepted by most authors:–Roles and role types.A role describes the behavior of an object in a certain context.In this context the object is said to play the role.One object may play several roles at a time.A set of roles with similar behavior is abstracted by a role type(just as similar objects are abstracted by a class).–Collaborations and role models.Roles focus on the interaction between objects and consequently never occur in isolation,but rather in collaborations.This leads to a new abstraction not 
available for classes—the role model.It describes a set of role types that relate to each other and thus as a whole characterizes a common collaboration(a common goal or functionality).–Open and bound role types.Role types are bound to classes by a plays relation,e.g.Person plays Father(a person can play the role of being a father).However,not all role types of a role model must be bound to a class.Role types not associated witha class are called open and intuitively describe missing parts of a collaboration.It is important to note that class modeling and role modeling do not replace each other,but are complementary.A purely class-based approach arguably leads to poor modeling by enforcing the representation of role types by classes and thus disregards reuse possibilities based on object collaborations.However,roles cannot replace classes entirely since this would disallow modeling of properties that are not related to a specific context.Adapting roles for ontology modeling There is currently no consensus on the exact re-lationship between models and ontologies,although the question is a current and impor-tant one(see e.g.[2]).There is however some agreement upon fundamental differences between models and ontologies which will have an impact on transferring the notion of roles from models to ontologies.One difference is that models often describe something dynamic,for example a system to be implemented.In contrast,ontologies are static entities.Even though an ontology may evolve over time,the entities being modeled do not have the same no-tion of time.Models often describe systems that are eventually to be executed,while ontologies do not(although some approaches exist that compile ontologies to Java2). 
The dynamism and notion of executability in modeling is closely connected to func-tionality(or behavior).A collaboration in object-oriented modeling often captures a 2See for example,http://www.aifb.uni-karlsruhe.de/WBS/aeb/ontojava/.separate and reusable functionality.For example,a realization of depth-first traversal over graph structures may require several collaborating methods in different classes for its implementation.The collection of all the related dependencies between the classes constitutes a collaboration and thus implements this functionality[14].Because of the non-existence of dynamism and behavior in ontologies,roles and collaborations neces-sarily capture something different.Instead of describing the behavior of an object using the notion of a role,ontological roles describe context-dependent properties.Definition1(Ontological roles and role types).An ontological role describes the properties of an individual in a certain context.A set of roles with similar properties is abstracted by an ontological role type.Based on this we define what we consider a role model(collaboration)to be in an ontological setting.Definition2(Ontological collaborations and role models).An ontological role model describes a set of related ontological role types and as such encapsulates com-mon relationships between ontological roles.For example,an ontology may describe the concept Person.If john,mary and sarah are said to be persons,but in fact belong to a family,the needed associations may be encoded in a Family collaboration describing relationships such as parents having chil-dren.The existing Family collaboration could then simply be imported and employed to encode that john and mary are the parents of sarah.Another difference between models and ontologies are their implicit assumptions. 
In models, classes are assumed to be disjoint, which is, however, not the case for ontologies. This implies that role-playing individuals may belong to classes to which the corresponding role type is not explicitly bound. To avoid unintended role bindings, the ontology engineer explicitly has to constrain them in the ontology.

3 Using Role Models in Ontologies

Class-based modeling, as used in ontologies today, has proven to be successful, but experience in object-orientation has led to role modeling as a complementary paradigm. This section shows how roles and role models can beneficially be used in ontologies. One of our main motivations is to promote role models as a useful ontological unit (a component) in ontological modeling. We therefore show how role models can be incorporated and reused in class-based ontologies.

The following example is intended to demonstrate how classes can be split into separate concerns where each concern is modeled by employing a different role model. Figure 1 shows parts of a wine ontology modeled with roles. Classes are represented by gray rectangles while white rectangles with rounded corners denote role types. The definition of a role type is specified inside its rectangle (in standard DL syntax). In addition, role types are tagged with the name of their role model, e.g. (Product). Labeled arrows represent binary properties between types.

Figure 1. Different concerns of the Wine class are separated by the role types Product and Drink.

The ontology in Figure 1 models three natural types (classes): Wine, Winery and Food. In a class-based version of the ontology in Figure 1, the concerns of wine both being a product and a drink (to be had with a meal) would be intermingled in a single class definition of Wine. The most natural way of modeling this would be to state that Wine is a subclass of the classes Product and Drink. However, this would not be ideal since a wine does not always have to be a product. Rather, we would like to express that a wine can be seen as a product (in the proper context). This can be expressed using roles where these concerns are instead explicitly separated into the role types Product and Drink. The motivation from a modeling perspective is that wines are always wines (that is, wine is a natural type). A wine may however be seen differently in different contexts: as a product to be sold, or as a drink being part of a meal.

Modeling the role-based ontology from Figure 1 in a more concrete syntax could look like this (based on Manchester OWL syntax [9]):

The import statements import the needed role models and the Plays primitive binds roles to classes. The translation of the ontology into standard DL giving the ontology meaning is discussed in Section 3.1.

The above mentioned modeling distinction can also be helpful in other situations.
Imagine the existence of an ontology with the classes Person and PolarBear, (naturally) stated to be disjoint. The modeler now wants to introduce the concept of Parent and decides that parents are persons. Furthermore, having focused on polar bears for a while, the modeler decides that since obviously not all polar bears are parents, the opposite should hold, and states that parents are also polar bears. This unfortunate and unintentional mistake makes the class Parent unsatisfiable (i.e. it is always empty). A more natural way to solve this problem would be to import a Family role model (modeling notions such as parents etc.) and state that Persons can play the role of Parent and that PolarBears can do the same. Thus, instead of intermingling a class Parent with the definitions of Person and PolarBear, possibly causing inconsistencies, the role type Parent cross-cuts the different involved (natural) classes as a separate concern. Doing this will prevent the role type Parent from being empty. This example has shown that employing roles can be more natural than using classes to describe non-inherent properties of individuals.

Note that we do not claim that it is not possible to solve the above mentioned modeling problem strictly using classes as is done today. In fact, we very much recognize this fact by giving role-based ontologies a translational semantics to standard DL semantics (see Section 3.1). Instead we argue that modeling with roles is more natural and easier from the perspective of the modeler.

Apart from the rather philosophical distinction between classes and roles described above, roles are important in collaborations. A set of collaborating roles may be joined together in a role model, which may effectively be reused in many different ontologies.
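The Parent example above can be made concrete with a toy check. The sketch below is not a DL reasoner and is not taken from the paper: the function names are hypothetical, disjointness is modeled as a plain set of pairs, and the role-based reading assumes the can-play interpretation in which a role type bound to several classes is contained in their union (the translation that Section 3.1 describes).

```python
from itertools import combinations


def intersection_satisfiable(superclasses, disjoint_pairs):
    """Class-based reading: Parent is a subclass of EVERY listed class.

    Such a class can have instances only if no two of its superclasses
    are declared disjoint.
    """
    return not any(
        frozenset(pair) in disjoint_pairs
        for pair in combinations(superclasses, 2)
    )


def union_satisfiable(bound_classes):
    """Role-based reading: Parent is contained in the UNION of the classes
    it is bound to, so a single bound class is enough (each class is
    assumed to be satisfiable on its own). An unbound (open) role type
    has no possible players."""
    return len(bound_classes) > 0


def binding_axiom(role_type, bound_classes):
    """Render the subsumption axiom for a role type under the can-play reading."""
    return f"{role_type} ⊑ " + " ⊔ ".join(list(bound_classes) + ["⊥"])


# Hypothetical ontology fragment: Person and PolarBear are disjoint.
disjoint = {frozenset({"Person", "PolarBear"})}
natural_types = ["Person", "PolarBear"]

print(intersection_satisfiable(natural_types, disjoint))
print(union_satisfiable(natural_types))
print(binding_axiom("Parent", natural_types))
```

Under these assumptions the class-based Parent comes out unsatisfiable while the role-based Parent does not, which is the contrast the example draws. A real ontology would need a DL reasoner to decide satisfiability in general.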
Thus,role models provide an interesting reuse unit for ontologies.Figure2shows an example of reusability.There are two class-based ontologies, one modeling wines and the other pizzas.Both the concept of Wine and Pizza in the different ontologies can in certain contexts be considered as products(as one concern). To capture this concern and the relationships the role of being a product has with other roles,for example being a producer,we reuse the Product role model introduced in Figure1.Figure2.The Product role model reused in two different ontologies.The example shows how a set of related relationships(for example produces and consumes)can be encapsulated in a role model and reused for different domains.Not only relationships are encapsulated,but also the related role types that act as ranges and domains for the relationships.As another example we can again consider the previously mentioned Family role model where relationships such as hasChild and hasParent are modeled.This role model may not only be used in an ontology catered to modeling persons.Consider forinstance the same notions being needed in an ontology modeling tree data structures. 
There, possible relationships between nodes may also be modeled by reusing the same role model. Another example would be an ontology describing operating systems and their processes, with new child processes being spawned from parent processes, etc.

After having looked at some examples of ontologies modeled using role models, we discuss their semantics in the following section.

3.1 Semantics of Role-Modeled Ontologies

We argue that modeling with roles should be enabled by introducing new ontological modeling primitives. Roles allow modelers to separate concerns in an intuitive manner and provide useful ontological units (components). At the same time, current class-based ontology languages (e.g. OWL) are already very expressive. Thus, we believe that what is lacking is not expressiveness but modeling primitives and means for reuse. We therefore aim for a translational approach in which role-based ontologies are compiled to standard (DL-based) ontologies. A great advantage is that this permits reusing existing tools, in particular the already well-developed reasoning engines.

A class-based ontology is considered to be a set of DL axioms constructed using class descriptions (or simply classes), property descriptions (or properties), and individuals. To support roles, we enhance the syntax with role types and role properties. For the sake of simplicity, we restrict role types to be conjunctions of existential restrictions limited to atomic role types, that is, of the form $\exists p_1.R_1 \sqcap \dots \sqcap \exists p_n.R_n$, where the $R_i$ are role types and the $p_i$ are role properties. Role properties simply define their domain and range (both have to be role types). Classes (respectively properties) and role types (respectively role properties) are built from disjoint sets of names. This disjointness corresponds to the underlying difference between natural types and role types.

To support role modeling, we introduce two new axioms. The first axiom expresses that individuals of a class can play a role: R £ C (role binding) binds role type R to class C. The second axiom expresses that some specific individual plays a role: R(a) (role assertion), where R is a role type and a an individual. Additionally, we add syntax for ontologies to import role models. The extended syntax may now be translated to the underlying ontology language by the following algorithm:³

1. Make all imported role type definitions available as classes in the ontology.
2. For each role type R used in the ontology:
   (a) Let $\{C_1, \dots, C_n\}$ be the set of classes to which R is bound (R £ $C_i$). Then add the axiom $R \sqsubseteq C_1 \sqcup \dots \sqcup C_n \sqcup \bot$ to the ontology.
   (b) For each role assertion R(a), make the same assertion available in the resulting ontology, now referring to the class representative for the role type R.
3. Remove import and Plays statements.

³ Role properties and role property assertions are left out here but can easily be integrated into the syntax extensions and the translation algorithm.

This translation captures the can-play semantics of roles by defining role types as subtypes of classes. It implies that an open role R, that is, one bound to no class, may not be played by any individual, since $R \sqsubseteq \bot$ would be added to the ontology (i.e. R is always interpreted as the empty set). The semantics of our role modeling extension is an immediate consequence of the translation, using the standard semantics of DLs.

We will now look at an example of how a role-based ontology is compiled to a standard class-based ontology. The ontology from Figure 1 imports the role models Product and Meal. The Product role model could for example be defined by:⁴

[Listing omitted.]

To illustrate the impact of binding one role type to multiple classes, we assume that the Product role type is also bound to the class Food in Figure 1 (and in the subsequent listing). That is, foods can also be considered products in some contexts. Our translation as defined above results in the following class-based ontology (for the example disregarding the Meal role model):

[Listing omitted.]

The resulting ontology consists of only standard OWL constructs and can thus be used by existing tools such as reasoners. One consequence of this resulting ontology is, for example, that an individual playing the role of a product has to be either a wine or a food. We can thus single out and study the concern of being a product without having to consider in detail what those products are. We could have done the same in a class-based ontology by stating that wines and foods are products, thus using Product as a superclass of both Wine and Food. However, as already mentioned, this would disregard the fact that wines and foods are not always
products.

⁴ The definitions of the role properties produces and consumes are left out.

4 Related Work

Modularizing ontologies and finding appropriate ontology reuse units are becoming important issues. Several works address this issue, most having a strong formal foundation. A common property of existing work seems to be the desire to reuse partial ontologies, that is, to enable more refined reuse of ontologies by allowing the import and sharing of vocabulary (classes, in some sense meaning) rather than axioms (ontologies, that is, syntactical units).

One work in this direction proposes a new import primitive: semantic import [11]. Semantic import differs from owl:imports (referred to as syntactic import) by allowing partial ontologies to be imported and by additionally enforcing the existence of any referred external ontologies or ontology elements (classes, properties, individuals) through the notion of ontology spaces. The goal of this work is controlled partial reuse.

The work in [5] defines a logical framework for modular integration of ontologies by allowing each ontology to define its local and external signature (that is, classes, properties etc.). The external signature is assumed to be defined in another ontology.
Two distinct restrictions are defined on the usage of the external signatures. The first syntactically disallows certain axioms which are considered harmful, while the second restriction generalizes the first by taking semantic issues into consideration. The general goal, apart from a formal framework, is to allow safe merging of ontologies.

The work in [6] also proposes partial reuse of ontologies by allowing modules to be extracted automatically from ontologies. One interesting requirement put on such an extracted module is that it should describe a well-defined subject matter, that is, be self-contained from a modeling perspective.

In contrast to these works on partial ontology reuse, which focus in particular on how to extract or modularize existing ontologies, our work aims at defining a more intuitive ontological unit: an ontological component that was defined as such.

5 Conclusions and Outlook

In this paper we have proposed an ontological unit able to improve modeling and provide a means for reuse: the ontological role model. The concept of roles has its roots in software modeling and we have taken the first steps to transfer this notion to the world of ontologies. Role models provide a view on individuals and their relationships that is different from the abstractions provided by purely class-based approaches. As such, role models provide a reusable abstraction unit for ontologies. Furthermore, due to the translational semantics, the approach is compatible with existing formalisms and tools.

As a next step we aim at integrating role modeling into tools, for example the Protégé ontology editor [10]. This is important since we argue that ontology engineers should treat roles as first-class members of their language and distinguish them from classes.

Other issues also remain to be further clarified. The semantics of roles may be subject to discussion. Apart from focusing on can-play semantics, must-play may in some cases be desirable for role bindings. Another issue to clarify is the implication of applying one
role model several times in an ontology. One could argue for multiple imports where each import is associated with a unique name space. However, this would make it impossible to refer to all instances of a certain role type, for instance to all products in an ontology. Finally, further investigation is needed into the implications of the open-world semantics of ontologies for role bindings and role assertions.

In conclusion, we argue that role models provide an interesting reuse abstraction for ontologies and that roles should be supported as an ontological primitive.

Acknowledgement

This research has been co-funded by the European Commission and by the Swiss Federal Office for Education and Science within the 6th Framework Programme project REWERSE number 506779 (cf.).

References

1. I. Aracic, V. Gasiunas, M. Mezini, and K. Ostermann. An Overview of CaesarJ, pages 135–173. Springer Berlin/Heidelberg, 2006.
2. U. Aßmann, S. Zschaler, and G. Wagner. Ontologies, Meta-Models, and the Model-Driven Paradigm, pages 249–273. Springer, 2006.
3. F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The Description Logic Handbook. Cambridge University Press, 2003.
4. B. Cuenca Grau, I. Horrocks, Y. Kazakov, and U. Sattler. Just the right amount: Extracting modules from ontologies. In Proc. of the Sixteenth International World Wide Web Conference (WWW 2007), 2007.
5. B. Cuenca Grau, Y. Kazakov, I. Horrocks, and U. Sattler. A logical framework for modular integration of ontologies. In Proc. of the 20th Int. Joint Conf. on Artificial Intelligence (IJCAI 2007), pages 298–303, 2007.
6. B. C. Grau, B. Parsia, E. Sirin, and A. Kalyanpur. Modularity and web ontologies. In P. Doherty, J. Mylopoulos, and C. A. Welty, editors, Proceedings of KR2006: the 20th International Conference on Principles of Knowledge Representation and Reasoning, Lake District, UK, June 2–5, 2006, pages 198–209. AAAI Press, 2006.
7. P. Hayes et al. RDF Semantics. W3C Recommendation, 10 February 2004. Available at /TR/rdf-mt/.
8. S. Herrmann. Object teams: Improving modularity for crosscutting collaborations. In Proc. Net Object Days 2002, 2002.
9. M. Horridge, N. Drummond, J. Goodwin, A. Rector, R. Stevens, and H. Wang. The Manchester OWL syntax. OWL: Experiences and Directions (OWLED), November 2006.
10. H. Knublauch, R. W. Fergerson, N. F. Noy, and M. A. Musen. The Protégé OWL plugin: An open development environment for semantic web applications. Third International Semantic Web Conference (ISWC), November 2004.
11. J. Pan, L. Serafini, and Y. Zhao. Semantic import: An approach for partial ontology reuse. In Proc. of the ISWC 2006 Workshop on Modular Ontologies (WoMO), 2006.
12. P. F. Patel-Schneider, P. Hayes, and I. Horrocks. OWL web ontology language semantics and abstract syntax. W3C Recommendation, 10 February 2004. Available at http://www.w3.org/TR/owl-semantics/.
13. T. Reenskaug, P. Wold, and O. Lehne. Working with Objects, The OOram Software Engineering Method. Manning Publications Co, 1996.
14. Y. Smaragdakis and D. Batory. Mixin layers: an object-oriented implementation technique for refinements and collaboration-based designs. ACM Trans. Softw. Eng. Methodol., 11(2):215–255, 2002.
15. F. Steimann. On the representation of roles in object-oriented and conceptual modelling. Data Knowl. Eng., 35(1):83–106, 2000.
16. F. Steimann. The role data model revisited. Roles, an interdisciplinary perspective, AAAI Fall Symposium, 2005.
The Application of Ontology in Domain Dictionary Construction
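... domain feature concepts. The domain feature attributes constitute the domain feature attribute layer. Finally, taking the domain feature attributes and some manually constructed domain feature concepts as seeds, a bootstrapping machine-learning technique is used to learn automatically from a large-scale unannotated real corpus; more domain feature concepts are then obtained with a small amount of manual proofreading, and the domain dictionary is continually expanded. The concrete steps for building the hierarchical classification system of the domain dictionary are as follows:

... discusses how to build a “domain dictionary” using the Ontology approach. Building a domain dictionary on the Ontology approach not only gives a clear description of the domain feature concepts in the dictionary and of their relationships, but also enables domain knowledge to be shared and reused, which eases maintenance of the domain dictionary. The main difficulties and problems faced in building a domain dictionary are: the extraction and organization of domain feature attributes, ... and the description language, manual selection ...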
Keywords: domain knowledge; Ontology; domain dictionary
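1 Domain knowledge

“Domain knowledge” is a term that originated in the field of artificial intelligence. In artificial intelligence, domain knowledge is applied mainly in knowledge-based expert systems and natural language understanding systems. Domain knowledge refers to the set of concepts within a given domain, the relationships among those concepts, and the constraints on those concepts. The definition of the term varies with the domain and with the needs of the application. In natural language processing research, domain knowledge is the basic knowledge applied to analyzing the topic and content of texts. It is knowledge that is oriented toward computers, that an ordinary person need not make an effort to acquire, and that describes the domain feature concepts of a domain and the relationships among them. Domain knowledge has all the attributes and characteristics of knowledge in general.

“Oriented toward computers” and “requiring no effort for an ordinary person to acquire” are two important characteristics that domain knowledge exhibits when applied to the topic and content analysis of texts. To better describe domain feature concepts and the relationships among them, we introduce the notion of a “domain feature attribute”. A domain feature attribute is itself a domain feature concept: it is a category formed by further abstracting and generalizing domain feature concepts. Determining the domain feature attributes should follow three principles: (1) A domain feature attribute can describe the domain feature concepts of a domain and is not easily subdivided further. (2) The domain feature attributes must be able to describe all the domain feature concepts in a domain. (3) Domain feature attributes are stable and must be determinate.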
A framework for ontology integration
Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini
Dipartimento di Informatica e Sistemistica
Università di Roma “La Sapienza”
Via Salaria 113, 00198 Roma, Italy
{calvanese,degiacomo,lenzerini}@dis.uniroma1.it

Abstract. One of the basic problems in the development of techniques for the semantic web is the integration of ontologies. Indeed, the web is constituted by a variety of information sources, each expressed over a certain ontology, and in order to extract information from such sources, their semantic integration and reconciliation in terms of a global ontology is required. In this paper, we address the fundamental problem of how to specify the mapping between the global ontology and the local ontologies. We argue that for capturing such mapping in an appropriate way, the notion of query is a crucial one, since it is very likely that a concept in one ontology corresponds to a view (i.e., a query) over the other ontologies. As a result, query processing in ontology integration systems is strongly related to view-based query answering in data integration.

1 Introduction

One of the basic problems in the development of techniques for the semantic web is the integration of ontologies. Indeed, the web is constituted by a variety of information sources, and in order to extract information from such sources, their semantic integration and reconciliation is required. In this paper we deal with a situation where we have various local ontologies, developed independently from each other, and we are required to build an integrated, global ontology as a means for extracting information from the local ones. Thus, the main purpose of the global ontology is to provide a unified view through which we can query the various local ontologies.

Most of the work carried out on ontologies for the semantic web is on which language or which method to use to build the global ontology on the basis of the local ones [13,2]. For example, the Ontology Inference Layer (OIL) [13,2] proposes to use a
restricted form of the expressive and decidable DL studied in [4] to express ontologies for the semantic web.

In this paper, we address what we believe is a crucial problem for the semantic web: how do we specify the mapping between the global ontology and the local ontologies? This aspect is the central one if we want to use the global ontology for answering queries in the context of the semantic web. Indeed, we are not simply using the local ontologies as an intermediate step towards the global one. Instead, we are using the global ontology for accessing information in the local ones. It is our opinion that, although the problem of specifying the mapping between the global and the local ontologies is at the heart of integration in the web, it has not yet been deeply investigated.

We argue that even the most expressive ontology specification languages are not sufficient for information integration in the semantic web. In a real-world setting, different ontologies are built by different organizations for different purposes. Hence one should expect the same information to be represented in different forms and with different levels of abstraction in the various ontologies. When mapping concepts in the various ontologies to each other, it is very likely that a concept in one ontology corresponds to a view (i.e., a query) over the other ontologies. Observe that here the notion of “query” is a crucial one. Indeed, to express mappings among concepts in different ontologies, suitable query languages should be added to the ontology specification language, and considered in the various reasoning tasks, in the spirit of [4,5]. As a result, query processing in this setting is strongly related to view-based query answering in data integration systems [20,17]. What distinguishes ontology integration from data integration as studied in databases is that, while in data integration one assumes that each source is basically a database, i.e., a logical theory with a single model, such an assumption is not made in ontology
integration, where a local ontology is an arbitrary logical theory, and hence can have multiple models.

Our main contribution in this paper is to present a general framework for an ontology of integration where the mapping between ontologies is expressed through suitable mechanisms based on queries, and to illustrate the framework proposed with two significant case studies.

The paper is organized as follows. In the next section we set up a formal framework for ontology integration. In Sections 3 and 4, we illustrate the so-called global-centric approach and local-centric approach to integration, and we discuss for each of the two approaches a specific case study showing the subtleties involved. In Section 5 we briefly present an approach to integration that goes beyond the distinction between global-centric and local-centric. Finally, Section 6 concludes the paper.

2 Ontology integration framework

In this section we set up a formal framework for ontology integration systems (OISs). We argue that this framework provides the basis of an ontology of integration. For the sake of simplicity, we will refer to a simplified framework, where the components of an OIS are the global ontology, the local ontologies, and the mapping between the two. We call such systems “one-layered”. More complex situations can be modeled by extending the framework in order to represent, for example, mappings between local ontologies (in the spirit of [12,6]), or global ontologies that act as local ones with respect to another layer.

In what follows, one of the main aspects is the definition of the semantics of both the OIS and of queries posed to the global ontology. For keeping things simple, we will use in the following a unique semantic domain ∆, constituted by a fixed, infinite set of symbols.

Formally, an OIS O is a triple $\langle G, S, M_{G,S} \rangle$, where G is the global ontology, S is the set of local ontologies, and $M_{G,S}$ is the mapping between G and the local ontologies in S.
Global ontology. We denote with $A_G$ the alphabet of terms of the global ontology, and we assume that the global ontology G of an OIS is expressed as a theory (named simply G) in some logic $L_G$.

Local ontologies. We assume to have a set S of n local ontologies $S_1, \dots, S_n$. We denote with $A_{S_i}$ the alphabet of terms of the local ontology $S_i$. We also denote with $A_S$ the union of all the $A_{S_i}$'s. We assume that the various $A_{S_i}$'s are mutually disjoint, and each one is disjoint from the alphabet $A_G$. We assume that each local ontology is expressed as a theory (named simply $S_i$) in some logic $L_{S_i}$, and we use S to denote the collection of theories $S_1, \dots, S_n$.

Mapping. The mapping $M_{G,S}$ is the heart of the OIS, in that it specifies how the concepts¹ in the global ontology G and in the local ontologies S map to each other.

Semantics. Intuitively, in specifying the semantics of an OIS, we have to start with a model of the local ontologies, and the crucial point is to specify which are the models of the global ontology. Thus, for assigning semantics to an OIS $O = \langle G, S, M_{G,S} \rangle$, we start by considering a local model D for O, i.e., an interpretation that is a model for all the theories of S. We call global interpretation for O any interpretation for G. A global interpretation I for O is said to be a global model for O wrt D if:

• I is a model of G, and
• I satisfies the mapping $M_{G,S}$ wrt D.

In the next sections, we will come back to the notion of satisfying a mapping wrt a local model. The semantics of O, denoted sem(O), is defined as follows:

$$\mathit{sem}(O) = \{\, I \mid \text{there exists a local model } D \text{ for } O \text{ s.t. } I \text{ is a global model for } O \text{ wrt } D \,\}$$

Queries. Queries posed to an OIS O are expressed in terms of a query language $Q_G$ over the alphabet $A_G$ and are intended to extract a set of tuples of elements of ∆. Thus, every query has an associated arity, and the semantics of a query q of arity n is defined as follows. The answer $q^O$ of q to O is the set of tuples

$$q^O = \{\, \langle c_1, \dots, c_n \rangle \mid \text{for all } I \in \mathit{sem}(O),\ \langle c_1, \dots, c_n \rangle \in q^I \,\}$$

where $q^I$ denotes the result of evaluating
q in the interpretation I.

As we said before, the mapping $M_{G,S}$ represents the heart of an OIS $O = \langle G, S, M_{G,S} \rangle$. In the usual approaches to ontology integration, the mechanisms for specifying the mapping between concepts in different ontologies are limited to expressing direct correspondences between terms. We argue that, in a real-world setting, one needs a much more powerful mechanism. In particular, such a mechanism should allow for mapping a concept in one ontology into a view, i.e., a query over the other ontologies, which acquires the relevant information by navigating and aggregating several concepts.

Following the research done in data integration [16,17], we can distinguish two basic approaches for defining this mapping:

• the global-centric approach, where concepts of the global ontology G are mapped into queries over the local ontologies in S;
• the local-centric approach, where concepts of the local ontologies in S are mapped to queries over the global ontology G.

We discuss these two approaches in the following sections.

¹ Here and below we use the term “concept” for denoting a concept of the ontology.

3 Global-centric approach

In the global-centric approach (aka global-as-view approach), we assume we have a query language $V_S$ over the alphabet $A_S$, and the mapping between the global and the local ontologies is given by associating to each term in the global ontology a view, i.e., a query, over the local ontologies. The intended meaning of associating to a term C in G a query $V_s$ over S is that such a query represents the best way to characterize the instances of C using the concepts in S. A further mechanism is used to specify whether the correspondence between C and the associated view is sound, complete, or exact. Let D be a local model for O, and I a global interpretation for O:

• I satisfies the correspondence $\langle C, V_s, \mathit{sound} \rangle$ in $M_{G,S}$ wrt D, if all the tuples satisfying $V_s$ in D satisfy C in I,
• I satisfies the correspondence $\langle C, V_s, \mathit{complete} \rangle$ in $M_{G,S}$ wrt D, if no tuple other than those satisfying $V_s$ in
D satisfies C in I.
• I satisfies the correspondence $\langle C, V_s, \mathit{exact} \rangle$ in $M_{G,S}$ wrt D, if the set of tuples that satisfy C in I is exactly the set of tuples satisfying $V_s$ in D.

We say that I satisfies the mapping $M_{G,S}$ wrt D, if I satisfies every correspondence in $M_{G,S}$ wrt D.

The global-centric approach is the one adopted in most data integration systems. In such systems, sources are databases (in general relational ones), the global ontology is actually a database schema (again, represented in relational form), and the mapping is specified by associating to each relation in the global schema one relational query over the source relations. It is a common opinion that this mechanism allows for a simple query processing strategy, which basically reduces to unfolding the query using the definition specified in the mapping, so as to translate the query in terms of accesses to the sources [20]. Actually, when we add constraints (even of a very simple form) to the global schema, query processing becomes harder, as shown in the following case study.

3.1 A case study

We now set up a global-centric framework for ontology integration, which is based on ideas developed for data integration over global schemas expressed in the Entity-Relationship model [3]. In particular, we describe the main components of the ontology integration system, and we provide the semantics both of the system and of query answering.

The OIS $O = \langle G, S, M_{G,S} \rangle$ is defined as follows:

• The global ontology G is expressed in the Entity-Relationship model (or equivalently as UML class diagrams). In particular, G may include:
  – typing constraints on relationships, assigning an entity to each component of the relationship;
  – mandatory participation to relationships, saying that each instance of an entity must participate as i-th component to a relationship;
  – ISA relations between both entities and relationships;
  – typing constraints, functional restrictions, and mandatory existence, for attributes both of entities and of relationships.
• The local ontologies
S are constituted simply by a relational alphabet $A_S$, and by the extensions of the relations in $A_S$. For example, such extensions may be expressed as relational databases. Observe that we are assuming that no intensional relation between terms in $A_S$ is present in the local ontologies.
• The mapping $M_{G,S}$ between G and S is given by a set of correspondences of the form $\langle C, V_s, \mathit{sound} \rangle$, where C is a concept (i.e., either an entity, a relationship, or an attribute) in the global ontology and $V_s$ is a query over S. More precisely,
  – The mapping associates a query of arity 1 to each entity of G.
  – The mapping associates a query of arity 2 to each entity attribute A of G. Intuitively, if the query retrieves the pair $\langle x, y \rangle$ from the extension of the local ontologies, this means that y is a value of the attribute A of the entity instance x. Thus, the first argument of the query corresponds to the instances of the entity for which A is defined, and the second argument corresponds to the values of the attribute A.
  – The mapping associates a query of arity n to each relationship R of arity n in G. Intuitively, if the query retrieves the tuple $\langle x_1, \dots, x_n \rangle$ from the extension of the local ontologies, this means that $\langle x_1, \dots, x_n \rangle$ is an instance of R.
  – The mapping associates a query of arity n+1 to each attribute A of a relationship R of arity n in G. The first n arguments of the query correspond to the tuples of R, and the last argument corresponds to the values of A.

As specified above, the intended meaning of the query $V_s$ associated to the concept C is that it specifies how to retrieve the data corresponding to C in the global schema starting from the data at the sources. This confirms that we are following the global-as-view approach: each concept in the global ontology is defined as a view over the concepts in the local ontologies. We do not pose any constraint on the language used to express the queries in the mapping. Since the extensions of local ontologies are relational databases, we simply assume that the language
is able to express computations over relational databases.

To specify the semantics of a data integration system, we have to characterize, given the set of tuples in the extension of the various relations of the local ontologies, which are the data satisfying the global ontology. In principle, one would like to have a single extension as model of the global ontology. Indeed, this is the case for most of the data integration systems described in the literature. However, we will show in the following the surprising result that, due to the presence of the semantic conditions that are implicit in the conceptual schema G, in general, we will have to account for a set of possible extensions.

Example 1. Figure 1 shows the global schema $G_1$ of a data integration system $O_1 = \langle G_1, S_1, M_1 \rangle$, where Age is a functional attribute, Student has a mandatory participation in the relationship Enrolled, Enrolled isa Member, and University isa Organization. The schema models persons who can be members of one or more organizations, and students who are enrolled in universities.

Figure 1: Global ontology of Example 1

Suppose that S is constituted by $S_1, S_2, S_3, S_4, S_5, S_6, S_7, S_8$, and that the mapping $M_1$ is as follows:

$\mathit{Person}(x) \leftarrow S_1(x)$
$\mathit{Organization}(x) \leftarrow S_2(x)$
$\mathit{Member}(x, y) \leftarrow S_7(x, z) \wedge S_8(z, y)$
$\mathit{Student}(x) \leftarrow S_3(x, y) \vee S_4(x)$
$\mathit{Age}(x, y) \leftarrow S_3(x, y) \vee S_6(x, y, z)$
$\mathit{University}(x) \leftarrow S_5(x)$
$\mathit{Enrolled}(x, y) \leftarrow S_4(x, y)$

From the semantics of the OIS O it is easy to see that, given a local model D, several situations are possible:

1. No global model exists. This happens, in particular, when the data in the extension of the local ontologies retrieved by the queries associated to the elements of the global ontology do not satisfy the functional attribute constraints.
2. Several global models exist. This happens, for example, when the data in the extension of the local ontologies retrieved by the queries associated to the global concepts do not satisfy the ISA relationships of the global ontology. In this case, it may happen that several ways exist to add suitable objects to the elements of G in order
to satisfy the constraints. Each such way yields a global model.

Example 2. Referring to Example 1, consider a local model $D_1$, where $S_3$ contains the tuple $\langle t_1, a_1 \rangle$, and $S_6$ contains the tuple $\langle t_1, a_2, v_1 \rangle$. The query associated to Age by the mapping $M_1$ specifies that, in every model of $O_1$, both $\langle t_1, a_1 \rangle$ and $\langle t_1, a_2 \rangle$ should belong to the extension of Age. However, since Age is a functional attribute in $G_1$, it follows that no model exists for the OIS $O_1$.

Example 3. Referring again to Example 1, consider a local model $D_2$, where $S_1$ contains $p_1$ and $p_2$, $S_2$ contains $o_1$, $S_5$ contains $u_1$, $S_4$ contains $t_1$, and the pairs $\langle p_1, o_1 \rangle$ and $\langle p_2, u_1 \rangle$ are in the join between $S_7$ and $S_8$. By the mapping $M_1$, it follows that in every model of $O_1$, we have that $p_1, p_2 \in \mathit{Person}$, $\langle p_1, o_1 \rangle, \langle p_2, u_1 \rangle \in \mathit{Member}$, $o_1 \in \mathit{Organization}$, $t_1 \in \mathit{Student}$, and $u_1 \in \mathit{University}$. Moreover, since $G_1$ specifies that Student has a mandatory participation in the relationship Enrolled, in every model for $O_1$, $t_1$ must be enrolled in a certain university. The key point is that nothing is said in $D_2$ about which university, and therefore we have to accept as models all interpretations for $O_1$ that differ in the university $t_1$ is enrolled in.

In the framework proposed, it is assumed that the first problem is solved by the queries extracting data from the extension of the local ontologies. In other words, it is assumed that, for any functional attribute A, the corresponding query implements a suitable data cleaning strategy (see, e.g., [15]) that ensures that, for every local model D and every x, there is at most one tuple (x, y) in the extension of A (a similar condition holds for functional attributes of relationships).

The second problem shows that the issue of query answering with incomplete information arises even in the global-as-view approach to data integration. Indeed, the existence of multiple global models for the OIS implies that query processing cannot simply reduce to evaluating the query over a single relational database. Rather, we should in principle take all possible global models into account when answering a query. It
is interesting to observe that there are at least two different strategies to simplify the setting and overcome this problem that are frequently adopted in data integration systems [16,20,17]:

• Data integration systems usually adopt a simpler data model (often, a plain relational data model) for expressing the global schema (i.e., the global ontology). In this case, the data retrieved from the sources (i.e., the local ontologies) trivially fit into the schema, and can be directly considered as the unique database to be processed during query answering.
• The queries associated to the concepts of the global schema are often considered as exact. In this case, analogously to the previous one, it is easy to see that the only global extension to be considered is the one formed by the data retrieved from the extension of the local ontologies. However, observe that, when data in this extension do not obey all semantic conditions that are implicit in the global ontology, this single extension is not coherent with the global ontology, and the OIS is inconsistent. This implies that query answering is meaningless. We argue that, in the usual case of autonomous, heterogeneous local ontologies, it is very unlikely that data fit in the global ontology, and therefore this approach is too restrictive, in the sense that the OIS would often be inconsistent.

The fact that the problem of incomplete information is overlooked in current approaches can be explained by observing that traditional data integration systems follow one of the above-mentioned simplifying strategies: they either express the global schema as a set of plain relations, or consider the sources as exact (see, for instance, [11,19,1]).

In [3] we present an algorithm for computing the set of certain answers to queries posed to a data integration system. The key feature of the algorithm is to reason about both the query and the global ontology in order to infer which tuples satisfy the query in all models of the OIS. Thus, the algorithm does not simply unfold
the query on the basis of the mapping, as usually done in data integration systems based on the global-as-view approach. Indeed, the algorithm is able to add more answers to those directly extracted from the local ontologies, by exploiting the semantic conditions expressed in the conceptual global schema.

Let $O = \langle G, S, M_{G,S} \rangle$ be an OIS, let D be a local model, and let Q be a query over the global ontology G. The algorithm is constituted by three major steps.

1. From the query Q, obtain a new query $\mathit{expand}_G(Q)$ over the elements of the global ontology G in which the knowledge in G that is relevant for Q has been compiled in.
2. From $\mathit{expand}_G(Q)$, compute the query $\mathit{unfold}_{M_{G,S}}(\mathit{expand}_G(Q))$, by unfolding $\mathit{expand}_G(Q)$ on the basis of the mapping $M_{G,S}$. The unfolding simply substitutes each atom of $\mathit{expand}_G(Q)$ with the query associated by $M_{G,S}$ to the element in the atom. The resulting $\mathit{unfold}_{M_{G,S}}(\mathit{expand}_G(Q))$ is a query over the relations in the local ontologies.
3. Evaluate the query $\mathit{unfold}_{M_{G,S}}(\mathit{expand}_G(Q))$ over the local model D.

The last two steps are quite obvious. The first one, instead, requires finding a way to compile into the query the semantic relations holding among the concepts of the global schema G. A way to do so is shown in [3]. The query $\mathit{expand}_G(Q)$ returned by the algorithm is exponential in the size of Q. However, $\mathit{expand}_G(Q)$ is a union of conjunctive queries, which, if the queries in the mapping are polynomial, makes the entire algorithm polynomial in data complexity.

Example 4. Referring to Example 3, consider the query $Q_1$ to $O_1$:

$Q_1(x) \leftarrow \mathit{Member}(x, y) \wedge \mathit{University}(y)$

It is easy to see that $\{p_2, t_1\}$ is the set of certain answers to $Q_1$ with respect to $O_1$ and $D_2$.
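The claimed answer set can be reproduced with a small Python sketch of the three steps. It is an illustration under stated assumptions, not the algorithm of [3]: the expanded query is written out by hand from the constraints named in Example 1 (Enrolled isa Member, mandatory participation of Student in Enrolled), together with the assumption, taken from the discussion of Example 3 rather than from a visible schema, that whatever a student is enrolled in is a university. The contents of S7 and S8 are hypothetical values chosen only so that their join yields the two pairs listed in Example 3, and the Enrolled view is treated as empty because the local model holds no binary S4 tuples.

```python
# Local model D2 of Example 3, one set of tuples per source relation.
# S7 and S8 are hypothetical: the text only gives the result of their join.
D2 = {
    "S1": {("p1",), ("p2",)},
    "S2": {("o1",)},
    "S3": set(),
    "S4": {("t1",)},
    "S5": {("u1",)},
    "S6": set(),
    "S7": {("p1", "z1"), ("p2", "z2")},
    "S8": {("z1", "o1"), ("z2", "u1")},
}


# Views of the mapping M1 for the concepts that Q1 can reach.
def member(d):
    # Member(x, y) <- S7(x, z) and S8(z, y)
    return {(x, y) for (x, z) in d["S7"] for (z2, y) in d["S8"] if z == z2}


def university(d):
    # University(x) <- S5(x)
    return {x for (x,) in d["S5"]}


def student(d):
    # Student(x) <- S3(x, y) or S4(x)
    return {t[0] for t in d["S3"]} | {t[0] for t in d["S4"]}


def enrolled(d):
    # Enrolled(x, y) <- S4(x, y); only binary tuples can match.
    return {t for t in d["S4"] if len(t) == 2}


def naive_answers(d):
    """Unfold Q1 directly, ignoring the constraints of the global schema."""
    uni = university(d)
    return {x for (x, y) in member(d) if y in uni}


def certain_answers(d):
    """Evaluate a hand-written expansion of Q1 (assumed, see lead-in).

    Q1(x) <- Member(x, y) and University(y)
    Q1(x) <- Enrolled(x, y)      # Enrolled isa Member, target is a university
    Q1(x) <- Student(x)          # Student must participate in Enrolled
    """
    uni = university(d)
    from_member = {x for (x, y) in member(d) if y in uni}
    from_enrolled = {x for (x, _y) in enrolled(d)}
    from_student = student(d)
    return from_member | from_enrolled | from_student


print(sorted(naive_answers(D2)))    # ['p2']
print(sorted(certain_answers(D2)))  # ['p2', 't1']
```

Plain unfolding misses t1 because no source says which university t1 attends; the extra disjuncts contributed by the schema constraints recover it.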
Thus, although $D_2$ does not indicate in which university $t_1$ is enrolled, the semantics of $O_1$ specifies that $t_1$ is enrolled in a university in all legal databases for $O_1$. Since Member is a generalization of Enrolled, this implies that $t_1$ is in $Q^{O_1}$, and hence is in $\mathit{unf}_{M_1}(\mathit{exp}_{G_1}(Q_1))$ evaluated over $D_2$.

4 Local-centric approach

In the local-centric approach (aka local-as-view approach), we assume we have a query language $V_G$ over the alphabet $\mathcal{A}_G$, and the mapping between the global and the local ontologies is given by associating to each term in the local ontologies a view, i.e., a query over the global ontology. Again, the intended meaning of associating to a term $C$ in $S$ a query $V_g$ over $G$ is that such a query represents the best way to characterize the instances of $C$ using the concepts in $G$. As in the global-centric approach, the correspondence between $C$ and the associated view can be either sound, complete, or exact. Let $D$ be a local model for $O$, and $I$ a global interpretation for $O$:

• $I$ satisfies the correspondence $\langle V_g, C, \mathit{sound} \rangle$ in $M_{G,S}$ wrt $D$, if all the tuples satisfying $C$ in $D$ satisfy $V_g$ in $I$,

• $I$ satisfies the correspondence $\langle V_g, C, \mathit{complete} \rangle$ in $M_{G,S}$ wrt $D$, if no tuple other than those satisfying $C$ in $D$ satisfies $V_g$ in $I$,

• $I$ satisfies the correspondence $\langle V_g, C, \mathit{exact} \rangle$ in $M_{G,S}$ wrt $D$, if the set of tuples that satisfy $C$ in $D$ is exactly the set of tuples satisfying $V_g$ in $I$.

As in the global-centric approach, we say that $I$ satisfies the mapping $M_{G,S}$ wrt $D$, if $I$ satisfies every correspondence in $M_{G,S}$ wrt $D$.

Recent research work on data integration follows the local-centric approach [20, 17, 18, 6, 8]. The major challenge of this approach is that, in order to answer a query expressed over the global schema, one must be able to reformulate the query in terms of queries to the sources.
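The sound, complete and exact conditions defined above reduce to set comparisons once the tuples satisfying $C$ in $D$ and the tuples satisfying $V_g$ in $I$ are both available. The following Python sketch assumes that both are given as finite sets of tuples; the function names and the sample data are illustrative and do not come from the paper.

```python
def satisfies(kind, source_tuples, view_tuples):
    """Check one correspondence <V_g, C, kind>.

    source_tuples: tuples satisfying C in the local model D
    view_tuples:   tuples satisfying V_g in the global interpretation I
    """
    if kind == "sound":      # every source tuple is an answer of the view
        return source_tuples <= view_tuples
    if kind == "complete":   # the view has no answers beyond the source tuples
        return view_tuples <= source_tuples
    if kind == "exact":      # both sets coincide
        return source_tuples == view_tuples
    raise ValueError(f"unknown correspondence kind: {kind}")

def satisfies_mapping(correspondences, local_ext, view_ext):
    """I satisfies the mapping if it satisfies every correspondence in it."""
    return all(satisfies(kind, local_ext[c], view_ext[v])
               for v, c, kind in correspondences)

# Hypothetical data: one source term and the answers of its associated view.
local_ext = {"s_enrolled": {("t1", "u1")}}
view_ext = {"V_enrolled": {("t1", "u1"), ("t2", "u1")}}

print(satisfies("sound", local_ext["s_enrolled"], view_ext["V_enrolled"]))     # True
print(satisfies("complete", local_ext["s_enrolled"], view_ext["V_enrolled"]))  # False
print(satisfies_mapping([("V_enrolled", "s_enrolled", "sound")], local_ext, view_ext))  # True
```

The interpretation in this example contains one tuple that the source does not report, which is compatible with a sound correspondence but violates a complete or exact one.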
While in the global-centric approach such a reformulation is guided by the correspondences in the mapping, here the problem requires a reasoning step, so as to infer how to use the sources for answering the query. Many authors point out that, despite its difficulty, the local-centric approach better supports a dynamic environment, where local ontologies can be added to the system without the need for restructuring the global ontology.

4.1 A case study

We present here an OIS architecture based on the use of Description Logics to represent ontologies [6, 7]. Specifically, we adopt the Description Logic DLR, in which both classes and n-ary relations can be represented [4]. We first introduce DLR, and then we illustrate how we use the logic to define an OIS.

4.1.1 The Description Logic DLR

Description Logics² (DLs) are knowledge representation formalisms that are able to capture virtually all class-based representation formalisms used in Artificial Intelligence, Software Engineering, and Databases [9, 10]. One of the distinguishing features of these logics is that they have optimal reasoning algorithms, and practical systems implementing such algorithms are now used in several projects.

In DLs, the domain of interest is modeled by means of concepts and relations, which denote classes of objects and relationships, respectively. Here, we focus our attention on the DL DLR [4, 6], whose basic elements are concepts (unary relations), and n-ary relations. We assume an alphabet $\mathcal{A}$ consisting of a finite set of atomic relations, atomic concepts, and constants, denoted by $P$, $A$, and $a$, respectively. We use $R$ to denote arbitrary relations (of given arity between 2 and $n_{max}$), and $C$ to denote arbitrary concepts, respectively built according to the following syntax:

$$C ::= \top_1 \mid A \mid \neg C \mid C_1 \sqcap C_2 \mid \exists[i]R \mid (\leq k\,[i]R)$$

$$R ::= \top_n \mid P \mid i/n : C \mid \neg R \mid R_1 \sqcap R_2$$

where $i$ denotes a component of a relation, i.e., an integer between 1 and $n_{max}$, $n$ denotes the arity of a relation, i.e., an integer between 2 and $n_{max}$, and $k$ denotes a nonnegative integer.
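To make the grammar concrete, here are a few expressions that it generates, each with an informal reading. The atomic concepts Student and University and the binary atomic relation Enrolled are hypothetical names chosen only for illustration, and the readings are informal glosses rather than definitions.

```latex
\begin{align*}
& \exists[1]\mathit{Enrolled}
  && \text{objects occurring as first component of some tuple of } \mathit{Enrolled} \\
& (\leq 1\,[1]\mathit{Enrolled})
  && \text{objects occurring as first component of at most one tuple of } \mathit{Enrolled} \\
& (1/2 : \mathit{Student}) \sqcap (2/2 : \mathit{University})
  && \text{binary tuples with first component in } \mathit{Student} \text{ and second in } \mathit{University} \\
& \mathit{Enrolled} \sqcap \neg (2/2 : \mathit{University})
  && \text{tuples of } \mathit{Enrolled} \text{ whose second component is not in } \mathit{University}
\end{align*}
```

The first two expressions are concepts, while the last two are relations of arity 2.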
We consider only concepts and relations that are well-typed, which means that only relations of the same arity $n$ are combined to form expressions of type $R_1 \sqcap R_2$ (which inherit the arity $n$), and $i \leq n$ whenever $i$ denotes a component of a relation of arity $n$.

The semantics of DLR is specified as follows. An interpretation $I$ consists of an interpretation domain $\Delta^I$, and an interpretation function $\cdot^I$ that assigns to each constant an element of $\Delta^I$ under the unique name assumption, to each concept $C$ a subset $C^I$ of $\Delta^I$, and to each relation $R$ of arity $n$ a subset $R^I$ of $(\Delta^I)^n$, such that the conditions in Figure 2 are satisfied. Observe that the “¬” constructor on relations is used to express difference of relations, and not the complement [4].

$$\begin{array}{rcl}
\top_1^I & = & \Delta^I \\
A^I & \subseteq & \Delta^I \\
(\neg C)^I & = & \Delta^I \setminus C^I \\
(C_1 \sqcap C_2)^I & = & C_1^I \cap C_2^I \\
(\exists[i]R)^I & = & \{ d \in \Delta^I \mid \exists \langle d_1, \ldots, d_n \rangle \in R^I .\ d_i = d \} \\
(\leq k\,[i]R)^I & = & \{ d \in \Delta^I \mid \sharp\{ \langle d_1, \ldots, d_n \rangle \in R^I \mid d_i = d \} \leq k \} \\[4pt]
\top_n^I & \subseteq & (\Delta^I)^n \\
P^I & \subseteq & \top_n^I \\
(i/n : C)^I & = & \{ \langle d_1, \ldots, d_n \rangle \in \top_n^I \mid d_i \in C^I \} \\
(\neg R)^I & = & \top_n^I \setminus R^I \\
(R_1 \sqcap R_2)^I & = & R_1^I \cap R_2^I
\end{array}$$

Figure 2: Semantic rules for DLR ($P$, $R$, $R_1$, and $R_2$ have arity $n$)

A DLR knowledge base is a set of inclusion assertions of the form

$$C_1 \sqsubseteq C_2 \qquad\qquad R_1 \sqsubseteq R_2$$

where $C_1$ and $C_2$ are concepts, and $R_1$ and $R_2$ are relations of the same arity. An inclusion assertion $C_1 \sqsubseteq C_2$ (resp., $R_1 \sqsubseteq R_2$) is satisfied in an interpretation $I$ if $C_1^I \subseteq C_2^I$ (resp., $R_1^I \subseteq R_2^I$). An interpretation is a model of a knowledge base $K$, if it satisfies all assertions in $K$. $K$ logically implies an inclusion assertion $\rho$ if $\rho$ is satisfied in all models of $K$.

² See for the home page of Description Logics.

Finally, we introduce the notion of query expression in DLR. We assume that the alphabet $\mathcal{A}$ is enriched with a finite set of variable symbols, simply called variables. A query expression $Q$ over a DLR knowledge base $K$ is a non-recursive datalog query of the form

$$Q(\vec{x}) \leftarrow \mathit{conj}_1(\vec{x}, \vec{y}_1) \lor \cdots \lor \mathit{conj}_m(\vec{x}, \vec{y}_m)$$

where each $\mathit{conj}_i(\vec{x}, \vec{y}_i)$ is a conjunction of atoms, and $\vec{x}$, $\vec{y}_i$ are all the variables appearing in the conjunct. Each atom has one of the forms $R(\vec{t})$ or $C(t)$, where $\vec{t}$ and $t$ are variables in $\vec{x}$ and $\vec{y}_i$ or constants in $\mathcal{A}$, $R$ is a relation of $K$, and $C$ is a
concept of $K$. The number of variables of $\vec{x}$ is called the arity of $Q$, and is the arity of the relation denoted by the query $Q$. We observe that the atoms in query expressions are arbitrary DLR concepts and relations, freely used in the assertions of the KB.

Given an interpretation $I$, a query expression $Q$ of arity $n$ is interpreted as the set $Q^I$ of $n$-tuples of constants $\langle c_1, \ldots, c_n \rangle$, such that, when substituting each $c_i$ for $x_i$, the formula

$$\exists \vec{y}_1 . \mathit{conj}_1(\vec{x}, \vec{y}_1) \lor \cdots \lor \exists \vec{y}_m . \mathit{conj}_m(\vec{x}, \vec{y}_m)$$

evaluates to true in $I$.

DLR is equipped with effective reasoning techniques that are sound and complete with respect to the semantics. In particular, checking whether a given assertion logically follows from a set of assertions is EXPTIME-complete (assuming that numbers are encoded in unary), and query containment, i.e., checking whether one query is contained in another one in every model of a set of assertions, is EXPTIME-hard and solvable in 2EXPTIME [4].

4.1.2 DLR local-centric OIS

We now set up a local-centric framework for ontology integration, which is based on ideas developed for data integration over DLR knowledge bases [6, 5]. In particular, we describe
3 Knowledge Representation and Ontologies Logic, Ontologies and Semantic Web Languages
Stephan Grimm¹, Pascal Hitzler², Andreas Abecker¹

¹ FZI Research Center for Information Technologies, University of Karlsruhe, Germany, {grimm,abecker}@fzi.de
² Institute AIFB, University of Karlsruhe, Germany, hitzler@aifb.uni-karlsruhe.de

Summary. In Artificial Intelligence, knowledge representation studies the formalisation of knowledge and its processing within machines. Techniques of automated reasoning allow a computer system to draw conclusions from knowledge represented in a machine-interpretable form. Recently, ontologies have evolved in computer science as computational artefacts to provide computer systems with a conceptual yet computational model of a particular domain of interest. In this way, computer systems can base decisions on reasoning about domain knowledge, similar to humans. This chapter gives an overview on basic knowledge representation aspects and on ontologies as used within computer systems. After introducing ontologies in terms of their appearance, usage and classification, it addresses concrete ontology languages that are particularly important in the context of the Semantic Web. The most recent and predominant ontology languages and formalisms are presented in relation to each other and a selection of them is discussed in more detail.

3.1 Knowledge Representation

As a branch of symbolic Artificial Intelligence, knowledge representation and reasoning aims at designing computer systems that reason about a machine-interpretable representation of the world, similar to human reasoning. Knowledge-based systems have a computational model of some domain of interest in which symbols serve as surrogates for real world domain artefacts, such as physical objects, events, relationships, etc. [45]. The domain of interest can cover any part of the real world or any hypothetical system about which one desires to represent knowledge for computational purposes. A knowledge-based system maintains a
knowledge base which stores the symbols of the computational model in form of statements about the domain, and it performs reasoning by manipulating these symbols. Applications can base their decisions on domain-relevant questions posed to a knowledge base.

3.1.1 A Motivating Scenario

To illustrate principles of knowledge representation in this chapter, we introduce an example scenario taken from a B2B travelling use case. In this scenario, companies frequently book business trips for their employees, sending them to international meetings and conference events. Such a scenario is a relevant use case for Semantic Web Services, since companies desire to automate the online booking process, while they still want to benefit from the high competition among various travel agencies and no-frills airlines that sell tickets via the internet. Automation is achieved by computational agents deciding whether an online offer of some travel agency fits a request for a business trip or not, based on the knowledge they have about the offer and the request. Knowledge represented in this domain of “business trips” is about flights, trains, booking, companies and their employees, cities that are source or destination for a trip, etc.

Knowledge-based systems use a computational representation of such knowledge in form of statements about the domain of interest. Examples of such statements in the business trips domain are “companies book trips for their employees”, “flights and train rides are special kinds of trips” or “employees are persons employed at some company”. This knowledge can be used to answer questions about the domain of interest. From the given statements, and by means of automated deduction, a knowledge-based system can, for example, derive that “a person on a flight booked by a company is an employee” or “the company that booked a flight for a person is this person’s employer”.

In this way, a knowledge-based computational agent can reason about business
trips, similar to the way a human would. It could, for example, tell apart offers for business trips from offers for vacations, or decide whether the destination city for a requested flight is close to the geographical region specified in an offer, or conclude that a participant of a business flight is an employee of the company that booked the flight.

3.1.2 Forms of Representing Knowledge

If we look at current Semantic Web technologies and use cases, knowledge representation appears in different forms, the most prevalent of which are based on semantic networks, rules and logic. Semantic network structures can be found in RDF graph representations [30] or Topic Maps [41], whereas a formalisation of business knowledge often comes in form of rules with some “if-then” reading, e.g. in business rules or logic programming formalisms. Logic is used to realise a precise semantic interpretation for both of the other forms. By providing formal semantics for knowledge representation languages, logic-based formalisms lay the basis for automated deduction. We will investigate these three forms of knowledge representation in the following.

Semantic Networks

Originally, semantic networks stem from the “existential graphs” introduced by Charles Peirce in 1896 to express logical sentences as graphical node-and-link diagrams [43]. Later on, similar notations were introduced, such as conceptual graphs [45], all differing slightly in syntax and semantics. Despite these differences, all the semantic network formalisms concentrate on expressing the taxonomic structure of categories of objects and the relations between them. We use a general notion of a semantic network, abstracting from the different concrete notations proposed.

A semantic network is a graph whose nodes represent concepts and whose arcs represent relations between these concepts. Such networks provide a structural representation of statements about a domain of interest. In the business trips domain, typical concepts would be “Company”, “Employee” or “Flight”, while typical relations would be “books”, “isEmployedAt” or “participatesIn”. Figure 3.1 shows an example of a semantic network for the business trips domain.

Fig. 3.1. A Semantic Network for Business Trips

Semantic networks provide a means to abstract from natural language, representing the knowledge that is captured in text in a form more suitable for computation. The knowledge expressed in the network from Figure 3.1 coincides with the content of the following natural language text.

“Employees of companies are persons, while both persons and companies are legal entities. Companies book trips for their employees. These trips can be flights or train rides which start and end in cities of Europe or the USA. Companies themselves have locations which can be cities. The company UbiqBiz books the flight FL4711 from London to New York for Mister X.”

Typically, concepts are chosen to represent the meaning of nouns in such a text, while relations are mapped to verb phrases. The fragment Company --books--> Trip is read as “companies book trips”, expressed as a binary relation between two concepts. However, this is not mandatory; the relation --books--> could also be “lifted” to a concept Booking with relations --hasActor-->, --hasParticipant--> and --hasObject--> pointing to Company, Employee and Trip, respectively. In this way, its ternary character would be expressed more accurately than in the original network, where the information about an employee’s involvement in booking is implicit.

In principle, the concepts and relations in a semantic network are generic and could stand for anything relevant in the domain of interest. However, some particular relations for some standard knowledge representation and reasoning cases have evolved.

The semantic network in Figure 3.1 illustrates the distinction between general concepts, like Employee, and individual concepts, like MisterX. While the latter represent concrete individuals or objects in the domain of interest, the former serve as classes to group
together such individuals that have certain properties in common, as e.g. all employees. The particular relation which links individuals to their classes is that of instantiation, denoted by --isA-->. Thus, MisterX is called an instance of the concept Employee. The lower part of the network is concerned with knowledge about individuals, reflecting a particular situation of the employee MisterX participating in a certain flight, while the upper part is concerned with knowledge about general concepts, reflecting various possible situations.

The most prominent type of relation in semantic networks, however, is that of subsumption, which we denote by --kindOf-->. A subsumption link connects two general concepts and expresses specialisation or generalisation, respectively. In the network in Figure 3.1, a flight is said to be a special kind of trip, i.e. Trip subsumes Flight. This means that any flight is also a trip; however, there might be other trips which are not flights, such as train rides. Subsumption is associated with the notion of inheritance in that a specialised concept inherits all the properties from its more general parent concepts. For example, from the network one can read that a company can be located in a European city, since --locatedAt--> points from Company to Location while EUCity is a kind of City which is itself a kind of Location. The concept EUCity inherits the property of being a potential location for a company from the concept Location.

Other particular relations that can be found in semantic network notations are, for example, --partOf--> to denote part-whole relationships, etc.

Semantic networks are closely related to another form of knowledge representation called frame systems. In fact, frame systems and semantic networks can be identical in their expressiveness but use different representation metaphors [43]. While the semantic network metaphor is that of a graph with concept nodes linked by relation arcs, the frame metaphor draws concepts as boxes, i.e. frames, and
relations as slots inside frames that can be filled by other frames. Thus, in the frame metaphor the graph turns into nested boxes.

The semantic network form of knowledge representation is especially suitable for capturing the taxonomic structure of categories for domain objects and for expressing general statements about the domain of interest. Inheritance and other relations between such categories can be represented in and derived from subsumption hierarchies. On the other hand, the representation of concrete individuals or even data values, like numbers or strings, does not fit well the idea of semantic networks.

Rules

Another natural form of expressing knowledge in some domain of interest is given by rules that reflect the notion of consequence. Rules come in the form of IF-THEN constructs and allow one to express various kinds of complex statements. Rules can be found in logic programming systems, like the language Prolog [31], in deductive databases [34] or in business rules systems.

The following is an example of rules expressing knowledge in the business trips domain, specified in their intuitive if-then reading.

(1) IF something is a flight THEN it is also a trip
(2) IF some person participates in a trip booked by some company
    THEN this person is an employee of this company
(3) FACT the person MisterX participates in a flight booked by the company UbiqBiz
(4) IF a trip’s source and destination cities are close to each other
    THEN the trip is by train

The IF-part is also called the body of a rule, while the THEN-part is also called its head. Typically, rule-based knowledge representation systems operate on facts, which are often formalised as a special kind of rule with an empty body. They start from a given set of facts, like rule (3) above, and then apply rules in order to derive new facts, thus “drawing conclusions”.

However, the intuitive reading with natural language phrases is not suitable for computation, and therefore such phrases are formalised to predicates
and variables over objects of the domain of interest. A formalisation of the above rules in the typical style of rule languages looks as follows.

(1) Trip(?t) :- Flight(?t)
(2) Employee(?p) ∧ isEmployedAt(?p,?c) :- Trip(?t) ∧ books(?c,?t) ∧ Company(?c) ∧ participatesIn(?p,?t) ∧ Person(?p)
(3) Person(MisterX) ∧ participatesIn(MisterX,FL4711) ∧ Flight(FL4711) ∧ books(UbiqBiz,FL4711) ∧ Company(UbiqBiz) :-
(4) TrainRide(?t) :- Trip(?t) ∧ startsFrom(?t,?s) ∧ endsIn(?t,?d) ∧ close(?s,?d)

In most logic programming systems a rule is read as an inverse implication, starting with the head followed by the body, which is indicated by the symbol :- that resembles a backward arrow. In this formalisation, the intuitive notions from the text, that were concepts and relations in the semantic network case, became predicates linked through variables and constants that identify objects in the domain of interest. Variables start with the symbol ? and take as their values the constants that occur in facts such as (3).

Rule (1) captures inheritance, or subsumption, between trips and flights by stating that “everything that is a flight is also a trip”. Rule (2) draws conclusions about the status of employment for participants of business flights. From the facts (3), these two rules are able to derive the implicit fact that “MisterX is an employee of UbiqBiz”.

While the rules (1) and (2) express general domain knowledge, rule (4) can be interpreted as part of some company’s travelling policy, stating that trips between close cities shall be conducted by train. In business rules, for example, rule-based formalisms are used with the motivation to capture complex business knowledge in companies like pricing models or delivery policies.

Rule-based knowledge representation systems are especially suitable for reasoning about concrete instance data, i.e. simple facts of the form Employee(MisterX). Complex sets of rules can efficiently derive such implicit facts from explicitly given ones. They are problematic if more complex and general statements about the domain shall be
derived which do not fit a rule’s head.

Logic

Both forms, semantic networks as well as rules, have been formalised using logic to give them a precise semantics. Without such a precise formalisation they are vague and ambiguous, and thus problematic for computational purposes. From just the graphical representation of the semantic network in Figure 3.1, for example, it is not clear whether companies can only book flights for their own employees or for employees of partner companies as well. Neither is it clear from the fragment Company --books--> Trip whether every company books trips or just some company. Also for rules, despite their much more formal appearance, the exact meaning remains unclear when, for example, forms of negation are introduced that allow for potential conflicts between rules. Depending on the choice of procedural evaluation or flavour of formal semantics, different derivation results are produced.

The most prominent and fundamental logical formalism classically used for knowledge representation is the “first-order predicate calculus”, or first-order logic for short, and we choose this formalism to present logic as a form of knowledge representation here. First-order logic allows one to describe the domain of interest as consisting of objects, i.e. things that have individual identity, and to construct logical formulas around these objects formed by predicates, functions, variables and logical connectives [43]. We assume that the reader is familiar with the notation of first-order logic from formalisations of various mathematical disciplines.

Similar to semantic networks, most statements in natural language can be expressed in terms of logical sentences about objects of the domain of interest with an appropriate choice of predicate and function symbols. Concepts are mapped to unary, relations to binary predicates. We illustrate the use of logic for knowledge representation by axiomatising parts of the semantic network from
Figure 3.1 more precisely. Subsumption, for example, can be directly expressed by a logical implication, which is illustrated in the translation of the following fragment.

Employee --kindOf--> Person
∀x: (Employee(x) → Person(x))

Due to the universal quantifier, the variable x in the logical formula ranges over all domain objects and its reading is “everything that is an employee is also a person”. Other parts of the network can be further restricted using logical formulas, as shown in the following example.

Company --books--> Trip
∀x,y: (books(x,y) → Company(x) ∧ Trip(y))
∀x: ∃y: (Trip(x) → Company(y) ∧ books(y,x))

The graphical representation of the network fragment leaves some details open, while the logical formulas capture the booking relation between companies and trips more precisely. The first formula states that domain and range of the booking relation are companies and trips, respectively, while the second formula makes sure that for every trip there does actually exist a company that booked it.

In particular, more complex restrictions that range over larger fragments of a network graph can be formulated in logic, where the intuitive graphical notation lacks expressivity. As an example consider the relations between companies, trips and employees in the following fragment.

Company --books--> Trip
Employee --participatesIn--> Trip
Employee --employedAt--> Company
∀x: ∃y: (Trip(x) → Employee(y) ∧ participatesIn(y,x) ∧ books(employer(y),x))

The logical formula expresses additional knowledge that is not captured in the graph representation. It states that, for every trip, there must be an employee that participates in this trip while the employer of this participant is the company that booked the flight.

Rules can also be formalised with logic. An IF-THEN rule can be represented as a logical implication with universally quantified variables. For example, a common formalisation of the rule

IF a trip’s source and destination cities are close to each other
THEN the trip
is by train

is the translation to the logical formula

∀x,y,z: (Trip(x) ∧ startsFrom(x,y) ∧ endsIn(x,z) ∧ close(y,z) → TrainRide(x)).

However, the typical rule-based systems do not interpret such a formula in the classical sense of first-order logic but employ different kinds of semantics, which are discussed in Section 3.2.

Since a precise axiomatisation of domain knowledge is a prerequisite for processing knowledge within computers in a meaningful way, we focus on logic as the dominant form of knowledge representation. Therefore, we investigate different kinds of logics and formal semantics more closely in a subsequent section.

In the context of the Semantic Web, two particular logical formalisms have gained momentum, reflecting the semantic network and rules forms of knowledge representation. The graph notations of semantic networks have been formalised through description logics, which are fragments of first-order logic with typical Tarskian model-theoretic semantics but restricted to unary and binary predicates to capture the notions of concepts and relations. On the other hand, rules have been formalised through logic programming formalisms with minimal model semantics, focusing on the derivation of simple facts about individual objects. Both description logics and logic programming can be found as underlying formalisms in various knowledge representation languages in the Semantic Web, which are addressed in Section 3.4.

3.1.3 Reasoning about Knowledge

The way in which we, as humans, process knowledge is by reasoning, i.e. the process of reaching conclusions. Analogously, a computer processes the knowledge stored in a knowledge base by drawing conclusions from it, i.e. by deriving new statements that follow from the given ones.

The basic operations a knowledge-based system can perform on its knowledge base are typically denoted by tell and ask [43]. The tell-operation adds a new statement to the knowledge base, whereas the ask-operation is used to query what is known. The statements that have been
added to a knowledge base via the tell-operation constitute the explicit knowledge a system has about the domain of interest. The ability to process explicit knowledge computationally allows a knowledge-based system to reason over a domain of interest by deriving implicit knowledge that follows from what has been told explicitly.

This leads to the notion of logical consequence or entailment. A knowledge base KB is said to entail a statement α if α “follows” from the knowledge stored in KB, which is written as KB |= α. A knowledge base entails all the statements that have been added via the tell-operation plus those that are their logical consequences. As an example, consider the following knowledge base with sentences in first-order logic.

KB = { Person(MisterX), participates(MisterX,FL4711),
       Flight(FL4711), books(UbiqBiz,FL4711),
       ∀x,y,z: (Flight(y) ∧ participates(x,y) ∧ books(z,y) → employedAt(x,z)),
       ∀x,y: (employedAt(x,y) → Employee(x) ∧ Company(y)),
       ∀x: (Person(x) → ¬Company(x)) }

The knowledge base KB explicitly states that “MisterX is a person who participates in the flight FL4711 booked by UbiqBiz”, that “participants of flights are employed at the company that booked the flight”, that “the employment relation holds between employees and companies” and that “persons are different from companies”. If we ask the question “Is MisterX employed at UbiqBiz?” by saying

ask(KB, employedAt(MisterX,UbiqBiz))

the answer will be yes. The knowledge base KB entails the fact that “MisterX is employed at UbiqBiz”, i.e. KB |= employedAt(MisterX,UbiqBiz), although it was not “told” so explicitly. This follows from its general knowledge about the domain. A further consequence is that “UbiqBiz is a company”, i.e. KB |= Company(UbiqBiz), which is reflected by a positive answer to the question

ask(KB, Company(UbiqBiz)).

This follows from the former consequence together with the fact that “employment holds between employees and companies”.

Another important notion related to entailment is that of consistency or
satisfiability. Intuitively, a knowledge base is consistent or satisfiable if it does not contain contradictory facts. If we were to add the fact that “UbiqBiz is a person” to the above knowledge base KB by saying

tell(KB, Person(UbiqBiz)),

it would become unsatisfiable because persons are said to be different from companies. We explicitly said that UbiqBiz is a person while at the same time it can be derived that it is a company.

In general, an unsatisfiable knowledge base is not very useful, since in logical formalisms it would entail any arbitrary fact. The ask-operation would always return a positive result independent of its parameters, which is clearly not desirable for a knowledge-based system.

The inference procedures implemented in computational reasoners aim at realising the entailment relation between logical statements [43]. They derive implicit statements from a given knowledge base or check whether a particular statement is entailed by a knowledge base.

An inference procedure that only derives entailed statements is called sound. Soundness is a desirable feature of an inference procedure, since an unsound inference procedure would potentially draw wrong conclusions. If an inference procedure is able to derive every statement that is entailed by a knowledge base then it is called complete. Completeness is also a desirable property, since a complex chain of conclusions might break down if only a single statement in it is missing. Hence, for reasoning in knowledge-based systems we desire sound and complete inference procedures.

3.2 Logic-Based Knowledge-Representation Formalisms

First-order (predicate) logic is the prevalent and single most important knowledge representation formalism. Its importance stems from the fact that basically all current symbolic knowledge representation formalisms can be understood in their relation to first-order logic.
Its roots can be traced back to the ancient Greek philosopher Aristotle, and modern first-order predicate logic was created in the 19th century, when the foundations for modern mathematics were laid.

First-order logic captures some of the essence of human reasoning by providing a notion of logical consequence as already mentioned. It also provides a notion of universal truth in the sense that a logical statement can be universally valid (and thus called a tautology), meaning that it is a statement which is true regardless of any preconditions.

Logical consequence and universal truth can be described in terms of model-theoretic semantics. In essence, a model for a logical theory³ describes a state of affairs which makes the theory true. A tautology is a statement for which all possible states of affairs are models. A logical consequence of a theory is a statement which is true in all models of the theory.

How to derive logical consequences from a theory, a process called deduction or inferencing, is obviously central to the study of logic. Deduction allows one to access knowledge which is not explicitly given but implicitly represented by a theory. Valid ways of deriving logical consequences from theories also date back to the Greek philosophers, and have been studied since. At the heart of this is what has become known as proof theory.

Proof theory describes syntactic rules which act on theories and allow one to derive logical consequences without explicit recourse to models. The notion of universal truth can thus be reduced to syntactic manipulations. This allows one to abstract from model theory and enables deduction by symbol manipulation, and thus by automated means.

Obviously, with the advent of electronic computing devices in the 20th century, the automation of deduction has become an important and influential field of study. The field of automated reasoning is concerned with the development of efficient algorithms for deduction. These algorithms are usually required to be sound, and completeness
is a desired feature.The fact that sound and complete deduction algorithms exist forfirst-order predicate logic is reflected by the statement thatfirst-order logic is semi-decidable.More precisely,3A logical theory denotes a set of logical formulas,seen as the axioms of some theory to be mod-elled.46Stephan Grimm,Pascal Hitzler,Andreas Abeckersemi-decidability offirst-order logic means that there exist algorithms which,given a the-ory and a query statement,terminate with positive answer infinite time whenever the state-ment is a logical consequence of the theory.Note that for semi-decidability,termination is not required if the statement is not a logical consequence of the theory,and indeed,ter-mination(with the correct negative answer)cannot be guaranteed in general forfirst-order logical theories.For some kinds of theories,however,sound and complete deduction algorithms exist which always terminate.Such theories are called decidable,and they have certain more-or-less obvious advantages,including the following.•Decidability guarantees that the algorithm always comes back with a correct answer infinite time.4Under semi-decidability,an algorithm which runs for a considerable amount of time may still terminate,or may not terminate at all,and thus the user cannot know whether he has waited long enough for an answer.Decidability is particularly important if we want to reason about the question of whether or not a given statement is a logical consequence of a theory.•Experience shows that practically efficient algorithms are often available for decidable theories due to the effective use of heuristics.Often,this is even the case if worst-case complexity is very high.3.2.1Description LogicsDescription logics[3]are essentially decidable fragments offirst-order logic,5and we have just seen why the study of these is important.At the same time,description logics are expressive enough such that they have become a major knowledge representation paradigm, in particular for use 
within the Semantic Web.We will describe one of the most important and influential description logics,called ALC.Other description logics are best understood as restrictions or extensions of ALC.We introduce the standard description logic notation and give a formal mapping into standard first-order logic syntax.The Description Logic ALCA description logic theory consists of statements about concepts,individuals,and their re-lations.Individuals correspond to constants infirst-order logic,and concepts correspond to unary predicates.In terms of semantic networks,description logic concepts correspond to general concepts in semantic networks,while individuals correspond to individual con-cepts.We deal with conceptsfirst,and will talk about individuals later.Concepts can be named concepts or anonymous(composite)d concepts consist simply of a name,say“human”,which will be mapped to a unary predicate in4It should be noted that there are practical limitations to this due to the fact that computing resources are always limited.A theoretically sound,complete and terminating algorithms may thus run into resource limits and terminate without an answer.5To be precise,there do exist some description logics which are not decidable.And there exist some which are not straightforward fragments offirst-order logics.But for this general introduction,we will not concern ourselves with these.。
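The UbiqBiz example from the discussion of satisfiability above can be made concrete with a very small sketch. It is not a description logic reasoner: it only handles concept assertions, subclass axioms and disjointness axioms, and the class name ToyKB, its methods, and the axiom "SoftwareCompany is a subclass of Company" are assumptions introduced here purely for illustration (the text only says that UbiqBiz can be derived to be a company).

```python
class ToyKB:
    """Tiny knowledge base: concept assertions, subclass and disjointness axioms."""

    def __init__(self):
        self.assertions = set()   # (concept, individual) pairs told explicitly
        self.subclass_of = {}     # concept -> set of direct superconcepts
        self.disjoint = set()     # frozenset({concept_a, concept_b})

    def tell(self, concept, individual):
        self.assertions.add((concept, individual))

    def add_subclass(self, sub, sup):
        self.subclass_of.setdefault(sub, set()).add(sup)

    def add_disjoint(self, a, b):
        self.disjoint.add(frozenset((a, b)))

    def _superconcepts(self, concept):
        # Reflexive-transitive closure over the subclass axioms.
        seen, stack = set(), [concept]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.subclass_of.get(current, ()))
        return seen

    def concepts_of(self, individual):
        result = set()
        for concept, ind in self.assertions:
            if ind == individual:
                result |= self._superconcepts(concept)
        return result

    def is_satisfiable(self):
        # Unsatisfiable if some individual belongs to two disjoint concepts.
        individuals = {ind for _, ind in self.assertions}
        return not any(
            pair <= self.concepts_of(ind)
            for ind in individuals
            for pair in self.disjoint
        )

    def ask(self, concept, individual):
        if not self.is_satisfiable():
            return True  # an unsatisfiable KB entails any arbitrary fact
        return concept in self.concepts_of(individual)


kb = ToyKB()
kb.add_subclass("SoftwareCompany", "Company")  # hypothetical axiom
kb.add_disjoint("Person", "Company")
kb.tell("SoftwareCompany", "UbiqBiz")
before = (kb.is_satisfiable(), kb.ask("Company", "UbiqBiz"), kb.ask("Person", "UbiqBiz"))

kb.tell("Person", "UbiqBiz")                   # the contradictory fact
after = (kb.is_satisfiable(), kb.ask("Unicorn", "UbiqBiz"))
print(before, after)
```

After the contradictory tell, ask answers positively even for the unrelated concept "Unicorn", which is the undesirable behaviour the text describes.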
Ontology
Help users find exactly the services they need
Enable semi-automatic service composition
The semantic description of web services
Describe the semantic web services depending on two kinds of ontologies:
Describe Resources Domain / Service Ontology
Autonomous: the domain ontology may not be complete enough to describe the resources in each organization.
Metadata Data
The dynamics and autonomy of a VO plague the construction of a domain ontology
Cannot pre-construct a global ontology. Cannot construct a stable ontology.
Matching algorithm
Globalization: English-language slides on globalization (PPT)
Tourism globalization
Tourism plays an important role in the process of economic globalization.
Small in scale and simple in structure; it has not yet developed into a complete industrial system.
Take concerted action.
Nowadays, international tourism is the biggest industry in the world. Unfortunately, international tourism creates tension rather than understanding between people from different cultures.
Advantages
It provides a chance for developing countries to meet opportunities and challenges.
Economic globalization can lead to a more rational allocation of the world's capital, technology, products, markets, resources and labor force.
Tang Bingtian
Tourism development has become international.
A The tourism industry in the modern world began to
FLTRP Applied English Course, Integrated English 2: Teaching Plan, Unit 12
The eagerness with which the nation embraced the scandal is simultaneously understandable and troubling.
Paragraph 1
2. The language policy in the European Union is both ineffective and hypocritical, and its ideas of linguistic equality and multilingualism are costly and cumbersome illusions.
2. About the author
Juliane House (1942-): German linguist and Translation Studies scholar. House received a degree in English and Spanish Translation and International Law from the University of Heidelberg, Germany. Later, she worked as a translator and researcher. She earned her BEd, MA and PhD in Linguistics and Applied Linguistics at the University of Toronto, Canada. She is a senior member of the German Science Foundation's Research Centre on Multilingualism at the University of Hamburg, where she has directed several projects on translation and interpreting. Her research interests include translation theory and practice, contrastive pragmatics, discourse analysis, politeness theory, English as lingua franca, intercultural communication, and global business communication.
Her published works include A Model for Translation Quality Assessment (1977 and revisited 1997), Let's Talk and Talk About It: A Pedagogic Interactional Grammar of English (1981) with Willis Edmondson, Interlingual and Intercultural Communication (1986) with Shoshana Blum-Kulka, Cross-cultural Pragmatics: Requests and Apologies (1989) with Shoshana Blum-Kulka and Gabriele Kasper, Misunderstanding in Social Life: Discourse Approaches to Problematic Talk (2003) with Gabriele Kasper and Steven Ross, Multilingual Communication (2004) with Jochen Rehbein, Translation (2009), Translatory Action and Intercultural Communication (2009) with Kristin Bührig and Jan ten Thije, English as a Lingua Franca (Special Issue of Intercultural Pragmatics, vol. 6, no. 2, 2009), Convergence and Divergence in Language Contact Situations (2010) with Kurt Braunmüller, Globalization, Discourse, Media: In a Critical Perspective (2010) with Anna Duszak and Lukasz Kumiega, and Impoliteness in Germany (Intercultural Pragmatics 7:4, 2010).
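UDC Class Numbers: Chinese-English Reference
UDC outline, Chinese-English edition (the English follows). This outline of UDC classes is taken from the Multilingual Universal Decimal Classification Summary (UDCC Publication No. 088), released by the UDC Consortium under a Creative Commons Attribution Share Alike 3.0 license (first release 2009, subsequent update 2012).
0 Science and knowledge. Organization. Computer science. Information. Documentation. Librarianship. Institutions. Publications
00 Prolegomena. Fundamentals of knowledge and culture. Propaedeutics
001 Science and knowledge in general. Organization of intellectual work
002 Documentation. Books. Writings. Authorship
003 Writing systems and scripts
004 Computer science and technology. Computing
004.2 Computer architecture
004.3 Computer hardware
004.4 Software
004.5 Human-computer interaction
004.6 Data
004.7 Computer communication
004.8 Artificial intelligence
004.9 Application-oriented computer-based techniques
005 Management
005.1 Management theory
005.2 Management agents. Mechanisms. Measures
005.3 Management activities
005.5 Management operations. Direction
005.6 Quality management. Total quality management (TQM)
005.7 Organizational management (OM)
005.9 Fields of management
005.92 Records management
005.93 Plant management. Physical resources management
005.94 Knowledge management
005.95/.96 Personnel management. Human resources management
006 Standardization of products, operations, weights, measures and time
007 Activity and organizing. Information. Communication and control theory (cybernetics)
008 Civilization. Culture. Progress
01 Bibliography and bibliographies. Catalogues
02 Librarianship
030 General reference works (as subject)
050 Serial publications, periodicals (as subject)
06 Organizations of a general nature
069 Museums
070 Newspapers (as subject). The press. Outline of journalism
08 Polygraphies. Collective works (as subject)
09 Manuscripts. Rare and remarkable works (as subject)
1 Philosophy. Psychology
101 Nature and role of philosophy
11 Metaphysics
111 General metaphysics. Ontology
122/129 Special metaphysics
13 Philosophy of mind and spirit. Metaphysics of spiritual life
14 Philosophical systems and points of view
159.9 Psychology
159.91 Psychophysiology (physiological psychology). Mental physiology
159.92 Mental development and capacity. Comparative psychology
159.93 Sensation. Sensory perception
159.94 Executive functions
159.95 Higher mental processes
159.96 Special mental states and processes
159.97 Abnormal psychology
159.98 Applied psychology (psychotechnology) in general
16 Logic. Epistemology. Theory of knowledge. Methodology of logic
17 Moral philosophy.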
The Integration of δ′(f) in a Multidimensional Space
Note that $\mathbf{n} \cdot \nabla |\nabla f| = \partial^2 f / \partial n^2$, so that the integrals on the right of Eqs. (7) and (9) depend on both $\partial f / \partial n = |\nabla f|$ and $\partial^2 f / \partial n^2$.
F. Farassat
NASA Langley Research Center, Hampton, Virginia
1.0 Introduction
In the study of noise from high speed surfaces, one needs to evaluate integrals involving δ′(f), where f(x) = 0 or f(x, t) = 0 is a stationary or moving surface on which acoustic sources lie. For example, consider the thickness noise term of the Ffowcs Williams-Hawkings (FW-H) equation when we take the time derivative explicitly:

$$\frac{\partial}{\partial t}\left[\rho_0 v_n \delta(f)\right] = \frac{\partial}{\partial t}\left[\rho_0 \underset{\sim}{v}_n \delta(f)\right] = \rho_0 \dot{\underset{\sim}{v}}_n \delta(f) - \rho_0 \underset{\sim}{v}_n v_n |\nabla f| \delta'(f) \tag{1}$$

It is assumed that the function f(x, t) is so defined that we have |∇f| = 1 on this surface. The local normal velocity of the surface is denoted v_n, given by the relation v_n = −∂f/∂t, and a tilde under a symbol indicates restriction of the variable to the surface f = 0 [1]. This relation shows how the generalized function δ′(f) appears in wave propagation problems as the source term of the linear wave equation. We point out two features of Eq. (1). First, we note that only one of the two normal derivatives multiplying δ′(f) is restricted to the surface f = 0. Second, the unrestricted function |∇f| also appears as a factor of δ′(f), and it cannot and should not be set equal to 1. In this paper, we will first give the interpretation of the following integral I =
English essay: In the Future
In the future, the English language is likely to evolve and adapt to the changing global landscape. Here are some possible developments in the world of English:
1. Global Standardization: English may become more standardized globally, with fewer regional accents and dialects. This could be due to increased international communication and the influence of global media.
2. Increased Use of Technology: The integration of technology in language learning and communication will become more prevalent. AI-powered translation tools and language learning apps will make English more accessible to non-native speakers.
3. New Vocabulary: As technology advances and society changes, new words and phrases will be added to the English lexicon. Terms related to emerging technologies, environmental issues, and cultural shifts will become commonplace.
4. Changes in Grammar: Grammar rules may become more flexible to accommodate the way people naturally use the language in digital communication. This could lead to a more conversational and less formal style of written English.
5. Multilingual Influence: English will continue to borrow words and phrases from other languages, enriching its vocabulary and reflecting the multicultural nature of the global community.
6. E-Learning and Virtual Classrooms: The future of English education will likely involve more online and virtual classrooms, allowing students from around the world to learn and practice English in real time with native speakers.
7. Cultural Sensitivity: As English becomes more global, there will be a greater emphasis on understanding and respecting the cultural nuances of different English-speaking regions.
8. Environmental Language: With growing concerns about climate change and sustainability, the language will evolve to include more terms related to environmental conservation and eco-friendly practices.
9. Inclusivity: The future of English will likely see a push for more inclusive language, avoiding terms that could be considered offensive or discriminatory.
10. Global English Day: A day dedicated to celebrating the global use of English might be established, promoting cultural exchange and language learning.
11. English as a Lingua Franca: English may become even more established as a lingua franca, a common language used for communication between people who do not share a native language.
12. Language Preservation: Efforts to preserve the English language in its various forms will become more important to maintain linguistic diversity.
13. Creative Writing: The future of English literature will likely see a surge in creativity, with writers exploring new narrative structures and incorporating elements from other cultures.
14. Language Apps and Tools: Advanced language learning apps will provide personalized learning experiences, adapting to the user's progress and learning style.
15. International Collaboration: There will be an increase in international collaboration in the field of linguistics, with researchers working together to study and document the evolution of English.
As English continues to grow and change, it will remain a vital tool for communication, education, and cultural exchange in the future.
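Introduction to General-Purpose Concept Knowledge Graphs
1. Definition
A general-purpose concept knowledge graph is a graph composed of entities (for example "Andy Lau"), concepts (for example "actor"), class-membership relations between entities and concepts (also called isA relations, for example "Andy Lau isA actor"), and subclass-of relations between concepts (for example "film actor" is a subclass of "actor").
The latter two kinds of relation are usually both referred to as isA relations.
If A isA B, A is usually called a hyponym of B, and B a hypernym of A.
2. Uses
1. Understanding search intent. When a user searches for "Journey to the West", its concepts "Four Great Classical Novels of China" and "novel" let us understand that the user is looking for a classic novel.
An accurate understanding of the user's search intent can in turn help improve retrieval, ranking and recommendation.
2. Judging entity similarity. When a user needs to judge whether "Fudan University" and "Shanghai Jiao Tong University" are similar, surface string similarity alone hardly reveals that they are similar entities.
With a concept knowledge graph, however, we can see that their concepts largely coincide, and can therefore conclude that they are semantically similar.
3. Explainable entity recommendation. When a user searches in turn for "Fudan University", "Shanghai Jiao Tong University" and "University of Shanghai for Science and Technology", a human naturally infers that the user is searching for universities in Shanghai.
A machine can now query the concept knowledge graph, find that these three entities share the concept "universities in Shanghai", and thus also identify the user's search intent accurately, go on to recommend entities such as "Shanghai International Studies University" and "Tongji University", and give the explanation that the user is searching for universities in Shanghai.
3. Examples of concept knowledge graphs
1. Big Cilin (大词林, Harbin Institute of Technology)
Language: Chinese
Schema: manually constructed
Components: entities, hypernyms, hypernym-hyponym relations, synonym relations, entity attributes
Storage: relational database
2. CN-Probase (Fudan University)
Language: Chinese
Schema: two-level, "category-instance", with Baidu Baike entry tags used as categories
Data: mainly Baidu Baike entry tags used as categories; its graph data can be compared with the pkubase-types.txt data of PKUBASE (comparison figure not reproduced)
Storage: Neo4j
3. XLORE (Tsinghua University)
Language: Chinese, English
Schema: the category systems of Baidu Baike and Wikipedia
Components: concept table, instance table, property table, instance abstract text, infoboxes, hypernym-hyponym relations, related-to relations, cross-lingual links, URLs
Data: Baidu Baike, Chinese Wikipedia, English Wikipedia
Storage: similar to a relational database
4. Microsoft Concept Graph
Language: English
Schema: two-level, "category-instance"
Components: concept table, instance table, hypernym-hyponym (IsA) relation table
Storage: not specified
5. ConceptNet (MIT Media Lab)
Language: multilingual
Schema: URI hierarchy
/a/: assertions, also known as edges (as of 5.5, these are the same thing)
/c/: concepts, also known as terms (words and phrases in a particular language)
/d/: datasets (broad sources of knowledge)
/r/: language-independent relations, such as /r/IsA
/s/: knowledge sources, which can be human contributors, Web sites, or automated processes
/and/: conjunctions of sources that were used together to create an assertion
Data: ConceptNet 5, DBPedia (infoboxes), Wiktionary (multilingual dictionary: synonyms, antonyms, translations), WordNet, OpenCyc (high-level ontology), Verbosity
Storage: PostgreSQL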
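The entity-similarity and explainable-recommendation uses described above can be sketched with a small in-memory isA graph. This is only an illustration: the class ConceptGraph, its method names, the sample edges, and the choice of Jaccard overlap as the similarity score are all assumptions made here, not part of any of the knowledge graphs listed.

```python
from collections import defaultdict


class ConceptGraph:
    """Minimal isA graph: each term maps to its direct hypernyms."""

    def __init__(self):
        self.hypernyms = defaultdict(set)

    def add_isa(self, hyponym, hypernym):
        self.hypernyms[hyponym].add(hypernym)

    def concepts_of(self, term):
        """All hypernyms reachable from term through isA edges."""
        seen, stack = set(), list(self.hypernyms.get(term, ()))
        while stack:
            concept = stack.pop()
            if concept not in seen:
                seen.add(concept)
                stack.extend(self.hypernyms.get(concept, ()))
        return seen

    def shared_concepts(self, *entities):
        """Concepts common to every entity: the 'explanation' for a query set."""
        concept_sets = [self.concepts_of(e) for e in entities]
        return set.intersection(*concept_sets) if concept_sets else set()

    def similarity(self, a, b):
        """Jaccard overlap of the two concept sets (an illustrative choice)."""
        ca, cb = self.concepts_of(a), self.concepts_of(b)
        union = ca | cb
        return len(ca & cb) / len(union) if union else 0.0


graph = ConceptGraph()
for school in ("Fudan University",
               "Shanghai Jiao Tong University",
               "University of Shanghai for Science and Technology"):
    graph.add_isa(school, "university in Shanghai")
graph.add_isa("university in Shanghai", "university")
graph.add_isa("Journey to the West", "novel")

shared = graph.shared_concepts("Fudan University",
                               "Shanghai Jiao Tong University",
                               "University of Shanghai for Science and Technology")
print(sorted(shared))
print(graph.similarity("Fudan University", "Shanghai Jiao Tong University"))
print(graph.similarity("Fudan University", "Journey to the West"))
```

The two universities have no string overlap that would reveal their relatedness, yet their concept sets coincide, while the university and the novel share no concept at all.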
Ontology integration in a multilingual e-retail system

Maria Teresa PAZIENZA (i), Armando STELLATO (i), Michele VINDIGNI (i), Alexandros VALARAKOS (ii), Vangelis KARKALETSIS (ii)

(i) Department of Computer Science, Systems and Management, University of Roma Tor Vergata, Italy
{pazienza, stellato, vindigni}@info.uniroma2.it
(ii) Software and Knowledge Engineering Lab., Institute of Informatics and Telecommunications, N.C.S.R. "DEMOKRITOS", Athens, Greece
{alexv, vangelis}@iit.demokritos.gr

Abstract
The advent of e-commerce and the continuous growth of the WWW have led to a new generation of e-retail stores. A number of commercial agent-based systems have been developed to help Internet shoppers decide what to buy and where to buy it from. In such systems, ontologies play a crucial role in supporting the exchange of business data, as they provide a formal vocabulary for the information and unify different views of a domain in a safe cognitive approach. Based on this assumption, inside CROSSMARC (a European research project supporting development of an agent-based multilingual information extraction system from web pages), an ontology architecture has been developed in order to organize the information provided by different resources in several languages. The CROSSMARC ontology aims to support all the different activities carried out by the system's agents. The ontological architecture is based on three different layers: (1) a meta-layer that represents the common semantics that will be used by the different system components in their reasoning activities, (2) a conceptual layer where the relevant concepts in each domain are represented and (3) a linguistic layer where language-dependent realizations of such concepts are organized.
This approach has been defined to enable rapid adaptation to different domains and languages.

1 Introduction
The continuous growth of the information on the Web and the proliferation of e-commerce sites are becoming overwhelming for consumers, who have to acquaint themselves with a huge number of sites that differ more in the amount of information provided, presentation style, and overall organization than in their content. Extracting semi-structured data from e-retail sites (and in general from the Web) is a complex task. Images, texts and other media that contain the relevant information are organized in a way suited to catching human attention rather than to being perceived as a rigorous and intelligible structure. The extraction task becomes even harder in our multilingual society, as web pages are typically written in different languages. Moreover, new product types are likely to appear in the market as technology evolves. This makes the customisation of existing systems and resources to new unforeseen scenarios an expensive and labour-intensive effort. In such systems, ontologies play a crucial role in supporting the exchange of data, providing a formal vocabulary for the information and unifying different views of a domain in a safe cognitive approach (Pazienza & Vindigni, 2002).
In this paper, we describe the knowledge model that has been adopted in CROSSMARC, an e-retail product comparison multi-agent system, currently under development as part of an EU-funded project, aiming to provide users with product information fitting their needs.
CROSSMARC technology operates in four languages (English, Greek, French and Italian) and is applied in two different product domains: computer goods and job offers.

2 CROSSMARC Architecture
The overall CROSSMARC architecture (see Fig. 1) is composed of a data processing layer (involving several language processing components), a database where relevant extracted information is stored and maintained, and a presentation layer (user interface).
Agents in the system can be broadly divided into three categories:
• retrieval agents, which identify domain-relevant Web sites (focused crawling) and return web pages inside these sites (web spidering) that are likely to contain the desired information;
• Information Extraction (IE) agents (a separate one for each language), which process the retrieved web pages, performing Named Entity Recognition and Classification (NERC) and Fact Extraction (FE), and finally populate a database with the extracted information;
• interface agents, which process users' queries, perform user modelling, access the database and supply users with product information in their native language.
This architecture has been chosen to facilitate customisation to new languages, application domains, and/or services and to support the asynchronous activities mentioned above.
The CROSSMARC ontology represents the shared domain model for the purpose of extracting structured descriptions of e-retail products. It embodies a language-neutral vocabulary of the domain, that is, a formal description of the relevant concepts that describe a specific product. In the overall processing flow, the ontology plays several key roles, as depicted in Figure 1:
• During Web page collection (focused crawling and spidering of web sites), it is used as a "bag of words", that is, a rough terminological description of the domain in all four languages that helps CROSSMARC crawlers and spiders to identify the interesting web pages.
• During Named Entity Recognition (NERC), it drives the identification and classification of relevant entities in textual descriptions (Grover et al., 2002). Also, the ontology's structure is used in cross-lingual name matching.
Each language-specific IE component identifies the ontological concepts for the entities that it recovers.
• During Fact Extraction and Normalization, entities identified in the NERC phase are correlated and aggregated to form language-independent product descriptions, exploiting the ontology structure. All the values of product features in the final description are normalised to their canonical representation provided by the ontology.
• During the final presentation of the results to the end user, the correlation among the four lexicons and the ontology makes it possible to provide cross-lingual descriptions of the same product. Thus, results are adapted to the language preferences of the end user, who can also compare uniform summaries of product descriptions from pages in different languages.

3 Knowledge model
The realisation of CROSSMARC functions demands background knowledge at different levels (i.e. lexical, ontological and task-oriented). Therefore, the ontological architecture is organized around three different layers:
• a meta-conceptual layer, which represents the common semantics that will be used by the different components of the system in their reasoning activities,
• a conceptual layer, where relevant concepts of each domain are represented, and
• an instances layer, where language-dependent realizations of such concepts are organized.
The current ontology architecture is implemented in Protégé-2000 (Noy et al., 2000), an ontology engineering environment that supports ontology development and maintenance. Protégé-2000 adopts a frame-based knowledge model, based on classes, slots, facets, and axioms. Classes are concepts in the domain of discourse, organized in a taxonomic hierarchy, while slots describe their properties. Facets and axioms specify additional constraints.
In Protégé-2000, individuals are instances of classes and classes are instances of metaclasses.
In CROSSMARC, metaclasses are used to constrain the intended interpretation of classes, slots and instances in the ontology. The basic idea is to define specific metaclasses to represent our model and use them to specify how the different elements are connected together (for instance, that a feature could have one or more attributes ranging over some values). In this way, extending CROSSMARC reasoning capabilities to a new semantic type requires extending the metaclasses to allow for the new type, declaring how it fits the rest of the knowledge representation, and instantiating it in the ontology.
The meta-conceptual layer of CROSSMARC defines how linguistic processors will work on the ontology, enforcing a semantic agreement by characterizing the ontological content according to the adopted knowledge model. The Protégé metaclass hierarchy has been extended by introducing a few metaclasses. These are used at the conceptual level to assign computational semantics to elements of the domain ontology. Basically, the metaclass extension provides a further typing of concepts, adding a few constraints for formulating IE templates.
The conceptual layer is organized around the three different modalities introduced in the previous section (see Fig. 2-c). The whole conceptual model is rooted under three main classes: DOMAIN-TEMPLATE, DOMAIN-ONTOLOGY and DOMAIN-LEXICON. Each of these represents a specific knowledge aspect in the overall organization.
The DOMAIN-TEMPLATE component is made of specific elements that bring computational semantics to the concepts in the ontology. This semantics is strictly related to the IE task and is provided by the definition of three meta-elements: Feature, Attribute and Value.
Essentially, the DOMAIN-TEMPLATE component describes a product offer as being composed of a set of features; this roughly corresponds to a structural description of "part-of" relationships (for instance, in the computer goods domain, features represent components such as screen, CPU, devices, and so on). Each feature is characterized by a certain number of specific attributes that can range over some domain (for instance, the CPU feature could be characterized by the "Processor Type" and the "Processor Speed" attributes). DOMAIN-TEMPLATE is the superclass of each IE template in a domain. Its subclasses further characterize the kind of template they model.
DOMAIN-ONTOLOGY roots all the domain-relevant concepts. No computation-specific semantics is adopted here, as this class (and its subclasses, the domain concepts) should only obey a more general knowledge model. In this way, knowledge engineers can model here, in a natural declarative form, the concepts, attributes and relations they feel are relevant to describe the domain. A specific subclass of DOMAIN-ONTOLOGY is the ABSTRACTION class, under which there are general abstract concepts such as measurement units and numeric ranges.
Support for lexical information (i.e. language-dependent realizations of ontology concepts) is rooted at the DOMAIN-LEXICON class and its subclasses, ENGLISH, FRENCH, ITALIAN and GREEK LEXICON. Each lexical instance of these is composed of three slots: REFER-TO, a reference to a FEATURE, DOMAIN-CONCEPT or ATTRIBUTE; SYNONYM, which holds multiple synonym terms describing a concept, feature or attribute; and REG-EXPR, which is the same as the previous slot but holds regular expressions. Subclassing DOMAIN-LEXICON into the four language lexicons makes it easier to partition lexical information among the languages (as these contents should be filled in by different information providers).
Fig. 2: Protégé (a) and CROSSMARC metaclasses (b) and Conceptual Layer (c)
The instance layer represents domain-specific individuals.
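The template and lexicon structure described above (features with attributes, and lexical entries with REFER-TO, SYNONYM and REG-EXPR slots) can be sketched in a few lines of Python. This is a hedged illustration rather than the CROSSMARC implementation, which lives in Protégé-2000: the names LexicalEntry, TEMPLATE, LEXICONS, identify and feature_of, as well as the sample synonyms and regular expressions, are assumptions made here; only the slot names and the CPU example come from the text.

```python
import re
from dataclasses import dataclass, field


@dataclass
class LexicalEntry:
    refer_to: str                                   # REFER-TO slot
    synonyms: list = field(default_factory=list)    # SYNONYM slot
    reg_exprs: list = field(default_factory=list)   # REG-EXPR slot

    def matches(self, text):
        """True if any synonym or regular expression occurs in the text."""
        lowered = text.lower()
        if any(s.lower() in lowered for s in self.synonyms):
            return True
        return any(re.search(p, text, re.IGNORECASE) for p in self.reg_exprs)


# DOMAIN-TEMPLATE: each feature lists its attributes (CPU example from the text).
TEMPLATE = {"CPU": ["Processor Type", "Processor Speed"]}

# DOMAIN-LEXICON: one lexicon per language, all pointing at the same attributes.
# The synonyms and patterns below are hypothetical sample data.
LEXICONS = {
    "ENGLISH": [
        LexicalEntry("Processor Speed",
                     ["processor speed", "clock speed"],
                     [r"\b\d+(?:\.\d+)?\s*[GM]Hz\b"]),
    ],
    "ITALIAN": [
        LexicalEntry("Processor Speed",
                     ["velocità del processore"],
                     [r"\b\d+(?:,\d+)?\s*[GM]Hz\b"]),
    ],
}


def identify(text, language):
    """Language-neutral attribute names whose lexical entries match the text."""
    return sorted({e.refer_to for e in LEXICONS[language] if e.matches(text)})


def feature_of(attribute):
    """The template feature that owns an attribute, or None."""
    for feature, attributes in TEMPLATE.items():
        if attribute in attributes:
            return feature
    return None


print(identify("Intel Pentium 4 at 2.4 GHz", "ENGLISH"))
print(identify("Processore a 2,4 GHz", "ITALIAN"))
print(feature_of("Processor Speed"))
```

Because both lexicons refer to the same attribute name, an English and an Italian page yield the same language-neutral result, which is the property that allows product descriptions from different languages to be compared.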
Instances in this layer instantiate classes in the domain ontology and fill the values for attributes of the domain templates. There are two kinds of instances in a CROSSMARC knowledge base: concept instances and lexical instances. Concept instances are subject to change over time as technology evolves: new manufacturers, products and components appear, while others become obsolete. Similarly, lexical instances can be refined, added, or adjusted to reflect linguistic bias (for instance, today no vendor is referring to Windows 95 when writing "Windows" in a product description).

4 Ontology Maintenance Process
Ontology maintenance concerns the addition, removal or reorganization of entries in the ontology. The impact of changes on the different ontology layers depends strongly on the modifications to be applied. Extending the number of ontology concepts is expected to have no impact at all on the design of CROSSMARC agents, nor on the database structure (as long as the FE maintains its original output data model); other actions, such as deleting concepts or entire branches, could potentially invalidate already extracted information. Lexical entries can be added, deleted or modified without affecting the conceptual layer, and the lexicon scheme can be modified as long as a reference to the concepts is maintained. On the other hand, changes in the conceptual layer usually affect the lexicons, and changes to the main fabric of the DOMAIN-TEMPLATE component (i.e. attributes and features) could obviously have a heavy impact on all of the CROSSMARC processing steps and reasoning capabilities.
The maintenance process is performed through six different phases (see Fig. 3): Domain Experts examine the current status of the specific domain to identify possible changes (1) and report the most significant ones to the Knowledge Engineers (2) in the form of new and/or misused concepts; Linguistic Content Providers give coherent lexicalisations to the changes reported by the Domain Experts (3) and discuss them with the Knowledge Engineers; these then prepare new models for the ontology (4), taking into account the modifications to the domain model, and submit them to the Ontology Administrator (5), who prepares a new version of the ontology based on the proposals received from the Knowledge Engineers and releases it to the community; finally (6), Knowledge Engineers adapt the lexicalisations by the Linguistic Content Providers to the concepts specified in the new version of the ontology.
Fig. 3: Ontology Maintenance flow of processes

5 Conclusions
On the WWW, the construction and use of ontologies have begun to replace the old-fashioned ways of exchanging business data in weakly standardized formats with a standard syntax (like XML/RDF) that adheres to semantic specifications given through ontologies. Ontologies help standardize the meaning of core concepts across different components and applications and facilitate the construction of services able to draw information from various sources in a uniform manner. However, any such standardization in the highly dynamic and expanding world of the Web is bound to face considerable difficulties, due to the effort involved in adapting existing content to the new standards and in following these standards in the construction of new content. The CROSSMARC ontology design and maintenance process aims to integrate these experiences by providing a general methodology that can be easily adapted across different domains and languages.
The organization of the ontology has been designed to be applied to different domains without changing the overall structure, simply by changing the relevant values; this has been obtained by decoupling the lexical (language-dependent) component from the conceptual one and inscribing the domain model in a widely assessed framework. This architecture also provides us with a language-independent and homogeneous approach for presenting data in the graphical user interface.

References
C. Grover, S. McDonald, D. Nic Gearailt, V. Karkaletsis, D. Farmakiotou, G. Samaritakis, G. Petasis, M. T. Pazienza, M. Vindigni, F. Vichot and F. Wolinski (2002). Multilingual XML-Based Named Entity Recognition for E-Retail Domains. 3rd International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas, Spain.
N. F. Noy, R. W. Fergerson and M. A. Musen (2000). The knowledge model of Protege-2000: Combining interoperability and flexibility. 2nd International Conference on Knowledge Engineering and Knowledge Management (EKAW'2000), Juan-les-Pins, France.
M. T. Pazienza and M. Vindigni (2002). Language-based agent communication. 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW'02), Sigüenza, Spain.