RDF based architecture for semantic integration of heterogeneous information sources

合集下载

基于知识图谱问答系统的技术实现

文章编号：2096-1472(2021)-02-38-07DOI:10.19644/ki.issn2096-1472.2021.02.008软件工程 SOFTWARE ENGINEERING 第24卷第2期2021年2月V ol.24 No.2Feb. 2021基于知识图谱问答系统的技术实现魏泽林1,2，张帅1,2，王建超1,2(1.大连东软信息学院，辽宁大连 116023；2.大连东软教育科技集团有限公司研究院，辽宁大连 116023)*******************;*********************;***********************摘要：知识图谱是实现对话机器人的一类重要工具。

如何通过一套完整流程来构建基于知识图谱的问答系统是比较复杂的。

因此，本文从构建基于知识图谱的问答系统的全流程角度总结了多个主题：知识图谱类型、知识图谱构建与存储、应用在知识图谱对话中的语言模型、图空间内的语义匹配及生成。

进一步，本文在各主题的垂直领域归纳了常用方法及模型，并分析了各子模块的目的和必要性。

最后，本文通过总结出的必要模块及流程，给出了一种基于知识图谱的问答系统的基线模型快速构建方法。

该方法借助了各模块的前沿算法且有效地保证了拓展性、准确性和时效性。

关键词：知识图谱；问答系统；对话机器人；语言模型；语义匹配中图分类号：TP183 文献标识码：AImplementation of Question Answering based on Knowledge GraphWEI Zelin 1,2, ZHANG Shuai 1,2, WANG Jianchao 1,2( 1.Dalian Neusoft University of Information , Dalian 116023, China ;2.Research Institute , Dalian Neusoft Education Technology Group Co . Limited , Dalian 116023, China )*******************;*********************;***********************Abstract: Knowledge graph is an important tool for realizing chatbots. The lifecycle of constructing a question answering system based on knowledge graph is a complex task. This paper summarizes a number of topics from the perspective of building a knowledge graph-based question answering system. The topics include knowledge graph types, knowledge graph construction and storage, language models used in knowledge graph dialogue, semantic matching and generation in graph space. Furthermore, this paper summarizes commonly used methods and models in vertical areas of topics, and analyzes the purpose and necessity of sub-modules. A method for quickly constructing a baseline model of a knowledge graph based question answering system will be presented. The proposed method relies on the cutting-edge algorithms and effectively guarantees scalability, accuracy and timeliness.Keywords: knowledge graph; question answering system; chatbot; language model; semantic matching1 引言(Introduction)知识问答系统在二十世纪五六十年代时就已经出现。

CAM某相关论文的讲解

CAM
情境化注意元数据
1
CAM？
What is CAM？
AM：指用户在使用普遍网站、Wiki、Blog、文本交流、电子邮件等资源的过程中，被用户注意的合引起用户注意的任何内容。 CAM: 情境化注意元数据是用于描述用户某个情境下发生的一系列行为的数据。
界内普遍认可的观点是由比利时天主教荷语鲁汶大学计算机系超媒体和数据库研究组的Jehad Najjar等人提出来的。
① the number of diﬀerent types of events of the MACE system that have occurred during certain time intervals (a day, a month, or an arbitrary time interval)； ② as well as an ordered classification of the resources that were accessed the most for each type of event with the number of occurrences.
3
Applying the RDF-CAM binding
3
RDF-based Zeitgeist for MACE
MACE: Metadata for Architectural Contents in Europe
It provides advanced graphical access to large amounts of architectural learning resources stored in various repositories that are distributed all over Europe. All interactions with the MACE portal are monitored and stored as CAM data using the XML binding. Zeitgeist for MECE??

Oracle Retail Analytics和计划云服务下一代云更新指南May 2023 发布

Oracle Retail Analytics and Planning Cloud ServiceNext Gen Cloud Update GuideMay 2023 | Release 23.1.201.0Copyright © 2023, Oracle and/or its affiliatesDisclaimerThis document in any form, software or printed matter, contains proprietary information that is the exclusive property of Oracle. Your access to and use of this confidential material is subject to the terms and conditions of your Oracle software license and service agreement, which has been executed and with which you agree to comply. This document and information contained herein may not be disclosed, copied, reproduced or distributed to anyone outside Oracle without prior written consent of Oracle. This document is not part of your license agreement nor can it be incorporated into any contractual agreement with Oracle or its subsidiaries or affiliates.This document is for informational purposes only and is intended solely to assist you in planning for the implementation and upgrade of the product features described. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described in this document remains at the sole discretion of Oracle.Due to the nature of the product architecture, it may not be possible to safely include all features described in this document without risking significant destabilization of the code.TABLE OF CONTENTSDisclaimer 1 Introduction 3 Document Summary 3 Overview of Next Generation SaaS Architecture 3 Assumptions 3 Getting Started 5 Customer Actions for Retail Planning Solutions 5 Updated Solution URLs 5 Authentication Changes 5 File Transfer Services 6 Retail Analytics and Planning Core Update 7 Retail Analytics and Planning Core Plus 8 Customer Actions for Retail Insights and AI Foundation Solutions 9 Updated Solution URLs 9 Authentication Changes 9 File Transfer Services 10 Customer Actions for POM 11 Update Service Endpoint URLs 11 Authentication Changes 11 References 12IntroductionDocument SummaryThis document provides general enablement for Oracle Retail customers moving to Oracle's Next Generation Retail Analytics and Planning SaaS architecture. These checklists and resources capture major customer activities and milestones. Retailers should use these checklists early in the Next Gen Cloud Update phase. The checklists allow retailers to ask key questions while working with technical staff and partners. In addition, the checklists provide considerations for implementation planning. Once completed, the checklists can be used to set expectations among all parties and saved for future reference. Overview of Next Generation SaaS ArchitectureAll of Oracle’s Planning, Retail Insights, and Retail AI Foundation Cloud Services are moving to Oracle’s Next Ge neration SaaS Architecture. This is a cloud-native architecture that provides a secure, highly scalable foundation with increased system availability. This new architecture yields the following benefits:▪Significantly reduced downtime.▪Full adoption of OAuth 2.0 for all REST services.▪Significant improvements in middle-tier and application-tier scalability.▪Improved, simplified intra-Oracle Retail integration.▪ A centralized Oracle Retail Business Intelligence instance for easier reporting administration.▪Retirement of SFTP in favor of a service-based approach. Refer to the "Managing File Transfers" section in the RAP Administration Guide.▪Version 19 should be read-only once the Next Gen production update is complete and and the environment is handed over to the customer.AssumptionsNote the following assumptions regarding the Next Gen Cloud Update:▪Update of a Non-production environment in the New Gen Cloud occurs first to enable the customer to perform their development activities and prepare before the Production update.▪All batch files in the Version 19 (or earlier) environment must be processed, and there should be no files remaining in the SFTP folder.▪Update activity will be performed after the Weekly or Nightly batch cycle in the version 19 environment is complete.No jobs should be pending.▪If all the Oracle Retail Cloud services are currently using the same IDCS or OCI IAM instance, then no changes should be made to the IDCS or OCI IAM. If you are using different IDCS or OCI IAM instances for different OracleRetail Cloud services, they will be merged into a single instance. There is then a customer action to reset thepasswords after the merge.Retail Analytics and Planning Solutions Next Gen Update ChoicesUpdating to the Next-Gen Architecture for these solutions is a complete update process and all features and functionality currently available are unchanged. This also covers any solution built on AIF:Inventory Optimization (IO)Promotion & Markdown Optimization (PMO)Offer Optimization (OO)Assortment & Space Optimization (ASO)There are two choices when updating Planning Solutions: Core Update and Core Plus Update. Refer to Planning Core Update vs Planning Core Plus Update for more information.Getting StartedTo get the process started, please reach out to your Customer Success Manager (CSM) to receive more information about Oracle’s next-generation architecture, discuss the steps laid out in this document, and identify target dates and timelines that you want to follow to complete the upgrade. You may also raise a Service Request in Oracle Support to obtain more information or to request that Oracle schedule the upgrade for you based on your business needs.Customer Actions for Retail Planning SolutionsDue to the technical changes in Oracle's Next Generation SaaS architecture, the actions below are to be performed by the customer. This section applies to any current customer of a Retail Planning or Supply Chain solution based on the Retail Predictive Application Server (RPAS) architecture.Oracle will provision next-gen Cloud Services environment and migrate your existing configurations and data before it is handed over to you for validation. For details, please refer to the following sections.Along with the steps mentioned below, please refer to Next Gen Update Video for detailed information.Updated Solution URLsAuthentication ChangesFile Transfer ServicesRetail Analytics and Planning Core UpdateRetail Analytics and Planning Core PlusAlong with Retail Analytics and Core Update, followingare the additional capabilities.Built on top of AI Foundation for Analytics, Reporting,Data Mining, Intelligence, and ForecastingPlans and Forecasts are sent/received to/from AIFoundation. Oracle Retail AI Foundation providesspecific insight and value to planning, buying, selling,and moving decisions through Customer Segmentation,Advanced Clustering, Demand Transference, AffinityAnalysis, and Size Profiling and can be expanded tosupport broader retail uses as required.Refer to the Oracle Retail Analytics and PlanningImplementation Guide for Planning integration with AIFoundation:Refer to the Retail AI Foundation Cloud Service guideshere for more information on AI Foundation:Customers choosing Core Plus are expected to use POMto schedule their batches.Customer Actions for Retail Insights and AI Foundation SolutionsDue to the technical changes in Oracle's Next Generation SaaS architecture, the actions below are performed by the customer. This section applies to any current customer of Retail Insights, AI Foundation, and all associated applications such as Offer Optimization and Inventory Optimization.Updated Solution URLsAuthentication ChangesFile Transfer ServicesCustomer Actions for POMPOM batch schedules will change in Next Gen SaaS Architecture. Please refer to the section “RPASCE Batch Schedule with POM” in the RPAS Administration Guide for information about setting up new schedules.Update Service Endpoint URLsAuthentication ChangesReferencesRefer to the Release 23.1.201.0 documentation at the following URL: https:///en/industries/retail/index.htmlCONNECT WITH USCall +1.800.Oracle1 or visit Outside North America, find your local office at /contact /oracle /oracleCopyright © 2023 Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only, and the contents hereof are subject to change withoutnotice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document, and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, withoutour prior written permission.Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC。

陈为博士

陈为博士浙江大学CAD&CG国家重点实验室310058, 杭州，中国电话: 0086-571-88206681-522传真: 0086-571-88206680电邮: chenwei@ shearwarp@主页: /home/chenwei陈为，1976年生，博士，副教授，IEEE会员。

1996年本科毕业于浙江大学应用数学系，2000年6月至2002年6月在德国Fraunhofer图形研究所攻读联合培养博士，2002年9月进入浙江大学CAD&CG国家重点实验室工作，2004年12月晋升副教授，2009年12月晋升教授。

2006年7月至2008年9月受浙江大学新星计划资助，在美国普渡大学从事访问研究。

已(含合作)培养博士研究生5名，硕士研究生12名，主持国家973项目子课题两项、863高科技项目一项，国家自然科学基金面上项目和浙江省自然科学基金各两项。

研究兴趣包括科学计算可视化和可视分析，在IEEE Transactions on Visualization and Computer Graphics, IEEE Visualization, Eurographics, EuroVis, CVIU, ACM I3D, Pacific graphics, Computer-Aided Design, IEEE CG&A, The Visual Computer, Journal of Computer Animation and Virtual Worlds，ACM VRST等国际重要期刊和会议发表和录用论文三十余篇，被国际论文他引100余次；在国内的重要期刊计算机科学与技术学报(JCST)、软件学报、计算机学报、计算机辅助设计与图形学学报等发表论文二十余篇；SCI和EI索引各30余次；合著出版教材1部，已经再版三次，版权输出到台湾地区；译著1部。

.担任《计算机辅助设计与图形学学报》编委，受邀担任国际著名学术会议程序委员会委员多次（IEEE Visualization, Pacific Graphics, CGI, Pacific Vis，CGIM等），以及ACM SIGGRAPH, IEEE Visualization, Eurographics, C&G, IEEE Transactions on Image Processing, Pacific Graphic, MICCAI等著名国际杂志和会议审稿人。

基于关键词的语义搜索

上海交通大学硕士学位论文基于关键词的语义搜索姓名：周琦申请学位级别：硕士专业：计算机应用技术指导教师：俞勇20090101基于关键词的语义搜索摘要语义搜索虽然提出多年，但是为了能够运用精确的语义搜索技术, 用户必须熟知本体中的结构和相应的知识表示，并且具备写形式化查询语言的能力。

所以目前语义搜索的应用人群往往还停留在专家用户这个层面上，这已经成为阻碍语义搜索发展的严峻问题。

在另一方面，广大的互联网用户仍然习惯于传统基于关键词的搜索方式, 即使搜索引擎提供了简单的布尔表达式查询来增加查询的准确性，然而几乎所有的用户都不用这种最简单的逻辑表达式来进行日常的查询。

所以，如果能够将关键词查询这种方式无缝地衔接到语义搜索上，那么对于互联网搜索的广大用户来说，他们就可以凭借习惯的关键词作为手段而达到语义搜索的准确结果。

本文提出了一种将用户输入的关键字自动翻译，并且排序成语义搜索查询的方法，并且实现了系统原型SPARK来验证我们的方法。

这样通过输入关键词，用户就能获得语义搜索引擎所能接受的形式化语义查询。

但由于关键词和语义查询之间的巨大差异，有三个主要的难点需要克服：1) 关键词的歧义性:同一个关键词在不同的背景下有不同的含义, 如何区分用户想要表达的真正含义是一个重要的问题。

2)关系的缺失:在传统的关键词搜索中，词与词之间没有显式的修饰关系，如何区别查询的主要部分和修饰部分的关系非常困难。

3)翻译结果的多样性: 歧义性和属性缺失会造成大量可能的查询，如何挑选出符合用户信息需求的查询就显得非常重要。

为了解决以上的问题，我们针对性提出了将关键词翻译成为系统所能接受的语义查询的三阶段步骤：通过多种单词-本体资源匹配方法解决了多义问题，通过有效的查询图生成算法较好地解决了关系缺失的问题，最后通过查询排序的方法对生成的语义查询进行评价，向用户反馈相关的语义搜索查询。

在SPARK的实现中，用户可以输入任意表达信息需求的关键词，系统根据算法最后生成一系列符合用户信息需求并且可以被语义搜索接受的SPARQL查询语句反馈给用户，或用户可以直接将这些SPARQL查询提交给执行引擎以获得语义搜索的结果。

远程教育中智能答疑系统的设计与实现完整

远程教育中智能答疑系统的设计与实现完整文档资料可直接使用，可编辑，欢迎下载北京交通大学硕士学位论文远程教育中智能答疑系统的设计与实现姓名：胡娜申请学位级别：硕士专业：教育技术学指导教师：赵宏20071201ｊｂ塞銮道盔堂亟±堂僮迨塞生塞翅垂中文摘要摘要：随着网络技术的发展和网络应用的普及，依托于网络技术的远程教育正在迅猛地发展。

基于网络环境下的教育模式，采用的是探索式学习方式，它支持学生根据自己的情况，浏览相关的教学资源，实现优秀教育资源和教育方法的共享。

但是，在远程教学中，学生和教师是时空相对分离的，学生无法与教师直接交流，于是答疑作为其教学活动中的一个重要环节，正日益引起人们的关注。

设计一个好的远程教育答疑系统，能及时有效地解决学生在学习过程中历产生的疑问，这样可以提高远程学生的学习效率，保证远程教育的质量。

一般的答疑系统采用的是基于搜索引擎的关键字查询方式，这种答疑系统需要学生自己输入关键字进行提问，对学生提炼总结关键字的能力有一定要求，并且搜索的效果并不理想，需要学生进一步来筛选系统反馈的答案，使得学习效率不高，这种答疑系统有必要进一步优化。

智能答疑系统是一个具有知识记忆、数据计算统计、逻辑推理、知识学习和实现友好人机交互的智能系统，其本质是一个具有智能性的知识系统。

它支持自然语言的提问、自动检索问题并呈现有效答案，能够通过学习自动扩展和更新答案知识库。

它的这些特点，使学生在学习时能够使用自己熟悉的方式表达问题，并能够及时获得与问题较为相关的一些反馈答案。

本文首先论述了研究智能答疑系统的背景和意义，并在分析了远程教育模式特点及对比了现有的答疑系统的基础上，对答疑系统做了统一的设计和开发，提出了一个基于本体以及ＸＭＬ的智能答疑系统的设计，初步建立了本体库以及知识库，给出了完整的体系结构及其架构开发模式，并对开发智能答疑系统环境中的关键技术进行了深入的研究，最后给出了智能答疑系统的实现方法。

基于列数据库的RDF数据管理实现

RDF 三元组描述 Serge Abiteboul，Rick Hull，Victor Vianu 写了本书名 “Foundations of Databases”的书的信息如下：
（ person1，isNamed，" Serge Abiteboul" ）（ person2，isNamed，" Rick Hull" ）（ person3，isNamed，" Victor Vianu" ）（ book1，hasAuthor，person1）（ book1，hasAuthor，person2）（ book1，hasAuthor，person3）（ book1，isTitled，" Foundations of Databases" ） RDF 数据模型的基本对象类型由资源、文字和陈
0引言
随着语义网的发展，产生了大量的 RDF 数据来描述信息，特别是在 Linking Open Data （ LOD）［1］项目的努力下，网络上越来越多的开放数据以 RDF 格式发布，领域已涉及到生命科学、地理信息、百科全书和社会媒体等方面，截止到 2011 年 9 月，RDF 的总量已达 310 亿条三元组。面对如此海量的 RDF 数据，对如何有效的存储和快速的查询信息，提出了更高的要求。列存储是近几年国内外研究的热点技术，数据按列组织存储，更适合数据压缩，查询时，也只需读取与查询相关的列，大大减小了读取数据量，查询性能上优于传统的关系数据库［2］。
文中在分析比较了几种 RDF 存储模式的基础上，设计了一种基于列存储的 RDF 管理方式。

Sponsored by

Order from Chaos From Semi-Structured DataVol. 3, No. 8 - October 2005by Natalya Noy, Stanford UniversityAs more ontologies become available, it becomes harder, rather than easier, to find an ontology to reuse. There is probably little argument that the past decade has brought the “big bang” in the amount of onlineinformation available for processing by humans and machines. Two of the trends that it spurred (amongmany others) are: first, there has been a move to more flexible and fluid (semi-structured) models thanthe traditional centralized relational databases that stored most of the electronic data before; second,today there is simply too much information available to be processed by humans, and we really need helpfrom machines. On today’s Web, however, most of the information is still for human consumption in oneway or another.Both of these trends are reflected in the vision of the Semantic Web, a form of Web content that will beprocessed by machines with ontologies as its backbone. Tim Berners-Lee, James Hendler, and OraLassila described the “grand vision” for the Semantic Web in a Scientific American article in 2001:1Ordinary Web users instruct their personal agents to talk to one another, as well as to a number of otherintegrated online agents—for example, to find doctors that are covered by their insurance; to scheduletheir doctor appointments to satisfy both constraints from the doctor’s office and their own personalcalendars; to request prescription refills, ensuring no harmful drug interactions; and so on. For thisscenario to be possible, the agents need to share not only the terms—such as appointment, prescription,time of the day, and insurance—but also the meaning of these terms. For example, they need tounderstand that the time constraints are all in the same time zone (or to translate between time zones), toknow that the term plans accepted in the knowledge base of one doctor’s agent means the same as healthinsurance for the patient’s agent (and not insurance, which refers to car insurance), and to realize it isrelated to the term do not accept for another doctor, which contains a list of excluded plans. Such seamless conversation between software agents that were not initially designed to work together isthe Holy Grail of Semantic Web research. Regardless of whether this Holy Grail is ever completelydiscovered (or invented), as with any “grand challenge,” this vision drives cutting-edge research,attracting researchers from artificial intelligence, databases, information integration, data mining, naturallanguage processing, user interfaces, social networks, and many other fields. Simply constructing piecesand components of the Holy Grail is by itself a fruitful and worthwhile endeavor that will produce many useful discoveries along the way. To make this sort of seamless interaction between software agents possible, the agents must share semantics, or meaning, of the notions that they operate with. These semantics are expressed in ontologies, which contain the explicit definitions of terms used by the agents. These definitions are represented in languages where each construct has a formal explicit meaning that can be unambiguously interpreted by humans and machines. While there are many definitions of an ontology,2 the common thread is that an ontology is some formal description of a domain, intended for sharing among applications, expressed in a language that can be used for reasoning.Since the underlying goal of ontology development is to create artifacts that different applications can share, there is an emphasis on creating common ontologies that can then be extended to more specific domains and applications. If these extensions refer to the same top-levelontology, the problem of integrating them can be greatly alleviated. Furthermore, ontologies are developed for use with reasoning engines that can infer new facts from ontology definitions that were not put in explicitly. Hence, the semantics of ontology languages themselves aredebugging/qa developertools performance optimizationSponsored byTechExcel security Sponsored by Microsoft services-oriented architecture Development Tools DirectorySubscribe to Queue in print. It's FREE!First Name Last Name Company Street City State/Province Zip/Postal CodeEmailexplicit and expressed in some formal language such as first-order logic.In the past few years, the ontology has become a well-recognized substrate for research in informatics and computer science.3 The word ontology now appears on almost 3 million Web pages; the Swoogle crawler () indexes more than 300,000 ontologies and knowledge bases on the Web. For our purposes, we define ontology to be an enumeration of the concepts—and the relationships among those concepts—that characterize an application area. Ontologies provide an explicit framework for talking about some reality—a domain of discourse—and offer an inspectable, editable, and reusable structure for describing the area of interest. Ontologies have become central to the construction of intelligent decision-support systems, simulation systems, data-integration systems, information-retrieval systems, and natural-language systems. W3C has developed RDF (Resource Description Framework), RDF Schema,4 and OWL (Web Ontology Language),5 standard languages for representing ontologies on the Semantic Web.Although the idea of sharing formal descriptions of domains through ontologies is central to the Semantic Web, don’t assume that everyone will subscribe to only one or a small number of ontologies. A common misconception about the Semantic Web (and the reason many dismiss it outright) is that it relies on everyone sharing the same ontologies. Rather, the Semantic Web provides two advantages: first, formally specified ontology languages that make semantics expressed in these languages explicit and unambiguous and therefore more amenable to automatic processing and integration; second, the infrastructure to use ontologies on the Web, extend them, reuse them, have them cross-reference concepts in other ontologies, and so on. For example, if I am setting up my own online store, I can reuse ontologies for billing, shipping, and inventory, and extend and customize them for my own store. If someone else reuses the same inventory ontology, we could be part of the same portal that searches through both our inventories. Furthermore, we will know exactly what we mean by the term address (whether it is shipping or billing address), and that number in stock is the number of items, not the number of crates of those items.One of the main challenges of semi-structured data is to move away from the fairly rigid and centrally controlled database schemas to the more fluid, flexible, and decentralized model. Similarly, on the Semantic Web, there is no central control. Just as anyone can put up his or her own page on the Web and anyone can point to it (and say either good or bad things about it), on the Semantic Web anyone can post an ontology, and anyone can reuse and extend any ontology that is appropriate to his or her task. Naturally, this model brings up issues of finding the right ontologies, evaluating them, trusting the sources they come from, and so on. We discuss these issues later in the article.Problems The Semantic Web Does and Doesn’t SolveHow are ontologies and the Semantic Web different from other forms of structured and semi-structured data, from database schemas to XML? Perhaps one of the main differences lies in their explicit formalization. If we make more of our assumptions explicit and able to be processed by machines, automatically or semi-automatically integrating the data will be easier. Here is another way to look at this: ontology languages have formal semantics, which makes building software agents that process them much easier, in the sense that their behavior is much more predictable (assuming they follow the specified explicit semantics—but at least there is something to follow).6This explicit machine-processable description of meaning is the key difference between XML and ontology languages such as RDF and OWL: In XML, some of the semantics are implicit, encoded in the order and nesting of components in the XML document; in ontology languages, the semantics are explicit, having underlying axioms and formal descriptions. To represent an order of sentences in RDF, you need to use an explicit structure, such as an RDF list, to specify the order, saying explicitly which one comes first, which one comes next, and which one is last, rather than simply positioning things in the order that you want in the serialized document. Conversely, the order in which statements appear in an RDF document is irrelevant, as an underlying structure is a graph, and it is the links between the elements that are important. In fact, it is perfectly legal for an RDF parser to read in an RDF document and write it out, with triples appearing in a completely different order in the document—the represented model will still be exactly the same.So, what does the Semantic Web bring to the table today, and what is it likely to bring in the near future? One of the key hypotheses behind the Semantic Web is that by making lots of ontologies—explicit and machine-processable descriptions of domain models—available on the Web, we can encourage people to reuse and extend them.The Semantic Web infrastructure encourages and supports publication of ontologies. Hopefully, this infrastructure will encourage agents to reuse existing ontologies rather than create new ones. When two agents use the same ontology, semantic interoperability between them is greatly facilitated: they share the same vocabulary and the understanding of what classes, properties, and individuals in that vocabulary mean, and how they relate to one another. Easy access to and availability of ontologies in standard formats that the Semantic Web infrastructure provides should enable and facilitate such reuse.In addition to providing the infrastructure and social pressure to share domain models, recent developments in Semantic Web standards provide other key components that facilitate semantic interoperability: standard languages and technical means to support semantic agreement.W3C recommendations for RDF, RDF Schema, and OWL established a set of standard XML-based ontology languages for the first time. While formal ontologies have existed, they all used different languages, underlying knowledge models, and formats. Having a set of standard languages for ontology interchange and reuse should also facilitate and encourage reuse of ontologies.Semantic Web languages such as RDF and OWL provide the technical means to facilitate interoperability: the use of namespace URIs (uniform resource identifiers) in RDF and OWL, and imports in OWL, enable specific unambiguous references from concepts in one ontology to concepts in another, thus enabling reuse of ontologies and their components across the Web. For example, a user developing an application that deals with wines can declare specific wine individuals to be instances of the Wine class defined in a shared ontology elsewhere. Then, anyone searching for an instance of that Wine class on the Web will get this user’s wine as a result to the query. Furthermore, language constructs in OWL allow you to relate explicitly meanings of terms in different ontologies. For example, if my ontology does not have a concept of Wine, but rather I have the concepts Red Wine and White Wine, I can explicitly declare that the union of these two concepts from my ontology is equivalent to the class Wine in this other, widely shared ontology.The Semantic Web, RDF, and OWL are not by themselves the answer for seamless interoperability. Agents need to share and reuse ontologies published in RDF and OWL rather than choose their own, to reuse them correctly and consistently, or to create correspondences between terms in different ontologies if they choose to reuse different ones. More specifically, several hurdles need to be overcome to make interoperability truly seamless:Incorrect or inconsistent reuse of ontologies. Reusing ontologies is hard, just as reusing software code is. The Semantic Web makes it likely that people will reuse (portions of) ontologies incorrectly or inconsistently. Semantic interoperability, however, will be facilitated only to the extent that people reference and reuse public ontologies in ways that are consistent with their original intended use. For example, if an agent uses the Dublin Core property creator (/dc/elements/1.1/creator) to represent anything other than the creator of the document, interoperating with others that use the property correctly will be a problem.Finding the right ontologies. To reuse an ontology, one needs to find something to reuse. Users must be able to search through available ontologies to determine which ones, if any, are suitable for their particular tasks. Furthermore, even though internally the Semantic Web is for machines to process, user-facing tools must be available that enable users to browse and edit ontologies, to custom-tailor their tasks, to create different views and perspectives, to extract appropriate subsets, and so on. One such tool platform—Protégé—is presented in the next section. Using different ontologies. If two agents operating in the same domain choose to use different ontologies, the interoperability problem will still exist even if their ontologies and data are in OWL and RDF. The agents will then need to create mappings between terms in their respective ontologies, either manually or with the help of tools. While OWL provides some constructs to express the mappings, it does not free users from the necessity of finding and declaring these mappings. Therefore, just as schema mapping and integration are crucial in the database world, the mapping problem is still very much alive in the ontology world.In summary, to enable semantic interoperability among agents, several key problems must be addressed:ers must be able to search through available ontologies to determine which ones, if any, are suitable for their particular tasks;2.Tools and techniques must be available that enable custom tailoring of the ontologies being reused to the users’ tasks, views andperspectives, and subsets appropriate for the users;3.Just as schema mapping and integration are crucial in the database world, ontology mapping and integration will remain a seriousproblem, as there will be ontologies that overlap and cover similar domains, but still bring their unique values to the table.The Protégé StoryWith ontologies being the key technology in the Semantic Web, ontology tools will be essential to its success. In the Protégé group at Stanford Medical Informatics (), we have been working for more than two decades on technologies to support effective development and use of ontologies by both knowledge engineers and domain experts. Protégé is an open source ontology editor and knowledge-acquisition system that arguably has become the most widely used ontology editor for the Semantic Web. It has around 30,000 registered users, an active discussion forum, and an annual international users conference.More important, Protégé is also a platform for developing knowledge-based applications, including those for the Semantic Web. It provides a Java-based API for developing plug-ins, and a thriving community of developers all over the world develop plug-ins for tasks ranging from different ways to visualize ontologies and knowledge bases, to using wizards that help in ontology specification, to invoking different problem-solvers, reasoners, and rule engines with the data in the ontologies, to performing ontology mapping, comparing ontology versions, importing and exporting other ontologies, and so on (/download/plugins.html). The large user community and popularity of the Protégé system is a testament to the idea that using ontologies in software systems is gaining popularity rapidly (Protégé gets several hundred new registrations each week).One of the main goals of Protégé has always been accessibility to domain experts. For example, there are a number of visualization plug-ins (/cgi-bin/wiki.pl?ProtegePlug-insLibraryByTopic) that present concepts and relations as diagrams, or allow users to define new concepts and relations in a knowledge base by drawing a flowchart where nodes and edges are actually complex objects(/doc/tutorial/graph_widget/).As ontologies become more of a commodity, their collaborative development within organizations, and outside single organizations, becomes essential. Protégé supports multiuser ontology development, allowing multiple users to access an ontology server to edit the same ontology simultaneously. While Protégé does support ontology comparison and versioning (/plugins/prompt/prompt.html), there are now frequent requests to support versioning just as seamlessly as software code (versioning is supported in modern development environments).Through the experience in the Protégé group, we are finding more interest in the industry to use ontologies to model the IT structure of an organization and its business processes. On the one hand, many companies are realizing that a huge number of various components in their systems need to be described explicitly to understand how they relate to one another and how changes or malfunctions in one component will affect other components. On the other hand, for many companies, it is not the tools themselves that constitute their main intellectual-property value; rather, this value is in the business-process descriptions, in the company’s understanding of the domain, and in their methods of performing domain analysis and generating software tools from this analysis. Formal ontologies, through their system-independent descriptions that are somewhat removed from the actual implementation, force the knowledge bearers to think and encode their knowledge in abstract terms that may be more readily used by the tools to generate necessary software.DaimlerChrysler, for example, a member of the Protégé Industrial Affiliates Program since 2001, has developed a broad spectrum of applications for employing semantic technologies to support an improved engineering process. Their main focus is knowledge representation and management, semantic portals, parametric design, intelligent decision support, configuration, and semantic information integration. DaimlerChrysler Research and Technology has built a framework, based on Protégé, for developing ontology-based applications customized for the engineering domain. The framework, including both a toolset and an application development methodology, has been successfully used in pilot applications for supporting different product lifecycle phases such as product design, marketing, sales, and service.Another example of a Protégé industrial partner is Exigen Group, based in San Francisco. Exigen uses Protégé interactively to model background knowledge about the business domain. The business model and rules are used to create BPM (business process management) applications. Information extraction techniques are used to develop repositories of ontologies and rules from business documents. An OWL reasoner validates the extracted knowledge for the ontology component, and a rule engine validates the rule component. Protégé is employed to correct and augment the knowledge interactively, parts of which can then be fed back as enriched background knowledge. The knowledge base describes interfaces and migration paths among Exigen document archives, thus providing the semantics of document transformations.The Challenging Road AheadThis article has already mentioned a number of challenges facing both the Semantic Web community and any other community that attempts to apply knowledge- and data-intensive solutions on the Web scale. Here are some additional challenges.Matchmaker, Matchmaker, Find Me a Match. The Semantic Web vision critically depends on the ability of users to reuse existing ontologies (to enable interoperation, among other things), which in turn requires that users are able to find an ontology they need. Different kinds of ontology repositories exist today: some are generated by crawling the Web (e.g., ), some are curated (e.g., ), and some allow domain experts to add their ontologies to them (e.g., Open Biomedical Ontologies;). These repositories, for the most part, are just places to store and retrieve ontologies, however, at best enabling simple cross-ontology search. They do not enable their users to evaluate the ontologies in the repository, to search them intelligently, to know how ontologies were used, and what other users in the field think about various aspects of particular ontologies.As more ontologies become available, it becomes harder, rather than easier, to find an ontology to reuse for a particular application or task. Even today (and the situation will only get worse), it is often easier to develop a new ontology from scratch than to reuse someone else’s existing ontology. First, ontologies and other knowledge sources vary widely in quality, coverage, level of detail, and so on. Second, in general, few, if any, objective and computable measures are available to determine the quality of an ontology. Deciding whether an ontology is appropriate for a particular use is a subjective task. We can often agree on what a bad ontology is, but most people would find it hard to agree on a universal “good” ontology: An ontology that is good for one task may not be appropriate for another. Third, while it would be helpful to know how a particular ontology was used and which applications found it appropriate, this information is almost never available today.7It’s the Web, Stupid. The knowledge-representation community has always dealt with somewhat small, isolated problems, and knowledge bases haven’t had to interoperate a lot. Moving knowledge representation to the Web creates many unique challenges. Scalability is one. Some Protégé users already have ontologies with many tens of thousands of classes.8 This is well below the limit for storage. While effective retrieval of concepts and editing can be done seamlessly, and using reasoners for ontologies of this size is possible, it may be pushing the limits of effective use. One of the solutions is effective modularization of ontologies (see the next point).Another challenge of the Web is interoperation. No matter what the combination of the “winning” technologies is at the end, one thing is certain: These technologies will support and significantly facilitate interoperation among different models and content. It is unlikely that this support will be manifested by an extremely small, well-defined number of standards that everyone will use, thus essentially eliminating the main hurdles to interoperability. Rather, it is probably a solution that would encourage such reuse and the use of a limited number of differentmodels in the first place, but will also support, facilitate, and effectively implement mappings and translations among them.Black Boxes. OWL, the W3C standard ontology language for the Semantic Web, is actually a complex language. Writing ontologies in OWL, even using graphical user interfaces, is a challenging task. We need to develop special-purpose editors with only limited functionality that are easy for people to use and grasp and have many wizards and tools for development. It is probably unrealistic to assume that many noncomputer scientists, without special training, will be able to develop full-fledged ontologies in OWL, with logical expressions, universal and existential restrictions, and so on.If the Semantic Web vision is to succeed, however, they won’t have to. One might argue that most of the ontologies on the Semantic Web would be simple hierarchies with simple properties, rather than ontologies using different types of restrictions, disjointedness, and complex logical expressions with unions and intersections. There is a place for both, and a small number of well-developed and verified ontologies will be necessary. At the same time, people should be able to reuse them as “black boxes” without having to understand much about them. If I reuse a well-established ontology of time, such as OWL-Time (/~pan/OWL-Time.html), all I should have to know is that if I say that you order food before you pay for it, using the notion of before from the OWL-Time ontology, my system would be able to figure out the temporal relations between the events.Trust Me. Trust—and especially, some computable metric of it—is extremely important in any kind of setting where anyone can post machine-processable data. The problem of trust (or mistrust) is already plaguing the Web. On the Web, information is consumed mostly by humans, who can often use their background knowledge and intuition to assess the trustworthiness of the source. Our vast background knowledge and experience, which is hard to encode, helps a lot in determining what is good and what is bad, but mechanisms are still needed to help us in these decisions, as we are not experts in every field.At the other end of the spectrum, databases are designed largely for machine processing, but they are largely centrally controlled, and schemas are usually published (if at all) by respectable companies. A database is not something that an average blogger posts on the Web. In the free-flowing realm of the Semantic Web and other data being posted essentially by anyone to be processed by machines, these machines need to be able to determine how trustworthy the sources are. Reputation systems, certificates, and security measures will all need to be adapted to this realm.“Just Enough” and Imprecise Answers. Historically, both reasoning in artificial intelligence and querying in databases has been about precise answers to specific questions, with few exceptions. With a much less centrally controlled, loosely coupled collection of resources that come online and go away, have questionable credentials, and use different representations and levels of precision in their representation, we must develop query and reasoning techniques that are flexible enough to deal with such fluid collections. These methods should not require precision in either specification or answers.Furthermore, in both reasoning in artificial intelligence and query answering in databases, there is a paradigm of getting a complete answer to your question with respect to the data stored in the accessible resources. On the Web, we are used to getting answers that are “good enough” or “as good as possible,” given the time constraints and the resources available, but not necessarily the best. If I find an airfare that seems to be about the lowest that is reasonable for this time of the year, I am happy to stop my search, even if I know that a ticket that is $10 cheaper is available somewhere on the Web. Likewise on the Semantic Web, a result that we can find quickly and effectively with the resources that are currently available is often good enough and does not have to be perfect.None of these challenges is insurmountable, and addressing them will produce a lot of interesting advances on the way. Think of this as the history of artificial intelligence in a nutshell: Research has not produced, and may never produce, a machine that is as intelligent as a human being. But think of how many scientific and technical advances we have made while pursuing this goal! Likewise, whether or not we achieve the Holy Grail of all machines talking to and understanding one another without much, if any, intervention from humans, we will produce many useful tools along the way.References1.Berners-Lee, T., Hendler, J., and Lassila, O. 2001. The Semantic Web. Scientific American 284(5): 34–43.2.Welty, C. 2003. Ontology research. AI Magazine 24(3).3.McGuinness, D. L. 2001. Ontologies come of age. In The Semantic Web: Why, What, and How, ed. D. Fensel, J. Hendler, H.Lieberman, and W. Wahlster. Cambridge: MIT Press.4.Brickley, D., and Guha, R. V. 1999. Resource description framework (RDF) schema specification. Proposed recommendation, WorldWide Web Consortium.5.Dean, M., Connolly, D., van Harmelen, F., Hendler, J., Horrocks, I., McGuinness, D. L., Patel-Schneider, P. F., and Stein L. A. 2002.Web ontology language (OWL) reference version 1.0; /tr/owl-guide/.6. A detailed comparison of ontologies and databases can be found in Uschold, M., and Grüninger, M. 2004. Ontologies and semantics forseamless connectivity. SIGMOD Record 33(3).7.Noy, N., Guha, R. V., and Musen, M. A. 2005. User ratings of ontologies: who will rate the raters? In AAAI 2005 Spring Symposiumon Knowledge Collection from Volunteer Contributors. Stanford, CA.。

人工智能之知识图谱

图表目录图1知识工程发展历程 (3)图2 Knowledge Graph知识图谱 (9)图3知识图谱细分领域学者选取流程图 (10)图4基于离散符号的知识表示与基于连续向量的知识表示 (11)图5知识表示与建模领域全球知名学者分布图 (13)图6知识表示与建模领域全球知名学者国家分布统计 (13)图7知识表示与建模领域中国知名学者分布图 (14)图8知识表示与建模领域各国知名学者迁徙图 (14)图9知识表示与建模领域全球知名学者h-index分布图 (15)图10知识获取领域全球知名学者分布图 (23)图11知识获取领域全球知名学者分布统计 (23)图12知识获取领域中国知名学者分布图 (23)图13知识获取领域各国知名学者迁徙图 (24)图14知识获取领域全球知名学者h-index分布图 (24)图15 语义集成的常见流程 (29)图16知识融合领域全球知名学者分布图 (31)图17知识融合领域全球知名学者分布统计 (31)图18知识融合领域中国知名学者分布图 (31)图19知识融合领域各国知名学者迁徙图 (32)图20知识融合领域全球知名学者h-index分布图 (32)图21知识查询与推理领域全球知名学者分布图 (39)图22知识查询与推理领域全球知名学者分布统计 (39)图23知识查询与推理领域中国知名学者分布图 (39)图24知识表示与推理领域各国知名学者迁徙图 (40)图25知识查询与推理领域全球知名学者h-index分布图 (40)图26知识应用领域全球知名学者分布图 (46)图27知识应用领域全球知名学者分布统计 (46)图28知识应用领域中国知名学者分布图 (47)图29知识应用领域各国知名学者迁徙图 (47)图30知识应用领域全球知名学者h-index分布图 (48)图31行业知识图谱应用 (68)图32电商图谱Schema (69)图33大英博物院语义搜索 (70)图34异常关联挖掘 (70)图35最终控制人分析 (71)图36企业社交图谱 (71)图37智能问答 (72)图38生物医疗 (72)图39知识图谱领域近期热度 (75)图40知识图谱领域全局热度 (75)表1知识图谱领域顶级学术会议列表 (10)表2 知识图谱引用量前十论文 (56)表3常识知识库型指示图 (67)摘要知识图谱（Knowledge Graph）是人工智能重要分支知识工程在大数据环境中的成功应用，知识图谱与大数据和深度学习一起，成为推动互联网和人工智能发展的核心驱动力之一。

旅游管理系统数据库设计参考文献

旅游管理系统数据库设计参考文献在设计旅游管理系统的数据库时，参考文献是至关重要的，它们可以提供宝贵的经验和指导，帮助我们更好地规划和实施数据库架构。

以下是一些在旅游管理系统数据库设计方面的参考文献，可以帮助我们更好地理解和应用相关技术。

数据库设计基础•Korth, H. F., & Silberschatz, A. (1991). Database System Concepts.McGraw-Hill. 这本经典教材介绍了数据库系统基本概念，包括实体关系模型设计、关系代数和SQL查询语言等。

对于理解数据库设计的基础原理非常有帮助。

旅游管理系统数据库设计实践•Li, Y., Guo, W., & Chen, L. (2016). Design and Implementation of Tourism Information Management System Based on Data Warehouse and Data Mining. International Conference on Digital Economy (ICDE). 本文介绍了基于数据仓库和数据挖掘技术的旅游信息管理系统设计与实现。

通过数据仓库的建设和数据挖掘技术的应用，实现了对旅游信息的有效管理和个性化推荐功能。

数据库性能优化•Yaghoubi, A., Duff, R. J., & Boykin, R. E. (2014). Performance Comparison of NoSQL Approaches for Storing RDF Data in Semantic Web Applications. International World Wide Web Conference (WWW). 该研究比较了在语义Web应用程序中存储RDF数据的NoSQL方法的性能。

了解这些最新的性能优化技术可以帮助我们在实践中更好地优化旅游管理系统数据库的性能。

Oracle BPM 套件：一份关于 Oracle Corporation 的商业流程管理工具的介绍

An Ontological Approach to Oracle BPMJean Prater, Ralf Mueller, Bill BeauregardOracle Corporation, 500 Oracle Parkway, Redwood City, CA 94065, USA **********************,***********************,*****************************The Oracle Business Process Management (Oracle BPM) Suite is composed oftools for Business Analysts and Developers for the modeling of BusinessProcesses in BPMN 2.0 (OMG1 standard), Business Rules, Human Workflow,Complex Events, and many other tools. BPM operates using the commontenants of an underlying Service Oriented Architecture (SOA) runtimeinfrastructure based on the Service Component Architecture (SCA). OracleDatabase Semantic Technologies provides native storage, querying andinferencing that are compliant with W3C standards for semantic (RDF/OWL)data and ontologies, with scalability and security for enterprise-scale semanticapplications.Semantically-enabling all artifacts of BPM from the high-level design of aBusiness Process Diagram to the deployment and runtime model of a BPMapplication promotes continuous process refinement, enables comprehensiveimpact analysis and prevents unnecessary proliferation of processes andservices. This paper presents the Oracle BPM ontology based upon BPMN 2.0,Service Component Architecture (SCA) and the Web Ontology Language(OWL 2). The implementation of this ontology provides a wide range of usecases in the areas of Process Analysis, Governance, Business Intelligence andSystems Management. It also has the potential to bring together stakeholdersacross an Enterprise, for a true Agile End-to-End Enterprise Architecture.Example use cases are presented as well as an outlook of the evolution of theontology to cover the organizational and social aspects of Business ProcessManagement.1.IntroductionIn the 1968 film, 2001: A Space Odyssey, the movie’s antagonist, HAL, is a computer that is capable not only of speech, speech recognition, and natural language processing, but also lip reading, apparent art appreciation, interpreting and reproducing emotional behavior, reasoning, and playing chess, all while maintaining the systems on an interplanetary mission. While the solution we present in this paper does not possess all of the capabilities of HAL, the potential benefits of combining semantic technology with Oracle BPM provides the ability to define contextual relationships between business processes and provides the tools to use that context so that ‘software agents’ (programs working on behalf of people) can find the right1 Object Management Group, see 2 Jean Prater, Ralf Mueller, Bill Beauregardinformation or processes and make decisions based on the established contextual relationships.Organizations can more efficiently and effectively optimize their information technology resources through a service-oriented approach leveraging common business processes and semantics throughout their enterprise. The challenge, however, with applications built on Business Process Management (BPM) and Service Oriented Architecture (SOA) technology is that many are comprised of numerous artifacts spanning a wide range of representation formats. BPMN 2.0, the Service Component Architecture Assembly Model, Web Service definitions (in the form of WSDL), XSLT transformations, for example are all based on well defined but varying type models. To answer even simple queries on the entire BPM model, a user is left with a multitude of API’s and technologies, making the exercise difficult and highly complicated. Oracle has developed an ontology in OWL that encompasses all the artifacts of a BPM application and is stored in Oracle Database Semantic Technologies that provides a holistic view of the entire model and a unified and standardized way to query that model using SPARQL.Oracle is actively involved in the standards process and is leading industry efforts to use ontologies for metadata analysis. Oracle is also investigating the integration of organizational and social aspects of BPM using FOAF2. BPMN 2.0 task performers can be associated with a FOAF Person, Group or Organization and then used in Social Web activities to enable Business Users to collaborate on BPM models.1.1 BenefitsThe benefits of adding semantic technology to the database and to business process management in the middleware, driven by an underlying ontology are three fold:1.It promotes continuous process refinement. A less comprehensive processmodel can evolve into a complete executable process in the same model.2.It makes it easy to analyze the impact of adding, modifying or deletingprocesses and process building blocks on existing processes and webservices.3.It helps prevent unnecessary proliferation of processes and services. Combining semantic technology and business process management allows business users across organizational boundaries to find, share, and combine information and processes more easily by adding contextual relationships.1.2 Customer Use CaseThe US Department of Defense (DoD) is leading the way in the Federal Government for Architecture-driven Business Operations Transformation. A vital tenet for success is ensuring that business process models are based on a standardized representation, thus enabling the analysis and comparison of end to end business processes. This will lead to the reuse of the most efficient and effective process patterns (style guide), comprised of elements (primitives), throughout the DoD Business Mission Area. A key principle in DoD Business Transformation is its focus on data ontology. The 2 The Friend of a Friend (FOAF) project, see An Ontological Approach to Oracle BPM 3 Business Transformation Agency (BTA), under the purview of the Deputy Chief Management Officer (DCMO), has been at the forefront of efforts to develop a common vocabulary and processes in support of business enterprise interoperability through data standardization. The use of primitives and reuse of process patterns will reduce waste in overhead costs, process duplication and building and maintaining enterprise architectures. By aligning the Department of Defense Architecture Framework3 2.0 (DoDAF 2.0) with Business Process Modeling Notation 2.0 (BPMN 2.0) and partnering with industry, the BTA is accelerating the adoption of these standards to improve government business process efficiency.2.The Oracle BPM OntologyThe Oracle BPM ontology encompasses and expands the BPMN 2.0 and SCA ontologies. The Oracle BPM ontology is stored in Oracle Database Semantic Technologies and creates a composite model by establishing relationships between the OWL classes of the BPMN 2.0 ontology and the OWL classes of the SCA runtime ontology. For example, the BPMN 2.0 Process, User Task and Business Rule Task are mapped to components in the composite model. Send, Receive and Service Tasks, as well as Message Events are mapped to appropriate SCA Services and References and appropriate connections are created between the composite model artifacts. Figure 1 illustrates the anatomy of the Business Rule Task “Determine Approval Flow” that is a part of a Sales Quote demo delivered with BPM Suite.Figure 1: Anatomy of a BPMN 2.0 Business Rule Task4The diagram shows that the Business Rule Task “Determine Approval Flow” is of BPMN 2.0 type Business Rule Task and implemented by a SCA Decision Component that is connected to a BPMN Component “RequestQuote”. Also of significance is that the Decision Component exposes a Service that refers to a specific XML-Schema, which is also referred to by Data Objects in the BPMN 2.0 process RequestQuote.bpmn.3See /products/BEA_6.2/BEA/products/2009-04-27 Primitives Guidelines for Business Process Models (DoDAF OV-6c).pdf4 Visualized using TopBraid Composer TM4 Jean Prater, Ralf Mueller, Bill Beauregard3.An Ontology for BPMN 2.0With the release of the OMG BPMN 2.0 standard, a format based on XMI and XML-Schema was introduced for the Diagram Interchange (DI) and the Semantic Model. Based on the BPMN 2.0 Semantic Model, Oracle created an ontology that is comprised of the following5:•OWL classes and properties for all BPMN 2.0 Elements that are relevant for the Business Process Model.6The OWL classes, whenever possible,follow the conventions in the BPMN 2.0 UML meta model. OWL propertiesand restrictions are included by adding all of the data and object propertiesaccording to the attributes and class associations in the BPMN 2.0 model.7•OWL classes and properties for instantiations of a BPMN 2.0 process model. These OWL classes cover the runtime aspects of a BPMN 2.0process when executed by a process engine. The process engine createsBPMN 2.0 flow element instances when the process is executed. Activitylogging information is captured, including timestamps for a flow elementinstance’s activation and completion, as well as the performer of the task. The implicit (unstated) relationships in the Oracle BPM ontology can be automatically discovered using the native inferencing engine included with Oracle Database Semantic Technologies. The explicit and implicit relationships in the ontology can be queried using Oracle Database Semantic Technologies support for SPARQL (patterns matching queries) and/or mixed SPARQL in SQL queries. [6] Example SPARQL queries are shown below:Select all User Tasks in all Lanesselect ?usertask ?lanewhere {usertask rdf:type bpmn:UserTask .usertask bpmn:inLane lane}Select all flow elements with their sequence flow in lane p1:MyLane (a concrete instance of RDF type bpmn:Lane)select ?source ?targetwhere {flow bpmn:sourceFlowElement source .flow bpmn:targetFlowElement target .5 All of the classes of the BPMN 2.0 meta model that exists for technical reasons only (model m:n relationship or special containments) are not represented in the ontology6 The work in [2] describes an ontology based on BPMN 1.x for which no standardized meta model exists7 Oracle formulated SPARQL queries for envisioned use cases and added additional properties and restrictions to the ontology to support those use casesAn Ontological Approach to Oracle BPM 5 target bpmn:inLane p1:MyLane}Select all activities in process p1:MyProcess that satisfy SLA p1:MySLA select ?activity ?activityInstancewhere {activity bpmn:inProcess p1:MyProcess .activityInstance obpm:instanceOf activity .activityInstance obpm:meetSLA p1:MySLA}A unique capability of BPMN 2.0, as compared to BPEL, for instance, is its ability to promote continuous process refinement. A less comprehensive process model, perhaps created by a business analyst can evolve into a complete executable process that can be implemented by IT in the same model. The work sited in Validating Process Refinement with Ontologies[4] suggests an ontological approach for the validation of such process refinements.4.An Ontology for the SCA composite modelThe SCA composite model ontology represents the SCA assembly model and is comprised of OWL classes for Composite, Component, Service, Reference and Wire, which form the major building blocks of the assembly model. Oracle BPM ontology has OWL classes for concrete services specified by WSDL and data structures specified by XML-Schema. The transformation of the SCA assembly model to the SCA ontology includes creating finer grained WSDL and XML-Schema artifacts to capture the dependencies and relationships between concrete WSDL operations and messages to elements of some XML-Schema and their imported schemata.The SCA ontology was primarily created for the purpose of Governance and to act as a bridge between the Oracle BPM ontology and an ontology that would represent a concrete runtime infrastructure. This enables the important ability to perform impact analysis to determine, for instance, which BPMN 2.0 data objects and/or data associations are impacted by the modification of an XML-Schema element or which Web Service depends on this element. This feature helps prevent the proliferation of new types and services, and allows IT to ascertain the impact of an XML-Schema modification.5.The TechnologiesAs part of the customer use case, as referenced in section 1.2 above, we implemented a system that takes a BPM Project comprised of BPMN 2.0 process definitions, SCA assembly model, WSDL service definitions, XML-Schema and other metadata, and created appropriate Semantic data (RDF triples) for the Oracle BPM ontology. The6 Jean Prater, Ralf Mueller, Bill Beauregardtriples were then loaded into Oracle Database Semantic Technologies [3] and a SPARQL endpoint was used to except and process queries.6.ConclusionOracle BPM ontology encompasses and expands the generic ontologies for BPMN 2.0 and the SOA composite model to cover all artifacts of a BPM application from a potentially underspecified8process model in BPMN 2.0 down to the XML-Schema element and type level at runtime for process analysis, governance and Business Intelligence. The combination of RDF/OWL data storage, inferencing and SPARQL querying, as supported by Oracle Database Semantic Technologies, provides the ability to discover implicit relationships in data and find implicit and explicit relationships with pattern matching queries that go beyond classical approaches of XML-Schema, XQuery and SQL.7.AcknowledgementsWe’d like to thank Sudeer Bhoja, Linus Chow, Xavier Lopez, Bhagat Nainani and Zhe Wu for their contributions to the paper and valuable comments.8.References[1] Business Process Model and Notation (BPMN) Version 2.0,/spec/BPMN/2.0/[2] Ghidini Ch., Rospocher M., Serafini L.: BPMN Ontology,https://dkm.fbk.eu/index.php/BPMN_Ontology[3] Oracle Database Semantic Technologies,/technetwork/database/options/semantic-tech/[4] Ren Y., Groener G., Lemcke J., Tirdad R., Friesen A., Yuting Z., Pan J., Staab S.:Validating Process Refinement with Ontologies[5] Service Component Architecture (SCA), [6] Kolovski V., Wu Z., Eadon G.: Optimizing Enterprise-Scale OWL 2 RL Reasoning in aRelational Database System, ISWC 2010, page 436-452[7] “Use of End-toEnd (E2E) Business Models and Ontology in DoD Business Architectures”;Memorandum from Deputy Chief Management Office; April 4, 2011, Elizabeth A.McGrath, Deputy DCMO.[8] “Primitives and Style: A Common Vocabulary for BPM across the Enterprise”; DennisWisnosky, Chief Architect & CTO ODCMO and Linus Chow Oracle; BPM Excellence in Practice 2010; Published by Future Strategies, 20108A BPMN 2.0 model element is considered underspecified, if its valid but not all attribute values relevant for execution are specified.。

一种基于知识图谱技术的智能制造数据标准数字化转型方法

学术研讨一种基于知识图谱技术的智能制造数据标准数字化转型方法■ 岳高峰1，2 刘继红1* 高亮2 温娜2（1. 北京航空航天大学；2. 中国标准化研究院）摘要：智能制造领域的标准涵盖设备、机床、计算机系统、智能传感器等多种类型。

以工业化、数字化、智能化为特征的智能制造对标准的数字化和机器可读提出了内在要求。

为了满足智能制造标准的数字化、网络化、智能化发展要求，基于知识图谱技术和本体建模理论，本文提出了用于智能制造标准数字化转型的本体模型，包括类、属性、关系和规则库。

在此基础上，通过在行业应用的典型示例，解释和验证了本文所提的模型和方法。

关键词：知识图谱，标准数字化，本体DOI编码：10.3969/j.issn.1002-5944.2023.15.005A Digital Transformation Method of Standard for IntelligentManufacturing Data Based on Knowledge Graph TechnologyYUE Gao-feng1,2 LIU Ji-hong1* GAO Liang2 WEN Na2（1. Beihang University; 2. China National Institute of Standardizationn）Abstract:Standards in the ﬁ eld of intelligent manufacturing cover equipment, machine tools, computer systems, intelligent sensors and other types of products. Intelligent manufacturing, characterized by industrialization, digitalization and intellectualization, puts forward inherent requirements for the digitalization and machine readability of standards. In order to meet the requirements of digitalization, network and intelligent development of intelligent manufacturing standards, based on knowledge graph technology and ontology modeling theory, this paper proposes an ontology model for the digital transformation of intelligent manufacturing standards, including classes, attributes, relationships and rule base. On this basis, the model and method proposed in this paper are veriﬁ ed through typical examples of industrial applications. Keywords: knowledge graph, standards digitization, ontology0 引言欧美发达国家在将新技术用于推动制造业的变革发展。

RDF 模型及其推理机制

Ontology KR frame ontology Logic theory
ER model Database schemas Collection of terms
Description logic
Object classes
Ontology Spectrum
如何来描述现实世界-3
本体－ontology
-Tim Berners-Lee, James Hendler, Ora Lassila, －The Semantic Web, Scientific American, May 2001
•
“The Semantic Web is a major research initiative of the World Wide Web Consortium (W3C) to create a metadata-rich Web of resources that can describe themselves not only by how they should be displayed (HTML) or syntactically (XML), but also by the meaning of the metadata.”
/TR/2003/WD-rdf-concepts-20030905/
RDF Graph Data Model
RDF就是三元组的集合，每一个三元组都包含了subject 、 predicate、object；一系列的三元组构成一个RDF 图
RDF Graph XML serialization
Definition: The Semantic Web is the representation of data on the World Wide Web. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming. 定义：Semantic Web 就是在web上表现数据。它由W3C牵头，大量的研究机构和企业伙伴共同参与。Semantic web基于RDF，RDF集成了大量使用以XML为句法和以URIs来命名的应用。

peer-to-peer

A logic-based approach for computing service executions plans inpeer-to-peer networksHenrik Nottelmann and Norbert FuhrInstitute of Informatics and Interactive Systems,University of Duisburg-Essen,47048Duisburg,Germany,{nottelmann,fuhr}@uni-duisburg.deAbstract.Today,peer-to-peer services can comprise a large and growing number of services,e.g.search ser-vices or services dealing with heterogeneous schemas in the context of Digital Libraries.For a given task,thesystem has to determine suitable services and their processing order(“execution plan”).As peers can join orleave the network spontaneously,static execution plans are not sufﬁcient.This paper proposes a logic-basedapproach for dynamically computing execution plans:Services are described in the DAML-S language.Thesedescriptions are mapped onto Datalog.Finally,logical rules are applied on the service description facts fordetermining matching services andﬁnding an optimum execution plan.1IntroductionPeer-to-peer architectures have emerged recently as an alternative to centralised architectures.In the beginning, they have been mainly used for simple applications likeﬁle sharing with only primitive retrieval capabilities. Nowadays,they are employed more and more for advanced IR applications.In the scenario used in this paper,users search for documents in a peer-to-peer network(a“retrieval task”).A user issues a query to the network.The query is routed through the network,and—without further interaction with the user—documents are retrieved and sent back to the user.Documents are structured through schemas.Thus,queries are also stated against a schema,and the retrieval task deﬁnes the schema of user queries and the schema of the result documents(requested by the user).In contrast to other approaches,we assume a heterogeneous peer-to-peer network.Each peer can use its own schema for representing its documents.In addition,a peer can offer different services which are specialised in solving a speciﬁc problem,e.g.for bridging the heterogeneity(mediating between different schemas),or for im-proving information access to Digital Libraries.So,here we deal with a heterogeneous network of services which are offered by peers.Nodes can spontaneously join and leave the peer-to-peer network,so they cannot be integrated in the system in a static way.Thus,a match-making component compares the(retrieval)task with all services which are available at that time,and computes an execution plan(the order of services to be invoked).An execution plan can include (besides search services)e.g.schema mapping services if the schema of the query and the search service differ. This paper proposes a logic-based approach for computing execution plans,which picks up some ideas from[6]: 1.DAML Services(DAML-S,[3])is the forthcoming standard for machine-readable service descriptions in theSemantic Web,and thus also employed in this approach.DAML-S deﬁnes the vocabulary(an upper ontology) for describing arbitrary(originally mostly business-oriented)services.A lower ontology for Digital Library services(i.e.,the description of actual services)is presented in this paper.2.In a next step,parts of the DAML-S descriptions are transformed into Datalog,a predicate horn logic.Logicalmatch-making rules can then be applied on the resulting facts for computing an execution plan(in this paper,a sequential order of services).For the scenario presented in this paper,considering only the input and outputtypes of services is sufﬁcient for retrieval-like tasks.3.Similar to resource selection in federated Digital Libraries,the match-making component should consider thecosts of execution plans and compute an optimum selection.This paper presents an approach for cost-optimum service selection,based on probabilistic logics.Other authors have proposed logic-or RDF-based approaches forﬁnding suitable services before.In[7],services are modelled as processes using the MIT process Handbook ontology,providing similar modelling primitives as DAML-S(see Sec.2),and introduces a simple query language for retrieving suitable processes.As this query language only uses the syntactic model,semantics-preserving query-mutation operators(using e.g.specialisa-tion/generalisation)are introduced.In contrast,RDF(S)advertisements are used in[15]for both services and clients,so match-making is reduced to RDF graph matching.A lisp-like notation for logical constructs is used by [8]for both service capabilities descriptions and for the service request.An AI planning component can infer an execution plan by iteratively adding services which minimise the remaining effort.In[13],the quality-of-service of a service execution plan is considered.Similar to the decision-theoretic framework for service selection selection(“resource selection”)[11,4]and the general service selection model presented in this paper,costs are associated with each execution plan,and a local optimisation algorithm is applied forﬁnding the optimum execution plan.A user speciﬁes a query w.r.t.virtual operations,for which matching web services are then found.A similar approach is taken in[18].Here,composite services,and thus execution plans,are modelled as state charts.Then,different quality criteria(e.g.monetary price,execution time,reliability,availability)are combined into an overall cost measure for an execution plan.As it is not feasible to consider every possible execution plan, linear programming is then employed forﬁnding an optimum execution plan.Edutella[9],a metadata infrastructure for the P2P network JXTA,combines RDF and Datalog.In contrast to[12], it does not work on an ontology level,and only maps RDF statements onto Datalog facts,without preserving the semantics of RDF modelling primitives.When the RDF model contains a DAML-S service description,the derived Datalog facts can be used for searching for services with known properties.In contrast,the approach presented in this paper combines DAML-S,probabilistic Datalog,a probabilistic ex-tension to predicate horn logic,and a decision-theoretic model forﬁnding the cost-optimum execution plans in heterogeneous peer-to-peer networks.This paper is organised as follows:The next section gives a brief introduction into DAML Services.Section3 extends DAML-S by a lower ontology for library services.These models will be transformed into probabilistic Datalog in Sec.4.Match-making rules(see Sec.5)can then be used for computing an optimum execution plan. 2DAML Services(DAML-S)DAML-S deﬁnes a vocabulary for describing services(an upper ontology).The service model is expressed in DAML+OIL.E.g.,DAML-S contains classes for processes and properties for deﬁning their input and output types.However,it does not contain any description of actual services;they have to be deﬁned in application-speciﬁc lower ontologies.Service descriptions in DAML-S consist of three different parts:Proﬁle:It describes what the services actually do,mainly by means of input and output parameters,preconditions and effects.In addition,different service types can be used for categorisation.The service proﬁle will be used for match-making.Process model:Processes describe how services work internally.They can be described either as atomic processes or as compositions of other services.Advanced match-making components can use the process model for an in-depth analysis.Service grounding:The grounding can be used for calling the service.E.g.,WSDL descriptions can be included in the service groundings.Together with the service process mode,it can be used for actually invoking the service.This implementation aspect is out of the scope of this paper.2.1Service ProﬁleEvery service has an associated proﬁle.The proﬁle(Fig.1)gives a high-level description of the functionality of a service,and is intended to be used for match-making.Fig.1.DAML-S proﬁle deﬁnitionThe contact information aims at developers who want to contact the responsible person(e.g.the system adminis-trator)of the server,and can be neglected here.The parameter descriptions are more interesting.DAML-S supports four different kinds of service parameters: input parameters,output parameters,preconditions which have to be fulﬁlled in the physical world before the service can be executed,and effects the service has on the physical world.Preconditions and effects mainly aim at E-Commerce applications.For a book selling service,the ordered book must be on stock,and after the service execution,the book will be delivered to the customer.In the Digital Library setting used in this paper(pure retrieval task),preconditions and effects do not play any role.Thus,only inputs and outputs are used.1Each parameter has a name(a string),is restricted to a speciﬁc type(a DAML+OIL class or an XML Schema data-type),and refers to one parameter in the process model(see below).2.2Process ModelThe process model(Fig.2)gives a more detailed view on the service.As said before,it can be used by a match-making component for an in-depth analysis of the services.Similar to the proﬁle,a process is described by input and output parameters,preconditions and effects.Proﬁle parameter descriptions can correspond to these process parameters.The deﬁnition of a parameter is shorter than in the proﬁle.The property URI is used for identiﬁcation,no additional string is speciﬁed.In addition,each process parameter is a sub-property of one predeﬁned properties input,output,etc.DAML-S basically contains two types(as sub-classes)of processes:atomic and composite processes.Atomic processes are viewed as black boxes(like proﬁles).Composite processes are deﬁned as compositions of control constructs and other processes.Examples for control constructs are sequences of other control constructs(or processes),repetitions,conditions(if-then-else),or parallel execution of control constructs(or processes)with a synchronisation point at the end.Thus,composite processes 1This could easily be extended so that preconditions and effects are also considered.Fig.2.DAML-S process deﬁnitionallow for describing a service as a complex composition of other services.This is comparable to the usage of scripting code which glues together existing software components.3Lower ontology for library servicesIn addition to the DAML-S upper ontology,a domain-speciﬁc lower ontology is required.This lower ontology deﬁnes types of services(processes)which are used in the speciﬁc application area.This section brieﬂy describes a simple lower ontology for library services.As the parameter deﬁnition in the process model is simpler than in the proﬁle,atomic processes are employed here for the service descriptions.3.1Search servicesSearch services are among the important services in distributed Digital Libraries.They receive a user query,retrieve useful documents from their associated collection,and return them to the caller.A simple process model of search services is depicted in Fig.3(upper part).A search process is a special case (a DAML+OIL sub-class)of an atomic process.Every search process has exactly one query as input(the car-dinality restrictions are omitted in the graph)and exactly one result(meant as a set of documents)as output. Thus,sub-properties of input and output are used.The ranges of these new properties are restricted to(generic) DAML+OIL classes Query and Result.They have to be deﬁned in the lower ontology,too,but left out as the exact deﬁnitions do not touch this discussion.In a heterogeneous setting,search services probably use different schemas for expressing queries and representing documents.Typically some search services adhere to Dublin Core(DC),e.g.those operating on Open Archives data.Other services might use specialised schemas,e.g.the ACM digital library,or services providing retrieval in art collections.Thus,the description must also contain the schema the search service uses.This is modelled by creating schema-speciﬁc sub-classes for queries and results[10].In the internal presentation,a library schema directly relates to a DAML+OIL“schema”.For match-making,it is sufﬁcient to consider the speciﬁc sub-types of queries and results. The lower part in Fig.3shows the description of an ACM search service.Obviously,ACMSearch is a sub-class of the generic class Search.The ranges of the input and output properties are restricted to ACM-speciﬁc query/result sub-classes.Fig.3.Process model for search servicesWith this extended description,a match-making component can clearly distinguish between search services using different schemas,and can plan accordingly.3.2Other library servicesIn large peer-to-peer-systems,where a large number of DLs has to be federated,heterogeneity of DLs,especially w.r.t.the underlying schema,becomes a major issue.Each search service may use a different document structure. In federated Digital Libraries,e.g.MIND[10],users may query DLs in their preferred schema,and the system(i.e, schema mapping services)must perform the necessary transformations for each individual DL.Each schema mapping service mediates between exactly two different schemas(“input schema”,“output schema”). If there is no schema mapping service available for a required mapping,then several schema mapping services have to be chained(with at least one intermediary schema).There are two different kinds of schema mapping services:–Query transformation services take a query referring to one speciﬁc schema as input and return the same query in another speciﬁc schema.In this paper,a service DC2ACMQuery is considered which transforms a DC query into an ACM query.–In a similar way,a result transformation service like ACM2DCResult transforms a result(set of documents) adhering to one speciﬁc schema(here:ACM)into another schema(here:DC).Finally,query modiﬁcation services compute a new query for a given query based on some given relevance judge-ments, e.g.by applying a query expansion algorithm.This scenario only contains one such service DCQueryModification working on DC queries and results.4DAM+OIL and DatalogThis sectionﬁrst introduces deterministic and probabilistic Datalog.Then,it describes how DAML-S models are transformed into Datalog facts which can then be exploited by match-making rules.4.1DatalogDatalog[16]is a variant of predicate logic based on function-free Horn clauses.Negation is allowed,but its use is limited to achieve a correct and complete model(see below).Rules have the form h←b1∧···∧b n,where h(the “head”)and b i(the subgoals of the“body”)denote literals2with variables and constants as arguments.A rule can be seen as a clause{h,¬b1,...,¬b n}:father(X,Y):-parent(X,Y)&male(X).This denotes that father(x,y)is true for two constants x and y if both parent(x,y)and male(x)are true.This rule has the head father(X,Y)and two body literals(considered as a conjunction)parent(X,Y)and male(X).In addition,negated literals start with an exclamation mark.Variables start with an uppercase character,constants with a lowercase character.Thus the rule expresses that fathers are male parents.A fact is a rule with only constants in the head and an empty body:parent(jo,mary).The semantics are deﬁned by well-founded models[17],which are based on the notion of the greatest unfounded set.Given a partial interpretation of a program,this is the maximum set of ground literals that can be assumed to be false.Negation is allowed in Datalog as long as the program is modularly stratiﬁed[14](in contrast to Prolog).In contrast to global stratiﬁcation,modular stratiﬁcation is formulated w.r.t.the instantiation of a program for its Herbrand universe.The program is modularly stratiﬁed if there is an assignment of ordinal levels to ground atoms such that whenever a ground atom appears negatively in the body of a rule,the ground atom in the head of that rule is of strictly higher level,and whenever a ground atom appears positively in the body of a rule,the ground atom in the head has at least that level.4.2Probabilistic DatalogIn probabilistic Datalog[5],every fact or rule has a probabilistic weight attached,prepended to the fact or rule: 0.5male(X):-person(X).0.5male(jo).Semantics of pDatalog programs are deﬁned as follows:The pDatalog program is modelled as a probability distri-bution over the set of all“possible worlds”.A possible world is the well-founded model of a possible deterministic program,which is formed by the deterministic part of the program and a subset of the indeterministic part.As for deterministic Datalog,only modularly stratiﬁed programs are allowed.Computation of the probabilities is based on the notion of event keys and event expressions,which allow for recognising duplicate or disjoint events when computing a probabilistic weight.Facts and instantiated rules are basic events(identiﬁed by a unique event key).Each derived fact is associated with an event expression that is a Boolean combination of the event keys of the underlying basic events.E.g.,the event expressions of the subgoals of a rule form a conjunction.If there are multiple rules for the same head,the event expressions corresponding to the rule bodies form a disjunction.By default,events are assumed to be independent,so the probabilities of events in a conjunction can be multiplied.2Literals in logics are different from literals in DAML+OIL!4.3Transforming DAML-S models into DatalogThe services described by DAML-S have to be transformed into a Datalog program which can be used for match-making.Aﬁrst step for such a mapping from DAML+OIL onto a four-valued variant of probabilistic Datalog is proposed in[12]:DAML+OIL classes(concepts in description logics)are mapped onto unary Datalog predicates, properties(roles in description logics)onto binary Datalog predicates,and instances and DAML+OIL literals onto Datalog constants.In addition,Datalog rules preserving the DAML+OIL semantics for several DAML+OIL con-structs have been presented.In contrast,Edutella[9]only maps RDF triples onto Datalog facts,without preserving the semantics of RDF modelling primitives.When the RDF model contains a DAML-S service description,the partial model can be used for searching for services with known properties.This paper proposes a simple match-making approach which only considers the input and output types of the services.3Thus,only these parts of the DAML-S description are transformed by introducing a new ternary auxiliary relation service.Itsﬁrst argument contains the service name,the second one describes the type of the input parameter,and the last argument represents the output parameter type.If a service has more than one input or output value,the types are concatenated.Obviously,these facts can easily be derived from the existing knowledge. service(dl:DCQueryModification,dl:DCQuery_DCResult,dl:DCQuery).service(dl:DC2ACMQuery,dl:DCQuery,dl:ACMQuery).service(dl:ACMSearch,dl:ACMQuery,dl:ACMResult).service(dl:ACM2DCResult,dl:ACMResult,dl:DCResult).Similar,the given task is deﬁned by a ternary relation task:task(mytask,dl:DCQuery_DCResult,dl:DCResult).As shown above,deterministic Datalog is sufﬁcient for modelling services and tasks.Probabilistic Datalog will be employed later for computing optimum execution plans.5Computing service execution plansFor retrieval-like tasks as assumed in this paper,it is sufﬁcient to consider sequential lists of services as execution plans.The assumption is that each invoked service can only rely on the output of the previous service execution. Thus,the input type of a service must match the output type of the previous service in the plan(or the user input if it is theﬁrst service),i.e.every single type in the input must be a sub-set of one of the types in the output.More formally:Let the output type of a service be OT:=OT1×OT2×···×OT k and the input type of another service be IT:=IT1×IT2×···×IT l.Then,OT and IT match if and only if for each1≤i≤l there is a1≤j≤k so that IT j is a sub-set of OT j,i.e.IT i⊆OT j.In Datalog,this is encoded by facts match(OT,IT):match(dl:DCQuery_DCResult,dl:DCQuery).match(dl:DCQuery_DCResult,dl:DCResult).match(dl:ACMQuery,dl:ACMQuery).match(dl:ACMQuery,dl:Query)....The goal then is to deﬁne Datalog rules which can be used for computing an execution plan for a given task.These rules can then be applied directly on the facts which are generated from the DAML-S descriptions of the available services.3Future versions of this approach can employ more information,e.g.the service type or the service composition.5.1Computing service chainsThis section introduces an algorithm for computing execution plans,in using Datalog rules and the facts created from the service descriptions.The basic idea is to start by determining all lists of services which can be executed in sequential order(“service chain”).The service chains whose input and output types match the input and output types of the user task then form the execution plans.Unlike Prolog,Datalog does not allow for creating lists directly,thus service chains have to be deﬁned recursively. The ternary relation chain encodes such a service chain.Theﬁrst argument deﬁnes the service at the front,the third argument the service at the end of the chain.The second argument deﬁnes an arbitrary service somewhere in the middle(or equals null,if there is no other service).As a consequence,the chainDCQueryModification→DC2ACMQuery→ACMSearch→ACM2DCResultcan be represented by the following facts:chain(dl:DCQueryModification,dl:DC2ACMQuery,dl:ACM2DCResult).chain(dl:DCQueryModification,null,dl:DC2ACMQuery).chain(dl:DC2ACMQuery,ACMSearch,dl:ACM2DCResult).chain(dl:DC2ACMQuery,null,dl:ACMSearch).chain(dl:ACMSearch,null,dl:ACM2DCResult).Computing service chains starts withﬁnding chains of exactly two services with matching input and output types. Longer chains can be derived by computing the transitive closure of the chain relation:If there are two service chains where the last service in one chain equals theﬁrst service in the other service chain,then obviously both chains can be combined into one single service chain.In Datalog,this can be encoded by two rules,one for the chains consisting of two services,and another recursive one for computing the transitive closure:chain(S1,null,S2):-service(S1,I1,O1)&service(S2,I2,O2)&match(O1,I2).=>chain(dl:DCQueryModification,null,dl:DC2ACMQuery).chain(dl:DC2ACMQuery,null,dl:ACMSearch).chain(dl:ACMSearch,null,dl:ACM2DCResult).chain(S1,S,S2):-chain(S1,S11,S)&chain(S,S22,S2).=>chain(dl:DCQueryModification,dl:DC2ACMQuery,dl:ACMSearch).chain(dl:DCQueryModification,dl:DC2ACMQuery,dl:ACM2DCResult).chain(dl:DCQueryModification,dl:ACMSearch,dl:ACM2DCResult).chain(dl:DC2ACMQuery,dl:ACMSearch,dl:ACM2DCResult).5.2Computing execution plansAn execution plan for a given task is a service chain where the input type of the task is a super-set of the input type of the chain,and the output type of the chain is a super-set of the output type of the task.Thus,execution plans are encoded by the4-ary predicate plan.Theﬁrst argument contains the task related to the execution plan, the other three arguments contain the three arguments of the corresponding service chain(i.e.,theﬁrst service,the last service,and a service somewhere in the middle of the service chain).Computation of execution plans is straight-forward if service chains are already computed:plan(T,S1,S,S2):-task(T,TI,TO)&chain(S1,S,S2)&service(S1,I,O1)&match(TI,I)&service(S2,I2,O)&match(O,TO).=>plan(dl:DCQueryModification,dl:ACMSearch,dl:ACM2DCResult).plan(dl:DCQueryModification,dl:DC2ACMQuery,dl:ACM2DCResult).plan(dl:DC2ACMQuery,dl:ACMSearch,dl:ACM2DCResult).The two service literals are introduced only to check that the input and output types of the chain match those of the task.Thus,the free variables O1(output type of theﬁrst service in the chain)and I2(input type of the last service in the chain)are unused.The complete execution plan(all services in the correct order)can be determined by iteratively traversing the chain relation.The fact database is queried for services between two services for which it is already known that they are in the plan.The algorithm starts with theﬁrst and the intermediary service:?-chain(dl:DCQueryModification,S,dl:ACMSearch).=>(dl:DC2ACMQuery).Thus,the plan contains the DC2ACM query transformation service between the query modiﬁcation and the ACM search service.It is still unclear if there are other services in that part of the chain,so the procedure has to be repeated:?-chain(dl:DCQueryModification,S,dl:DC2ACMQuery).=>(null).?-chain(dl:DC2ACMQuery,S,dl:ACMSearch).=>(null).Thus,the query modiﬁcation and the DC2ACM query transformation service have to be executed directly one after another.The same holds for the query transformation and the search service.Now,the second part of the chain has to be investigated.The result is that DC2ACMQuery and ACMSearch have to be executed without any service between them:?-chain(dl:ACMSearch,S,dl:ACM2DCResult).=>(null).Thus,all complete execution plans can be determined based on the logic program.5.3Optimum execution plan selectionThe match-making component might compute a large number of potential execution plans,and only one of them should be selected and executed.In the context of search service selection,the concept of costs(combining e.g. time,money,quality)has been used for computing an optimum selection in the decision-theoretic framework [11,4].This framework gives a theoretical justiﬁcation for selecting the best search services.In this paper,a similar approach is applied to the more general problem of execution plan selection.Again,the no-tion of costs(of an execution plan)is used.For computation reasons,time and money(“effort”)are separated from the number of relevant documents(“beneﬁt”).The costs are later computed as the weighted difference between the effort and the beneﬁer-speciﬁc weights ec and bc allow for choosing different selection policies(e.g.good results,fast results).In this paper,we do not describe how the costs of a service can be computed.Methods for estimating costs of search services have been proposed in[11].Currently we are working on methods for estimating costs for query and document transformation services.In this paper,we assume that effort and beneﬁt of all services are given.If a service chain consists of two services,where each of them has its designated effort,then the effort of the service chain is the sum of the efforts of the two services.Its beneﬁt has to be computed as the product of the beneﬁts of the two services,as non-search services,e.g.query modiﬁcation services,do not retrieve aﬁxed number of relevant documents.So,their beneﬁt must be speciﬁed relatively to the beneﬁt of a search service.Typically,only distributions for the effort and beneﬁt are given instead of exact values.Thus,two binary relations effort and benefit are used for specifying the distributions independently:0.7effort(dl:ACMSearch,10).0.3effort(dl:ACMSearch,12).0.6benefit(dl:ACMSearch,20).0.4benefit(dl:ACMSearch,30).For computing the costs of execution plans,the relation chain is extended by two additional arguments for the effort and beneﬁt of the whole service chains,plan is extended by one argument for the costs of the execution plan:chain(S1,null,S2,E,B):-service(S1,I1,O1)&effort(S1,E1)&benefit(S1,B1)&service(S2,I2,O2)&effort(S2,E2)&benefit(S2,B2)&match(O1,I2)&add(E,E1,E2)&mult(B,B1,B2).chain(S1,S,S2,E,B):-chain(S1,S11,S,E1,B1)&chain(S,S22,S2,E2,B2)&add(E,E1,E2)&mult(B,B1,B2).plan(T,S1,S,S2,C):-task(T,TI,TO)&chain(S1,S,S2,E,B)&service(S1,I,O1,SE1,SB1)&match(TI,I)&service(S2,O2,O,SE2,SB2)&match(O,TO)&mult(SE,E,ec)&mult(SB,B,bc)&sub(C,SE,SB).When only distributions for the effort and beneﬁt are given,the rules compute the distribution of the costs.These distributions can be used for computing expected costs for every execution plan(outside logics).Finally,the exe-cution plan with lowest expected costs has to be selected.In the example,only exact efforts and beneﬁts are considered for the other services(for simplicity):effort(dl:DCQueryModification,4).benefit(dl:DCQueryModification,1.2).effort(dl:DC2ACMQuery,5).benefit(dl:DC2ACMQuery,0.8).effort(dl:ACM2DCResult,5).benefit(dl:ACM2DCResult,0.8).。

数字地球的关键技术-语义网和OWL简介

• OWL
– 对RDF Schema的扩展 – 描述逻辑（DL）的Web化
译自Ivan Herman, W3C的SW Q&A
语义网与AI的关系
• 语义网
– 采取“主语＋谓语＋宾语”的形式 – 一种表达元数据的简单方式 – 结构化和定义名词的方法 – 有限推理 – 与应用相关的规则 – 模糊逻辑
• AI还包括
译自Ivan Herman, W3C的SW Q&A
SW Tools
• (Graphical) Editors:
– IsaViz (Xerox Research/W3C), RDFAuthor (Univ. of Bristol), Protege 2000 (Stanford Univ.), SWOOP (Univ. of Maryland), Orient (IBM)
– Jena (for Java, includes OWL reasoning), RDFLib (for Python), Redland (in C, with interfaces to Tcl, Java, PHP, Perl, Python…),
• SWI-Prolog, IBM’s Semantic Toolkit, … • Databases (either based on an internal sql engine or fully triple based):
From TBL’s Scientific American Article: The Semantic Web, 2001
Tim Berners-Lee讲故事
• Almost instantly the new plan was presented: a much closer clinic and earlier times—but there were two warning notes. First, Pete would have to reschedule a couple of his less important appointments. He checked what they were—not a problem. The other was something about the insurance company's list failing to include this provider under physical therapists: "Service type and insurance plan status securely verified by other means," the agent reassured him. "(Details?)"

用户标注的词语网络与语义描述

・L I BRARY AND I N FORM A TI ON SERV ICE ・用户标注的词语网络与语义描述白　华郑州大学信息管理系　郑州450001〔摘要〕用户标注具有简洁、交流与共享、自由表达、推荐与检索等特点,但是它的平面结构使其很难适应语义网的需要,因而有必要进行语义建构,建立用户标注模型和语义联系,以便使用元数据与本体语言对用户标注进行语义描述,使之成为标签本体,以适应新一代因特网的发展。

〔关键词〕用户标注　词语网络　用户标注模型　标签语义描述〔分类号〕G254Research on the W ords Net and Se man ti c D escr i pti on s of Users Tagg i n g Bai HuaDepart m ent of I nf or mati on Manage ment,Zhengzhou University,Zhengzhou 450001〔Abstract 〕U ser tagging has s ome characteristics such as succinctness,exchange and sharing,exp ress freely,recommending andsearching,but it is difficult t o meet the needs of se mantic web for its p lane structure .It is necessary t o go on se mantic architecture in order t o construct the model of user tagging and se mantic relati ons,s o as t o use metadata and ont ol ogy language t o descri p t the se mantic of user tagging,and make it turn in tags Ont ol ogies t o meet the devel op needs for ne w general web .〔Keywords 〕user tagging words net model of user tagging tag semantic descri p ti ons 收稿日期:2009-08-07 修回日期:2009-11-04 本文起止页码:70-73 本文责任编辑:王传清1　用户标注的特点与语义处理的意义1.1　用户标注的特点用户标注是因特网用户对自己的资源或收藏的他人资源添加标签的活动。

一种层次聚类的RDF图语义检索方法研究

ＫｙｗｒｓＤ（ｅｏｒｅｄｓｒｔｎｆｍｗｒ）ｇａｈｉａｃｉｌｌｓｒｇｅａｔｔｅａ；ｖｃｒｓａｅｍｄｌｅｏｄ：ＲＦｒｓｕｃｅｃｐｏａｅｏｉｉｒｋｒｐ；ｈｅｒｈａｃｕｔｉ；ｓｍｎｉｒｒｖｌｅｔｐｃｏｅｒｃｅｎｃｅｉｏ
第２９卷第８期２１０２年８月
计算机应用研究
ＡｐｌａｉｎＲｅｅｒｈｏｍｐｔｍｐｉｔｓａｃｆＣｏｕｅｃｏ
Ｖｏ．９Ｎｏ８１２．
Ａｕ．２２ｇ０１
一
种层次聚类的ＲＦ图语义检索方法研究Ｄ
Ａｂｔａｔｓｒｃ：ＴｅｃｒｅｔｒｓａｃｅａｅｈｕｒｎｅｅｒｈｒｌｔｄＲＤＦｇａｈｒｔｅｅｅｉｓｓｍｅｐｏｌｍｓｕｈａｏｆｃｅｃｆｍｅｒｓｇ，ｌｗｒｐｅｒｖｘｓｏｒｂｅ，ｓｃｓｌｗｅｉｎｙｏｍｏｙｕａｅｏｉｔｉｓａｃｆｃｅｃｎＯｏ．ｈｓｐｐｒｒｐｓｄａｈｅａｃｉａｌｓｒｇｓｍａｔｅｒｅａｄｌｏｅｈｅｉｉｎｙａｄＳｎＴｉａｅｏｏｅｉｒｒｈｃｌｃｕｔｉｅｎｉｒｔｖｌｒｐｅｎｃｉｍｏｅｎＲＤＦｇａｈａｄｔｅｍｅｈｄｒｐｎｔｏｈｂｓｄｏｈｄｌｔｏｖｆｒｓｉｒｂｅ．ｈｔｅｔｃｉｇｅｔｉｓｆｏＦｇａｈａｄｈｅａｃｉａｌｓｒｇｂｈａｅｎｔｅｍｏｅｏｓｌｅａｏｅａｄｐｏｌｍＴａｘｒｔｎｉｅｒｍＲＤｒｐｎｉｒｈｃｌｃｕｔｉｙｔｅｓａｎｔｒｅｎｇｉａｃｆｈｎｏｏｙｌｒｒｄｅｃｍｐｅｒｐｔｃｕｅｉｔｅｔｃｕｅｆｒｅｃｅｔｅｒｖ．ｉｎａｉｇｔｒｅｏ — ｕｄｅｏｅｏｔｌｇｉａｙｍａｅｔｏｌｘｇａｈｓｒｔｒｏａｔｅｓｒｔｒｏｆｉｎｔｉａＯｒｔｔｇｔｂｎｔｂｈｕｎｒｕｉｒｅ１ｅｎａ