Knowledge Extraction by using an Ontology-based Annotation Tool
Knowledge Extraction Methods
Knowledge extraction is a crucial process in learning and understanding complex information. It involves distilling the most essential parts of a piece of information or a topic, making it easier to comprehend and retain. This process can be challenging, as it requires critical thinking, analysis, and the ability to separate relevant information from the irrelevant.
One effective method for knowledge extraction is summarization. Summarizing information involves condensing a large amount of content into a concise and focused summary. This allows individuals to grasp the main points and key concepts without getting bogged down by unnecessary details. By summarizing information, individuals can quickly review and recall important details when needed.
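As a rough illustration of the summarization idea just described, a frequency-based extractive summarizer can be sketched in a few lines of Python. This is a minimal sketch of one common heuristic, not the method of any particular tool; the function name and scoring rule are invented for illustration.

```python
import re
from collections import Counter

def summarize(text, max_sentences=2):
    """Pick the sentences whose words are most frequent overall
    (a crude proxy for 'main points')."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    words = re.findall(r'[a-z]+', text.lower())
    freq = Counter(words)

    # Score each sentence by the total frequency of its words.
    def score(s):
        return sum(freq[w] for w in re.findall(r'[a-z]+', s.lower()))

    ranked = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Preserve the original sentence order in the summary.
    return ' '.join(s for s in sentences if s in ranked)
```

Sentences that share vocabulary with the rest of the text score highest, which roughly matches the intuition that a summary should keep the recurring themes and drop the incidental details.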
The Process of Acquiring Knowledge Is More Important than Knowledge Itself (English Essay)
Three sample essays follow for reference.

Essay 1

Title: The Process of Obtaining Knowledge is More Important than Knowledge Itself

Introduction: Knowledge is a valuable asset that shapes our understanding of the world and influences our decisions and actions. However, the process of obtaining knowledge is equally significant, as it allows us to develop critical thinking skills, foster curiosity, and expand our understanding of various subjects.

Body:

1. Developing Critical Thinking Skills:
- The process of obtaining knowledge often involves researching, analyzing, and synthesizing information.
- Engaging in critical thinking allows individuals to evaluate the credibility of sources, recognize biases, and form well-informed opinions.

2. Fostering Curiosity:
- The process of acquiring knowledge encourages individuals to ask questions, seek answers, and explore new ideas.
- Curiosity drives innovation, creativity, and personal growth, enabling individuals to develop a deeper appreciation for learning.

3. Expanding Understanding:
- Through the process of obtaining knowledge, individuals are exposed to different perspectives, beliefs, and experiences.
- This exposure fosters empathy, tolerance, and cultural awareness, helping individuals to build connections and collaborate with diverse groups.

4. Overcoming Challenges:
- The process of obtaining knowledge often involves overcoming obstacles, making mistakes, and facing failures.
- These challenges provide opportunities for growth, resilience, and self-improvement, ultimately enhancing one's ability to adapt to new situations and navigate complex problems.

Conclusion: In conclusion, while knowledge is undoubtedly valuable, the process of obtaining knowledge is equally essential, as it cultivates critical thinking skills, fosters curiosity, expands understanding, and promotes personal growth.
By embracing the journey of learning, individuals can develop a lifelong passion for acquiring knowledge and continue to evolve intellectually, emotionally, and socially.

Essay 2

The Process of Acquiring Knowledge is More Important than Knowledge Itself

Knowledge is a valuable asset that enables us to understand the world around us, make informed decisions, and continuously grow and develop. However, while the accumulation of knowledge is important, the process of acquiring knowledge is equally, if not more, significant. The journey of learning not only expands our minds but also enhances our skills, fosters critical thinking, and shapes our attitudes and perspectives.

One of the key benefits of the process of acquiring knowledge is that it enhances our critical thinking skills. When we actively engage with new information, seek out different viewpoints, and analyze and evaluate the evidence, we are sharpening our ability to think critically and make sound judgments. This process of questioning, exploring, and synthesizing information is essential for developing a well-rounded understanding of complex issues and challenges.

Furthermore, the process of acquiring knowledge exposes us to diverse perspectives and ideas, broadening our horizons and challenging our preconceived notions. By engaging with a wide range of sources, including academic research, literature, art, and conversations with others, we deepen our understanding of the world and develop empathy and appreciation for different viewpoints. This exposure to diverse perspectives is crucial for fostering tolerance, understanding, and collaboration in an increasingly interconnected world.

Moreover, the process of acquiring knowledge is an ongoing journey of growth and development. As we delve into new subjects, experiment with different approaches, and learn from our experiences, we are constantly expanding our skills and knowledge base.
This process of continuous learning not only keeps our minds agile and sharp but also opens up new opportunities for personal and professional growth.

In addition, the process of acquiring knowledge is transformative, shaping our attitudes, beliefs, and behaviors. When we engage with new information, grapple with challenging concepts, and confront our biases and assumptions, we are forced to reevaluate our perspectives and beliefs. This process of self-reflection and growth is essential for personal development and fosters a lifelong commitment to learning and self-improvement.

Finally, the process of acquiring knowledge is a source of joy, fulfillment, and empowerment. When we immerse ourselves in learning, whether through reading, studying, exploring new skills, or engaging in discussions with others, we experience a sense of satisfaction and accomplishment. This joy of discovery and intellectual growth fuels our curiosity and motivates us to continue seeking out new knowledge and experiences.

In conclusion, while knowledge is undoubtedly valuable, the process of acquiring knowledge is equally if not more important. The journey of learning enhances our critical thinking skills, broadens our perspectives, fosters personal growth, shapes our attitudes and beliefs, and brings joy and fulfillment. By embracing the process of acquiring knowledge as a lifelong journey of growth and discovery, we can truly harness the transformative power of learning in our lives.

Essay 3

The Process of Acquiring Knowledge Is More Important than Knowledge Itself

Introduction

Knowledge is considered a valuable asset in today's fast-paced world. With the advancement of technology and the spread of information, access to knowledge has become easier than ever before. However, the process of acquiring knowledge is often overlooked in comparison to the knowledge itself.
In this essay, we will explore why the process of gaining knowledge is more important than knowledge itself.

Importance of the Process

The process of acquiring knowledge involves various steps such as critical thinking, research, analysis, and reflection. These steps not only help in understanding the information but also in developing skills like problem-solving, decision-making, and creativity. For example, when conducting research for a project, one not only gains knowledge of the topic but also improves one's research skills and ability to critically evaluate sources.

Furthermore, the process of acquiring knowledge allows for personal growth and development. As individuals engage in learning new things and challenging their existing beliefs, they become more open-minded, adaptable, and resilient. This continuous learning process helps individuals stay relevant in an ever-changing world and navigate uncertainties with confidence.

Moreover, the process of gaining knowledge fosters intellectual curiosity. By asking questions, exploring new ideas, and seeking answers, individuals can expand their horizons and deepen their understanding of the world. This curiosity drives innovation, creativity, and intellectual growth, leading to new discoveries and advancements in various fields.

Challenges of the Process

While the process of acquiring knowledge is essential for personal growth and development, it is not without challenges. The process can be time-consuming, requiring dedication, persistence, and hard work. It may involve setbacks, failures, and disappointments, which can be discouraging at times. However, overcoming these challenges can build resilience, perseverance, and determination, which are valuable traits for success in any endeavor.

Moreover, the process of gaining knowledge requires effective learning strategies and study habits.
It involves active engagement, critical thinking, and reflection, rather than mere memorization or rote learning. Developing these skills takes time and effort but is crucial for truly understanding and applying knowledge in real-life situations.

Conclusion

In conclusion, while knowledge itself is valuable, the process of acquiring knowledge is equally, if not more, important. The process of gaining knowledge involves critical thinking, research, analysis, and reflection, which not only help in understanding information but also in developing skills, fostering curiosity, and promoting personal growth. Despite the challenges, the process of acquiring knowledge is essential for staying relevant, navigating uncertainties, and driving innovation. Therefore, it is crucial to value and prioritize the process of gaining knowledge in order to truly unleash the potential of knowledge itself.
Learning Methods (English Essay)
When it comes to learning, there are various methods that can be employed to enhance the efficiency and effectiveness of the learning process. Here are some key strategies that can be incorporated into one's study routine:

1. Setting Clear Goals: Before starting any learning activity, it's crucial to set clear and achievable goals. This helps in maintaining focus and provides a sense of direction.

2. Creating a Study Schedule: Organizing a study schedule can significantly improve the learning experience. It allows for the allocation of specific time slots to different subjects or tasks, ensuring a balanced approach to learning.

3. Active Reading: Instead of passively reading through materials, active reading involves engaging with the text by taking notes, asking questions, and summarizing information in one's own words.

4. Participating in Group Studies: Collaborative learning through group studies can be beneficial as it allows for the exchange of ideas and insights, fostering a deeper understanding of the subject matter.

5. Using Mnemonics: Mnemonic devices are memory aids that can help in remembering complex information. They can be particularly useful when learning new vocabulary or historical dates.

6. Practicing Regularly: Consistent practice is essential for reinforcing learning.
Regularly revisiting and applying the knowledge gained can help in solidifying it in one's memory.

7. Seeking Feedback: Constructive feedback from teachers or peers can provide valuable insights into areas that need improvement, thus refining the learning process.

8. Utilizing Technology: Educational apps, online courses, and digital resources can supplement traditional learning methods and offer interactive and engaging ways to acquire knowledge.

9. Engaging in Critical Thinking: Developing critical thinking skills is crucial for deep learning. It involves analyzing, evaluating, and synthesizing information to form a comprehensive understanding.

10. Taking Breaks: It's important to take regular breaks during study sessions to avoid burnout. Short breaks can help refresh the mind and improve concentration.

11. Reflecting on Learning: Reflecting on what has been learned and how it can be applied in different contexts is a powerful way to consolidate knowledge.

12. Staying Curious and Open-Minded: Maintaining a curious and open-minded attitude towards learning can lead to a more enjoyable and enriching educational experience.

By incorporating these methods into your study routine, you can enhance your learning capabilities and achieve better academic results. Remember, the key to effective learning lies in finding the right balance between the strategies that work best for your individual learning style.
50 Questions on Philosophical Thought (in English)
1. The statement "All is flux" was proposed by _____.
A. Plato  B. Aristotle  C. Heraclitus  D. Socrates
Answer: C.
This question tests knowledge of the views of the ancient Greek philosophers. Heraclitus proposed the view that "all is flux." Option A, Plato, emphasized the theory of Forms; option B, Aristotle, focused on substance and form; option D, Socrates, advocated seeking truth through dialogue and reflection.
2. "Know thyself" is a famous saying from _____.
A. Thales  B. Pythagoras  C. Democritus  D. Socrates
Answer: D.
This question tests famous sayings of the ancient Greek philosophers. "Know thyself" is a saying of Socrates. Option A, Thales, mainly studied natural philosophy; option B, Pythagoras, is known for mathematics and mysticism; option C, Democritus, proposed atomism.
3. Which philosopher believed that the world is composed of water?
A. Anaximenes  B. Anaximander  C. Thales  D. Heraclitus
Answer: C.
This question tests the ancient Greek philosophers' views on the composition of the world. Thales held that the world is composed of water. Option A, Anaximenes, held that it is air; option B, Anaximander, held that it is the apeiron (the boundless); option D, Heraclitus, proposed that all is flux.
4. The idea of the "Forms" was put forward by _____.
A. Plato  B. Aristotle  C. Epicurus  D. Stoics
Answer: A.
This question tests a concept of ancient Greek philosophy. Plato put forward the theory of Ideas, i.e., the "Forms." Option B, Aristotle, criticized and developed it; option C, Epicurus, advocated hedonism; option D, the Stoics, emphasized morality and fate.
5. Who claimed that "The unexamined life is not worth living"?
A. Plato  B. Aristotle  C. Socrates  D. Epicurus
Answer: C.
Acquiring Knowledge through Questioning (English Essay)
Obtaining knowledge through questioning is a fundamental aspect of the learning process. The act of asking questions allows individuals to gain a deeper understanding of a subject, clarify uncertainties, and stimulate critical thinking. In this essay, we will explore the importance of asking questions as a means of acquiring knowledge.

First and foremost, questioning is a powerful tool for acquiring new information. When we ask questions, we are actively seeking answers and insights that can expand our knowledge base. By engaging in a dialogue with others or conducting research, we can uncover valuable information that may have otherwise gone unnoticed. This process of inquiry enables us to fill gaps in our understanding and acquire new perspectives on a given topic.

Furthermore, questioning helps us to clarify uncertainties and dispel misconceptions. When we encounter a complex or ambiguous concept, asking questions allows us to break it down into smaller, more manageable parts. By seeking clarification from experts or fellow learners, we can untangle confusion and gain a clearer grasp of the subject matter. This process of questioning and clarification is essential for overcoming obstacles and deepening our comprehension.

In addition, asking questions stimulates critical thinking and promotes intellectual growth. When we pose thoughtful inquiries, we are encouraged to analyze information, weigh evidence, and form reasoned conclusions. This practice of critical inquiry helps us to develop our analytical skills, hone our problem-solving abilities, and cultivate a curious and inquisitive mindset. By continuously questioning and challenging our assumptions, we can expand our intellectual horizons and push the boundaries of our knowledge.

Moreover, questioning fosters a spirit of curiosity and a thirst for knowledge. When we ask questions, we demonstrate a genuine interest in learning and a willingness to explore new ideas.
This curiosity drives us to seek out new information, engage with diverse perspectives, and pursue lifelong learning. By nurturing a habit of questioning, we can cultivate a sense of intellectual curiosity that fuels our growth and enables us to adapt to an ever-changing world.

In conclusion, the act of asking questions is a powerful means of acquiring knowledge, clarifying uncertainties, stimulating critical thinking, and fostering a spirit of curiosity. By engaging in a process of inquiry, we can deepen our understanding of the world, challenge our assumptions, and expand our intellectual horizons. Therefore, let us embrace the practice of questioning as a vital tool for gaining knowledge and enhancing our learning journey.
Everything Useful in English Composition (English Essay)
When it comes to English composition, there are several elements that are considered essential for crafting a well-written piece. Here's a breakdown of what makes an English essay not just useful, but also engaging and effective.

1. Clear and Concise Introduction: The introduction should grab the reader's attention and clearly state the purpose of the essay. It should be concise, avoiding unnecessary details that could confuse the reader.

2. Strong Thesis Statement: The thesis is the heart of your essay. It should be clear, concise, and debatable. It sets the tone for the entire composition and guides the reader through your argument.

3. Structured Body Paragraphs: Each paragraph should focus on a single main idea that supports your thesis. It should follow a logical structure: topic sentence, supporting details (evidence, examples, or explanations), and a concluding sentence that transitions to the next paragraph.

4. Coherent and Logical Flow: The flow of ideas is crucial. Each paragraph should connect to the next, creating a seamless argument. The use of transitional phrases can help maintain this flow.

5. Appropriate Use of Language: The language should be formal and appropriate for academic writing. Avoid slang, colloquialisms, and informal contractions. Vocabulary should be varied and precise, enhancing the clarity of your writing.

6. Citation and Referencing: When using sources, it's important to cite them correctly. This not only gives credit to the original authors but also adds credibility to your work. Different citation styles exist (APA, MLA, Chicago, etc.), so be sure to follow the one required for your essay.

7. Critical Analysis: Rather than just describing information, a useful essay should analyze, evaluate, and interpret the information. This shows a deeper level of understanding and engagement with the topic.

8. Conclusion: The conclusion should not introduce new information but rather summarize the main points and restate the thesis in a new way.
It should leave the reader with a final, memorable perspective on the topic.

9. Proofreading and Editing: Before submission, it's essential to proofread your essay for grammatical errors, typos, and inconsistencies in style. Editing for clarity and conciseness can also improve the overall quality of the composition.

10. Originality: Lastly, an essay should be original. Plagiarism is a serious offense in academic writing. Ensure that all ideas are your own or properly attributed.

By focusing on these elements, you can create an English composition that is not only useful for academic purposes but also a pleasure to read.
Knowledge Acquisition
3 Knowledge Acquisition (from An Introduction to Knowledge Engineering)

Introduction

In this chapter we will be looking at knowledge acquisition, i.e., the process of obtaining the knowledge to be stored in a knowledge-based system.

Objectives

By the end of the chapter you will be able to:
- define knowledge acquisition
- explain how knowledge is acquired from a human expert
- explain the purpose and types of interviews in obtaining knowledge
- explain why it is necessary to record the results of interviews using techniques such as repertory grids.

What Is Knowledge Acquisition?

Knowledge acquisition (sometimes referred to as knowledge elicitation) is the process of acquiring knowledge from a human expert, or a group of experts, and using the knowledge to build knowledge-based systems. An expert system must contain the knowledge of human experts; therefore the knowledge acquisition process primarily involves a discussion between the knowledge engineer and the human expert.

Clearly, a knowledge acquisition session should not be like a visit to the dentist. The knowledge engineer should aim to be as friendly as possible. In general, experts will be delighted to talk to anyone about their subject of interest, and will exhaust any knowledge engineer. This, however, does not mean that the knowledge acquisition process is easy.

Interviews

During your study you may have developed some familiarity with the use of interviews in a systems development context. These include:
- unstructured
- structured
- event recall
- thinking aloud.

A knowledge engineer can also use interviews as a method of obtaining knowledge from human experts; however, they must also consider other sources of knowledge.

Activity 1

You are a knowledge engineer about to start obtaining information for a new expert system. As part of this process, you are investigating the knowledge domain and have meetings arranged with a human expert. Besides talking to an expert, where else might a knowledge engineer look to find useful information?

Feedback 1

Sources of available knowledge include:
- procedure manuals
- records of past case studies
- standards documentation
- knowledge from other humans, less knowledgeable but more available than experts.

Clearly, we need a range of knowledge acquisition methods, including computer tools. We will also need to use a range of sources such as printed documentation and manuals.

Other Sources of Knowledge

Questionnaires are also valuable in many situations. There is clearly a considerable similarity between acquiring knowledge from experts in order to compile knowledge bases, and acquiring information from system users in order to develop a new or replacement information system.

Printed sources of knowledge can be very useful. In the specific context of knowledge engineering and acquiring knowledge of a particular domain, manuals, case studies and perhaps textbooks can also prove valuable. It is particularly important that the knowledge engineer uses these sources. As well as providing detailed technical information, they can be used to familiarise the knowledge engineer with the subject matter. Thus when the knowledge engineer conducts the preliminary interviews with the expert, they are already familiar with some of the terminology and have a basic grasp of the subject area. This prevents wasting the expert's time by asking them to explain trivial information.

While various types of documentation provide useful background to a specific knowledge domain, there is no guarantee that the documentation is either complete or up to date. Therefore one of the main methods of obtaining knowledge is still to use human experts, as they are more likely to be aware of the current state of knowledge in their particular domain.

The skills required of a knowledge engineer have already been discussed in Chapter 1. Some specific skills will also be expected from the human experts from whom knowledge will be elicited. Characteristics expected from an expert include being:
- articulate
- motivated
- logical
- systematic.

Conducting Interviews

To conduct a successful interview the knowledge engineer will need to:
- plan
- use appropriate stage management techniques
- consider and use appropriate social skills
- maintain appropriate self-control during the interview.

Activity 2

Think about the planning required for an interview to obtain knowledge from an expert. The activities that must take place are similar to organising any meeting. So, consider what you must do to plan a meeting in an office about any subject, or plan a meeting of a student society, for example, to discuss an important issue. Keeping this idea in mind, can you list the planning and stage management activities that need to take place before an interview?

Feedback 2

Planning

Ensure that the time and place of the interview are known. Decide the purpose of the interview and, based upon this, what type of interview technique would be most appropriate. Book the appropriate room for the interview to take place in; ensure appropriate refreshments are available. Where appropriate, plan the questions that need to be asked, or collect appropriate materials to trigger the expert's memory. Explain the nature and purpose of the interview to the expert; this will help the expert prepare for the interview. Ensure that the expert understands what factors will hinder progress of the interview; in other words, check that the expert understands the outcome of the interview. Ensure that appropriate recording devices are available, e.g. tape recorders, video, and an assistant to take notes where necessary.

Stage management techniques

Consideration needs to be given to the location and the time of day of the interview. Experts may work unusual hours, so what may normally be considered anti-social times may be appropriate for the interview. Consider the room layout to minimise disturbance and maximise comfort.

Unlike a conversation, the interview should not be assumed to be a natural form of interaction. Interviews are a crucial part of knowledge acquisition, and the time should be used as effectively as possible. As noted above, the interview should be approached in an organised and professional manner, even when the interview itself is unstructured. Interviews have a particular advantage over other forms of knowledge acquisition procedures: the knowledge engineer can satisfy both themselves and the expert that they have grasped the points that the expert has been making.

There are various tips that can help during the interview process. Firstly, avoid comparatives: words like bigger, better and lighter are not always helpful, and certainly not precise. Bigger/better/lighter than what? Secondly, bear in mind that the expert may miss out key parts of the reasoning process. Where parts of the process are potentially complex, the expert may ignore some of these complexities in order to simplify the explanation so that the knowledge engineer will understand them. Similarly, when solving problems the expert may make what appear to be intuitive leaps. In reality these are probably cause-and-effect relationships that the expert has noticed from years of experience in the domain. However, because these steps are 'intuitive' rather …
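The chapter's objectives mention recording interview results with repertory grids. A repertory grid rates a set of elements against bipolar constructs elicited from the expert; comparing the ratings then shows which elements the expert treats as similar. Here is a minimal sketch of that idea in Python, in which every element name, construct, and rating is invented purely for illustration:

```python
# Minimal repertory-grid sketch: elements are rated 1-5 against
# bipolar constructs elicited from the expert. All names and
# ratings below are invented for illustration.
from itertools import combinations

elements = ["fault A", "fault B", "fault C"]
constructs = ["intermittent vs constant", "electrical vs mechanical"]

# ratings[construct][element] -> 1..5 (1 = left pole, 5 = right pole)
ratings = {
    "intermittent vs constant": {"fault A": 1, "fault B": 5, "fault C": 4},
    "electrical vs mechanical": {"fault A": 2, "fault B": 5, "fault C": 5},
}

def similarity(e1, e2):
    """City-block distance across all constructs; smaller = more alike."""
    return sum(abs(ratings[c][e1] - ratings[c][e2]) for c in constructs)

# Which two elements does the expert treat as most alike?
pair = min(combinations(elements, 2), key=lambda p: similarity(*p))
print(pair)
```

Clusters of low-distance elements (here, fault B and fault C) suggest distinctions the expert makes implicitly, which the knowledge engineer can then probe in a follow-up interview.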
How to Acquire Knowledge (English Essay)
How to Acquire Knowledge

Knowledge is a powerful tool that can open doors, provide opportunities, and enhance our understanding of the world around us. In today's fast-paced and information-driven society, the ability to acquire knowledge efficiently and effectively is more important than ever. But with vast amounts of information available at our fingertips, it can be overwhelming to know where to start. In this article, we will explore some strategies and tips for acquiring knowledge in various fields.

1. Read Widely

One of the best ways to acquire knowledge is through reading. Whether it's books, articles, newspapers, or online resources, reading exposes us to new ideas, perspectives, and information. Make a habit of reading regularly and across a variety of genres and topics. This will not only expand your knowledge base but also improve your critical thinking and analytical skills.

2. Stay Curious

Curiosity is the driving force behind learning. Stay curious and ask questions about the world around you. Be open to new experiences, ideas, and perspectives. Engage in discussions, debates, and conversations with others to broaden your understanding of different subjects.

3. Attend Workshops and Seminars

Attending workshops and seminars is a great way to acquire knowledge in a structured and interactive setting. These events often feature experts in their field who can provide valuable insights and information. Take advantage of these opportunities to learn from others and network with like-minded individuals.

4. Utilize Online Courses and Tutorials

The internet has revolutionized the way we acquire knowledge. Online courses and tutorials are readily available on a wide range of subjects, from coding to cooking to creative writing. Take advantage of these resources to learn at your own pace and on your own schedule.

5. Seek Mentorship

Having a mentor can be invaluable in acquiring knowledge. A mentor is someone who can guide, support, and provide feedback as you navigate your learning journey. Seek out mentors in your field of interest and learn from their expertise and experience.

6. Practice Self-Discipline

Acquiring knowledge requires self-discipline and dedication. Set goals for yourself and establish a routine that allows for consistent learning. Make time for reading, studying, and practicing new skills on a daily or weekly basis.

7. Experiment and Explore

Don't be afraid to experiment and explore new things. Try new hobbies, take up a new language, or learn a musical instrument. Engaging in hands-on activities can enhance your learning experience and make acquiring knowledge more enjoyable.

In conclusion, acquiring knowledge is a lifelong process that requires dedication, curiosity, and an open mind. By implementing the strategies mentioned above and staying committed to your learning goals, you can expand your knowledge base and reach new heights of understanding. Remember that knowledge is power, and the more you acquire, the more opportunities you will have to grow and succeed in life.
Knowledge Engineering: Principles and Methods
Knowledge Engineering: Principles and Methods

Rudi Studer (1), V. Richard Benjamins (2), and Dieter Fensel (1)

(1) Institute AIFB, University of Karlsruhe, 76128 Karlsruhe, Germany
{studer, fensel}@aifb.uni-karlsruhe.de, http://www.aifb.uni-karlsruhe.de
(2) Artificial Intelligence Research Institute (IIIA), Spanish Council for Scientific Research (CSIC), Campus UAB, 08193 Bellaterra, Barcelona, Spain
richard@iiia.csic.es, http://www.iiia.csic.es/~richard
(2) Dept. of Social Science Informatics (SWI), richard@swi.psy.uva.nl, http://www.swi.psy.uva.nl/usr/richard/home.html

Abstract

This paper gives an overview of the development of the field of Knowledge Engineering over the last 15 years. We discuss the paradigm shift from a transfer view to a modeling view and describe two approaches which considerably shaped research in Knowledge Engineering: Role-limiting Methods and Generic Tasks. To illustrate various concepts and methods which evolved in recent years we describe three modeling frameworks: CommonKADS, MIKE, and PROTÉGÉ-II. This description is supplemented by discussing some important methodological developments in more detail: specification languages for knowledge-based systems, problem-solving methods, and ontologies. We conclude by outlining the relationship of Knowledge Engineering to Software Engineering, Information Integration and Knowledge Management.

Key Words: Knowledge Engineering, Knowledge Acquisition, Problem-Solving Method, Ontology, Information Integration

1 Introduction

In earlier days research in Artificial Intelligence (AI) was focused on the development of formalisms, inference mechanisms and tools to operationalize Knowledge-based Systems (KBS). Typically, the development efforts were restricted to the realization of small KBSs in order to study the feasibility of the different approaches. Though these studies offered rather promising results, the transfer of this technology into commercial use in order to build large KBSs failed in many cases.
The situation was directly comparable to a similar situation in the construction of traditional software systems, called the "software crisis" in the late sixties: the means to develop small academic prototypes did not scale up to the design and maintenance of large, long-living commercial systems. In the same way as the software crisis resulted in the establishment of the discipline Software Engineering, the unsatisfactory situation in constructing KBSs made clear the need for more methodological approaches.

So the goal of the new discipline Knowledge Engineering (KE) is similar to that of Software Engineering: turning the process of constructing KBSs from an art into an engineering discipline. This requires the analysis of the building and maintenance process itself and the development of appropriate methods, languages, and tools specialized for developing KBSs.

Subsequently, we will first give an overview of some important historical developments in KE; special emphasis will be put on the paradigm shift from the so-called transfer approach to the so-called modeling approach. This paradigm shift is sometimes also considered as the transition from first-generation expert systems to second-generation expert systems [43]. Based on this discussion, Section 2 will be concluded by describing two prominent developments of the late eighties: Role-limiting Methods [99] and Generic Tasks [36]. In Section 3 we will present some modeling frameworks which have been developed in recent years: CommonKADS [129], MIKE [6], and PROTÉGÉ-II [123]. Section 4 gives a short overview of specification languages for KBSs. Problem-solving methods have been a major research topic in KE for the last decade; basic characteristics of (libraries of) problem-solving methods are described in Section 5. Ontologies, which gained a lot of importance during the last years, are discussed in Section 6.
The paper concludes with a discussion of current developments in KE and their relationships to other disciplines.

In KE much effort has also been put into developing methods and supporting tools for knowledge elicitation (compare [48]). For example, in the VITAL approach [130] a collection of elicitation tools, such as repertory grids (see [65], [83]), is offered for supporting the elicitation of domain knowledge (compare also [49]). However, a discussion of the various elicitation methods is beyond the scope of this paper.

2 Historical Roots

2.1 Basic Notions

In this section we will first discuss some main principles which have characterized the development of KE from the very beginning.

Knowledge Engineering as a Transfer Process

"This transfer and transformation of problem-solving expertise from a knowledge source to a program is the heart of the expert-system development process." [81]

In the early eighties the development of a KBS was seen as a process of transferring human knowledge into an implemented knowledge base. This transfer was based on the assumption that the knowledge required by the KBS already exists and just has to be collected and implemented. Most often, the required knowledge was obtained by interviewing experts on how they solve specific tasks [108]. Typically, this knowledge was implemented in some kind of production rules which were executed by an associated rule interpreter. However, a careful analysis of the various rule knowledge bases showed that the rather simple representation formalism of production rules did not support an adequate representation of different types of knowledge [38]: e.g. in the MYCIN knowledge base [44] strategic knowledge about the order in which goals should be achieved (e.g. "consider common causes of a disease first") is mixed up with domain-specific knowledge about, for example, causes of a specific disease.
This mixture of knowledge types, together with the lack of adequate justifications for the different rules, makes the maintenance of such knowledge bases very difficult and time consuming. Therefore, the transfer approach was only feasible for the development of small prototypical systems; it failed to produce large, reliable and maintainable knowledge bases. Furthermore, it was recognized that the transfer approach's assumption that knowledge acquisition is the collection of already existing knowledge elements was wrong, due to the important role of tacit knowledge in an expert's problem-solving capabilities. These deficiencies resulted in a paradigm shift from the transfer approach to the modeling approach.

Knowledge Engineering as a Modeling Process

Nowadays there exists an overall consensus that the process of building a KBS may be seen as a modeling activity. Building a KBS means building a computer model with the aim of realizing problem-solving capabilities comparable to a domain expert's. The intention is not to create a cognitively adequate model, i.e. to simulate the cognitive processes of an expert in general, but to create a model which offers similar results in problem-solving for problems in the area of concern. While the expert may consciously articulate some parts of his or her knowledge, he or she will not be aware of a significant part of this knowledge, since it is hidden in his or her skills. This knowledge is not directly accessible, but has to be built up and structured during the knowledge acquisition phase. Therefore knowledge acquisition is no longer seen as a transfer of knowledge into an appropriate computer representation, but as a model construction process ([41], [106]).

This modeling view of the process of building a KBS has the following consequences:

• Like every model, such a model is only an approximation of reality.
In principle, the modeling process is infinite, because it is an incessant activity with the aim of approximating the intended behaviour.

• The modeling process is cyclic. New observations may lead to a refinement, modification, or completion of the already built-up model. Conversely, the model may guide the further acquisition of knowledge.

• The modeling process is dependent on the subjective interpretations of the knowledge engineer. Therefore this process is typically faulty, and an evaluation of the model with respect to reality is indispensable for the creation of an adequate model. According to this feedback loop, the model must be revisable at every stage of the modeling process.

Problem-Solving Methods

In [39] Clancey reported on the analysis of a set of first-generation expert systems developed to solve different tasks. Though they were realized using different representation formalisms (e.g. production rules, frames, LISP), he discovered a common problem-solving behaviour. Clancey was able to abstract this common behaviour to a generic inference pattern called Heuristic Classification, which describes the problem-solving behaviour of these systems on an abstract level, the so-called Knowledge Level [113]. The knowledge level makes it possible to describe reasoning in terms of goals to be achieved, actions necessary to achieve these goals, and knowledge needed to perform these actions. A knowledge-level description of a problem-solving process abstracts from details concerned with the implementation of the reasoning process and results in the notion of a Problem-Solving Method (PSM). A PSM may be characterized as follows (compare [20]):

• A PSM specifies which inference actions have to be carried out for solving a given task.

• A PSM determines the sequence in which these actions have to be activated.

• In addition, so-called knowledge roles determine which role the domain knowledge plays in each inference action.
These knowledge roles define a domain-independent generic terminology. When considering the PSM Heuristic Classification in some more detail (Figure 1), we can identify the three basic inference actions abstract, heuristic match, and refine. Furthermore, four knowledge roles are defined: observables, abstract observables, solution abstractions, and solutions. It is important to see that such a description of a PSM is given in a generic way; thus the reuse of a PSM in different domains is made possible. In a medical domain, an observable like "41° C" may be abstracted to "high temperature" by the inference action abstract. This abstracted observable may be matched to a solution abstraction, e.g. "infection", and finally the solution abstraction may be hierarchically refined to a solution, e.g. the disease "influenza". In the meantime various PSMs have been identified, such as Cover-and-Differentiate for solving diagnostic tasks [99] or Propose-and-Revise [100] for parametric design tasks.

Fig. 1 The Problem-Solving Method Heuristic Classification

PSMs may be exploited in the knowledge engineering process in different ways:

• PSMs contain inference actions which need specific knowledge in order to perform their task. For instance, Heuristic Classification needs a hierarchically structured model of observables and solutions for the inference actions abstract and refine, respectively. So a PSM may be used as a guideline to acquire static domain knowledge.

• A PSM makes it possible to describe the main rationale of the reasoning process of a KBS, which supports the validation of the KBS, because the expert is able to understand the problem-solving process.
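The abstract/match/refine chain of Heuristic Classification can be sketched in a few lines of code. This is a minimal illustration, not an actual RLM shell; the medical lookup tables below are invented for the example, and only the three generic functions and the knowledge roles (the dictionary keys and values) come from the PSM description.

```python
# Hypothetical domain knowledge, organized by the PSM's knowledge roles.
ABSTRACTIONS = {            # observable -> abstract observable
    "temperature=41C": "high temperature",
}
MATCHES = {                 # abstract observable -> solution abstraction
    "high temperature": "infection",
}
REFINEMENTS = {             # solution abstraction -> concrete solutions
    "infection": ["influenza", "bacterial infection"],
}

def abstract(observable):
    """Inference action 'abstract': observable -> abstract observable."""
    return ABSTRACTIONS.get(observable)

def heuristic_match(abstract_observable):
    """Inference action 'heuristic match': abstraction -> solution abstraction."""
    return MATCHES.get(abstract_observable)

def refine(solution_abstraction):
    """Inference action 'refine': solution abstraction -> solutions."""
    return REFINEMENTS.get(solution_abstraction, [])

def heuristic_classification(observable):
    """Chain the three generic inference actions; the control flow is
    domain-independent, only the role tables are domain-specific."""
    return refine(heuristic_match(abstract(observable)))

print(heuristic_classification("temperature=41C"))
# ['influenza', 'bacterial infection']
```

Reusing the method in another domain would only require exchanging the three role tables, which is exactly the reuse argument made for PSMs.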
In addition, this abstract description may be used during the problem-solving process itself for explanation facilities.

• Since PSMs may be reused for developing different KBSs, a library of PSMs can be exploited for constructing KBSs from reusable components.

The concept of PSMs has strongly stimulated research in KE and thus has influenced many approaches in this area. A more detailed discussion of PSMs is given in Section 5.

2.2 Specific Approaches

During the eighties two main approaches evolved which had significant influence on the development of modeling approaches in KE: Role-Limiting Methods and Generic Tasks.

Role-Limiting Methods

Role-Limiting Methods (RLMs) ([99], [102]) were one of the first attempts to support the development of KBSs by exploiting the notion of a reusable problem-solving method. The RLM approach may be characterized as a shell approach. Such a shell comes with an implementation of a specific PSM and thus can only be used to solve the type of tasks for which that PSM is appropriate. The given PSM also defines the generic roles that knowledge can play during the problem-solving process, and it completely fixes the knowledge representation for the roles, such that the expert only has to instantiate the generic concepts and relationships which are defined by these roles.

Let us consider as an example the PSM Heuristic Classification (see Figure 1). An RLM based on Heuristic Classification offers a role observables to the expert. Using that role, the expert (i) has to specify which domain-specific concept corresponds to that role, e.g. "patient data" (see Figure 4), and (ii) has to provide domain instances for that concept, e.g. concrete facts about patients. It is important to see that the kind of knowledge which is used by the RLM is predefined.
Therefore, the acquisition of the required domain-specific instances may be supported by (graphical) interfaces which are custom-tailored for the given PSM. In the following we will discuss one RLM in some more detail: SALT ([100], [102]), which is used for solving constructive tasks. Then we will outline a generalization of RLMs to so-called Configurable RLMs.

SALT is an RLM for building KBSs which use the PSM Propose-and-Revise. Thus KBSs may be constructed for solving specific types of design tasks, e.g. parametric design tasks. The basic inference actions that Propose-and-Revise is composed of may be characterized as follows:

• extend a partial design by proposing a value for a design parameter not yet computed,

• determine whether all computed parameters fulfil the relevant constraints, and

• apply fixes to remove constraint violations.

In essence, three generic roles may be identified for Propose-and-Revise ([100]):

• "design extensions" refer to knowledge for proposing a new value for a design parameter,

• "constraints" provide knowledge restricting the admissible values for parameters, and

• "fixes" make potential remedies available for specific constraint violations.

From this characterization of the PSM Propose-and-Revise, one can easily see that the PSM is described in generic, domain-independent terms. Thus the PSM may be used for solving design tasks in different domains by specifying the required domain knowledge for the different predefined generic knowledge roles. For example, when SALT was used for building the VT system [101], a KBS for configuring elevators, the domain expert used the form-oriented user interface of SALT for entering domain-specific design extensions (see Figure 2).
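The propose-extend-check-fix cycle of Propose-and-Revise can be sketched as a small generic loop. This is an illustrative toy, not the SALT or VT knowledge base: the two-parameter "elevator-like" example (platform and opening width, the 80-unit limit) is invented, and only the three knowledge roles (design extensions, constraints, fixes) follow the characterization above.

```python
def propose_and_revise(extensions, constraints, fixes):
    """Generic Propose-and-Revise loop.
    extensions:  ordered (parameter, proposer) pairs -> 'design extensions' role
    constraints: design -> list of violated constraint names -> 'constraints' role
    fixes:       constraint name -> function patching the design -> 'fixes' role
    """
    design = {}
    for param, proposer in extensions:
        design[param] = proposer(design)        # extend the partial design
        for violation in constraints(design):   # check relevant constraints
            design = fixes[violation](design)   # apply a fix to the violation
    return design

# Hypothetical domain knowledge for an elevator-like parametric design task.
extensions = [
    ("platform_width", lambda d: 100),
    ("opening_width",  lambda d: d["platform_width"] * 0.9),
]

def constraints(design):
    violated = []
    if design.get("opening_width", 0) > 80:
        violated.append("opening_too_wide")
    return violated

fixes = {"opening_too_wide": lambda d: {**d, "opening_width": 80}}

print(propose_and_revise(extensions, constraints, fixes))
# {'platform_width': 100, 'opening_width': 80}
```

As in SALT, the loop itself is domain-independent; a different design task would only plug different knowledge into the three roles.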
That is, the generic terminology of the knowledge roles, which is defined by object and relation types, is instantiated with VT-specific instances.

Name: CAR-JAMB-RETURN
Precondition: DOOR-OPENING = CENTER
Procedure: CALCULATION
Formula: [PLATFORM-WIDTH - OPENING-WIDTH] / 2
Justification: CENTER-OPENING DOORS LOOK BEST WHEN CENTERED ON PLATFORM.

(The value of the design parameter CAR-JAMB-RETURN is calculated according to the formula in case the precondition is fulfilled; the justification gives a description of why this parameter value is preferred over other values. Example taken from [100].)

Fig. 2 Design Extension Knowledge for VT

On the one hand, the predefined knowledge roles, and thus the predefined structure of the knowledge base, may be used as a guideline for the knowledge acquisition process: it is clearly specified what kind of knowledge has to be provided by the domain expert. On the other hand, in most real-life situations the problem arises of how to determine whether a specific task may be solved by a given RLM. Such task analysis is still a crucial problem, since up to now there does not exist a well-defined collection of features for characterizing a domain task in a way which would allow a straightforward mapping to appropriate RLMs. Moreover, RLMs have a fixed structure and do not provide a good basis when a particular task can only be solved by a combination of several PSMs. In order to overcome this inflexibility of RLMs, the concept of configurable RLMs has been proposed.

Configurable Role-Limiting Methods (CRLMs) as discussed in [121] exploit the idea that a complex PSM may be decomposed into several subtasks, where each of these subtasks may be solved by different methods (see Section 5). In [121], various PSMs for solving classification tasks, like Heuristic Classification or Set-covering Classification, were analysed with respect to common subtasks.
This analysis resulted in the identification of shared subtasks like "data abstraction" or "hypothesis generation and test". Within the CRLM framework a predefined set of different methods is offered for solving each of these subtasks. Thus a PSM may be configured by selecting a method for each of the identified subtasks. In that way the CRLM approach provides means for configuring the shell for different types of tasks. It should be noted that each method offered for solving a specific subtask has to meet the knowledge role specifications that are predetermined for the CRLM shell, i.e. the CRLM shell comes with a fixed scheme of knowledge types. As a consequence, the introduction of a new method into the shell typically involves the modification and/or extension of the current scheme of knowledge types [121]. Having a fixed scheme of knowledge types and predefined communication paths between the various components is an important restriction distinguishing the CRLM framework from more flexible configuration approaches such as CommonKADS (see Section 3).

It should be clear that the introduction of such flexibility into the RLM approach removes one of its disadvantages while still exploiting the advantage of having a fixed scheme of knowledge types, which builds the basis for generating effective knowledge-acquisition tools. On the other hand, configuring a CRLM shell increases the burden on the system developer, who must have the knowledge and ability to configure the system in the right way.

Generic Tasks and Task Structures

In the early eighties the analysis and construction of various KBSs for diagnostic and design tasks evolved gradually into the notion of a Generic Task (GT) [36].
GTs like Hierarchical Classification or State Abstraction are building blocks which can be reused for the construction of different KBSs. The basic idea of GTs may be characterized as follows (see [36]):

• A GT is associated with a generic description of its input and output.

• A GT comes with a fixed scheme of knowledge types specifying the structure of the domain knowledge needed to solve a task.

• A GT includes a fixed problem-solving strategy specifying the inference steps the strategy is composed of and the sequence in which these steps have to be carried out.

The GT approach is based on the strong interaction problem hypothesis, which states that the structure and representation of domain knowledge is completely determined by its use [33]. Therefore, a GT comes with both a fixed problem-solving strategy and a fixed collection of knowledge structures. Since a GT fixes the type of knowledge which is needed to solve the associated task, a GT provides a task-specific vocabulary which can be exploited to guide the knowledge acquisition process. Furthermore, by offering an executable shell for a GT, called a task-specific architecture, the implementation of a specific KBS could be considered as the instantiation of the predefined knowledge types by domain-specific terms (compare [34]). On a rather pragmatic basis several GTs have been identified, including Hierarchical Classification, Abductive Assembly and Hypothesis Matching. This initial collection of GTs was considered as a starting point for building up an extended collection covering a wide range of relevant tasks. However, when analyzed in more detail, two main disadvantages of the GT approach were identified (see [37]):

• The notion of task is conflated with the notion of the PSM used to solve the task, since each GT included a predetermined problem-solving strategy.

• The complexity of the proposed GTs was very different, i.e.
it remained open what the appropriate level of granularity for the building blocks should be.

Based on this insight into the disadvantages of the notion of a GT, the so-called Task Structure approach was proposed [37]. The Task Structure approach makes a clear distinction between a task, which is used to refer to a type of problem, and a method, which is a way to accomplish a task. In that way a task structure may be defined as follows (see Figure 3): a task is associated with a set of alternative methods suitable for solving the task. Each method may be decomposed into several subtasks. The decomposition structure is refined to a level where elementary subtasks are introduced which can directly be solved by using available knowledge. As we will see in the following sections, the basic notions of task and (problem-solving) method, and their embedding into a task-method decomposition structure, are concepts which are nowadays shared among most knowledge engineering methodologies.

Fig. 3 Sample Task Structure for Diagnosis

3 Modeling Frameworks

In this section we will describe three modeling frameworks which address various aspects of model-based KE approaches: CommonKADS [129] is prominent for having defined the structure of the Expertise Model, MIKE [6] puts emphasis on a formal and executable specification of the Expertise Model as the result of the knowledge acquisition phase, and PROTÉGÉ-II [51] exploits the notion of ontologies. It should be clear that there exist further approaches which are well known in the KE community, like e.g. VITAL [130], Commet [136], and EXPECT [72]. However, a discussion of all these approaches is beyond the scope of this paper.

3.1 The CommonKADS Approach

A prominent knowledge engineering approach is KADS [128] and its further development into CommonKADS [129].
A basic characteristic of KADS is the construction of a collection of models, where each model captures specific aspects of the KBS to be developed as well as of its environment. In CommonKADS the Organization Model, the Task Model, the Agent Model, the Communication Model, the Expertise Model and the Design Model are distinguished. Whereas the first four models aim at modeling the organizational environment in which the KBS will operate, as well as the tasks that are performed in the organization, the Expertise and Design Models describe (non-)functional aspects of the KBS under development. Subsequently, we will briefly discuss each of these models and then provide a detailed description of the Expertise Model:

• Within the Organization Model the organizational structure is described, together with a specification of the functions which are performed by each organizational unit. Furthermore, the deficiencies of the current business processes, as well as opportunities to improve these processes by introducing KBSs, are identified.

• The Task Model provides a hierarchical description of the tasks which are performed in the organizational unit in which the KBS will be installed. This includes a specification of which agents are assigned to the different tasks.

• The Agent Model specifies the capabilities of each agent involved in the execution of the tasks at hand. In general, an agent can be a human or some kind of software system, e.g. a KBS.

• Within the Communication Model the various interactions between the different agents are specified. Among other things, it specifies which type of information is exchanged between the agents and which agent initiates the interaction.

A major contribution of the KADS approach is its proposal for structuring the Expertise Model, which distinguishes three different types of knowledge required to solve a particular task.
Basically, the three types correspond to a static view, a functional view and a dynamic view of the KBS to be built (see in Figure 4, respectively, "domain layer", "inference layer" and "task layer"):

• Domain layer: At the domain layer all the domain-specific knowledge needed to solve the task at hand is modeled. This includes a conceptualization of the domain in a domain ontology (see Section 6), and a declarative theory of the required domain knowledge. One objective in structuring the domain layer is to make it as reusable as possible for solving different tasks.

• Inference layer: At the inference layer the reasoning process of the KBS is specified by exploiting the notion of a PSM. The inference layer describes the inference actions the generic PSM is composed of, as well as the roles which are played by the domain knowledge within the PSM. The dependencies between inference actions and roles are specified in what is called an inference structure. Furthermore, the notion of roles provides a domain-independent view on the domain knowledge. In Figure 4 (middle part) we see the inference structure for the PSM Heuristic Classification. Among other things we can see that "patient data" plays the role of "observables" within the inference structure of Heuristic Classification.

• Task layer: The task layer provides a decomposition of tasks into subtasks and inference actions, including a goal specification for each task and a specification of how these goals are achieved.

Fig. 4 Expertise Model for medical diagnosis (simplified CML notation)
The task layer also provides means for specifying the control over the subtasks and inference actions which are defined at the inference layer.

Two types of languages are offered to describe an Expertise Model: CML (Conceptual Modeling Language) [127], which is a semi-formal language with a graphical notation, and (ML)2 [79], which is a formal specification language based on first-order predicate logic, meta-logic and dynamic logic (see Section 4). Whereas CML is oriented towards providing a communication basis between the knowledge engineer and the domain expert, (ML)2 is oriented towards formalizing the Expertise Model.

The clear separation of the domain-specific knowledge from the generic description of the PSM at the inference and task layers enables, in principle, two kinds of reuse: on the one hand, a domain layer description may be reused for solving different tasks with different PSMs; on the other hand, a given PSM may be reused in a different domain by defining a new view onto another domain layer. This reuse approach is a weakening of the strong interaction problem hypothesis [33] which was addressed in the GT approach (see Section 2). In [129] the notion of a relative interaction hypothesis is defined to indicate that some kind of dependency exists between the structure of the domain knowledge and the type of task which should be solved. To achieve a flexible adaptation of the domain layer to a new task environment, the notion of layered ontologies is proposed: task and PSM ontologies may be defined as viewpoints on an underlying domain ontology. Within CommonKADS a library of reusable and configurable components, which can be used to build up an Expertise Model, has been defined [29]. A more detailed discussion of PSM libraries is given in Section 5.

In essence, the Expertise Model and the Communication Model capture the functional requirements for the target system.
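The role-based separation between domain layer and inference layer can be made concrete with a small sketch. This is not CML or (ML)2 syntax; all names are illustrative. The point is that the generic inference action only refers to role names, and a mapping binds those roles to a concrete domain layer, so the same inference code can be reused over a different domain.

```python
# Domain layer: domain-specific medical knowledge (hypothetical content).
medical_domain = {
    "patient data": ["temperature=41C", "pulse=70"],
    "abstraction rules": {"temperature=41C": "high temperature"},
}

# Role mapping: which domain-specific concept plays which generic role.
role_view = {
    "observables": "patient data",
    "abstraction knowledge": "abstraction rules",
}

def abstract_inference(domain, roles):
    """Generic inference action 'abstract': written only in terms of role
    names, so it is domain-independent; the role mapping supplies the view
    onto the domain layer."""
    rules = domain[roles["abstraction knowledge"]]
    observables = domain[roles["observables"]]
    return [rules[o] for o in observables if o in rules]

print(abstract_inference(medical_domain, role_view))  # ['high temperature']
```

Reuse then works in both directions described above: a new task binds a different role mapping onto the same domain layer, and a new domain supplies a different domain dictionary to the same inference action.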
Based on these requirements the Design Model is developed, which specifies, among other things, the system architecture and the computational mechanisms for realizing the inference actions. KADS aims at achieving a structure-preserving design, i.e. the structure of the Design Model should reflect the structure of the Expertise Model as much as possible [129]. All the development activities, which result in a stepwise construction of the different models, are embedded in a cyclic and risk-driven life-cycle model similar to Boehm's spiral model [21].

The basic structure of the Expertise Model has some similarities with the data, functional, and control views of a system as known from software engineering. However, a major difference may be seen between an inference layer and a typical data-flow diagram (compare [155]): whereas an inference layer is specified in generic terms and provides - via roles and domain views - a flexible connection to the data described at the domain layer, a data-flow diagram is completely specified in domain-specific terms. Moreover, the data dictionary does not correspond to the domain layer, since the domain layer may provide a complete model of the domain at hand which is only partially used by the inference layer, whereas the data dictionary describes exactly those data which are used to specify the data flow within the data-flow diagram (see also [54]).

3.2 The MIKE Approach

The MIKE approach (Model-based and Incremental Knowledge Engineering) (cf. [6], [7])
Knowledge Comes from Questioning: An Exploration

The pursuit of knowledge is a fundamental aspect of the human experience. Throughout history, individuals have sought to understand the world around them, driven by an innate curiosity and a desire to expand the boundaries of their understanding. At the heart of this quest lies the act of questioning, a powerful tool that has the ability to unlock the mysteries of the universe and propel us towards greater enlightenment.

Questioning, in its essence, is the act of seeking information, clarification, or understanding. It is the catalyst that ignites the spark of discovery, guiding us towards new realms of knowledge and challenging our preconceived notions. By posing questions, we open ourselves up to a world of possibilities, inviting exploration and encouraging critical thinking.

The importance of questioning cannot be overstated. It is the foundation upon which all scientific progress is built, as researchers and scholars continuously ask questions to uncover new truths and challenge existing paradigms. In the realm of education, questioning is a vital pedagogical tool, empowering students to engage actively with the material and develop a deeper understanding of the subject matter.

Moreover, questioning is not limited to the academic or scientific spheres; it is a universal skill that can be applied in all aspects of life. Individuals who possess the ability to ask thoughtful, insightful questions are often better equipped to navigate complex situations, make informed decisions, and drive innovation.

One of the most significant benefits of questioning is its ability to foster personal growth and development. By asking questions, we confront our own biases, assumptions, and limitations, forcing us to re-evaluate our perspectives and consider alternative viewpoints.
This process of self-reflection and introspection can lead to profound personal transformations, as we challenge our own beliefs and strive to expand our understanding of the world and ourselves.

Furthermore, questioning is not merely a passive act of seeking information; it is an active engagement with the world around us. When we ask questions, we are not simply receiving information; rather, we are participating in a dynamic exchange of ideas, perspectives, and insights. This interaction can lead to the generation of new knowledge, as the process of questioning and responding can uncover previously unexplored connections and lead to novel solutions.

In the realm of problem-solving, questioning is an indispensable tool. By asking the right questions, individuals and teams can identify the root causes of complex issues, explore alternative approaches, and develop more effective solutions. This ability to question and challenge the status quo is essential for driving innovation and progress in any field.

Beyond its practical applications, the act of questioning also holds deep philosophical and existential significance. Fundamental questions about the nature of reality, the meaning of life, and the human condition have been the subject of contemplation and debate for centuries, inspiring profound philosophical discourse and shaping our understanding of the world and our place within it.

In conclusion, knowledge comes from questioning, a powerful and dynamic process that lies at the heart of human understanding and progress. By embracing the art of questioning, we can unlock new realms of knowledge, challenge our own biases, and contribute to the collective advancement of humanity. As we navigate the complexities of the world, let us continue to ask questions, seek answers, and embark on the endless journey of discovery and enlightenment.
Truth-Seeking in Science (English essay)
English Answer:

Science is a rigorous and disciplined field that seeks to uncover the truth about the natural world through systematic observation, experimentation, and analysis. At the heart of scientific inquiry lies a fundamental commitment to epistemic values such as objectivity, skepticism, and truth-seeking. These principles guide scientists in their pursuit of knowledge, shaping their methods, interpretations, and conclusions.

One of the core tenets of scientific practice is the principle of epistemic justification, which holds that claims must be supported by evidence and logical reasoning. Scientists are expected to provide empirical evidence to substantiate their hypotheses and theories, ensuring that their conclusions are not mere conjectures or speculations. This emphasis on evidence-based reasoning distinguishes science from other fields of inquiry, such as philosophy or religion, which may rely on intuition, revelation, or faith as sources of knowledge.

However, scientific inquiry is not solely driven by the pursuit of objective truth. It also involves a degree of subjectivity and interpretation, as scientists bring their own perspectives, biases, and preconceptions to their work. This inherent subjectivity raises questions about the extent to which scientific knowledge can ever truly be objective or value-free.

Despite these challenges, science has proven to be a remarkably successful and reliable method for generating knowledge and understanding the world around us.
Its rigorous methodology and commitment to epistemic values have led to countless breakthroughs and advancements in various fields, including medicine, technology, and environmental science.

In conclusion, science is an epistemic endeavor that seeks to uncover the truth about the natural world through systematic observation, experimentation, and analysis. While it is not immune to subjectivity and interpretation, its commitment to epistemic values ensures that scientific knowledge is grounded in evidence and logical reasoning, making it a powerful tool for expanding our understanding of the universe.
Distilling the Knowledge in a Neural Network
Knowledge distillation is a method of model compression: a larger, more complex "teacher" model guides the training of a lighter "student" model, so that the student retains as much of the teacher's accuracy as possible while reducing model size and computational cost. The approach gained attention mainly through Hinton's paper "Distilling the Knowledge in a Neural Network".

Knowledge distillation is also a simple way to compensate for the weak supervision signal in classification problems. In a traditional classification problem, the goal of the model is to map the input features to a point in the output space; in the famous ImageNet competition, for example, all possible input images are mapped to 1000 points in the output space. Each of these 1000 points is a one-hot-encoded category, so such a label can provide only on the order of log(#classes) bits of supervision information. In KD, however, the teacher model outputs a continuous label distribution for each sample, so the supervision information is much richer than that of a one-hot label. Seen from another angle, if there is only a single hard label as the target, the model is forced to map every training sample of a class to the same point, losing the intra-class variance and inter-class distances that are very helpful for training. The output of the teacher model recovers this information. As in the paper's example, a cat and a dog are closer to each other than a cat and a table, and if an animal really does look like both a cat and a dog, it can provide supervision for both categories. To sum up, the core idea of KD is to replace the supervision signal that was compressed to a single point with a "dispersed" distribution, so that the student model's output distribution matches the teacher model's output as closely as possible.
In fact, a teacher model is not strictly necessary to achieve this goal: uncertainty information retained during data annotation or collection can likewise aid the training of the model.
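The soft-target idea above can be made concrete. The following is a minimal sketch in plain Python (the function names, the temperature `T` and the mixing weight `alpha` are illustrative choices, not a fixed API); it combines the hard-label cross-entropy with the temperature-softened KL term described in Hinton's paper:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T gives a softer distribution."""
    z = [v / T for v in logits]
    m = max(z)                       # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def distillation_loss(student_logits, teacher_logits, hard_label, T=4.0, alpha=0.5):
    """Sketch of the distillation objective: a weighted sum of the
    cross-entropy with the hard label and the KL divergence between the
    teacher's and student's temperature-softened outputs. The KL term is
    scaled by T^2 so its gradient magnitude stays comparable across T."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = sum(pt * (math.log(pt) - math.log(ps)) for pt, ps in zip(p_t, p_s))
    ce = -math.log(softmax(student_logits)[hard_label])
    return alpha * ce + (1 - alpha) * (T ** 2) * kl

# With identical teacher and student logits, the KL term vanishes and only
# the hard-label cross-entropy contributes.
logits = [2.0, 1.0, 0.1]
print(distillation_loss(logits, logits, hard_label=0, alpha=1.0))
```

In a real training loop the same loss would be computed on mini-batches with an autodiff framework; the point here is only the shape of the objective.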
[Engineering English (integrated second draft)] Reference answers
Unit One
Task 1: ⑩④⑧③⑥⑦②⑤①⑨
Task 2
① be consistent with — He said that future reforms must be consistent with the principles of free trade and open investment.
② specialize in — Start-up costs are lower because each firm can specialize in just one narrow field.
③ derive from — All of these capabilities derive from something called machine learning, which is at the core of many modern artificial intelligence applications.
④ a range of — A range of wearable products launched by start-ups and established brands has people excited and eager to try them.
⑤ date back to — Immersed as we are in Silicon Valley's stream of "new new" ways of doing things, we often forget that we are merely rediscovering simple lessons that date back to the fundamentals of business.
Task 3: T F F T F
Task 4
The most common view
The principal task of engineering: to take into account the customers' needs and to find the appropriate technical means to accommodate these needs.
Commonly accepted claims:
Technology tries to find appropriate means for given ends or desires;
Technology is applied science;
Technology is the aggregate of all technological artefacts;
Technology is the total of all actions and institutions required to create artefacts or products, and the total of all actions which make use of these artefacts or products.
The author's opinion: it is a viewpoint with flaws.
Arguments: It must of course be taken for granted that the given simplified view of engineers with regard to technology has taken a turn within the last few decades.
Observable changes: In many technical universities, interdisciplinary courses are already inherent parts of the curriculum.
Task 5
① The most common view engineers hold of their own professional activity is that they plan, develop, design and launch technical products by applying scientific findings.
3 Knowledge Representation and Ontologies Logic, Ontologies and Semantic Web Languages
Stephan Grimm¹, Pascal Hitzler², Andreas Abecker¹
¹ FZI Research Center for Information Technologies, University of Karlsruhe, Germany, {grimm,abecker}@fzi.de
² Institute AIFB, University of Karlsruhe, Germany, hitzler@aifb.uni-karlsruhe.de

Summary. In Artificial Intelligence, knowledge representation studies the formalisation of knowledge and its processing within machines. Techniques of automated reasoning allow a computer system to draw conclusions from knowledge represented in a machine-interpretable form. Recently, ontologies have evolved in computer science as computational artefacts to provide computer systems with a conceptual yet computational model of a particular domain of interest. In this way, computer systems can base decisions on reasoning about domain knowledge, similar to humans. This chapter gives an overview of basic knowledge representation aspects and of ontologies as used within computer systems. After introducing ontologies in terms of their appearance, usage and classification, it addresses concrete ontology languages that are particularly important in the context of the Semantic Web. The most recent and predominant ontology languages and formalisms are presented in relation to each other, and a selection of them is discussed in more detail.

3.1 Knowledge Representation

As a branch of symbolic Artificial Intelligence, knowledge representation and reasoning aims at designing computer systems that reason about a machine-interpretable representation of the world, similar to human reasoning. Knowledge-based systems have a computational model of some domain of interest in which symbols serve as surrogates for real-world domain artefacts, such as physical objects, events, relationships, etc. [45]. The domain of interest can cover any part of the real world or any hypothetical system about which one desires to represent knowledge for computational purposes. A knowledge-based system maintains a
knowledge base which stores the symbols of the computational model in the form of statements about the domain, and it performs reasoning by manipulating these symbols. Applications can base their decisions on domain-relevant questions posed to a knowledge base.

3.1.1 A Motivating Scenario

To illustrate principles of knowledge representation in this chapter, we introduce an example scenario taken from a B2B travelling use case. In this scenario, companies frequently book business trips for their employees, sending them to international meetings and conference events. Such a scenario is a relevant use case for Semantic Web Services, since companies desire to automate the online booking process, while they still want to benefit from the high competition among various travel agencies and no-frills airlines that sell tickets via the internet. Automation is achieved by computational agents deciding whether an online offer of some travel agency fits a request for a business trip or not, based on the knowledge they have about the offer and the request. Knowledge represented in this domain of "business trips" is about flights, trains, booking, companies and their employees, cities that are source or destination for a trip, etc.

Knowledge-based systems use a computational representation of such knowledge in the form of statements about the domain of interest. Examples of such statements in the business trips domain are "companies book trips for their employees", "flights and train rides are special kinds of trips" or "employees are persons employed at some company". This knowledge can be used to answer questions about the domain of interest. From the given statements, and by means of automated deduction, a knowledge-based system can, for example, derive that "a person on a flight booked by a company is an employee" or "the company that booked a flight for a person is this person's employer". In this way, a knowledge-based computational agent can reason about business
trips, similar to the way a human would. It could, for example, tell apart offers for business trips from offers for vacations, or decide whether the destination city for a requested flight is close to the geographical region specified in an offer, or conclude that a participant of a business flight is an employee of the company that booked the flight.

3.1.2 Forms of Representing Knowledge

If we look at current Semantic Web technologies and use cases, knowledge representation appears in different forms, the most prevalent of which are based on semantic networks, rules and logic. Semantic network structures can be found in RDF graph representations [30] or Topic Maps [41], whereas a formalisation of business knowledge often comes in the form of rules with some "if-then" reading, e.g. in business rules or logic programming formalisms. Logic is used to realise a precise semantic interpretation for both of the other forms. By providing formal semantics for knowledge representation languages, logic-based formalisms lay the basis for automated deduction. We will investigate these three forms of knowledge representation in the following.

Semantic Networks

Originally, semantic networks stem from the "existential graphs" introduced by Charles Peirce in 1896 to express logical sentences as graphical node-and-link diagrams [43]. Later on, similar notations have been introduced, such as conceptual graphs [45], all differing slightly in syntax and semantics. Despite these differences, all the semantic network formalisms concentrate on expressing the taxonomic structure of categories of objects and the relations between them. We use a general notion of a semantic network, abstracting from the different concrete notations proposed.

A semantic network is a graph whose nodes represent concepts and whose arcs represent relations between these concepts. They provide a structural representation of statements about a domain of interest. In the business trips domain, typical concepts would be
"Company", "Employee" or "Flight", while typical relations would be "books", "isEmployedAt" or "participatesIn". Figure 3.1 shows an example of a semantic network for the business trips domain.

Fig. 3.1. A Semantic Network for Business Trips

Semantic networks provide a means to abstract from natural language, representing the knowledge that is captured in text in a form more suitable for computation. The knowledge expressed in the network from Figure 3.1 coincides with the content of the following natural language text.

"Employees of companies are persons, while both persons and companies are legal entities. Companies book trips for their employees. These trips can be flights or train rides which start and end in cities of Europe or the US. Companies themselves have locations which can be cities. The company UbiqBiz books the flight FL4711 from London to New York for Mister X."

Typically, concepts are chosen to represent the meaning of nouns in such a text, while relations are mapped to verb phrases. The fragment Company –books→ Trip is read as "companies book trips", expressed as a binary relation between two concepts. However, this is not mandatory; the relation –books→ could also be "lifted" to a concept Booking with relations –hasActor→, –hasParticipant→ and –hasObject→ pointing to Company, Employee and Trip, respectively. In this way, its ternary character would be made explicit, in contrast to the original network where the information about an employee's involvement in booking is implicit. In principle, the concepts and relations in a semantic network are generic and could stand for anything relevant in the domain of interest. However, some particular relations for some standard knowledge representation and reasoning cases have evolved.

The semantic network in Figure 3.1 illustrates the distinction between general concepts, like Employee, and individual concepts, like MisterX. While the latter represent concrete individuals or objects in the domain of interest, the former serve as classes to group
together such individuals that have certain properties in common, as e.g. all employees. The particular relation which links individuals to their classes is that of instantiation, denoted by –isA→. Thus, MisterX is called an instance of the concept Employee. The lower part of the network is concerned with knowledge about individuals, reflecting a particular situation of the employee MisterX participating in a certain flight, while the upper part is concerned with knowledge about general concepts, reflecting various possible situations.

The most prominent type of relation in semantic networks, however, is that of subsumption, which we denote by –kindOf→. A subsumption link connects two general concepts and expresses specialisation or generalisation, respectively. In the network in Figure 3.1, a flight is said to be a special kind of trip, i.e. Trip subsumes Flight. This means that any flight is also a trip; however, there might be other trips which are not flights, such as train rides. Subsumption is associated with the notion of inheritance in that a specialised concept inherits all the properties from its more general parent concepts. For example, from the network one can read that a company can be located in a European city, since –locatedAt→ points from Company to Location while EUCity is a kind of City which is itself a kind of Location. The concept EUCity inherits the property of being a potential location for a company from the concept Location. Other particular relations that can be found in semantic network notations are, for example, –partOf→ to denote part-whole relationships, etc.

Semantic networks are closely related to another form of knowledge representation called frame systems. In fact, frame systems and semantic networks can be identical in their expressiveness but use different representation metaphors [43]. While the semantic network metaphor is that of a graph with concept nodes linked by relation arcs, the frame metaphor draws concepts as boxes, i.e. frames, and
relations as slots inside frames that can be filled by other frames. Thus, in the frame metaphor the graph turns into nested boxes. The semantic network form of knowledge representation is especially suitable for capturing the taxonomic structure of categories of domain objects and for expressing general statements about the domain of interest. Inheritance and other relations between such categories can be represented in and derived from subsumption hierarchies. On the other hand, the representation of concrete individuals or even data values, like numbers or strings, does not fit well the idea of semantic networks.

Rules

Another natural form of expressing knowledge in some domain of interest are rules that reflect the notion of consequence. Rules come in the form of IF-THEN constructs and allow one to express various kinds of complex statements. Rules can be found in logic programming systems, like the language Prolog [31], in deductive databases [34] or in business rules systems. The following is an example of rules expressing knowledge in the business trips domain, specified in their intuitive if-then reading.

(1) IF something is a flight THEN it is also a trip
(2) IF some person participates in a trip booked by some company THEN this person is an employee of this company
(3) FACT the person MisterX participates in a flight booked by the company UbiqBiz
(4) IF a trip's source and destination cities are close to each other THEN the trip is by train

The IF-part is also called the body of a rule, while the THEN-part is also called its head. Typically, rule-based knowledge representation systems operate on facts, which are often formalised as a special kind of rule with an empty body. They start from a given set of facts, like rule (3) above, and then apply rules in order to derive new facts, thus "drawing conclusions". However, the intuitive reading with natural language phrases is not suitable for computation, and therefore such phrases are formalised to predicates
and variables over objects of the domain of interest. A formalisation of the above rules in the typical style of rule languages looks as follows.

(1) Trip(?t) :- Flight(?t)
(2) Employee(?p) ∧ isEmployedAt(?p,?c) :- Trip(?t) ∧ books(?c,?t) ∧ Company(?c) ∧ participatesIn(?p,?t) ∧ Person(?p)
(3) Person(MisterX) ∧ participatesIn(MisterX,FL4711) ∧ Flight(FL4711) ∧ books(UbiqBiz,FL4711) ∧ Company(UbiqBiz) :-
(4) TrainRide(?t) :- Trip(?t) ∧ startsFrom(?t,?s) ∧ endsIn(?t,?d) ∧ close(?s,?d)

In most logic programming systems a rule is read as an inverse implication, starting with the head followed by the body, which is indicated by the symbol :- that resembles a backward arrow. In this formalisation, the intuitive notions from the text, which were concepts and relations in the semantic network case, became predicates linked through variables and constants that identify objects in the domain of interest. Variables start with the symbol ? and take as their values the constants that occur in facts such as (3). Rule (1) captures inheritance – or subsumption – between trips and flights by stating that "everything that is a flight is also a trip". Rule (2) draws conclusions about the status of employment for participants of business flights. From the facts (3), these two rules are able to derive the implicit fact that "MisterX is an employee of UbiqBiz". While the rules (1) and (2) express general domain knowledge, rule (4) can be interpreted as part of some company's travelling policy, stating that trips between close cities shall be conducted by train. In business rules, for example, rule-based formalisms are used with the motivation to capture complex business knowledge in companies, like pricing models or delivery policies.

Rule-based knowledge representation systems are especially suitable for reasoning about concrete instance data, i.e. simple facts of the form Employee(MisterX). Complex sets of rules can efficiently derive such implicit facts from explicitly given ones. They are problematic if more complex and general statements about the domain shall be
derived which do not fit a rule's head.

Logic

Both forms, semantic networks as well as rules, have been formalised using logic to give them a precise semantics. Without such a precise formalisation they are vague and ambiguous, and thus problematic for computational purposes. From just the graphical representation of the semantic network in Figure 3.1, for example, it is not clear whether companies can only book flights for their own employees or for employees of partner companies as well. Neither is it clear from the fragment Company –books→ Trip whether every company books trips or just some company. Also for rules, despite their much more formal appearance, the exact meaning remains unclear when, for example, forms of negation are introduced that allow for potential conflicts between rules. Depending on the choice of procedural evaluation or flavour of formal semantics, different derivation results are produced.

The most prominent and fundamental logical formalism classically used for knowledge representation is the "first-order predicate calculus", or first-order logic for short, and we choose this formalism to present logic as a form of knowledge representation here. First-order logic allows one to describe the domain of interest as consisting of objects, i.e. things that have individual identity, and to construct logical formulas around these objects formed by predicates, functions, variables and logical connectives [43]. We assume that the reader is familiar with the notation of first-order logic from formalisations of various mathematical disciplines.

Similar to semantic networks, most statements in natural language can be expressed in terms of logical sentences about objects of the domain of interest with an appropriate choice of predicate and function symbols. Concepts are mapped to unary predicates, relations to binary predicates. We illustrate the use of logic for knowledge representation by axiomatising parts of the semantic network from
Figure 3.1 more precisely. Subsumption, for example, can be directly expressed by a logical implication, which is illustrated in the translation of the following fragment.

Employee –kindOf→ Person
∀x: (Employee(x) → Person(x))

Due to the universal quantifier, the variable x in the logical formula ranges over all domain objects and its reading is "everything that is an employee is also a person". Other parts of the network can be further restricted using logical formulas, as shown in the following example.

Company –books→ Trip
∀x,y: (books(x,y) → Company(x) ∧ Trip(y))
∀x: ∃y: (Trip(x) → Company(y) ∧ books(y,x))

The graphical representation of the network fragment leaves some details open, while the logical formulas capture the booking relation between companies and trips more precisely. The first formula states that domain and range of the booking relation are companies and trips, respectively, while the second formula makes sure that for every trip there does actually exist a company that booked it.

In particular, more complex restrictions that range over larger fragments of a network graph can be formulated in logic, where the intuitive graphical notation lacks expressivity. As an example consider the relations between companies, trips and employees in the following fragment.

Company –books→ Trip,  Company ←employedAt– Employee –participatesIn→ Trip
∀x: ∃y: (Trip(x) → Employee(y) ∧ participatesIn(y,x) ∧ books(employer(y),x))

The logical formula expresses additional knowledge that is not captured in the graph representation. It states that, for every trip, there must be an employee that participates in this trip while the employer of this participant is the company that booked the flight. Rules can also be formalised with logic. An IF-THEN rule can be represented as a logical implication with universally quantified variables. For example, a common formalisation of the rule

IF a trip's source and destination cities are close to each other
THEN the trip
is by train

is the translation to the logical formula

∀x,y,z: (Trip(x) ∧ startsFrom(x,y) ∧ endsIn(x,z) ∧ close(y,z) → TrainRide(x)).

However, typical rule-based systems do not interpret such a formula in the classical sense of first-order logic but employ different kinds of semantics, which are discussed in Section 3.2. Since a precise axiomatisation of domain knowledge is a prerequisite for processing knowledge within computers in a meaningful way, we focus on logic as the dominant form of knowledge representation. Therefore, we investigate different kinds of logics and formal semantics more closely in a subsequent section.

In the context of the Semantic Web, two particular logical formalisms have gained momentum, reflecting the semantic network and rules forms of knowledge representation. The graph notations of semantic networks have been formalised through description logics, which are fragments of first-order logic with typical Tarskian model-theoretic semantics but restricted to unary and binary predicates to capture the notions of concepts and relations. On the other hand, rules have been formalised through logic programming formalisms with minimal model semantics, focusing on the derivation of simple facts about individual objects. Both description logics and logic programming can be found as underlying formalisms in various knowledge representation languages in the Semantic Web, which are addressed in Section 3.4.

3.1.3 Reasoning about Knowledge

The way in which we, as humans, process knowledge is by reasoning, i.e. the process of reaching conclusions. Analogously, a computer processes the knowledge stored in a knowledge base by drawing conclusions from it, i.e. by deriving new statements that follow from the given ones. The basic operations a knowledge-based system can perform on its knowledge base are typically denoted by tell and ask [43]. The tell-operation adds a new statement to the knowledge base, whereas the ask-operation is used to query what is known. The statements that have been
added to a knowledge base via the tell-operation constitute the explicit knowledge a system has about the domain of interest. The ability to process explicit knowledge computationally allows a knowledge-based system to reason over a domain of interest by deriving implicit knowledge that follows from what has been told explicitly.

This leads to the notion of logical consequence or entailment. A knowledge base KB is said to entail a statement α if α "follows" from the knowledge stored in KB, which is written as KB |= α. A knowledge base entails all the statements that have been added via the tell-operation plus those that are their logical consequences. As an example, consider the following knowledge base with sentences in first-order logic.

KB = {
  Person(MisterX),
  participates(MisterX, FL4711),
  Flight(FL4711),
  books(UbiqBiz, FL4711),
  ∀x,y,z: (Flight(y) ∧ participates(x,y) ∧ books(z,y) → employedAt(x,z)),
  ∀x,y: (employedAt(x,y) → Employee(x) ∧ Company(y)),
  ∀x: (Person(x) → ¬Company(x))
}

The knowledge base KB explicitly states that "MisterX is a person who participates in the flight FL4711 booked by UbiqBiz", that "participants of flights are employed at the company that booked the flight", that "the employment relation holds between employees and companies" and that "persons are different from companies". If we ask the question "Is MisterX employed at UbiqBiz?" by saying

ask(KB, employedAt(MisterX, UbiqBiz))

the answer will be yes. The knowledge base KB entails the fact that "MisterX is employed at UbiqBiz", i.e. KB |= employedAt(MisterX, UbiqBiz), although it was not "told" so explicitly. This follows from its general knowledge about the domain. A further consequence is that "UbiqBiz is a company", i.e. KB |= Company(UbiqBiz), which is reflected by a positive answer to the question

ask(KB, Company(UbiqBiz)).

This follows from the former consequence together with the fact that "employment holds between employees and companies". Another important notion related to entailment is that of consistency or
satisfiability. Intuitively, a knowledge base is consistent or satisfiable if it does not contain contradictory facts. If we added the fact that "UbiqBiz is a person" to the above knowledge base KB by saying

tell(KB, Person(UbiqBiz)),

it would become unsatisfiable, because persons are said to be different from companies. We explicitly said that UbiqBiz is a person while at the same time it can be derived that it is a company. In general, an unsatisfiable knowledge base is not very useful, since in logical formalisms it would entail any arbitrary fact. The ask-operation would always return a positive result independent of its parameters, which is clearly not desirable for a knowledge-based system.

The inference procedures implemented in computational reasoners aim at realising the entailment relation between logical statements [43]. They derive implicit statements from a given knowledge base or check whether a particular statement is entailed by a knowledge base. An inference procedure that only derives entailed statements is called sound. Soundness is a desirable feature of an inference procedure, since an unsound inference procedure would potentially draw wrong conclusions. If an inference procedure is able to derive every statement that is entailed by a knowledge base, then it is called complete. Completeness is also a desirable property, since a complex chain of conclusions might break down if only a single statement in it is missing. Hence, for reasoning in knowledge-based systems we desire sound and complete inference procedures.

3.2 Logic-Based Knowledge-Representation Formalisms

First-order (predicate) logic is the prevalent and single most important knowledge representation formalism. Its importance stems from the fact that basically all current symbolic knowledge representation formalisms can be understood in their relation to first-order logic.
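The tell/ask operations, entailment and the consistency check discussed in Section 3.1.3 can be sketched in a few lines. This is an illustrative toy, assuming ground facts and naive forward chaining rather than full first-order deduction; the class and rule names are inventions for this sketch:

```python
# A toy knowledge base with tell/ask operations and a consistency check.
class KB:
    def __init__(self):
        self.facts = set()          # ground facts as (predicate, args...) tuples
        self.rules = []             # functions mapping a fact set to new facts

    def tell(self, fact):
        self.facts.add(fact)

    def _closure(self):
        facts = set(self.facts)
        while True:
            new = set()
            for rule in self.rules:
                new |= rule(facts)
            if new <= facts:        # fixpoint reached: nothing new derivable
                return facts
            facts |= new

    def ask(self, fact):
        return fact in self._closure()

    def consistent(self):
        # "Persons are different from companies": an individual belonging
        # to both classes makes the knowledge base unsatisfiable.
        facts = self._closure()
        persons = {f[1] for f in facts if f[0] == "Person"}
        companies = {f[1] for f in facts if f[0] == "Company"}
        return persons.isdisjoint(companies)

def employment_rule(facts):
    # employedAt(x, y) implies Employee(x) and Company(y)
    new = set()
    for f in facts:
        if f[0] == "employedAt":
            new.add(("Employee", f[1]))
            new.add(("Company", f[2]))
    return new

kb = KB()
kb.rules.append(employment_rule)
kb.tell(("Person", "MisterX"))
kb.tell(("employedAt", "MisterX", "UbiqBiz"))
print(kb.ask(("Company", "UbiqBiz")))   # True: entailed, though never told
kb.tell(("Person", "UbiqBiz"))
print(kb.consistent())                  # False: UbiqBiz is person and company
```

Note that, unlike a real reasoner, this toy never reports arbitrary entailments from an inconsistent state; it only flags the specific disjointness violation.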
Its roots can be traced back to the ancient Greek philosopher Aristotle, and modern first-order predicate logic was created in the 19th century, when the foundations for modern mathematics were laid. First-order logic captures some of the essence of human reasoning by providing a notion of logical consequence, as already mentioned. It also provides a notion of universal truth in the sense that a logical statement can be universally valid (and thus called a tautology), meaning that it is a statement which is true regardless of any preconditions.

Logical consequence and universal truth can be described in terms of model-theoretic semantics. In essence, a model for a logical theory³ describes a state of affairs which makes the theory true. A tautology is a statement for which all possible states of affairs are models. A logical consequence of a theory is a statement which is true in all models of the theory.

How to derive logical consequences from a theory – a process called deduction or inferencing – is obviously central to the study of logic. Deduction allows access to knowledge which is not explicitly given but implicitly represented by a theory. Valid ways of deriving logical consequences from theories also date back to the Greek philosophers, and have been studied since. At the heart of this is what has become known as proof theory. Proof theory describes syntactic rules which act on theories and allow the derivation of logical consequences without explicit recurrence to models. The notion of universal truth can thus be reduced to syntactic manipulations. This allows abstraction from model theory and enables deduction by symbol manipulation, and thus by automated means.

Obviously, with the advent of electronic computing devices in the 20th century, the automation of deduction has become an important and influential field of study. The field of automated reasoning is concerned with the development of efficient algorithms for deduction. These algorithms are usually required to be sound, and completeness
is a desired feature. The fact that sound and complete deduction algorithms exist for first-order predicate logic is reflected by the statement that first-order logic is semi-decidable. More precisely, semi-decidability of first-order logic means that there exist algorithms which, given a theory and a query statement, terminate with a positive answer in finite time whenever the statement is a logical consequence of the theory. Note that for semi-decidability, termination is not required if the statement is not a logical consequence of the theory, and indeed, termination (with the correct negative answer) cannot be guaranteed in general for first-order logical theories.

³ A logical theory denotes a set of logical formulas, seen as the axioms of some theory to be modelled.

For some kinds of theories, however, sound and complete deduction algorithms exist which always terminate. Such theories are called decidable, and they have certain more-or-less obvious advantages, including the following.

• Decidability guarantees that the algorithm always comes back with a correct answer in finite time.⁴ Under semi-decidability, an algorithm which runs for a considerable amount of time may still terminate, or may not terminate at all, and thus the user cannot know whether he has waited long enough for an answer. Decidability is particularly important if we want to reason about the question of whether or not a given statement is a logical consequence of a theory.
• Experience shows that practically efficient algorithms are often available for decidable theories due to the effective use of heuristics. Often, this is even the case if worst-case complexity is very high.

3.2.1 Description Logics

Description logics [3] are essentially decidable fragments of first-order logic,⁵ and we have just seen why the study of these is important. At the same time, description logics are expressive enough that they have become a major knowledge representation paradigm, in particular for use
within the Semantic Web. We will describe one of the most important and influential description logics, called ALC. Other description logics are best understood as restrictions or extensions of ALC. We introduce the standard description logic notation and give a formal mapping into standard first-order logic syntax.

The Description Logic ALC

A description logic theory consists of statements about concepts, individuals, and their relations. Individuals correspond to constants in first-order logic, and concepts correspond to unary predicates. In terms of semantic networks, description logic concepts correspond to general concepts in semantic networks, while individuals correspond to individual concepts. We deal with concepts first, and will talk about individuals later. Concepts can be named concepts or anonymous (composite) concepts. Named concepts consist simply of a name, say "human", which will be mapped to a unary predicate in

⁴ It should be noted that there are practical limitations to this due to the fact that computing resources are always limited. A theoretically sound, complete and terminating algorithm may thus run into resource limits and terminate without an answer.
⁵ To be precise, there do exist some description logics which are not decidable. And there exist some which are not straightforward fragments of first-order logic. But for this general introduction, we will not concern ourselves with these.
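The mapping from description logic concepts into first-order logic mentioned above can be illustrated with a small translator. The tuple-based concept syntax and the fresh-variable scheme below are simplifications invented for this sketch, covering only a few ALC constructors:

```python
# Illustrative translation of some ALC concept constructors into first-order
# formulas with one free variable x.
def to_fol(concept, var="x"):
    kind = concept[0]
    if kind == "atom":                  # named concept, e.g. Company
        return f"{concept[1]}({var})"
    if kind == "not":                   # negation: ¬C
        return f"¬{to_fol(concept[1], var)}"
    if kind == "and":                   # conjunction: C ⊓ D
        return f"({to_fol(concept[1], var)} ∧ {to_fol(concept[2], var)})"
    if kind == "exists":                # existential restriction: ∃r.C
        r, c, y = concept[1], concept[2], var + "'"
        return f"∃{y}:({r}({var},{y}) ∧ {to_fol(c, y)})"
    if kind == "forall":                # value restriction: ∀r.C
        r, c, y = concept[1], concept[2], var + "'"
        return f"∀{y}:({r}({var},{y}) → {to_fol(c, y)})"
    raise ValueError(f"unknown constructor: {kind}")

# "Companies that book only trips": Company ⊓ ∀books.Trip
concept = ("and", ("atom", "Company"), ("forall", "books", ("atom", "Trip")))
print(to_fol(concept))   # (Company(x) ∧ ∀x':(books(x,x') → Trip(x')))
```

The restriction to unary predicates (concepts) and binary predicates (roles) is exactly what makes the translated fragment a description logic rather than full first-order logic.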
Wikipedia Knowledge Extraction - ACM at UIUC
◦ Different levels of information?
◦ Simple rules based on part-of-speech tags?
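As an illustration of the "simple rules based on part-of-speech tags" idea, the following sketch extracts candidate mentions as adjective/noun runs that end in a noun. The rule and the Penn-Treebank-style tags are assumptions for the example; the input is assumed to be already POS-tagged:

```python
# Extract candidate concept mentions with one simple POS-tag rule:
# keep maximal runs of adjectives (JJ) and nouns (NN*), trimmed to end in a noun.
def noun_phrases(tagged):
    phrases, run = [], []
    for word, tag in tagged + [("", ".")]:      # sentinel flushes the last run
        if tag.startswith("NN") or tag == "JJ":
            run.append((word, tag))
        else:
            while run and run[-1][1] == "JJ":   # a phrase must end in a noun
                run.pop()
            if run:
                phrases.append(" ".join(w for w, _ in run))
            run = []
    return phrases

sent = [("Yoshi", "NNP"), ("has", "VBZ"), ("a", "DT"),
        ("long", "JJ"), ("tongue", "NN"), (".", ".")]
print(noun_phrases(sent))   # ['Yoshi', 'long tongue']
```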
Clustering
Idea: Determine whether two separate mentions point to the same concept
◦ ‘The dog’, ‘a dog’, ‘dogs’
◦ ‘Cats’, ‘C.A.T.S’, ‘CAT Scan’
◦ ‘President Obama’, ‘President Barack Obama’
Possible solutions:
◦ Feature-based classification
◦ Self-organizing map
◦ Associated terms
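A crude baseline for the clustering idea is to normalize each mention to a key (lowercase, strip determiners and periods, naive de-pluralization) and group mentions that share a key. This is a hand-rolled illustration, not one of the learned approaches listed above, and it would still need features to keep cases like ‘CAT Scan’ apart from ‘Cats’:

```python
import re
from collections import defaultdict

def normalize(mention):
    """Map a mention string to a crude normalized key."""
    m = mention.lower().replace(".", "")
    m = re.sub(r"^(the|a|an)\s+", "", m)        # drop leading determiners
    if m.endswith("s") and len(m) > 3:          # naive singularization
        m = m[:-1]
    return m

def cluster(mentions):
    """Group mentions whose normalized keys coincide."""
    groups = defaultdict(list)
    for m in mentions:
        groups[normalize(m)].append(m)
    return list(groups.values())

print(cluster(["The dog", "a dog", "dogs", "Cats", "C.A.T.S"]))
# → [['The dog', 'a dog', 'dogs'], ['Cats', 'C.A.T.S']]
```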
Use the type of phrase (noun phrase, verb phrase) to determine which sentence to form.
Read papers from Turing Center (University of Washington)
SRL Parsing
Performs a deep analysis on each sentence. E.g. “Yoshi has a long tongue which he uses
Hadoop Compatibility
Need to ensure scaling is possible for move to regular Wikipedia
Hadoop is an open source implementation of the Map-Reduce algorithm
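The Map-Reduce pattern that Hadoop implements at scale can be shown in miniature. This single-process word-count sketch (purely illustrative; real Hadoop jobs distribute the map and reduce phases across a cluster) makes the map, shuffle and reduce phases explicit:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Emit one (key, value) pair per word occurrence.
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Combine the grouped values per key into a final result.
    return {key: sum(values) for key, values in grouped.items()}

docs = ["wiki knowledge extraction", "knowledge extraction at scale"]
counts = reduce_phase(shuffle(chain.from_iterable(map_phase(d) for d in docs)))
print(counts["knowledge"])   # 2
```

Because each map call touches one document and each reduce call touches one key, both phases parallelize naturally, which is what makes the move to full Wikipedia feasible.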
Knowledge and the Application of Knowledge (English)
Knowledge and the Application of Knowledge

Knowledge is a powerful tool that can unlock the doors to a world of endless possibilities. It is the foundation upon which we build our understanding of the world around us, and the key to unlocking our full potential as individuals and as a society. However, knowledge alone is not enough; it is the application of that knowledge that truly transforms our world.

At its core, knowledge is the accumulation of information, facts, and insights that we gather through various means, such as education, experience, and exploration. It is the building blocks that allow us to make sense of the world and to develop a deeper understanding of the complex systems and phenomena that shape our lives. Whether it is the intricate workings of the human body, the mysteries of the universe, or the nuances of human behavior, knowledge provides us with the tools to explore, analyze, and ultimately, to make sense of the world around us.

But knowledge is not simply a passive accumulation of information. It is a dynamic and ever-evolving process that requires active engagement and application. It is through the application of knowledge that we are able to transform our understanding into tangible outcomes, whether it is solving a complex problem, creating a new invention, or improving the human condition.

One of the most powerful examples of the application of knowledge is in the field of science and technology. Researchers and scientists around the world are constantly pushing the boundaries of human knowledge, uncovering new insights and discoveries that have the potential to change the world.
From the development of life-saving medical treatments to the creation of groundbreaking technologies that revolutionize the way we live, work, and communicate, the application of scientific knowledge has had a profound impact on our lives.Similarly, in the realm of social sciences and humanities, the application of knowledge has led to significant advancements in our understanding of human behavior, culture, and society. Scholars and thinkers have used their knowledge to develop new theories, frameworks, and approaches to addressing complex social and political challenges, from issues of inequality and social justice to the nuances of human psychology and the nature of human relationships.But the application of knowledge is not limited to the realms ofacademia and research. In our everyday lives, we constantly apply our knowledge to navigate the world around us, make decisions, and solve problems. Whether it is using our knowledge of nutrition to make healthy food choices, applying our understanding of personal finance to manage our money wisely, or drawing on our knowledge of interpersonal communication to build stronger relationships, the application of knowledge is a fundamental part of our daily lives.However, the application of knowledge is not always straightforward or easy. It often requires creativity, critical thinking, and the willingness to take risks and experiment. Sometimes, the application of knowledge may even challenge our preconceived notions or require us to rethink our assumptions. This can be a difficult and uncomfortable process, but it is also essential for growth and progress.Ultimately, the true power of knowledge lies in its application. It is not enough to simply accumulate information; we must be willing to put that knowledge into action, to use it to solve problems, to create new opportunities, and to improve the world around us. 
By embracing the power of knowledge and committing ourselves to its application, we can unlock the full potential of our human capacity and create a better, more just, and more sustainable world for all.。
Extracting Chatbot Knowledge from Online Discussion Forums
Abstract

This paper presents a novel approach for extracting high-quality <thread-title, reply> pairs as chat knowledge from online discussion forums so as to efficiently support the construction of a chatbot for a certain domain. Given a forum, the high-quality <thread-title, reply> pairs are extracted using a cascaded framework. First, the replies logically relevant to the thread title of the root message are extracted with an SVM classifier from all the replies, based on correlations such as structure and content. Then, the extracted <thread-title, reply> pairs are ranked with a ranking SVM based on their content qualities. Finally, the Top-N <thread-title, reply> pairs are selected as chatbot knowledge. Results from experiments conducted within a movie forum show the proposed approach is effective.

1 Introduction

A chatbot is a conversational agent that interacts with users in a certain domain or on a certain topic with natural language sentences. Normally, a chatbot works by a user asking a question or making a comment, with the chatbot answering the question, or making a comment, or initiating a new topic. Many chatbots have been deployed on the Internet for the purpose of seeking information, site guidance, FAQ answering, and so on, in a strictly limited domain. Existing famous chatbot systems include ELIZA [Weizenbaum, 1966], PARRY [Colby, 1973] and ALICE.1 Most existing chatbots consist of dialog management modules to control the conversation process and chatbot knowledge bases to respond to user input. A typical implementation of a chatbot knowledge base contains a set of templates that match user inputs and generate responses. Templates currently used in chatbots, however, are hand coded. Therefore, the construction of chatbot knowledge bases is time consuming, and difficult to adapt to new domains.

* This work was finished while the first author was visiting Microsoft Research Asia during Feb. 2005 - Mar. 2006 as a component of the project of AskBill Chatbot led by Dr.
Ming Zhou.

1 /

An online discussion forum is a web community that allows people to discuss common topics, exchange ideas, and share information in a certain domain, such as sports, movies, and so on. Creating threads and posting replies are major user behaviors in forum usage. Large repositories of archived threads and reply records in online discussion forums contain a great deal of human knowledge on many topics. In addition to rich information, the reply styles from authors are diverse. We believe that high-quality replies of a thread, if mined, could be of great value to the construction of a chatbot for certain domains.

In this paper, we propose a novel approach for extracting high-quality <thread-title, reply> pairs from online discussion forums to supplement a chatbot knowledge base. Given a forum, the high-quality <thread-title, reply> pairs are extracted using a cascaded framework. First, the replies logically relevant to the thread title of the root message are extracted with an SVM classifier from all the replies, based on correlations such as structure and content. Then, the extracted <thread-title, reply> pairs are ranked with a ranking SVM based on their content qualities. Finally, the Top-N <thread-title, reply> pairs are selected as chatbot knowledge.

The rest of this paper is organized as follows. Important related work is introduced in Section 2. Section 3 outlines the characteristics of online discussion forums with the explanations of the challenges of extracting stable <thread-title, reply> pairs. Section 4 presents our proposed cascaded framework. Experimental results are reported in Section 5.
Section 6 presents a comparison of our approach with other related work. The conclusion and the future work are provided in Section 7.

2 Related Work

By “chatbot knowledge extraction” throughout this paper, we mean extracting the pairs of <input, response> from online resources. Based on our study of the literature, there is no published work describing the use of online communities like forums for automatic chatbot knowledge acquisition. Existing work on automatic chatbot knowledge acquisition is mainly based on human annotated datasets, such as the work by Shawar and Atwell [2003] and Tarau and Figa [2004]. Their approaches are helpful to construct commonsense knowledge for chatbots, but are not capable of extracting knowledge for specific domains.

Extracting Chatbot Knowledge from Online Discussion Forums*
Jizhou Huang (1), Ming Zhou (2), Dan Yang (1)
(1) School of Software Engineering, Chongqing University, Chongqing, China, 400044; {jizhouhuang, dyang}@
(2) Microsoft Research Asia, 5F Sigma Center, No. 49 Zhichun Road, Haidian, Beijing, China, 100080; mingzhou@

Notably, there is some work on knowledge extraction from web online communities to support QA and summarization. Nishimura et al. [2005] develop a knowledge base for a QA system that answers type “how” questions. Shrestha and McKeown [2004] present a method to detect <question, answer> pairs in an email conversation for the task of email summarization. Zhou and Hovy [2005] describe a summarization system for technical chats and emails about the Linux kernel. These researchers’ approaches utilize the characteristics of their corpora and are best fit for their specific tasks, but they limit each of their corpora and tasks, so they cannot directly transform their methods to our chatbot knowledge extraction approach.

3 Our Approach

An online discussion forum is a type of online asynchronous communication system. A forum normally consists of several discussion sections. Each discussion section focuses on a specific discussion theme and includes many threads. People can
initiate new discussions by creating threads,or ask (answer)questions by posting questions(replies)to an ex-isting section.In a section,threads are listed in chronologi-cal order.Within a thread,information such as thread title, thread starter,and number of replies are presented.The thread title is the title of the root message posted by the thread starter to initiate discussion.One can access a thread from the thread list and see the replies listed in chronologi-cal order,with the information of the authors and posting times.Compared with other types of web communities such as newsgroups,online discussion forums are better suited for chatbot knowledge extraction for the following reasons:1.In a thread within a forum,the root message and itsfollowing up replies can be viewed as<input,re-sponse>pairs,with same structure of chat templateof a chatbot.2.There is popular,rich,and live information in an on-line discussion forum.3.Diverse opinions and various expressions on a topicin an online discussion forum are useful to extractdiverse <input, response> pairs for chatbots.Due to technical limitations of current chatbots in han-dling dialogue management,we think that pairs of<input, response>for a chatbot should be context independent, which means that the understanding inputs and responses will not rely on the previous <input, response>. 
However,because of the nature of a forum,it is difficult to extract high-quality<input,response>pairs that meet chatbot requirements:1.Replies are often short,elliptical,and irregular,andfull of spelling,usage,and grammar mistakes whichresults in noisy text.2.Not all of replies are related to root messages.3.A reply may be separated in time or place from thereply to which it responds,leading to a fragmentedconversational structure.Thus,adjacent repliesmight be semantically unrelated.4.There is no evidence to reveal who has replied towhich reply unless the participants have quoted theentire entries or parts of a previously posted reply topreserve context [Eklundh, 1998].To overcome these sorts of difficulties,lexical and struc-tural information from different replies within threads are analyzed in our experiments,as well as user behaviors in discussions.Therefore,to extract valid pairs of<input,response> from a forum,we first need to extract relevant replies to ini-tial root messages.In this process,replies that are relevant to the previous replies rather than to the initial root message are ignored and the replies logically directly relevant to the thread title are extracted.The replies to the initial root mes-sage,in spite of being relevant,may have different qualities. To select high-quality replies,a ranking SVM is employed to rank the replies.Finally,the pairs of the title of the root message and the extracted Top-N replies are used as the chatbot knowledge.4Cascaded Hybrid ModelAn input online discussion forum F contains discussion sections s1,s2,…,s k.A section consists of T threads t1,t2,…,t u. 
Each thread t is a sequence of replies t = {r0, r1, r2, …, rn}, where r0 is the root message posted by the thread starter and ri is the i-th (i ≥ 1) reply. A reply r is posted by a participant p at a specific moment m with content c. A thread t can be modeled as a sequence of triplets:

t = {r1, r2, …, rn} = {(p1, m1, c1), (p2, m2, c2), …, (pn, mn, cn)}

We define an RR as a direct reply rj (j ≥ 1) to the root message r0, where rj is not correlated with any other reply rj' (j' ≥ 1, j' ≠ j) in the thread. Therefore, chatbot knowledge (CK) can be viewed as the pairs of <input, response> that fulfill the following constraints:

CK = {<input, response>} = {<thread-title, high-quality RR>}

A thread title is used to model the user input of a chatbot and RRs of this thread are used to model the chatbot responses. The high-quality pairs of <thread-title, RR> will be selected as chatbot knowledge. A high-quality pair of <thread-title, RR> for the chatbot should meet the following requirements:

1. The thread-title is meaningful and popular.
2. The RR provides descriptive, informative and trustworthy content to the root message.
3. The RR has high readability, a neatly short and concise expressive style, and a clear structure.
4. The RR is attractive and can capture a chatter’s interest.
5. Both thread-title and RR should have no intemperate sentiment, no obscene words, and no exclusive personal information.
6. Both thread-title and RR should have proper length.
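The thread model and the <thread-title, RR> pairing above can be sketched as a small data structure (a hypothetical illustration, not the paper's code; the class and field names are assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class Reply:
    """One reply: a (participant, moment, content) triplet."""
    participant: str
    moment: str
    content: str
    is_rr: bool = False  # judged relevant to the root message r0?

@dataclass
class Thread:
    title: str                       # thread title of the root message r0
    replies: list = field(default_factory=list)  # r1..rn, chronological

def chatbot_knowledge(thread, top_n):
    """Pair the thread title with its Top-N relevant replies (RRs)."""
    rrs = [r for r in thread.replies if r.is_rr]
    return [(thread.title, r.content) for r in rrs[:top_n]]

t = Thread("Recommend Some Westerns For Me?",
           [Reply("a", "t1", "Once Upon a Time in the West", True),
            Reply("b", "t2", "what?", False),
            Reply("c", "t3", "The Wild Bunch", True)])
print(chatbot_knowledge(t, 2))
```

In the paper, `is_rr` is decided by the SVM classifier and the Top-N selection is done over ranking-SVM scores rather than chronological order; this sketch only shows the shape of the extracted knowledge.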
In this paper,identifying the qualified thread-title is not our focus.Instead,we focus on selecting qualified RR.Fig-ure1illustrates the structure of the cascaded model.Thefirst pass(on the left-hand side)applies an SVM classifier to the candidate RR to identify the RR of a thread.Then the second pass(in the middle)filters out the RR that contains intemperate sentiment,obscene words and personal infor-mation with a predefined keyword list.The RR which is longer than a predefined length is also filtered out.Finally the RR ranking module(on the right-hand side)is used to extract the descriptive,informative and trustworthy replies to the root message.Figure 1. Structure of Cascaded Model.4.1RR IdentificationThe task of RR identification can be viewed as a binary clas-sification problem of distinguishing RR from non-RR.Our approach is to assign a candidate reply r i)1( i an appropri-ate class y(+1if it is an RR,-1or not).Here Support Vector Machines(SVMs)is selected as the classification model be-cause of its robustness to over-fitting and high performance [Sebastiani, 2002].SVMlight[Joachims,1999]is used as the SVM toolkit for training and testing.Table1lists the feature set to iden-tify RR for a pair of <thread-title, reply>.1Structural features1-1Does this reply quote root message?1-2Does this reply quote other replies?1-3Is this reply posted by the thread starter?1-4# of replies between same author’s previous and cur-rent reply2Content features2-1# of words2-2# of content words of this reply2-3# of overlapping words between thread-title and re-ply2-4# of overlapping content words between thread-title and reply2-5Ratio of overlapping words2-6Ratio of overlapping content words between thread-title and reply2-7# of domain words of this reply2-8Does this reply contain other participants’ registered nicknames in forum?Table 1. 
Features for RR Classifier.In our research,both structural and content features are selected.In structural features,quotation maintains context coherence and indicates the relevance between the current reply and the quoted root message or reply,as discussed in [Eklundh and Macdonald,1994;Eklundh,1998].Two quo-tation features(feature1-1and feature1-2)are employed in our classifier.Feature1-1indicates that the current reply quoting the root message is relevant to the root message.On the contrary,feature1-2indicates the current reply might be irrelevant to the root message because it quotes other re-plies.We use features1-3and1-4based on the observation of behaviors of posting replies in forums.The thread starter, when participants reply to the starter’s thread,usually adds new comments to the replies.Therefore,the added replies gradually diverge from the original root message.If a par-ticipant wants to supplement or clarify his previous reply,he can add a new reply.Therefore,the participant’s new reply is often the supporting reason or argument to his previous reply if they are close to each other.Content features include the features about the number of words and the number of content words in the current reply, the overlapping words and content words between the root message and the current reply.In our work,words that do not appear in the stop word list2are considered as content words.Feature2-7estimates the specialization of the cur-rent reply by the number of domain specific terms.To sim-plify the identification of domain specific terms,we simply extract words as domain specific words if they do not ap-pear in a commonly used lexicon(consists of73,555Eng-lish words).Feature2-8estimates a reply’s pertinence to other replies,because some participants might insert the registered nicknames of other participants and sometimes add clue words such as“P.S.”to explicitly correlate their replies with certain participants.4.2RR RankingFurther,after the RRs have been 
identified,non-eligible RRs are filtered out with a keyword list with33obscenities,62 personal information terms(terms beginning with“my”, such as my wife,my child)and17forum specific terms (such as Tomatometer,Rotten Tomato,etc.).Replies with more than N words are eliminated because people may be-come bored in chatbot scenarios if the response is too long. In our experiments,N is set as 50 based on our observation.3 We analyzed the resulting RRs set of4.1.For some RRs, there is certain noise left from the previous pass,while for other RRs,there are too many RRs with varied qualities. Therefore,the task of RR ranking is to select the high-quality RRs.The ranking SVM[Joachims,2002]is em-ployed to train the ranking function using the feature set in Table 2.The number of being quoted of a reply is selected as a feature(feature1-1)because a reply is likely to be widely quoted within a thread as it is popular or the subject of de-bate.In other words,the more times a reply is quoted,the higher quality it may have.This motivates us to extract the quoted number of all the other replies posted by an author within a thread(feature2-9)and throughout the forum (feature 2-10).We also take“author reputation”into account when as-sessing the quality of a reply.The motivation is that if an author has a good reputation,his reply is more likely to be reliable.We use the author behavior related features to as-sess his“reputation.”An earlier work investigates the rela-tionship between a reader’s selection of a reply and the author of this reply,and found that some of the features raised from authors’behavior over time,correlate to how2 /stop_list.html3 50 is the average length of 1,200 chatbot responses which preferred by three chatters through sample experiments.likely a reader is to choose to read a reply from an author [Fiore et al.,2002].Features2-1to2-7are author behavior related features in the forum.Feature2-8models how many people have chosen to read the threads or replies of 
an author in the forum by using the measurement of the influ-ence of participants.This is described in detail in[Matsu-mura et al., 2002].1Feature of the number of being quoted1-1# of quotations of this reply within the current thread2Features from the author of a reply2-1# of threads the author starts in the forum2-2#of replies the author posts to others’threads in the forum2-3The average length of the author’s replies in the fo-rum2-4The longevity of participation2-5#of the author’s threads that get no replies in the fo-rum2-6# of replies the author’s threads get in the forum2-7# of threads the author is involved in the forum2-8The author’s total influence in the forum2-9#of quotations of the replies that are posted by the author in current thread2-10#of quotations of all the replies that are posted by the author in the forumTable 2. Features for RR Ranking.5Experimental Results5.1Data for ExperimentsIn our experiments,the Rotten Tomatoes forum4is used as test data.It is one of the most popular online discussion fo-rums for movies and video games.The Rotten Tomatoes fo-rum discussion archive is selected because each thread and its replies are posted by movie fans,amateur and profes-sional filmmakers,film critics,moviegoers,or movie pro-ducers.This makes the threads and replies more heteroge-neous, diverse, and informative.For research purposes,the discussion records are col-lected by crawling the Rotten Tomatoes Forum over the time period from November11,1999to June15,2005.The downloaded collection contains1,767,083replies from 65,420threads posted by12,973distinctive participants,so there are,on average,27.0replies per thread,136.2replies per participant,and5.0threads per participant.The number of thread titles in question form is16,306(24.93%)and in statement form is49,114(75.07%).We use part of these discussion records in our experiments.5.2RR IdentificationTo build the training and testing dataset,we randomly se-lected and manually tagged53threads from the 
Rotten To-matoes movie forum,in which the number of replies was between10(min)and125(max).There were3,065replies in53threads,i.e.,57.83replies per thread on average.Three4 /vine/human experts were hired to manually identify the relevance of the replies to the thread-title in each thread.Experts an-notated each reply with one of the three labels:a)RR,b) non-RR and c)Unsure.Replies that received two or three RR labels were regarded as RR,replies with two or three non-RR labels were regarded as non-RR.All the others were regarded as Unsure.After the labeling process,we found out that1,719replies (56.08%)were RR,1,336replies(43.59%)were non-RR,10 (0.33%)were Unsure.We then removed10unsure replies and60replies with no words.We randomly selected35 threads for training(including1,954replies)and18threads for testing(including1,041replies).Our baseline system used the number of replies between the root message and the responding reply[Zhou and Hovy,2005]as the feature to classify RRs.Table3provides the performance using SVM with the feature set described in Table 1.Feature set Precision Recall F-scoreBaseline73.24%66.86%69.90%Structural89.47%92.29%90.86%Content71.80%85.86%78.20%All90.48%92.29%91.38%Table 3. 
RR Identification Result.

With only the structural features, the precision, recall and f-score reached 89.47%, 92.29%, and 90.86%. When content features are used alone, the precision, recall and f-score are low. But after adding content features to structural features, the precision improved by 1.01% while recall stayed the same. This indicates that content features help to improve precision.

Root message
Title: Recommend Some Westerns For Me?
Description: And none of that John Wayne sh*t.
1. The Wild Bunch It's kickass is what it is.
2. Once Upon a Time in the West
3. Does Dances With Wolves count as a western? Doesn't matter, I'd still recommend it.
4. White Comanche This masterpiece stars ……
5. Here's some I'm sure nobody else ……
6. for Dances with Wolves.
7. : understands he's a minority here: ……
8. Open Range is really good. Regardless ……
9. One of the best films I've ever seen.
10. The Good the Bad and the Ugly ……
Figure 2. A Sample of RRs.

Figure 2 presents some identified RRs listed in chronological order for the root message with the title “Recommend Some Westerns For Me?” and the description “And none of that John Wayne sh*t.”.

5.3 Extract High-quality RR

To train the ranking SVM model, an annotated dataset was required. After the non-eligible RRs were filtered out from the identified RRs, three annotators labeled all of the remaining RRs with three different quality ratings. The ratings and their descriptions are listed in Table 4.

Rating | Description
Fascinating | This reply is informative and interesting, and it is suitable for a chatbot
Acceptable | The reply is just so-so but tolerable
Unsuitable | This reply is bad and not suitable for a chatbot

Table 4. RR Rating Labels.

After the labeling process, there were 568 (71.81%) fascinating RRs, 48 (6.07%) acceptable RRs, and 175 (22.12%) unsuitable RRs in the 791 RRs of the 35 training threads. And in the 511 RRs of the 18 test threads, there were 369 (72.21%) fascinating RRs, 25 (4.89%) acceptable RRs, and 117 (22.90%) unsuitable RRs.

We used mean average precision (MAP) as the metric to evaluate RR ranking. MAP is defined as the mean of average precision over a set of queries, and average precision (AvgPi) for a query qi is defined as:

AvgPi = ( Σ_{j=1..M} p(j) × pos(j) ) / (number of positive instances)

where j is the rank, M is the number of instances retrieved, pos(j) is a binary function to indicate whether the instance in the rank j is positive (relevant), and p(j) is the precision at the given cut-off rank j.

The baseline ranked the RRs of each thread by their chronological order. Our ranking function with the feature set in Table 2 achieved high performance (MAP score is 86.50%) compared with the baseline (MAP score is 82.33%). We also tried content features such as the cosine similarity between an RR and the root message, and found that they could not help to improve the ranking performance. The MAP score was reduced to 85.23% when we added the cosine similarity feature to our feature set.

5.4 Chat Knowledge Extraction with Proper N Setting

The chat knowledge extraction task requires that the extracted RRs should have high quality and high precision. After we got the ranked RRs of each thread, the Top-N RRs were selected as chatbot responses. The baseline system just selected Top-N RRs ranked in chronological order. Figure 3 shows the comparison of the performances of our approach and the baseline system at different settings of N. Figure 4 shows the Top-N (N = 6; N can be adjusted to get a proper equilibrium between quantity and quality of RRs when extracting chatbot knowledge) RRs after ranking the RRs in Figure 2. As an instance, we uniformly extracted Top-6 high-quality RRs from each thread. Altogether
108<thread-title,reply>pairs were generated from 18threads.Among these extracted pairs,there were 97fascinating pairs and 11wrong pairs,which showed that 89.81%of the ex-tracted chatbot knowledge was correct.Input:Recommend Some Westerns For Me?Chatbot responses:6. for Dances with Wolves.11. Young Guns! & Young Guns 2!2. Once Upon a Time in the West 9. One of the best films I’ve ever seen.27. I second the dollars trilogy and also Big Hand ……18.Classic Anthony Mann Westerns:The Man from Laramie (1955) ……Figure 4. Top-6 RRs .6Comparison with Related WorkPrevious works have utilized different datasets for knowl-edge acquisition for different applications.Shrestha and McKeown [2004]use an email corpus.Zhou and Hovy [2005]use Internet Relay Chat and use clustering to model multiple sub-topics within a chat log.Our work is the first to explore using the online discussion forums to extract chat-bot knowledge.Since the discussions in a forum are pre-sented in an organized fashion within each thread in which users tend to respond to and comment on specific topics,we only need to identify the RRs for each thread.Hence,the clustering becomes unnecessary.Furthermore,a thread can be viewed as <input,response>pairs,with the same struc-ture of chat template of a chatbot,making a forum better suited for the chatbot knowledge extraction task.The use of thread title as input means that we must iden-tify relevant replies to the root message (RRs ),much like finding adjacent pairs (APs)in [Zhou and Hovy,2005]but for the root message.They utilize AP to identify initiating and responding correspondence in a chat log since there are multiple sub-topics within a chat log,while we use RR to identify relevant response to the thread-title.Similarly,we apply an SVM classifier to identify RRs but use more effec-tive structural features.Furthermore,we select high-quality RRs with a ranking function.Xi et al.[2004]use a ranking function to select the most relevant messages to user queries in 
newsgroup searches,and in which the author feature is proved not effective.In our work,the author feature also proves not effective in identifying relevant replies but it is proved effective in se-lecting high-quality RRs in RR ranking.This is because ir-relevant replies are removed in the first pass,making author features more salient in the remaining RRs .This also indi-cates that the cascaded framework outperforms the flat model by optimally employing different features at different passes.7Conclusions and Future WorkWe have presented an effective approach to extract<thread-title,reply>pairs as knowledge of a chatbot for a new do-main. Our contribution can be summarized as follows:1.Perhaps for the first time,our work proposes usingonline discussion forums to extract chatbot knowl-edge.2.A cascaded framework is designed to extract thehigh-quality<thread-title,reply>pairs as chabotknowledge from forums.It can optimally use differ-ent features in different passes,making the extractedchatbot knowledge of higher quality.3.We show through experiments that structural fea-tures are the most effective features in identifyingRR and author features are the most effective fea-tures in identifying high-quality RR.Compared with manual knowledge construction methods, our approach is more efficient in building a specific domain chatbot.In our experiment with a movie forum domain, 11,147<thread-title,reply>pairs were extracted from2,000 threads within two minutes.It is simply not feasible to have human experts encode a knowledge base of such size.As future work,we plan to improve the qualities of the extracted RRs.The method of selecting valid thread titles and extracting completed sentences from the extracted RRs is an area for exploration.In addition,we are also interested in extracting questions from threads so that<question,re-ply> pairs can be used to support QA style chat.We currently feed the extracted<thread-title,reply>di-rectly into the chatbot knowledge base.But 
there is much room to improve quality in the future. For example, we can generalize the chat templates by clustering similar topics and grouping similar replies, and improve coherence among the consecutive chat replies by understanding the styles of replies.

Acknowledgements

The authors are grateful to Dr. Cheng Niu and Zhihao Li for their valuable suggestions on the draft of this paper. We also thank Dwight for his assistance in polishing the English. We wish to thank Litian Tao, Hao Su and Shiqi Zhao for their assistance in annotating the experimental data.

References

[Colby, 1973] K. M. Colby. Simulation of Belief Systems. In Schank and Colby (Eds.), Computer Models of Thought and Language, pp. 251-286, 1973.
[Eklundh and Macdonald, 1994] K. S. Eklundh and C. Macdonald. The Use of Quoting to Preserve Context in Electronic Mail Dialogues. In IEEE Transactions on Professional Communication, 37(4):197-202, 1994.
[Eklundh, 1998] K. S. Eklundh. To Quote or Not to Quote: Setting the Context for Computer-Mediated Dialogues. In S. Herring (Ed.), Computer-Mediated Conversation. Cresskill, NJ: Hampton Press, 1998.
[Fiore et al., 2002] A. T. Fiore, S. Leetiernan and M. A. Smith. Observed Behavior and Perceived Value of Authors in Usenet Newsgroups: Bridging the Gap. In Proceedings of the CHI 2002 Conference on Human Factors in Computing Systems, pp. 323-330, 2002.
[Joachims, 1999] T. Joachims. Making Large-Scale SVM Learning Practical. Advances in Kernel Methods - Support Vector Learning, MIT Press, 1999.
[Joachims, 2002] T. Joachims. Optimizing Search Engines Using Clickthrough Data. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), pp. 133-142, 2002.
[Matsumura et al., 2002] N. Matsumura, Y. Ohsawa and M. Ishizuka. Profiling of Participants in Online-Community. Chance Discovery Workshop on the Seventh Pacific Rim International Conference on Artificial Intelligence (PRICAI), pp. 45-50, 2002.
[Nishimura et al., 2005] R. Nishimura, Y. Watanabe and Y. Okada. A Question Answer System Based on Confirmed Knowledge Developed by Using Mails Posted to a Mailing List. In Proceedings of the IJCNLP 2005, pp. 31-36, 2005.
[Sebastiani, 2002] F. Sebastiani. Machine Learning in Automated Text Categorization. ACM Computing Surveys, 34(1):1-47, 2002.
[Shawar and Atwell, 2003] B. A. Shawar and E. Atwell. Machine Learning from Dialogue Corpora to Generate Chatbots. In Expert Update Journal, 6(3):25-29, 2003.
[Shrestha and McKeown, 2004] L. Shrestha and K. McKeown. Detection of Question-Answer Pairs in Email Conversations. In Proceedings of Coling 2004, pp. 889-895, 2004.
[Tarau and Figa, 2004] P. Tarau and E. Figa. Knowledge-Based Conversational Agents and Virtual Storytelling. In Proceedings of the 2004 ACM Symposium on Applied Computing, 1:39-44, 2004.
[Weizenbaum, 1966] J. Weizenbaum. ELIZA - A Computer Program for the Study of Natural Language Communication between Man and Machine. Communications of the ACM, 9(1):36-45, 1966.
[Xi et al., 2004] W. Xi, J. Lind and E. Brill. Learning Effective Ranking Functions for Newsgroup Search. In Proceedings of SIGIR 2004, pp. 394-401, 2004.
[Zhou and Hovy, 2005] L. Zhou and E. Hovy. Digesting Virtual “Geek” Culture: The Summarization of Technical Internet Relay Chats. In Proceedings of ACL 2005, pp. 298-305, 2005.
Transparent Knowledge Extraction
RDP Neural Network
[Figure: the XOR problem. The four points (0,0), (0,1), (1,0), (1,1) in the X-Y plane are not linearly separable, so a single threshold unit with inputs I0, I1, weights w0, w1 and bias θ0 cannot compute I0 XOR I1.]

I0  I1 | Out
 0   0 |  0
 0   1 |  1
 1   0 |  1
 1   1 |  0
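The non-separability shown in the figure can be checked exhaustively. The following sketch (illustrative only, not from the slides) brute-forces small integer weights for a single threshold unit, confirming that none computes XOR, and then wires two threshold units into a hidden layer that does:

```python
import itertools

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def unit(w0, w1, theta, i0, i1):
    # single threshold unit: fires iff w0*I0 + w1*I1 > theta
    return int(w0 * i0 + w1 * i1 > theta)

# no single unit with small integer weights/bias reproduces XOR
grid = range(-3, 4)
assert not any(
    all(unit(w0, w1, t, i0, i1) == out for (i0, i1), out in XOR.items())
    for w0, w1, t in itertools.product(grid, repeat=3)
)

def two_layer_xor(i0, i1):
    h_or = unit(1, 1, 0, i0, i1)        # I0 OR I1
    h_and = unit(1, 1, 1, i0, i1)       # I0 AND I1
    return unit(1, -1, 0, h_or, h_and)  # OR and not AND == XOR

assert all(two_layer_xor(i0, i1) == out for (i0, i1), out in XOR.items())
```

The hidden layer computes OR and AND of the inputs; XOR is then "OR and not AND", which is linearly separable in the hidden space — a construction in the spirit of what multilayer solutions such as the RDP network exploit.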
David Elizondo
Geometrical Approaches for Artificial Neural Networks
RDP Neural Network
- RDP Neural Network Construction Principle
- Linear Separability
- Methods for building RDP Neural Networks
RDP Neural Network
- Multilayer neural network: generalisation of the single-layer perceptron for solving non-linearly separable classification problems
- Automatic construction (SVM kernel)
- Convergence guaranteed
- Does not suffer from catastrophic interference
- No learning parameters
- Transparent knowledge extraction
- Generalisation level comparable to BP, CC, Rulex (benchmarks, satellite images)
A Knowledge Base for a Neural Guidance System
Abstract
1 Introduction
Mobile creatures are guided by knowledge contained in their brains. A brain can be viewed as a neural network that behaves as an expert system in the domain of mundane behavior. Neural networks offer functions that are difficult to implement in traditional expert systems. Unlike traditional systems, where a designer fixes the meanings of symbols, neural networks represent symbols in a continuous and extensible manner [MacLennan, 1991]. The extensibility of symbols in a neural network resolves the "brittleness" problem [Sun, 1992a] found in von Neumann architectures. Instead of depending upon knowledge encoded by an engineer, a neural network learns relationships among symbols. The dynamic environment of the real world presents multiple changing goals that require various levels of guidance ranging from habitual to exploratory action; traditional expert systems generate only habitual decisions. A neural network also can directly represent commonsense reasoning patterns [Sun, 1992b], something that has been difficult in traditional systems [Elkan, 1993]. This summary describes a paper which focuses on one component of a neural expert system, the knowledge base. The proposed implementation places the knowledge base in an associative memory, where a distributed representation of symbols and features of the Dynamic Link Architecture [Buhmann et al., 1990] provide the basis for representing relations and production rules suitable for generating and guiding mundane behavior. The proposed implementation is expected to deal with semantic information, rather than the syntactically refined information that is amenable to formal logic. This statement can be made more clear by comparison to related work. The system of [Ajjanagadde and Shastri, 1991] processes syntactically refined information. CONSYDERR of [Sun, 1992a] is a hybrid that combines semantic information in its distributed level and refined information in its local level. The full paper has the same organization as this summary.
Section 1 introduces terminology and explains the expected benefits of neural network architectures over the von Neumann architecture. Section 2 describes some data structures that will make it possible to implement an expert system using a neural network.
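The associative-memory knowledge base sketched above can be illustrated with a classic Hopfield-style store/recall loop. This is an illustrative assumption — the paper's Dynamic Link Architecture is more elaborate — but it shows content-addressable recall from a distributed representation:

```python
import numpy as np

def store(patterns):
    # Hebbian outer-product rule over +/-1 patterns; zeroed self-connections
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns) - len(patterns) * np.eye(n)
    return W

def recall(W, probe, steps=10):
    # iterate threshold updates: content-addressable lookup from a noisy cue
    s = probe.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)
    return s

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = store(patterns)
noisy = patterns[0].copy()
noisy[0] = -1                 # corrupt one bit of the first pattern
print(recall(W, noisy))       # recovers patterns[0]
```

A probe corrupted by noise settles back onto the nearest stored pattern, which is the sense in which symbols here are "continuous and extensible" rather than fixed by a designer.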
Knowledge Extraction by using an Ontology-based Annotation Tool

Maria Vargas-Vera, Enrico Motta, John Domingue, Simon Buckingham Shum and Mattia Lanzoni
Knowledge Media Institute (KMi), The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom
m.vargas-vera, e.motta, j.b.domingue, s.buckingham.shum, nzoni@

ABSTRACT
This paper describes a Semantic Annotation Tool for the extraction of knowledge structures from web pages through the use of simple user-defined knowledge extraction patterns. The semantic annotation tool contains: an ontology-based mark-up component which allows the user to browse and to mark up relevant pieces of information; a learning component (Crystal, from the University of Massachusetts at Amherst) which learns rules from examples; and an information extraction component which extracts the objects and the relations between these objects. Our final aim is to provide support for ontology population by using the information extraction component. Our system uses as its domain of study "KMi Planet", a web-based news server that helps to communicate relevant information between members of our institute.

Keywords
Ontology-based mark-up, ontology population, extraction of knowledge, information extraction technologies.

1. INTRODUCTION
Semantic annotation has so far focused on isolated annotations of web pages. The Semantic Web, however, tries to achieve the annotation of pages with semantic information; in other words, the aim is to enrich the content of web pages. Recent work on semantic annotation guided by an ontology is discussed in [14]. However, our approach has a different aim: we use the ontology to guide the human annotator of the training set (i.e. the user is presented with a set of possible tags which can be used during the mark-up process), and then the system learns rules by using the semantic annotations, whilst in OntoAnnotate [14] the user selects the object identifier and the appropriate class for it from a hierarchy of classes.
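As an illustrative aside, the ontology-guided mark-up just described can be modelled minimally as tying a text span to a slot of an event class. The names below are hypothetical sketches (they mirror the KMi event hierarchy discussed later, not the tool's actual API):

```python
from dataclasses import dataclass

# hypothetical excerpt of the event hierarchy: class -> slot tags
ONTOLOGY = {
    "visiting-a-place-or-people": [
        "visitor", "has-location", "start-time", "end-time",
    ],
}

@dataclass
class Annotation:
    event_class: str   # class selected in the event hierarchy
    slot: str          # tag chosen from that class's slots
    start: int         # character offsets of the marked span
    end: int

def annotate(text, event_class, slot, span):
    # the tool only offers tags drawn from the selected class's slots
    assert slot in ONTOLOGY[event_class], f"unknown tag {slot!r}"
    start = text.index(span)
    return Annotation(event_class, slot, start, start + len(span))

story = "David Brown visited The OU."
a = annotate(story, "visiting-a-place-or-people", "visitor", "David Brown")
print(story[a.start:a.end])    # David Brown
```

The point of the sketch is the constraint in `annotate`: the ontology, not the user, determines which tags are available for a given event class.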
Then all the information which is in the ontology for that particular object identifier is presented to the user. If the object identifier is not defined, the user can create a new object or class relation. One target of the system presented in this paper is to learn rules from texts by using a machine learning component called Crystal.

[Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 2000 ACM 0-89791-88-6/97/05..$5.00]

To extract rules from text, we have developed an environment which allows the user to perform four phases: browsing, semantic annotation of pages, learning rules, and information extraction (IE) from the web pages. Each of these phases is described as follows:

1. Browse. This option can be used by the user to select the kind of browser; in our case it could be WebOnto or any other browser. WebOnto [3] provides web-based visualisation, browsing and editing support for the ontology. It allows easier development and maintenance of the knowledge models, themselves specified in OCML (Conceptual Modeling Language) [8].

2. The mark-up phase. The activity of semantic tagging refers to annotating text documents (written in plain ASCII or HTML format) with a tag set defined on the ontology; in particular we work with the hand-crafted KMi ontology (an ontology describing the Knowledge Media Institute). The semantic annotation tool provides the means to browse the event hierarchy (described in the next section). In this hierarchy each event is a class, and the annotation component extracts the set of possible tags from the slots defined in each class. In general the mark-up process might be difficult, but in our case the annotation component is guiding
the user with the possible entities which can be marked in the text. Another approach related to our work is the SHOE Knowledge Annotator, a Java program that allows users to mark up web pages with the SHOE ontology [5]. However, in SHOE there is no relation between the new annotations and the original text.

3. The learning phase. This phase uses the marked text as a training set and learns relations from the stories. It uses Crystal as a learning component. Crystal works using a bottom-up approach: it finds rules for specific instances and then generalises these rules.

4. The information extraction phase. The goal of an information extraction (IE) system is to extract specific types of information from text. For example, an IE system in the domain of the KMi (Knowledge Media Institute) organisation should be able to extract the names of KMi projects, KMi funding organisations, awards, dates, etc. The main advantage of the IE task is that portions of a text that are not relevant to the domain can be ignored; therefore text can be processed quickly.

Most IE systems use some form of partial parsing to recognise syntactic constructs without generating a complete parse tree for each sentence. Such partial parsing has the advantages of greater speed and robustness. High speed is necessary to apply IE to a large set of documents. IE has been used in several domains, for instance scientific articles such as MEDLINE (which contains abstracts of biomedical journals) [2], bibliographic notices [9], and medical records [13]. Ontologies have also been used in IE systems to help them extract relations from semi-structured or unstructured documents, statements or terms [11]. Recent work on semi-automatic ontology acquisition by means of IE, supported by machine-learning methods, is described in [6, 4]. Along similar lines there is CMU's approach for extracting information from hypertext using machine learning techniques (a Bayes classifier) and making use of an ontology [1]. However, we remark that we are
supporting ontology population. The ontology population problem is an important issue to be addressed, since it is difficult to keep a hand-crafted ontology up to date.

In our work, we have integrated the hand-crafted KMi ontology into the information extractor. The main task of the ontology is to disambiguate some of the extracted information. For instance, in the event of conferring an award, "X was granted Y amount of money", X could be instantiated to the name of a project or an institution. In this case we make use of the ontology to clarify the type of X.

In the construction of our IE component we have integrated several components (Marmot, Badger and Crystal) from the University of Massachusetts at Amherst (UMass), which are fully described in Riloff [10]. We remark that in our IE component the template matching itself is supported semantically by referring to the ontology, but it also contains some lightweight NLP techniques in order to syntactically identify some fragments of the sentences. We believe it is important to mix the syntactic and the semantic. The semantic checking is often necessary to resolve ambiguities; for example, ontologies can provide us with axioms of common-sense knowledge such as "if someone is visiting a place then this someone should be a person." Conversely, some grammar constructions (such as dates) can be recognized robustly. Figure 1 illustrates the four phases; in particular, the browse phase has been launched.

Our primary contribution is to integrate a template-driven IE engine with an ontology engine (including inference capabilities besides lexicons such as WordNet) in order to supply the necessary semantic content and then to disambiguate extracted information; our second contribution is to provide support for the ontology population process.

The paper is organised as follows. In Section 2 we present a typology of two events as they are defined in the KMi ontology. Section 3 presents the mark-up phase. Section 4 shows the learning phase using Crystal. Section 5 presents the extraction of information
using Badger. Section 6 describes the use of the ontology to cope with ambiguity in the identification of objects in a story. Section 7 shows the OCML code generated after Badger obtains template instantiations. Section 8 discusses the process of populating an ontology as an activity in the life cycle of ontology construction. Finally, Section 9 gives conclusions and directions for future work.

2. A TYPOLOGY OF EVENTS

Figure 2: Event hierarchy

Class Event 2: conferring-a-monetary-award
slots:
  monetary award (sum of money)
  has-duration (duration)
  start-time (time-point)
  end-time (time-point)
  has-location (a place)
  main-agent (list of person(s))
  other-agents-involved (list of person(s))
  location-at-start (a place)
  location-at-end (a place)
  awarding-body (an organization)
  has-award-rationale (project goals)

In Event 2 the value for the slot has-award-rationale is extracted from text by using heuristics such as: if the word "goal" appears in the story, then the system extracts as the rationale the sentence up to the next full stop. The reason for this is that the rationale is too general to be learned by an IE component: it does not follow any grammar rule about how it could be expressed by a journalist who writes a story describing a project's award.

Class Event 3: demonstration-of-technology
  technology-being-demonstrated (technology)
  has-duration (duration)
  start-time (time-point)
  end-time (time-point)
  has-location (a place)
  other-agents-involved (list of person(s))
  main-agent (list of person(s))
  location-at-start (a place)
  location-at-end (a place)
  medium-used (equipment)
  subject-of-the-demo (title)

Event 3 contains the structure for the event "demonstration-of-technology". Entities that need to be recognised are the technology, the place, etc.

3. MARK-UP PHASE
The mark-up component aims to help with the manual annotation of web pages. In this component the ontology plays an important role in guiding the mark-up process. The user does not know which information is relevant and might be annotated; therefore, we consider it useful to have a tool that presents the user with possible tags. An example of an
annotated story is shown in Figure 4. The user selects a specific class in the hierarchy of events, for example "visiting-a-place-or-people". Then a set of possible tags is presented to the user for that event. The set of tags is: has-duration, start-time, end-time, has-location, other-agents-involved, main-agent, visitor, people-or-organisation-being-visited. From this set the user can select a subset of tags, and a template for the event "visiting-a-place-or-people" is then created automatically. The created template is used later by the component which makes instantiations of templates (Badger). Figure 3 shows the user selection. In this particular example the user selects only start-time, end-time, has-location and visitor.

Figure 3: Selection of tags

For the sake of space, let us assume that the user annotates the story with two tags from the selected set: visitor and place. Figure 4 shows the semantic annotations which are automatically inserted in the text. In the story, David Brown was annotated as visitor and The OU was annotated as place.

4. LEARNING PHASE
This phase was implemented by integrating two tools, Marmot and the learning component called Crystal, both from UMass. A brief description of Marmot (a text preprocessor) is given before the learning component Crystal is presented.

Figure 4: Annotated story

4.1 Marmot
Marmot (from UMass) is a natural language preprocessing tool that accepts ASCII files and produces an intermediate level of text analysis that is useful for IE applications. Sentences are separated and segmented into noun phrases, verb phrases and prepositional phrases. Marmot has several functionalities: it preprocesses abbreviations to guide sentence segmentation, resolves sentence boundaries, identifies parenthetical expressions, recognises entries from a phrasal lexicon and replaces them, recognises dates and duration phrases, performs phrasal bracketing of noun, preposition and adverbial phrases, and finally scopes conjunctions and disjunctions. We have defined our own
verbs, nouns, abbreviations and tags in order to apply Marmot to our KMi domain. For the sake of space we analyse only the first three sentences in the story given in Figure 5. In the first sentence, Marmot recognised two entities: firstly a subject (SUBJ), which is JOHN DOMINGUE, and secondly a date. The latter is recognised and marked between "@" symbols. Dates are recognised robustly as regular expressions.

SUBJ(1): JOHN DOMINGUE
ADVP(2): @WED_%COMMA%_15_OCT_1997@
PUNC(3): %PERIOD%

In sentence number 2, DAVID BROWN is recognised as the subject (SUBJ), a prepositional phrase (PP) "FOR INDUSTRY" is encountered, the verb (VB) VISITS is also found, OBJ1 takes the value of THE OU, and finally a punctuation symbol (PUNC), the full stop, is encountered at the end of the sentence.

SUBJ(1): DAVID BROWN %COMMA% UNIVERSITY
PP(2): FOR INDUSTRY
VB(3): VISITS
OBJ1(4): THE OU
PUNC(5): %PERIOD%

In the same fashion, in sentence number 3, DAVID BROWN is recognised as the subject, the word VISITED is recognised as the verb, and OBJ1 as THE OU.

SUBJ(1): DAVID BROWN %COMMA% THE CHAIRMAN OF THE UNIVERSITY
PP(2): FOR INDUSTRY DESIGN AND IMPLEMENTATION ADVISORY GROUP AND CHAIRMAN OF MOTOROLA
PUNC(3): %COMMA%
VB(4): VISITED
OBJ1(5): THE OU

Figure 5: Marmot output

4.2 Crystal
Crystal is a dictionary induction tool. It derives a dictionary of concept nodes (CNs) from a training corpus. The first step in dictionary creation is the annotation of a set of training texts by a domain expert. Each phrase that contains information to be extracted is tagged (with SGML-style tags).

Crystal initialises a CN dictionary with one definition for each positive instance of each type of event. The initial CN definitions are designed to extract the relevant phrases in the training instances that created them, but they are too specific to apply to unseen sentences. The main task of Crystal is to gradually relax the constraints on the initial definitions and to merge similar definitions. Crystal finds generalisations of its initial CN definitions by comparing definitions that are similar. This similarity is deduced
by counting the number of relaxations required to unify two CN definitions. Then a new definition is created with relaxed constraints. Finally, the new definition is tested against the training corpus to ensure that it does not extract phrases that were not marked with the original two definitions. This means that Crystal takes similar instances and generalises them into a more general rule, preserving the properties of each of the CN definitions being generalised.

The inductive concept learning in Crystal is similar to the inductive learning algorithm described in [7]: a specific-to-general data-driven search to find the most specific generalisation that covers all positive instances. Crystal finds the most specific generalisation that covers all positive instances, but uses a greedy unification of similar instances rather than breadth-first search.

Coming back to our example, David Brown's story: Crystal learns a conceptual node such as the one shown in Figure 7. This conceptual node states that "X visited"; so, in the future, whenever the pattern "X visited" appears in the text, the case frame will extract "X" as the visitor. For the pattern "X visited Y", we are basically extracting relations r(X, Y) from texts which can be interpreted as "X visited Y", and the lexicon for the relation r is the union of lexicon(X) and lexicon(Y). If we find this relation in our texts, then we have found an instance of the event "visiting-a-place-or-people".

Figure 6: Crystal output

In this example we do not have the case that two different templates might apply to the same sentence, but it is possible to encounter such cases. Let us consider the following example from the MUC domain (the MUC domain is a set of documents describing terrorist activities in Latin America): "A visitor from Colombia was hurt when two terrorists attempted to kill the mayor". Here "visitor from Colombia" is marked as victim, "two terrorists" are marked as perpetrators and "mayor" as victim. Crystal generates 3 case frames that represent the following patterns: if a text contains the expression "X was hurt" then the system extracts "X" as the victim; if a text contains the expression "X attempted to kill" then the system extracts "X" as a perpetrator; if the text contains the expression "attempted to kill Y" then the system extracts "Y" as the victim.

In recent years there has been great interest in annotation-based techniques for producing dictionaries automatically. The reason for this is that the automatic creation of conceptual dictionaries is an important factor for the portability and scalability of an IE system. Crystal has been tested on a corpus of 300 KMi stories, and was able to induce a dictionary of CN definitions for each event in the KMi ontology.

5. EXTRACTION PHASE
A third component called Badger (from UMass) was also integrated into our IE component. Badger makes the instantiation of templates. The main task of Badger is to take each sentence in the text (in our case a story written in an e-mail message) and see if it matches any of our CN definitions. If no extraction CN definition applies to a sentence, then no information will be extracted; this means that irrelevant text can be processed very quickly.

It might occur that Badger obtains more than one type of event for a story. Then our IE system decides how to classify the story according to the following criterion: how many features for each type were encountered in the story.

Visitor: V (class_person)
Has-location: P (class_place)
Has-duration: D (class duration)
Start-time: ST (class time_point)
End-time: ET (class time_point)
Verb: visited (active verb)

Figure 7: Concept node for the visiting event

Badger obtained case frame instantiations for Place and Visitor using conceptual nodes defined in the dictionary constructed by Crystal. In Badger's output the following conventions are used: the name of the slot appears on the left-hand side of the arrow and the value for the slot on the right-hand side. In the David Brown story, Badger instantiated Place to The OU and Visitor to David Brown. The type of event is obtained from the value of Type and the document ID from docid. The output shown in Figure 8 means that Badger instantiated (using the CN definitions and domain lexicon) a frame of the form:

Concept Node:
  CN-type: visiting-a-place-or-people
  Slots:
    Visitor tag: VI
    Start-time tag: ST
    End-time tag: ET
    Place tag: PL
    Research-group tag: GR

The date is not stated in the story, so Start-time and End-time are instantiated to the date on which the story was written.

6. INFERENCE CAPABILITIES BY USING AN ONTOLOGY
An example of a story belonging to the event type conferring-a-monetary-award is as follows. This example is described in this paper because it shows the inference capabilities which can be obtained from using an IE component plus an ontology.

    IBROW has been awarded 1 million Ecu from the European Commission to carry out research in the area of knowledge-based systems.

The output from Badger is shown below.

Figure 8: Badger output

<cn>
ID: 80
Type: conferring-a-monetary-award
docid = ibrow-story
sentence_num = 1
segment_num = 1
Funder ==> PP: FROM THE EUROPEAN COMMISSION
</cn>
<cn>
ID: 106
Type: conferring-a-monetary-award
docid = ibrow-story
sentence_num = 1
segment_num = 1
Money ==> OBJ1: 1 MILLION ECU
</cn>
<cn>
ID: 24
Type: conferring-a-monetary-award
docid = ibrow-story
sentence_num = 1
segment_num = 1
Project-Institution ==> SUBJ: IBROW
</cn>

In this last example, we need to use the KMi Planet ontology to find out whether Project-Institution is an institution name or a project name, and this is done by a simple traversal of the inheritance links in the ontology. Specifically, to remove the ambiguity we send a query to WebOnto asking for the set of all educational-organizations, using the following query code:

web-onto display akt-kmi-planet-kb
ocml-eval (setofall ?x (educational-organization ?x))

This gives a list containing all educational-organizations:

@(the-open-university ... org-knowledge-media-institute)

IBROW does not match any of these; however, we also send a query to
WebOnto asking for the set of all kmi-projects:

web-onto display akt-kmi-planet-kb
ocml-eval (setofall ?x (kmi-project ?x))

yielding

@(project-d3e ... project-kmi-planet ... project-ibrow ... project-heronsgate-mars-buggy)

and hence a match of "IBROW" to project-ibrow.

In a similar fashion a query is sent to WebOnto in order to find whether Funder is a valid funding body:

web-onto display akt-kmi-planet-kb
ocml-eval (setofall ?x (awarding-body ?x))

to give

@(... org-european-commission org-british-council)

At the same time, some semantic relations can be obtained by using the KMi Planet ontology. For our example about IBROW we can derive the following semantic relations: "ibrow is a KMi project" and "KMi is part-of the Open-University". The OCML query to derive that KMi is part of the Open University is as follows:

web-onto display akt-kmi-planet-kb
ocml-eval (setofall ?x (organization-unit-part-of ?x the-open-university))

to give

@(knowledge-media-institute acad-unit-department-of-earth-science acad-unit-department-of-statistics-ou acad-unit-faculty-of-maths-and-computing-ou ... org-office-for-technology-development)

Therefore we can conclude that "the Open-University has been awarded 1 million Ecu from the European Commission". In a future implementation we will be interested in finding more complex relations by using our KMi Planet ontology. Finally, we remark that OCML (the query language used by WebOnto) has adopted the closed world assumption (CWA), in the same fashion as Prolog, so facts that are not provable are regarded as "false" as opposed to "unknown".

7. OCML CODE GENERATED FROM OUR SYSTEM
Our goal is to use the information obtained by Badger and the KMi ontology in order to populate our KMi ontology with new instances of classes. To accomplish this task we have plugged in another component, a translator from Badger's output to OCML code. The main function of this translator is to tokenise the Badger output, then find the CN definitions (cn markers) and extract all the objects encountered in the
story. The name of each slot in the case frame corresponds to the name of the field in the class definition, and the value of the field is the extracted information. For the example of David Brown's story, we end up with a visiting-a-place-or-people event and produce the intermediate output:

(def-instance visit-of-david-brown-the-chairman-of-the-university
  visiting-a-place-people
  ((has-duration 1-day)
   (start-time wed-15-oct-1997)
   (end-time wed-15-oct-1997)
   (has-location the-ou)
   (visitor david-brown-the-chairman-of-the-university)))

where an instance of the event type visiting-a-place-or-people has been defined with the name "visit-of-david-brown-the-chairman-of-the-university".

8. POPULATING THE ONTOLOGY
Building domain-specific ontologies often requires time-consuming, expensive manual construction. Therefore we envisage IE as a technology that might help us during the ontology maintenance process. During the population step our IE system has to fill predefined slots associated with each event, as already defined in the ontology. Our goal is to automatically fill as many slots as possible. However, some of the slots will probably still require manual intervention. There are several reasons for this problem:

- there is information that is not stated in the story;
- none of our templates matches the sentence that might provide the information (incomplete set of templates).

We note that there are some cases where the instances are not defined in the ontology, and then determining the type of an object is not straightforward: it has to be derived from a proof. Currently, we are still looking into this aspect of our research. Figure 9 shows the information extracted from the David Brown story. Once the system has extracted the information, the user is presented with all of it, even the information that cannot be categorized as belonging to a type of object defined in our domain.
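A minimal sketch of the translator step just described — turning a dictionary of extracted slot values into an OCML def-instance — could look as follows (the function name and input shape are our assumptions; the real component additionally parses Badger's <cn> markers):

```python
def to_ocml(instance_name, event_type, slots):
    # Render one extracted event as an OCML def-instance s-expression.
    # `slots` maps slot names to already-normalised extracted values.
    pairs = " ".join(f"({slot} {value})" for slot, value in slots.items())
    return f"(def-instance {instance_name} {event_type} ({pairs}))"

print(to_ocml(
    "visit-of-david-brown-the-chairman-of-the-university",
    "visiting-a-place-people",
    {"has-location": "the-ou",
     "visitor": "david-brown-the-chairman-of-the-university"},
))
```

Slots that the extractor could not fill are simply omitted from the output, which matches the point above that some slots still require manual completion.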
Therefore, before populating the ontology, we require that a person check and complete the extracted information.

Figure 9: Extracted information

9. CONCLUSIONS AND FUTURE DIRECTIONS
We have built a tool which extracts knowledge using an ontology, an IE component and an OCML translator. Currently, our system has been trained using the archive of 300 stories that we have collected in KMi. The training step was performed using typical examples of stories belonging to each of the different event types defined in the ontology. We obtained results of over 95% using the IE component on KMi stories. However, in the future we would like to use the IE component in a different domain; we are interested in using our system on companies' project reports, Curricula Vitae (CVs), or job applications.

Another possible direction that we would like to explore is to incorporate into the IE component a different machine learning algorithm, such as the one described in [12], in order to compare performance between them. As a medium-term goal, we would like to have access to a library of IE methods and to activate these over a web page or a collection of web pages. Besides the above issues, Badger could be extended to save its output in XML (Extensible Markup Language); this will increase the portability of our IE system, as XML is the universal format for structured documents and data on the Web. Finally, we would like to integrate our IE component with a visualisation component, which will allow visualisation of all the entities extracted.

10. ACKNOWLEDGMENTS
The research described in this paper is supported by the EPSRC under the project name Advanced Knowledge Technologies (AKT).

11. REFERENCES
[1] M. Craven, D. DiPasquo, D. Freitag, A. McCallum, K. Nigam, T. Mitchell, and S. Slattery. Learning to Construct Knowledge Bases from the World Wide Web. Artificial Intelligence, 1999.