Apache OpenNLP Developer Documentation
Training and Using a Language Detection Model in OpenNLP

A project of mine needed NLP-related functionality, including language detection. The cld3 language detection model is a Python-oriented tool and is awkward to integrate with a Java project, so I looked for alternatives and found OpenNLP, which turned out to offer comparatively good support for Asian languages. The notes below cover training a language detector with OpenNLP. One caveat: if the text to be detected is very short, the result is easily wrong, so make sure the training corpus is sufficiently large.

1. Preparing the documents

Start with the documentation: the docs on the OpenNLP site are well organized. Find the "Language Detector" section and read down to "Training". Following the documentation, the corpus only has to obey a few rules:

1. Each line of the text file is one training sample: the first column is the ISO 639-3 code of the language, followed by a tab, followed by the sample text.
2. For long texts, do not insert artificial line breaks.
3. The training data must contain samples for more than one language, otherwise training fails with an error.

2. Training the model

With a corpus file in that format, a few lines of code are enough to train the language detector we need (imports added here for completeness):

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import opennlp.tools.langdetect.*;
import opennlp.tools.ml.perceptron.PerceptronTrainer;
import opennlp.tools.util.*;
import opennlp.tools.util.model.ModelUtil;

InputStreamFactory inputStreamFactory =
        new MarkableFileInputStreamFactory(new File("corpus.txt"));
ObjectStream<String> lineStream =
        new PlainTextByLineStream(inputStreamFactory, StandardCharsets.UTF_8);
ObjectStream<LanguageSample> sampleStream =
        new LanguageDetectorSampleStream(lineStream);

TrainingParameters params = ModelUtil.createDefaultTrainingParameters();
params.put(TrainingParameters.ALGORITHM_PARAM, PerceptronTrainer.PERCEPTRON_VALUE);
params.put(TrainingParameters.CUTOFF_PARAM, 0);

LanguageDetectorFactory factory = new LanguageDetectorFactory();
LanguageDetectorModel model = LanguageDetectorME.train(sampleStream, params, factory);
model.serialize(new File("langdetect.bin"));
```

Run this once and a langdetect.bin model file is generated locally; from then on you simply load it in your program.
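Once the model file exists, using it is equally short. A minimal sketch of the consumption side (the sample input below is an arbitrary assumption):

```java
import java.io.FileInputStream;
import opennlp.tools.langdetect.Language;
import opennlp.tools.langdetect.LanguageDetector;
import opennlp.tools.langdetect.LanguageDetectorME;
import opennlp.tools.langdetect.LanguageDetectorModel;

public class LangDetectDemo {
    public static void main(String[] args) throws Exception {
        // Load the model trained above
        LanguageDetectorModel model =
                new LanguageDetectorModel(new FileInputStream("langdetect.bin"));
        LanguageDetector detector = new LanguageDetectorME(model);

        // predictLanguage returns the best-scoring language with a confidence value
        Language best = detector.predictLanguage("这是一个中文句子,用来演示语言检测。");
        System.out.println(best.getLang() + " " + best.getConfidence());
    }
}
```

Keeping the input reasonably long matters here; as noted above, very short strings are exactly the case where detection tends to go wrong.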
RDocumentation Package Manual

Package 'RDocumentation' — October 12, 2022

Type: Package
Title: Integrate R with 'RDocumentation'
Version: 0.8.2
URL: https://, https://
BugReports: https://github.com/datacamp/RDocumentation/issues
Description: Wraps around the default help functionality in R. Instead of plain documentation files, documentation will show up as it does on RDocumentation, a platform that shows R documentation from 'CRAN', 'GitHub' and 'Bioconductor', together with informative stats to assess the package quality.
License: GPL (>= 2)
Imports: httr (>= 1.2.1), proto (>= 0.3-10), rjson (>= 0.2.15), utils
RoxygenNote: 6.0.1
Suggests: testthat
NeedsCompilation: no
Author: Ludovic Vannoorenberghe [cre], Jonathan Cornelissen [aut], Hannes Buseyne [ctb], Filip Schouwenaars [ctb]
Maintainer: Ludovic Vannoorenberghe <********************>
Repository: CRAN
Date/Publication: 2018-01-23 22:30:32 UTC

R topics documented: RDocumentation-package, check_package, documentation, hideViewer

RDocumentation-package: Integrate R with RDocumentation.
Description: Enhance the search/help functionality in R.
Details: Package: RDocumentation; Type: Package; Version: 0.2; Date: 2016-08-09; License: GPL (>= 2).
Author(s): Jonathan Cornelissen. Maintainer: Jonathan Cornelissen <*********************>.
See Also: help.

check_package: Check if a package is installed for the user.
Usage: check_package(pkg, version)
Arguments: pkg — name of the package; version — the latest version to be checked.
Value: 1 if the package is not installed; -1 if the package is not up to date; 0 if the package is installed and up to date.
Examples (not run): check_package("RDocumentation", "0.2"); check_package("utils", "3.3.1")

documentation: Documentation on RDocumentation, or via the normal help system if offline.
Description: Wrapper functions around the default help functions from the utils package. If online, you are redirected to RDocumentation; if offline, you fall back onto your locally installed documentation files.
Usage: help(...); "?"(...); help.search(...)
Arguments: ... — the arguments you would pass to the default utils function with the same name.
Details: For slow internet connections, a timeout for fetching the RDocumentation page can be set via options("RDocumentation.timeOut" = nb_of_seconds); the default timeout is 3 seconds.

hideViewer: Redirects the viewer to the RDocumentation help page.
Usage: hideViewer()
Extracting Nouns with OpenNLP

OpenNLP is an open-source toolkit for natural language processing and can be used to extract the nouns from a text. To extract nouns with OpenNLP, take the following steps (a code sketch follows the list):

1. Install OpenNLP: download OpenNLP from its official website and follow the installation and configuration instructions.
2. Prepare the data: gather the text you want to analyze, and make sure it is plain text without HTML tags, special characters, and so on.
3. Train a model: use OpenNLP's training tools to train a model, for example a named entity recognition (NER) model. Training requires annotated data that marks up the nouns and other entities in the text.
4. Run the model: apply the trained model to your text, either through OpenNLP's command-line tools or through its programming API.
5. Process the results: extract the nouns and other entities from the recognition output. Depending on your needs, the results can be saved to a file, a database, or another storage medium.

Note that using OpenNLP this way requires some background in natural language processing and programming. If you are a beginner, it is worth learning some NLP and programming basics first so you can get more out of OpenNLP.
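In practice, a part-of-speech tagger is often a simpler route to nouns than a custom NER model. A minimal sketch, assuming the pre-trained en-pos-maxent.bin model from the OpenNLP model download page is available locally (the file name and sample sentence are illustrative assumptions):

```java
import java.io.FileInputStream;
import java.io.InputStream;
import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.tokenize.SimpleTokenizer;

public class NounExtractor {
    public static void main(String[] args) throws Exception {
        try (InputStream modelIn = new FileInputStream("en-pos-maxent.bin")) {
            POSModel model = new POSModel(modelIn);
            POSTaggerME tagger = new POSTaggerME(model);

            // The tagger expects tokenized input
            String[] tokens = SimpleTokenizer.INSTANCE.tokenize(
                    "OpenNLP provides tools for natural language processing.");
            String[] tags = tagger.tag(tokens);

            for (int i = 0; i < tokens.length; i++) {
                // Penn Treebank noun tags all start with "NN" (NN, NNS, NNP, NNPS)
                if (tags[i].startsWith("NN")) {
                    System.out.println(tokens[i]);
                }
            }
        }
    }
}
```

The filter relies on the Penn Treebank tag set used by the English models, in which every noun tag begins with NN.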
OpenTelemetry Annotations

OpenTelemetry annotations are markers for adding distributed tracing: they are inserted into application code to capture and report key information about requests and responses. The annotations can be applied to functions, methods, and code blocks to indicate what that code does, which helps the tracing system understand the application's execution flow and performance.

OpenTelemetry annotations can carry the following information:

1. Trace identity: a unique trace ID for an execution of the application, used to correlate and follow the execution path across multiple requests.
2. Timestamps: the start and end times of the code's execution, used to compute durations and performance metrics.
3. Tags: key-value pairs describing attributes of the request and response, such as the request URL, method, and parameters.
4. Error information: where and why errors and exceptions occurred.

By using OpenTelemetry annotations, developers can feed this key information into a distributed tracing system and gain detailed insight into system performance and behavior, which makes it easier to monitor and optimize the application and to track down potential problems.
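As one concrete form of this, the OpenTelemetry Java agent supports annotation-driven spans. The sketch below assumes the opentelemetry-instrumentation-annotations module (the annotation package has moved between versions), and the class, method, and attribute names are invented for illustration:

```java
import io.opentelemetry.instrumentation.annotations.SpanAttribute;
import io.opentelemetry.instrumentation.annotations.WithSpan;

public class OrderService {

    // Under the OpenTelemetry Java agent, @WithSpan wraps this method in a
    // span named "placeOrder", recording start/end timestamps automatically.
    @WithSpan("placeOrder")
    public void placeOrder(@SpanAttribute("order.id") String orderId) {
        // @SpanAttribute attaches the parameter value as a key-value tag on
        // the span; exceptions thrown here are recorded as error events.
        // ... business logic ...
    }
}
```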
An Automated Testing Framework [Invention Patent]

(19) State Intellectual Property Office of the People's Republic of China
(12) Invention Patent Application
(10) Publication No.: CN 107133177 A   (43) Publication Date: 2017.09.05
(21) Application No.: 201710334139.3   (22) Filing Date: 2017.05.12
(71) Applicant: Zhengzhou Yunhai Information Technology Co., Ltd.
     Address: Room 1601, 16th Floor, 278 Xinyi Road, Zhengdong New District, Zhengzhou, Henan 450000
(72) Inventors: Zhang Zhen, Zhao Xia
(74) Patent Agency: Jinan Xinda Patent Office Co., Ltd. 37100; Agent: Liu Jizhi
(51) Int. Cl.: G06F 11/36 (2006.01)
(54) Title: An Automated Testing Framework
(57) Abstract: The invention discloses an automated testing framework comprising a test case management system, a control system, and a third-party automated testing tool. Test cases, keyword tables, and test scripts are managed separately and mapped in layers, forming a keyword-driven automated testing framework.

Compared with the prior art, the automated testing framework of this invention has a clear layered structure that is easy to maintain later on, and its keyword-driven testing keeps script writing simple, so that testers without a strong programming background can also design and write automated test scripts, greatly improving work efficiency.

Claims: 1 page; Description: 3 pages; Figures: 1 page

Claims

1. An automated testing framework, characterized by comprising a test case management system, a control system, and a third-party automated testing tool; wherein the test case management system supports testers in designing test cases from test requirements and in building keyword tables, which are stored in a database for access by the control system and the third-party automated testing tool; the control system is responsible for database operations, test data processing, and data transfer control; and the third-party automated testing tool executes the automated test tasks, parses the keyword tables, runs the test scripts driven by the keywords, returns the results to the control system, and outputs a test report.

2. The automated testing framework of claim 1, wherein the control system reads data from the database, drives the third-party tool to read the keyword tables, and drives execution of the test scripts.

3. The automated testing framework of claim 1, wherein the test case management system comprises a test case design module, a keyword management module, a test request generation module, and a keyword table generation module; the test case design module designs test cases from test requirements; the keyword management module manages keywords; the test request generation module generates test requests; and the keyword table generation module generates the keyword tables.
What HanLP Does: An Explanation

1. Introduction

1.1 Overview

HanLP ("Han Language Processing") is an open-source natural language processing (NLP) toolkit, with standard models trained in part on People's Daily corpora, that has been widely adopted in many large projects and enterprises. Natural language processing is an important research direction in artificial intelligence, concerned with understanding and processing human language. As a capable NLP toolkit, HanLP bundles a range of processing and analysis functions for Chinese text, helping developers handle Chinese text data quickly and accurately.

HanLP's core functions include word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and keyword extraction. These support tasks such as text analysis, information extraction, machine translation, sentiment analysis, and intelligent question answering.

HanLP has several notable characteristics:

1. Smart and efficient: HanLP uses deep learning and statistical machine learning techniques to deliver fast, accurate text processing. Carefully trained models and optimized algorithms keep it stable and effective across different scenarios.
2. Built for Chinese: HanLP is designed specifically for Chinese and accounts for its peculiarities. It supports special handling such as traditional/simplified conversion and pinyin conversion, and is trained on large-scale Chinese corpora for better results on Chinese text.
3. Customizable: HanLP exposes a rich set of functions and parameter settings, letting users tailor it to their needs by choosing different models, configurations, and plugins.
4. Strong ecosystem: HanLP has an active community with many users and contributors, and a large body of companion tools and applications has grown up around it.

In short, HanLP is a full-featured, high-performing Chinese NLP toolkit that offers a convenient and efficient solution for processing and analyzing Chinese text. Whether for academic research or commercial applications, it is an indispensable tool; it has greatly lowered the barrier to Chinese natural language processing and made an important contribution to the field.
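As a small illustration of the style of API HanLP exposes, the sketch below targets the HanLP 1.x Java "portable" release (the sample sentences are arbitrary, and newer HanLP 2.x versions expose a different API):

```java
import com.hankcs.hanlp.HanLP;
import com.hankcs.hanlp.seg.common.Term;
import java.util.List;

public class HanLPDemo {
    public static void main(String[] args) {
        // Word segmentation with part-of-speech tags
        List<Term> terms = HanLP.segment("商品和服务");
        for (Term term : terms) {
            System.out.println(term.word + "\t" + term.nature);
        }

        // Keyword extraction: top 3 keywords from a short document
        List<String> keywords =
                HanLP.extractKeyword("HanLP是一个面向生产环境的中文自然语言处理工具包", 3);
        System.out.println(keywords);

        // Simplified-to-traditional conversion, one of the "built for Chinese" features
        System.out.println(HanLP.convertToTraditionalChinese("语言处理"));
    }
}
```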
1.2 Article Structure

The body of this article (Section 2) introduces HanLP's role and functions in detail.
OpenText UFT Implementation Services
Services Flyer

UFT Implementation Services

OpenText Professional Services can help you rapidly get on the right foot with our UFT Implementation Services. We help set the solid start you need to leverage your investment in your test automation solutions.

Executive Summary

Getting started and taking strides to adopt any platform can be a challenge. OpenText Professional Services can help you rapidly get on the right foot with our services for OpenText Unified Functional Testing (UFT), a test automation platform. Implementation services not only get you started, but they can also help improve how you use your solution, which ultimately improves your application testing and quality.

About OpenText UFT

OpenText UFT is a proven platform for test automation. It enables organizations to test at full velocity across a range of technologies using its extensive features and a broad ecosystem of integrations. UFT has enabled thousands of organizations to reduce their test execution time extensively and increase the quality of their applications.

UFT Implementation Services

OpenText Professional Services offer a hands-on experience by combining our technical implementation services with side-by-side coaching from our experienced consultants, as well as formal UFT education.

UFT One QuickStart

Our experienced consultants will work with you to accelerate your adoption of UFT One. Learn how to set up UFT One for one application in your environment. The hands-on approach, combined with your formal training, provides a great foundation for you to shape your future test automation ecosystem with confidence. Work with our consultants to obtain a good understanding of:

- Setting up and basic use of UFT One in your environment.
- Managing UFT licenses.
- Working with UFT One add-ins.
- Coding and script hardening.
- Using AI-based testing, API, and service protocols.
- Using integrations and developing a UFT One-based test framework.

You will experience these capabilities specific to your team, applications, and environment. Working with our consultants will give your team solid examples of transitioning from manual to automated testing. The duration of the UFT One QuickStart service is five days. The Professional Services QuickStart service can be used to purchase the UFT One QuickStart service.

UFT Developer QuickStart

For this service, our consultants show your developers how to use UFT Developer and integrate it into their day-to-day unit testing. Developers will learn how to:

- Set up and use UFT Developer in your environment.
- Use UFT Developer software development kits.
- Develop a UFT Developer test framework.
- Use integrations and advanced UFT Developer capabilities.

The duration of the UFT Developer QuickStart service is five days. The Professional Services QuickStart service can be used to purchase the UFT Developer QuickStart service.

UFT Digital Lab Services

Professional Services personnel are experts in UFT Digital Lab. We can help your organization implement or upgrade UFT Digital Lab and mobile testing. We also offer a managed service to host UFT Digital Lab and your mobile devices, releasing your developers and testers from the burdens of managing, supporting, and updating the ever-growing complexity of the mobile device landscape.

UFT Digital Lab Implementations and Upgrades

Throughout our UFT Digital Lab implementations, we work side by side with you and coach your team. We share our recommended practices to advance your team's self-sufficiency in using, configuring, and maintaining your solution. These services are available for OpenText UFT Digital Lab on SaaS or as an on-premises solution. Learn more about our UFT Digital Lab implementation and upgrade services.

Mobile Device Hosting Using UFT Digital Lab

The Mobile Device Hosting Service provides the ability to use a cloud-based environment for testing mobile applications on real physical devices, such as phones and tablets. It leverages your existing UFT Digital Lab licenses and integrates into your development and testing tools to enable authentic, end-to-end mobile device interaction. This secure solution is always available, and you can customize it to accommodate the lifecycle and requirements of any mobile application. OpenText hosts the UFT Digital Lab environment and the devices. Proactive and reactive support ensures that the environment is available and operating optimally for dedicated use. Learn more about Mobile Device Hosting.

Other UFT Services

Continuous Integration and Testing Service

If you want to take the next step on your DevOps journey, Professional Services has helped many customers succeed with a "think big, start small" approach. This approach sets your overall Enterprise DevOps vision and starts where you can make the biggest impact in the shortest time: Continuous Integration and Continuous Testing.

Our Continuous Integration and Testing services extend across the application lifecycle and use our UFT platform capabilities with other OpenText and third-party platforms. Technology is only part of this service. We also help your organization improve processes and organizational areas:

- Develop a common approach and integrated platform for integrated development environments, source control, build, testing, and deployment.
- Improve code quality techniques such as static analysis, code decoupling, and unit testing.
- Develop an end-to-end Continuous Integration process: from check-out, through unit testing, to build and deployment.
- Rapidly provision and deprovision development and test environments.
- Support traditional, mobile app, and microservices-based applications.
- Improve, through coaching, your agile practices (especially Scrum- and SAFe-inspired approaches) and use ChatOps to improve collaboration.
- Accelerate testing by establishing a collaborative test-driven approach of writing tests before code is ready.
- Improve traceability by linking test assets to code, requirements, defects, and builds.
- Adopt and improve automated testing capabilities across unit, functional, regression, performance, and security testing, including mobile.
- Use network, service, and data virtualization technology to shift testing left.
- Integrate the overall test process with development, build, and deployment.
- Use targeted change-impact regression testing.
- Focus manual testing on exploratory and investigative testing quality and feedback capabilities.

Learn more about our Continuous Integration & Testing Service.

Testing Services Using UFT

OpenText provides functional testing services with strong testing capabilities using UFT Digital Lab, UFT One, and UFT Developer. Customers with licenses for these solutions can rapidly engage Professional Services for functional testing services. We can also assist you with implementing a mobile testing framework to advance a new capability or improve your existing mobile capability. We can conduct testing services on-site, remotely, or with a mixed-shore approach. We also tailor our Functional Testing to your requirements. These services may include:

- Consultation to scope and plan functional testing on web, mobile, or another application of your choice.
- Conversion of manual test cases to automated tests for your applications using UFT One, UFT Developer, UFT Digital Lab, or a combination of these platforms.
- Execution of automated functional tests and reporting of defects and test results.
- Provision of mentoring for your testers to enhance your functional testing capability.

Additionally, OpenText offers competitive fixed-price testing services that we provide remotely. Learn more.

Benefits of Our UFT Services

Working with our team to provide UFT services provides several benefits:

- Accelerate your time-to-value by rapidly implementing your UFT-based functional testing framework.
- Confidently implement and keep your UFT solution up to date based on our recommended architecture, practices, and expert side-by-side coaching.
- Rapidly implement mobile testing capabilities and a device farm without the burden of having to host and support the devices.
- Extend or accelerate your functional testing capability and effectively test your applications to ensure their quality and performance for your customers.

The OpenText Professional Services Difference

OpenText Professional Services delivers unmatched capabilities through a comprehensive set of consulting services. These services help drive innovation through streamlined and efficient solution delivery. We provide:

- Proven software-solution implementation expertise.
- More than 20 years of experience helping large, complex, global organizations realize value from their OpenText software investments.
- Rich intellectual property and unparalleled reach into product engineering.
- Technology-agnostic assessment approach with no vendor lock-in.
- Education and support services to ensure adoption.

Learn More

Find more information about our Professional Services' capabilities: OpenText Professional Services
Stanford CoreNLP Syntactic Parsing

Stanford CoreNLP is an open-source natural language processing toolkit that provides a range of NLP tasks, including tokenization, part-of-speech tagging, named entity recognition, and syntactic parsing. Syntactic parsing analyzes an input sentence's structure and the relations between its constituents.

The usual steps for parsing with Stanford CoreNLP are:

1. Add the library and the model files to your project; both can be downloaded from the Stanford CoreNLP website.
2. Create a StanfordCoreNLP object, specifying the annotators to run and, where needed, the paths to the model files.
3. Provide the sentence to analyze.
4. Call the StanfordCoreNLP object's annotate method; the results come back as an Annotation object.
5. Pull the parsing results you want out of the Annotation object via its get method: for example, get(CoreAnnotations.SentencesAnnotation.class) returns the list of sentences, and on each sentence get(TreeCoreAnnotations.TreeAnnotation.class) returns its parse tree.
The following example shows how to run syntactic parsing with Stanford CoreNLP; note that parsing Chinese text additionally requires the Chinese models jar on the classpath:

```java
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.trees.TreeCoreAnnotations;
import edu.stanford.nlp.util.*;

import java.util.*;

public class CoreNLPExample {
    public static void main(String[] args) {
        // Configure which annotators StanfordCoreNLP should run
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse");
        // Path to the Chinese PCFG parsing model
        props.setProperty("parse.model",
                "edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz");

        // Create the StanfordCoreNLP pipeline
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // The text to analyze
        String text = "我爱北京天安门。";
        Annotation document = new Annotation(text);
        pipeline.annotate(document);

        // Print the parse tree of each sentence
        for (CoreMap sentence :
                document.get(CoreAnnotations.SentencesAnnotation.class)) {
            System.out.println(
                    sentence.get(TreeCoreAnnotations.TreeAnnotation.class));
        }
    }
}
```
Robot Framework [Documentation] Usage: Overview and Explanation

1. Introduction

1.1 Overview

In today's era of rapid technological progress, robotics and automation draw ever more attention. Robot Framework is an advanced automated testing framework intended to help developers and testers raise testing efficiency, lighten their workload, and deliver better software.

Robot Framework's design philosophy is to be simple, practical, easy to use, and easy to extend. It takes a keyword-driven approach, letting testers write test cases in natural language without specialist programming knowledge. That makes it an attractive tool both for technically strong developers and for testers with less programming background.

As an open-source framework, Robot Framework ships with many keyword libraries and extension plugins to cover a wide range of testing needs. It also supports many kinds of interfaces—web automation, mobile application testing, database testing, and more—so it adapts to different testing scenarios.

This article describes Robot Framework, its characteristics, and its future prospects in detail, to help readers understand and use the framework and improve their testing efficiency. It also covers how to use Robot Framework, with some practical cases to help readers master and apply it.

1.2 Article Structure

The article-structure section introduces the framework and organization of the whole article. This article has three main parts: introduction, body, and conclusion.

The introduction (Chapter 1) presents the article's background and purpose. Its first part (1.1 Overview) briefly introduces Robot Framework, including its basic concepts and uses; the second part (1.2 Article Structure) explains the overall structure and the content of each chapter; the third part (1.3 Purpose) states the article's goals and intended outcome.

The body (Chapter 2) is the core of the article and is divided into two sections: an introduction to Robot Framework and a discussion of its characteristics.
OpenSearch Best Practices (opensearch-best-practice-cn-zh-2020-11-05)

1.2 Best practices, features: relevance in practice — tokenization, matching, relevance, and ranking expressions

This section addresses, in one place, the cases several users have reported where search results did not match expectations, and uses them to walk through OpenSearch's relevance-related features and some directions for future work.

First, there are two common approaches to search:

- A database LIKE query, which can be understood as a simple containment test.
- A search engine in the style of Baidu or Google, which involves tokenization: the query is segmented by meaning into terms (one of the hard parts of a search engine), documents are scored by how the terms match, and results are returned to the user sorted by score.

OpenSearch works essentially the same way as the latter. Three things therefore influence search quality: (1) how text is tokenized, (2) how terms are matched, and (3) how relevance is scored. Each is described below, together with the behavior of the relevant field types and the scenarios they suit.

Tokenization

Familiarity with the analyzers is a prerequisite for this topic; be sure to read the built-in analyzers documentation first.

Matching

How it works: after tokenization the query yields a set of terms, and the matching mode determines which documents are recalled. By default OpenSearch uses AND semantics: a document is returned only if it contains all of the query's terms. That applies within a single query term; beyond it, the system supports several operators—AND, OR, RANK, ANDNOT, and parentheses—with precedence, from highest to lowest: (), ANDNOT, AND, OR, RANK. For example, with AND a document must contain every term produced by segmentation, while with OR it is enough for it to contain any one of them.

Worked examples:

Q: My document contains "吃饭了". Searching "吃饭" or "吃饭了" finds it, but searching "吃饭了吗" returns nothing. Why?
A: OpenSearch currently requires every token of the query to match before a document is recalled. The token "吗" does not occur in the document, so the document cannot be recalled. This can be addressed with query analysis (query rewriting).

Q: I only want documents in which certain terms appear at the very front, for example documents beginning with "肯德基".
A: Position-based recall is not currently supported.

Relevance scoring

Everything above concerns recall. How recalled documents are ordered is a matter of relevance; OpenSearch provides the sort clause for user-defined ranking.
OpenText ZENworks Patch Management
Data Sheet

ZENworks Patch Management

Organizations today experience a seemingly endless stream of software security threats accompanied by the patches required to fix them. Keeping up can strain even the best-staffed IT department. Simply maintaining patched endpoints can feel overwhelming; quickly responding to critical security threats can feel impossible. ZENworks Patch Management helps you easily update your endpoint software on a regular schedule as well as quickly identify and remediate emerging security threats.

Product Highlights

OpenText ZENworks Patch Management helps you proactively manage threats by automating the collection, analysis, and delivery of patches to a diverse range of endpoints. Policy-based patching lets you maintain patch currency on devices through regularly scheduled updates. Security-focused patching lets you identify software security vulnerabilities that impact devices and apply, in one action, all patches required to remediate the devices. ZENworks Patch Management is available either as a standalone product or as a discounted subscription option in the OpenText ZENworks Suite.

Key Benefits

ZENworks Patch Management is ready to help you:

- Dramatically lower patch management costs and effort by enabling accurate, automated processes for patch assessment, monitoring, and remediation across your whole organization.
- Expand the reach of your patch management efforts with a cross-platform approach that provides pre-tested patches for more than 40 different Windows, SUSE, Red Hat, and Macintosh operating systems.
- Boost compliance with tools that allow you to monitor patch compliance on every device in your organization, quickly identify and assess vulnerabilities, and automatically apply updates and patches to bring devices up to predefined policies and standards.
- Identify critical software security vulnerabilities (WannaCry, NotPetya, SamSam, BlueKeep, and more) that impact your devices in order to respond quickly to emerging threats.
- Manage endpoint lifecycle and security issues through a single pane of glass with configuration, patch, asset, and endpoint security management—all integrated in one console.

Key Features

Monitoring and Reporting

With a powerful monitoring and reporting engine, ZENworks Patch Management provides deep insights into the patch status and overall security posture of your network. This includes:

- Agent-based monitoring that detects security vulnerabilities on individual endpoints, continually assesses security risks, and provides automatic notification of patch compliance issues and concerns.
- Dynamic, dashboard-style graphical reports that quickly provide a complete, high-level view of patch compliance across your organization—and make it easy to drill down to detailed patch data for individual endpoints.

Collection, Analysis and Pre-Testing

ZENworks Patch Management eliminates the extensive time and manual effort required to collect, analyze, and test the overwhelming number of patches available for different types of endpoint systems. This includes:

- Vulnerability announcements that inform organizations when a new patch is ready for deployment.
- The world's largest dynamic repository of patches, which provides more than 50,000 pre-tested patches for more than 100 major current and legacy applications and operating systems (including Linux and Mac).
- Reliable and thoroughly pre-tested patch packages that dramatically reduce the time and labor required to check, verify, and deploy patches.

Automated Deployment

In addition to providing an extensive library of pre-tested patch packages, ZENworks Patch Management includes features that streamline and automate every aspect of the patch deployment and verification process. This includes:

- Fast, automatic patch deployment based on predefined policies, as well as the ability to customize tested patches as needed.
- A wizard-based interface that simplifies the process of getting the right patches to the right endpoints quickly and efficiently, plus download percentage reports that provide real-time status updates.
- Support for phased rollouts to ensure smooth, error-free patch deployments to large numbers of systems.
- Rapid verification of patch deployments that catches deployment issues before they can become security problems.
- A virtual appliance deployment option for patch, configuration, asset, and endpoint security management, which simplifies installation and reduces support costs.

Policy-Based Compliance

ZENworks Patch Management allows you to create mandatory baselines for patch compliance based on predefined policies, continually monitor endpoint systems for compliance, automatically remediate systems that don't meet minimum standards, and clearly document improvements in patch compliance. This includes:

- Patented digital fingerprinting technology that establishes a security profile for each managed node on your network and facilitates ongoing compliance.
- 21 standard reports that document changes and demonstrate progress toward internal and external audit and patch compliance requirements.
- Automatic application of required updates and patches to new systems and installations to bring them into compliance with predefined patch policies and standards.

Emerging Threat Detection and Remediation

ZENworks Patch Management identifies the software security threats that impact your endpoints to help you prioritize your remediation efforts and track the results. This includes:

- Identification of vulnerabilities based on industry-standard Common Vulnerabilities and Exposures (CVEs) imported from the NIST National Vulnerability Database.
- Customizable dashboards that show the CVEs impacting your devices, organized by severity, release date, or vulnerable device count.
- One-click remediation of impacted devices to automatically apply all patches required to remediate the CVE without needing to find or select the patches.
- Tracking of remediation progress for CVEs that shows current vulnerability status as well as increasing or decreasing trend over time.
OpendTect Operation Manual
OpendTect 3.0.3 Training Manual
(dGB Earth Sciences, the Netherlands)
Beijing DiHang Times Technology Co., Ltd.
December 2007

Contents

Chapter 1. Preface
  1.1 Notes on the exercises
  1.2 Acknowledgements
Chapter 2. Introduction
  2.1 The F3 demo volume
  2.2 Quick-starting a project
  2.3 Preview: basic data display; 2.3.2 using the default attribute set
  2.4 Viewing and analyzing attributes: introduction; viewing attributes; attributes & velocity; interactive attribute analysis; attribute selection
  2.5 Horizon tracking: Exercise 2.5a, horizon tracking
Chapter 3. Neural networks
  3.1 Introduction
  3.2 Waveform classification (workflow)
  3.3 Creating a chimney cube: defining the attribute set; picking sample locations
  3.4 Porosity inversion (workflow)
Chapter 4. Dip-steered filtering
  Exercise 4.1 Building a Median Dip Filter and an Edge Preserving Smoothing Filter
Chapter 5. Ridge Enhancement Filtering
  Exercise 5.1 Neural-network fault detection
  Exercise 5.2 Ridge enhancement filtering I
  Exercise 5.3 Ridge enhancement filtering II: velocity optimization
  Exercise 5.4 Ridge enhancement filtering III: other volumes and attributes
Chapter 6. Sequence Stratigraphic Interpretation System (SSIS)
  6.1 Introduction: OpendTect SSIS; basic concepts; workflow; principles of sequence stratigraphy (Catuneanu 2002)
  6.2 Sequence stratigraphic interpretation in OpendTect: computing a steering volume; a first interpretation using annotations; pinch-out/onlap patterns; horizon tracking
  6.3 Chronostratigraphy: introduction; how to compute it; steering algorithms, settings, median filter, and horizons; displaying chronostratigraphy
  6.4 The Wheeler transform: Add Wheeler Scene; Create Wheeler Cube
  6.5 Systems tract interpretation and stratigraphic surfaces: systems tracts; stratigraphic surfaces and their time attributes; examples; discussion
  References
Chapter 7. Building a new survey
  Exercise 7.1 Creating the survey
  Exercise 7.2 Importing seismic data
  Exercise 7.3 Creating a SteeringCube
  Exercise 7.4 Importing horizons
  Exercise 7.5 Importing well data

1.1 Notes on the exercises

This DVD contains the demo data volumes needed for the neural network exercises.
openapi-generator YAML Rules: Q&A

OpenAPI Generator YAML rules are the configuration files used when generating code with the OpenAPI Generator tool. OpenAPI Generator is an open-source code generator that produces client code, server code, documentation, and code comments in many programming languages from an OpenAPI specification and templates. This article answers, step by step, the common questions about OpenAPI Generator YAML rules and shows how to use this configuration file.

Step 1: What are OpenAPI Generator YAML rules?

An OpenAPI Generator YAML rules file is a YAML-format configuration file that describes the rules and options needed when OpenAPI Generator generates code. By editing it, developers can customize generation behavior, including which kind of code to generate, the target output path for the generated code, and template and library options.

Step 2: How do you create an OpenAPI Generator YAML rules file?

First install the OpenAPI Generator tool; a build suited to your operating system can be downloaded from the OpenAPI Generator website. Once installed, OpenAPI Generator can be opened from the command line or through a graphical tool.

Step 3: How do you edit the file?

After opening OpenAPI Generator you see a web interface with a "configuration file" tab. Clicking it shows a text editor for the YAML rules file, in which rules and options are written using YAML syntax.

Step 4: Which rules and options can be configured?

OpenAPI Generator supports many rules and options for highly customizing code generation. Among the common configuration options:

1. `inputSpec`: the path of the OpenAPI specification file.
Apache OpenNLP Examples

Apache OpenNLP is a natural language processing (NLP) library intended to help developers build and deploy applications that process text. It provides a set of tools and models for common NLP tasks such as named entity recognition, part-of-speech tagging, syntactic analysis, and text classification. Here are some examples of what Apache OpenNLP is used for (a code sketch follows the list):

1. Named entity recognition: OpenNLP can recognize named entities in text, such as person, place, and organization names. Using trained OpenNLP models, the entities in a text can be marked up, which is very useful for information extraction and text analysis.
2. Part-of-speech tagging: OpenNLP can tag each word in a text with its part of speech—noun, verb, adjective, and so on—which helps with semantic and syntactic analysis.
3. Syntactic analysis: OpenNLP can parse sentences to analyze the relations and structure among their words. Parsing identifies important constituents such as subject, predicate, and object, which helps in understanding a text's meaning and grammatical structure.
4. Text classification: OpenNLP can classify texts into categories or labels, which helps with automated text classification and information filtering, for example sorting news articles into sports, politics, entertainment, and so on.
5. Text generation: beyond analysis, OpenNLP components can serve in systems that produce text, for example pipelines that assemble articles, summaries, or comments from given input.
6. Language detection: OpenNLP can detect which language a text is written in, which is useful for multilingual processing and internationalized applications.
7. Information extraction: OpenNLP can extract information about particular entities or relations from text; with its models and rules, key information can be identified and passed on for further analysis and processing.
8. Question answering: OpenNLP components can be used when building question-answering systems, parsing a user's question into a semantic representation and helping retrieve relevant answers from a knowledge base or from text.
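To make the first of these concrete, here is a minimal named entity recognition sketch. It assumes the pre-trained en-ner-person.bin model from the OpenNLP model download page is present locally; the sample tokens are arbitrary:

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Arrays;
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.Span;

public class NameFinderDemo {
    public static void main(String[] args) throws Exception {
        try (InputStream modelIn = new FileInputStream("en-ner-person.bin")) {
            TokenNameFinderModel model = new TokenNameFinderModel(modelIn);
            NameFinderME nameFinder = new NameFinderME(model);

            // The name finder expects pre-tokenized input
            String[] tokens = {"Pierre", "Vinken", "is", "61", "years", "old", "."};
            Span[] names = nameFinder.find(tokens);

            for (Span span : names) {
                // Each span gives the token range and entity type of one name
                String name = String.join(" ",
                        Arrays.copyOfRange(tokens, span.getStart(), span.getEnd()));
                System.out.println(span.getType() + ": " + name);
            }
        }
    }
}
```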
NLP Project Development and Deployment Document

1. Code structure

The NLP project consists of two sub-projects, one named azure and one named nlp: azure is a web project, and nlp is azure's back-end processing project.

1.1 azure code structure
a) The azure package contains the modifications made to the interfaces used from the FNLP project.
b) The filter package implements access control for the web service.
c) The listener package handles initialization, including loading the model files.
d) The org package contains packages referenced from FNLP.
e) The record package holds the objects that record users and task records.
f) The user package contains the servlets for login and password reset.
g) The util package contains shared utility classes.
h) The webrole package contains the servlets behind the site's features.
i) The webservice package implements the web service.
j) Other JSP, HTML, JS, and CSS resources live under src/main/webapp.

1.2 nlp code structure
The worker package implements the worker role's large-scale text analysis; the other packages are as above.

2. Runtime environment and tools

The NLP project is written in Java, uses Maven for dependency management, and runs on Tomcat. The tool versions used are: JDK (64-bit), Tomcat (64-bit), Maven.

1. Install the Java JDK
a) Download and install Java JDK 7.
b) Install to the default location.
c) Set environment variables: right-click Computer > Advanced system settings > Environment Variables > System variables. Create a variable JAVA_HOME whose value is the JDK install location; edit the Path variable and append %JAVA_HOME%\bin, separated with a semicolon. Open a command prompt and run java -version to verify the installation.

2. Configure Tomcat
a) Download Tomcat 7 (a .zip distribution).
b) Edit the startup script under \bin\ and add, around line 185: set JAVA_OPTS=-Xms1g -Xmx2g

3. Install Maven
a) Download Maven (a .zip distribution) and unpack it, for example into a folder created under C:\Program Files, giving an install path of C:\Program Files\Maven\.
Set environment variables (user variables): create M2_HOME with the Maven install path C:\Program Files\Maven\; create M2 with the value %M2_HOME%\bin; edit or create the Path variable and append %M2%, separated with a semicolon. Open a command prompt and run mvn --version to verify that Maven is installed correctly.

4. Install Eclipse and the Maven plugin
Download Eclipse EE, unpack it, open it, and choose a workspace. Import the azure and nlp projects: File > Import > Maven > Existing Maven Projects > Next > Browse, select the azure and nlp root directories, and import the two projects. Under Windows > Preferences > Java > Installed JREs, remove the default JRE, then Add > Standard VM and set the JRE Home accordingly.

5. Install the Windows Azure SDK and the Azure Eclipse plugin
Download and install the Azure SDK for .NET. In Eclipse, go to Help > Install New Software, enter the update-site address, press Enter, select Azure Toolkit for Java, and click Next.

6. The project is managed with Maven, but the azure project depends on third-party jars that are not in the Maven central repository, so there are two ways to handle them:
a) Run a private repository with Nexus and add the jars to it.
b) Install the jars into the local repository by hand.
A temporary Nexus address was provided; if it no longer works, use the second method.
OpenIM Organizational Structure: Overview and Explanation

1. Introduction

1.1 Overview

The overview briefly introduces the whole article to guide the reader's understanding of, and expectations for, what follows. This article examines the characteristics and importance of the OpenIM organization.

OpenIM is an organization that aims to advance the instant messaging field through openness, collaboration, and innovation. This overview first presents OpenIM's background and origins, then focuses on its role in promoting instant messaging technology and standardization. A closer look at the OpenIM organization makes it easier to appreciate its importance in driving change and innovation in instant messaging. The following chapters examine OpenIM's organizational structure and how it operates, and close with an outlook and conclusions.

1.2 Article Structure

The structure of an article is its overall organization and the arrangement of its chapters; a good structure helps readers follow the content and its logical relations. This article is divided into three parts: introduction, body, and conclusion.

The introduction presents the article's background and purpose. Through it, readers get an initial sense of the topic and focus, and it is also meant to spark their interest in reading on.

The body is the main content and covers several points. Each point can be a subsection developed in detail, with suitable transition paragraphs between points to help the reader see how the points connect.

The conclusion summarizes the body: it revisits the key points and main arguments and offers some closing observations. It may also look ahead, discussing possible research directions, improvements, or predictions for the field's future development.

With this structure, readers can see the article's organization and chapter layout clearly, which helps them follow the content and its logic; it also lets the author present views and arguments in an orderly way, making the article more complete and coherent.

1.3 Purpose

The purpose of this article is to describe and examine the structure and functions of the OpenIM organization. Through this chapter, readers will understand the role the OpenIM organization plays in developing and managing an open instant messaging system, and why it matters.
OpenText Project and Portfolio Management
Brochure

Deliver Consistent Business Outcomes

Project and Portfolio Management

Project and Portfolio Management (PPM) provides critical information in real time to help you make the right investment decisions. It standardizes, manages, and captures the execution of project and operational activities as well as resources.

Can You Meet Your Management Challenges?

Today's project management organization (PMO) struggles with time, cost, and resource management challenges, particularly visibility and data consolidation within the enterprise portfolio. Given these daily challenges, it is difficult for executives to see which projects and operational activities they should be working on, to find out how much is left in their budget, to what capacity resources are being utilized, and how to align activities with business demands.

OpenText Project and Portfolio Management

PPM software helps you overcome these project management challenges. It provides your PMO and executives with visibility into strategic and operational demand, as well as the ongoing projects across your organization. The OpenText Application Portfolio Management (APM) module feeds detailed application assessment data into this process, while project and program management capabilities provide real-time visibility into the project lifecycle at the portfolio, program, resource, financial, and project level. In the end, you get the flexibility and transparency needed for challenging economic conditions.

PPM offers top-down planning capabilities that are supported with detailed project plans, resulting in better business outcomes:

- Provides an open data model and tables for using any business intelligence tool for analysis and strategic reporting.
- Provides financial management capabilities for IT operations and strategic projects for rapidly adapting budgets and resources as business objectives change.
- Provides simplified PPM tools to assess all current applications, helping you determine which applications are of most value to you and helping eliminate application redundancy.
- Supports application lifecycle management by helping organizations combine detailed project plans with requirements management, quality, and performance testing efforts.
- Enhances visibility, maintains compliance, and reduces costs through the cloud for prioritized investments and consolidated project reporting across traditional and agile projects.
- Is available using standard systems or with mobile-based functionality.

PPM Components

Portfolio Management: This module enables you to govern your portfolio of projects, applications, and opportunities in real time with effective collaborative portfolio management. Complete lifecycle forecasting capabilities give you the information to make effective portfolio decisions, from proposal initiation, justification, and review to project initiation, execution, and deployment. Portfolio optimization capabilities help you determine an excellent mix of proposed projects, active projects, and maintained assets. Different scenarios can be determined automatically based on user-defined criteria for agile and traditional projects. PPM allows epics to be created from the portfolio, and it provides a flexible KPI model to ensure projects are healthy.

Application Portfolio Management: This component enables you to assess and prioritize the entire application portfolio for rationalization and modernization opportunities. These opportunities are based on both business goals and IT technology decisions that provide ongoing support through business events such as mergers and acquisitions, divestment, and IT sourcing strategy changes. OpenText APM is not just about helping optimize application roadmaps; it is also about synchronizing IT priorities with business priorities. As a result, APM should be viewed as an extension of the strategic planning of the IT organization, especially given that these applications automate core business operations.

Program Management: This module enables you to manage your programs collaboratively—from concept to completion—with auditable governance processes. Program Management also provides automated processes for managing scope, risk, quality, issues, and schedules. With Program Management, you no longer need multiple tools and paper manuals to manage program initiation and budget processes, approval, scope changes, risk, issue resolution, resources, or status. It allows consolidation of agile and traditional projects and demands.

Project Management: This module helps you meet the challenges of managing projects in large, geographically dispersed enterprise environments. It integrates project management and process controls to reduce the number of schedule overruns, thereby reducing project risks and costs. You can now integrate multiple traditional and agile projects in the same work breakdown structure, helping to consolidate resources, time, and cost from different teams.

Financial Management: This module provides a single, real-time view into all financial attributes related to the programs, projects, and the overall corporate project portfolio. Program and project managers gain the flexibility needed to adjust forecasts rapidly as business objectives change. Cash flow analysis capabilities increase the accuracy of IT investment decisions. Multiple languages are supported, which is ideal for global organizations. Financial Management offers SOP 98-1 support, which uses a built-in capitalization method to reduce capitalization errors, and uses out-of-the-box portlets to bring needed visibility and control.

Resource Management: This module provides comprehensive resource analysis, which includes both strategic and operational activities at any stage in the work lifecycle. This holistic approach enables a complete understanding of where internal or contracted resources are committed. In turn, your managers can quickly respond to changes with a clear understanding of the effects on resource capacity and work prioritization.

Time Management: This module helps you focus on value-added activities by streamlining time collection and improving efficiency in resource allocation across the wide range of work performed by employees. It provides the capabilities your organization needs to understand how much time is spent on strategic investments versus operational activities, which helps improve resource allocation and load balancing along with overall productivity and execution. PPM allows merging or transferring timesheets from various tools, including agile tools, for one single source of truth for cost management.

Demand Management: This module captures all project and non-project requests so you know what the organization is asking for and have the information needed to prioritize valuable resources. Stakeholders have a comprehensive picture of past, present, and future demand, so requests can be prioritized, assigned, viewed, and spread across multiple dimensions to identify trends. PPM requests are also integrated with agile tools to create features and synchronize story points and status.

Project and Portfolio Management Dashboard: The PPM Dashboard provides role-based, exception-oriented visibility into business trends, status, and deliverables to help you execute decisions quickly. It supports information sharing with other applications or corporate portals according to the Java Portlet specification—Java Specification Request (JSR 168)—and the Web Services for Remote Portlets (WSRP) specification.

Project and Portfolio Management Foundation: This platform runs PPM. It includes an advanced workflow engine and configuration capabilities. Additionally, PPM Foundation incorporates enterprise-class data security features.

PPM Mobile Access: Bookmark the URL for mobile phone access, and the ability to make decisions on Demand Management requests and submit or approve time sheets is at your fingertips. This allows busy executives to stay connected at any time and in any place.

Figure 1. Project and Portfolio Management

Why OpenText?

Visibility into demand: Today's executives struggle with business alignment, time, cost, and resource management challenges. PPM allows you to step back and observe a macro view of your operations, while at the same time providing PPM tools and services to help you analyze the day-to-day health of your business unit.

Flexible business process automation: PPM is built on top of a powerful workflow process engine that can rapidly digitize and automate PPM processes. These capabilities enable us to provide the project management organization with the flexibility and control necessary to align services with business goals.

Reporting to support all stakeholders: Unlike other project management approaches that only offer time reporting systems and project scheduling tools, PPM offers top-down planning capabilities. Decision making for day-to-day users and key stakeholders is improved by ad hoc query capabilities, and aggregate information is reported from multiple data sources.

Application Lifecycle Management (agile, enterprise agile, and traditional projects): PPM supports the Application Lifecycle Management solution, first by helping with the requirements management and investment planning process, which allows you to apply resources to the most effective activities, and second by providing real-time visibility into the health, status, and value of any application within the portfolio. Automated application lifecycle process controls, including support for industry standards and methodologies, help to improve application quality while lowering costs.

Delivering rapid value: We can help you achieve a rapid return on your PPM investment through best-practices consulting and packaged deployment, upgrade, and education solutions delivered on-site or through the OpenText software-as-a-service (SaaS) solution for PPM. Both approaches offer a service delivery model with consolidated project reporting—for example, burn-down charts by user stories or sprints—that can help you achieve successful adoption and deliver measurable results for traditional and agile projects.

Choose the Delivery Option that Works for You

You can access the same complete toolset and full functionality of PPM either as an in-house solution or as a SaaS managed solution. If you are implementing PPM for the first time, you can begin using your PPM on SaaS solution in a matter of weeks, allowing your IT team to focus on business outcomes rather than on running software. If you are already running the in-house PPM and choose at any point to move from on-premises to SaaS, SaaS Solutions provide a cost-effective and painless process to assist with the move.

OpenText PPM on SaaS

SaaS has become a fundamental and proven approach to delivering IT and business application solutions that help organizations innovate digitally and deliver an outstanding customer and user experience. Whether it is a private or public cloud implementation, SaaS is proven to deliver rapid returns and optimize resources so that you and your teams can focus on the innovation that drives your business outcomes. PPM is also available on Amazon AWS and Docker.

Powerful Benefits of PPM on SaaS

A solution to meet your business needs. It offers:

- Three configurable PPM environments: development, test, and production.
- Full support for core PPM modules, providing a complete solution.
- Separately available support for preconfigured content such as portfolio optimization.

A service you can rely on, which delivers:

- Best practices to provide world-class business continuity.
- 24x7 access to OpenText SaaS customer support.
- A fully protected environment at the people, process, data, network, and physical levels.

Ongoing expertise to help guide your success, along with:

- Reviews of board sessions to discuss implementation approaches and provide best-practice guidance.
- An ITIL-certified SaaS CSM who drives adoption and provides continuity.
- A SaaS advanced consultancy services team that provides PPM best-practice application expertise.
- Verification of IT-initiated changes, reducing risk to the environment.

Service Offerings

PPM on SaaS: SaaS Solutions provides cost-effective service offerings to meet your PPM needs.

OpenText PPM On-Premises

If you decide on a traditional, in-house deployment, our OpenText Software Services team and partners are available to help you get the most from your investment. Our Software Services team provides a full set of consulting, education, and support offerings to enable success. The PPM reference model provides packaged processes based on ITIL, PMP, PRINCE, CMMI, Six Sigma, agile frameworks, SAFe, LeSS, and other best practices and methodologies, plus many years of experience in project and portfolio management. Our best practices from multiple implementations of Project and Portfolio Management are included in our packaged deployment offerings for quick implementation.

Blogs and discussion forums on specific OpenText solutions give you the chance to explore issues in depth. Read what our experts and your peers have to say, and contribute your own insights.

For More Information

To join the enterprise conversation in your business community, visit the PPM community site. Learn more at /ppm.
OpenText NetIQ Product Case Study
Case Study

OpenText: NetIQ supports a global digital transformation, transparently bridging business-critical solutions hosted on premises and in an AWS cloud environment.

At a Glance

Industry:
Challenge: Create a seamless end-user experience and streamline back-end services while moving business-critical solutions to an AWS cloud environment.
Products and Services: NetIQ Identity Manager, NetIQ Access Manager, NetIQ Identity Governance, NetIQ Advanced Authentication.
Success Highlights:
- Enriched functionality and seamless access across the hybrid environment.
- Reduced business complexity with a seamless end-user experience.
- Introduced Cloud Bridge for full bi-directional communication in the hybrid environment.
- Increased scalability, flexibility, and cost-predictability with the AWS deployment.

Who is OpenText?

OpenText is one of the world's largest enterprise software providers. It delivers mission-critical technology and supporting services that help thousands of customers worldwide manage core IT elements of their business so they can run and transform—at the same time. Cybersecurity is an OpenText line of business.

Digital Transformation Drives a Move to a SaaS Application Model

OpenText, like many of its customers, is a large organization grown significantly through acquisition. This strategy brought a plethora of tools used in different divisions. To standardize its corporate identity management, OpenText trusts its own suite of identity and access solutions, under the NetIQ banner. NetIQ Identity Manager by OpenText and Access Manager by OpenText were IT-managed in an on-premises environment and evolved more recently to include NetIQ Advanced Authentication by OpenText for multi-factor authentication as well as effective website protection.

The merger between Micro Focus and HPE Software tripled the size of the organization and introduced new challenges around data hygiene, audit compliance, and security in general. At the same time, there was a definite market move towards a preference for SaaS-based solutions, to relieve the burden and cost of maintaining an on-premises IT environment. Jon Bultmeyer, CTO, Cybersecurity, runs the engineering teams involved in building Cybersecurity SaaS offerings. He works closely with other OpenText teams on the customer delivery model as well as the internal delivery of SaaS versions. He explains: "We found that we were lagging a little in version-currency, just because of the workload involved in an upgrade. To secure, run, and operate a large-scale identity management operation for over 12,000 staff is labor-intensive and time-consuming. This seemed a good opportunity to embrace the digital transformation at the heart of Micro Focus (now part of OpenText) and move our identity and access architecture to an AWS-hosted cloud environment."

"Cloud Bridge really streamlines the transition to SaaS and gives us the observability we need to ensure effective data flows between different systems." — Jon Bultmeyer, CTO, CyberRes, OpenText

Introduce New Functionality and Comprehensive Access Reviews in a Hybrid Environment

OpenText took a wider view and introduced the SaaS Center of Excellence (CoE) organization, headed up by David Gahan, Senior Director, Cybersecurity SaaS. Rather than just make a "like for like" move, the team chose to enhance the platform with NetIQ Identity Governance by OpenText, as well as expanding the NetIQ Advanced Authentication by OpenText capability into a SaaS model. Pivoting from a "governance first" principle with a focus on application access reviews, the project aimed to move via automated application access and approval to fully automated application access request and enablement.

The full solution would provide seamless connectivity to the company's key applications: Salesforce to manage customer interactions and order processing; Workday as an integrated HR solution; and NetSuite, which manages business finances and operational support, as well as other business-critical applications. It would also provide the capability to conduct certification reviews. This automated process builds a comprehensive directory of who has access to what. Periodically, all process and solution owners are asked to review their access list for accuracy. Job roles determine the level of access to specific solutions required for individuals. This "least privilege" principle ensures that only colleagues with the right access level can configure the finance platform, for instance, or reach confidential personnel data in Workday.

The project was part of the corporate digital transformation and as such had an executive spotlight on it, coupled with a tight delivery deadline of no more than 12 months.

Cloud Bridge: Managing Fully Integrated Identity Governance in a Hybrid Environment

OpenText's own Professional Services skills and their specific expertise in building these systems for Cybersecurity customers were invaluable. The SaaS CoE team worked on creating the SaaS infrastructure, and Bultmeyer's engineering teams were building the SaaS applications. Meanwhile, Professional Services implemented NetIQ Identity Governance on premises to kickstart the application integration, which relied on many interconnected parts. Because day-to-day business running takes ultimate priority, this was a "run and transform" scenario with a hybrid approach: key business systems moved in phases to the SaaS environment while others remained on premises for now. It is a challenge to integrate identity governance between on-premises and SaaS-based systems, and Cybersecurity wanted fully automated, event-driven integration—they recognized that the manual processes of either CSV file transfers or site-to-site VPN connections offered by some market alternatives can cause firewall complexities.

As this, again, is not a challenge unique to OpenText, Bultmeyer's team turned its attention to creating the OpenText Cloud Bridge, as he explains: "Cloud Bridge is a singular communication bridge for all our Cybersecurity SaaS solutions. It allows secure bi-directional communication between on-premises and SaaS systems via a Docker container. There are no special rules when configuring the Cloud Bridge agent, so communication between on-premises and cloud-based systems can be up and running within just an hour. There is just a single location to monitor, so any issues are resolved quickly. Cloud Bridge really streamlines the transition to SaaS and gives us the observability we need to ensure effective data flows between different systems."

Reduced Business Complexity While Navigating COVID-19 Working Practices

Once the CoE SaaS infrastructure was operational, the Professional Services team transitioned the on-premises NetIQ Identity Governance implementation to the AWS environment. The identity governance environment now includes end-to-end integrated workflows between key systems, integrated password management, single sign-on, full visibility through Cloud Bridge, and advanced analytics leveraging OpenText Vertica capabilities. Gahan says: "Leveraging our own NetIQ solutions in a SaaS environment has allowed us to create a seamless end-user experience where we were once living in a world made up of different islands of access. The solutions our employees use to service our customers' needs and our own internal needs have been standardized, drastically reducing business complexity across the board. It's given us terrific back-end benefits as well by helping simplify and standardize the concepts of identity and access across all of our business units."

"The project timelines coincided with the COVID-19 pandemic, which presented us with the same challenges our customers experienced around the world," adds Bultmeyer. "Suddenly we could no longer gather around a whiteboard to brainstorm, and we had to quickly adjust to working remotely. Thankfully, this didn't deter our determination, and many teams—including our Micro Focus (now part of OpenText) IT team, the dedicated project implementation team, our product management teams, backline engineering teams, the newly formed CoE team, and our Customer Success teams—worked seamlessly together to adjust the implementation and manage any problems we encountered along the way."

Enriched Functionality and Cost Predictability in a Flexible AWS Deployment

Gahan spearheads the SaaS CoE, a new global organization dedicated to supporting SaaS customers. Leveraging expertise in defining governance policies, designing the solution, and configuring it in a SaaS environment, the team created a truly hybrid identity governance platform where the end user does not know, nor need to care, whether the data they access resides on premises or in the cloud. "And this is just how it should be," Gahan says. "Our end users now benefit from much richer functionality such as seamless multi-factor authentication and sophisticated access review processes, drastically reducing manual processes."

Bultmeyer concludes: "NetIQ solutions have simplified our identity governance and shortened our communication lines. We were excited to leverage our strategic partnership with AWS, giving us a scalable and cost-predictable model as we grow, and allowing us to roll out additional functionality much faster than we otherwise could have done."

OpenText Cybersecurity provides comprehensive security solutions for companies and partners of all sizes. From prevention, detection, and response to recovery, investigation, and compliance, our unified end-to-end platform helps customers build cyber resilience via a holistic security portfolio. Powered by actionable insights from our real-time and contextual threat intelligence, OpenText Cybersecurity customers benefit from high-efficacy products, a compliant experience, and simplified security to help manage business risk.
OpenTelemetry: An Introduction
OpenTelemetry is an open-source observability framework that aims to standardize the monitoring data produced by applications. A project of the Cloud Native Computing Foundation (CNCF), it provides a set of tools, APIs, and SDKs for generating, collecting, and exporting telemetry data—metrics, logs, and traces. With this data, developers can better analyze and optimize an application's performance and behavior.
Concretely, OpenTelemetry's capabilities include:

1. Performance monitoring: OpenTelemetry can collect information about application performance, such as CPU usage, memory usage, network usage, and request latency.
2. Tracking user behavior: OpenTelemetry helps developers trace user actions, giving a better picture of how users interact with the application.
3. Standardization: OpenTelemetry is committed to providing a standardized way to share and analyze monitoring data across platforms and services. It defines a set of APIs and SDKs that ensure interoperability across languages and platforms.
4. Vendor neutrality: as a pluggable service, OpenTelemetry makes it easy to add common protocols and formats, leaving the choice of backend service free and breaking the old pattern of mutually incompatible vendors.
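To ground points 3 and 4, here is a minimal manual-tracing sketch against the OpenTelemetry Java API. The tracer name, span name, and attribute are arbitrary examples, and a configured SDK or agent must be present for the data to actually be exported anywhere:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class TracingDemo {
    public static void main(String[] args) {
        // The API is vendor-neutral; the exporter wired into the SDK decides
        // where spans go (Jaeger, Zipkin, an OTLP collector, ...).
        Tracer tracer = GlobalOpenTelemetry.getTracer("tracing-demo");

        Span span = tracer.spanBuilder("handle-request").startSpan();
        try (Scope scope = span.makeCurrent()) {
            // Key-value attributes describe the request being handled
            span.setAttribute("http.method", "GET");
            // ... do the actual work here ...
        } catch (RuntimeException e) {
            // Failures are recorded on the span as error events
            span.recordException(e);
            throw e;
        } finally {
            span.end(); // closes the span and records its end timestamp
        }
    }
}
```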
Overall, OpenTelemetry gives developers a unified, standardized way to monitor and analyze applications, with solid support both during development and after deployment, which matters greatly for improving application quality and user experience.
Apache OpenNLP Developer Documentation

Written and maintained by the Apache OpenNLP Development Community

Version 1.5.2-incubating

Copyright © , The Apache Software Foundation

License and Disclaimer. The ASF licenses this documentation to you under the Apache License, Version 2.0 (the "License"); you may not use this documentation except in compliance with the License. You may obtain a copy of the License at

    /licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, this documentation and its contents are distributed under the License on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Table of Contents

1. Introduction
2. Sentence Detector
   Sentence Detection (Sentence Detection Tool, Sentence Detection API)
   Sentence Detector Training (Training Tool, Training API)
   Evaluation (Evaluation Tool)
3. Tokenizer
   Tokenization (Tokenizer Tools, Tokenizer API)
   Tokenizer Training (Training Tool, Training API)
   Detokenizing (Detokenizing API, Detokenizer Dictionary)
4. Name Finder
   Named Entity Recognition (Name Finder Tool, Name Finder API)
   Name Finder Training (Training Tool, Training API, Custom Feature Generation)
   Evaluation (Evaluation Tool, Evaluation API)
   Named Entity Annotation Guidelines
5. Document Categorizer
   Classifying (Document Categorizer Tool, Document Categorizer API)
   Training (Training Tool, Training API)
6. Part-of-Speech Tagger
   Tagging (POS Tagger Tool, POS Tagger API)
   Training (Training Tool, Training API, Tag Dictionary)
   Evaluation (Evaluation Tool)
7. Chunker
   Chunking (Chunker Tool, Chunking API)
   Chunker Training (Training Tool, Training API)
   Chunker Evaluation (Chunker Evaluation Tool)
8. Parser
   Parsing (Parser Tool, Parsing API)
   Parser Training (Training Tool, Training API)
9. Coreference Resolution
10. Corpora
   CONLL (CONLL 2000, CONLL 2002, CONLL 2003)
   Arvores Deitadas (Getting the data, Converting the data, Evaluation)
   Leipzig Corpora
11. Machine Learning
   Maximum Entropy (Implementation)
12. UIMA Integration
   Running the pear sample in CVD
   Further Help

List of Tables

4.1. Generator elements

Chapter 1. Introduction

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.

The goal of the OpenNLP project is to create a mature toolkit for the abovementioned tasks. An additional goal is to provide a large number of pre-built models for a variety of languages, as well as the annotated text resources that those models are derived from.

Chapter 2. Sentence Detector

Sentence Detection

The OpenNLP Sentence Detector can detect whether a punctuation character marks the end of a sentence or not. In this sense a sentence is defined as the longest whitespace-trimmed character sequence between two punctuation marks. The first and last sentence are exceptions to this rule: the first non-whitespace character is assumed to be the beginning of a sentence, and the last non-whitespace character is assumed to be a sentence end.
The sample text below should be segmented into its sentences.

Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group. Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC, was named a director of this British industrial conglomerate.

After detecting the sentence boundaries each sentence is written in its own line.

Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29.
Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.
Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC, was named a director of this British industrial conglomerate.

Usually Sentence Detection is done before the text is tokenized, and that's the way the pre-trained models on the website are trained, but it is also possible to perform tokenization first and let the Sentence Detector process the already tokenized text. The OpenNLP Sentence Detector cannot identify sentence boundaries based on the contents of the sentence. A prominent example is the first sentence in an article, where the title is mistakenly identified to be the first part of the first sentence. Most components in OpenNLP expect input which is segmented into sentences.

Sentence Detection Tool
The easiest way to try out the Sentence Detector is the command line tool. The tool is only intended for demonstration and testing. Download the English sentence detector model and start the Sentence Detector Tool with this command:

$ bin/opennlp SentenceDetector en-sent.bin

Just copy the sample text from above to the console. The Sentence Detector will read it and echo one sentence per line to the console. Usually the input is read from a file and the output is redirected to another file. This can be achieved with the following command.

$ bin/opennlp SentenceDetector en-sent.bin < input.txt > output.txt

For the English sentence model from the website the input text should not be tokenized.

Sentence Detection API
The Sentence Detector can be easily integrated into an application via its API. To instantiate the Sentence Detector the sentence model must be loaded first.

SentenceModel model = null;

InputStream modelIn = new FileInputStream("en-sent.bin");

try {
  model = new SentenceModel(modelIn);
}
catch (IOException e) {
  e.printStackTrace();
}
finally {
  if (modelIn != null) {
    try {
      modelIn.close();
    }
    catch (IOException e) {
    }
  }
}

After the model is loaded the SentenceDetectorME can be instantiated.

SentenceDetectorME sentenceDetector = new SentenceDetectorME(model);

The Sentence Detector can output an array of Strings, where each String is one sentence.

String sentences[] = sentenceDetector.sentDetect("  First sentence. Second sentence. ");

The result array now contains two entries. The first String is "First sentence." and the second String is "Second sentence.". The whitespace before, between and after the input String is removed. The API also offers a method which simply returns the span of each sentence in the input string.

Span sentences[] = sentenceDetector.sentPosDetect("  First sentence. Second sentence. ");

The result array again contains two entries. The first span begins at index 2 and ends at 17. The second span begins at 18 and ends at 34. The utility method Span.getCoveredText can be used to create a substring which only covers the chars in the span.
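For illustration, the two calls can be combined; the following lines are a small sketch (the input string and the loop are just an example, not part of the manual) that prints each detected sentence via its span:

String input = "  First sentence. Second sentence. ";
Span sentenceSpans[] = sentenceDetector.sentPosDetect(input);

// Span.getCoveredText returns the characters between the span's
// begin (inclusive) and end (exclusive) offsets of the input
for (Span span : sentenceSpans) {
  System.out.println(span.getCoveredText(input));
}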
Sentence Detector Training

Training Tool
OpenNLP has a command line tool which is used to train the models available from the model download page on various corpora. The data must be converted to the OpenNLP Sentence Detector training format, which is one sentence per line. An empty line indicates a document boundary. In case the document boundary is unknown, it is recommended to have an empty line every few tens of sentences. Exactly like the output in the sample above. Usage of the tool:

$ bin/opennlp SentenceDetectorTrainer
Usage: opennlp SentenceDetectorTrainer [-abbDict path] [-params paramsFile] -lang language \
    [-cutoff num] [-iterations num] [-encoding charsetName] -data trainData -model modelFile

Arguments description:
    -abbDict path
        The abbreviation dictionary in XML format.
    -params paramsFile
        Training parameters file.
    -lang language
        specifies the language which is being processed.
    -cutoff num
        specifies the min number of times a feature must be seen. It is ignored if a parameters file is passed.
    -iterations num
        specifies the number of training iterations. It is ignored if a parameters file is passed.
    -encoding charsetName
        specifies the encoding which should be used for reading and writing text. If not specified the system default will be used.
    -data trainData
        the data to be used during training
    -model modelFile
        the output model file

To train an English sentence detector use the following command:

$ bin/opennlp SentenceDetectorTrainer -encoding UTF-8 -lang en -data en-sent.train -model en-sent.bin

Indexing events using cutoff of 5
    Computing event counts...  done. 4883 events
    Indexing...  done.
Sorting and merging events... done. Reduced 4883 events to 2945.
Done indexing.
Incorporating indexed data for training...  done.
    Number of Event Tokens: 2945
        Number of Outcomes: 2
      Number of Predicates: 467
...done.
Computing model parameters...
Performing 100 iterations.
  1:  .. loglikelihood=-3384.6376826743144    0.38951464263772273
  2:  .. loglikelihood=-2191.9266688597672    0.9397911120212984
  3:  .. loglikelihood=-1645.8640771555981    0.9643661683391358
  4:  .. loglikelihood=-1340.386303774519     0.9739913987302887
  5:  .. loglikelihood=-1148.4141548519624    0.9748105672742167
 ...<skipping a bunch of iterations>...
 95:  .. loglikelihood=-288.25556805874436    0.9834118369854598
 96:  .. loglikelihood=-287.2283680343481     0.9834118369854598
 97:  .. loglikelihood=-286.2174830344526     0.9834118369854598
 98:  .. loglikelihood=-285.222486981048      0.9834118369854598
 99:  .. loglikelihood=-284.24296917223916    0.9834118369854598

Tokenizer API
The following code sample shows how a model can be loaded.

TokenizerModel model = null;

InputStream modelIn = new FileInputStream("en-token.bin");

try {
  model = new TokenizerModel(modelIn);
}
catch (IOException e) {
  e.printStackTrace();
}
finally {
  if (modelIn != null) {
    try {
      modelIn.close();
    }
    catch (IOException e) {
    }
  }
}

After the model is loaded the TokenizerME can be instantiated.

Tokenizer tokenizer = new TokenizerME(model);

The tokenizer offers two tokenize methods; both expect an input String object which contains the untokenized text. If possible it should be a sentence, but depending on the training of the learnable tokenizer this is not required. The first returns an array of Strings, where each String is one token.

String tokens[] = tokenizer.tokenize("An input sample sentence.");

The output will be an array with these tokens.

"An", "input", "sample", "sentence", "."

The second method, tokenizePos, returns an array of Spans; each Span contains the begin and end character offsets of the token in the input String.

Span tokenSpans[] = tokenizer.tokenizePos("An input sample sentence.");

The tokenSpans array now contains 5 elements. To get the text for one span call Span.getCoveredText, which takes a span and the input text. The TokenizerME is able to output the probabilities for the detected tokens. The getTokenProbabilities method must be called directly after one of the tokenize methods was called.

TokenizerME tokenizer = ...

String tokens[] = tokenizer.tokenize(...);

double tokenProbs[] = tokenizer.getTokenProbabilities();

The tokenProbs array now contains one double value per token; the value is between 0 and 1, where 1 is the highest possible probability and 0 the lowest possible probability.
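As a small sketch (the variable names are illustrative, not from the manual), the token spans and the probabilities can be printed side by side, since both arrays are aligned by index:

String input = "An input sample sentence.";
Span tokenSpans[] = tokenizer.tokenizePos(input);
// must be called directly after the tokenize call, as noted above
double tokenProbs[] = tokenizer.getTokenProbabilities();

// print every token together with the probability the model assigned to it
for (int i = 0; i < tokenSpans.length; i++) {
  System.out.println(tokenSpans[i].getCoveredText(input) + " " + tokenProbs[i]);
}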
Tokenizer Training

Training Tool
OpenNLP has a command line tool which is used to train the models available from the model download page on various corpora. The data must be converted to the OpenNLP Tokenizer training format, which is one sentence per line. Tokens are either separated by a whitespace or by a special <SPLIT> tag. The following sample shows the sample from above in the correct format.

Pierre Vinken<SPLIT>, 61 years old<SPLIT>, will join the board as a nonexecutive director Nov. 29<SPLIT>.
Mr. Vinken is chairman of Elsevier N.V.<SPLIT>, the Dutch publishing group<SPLIT>.
Rudolph Agnew<SPLIT>, 55 years old and former chairman of Consolidated Gold Fields PLC<SPLIT>, was named a nonexecutive director of this British industrial conglomerate<SPLIT>.

Usage of the tool:

$ bin/opennlp TokenizerTrainer
Usage: opennlp TokenizerTrainer [-abbDict path] [-alphaNumOpt isAlphaNumOpt] [-params paramsFile] -lang language \
    [-cutoff num] [-iterations num] [-encoding charsetName] -data trainData -model modelFile

Arguments description:
    -abbDict path
        The abbreviation dictionary in XML format.
    -alphaNumOpt isAlphaNumOpt
        Optimization flag to skip alpha numeric tokens for further tokenization
    -params paramsFile
        Training parameters file.
    -lang language
        specifies the language which is being processed.
    -cutoff num
        specifies the min number of times a feature must be seen. It is ignored if a parameters file is passed.
    -iterations num
        specifies the number of training iterations. It is ignored if a parameters file is passed.
    -encoding charsetName
        specifies the encoding which should be used for reading and writing text. If not specified the system default will be used.
    -data trainData
        the data to be used during training
    -model modelFile
        the output model file

Detokenizing
Detokenizing is the reverse of tokenization: the original non-tokenized string is reconstructed from a token sequence. Given the tokens of the sentence He said "This is a test". the detokenizer assigns each token a merge operation:

He   -> NO_OPERATION
said -> NO_OPERATION
"    -> MERGE_TO_RIGHT
This -> NO_OPERATION
is   -> NO_OPERATION
a    -> NO_OPERATION
test -> NO_OPERATION
"    -> MERGE_TO_LEFT
.    -> MERGE_TO_LEFT

That will result in the following character sequence:

He said "This is a test".

TODO: Add documentation about the dictionary format and how to use the API. Contributions are welcome.

Detokenizing API
TODO: Write documentation about the detokenizer API. Any contributions are very welcome. If you want to contribute please contact us on the mailing list or comment on the jira issue OPENNLP-216.

Detokenizer Dictionary
TODO: Write documentation about the detokenizer dictionary. Any contributions are very welcome. If you want to contribute please contact us on the mailing list or comment on the jira issue OPENNLP-217.
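Until that documentation is written, the following sketch shows one plausible way to use the dictionary-based detokenizer. It assumes the DictionaryDetokenizer and DetokenizationDictionary classes from the opennlp.tools.tokenize package, and the dictionary file name is made up for the example:

// load a detokenizer dictionary in its XML format (file name is hypothetical)
InputStream dictIn = new FileInputStream("en-detokenizer.xml");
Detokenizer detokenizer = new DictionaryDetokenizer(new DetokenizationDictionary(dictIn));

String tokens[] = new String[]{"He", "said", "\"", "This", "is", "a", "test", "\"", "."};

// one merge operation per token, as in the table above
Detokenizer.DetokenizationOperation operations[] = detokenizer.detokenize(tokens);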
Chapter 4. Name Finder

Table of Contents
Named Entity Recognition
    Name Finder Tool
    Name Finder API
Name Finder Training
    Training Tool
    Training API
    Custom Feature Generation
Evaluation
    Evaluation Tool
    Evaluation API
Named Entity Annotation Guidelines

Named Entity Recognition
The Name Finder can detect named entities and numbers in text. To be able to detect entities the Name Finder needs a model. The model is dependent on the language and entity type it was trained for. The OpenNLP project offers a number of pre-trained name finder models which are trained on various freely available corpora. They can be downloaded at our model download page. To find names in raw text the text must be segmented into tokens and sentences. A detailed description is given in the sentence detector and tokenizer tutorial. It's important that the tokenization for the training data and the input text is identical.

Name Finder Tool
The easiest way to try out the Name Finder is the command line tool. The tool is only intended for demonstration and testing. Download the English person model and start the Name Finder Tool with this command:

$ bin/opennlp TokenNameFinder en-ner-person.bin

The name finder now reads a tokenized sentence per line from stdin; an empty line indicates a document boundary and resets the adaptive feature generators. Just copy this text to the terminal:

Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .
Mr. Vinken is chairman of Elsevier N.V. , the Dutch publishing group .
Rudolph Agnew , 55 years old and former chairman of Consolidated Gold Fields PLC , was named a director of this British industrial conglomerate .

The name finder will now output the text with markup for person names:

<START:person> Pierre Vinken <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .
Mr. <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch publishing group .
<START:person> Rudolph Agnew <END> , 55 years old and former chairman of Consolidated Gold Fields PLC , was named a director of this British industrial conglomerate .

Custom Feature Generation
The following lines show how to construct a custom feature generator

AdaptiveFeatureGenerator featureGenerator = new CachedFeatureGenerator(
    new AdaptiveFeatureGenerator[]{
        new WindowFeatureGenerator(new TokenFeatureGenerator(), 2, 2),
        new WindowFeatureGenerator(new TokenClassFeatureGenerator(true), 2, 2),
        new OutcomePriorFeatureGenerator(),
        new PreviousMapFeatureGenerator(),
        new BigramNameFeatureGenerator(),
        new SentenceFeatureGenerator(true, false)
    });

which is similar to the default feature generator. The javadoc of the feature generator classes explains what the individual feature generators do. To write a custom feature generator please implement the AdaptiveFeatureGenerator interface, or extend the FeatureGeneratorAdapter if it does not need to be adaptive. The train method which should be used is defined as

public static TokenNameFinderModel train(String languageCode, String type,
    ObjectStream<NameSample> samples, AdaptiveFeatureGenerator generator,
    final Map<String, Object> resources, int iterations, int cutoff) throws IOException

and can take a feature generator as an argument. To detect names, the model which was returned from the train method and the feature generator must be passed to the NameFinderME constructor.

new NameFinderME(model, featureGenerator, NameFinderME.DEFAULT_BEAM_SIZE);
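Once constructed, the name finder is used the same way regardless of the feature generation. A minimal sketch (the sentence tokens are illustrative) of detecting names and resetting the adaptive data between documents:

NameFinderME nameFinder = new NameFinderME(model, featureGenerator, NameFinderME.DEFAULT_BEAM_SIZE);

// the input must be tokenized exactly like the training data
String sentence[] = new String[]{"Pierre", "Vinken", "is", "61", "years", "old", "."};
Span nameSpans[] = nameFinder.find(sentence);

// forget all adaptive data collected so far; call this after every document
nameFinder.clearAdaptiveData();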
Feature Generation defined by XML Descriptor
OpenNLP can also use an XML descriptor file to configure the feature generation. The descriptor file is stored inside the model after training, and the feature generators are configured correctly when the name finder is instantiated. The following sample shows an XML descriptor:

<generators>
  <cache>
    <generators>
      <window prevLength="2" nextLength="2">
        <tokenclass/>
      </window>
      <window prevLength="2" nextLength="2">
        <token/>
      </window>
      <definition/>
      <prevmap/>
      <bigram/>
      <sentence begin="true" end="false"/>
    </generators>
  </cache>
</generators>

The root element must be generators; each sub-element adds a feature generator to the configuration. The sample XML is equivalent to the generators defined by the API above.

The following table shows the supported elements:

Table 4.1. Generator elements

Element       Aggregated   Attributes
generators    yes          none
cache         yes          none
charngram     no           min and max specify the length of the generated character ngrams
definition    no           none
dictionary    no           dict is the key of the dictionary resource to use, and prefix is a feature prefix string
prevmap       no           none
sentence      no           begin and end to generate begin or end features, both are optional and are boolean values
tokenclass    no           none
token         no           none
bigram        no           none
tokenpattern  no           none
window        yes          prevLength and nextLength must be integers and specify the window size
custom        no           class is the name of the feature generator class which will be loaded

Aggregated feature generators can contain other generators, like the cache or the window feature generator in the sample.

Evaluation

Chapter 5. Document Categorizer

Classifying
The following two sample documents belong to the categories GMDecrease and GMIncrease:

Major acquisitions that have a lower gross margin than the existing network also had a negative impact on the overall gross margin, but it should improve following the implementation of its integration strategies.

The upward movement of gross margin resulted from amounts pursuant to adjustments to obligations towards dealers.

To be able to classify a text, the document categorizer needs a model. The classifications are requirements-specific and hence there is no pre-built model for the document categorizer under the OpenNLP project.

Document Categorizer Tool
The easiest way to try out the document categorizer is the command line tool. The tool is only intended for demonstration and testing. The following command shows how to use the document categorizer tool.

$ bin/opennlp Doccat model

The input is read from standard input and the output is written to standard output, unless they are redirected or piped. As with most components in OpenNLP, the document categorizer expects input which is segmented into sentences.

Document Categorizer API
To perform classification you will need a maxent model - these are encapsulated in the DoccatModel class of OpenNLP tools. First you need to grab the bytes from the serialized model on an InputStream - we'll leave it to you to do that, since you were the one who serialized it to begin with. Now for the easy part:

InputStream is = ...

DoccatModel m = new DoccatModel(is);

With the DoccatModel in hand we are just about there:

String inputText = ...
DocumentCategorizerME myCategorizer = new DocumentCategorizerME(m);
double[] outcomes = myCategorizer.categorize(inputText);
String category = myCategorizer.getBestCategory(outcomes);
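Besides the single best category, the score of every category can be inspected. A short sketch, assuming the getNumberOfCategories and getCategory methods of the categorizer, whose indices line up with the outcomes array:

// print the score the model assigned to each category
for (int i = 0; i < myCategorizer.getNumberOfCategories(); i++) {
  System.out.println(myCategorizer.getCategory(i) + " " + outcomes[i]);
}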
Training
The Document Categorizer can be trained on annotated training material. The data must be in the OpenNLP Document Categorizer training format, which is one document per line, containing category and text separated by a whitespace. The following sample shows the two documents from above in the required format; here GMDecrease and GMIncrease are the categories.

GMDecrease Major acquisitions that have a lower gross margin than the existing network also \
           had a negative impact on the overall gross margin, but it should improve following \
           the implementation of its integration strategies.
GMIncrease The upward movement of gross margin resulted from amounts pursuant to adjustments \
           to obligations towards dealers.

Note: The line breaks marked with a backslash are just inserted for formatting purposes and must not be included in the training data.

Training Tool
The following command will train the document categorizer and write the model to en-doccat.bin:

$ bin/opennlp DoccatTrainer -encoding UTF-8 -lang en -data en-doccat.train -model en-doccat.bin

Additionally it is possible to specify the number of iterations and the cutoff.

Training API
So, naturally you will need access to many pre-classified documents to train your model. The class opennlp.tools.doccat.DocumentSample encapsulates a text document and its classification. DocumentSample has two constructors; each takes the text's category as one argument. The other argument can either be raw text or an array of tokens. By default, the raw text will be split into tokens by whitespace.
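For illustration (the category labels are reused from the sample above, the texts are shortened), the two constructors look like this:

// raw text: split into tokens on whitespace by default
DocumentSample rawSample = new DocumentSample("GMIncrease",
    "The upward movement of gross margin resulted from amounts pursuant to adjustments to obligations towards dealers.");

// pre-tokenized text: the tokens are passed directly
DocumentSample tokenizedSample = new DocumentSample("GMDecrease",
    new String[]{"Major", "acquisitions", "had", "a", "negative", "impact", "."});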
So, let's say your training data was contained in a text file, where the format is as described above. Then you might want to write something like this to create a collection of DocumentSamples:

DoccatModel model = null;

InputStream dataIn = null;
try {
  dataIn = new FileInputStream("en-sentiment.train");
  ObjectStream<String> lineStream = new PlainTextByLineStream(dataIn, "UTF-8");
  ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);
  model = DocumentCategorizerME.train("en", sampleStream);
}
catch (IOException e) {
  // Failed to read or parse training data, training failed
  e.printStackTrace();
}
finally {
  if (dataIn != null) {
    try {
      dataIn.close();
    }
    catch (IOException e) {
      // Not an issue, training already finished.
      // The exception should be logged and investigated
      // if part of a production system.
      e.printStackTrace();
    }
  }
}

Now might be a good time to cruise over to Hulu or something, because this could take a while if you've got a large training set. You may see a lot of output as well. Once you're done, you can pretty quickly step to classification directly, but first we'll cover serialization. Feel free to skim.

OutputStream modelOut = null;
try {
  modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
  model.serialize(modelOut);
}
catch (IOException e) {
  // Failed to save model
  e.printStackTrace();
}
finally {
  if (modelOut != null) {
    try {
      modelOut.close();
    }
    catch (IOException e) {
      // Failed to correctly save model.
      // Written model might be invalid.
      e.printStackTrace();
    }
  }
}

Chapter 6. Part-of-Speech Tagger

Table of Contents
Tagging
    POS Tagger Tool
    POS Tagger API
Training
    Training Tool
    Training API
    Tag Dictionary
Evaluation
    Evaluation Tool

Tagging
The Part of Speech Tagger marks tokens with their corresponding word type based on the token itself and the context of the token. A token might have multiple pos tags depending on the token and the context. The OpenNLP POS Tagger uses a probability model to predict the correct pos tag out of the tag set. To limit the possible tags for a token a tag dictionary can be used, which increases the tagging and runtime performance of the tagger.

POS Tagger Tool
The easiest way to try out the POS Tagger is the command line tool. The tool is only intended for demonstration and testing. Download the English maxent pos model and start the POS Tagger Tool with this command:

$ bin/opennlp POSTagger en-pos-maxent.bin

The POS Tagger now reads a tokenized sentence per line from stdin. Copy these two sentences to the console:

Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .
Mr. Vinken is chairman of Elsevier N.V. , the Dutch publishing group .
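The tagger can also be used programmatically. A minimal sketch, assuming the POSModel and POSTaggerME classes and the en-pos-maxent.bin model downloaded above (the token array is illustrative):

POSModel model = new POSModel(new FileInputStream("en-pos-maxent.bin"));
POSTaggerME tagger = new POSTaggerME(model);

// the input must be one tokenized sentence; tag() returns one tag per token
String sent[] = new String[]{"Mr.", "Vinken", "is", "chairman", "of", "Elsevier", "N.V.", "."};
String tags[] = tagger.tag(sent);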