Oozie Workflow Framework User Guide
Oozie
◦ Home page ◦ Documents
Hue
◦ Home page ◦ Tutorials
Expression Language
◦ Tutorials
Job Designer Job Design Example
◦ Fs Action ◦ MapReduce Action
Oozie
WorkFlow
◦ A workflow scheduler system to manage Apache Hadoop jobs
◦ Workflow jobs are Directed Acyclic Graphs (DAGs) of actions
◦ Coordinator jobs are recurrent workflow jobs triggered by time (frequency) and data availability
◦ Action types: MapReduce, FS, Email, Shell, SSH, Hive, Pig, Sqoop, DistCp, Java
◦ oozie.wf.rerun.failnodes ◦ oozie.wf.rerun.skip.nodes
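The two rerun properties above control which action nodes are skipped when a workflow job is rerun. The following is a minimal Python sketch of that selection, not Oozie's own code. It assumes that `oozie.wf.rerun.failnodes=true` skips every node that already succeeded, that `oozie.wf.rerun.skip.nodes` is a comma-separated list of node names to skip, and that the two properties are not set together. The function name `nodes_to_skip` and the sample node names are hypothetical.

```python
def nodes_to_skip(rerun_properties, node_status):
    """Return the set of action nodes a rerun would skip (illustrative only).

    rerun_properties: dict of rerun property name -> string value.
    node_status: dict of node name -> status from the previous run,
                 where "OK" is assumed to mean the node succeeded.
    """
    has_fail = "oozie.wf.rerun.failnodes" in rerun_properties
    has_skip = "oozie.wf.rerun.skip.nodes" in rerun_properties
    if has_fail == has_skip:
        # Assumption of this sketch: exactly one of the two properties is given.
        raise ValueError("set exactly one of failnodes / skip.nodes")
    if has_fail:
        if rerun_properties["oozie.wf.rerun.failnodes"].strip().lower() != "true":
            return set()  # nothing is skipped; every node runs again
        return {name for name, status in node_status.items() if status == "OK"}
    raw = rerun_properties["oozie.wf.rerun.skip.nodes"]
    return {name.strip() for name in raw.split(",") if name.strip()}


# Hypothetical previous run: the first action succeeded, the second failed.
previous_run = {"mr1": "OK", "mr2": "ERROR"}
skipped_on_fail = nodes_to_skip({"oozie.wf.rerun.failnodes": "true"}, previous_run)
skipped_by_list = nodes_to_skip({"oozie.wf.rerun.skip.nodes": "mr1, mr2"}, previous_run)
```

With `failnodes` the rerun resumes at the failed node, while `skip.nodes` leaves the choice of nodes to the person submitting the rerun.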
◦ Workflow definitions can be parameterized
◦ When Oozie executes a workflow node, all EL expressions are resolved into concrete values
◦ EL expressions can be used in the configuration values of action and decision nodes
◦ Workflow Job Properties (or Parameters)
◦ Expression Language Functions
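To make the idea of resolving EL expressions into concrete values more tangible, here is a minimal Python sketch of plain `${name}` substitution against a set of job properties. It is an illustration only, not how Oozie implements EL, and it does not cover EL functions. The property names, values, and the `resolve` helper are hypothetical.

```python
import re

# Hypothetical job properties, standing in for the parameters of a workflow job.
job_properties = {
    "nameNode": "hdfs://namenode.example:8020",
    "outputDir": "/user/demo/output",
}

# Matches simple ${name} references only; EL functions are out of scope here.
_EL_VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve(value, properties):
    """Replace each ${name} in value with its property; fail on unknown names."""
    def lookup(match):
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"undefined workflow parameter: {name}")
        return properties[name]
    return _EL_VARIABLE.sub(lookup, value)


resolved = resolve("${nameNode}${outputDir}/part-00000", job_properties)
print(resolved)
```

Failing on an undefined name, rather than leaving the placeholder in place, keeps a misspelled parameter from silently reaching an action's configuration.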
Coordinator Design
◦ Datetime, Frequency and Time-Period ◦ Coordinator Action ◦ Parameterization of Coordinator
Creating a folder periodically
◦ Start time:
◦ End time:
◦ Frequency: once every 3 minutes
◦ Folder name: the nominal time formatted as yyyy-MM-ddTHH-mm
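The periodic folder example can be sketched in Python to show which folder names a 3-minute frequency would produce. The start and end times below are placeholders chosen for illustration, since the source leaves them blank, and treating the end time as exclusive is an assumption of this sketch. The helper name `nominal_times` is hypothetical.

```python
from datetime import datetime, timedelta


def nominal_times(start, end, frequency):
    """Yield each nominal time from start (inclusive) up to end (exclusive)."""
    current = start
    while current < end:
        yield current
        current += frequency


# Assumed values for illustration only; the source does not give them.
start = datetime(2013, 10, 1, 0, 0)
end = datetime(2013, 10, 1, 0, 10)

# yyyy-MM-ddTHH-mm corresponds to this strftime pattern.
folders = [
    t.strftime("%Y-%m-%dT%H-%M")
    for t in nominal_times(start, end, timedelta(minutes=3))
]
print(folders)
```

Each folder name is derived from the nominal (scheduled) time rather than the wall-clock time at which the action actually runs, so a delayed run still produces the expected name.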
Coordinator
◦ Integrated ◦ Scalable ◦ Reliable ◦ Extensible
Workflows do not support cycles
Workflow Nodes
◦ Control Flow Nodes ◦ Workflow Action Nodes
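Because a workflow must be acyclic, a definition whose transitions loop back on themselves is invalid. The following Python sketch shows one way such a loop could be detected with a depth-first search over the node transitions. It is an illustration, not Oozie's validation code, and the node names and the `find_cycle` helper are hypothetical.

```python
def find_cycle(transitions):
    """Return a list of node names forming a cycle, or None if the graph is acyclic.

    transitions maps each node name to the list of nodes it can transition to.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in transitions}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in transitions.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path, so we found a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(transitions):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical workflows: the first is a valid DAG, the second loops mr2 -> mr1.
valid = {"start": ["mr1"], "mr1": ["check"], "check": ["mr2", "end"],
         "mr2": ["end"], "end": []}
looping = {"start": ["mr1"], "mr1": ["mr2"], "mr2": ["mr1"]}

print(find_cycle(valid))
print(find_cycle(looping))
```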
WorkFlow Recovery
October 2013
◦ System Installation ◦ Job Design ◦ Oozie Overview ◦ WorkFlow Design ◦ Coordinator Design ◦ Reference
Requirements
◦ Cloudera Manager ◦ CDH
Oozie
[Flowchart: Start → Map Reduce1 → output > 500 lines? → yes → Map Reduce2 → End]
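The flowchart routes the job through a second MapReduce step only when the first step's output exceeds 500 lines. A small Python sketch of that routing follows. The flowchart shows only the "yes" branch, so sending the "no" branch straight to End is an assumption of this sketch, and the function name `run_path` is hypothetical.

```python
def run_path(output_lines, threshold=500):
    """Return the sequence of nodes visited for a given Map Reduce1 output size."""
    path = ["Start", "Map Reduce1"]
    if output_lines > threshold:
        # "yes" branch from the flowchart
        path.append("Map Reduce2")
    # Assumed "no" branch: go directly to End (not shown in the source).
    path.append("End")
    return path


large_output_path = run_path(800)
small_output_path = run_path(120)
print(large_output_path)
print(small_output_path)
```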
The Coordinator executes workflow jobs that are:
◦ Recurrent ◦ Interdependent
Coordinator jobs are triggered based on:
◦ Time intervals ◦ Data availability ◦ Time intervals and/or data availability
◦ Install ◦ Config
Hue
◦ Install ◦ Config (2 items) ◦ Basic Operation
Action Overview
◦ An execution/computation task (a Map-Reduce job, a Pig job, a shell command). It can also be referred to as a task or 'action node'.