Oozie Workflow Framework User Guide



Oozie
◦ Home page ◦ Documents

Hue
◦ Home page ◦ Tutorials

Expression Language
◦ Tutorials

Job Designer Job Design Example
◦ Fs Action ◦ MapReduce Action

Oozie

WorkFlow
◦ A workflow scheduler system to manage Apache Hadoop jobs
◦ Workflow jobs are Directed Acyclic Graphs (DAGs) of actions
◦ Coordinator jobs are recurrent workflow jobs triggered by time (frequency) and data availability
◦ Supported actions: MapReduce, FS, Email, Shell, SSH, Hive, Pig, Sqoop, DistCp, Java
◦ oozie.wf.rerun.failnodes ◦ oozie.wf.rerun.skip.nodes
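
The two rerun properties above control which nodes are skipped when a workflow is resubmitted. The sketch below is only a simplified model of that selection, not Oozie's own code: the function name `nodes_to_rerun`, the node names, and the status strings in the usage lines are illustrative assumptions. It assumes that exactly one of the two properties is supplied, that `oozie.wf.rerun.failnodes=true` skips nodes that already succeeded, and that `oozie.wf.rerun.skip.nodes` is a comma-separated list of node names to skip.

```python
def nodes_to_rerun(node_status, rerun_props):
    """Return the nodes that would execute again on a rerun (simplified model).

    node_status: dict of node name -> status from the previous run
                 (status strings here are illustrative).
    rerun_props: dict that holds exactly one of the two rerun properties.
    """
    has_fail = "oozie.wf.rerun.failnodes" in rerun_props
    has_skip = "oozie.wf.rerun.skip.nodes" in rerun_props
    if has_fail == has_skip:
        raise ValueError("set exactly one of the two rerun properties")

    if has_fail:
        if rerun_props["oozie.wf.rerun.failnodes"].strip().lower() == "true":
            # Skip every node that already completed successfully.
            skip = {name for name, status in node_status.items() if status == "OK"}
        else:
            # Assumption: 'false' means nothing is skipped.
            skip = set()
    else:
        raw = rerun_props["oozie.wf.rerun.skip.nodes"]
        skip = {name.strip() for name in raw.split(",") if name.strip()}

    # Preserve the original node order.
    return [name for name in node_status if name not in skip]


# Hypothetical previous run: mr1 succeeded, mr2 failed, notify never started.
previous_run = {"mr1": "OK", "mr2": "ERROR", "notify": "PREP"}
from_failed = nodes_to_rerun(previous_run, {"oozie.wf.rerun.failnodes": "true"})
explicit_skip = nodes_to_rerun(previous_run, {"oozie.wf.rerun.skip.nodes": "mr1, notify"})
```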





◦ Workflow definitions can be parameterized
◦ When Oozie executes a workflow node, all EL expressions in it are resolved into concrete values
◦ EL expressions can be used in the configuration values of action and decision nodes
◦ Workflow Job Properties (or Parameters)
◦ Expression Language Functions
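
To make the idea of resolving parameters into concrete values more tangible, here is a minimal sketch of plain `${name}` substitution from job properties. It is an illustration only, not Oozie's EL engine: it does not handle EL functions, and the helper `resolve`, the property names, and the host name are hypothetical.

```python
import re

# Matches simple ${name} references; EL functions are deliberately not handled.
_EL_VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_.]*)\}")


def resolve(value, properties):
    """Replace each ${name} in value with the matching job property."""
    def substitute(match):
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"unresolved EL variable: {name}")
        return str(properties[name])

    return _EL_VARIABLE.sub(substitute, value)


# Hypothetical job properties, as they might appear in a job.properties file.
job_properties = {
    "nameNode": "hdfs://namenode.example:8020",
    "outputDir": "out",
}
resolved = resolve("${nameNode}/user/demo/${outputDir}", job_properties)
```

Raising on an unknown name, rather than leaving the placeholder in place, is a design choice of this sketch so that a missing property is noticed early.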



◦ Datetime, Frequency and Time-Period
◦ Coordinator Action
◦ Parameterization of Coordinator
◦ Coordinator Design

Periodically create a folder
◦ Start time:
◦ End time:
◦ Frequency: once every 3 minutes
◦ Folder name: the nominal time formatted as yyyy-MM-ddTHH-mm
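
The folder-naming scheme of this example can be sketched outside Oozie as follows. The slide leaves the start and end times blank, so the values used here are purely hypothetical. Treating the end time as exclusive is also an assumption of this sketch, and `folder_names` is an illustrative helper, not an Oozie API.

```python
from datetime import datetime, timedelta


def folder_names(start, end, every_minutes=3):
    """List one folder name per nominal time in [start, end)."""
    names = []
    nominal = start
    while nominal < end:
        # yyyy-MM-ddTHH-mm expressed as a strftime pattern.
        names.append(nominal.strftime("%Y-%m-%dT%H-%M"))
        nominal += timedelta(minutes=every_minutes)
    return names


# Hypothetical 10-minute window; the original slide gives no concrete times.
start = datetime(2013, 10, 1, 0, 0)
end = datetime(2013, 10, 1, 0, 10)
names = folder_names(start, end)
```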

Coordinator


Integrated
Scalable Reliable Extensible


Does not support cycles
Workflow Nodes
◦ Control Flow Nodes ◦ Workflow Action Nodes
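
Because a workflow must be a DAG, a definition whose transitions loop back on themselves is invalid. The sketch below shows one common way to detect such a loop, a depth-first search over a transition map. It is an illustration of the constraint, not Oozie's validation code, and the node names are hypothetical.

```python
def has_cycle(transitions):
    """Return True if the transition graph contains a cycle.

    transitions: dict of node name -> list of nodes it can transition to.
    """
    unvisited, in_progress, done = 0, 1, 2
    state = {}

    def visit(node):
        state[node] = in_progress
        for nxt in transitions.get(node, ()):
            nxt_state = state.get(nxt, unvisited)
            if nxt_state == in_progress:
                # Reached a node that is still on the current path.
                return True
            if nxt_state == unvisited and visit(nxt):
                return True
        state[node] = done
        return False

    return any(
        state.get(node, unvisited) == unvisited and visit(node)
        for node in list(transitions)
    )


# Hypothetical workflow: every action has an ok and an error transition.
valid_workflow = {
    "start": ["mr1"],
    "mr1": ["check", "fail"],
    "check": ["mr2", "end"],
    "mr2": ["end", "fail"],
    "end": [],
    "fail": [],
}
# Hypothetical invalid definition: b transitions back to a.
looping_workflow = {"start": ["a"], "a": ["b"], "b": ["a"]}

valid_has_cycle = has_cycle(valid_workflow)
looping_has_cycle = has_cycle(looping_workflow)
```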

WorkFlow Recovery
October 2013



◦ System Installation
◦ Job Design
◦ Oozie Overview
◦ WorkFlow Design
◦ Coordinator Design
◦ Reference

Requirements
◦ Cloudera Manager ◦ CDH

Oozie
Example workflow (diagram): Start → Map Reduce1 → [output > 500 lines] → Map Reduce2 → End
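
One possible reading of the example flow (Start, Map Reduce1, a check on whether the output exceeds 500 lines, Map Reduce2, End) is that the second job runs only when the first produces more than 500 lines, and the workflow ends otherwise. The diagram does not show the "otherwise" branch, so that part is an assumption. The sketch below simulates the routing in plain Python; the node names and helper functions are illustrative.

```python
def next_node(current, output_lines, threshold=500):
    """Return the node that follows current, given the line count of MR1's output."""
    if current == "start":
        return "map-reduce1"
    if current == "map-reduce1":
        # Decision step: continue only if the output is large enough.
        return "map-reduce2" if output_lines > threshold else "end"
    if current == "map-reduce2":
        return "end"
    raise ValueError(f"unknown node: {current}")


def run_path(output_lines):
    """Follow transitions from start to end and return the visited nodes."""
    path = ["start"]
    while path[-1] != "end":
        path.append(next_node(path[-1], output_lines))
    return path


large_output_path = run_path(800)
small_output_path = run_path(500)
```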

Coordinator execute workflow jobs:
◦ Recurrent ◦ Interdependent

Coordinator Based:
◦ Time intervals ◦ Data availability ◦ Time intervals and/or data availability
◦ Install ◦ Config

Hue
◦ Install ◦ Config (2 items) ◦ Basic Operation

Action Overview
◦ An execution/computation task (a Map-Reduce job, a Pig job, a shell command). It can also be referred to as a task or 'action node'.