Data Warehouse Technical Architecture and Solutions
Corporate Memory
ORDER
ORDER NUMBER ORDER DATE STATUS
ORDER ITEM BACKORDERED QUANTITY
ORDER ITEM SHIPPED QUANTITY SHIP DATE
ITEM ITEM NUMBER QUANTITY DESCRIPTION
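Current Transition Target
Technology Application Information Business
Logical layer Solution
Project
EDW Application Logical Architecture
Operational source data image
Multi-purpose model Historical data Transformed
Views Logical data marts Dependent data marts Analytical knowledge base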
Tier 1 Operational Image
Of
Tier 2
Single Version
CUSTOMER
CUSTOMER NUMBER CUSTOMER NAME CUSTOMER CITY CUSTOMER POST CUSTOMER ST CUSTOMER ADDR CUSTOMER PHONE CUSTOMER FAX
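The tier labels above (Tier 1 operational image, Tier 2 integrated model, semantic-layer views) can be illustrated with a small sketch. This is not taken from the slides: it uses Python's built-in `sqlite3` rather than Teradata, and the table names `t1_customer_image` and `t2_customer`, the view `v_customer_current`, the `load_date` / `valid_from` columns, and the sample rows are all hypothetical. Only the CUSTOMER attribute names come from the entity shown above.

```python
import sqlite3


def build_tiers(conn):
    """Create one hypothetical object per tier for the CUSTOMER entity."""
    # Tier 1 (assumed layout): operational image, source rows stored as delivered.
    conn.execute(
        """CREATE TABLE t1_customer_image (
               customer_number TEXT,
               customer_name   TEXT,
               customer_city   TEXT,
               load_date       TEXT)"""
    )
    # Tier 2 (assumed layout): integrated model that keeps history per customer.
    conn.execute(
        """CREATE TABLE t2_customer (
               customer_number TEXT,
               customer_name   TEXT,
               customer_city   TEXT,
               valid_from      TEXT,
               PRIMARY KEY (customer_number, valid_from))"""
    )
    # Semantic layer (assumed): a view exposing only the latest row per customer.
    conn.execute(
        """CREATE VIEW v_customer_current AS
           SELECT c.customer_number, c.customer_name, c.customer_city
           FROM t2_customer AS c
           WHERE c.valid_from = (SELECT MAX(x.valid_from)
                                 FROM t2_customer AS x
                                 WHERE x.customer_number = c.customer_number)"""
    )


def load_tier2(conn):
    """Transform Tier 1 rows (trim names, normalize city case) into Tier 2."""
    conn.execute(
        """INSERT OR REPLACE INTO t2_customer
           SELECT customer_number,
                  TRIM(customer_name),
                  UPPER(TRIM(customer_city)),
                  load_date
           FROM t1_customer_image"""
    )


def demo():
    conn = sqlite3.connect(":memory:")
    build_tiers(conn)
    # Made-up sample rows, for illustration only.
    rows = [
        ("C001", " Acme Ltd ", "shanghai", "2008-12-01"),
        ("C001", "Acme Ltd", "beijing", "2008-12-10"),
        ("C002", "Bolt Co", "Shenzhen ", "2008-12-05"),
    ]
    conn.executemany("INSERT INTO t1_customer_image VALUES (?, ?, ?, ?)", rows)
    load_tier2(conn)
    return conn.execute(
        "SELECT customer_number, customer_name, customer_city "
        "FROM v_customer_current ORDER BY customer_number"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

In this sketch the history stays in the Tier 2 table, while consumers query only the view, so the physical model could change without breaking them.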
Gartner Magic Quadrant for Data Warehouse DBMS Servers, 2006 Feinberg, Hardcastle, Butler, Dawson (8/25/2006)
The Magic Quadrant is copyrighted 9/12/06 by Gartner, Inc. and is reused with permission. The Magic Quadrant is a graphical representation of a marketplace at and for a specific time period. It depicts Gartner's analysis of how certain vendors measure against criteria for that marketplace, as defined by Gartner. Gartner does not endorse any vendor, product or service depicted in the Magic Quadrant, and does not advise technology users to select only those vendors placed in the "Leaders" quadrant. The Magic Quadrant is intended solely as a research tool, and is not meant to be a specific guide to action. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
Query Complexity
Query Data Volumes: TB's
Workload Mix
Agenda
• Introduction to Teradata
• Architecture design principles
• Overall architecture
• ETL architecture
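Architecture Cube
Logical architecture layers
Sequence of operation
Technology Application Information Business
Logical layer Solution Project
Level of definition
Current Transition Target
From Via To
Physical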
Teradata System Scalability
[Figure labels]
Data Storage (raw, user data): 15 TB, 20 TB
Data Model Sophistication: Normalized; Multiple, Integrated Stars; Multiple, Integrated Stars and Normalized
5-10 Way Joins
MB's, GB's
Batch Reporting, Repetitive Queries
"Iterative", Ad Hoc Queries; Data Analysis/Mining
Near Real Time Data Feeds
Active Data Warehousing
1,000
• More than 5,500 employees worldwide
Teradata Market Share
Teradata Top 10
90% of Top 10 Global Telco Firms
70% of Top 10 Global Airlines
60% of the Top 10 Transportation Logistic Firms
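• EDW/ADW database technology
• Analytic solutions
• Consulting services
> Ranked No. 1 in data warehousing by Gartner for nine consecutive years, starting in 1999
• One of the 10 largest publicly traded software companies in the United States
> S&P 500 member
> NYSE ticker symbol: "TDC"
> NYSE Arca Tech 100
• World-class customers around the globe
> More than 850 world-class customers
> More than 2,000 installations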
• Global presence
> Over 100 countries
Teradata drives the sustainable growth of world-class enterprises
Post
Communications
5 TB, 10 TB
# of Concurrent Queries
15+ Way Joins + OLAP operations + Aggregation + Complex "Where" constraints + Views
Parallelism
Simple Star
3-5 Way Joins
Data Warehouse Technical Architecture and Solutions
SPDB Project Training
Huang Yuhui (黄予辉), December 13, 2008
Teradata Company Overview
• Teradata Corporation: listed on the New York Stock Exchange on October 1, 2007
> Global leader in enterprise data warehousing
Log
Tier 3
PHY
Semantic Layer
META-DATA
Reference Architecture - Source View
Frontline Users
C/S
Web
Enterprise Users — (Browsers and/or Portal)
Consumers
Suppliers
Internal
Partners
Back-office users
Web
C/S
WAN / VAN
Transactional Services
Internet / Intranet
WAN / VAN
Analytic & Decision Making Services
• Technology
> The bit IT cares about most. The easiest to get WRONG because we don’t concentrate on the other aspects of architecture FIRST!
> What do we have and need to support the other 3 Views without limitation?
Financial
Travel
Retail
Insurance
AoyamaShoji
Manufacturing
Teradata: the leader in data warehouse technology
Software
Hardware
Gartner Magic Quadrant for Data Warehouse DBMS, 2006 Feinberg & Beyer (9/2006)
NW
TX1 APPL
DA-MW
MSG-MW
TX2 APPL
DA-MW
MSG-MW
TX3 APPL
DA-MW
MSG-MW
TX4 APPL
DA-MW
Extract from Database
OLTP1
OLTP2
OLTP3
OLTP4
Transactional Repositories
ASP / JSP
Service Brokers
> The data is worked on by the applications, used by the business.
• Application
> What functions and interrelations of functions do the applications have and need? Sales, Marketing, Pricing, Manufacturing, Customer Management.
Success starts with a sound architecture design method
• Business
> What is the business model, where is it going, how does it plan to get there?
> The requirements. The business process.
• Information
> What data do we have and need to support the Business View? Information is also calculations and rules. Typically we see Logical & Physical data models here, all subject areas of the business.
Enterprise Service Bus
Duplicate
Transaction
MSG-MW
MSG-MW
Event Notification
Business Rules
DA-MW
DA-MW
MSG-MW
Event Detection
DA-MW
Business Process Automation
> Works against information to support the Business View. The applications work within the confines of the Information architecture, creating and consuming the data elements, rules and definitions of that architecture view.
50% of Top 10 Global Retailers
40% of Top 10 Global Commercial & Savings Banks
FORTUNE Global Rankings, July 2006
• Leading industries
> Banking/Financial Services
> Government
> Insurance & Healthcare
> Manufacturing
> Retail
> Telecommunications
> Transportation Logistics
> Travel
Streaming Batch
Data Acquisition & Integration
MSG-MW
Strategic APPL
DA-MW
MSG-MW
Tactical APPL
DA-MW
MSG-MW
BI APPL
DA-MW
NW
BI APPL
DA-MW
QD
EDW — A
RDBMS Based Event
> Choosing an ETL tool before you have defined an architecture is “SUB OPTIMAL”
• A separation of Business, Data, Process and Technology as appropriate
Processing
QD EDW — B RS
Analytic & Decision Making Repositories
Reference Architecture – Data loading View
Frontline Users