Lite Bytes
BC’s Topical Technology White Papers
Data Warehousing -- Concepts, Components, Approaches
Do you know any clients that think reporting is easy? Who can quickly, accurately and concisely determine the state of their business on a real-time basis? Who doesn’t have mission-critical business information that lives only in a spreadsheet?
We haven’t seen many, either. Now a relatively new concept is attempting to make this process easier: Data Warehousing. This Lite Byte will cover the basics of data warehousing -- key concepts, conceptual approaches, opportunity indicators, potential pitfalls, and other issues related to this topic.
Executive Overview
Data warehousing is designed to allow a business to gather disparate information into a facility that promotes access and integration. Warehouses combine hardware, databases and user access tools into a cohesive system which consolidates, synchronizes and presents data in a format which eases the transition from data to information.
Data Warehouses rely on several advanced concepts. Relational databases which support transaction processing are often not effectively optimized for warehousing, and alternative systems are required. Access tools vary from general purpose to application-specific. Executive Information Systems (EIS) are best viewed as a component, not the end result, of a data warehouse. Finally, data warehouses are a complement to, not a replacement for, legacy transaction processing systems.
Contents: General Description, Conceptual Architecture, Data Wholesaling & Retailing, Refining, Databases, Access & Reporting
Written by: Ann Willis, BC Dallas; Greg Moran, BC Cleveland; Randy Green, BC Dallas
Introduction
The purpose of this document is to help BC professionals understand the basics of data warehousing. After reading this paper, a BC professional should be at least casually conversant with the terms, approaches, components and issues related to data warehousing. This document will help you identify opportunities for implementation of data warehousing concepts, and will guide your thinking as you move forward with a data warehousing project.
However, this Lite Byte is by no means the last word in the technical and process-oriented aspects of data warehousing. This topic is exploding in popularity, and a variety of sources provide specific technical and re-engineering guidance. This document does not replace the volumes of information available. The last section of this paper lists a bibliography of resources which should be consulted before any decisions are made.
If you are familiar with transaction processing systems, some of the information you are about to read may seem contradictory to the concepts you are familiar with. This is to be expected. Data Warehouses do not exist to capture data; they exist to develop information. The difference is subtle but profound, and it permeates the entirety of the data warehousing concept. Be prepared to have your traditional thinking challenged.
This is intended to be a living document. The information we discuss here is, almost by definition, limited to a small number of engagements that have actually implemented systems. As you use this document and find sections that need addition, clarification and correction, let the Technology Team know by entering your comments in the appropriate AA Online discussion. By teaming our collective corporate knowledge, we can make this task less complex.
General Description
Data warehousing is the latest manifestation of a simple data processing requirement: deliver the information users need, when they need it, in a format they can use. Over the history of the industry, this has been the driving force for data processing.
Fortunately for those of us in the consulting industry, MIS has historically done a relatively poor job of meeting this simple requirement. For a variety of reasons, systems have been developed which fail one or more of the three key tests.
Perhaps most critically, data processing organizations have largely focused on gathering data to answer the question “what happened?” More and more, managers are moving beyond wanting to know what happened, and are more concerned with understanding why it happened. Data warehousing is an attempt to better answer the “why” question. Data warehousing focuses on summarized data and the relationships of that data to other pieces of data. At the highest level, data warehousing is a movement to turn data into information.
engineering of corporations -- responsibility is being pushed lower and lower into corporate organization charts. Where companies could once employ a few people fluent in multivariate regression analysis to serve executive management, now the lowest line manager may have big questions to go along with his new P&L responsibility. This line manager needs tools and information at his fingertips to answer the questions that affect his responsibility center, and he can’t depend on a single Analysis Department to get him those answers six weeks after he needed them.
Referring to the old but still applicable pyramid model, we can equate most existing systems with the “data” layer. This is largely why online transaction processing (OLTP) systems exist -- to capture information about an event that occurred. In most OLTP systems, you can make several observations about the data layer:
∙ it is concerned with discrete events -- OLTP systems have one iron rule: one event, one record. Mixing information from multiple business events into a single transaction is a recipe for disaster in OLTP processing.
∙ it is highly detailed -- OLTP systems attempt to capture everything the business would ever need to know about an event. This information about a transaction is often designed to stand alone -- which means it may not be effectively related to other events that have an indirect bearing on the event.
∙ it is optimized for update -- OLTP systems are designed to capture information as long as a user is willing to supply it. The internal workings of OLTP systems revolve around fast response times. Anything that might impede a sub-second screen update is viewed with suspicion.
This is all well and good as long as the user is content with answering relatively simple questions. Any good OLTP system can tell you how many widgets were sold in January, or the total payroll cost for a particular sales department. Where OLTP systems fail is in answering questions like, “What was the sales department payroll cost for selling widgets in the Southern Region to wholesale customers this quarter versus this quarter last year?” Answering multidimensional questions like this one requires a different view of the data, and a different sort of database.
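To make the contrast concrete, here is a minimal sketch in Python of the difference between detailed event records and a summarized, multidimensional view. Everything in it is an assumption made for illustration: the record layout, the sample figures, the years, and the function names `summarize` and `year_over_year` are hypothetical. It also uses sales amounts rather than payroll cost to keep the example short. It is not a description of any particular warehouse product.

```python
from collections import defaultdict

# Hypothetical OLTP-style records: one business event, one record.
# Fields: (product, region, customer_type, year, quarter, amount)
SALES_EVENTS = [
    ("widget", "South", "wholesale", 1995, 1, 1200.0),
    ("widget", "South", "wholesale", 1995, 1, 800.0),
    ("widget", "South", "retail", 1995, 1, 300.0),
    ("widget", "North", "wholesale", 1995, 1, 950.0),
    ("gadget", "South", "wholesale", 1995, 1, 400.0),
    ("widget", "South", "wholesale", 1996, 1, 1500.0),
    ("widget", "South", "wholesale", 1996, 1, 1000.0),
    ("widget", "North", "retail", 1996, 1, 700.0),
]


def summarize(events):
    """Roll detailed events up into totals keyed by every dimension."""
    totals = defaultdict(float)
    for product, region, customer_type, year, quarter, amount in events:
        totals[(product, region, customer_type, year, quarter)] += amount
    return dict(totals)


def year_over_year(summary, product, region, customer_type, year, quarter):
    """Compare one quarter with the same quarter of the previous year."""
    current = summary.get((product, region, customer_type, year, quarter), 0.0)
    previous = summary.get((product, region, customer_type, year - 1, quarter), 0.0)
    return current, previous, current - previous


# Build the summarized view once, then answer the multidimensional question.
summary = summarize(SALES_EVENTS)
current, previous, change = year_over_year(
    summary, "widget", "South", "wholesale", 1996, 1
)
print(current, previous, change)
```

The design point is that the question is answered from the summarized structure, keyed by product, region, customer type and period, rather than by scanning individual transactions each time it is asked.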