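Six Sigma Project Operation Examples
How is a project defined?
Project definition is done by the Champion.
Below is a brief overview of how a project is defined.
1 Identify the key business issue:
a Goals
b Objectives
c Deliverables
2 For production:
a Cycle time
b Quality / defect level
c Cost
3 Project selection
a Tools for selecting a project
a1 Macro map
a2 Pareto chart analysis
a3 Fishbone diagram
a4 Cause-and-effect matrix
b Project criteria (evaluation)
b1 Reduce defects by 70%
b2 Save $175K in the first year
b3 Project completion time of 4 months
b4 Minimal capital expenditure
b5 A Black Belt's first project must meet the training objectives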
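Define Phase
What do we need to do in the Define phase?
1. Complete the project statement.
2. Complete the projected savings estimate.
3. Complete the problem statement:
3.1 What is the problem?
3.2 Where and when was it found?
3.3 Which process steps are involved?
3.4 Who is affected?
3.5 How severe is the problem?
3.6 How do you know this?
4. Draw the macro map.
5. Describe the main thread of the project.
6. Complete the goal statement.
7. Form the project team and list its members.
8. Complete the financial assessment.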
How to write the problem statement?
State the problem from six angles:
1 What is the problem?
2 Where and when was it found?
3 Which process steps are involved?
4 Who is affected?
5 How severe is the problem?
6 How do you know this?
How to draw the macro map?
Order for drawing the macro map: Supplier -> Input -> Process -> Output -> Customer (SIPOC)
Key points of the project goal statement:
1. Goal statement
2. Calculation method
3. Annual savings
Selecting team members:
1. The team should include technical staff
2. Include maintenance staff (if needed)
3. Include operators
4. No more than 5 team members (except in special cases).
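Measure Phase
How to describe the project:
1. Goal statement
2. Metric chart
3. Monthly savings
How to draw a process flow diagram:
Assemble the team:
Drawing the flow diagram is a team effort
The team includes:
Process owner: the person responsible for the project results
Engineering: process, product, design, and equipment
Production: operators, shift supervisors, trainers, lead operators, maintenance technicians
Information needed for the flow diagram:
Brainstorming
Observation / experience
Operating manuals
Engineering standards, work instructions
The six major categories (man, machine, method, measurement, material, environment)
Define the process scope:
Scope is critical
The narrower the better!
A large number of process steps may indicate a poorly defined project or a problem that stems from several projects
Problems are hidden within problems
If the problem could be solved by a rough analysis, management would already have done it
Draw an actionable process map
Can you identify the source of the defects?
Can we deliberately change the input variables?
Does deliberately changing the input variables directly affect the output?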
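Process flow diagram (PFD):
Elements of a Six Sigma process flow diagram:
All process steps, including the hidden factory
Data collection points
All equipment / tools
Each step marked as value-added (VA) or non-value-added (NVA)
Standard control documents
Draw the process flow with standard symbols:
Available in software such as Microsoft Office
Process flow diagram procedure:
Draw the process steps as documented
Include all inspection points, measurements, and transport steps
Identify all data collection points
Mark the standard control document for each step
Mark each step as value-added (VA) or non-value-added (NVA)
Identify the X's and Y's of each process step
Mark NVA steps that could be eliminated
Add and label the "hidden factory" steps
Mark them as VA or NVA, and mark steps that could be eliminated
Mark steps that need a control document assigned
Add estimates of DPU, RTY, COPQ, cycle time, etc.
Mark steps that need gage and process capability studies
Verify accuracy by direct or covert observation
Documentation / verification:
The documented process flow
Draw the documented process first
Add and label the hidden factory steps
Once all steps are shown, the flow diagram represents the actual process
Verification
Accuracy of the flow diagram is critical
The project team must take time to observe the process
Do it covertly.
Observation changes behavior
Confirm that the actual process settings match the documented settings
Observe the process across shifts and machines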
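How to draw a detailed process map:
Detailed process map:
Elements of a Six Sigma detailed process map:
Output variables (Y) and input variables (X) of the process or product
Specification limits and standard control documents
Equipment / tools used
Drawing the detailed process map
The detailed process map must be based on the process flow diagram.
A change to one should be reflected in the other.
Use the latest control documents
Mark the inputs and outputs of all hidden factory steps
Detailed process map procedure:
1. List the process steps from the flow diagram
2. Add the following
Output variables
Output specifications, if any
Input variables
Input specifications, if any
Process capability or gage capability indices
Equipment used
3. Mark hidden factory steps
4. Mark each step as value-added (VA) or non-value-added (NVA)
5. Mark each step as controllable (C) or noise (N)
6. Confirm the input settings of each piece of equipment
7. Verify the accuracy of the flow diagram
8. Change and update the flow when necessary
Specification limits and process capability:
Process and product specifications
Add the process settings for the X's
Add the specification limits for the Y's
Mark undocumented Y's and controllable X's
Measurement system
Add gage repeatability and reproducibility data
Mark gages that need a measurement system analysis
Process capability
Show estimates of RTY, DPU, Cpk, etc.
Mark process steps whose data are outdated or incomplete and need a process capability analysis
Changes and updates:
Changes
Remember: one goal of Six Sigma is to find Y = f(X)
As understanding of the process deepens, update the process map to reflect new information
Updates
One of the final deliverables of the project is a flow diagram of the current process
Update the process map to reflect any process change
Add the results of the measurement system analysis and process capability analysis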
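Lean manufacturing and 5S:
Lean manufacturing is similar to the Japanese 5S
Fishbone diagram:
A systematic method for identifying all the possible causes that could lead to a problem (effect).
How to construct a fishbone diagram:
1. State the problem and place it in a box on the right
2. Draw a horizontal arrow pointing toward the box.
3. Write the traditional category names* (or the categories you suspect) above and below the arrow. Connect them to the arrow with straight lines.
4. Within each major category, brainstorm and list all the factors that could cause the problem.
5. Refine further: for each listed factor, list its input variables.
*6M: man, machine, method, measurement, material, mother nature (environment)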
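Attribute measurement system study:
Attribute gage R&R terminology:
Appraiser score (%): in an attribute R&R study, the proportion of parts on which an appraiser agrees with himself or herself across trials
Attribute data: qualitative (pass/fail) data that can be recorded and analyzed
Attribute measurement system: a measurement system that compares each part to a standard to decide whether the part meets the standard
Consumer bias: the tendency of an appraiser to reject good product
Screen effective score (%): in an attribute R&R study, the proportion of parts on which all appraisers agree with themselves and with each other
Standard value: the average value measured by a high-accuracy gage
Producer bias: the tendency of an appraiser to pass nonconforming (defective) product
Screening: 100% evaluation of product by inspection
Screen effectiveness: the ability of the attribute measurement system to distinguish good from bad
Purposes of attribute gage R&R:
Process assessment
Assess whether your inspection or workmanship standards agree with customer requirements
Determine whether inspectors across all shifts, machines, etc. use the same criteria to decide pass or fail
Quantify the inspectors' ability to repeat their inspection decisions accurately
Determine how well inspectors agree with a "known standard" and whether they lean toward consumer bias or producer bias
Process improvement
Find out whether training is needed, procedures are missing, or standards are lacking
Attribute gage R&R method:
Preparation
Select 30 parts from the process, 50% good and 50% defective
If possible, select borderline good and borderline bad samples
Select inspectors who are fully trained and qualified
Execution
Have each inspector inspect the parts in random order, decide pass or fail, and then repeat the inspection
Evaluation
Document the results
If necessary, take appropriate action to adjust the inspection process
Repeat the R&R study to verify that the adjustment was effective
Attribute gage R&R conclusions:
Appraiser score
If most appraisers score 100%, training will be of very limited benefit
Screen effective score
If appraisers are self-consistent but disagree with each other, retraining can help reduce errors.
Score versus standard
If appraisers frequently disagree with the standard, the measurement system (or the local standard) needs to change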
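The two agreement scores defined in the terminology above can be sketched in a few lines. This is a minimal illustration, not the author's worksheet: the pass/fail calls, the appraiser names, and the function names are all hypothetical, and each appraiser is assumed to inspect every part exactly twice.

```python
def appraiser_score(trial_1, trial_2):
    """Share of parts on which one appraiser made the same call in both trials."""
    agree = sum(a == b for a, b in zip(trial_1, trial_2))
    return agree / len(trial_1)


def screen_effective_score(calls_by_appraiser):
    """Share of parts on which every trial of every appraiser gave the same call."""
    all_trials = [t for trials in calls_by_appraiser.values() for t in trials]
    n_parts = len(all_trials[0])
    agree = sum(len({t[i] for t in all_trials}) == 1 for i in range(n_parts))
    return agree / n_parts


# Hypothetical study: 6 parts, 2 appraisers, 2 trials each ("P" = pass, "F" = fail).
calls = {
    "appraiser_1": (["P", "P", "F", "F", "P", "F"], ["P", "P", "F", "F", "P", "F"]),
    "appraiser_2": (["P", "F", "F", "F", "P", "F"], ["P", "P", "F", "F", "P", "P"]),
}

scores = {name: appraiser_score(*trials) for name, trials in calls.items()}
screen = screen_effective_score(calls)
print(scores, screen)
```

Comparing each call against a known standard value per part, in the same way, would give the score against the standard and show whether disagreements lean toward consumer or producer bias.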
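Process capability analysis:
Why measure process capability?
It lets us allocate resources based on data! (This is not common!)
Defect rates are quantified
Improvement opportunities are identified
Analyzing process capability lets the organization predict the true quality level of all its products and services
It identifies the nature of the process problem: centering or spread
Process capability study
Continuous data:
1. Determine the specification limits
2. Collect data
3. Determine short-term variation
4. Calculate the process capability indices:
a. Short-term:
I Z_U, Z_L
II Cp
III Cpk
IV Sigma level Z_ST
b. Long-term:
I Sigma level Z_LT
II Ppk
Discrete data:
1. Determine the specification limits
2. Collect data
3. Decide: short-term or long-term? (usually long-term)
4. Calculate the process capability indices:
a. Long-term:
I PPM
II Sigma level Z_LT
III Ppk
b. Short-term:
I Sigma level Z_ST
II Cpk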
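Process capability calculation example
A technician is responsible for the steam sterilization process for hospital equipment.
One key parameter is the temperature during the "exposure" phase.
The chamber temperature and the cycle time at minimum saturated steam concentration determine the degree of sterilization.
It is important to maintain a consistent temperature range throughout the chamber.
Step 1: Determine the specification
This step is often overlooked.
How do we set specifications?
Design department: design drawings
How does the design department get its requirements?
Process department: specifications are set by what the process has been able to do in the past, or by its capability when first put into use
Is there anything wrong with this approach?
Customer
Do we always say yes to the customer?
For the example above:
The chamber target temperature is 125°C ± 1.5°C
Step 2: Collect data with rational subgrouping
Data should be collected to capture "short-term" performance and, if possible, "long-term" performance
Collect a series of snapshot-type data at fixed time intervals
Snapshot data should be collected in rational subgroups
What is rational subgrouping?
A method of sampling the parts or product continuously produced by the process so as to capture the minimum process variation
Within-subgroup variation reflects common-cause variation
The pooled standard deviation (averaged on a mean-square basis) is a good estimate of process entitlement
Step 2: Sampling example
Example: the technician takes five readings from the temperature control probe during the exposure cycle, and collects data from seven consecutive sterilization runs. The data are in the ChambTemp column of the file ChamberTemp2.mtw
Step 3: Determine short-term variation
Most existing data fall somewhere between long-term and short-term
To estimate true short-term data:
Design the capability study carefully
Make sure the subgrouping strategy is rational
Short-term data cannot be studied for some processes
Such as low-volume, long-cycle-time processes
Processes where sampling is expensive or difficult
Step 3: Short-term or long-term?
A guideline: if 80% of the input variables are allowed to vary over their natural range, the data are long-term
Short-term and long-term: within-subgroup and between-subgroup
Pooled standard deviation versus total standard deviation
Averaging the subgroup variances gives the average within-subgroup variance; its square root is the pooled standard deviation
The total standard deviation is computed from all the data, ignoring the subgroups
The pooled standard deviation excludes between-subgroup variation, while the total standard deviation includes it
The pooled standard deviation is the best estimate of the within-subgroup standard deviation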
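The difference between the pooled (within-subgroup) and total standard deviation can be sketched as follows. The temperature readings are hypothetical and are not the contents of ChamberTemp2.mtw; equal subgroup sizes are assumed so that a plain average of the subgroup variances is a valid pooling.

```python
import math
import statistics


def pooled_std(subgroups):
    """Square root of the average within-subgroup variance (equal subgroup sizes)."""
    variances = [statistics.variance(group) for group in subgroups]
    return math.sqrt(sum(variances) / len(variances))


def total_std(subgroups):
    """Standard deviation of all readings, ignoring the subgrouping."""
    all_values = [x for group in subgroups for x in group]
    return statistics.stdev(all_values)


# Hypothetical chamber temperatures: 3 runs (subgroups) of 5 readings each.
subgroups = [
    [124.8, 125.0, 125.1, 124.9, 125.2],
    [125.4, 125.6, 125.5, 125.7, 125.3],
    [124.5, 124.6, 124.4, 124.7, 124.3],
]

short_term = pooled_std(subgroups)  # within-run variation only
long_term = total_std(subgroups)    # also includes run-to-run shifts
print(short_term, long_term)
```

Because the run averages differ, the total standard deviation comes out larger than the pooled one; that gap is the between-subgroup variation.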
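Guidelines for long-term and short-term
Short-term
Data collected over a limited number of cycles or intervals
Data collected from a limited number of machines and operators
Almost always continuous variables
Long-term
Data collected over many cycles, intervals, machines, and operators
Can be discrete or continuous data
Discrete data are almost always long-term
Step 4: Calculate Z_U and Z_L:
Z-scores
Provide a statistic for communicating in a common language
Provide a process performance index relative to the upper and lower specification limits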
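Step 4: Calculate Cp
Example
The process mean is 325
The standard deviation is 15
The upper specification limit is 380 and the lower limit is 270
What is Cp?
If the mean is 355 and the standard deviation is unchanged, what is Cp?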
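The example can be worked with the usual textbook definitions, which the text does not spell out and which are therefore assumed here: Z_U = (USL - mean)/σ, Z_L = (mean - LSL)/σ, Cp = (USL - LSL)/(6σ), and Cpk = min(Z_U, Z_L)/3. The function names are illustrative.

```python
def z_scores(mean, sd, lsl, usl):
    """Distance from the mean to each specification limit, in standard deviations."""
    return (usl - mean) / sd, (mean - lsl) / sd


def cp(sd, lsl, usl):
    """Potential capability: spec width over six sigma; ignores centering."""
    return (usl - lsl) / (6 * sd)


def cpk(mean, sd, lsl, usl):
    """Capability index that penalizes an off-center mean."""
    z_u, z_l = z_scores(mean, sd, lsl, usl)
    return min(z_u, z_l) / 3


LSL, USL, SD = 270, 380, 15

for mean in (325, 355):
    print(mean, round(cp(SD, LSL, USL), 3), round(cpk(mean, SD, LSL, USL), 3))
```

Cp does not depend on the mean, so it is the same in both cases; only an index such as Cpk, which uses the distance to the nearer limit, reacts to the shift from 325 to 355.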
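Cp and process entitlement
Cp is a good indicator of process entitlement
Process entitlement: the best short-term performance observed for a process
Opportunity: the gap between long-term process performance and process entitlement
Six Sigma projects: aim to close the gap between long-term performance and process entitlement
Variable measurement system study: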
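Variable gage R&R: model
Measurement system mean: $\mu_{total} = \mu_{process} + \Delta\mu_{measurement\ system}$
Bias: observed value = true value + measurement bias
Assessed through a calibration program (accuracy)
Measurement system variance: $\sigma^2_{total} = \sigma^2_{process} + \sigma^2_{measurement\ system}$
Variation: observed variation = process variation + measurement variation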
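Measurement system metrics:
Gage R&R result -> gage variation ($\sigma_{measurement\ system}$), that is, precision
Precision of the measurement system (P):
Precision includes repeatability and reproducibility
Measurement system metric: P/T
Precision-to-tolerance ratio (P/T)
Represents the fraction of the tolerance taken up by gage variation
Usually expressed as a percentage
Best case: P/T < 10%; acceptable: P/T < 30%
Measurement system metric: P/TV
Precision-to-total-variation ratio
Represents the fraction of the total variation taken up by gage variation
Usually expressed as a percentage
Best case: < 10%; acceptable: < 30%
Measurement system metric: discrimination index
The discrimination index is the number of distinct readings the measurement system can distinguish within the process data
The discrimination index is an indicator of resolution
The discrimination index is a function of repeatability and reproducibility
Best case: > 4; acceptable: 3-4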
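As a rough sketch, the three metrics can be computed from the gage and process standard deviations. The text gives no formulas, so the ones below are assumptions based on common practice: P/T uses a spread multiplier k (6 here; 5.15 is also widely used), P/TV divides the gage sigma by the total sigma from the variance model above, and the discrimination index is taken as √2 times the ratio of process sigma to gage sigma. All input values are hypothetical.

```python
import math


def gage_metrics(sigma_ms, sigma_process, lsl, usl, k=6.0):
    """Return (P/T, P/TV, discrimination index) for a variable gage study."""
    # Total observed variation: process and measurement variances add.
    sigma_total = math.sqrt(sigma_process ** 2 + sigma_ms ** 2)
    p_to_t = k * sigma_ms / (usl - lsl)       # share of the tolerance
    p_to_tv = sigma_ms / sigma_total          # share of the total variation
    discrimination = math.sqrt(2) * sigma_process / sigma_ms
    return p_to_t, p_to_tv, discrimination


# Hypothetical gage: sigma 0.1 against a process sigma of 0.5 and a 3-unit tolerance.
p_to_t, p_to_tv, discrimination = gage_metrics(0.1, 0.5, lsl=123.5, usl=126.5)
print(f"P/T = {p_to_t:.1%}, P/TV = {p_to_tv:.1%}, discrimination = {discrimination:.1f}")
```

With these inputs both ratios fall between the 10% and 30% thresholds, and the discrimination index is above 4.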
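Uses of P/T and P/TV:
P/T (% tolerance)
Most commonly used to assess the precision of a measurement system
Compares the gage precision with the tolerance requirement
P/T is adequate if the gage is used to sort production samples
P/SV (% R&R): preferred in Six Sigma
Measures how the gage performs relative to the gage study variation
Best suited for assessing process improvement
Use with caution.
The gage study variation does not necessarily represent the true process variation
P/TV (% R&R): preferred in Six Sigma
Measures how the gage performs relative to the process variation
Use with caution.
The gage study variation does not necessarily represent the true process variation
When the variation in the gage study samples represents the true process variation, P/TV equals P/SV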
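Variable gage R&R: instructions
1. Calibrate the gage, or confirm that the most recent calibration is still valid
2. Collect 10 samples that represent the full range of process variation
3. Select appraisers from the operators who use this measurement method daily
4. Use Calc > Make Patterned Data to prepare the gage study worksheet
5. Have one operator measure all samples, unlabeled and in random order
6. Have each of the other operators in turn measure all samples, unlabeled and in random order
7. Repeat steps 5 and 6 for three cycles.
Randomize the operator order as far as possible, too
8. Run the following two analyses in Minitab
Stat > Quality Tools > Gage R&R Study (Crossed)
Stat > Quality Tools > Gage Run Chart
9. Analyze the results of the measurement system capability study
10. Determine appropriate follow-up actions
Variable gage R&R: Minitab example
A Black Belt wants to run a gage study on a thermometer used in a metallurgical process. He runs the experiment strictly according to the method described above and enters the data in R&Rexample.xls.
Use Minitab to analyze the data and assess the gage capability
Stat > Quality Tools > Gage R&R Study (Crossed)...
Minitab gage R&R study: options
Enter the process tolerance and variation if you want Minitab to calculate P/T and P/TV.
Minitab calculates P/SV by default
Gage R&R results: ANOVA table
The P-value is the probability of seeing an effect this large if the source of variation actually had no effect on the total variation; a small P-value marks the source as statistically significant
In this example, both part and operator are significant sources of variation
Also, can you use Minitab's calculator to compute the total sum of squares? What does this value represent?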
Analyze Phase
Failure Modes and Effects Analysis (FMEA)
Background:
Failure Modes and Effects Analysis (FMEA)
First developed in the 1950’s
Appropriated by NASA in the 1960’s for the space program
Ford Motor Company was the first North American company to widely implement the use of FMEAs
Types of FMEA
System – Top-level, early stage analysis of complex systems
Design – Systems, subsystems, parts & components early in design stage
Process – Focuses on process flow, sequence, equipment, tooling, gauges, inputs, outputs, set points, etc
Who? When?
Who constructs the FMEA?
The Black Belt is the team leader.
The process owner inherits the finished FMEA.
Use the process mapping, C&E matrix team.
May need to add a rep from quality, a supplier, reliability
When should the FMEA be constructed?
After the process map & the C&E matrix
Before or after the control plan, depending on the maturity
of the process
Why?
Warm up exercise:
You have 60 seconds to document:
What would you want to know about a “defect”?
For the process:
FMEA improves the reliability of the process
An FMEA identifies problems before they occur
FMEA serves as a record of improvement & knowledge
For the future:
FMEA helps evaluate the risk of process changes
FMEA identifies areas for other studies –
multi-vari, ANOVA, DOE
6σ Process FMEA -- Terminology
FMEA: A systematic analysis of a process used to identify potential failures and to prevent their occurrence
Potential Failure mode: The manner in which the process could potentially fail to meet the process requirements.
Potential Failure Effect: The results of the failure mode on the customer.
Severity: An assessment of the seriousness of a failure mode. Severity applies to the effects only.
Cause: How the failure could occur, described in terms of something that can be corrected or controlled.
Occurrence: The likelihood that a specific failure mode is projected to occur.
Detection: The effectiveness of current process controls to identify the failure mode (or the failure effect) prior to occurring, prior to release to production, or prior to shipment to the customer.
RPN -- Risk Priority Number: The product of Severity, Occurrence & Detection
FMEA Examples
Plating Example
An aerospace plating company was shipping product to its customers with nickel plating that was too thin. Parts were failing corrosion testing at the customer.
Shipping Example
The shipping department of an electronics company is unable to ship an assembly without its clam shell protective packaging. This causes occasional late shipments to the customer.
In the following examples, a single line from the FMEA is used as an illustration for each of the above examples.
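The RPN calculation from the terminology list can be sketched as below. The two failure modes are taken from the plating and shipping examples, but the severity, occurrence, and detection ratings are hypothetical placeholders (a 1 to 10 scale is assumed), and the class name is illustrative.

```python
from dataclasses import dataclass


@dataclass
class FmeaLine:
    failure_mode: str
    severity: int    # hypothetical rating, 1 (minor) to 10 (severe)
    occurrence: int  # hypothetical rating, 1 (rare) to 10 (frequent)
    detection: int   # hypothetical rating, 1 (always caught) to 10 (never caught)

    @property
    def rpn(self) -> int:
        """Risk Priority Number: Severity x Occurrence x Detection."""
        return self.severity * self.occurrence * self.detection


fmea = [
    FmeaLine("Clam shell packaging unavailable", severity=5, occurrence=3, detection=2),
    FmeaLine("Nickel plating too thin", severity=8, occurrence=4, detection=6),
]

# Work on the highest-risk lines first.
ranked = sorted(fmea, key=lambda line: line.rpn, reverse=True)
for line in ranked:
    print(line.rpn, line.failure_mode)
```

Sorting by RPN gives the team an ordered list of which failure modes to address first.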
Graphical Methods
Process Variation
Noise variation from discrete inputs
Different operators, machines, setups
Different days, shifts
Different batches, mixtures, raw materials
Noise variation from continuous inputs
Ambient temperature, humidity, pressure
Wear, drift, erosion, chemical depletion
$y = f_{Process}(x_1, x_2, \ldots, x_k) + f_{Noise}(n_1, n_2, \ldots, n_k)$
The first term is the intentional part; the second is the unwanted part.
The equation just means that any output is determined by the intentional process settings and the unwanted noise variation.
Common Classification of Noise Variables
Positional (within part variation)
Variation within a single production unit
Thickness variation across a plated part
Variation across a unit containing many parts
Variation across a semiconductor wafer with many die
Variation by position in a batch process
Cavity-to-cavity variations in an injection molding operation
Cyclical (part-to-part variation)
Variation between consecutive production units
Batch-to-batch average differences – consecutive batches
Temporal (time-to-time variation)
Shift-to-shift, Day-to-Day, Setup-to-setup
Variation not accounted for by Positional or Cyclical
$\sigma^2_{Noise} = \sigma^2_{Positional} + \sigma^2_{Cyclical} + \sigma^2_{Temporal}$
Graphical Analysis – Example
Injection molding is used to make a type of socket, four pieces at a time, one piece per slot. Measurements of the sockets consist of thickness values in excess of 5.00 millimeters. The gauges measure in hundredths of a millimeter. The specification is 11 ± 6.
Four times a day the supervisor would go to the press and gather up the parts produced by five consecutive cycles of the press. Since each cycle produced four parts, he would have 20 parts to measure every two hours.
The supervisor kept track of the cycle and the cavity from which each part came and wrote his twenty measurements in an array like this:
     A   B   C   D   E
S1  18  19  20  19  21
S2  13  16  14  13  13
S3  10  11  13  10  13
S4  11  12  13  13  13
The supervisor collected samples four times a day for five days (20 samples total, 20 parts per sample). Calculate the process capability and use a Multi-Vari chart to help determine sources of variation.
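A first look at where the variation in the sample array above comes from can be sketched by comparing row and column averages. This assumes that rows S1 to S4 are the four slots (cavities) and columns A to E are the five consecutive cycles; the text does not label them explicitly. It is a numeric stand-in for one panel of a multi-vari chart, not the Minitab analysis itself.

```python
from statistics import mean

cycles = ["A", "B", "C", "D", "E"]
thickness = {  # one sample: 4 slots x 5 consecutive cycles
    "S1": [18, 19, 20, 19, 21],
    "S2": [13, 16, 14, 13, 13],
    "S3": [10, 11, 13, 10, 13],
    "S4": [11, 12, 13, 13, 13],
}

# Positional view: average per slot across the five cycles.
slot_means = {slot: mean(values) for slot, values in thickness.items()}
# Cyclical view: average per cycle across the four slots.
cycle_means = {
    cycle: mean(thickness[slot][i] for slot in thickness)
    for i, cycle in enumerate(cycles)
}

slot_range = max(slot_means.values()) - min(slot_means.values())
cycle_range = max(cycle_means.values()) - min(cycle_means.values())
print(slot_means, cycle_means, slot_range, cycle_range)
```

In this one sample the slot averages spread much more widely than the cycle averages, which points at slot-to-slot (positional) variation as the first thing to examine on the multi-vari chart.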
Exercise: Determine Capability
Using Minitab, analyze the Thick data
in SocketData.mtw for process capability
Remember, the specifications are: 11 ± 6
What is the short-term process capability?
What is the long-term process capability?
Are these good or bad values?
Remember, one goal of Six Sigma is to
reduce variation, which will increase
capability. It is always important to
understand the process capability.
Preparing Data for Marginal Plot by “Slot”
Marginal plots require both variables to be defined numerically
We need to convert “Slot” to a numeric column first
Step 1: Convert “Slot”
Manip > Code > Text to Numeric
Multi-Vari Analysis – Defined
A graphical analysis tool
Uses logical sub-grouping
Analyzes the effects of discrete X’s on continuous Y’s
A capability and process analysis tool
Data collected for a relatively short time
Data can estimate capability, stability, and y = f(x)’s
Major focus: study uncontrolled noise variation first
Variation in noise variables produces chronic and acute
mean shifts, changes in variability, and instability
Noise variation must be reduced or eliminated in order to leverage the important controllable variables systematically
Multi-vari analysis is a very useful tool for graphically identifying sources of variation, especially noise variation. Later this week, we will be studying correlation & regression (an analysis of the effect of continuous X’s on continuous Y’s), analysis of variance (ANOVA) and the General Linear Model (GLM), both numerical analyses of variance data.
Multi-vari analyses will help identify the variation sources with the purpose of reducing or eliminating them.
A Multi-Vari Plan
1. Clearly state the objective
2. List the X’s and Y’s to be studied
3. Ensure measurement system capability
4. Describe the sampling plan
5. Describe the data collection & storage plan (who, what, when, etc.)
6. Describe the procedure and settings used to run the process
7. Assemble and train the team. Define responsibilities
8. Collect the data
9. Analyze the data
10. Verify the results
11. Draw conclusions. Report results. Make recommendations
Injection Molding Example
1. Clearly state the objective
Determine the process capability of the injection molding process
Determine the major sources of noise variation
2. List the X’s and Y’s to be studied
Output: Thickness
Inputs: Cavity (slot), cycle, sample
3. Ensure measurement system capability
An MSA was conducted and the system was found capable
4. Describe the sampling plan
One sample from each slot, five consecutive runs, four times a
day for five days.
5. Describe the data collection & storage plan (who, what, when, where, etc.)
The supervisor collected the data and entered it in a worksheet
6. Describe the procedure and settings used to run the process
Standard, constant process settings.
7. Assemble and train the team. Define responsibilities.
For a small project, the supervisor did all the work
8. Collect the data.
The data are in Minitab worksheet SocketData.mtw
9. Analyze the data
Analysis is on the following slides
Central Limit Theorem
Q: Why Are So Many Distributions Normal?
Why is something this complicated so common?
Many variables that result from the combined effect of a large number of small, independent random influences are approximately normally distributed.
Another reason why some distributions are normally distributed is that measurements are actually averages over time of many sub-measurements. The single measurement that we think we are making is actually the average (or sum) of many measurements. The Central Limit Theorem, discussed in the following slides, provides an explanation of why averages of non-normal data appear normal.
Dice Demonstration (Integer Distribution)
What does a probability distribution
from a single die look like?
What is the mean?
What is the standard deviation?
Construct a dataset in Minitab
Select Calc > Random Data > Integer… from the main menu
Generate 1,000 rows of data in C1: Min = 1, Max = 6
Use Minitab’s Graphical Summary routine for analysis
Stat > Basic Statistics > Display Descriptive Statistics…
Minitab Output (Typical)
The probability distribution of the possible outcomes of the roll of a single die is obviously non-normal.
A perfect distribution would have had all six bars exactly equal, but even with 10,000 data points, there are still some differences in the histogram. If a better estimate is required, a different data set could be constructed with exactly equal counts of each possible outcome. Try it and see if the numbers are any different.
Sampling a Non-normal Distribution – Exercise
Each person in the class is to toss a single die sixteen times and record the data.
Calculate the mean and standard deviation of each sample of sixteen
Record the means and standard deviations from each person in the class in a Minitab worksheet
Use Minitab’s Graphical Summary routine for analysis
Stat > Basic Statistics > Display Descriptive Statistics…
Alternately, a sample of sixteen throws of the dice can be simulated in Minitab as follows:
Select: Calc > Random Data > Integer… from the main menu
Generate 16 rows of data in C1: Min = 1, Max = 6
Analyze the Sample Data
What is the mean of the sample averages?
Mean ≈3.5
What is the standard deviation of the sample averages?
Sigma ≈0.4
Is the distribution normal?
What is the p-value?
What is the relationship between the average of the
sample means and the population average?
What is the relationship between the sigma of the
averages and the sigma of the individuals?
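The answers quoted above (mean ≈ 3.5, sigma ≈ 0.4) can be checked from first principles, and the classroom exercise can be simulated. This is a sketch in Python rather than Minitab; the seed and the number of simulated "students" are arbitrary choices.

```python
import math
import random
import statistics

faces = range(1, 7)

# Exact values for a single fair die.
mu = sum(faces) / 6                                      # population mean
sigma = math.sqrt(sum((f - mu) ** 2 for f in faces) / 6)  # population sigma

# Sigma of the average of 16 throws: sigma of individuals over sqrt(n).
sigma_of_means = sigma / math.sqrt(16)

# Simulate 2,000 people each throwing a die sixteen times.
rng = random.Random(0)
sample_means = [
    statistics.mean(rng.randint(1, 6) for _ in range(16)) for _ in range(2000)
]
print(mu, round(sigma, 3), round(sigma_of_means, 3))
print(statistics.mean(sample_means), statistics.stdev(sample_means))
```

The average of the sample means sits near the population mean, and their standard deviation is about one quarter of the sigma of a single throw, since √16 = 4.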
The Central Limit Theorem
Formal Definition:
If random samples of n measurements are repeatedly drawn from a population with a finite mean $\mu$ and a standard deviation $\sigma$, then, when n is large, the relative frequency histogram for the sample means (calculated from the repeated samples) will be approximately normal with a mean $\mu$ and a standard deviation equal to the population standard deviation, $\sigma$, divided by the square root of n.
(Note: The approximation becomes more precise as n increases.)
Central Limit Theorem – Exercise
From a Minitab analysis of the uniformly distributed data:
For an exercise, verify that the Central Limit Theorem is valid for this uniform data
Variable N Mean StDev
n=1 (Individuals) 10000 -0.00331 0.57918
n=2 (Means) 10000 0.00259 0.40613
n=5 (Means) 10000 -0.00113 0.25953
n=30 (Means) 10000 -0.00237 0.10559
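One way to do the verification is to divide the standard deviation of the individuals by √n and compare the result with the StDev column for each row of means. The sketch below uses only the numbers from the table above; the variable names are illustrative.

```python
import math

sigma_individuals = 0.57918  # StDev of the n=1 row
observed = {2: 0.40613, 5: 0.25953, 30: 0.10559}  # StDev of the means rows

# Central Limit Theorem prediction: sigma of the means = sigma / sqrt(n).
predicted = {n: sigma_individuals / math.sqrt(n) for n in observed}
relative_error = {
    n: abs(predicted[n] - observed[n]) / observed[n] for n in observed
}

for n in observed:
    print(n, round(predicted[n], 5), observed[n], f"{relative_error[n]:.2%}")
```

The predicted and observed values agree to within about one percent, which is the size of discrepancy expected from sampling noise in simulated data.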
Regression & Correlation
Introduction
Used for quantitative variables (X’s and Y’s)
For review: What is the focus of Six Sigma?
$Y = f(x)$
Q. What does this equation represent?
A. A mathematical model of a process
Purpose of Regression: to predict Y from a setting of x
Examples:
Distance = f(acceleration, initial velocity, time)
Product yield = f(concentrations of reactants)
Hardness = f(alloy, anneal temperature)
Remember, the focus of Six Sigma is to
determine the defining equation of the
process. It is to identify the important input
variables, determine the relationship to the
outputs, determine the optimum values of the
critical inputs and then control the inputs at
the optimum settings.
To do this, the Black Belt must know the
relationship between the inputs and the
outputs. This module discusses linear
modeling techniques for identifying the
relationship between continuous variable
inputs and continuous variable outputs.
A Simple Linear Model
Linear equations require continuous input
and output variables. One other assumption is
that the independent variable (input) is known
and fixed and that all of the variation is in the
dependent variable (output). This is not
usually the case, but often the inputs are
settings on dials or gauges or software that
seems fixed and invariable. Many times the
variation in the output is a function of the
inability of the input controller to hold the
input at the same value.
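A simple linear model of the kind described above, y = b0 + b1·x, can be fitted by ordinary least squares. The sketch below is a generic textbook fit, not the Minitab routine used in the course, and the x and y values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: input setting x (treated as fixed) and measured output y.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
predicted_at_6 = intercept + slope * 6  # predict Y for a new setting of x
print(intercept, slope, predicted_at_6)
```

All of the scatter is attributed to y, which matches the assumption in the text that the input is known and fixed.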
Collecting Data (y & x) – A Few Thoughts
Pg 8 ?March 01, Breakthrough Management Group. Unpublished proprietary work available only under license. All rights reserved. March 16, 2001
Make sure the process settings cover the likely production range (but not too far).
Too great a range: points outside the normal range may have too great an effect on the model.
Too small a range: the error term may dominate the fit.
Take several replicates at each input setting (x).
Replicate runs help increase the model accuracy.
Randomize runs whenever practical.
Run order is often a significant factor.
The output (y) at different inputs (x’s) is not always independent of previous settings.
A good spread in the data is required for a
good model. Consider two examples:
All of the data is collected at the normal
process settings. In this case, regression will
try to fit a linear model to a combination of
random process variation and random
measurement variation. The results will be of
no value.
The second case is when most of the data
is clustered around the standard settings
except for a couple of points at the extreme
ranges. In this case, the extreme points
control the fit of the model. If one of the
extreme points is a flyer, then the model will
be in error due to the flyer.
The ideal case is for the Black Belt to
collect a range of data throughout the process
space.
Confidence Intervals
A population is the set of all measurements of interest to the experimenter
A sample is a subset of measurements selected from the population
An inference is a statement about a population parameter based on information contained in a sample
Two types of inference
Estimation
A poll has been devised to determine the public’s reaction to a
new political scandal. The purpose is to estimate the reaction
of all Americans by polling a representative sample
Hypothesis testing
A vaccine for Lyme disease has been developed but the rate
of negative side effects is 1.45%. A new vaccine has been
developed and it is desired to know if the rate of negative side
effects is lower than 1.45%.
The other branch of statistics is
descriptive. Its purpose is merely to
describe a set of measurements.
Inferential statistics is used to guess what
God knows about a population from a sample.
Within inferential statistics, there are two
types: estimation and hypothesis testing.
Estimation is trying to guess the population parameters from a sample. Hypothesis testing concerns evaluating a sample statistic and comparing it to some hypothetical value.
Estimates and the CLT
What is the best estimate of the population mean using sample data? The sample mean!
How good of an estimate is the sample mean?
What factors influence the accuracy of the estimate of the mean from sample data?
Recall that:
The variation in the distribution of sample means is a function of the variance of the Population and the sample size!
$\sigma_{\bar{X}} = \sigma_{Pop} / \sqrt{n}$
What About Small Samples?
If the population standard deviation is known (it almost never is) use
the previous formula for small samples, too
If the population sigma is unknown (it usually is):
The estimate for standard deviation (s) is used
The t-distribution is used instead of the normal (Z) distribution
Q: What is a t-distribution?
The t-distribution is a family of bell-shaped (normal-like)
distributions that are dependent on sample size
The smaller the sample size n, the wider and flatter the
distribution
$\bar{X} - t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}}$
The t-distribution is the general case for
any sample where the population standard
deviation is unknown. However, with large
samples, the t- and z-distributions are nearly
identical, so either can be used.
You can verify this in Minitab by
generating a large sample of normal data and
then analyzing it with both the z- and t-
distribution routines.
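A small-sample confidence interval for the mean, built from the sample standard deviation and a t critical value, might be sketched as follows. The ten readings are hypothetical, and the critical value 2.262 (two-sided 95%, 9 degrees of freedom) is copied from a standard t table rather than computed, to keep the example free of external libraries.

```python
import math
import statistics

T_CRIT_95_DF9 = 2.262  # t(alpha/2 = 0.025, n - 1 = 9), from a t table


def t_interval(sample, t_crit):
    """Confidence interval for the mean: x_bar +/- t * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample estimate of sigma
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical sample of n = 10 measurements.
sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 9.7, 10.3, 10.1]

low, high = t_interval(sample, T_CRIT_95_DF9)
print(round(low, 3), round(high, 3))
```

For a different sample size or confidence level, the critical value has to be looked up again, since it depends on both α and n - 1.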
Proportions and Binomial Experiments
Proportion data is usually the result of a binomial-type experiment
Binomial experiments (or Bernoulli trials) are those that have only one of two outcomes, either a “success” or a “failure”
The probability of this type of experiment is described by a binomial distribution, a complicated distribution
In many cases the normal distribution can be used to approximate the binomial distribution
When $n p > 5$ and $n (1 - p) > 5$
$\mu = n p$ and $\sigma^2 = n p (1 - p)$
Binomial distributions are discussed in almost every statistics textbook. Calculations with them are not necessarily difficult, but they are tedious if done manually. Minitab has routines, however, that greatly simplify the calculations.
If the normal approximation applies, so that the data can be modeled with a normal distribution, other statistical tests and control charts can be used that would not be available otherwise.
Try to construct your experiments such that the normal approximation to the binomial is valid.
A general rule of thumb: for the normal approximation to apply, have a sample size of at least 30 and large enough to guarantee at least 5 successes and at least 5 failures.
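The rule for when the normal approximation applies, and the resulting mean and sigma, can be sketched as a small check. The function name and the two example cases are illustrative only.

```python
import math


def normal_approximation(n, p):
    """Check n*p > 5 and n*(1-p) > 5, and return the approximating mean and sigma."""
    applies = n * p > 5 and n * (1 - p) > 5
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return applies, mu, sigma


# Hypothetical cases: a 5% defect rate with 200 parts versus only 30 parts.
large = normal_approximation(200, 0.05)
small = normal_approximation(30, 0.05)
print(large, small)
```

With 30 parts at a 5% rate, only 1.5 successes are expected, so the sample size alone does not make the approximation valid.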
Introduction to Hypothesis Testing
A Bright Idea
A light bulb company is trying to produce a brighter light bulb for the same energy. It is hoped that a change in the filament coating process will produce a brighter light.
The engineer collected the last ten light bulbs made before the process change and the first ten after the change. The mean light output of the old process bulbs is 1251 lumens and the new process is 1273 lumens.
Does the increase of 22 in the means of the two groups represent a real improvement?
Could the difference between these two groups have happened by random chance?
Should the engineer switch to the new process?
These kinds of problems are very
familiar to engineers. An engineer is
given a task to improve a process or
product. After a change in the process,
the engineer is left with the problem of
determining whether the process change
has made a significant improvement or
not. Though engineers often use more
advanced techniques to determine the
improved settings (DOE, for example, to
be discussed later), a hypothesis test is
often used to verify the experiment
results.
The process may be as follows:
Identify the problem.
Design and run an experiment to find an improved condition.
Analyze the data and determine the