Artificial Intelligence and Data Mining — Teaching Slides (lect.ppt)
3.1 Extracting Classification Rules from Trees
• Represent the knowledge in the form of IF-THEN rules
• One rule is created for each path from the root to a leaf
1 Example (2): Output: A Decision Tree for “buys_computer”
age?
• <=30: student?
  – no: buys_computer = no
  – yes: buys_computer = yes
• 31…40: buys_computer = yes
• >40: credit_rating?
  – excellent: buys_computer = no
  – fair: buys_computer = yes
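Encoding the tree above as nested dicts (an assumed representation, not the lecture's code) makes the "one rule per root-to-leaf path" idea from section 3.1 concrete — a minimal sketch:

```python
# Assumed encoding: internal nodes are {"attribute": {branch_value: subtree, ...}},
# leaves are plain class labels.
tree = {
    "age": {
        "<=30": {"student": {"no": "no", "yes": "yes"}},
        "31…40": "yes",
        ">40": {"credit_rating": {"excellent": "no", "fair": "yes"}},
    }
}

def extract_rules(node, conditions=()):
    """Yield one IF-THEN rule for each path from the root to a leaf."""
    if not isinstance(node, dict):           # leaf: emit the accumulated path
        antecedent = " AND ".join(f'{a} = "{v}"' for a, v in conditions)
        yield f'IF {antecedent} THEN buys_computer = "{node}"'
        return
    (attribute, branches), = node.items()    # exactly one test attribute per node
    for value, subtree in branches.items():
        yield from extract_rules(subtree, conditions + ((attribute, value),))

for rule in extract_rules(tree):
    print(rule)   # prints the five rules listed on the rules slide
```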
• Note: Test set is independent of training set, otherwise over-fitting will occur
• 2. Model usage: use the model to classify future or unknown objects
– The amount of information needed to decide if an arbitrary example in S belongs to P or N is defined as

  I(p, n) = -\frac{p}{p+n} \log_2 \frac{p}{p+n} - \frac{n}{p+n} \log_2 \frac{n}{p+n}
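To make the definition concrete, here is a minimal Python sketch (an illustration, not from the slides) of I(p, n); it reproduces the I(9, 5) = 0.940 value used in the worked example later in the deck:

```python
from math import log2

def info(p, n):
    """I(p, n): expected information (entropy) for p positive and n negative examples."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:                   # 0 * log2(0) is taken as 0
            result -= count / total * log2(count / total)
    return result

print(round(info(9, 5), 3))         # 0.94, matching I(9, 5) = 0.940
```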
– At start, all the training examples are at the root
– Attributes are categorical (if continuous-valued, they are discretized in advance)
– Examples are partitioned recursively based on selected attributes
– Estimate accuracy of the model
• The known label of test sample is compared with the classified result from the model
• Accuracy rate is the percentage of testing set samples that are correctly classified by the model
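A tiny sketch of the accuracy-rate computation just described (the function and labels are illustrative, not from the slides):

```python
def accuracy_rate(true_labels, predicted_labels):
    """Percentage of test-set samples whose predicted class matches the known label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

# Hypothetical labels, for illustration only:
print(accuracy_rate(["no", "yes", "yes", "no"], ["no", "yes", "no", "no"]))  # 75.0
```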
– Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain)
• Conditions for stopping partitioning
– All samples for a given node belong to the same class
Attribute Selection by Information Gain Computation
Class P: buys_computer = “yes” (9 examples)
Class N: buys_computer = “no” (5 examples)
Hence I(p, n) = I(9, 5) = 0.940
Compute the entropy for age:

age      pi   ni   I(pi, ni)
<=30     2    3    0.971
31…40    4    0    0
>40      3    2    0.971

E(age) = (5/14) I(2,3) + (4/14) I(4,0) + (5/14) I(3,2) = 0.694

Gain(age) = I(p, n) - E(age) = 0.940 - 0.694 = 0.246

Similarly:
Gain(income) = 0.029
Gain(student) = 0.151
Gain(credit_rating) = 0.048
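The following sketch (an illustration, not part of the original slides) recomputes these gains from the buys_computer training set shown on the "Example (1): Training Dataset" slide; entropy() generalizes I(p, n) to a list of class labels:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """I(p, n) generalized to any list of class labels."""
    total = len(labels)
    return sum(-c / total * log2(c / total) for c in Counter(labels).values())

def gain(rows, attribute, target="buys_computer"):
    """Gain(A) = I(p, n) - E(A) for one candidate splitting attribute."""
    base = entropy([r[target] for r in rows])
    expected = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        expected += len(subset) / len(rows) * entropy(subset)
    return base - expected

cols = ("age", "income", "student", "credit_rating", "buys_computer")
data = [
    ("<=30", "high", "no", "fair", "no"),        ("<=30", "high", "no", "excellent", "no"),
    ("31…40", "high", "no", "fair", "yes"),      (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),        (">40", "low", "yes", "excellent", "no"),
    ("31…40", "low", "yes", "excellent", "yes"), ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),       (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"), ("31…40", "medium", "no", "excellent", "yes"),
    ("31…40", "high", "yes", "fair", "yes"),     (">40", "medium", "no", "excellent", "no"),
]
rows = [dict(zip(cols, r)) for r in data]

for a in cols[:-1]:
    print(a, round(gain(rows, a), 3))
# age 0.247, income 0.029, student 0.152, credit_rating 0.048
# (the slide's 0.246 and 0.151 come from rounding intermediate values)
```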
3.2 Rules simplification and elimination
A Rule for the Tree in Figure 3.4
IF Age <=43 & Sex = Male & Credit Card Insurance = No THEN Life Insurance Promotion = No (accuracy = 75%, Figure 3.4)
2 Algorithm for Decision Tree Building
• Basic algorithm (a greedy algorithm)
– Tree is constructed in a top-down recursive divide-and-conquer manner
– This set of examples is used for model construction: training set
– The model can be represented as classification rules, decision trees, or mathematical formulae
Classifier (Model)
IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
IF age = “31…40”
THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “excellent” THEN buys_computer = “no”
IF age = “>40” AND credit_rating = “fair” THEN buys_computer = “yes”
3. Decision Tree Rules
• Automate rule creation
• Rules simplification and elimination
• A default rule is chosen
A Simplified Rule Obtained by Removing Attribute Age
IF Sex = Male & Credit Card Insurance = No THEN Life Insurance Promotion = No (accuracy = 83.3% (5/6), Figure 3.5)
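A sketch of how a rule's accuracy can be checked before and after dropping a precondition; since Figure 3.4's table is not reproduced in these notes, the rows below are hypothetical stand-ins:

```python
def rule_accuracy(rows, antecedent, consequent):
    """Accuracy of a rule: of the examples its antecedent covers,
    the fraction whose class matches the consequent."""
    covered = [r for r in rows if antecedent(r)]
    return sum(consequent(r) for r in covered) / len(covered) if covered else None

# Hypothetical stand-in data (not the Figure 3.4 dataset):
rows = [
    {"age": 35, "sex": "Male", "cc_insurance": "No", "promo": "No"},
    {"age": 29, "sex": "Male", "cc_insurance": "No", "promo": "No"},
    {"age": 45, "sex": "Male", "cc_insurance": "No", "promo": "Yes"},
]

is_no_promo = lambda r: r["promo"] == "No"
full_rule  = lambda r: r["age"] <= 43 and r["sex"] == "Male" and r["cc_insurance"] == "No"
simplified = lambda r: r["sex"] == "Male" and r["cc_insurance"] == "No"   # Age dropped

print(rule_accuracy(rows, full_rule, is_no_promo))    # 1.0 on this toy data
print(rule_accuracy(rows, simplified, is_no_promo))   # ~0.67: coverage grows, accuracy shifts
```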
Classification Process (2): Use the Model in Prediction
Classifier
Testing Data
Unseen Data
NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes

Unseen data: (Jeff, Professor, 4) → Tenured?
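A sketch (illustrative, not from the slides) that applies the rule learned on the model-construction slide to this test set and then to the unseen tuple:

```python
def predict_tenured(record):
    # The learned rule (case normalized for this sketch):
    # IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
    return "yes" if record["rank"] == "Professor" or record["years"] > 6 else "no"

test = [
    {"name": "Tom",     "rank": "Assistant Prof", "years": 2, "tenured": "no"},
    {"name": "Merlisa", "rank": "Associate Prof", "years": 7, "tenured": "no"},
    {"name": "George",  "rank": "Professor",      "years": 5, "tenured": "yes"},
    {"name": "Joseph",  "rank": "Assistant Prof", "years": 7, "tenured": "yes"},
]
correct = sum(predict_tenured(r) == r["tenured"] for r in test)
print(f"accuracy = {100 * correct / len(test):.0f}%")  # 75%: Merlisa (years = 7) is misclassified

print(predict_tenured({"name": "Jeff", "rank": "Professor", "years": 4}))  # yes
```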
1 Example (1): Training Dataset
An example from Quinlan's ID3 (1986):

age     income  student  credit_rating  buys_computer
<=30    high    no       fair           no
<=30    high    no       excellent      no
31…40   high    no       fair           yes
>40     medium  no       fair           yes
>40     low     yes      fair           yes
>40     low     yes      excellent      no
31…40   low     yes      excellent      yes
<=30    medium  no       fair           no
<=30    low     yes      fair           yes
>40     medium  yes      fair           yes
<=30    medium  yes      excellent      yes
31…40   medium  no       excellent      yes
31…40   high    yes      fair           yes
>40     medium  no       excellent      no
• Rules are easier for humans to understand
• Example
IF age = “<=30” AND student = “no” THEN buys_computer = “no”
IF age = “<=30” AND student = “yes” THEN buys_computer = “yes”
Chapter 3 Basic Data Mining Techniques
3.1 Decision Trees
(For classification)
Introduction: Classification—A Two-Step Process
• 1. Model construction: build a model that can describe a set of predetermined classes
Classification Process (1): Model Construction
Training Data
Classification Algorithms
NAME   RANK            YEARS  TENURED
Mike   Assistant Prof  3      no
Mary   Assistant Prof  7      yes
Bill   Professor       2      yes
Jim    Associate Prof  7      yes
Dave   Assistant Prof  6      no
Anne   Associate Prof  3      no
– There are no remaining attributes for further partitioning – majority voting is employed for classifying the leaf
– There are no samples left
– Reach the pre-set accuracy
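A compact sketch (my illustration, not the lecture's code) of the greedy top-down divide-and-conquer induction just described, with the first two stopping conditions made explicit; it assumes the gain() helper and rows list from the attribute-selection sketch earlier are in scope:

```python
from collections import Counter

def majority(labels):
    """Majority voting for leaves that are not pure."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, attributes, target="buys_computer"):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:        # all samples at this node belong to the same class
        return labels[0]
    if not attributes:               # no attributes left for partitioning: majority voting
        return majority(labels)
    best = max(attributes, key=lambda a: gain(rows, a, target))  # highest information gain
    subtree = {}
    for value in {r[best] for r in rows}:   # only observed values, so no empty branch arises
        subset = [r for r in rows if r[best] == value]
        subtree[value] = build_tree(subset, [a for a in attributes if a != best], target)
    return {best: subtree}

# build_tree(rows, ["age", "income", "student", "credit_rating"])
# reproduces the buys_computer tree shown earlier.
```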
– Preparation: Each tuple/sample is assumed to belong to a predefined class, labeled by the output attribute or class label attribute
Information Gain (信息增益)(ID3/C4.5)
• Select the attribute with the highest information gain
• Assume there are two classes, P and N
– Let the set of examples S contain p elements of class P and n elements of class N
Information Gain in Decision Tree Building
• Assume that using attribute A, a set S will be partitioned into sets {S1, S2, …, Sv}
– If Si contains pi examples of P and ni examples of N, the entropy, or the expected information needed to classify objects in all subsets Si, is

  E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n} I(p_i, n_i)

• The encoding information that would be gained by branching on A:

  Gain(A) = I(p, n) - E(A)
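As a closing check (illustrative, not from the slides), evaluating E(age) and Gain(age) with these formulas directly from the (pi, ni) counts on the attribute-selection slide:

```python
from math import log2

def info(p, n):
    """I(p, n) as defined earlier in the deck."""
    t = p + n
    return sum(-c / t * log2(c / t) for c in (p, n) if c)

branches = [(2, 3), (4, 0), (3, 2)]   # (pi, ni) for age = <=30, 31…40, >40
p, n = 9, 5

e_age = sum((pi + ni) / (p + n) * info(pi, ni) for pi, ni in branches)
print(round(e_age, 3))                # 0.694
print(round(info(p, n) - e_age, 3))   # 0.247 (the slide's 0.246 comes from rounded terms)
```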