人工智能 贝叶斯网络
合集下载
相关主题
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
Artificial Intelligence: Bayesian Networks
1
Graphical Models
• If no assumption of independence is made, then an exponential number of parameters must be estimated for sound probabilistic inference. • No realistic amount of training data is sufficient to estimate so many parameters. • If a blanket assumption of conditional independence is made, efficient training and inference is possible, but such a strong assumption is rarely warranted. • Graphical models use directed or undirected graphs over a set of random variables to explicitly specify variable dependencies and allow for less restrictive independence assumptions while limiting the number of parameters that must be estimated.
– 1) Compute the joint distribution using this equation. – 2) Compute any desired conditional probability using the joint distribution.
6
Naï ve Bayes as a Bayes Net
5
Joint Distributions for Bayes Nets
• A Bayesian Network implicitly defines a joint distribution.
P( x1 , x2 ,...xn ) P( xi | Parents ( X i ))
i 1 n
– For example, Burglary and Earthquake should be considered independent since they both cause Alarm.
8
Independencies in Bayes Nets
• If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S) • P( However, is a criteria for Xi | Xj, S)this = P( Xitoo | S)strict , is equivalent to conditional independence since two P( Xi , Xj | S) = P( Xi | S ) P(nodes Xj | S) will still be considered independent if their simply exists some How prove? variableto that depends on both.
• Each node has a conditional probability table (CPT) that gives the probability of each of its values given every possible combination of values for its parents (conditioning case).
Burglary Alarm JohnCalls MaryCalls
???
Earthquake
John calls 90% of the time there is an Alarm and the Alarm detects 94% of Burglaries so people generally think it should be fairly high.
Alarm
A T F P(J) .90 .05
T F F
JohnCalls
MaryCalls
T F
CPT Comments
• Probability of false not given since rows must add to 1. • Example requires 10 parameters rather than 25–1 = 31 for specifying the full joint distribution. • Number of parameters in the CPT for a node is exponential in the number of parents (fan-in).
11
Bayes Net Inference
• Given known values for some evidence variables, determine the posterior probability of some query variables. • Example: Given that John calls, what is the probability that there is a Burglary?
– For example, if we know the alarm went off that someone called about the alarm, then it makes earthquake and burglary dependent since evidence for earthquake decreases belief in burglary. and vice versa.
However, this ignores the prior probability of John calling.
12
BBaidu Nhomakorabeayes Net Inference
• Example: Given that John calls, what is the probability that there is a Burglary?
– For example, Burglary and Earthquake should be considered independent since they both cause Alarm.
9
Independencies in Bayes Nets
• If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S) • However, this is too strict a criteria for conditional independence since two nodes will still be considered independent if their simply exists some variable that depends on both.
– Bayesian Networks: Directed acyclic graphs that indicate causal structure. – Markov Networks: Undirected graphs that capture general dependencies.
2
Bayesian Networks
• If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S) • However, this is too strict a criteria for conditional independence since two nodes will still be considered independent if their simply exists some variable that depends on both.
• Naï ve Bayes is a simple Bayes Net
Y
X1
X2
…
Xn
• Priors P(Y) and conditionals P(Xi|Y) for Naï ve Bayes provide CPTs for the network.
7
Independencies in Bayes Nets
– Roots (sources) of the DAG that have no parents are given prior probabilities.
P(B) .001
Burglary
Earthquake
B E P(A)
P(E) .002
T
T
F T F
.95
.94 .29 .001 A P(M) .70 .01 4
• Example
P( J M A B E ) P( J | A) P(M | A) P( A | B E) P(B) P(E)
0.9 0.7 0.001 0.999 0.998 0.00062
• Therefore an inefficient approach to inference is:
• Directed Acyclic Graph (DAG)
– Nodes are random variables – Edges indicate causal influences
Burglary Alarm
Earthquake
JohnCalls
MaryCalls
3
Conditional Probability Tables
– For example, if we know nothing else, Earthquake and Burglary are independent.
• However, if we have information about a common effect (or descendent thereof) then the two “independent” causes become probabilistically linked since evidence for one cause can “explain away” the other.
– For example, Burglary and Earthquake should be considered independent since they both cause Alarm.
10
Independencies in Bayes Nets (cont.)
• Unless we know something about a common effect of two “independent causes” or a descendent of a common effect, then they can be considered independent.
1
Graphical Models
• If no assumption of independence is made, then an exponential number of parameters must be estimated for sound probabilistic inference. • No realistic amount of training data is sufficient to estimate so many parameters. • If a blanket assumption of conditional independence is made, efficient training and inference is possible, but such a strong assumption is rarely warranted. • Graphical models use directed or undirected graphs over a set of random variables to explicitly specify variable dependencies and allow for less restrictive independence assumptions while limiting the number of parameters that must be estimated.
– 1) Compute the joint distribution using this equation. – 2) Compute any desired conditional probability using the joint distribution.
6
Naï ve Bayes as a Bayes Net
5
Joint Distributions for Bayes Nets
• A Bayesian Network implicitly defines a joint distribution.
P( x1 , x2 ,...xn ) P( xi | Parents ( X i ))
i 1 n
– For example, Burglary and Earthquake should be considered independent since they both cause Alarm.
8
Independencies in Bayes Nets
• If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S) • P( However, is a criteria for Xi | Xj, S)this = P( Xitoo | S)strict , is equivalent to conditional independence since two P( Xi , Xj | S) = P( Xi | S ) P(nodes Xj | S) will still be considered independent if their simply exists some How prove? variableto that depends on both.
• Each node has a conditional probability table (CPT) that gives the probability of each of its values given every possible combination of values for its parents (conditioning case).
Burglary Alarm JohnCalls MaryCalls
???
Earthquake
John calls 90% of the time there is an Alarm and the Alarm detects 94% of Burglaries so people generally think it should be fairly high.
Alarm
A T F P(J) .90 .05
T F F
JohnCalls
MaryCalls
T F
CPT Comments
• Probability of false not given since rows must add to 1. • Example requires 10 parameters rather than 25–1 = 31 for specifying the full joint distribution. • Number of parameters in the CPT for a node is exponential in the number of parents (fan-in).
11
Bayes Net Inference
• Given known values for some evidence variables, determine the posterior probability of some query variables. • Example: Given that John calls, what is the probability that there is a Burglary?
– For example, if we know the alarm went off that someone called about the alarm, then it makes earthquake and burglary dependent since evidence for earthquake decreases belief in burglary. and vice versa.
However, this ignores the prior probability of John calling.
12
BBaidu Nhomakorabeayes Net Inference
• Example: Given that John calls, what is the probability that there is a Burglary?
– For example, Burglary and Earthquake should be considered independent since they both cause Alarm.
9
Independencies in Bayes Nets
• If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S) • However, this is too strict a criteria for conditional independence since two nodes will still be considered independent if their simply exists some variable that depends on both.
– Bayesian Networks: Directed acyclic graphs that indicate causal structure. – Markov Networks: Undirected graphs that capture general dependencies.
2
Bayesian Networks
• If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S) • However, this is too strict a criteria for conditional independence since two nodes will still be considered independent if their simply exists some variable that depends on both.
• Naï ve Bayes is a simple Bayes Net
Y
X1
X2
…
Xn
• Priors P(Y) and conditionals P(Xi|Y) for Naï ve Bayes provide CPTs for the network.
7
Independencies in Bayes Nets
– Roots (sources) of the DAG that have no parents are given prior probabilities.
P(B) .001
Burglary
Earthquake
B E P(A)
P(E) .002
T
T
F T F
.95
.94 .29 .001 A P(M) .70 .01 4
• Example
P( J M A B E ) P( J | A) P(M | A) P( A | B E) P(B) P(E)
0.9 0.7 0.001 0.999 0.998 0.00062
• Therefore an inefficient approach to inference is:
• Directed Acyclic Graph (DAG)
– Nodes are random variables – Edges indicate causal influences
Burglary Alarm
Earthquake
JohnCalls
MaryCalls
3
Conditional Probability Tables
– For example, if we know nothing else, Earthquake and Burglary are independent.
• However, if we have information about a common effect (or descendent thereof) then the two “independent” causes become probabilistically linked since evidence for one cause can “explain away” the other.
– For example, Burglary and Earthquake should be considered independent since they both cause Alarm.
10
Independencies in Bayes Nets (cont.)
• Unless we know something about a common effect of two “independent causes” or a descendent of a common effect, then they can be considered independent.