人工智能贝叶斯网络精品PPT课件
合集下载
相关主题
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
Artificial Intelligence: Bayesian Networks
1
Graphical Models
• If no assumption of independence is made, then an exponential number of parameters must be estimated for sound probabilistic inference.
• Directed Acyclic Graph (DAG)
– Nodes are random variables – Edges indicate causal influences
Burglary
Earthquake
Alarm
JohnCalls
MaryCalls
3
Conditional Probability Tables
– Bayesian Networks: Directed acyclic graphs that indicate causal structure.
– Markov Networks: Undirected graphs that capture general dependencies.
2
Bayesian Networks
• Graphical models use directed or undirected graphs over a set of random variables to explicitly specify variable dependencies and allow for less restrictive independence assumptions while limiting the number of parameters that must be estimated.
n
P(x1,x2,.x.n.) P(xi|Par(eXin))ts i1
• Example
P (J M A B E )
P ( J |A ) P ( M |A ) P ( A | B E ) P ( B ) P ( E )
0 .9 0 .7 0 .0 0 .91 9 0 .99 9 0 .08 006
A P(J) T .90 F .05
JohnCalls
MaryCalls
A P(M) T .70 F .01
4
CPT Comments
• Probability of false not given since rows must add to 1.
• Example requires 10 parameters rather than 25–1 = 31 for specifying the full joint distribution.
7
Independencies in Bayes Nets
• If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S)
the joint distribution.
6
Naïve Bayes as a Bayes Net
• Naïve Bayes is a simple Bayes Net
Y
… X1
X2
Xn
• Priors P(Y) and conditionals P(Xi|Y) for Naïve Bayes provide CPTs for the network.
• Each node has a conditional probability table (CPT) that gives the probability of each of its values given every possible combination of values for its parents (conditioning case).
• Number oΒιβλιοθήκη Baidu parameters in the CPT for a node is exponential in the number of parents (fan-in).
5
Joint Distributions for Bayes Nets
• A Bayesian Network implicitly defines a joint distribution.
• No realistic amount of training data is sufficient to estimate so many parameters.
• If a blanket assumption of conditional independence is made, efficient training and inference is possible, but such a strong assumption is rarely warranted.
– Roots (sources) of the DAG that have no parents are given prior probabilities.
P(B)
.001
Burglary
P(E)
Earthquake .002
Alarm
B E P(A) T T .95 T F .94 F T .29 F F .001
• Therefore an inefficient approach to inference is:
– 1) Compute the joint distribution using this equation. – 2) Compute any desired conditional probability using
• However, this is too strict a criteria for conditional independence since two nodes will still be considered independent if their simply exists some variable that depends on both.
1
Graphical Models
• If no assumption of independence is made, then an exponential number of parameters must be estimated for sound probabilistic inference.
• Directed Acyclic Graph (DAG)
– Nodes are random variables – Edges indicate causal influences
Burglary
Earthquake
Alarm
JohnCalls
MaryCalls
3
Conditional Probability Tables
– Bayesian Networks: Directed acyclic graphs that indicate causal structure.
– Markov Networks: Undirected graphs that capture general dependencies.
2
Bayesian Networks
• Graphical models use directed or undirected graphs over a set of random variables to explicitly specify variable dependencies and allow for less restrictive independence assumptions while limiting the number of parameters that must be estimated.
n
P(x1,x2,.x.n.) P(xi|Par(eXin))ts i1
• Example
P (J M A B E )
P ( J |A ) P ( M |A ) P ( A | B E ) P ( B ) P ( E )
0 .9 0 .7 0 .0 0 .91 9 0 .99 9 0 .08 006
A P(J) T .90 F .05
JohnCalls
MaryCalls
A P(M) T .70 F .01
4
CPT Comments
• Probability of false not given since rows must add to 1.
• Example requires 10 parameters rather than 25–1 = 31 for specifying the full joint distribution.
7
Independencies in Bayes Nets
• If removing a subset of nodes S from the network renders nodes Xi and Xj disconnected, then Xi and Xj are independent given S, i.e. P(Xi | Xj, S) = P(Xi | S)
the joint distribution.
6
Naïve Bayes as a Bayes Net
• Naïve Bayes is a simple Bayes Net
Y
… X1
X2
Xn
• Priors P(Y) and conditionals P(Xi|Y) for Naïve Bayes provide CPTs for the network.
• Each node has a conditional probability table (CPT) that gives the probability of each of its values given every possible combination of values for its parents (conditioning case).
• Number oΒιβλιοθήκη Baidu parameters in the CPT for a node is exponential in the number of parents (fan-in).
5
Joint Distributions for Bayes Nets
• A Bayesian Network implicitly defines a joint distribution.
• No realistic amount of training data is sufficient to estimate so many parameters.
• If a blanket assumption of conditional independence is made, efficient training and inference is possible, but such a strong assumption is rarely warranted.
– Roots (sources) of the DAG that have no parents are given prior probabilities.
P(B)
.001
Burglary
P(E)
Earthquake .002
Alarm
B E P(A) T T .95 T F .94 F T .29 F F .001
• Therefore an inefficient approach to inference is:
– 1) Compute the joint distribution using this equation. – 2) Compute any desired conditional probability using
• However, this is too strict a criteria for conditional independence since two nodes will still be considered independent if their simply exists some variable that depends on both.