贝叶斯推断 Bayesian Inference
…became a visiting scholar under Prof. Wang Peizhuang and completed A Generalized Information Theory. Later studied portfolio theory and went into investment. Recently, encouraged by Mr. Wang, resumed research: • combining the semantic information method with the likelihood method to study machine learning: maximum mutual information classification, mixture models, Bayesian inference, and multi-label classification (also presented in group B1 at this conference).
Bayes’ Reasoning and Bayesian Inference 贝叶斯推理和贝叶斯(主义)推断
Maximum Likelihood Criterion = Maximum Generalized KL Information Criterion
最大似然准则=最大广义KL信息准则
• Likelihood = negative cross-entropy: assuming Nj → ∞ and that the IID assumption holds, we have
• My understanding
Bayes’ Reasoning
- Probability reasoning without θ, including classical Bayes’ prediction
- Inference using θ:
  - Likelihood Inference
  - Bayesian Inference
P(X|yj) = P(yj|X)P(X) / P(yj)
method for machine learning:
• Maximum mutual information classification
• Mixture models, multi-label learning
• Improved Bayesian inference into Logical Bayesian Inference (group A1)
• Earliest research was on philosophical problems such as color vision and the sense of beauty; because the color-vision model involved fuzzy mathematics, became a visiting scholar under Prof. Wang Pei-
I(X; θj) = Σi P(xi|yj) log [P(xi|θj) / P(xi)] = C + (1/Nj) log P(X|θj),
where C = −Σi P(xi|yj) log P(xi) is a constant independent of θj.
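A small numeric sketch can confirm this identity; the distributions below are assumed toy values, not taken from the slides:

```python
import math

# Assumed toy distributions over X = {x1, x2, x3} (illustrative only).
P_x = [0.5, 0.3, 0.2]              # prior P(xi)
P_x_given_yj = [0.7, 0.2, 0.1]     # sampling distribution P(xi|yj)
P_x_given_theta = [0.6, 0.3, 0.1]  # predicted model P(xi|theta_j)
Nj = 1000                          # sample size for label yj

# Generalized KL information: I(X; theta_j) = sum_i P(xi|yj) log[P(xi|theta_j)/P(xi)]
I = sum(p * math.log(q / r) for p, q, r in zip(P_x_given_yj, P_x_given_theta, P_x))

# Log-likelihood with idealized counts Nji = Nj * P(xi|yj):
# log P(X|theta_j) = sum_i Nji * log P(xi|theta_j)
log_lik = sum(Nj * p * math.log(q) for p, q in zip(P_x_given_yj, P_x_given_theta))

# C = -sum_i P(xi|yj) log P(xi), which does not depend on theta_j
C = -sum(p * math.log(r) for p, r in zip(P_x_given_yj, P_x))
# Identity: I = C + (1/Nj) * log P(X|theta_j)
```

Because C is constant in θj, maximizing the likelihood and maximizing the generalized KL information pick out the same θj.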
Bayesian Inference: Advantages and Disadvantages 贝叶斯主义推断: 优点和缺点
• Tool: Bayesian posterior
❖ We maximize the likelihood
P(X|θj) = Πi P(xi|θj)^Nji
to get the optimized θj*.
❖ Step 2: Use P(X|θj*) to make the probability prediction.
❖ Disadvantage: when P(X) becomes P’(X), P(X|θj*) becomes invalid.
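A minimal sketch of these two steps, assuming hypothetical counts and a categorical model (for which the maximum-likelihood estimate is just the empirical frequency):

```python
import math

# Step 1: maximize P(X|theta_j) = prod_i P(xi|theta_j)^Nji.
# For a categorical model, the MLE is the empirical frequency Nji / Nj.
Nji = [70, 20, 10]                    # hypothetical counts of x1, x2, x3 given yj
Nj = sum(Nji)
theta_star = [n / Nj for n in Nji]    # optimized theta_j*

# Step 2: use P(X|theta_j*) for probability prediction.
log_lik_star = sum(n * math.log(p) for n, p in zip(Nji, theta_star))

# Any other parameter value yields a strictly lower log-likelihood:
other = [0.5, 0.3, 0.2]
log_lik_other = sum(n * math.log(p) for n, p in zip(Nji, other))
```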
P(X|yj) = P(yj|X)P(X) / Σi P(yj|xi)P(xi)
Note: P(yj|X) is not normalized.
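A sketch of this prediction with the normalization over i made explicit (toy numbers assumed); note that the channel column P(yj|xi) need not sum to 1 over i:

```python
# Classical Bayes' prediction: P(X|yj) = P(yj|X)P(X) / sum_i P(yj|xi)P(xi).
P_x = [0.5, 0.3, 0.2]           # prior P(xi)
P_yj_given_x = [0.9, 0.4, 0.1]  # channel column P(yj|xi); unnormalized over i

joint = [c * p for c, p in zip(P_yj_given_x, P_x)]
P_yj = sum(joint)               # normalizing constant sum_i P(yj|xi)P(xi)
P_x_given_yj = [j / P_yj for j in joint]  # normalized posterior over X
```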
From Classical Bayes’ Prediction to Likelihood Prediction 从经典的贝叶斯预测到似然预测
Likelihood Inference
- Tool: P(X|θj)
- Max: log P(X|θj)
Bayesian Inference
- Tools: P(θ), P(X|θ) -> P(θ|X) = P(θ)P(X|θ)/Pθ(X)
- Max: log P(θ|X) for MAP
Logical Bayesian Inference
- Tool: truth or membership function T(θj|X)
- Max: log[T(θj|X)/T(θj)] = log[P(X|θj)/P(X)]
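The difference between the likelihood and Bayesian columns can be made concrete with a Bernoulli model and a Beta prior (both assumed here purely for illustration): ML maximizes log P(X|θ), while MAP maximizes log P(θ|X) = log P(θ) + log P(X|θ) − log Pθ(X):

```python
# ML vs MAP for a Bernoulli parameter theta with a Beta(a, b) prior.
# Closed forms: MLE = k/n; MAP mode = (k + a - 1) / (n + a + b - 2).
k, n = 7, 10     # hypothetical data: 7 successes in 10 trials
a, b = 2.0, 2.0  # assumed Beta prior, which favors theta near 0.5

theta_ml = k / n
theta_map = (k + a - 1) / (n + a + b - 2)
# The prior pulls the MAP estimate toward 0.5 relative to the MLE.
```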
❖ So, Fisher developed the likelihood method.
❖ Tool: likelihood function P(X|θj)
❖ Step 1: For a sample sequence x(1), x(2), …, x(n) under the IID assumption,
log P(Xj|θj) = log Πi P(xi|θj)^Nji = Σi Nji log P(xi|θj)
= Nj Σi P(xi|yj) log P(xi|θj) = −Nj H(X|θj)   (with Nji = Nj P(xi|yj))
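Under the idealized counts Nji = Nj P(xi|yj), the per-sample log-likelihood equals the negative cross-entropy −H(X|θj); a quick numeric check with assumed toy distributions:

```python
import math

P_x_given_yj = [0.7, 0.2, 0.1]     # sampling distribution P(xi|yj)
P_x_given_theta = [0.6, 0.3, 0.1]  # model P(xi|theta_j)
Nj = 500
Nji = [Nj * p for p in P_x_given_yj]  # idealized counts as Nj -> infinity

# log P(X|theta_j) = sum_i Nji log P(xi|theta_j)
log_lik = sum(n * math.log(q) for n, q in zip(Nji, P_x_given_theta))
# Cross-entropy H(X|theta_j) = -sum_i P(xi|yj) log P(xi|theta_j)
cross_entropy = -sum(p * math.log(q) for p, q in zip(P_x_given_yj, P_x_given_theta))
# Check: log P(X|theta_j) = -Nj * H(X|theta_j)
```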
Conditional sampling distribution → likelihood function → generalized KL information:
研究经历 Research Experience
• In the 1990s, studied semantic information theory, color vision, and portfolio theory
• Recently combined the semantic information method and the likelihood
Predictions by both methods should be compatible for huge samples.
Classical Bayes’ Prediction
经典的贝叶斯预测
❖ Tool: transition probability function P(yj|X) or Shannon’s channel P(Y|X): P(yj|X), j = 1, 2, …
❖ Advantage: When P(X) becomes P’(X), the tool P(yj|X) still works.
P'(X|yj) = P(yj|X)P'(X) / Σi P(yj|xi)P'(xi)
❖ Disadvantage: if the sample is small, we cannot obtain a continuous P(yj|X).
❖ Two steps:
❖ Step 1: Obtain the prediction tool P(yj|X) from a sample or sampling distribution P(X,Y);
❖ Step 2: For given P(X) or P'(X) and yj, make the probability prediction:
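The claimed advantage — the tool P(yj|X) survives a change of prior from P(X) to P'(X) — can be sketched by reusing the same channel column with a new prior (toy numbers assumed):

```python
# The channel column P(yj|X) is reused unchanged when the prior changes.
P_yj_given_x = [0.9, 0.4, 0.1]  # fixed prediction tool P(yj|xi)

def bayes_predict(prior):
    """P(X|yj) = P(yj|X) * prior / sum_i P(yj|xi) * prior[i]."""
    joint = [c * p for c, p in zip(P_yj_given_x, prior)]
    z = sum(joint)
    return [j / z for j in joint]

posterior_old = bayes_predict([0.5, 0.3, 0.2])  # with P(X)
posterior_new = bayes_predict([0.2, 0.3, 0.5])  # with P'(X): same tool, new prior
```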