Markov Chains (PPT lecture slides)
• Markov models are well suited to this type of task
Markov Chain Models
• a Markov chain model is defined by
  – a set of states
    • we’ll think of a traversal through the states as generating a sequence
    • each state adds to the sequence (except for begin and end)
  – a set of transitions with associated probabilities
    • the transitions emanating from a given state define a distribution over the possible next states
• key property of a (1st order) Markov chain: the probability of each $X_i$ depends only on the value of $X_{i-1}$
  $$P(X) = P(X_1) \prod_{i=2}^{L} P(X_i \mid X_{i-1})$$
• can also have an end state; allows the model to represent
  – a distribution over sequences of different lengths
  – preferences for ending sequences with certain symbols
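The idea of a traversal through the states generating a sequence can be sketched in Python. This is a minimal illustration, not part of the original slides: the table `TRANSITIONS`, its probabilities, and the function name `sample_sequence` are all made up for the example.

```python
import random

# Hypothetical transition table (numbers invented for illustration).
# Each state maps to a distribution over the possible next states.
TRANSITIONS = {
    "begin": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "A": {"A": 0.2, "C": 0.2, "G": 0.3, "T": 0.2, "end": 0.1},
    "C": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.2, "end": 0.1},
    "G": {"A": 0.2, "C": 0.3, "G": 0.2, "T": 0.2, "end": 0.1},
    "T": {"A": 0.2, "C": 0.2, "G": 0.2, "T": 0.3, "end": 0.1},
}


def sample_sequence(transitions, rng, max_len=1000):
    """Walk from begin until end is reached; every other state emits its symbol."""
    state = "begin"
    symbols = []
    while len(symbols) < max_len:  # safety cap, an assumption of this sketch
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
        if state == "end":
            break
        symbols.append(state)
    return "".join(symbols)


rng = random.Random(0)  # fixed seed so repeated runs give the same sequence
print(sample_sequence(TRANSITIONS, rng))
```

Because every non-begin state has some probability of moving to end, the sampled sequences vary in length, which is what the end state is meant to allow.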
Estimating the Model Parameters
• given some data (e.g. a set of sequences from CpG islands), how can we determine the probability parameters of our model?
• one approach: maximum likelihood estimation
  – given a set of data D
  – set the parameters $\theta$ to maximize $P(D \mid \theta)$
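For a first-order Markov chain, the maximum likelihood estimate of each transition probability is the observed count of that transition divided by the total count of transitions leaving the same state. The sketch below shows this under stated assumptions: the function name `estimate_transitions` and the toy sequences are invented for illustration and are not real CpG island data.

```python
from collections import Counter, defaultdict


def estimate_transitions(sequences):
    """MLE: P(t | s) = count(s -> t) / count(s -> any state)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Add explicit begin and end states around each training sequence.
        path = ["begin"] + list(seq) + ["end"]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


# Toy data, invented for illustration.
data = ["ACGCG", "CGCGT", "GCGA"]
params = estimate_transitions(data)
for state, row in params.items():
    print(state, row)
```

Note that a transition never seen in the data gets no entry here, which corresponds to an estimated probability of zero.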
A Markov Chain Model for DNA
[Figure: state transition diagram over the states begin, A, C, G, T, with labeled transition probabilities .38, .16, .34, .12]
Markov Chain Models
• similarly we can denote the probability of a sequence x as
  $$P(x) = a_{\mathcal{B}x_1} \prod_{i=2}^{L} a_{x_{i-1}x_i} = P(x_1) \prod_{i=2}^{L} P(x_i \mid x_{i-1})$$
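The probability of a sequence under a first-order chain is the begin transition into the first symbol times the transitions between consecutive symbols. A minimal Python sketch follows, working in log space to avoid underflow on long sequences. The parameter table `a`, the row values, and the function name `sequence_log_prob` are hypothetical, and this version has no end state.

```python
import math
from itertools import product

# Hypothetical parameters, invented for illustration: a[prev][nxt] = P(nxt | prev).
ROW = {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}
a = {"begin": {s: 0.25 for s in "ACGT"}}
a.update({s: dict(ROW) for s in "ACGT"})


def sequence_log_prob(x, a):
    """log P(x) = log a[begin][x_1] + sum over i = 2..L of log a[x_{i-1}][x_i]."""
    log_p = 0.0
    prev = "begin"
    for symbol in x:
        log_p += math.log(a[prev][symbol])
        prev = symbol
    return log_p


# Probability of one short sequence.
print(math.exp(sequence_log_prob("CG", a)))

# Sum of P(x) over all 16 sequences of length 2.
total = sum(
    math.exp(sequence_log_prob("".join(p), a)) for p in product("ACGT", repeat=2)
)
print(total)
```

The second printed value should be 1 up to floating-point rounding, since without an end state the model defines a distribution over sequences of one fixed length.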
The Probability of a Sequence for a Given Markov Chain Model
Markov Chain Models
BMI/CS 576
Fall 2010
Motivation for Markov Models in Computational Biology
• there are many cases in which we would like to represent the statistical regularities of some class of sequences
  – genes
  – various regulatory sites in DNA (e.g., where RNA polymerase and transcription factors bind)
  – proteins in a given family
• for any probabilistic model of sequences, we can write this probability as (recall the “Chain Rule of Probability”)
  $$P(X) = P(X_1, \ldots, X_L) = P(X_L \mid X_{L-1}, \ldots, X_1)\, P(X_{L-1} \mid X_{L-2}, \ldots, X_1) \cdots P(X_1)$$
[Figure: Markov chain state diagram with states begin, A, C, G, T, and end]
Markov Chain Models
• Let X be a sequence of L random variables $X_1, \ldots, X_L$ representing a biological sequence generated through some process
• We can ask how probable the sequence is given our model
[Figure: Markov chain state diagram with states begin, A, C, G, T, and end]
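As a worked illustration of asking how probable a sequence is under a chain with begin and end states, consider the example sequence cggt (chosen here for illustration). The subscripts $\mathcal{B}$ and $\mathcal{E}$ for the begin and end states are notation assumed for this sketch.

```latex
P(cggt) = a_{\mathcal{B}c}\; a_{cg}\; a_{gg}\; a_{gt}\; a_{t\mathcal{E}}
        = P(c)\, P(g \mid c)\, P(g \mid g)\, P(t \mid g)\, P(\mathrm{end} \mid t)
```

Each factor is one transition along the path begin → c → g → g → t → end, so a sequence of length L contributes L + 1 factors when an end state is present.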
Markov Chain Notation
• the transition parameters can be denoted by $a_{x_{i-1}x_i}$, where $a_{x_{i-1}x_i} = P(x_i \mid x_{i-1})$
• $a_{\mathcal{B}x_1}$ represents the transition from the begin state
• This gives a probability distribution over sequences of length L