Evolution of the Genome
Maximum parsimony method
Counting the minimum number of changes given the tree.
Maximum likelihood method
Bayesian method
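The parsimony count mentioned above (the minimum number of changes given the tree) can be sketched with Fitch's small-parsimony procedure for a single site. This is a minimal illustration, not the method of any particular package: the nested-tuple tree encoding, the function name `fitch_changes`, and the toy taxa and states are all assumptions made for the example.

```python
def fitch_changes(tree, states):
    """Return the minimum number of state changes on a rooted binary tree.

    tree:   nested 2-tuples whose leaves are taxon names (strings).
    states: dict mapping each taxon name to its character state at one site.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):       # leaf: state set is the observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                      # children agree: no change needed here
            return common
        changes += 1                    # children disagree: count one change
        return left | right

    visit(tree)
    return changes


# Hypothetical toy data: one alignment column for four taxa.
tree = (("human", "mouse"), ("horse", "chicken"))
site = {"human": "A", "mouse": "A", "horse": "G", "chicken": "G"}
print(fitch_changes(tree, site))  # prints 1
```

Summing this count over all alignment columns gives the parsimony score of the tree; comparing scores across candidate trees is what the search for the most parsimonious tree does.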
UPGMA algorithm
Unweighted pair-group method using arithmetic averages. Assumes that the rate of substitution is more or less constant.
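A rough sketch of the UPGMA clustering loop follows. It assumes distances are supplied as a dict keyed by `frozenset` pairs and returns only the topology as nested tuples; the function name `upgma` and the example distances are hypothetical, and branch lengths are left out to keep the sketch short.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA and return the topology as nested tuples.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    clusters = {name: 1 for name in names}   # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        for c in clusters:
            # Size-weighted mean equals the arithmetic average over leaf pairs.
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))


# Hypothetical distances for four taxa.
pairs = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
         ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
dist = {frozenset(k): v for k, v in pairs.items()}
print(upgma(["A", "B", "C", "D"], dist))
```

Because every join uses plain averages, the result is only reliable when lineages evolve at roughly the same rate, which is the constant-rate assumption stated above.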
Substitution
Insertion & deletions
Inversion
• Chromosomal rearrangements
Main concept for tree construction
Estimate relationships between organisms or genes
Tools 1
• Alignment • Block Chain
BLAST
Tools 2
BlastZ/LastZ
• Whole genome alignment
• Chain (lastZ)
Alignment
Step1
• Mask repeats on query and target
• 2bit-file generation with “faToTwoBit”
• Size-file generation with “faSize”
Genome synteny
• chainNet on pairwise genome sequences
http://evolution.berkeley.edu
Negative
Positive
Balancing
• purifying/stabilizing selection
• diversifying/disruptive selection
Neutral theory of molecular evolution
Evolution
Jilin Zhang Dec 02, 2013
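OUTLINE
1 Stories and Theories
• Stories on origin of life
• Darwin’s theory
• Neutral theory of molecular evolution
2 Evolutionary Basics
• Genetic variations
• Phylogenetic tree
3 Comparative Analysis
• Whole genome alignment
• Orthologs
• Divergence time and species tree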
Step 2 Chaining
• axtChain in.axt tNibDir qNibDir out.chain
Step 3 Netting
• chainMergeSort chain/*.chain > all.chain
• chainPreNet in.chain target.sizes query.sizes out.chain
• chainNet in.chain target.sizes query.sizes target.net query.net
• netSyntenic in.net out.net
• netToAxt in.net in.chain tNibDir qNibDir out.axt
• axtSort in.axt out.axt
• axtToMaf in.axt tSizes qSizes out.maf
STORIES AND THEORIES
Stories
Darwin’s Theory
Species have changed, and are still changing
Variations act only by very short and slow steps; they produce no great or sudden modification, and the change is cumulative.
Consensus tree
• The strict consensus tree shows only those groups (nodes or clades) that are shared among all trees in the set, with polytomies (e.g. three-forked nodes) representing nodes not supported by all trees.
• The majority-rule consensus tree shows nodes or clades that are supported by more than half of the trees in the set.
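The two consensus rules can be illustrated by counting how often each clade occurs across a set of trees. The sketch below assumes rooted trees encoded as nested tuples of leaf names; `clades` and `consensus_clades` are hypothetical helper names, and the majority rule is implemented as "more than half", which keeps the selected clades mutually compatible.

```python
from collections import Counter


def clades(tree):
    """Return the clades of a nested-tuple tree as frozensets of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def consensus_clades(trees, strict=False):
    """Clades shared by all trees (strict) or by more than half of them."""
    counts = Counter(c for t in trees for c in clades(t))
    if strict:
        return {c for c, n in counts.items() if n == len(trees)}
    return {c for c, n in counts.items() if n > len(trees) / 2}


# Hypothetical set of three trees over the same four taxa.
trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
print(consensus_clades(trees, strict=True))   # only the clade of all four taxa
print(consensus_clades(trees))                # also {A, B} and {C, D}
```

Clades that fail the threshold are simply absent from the result, which is what collapses the corresponding nodes into a polytomy when the consensus tree is drawn.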
Whole genome alignment, orthologs, divergence time and species tree
COMPARATIVE ANALYSIS
Whole genome alignment
• Pairwise genome alignment
• Multi genome alignment
Step2
• Split query file into small subfiles
Step3
• lastz target.2bit[multi] subfile parameters > subfile.out
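Step 2 above (splitting the query into small subfiles) might look roughly like the following. The sketch works on lines already read from a FASTA file and groups whole records into chunks; the function name `split_fasta` and the chunk size are assumptions for illustration, and writing each chunk to its own subfile is left to the caller.

```python
def split_fasta(lines, records_per_chunk):
    """Group FASTA records into chunks of at most `records_per_chunk` records.

    lines: iterable of text lines from a FASTA file.
    Returns a list of chunks; each chunk is a list of lines.
    """
    chunks, current, count = [], [], 0
    for line in lines:
        if line.startswith(">"):             # header line starts a new record
            if count == records_per_chunk:   # current chunk is full
                chunks.append(current)
                current, count = [], 0
            count += 1
        current.append(line)
    if current:
        chunks.append(current)
    return chunks


# Hypothetical three-record query file.
query = [">s1", "ACGT", ">s2", "GG", ">s3", "TT"]
for i, chunk in enumerate(split_fasta(query, 2)):
    print(i, chunk)
```

Each chunk can then be aligned against the target independently, which is what makes the alignment step easy to run in parallel.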
ChainNet
Step1 Best alignment selection
• All bases in all chromosomes are initially marked as unused.
• The chains are then put into a list sorted with the highest-scoring chain first.
• Loop over the list, throwing out the parts of each chain that intersect with bases already covered by previously taken chains, and then marking the bases that are left in the chain as covered.
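The greedy selection described above can be sketched as follows. This is a simplified illustration, not the chainNet implementation: each chain is reduced to one ungapped target interval, coverage is tracked per base in a Python set (far too slow for real genomes), and the name `select_best_chains` and the tuple layout are assumptions.

```python
def select_best_chains(chains):
    """Greedy best-alignment selection over target bases.

    chains: list of (score, chrom, start, end) with half-open coordinates.
    Returns (score, chrom, kept_intervals) tuples in decreasing score order,
    where kept_intervals are the parts not covered by higher-scoring chains.
    """
    covered = set()                      # (chrom, position) pairs already used
    result = []
    for score, chrom, start, end in sorted(chains, key=lambda c: c[0], reverse=True):
        kept, run_start = [], None
        for pos in range(start, end):
            if (chrom, pos) in covered:
                if run_start is not None:        # close the current free run
                    kept.append((run_start, pos))
                    run_start = None
            else:
                if run_start is None:            # open a new free run
                    run_start = pos
                covered.add((chrom, pos))        # mark the base as covered
        if run_start is not None:
            kept.append((run_start, end))
        if kept:
            result.append((score, chrom, kept))
    return result


# Hypothetical chains on one chromosome.
chains = [(50, "chr1", 5, 15), (100, "chr1", 0, 10), (10, "chr1", 2, 8)]
print(select_best_chains(chains))
```

In this toy input the lowest-scoring chain lies entirely inside bases already taken, so it is dropped, while the middle chain keeps only its uncovered tail.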
Models
Models
Neighbor Joining Algorithm
Starts with a star tree and joins two nodes, choosing the pair that achieves the greatest reduction in tree length. A new node is then created to replace the two nodes joined. Repeat the procedure until the tree is solved.
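A compact sketch of the neighbor-joining loop is given below. It uses the standard Q criterion to pick the pair whose joining gives the largest reduction in tree length, and it returns only the unrooted topology as nested tuples. The function name, the `frozenset`-keyed distance dict, and the five-taxon distances are assumptions for illustration; branch lengths are omitted for brevity.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return an unrooted NJ topology as nested tuples (no branch lengths).

    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 3:
        n = len(nodes)
        # Sum of distances from each node to all others.
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Q criterion: the minimizing pair gives the largest drop in tree length.
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - total[p[0]] - total[p[1]],
        )
        new = (a, b)                     # new internal node replaces a and b
        for k in nodes:
            if k not in (a, b):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((a, k))] + d[frozenset((b, k))] - d[frozenset((a, b))]
                )
        nodes = [k for k in nodes if k not in (a, b)] + [new]
    return tuple(nodes)


# Hypothetical distances among five taxa.
pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
         ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
         ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
dist = {frozenset(k): v for k, v in pairs.items()}
print(neighbor_joining(list("abcde"), dist))
```

Unlike UPGMA, this procedure does not assume a constant substitution rate, because the Q criterion corrects each pairwise distance for how far both nodes are from everything else.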
Functionally more important genes or gene regions evolve more slowly
Common genome variations, phylogenetic tree
EVOLUTIONARY BASICS
Common Variations
Mutations at nucleotide level
Methods for reconstruction
Programs/Software
PhyML, PAUP, MEGA, PHYLIP, MrBayes
• PhyML: one of the fastest ML programs
• PAUP: classic, not free
• MEGA: graphical interface, free
• PHYLIP: classic, free
• MrBayes: popular Bayesian tree reconstruction package
Example
Input (phylip format)
10 100
HUMAN26353 VQWCAVSQPE ATKCFQWQRN MRKVRRMSGP PVSCIKRDSP IQCIQAIAEN RADAVTLDGG FIYEAGLAPY KLRPVAAEVY GTERQPRTHY YAVAVVKKGG
MOUSE24351 VQWCAVSNSE EEKCLRWQNE MRKVG---GP PLSCVKKSST RQCIQAIVTN RADAMTLDGG TLFDAGKPPY KLRPVAAEVY GTKEQPRTHY YAVAVVKNSS
HORSE23349 VRWCTVSNHE VSKCASFRDS MKSIVPA-PP LVACVKRTSY LECIKAIADN EADAVTLDAG LVFEAGLSPY NLKPVVAEFY GSKTEPQTHY YAVAVVKKNS
…..
Output (newick format)
(MOUSE24351:0.24084,(BOVIN25352:0.24624,(((HUMAN23357:0.53282,(XENLA26341:0.41999,SALSA25329:0.35952):0.22389):0.17453,CHICK26352:0.30644):0.15404,(FEPIG6332:0.13028,(HUMAN25347:0.17692,HORSE23349:0.13011):0.07571):0.10549):0.16215):0.14101,HUMAN26353:0.17629);
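To show how a Newick string with branch lengths can be read back, here is a small sketch that extracts the leaf names and their terminal branch lengths with a regular expression. It is an illustrative shortcut rather than a full Newick parser: `leaf_branch_lengths` is a hypothetical helper, it assumes unquoted labels, and the input is a single subtree taken from the output above.

```python
import re


def leaf_branch_lengths(newick):
    """Return {leaf_name: branch_length} for a Newick string with lengths."""
    # A leaf label follows "(" or ","; internal-node lengths follow ")" and
    # are therefore skipped by this pattern.
    pairs = re.findall(r"[(,]\s*([^():,;\s]+)\s*:\s*([0-9.eE+-]+)", newick)
    return {name: float(length) for name, length in pairs}


# Subtree copied from the example output, closed with ";" for illustration.
subtree = "(FEPIG6332:0.13028,(HUMAN25347:0.17692,HORSE23349:0.13011):0.07571);"
for name, length in leaf_branch_lengths(subtree).items():
    print(name, length)
```

For anything beyond a quick inspection, a dedicated tree library is the safer choice, since Newick allows quoted labels, comments, and internal node names that this pattern does not handle.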
All life is related and has descended from a common ancestor.
Natural Selection: Those organisms with the most beneficial traits are more likely to survive and reproduce.
Constructing a tree based on the differences
Testing the tree for consistency
Terms
• Topology – structure and the relationship
• Nodes – DNA (RNA, mtDNA) sequences, proteins, species = taxonomic units (TUs)
• Terminal (extant) nodes, leaves – OTUs
• Internal nodes – unobserved ancestor sequences
• Branches – parent-child relations between two nodes
• Clade
Sequence divergence
• Distance
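Generally, sequence divergence is measured by the proportion of sites that differ between two aligned sequences: P = ndiv/N, where ndiv is the number of differing sites and N is the total number of sites compared.
AAGTCCTAGCTAGTGCTTTGCAGATAAC
AAGTGCTAGCTAGATCTTTGCAGATAAC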
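The divergence measure P = ndiv/N can be computed directly for the two example sequences. The function name `p_distance` is an assumption; the sketch expects sequences that are already aligned to the same length and treats every position equally.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    n_div = sum(1 for x, y in zip(seq_a, seq_b) if x != y)  # differing sites
    return n_div / len(seq_a)


seq1 = "AAGTCCTAGCTAGTGCTTTGCAGATAAC"
seq2 = "AAGTGCTAGCTAGATCTTTGCAGATAAC"
print(p_distance(seq1, seq2))
```

For this pair, 3 of the 28 sites differ, so P is about 0.107.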
Genetic variation is due to random fixation of mutations with no fitness effect (neutral mutations)
Rate of molecular evolution is equal to the neutral mutation rate (an explanation for the molecular-clock hypothesis).
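The statement that the substitution rate equals the neutral mutation rate follows from a short calculation. The symbols are assumptions introduced here, not notation from the slides: N is the size of a diploid population, \mu is the neutral mutation rate per gene copy per generation, and k is the substitution rate per generation.

```latex
k \;=\; \underbrace{2N\mu}_{\text{new neutral mutations per generation}}
  \;\times\;
  \underbrace{\frac{1}{2N}}_{\text{fixation probability of each}}
  \;=\; \mu
```

The population size cancels, so under neutrality the rate of substitution depends only on the mutation rate, which is why it is expected to be roughly constant over time.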