Prediction and Analysis of Microbial Secondary Metabolite Biosynthetic Gene Clusters
NRPS/PKS domain architecture analysis
Domains and subgroups are detected using another pHMM library. Conserved motifs are detected using the pHMMs in the CLUSEAN package.
Gene clusters distributed in fungi genome
Core orthologous and species-specific gene clusters in three fungi
antiSMASH
(antibiotics & Secondary Metabolite Analysis Shell)
NRPS-PKS hybrid
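Secondary metabolite biosynthetic gene clusters
The genes encoding PKS and NRPS multienzyme complexes consist mainly of consecutive modules; the product of each module catalyzes one round of elongation, and possibly modification, of the polyketide or peptide chain.
A single gene may contain several modules.
A module is not the smallest unit; it is subdivided into domains.
The synthesis of one secondary metabolite is encoded jointly by several genes.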
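The gene, module, domain hierarchy described above can be sketched as a small data structure. This is only an illustration: the class names `Gene` and `Module`, the function `elongation_rounds`, and the example cluster are hypothetical, and the count assumes every module is used exactly once, which ignores loading modules and iterative use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Module:
    """One module; `domains` lists its domain abbreviations in order."""
    domains: List[str]


@dataclass
class Gene:
    """One gene of the cluster; it may carry several modules."""
    name: str
    modules: List[Module] = field(default_factory=list)


def elongation_rounds(genes: List[Gene]) -> int:
    """Count modules across all genes: one elongation round per module."""
    return sum(len(gene.modules) for gene in genes)


# Hypothetical two-gene cluster, for illustration only.
cluster = [
    Gene("geneA", [Module(["KS", "AT", "ACP"]),
                   Module(["KS", "AT", "KR", "ACP"])]),
    Gene("geneB", [Module(["C", "A", "PCP"])]),
]
print(elongation_rounds(cluster))  # 3
```

Keeping domains as plain strings inside modules mirrors the point that the module is not the smallest unit.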
Terpene synthase (HMM name: Terpene_synth_C; source: PFAM PF03936.9)
Phytoene synthase (HMM name: phytoene_synth; source: This study)
Hits are filtered with negative and positive pHMMs. Gene clusters are defined by locating clusters of signature gene pHMM hits spaced within <10 kb mutual distance.
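The clustering rule can be sketched as a single pass over hits sorted by position. This is a minimal sketch, not antiSMASH's code: the function name `group_signature_hits`, the tuple layout, and the choice to measure the gap from the end of the current cluster to the start of the next hit are all assumptions, and the negative-pHMM filter is not modeled.

```python
from typing import List, Tuple

MAX_GAP_BP = 10_000  # the "<10 kb mutual distance" threshold from the slide

Hit = Tuple[str, int, int]  # (gene_id, start, end); layout is an assumption


def group_signature_hits(hits: List[Hit],
                         max_gap: int = MAX_GAP_BP) -> List[List[str]]:
    """Chain signature-gene hits into clusters when the gap is < max_gap."""
    clusters: List[List[str]] = []
    current: List[str] = []
    current_end = 0
    for gene_id, start, end in sorted(hits, key=lambda h: h[1]):
        # A gap of max_gap or more closes the cluster built so far.
        if current and start - current_end >= max_gap:
            clusters.append(current)
            current = []
        # Track the right-most coordinate of the open cluster.
        current_end = end if not current else max(current_end, end)
        current.append(gene_id)
    if current:
        clusters.append(current)
    return clusters


# Hypothetical coordinates, for illustration only.
hits = [("pksA", 1000, 4000), ("nrpsB", 9000, 15000), ("terC", 60000, 61000)]
print(group_signature_hits(hits))  # [['pksA', 'nrpsB'], ['terC']]
```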
Analysis of secondary metabolism gene families (smCOGs)
Gene Cluster Blast Comparative Analysis
Genes are aligned with BlastP against all gene clusters from the NCBI nt database (15 February 2011). Homologous genes (BLAST e-value < 1E-05; at least 30% sequence identity; shortest BLAST alignment covering >25% of the sequence) are given the same colors.
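The three homology thresholds can be expressed as a simple filter. The sketch below is illustrative only: `BlastHit` and `is_homolog` are hypothetical names, and coverage is taken here as alignment length divided by query length, which is one possible reading of "covers >25% of the sequence".

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Minimal BlastP hit record; field names are assumptions."""
    query: str
    subject: str
    evalue: float
    percent_identity: float
    alignment_length: int
    query_length: int


def is_homolog(hit: BlastHit,
               max_evalue: float = 1e-5,
               min_identity: float = 30.0,
               min_coverage: float = 0.25) -> bool:
    """Apply the slide's thresholds: e-value, identity, and coverage."""
    coverage = hit.alignment_length / hit.query_length
    return (hit.evalue < max_evalue
            and hit.percent_identity >= min_identity
            and coverage > min_coverage)


# Hypothetical hits, for illustration only.
good = BlastHit("geneA", "ref1", 1e-20, 45.0, 200, 400)
weak = BlastHit("geneA", "ref2", 1e-3, 45.0, 200, 400)
print(is_homolog(good), is_homolog(weak))  # True False
```

Genes passing the filter against the same reference gene would then share a color in the comparative view.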
Analysis tools of antiSMASH
NCBI BLAST+, HMMer 3, Muscle 3, Glimmer 3, FastTree, TreeGraph 2, Indigo-depict, PySVG, JQuery SVG
Pipeline for genomic analysis of secondary metabolites
PKS(Polyketide synthase)
AT ACP KS
Essential domains: acyltransferase (AT), ketosynthase (KS), acyl carrier protein (ACP).
Optional domains: a module may contain one to three domains that modify the keto group: ketoreductase (KR), dehydratase (DH), enoylreductase (ER).
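How the optional domains shape the product can be illustrated with the standard textbook mapping from KR/DH/ER presence to the state of the β-carbon. That mapping is not stated on the slide and is added here as background. The function name `beta_carbon_state` is hypothetical, and the sketch assumes every listed domain is active and that the module is an elongation module with all three essential domains.

```python
from typing import Iterable

ESSENTIAL_PKS_DOMAINS = {"KS", "AT", "ACP"}


def beta_carbon_state(domains: Iterable[str]) -> str:
    """Return the expected beta-carbon state after one elongation round."""
    present = set(domains)
    missing = ESSENTIAL_PKS_DOMAINS - present
    if missing:
        raise ValueError(f"missing essential domains: {sorted(missing)}")
    if "KR" not in present:
        return "keto"            # no reduction: the keto group is kept
    if "DH" not in present:
        return "hydroxyl"        # KR only: keto reduced to hydroxyl
    if "ER" not in present:
        return "double bond"     # KR + DH: water eliminated
    return "fully reduced"       # KR + DH + ER: saturated carbon


print(beta_carbon_state(["KS", "AT", "ACP"]))              # keto
print(beta_carbon_state(["KS", "AT", "DH", "KR", "ACP"]))  # double bond
```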
Sequences are searched with the HMMer3 tool (http://hmmer.janelia.org/) against Pfam-derived and other pHMM libraries.
Signature pHMMs (compound class: description; HMM name; source):
NRPS: Condensation domain (HMM name: Condensation; source: PFAM PF00668.13)
NRPS: Adenylation domain (HMM name: AMP-binding; source: PFAM PF00501.21)
NRPS: Adenylation domain with integrated oxidase (HMM name: A-OX; source: This study)
Descriptions of the PKS profiles, in table order: Ketosynthase domain; Acyltransferase domain; FabH fatty acid synthase; Enediyne ketosynthase; Modular ketosynthase; Type II PKS chain length factor
Mining secondary metabolite biosynthetic gene clusters
Traditional strategy (phenotype to genotype): no genetic background, limited gene cluster searching, time-consuming.
Genomics strategy (genotype to phenotype): genome-wide gene background, large-scale gene cluster searching, time-saving.
PKS
McDaniel et al., Proc. Natl. Acad. Sci. USA, 96 (1999) 1846–1851
NRPS-PKS
Brian M. Kevany et al., Applied and Environmental Microbiology, 2009, 1144–1155
Input files
GenBank or EMBL format: annotated nucleotide file
FASTA format: a single-sequence FASTA file (exactly one ">" header line)
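The single-sequence requirement can be checked before submission by counting header lines. This is a small sketch; the function names are hypothetical and it inspects only lines beginning with ">", not sequence validity.

```python
def count_fasta_records(text: str) -> int:
    """Count FASTA header lines, i.e. lines that start with '>'."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def is_single_sequence_fasta(text: str) -> bool:
    """True when the text holds exactly one FASTA record."""
    return count_fasta_records(text) == 1


# Made-up sequences, for illustration only.
single = ">contig1\nATGCATGC\nGGCC\n"
multi = ">contig1\nATGC\n>contig2\nGGCC\n"
print(is_single_sequence_fasta(single), is_single_sequence_fasta(multi))  # True False
```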
NRPS(nonribosomal peptide synthetase)
A PCP C
Essential domains: adenylation (A), condensation (C), peptidyl carrier protein (PCP, also called the thiolation domain).
Optional modifying domains: epimerization (L to D), N-methylation, oxidation, and others.
Gene prediction uses Glimmer3 (prokaryotic data) or GlimmerHMM (eukaryotic data). The predicted results are then converted to EMBL format.
Detection of gene clusters
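Introduction to secondary metabolites
Secondary metabolites: substances that a microorganism produces only after reaching a certain growth stage, with highly complex chemical structures and no obvious physiological function for the organism, or which are not required for its growth and reproduction.
Main sources: actinomycetes, fungi, and others.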
Zwittermicin A
Introduction to secondary metabolites
polyketides (PK): type I, type II, type III
NRPS A domain specificities are predicted using both the signature sequence method and the support-vector-machine-based method of NRPSPredictor2.
V1.2 (August 20th, 2011)
Based on profile hidden Markov models of genes that are specific for certain types of gene clusters, antiSMASH is able to accurately identify the gene clusters encoding secondary metabolites of all known broad chemical classes.
antiSMASH not only detects the gene clusters, but also offers detailed sequence analysis.
PKS: PKS_KS (source: SMART)
PKS: PKS_AT (source: SMART)
PKS (neg.): fabH (source: This study)
PKS: ene_KS (source: Yadav et al. (2009))
PKS: mod_KS (source: Yadav et al. (2009))
PKS: t2clf (source: This study)
Terpene Terpene ……….
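Alexander Fleming: Scottish; discovered penicillin and its efficacy in treating infectious diseases; awarded the Nobel Prize in Physiology or Medicine in 1945.
Selman Waksman: American; carried out systematic, pioneering work on antibiotics produced by soil microorganisms and discovered streptomycin, which inhibits tuberculosis; awarded the Nobel Prize in Physiology or Medicine in 1952.
Francisco Malpartida: Spanish; in 1984 became the first to clone the complete biosynthetic gene cluster of actinorhodin.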
aminoglycosides / aminocyclitols
ectoines
nucleosides
phosphoglycolipids
melanins
others
Introduction to secondary metabolites
Antibiotics: antibacterial, antifungal, antiviral, antitumor, and others
Enzyme inhibitors
Immunosuppressants
Predicted core structure
PKS AT domain specificities are predicted using a 24-amino-acid signature sequence of the active site as well as other pHMMs.
nonribosomal peptides (NRP)
bacteriocins
aminocoumarins
butyrolactones
terpenes
beta-lactams
siderophores
indoles
lantibiotics
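Prediction and Analysis of Microbial Secondary Metabolites and Their Biosynthetic Gene Clusters
------antiSMASH
Outline
Introduction to secondary metabolites and their biosynthetic gene clusters
antiSMASH analysis methods and workflow
Advantages of antiSMASH and comparison with other tools
Overview of the classification and identification of secondary metabolites
Four important figures in the history of microbiology
Louis Pasteur: French; opened a new era of microbial technology.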
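Mining secondary metabolite biosynthetic gene clusters
Traditional natural product isolation usually follows an activity-guided strategy.
The first paper to predict NRPS
Nature 417, 2002
J. Antibiot. 59(3): 168–176, 2006
J. Antibiot. 59(9): 533–542, 2006
Microbiology 154, 1555–1569, 2008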
Gene clusters distributed in fungi genome
gene clusters in the 27 sequenced fungal genomes predicted by SMURF
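N. Khaldi et al., Fungal Genetics and Biology 47 (2010) 736–741
Physiologically active substances
Receptor antagonists
………..
Secondary metabolite biosynthetic gene clusters
The genes encoding secondary metabolites are usually clustered in the genome and encode multifunctional enzyme complexes.
The best studied are:
NRPS: nonribosomal peptide synthetase
PKS: polyketide synthase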