Exome Sequencing (slide deck)
Whole-Exome Sequencing: Methods and Steps
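Whole-exome sequencing (WES) is a genomic analysis method in which the exonic regions of the genome are enriched by sequence capture or other targeting techniques and then sequenced on a high-throughput platform.
Compared with whole-genome resequencing, WES sequences only the exonic regions, so for the same sequencing effort it gives deeper coverage and more accurate data in those regions, and it is simpler, cheaper and more efficient.
Technical advantages
Cost-effective, strong analysis, fast delivery
Exome sequencing is mainly used to identify and study disease-associated genomic variants in coding regions.
Combining the exome data available in public databases with databases of healthy populations helps to exclude benign variants and to interpret how variants relate to one another and to disease mechanisms.
Technical workflow
Technical specifications
Sample requirements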
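Human Whole-Genome, Whole-Exome and Whole-Transcriptome Sequencing and Their Clinical Applications
1. Nonsense mutation
A single-nucleotide change that creates a stop codon, terminating translation prematurely and producing a truncated protein. The truncated protein is likely to have lost the function of the original protein.
2. Missense mutation
A single-nucleotide change that creates a different codon and therefore encodes a different amino acid. Depending on how the new amino acid affects the protein's three-dimensional structure, protein function may be retained or altered, so missense mutations are further divided into conservative and non-conservative mutations.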
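Protein-coding sequence accounts for about 1.5% of the human genome, which contains 20,000 to 25,000 protein-coding genes. The remainder includes RNA-coding genes, regulatory sequences, pseudogenes and various repetitive sequences. Repetitive sequences make up about 60% of the human genome and are largely transcriptionally inactive; they include tandem repeats clustered in specific chromosomal regions, …
(2) Gene
A gene is a DNA segment that carries genetic information and is transcribed into a functional RNA molecule. All human genes are arranged linearly on the 23 pairs of chromosomes, each of which carries hundreds to a few thousand genes. Most genes contain multiple exons, with an intron between adjacent exons. Between genes there are usually regulatory sequences and non-coding intergenic segments (Figure 2-5-1).
(3) Structural variation (SV)
SVs generally include insertions, deletions, inversions and duplications of DNA sequence longer than 50 bp, mobile elements, translocations within or between chromosomes, and more complex combined rearrangements. When such a variant occurs in a gene, it can affect transcription, translation, and the structure and properties of the protein, and ultimately the phenotype of the organism.
(4) Copy number variation (CNV)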
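4) Repeat the three steps above to collect the signal for the second base, and continue until all cycles are complete.
Exome sequencing data analysis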
Base_covered_on_target (Mb); Coverage_of_target_region; Fraction_of_target_covered_with_at_least_20x; Fraction_of_target_covered_with_at_least_10x; Fraction_of_target_covered_with_at_least_4x
13721 92.05 47.31
12636 90.86 46.75
9776 66.84 43.05
9616 64.37 41.45
6904
6815
6684
6437
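Exome capture efficiency is considered acceptable when more than 60% of the data aligned to the reference genome falls within the target region.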
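The acceptance rule above is a simple ratio test, and a minimal sketch may make it concrete. The function names and the example numbers are hypothetical; only the 60% cutoff comes from the text.

```python
def capture_efficiency(on_target_mb: float, aligned_mb: float) -> float:
    """Fraction of the aligned data that falls inside the capture target."""
    if aligned_mb <= 0:
        raise ValueError("aligned_mb must be positive")
    return on_target_mb / aligned_mb


def capture_passes(on_target_mb: float, aligned_mb: float,
                   threshold: float = 0.60) -> bool:
    """Apply the 'more than 60% on target' rule; the cutoff is configurable."""
    return capture_efficiency(on_target_mb, aligned_mb) > threshold


# Made-up illustrative values, not taken from any table in this document.
print(capture_passes(on_target_mb=6500.0, aligned_mb=9800.0))
```

Keeping the threshold as a parameter lets the same check be reused if a project adopts a stricter cutoff.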
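3.2.3 Distribution of coverage depth across chromosomes
Note: the x-axis is chromosome length; the y-axis is the logarithm of coverage depth.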
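II. Exome sequencing workflow
Random fragmentation of genomic DNA; DNA fragments; bioinformatic analysis
III. Exome sequencing data analysis workflow
Main categories of analysis
3.1 Data filtering and evaluation
3.2 Overall quality assessment
3.3 SNP detection and annotation
3.4 InDel detection and annotation
3.5 Advanced analysis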
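Applications of Exome Sequencing in Medical Research
I. Introduction to exome sequencing
II. Exome sequencing workflow
III. Exome sequencing data analysis
IV. Application schemes for exome sequencing
I. Introduction to exome sequencing
Exome sequencing is a genomic analysis method in which the exonic regions of the genome are captured and enriched by sequence capture and then sequenced at high throughput.
The exome makes up only about 1% of the genome but is associated with about 85% of human disease-causing mutations. Compared with whole-genome sequencing, exome sequencing costs less and, for the same sequencing effort, gives deeper coverage and more accurate data.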
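The Human Genome Project and Gene Sequencing (slide deck)
Fourth-generation sequencing: nanopore sequencing
Principle: as a molecule passes through a nanopore, it changes the current flowing through the pore, or the current flowing across it (the tunnelling current), and different molecules change the current in distinguishable ways. Nanopore sequencing uses these differences to identify the order of the bases (or base pairs) in a gene.
Chromosome 1: Life. The birth and origin of life.
Chromosome 2: Species. Human evolution and what separates humans from their closest relatives.
Chromosome 3: History. The contributions of Mendel and other scientists to genetics.
Chromosome 4: Fate. Your fate lies entirely in your genes.
Chromosome 5: Environment. Dispels the reader's impression that genes are simple, separable units.
Chromosome 6: Intelligence. Genes do not exist in order to cause disease.
Chromosome 7: Instinct. Explains how the conclusions of behavioral genetics and evolutionary psychology bear on humans.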
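…the position of the corresponding base in the original sample DNA molecule. The products of each reaction are then separated by polyacrylamide gel electrophoresis, the end-labelled molecules are detected by autoradiography, and the nucleotide sequence of the DNA fragment is read directly.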
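Second-generation sequencing
Features:
Greatly reduces sequencing cost while
greatly increasing sequencing speed
and maintaining high accuracy;
read lengths, however, are much shorter than those of first-generation sequencing.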
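The Human Genome Project: overview
The Human Genome Project (HGP) was first proposed by American scientists in 1985 and formally launched in 1990. Scientists from the United States, the United Kingdom, France, Germany, Japan and China took part in the project, which had a budget of 3 billion US dollars.
Significance:
• Together with the Manhattan Project and the Apollo program, the HGP is counted among the three great science projects.
• It has been called the "moon landing" of the life sciences.
China:
China joined the project in September 1999 and took on 1% of the work: sequencing about 30 million base pairs on the short arm of human chromosome 3. China was thus the only developing country to take part.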
Exome sequencing
Brief Communications

Identity-by-descent filtering of exome sequence data identifies PIGV mutations in hyperphosphatasia mental retardation syndrome

Peter M Krawitz 1–3,11, Michal R Schweiger 1,2,11, Christian Rödelsperger 1–3, Carlo Marcelis 4, Uwe Kölsch 5, Christian Meisel 5, Friederike Stephani 4, Taroh Kinoshita 6, Yoshiko Murakami 6, Sebastian Bauer 2, Melanie Isau 1, Axel Fischer 1, Andreas Dahl 1, Martin Kerick 1, Jochen Hecht 1,3, Sebastian Köhler 2, Marten Jäger 2, Johannes Grünhagen 2, Birgit Jonske de Condor 2, Sandra Doelken 2, Han G Brunner 4, Peter Meinecke 7, Eberhard Passarge 8, Miles D Thompson 9, David E Cole 9, Denise Horn 2, Tony Roscioli 4,10, Stefan Mundlos 1–3 & Peter N Robinson 1–3

1 Max Planck Institute for Molecular Genetics, Berlin, Germany. 2 Institut für Medizinische Genetik, Charité Universitätsmedizin Berlin, Berlin, Germany. 3 Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité Universitätsmedizin Berlin, Berlin, Germany. 4 Department of Human Genetics, University Medical Centre St. Radboud, Nijmegen, The Netherlands. 5 Institut für Medizinische Immunologie, Charité Universitätsmedizin Berlin, Berlin, Germany. 6 Department of Immunoregulation, Research Institute for Microbial Diseases, Osaka University, Osaka, Japan. 7 Medizinische Genetik, Altonaer Kinderkrankenhaus, Hamburg, Germany. 8 Institut für Humangenetik, Universitätsklinikum Essen, Essen, Germany. 9 Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada. 10 Department of Molecular and Clinical Genetics, University of Sydney, Sydney, Australia. 11 These authors contributed equally to this work. Correspondence should be addressed to S.M. (stefan.mundlos@charite.de) or P.N.R. (peter.robinson@charite.de).

Received 29 March; accepted 3 August; published online 29 August 2010; doi:10.1038/ng.653

© 2010 Nature America, Inc. All rights reserved.

Hyperphosphatasia mental retardation (HPMR) syndrome is an autosomal recessive form of mental retardation with distinct facial features and elevated serum alkaline phosphatase. We performed whole-exome sequencing in three siblings of a nonconsanguineous union with HPMR and performed computational inference of regions identical by descent in all siblings to establish PIGV, encoding a member of the GPI-anchor biosynthesis pathway, as the gene mutated in HPMR. We identified homozygous or compound heterozygous mutations in PIGV in three additional families.

Recessive mutations are relatively common in the human genome, but their identification remains challenging. Initial efforts at using exome sequencing for disease gene discovery [1] analyzed small numbers of unrelated individuals, removed variants that are common or not predicted to be deleterious and then searched for genes with such variants in all affected individuals. The analysis of the exome sequences of two siblings and two further unrelated individuals affected by the autosomal recessive Miller syndrome led to the identification of DHODH as the disease gene [2]. Subsequently, researchers analyzed whole genome sequences of the same two siblings and their parents to identify chromosomal regions in which both siblings had inherited identical haplotypes from both parents, which allowed the number of gene candidates for Miller syndrome to be reduced from 34 to 4, showing that linkage information represents a useful filter for genome sequence data [3]. These studies illustrate the utility of sophisticated algorithmic analysis in reducing the candidate gene set beyond what can be achieved by a simple intersection filter.

HPMR, also known as Mabry syndrome (MIM%239300), was initially described as an autosomal recessive syndrome characterized by mental retardation and greatly elevated alkaline phosphatase levels [4,5]. Within a group of individuals with this rare syndrome, a previous study [6] delineated a specific clinical entity characterized by a distinct facial gestalt including hypertelorism, long palpebral fissures, a broad nasal bridge and tip, and a mouth with downturned corners and a thin upper lip, as well as brachytelephalangy. More variable neurological features included seizures and muscular hypotonia [6].

Here, DNA from three siblings of nonconsanguineous parents with this subtype of HPMR was analyzed by exome sequencing (Supplementary Figs. 1 and 2 and Supplementary Table 1). Whole-exome sequencing using the ABI SOLiD platform was performed following enrichment of exonic sequences using Agilent's SureSelect whole-exome enrichment. Called variants were filtered to exclude variants not found in all affected persons as well as common variants identified in the dbSNP130 or HapMap databases, which left 14 candidate genes on multiple chromosomes (Table 1 and Supplementary Tables 2–4).

In this work, we developed a statistical model that allowed us to infer regions that are identical by descent (IBD) from the exome sequences of only the affected children of a family in which an autosomal recessive disorder segregates. In consanguineous families, affected siblings share two haplotypes that are inherited from a single common ancestor at the disease locus and are thus homozygous by descent. In nonconsanguineous families, the affected children inherit identical maternal and paternal haplotypes in a region surrounding the disease gene, meaning that both haplotypes originated from the same maternal and paternal haplotype but are not necessarily from an identical ancestor (IBD = 2).

We developed an algorithm based on a Hidden Markov Model (HMM), a type of Bayesian network that is used to infer a sequence of hidden (that is, unobservable) states. We used the HMM algorithm to identify chromosomal regions with IBD = 2 in the presence of noisy (that is, potentially erroneous) sequence data. It is not possible to measure the IBD = 2 state directly; it is only possible to determine whether the genotypes of the siblings are compatible with identity-by-state status, that is, whether each sibling has the same homozygous or heterozygous genotype, a situation which we refer to as IBS*. In our model, every genetic locus was either IBD = 2 or IBD ≠ 2. The HMM was then used to predict the most likely sequence of IBD = 2 or IBD ≠ 2 chromosomal segments on the basis of the observed exome sequences of two or more affected siblings (Supplementary Fig. 1 and Supplementary Methods).

HMM analysis decreased the search space to about 20% of the transcribed genome, reducing the number of candidate genes with mutations present in all three siblings from 14 to 2 (Table 1, Supplementary Table 5 and Supplementary Figs. 3–5). The two mutations, c.[859G>A]+[859G>A] in SLC9A1 and c.[1022C>A]+[1022C>A] in PIGV, were located within a 13-Mb homozygous block that was part of a larger 35-Mb IBD = 2 block. Runs of homozygosity of up to 4 Mb can occur in the European population even in individuals with no shared ancestors in the previous five to ten generations [7]. Both variants were confirmed with ABI Sanger sequencing and were not detected in 200 healthy, unrelated central European individuals. Further homozygous and compound heterozygous mutations were detected in PIGV in individuals from the families designated B [8], C [9] and D [10] (Supplementary Note and Supplementary Tables 6 and 7). All of these missense mutations affect evolutionarily highly conserved residues of PIGV (Fig. 1a).

PIGV, the second mannosyltransferase in the GPI anchor biosynthesis pathway [11], appeared to be of particular interest because alkaline phosphatase is a GPI-anchored protein.
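The two-state model described above (each locus is either IBD = 2 or IBD ≠ 2, observed only through whether the siblings' genotypes agree, IBS*) can be illustrated with a small Viterbi decoder. This is a sketch under stated assumptions, not the authors' implementation: the paper defers its model to the Supplementary Methods, so the function name and every probability below are hypothetical placeholders, and Viterbi decoding is assumed as the way to obtain "the most likely sequence" of states.

```python
import math


def viterbi_ibd2(obs, p_stay=0.99, p_ibs_if_ibd2=0.98,
                 p_ibs_if_other=0.60, prior_ibd2=0.25):
    """Most likely IBD state path for a list of per-site observations.

    obs[i] is 1 if all siblings carry the same genotype at site i (IBS*),
    else 0. Returned states: 1 = IBD = 2, 0 = IBD != 2.
    All probabilities are illustrative assumptions, not published values.
    """
    if not obs:
        return []
    states = (0, 1)
    start = {0: math.log(1 - prior_ibd2), 1: math.log(prior_ibd2)}

    def trans(a, b):
        # Log-probability of moving from state a to state b between sites.
        return math.log(p_stay if a == b else 1 - p_stay)

    def emit(state, o):
        # Genotyping noise is absorbed by p_ibs_if_ibd2 being below 1.
        p = p_ibs_if_ibd2 if state == 1 else p_ibs_if_other
        return math.log(p if o == 1 else 1 - p)

    score = {s: start[s] + emit(s, obs[0]) for s in states}
    back = []  # back[t][s] = best predecessor of state s at site t + 1
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: score[r] + trans(r, s))
            pointers[s] = prev
            new_score[s] = score[prev] + trans(prev, s) + emit(s, o)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# A single discordant site inside a concordant run is treated as noise.
print(viterbi_ibd2([1] * 10 + [0] + [1] * 10))
```

The sticky transition probability is what lets the decoder ride over isolated discordant calls while still leaving a region after a sustained run of mismatches.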
Over 100 mammalian proteins are modified by a glycosylphosphatidylinositol (GPI) anchor at their C terminus. The highly conserved backbone structure of the GPI anchor is synthesized in the endoplasmic reticulum through at least nine sequential reaction steps mediated by at least 18 proteins. GPI-anchored proteins comprise functionally divergent classes including hydrolytic enzymes, receptors, adhesion molecules and proteins with roles in the immune system [12]. Little is known to date about the phenotypic consequences of mutations of the GPI pathway in mammals. Abrogation of GPI biosynthesis in mice by knockdown of Piga, which encodes a protein that is involved in the first step of GPI-anchor biosynthesis, results in embryonic lethality [13]. Somatic loss-of-function mutations in PIGA in hematopoietic stem cells are associated with paroxysmal nocturnal hemoglobinuria [14], primarily because the progeny of affected stem cells are deficient in the GPI-anchored complement regulatory proteins CD55 and CD59, leading to the intravascular hemolysis characteristic of the disease. A promoter mutation in PIGM, encoding a subunit of the complex transferring the first mannose, reduces PIGM expression by over 90% and leads to an autosomal recessive syndrome characterized by hepatic venous thrombosis and absence seizures [15].

Defects in the GPI biosynthesis pathway can result in down-regulation of GPI-anchored proteins but not necessarily in a uniform reduction of all such proteins [12]. We therefore examined the surface expression of the GPI anchor itself on leukocytes of three individuals with HPMR using Alexa488-conjugated inactivated aerolysin (FLAER). All three subjects showed a substantial reduction of GPI-anchor expression. Correspondingly, expression of the GPI-anchored protein CD16 was markedly reduced (Supplementary Fig. 6). Wild-type PIGV cDNA and PIGV cDNA containing the p.Ala341Glu alteration were transiently transfected into PIGV-deficient Chinese hamster ovary (CHO) cells [11] to assess their effect on protein expression. Cells transfected with the mutant construct did not restore surface expression of GPI-anchored marker proteins (Fig. 1b), possibly because expressed PIGV protein levels were substantially reduced (Fig. 1c).

Figure 1 Identification of PIGV mutations in individuals with HPMR syndrome. (a) The homozygous PIGV mutation c.[1022C>A]+[1022C>A]; p.[Ala341Glu]+[Ala341Glu] was detected via whole exome sequencing in family A. Further homozygous and compound heterozygous mutations affecting evolutionarily highly conserved residues were found in three unrelated families: c.[1022C>A]+[1154A>C]; p.[Ala341Glu]+[His385Pro] in family B, c.[766C>A]+[766C>A]; p.[Gln256Lys]+[Gln256Lys] in family C, and c.[1022C>A]+[1022C>T]; p.[Ala341Glu]+[Ala341Val] in family D. (b) PIGV-deficient CHO cells were transiently transfected with wild-type (dashed lines) or p.Ala341Glu mutant (solid lines) PIGV cDNA in a weak expression vector or the empty vector (gray shadow). Wild-type PIGV efficiently restored the surface expression of CD59 (left) and CD55 (right), whereas p.Ala341Glu mutant PIGV induced only very low levels of CD59 and CD55. (c) PIGV protein levels were assessed 2 d after transfection of a control vector (lane 1), wild-type PIGV (lane 2) and PIGV with p.Ala341Glu (lane 3). The numbers beneath the gel indicate the relative intensity of PIGV to GAPDH expression.

Table 1 Number of genes with nonsynonymous variants and acceptor or donor splice site mutations
(each cell: homozygous / heterozygous / all)

Filter             A1                     A2                      A3                      A1 & A2 & A3
NS/SS              2,752 / 934 / 3,385    2,900 / 1,090 / 3,640   2,806 / 1,070 / 3,625   1,728 / 273 / 1,928
Not in dbSNP130    182 / 35 / 216         218 / 47 / 262          200 / 38 / 235          12 / 2 / 14
IBD = 2            16 / 5 / 21            20 / 5 / 25             17 / 6 / 23             2 / 0 / 2
Sanger validated                                                                          2 / – / 2

Reducing the search space to the identical by descent (IBD = 2) regions and filtering out all common variants decreased the number of genes with nonsynonymous variants and acceptor or donor splice site mutation to two candidate genes. NS, nonsynonymous; SS, acceptor or donor splice site mutations.

In summary, we have identified PIGV mutations in HPMR using whole-exome capture and SOLiD sequencing in combination with an HMM algorithm to identify regions with IBD = 2 in siblings affected with autosomal recessive disorders. Our algorithm can be used in combination with other bioinformatic filters to streamline gene discovery in future exome sequencing projects.

Accession codes. The mutations in this work were numbered according to transcripts available in GenBank under the codes NM_003047.3 (SLC9A1) and NM_017837.2 (PIGV).

Note: Supplementary information is available on the Nature Genetics website.

ACKNOWLEDGMENTS
This work was supported by a grant from the Deutsche Forschungsgemeinschaft (SFB 665) to S.M., by a grant from Bundesministerium für Bildung und Forschung (BMBF, project number 0313911) and an Australian National Health and Medical Research Council international research training fellowship to T.R., and by a grant of the Canadian Institutes of Health Research and Epilepsy Canada to M.D.T. We thank B. Fischer, U. Kornak, M. Ralser, E. van Beusekom, U. Marchfelder and D. Lefeber for their assistance in this project.

AUTHOR CONTRIBUTIONS
M.R.S., M.I. and A.D. performed targeted exome resequencing. P.M.K., C.R., A.F., M.K., S.B., S.K., M.J. and P.N.R. performed bioinformatic analysis. P.M.K., C. Marcelis, J.G., B.J.d.C., F.S. and T.R. performed mutation analysis and genotyping. D.H., C. Marcelis, M.D.T., D.E.C., S.D., P.M., E.P., T.R. and H.G.B. contributed to clinical evaluation of the affected individuals and delineation of the phenotype. P.M.K., U.K. and C. Meisel performed flow cytometric analysis. Y.M. and T.K. performed analysis of wild-type and A341E PIGV clones. P.M.K., M.R.S., D.H., J.H., H.G.B., P.N.R. and S.M. carried out the project planning and preparation of the manuscript.

COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.

Published online at /naturegenetics/.
Reprints and permissions information is available online at /reprintsandpermissions/.

1. Ng, S.B. et al. Nat. Genet. 42, 30–35 (2010).
2. Ng, S.B. et al. Nature 461, 272–276 (2009).
3. Roach, J.C. et al. Science 328, 636–639 (2010).
4. Mabry, C.C. et al. J. Pediatr. 77, 74–85 (1970).
5. Kruse, K., Hanefeld, F., Kohlschutter, A., Rosskamp, R. & Gross-Selbeck, G. J. Pediatr. 112, 436–439 (1988).
6. Horn, D., Schottmann, G. & Meinecke, P. Eur. J. Med. Genet. 53, 85–88 (2010).
7. McQuillan, R. et al. Am. J. Hum. Genet. 83, 359–372 (2008).
8. Rabe, P. et al. Am. J. Med. Genet. 41, 350–354 (1991).
9. Marcelis, C.L., Rieu, P., Beemer, F. & Brunner, H.G. Clin. Dysmorphol. 16, 73–76 (2007).
10. Thompson, M.D. et al. Am. J. Med. Genet. 152a, 1661–1669 (2010).
11. Kang, J.Y. et al. J. Biol. Chem. 280, 9489–9497 (2005).
12. Kinoshita, T., Fujita, M. & Maeda, Y. J. Biochem. 144, 287–294 (2008).
13. Nozaki, M. et al. Lab. Invest. 79, 293–299 (1999).
14. Takeda, J. et al. Cell 73, 703–711 (1993).
15. Almeida, A.M. et al. Nat. Med. 12, 846–851 (2006).
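Exome Sequencing and Biological Replicates: Overview and Explanation
1. Introduction
1.1 Overview
Exome sequencing is a research method based on high-throughput sequencing that aims to sequence and analyze the exonic regions of an organism quickly and accurately. Exons are the segments of the genome that encode proteins. They occupy only 0.5% to 1.5% of the genome, yet they carry more than 80% of known disease-causing mutations. Exome sequencing is therefore widely used to find mutations in protein-coding genes and to identify and study pathogenic mutations associated with hereditary disease, tumors and other complex diseases.
The basic principle is to sequence a DNA sample on a high-throughput platform and then use bioinformatic methods to align the reads to a reference genome, which determines the exonic sequence of the sample and the positions of any mutations. Compared with whole-genome sequencing, exome sequencing is cheaper and more efficient: the exome is comparatively small and functionally important, so candidate pathogenic mutations can be screened and identified more precisely.
Exome sequencing has broad and important applications in biological research. Besides the study of human hereditary disease and tumor mutations, it can be applied to genomic research in agriculture, animal husbandry and other areas of biology. Sequencing the exomes of different individuals reveals genetic differences between them, the accumulation of mutations and patterns of evolution, and provides important evidence for studies of human evolution and adaptation.
Exome sequencing also faces challenges. First, because it covers only the exons, it has limited power to identify mutations in non-coding regions. Second, it can struggle with complex diseases and disease-associated genomic variation, because the relevant variants may lie in regulatory regions or in functional non-coding RNAs. In addition, it places high demands on sequencing depth and accuracy, and so depends on high-quality sequencing platforms and data analysis methods.
In short, exome sequencing is an efficient and accurate technique that plays an important role in biological research and clinical diagnosis. As the technology develops and its applications widen, it will help reveal the relationship between genomic variation and function and open up more possibilities for early diagnosis and personalized treatment. Research on biological replicates likewise offers a fresh perspective that helps to uncover the workings of life and the patterns of evolution.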
Exome Sequencing Data Analysis (slide deck)
3.3.3 Mutation characteristics
97.76 96.16
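Mapping_datasize (Mb); Effective_sequences_on_target (Mb); Average_sequencing_depth_on_target
Mismatch_rate_in_target_region; Mismatch_rate_in_all_effective_sequence
3.1 Data filtering and evaluation
3.2 Overall quality assessment
3.3 SNP detection and annotation
3.4 InDel detection and annotation
3.5 Advanced analysis
3.1 Data filtering and evaluation
3.1.1 Raw data filtering
1. Adapter filtering: remove the adapter sequence from reads that contain it.
2. The proportion of N (bases that could not be called) in a read …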
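The two filtering rules above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the source text breaks off before giving the N cutoff or saying what happens to reads that exceed it, so `max_n_fraction` and the decision to discard such reads are assumptions, and the adapter string is only an example value.

```python
def trim_adapter(read: str, adapter: str) -> str:
    """Cut the read at the first full occurrence of the adapter, if any."""
    pos = read.find(adapter)
    return read if pos == -1 else read[:pos]


def n_fraction(read: str) -> float:
    """Proportion of uncalled bases (N) in the read."""
    return read.upper().count("N") / len(read) if read else 1.0


def filter_reads(reads, adapter, max_n_fraction=0.1):
    """Trim adapters, then drop empty reads and reads with too many Ns.

    max_n_fraction is a hypothetical cutoff; the source does not state one.
    """
    kept = []
    for read in reads:
        trimmed = trim_adapter(read, adapter)
        if trimmed and n_fraction(trimmed) <= max_n_fraction:
            kept.append(trimmed)
    return kept


example_reads = ["ACGTACGTAGATCGGAAGAGC", "ACGTNNNNACGT", "ACGTACGTAC"]
print(filter_reads(example_reads, adapter="AGATCGGAAGAGC"))
```

Real trimmers also handle partial adapter matches at the read end and base-quality filters, which this sketch leaves out.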
Mutation spectrum plot
Note: the x-axis shows the mutation types; the y-axis shows the frequency of each type.
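A mutation spectrum of this kind is just the share of each substitution type among the called SNVs. The sketch below assumes the common convention of collapsing complementary changes into six classes (for example G>A is counted as C>T); the source does not say whether the plot uses six or twelve classes, and the function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Report the change relative to the pyrimidine (C or T) strand."""
    ref, alt = ref.upper(), alt.upper()
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_spectrum(snvs):
    """Map each substitution class to its frequency among the given SNVs."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in sorted(counts.items())}


# Toy input: (reference base, alternate base) pairs.
print(mutation_spectrum([("C", "T"), ("G", "A"), ("A", "G"), ("T", "C")]))
```

The resulting dictionary maps directly onto the plot: keys are the x-axis categories and values are the bar heights.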
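3.3.3 Mutation characteristics
Base preference in the sequence context of mutation sites
Note: the x-axis is the base position relative to the mutation site, where 0 is the SNP itself, negative numbers are bases before it and positive numbers are bases after it; the y-axis is the proportion of each base. The plot shows that different types of SNP have different base preferences in their sequence context.
3.2.2 Exome capture statistics
Target region stat: Length_of_target_region (Mb)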
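Human Exome Sequencing
Lu Gui, WuXi AppTec Genome Center

1. What is exome sequencing (whole-exome sequencing)?
Exome sequencing is a genomic analysis method in which the exonic regions of the genome are captured and enriched by sequence capture and then sequenced at high throughput. It costs less than whole-genome resequencing and is well suited to studying SNPs and indels in genes, but it cannot be used to study genomic structural variation such as chromosomal breakage and rearrangement.

2. Which exome capture kits are available?
The main exome capture reagents currently come from three vendors: Roche, Illumina and Agilent. The NimbleGen and Illumina kits use DNA probes, which are chemically stable; the Agilent kit uses RNA probes, which may be less stable.

3. What is exome capture efficiency?
Exome sequencing involves a hybridization step. Many regions of the human chromosomes are homologous to exons, and these regions are likely to be captured during hybridization as well, so some of the sequenced reads are not exonic. Capture efficiency is the proportion of all sequenced reads that are exonic.
NimbleGen: about 70%
Agilent: about 60%
Illumina: about 50%

4. What coverage is generally recommended for exome sequencing?
Usually 100X or 150X. When heterogeneous genetic variation is being studied, higher coverage makes it possible to find mutations present at a low fraction. In addition, exome coverage is not very uniform, so a higher mean coverage helps ensure that most regions reach sufficient depth.

5. How large a deletion can exome sequencing detect?
Deletions of up to roughly 50 bp. Most sequencing is currently done on the HiSeq 2000, with a read length of 100 bp per end. Because exome coverage is very uneven, a large deletion cannot be distinguished from a region that simply was not captured during hybridization. What can be detected at present are deletions found within a single read. Since a read is only 100 bp long, deletions of up to about 50 bp can be detected by exome sequencing.

6. Can exome capture be used for CNV analysis?
Exome sequencing includes a hybrid-capture step, which raises the issue of capture efficiency. …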
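Exome Sequencing
346: 256-259.
[Case 3] Cancer research: exome sequencing of intratumor heterogeneity in localized lung adenocarcinoma [14]
This study used multi-region sampling to analyze intratumor heterogeneity, performing exome sequencing on 48 tumor samples from 11 patients with localized lung adenocarcinoma. A total of 7,269 somatic mutations were identified, 21 of which were known cancer-associated gene mutations. Of the somatic mutations, 76%, and 20 of the 21 known cancer gene mutations, were detectable in every regional sample of the same tumor, indicating that a single biopsy of one tumor region, sequenced at adequate depth, can identify the great majority of mutations. By contrast, earlier work on clear cell renal cell carcinoma found that mutations shared across regions of a tumor made up only 31% to 37% of all mutations, which shows that tumor heterogeneity differs between cancer types.
Applications
Mendelian disease research
Mabry syndrome [1]: causal gene PIGV identified;
Acne inversa [2]: causal gene NCSTN identified;
Oculocutaneous albinism [3]: causal gene SLC24A5 identified;
Congenital anomalies of the kidney and urinary tract [4]: causal gene DSTYK identified;
Complex disease research
Combined hypolipidemia [5]: causal gene ANGPTL3 identified;
Autism [6]: 11 de novo mutations identified
……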
[9] Rudin C M, Durinck S, Stawiski E W, et al. Comprehensive genomic analysis identifies SOX2 as a frequently amplified gene in small-cell lung cancer[J]. Nature Genetics, 2012, 44(10): 1111-1116.
Exome Sequencing
1. Choi M, Scholl U I, Ji W, et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing[J]. Proceedings of the National Academy of Sciences, 2009, 106(45): 19096-19101.
2. Yan X J, Xu J, Gu Z H, et al. Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia[J]. Nature Genetics, 2011, 43(4): 309-315.
3. Cancer Genome Atlas Research Network. Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia[J]. N Engl J Med, 2013, 368: 2059-2074.
The protein-coding regions of the human genome contain about 85% of disease-causing mutations.
- Choi M, Scholl U I, Ji W, et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing[J]. Proceedings of the National Academy of Sciences, 2009, 106(45): 19096-19101.
5. Local multiple-sequence realignment of reads in indel regions: at the edges of an indel, some mismatches look very much like SNPs. Locally realigning the reads near indels listed in dbSNP and indels detected in the BAM file removes false-positive SNPs around indels.
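DNA Sequencing Technologies and Their Applications (slide deck)
Third-generation sequencing
1. The current first-phase read speed is 3 bp/s.
2. It exploits the intrinsic processivity of DNA polymerase (the ability of the polymerase to synthesize a long stretch in one go), so a very long sequence can be read in a single reaction.
3. Its accuracy is very high, reaching 99.9999%.
4.1 Single-molecule real-time DNA sequencing
Fluorescence detection under a microscope
Polymerase, sulfurylase, luciferase
Sequencing data output:
Indicator, camera, PTP plate
Summary
The outstanding advantage of 454 sequencing is its long read length: reads from the GS FLX system now exceed 400 bp. Although sequencing on the 454 platform costs much more than on other next-generation platforms, it remains the best choice for applications that need long reads, such as de novo sequencing.
Its drawback is that it cannot measure the length of homopolymers accurately.
Sequencing process:
Ligase
Universal sequencing primer n
Octamer oligonucleotide, template
Summary
Ultra-high throughput is the most notable feature of the SOLiD system: a single SOLiD 3 run currently produces 50 Gb of sequence data, equivalent to 17-fold coverage of the human genome.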
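The "17-fold" figure follows from dividing run output by genome size. A small worked check, assuming a round haploid human genome size of 3.0 Gb (an assumption; the slide does not state the value it used):

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth if total_bases were spread evenly over the genome."""
    return total_bases / genome_size


run_output = 50e9      # 50 Gb per run, as quoted above
human_genome = 3.0e9   # assumed round figure for the haploid human genome

print(round(fold_coverage(run_output, human_genome), 1))  # 16.7
```

Rounded to the nearest whole number this gives the quoted 17-fold coverage; a slightly larger genome size would give a value nearer 16.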
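Table 1 Comparison of three second-generation sequencing technologies
4. Third-generation DNA sequencing
Second-generation high-throughput sequencing has been developed and applied widely, but because key issues of speed, cost and accuracy remain hard to solve, researchers soon turned to newer sequencing approaches, and single-molecule sequencing emerged.
The Human Genome Project is one of the great scientific undertakings of modern life science. It laid the foundation for the development of the life sciences in the 21st century and for the industrialization of modern medical biotechnology, and it has great scientific significance and commercial value.
History and development of DNA sequencing
Sequencing technology can be traced back to the 1950s. The earliest report, in 1954, was by Whitfeld and colleagues, who determined polyribonucleotide sequences by chemical degradation. In 1977, the dideoxy chain-termination method of Sanger et al. and the chemical degradation method of Gilbert et al. marked the birth of first-generation sequencing.
Over the following thirty-odd years, second-generation technologies appeared, including Roche's 454, Illumina's Solexa and ABI's SOLiD.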
Exome Sequencing (slide deck)
II. Sequencing depth
• The sensitivity to detect heterozygous variants with 10 reads is 78.6%, but increases to 95.2% at 20x and approximately 100% at 30x and greater.[1]
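…transcript ID, exon number and amino-acid change, for example OD2:NM_022162:exon8:c.G2722C:p.G908R.
Gene annotation uses RefSeq by default; on request, other gene annotation systems such as UCSC Known Gene, Ensembl, GENCODE or CCDS can be used.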
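The colon-separated annotation string shown above can be split into named fields for downstream filtering. The sketch below is an illustration only: the field names are hypothetical labels for the five parts, and the example string is copied exactly as it appears in the text, where the first field seems to be cut off at the start.

```python
from typing import NamedTuple


class GeneAnnotation(NamedTuple):
    gene: str
    transcript: str
    exon: str
    cdna_change: str
    protein_change: str


def parse_annotation(text: str) -> GeneAnnotation:
    """Split 'gene:transcript:exon:cDNA change:protein change' into fields."""
    parts = text.strip().split(":")
    if len(parts) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(parts)}")
    return GeneAnnotation(*parts)


record = parse_annotation("OD2:NM_022162:exon8:c.G2722C:p.G908R")
print(record.transcript, record.exon, record.protein_change)
```

Failing loudly on an unexpected field count is deliberate, since annotation tools sometimes emit several such strings joined by commas for one variant.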
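2) 1000G annotation: checks whether the variant site is present in the 1000 Genomes Project database (2012 release) and, if so, reports its allele frequency. By default the database for all populations is used; on request, allele frequencies can be shown for specific populations (for example AMR, AFR, ASN, EUR, Chinese or Japanese).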
2. Yan X J, Xu J, Gu Z H, et al. Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia[J]. Nature genetics, 2011, 43(4): 309-315.
• Exome sequencing produced a higher level of coverage for the targeted sequences (mean, 167.50×), slightly increasing our ability to detect mutations with VAFs of less than 10%. [3]
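Illumina Sequencing Principles and Common Library Construction Methods (slide deck)
Eukaryotes
Total RNA
Prokaryotes
Enrich mRNA with oligo(dT)
Remove rRNA
Fragment the mRNA randomly to ~200 nt
Synthesize cDNA by reverse transcription with random hexamer primers
End repair, A-tailing and adapter ligation, then PCR amplification
Illumina sequencing
Strand-specific transcriptome library construction
Each lane (Langos, P7 and P5 adapters)
Instrument overview
Single DNA template
35 cycles of bridge PCR
About 1,000 copies of the DNA template
cBot
HiSeq Sequencer
Template hybridization: hybridize single-stranded DNA templates to the flow cell
Bridge PCR amplification
Single-stranded DNA hybridizes to the matching adapter on the flow cell surface, forming a "bridge"
Amplification is primed from the adapter
Denaturation
Denature the double-stranded "bridge"
to obtain two complementary single-stranded DNA molecules attached to the flow cell
Second round of bridge PCR amplification
Bridge PCR amplification complete
28 cycles of bridge PCR completed
Linearization
• Exon capture is mainly used for human whole-exome sequencing and target-region sequencing
Capture method
• Liquid-phase hybridization
• In liquid-phase hybridization, DNA fragments are hybridized to probes in solution by complementary base pairing and then eluted to enrich the target fragments.
Features of exon sequencing
• Advantages:
• Limitations:
• 1. Compared with whole-genome sequencing, …
• 1. Compared with whole-genome sequencing, it cannot …
• Synthesis
• Imaging and signal collection
• Unblocking and cleavage of the fluorescent …
Exons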
individuals that are exome sequenced variants from affected individuals variants from unaffected
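Autosomal recessive disease:
Monogenic disease
Sample, platform
• A large family with congenital amaurosis
• The SureSelect 50 Mb All Exon Targeted Enrichment kit
• Illumina HiSeq 2000
Adapted from: Ignacio Varela, et al. Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma. Nature (2011)
Liver cancer study:
Adapted from: Yi Shi, et al. Exome Sequencing Identifies ZNF644 Mutations in High Myopia. PLoS Genetics 7(6), 1–10 (2011)
Rare recessive disease in a consanguineous family:
Monogenic disease
Sample
Platform, method, conclusion
Sample, experiment, platform, method
• Primary tumors and portal vein tumor thrombi (PVTT) invading the hepatic portal vein from ten liver cancer patients
Complex disease
• NimbleGen Human Exome 2.1M Arrays & Illumina/Solexa sequencing
• SureSelect Human All Exon Kit (38 Mb) & ABI SOLiD sequencing
• Exome sequencing of the primary tumors and matched PVTT from the 10 patients
• In an expanded study, mutations in 10 genes were examined in 110 patients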
- Choi M, Scholl U I, Ji W, et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing[J]. Proceedings of the National Academy of Sciences, 2009, 106(45): 19096-19101.
II. Sequencing depth
• The sensitivity to detect heterozygous variants with 10 reads is 78.6%, but increases to 95.2% at 20x and approximately 100% at 30x and greater. [1]
• The average coverage of each base in the targeted regions was 100-fold, and 95.3% of these bases were covered sufficiently deeply for variant calling (≥10× coverage). [2]
• Exome sequencing produced a higher level of coverage for the targeted sequences (mean, 167.50×), slightly increasing our ability to detect mutations with VAFs of less than 10%. [3]
3. Cancer Genome Atlas Research Network. Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia[J]. N Engl J Med, 2013, 368: 2059-2074.
Coverage rate
• Exome sequencing results on the Ion Proton™ System using the Ion PI™ Chip and the Ion TargetSeq™ Exome Kit
Exome sequencing results on the Ion Proton™

Metric                     Value
Raw reads                  89,782,719
Reads mapped               87,156,364
Percent reads mapped       97.1%
Reads on target            68,899,957
Percent reads on target    79.1%
Mean depth of coverage     119x
Target bases at 1x         98.5%
Target bases at 10x        95.3%
Target bases at 20x        92.5%

Type                 Number of variants    Concordance with dbSNP135
SNVs                 30,095                98.0%
Heterozygous SNVs    18,031                97.1%
Homozygous SNVs      12,046                99.4%
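Metrics such as "Mean depth of coverage" and "Target bases at 10x" are summaries of per-base depth over the target region. A minimal sketch of how they could be computed, using a toy depth list rather than real data (the function names are hypothetical):

```python
def mean_depth(depths):
    """Average sequencing depth over all target positions."""
    return sum(depths) / len(depths)


def fraction_at_least(depths, cutoff):
    """Share of target positions covered by at least `cutoff` reads."""
    return sum(1 for d in depths if d >= cutoff) / len(depths)


# Toy per-base depths for ten target positions (illustrative only).
toy_depths = [0, 3, 12, 25, 40, 40, 18, 9, 22, 31]

print(mean_depth(toy_depths))
for cutoff in (1, 10, 20):
    print(cutoff, fraction_at_least(toy_depths, cutoff))
```

Because each higher cutoff can only exclude positions, the fractions never increase as the cutoff rises, which matches the ordering of the 1x, 10x and 20x rows.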
2. Yan X J, Xu J, Gu Z H, et al. Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia[J]. Nature genetics, 2011, 43(4): 309-315.
The human genome contains about 180,000 exons, which make up about 1% of the genome, or roughly 30 Mb.
-Ng S B, Turner E H, Robertson P D, et al. Targeted capture and massively parallel sequencing of 12 human exomes[J]. Nature, 2009, 461(7261): 272-276.
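Exome Sequencing
Contents
I. Introduction to exome sequencing
II. Sequencing depth
III. Sequencing platforms
IV. Data analysis workflow
V. Data analysis content
VI. Follow-up validation
I. Introduction to exome sequencing
Exome sequencing (also called targeted exome capture) is a genomic analysis method in which the exonic regions of genomic DNA are captured and enriched by sequence capture and then sequenced at high throughput. It is an efficient strategy for selecting the coding sequence of the genome; it costs less than whole-genome resequencing and is well suited to studying SNPs and indels in known genes.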
1. Choi M, Scholl U I, Ji W, et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing[J]. Proceedings of the National Academy of Sciences, 2009, 106(45): 19096-19101.
Sequencing depth and coverage of the nine paired initial sequencing samples.
III. Sequencing platforms
Ion Proton™
Illumina HiSeq
Exome sequencing workflow on the Ion Proton™
• The bound DNA is isolated using streptavidincoated Dynabeads® paramagnetic beads, and then amplified and purified. The purified, target-enriched sample is then returned to the Ion Torrent system workflow for emulsion PCR, enrichment, and sequencing.