Bioinformatics Analysis of the Proteome
Summary of proteome & proteomics
Bottleneck in Proteome based on MS/MS
Reproducibility
Quantification
Accuracy
Reliability
Reference: Shilov IV, Seymour SL et al (2007) “The Paragon Algorithm, a Next Generation Search Engine That Uses Sequence Temperature Values and Feature Probabilities to Identify Peptides from Tandem Mass Spectra” MCP 6.9, 1638
Example cutoff: FC 2, P-value less than 0.05.
Quant. Results evaluation
Evaluation of biological/technical replicates
Quantified overlap between two experiments
Correlation between two experimental replicates ratio pairs
Deciding significantly differentially expressed proteins
If biological or technical replicates are included, one may take the mean or median of the comparable ratios, or simply decide by the number of replicates in which the difference is significant.
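The replicate checks above can be sketched in a few lines. This is a minimal illustration, not the workflow used in the slides: the protein names, ratios, the FC cutoff of 1.5 and the "at least 2 replicates" voting rule are all hypothetical values chosen for the example.

```python
from math import log2, sqrt
from statistics import median

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical protein -> ratio (treated/control) for two replicates.
rep1 = {"P1": 2.1, "P2": 0.9, "P3": 0.45, "P4": 1.6}
rep2 = {"P1": 1.9, "P2": 1.1, "P3": 0.50, "P5": 1.2}

# Quantified overlap: proteins quantified in both replicates / in either.
shared = sorted(set(rep1) & set(rep2))
overlap = len(shared) / len(set(rep1) | set(rep2))

# Correlation of log2 ratio pairs over the shared proteins.
r = pearson([log2(rep1[p]) for p in shared],
            [log2(rep2[p]) for p in shared])

def summarize(ratios, fc=1.5, min_votes=2):
    """Return (median ratio, flag) for one protein across replicates.

    The flag is True when at least min_votes replicates show a ratio
    beyond the fold-change cutoff in either direction (assumed rule).
    """
    votes = sum(1 for x in ratios if x >= fc or x <= 1 / fc)
    return median(ratios), votes >= min_votes

print(f"overlap={overlap:.2f}, r={r:.3f}")
print(summarize([2.1, 1.9, 1.4]))
```

The correlation is computed on log2 ratios so that up- and down-regulation are treated symmetrically.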
• Unique feature
  • hypothesis selection stage: “… there is greater potential for improvement from advances in determining what to score, not how to score it.” Shilov IV, Seymour SL et al (2007) MCP 6.9, 1638
• ProGroup™ algorithm for protein inference
• Quantitative results for stable isotope label quant experiments
• Extensive AA Modification Catalog
• Global & local FDR filtering
Bioinformatics for Proteome
Wuhan GeneCreate Biological Engineering Co., Ltd. (武汉金开瑞生物工程有限公司)
OUTLINE
Summary of Proteome
Bottleneck in Proteome based on MS/MS
Qualitative Proteome Analysis
Quantitative Proteome Analysis
Post Translational Modifications (PTMs)
Functional Annotation of large scale protein data set
Peptide ion fragmentation
Available search engines
Why ProteinPilot
• Paragon™ algorithm: ‘Sequence approach’ search algorithm for peptide ID (Shilov IV, Seymour SL et al (2007) MCP 6.9, 1638)
Functional enrichment
Pathway enrichment results
ID statistics
Sample name   #Total spectra   #ID spectra   ID percentage   #ID peptides   #ID proteins
ALL           345040           158011        45.8%           28125          4741
Feature distribution
MS/MS spectra: fragment ions(product ions)
example.mgf
Charge of precursor ion
Mass of precursor ion
Peptide fragment ions: m/z intensity charge
End symbol of one MS/MS spectrum
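The labels above annotate the parts of one spectrum in an MGF file. As a rough sketch of how such a file is read, the parser below handles the common `BEGIN IONS` / `END IONS` layout. The spectrum content is made up for illustration, and the function name `parse_mgf` is hypothetical; note that in MGF the `PEPMASS` field holds the precursor m/z (optionally followed by its intensity), with the charge given separately in `CHARGE`.

```python
def parse_mgf(text):
    """Parse MGF-formatted text into a list of spectra.

    Each spectrum is a dict with 'params' (KEY=value header lines)
    and 'peaks' (a list of (m/z, intensity) tuples).
    """
    spectra, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line == "BEGIN IONS":            # start of one MS/MS spectrum
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":            # end symbol of one MS/MS spectrum
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                key, value = line.split("=", 1)
                current["params"][key.upper()] = value
            else:
                # Fragment ion line: m/z intensity [charge]; extra columns ignored.
                fields = line.split()
                current["peaks"].append((float(fields[0]), float(fields[1])))
    return spectra

# Hypothetical spectrum, for illustration only.
EXAMPLE = """\
BEGIN IONS
TITLE=hypothetical spectrum 1
PEPMASS=421.7585 12345.0
CHARGE=2+
147.1128 310.5
204.1343 1520.0
276.1554 845.2
END IONS
"""

spectra = parse_mgf(EXAMPLE)
precursor_mz = float(spectra[0]["params"]["PEPMASS"].split()[0])
print(precursor_mz, spectra[0]["params"]["CHARGE"], len(spectra[0]["peaks"]))
```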
Unique peptide number
Peptide length distribution
Venn Diagram among multiple experiments
Protein coverage distribution
How to quant a PSM?
Quant. Based MS: XIC(eXtracted Ion Current)
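To make the XIC idea concrete: for a given precursor m/z, the intensity found within a small mass tolerance is collected from every MS scan, and the area under the resulting intensity-versus-retention-time trace serves as the abundance. The sketch below assumes scans are already available as `(retention time, peak list)` pairs; the data, the 10 ppm tolerance and the function names are hypothetical.

```python
def extract_xic(scans, target_mz, tol_ppm=10.0):
    """Return [(rt, summed intensity)] for peaks within tol_ppm of target_mz."""
    tol = target_mz * tol_ppm / 1e6
    xic = []
    for rt, peaks in scans:
        total = sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol)
        xic.append((rt, total))
    return xic

def trapezoid_area(xic):
    """Area under the XIC trace by the trapezoidal rule."""
    return sum((t2 - t1) * (i1 + i2) / 2
               for (t1, i1), (t2, i2) in zip(xic, xic[1:]))

# Hypothetical MS1 scans: (retention time in min, [(m/z, intensity), ...]).
scans = [
    (10.0, [(421.7585, 0.0), (500.0000, 50.0)]),
    (10.1, [(421.7590, 100.0)]),
    (10.2, [(421.7583, 300.0), (421.9000, 999.0)]),  # 421.9 is outside tolerance
    (10.3, [(421.7586, 100.0)]),
    (10.4, [(421.7585, 0.0)]),
]

xic = extract_xic(scans, target_mz=421.7585)
area = trapezoid_area(xic)
print(xic, area)
```

Real tools additionally detect peak boundaries and check isotope patterns and charge state; those steps are omitted here.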
Post Translational Modifications
Post translational modification (PTM) is a step in protein biosynthesis. Proteins are created by ribosomes translating mRNA into polypeptide chains. These polypeptide chains undergo PTM (such as folding, cutting and other processes) before becoming the mature protein product.
Website: http://www.unimod.org
Workflow for PTMs
Shortcoming by conventional ID workflow
A new concept of site location for PTMs
Ascore
PTM score
MD score
Identification
Identification workflow
How to identify a protein/peptide sequence
Raw data from mass spectrum
MS spectra: Peptide ions(precursors, mother ions)
The aim of Unimod is to create a community-supported, comprehensive database of protein modifications for mass spectrometry applications: accurate and verifiable values, derived from elemental compositions, for the mass differences introduced by all types of natural and artificial modifications. Other important information includes any mass change (neutral loss) that occurs during MS/MS analysis, and site specificity (which residues are susceptible to modification, and any constraints on the position of the modification within the protein or peptide).
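As an illustration of "mass differences derived from elemental compositions", the sketch below sums monoisotopic atomic masses for two well-known modifications. The function name and the small element table are assumptions made for this example; authoritative values should be taken from the Unimod database itself.

```python
# Monoisotopic masses (Da) of a few elements; rounded literature values.
MONO = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
    "S": 31.97207100,
}

def delta_mass(composition):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in composition.items())

# Phosphorylation adds HPO3; acetylation adds C2H2O.
phospho = delta_mass({"H": 1, "O": 3, "P": 1})
acetyl = delta_mass({"C": 2, "H": 2, "O": 1})

# Neutral loss often seen for phosphoserine/phosphothreonine in MS/MS: H3PO4.
phospho_neutral_loss = delta_mass({"H": 3, "O": 4, "P": 1})

print(f"Phospho  {phospho:.4f}")
print(f"Acetyl   {acetyl:.4f}")
print(f"H3PO4    {phospho_neutral_loss:.4f}")
```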
A general workflow of data analysis
Calculate peptide abundance: XICs, intensity, spectral count
Normalize peptide abundance: mean, median, quantile, linear regression
Decide protein abundance and fold change (FC): mean, median, total, weighted ratio
Statistical analysis: t-test, ANOVA, PCA, Fisher’s exact test
Decide significantly differentially expressed proteins: FC cutoffs such as 1.2, 1.3, 1.5 or 2, with P-value less than 0.05
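A compact sketch of the first steps of this workflow is given below: median normalization of peptide abundances, peptide ratios, protein-level fold change as the median of peptide ratios, and an FC cutoff. All peptide and protein names and numbers are hypothetical, median is only one of the options listed above, and the statistical test is deliberately left out.

```python
from math import log2
from statistics import median

def median_normalize(sample):
    """Divide each abundance by the sample median to remove loading bias."""
    m = median(sample.values())
    return {pep: value / m for pep, value in sample.items()}

# Hypothetical peptide abundances; 'treated' was loaded at roughly 2x.
control = {"pepA": 100.0, "pepB": 200.0, "pepC": 400.0}
treated = {"pepA": 600.0, "pepB": 400.0, "pepC": 200.0}
protein_peptides = {"P1": ["pepA", "pepB"], "P2": ["pepC"]}

norm_c = median_normalize(control)
norm_t = median_normalize(treated)

# Peptide-level ratios (treated / control) after normalization.
pep_ratio = {pep: norm_t[pep] / norm_c[pep] for pep in control}

# Protein-level fold change: median of its peptides' ratios.
protein_fc = {prot: median(pep_ratio[p] for p in peps)
              for prot, peps in protein_peptides.items()}

FC_CUTOFF = 1.5  # one of the cutoffs listed above
changed = {prot: fc for prot, fc in protein_fc.items()
           if fc >= FC_CUTOFF or fc <= 1 / FC_CUTOFF}

for prot, fc in protein_fc.items():
    print(prot, fc, round(log2(fc), 2))
```

In practice the FC filter is combined with a P-value from replicate measurements before a protein is reported as differentially expressed.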
Quant. Based MS/MS: XIC(eXtracted Ion Current) & intensity
XICs of fragment ions: MRM, SWATH
Intensities of fragment ions (label reporter ions): iTRAQ, TMT
Quantitative Proteome
ID results evaluation
Filtering standard
By identification confidence: protein Unused score > 1.3 (corresponding to 95% confidence). This is the most commonly used criterion.
By local FDR cutoff: less than 1% or 5% FDR.
By number of unique peptides: at least 1 unique peptide per protein group.
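The link between the 1.3 score and 95% confidence follows from a negative log transform: -log10(1 - 0.95) is about 1.3. The sketch below assumes that this is how the score relates to confidence, and it applies all three criteria together, which is a choice made for this example; the record fields (`unused`, `local_fdr`, `unique_peptides`) are hypothetical and not an export format of any tool.

```python
from math import log10

def confidence_to_score(confidence):
    """Score equivalent of a confidence in [0, 1): -log10(1 - confidence)."""
    return -log10(1.0 - confidence)

def passes_filters(protein, min_unused=1.3, max_local_fdr=0.01, min_unique=1):
    """Apply the three filtering criteria jointly (assumed combination)."""
    return (protein["unused"] > min_unused
            and protein["local_fdr"] < max_local_fdr
            and protein["unique_peptides"] >= min_unique)

# Hypothetical protein-level records.
proteins = [
    {"id": "P1", "unused": 4.2, "local_fdr": 0.001, "unique_peptides": 3},
    {"id": "P2", "unused": 1.1, "local_fdr": 0.004, "unique_peptides": 1},
    {"id": "P3", "unused": 2.0, "local_fdr": 0.030, "unique_peptides": 1},
]

kept = [p["id"] for p in proteins if passes_filters(p)]
print(round(confidence_to_score(0.95), 2), kept)
```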
Fisher’s exact test
Cumulative upper-tail hypergeometric test
Functional enrichment
GO enrichment results
When the p-value is less than 0.05, the corresponding GO term is considered significantly enriched.
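The enrichment p-value can be computed as the upper tail of the hypergeometric distribution, which is equivalent to a one-sided Fisher's exact test. In the sketch below the argument names are this example's own: N is the number of annotated background proteins, K the number of those carrying the GO term, n the size of the protein list of interest, and k the number in that list carrying the term. The counts are made up.

```python
from math import comb

def enrichment_pvalue(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: background proteins, K: background proteins with the term,
    n: proteins in the list of interest, k: of those, with the term.
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Hypothetical counts: 20 background proteins, 5 carry the term;
# 5 differentially expressed proteins, 4 of them carry the term.
p = enrichment_pvalue(N=20, K=5, n=5, k=4)
print(f"p = {p:.4f}, enriched = {p < 0.05}")
```

When many GO terms are tested at once, a multiple-testing correction is usually applied to these p-values; that step is not shown here.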
Some visual results
GO annotation
Workflow
Result statistics
GO annotation results
KEGG Pathway annotation
Workflow
Results
Functional enrichment
If genes/proteins are simply annotated directly, many functional nodes are collected and overlap, which can confuse researchers. The annotations are therefore filtered and screened to obtain more meaningful functional nodes.