Bioinformatics Assignment, Version A

Bioinformatics Assignment
CDK4:
References:
[01] Aggarwal P, Vaites LP: Nuclear cyclin D1/CDK4 kinase regulates CUL4 expression and triggers neoplastic growth via activation of the PRMT5 methyltransferase. Oct. 2010
[02] Zhang X: MicroRNA-related genetic variations as predictors for risk of second primary tumor and/or recurrence in patients with early-stage head and neck cancer. Sep. 2010
[03] Lang JM: A Flexible Multiplex Bead-Based Assay for Detecting Germline CDKN2A and CDK4 Variants in Melanoma-Prone Kindreds. Nov. 2010
Sequence lookup:
Log on to the NCBI database.

In the "search" box select "N"; in the "for" box enter "CDK4".

Click "mRNA LINKS" to view the nucleotide sequence.
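The same record can also be requested without the web interface. The sketch below is a minimal example, assuming the NCBI E-utilities `efetch` endpoint and the accession NM_000075.2 used in this assignment; the helper names `efetch_url` and `fetch_fasta` are made up for illustration, and the download step needs network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, db="nuccore", rettype="fasta"):
    """Build an efetch URL that returns one record as plain text."""
    query = urlencode(
        {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def fetch_fasta(accession):
    """Download the record as FASTA text (requires network access)."""
    with urlopen(efetch_url(accession)) as response:
        return response.read().decode("utf-8")


# Only the URL is built here; call fetch_fasta("NM_000075.2") to download.
url = efetch_url("NM_000075.2")
print(url)
```

Building the URL separately from the download keeps the query parameters easy to inspect before any request is sent.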
Gene sequence:
/nuccore/NM_000075.2
1 agccctccca gtttccgcgc gcctctttgg cagctggtca catggtgagg gtgggggtga
61 gggggcctct ctagcttgcg gcctgtgtct atggtcgggc cctctgcgtc cagctgctcc
121 ggaccgagct cgggtgtatg gggccgtagg aaccggctcc ggggccccga taacgggccg
181 cccccacagc accccgggct ggcgtgaggg tctcccttga tctgagaatg gctacctctc
241 gatatgagcc agtggctgaa attggtgtcg gtgcctatgg gacagtgtac aaggcccgtg
301 atccccacag tggccacttt gtggccctca agagtgtgag agtccccaat ggaggaggag
361 gtggaggagg ccttcccatc agcacagttc gtgaggtggc tttactgagg cgactggagg
421 cttttgagca tcccaatgtt gtccggctga tggacgtctg tgccacatcc cgaactgacc
481 gggagatcaa ggtaaccctg gtgtttgagc atgtagacca ggacctaagg acatatctgg
541 acaaggcacc cccaccaggc ttgccagccg aaacgatcaa ggatctgatg cgccagtttc
601 taagaggcct agatttcctt catgccaatt gcatcgttca ccgagatctg aagccagaga
661 acattctggt gacaagtggt ggaacagtca agctggctga ctttggcctg gccagaatct
721 acagctacca gatggcactt acacccgtgg ttgttacact ctggtaccga gctcccgaag
781 ttcttctgca gtccacatat gcaacacctg tggacatgtg gagtgttggc tgtatctttg
841 cagagatgtt tcgtcgaaag cctctcttct gtggaaactc tgaagccgac cagttgggca
901 aaatctttga cctgattggg ctgcctccag aggatgactg gcctcgagat gtatccctgc
961 cccgtggagc ctttcccccc agagggcccc gcccagtgca gtcggtggta cctgagatgg
1021 aggagtcggg agcacagctg ctgctggaaa tgctgacttt taacccacac aagcgaatct
1081 ctgcctttcg agctctgcag cactcttatc tacataagga tgaaggtaat ccggagtgag
1141 caatggagtg gctgccatgg aaggaagaaa agctgccatt tcccttctgg acactgagag
1201 ggcaatcttt gcctttatct ctgaggctat ggagggtcct cctccatctt tctacagaga
1261 ttactttgct gccttaatga cattcccctc ccacctctcc ttttgaggct tctccttctc
1321 cttcccattt ctctacacta aggggtatgt tccctcttgt ccctttccct acctttatat
1381 ttggggtcct tttttataca ggaaaaacaa aacaaagaaa taatggtctt tttttttttt
1441 ttaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa
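Before the sequence is pasted into another tool, the position numbers and spaces have to be stripped. The following is a minimal sketch of that clean-up, shown on the last two lines of the record above; the function name `strip_numbering` is an illustrative choice, not part of any NCBI tool.

```python
import re


def strip_numbering(text):
    """Remove position numbers, spaces and line breaks, keeping only bases."""
    return re.sub(r"[^acgtn]", "", text.lower())


# Last two lines of the GenBank-style listing, starting at position 1381.
tail_block = """
1381 ttggggtcct tttttataca ggaaaaacaa aacaaagaaa taatggtctt tttttttttt
1441 ttaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa
"""

tail = strip_numbering(tail_block)
# Position of the final base = first position of the block + bases - 1.
last_position = 1381 + len(tail) - 1
print(len(tail), last_position)
```

Because every non-base character is discarded, the same function also works when two numbered lines have run together on one line.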
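Primer design:
Open the Primer3 web page
/primer593/input.htm

Paste the source sequence and choose the parameters.

Click "Pick Primers" to obtain the primers.
/cgi-bin/primer3-web-cgi-bin-0.4.0/primer3_results.cgi
The best-matching primer pair and its parameters: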
OLIGO         start  len  tm     gc%    any   3'    seq
LEFT PRIMER   394    20   59.99  55.00  3.00  1.00  gaaactctgaagccgaccag
RIGHT PRIMER  606    20   60.02  50.00  5.00  1.00  aggcagagattcgcttgtgt
SEQUENCE SIZE: 660
INCLUDED REGION SIZE: 660
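The reported values can be checked by hand. The sketch below is a minimal example that recomputes the GC content of both primers and the expected product size from the table above. It assumes the Primer3 convention that the right primer's start is the position of its 5' base on the template, i.e. the last base of the amplicon; the function names are illustrative.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement, i.e. the template strand the right primer anneals to."""
    return seq.translate(str.maketrans("acgtACGT", "tgcaTGCA"))[::-1]


# Values copied from the Primer3 output above.
left = {"start": 394, "len": 20, "seq": "gaaactctgaagccgaccag"}
right = {"start": 606, "len": 20, "seq": "aggcagagattcgcttgtgt"}

# Assumption: the right primer start marks the last base of the product.
product_size = right["start"] - left["start"] + 1

print(gc_percent(left["seq"]), gc_percent(right["seq"]), product_size)
print(reverse_complement(right["seq"]))
```

The reverse complement of the right primer is the string to look for when locating the 3' end of the product in the source sequence.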
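BLASTN:
Open the NCBI BLAST site
/Blast.cgi
Click "nucleotide blast", paste the sequence, set the query range, and click BLAST to run the search.
Results
List of matching sequences:
Sequence alignments: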
>ref|NM_000075.2| Homo sapiens cyclin-dependent kinase 4
(CDK4), mRNA
Length=1474
GENE ID: 1019 CDK4 | cyclin-dependent kinase 4 [Homo sapiens] (Over 100 PubMed links)
Score = 394 bits (213), Expect = 1e-107
Identities = 213/213 (100%), Gaps = 0/213 (0%)
Strand=Plus/Plus
Query 394 AGGTGGCTTTACTGAGGCGACTGGAGGCTTTTGAGCATCCCAATGTTGTCCGGCTGATGG 453
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 394 AGGTGGCTTTACTGAGGCGACTGGAGGCTTTTGAGCATCCCAATGTTGTCCGGCTGATGG 453

Query 454 ACGTCTGTGCCACATCCCGAACTGACCGGGAGATCAAGGTAACCCTGGTGTTTGAGCATG 513
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 454 ACGTCTGTGCCACATCCCGAACTGACCGGGAGATCAAGGTAACCCTGGTGTTTGAGCATG 513

Query 514 TAGACCAGGACCTAAGGACATATCTGGACAAGGCACCCCCACCAGGCTTGCCAGCCGAAA 573
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 514 TAGACCAGGACCTAAGGACATATCTGGACAAGGCACCCCCACCAGGCTTGCCAGCCGAAA 573

Query 574 CGATCAAGGATCTGATGCGCCAGTTTCTAAGAG 606
          |||||||||||||||||||||||||||||||||
Sbjct 574 CGATCAAGGATCTGATGCGCCAGTTTCTAAGAG 606
>ref|NT_029419.12| Homo sapiens chromosome 12 genomic contig, GRCh37 reference primary
assembly
Length=71516776
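The "Identities" and "Gaps" figures in a BLAST hit are simple counts over the aligned columns. The sketch below is a minimal illustration, using the last alignment block of the first hit (positions 574 to 606); the function name `alignment_stats` is hypothetical, and it assumes gaps are written as "-" as in standard BLAST text output.

```python
def alignment_stats(query, subject):
    """Count identical columns and gap columns in two aligned strings."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    identities = sum(q == s and q != "-" for q, s in zip(query, subject))
    gaps = sum(q == "-" or s == "-" for q, s in zip(query, subject))
    return identities, gaps, len(query)


# Final block of the alignment against NM_000075.2.
query_block = "CGATCAAGGATCTGATGCGCCAGTTTCTAAGAG"
sbjct_block = "CGATCAAGGATCTGATGCGCCAGTTTCTAAGAG"

identities, gaps, columns = alignment_stats(query_block, sbjct_block)
# The whole alignment spans positions 394..606 inclusive.
aligned_length = 606 - 394 + 1
print(identities, gaps, columns, aligned_length)
```

Summing the per-block counts over all four blocks gives the totals that BLAST reports in the hit header.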
Protein analysis:
Use ProtParam to look up the basic physicochemical properties of the protein
http://www.expasy.ch/tools/protparam.html


http://www.expasy.ch/cgi-bin/protparam
Number of amino acids: 303
Molecular weight: 33729.8
Theoretical pI: 6.52
Amino acid composition:
Ala (A) 22 7.3%
Arg (R) 23 7.6%
Asn (N) 7 2.3%
Asp (D) 17 5.6%
Cys (C) 4 1.3%
Gln (Q) 8 2.6%
Glu (E) 19 6.3%
Gly (G) 24 7.9%
His (H) 9 3.0%
Ile (I) 11 3.6%
Leu (L) 33 10.9%
Lys (K) 11 3.6%
Met (M) 8 2.6%
Phe (F) 13 4.3%
Pro (P) 25 8.3%
Ser (S) 15 5.0%
Thr (T) 15 5.0%
Trp (W) 3 1.0%
Tyr (Y) 9 3.0%
Val (V) 27 8.9%
Pyl (O) 0 0.0%
Sec (U) 0 0.0%
(B) 0 0.0%
(Z) 0 0.0%
(X) 0 0.0%
Total number of negatively charged residues (Asp + Glu): 36
Total number of positively charged residues (Arg + Lys): 34
Atomic composition:
Carbon C 1515
Hydrogen H 2381
Nitrogen N 419
Oxygen O 430
Sulfur S 12
Formula: C1515H2381N419O430S12
Total number of atoms: 4757
Extinction coefficients:
Extinction coefficients are in units of M^-1 cm^-1, at 280 nm measured in water.
Ext. coefficient 30160
Abs 0.1% (=1 g/l) 0.894, assuming all pairs of Cys residues form cystines
Ext. coefficient 29910
Abs 0.1% (=1 g/l) 0.887, assuming all Cys residues are reduced
Estimated half-life:
The N-terminal of the sequence considered is M (Met).
The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (Escherichia coli, in vivo).
Instability index:
The instability index (II) is computed to be 39.16
This classifies the protein as stable.
Aliphatic index: 89.74
Grand average of hydropathicity (GRAVY): -0.167
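Analysis of results:
From the output above we obtain the number of amino acids, the molecular weight, the isoelectric point, the extinction coefficients, the estimated half-life, the instability index, the atomic composition, the aliphatic index, and related parameters.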
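Several of these parameters follow directly from the amino acid composition, so they can be recomputed as a check. The sketch below is a minimal example, assuming the formulas described in the ProtParam documentation: aliphatic index = 100 × (x_Ala + 2.9 x_Val + 3.9 (x_Ile + x_Leu)) with x as mole fractions, and an extinction coefficient at 280 nm of 5500 per Trp, 1490 per Tyr, and 125 per cystine. All counts are copied from the output above; the variable names are illustrative.

```python
# Residue counts, molecular weight and atom counts copied from ProtParam.
composition = {
    "Ala": 22, "Arg": 23, "Asn": 7, "Asp": 17, "Cys": 4,
    "Gln": 8, "Glu": 19, "Gly": 24, "His": 9, "Ile": 11,
    "Leu": 33, "Lys": 11, "Met": 8, "Phe": 13, "Pro": 25,
    "Ser": 15, "Thr": 15, "Trp": 3, "Tyr": 9, "Val": 27,
}
molecular_weight = 33729.8
atoms = {"C": 1515, "H": 2381, "N": 419, "O": 430, "S": 12}

n_residues = sum(composition.values())
negative = composition["Asp"] + composition["Glu"]
positive = composition["Arg"] + composition["Lys"]
total_atoms = sum(atoms.values())

# Aliphatic index from the relative volume of aliphatic side chains.
aliphatic_index = 100.0 * (
    composition["Ala"]
    + 2.9 * composition["Val"]
    + 3.9 * (composition["Ile"] + composition["Leu"])
) / n_residues

# Extinction coefficients at 280 nm, with and without cystines.
ext_reduced = 5500 * composition["Trp"] + 1490 * composition["Tyr"]
ext_cystines = ext_reduced + 125 * (composition["Cys"] // 2)

# Absorbance of a 1 g/l solution = extinction coefficient / molecular weight.
abs_reduced = ext_reduced / molecular_weight
abs_cystines = ext_cystines / molecular_weight

# ProtParam predicts a protein as stable when the instability index is below 40.
is_stable = 39.16 < 40

print(n_residues, negative, positive, total_atoms)
print(round(aliphatic_index, 2), ext_cystines, ext_reduced)
print(round(abs_cystines, 3), round(abs_reduced, 3), is_stable)
```

The recomputed values agree with the reported output, which confirms that the pasted composition table is internally consistent.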

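Secondary structure analysis: open SOPMA.

Enter the amino acid sequence retrieved from NCBI and click submit, which gives:
http://npsa-pbil.ibcp.fr/cgi-bin/secpred_sopma.pl
The output above mainly describes the secondary structure of the protein.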

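Protein function analysis:
Open the PROTEIN DATA BANK, then click sequence search to run the query:
/pdb/search/advSearch.do?st=SequenceQuery
Click Submit Query; the results are shown at:
/pdb/results/results.do?outformat=&qrid=59F6615C&tabtoshow=Current
These results show that proteins of known function matching the query can be retrieved from the database, so a protein's function can be predicted from its amino acid sequence.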
