Protein Analysis


Gi:40644130 allene oxide cyclase [Nicotiana tabacum]

Allene oxide cyclase (AOC)

Gi:140083805 cytosolic class II small heat shock protein HSP17.5 [Rosa hybrid cultivar]

Gi:289487897 L-ascorbate peroxidase [Bruguiera gymnorhiza]

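For DNA sequences such as ribosomal 16S or 18S genes or ITS regions, search with Blast n. Whether to use megablast, discontiguous megablast, or blastn depends on how similar your sequence is to the sequences in the database. A common approach is to start with blastn, which tolerates lower similarity and can find a large number of related sequences, and then switch to megablast if you need more stringent matches. Note, however, that any nucleotide-level search still requires fairly high similarity at the nucleotide level. If the sequence is poorly conserved, for example the genome of a novel RNA virus, a nucleotide search may return little or nothing. In that case, search at the protein level instead, using blastp (on a translated sequence) or Blast x.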
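For readers who run the stand-alone NCBI BLAST+ programs rather than the web page, the choice described above corresponds to the `-task` option of the `blastn` executable. The following is a minimal sketch that only builds the command line and does not run it. The function name, the file name `query.fasta`, the database name `nt`, and the chosen E-value and output format are illustrative assumptions, not part of the original text.

```python
# Task names of the BLAST+ `blastn` program that the text discusses,
# ordered from most tolerant of divergence to most stringent.
BLASTN_TASKS = ("blastn", "dc-megablast", "megablast")


def build_blastn_command(query_fasta, database, task="blastn", evalue=1e-5):
    """Return an argument list for a stand-alone blastn search (hypothetical helper)."""
    if task not in BLASTN_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {BLASTN_TASKS}")
    return [
        "blastn",
        "-task", task,            # search algorithm: blastn, dc-megablast, or megablast
        "-query", query_fasta,    # FASTA file with the nucleotide query
        "-db", database,          # name of a formatted nucleotide database
        "-evalue", str(evalue),   # assumed E-value cutoff for this sketch
        "-outfmt", "6",           # tabular output
    ]


if __name__ == "__main__":
    # Start with the more tolerant blastn task, then tighten if needed.
    print(" ".join(build_blastn_command("query.fasta", "nt", task="blastn")))
    print(" ".join(build_blastn_command("query.fasta", "nt", task="megablast")))
```

Passing `-task` explicitly matters because the stand-alone `blastn` executable uses megablast when no task is given.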

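If the sequence is a protein-coding gene, you can first translate it into protein (note that a sequence can in principle be read in six reading frames) and then search a protein database with each translation using blastp. Alternatively, you can submit the nucleotide sequence directly to Blast x, which automatically translates all six reading frames and searches the protein database with each of them.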
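The six reading frames mentioned above are the three codon offsets on the given strand plus the three offsets on its reverse complement. A minimal sketch of the translation step, assuming the standard genetic code and writing unknown codons as `X`, might look like this. The function names are illustrative, and this is not how Blast x itself is implemented.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for codons with ambiguous bases
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper().replace("U", "T")
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)  # frame +1 is "MA*"
```

Each of the six strings could then be used as a separate blastp query, which is the manual equivalent of a single Blast x search.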

Blastp, PSI-BLAST, and PHI-BLAST all compare a protein query against protein sequences.

1. Blastp: standard protein-to-protein sequence comparison

Standard protein BLAST is designed for protein searches.

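Blastp finds sequences in a protein database that are similar to the query amino acid sequence. Like the other BLAST programs, its goal is to find regions of similarity.

2. PSI-BLAST: more sensitive protein-to-protein sequence comparison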

PSI-BLAST is designed for more sensitive protein-protein similarity searches.

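Position-Specific Iterated (PSI)-BLAST is a more sensitive variant of Blastp. It is very effective for finding similar proteins in distantly related species or new members of a protein family. When a standard Blastp search fails, or returns only pseudogenes or predicted sequences (annotated as "hypothetical protein" or "similar to..."), try PSI-BLAST instead.

3. PHI-BLAST: Pattern-Hit Initiated BLAST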

PHI-BLAST can do a restricted protein pattern search.

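PHI-BLAST (Pattern-Hit Initiated BLAST) searches a protein database with a protein query and reports only those alignments that contain the specific pattern present in the query sequence.

For a detailed description of the PHI pattern syntax, see: /blast/html/PHIsyntax.html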

The syntax for patterns in PHI-BLAST follows the conventions of PROSITE. When using the stand-alone program, it is permissible to have multiple patterns in a file separated by a blank line between patterns. When using the Web-page only one pattern is allowed per query.

Valid protein characters for PHI-BLAST patterns:

ABCDEFGHIKLMNPQRSTVWXYZU

Valid DNA characters for PHI-BLAST patterns:

ACGT

Other useful delimiters:

[ ] means any one of the characters enclosed in the brackets e.g., [LFYT] means one occurrence of L or F or Y or T

- means nothing (this is a spacer character used by PROSITE)

x(5) means 5 positions in which any residue is allowed (and similarly for any other single number in parentheses after x)

x(2,4) means 2 to 4 positions where any residue is allowed, and similarly for any other two numbers separated by a comma; the first number should be < the second number.

> can occur only at the end of a pattern and means nothing; it may occur before a period

. (another spacer used by PROSITE) may be used at the end of the pattern and means nothing.
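To make these delimiters concrete, the sketch below converts a pattern written in this syntax into a Python regular expression and uses it to locate a hit in a sequence. It covers only the elements listed above (single residues, bracketed sets, `x`, `x(n)`, `x(m,n)`, and the trailing `>` and `.` spacers). The function name and the test sequence are invented for illustration, and this is not how PHI-BLAST itself scans for patterns.

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern (subset described above) to a regex string."""
    # The trailing '.' and '>' spacers "mean nothing", so drop them.
    core = pattern.strip().rstrip(".>")
    parts = []
    for element in core.split("-"):  # '-' is only a spacer between positions
        wildcard = re.fullmatch(r"x(?:\((\d+)(?:,(\d+))?\))?", element)
        if wildcard:
            low, high = wildcard.groups()
            if low is None:
                parts.append(".")                  # x       -> any one residue
            elif high is None:
                parts.append(".{%s}" % low)        # x(5)    -> exactly 5 residues
            else:
                parts.append(".{%s,%s}" % (low, high))  # x(2,4) -> 2 to 4 residues
        elif re.fullmatch(r"\[[A-Z]+\]|[A-Z]", element):
            parts.append(element)                  # [LFYT] or a single residue
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
    return "".join(parts)


if __name__ == "__main__":
    pa = "[LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV]."
    regex = prosite_to_regex(pa)
    print(regex)
    # Invented sequence that contains one occurrence of the pattern.
    hit = re.search(regex, "MKLGEAGLAAAAARSAALASKK")
    print(hit.span() if hit else None)
```

A regular expression is a convenient way to check where a pattern occurs in the query before submitting it, since the pattern has to be present in the query sequence for PHI-BLAST to report any alignments.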

When using the stand-alone program, the pattern should be in a file, with the first line starting:

ID followed by 2 spaces and a text string giving the pattern a name. There should also be a line starting PA followed by 2 spaces followed by the pattern description.

All other PROSITE codes in the first two columns are allowed, but only the HI code, described below, is relevant to PHI-BLAST.

Here is an example from PROSITE.

ID CNMP_BINDING_2; PATTERN.

AC PS00889;

DT OCT-1993 (CREATED); OCT-1993 (DATA UPDATE); NOV-1995 (INFO UPDATE).

DE Cyclic nucleotide-binding domain signature 2.

PA [LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV].

NR /RELEASE=32,49340;

NR /TOTAL=57(36); /POSITIVE=57(36); /UNKNOWN=0(0); /FALSE_POS=0(0);

NR /FALSE_NEG=1; /PARTIAL=1;
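A pattern file in this layout can be read with a few lines of code. The sketch below is a minimal reader that assumes the conventions stated above: a two-letter code in the first two columns, entries separated by a blank line, and the pattern on one or more `PA` lines. It keeps only the `ID` and `PA` lines and ignores every other code. The function name and the second entry (`DEMO_PATTERN`) are hypothetical and exist only to show the blank-line separator.

```python
def parse_pattern_file(text):
    """Return a list of {'id': ..., 'pattern': ...} dicts, one per entry."""
    entries = []
    current = {}
    # The extra empty string flushes the last entry if the file lacks a final blank line.
    for line in text.splitlines() + [""]:
        if not line.strip():
            if "pattern" in current:   # keep only entries that define a pattern
                entries.append(current)
            current = {}
            continue
        code, value = line[:2], line[2:].strip()
        if code == "ID":
            current["id"] = value
        elif code == "PA":
            # A long pattern may continue over several PA lines; join them.
            current["pattern"] = current.get("pattern", "") + value
        # All other codes (AC, DT, DE, NR, ...) are ignored in this sketch.
    return entries


if __name__ == "__main__":
    sample = (
        "ID  CNMP_BINDING_2; PATTERN.\n"
        "AC  PS00889;\n"
        "DE  Cyclic nucleotide-binding domain signature 2.\n"
        "PA  [LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV].\n"
        "\n"
        "ID  DEMO_PATTERN; PATTERN.\n"   # hypothetical second entry
        "PA  A-x(2)-\n"
        "PA  C.\n"
    )
    for entry in parse_pattern_file(sample):
        print(entry["id"], "->", entry["pattern"])
```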
