Protein Analysis
Gi:40644130 allene oxide cyclase [Nicotiana tabacum]
Allene oxide cyclase (AOC)
Gi:140083805 cytosolic class II small heat shock protein HSP17.5 [Rosa hybrid cultivar]
Gi:289487897 L-ascorbate peroxidase [Bruguiera gymnorhiza]
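For DNA sequences such as ribosomal 16S, 18S, or ITS, search with nucleotide BLAST (blastn). Whether to use megablast, discontiguous megablast, or blastn depends on how similar your sequence is to the sequences in the database. In general, start with blastn: it tolerates lower similarity and can find a large number of similar sequences. If stricter matches are needed, switch to megablast. Note, however, that any nucleotide-level search still requires fairly high similarity between the nucleic acid sequences. If the sequence is poorly conserved, for example the genome of a novel RNA virus, such a search may return few or no hits, and a protein-level search with blastp (after translating the sequence) or blastx is needed instead.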
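The choice among the three nucleotide search modes can be sketched as a small helper that assembles a command line. This is only an illustration and assumes the stand-alone NCBI BLAST+ `blastn` program, where discontiguous megablast is selected with the task name `dc-megablast`. The function name, the query file name, and the database name are hypothetical.

```python
# Sketch: build (but do not run) a stand-alone BLAST+ blastn command.
# Assumption: NCBI BLAST+ is installed; "query.fasta" and "nt" are placeholders.

# Task names accepted by `blastn -task` for the three modes discussed above.
NUCLEOTIDE_TASKS = ("megablast", "dc-megablast", "blastn")


def build_blastn_command(query_path, database, task="blastn"):
    """Return the argument list for a blastn search with the chosen task."""
    if task not in NUCLEOTIDE_TASKS:
        raise ValueError(f"unsupported task: {task!r}")
    return ["blastn", "-task", task, "-query", query_path, "-db", database]


# Start with the most tolerant mode, then tighten if too many hits come back.
print(" ".join(build_blastn_command("query.fasta", "nt", task="blastn")))
print(" ".join(build_blastn_command("query.fasta", "nt", task="megablast")))
```

The returned list can be passed to `subprocess.run` once a local database is available.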
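If the sequence is a protein-coding gene, you can first translate it into protein (note that a sequence has, in theory, six possible reading frames) and then search a protein database with each translation using blastp. Alternatively, submit the nucleotide sequence directly to blastx, which automatically translates all six reading frames and searches the protein database with each of them.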
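The six reading frames mentioned above (three offsets on each strand) can be illustrated with a short, dependency-free sketch. It assumes the standard genetic code and an unambiguous ACGT input. The function names are hypothetical, and this is not how blastx itself is implemented.

```python
# Sketch: six-frame translation with the standard genetic code.
# Codons are enumerated in TCAG order, matching the amino-acid string below.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their protein translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for label, protein in six_frame_translations("ATGGCCTAA").items():
    print(label, protein)
```

Each of the six resulting protein strings could then be used as a separate blastp query.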
Blastp, PSI-BLAST, and PHI-BLAST all compare protein sequences against protein sequences.
1. Blastp: standard protein-to-protein sequence comparison
Standard protein BLAST is designed for protein searches.
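Blastp finds sequences in a protein database that are similar to the query amino acid sequence. Like the other BLAST programs, it aims to find regions of similarity.
2. PSI-BLAST: more sensitive protein-to-protein sequence comparison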
PSI-BLAST is designed for more sensitive protein-protein similarity searches.
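Position-Specific Iterated (PSI)-BLAST is a more sensitive variant of Blastp that is very effective for finding similar proteins in distantly related species or new members of a protein family. When a standard Blastp search fails, or returns only pseudogenes or predicted sequences ("hypothetical protein" or "similar to..."), try PSI-BLAST.
3. PHI-BLAST: Pattern-Hit Initiated BLAST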
PHI-BLAST can do a restricted protein pattern search.
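PHI-BLAST (Pattern-Hit Initiated BLAST) searches a protein database with a protein query and reports only those alignments that contain a specific pattern also present in the query sequence.
For a detailed description of the PHI pattern syntax, see: /blast/html/PHIsyntax.html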
The syntax for patterns in PHI-BLAST follows the conventions of PROSITE. When using the stand-alone program, it is permissible to have multiple patterns in a file separated by a blank line between patterns. When using the Web-page only one pattern is allowed per query.
Valid protein characters for PHI-BLAST patterns:
ABCDEFGHIKLMNPQRSTVWXYZU
Valid DNA characters for PHI-BLAST patterns:
ACGT
Other useful delimiters:
[ ] means any one of the characters enclosed in the brackets e.g., [LFYT] means one occurrence of L or F or Y or T
- means nothing (this is a spacer character used by PROSITE)
x(5) means 5 positions in which any residue is allowed (and similarly for any other single number in parentheses after x)
x(2,4) means 2 to 4 positions where any residue is allowed, and similarly for any other two numbers separated by a comma; the first number should be < the second number.
> can occur only at the end of a pattern and means nothing; it may occur before a period.
. (another spacer used by PROSITE) may be used at the end of the pattern and means nothing.
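To make the delimiter rules concrete, the sketch below converts a pattern written with only the elements listed above (single residues, bracketed sets, x, x(n), x(m,n)) into a Python regular expression. It is a minimal illustration, not the PHI-BLAST implementation. The function name, the sample pattern, and the sample sequence are hypothetical, and any other PROSITE syntax is rejected.

```python
import re

# Sketch: translate a PROSITE-style pattern into a Python regex.
# Only the elements described in the list above are supported.
WILDCARD = re.compile(r"x(?:\((\d+)(?:,(\d+))?\))?")
RESIDUE = re.compile(r"[A-Z]|\[[A-Z]+\]")


def phi_pattern_to_regex(pattern):
    """Convert e.g. '[LFYT]-x(2,4)-G.' to '[LFYT].{2,4}G'."""
    # A trailing '.' and a trailing '>' mean nothing, so drop them.
    body = pattern.strip().rstrip(".").rstrip(">")
    parts = []
    for element in body.split("-"):  # '-' is only a spacer
        if not element:
            continue
        wildcard = WILDCARD.fullmatch(element)
        if wildcard:
            low, high = wildcard.groups()
            if low and high:
                parts.append(".{%s,%s}" % (low, high))
            elif low:
                parts.append(".{%s}" % low)
            else:
                parts.append(".")
        elif RESIDUE.fullmatch(element):
            parts.append(element)
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
    return "".join(parts)


regex = phi_pattern_to_regex("[LFYT]-x(2,4)-G-x(5)-A.")
print(regex)
print(bool(re.search(regex, "MKLAAGCCCCCAQ")))
```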
When using the stand-alone program, the pattern should be in a file, with the first line starting:
ID followed by 2 spaces and a text string giving the pattern a name. There should also be a line starting PA followed by 2 spaces followed by the pattern description.
All other PROSITE codes in the first two columns are allowed, but only the HI code, described below is relevant to PHI-BLAST.
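As a rough sketch of the file layout just described, the helper below formats one or more patterns with an ID line and a PA line each, the two-letter code followed by 2 spaces, and a blank line between patterns. The function name and the pattern names are hypothetical, and the exact spacing follows the description above rather than a verified file specification.

```python
# Sketch: format a pattern file for the stand-alone program.
# Assumption: each entry needs only an ID line and a PA line.


def format_phi_pattern_file(patterns):
    """Build file text from (name, pattern) pairs, blank line between entries."""
    entries = [f"ID  {name}\nPA  {pattern}\n" for name, pattern in patterns]
    return "\n".join(entries)


text = format_phi_pattern_file([
    ("EXAMPLE_ONE", "[LFYT]-x(2,4)-G."),  # hypothetical pattern
    ("EXAMPLE_TWO", "G-E-x-[GAS]."),      # hypothetical pattern
])
print(text)
```

The returned text can be written to a file and supplied to the stand-alone program.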
Here is an example from PROSITE.
ID CNMP_BINDING_2; PATTERN.
AC PS00889;
DT OCT-1993 (CREATED); OCT-1993 (DATA UPDATE); NOV-1995 (INFO UPDATE).
DE Cyclic nucleotide-binding domain signature 2.
PA [LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV].
NR /RELEASE=32,49340;
NR /TOTAL=57(36); /POSITIVE=57(36); /UNKNOWN=0(0); /FALSE_POS=0(0);
NR /FALSE_NEG=1; /PARTIAL=1;