Computer Lab 4: Using the BLAST Sequence Similarity Search Tool

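• Select the organism to search.
• Select a filter to mask low-complexity regions of the query sequence.
• Select a scoring matrix. For blastp there are five matrices: PAM30, PAM70, BLOSUM45, BLOSUM62 (the default), and BLOSUM80.
• The default Expect value is 10. At this E value, 10 hits with a score equal to or higher than the alignment score S are expected to occur by chance. Lowering the Expect value returns fewer database hits, and fewer chance matches are reported. Raising the E value returns more results.
• Word size. The default is 3 for BLASTp and 11 for BLASTn.
• Format of the returned results.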
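The effect of the Expect threshold described above can be sketched in a few lines of Python. The hit labels and E values below are hypothetical and chosen only for illustration; `filter_by_expect` is not part of BLAST, it simply mimics how a stricter cut-off shortens the result list.

```python
# Hypothetical hits as (label, E value) pairs; the numbers are illustrative only.
hits = [
    ("hit_A", 1e-50),
    ("hit_B", 2e-8),
    ("hit_C", 0.03),
    ("hit_D", 0.49),
    ("hit_E", 4.2),
    ("hit_F", 9.5),
]


def filter_by_expect(hits, expect):
    """Keep only the hits whose E value is at or below the Expect threshold."""
    return [(label, e) for label, e in hits if e <= expect]


# Lowering the threshold from the default of 10 returns fewer hits.
for threshold in (10, 1, 0.05, 1e-10):
    kept = filter_by_expect(hits, threshold)
    print(f"Expect = {threshold}: {len(kept)} hit(s) reported")
```

With these made-up values, the list shrinks step by step as the threshold is tightened, which is the behavior the Expect option is meant to control.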
cut-off parameters
BLAST search strategies
General concepts
How to evaluate the significance of your results
How to handle too many results
How to handle too few results
Sometimes a real match has an E value > 1
…try a reciprocal BLAST to confirm
Sometimes a similar E value occurs for a short exact match and long less exact match
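One way to picture the reciprocal check is the sketch below. It assumes the results of two searches have already been collected into a dictionary that maps each query name to the names of its hits; the protein names and the helper `is_reciprocal_hit` are hypothetical.

```python
def is_reciprocal_hit(query, candidate, search_results):
    """Return True if candidate is among the hits of query AND query is among
    the hits of candidate.

    search_results maps a query name to the list of hit names returned when
    that sequence was used as the BLAST query.
    """
    forward = candidate in search_results.get(query, [])
    reverse = query in search_results.get(candidate, [])
    return forward and reverse


# Hypothetical search results, for illustration only.
results = {
    "protein_X": ["protein_Y", "protein_Z"],
    "protein_Y": ["protein_X", "protein_W"],
    "protein_Z": ["protein_W"],
}

print(is_reciprocal_hit("protein_X", "protein_Y", results))  # both directions match
print(is_reciprocal_hit("protein_X", "protein_Z", results))  # only the forward search matches
```

A weak forward hit that is confirmed in the reverse direction is more credible than one that is not, although neither outcome is proof of homology on its own.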
Step 4a: Select optional search parameters
CD search
Entrez!
Filter
Expect, Word size, Scoring matrix
organism
BLAST: choosing parameters
filtering
Step 4b: optional formatting parameters
Alignment view, Descriptions, Alignments
program
query database taxonomy
taxonomy
High scores, low E values. Cut-off: 0.05? 10^-10?
Step 1: Choose your sequence
Computer Lab 4:
Sequence can be input in FASTA format or as accession number
Using the Basic Local Alignment Search Tool (BLAST)
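The four steps of a BLAST search
① Choose the accession number or sequence you are interested in and paste it into the BLAST input box.
② Choose a BLAST program (blastp, blastn, blastx, tblastx, tblastn).
③ Choose a database to search. A common choice is the non-redundant (nr) database.
④ Choose optional parameters for the search and for the output format.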
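The four steps above map onto the arguments of a search. As a rough sketch, and assuming the standalone NCBI BLAST+ command-line programs rather than the web form this lab uses, the choices could be assembled as follows. The file name `query.fasta` and the helper `build_blast_command` are hypothetical; the command is only built and printed, not run.

```python
import shlex

ALLOWED_PROGRAMS = {"blastp", "blastn", "blastx", "tblastn", "tblastx"}


def build_blast_command(program, query_file, database,
                        evalue=10, word_size=3, matrix="BLOSUM62"):
    """Assemble a BLAST+ argument list from the four choices of a search."""
    if program not in ALLOWED_PROGRAMS:
        raise ValueError(f"unknown BLAST program: {program}")
    cmd = [
        program,                      # step 2: the BLAST program
        "-query", query_file,         # step 1: the query sequence (FASTA file)
        "-db", database,              # step 3: the database
        "-evalue", str(evalue),       # step 4: optional parameters
        "-word_size", str(word_size),
    ]
    if program != "blastn":
        # Protein-level comparisons use a scoring matrix; blastn does not.
        cmd += ["-matrix", matrix]
    return cmd


cmd = build_blast_command("blastp", "query.fasta", "nr")
print(shlex.join(cmd))
```

For a nucleotide search the caller would pass `program="blastn"` and `word_size=11`, matching the defaults quoted in this lab.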
Assessing whether proteins are homologous
RBP4 and PAEP: Low bit score, E value 0.49, 24% identity (“twilight zone”). But they are indeed homologous. Try a BLAST search with PAEP as a query, and find many other lipocalins.
Step 3: choose the database
nr = non-redundant (most general database)
dbest = database of expressed sequence tags
dbsts = database of sequence tag sites
gss = genomic survey sequences
htgs = high throughput genomic sequence
We will get to the bottom of a BLAST search in a few minutes…
EVD parameters
BLOSUM matrix
gap penalties
10.0 is the E value
Effective search space = mn = length of query × db length
threshold score = 11
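The link between the search space mn and the E value can be made concrete with the standard relation E = mn · 2^(-S'), where S' is the bit score. The sketch below uses the raw lengths for m and n as stated above; BLAST itself applies edge corrections to obtain effective lengths, so reported values will differ somewhat. The function names and the example lengths are hypothetical.

```python
import math


def expect_from_bit_score(bit_score, query_len, db_len):
    """E = m * n * 2^(-S'), with m = query length and n = database length."""
    search_space = query_len * db_len
    return search_space * 2.0 ** (-bit_score)


def p_from_expect(expect):
    """Probability of at least one chance hit that good: P = 1 - exp(-E)."""
    return -math.expm1(-expect)


# Hypothetical sizes: a 200-residue query against a database of 10^9 residues.
m, n = 200, 10**9
for bits in (30, 40, 50, 60):
    e = expect_from_bit_score(bits, m, n)
    print(f"bit score {bits}: E = {e:.3g}, P = {p_from_expect(e):.3g}")
```

Each extra bit of score halves E, and for small E the P value is almost equal to E, which is why the two are often quoted interchangeably for strong hits.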
Example of the FASTA format for a BLAST query
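A FASTA record is a header line that begins with ">" followed by one or more lines of sequence. The sketch below parses text in that layout; the record shown is a made-up toy sequence, not a real database entry, and `parse_fasta` is a hypothetical helper.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if header is not None:        # close the previous record
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)           # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Toy record for illustration only.
example = """>toy_protein hypothetical example record
MKVLAAGIVG
LLLAQPSWAD
"""

for header, sequence in parse_fasta(example):
    print(header, len(sequence), sequence)
```

The sequence lines are joined into one string, so the line width used in the input file does not matter.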
Step 2: Choose the BLAST program
blastn (nucleotide BLAST)
blastp (protein BLAST)
tblastn (translated BLAST)
blastx (translated BLAST)
tblastx (translated BLAST)
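Which program to pick depends on whether the query and the database are nucleotide or protein sequences. The sketch below encodes the usual pairing: blastx translates a nucleotide query to search a protein database, tblastn searches a translated nucleotide database with a protein query, and tblastx translates both sides. The function `choose_blast_program` and its `compare_as_protein` flag are hypothetical names used for illustration.

```python
VALID_TYPES = {"nucleotide", "protein"}


def choose_blast_program(query_type, db_type, compare_as_protein=False):
    """Pick a BLAST program from the query type and the database type.

    compare_as_protein only matters when both sides are nucleotide: it
    selects tblastx (six-frame translation of both) instead of blastn.
    """
    if query_type not in VALID_TYPES or db_type not in VALID_TYPES:
        raise ValueError("types must be 'nucleotide' or 'protein'")
    if query_type == "protein" and db_type == "protein":
        return "blastp"
    if query_type == "nucleotide" and db_type == "protein":
        return "blastx"
    if query_type == "protein" and db_type == "nucleotide":
        return "tblastn"
    return "tblastx" if compare_as_protein else "blastn"


for q, d, flag in [
    ("nucleotide", "nucleotide", False),
    ("protein", "protein", False),
    ("nucleotide", "protein", False),
    ("protein", "nucleotide", False),
    ("nucleotide", "nucleotide", True),
]:
    print(f"{q} query vs {d} database -> {choose_blast_program(q, d, flag)}")
```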