TREC 2005 genomics track overview
Literature on embryonic stem cells
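1. Thomson JA et al. first successfully isolated embryonic stem cells (ESCs) from human embryos in 1998 and published the work in Science. This study marked a major breakthrough in the embryonic stem cell field and attracted wide attention.
2. In 2006, Nature published a review article discussing the properties, sources, and differentiation potential of embryonic stem cells, as well as their prospective applications in regenerative medicine and drug development. The review gives a comprehensive overview of embryonic stem cell research.
3. Cell Stem Cell is a journal dedicated to stem cell research and contains a large number of research papers on embryonic stem cells. Browsing its relevant articles is a way to follow the latest advances and breakthroughs.
4. Stem Cells is another important journal, covering a broad range of stem cell research. It publishes recent research findings and commentary on embryonic stem cells.
5. Developmental Biology has published developmental biology studies of embryonic stem cells, which help explain how these cells differentiate and develop.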
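In addition, some literature focuses specifically on the ethical and legal issues surrounding embryonic stem cells:
1. A review article published in Science discusses the ethical and legal challenges of embryonic stem cell research and the differences in policies and regulations among countries.
2. Nature Reviews Genetics has published review articles on the ethical and social issues of embryonic stem cells, examining in depth the ethical and social impact of this research.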
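Note that these are only representative articles and journals; the literature on embryonic stem cell research is very broad.
If you have a particular research direction or a more specific question, I can provide more detailed information.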
Immunomodulatory effect of fenofibrate on Treg cells in mice with Sjögren's syndrome
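Citation: Guo Xingyi, Nian Hong, Wang Ying, Li Na, Dang Weiyu, Wei Ruihua. Immunomodulatory effect of fenofibrate on Treg cells in mice with Sjögren's syndrome [J]. Recent Advances in Ophthalmology (眼科新进展), 2022, 42(1): 11-14, 19. doi:10.13389/j.cnki.rao.2022.0003

[Experimental Research]
Immunomodulatory effect of fenofibrate on Treg cells in mice with Sjögren's syndrome△
Guo Xingyi, Nian Hong, Wang Ying, Li Na, Dang Weiyu, Wei Ruihua

About the author: Guo Xingyi (ORCID: 0000-0002-5993-791X), female, born August 1997 in Qiqihar, Heilongjiang; master's degree candidate. Research interest: autoimmune eye disease. E-mail: guoxingyi0820@126.com
Corresponding author: Wei Ruihua (ORCID: 0000-0002-9708-0355), female, born August 1974 in Tianjin; doctorate, chief physician. Research interest: autoimmune eye disease. E-mail: rwei@tmu.edu.cn
Received: 2021-09-17. Revised: 2021-11-08. Editor: Fu Zhongjing
△Funding: National Natural Science Foundation of China (Nos. 81970793, 82070929); Tianjin Key Clinical Discipline (Specialty) Construction Project (No. TJLCZDXKT003)
Affiliation: Tianjin Medical University Eye Hospital, School of Optometry, and Eye Institute; Tianjin Branch of the National Clinical Research Center for Eye, Ear, Nose and Throat Diseases; Tianjin Key Laboratory of Retinal Functions and Diseases, Tianjin 300384, China

[Abstract]
Objective: To investigate the immunomodulatory effect of fenofibrate, a peroxisome proliferator-activated receptor α (PPAR-α) agonist, on Treg cells in mice with Sjögren's syndrome.
Methods: Fourteen 15-week-old male NOD mice were randomly divided into a model group and a fenofibrate group, with 7 mice in each group. Mice in the model group were fed standard chow, and mice in the fenofibrate group were fed standard chow containing 0.3 g·kg⁻¹ fenofibrate.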
How many species are there on earth and in the ocean
How Many Species Are There on Earth and in the Ocean?

Camilo Mora1,2*, Derek P. Tittensor1,3,4, Sina Adl1, Alastair G. B. Simpson1, Boris Worm1

1 Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada, 2 Department of Geography, University of Hawaii, Honolulu, Hawaii, United States of America, 3 United Nations Environment Programme World Conservation Monitoring Centre, Cambridge, United Kingdom, 4 Microsoft Research, Cambridge, United Kingdom

Abstract

The diversity of life is one of the most striking aspects of our planet; hence knowing how many species inhabit Earth is among the most fundamental questions in science. Yet the answer to this question remains enigmatic, as efforts to sample the world's biodiversity to date have been limited and thus have precluded direct quantification of global species richness, and because indirect estimates rely on assumptions that have proven highly controversial. Here we show that the higher taxonomic classification of species (i.e., the assignment of species to phylum, class, order, family, and genus) follows a consistent and predictable pattern from which the total number of species in a taxonomic group can be estimated. This approach was validated against well-known taxa, and when applied to all domains of life, it predicts ~8.7 million (±1.3 million SE) eukaryotic species globally, of which ~2.2 million (±0.18 million SE) are marine. In spite of 250 years of taxonomic classification and over 1.2 million species already catalogued in a central database, our results suggest that some 86% of existing species on Earth and 91% of species in the ocean still await description. Renewed interest in further exploration and taxonomy is required if this significant gap in our knowledge of life on Earth is to be closed.

Citation: Mora C, Tittensor DP, Adl S, Simpson AGB, Worm B (2011) How Many Species Are There on Earth and in the Ocean? PLoS Biol 9(8): e1001127. doi:10.1371/journal.pbio.1001127
Academic Editor: Georgina M. Mace, Imperial College London, United Kingdom
Received
November 12, 2010; Accepted July 13, 2011; Published August 23, 2011
Copyright: © 2011 Mora et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: Funding was provided by the Sloan Foundation through the Census of Marine Life Program, Future of Marine Animal Populations project. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
*E-mail: moracamilo@

Introduction

Robert May [1] recently noted that if aliens visited our planet, one of their first questions would be, "How many distinct life forms, species, does your planet have?" He also pointed out that we would be "embarrassed" by the uncertainty in our answer. This narrative illustrates the fundamental nature of knowing how many species there are on Earth, and our limited progress with this research topic thus far [1–4]. Unfortunately, limited sampling of the world's biodiversity to date has prevented a direct quantification of the number of species on Earth, while indirect estimates remain uncertain due to the use of controversial approaches (see detailed review of available methods, estimates, and limitations in Table 1). Globally, our best approximation to the total number of species is based on the opinion of taxonomic experts, whose estimates range between 3 and 100 million species [1]; although these estimations likely represent the outer bounds of the total number of species, expert-opinion approaches have been questioned due to their limited empirical basis [5] and subjectivity [5–6] (Table 1). Other studies have used macroecological patterns and biodiversity ratios in novel ways to improve estimates of the total number of species (Table 1), but several of the underlying assumptions in these approaches have been the
topic of sometimes heated controversy ([3–17], Table 1); moreover their overall predictions concern only specific groups, such as insects [9,18–19], deep sea invertebrates [13], large organisms [6–7,10], animals [7], fungi [20], or plants [21]. With the exception of a few extensively studied taxa (e.g., birds [22], fishes [23]), we are still remarkably uncertain as to how many species exist, highlighting a significant gap in our basic knowledge of life on Earth.

Here we present a quantitative method to estimate the global number of species in all domains of life. We report that the number of higher taxa, which is much more completely known than the total number of species [24], is strongly correlated to taxonomic rank [25] and that such a pattern allows the extrapolation of the global number of species for any kingdom of life (Figures 1 and 2).

Higher taxonomy data have been previously used to quantify species richness within specific areas by relating the number of species to the number of genera or families at well-sampled locations, and then using the resulting regression model to estimate the number of species at other locations for which the number of families or genera are better known than species richness (reviewed by Gaston & Williams [24]). This method, however, relies on extrapolation of patterns from relatively small areas to estimate the number of species in other locations (i.e., alpha diversity).
Matching the spatial scale of this method to quantify the Earth's total number of species would require knowing the richness of replicated planets; not an option as far as we know, although May's aliens may disagree. Here we analyze higher taxonomic data using a different approach by assessing patterns across all taxonomic levels of major taxonomic groups. The existence of predictable patterns in the higher taxonomic classification of species allows prediction of the total number of species within taxonomic groups and may help to better constrain our estimates of global species richness.

Results

We compiled the full taxonomic classifications of ~1.2 million currently valid species from several publicly accessible sources (see Materials and Methods). Among eukaryote "kingdoms," assessment of the temporal accumulation curves of higher taxa (i.e., the cumulative number of species, genera, orders, classes, and phyla described over time) indicated that higher taxonomic ranks are much more completely described than lower levels, as shown by strongly asymptoting trajectories over time ([24], Figure 1A–1F, Figure S1). However, this is not the case for prokaryotes, where there is little indication of reaching an asymptote at any taxonomic level (Figure S1). For most eukaryotes, in contrast, the rate of discovery of new taxa has slowed along the taxonomic hierarchy, with clear signs of asymptotes for phyla (or "divisions" in botanical nomenclature) on one hand and a steady increase in the number of species on the other (Figure 1A–1F, Figure S1). This prevents direct extrapolation of the number of species from species-accumulation curves [22,23] and highlights our current uncertainty regarding estimates of total species richness (Figure 1F).
However, the increasing completeness of higher taxonomic ranks could facilitate the estimation of the total number of species, if the former predicts the latter. We evaluated this hypothesis for all kingdoms of life on Earth.

First, we accounted for undiscovered higher taxa by fitting, for each taxonomic level from phylum to genus, asymptotic regression models to the temporal accumulation curves of higher taxa (Figure 1A–1E) and using a formal multimodel averaging framework based on Akaike's Information Criterion [23] to predict the asymptotic number of taxa of each taxonomic level (dotted horizontal line in Figure 1A–1E; see Materials and Methods for details). Secondly, the predicted number of taxa at each taxonomic rank down to genus was regressed against the numerical rank, and the fitted models used to predict the number of species (Figure 1G, Materials and Methods). We applied this approach to 18 taxonomic groups for which the total numbers of species are thought to be relatively well known. We found that this approach yields predictions of species numbers that are consistent with inventory totals for these groups (Figure 2).

When applied to all eukaryote kingdoms, our approach predicted ~7.77 million species of animals, ~298,000 species of plants, ~611,000 species of fungi, ~36,400 species of protozoa, and ~27,500 species of chromists; in total the approach predicted that ~8.74 million species of eukaryotes exist on Earth (Table 2). Restricting this approach to marine taxa resulted in a prediction of 2.21 million eukaryote species in the world's oceans (Table 2). We also applied the approach to prokaryotes; unfortunately, the steady pace of description of taxa at all taxonomic ranks precluded the calculation of asymptotes for higher taxa (Figure S1). Thus, we used raw numbers of higher taxa (rather than asymptotic estimates) for prokaryotes, and as such our estimates represent only lower bounds on the diversity in this group. Our approach predicted a lower bound of ~10,100 species of prokaryotes, of which ~1,320 are
marine. It is important to note that for prokaryotes, the species concept tolerates a much higher degree of genetic dissimilarity than in most eukaryotes [26,27]; additionally, due to horizontal gene transfers among phylogenetic clades, species take longer to isolate in prokaryotes than in eukaryotes, and thus the former species are much older than the latter [26,27]; as a result the number of described species of prokaryotes is small (only ~10,000 species are currently accepted).

Assessment of Possible Limitations

We recognize a number of factors that can influence the interpretation and robustness of the estimates derived from the method described here. These are analyzed below.

Species definitions. An important caveat to the interpretation of our results concerns the definition of species. Different taxonomic communities (e.g., zoologists, botanists, and bacteriologists) use different levels of differentiation to define a species. This implies that the numbers of species for taxa classified according to different conventions are not directly comparable. For example, that prokaryotes add only 0.1% to the total number of known species is not so much a statement about the diversity of prokaryotes as it is a statement about what a species means in this group. Thus, although estimates of the number of species are internally consistent for kingdoms classified under the same conventions, our aggregated predictions for eukaryotes and prokaryotes should be interpreted with that caution in mind.
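The second step of the higher-taxon approach described in the Results (regressing the number of taxa at each rank against the numerical rank, then extrapolating one rank further to species) can be illustrated with a minimal sketch. It assumes a simple log-linear (exponential) relationship fitted by ordinary least squares, which is only one plausible functional form and not necessarily the set of models the authors fitted. The taxon counts are hypothetical and chosen for illustration; the function names `fit_line` and `predict_species` are likewise invented here.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_species(taxa_counts):
    """Extrapolate the species count from higher-taxon counts.

    taxa_counts lists the number of taxa from phylum (rank 1) down to
    genus (rank 5). Assumption: log(count) grows linearly with rank.
    """
    ranks = list(range(1, len(taxa_counts) + 1))
    log_counts = [math.log(count) for count in taxa_counts]
    slope, intercept = fit_line(ranks, log_counts)
    species_rank = len(taxa_counts) + 1  # species is one rank below genus
    return math.exp(intercept + slope * species_rank)


# Hypothetical counts for phylum, class, order, family, genus
# (not taken from the paper): each rank has 4x the taxa of the rank above.
hypothetical_counts = [10, 40, 160, 640, 2560]
estimate = predict_species(hypothetical_counts)
print(round(estimate))
```

Because the hypothetical counts grow by a constant factor of four per rank, the fit is exact and the extrapolated value is four times the genus count. With real data the fit would not be exact, and the uncertainty of the extrapolation would need to be reported alongside the estimate.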
Changes in higher taxonomy. Increases or decreases in the number of higher taxa will affect the raw data used in our method and thus its estimates of the total number of species. The number of higher taxa can change for several reasons including new discoveries, the lumping or splitting of taxa due to improved phylogenies and switching from phenetic to phylogenetic classifications, and the detection of synonyms. A survey of 2,938 taxonomists with expertise across all major domains of life (response rate 19%, see Materials and Methods) revealed that synonyms are a major problem at the species level, but much less so at higher taxonomic levels. The percentage of taxa names currently believed to be synonyms ranged from 17.9 (±28.7 SD) for species, to 7.38 (±15.8 SD) for genera, to 5.5 (±34.0 SD) for families, to 3.72 (±45.2 SD) for orders, to 1.15 (±8.37 SD) for classes, to 0.99 (±7.74 SD) for phyla. These results suggest that by not using the species-level data, our higher-taxon approach is less sensitive to the problem of synonyms. Nevertheless, to assess the extent to which any changes in higher taxonomy will influence our current estimates, we carried out a sensitivity analysis in which the number of species was calculated in response to variations in the number of higher taxa (Figure 3A–3E, Figure S2). This analysis indicates that our current estimates are remarkably robust to changes in higher taxonomy.

Changes in taxonomic effort. Taxonomic effort can be a strong determinant of species discovery rates [21]. Hence the estimated asymptotes from the temporal accumulation curves of higher taxa (dotted horizontal line in Figure 1A–1E) might be driven by a decline in taxonomic effort. We presume, however, that this is not a major factor: while the discovery rate of higher taxa is declining (black dots and red lines in Figure 3F–3J), the rate of description of new species remains relatively constant (grey lines in Figure 3F–3J). This suggests that the asymptotic trends among higher taxonomic levels do not result from a lack of
taxonomic effort as there has been at least sufficient effort to describe new species at a constant rate. Secondly, although a majority (79.4%) of experts that we polled in our taxonomic survey felt that the number of taxonomic experts is decreasing, it was pointed out that other factors are counteracting this trend. These included, among others, more amateur taxonomists and phylogeneticists, new sampling methods and molecular identification tools, increased international collaboration, better access to information, and access to new areas of exploration. Taken together these factors have resulted in a constant rate of description of new species, as evident in our Figure 1, Figure 3F–3J, and Figure S1 and suggest that the observed flattening of the discovery curves of higher taxa is unlikely to be driven by a lack of taxonomic effort.

Author Summary

Knowing the number of species on Earth is one of the most basic yet elusive questions in science. Unfortunately, obtaining an accurate number is constrained by the fact that most species remain to be described and because indirect attempts to answer this question have been highly controversial. Here, we document that the taxonomic classification of species into higher taxonomic groups (from genera to phyla) follows a consistent pattern from which the total number of species in any taxonomic group can be predicted. Assessment of this pattern for all kingdoms of life on Earth predicts ~8.7 million (±1.3 million SE) species globally, of which ~2.2 million (±0.18 million SE) are marine. Our results suggest that some 86% of the species on Earth, and 91% in the ocean, still await description. Closing this knowledge gap will require a renewed interest in exploration and taxonomy, and a continuing effort to catalogue existing biodiversity data in publicly available databases.

Completeness of taxonomic inventories. To account for yet-to-be-discovered higher taxa, our approach fitted asymptotic regression models to the temporal accumulation curve of higher taxa. A critical
question is how the completeness of such curves will affect the asymptotic prediction. To address this, we performed a sensitivity analysis in which the asymptotic number of taxa was calculated for accumulation curves with different levels of completeness. The results of this test indicated that the asymptotic regression models used here would underestimate the number of predicted taxa when very incomplete inventories are used (Figure 3K–3O). This underestimation in the number of higher taxa would lower our prediction of the number of species through our higher taxon approach, which suggests that our species estimates are conservative, particularly for poorly sampled taxa.

Table 1. Available methods for estimating the global number of species and their limitations.

Macroecological patterns

Case study: Body size frequency distributions. By extrapolation from the frequency of large to small species, May [7] estimated 10 to 50 million species of animals.
Limitations: May [7] suggested that there was no reason to expect a simple scaling law from large to small species. Further studies confirmed different modes of evolution among small species [4] and inconsistent body size frequency distributions among taxa [4].

Case study: Latitudinal gradients in species. By extrapolation from the better sampled temperate regions to the tropics, Raven [10] estimated 3 to 5 million species of large organisms.
Limitations: May [2] questioned the assumption that temperate regions were better sampled than tropical ones; the approach also assumed consistent diversity gradients across taxa which is not factual [4].

Case study: Species-area relationships. By extrapolation from the number of species in deep-sea samples, Grassle & Maciolek [13] estimated that the world's deep seafloor could contain up to 10 million species.
Limitations: Lambshead & Bouchet [12] questioned this estimation by showing that high local diversity in the deep sea does not necessarily reflect high global biodiversity given low species turnover.

Diversity ratios

Case study: Ratios between taxa. By assuming a global 6:1 ratio of fungi to vascular plants and that there are ~270,000 species of vascular plants, Hawksworth [20] estimated 1.6 million fungi species.
Limitations: Ratio-like approaches have been heavily critiqued because, given known patterns of species turnover, locally estimated ratios between taxa may or may not be consistent at the global scale [3,12] and because at least one group of organisms should be well known at the global scale, which may not always be true [15]. Bouchet [6] elegantly demonstrated the shortcomings of ratio-based approaches by showing how even for a well-inventoried marine region, the ratio of fishes to total multicellular organisms would yield ~0.5 million global marine species whereas the ratio of Brachyura to total multicellular organisms in the same sampled region would yield ~1.5 million species.

Case study: Host-specificity and spatial ratios. Given 50,000 known species of tropical trees and assuming a 5:1 ratio of host beetles to trees, that beetles represent 40% of the canopy arthropods, and that the canopy has twice the species of the ground, Erwin [9] estimated 30 million species of arthropods in the tropics.

Case study: Known to unknown ratios. Hodkinson & Casson [18] estimated that 62.5% of the bug (Hemiptera) species in a sampled location were unknown; by assuming that 7.5%–10% of the global diversity of insects is bugs, they estimated between 1.84 and 2.57 million species of insects globally.

Taxonomic patterns

Case study: Time-species accumulation curves. By extrapolation from the discovery record it was estimated that there are ~19,800 species of marine fishes [23] and ~11,997 birds [22].
Limitations: This approach is not widely applicable because it requires species accumulation curves to approach asymptotic levels, which is only true for a small number of well-described taxa [22–23].

Case study: Authors-species accumulation curves. Modeling the number of authors describing species over time allowed researchers to estimate that the proportion of flowering plants yet to be discovered is 13% to 18% [21].
Limitations: This is a very recent method and the effect of a number of assumptions remains to be evaluated. One is the extent to which the description of new species is shifting from using taxonomic expertise alone to relying on molecular methods (particularly among small organisms [26]) and the other that not all authors listed on a manuscript are taxonomic experts, particularly in recent times when the number of coauthors per taxa described is increasing [21,38], which could be due to more collaborative research [38] and the acknowledgment of technicians, field assistants, specimen collectors, and so on as coauthors (Philippe Bouchet, personal communication).

Case study: Analysis of expert estimations. Estimates of ~5 million species of insects [15] and ~200,000 marine species [14] were arrived at by compiling opinion-based estimates from taxonomic experts. Robustness in the estimations is assumed from the consistency of responses among different experts.
Limitations: Erwin [5] labeled this approach as "non-scientific" due to a lack of verification. Estimates can vary widely, even those of a single expert [5,6]. Bouchet [6] argues that expert estimations are often passed on from one expert to another and therefore a robust estimation could be the "same guess copied again and again".

doi:10.1371/journal.pbio.1001127.t001

We reason that underestimation due to this effect is severe for prokaryotes due to the ongoing discovery of higher taxa (Figure S1) but is likely to be modest in most eukaryote groups because the rate of discovery of higher taxa is rapidly declining (Figure 1A–1E, Figure S1, Figure 3F–3J). Since higher taxonomic levels are described more completely (Figure 1A–1E), the resulting error from incomplete inventories should decrease while rising in the taxonomic hierarchy. Recalculating the number of species while omitting all data from genera yielded new estimates that were mostly within the intervals of our original estimates (Figure S3). However, Chromista (on Earth and in the ocean) and Fungi (in the ocean) were exceptions, having inflated predictions without the genera data (Figure S3). This inflation in the predicted
number of species without genera data highlights the high incompleteness of at least the genera data in those three cases. In fact, Adl et al.'s [28] survey of expert opinions reported that the number of described species of chromists could be in the order of 140,000, which is nearly 10 times the number of species currently catalogued in the databases used here (Table 2). These results suggest that our estimates for Chromista and Fungi (in the ocean) need to be considered with caution due to the incomplete nature of their data.

Subjectivity in the Linnaean system of classification. Different ideas about the correct classification of species into a taxonomic hierarchy may distort the shape of the relationships we describe here. However, an assessment of the taxonomic hierarchy shows a consistent pattern; we found that at any taxonomic rank, the diversity of subordinate taxa is concentrated within a few groups with a long tail of low-diversity groups (Figure 3P–3T). Although we cannot refute the possibility of arbitrary decisions in the classification of some taxa, the consistent patterns in Figure 3P–3T imply that these decisions do not obscure the robust underlying relationship between taxonomic levels. The mechanism for the exponential relationships between nested taxonomic levels is uncertain, but in the case of taxa classified phylogenetically, it may reflect patterns of diversification likely characterized by radiations within a few clades and little cladogenesis in most others [29]. We would like to caution that the database we used here for protistan eukaryotes (mostly in Protozoa and Chromista in this work) combines elements of various classification schemes from different ages; in fact, the very division of these organisms into "Protozoa" and "Chromista" kingdoms is non-phylogenetic and not widely followed among protistologists [28]. It would be valuable to revisit the species estimates for protistan eukaryotes once their global catalogue can be organized into a valid and
stable higher taxonomy (and their catalogue of described species is more complete; see above).

Figure 1. Predicting the global number of species in Animalia from their higher taxonomy. (A–F) The temporal accumulation of taxa (black lines) and the frequency of the multimodel fits to all starting years selected (graded colors). The horizontal dashed lines indicate the consensus asymptotic number of taxa, and the horizontal grey area its consensus standard error. (G) Relationship between the consensus asymptotic number of higher taxa and the numerical hierarchy of each taxonomic rank. Black circles represent the consensus asymptotes, green circles the catalogued number of taxa, and the box at the species level indicates the 95% confidence interval around the predicted number of species (see Materials and Methods).
doi:10.1371/journal.pbio.1001127.g001

Discussion

Knowing the total number of species has been a question of great interest motivated in part by our collective curiosity about the diversity of life on Earth and in part by the need to provide a reference point for current and future losses of biodiversity. Unfortunately, incomplete sampling of the world's biodiversity combined with a lack of robust extrapolation approaches has yielded highly uncertain and controversial estimates of how many species there are on Earth. In this paper, we describe a new approach whose validation against existing inventories and explicit statistical nature adds greater robustness to the estimation of the number of species of given taxa. In general, the approach was reasonably robust to various caveats, and we hope that future improvements in data quality will further diminish problems with synonyms and incompleteness of data, and lead to even better (and likely higher) estimates of global species richness.

Our current estimate of ~8.7 million species narrows the range of 3 to 100 million species suggested by taxonomic experts [1] and it suggests that after 250 years of taxonomic classification only a small fraction
of species on Earth (~14%) and in the ocean (~9%) have been indexed in a central database (Table 2). Closing this knowledge gap may still take a lot longer. Considering current rates of description of eukaryote species in the last 20 years (i.e., 6,200 species per year; ±811 SD; Figure 3F–3J), the average number of new species described per taxonomist's career (i.e., 24.8 species, [30]) and the estimated average cost to describe animal species (i.e., US$48,500 per species [30]) and assuming that these values remain constant and are general among taxonomic groups, describing Earth's remaining species may take as long as 1,200 years and would require 303,000 taxonomists at an approximated cost of US$364 billion. With extinction rates now exceeding natural background rates by a factor of 100 to 1,000 [31], our results also suggest that this slow advance in the description of species will lead to species becoming extinct before we know they even existed. High rates of biodiversity loss provide an urgent incentive to increase our knowledge of Earth's remaining species.

Previous studies have indicated that current catalogues of species are biased towards conspicuous species with large geographical ranges, body sizes, and abundances [4,32]. This suggests that the bulk of species that remain to be discovered are likely to be small-ranged and perhaps concentrated in hotspots and less explored areas such as the deep sea and soil; although their small body-size and cryptic nature suggest that many could be found literally in our own "backyards" (after Hawksworth and Rossman [33]). Though remarkable efforts and progress have been made, a further closing of this knowledge gap will require a renewed interest in exploration and taxonomy by both

Figure 2. Validating the higher taxon approach. We compared the number of species estimated from the higher taxon approach implemented here to the known number of species in relatively well-studied taxonomic groups as derived from published sources [37]. We also used
estimations from multimodel averaging from species accumulation curves for taxa with near-complete inventories. Vertical lines indicate the range of variation in the number of species from different sources. The dotted line indicates the 1:1 ratio. Note that published species numbers (y-axis values) are mostly derived from expert approximations for well-known groups; hence there is a possibility that those estimates are subject to biases arising from synonyms.
doi:10.1371/journal.pbio.1001127.g002

Table 2. Currently catalogued and predicted total number of species on Earth and in the ocean.

                 Earth                                 Ocean
Species          Catalogued   Predicted   ±SE          Catalogued   Predicted   ±SE
Eukaryotes
  Animalia          953,434   7,770,000     958,000       171,082   2,150,000   145,000
  Chromista          13,033      27,500      30,500         4,859       7,400     9,640
  Fungi              43,271     611,000     297,000         1,097       5,320    11,100
  Plantae           215,644     298,000       8,200         8,600      16,600     9,130
  Protozoa            8,118      36,400       6,690         8,118      36,400     6,690
  Total           1,233,500   8,740,000   1,300,000       193,756   2,210,000   182,000
Prokaryotes
  Archaea               502         455         160             1           1         0
  Bacteria           10,358       9,680       3,470           652       1,320       436
  Total              10,860      10,100       3,630           653       1,320       436
Grand Total       1,244,360   8,750,000   1,300,000       194,409   2,210,000   182,000

Predictions for prokaryotes represent a lower bound because they do not consider undescribed higher taxa. For protozoa, the ocean database was substantially more complete than the database for the entire Earth so we only used the former to estimate the total number of species in this taxon. All predictions were rounded to three significant digits.
doi:10.1371/journal.pbio.1001127.t002
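The headline figures in the Discussion can be re-derived from the eukaryote totals in Table 2 together with the description rate, per-career output, and per-species cost quoted in the text. The sketch below is only an arithmetic check under the paper's stated assumption that these values stay constant and apply across taxonomic groups. The variable and function names are illustrative, and applying the animal-species cost to all eukaryotes follows the paper's own simplification.

```python
# Eukaryote totals from Table 2 (catalogued vs. predicted).
CATALOGUED_EARTH = 1_233_500
PREDICTED_EARTH = 8_740_000
CATALOGUED_OCEAN = 193_756
PREDICTED_OCEAN = 2_210_000

# Rates quoted in the Discussion.
SPECIES_DESCRIBED_PER_YEAR = 6_200
SPECIES_PER_TAXONOMIST_CAREER = 24.8
COST_PER_SPECIES_USD = 48_500


def undescribed_fraction(catalogued, predicted):
    """Share of predicted species that is not yet catalogued."""
    return 1 - catalogued / predicted


earth_undescribed = undescribed_fraction(CATALOGUED_EARTH, PREDICTED_EARTH)
ocean_undescribed = undescribed_fraction(CATALOGUED_OCEAN, PREDICTED_OCEAN)

# Species still to be described, and the effort implied at constant rates.
remaining_species = PREDICTED_EARTH - CATALOGUED_EARTH
years_needed = remaining_species / SPECIES_DESCRIBED_PER_YEAR
taxonomists_needed = remaining_species / SPECIES_PER_TAXONOMIST_CAREER
total_cost_usd = remaining_species * COST_PER_SPECIES_USD

print(f"Undescribed on Earth: {earth_undescribed:.0%}")
print(f"Undescribed in the ocean: {ocean_undescribed:.0%}")
print(f"Years needed: {years_needed:,.0f}")
print(f"Taxonomists needed: {taxonomists_needed:,.0f}")
print(f"Cost: US${total_cost_usd / 1e9:,.0f} billion")
```

Rounded as in the paper, these values agree with the reported 86% and 91% undescribed, roughly 1,200 years, about 303,000 taxonomists, and about US$364 billion.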
The genomics of probiotic intestinal microorganisms: English paper and translation
The genomics of probiotic intestinal microorganisms

Seppo Salminen1, Jussi Nurmi2 and Miguel Gueimonde1

(1) Functional Foods Forum, University of Turku, FIN-20014 Turku, Finland
(2) Department of Biotechnology, University of Turku, FIN-20014 Turku, Finland

Seppo Salminen
Email: *********************

Published online: 29 June 2005

Abstract

An intestinal population of beneficial commensal microorganisms helps maintain human health, and some of these bacteria have been found to significantly reduce the risk of gut-associated disease and to alleviate disease symptoms. The genomic characterization of probiotic bacteria and other commensal intestinal bacteria that is now under way will help to deepen our understanding of their beneficial effects.

While the sequencing of the human genome [1, 2] has increased our understanding of the role of genetic factors in health and disease, each human being harbors many more genes than those in their own genome. These belong to our commensal and symbiotic intestinal microorganisms - our intestinal 'microbiome' - which play an important role in maintaining human health and well-being. A more appropriate image of ourselves would be drawn if the genomes of our intestinal microbiota were taken into account. The microbiome may contain more than 100 times the number of genes in the human genome [3] and provides many functions that humans have thus not needed to develop themselves. The indigenous intestinal microbiota provides a barrier against pathogenic bacteria and other harmful food components [4–6]. It has also been shown to have a direct impact on the morphology of the gut [7], and many intestinal diseases can be linked to disturbances in the intestinal microbial population [8].

The indigenous microbiota of an infant's gastrointestinal tract is originally created through contact with the diverse microbiota of the parents and the immediate environment.
During breast feeding, initial microbial colonization is enhanced by galacto-oligosaccharides in breast milk and contact with the skin microbiota of the mother. This early colonization process directs the microbial succession until weaning and forms the basis for a healthy microbiota. The viable microbes in the adult intestine outnumber the cells in the human body tenfold, and the composition of this microbial population throughout life is unique to each human being. During adulthood and aging the composition and diversity of the microbiota can vary as a result of disease and the genetic background of the individual.

Current research into the intestinal microbiome is focused on obtaining genomic data from important intestinal commensals and from probiotics, microorganisms that appear to actively promote health. This genomic information indicates that gut commensals not only derive food and other growth factors from the intestinal contents but also influence their human hosts by providing maturational signals for the developing infant and child, as well as providing signals that can lead to an alteration in the barrier mechanisms of the gut. It has been reported that colonization by particular bacteria has a major role in rapidly providing humans with energy from their food [9]. For example, the intestinal commensal Bacteroides thetaiotaomicron has been shown to have a major role in this process, and whole-genome transcriptional profiling of the bacterium has shown that specific diets can be associated with selective upregulation of bacterial genes that facilitate delivery of products of carbohydrate breakdown to the host's energy metabolism [10, 11]. Key microbial groups in the intestinal microbiota are highly flexible in adapting to changes in diet, and thus detailed prediction of their actions and effects may be difficult.
Although genomic studies have revealed important details about the impact of the intestinal microbiota on specific processes [3, 11–14], the effects of species composition and microbial diversity and their potential compensatory functions are still not understood.
Probiotics and health
A probiotic has been defined by a working group of the International Life Sciences Institute Europe (ILSI Europe) as "a viable microbial food supplement which beneficially influences the health of the host" [15]. Probiotics are usually members of the healthy gut microbiota and their addition can assist in returning a disturbed microbiota to its normal beneficial composition. The ILSI definition implies that safety and efficacy must be scientifically demonstrated for each new probiotic strain and product. Criteria for selecting probiotics that are specific for a desired target have been developed, but general criteria that must be satisfied include the ability to adhere to intestinal mucosa and tolerance of acid and bile. Such criteria have proved useful but cumbersome in current selection processes, as there are several adherence mechanisms and they influence gene upregulation differently in the host. Therefore, two different adhesion studies need to be conducted on each strain and their predictive value for specific functions is not always good or optimal. Demonstration of the effects of probiotics on health includes research on mechanisms and clinical intervention studies with human subjects belonging to target groups.
The revelation of the human genome sequence has increased our understanding of the genetic deviations that lead to or predispose to gastrointestinal disease as well as to diseases associated with the gut, such as food allergies. In 1995, the first genome of a free-living organism, the bacterium Haemophilus influenzae, was sequenced [16]. Since then, over 200 bacterial genome sequences, mainly of pathogenic microorganisms, have been completed.
The first genome of a mammalian lactic-acid bacterium, that of Lactococcus lactis, a microorganism of great industrial interest, was completed in 2001 [17]. More recently, the genomes of numerous other lactic-acid bacteria [18], bifidobacteria [12] and other intestinal microorganisms [13, 19, 20] have been sequenced, and others are under way [21]. Table 1 lists the probiotic bacteria that have been sequenced. These great breakthroughs have demonstrated that evolution has adapted both microbes and humans to their current state of cohabitation, or even symbiosis, which is beneficial to both parties and facilitates a healthy and relatively stable but adaptable gut environment.
Table 1
Lessons from genomes
Lactic-acid bacteria and bifidobacteria can act as biomarkers of gut health by giving early warning of aberrations that represent a risk of specific gut diseases. Only a few members of the genera Lactobacillus and Bifidobacterium, two genera that provide many probiotics, have been completely sequenced. The key issue for the microbiota, for probiotics, and for their human hosts is the flexibility of the microorganisms in coping with a changeable local environment and microenvironments.
This flexibility is emphasized in the completed genomes of intestinal and probiotic microorganisms. The complete genome sequence of the probiotic Lactobacillus acidophilus NCFM has recently been published by Altermann et al. [22]. The genome is relatively small and the bacterium appears to be unable to synthesize several amino acids, vitamins and cofactors. It also encodes a number of permeases, glycolases and peptidases for rapid uptake and utilization of sugars and amino acids from the human intestine, especially the upper gastrointestinal tract. The authors also report a number of cell-surface proteins, such as mucus- and fibronectin-binding proteins, that enable this strain to adhere to the intestinal epithelium and to exchange signals with the intestinal immune system.
Flexibility is guaranteed by a number of regulatory systems, including several transcriptional regulators, six PurR-type repressors and nine two-component systems, and by a variety of sugar transporters. The genome of another probiotic, Lactobacillus johnsonii [23], also lacks some genes involved in the synthesis of amino acids, purine nucleotides and numerous cofactors, but contains numerous peptidases, amino-acid permeases and other transporters, indicating a strong dependence on the host.
The presence of bile-salt hydrolases and transporters in these bacteria indicates an adaptation to the upper gastrointestinal tract [23], enabling the bacteria to survive the acidic and bile-rich environments of the stomach and small intestine. In this regard, bile-salt hydrolases have been found in most of the sequenced genomes of bifidobacteria and lactic-acid bacteria [24], and these enzymes can have a significant impact on bacterial survival. Another lactic-acid bacterium, Lactobacillus plantarum WCFS1, also contains a large number of genes related to carbohydrate transport and utilization, and has genes for the production of exopolysaccharides and antimicrobial agents [18], indicating a good adaptation to a variety of environments, including the human small intestine [14]. In general, flexibility and adaptability are reflected by a large number of regulatory and transport functions.
Microorganisms that inhabit the human colon, such as B. thetaiotaomicron and Bifidobacterium longum [12], have a great number of genes devoted to oligosaccharide transport and metabolism, indicating adaptation to life in the large intestine and differentiating them from, for example, L. johnsonii [23]. Genomic research has also provided initial information on the relationship between components of the diet and intestinal microorganisms. The genome of B. longum [12] suggests the ability to scan for nutrient availability in the lower gastrointestinal tract in human infants.
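This strain is adapted to utilizing the oligosaccharides in human milk along with intestinal mucins that are available in the colon of breast-fed infants. On the other hand, the genome of L. acidophilus has a gene cluster related to the metabolism of fructo-oligosaccharides, carbohydrates that are commonly used as prebiotics, or substrates to
The genomics of probiotic intestinal microorganisms
Seppo Salminen, Jussi Nurmi and Miguel Gueimonde
(1) Functional Foods Forum, University of Turku, FIN-20014 Turku, Finland
(2) Department of Biotechnology, University of Turku, FIN-20014 Turku, Finland
Seppo Salminen
Email: seppo.salminen utu.fi
Published online: 29 June 2005
Abstract
An intestinal population of beneficial commensal microorganisms helps maintain human health, and some of these bacteria have been found to significantly reduce the risk of gut-associated disease and to alleviate disease symptoms.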
A Method for the Rapid and Efficient Elution of Native
A Method for the Rapid and Efficient Elution of Native Affinity-Purified Protein A Tagged Complexes
Caterina Strambio-de-Castillia,† Jaclyn Tetenbaum-Novatt,† Brian S. Imai,‡ Brian T. Chait,§ and Michael P. Rout*,†
The Rockefeller University, 1230 York Avenue, New York, New York 10021-6399
Received May 24, 2005
A problem faced in proteomics studies is the recovery of tagged protein complexes in their native and active form. Here we describe a peptide, Bio-Ox, that mimics the immunoglobulin G (IgG) binding interface of Staphylococcus aureus Protein A, and competitively displaces affinity-purified Protein A fusion proteins and protein complexes from IgG-Sepharose. We show that Bio-Ox elution is a robust method for the efficient and rapid recovery of native tagged proteins, and can be applied to a variety of structural genomics and proteomics studies.
Keywords: Staphylococcus aureus • Protein A • affinity purification • proteomics • fusion protein
Introduction
Protein-protein interactions are central to the maintenance and control of cellular processes. The study of such protein-protein interactions has been greatly enhanced by fusion protein technology, wherein specific peptide or protein domain "tags" are fused to the protein of interest (generally at either its carboxyl-terminus or amino-terminus). These tags can facilitate the detection, increase the yield, and enhance the solubility of their associated proteins (refs 1-3). Most importantly, these fusion domains have been exploited to allow the single-step purification of the test protein either alone or in complexes with its in vivo binding partners (refs 4-6). The yield of these purification methods is often high enough to allow the identification of such binding partners by mass spectrometry.
A commonly used affinity tagging method generates genomically expressed Protein A (PrA) fusion proteins by modifying the coding sequence of the protein under study via PCR-directed approaches (refs 7-9). This method takes advantage of the ~10 nM binding affinity of PrA from Staphylococcus aureus for the constant region (Fc) of immunoglobulin G (IgG) (ref 10). After purification on IgG-conjugated resins, PrA-tagged proteins or protein complexes are most commonly eluted from the resin using high or low pH conditions. These elution methods typically lead to the denaturation of the isolated proteins, the dissociation of complexes, and concomitant loss of activity. However, it is often desirable to recover soluble native protein or protein complexes. One method by which this can be achieved is by constructing a cleavable tag. Such tags carry a specific cleavage site for a protease placed proximal to the tagged protein, allowing the tag to be removed from the fusion protein. Proteases that are widely used for this purpose include blood coagulation factor X (factor Xa), enteropeptidase (enterokinase), alpha-thrombin, and the tobacco etch virus (TEV) protease. Nevertheless, this method has drawbacks. First, the literature is replete with reports of fusion proteins that were cleaved by these proteases at sites other than the canonical cleaving site (refs 11-14). Second, the removal of the tag destroys the ability to detect or further purify the protein of interest, necessitating the encumbrance of a second, tandem tag (ref 15).
Here, we describe a rapid single step method for the efficient recovery of native and active PrA fusion proteins and protein complexes from IgG-Sepharose. This technique avoids the complications of having to use a protease and in addition has the advantage of retaining the original tag on the target protein after elution, permitting further purification steps and detection of the fusion protein in subsequent experiments. Our method takes advantage of a previously described peptide, termed FcIII (ref 16), which mimics the protein-protein binding interface of PrA for the hinge region on the Fc domain of human IgG. We modified FcIII by the addition of a biotin moiety to its amino-terminus to increase the peptide's solubility while leaving its affinity for Fc intact, making it a
more effective elution reagent. We termed this modified peptide Bio-Ox.
To investigate the properties of Bio-Ox, PrA-tagged proteins were isolated in their native state from yeast on an IgG-conjugated Sepharose resin, either alone or in combination with their in vivo interacting partners; the Bio-Ox peptide was then used to competitively displace the tagged proteins and elute them from the resin. The efficiency of elution was monitored by quantitatively comparing the amounts of proteins eluted to the amounts remaining on the resin under a variety of test conditions. We show that Bio-Ox elution is a robust method for the efficient and rapid recovery of native tagged proteins that can be applied to a variety of structural genomics and proteomics studies.
* To whom correspondence should be addressed. Tel: +1 (212) 327-8135. E-mail: rout@.
† Laboratory of Cellular and Structural Biology, Box 213.
‡ Proteomics Resource Center, Box 105.
§ Laboratory of Mass Spectrometry and Gaseous Ion Chemistry, Box 170.
Journal of Proteome Research 2005, 4, 2250-2256. 10.1021/pr0501517 CCC: $30.25 © 2005 American Chemical Society. Published on Web 10/08/2005
Experimental Section
Peptide Synthesis, Oxidation and Cyclization. Peptides were synthesized using standard Fmoc protocols. Typical deprotection times with 20% piperidine were 2 times 10 min and typical coupling times with 4-10-fold excess of amino acids over resin were 2 to 6 h. Small batches of peptides were made on a Symphony synthesizer (Protein Technologies, Inc.), while larger batches were made manually. Peptides were cleaved from the resin using 94.5% trifluoroacetic acid, 2.5% water, 2.5% ethanedithiol and 1% triisopropylsilane for 3 h at 25°C. The solubilized peptides were precipitated with 10 volumes of cold tert-butyl methyl ether and the precipitated peptide was washed several times with ether prior to air-drying. The air-dried peptide was dissolved in 20% acetonitrile in water to approximately 0.5 mg/mL, the pH was adjusted to 8.5 using sodium bicarbonate and the peptide was allowed to air
oxidize overnight to promote cyclization. The progress of cyclization was monitored by mass spectrometry. The cyclized crude peptide was purified using standard preparative reversed phase HPLC using a Vydac 218TP1022 C18 column.
Peptide Solubility. Eluting peptides were suspended at a concentration of 440 µM (0.77 mg/mL for Bio-Ox; 0.67 mg/mL for FcIII) in peptide buffer by extensive vortexing. The peptide concentration was verified by measuring the OD280nm of each solution (extinction coefficient: 1 OD280nm = 0.13 mg/mL). The peptide solutions/suspensions were then combined with equal amounts of a 100 mM buffer to obtain ~220 µM peptide at a range of pH values (buffers: Na-Acetate pH 4.8, Na-Citrate pH 5.4, Na-Succinate pH 5.8, Na-MES pH 6.2, BisTris-Cl pH 6.5, Na-HEPES pH 7.4, Na-TES pH 7.5, Tris-Cl pH 8.3, Na-CAPSO pH 9.6). Samples were incubated at room temperature with gentle agitation for 20 min, and then insoluble material was removed by centrifugation at 21,000 × g(max) for 20 min at 25°C. The concentration of peptide in each remaining supernatant was determined by measuring its OD280nm. To determine the maximum solubility of each peptide, the peptides were dissolved to saturation in peptide buffer by extensive vortexing and incubation with stirring at 25°C overnight. Insoluble material was removed by centrifugation at 15,000 × g for 15 min at 25°C and the amount of dissolved peptide was measured directly by amino acid analysis.
Peptide Competitive Displacement of Bound Recombinant PrA from IgG-Sepharose. Recombinant PrA (280 µg; 6.7 nmol) from S. aureus (Pierce) was dissolved in 1 mL TB-T [20 mM HEPES-KOH pH 7.4, 110 mM KOAc, 2 mM MgCl2, 0.1% Tween-20 (vol/vol)] and added to 280 µL of packed pre-equilibrated Sepharose 4B (Amersham Biosciences) conjugated with affinity-purified rabbit IgG (ICN/Cappel; 1.87 nmoles IgG). After incubation on a rotating wheel overnight at 4°C, the resin was washed twice with 1 mL TB-T, twice with 1 mL TB-T containing 200 mM MgCl2, and twice with 1 mL TB-T. After the final wash, the resin was divided evenly into 14 equal aliquots. The
peptide was dissolved in peptide buffer at concentrations ranging between 0 and 440 µM peptide. Aliquots of 400 µL of the appropriate peptide solution were added to each PrA-IgG-Sepharose containing tube, and the tubes were then incubated on a rotating platform for 3 h at 4°C followed by 1 h at 25°C. After displacement of bound PrA from the IgG-Sepharose, the resin was recovered by centrifugation on a Bio-Spin column (BioRad), and resuspended in one bed-volume of sample buffer. Samples were separated by SDS-PAGE.
Yeast Strains. Strains are isogenic to DF5 unless otherwise specified. All yeast strains were constructed using standard genetic techniques. C-terminal genomically tagged strains were generated using the PCR method previously described (refs 7, 17).
Affinity Purification of Proteins and Protein Complexes on IgG-Sepharose. The protocol for the purification of PrA-containing complexes was modified from published methods (refs 18-20). For the purification of Kap95p-PrA, yeast cytosol was prepared essentially as previously described (refs 21, 22). Kap95p-PrA cytosol was diluted with 3.75 volumes of extraction buffer 1 [EB1: 20 mM Hepes/KOH, pH 7.4, 0.1% (vol/vol) Tween-20, 1 mM EDTA, 1 mM DTT, 4 µg/mL pepstatin, 0.2 mg/mL PMSF]. The diluted cytosol was cleared by centrifugation at 2000 × g(av) for 10 min in a Sorvall T6000D tabletop centrifuge and at 181,000 × g(max) for 1 h in a Type 80 Ti Beckman rotor at 4°C. A 10 µL bed volume of IgG-Sepharose pre-equilibrated in EB1 was added per 0.5 mL of cytosol and the binding reaction was incubated overnight at 4°C on a rotating wheel. The resin was recovered by centrifugation at 2000 × g(av) for 1 min in a Sorvall T6000D tabletop centrifuge, transferred to 1.5 mL snap-cap tubes (Eppendorf), and washed 6 times with EB1 without DTT.
For the purification of Nup82p-PrA, cells were grown in Wickerham's medium (ref 21) to a concentration of 4 × 10^7 cells/mL, washed with water and with 20 mM Hepes/KOH pH 7.4, 1.2% PVP (weight/vol), 4 µg/mL pepstatin, 0.2 mg/mL PMSF, and frozen in liquid N2 before being ground with a motorized grinder (Retsch). Ground cell powder (1 g) was thawed into 10 mL of extraction buffer 2 [EB2: 20 mM Na-HEPES, pH 7.4, 0.5% Triton X-100 (vol/vol), 75 mM NaCl, 1 mM DTT, 4 µg/mL pepstatin, 0.2 mg/mL PMSF]. Cell lysates were homogenized by extensive vortexing at 25°C followed by the use of a Polytron for 25 s (PT 10/35; Brinkman Instruments) at 4°C. Clearing of the homogenate, binding to IgG-Sepharose, resin recovery and washing were done as above except that 10 µL of IgG-Sepharose bed volume was used per 1 g of cell powder and EB2 without DTT was used for all the washes. Elution of the PrA tagged complexes was performed as described below.
Peptide Elution of Test Proteins and Protein Complexes and Removal of Peptide by Size Exclusion. Kap95p-PrA or Nup82p-PrA bound IgG-Sepharose resin was recovered over a pre-equilibrated Bio-Spin column (BioRad) by centrifugation for 1 min at 1000 × g(max). Three bed-volumes of 440 µM (unless otherwise indicated in the text) of eluting peptide in peptide buffer were added per volume of packed IgG-Sepharose resin.
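
As a worked illustration of the reagent amounts implied by this step, the short Python sketch below computes the volume of 440 µM peptide solution for a given packed resin volume (three bed-volumes) and the corresponding mass of peptide, using the 0.77 mg/mL equivalence quoted for Bio-Ox in the Peptide Solubility paragraph. The function and constant names are hypothetical and introduced here only for illustration; this is a convenience calculation, not code from the original study.

```python
# Illustrative sketch only: names are hypothetical, numeric values come from the text.
BED_VOLUMES_OF_PEPTIDE = 3.0        # three bed-volumes of peptide solution per resin volume
BIO_OX_MG_PER_ML_AT_440_UM = 0.77   # 440 uM Bio-Ox is quoted as 0.77 mg/mL


def elution_volume_ul(resin_bed_volume_ul: float) -> float:
    """Volume of 440 uM peptide solution (uL) for a given packed resin volume (uL)."""
    return BED_VOLUMES_OF_PEPTIDE * resin_bed_volume_ul


def bio_ox_mass_ug(peptide_volume_ul: float) -> float:
    """Mass of Bio-Ox (ug) contained in a given volume (uL) of 440 uM solution."""
    # mg/mL is numerically equal to ug/uL, so volume (uL) x concentration gives ug.
    return peptide_volume_ul * BIO_OX_MG_PER_ML_AT_440_UM


if __name__ == "__main__":
    for resin_ul in (10.0, 17.5):
        volume = elution_volume_ul(resin_ul)
        mass = bio_ox_mass_ug(volume)
        print(f"{resin_ul} uL resin: {volume} uL peptide solution, {mass:.1f} ug Bio-Ox")
```

For 17.5 µL and 10 µL of packed resin this gives 52.5 µL and 30 µL of peptide solution, which are the volumes used in the Kap95p-Nup2p binding experiments.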
The elution was carried out for various times (as indicated in the text) at either 4°C or at 25°C. When elution was complete, the eluate was recovered over a Bio-Spin column. Finally, the resin was washed with one bed-volume of elution buffer to displace more eluted material from the resin and the wash was pooled with the initial eluate. The peptide was removed by filtration of the eluate over a micro spin G25 column (Amersham Biosciences) as described by the manufacturer.
Kap95p-Nup2p in Vitro Binding Experiments. To demonstrate in vitro binding of proteins after elution from the resin, Kap95p-PrA from 0.3 mL of yeast cytosol was affinity-purified on 17.5 µL of packed IgG-Sepharose and eluted with 52.5 µL of 440 µM Bio-Ox for 2.5 h at 4°C followed by 1 h at 25°C. The resulting sample (total volume 88 µL) was mixed with 0.1 µL of E. coli total cell lysate containing Nup2p-GST (generous gift from David Dilworth and John Aitchison; ref 23) and brought to a total volume of 500 µL with TB-T, 1 mM DTT, 4 µg/mL pepstatin, 0.2 mg/mL PMSF. Controls were set up in the absence of either Kap95p-PrA or Nup2p-GST. The samples were incubated at 25°C for 30 min after which 40 µL of packed, pre-equilibrated glutathione-Sepharose 4B resin (Amersham Biosciences) was added per sample and the incubation was continued at 4°C for 1 h. After nine washes with 1 mL of TB-T, 1 mM DTT, 4 µg/mL pepstatin, 0.2 mg/mL PMSF, at 25°C, the resin was recovered on Bio-Spin columns as described above and bound material was eluted with 40 µL of sample buffer. The samples were resolved on SDS-PAGE alongside an aliquot of input peptide-eluted Kap95p-PrA.
To demonstrate the recovery of in vitro reconstituted protein complexes from the resin, Kap95p-PrA from 0.3 mL of yeast cytosol was affinity-purified on 10 µL of packed IgG-Sepharose and the washed resin was equilibrated in TB-T, 1 mM DTT, 4 µg/mL pepstatin, 0.2 mg/mL PMSF. This pre-bound Kap95p-PrA was mixed with 50 µL of E. coli
total cell lysate containing Nup2p-GST in a total volume of 1 mL of TB-T, 1 mM DTT, 4 µg/mL pepstatin, 0.2 mg/mL PMSF. A mock control experiment was set up in the absence of Nup2p-GST. The binding reaction was carried out for 1 h at 4°C and the resin was washed 2 times with 1 mL of TB-T, 2 times with 1 mL of TB-T containing 100 µM ATP and 3 times with peptide buffer (all washes were without DTT). Bound material was eluted with 30 µL of 440 µM Bio-Ox in peptide buffer for 2.5 h at 4°C followed by 1 h at 25°C. Samples were resolved by SDS-PAGE.
Figure 1. Addition of a Biotin moiety to the FcIII peptide does not alter the ability of the peptide to competitively displace bound PrA from IgG-Sepharose. (a) Primary sequence and chemical structure of the biotinylated FcIII peptide, Bio-Ox. (b) 220 µM suspensions of peptides were prepared in buffers of different pHs, and allowed to solubilize. The material remaining in the buffer after centrifugation is plotted for Bio-Ox (closed triangles, black trend line) and FcIII (open circles, gray trend line); the dashed horizontal line represents the starting 220 µM level. (c) Increasing amounts of Bio-Ox (closed triangles) and FcIII (open diamonds) were used to competitively displace recombinant PrA from IgG-Sepharose. The amounts of PrA left on the resin after elution were resolved by SDS-PAGE alongside known amounts of PrA standards. The data are displayed on logarithmic scale on both axes. Data are displayed as a % recovery relative to the input PrA amount (i.e., PrA amount remaining bound in the absence of eluting peptide). Linear regression for both data sets was used to calculate the IC50.
Quantitation and Image Analyses. Band intensities were quantified with the Openlab software (Improvision), and the data were plotted using Excel (Microsoft).
Results and Discussion
Design of the PrA Mimicking Peptide. The hinge region on the Fc fragment of immunoglobulin G (IgG) interacts with
Staphylococcus aureus Protein A (PrA). This region was also found to be the preferred binding site for peptides selected by bacteriophage display from a random library (ref 16). The specific Fc binding interactions of a selected 13 amino acid peptide (termed FcIII) were shown to closely mimic those of natural Fc binding partners. We reasoned that this peptide could be used to efficiently displace PrA tagged proteins from IgG-conjugated affinity resins. Initial trials with FcIII determined that, although it functioned as an eluant, it exhibited a strong tendency to aggregate and its solubility under physiological conditions was not sufficient for many practical purposes, leading to low yields and nonreproducible results. As the high peptide concentrations needed for elution are outside the conditions for which the FcIII peptide was designed, we synthesized several modified peptides based on FcIII, with the specific aim of increasing their solubility and decreasing their degree of aggregation under conditions that would be useful for the isolation of proteins and protein complexes. Among the different alternatives, the most efficient in the displacement of bound PrA-tagged Kap95p from IgG-Sepharose was a peptide in which the amino-terminus of the original FcIII peptide was modified by the addition of a Biotin moiety (data not shown). We termed this peptide Bio-Ox (Figure 1, panel a).
Figure 2. Bio-Ox can be used to efficiently compete bound PrA-tagged proteins and protein complexes from IgG-Sepharose in a temperature-dependent fashion. (a) Kap95p-PrA/Kap60p was affinity-purified on IgG-Sepharose from logarithmically growing yeast cells. 440 µM Bio-Ox was used to competitively displace the bound tagged proteins from the IgG-Sepharose resin. The elution reaction was carried out for the times indicated. At the end of the incubation time eluted proteins (E) and proteins remaining bound to the resin (B) were resolved on SDS-PAGE. (b) Kap95p-PrA (closed squares) and Nup82p-PrA (open squares) were affinity-purified on IgG-Sepharose from logarithmically growing yeast cells and eluted as described above. The amounts of eluted versus resin-bound protein were quantified using the OpenLab software and the elution efficiency for each time point is presented as the percentage of eluted material over the total amount of bound plus eluted material (% eluted). (c) 440 µM Bio-Ox was used to elute Kap95p-PrA or Nup82p-PrA for 1 h at 4°C or 25°C as indicated.
The solubility of Bio-Ox was measured directly by amino acid analysis and was shown to be ~3-fold greater than the solubility of FcIII at pH 7.4. In addition, comparison of the solubility of both peptides over a range of pHs indicated that Bio-Ox was considerably more soluble than FcIII at all but the most extreme pHs tested; importantly, Bio-Ox is very soluble across the full physiological range of pHs (Figure 1, panel b). To determine whether the addition of the Biotin moiety could have altered the inhibiting ability of the peptide, we measured the inhibition constant for Bio-Ox and found it to be comparable with the reported Ki for FcIII (~11 nM; data not shown). We then measured the IC50 for competitive displacement for FcIII and Bio-Ox, under conditions in which both were soluble. For this test, commercially available recombinant PrA from S. aureus was first bound to IgG-Sepharose and then increasing concentrations of the peptide were used to displace the bound PrA from the immobilized IgG (Figure 1, panel c). The apparent IC50 was found to be 10.4 ± 3.2 µM for FcIII and 9.8 ± 2.6 µM for Bio-Ox (mean value of four independent trials ± standard deviation of the mean). Taken together, Bio-Ox appears to be as efficient as FcIII at binding to the Fc portion of antibodies and competing for this site with Protein A, but is far more soluble in physiologically compatible buffers, a key requirement for an efficient elution peptide (Figure
3).
Experimental Design of the Competitive Elution Procedure. The principle of the method is as follows: genomically PrA-tagged proteins of interest are expressed in yeast and affinity isolated on IgG-conjugated Sepharose resin. Depending on the conditions used for lysis and extraction, the test protein can be recovered in native form either in isolation or in complexes with protein partners. After binding, the resin is recovered by centrifugation and washed extensively to remove unbound material. The bound material is competitively displaced from the IgG-Sepharose resin by incubation with 440 µM Bio-Ox peptide in peptide buffer for 2 h at 4°C. Finally, the peptide is rapidly (<1 min) removed from the eluted sample by fractionation over a size exclusion spin column. Given a typical protein of average abundance, 1-10 µg of pure protein can be recovered from 1 g of cells using this method.
Figure 3. Elution of Kap95p-PrA/Kap60p is dose dependent. (a) Kap95p-PrA was affinity-purified on IgG-Sepharose from logarithmically growing yeast cells and eluted using increasing concentrations of Bio-Ox peptide as indicated. (b) The elution efficiency measured as described in Figure 2 was plotted versus the peptide concentration in logarithmic scale as indicated.
Figure 4. Eluted Kap95p-PrA/Kap60p complex retains its biological activity. (a) Kap95p-PrA was prepared by affinity purification followed by Bio-Ox peptide elution (Kap95-PrA eluate). Three binding reactions were then set up containing eluted Kap95p-PrA and Nup2p-GST bacterial lysate, Kap95p-PrA alone or Nup2p-GST alone. At the end of the incubation, Nup2p-GST was affinity-purified on glutathione-Sepharose and the immobilized material was eluted from the resin with sample buffer and resolved on SDS-PAGE (GST bound). (b) Kap95p-PrA was immobilized on IgG-Sepharose and incubated with (+) or without (-) bacterial lysate containing Nup2p-GST. The resulting material was eluted using Bio-Ox. Eluate (E) and resin bound (B) material was resolved on
SDS-PAGE. *, indicates a Nup2p breakdown product.
Table 1. Elution Efficiency for PrA Tagged Nups
name of nup    % yield
Nup53p         56
Nup59p         81
Nup84p         88
Nup85p         81
Nic96p         76
Nsp1p          99
Nup1p          99
Nup120p        69
Nup157p        82
Nup159p        53
Nup170p        80
Nup192p        76
Gle2p          90
To explore the characteristics of Bio-Ox elution under conditions that preserve native protein complexes, we chose to work with the yeast karyopherin Kap95p-PrA/Kap60p complex (ref 24), and with the yeast nucleoporin Nup82p-PrA/Nsp1p/Nup159p complex (refs 25, 26). This choice was dictated by our interest in the structure and function of the yeast nuclear pore complex (NPC) (refs 17, 27).
Optimization of the Elution Conditions. An elution time course for Kap95p-PrA/Kap60p and Nup82p-PrA from IgG-Sepharose at 4°C is shown in Figure 2, panels a and b. In both cases, the elution was virtually complete after 2 h at 4°C. The largest difference in elution efficiency between the two test proteins was found at the earlier time points. Thus, more than 50% of initially bound Kap95p-PrA was displaced by 10 min, while it took ~1 h to obtain the same result with Nup82p-PrA. We also determined the temperature dependence of the elution process (Figure 2, panel c). Elutions of Kap95p-PrA and Nup82p-PrA with Bio-Ox for 1 h were compared at 4°C and 25°C, showing that elution was improved at 25°C over 4°C for both test proteins. These various factors underscore the need to conduct appropriate test experiments to determine the optimal conditions for any given application.
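
The elution efficiency used throughout this section is defined in the Figure 2 legend as the eluted material over the total of bound plus eluted material. A minimal Python sketch of that calculation follows, together with summary statistics for the yields as they are listed in Table 1. The function names are hypothetical, the band intensities in the usage example are made-up numbers rather than measured data, and the original quantification was carried out with OpenLab and Excel, not with this code.

```python
import statistics


def percent_eluted(eluted: float, bound: float) -> float:
    """Elution efficiency (%): eluted / (bound + eluted) x 100."""
    total = eluted + bound
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * eluted / total


# % yield values as listed in Table 1 (13 PrA-tagged Nups).
TABLE1_YIELDS = {
    "Nup53p": 56, "Nup59p": 81, "Nup84p": 88, "Nup85p": 81,
    "Nic96p": 76, "Nsp1p": 99, "Nup1p": 99, "Nup120p": 69,
    "Nup157p": 82, "Nup159p": 53, "Nup170p": 80, "Nup192p": 76,
    "Gle2p": 90,
}


def summarize(yields: dict) -> tuple:
    """Return (mean, sample standard deviation) of the listed % yields."""
    values = list(yields.values())
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Hypothetical band intensities (arbitrary units), for illustration only.
    print(f"% eluted = {percent_eluted(eluted=800.0, bound=200.0):.1f}")
    mean, sd = summarize(TABLE1_YIELDS)
    print(f"Table 1 yields: mean = {mean:.1f}%, sample SD = {sd:.1f}%")
```

For the 13 values as they appear in Table 1 here, this gives a mean of about 79% with a sample standard deviation of about 14%, close to the average yield quoted in the text.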
Elution for shorter periods and at 4°C is preferable, for example, when the proteins under study are sensitive to denaturation, dissociation or proteolytic degradation.
We also tested the dependence of elution efficiency upon Bio-Ox concentration. For this test, Kap95p-PrA bound to IgG-Sepharose was competitively displaced using increasing amounts of Bio-Ox peptide for 4 h at 4°C (Figure 3). Bio-Ox peptide displaced IgG-Sepharose bound PrA tagged Kap95p with an apparent IC50 of 60.8 µM. For practical purposes, the protocol we use in most cases takes advantage of the high solubility of Bio-Ox to obtain maximally efficient elutions, utilizing a concentration of 440 µM of Bio-Ox peptide for 2 h at 4°C.
To test the general applicability of the method, we performed peptide elution experiments using a series of PrA tagged proteins that were available in our laboratories (ref 17). The yield for these proteins was in all cases >50% and in most cases was >80% (average yield 78% ± 14%; Table 1).
Eluted Proteins Retain their Biological Activity. The translocation of macromolecules between the nucleus and cytosol of eukaryotic cells occurs through the NPC and is facilitated by soluble transport factors termed karyopherins (reviewed in ref 28). Nucleoporins that contain FG peptide repeats (FG Nups) function as binding sites for karyopherins within the NPC. One example of an FG Nup-karyopherin interaction is represented by the binding of the Kap95p/Kap60p complex to Nup2p (ref 29), an interaction that requires both karyopherins to be natively folded (refs 30, 31). We took advantage of this interaction to demonstrate that the Bio-Ox eluted Kap95p-PrA/Kap60p complex retains its biological activity and is able to bind Nup2p in vitro (Figure 4, panel a). In this test, Kap95p-PrA was affinity-purified and eluted from IgG-Sepharose as described above.
The eluate was incubated with whole cell lysate from E. coli expressing Nup2p-GST (ref 23), and GST-tagged Nup2p was isolated over glutathione-Sepharose resin. As a control, the same experiment was performed either in the absence of Nup2p-GST containing bacterial lysate or in the absence of Kap95p-PrA eluate. As shown, Nup2p-GST binds specifically and directly to the peptide-eluted Kap95p-PrA/Kap60p complex. This result is consistent with reported data and demonstrates that elution with Bio-Ox does not alter the native state and biological activity of Kap95p-PrA. Moreover, the apparent equimolar stoichiometry of the Nup2p-GST/Kap95p-PrA/Kap60p complex indicates that essentially all of the peptide eluted karyopherins were in their native, active conformation. This result underscores the usefulness of this method for the preparation of native protein samples.
The method can also be used for in vitro reconstitution experiments of biologically relevant protein-protein interactions of interest. For this test, Kap95p-PrA was affinity isolated on IgG-Sepharose, Nup2p-GST was bound to the immobilized Kap95p-PrA and then the reconstituted complex was competitively displaced from the resin by Bio-Ox peptide elution (Figure 4, panel b). This shows that the method can be used in vitro to study protein-protein interactions using purified components.
Conclusion
We have used the Bio-Ox technology extensively in our laboratories for a wide variety of applications including: (1) the semipreparative purification of ~30 PrA-tagged natively folded Nups for the determination of their sedimentation coefficient over a sucrose velocity gradient (S. Dokudovskaya, L. Veenhoff, personal communication); (2) the isolation of yeast cyclins and cyclin-Cdk associated proteins (ref 32); (3) the semipreparative purification of enzymatically active Dpb4p-PrA chromatin remodeling/histone complexes (ref 33); and (4) the study of the in vitro binding property of proteins of interest using blot and resin binding experiments (ref 34). Thus, this method should be
generally applicable to the native purification of most other proteins and protein complexes.Acknowledgment.We are very grateful to David Dil-worth and John Aitchison for the generous gift of bacterially expressed Nup2p-GST.We are deeply indebted to Rosemary Williams for her skilled technical assistance throughout the course of this study and to all members of the Rout and Chait laboratories and of the Proteomic Research Center,past and present,for their continual help and unwavering support.We are particularly grateful to Markus Kalkum,Bhaskar Chan-drasekhar,Svetlana Dokudovskaya and Liesbeth Veenhoff.This work was supported by grants from the American Cancer Society(RSG-0404251)and the NIH(GM062427,RR00862,and CA89810).References(1)Uhlen,M.;Forsberg,G.;Moks,T.;Hartmanis,M.;Nilsson,B.Fusion proteins in biotechnology.Curr.Opin.Biotechnol.1992, 3(4),363-369.(2)Nygren,P.A.;Stahl,S.;Uhlen,M.Engineering proteins to facilitatebioprocessing.Trends Biotechnol.1994,12(5),184-188.(3)Baneyx,F.Recombinant protein expression in Escherichia coli.Curr.Opin.Biotechnol.1999,10(5),411-421.(4)LaVallie,E.R.;McCoy,J.M.Gene fusion expression systems inEscherichia coli.Curr.Opin.Biotechnol.1995,6(5),501-506.(5)Nilsson,J.;Stahl,S.;Lundeberg,J.;Uhlen,M.;Nygren,P.A.Affinity fusion strategies for detection,purification,and im-mobilization of recombinant proteins.Protein Expr.Purif.1997, 11(1),1-16.(6)Einhauer,A.;Jungbauer,A.The FLAG peptide,a versatile fusiontag for the purification of recombinant proteins.J.Biochem.Biophys.Methods2001,49(1-3),455-465.(7)Aitchison,J. 
D.;Blobel,G.;Rout,M.P.Nup120p:a yeastnucleoporin required for NPC distribution and mRNA transport.J.Cell Biol.1995,131(6Pt2),1659-1675.(8)Grandi,P.;Doye,V.;Hurt,E.C.Purification of NSP1revealscomplex formation with‘GLFG’nucleoporins and a novel nuclear pore protein NIC96.EMBO J.1993,12(8),3061-3071.(9)Stirling,D.A.;Petrie,A.;Pulford,D.J.;Paterson,D.T.;Stark,M.J.Protein A-calmodulin fusions:a novel approach for investigat-ing calmodulin function in yeast.Mol.Microbiol.1992,6(6),703-713.Native Elution of PrA-Tagged Proteins research articlesJournal of Proteome Research•Vol.4,No.6,20052255。
2005-A global Malmquist productivity index
A global Malmquist productivity index

Jesús T. Pastor (a), C.A. Knox Lovell (b,*)
(a) Centro de Investigación Operativa, Universidad Miguel Hernández, 03206 Elche (Alicante), Spain
(b) Department of Economics, University of Georgia, Athens, GA 30602, USA

Received 2 June 2004; received in revised form 24 January 2005; accepted 16 February 2005. Available online 23 May 2005.
* Corresponding author. Tel.: +1 706 542 3689; fax: +1 706 542 3376. E-mail address: knox@ (C.A.K. Lovell).
Economics Letters 88 (2005) 266–271, /locate/econbase. doi:10.1016/j.econlet.2005.02.013

Abstract

The geometric mean Malmquist productivity index is not circular, and its adjacent period components can provide different measures of productivity change. We propose a global Malmquist productivity index that is circular, and that gives a single measure of productivity change.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Malmquist productivity index; Circularity
JEL classification: C43; D24; O47

1. Introduction

The geometric mean form of the contemporaneous Malmquist productivity index, introduced by Caves et al. (1982), is not circular. Whether this is a serious problem depends on the powers of persuasion of Fisher (1922), who dismissed the test, and Frisch (1936), who endorsed it. The index averages two possibly disparate measures of productivity change. Färe and Grosskopf (1996) state sufficient conditions on the adjacent period technologies for the index to satisfy circularity, and to average the same measures of productivity change. When linear programming techniques are used to compute and decompose the index, infeasibility can occur. Whether this is a serious problem depends on the structure of the data. Xue and Harker (2002) provide necessary and sufficient conditions on the data for LP infeasibility not to occur.

We demonstrate that the source of all three problems is the specification of adjacent period technologies in the construction of the index. We show that it is possible to specify a base period technology in a way that solves all three problems, without having to impose restrictive conditions on either the technologies or the data.

Berg et al. (1992) proposed an index that compares adjacent period data using technology from a base period. This index satisfies circularity and generates a single measure of productivity change, but it pays for circularity with base period dependence, and it remains susceptible to LP infeasibility.

Shestalova (2003) proposed an index having as its base a sequential technology formed from data of all producers in all periods up to and including the two periods being compared. This index is immune to LP infeasibility, and it generates a single measure of productivity change, but it fails circularity and it precludes technical regress.

Thus no currently available Malmquist productivity index solves all three problems. We propose a new global index with technology formed from data of all producers in all periods. This index satisfies circularity, it generates a single measure of productivity change, it allows technical regress, and it is immune to LP infeasibility.

In Section 2 we introduce and decompose the circular global index. Its efficiency change component is the same as that of the contemporaneous index, but its technical change component is new. In Section 3 we relate it to the contemporaneous index. In Section 4 we provide an empirical illustration. Section 5 concludes.

2. The global Malmquist productivity index

Consider a panel of $i = 1, \dots, I$ producers and $t = 1, \dots, T$ time periods. Producers use inputs $x \in R^N_+$ to produce outputs $y \in R^P_+$. We define two technologies. A contemporaneous benchmark technology is defined as $T_c^t = \{(x^t, y^t) \mid x^t \text{ can produce } y^t\}$ with $\lambda T_c^t = T_c^t$, $t = 1, \dots, T$, $\lambda > 0$. A global benchmark technology is defined as $T_c^G = \mathrm{conv}\{T_c^1 \cup \dots \cup T_c^T\}$. The subscript "c" indicates that both benchmark technologies satisfy constant returns to scale.

A contemporaneous Malmquist productivity index is defined on $T_c^s$ as

$$
M_c^s(x^t, y^t, x^{t+1}, y^{t+1}) = \frac{D_c^s(x^{t+1}, y^{t+1})}{D_c^s(x^t, y^t)}, \tag{1}
$$

where the output distance functions $D_c^s(x, y) = \min\{\phi > 0 \mid (x, y/\phi) \in T_c^s\}$, $s = t, t+1$. Since $M_c^t(x^t, y^t, x^{t+1}, y^{t+1}) \neq M_c^{t+1}(x^t, y^t, x^{t+1}, y^{t+1})$ without restrictions on the two technologies, the contemporaneous index is typically defined in geometric mean form as $M_c(x^t, y^t, x^{t+1}, y^{t+1}) = [M_c^t(x^t, y^t, x^{t+1}, y^{t+1}) \times M_c^{t+1}(x^t, y^t, x^{t+1}, y^{t+1})]^{1/2}$.

A global Malmquist productivity index is defined on $T_c^G$ as

$$
M_c^G(x^t, y^t, x^{t+1}, y^{t+1}) = \frac{D_c^G(x^{t+1}, y^{t+1})}{D_c^G(x^t, y^t)}, \tag{2}
$$

where the output distance functions $D_c^G(x, y) = \min\{\phi > 0 \mid (x, y/\phi) \in T_c^G\}$.

Both indexes compare $(x^{t+1}, y^{t+1})$ to $(x^t, y^t)$, but they use different benchmarks. Since there is only one global benchmark technology, there is no need to resort to the geometric mean convention when defining the global index.

$M_c^G$ decomposes as

$$
\begin{aligned}
M_c^G(x^t, y^t, x^{t+1}, y^{t+1})
&= \frac{D_c^{t+1}(x^{t+1}, y^{t+1})}{D_c^t(x^t, y^t)} \times \left\{ \frac{D_c^G(x^{t+1}, y^{t+1})}{D_c^{t+1}(x^{t+1}, y^{t+1})} \times \frac{D_c^t(x^t, y^t)}{D_c^G(x^t, y^t)} \right\} \\
&= \frac{TE_c^{t+1}(x^{t+1}, y^{t+1})}{TE_c^t(x^t, y^t)} \times \left\{ \frac{D_c^G\left(x^{t+1},\, y^{t+1}/D_c^{t+1}(x^{t+1}, y^{t+1})\right)}{D_c^G\left(x^t,\, y^t/D_c^t(x^t, y^t)\right)} \right\} \\
&= EC_c \times \left\{ \frac{BPG_c^{G,t+1}(x^{t+1}, y^{t+1})}{BPG_c^{G,t}(x^t, y^t)} \right\} \\
&= EC_c \times BPC_c,
\end{aligned}
\tag{3}
$$

where $EC_c$ is the usual efficiency change indicator and $BPG_c^{G,s} \le 1$ is a best practice gap between $T_c^G$ and $T_c^s$ measured along rays $(x^s, y^s)$, $s = t, t+1$. $BPC_c$ is the change in $BPG_c$, and provides a new measure of technical change. $BPC_c \gtreqless 1$ indicates whether the benchmark technology in period $t+1$ in the region $[(x^{t+1}, y^{t+1}/D_c^{t+1}(x^{t+1}, y^{t+1}))]$ is closer to or farther away from the global benchmark technology than is the benchmark technology in period $t$ in the region $[(x^t, y^t/D_c^t(x^t, y^t))]$.

$M_c^G$ has four virtues. First, like any fixed base index, $M_c^G$ is circular, and since $EC_c$ is circular, so is $BPC_c$. Second, each provides a single measure, with no need to take the geometric mean of disparate adjacent period measures. Third, but not shown here, the decomposition in (3) can be extended to generate a three-way decomposition that is structurally identical to the Ray and Desli (1997) decomposition of the contemporaneous index. $M_c^G$ and $M_c$ share a common efficiency change component, but they have different technical change and scale components, and so $M_c^G \neq M_c$ without restrictions on the technologies. Finally, the technical change and scale components of $M_c^G$ are immune to the LP infeasibility problem that plagues these components of $M_c$.

3. Comparing the global and contemporaneous indexes

The ratio

$$
\begin{aligned}
M_c^G / M_c
&= \left\{ \left( M_c^G / M_c^{t+1} \right) \times \left( M_c^G / M_c^t \right) \right\}^{1/2} \\
&= \left\{ \left[ \frac{D_c^G\left(x^{t+1},\, y^{t+1}/D_c^{t+1}(x^{t+1}, y^{t+1})\right)}{D_c^G\left(x^t,\, y^t/D_c^{t+1}(x^t, y^t)\right)} \right] \times \left[ \frac{D_c^G\left(x^{t+1},\, y^{t+1}/D_c^t(x^{t+1}, y^{t+1})\right)}{D_c^G\left(x^t,\, y^t/D_c^t(x^t, y^t)\right)} \right] \right\}^{1/2} \\
&= \left\{ \left[ \frac{BPG_c^{G,t+1}(x^{t+1}, y^{t+1})}{BPG_c^{G,t+1}(x^t, y^t)} \right] \times \left[ \frac{BPG_c^{G,t}(x^{t+1}, y^{t+1})}{BPG_c^{G,t}(x^t, y^t)} \right] \right\}^{1/2}
\end{aligned}
\tag{4}
$$

is the geometric mean of two terms, each being a ratio of benchmark technology gaps along different rays. $M_c^G / M_c \gtreqless 1$ as projections onto $T_c^t$ and $T_c^{t+1}$ of period $t+1$ data are closer to, equidistant from, or farther away from $T_c^G$ than projections onto $T_c^t$ and $T_c^{t+1}$ of period $t$ data are.

$M_c^G = M_c$ if $BPG_c^{G,s}(x^{t+1}, y^{t+1}) = BPG_c^{G,s}(x^t, y^t)$, $s = t, t+1$. From the first equality in (4), this condition is equivalent to the condition $M_c^G = M_c^s$, $s = t, t+1$. If this condition holds for all $s$, it is equivalent to the condition $M_c^t = M_c^1$ for all $t$. Althin (2001) has shown that a sufficient condition for base period independence is that technical change be Hicks output-neutral (HON). Hence HON is also sufficient for $M_c^G = M_c$.

4. An empirical illustration

We summarize an application intended to illustrate the behavior of $M_c^G$, and to compare its performance with that of $M_c$. We analyze a panel of 93 US electricity generating firms in four years (1977, 1982, 1987, 1992). The firms use labor (FTE employees), fuel (BTUs of energy) and capital (a multilateral Törnqvist index) to generate electricity (net generation in MW h). The data are summarized in Table 1. Electricity generation increased by proportionately less than each input did. The main cause of the rapid increase in the capital input was the enactment of environmental regulations mandating the installation of pollution abatement equipment. We are unable to disaggregate the capital input into its productive and abatement components.

Table 1
Electricity generation data, annual means

                         1977      1982      1987      1992
Output (000 MW h)      13,700    13,860    16,180    17,270
Labor (# FTE)            1373      1797      1995      2021
Fuel (billion BTU)       1288      1441      1667      1824
Capital (Törnqvist)    44,756   211,622   371,041   396,386

Empirical findings are summarized in Table 2. The first three rows report decomposition (3) of $M_c^G$, and the final three rows report $M_c$ and its two adjacent period components. Columns correspond to time periods.

Table 2
Global and contemporaneous Malmquist productivity indexes

             1977–1982  1982–1987  1987–1992  Cumulative productivity  1977–1992
M_c^G          0.685      1.064      1.039            0.757              0.757
EC_c           1.163      1.089      0.929            1.176              1.176
BPC_c          0.589      0.977      1.118            0.644              0.644
M_c            0.431      0.895      1.039            0.400              0.592
M_c^t          0.713      0.902      1.053            0.678              1.333
M_c^{t+1}      0.260      0.887      1.024            0.236              0.263

$M_c^G$ shows a large productivity decline from 1977 to 1982, followed by weak productivity growth. Cumulative productivity in 1992 was 25% lower than in 1977. $M_c^G$ calculated using 1992 and 1977 data generates the same value, verifying that it is circular.

The efficiency change component $EC_c$ of $M_c^G$ (and $M_c$) is also circular, and cumulates to an 18% improvement. Best practice change, $BPC_c$, is also circular, and declined by 35%. Capital investment in pollution abatement equipment generated cleaner air but not more electricity. Consequently catching up with deteriorating best practice was relatively easy.

Turning to the contemporaneous index $M_c$ reported in the final three rows, the story is not so clear. Cumulative productivity in 1992 was 60% lower than in 1977. However, calculating $M_c$ using 1992 and 1977 data generates a smaller 40% decline, verifying that $M_c$ is not circular. Neither figure is close to the 25% decline reported by $M_c^G$, verifying that technical change was not HON, but (pollution abatement) capital-using. The lack of circularity is reflected in the frequently large differences between $M_c^t$ and $M_c^{t+1}$, which give conflicting signals when computed using 1992 and 1977 data, with $M_c^t$ signaling productivity growth and $M_c^{t+1}$ signaling productivity decline.

Although not reported in Table 2, we have calculated three-way decompositions of $M_c^G$ and $M_c$. All three components of $M_c^G$ are circular, and LP infeasibility does not occur. In contrast, the technical change and scale components of $M_c$ are not circular, and infeasibility occurs for 13 observations.

The circular global index $M_c^G$ tells a single story about productivity change, and its decomposition is intuitively appealing in light of what we know about the industry during the period. Lacking circularity, $M_c$ and its two adjacent period components tell different stories that are often contradictory. The differences between $M_c^G$ and $M_c$ are a consequence of the capital-using bias of technical change, which was regressive due to the mandated installation of pollution abatement equipment, augmented perhaps by the rate base padding that was prevalent during the period.

5. Conclusions

The contemporaneous Malmquist productivity index is not circular, its adjacent period components can give conflicting signals, and it is susceptible to LP infeasibility. The global Malmquist productivity index and each of its components is circular, it provides single measures of productivity change and its components, and it is immune to LP infeasibility. The global index decomposes into the same sources of productivity change as the contemporaneous index does. A sufficient condition for equality of the two indexes, and their respective components, is Hicks output neutrality of technical change.

The global index must be recomputed when a new time period is incorporated. Diewert's (1987) assertion that "... economic history has to be rewritten ..." when new data are incorporated is the base period dependency problem revisited. The problem can be serious when using base periods $t = 1$ and $t = T$, but it is likely to be benign when using global base periods $\{1, \dots, T\}$ and $\{1, \dots, T+1\}$. While new data may change the global frontier, the rewriting of history is likely to be quantitative rather than qualitative.

References

Althin, R., 2001. Measurement of productivity changes: two Malmquist index approaches. Journal of Productivity Analysis 16, 107–128.
Berg, S.A., Førsund, F.R., Jansen, E.S., 1992. Malmquist indices of productivity growth during the deregulation of Norwegian banking, 1980–89. Scandinavian Journal of Economics 94, 211–228 (Supplement).
Caves, D.W., Christensen, L.R., Diewert, W.E., 1982. The economic theory of index numbers and the measurement of input, output, and productivity. Econometrica 50, 1393–1414.
Diewert, W.E., 1987. Index numbers. In: Eatwell, J., Milgate, M., Newman, P. (Eds.), The New Palgrave: A Dictionary of Economics, vol. 2. The Macmillan Press, New York.
Färe, R., Grosskopf, S., 1996. Intertemporal Production Frontiers: With Dynamic DEA. Kluwer Academic Publishers, Boston.
Fisher, I., 1922. The Making of Index Numbers. Houghton Mifflin, Boston.
Frisch, R., 1936. Annual survey of general economic theory: the problem of index numbers. Econometrica 4, 1–38.
Ray, S.C., Desli, E., 1997. Productivity growth, technical progress, and efficiency change in industrialized countries: comment. American Economic Review 87, 1033–1039.
Shestalova, V., 2003. Sequential Malmquist indices of productivity growth: an application to OECD industrial activities. Journal of Productivity Analysis 19, 211–226.
Xue, M., Harker, P.T., 2002. Note: ranking DMUs with infeasible super-efficiency in DEA models. Management Science 48, 705–710.
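The circularity claims made for Table 2 can be checked with a few lines of arithmetic. The sketch below is not part of the paper: it simply transcribes the published three-decimal values, chains the period-to-period indexes, and compares the product with the directly computed 1977–1992 value. The dictionary keys and the rounding tolerance of 0.002 are assumptions chosen for illustration.

```python
from math import prod, sqrt

# Period-to-period values transcribed from Table 2
# (1977-1982, 1982-1987, 1987-1992).
chained = {
    "MG":  [0.685, 1.064, 1.039],   # global index M_c^G
    "EC":  [1.163, 1.089, 0.929],   # efficiency change EC_c
    "BPC": [0.589, 0.977, 1.118],   # best practice change BPC_c
    "M":   [0.431, 0.895, 1.039],   # contemporaneous index M_c
    "Mt":  [0.713, 0.902, 1.053],   # adjacent period component M_c^t
    "Mt1": [0.260, 0.887, 1.024],   # adjacent period component M_c^{t+1}
}

# Values computed directly from 1977 and 1992 data (last column of Table 2).
direct = {"MG": 0.757, "EC": 1.176, "BPC": 0.644,
          "M": 0.592, "Mt": 1.333, "Mt1": 0.263}

TOL = 0.002  # assumed allowance for three-decimal rounding in the table

# Cumulative productivity is the product of the chained indexes.
cumulative = {name: prod(values) for name, values in chained.items()}


def is_circular(name: str) -> bool:
    """An index passes if chaining reproduces the direct comparison."""
    return abs(cumulative[name] - direct[name]) < TOL


def decomposition_holds() -> bool:
    """Check M_c^G = EC_c x BPC_c period by period, as in decomposition (3)."""
    return all(
        abs(ec * bpc - mg) < TOL
        for ec, bpc, mg in zip(chained["EC"], chained["BPC"], chained["MG"])
    )


def geometric_mean_holds() -> bool:
    """Check M_c = (M_c^t x M_c^{t+1})^(1/2) period by period."""
    return all(
        abs(sqrt(mt * mt1) - m) < TOL
        for mt, mt1, m in zip(chained["Mt"], chained["Mt1"], chained["M"])
    )


for name in chained:
    print(name, round(cumulative[name], 3), direct[name], is_circular(name))
```

Under these assumptions the three global components pass the check, while the contemporaneous index and both of its adjacent period components fail it, which matches the discussion in Section 4.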
Suggestions for conducting research on a Track
Relevance assessment process (1)
For each topic, NIST selects a subset of the runs submitted by participants, takes the top 100 documents from each selected run, and merges them into a document pool. The documents in the pool are then judged manually. Relevance judgments are binary: relevant or not relevant. Documents that were never judged are assumed to be not relevant.
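The pooling procedure described above can be sketched in a few lines. This is a minimal illustration rather than NIST's actual tooling: the function names, run names, and document identifiers are all hypothetical, and the example uses a pool depth of 2 instead of 100 so that the result is easy to inspect.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int = 100) -> Set[str]:
    """Merge the top-`depth` documents of every selected run for one topic."""
    pool: Set[str] = set()
    for ranked_docs in runs.values():
        pool.update(ranked_docs[:depth])
    return pool


def is_relevant(doc_id: str, judgments: Dict[str, bool]) -> bool:
    """Binary relevance; documents that were never judged count as not relevant."""
    return judgments.get(doc_id, False)


# Hypothetical ranked results from two selected runs for a single topic.
runs = {
    "runA": ["d1", "d2", "d3"],
    "runB": ["d2", "d4", "d5"],
}

pool = build_pool(runs, depth=2)            # d1, d2 from runA; d2, d4 from runB
judgments = {"d1": True, "d2": False, "d4": True}   # hypothetical assessor output

print(sorted(pool))
print(is_relevant("d3", judgments))         # d3 was never pooled, hence never judged
```

Because only pooled documents are judged, a relevant document that no selected run ranks within the pool depth is scored as not relevant.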
Tracks currently included in TREC:
Cross-Language Track
Filtering Track
Genomics Track
HARD Track
Interactive Track
Novelty Track
Question Answering Track
Robust Retrieval Track
Terabyte Track
Video Track
Web Track

Term definitions
Title: the title, usually only a few words and very short.
Description: a one-sentence description, more detailed than the Title and containing all of the Title's words.
Narrative: a detailed account that spells out which documents count as relevant.
Topic example
<num> Number: 351
<title> Falkland petroleum exploration
<desc> Description:
What information is available on petroleum exploration in the South Atlantic near the Falkland Islands?
<narr> Narrative:
Any document discussing petroleum exploration in the South Atlantic near the Falkland Islands is considered relevant. Documents discussing petroleum exploration in continental South America are not relevant.
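A topic in this layout can be split into its fields with a short parser. The sketch below is an assumption-laden illustration, not an official TREC tool: the function name `parse_topic` is hypothetical, and it handles only the four tags shown in the example above.

```python
import re

TOPIC_TEXT = (
    "<num> Number: 351 <title> Falkland petroleum exploration "
    "<desc> Description: What information is available on petroleum "
    "exploration in the South Atlantic near the Falkland Islands? "
    "<narr> Narrative: Any document discussing petroleum exploration in the "
    "South Atlantic near the Falkland Islands is considered relevant. "
    "Documents discussing petroleum exploration in continental South America "
    "are not relevant."
)


def parse_topic(text: str) -> dict:
    """Split one topic into its num, title, desc, and narr fields."""
    # With a capturing group, re.split keeps the tag names in the result:
    # [prefix, tag1, body1, tag2, body2, ...]
    parts = re.split(r"<(num|title|desc|narr)>", text)
    fields = {}
    for tag, body in zip(parts[1::2], parts[2::2]):
        # Drop the redundant label that follows some of the tags.
        body = re.sub(r"^\s*(Number|Description|Narrative):\s*", "", body)
        # Collapse line breaks and repeated spaces.
        fields[tag] = " ".join(body.split())
    return fields


topic = parse_topic(TOPIC_TEXT)
print(topic["num"], "|", topic["title"])
```

The parsed fields also make it easy to confirm the stated property that the Description contains every word of the Title.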
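Sample essay (2024): Research on Establishing Bovine Trophoblast Stem Cell Lines

Research on Establishing Bovine Trophoblast Stem Cell Lines, Part 1

I. Introduction
Bovine trophoblast stem cells (BTSCs) are a cell type with broad potential applications in biomedical research. Because they can differentiate along multiple lineages and proliferate vigorously, they have become an active topic in animal husbandry and medical research in recent years. Research on BTSC line establishment aims to build stable cell lines that serve as tools for studying bovine embryonic development, constructing disease models, and developing cell therapies. This essay describes the method for establishing BTSC lines, the experimental procedure, the analysis of results, and the outlook for future work.

II. Background and significance
As biotechnology advances, stem cell research is finding ever wider use in animal husbandry and medicine. BTSCs, as cells with multilineage differentiation potential, are important for studying bovine embryonic development, building disease models, and developing cell therapies. A stable BTSC line would provide a valuable tool for bovine embryo biotechnology and a point of reference for human stem cell research. BTSC line establishment could also support the livestock industry and improve the economics of cattle farming.

III. Materials and methods
1. Materials
The experiment requires bovine embryos, culture medium, serum, and growth factors. All materials must pass strict quality control to ensure reliable results.
2. Methods
(1) Embryo collection and processing: trophoblast tissue is taken from fresh bovine embryos and dissociated by appropriate mechanical and enzymatic treatment into a single-cell suspension.
(2) Cell culture: the single-cell suspension is seeded into culture dishes and cultured in medium containing growth factors and serum.
(3) Stem cell screening and identification: BTSCs with multilineage differentiation potential are selected and identified by cell morphology, immunofluorescence staining, and molecular biology methods.
(4) Line establishment: the selected BTSCs are passaged to establish a stable cell line.

IV. Experimental procedure and analysis of results
1. Experimental procedure
Trophoblast tissue was first obtained from fresh bovine embryos and processed into a single-cell suspension. The suspension was then seeded into culture dishes and cultured in medium containing growth factors and serum. During culture, BTSCs with multilineage differentiation potential were selected by observing cell morphology and by immunofluorescence staining. Finally, the selected BTSCs were passaged to establish a stable cell line.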
Standard template for paper references

Standard reference template

I. Reference formats
1) Journal articles
[No.] Principal author(s). Title[J]. Journal name, year, volume(issue): first page-last page.
Example: [1] 袁庆龙, 候文义. Ni-P合金镀层组织形貌及显微硬度研究[J]. 太原理工大学学报, 2001, 32(1): 51-53.
2) Monographs
[No.] Principal author(s). Book title[M]. Place of publication: publisher, year, first page-last page.
Example: [4] 王芸生. 六十年来中国与日本[M]. 北京: 三联书店, 1980, 161-172.
3) Patents
[No.] Patent holder. Patent title[P]. Country of patent: patent number, date of issue.
Example: [7] 姜锡洲. 一种温热外敷药制备方案[P]. 中国专利: 881056078, 1983-08-12.
4) Newspaper articles
[No.] Principal author(s). Title[N]. Newspaper name, date of publication (edition).
Example: [11] 谢希德. 创造学习的思路[N]. 人民日报, 1998-12-25(10).

II. Document type codes
Journal article [J], monograph [M], conference proceedings [C], dissertation [D], patent [P], standard [S], newspaper article [N], report [R], compilation of materials [G], other documents [Z].
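The journal-article pattern from the template, "[No.] authors. title[J]. journal, year, volume(issue): pages", can be turned into a small formatting helper. This is only a sketch: the function name and parameters are hypothetical, and it emits ASCII punctuation followed by a space, which is an assumption rather than a requirement stated in the template.

```python
from typing import List


def format_journal_reference(
    number: int,
    authors: List[str],
    title: str,
    journal: str,
    year: int,
    volume: int,
    issue: int,
    pages: str,
) -> str:
    """Render one journal article in the '[No.] authors. title[J]. ...' pattern."""
    author_part = ", ".join(authors)
    return (
        f"[{number}] {author_part}. {title}[J]. "
        f"{journal}, {year}, {volume}({issue}): {pages}."
    )


# The journal example given in the template, rebuilt from its parts.
reference = format_journal_reference(
    number=1,
    authors=["袁庆龙", "候文义"],
    title="Ni-P合金镀层组织形貌及显微硬度研究",
    journal="太原理工大学学报",
    year=2001,
    volume=32,
    issue=1,
    pages="51-53",
)
print(reference)
```

The other document types would follow the same approach with their own field order and type code, for example [M] for monographs or [P] for patents.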
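Establishment and validation of a digital PCR method for genetically modified maize T25
..., Xiao Xiaolin, Zhang Feiyan, Liang Jingang, Wang Haoqian, Zhang Xudong, Zhang Xiujie
(1. Development Center for Science and Technology, Ministry of Agriculture and Rural Affairs, Beijing 100176, China; 2. Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, China)
Journal of Agricultural Science and Technology (中国农业科技导报), 2020, 22(2): 173-178
doi:10.13304/j.nykjdb.2019.0154
Abstract: Implementing safety management and labeling systems for genetically modified (GM) organisms requires standardized detection methods and GM reference materials; reference materials are what guarantee accurate, reliable and comparable test results. GM maize T25 is a transgenic plant approved for import into China as a raw material for processing. To strengthen the supervision of T25, this study took the T25 matrix reference material as its subject, established a digital PCR method, and selected eight laboratories to assign the value of the material jointly by digital PCR. The results showed that the duplex digital PCR system for T25 and the reference gene had good amplification specificity. Stable and reliable results were obtained with initial template amounts between 100 and 10,000 copies·μL⁻¹. The multi-laboratory collaborative value assignment showed that the duplex digital PCR method was accurate and reliable. The assigned value of the T25 matrix reference material was 1.0012, with a relative uncertainty of 0.0016. The study shows that the established duplex digital PCR method can be used for the quantitative detection of GM maize T25, and it provides a material basis and technical support for the quantitative detection of GM components.
Keywords: genetically modified maize; T25; matrix reference material; droplet digital PCR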
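Digital PCR quantification usually rests on a Poisson correction of the fraction of positive partitions. The abstract above does not give its calculation, so the following is only a minimal sketch of that standard correction, not the authors' procedure. The function names and droplet counts are hypothetical, and the event-to-reference ratio is computed directly from the two Poisson means so that no droplet volume has to be assumed.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Mean number of target copies per droplet (Poisson correction).

    If a fraction p of droplets is positive, the mean copies per droplet
    is lambda = -ln(1 - p).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def copy_ratio(event_pos: int, event_total: int,
               ref_pos: int, ref_total: int) -> float:
    """Ratio of event-specific copies to reference-gene copies."""
    ref = copies_per_droplet(ref_pos, ref_total)
    if ref == 0:
        raise ValueError("reference assay has no positive droplets")
    return copies_per_droplet(event_pos, event_total) / ref


if __name__ == "__main__":
    # Hypothetical droplet counts, for illustration only.
    print(copies_per_droplet(5000, 10000))
    print(copy_ratio(3000, 15000, 3000, 15000))
```

In a duplex assay both targets are read from the same droplets, so the total droplet count would normally be the same for the two channels.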
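Discovery and functional study of immune regulatory cells
In the human immune system, immune regulatory cells are a key class of cells that maintain immune balance and help prevent autoimmune disease and excessive immune responses.
Unlike other immune cells, they do not attack foreign pathogens directly; instead, they preserve immune homeostasis by regulating the strength and direction of immune responses.
The discovery of immune regulatory cells dates back to the early 1970s.
At that time, immunologists identified a population of T cells able to suppress lymphocyte activity.
These cells were named suppressor T cells, and their discovery attracted wide attention.
As research progressed, however, it became clear that although these T cells can suppress immune responses, they are not the only type of immune regulatory cell.
Known immune regulatory cells now include Treg cells, regulatory B cells and regulatory dendritic cells.
Of these, Treg cells were among the first to be identified and are among the most thoroughly studied.
Treg cells as currently defined were first described in 1995 by Shimon Sakaguchi and colleagues.
They found that mice carry a distinct subset of CD4+ T cells, marked by CD25, that suppresses the responses of other T cells and prevents autoimmunity.
These cells express high levels of marker proteins such as CD25, CTLA-4 and FOXP3 and are regarded as a distinct T cell subset.
In 2003, several groups showed that the transcription factor FOXP3 controls the development and function of these cells, which are now called regulatory T cells (Treg cells).
The main function of Treg cells is to suppress self-reactivity and excessive immune responses.
They do so through several mechanisms.
First, Treg cells inhibit the activity of other T cells, preventing them from releasing cytokines and becoming activated.
Second, Treg cells interact with other immune cells, such as dendritic cells and B cells, and influence their differentiation and function.
Finally, Treg cells produce inhibitory cytokines, such as IL-10 and TGF-β, that suppress immune responses directly.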
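Mining and functional analysis of candidate seed-shattering genes in Elymus sibiricus based on transcriptome sequencing
Introduction: Elymus sibiricus L. (老芒麦), a wild relative of wheat in the tribe Triticeae, has distinctive genetic resources and stress tolerance and is valuable for studying wheat evolution and stress-tolerance mechanisms.
This study used transcriptome sequencing to identify key genes involved in seed shattering in E. sibiricus, and used bioinformatic analysis and functional validation to examine their role in regulating the process.
Methods: Seed tissue samples were taken at different shattering stages (0, 3 and 7 days) and sequenced on the Illumina HiSeq 2500 platform.
Low-quality reads were first removed and the remaining reads were aligned to a reference genome; Cufflinks was then used to assemble and quantify transcripts.
Differential expression analysis was performed on the expression levels to screen for candidate genes related to seed shattering.
Results: Sequencing and analysis yielded 5,000 unigenes, including 1,000 differentially expressed genes.
From the differentially expressed genes, the 10 candidates most strongly associated with seed shattering were selected.
Further bioinformatic analysis showed that these genes are involved in several functional pathways, including plant development, hormone signal transduction, photosynthesis and carbon metabolism.
To validate the transcriptome results, two of these genes were chosen for functional analysis.
RNA interference and gene expression analysis showed that both genes significantly regulate seed shattering in E. sibiricus.
In addition, a yeast two-hybrid assay confirmed that one of the genes binds a cryptochrome.
Discussion: By applying transcriptome sequencing, we identified key genes involved in seed shattering in E. sibiricus and analyzed their functions.
The results show that these genes take part in several important physiological and metabolic pathways; combining the transcriptome data with the functional validation gives a better understanding of how seed shattering is regulated in E. sibiricus.
The results also provide candidate gene resources for further study of the genetic evolution and stress tolerance of wheat.
Conclusion: Using transcriptome sequencing, this study identified key genes involved in seed shattering in E. sibiricus and verified their role in regulating the process.
These genes take part in several physiological and metabolic pathways and are relevant to understanding the genetic evolution and stress tolerance of E. sibiricus.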
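Sample essay (2024): "Screening of functional genes related to heat-stress resistance in Beijing-You chickens and Guangming White chickens based on the spleen transcriptome"
Part 1
I. Introduction
In poultry farming, heat stress is a common environmental stressor that impairs physiological function and reduces production performance.
Research on the mechanisms of heat-stress resistance in chickens and the functional genes involved has scientific value and practical promise for improving production performance and reducing disease.
Using spleen transcriptome data from Beijing-You and Guangming White chickens, this essay screens and discusses functional genes related to heat-stress resistance in the two breeds.
II. Materials and methods
1. Experimental animals: Beijing-You and Guangming White chickens were used as the study subjects.
Light, temperature and humidity were held constant in the rearing environment.
2. Sample preparation: Spleen samples were collected from Beijing-You and Guangming White chickens kept under the same environmental conditions, before and after heat stress, and transcriptome samples were then prepared.
3. Transcriptome sequencing and data analysis: The samples were sequenced with next-generation sequencing, and the data were processed and analyzed to identify differentially expressed genes.
4. Bioinformatic analysis: Bioinformatics software was used for functional annotation, enrichment analysis and interaction network construction for the differentially expressed genes.
III. Results
1. Transcriptome data: Transcriptome sequencing and analysis yielded the gene expression profiles of Beijing-You and Guangming White chickens under heat stress.
Comparison showed significant differences in gene expression between the two breeds.
2. Screening of differentially expressed genes: Comparative analysis identified a set of genes that were differentially expressed under heat stress.
These genes are mainly involved in immune response, energy metabolism and antioxidant activity.
3. Validation of functional genes: Some of the differentially expressed genes were validated by real-time quantitative PCR and related methods, which confirmed that their expression differs significantly between the two breeds.
4. Identification of functional genes related to heat-stress resistance: Bioinformatic analysis and functional validation identified several functional genes related to heat-stress resistance that may take part in the chickens' response to heat stress.
IV. Discussion
Analysis of the spleen transcriptome data showed that Beijing-You and Guangming White chickens differ significantly in heat-stress resistance.
The differences lie mainly in functional genes related to immune response, energy metabolism and antioxidant activity.
Differential expression of these genes may be an important reason why the two breeds differ in their ability to resist heat stress.
To resist heat stress, chickens must adjust their own physiological functions to cope with the environmental pressure.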
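The Marchantia genome sheds light on the evolution of land plants
[... diverged] before the mosses and hornworts, which suggests that the earliest colonizers of land carried liverwort characteristics. Liverworts may therefore retain more ancestral characters. Liverworts are monophyletic, with a dominant thalloid gametophyte and a reduced sporophyte; they have membrane-bound oil bodies and ventral unicellular rhizoids, and they lack stomata. Liverworts show low rates of chromosomal evolution, and the Marchantiopsida show lower rates of molecular evolution than other liverworts. Dioecious liverworts have sex chromosomes, and dioecy may be ancestral for the group.
The nuclear genome encodes 769 tRNA genes (51 of them pseudogenes) and 301 snRNA genes. Of the 265 Marchantia MIR genes, 264 miRNA precursors (pre-miRNAs) were found in a variety of genomic contexts, including within protein-coding genes. Forty-two pre-miRNAs contain an ORF. The expression patterns of pre-miRNAs are only weakly associated with those of their targets, and homologs of DCL1 (required for miRNA processing) were found in Marchantia and in some charophytes.
3. Repetitive DNA
Repetitive sequences make up 22% of the M. polymorpha autosomal genome, which is lower than in the moss P. patens (48%) but higher than in a hornwort (6.98%). As in angiosperms, long terminal repeat (LTR) retrotransposons, including 264 full-length elements, account for the largest share of the repetitive sequence (9.7%). Repeats specific to the X and Y chromosomes have been reported previously, but no additional sex-associated sequences were identified.
Marchantia polymorpha is a liverwort with a complex thalloid gametophyte. Its fossil record dates back to the Permo-Triassic, and the complex thallus is thought to be an adaptation to dry environments. Because it is easy to grow and to manipulate genetically in the laboratory, and because of the extensive historical literature, we chose M. polymorpha as a representative liverwort and analyzed its genome.
[Results]
[Structural genomics and annotation] The nuclear [...] of M. polymorpha subspecies ruderalis was determined by whole-genome shotgun sequencing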
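Sample essay (2024): "Screening of functional genes related to heat-stress resistance in Beijing-You chickens and Guangming White chickens based on the spleen transcriptome"
Part 1
I. Introduction
With the development of modern animal farming, heat stress has become one of the common environmental stressors in poultry production.
Beijing-You and Guangming White chickens are distinctive local Chinese breeds with particular requirements for environmental adaptation.
Studying the genes related to heat-stress resistance in the two breeds is important for improving stress tolerance in poultry and for keeping the chicken industry stable.
The spleen is an important immune organ and plays a key role in the stress response.
This essay aims to use spleen transcriptomics to screen functional genes related to heat-stress resistance in Beijing-You and Guangming White chickens, providing a theoretical basis for breeding stress-tolerant poultry.
II. Materials and methods
1. Experimental animals: Healthy Beijing-You and Guangming White chickens of the same age were assigned to a heat-stress treatment (for example, a high-temperature, high-humidity environment) or a control treatment (suitable temperature and humidity).
2. Sample collection: Spleen tissue was sampled from the birds after treatment, taking care that the samples were free of contamination and inflammation.
RNA was extracted from the spleen tissue.
3. Transcriptome sequencing and analysis: The extracted RNA was sequenced by RNA-Seq, and differences in gene expression between the two breeds under heat stress were analyzed.
Functional genes related to heat-stress resistance were screened by bioinformatic analysis.
III. Results and analysis
1. Transcriptome sequencing results: Sequencing the spleen transcriptomes of Beijing-You and Guangming White chickens produced a large set of gene expression data.
Comparing gene expression in the two breeds under heat stress revealed a set of genes related to heat-stress resistance.
2. Screening of functional genes: Functional genes related to heat-stress resistance were selected on the basis of expression differences, biological function and published reports.
These genes are mainly involved in immune regulation, energy metabolism and antioxidant activity.
3. Gene validation and functional study: The selected genes were validated by PCR and qRT-PCR, and their mechanisms of action in heat-stress resistance were studied further.
The results indicate that these genes play an important role in heat-stress resistance in both breeds.
IV. Discussion
1. Differential gene expression: Analysis of the spleen transcriptome data showed that gene expression under heat stress differs between Beijing-You and Guangming White chickens.
These differences may be related to the genetic background and environmental adaptability of the two breeds.
2. Biological significance of the functional genes: The selected genes are mainly involved in immune regulation, energy metabolism and antioxidant activity, and their differential expression may explain why the two breeds respond differently to heat stress.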
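Caenorhabditis elegans
The two sexes have different numbers of cells. The adult hermaphrodite has 959 somatic cells and about 2,000 germ cells, whereas the adult male has 1,031 somatic cells and 1,000 germ cells.
5. C. elegans has two sexes: hermaphrodite and male. Hermaphrodites can reproduce by self-fertilization or by mating with males.
Of the progeny from mating with a male, 50% are hermaphrodites and 50% are males. Self-fertilization yields mostly hermaphrodites, with males arising spontaneously at a very low frequency. An unmated hermaphrodite produces about 300 progeny over its reproductive period; one that mates with males can produce up to 1,000.
John E. Sulston
H. Robert Horvitz
In 2002, Brenner, Horvitz and Sulston were awarded the Nobel Prize in Physiology or Medicine for revealing the genetic basis of organ development and the genetic regulation of programmed cell death.
Sydney Brenner
Andrew Z. Fire
Craig C. Mello
In 2006, Andrew Fire and Craig Mello were awarded the Nobel Prize in Physiology or Medicine for their discovery of RNA interference.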
age-period-cohort (PPT slides)
There exist relationships in UMLS between the semantic type of a related
term and the semantic type of each query concept
sema paragraph retrieval - avoid incorrect match of abbreviations
Stage 2: paragraph retrieval - concept retrieval (IR model)
According to our model (Liu, 2004; UIC Robust
track, 2005) , we have: sim (q ,d2 )sim (q ,d 1 )
Stage 3: passage extraction and ranking - extraction
The criterion for the optimal passage in a paragraph is given by:
“Given various windows of different sizes, choose the one which has the maximum number of query concepts and the smallest size.”
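The window criterion quoted above can be sketched as a brute-force search over candidate windows. This is only an illustrative reading of the criterion, not the slide authors' implementation: it assumes each query concept is a single token (real concepts are often multi-word and matched through UMLS), and the function name, tie-breaking rule and example sentence are hypothetical.

```python
def best_window(tokens, concepts):
    """Return (start, end) of the best window, with end exclusive.

    The best window covers the largest number of distinct query
    concepts; ties go to the shortest window, then to the earliest one.
    Only windows that start and end on a concept token are considered,
    since any other window could be shrunk without losing a concept.
    """
    concepts = set(concepts)
    best_key = None
    best_span = (0, 0)  # returned unchanged if no concept occurs
    for start in range(len(tokens)):
        if tokens[start] not in concepts:
            continue
        seen = set()
        for end in range(start, len(tokens)):
            if tokens[end] not in concepts:
                continue
            seen.add(tokens[end])
            # More concepts first, then smaller size, then earlier start.
            key = (len(seen), -(end - start + 1), -start)
            if best_key is None or key > best_key:
                best_key, best_span = key, (start, end + 1)
    return best_span


if __name__ == "__main__":
    text = "the brca1 gene interacts with rad51 in dna repair and brca1 binds rad51"
    tokens = text.split()
    start, end = best_window(tokens, {"brca1", "rad51"})
    print(tokens[start:end])
```

The search is quadratic in the paragraph length, which is acceptable for paragraph-sized inputs; a sliding-window variant would be the natural choice for longer text.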
Stage 2: Paragraph retrieval
- retrieve 2,000 most relevant paragraphs
Stage 3: Passage extraction and ranking
- extract and retrieve 1,000 most relevant passages
Hunan Agricultural University graduation thesis
Hunan Agricultural University, graduation thesis (design) for full-time undergraduate students
水稻陆两优996的临界氮稀释曲线测定与分析
Determination and analysis of the critical nitrogen dilution curve of the rice cultivar Luliangyou 996
Student: Yi Qile (衣启乐)
Student ID: 201042144125
Year, major and class: 2010, Ecology, Class 1
Advisor and title: Liu Xianghua (刘向华), Associate Professor
College: College of Bioscience and Biotechnology
Changsha, Hunan
Submitted: May 2014

Contents
Abstract
Keywords
1 Introduction
1.1 Background
1.2 Research progress on the effect of nitrogen level on crop biomass accumulation
1.3 Research progress on crop critical nitrogen concentration and dilution curves
1.4 Purpose and significance of this study
2 Materials and methods
2.1 Experimental materials
2.2 Experimental design
2.3 Measurements
2.3.1 Dry matter determination
2.3.2 Nitrogen determination
2.3.3 Yield determination at maturity
3 Results and analysis
3.1 Data processing
3.2 Changes in above-ground dry weight of the rice after transplanting
3.3 Rice yield under different nitrogen treatments
3.4 Calculation of critical nitrogen
4 Discussion and conclusions
4.1 Effect of nitrogen level on rice nitrogen concentration, and the critical nitrogen concentration model
4.1.1 Analysis of the rice critical nitrogen concentration model
4.1.2 Application of the critical nitrogen accumulation model to nitrogen management
4.2 Conclusions
References
Acknowledgements

Determination and analysis of the critical nitrogen dilution curve of the rice cultivar Luliangyou 996
Student: Yi Qile. Advisor: Liu Xianghua
(College of Bioscience and Biotechnology, Hunan Agricultural University, Changsha 410128)
Abstract: The experiment used the rice cultivar Luliangyou 996 at the Hunan Agricultural University experimental base (Changsha). From 23 March to 20 July 2013, a critical nitrogen determination trial was carried out in paddy fields with eight nitrogen treatments: N1 (0 kg/hm2), N2 (60 kg/hm2), N3 (95 kg/hm2), N4 (130 kg/hm2), N5 (165 kg/hm2), N6 (200 kg/hm2), N7 (235 kg/hm2) and N8 (270 kg/hm2).
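The excerpt above does not state the model the thesis fits. A critical nitrogen dilution curve is commonly written as the power law Nc = a * W^(-b), where W is above-ground dry matter and Nc is the critical nitrogen concentration, and it can be fitted by linear regression on log-transformed values. The sketch below assumes that form; the function name, the units and the coefficients used to generate the check data are hypothetical and are not results from the thesis.

```python
import math


def fit_dilution_curve(biomass, n_conc):
    """Fit Nc = a * W**(-b) by least squares on ln(Nc) versus ln(W).

    biomass: above-ground dry matter values (assumed t/hm2, all > 0)
    n_conc:  critical nitrogen concentrations (assumed % of dry matter)
    Returns the tuple (a, b).
    """
    if len(biomass) != len(n_conc) or len(biomass) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log(w) for w in biomass]
    ys = [math.log(n) for n in n_conc]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("biomass values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


if __name__ == "__main__":
    # Synthetic points generated from arbitrary coefficients, used only
    # to check that the fit recovers them; they are not measured data.
    true_a, true_b = 3.5, 0.3
    w = [1.0, 2.0, 4.0, 8.0]
    nc = [true_a * x ** (-true_b) for x in w]
    print(fit_dilution_curve(w, nc))
```

With field data, the points passed to the fit would be the critical points identified at each sampling date rather than every treatment mean.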
小立碗藓WRKY基因家族生物信息学分析
㊀Guihaia㊀Feb.2022ꎬ42(2):267-276http://www.guihaia-journal.comDOI:10.11931/guihaia.gxzw202007039乔刚ꎬ李莉ꎬ姜山.小立碗藓WRKY基因家族生物信息学分析[J].广西植物ꎬ2022ꎬ42(2):267-276.QIAOGꎬLILꎬJIANGS.BioinformaticsanalysisofWRKYgenefamilyinPhyscomitrellapatens[J].Guihaiaꎬ2022ꎬ42(2):267-276.小立碗藓WRKY基因家族生物信息学分析乔㊀刚1ꎬ李㊀莉2ꎬ姜㊀山2∗(1.贵州师范大学生命科学学院ꎬ贵阳550001ꎻ2.贵州师范大学国际教育学院ꎬ贵阳550001)摘㊀要:WRKY作为最先在植物中发现的转录因子ꎬ在植物生长发育等过程中发挥重要作用ꎮ为了更好地研究小立碗藓WRKY蛋白的结构与功能ꎬ该文以Pfam数据库中WRKY基因家族数据(登录号为PF03106)为材料ꎬ分析了小立碗藓(Physcomitrellapatens)WRKY基因家族成员的理化性质㊁蛋白质的二级结构预测㊁染色体定位㊁内外显子分布及系统进化关系ꎮ结果表明:(1)小立碗藓WRKY基因家族成员共有38个基因ꎬ根据WRKY保守结构域个数和锌指结构类型分成Ⅰ㊁Ⅱ两大类ꎬ不含第Ⅲ类(锌指结构为C2HC型)ꎬ其中部分基因WRKY保守结构域发生变异ꎮ(2)WRKY蛋白氨基酸长度在216~775aa之间㊁相对分子质量在24.5~82.8kDa之间ꎬ亚细胞定位显示WRKY家族成员蛋白质定位于细胞核中ꎮ(3)WRKY蛋白的二级结构以α ̄螺旋㊁延伸链㊁β ̄转角㊁无规卷曲四种构成元件构成ꎬ除PpWRKY11(α ̄螺旋为主)外ꎬ其余无规卷曲占比高达70%ꎮ(4)与拟南芥的系统进化关系表明ꎬ植物在进化过程中WRKY家族成员的数目与进化方式发生改变ꎬWRKY基因家族成员外显子的个数为3~7个ꎮ(5)小立碗藓WRKY基因家族成员无规则分散于21条染色体上ꎬ并未形成基因簇ꎮ该研究通过分析WRKY基因家族的基本结构与性质ꎬ能为后续深入研究WRKY转录因子的功能奠定基础ꎮ关键词:小立碗藓ꎬWRKY转录因子ꎬ基因家族分析ꎬ生物信息中图分类号:Q943㊀㊀文献标识码:A㊀㊀文章编号:1000 ̄3142(2022)02 ̄0267 ̄10BioinformaticsanalysisofWRKYgenefamilyinPhyscomitrellapatensQIAOGang1ꎬLILi2ꎬJIANGShan2∗(1.SchoolofLifeSciencesꎬGuizhouNormalUniversityꎬGuiyang550001ꎬChinaꎻ2.SchoolofInternationalEducationꎬGuizhouNormalUniversityꎬGuiyang550001ꎬChina)Abstract:AsatranscriptionfactorfirstfoundinplantsꎬWRKYplaysanimportantroleinplantgrowthanddevelopment.HoweverꎬithasnotbeenstudiedinPhyscomitrellapatens.ByusingtheWRKYgenefamilydata(accessionnumberisPF03106)inPfamdatabaseꎬthispaperstudiedthebasicinformationofWRKYinP.patensꎬwhichincludedphysicochemicalpropertiesꎬproteinsecondarystructurepredictionꎬchromosomelocalizationꎬexonandintrondistributionandphylogeneticrelationship.Theresultswereasfollows:(1)TheWRKYgenefamilyinP.patensconsistedof38memberswhichweredividedintotwomajorcategories─ⅠandⅡꎬandtheconserveddomainsofsomeWRKYgenes收稿日期:2020-11-28基金项目:国家自然科学基金(31260426ꎬ31560508)[SupportedbyNationalNaturalScienceFoundationofChina(31260426ꎬ31560508)]ꎮ第一作者:乔刚(1992-)ꎬ硕士研究生ꎬ主要从事植物分子生物学研究ꎬ(E 
̄mail)2335603452@qq.comꎮ∗通信作者:姜山ꎬ博士ꎬ教授ꎬ研究方向为植物病理学ꎬ(E ̄mail)kyosan200312@hotmail.comꎮhadmutated.(2)TheaminoacidlengthofWRKYproteinwas216-775aaandtherelativemolecularmasswas24.5-82.8kDaꎻSubcellularlocationshowedthattheproteinsofWRKYgenefamilyweredistributedinthenucleus.(3)ThesecondarystructureofWRKYproteinwascomposedoffourconstituentelements:α ̄helixꎬextendedstrandꎬβ ̄turnꎬandrandomcoilꎻExceptforPpWRKY11(α ̄helixdominated)ꎬtheproportionofrandomcoilwasupto70%.(4)ThephylogeneticrelationshipwithArabidopsisshowedthatthenumberoftheWRKYfamilymembersandthewayoftheirrevolutionhadchangedintheprocessofplants evolutionꎬthenumberofexonsofWRKYgenefamilymemberswere3-7.(5)ThemembersofWRKYgenewererandomlydispersedon21chromosomeswithoutformingagenecluster.ThisstudyanalyzedthebasicstructureandpropertiesoftheWRKYgenefamilyꎬanditwasfoundthattheWRKYgenefamilyofPhyscomitrellapatenshasevolutionarydiversityanduniquevariabilityinconserveddomainsꎬlayingthefoundationforsubsequentresearch.Keywords:PhyscomitrellapatensꎬWRKYtranscriptionfactorꎬgenefamilyanalysisꎬbioinformatics㊀㊀目前ꎬ在植物中发现60多个转录因子家族(Kaplan ̄Levyetal.ꎬ2012ꎻQinetal.ꎬ2014)ꎬ其中研究较多的有MYB㊁NAC㊁WRKY㊁SBP㊁GARS等ꎮWRKY(transcriptionfactor)作为最先在植物中发现的转录因子(Wangetal.ꎬ2015)ꎬ它由N ̄端的WRKY保守结构域(WRKYGQK)和C ̄端的锌指结构组成(Wangetal.ꎬ2018)ꎮ根据保守结构域的个数和锌指结构的类型ꎬ可将WRKY转录因子分成三类:Ⅰ类ꎬ含有2个WRKY结构域㊁锌指结构为C2H2型ꎻⅡ类ꎬ含有单个结构域ꎬ锌指结构为C2H2(CX4 ̄5 ̄C ̄X22 ̄23 ̄H ̄X1 ̄H)ꎬ由此将其分为五个亚型(Ⅱa㊁Ⅱb㊁Ⅱc㊁Ⅱd㊁Ⅱe)ꎻⅢ类ꎬ单个保守结构域㊁锌指为C2HC(C ̄X7 ̄C ̄X23 ̄HX)型(Songetal.ꎬ2015ꎻJiaetal.ꎬ2018)ꎮ但是ꎬ在低等植物中不含有第Ⅲ类类型的WRKY转录因子(苏琦等ꎬ2007)ꎮ小立碗藓(Physcomitrellapatens)作为高等植物中的低等类群ꎬ它的优势主要体现在以下几点: 
(1)小立碗藓基因组测序结果已知(http://www.cosmoss.org/)(Rensingetal.ꎬ2008)ꎬ大小为511Mbꎬ染色体共有27条(Rensingetal.ꎬ2002)ꎻ(2)同源重组率高ꎻ(3)生长周期短㊁易于扩繁(Xuetal.ꎬ2009)ꎻ(4)细胞结构简单ꎬ表型能直接观察ꎮ因此ꎬ小立碗藓已成为研究植物基因结构与功能的一种理想植物(蓝雨纯等ꎬ2020)ꎮ随着生物技术的发展ꎬWRKY转录因子不断被发现ꎮ目前ꎬ已经鉴定出绿藻中有3个WRKY转录因子(Rinersonetal.ꎬ2015)㊁卷柏19个(Lietal.ꎬ2016)㊁山松83个(Liuetal.ꎬ2009)㊁拟南芥74个(Ulkeretal.ꎬ2004)㊁水稻97个(Xuetal.ꎬ2016)ꎮ同时ꎬWRKY作为转录因子已经被证明调控植物多重生理功能(Eulgemetal.ꎬ2000)ꎮ有研究发现ꎬWRKY转录因子参与植物对干旱㊁高温㊁低温等非生物胁迫的应答反应(Chenetal.ꎬ2012)以及植物对病原菌的抗性(Shametal.ꎬ2017)ꎬ如水稻OsWRKY13参与抵抗水稻稻瘟病菌(Magnaporthegrisea)的感染(Schluttenhofer&Yuanꎬ2014)ꎬAtWRKY8在盐胁迫下表达出现增加ꎬ进而增强植株对盐胁迫的抗性(Chenetal.ꎬ2013)ꎻWRKY转录因子与植物的生长发育(Chenetal.ꎬ2019)和代谢产物生物的合成(Zhangetal.ꎬ2018ꎻLiuetal.ꎬ2019)密切相关ꎬ但在小立碗藓中ꎬWRKY是否参与防御反应及其作用机制却鲜有报道ꎮ随着生物技术㊁计算机技术与大数据的迅猛发展(Pearsonꎬ2001)ꎬ人们对于未知领域的研究也不断加强ꎮ生物信息学技术作为一门新兴的时尚学科ꎬ结合计算机数据收集与分析ꎬ不断探索未知基因的结构与功能ꎬ使得预测结果逐渐具有说服力ꎬ为科学研究做出了很大贡献ꎬ这一技术广泛应用于蛋白分析㊁基因结构预测等众多领域ꎮ因此ꎬ本文通过生物信息学分析研究WRKY基因家族成员的基本信息㊁保守结构域分析㊁染色体定位㊁内外显子分布等信息ꎬ旨在为后续研究小立碗藓WRKY蛋白的结构与功能提供理论基础ꎮ1㊀材料与方法1.1材料Bio ̄linux操作系统ꎻ虚拟机virtualBoxꎻ各种数据库ꎬ即Pfam(http://pfam.xfam.org/)㊁EnsemblPlants(Bolseretal.ꎬ2017)(http://plants.ensembl.org/index.html)㊁phytozome(https://phytozome.jgi.doe.gov/pz/portal.html)ꎻ在线构图工具及可视化软件ꎬ即Adobeillustrator㊁TBtools等ꎮ1.2小立碗藓WRKY基因家族的确定在Pfam数据库下载WRKY蛋白保守结构域862广㊀西㊀植㊀物42卷(∗.hmm)文件ꎬ在EnsemblPlants数据库中下载小立碗藓基因组的cds㊁cdna㊁pep㊁gff3等序列作为备用文件ꎮ在小立碗藓蛋白质数据库检索含有WRKY(PF03106)基因家族结构域的蛋白质序列(李昊阳等ꎬ2014)ꎬ通过E值初筛ꎬ再用SMART网站(http://smart.embl ̄heidelberg.de/)进行手动筛选ꎬ最终结果用于后续研究ꎮ1.3小立碗藓WRKY基因家族成员的理化性质分析与亚细胞定位由ProtParam(https://web.expasy.org/protparam/protparam ̄doc.html)网站预测家族成员的等电点(PI)㊁相对分子质量㊁氨基酸长度㊁蛋白质疏水性等性质ꎻ通过PLOC(http://www.csbio.sjtu.edu.cn/bioinf/Cell ̄PLoc ̄2/)㊁Plant ̄mPLOC(植物细胞定位)预测家族成员在细胞中的位置分布ꎮ1.4小立碗藓WRKY基因家族成员蛋白的二级结构分析将确定的小立碗藓WRKY基因家族蛋白质的氨基酸序列通过在线网站HNNSecondaryStructurePrediction(https://npsa 
̄prabi.ibcp.fr/cgibin/npsa_automat.pl?page=npsa_sopma.html)预测其二级结构各构型的占比ꎮ1.5小立碗藓WRKY基因家族成员系统发育树的构建利用MEGA7.0软件进行进化树构建ꎬ程序选用最大似然法ꎬBootstrap值设定为1000ꎬ其他设定参数以系统默认值为准ꎮ1.6小立碗藓WRKY基因家族成员内外显子及保守结构域分析由MEME在线网(http://gsds.cbi.pku.edu.cn/)ꎬ预测蛋白质的保守结构域的motif模型信息ꎮ通过http://gsds.cbi.pku.edu.cn/网站绘制小立碗藓WRKY基因家族成员的内外显子分布图ꎬ由AI软件合并绘图ꎮ1.7小立碗藓WRKY基因家族成员的染色体定位图将WRKY蛋白所对应的氨基酸序列及成员在小立碗藓基因组中对应的名称文件㊁小立碗藓27条染色体的长度信息ꎻ由http://mg2c.iask.in/mg2c_v2.0/在线网站绘制染色体定位图ꎬ通过TBtools工具进行可视化修饰ꎮ2㊀结果与分析2.1小立碗藓WRKY基因家族成员的确定通过对38个WRKY家族成员的保守结构域序列比对将WRKY家族成员主要分为以下几类:Ⅰ类(3个基因)㊁Ⅱa(5个基因)㊁Ⅱb(6个基因)㊁Ⅱc(5个基因)㊁Ⅱd(13个基因)㊁Ⅱe(6个基因)㊁不含第三类(C2HC型)ꎬ其中部分基因的保守结构域序列发生变异(表1)ꎮ表1㊀小立碗藓WRKY保守结构域的变异Table1㊀VariationofconserveddomainofPhyscomitrellapatensWRKY基因名称GenenameWRKY结构域序列WRKYdomainsequence变异后保守结构域序列Conserveddomainsequenceaftermutation分组GroupsPpWRKY7PpWRKY8PpWRKY14PpWRKY25PpWRKY30PpWRKY32PpWRKY35WRKYGQKWRKYGQKWRKYGQKWRKYGQKWRKYGQKWRKYGQKWRKYGQKWKKYGNKWKKYGNKWKKYGNKWRKYGHKWRKYGQNWKKYGNKWKKYGNKⅡaⅡaⅡaⅡdⅡbⅡaⅡa2.2小立碗藓WRKY基因家族成员理化性质分析及亚细胞定位由表2可知ꎬ整个家族成员的氨基酸序列长度在216~775aa之间ꎬPpWRKY25最小ꎬPpWRKY10最大ꎮWRKY蛋白氨基酸的理论等电点PI值在5.10~9.79之间ꎬ蛋白质的相对分子质量在24.5~82.8kDa之间ꎬ与氨基酸数量成正比ꎮ另外ꎬ通过蛋白质疏水性数值可以得到PpWRKY25㊁PpWRKY30㊁PpWRKY36既具有疏水性又有亲水性ꎬ其数值在-0.5~+0.5之间ꎮ负值越大说明亲水性越强ꎬ其他氨基酸都为亲水氨基酸ꎮPLOC亚细胞定位结果表明小立碗藓WRKY蛋白全部分布在细胞核中ꎮ9622期乔刚等:小立碗藓WRKY基因家族生物信息学分析表2㊀小立碗藓WRKY转录因子理化性质分析Table2㊀AnalysisofphysicochemicalpropertiesofWRKYtranscriptionfactorsinPhyscomitrellapatens基因名称Genename等电点PI分子量Molecularweight(kDa)氨基酸长度Lengthofaminoacid(aa)亚细胞定位Subcellularlocalization蛋白质疏水性Grandaverageofhydropathicity(GRAVY)PpWRKY19.7143.1395Nucleus细胞核-0.693PpWRKY26.4182.0765Nucleus细胞核-0.601PpWRKY38.7975.4705Nucleus细胞核-0.970PpWRKY49.7643.2396Nucleus细胞核-0.655PpWRKY59.1074.2684Nucleus细胞核-1.052PpWRKY66.6281.3761Nucleus细胞核-0.560PpWRKY76.0837.7343Nucleus细胞核-1.047PpWRKY87.6840.3365Nucleus细胞核-0.936PpWRKY96.7160.6558Nucleus细胞核-0.648PpWRKY108.2182.8775Nucleus细胞核-0.751PpWRKY119.7942.8385Nucleus细胞核-0.719PpWRKY125.6655.8507Nucleus细胞核-0.845PpWRKY135.8941.1372Nucleus细胞核-0.991PpWRKY148.3165.2589Nucleus细胞核-0.759PpWRKY159.7543
.2395Nucleus细胞核-0.643PpWRKY165.6971.2642Nucleus细胞核-0.840PpWRKY179.7242.4385Nucleus细胞核-0.829PpWRKY188.5548.7437Nucleus细胞核-0.688PpWRKY198.4578.9749Nucleus细胞核-0.717PpWRKY207.6055.7504Nucleus细胞核-0.820PpWRKY216.0659.3533Nucleus细胞核-0.754PpWRKY225.7656.6511Nucleus细胞核-0.809PpWRKY236.4681.7758Nucleus细胞核-0.813PpWRKY245.2763.0579Nucleus细胞核-0.637PpWRKY257.1524.5216Nucleus细胞核-0.295PpWRKY267.3780.5740Nucleus细胞核-0.694PpWRKY275.1052.8480Nucleus细胞核-0.752PpWRKY285.6262.8576Nucleus细胞核-0.789PpWRKY295.4064.0584Nucleus细胞核-0.806PpWRKY308.0144.3409Nucleus细胞核-0.463PpWRKY315.3357.2525Nucleus细胞核-0.737PpWRKY325.9459.9532Nucleus细胞核-0.955PpWRKY337.7961.2566Nucleus细胞核-0.720PpWRKY345.9963.2588Nucleus细胞核-0.571PpWRKY355.6748.1435Nucleus细胞核-0.571PpWRKY368.7152.7491Nucleus细胞核-0.454PpWRKY378.2379.5748Nucleus细胞核-0.512PpWRKY385.9679.3723Nucleus细胞核-0.8672.3小立碗藓WRKY基因家族成员二级结构预测由表3可知ꎬ小立碗藓WRKY基因家族成员的蛋白质由四种结构元件构成(α ̄螺旋㊁延伸链㊁β ̄转角㊁无规卷曲)ꎬ其中以无规卷曲为主要构成元件ꎬ最高为PpWRKY19ꎬ占比达73.83%ꎬ其次是α ̄螺旋ꎻ而PpWRKY11结果与家族成员不同ꎬ以α ̄螺旋为主ꎬ占比高达42.34%ꎬ无规卷曲为39.74%ꎮ另外两种构成元件占比均较小ꎬ比较稳定ꎮ072广㊀西㊀植㊀物42卷2.4小立碗藓WRKY基因家族成员系统进化关系分析从图1小立碗藓WRKY基因家族成员的进化树分析可以看出ꎬWRKY基因家族成员在进化的过程中已经发生了改变ꎬ38个基因分布在不同的五个分支上(分别含有Ⅰ型基因3个㊁Ⅱa型基因5个㊁Ⅱb型基因6个㊁Ⅱc型基因5个㊁Ⅱd型基因13个㊁Ⅱe型6个)ꎮ从进化树还可以看出ꎬ小立碗藓与拟南芥在进化上具有一定的差异性ꎬ其中的Ⅱa㊁Ⅱd并未聚集在同一分支ꎬ说明植物由低等向高等进化的过程中基因的进化方式发生了改变ꎬ出现了分化ꎮ2.5小立碗藓WRKY基因家族成员内外显子分布及结构域分析小立碗藓WRKY家族成员主要分布在几个不同的分支ꎬ在进化上既具有差异性又具有相似性ꎮ由图2:B可知ꎬ小立碗藓WRKY基因家族成员内外显子分布图ꎬ外显子的个数为3~7个ꎮ其中ꎬⅠ型有4~6个外显子㊁Ⅱa与Ⅱe型外显子4个㊁Ⅱb型含外显子3~7个㊁Ⅱc型外显子3个㊁大多数Ⅱd型含有外显子3个ꎮ内含子的长度可以通过图下比例尺推断得到ꎮ由图2:C可知ꎬ不同WRKY基因所对应的蛋白结构域共有10个motifꎬ不同基因对应的motif也有所不同ꎮ由表4可知ꎬ每个motif的宽度㊁E 
̄value值㊁对应motif的Logo图ꎻ氨基酸宽度最大为50ꎬ最低仅有21ꎮ这38个家族成员中约有50%具有两个保守结构域ꎬ最多含有四个ꎻ而Ⅱa(PpWRKY7㊁PpWRKY8㊁PpWRKY14㊁PpWRKY35)都只含有一个motif5的保守结构域ꎬ并未形成稳定的motif1-motif3的稳定结构ꎮ2.6小立碗藓WRKY基因家族成员的染色体分布由图3可知ꎬ通过分析最终将38个WRKY基因定位在小立碗藓27条染色体中的21条ꎬⅠ型(PpWRKY3㊁PpWRKY5㊁PpWRKY19)分布于第8㊁第3㊁第2号染色体ꎻⅡa型(PpWRKY35㊁PpWRKY31㊁PpWRKY8㊁PpWRKY7㊁PpWRKY14)分布于第14㊁第4㊁第27㊁第6㊁第5号染色体ꎻⅡb型(PpWRKY36㊁PpWRKY30㊁PpWRKY2㊁PpWRKY6㊁PpWRKY37㊁PpWRKY26)分布于第11㊁第7㊁第13㊁第3㊁第12㊁第4号染色体ꎻⅡc型(PpWRKY34㊁PpWRKY9㊁PpWRKY33㊁PpWRKY24㊁PpWRKY28)分布于第20㊁第24㊁第23㊁第3㊁第13号染色体ꎻⅡd型(PpWRKY29㊁PpWRKY22㊁PpWRKY25㊁PpWRKY38㊁PpWRKY16㊁PpWRKY10㊁PpWRKY23㊁PpWRKY21㊁表3㊀小立碗藓WRKY蛋白质二级结构分析Table3㊀SecondarystructureanalysisofWRKYproteininPhyscomitrellapatens基因名称Genename二级结构Secondarystructure(%)α ̄螺旋α ̄helix延伸链Extendedstrandβ ̄转角β ̄turn无规卷曲RandomcoilPpWRKY126.3312.916.0854.68PpWRKY232.0313.592.6151.76PpWRKY314.619.503.9771.91PpWRKY421.9715.916.0656.06PpWRKY516.968.634.2470.18PpWRKY629.5713.533.5553.55PpWRKY715.7413.706.7163.85PpWRKY822.7415.347.6754.25PpWRKY924.378.963.4163.26PpWRKY1019.359.816.9763.87PpWRKY1142.3411.346.4939.74PpWRKY1231.768.886.9052.47PpWRKY1333.3910.195.4950.94PpWRKY1426.3210.533.9059.25PpWRKY1525.0614.948.3551.65PpWRKY1628.1912.156.7052.96PpWRKY1725.4512.474.9451.74PpWRKY1824.4910.766.1858.58PpWRKY1912.689.354.1473.83PpWRKY2033.738.134.7653.37PpWRKY2127.7710.515.4456.29PpWRKY2219.379.596.6564.38PpWRKY2318.078.055.4168.47PpWRKY2417.1012.617.7762.52PpWRKY2543.067.411.8547.69PpWRKY2627.4312.702.9756.98PpWRKY2732.298.965.4253.33PpWRKY2821.1811.814.6962.33PpWRKY2927.109.427.5355.14PpWRKY3027.6315.656.6050.12PpWRKY3120.009.333.2467.43PpWRKY3229.8911.284.5154.32PpWRKY3323.858.834.4262.90PpWRKY3427.559.014.4259.01PpWRKY3523.919.895.0661.15PpWRKY3632.5913.656.3147.45PpWRKY3726.6012.703.0757.62PpWRKY3827.256.645.6760.441722期乔刚等:小立碗藓WRKY基因家族生物信息学分析图1㊀小立碗藓与拟南芥WRKY系统发育树Fig.1㊀PhylogenetictreesofWRKYinPhyscomitrellapatensandArabidopsisPpWRKY27㊁PpWRKY13㊁PpWRKY18㊁PpWRKY20㊁PpWRKY12)分布于第2㊁第11㊁第11㊁第26㊁第3㊁第8㊁第3㊁第19㊁第22㊁第21㊁第2㊁第7㊁第11号染色体ꎻⅡe(PpWRKY15㊁PpWRKY4㊁PpWRKY17㊁PpWRKY1㊁PpWR
KY11㊁PpWRKY32)分布于第2㊁第1㊁第17㊁第14㊁第7㊁第2号染色体ꎮ同一类型Ⅱd中PpWRKY29与PpWRKY18㊁PpWRKY22与PpWRKY25分别分布同一染色体上ꎻⅡe中PpWRKY15与PpWRKY32分布于同一染色体上ꎬ其他基因在染色体上呈现出明显的非均匀分布ꎬ所有基因在染色体上并未形成基因簇(Baietal.ꎬ2002)ꎮ3㊀讨论与结论本研究通过生物信息学分析鉴定小立碗藓WRKY基因家族共包含38个基因ꎬ与拟南芥比较可以发现ꎬ在植物从早期水生到陆生㊁低等到高等进化过程中ꎬWRKY基因是不断扩张丰富的ꎬ也意味着该家族有新功能的引入ꎮ根据WRKY结构域272广㊀西㊀植㊀物42卷图2㊀小立碗藓WRKY基因进化关系(A)和内外显子分布图(B)以及小立碗藓WRKY蛋白结构域的motif模型(C)Fig.2㊀Evolutionaryrelationship(A)anddistributionofexonsandintrons(B)ofPhyscomitrellapatensWRKYgenefamilyꎬandmotifmodeloftheWRKYproteindomainofP.patens(C)和锌指结构将小立碗藓WRKY家族分成两大类(Ⅰ型3个㊁Ⅱa型5个㊁Ⅱb型6个㊁Ⅱc型5个㊁Ⅱd型13个㊁Ⅱe型6个ꎬ不含Ⅲ型)ꎻ而在高等模式植物拟南芥中含Ⅲ型基因14个(Ulkeretal.ꎬ2004)㊁籼稻含有47个(Rossetal.ꎬ2007)ꎬ且在高等植物中几乎所有的Ⅲ类WRKY因子都参与生物胁迫反应ꎬ而低等植物不含第Ⅲ类WRKY因子(苏琦等ꎬ2007)ꎮ本实验室前期转录组数据显示ꎬ小立碗藓在接种灰霉菌的过程中PpWRKY10出现表达量的上调ꎬ说明WRKY转录因子参与了植物对灰霉菌的防御反应(实验数据未发表)ꎬ但关于具体的防御机制仍有待深入研究ꎮ进化分析表明Ⅰ型被认为是Ⅱ型与Ⅲ型的原始祖先ꎬⅡ型与Ⅲ型是通过Ⅰ型C末端或N末端WRKY结构域的变化或缺失演变而来(Zhangetal.ꎬ2005)ꎬ说明植物由低等向高等进化的过程中对环境的适应能力不同最终可能导致进化的类型不同ꎬ高等植物WRKY家族中的Ⅲ型可能由Ⅰ型在应对环境压力过程中产生ꎬ表明WRKY基因家族在进化过程中具有多样性ꎬ基因功能不断进化ꎮ3722期乔刚等:小立碗藓WRKY基因家族生物信息学分析表4㊀WRKY转录因子的预测motif列表Table4㊀PredictivemotiflistofWRKYtranscriptionfactors基序名Motifname宽度Width(aa)E ̄valueLogomotif1401.9e-1485motif2292.0e-1224motif3291.4e-1039motif4295.2e-969motif5502.0e-660motif6504.4e-592motif7413.3e-561motif8421.2e-349motif9294.8e-368motif10214.1e-286㊀㊀通过结构分析发现ꎬ小立碗藓部分Ⅱ型WRKY基因保守结构域发生变异ꎬ主要涉及WKKYGNK㊁WRKYGHK㊁WRKYGQN三种变异类型ꎮ张凡等(2018)研究表明ꎬWRKY家族保守结构域大都是 Q 突变为 E K 或 S ꎬ这些变异会导致WRKY蛋白与DNA结合的活性减弱ꎬ并且不同变异类型对植株的功能影响也不相同ꎮ水稻OSWRKY45中 Q 变异为 K 后ꎬ其表达量在植株不同部位均出现了上调表达ꎬ并且在干旱胁迫下转基因的植株表现出更好的恢复力ꎬ过表达OSWRKY45植株的抗病与抗旱能力显著提高(Qiu&Yuꎬ2009)ꎮ但是ꎬ在小立碗藓中出现了由 Q 突变为 H 的不同变异类型ꎬ结合实验室前期数据植株在接种灰霉菌后突变基因并未上调表达ꎬ可能突变基因并未参与生物胁迫反应ꎬ是否参与非生物胁迫目前尚未见有报道ꎮ本研究结果为后续深入研究其基因功能提供了方向ꎮ对WRKY家族保守结构域的motif模型分析发现ꎬ家族成员中出现motif的缺失ꎬ并未全部形成motif1-motif3的稳定结构ꎬ解释了由单个保守结构域构成的Ⅱa型WRKY保守结构域部分发生突变的原因ꎬ表明物种可能在变异中不断获得进化ꎮ蛋白质的二级结构除PpWRKY11以α 
̄螺旋为主要构成元件外ꎬ其他基因均以无规卷曲(占比达70%)为主要构成元件ꎬ且蛋白全分布于细胞核中ꎮ染色体定位显示ꎬ38个基因分散分布于小立碗藓的21条染色体上ꎬ呈现明显的不规则分布ꎻ毛果杨中86个WRKY家族基因在染色体上也呈现不规则分布(Heetal.ꎬ2012)ꎮ从进化关系图来看ꎬWRKY家族的聚类与分析一致ꎬ结构上WRKY蛋白保守程度较高ꎬ但也存在变异的情况ꎬ说明小立碗藓WRKY转录因子在进化过程中具有多样性ꎮ其中ꎬ结构域突变基因除了Ppwrky25没聚在一起外ꎬ其他均聚合在同一分支上ꎬ说明这些转录因子可能具有相似的功能ꎬ此现象同样出现在土豆WRKY转录因子中(Liuetal.ꎬ2017)ꎮ472广㊀西㊀植㊀物42卷图3㊀小立碗藓WRKY基因家族的染色体分布图Fig.3㊀ChromosomedistributionofWRKYgenefamilyWRKY基因在植物生长发育及抗逆境胁迫中具有重要意义ꎬ一直是研究的热点问题ꎬ但在小立碗藓中WRKY基因的研究较少ꎮ本文在基因水平上对WRKY基因展开研究ꎬ为后续深入研究WRKY蛋白的结构与功能奠定了基础ꎮ参考文献:BAIJꎬPENNILLLAꎬNINGJCꎬetal.ꎬ2002.Diversityinnucleotidebindingsite ̄leucine ̄richrepeatgenesincereals[J].CytogenetGenomeResꎬ12(12):1871-1884.BOLSERDMꎬRUQURTIBꎬROBERTJꎬetal.ꎬ2017.Ensemblplants:Integratingtoolsforvisualizingꎬminingꎬandanalyzingplantgenomicsdata[J].MethodsMolBiolꎬ1533:1-31.CHENLGꎬZHANGLPꎬLIDBꎬetal.ꎬ2013.WRKY8transcriptionfactorfunctionsintheTMV ̄cgdefenseresponsebymediatingbothabscisicacidandethylenesignalinginArabidopsis[J].PNASꎬ110(21):E1963-E1971.CHENXꎬCHENRHꎬWANGYFꎬetal.ꎬ2019.Genome ̄WideIdentificationofWRKYTranscriptionFactorsinChinesejujube(ZiziphusjujubaMill.)andtheirinvolvementinfruitdevelopingꎬripeningꎬandabioticstress[J].Genesꎬ10(5):360.EULGEMTꎬRUSHTONPJꎬROBATZEKSꎬetal.ꎬ2000.TheWRKYsuperfamilyofplanttranscriptionfactors[J].TrendsPlantSciꎬ5(5):199-206.HEHSꎬDONGQꎬSHAOYHꎬetal.ꎬ2012.Genome ̄widesurveyandcharacterizationoftheWRKYgenefamilyinPopulustrichocarpa[J].PlantCellReportsꎬ31(7):1199-1217.ISHIGUROSꎬNAKAMURAKꎬ1994.CharacterizationofacDNAencodinganovelDNA ̄bindingproteinꎬSPF1ꎬthatrecognizesSP8sequencesinthe5ᶄupstreamregionsofgenescodingforsporaminandβ ̄amylasefromsweetpotato[J].MolGenGenetꎬ244(6):563-571.JIACHꎬWANGZꎬZHANGJBꎬetal.ꎬ2018.CloningandexpressionanalysisofeightWRKYtranscriptionfactorsin5722期乔刚等:小立碗藓WRKY基因家族生物信息学分析bananas[J].ChinJTropCropꎬ39(11):87-93.[贾彩红ꎬ王卓ꎬ张建斌ꎬ等ꎬ2018.香蕉中8个WRKY转录因子的克隆及表达分析[J].热带作物学报ꎬ39(11):87-93.]KAPLAN ̄LEVYRNꎬBREWERPBꎬQUONTꎬetal.ꎬ2012.TheTrihelixfamilyoftranscriptionfactors 
̄lightꎬstressanddevelopment[J].TrendsPlantSciꎬ17(3):163-171.LANYCꎬHUANGBꎬWEIJꎬetal.ꎬ2020.IdentificationandbioinformaticsanalysisoftheexpansingenefamilyofPhyscomitrellapatens[J].Guihaiaꎬ40(6):854-863.[蓝雨纯ꎬ黄彬ꎬ韦娇ꎬ等ꎬ2020.小立碗藓扩展蛋白基因家族的鉴定与生物信息学分析[J].广西植物ꎬ40(6):854-863.]LIUYꎬCAOTꎬCHENJWꎬ2007.AdvancesonthestudyofthemossPhyscomitrellapatensꎬapotentialmodelplant[J].Guihaiaꎬ27(1):90-94.[刘艳ꎬ曹同ꎬ陈静文ꎬ2007.有前景的模式植物小立碗藓的研究新进展[J].广西植物ꎬ27(1):90-94.]LIUJJꎬEKRAMODDOULLAHAKJꎬ2009.IdentificationandcharacterizationoftheWRKYtranscriptionfactorfamilyinPinusmonticola[J].Genomeꎬ52(1):77-88.LIUQNꎬLIUYꎬXINZZꎬetal.ꎬ2017.Genome ̄wideidentificationandcharacterizationoftheWRKYgenefamilyinpotato(Solanumtuberosum)[J].BiochemSystEcolꎬ71:212-218.LIUYꎬYANGTYꎬLINZKꎬetal.ꎬ2019.AWRKYtranscriptionfactorPbrWRKY53fromPyrusbetulaefoliaisinvolvedindroughttoleranceandAsAaccumulation[J].PlantBiotechnolꎬ17(9):1770-1787.LIHYꎬSHIYꎬDINGYNꎬetal.ꎬ2014.Bioinformaticsanalysisofexpansingenefamilyinpoplargenome[J].JBeijingForUnivꎬ36(2):59-67.[李昊阳ꎬ施杨ꎬ丁亚娜ꎬ等ꎬ2014.杨树扩展蛋白基因家族的生物信息学分析[J].北京林业大学学报ꎬ36(2):59-67.]LIMYꎬXUZSꎬTIANCꎬetal.ꎬ2016.GenomicidentificationofWRKYtranscriptionfactorsincarrot(Daucuscarota)andanalysisofevolutionandhomologousgroupsforplants[J].SciRepꎬ6:23101.PEARSONWRꎬ2001.Trainingforbioinformaticsandcomputationalbiology[J].Bioinformaticsꎬ17(9):761-762.QINYꎬMAXꎬYUGHꎬetal.ꎬ2014.Evolutionaryhistoryoftrihelixfamilyandtheirfunctionaldiversification[J].DNAResꎬ21(5):499-510.QIUYPꎬYUDQꎬ2009.Over ̄expressionofthestress ̄inducedOsWRKY45enhancesdiseaseresistanceanddroughttoler 
̄anceinArabidopsis[J].EnvironExpBotꎬ65(1):35-47.RENSINGSAꎬROMBAUTSSꎬPEERYVDꎬetal.ꎬ2002.Mosstranscriptomeandbeyond[J].TrendsPlantSciꎬ7(12):535-538.ROSSCAꎬLIUYꎬSHENQJꎬ2007.TheWRKYgenefamilyinrice(Oryzasativa)[J].JIntegrPlantBiolꎬ49(6):827-842.RINERSONCIꎬRABARARCꎬTRIPATHIPꎬetal.ꎬ2015.TheevolutionofWRKYtranscriptionfactors[J].BMCPlantBiolꎬ15(1):66.SCHUTTENHOFERCꎬYUANLꎬ2014.RegulationofspecializedmetabolismbyWRKYtranscriptionfactors[J].PlantPhysiolꎬ167(2):295-306.SUQꎬSHANGYHꎬDUMYꎬetal.ꎬ2007.ProgressonplantWRKYtranscriptionfactor[J].ChinAgricSciBullꎬ23(5):94-98.[苏琦ꎬ尚宇航ꎬ杜密英ꎬ等ꎬ2007.植物WRKY转录因子研究进展[J].中国农学通报ꎬ23(5):94-98.]SONGHꎬSUNWHꎬYANGGFꎬetal.ꎬ2018.WRKYtranscriptionfactorsinlegumes[J].BMCPlantBiolꎬ18(1):243.ULKERBꎬSOSICHIEJꎬ2004.WRKYtranscriptionfactors:fromDNAbindingtowardsbiologicalfunction[J].CurrOpinPlantBiolꎬ7(5):491-498.WANGYYꎬFENGLꎬZHUYXꎬetal.ꎬ2015.ComparativegenomicanalysisoftheWRKYIIIgenefamilyinpopulusꎬgrapeꎬarabidopsisandrice[J].BiolDirectꎬ10:28.WANGCTꎬRUJNꎬLIUYWꎬetal.ꎬ2018.MaizeWRKYtranscriptionfactorZmWRKY106confersdroughtandheattoleranceintransgenicplants[J].IntJMolSciꎬ19(10):2-15.XUZYꎬZHANGDDꎬHUJꎬetal.ꎬ2009.Comparativegenomeanalysisofligninbiosynthesisgenefamiliesacrosstheplantkingdom[J].BMCBioinformaticsꎬ10(Suppl.11):1471-1486.XUHJꎬWATANABEKAꎬZHANGLYꎬetal.ꎬ2016.WRKYtranscriptionfactorgenesinwildriceOryzanivara[J].DNAResꎬ23(4):311-323.ZHAOHꎬZHAOXGꎬHEYKꎬetal.ꎬ2004.Physcomitrellapatensꎬapotentialmodelsysteminplantmolecularbiology[J].ChinBullBotꎬ21(2):129-138.[赵奂ꎬ赵晓刚ꎬ何奕昆ꎬ等ꎬ2004.植物分子生物学研究极具前景的模式系统 小立碗藓[J].植物学报ꎬ21(2):129-138.]ZHANGFꎬYINJꎬGUOYQꎬetal.ꎬ2018.ResearchadvancesonWRKYtranscriptionfactors[J].BiotechnolBullꎬ34(1):40-48.[张凡ꎬ尹俊龙ꎬ郭英琪ꎬ等ꎬ2018ꎬWRKY转录因子的研究进展[J].生物技术通报ꎬ34(1):40-48.]ZHANGMꎬCHENYꎬNIELꎬetal.ꎬ2018.Transcriptome 
̄wideidentificationandscreeningofWRKYfactorsinvolvedintheregulationoftaxolbiosynthesisinTaxuschinensis[J].SciReportsꎬ8(1):5197.ZHANGYJꎬWANGLJꎬ2005.TheWRKYtranscriptionfactorsuperfamily:itsoriginineukaryotesandexpansioninplants[J].BMCEvolBiolꎬ5(1):1.(责任编辑㊀蒋巧媛)672广㊀西㊀植㊀物42卷。
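Plasmid standard for the pmi gene in transgenic maize
The pmi plasmid standard is a useful tool in transgenic maize research and testing.
pmi is the gene encoding phosphomannose isomerase, which is widely used as a selectable marker in plant transformation; a pmi plasmid standard is a plasmid that carries this sequence.
The plasmid can be replicated in bacteria, and in transformation experiments the pmi sequence serves as a marker showing that the transformed cells or tissues have integrated the introduced DNA.
In transgenic maize, the pmi standard supports the safety assessment and management of transgenic plants.
Screening plants with the pmi marker helps confirm which plants are transformed before their growth and development are evaluated.
The pmi standard can also be used to test whether maize products contain transgenic components, which helps protect consumers' food safety.
In short, the pmi plasmid standard is a tool that supports the safety and sustainable development of transgenic maize varieties.
It can guide the development and application of transgenic plants and helps protect consumer health and food safety.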
TREC 2005 Genomics Track OverviewWilliam Hersh1, Aaron Cohen1, Jianji Yang1, Ravi Teja Bhupatiraju1,Phoebe Roberts2, Marti Hearst31Oregon Health & Science University, Portland, OR, USA2Biogen Idec Corp., Boston, MA, USA3University of California, Berkeley, CA, USAThe TREC 2005 Genomics Track featured two tasks, an ad hoc retrieval task and four subtasks in text categorization. The ad hoc retrieval task utilized a 10-year, 4.5-million document subset of the MEDLINE bibliographic database, with 50 topics conforming to five generic topic types. The categorization task used a full-text document collection with training and test sets consisting of about 6,000 biomedical journal articles each. Participants aimed to triage the documents into categories representing data resources in the Mouse Genome Informatics database, with performance assessed via a utility measure.1. IntroductionThe goal of the TREC Genomics Track is to create test collections for evaluation of information retrieval (IR) and related tasks in the genomics domain. The Genomics Track differs from other TREC tracks in that it is focused on retrieval in a specific domain as opposed to general retrieval tasks, such as Web searching or question answering. There are many reasons why a focus on this domain is important. New advances in biotechnologies have changed the face of biological research, particularly “high-throughput” techniques such as gene microarrays [1]. These techniques not only generate massive amounts of data but also have led to an explosion of new scientific knowledge. As a result, this domain is ripe for improved information access and management.The scientific literature plays a key role in the growth of biomedical research data and knowledge. Experiments identify new genes, diseases, and other biological processes and factors that require further investigation. 
Furthermore, the literature itself becomes a source of “experiments” as researchers turn to it to search for knowledge that in turn drives new hypotheses and research. Thus, there are considerable challenges not only for better IR systems, but also for improvements in related techniques, such as information extraction and text mining [2, 3].Because of the growing size and complexity of the biomedical literature, there is increasing effort devoted to structuring knowledge in databases. The use of these databases is made pervasive by the growth of the Internet and the Web as well as a commitment of the research community to put as much data as possible into the public domain. Figure 1 depicts the overall process of “funneling” the literature towards structured knowledge, showing the information system tasks used at different levels along the way. This figure shows our view of the optimal uses for IR and the related areas of information extraction and text mining.Figure 1 - The funneling of scientific literature and related information retrieval and extraction disciplines.TREC 2005 marks the third offering of the Genomics Track. The first of the track, 2003, was limited by lack of resources to perform relevance judgments and other tasks, so the track had to use “pseudojudgments” culled from data created for other purposes [4]. In 2004, however, the track obtained a five-year grant from the U.S. National Science Foundation (NSF), which provided resources for building test collections and other data sources. The 2004 track featured an ad hoc retrieval task [5] and three subtasks in text categorization [6].For 2005, the track built on the success of 2004 by using the same underlying document collections on new topics for ad hoc retrieval and refinement of the text categorization tasks. Similar to the 2004 track, the track attracted the largest number of participating groups of any in TREC. 
In 2005, 32 groups submitted 59 runs to the ad hoc retrieval task, while 19 groups submitted 192 runs to the categorization subtasks. A total of 41 different groups participated, with 10 groups participating in both tasks, 22 participating only in the ad hoc retrieval task, and 9 participating in just the categorization tasks, making it the largest track in TREC 2005.

The remainder of this paper covers the tasks, methods, and results of the two tasks separately, followed by discussion of future directions.

2. Ad Hoc Task

2.1 Task

The ad hoc retrieval task modeled the situation of a user with an information need using an information retrieval system to access the biomedical scientific literature. The document collection was based on a large subset of the MEDLINE bibliographic database. It should be noted that although we are in an era of readily available full-text journals (usually requiring a subscription), many users of the biomedical literature enter through searching MEDLINE. As such, there are still strong motivations to improve the effectiveness of searching MEDLINE.

2.2 Documents

The document collection for the 2005 ad hoc retrieval task was the same 10-year MEDLINE subset used for the 2004 track. One of our goals is to produce a number of topic and relevance judgment collections that use this same document collection, to make retrieval experimentation easier (so people do not have to load different collections into their systems). Additional uses of this subset have already appeared [7]. MEDLINE can be searched by anyone in the world using the PubMed system of the National Library of Medicine (NLM), which maintains both MEDLINE and PubMed. The full MEDLINE database contains over 14 million references dating back to 1966 and is updated on a daily basis.

The subset of MEDLINE for the TREC 2005 Genomics Track consisted of 10 years of completed citations from the database, covering 1994 to 2003 inclusive.
Records were extracted using the Date Completed (DCOM) field for all references in the range of 19940101 - 20031231. This provided a total of 4,591,008 records, which is about one third of the full MEDLINE database. The data included all of the PubMed fields identified in the MEDLINE Baseline record. Descriptions of the various fields of MEDLINE are available at:
/entrez/query/static/help/pmhelp.html#MEDLINEDisplayFormat

The MEDLINE subset was provided in the “MEDLINE” format, consisting of ASCII text with fields indicated and delimited by 2-4 character abbreviations. The size of the uncompressed file was 9,587,370,116 bytes. An XML version of the MEDLINE subset was also available. It should also be noted that not all MEDLINE records have abstracts, usually because the article itself does not have an abstract. In general, about 75% of MEDLINE records have abstracts. In our subset, there were 1,209,243 (26.3%) records without abstracts.

2.3 Topics

As with 2004, we collected information needs from real biologists. However, instead of soliciting free-form biomedical questions, we developed a set of six generic topic templates (GTTs) derived from an analysis of the topics from the 2004 track and other known biologist information needs (Table 1). GTTs consist of semantic types, such as genes or diseases, placed in the context of commonly queried biomedical questions, and semantic types are often present in more than one GTT. After we developed the GTTs, 11 people interviewed 25 biologists to obtain ten or more specific information needs that conformed to each GTT. One GTT did not model a commonly researched problem, and was dropped from the study. The topics did not have to fit precisely into the GTTs, but had to come close, i.e., have all the required semantic types. We then had other people search on the topics to make sure there was some, but not too much, relevant information in MEDLINE.
Ten information needs for each GTT were selected for inclusion in the 2005 track, for a total of fifty topics.

In order to get participating groups started with the topics, and in order for them not to “spoil” the automatic status of their official runs by working with the official topics, we developed 10 sample topics, consisting of two topics from each GTT. These learning topics had a MEDLINE search and relevance judgments of the output that we made available to participants. Table 1 also gives an example topic for each GTT that comes from the sample topics.

2.4 Relevance judgments

Relevance judgments were done using the conventional pooling method of TREC. Based on estimation of relevance judgment resources, the top 60 documents for each topic from all official runs were used. This gave an average pool size of 821 documents with a range of 290 to 1356. These pools were then provided to the relevance judges, who consisted of five individuals with varying expertise in biology. The relevance judges were instructed in the following manner for each GTT:
• Relevant article must describe how to conduct, adjust, or improve a standard, a new method, or a protocol for doing some sort of experiment or procedure.
• Relevant article must describe some specific role of the gene in the stated disease or biological process.
• Relevant article must describe a specific interaction (e.g., promote, suppress, inhibit, etc.) between two or more genes in the stated function of the organ or the disease.
• Relevant article must describe a mutation of the stated gene and the particular biological impact(s) that the mutation has been found to have.

The articles had to describe a specific gene, disease, impact, mutation, etc. and not just the concept in general.

Table 1 - Generic topic types and example sample topics.
The semantic types in each GTT are underlined.

Generic Topic Type                                    Topic Range  Example Sample Topic

Find articles describing standard methods or          100-109      Method or protocol: GST fusion protein
protocols for doing some sort of experiment or                     expression in Sf9 insect cells
procedure

Find articles describing the role of a gene           110-119      Gene: DRD4
involved in a given disease                                        Disease: Alcoholism

Find articles describing the role of a gene in a      120-129      Gene: Insulin receptor gene
specific biological process                                        Biological process: Signaling tumorigenesis

Find articles describing interactions (e.g.,          130-139      Genes: HMG and HMGB1
promote, suppress, inhibit, etc.) between two or                   Disease: Hepatitis
more genes in the function of an organ or in a
disease

Find articles describing one or more mutations of     140-149      Gene with mutation: Ret
a given gene and its biological impact                             Biological impact: Thyroid function

Relevance judges were asked to rate documents as definitely, possibly, or not relevant. As in 2004, articles that were rated definitely or possibly relevant were considered relevant for use in the binary recall and precision-related measures of retrieval performance. Relevance judgments were performed by individuals with varying levels of expertise in biology (from an undergraduate student to a PhD researcher). For 10 of the topics, judgments were performed in duplicate to allow interobserver reliability measurement using the kappa statistic.

2.5 Measures and statistical analysis

Retrieval performance was measured with the “usual” TREC ad hoc measures of mean average precision (MAP), binary preference (B-Pref) [8], precision at the point of the number of relevant documents retrieved (R-Prec), and precision at varying numbers of documents retrieved (e.g., 5, 10, 30, etc. documents up to 1,000). These measures were calculated using version 8.0 of trec_eval developed by Chris Buckley (Sabir Research).

Research groups submitted their runs through the TREC Web site in the usual manner.
They were required to classify their runs into one of three categories:
• Automatic - no manual intervention in building queries
• Manual - manual construction of queries but no further human interaction
• Interactive - completely interactive construction of queries and further interaction with system output

They were also required to provide a brief system description.

Statistical analysis of the above measures was performed using SPSS (version 12.0). Repeated-measures analysis of variance (ANOVA) with post hoc tests using Sidak adjustments was performed on the above variables. In addition, descriptive analysis of MAP was done to study the spread of the data.

2.6 Results

A total of 32 groups submitted 58 runs. Table 2 shows the results of relevance judging for each topic, listing the pool size sent to a given assessor plus their distribution of relevance assessments. The combined number and percentage of documents rated definitely and possibly relevant are also listed, since these were considered relevant from the standpoint of official results. Six topics had no definitely relevant documents. One topic had no definitely or possibly relevant documents and was dropped from the calculation of official results.

Table 2 - Relevant documents per topic. Topic 135 had no relevant documents and was eliminated from the results.
Documents that were definitely or possibly relevant were considered to be relevant for the purposes of official TREC results.

Topic  Pool   Definitely  Possibly  Not       Definitely + Possibly  % TREC
       Size   Relevant    Relevant  Relevant  (TREC) Relevant        Relevant
100    704    22          52        630       74                     10.5%
101    651    2           18        631       20                     3.1%
102    1164   5           5         1154      10                     0.9%
103    701    6           19        676       25                     3.6%
104    629    0           4         625       4                      0.6%
105    1133   4           85        1044      89                     7.9%
106    1230   44          125       1061      169                    13.7%
107    484    76          114       294       190                    39.3%
108    1092   76          127       889       203                    18.6%
109    389    165         14        210       179                    46.0%
110    934    4           12        918       16                     1.7%
111    675    109         93        473       202                    29.9%
112    872    4           7         861       11                     1.3%
113    1356   10          4         1342      14                     1.0%
114    754    210         169       375       379                    50.3%
115    1350   3           12        1335      15                     1.1%
116    1265   58          28        1179      86                     6.8%
117    1094   527         182       385       709                    64.8%
118    938    20          12        906       32                     3.4%
119    589    42          19        528       61                     10.4%
120    527    223         122       182       345                    65.5%
121    422    17          25        380       42                     10.0%
122    871    19          37        815       56                     6.4%
123    1029   5           32        992       37                     3.6%
124    752    8           53        691       61                     8.1%
125    1202   3           8         1191      11                     0.9%
126    1320   190         117       1013      307                    23.3%
127    841    1           3         837       4                      0.5%
128    954    21          53        880       74                     7.8%
129    987    16          22        949       38                     3.9%
130    813    9           23        781       32                     3.9%
131    431    2           40        389       42                     9.7%
132    531    3           27        501       30                     5.6%
133    523    0           5         518       5                      1.0%
134    732    2           9         721       11                     1.5%
135    1057   0           0         1057      0                      0.0%
136    853    0           3         850       3                      0.4%
137    1129   12          39        1078      51                     4.5%
138    501    6           6         489       12                     2.4%
139    380    15          20        345       35                     9.2%
140    395    14          15        366       29                     7.3%
141    520    34          47        439       81                     15.6%
142    528    151         120       257       271                    51.3%
143    902    0           4         898       4                      0.4%
144    1212   1           1         1210      2                      0.2%
145    288    10          22        256       32                     11.1%
146    825    370         67        388       437                    53.0%
147    659    0           10        649       10                     1.5%
148    536    0           11        525       11                     2.1%
149    1294   6           17        1271      23                     1.8%
Avg    820.4  50.5        41.2      728.7     91.7                   12.5%

Table 3 - Overlap of duplicate judgments for kappa statistic.

                               Duplicate judge -  Duplicate judge -  Total
                               Relevant           Not Relevant
Original judge - Relevant      1100               629                1729
Original judge - Not Relevant  546                8204               8750
Total                          1646               8833               10479

In order to assess the consistency of relevance judgments, we had judgments of ten topics performed in duplicate.
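The agreement statistic for Table 3 can be illustrated with a short sketch. It assumes that the kappa in question is Cohen's kappa computed on the 2x2 table of binary (relevant / not relevant) judgments; the function and argument names are illustrative and do not come from the track's tooling.

```python
def cohen_kappa_2x2(both_rel, orig_only, dup_only, neither):
    """Cohen's kappa for two judges making binary relevance decisions.

    both_rel  - documents both judges rated relevant
    orig_only - relevant per the original judge only
    dup_only  - relevant per the duplicate judge only
    neither   - documents both judges rated not relevant
    """
    total = both_rel + orig_only + dup_only + neither
    # Observed agreement: fraction of documents on the table's diagonal.
    observed = (both_rel + neither) / total
    # Chance agreement, estimated from each judge's marginal totals.
    orig_rel = both_rel + orig_only
    dup_rel = both_rel + dup_only
    expected = (orig_rel * dup_rel
                + (total - orig_rel) * (total - dup_rel)) / total ** 2
    return (observed - expected) / (1 - expected)


# Counts taken from Table 3.
kappa = cohen_kappa_2x2(both_rel=1100, orig_only=629, dup_only=546, neither=8204)
print(round(kappa, 3))
```

With the Table 3 counts, observed agreement is about 0.888 and chance agreement about 0.730, so the sketch evaluates to roughly 0.585.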
(For three topics, we actually had judgments performed in triplicate; one of these was the topic that had no relevant documents.) The judgments from the original judge were used as the “official” judgments. Table 3 shows the consistency of the judgments from the original and duplicating judges. The kappa score for inter-judge agreement was 0.585, indicating a “moderate” level of agreement and comparable to the 2004 Genomics Track.

The overall results are shown in Table 4, sorted by MAP. The top-ranking run came from York University. The top-ranking run was a manual run, but this group also had the top-ranking automatic run. The top-ranking interactive run was somewhat further down the list, although this group had an automatic run that performed better. The statistical analysis of the runs showed overall statistical significance for all of the measures. Pair-wise comparison of MAP for the 58 runs showed that significant difference from the top run was obtained at run uta05i. At the other end, significant difference from the lowest run was reached by run genome2. Figure 2 shows the MAP results with 95% confidence intervals, while Figure 3 shows all of the statistics from Table 4, sorted by each run’s MAP.

We also assessed the results by topic. Table 5 shows the various measures for each topic, while Figure 4 shows the same data graphically with confidence intervals. The spread of MAP showed a wide variation among the 49 topics. Topic 136 had the lowest variance (<0.001), with a range of 0-0.0287. On the other hand, topic 119 showed the highest variance (0.060), with a range of 0.0144-0.8289. Topic 121 received the highest mean MAP at 0.620, while topic 143 had the lowest at 0.003. Figure 5 compares the number of relevant documents with MAP for each topic.

In addition, we grouped the results by GTT, as shown in Table 6.
The GTT of information describing the role of a gene in a disease achieved the highest MAP, while the gene interactions and gene mutations achieved the best B-Pref. However, the differences among all of the GTTs were modest.

Table 4 - Run results by run name, type (manual, automatic, or interactive), and performance measures.

Run                Group                       Type  MAP     R-Prec  B-Pref  P10     P100    P1000
york05gm1 [9]      yorku.huang                 m     0.302   0.3212  0.3155  0.4551  0.2543  0.0748
york05ga1 [9]      yorku.huang                 a     0.2888  0.3118  0.3061  0.4592  0.2557  0.0721
ibmadz05us [10]    ibm.zhang                   a     0.2883  0.3091  0.3026  0.4735  0.2643  0.0766
ibmadz05bs [10]    ibm.zhang                   a     0.2859  0.3061  0.2987  0.4694  0.2606  0.0761
uwmtEg05           uwaterloo.clarke            a     0.258   0.2853  0.2781  0.4143  0.2292  0.0718
UIUCgAuto [11]     uiuc.zhai                   a     0.2577  0.2688  0.2708  0.4122  0.231   0.0709
UIUCgInt [11]      uiuc.zhai                   i     0.2487  0.2627  0.267   0.4224  0.2355  0.0694
NLMfusionA [12]    nlm-umd.aronson             a     0.2479  0.2767  0.2675  0.402   0.2378  0.0688
iasl1 [13]         academia.sinica.tsai        a     0.2453  0.2708  0.265   0.398   0.2292  0.0698
NLMfusionB [12]    nlm-umd.aronson             a     0.2453  0.2666  0.2541  0.4082  0.2339  0.0693
UniNeHug2 [14]     uneuchatel.savoy            a     0.2439  0.2582  0.264   0.398   0.2308  0.0712
UniGe2 [15]        u.geneva                    a     0.2396  0.2705  0.2608  0.3878  0.2361  0.0711
i2r1 [16]          iir.yu                      a     0.2391  0.2629  0.2716  0.3898  0.231   0.0668
uta05a [17]        utampere.pirkola            a     0.2385  0.2638  0.2546  0.4163  0.2255  0.0678
i2r2 [16]          iir.yu                      a     0.2375  0.2622  0.272   0.3878  0.2296  0.067
UniNeHug2c [14]    uneuchatel.savoy            a     0.2375  0.2662  0.2589  0.3878  0.239   0.0725
uwmtEg05fb         uwaterloo.clarke            a     0.2359  0.2573  0.2552  0.3878  0.2257  0.0712
DUTAdHoc2 [18]     dalianu.yang                m     0.2349  0.2678  0.2725  0.3939  0.2206  0.0648
THUIRgen1S [19]    tsinghua.ma                 a     0.2349  0.2663  0.2568  0.4224  0.2214  0.0622
tnog10 [20]        tno.erasmus.kraaij          a     0.2346  0.2607  0.2564  0.3857  0.2227  0.0668
DUTAdHoc1 [18]     dalianu.yang                m     0.2344  0.2718  0.2726  0.402   0.22    0.0645
tnog10p [20]       tno.erasmus.kraaij          a     0.2332  0.2506  0.2555  0.402   0.2173  0.0668
iasl2 [13]         academia.sinica.tsai        a     0.2315  0.2465  0.2487  0.3816  0.2276  0.07
UAmscombGeFb [21]  uamsterdam.aidteam          a     0.2314  0.2638  0.2592  0.4163  0.2271  0.0612
UBIgeneA [22]      suny-buffalo.ruiz           a     0.2262  0.2567  0.2542  0.3633  0.2122  0.0683
OHSUkey [23]       ohsu.hersh                  a     0.2233  0.2569  0.2544  0.3735  0.2169  0.0632
NTUgah2 [24]       ntu.chen                    a     0.2204  0.2562  0.2498  0.398   0.1996  0.0644
THUIRgen2P [19]    tsinghua.ma                 a     0.2177  0.2519  0.2395  0.4143  0.2198  0.0695
NTUgah1 [24]       ntu.chen                    a     0.2173  0.2558  0.2513  0.3918  0.1998  0.0615
UniGeNe [15]       u.geneva                    a     0.215   0.2364  0.2347  0.3367  0.2237  0.0694
UAmscombGeMl [21]  uamsterdam.aidteam          a     0.2015  0.2325  0.232   0.3551  0.2094  0.0568
uta05i [17]        utampere.pirkola            i     0.198   0.2411  0.229   0.4082  0.2137  0.0547
PDnoSE [25]        upadova.bacchin             a     0.1937  0.2213  0.2183  0.3571  0.2006  0.063
iitprf011003 [26]  iit.urbain                  a     0.1913  0.2142  0.2205  0.3612  0.2018  0.065
dcu1 [27]          dublincityu.gurrin          a     0.1851  0.2178  0.2129  0.3816  0.1851  0.0577
dcu2 [27]          dublincityu.gurrin          a     0.1844  0.2234  0.214   0.3959  0.1896  0.0599
SFUshi [28]        simon-fraseru.shi           m     0.1834  0.2072  0.2149  0.3429  0.1898  0.0608
OHSUall [23]       ohsu.hersh                  a     0.183   0.2285  0.2221  0.3286  0.1965  0.0592
wim2 [29]          fudan.niu                   a     0.1807  0.2006  0.2055  0.3     0.1794  0.057
genome1 [30]       csusm.guillen               a     0.1803  0.2174  0.211   0.3245  0.1749  0.0577
wim1 [29]          fudan.niu                   a     0.1781  0.2094  0.2076  0.3347  0.181   0.0592
NCBITHQ [12]       nlm.wilbur                  a     0.1777  0.214   0.2192  0.3041  0.1824  0.0526
NCBIMAN [12]       nlm.wilbur                  m     0.1747  0.2081  0.2181  0.3122  0.182   0.0519
UICgen1 [31]       uillinois-chicago.liu       a     0.1738  0.2079  0.2046  0.3082  0.1941  0.0579
MARYGEN1 [32]      umaryland.oard              a     0.1729  0.1954  0.1898  0.3041  0.1439  0.0409
PDSESe02 [25]      upadova.bacchin             a     0.1646  0.1928  0.1928  0.3224  0.1904  0.0615
genome2 [30]       csusm.guillen               a     0.1642  0.1931  0.1928  0.298   0.1676  0.0565
UIowa05GN102 [33]  uiowa.eichmann              a     0.1303  0.1861  0.1693  0.2898  0.1671  0.0396
UMD01 [34]         umichigan-dearborn.murphey  a     0.1221  0.1541  0.1435  0.3224  0.1473  0.0321
UIowa05GN101 [33]  uiowa.eichmann              a     0.1095  0.1636  0.1414  0.2857  0.1571  0.026
CCP0 [35]          ucolorado.cohen             m     0.1078  0.1486  0.1311  0.2837  0.1439  0.0203
YAMAHASHI2         utokyo.takahashi            m     0.1022  0.1236  0.1276  0.2653  0.1312  0.0369
YAMAHASHI1         utokyo.takahashi            m     0.1003  0.1224  0.1248  0.2531  0.1267  0.0356
dpsearch2 [36]     datapark.zakharov           m     0.0861  0.1169  0.1034  0.2633  0.1231  0.0278
dpsearch1 [36]     datapark.zakharov           m     0.0827  0.1177  0.1017  0.2551  0.1182  0.0274
asubaral           arizonau.baral              m     0.0797  0.1079  0.0967  0.2714  0.1061  0.0142
CCP1 [35]          ucolorado.cohen             m     0.0554  0.0963  0.0775  0.1878  0.0951  0.0134
UMD02 [34]         umichigan-dearborn.murphey  a     0.0544  0.0703  0.0735  0.1755  0.0843  0.0166
Minimum                                              0.0544  0.0703  0.0735  0.1755  0.0843  0.0134
Mean                                                 0.1968  0.2258  0.2218  0.3576  0.1976  0.0573
Maximum                                              0.302   0.3212  0.3155  0.4735  0.2643  0.0766

Figure 2 - Run results with 95% confidence intervals, sorted alphabetically.

Figure 3 - Run results plotted graphically, sorted by MAP of each run. (x-axis: Run; y-axis: Value.)

Table 5 - Results by topic.

Topic  MAP     R-Prec  B-Pref  P10     P100    P1000
100    0.1691  0.2148  0.1616  0.3569  0.1916  0.0550
101    0.0454  0.0526  0.0285  0.0483  0.0516  0.0141
102    0.0110  0.0172  0.0100  0.0172  0.0091  0.0036
103    0.0603  0.0945  0.0570  0.0948  0.0602  0.0169
104    0.0694  0.0948  0.0582  0.0690  0.0124  0.0023
105    0.1102  0.1703  0.1461  0.4655  0.1586  0.0327
106    0.0625  0.1120  0.1231  0.3138  0.1433  0.0491
107    0.4184  0.4297  0.5289  0.9103  0.5934  0.1373
108    0.1224  0.1973  0.2206  0.4828  0.2788  0.0695
109    0.5347  0.5196  0.6512  0.9190  0.7066  0.1345
110    0.0137  0.0248  0.0154  0.0224  0.0128  0.0055
111    0.2192  0.2985  0.2926  0.3569  0.3140  0.1170
112    0.2508  0.3354  0.2754  0.3586  0.0481  0.0062
113    0.3124  0.3498  0.3164  0.3931  0.0822  0.0096
114    0.3876  0.4364  0.5505  0.8259  0.6697  0.2476
115    0.0378  0.0437  0.0340  0.0534  0.0193  0.0036
116    0.1103  0.1720  0.1456  0.2879  0.1636  0.0359
117    0.3796  0.4739  0.5126  0.8345  0.7409  0.4099
118    0.1343  0.1460  0.1369  0.3276  0.0634  0.0145
119    0.5140  0.5212  0.5075  0.8190  0.3462  0.0493
120    0.5769  0.5421  0.7217  0.9259  0.8091  0.2695
121    0.6205  0.6560  0.6394  0.7983  0.3040  0.0337
122    0.1423  0.2023  0.1590  0.3569  0.1510  0.0320
123    0.0375  0.0708  0.0474  0.1121  0.0493  0.0133
124    0.1519  0.2035  0.1693  0.5103  0.1505  0.0324
125    0.0772  0.0862  0.0708  0.0897  0.0209  0.0028
126    0.1313  0.2172  0.2388  0.3966  0.2979  0.1422
127    0.1015  0.1250  0.0862  0.0759  0.0155  0.0028
128    0.0921  0.1424  0.1062  0.3224  0.1247  0.0366
129    0.0864  0.1393  0.0939  0.1793  0.0984  0.0212
130    0.3390  0.3545  0.3346  0.6362  0.1388  0.0194
131    0.4436  0.4384  0.4230  0.5517  0.2790  0.0343
132    0.1048  0.1558  0.1115  0.2431  0.0966  0.0196
133    0.0328  0.0207  0.0172  0.0172  0.0140  0.0029
134    0.1687  0.1771  0.1582  0.1914  0.0364  0.0069
136    0.0032  0.0000  0.0000  0.0000  0.0019  0.0010
137    0.0676  0.1146  0.0767  0.1776  0.0848  0.0232
138    0.2196  0.2342  0.2029  0.2534  0.0552  0.0089
139    0.3600  0.3941  0.3488  0.5810  0.2052  0.0305
140    0.2700  0.3115  0.2423  0.3810  0.1843  0.0248
141    0.2381  0.2735  0.2053  0.3362  0.2598  0.0699
142    0.4416  0.4608  0.5911  0.8569  0.6409  0.2098
143    0.0031  0.0043  0.0011  0.0034  0.0021  0.0009
144    0.0734  0.0603  0.0431  0.0276  0.0053  0.0009
145    0.3363  0.3761  0.3238  0.5931  0.1852  0.0260
146    0.4808  0.4961  0.6325  0.8466  0.7212  0.3076
147    0.0087  0.0138  0.0057  0.0138  0.0091  0.0040
148    0.0411  0.0376  0.0144  0.0293  0.0407  0.0066
149    0.0286  0.0495  0.0304  0.0603  0.0347  0.0089

Figure 4 - Results by topic plotted graphically.

Figure 5 - Comparison of number of relevant documents and MAP for each topic. (x-axis: Topic; y-axes: Relevant and MAP.)

Table 6 - Results by generic topic type.

Topics   GTT                                               MAP     R-Prec  B-Pref  P10     P100    P1000

100-109  Information describing standard methods or        0.1603  0.1903  0.1985  0.3678  0.2206  0.0515
         protocols for doing some sort of experiment or
         procedure

110-119  Information describing the role(s) of a gene      0.2360  0.2802  0.2787  0.4279  0.2460  0.0899
         involved in a disease

120-129  Information describing the role of a gene in a    0.2018  0.2385  0.2333  0.3767  0.2021  0.0587
         specific biological process

130-139  Information describing interactions (e.g.,        0.1932  0.2099  0.1859  0.2946  0.1013  0.0163
         promote, suppress, inhibit, etc.) between two or
         more genes in the function of an organ or in a
         disease

140-149  Information describing one or more mutations of   0.1922  0.2084  0.2090  0.3148  0.2083  0.0659
         a given gene and its biological impact or role

3. Categorization Task

3.1 Subtasks

The second task for the 2005 track was a full-text document categorization task. It was similar in part to the 2004 categorization task in using data from the Mouse Genome Informatics (MGI, /) system [37] and was a document triage task, where a decision is made on a per-document basis about whether or not to pass a document on for further expert review. It included a repeat of one subtask from last year, the triage of articles for GO annotation [38], and added triage of articles for three other major types of information collected and catalogued by MGI. These include articles about tumor biology [39], embryologic gene expression [40], and alleles of mutant phenotypes [41].

As such, the categorization task assessed how well systems can categorize documents in four separate categories. We used the same utility measure used last year but with different parameters (see below).
We created an updated version of the cat_eval program that calculated the utility measure plus recall, precision, and the F score.

3.2 Documents

The documents for the 2005 categorization tasks consisted of the same full-text articles used in 2004. The articles came from three journals over two years, reflecting the full-text data we were able to obtain from Highwire Press: Journal of Biological Chemistry (JBC), Journal of Cell Biology (JCB), and Proceedings of the National Academy of Sciences (PNAS). These journals have a good proportion of mouse genome articles. Each of the papers from these journals was available in SGML format based on Highwire’s document type definition (DTD). As in 2004, we designated articles published in 2002 as training data and those published in 2003 as test data.
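Section 3.1 notes that cat_eval reports recall, precision, and the F score alongside the utility measure. As a reminder of how those three quantities relate for a triage decision, here is a minimal sketch. It is not the cat_eval implementation: it assumes the balanced (F1) form of the F score, the function name and the example counts are hypothetical, and the utility measure is left out because its parameters are not given in this part of the paper.

```python
def precision_recall_f(true_pos, false_pos, false_neg):
    """Precision, recall, and balanced F score for one triage category.

    true_pos  - positive documents the system passed on for review
    false_pos - negative documents the system passed on for review
    false_neg - positive documents the system failed to pass on
    """
    selected = true_pos + false_pos
    positives = true_pos + false_neg
    precision = true_pos / selected if selected else 0.0
    recall = true_pos / positives if positives else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Assumption: F is the harmonic mean of precision and recall (F1).
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical counts, for illustration only (not track results).
p, r, f = precision_recall_f(true_pos=30, false_pos=10, false_neg=20)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

In a triage setting, a missed positive document is typically more costly than a false alarm, which is one reason a weighted utility measure is reported in addition to these symmetric scores.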