short motifs in
74LVC245A; 74LVCH245A octal bus transceiver; 3-state: data sheet
74LVC245A; 74LVCH245A
Octal bus transceiver; 3-state
Rev. 9, 11 September 2018
Product data sheet

1. General description
The 74LVC245A; 74LVCH245A are 8-bit transceivers featuring non-inverting 3-state bus compatible outputs in both send and receive directions. The device features an output enable (OE) input for easy cascading and a send/receive (DIR) input for direction control. OE controls the outputs so that the buses are effectively isolated.
Inputs can be driven from either 3.3 V or 5 V devices. When disabled, up to 5.5 V can be applied to the outputs. These features allow the use of these devices in mixed 3.3 V and 5 V applications.
The 74LVCH245A bus hold on data inputs eliminates the need for external pull-up resistors to hold unused inputs.

2. Features and benefits
• 5 V tolerant inputs/outputs for interfacing with 5 V logic
• Wide supply voltage range from 1.2 V to 3.6 V
• CMOS low-power consumption
• Direct interface with TTL levels
• Inputs accept voltages up to 5.5 V
• High-impedance when V_CC = 0 V
• Bus hold on all data inputs (74LVCH245A only)
• Complies with JEDEC standard:
  • JESD8-7A (1.65 V to 1.95 V)
  • JESD8-5A (2.3 V to 2.7 V)
  • JESD8-C/JESD36 (2.7 V to 3.6 V)
• ESD protection:
  • HBM JESD22-A114F exceeds 2000 V
  • MM JESD22-A115B exceeds 200 V
  • CDM JESD22-C101E exceeds 1000 V
• Specified from -40 °C to +85 °C and -40 °C to +125 °C

3. Ordering information

4. Functional diagram

5. Pinning information
5.1. Pinning
Fig. 3. Pin configuration SOT163-1 (SO20), SOT339-1 (SSOP20) and SOT360-1 (TSSOP20)
Transparent top view; pad marked GND (1).
(1) This is not a supply pin. The substrate is attached to this pad using conductive die attach material. There is no electrical or mechanical requirement to solder this pad.
However, if it is soldered, the solder land should remain floating or be connected to GND.
Fig. 4. Pin configuration SOT764-1 (DHVQFN20)
5.2. Pin description

6. Functional description
Table 3. Function selection
H = HIGH voltage level; L = LOW voltage level; X = don't care; Z = high impedance OFF-state.

7. Limiting values
Table 4. Limiting values
In accordance with the Absolute Maximum Rating System (IEC 60134). Voltages are referenced to GND (ground = 0 V).
[1] The minimum input voltage ratings may be exceeded if the input current ratings are observed.
[2] The output voltage ratings may be exceeded if the output current ratings are observed.
[3] For SO20 packages: above 70 °C derate linearly with 8 mW/K.
    For (T)SSOP20 packages: above 60 °C derate linearly with 5.5 mW/K.
    For DHVQFN20 packages: above 60 °C derate linearly with 4.5 mW/K.

8. Recommended operating conditions

9. Static characteristics
Table 6. Static characteristics
At recommended operating conditions. Voltages are referenced to GND (ground = 0 V).
[1] All typical values are measured at V_CC = 3.3 V (unless stated otherwise) and T_amb = 25 °C.
[2] The bus hold circuit is switched off when V_I > V_CC, allowing 5.5 V on the input terminal.
[3] For I/O ports the parameter I_OZ includes the input leakage current.
[4] Valid for data inputs of bus hold parts only (74LVCH245A). Note that control inputs do not have a bus hold circuit.
[5] The specified sustaining current at the data input holds the input below the specified V_I level.
[6] The specified overdrive current at the data input forces the data input to the opposite input state.

10. Dynamic characteristics
Table 7. Dynamic characteristics
Voltages are referenced to GND (ground = 0 V). For test circuit see Fig. 7.
[1] Typical values are measured at T_amb = 25 °C and V_CC = 1.2 V, 1.8 V, 2.5 V, 2.7 V and 3.3 V respectively.
[2] t_pd is the same as t_PLH and t_PHL.
    t_en is the same as t_PZL and t_PZH.
    t_dis is the same as t_PLZ and t_PHZ.
[3] Skew between any two outputs of the same package switching in the same direction. This parameter is guaranteed by design.
[4] C_PD is used to determine the dynamic power dissipation (P_D in μW).
    $P_D = C_{PD} \times V_{CC}^2 \times f_i \times N + \sum (C_L \times V_{CC}^2 \times f_o)$ where:
    f_i = input frequency in MHz; f_o = output frequency in MHz
    C_L = output load capacitance in pF
    V_CC = supply voltage in volts
    N = number of inputs switching
    $\sum (C_L \times V_{CC}^2 \times f_o)$ = sum of the outputs.
10.1. Waveforms and test circuit

11. Package outline
SO20: plastic small outline package; 20 leads; body width 7.5 mm; SOT163-1
Fig. 8. Package outline SOT163-1 (SO20)
SSOP20: plastic shrink small outline package; 20 leads; body width 5.3 mm; SOT339-1
Fig. 9. Package outline SOT339-1 (SSOP20)
TSSOP20: plastic thin shrink small outline package; 20 leads; body width 4.4 mm; SOT360-1
Fig. 10. Package outline SOT360-1 (TSSOP20)
DHVQFN20: plastic dual in-line compatible thermal enhanced very thin quad flat package; no leads;
Fig. 11. Package outline SOT764-1 (DHVQFN20)

12. Abbreviations

13. Revision history

14. Legal information
Data sheet status
[1] Please consult the most recently issued document before initiating or completing a design.
[2] The term 'short data sheet' is explained in section "Definitions".
[3] The product status of device(s) described in this document may have changed since this document was published and may differ in case of multiple devices. The latest product status information is available on the internet at https://.

Definitions
Draft: The document is a draft version only. The content is still under internal review and subject to formal approval, which may result in modifications or additions.
Nexperia does not give any representations or warranties as to the accuracy or completeness of information included herein and shall have no liability for the consequences of use of such information.
Short data sheet: A short data sheet is an extract from a full data sheet with the same product type number(s) and title. A short data sheet is intended for quick reference only and should not be relied upon to contain detailed and full information. For detailed and full information see the relevant full data sheet, which is available on request via the local Nexperia sales office. In case of any inconsistency or conflict with the short data sheet, the full data sheet shall prevail.
Product specification: The information and data provided in a Product data sheet shall define the specification of the product as agreed between Nexperia and its customer, unless Nexperia and customer have explicitly agreed otherwise in writing. In no event however, shall an agreement be valid in which the Nexperia product is deemed to offer functions and qualities beyond those described in the Product data sheet.

Disclaimers
Limited warranty and liability: Information in this document is believed to be accurate and reliable. However, Nexperia does not give any representations or warranties, expressed or implied, as to the accuracy or completeness of such information and shall have no liability for the consequences of use of such information.
Nexperia takes no responsibility for the content in this document if provided by an information source outside of Nexperia.
In no event shall Nexperia be liable for any indirect, incidental, punitive, special or consequential damages (including, without limitation, lost profits, lost savings, business interruption, costs related to the removal or replacement of any products or rework charges) whether or not such damages are based on tort (including negligence), warranty, breach of contract or any other legal theory.
Notwithstanding any damages that customer might incur for any reason whatsoever, Nexperia's aggregate and cumulative liability towards customer for the products described herein shall be limited in accordance with the Terms and conditions of commercial sale of Nexperia.
Right to make changes: Nexperia reserves the right to make changes to information published in this document, including without limitation specifications and product descriptions, at any time and without notice. This document supersedes and replaces all information supplied prior to the publication hereof.
Suitability for use: Nexperia products are not designed, authorized or warranted to be suitable for use in life support, life-critical or safety-critical systems or equipment, nor in applications where failure or malfunction of a Nexperia product can reasonably be expected to result in personal injury, death or severe property or environmental damage. Nexperia and its suppliers accept no liability for inclusion and/or use of Nexperia products in such equipment or applications and therefore such inclusion and/or use is at the customer's own risk.
Quick reference data: The Quick reference data is an extract of the product data given in the Limiting values and Characteristics sections of this document, and as such is not complete, exhaustive or legally binding.
Applications: Applications that are described herein for any of these products are for illustrative purposes only.
Nexperia makes no representation or warranty that such applications will be suitable for the specified use without further testing or modification.
Customers are responsible for the design and operation of their applications and products using Nexperia products, and Nexperia accepts no liability for any assistance with applications or customer product design. It is customer's sole responsibility to determine whether the Nexperia product is suitable and fit for the customer's applications and products planned, as well as for the planned application and use of customer's third party customer(s). Customers should provide appropriate design and operating safeguards to minimize the risks associated with their applications and products. Nexperia does not accept any liability related to any default, damage, costs or problem which is based on any weakness or default in the customer's applications or products, or the application or use by customer's third party customer(s). Customer is responsible for doing all necessary testing for the customer's applications and products using Nexperia products in order to avoid a default of the applications and the products or of the application or use by customer's third party customer(s). Nexperia does not accept any liability in this respect.
Limiting values: Stress above one or more limiting values (as defined in the Absolute Maximum Ratings System of IEC 60134) will cause permanent damage to the device. Limiting values are stress ratings only and (proper) operation of the device at these or any other conditions above those given in the Recommended operating conditions section (if present) or the Characteristics sections of this document is not warranted.
Constant or repeated exposure to limiting values will permanently and irreversibly affect the quality and reliability of the device.
Terms and conditions of commercial sale: Nexperia products are sold subject to the general terms and conditions of commercial sale, as published at /profile/terms, unless otherwise agreed in a valid written individual agreement. In case an individual agreement is concluded only the terms and conditions of the respective agreement shall apply. Nexperia hereby expressly objects to applying the customer's general terms and conditions with regard to the purchase of Nexperia products by customer.
No offer to sell or license: Nothing in this document may be interpreted or construed as an offer to sell products that is open for acceptance or the grant, conveyance or implication of any license under any copyrights, patents or other industrial or intellectual property rights.
Export control: This document as well as the item(s) described herein may be subject to export control regulations. Export might require a prior authorization from competent authorities.
Non-automotive qualified products: Unless this data sheet expressly states that this specific Nexperia product is automotive qualified, the product is not suitable for automotive use. It is neither qualified nor tested in accordance with automotive testing or application requirements.
Nexperia accepts no liability for inclusion and/or use of non-automotive qualified products in automotive equipment or applications.
In the event that customer uses the product for design-in and use in automotive applications to automotive specifications and standards, customer (a) shall use the product without Nexperia's warranty of the product for such automotive applications, use and specifications, and (b) whenever customer uses the product for automotive applications beyond Nexperia's specifications such use shall be solely at customer's own risk, and (c) customer fully indemnifies Nexperia for any liability, damages or failed product claims resulting from customer design and use of the product for automotive applications beyond Nexperia's standard warranty and Nexperia's product specifications.
Translations: A non-English (translated) version of a document is for reference only. The English version shall prevail in case of any discrepancy between the translated and English versions.

Trademarks
Notice: All referenced brands, product names, service names and trademarks are the property of their respective owners.

Contents
1. General description
2. Features and benefits
3. Ordering information
4. Functional diagram
5. Pinning information
5.1. Pinning
5.2. Pin description
6. Functional description
7. Limiting values
8. Recommended operating conditions
9. Static characteristics
10. Dynamic characteristics
10.1. Waveforms and test circuit
11. Package outline
12. Abbreviations
13. Revision history
14. Legal information

© Nexperia B.V. 2018. All rights reserved
For more information, please visit:
For sales office addresses, please send an email to: ***************************
Date of release: 11 September 2018
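The dynamic power dissipation formula in note [4] to Table 7 of the data sheet can be evaluated directly, because the units it prescribes (pF, V, MHz) multiply out to μW. The following is only a minimal sketch of that arithmetic. The function name `dynamic_power_uw` and every numeric value in the example are hypothetical; in particular the C_PD figure is a placeholder and is not taken from the data sheet tables, which are not reproduced here.

```python
def dynamic_power_uw(c_pd_pf, v_cc, f_in_mhz, n_switching, loads):
    """Dynamic power dissipation P_D in microwatts.

    c_pd_pf     -- power dissipation capacitance C_PD in pF
    v_cc        -- supply voltage V_CC in volts
    f_in_mhz    -- input frequency f_i in MHz
    n_switching -- number of inputs switching (N)
    loads       -- iterable of (C_L in pF, f_o in MHz), one pair per switching output
    """
    # Internal term: C_PD x V_CC^2 x f_i x N
    internal = c_pd_pf * v_cc ** 2 * f_in_mhz * n_switching
    # Load term: sum over outputs of C_L x V_CC^2 x f_o
    external = sum(c_l * v_cc ** 2 * f_o for c_l, f_o in loads)
    return internal + external


if __name__ == "__main__":
    # Hypothetical operating point: all 8 channels toggling at 10 MHz into 50 pF.
    # C_PD = 15 pF is an illustrative placeholder, not a data sheet value.
    p_d = dynamic_power_uw(15.0, 3.3, 10.0, 8, [(50.0, 10.0)] * 8)
    print(f"P_D = {p_d / 1000:.1f} mW")
```

With a real design, C_PD should be read from the dynamic characteristics table for the supply voltage in use, and the load list should contain one entry per output that actually switches.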
License-free ISM bands by country
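"Radio Frequency Allocation Regulations of the People's Republic of China", 2006 edition. The following bands: 6 765-6 795 kHz (centre frequency 6 780 kHz), 433.05-434.79 MHz (centre frequency 433.92 MHz) in Region 1 except in the countries listed in No. 5.280, 61-61.5 GHz (centre frequency 61.25 GHz), 122-123 GHz (centre frequency 122.5 GHz), and 244-246 GHz (centre frequency 245 GHz)
are designated for industrial, scientific and medical (ISM) use, subject to special authorization by the administration concerned, given after agreement with the administrations whose radiocommunication services might be affected.
In applying this provision, administrations shall take the latest relevant ITU-R Recommendations into account.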
European ISM standards:
1. ETSI: Draft ETSI EN 300 220-2 V2.2.1 (2008-04) specifies the frequency bands, as in the following table.
ERC Recommendation 70-03, "relating to the use of short range devices (SRD)", states: The bands in Annex 1 a - b - c - d f - f1 - f2 - h - i - j - k - l and m are also designated for industrial, scientific and medical (ISM)
Looking up Annex 1 a - b - c - d f - f1 - f2 - h - i - j - k - l and m gives the following:
6765-6795 kHz
13.553-13.567 MHz
26.957-27.283 MHz
40.660-40.700 MHz
433.050-434.790 MHz
2400.0-2483.5 MHz
5725-5875 MHz
61.0-61.5 GHz
122-123 GHz
244-246 GHz
US standard: Operation within the bands 902 - 928 MHz, 2400 - 2483.5 MHz, 5725 - 5875 MHz, and 24.0 - 24.25 GHz.
ITU standard.
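The band lists above lend themselves to a simple lookup. The sketch below checks whether a carrier frequency falls inside one of the quoted ranges. It is illustrative only: the names `Band`, `EUROPE_ISM`, `US_BANDS` and `find_band` are hypothetical, and the tables merely mirror the ranges quoted in this note, so they say nothing about power limits, duty cycle or national exceptions.

```python
from typing import List, NamedTuple, Optional


class Band(NamedTuple):
    low_hz: int
    high_hz: int
    label: str


# Ranges copied from the European list above, stored as integer hertz
# so that comparisons at the band edges are exact.
EUROPE_ISM: List[Band] = [
    Band(6_765_000, 6_795_000, "6765-6795 kHz"),
    Band(13_553_000, 13_567_000, "13.553-13.567 MHz"),
    Band(26_957_000, 27_283_000, "26.957-27.283 MHz"),
    Band(40_660_000, 40_700_000, "40.660-40.700 MHz"),
    Band(433_050_000, 434_790_000, "433.050-434.790 MHz"),
    Band(2_400_000_000, 2_483_500_000, "2400.0-2483.5 MHz"),
    Band(5_725_000_000, 5_875_000_000, "5725-5875 MHz"),
    Band(61_000_000_000, 61_500_000_000, "61.0-61.5 GHz"),
    Band(122_000_000_000, 123_000_000_000, "122-123 GHz"),
    Band(244_000_000_000, 246_000_000_000, "244-246 GHz"),
]

# Ranges copied from the US sentence above.
US_BANDS: List[Band] = [
    Band(902_000_000, 928_000_000, "902-928 MHz"),
    Band(2_400_000_000, 2_483_500_000, "2400-2483.5 MHz"),
    Band(5_725_000_000, 5_875_000_000, "5725-5875 MHz"),
    Band(24_000_000_000, 24_250_000_000, "24.0-24.25 GHz"),
]


def find_band(freq_hz: int, bands: List[Band]) -> Optional[str]:
    """Return the label of the first band containing freq_hz, or None."""
    for band in bands:
        if band.low_hz <= freq_hz <= band.high_hz:
            return band.label
    return None


if __name__ == "__main__":
    for f in (433_920_000, 915_000_000, 2_450_000_000):
        print(f, "EU:", find_band(f, EUROPE_ISM), "| US:", find_band(f, US_BANDS))
```

A 915 MHz carrier, for example, matches the US list but none of the European ranges quoted here, which is the kind of regional difference the note is pointing at.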
Introduction to gene knockout in Escherichia coli with the CRISPR-Cas9 system
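1 Research progress on the CRISPR-Cas system
CRISPR (clustered regularly interspaced short palindromic repeats) was first discovered in 1987 during a study of the alkaline phosphatase gene of E. coli [1].
CRISPR loci were subsequently found in large numbers in the genomes of bacteria and archaea, and studies confirmed that they protect the host against invading viruses and plasmids [2]. The mechanism relies on crRNA (CRISPR RNA) and tracrRNA (trans-activating crRNA) pairing and guiding Cas proteins to degrade foreign DNA specifically [3].
Three types of CRISPR-Cas system have been identified: type I, type II and type III. Type II is the simplest and requires only one Cas protein: the core protein Cas9 is guided by RNA to recognize and cleave the target sequence, producing a DNA double-strand break [2].
Inspired by natural CRISPR-Cas systems, researchers have engineered and exploited mainly the type II CRISPR-Cas system from Streptococcus pyogenes. It has been developed into a new gene-editing technology that supports gene knockout, insertion, site-directed mutagenesis and combinatorial editing [4], and it has been applied successfully in E. coli, Saccharomyces cerevisiae, silkworm, Drosophila and human cells, among others [5].
Compared with traditional gene-editing techniques, the new technology is low in cost, simple to operate and highly efficient [6].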
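2 Composition and mechanism of the CRISPR-Cas system
A typical type II CRISPR-Cas locus comprises three parts: the tracrRNA gene, the Cas protein-coding genes (cas9, cas1, cas2 and csn2), and the CRISPR array (leader sequence, spacers and repeats) [6].
The mechanism of the type II CRISPR-Cas system can be divided into three stages: first, acquisition of highly variable spacers (Figure 1); second, expression of the CRISPR-Cas locus; third, degradation of foreign genetic material [6] (Figure 2).
The Cas1, Cas2 and Csn2 proteins are involved in the acquisition of new spacers.
On the foreign genetic material, the protospacer that is homologous to a spacer is followed immediately downstream by a conserved sequence called the PAM (protospacer adjacent motif) [7].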
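The protospacer/PAM relationship described above can be illustrated with a short scan of a DNA string. This is a rough sketch under stated assumptions, none of which come from the text above: the PAM is taken to be 5'-NGG-3' (the motif commonly cited for S. pyogenes Cas9), the protospacer length is taken to be 20 nt, and only the strand given is searched. The function name `find_protospacers` is hypothetical.

```python
import re
from typing import List, Tuple


def find_protospacers(seq: str, spacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (start, protospacer, pam) for every site on the given strand.

    Assumes an NGG PAM immediately downstream (3') of the protospacer.
    The reverse-complement strand is not searched.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    demo = "ACGT" * 5 + "AGG" + "TT"  # made-up sequence for illustration
    for start, protospacer, pam in find_protospacers(demo):
        print(start, protospacer, pam)
```

A practical guide-design tool would also scan the reverse complement and score off-target matches; this sketch shows only the adjacency rule.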
Unit 9 courseware: How_I_Found_My_Voice
• Background Information: Star Wars is a science fantasy saga and fictional universe created by writer/producer/director George Lucas during the 1970s. The saga began with the film Star Wars (later retitled Star Wars Episode IV: A New Hope), which was released on May 25, 1977 by 20th Century Fox.
• Six feature films comprise the Star Wars film series. These films are generally split into two trilogies: the "original trilogy" of Episodes IV-VI (released between 1977 and 1983) and the "prequel trilogy" of Episodes I-III (released between 1999 and 2005).
• Darth Vader is the central antagonist in George Lucas's first three Star Wars films and Revenge of the Sith, voiced by James Earl Jones and portrayed physically by David Prowse in the original Star Wars trilogy and by Canadian actor Hayden Christensen in Star Wars Episode III: Revenge of the Sith. Vader is one of the most iconic villains of all time.
English Literature Appreciation 1: English Novels
The Old Man and The Sea
Major themes
Success: Hemingway draws a distinction between two different types of success: outer, material success and inner, spiritual success. One way to describe Santiago's story is as a triumph of indefatigable (unyielding) spirit over exhaustible material resources.
Worthiness: One must constantly demonstrate one's heroism and manliness through actions conducted with dignity. A heroic and manly life is not, then, one of inner peace and self-sufficiency; it requires constant demonstration of one's worthiness through noble action.
Main Terms of English Novels
Character Protagonist Antagonist Climax Conflict Diction Flashback
How to Appreciate Novels
We must grasp…
1. The five basic elements:
who what where when how why
Classic pull-down buffer recipes
Arabidopsis EPSIN1 Plays an Important Role in Vacuolar Trafficking of Soluble Cargo Proteins in Plant Cells via Interactions with Clathrin, AP-1, VTI11, and VSR1 (W)

Jinhee Song, Myoung Hui Lee, Gil-Je Lee, Cheol Min Yoo, and Inhwan Hwang (1)

Division of Molecular and Life Sciences and Center for Plant Intracellular Trafficking, Pohang University of Science and Technology, Pohang 790-784, Korea

Epsin and related proteins play important roles in various steps of protein trafficking in animal and yeast cells. Many epsin homologs have been identified in plant cells from analysis of genome sequences. However, their roles have not been elucidated. Here, we investigate the expression, localization, and biological role in protein trafficking of an epsin homolog, Arabidopsis thaliana EPSIN1, which is expressed in most tissues we examined. In the cell, one pool of EPSIN1 is associated with actin filaments, producing a network pattern, and a second pool localizes primarily to the Golgi complex with a minor portion to the prevacuolar compartment, producing a punctate staining pattern. Protein pull-down and coimmunoprecipitation experiments reveal that Arabidopsis EPSIN1 interacts with clathrin, VTI11, γ-adaptin-related protein (γ-ADR), and vacuolar sorting receptor 1 (VSR1). In addition, EPSIN1 colocalizes with clathrin and VTI11. The epsin1 mutant, which has a T-DNA insertion in EPSIN1, displays a defect in the vacuolar trafficking of sporamin:green fluorescent protein (GFP), but not in the secretion of invertase:GFP into the medium. Stably expressed HA:EPSIN1 complements this trafficking defect. Based on these data, we propose that EPSIN1 plays an important role in the vacuolar trafficking of soluble proteins at the trans-Golgi network via its interaction with γ-ADR, VTI11, VSR1, and clathrin.

INTRODUCTION

After translation in eukaryotic cells, a large number of proteins are transported to subcellular compartments by a variety of different mechanisms. Newly synthesized vacuolar proteins that are delivered to the endoplasmic reticulum (ER) by
the cotranslational translocation mechanism are transported to the vacuole from the ER by a process called intracellular trafficking. Trafficking of a protein to the vacuole from the ER occurs through two organelles, the Golgi complex and the prevacuolar compartment (PVC) (Rothman, 1994; Hawes et al., 1999; Bassham and Raikhel, 2000; Griffiths, 2000). Transport of a protein from the ER to the Golgi complex is performed by coat protein complex II vesicles. Transport from the trans-Golgi network (TGN) to the PVC occurs via clathrin-coated vesicles (CCVs) (Robinson et al., 1998; Tang et al., 2005; Yang et al., 2005).

Transport of a protein from the ER to the vacuole/lysosome requires a large number of proteins, including components of vesicles, factors involved in vesicle generation and fusion, regulators of intracellular trafficking, adaptors for the cargo proteins, and other accessory proteins (Robinson and Kreis, 1992; Bennett, 1995; Schekman and Orci, 1996; da Silva Conceição et al., 1997; Kirchhausen, 1999; Sever et al., 1999; Bassham and Raikhel, 2000; Griffiths, 2000; Jin et al., 2001; Robinson and Bonifacino, 2001). Most of these proteins are found in all eukaryotic cells from yeast, animals, and plants, suggesting that protein trafficking mechanisms from the ER to the vacuole/lysosome may be highly conserved in all eukaryotic cells.

Of the large number of proteins involved in intracellular trafficking, a group of proteins that have the highly conserved epsin N-terminal homology (ENTH) domain have been identified as playing a critical role at various trafficking steps in animal and yeast cells (Chen et al., 1998; De Camilli et al., 2002; Wendland, 2002; Overstreet et al., 2003; Legendre-Guillemin et al., 2004).
The ENTH domain binds to phosphatidylinositols (PtdIns), although the lipid binding specificity differs with individual members of the epsin family. For example, epsin1 binds to PtdIns(4,5)P2, whereas EpsinR and Ent3p bind to PtdIns(4)P and PtdIns(3,5)P2, respectively (Itoh et al., 2001). The ENTH domain is thought to be responsible for targeting these proteins to specific compartments and also for introducing curvature to the bound membranes to assist in the generation of CCVs (Legendre-Guillemin et al., 2004). However, the exact steps of intracellular trafficking in which ENTH-containing proteins play a role are complex. Epsin homologs can be divided into two groups based on the pathway in which they play a role. One group, which includes epsin1 in animal cells and Ent1p and Ent2p in yeast cells, is involved in endocytosis from the plasma membrane (Chen et al., 1998; De Camilli et al., 2002; Wendland, 2002). The other group, which includes EpsinR/clint/enthoprotin in animal cells and Ent3p and Ent4p in yeast cells, is involved in protein trafficking from the TGN to the lysosome/vacuole as well as retrograde trafficking from the early endosomes to the TGN (Kalthoff et al., 2002; Wasiak et al., 2002; Hirst et al., 2003; Chidambaram et al., 2004; Eugster et al., 2004; Saint-Pol et al., 2004).

(1) To whom correspondence should be addressed. E-mail ihhwang@postech.ac.kr; fax 82-54-279-8159.
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors () is: Inhwan Hwang (ihhwang@postech.ac.kr).
(W) Online version contains Web-only data.
/cgi/doi/10.1105/tpc.105.039123
The Plant Cell, Vol. 18, 2258-2274, September 2006, © 2006 American Society of Plant Biologists

Another common feature of epsin-related proteins is that they play a role in CCV-mediated protein trafficking at both the TGN and the plasma membrane. These proteins can bind directly to clathrin through their multiple clathrin binding motifs; thus, they may
recruit clathrin to the plasma membrane or the TGN to generate CCVs (Rosenthal et al., 1999; Wendland et al., 1999; Drake et al., 2000). In addition, these proteins interact with many other proteins, such as heterotetrameric clathrin adaptor complexes (APs), monomeric adaptor Golgi-localized, γ-ear-containing Arf binding proteins (GGAs), and soluble NSF attachment protein receptors (SNAREs). Epsin1 interacts with AP-2, Epsin15, and intersectin (Chen et al., 1998; Legendre-Guillemin et al., 2004), whereas EpsinR/enthoprotin/clint and Ent3p interact with SNAREs such as vti1b and vti1p, respectively (Chidambaram et al., 2004) and with adaptor proteins such as GGAs and AP-1 (Duncan et al., 2003; Mills et al., 2003). In addition, epsin homologs have ubiquitin-interacting motifs and are ubiquitinated (Oldham et al., 2002; Shih et al., 2002). Protein ubiquitination acts as a signal for endocytosis from the plasma membrane and trafficking from the TGN through the endosome/PVC to the lysosome/vacuole (Polo et al., 2002; Horak, 2003; Raiborg et al., 2003; Scott et al., 2004). The binding of epsin homologs to ubiquitin raises the possibility that epsin homologs may bind directly to cargo proteins that are destined for the vacuole/lysosome from either the plasma membrane or the TGN (Chen and De Camilli, 2005; Sigismund et al., 2005).

In plant cells, sequence analysis of the entire Arabidopsis thaliana genome reveals several proteins with the highly conserved ENTH domains (Holstein and Oliviusson, 2005). However, their biological roles have not been addressed. In this study, we investigate the functional role of EPSIN1, an Arabidopsis epsin homolog, at the molecular level. In particular, we focus on its possible role in protein trafficking in plant cells. We demonstrate that EPSIN1 interacts with clathrin, AP-1, VSR1, and VTI11 and plays an important role in the vacuolar trafficking of a soluble protein from the Golgi complex to the central vacuole.

RESULTS

EPSIN1, a Member of the Epsin Family, Is Ubiquitously Expressed in Arabidopsis

The
Arabidopsis genome encodes three highly similar epsin-related proteins, EPSIN1, EPSIN2, and EPSIN3 (Holstein and Oliviusson, 2005). In this study, we investigated the biological role of EPSIN1. EPSIN1 has the highly conserved ENTH domain at the N terminus. However, the rest of the molecule is less similar to other epsin-related proteins, although it has motifs, such as LIDL and DPF, that may function as clathrin and AP-1 binding motifs, respectively. To understand the biological role of EPSIN1, its expression in various plant tissues was examined. An antibody was raised against the middle domain of EPSIN1 (amino acid residues 153 to 337). The antibody recognized a protein band at 90 kD, which was much larger than the expected size, 60 kD, of EPSIN1 (Figure 1A). It was shown previously that epsin-related proteins migrate slower than expected in SDS-PAGE (Chen et al., 1998). The control serum did not recognize any protein bands. This result suggested that the antibody specifically recognized EPSIN1. To confirm this, protoplasts were transformed with EPSIN1 tagged with HA at the N terminus (HA:EPSIN1) and protein extracts from the transformed protoplasts were analyzed by protein gel blotting using anti-HA and anti-EPSIN1 antibodies. The anti-HA antibody specifically recognized a protein band from the transformed protoplasts, but not from the untransformed protoplasts, at 90 kD (Figure 1B). In addition, the 90-kD protein species was recognized by the anti-EPSIN1 antibody, confirming that the 90-kD band was EPSIN1. The expression of EPSIN1 in various tissues was examined using the anti-EPSIN1 antibody. Protein extracts were prepared from various tissues at different stages of plant growth and used for protein gel blot analysis. EPSIN1 was expressed in all of the tissues examined, with the highest expression in cotyledons and flowers (Figure 1C).
EPSIN1 Produces Both Network and Punctate Staining Patterns

To examine the subcellular distribution of EPSIN1, total protein extracts from leaf tissues were separated into soluble and membrane fractions and analyzed by protein gel blotting using anti-EPSIN1 antibody. EPSIN1 was detected in both membrane (pellet) and soluble fractions (Figure 2A). As controls for the fractionation, Arabidopsis aleurain-like protease (AALP) and Arabidopsis vacuolar sorting receptor (VSR) were detected with anti-AALP and anti-VSR antibodies, respectively (Sohn et al., 2003). AALP is a soluble protein present in the vacuolar lumen, and VSR is a membrane protein that is localized primarily to the PVC with a minor portion to the Golgi complex (da Silva Conceição et al., 1997; Ahmed et al., 2000). As expected, AALP and VSR were detected in the supernatant and pellet fractions, respectively. These results indicated that EPSIN1 localized to multiple locations, consistent with the behavior of other epsin-related proteins (Legendre-Guillemin et al., 2004).

Next, we defined the subcellular localization of EPSIN1. Our initial attempts to localize the endogenous EPSIN1 with the anti-EPSIN1 antibody failed. Thus, we determined the localization of EPSIN1 protein transiently expressed in protoplasts. EPSIN1 was tagged with the HA epitope, green fluorescent protein (GFP), or red fluorescent protein (RFP). The amount of total EPSIN1 protein was determined using various amounts of HA:EPSIN1 plasmid DNA by protein gel blot analysis with anti-EPSIN1 antibody and was found to be proportional to the amount of plasmid used (Figure 2B). For the localization, we used a minimal amount (5 to 10 μg) of EPSIN1 plasmid DNAs. Protoplasts were transformed with HA:EPSIN1, and localization of EPSIN1 was determined by immunostaining with anti-HA antibody. HA:EPSIN1 produced primarily a punctate staining pattern (Figure 2Ca). In addition to punctate stains, we occasionally observed weakly stained strings that connected punctate stains (Figure 2Cc, arrowheads). By contrast, the
nontransformed controls did not produce any patterns (Figure 2Ce). In protoplasts transformed with EPSIN1:GFP and EPSIN1:RFP, both EPSIN1 fusion proteins produced a network pattern with punctate stains (Figures 2Cg and 2Ch), whereas GFP and RFP alone produced diffuse patterns (Figures 2Dh and 2Di), indicating that EPSIN1 produces the network pattern with punctate stains. These results were further confirmed by cotransforming the protoplasts with either EPSIN1:GFP and HA:EPSIN1 or EPSIN1:GFP and EPSIN1:RFP. The punctate staining pattern of EPSIN1:GFP closely overlapped that of HA:EPSIN1 (Figures 2Da to 2Dc). In addition, the network and punctate staining patterns of EPSIN1:GFP closely overlapped those of EPSIN1:RFP (Figures 2De to 2Dg). However, the fine networks revealed by EPSIN1:GFP in the live protoplasts were nearly absent in the fixed protoplasts. Thus, the differences in the staining patterns between fixed and live protoplasts may be attributable to the fact that the network pattern of live protoplasts is not well preserved under the fixing conditions used. In addition, the strings occasionally observed in the fixed protoplasts may represent the remnants of the network pattern revealed by HA:EPSIN1. These results strongly suggest that EPSIN1 is responsible for the network pattern as well as the punctate stains.

The network pattern was reminiscent of the ER or actin pattern in plant cells (Boevink et al., 1998; Jin et al., 2001; Kim et al., 2005), whereas the punctate staining pattern suggested that EPSIN1 may localize to the Golgi complex or endosomes, as observed previously with epsin homologs in animal and yeast cells (Wasiak et al., 2002; Chidambaram et al., 2004; Saint-Pol et al., 2004). Therefore, protoplasts were cotransformed with EPSIN1:RFP and GFP:talin, a marker for actin filaments consisting of GFP and the actin binding domain of mouse talin (Kost et al., 1998; Kim et al., 2005). As expected, GFP:talin produced the network pattern (Figure 3A) (Kost et
al., 1998; Kim et al., 2005). Furthermore, the red fluorescent network pattern of EPSIN1:RFP closely overlapped the green fluorescent network pattern of GFP:talin (Figure 3A), raising the possibility that EPSIN1:GFP bound to the actin filaments rather than to the ER. To confirm this, the EPSIN1:RFP pattern was examined after treatment with latrunculin B (Lat B), a chemical agent known to disrupt actin filaments (Spector et al., 1983). Lat B-treated protoplasts produced the diffuse green fluorescent pattern of GFP:talin (Figure 3A), an indication of solubilized actin filaments, as observed previously (Kim et al., 2005). In addition, the Lat B-treated protoplasts displayed a diffuse red fluorescent pattern of EPSIN1:RFP (Figure 3A), indicating that EPSIN1 is associated with actin filaments but not with the ER. Furthermore, the punctate staining pattern of EPSIN1:RFP also was not observed in the presence of Lat B, indicating that actin filaments played a role in yielding the punctate staining pattern of EPSIN1. In the same conditions, BiP:GFP, an ER marker (Lee et al., 2002), produced a network pattern, indicating that Lat B does not disrupt the ER network patterns (Figure 3Ai).

Figure 1. EPSIN1 Is Expressed in Various Arabidopsis Tissues.
(A) Generation of anti-EPSIN1 antibody. The middle domain, corresponding to amino acid residues 153 to 337, was expressed as the His×6-tagged form in E. coli and used to raise antibody in a rabbit. Control serum was obtained from the rabbit before immunization. Total protein extracts were obtained from leaf tissues and used to test the anti-EPSIN1 antibody.
(B) Specificity of the anti-EPSIN1 antibody. Protein extracts were obtained from protoplasts expressing EPSIN1 tagged with the HA epitope at the N terminus and used for protein gel blot analysis using anti-HA and anti-EPSIN1 antibodies.
(C) Expression of EPSIN1 in various tissues. Total protein extracts from the indicated tissues were analyzed by protein gel blotting using anti-EPSIN1 antibody. Leaf tissues were harvested 11 and 20 d after germination. Cotyledons were obtained from 5-d-old plants. The membranes were stained with Coomassie blue to control for protein loading. RbcL, large subunit of the ribulose-1,5-bis-phosphate carboxylase/oxygenase (Rubisco) complex.

Figure 2. EPSIN1 Produces Both Network and Punctate Staining Patterns.
(A) Subcellular fractionation of EPSIN1. Total (T) protein extracts of leaf tissues were separated into soluble (S) and pellet (P) fractions and analyzed by protein gel blotting using anti-EPSIN1, anti-AALP, and anti-VSR antibodies.
(B) Expression level of EPSIN1 in transformed protoplasts. Protoplasts were transformed with various amounts of HA:EPSIN1 DNA, and the level of EPSIN1 was determined by protein gel blotting with anti-EPSIN1 antibody. Protein extracts from untransformed protoplasts were used as a control. The membrane was also stained with Coomassie blue to control for loading.
(C) Localization of EPSIN1. Protoplasts were transformed with the indicated constructs (5 to 10 μg), and the localization of EPSIN1 was examined either by immunostaining with anti-HA antibody or by direct detection of the GFP or RFP signal. Untransformed protoplasts were immunostained with anti-HA antibody as a control. Bars = 20 μm.
(D) Colocalization of EPSIN1 proteins. The localization of EPSIN1 protein was examined in protoplasts transformed with HA:EPSIN1 and EPSIN1:GFP or with EPSIN1:GFP and EPSIN1:RFP. As controls, GFP and RFP alone were transformed into protoplasts. Bars = 20 μm.

[Annotation: subcellular localization can be examined by fluorescence observation or by western blot detection.]

Figure 3. Localization of EPSIN1 in Protoplasts.

To identify the organelle responsible for the punctate staining pattern of EPSIN1, its localization was compared with that of ST:GFP and PEP12p/SYP21. ST:GFP, a chimeric protein between rat sialyltransferase and GFP, localizes to the Golgi complex, and PEP12p, a t-SNARE, localizes to the PVC (da Silva Conceição et al., 1997; Boevink et al., 1998; Jin et al., 2001).
Protoplasts were cotransformed with HA:EPSIN1 and ST:GFP. The localization of these proteins was examined after staining with anti-HA antibody. ST:GFP was observed directly with the green fluorescent signals. A major portion of the HA:EPSIN1-positive punctate stains closely overlapped with those of ST:GFP (Figures 3Ba to 3Bc). To further confirm the Golgi localization of HA:EPSIN1, protoplasts transformed with HA:EPSIN1 were treated with brefeldin A (BFA), a chemical known to disrupt the Golgi complex (Driouich et al., 1993), and the localization of HA:EPSIN1 was examined. In the presence of BFA, HA:EPSIN1 yielded a largely diffuse pattern with aggregates, but not the punctate staining pattern, indicating that BFA affects EPSIN1 localization (Figure 3Be). In the same conditions, ST:GFP produced a network pattern with large aggregates (Figure 3Bg), confirming that the Golgi complex was disrupted. These results support the notion that EPSIN1 localizes to the Golgi complex.

Next, we examined the possibility of EPSIN1 localizing to the PVC. Protoplasts were cotransformed with EPSIN1:GFP and PEP12p:HA. The localization of PEP12p:HA was examined after staining with anti-HA antibody. EPSIN1:GFP was observed directly with the green fluorescent signals. Only a minor portion of the EPSIN1:GFP-positive punctate stains overlapped with the PEP12p:HA-positive punctate stains (Figures 3Bi to 3Bk, arrows). These results indicated that EPSIN1 localized primarily to the Golgi complex with a minor portion to the PVC.

To obtain independent evidence for the localization, we examined the colocalization of EPSIN1 with VTI11, a v-SNARE that is distributed equally to both the TGN and the PVC (Zheng et al., 1999; Bassham et al., 2000; Kim et al., 2005). Protoplasts were cotransformed with EPSIN1:GFP and VTI11:HA, and the localization of these proteins was examined by immunostaining with anti-HA antibody. EPSIN1-positive punctate stains largely colocalized with those of VTI11:HA (Figures 3Bm to 3Bo), confirming that EPSIN1 localizes to both
the Golgi complex and the PVC.

EPSIN1 Binds to and Colocalizes with Clathrin

The members of the epsin family have two clathrin binding motifs (Rosenthal et al., 1999; Wendland et al., 1999; Drake et al., 2000). Sequence analysis indicated that EPSIN1 has a potential clathrin binding motif. To explore the possibility that EPSIN1 binds to clathrin, glutathione S-transferase–fused EPSIN1 (GST:EPSIN1) was constructed for a protein pull-down assay (Figure 4A). GST:EPSIN1 was expressed in Escherichia coli and purified from E. coli extracts (Figure 4B). The purified GST:EPSIN1 was mixed with protein extracts obtained from leaf tissues. Proteins pelleted with glutathione–agarose were analyzed by protein gel blotting using anti-clathrin antibody. GST:EPSIN1, but not GST alone, precipitated from the plant extracts a 180-kD protein species that was recognized by anti-clathrin antibody (Figure 4C), indicating that EPSIN1 bound to clathrin.

To further examine its binding to clathrin, EPSIN1 was divided into two regions, the ENTH and the remainder of the molecule (EPSIN1ΔN) (Figure 4A). These regions were expressed in E. coli as GST fusion proteins, GST:ENTH and GST:EPSIN1ΔN, respectively (Figure 4B). Protein pull-down experiments using leaf cell extracts were performed with purified GST:ENTH and GST:EPSIN1ΔN. GST:EPSIN1ΔN, but not GST:ENTH, precipitated clathrin from the plant extracts (Figure 4C). To identify the clathrin binding motif, the C-terminal region containing the putative clathrin binding motif, LIDL (Lafer, 2002), as well as GST:RIDL, which contained an Arg substitution of the first Leu residue in the motif, were expressed as GST fusion proteins in E. coli (Figures 4A and 4B). GST:LIDL, but not GST:RIDL, precipitated clathrin from protein extracts (Figure 4C), indicating that the LIDL motif functioned as a clathrin binding motif.

The in vitro binding of EPSIN1 with clathrin strongly suggested that EPSIN1 was likely to colocalize with clathrin. Therefore, immunohistochemistry for the localization of EPSIN1 and clathrin was
performed. Protoplasts were transformed with HA:EPSIN1, and the localization of HA:EPSIN1 and clathrin was examined by staining with anti-HA and anti-clathrin antibodies, respectively. The anti-clathrin antibody produced a punctate staining pattern (Figure 4D). A majority (60 to 70%) of the HA:EPSIN1-positive punctate stains closely overlapped with a pool (40 to 50%) of clathrin-positive punctate stains (Figure 4D), consistent with an interaction between EPSIN1 and clathrin. There was also a pool of clathrin-positive punctate stains that lacked the HA:EPSIN1 signal, suggesting that clathrin also was involved in an EPSIN1-independent process.

To further characterize the interaction between EPSIN1 and clathrin, we examined whether or not EPSIN1 is permanently associated with CCVs. Protein extracts from leaf tissues were first separated into soluble and pellet fractions by ultracentrifugation. The pellet fraction was treated with Triton X-100 and further fractionated by gel filtration, and the fractions were analyzed by protein gel blotting using anti-clathrin, anti-EPSIN, and anti-VSR antibodies. Clathrin was detected in a peak between 443 and 669 kD (see Supplemental Figure 1 online). Interestingly, VSR, the vacuolar cargo receptor, was eluted at the same position with clathrin. By contrast, EPSIN1 was eluted at 90 kD.
These results suggest that EPSIN1 is not permanently associated with CCVs.

Figure 3. (continued).
(A) Colocalization of EPSIN1 with actin filaments. Protoplasts were transformed with the indicated constructs, and the localization of these proteins was examined in the presence (+Lat B) and absence (−Lat B) of Lat B (10 µM). Bars = 20 µm.
(B) Localization of EPSIN1 to the Golgi complex and the PVC. Protoplasts were transformed with the indicated constructs, and localization of the proteins was examined after immunostaining with anti-HA. The GFP signals were observed directly in the fixed protoplasts. For BFA treatment, BFA (30 µg/mL) was added to the transformed protoplasts at 24 h after transformation and incubated for 3 h. Arrows indicate the overlap between EPSIN1:GFP and PEP12p:HA. Bars = 20 µm.

Figure 4. EPSIN1 Binds to and Colocalizes with Clathrin.
(A) Constructs. GST was fused to the N terminus. ENTH, the epsin N-terminal homology domain. DLF and DPF motifs are similar to AP-1 and AP-3 binding motifs, respectively. Q11 indicates a stretch of 11 Glu residues. The clathrin binding motif (LIDL) and the Leu-to-Arg substitution in the clathrin binding motif (RIDL) are shown in the C-terminal region. The numbers indicate amino acid positions.
(B) Expression of GST-fused EPSIN1 proteins. Constructs were introduced into E. coli, and their expression was induced by isopropylthio-β-galactoside. GST fusion proteins were purified from E. coli extracts with glutathione–agarose beads. Purified proteins were stained with Coomassie blue.
(C) Interaction of EPSIN1 with clathrin. GST-fused EPSIN1 proteins were mixed with protein extracts from leaf tissues. EPSIN1 binding proteins were precipitated using glutathione–agarose beads and analyzed by protein gel blotting using anti-clathrin antibody. Supernatants also were included in the protein gel blot analysis. Subsequently, the membranes were stained with Coomassie blue. Bead, glutathione–agarose beads alone; P, pellet; S, supernatant (10% of total).
(D) Colocalization of EPSIN1 with clathrin. Protoplasts transformed with HA:EPSIN1 were fixed with paraglutaraldehyde, and the localization of HA:EPSIN1 and clathrin was examined by immunostaining with anti-HA and anti-clathrin antibodies, respectively. Bar = 20 µm.

EPSIN1 Interacts with VTI11

Epsin-related proteins in animal and yeast cells are involved in either endocytosis or vacuolar/lysosomal protein trafficking (Chen et al., 1998; De Camilli et al., 2002; Wendland, 2002; Overstreet et al., 2003; Legendre-Guillemin et al., 2004). To elucidate the pathway of EPSIN1 involvement, binding partners of EPSIN1 were examined. In animal and yeast cells, epsin-like proteins have been shown to interact with SNAREs (Chen et al., 1998; Chidambaram et al., 2004). Because EPSIN1 localized to the Golgi complex and the PVC, EPSIN1 interactions with Arabidopsis VTI11 and VTI12 (formerly AtVTI1a and AtVTI1b, respectively) were examined. VTI11 is a v-SNARE that localizes to the TGN and travels to the PVC (Zheng et al., 1999; Bassham et al., 2000).
VTI11 and VTI12 were tagged with HA at the C terminus and introduced into protoplasts. The expression of VTI11:HA and VTI12:HA in protoplasts was confirmed by protein gel blot analysis using anti-HA antibody. The anti-HA antibody detected protein bands at 33 and 35 kD (Figure 5A), the expected positions of VTI11:HA and VTI12:HA, respectively. Purified GST:EPSIN1 from E. coli extracts was mixed with plant extracts from the VTI11:HA- or VTI12:HA-transformed protoplasts, and GST:EPSIN1-bound proteins were precipitated from the mixture using glutathione–agarose beads. The pellet fraction was analyzed by protein gel blotting using anti-HA antibody. VTI11:HA, but not VTI12:HA, was detected from the pellet (Figure 5A). GST alone did not precipitate VTI11:HA from the plant extracts. These results indicated that although VTI11 and VTI12 are highly similar to each other, EPSIN1 specifically binds to VTI11:HA. To further confirm this interaction, we performed a reciprocal protein pull-down experiment (i.e., pull-down of EPSIN1 with VTI11) using protein extracts obtained from protoplasts transformed with VTI11:HA and EPSIN1:GFP. VTI11:HA-bound proteins were immunoprecipitated with anti-HA antibody, and the immunoprecipitates were analyzed by protein gel blotting using anti-HA, anti-GFP, and anti-calreticulin antibodies. Anti-calreticulin antibody was used as a negative control. In addition to VTI11:HA, EPSIN1:GFP was detected in the immunoprecipitates (Figure 5B). However, calreticulin was not detected in the pellet. These results further confirm the interaction between VTI11 and EPSIN1.

To determine the VTI11 binding domain of EPSIN1, protein pull-down experiments were performed using GST:ENTH and GST:EPSIN1ΔN. GST:ENTH, but not GST:EPSIN1ΔN, precipitated VTI11:HA from the plant extracts (Figure 5C), indicating that the ENTH domain contained the VTI11 binding motif. Similarly, in animal and yeast cells, EpsinR and Ent3p have been shown to bind to vti1b and vti1p, respectively (Chidambaram et al., 2004).
EPSIN1 Binds to the Arabidopsis Homolog of γ-Adaptin of AP-1

Epsin homologs bind to adaptor proteins (APs) (Duncan et al., 2003; Mills et al., 2003). In animal cells, EPSIN1 binds to the α-adaptin of AP-2 via the DΦF/W (where Φ indicates a hydrophobic amino acid) and FXDXF motifs (Figure 4A) (Brett et al., 2002). Arabidopsis EPSIN1 has three DPF motifs to which α-adaptin of AP-2 could bind. In addition, EPSIN1 has two regions with motifs similar to the acidic Phe motif for binding AP-1 and GGAs (Duncan et al., 2003). Therefore, the interactions of EPSIN1 with AP complexes were examined. We isolated the Arabidopsis proteins γ-adaptin related protein (γ-ADR), α-ADR, and δ-ADR, which were most closely related to γ-adaptin, α-adaptin, and δ-adaptin of AP-1, AP-2, and AP-3, respectively. These Arabidopsis proteins were tagged with GFP and expressed transiently in protoplasts. Protein extracts from the transformed protoplasts were mixed with purified GST:EPSIN1, and the GST:EPSIN1-bound proteins were precipitated. The pellet was analyzed by protein gel blotting using anti-GFP antibody. GFP:γ-ADR, but not α-ADR:GFP or δ-ADR:GFP, was detected in the pellet (Figure 6A). The control for the protein pull-down assay, GST alone, did not precipitate any of these proteins. These results strongly suggested that EPSIN1 interacts with γ-ADR specifically. To further confirm the interaction between EPSIN1 and γ-ADR, we performed a reciprocal protein pull-down experiment (i.e., pull down of EPSIN1 proteins with

Figure 5. EPSIN1 Binds to VTI11.
(A) Protein extracts were prepared from VTI11:HA- and VTI12:HA-transformed protoplasts and mixed with GST alone or GST:EPSIN1. EPSIN1-bound proteins were precipitated from the mixture with glutathione–agarose beads and analyzed by protein gel blotting using anti-HA antibody.
(B) Coimmunoprecipitation of EPSIN1:GFP with VTI11:HA. Protein extracts from protoplasts cotransformed with VTI11:HA and EPSIN1:GFP were used for immunoprecipitation with anti-HA antibody. The immunoprecipitates were analyzed by protein gel blotting with anti-HA, anti-GFP, and anti-calreticulin antibodies. P, immunoprecipitate; S, supernatant; T, total protein extracts (5% of the input).
(C) For binding experiments, protein extracts from protoplasts transformed with VTI11:HA were mixed with GST alone, GST:ENTH, and GST:EPSIN1ΔN. Proteins were precipitated with glutathione–agarose beads and analyzed by protein gel blotting using anti-HA antibody. The amount of the input proteins is indicated.
GridVis-Basic Power Analyzer Product Data Sheet
Network visualisation software
• GridVis®-Basic (in the scope of supply)

3 digital inputs/outputs
• Usable as either inputs or outputs
• Switch output
• Threshold value output
• Logic output
• Remote via Modbus / Profibus

Temperature measurement
• PT100, PT1000, KTY83, KTY84

Interfaces
• RS485
• Ethernet
• SNTP
• TFTP
• BACnet (optional)

Networks
• TN, TT, IT networks
• 3 and 4-phase networks
• Up to 4 single-phase networks

Measured data memory
• 256 MB Flash

• Harmonics up to 40th harmonic
• Rotary field components
• Distortion factor THD-U / THD-I

2 analogue inputs
• Analogue, temperature or residual current input (RCM)

Highlights: Residual current measurement · BACnet (optional) · Homepage · Alarm management · Memory 256 MB · Ethernet-Modbus gateway

Areas of application
• Measurement, monitoring and checking of electrical characteristics in energy distribution systems
• Recording of load profiles in energy management systems (e.g. ISO 50001)
• Acquisition of the energy consumption for cost centre analysis
• Measured value transducer for building management systems or PLC (Modbus)
• Monitoring of power quality characteristics, e.g. harmonics up to 40th harmonic
• Residual current monitoring (RCM)

Main features

Universal meter
• Operating current monitoring for general electrical parameters
• High transparency through a multi-stage and scalable measurement system in the field of energy measurement
• Acquisition of events through continuous measurement with 200 ms high resolution

RCM device
• Continuous monitoring of residual currents (Residual Current Monitor, RCM)
• Alarming in case a preset fault current threshold is reached
• Near-realtime reactions for triggering countermeasures
• Permanent RCM measurement for systems in permanent operation without the opportunity to switch off

Energy measurement device
• Continuous acquisition of the energy data and load profiles
• Essential both in relation to energy efficiency and for the safe design of power distribution systems

Harmonics analyser / event recorder
• Analysis of individual harmonics for current and voltage
• Prevention of production downtimes
• Significantly longer service life for equipment
• Rapid identification and analysis of power quality fluctuations by means of user-friendly tools (GridVis®)

Fig.: UMG 96RM-E with residual current monitoring via measuring inputs I5 / I6
Fig.: Event logger: Voltage dip in the low voltage distribution system

Extensive selection of tariffs
• 7 tariffs each for effective energy (consumption, delivery and without backstop)
• 7 tariffs each for reactive energy (inductive, capacitive and without backstop)
• 7 tariffs for apparent energy
• L1, L2 and L3, for each phase

Highest possible degree of reliability
• Continuous leakage current measurement
• Historical data: Long-term monitoring of the residual current allows changes to be identified in good time, e.g. insulation faults
• Time characteristics: Recognition of time relationships
• Prevention of neutral conductor carryover
• RCM threshold values can be optimized for each individual case: Fixed, dynamic and stepped RCM threshold value
• Monitoring of the CGP (central ground point) and the sub-distribution panels

Analysis of fault current events
• Event list with time stamp and values
• Presentation of fault currents with characteristic and duration
• Reproduction of phase currents during the fault current surge
• Presentation of the phase voltages during the fault current surge

Analysis of the harmonic fault current components
• Frequencies of the fault currents (fault type)
• Current peaks of the individual frequency components in A and %
• Harmonics analysis up to 40th harmonic
• Maximum values with real-time bar display

Digital IOs
• Extensive configuration of IOs for intelligent integration, alarm and control tasks

Fig.: Continuous leakage current measurement
Fig.: Analysis of fault current events
Fig.: Analysis of the harmonic fault current components

Dimension diagrams
All dimensions in mm. Views: side view, rear view. Cut-out: 92 (+0.8) x 92 (+0.8) mm.

Ethernet (TCP/IP) / Homepage / Ethernet-Modbus gateway functionality
• Simple integration into the network
• More rapid and reliable data transfer
• Modern homepage
• World-wide access to measured values by means of standard web browsers via the device's inbuilt homepage
• Access to measurement data via various channels
• Reliable saving of measurement data possible over very long periods of time in the 256 MByte measurement data memory
• Connection of Modbus slave devices via Ethernet-Modbus gateway

Fig.: Ethernet-Modbus gateway functionality

Measuring device homepage
• Webserver on the measuring device, i.e. device's own homepage
• Remote operation of the device display via the homepage
• Comprehensive measurement data incl. PQ
• Online data directly available via the homepage, historic data optional via the APP measured value monitor, 51.00.246

Fig.: Illustration of the online data via the device's inbuilt homepage

Typical connection
[Connection diagram: UMG 96RM-E (RCM) with digital inputs/outputs (group 1, group 2), power supply voltage, current measurement, measuring voltage, analog inputs, RS485 and Ethernet 10/100Base-T connection to a PC; loads on a 230 V/400 V 50 Hz network; residual current input 0-30 mA; PT100 temperature sensor]
Fig.: Connection example residual current measurement and PE monitoring
Fig.: Connection example with temperature and residual current measurement

Device overview and technical data
Comment: For detailed technical information please refer to the operation manual and the Modbus address list.
• = included, - = not included
*1 Inclusive UL certification.
*2 Optional additional functions with the packages GridVis®-Professional, GridVis®-Service and GridVis®-Ultimate.
*3 Example of residual current input 30 mA with 600/1 residual current transformer: 600 x 30 mA = 18,000 mA
*4 Accurate device dimensions can be found in the operation manual.

Technical data parameters: Measurement surge voltage; Power consumption; Overload for 1 sec.; Sampling frequency per channel (50 / 60 Hz); Residual current input; Analogue inputs; Measurement range, residual current input*; Digital outputs; Switching voltage; Switching current; Response time; Pulse output (energy pulse)

Fig.: GridVis® software, configuration menu
Fig.: RCM configuration, e.g. dynamic threshold value formation, for load-dependent threshold value adaptation
Fig.: Summation current transformer for the acquisition of residual currents. A wide range of configurations and sizes allows use in almost all applications
Protein Analysis and Proteomics (0)
DNA
RNA
protein
Page 389
[1] Protein families [3] Protein localization
protein [4] Protein function
Gene ontology (GO): --cellular component --biological process --molecular function
Example of a protein with domains: Methyl CpG binding protein 2 (MeCP2)
MBD
TRD
The protein includes a methylated DNA binding domain (MBD) and a transcriptional repression domain (TRD). MeCP2 is a transcriptional repressor. Mutations in the gene encoding MeCP2 cause Rett Syndrome, a neurological disorder affecting girls primarily.
Page 390
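Bcl-GL, a novel gene discovered by John Reed et al. in 2001. B. Alignment of the BH3 domains of this gene with those of other Bcl-2 family proteins. Identical and similar residues are marked in bold letters on black and gray backgrounds, respectively. C. Alignment of the BH2 domains of the same proteins.
Sequence similarity of the EBV-encoded BALF1 and BHRF1 proteins to Kaposi's sarcoma-associated herpesvirus (HHV8) ORF16 (KSHVorf16) and the cellular genes bcl-x and bcl-2. Identical regions are shown in black and conserved regions in gray.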
Supporting Information

Imaging mechanical vibrations in suspended graphene sheets

D. Garcia-Sanchez, A. M. van der Zande, A. San Paulo, B. Lassagne, P. L. McEuen and A. Bachtold

A - Sample fabrication

Suspended graphene sheets are fabricated by mechanical exfoliation. A freshly cleaved piece of Kish graphite (Toshiba Ceramics) is rubbed onto a degenerately doped silicon wafer with 290 nm SiO2 grown by plasma enhanced chemical vapor deposition. Before depositing the graphene, the wafer is patterned with trenches using photolithography and plasma etching. The trenches are millimeter long, 0.5 – 10 µm wide, and 250 nm deep. Electrodes defined by photolithography between the trenches are deposited using electron-beam evaporation and consist of 5 nm Cr and 35 nm Au.

B - Description of the FEM model

To model the shape of the eigenmodes, we have developed a simulation based on finite element methods (FEM) using ANSYS. The first step of the simulation is to account for the buckling of the suspended sheet by finding the adequate boundary conditions at the clamping edges. To do this, we hold one clamping edge of the suspended region fixed, and impose an in-plane displacement on the other clamping edge. Specifically, the displacement of this edge consists of a translation and a rotation within the undeformed resonator surface. Since the resulting out-of-plane displacement can be large, calculations are carried out taking into account geometric non-linear deformations [1]. To ensure that the buckling goes in the desired direction, we apply an out-of-plane perturbative force, which is then cleared at the end of the calculations. We make the assumption that the mechanical properties of the resonator are isotropic with 1 TPa for the Young's modulus and 0.17 for the Poisson ratio [2]. The exact value of the Poisson ratio has little effect on the output of the calculations.
In the second step of the simulation, a modal analysis is performed to determine the resonance frequency and the eigenmode shape of the deformed resonator. Here, the modal analysis is carried out in the linear regime because the amplitude of the vibration is small. To check that this simulation is free of errors, it has been successfully compared to analytical predictions for a beam under tension [3]. The simulation also reproduces recent calculations on nanotubes with slack [4]. See sections C and D.

C - Comparison between the FEM model and analytical expressions for resonators under tension

The resonance frequency for a beam under weak uniform tension ($T \ll EI/l^2$) is [5]

$$f_1 = \frac{22.4}{2\pi l^2}\sqrt{\frac{EI}{\rho w t}} + \frac{0.28}{2\pi}\,T\,\sqrt{\frac{1}{\rho w t\,EI}} \qquad (1)$$

We take the length l = 500 nm, the width w = 20 nm, the thickness t = 3.5 Å, the density ρ = 2200 kg/m³, and the Young's modulus E = 1 TPa. The bending moment of a rigid beam is $I = wt^3/12$.

For high tension ($T \gg EI/l^2$) the frequency can be expressed as [3]

$$f_1 = \frac{1}{2l}\sqrt{\frac{T}{\rho w t}}\left[1 + 2\xi + \left(4 + \frac{\pi^2}{2}\right)\xi^2\right] \qquad (2)$$

with $\xi^2 = E t^3 w / (12\,T l^2)$.

Figure S1 shows the resonance frequency as a function of tension using the FEM model and the above expressions. There is a good agreement between the theory and the FEM model.

[Plot: f (MHz) versus T (N), logarithmic axes]
Figure S1. Resonance frequency as a function of tension for the first eigenmode of a graphene resonator under tension.

D - Comparison between the FEM model and previous simulations on buckled SWNTs

Previous work [4] has reported numerical studies on SWNT resonators that are buckled (slack). We compare this work to the FEM model that we have developed. For this, we use the same geometry and the same physical characteristics as in reference 4. The resonator is a doubly clamped rod with l = 1.75 µm, d = 2 nm, E = 2.18 TPa, ρ = 2992 kg/m³ and a slack of 0.3%. The slack is defined as the ratio of the excess length of the tube to the distance between the clamping points.
Figure S2 shows the resonance frequency for different eigenmodes obtained with the two simulation techniques. A good agreement is obtained.

[Plot: f (MHz) versus eigenmode number]
Figure S2. Resonance frequency for different eigenmodes of a SWNT resonator with 0.3% of slack.

Conclusions of sections C and D. The FEM model shows a good agreement with established analytical expressions for beam resonators under tension and previous numerical calculations on buckled SWNTs. Note that the FEM model that we have developed can go beyond these cases. The model can be applied to resonators with any arbitrary geometry and any arbitrary stressed state.

E - SFM detection of mechanical vibrations

We have developed a technique based on scanning force microscopy (SFM) to detect the mechanical vibrations of nanotube and graphene resonators [6]. The resonator motion is electrostatically actuated with an oscillating voltage applied on a gate electrode. The frequency f_RF of the driving voltage V_RF is set at (or close to) the resonance frequency of the resonator. In addition, V_RF is modulated at f_mod, so that it varies as $(1 - \cos(2\pi f_{mod} t))\cos(2\pi f_{RF} t)$. While the SFM cantilever cannot follow the rapid oscillations at f_RF, it can detect the modulation envelope.

The topography imaging is obtained in tapping mode using the second eigenmode of the SFM cantilever. The vibrations are detected with the first eigenmode of the SFM cantilever. Figure S3 shows that the signal of the vibrations is significantly enhanced when f_mod matches the resonance frequency f_tip of the first eigenmode of the SFM cantilever. As a result, measurements are carried out with f_mod = f_tip.

[Plot: amplitude (a.u.) versus f_mod (kHz)]
Figure S3. Detected response of the vibration of a nanotube driven at resonance as a function of f_mod. The frequency of the first eigenmode of the SFM cantilever is 58 kHz. Nanotube resonance frequency f_RF is 153 MHz.
Measurements are taken at the nanotube position where the vibration amplitude is maximum.

Graphene resonators show a Lorentzian response to the rf drive frequency. Figure S4a shows the frequency response of a graphene resonator measured with the SFM technique. For comparison, Fig. S4b shows the frequency response of the same resonator measured using optical interferometry [7]. The resonance frequencies are very similar for both techniques. However, the quality factor measured with the SFM technique is much lower due to energy dissipation to air, as the SFM technique is operated at atmospheric conditions, while the optical interferometry is performed in vacuum. Note that the low Q is not attributed to the disturbance of the SFM tip [6]. Indeed, we have noticed no change in the quality factor as the amplitude set point of the SFM cantilever is reduced by 3%–5% from the limit of cantilever retraction, which corresponds to the enhancement of the resonator-tip interaction.

[Plots: amplitude (a.u.) versus f (MHz), panels a and b]
Figure S4. (a) Resonance peak of the fundamental mode of the graphene resonator shown in Figure 2 of the paper. The measurement is carried out using the SFM technique in air. The resonance is found at ~31 MHz with the quality factor Q = 5. Measurements are taken at the position where the vibration amplitude is maximum. (b) Resonance peak measured optically with a pressure of < 10^-6 torr. The resonance is found at 32 MHz with Q = 64.

As shown in Eq. 1 of the paper, the radio frequency force F_RF on the suspended sheet is a linear function of the offset voltage V_DC and the radio frequency voltage V_RF. Figure S5 shows vibration amplitude as a function of V_DC for an edge eigenmode of a graphene resonator. We find that the vibration amplitude is a linear function of the DC voltage and thus of the force.
By operating in this regime, we ensure that the resonators are operating in the linear response regime, and the exotic edge eigenmodes are not a result of non-linear effects.

[Plot: amplitude (a.u.) versus V_DC (V)]
Figure S5. Vibration amplitude as a function of V_DC at f_1 = 33 MHz and V_RF = 100 mV for an edge eigenmode of a graphene resonator. The dimensions of the suspended sheet are t = 6 nm and l = 3.5 µm. The measurements are obtained at the position of the graphene sheet where the amplitude is maximum.

F - Estimation of the vibration amplitude of graphene resonators

The amplitude of the vibration can be estimated when considering beams that have neither slack nor tension. The eigenfunctions U_n(x) are obtained from [6,8]:

$$\rho A\,\frac{\partial^2 U_{beam}}{\partial t^2} + EI\,\frac{\partial^4 U_{beam}}{\partial x^4} = 0 \qquad (4)$$

The displacement of the resonator can be expanded in terms of U_n(x),

$$z_{beam} = \sum_n \alpha_n U_n \exp(-i\,2\pi f_{RF}\,t) \qquad (5)$$

where

$$\alpha_n = \frac{1}{4\pi^2 \rho A L^3}\;\frac{1}{f_n^2 - f_{RF}^2 - i f_n^2/Q_n}\int_0^l U_n(x)\,F_{RF}(x)\,dx \qquad (6)$$

with f_n the eigenfrequencies, Q_n the quality factor for each eigenmode and $F_{RF}(x) = \frac{\partial C(x)}{\partial z}\,(V_{DC} - \phi)\,V_{RF}$. C(x) is the capacitance per unit of length and is given by $C(x) = \varepsilon_0 w / z$, with w the width of the resonator and z the separation between the resonator and the gate. We have estimated that the maximum amplitude of the graphene resonator in Figure 2 of the paper is 0.1 nm. For this, we have used Q = 5, V_DC − φ = 3 V, and V_RF = 60 mV.

References
[1] Zienkiewicz, O.C.; Taylor, R.L. The Finite Element Method, 5th edition. Butterworth-Heinemann: Oxford, 2000.
[2] Popov, V.N.; Van Doren, V.E.; Balkanski, M. Elastic properties of single-walled carbon nanotubes. Phys. Rev. B 2000, 61, 3078.
[3] Buks, E.; Roukes, M.L. Phys. Rev. B 2001, 63, 033402.
[4] Üstünel, H.; Roundy, D.; Arias, T.A. Nano Lett. 2005, 5, 523.
[5] Sapmaz, S.; et al. Phys. Rev. B 2003, 67, 235414.
[6] Garcia-Sanchez, D.; et al. Phys. Rev. Lett. 2007, 99, 085501.
[7] Bunch, J.S.; et al. Science 2007, 315, 490.
[8] Cleland, A.N. Foundations of Nanomechanics. Springer: Berlin, 2003.
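As a rough numerical companion to section C, the sketch below evaluates the weak-tension and high-tension expressions, Eqs. (1) and (2), for the stated parameters (l = 500 nm, w = 20 nm, t = 3.5 Å, ρ = 2200 kg/m³, E = 1 TPa). It assumes the equations have the standard beam-under-tension forms, read here from extraction-damaged text, so the prefactors should be checked against the original supporting information. The function names are illustrative and do not come from the source.

```python
import math

# Parameters quoted in section C (SI units).
E = 1.0e12      # Young's modulus, Pa
RHO = 2200.0    # density, kg/m^3
L = 500e-9      # resonator length l, m
W = 20e-9       # width w, m
T_THICK = 3.5e-10  # thickness t, m
I = W * T_THICK**3 / 12.0  # bending moment I = w t^3 / 12


def f_weak_tension(tension):
    """Eq. (1) as assumed here: valid for tension << E*I/L**2."""
    mass_per_length = RHO * W * T_THICK
    bending_term = 22.4 / (2 * math.pi * L**2) * math.sqrt(E * I / mass_per_length)
    tension_term = 0.28 / (2 * math.pi) * tension / math.sqrt(mass_per_length * E * I)
    return bending_term + tension_term


def f_strong_tension(tension):
    """Eq. (2) as assumed here: valid for tension >> E*I/L**2."""
    xi = math.sqrt(E * I / (tension * L**2))
    string_term = math.sqrt(tension / (RHO * W * T_THICK)) / (2 * L)
    return string_term * (1 + 2 * xi + (4 + math.pi**2 / 2) * xi**2)


if __name__ == "__main__":
    crossover = E * I / L**2  # tension scale separating the two regimes
    print(f"crossover tension ~ {crossover:.2e} N")
    for tension in (0.0, 1e-15, 1e-14):
        print(f"T = {tension:.0e} N: f = {f_weak_tension(tension) / 1e6:.1f} MHz (weak)")
    for tension in (1e-11, 1e-10, 1e-9):
        print(f"T = {tension:.0e} N: f = {f_strong_tension(tension) / 1e6:.1f} MHz (strong)")
```

Each expression is only meaningful on its own side of the crossover tension EI/l², which is why the two loops sample different tension ranges.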
Package Testing
• Modeled fault testing
• Will detect 100% of detectable modeled faults
• Requires only 47 vectors
• Vectors can be generated and analyzed by ATPG tools
• Note: some of the faults are not able to be detected by
• Stuck-short: a single transistor is permanently stuck in the short state
• Detection of a stuck-open fault requires two vectors
• A collapsed fault set contains one fault from each equivalence subset
• The length of ATPG patterns is reduced significantly once fault collapsing is taken into account
Microelectronics
School of Microelectronics, Shanghai Jiao Tong University
Transistor (Switch) Faults
• The MOS transistor is considered an ideal switch, and two types of faults are modeled:
• Stuck-open: a single transistor is permanently stuck in the open state
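Jinan University "Molecular Biology" Review Questions

Molecular Biology Review Questions

Macromolecules

Supercoil: the further coiling of the DNA double helix upon itself is called supercoiling. There are two kinds of supercoils, positive and negative; negative supercoiling is necessary for both transcription and replication.

Palindrome: a DNA sequence that reads the same from left to right on one strand as from right to left on the other strand, i.e. the two strands consist of adjacent inverted repeats.

Melting temperature (Tm): DNA in solution denatures gradually as the temperature rises. The temperature at which the increase in A260 reaches half of the increase observed at complete denaturation is called the melting temperature. Tm depends on factors such as GC content and sequence complexity.

Hyperchromic shift: the increase in ultraviolet absorbance at 260 nm that occurs when double-helical DNA melts (the strands separate).

Nucleosome: the basic structural unit of eukaryotic chromosomes, composed of DNA and the histones H1, H2A, H2B, H3 and H4; the DNA winds around the histone core in a left-handed helix.

Chaperones: proteins that form complexes with newly synthesized polypeptide chains and assist them in folding correctly into a biologically functional conformation.

Domain: a locally folded region of tertiary structure that a polypeptide chain forms on the basis of secondary or supersecondary structure. It is a relatively independent, compact globular entity.

Motif: also called supersecondary structure. In protein molecules, especially globular proteins, several adjacent secondary structure elements (mainly α-helices and β-sheets) combine and interact with one another to form a limited variety of regular secondary structure combinations or strings, which serve as building blocks of tertiary structure in many proteins.

Protein family: a group of proteins that have similar structures and share some common function.

Proteasome: proteasomes are present in the cytoplasm and nucleus of all eukaryotic cells. They are highly conserved protease complexes with multiple catalytic functions. The proteasome has proteolytic activity, and proteasomal hydrolysis requires the participation of ubiquitin.

Ubiquitylation: ubiquitin is attached, at intervals or consecutively, to lysine residues of the protein to be degraded; this process is called protein ubiquitylation.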
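The palindrome definition in the glossary above can be checked mechanically: a DNA palindrome is a sequence equal to its own reverse complement. The following is a minimal sketch of that check; the function names are illustrative, and only the four unambiguous bases A, C, G and T are handled.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq):
    """True if the sequence reads the same on both strands (5'->3')."""
    return seq.upper() == reverse_complement(seq)


print(is_palindrome("GAATTC"))   # EcoRI recognition site
print(is_palindrome("GATTACA"))
```

The EcoRI site GAATTC is a standard textbook example of such a sequence, while GATTACA is not palindromic.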
How to Write an Author's Emotional Appeal in English

When expressing an author's emotional appeal in English, it's essential to use language that is both emotive and evocative. Here are some strategies to consider:

1. Use of Figurative Language: Metaphors, similes, and personification can add depth and emotion to your writing.
- Example: "Her voice was a siren's song, luring me into the depths of her story."

2. Strong Verbs: Choose verbs that convey action and emotion.
- Example: "He leapt with joy" or "She whispered her secrets into the night."

3. Sensory Details: Engage the reader's senses to make the emotions more tangible.
- Example: "The air was thick with the scent of fresh-cut grass and the distant hum of bees."

4. Direct Emotion Words: Don't shy away from using words that directly express emotions.
- Example: "I felt a pang of sorrow as the sun dipped below the horizon."

5. Show, Don't Tell: Instead of telling the reader what the character feels, show it through actions and dialogue.
- Example: "Her hands trembled as she clutched the worn letter to her chest."

6. Inner Monologue: Use a character's thoughts to reveal their emotional state.
- Example: "Why did every encounter with him leave me feeling as if I'd just run a marathon?"

7. Contrast and Conflict: Highlight the emotional stakes by contrasting them with opposing elements.
- Example: "Despite the laughter echoing in the room, a heavy silence weighed on her heart."

8. Rhythm and Flow: The rhythm of your sentences can mimic the emotional tone you're trying to convey.
- Example: "The rhythm of the rain on the windowpane was a lullaby to her restless thoughts."

9. Dialogue: Use dialogue to express emotion naturally.
- Example: "‘I can't do this without you,’ she cried, her voice breaking with the weight of her confession."

10. Symbolism and Motifs: Use recurring symbols or motifs to represent certain emotions or themes.
- Example: "The old, gnarled tree stood as a silent witness to their enduring love."

11. Vary Sentence Structure: Mixing short, punchy sentences with longer, more complex ones can reflect the intensity of emotions.
- Example: "She ran, her heart pounding, the wind whipping her hair, the world a blur around her."

12. Evoke Empathy: Write in a way that invites the reader to empathize with the characters.
- Example: "His eyes, brimming with unshed tears, told a story of a heart too full for words."

Remember, the key to writing an author's emotional appeal is to be genuine and specific. Avoid overusing clichés and strive for originality in expressing feelings. The goal is to create a connection between the reader and the text, allowing the reader to experience the emotions as if they were their own.
18KJDL Specification Sheet

Rack-mounted DCS Signal Conditioners 18K-RACK
CURRENT LOOP SUPPLY

MODEL: 18KJDL–A[1]66–R[2]

Specify a code from below for each [1] and [2]. (e.g. 18KJDL-A366-R/S)
Default setting (table below) will be used if not otherwise specified.
No linearization data will be programmed if you don't specify the type of linearization and the required data.

• Linearization data
Code 1 segment data: Use Ordering Information Sheet (No. ESU-1669) to specify linearization data.
Code 3 T/C, Code 4 RTD: Specify input sensor type and temperature range.

LINEARIZATION CODE          DEFAULT
1: Segment data             Linear
2: Square root extraction   –––
3: Thermocouple             K, 0 – 1000°C
4: RTD                      Pt 100, 0 – 100°C

INPUT
Current
A: 4 – 20 mA DC (Input resistance 250 Ω)

[1] LINEARIZATION
0: None
1: Segment data
2: Square root extraction
3: Thermocouple
4: RTD

OUTPUT 1
Voltage
6: 1 – 5 V DC (Load resistance 2000 Ω min.)

OUTPUT 2
Voltage
6: 1 – 5 V DC (Load resistance 2000 Ω min.)

POWER INPUT
DC Power
R: 24 V DC (Operational voltage range 24 V ±10 %, ripple 10 %p-p max.)

[2] OPTIONS
Power Switch
blank: None
/S: With power switch

• PC configurator software (model: JXCON)
Downloadable at M-System's web site.
A dedicated cable is required to connect the module to the PC. Please refer to the internet software download site or the users manual for the PC configurator for applicable cable types.

terminals on the front and connector on the rear; terminal cover provided
Connection
  Input: M3.5 screw terminals (torque 0.8 N·m) and connector
  Output 1: Connector
  Output 2: M3.5 screw terminals (torque 0.8 N·m) and connector
  Power input: Supplied from connector
Screw terminal: Nickel-plated steel
Isolation: Input to output 1 to output 2 to power
Overrange output: Approx. -10 to +120 % at 1 – 5 V
Linearization: 16 points max. represented as percentage of full-scale
Adjustments: Programming Unit (model: PU-2x); linearization data, zero and span, simulating output, etc. (Refer to the users manual of JXCON for the adjustments configurable with JXCON.)

Current rating: ≤ 22 mA DC
• Shortcircuit Protection
  Current limited: 30 mA max.
  Protected time duration: No limit
• Segment data: 16 points (15 segments) max. within the range of -15.00 to +115.00 % input or output represented as percentage of full-scale
• Square root extraction
  Low-end cutout: 5 % (output); curve characteristics as in the figure below

[Figure: square root extraction curve (OUTPUT axis)]

• Thermocouple linearizable range

T/C       USABLE RANGE (°C)   USABLE RANGE (°F)
(PR)      0 to 1760           32 to 3200
K (CA)    -270 to +1370       -454 to +2498
E (CRC)   -270 to +1000       -454 to +1832
J (IC)    -210 to +1200       -346 to +2192
T (CC)    -270 to +400        -454 to +752
B (RH)    0 to 1820           32 to 3308
R         -50 to +1760        -58 to +3200
S         -50 to +1760        -58 to +3200

Note: For temperatures below 0°C, the transmitter may partially not satisfy the described accuracy. Consult factory.

• RTD linearizable range

RTD                     USABLE RANGE (°C)   USABLE RANGE (°F)
JPt 100 (JIS '89)       -200 to +500        -328 to +932
Pt 100 (JIS '89)        -200 to +650        -328 to +1202
Pt 100 (JIS '97, IEC)   -200 to +650        -328 to +1202
Pt 50Ω (JIS '81)        -200 to +500        -328 to +932
Ni 508.4Ω               -50 to +200         -58 to +392

Note: Pt 100 (JIS '89) deviates from Pt 100 (JIS '97) only within the described accuracy.

Operating temperature: -5 to +55°C (23 to 131°F)
Operating humidity: 30 to 90 %RH (non-condensing)
Mounting: Standard Rack 18KBXx
Weight: 150 g (0.33 lb)

with segment gain > 1
Temp. coefficient: ±0.015 %/°C (±0.008 %/°F)
Response time: ≤ 0.5 sec. (0 – 90 %)
Line voltage effect
  Output signal: ±0.1 % over voltage range
Insulation resistance: ≥ 100 MΩ with 500 V DC
Dielectric strength: 500 V AC @ 1 minute (input to output 1 to output 2 to power to ground)

■ WITH POWER SWITCH
[Connection diagram: PU-2x, Module, Section A (Field Terminals)]
* With power switch only
Use either the module or the external field terminals.
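To make the "segment data" linearization concrete, the sketch below applies a piecewise-linear table of up to 16 points, expressed as percentages of full scale, to a 4 – 20 mA input and maps the result onto the 1 – 5 V output. It is an illustration of the general technique only, not the module's firmware: the function names, the example table and the clamping behaviour outside the table are assumptions.

```python
from bisect import bisect_right


def ma_to_percent(i_ma):
    """Convert a 4-20 mA input to percent of full scale."""
    return (i_ma - 4.0) / 16.0 * 100.0


def percent_to_volts(percent):
    """Convert percent of full scale to a 1-5 V output."""
    return 1.0 + percent / 100.0 * 4.0


def linearize(x, points):
    """Piecewise-linear interpolation through (input %, output %) points.

    `points` must be sorted by input value and hold 2 to 16 entries
    (i.e. at most 15 segments). Inputs outside the table are clamped to
    the end points, which is an assumption of this sketch.
    """
    if not 2 <= len(points) <= 16:
        raise ValueError("segment table needs 2 to 16 points")
    xs = [p[0] for p in points]
    if x <= xs[0]:
        return points[0][1]
    if x >= xs[-1]:
        return points[-1][1]
    k = bisect_right(xs, x) - 1          # index of the segment containing x
    (x0, y0), (x1, y1) = points[k], points[k + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical three-point table (two segments).
table = [(0.0, 0.0), (50.0, 25.0), (100.0, 100.0)]
out_percent = linearize(ma_to_percent(12.0), table)
print(out_percent, percent_to_volts(out_percent))
```

With this hypothetical table, a 12 mA input (50 % of full scale) is mapped to 25 % of the output span, i.e. 2 V.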
motifStack guide - Bioconductor

motifStack guide
Jianhong Ou, Lihua Julie Zhu
May 15, 2016

Contents
1 Introduction
2 Prepare environment
3 Examples of using motifStack
  3.1 plot a DNA sequence logo with different fonts and colors
  3.2 plot a RNA sequence logo
  3.3 plot an amino acid sequence logo
  3.4 plot sequence logo stack
  3.5 plot a sequence logo cloud
  3.6 plot grouped sequence logo
  3.7 motifCircos
  3.8 motifPiles
4 docker container for motifStack
5 References
6 Session Info

1 Introduction

A sequence logo, based on information theory, has been widely used as a graphical representation of sequence conservation (aka motif) in multiple amino acid or nucleic acid sequences. Sequence motif represents conserved characteristics such as DNA binding sites, where transcription factors bind, and catalytic sites in enzymes. Although many tools, such as seqlogo [1], have been developed to create sequence motifs and to represent them as individual sequence logos, software tools for depicting the relationship among multiple sequence motifs are still lacking. We developed a flexible and powerful open-source R/Bioconductor package, motifStack, for visualization of the alignment of multiple sequence motifs.

2 Prepare environment

You will need ghostscript: the full path to the executable can be set by the environment variable R_GSCMD. If this is unset, a GhostScript executable will be searched by name on your path. For example, on Unix, Linux or Mac "gs" is used for searching, and on Windows the setting of the environment variable GSC is used, otherwise the commands "gswin64c.exe" then "gswin32c.exe" are tried.

Example on Windows: assume that gswin32c.exe is installed at C:\Program Files\gs\gs9.06\bin, then open R and try:

    Sys.setenv(R_GSCMD=file.path("C:", "Program Files", "gs",
                                 "gs9.06", "bin", "gswin32c.exe"))

3 Examples of using motifStack

3.1 plot a DNA sequence logo with different fonts and colors

Users can select different fonts and colors to draw the sequence logo (Figure 1).

    suppressPackageStartupMessages(library(motifStack))
    pcm <- read.table(file.path(find.package("motifStack"),
                                "extdata", "bin_SOLEXA.pcm"))
    pcm <- pcm[, 3:ncol(pcm)]
    rownames(pcm) <- c("A", "C", "G", "T")
    motif <- new("pcm", mat=as.matrix(pcm), name="bin_SOLEXA")
    ## pfm object
    # motif <- pcm2pfm(pcm)
    # motif <- new("pfm", mat=motif, name="bin_SOLEXA")
    opar <- par(mfrow=c(4, 1))
    plot(motif)
    # plot the logo with same height
    plot(motif, ic.scale=FALSE, ylab="probability")
    # try a different font
    plot(motif, font="mono,Courier")
    # try a different font and a different color group
    motif@color <- colorset(colorScheme='basepairing')
    plot(motif, font="Times")
    par(opar)

Figure 1: DNA sequence logo. Plot a DNA sequence logo with different fonts and colors.

3.2 plot a RNA sequence logo

To go from a DNA sequence logo to an RNA sequence logo (Figure 2), you just need to change the row name of the matrix from "T" to "U".

    rna <- pcm
    rownames(rna)[4] <- "U"
    motif <- new("pcm", mat=as.matrix(rna), name="RNA_motif")
    plot(motif)

Figure 2: RNA sequence logo. Plot an RNA sequence logo.

3.3 plot an amino acid sequence logo

Given that motifStack allows any letters to be used as symbols, it can also be used to draw amino acid sequence logos (Figure 3).

    library(motifStack)
    protein <- read.table(file.path(find.package("motifStack"),
                                    "extdata", "cap.txt"))
    protein <- t(protein[, 1:20])
    motif <- pcm2pfm(protein)
    motif <- new("pfm", mat=motif, name="CAP",
                 color=colorset(alphabet="AA", colorScheme="chemistry"))
    plot(motif)

Figure 3: Amino acid sequence logo. Plot a sequence logo with any symbols you want, such as an amino acid sequence logo.

3.4 plot sequence logo stack

motifStack is designed to show multiple motifs in the same canvas. To show the sequence logo stack, the distances between motifs need to be calculated first, for example by using MotIV[2]::motifDistances, which implements STAMP [3]. After alignment, users can use the plotMotifLogoStack function to draw a sequence logo stack (Figure 4), the plotMotifLogoStackWithTree function to show the distance tree with the sequence logo stack (Figure 5), or the plotMotifStackWithRadialPhylog function to plot the sequence logo stack in radial style (Figure 6) in the same canvas. There is a shortcut function named motifStack: use the stack layout to call plotMotifLogoStack, the treeview layout to call plotMotifLogoStackWithTree and radialPhylog to call plotMotifStackWithRadialPhylog.

    library(motifStack)
    ##### Input #####
    pcms <- readPCM(file.path(find.package("motifStack"), "extdata"), "pcm$")
    motifs <- lapply(pcms, pcm2pfm)
    ## plot stacks
    motifStack(motifs, layout="stack", ncex=1.0)

Figure 4: Sequence logo stack. Plot motifs with sequence logo stack style.

    ## plot stacks with hierarchical tree
    motifStack(motifs, layout="tree")

Figure 5: Treeview layout logo stack. Sequence logo stack with hierarchical cluster tree.

    ## When the number of motifs is too much to be shown in a vertical stack,
    ## motifStack can draw them in a radial style.
    ## random sample from MotifDb
    library("MotifDb")
    matrix.fly <- query(MotifDb, "Dmelanogaster")
    motifs2 <- as.list(matrix.fly)
    ## use data from FlyFactorSurvey
    motifs2 <- motifs2[grepl("Dmelanogaster\\-FlyFactorSurvey\\-",
                             names(motifs2))]
    ## format the names
    names(motifs2) <- gsub("Dmelanogaster_FlyFactorSurvey_", "",
                           gsub("_FBgn\\d+$", "",
                                gsub("[^a-zA-Z0-9]", "_",
                                     gsub("(_\\d+)+$", "", names(motifs2)))))
    motifs2 <- motifs2[unique(names(motifs2))]
    pfms <- sample(motifs2, 50)
    ## create a list of pfm objects
    motifs2 <- lapply(names(pfms),
                      function(.ele, pfms){new("pfm", mat=pfms[[.ele]], name=.ele)},
                      pfms)
    ## trim the motifs
    motifs2 <- lapply(motifs2, trimMotif, t=0.4)
    ## setting colors
    library(RColorBrewer)
    color <- brewer.pal(12, "Set3")
    ## plot logo stack with radial style
    motifStack(motifs2, layout="radialPhylog",
               circle=0.3, cleaves=0.2,
               clabel.leaves=0.5,
               col.bg=rep(color, each=5), col.bg.alpha=0.3,
               col.leaves=rep(color, each=5),
               col.inner.label.circle=rep(color, each=5),
               inner.label.circle.width=0.05,
               col.outer.label.circle=rep(color, each=5),
               outer.label.circle.width=0.02,
               circle.motif=1.2,
               angle=350)

Figure 6: Sequence logo stack in radial style. Plot motifs in a radial style when the number of motifs is too large to be shown in a vertical stack.

3.5 plot a sequence logo cloud

We can also plot a sequence logo cloud for DNA sequence logos (Figure 7).

    ## assign groups for motifs
    groups <- rep(paste("group", 1:5, sep=""), each=10)
    names(groups) <- names(pfms)
    ## assign group colors
    group.col <- brewer.pal(5, "Set3")
    names(group.col) <- paste("group", 1:5, sep="")
    ## use MotIV to calculate the distances of motifs
    jaspar.scores <- MotIV::readDBScores(file.path(find.package("MotIV"),
                                                   "extdata",
                                                   "jaspar2010_PCC_SWU.scores"))
    d <- MotIV::motifDistances(lapply(pfms, pfm2pwm))
    hc <- MotIV::motifHclust(d, method="average")
    ## convert the hclust to phylog object
    phylog <- hclust2phylog(hc)
    ## reorder the pfms by the order of hclust
    leaves <- names(phylog$leaves)
    pfms <- pfms[leaves]
    ## create a list of pfm objects
    pfms <- lapply(names(pfms),
                   function(.ele, pfms){new("pfm", mat=pfms[[.ele]], name=.ele)},
                   pfms)
    ## extract the motif signatures
    motifSig <- motifSignature(pfms, phylog, groupDistance=0.01, min.freq=1)
    ## draw the motifs with a tag-cloud style.
    motifCloud(motifSig, scale=c(6, .5),
               layout="rectangles",
               group.col=group.col,
               groups=groups,
               draw.legend=TRUE)

Figure 7: Sequence logo cloud with rectangle packing layout. Like a tag cloud, the sequence logo size is determined by the number of motifs of the signature. The group sources of the motifs for each signature are shown as a pie graph in the top-left corner.

3.6 plot grouped sequence logo

To plot a grouped sequence logo, besides using motifCloud, we can also plot it with the radialPhylog style (Figure 8).

    ## get the signatures from object of motifSignature
    sig <- signatures(motifSig)
    ## set the inner-circle color for each signature
    gpCol <- sigColor(motifSig)
    ## plot the logo stack with radial style.
    plotMotifStackWithRadialPhylog(phylog=phylog, pfms=sig,
                                   circle=0.4, cleaves=0.3,
                                   clabel.leaves=0.5,
                                   col.bg=rep(color, each=5), col.bg.alpha=0.3,
                                   col.leaves=rep(rev(color), each=5),
                                   col.inner.label.circle=gpCol,
                                   inner.label.circle.width=0.03,
                                   angle=350, circle.motif=1.2,
                                   motifScale="logarithmic")

Figure 8: Grouped sequence logo with radialPhylog style layout. Like a tag cloud, the sequence logo size is determined by the number of motifs for the signature. The gray-black circle indicates the range of each signature.

3.7 motifCircos

We can also plot it with the circos style (Figure 9). In circos style, we can plot two groups of motifs with multiple color rings.

    ## plot the logo stack with circos style.
    motifCircos(phylog=phylog, pfms=pfms, pfms2=sig,
                col.tree.bg=rep(color, each=5), col.tree.bg.alpha=0.3,
                col.leaves=rep(rev(color), each=5),
                col.inner.label.circle=gpCol,
                inner.label.circle.width=0.03,
                col.outer.label.circle=gpCol,
                outer.label.circle.width=0.03,
                r.rings=c(0.02, 0.03, 0.04),
                col.rings=list(sample(colors(), 50),
                               sample(colors(), 50),
                               sample(colors(), 50)),
                angle=350, motifScale="logarithmic")

Figure 9: Grouped sequence logo with circos style layout. More color sets with more motifs.

3.8 motifPiles

We can also plot it with the pile style (Figure 10). In pile style, we can plot two groups of motifs with multiple color annotations.

    ## plot the logo stack with pile style.
    motifPiles(phylog=phylog, pfms=pfms, pfms2=sig,
               col.tree=rep(color, each=5),
               col.leaves=rep(rev(color), each=5),
               col.pfms2=gpCol,
               r.anno=c(0.02, 0.03, 0.04),
               col.anno=list(sample(colors(), 50),
                             sample(colors(), 50),
                             sample(colors(), 50)),
               motifScale="logarithmic",
               plotIndex=TRUE,
               groupDistance=0.01)

Figure 10: Grouped sequence logo with piles style layout. More color sets with more motifs.

4 docker container for motifStack

Docker allows software to be packaged into containers, and the containers can be run on any platform as well, using a virtual machine called boot2docker. motifStack has its docker image stored in Docker Hub. Users can download the image and run it.

    docker pull jianhong/motifstack_1.13.6
    cd ~ ## in windows, please try cd c:\Users\username
    mkdir tmp4motifstack ## this will be the share folder for your host and container.
    docker run -ti --rm -v ${PWD}/tmp4motifstack:/volume/data jianhong/motifstack_1.13.6 R
    ## in R
    setwd("/tmp")
    library(motifStack)
    packageVersion("motifStack")
    pcmpath <- "pcmsDatasetFly"
    pcms <- readPCM(pcmpath)
    pfms <- lapply(pcms, pcm2pfm)
    matalign_path <- "/usr/bin/matalign"
    neighbor_path <- "/usr/bin/phylip/neighbor"
    outpath <- "output"
    system(paste("perl MatAlign2tree.pl --in . --pcmpath", pcmpath,
                 "--out", outpath,
                 "--matalign", matalign_path,
                 "--neighbor", neighbor_path,
                 "--tree", "UPGMA"))
    newickstrUPGMA <- readLines(con=file.path(outpath, "NJ.matalign.distMX.nwk"))
    phylog <- newick2phylog(newickstrUPGMA, FALSE)
    leaves <- names(phylog$leaves)
    motifs <- pfms[leaves]
    motifSig <- motifSignature(motifs, phylog,
                               groupDistance=2, min.freq=1, trim=.2)
    sig <- signatures(motifSig)
    gpCol <- sigColor(motifSig)
    leaveNames <- gsub("^Dm_", "", leaves)
    pdf("/volume/data/test.pdf", width=8, height=11)
    motifPiles(phylog=phylog, DNAmotifAlignment(motifs), sig,
               col.pfms=gpCol, col.pfms.width=.01,
               col.pfms2=gpCol, col.pfms2.width=.01,
               labels.leaves=leaveNames,
               plotIndex=c(FALSE, TRUE), IndexCex=1,
               groupDistance=2, clabel.leaves=1)
    dev.off()

You will see the test.pdf file in the tmp4motifstack folder.

5 References

[1] Oliver Bembom. seqlogo: Sequence logos for DNA sequence alignments. R package version 1.5.4, 2006.
[2] Eloi Mercier and Raphael Gottardo. MotIV: Motif identification and validation. R package version 1.10.0, 2010.
[3] Mahony S and Benos PV. STAMP: a web tool for exploring DNA-binding motif similarities. Nucleic Acids Res., 35 (Web Server issue): W253–W258, 2007.

6 Session Info

    sessionInfo()
    ## R version 3.3.0 (2016-05-03)
    ## Platform: x86_64-pc-linux-gnu (64-bit)
    ## Running under: Ubuntu 14.04.4 LTS
    ##
    ## locale:
    ##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
    ##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
    ##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
    ##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
    ##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
    ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
    ##
    ## attached base packages:
    ##  [1] stats4 parallel grid stats graphics grDevices utils datasets
    ##  [9] methods base
    ##
    ## other attached packages:
    ##  [1] RColorBrewer_1.1-2 MotifDb_1.14.0 motifStack_1.16.2 Biostrings_2.40.0
    ##  [5] XVector_0.12.0 IRanges_2.6.0 S4Vectors_0.10.0 ade4_1.7-4
    ##  [9] MotIV_1.28.0 BiocGenerics_0.18.0 grImport_0.9-0 XML_3.98-1.4
    ## [13] BiocStyle_2.0.2
    ##
    ## loaded via a namespace (and not attached):
    ##  [1] Rcpp_0.12.5 highr_0.6 plyr_1.8.3
    ##  [4] formatR_1.4 GenomeInfoDb_1.8.2 bitops_1.0-6
    ##  [7] tools_3.3.0 zlibbioc_1.18.0 digest_0.6.9
    ## [10] evaluate_0.9 lattice_0.20-33 BSgenome_1.40.0
    ## [13] yaml_2.1.13 seqLogo_1.38.0 rtracklayer_1.32.0
    ## [16] stringr_1.0.0 knitr_1.13 Biobase_2.32.0
    ## [19] BiocParallel_1.6.2 rGADEM_2.20.0 rmarkdown_0.9.6
    ## [22] magrittr_1.5 Rsamtools_1.24.0 scales_0.4.0
    ## [25] htmltools_0.3.5 GenomicRanges_1.24.0 GenomicAlignments_1.8.0
    ## [28] SummarizedExperiment_1.2.2 colorspace_1.2-6 stringi_1.0-1
    ## [31] RCurl_1.95-4.8 munsell_0.4.3
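The "bits" axis of the sequence logos in this guide reflects per-position information content. As a language-neutral illustration (written in Python rather than R, and not taken from the motifStack source), the sketch below normalizes a position count matrix into a position frequency matrix and computes log2(alphabet size) minus the Shannon entropy for each column. It assumes a uniform background and omits any small-sample correction; the function names and the toy matrix are hypothetical.

```python
import math


def pcm_to_pfm(pcm):
    """Normalize a position count matrix {symbol: [counts]} column by column."""
    n_pos = len(next(iter(pcm.values())))
    totals = [sum(pcm[s][j] for s in pcm) for j in range(n_pos)]
    return {s: [pcm[s][j] / totals[j] for j in range(n_pos)] for s in pcm}


def information_content(pfm):
    """Information content (bits) per position, assuming a uniform background."""
    n_pos = len(next(iter(pfm.values())))
    max_bits = math.log2(len(pfm))
    ic = []
    for j in range(n_pos):
        entropy = -sum(p * math.log2(p)
                       for p in (pfm[s][j] for s in pfm) if p > 0)
        ic.append(max_bits - entropy)
    return ic


# Toy DNA count matrix with three positions (hypothetical data).
pcm = {"A": [8, 2, 4], "C": [0, 2, 4], "G": [0, 2, 0], "T": [0, 2, 0]}
print(information_content(pcm_to_pfm(pcm)))
```

A fully conserved position carries 2 bits for a four-letter alphabet, a position with all four bases equally frequent carries 0 bits, and a position split evenly between two bases carries 1 bit.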
Daily English Study Plan for Grade 8

Monday

8:00 am - 9:00 am: Reading comprehension practice
Start the day by practicing reading comprehension passages. Read a passage and answer the questions to test your understanding of the text. Use different types of passages such as fiction, nonfiction, and poetry.

9:00 am - 10:00 am: Vocabulary building
Spend an hour learning new words and expanding your vocabulary. Use flashcards, word games, and online resources to learn and practice new words. Make a list of words to review throughout the week.

10:00 am - 11:00 am: Writing practice
Work on improving your writing skills by practicing different types of writing. Spend an hour writing essays, creative stories, or descriptive paragraphs. Practice using different writing techniques and styles.

11:00 am - 12:00 pm: Grammar review
Review different grammar topics such as tenses, parts of speech, and sentence structure. Complete grammar exercises and worksheets to reinforce your understanding of the rules and concepts.

12:00 pm - 1:00 pm: Lunch break

1:00 pm - 2:00 pm: Spelling practice
Spend an hour practicing spelling words. Use spelling worksheets, word searches, and spelling games to improve your spelling skills. Create a list of commonly misspelled words to focus on.

2:00 pm - 3:00 pm: Literature analysis
Read a short story or a poem and analyze the literary elements such as plot, characters, and theme. Write a summary of the literary work and discuss your interpretation of the text.

3:00 pm - 4:00 pm: Reading for pleasure
Take some time to read for pleasure. Choose a book or a magazine that interests you and spend an hour reading. This will help improve your reading skills and develop a love for reading.

Tuesday

Start the day with grammar exercises to reinforce your understanding of the rules. Complete a set of exercises focusing on a specific grammar topic such as subject-verb agreement or punctuation.

9:00 am - 10:00 am: Vocabulary review
Review the new words you learned yesterday. Use flashcards or online vocabulary quizzes to test your memory and retention of the words. Practice using the words in sentences to reinforce their meaning.

10:00 am - 11:00 am: Writing practice
Work on improving your writing skills by focusing on a specific aspect of writing such as organization, voice, or style. Practice writing paragraphs or essays using the skills you are working on.

11:00 am - 12:00 pm: Reading comprehension
Practice reading comprehension by reading a passage and answering questions. Choose a challenging passage to test your comprehension skills and analyze your mistakes.

12:00 pm - 1:00 pm: Lunch break

1:00 pm - 2:00 pm: Spelling practice
Review the spelling words from yesterday and practice spelling exercises. Use different spelling activities such as word scrambles or crosswords to reinforce your spelling skills.

2:00 pm - 3:00 pm: Literary analysis
Read a short story or a poem and analyze the literary elements. Compare and contrast different literary works and discuss the themes and motifs present in the text.

3:00 pm - 4:00 pm: Reading for pleasure
Take some time to read for pleasure and explore different genres of literature. Choose a book or a magazine that you enjoy and spend an hour reading and expanding your literary horizons.

Wednesday

8:00 am - 9:00 am: Grammar review
Review different grammar topics and complete grammar exercises to reinforce your understanding of the rules. Focus on a specific grammar skill that you find challenging and practice until you feel confident.

Learn new words and expand your vocabulary. Use different methods such as word games, flashcards, and online resources to learn and practice new words.

10:00 am - 11:00 am: Writing practice
Work on writing skills by focusing on a specific aspect of writing such as voice, style, or organization. Practice writing paragraphs or essays using the skills you are working on.

11:00 am - 12:00 pm: Reading comprehension practice
Practice reading comprehension by reading a passage and answering questions. Choose a challenging passage to test your comprehension skills.

12:00 pm - 1:00 pm: Lunch break

1:00 pm - 2:00 pm: Spelling practice
Review the spelling words from the previous days and practice spelling exercises. Use different spelling activities such as word scrambles or crosswords to reinforce your spelling skills.

2:00 pm - 3:00 pm: Literary analysis
Read a short story or a poem and analyze the literary elements. Compare and contrast different literary works and discuss the themes and motifs present in the text.

3:00 pm - 4:00 pm: Reading for pleasure
Take some time to read for pleasure and explore different genres of literature. Choose a book or a magazine that you enjoy and spend an hour reading and expanding your literary horizons.

Thursday

8:00 am - 9:00 am: Grammar exercises
Complete a set of grammar exercises focusing on a specific grammar topic such as subject-verb agreement or punctuation. Use different resources such as workbooks or online grammar quizzes.

9:00 am - 10:00 am: Vocabulary review
Review the new words you learned earlier in the week. Use flashcards or online vocabulary quizzes to test your memory and retention of the words. Practice using the words in sentences to reinforce their meaning.

10:00 am - 11:00 am: Writing practice
Work on improving your writing skills by focusing on a specific aspect of writing such as organization, voice, or style. Practice writing paragraphs or essays using the skills you are working on.

11:00 am - 12:00 pm: Reading comprehension
Practice reading comprehension by reading a challenging passage and answering questions. Analyze your mistakes and work on improving your comprehension skills.

12:00 pm - 1:00 pm: Lunch break

1:00 pm - 2:00 pm: Spelling practice
Review the spelling words from the previous days and practice spelling exercises. Use different spelling activities such as word scrambles or crosswords to reinforce your spelling skills.

2:00 pm - 3:00 pm: Literary analysis
Read a short story or a poem and analyze the literary elements. Compare and contrast different literary works and discuss the themes and motifs present in the text.

3:00 pm - 4:00 pm: Reading for pleasure
Take some time to read for pleasure and explore different genres of literature. Choose a book or a magazine that you enjoy and spend an hour reading and expanding your literary horizons.

Friday

8:00 am - 9:00 am: Grammar review
Review different grammar topics and complete grammar exercises to reinforce your understanding of the rules. Focus on a specific grammar skill that you find challenging and practice until you feel confident.

9:00 am - 10:00 am: Vocabulary building
Learn new words and expand your vocabulary. Use different methods such as word games, flashcards, and online resources to learn and practice new words.

10:00 am - 11:00 am: Writing practice
Work on writing skills by focusing on a specific aspect of writing such as voice, style, or organization. Practice writing paragraphs or essays using the skills you are working on.

11:00 am - 12:00 pm: Reading comprehension practice
Practice reading comprehension by reading a challenging passage and answering questions. Analyze your mistakes and work on improving your comprehension skills.

12:00 pm - 1:00 pm: Lunch break

1:00 pm - 2:00 pm: Spelling practice
Review the spelling words from the previous days and practice spelling exercises. Use different spelling activities such as word scrambles or crosswords to reinforce your spelling skills.

2:00 pm - 3:00 pm: Literary analysis
Read a short story or a poem and analyze the literary elements. Compare and contrast different literary works and discuss the themes and motifs present in the text.

3:00 pm - 4:00 pm: Reading for pleasure
Take some time to read for pleasure and explore different genres of literature. Choose a book or a magazine that you enjoy and spend an hour reading and expanding your literary horizons.

Saturday and Sunday

On the weekends, take a break from the structured study plan and focus on reading for pleasure. Spend time reading books, magazines, and articles that interest you. This will help you relax and unwind while still improving your reading skills.

In addition to the daily study plan, it is important to practice English in everyday situations. Try to speak and write in English as much as possible, watch English-language movies and TV shows, and listen to English music and podcasts. This will help reinforce the skills you are learning and make English a natural part of your daily routine.
Introduction to Synopsys HSPICE
• PULSE
• SIN
• PWL
• AM (single frequency AM)
• SFFM (single frequency FM)
• EXP (exponential function)
© 2001 Synopsys, Inc. (9) CONFIDENTIAL
DC & AC Independent Source
Note: # is either a sweep or a hardcopy file number
Netlist Structure
• One main program and one or more optional submodules (.alter)
• HSPICE Output
  Run status                  .st0
  Output listing              .lis
  Analysis data, transient    .tr# (e.g. .tr0)
  Analysis data, dc           .sw# (e.g. .sw0)
  Analysis data, ac           .ac# (e.g. .ac0)
  Measure output              .m*# (e.g. .mt0)
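Several VASP Calculation Examples

Calculating the energy of the H atom with VASP

The energy of the hydrogen atom is [value missing]. In this section, we use VASP to calculate the energy of the H atom. For an atomic calculation, we can use the following INCAR file:

    PREC = ACCURATE
    NELMDL = 5                  make five delays till charge mixing
    ISMEAR = 0; SIGMA = 0.05    use smearing method

and the following KPOINTS file. Increasing the number of k-points only improves the description of the interactions between atoms, which is not needed in a single-atom calculation, so a single k-point is sufficient.

    Monkhorst Pack
    0
    Monkhorst Pack
    1 1 1
    0 0 0

We use the following POSCAR file:

    atom
    1
    15.00000   .00000   .00000
      .00000 15.00000   .00000
      .00000   .00000 15.00000
    1
    cart
    0 0 0

With the standard POTCAR for H, we obtain the following results:

    k-point 1 : 0.0000 0.0000 0.0000
    band No.  band energies  occupation
       1        -6.3145       1.00000
       2        -0.0527       0.00000
       3         0.4829       0.00000
       4         0.4829       0.00000

We can see that the electron energy level is not [value missing].

    Free energy of the ion-electron system (eV)
    ---------------------------------------------------
    alpha Z        PSCENC =         0.00060791
    Ewald energy   TEWEN  =        -1.36188267
    -1/2 Hartree   DENC   =        -6.27429270
    -V(xc)+E(xc)   XCENC  =         1.90099128
    PAW double counting   =         0.00000000    0.00000000
    entropy T*S    EENTRO =        -0.02820948
    eigenvalues    EBANDS =        -6.31447362
    atomic energy  EATOM  =        12.04670449
    ---------------------------------------------------
    free energy    TOTEN  =        -0.03055478 eV
    energy without entropy =       -0.00234530  energy(sigma->0) =  -0.01645004

We can see that this is not equal to [value missing] either.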
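The three energies printed at the bottom of the listing above are related by simple arithmetic, which the following sketch reproduces from the individual terms. It assumes the usual relations for Gaussian smearing (ISMEAR = 0): the free energy is the sum of the listed contributions, the energy without entropy is the free energy minus the entropy term, and energy(sigma->0) is the average of the two. The dictionary keys simply mirror the labels in the listing.

```python
# Energy contributions (eV) copied from the listing above.
terms = {
    "PSCENC": 0.00060791,
    "TEWEN": -1.36188267,
    "DENC": -6.27429270,
    "XCENC": 1.90099128,
    "PAW_double_counting": 0.0,
    "EENTRO": -0.02820948,
    "EBANDS": -6.31447362,
    "EATOM": 12.04670449,
}

toten = sum(terms.values())                   # free energy TOTEN
e_no_entropy = toten - terms["EENTRO"]        # energy without entropy
e_sigma0 = 0.5 * (toten + e_no_entropy)       # energy(sigma->0)

print(f"TOTEN                  = {toten:.8f} eV")
print(f"energy without entropy = {e_no_entropy:.8f} eV")
print(f"energy(sigma->0)       = {e_sigma0:.8f} eV")
```

The recomputed values agree with the printed ones to within the rounding of the last digit.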
Int. J. Bioinformatics Research and Applications, Vol. x, No. x, xxxx
Copyright © 200x Inderscience Enterprises Ltd.

A study of the repetitive structure and distribution of short motifs in human genomic sequences

Abanish Singh
Department of Computer Science and Engineering, University of Texas at Arlington, TX 76019, Arlington, USA
E-mail: singh@

Cedric Feschotte*
Department of Biology, University of Texas at Arlington, TX 76019, Arlington, USA
E-mail: cedric@
*Corresponding author

Nikola Stojanovic
Department of Computer Science and Engineering, University of Texas at Arlington, TX 76019, Arlington, USA
E-mail: nick@

Abstract: Over the last several years the search for functional elements in human and other genomes by exploiting motif over-representation became increasingly popular. However, about half of the human genome consists of known repeated elements, and that is also the case with most higher eukaryotes. In this study we have shown that in addition to these known repeats, human genomic sequences feature many short motifs which are significantly over-represented, and that their frequency varies only slightly between random repeat-masked sequences and regions located immediately upstream of the known genes. As the ongoing ENCODE project is set on the development of the techniques for high-throughput identification of the functional elements in the human genome, concentrating on about 1% of its entire DNA sequence, we have chosen these regions for a part of our study.

Keywords: DNA; repeated sequences; functional elements; sequence motifs.

Reference to this paper should be made as follows: Singh, A., Feschotte, C. and Stojanovic, N. (xxxx) ‘A study of the repetitive structure and distribution of short motifs in human genomic sequences’, Int. J. Bioinformatics Research and Applications, Vol. x, No. x, pp.xxx–xxx.
Biographical notes: Abanish Singh received his ME Degree in Computer Science and Engineering in 2000 from the Motilal Nehru National Institute of Technology, Allahabad, India. Before joining the Doctoral Program at the University of Texas at Arlington in 2004, he served on the faculty of the Department of Computer Science and Engineering at the Sant Longowal Institute of Engineering and Technology, Longowal, India, since 1993. His current PhD Thesis work is in bioinformatics and computational biology, specifically in genomic sequence analysis and motif discovery.

Cedric Feschotte received his PhD Degree in Biology in 2001 from the University of Paris VI, France. He was a postdoctoral research associate at the Departments of Plant Biology and Genetics at the University of Georgia from 2001 to 2004, where he worked with Dr. Susan Wessler on transposable elements in plant genomes. Since 2004, he has been an Assistant Professor in the Department of Biology at the University of Texas at Arlington. His current research focus is on the evolutionary history and genomic impact of mobile genetic elements, with an emphasis on the human genome.

Nikola Stojanovic received his PhD Degree in Computer Science and Engineering in 1997 from the Pennsylvania State University, University Park, PA. After five years of working on the Human Genome Project at the Whitehead Institute/MIT Center for Genome Research in Cambridge, MA, he joined the faculty of the University of Texas at Arlington in 2003, as an Assistant Professor. His research interests are in algorithms for genomic sequence analysis, phylogenetic studies and sequence alignments.

1 Introduction

The search for transcription factor binding sites is one of the most popular sub-fields of bioinformatics, and many algorithms have been developed over more than a decade of intensive research.
The first approaches relied on a rather naive assumption that the target sites for protein binding must feature information content sufficient for them to be recognised. Disillusionment soon followed, as any attempt to isolate functional elements in DNA resulted in an enormous number of false positives. Recent approaches have thus concentrated on the incorporation of additional information to the raw sequence data. They often relied on phylogenetic conservation (Stojanovic et al., 1999; Jegga et al., 2002; Sharan et al., 2003; Corcoran et al., 2005) or a search for clusters of elements whose sequences match experimentally confirmed consensus motifs (Jegga et al., 2002; Sharan et al., 2003) retrieved from databases such as TRANSFAC (Matys et al., 2006) or Jaspar (Sandelin et al., 2004). The latter methods exploited the fact that proteins involved in the initiation of transcription rarely, if ever, act in isolation.

With the advances in microarray technology, large sets of putatively co-expressed genes became available. This, in turn, stimulated the development of new techniques aiming to detect conserved motifs in the upstream sequences of these genes (Hughes et al., 2000; Jegga et al., 2002; Bannai et al., 2004). It is intuitive that if a group of genes is coordinately regulated, it should be controlled by the same transcription factors. From the hypothesis that protein binding is directed by DNA sequence motifs it follows that the same motifs should be present in all observed upstream regulatory sequences, moreover as a cluster, or multiple clusters representing targets for the transcriptional initiation complexes (Jegga et al., 2002; Johansson et al., 2003; Sharan et al., 2003). This led to the exploitation of motif over-representation in the target regions.
In addition, it has been observed in yeast that the promoter regions are often characterised by multiple occurrences of the same binding motif (van Helden et al., 1998), and it has been postulated that it may also be the case in higher eukaryotes. Along with the expectation of a co-occurrence of motifs in different regulatory sequences, this postulate stimulated the search for over-represented, or ‘surprise’, motifs (Apostolico et al., 2000; van Helden, 2004).

As the search for non-coding functional elements in newly sequenced genomes intensified, a comprehensive study of the effectiveness of many motif-finding tools has been performed (Tompa et al., 2005). Not surprisingly, it has shown that, while there has been some success in binding site recognition, the existing methods are not nearly satisfactory. There are several reasons for that. Spatial configuration of DNA, along with other epigenetic phenomena, may be a major factor in transcription factor binding, and no currently available tool incorporates this information. Transcription factors generally feature non-specific binding preferences (Balhoff and Wray, 2005), and that permits variations in the motif consensus. To make things worse, transcription factor binding sites are short, and their detection may be hampered by pieces of repeated or randomly conserved short sequences. Our own previous study has shown that it is indeed the case (Stojanovic et al., 1999). This, in turn, may force us to comparatively look at homologous or paralogous sequences which have diverged so much that the proteins that bind there are only similar, but no longer the same. The solution to this problem may lie in the simultaneous study of a large number of closely related sequences, which are becoming increasingly available.
This was one of the major methods of the ENCODE project (The ENCODE Project Consortium, 2004), which in its pilot phase aimed at the development of high-throughput techniques for the classification of DNA elements within a set of target regions comprising approximately 1% of the human genome.

It is well known that even non-functional parts of a genome are not a random assembly of four letters. The analysis of statistical features of sequences often involves a non-trivial background model, such as those based on Markov Models, or Hidden Markov Models. In this study we have done extensive simulations and analysis of real data in order to identify short motif conservation patterns in the human genome, and in particular in the ENCODE target regions. While the number of repeated elements in randomly generated synthetic sequences conformed almost perfectly to the Poisson expectation, the number of repeated substrings in repeat-masked random intergenic sequences was far greater than expected. This bias appears to be genome-wide, as it persisted even when we simultaneously considered many additional randomly collected human sequences, varying in size between 100 and 4,000 characters. In consequence, any search for conserved motifs is bound to return many results, and, depending on what we search for, most would likely be false positives.

In order to gain a better perspective on our ability to characterise significant over-represented motifs in different regions of the genome, we have looked at the number of short (<20 bp) repeats, both exact and approximate, in various genomic environments and synthetic sequences. Although studies have been done regarding the distribution of tandem repeats (Bilgen et al., 2004) and larger interspersed repeats (International Human Genome Sequencing Consortium, 2001), we are unaware of any systematic examination of the genome-wide occurrences of very short interspersed motifs.

2 Distribution of short exact repeated motifs

We started the analysis by creating six different data sets, each consisting of 100 sequences of 500 bases in length. Although we looked at other possible segment lengths, as short as 50, and as long as several thousand, the results on short exact sequences were not substantially different and length 500 was well suited for the consideration of regions immediately 5’ to known genes. Although the issue is still unresolved, some studies have shown that most cis-acting regulatory elements appear to cluster in the gene upstream regions of about this length (Khambata-Ford et al., 2003).

Four of our data sets were synthetic, containing sequences created by assembling A’s, C’s, G’s and T’s using a random number generator on Unix, and sequences generated by second, third and fifth order Markov Models, trained on one million bases taken from human chromosome 2 obtained through the Ensembl (Birney et al., 2006) genome browser. These specific MMs have been selected because the second and the third order are widely used in the simulation of genetic sequences, and the fifth order is popular in gene-finding tools. We were especially interested in the behaviour of the second order Markov Model, as it has been used to generate control sequences in a comprehensive evaluation of motif-finding tools (Tompa et al., 2005). Strings generated by even higher order MMs were considered in order to confirm trends, but not studied in detail. The remaining two sets were real DNA sequences: one was constructed from the upstream regions immediately 5’ to annotated Ensembl human genes and the other consisted of random repeat-masked human intergenic sequences.
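
The synthetic Markov Model sets described above can be illustrated with a minimal sketch. It is not the authors' generator: the function names `train_markov` and `generate` are hypothetical, the training string is a toy stand-in for the one million bases of chromosome 2, and both the choice of the starting context and the uniform fallback for contexts never seen in training are assumptions made only to keep the example self-contained. The order must be at least 1.

```python
import random
from collections import Counter, defaultdict


def train_markov(training_seq, order):
    """Count which base follows each context of `order` bases (order >= 1)."""
    counts = defaultdict(Counter)
    for i in range(len(training_seq) - order):
        context = training_seq[i:i + order]
        counts[context][training_seq[i + order]] += 1
    return counts


def generate(counts, order, length, rng):
    """Emit `length` bases, sampling each one from the trained context counts."""
    # Assumption: start from a context seen in training, chosen uniformly.
    seq = list(rng.choice(sorted(counts)))
    while len(seq) < length:
        context = "".join(seq[-order:])
        followers = counts.get(context)
        if not followers:
            # Assumption: unseen context, fall back to a uniformly chosen base.
            seq.append(rng.choice("ACGT"))
            continue
        bases = sorted(followers)
        weights = [followers[b] for b in bases]
        seq.append(rng.choices(bases, weights=weights)[0])
    return "".join(seq[:length])


# Toy usage: 100 sequences of 500 bases from a second order model.
rng = random.Random(0)
model = train_markov("ACGTTGCAACGGTTAACCGT" * 10, order=2)
synthetic_set = [generate(model, order=2, length=500, rng=rng) for _ in range(100)]
```

A second order model of this kind reproduces trinucleotide frequencies of the training data but nothing longer, which is one plausible reason why such models can under-represent the longer repeats found in real sequence.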
The total length of sequences in each set was 50,000 bases (300,000 letters total).

In order to count short exact repeated oligonucleotides we used a modification of the Karp–Rabin pattern matching algorithm (Karp and Rabin, 1987), locating all repeats of specified length in time linear with the size of the sequence. The original Karp–Rabin method was based on numerical keys to code patterns, and we used such keys as indices to a hash table counting the number of occurrences of each motif. We ran our program separately for motif lengths varying between 4 and 9, and recorded the total numbers of repeated elements in Table 1. Since the repeats have been counted separately for each of the 100 sequences in each set, the recorded values include the mean (µ) and the standard deviation (σ) for all runs. In addition to the empirically determined counts we have also recorded the expected numbers of the repeats, based on the Poisson model.

As can be seen from Table 1, there were only insignificant differences between the models for motifs of length 4.¹ Starting with length five, an obvious pattern emerges, in which the number of repeats in sequences created by the random number generator correlates with Poisson predictions very well, but none of the other models do. Indeed, a chi-square test on the columns of Table 1 (µ values), whose results are shown in Table 2, confirmed with very high confidence that random synthetic sequence draws from the same distribution as Poisson prediction, but rejected other sets (except, weakly, the higher order MMs). There were more repeats than expected in all Markov Models and they corresponded well to each other, confirmed by solid p-values. A weak similarity has also been found between the second order Markov Model and random intergenic sequences.
This, on one hand, justifies its use in modeling genomic environments, but it also advises caution concerning the use of the MMs in simulations.

Table 1 The mean numbers (µ) and standard deviations (σ) of repeated patterns of different lengths in different types of nucleotide sequences. Pattern counting has been done over 100 sequences of length 500 in each category. Cells give µ (σ).

| Pattern length | Expected number | Random synthetic | 2nd order Markov M. | 3rd order Markov M. | 5th order Markov M. | Random genomic | Upstream regulatory |
|---|---|---|---|---|---|---|---|
| 4 | 429.06 | 425.74 (6.36) | 437.99 (8.12) | 432.84 (7.1) | 432.23 (6.91) | 438.97 (8.5) | 433.92 (9.94) |
| 5 | 193.16 | 189.18 (15.59) | 237.83 (17.0) | 222.98 (16.68) | 222.27 (15.83) | 261.64 (33.49) | 260.11 (30.67) |
| 6 | 57.46 | 55.16 (12.51) | 84.33 (15.16) | 74.58 (13.27) | 75.88 (14.66) | 106.62 (43.5) | 115.31 (37.72) |
| 7 | 15.03 | 14.0 (5.77) | 24.5 (9.97) | 21.82 (7.81) | 23.3 (9.48) | 38.66 (44.31) | 47.54 (29.88) |
| 8 | 3.8 | 3.12 (2.75) | 7.05 (5.15) | 5.75 (4.16) | 6.87 (5.19) | 15.72 (44.26) | 21.3 (21.62) |
| 9 | 0.95 | 0.56 (1.17) | 1.94 (2.42) | 1.47 (1.92) | 1.97 (2.25) | 8.57 (44.04) | 11.33 (15.67) |

Table 2 Chi-square confidence levels for the compared data sets, indicating the likelihood that sequences in the compared set pairs (column-wise in Table 1) have indeed been drawn from the same distribution. MMn abbreviates nth order Markov model.

| | Expected number | Random synthetic | 2nd order Markov M. | 3rd order Markov M. | 5th order Markov M. | Random genomic | Upstream regulatory |
|---|---|---|---|---|---|---|---|
| Expected | 1.0 | >0.995 | <0.005 | ≈0.02 | <0.005 | <<0.005 | <<0.005 |
| Random | >0.995 | 1.0 | <0.01 | ≈0.2 | ≈0.1 | <<0.005 | <<0.005 |
| MM2 | <0.005 | <0.01 | 1.0 | ≈0.6 | ≈0.8 | ≈0.025 | <0.005 |
| MM3 | ≈0.02 | ≈0.2 | ≈0.6 | 1.0 | >0.995 | <0.005 | <0.005 |
| MM5 | <0.005 | ≈0.1 | ≈0.8 | >0.995 | 1.0 | <0.005 | <0.005 |
| Genomic | <<0.005 | <<0.005 | ≈0.025 | <0.005 | <0.005 | 1.0 | ≈0.8 |
| Regulatory | <<0.005 | <<0.005 | <0.005 | <0.005 | <0.005 | ≈0.8 | 1.0 |

It was surprising, and somewhat discouraging for the attempts of locating functional elements through over-representation, that random intergenic repeat-masked (and thus, presumably, reasonably unique) sequences featured about the same number of short repeated motifs as sequences taken upstream of the genes. The chi-square test was conclusive on this, with the p-value indicating a strong agreement. As for the correspondence between the real sequences and the models, random genomic sequences appear to have somewhat similar number of repeats as the second order Markov Model, but are otherwise quite distinct from any other simulated data set. Overall, real genomic sequences were similar to each other, but not to the models, Markov Models mutually agreed well, but otherwise did not show significant similarity to other data sets, and synthetic sequences corresponded well to the Poisson prediction (a ‘sanity check’), but their composition was different than that of Markov Model simulated data, and dramatically different than that of the real sequences.

We next analysed these relationships at a finer granularity, looking separately at each motif length, and for each length separately at the number of motifs repeating n times, where n was varied between two and ten or more (the latter counted together). Although we did full analysis for all motif lengths in our range, the results were similar, and we show the representative sample for motif lengths 4, 7 and 9 in Table 3.
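
The tally just described (for a given motif length, the number of distinct motifs repeated exactly n times, with ten or more pooled) can be sketched as follows. This is an illustration in the spirit of the key-based counting described earlier, not the authors' program: the names `kmer_counts` and `repeat_histogram` are hypothetical, the key is an exact base-4 encoding of the window rather than a modular fingerprint (adequate for the short motifs considered here), and treating any character other than upper-case A, C, G, T as a break in the sequence is an assumption about how masked bases would be handled.

```python
from collections import Counter

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_counts(seq, k):
    """Count every length-k window in one pass, using a rolling base-4 key."""
    counts = Counter()
    mask = 4 ** k - 1   # keeps only the 2*k low-order bits of the key
    key = 0
    valid = 0           # consecutive unmasked bases ending at this position
    for ch in seq:
        if ch not in CODE:
            # Assumption: masked or ambiguous bases break the window.
            key, valid = 0, 0
            continue
        key = ((key << 2) | CODE[ch]) & mask
        valid += 1
        if valid >= k:
            counts[key] += 1
    return counts


def repeat_histogram(counts, cap=10):
    """Distinct motifs occurring exactly n times (n >= 2); n >= cap is pooled."""
    hist = Counter()
    for occurrences in counts.values():
        if occurrences >= 2:
            hist[min(occurrences, cap)] += 1
    return hist


# Toy usage on a single short sequence.
hist = repeat_histogram(kmer_counts("ACGTACGTAC", 4))
```

Each base is processed once, so the running time is linear in the sequence length, and the hash table holds at most one entry per distinct motif.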
As before, the sequences generated by using random numbers corresponded to Poisson predictions consistently well, while there was a discrepancy between these two and all other models. There was a somewhat weak mutual agreement between different Markov Models (at least two of the three corresponded to each other in every test), and occasionally between random genomic and gene upstream sequences, but the fit between the synthetic and real data was consistently poor.

In this round of testing we have applied the chi-square test on all combinations of models, separately for each motif length, using the sums of the number of repeats in 100 runs as our samples x_i, where i corresponded to the number of times the motifs have been repeated (so, for instance, in the test for motifs of length 6, x_3 was the count of motifs of length 6 repeated three times). This method provided us with a sufficient sample size in each category i, which could have otherwise been a problem, having in mind the relative scarcity of long exact motifs repeated many times. Unfortunately, for all comparisons except those involving Poisson expectations we needed to estimate the expected values from the data, and thus substantially reduce the number of degrees of freedom, which has made our analysis of longer repeats somewhat unreliable.

Table 3 The mean numbers (µ) and standard deviations (σ) of repeated patterns of length 4, 7 and 9 in different types of nucleotide sequences. For each motif length, the corresponding rows represent the numbers of motifs repeated n times, where n varies between two and ten or more. Pattern counting has been done over 100 sequences of length 500 in each category. Cells give µ (σ).

| Length/repeats | Expected number | Random synthetic | 2nd order Markov M. | 3rd order Markov M. | 5th order Markov M. | Random genomic | Upstream regulatory |
|---|---|---|---|---|---|---|---|
| 4/2 | 69.25 | 70.28 (6.2) | 50.05 (6.97) | 55.42 (6.77) | 55.16 (7.94) | 45.17 (8.38) | 47.19 (9.42) |
| 4/3 | 45.09 | 44.86 (5.84) | 37.64 (5.53) | 39.04 (5.32) | 38.93 (4.99) | 31.68 (8.1) | 31.05 (7.69) |
| 4/4 | 22.02 | 21.36 (3.9) | 24.39 (4.34) | 22.7 (4.7) | 23.21 (4.37) | 20.13 (4.73) | 19.0 (5.27) |
| 4/5 | 8.6 | 8.17 (2.8) | 12.8 (3.18) | 11.69 (3.14) | 11.12 (3.08) | 11.65 (3.12) | 11.0 (3.46) |
| 4/6 | 2.8 | 2.83 (1.73) | 5.44 (2.34) | 4.52 (1.83) | 4.56 (1.86) | 6.41 (2.38) | 5.81 (2.28) |
| 4/7 | 0.78 | 0.73 (0.8) | 2.51 (1.47) | 2.16 (1.38) | 2.09 (1.39) | 3.67 (2.02) | 3.17 (1.9) |
| 4/8 | 0.19 | 0.23 (0.47) | 0.94 (1.01) | 0.69 (0.81) | 0.77 (1.0) | 1.8 (1.39) | 1.79 (1.37) |
| 4/9 | 0.04 | 0.03 (0.17) | 0.38 (0.64) | 0.33 (0.57) | 0.41 (0.62) | 1.32 (1.43) | 1.13 (1.35) |
| 4/10+ | 0.01 | 0 (0) | 0.16 (0.39) | 0.12 (0.32) | 0.23 (0.51) | 0.79 (1.56) | 0.74 (1.09) |
| 7/2 | 7.4 | 6.97 (2.87) | 11.8 (4.85) | 10.39 (3.64) | 10.86 (4.12) | 15.85 (6.58) | 18.51 (9.25) |
| 7/3 | 0.07 | 0.02 (0.14) | 0.3 (0.61) | 0.3 (0.59) | 0.39 (0.72) | 0.75 (1.62) | 1.52 (2.7) |
| 7/4 | 0.001 | 0 (0) | 0 (0) | 0.01 (0.01) | 0.05 (0.22) | 0.06 (0.24) | 0.47 (1.11) |
| 7/5 | 0 | 0 (0) | 0 (0) | 0.02 (0.14) | 0.03 (0.17) | 0.13 (0.91) | 0.18 (0.52) |
| 7/6 | 0 | 0 (0) | 0 (0) | 0 (0) | 0.01 (0.01) | 0.13 (1.01) | 0.13 (0.44) |
| 7/7 | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.06 (0.42) | 0.03 (0.17) |
| 7/8 | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.07 (0.6) | 0.06 (0.24) |
| 7/9 | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.14 (1.3) | 0.03 (0.17) |
| 7/10+ | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.08 (0.8) | 0.03 (0.17) |
| 9/2 | 0.48 | 0.28 (0.58) | 0.97 (1.21) | 0.72 (0.96) | 0.92 (1.07) | 2.07 (3.18) | 3.81 (5.72) |
| 9/3 | 0 | 0 (0) | 0 (0) | 0.01 (0.01) | 0.03 (0.17) | 0.21 (1.6) | 0.33 (0.9) |
| 9/4 | 0 | 0 (0) | 0 (0) | 0 (0) | 0.01 (0.01) | 0.05 (0.26) | 0.15 (0.57) |
| 9/5 | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.13 (1.01) | 0.07 (0.35) |
| 9/6 | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.13 (1.29) | 0.06 (0.28) |
| 9/7 | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.09 (0.8) | 0.07 (0.29) |
| 9/8 | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.03 (0.3) | 0.01 (0.01) |
| 9/9 | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.01 (0.01) | 0.01 (0.1) |
| 9/10+ | 0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.04 (0.4) | 0.01 (0.01) |

Interestingly, while there was a good agreement in the repeat distribution in random genomic and gene upstream sequences for motifs of length 4, the chi-square test failed for every other length. As can be seen from Table 3 for lengths 7 and 9, gene upstream sequences appear to feature a preference for an increased number of moderately repeated motifs, while random genomic sequences are biased towards smaller numbers of motifs repeated more dramatically (five or more times). This pattern was consistent for all considered motif lengths; however, the fewer motifs of higher repeat count compensated for the lower number of moderately repeated motifs, resulting in a similarity in the overall number of repeated sequences throughout our test sets, regardless of their proximity to the genes. Under any circumstances, the number of short repeated motifs in the genomic sequences was greater than in any of the synthetic models, and far greater (an order of magnitude for longer motifs) than the Poisson expectations.

3 Analysis of most common short degenerate motifs

After experiencing this dramatic micro-repetitive structure of the human genome we were interested to find the most common short motifs significantly repeated in the ENCODE regions. Although the program used above was capable of locating short exact motifs in sequences of any length, in linear time, concentrating on perfect conservation appeared to be too restrictive. We have thus used our tool for finding short variable repeated motifs, described in detail in Singh and Stojanovic (2006).

Briefly, our software starts by locating all exact repeated patterns in given sequences, including dinucleotides. This seeding step is done in time slightly super-linear to the length of the sequences, using the suffix tree data structure (Weiner, 1973), which has recently been applied to problems similar to ours (Adebiyi et al., 2001; Bannai et al., 2004). After building the original list of repeats, we use an indexing scheme to quickly locate all neighbours of a given (seed) motif, and search for all pairs that appear to be substantially repeated together, at a fixed distance.
Whenever such pairs are found we build the tentative consensus of the approximate motif, and recursively try to extend it with additional seed elements and overlaps. The consensus building continues until a certain quality threshold, usually set to 0.9 (90%) or 0.95, cannot be maintained any longer. We label all positions of absolute conservation with uppercase letters, and assign them the weight based on the number of sites participating in the construction of the consensus. Positions featuring a majority character, but occasionally broken with a mismatch, are labelled with a lowercase character, whose weight is determined based on the number of sites which agree. When there is no agreement at a position it is signified by character ‘N’. The final consensus motif is reported based on the probabilistic evaluation of its length, weight and the number of occurrences.

Since the program assumes that it has been given a set of sequences, rather than a single one, it considerably reduces the search space by filtering out the motifs which do not appear in the minimal number of distinct segments (a settable parameter). Unfortunately, when the sequences in the input are very similar (such as vertebrate ultra-conserved sequences, or even just homologous sequences from closely related species), this causes a theoretically exponential explosion of the recursive refinement step. However, such situations are rare (after repeat masking, most intergenic sequences do not exhibit good conservation of motifs longer than about a dozen bases), and easily detectable. On average, our software is capable of locating all significantly repeated variable motifs quickly and accurately.

We ran this motif-detection program on the entire set of ENCODE regions, obtained through Ensembl, after masking the repeats. The repeat masking step was done since
The repeat masking step was done sinceA study of the repetitive structure and distribution of short motifs 9 there was little purpose in trying to find common short repeated motifs in the presence of known long repeat elements. We performed 150,000 runs on 5–10 randomly chosen segments of length 1000, setting the program parameters so that only elements, which have been found in all segments were reported. We have not excluded known exons from our test data – it simplified the selection and, since exons generally comprise less than 2% of the human genome we believed that they would not significantly affect our results.Although we recorded only motifs of length 7 and above, their number was in thousands even after filtering these which were nearly identical or inverse complements of each other, and these featuring extremely simple sequence (all A’s, for instance) or tandem repeats. All these motifs do deserve further classification, but at this time we have limited our study only to about two dozen, which were statistically least likely to occur by chance. Since our repeat masked ENCODE sequences contained 40,645,510 bases, counting both strands, in a completely random string of this length a motif of size 10, for instance, would be expected to be found about 39 times. We used such considerations as a basis for the selection of the top choices, where the effective length of the string was calculated by assigning different weights to differently conserved positions (1 for an uppercase letter, 0.5 for lowercase and 0 for an ‘N’ – this could have been further refined by using the exact weights of the identified motifs, but for this study that was not necessary).In order to do a tentative characterisation of the discovered motifs, we have checked them against the human entries in RepBase (Jurka, 2000) for possible membership in a known repeat family, and TRANSFAC (Matys et al., 2006) for a possible functional role. 
Table 4 summarises this information for the four longest motifs in our list (the fifth one of that size was (CTG)4, which we filtered out), and Table 5 provides the same account for the top five motifs after those with two or fewer G’s or C’s were removed. The remaining top motifs were either degenerated poly-A’s (or poly-T’s), or a combination of A’s and T’s. Although they are also potentially significant, we have not studied them in detail, since poly-A tails are known to be present in many copies in genomic sequences. One explanation for their prevalence is that they are derived from the terminus of non-LTR retrotransposon repeats. These elements are abundant in the human genome (over 2.3 million copies spanning over one third of the genome) and they are characterised by a stretch of poly-A’s at their 3’ end (International Human Genome Sequencing Consortium, 2001). Because of its variable length and rapid mutational degradation, part or all of the 3’ poly-A terminus of non-LTR retrotransposons may often remain after repeat masking.

Even though the number of matches our top motifs had in RepBase generally exceeded what would be expected by chance alone, these hits were not concentrated in a single repeat class, and thus probably do not represent remnants of a particular mobile element, at least not one of a known classification. Similarly, their TRANSFAC matches do not appear to lend strong support to the hypothesis that they may be functional protein binding sites: the examined sequence was human, but most of the hits were in non-human elements, or elements which are common in repeated sequences throughout the mammalian lineage (or even broader). While these motifs are clearly strongly repetitive, and some also likely functional, further studies are needed in order to characterise their nature and origins.
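
The chance-occurrence estimate used earlier in this section to rank motifs (about 39 expected hits for a motif of size 10 in 40,645,510 random bases) can be reproduced with a short calculation. The sketch below assumes independent, equally likely bases (probability 1/4 each) and ignores edge effects; the function names and the example consensus string are hypothetical, and the position weights are the ones stated in the text (1 for uppercase, 0.5 for lowercase, 0 for ‘N’).

```python
def effective_length(consensus):
    """Weighted motif length: 1 per uppercase base, 0.5 per lowercase, 0 for 'N'."""
    total = 0.0
    for ch in consensus:
        if ch in "ACGT":
            total += 1.0
        elif ch in "acgt":
            total += 0.5
        # 'N' (no agreement at the position) contributes nothing
    return total


def expected_occurrences(total_bases, length):
    """Expected hits of one fixed motif in a uniformly random string."""
    return total_bases / 4 ** length


# The figure quoted in the text: a motif of size 10 in the masked ENCODE set.
encode_bases = 40_645_510
hits_len10 = expected_occurrences(encode_bases, 10)

# A made-up degenerate consensus, scored by its effective length.
hits_degenerate = expected_occurrences(encode_bases, effective_length("ACgtNACGTa"))
```

Here `hits_len10` evaluates to roughly 38.8, matching the "about 39" quoted above, while the degenerate example has effective length 7.5 and is therefore expected far more often by chance.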