数据库外文文献翻译
毕业论文(设计)外文文献翻译及原文
金融体制、融资约束与投资——来自OECD的实证分析
R. Semenov, Department of Economics, University of Nijmegen, Nijmegen(荷兰内梅亨大学经济学院)
这篇论文考查了11个OECD国家中现金流量对企业投资的影响。我们发现,投资对企业内部可获取资金的敏感性在不同国家之间具有显著差异,并且银企关系紧密的国家的敏感性低于银企之间仅保持距离型(arm's length)关系的国家。同时,我们发现融资约束与整体金融发展指标不存在关系。我们的结论与"资本市场的信息和激励问题对企业投资具有重要作用"这一观点一致:紧密的银企关系会减少这些问题,从而拓宽企业获取外部融资的渠道。
一、引言
各个国家的企业在显著不同的金融体制下运行。
各国在金融发展水平(例如,信贷规模与GDP之比、股票市场市值与GDP之比)、所有者与管理者之间及企业与债权人之间关系的模式,以及公司控制权市场的活跃程度等方面的差别,都有充分的文献记录。在完美资本市场中,具有正净现值投资机会的企业总能获得资金。
然而,经济理论表明,诸如信息不对称和激励问题等市场摩擦会使获得外部资本更加昂贵,因此具有盈利投资机会的企业不一定能够获取所需资本。这表明融资因素,例如内部产生的资金数量、新债务和权益融资的可得性,共同决定了企业的投资决策。现今已经有大量考查外部资金可得性对投资决策影响的实证研究(可参考,例如Fazzari(1998)、Hoshi(1991)、Chapman(1996)、Samuel(1998))。大多数研究结果表明,现金流量等金融变量有助于解释企业的投资水平。
这类研究结果通常被解释为:企业投资受限于外部资金的可得性。
很多模型强调,运行良好的金融中介和金融市场有助于缓解信息不对称、降低交易成本,从而促使储蓄资金投向长期和高回报的项目,并提高资源配置的效率(参看Levine(1997)的综述文章)。
因而我们预期,处于更加发达的金融体制中的企业将更容易获得外部融资。几位学者已经指出,企业与金融中介机构之间建立紧密关系可以进一步缓解金融市场摩擦。
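下面给出一个极简的示意代码,说明文中所述"投资对现金流的敏感性"通常如何用回归来估计;其中的数据文件和列名(investment、cash_flow、q 等)均为假设,并非原文给出的实际模型设定,仅用于说明思路。

```python
# 极简示意:估计投资对现金流的敏感性(数据与变量名均为假设,仅作说明)
import pandas as pd
import statsmodels.formula.api as smf

# 假设的公司—年度面板数据,包含投资率、现金流/资本、托宾Q等列
df = pd.read_csv("firm_panel.csv")  # 列:firm, year, investment, cash_flow, q

# 文献中常见的基本设定:投资率对现金流和投资机会(Q)回归,
# cash_flow 的系数即"融资约束/敏感性"的度量
model = smf.ols("investment ~ cash_flow + q + C(year)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["firm"]}  # 按公司聚类的稳健标准误
)
print(model.summary())
# 若按国家分组分别估计,即可比较 cash_flow 系数在不同金融体制下的大小
```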
大数据外文翻译参考文献综述
(文档含中英文对照,即英文原文和中文翻译)
原文:Data Mining and Data Publishing
Data mining is the extraction of vast interesting patterns or knowledge from huge amounts of data. The initial idea of privacy-preserving data mining (PPDM) was to extend traditional data mining techniques to work with the data modified to mask sensitive information. The key issues were how to modify the data and how to recover the data mining result from the modified data. Privacy-preserving data mining considers the problem of running data mining algorithms on confidential data that is not supposed to be revealed even to the party running the algorithm. In contrast, privacy-preserving data publishing (PPDP) may not necessarily be tied to a specific data mining task, and the data mining task may be unknown at the time of data publishing. PPDP studies how to transform raw data into a version that is immunized against privacy attacks but that still supports effective data mining tasks. Privacy preservation for both data mining (PPDM) and data publishing (PPDP) has become increasingly popular because it allows sharing of privacy-sensitive data for analysis purposes. One well-studied approach is the k-anonymity model [1], which in turn led to other models such as confidence bounding, l-diversity, t-closeness, (α,k)-anonymity, etc. In particular, all known mechanisms try to minimize information loss, and such an attempt provides a loophole for attacks. The aim of this paper is to present a survey of most of the common attack techniques for anonymization-based PPDM & PPDP and explain their effects on data privacy. Although data mining is potentially useful, many data holders are reluctant to provide their data for data mining for fear of violating individual privacy. In recent years, study has been made to ensure that the sensitive information of individuals cannot be identified easily.
Anonymity Models: k-anonymization techniques have been the focus of intense research in the last few years. In order to ensure anonymization of data while at the same time minimizing the information loss resulting from data modifications, several extending models have been proposed, which are discussed as follows.
1. k-Anonymity
k-anonymity is one of the most classic models; it is a technique that prevents joining attacks by generalizing and/or suppressing portions of the released microdata so that no individual can be uniquely distinguished from a group of size k. In the k-anonymous tables, a data set is k-anonymous (k ≥ 1) if each record in the data set is indistinguishable from at least (k − 1) other records within the same data set. The larger the value of k, the better the privacy is protected. k-anonymity can ensure that individuals cannot be uniquely identified by linking attacks.
2. Extending Models
Since k-anonymity does not provide sufficient protection against attribute disclosure, the notion of l-diversity attempts to solve this problem by requiring that each equivalence class has at least l well-represented values for each sensitive attribute. The technology of l-diversity has some advantages over k-anonymity, because a k-anonymous dataset permits strong attacks due to lack of diversity in the sensitive attributes. In this model, an equivalence class is said to have l-diversity if there are at least l well-represented values for the sensitive attribute. Because there are semantic relationships among the attribute values, and different values have very different levels of sensitivity.
After anonymization, in any equivalence class, the frequency (in fraction) of a sensitive value is no more than α.
3. Related Research Areas
Several polls show that the public has an increased sense of privacy loss. Since data mining is often a key component of information systems, homeland security systems, and monitoring and surveillance systems, it gives a wrong impression that data mining is a technique for privacy intrusion. This lack of trust has become an obstacle to the benefit of the technology. For example, the potentially beneficial data mining research project, Terrorism Information Awareness (TIA), was terminated by the US Congress due to its controversial procedures of collecting, sharing, and analyzing the trails left by individuals. Motivated by the privacy concerns on data mining tools, a research area called privacy-preserving data mining (PPDM) emerged in 2000. The initial idea of PPDM was to extend traditional data mining techniques to work with the data modified to mask sensitive information. The key issues were how to modify the data and how to recover the data mining result from the modified data. The solutions were often tightly coupled with the data mining algorithms under consideration. In contrast, privacy-preserving data publishing (PPDP) may not necessarily be tied to a specific data mining task, and the data mining task is sometimes unknown at the time of data publishing. Furthermore, some PPDP solutions emphasize preserving the data truthfulness at the record level, but PPDM solutions often do not preserve such a property. PPDP differs from PPDM in several major ways as follows:
1) PPDP focuses on techniques for publishing data, not techniques for data mining. In fact, it is expected that standard data mining techniques are applied on the published data. In contrast, the data holder in PPDM needs to randomize the data in such a way that data mining results can be recovered from the randomized data. To do so, the data holder must understand the data mining tasks and algorithms involved. This level of involvement is not expected of the data holder in PPDP, who usually is not an expert in data mining.
2) Both randomization and encryption do not preserve the truthfulness of values at the record level; therefore, the released data are basically meaningless to the recipients. In such a case, the data holder in PPDM may consider releasing the data mining results rather than the scrambled data.
3) PPDP primarily "anonymizes" the data by hiding the identity of record owners, whereas PPDM seeks to directly hide the sensitive data. Excellent surveys and books on randomization and cryptographic techniques for PPDM can be found in the existing literature. A family of research work called privacy-preserving distributed data mining (PPDDM) aims at performing some data mining task on a set of private databases owned by different parties. It follows the principle of Secure Multiparty Computation (SMC), and prohibits any data sharing other than the final data mining result. Clifton et al. present a suite of SMC operations, like secure sum, secure set union, secure size of set intersection, and scalar product, that are useful for many data mining tasks. In contrast, PPDP does not perform the actual data mining task, but concerns itself with how to publish the data so that the anonymous data are useful for data mining. We can say that PPDP protects privacy at the data level while PPDDM protects privacy at the process level. They address different privacy models and data mining scenarios.
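The k-anonymity and l-diversity notions summarized above can be checked mechanically on a released table. The following is a minimal sketch (not from the surveyed papers; the table, quasi-identifiers and sensitive attribute are invented for illustration) that groups records into equivalence classes and reports the k and distinct-l values they satisfy.

```python
# Minimal sketch: check k-anonymity and distinct l-diversity of a released table.
# The rows, quasi-identifiers and sensitive attribute below are hypothetical.
from collections import defaultdict

rows = [
    {"age": "3*", "zip": "130**", "disease": "flu"},
    {"age": "3*", "zip": "130**", "disease": "cancer"},
    {"age": "3*", "zip": "130**", "disease": "flu"},
    {"age": "4*", "zip": "148**", "disease": "hepatitis"},
    {"age": "4*", "zip": "148**", "disease": "flu"},
]
quasi_identifiers = ("age", "zip")
sensitive = "disease"

# Group records into equivalence classes sharing the same quasi-identifier values
classes = defaultdict(list)
for r in rows:
    classes[tuple(r[q] for q in quasi_identifiers)].append(r[sensitive])

k = min(len(v) for v in classes.values())        # smallest equivalence class size
l = min(len(set(v)) for v in classes.values())   # fewest distinct sensitive values in a class

print(f"table is {k}-anonymous and (distinct) {l}-diverse")
```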
In the field of statistical disclosure control (SDC), the research works focus on privacy-preserving publishing methods for statistical tables. SDC focuses on three types of disclosures, namely identity disclosure, attribute disclosure, and inferential disclosure. Identity disclosure occurs if an adversary can identify a respondent from the published data. Revealing that an individual is a respondent of a data collection may or may not violate confidentiality requirements. Attribute disclosure occurs when confidential information about a respondent is revealed and can be attributed to the respondent. Attribute disclosure is the primary concern of most statistical agencies in deciding whether to publish tabular data. Inferential disclosure occurs when individual information can be inferred with high confidence from statistical information in the published data. Some other works of SDC focus on the study of the non-interactive query model, in which the data recipients can submit one query to the system. This type of non-interactive query model may not fully address the information needs of data recipients because, in some cases, it is very difficult for a data recipient to accurately construct a query for a data mining task in one shot. Consequently, there are a series of studies on the interactive query model, in which the data recipients, including adversaries, can submit a sequence of queries based on previously received query results. The database server is responsible for keeping track of all queries of each user and determining whether or not the currently received query has violated the privacy requirement with respect to all previous queries. One limitation of any interactive privacy-preserving query system is that it can only answer a sublinear number of queries in total; otherwise, an adversary (or a group of corrupted data recipients) will be able to reconstruct all but a 1 − o(1) fraction of the original data, which is a very strong violation of privacy. When the maximum number of queries is reached, the query service must be closed to avoid privacy leaks. In the case of the non-interactive query model, the adversary can issue only one query and, therefore, the non-interactive query model cannot achieve the same degree of privacy defined by the interactive model. One may consider that privacy-preserving data publishing is a special case of the non-interactive query model.
This paper presents a survey of most of the common attack techniques for anonymization-based PPDM & PPDP and explains their effects on data privacy. k-anonymity is used to protect the identity of respondents and to reduce linking attacks; in the case of a homogeneity attack, a simple k-anonymity model fails, and we need a concept that prevents this attack: the solution is l-diversity. All tuples are arranged in a well-represented form, so an adversary is diverted to l places or to l sensitive attribute values. l-diversity is limited in the case of a background-knowledge attack, because no one can predict the knowledge level of an adversary. It is observed that, using generalization and suppression, we also apply these techniques to attributes which do not need this extent of privacy, and this reduces the precision of the published table.
e-NSTAM (extended Sensitive Tuples Anonymity Method) is applied on sensitive tuples only and reduces information loss; this method also fails in the case of multiple sensitive tuples. Generalization with suppression is also a cause of data loss, because suppression emphasizes not releasing values which are not suited to the k factor. Future work on this front can include defining a new privacy measure along with l-diversity for multiple sensitive attributes, and we will focus on generalizing attributes without suppression, using other techniques that are used to achieve k-anonymity, because suppression reduces the precision of the published table.
译文:数据挖掘和数据发布
数据挖掘是从海量数据中提取大量有趣的模式或知识的过程。
数据库中英文对照外文翻译文献
中英文对照外文翻译
Database Management Systems
A database (sometimes spelled data base) is also called an electronic database, referring to any collection of data, or information, that is specially organized for rapid search and retrieval by a computer. Databases are structured to facilitate the storage, retrieval, modification, and deletion of data in conjunction with various data-processing operations. Databases can be stored on magnetic disk or tape, optical disk, or some other secondary storage device. A database consists of a file or a set of files. The information in these files may be broken down into records, each of which consists of one or more fields. Fields are the basic units of data storage, and each field typically contains information pertaining to one aspect or attribute of the entity described by the database. Using keywords and various sorting commands, users can rapidly search, rearrange, group, and select the fields in many records to retrieve or create reports on a particular aggregate of data. Complex data relationships and linkages may be found in all but the simplest databases. The system software package that handles the difficult tasks associated with creating, accessing, and maintaining database records is called a database management system (DBMS). The programs in a DBMS package establish an interface between the database itself and the users of the database. (These users may be applications programmers, managers and others with information needs, and various OS programs.) A DBMS can organize, process, and present selected data elements from the database. This capability enables decision makers to search, probe, and query database contents in order to extract answers to nonrecurring and unplanned questions that aren't available in regular reports. These questions might initially be vague and/or poorly defined, but people can "browse" through the database until they have the needed information. In short, the DBMS will "manage" the stored data items and assemble the needed items from the common database in response to the queries of those who aren't programmers. A database management system (DBMS) is composed of three major parts: (1) a storage subsystem that stores and retrieves data in files; (2) a modeling and manipulation subsystem that provides the means with which to organize the data and to add, delete, maintain, and update the data; and (3) an interface between the DBMS and its users. Several major trends are emerging that enhance the value and usefulness of database management systems:
Managers: who require more up-to-date information to make effective decisions.
Customers: who demand increasingly sophisticated information services and more current information about the status of their orders, invoices, and accounts.
Users: who find that they can develop custom applications with database systems in a fraction of the time it takes to use traditional programming languages.
Organizations: that discover information has a strategic value; they utilize their database systems to gain an edge over their competitors.
The Database Model
A data model describes a way to structure and manipulate the data in a database.
The structural part of the model specifies how data should be represented (such as trees, tables, and so on). The manipulative part of the model specifies the operations with which to add, delete, display, maintain, print, search, select, sort and update the data.
Hierarchical Model
The first database management systems used a hierarchical model, that is, they arranged records into a tree structure. Some records are root records and all others have unique parent records. The structure of the tree is designed to reflect the order in which the data will be used, that is, the record at the root of a tree will be accessed first, then records one level below the root, and so on. The hierarchical model was developed because hierarchical relationships are commonly found in business applications. As you have known, an organization chart often describes a hierarchical relationship: top management is at the highest level, middle management at lower levels, and operational employees at the lowest levels. Note that within a strict hierarchy, each level of management may have many employees or levels of employees beneath it, but each employee has only one manager. Hierarchical data are characterized by this one-to-many relationship among data. In the hierarchical approach, each relationship must be explicitly defined when the database is created. Each record in a hierarchical database can contain only one key field and only one relationship is allowed between any two fields. This can create a problem because data do not always conform to such a strict hierarchy.
Relational Model
A major breakthrough in database research occurred in 1970 when E. F. Codd proposed a fundamentally different approach to database management called the relational model, which uses a table as its data structure. The relational database is the most widely used database structure. Data is organized into related tables. Each table is made up of rows called records and columns called fields. Each record contains fields of data about some specific item. For example, in a table containing information on employees, a record would contain fields of data such as a person's last name, first name, and street address. Structured query language (SQL) is a query language for manipulating data in a relational database. It is nonprocedural or declarative, in which the user need only specify an English-like description that specifies the operation and the described record or combination of records. A query optimizer translates the description into a procedure to perform the database manipulation.
Network Model
The network model creates relationships among data through a linked-list structure in which subordinate records can be linked to more than one parent record. This approach combines records with links, which are called pointers. The pointers are addresses that indicate the location of a record. With the network approach, a subordinate record can be linked to a key record and at the same time itself be a key record linked to other sets of subordinate records. The network model historically has had a performance advantage over other database models. Today, such performance characteristics are only important in high-volume, high-speed transaction processing such as automatic teller machine networks or airline reservation systems. Both hierarchical and network databases are application specific. If a new application is developed, maintaining the consistency of databases in different applications can be very difficult.
For example, suppose a new pension application is developed. The data are the same, but a new database must be created.
Object Model
The newest approach to database management uses an object model, in which records are represented by entities called objects that can both store data and provide methods or procedures to perform specific tasks. The query language used for the object model is the same object-oriented programming language used to develop the database application. This can create problems because there is no simple, uniform query language such as SQL. The object model is relatively new, and only a few examples of object-oriented databases exist. It has attracted attention because developers who choose an object-oriented programming language want a database based on an object-oriented model.
Distributed Database
Similarly, a distributed database is one in which different parts of the database reside on physically separated computers. One goal of distributed databases is the access of information without regard to where the data might be stored. Keep in mind that once the users and their data are separated, the communication and networking concepts come into play. Distributed databases require software that resides partially in the larger computer. This software bridges the gap between personal and large computers and resolves the problems of incompatible data formats. Ideally, it would make the mainframe databases appear to be large libraries of information, with most of the processing accomplished on the personal computer. A drawback to some distributed systems is that they are often based on what is called a mainframe-centric model, in which the larger host computer is seen as the master and the terminal or personal computer is seen as a slave. There are some advantages to this approach. With databases under centralized control, many of the problems of data integrity that we mentioned earlier are solved. But today's personal computers, departmental computers, and distributed processing require computers and their applications to communicate with each other on a more equal or peer-to-peer basis. In a database, the client/server model provides the framework for distributing databases. One way to take advantage of many connected computers running database applications is to distribute the application into cooperating parts that are independent of one another. A client is an end user or computer program that requests resources across a network. A server is a computer running software that fulfills those requests across a network. When the resources are data in a database, the client/server model provides the framework for distributing databases. A file server is software that provides access to files across a network. A dedicated file server is a single computer dedicated to being a file server. This is useful, for example, if the files are large and require fast access. In such cases, a minicomputer or mainframe would be used as a file server. A distributed file server spreads the files around on individual computers instead of placing them on one dedicated computer. Advantages of the latter server include the ability to store and retrieve files on other computers and the elimination of duplicate files on each computer. A major disadvantage, however, is that individual read/write requests are being moved across the network and problems can arise when updating files.
Suppose a user requests a record from a file and changes it while another user requests the same record and changes it too. The solution to this problem is called record locking, which means that the first request makes other requests wait until the first request is satisfied. Other users may be able to read the record, but they will not be able to change it. A database server is software that services requests to a database across a network. For example, suppose a user types in a query for data on his or her personal computer. If the application is designed with the client/server model in mind, the query-language part on the personal computer simply sends the query across the network to the database server and requests to be notified when the data are found. Examples of distributed database systems can be found in the engineering world. Sun's Network Filing System (NFS), for example, is used in computer-aided engineering applications to distribute data among the hard disks in a network of Sun workstations. Distributing databases is an evolutionary step because it is logical that data should exist at the location where they are being used. Departmental computers within a large corporation, for example, should have data reside locally, yet those data should be accessible by authorized corporate management when they want to consolidate departmental data. DBMS software will protect the security and integrity of the database, and the distributed database will appear to its users as no different from the non-distributed database. In this information age, the data server has become the heart of a company. This one piece of software controls the rhythm of most organizations and is used to pump information lifeblood through the arteries of the network. Because of the critical nature of this application, the data server is also one of the most popular targets for hackers. If a hacker owns this application, he can cause the company's "heart" to suffer a fatal arrest. Ironically, although most users are now aware of hackers, they still do not realize how susceptible their database servers are to hack attacks. Thus, this article presents a description of the primary methods of attacking database servers (also known as SQL servers) and shows you how to protect yourself from these attacks. You should note this information is not new. Many technical white papers go into great detail about how to perform SQL attacks, and numerous vulnerabilities have been posted to security lists that describe exactly how certain database applications can be exploited. This article was written for the curious non-SQL experts who do not care to know the details, and as a review for those who do use SQL regularly.
What Is a SQL Server?
A database application is a program that provides clients with access to data. There are many variations of this type of application, ranging from the expensive enterprise-level Microsoft SQL Server to the free and open source MySQL. Regardless of the flavor, most database server applications have several things in common. First, database applications use the same general programming language known as SQL, or Structured Query Language. This language, also known as a fourth-level language due to its simplistic syntax, is at the core of how a client communicates its requests to the server. Using SQL in its simplest form, a programmer can select, add, update, and delete information in a database.
However, SQL can also be used to create and design entire databases, perform various functions on the returned information, and even execute other programs. To illustrate how SQL can be used, the following is an example of a simple standard SQL query and a more powerful SQL query:
Simple: "Select * from dbFurniture.tblChair"
This returns all information in the table tblChair from the database dbFurniture.
Complex: "EXEC master..xp_cmdshell 'dir c:\'"
This short SQL command returns to the client the list of files and folders under the c:\ directory of the SQL server. Note that this example uses an extended stored procedure that is exclusive to MS SQL Server. The second function that database server applications share is that they all require some form of authenticated connection between client and host. Although the SQL language is fairly easy to use, at least in its basic form, any client that wants to perform queries must first provide some form of credentials that will authorize the client; the client also must define the format of the request and response. This connection is defined by several attributes, depending on the relative location of the client and what operating systems are in use. We could spend a whole article discussing various technologies such as DSN connections, DSN-less connections, RDO, ADO, and more, but these subjects are outside the scope of this article. If you want to learn more about them, a little Google'ing will provide you with more than enough information. However, the following is a list of the more common items included in a connection request:
Database source
Request type
Database
User ID
Password
Before any connection can be made, the client must define what type of database server it is connecting to. This is handled by a software component that provides the client with the instructions needed to create the request in the correct format. In addition to the type of database, the request type can be used to further define how the client's request will be handled by the server. Next comes the database name and finally the authentication information. All the connection information is important, but by far the weakest link is the authentication information, or lack thereof. In a properly managed server, each database has its own users with specifically designated permissions that control what type of activity they can perform. For example, a user account would be set up as read-only for applications that need only to access information. Another account should be used for inserts or updates, and maybe even a third account would be used for deletes. This type of account control ensures that any compromised account is limited in functionality. Unfortunately, many database programs are set up with null or easy passwords, which leads to successful hack attacks.
译文
数据库管理系统介绍
数据库(database,有时拼作data base)又称为电子数据库,是专门组织起来的一组数据或信息,其目的是为了便于计算机快速查询及检索。
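上文英文原文给出了简单与复杂 SQL 查询的例子,并指出空口令、弱口令以及权限过大的账户是数据库服务器被攻击的主要原因。下面是一个示意性的 Python 片段(使用标准库 sqlite3,表名与字段均为假设),演示"参数化查询"这一最基本的防护手段;连接 SQL Server 等生产数据库时做法类似,但需使用相应的驱动,并按照原文所述为应用账户只授予必要的权限。

```python
# 示意:使用参数化查询,避免把用户输入直接拼接进 SQL 语句(表名、字段均为假设)
import sqlite3

conn = sqlite3.connect("furniture.db")
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS tblChair (id INTEGER PRIMARY KEY, model TEXT, price REAL)")

user_input = "A100"  # 假设来自客户端的输入

# 不安全写法(字符串拼接)可能被注入:
#   cur.execute("SELECT * FROM tblChair WHERE model = '" + user_input + "'")
# 安全写法:使用占位符,由数据库驱动负责转义
cur.execute("SELECT * FROM tblChair WHERE model = ?", (user_input,))
print(cur.fetchall())
conn.close()
```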
英文论文(外文文献)翻译成中文的格式与方法
在撰写毕业设计(论文)或科研论文时,需要参考一些相关外文文献,了解国外的最新研究进展,这就需要我们找到最新最具代表性的外文文献,进行翻译整理,以备论文写作时参考。外文文献中英文文献占绝大多数,因此英文论文准确地翻译成中文就显得尤为重要!
一、外文文献从哪里下载
1、从知网国际文献总库中找,该数据库中包含14,000多家国外出版社的文献,囊括所有专业的英文文献资料。
2、一些免费的外文数据库或网站。为了方便大家查找,编者整理成文档供大家下载:国外免费文献数据库大全。
3、谷歌学术检索工具,检索时设置成只检索英文文献,键入与专业相关的关键词即可检索。
二、英文论文翻译格式与要求翻译的外文文献的字符要求不少于1.5万(或翻译成中文后至少在3000字以上)。
字数达到的文献一篇即可。
翻译的外文文献应主要选自学术期刊、学术会议的文章、有关著作及其他相关材料,应与毕业论文(设计)主题相关,并作为外文参考文献列入毕业论文(设计)的参考文献。
并在每篇中文译文首页用"脚注"形式注明原文作者及出处,中文译文后应附外文原文。
需认真研读和查阅术语完成翻译,不得采用翻译软件翻译。
中文译文的编排结构与原文相同,撰写格式参照毕业论文的格式要求。
参考文献不必翻译,直接使用原文的(字体,字号,标点符号等与毕业论文中的参考文献要求同),参考文献的序号应标注在译文中相应的地方。
外文文献查找方法及翻译要求
文献查找:根据自己课题到知网或类似的数据库网站查找,有相关英文文章可以进行下载之后翻译。
如果不能下载,网站的中英文摘要一般都可以看到,根据英文关键词到谷歌搜索里面搜相关的英文文章。
要求的翻译原文不一定是专业的科技文章,只要内容相关,难度跟篇幅合适都可以拿来用。
如果想翻译专业的英文文献,可以通过谷歌的学术搜索网站/schhp?hl=zh-CN,给出的链接比较多,多数是链接到数据库上,没有账号不能下载,但是有些时候搜索出来的条目会有PDF链接,直接点击就能下载下来。
还可以通过维基百科查找相关词条的英文解释,/wiki/Main_Page,但无论英文还是中文都要符合模板中的格式要求,如果英文原文不能编辑,可以不作处理。
英译中:翻译一段检查一段,中文语言一定要通顺,不要有漏译。
不知道的单词一定要查,不要猜。
另一个是,同一个单词在文章里多次出现,每次翻译的时候都要用同样的单词翻译,不能原文同一个单词,在译文里出现的时候是不同的单词。
格式和原文要完全一样,原文有表格,就在译文里也画表格,原文是图片,译文也是图片,原文图片里文字的位置和译文图片里文字的位置要一致。
还有,整个文章统一字体。
查词顺序:先查英语词典或者用GOOGLE英文网页,理解英文单词的意思;然后用翻译软件进行英译中;再用百度查询翻译软件给出的中文词义。如果查到的中文词义与英语词典或GOOGLE英文网页对该英文单词的解释一致,说明翻译软件翻译正确;如果不一致,则说明翻译软件翻译错了!
翻译工具:1) 在线词典 有道词典:/ 爱词霸:/responsibly/ 用爱词霸的词典选项。
注:以上两个在线词典的准确率相对较高,可用于查词组译法。灵格斯词典可在360软件管家里下载,其中技术类名词多一些。
爱词霸英中互译翻译软件:/尽量不使用GOOGLE翻译软件。
可参考:GOOGLE翻译软件:/#ja|zh-CN|%0D%0A 。GOOGLE翻译软件总有翻译错的时候,所以使用时要注意!牛津词典或其他英语词典、灵格斯词典,请自己下载一个。注意:必须备有一部网上的英语词典。
SQL Server数据库管理外文翻译文献
本文翻译了一篇关于SQL Server数据库管理的外文文献。
摘要
该文献介绍了SQL Server数据库管理的基本原则和策略。
作者指出,重要的决策应该基于独立思考,避免过多依赖外部帮助。
对于非可确认的内容,不应进行引用。
文献还强调了以简单策略为主、避免法律复杂性的重要性。
内容概述
本文详细介绍了SQL Server数据库管理的基本原则和策略。
其中包括:1. 独立决策:在数据库管理中,决策应该基于独立思考。
不过多依赖用户的帮助或指示,而是依靠数据库管理员的专业知识和经验进行决策。
2. 简单策略:为了避免法律复杂性和错误的决策,应采用简单策略。
这意味着避免引用无法确认的内容,只使用可靠和可验证的信息。
3. 数据库管理准则:文献提出了一些SQL Server数据库管理的准则,包括:规划和设计数据库结构、有效的数据备份和恢复策略、用户权限管理、性能优化等。
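以文中提到的"有效的数据备份和恢复策略"为例,下面给出一个假设性的示意脚本:通过 pyodbc 连接 SQL Server 并执行一次完整备份。其中的服务器名、数据库名、账户和备份路径均为虚构,实际环境中还应结合定期作业调度与恢复演练来落实备份策略。

```python
# 示意:用 pyodbc 对 SQL Server 数据库做一次完整备份(连接信息与路径均为虚构)
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=master;UID=backup_user;PWD=example_password",
    autocommit=True,  # BACKUP 语句不能放在显式事务中执行
)
sql = r"BACKUP DATABASE [SalesDB] TO DISK = N'C:\backups\SalesDB_full.bak' WITH INIT"
conn.cursor().execute(sql)
conn.close()
```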
结论
文献通过介绍SQL Server数据库管理的基本原则和策略,强调了独立决策和简单策略的重要性。
数据库管理员应该依靠自己的知识和经验,避免过度依赖外部帮助,并采取简单策略来管理数据库。
此外,遵循数据库管理准则也是确保数据库安全和性能的重要手段。
以上是对于《SQL Server数据库管理外文翻译文献》的详细内容概述和总结。
如果需要更多详细信息,请阅读原文献。
数据库外文参考文献及翻译
SQL ALL-IN-ONE DESK REFERENCE FOR DUMMIES
Data Files and Databases
I. Irreducible complexity
Any software system that performs a useful function is going to be complex. The more valuable the function, the more complex its implementation will be. Regardless of how the data is stored, the complexity remains. The only question is where that complexity resides. Any non-trivial computer application has two major components: the program and the data. Although an application's level of complexity depends on the task to be performed, developers have some control over the location of that complexity. The complexity may reside primarily in the program part of the overall system, or it may reside in the data part. Operations on the data can be fast. Because the program interacts directly with the data, with no DBMS in the middle, well-designed applications can run as fast as the hardware permits. What could be better? A data organization that minimizes storage requirements and at the same time maximizes speed of operation seems like the best of all possible worlds. But wait a minute. Flat file systems came into use in the 1940s. We have known about them for a long time, and yet today they have been almost entirely replaced by database systems. What's up with that? Perhaps it is the not-so-beneficial consequences…
大数据挖掘外文翻译文献
文献信息:
文献标题:A Study of Data Mining with Big Data(大数据挖掘研究)
国外作者:VH Shastri, V Sreeprada
文献出处:《International Journal of Emerging Trends and Technology in Computer Science》, 2016, 38(2): 99-103
字数统计:英文2291单词,12196字符;中文3868汉字
外文文献:
A Study of Data Mining with Big Data
Abstract Data has become an important part of every economy, industry, organization, business, function and individual. Big Data is a term used to identify large data sets typically whose size is larger than the typical data base. Big data introduces unique computational and statistical challenges. Big Data are at present expanding in most of the domains of engineering and science. Data mining helps to extract useful data from the huge data sets due to its volume, variability and velocity. This article presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model, from the data mining perspective.
Keywords: Big Data, Data Mining, HACE theorem, structured and unstructured.
I. Introduction
Big Data refers to enormous amounts of structured data and unstructured data that overflow the organization. If this data is properly used, it can lead to meaningful information. Big data includes a large number of data which requires a lot of processing in real time. It provides a room to discover new values, to understand in-depth knowledge from hidden values and provide a space to manage the data effectively. A database is an organized collection of logically related data which can be easily managed, updated and accessed. Data mining is a process of discovering interesting knowledge such as associations, patterns, changes, anomalies and significant structures from large amounts of data stored in databases or other repositories.
Big Data includes 3 V's as its characteristics. They are volume, velocity and variety. Volume means the amount of data generated every second. The data is in a state of rest. It is also known for its scale characteristics. Velocity is the speed with which the data is generated. It should have high speed data. The data generated from social media is an example. Variety means different types of data can be taken such as audio, video or documents. It can be numerals, images, time series, arrays etc.
Data Mining analyses the data from different perspectives and summarizes it into useful information that can be used for business solutions and for predicting future trends. Data mining (DM), also called Knowledge Discovery in Databases (KDD) or Knowledge Discovery and Data Mining, is the process of searching large volumes of data automatically for patterns such as association rules. It applies many computational techniques from statistics, information retrieval, machine learning and pattern recognition. Data mining extracts only required patterns from the database in a short time span. Based on the type of patterns to be mined, data mining tasks can be classified into summarization, classification, clustering, association and trends analysis.
Big Data is expanding in all domains including science and engineering fields including physical, biological and biomedical sciences.
II. BIG DATA with DATA MINING
Generally big data refers to a collection of large volumes of data and these data are generated from various sources like internet, social media, business organizations, sensors etc. We can extract some useful information with the help of Data Mining.
It is a technique for discovering patterns as well as descriptive, understandable models from a large scale of data.
Volume is the size of the data, which is larger than petabytes and terabytes. The scale and rise of size makes it difficult to store and analyse using traditional tools. Big Data should be used to mine large amounts of data within the predefined period of time. Traditional database systems were designed to address small amounts of data which were structured and consistent, whereas Big Data includes a wide variety of data such as geospatial data, audio, video, unstructured text and so on.
Big Data mining refers to the activity of going through big data sets to look for relevant information. To process large volumes of data from different sources quickly, Hadoop is used. Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. Its distributed file system supports fast data transfer rates among nodes and allows the system to continue operating uninterrupted at times of node failure. It runs MapReduce for distributed data processing and works with structured and unstructured data.
III. BIG DATA characteristics - HACE THEOREM
We have large volumes of heterogeneous data. There exists a complex relationship among the data. We need to discover useful information from this voluminous data. Let us imagine a scenario in which blind people are asked to draw an elephant. From the information each blind person collects, one may think of the trunk as a wall, a leg as a tree, the body as a wall and the tail as a rope. The blind men can exchange information with each other.
Figure 1: Blind men and the giant elephant
Some of the characteristics include:
i. Vast data with heterogeneous and diverse sources: One of the fundamental characteristics of big data is the large volume of data represented by heterogeneous and diverse dimensions. For example, in the biomedical world, a single human being is represented as name, age, gender, family history etc.; for X-ray and CT scans, images and videos are used. Heterogeneity refers to the different types of representations of the same individual, and diverse refers to the variety of features used to represent a single piece of information.
ii. Autonomous with distributed and de-centralized control: the sources are autonomous, i.e., automatically generated; each generates information without any centralized control. We can compare it with the World Wide Web (WWW), where each server provides a certain amount of information without depending on other servers.
iii. Complex and evolving relationships: As the size of the data becomes infinitely large, the relationships that exist are also large. In early stages, when data is small, there is no complexity in relationships among the data. Data generated from social media and other sources have complex relationships.
IV. TOOLS: OPEN SOURCE REVOLUTION
Large companies such as Facebook, Yahoo, Twitter, LinkedIn benefit from and contribute to work on open source projects. In Big Data Mining, there are many open source initiatives. The most popular of them are:
Apache Mahout: Scalable machine learning and data mining open source software based mainly on Hadoop. It has implementations of a wide range of machine learning and data mining algorithms: clustering, classification, collaborative filtering and frequent pattern mining.
R: open source programming language and software environment designed for statistical computing and visualization.
R was designed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, beginning in 1993 and is used for statistical analysis of very large data sets.
MOA: Stream data mining open source software to perform data mining in real time. It has implementations of classification, regression, clustering and frequent item set mining and frequent graph mining. It started as a project of the Machine Learning group of the University of Waikato, New Zealand, famous for the WEKA software. The streams framework provides an environment for defining and running stream processes using simple XML-based definitions and is able to use MOA, Android and Storm.
SAMOA: It is a new upcoming software project for distributed stream mining that will combine S4 and Storm with MOA.
Vowpal Wabbit: open source project started at Yahoo! Research and continuing at Microsoft Research to design a fast, scalable, useful learning algorithm. VW is able to learn from terafeature datasets. It can exceed the throughput of any single machine network interface when doing linear learning, via parallel learning.
V. DATA MINING for BIG DATA
Data mining is the process by which data coming from different sources is analysed to discover useful information. Data Mining contains several algorithms which fall into 4 categories. They are:
1. Association Rule
2. Clustering
3. Classification
4. Regression
Association is used to search for relationships between variables. It is applied in searching for frequently visited items. In short it establishes relationships among objects. Clustering discovers groups and structures in the data. Classification deals with associating an unknown structure to a known structure. Regression finds a function to model the data. The different data mining algorithms are:
Table 1. Classification of Algorithms
Data Mining algorithms can be converted into big map-reduce algorithms on a parallel computing basis.
Table 2. Differences between Data Mining and Big Data
VI. Challenges in BIG DATA
Meeting the challenges of Big Data is difficult. The volume is increasing every day. The velocity is being increased by the internet-connected devices. The variety is also expanding, and the organizations' capability to capture and process the data is limited. The following are the challenges in the area of Big Data when it is handled:
1. Data capture and storage
2. Data transmission
3. Data curation
4. Data analysis
5. Data visualization
According to, challenges of big data mining are divided into 3 tiers. The first tier is the setup of data mining algorithms. The second tier includes:
1. Information sharing and Data Privacy.
2. Domain and Application Knowledge.
The third one includes local learning and model fusion for multiple information sources:
3. Mining from sparse, uncertain and incomplete data.
4. Mining complex and dynamic data.
Figure 2: Phases of Big Data Challenges
Generally mining of data from different data sources is tedious as the size of the data is larger. Big data is stored at different places, and collecting that data will be a tedious task, and applying basic data mining algorithms will be an obstacle for it. Next we need to consider the privacy of data. The third case is mining algorithms.
When we are applying data mining algorithms to these subsets of data, the result may not be that accurate.
VII. Forecast of the future
There are some challenges that researchers and practitioners will have to deal with during the next years:
Analytics Architecture: It is not clear yet how an optimal architecture of analytics systems should be built to deal with historic data and with real-time data at the same time. An interesting proposal is the Lambda architecture of Nathan Marz. The Lambda Architecture solves the problem of computing arbitrary functions on arbitrary data in real time by decomposing the problem into three layers: the batch layer, the serving layer, and the speed layer. It combines in the same system Hadoop for the batch layer, and Storm for the speed layer. The properties of the system are: robust and fault tolerant, scalable, general, and extensible; it allows ad hoc queries, requires minimal maintenance, and is debuggable.
Statistical significance: It is important to achieve significant statistical results, and not be fooled by randomness. As Efron explains in his book about Large Scale Inference, it is easy to go wrong with huge data sets and thousands of questions to answer at once.
Distributed mining: Many data mining techniques are not trivial to parallelize. To have distributed versions of some methods, a lot of research is needed, with practical and theoretical analysis, to provide new methods.
Time evolving data: Data may be evolving over time, so it is important that the Big Data mining techniques should be able to adapt and in some cases to detect change first. For example, the data stream mining field has very powerful techniques for this task.
Compression: Dealing with Big Data, the quantity of space needed to store it is very relevant. There are two main approaches: compression, where we don't lose anything, or sampling, where we choose what data is more representative. Using compression, we may take more time and less space, so we can consider it as a transformation from time to space. Using sampling, we are losing information, but the gains in space may be in orders of magnitude. For example, Feldman et al. use core sets to reduce the complexity of Big Data problems. Core sets are small sets that provably approximate the original data for a given problem. Using merge-reduce, the small sets can then be used for solving hard machine learning problems in parallel.
Visualization: A main task of Big Data analysis is how to visualize the results. As the data is so big, it is very difficult to find user-friendly visualizations. New techniques and frameworks to tell and show stories will be needed, as for example the photographs, infographics and essays in the beautiful book "The Human Face of Big Data".
Hidden Big Data: Large quantities of useful data are getting lost since new data is largely untagged and unstructured. The 2012 IDC study on Big Data explains that in 2012, 23% (643 exabytes) of the digital universe would be useful for Big Data if tagged and analyzed. However, currently only 3% of the potentially useful data is tagged, and even less is analyzed.
VIII. CONCLUSION
The amount of data is growing exponentially due to social networking sites, search and retrieval engines, media sharing sites, stock trading sites, news sources and so on. Big Data is becoming the new area for scientific data research and for business applications. Data mining techniques can be applied to big data to acquire some useful information from large datasets.
They can be used together to acquire a useful picture from the data. Big Data analysis tools like MapReduce over Hadoop and HDFS help organizations.
中文译文:大数据挖掘研究
摘要
数据已经成为各个经济、行业、组织、企业、职能和个人的重要组成部分。
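上文英文原文提到,Hadoop 的 MapReduce 用于对结构化与非结构化数据做分布式处理。下面用 Python 标准库给出一个极简的"map-reduce 风格"词频统计示意(数据为虚构),只用来说明"先分片映射、再归并汇总"的处理思路,并非 Hadoop 本身的实现。

```python
# 极简示意:map-reduce 风格的词频统计(单机多进程,数据为虚构)
from collections import Counter
from multiprocessing import Pool

def map_count(chunk):
    """map 阶段:对一个数据分片统计词频"""
    return Counter(word for line in chunk for word in line.split())

if __name__ == "__main__":
    lines = ["big data mining", "data mining with big data", "hadoop map reduce"] * 1000
    chunks = [lines[i::4] for i in range(4)]           # 把数据切成 4 个分片
    with Pool(4) as pool:
        partials = pool.map(map_count, chunks)          # 并行执行 map
    total = sum(partials, Counter())                    # reduce 阶段:归并各分片的结果
    print(total.most_common(3))
```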
数据分析外文文献+翻译
文献1:《数据分析在企业决策中的应用》
该文献探讨了数据分析在企业决策中的重要性和应用。
研究发现,通过数据分析可以获取准确的商业情报,帮助企业更好地理解市场趋势和消费者需求。
通过对大量数据的分析,企业可以发现隐藏的模式和关联,从而制定出更具竞争力的产品和服务策略。
数据分析还可以提供决策支持,帮助企业在不确定的环境下做出明智的决策。
因此,数据分析已成为现代企业成功的关键要素之一。
文献2:《机器学习在数据分析中的应用》
该文献探讨了机器学习在数据分析中的应用。
研究发现,机器学习可以帮助企业更高效地分析大量的数据,并从中发现有价值的信息。
机器学习算法可以自动学习和改进,从而帮助企业发现数据中的模式和趋势。
通过机器学习的应用,企业可以更准确地预测市场需求、优化业务流程,并制定更具策略性的决策。
因此,机器学习在数据分析中的应用正逐渐受到企业的关注和采用。
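作为对"用历史数据训练模型、再对新样本做预测"这一机器学习基本流程的示意,下面给出一个极简的 scikit-learn 例子;数据为随机生成,特征与模型选择都是假设性的,实际应用需视企业自身的业务数据而定。

```python
# 极简示意:用线性回归根据广告投入和价格预测销量(数据为随机生成,仅作说明)
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 2))                       # 两个特征:广告投入、价格
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 5, 200)    # 虚构的销量

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)             # 用历史数据训练
print("测试集 R^2:", model.score(X_test, y_test))
print("对新样本的预测:", model.predict([[60.0, 20.0]]))       # 对新样本做预测
```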
文献3:《数据可视化在数据分析中的应用》该文献探讨了数据可视化在数据分析中的重要性和应用。
研究发现,通过数据可视化可以更直观地呈现复杂的数据关系和趋势。
可视化可以帮助企业更好地理解数据,发现数据中的模式和规律。
数据可视化还可以帮助企业进行数据交互和决策共享,提升决策的效率和准确性。
因此,数据可视化在数据分析中扮演着非常重要的角色。
翻译
文献1标题:The Application of Data Analysis in Business Decision-making
文献2标题:The Application of Machine Learning in Data Analysis
文献3标题:The Application of Data Visualization in Data Analysis
翻译摘要:本文献研究了数据分析在企业决策中的应用,以及机器学习和数据可视化在数据分析中的作用。
数据采集外文文献翻译中英文
数据采集外文文献翻译(含:英文原文及中文译文)
文献出处:Txomin Nieva. DATA ACQUISITION SYSTEMS [J]. Computers in Industry, 2013, 4(2): 215-237.
英文原文
DATA ACQUISITION SYSTEMS
Txomin Nieva
Data acquisition systems, as the name implies, are products and/or processes used to collect information to document or analyze some phenomenon. In the simplest form, a technician logging the temperature of an oven on a piece of paper is performing data acquisition. As technology has progressed, this type of process has been simplified and made more accurate, versatile, and reliable through electronic equipment. Equipment ranges from simple recorders to sophisticated computer systems. Data acquisition products serve as a focal point in a system, tying together a wide variety of products, such as sensors that indicate temperature, flow, level, or pressure. Some common data acquisition terms are shown below.
Data collection technology has made great progress in the past 30 to 40 years. For example, 40 years ago, in a well-known college laboratory, the device used to track temperature rises in bronze made of helium was composed of thermocouples, relays, interrogators, a bundle of papers, and a pencil. Today's university students are likely to automatically process and analyze data on PCs. There are many ways you can choose to collect data. The choice of which method to use depends on many factors, including the complexity of the task, the speed and accuracy you need, the evidence you want, and more. Whether simple or complex, the data acquisition system can operate and play its role.
The old way of using pencils and paper is still feasible for some situations, and it is cheap, easy to obtain, quick and easy to start. All you need is to capture multiple channels of digital information (DMM) and start recording data by hand. Unfortunately, this method is prone to errors, acquires data more slowly, and requires too much human analysis. In addition, it can only collect data in a single channel; but when you use a multi-channel DMM, the system will soon become very bulky and clumsy. Accuracy depends on the level of the writer, and you may need to scale it yourself. For example, if the DMM is not equipped with a sensor that handles temperature, the old one needs to start looking for a proportion. Given these limitations, it is an acceptable method only if you need to implement a rapid experiment.
Modern versions of the strip chart recorder allow you to retrieve data from multiple inputs. They provide long-term paper records of data because the data is in graphic format and they make it easy to collect data on site. Once a strip chart recorder has been set up, most recorders have enough internal intelligence to operate without an operator or computer. The disadvantages are the lack of flexibility and the relatively low precision, often limited to a percentage point. You can clearly feel that there is only a small change with the pen. In long-term multi-channel monitoring the recorders can play a very good role; otherwise, their value is limited. For example, they cannot interact with other devices. Other concerns are the maintenance of pens and paper, the supply of paper and the storage of data. The most important is the abuse and waste of paper. However, recorders are fairly easy to set up and operate, providing a permanent record of data for quick and easy analysis.
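As a contrast with the pencil-and-paper logging described above, the following is a minimal sketch of what an automated acquisition loop does: read a channel at a fixed interval, timestamp each reading, and append it to a file. The read_temperature() function is a hypothetical stand-in for a real instrument driver; it is not part of any product mentioned in the text.

```python
# Minimal sketch of an automated single-channel data acquisition loop.
# read_temperature() is a hypothetical stand-in for a real instrument driver.
import csv
import random
import time
from datetime import datetime

def read_temperature():
    # Placeholder: a real system would query a DMM, data logger, or plug-in card here
    return 25.0 + random.uniform(-0.5, 0.5)

with open("oven_log.csv", "a", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "temperature_C"])
    for _ in range(10):                      # ten samples, one per second
        writer.writerow([datetime.now().isoformat(), round(read_temperature(), 3)])
        f.flush()                            # keep the record safe if the run stops early
        time.sleep(1.0)
```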
Some benchtop DMMs offer selectable scanning capabilities. The back of the instrument has a slot to receive a scanner card that can be multiplexed for more inputs, typically 8 to 10 channels of mux. This is inherently limited by the front panel of the instrument. Its flexibility is also limited because it cannot exceed the number of available channels. An external PC usually handles data acquisition and analysis.
The PC plug-in card is a single-board measurement system that uses the ISA or PCI bus to expand a slot in the PC. They often have a reading rate of up to 1000 per second. 8 to 16 channels are common, and the collected data is stored directly in the computer and then analyzed. Because the card is essentially a part of the computer, it is easy to set up the test. PC cards are also relatively inexpensive, partly because they rely on the host PC to provide power, mechanical housing, and the user interface.
Data collection options
On the downside, the PC plug-in cards often have a 12-word capacity, so you can't detect small changes in the input signal. In addition, the electronic environment within the PC is often susceptible to noise, high clock rates, and bus noise. The electronic contacts limit the accuracy of the PC card. These plug-in cards also measure a range of voltages. To measure other input signals, such as voltage, temperature, and resistance, you may need some external signal monitoring devices. Other considerations include complex calibrations and overall system costs, especially if you need to purchase additional signal monitoring devices or adapt the PC card to the card. Take this into account. If your needs fall within the capabilities and limitations of the card, the PC plug-in card provides an attractive method for data collection.
Data electronic recorders are typical stand-alone instruments that, once equipped, enable the measurement, recording, and display of data without the involvement of an operator or computer. They can handle multiple signal inputs, sometimes up to 120 channels. Accuracy rivals unrivalled desktop DMMs because it operates within a 22 word, 0.004 percent accuracy range. Some data electronic automatic recorders have the ability to measure proportionally; the inspection result is not limited by the user's definition, and the output is a control signal. One of the advantages of using data electronic loggers is their internal signal monitoring. Most can directly measure several different input signals without the need for additional signal monitoring devices. One channel can monitor thermocouples, RTDs, and voltages. Thermocouples provide valuable compensation for accurate temperature measurements. They are typically equipped with multi-channel cards. The built-in intelligence of the electronic data recorder helps you set the measurement period and specify the parameters for each channel. Once you set it all up, the data electronic recorder will behave like an unbeatable device. The data they store is distributed in memory and can hold 500,000 or more readings. Connecting to a PC makes it easy to transfer data to a computer for further analysis. Most data electronic recorders are designed to be flexible and simple to configure and operate, and most provide remote location operation options via battery packs or other methods. Because of the A/D conversion technology used, certain data electronic recorders have a lower reading rate, especially when compared with PC plug-in cards. However, a reading rate of 250 per second is relatively rare.
Keep in mind that many of the phenomena being measured are physical in nature, such as temperature, pressure, and flow, and there are generally fewer changes. In addition, because of the monitoring accuracy of the data electronic loggers, a large amount of reading averaging is not necessary, as it often is on PC plug-in cards.
Front-end data acquisition is often done as a module and is typically connected to a PC or controller. They are used in automated tests to collect data, control and cycle detection signals for other test equipment. Send signal test equipment spare parts. The efficiency of the front-end operation is very high, and it can match the speed and accuracy of the best stand-alone instruments. Front-end data acquisition works in many models, including VXI versions such as the Agilent E1419A multi-function measurement and VXI control model, as well as a proprietary card elevator. Although the cost of front-end units has been reduced, these systems can be very expensive unless you need to provide high levels of operation, and their prices can be prohibitive. On the other hand, they do provide considerable flexibility and measurement capabilities.
Good, low-cost electronic data loggers have the right number of channels (20-60 channels); their scan rates are relatively low but common enough for most engineers. Some of the key applications include:
• Product characterization
• Hot die cutting of electronic products
• Test of the environment
• Environmental monitoring
• Composition characteristics
• Battery test
• Building and computer capacity monitoring
A new system design
The conceptual model of a universal system can be applied to the analysis phase of a specific system to better understand the problem and to specify the best solution more easily based on the specific requirements of a particular system. The conceptual model of a universal system can also be used as a starting point for designing a specific system. Therefore, using a general-purpose conceptual model will save time and reduce the cost of specific system development. To test this hypothesis, we developed a DAS for railway equipment based on our generic DAS concept model. In this section, we summarize the main results and conclusions of this DAS development. We analyzed the device model package. The result of this analysis is a partial conceptual model of a system consisting of a three-tier device model. We analyzed the equipment project package in the equipment environment. Based on this analysis, we have listed a three-level item hierarchy in the conceptual model of the system. Equipment projects are specialized for individual equipment projects.
Therefore, we introduced the following concepts: train item monitoring standards, public standards, and private standards; equipment condition monitoring standards, equipment item condition monitoring standards, and equipment item event monitoring standards; train item trigger conditions, train item time trigger conditions, and train item event trigger conditions, which are specializations of the equipment item trigger condition, equipment item time trigger condition, and equipment item event trigger condition; and train item data sets, train custom data sets, and train predefined data sets, which are specializations of the equipment item data set, custom data set, and predefined data set.

Finally, we analyzed observations and monitoring reports in the equipment context. The system is required to record measurements and category observations. In addition, condition monitoring reports and event monitoring reports can be recorded. Therefore, we introduced the concepts of observation, measurement, category observation, and monitoring report into the conceptual model of the system.

Our generic DAS conceptual model played an important role in the design of the railway equipment DAS. We used the model to better organize the data that is used by the system's components. The conceptual model also made it easier to design certain components of the system. As a result, our implementation contains a large number of design classes that represent concepts specified in the generic DAS conceptual model. Through this industrial example, the development of a particular DAS demonstrates the usefulness of a generic conceptual model for developing a specific system.

Chinese translation: Data Acquisition Systems, Txomin Nieva. A data acquisition system, as the name suggests, is a product or process used to collect information in order to document or analyze some phenomenon.
Foreign Literature and Translation: Information Systems Development and Database Development
Information Systems Development and Database Development
In many organizations, database development begins with enterprise data modeling, which establishes the range and general contents of organizational databases.
This step usually occurs during an organization's information systems planning process; its purpose is to create an overall description or explanation of organizational data, not to design a specific database.
A specific database provides the data for one or more information systems, whereas an enterprise data model, which may encompass many databases, describes the scope of the data maintained by the organization.
In enterprise data modeling, you review current systems, analyze the nature of the business areas to be supported, describe the data needed at a high level of abstraction, and plan one or more database development projects.
Figure 1 shows a segment of the enterprise data model for Pine Valley Furniture Company.
1.1 Information Systems Architecture
As shown in Figure 1, a high-level data model is only one part of an overall information systems architecture (ISA), or blueprint, for an organization's information systems.
During information systems planning, you can build an enterprise data model as part of the overall information systems architecture.
According to Zachman (1987) and Sowa and Zachman (1992), an information systems architecture consists of the following six key components:
Data (as shown in Figure 1, although other representations are possible).
Processes that manipulate the data (these can be represented by data flow diagrams, object models with methods, or other notations).
Networks that transport data within the organization and between the organization and its key business partners (these can be shown with network connectivity and topology diagrams).
People, who perform processes and are the sources and receivers of data and information (people appear in process models as the senders and receivers of data).
Events and points in time at which processes are performed (these can be shown with state-transition diagrams and other notations).
Reasons for events and rules that govern the processing of data (often shown in text form, although there are also diagramming tools for planning, such as decision tables).
1.2 Information Engineering
Information systems planners develop an information systems architecture by following a particular information systems planning methodology.
Information engineering is one formal and popular methodology.
Information engineering is a data-oriented methodology for creating and maintaining information systems.
Because information engineering is data-oriented, a concise explanation of it is helpful as you begin to understand how databases are identified and defined.
Information engineering follows a top-down planning approach, in which specific information systems are deduced from a broad understanding of information needs (for example, we need data about customers, products, suppliers, salespersons, and work centers), rather than by merging many detailed information requests (such as an order entry screen or a sales summary report by geographic region).
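To make the idea concrete, a broad information need such as "we need data about customers, products, and orders" is eventually refined into specific database definitions. The following is a minimal sketch of such definitions; the table and column names are illustrative assumptions, not the Pine Valley Furniture schema itself.

-- Illustrative only: names are assumed for this example.
CREATE TABLE Customer (
   CustomerId   INT         NOT NULL PRIMARY KEY,
   CustomerName VARCHAR(50) NOT NULL
)

CREATE TABLE Product (
   ProductId    INT         NOT NULL PRIMARY KEY,
   Description  VARCHAR(50) NOT NULL,
   UnitPrice    MONEY       NOT NULL
)

CREATE TABLE CustomerOrder (
   OrderId      INT      NOT NULL PRIMARY KEY,
   CustomerId   INT      NOT NULL REFERENCES Customer(CustomerId),
   OrderDate    DATETIME NOT NULL
)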
Data Mining Foreign Literature Translation and References
数据挖掘外文翻译参考文献(文档含中英文对照即英文原文和中文翻译)外文:What is Data Mining?Simply stated, data mining refers to extracting or “mining” knowledge from large amounts of data. The term is actually a misnomer. Remember that the mining of gold from rocks or sand is referred to as gold mining rather than rock or sand mining. Thus, “data mining” should have been more appropriately named “knowledge mining from data”, which is unfortunately somewhat long. “Knowledge mining”, a shorter term, may not reflect the emphasis on mining from large amounts of data. Nevertheless, mining is a vivid term characterizing the processthat finds a small set of precious nuggets from a great deal of raw material. Thus, such a misnomer which carries both “data” and “mining” became a popular choice. There are many other terms carrying a similar or slightly different meaning to data mining, such as knowledge mining from databases, knowledge extraction, data / pattern analysis, data archaeology, and data dredging.Many people treat data mining as a synonym for another popularly used term, “Knowledge Discovery in Databases”, or KDD. Alternatively, others view data mining as simply an essential step in the process of knowledge discovery in databases. Knowledge discovery consists of an iterative sequence of the following steps:· data cleaning: to remove noise or irrelevant data,· data integration: where multiple data sources may be combined,· data selection : where data relevant to the analysis task are retrieved from the database,· data transformati on : where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations, for instance,· data mining: an essential process where intelligent methods are applied in order to extract data patterns,· pattern evaluation: to identify the truly interesting patterns representing knowledge based on some interestingness measures, and· knowledge presentation: where visualization and knowledge representation techniques are used to present the mined knowledge to the user .The data mining step may interact with the user or a knowledge base. The interesting patterns are presented to the user, and may be stored as new knowledge in the knowledge base. Note that according to this view, data mining is only one step in the entire process, albeit an essential one since it uncovers hidden patterns for evaluation.We agree that data mining is a knowledge discovery process. However, in industry, in media, and in the database research milieu, the term “data mining” is becoming more popular than the longer term of “knowledge discovery in databases”. Therefore, in this book, we choose to use the term “data mining”. We adopt a broad view of data mining functionality: data mining is the process of discovering interesting knowledgefrom large amounts of data stored either in databases, data warehouses, or other information repositories.Based on this view, the architecture of a typical data mining system may have the following major components:1. Database, data warehouse, or other information repository. This is one or a set of databases, data warehouses, spread sheets, or other kinds of information repositories. Data cleaning and data integration techniques may be performed on the data.2. Database or data warehouse server. The database or data warehouse server is responsible for fetching the relevant data, based on the user’s data mining request.3. Knowledge base. 
This is the domain knowledge that is used to guide the search, or evaluate the interestingness of resulting patterns. Such knowledge can include concept hierarchies, used to organize attributes or attribute values into different levels of abstraction. Knowledge such as user beliefs, which can be used to assess a pattern’s interestingness based on its unexpectedness, may also be included. Other examples of domain knowledge are additional interestingness constraints or thresholds, and metadata (e.g., describing data from multiple heterogeneous sources).4. Data mining engine. This is essential to the data mining system and ideally consists of a set of functional modules for tasks such as characterization, association analysis, classification, evolution and deviation analysis.5. Pattern evaluation module. This component typically employs interestingness measures and interacts with the data mining modules so as to focus the search towards interesting patterns. It may access interestingness thresholds stored in the knowledge base. Alternatively, the pattern evaluation module may be integrated with the mining module, depending on the implementation of the data mining method used. For efficient data mining, it is highly recommended to push the evaluation of pattern interestingness as deep as possible into the mining process so as to confine the search to only the interesting patterns.6. Graphical user interface. This module communicates between users and the data mining system, allowing the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining results. In addition, this component allows the user to browse database and data warehouse schemas or datastructures, evaluate mined patterns, and visualize the patterns in different forms.From a data warehouse perspective, data mining can be viewed as an advanced stage of on-1ine analytical processing (OLAP). However, data mining goes far beyond the narrow scope of summarization-style analytical processing of data warehouse systems by incorporating more advanced techniques for data understanding.While there may be many “data mining systems” on the market, not all of them can perform true data mining. A data analysis system that does not handle large amounts of data can at most be categorized as a machine learning system, a statistical data analysis tool, or an experimental system prototype. A system that can only perform data or information retrieval, including finding aggregate values, or that performs deductive query answering in large databases should be more appropriately categorized as either a database system, an information retrieval system, or a deductive database system.Data mining involves an integration of techniques from mult1ple disciplines such as database technology, statistics, machine learning, high performance computing, pattern recognition, neural networks, data visualization, informationretrieval, image and signal processing, and spatial data analysis. We adopt a database perspective in our presentation of data mining in this book. That is, emphasis is placed on efficient and scalable data mining techniques for large databases. By performing data mining, interesting knowledge, regularities, or high-level information can be extracted from databases and viewed or browsed from different angles. 
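As a very rough, database-oriented illustration of extracting one such regularity (item co-occurrence, the raw material of association analysis), the following sketch counts how often pairs of items appear in the same transaction. The Transactions table and its columns are assumptions made purely for this example; real mining engines use far more sophisticated and scalable algorithms.

-- Hypothetical Transactions(TransId, ItemId) table; names are illustrative only.
SELECT t1.ItemId AS ItemA,
       t2.ItemId AS ItemB,
       COUNT(*)  AS PairCount
FROM Transactions t1
JOIN Transactions t2
  ON t1.TransId = t2.TransId
 AND t1.ItemId  < t2.ItemId     -- count each unordered pair once
GROUP BY t1.ItemId, t2.ItemId
ORDER BY PairCount DESC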
The discovered knowledge can be applied to decision making, process control, information management, query processing, and so on. Therefore, data mining is considered as one of the most important frontiers in database systems and one of the most promising, new database applications in the information industry.A classification of data mining systemsData mining is an interdisciplinary field, the confluence of a set of disciplines, including database systems, statistics, machine learning, visualization, and information science. Moreover, depending on the data mining approach used, techniques from other disciplines may be applied, such as neural networks, fuzzy and or rough set theory, knowledge representation, inductive logic programming, or high performance computing. Depending on the kinds of data to bemined or on the given data mining application, the data mining system may also integrate techniques from spatial data analysis, Information retrieval, pattern recognition, image analysis, signal processing, computer graphics, Web technology, economics, or psychology.Because of the diversity of disciplines contributing to data mining, data mining research is expected to generate a large variety of data mining systems. Therefore, it is necessary to provide a clear classification of data mining systems. Such a classification may help potential users distinguish data mining systems and identify those that best match their needs. Data mining systems can be categorized according to various criteria, as follows.1) Classification according to the kinds of databases mined.A data mining system can be classified according to the kinds of databases mined. Database systems themselves can be classified according to different criteria (such as data models, or the types of data or applications involved), each of which may require its own data mining technique. Data mining systems can therefore be classified accordingly.For instance, if classifying according to data models, we may have a relational, transactional, object-oriented,object-relational, or data warehouse mining system. If classifying according to the special types of data handled, we may have a spatial, time -series, text, or multimedia data mining system , or a World-Wide Web mining system . Other system types include heterogeneous data mining systems, and legacy data mining systems.2) Classification according to the kinds of knowledge mined. Data mining systems can be categorized according to the kinds of knowledge they mine, i.e., based on data mining functionalities, such as characterization, discrimination, association, classification, clustering, trend and evolution analysis, deviation analysis , similarity analysis, etc. A comprehensive data mining system usually provides multiple and/or integrated data mining functionalities.Moreover, data mining systems can also be distinguished based on the granularity or levels of abstraction of the knowledge mined, including generalized knowledge(at a high level of abstraction), primitive-level knowledge(at a raw data level), or knowledge at multiple levels (considering several levels of abstraction). An advanced data mining system should facilitate the discovery of knowledge at multiple levels of abstraction.3) Classification according to the kinds of techniques utilized.Data mining systems can also be categorized according to the underlying data mining techniques employed. 
These techniques can be described according to the degree of user interaction involved (e.g., autonomous systems, interactive exploratory systems, query-driven systems) or the methods of data analysis employed (e.g., database-oriented or data warehouse-oriented techniques, machine learning, statistics, visualization, pattern recognition, neural networks, and so on). A sophisticated data mining system will often adopt multiple data mining techniques or work out an effective, integrated technique that combines the merits of a few individual approaches.

Translation: What is data mining? Simply stated, data mining refers to extracting or "mining" knowledge from large amounts of data.
ASP.NET 2.0 Database Foreign Literature, Translation, and References (English Thesis)
Developing ASP.NET 2.0 Database Programs in VS2005
1. Introduction
On November 7, 2005, Microsoft officially released .NET 2.0 (including ASP.NET 2.0), Visual Studio 2005, and SQL Server 2005.
All of these components are designed to work side by side independently.
That is, version 1.x and version 2.0 can be installed on the same machine; you can have both Visual Studio .NET 2002/2003 and Visual Studio 2005, along with SQL Server 2000 and SQL Server 2005.
Moreover, Microsoft has also released Express SKUs of Visual Studio 2005 and SQL Server 2005.
Note that the Express editions do not have all of the features of the Professional editions.
In addition to supporting 1.x-style data access, ASP.NET 2.0 itself includes several new data source controls that make accessing and modifying database data extremely easy.
Database Foreign References and Translation
Database Management Systems: Enforcing Data Integrity
A database is only as good as the confidence its users place in it.
That is why the server must enforce data integrity rules and business policies.
Enforcing data integrity in the SQL Server database itself ensures that complex business policies are followed and that mandatory relationships between data elements are respected.
Because SQL Server's client/server architecture allows you to use a variety of front-end applications to manipulate and present the same data from the server, it would be cumbersome to encode all the necessary integrity constraints, security permissions, and business rules into every application.
If all of an enterprise's policies were encoded in the front-end applications, every application would have to be changed each time a business policy changed.
Even if you attempted to encode the business rules into every client application, the risk of a malfunctioning application would still remain.
Most applications cannot be fully trusted; the server must be able to act as the final arbiter, and it must not provide a back door through which a poorly written or malicious program can compromise integrity.
SQL Server uses advanced data integrity features such as stored procedures, declarative referential integrity (DRI), data types, constraints, rules, defaults, and triggers to enforce data integrity.
Each of these features has its own purpose within the database; by combining them, you can make your database flexible, easy to manage, and secure.
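As a brief sketch of how several of these features can work together, the following example declares two hypothetical tables that combine a data type, a DEFAULT, a CHECK constraint, and declarative referential integrity; the table and column names are assumptions made for illustration, not definitions taken from the text above.

-- Illustrative only: combines a default, a check constraint, and DRI.
CREATE TABLE Customers (
   CustId   INT         NOT NULL PRIMARY KEY,
   CustName VARCHAR(40) NOT NULL
)

CREATE TABLE Orders (
   OrderId   INT      NOT NULL PRIMARY KEY,
   CustId    INT      NOT NULL
       REFERENCES Customers(CustId),              -- declarative referential integrity
   OrderDate DATETIME NOT NULL DEFAULT (GETDATE()),  -- default value
   Amount    MONEY    NOT NULL CHECK (Amount >= 0)   -- business rule as a constraint
)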
Declarative Data Integrity
When you define a table, you specify the columns that make up its primary key.
This is known as a PRIMARY KEY constraint.
SQL Server uses the PRIMARY KEY constraint to guarantee that the uniqueness of the values in the designated columns is never violated.
It enforces entity integrity for the table by ensuring that the table has a primary key.
Sometimes more than one column (or combination of columns) in a table can uniquely identify a row; for example, an employee table might have both an employee number (emp_id) column and a social security number (soc_sec_num) column, the values of both of which are considered unique.
Such columns are often referred to as alternate keys or candidate keys.
These keys must also be unique.
Although a table can have only one primary key, it can have multiple candidate keys.
SQL Server supports the concept of multiple candidate keys through UNIQUE constraints.
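A minimal sketch of the employee example discussed above might look like the following, with emp_id as the primary key and soc_sec_num declared UNIQUE as a candidate (alternate) key; the exact table definition is assumed for illustration only.

CREATE TABLE employee (
   emp_id      INT         NOT NULL PRIMARY KEY,   -- primary key: entity integrity
   soc_sec_num CHAR(11)    NOT NULL UNIQUE,        -- candidate/alternate key
   emp_name    VARCHAR(40) NOT NULL
)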
Transact-SQL Cookbook
Chapter 1. Pivot Tables
1.1 Using a Pivot Table
1.1.1 Problem
A sequence of elements is often needed to support the solution of various problems.
For example, given a date range, you may wish to generate one row for each date in that range.
Or, you may wish to translate a series of values returned in separate rows into a series of values in separate columns of the same row.
To implement this kind of functionality, you can use a permanent table that stores a series of sequential numbers.
Such a table is called a pivot table.
Many recipes in this book use a pivot table, and in all cases the table is named Pivot.
This recipe shows you how to create that table.
1.1.2 Solution
First, create the pivot table. Next, create a table named Foo that will help you populate the pivot table:

CREATE TABLE Pivot (
   i INT,
   PRIMARY KEY(i)
)

CREATE TABLE Foo (
   i CHAR(1)
)

The Foo table is a simple support table into which you should insert the following 10 rows:

INSERT INTO Foo VALUES('0')
INSERT INTO Foo VALUES('1')
INSERT INTO Foo VALUES('2')
INSERT INTO Foo VALUES('3')
INSERT INTO Foo VALUES('4')
INSERT INTO Foo VALUES('5')
INSERT INTO Foo VALUES('6')
INSERT INTO Foo VALUES('7')
INSERT INTO Foo VALUES('8')
INSERT INTO Foo VALUES('9')

Using the 10 rows in the Foo table, you can easily populate the Pivot table with 1,000 rows. To get 1,000 rows from 10 rows, join Foo to itself three times to create a Cartesian product:

INSERT INTO Pivot
SELECT f1.i+f2.i+f3.i
FROM Foo f1, Foo f2, Foo f3

If you list the rows of the Pivot table, you will see that it has the required number of elements, numbered from 0 through 999.
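As a quick check (an added verification step, not part of the original recipe), the following query should confirm that the Pivot table now holds 1,000 rows numbered from 0 through 999:

SELECT COUNT(*) AS RowCnt, MIN(i) AS MinVal, MAX(i) AS MaxVal
FROM Pivot
-- Expected result: RowCnt = 1000, MinVal = 0, MaxVal = 999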
1.1.3 Discussion
As you will see in the recipes that follow in this book, a pivot table is typically used to add a sequencing property to a query.
Some form of pivot table can be found in many database-based systems, although it is often hidden from users and used mainly inside predefined queries and procedures.
You have already seen how the join of several instances of the Foo table controls the number of rows that the INSERT statement generates for the pivot table.
The values from 0 through 999 are generated by concatenating strings.
The digit values in Foo are strings, not numbers.
Consequently, when the plus (+) operator is used, it performs concatenation, and we get results such as the following:

'0' + '0' + '0' = '000'
'0' + '0' + '1' = '001'

These results are then inserted into the integer column of the destination Pivot table.
When you use an INSERT statement to insert strings into an integer column, the database implicitly converts the strings into integers.
The Cartesian product of the Foo instances ensures that all possible combinations are generated and, therefore, that all possible values from 0 through 999 are produced.
It is worth pointing out that this example generates pivot numbers from 0 through 999 and no negative numbers.
You can easily generate negative numbers if you need them by repeating the INSERT statement with a minus ('-') sign in front of the concatenated string, but be careful about the 0 row.
There is no such thing as -0, so you do not want to include the '000' row when generating the negative pivot numbers.
If you did, you would end up with two 0 rows in your pivot table.
In our case, a duplicate 0 row is impossible anyway, because we defined a primary key on the Pivot table.
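A sketch of the negative-number variation described above might look like this; it is not part of the original recipe, and it reuses the Foo table while excluding the '000' combination so that no duplicate zero is produced:

INSERT INTO Pivot
SELECT '-' + f1.i + f2.i + f3.i
FROM Foo f1, Foo f2, Foo f3
WHERE f1.i + f2.i + f3.i <> '000'   -- there is no -0, so skip the zero row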
The pivot table may well be the most useful table in the world.
Once you have used it, you will find it almost impossible to build a serious application without it.
As a demonstration, let's use the pivot table to quickly generate an ASCII chart for character codes 32 through 126:

SELECT i Ascii_Code, CHAR(i) Ascii_Char
FROM Pivot
WHERE i BETWEEN 32 AND 126

Ascii_Code  Ascii_Char
----------- ----------
32
33          !
34          "
35          #
36          $
37          %
38          &
39          '
40          (
41          )
42          *
43          +
44          ,
45          -
46          .
47          /
48          0
49          1
50          2
51          3
...

What makes this such a good use of the pivot table is that you produce rows in the output that have no equivalent rows in any input table.
Without a pivot table, this would be a difficult, if not impossible, task.
Simply by specifying a range and then selecting the pivot rows within that range, we are able to generate data that does not exist in any database table.
As another example of the usefulness of a pivot table, we can easily use it to generate a calendar for the next seven days:

SELECT CONVERT(CHAR(10), DATEADD(d,i,CURRENT_TIMESTAMP), 121) date,
       DATENAME(dw, DATEADD(d,i,CURRENT_TIMESTAMP)) day
FROM Pivot
WHERE i BETWEEN 0 AND 6

date       day
---------- ------------------------------
2001-11-05 Monday
2001-11-06 Tuesday
2001-11-07 Wednesday
2001-11-08 Thursday
2001-11-09 Friday
2001-11-10 Saturday
2001-11-11 Sunday

These queries are just quick examples, listed here to show you how a pivot table can be used in a query.
As you will see in other recipes, the pivot table is often an indispensable tool for solving problems quickly and efficiently.
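For instance, the date-range problem mentioned at the start of this recipe can be handled the same way. The following sketch is an added example, under the assumption that the date span fits within the 1,000 rows of the Pivot table; it generates one row per day between two dates:

DECLARE @start DATETIME, @end DATETIME
SET @start = '2001-11-05'
SET @end   = '2001-11-11'

SELECT CONVERT(CHAR(10), DATEADD(d, i, @start), 121) AS date
FROM Pivot
WHERE i <= DATEDIFF(d, @start, @end)   -- one row for each day in the range
ORDER BY i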
Chapter 2. Sets
SQL, as a language, is built around the concept of sets.
You may remember studying sets in elementary school, or perhaps you studied set algebra in high school or college.
Although statements such as SELECT, UPDATE, and DELETE can be applied to one row of data at a time, these statements are designed to operate on sets of data, and you gain the greatest advantage when you use them that way.
In spite of this, we commonly see programs that manipulate data one row at a time rather than taking advantage of SQL's powerful set-processing capabilities.
We hope that with this chapter we can open your eyes to the power of set operations.
When you write SQL statements, do not think in terms of a procedure that selects one record, updates it, and then selects another.
Instead, think in terms of operating on a set of records all at once.
If you are used to procedural thinking, set-oriented thinking can take some getting used to.
To help you, this chapter presents recipes that demonstrate the power of a set-oriented approach to programming with SQL.
The recipes in this chapter are organized to show the different types of operations that can be performed on sets.
You will see how to find the common elements of sets, how to summarize a set of data, and how to find the element of a set that represents an extreme.
Our operations do not always correspond to the strict mathematical definitions of set operations.
Rather, we take those definitions and use set-algebra terms to frame and solve real-world problems.
In the real world, some deviation from the strict mathematical definitions is necessary.
For example, it is often necessary to impose an order on the elements of a set, an operation that has no meaning for a set in the strict mathematical sense.
2.1 Introduction
Before diving into the recipes, we want to walk through a few basic concepts and briefly define the terms used in this chapter.
Although we trust that you are familiar with the mathematical concepts of intersection and union, we want to place these set-algebra terms in a real-world context.
2.1.1 Components
There are three types of components to be aware of when working with sets.
The first is the set itself.
A set is a collection of elements and, for our purposes, an element is a row of a database table or a row returned by a query.
Finally, there is the universe, which is the term we use to refer to all possible elements for a given set.
2.1.1.1 Sets
A set is a collection of elements.
By definition, the elements may not be duplicated, and they have no order.
Here the mathematical definition of a set differs from the way sets are actually used in SQL.
In the real world, it is often useful to sort the elements of a set into a specified order.
Doing so allows you to find extremes, such as the top five or the bottom five records.
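For instance, once an order is imposed, finding such extremes is trivial. The following small sketch is an added illustration (reusing the Pivot table from Chapter 1) rather than part of the original text:

-- Top five values in the set
SELECT TOP 5 i FROM Pivot ORDER BY i DESC

-- Bottom five values in the set
SELECT TOP 5 i FROM Pivot ORDER BY i ASC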
Figure 2-1 shows an example of two sets.
We will refer to these examples as we discuss the various aspects of set terminology.
For our purposes, we will consider a set to be a collection of rows in a table that are identified by a common element.
Consider, for example, the following table of order items.
This table is a collection of sets, in which each set is identified by a unique order-identification number:

CREATE TABLE OrderItems (
   OrderId INTEGER,
   ItemId INTEGER,
   ProductId CHAR(10),
   Qty INTEGER,
   PRIMARY KEY(OrderId, ItemId)
)

Each set in this case is an order, and an order can have many elements, none of which are duplicated.
The element rows define the products, and the quantities of those products, that were ordered.
The common element is the OrderId column.
Using SQL, it is easy to list all the elements of one set.
You simply issue a SELECT statement that identifies the specific set of interest.
The following query returns all line-item records in the set identified by order number 112:

SELECT * FROM OrderItems WHERE OrderId=112

In this chapter, the sets we work with will always be within one table.
Many authors try to demonstrate set operations using two different tables.
There are two problems with that approach.
First, from a practical standpoint, you will rarely find two database tables that have exactly the same structure.
Second, there are many hidden possibilities for writing queries that emerge when you think of different sets as different slices of the same table.
By concentrating on a single table, we hope to open your mind to these possibilities.
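As a small added illustration of treating different sets as slices of the same table, the following query finds the products that two orders have in common; order number 113 is an assumed example value, since only order 112 appears in the text:

SELECT DISTINCT o1.ProductId
FROM OrderItems o1, OrderItems o2
WHERE o1.ProductId = o2.ProductId
  AND o1.OrderId = 112
  AND o2.OrderId = 113    -- 113 is a hypothetical second order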
2.1.1.2 Elements
An element is a member of a set.
In Figure 2-1, each individual letter is an element.
For our purposes, when working with tables, an element is a row of a table.
When working with SQL, it is often useful not to think of an element as a single, indivisible entity.
In the pure mathematical sense, it is not possible to divide an element of a set into two or more components.
In SQL, however, you can divide an element into its component parts.
A table is usually made up of many different columns, and you will often write queries that operate on only a subset of those columns.
For example, let's say that you want to find all orders that contain an explosive, regardless of the quantity.
Your elements are the rows of the OrderItems table.
You need to use the ProductId column to identify explosives, and you need to return the OrderId column to identify the orders, but you do not use the other columns in the table.
Here is the query:

SELECT OrderId
FROM OrderItems o
GROUP BY OrderId
HAVING EXISTS(
   SELECT *
   FROM OrderItems o1
   WHERE o1.ProductId='Explosive' AND o.OrderId=o1.OrderId)

This query actually uses one of the set operations that you will read about in this chapter.
The operation is called a containment operation, and it corresponds to the EXISTS keyword in the query.
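For comparison, the same containment test can also be written without EXISTS. The following equivalent formulation is an added sketch, not from the book; the version above is phrased with EXISTS precisely to illustrate the containment operation:

SELECT DISTINCT OrderId
FROM OrderItems
WHERE ProductId = 'Explosive'   -- returns the same orders as the EXISTS query above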
2.1.1.3 Universes
A universe is the set of all possible elements that can belong to a given set.
Consider the two sets shown in Figure 2-1.
Each set is made up of letters of the alphabet.
If we decide that only letters can be set elements, then the universe for the two sets is the set of all letters, as shown in Figure 2-2.
For a more realistic example of a universe, consider a school whose curriculum offers students 40 possible courses.
Each student selects a small number of those 40 courses to take in a given semester.
The courses that a student is taking form a set.
Different students take different combinations and numbers of courses.
The sets are not the same, nor are they all the same size, but they all contain elements from the same universe.
Every student must choose from the same 40 possibilities.
In the student/course example, all of the sets draw their elements from the same universe.
It is also possible for different sets in one table to have different universes.
For example, suppose that a table lists the case studies completed by students.
Suppose further that the universe of possible case studies is different for each course.
If you consider a set to be defined by a course and a student, then the universe of elements depends on the course the student is taking.
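One way to make the notion of a universe concrete in SQL is with a foreign key. The following sketch uses hypothetical Courses and StudentCourses tables (names assumed purely for illustration) so that every element of a student's course set must come from the universe of offered courses:

-- Illustrative only: Courses is the universe; StudentCourses rows must come from it.
CREATE TABLE Courses (
   CourseId   INT         NOT NULL PRIMARY KEY,
   CourseName VARCHAR(40) NOT NULL
)

CREATE TABLE StudentCourses (
   StudentId INT NOT NULL,
   CourseId  INT NOT NULL REFERENCES Courses(CourseId),  -- element must belong to the universe
   PRIMARY KEY (StudentId, CourseId)
)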