Big Data Foreign-Language Literature Translation and Review (the document contains both the English original and the Chinese translation)

Original text:

Data Mining and Data Publishing

Data mining is the extraction of interesting patterns or knowledge from huge amounts of data. The initial idea of privacy-preserving data mining (PPDM) was to extend traditional data mining techniques to work with data modified to mask sensitive information. The key issues were how to modify the data and how to recover the data mining result from the modified data. PPDM considers the problem of running data mining algorithms on confidential data that is not supposed to be revealed even to the party running the algorithm. In contrast, privacy-preserving data publishing (PPDP) is not necessarily tied to a specific data mining task, and the data mining task may be unknown at the time of data publishing. PPDP studies how to transform raw data into a version that is immunized against privacy attacks but still supports effective data mining tasks. Privacy preservation for both data mining (PPDM) and data publishing (PPDP) has become increasingly popular because it allows privacy-sensitive data to be shared for analysis purposes. One well-studied approach is the k-anonymity model [1], which in turn led to other models such as confidence bounding, l-diversity, t-closeness, and (α,k)-anonymity. In particular, all known mechanisms try to minimize information loss, and such an attempt provides a loophole for attacks. The aim of this paper is to present a survey of the most common attack techniques against anonymization-based PPDM and PPDP and to explain their effects on data privacy.

Although data mining is potentially useful, many data holders are reluctant to provide their data for data mining for fear of violating individual privacy. In recent years, studies have sought to ensure that the sensitive information of individuals cannot be identified easily.

Anonymity Models

k-anonymization techniques have been the focus of intense research in the last few years.
In order to ensure anonymization of data while minimizing the information loss that results from data modification, several extended models have been proposed, which are discussed as follows.

1. k-Anonymity

k-anonymity is one of the most classic models. It prevents joining attacks by generalizing and/or suppressing portions of the released microdata so that no individual can be uniquely distinguished from a group of size k. A data set is k-anonymous (k ≥ 1) if each record in the data set is indistinguishable from at least (k - 1) other records within the same data set. The larger the value of k, the better the privacy is protected. k-anonymity can ensure that individuals cannot be uniquely identified by linking attacks.

2. Extending Models

k-anonymity does not provide sufficient protection against attribute disclosure, because a k-anonymous data set permits strong attacks when the sensitive attributes lack diversity. The notion of l-diversity attempts to solve this problem by requiring that each equivalence class has at least l well-represented values for each sensitive attribute; an equivalence class that meets this requirement is said to have l-diversity. This gives l-diversity an advantage over k-anonymity. In addition, there are semantic relationships among the attribute values, and different values have very different levels of sensitivity. A frequency bound of the kind used in (α,k)-anonymity therefore requires that, after anonymization, the frequency (as a fraction) of a sensitive value in any equivalence class is no more than α.

3. Related Research Areas

Several polls show that the public has an increased sense of privacy loss. Since data mining is often a key component of information systems, homeland security systems, and monitoring and surveillance systems, it gives the wrong impression that data mining is a technique for privacy intrusion.
This lack of trust has become an obstacle to the benefit of the technology. For example, the potentially beneficial data mining research project Terrorism Information Awareness (TIA) was terminated by the US Congress due to its controversial procedures for collecting, sharing, and analyzing the trails left by individuals. Motivated by privacy concerns about data mining tools, a research area called privacy-preserving data mining (PPDM) emerged in 2000. The initial idea of PPDM was to extend traditional data mining techniques to work with data modified to mask sensitive information. The key issues were how to modify the data and how to recover the data mining result from the modified data. The solutions were often tightly coupled with the data mining algorithms under consideration. In contrast, privacy-preserving data publishing (PPDP) is not necessarily tied to a specific data mining task, and the data mining task is sometimes unknown at the time of data publishing. Furthermore, some PPDP solutions emphasize preserving data truthfulness at the record level, whereas PPDM solutions often do not preserve this property.

PPDP differs from PPDM in several major ways:

1) PPDP focuses on techniques for publishing data, not techniques for data mining. In fact, it is expected that standard data mining techniques are applied to the published data. In contrast, the data holder in PPDM needs to randomize the data in such a way that data mining results can be recovered from the randomized data. To do so, the data holder must understand the data mining tasks and algorithms involved. This level of involvement is not expected of the data holder in PPDP, who usually is not an expert in data mining.

2) Neither randomization nor encryption preserves the truthfulness of values at the record level; therefore, the released data are basically meaningless to the recipients. In such a case, the data holder in PPDM may consider releasing the data mining results rather than the scrambled data.

3) PPDP primarily "anonymizes" the data by hiding the identity of record owners, whereas PPDM seeks to directly hide the sensitive data.

Excellent surveys and books on randomization and cryptographic techniques for PPDM can be found in the existing literature. A family of research work called privacy-preserving distributed data mining (PPDDM) aims at performing some data mining task on a set of private databases owned by different parties. It follows the principle of Secure Multiparty Computation (SMC) and prohibits any data sharing other than the final data mining result. Clifton et al. present a suite of SMC operations, such as secure sum, secure set union, secure size of set intersection, and scalar product, that are useful for many data mining tasks. In contrast, PPDP does not perform the actual data mining task but is concerned with how to publish the data so that the anonymized data are useful for data mining. We can say that PPDP protects privacy at the data level while PPDDM protects privacy at the process level. They address different privacy models and data mining scenarios.

In the field of statistical disclosure control (SDC), research focuses on privacy-preserving publishing methods for statistical tables. SDC considers three types of disclosure, namely identity disclosure, attribute disclosure, and inferential disclosure. Identity disclosure occurs if an adversary can identify a respondent from the published data. Revealing that an individual is a respondent of a data collection may or may not violate confidentiality requirements. Attribute disclosure occurs when confidential information about a respondent is revealed and can be attributed to the respondent. Attribute disclosure is the primary concern of most statistical agencies in deciding whether to publish tabular data.
Inferential disclosure occurs when individual information can be inferred with high confidence from statistical information in the published data.

Some other SDC work studies the non-interactive query model, in which the data recipients can submit one query to the system. This type of non-interactive query model may not fully address the information needs of data recipients because, in some cases, it is very difficult for a data recipient to accurately construct a query for a data mining task in one shot. Consequently, there is a series of studies on the interactive query model, in which the data recipients, including adversaries, can submit a sequence of queries based on previously received query results. The database server is responsible for keeping track of all queries from each user and determining whether the currently received query violates the privacy requirement with respect to all previous queries. One limitation of any interactive privacy-preserving query system is that it can only answer a sublinear number of queries in total; otherwise, an adversary (or a group of corrupted data recipients) will be able to reconstruct all but a 1 - o(1) fraction of the original data, which is a very strong violation of privacy. When the maximum number of queries is reached, the query service must be closed to avoid a privacy leak. In the case of the non-interactive query model, the adversary can issue only one query and, therefore, the non-interactive query model cannot achieve the same degree of privacy defined by the interactive model. One may consider privacy-preserving data publishing to be a special case of the non-interactive query model.

This paper presents a survey of the most common attack techniques against anonymization-based PPDM and PPDP and explains their effects on data privacy.
k-anonymity protects the identity of respondents and reduces linking attacks. In the case of a homogeneity attack, however, a simple k-anonymity model fails, and a concept that prevents this attack is needed; the solution is l-diversity. All tuples are arranged in well-represented form, so the adversary is diverted to l places, that is, to l sensitive values. l-diversity is in turn limited in the case of a background knowledge attack, because no one can predict the knowledge level of an adversary. It is observed that generalization and suppression are also applied to attributes that do not need this extent of privacy, which reduces the precision of the published table. e-NSTAM (extended Sensitive Tuples Anonymity Method) is applied to sensitive tuples only and reduces information loss, but this method also fails in the case of multiple sensitive tuples. Generalization with suppression is also a cause of data loss, because suppression withholds values that do not satisfy the k factor. Future work on this front can include defining a new privacy measure along with l-diversity for multiple sensitive attributes, and generalizing attributes without suppression by using other techniques for achieving k-anonymity, because suppression reduces the precision of the published table.

Translation:

Data Mining and Data Publishing

Data mining is the extraction of large numbers of interesting patterns or knowledge from large amounts of data.
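The k-anonymity and l-diversity definitions in the survey above can be checked mechanically on a small table. The following is a minimal sketch, not the survey authors' method. It assumes records are plain dictionaries, reads "well-represented" in its simplest sense of "distinct values", and uses hypothetical names (`equivalence_classes`, `is_k_anonymous`, `is_l_diverse`, `max_sensitive_fraction`, `SAMPLE_TABLE`) that do not come from the original text.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, str]


def equivalence_classes(
    records: Sequence[Record], quasi_identifiers: Sequence[str]
) -> Dict[Tuple[str, ...], List[Record]]:
    """Group records that share the same quasi-identifier values."""
    groups: Dict[Tuple[str, ...], List[Record]] = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(
    records: Sequence[Record], quasi_identifiers: Sequence[str], k: int
) -> bool:
    """Every record must share its quasi-identifiers with at least k - 1 others."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(
    records: Sequence[Record],
    quasi_identifiers: Sequence[str],
    sensitive: str,
    l: int,
) -> bool:
    """Assumption: 'well-represented' is read as 'at least l distinct values'."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in groups.values()
    )


def max_sensitive_fraction(
    records: Sequence[Record], quasi_identifiers: Sequence[str], sensitive: str
) -> float:
    """Largest share of any single sensitive value within one equivalence class.

    A frequency bound alpha holds when this value is no more than alpha.
    """
    worst = 0.0
    for group in equivalence_classes(records, quasi_identifiers).values():
        counts = Counter(record[sensitive] for record in group)
        worst = max(worst, max(counts.values()) / len(group))
    return worst


# Hypothetical, already-generalized table used only for illustration.
SAMPLE_TABLE: List[Record] = [
    {"age": "2*", "zip": "130**", "disease": "flu"},
    {"age": "2*", "zip": "130**", "disease": "flu"},
    {"age": "3*", "zip": "148**", "disease": "cancer"},
    {"age": "3*", "zip": "148**", "disease": "flu"},
]
QUASI_IDENTIFIERS = ("age", "zip")
```

On `SAMPLE_TABLE` the 2-anonymity check passes, but the first equivalence class contains only "flu". It therefore fails the 2-diversity check and has a sensitive-value fraction of 1.0, which is the homogeneity attack the conclusion describes.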
Management Information System Overview

A management information system, commonly called an MIS, is a system composed of people, computers, and other components that collects, transmits, stores, maintains, and uses information. It emphasizes management and information, and it is increasingly widespread in the modern information society. MIS is a new discipline that spans a number of fields, such as management science, systems science, operations research, statistics, and computer science. Building on these disciplines, it has formed its own methods of gathering and processing information, producing a system interwoven vertically and horizontally.

In the 20th century, along with the vigorous development of the global economy, many economists proposed new management theories. In the 1950s, Simon put forward the idea that management depends on information and decision-making. In the same period, Wiener published his control theory, holding that management is a control process. In 1958, Gail wrote: "Management will obtain timely and accurate information at lower cost and achieve better control." During this period, computers began to be used for accounting, and the term "data processing" appeared.

In 1970, Walter T. Kennevan gave the term "management information system" a definition: "providing managers, staff, and outside personnel, in verbal or written form and at the right time, with past, present, and projected future information about the enterprise and its environment." This definition contains no application model and makes no mention of computer applications.
In 1985, Gordon B. Davis, a founder of the management information systems field and professor of management at the University of Minnesota, gave a more complete definition: "A management information system is a system that uses computer hardware and software resources, manual operations, models for analysis, planning, control, and decision-making, and a database. It provides information to support the operation, management, and decision-making functions of enterprises or organizations." This comprehensive definition explains the goals, functions, and composition of a management information system, and it also reflects the level that management information systems had reached at the time.

With the continuous improvement of science and technology, computer science has matured, and computers have become part of our everyday study and work. Today computers are very inexpensive, yet their performance has improved greatly, and they are used in many areas. Computers are so popular mainly for the following reasons. First, the computer can substitute for much complex labor. Second, the computer can greatly enhance people's work efficiency. Third, the computer can save a lot of resources.
Fourth, the computer can make sensitive documents more secure.

Computers have been applied and popularized in every field of economic and social life, so the old management methods no longer suit today's social development. Many people still rely on the earlier manual methods, which greatly hinders economic development. In recent years, as universities have grown in scale, the number of students has also increased, making educational administration increasingly complex and burdensome and consuming a great deal of manpower and material resources, while the existing level of student-grade management remains low. People have long used traditional paper documents to manage student grades, an approach with many shortcomings: low efficiency, poor confidentiality, and, over time, the accumulation of a large number of documents and data, which makes searching, updating, and maintenance difficult. Such a mechanism can no longer meet the needs of the times and has increasingly become a bottleneck in the day-to-day management of schools. In the information age, these traditional management methods will inevitably be replaced by computer-based information management.
As one application of computers, using computers to manage student performance information has advantages that manual management cannot match, for example: rapid retrieval, convenient searching, high reliability, large storage capacity, good confidentiality, long life, and low cost. These advantages can greatly improve the efficiency of student performance management, and they are also an important condition for scientific, standardized management and for connecting with the world. Therefore, developing such a set of management software is very necessary.

The design puts users first: the interface should be attractive and clear, and operation should be as simple as possible. As a practical system it should also be fault-tolerant, warning users promptly when they make a mistake so that they can correct it in time. The design takes full advantage of the functions of Visual FoxPro to build powerful software while using as few system resources as possible.

Visual FoxPro command structure and working methods

Visual FoxPro was originally called FoxBASE, a database product introduced by the U.S. company Fox Software that ran on DOS and was compatible with the dBASE family. After Microsoft acquired Fox Software, the product was developed to run on Windows and renamed Visual FoxPro. Visual FoxPro is a powerful relational database rapid application development tool. It can be used to create desktop database applications, client/server applications, and component-based Web service programs, and its functions can be extended with ActiveX controls, API functions, and other means.

First, work methods

1. Interactive mode of operation
(1) Command operation: in the VF Command window, commands are typed at the keyboard to carry out operations.
(2) Menu operation: VF uses menus, windows, and dialog boxes to provide an interactive graphical interface.
(3) Assisted operation: VF provides a wide range of user-friendly tools, such as wizards, designers, and builders.

2. Program mode of execution
In program mode, a group of commands and programming-language statements is saved in a program file with the extension .PRG. Running the file executes the commands automatically and displays the results.

Second, the structure of commands

1. Command structure
A VF command usually consists of two parts. The first part is the command verb, also known as the keyword, which specifies the operation the command performs. The second part is the command clauses, which specify the objects of the operation, its conditions, and other information. The general form of a VF command is:

    <command verb> <command clauses>

2. Symbols agreed in command formats
VF uses a uniform convention for the symbols in command and function formats. Their meanings are as follows:
Required items: parameters within angle brackets must be entered in the stated format.
Optional items: parameters within brackets are entered only when the user's specific needs call for them.

Third, the Project Manager

1. Creating a project
Command window: CREATE PROJECT <file name>

2. Project Manager
(1) Tabs
All - displays and manages all types of files in the application project; the "All" tab contains the full contents of the five tabs to its right.
Data - manages the various data files in the application project: databases, free tables, views, and queries.
Documents - displays and manages reports, labels, and other documents.
Classes - displays and manages the class library files used in the application project, including VF's system class libraries and the user's own class libraries.
Code - manages the code files used in the project, such as program files (.PRG), API libraries, and applications generated by the Project Manager (.APP).
(2) Work area
The Project Manager work area is the window that displays and manages all types of files.
(3) Command buttons
The command buttons on the right of the Project Manager provide commands for the files in the work-area window.

Fourth, using the Project Manager

1. Command button functions
New - with a file type selected in the work-area window, the New button adds a new file to the Project Manager window.
Add - adds to the Project Manager independent files created with the "New" command on the VF "File" menu or the "Wizards" command on the "Tools" menu, so that they are organized and managed together.
Modify - modifies files that already exist in the project, using the design interface for that type of file.
Run - with a specific file highlighted in the work-area window, runs that file.
Remove - removes the selected file from the project.
Build - compiles and links the relevant files in the project into an application or executable file.

Database System Design

Database design here means logical database design: organizing data according to a classification scheme and logical levels, in a user-oriented way. It requires integrating the archived data and data needs of the enterprise's departments, analyzing the relationships among the various data, and structuring them in accordance with the DBMS.

Management Information System Overview

A management information system, which we often call an MIS (Management Information System), is a system composed of people, computers, and other components that can collect, transmit, store, maintain, and use information. It is becoming ever more widespread in a modern society that emphasizes management and information.
Audit Risk: Foreign-Language Literature Translation (Latest Translation)

Source: C E Hogan. The Discussion of Audit Risk Control [J]. Contemporary Accounting Research, 2015, 25(1): 219.

Original text

The Discussion of Audit Risk Control

C E Hogan

Abstract

For any market, seeking the optimal allocation of resources is an internal requirement, and this requirement presupposes complete information among market participants. In reality, however, information asymmetry inevitably exists between investors and investees, creditors and debtors, and regulators and the regulated; the audit profession arose to eliminate this information asymmetry. Certified public accountants verify the financial and other information that enterprises report externally, and bringing the information held by market participants as close as possible to complete information is the process of auditing. Because an audit conclusion is a subjective conclusion that certified public accountants reach on the basis of sampling, it usually cannot amount to perfectly complete information. This gives rise to audit risk, which is inherent in auditing itself and cannot be evaded.

Keywords: audit risk, audit risk management, risk control

1 Introduction

The auditing profession has developed into an indispensable part of the market economy. Auditing holds an important place in establishing and maintaining capital markets, and a financial market without auditing is hard to imagine.

In recent years, however, lawsuits against accounting firms and certified public accountants have erupted repeatedly, and the volume of litigation and the high damages awarded have harmed the development of the whole industry. Statistics published in 2002 by an American accounting journal show that the lawsuits brought against auditors in the United States over the preceding 15 years far exceeded the total number in the profession's 105-year history. In 2007 alone, the international accounting firms Ernst & Young, KPMG, Deloitte, and PwC faced, in Europe, six damages lawsuits with claims of more than $1 billion and 12 with claims between $350 million and $1 billion.

Strengthening research on audit risk and its management therefore not only concerns the interests and reputation of auditors but also bears on the construction of the economic system. It benefits the development of the audit profession and promotes its sound and healthy growth. It also helps contain or block the chain reactions that audit risk can trigger, directs audit resources toward economic and social benefits, and promotes the reasonable allocation of social resources and social stability.

2 Literature review

In 1978, D.H. Roberts proposed the ultimate audit risk model, which expresses ultimate risk in terms of inherent risk, control risk, analytical detection risk, and sampling plus non-sampling risk. In 1981, the Auditing Standards Board of the AICPA, in Statement on Auditing Standards No. 39, "Audit Sampling", put forward a new audit risk model. In this model, audit risk is analyzed in terms of four risks: inherent risk, control risk, analytical review risk, and test-of-details risk. Inherent risk and control risk represent the risk that a significant error exists in the financial statements, while analytical review risk and test-of-details risk represent the risk that a significant error in the financial statements is not found. In 1983, the Auditing Standards Board (AICPA), in Statement on Auditing Standards No. 47 (SAS 47), "Audit Risk and Materiality in Conducting an Audit", explained the audit risk model and revised it to:

    audit risk = inherent risk × control risk × detection risk
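The SAS 47 relation, audit risk = inherent risk × control risk × detection risk, is typically used in reverse: the auditor fixes an acceptable audit risk, assesses inherent and control risk, and solves for the detection risk that substantive procedures may tolerate. The sketch below illustrates that rearrangement only. The function name `planned_detection_risk`, the cap at 1.0, and the sample percentages are assumptions for illustration and do not come from the article.

```python
def planned_detection_risk(
    acceptable_audit_risk: float,
    inherent_risk: float,
    control_risk: float,
) -> float:
    """Solve AR = IR x CR x DR for DR, treating every risk as a probability.

    Assumption: the result is capped at 1.0, because a detection risk above
    100% has no meaning even when assessed inherent and control risk are low.
    """
    for name, value in (
        ("acceptable_audit_risk", acceptable_audit_risk),
        ("inherent_risk", inherent_risk),
        ("control_risk", control_risk),
    ):
        if not 0.0 < value <= 1.0:
            raise ValueError(f"{name} must be in (0, 1], got {value}")
    detection_risk = acceptable_audit_risk / (inherent_risk * control_risk)
    return min(detection_risk, 1.0)


if __name__ == "__main__":
    # Hypothetical engagement: 5% acceptable audit risk, 80% inherent risk,
    # 50% control risk, giving a tolerable detection risk of about 12.5%.
    print(planned_detection_risk(0.05, 0.80, 0.50))
```

The lower the tolerable detection risk that comes out, the more substantive testing is needed to hold overall audit risk at the acceptable level.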
Because this model covers the main audit risk factors, shows the quantitative relationship among them, and is convenient to measure, operable, and widely applicable, most audit organizations and international accounting firms use it, and the independent auditing standards also adopt it. In 2004, the international auditing standards revised the SAS 47 model and put forward a new audit risk model:

    audit risk = risk of material misstatement × detection risk

This model merges control risk and inherent risk into a combined risk, expressed as the risk of material misstatement. Under the model, audit risk depends on the size of the risk of material misstatement and of detection risk. The certified public accountant should carry out risk assessment procedures, evaluate the risk of material misstatement, and then design and perform further audit procedures according to the assessment results, in order to control detection risk and reduce audit risk to an acceptable level.

Some institutions and scholars have put forward their own views on audit risk theory. A model proposed in 1983 expresses audit risk in terms of inherent risk, control risk, analytical detection risk, and substantive test risk [6]. In 1988, the Auditing Practices Committee (APC) put forward an audit risk model that expresses audit risk as a product involving inherent risk, control risk, and sampling risk. In 1997, Alvin A. Arens and James K. Loebbecke, in their monograph "Auditing: An Integrated Approach", combined the systems-based and risk-based audit approaches: on the basis of a risk assessment of the audited entity, the auditor comprehensively analyzes and evaluates the factors that influence the entity's economic activity, determines the scope and focus of the audit according to the quantified risk level, and then carries out substantive examination.

3 Audit risk management and control

3.1 Audit project management and control

In the engagement acceptance stage, the auditor should first of all choose auditees carefully. Market information such as the industry's level of development, inter-industry linkages, macroeconomic conditions, and industry type helps auditors form a preliminary judgment of the client's current operating situation and thus an initial view of its risk. In examining the client's own information, the focus should be on its management level, its ability to continue as a going concern, the quality of its senior management, and so on. Auditors should pay special attention to unusual behavior when gaining this understanding, especially in audits of listed companies, where any sign of abnormal behavior is a risk signal. A related-party relationship between the auditor and the client affects the independence of the audit, so such relationships should be avoided when deciding whether to accept a new client, so as not to weaken the independence of certified public accountants.
During the acceptance stage, the list of new clients can also be circulated to the firm's professional auditors.

The implementation stage of the audit comprises two phases: testing of controls and transactions, followed by detailed analytical and balance testing. This stage is guided by the audit plan and oriented toward audit risk control, with obtaining audit evidence as its basic goal. It first tests the establishment of, and compliance with, the audited entity's internal control system and revises the audit plan according to the test results. It then performs substantive tests on the items in the accounting statements and evaluates them according to the test results.

Certified public accountants achieve the audit objective by performing audit procedures, and the result is reflected in the audit report. The audit report reflects the fulfilment of the client's request and the quality with which the audit work was carried out, and it states the auditor's judgment and conclusions on the matters audited. The reporting stage is therefore the last link in controlling the quality and risk of an audit project.

3.2 Audit industry risk management and control

A sound system of laws and regulations is the basic safeguard against audit risk. Audit theory must have a rigorous internal logic if it is to become a mature discipline and guide audit practice.
The audit standards system, with the revised auditing standards at its core, should improve the application of the audit risk model and refine the specific procedures and methods for implementing risk-oriented auditing. These include evaluating the internal control system, control testing, audit sampling methods, the expected levels of audit risk, inherent risk, control risk, and detection risk used in the testing phase, and methods for evaluating legal liability and audit litigation risk. The aim is a normative, principled system of technical guidance, so that auditors' practice has rules and laws to follow.

Institutes of certified public accountants should give full play to their function as industry associations: further promote the improvement of industry standards, strengthen supervision, and establish systems for credit rating, filing, peer review, and experience exchange. In addition, they should promote legislation and rule-making and take measures to protect the lawful rights and interests of their members. On the basis of exploring practice and summarizing experience, the standards and guidelines that audit work must comply with should be formulated as soon as possible, clearly stipulating audit procedures, content, documentation, use of language, and so on. The supervisory constraint mechanism should be strengthened, and the relevant peer review regulations and systems should be established and perfected.

3.3 Audit environment risk management and control

The audit environment is constantly changing: industrial society is giving way to the information society and the knowledge economy, economic globalization is progressively being realized, the modern enterprise system is gradually being introduced, corporate governance structures are being further improved, and information technology is widely applied in audit practice.
The auditor's competence and skills, society's expectations of and requirements for auditing, and the development of related disciplines also play an important role in the audit environment.

Improving and reforming the audit environment cannot be achieved by the auditing profession or an institute of certified public accountants alone. It needs the joint efforts of the whole of society: fostering a correct public understanding of the auditing profession so as to narrow the audit expectation gap, improving the standardization of capital market operations and the transparency of information disclosure, strengthening the accounting legal system, and so on.

4 Conclusions

Auditing is an important means of monitoring social and economic development and optimizing the allocation of resources, and it is particularly important for the prosperity and stability of the capital market. Audit risk management runs through every aspect of audit activity. Public accounting firms and certified public accountants, as the main actors in audit risk management, must pay particular attention to it and strengthen it in their daily audit practice. They need to improve themselves and address the causes of audit risk in order to control it more effectively.

Translation

A Discussion of Audit Risk Control

C E Hogan

Abstract

For any market, seeking the optimal allocation of resources is an inherent requirement, and this requires that market participants have complete information about one another. In reality, however, information asymmetry inevitably exists between investors and investees, creditors and debtors, and regulators and the regulated. The auditing profession arose precisely to eliminate this information asymmetry.
The following are some common reference formats, shown in Chinese and in English:

1. Book
Chinese: 王小明. 计算机网络技术. 北京:清华大学出版社,2018.
English: Wang, X. Computer Network Technology. Beijing: Tsinghua University Press, 2018.

2. Article in Academic Journal
Chinese: 张婷婷,李伟. 基于深度学习的影像分割方法. 计算机科学与探索,2019,13(1):61-67.
English: Zhang, T. T., Li, W. Image Segmentation Method Based on Deep Learning. Computer Science and Exploration, 2019, 13(1): 61-67.

3. Conference Paper
Chinese: 王维,李丽. 基于云计算的智慧物流管理系统设计. 2019年国际物流与采购会议论文集,2019:112-117.
English: Wang, W., Li, L. Design of Smart Logistics Management System Based on Cloud Computing. Proceedings of the 2019 International Conference on Logistics and Procurement, 2019: 112-117.

4. Thesis/Dissertation
Chinese: 李晓华. 基于模糊神经网络的水质评价模型研究. 博士学位论文,长春:吉林大学,2018.
English: Li, X. H. Research on Water Quality Evaluation Model Based on Fuzzy Neural Network. Doctoral Dissertation, Changchun: Jilin University, 2018.

5. Report
Chinese: 国家统计局. 2019年国民经济和社会发展统计公报. 北京:中国统计出版社,2019.
English: National Bureau of Statistics. Statistical Communique of the People's Republic of China on the 2019 National Economic and Social Development. Beijing: China Statistics Press, 2019.

These are some common reference formats in Chinese and English; we hope they are helpful for your writing.
Cloud Computing: Translated Foreign-Language Reference (the document contains the English original and the Chinese translation)

Original text:

Technical Issues of Forensic Investigations in Cloud Computing Environments

Dominik Birk
Ruhr-University Bochum
Horst Goertz Institute for IT Security
Bochum, Germany

Abstract: Cloud Computing is arguably one of the most discussed information technologies today. It presents many promising technological and economic opportunities. However, many customers remain reluctant to move their business IT infrastructure completely to the cloud. One of their main concerns is Cloud Security and the threat of the unknown. Cloud Service Providers (CSP) encourage this perception by not letting their customers see what is behind their virtual curtain. A seldom discussed but, in this regard, highly relevant open issue is the ability to perform digital investigations. This continues to fuel insecurity on the sides of both providers and customers. Cloud Forensics constitutes a new and disruptive challenge for investigators. Due to the decentralized nature of data processing in the cloud, traditional approaches to evidence collection and recovery are no longer practical. This paper focuses on the technical aspects of digital forensics in distributed cloud environments. We contribute by assessing whether it is possible for the customer of cloud computing services to perform a traditional digital investigation from a technical point of view. Furthermore, we discuss possible solutions and possible new methodologies helping customers to perform such investigations.

I. INTRODUCTION

Although the cloud might appear attractive to small as well as to large companies, it does not come without its own unique problems. Outsourcing sensitive corporate data into the cloud raises concerns regarding the privacy and security of data. Security policies, a company's main pillar concerning security, cannot be easily deployed into distributed, virtualized cloud environments. 
This situation is further complicated by the unknown physical location of the company's assets. Normally, if a security incident occurs, the corporate security team wants to be able to perform its own investigation without depending on third parties. In the cloud, this is no longer possible: the CSP holds all the power over the environment and thus controls the sources of evidence. In the best case, a trusted third party acts as a trustee and vouches for the trustworthiness of the CSP. Furthermore, the implementation of the technical architecture and the circumstances within cloud computing environments bias the way an investigation may be processed. In detail, evidence data has to be interpreted by an investigator in a proper manner, which is hardly possible due to the lack of circumstantial information. [Footnote: We would like to thank the reviewers for the helpful comments and Dennis Heinson (Center for Advanced Security Research Darmstadt - CASED) for the profound discussions regarding the legal aspects of cloud forensics.] For auditors, this situation does not change: questions about who accessed specific data and information cannot be answered by the customers if no corresponding logs are available. With the increasing demand for using the power of the cloud to process sensitive information and data as well, enterprises face the issue of Data and Process Provenance in the cloud [10]. Digital provenance, meaning meta-data that describes the ancestry or history of a digital object, is a crucial feature for forensic investigations. In combination with a suitable authentication scheme, it provides information about who created and who modified what kind of data in the cloud. These are crucial aspects for digital investigations in distributed environments such as the cloud. Unfortunately, the aspects of forensic investigations in distributed environments have so far been mostly neglected by the research community. 
Current discussion centers mostly around security, privacy and data protection issues [35], [9], [12]. The impact of forensic investigations on cloud environments was little noticed, albeit mentioned by the authors of [1] in 2009: "[...] to our knowledge, no research has been published on how cloud computing environments affect digital artifacts, and on acquisition logistics and legal issues related to cloud computing environments." This statement is also confirmed by other authors [34], [36], [40], who stress that further research on incident handling, evidence tracking and accountability in cloud environments has to be done. At the same time, massive investments are being made in cloud technology. Combined with the fact that information technology increasingly permeates people's private and professional lives, thus mirroring more and more of people's actions, it becomes apparent that evidence gathered from cloud environments will be of high significance to litigation or criminal proceedings in the future. Within this work, we focus on the notion of cloud forensics by addressing the technical issues of forensics in all three major cloud service models and consider cross-disciplinary aspects. Moreover, we address the usability of various sources of evidence for investigative purposes and propose potential solutions to the issues from a practical standpoint. This work should be considered a surveying discussion of an almost unexplored research area. The paper is organized as follows: We discuss the related work and the fundamental technical background information of digital forensics, cloud computing and the fault model in sections II and III. In section IV, we focus on the technical issues of cloud forensics and discuss the potential sources and nature of digital evidence as well as investigations in XaaS environments, including the cross-disciplinary aspects. We conclude in section V.

II. RELATED WORK

Various works have been published in the field of cloud security and privacy [9], [35], [30], focusing on aspects of protecting data in multi-tenant, virtualized environments. Desired security characteristics for current cloud infrastructures mainly revolve around isolation of multi-tenant platforms [12], security of hypervisors in order to protect virtualized guest systems, and secure network infrastructures [32]. Although digital provenance, describing the ancestry of digital objects, still remains a challenging issue for cloud environments, several works have already been published in this field [8], [10], contributing to the issues of cloud forensics. Within this context, cryptographic proofs for verifying data integrity, mainly in cloud storage offers, have been proposed, yet they lack practical implementations [24], [37], [23]. Traditional computer forensics already has well-researched methods for various fields of application [4], [5], [6], [11], [13]. The aspects of forensics in virtual systems have also been addressed by several works [2], [3], [20], including the notion of virtual introspection [25]. In addition, the NIST already addressed Web Service Forensics [22], which has a huge impact on investigation processes in cloud computing environments. In contrast, the aspects of forensic investigations in cloud environments have mostly been neglected by both the industry and the research community. One of the first papers focusing on this topic was published by Wolthusen [40] after Bebee et al. had already introduced problems within cloud environments [1]. Wolthusen stressed that there is an inherent strong need for interdisciplinary work linking the requirements and concepts of evidence arising from the legal field to what can be feasibly reconstructed and inferred algorithmically or in an exploratory manner. 
In 2010, Grobauer et al. [36] published a paper discussing the issues of incident response in cloud environments. Unfortunately, no specific issues and solutions of cloud forensics were proposed there; this will be done within this work.

III. TECHNICAL BACKGROUND

A. Traditional Digital Forensics

The notion of Digital Forensics is widely known as the practice of identifying, extracting and considering evidence from digital media. Unfortunately, digital evidence is both fragile and volatile and therefore requires the attention of special personnel and methods in order to ensure that evidence data can be properly isolated and evaluated. Normally, the process of a digital investigation can be separated into three different steps, each having its own specific purpose:

1) In the Securing Phase, the major intention is the preservation of evidence for analysis. The data has to be collected in a manner that maximizes its integrity. This is normally done by a bitwise copy of the original media. As can be imagined, this represents a huge problem in the field of cloud computing, where you never know exactly where your data is and additionally do not have access to any physical hardware. However, the snapshot technology, discussed in section IV-B3, provides a powerful tool to freeze system states and thus makes digital investigations, at least in IaaS scenarios, theoretically possible.

2) We refer to the Analyzing Phase as the stage in which the data is sifted and combined. It is in this phase that the data from multiple systems or sources is pulled together to create as complete a picture and event reconstruction as possible. Especially in distributed system infrastructures, this means that bits and pieces of data are pulled together for deciphering the real story of what happened and for providing a deeper look into the data.

3) Finally, at the end of the examination and analysis of the data, the results of the previous phases will be reprocessed in the Presentation Phase. 
The report created in this phase is a compilation of all the documentation and evidence from the analysis stage. The main intention of such a report is that it contains all results and is complete and clear to understand.

Evidently, the success of these three steps strongly depends on the first stage. If it is not possible to secure the complete set of evidence data, no exhaustive analysis will be possible. However, in real-world scenarios often only a subset of the evidence data can be secured by the investigator. In addition, an important definition in the general context of forensics is the notion of a Chain of Custody. This chain clarifies how and where evidence is stored and who takes possession of it. Especially for cases which are brought to court, it is crucial that the chain of custody is preserved.

B. Cloud Computing

According to the NIST [16], cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal CSP interaction. The new raw definition of cloud computing brought several new characteristics such as multi-tenancy, elasticity, pay-as-you-go and reliability. Within this work, the following three models are used: In the Infrastructure as a Service (IaaS) model, the customer uses the virtual machine provided by the CSP to install his own system on it. The system can be used like any other physical computer with a few limitations. However, the additional customer control over the system comes with additional security obligations. Platform as a Service (PaaS) offerings provide the capability to deploy application packages created using the virtual development environment supported by the CSP. This service model can boost the efficiency of the software development process. 
In the Software as a Service (SaaS) model, the customer makes use of a service run by the CSP on a cloud infrastructure. In most cases this service can be accessed through an API for a thin client interface such as a web browser. Closed-source public SaaS offers such as Amazon S3 and GoogleMail can only be used in the public deployment model, leading to further issues concerning security, privacy and the gathering of suitable evidence. Furthermore, two main deployment models, private and public cloud, have to be distinguished. Common public clouds are made available to the general public. The corresponding infrastructure is owned by one organization acting as a CSP and offering services to its customers. In contrast, the private cloud is exclusively operated for an organization but may not provide the scalability and agility of public offers. The additional notions of community and hybrid cloud are not exclusively covered within this work. However, independently of the specific model used, the movement of applications and data to the cloud comes with limited control for the customer over the application itself, the data pushed into the applications and also the underlying technical infrastructure.

C. Fault Model

Be it an account for a SaaS application, a development environment (PaaS) or a virtual image of an IaaS environment, systems in the cloud can be affected by inconsistencies. Hence, for both customer and CSP it is crucial to have the ability to assign faults to the causing party, even in the presence of Byzantine behavior [33]. Generally, inconsistencies can be caused by the following two reasons:

1) Maliciously Intended Faults

Internal or external adversaries with specific malicious intentions can cause faults on cloud instances or applications. Economic rivals as well as former employees can be the reason for these faults and pose a constant threat to customers and CSP. 
This model also includes a malicious CSP, although this is assumed to be rare in real-world scenarios. Additionally, from the technical point of view, the movement of computing power to a virtualized, multi-tenant environment can pose further threats and risks to the systems. One reason for this is that if a single system or service in the cloud is compromised, all other guest systems and even the host system are at risk. Hence, besides the need for further security measures, precautions for potential forensic investigations have to be taken into consideration.

2) Unintentional Faults

Inconsistencies in technical systems or processes in the cloud do not necessarily have to be caused by malicious intent. Internal communication errors or human failures can lead to issues in the services offered to the customer (e.g. loss or modification of data). Although these failures are not caused intentionally, both the CSP and the customer have a strong interest in discovering the reasons and deploying corresponding fixes.

IV. TECHNICAL ISSUES

Digital investigations are about control of forensic evidence data. From the technical standpoint, this data can be available in three different states: at rest, in motion or in execution. Data at rest is represented by allocated disk space. Whether the data is stored in a database or in a specific file format, it allocates disk space. Furthermore, if a file is deleted, the disk space is de-allocated for the operating system, but the data is still accessible as long as the disk space has not been re-allocated and overwritten. This fact is often exploited by investigators, who examine this de-allocated disk space on hard disks. When data is in motion, it is transferred from one entity to another; a typical file transfer over a network, for example, can be seen as a data in motion scenario. Several encapsulated protocols contain the data, each leaving specific traces on systems and network devices which can in turn be used by investigators. 
Data can be loaded into memory and executed as a process. In this case, the data is neither at rest nor in motion but in execution. On the executing system, process information, machine instructions and allocated/de-allocated data can be analyzed by creating a snapshot of the current system state. In the following sections, we point out the potential sources for evidential data in cloud environments and discuss the technical issues of digital investigations in XaaS environments as well as suggest several solutions to these problems.

A. Sources and Nature of Evidence

Concerning the technical aspects of forensic investigations, the amount of potential evidence available to the investigator strongly diverges between the different cloud service and deployment models. The virtual machine (VM), which in most cases hosts the server application, provides several pieces of information that could be used by investigators. On the network level, network components can provide information about possible communication channels between the different parties involved. The browser on the client, often acting as the user agent for communicating with the cloud, also contains a lot of information that could be used as evidence in a forensic investigation. Independently of the model used, the following three components could act as sources for potential evidential data.

1) Virtual Cloud Instance: The VM within the cloud, where, for example, data is stored or processes are handled, contains potential evidence [2], [3]. In most cases, it is the place where an incident happened and hence provides a good starting point for a forensic investigation. The VM instance can be accessed by both the CSP and the customer who is running the instance. Furthermore, virtual introspection techniques [25] provide access to the runtime state of the VM via the hypervisor, and snapshot technology supplies a powerful technique for the customer to freeze specific states of the VM. 
Therefore, virtual instances can still be running during analysis, which leads to the case of live investigations [41], or can be turned off, leading to static image analysis. In SaaS and PaaS scenarios, the ability to access the virtual instance for gathering evidential information is highly limited or simply not possible.

2) Network Layer: Traditional network forensics is known as the analysis of network traffic logs for tracing events that have occurred in the past. Since the different ISO/OSI network layers provide a range of information on protocols and communication between instances within the cloud as well as with instances outside it [4], [5], [6], network forensics is theoretically also feasible in cloud environments. In practice, however, ordinary CSP currently do not provide any log data from the network components used by the customer's instances or applications. For instance, in case of a malware infection of an IaaS VM, it will be difficult for the investigator to get any form of routing information and network log data in general, which is crucial for further investigative steps. This situation gets even more complicated in the case of PaaS or SaaS. So again, the situation of gathering forensic evidence is strongly affected by the support the investigator receives from the customer and the CSP.

3) Client System: On the system layer of the client, it completely depends on the model used (IaaS, PaaS, SaaS) if and where potential evidence could be extracted. In most scenarios, the user agent (e.g. the web browser) on the client system is the only application that communicates with the service in the cloud. This especially holds for SaaS applications which are used and controlled by the web browser. But in IaaS scenarios, too, the administration interface is often controlled via the browser. 
Hence, in an exhaustive forensic investigation, the evidence data gathered from the browser environment [7] should not be omitted.

a) Browser Forensics: Generally, the circumstances leading to an investigation have to be differentiated: In ordinary scenarios, the main goal of an investigation of the web browser is to determine if a user has been the victim of a crime. In complex SaaS scenarios with high client-server interaction, this constitutes a difficult task. Additionally, customers make heavy use of third-party extensions [17] which can be abused for malicious purposes. Hence, the investigator might want to look for malicious extensions, searches performed, websites visited, files downloaded, information entered in forms or stored in local HTML5 stores, web-based email contents and persistent browser cookies to gather potential evidence data. Within this context, it is essential to investigate the appearance of malicious JavaScript [18] leading to, for example, unintended AJAX requests and hence modified usage of administration interfaces. Generally, the web browser contains a lot of electronic evidence data that could be used to give an answer to both of the above questions, even if the private mode is switched on [19].

B. Investigations in XaaS Environments

Traditional digital forensic methodologies permit investigators to seize equipment and perform detailed analysis on the media and data recovered [11]. In a distributed infrastructure organization like the cloud computing environment, investigators are confronted with an entirely different situation. They no longer have the option of seizing physical data storage. Data and processes of the customer are dispersed over an undisclosed number of virtual instances, applications and network elements. Hence, the question arises whether established findings of the computer forensic community in the field of digital forensics have to be revised and adapted to the new environment. 
Within this section, specific issues of investigations in SaaS, PaaS and IaaS environments will be discussed. In addition, cross-disciplinary issues which affect several environments uniformly will be taken into consideration. We also suggest potential solutions to the mentioned problems.

1) SaaS Environments: Especially in the SaaS model, the customer does not obtain any control of the underlying operating infrastructure such as network, servers, operating systems or the application that is used. This means that no deeper view into the system and its underlying infrastructure is provided to the customer. Only limited user-specific application configuration settings can be controlled, contributing to the evidence which can be extracted from the client (see section IV-A3). In a lot of cases this forces the investigator to rely on high-level logs which may be provided by the CSP. If the CSP does not run any logging application, the customer has no opportunity to create any useful evidence through the installation of any toolkit or logging tool. These circumstances do not allow a valid forensic investigation and lead to the assumption that customers of SaaS offers do not have any chance to analyze potential incidents.

a) Data Provenance: The notion of Digital Provenance is known as meta-data that describes the ancestry or history of digital objects. Secure provenance that records ownership and process history of data objects is vital to the success of data forensics in cloud environments, yet it is still a challenging issue today [8]. Although data provenance is of high significance for IaaS and PaaS as well, it poses a huge problem specifically for SaaS-based applications: Current globally operating public SaaS CSP offer Single Sign-On (SSO) access control to the set of their services. 
Unfortunately, in case of an account compromise, most CSP do not offer any possibility for the customer to figure out which data and information has been accessed by the adversary. For the victim, this situation can have a tremendous impact: If sensitive data has been compromised, it is unclear which data has been leaked and which has not been accessed by the adversary. Additionally, data could be modified or deleted by an external adversary or even by the CSP, e.g. due to storage reasons. The customer has no ability to prove otherwise. Secure provenance mechanisms for distributed environments can improve this situation but have not been practically implemented by CSP [10].

Suggested Solution: In private SaaS scenarios this situation is improved by the fact that the customer and the CSP are probably under the same authority. Hence, logging and provenance mechanisms could be implemented which contribute to potential investigations. Additionally, the exact location of the servers and the data is known at any time. Public SaaS CSP should offer additional interfaces for the purpose of compliance, forensics, operations and security matters to their customers. Through an API, the customers should have the ability to receive specific information such as access, error and event logs that could improve their situation in case of an investigation. Furthermore, due to the limited ability to receive forensic information from the server and to prove the integrity of stored data in SaaS scenarios, the client has to contribute to this process. This could be achieved by implementing Proofs of Retrievability (POR), in which a verifier (client) is enabled to determine that a prover (server) possesses a file or data object and that it can be retrieved unmodified [24]. Provable Data Possession (PDP) techniques [37] could be used to verify that an untrusted server possesses the original data without the need for the client to retrieve it. 
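The challenge-response idea behind such proofs can be illustrated with a deliberately simplified sketch. It is not the POR scheme of [24] or the PDP scheme of [37]: here the client keeps one keyed tag per block and the server returns the challenged blocks in full, whereas the published schemes are designed to avoid exactly this overhead. The block size, the UntrustedStore class and all function names are assumptions made for illustration only.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 4096  # illustrative block size (assumption, not from the paper)


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the data object into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def tag_block(key: bytes, index: int, block: bytes) -> bytes:
    """HMAC-SHA256 over the block position and content, so blocks cannot be swapped."""
    message = index.to_bytes(8, "big") + block
    return hmac.new(key, message, hashlib.sha256).digest()


def make_tags(key: bytes, data: bytes) -> list[bytes]:
    """Computed by the client before upload; the tags stay with the client."""
    return [tag_block(key, i, b) for i, b in enumerate(split_blocks(data))]


class UntrustedStore:
    """Hypothetical stand-in for the provider's storage service."""

    def __init__(self, data: bytes) -> None:
        self.blocks = split_blocks(data)

    def respond(self, indices: list[int]) -> dict[int, bytes]:
        """Answer a challenge by returning the requested blocks."""
        return {i: self.blocks[i] for i in indices}


def spot_check(key: bytes, tags: list[bytes], store: UntrustedStore,
               sample_size: int) -> bool:
    """Challenge a random sample of blocks and verify each against its stored tag."""
    rng = secrets.SystemRandom()
    indices = rng.sample(range(len(tags)), min(sample_size, len(tags)))
    response = store.respond(indices)
    return all(
        hmac.compare_digest(tags[i], tag_block(key, i, response[i]))
        for i in indices
    )
```

A client would call make_tags once before upload and later run spot_check at intervals. Sampling only a few blocks keeps each check cheap, at the price of detecting a modification of a single block only with some probability per round; checking every block detects it with certainty.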
Although these cryptographic proofs have not been implemented by any CSP, the authors of [23] introduced a new data integrity verification mechanism for SaaS scenarios which could also be used for forensic purposes.

2) PaaS Environments: One of the main advantages of the PaaS model is that the developed software application is under the control of the customer and, except for some CSP, the source code of the application does not have to leave the local development environment. Given these circumstances, the customer theoretically obtains the power to dictate how the application interacts with other dependencies such as databases, storage entities etc. CSP normally claim this transfer is encrypted, but this statement can hardly be verified by the customer. Since the customer has the ability to interact with the platform over a prepared API, system states and specific application logs can be extracted. However, potential adversaries who compromise the application during runtime should not be able to alter these log files afterwards.

Suggested Solution: Depending on the runtime environment, logging mechanisms could be implemented which automatically sign and encrypt the log information before its transfer to a central logging server under the control of the customer. Additional signing and encrypting could prevent potential eavesdroppers from being able to view and alter log data on the way to the logging server. Runtime compromise of a PaaS application by adversaries could be monitored by push-only mechanisms for log data, presupposing that the information needed to detect such an attack is logged. Increasingly, CSP offering PaaS solutions give developers the ability to collect and store a variety of diagnostics data in a highly configurable way with the help of runtime feature sets [38].

3) IaaS Environments: As expected, even virtual instances in the cloud get compromised by adversaries. 
Hence, the ability to determine how defenses in the virtual environment failed and to what extent the affected systems have been compromised is crucial not only for recovering from an incident. Forensic investigations also gain leverage from such information and contribute to resilience against future attacks on the systems. From the forensic point of view, IaaS instances provide much more evidence data usable for potential forensics than PaaS and SaaS models do. This is due to the ability of the customer to install and set up the image for forensic purposes before an incident occurs. Hence, as proposed for PaaS environments, log data and other forensic evidence information could be signed and encrypted before it is transferred to third-party hosts, mitigating the chance that a maliciously motivated shutdown process destroys the volatile data. Although IaaS environments provide plenty of potential evidence, it has to be emphasized that the customer VM is in the end still under the control of the CSP. The CSP controls the hypervisor, which is responsible, for example, for enforcing hardware boundaries and routing hardware requests among different VMs. Hence, besides the security responsibilities of the hypervisor, the CSP exerts tremendous control over how the customer's VMs communicate with the hardware and can theoretically intervene in processes executed on the hosted virtual instance through virtual introspection [25]. This could also affect encryption or signing processes executed on the VM and therefore lead to the leakage of the secret key. Although this risk can be disregarded in most cases, the impact on the security of high-security environments is tremendous.

a) Snapshot Analysis: Traditional forensics expects target machines to be powered down to collect an image (dead virtual instance). 
This situation completely changed with the advent of the snapshot technology, which is supported by all popular hypervisors such as Xen, VMware ESX and Hyper-V. A snapshot, also referred to as the forensic image of a VM, provides a powerful tool with which a virtual instance can be cloned with one click, including the running system's memory. Due to the invention of the snapshot technology, systems hosting crucial business processes do not have to be powered down for forensic investigation purposes. The investigator simply creates and loads a snapshot of the target VM for analysis (live virtual instance). This behavior is especially important for scenarios in which a downtime of a system is not feasible or practical due to existing SLA. However, the information whether the machine is running or has been properly powered down is crucial [3] for the investigation. Live investigations of running virtual instances become more common providing evidence data that ...
Foreign literature translation with Chinese-English parallel text (the document contains the English original and the Chinese translation)

[Abstract] In a networked environment, the joint construction and sharing of library information resources means that libraries of every type and level, guided by the information needs of users and of society, use computers, communications, electronics, multimedia and other advanced information technologies over the network to carry out, to a high standard, the comprehensive and cooperative development and use of their collection resources and of network resources. The rapid development of the market economy, the continual renewal of networking and the arrival of the information age have determined that the future trend of library development will be the joint construction and sharing of information resources; there is already a broad social consensus on this point. This is because the joint construction and sharing of information resources is an important way for libraries to resolve the contradiction between the explosion of knowledge and information and the insufficient strength of their own collections.

[Key Words] network; libraries; information; construction

In a networked environment, the joint construction and sharing of library information resources means that libraries of every type and level, guided by the information needs of users and of society, use computers, communications, electronics, multimedia and other advanced information technologies over the network to carry out, to a high standard, the comprehensive and cooperative development and use of their collection resources and of network resources.

1. The joint construction and sharing of information resources is the path that future libraries must take in developing and using information resources.

The rapid development of the market economy, the continual renewal of networking and the arrival of the information age have determined that the future trend of library development will be the joint construction and sharing of information resources; there is already a broad social consensus on this point. This is because:
Big Data Technology Major: English

English: Big Data Technology is a multidisciplinary field that encompasses various aspects of data collection, storage, processing, analysis, and visualization. In this specialized field, professionals use advanced tools and techniques to handle vast amounts of data generated from diverse sources such as social media, sensors, mobile devices, and enterprise systems. They employ technologies like Hadoop, Spark, NoSQL databases, and machine learning algorithms to extract valuable insights, identify patterns, and make data-driven decisions. Proficiency in programming languages like Python, R, Java, and Scala is crucial for implementing algorithms and building scalable data processing systems. Additionally, expertise in data warehousing, data modeling, and data governance is essential for ensuring the quality and integrity of data throughout its lifecycle. Moreover, strong analytical skills and domain knowledge are indispensable for interpreting results and deriving actionable recommendations from complex datasets. As the volume and complexity of data continue to grow, the demand for skilled professionals in Big Data Technology is expected to rise, offering lucrative career opportunities in various industries such as finance, healthcare, retail, and telecommunications.
Graduation Project Report: English Literature and Chinese Translation
Student name:    Student ID:    School: Computer and Control Engineering    Major:    Supervisor:
June 2017

English literature

Cloud Computing

1. Cloud Computing at a Higher Level

In many ways, cloud computing is simply a metaphor for the Internet, the increasing movement of compute and data resources onto the Web. But there's a difference: cloud computing represents a new tipping point for the value of network computing. It delivers higher efficiency, massive scalability, and faster, easier software development. It's about new programming models, new IT infrastructure, and the enabling of new business models.

For those developers and enterprises who want to embrace cloud computing, Sun is developing critical technologies to deliver enterprise scale and systemic qualities to this new paradigm:

(1) Interoperability: while most current clouds offer closed platforms and vendor lock-in, developers clamor for interoperability.
Application of Big Data Technology in Enterprise Management (English-Chinese bilingual edition)

With the advent of the digital age, big data technology has become an indispensable part of enterprise management. It can help companies better understand the market and customer needs, make better-founded decisions, and become more competitive. This article discusses three aspects: the application of big data technology in enterprise management, its advantages, and its development trends.

I. Application of big data technology in enterprise management

1. Market analysis
Enterprises can use big data technology to conduct market analysis. By collecting and analyzing massive amounts of data, companies can understand market needs and trends, formulate more accurate marketing strategies, and improve sales efficiency. For example, by analyzing social media data, companies can understand user preferences and needs and use them to guide product development and marketing.

2. Customer management
Big data technology can help companies manage customer relationships better. By analyzing customer data, enterprises can understand customer needs and preferences, design more personalized services, and improve customer satisfaction. For example, by analyzing customer behavior and consumption data, companies can provide personalized recommendations that increase customer loyalty.

3. Operations management
Big data technology can help enterprises manage operations better. By analyzing its internal data, an enterprise can understand the state of production, procurement, sales and other links, and can find problems and make adjustments in time. For example, by analyzing supply chain data, enterprises can optimize the supply chain structure, improve logistics efficiency and reduce costs.

II. The advantages of big data technology in enterprise management

1. High precision
Big data technology can analyze massive amounts of data and discover regularities and trends that are difficult for humans to detect, thereby improving the accuracy of decision-making.

2. Strong real-time performance
Big data technology can process and analyze data in real time, so enterprises can keep abreast of market and customer changes and respond more quickly.

3. Cost-effectiveness
Compared with traditional research and analysis methods, big data technology costs less, which can save enterprises R&D and marketing costs.

4. Strong predictive ability
Big data technology can predict future trends and changes by analyzing historical data, and so provide better-founded decision support.

III. The development trend of big data technology in enterprise management

1. Intelligence
With the development of artificial intelligence, big data technology will become more and more intelligent. In the future, it will be able to understand data and language better through technologies such as machine learning and natural language processing, and to support more accurate analysis and decision-making.

2. Security
The application of big data technology also brings security risks, such as data leakage and privacy issues. In the future, big data technology will pay more attention to data security and protect enterprise data through encryption and access control.

3. Diversification
Big data technology can be applied not only in enterprise management but also in many fields such as medical care, finance, and education. In the future, it will become more diversified, providing more accurate decision support for all walks of life.

Summary:
The application of big data technology in enterprise management is becoming more and more extensive, and its advantages are becoming more and more obvious. Big data technology can help companies better understand the market and customer needs, make better-founded decisions, and become more competitive. In the future, big data technology will become more intelligent, secure and diversified, bringing more opportunities and challenges to enterprises and other industries.
Cloud computing foreign-literature translation reference (the document contains Chinese-English parallel text, i.e. the English original and the Chinese translation)

Original text:

Technical Issues of Forensic Investigations in Cloud Computing Environments

Dominik Birk
Ruhr-University Bochum
Horst Goertz Institute for IT Security
Bochum, Germany

Abstract: Cloud Computing is arguably one of the most discussed information technologies today. It presents many promising technological and economical opportunities. However, many customers remain reluctant to move their business IT infrastructure completely to the cloud. One of their main concerns is Cloud Security and the threat of the unknown. Cloud Service Providers (CSPs) encourage this perception by not letting their customers see what is behind their virtual curtain. A seldom discussed, but in this regard highly relevant, open issue is the ability to perform digital investigations. This continues to fuel insecurity on the sides of both providers and customers. Cloud Forensics constitutes a new and disruptive challenge for investigators. Due to the decentralized nature of data processing in the cloud, traditional approaches to evidence collection and recovery are no longer practical. This paper focuses on the technical aspects of digital forensics in distributed cloud environments. We contribute by assessing whether it is possible for the customer of cloud computing services to perform a traditional digital investigation from a technical point of view. Furthermore, we discuss possible solutions and possible new methodologies helping customers to perform such investigations.

I. INTRODUCTION

Although the cloud might appear attractive to small as well as to large companies, it does not come without its own unique problems. Outsourcing sensitive corporate data into the cloud raises concerns regarding the privacy and security of data. Security policies, companies' main pillar concerning security, cannot be easily deployed into distributed, virtualized cloud environments.
This situation is further complicated by the unknown physical location of the company's assets. Normally, if a security incident occurs, the corporate security team wants to be able to perform its own investigation without dependency on third parties. In the cloud, this is no longer possible: the CSP obtains all the power over the environment and thus controls the sources of evidence. In the best case, a trusted third party acts as a trustee and vouches for the trustworthiness of the CSP. Furthermore, the implementation of the technical architecture and the circumstances within cloud computing environments bias the way an investigation may be processed. In detail, evidence data has to be interpreted by an investigator in a proper manner, which is hardly possible due to the lack of circumstantial information. (Acknowledgment: We would like to thank the reviewers for the helpful comments and Dennis Heinson (Center for Advanced Security Research Darmstadt - CASED) for the profound discussions regarding the legal aspects of cloud forensics.) For auditors, this situation does not change: questions about who accessed specific data and information cannot be answered by the customers if no corresponding logs are available. With the increasing demand for using the power of the cloud to process sensitive information and data as well, enterprises face the issue of Data and Process Provenance in the cloud [10]. Digital provenance, meaning meta-data that describes the ancestry or history of a digital object, is a crucial feature for forensic investigations. In combination with a suitable authentication scheme, it provides information about who created and who modified what kind of data in the cloud. These are crucial aspects for digital investigations in distributed environments such as the cloud. Unfortunately, the aspects of forensic investigations in distributed environments have so far been mostly neglected by the research community.
Current discussion centers mostly around security, privacy and data protection issues [35], [9], [12]. The impact of forensic investigations on cloud environments was little noticed, albeit mentioned by the authors of [1] in 2009: "[...] to our knowledge, no research has been published on how cloud computing environments affect digital artifacts, and on acquisition logistics and legal issues related to cloud computing environments." This statement is also confirmed by other authors [34], [36], [40], who stress that further research on incident handling, evidence tracking and accountability in cloud environments has to be done. At the same time, massive investments are being made in cloud technology. Combined with the fact that information technology increasingly permeates people's private and professional lives, thus mirroring more and more of people's actions, it becomes apparent that evidence gathered from cloud environments will be of high significance to litigation or criminal proceedings in the future. Within this work, we focus on the notion of cloud forensics by addressing the technical issues of forensics in all three major cloud service models and consider cross-disciplinary aspects. Moreover, we address the usability of various sources of evidence for investigative purposes and propose potential solutions to the issues from a practical standpoint. This work should be considered a surveying discussion of an almost unexplored research area. The paper is organized as follows: We discuss the related work and the fundamental technical background information of digital forensics, cloud computing and the fault model in sections II and III. In section IV, we focus on the technical issues of cloud forensics and discuss the potential sources and nature of digital evidence as well as investigations in XaaS environments, including the cross-disciplinary aspects. We conclude in section V.

II. RELATED WORK

Various works have been published in the field of cloud security and privacy [9], [35], [30], focusing on aspects of protecting data in multi-tenant, virtualized environments. Desired security characteristics for current cloud infrastructures mainly revolve around isolation of multi-tenant platforms [12], security of hypervisors in order to protect virtualized guest systems, and secure network infrastructures [32]. Although digital provenance, describing the ancestry of digital objects, still remains a challenging issue for cloud environments, several works have already been published in this field [8], [10], contributing to the issues of cloud forensics. Within this context, cryptographic proofs for verifying data integrity, mainly in cloud storage offers, have been proposed, yet practical implementations are lacking [24], [37], [23]. Traditional computer forensics already has well-researched methods for various fields of application [4], [5], [6], [11], [13]. The aspects of forensics in virtual systems have also been addressed by several works [2], [3], [20], including the notion of virtual introspection [25]. In addition, the NIST already addressed Web Service Forensics [22], which has a huge impact on investigation processes in cloud computing environments. In contrast, the aspects of forensic investigations in cloud environments have mostly been neglected by both the industry and the research community. One of the first papers focusing on this topic was published by Wolthusen [40] after Bebee et al. had already introduced problems within cloud environments [1]. Wolthusen stressed that there is an inherent strong need for interdisciplinary work linking the requirements and concepts of evidence arising from the legal field to what can be feasibly reconstructed and inferred algorithmically or in an exploratory manner.
In 2010, Grobauer et al. [36] published a paper discussing the issues of incident response in cloud environments. Unfortunately, no specific issues and solutions of cloud forensics were proposed there, which will be done within this work.

III. TECHNICAL BACKGROUND

A. Traditional Digital Forensics

The notion of Digital Forensics is widely known as the practice of identifying, extracting and considering evidence from digital media. Unfortunately, digital evidence is both fragile and volatile and therefore requires the attention of special personnel and methods in order to ensure that evidence data can be properly isolated and evaluated. Normally, the process of a digital investigation can be separated into three different steps, each having its own specific purpose:

1) In the Securing Phase, the major intention is the preservation of evidence for analysis. The data has to be collected in a manner that maximizes its integrity. This is normally done by a bitwise copy of the original media. As can be imagined, this represents a huge problem in the field of cloud computing, where you never know exactly where your data is and additionally do not have access to any physical hardware. However, the snapshot technology, discussed in section IV-B3, provides a powerful tool to freeze system states and thus makes digital investigations, at least in IaaS scenarios, theoretically possible.

2) We refer to the Analyzing Phase as the stage in which the data is sifted and combined. It is in this phase that the data from multiple systems or sources is pulled together to create as complete a picture and event reconstruction as possible. Especially in distributed system infrastructures, this means that bits and pieces of data are pulled together for deciphering the real story of what happened and for providing a deeper look into the data.

3) Finally, at the end of the examination and analysis of the data, the results of the previous phases will be reprocessed in the Presentation Phase.
The report created in this phase is a compilation of all the documentation and evidence from the analysis stage. The main intention of such a report is that it contains all results and is complete and clear to understand. Clearly, the success of these three steps strongly depends on the first stage. If it is not possible to secure the complete set of evidence data, no exhaustive analysis will be possible. However, in real-world scenarios often only a subset of the evidence data can be secured by the investigator. In addition, an important definition in the general context of forensics is the notion of a Chain of Custody. This chain clarifies how and where evidence is stored and who takes possession of it. Especially for cases which are brought to court, it is crucial that the chain of custody is preserved.

B. Cloud Computing

According to the NIST [16], cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal CSP interaction. This new definition of cloud computing brought several new characteristics such as multi-tenancy, elasticity, pay-as-you-go and reliability. Within this work, the following three models are used: In the Infrastructure as a Service (IaaS) model, the customer uses the virtual machine provided by the CSP to install his own system on it. The system can be used like any other physical computer, with a few limitations. However, the additional customer power over the system comes along with additional security obligations. Platform as a Service (PaaS) offerings provide the capability to deploy application packages created using the virtual development environment supported by the CSP. This service model can considerably improve the efficiency of the software development process.
In the Software as a Service (SaaS) model, the customer makes use of a service run by the CSP on a cloud infrastructure. In most cases this service can be accessed through an API for a thin client interface such as a web browser. Closed-source public SaaS offers such as Amazon S3 and GoogleMail can only be used in the public deployment model, leading to further issues concerning security, privacy and the gathering of suitable evidence. Furthermore, two main deployment models, private and public cloud, have to be distinguished. Common public clouds are made available to the general public. The corresponding infrastructure is owned by one organization acting as a CSP and offering services to its customers. In contrast, the private cloud is exclusively operated for an organization but may not provide the scalability and agility of public offers. The additional notions of community and hybrid cloud are not exclusively covered within this work. However, independently of the specific model used, the movement of applications and data to the cloud comes along with limited control for the customer over the application itself, the data pushed into the applications and also the underlying technical infrastructure.

C. Fault Model

Be it an account for a SaaS application, a development environment (PaaS) or a virtual image of an IaaS environment, systems in the cloud can be affected by inconsistencies. Hence, for both customer and CSP it is crucial to have the ability to assign faults to the causing party, even in the presence of Byzantine behavior [33]. Generally, inconsistencies can be caused by the following two reasons:

1) Maliciously Intended Faults

Internal or external adversaries with specific malicious intentions can cause faults on cloud instances or applications. Economic rivals as well as former employees can be the reason for these faults and pose a constant threat to customers and CSPs.
This model also includes a malicious CSP, albeit this is assumed to be rare in real-world scenarios. Additionally, from the technical point of view, the movement of computing power to a virtualized, multi-tenant environment can pose further threats and risks to the systems. One reason for this is that if a single system or service in the cloud is compromised, all other guest systems and even the host system are at risk. Hence, besides the need for further security measures, precautions for potential forensic investigations have to be taken into consideration.

2) Unintentional Faults

Inconsistencies in technical systems or processes in the cloud do not necessarily have to be caused by malicious intent. Internal communication errors or human failures can lead to issues in the services offered to the customer (e.g. loss or modification of data). Although these failures are not caused intentionally, both the CSP and the customer have a strong interest in discovering the reasons and deploying corresponding fixes.

IV. TECHNICAL ISSUES

Digital investigations are about control of forensic evidence data. From the technical standpoint, this data can be available in three different states: at rest, in motion or in execution. Data at rest is represented by allocated disk space. Whether the data is stored in a database or in a specific file format, it allocates disk space. Furthermore, if a file is deleted, the disk space is de-allocated for the operating system, but the data is still accessible as long as the disk space has not been re-allocated and overwritten. This fact is often exploited by investigators, who explore this de-allocated disk space on hard disks. When data is in motion, it is transferred from one entity to another; for example, a typical file transfer over a network can be seen as a data-in-motion scenario. Several encapsulated protocols contain the data, each leaving specific traces on systems and network devices which can in turn be used by investigators.
Data can be loaded into memory and executed as a process. In this case, the data is neither at rest nor in motion but in execution. On the executing system, process information, machine instructions and allocated/de-allocated data can be analyzed by creating a snapshot of the current system state. In the following sections, we point out the potential sources for evidential data in cloud environments, discuss the technical issues of digital investigations in XaaS environments and suggest several solutions to these problems.

A. Sources and Nature of Evidence

Concerning the technical aspects of forensic investigations, the amount of potential evidence available to the investigator diverges strongly between the different cloud service and deployment models. The virtual machine (VM), which in most cases hosts the server application, provides several pieces of information that could be used by investigators. On the network level, network components can provide information about possible communication channels between the different parties involved. The browser on the client, often acting as the user agent for communicating with the cloud, also contains a lot of information that could be used as evidence in a forensic investigation. Independently of the model used, the following three components could act as sources for potential evidential data.

1) Virtual Cloud Instance: The VM within the cloud, where, for example, data is stored or processes are handled, contains potential evidence [2], [3]. In most cases, it is the place where an incident happened and hence provides a good starting point for a forensic investigation. The VM instance can be accessed by both the CSP and the customer who is running the instance. Furthermore, virtual introspection techniques [25] provide access to the runtime state of the VM via the hypervisor, and snapshot technology supplies a powerful technique for the customer to freeze specific states of the VM.
Therefore, virtual instances can still be running during analysis, which leads to the case of live investigations [41], or can be turned off, leading to static image analysis. In SaaS and PaaS scenarios, the ability to access the virtual instance for gathering evidential information is highly limited or simply not possible.

2) Network Layer: Traditional network forensics is known as the analysis of network traffic logs for tracing events that have occurred in the past. Since the different ISO/OSI network layers provide a range of information on protocols and communication between instances within the cloud as well as with instances outside it [4], [5], [6], network forensics is theoretically also feasible in cloud environments. In practice, however, ordinary CSPs currently do not provide any log data from the network components used by the customer's instances or applications. For instance, in case of a malware infection of an IaaS VM, it will be difficult for the investigator to get any form of routing information or network log data in general, which is crucial for further investigative steps. This situation gets even more complicated in the case of PaaS or SaaS. So again, the gathering of forensic evidence is strongly affected by the support the investigator receives from the customer and the CSP.

3) Client System: On the system layer of the client, it depends completely on the model used (IaaS, PaaS, SaaS) whether and where potential evidence could be extracted. In most scenarios, the user agent (e.g. the web browser) on the client system is the only application that communicates with the service in the cloud. This especially holds for SaaS applications, which are used and controlled through the web browser. But in IaaS scenarios, too, the administration interface is often controlled via the browser.
Hence, in an exhaustive forensic investigation, the evidence data gathered from the browser environment [7] should not be omitted.

a) Browser Forensics: Generally, the circumstances leading to an investigation have to be differentiated: In ordinary scenarios, the main goal of an investigation of the web browser is to determine if a user has been the victim of a crime. In complex SaaS scenarios with high client-server interaction, this constitutes a difficult task. Additionally, customers make heavy use of third-party extensions [17] which can be abused for malicious purposes. Hence, the investigator might want to look for malicious extensions, searches performed, websites visited, files downloaded, information entered in forms or stored in local HTML5 stores, web-based email contents and persistent browser cookies for gathering potential evidence data. Within this context, it is essential to investigate the appearance of malicious JavaScript [18] leading to, for example, unintended AJAX requests and hence modified usage of administration interfaces. Generally, the web browser contains a lot of electronic evidence data that could be used to give an answer to both of the above questions, even if the private mode is switched on [19].

B. Investigations in XaaS Environments

Traditional digital forensic methodologies permit investigators to seize equipment and perform detailed analysis on the media and data recovered [11]. In a distributed infrastructure organization like the cloud computing environment, investigators are confronted with an entirely different situation. They no longer have the option of seizing physical data storage. Data and processes of the customer are dispersed over an undisclosed number of virtual instances, applications and network elements. Hence, it is in question whether established findings of the computer forensic community in the field of digital forensics have to be revised and adapted to the new environment.
Within this section, specific issues of investigations in SaaS, PaaS and IaaS environments will be discussed. In addition, cross-disciplinary issues which affect several environments uniformly will be taken into consideration. We also suggest potential solutions to the mentioned problems.

1) SaaS Environments: Especially in the SaaS model, the customer does not obtain any control of the underlying operating infrastructure such as network, servers, operating systems or the application that is used. This means that no deeper view into the system and its underlying infrastructure is provided to the customer. Only limited user-specific application configuration settings can be controlled, contributing to the evidence which can be extracted from the client (see section IV-A3). In a lot of cases this forces the investigator to rely on high-level logs which may be provided by the CSP. If the CSP does not run any logging application, the customer has no opportunity to create any useful evidence through the installation of any toolkit or logging tool. These circumstances do not allow a valid forensic investigation and lead to the assumption that customers of SaaS offers do not have any chance to analyze potential incidents.

a) Data Provenance: The notion of Digital Provenance is known as meta-data that describes the ancestry or history of digital objects. Secure provenance that records ownership and process history of data objects is vital to the success of data forensics in cloud environments, yet it is still a challenging issue today [8]. Although data provenance is also of high significance for IaaS and PaaS, it poses a huge problem specifically for SaaS-based applications: Current globally acting public SaaS CSPs offer Single Sign-On (SSO) access control to the set of their services.
Unfortunately, in case of an account compromise, most CSPs do not offer any possibility for the customer to figure out which data and information has been accessed by the adversary. For the victim, this situation can have a tremendous impact: If sensitive data has been compromised, it is unclear which data has been leaked and which has not been accessed by the adversary. Additionally, data could be modified or deleted by an external adversary or even by the CSP, for example for storage reasons. The customer has no ability to prove otherwise. Secure provenance mechanisms for distributed environments can improve this situation but have not been practically implemented by CSPs [10].

Suggested Solution: In private SaaS scenarios this situation is improved by the fact that the customer and the CSP are probably under the same authority. Hence, logging and provenance mechanisms could be implemented which contribute to potential investigations. Additionally, the exact location of the servers and the data is known at any time. Public SaaS CSPs should offer additional interfaces for the purpose of compliance, forensics, operations and security matters to their customers. Through an API, the customers should have the ability to receive specific information such as access, error and event logs that could improve their situation in case of an investigation. Furthermore, due to the limited ability to receive forensic information from the server and to prove the integrity of stored data in SaaS scenarios, the client has to contribute to this process. This could be achieved by implementing Proofs of Retrievability (POR), in which a verifier (client) is enabled to determine that a prover (server) possesses a file or data object and that it can be retrieved unmodified [24]. Provable Data Possession (PDP) techniques [37] could be used to verify that an untrusted server possesses the original data without the need for the client to retrieve it.
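The challenge-response idea behind such client-side integrity checks can be illustrated with a deliberately simplified spot check. This is not the POR construction of [24] nor the PDP scheme of [37]: unlike PDP, the server here returns whole blocks. It is a minimal Python sketch in which the client keeps one keyed tag per block and later asks the server for randomly chosen blocks. The block size, the class `UntrustedStore` and all function names are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 4096  # hypothetical block size chosen for the sketch


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the data object into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def tag_block(key: bytes, index: int, block: bytes) -> bytes:
    """Keyed tag bound to the block position, so blocks cannot be swapped."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def make_tags(key: bytes, data: bytes) -> list[bytes]:
    """Computed by the client before upload; the tags stay with the client."""
    return [tag_block(key, i, block) for i, block in enumerate(split_blocks(data))]


class UntrustedStore:
    """Stand-in for the CSP: holds the blocks and answers challenges."""

    def __init__(self, data: bytes) -> None:
        self.blocks = split_blocks(data)

    def respond(self, indices: list[int]) -> dict[int, bytes]:
        return {i: self.blocks[i] for i in indices}


def spot_check(key: bytes, tags: list[bytes], store: UntrustedStore, samples: int = 3) -> bool:
    """Challenge randomly chosen blocks and compare them with the stored tags."""
    rng = secrets.SystemRandom()
    indices = rng.sample(range(len(tags)), min(samples, len(tags)))
    response = store.respond(indices)
    return all(
        hmac.compare_digest(tags[i], tag_block(key, i, response[i]))
        for i in indices
    )
```

Sampling only a few blocks keeps each check cheap, at the price of detecting a small modification only with some probability per round. The published schemes add further machinery, which this sketch omits, to strengthen that guarantee.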
Although these cryptographic proofs have not been implemented by any CSP, the authors of [23] introduced a new data integrity verification mechanism for SaaS scenarios which could also be used for forensic purposes.

2) PaaS Environments: One of the main advantages of the PaaS model is that the developed software application is under the control of the customer and, except for some CSP, the source code of the application does not have to leave the local development environment. Given these circumstances, the customer theoretically obtains the power to dictate how the application interacts with other dependencies such as databases, storage entities etc. CSP normally claim this transfer is encrypted but this statement can hardly be verified by the customer. Since the customer has the ability to interact with the platform over a prepared API, system states and specific application logs can be extracted. However, potential adversaries who compromise the application during runtime should not be able to alter these log files afterwards.

Suggested Solution: Depending on the runtime environment, logging mechanisms could be implemented which automatically sign and encrypt the log information before its transfer to a central logging server under the control of the customer. Additional signing and encrypting could prevent potential eavesdroppers from being able to view and alter log data on the way to the logging server. Runtime compromise of a PaaS application by adversaries could be monitored by push-only mechanisms for log data, presupposing that the information needed to detect such an attack is logged. Increasingly, CSP offering PaaS solutions give developers the ability to collect and store a variety of diagnostics data in a highly configurable way with the help of runtime feature sets [38].

3) IaaS Environments: As expected, even virtual instances in the cloud get compromised by adversaries.
Hence, the ability to determine how defenses in the virtual environment failed and to what extent the affected systems have been compromised is crucial, and not only for recovering from an incident. Forensic investigations also gain leverage from such information and contribute to resilience against future attacks on the systems. From the forensic point of view, IaaS instances provide much more evidence data usable for potential forensics than PaaS and SaaS models do. This is due to the ability of the customer to install and set up the image for forensic purposes before an incident occurs. Hence, as proposed for PaaS environments, log data and other forensic evidence could be signed and encrypted before it is transferred to third-party hosts, mitigating the chance that a maliciously motivated shutdown process destroys the volatile data. Although IaaS environments provide plenty of potential evidence, it has to be emphasized that the customer VM is in the end still under the control of the CSP. The CSP controls the hypervisor, which is e.g. responsible for enforcing hardware boundaries and routing hardware requests among different VM. Hence, besides the security responsibilities of the hypervisor, the CSP exerts tremendous control over how the customer's VM communicate with the hardware and can theoretically intervene in processes executed on the hosted virtual instance through virtual introspection [25]. This could also affect encryption or signing processes executed on the VM and therefore lead to the leakage of the secret key. Although this risk can be disregarded in most cases, the impact on the security of high-security environments is tremendous.

a) Snapshot Analysis: Traditional forensics expect target machines to be powered down to collect an image (dead virtual instance).
This situation completely changed with the advent of the snapshot technology, which is supported by all popular hypervisors such as Xen, VMware ESX and Hyper-V. A snapshot, also referred to as the forensic image of a VM, provides a powerful tool with which a virtual instance can be cloned by one click, including the running system's memory. Due to the invention of the snapshot technology, systems hosting crucial business processes do not have to be powered down for forensic investigation purposes. The investigator simply creates and loads a snapshot of the target VM for analysis (live virtual instance). This behavior is especially important for scenarios in which a downtime of a system is not feasible or practical due to existing SLA. However, the information whether the machine is running or has been properly powered down is crucial [3] for the investigation. Live investigations of running virtual instances become more common, providing evidence data that ...
Computer Science and Technology: Foreign Literature Translation (English Literature, Chinese-English Parallel Text)
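Appendix 1: Translation of the Foreign-Language Material

Mass Storage

Because of the volatility and limited capacity of a computer's main memory, most computers have additional storage devices called mass storage systems, which include magnetic disks, CDs and magnetic tapes.
Compared with main memory, the advantages of mass storage systems are lower volatility, large capacity, low cost and, in many cases, the ability to remove the storage medium from the machine for archival purposes.

1. Magnetic Disks

Today, the most commonly used form of mass storage is the magnetic disk, in which thin spinning platters coated with magnetic material are used to hold data.
A disk storage system thus consists of many individual sectors, each of which can be accessed as an independent string of bits. The number of tracks per surface and the number of sectors per track vary from one disk system to another.
Sector sizes are generally no more than a few KB; sectors of 512 bytes or 1024 bytes are common.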
Cloud Computing and Big Data: Translated Foreign Literature (the document contains the English original and the Chinese translation)

Original text:

Meet Hadoop

In pioneer days they used oxen for heavy pulling, and when one ox couldn’t budge a log, they didn’t try to grow a larger ox. We shouldn’t be trying for bigger computers, but for more systems of computers.
—Grace Hopper

Data!

We live in the data age. It’s not easy to measure the total volume of data stored electronically, but an IDC estimate put the size of the “digital universe” at 0.18 zettabytes in 2006, and is forecasting a tenfold growth by 2011 to 1.8 zettabytes. A zettabyte is 10^21 bytes, or equivalently one thousand exabytes, one million petabytes, or one billion terabytes. That’s roughly the same order of magnitude as one disk drive for every person in the world.

This flood of data is coming from many sources. Consider the following:
• The New York Stock Exchange generates about one terabyte of new trade data per day.
• Facebook hosts approximately 10 billion photos, taking up one petabyte of storage.
• , the genealogy site, stores around 2.5 petabytes of data.
• The Internet Archive stores around 2 petabytes of data, and is growing at a rate of 20 terabytes per month.
• The Large Hadron Collider near Geneva, Switzerland, will produce about 15 petabytes of data per year.

So there’s a lot of data out there. But you are probably wondering how it affects you. Most of the data is locked up in the largest web properties (like search engines), or scientific or financial institutions, isn’t it? Does the advent of “Big Data,” as it is being called, affect smaller organizations or individuals?

I argue that it does. Take photos, for example. My wife’s grandfather was an avid photographer, and took photographs throughout his adult life. His entire corpus of medium format, slide, and 35mm film, when scanned in at high-resolution, occupies around 10 gigabytes. Compare this to the digital photos that my family took last year, which take up about 5 gigabytes of space.
My family is producing photographic data at 35 times the rate my wife’s grandfather did, and the rate is increasing every year as it becomes easier to take more and more photos.

More generally, the digital streams that individuals are producing are growing apace. Microsoft Research’s MyLifeBits project gives a glimpse of archiving of personal information that may become commonplace in the near future. MyLifeBits was an experiment where an individual’s interactions (phone calls, emails, documents) were captured electronically and stored for later access. The data gathered included a photo taken every minute, which resulted in an overall data volume of one gigabyte a month. When storage costs come down enough to make it feasible to store continuous audio and video, the data volume for a future MyLifeBits service will be many times that.

The trend is for every individual’s data footprint to grow, but perhaps more importantly the amount of data generated by machines will be even greater than that generated by people. Machine logs, RFID readers, sensor networks, vehicle GPS traces, retail transactions—all of these contribute to the growing mountain of data.

The volume of data being made publicly available increases every year too. Organizations no longer have to merely manage their own data: success in the future will be dictated to a large extent by their ability to extract value from other organizations’ data.

Initiatives such as Public Data Sets on Amazon Web Services, , and exist to foster the “information commons,” where data can be freely (or in the case of AWS, for a modest price) shared for anyone to download and analyze. Mashups between different information sources make for unexpected and hitherto unimaginable applications.

Take, for example, the project, which watches the Astrometry group on Flickr for new photos of the night sky.
It analyzes each image, and identifies which part of the sky it is from, and any interesting celestial bodies, such as stars or galaxies. Although it’s still a new and experimental service, it shows the kind of things that are possible when data (in this case, tagged photographic images) is made available and used for something (image analysis) that was not anticipated by the creator.

It has been said that “More data usually beats better algorithms,” which is to say that for some problems (such as recommending movies or music based on past preferences), however fiendish your algorithms are, they can often be beaten simply by having more data (and a less sophisticated algorithm).

The good news is that Big Data is here. The bad news is that we are struggling to store and analyze it.

Data Storage and Analysis

The problem is simple: while the storage capacities of hard drives have increased massively over the years, access speeds--the rate at which data can be read from drives--have not kept up. One typical drive from 1990 could store 1370 MB of data and had a transfer speed of 4.4 MB/s, so you could read all the data from a full drive in around five minutes. Almost 20 years later one terabyte drives are the norm, but the transfer speed is around 100 MB/s, so it takes more than two and a half hours to read all the data off the disk.

This is a long time to read all data on a single drive and writing is even slower. The obvious way to reduce the time is to read from multiple disks at once. Imagine if we had 100 drives, each holding one hundredth of the data. Working in parallel, we could read the data in under two minutes.

Only using one hundredth of a disk may seem wasteful. But we can store one hundred datasets, each of which is one terabyte, and provide shared access to them.
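The read-time figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 1,000,000 MB) and an even split of the data across drives; the function name `read_time_seconds` is hypothetical.

```python
def read_time_seconds(capacity_mb: float, transfer_mb_per_s: float, drives: int = 1) -> float:
    """Time to stream a dataset that is split evenly across `drives` disks read in parallel."""
    return capacity_mb / transfer_mb_per_s / drives


t_1990 = read_time_seconds(1370, 4.4)                        # typical 1990 drive
t_1tb = read_time_seconds(1_000_000, 100)                    # one 1 TB drive (assumed 10^6 MB)
t_parallel = read_time_seconds(1_000_000, 100, drives=100)   # same data on 100 drives

print(f"{t_1990 / 60:.1f} min")   # 5.2 min
print(f"{t_1tb / 3600:.2f} h")    # 2.78 h
print(f"{t_parallel:.0f} s")      # 100 s
```

The three results match the text: about five minutes for the 1990 drive, more than two and a half hours for a full one-terabyte drive, and under two minutes when 100 drives are read in parallel.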
We can imagine that the users of such a system would be happy to share access in return for shorter analysis times, and, statistically, that their analysis jobs would be likely to be spread over time, so they wouldn’t interfere with each other too much.

There’s more to being able to read and write data in parallel to or from multiple disks, though. The first problem to solve is hardware failure: as soon as you start using many pieces of hardware, the chance that one will fail is fairly high. A common way of avoiding data loss is through replication: redundant copies of the data are kept by the system so that in the event of failure, there is another copy available. This is how RAID works, for instance, although Hadoop’s filesystem, the Hadoop Distributed Filesystem (HDFS), takes a slightly different approach, as you shall see later.

The second problem is that most analysis tasks need to be able to combine the data in some way; data read from one disk may need to be combined with the data from any of the other 99 disks. Various distributed systems allow data to be combined from multiple sources, but doing this correctly is notoriously challenging. MapReduce provides a programming model that abstracts the problem from disk reads and writes, transforming it into a computation over sets of keys and values. We will look at the details of this model in later chapters, but the important point for the present discussion is that there are two parts to the computation, the map and the reduce, and it’s the interface between the two where the “mixing” occurs. Like HDFS, MapReduce has reliability built-in.

This, in a nutshell, is what Hadoop provides: a reliable shared storage and analysis system. The storage is provided by HDFS, and analysis by MapReduce. There are other parts to Hadoop, but these capabilities are its kernel.

Comparison with Other Systems

The approach taken by MapReduce may seem like a brute-force approach.
The premise is that the entire dataset—or at least a good portion of it—is processed for each query. But this is its power. MapReduce is a batch query processor, and the ability to run an ad hoc query against your whole dataset and get the results in a reasonable time is transformative. It changes the way you think about data, and unlocks data that was previously archived on tape or disk. It gives people the opportunity to innovate with data. Questions that took too long to get answered before can now be answered, which in turn leads to new questions and new insights.

For example, Mailtrust, Rackspace’s mail division, used Hadoop for processing email logs. One ad hoc query they wrote was to find the geographic distribution of their users. In their words: This data was so useful that we’ve scheduled the MapReduce job to run monthly and we will be using this data to help us decide which Rackspace data centers to place new mail servers in as we grow. By bringing several hundred gigabytes of data together and having the tools to analyze it, the Rackspace engineers were able to gain an understanding of the data that they otherwise would never have had, and, furthermore, they were able to use what they had learned to improve the service for their customers. You can read more about how Rackspace uses Hadoop in Chapter 14.

RDBMS

Why can’t we use databases with lots of disks to do large-scale batch analysis? Why is MapReduce needed? The answer to these questions comes from another trend in disk drives: seek time is improving more slowly than transfer rate. Seeking is the process of moving the disk’s head to a particular place on the disk to read or write data. It characterizes the latency of a disk operation, whereas the transfer rate corresponds to a disk’s bandwidth.

If the data access pattern is dominated by seeks, it will take longer to read or write large portions of the dataset than streaming through it, which operates at the transfer rate.
On the other hand, for updating a small proportion of records in a database, a traditional B-Tree (the data structure used in relational databases, which is limited by the rate it can perform seeks) works well. For updating the majority of a database, a B-Tree is less efficient than MapReduce, which uses Sort/Merge to rebuild the database.

In many ways, MapReduce can be seen as a complement to an RDBMS. (The differences between the two systems are shown in Table 1-1.) MapReduce is a good fit for problems that need to analyze the whole dataset, in a batch fashion, particularly for ad hoc analysis. An RDBMS is good for point queries or updates, where the dataset has been indexed to deliver low-latency retrieval and update times of a relatively small amount of data. MapReduce suits applications where the data is written once, and read many times, whereas a relational database is good for datasets that are continually updated.

Table 1-1. RDBMS compared to MapReduce

             Traditional RDBMS            MapReduce
Data size    Gigabytes                    Petabytes
Access       Interactive and batch        Batch
Updates      Read and write many times    Write once, read many times
Structure    Static schema                Dynamic schema
Integrity    High                         Low
Scaling      Nonlinear                    Linear

Another difference between MapReduce and an RDBMS is the amount of structure in the datasets that they operate on. Structured data is data that is organized into entities that have a defined format, such as XML documents or database tables that conform to a particular predefined schema. This is the realm of the RDBMS. Semi-structured data, on the other hand, is looser, and though there may be a schema, it is often ignored, so it may be used only as a guide to the structure of the data: for example, a spreadsheet, in which the structure is the grid of cells, although the cells themselves may hold any form of data. Unstructured data does not have any particular internal structure: for example, plain text or image data.
MapReduce works well on unstructured or semistructured data, since it is designed to interpret the data at processing time. In other words, the input keys and values for MapReduce are not an intrinsic property of the data, but they are chosen by the person analyzing the data.

Relational data is often normalized to retain its integrity, and remove redundancy. Normalization poses problems for MapReduce, since it makes reading a record a nonlocal operation, and one of the central assumptions that MapReduce makes is that it is possible to perform (high-speed) streaming reads and writes.

A web server log is a good example of a set of records that is not normalized (for example, the client hostnames are specified in full each time, even though the same client may appear many times), and this is one reason that logfiles of all kinds are particularly well-suited to analysis with MapReduce.

MapReduce is a linearly scalable programming model. The programmer writes two functions—a map function and a reduce function—each of which defines a mapping from one set of key-value pairs to another. These functions are oblivious to the size of the data or the cluster that they are operating on, so they can be used unchanged for a small dataset and for a massive one. More importantly, if you double the size of the input data, a job will run twice as slow. But if you also double the size of the cluster, a job will run as fast as the original one. This is not generally true of SQL queries.

Over time, however, the differences between relational databases and MapReduce systems are likely to blur.
This is happening both as relational databases start incorporating some of the ideas from MapReduce (such as Aster Data’s and Greenplum’s databases), and, from the other direction, as higher-level query languages built on MapReduce (such as Pig and Hive) make MapReduce systems more approachable to traditional database programmers.

Grid Computing

The High Performance Computing (HPC) and Grid Computing communities have been doing large-scale data processing for years, using such APIs as Message Passing Interface (MPI). Broadly, the approach in HPC is to distribute the work across a cluster of machines, which access a shared filesystem, hosted by a SAN. This works well for predominantly compute-intensive jobs, but becomes a problem when nodes need to access larger data volumes (hundreds of gigabytes, the point at which MapReduce really starts to shine), since the network bandwidth is the bottleneck, and compute nodes become idle.

MapReduce tries to colocate the data with the compute node, so data access is fast since it is local. This feature, known as data locality, is at the heart of MapReduce and is the reason for its good performance. Recognizing that network bandwidth is the most precious resource in a data center environment (it is easy to saturate network links by copying data around), MapReduce implementations go to great lengths to preserve it by explicitly modelling network topology. Notice that this arrangement does not preclude high-CPU analyses in MapReduce.

MPI gives great control to the programmer, but requires that he or she explicitly handle the mechanics of the data flow, exposed via low-level C routines and constructs, such as sockets, as well as the higher-level algorithm for the analysis. MapReduce operates only at the higher level: the programmer thinks in terms of functions of key and value pairs, and the data flow is implicit.

Coordinating the processes in a large-scale distributed computation is a challenge.
The hardest aspect is gracefully handling partial failure—when you don’t know if a remote process has failed or not—and still making progress with the overall computation. MapReduce spares the programmer from having to think about failure, since the implementation detects failed map or reduce tasks and reschedules replacements on machines that are healthy. MapReduce is able to do this since it is a shared-nothing architecture, meaning that tasks have no dependence on one another. (This is a slight oversimplification, since the output from mappers is fed to the reducers, but this is under the control of the MapReduce system; in this case, it needs to take more care rerunning a failed reducer than rerunning a failed map, since it has to make sure it can retrieve the necessary map outputs, and if not, regenerate them by running the relevant maps again.) So from the programmer’s point of view, the order in which the tasks run doesn’t matter. By contrast, MPI programs have to explicitly manage their own checkpointing and recovery, which gives more control to the programmer, but makes them more difficult to write.

MapReduce might sound like quite a restrictive programming model, and in a sense it is: you are limited to key and value types that are related in specified ways, and mappers and reducers run with very limited coordination between one another (the mappers pass keys and values to reducers). A natural question to ask is: can you do anything useful or nontrivial with it?

The answer is yes. MapReduce was invented by engineers at Google as a system for building production search indexes because they found themselves solving the same problem over and over again (and MapReduce was inspired by older ideas from the functional programming, distributed computing, and database communities), but it has since been used for many other applications in many other industries.
It is pleasantly surprising to see the range of algorithms that can be expressed in MapReduce, from image analysis, to graph-based problems, to machine learning algorithms. It can’t solve every problem, of course, but it is a general data-processing tool.

You can see a sample of some of the applications that Hadoop has been used for in Chapter 14.

Volunteer Computing

When people first hear about Hadoop and MapReduce, they often ask, “How is it different from SETI@home?” SETI, the Search for Extra-Terrestrial Intelligence, runs a project called SETI@home in which volunteers donate CPU time from their otherwise idle computers to analyze radio telescope data for signs of intelligent life outside earth. SETI@home is the most well-known of many volunteer computing projects; others include the Great Internet Mersenne Prime Search (to search for large prime numbers) and Folding@home (to understand protein folding, and how it relates to disease).

Volunteer computing projects work by breaking the problem they are trying to solve into chunks called work units, which are sent to computers around the world to be analyzed. For example, a SETI@home work unit is about 0.35 MB of radio telescope data, and takes hours or days to analyze on a typical home computer. When the analysis is completed, the results are sent back to the server, and the client gets another work unit. As a precaution to combat cheating, each work unit is sent to three different machines, and needs at least two results to agree to be accepted.

Although SETI@home may be superficially similar to MapReduce (breaking a problem into independent pieces to be worked on in parallel), there are some significant differences. The SETI@home problem is very CPU-intensive, which makes it suitable for running on hundreds of thousands of computers across the world, since the time to transfer the work unit is dwarfed by the time to run the computation on it.
Volunteers are donating CPU cycles, not bandwidth.

MapReduce is designed to run jobs that last minutes or hours on trusted, dedicated hardware running in a single data center with very high aggregate bandwidth interconnects. By contrast, SETI@home runs a perpetual computation on untrusted machines on the Internet with highly variable connection speeds and no data locality.
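The programming model described in this excerpt, a map function and a reduce function that each turn one set of key-value pairs into another, can be sketched in a few lines. This is an in-memory illustration only, not Hadoop's API: the names `map_fn`, `reduce_fn` and `run_job` and the log format are assumptions, and the grouping step stands in for the "mixing" that the framework performs between the two phases. The example counts requests per client hostname in a web server log, the kind of unnormalized data the text says suits MapReduce.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (client_host, 1) for every non-empty log line."""
    fields = line.split()
    if fields:
        yield fields[0], 1


def reduce_fn(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Reduce: sum all counts emitted for one client host."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, group values by key (the 'mixing' step), then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


# Hypothetical log lines: client hostname first, then the request.
log = [
    "host-a.example.com GET /index.html",
    "host-b.example.com GET /about.html",
    "host-a.example.com GET /contact.html",
]
print(run_job(log))  # {'host-a.example.com': 2, 'host-b.example.com': 1}
```

Neither function knows how large the input is or how many machines run it, which is the property the text relies on when it calls the model linearly scalable.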
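Cloud Computing: Foreign Literature and Translation

1. Introduction
Cloud computing is an Internet-based form of computing that provides a variety of services through shared computing resources.

2. Overview of the Foreign Literature
Authors: Antonio Fernández Anta, Chryssis Georgiou, Evangelos Kranakis
Year of publication: 2019
This paper mainly surveys the development and application of cloud computing.

3. Research Content
The study reviews the basic concepts of cloud computing and the related technologies.

4. Translation of the Paper
"Cloud Computing: A Survey" is a paper that gives a comprehensive introduction to cloud computing.

5. Conclusion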
Big Data Translation

With the rapid advancement of technology, the amount of data collected and generated is increasing exponentially. This immense volume of data is commonly referred to as "Big Data". Big Data refers to data sets that are too large and complex to be processed by traditional data processing systems.

In recent years, Big Data has become a hot topic in various industries as it has the potential to provide valuable insights and improve decision-making processes. Big Data is often characterized by the "3Vs": volume, velocity, and variety. Volume refers to the vast amount of data that is being produced every second. Velocity refers to the speed at which this data is being generated and needs to be processed. Lastly, variety refers to the different types and formats of data that are being collected, including structured data (such as numbers and dates) and unstructured data (such as text, images, and videos).

The analysis of Big Data requires advanced analytics techniques and tools such as data mining, machine learning, and predictive modeling. These techniques allow organizations to extract meaningful patterns and trends from the vast amount of data. Additionally, Big Data analytics can help identify hidden correlations and relationships that may not be apparent at first glance. By understanding these patterns, organizations can make data-driven decisions and gain a competitive advantage in their respective industries.

The impact of Big Data can be seen in various fields. In healthcare, Big Data analytics can be used to improve patient outcomes and personalize treatments. By analyzing patient records, genetic data, and other medical information, healthcare providers can identify risk factors, predict diseases, and recommend personalized treatment plans. In finance, Big Data analytics can be used to detect fraudulent activities and identify investment opportunities. By analyzing market trends, consumer behavior, and economic indicators, financial institutions can make informed decisions and mitigate risks.

However, the use of Big Data also raises concerns about privacy and security. With the collection of vast amounts of personal data, there is an increased risk of data breaches and unauthorized access. To address these concerns, organizations need to implement robust security measures and ensure compliance with data protection regulations.

In conclusion, Big Data has the potential to revolutionize various industries by providing valuable insights and improving decision-making processes. However, it also poses challenges in terms of data management, analysis, and security. Organizations that are able to effectively harness the power of Big Data will be better equipped to succeed in the data-driven era.
Translated Literature

Document 1 title: The Application of Data Analysis in Business Decision-making
Document 2 title: The Application of Machine Learning in Data Analysis
Document 3 title: The Application of Data Visualization in Data Analysis

Translated abstract: These papers study the application of data analysis in business decision-making, as well as the roles of machine learning and data visualization in data analysis.
English Literature and Translation (Computer Science)

The increasing complexity of design resources in a net-based collaborative XXX common systems. Design resources can be organized in n with design activities. A task is formed by a set of activities and resources linked by logical ns. XXX management of all design resources and activities via a Task Management System (TMS), which is designed to break down tasks and assign resources to task nodes. This XXX.

2 Task Management System (TMS)

TMS is a system designed to manage the tasks and resources involved in a design project. It poses tasks into smaller subtasks. XXX management of all design resources and activities. TMS assigns resources to task nodes.

3 Collaborative Design

Collaborative design is a process that XXX a common goal. In a net-based collaborative design environment, n XXX n for all design resources and activities.
Original Text of the Foreign Literature

THE TECHNIQUE DEVELOPMENT HISTORY OF JSP

Java Server Pages (JSP) is a web-based scripting technology that runs Java on the server, similar to Netscape's server-side JavaScript (SSJS) and Microsoft's Active Server Pages (ASP). Compared with SSJS and ASP, JSP is more extensible, and it is not tied to any one vendor or any particular Web server. Although the JSP specification was drawn up by Sun, any vendor can implement JSP on its own systems.

After Sun formally released JSP (Java Server Pages), this new Web application development technology quickly attracted attention. JSP provides a distinctive development environment for building highly dynamic Web applications. According to Sun, JSP can run on 85% of the server products on the market, including Apache WebServer and IIS 4.0.

This chapter introduces JSP, databases and JavaBeans. The introduction covers only the basics and is best regarded as a guide; readers who need more detailed information should consult the corresponding JSP books.

1.1 GENERALIZE

JSP (Java Server Pages) is a dynamic web page technology standard that was initiated by Sun Microsystems and established with the participation of many companies. It offers powerful and distinctive features for building dynamic web pages. JSP is very similar to Microsoft's ASP technology. Both provide the ability to mix program code into HTML and have that code executed by a language engine. A brief introduction follows.

JSP pages are translated into servlets.
So, fundamentally, any task JSP pages can perform could also be accomplished by servlets. However, this underlying equivalence does not mean that servlets and JSP pages are equally appropriate in all scenarios. The issue is not the power of the technology, it is the convenience, productivity, and maintainability of one or the other. After all, anything you can do on a particular computer platform in the Java programming language you could also do in assembly language. But it still matters which you choose.

JSP provides the following benefits over servlets alone:
• It is easier to write and maintain the HTML. Your static code is ordinary HTML: no extra backslashes, no double quotes, and no lurking Java syntax.
• You can use standard Web-site development tools. Even HTML tools that know nothing about JSP can be used because they simply ignore the JSP tags.
• You can divide up your development team. The Java programmers can work on the dynamic code. The Web developers can concentrate on the presentation layer. On large projects, this division is very important. Depending on the size of your team and the complexity of your project, you can enforce a weaker or stronger separation between the static HTML and the dynamic content.

Now, this discussion is not to say that you should stop using servlets and use only JSP instead. By no means. Almost all projects will use both. For some requests in your project, you will use servlets. For others, you will use JSP. For still others, you will combine them with the MVC architecture. You want the appropriate tool for the job, and servlets, by themselves, do not complete your toolkit.

1.2 SOURCE OF JSP

Sun's JSP technology lets Web page developers use HTML or XML markup to design and format the final page.
JSP tags or scriptlets generate the dynamic content on the page (content that changes according to the request).

Java Servlets are the technical foundation of JSP, and large Web applications need servlets and JSP working together. The name Servlet derives from Applet. Many translations of the term are in circulation; to avoid confusion, this book uses "Servlet" untranslated, though readers may think of it as a "small service program". A servlet plays much the same role as traditional Web development tools such as CGI, ISAPI, and NSAPI. With servlets, developers no longer need the inefficient CGI approach, nor an API that can generate dynamic Web pages only on one particular Web server platform. Many Web servers support servlets, and even servers without direct support can gain it through an add-on application server or module. Thanks to Java's cross-platform nature, servlets are platform independent too: a servlet that conforms to the Java Servlet specification is independent of both the platform and the Web server.
Because a servlet serves requests internally with threads, it does not need to start a process for each request, and multithreading lets it serve several requests at once, so servlets are very efficient.

Servlets are not without weaknesses, however. Like CGI, ISAPI, and NSAPI, a servlet produces a dynamic page by emitting HTML statements. If an entire site is developed with servlets, integrating the dynamic parts with the static pages is a nightmare. Sun released JSP to address this weakness.

A number of years ago, Marty was invited to attend a small 20-person industry roundtable discussion on software technology. Sitting in the seat next to Marty was James Gosling, inventor of the Java programming language. Sitting several seats away was a high-level manager from a very large software company in Redmond, Washington. During the discussion, the moderator brought up the subject of Jini, which at that time was a new Java technology. The moderator asked the manager what he thought of it, and the manager responded that it was too early to tell, but that it seemed to be an excellent idea. He went on to say that they would keep an eye on it, and if it seemed to be catching on, they would follow his company's usual "embrace and extend" strategy. At this point, Gosling lightheartedly interjected "You mean disgrace and distend."

Now, the grievance that Gosling was airing was that he felt that this company would take technology from other companies and suborn it for their own purposes. But guess what? The shoe is on the other foot here. The Java community did not invent the idea of designing pages as a mixture of static HTML and dynamic code marked with special tags. For example, Cold Fusion did it years earlier.
Even ASP (a product from the very software company of the aforementioned manager) popularized this approach before JSP came along and decided to jump on the bandwagon. In fact, JSP not only adopted the general idea, it even used many of the same special tags as ASP did.

JSP is a presentation-layer technology built on the Java servlet model; it makes writing HTML simpler. Like SSJS, it lets you mix static HTML content with server-side script to generate dynamic output. Java is JSP's default scripting language; however, just as ASP can use other languages (such as JavaScript and VBScript), the JSP specification also permits other languages.

1.3 JSP CHARACTERISTICS

If a scripting language is defined as a language that serves a particular subsystem, then JSP should be regarded as a scripting language. As a scripting language, though, JSP seems almost too powerful: nearly all of Java can be used in a JSP page.

As a text-based, presentation-centric development technology, JSP offers all the advantages of Java servlets and, when combined with JavaBeans, provides a simple way to separate content from presentation logic. The advantage of this separation is that the people who update a page's appearance need not know Java, and the people who update the JavaBeans need not be expert Web designers. JSP pages that use JavaBeans can define Web templates, so that a site is built from pages with a consistent look. The JavaBeans supply the data, so the templates contain no Java code, which means the templates can be maintained by HTML authors.
Of course, a servlet can also control the site's logic, calling JSP files from the servlet to separate the site's logic from its content.

Generally, in a real JSP engine a JSP page is compiled when it executes rather than interpreted. Interpreted dynamic-page tools such as ASP and PHP3 can no longer meet the needs of today's large e-commerce applications, for reasons including speed, and the traditional technologies are all moving toward compiled execution, for example ASP → ASP+ and PHP3 → PHP4.

The JSP specification does not strictly require that the program code in a JSP page (called a scriptlet) be written in Java. In fact, some JSP engines adopt other scripting languages such as ECMAScript, but these too are built on Java and compiled into servlets for execution. Under the specification, scriptlets unrelated to Java are permissible. However, because JSP's power lies mainly in its ability to work with JavaBeans and Enterprise JavaBeans, even when the scriptlet part is not written in Java, the compiled executable code should still be Java-related.

1.4 JSP MECHANISM

To understand how JSP combines the technical advantages described above, one must first understand the difference between "component-centric" and "page-centric" Web development.

SSJS and ASP were both released several years ago, when the Web was still very young and no one knew of a better solution than piling all the business, data, and presentation logic into the page itself.
This page-centric model is easy to learn and allows very fast development. Over time, however, people realized that the approach is unsuited to building large, scalable Web applications. Presentation logic written in the scripting environment is locked inside the page and can be reused only by cutting and pasting. Presentation logic is usually mixed with business and data logic, so when programmers try to change an application's appearance without breaking the business logic tied to it, maintenance becomes like walking on eggshells. In the enterprise, component reuse is already very mature, and no one wants to rewrite that logic for each application.

HTML and graphic designers handed the implementation of their designs to Web programmers, who then had to do double the work, usually coding by hand, because no suitable tool could combine server-side script with HTML content. In short, as Web applications grew more complex, the limitations of page-centric development became obvious.

Meanwhile, as people looked for better ways to build Web applications, components became popular in the client/server world. Vendors of rapid application development (RAD) tools promoted JavaBeans and ActiveX to Java and Windows developers as a way to build complex programs quickly.
These technologies let domain experts write components for vertical applications in their field, which developers can then use directly without mastering that field's expertise.

JSP emerged as a component-centric development platform. It is founded on a model in which JavaBeans and Enterprise JavaBeans (EJB) components contain the business and data logic, and it provides numerous tags and a scripting platform for displaying, in an HTML page, content created or returned by JavaBeans. Because it is component-centric, JSP can be used by Java developers and non-Java developers alike. Non-Java developers can use JSP tags to work with JavaBeans created by senior Java developers. Java developers can not only create and use JavaBeans but also use the Java language in the JSP page to control more precisely the presentation logic based on the underlying JavaBeans.

Now consider how JSP handles an HTTP request. In the basic request model, a request is sent directly to a JSP page. The JSP code controls the interaction with JavaBeans components during logic processing and displays the result in a dynamically generated HTML page mixed with static HTML code. The beans can be JavaBeans or EJB components. More complex request models can be viewed as the requested page calling other JSP pages or Java servlets.

The JSP engine actually converts the JSP tags, the Java code in the JSP page, and even the static HTML content into large blocks of Java code.
The engine organizes these code blocks into a Java servlet that the user never sees, and the servlet is then automatically compiled into Java bytecode.

Thus, when a site visitor requests a JSP page, an already generated, precompiled servlet does all the work without the visitor knowing, invisibly and efficiently. Because the servlet is compiled, the JSP code in the page does not need to be interpreted each time the page is requested. The JSP engine needs to compile the servlet only once after the code was last modified; after that, the compiled servlet is simply executed. Since the JSP engine generates and compiles the servlet automatically, with no need for programmers to compile the code by hand, JSP delivers both efficient performance and the flexibility that rapid development requires.

JSP has considerable advantages over traditional CGI. First, speed: a traditional CGI program must use the system's standard input and output devices to generate a dynamic page, whereas JSP is connected directly to the server. Moreover, CGI needs a new process for every visit, and constantly creating and destroying processes is a considerable burden on the machine acting as the Web server. Second, JSP was designed specifically for Web development; its purpose is to build Web-based applications, and it comes with a complete set of specifications and tools.
With JSP, many JSP pages can be combined very conveniently into one Web application.

Six JSP built-in objects

request object:
This object encapsulates the information submitted by the user. Calling the object's methods gives access to the encapsulated information, that is, the information the user submitted.

response object:
Responds dynamically to the client's request by sending data to the client.

session object
1. What is a session: The session object is a JSP built-in object. It is created automatically when the first JSP page is loaded, and it handles session management. The period from when a client opens a browser and connects to the server until the client closes the browser and leaves the server is called a session. While visiting a server, a client may move repeatedly between several pages or refresh one page repeatedly. The server needs some way of knowing that this is the same client, and that is what the session object is for.
2. The session ID: When a client first visits a JSP page on a server, the JSP engine creates a session object and assigns it an ID of type String. The engine also sends this ID to the client, where it is stored in a cookie, so that a one-to-one relationship is established between the session object and the client. When the client goes on to other pages on the server, no new session object is allocated. Only when the client closes the browser does the server discard that client's session object, and the relationship between session and client ends. When the client reopens the browser and connects again, the server creates a new session object for that client.

application object
1. What is the application object: The application object is created when the server starts. As a client browses among the site's pages, the application object remains the same one until the server shuts down.
Unlike the session object, however, the application object is the same for all clients; that is, all clients share this one built-in application object.
2. Commonly used methods of the application object:
(1) public void setAttribute(String key, Object obj): adds the object given by the parameter obj to the application object and assigns it the index keyword key.
(2) public Object getAttribute(String key): retrieves the object stored in the application object under the keyword key.

out object
The out object is an output stream used to send data to the client.

Cookie
1. What is a cookie: A cookie is a piece of text that a Web server stores on the user's hard drive. Cookies let a Web site store information on the user's computer and retrieve it later. For example, a site might generate a unique ID for each visitor and store it as a cookie file on each user's machine. If you browse the Web with IE, you can see all the cookies stored on your hard drive. They are most often stored in c:\windows\cookies (on Windows 2000, in C:\Documents and Settings\your user name\Cookies). Cookies are stored as records in the format "key = value".
2. Creating a Cookie object: Calling the Cookie constructor creates a cookie. The constructor takes two string parameters: the cookie's name and its value.
Cookie c = new Cookie("username", "john");
3. To send the prepared Cookie object from the JSP page to the client, use the response object's addCookie() method.
Format: response.addCookie(c)
4. To read the cookies saved on the client, use the request object's getCookies() method, which returns all the cookies sent by the client as an array of Cookie objects. To pick out the cookie you need, loop over the array and compare each object's name.

Chinese translation: The Development History of JSP Technology

Java Server Pages (JSP) is a web-based scripting technology, similar to Netscape's server-side JavaScript (SSJS) and Microsoft's Active Server Pages (ASP).
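The cookie steps described in the built-in objects section (create a cookie, send it, read the array back, loop over it comparing names), together with the way a session ID kept in a cookie lets the server recognize a returning client, can be sketched in plain Java. This is only an illustrative sketch under stated assumptions, not the servlet API: SimpleCookie, findCookieValue, resolveSessionId, the sessions table, and the cookie name SESSION_ID are all hypothetical stand-ins introduced here so that the example runs without a servlet container.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Plain-Java sketch of cookie lookup and session tracking; NOT the servlet API. */
public class CookieSessionSketch {

    /** Hypothetical stand-in for a cookie: one "key = value" record. */
    static final class SimpleCookie {
        final String name;
        final String value;

        SimpleCookie(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Loops over the cookie array and compares each name, as step 4 describes. */
    static String findCookieValue(SimpleCookie[] cookies, String name) {
        if (cookies == null) {
            return null; // the client sent no cookies at all
        }
        for (SimpleCookie c : cookies) {
            if (c.name.equals(name)) {
                return c.value;
            }
        }
        return null; // no cookie with that name
    }

    /** Hypothetical server-side table: session ID -> that client's attributes. */
    static final Map<String, Map<String, Object>> sessions = new HashMap<>();

    /** Reuses the session named by the ID cookie, or creates one with a new String ID. */
    static String resolveSessionId(SimpleCookie[] cookies) {
        String id = findCookieValue(cookies, "SESSION_ID"); // assumed cookie name
        if (id == null || !sessions.containsKey(id)) {
            id = UUID.randomUUID().toString();
            sessions.put(id, new HashMap<>());
        }
        return id;
    }

    public static void main(String[] args) {
        SimpleCookie[] sent = { new SimpleCookie("username", "john") };
        System.out.println(findCookieValue(sent, "username")); // prints john

        String first = resolveSessionId(sent); // first visit: a new session is created
        SimpleCookie[] next = { new SimpleCookie("SESSION_ID", first) };
        String second = resolveSessionId(next); // later page: the same session is found
        System.out.println(first.equals(second)); // prints true
    }
}
```

In a real JSP page the array would come from request.getCookies() and the container would manage session IDs itself; the loop-and-compare step is the part that carries over directly.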
Application and Development Trends of Cloud Computing Technology (English-Chinese bilingual version)

With the continuous development of information technology, cloud computing technology has become an indispensable part of enterprise information construction. Cloud computing technology can help enterprises realize a series of functions such as resource sharing, data storage and processing, application development and deployment. This article will discuss three aspects: the application of cloud computing technology, the advantages of cloud computing technology, and the development trend of cloud computing technology.

1. Application of Cloud Computing Technology

1. Resource sharing
Cloud computing technology can bring together different resources to realize resource sharing. Enterprises can use cloud computing technology to share resources such as servers, storage devices, and network devices, so as to maximize the utilization of resources.

2. Data storage and processing
Cloud computing technology can help enterprises store and process massive data. Through cloud computing technology, enterprises can store data in the cloud to realize remote access and backup of data. At the same time, cloud computing technology can also help enterprises analyze and process data and provide more accurate decision support.

3. Application development and deployment
Cloud computing technology can help enterprises develop and deploy applications faster and more conveniently. Through cloud computing technology, enterprises can deploy applications on the cloud to realize remote access and management of applications. At the same time, cloud computing technology can also provide a variety of development tools and development environments, which makes application development more convenient for enterprises.

2. Advantages of cloud computing technology

1. High flexibility
Cloud computing technology can flexibly adjust the usage and allocation of resources according to the needs of enterprises, so as to realize the optimal utilization of resources.
At the same time, cloud computing technology can also support elastic expansion and contraction, which is convenient for enterprises to cope with business peaks and valleys.

2. High security
Cloud computing technology can ensure the security of enterprise data through data encryption, identity authentication, access control and other means. At the same time, cloud computing technology can also provide a multi-level security protection system to prevent security risks such as hacker attacks and data leakage.

3. Cost-effective
Compared with the traditional IT construction model, the cost of cloud computing technology is lower. Through cloud computing technology, enterprises can avoid large-scale hardware investment and maintenance costs, and save enterprise R&D and operating expenses.

4. Convenient management
Cloud computing technology can help enterprises achieve unified resource management and monitoring. Through cloud computing technology, enterprises can centrally manage resources such as multiple servers, storage devices, and network devices, which is convenient for enterprises to carry out unified monitoring and management.

5. Strong scalability
Cloud computing technology can quickly increase or decrease the usage and configuration of resources according to the needs of enterprises, so as to realize the rapid expansion and contraction of business. At the same time, cloud computing technology can also provide a variety of expansion methods, such as horizontal expansion, vertical expansion, etc., to facilitate enterprises to expand their business on demand.

3. The development trend of cloud computing technology

1. The advent of the multi-cloud era
With the development of cloud computing technology, the multi-cloud era has arrived. Enterprises can choose different cloud platforms and deploy services on multiple clouds to achieve high availability and elastic expansion of services.

2. Combination of artificial intelligence and cloud computing
Artificial intelligence is one of the current hot technologies, and cloud computing technology can also provide better support for the development of artificial intelligence. Cloud computing technology can provide high-performance computing resources and storage resources, providing better conditions for the training and deployment of artificial intelligence.

3. The Rise of Edge Computing
Edge computing refers to the deployment of computing resources and storage resources at the edge of the network to provide faster and more convenient computing and storage services. With the development of the Internet of Things and the popularization of 5G networks, edge computing will become an important expansion direction of cloud computing technology.

4. Guarantee of security and privacy
With the widespread application of cloud computing technology, data security and privacy protection have become important issues facing cloud computing technology. In the future, cloud computing technology will pay more attention to security measures such as data encryption, identity authentication and access control to ensure the security and privacy of corporate and personal data.

To sum up, cloud computing technology has become an indispensable part of enterprise information construction. Through cloud computing technology, enterprises can realize a series of functions such as resource sharing, data storage and processing, application development and deployment. At the same time, cloud computing technology also has the advantages of high flexibility, high security, high cost-effectiveness, convenient management and strong scalability.
In the future, with the multi-cloud era, the combination of artificial intelligence and cloud computing, the rise of edge computing, and the protection of security and privacy, cloud computing technology will continue to enhance its importance and application value in enterprise information construction.

With the continuous development of information technology, cloud computing technology has become an indispensable part of enterprise information construction.
English reference and translation

Linux - Operating System of the Network Era

For many people, the use of Linux as the main operating system of a huge workstation cluster that produced the special effects for "Titanic" was already a show-stealing achievement. For Linux, though, it was just one news item among many. Recently, announcements from vendors that they will support Linux have been increasing day by day, and users' enthusiasm for Linux is running unprecedentedly high. So what exactly is the appeal of this free operating system, only a little over seven years old, that has won the favor of so many users and of major software and hardware vendors such as Oracle, Informix, HP, Sybase, Corel, Intel, Netscape, and Dell?

1. Background and characteristics of Linux

Linux is "free software": free means that users can freely obtain the programs and source code and freely use them, including modifying and copying them. It is a product of the network era: numerous engineers developed it together over the Internet, countless users tested and debugged it, and users can easily add extensions of their own. As the most outstanding piece of free software, Linux has the following characteristics:

(1) It fully follows the POSIX standard and is a network operating system extended to support all AT&T and BSD Unix features. It inherits Unix's outstanding design philosophy and has a clean, robust, efficient, and stable kernel. All of its key code was written by Linus Torvalds and other outstanding programmers, with no Unix code from AT&T or Berkeley, so Linux is not Unix, yet it is fully compatible with Unix.

(2) It is a true multitasking, multi-user system with built-in networking that connects seamlessly with NetWare, Windows NT, OS/2, Unix, and others. Its networking was the fastest in comparative benchmarks of various Unix systems.
It also supports many file systems, such as FAT16, FAT32, NTFS, Ext2FS, and ISO 9660.

(3) It runs on many hardware platforms, including processors such as Alpha, Sun SPARC, PowerPC, and MIPS, and support for new peripheral hardware arrives quickly from the many programmers distributed around the world.

(4) Its hardware requirements are low, and it performs very well on lower-end machines. Especially noteworthy is Linux's outstanding stability: its uptime is often measured in years.

2. Main applications of Linux

At present, the main applications of Linux include:

(1) Internet/intranet: This is where Linux is used most at present. It can provide all Internet services, including Web, FTP, Gopher, SMTP/POP3 mail, proxy/cache, and DNS servers. The Linux kernel supports IP aliasing, PPP, and IP tunneling, which can be used to set up virtual hosts, virtual services, VPNs (virtual private networks), and so on. The Apache Web server, which runs mainly on Linux, had a 49% market share in 1998, far exceeding the combined share of big companies such as Microsoft and Netscape.

(2) Because of its outstanding networking ability, Linux can be used for large-scale distributed computing, for instance animation production, scientific computation, and database and file servers.

(3) As a complete implementation of Unix that runs on low-end platforms, it is widely used in teaching and research at universities and colleges. For example, the Mexican government has announced that primary and secondary schools nationwide will install Linux and provide Internet service to students.

(4) Desktop and office applications.
The number of users in this area is still far below that of Microsoft Windows. The reason is not only that Linux has far fewer desktop applications than Windows, but also that, as free software, it has almost no advertising behind it (StarOffice, for instance, is functionally no worse than MS Office, yet few people know of it).

3. Can Linux become a mainstream operating system?

Facing growing pressure from users, more and more commercial companies are porting their applications to the Linux platform. The more important events of 1998 were as follows:
① Compaq and HP decided to preinstall Linux on their servers for users who request it, and IBM and Dell also promised to offer customized Linux systems to users.
② Lotus announced that the next version of Notes would include a dedicated Linux edition.
③ Corel ported its well-known WordPerfect to Linux and released it free of charge. Corel also plans to move its other graphics and image processing products to the Linux platform.
④ The major database vendors Sybase, Informix, Oracle, CA, and IBM have already ported their database products to Linux or finished beta versions; Oracle and Informix also offer technical support for their products.

4. Encouragingly, some farsighted domestic companies have already begun working to change this situation. Stone Co. recently announced a large investment to develop an Internet/intranet solution on the Linux platform, to build its system integration business around that solution, and to set up a nationwide Linux technical support organization, taking the lead in promoting the use and development of free software in China.
In addition, other domestic computer companies are also devoted to popularizing Linux-related software and hardware systems. We believe that as domestic enterprises come to understand Linux better, more and more of them will join the ranks of Linux users, and more software will be ported to the Linux platform. Meanwhile, domestic universities should use Linux as the basis for upgrading their existing Unix courses, starting from analysis of the source code and modification of the kernel, train a large number of senior Linux specialists, and develop our country's own operating system. Only by truly mastering the operating system can our software industry escape its present passive state of slavish imitation and of being led by the nose by others, and create the conditions for a fundamental revitalization.

Chinese translation

Linux - Operating System of the Network Era

For many people, the use of Linux as the main operating system of a huge workstation cluster that produced the special effects for "Titanic" was already a show-stealing achievement.
Academic English literature on big data

Big Data: Challenges and Opportunities in the Digital Age.

Introduction.

In the contemporary digital era, the advent of big data has revolutionized various aspects of human society. Big data refers to vast and complex datasets generated at an unprecedented rate from diverse sources, including social media platforms, sensor networks, and scientific research. While big data holds immense potential for transformative insights, it also poses significant challenges and opportunities that require thoughtful consideration. This article aims to elucidate the key challenges and opportunities associated with big data, providing a comprehensive overview of its impact and future implications.

Challenges of Big Data.

1. Data Volume and Variety: Big data datasets are characterized by their enormous size and heterogeneity. Dealing with such immense volumes and diverse types of data requires specialized infrastructure, computational capabilities, and data management techniques.

2. Data Velocity: The continuous influx of data from various sources necessitates real-time analysis and decision-making. The rapid pace at which data is generated poses challenges for data processing, storage, and efficient access.

3. Data Veracity: The credibility and accuracy of big data can be a concern due to the potential for noise, biases, and inconsistencies in data sources. Ensuring data quality and reliability is crucial for meaningful analysis and decision-making.

4. Data Privacy and Security: The vast amounts of data collected and processed raise concerns about privacy and security. Sensitive data must be protected from unauthorized access, misuse, or breaches. Balancing data utility with privacy considerations is a key challenge.

5. Skills Gap: The analysis and interpretation of big data require specialized skills and expertise in data science, statistics, and machine learning.
There is a growing need for skilled professionals who can effectively harness big data for valuable insights.

Opportunities of Big Data.

1. Improved Decision-Making: Big data analytics enables organizations to make informed decisions based on comprehensive data-driven insights. Data analysis can reveal patterns, trends, and correlations that would be difficult to identify manually.

2. Personalized Experiences: Big data allows companies to tailor products, services, and marketing strategies to individual customer needs. By understanding customer preferences and behaviors through data analysis, businesses can provide personalized experiences that enhance satisfaction and loyalty.

3. Scientific Discovery and Innovation: Big data enables advancements in various scientific fields, including medicine, genomics, and climate modeling. The vast datasets facilitate the identification of complex relationships, patterns, and anomalies that can lead to breakthroughs and new discoveries.

4. Economic Growth and Productivity: Big data-driven insights can improve operational efficiency, optimize supply chains, and create new economic opportunities. By leveraging data to streamline processes, reduce costs, and identify growth areas, businesses can enhance their competitiveness and contribute to economic development.

5. Societal Benefits: Big data has the potential to address societal challenges such as crime prevention, disease control, and disaster management. Data analysis can empower governments and organizations to make evidence-based decisions that benefit society.

Conclusion.

Big data presents both challenges and opportunities in the digital age. The challenges of data volume, velocity, veracity, privacy, and skills gap must be addressed to harness the full potential of big data. However, the opportunities for improved decision-making, personalized experiences, scientific discoveries, economic growth, and societal benefits are significant.
By investing in infrastructure, developing expertise, and establishing robust data governance frameworks, organizations and individuals can effectively navigate the challenges and realize the transformative power of big data. As thedigital landscape continues to evolve, big data will undoubtedly play an increasingly important role in shaping the future of human society and technological advancement.。
Graduation Project Appendix
Foreign Literature Translation: Original Text and Translation
Source: Chaudhuri S. Big data, cloud computing technology and the audit[J]. IT Professional Magazine, 2016, 2(4): 38-51.
Original text
Big data, cloud computing technology and the audit
Chaudhuri S
Abstract
At present, big data, together with the development of cloud computing technology, is having a significant impact on global economic and social life. Big data and cloud computing technology provide modern auditing with new techniques and methods. Audit organizations and audit personnel need to grasp the content and characteristics of big data and cloud computing technology in order to promote the further development of modern audit techniques and methods.
Keywords: big data, cloud computing technology, audit, advice
1 Related concepts
1.1 Big data
The word "data" means "known" in Latin and can also be interpreted as "fact". In 2009, the concept of "big data" gradually began to spread in society. The concept became truly popular after the Obama administration announced its "Big Data Research and Development Initiative" with a high profile in 2012, which marked the point at which the era of big data really began to enter social and economic life.
"Big data", or "massive data", refers to data whose volume is too large for current mainstream software tools to collect, analyze, process, or convert, within a reasonable period of time, into information that helps decision-makers make decisions. International Data Corporation (IDC) describes "big data" as a new generation of architectures and technologies designed to derive value more economically and efficiently from high-frequency, high-volume data of different structures and types. The term is used to describe and define the huge amounts of data produced in the age of the information explosion, and to name the related technological development and innovation.
Big data has four characteristics. First, the data volume is huge, having jumped from the TB level to the PB level. Second, processing speed is high, which is fundamentally different from traditional data mining technology. Third, there are many data types, such as pictures, location information, video, web logs, and other forms. Fourth, value density is low, but commercial value is high.
1.2 Cloud computing
The concept of "cloud computing" emerged from the practice of large Internet companies such as Google and IBM in handling huge amounts of data. On August 9, 2006, Google CEO Eric Schmidt put forward the concept of "cloud computing" for the first time at the Search Engine Strategies conference. In October 2007, Google and IBM began to promote a cloud computing initiative on university campuses in the United States. The project hoped to reduce the cost of distributed computing technology in academic research and to provide these universities with the related hardware, software, and technical support (Michael Mille, 2009).
There are many definitions of "cloud computing" around the world. "Cloud computing" is a model for adding, using, and delivering Internet-based services, which usually involves providing dynamic, easily scalable, and often virtualized resources over the Internet. In 2009 the U.S. National Institute of Standards and Technology (NIST) defined cloud computing as a pay-per-use model that provides available, convenient, on-demand network access to a shared pool of configurable computing resources (including networks, servers, storage, applications, and services) that can be provisioned quickly with very little management effort or interaction with the service provider.
1.3 The relationship between big data and cloud computing
Overall, big data and cloud computing are complementary to each other.
Big data focuses mainly on actual business and on "data": it provides the techniques and methods for data collection, mining, and analysis, and emphasizes data storage capacity. Cloud computing focuses on "computing": it is concerned with IT infrastructure, provides IT solutions, and emphasizes computing power, that is, data processing capacity. Without the data held in big data storage, cloud computing power, however strong, would have nothing to work on; without the data processing capacity of cloud computing, the data held in big data storage, however rich, could not ultimately be put to practical use.
From a technical point of view, big data relies on cloud computing. Massive data storage technology, massive data management technology, and the MapReduce programming model are key technologies of cloud computing and are also the technical basis of big data. What makes the data "big" is, above all, the technology provided by the cloud computing platform. Once data is on the "cloud", the earlier fragmentation of data storage is broken down, data becomes easier to collect and obtain, and big data can be presented to people.
In terms of focus, big data and cloud computing have different emphases. Big data emphasizes all sorts of data and the broad, deep mining of massive data to discover the value in it, forcing companies to shift from being "business-driven" to being "data-driven". The cloud mainly provides scalable, widely available computing and storage resources and capabilities over the Internet; its emphasis is on IT resources, processing capacity, and a variety of applications, helping enterprises save IT deployment costs.
Cloud computing benefits the IT department of an enterprise, while big data benefits its business management departments.
2 Analysis of the influence of big data and cloud computing technology on the audit
2.1 Big data and cloud computing technology promote the development of the continuous audit mode
In a traditional audit, the auditor audits only after the audited business has been completed, and the audit does not cover all data and information; only a part is examined. Such an after-the-fact, limited audit makes it difficult to evaluate the auditee's complex production, operation, and management systems correctly and in time, and it is too slow to assess the authenticity and legitimacy of increasingly frequent and complex operation and management activities. With the rapid development of information technology, more and more audit organizations have begun to implement continuous auditing to close the time gap between audit results and economic activity. However, auditors are often limited by current business conditions and information technology: unstructured data cannot be digitized, or the related detailed data cannot be obtained, so questionable items cannot be examined further or in more depth. Big data and cloud computing technology can promote the development of the continuous audit mode by making better use of information technology. This is especially true for industries that place high demands on the "real-time" nature of business data and risk control, such as banking, securities, and insurance, where continuous auditing is urgently needed.
2.2 Big data and cloud computing technology promote the application of the overall audit mode
The current audit mode implements sampling audit on the basis of an evaluation of audit risk.
Because it is impossible to collect and analyze all of the auditee's economic and business data, the current audit mode depends mainly on audit sampling. It infers the whole from the part: samples are drawn from the audited population, and the overall situation of the audit object is then deduced from them. Because the samples drawn are limited and many specific business activities are ignored, auditors using the sampling audit mode may fail to find and reveal major fraud by the auditee, which conceals significant audit risk. For the auditor, big data and cloud computing technology are not only an available technical means; they also make it feasible to implement the overall audit mode. Using big data and cloud computing technology to collect and analyze data across industries and across enterprises, the auditor no longer needs random sampling and can instead adopt an overall audit mode that collects and analyzes all the data. Analyzing all the data related to the audit object allows the auditor to establish an overall way of thinking about the audit and can bring revolutionary change to modern auditing. An auditor who implements the overall audit mode can avoid audit sampling risk. If all the data can be gathered, more subtle and in-depth information becomes visible, the data can be analyzed in depth from multiple perspectives, and details hidden in the data that are valuable to the audit can be discovered. At the same time, an auditor who implements the overall audit mode can find problems that cannot be found under the audit sampling mode.
2.3 Big data and cloud computing technology promote the integrated application of audit results
At present, the auditor's audit results are provided to the auditee mainly in the form of the audit report, whose format is fixed, whose content is uniform, and which contains little information.
As big data and cloud computing technology become widely used in auditing, the auditor's results go beyond the audit report: the large amounts of information and data collected, mined, analyzed, and processed during the audit can also be provided to the auditee to improve management, promoting the integrated application of audit results and improving their overall effect. First, the auditor summarizes and generalizes the large amounts of data and related information obtained in the audit to find the inherent regularities, common problems, and development trends in finance, business, and operation and management. From this summary the auditor distils audit information that is macroscopic and comprehensive, and provides investors and other stakeholders of the auditee with supporting data, correlation analysis, and decision-making suggestions, thereby helping to raise the auditee's level of management. Second, by using big data and cloud computing technology, auditors can analyze and process the same problem under different categories, integrating and refining it from different angles and at different levels to meet the needs of different levels. Third, the auditor retains the audit results intelligently: with big data and cloud computing technology, the problems found are codified and fixed in the system so that their development trend can be calculated or judged and early warnings can be given to the auditee.
3 Big data and cloud computing technology promote the application of correlation-based evidence
Auditors should form their audit opinion and issue the audit report on the basis of sufficient and appropriate audit evidence obtained during the audit. However, in the big data and cloud computing environment, auditors face both the test of screening a huge amount of data and the challenge of collecting appropriate audit evidence.
When collecting audit evidence, the auditor's traditional line of thinking is to collect it on the basis of causal relationships, whereas big data analysis makes more use of correlation analysis to gather and discover audit evidence. From the perspective of discovering audit evidence, big data technology provides unprecedented interdisciplinary, quantifiable dimensions, so that a large amount of information relevant to the audit can be recorded and analyzed. Big data and cloud computing technology have not changed the causal relationships between things, but their development and use of correlation reduce the dependence of data analysis on causal logic and make it more inclined toward analysis based on correlated data. Validating conclusions on the basis of correlation analysis is one of the important characteristics of big data and cloud computing technology. In the big data and cloud computing environment, most of the audit evidence that the auditor can collect is electronic evidence. Electronic evidence is itself very complex, and cloud computing technology makes it more difficult to obtain causal evidence. Auditors should therefore shift from their long-term reliance on causal relationships for collecting and discovering audit evidence to using correlation to collect and discover it.
Translation
Big data, cloud computing technology and the audit
Chaudhuri S
Abstract
At present, big data, accompanied by the development of cloud computing technology, is having a tremendous impact on global economic and social life.