Cloud Computing Foreign Literature Translation: Original Texts


An English Essay on Cloud Computing

English answer: Cloud computing has revolutionized the way businesses and individuals store, process, and access data and applications. As a result, cloud computing offers multiple benefits, including improved flexibility, reduced costs, increased security, enhanced collaboration, and innovative services.

Flexibility is one of the key advantages of cloud computing. It allows users to access their data and applications from anywhere with an internet connection, which makes it easier for businesses to operate remotely and for individuals to work from home.

Reduced cost is another advantage. Businesses can avoid the upfront costs of purchasing and maintaining their own IT infrastructure and instead pay for cloud services on a pay-as-you-go basis.

Increased security is another key benefit. Cloud providers have invested heavily in security measures to protect their customers' data, which can make cloud computing a more secure option than storing data on-premises.

Enhanced collaboration is a further advantage. Cloud-based applications make it easier for teams to collaborate on projects, because cloud applications can be accessed from anywhere and allow users to share files and work together in real time.

Innovative services are another key benefit. Cloud providers are constantly developing new services, such as artificial intelligence, machine learning, and data analytics, which can help businesses improve their operations and make better decisions.

Overall, cloud computing offers a number of benefits that can help businesses and individuals improve their operations, reduce costs, and increase innovation.

Chinese answer: The advantages of cloud computing.

Computer Science English Literature with Chinese Translation

Management Information System Overview

A Management Information System is what we usually call an MIS: a system composed of people, computers and other elements that can collect, transmit, store, maintain and use information. In a modern information society that emphasizes management and information, it is becoming more and more widespread. MIS is a young discipline that spans a number of fields, such as scientific management, systems science, operations research, statistics and computer science. On the basis of these subjects it forms its own methods of gathering and processing information, thereby weaving them into a vertically and horizontally connected system.

In the 20th century, along with the vigorous development of the global economy, many economists proposed new management theories. In the 1950s, Simon put forward the idea that management depends on information and on decision-making. In the same period Wiener published his control theory, holding that management is a control process. In 1958 Gail wrote: "Management will obtain timely and accurate information at lower cost and achieve better control." During this period computers began to be used in accounting, and the term "data processing" appeared. In 1970, Walter T. Kenova gave the management information system a definition: "in verbal or written form, delivering to managers, staff and outside personnel, at the right time, information about the past, present and projected future of the enterprise and its environment." This definition contained no application model and made no mention of computer applications. In 1985, Gordon B. Davis, professor of management at the University of Minnesota and a founder of the field, gave a more complete definition: "A management information system is a system that uses computer hardware and software resources, manual operations, models for analysis, planning, control and decision-making, and a database. It provides information to support the operation, management and decision-making functions of an enterprise or organization." This comprehensive definition explains the goal, functions and composition of a management information system, and also reflects the level that management information systems had reached at that time.

With the continuous progress of science and technology and the growing maturity of computer science, the computer has come to accompany our study and work. Today computers are very inexpensive while their performance has improved greatly, and they are used in many areas. Computers became so popular mainly for the following reasons: first, a computer can substitute for much complex labor; second, a computer can greatly improve people's working efficiency;
third, a computer can save a great deal of resources; fourth, a computer can keep sensitive documents more secure.

Computer applications have spread into every field of economic and social life, so the old management methods no longer suit social development. Many organizations still rely on earlier manual procedures, which has greatly hindered economic development. In recent years, as university enrollment has kept growing, the number of students in schools has also increased, making educational administration increasingly complex and burdensome and consuming a great deal of manpower and material resources, while the existing level of student-achievement management remains low. People have long managed student achievement with traditional paper documents, a practice with many shortcomings: efficiency is low, confidentiality is poor, and as time goes by a large volume of documents and data accumulates, which makes searching, updating and maintenance very difficult. Such a mechanism can no longer keep up with the times, and day-to-day administration has become more and more of a bottleneck for schools. In the information age these traditional management methods will inevitably be replaced by computer-based information management. As one kind of computer application, using computers to manage student performance information has advantages that manual management cannot match, for example: rapid retrieval, convenient look-up, high reliability, large storage capacity, good confidentiality, long service life and low cost. These advantages can greatly improve the efficiency of student performance management and are also an important condition for scientific, standardized school management and for connecting with the wider world. Therefore, developing such a set of management software is very necessary.

The design idea is to put users first: the interface should be attractive, the operation clear and simple, and, as a practical system, it should also have good fault tolerance, so that when a user makes a mistake a warning is given promptly and the user can correct it in time.
To take full advantage of the functions of Visual FoxPro, the software should be made as powerful as possible while occupying as few system resources as possible.

Visual FoxPro's command structure and working methods: Visual FoxPro was originally called FoxBASE, a database product introduced by the U.S. company Fox Software that ran under DOS and was compatible with the dBASE family. After Microsoft acquired Fox Software, the product was developed further so that it could run on Windows, and its name was changed to Visual FoxPro. Visual FoxPro is a powerful relational-database rapid application development tool. With it one can create desktop database applications, client/server applications and component-based programs for Web services, and its functionality can be extended by means of ActiveX controls, API functions and other mechanisms.

First, working methods
1. Interactive operation
(1) Command operation: in the VF command window, operations are carried out by typing commands of various kinds at the keyboard.
(2) Menu operation: VF uses menus, windows and dialogs to provide interactive operation through a graphical interface.
(3) Assisted operation: VF provides a wide range of user-friendly tools, such as wizards, designers and builders.
2. Program execution
In VF, a program is executed by writing a group of commands in the programming language, saving them in a program file with the .PRG extension, and then running that command file so that it executes automatically and displays its results.

Second, command structure
1. A VF command usually consists of two parts: the first part is the command verb, also known as the keyword, which specifies the function of the command; the second part is the command clause, which specifies the operation target, operating conditions and other information. The general form of a VF command is:
<command verb> [<command clause>]
2. Symbols agreed in the command format: VF uses a unified convention of symbols in its command formats and function descriptions. Their meanings are as follows: angle brackets enclose required items whose parameters must be entered in the given format; square brackets enclose optional items whose parameters are entered according to the user's specific requirements.

Third, the Project Manager
1. Creation method - command window: CREATE PROJECT <file name>
2. Project Manager tabs
All - displays and manages all types of documents in the application project; the "All" tab contains, in their entirety, the five tabs to its right.
Data - manages the various types of data files in the application project: databases, free tables, views and query documents.
Documents - displays forms, reports, labels and other documents.
Classes - displays and manages the class-library documents used in the application project, including VF's system class libraries and class libraries designed by the user.
Code - manages the program code documents used in the project, such as program files (.PRG), API libraries and applications (.APP) generated from the project.
(2) The work area: the Project Manager work area is the window in which documents of all types are displayed and managed.
(3) Command buttons: the buttons to the right of the Project Manager work area provide commands for the documents in the work-area window.

Fourth, using the Project Manager
1. Command button functions
New - with a certain type of document selected in the work-area window, the New button adds a new document to the Project Manager window.
Add - documents created independently with the "New" command on the VF "File" menu or the "Wizard" command on the "Tools" menu can be added to the Project Manager so that they are organized and managed in a unified way.
Modify - modifies documents that already exist in the project, opening the design interface appropriate to that kind of document.
Run - with a specific document highlighted in the work-area window, runs that document.
Remove - removes the selected document from the project.
Build - combines the documents in the project into an executable application file.

Database system design: database design here refers to logical database design, that is, organizing the data hierarchically according to the intended classification system and logical divisions; it is user-oriented. Database design requires integrating the archival data and data requirements of the various departments of the enterprise, analysing the relationships among the various data items, and designing the database in accordance with the DBMS.

[Chinese translation] Management Information System Overview: A management information system is what we usually call an MIS (Management Information System), a system composed of people, computers and other elements that can collect, transmit, store, maintain and use information; in a modern society that emphasizes management and information it is becoming more and more widespread.

Artificial Intelligence: English Literature Original and Translation

Appendix IV: Original English Literature

Artificial Intelligence

The term "artificial intelligence" was first put forward at the Dartmouth conference in 1956. Since then, researchers have developed many theories and principles, and the concept of artificial intelligence has expanded accordingly. Artificial intelligence is a challenging science: whoever works on it must understand computer science, psychology and philosophy. Artificial intelligence covers a very wide range and is made up of different fields, such as machine learning and computer vision. On the whole, one of the main goals of research on artificial intelligence is to enable machines to perform complex tasks that would normally require human intelligence. However, different people in different eras understand "complex" differently. For example, heavy scientific and engineering calculation was once supposed to be done by the human brain; now computers can not only complete such calculation but do it faster and more accurately than the human brain, so people no longer regard it as a "complex task requiring human intelligence." The definition of a complex task therefore changes with the development of the times and the progress of technology, and the specific goals and content of artificial intelligence as a science change and develop with the times as well. On the one hand it keeps making new progress; on the other hand it keeps turning toward more meaningful and more difficult goals. At present the main material basis for studying artificial intelligence, and the machine that realizes artificial intelligence technology, is the computer, and the history of artificial intelligence is intertwined with the history of computer science and technology. Besides computer science, artificial intelligence also involves information theory, cybernetics, automation, bionics, biology, psychology, logic, linguistics, medicine, philosophy and many other disciplines. Research in artificial intelligence includes: knowledge representation, automatic reasoning and search methods, machine learning, knowledge acquisition, knowledge-processing systems, natural language processing, computer vision, intelligent robots, automatic programming, and so on.

Practical applications include machine vision (fingerprint identification, face recognition, retina identification, iris identification, palm-print identification), expert systems, intelligent search, theorem proving, game playing, automatic programming, and aerospace applications.

Artificial intelligence is an interdisciplinary subject that sits on the border between natural science and social science. It draws on philosophy and cognitive science, mathematics, neurophysiology, psychology, computer science, information theory, cybernetics, theories of indeterminacy and bionics. Its research topics include natural language processing, knowledge representation, intelligent search, reasoning, planning, machine learning, knowledge acquisition, combinatorial scheduling problems, perception, pattern recognition, logic programming, soft computing, management of imprecision and uncertainty, artificial life, neural networks, complex systems, and genetic algorithms that model human ways of thinking. Its applications include intelligent control, robotics, language and image understanding, genetic programming and robot factories.

Safety problems

Artificial intelligence is still being studied, but some scholars think that letting computers possess IQ is very dangerous: it might turn against humanity.
Such hidden dangers have appeared in many films.

The definition of artificial intelligence

The definition of artificial intelligence can be divided into two parts, namely "artificial" and "intelligence." "Artificial" is easier to understand, though still somewhat controversial: sometimes we must consider what humans are able to make, or whether human intelligence is high enough to create artificial intelligence, and so on. Generally speaking, however, an "artificial system" is simply a system made by humans in the usual sense.

What "intelligence" is raises many more problems. It touches on other issues such as consciousness, the self, and thought (including unconscious thought). The only intelligence people really know is human intelligence, which is our universally accepted view of ourselves, but our understanding of our own intelligence is very limited, and we do not know which elements necessarily constitute intelligence, so it is difficult to define what the "intelligence" that we would "artificially" manufacture really is. Research on artificial intelligence therefore often involves research on intelligence itself, and studies of animal intelligence or of other artificial systems are also widely considered relevant to artificial intelligence.

Artificial intelligence currently receives wide attention in the computer field and is applied in robotics, economic and political decision-making, control systems and simulation systems. In other areas it also plays an indispensable role.

Professor Nilsson of the Artificial Intelligence Research Center at Stanford University gave artificial intelligence the following definition: "Artificial intelligence is the discipline concerned with knowledge: how to represent knowledge, how to acquire knowledge and how to use knowledge." Professor Winston of MIT held a different view: "Artificial intelligence is the study of how to make computers do intelligent work that at present only people can do." These statements reflect the basic ideas and basic content of the discipline: artificial intelligence studies the laws of human intelligent activity, builds artificial systems with a certain degree of intelligence, and studies how to make computers carry out work that previously required human intelligence, that is, how to use computer hardware and software to simulate some intelligent human behavior, together with the underlying theory, methods and techniques.

Artificial intelligence is a branch of computer science. Since the 1970s it has been known as one of the world's three leading technologies (space technology, energy technology and artificial intelligence), and it is also regarded as one of the three leading technologies of the 21st century (genetic engineering, nanoscience and artificial intelligence). Over the past thirty years it has developed rapidly, has been widely applied in many fields and has achieved remarkable results; artificial intelligence has gradually become an independent branch that is systematic in both theory and practice. Its research results are gradually being integrated into people's lives to create more benefit for mankind.

Artificial intelligence is the study of using computers to simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking and planning), including the principles by which computers achieve intelligence, the building of computers that resemble human intelligence, and the raising of computer applications to a higher level. Artificial intelligence involves computer science, philosophy, linguistics, psychology and other disciplines.
It encompasses disciplines from almost all the natural and social sciences, and its scope already goes far beyond that of computer science alone. Artificial intelligence and the science of thinking stand in the relationship of application to theory: artificial intelligence is at the technological-application level of the science of thinking and is one of its applications. From the point of view of thinking, artificial intelligence cannot be limited to logical thinking; thinking in images and inspirational thinking must also be considered before breakthroughs in artificial intelligence can be achieved. Mathematics is often regarded as the foundation of many basic sciences, and it has entered the fields of language and thought as well; the discipline of artificial intelligence must also borrow mathematical tools. Mathematical logic, fuzzy mathematics and other branches play their roles within artificial intelligence, and as mathematics enters the scope of the discipline the two will promote each other and develop faster.

A brief history of artificial intelligence

Artificial intelligence can be traced back to the legends of ancient Egypt, but it was only after 1941, with the development of computer technology, that it finally became possible to create machine intelligence. The term "artificial intelligence" was first proposed at the Dartmouth conference in 1956. Since then researchers have developed many theories and principles, and the concept of artificial intelligence has expanded. In its not-very-long history, artificial intelligence has developed more slowly than expected, but it has kept moving forward; compared with forty years ago there are now many AI programs, and they have also influenced the development of other technologies. The emergence of AI programs has created immeasurable wealth for society and promoted the development of human civilization.

The computer era

In 1941 an invention appeared, in both the United States and Germany, that revolutionized every aspect of storing and processing information: the first electronic computer. It occupied several large, air-conditioned rooms and was a programmer's nightmare, since running a single program required setting thousands of wires by hand. By 1949 an improved design, the stored-program computer, made entering programs much easier, and advances in the theory of computer science eventually led to computer-based artificial intelligence. The electronic computer's way of processing data provided a medium through which artificial intelligence could be realized.

The beginning of AI

Although the computer provided the necessary technical basis for AI, it was not until the early 1950s that people noticed the connection between machine and human intelligence. Norbert Wiener was an American who studied the theory of feedback. The most familiar example of feedback control is the thermostat: it compares the collected room temperature with the desired temperature and opens or closes a small heater in response, thereby controlling the temperature of the room. The importance of Wiener's study of the feedback loop lies in his claim that, in theory, all intelligent activity is the result of feedback mechanisms, and feedback mechanisms can be simulated by machines. This finding influenced much of the early development of AI.

In 1955 Simon and Newell developed a program called the "Logic Theorist." This program is considered by many to be the first AI program. It represented each problem as a tree and then chose the branch most likely to lead to a correct conclusion in order to solve the problem.
"logic" to the public and the AI expert research field effect makes it AI developing an important milestone in 1956, is considered to be the father of artificial intelligence of John McCarthy organized a society, will be a lot of interest machine intelligence experts and scholars together for a month. He asked them to Vermont Dartmouth in "artificial intelligence research in summer." since then, this area was named "artificial intelligence" although Dartmouth learn not very successful, but it was the founder of the centralized and AI AI research for later laid a foundation.After the meeting of Dartmouth, AI research started seven years. Although the rapid development of field haven't define some of the ideas, meeting has been reconsidered and Carnegie Mellon university. And MIT began to build AI research center is confronted with new challenges. Research needs to establish the: more effective to solve the problem of the system, such as "logic" in reducing search; expert There is the establishment of the system can be self learning.In 1957, "a new program general problem-solving machine" first version was tested. This program is by the same logic "experts" group development. The GPS expanded Wiener feedback principle, can solve many common problem. Two years later, IBM has established a grind investigate group Herbert AI. Gelerneter spent three years to make a geometric theorem of solutions of the program. This achievement was a sensation.When more and more programs, McCarthy busy emerge in the history of an AI. 1958 McCarthy announced his new fruit: LISP until today still LISP language. In. "" mean" LISP list processing ", it quickly adopted for most AI developers.In 1963 MIT from the United States government got a pen is 22millions dollars funding for research funding. The machine auxiliary recognition from the defense advanced research program, have guaranteed in the technological progress on this plan ahead of the Soviet union. Attracted worldwide computer scientists, accelerate the pace of development of AI research.Large programAfter years of program. It appeared a famous called "SHRDLU." SHRDLU "is" the tiny part of the world "project, including the world (for example, only limited quantity of geometrical form of research and programming). In the MIT leadership of Minsky Marvin by researchers found, facing the object, the small computer programs can solve the problem space and logic. Other as in the late 1960's STUDENT", "can solve algebraic problems," SIR "can understand the simple English sentence. These procedures for handling the language understanding and logic.In the 1970s another expert system. An expert system is a intelligent computer program system, and its internal contains a lot of certain areas of experience and knowledge with expert level, can use the human experts' knowledge and methods to solve the problems to deal with this problem domain. That is, the expert system is a specialized knowledge and experience of the program system. Progress is the expert system could predict under certain conditions, the probability of a solution for the computer already has. Great capacity, expert systems possible from the data of expert system. It is widely used in the market. Ten years, expert system used in stock, advance help doctors diagnose diseases, and determine the position of mineral instructions miners. 
All of this became possible because of expert systems' ability to store rules and large amounts of information.

In the 1970s many new methods were used to develop AI, the most famous being Minsky's frame theory and the new theory of machine vision put forward by David Marr, which examined, for example, how a pair of images yields basic information from shading, shape, color, texture and edges. Through the analysis of such information one can distinguish the scene and infer what the image might be. PROLOG, another AI language, was the result of work in 1972.

In the 1980s AI made even faster progress and moved further into business. In 1986 sales of AI-related software and hardware reached 425 million dollars. Expert systems, because of their utility, were especially in demand. Companies such as Digital Equipment Corporation used the XCON expert system to configure VAX computer systems. DuPont, General Motors and Boeing relied heavily on expert systems, and companies producing software to help build expert systems, such as Teknowledge and Intellicorp, were established. In order to find and correct mistakes in existing expert systems, still other expert systems were designed, and expert systems such as TVC were built to teach users how to use operating systems.

From the lab to daily life

People began to feel the influence of computers and artificial intelligence directly. Computer technology no longer belonged only to a group of researchers in laboratories: personal computers and numerous technical magazines brought computer technology before the public, and organizations such as the American Association for Artificial Intelligence were founded. Driven by the need to develop, AI research also boomed in private companies: more than 150 companies like DEC (which alone employed more than 700 people in AI research) spent about a billion dollars on internal AI teams.

Other areas of AI also entered the market in the 1980s. One was machine vision, building on the achievements of Marr and Minsky. Cameras and computers were now used for quality control in production. Although still very modest, these systems were able to distinguish objects by their different shapes. By 1985 more than 100 companies in America were producing machine vision systems, with total sales of 80 million dollars.

But not every year of the 1980s was good for AI and its industry. In 1986-87 demand for AI systems fell and the industry lost nearly five hundred million dollars. Teknowledge and Intellicorp together lost more than 6 million dollars, about one-third of their profits, and the huge losses forced many institutions to cut research funding. Another disappointment was the so-called "smart truck" supported by the Defense Advanced Research Projects Agency, a project whose purpose was to develop a robot that could perform many battlefield tasks. Because of the project's defects and the hopeless prospect of success, the Pentagon stopped its funding.

Despite these setbacks, AI slowly continued to develop new technologies. Japan and the United States developed technologies such as fuzzy logic, which can make decisions under uncertain conditions, and neural networks, regarded as a possible approach to realizing artificial intelligence. In any case, the AI introduced to the market in the eighties showed its practical value, and it will surely be a key technology of the 21st century. Artificial intelligence technology was tested by war in the "Desert Storm" operation, where militarily intelligent equipment was put to the test; AI technology was used in missile systems, early-warning systems and other advanced weapons. AI technology has also entered the home, and intelligent computers increasingly attract public interest.
The emergence of networked games has enriched people's lives. Application software such as voice and character recognition for Macintosh and IBM machines can already be bought, and AI techniques such as fuzzy logic are used to simplify camera equipment. The growing demand for AI-related technology constantly drives new progress. In a word, artificial intelligence has changed, and will inevitably continue to change, our lives.

Appendix III: Chinese Translation of the English Literature

Artificial Intelligence: The term "artificial intelligence" was first put forward at the Dartmouth conference in 1956.

Cloud Computing Foreign Literature References

Cloud computing foreign literature references (the document contains the English original and the corresponding Chinese translation). Original text:

Technical Issues of Forensic Investigations in Cloud Computing Environments

Dominik Birk
Ruhr-University Bochum
Horst Goertz Institute for IT Security
Bochum, Germany

Ruhr-University Bochum
Horst Goertz Institute for IT Security
Bochum, Germany

Abstract—Cloud Computing is arguably one of the most discussed information technologies today. It presents many promising technological and economical opportunities. However, many customers remain reluctant to move their business IT infrastructure completely to the cloud. One of their main concerns is Cloud Security and the threat of the unknown. Cloud Service Providers (CSP) encourage this perception by not letting their customers see what is behind their virtual curtain. A seldom discussed, but in this regard highly relevant, open issue is the ability to perform digital investigations. This continues to fuel insecurity on the sides of both providers and customers. Cloud Forensics constitutes a new and disruptive challenge for investigators. Due to the decentralized nature of data processing in the cloud, traditional approaches to evidence collection and recovery are no longer practical. This paper focuses on the technical aspects of digital forensics in distributed cloud environments. We contribute by assessing whether it is possible for the customer of cloud computing services to perform a traditional digital investigation from a technical point of view. Furthermore we discuss possible solutions and possible new methodologies helping customers to perform such investigations.

I. INTRODUCTION

Although the cloud might appear attractive to small as well as to large companies, it does not come along without its own unique problems. Outsourcing sensitive corporate data into the cloud raises concerns regarding the privacy and security of data. Security policies, companies' main pillar concerning security, cannot be easily deployed into distributed, virtualized cloud environments. This situation is further complicated by the unknown physical location of the company's assets. Normally, if a security incident occurs, the corporate security team wants to be able to perform their own investigation without dependency on third parties. In the cloud, this is not possible anymore: the CSP obtains all the power over the environment and thus controls the sources of evidence. In the best case, a trusted third party acts as a trustee and guarantees for the trustworthiness of the CSP. Furthermore, the implementation of the technical architecture and circumstances within cloud computing environments bias the way an investigation may be processed. In detail, evidence data has to be interpreted by an investigator in a proper manner, which is hardly possible due to the lack of circumstantial information. (We would like to thank the reviewers for the helpful comments and Dennis Heinson (Center for Advanced Security Research Darmstadt - CASED) for the profound discussions regarding the legal aspects of cloud forensics.) For auditors, this situation does not change: questions of who accessed specific data and information cannot be answered by the customers if no corresponding logs are available. With the increasing demand for using the power of the cloud for processing also sensitive information and data, enterprises face the issue of Data and Process Provenance in the cloud [10]. Digital provenance, meaning meta-data that describes the ancestry or history of a digital object, is a crucial feature for forensic investigations.
In combination with a suitable authentication scheme, it provides information about who created and who modified what kind of data in the cloud. These are crucial aspects for digital investigations in distributed environments such as the cloud. Unfortunately, the aspects of forensic investigations in distributed environments have so far been mostly neglected by the research community. Current discussion centers mostly around security, privacy and data protection issues [35], [9], [12]. The impact of forensic investigations on cloud environments was little noticed, albeit mentioned by the authors of [1] in 2009: "[...] to our knowledge, no research has been published on how cloud computing environments affect digital artifacts, and on acquisition logistics and legal issues related to cloud computing environments." This statement is also confirmed by other authors [34], [36], [40] stressing that further research on incident handling, evidence tracking and accountability in cloud environments has to be done. At the same time, massive investments are being made in cloud technology. Combined with the fact that information technology increasingly transcends people's private and professional life, thus mirroring more and more of people's actions, it becomes apparent that evidence gathered from cloud environments will be of high significance to litigation or criminal proceedings in the future. Within this work, we focus the notion of cloud forensics by addressing the technical issues of forensics in all three major cloud service models and consider cross-disciplinary aspects. Moreover, we address the usability of various sources of evidence for investigative purposes and propose potential solutions to the issues from a practical standpoint. This work should be considered as a surveying discussion of an almost unexplored research area. The paper is organized as follows: we discuss the related work and the fundamental technical background information of digital forensics, cloud computing and the fault model in sections II and III. In section IV, we focus on the technical issues of cloud forensics and discuss the potential sources and nature of digital evidence as well as investigations in XaaS environments, including the cross-disciplinary aspects. We conclude in section V.

II. RELATED WORK

Various works have been published in the field of cloud security and privacy [9], [35], [30] focussing on aspects of protecting data in multi-tenant, virtualized environments. Desired security characteristics for current cloud infrastructures mainly revolve around isolation of multi-tenant platforms [12], security of hypervisors in order to protect virtualized guest systems, and secure network infrastructures [32]. Albeit digital provenance, describing the ancestry of digital objects, still remains a challenging issue for cloud environments, several works have already been published in this field [8], [10] contributing to the issues of cloud forensics. Within this context, cryptographic proofs for verifying data integrity, mainly in cloud storage offers, have been proposed, yet they lack practical implementations [24], [37], [23]. Traditional computer forensics has already well researched methods for various fields of application [4], [5], [6], [11], [13]. Also the aspects of forensics in virtual systems have been addressed by several works [2], [3], [20] including the notion of virtual introspection [25].
In addition, the NIST already addressed Web Service Forensics [22], which has a huge impact on investigation processes in cloud computing environments. In contrast, the aspects of forensic investigations in cloud environments have mostly been neglected by both the industry and the research community. One of the first papers focusing on this topic was published by Wolthusen [40] after Bebee et al. had already introduced problems within cloud environments [1]. Wolthusen stressed that there is an inherent strong need for interdisciplinary work linking the requirements and concepts of evidence arising from the legal field to what can be feasibly reconstructed and inferred algorithmically or in an exploratory manner. In 2010, Grobauer et al. [36] published a paper discussing the issues of incident response in cloud environments - unfortunately no specific issues and solutions of cloud forensics have been proposed, which will be done within this work.

III. TECHNICAL BACKGROUND

A. Traditional Digital Forensics

The notion of Digital Forensics is widely known as the practice of identifying, extracting and considering evidence from digital media. Unfortunately, digital evidence is both fragile and volatile and therefore requires the attention of special personnel and methods in order to ensure that evidence data can be properly isolated and evaluated. Normally, the process of a digital investigation can be separated into three different steps, each having its own specific purpose:

1) In the Securing Phase, the major intention is the preservation of evidence for analysis. The data has to be collected in a manner that maximizes its integrity. This is normally done by a bitwise copy of the original media. As can be imagined, this represents a huge problem in the field of cloud computing where you never know exactly where your data is and additionally do not have access to any physical hardware. However, the snapshot technology, discussed in section IV-B3, provides a powerful tool to freeze system states and thus makes digital investigations, at least in IaaS scenarios, theoretically possible.

2) We refer to the Analyzing Phase as the stage in which the data is sifted and combined. It is in this phase that the data from multiple systems or sources is pulled together to create as complete a picture and event reconstruction as possible. Especially in distributed system infrastructures, this means that bits and pieces of data are pulled together for deciphering the real story of what happened and for providing a deeper look into the data.

3) Finally, at the end of the examination and analysis of the data, the results of the previous phases will be reprocessed in the Presentation Phase. The report, created in this phase, is a compilation of all the documentation and evidence from the analysis stage. The main intention of such a report is that it contains all results, it is complete and clear to understand.

Apparently, the success of these three steps strongly depends on the first stage. If it is not possible to secure the complete set of evidence data, no exhaustive analysis will be possible. However, in real world scenarios often only a subset of the evidence data can be secured by the investigator. In addition, an important definition in the general context of forensics is the notion of a Chain of Custody. This chain clarifies how and where evidence is stored and who takes possession of it. Especially for cases which are brought to court it is crucial that the chain of custody is preserved.
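To make the Securing Phase and the Chain of Custody more concrete, the following is a minimal sketch (not part of the original paper) of how an investigator might preserve the integrity of an acquired image: the image file is hashed and a custody record is appended to a log. The file name vm_snapshot.img, the examiner name and the log format are hypothetical and only illustrate the idea that re-computing the digest later demonstrates the evidence has not changed.

import hashlib
import json
import datetime

def hash_image(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Compute a cryptographic digest of an acquired disk/memory image, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_custody_entry(image_path: str, examiner: str, log_path: str = "custody_log.jsonl") -> dict:
    """Append a simple chain-of-custody record: who handled which image, when, and its digest."""
    entry = {
        "image": image_path,
        "sha256": hash_image(image_path),
        "examiner": examiner,
        "acquired_at": datetime.datetime.utcnow().isoformat() + "Z",
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry

if __name__ == "__main__":
    # Hypothetical snapshot file; re-hashing it later must reproduce the same digest.
    print(record_custody_entry("vm_snapshot.img", examiner="Investigator A"))

In practice the custody log itself would also have to be protected (for example stored with a trusted third party), but even this small step records who took possession of which evidence and when.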
B. Cloud Computing

According to the NIST [16], cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal CSP interaction. The new raw definition of cloud computing brought several new characteristics such as multi-tenancy, elasticity, pay-as-you-go and reliability. Within this work, the following three models are used:

In the Infrastructure as a Service (IaaS) model, the customer is using the virtual machine provided by the CSP for installing his own system on it. The system can be used like any other physical computer with a few limitations. However, the additive customer power over the system comes along with additional security obligations. Platform as a Service (PaaS) offerings provide the capability to deploy application packages created using the virtual development environment supported by the CSP. For the efficiency of the software development process this service model can be propellent. In the Software as a Service (SaaS) model, the customer makes use of a service run by the CSP on a cloud infrastructure. In most of the cases this service can be accessed through an API for a thin client interface such as a web browser. Closed-source public SaaS offers such as Amazon S3 and GoogleMail can only be used in the public deployment model, leading to further issues concerning security, privacy and the gathering of suitable evidence.

Furthermore, two main deployment models, private and public cloud, have to be distinguished. Common public clouds are made available to the general public. The corresponding infrastructure is owned by one organization acting as a CSP and offering services to its customers. In contrast, the private cloud is exclusively operated for an organization but may not provide the scalability and agility of public offers. The additional notions of community and hybrid cloud are not exclusively covered within this work. However, independently from the specific model used, the movement of applications and data to the cloud comes along with limited control for the customer about the application itself, the data pushed into the applications and also about the underlying technical infrastructure.

C. Fault Model

Be it an account for a SaaS application, a development environment (PaaS) or a virtual image of an IaaS environment, systems in the cloud can be affected by inconsistencies. Hence, for both customer and CSP it is crucial to have the ability to assign faults to the causing party, even in the presence of Byzantine behavior [33]. Generally, inconsistencies can be caused by the following two reasons:

1) Maliciously Intended Faults

Internal or external adversaries with specific malicious intentions can cause faults on cloud instances or applications. Economic rivals as well as former employees can be the reason for these faults and constitute a constant threat to customers and CSP. In this model, a malicious CSP is also included, albeit he is assumed to be rare in real world scenarios. Additionally, from the technical point of view, the movement of computing power to a virtualized, multi-tenant environment can pose further threats and risks to the systems. One reason for this is that if a single system or service in the cloud is compromised, all other guest systems and even the host system are at risk.
Hence, besides the need for further security measures, precautions for potential forensic investigations have to be taken into consideration.

2) Unintentional Faults

Inconsistencies in technical systems or processes in the cloud do not implicitly have to be caused by malicious intent. Internal communication errors or human failures can lead to issues in the services offered to the customer (i.e. loss or modification of data). Although these failures are not caused intentionally, both the CSP and the customer have a strong intention to discover the reasons and deploy corresponding fixes.

IV. TECHNICAL ISSUES

Digital investigations are about control of forensic evidence data. From the technical standpoint, this data can be available in three different states: at rest, in motion or in execution. Data at rest is represented by allocated disk space. Whether the data is stored in a database or in a specific file format, it allocates disk space. Furthermore, if a file is deleted, the disk space is de-allocated for the operating system but the data is still accessible since the disk space has not been re-allocated and overwritten. This fact is often exploited by investigators who explore this de-allocated disk space on hard disks. In case the data is in motion, data is transferred from one entity to another, e.g. a typical file transfer over a network can be seen as a data-in-motion scenario. Several encapsulated protocols contain the data, each leaving specific traces on systems and network devices which can in return be used by investigators. Data can be loaded into memory and executed as a process. In this case, the data is neither at rest nor in motion but in execution. On the executing system, process information, machine instructions and allocated/de-allocated data can be analyzed by creating a snapshot of the current system state. In the following sections, we point out the potential sources for evidential data in cloud environments and discuss the technical issues of digital investigations in XaaS environments, as well as suggest several solutions to these problems.

A. Sources and Nature of Evidence

Concerning the technical aspects of forensic investigations, the amount of potential evidence available to the investigator strongly diverges between the different cloud service and deployment models. The virtual machine (VM), hosting in most of the cases the server application, provides several pieces of information that could be used by investigators. On the network level, network components can provide information about possible communication channels between different parties involved. The browser on the client, acting often as the user agent for communicating with the cloud, also contains a lot of information that could be used as evidence in a forensic investigation. Independently from the used model, the following three components could act as sources for potential evidential data.

1) Virtual Cloud Instance: The VM within the cloud, where i.e. data is stored or processes are handled, contains potential evidence [2], [3]. In most of the cases, it is the place where an incident happened and hence provides a good starting point for a forensic investigation. The VM instance can be accessed by both the CSP and the customer who is running the instance. Furthermore, virtual introspection techniques [25] provide access to the runtime state of the VM via the hypervisor, and snapshot technology supplies a powerful technique for the customer to freeze specific states of the VM.
Therefore, virtual instances can still be running during analysis, which leads to the case of live investigations [41], or can be turned off, leading to static image analysis. In SaaS and PaaS scenarios, the ability to access the virtual instance for gathering evidential information is highly limited or simply not possible.

2) Network Layer: Traditional network forensics is known as the analysis of network traffic logs for tracing events that have occurred in the past. Since the different ISO/OSI network layers provide several pieces of information on protocols and communication between instances within as well as with instances outside the cloud [4], [5], [6], network forensics is theoretically also feasible in cloud environments. However in practice, ordinary CSP currently do not provide any log data from the network components used by the customer's instances or applications. For instance, in case of a malware infection of an IaaS VM, it will be difficult for the investigator to get any form of routing information and network log data in general, which is crucial for further investigative steps. This situation gets even more complicated in case of PaaS or SaaS. So again, the situation of gathering forensic evidence is strongly affected by the support the investigator receives from the customer and the CSP.

3) Client System: On the system layer of the client, it completely depends on the used model (IaaS, PaaS, SaaS) if and where potential evidence could be extracted. In most of the scenarios, the user agent (e.g. the web browser) on the client system is the only application that communicates with the service in the cloud. This especially holds for SaaS applications which are used and controlled by the web browser. But also in IaaS scenarios, the administration interface is often controlled via the browser. Hence, in an exhaustive forensic investigation, the evidence data gathered from the browser environment [7] should not be omitted.

a) Browser Forensics: Generally, the circumstances leading to an investigation have to be differentiated: in ordinary scenarios, the main goal of an investigation of the web browser is to determine if a user has been victim of a crime. In complex SaaS scenarios with high client-server interaction, this constitutes a difficult task. Additionally, customers strongly make use of third-party extensions [17] which can be abused for malicious purposes. Hence, the investigator might want to look for malicious extensions, searches performed, websites visited, files downloaded, information entered in forms or stored in local HTML5 stores, web-based email contents and persistent browser cookies for gathering potential evidence data. Within this context, it is inevitable to investigate the appearance of malicious JavaScript [18] leading to e.g. unintended AJAX requests and hence modified usage of administration interfaces. Generally, the web browser contains a lot of electronic evidence data that could be used to give an answer to both of the above questions - even if the private mode is switched on [19].

B. Investigations in XaaS Environments

Traditional digital forensic methodologies permit investigators to seize equipment and perform detailed analysis on the media and data recovered [11]. In a distributed infrastructure organization like the cloud computing environment, investigators are confronted with an entirely different situation. They no longer have the option of seizing physical data storage.
Data and processes of the customer are dispensed over an undisclosed amount of virtual instances, applications and network elements. Hence, it is in question whether preliminary findings of the computer forensic community in the field of digital forensics have to be revised and adapted to the new environment. Within this section, specific issues of investigations in SaaS, PaaS and IaaS environments will be discussed. In addition, cross-disciplinary issues which affect several environments uniformly will be taken into consideration. We also suggest potential solutions to the mentioned problems.

1) SaaS Environments: Especially in the SaaS model, the customer does not obtain any control of the underlying operating infrastructure such as network, servers, operating systems or the application that is used. This means that no deeper view into the system and its underlying infrastructure is provided to the customer. Only limited user-specific application configuration settings can be controlled, contributing to the evidence which can be extracted from the client (see section IV-A3). In a lot of cases this urges the investigator to rely on high-level logs which are eventually provided by the CSP. Given the case that the CSP does not run any logging application, the customer has no opportunity to create any useful evidence through the installation of any toolkit or logging tool. These circumstances do not allow a valid forensic investigation and lead to the assumption that customers of SaaS offers do not have any chance to analyze potential incidences.

a) Data Provenance: The notion of Digital Provenance is known as meta-data that describes the ancestry or history of digital objects. Secure provenance that records ownership and process history of data objects is vital to the success of data forensics in cloud environments, yet it is still a challenging issue today [8]. Albeit data provenance is of high significance also for IaaS and PaaS, it states a huge problem specifically for SaaS-based applications: current globally acting public SaaS CSP offer Single Sign-On (SSO) access control to the set of their services. Unfortunately in case of an account compromise, most of the CSP do not offer any possibility for the customer to figure out which data and information has been accessed by the adversary. For the victim, this situation can have tremendous impact: if sensitive data has been compromised, it is unclear which data has been leaked and which has not been accessed by the adversary. Additionally, data could be modified or deleted by an external adversary or even by the CSP, e.g. due to storage reasons. The customer has no ability to prove otherwise. Secure provenance mechanisms for distributed environments can improve this situation but have not been practically implemented by CSP [10].

Suggested Solution: In private SaaS scenarios this situation is improved by the fact that the customer and the CSP are probably under the same authority. Hence, logging and provenance mechanisms could be implemented which contribute to potential investigations. Additionally, the exact location of the servers and the data is known at any time. Public SaaS CSP should offer additional interfaces for the purpose of compliance, forensics, operations and security matters to their customers. Through an API, the customers should have the ability to receive specific information such as access, error and event logs that could improve their situation in case of an investigation.
Furthermore, due to the limited ability of receiving forensic information from the server and proving integrity of stored data in SaaS scenarios, the client has to contribute to this process. This could be achieved by implementing Proofs of Retrievability (POR), in which a verifier (client) is enabled to determine that a prover (server) possesses a file or data object and that it can be retrieved unmodified [24]. Provable Data Possession (PDP) techniques [37] could be used to verify that an untrusted server possesses the original data without the need for the client to retrieve it. Although these cryptographic proofs have not been implemented by any CSP, the authors of [23] introduced a new data integrity verification mechanism for SaaS scenarios which could also be used for forensic purposes.

2) PaaS Environments: One of the main advantages of the PaaS model is that the developed software application is under the control of the customer and, except for some CSP, the source code of the application does not have to leave the local development environment. Given these circumstances, the customer theoretically obtains the power to dictate how the application interacts with other dependencies such as databases, storage entities etc. CSP normally claim this transfer is encrypted, but this statement can hardly be verified by the customer. Since the customer has the ability to interact with the platform over a prepared API, system states and specific application logs can be extracted. However potential adversaries, which can compromise the application during runtime, should not be able to alter these log files afterwards.

Suggested Solution: Depending on the runtime environment, logging mechanisms could be implemented which automatically sign and encrypt the log information before its transfer to a central logging server under the control of the customer. Additional signing and encrypting could prevent potential eavesdroppers from being able to view and alter log data information on the way to the logging server. Runtime compromise of a PaaS application by adversaries could be monitored by push-only mechanisms for log data, presupposing that the information needed to detect such an attack is logged. Increasingly, CSP offering PaaS solutions give developers the ability to collect and store a variety of diagnostics data in a highly configurable way with the help of runtime feature sets [38].

3) IaaS Environments: As expected, even virtual instances in the cloud get compromised by adversaries. Hence, the ability to determine how defenses in the virtual environment failed and to what extent the affected systems have been compromised is crucial not only for recovering from an incident. Also forensic investigations gain leverage from such information and contribute to resilience against future attacks on the systems. From the forensic point of view, IaaS instances do provide much more evidence data usable for potential forensics than PaaS and SaaS models do. This fact is caused through the ability of the customer to install and set up the image for forensic purposes before an incident occurs. Hence, as proposed for PaaS environments, log data and other forensic evidence information could be signed and encrypted before it is transferred to third-party hosts, mitigating the chance that a maliciously motivated shutdown process destroys the volatile data. Although IaaS environments provide plenty of potential evidence, it has to be emphasized that the customer VM is in the end still under the control of the CSP.
He controls the hypervisor which is e.g. responsible for enforcing hardware boundaries and routing hardware requests among different VM. Hence, besides the security responsibilities of the hypervisor, he exerts tremendous control over how customer’s VM communicate with the hardware and theoretically can intervene executed processes on the hosted virtual instance through virtual introspection [25]. This could also affect encryption or signing processes executed on the VM and therefore leading to the leakage of the secret key. Although this risk can be disregarded in most of the cases, the impact on the security of high security environments is tremendous.a) Snapshot Analysis: Traditional forensics expect target machines to be powered down to collect an image (dead virtual instance). This situation completely changed with the advent of the snapshot technology which is supported by all popular hypervisors such as Xen, VMware ESX and Hyper-V.A snapshot, also referred to as the forensic image of a VM, providesa powerful tool with which a virtual instance can be clonedby one click including also the running system’s mem ory. Due to the invention of the snapshot technology, systems hosting crucial business processes do not have to be powered down for forensic investigation purposes. The investigator simply creates and loads a snapshot of the target VM for analysis(live virtual instance). This behavior is especially important for scenarios in which a downtime of a system is not feasible or practical due to existing SLA. However the information whether the machine is running or has been properly powered down is crucial [3] for the investigation. Live investigations of running virtual instances become more common providing evidence data that。

Cloud-computing云计算纯英文介绍PPT课件 (English-language introductory slide deck; slide outline)

Slide outline:
- Virtualization Technology
- Distributed computing technology: a complex task is divided into part tasks, each part task produces its own result, and the results are combined (in contrast with single computing); shown against a development timeline (1960, 1980 and later years)
- Features of cloud computing: large-scale, common, cheap, reliable, convenient
- Specific characteristics: data in the cloud is not afraid to be lost, does not have to be backed up, and can be restored at any point; powerful computing with unlimited space and unlimited speed; cloud applications (云应用)
- Technical support for cloud computing: distributed computing technology, distributed storage technology, data management technology, and cloud computing platform administration technology
- Evolution of computing models: PC → C/S → Cloud Computing
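
The "complex task / part task / result" slides sketch the split-compute-combine idea behind distributed computing. Below is a minimal single-machine illustration of that idea (the summing task is purely illustrative; real cloud platforms spread the part tasks across many servers):

```python
from concurrent.futures import ProcessPoolExecutor

def part_task(chunk):
    """Work on one part of the complex task: here, summing a slice of numbers."""
    return sum(chunk)

def distributed_sum(numbers, workers=4):
    """Split a complex task into part tasks, run them in parallel, combine the results."""
    size = max(1, len(numbers) // workers)
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(part_task, chunks))  # one result per part task
    return sum(partial_results)  # combine the part results into the final result

if __name__ == "__main__":
    data = list(range(1_000_000))
    # The combined result matches what single computing would produce, only the work is split.
    assert distributed_sum(data) == sum(data)
```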

云计算Cloud-Computing-外文翻译


Graduation design documentation: English literature and Chinese translation. School of Computer and Control Engineering, June 2017.

English literature: Cloud Computing

Cloud Computing at a Higher Level

In many ways, cloud computing is simply a metaphor for the Internet, the increasing movement of compute and data resources onto the Web. But there's a difference: cloud computing represents a new tipping point for the value of network computing. It delivers higher efficiency, massive scalability, and faster, easier software development. It's about new programming models, new IT infrastructure, and the enabling of new business models.

For those developers and enterprises who want to embrace cloud computing, Sun is developing critical technologies to deliver enterprise scale and systemic qualities to this new paradigm: (1) Interoperability: while most current clouds offer closed platforms and vendor lock-in, developers clamor for interoperability.
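
As a rough sketch of why interoperability matters to developers: code written against a provider-neutral interface is not locked to a single cloud vendor. The interface, the in-memory stand-in backend and the function names below are hypothetical and not part of any real SDK.

```python
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """Provider-neutral interface; applications code against this, not a vendor SDK."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryStore(ObjectStore):
    """Stand-in backend; a real deployment would wrap a specific provider's SDK here."""
    def __init__(self):
        self._blobs = {}
    def put(self, key, data):
        self._blobs[key] = data
    def get(self, key):
        return self._blobs[key]

def archive_report(store: ObjectStore, report: bytes) -> None:
    # Only the ObjectStore interface is used, so swapping providers does not touch this code.
    store.put("reports/latest", report)

archive_report(InMemoryStore(), b"quarterly figures")
```

Because the application only depends on the abstract interface, moving between clouds means writing one new backend class rather than rewriting the application.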

云计算与物联网外文翻译文献


Document information
Title: Integration of Cloud Computing with Internet of Things: Challenges and Open Issues (云计算与物联网的集成:挑战与开放问题)
Authors: HF Atlam et al.
Source: IEEE International Conference on Internet of Things, 2017
Word count: 4,176 English words (23,870 characters); Chinese translation: 7,457 Chinese characters

Foreign literature:

Integration of Cloud Computing with Internet of Things: Challenges and Open Issues

Abstract: The Internet of Things (IoT) is becoming the next Internet-related revolution. It allows billions of devices to be connected and communicate with each other to share information that improves the quality of our daily lives. On the other hand, Cloud Computing provides on-demand, convenient and scalable network access which makes it possible to share computing resources; indeed, this, in turn, enables dynamic data integration from various data sources. There are many issues standing in the way of the successful implementation of both Cloud and IoT. The integration of Cloud Computing with the IoT is the most effective way on which to overcome these issues. The vast number of resources available on the Cloud can be extremely beneficial for the IoT, while the Cloud can gain more publicity to improve its limitations with real world objects in a more dynamic and distributed manner. This paper provides an overview of the integration of the Cloud into the IoT by highlighting the integration benefits and implementation challenges. Discussion will also focus on the architecture of the resultant Cloud-based IoT paradigm and its new application scenarios. Finally, open issues and future research directions are also suggested.

Keywords: Cloud Computing, Internet of Things, Cloud based IoT, Integration.

I. INTRODUCTION

It is important to explore the common features of the technologies involved in the field of computing. Indeed, this is certainly the case with Cloud Computing and the Internet of Things (IoT), two paradigms which share many common features. The integration of these numerous concepts may facilitate and improve these technologies. Cloud computing has altered the way in which technologies can be accessed, managed and delivered. It is widely agreed that Cloud computing can be used for utility services in the future. Although many consider Cloud computing to be a new technology, it has, in actual fact, been involved in and encompassed various technologies such as grid, utility computing, virtualisation, networking and software services. Cloud computing provides services which make it possible to share computing resources across the Internet. As such, it is not surprising that the origins of Cloud technologies lie in grid, utility computing, virtualisation, networking and software services, as well as distributed computing, and parallel computing. On the other hand, the IoT can be considered both a dynamic and global networked infrastructure that manages self-configuring objects in a highly intelligent way. The IoT is moving towards a phase where all items around us will be connected to the Internet and will have the ability to interact with minimum human effort. The IoT normally includes a number of objects with limited storage and computing capacity. It could well be said that Cloud computing and the IoT will be the future of the Internet and next-generation technologies.
However, Cloud services are dependent on service providers which are extremely interoperable, while IoT technologies are based on diversity rather than interoperability.This paper provides an overview of the integration of Cloud Computing into the IoT; this involves an examination of the benefits resulting from the integration process and the implementation challenges encountered. Open issues and research directions are also discussed. The remainder of the paper is organised as follows: Section II provides the basic concepts of Cloud computing, IoT, and Cloud-based IoT; SectionIII discusses the benefits of integrating the IoT into the Cloud; Could-based IoT Architecture is presented in section IV; Section V illustrates different Cloud-based IoT applications scenarios. Following this, the challenges facing Cloud-based IoT integration and open research directions are discussed in Section VI and Section VII respectively, before Section VIII concludes the paper.II.BASIC CONCEPTSThis section reviews the basic concepts of Cloud Computing, the IoT, and Cloud-based IoT.1.Cloud ComputingThere exist a number of proposed definitions for Cloud computing, although the most widely agreed upon seems be that put forth by the National Institute of Standards and Technology (NIST). Indeed, the NIST has defined Cloud computing as "a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction".As stated in this definition, Cloud computing comprises four types of deployment models, three different service models, and five essential characteristics.Cloud computing deployment models are most commonly classified as belonging to the public Cloud, where resources are made available to consumers over the Internet. Public Clouds are generally owned by a profitable organisation (e.g. Amazon EC2). Conversely, the infrastructure of a private Cloud is commonly provided by a single organisation to serve the particular purposes of its users. The private Cloud offers a secure environment and a higher level of control (Microsoft Private Cloud). Hybrid Clouds are a mixture of private and public Clouds. This choice is provided for consumers as it makes it possible to overcome some of the limitations of each model. In contrast, a community Cloud is a Cloud infrastructure which is delivered to a group of users by a number of organisations which share the same need.In order to allow consumers to choose the service that suits them, services inCloud computing are provided at three different levels, namely: the Software as a Service (SaaS) model, where software is delivered through the Internet to users (e.g. GoogleApps); the Platform as a Service (PaaS) model, which offers a higher level of integrated environment that can build, test, and deploy specific software (e.g. Microsoft Azure); and finally, with the Infrastructure as a Service (IaaS) model, infrastructure such as storage, hardware and servers are delivered as a service (e.g. Amazon Web Services).2.Internet of ThingsThe IoT represents a modern approach where boundaries between real and digital domains are progressively eliminated by consistently changing every physical device to a smart alternative ready to provide smart services. All things in the IoT (smart devices, sensors, etc.) have their own identity. 
They are combined to form the communication network and will become actively participating objects. These objects include not only daily usable electronic devices, but also things like food, clothing, materials, parts, and subassemblies; commodities and luxury items; monuments and landmarks; and various forms of commerce and culture. In addition, these objects are able to create requests and alter their states. Thus, all IoT devices can be monitored, tracked and counted, which significantly decreases waste, loss, and cost.The concept of the IoT was first mentioned by Kevin Ashton in 1999, when he stated that “The Internet of Things has the potential to change the world, just as the Internet did. Maybe even more so”. Later, the IoT was formally presented by the International Telecommunication Union (ITU) in 2005. A great many definitions of the IoT have been put forth by numerous organisations and researchers. According to the ITU (2012), the IoT is “a global infrastructure for the Information Society, enabling advanced services by interconnecting (physical and virtual) things based on, existing and evolving, interoperable information and communication technologies”. The IoT introduces a variety of opportunities and applications. However, it faces many challenges which could potentially hinder its successful implementation, such as data storage, heterogeneous resource-constrained, scalability, Things, variable geospatial deployment, and energy efficiency.3.Cloud-Based Internet of ThingsThe IoT and Cloud computing are both rapidly developing services, and have their own unique characteristics. On the one hand, the IoT approach is based on smart devices which intercommunicate in a global network and dynamic infrastructure. It enables ubiquitous computing scenarios. The IoT is typically characterised by widely-distributed devices with limited processing capabilities and storage. These devices encounter issues regarding performance, reliability, privacy, and security. On the other hand, Cloud computing comprises a massive network with unlimited storage capabilities and computation power. Furthermore, it provides a flexible, robust environment which allows for dynamic data integration from various data sources. Cloud computing has partially resolved most of the IoT issues. Indeed, the IoT and Cloud are two comparatively challenging technologies, and are being combined in order to change the current and future environment of internetworking services.The Cloud-based Internet of Things is a platform which allows for the smart usage of applications, information, and infrastructure in a cost-effective way. While the IoT and Cloud computing are different from each other, their features are almost complementary, as shown in TABLE 1. This complementarity is the primary reason why many researchers have proposed their integration.TABLE 1. COMPARISON OF THE IOT WITH CLOUD COMPUTINGIII.BENEFITS OF INTEGRATING IOT WITH CLOUDSince the IoT suffers from limited capabilities in terms of processing power and storage, it must also contend with issues such as performance, security, privacy, reliability. The integration of the IoT into the Cloud is certainly the best way toovercome most of these issues. The Cloud can even benefit from the IoT by expanding its limits with real world objects in a more dynamic and distributed way, and providing new services for billions of devices in different real life scenarios. 
In addition, the Cloud provides simplicity of use and reduces the cost of the usage of applications and services for end-users. The Cloud also simplifies the flow of the IoT data gathering and processing, and provides quick, low-cost installation and integration for complex data processing and deployment. The benefits of integrating IoT into Cloud are discussed in this section as follows.municationApplication and data sharing are two significant features of the Cloud-based IoT paradigm. Ubiquitous applications can be transmitted through the IoT, whilst automation can be utilised to facilitate low-cost data distribution and collection. The Cloud is an effective and economical solution which can be used to connect, manage, and track anything by using built-in apps and customised portals. The availability of fast systems facilitates dynamic monitoring and remote objects control, as well as data real-time access. It is worth declaring that, although the Cloud can greatly develop and facilitate the IoT interconnection, it still has weaknesses in certain areas. Thus, practical restrictions can appear when an enormous amount of data needs to be transferred from the Internet to the Cloud.2.StorageAs the IoT can be used on billions of devices, it comprises a huge number of information sources, which generate an enormous amount of semi-structured or non-structured data. This is known as Big Data, and has three characteristics: variety (e.g. data types), velocity (e.g. data generation frequency), and volume (e.g. data size). The Cloud is considered to be one of the most cost-effective and suitable solutions when it comes to dealing with the enormous amount of data created by the IoT. Moreover, it produces new chances for data integration, aggregation, and sharing with third parties.3.Processing capabilitiesIoT devices are characterised by limited processing capabilities which preventon-site and complex data processing. Instead, gathered data is transferred to nodes that have high capabilities; indeed, it is here that aggregation and processing are accomplished. However, achieving scalability remains a challenge without an appropriate underlying infrastructure. Offering a solution, the Cloud provides unlimited virtual processing capabilities and an on-demand usage model. Predictive algorithms and data-driven decisions making can be integrated into the IoT in order to increase revenue and reduce risks at a lower cost.4.ScopeWith billions of users communicating with one another together and a variety of information being collected, the world is quickly moving towards the Internet of Everything (IoE) realm - a network of networks with billions of things that generate new chances and risks. The Cloud-based IoT approach provides new applications and services based on the expansion of the Cloud through the IoT objects, which in turn allows the Cloud to work with a number of new real world scenarios, and leads to the emergence of new services.5.New abilitiesThe IoT is characterised by the heterogeneity of its devices, protocols, and technologies. Hence, reliability, scalability, interoperability, security, availability and efficiency can be very hard to achieve. Integrating IoT into the Cloud resolves most of these issues. It provides other features such as ease- of-use and ease-of-access, with low deployment costs.6.New ModelsCloud-based IoT integration empowers new scenarios for smart objects, applications, and services. 
Some of the new models are listed as follows:•SaaS (Sensing as a Service), which allows access to sensor data;•EaaS (Ethernet as a Service), the main role of which is to provide ubiquitous connectivity to control remote devices;•SAaaS (Sensing and Actuation as a Service), which provides control logics automatically;•IPMaaS (Identity and Policy Management as a Service), which provides access to policy and identity management;•DBaaS (Database as a Service), which provides ubiquitous database management;•SEaaS (Sensor Event as a Service), which dispatches messaging services that are generated by sensor events;•SenaaS (Sensor as a Service), which provides management for remote sensors;•DaaS (Data as a Service), which provides ubiquitous access to any type of data.IV.CLOUD-BASED IOT ARCHITECTUREAccording to a number of previous studies, the well-known IoT architecture is typically divided into three different layers: application, perception and network layer. Most assume that the network layer is the Cloud layer, which realises the Cloud-based IoT architecture, as depicted in Fig. 1.Fig. 1. Cloud-based IoT architectureThe perception layer is used to identify objects and gather data, which is collected from the surrounding environment. In contrast, the main objective of the network layer is to transfer the collected data to the Internet/Cloud. Finally, the application layer provides the interface to different services.V.CLOUD-BASED IOT APPLICATIONSThe Cloud-based IoT approach has introduced a number of applications and smart services, which have affected end users’ daily lives. TABLE 2 presents a brief discussion of certain applications which have been improved by the Cloud-based IoT paradigm.TABLE 2. CLOUD-BASED IOT APPLICATIONSVI.CHALLENGES FACING CLOUD-BASED IOT INTEGRATION There are many challenges which could potentially prevent the successful integration of the Cloud-based IoT paradigm. These challenges include:1.Security and privacyCloud-based IoT makes it possible to transport data from the real world to theCloud. Indeed, one particularly important issues which has not yet been resolved is how to provide appropriate authorisation rules and policies while ensuring that only authorised users have access to the sensitive data; this is crucial when it comes to preserving users’ privacy, and particularly when data integrity must be guaranteed. In addition, when critical IoT applications move into the Cloud, issues arise because of the lack of trust in the service provider, information regarding service level agreements (SLAs), and the physical location of data. Sensitive information leakage can also occur due to the multi-tenancy. Moreover, public key cryptography cannot be applied to all layers because of the processing power constraints imposed by IoT objects. New challenges also require specific attention; for example, the distributed system is exposed to number of possible attacks, such as SQL injection, session riding, cross- site scripting, and side-channel. Moreover, important vulnerabilities, including session hijacking and virtual machine escape are also problematic.2.HeterogeneityOne particularly important challenge faced by the Cloud- based IoT approach is related to the extensive heterogeneity of devices, platforms, operating systems, and services that exist and might be used for new or developed applications. 
Cloud platforms suffer from heterogeneity issues; for instance, Cloud services generally come with proprietary interfaces, thus allowing for resource integration based on specific providers. In addition, the heterogeneity challenge can be exacerbated when end-users adopt multi-Cloud approaches, and thus services will depend on multiple providers to improve application performance and resilience.3.Big dataWith many predicting that Big Data will reach 50 billion IoT devices by 2020, it is important to pay more attention to the transportation, access, storage and processing of the enormous amount of data which will be produced. Indeed, given recent technological developments, it is clear that the IoT will be one of the core sources of big data, and that the Cloud can facilitate the storage of this data for a long period of time, in addition to subjecting it to complex analysis. Handling the huge amount of data produced is a significant issue, as the application’s whole performance is heavilyreliant on the properties of this data management service. Finding a perfect data management solution which will allow the Cloud to manage massive amounts of data is still a big issue. Furthermore, data integrity is a vital element, not only because of its effect on the service’s quality, but also because of security and privacy issues, the majority of which relate to outsourced data.4.PerformanceTransferring the huge amount of data created from IoT devices to the Cloud requires high bandwidth. As a result, the key issue is obtaining adequate network performance in order to transfer data to Cloud environments; indeed, this is because broadband growth is not keeping pace with storage and computation evolution. In a number of scenarios, services and data provision should be achieved with high reactivity. This is because timeliness might be affected by unpredictable matters and real-time applications are very sensitive to performance efficiency.5.Legal aspectsLegal aspects have been very significant in recent research concerning certain applications. For instance, service providers must adapt to various international regulations. On the other hand, users should give donations in order to contribute to data collection.6.MonitoringMonitoring is a primary action in Cloud Computing when it comes to performance, managing resources, capacity planning, security, SLAs, and for troubleshooting. As a result, the Cloud-based IoT approach inherits the same monitoring demands from the Cloud, although there are still some related challenges that are impacted by velocity, volume, and variety characteristics of the IoT.rge scaleThe Cloud-based IoT paradigm makes it possible to design new applications that aim to integrate and analyse data coming from the real world into IoT objects. This requires interacting with billions of devices which are distributed throughout many areas. The large scale of the resulting systems raises many new issues that are difficult to overcome. For instance, achieving computational capability and storage capacityrequirements is becoming difficult. Moreover, the monitoring process has made the distribution of the IoT devices more difficult, as IoT devices have to face connectivity issues and latency dynamics.VII.OPEN ISSUES AND RESEARCH DIRECTIONSThis section will address some of the open issues and future research directions related to Cloud-based IoT, and which still require more research efforts. 
These issues include:1.StandardisationMany studies have highlighted the issues of lack of standards, which is considered critical in relation to the Cloud- based IoT paradigm. Although a number of proposed standardisations have been put forth by the scientific society for the deployment of IoT and Cloud approaches, it is obvious that architectures, standard protocols, and APIs are required to allow for interconnection between heterogeneous smart things and the generation of new services, which make up the Cloud- based IoT paradigm.2.Fog ComputingFog computing is a model which extends Cloud computing services to the edge of the network. Similar to the Cloud, Fog supply communicates application services to users. Fog can essentially be considered an extension of Cloud Computing which acts as an intermediate between the edge of the network and the Cloud; indeed, it works with latency-sensitive applications that require other nodes to satisfy their delay requirements. Although storage, computing, and networking are the main resources of both Fog and the Cloud, the Fog has certain features, such as location awareness and edge location, that provide geographical distribution, and low latency; moreover, there are a large nodes; this is in contrast with the Cloud, which is supported for real-time interaction and mobility.3.Cloud CapabilitiesAs in any networked environment, security is considered to be one of the main issues of the Cloud-based IoT paradigm. There are more chances of attacks on boththe IoT and the Cloud side. In the IoT context, data integrity, confidentiality and authenticity can be guaranteed by encryption. However, insider attacks cannot be resolved and it is also hard to use the IoT on devices with limited capabilities.4.SLA enforcementCloud-based IoT users need created data to be conveyed and processed based on application-dependent limitations, which can be tough in some cases. Ensuring a specific Quality of Service (QoS) level regarding Cloud resources by depending on a single provider raises many issues. Thus, multiple Cloud providers may be required to avoid SLA violations. However, dynamically choosing the most appropriate mixture of Cloud providers still represents an open issue due to time, costs, and heterogeneity of QoS management support.5.Big dataIn the previous section, we discussed Big Data as a critical challenge that is tightly coupled with the Cloud-based IoT paradigm. Although a number of contributions have been proposed, Big Data is still considered a critical open issue, and one in need of more research. The Cloud-based IoT approach involves the management and processing of huge amounts of data stemming from various locations and from heterogeneous sources; indeed, in the Cloud-based IoT, many applications need complicated tasks to be performed in real-time.6.Energy efficiencyRecent Cloud-based IoT applications include frequent data that is transmitted from IoT objects to the Cloud, which quickly consumes the node energy. Thus, generating efficient energy when it comes to data processing and transmission remains a significant open issue. Several directions have been suggested to overcome this issue, such as compression technologies, efficient data transmission; and data caching techniques for reusing collected data with time-insensitive application.7.Security and privacyAlthough security and privacy are both critical research issues which have received a great deal of attention, they are still open issues which require more efforts. 
Indeed, adapting to different threats from hackers is still an issue. Moreover, another problem is providing the appropriate authorisation rules and policies while ensuring that only authorised users have access to sensitive data; this is crucial for preserving users' privacy, specifically when data integrity must be guaranteed.

VIII. CONCLUSION

The IoT is becoming an increasingly ubiquitous computing service which requires huge volumes of data storage and processing capabilities. The IoT has limited capabilities in terms of processing power and storage, while there also exist consequential issues such as security, privacy, performance, and reliability; as such, the integration of the Cloud into the IoT is very beneficial in terms of overcoming these challenges. In this paper, we presented the need for the creation of the Cloud-based IoT approach. Discussion also focused on the Cloud-based IoT architecture, different application scenarios, challenges facing the successful integration, and open research directions. In future work, a number of case studies will be carried out to test the effectiveness of the Cloud-based IoT approach in healthcare applications.

Chinese translation: Integration of Cloud Computing with Internet of Things: Challenges and Open Issues. Abstract: The Internet of Things (IoT) is becoming the next Internet-related revolution.
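
To make the three-layer Cloud-based IoT architecture of Fig. 1 (perception, network and application layers) concrete, here is a minimal, self-contained sketch; the sensor readings, field names and the aggregation done in the application layer are invented purely for illustration.

```python
import json
import random
import statistics

def perception_layer(n=5):
    """Identify objects and gather data from the surrounding environment (simulated sensors)."""
    return [{"sensor_id": i, "temperature_c": round(random.uniform(18, 30), 1)} for i in range(n)]

def network_layer(readings):
    """Transfer the collected data to the Internet/Cloud (here: serialise it for transport)."""
    return json.dumps(readings).encode("utf-8")

def application_layer(payload):
    """Provide the interface to services: aggregate the cloud-stored data for an end user."""
    readings = json.loads(payload.decode("utf-8"))
    avg = statistics.mean(r["temperature_c"] for r in readings)
    return {"average_temperature_c": round(avg, 1)}

# End-to-end flow: perception -> network/Cloud -> application.
payload = network_layer(perception_layer())
print(application_layer(payload))
```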

云计算外文翻译参考文献


云计算外文翻译参考文献(文档含中英文对照即英文原文和中文翻译)原文:Technical Issues of Forensic Investigations in Cloud Computing EnvironmentsDominik BirkRuhr-University BochumHorst Goertz Institute for IT SecurityBochum, GermanyRuhr-University BochumHorst Goertz Institute for IT SecurityBochum, GermanyAbstract—Cloud Computing is arguably one of the most discussedinformation technologies today. It presents many promising technological and economical opportunities. However, many customers remain reluctant to move their business IT infrastructure completely to the cloud. One of their main concerns is Cloud Security and the threat of the unknown. Cloud Service Providers(CSP) encourage this perception by not letting their customers see what is behind their virtual curtain. A seldomly discussed, but in this regard highly relevant open issue is the ability to perform digital investigations. This continues to fuel insecurity on the sides of both providers and customers. Cloud Forensics constitutes a new and disruptive challenge for investigators. Due to the decentralized nature of data processing in the cloud, traditional approaches to evidence collection and recovery are no longer practical. This paper focuses on the technical aspects of digital forensics in distributed cloud environments. We contribute by assessing whether it is possible for the customer of cloud computing services to perform a traditional digital investigation from a technical point of view. Furthermore we discuss possible solutions and possible new methodologies helping customers to perform such investigations.I. INTRODUCTIONAlthough the cloud might appear attractive to small as well as to large companies, it does not come along without its own unique problems. Outsourcing sensitive corporate data into the cloud raises concerns regarding the privacy and security of data. Security policies, companies main pillar concerning security, cannot be easily deployed into distributed, virtualized cloud environments. This situation is further complicated by the unknown physical location of the companie’s assets. Normally,if a security incident occurs, the corporate security team wants to be able to perform their own investigation without dependency on third parties. In the cloud, this is not possible anymore: The CSP obtains all the power over the environmentand thus controls the sources of evidence. In the best case, a trusted third party acts as a trustee and guarantees for the trustworthiness of the CSP. Furthermore, the implementation of the technical architecture and circumstances within cloud computing environments bias the way an investigation may be processed. In detail, evidence data has to be interpreted by an investigator in a We would like to thank the reviewers for the helpful comments and Dennis Heinson (Center for Advanced Security Research Darmstadt - CASED) for the profound discussions regarding the legal aspects of cloud forensics. proper manner which is hardly be possible due to the lackof circumstantial information. For auditors, this situation does not change: Questions who accessed specific data and information cannot be answered by the customers, if no corresponding logs are available. With the increasing demand for using the power of the cloud for processing also sensible information and data, enterprises face the issue of Data and Process Provenance in the cloud [10]. Digital provenance, meaning meta-data that describes the ancestry or history of a digital object, is a crucial feature for forensic investigations. 
In combination with a suitable authentication scheme, it provides information about who created and who modified what kind of data in the cloud. These are crucial aspects for digital investigations in distributed environments such as the cloud. Unfortunately, the aspects of forensic investigations in distributed environment have so far been mostly neglected by the research community. Current discussion centers mostly around security, privacy and data protection issues [35], [9], [12]. The impact of forensic investigations on cloud environments was little noticed albeit mentioned by the authors of [1] in 2009: ”[...] to our knowledge, no research has been published on how cloud computing environments affect digital artifacts,and on acquisition logistics and legal issues related to cloud computing env ironments.” This statement is also confirmed by other authors [34], [36], [40] stressing that further research on incident handling, evidence tracking and accountability in cloud environments has to be done. At the same time, massive investments are being made in cloud technology. Combined with the fact that information technology increasingly transcendents peoples’ private and professional life, thus mirroring more and more of peoples’actions, it becomes apparent that evidence gathered from cloud environments will be of high significance to litigation or criminal proceedings in the future. Within this work, we focus the notion of cloud forensics by addressing the technical issues of forensics in all three major cloud service models and consider cross-disciplinary aspects. Moreover, we address the usability of various sources of evidence for investigative purposes and propose potential solutions to the issues from a practical standpoint. This work should be considered as a surveying discussion of an almost unexplored research area. The paper is organized as follows: We discuss the related work and the fundamental technical background information of digital forensics, cloud computing and the fault model in section II and III. In section IV, we focus on the technical issues of cloud forensics and discuss the potential sources and nature of digital evidence as well as investigations in XaaS environments including thecross-disciplinary aspects. We conclude in section V.II. RELATED WORKVarious works have been published in the field of cloud security and privacy [9], [35], [30] focussing on aspects for protecting data in multi-tenant, virtualized environments. Desired security characteristics for current cloud infrastructures mainly revolve around isolation of multi-tenant platforms [12], security of hypervisors in order to protect virtualized guest systems and secure network infrastructures [32]. Albeit digital provenance, describing the ancestry of digital objects, still remains a challenging issue for cloud environments, several works have already been published in this field [8], [10] contributing to the issues of cloud forensis. Within this context, cryptographic proofs for verifying data integrity mainly in cloud storage offers have been proposed,yet lacking of practical implementations [24], [37], [23]. Traditional computer forensics has already well researched methods for various fields of application [4], [5], [6], [11], [13]. Also the aspects of forensics in virtual systems have been addressed by several works [2], [3], [20] including the notionof virtual introspection [25]. 
In addition, the NIST already addressed Web Service Forensics [22] which has a huge impact on investigation processes in cloud computing environments. In contrast, the aspects of forensic investigations in cloud environments have mostly been neglected by both the industry and the research community. One of the first papers focusing on this topic was published by Wolthusen [40] after Bebee et al already introduced problems within cloud environments [1]. Wolthusen stressed that there is an inherent strong need for interdisciplinary work linking the requirements and concepts of evidence arising from the legal field to what can be feasibly reconstructed and inferred algorithmically or in an exploratory manner. In 2010, Grobauer et al [36] published a paper discussing the issues of incident response in cloud environments - unfortunately no specific issues and solutions of cloud forensics have been proposed which will be done within this work.III. TECHNICAL BACKGROUNDA. Traditional Digital ForensicsThe notion of Digital Forensics is widely known as the practice of identifying, extracting and considering evidence from digital media. Unfortunately, digital evidence is both fragile and volatile and therefore requires the attention of special personnel and methods in order to ensure that evidence data can be proper isolated and evaluated. Normally, the process of a digital investigation can be separated into three different steps each having its own specificpurpose:1) In the Securing Phase, the major intention is the preservation of evidence for analysis. The data has to be collected in a manner that maximizes its integrity. This is normally done by a bitwise copy of the original media. As can be imagined, this represents a huge problem in the field of cloud computing where you never know exactly where your data is and additionallydo not have access to any physical hardware. However, the snapshot technology, discussed in section IV-B3, provides a powerful tool to freeze system states and thus makes digital investigations, at least in IaaS scenarios, theoretically possible.2) We refer to the Analyzing Phase as the stage in which the data is sifted and combined. It is in this phase that the data from multiple systems or sources is pulled together to create as complete a picture and event reconstruction as possible. Especially in distributed system infrastructures, this means that bits and pieces of data are pulled together for deciphering the real story of what happened and for providing a deeper look into the data.3) Finally, at the end of the examination and analysis of the data, the results of the previous phases will be reprocessed in the Presentation Phase. The report, created in this phase, is a compilation of all the documentation and evidence from the analysis stage. The main intention of such a report is that it contains all results, it is complete and clear to understand. Apparently, the success of these three steps strongly depends on the first stage. If it is not possible to secure the complete set of evidence data, no exhaustive analysis will be possible. However, in real world scenarios often only a subset of the evidence data can be secured by the investigator. In addition, an important definition in the general context of forensics is the notion of a Chain of Custody. This chain clarifies how and where evidence is stored and who takes possession of it. Especially for cases which are brought to court it is crucial that the chain of custody is preserved.B. 
Cloud ComputingAccording to the NIST [16], cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal CSP interaction. The new raw definition of cloud computing brought several new characteristics such as multi-tenancy, elasticity, pay-as-you-go and reliability. Within this work, the following three models are used: In the Infrastructure asa Service (IaaS) model, the customer is using the virtual machine provided by the CSP for installing his own system on it. The system can be used like any other physical computer with a few limitations. However, the additive customer power over the system comes along with additional security obligations. Platform as a Service (PaaS) offerings provide the capability to deploy application packages created using the virtual development environment supported by the CSP. For the efficiency of software development process this service model can be propellent. In the Software as a Service (SaaS) model, the customer makes use of a service run by the CSP on a cloud infrastructure. In most of the cases this service can be accessed through an API for a thin client interface such as a web browser. Closed-source public SaaS offers such as Amazon S3 and GoogleMail can only be used in the public deployment model leading to further issues concerning security, privacy and the gathering of suitable evidences. Furthermore, two main deployment models, private and public cloud have to be distinguished. Common public clouds are made available to the general public. The corresponding infrastructure is owned by one organization acting as a CSP and offering services to its customers. In contrast, the private cloud is exclusively operated for an organization but may not provide the scalability and agility of public offers. The additional notions of community and hybrid cloud are not exclusively covered within this work. However, independently from the specific model used, the movement of applications and data to the cloud comes along with limited control for the customer about the application itself, the data pushed into the applications and also about the underlying technical infrastructure.C. Fault ModelBe it an account for a SaaS application, a development environment (PaaS) or a virtual image of an IaaS environment, systems in the cloud can be affected by inconsistencies. Hence, for both customer and CSP it is crucial to have the ability to assign faults to the causing party, even in the presence of Byzantine behavior [33]. Generally, inconsistencies can be caused by the following two reasons:1) Maliciously Intended FaultsInternal or external adversaries with specific malicious intentions can cause faults on cloud instances or applications. Economic rivals as well as former employees can be the reason for these faults and state a constant threat to customers and CSP. In this model, also a malicious CSP is included albeit he isassumed to be rare in real world scenarios. Additionally, from the technical point of view, the movement of computing power to a virtualized, multi-tenant environment can pose further threads and risks to the systems. One reason for this is that if a single system or service in the cloud is compromised, all other guest systems and even the host system are at risk. 
Hence, besides the need for further security measures, precautions for potential forensic investigations have to be taken into consideration.2) Unintentional FaultsInconsistencies in technical systems or processes in the cloud do not have implicitly to be caused by malicious intent. Internal communication errors or human failures can lead to issues in the services offered to the costumer(i.e. loss or modification of data). Although these failures are not caused intentionally, both the CSP and the customer have a strong intention to discover the reasons and deploy corresponding fixes.IV. TECHNICAL ISSUESDigital investigations are about control of forensic evidence data. From the technical standpoint, this data can be available in three different states: at rest, in motion or in execution. Data at rest is represented by allocated disk space. Whether the data is stored in a database or in a specific file format, it allocates disk space. Furthermore, if a file is deleted, the disk space is de-allocated for the operating system but the data is still accessible since the disk space has not been re-allocated and overwritten. This fact is often exploited by investigators which explore these de-allocated disk space on harddisks. In case the data is in motion, data is transferred from one entity to another e.g. a typical file transfer over a network can be seen as a data in motion scenario. Several encapsulated protocols contain the data each leaving specific traces on systems and network devices which can in return be used by investigators. Data can be loaded into memory and executed as a process. In this case, the data is neither at rest or in motion but in execution. On the executing system, process information, machine instruction and allocated/de-allocated data can be analyzed by creating a snapshot of the current system state. In the following sections, we point out the potential sources for evidential data in cloud environments and discuss the technical issues of digital investigations in XaaS environmentsas well as suggest several solutions to these problems.A. Sources and Nature of EvidenceConcerning the technical aspects of forensic investigations, the amount of potential evidence available to the investigator strongly diverges between thedifferent cloud service and deployment models. The virtual machine (VM), hosting in most of the cases the server application, provides several pieces of information that could be used by investigators. On the network level, network components can provide information about possible communication channels between different parties involved. The browser on the client, acting often as the user agent for communicating with the cloud, also contains a lot of information that could be used as evidence in a forensic investigation. Independently from the used model, the following three components could act as sources for potential evidential data.1) Virtual Cloud Instance: The VM within the cloud, where i.e. data is stored or processes are handled, contains potential evidence [2], [3]. In most of the cases, it is the place where an incident happened and hence provides a good starting point for a forensic investigation. The VM instance can be accessed by both, the CSP and the customer who is running the instance. Furthermore, virtual introspection techniques [25] provide access to the runtime state of the VM via the hypervisor and snapshot technology supplies a powerful technique for the customer to freeze specific states of the VM. 
Therefore, virtual instances can be still running during analysis which leads to the case of live investigations [41] or can be turned off leading to static image analysis. In SaaS and PaaS scenarios, the ability to access the virtual instance for gathering evidential information is highly limited or simply not possible.2) Network Layer: Traditional network forensics is knownas the analysis of network traffic logs for tracing events that have occurred in the past. Since the different ISO/OSI network layers provide several information on protocols and communication between instances within as well as with instances outside the cloud [4], [5], [6], network forensics is theoretically also feasible in cloud environments. However in practice, ordinary CSP currently do not provide any log data from the network components used by the customer’s instances or applications. For instance, in case of a malware infection of an IaaS VM, it will be difficult for the investigator to get any form of routing information and network log datain general which is crucial for further investigative steps. This situation gets even more complicated in case of PaaS or SaaS. So again, the situation of gathering forensic evidence is strongly affected by the support the investigator receives from the customer and the CSP.3) Client System: On the system layer of the client, it completely depends on the used model (IaaS, PaaS, SaaS) if and where potential evidence could beextracted. In most of the scenarios, the user agent (e.g. the web browser) on the client system is the only application that communicates with the service in the cloud. This especially holds for SaaS applications which are used and controlled by the web browser. But also in IaaS scenarios, the administration interface is often controlled via the browser. Hence, in an exhaustive forensic investigation, the evidence data gathered from the browser environment [7] should not be omitted.a) Browser Forensics: Generally, the circumstances leading to an investigation have to be differentiated: In ordinary scenarios, the main goal of an investigation of the web browser is to determine if a user has been victim of a crime. In complex SaaS scenarios with high client-server interaction, this constitutes a difficult task. Additionally, customers strongly make use of third-party extensions [17] which can be abused for malicious purposes. Hence, the investigator might want to look for malicious extensions, searches performed, websites visited, files downloaded, information entered in forms or stored in local HTML5 stores, web-based email contents and persistent browser cookies for gathering potential evidence data. Within this context, it is inevitable to investigate the appearance of malicious JavaScript [18] leading to e.g. unintended AJAX requests and hence modified usage of administration interfaces. Generally, the web browser contains a lot of electronic evidence data that could be used to give an answer to both of the above questions - even if the private mode is switched on [19].B. Investigations in XaaS EnvironmentsTraditional digital forensic methodologies permit investigators to seize equipment and perform detailed analysis on the media and data recovered [11]. In a distributed infrastructure organization like the cloud computing environment, investigators are confronted with an entirely different situation. They have no longer the option of seizing physical data storage. 
Data and processes of the customer are dispensed over an undisclosed amount of virtual instances, applications and network elements. Hence, it is in question whether preliminary findings of the computer forensic community in the field of digital forensics apparently have to be revised and adapted to the new environment. Within this section, specific issues of investigations in SaaS, PaaS and IaaS environments will be discussed. In addition, cross-disciplinary issues which affect several environments uniformly, will be taken into consideration. We also suggest potential solutions to the mentioned problems.1) SaaS Environments: Especially in the SaaS model, the customer does notobtain any control of the underlying operating infrastructure such as network, servers, operating systems or the application that is used. This means that no deeper view into the system and its underlying infrastructure is provided to the customer. Only limited userspecific application configuration settings can be controlled contributing to the evidences which can be extracted fromthe client (see section IV-A3). In a lot of cases this urges the investigator to rely on high-level logs which are eventually provided by the CSP. Given the case that the CSP does not run any logging application, the customer has no opportunity to create any useful evidence through the installation of any toolkit or logging tool. These circumstances do not allow a valid forensic investigation and lead to the assumption that customers of SaaS offers do not have any chance to analyze potential incidences.a) Data Provenance: The notion of Digital Provenance is known as meta-data that describes the ancestry or history of digital objects. Secure provenance that records ownership and process history of data objects is vital to the success of data forensics in cloud environments, yet it is still a challenging issue today [8]. Albeit data provenance is of high significance also for IaaS and PaaS, it states a huge problem specifically for SaaS-based applications: Current global acting public SaaS CSP offer Single Sign-On (SSO) access control to the set of their services. Unfortunately in case of an account compromise, most of the CSP do not offer any possibility for the customer to figure out which data and information has been accessed by the adversary. For the victim, this situation can have tremendous impact: If sensitive data has been compromised, it is unclear which data has been leaked and which has not been accessed by the adversary. Additionally, data could be modified or deleted by an external adversary or even by the CSP e.g. due to storage reasons. The customer has no ability to proof otherwise. Secure provenance mechanisms for distributed environments can improve this situation but have not been practically implemented by CSP [10]. Suggested Solution: In private SaaS scenarios this situation is improved by the fact that the customer and the CSP are probably under the same authority. Hence, logging and provenance mechanisms could be implemented which contribute to potential investigations. Additionally, the exact location of the servers and the data is known at any time. Public SaaS CSP should offer additional interfaces for the purpose of compliance, forensics, operations and security matters to their customers. Through an API, the customers should have the ability to receive specific information suchas access, error and event logs that could improve their situation in case of aninvestigation. 
Furthermore, due to the limited ability of receiving forensic information from the server and proofing integrity of stored data in SaaS scenarios, the client has to contribute to this process. This could be achieved by implementing Proofs of Retrievability (POR) in which a verifier (client) is enabled to determine that a prover (server) possesses a file or data object and it can be retrieved unmodified [24]. Provable Data Possession (PDP) techniques [37] could be used to verify that an untrusted server possesses the original data without the need for the client to retrieve it. Although these cryptographic proofs have not been implemented by any CSP, the authors of [23] introduced a new data integrity verification mechanism for SaaS scenarios which could also be used for forensic purposes.2) PaaS Environments: One of the main advantages of the PaaS model is that the developed software application is under the control of the customer and except for some CSP, the source code of the application does not have to leave the local development environment. Given these circumstances, the customer obtains theoretically the power to dictate how the application interacts with other dependencies such as databases, storage entities etc. CSP normally claim this transfer is encrypted but this statement can hardly be verified by the customer. Since the customer has the ability to interact with the platform over a prepared API, system states and specific application logs can be extracted. However potential adversaries, which can compromise the application during runtime, should not be able to alter these log files afterwards. Suggested Solution:Depending on the runtime environment, logging mechanisms could be implemented which automatically sign and encrypt the log information before its transfer to a central logging server under the control of the customer. Additional signing and encrypting could prevent potential eavesdroppers from being able to view and alter log data information on the way to the logging server. Runtime compromise of an PaaS application by adversaries could be monitored by push-only mechanisms for log data presupposing that the needed information to detect such an attack are logged. Increasingly, CSP offering PaaS solutions give developers the ability to collect and store a variety of diagnostics data in a highly configurable way with the help of runtime feature sets [38].3) IaaS Environments: As expected, even virtual instances in the cloud get compromised by adversaries. Hence, the ability to determine how defenses in the virtual environment failed and to what extent the affected systems havebeen compromised is crucial not only for recovering from an incident. Also forensic investigations gain leverage from such information and contribute to resilience against future attacks on the systems. From the forensic point of view, IaaS instances do provide much more evidence data usable for potential forensics than PaaS and SaaS models do. This fact is caused throughthe ability of the customer to install and set up the image for forensic purposes before an incident occurs. Hence, as proposed for PaaS environments, log data and other forensic evidence information could be signed and encrypted before itis transferred to third-party hosts mitigating the chance that a maliciously motivated shutdown process destroys the volatile data. Although, IaaS environments provide plenty of potential evidence, it has to be emphasized that the customer VM is in the end still under the control of the CSP. 
He controls the hypervisor which is e.g. responsible for enforcing hardware boundaries and routing hardware requests among different VMs. Hence, besides the security responsibilities of the hypervisor, he exerts tremendous control over how the customers' VMs communicate with the hardware and theoretically can intervene in executed processes on the hosted virtual instance through virtual introspection [25]. This could also affect encryption or signing processes executed on the VM and therefore lead to the leakage of the secret key. Although this risk can be disregarded in most of the cases, the impact on the security of high-security environments is tremendous.

a) Snapshot Analysis: Traditional forensics expects target machines to be powered down to collect an image (dead virtual instance). This situation completely changed with the advent of the snapshot technology which is supported by all popular hypervisors such as Xen, VMware ESX and Hyper-V. A snapshot, also referred to as the forensic image of a VM, provides a powerful tool with which a virtual instance can be cloned by one click, including the running system's memory. Due to the invention of the snapshot technology, systems hosting crucial business processes do not have to be powered down for forensic investigation purposes. The investigator simply creates and loads a snapshot of the target VM for analysis (live virtual instance). This behavior is especially important for scenarios in which a downtime of a system is not feasible or practical due to existing SLA. However, the information whether the machine is running or has been properly powered down is crucial [3] for the investigation. Live investigations of running virtual instances become more common, providing evidence data that…
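
As a rough illustration of the log signing suggested above for PaaS and IaaS environments (so that records transferred off the instance can later be checked for tampering), consider the following sketch. It is not the authors' implementation: the key, the event fields and the record format are hypothetical; a real deployment would additionally encrypt the records and push them to a logging server under the customer's control; and protecting the key against a runtime compromise of the instance remains a separate problem.

```python
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"customer-held secret"  # hypothetical key, ideally provisioned outside the instance

def signed_log_record(event: dict) -> dict:
    """Produce a log record whose integrity an investigator can later verify."""
    body = json.dumps({"ts": time.time(), **event}, sort_keys=True)
    tag = hmac.new(SIGNING_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "hmac": tag}

def verify_log_record(record: dict) -> bool:
    """Re-compute the tag; any after-the-fact modification of the record is detected."""
    expected = hmac.new(SIGNING_KEY, record["body"].encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["hmac"])

rec = signed_log_record({"event": "admin_login", "source_ip": "203.0.113.7"})
assert verify_log_record(rec)
```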

计算机科学与技术 外文翻译 英文文献 中英对照


附件1:外文资料翻译译文——大容量存储器
由于计算机主存储器的易失性和容量的限制，大多数的计算机都有附加的称为大容量存储系统的存储设备，包括磁盘、CD和磁带。

相对于主存储器,大的容量储存系统的优点是易失性小,容量大,低成本, 并且在许多情况下, 为了归档的需要可以把储存介质从计算机上移开。

术语联机和脱机通常分别用于描述连接于和没有连接于计算机的设备。

联机意味着,设备或信息已经与计算机连接,计算机不需要人的干预,脱机意味着设备或信息与机器相连前需要人的干预,或许需要将这个设备接通电源,或许包含有该信息的介质需要插到某机械装置里。

大容量存储系统的主要缺点是它们通常需要机械运动，因此存取时间远长于主存储器，因为主存储器的所有操作都由电子器件完成。

1. 磁盘今天,我们使用得最多的一种大量存储器是磁盘,在那里有薄的可以旋转的盘片,盘片上有磁介质以储存数据。

盘片的上面和(或)下面安装有读/写磁头,当盘片旋转时,每个磁头都遍历一圈,它被叫作磁道,围绕着磁盘的上下两个表面。

通过重新定位的读/写磁头,不同的同心圆磁道可以被访问。

通常,一个磁盘存储系统由若干个安装在同一根轴上的盘片组成,盘片之间有足够的距离,使得磁头可以在盘片之间滑动。

在一个磁盘中,所有的磁头是一起移动的。

因此,当磁头移动到新的位置时,新的一组磁道可以存取了。

每一组磁道称为一个柱面。

因为一个磁道能包含的信息可能比我们一次操作所需要得多,所以每个磁道划分成若干个弧区,称为扇区,记录在每个扇区上的信息是连续的二进制位串。

传统的磁盘上每个磁道分为同样数目的扇区,而每个扇区也包含同样数目的二进制位。

(所以,盘片中心的储存的二进制位的密度要比靠近盘片边缘的大)。

因此，一个磁盘存储系统由许多各自独立的扇区组成，每个扇区都可以作为独立的二进制位串存取；盘片表面上的磁道数目和每个磁道上的扇区数目，对于不同的磁盘系统可能都不相同。

扇区大小一般不超过几个KB，典型值为512字节或1024字节。
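下面是一个说明性的小例子（其中盘面数、磁道数、每磁道扇区数等参数均为假设值，并非原文给出的数据），按照上文描述的“柱面、磁道、扇区”几何结构估算一个传统磁盘的总容量：

```python
def disk_capacity_bytes(surfaces: int, tracks_per_surface: int,
                        sectors_per_track: int, bytes_per_sector: int) -> int:
    """按 盘面数 × 每面磁道数 × 每磁道扇区数 × 每扇区字节数 估算传统磁盘容量。"""
    return surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector

# 假设参数：4 个盘面、每面 10000 条磁道、每磁道 63 个扇区、每扇区 512 字节
total = disk_capacity_bytes(4, 10_000, 63, 512)
print(f"估算容量约为 {total / 1024 / 1024:.1f} MB")
```

可以看到，在这种传统布局下每个磁道的扇区数相同，因此靠近盘片中心的磁道位密度更高，这正是上文括号中所指出的现象。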

云计算英文论文


云计算英文论文Cloud computing1. IntroductionCloud computing is known as the most anticipated technological revolution over the entire world, and the reason why cloud computing has brought widespread attention is that it not only represent a new technology appeared, but also lead to the entire industry to change. Therefore, the ranking of national competitiveness will change accordingly.2. Cloud computing definition and applicationThe concept of cloud computing is put forward by Google, which is a beautiful network applications mode. Based on increasing in Internet-related services and delivery models, it provides dynamic, easy to expand and virtualized resources usually through the Internet. This service can be related to IT, software and Internet; other services can be supported too. It means that computing power can be circulation as a commodity. As we expect, more and more enterprise use this new technology into the practical application, such as webmail, Gmail and apple store use it for computing and storage. With the technology development, cloud computing has given full play to it roles, Industry, Newcomers and Media buy computer technology to make network cheaper, faster, convenient and better control.Cloud computing is used in the distributed computers by computing, rather than the local computer or remote server, the operation of the enterprise data center and the Internet are more similar. This process allows companies to be able to switch resources to the urgent application according to computer and storage systems demand access. This means that the computingpower can be negotiable, like water and electricity, access convenient andlow-cost. The biggest difference is that it is transmitted via the Internet.3. AdvantageBy providing a large and secure data storage centers for cloud computing, users do not have to worry about their own data which be lost for some reason or destroyed by computer virus invasion, such as some data stored in the hard disk data will be lost by the computer damaged and virus invasion, so that users cannot access the data and recover data, another disadvantage may also appear when other people use your computer steal the user's computer, such as personal confidential information and business data loss. “Gate of blue photos " is a typical insecure example. If user uploads the photos through the Interne into data storage center, it may have less chance to access personal confidential information. And with cloud computing development, for the user, these services are free of charge, for the future, advanced technology will certainly become a cash cow for merchants, and for the country, the computing power is also able to reflect a country technological level and the level of overall national strength.As the user data stored in the cloud storage center, can reduce customer demand for thehardware level, coupled with cloud computing extremely powerful computing capability, it can enable users to call data more convenient and quick if can add the high-speed networks. Users no longer need replace the computer caused by having no enough hard disk space and CPU computing power. Otherwise, users only surf the Internet to access cloud storage center andthen easily access the data.As powerful data storage centers, the most wonderful features is sharing data. No matter computer data or a variety of communication devices such as mobile phones and PDA. 
When some damaged appeared in your phone, lost or replaced in order to chase the trend of the times and phone, data copy is a tedious thing. However, in another way, you can solve the entire problem through cloud storage center.Cloud computing services will create more millions of jobs. On March 6, a news show that show that Microsoft statement a commissioned research conducted by theworld-renowned market analyst firm IDC. The research show that cloud computing will create nearly 14 million new jobs worldwide in 2015. And also predicted that it can stimulate IT innovation and brought the new revenue which can reach about $ 1.1 trillion, coupled with cloud computing efficiency substantially promote, the organization will increasere-investment and work opportunity. IDC chief research officer , as senior vice president, John F. Gantz said: "For most organizations, there is no doubt that cloud computing will significantly enhance the return on investment and flexibility, reduce investment costs, and bring revenue growth multiplied as well, we usually believe that cloud computing will reduce jobs by mistake, the opposite, it can create a lot of jobs around the world, no matter emerging market, small cities and small businesses will increase employment opportunities and get benefit from cloud computing. "14. DisadvantageWith large storage center, cloud computing strongly urges powerful hardware conditions and safety measures, whichincluding a large storage space and powerful computing capabilities. As today's scientific and technological rapid develop, user increased, hardware capabilities become one of the necessary condition for development, moreover, the cloud should be kept to enhance their computing power and storage space, and endless. Security measures also include two aspects; firstly, it is necessary to ensure that user data not damaged, lost, and stolen by others, which requires a strong IT team to full range of maintenance and strict access management strategies. Some information has been confirmed, the existing cloud computing service provider still cannot guarantee no similar problem occurs. For users, this is a very serious problem. Another security issue is a natural or unnatural disaster, also can cause the storage center damaged, can cause users cannot access.Cloud computing services are available via the Internet, therefore, user usually have high demands for network access speed, although domestic access speeds improve fast, but compared with LAN, it will appear delays, incomparable, In addition, no network, the user will not be able to access cloud services.The rapid rise of Cloud technology, in another perspective, also restricts the speed of development, which also lead to lower demand for high-end devices. It is the basic reason1Reference[1]why can restrict terminal development. Customer needs, determine the business requirements for products, if customers reduce the terminal's hardware and software requirements, the business will be greatly reduced the degree of product development and restricted the pace of the terminal technology development.5. Cloud computing devel opment in chinaIn China, Cloud computing services have been in a state of rapid development. Looking at the cloud computing industry, government accounted for a major position, and in the market they have also been given the affirmation of the cloud computing. 
The cloud computing services major support for government agencies and enterprises, therefore, the personal cloud computing services in China is a potential big cake and wait for people enjoying. However because of the domestic personal network speed upgrade too slow, and relevant government network supervision, the development of personal cloud computing services will have more difficult. Due to excessive speculation in the international aspects, in our country, a lot of companies want to get some benefit from the rapid development of technology, exploit cloud computing services projects and promote enterprise development. My personal view is that enterprises should do some preparation, judge it’s good or bad and prospects, as well as estimate the strength of the company and do their best. Through access to information discovery, cloud computing as a technology trend has been got the world's attention, but look at the forefront of foreign enterprise closely, we can easily find almost no companies start the business rely on cloud computing; cloud computing applications and related research are also mostly implement by those large enterprise with more funds. As the prospects for development, for the enterprise, profit is the most critical issues.Although cloud computing has been integrated into our lives, but the domestic development of cloud computing is still confronted with many problems, such as lack of user awareness, migration risk, lack of standards, user locks, security and dataprivacy, safety, standards, data privacy has become a topic of which we are most concerned.Nowadays, people still lack of comprehensive, systematic cognitive of Cloud computing, part of them just imitate completely others and foreign experience, ignoring the specific conditions in our country, resulting in spending much money, but failed to alleviate the complex IT issues. In the original data center, hardware is relatively independent, but migrates to cloud computing data center, systematic assessment and scientific analysis must be essential.Otherwise it may lead to the hardware platform not achieve the proper effect and even the collapse of the application system. Cloud computing products is quite diversity, but the birth of the cloud standard is extremely difficult, the major manufacturers just have their own standards, the Government is also involved, both of them are striving to be able to dominate the major position, so more efforts are still needed. The other faced challenge is how to provide users legitimate services is also very important, except the risks in the systems. Compared with traditional data centers, it cloud provided more services diversity, which also bring more difficult to control. Therefore, analyze the needs of users, reasonable, enforceable service level agreements (SLAs) provided will help users establish confidence in cloud computing services in a large extent. Security problems can be said that a key factor in cloud computing landing. Amazon failure made people deep in thought, which is just one factor of the security.The other security risks include user privacy protection, data sovereignty, disaster recovery, and even illegal interests caused by some hackers. In the era of cloud computing, there will be more user privacy data placed in the clouds, more data, and morevalue. Most importantly, lack of user’s trust will lead to cloud computing landing.Although the development road is very far away, but the Government has been giving support and encouragement. 
In 2012 Cloud Computing Industry Development Forum was successfully held, which is strong evidence. At the meeting, as a CCID Consulting, cloud industry observers, as well as excellence forerunner of the cloud computing industry, Microsoft, with their co-operation, complete the first "China Cloud economic development Report”, which shows that in-depth interpretation the content of cloud formation and specify the key factor of economic development in the cloud. The result obtained based on thelong-term observation and in-depth research on cloud computing industry. In addition, the participants share their own point about industry environment, market situation, technology development, application service elements and so on; they also can come up with any question about the cloud computing, expects will do their best to give analysis and authoritative view.26. Future prospectsAbout cloud computing services future prospects, in my opinion, it will have great potential, we can find out the answer from the action of the giants of the major IT companies, but now it still in a state of excessive speculation. Many small companies have no any knowledge and experience about it, and still want to exploit the services project. I don’t think it is a sagacious decision. in fact, for current state, I maintained a cautious attitude, as a corporate without strong financial strength, I think they should focus on business development and growth, in order to reduce unnecessary waste of resources, they can develop and useit when the technology get mature stage.Finally, have to say that the future of the cloud computing, it not only can provide unlimited possibilities for storage, read, and manage data using the network. But also provides us infinite computing power because of its huge storage space, there is no powerful computing capabilities, it cannot provide users with easy and fast. In one word, the development potential of cloud computing is endless.7. Reference[1] Microsoft news, Cloud computing services will create more millions of jobs.2012-03-06 2Reference[2]。

云计算大数据外文翻译文献


云计算大数据外文翻译文献（文档含英文原文和中文翻译）

原文：Meet Hadoop

In pioneer days they used oxen for heavy pulling, and when one ox couldn't budge a log, they didn't try to grow a larger ox. We shouldn't be trying for bigger computers, but for more systems of computers. (Grace Hopper)

Data!
We live in the data age. It's not easy to measure the total volume of data stored electronically, but an IDC estimate put the size of the "digital universe" at 0.18 zettabytes in 2006, and is forecasting a tenfold growth by 2011 to 1.8 zettabytes. A zettabyte is 10^21 bytes, or equivalently one thousand exabytes, one million petabytes, or one billion terabytes. That's roughly the same order of magnitude as one disk drive for every person in the world.

This flood of data is coming from many sources. Consider the following:
• The New York Stock Exchange generates about one terabyte of new trade data per day.
• Facebook hosts approximately 10 billion photos, taking up one petabyte of storage.
• , the genealogy site, stores around 2.5 petabytes of data.
• The Internet Archive stores around 2 petabytes of data, and is growing at a rate of 20 terabytes per month.
• The Large Hadron Collider near Geneva, Switzerland, will produce about 15 petabytes of data per year.

So there's a lot of data out there. But you are probably wondering how it affects you. Most of the data is locked up in the largest web properties (like search engines), or scientific or financial institutions, isn't it? Does the advent of "Big Data," as it is being called, affect smaller organizations or individuals?

I argue that it does. Take photos, for example. My wife's grandfather was an avid photographer, and took photographs throughout his adult life. His entire corpus of medium format, slide, and 35mm film, when scanned in at high resolution, occupies around 10 gigabytes. Compare this to the digital photos that my family took last year, which take up about 5 gigabytes of space. My family is producing photographic data at 35 times the rate my wife's grandfather's did, and the rate is increasing every year as it becomes easier to take more and more photos.

More generally, the digital streams that individuals are producing are growing apace. Microsoft Research's MyLifeBits project gives a glimpse of the archiving of personal information that may become commonplace in the near future. MyLifeBits was an experiment where an individual's interactions (phone calls, emails, documents) were captured electronically and stored for later access. The data gathered included a photo taken every minute, which resulted in an overall data volume of one gigabyte a month. When storage costs come down enough to make it feasible to store continuous audio and video, the data volume for a future MyLifeBits service will be many times that.

The trend is for every individual's data footprint to grow, but perhaps more importantly the amount of data generated by machines will be even greater than that generated by people. Machine logs, RFID readers, sensor networks, vehicle GPS traces, retail transactions: all of these contribute to the growing mountain of data.

The volume of data being made publicly available increases every year too.
Organizations no longer have to merely manage their own data: success in the future will be dictated to a large extent by their ability to extract value from other organizations’ data.Initiatives such as Public Data Sets on Amazon Web Services, , and exist to foster the “information commons,” where data can be freely (or in the case of AWS, for a modest price) shared for anyone to download and analyze. Mashups between different information sources make for unexpected and hitherto unimaginable applications.Take, for example, the project, which watches the Astrometry groupon Flickr for new photos of the night sky. It analyzes each image, and identifies which part of the sky it is from, and any interesting celestial bodies, such as stars or galaxies. Although it’s still a new and experimental service, it shows the kind of things that are possible when data (in this case, tagged photographic images) is made available andused for something (image analysis) that was not anticipated by the creator.It has been said that “More data usually beats better algorithms,” which is to say that for some problems (such as recommending movies or music based on past preferences),however fiendish your algorithms are, they can often be beaten simply by having more data (and a less sophisticated algorithm).The good news is that Big Data is here. The bad news is that we are struggling to store and analyze it.Data Storage and AnalysisThe problem is simple: while the storage capacities of hard drives have increased massively over the years, access speeds--the rate at which data can be read from drives--have not kept up. One typical drive from 1990 could store 1370 MB of data and had a transfer speed of 4.4 MB/s, so you could read all the data from a full drive in around five minutes. Almost 20years later one terabyte drives are the norm, but the transfer speed is around 100 MB/s, so it takes more than two and a half hours to read all the data off the disk.This is a long time to read all data on a single drive and writing is even slower. The obvious way to reduce the time is to read from multiple disks at once. Imagine if we had 100 drives, each holding one hundredth of the data. Working in parallel, we could read the data in under two minutes.Only using one hundredth of a disk may seem wasteful. But we can store one hundred datasets, each of which is one terabyte, and provide shared access to them. We can imagine that the users of such a system would be happy to share access in return for shorter analysis times, and, statistically, that their analysis jobs would be likely to be spread over time, so they wouldn`t interfere with each other too much.There`s more to being able to read and write data in parallel to or from multiple disks, though. The first problem to solve is hardware failure: as soon as you start using many pieces of hardware, the chance that one will fail is fairly high. A common way of avoiding data loss is through replication: redundant copies of the data are kept by the system so that in the event of failure, there is another copy available. This is how RAID works, for instance, although Hadoop`s filesystem, the Hadoop Distributed Filesystem (HDFS),takes a slightly different approach, as you shall see later. The second problem is that most analysis tasks need to be able to combine the data in some way; data read from one disk may need to be combined with the data from any of the other 99 disks. 
Various distributed systems allow data to be combined from multiple sources, but doing this correctly is notoriously challenging. MapReduce provides a programming model that abstracts the problem from disk reads and writes, transforming it into a computation over sets of keys and values. We will look at the details of this model in later chapters, but the important point for the present discussion is that there are two parts to the computation, the map and the re duce, and it’s the interface between the two where the “mixing” occurs. Like HDFS, MapReduce has reliability built-in.This, in a nutshell, is what Hadoop provides: a reliable shared storage and analysis system. The storage is provided by HDFS, and analysis by MapReduce. There are other parts to Hadoop, but these capabilities are its kernel.Comparison with Other SystemsThe approach taken by MapReduce may seem like a brute-force approach. The premise is that the entire dataset—or at least a good portion of it—is processed for each query. But this is its power. MapReduce is a batch query processor, and the ability to run an ad hoc query against your whole dataset and get the results in a reasonable time is transformative. It changes the way you think about data, and unlocks data that was previously archived on tape or disk. It gives people the opportunity to innovate with data. Questions that took too long to get answered before can now be answered, which in turn leads to new questions and new insights.For e xample, Mailtrust, Rackspace’s mail division, used Hadoop for processing email logs. One ad hoc query they wrote was to find the geographic distribution of their users.In their words: This data was so useful that we’ve scheduled the MapReduce job to run monthly and we will be using this data to help us decide which Rackspace data centers to place new mail servers in as we grow. By bringing several hundred gigabytes of data together and having the tools to analyze it, the Rackspace engineers were able to gain an understanding of the data that they otherwise would never have had, and, furthermore, they were able to use what they had learned to improve the service for their customers. You can read more about how Rackspace uses Hadoop in Chapter 14.RDBMSWhy c an’t we use databases with lots of disks to do large-scale batch analysis? Why is MapReduce needed? The answer to these questions comes from another trend in disk drives: seek time is improving more slowly than transfer rate. Seeking is the process of moving the disk’s head to a particular place on the disk to read or write data. It characterizes the latency of a disk operation, whereas the transfer rate corresponds to a disk’s bandwidth.If the data access pattern is dominated by seeks, it will take longer to read or write large portions of the dataset than streaming through it, which operates at the transfer rate. On the other hand, for updating a small proportion of records in a database, a traditional B-Tree (the data structure used in relational databases, which is limited by the rate it can perform seeks) works well. For updating the majority of a database, a B-Tree is less efficient than MapReduce, which uses Sort/Merge to rebuild the database.In many ways, MapReduce can be seen as a complement to an RDBMS. (The differences between the two systems are shown in Table 1-1.) MapReduce is a good fit for problems thatneed to analyze the whole dataset, in a batch fashion, particularly for ad hoc analysis. 
An RDBMS is good for point queries or updates, where the dataset has been indexed to deliver low-latency retrieval and update times of a relatively small amount of data. MapReduce suits applications where the data is written once and read many times, whereas a relational database is good for datasets that are continually updated.

Table 1-1. RDBMS compared to MapReduce

             Traditional RDBMS           MapReduce
Data size    Gigabytes                   Petabytes
Access       Interactive and batch       Batch
Updates      Read and write many times   Write once, read many times
Structure    Static schema               Dynamic schema
Integrity    High                        Low
Scaling      Nonlinear                   Linear

Another difference between MapReduce and an RDBMS is the amount of structure in the datasets that they operate on. Structured data is data that is organized into entities that have a defined format, such as XML documents or database tables that conform to a particular predefined schema. This is the realm of the RDBMS. Semi-structured data, on the other hand, is looser, and though there may be a schema, it is often ignored, so it may be used only as a guide to the structure of the data: for example, a spreadsheet, in which the structure is the grid of cells, although the cells themselves may hold any form of data. Unstructured data does not have any particular internal structure: for example, plain text or image data. MapReduce works well on unstructured or semistructured data, since it is designed to interpret the data at processing time. In other words, the input keys and values for MapReduce are not an intrinsic property of the data, but they are chosen by the person analyzing the data.

Relational data is often normalized to retain its integrity and remove redundancy. Normalization poses problems for MapReduce, since it makes reading a record a nonlocal operation, and one of the central assumptions that MapReduce makes is that it is possible to perform (high-speed) streaming reads and writes. A web server log is a good example of a set of records that is not normalized (for example, the client hostnames are specified in full each time, even though the same client may appear many times), and this is one reason that logfiles of all kinds are particularly well suited to analysis with MapReduce.

MapReduce is a linearly scalable programming model. The programmer writes two functions, a map function and a reduce function, each of which defines a mapping from one set of key-value pairs to another. These functions are oblivious to the size of the data or the cluster that they are operating on, so they can be used unchanged for a small dataset and for a massive one. More importantly, if you double the size of the input data, a job will run twice as slow. But if you also double the size of the cluster, a job will run as fast as the original one. This is not generally true of SQL queries.

Over time, however, the differences between relational databases and MapReduce systems are likely to blur. Both as relational databases start incorporating some of the ideas from MapReduce (such as Aster Data's and Greenplum's databases), and, from the other direction, as higher-level query languages built on MapReduce (such as Pig and Hive) make MapReduce systems more approachable to traditional database programmers.

Grid Computing
The High Performance Computing (HPC) and Grid Computing communities have been doing large-scale data processing for years, using such APIs as Message Passing Interface (MPI).
Broadly, the approach in HPC is to distribute the work across a cluster of machines, which access a shared filesystem, hosted by a SAN. This works well for predominantly compute-intensive jobs, but becomes a problem when nodes need to access larger data volumes (hundreds of gigabytes, the point at which MapReduce really starts to shine), since the network bandwidth is the bottleneck, and compute nodes become idle.MapReduce tries to colocate the data with the compute node, so data access is fast since it is local. This feature, known as data locality, is at the heart of MapReduce and is the reason for its good performance. Recognizing that network bandwidth is the most precious resource in a data center environment (it is easy to saturate network links by copying data around),MapReduce implementations go to great lengths to preserve it by explicitly modelling network topology. Notice that this arrangement does not preclude high-CPU analyses in MapReduce.MPI gives great control to the programmer, but requires that he or she explicitly handle the mechanics of the data flow, exposed via low-level C routines and constructs, such as sockets, as well as the higher-level algorithm for the analysis. MapReduce operates only at the higher level: the programmer thinks in terms of functions of key and value pairs, and the data flow is implicit.Coordinating the processes in a large-scale distributed computation is a challenge. The hardest aspect is gracefully handling partial failure—when you don’t know if a remote process has failed or not—and still making progress with the overall computation. MapReduce spares the programmer from having to think about failure, since the implementation detects failed map or reduce tasks and reschedules replacements on machines that are healthy. MapReduce is able to do this since it is a shared-nothing architecture, meaning that tasks have no dependence on one other. (This is a slight oversimplification, since the output from mappers is fed to the reducers, but this is under the control of the MapReduce system; in this case, it needs to take more care rerunning a failed reducer than rerunning a failed map, since it has to make sure it can retrieve the necessary map outputs, and if not, regenerate them by running the relevant maps again.) So from the programmer’s point of view, the order in which the tasks run doesn’t matter. By contrast, MPI programs have to explicitly manage their own checkpointing and recovery, which gives more control to the programmer, but makes them more difficult to write.MapReduce might sound like quite a restrictive programming model, and in a sense itis: you are limited to key and value types that are related in specified ways, and mappers and reducers run with very limited coordination between one another (the mappers pass keys and values to reducers). A natural question to ask is: can you do anything useful or nontrivial with it?The answer is yes. MapReduce was invented by engineers at Google as a system for building production search indexes because they found themselves solving the same problem over and over again (and MapReduce was inspired by older ideas from the functional programming, distributed computing, and database communities), but it has since been used for many other applications in many other industries. It is pleasantly surprising to see the range of algorithms that can be expressed in MapReduce, from image analysis, to graph-based problems,to machine learning algorithms. 
It can’t solve every problem, of course, but it is a general data-processing tool.You can see a sample of some of the applications that Hadoop has been used for in Chapter 14.Volunteer ComputingWhen people first hear about Hadoop and MapReduce, they often ask, “How is it different from SETI@home?” SETI, the Search for Extra-Terrestrial Intelligence, runs a project called SETI@home in which volunteers donate CPU time from their otherwise idle computers to analyze radio telescope data for signs of intelligent life outside earth. SETI@home is the most well-known of many volunteer computing projects; others include the Great Internet Mersenne Prime Search (to search for large prime numbers) and Folding@home (to understand protein folding, and how it relates to disease).Volunteer computing projects work by breaking the problem they are trying to solve into chunks called work units, which are sent to computers around the world to be analyzed. For example, a SETI@home work unit is about 0.35 MB of radio telescope data, and takes hours or days to analyze on a typical home computer. When the analysis is completed, the results are sent back to the server, and the client gets another work unit. As a precaution to combat cheating, each work unit is sent to three different machines, and needs at least two results to agree to be accepted.Although SETI@home may be superficially similar to MapReduce (breaking a problem into independent pieces to be worked on in parallel), there are some significant differences. The SETI@home problem is very CPU-intensive, which makes it suitable for running on hundreds of thousands of computers across the world, since the time to transfer the work unit is dwarfed by the time to run the computation on it. Volunteers are donating CPU cycles, not bandwidth.MapReduce is designed to run jobs that last minutes or hours on trusted, dedicated hardware running in a single data center with very high aggregate bandwidth interconnects. By contrast, SETI@home runs a perpetual computation on untrusted machines on the Internet with highly variable connection speeds and no data locality.译文:初识Hadoop古时候,人们用牛来拉重物,当一头牛拉不动一根圆木的时候,他们不曾想过培育个头更大的牛。
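As a minimal, hedged sketch of the two-function model described in the excerpt above (a pure-Python illustration, not Hadoop's actual Java API), the following code runs a word count through user-supplied map and reduce functions; a real MapReduce system would additionally partition the input, shuffle intermediate pairs across machines and handle failures.

```python
from collections import defaultdict

def map_words(_key, line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield (word.lower(), 1)

def reduce_counts(word, counts):
    """Reduce: sum all values observed for the same intermediate key."""
    yield (word, sum(counts))

def run_mapreduce(records, mapper, reducer):
    intermediate = defaultdict(list)
    # Map phase: apply the mapper to every input key/value pair.
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            intermediate[out_key].append(out_value)   # group by intermediate key ("shuffle")
    # Reduce phase: apply the reducer once per intermediate key.
    results = []
    for key, values in intermediate.items():
        results.extend(reducer(key, values))
    return results

lines = enumerate(["the quick brown fox", "the lazy dog", "the fox"])
print(sorted(run_mapreduce(lines, map_words, reduce_counts)))
```

In Hadoop itself the same two functions would be written against the Java MapReduce API, and the framework rather than the small driver above would handle partitioning, shuffling and failure recovery across the cluster.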

云计算外文文献+翻译


1. 引言
云计算是一种基于互联网的计算方式，它通过共享的计算资源提供各种服务。

随着云计算的普及和应用,许多研究者对该领域进行了深入的研究。

本文将介绍一篇外文文献,探讨云计算的相关内容,并提供相应的翻译。

2. 外文文献概述作者:Antonio Fernández Anta, Chryssis Georgiou, Evangelos Kranakis出版年份:2019年该外文文献主要综述了云计算的发展和应用。

文中介绍了云计算的基本概念,包括云计算的特点、架构、服务模型以及云计算的挑战和前景。

3. 研究内容该研究综述了云计算技术的基本概念和相关技术。

文中首先介绍了云计算的定义和其与传统计算的比较,深入探讨了云计算的优势和不足之处。

随后,文中介绍了云计算的架构,包括云服务提供商、云服务消费者和云服务的基本组件。

在架构介绍之后,文中提供了云计算的三种服务模型:基础设施即服务(IaaS)、平台即服务(PaaS)和软件即服务(SaaS)。

每种服务模型都从定义、特点和应用案例方面进行了介绍,并为读者提供了更深入的了解。

此外,文中还讨论了云计算的挑战,包括安全性、隐私保护、性能和可靠性等方面的问题。

同时,文中也探讨了云计算的前景和未来发展方向。

4. 文献翻译《云计算:一项调查》是一篇全面介绍云计算的文献。

它详细解释了云计算的定义、架构和服务模型,并探讨了其优势、不足和挑战。

此外,该文献还对云计算的未来发展进行了预测。

对于研究云计算和相关领域的读者来说,该文献提供了一个很好的参考资源。

它可以帮助读者了解云计算的基本概念、架构和服务模型,也可以引导读者思考云计算面临的挑战和应对方法。

5. 结论。

外文翻译云计算中倾斜度感知的任务调度.


2013　Skew-Aware Task Scheduling in Clouds
云计算中倾斜度感知的任务调度
李东生，陈宜兴，理查德·胡亥
国防科技大学计算机学院并行与分布式处理国家实验室；国立大学莱佛士商学院，新加坡　dsli@
摘要：数据倾斜是类MapReduce云系统中出现慢任务的一个重要原因。

在本文中，我们针对类MapReduce系统中的迭代应用，提出了一种倾斜度感知任务调度（SATS）机制。

该机制利用迭代应用中相邻迭代之间数据分布的相似性，来缓解数据倾斜造成的落伍者（straggler）问题。

它在当前迭代的任务的执行过程中收集数据的分布信息,并用这些信息来指导下一次迭代时任务的数据分割。

我们在HaLoop系统中实现了该机制，并将其部署在一个集群上。

实验结果表明,该机制可以处理数据扭曲,有效地提高负载平衡。

关键词：数据倾斜；任务调度；云计算；负载均衡
1、简介
近年来，云计算已经成为一种有前途的技术，而MapReduce是最成功的大规模数据密集型云计算实现平台之一[1]-[3]。

MapReduce的使用一个简单的数据并行的编程模型,有两个基本的操作,即,Map和Reduce操作。

用户可以根据应用程序的要求自定义Map功能和Reduce功能。

每个map任务取一片输入数据,并产生一个用Map功能的key/value对的集合,这是初步地用Reduce功能做Reduce任务。

这种编程模型很简单,但功能强大,许多大规模数据处理应用程序可以由模型来表示。

类MapReduce的系统可以在云计算中自动调度多个分布在机器中的Map和/或Reduce任务。

作为同步步骤仅存在于Map阶段和Reduce阶段之间,任务执行在相同的阶段具有高平行度,并且因此并发性和系统的可扩展性可以被高度增强。

Hadoop[4]和它的变体(例如,HaLoop [5]和Hadoop++ [6])是典型的类MapReduce系统。

由于在类MapReduce系统中Map和Reduce阶段之间存在同步步骤,在任一阶段慢任务可能减慢整个工作的执行。
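下面给出一个带有假设的示意性草图（并非SATS论文的原始实现，reducer数目和key统计值均为假设）：利用上一轮迭代收集到的各个key的数据量统计，用贪心策略把key分配给reduce任务，使各分区的负载尽量均衡，从而缓解数据倾斜带来的落伍者问题。

```python
import heapq
from collections import Counter

def skew_aware_partition(key_sizes: Counter, num_reducers: int) -> dict:
    """把key按估计数据量从大到小，贪心地分给当前负载最小的reducer。"""
    # 最小堆中保存 (当前负载, reducer编号)
    heap = [(0, r) for r in range(num_reducers)]
    heapq.heapify(heap)
    assignment = {}
    for key, size in key_sizes.most_common():      # 从最重的key开始分配
        load, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (load + size, reducer))
    return assignment

# 假设：上一轮迭代统计出的各key记录数（倾斜分布）
stats = Counter({"k1": 9000, "k2": 300, "k3": 280, "k4": 250, "k5": 40})
print(skew_aware_partition(stats, num_reducers=3))
```

与默认的哈希分区相比，这种按负载分配的做法可以让下一轮迭代中各reduce任务的数据量更接近，代价是需要在作业执行期间收集并传递key的分布信息。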

专业英语翻译


第五章So What is Cloud Computing?We see Cloud Computing as a computing model, not a technology. In this model “customers” plug into the “cloud” to access IT resources which are priced and provided “on-demand”. Essentially, IT resources are rented and shared among multiple tenants much as office space, apartments, or storage spaces are used by tenants. Delivered over an Internet connection, the “cloud” replaces the company data center or server providing the same service. Thus, Cloud Computing is simply IT services sold and delivered over the Internet. Refer to section of Types of Cloud Computing.Cloud Computing vendors combine virtualization (one computer hosting several “virtual” servers), automated provisioning (servers have software installed automatically), and Internet connectivity technologies to provide the service[1]. These are not new technologies but a new name applied to a collection of older (albeit updated) technologies that are packaged, sold and delivered in a new way.A key point to remember is that, at the most basic level, your data resides onsomeone else’s server(s). This means that most concerns (and there are potentially hundreds) really come down to trust and control issues. Do you trust them with your data?那么,什么是云计算?我们看到云计算作为一种计算模式,而不是一个技术。

(完整word版)JAVA外文文献+翻译


Java and the Internet
If Java is, in fact, yet another computer programming language, you may question why it is so important and why it is being promoted as a revolutionary step in computer programming. The answer isn't immediately obvious if you're coming from a traditional programming perspective. Although Java is very useful for solving traditional stand-alone programming problems, it is also important because it will solve programming problems on the World Wide Web.

1. Client-side programming
The Web's initial server-browser design provided for interactive content, but the interactivity was completely provided by the server. The server produced static pages for the client browser, which would simply interpret and display them.

Basic HTML contains simple mechanisms for data gathering: text-entry boxes, check boxes, radio boxes, lists and drop-down lists, as well as a button that can only be programmed to reset the data on the form or "submit" the data on the form back to the server.

云计算技术的应用与发展趋势(英文中文双语版优质文档)


云计算技术的应用与发展趋势(英文中文双语版优质文档)With the continuous development of information technology, cloud computing technology has become an indispensable part of enterprise information construction. Cloud computing technology can help enterprises realize a series of functions such as resource sharing, data storage and processing, application development and deployment. This article will discuss from three aspects: the application of cloud computing technology, the advantages of cloud computing technology and the development trend of cloud computing technology.1. Application of Cloud Computing Technology1. Resource sharingCloud computing technology can bring together different resources to realize resource sharing. Enterprises can use cloud computing technology to share resources such as servers, storage devices, and network devices, so as to maximize the utilization of resources.2. Data storage and processingCloud computing technology can help enterprises store and process massive data. Through cloud computing technology, enterprises can store data in the cloud to realize remote access and backup of data. At the same time, cloud computing technology can also help enterprises analyze and process data and provide more accurate decision support.3. Application development and deploymentCloud computing technology can help enterprises develop and deploy applications faster and more conveniently. Through cloud computing technology, enterprises can deploy applications on the cloud to realize remote access and management of applications. At the same time, cloud computing technology can also provide a variety of development tools and development environment, which is convenient for enterprises to carry out application development.2. Advantages of cloud computing technology1. High flexibilityCloud computing technology can flexibly adjust the usage and allocation of resources according to the needs of enterprises, so as to realize the optimal utilization of resources. At the same time, cloud computing technology can also support elastic expansion and contraction, which is convenient for enterprises to cope with business peaks and valleys.2. High securityCloud computing technology can ensure the security of enterprise data through data encryption, identity authentication, access control and other means. At the same time, cloud computing technology can also provide a multi-level security protection system to prevent security risks such as hacker attacks and data leakage.3. Cost-effectiveCompared with the traditional IT construction model, the cost of cloud computing technology is lower. Through cloud computing technology, enterprises can avoid large-scale hardware investment and maintenance costs, and save enterprise R&D and operating expenses.4. Convenient managementCloud computing technology can help enterprises achieve unified resource management and monitoring. Through cloud computing technology, enterprises can centrally manage resources such as multiple servers, storage devices, and network devices, which is convenient for enterprises to carry out unified monitoring and management.5. Strong scalabilityCloud computing technology can quickly increase or decrease the usage and configuration of resources according to the needs of enterprises, so as to realize the rapid expansion and contraction of business. 
At the same time, cloud computing technology can also provide a variety of expansion methods, such as horizontal expansion, vertical expansion, etc., to facilitate enterprises to expand their business on demand.3. The development trend of cloud computing technology1. The advent of the multi-cloud eraWith the development of cloud computing technology, the multi-cloud era has arrived. Enterprises can choose different cloud platforms and deploy services on multiple clouds to achieve high availability and elastic expansion of services.2. Combination of artificial intelligence and cloud computingArtificial intelligence is one of the current hot technologies, and cloud computing technology can also provide better support for the development of artificial intelligence. Cloud computing technology can provide high-performance computing resources and storage resources, providing better conditions for the training and deployment of artificial intelligence.3. The Rise of Edge ComputingEdge computing refers to the deployment of computing resources and storage resources at the edge of the network to provide faster and more convenient computing and storage services. With the development of the Internet of Things and the popularization of 5G networks, edge computing will become an important expansion direction of cloud computing technology.4. Guarantee of security and privacyWith the widespread application of cloud computing technology, data security and privacy protection have become important issues facing cloud computing technology. In the future, cloud computing technology will pay more attention to security measures such as data encryption, identity authentication and access control to ensure the security and privacy of corporate and personal data.To sum up, cloud computing technology has become an indispensable part of enterprise information construction. Through cloud computing technology, enterprises can realize a series of functions such as resource sharing, data storage and processing, application development and deployment. At the same time, cloud computing technology also has the advantages of high flexibility, high security, high cost-effectiveness, convenient management and strong scalability. In the future, with the multi-cloud era, the combination of artificial intelligence and cloud computing, the rise of edge computing, and the protection of security and privacy, cloud computing technology will continue to enhance its importance and application value in enterprise information construction.随着信息技术的不断发展,云计算技术已经成为企业信息化建设中不可或缺的一部分。

关于云计算的英语作文


关于云计算的英语作文英文回答:Cloud computing is a revolutionary technology that has transformed the way businesses and individuals store, access, and manage data. It allows users to access applications and data from any device with an internet connection, making it incredibly convenient and flexible. 。

One of the major benefits of cloud computing is itscost-effectiveness. Instead of investing in expensive hardware and software, businesses can simply pay for the services they use on a subscription basis. This not only reduces upfront costs but also allows for scalability asthe business grows.Another advantage of cloud computing is its reliability and security. Cloud service providers invest heavily instate-of-the-art security measures to protect data from cyber threats and ensure constant access to data. This isparticularly important for businesses that rely on data for their operations.Furthermore, cloud computing enables collaboration and remote work. With cloud-based applications and storage, teams can work together on projects from different locations, making it easier to stay connected and productive.In addition, cloud computing has had a significant impact on the entertainment industry. Streaming services like Netflix and Spotify rely on cloud computing to deliver content to millions of users simultaneously, without any lag or interruption.Overall, cloud computing has revolutionized the way we store and access data, and its impact can be seen across various industries.中文回答:云计算是一项革命性的技术,改变了企业和个人存储、访问和管理数据的方式。

云计算英文术语


云计算术语（中英文对照）
1. 自由计算 free computing
2. 弹性可伸缩 elastic and scalable
4. 硬盘 hard disk / volume
5. 密钥 key
6. 公开密钥 public key
7. 映像 image / mapping
8. 负载均衡 load balancing
9. 对象存储 object storage
10. 弹性计算 elastic computing
11. 按秒计费 charged by seconds
12. 多重实时副本 multiple real-time copy
13. 安全隔离 security isolation
14. 异地副本 long-distance copy
15. 后端系统 back-end system
16. 前端系统 front-end system
17. 写时拷贝技术 copy-on-write technique
18. 控制台 console
19. 监控台 dashboard
20. 远程终端 remote terminal
21. 服务端口 service port
22. 模拟主机 simulation host；显示器 display
23. 路由器 router
24. 多路万兆光纤 multiple 10000MB optical fiber
25. 密码验证登录 password authentication login
27. 动态IP dynamic IP
28. 混合云 hybrid cloud
29. SLA Service Level Agreement 服务级别协议
30. 分布式存储 distributed storage
31. 存储柜 locker
32. 云计算加速器 cloud computing accelerator
33. NIST National Institute of Standards and Technology 美国国家标准技术研究所
34. 智能电网 smart grid
35. 智慧城市 smart city
36. 物联网 Internet of Things (IOT)
37. 集成电路 Integrated Circuit
38. 嵌套虚拟化 nested virtualization
39. 内存 memory
40. 千兆 Gigabyte
41. 网卡 network card
42. 单线程测试 single thread test
43. 最大素数测试 largest prime test
44. 单核CPU single-core CPU
45. 双核CPU dual-core CPU
46. 磁盘吞吐量 disk throughput
47. BGP 边界网关协议 Border Gateway Protocol
48. 语音控制 voice control
49. 湿度 humidity
50. 智能分析 intelligent analysis
51. SOA Service Oriented Architecture 面向服务的架构
全媒体 omni-media
API 接口 API interface
快照 snapshot
工单系统 ticket system
堡垒机 fortress machine
单点登录 SSO single sign on
脚本管理 script management
拓扑管理 topology management
ETL Extraction-Transformation-Loading 数据提取、转换和加载
网络流量 network traffic
域名绑定 domain banding
文件外链 external document linking
防篡改 tamper-proofing
防抵赖 non-repudiation
端到端 end-to-end
全景透视 panoramic perspective
多维度特征识别 multidimensional characteristic identification
检索 retrieval
存储矩阵 storage matrix
示例代码 sample code
可执行代码 executable code
远程擦除 remote wipe
底层固件 bottom firmware
虚拟机 virtual machine
源代码 source code
文档 document
79. 存储分级 storage tiering
80. 回写式高速缓存 Write-back Cache
81. 软件定义存储 software defined storage
82. 横向可扩展存储 transverse extensible storage
83. 模块化数据中心 Modular Data Center
84. DNS Domain Name System 域名系统
85. 封顶 capping
86. 芯片 chip
87. ISV Independent Software Vendor 第三方软件开发商
88. 特征向量 Feature Vector
89. 远程异地备份 remote backup
90. 虚拟显示技术 visual vision
91. 虚拟现实 Virtual Reality (VR)
92. 数据记录器 Data Recorder
93. 业务连续性管理 Business Continuity Management (BCM)
94. 钢筋砼框架 reinforced concrete frame
95. 防爆墙 blast wall
96. 入侵检测探测器 intrusion detector
97. 弱电间 low voltage room
98. 门禁系统 access control system
99. 网络接入商 web portal provider
100. 审计日志 audit logs
101. UPS Uninterrupted Power Supply 不间断电源
102. 柴油发电机 diesel generator
103. 地下储油罐 underground petrol tank
104. 多节点集群 multi-node cluster
105. 预案 emergency response plan
107. 容错级 fault tolerance
108. 里程碑 milestone
109. 制冷密度 cooling density
110. 千瓦 kilowatt
111. 灭火方式 fire extinguishing method
112. 防渗漏等级 anti-leakage level
113. 机房均布荷载 computer room even load
114. 全冗余 full redundancy
115. 两路市电 two-way electricity
116. 一路自备应急 one-way self-prepared emergency power
117. 9度烈度 9 degree seismic intensity
118. 密文 ciphertext
119. 专属机柜 exclusive rack
120. 设备上下电支持 upper and lower electricity support
121. 网络布线 network cabling
122. 实时热备份 real time thermal backup
123. 桌面演练 desktop practice
124. 模拟切换演练 simulated switch practice
125. 园区占地面积 floor area of the park
126. 规划建设面积 planning construction area
127. 高速链路复制 high-speed copying link
128. 7×24小时 7 multiply 24 hours
129. 安全和访问控制 security and visiting control（物理环境）
130. 银监会 China Banking Regulatory Commission (CBRC)
131. 发改委 National Development and Reform Commission (NDRC)
132. 中国信息安全测评中心 China Information Technology Evaluation Center
133. 工信部 Ministry of Industry and Information Technology
134. 住建部 Ministry of Housing and Urban-Rural Development
135. DRII 国际灾难恢复协会 Disaster Recovery Institute International
136. BCI 业务持续协会 Business Continuity Institute
137. TCO Total Cost of Ownership 总拥有成本
138. HCI Human Computer Interaction 人机交互
139. OCR Optical Character Recognition 文字识别
140. SOA Service Oriented Architecture 面向服务的体系结构
141. NoSQL 非关系型数据库
142. Hadoop 一个分布式系统基础架构，由Apache基金会所开发。


Implementation Issues of A Cloud Computing Platform
Bo Peng, Bin Cui and Xiaoming Li
Department of Computer Science and Technology, Peking University  {pb, bin.cui, lxm}@

Abstract
Cloud computing is Internet based system development in which large scalable computing resources are provided "as a service" over the Internet to users. The concept of cloud computing incorporates web infrastructure, software as a service (SaaS), Web 2.0 and other emerging technologies, and has attracted more and more attention from industry and the research community. In this paper, we describe our experience and lessons learnt in the construction of a cloud computing platform. Specifically, we design a GFS compatible file system with variable chunk size to facilitate massive data processing, and introduce some implementation enhancements on MapReduce to improve the system throughput. We also discuss some practical issues for system implementation. In association with the China web archive (Web InfoMall) which we have been accumulating since 2001 (now it contains over three billion Chinese web pages), this paper presents our attempt to implement a platform for a domain specific cloud computing service, with large scale web text mining as the targeted application. And hopefully researchers besides ourselves will benefit from the cloud when it is ready.

1 Introduction
As more facets of work and personal life move online and the Internet becomes a platform for virtual human society, a new paradigm of large-scale distributed computing has emerged. Web-based companies, such as Google and Amazon, have built web infrastructure to deal with internet-scale data storage and computation. If we consider such infrastructure as a "virtual computer", it demonstrates the possibility of a new computing model, i.e., centralizing the data and computation on the "super computer" with unprecedented storage and computing capability, which can be viewed as the simplest form of cloud computing.

More generally, the concept of cloud computing can incorporate various computer technologies, including web infrastructure, Web 2.0 and many other emerging technologies. People may have different perspectives from different views. For example, from the view of the end-user, the cloud computing service moves the application software and operating system from desktops to the cloud side, which makes users able to plug in anytime from anywhere and utilize large scale storage and computing resources. On the other hand, the cloud computing service provider may focus on how to distribute and schedule the computer resources. Nevertheless, storage and computing on massive data are the key technologies for a cloud computing infrastructure.

Google has developed its infrastructure technologies for cloud computing in recent years, including Google File System (GFS) [8], MapReduce [7] and Bigtable [6]. GFS is a scalable distributed file system, which emphasizes fault tolerance since it is designed to run on economically scalable but inevitably unreliable (due to its sheer scale) commodity hardware, and delivers high performance service to a large number of clients.
Bigtable is a distributed storage system based on GFS for structured data management.It provides a huge three-dimensional mapping abstraction to applications,and has been successfully deployed in many Google products.MapReduce is a programming model with associated implementation for massive data processing. MapReduce provides an abstraction by defining a“mapper”and a“reducer”.The“mapper”is applied to every input key/value pair to generate an arbitrary number of intermediate key/value pairs.The“reducer”is applied to all values associated with the same intermediate key to generate output key/value pairs.MapReduce is an easy-to-use programming model,and has sufficient expression capability to support many real world algorithms and tasks.The MapReduce system can partition the input data,schedule the execution of program across a set of machines,handle machine failures,and manage the inter-machine communication.More recently,many similar systems have been developed.KosmosFS[3]is an open source GFS-Like system,which supports strict POSIX interface.Hadoop[2]is an active Java open source project.With the support from Yahoo,Hadoop has achieved great progress in these two years.It has been deployed in a large system with4,000nodes and used in many large scale data processing tasks.In Oct2007,Google and IBM launched“cloud computing initiative”programs for universities to promote the related teaching and research work on increasingly popular large-scale ter in July2008,HP, Intel and Yahoo launched a similar initiative to promote and develop cloud computing research and education. Such cloud computing projects can not only improve the parallel computing education,but also promote the research work such as Internet-scale data management,processing and scientific computation.Inspired by this trend and motivated by a need to upgrade our existing work,we have implemented a practical web infrastructure as cloud computing platform,which can be used to store large scale web data and provide high performance processing capability.In the last decade,our research and system development focus is on Web search and Web Mining,and we have developed and maintained two public web systems,i.e.,Tianwang Search Engine[4]and Web Archive system Web infomall[1]as shown in Figure1.(a)Tianwang(b)Web infomallFigure1:Search engine and Chines web archive developed at SEWM group of PKU During this period,we have accumulated more than50TB web data,built a PC cluster consisting of100+ PCs,and designed various web application softwares such as webpage text analysis and processing.With the increase of data size and computation workload in the system,we found the cloud computing technology is a promising approach to improve the scalability and productivity of the system for web services.Since2007,we62started to design and develop our web infrastructure system,named“Tplatform”,including GFS-likefile system “TFS”[10]and MapReduce computing environment.We believe our practice of cloud computing platform implementation could be a good reference for researchers or engineers who are interested in this area.2TPlatform:A Cloud Computing PlatformIn this section,we briefly introduce the implementation and components of our cloud computing platform, named“Tplatform”.Wefirst present the overview of the system,followed by the detailed system implementation and some practical issues.Figure2:The System Framework of TplatformFig2shows the overall system framework of the“Tplatform”,which consists of three layers,i.e.,PC cluster, infrastructure for 
cloud computing platform,and data processing application layer.The PC cluster layer provides the hardware and storage devices for large scale data processing.The application layer provides the services to users,where users can develop their own applications,such as Web data analysis,language processing,cluster and classification,etc.The second layer is the main focus of our work,consisting offile system TFS,distributed data storage mechanism BigTable,and MapReduce programming model.The implementation of BigTable is similar to the approach presented in[6],and hence we omit detailed discussion here.2.1Implementation of File SystemThefile system is the key component of the system to support massive data storage and management.The designed TFS is a scalable,distributedfile system,and each TFS cluster consists of a single master and multiple chunk servers and can be accessed by multiple client.632.1.1TFS ArchitectureIn TFS,files are divided into variable-size chunks.Each chunk is identified by an immutable and globally unique 64bit chunk handle assigned by the master at the time of chunk creation.Chunk servers store the chunks on the local disks and read/write chunk data specified by a chunk handle and byte range.For the data reliability, each chunk is replicated on multiple chunk servers.By default,we maintain three replicas in the system,though users can designate different replication levels for differentfiles.The master maintains the metadata offile system,which includes the namespace,access control information, the mapping fromfiles to chunks,and the current locations of chunks.It also controls system-wide activities such as garbage collection of orphaned chunks,and chunk migration between chunk servers.Each chunk server periodically communicates with the master in HeartBeat messages to report its state and retrieve the instructions.TFS client module is associated with each application by integrating thefile system API,which can commu-nicate with the master and chunkservers to read or write data on behalf of the application.Clients interact with the master for metadata operations,but all data-bearing communication goes directly to the chunkservers.The system is designed to minimize the master’s involvement infile accessing operations.We do not provide the POSIX API.Besides providing the ordinary read and write operations,like GFS,we have also provided an atomic record appending operation so that multiple clients can append concurrently to afile without extra synchronization among them.In the system implementation,we observe that the record appending operation is the key operation for system performance.We design our own system interaction mechanism which is different from GFS and yields better record appending performance.2.1.2Variable Chunk SizeIn GFS,afile is divided intofixed-size chunks(e.g.,64MB).When a client uses record appending operation to append data,the system checks whether appending the record to the last chunk of a certainfile may make the chunk overflowed,i.e.,exceed the maximum size.If so,it pads all the replica of the chunk to the maximum size, and informs the client that the operation should be continued on the new chunk.(Record appending is restricted to be at most one-fourth of the chunk size to keep worst case fragmentation at an acceptable level.)In case of write failure,this approach may lead to duplicated records and incomplete records.In our TFS design,the chunks of afile are allowed to have variable sizes.With the proposed system in-teraction mechanism,this 
2.1.2 Variable Chunk Size

In GFS, a file is divided into fixed-size chunks (e.g., 64 MB). When a client uses the record appending operation to append data, the system checks whether appending the record to the last chunk of a certain file may make the chunk overflow, i.e., exceed the maximum size. If so, it pads all the replicas of the chunk to the maximum size, and informs the client that the operation should be continued on a new chunk. (Record appending is restricted to be at most one-fourth of the chunk size to keep worst-case fragmentation at an acceptable level.) In case of write failure, this approach may lead to duplicated records and incomplete records.

In our TFS design, the chunks of a file are allowed to have variable sizes. With the proposed system interaction mechanism, this strategy makes the record appending operation more efficient. Padding data, record fragments and record duplication are not necessary in our system. Although this approach brings some extra cost, e.g., every chunk data structure needs a chunk size attribute, the overall performance is significantly improved, as the read and record appending operations are the dominating operations in our system and can benefit from this design choice.

2.1.3 File Operations

We have designed different file operations for TFS, such as read, record append and write. Since we allow variable chunk sizes in TFS, the operation strategy is different from that of GFS. Here we present the detailed implementation of the read operation to show the difference in our approach.

To read a file, the client exchanges messages with the master, gets the locations of the chunks it wants to read from, and then communicates with the chunk servers to retrieve the data. Since GFS uses a fixed chunk size, the client just needs to translate the file name and byte offset into a chunk index within the file, and sends the master a request containing the file name and chunk index. The master replies with the corresponding chunk handle and the locations of the replicas. The client then sends a request to one of the replicas, most likely the closest one. The request specifies the chunk handle and a byte range within that chunk. Further reads of the same chunk do not require any more client-master interaction unless the cached information expires or the file is reopened.

In our TFS system, the story is different due to the variable chunk size strategy. The client cannot translate the byte offset into a chunk index directly. It has to know the sizes of all the chunks in the file before deciding which chunk should be read. Our solution is quite straightforward: when a client opens a file in read mode, it gets all the chunks' information from the master, including chunk handles, chunk sizes and locations, and uses this information to locate the proper chunk. Although this strategy is determined by the variable chunk size, its advantage is that the client only needs to communicate with the master once to read the whole file, which is more efficient than GFS's original design. The disadvantage is that when a client has opened a file for reading, data appended later by other clients is invisible to this client. But we believe this problem is negligible, as the majority of the files in web applications are typically created and appended once, and read by data processing applications many times without modification. If in any situation this problem becomes critical, it can easily be overcome by setting an expiration timestamp for the chunks' information and refreshing it when invalid.
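Because the client caches the handle, size and locations of every chunk when it opens a file, mapping a byte offset to a chunk is a simple scan over the cached sizes. The following C++ fragment sketches that lookup under assumed structure and function names (ChunkDescriptor, locate); it is illustrative, not the actual client code.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// What the client caches per chunk after opening a file in read mode:
// handle, size and replica locations, all returned by the master in one call.
struct ChunkDescriptor {
    uint64_t handle;
    uint64_t size;
    std::vector<std::string> locations;
};

// Translate a byte offset in the file into (chunk index, offset inside that chunk).
// With variable chunk sizes this walks the cached list; with GFS's fixed chunks
// it would be a single division.
std::optional<std::pair<std::size_t, uint64_t>>
locate(const std::vector<ChunkDescriptor>& chunks, uint64_t offset) {
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        if (offset < chunks[i].size)
            return std::make_pair(i, offset);   // read from chunks[i] at this offset
        offset -= chunks[i].size;
    }
    return std::nullopt;                        // offset is past the end of the file
}
```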
The TFS demonstrates our effort to build an infrastructure for large scale data processing. Although our system has similar assumptions and architecture to GFS, the key difference is that the chunk size is variable, which makes our system able to adopt different system interactions for the record appending operation. Our record appending operation is based on the chunk level, thus the aggregate record appending performance is no longer restricted by the network bandwidth of the chunk servers that store the last chunk of the file. Our experimental evaluation shows that our approach significantly improves the concurrent record appending performance for a single file by 25%. More results on TFS have been reported in [10]. We believe the design can apply to other similar data processing infrastructures.

2.2 Implementation of MapReduce

The MapReduce system is another major component of the cloud computing platform, and has attracted more and more attention recently [9, 7, 11]. The architecture of our implementation is similar to Hadoop [2], which is a typical master-worker structure. There are three roles in the system: Master, Worker and User. Master is the central controller of the system, which is in charge of data partitioning, task scheduling, load balancing and fault tolerance processing. Worker runs the concrete tasks of data processing and computation. There exist many workers in the system, which fetch the tasks from Master, execute the tasks and communicate with each other for data transfer. User is the client of the system, implements the Map and Reduce functions for the computation task, and controls the flow of computation.

2.2.1 Implementation Enhancement

We make three enhancements to improve the MapReduce performance in our system.

First, we treat intermediate data transfer as an independent task. Every computation task includes map and reduce subtasks. In a typical implementation such as Hadoop, the reduce task starts the intermediate data transfer, which fetches the data from all the machines conducting map tasks. This is an uncontrollable all-to-all communication, which may incur network congestion and hence degrade the system performance. In our design, we split the transfer task from the reduce task, and propose a “Data transfer module” to execute and schedule the data transfer tasks independently. With an appropriate scheduling algorithm, this method can reduce the probability of network congestion. Although this approach may aggravate the workload of Master when the number of transfer tasks is large, this problem can be alleviated by adjusting the granularity of transfer tasks and integrating data transfer tasks with the same source and target addresses. In practice, our new approach can significantly improve the data transfer performance.

Second, task scheduling is another concern in a MapReduce system, which helps to commit resources between a variety of tasks and schedule the order of task execution. To optimize the system resource utility, we adopt a multi-level feedback queue scheduling algorithm in our design. Multiple queues are used to allocate the concurrent tasks, and each of them is assigned a certain priority, which may vary for different tasks with respect to the resources requested. Our algorithm can dynamically adjust the priority of a running task, which balances the system workload and improves the overall throughput.

The third improvement is on data serialization. In the MapReduce framework, a computation task consists of four steps: map, partition, group and reduce. The data is read in by the map operation, intermediate data is generated and transferred in the system, and finally the results are exported by the reduce operation. There exist frequent data exchanges between memory and disk, which are generally accomplished by data serialization. In our implementation of the MapReduce system, we observed that simple native data types are frequently used in many data processing applications. Since the memory buffer is widely used, most of the data already reside in memory before they are de-serialized into a new data object. In other words, we should avoid expensive de-serialization operations which consume a large volume of memory space and degrade the system performance. To alleviate this problem, we define the data type for key and value as a void* pointer. If we want to de-serialize data with a native data type, a simple pointer assignment operation can replace the de-serialization operation, which is much more efficient. With this optimization, we can also sort the data directly in memory without data de-serialization. This mechanism can significantly improve the MapReduce performance, although it introduces some cost overhead for buffer management.
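The following C++ fragment sketches the idea behind this optimization: when a key or value is a simple native type that already resides in the memory buffer, a pointer cast takes the place of a real de-serialization step, and records can be compared (and therefore sorted) in place. The Record layout and function names are assumptions for illustration, not Tplatform's actual internals.

```cpp
#include <cstdint>
#include <string>

// A generic record as it sits in the in-memory buffer produced by a map task.
struct Record {
    void* key;            // keys and values are exposed as void* pointers
    void* value;
    std::size_t keyLen;
    std::size_t valueLen;
};

// Expensive path: build a new object from the serialized bytes (copy + allocation).
std::string deserializeString(const Record& r) {
    return std::string(static_cast<const char*>(r.value), r.valueLen);
}

// Cheap path for native types: the bytes in the buffer already *are* the value,
// so a pointer cast replaces de-serialization (assuming the buffer keeps values aligned).
const int64_t* asInt64(const Record& r) {
    return static_cast<const int64_t*>(r.value);
}

// Comparing through the raw pointers lets the framework sort records in memory
// without de-serializing them first.
bool lessByInt64Value(const Record& a, const Record& b) {
    return *asInt64(a) < *asInt64(b);
}
```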
2.2.2 Performance Evaluation on MapReduce

Due to the lack of a benchmark that can represent typical applications, performance evaluation of a MapReduce system is not a trivial task. We first use PennySort as a simple benchmark. The result shows that the performance of intermediate data transfer in the shuffle phase is the bottleneck of the system, which actually motivated us to optimize the data transfer module in MapReduce. Furthermore, we also explore a real application for text mining, which gathers statistics of Chinese word frequency in webpages. We run the program on a 200 GB Chinese Web collection. The map function analyzes the content of a web page and produces every individual Chinese word as the key; the reduce function sums up all aggregated values and exports the frequencies. In our testbed with 18 nodes, the job was split into 3385 map tasks, 30 reduce tasks and 101550 data transfer tasks; the whole job was successfully completed in about 10 hours, which is very efficient.
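Expressed against a mapper/reducer interface like the one sketched in the introduction, this word-frequency job could look roughly as follows. This is only an illustrative reconstruction; in particular, real Chinese text requires a word segmenter rather than the whitespace tokenization used here, and the Emitter type is repeated from the earlier sketch for self-containedness.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Assumed framework-side collector (see the interface sketch in the introduction).
struct Emitter {
    std::vector<std::pair<std::string, std::string>> out;
    void emit(const std::string& k, const std::string& v) { out.push_back({k, v}); }
};

// Map: the input is one web page; emit ("<word>", "1") for every word found.
// A real implementation would strip HTML and run a Chinese word segmenter here.
void wordFreqMap(const std::string& /*url*/, const std::string& pageText, Emitter& emit) {
    std::istringstream tokens(pageText);
    std::string word;
    while (tokens >> word)
        emit.emit(word, "1");
}

// Reduce: the input is one word plus all the counts emitted for it; sum them up.
void wordFreqReduce(const std::string& word,
                    const std::vector<std::string>& counts, Emitter& emit) {
    uint64_t total = 0;
    for (const std::string& c : counts)
        total += std::stoull(c);
    emit.emit(word, std::to_string(total));
}
```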
2.3 Practical Issues for System Implementation

The data storage and computation capabilities are the major factors of the cloud computing platform, which determine how well the infrastructure can provide services to end users. We met some engineering and technical problems during the system implementation. Here we discuss some practical issues in our work.

2.3.1 System Design Criteria

In the system design, our purpose is to develop a system which is scalable, robust, high-performance and easy to maintain. However, some system design issues may conflict with each other, which places us in a dilemma in many cases. Generally, we take three major criteria into consideration for system design: 1) For a certain solution, what is the bottleneck of the procedure that may degrade the system performance? 2) Which solution has better scalability and flexibility for future change? 3) Since network bandwidth is the scarce resource of the system, how do we fully utilize the network resource in the implementation? In the following, we present an example to show our considerations in the implementation.

In the MapReduce system, fault tolerance can be conducted by either the master or the workers. Master takes the role of the global controller, maintains the information of the whole system, and can easily decide whether a failed task should be rerun, and when/where it should be rerun. Workers only keep local information, and take charge of reporting the status of running tasks to Master. Our design combines the advantages of these two factors. The workers can rerun a failed task a certain number of times, and are even allowed to skip some bad data records which cause the failure. This distributed strategy is more robust and scalable than a centralized mechanism, i.e., only re-scheduling failed tasks on the Master side.

2.3.2 Implementation of Inter-machine Communication

Since the implementation of the cloud computing platform is based on a PC cluster, how to design the inter-machine communication protocol is the key issue of programming in the distributed environment. The Remote Procedure Call (RPC) middleware is a popular paradigm for implementing the client-server model of distributed computing; it is an inter-process communication technology that allows a computer program to cause a subroutine or procedure to execute on another computer in a PC cluster without the programmer explicitly coding the details of this remote interaction. In our system, all the services and heart-beat protocols are RPC calls. We exploit the Internet Communications Engine (ICE), which is an object-oriented middleware that provides object-oriented RPC, to implement the RPC framework. Our approach performs very well under our system scale and can support an asynchronous communication model. The network communication performance of our system with ICE is comparable to that of specialized asynchronous protocols built with socket programming, which are much more complicated to implement.

2.3.3 System Debug and Diagnosis

Debugging and diagnosis in a distributed environment are a big challenge for researchers and engineers. The overall system consists of various processes distributed in the network, and these processes communicate with each other to execute a complex task. Because of the concurrent communications in such a system, many faults are generally not easy to locate, and hence can hardly be debugged. Therefore, we record a complete system log in our system. On all the server and client sides, important software boundaries such as API and RPC interfaces are all logged. For example, the log of RPC messages can be used to check the integrity of the protocol, and the log of data transfer can be used to validate the correctness of the transfer. In addition, we record a performance log for performance tuning. In our MapReduce system, the log on the client side records the details of the data read-in and write-out times of all tasks and the time cost of the sorting operation in the reduce task, which are tuning factors of our system design.

In our work, the recorded log not only helps us diagnose the problems in the programs, but also helps find the performance bottleneck of the system, and hence we can improve the system implementation accordingly. However, distributed debugging and diagnosis are still inefficient and labor-consuming. We expect better tools and approaches to improve the effectiveness and efficiency of debugging and diagnosis in large scale distributed system implementation.

3 Conclusion

Based on our experience with Tplatform, we have discussed several practical issues in the implementation of a cloud computing platform following the Google model. It is observed that while GFS/MapReduce/BigTable provides a great conceptual framework for the software core of a cloud and Hadoop stands for the most popular open source implementation, there are still many interesting implementation issues worth exploring. Three are identified in this paper.

• The chunk size of a file in GFS can be variable instead of fixed. With careful implementation, this design decision delivers better performance for read and append operations.

• The data transfer among participatory nodes in the reduce stage can be made “schedulable” instead of “uncontrolled”. The new mechanism provides an opportunity for avoiding the network congestion that degrades performance.

• Data with native types can also be effectively serialized for data access in map and reduce functions, which presumably improves performance in some cases.

While Tplatform as a whole is still in progress, namely the implementation of BigTable is ongoing, the finished parts (TFS and MapReduce) are already useful. Several applications have shown the feasibility and advantages of our new implementation approaches. The source code of Tplatform is available from [5].
Acknowledgment

This work was supported by 973 Project No. 2007CB310902, IBM 2008 SUR Grant for PKU, and National Natural Science Foundation of China under Grant No. 60603045 and 60873063.

References

[1] China Web InfoMall. 2008.
[2] The Hadoop Project. /, 2008.
[3] The KosmosFS Project. /, 2008.
[4] Tianwang Search. 2008.
[5] Source Code of Tplatform Implementation. /˜webg/tplatform, 2009.
[6] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: a distributed storage system for structured data. In OSDI'06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation, pages 15–15, 2006.
[7] J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. In OSDI'04: Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation, pages 137–150, 2004.
[8] G. Sanjay, G. Howard, and L. Shun-Tak. The Google file system. In Proceedings of the 19th ACM Symposium on Operating Systems Principles, pages 29–43, 2003.
[9] H. Yang, A. Dasdan, R. Hsiao, and D. S. Parker. Map-reduce-merge: simplified relational data processing on large clusters. In SIGMOD'07: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 1029–1040, 2007.
[10] Z. Yang, Q. Tu, K. Fan, L. Zhu, R. Chen, and B. Peng. Performance gain with variable chunk size in GFS-like file systems. In Journal of Computational Information Systems, pages 1077–1084, 2008.
[11] M. Zaharia, A. Konwinski, A. D. Joseph, R. Katz, and I. Stoica. Improving MapReduce performance in heterogeneous environments. In OSDI'08: Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation, pages 29–42, 2008.
