Next-Generation AWS Cloud Computing Architecture: Nitro
Future Evolution of Key CPU Technologies
In the post-Moore's-law era, the performance gains from process-node improvements alone are very limited. With the end of Dennard scaling, chip power consumption has risen sharply and per-transistor cost has stopped falling; single-core performance is approaching its limit, and gains from multi-core architectures are also slowing.
With the arrival of the AIoT era, downstream demand for compute has become diverse and fragmented, which general-purpose processors struggle to serve.
1) From general-purpose to domain-specific: chips are tailored to the characteristics of particular workloads, giving rise to XPUs, FPGAs, DSAs, and ASICs.
2) From the bottom of the stack to the top: software, algorithms, and hardware architecture are optimized together.
Architectural optimization alone can raise processor performance substantially. For example, AMD's Zen 3 merged two separate 16 MB L3 caches into a single 32 MB L3 cache and, combined with improved branch prediction and wider floating-point units, delivered roughly 19% higher single-core performance than Zen 2.
3) Heterogeneity and integration: Apple's M1 Ultra chip is instructive here. Using increasingly mature 3D packaging and die-to-die interconnect technologies to integrate multiple chips effectively appears to be the most promising path for extending Moore's law.
Mainstream chip vendors are already positioning themselves across the board. Intel has CPU, FPGA, and IPU product lines, is investing heavily in GPUs, has announced the new Falcon Shores architecture, and is refining its heterogeneous packaging technology. NVIDIA has released the Grace series of multi-chip module (MCM) products, which are expected to enter volume production soon. AMD recently completed its acquisition of Xilinx and is expected to move toward CPU+FPGA heterogeneous integration.
In addition, ten major industry players, including Intel, AMD, Arm, Qualcomm, TSMC, Samsung, ASE, Google Cloud, Meta, and Microsoft, have jointly founded a chiplet standards consortium and formally released Universal Chiplet Interconnect Express (UCIe), a high-speed interconnect standard for general-purpose chiplets.
Under the UCIe framework, the interconnect interface standard is unified.
Chiplets built on different process nodes and with different functions can then be combined through 2D, 2.5D, 3D, and other packaging approaches, so that processing engines of many kinds together form very large, complex chip systems with high bandwidth, low latency, and good cost and energy efficiency.
An Introduction to the Amazon AWS Cloud Computing Platform
Cloud computing is a computing model whose core idea is to host computing resources, data, and applications on the Internet so that users can reach these services at any time and from anywhere over the network.
Amazon's AWS platform has become one of the world's leading cloud computing service providers.
This article gives a brief introduction to the AWS cloud computing platform.
I. History and development
The AWS cloud platform was launched by Amazon in 2006 and was originally built to serve Amazon's own business needs.
By 1998 Amazon was expanding rapidly, and the traditional client-server architecture could no longer keep up with the company's business.
Amazon therefore began exploring new computing models and eventually settled on cloud computing: hosting computing resources, data, and applications on the Internet so they could be accessed anywhere, at any time.
As AWS grew, more and more enterprises and institutions recognized the importance of cloud computing and began using AWS to deliver all kinds of IT services.
Today AWS is one of the world's leading cloud providers; its customers include many well-known organizations such as NASA, Netflix, Airbnb, Dropbox, and Spotify.
II. Services and applications
The AWS platform offers services and applications spanning compute, storage, databases, security, developer tools, artificial intelligence, IoT, and more.
Some of the main services and applications are introduced below.
1. Compute services. Compute is one of AWS's core service families and includes EC2, Lambda, Batch, and others.
EC2 is an elastic compute service that lets users rent virtual machine instances on Amazon's infrastructure and pay for compute on demand (historically billed by the hour).
Lambda is a serverless compute service: users write and run code without worrying about managing or maintaining the underlying infrastructure.
Batch is a batch-processing service that makes it easy to run batch jobs on Amazon's infrastructure.
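To make the Lambda model concrete: a function is just code with a handler entry point, and the service invokes it per event. The sketch below shows the Python handler convention; the event fields used here are hypothetical and depend on the real trigger.

```python
# Minimal sketch of an AWS Lambda handler (Python runtime).
# The event shape below is hypothetical; real events depend on the trigger.
import json

def handler(event, context):
    # Lambda passes the trigger payload as `event`; `context` carries
    # runtime metadata (request id, remaining time, and so on).
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Locally, the handler is an ordinary function and can be called directly:
print(handler({"name": "AWS"}, None))
```

Because there is no server to manage, testing locally is just a function call, as the last line shows.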
2. Storage services. Storage is another of AWS's core service families.
AWS Knowledge Summary
AWS (Amazon Web Services) is Amazon's cloud computing platform. Through it, users obtain compute capacity, storage, databases, and other services on demand, cutting costs and improving efficiency.
AWS offers a broad range of services covering compute, storage, databases, networking, developer tools, security and identity, analytics, artificial intelligence, and more. Some key points are summarized below.
I. Compute services
1. EC2 (Elastic Compute Cloud). EC2 is one of AWS's most central services: it provides scalable virtual server instances that users can obtain and launch quickly.
EC2 instances scale elastically; users can adjust instance size and performance at any time to match actual demand.
2. Lambda. Lambda is AWS's serverless compute service: users upload code and it runs without any servers to manage, scaling automatically with request volume.
Lambda supports multiple languages, including Node.js, Python, and Java.
3. ECS (Elastic Container Service). ECS is AWS's container management service; users run Docker containers on ECS for rapid application deployment and scaling.
4. EKS (Elastic Kubernetes Service). EKS is AWS's managed Kubernetes service, making it easy to run Kubernetes clusters and operate containerized applications.
5. Auto Scaling. Auto Scaling automatically adjusts the number of EC2 instances in response to actual load, helping keep systems stable and available.
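The scaling decision just described can be approximated as a proportional rule: desired capacity grows with the ratio of the observed metric to its target, clamped to the group's bounds. This is a simplified local sketch, not the AWS API or its exact formula.

```python
import math

def desired_capacity(current_capacity, metric_value, target_value,
                     min_size=1, max_size=10):
    """Simplified target-tracking rule: scale capacity in proportion to
    how far the observed metric (e.g. average CPU %) is from the target,
    then clamp to the group's min/max bounds. Illustrative only."""
    raw = current_capacity * (metric_value / target_value)
    return max(min_size, min(max_size, math.ceil(raw)))

# 4 instances at 90% average CPU against a 50% target -> scale out
print(desired_capacity(4, 90, 50))   # 8
# 4 instances at 20% average CPU -> scale in
print(desired_capacity(4, 20, 50))   # 2
```

The ceiling rounds up so the group scales out aggressively and scales in conservatively, which matches the availability goal stated above.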
II. Storage services
1. S3 (Simple Storage Service). S3 is AWS's object storage service: it stores and retrieves any amount of data with high availability and durability, and suits static files, media content, backup data, and the like.
2. EBS (Elastic Block Store). EBS is AWS's persistent block storage; EBS volumes attach to EC2 instances to hold application data, databases, file systems, and so on.
AWS Certified Solutions Architect
AWS Certified Solutions Architect: Building Reliable, Secure Cloud Architectures
Cloud computing has become the first choice of many enterprises, and AWS, as the world's leading cloud platform, is particularly favored. As cloud platforms keep evolving, demand for AWS Certified Solutions Architects keeps growing.
The Solutions Architect certification is one of AWS's core credentials; holding it demonstrates professional competence in designing and deploying architectures on AWS.
So what skills and knowledge does this kind of professional need?
First, an AWS Certified Solutions Architect needs deep familiarity with AWS's services and features, including but not limited to core services such as EC2, S3, and VPC.
Second, they need solid architecture design skills: the ability to turn customer requirements into secure, highly available, resilient, and scalable cloud solutions.
They also need to understand cloud best practices and security standards, so the architectures they design meet industry norms and customer expectations.
In practice, an AWS Certified Solutions Architect may fill several roles, including technical consultant, solutions architect, and systems engineer.
They communicate with customers to understand requirements and shape cloud solutions accordingly.
They also guide team members to ensure the designed architecture is deployed and maintained effectively.
Finally, earning the certification requires passing the corresponding exam to demonstrate the necessary skills and knowledge. During preparation, AWS-certified training courses can provide professional guidance and hands-on experience that improve the odds of passing.
In short, as one of the core competencies enterprises need in the cloud era, the AWS Certified Solutions Architect role offers broad employment prospects and room for growth. Those who keep learning and improving their skills will find ever more opportunities and challenges in this field.
Building PB-Scale Enterprise BI Solutions Easily with AWS
AWS (Amazon Web Services) is a leading global cloud provider whose enterprise services help organizations build and deploy highly scalable solutions quickly.
This article looks at how to use AWS to build a PB-scale enterprise BI (business intelligence) solution.
In today's fiercely competitive markets, enterprises need timely, accurate access to and analysis of large volumes of data in order to make sound business decisions.
Traditional BI solutions are typically expensive and complex, and often inefficient at large data volumes.
With AWS's cloud services, by contrast, a PB-scale BI solution with high performance, high scalability, and low cost is straightforward to build.
First, S3 (Simple Storage Service) can hold the PB-scale data itself.
S3 is a scalable, highly secure object store that accommodates any amount of data and is accessed and managed through a simple API.
Storing data in S3 gives high availability and durability, and storage capacity grows as needed.
Second, AWS's data processing services, such as Amazon Redshift and Amazon EMR, can process data at PB scale.
Amazon Redshift is a fully managed cloud data warehouse for large-scale warehousing and analytics that handles PB-scale data quickly and efficiently.
Redshift also integrates with mainstream BI tools, making data visualization and analysis convenient.
Amazon EMR is a fully managed big-data platform built around frameworks such as Hadoop, for processing and analyzing large datasets.
With EMR, large-scale data processing and analysis jobs are easy to run.
AWS's analytics portfolio also offers other powerful tools and services, such as Amazon Athena and Amazon QuickSight.
Amazon Athena is a serverless interactive query service for analyzing large amounts of data stored in S3.
With Athena, PB-scale data can be queried easily with fast results.
Amazon QuickSight is a cloud BI tool that helps users quickly build interactive visual reports from multiple data sources.
Amazon's Cloud Computing Platform: AWS
I. Overview
Amazon Web Services (AWS) is the cloud computing service Amazon has built up in recent years, providing a secure, reliable network architecture and computing platform.
It delivers computing as services over the Internet to customers worldwide, offering many cloud products such as virtual machines (EC2), databases (RDS), storage (S3), and messaging/notifications (SNS).
It also provides information security management, encryption and key management (KMS), container services (ECS), and application and website hosting, giving enterprises a high-quality environment for development and operations and accelerating the growth of their IT applications.
II. Characteristics
1. Automatic scaling: AWS adjusts server resources according to how customers actually use them, so users can dynamically grow or shrink capacity to match application demand.
2. High availability: AWS delivers highly available services with sustained, stable operation and strong reliability. This availability and reliability is one of the main reasons many enterprises have already moved to the cloud.
3. Storage: Amazon's cloud storage service (S3) provides very large capacity with strong performance. Users need not buy hardware or upgrade their own storage systems, which lowers operating costs.
4. Security: AWS provides a suite of management tools that let users control the security level of data storage and transmission and protect their information.
Amazon AWS Server Architecture and Optimization
Amazon AWS (Amazon Web Services) is one of the largest cloud computing providers in the world.
AWS offers a full range of cloud services: compute, storage, databases, analytics, machine learning, artificial intelligence, and more.
Its server architecture is one of the keys to its success.
This article examines AWS's server architecture and how to optimize for it.
1. Server architecture
AWS's infrastructure is organized in layers: Regions, Availability Zones, data centers, instances, and storage. We introduce them in turn.
Data centers are the physical core of AWS's cloud infrastructure: they provide reliable power, networking, cooling, and physical security. AWS operates on the order of a hundred data centers worldwide.
A Region is a separate geographic area made up of multiple nearby data centers; at the time of the original writing AWS operated 22 Regions, and the number has continued to grow. Every data center belongs to exactly one Region.
Within a Region, the next layer is the Availability Zone (AZ): one or more data centers in the same Region that operate independently. Because AZs are isolated from one another, they enable high availability and disaster recovery. For example, the US East (Virginia) Region comprises six Availability Zones.
Instances are the virtual servers AWS provides, offering a range of compute capabilities and supporting many operating systems and applications.
Instances are planned and allocated according to business needs and can be added or removed dynamically.
Instance sizing is very flexible, with many instance types to choose from as needed.
Storage covers AWS's cloud storage services, spanning object storage, file storage, and block storage.
Storage, too, can be chosen flexibly to fit the workload.
2. Optimization
When using AWS services, several techniques improve system performance and availability; we cover the main ones below.
2.1 Planning. The most important step is planning: deploying and adjusting resources according to business requirements and the characteristics of each service.
Amazon Cloud Computing: Applications and the Future
Amazon's cloud, the world's largest public cloud service provider, delivers secure, easy-to-use, highly available cloud services in more than 90 countries and has become a leading force in global digital transformation.
It combines a rich service portfolio, a powerful global infrastructure, and flexible billing, and its applications are broad and still expanding.
Architecture: Amazon's cloud is a highly scalable, elastic platform, supported by a multi-layered technical architecture that underpins both the platform itself and the user experience.
It includes rich application programming interfaces (APIs), developer tools, applications, storage, and network security as foundational cloud services, and can satisfy application requirements of every scale, type, and security level in the market.
Applications: Amazon's cloud technology is widely used across industries such as IoT, finance, healthcare, telecommunications, and energy.
In IoT, Amazon offers the AWS IoT (Internet of Things) service, which supports device connectivity, network security, data analytics, and more, giving enterprises end-to-end support for customized applications.
For the financial industry, Amazon provides strong infrastructure support through services such as AWS Elastic Load Balancing, RDS, and CloudFront, helping banks improve efficiency and strengthen data security controls.
In healthcare, Amazon's cloud supports medical data analytics, electronic health records, video conferencing, and similar applications.
The future: as digital transformation accelerates, cloud computing remains central to future technology, and Amazon will keep investing in innovation and product development. It will focus on three directions:
1. Security and compliance: Amazon will continue extending and refining its security controls to ensure cloud usage is secure and compliant, and will broaden the coverage of intelligent, automated security controls.
2. Machine intelligence: Amazon will invest heavily in AI research. Unlike some peers, its cloud will lean toward data analytics, natural language processing, and machine learning, building more valuable intelligent applications.
3. Data analytics and prediction: Amazon will keep leveraging the strengths of the cloud to give enterprises and individuals more flexible analytics and prediction platforms, supporting smarter, more precise uses of data.
Research on Telecom Operators' Large-Model Hardware Infrastructure Innovation and RDMA Flow Control Technology
CHE Biyao (1), ZHANG Yonghang (2), LIAO Yi (2), TANG Jian (2), FAN Xiaoping (2), ZHAO Jizhuang (1), LU Gang (1)
(1. China Telecom Corporation Limited Research Institute, Beijing 102209; 2. China Telecom Cloud Technology Co., Ltd., Beijing 100007)
Abstract: Starting from the main modes of large-model hardware infrastructure innovation in the industry, this paper discusses how telecom operators should choose their own innovation path in this field. Based on real networking environments and service scenarios, it proposes requirements and designs a congestion control algorithm that supports NO-PFC and requires no switch configuration, using RTT as the congestion signal to control switch queue length and achieve low latency.
Keywords: RDMA congestion control; large-model infrastructure innovation; operator data center networks
CLC number: TP30; F124    Document code: A
Citation: CHE Biyao, ZHANG Yonghang, LIAO Yi, et al. Research on hardware infrastructure innovation for large language model of telecom operators and RDMA traffic control technology [J]. Information and Communications Technology and Policy, 2024, 50(2): 26-32. DOI: 10.12267/j.issn.2096-5931.2024.02.005.

0 Introduction
"People who are really serious about software should make their own hardware" [1]. After more than a decade of development, cloud computing has reached the stage where hardware innovation is the industry's main driving force. Since the large-model era opened at the end of 2022, the world's leading cloud providers have not only launched their own large models in 2023 but also committed firmly to in-house development of large-model hardware infrastructure. This paper first analyzes the route choices open to telecom operators for independent innovation in large-model hardware infrastructure, and then focuses on progress in Remote Direct Memory Access (RDMA) congestion control based on China Telecom's cloud-network convergence large-scale research facility.

1 An innovation roadmap for operators' large-model hardware infrastructure
Large-model hardware infrastructure innovation happens at three levels.
First, developing artificial intelligence (AI) compute chips. In 2023, AWS introduced its second-generation AI chip Trainium2, Microsoft introduced Maia 100, and Google introduced TPU v5p. All of these take the application-specific integrated circuit (ASIC) route, accelerating particular AI workloads, rather than the general-purpose graphics processing unit (GPU) route.
Second, developing data processing units (DPUs), such as AWS's Nitro, Google's IPU, Alibaba's CIPU, and China Telecom's Zijin DPU. The DPU is where a cloud provider's fundamental technology lives: the most important virtualization, networking, storage, and security functions of cloud hosts are all offloaded into it. Unlike earlier smart NICs, which could offload only parts of the software stack, the entire infrastructure software stack can now run on the DPU, and the freed central processing unit (CPU) cores can be sold to end users. Leading providers design their DPUs to strengthen the features that make their own architectures distinctive; because the functionality is very complex and provider-specific, DPU standardization across the industry remains limited.
Third, developing real-time processing logic that runs on dedicated data center communication hardware: for example, RDMA congestion control logic and network load-balancing logic embedded in high-speed NICs, and customized protocol handling on switches.
The commercial value of the first two levels is twofold. Self-developed chips give a cloud provider competitive advantages that others cannot easily copy, such as AWS's Nitro; and they sharply cut spending on third-party chips. One can expect that once the leading results of Google's natively multimodal model Gemini are widely recognized, the Google Tensor Processing Units (TPUs) used to train Gemini will no longer be for internal use only; external customers will shift from general-purpose GPUs to the cheaper self-developed TPUs, greatly reducing Google's GPU purchases. But chip development at these two levels requires enormous, sustained investment and carries a high risk of failure. Current implementation paths include the following modes.
One: joint development with large chip companies, which supplements in-house capability and raises the odds of success. Microsoft, for example, formed an independent team of several hundred people to develop the AI chip code-named Athena with AMD, a project estimated at over US$2 billion. Google's TPU v1 to v4 were co-designed with Broadcom, which contributed key intellectual property and handled manufacturing, testing, and packaging of the new chips for Google's new data centers; Broadcom also designs ASICs with other customers such as Facebook, Microsoft, and AT&T.
Two: acquiring a semiconductor design company and pursuing fully independent chip design. Amazon acquired Annapurna Labs years ago, and the resulting AI inference/training and networking chips are already deployed at scale.
Three: acquiring startups for complete intellectual property (IP) and talent, as Microsoft did with DPU startup Fungible.
Four: building a design team and buying complete third-party IP to customize into one's own chip. Few IP vendors fit cloud providers' customization needs, however, and the commercial model fits poorly with operators' standardized procurement processes.
Five: partnering with small startup device vendors that already have successful silicon, customizing upper-layer functions to bring out a chip quickly.
Six: developing products on field-programmable gate arrays (FPGAs) with fully self-controlled core IP, accumulating chip experience step by step, taping out when the time is ripe, and finally deploying low-cost chips at scale. Microsoft began FPGA-centered hardware development as early as 2010. Because FPGAs are ubiquitous in ICT network equipment, operators choosing the same FPGA route in the cloud can reuse IP. And given how hard it is to decouple high-end cloud-network equipment (high-speed DPUs plus high-speed switches), operator-side FPGA devices can bridge the protocols of switches from different vendors, preserving the operator's core control over the network.
In summary, judged against operators' own service scenarios, actual needs, and current R&D capacity: chip development is slow, hugely expensive, and slow to pay off, and tape-out failure risk is extremely high. Chips co-developed through upper-layer customization arrive faster, but because the operator's engineers do not deeply participate in the IP design, this is bad for long-term control of core competence. The third level, hardware logic embedded in special-purpose hardware, has a shorter cycle, controllable risk, and a real chance of producing a distinctive technical architecture. For instance, as 100G+ high-speed NICs gradually open programmable interfaces under customer pressure, developing RDMA flow-control logic for large-model AI-compute scenarios that runs on high-speed NICs is a cost-effective choice.
RDMA flow control is one of the key technologies for guaranteeing the network performance of large-model training. It comprises two techniques. RDMA congestion control regulates the rate at which each compute server sends data into the data center network. RDMA multi-path load balancing aims to spread injected packets fairly and maximally across all physical links in the fabric so that flows complete quickly, avoiding the situation where some links are overloaded while others sit underutilized. At present, both require control logic developed by the operator to be embedded in hardware that meets particular specifications before they can work in 100G, 200G, 400G, and future 800G high-speed NICs and switches.
In 2023, China Telecom Research Institute and China Telecom Cloud Technology Co., Ltd. worked closely on RDMA congestion control. Given the large scale and high reliability requirements of operator AI-compute networks, the R&D goals centered on deployability: remove the dependence on Priority-based Flow Control (PFC) as far as possible, simplify switch configuration, avoid tedious Explicit Congestion Notification (ECN) watermark tuning, and obtain a high-speed, NO-PFC, NO-ECN, zero-queuing congestion control algorithm. Using the facility's simulation and physical testbeds, the team iterated on methods to push the performance curve and developed the Chinatelecom Congestion Control (CTCC) algorithm, which showed clear measured advantages in incast, all-flash storage, and Mixture-of-Experts (MoE) large-model training scenarios.

2 State of RDMA flow control research
2.1 Mainstream technical routes
As large-model compute performance soars, cloud host network bandwidth has moved from the single-port 25G typical of general-purpose cloud computing to single-port 400G in pursuit of higher GPU speed-ups. At such rates, software network stacks can no longer extract the NIC's full performance. Leading cloud providers have widely adopted RDMA across workloads in high-compute data centers, offloading the network stack into NIC hardware for direct data transfer. But RDMA networks face challenges in balancing low latency, high bandwidth utilization, and high stability. Because packet loss hurts workloads (especially large-model training) badly, avoiding congestion while exploiting full-path network load is key to performance in compute-network cooperation scenarios, and cloud providers are actively investing in independent innovation here.
Congestion in data center networks arises mainly from incast traffic and uneven traffic scheduling. To handle these two cases and improve RDMA network performance and reliability, the industry pursues two technical routes: congestion control algorithms and traffic-path load balancing. The former aims at efficient congestion control protocols that sense link congestion and pace individual flows; the latter adjusts the paths traffic takes into the network to avoid congestion, especially the local link overload easily caused by the complex topologies and communication patterns of large-model training.
Mainstream congestion control algorithms sense link congestion via signals such as ECN, round-trip time (RTT), and in-band network telemetry (INT), and respond at microsecond scale. The most widely deployed, ECN-based representative is Data Center Quantized Congestion Notification (DCQCN) [2], developed jointly by Microsoft and Mellanox; it requires switches to mark packets during congestion, with the receiver feeding the marks back to the sending NIC for rate control. RTT-based schemes rely on NIC hardware for high-precision delay measurement, need no switch participation, and are relatively easy to deploy; Google's TIMELY and SWIFT algorithms [3-4] both take this route. INT-based schemes precisely control in-flight traffic using the egress rate and queue depth recorded by switches along the path, and require switches to support specific INT packet formats [5-6].
For traffic-path load balancing, the mainstream routes are dynamic load balancing and multi-path transport. Dynamic load balancing senses link failure or congestion and rewrites the packet-header fields that feed the load-balancing hash, achieving adaptive routing. Tencent's host-network cooperative fast failure recovery scheme, Hash DODGING [7], takes this route: NICs and switches use hash-offset-based path control, and after sensing a failure the endpoint rewrites the type-of-service field in the packet header to reroute. Multi-path transport designs aim at packet-level or even cell-level load balancing, to fix the long-tail latency that traditional Equal-Cost Multipath (ECMP) causes when long and short flows mix. AWS's SRD protocol [8] implements per-packet load-balanced forwarding, relying on the in-house Nitro chip for out-of-order reassembly. Google's Aquila network architecture [9] uses a custom ToR-in-NIC (TiN) chip that tightly couples NIC and switch hardware, with the private cell-based L2 protocol GNet providing cell-granular switching. Broadcom takes a Distributed Disaggregated Chassis (DDC) approach [10], proposing NIC-based end-to-end cell forwarding across the whole network, or cell forwarding only between leaf and spine.
Today's advanced load-balancing schemes mostly depend on host-network cooperation and require customized switch and NIC capabilities. With no unified standard yet, each vendor supports them through proprietary technology; openness is currently insufficient, multi-vendor interoperation is difficult, and large-scale deployment in operators' production networks faces obstacles. End-to-end congestion control, by contrast, optimizes congestion and latency without changing application software or network hardware, making it the most cost-effective way to raise communication performance in large clusters. Given their network environments and workloads, operators can begin with efficient, easily deployable congestion control algorithms that can land in the short term, then explore host-network cooperative load-balancing strategies as data centers are upgraded, arriving at a more complete flow-control solution.
2.2 Challenges and optimization goals
DCQCN is the default RDMA congestion control algorithm in standard NICs. It senses congestion only after the switch queue builds past the ECN watermark, so in large-scale incast it relieves congestion slowly and keeps triggering PFC before convergence. It also has too many hyperparameters; performance depends strongly on their values, and tuning them in real deployments is hard. Furthermore, DCQCN paces entirely by the Congestion Notification Packets (CNPs) the receiver returns after switches on the path mark ECN. The trade-offs of this scheme:
At the senders: because all senders under one switch receive nearly identical congestion signals, flows running the same formula on equivalent inputs converge to similar rates; in throughput plots the per-flow curves nearly coincide. On the facility's physical testbed, DCQCN throughput approached link bandwidth with very good inter-flow fairness.
The ECN signal cannot report the exact switch queue length, so under congestion queues easily build up and trigger PFC.
When several traffic classes share a link: each switch port has only 8 priority queues, so with more than 8 services some must share a queue. When the services' traffic models differ, flows sharing a queue interfere with one another, and a PFC pause on the port creates victim flows.
Pacing should consider both switch links and host processing speed, but the ECN signal cannot reflect the peer host's processing rate.
Weighing the state of operators' installed equipment and real service needs across performance, reliability, and cost, the 2023 design goals for the self-developed CTCC algorithm were as follows. One: lower service latency, meeting RDMA networks' demand for high throughput and low delay; the algorithm monitors congestion via end-to-end RTT, responds quickly, and bounds switch queue length, reducing in-network queueing delay and jitter. Two: support NO-PFC; the algorithm must work correctly with PFC disabled and avoid the performance collapse of sustained loss, ensuring network reliability. Three: simplify deployment; industrial practice stresses deployability, so the new scheme should require no changes to network equipment, being implemented and configured mainly on the NIC to cut deployment cost and complexity.

3 China Telecom's self-developed RDMA congestion control algorithm
Switch queue length is a direct reflection of network congestion; holding a stable, low queue achieves low latency and high throughput at once. Excluding software-side jitter, RTT is dominated by packets' queueing delay at switches and tracks changes in congestion quickly. As hardware improves, NICs offer higher clock precision and more accurate timestamps, making high-precision NIC-based delay measurement possible and providing the precondition for designing and implementing an RTT-based RDMA congestion control protocol for data centers.
Against DCQCN's queue buildup, lagging congestion response, and heavy PFC dependence, CTCC uses the RTT signal for finer-grained pacing: an end-to-end, rate-based congestion control protocol that can be implemented on the programmable congestion control (PCC) capability of existing commercial NICs or DPUs. Compared with existing algorithms it has two main innovations: rate-based pacing on RTT alone, requiring no switch configuration while effectively keeping switch queues low and latency down; and, to support NO-PFC, fast rate reduction on receiving a negative acknowledgment (NACK) packet, limiting the performance loss from drops.
3.1 Algorithm design
Figure 1: CTCC congestion control algorithm implementation framework.
As Figure 1 shows, CTCC uses the RTT signal to track the trend of congestion against a target RTT. When measured RTT exceeds the target, the network is congested and the sending NIC slows down; when it is below the target, the path is clear and the sender probes upward. In addition, the NIC cuts its rate sharply on receiving a NACK, avoiding the performance loss of sustained drops. CTCC is implemented mainly in the NIC and uses an RTT probing method that requires no change to the RDMA protocol or software stack: the sending NIC emits a probe packet when the congestion control algorithm requests one, and adjusts its rate on receiving the RTT response packet or a NACK, following an additive-increase/multiplicative-decrease (AIMD) strategy. The receiving NIC returns acknowledgment (ACK) and NACK packets and, on receiving an RTT probe, records the relevant timestamps and returns an RTT response to the sender. To keep reverse-path congestion from delaying the RTT feedback, RTT responses are given high priority. The algorithm needs no switch participation, which lowers deployment cost and better supports network changes and scale-out/scale-in in dynamic environments.
The hard case for CTCC: a typical scenario is 7,000 senders transmitting to one receiver. The constraint is that the senders know nothing of one another; each can only control its sending rate from microsecond-level delays measured by sending probe frames to the receiver. The goals: sender rates must converge quickly to equality to guarantee fairness; the aggregate sending rate must not exceed the receiver's link bandwidth; switch queues must not fill enough to emit PFC pause frames; and bottleneck-link throughput should approach link bandwidth. These goals must still hold under dynamic network change and complex workloads: dynamically adding or removing 1,000 senders toward the same receiver mid-run, multiple service types sharing the senders' physical links, paths crossing multiple switches, and mixes of large and small flows.
3.2 Advantages
A pure-RTT scheme needs no switch cooperation and runs on existing commercial NICs, cutting replacement and operations cost. CTCC relies only on the RTT signal, needs no advanced switch features such as programmability, and is built on the PCC framework of commercial NICs, with no customized hardware.
Fast decrease on NACK supports NO-PFC scenarios. The algorithm halves the rate directly on receiving a NACK, so even with PFC off it can ride out large-scale bursts; fast decrease greatly reduces the number of drops and the performance loss they cause.
Few parameters lower tuning difficulty. With no dependence on PFC or ECN, there is no tedious switch watermark configuration; the NIC implementation is simple and has few hyperparameters, greatly reducing tuning difficulty and the work of deployment and operations.
3.3 Controller design
During development and testing, as testbed node counts grew, preparation work (algorithm flashing, NIC and switch configuration) ballooned, and inconsistencies between nodes' algorithm configurations became common. Verifying commercial readiness requires thousands of tests covering many basic scenarios, and unified recording and aggregation of results is the basis of analysis and optimization. To solve this, the team built the CTCC centralized controller, with a graphical interface providing one-click flashing of algorithm images to many devices, dynamic hyperparameter push, algorithm-type switching, automated testing, live result monitoring, and experiment tracking, greatly reducing the workload and complexity of R&D testing and keeping results reliable.
High-precision network metric collection and monitoring is an important component of the controller and a key technical difficulty. Congestion control development often needs to observe network performance at the instant traffic changes, which demands very high collection precision. The controller uses network telemetry in push mode, periodically and proactively pushing device interface counters and CPU/memory data to the collector; compared with the traditional request-response pull mode, this provides more real-time, higher-rate collection. The data are then encoded with Protocol Buffers and reported to the collector in real time for reception and storage. This scheme achieves sub-second monitoring precision.
3.4 Performance evaluation
The self-developed algorithm was implemented on a commercial NIC's programmable architecture, and a physical testbed of 8 servers connected through 1 switch was built on the research facility, with Perftest commands generating traffic to validate the algorithm's advantages. The NIC supports per-QP and per-IP pacing modes: per-QP paces each connection (Queue Pair, QP) separately, while per-IP assigns QPs with the same destination Internet Protocol (IP) address the same rate. Since flows to the same destination IP may be load-balanced onto different links with different congestion states, assigning them equal rates would be unreasonable; the tests therefore used per-QP mode, pacing each QP at fine granularity according to actual link congestion. For DCQCN, PFC was enabled and the NIC and switch vendors' recommended defaults were used; for CTCC, PFC was disabled on both NICs and switch.
CTCC keeps switch queues low and avoids loss: 7 servers acted as senders, each opening 1,000 QPs toward the eighth as receiver, comparing the two algorithms under large-scale incast congestion. DCQCN's congestion control essentially failed: it held more than 10 MB of switch queue and continuously triggered PFC throughout the run, risking PFC storms, deadlock, and other problems that degrade network performance. CTCC's peak switch queue was just 1.22 MB, with no loss even with PFC off. DCQCN's aggregate sender bandwidth measured by Perftest was 97.98 Gbit/s (95.4% bottleneck-link utilization); CTCC's was 90.70 Gbit/s (91.5%).
CTCC achieves low latency: to verify the latency advantage, small flows were added in the same direction and their completion delay measured. Because DCQCN maintains a high queue, small-flow delay reached 1,154.77 microseconds, while CTCC kept the switch queue low for an average small-flow delay of 20.31 microseconds, a 99% reduction versus DCQCN. Together the two tests show CTCC preserves high throughput while sharply cutting latency: versus DCQCN in large-scale incast, average switch queue and small-flow delay drop by more than 90%, and steady state is loss-free where DCQCN continuously triggers PFC. Although holding a low switch queue costs some throughput and RTT probes consume a little bandwidth, CTCC still guarantees more than 90% bandwidth utilization, within 5% of DCQCN.

4 Conclusion
This paper surveyed research trends in RDMA congestion control, set R&D goals from operators' real networking environments and service-scenario needs, designed a switch-configuration-free congestion control algorithm, and verified its performance advantages in a physical environment on the research facility. As self-developed DPU and switch technology keeps breaking through, the industry will continue attacking key RDMA problems, deepen research on core technologies such as extreme load balancing and RDMA congestion control for large-model training data centers, design efficient congestion control algorithms combining multiple signals on new hardware, and plan complete solutions combining congestion control with load balancing, pushing the industry chain toward maturity and deployment.

References
[1] ZHANG Jiaxin. German media: in the chip battle, China is by no means powerless [N]. Science and Technology Daily, 2021-04-09(004). (in Chinese)
[2] ZHU Y, ZHANG M, ERAN H, et al. Congestion control for large-scale RDMA deployments [J]. ACM SIGCOMM Computer Communication Review, 2015, 45(5): 523-536. DOI: 10.1145/2829988.2787484.
[3] MITTAL R, LAM V T, DUKKIPATI N, et al. TIMELY: RTT-based congestion control for the datacenter [C]//Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication. New York: ACM, 2015: 537-550. DOI: 10.1145/2785956.2787510.
[4] KUMAR G, DUKKIPATI N, JANG K, et al. Swift: delay is simple and effective for congestion control in the datacenter [C]//SIGCOMM '20: Annual Conference of the ACM Special Interest Group on Data Communication. New York: ACM, 2020: 514-528. DOI: 10.1145/3387514.3406591.
[5] LI Y, MIAO R, LIU H, et al. HPCC: high precision congestion control [C]//Proceedings of the ACM Special Interest Group on Data Communication. New York: ACM, 2019: 44-58. DOI: 10.1145/3341302.3342085.
[6] BASAT R B, RAMANATHAN S, LI Y, et al. PINT: probabilistic in-band network telemetry [C]//Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication. New York: ACM, 2020: 662-680.
[7] HE Chunzhi. Tencent Xingmai high-performance computing network: a network foundation for large AI models [EB/OL]. (2023-03-06) [2023-12-20]. https:///developer/article/2234084. (in Chinese)
[8] SHALEV L, AYOUB H, BSHARA N, et al. A cloud-optimized transport protocol for elastic and scalable HPC [J]. IEEE Micro, 2020(6): 67-73. DOI: 10.1109/MM.2020.3016891.
[9] GIBSON D, HARIHARAN H, LANCE E, et al. Aquila: a unified, low-latency fabric for datacenter networks [C]//Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation. Seattle: NSDI, 2022: 1249-1266.
[10] WU X G. Reducing job completion time in AI/ML clusters [EB/OL]. (2022-06-09) [2023-12-20]. https://www./blog/reducing-job-completion-time-in-ai-ml-clusters.

About the authors: CHE Biyao is an assistant engineer at the Cloud-Network Operations Technology Research Institute of China Telecom Research Institute, working on RDMA high-performance networking. ZHANG Yonghang and LIAO Yi are R&D experts at China Telecom Cloud Technology Co., Ltd., long engaged in the design and research of RDMA high-performance networks, network architecture, protocols, congestion control algorithms, smart NICs, and DPUs. TANG Jian is an R&D engineer and FAN Xiaoping a senior expert at China Telecom Cloud Technology Co., Ltd., both working on high-performance networking. ZHAO Jizhuang is director of the cloud computing research center (senior engineer), and LU Gang is deputy head (professor-level senior engineer), of the Cloud-Network Operations Technology Research Institute, China Telecom Research Institute.
(Received: 2023-12-26)
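The RTT-driven AIMD loop at the heart of the CTCC design (additive increase while measured RTT stays under the target, multiplicative decrease when it exceeds it, and rate halving on a NACK) can be illustrated with a toy simulation. All constants and the single-queue model below are illustrative choices, not the paper's actual parameters; the real algorithm runs inside the NIC's PCC framework.

```python
# Toy simulation of an RTT-driven AIMD sender in the spirit of CTCC.
# Constants and the fluid queue model are illustrative, not the paper's.
LINK_BW = 100.0      # Gbit/s, bottleneck capacity
BASE_RTT = 10.0      # microseconds, propagation delay with empty queues
TARGET_RTT = 15.0    # target RTT, implying a small standing queue
AI_STEP = 1.0        # Gbit/s additive increase per interval
MD_FACTOR = 0.7      # multiplicative decrease on RTT overshoot

def step(rate, queue_us):
    """One control interval: update the queue from the offered load,
    measure RTT, and apply AIMD to the sending rate."""
    # Queueing delay grows when offered load exceeds capacity, drains otherwise.
    queue_us = max(0.0, queue_us + (rate - LINK_BW) / LINK_BW * BASE_RTT)
    rtt = BASE_RTT + queue_us
    if rtt > TARGET_RTT:
        rate = max(1.0, rate * MD_FACTOR)          # congested: back off
    else:
        rate = min(1.5 * LINK_BW, rate + AI_STEP)  # clear: probe upward
    return rate, queue_us

def on_nack(rate):
    """CTCC halves the rate immediately when a NACK reports loss."""
    return max(1.0, rate * 0.5)

rate, queue = 50.0, 0.0
rates = []
for _ in range(2000):
    rate, queue = step(rate, queue)
    rates.append(rate)

avg_rate = sum(rates[-500:]) / 500
print(f"average rate over last 500 intervals: {avg_rate:.1f} Gbit/s")
```

In this sketch the sender oscillates around the link capacity while the implied queueing delay stays bounded near the target, which is the qualitative behavior the paper reports: utilization above 90% with a small standing switch queue.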
AWS Aurora: An Introduction to the Relational Database
Amazon Aurora crash recovery
Aurora read-replica auto scaling
[Diagram: one MASTER and several READ REPLICA nodes sharing a single SHARED DISTRIBUTED STORAGE VOLUME, with a READER END-POINT in front of the replicas.]
Replication is based on shipping redo log records, so replica lag is low (typically under 10 ms). The reader endpoint provides load balancing and auto scaling driven by CPU usage and connection count.
Amazon storage engine fault tolerance
[Diagram: the database stack (SQL, Transaction, Caching) above a storage layer spread across three AZs.]
Writes use a 4-of-6 quorum and reads a 3-of-6 quorum across six copies in three AZs, with peer-to-peer replication between storage nodes for repairs.
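The quorum arithmetic behind the 4-of-6 write / 3-of-6 read design above can be checked directly: read and write quorums must overlap, and writes must still reach quorum after losing a whole AZ (two copies). This is a sketch of the arithmetic only, not Aurora's implementation.

```python
# Quorum arithmetic for Aurora-style storage: 6 copies, 2 per AZ
# across 3 AZs, write quorum 4/6, read quorum 3/6.
COPIES = 6
WRITE_QUORUM = 4
READ_QUORUM = 3

def quorums_overlap(n, w, r):
    """Any read quorum intersects any write quorum iff r + w > n,
    so a read always sees the latest acknowledged write."""
    return r + w > n

def write_survives_az_loss(n, w, copies_per_az=2):
    """Losing one AZ removes its copies; writes still reach quorum if
    the remaining copies are at least the write quorum."""
    return n - copies_per_az >= w

print(quorums_overlap(COPIES, WRITE_QUORUM, READ_QUORUM))   # True
print(write_survives_az_loss(COPIES, WRITE_QUORUM))         # True
# Reads remain possible even after an AZ loss plus one more failed copy:
print(COPIES - 2 - 1 >= READ_QUORUM)                        # True
```

The asymmetric quorums are the design choice worth noting: a smaller read quorum keeps reads (and crash recovery) available through failures that would block a symmetric majority scheme.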
Amazon Aurora read replicas
Availability: failed database nodes are automatically detected and replaced; failed database processes are automatically detected and restarted; a read replica is automatically promoted (failover) when the primary node fails; customers can specify the failover order.
Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud, with the performance and availability of commercial databases at one tenth of the cost.
Write performance versus MySQL
SysBench write-only throughput (writes/sec):

  DB size    Amazon Aurora    MySQL
  1 GB       107,000          8,400
  10 GB      107,000          (not listed)
Aurora只读副本的不同之处
Log RecordsBinlog DataDouble-Write Buffer FRM Files, Metadata
MySQL With ReplicaAZ 1 AZ 2
A Beginner's Guide to AWS Cloud Computing
Cloud computing has become an important concept in modern technology, and AWS (Amazon Web Services) is one of the most common and popular cloud computing platforms.
Starting from the basics, this guide introduces the fundamental concepts and capabilities of cloud computing and the AWS platform.
Chapter 1: Cloud computing overview
Cloud computing is a technology that delivers compute and storage resources as services over the network.
Traditional computing and storage rely on local hardware; the cloud instead places resources on remote servers.
This brings many advantages, such as greater scalability, flexibility, and security.
AWS is a platform that provides cloud services of many kinds: elastic compute, storage, databases, machine learning, artificial intelligence, and more.
Chapter 2: AWS fundamentals
Before using AWS, a few basic concepts are needed.
AWS has a global infrastructure distributed worldwide, organized into multiple Regions and Availability Zones.
Each Region is independent of the others, with its own data center network and compute resources.
Availability Zones are distinct data centers within the same Region, providing greater fault tolerance and reliability.
AWS also offers the Virtual Private Cloud (VPC) and subnet concepts, which help users tailor their network environment.
Chapter 3: Core AWS services
The AWS platform offers a great many services; here we highlight some core ones.
EC2 (Elastic Compute Cloud) is AWS's elastic compute service, letting users rent virtual machine instances in the cloud.
S3 (Simple Storage Service) is an object storage service offering scalable capacity and high reliability.
RDS (Relational Database Service) provides a solution for managing relational databases.
Other important services include Lambda, DynamoDB, and SNS.
Chapter 4: Scaling and automation on AWS
AWS offers many tools and services that help users automate and scale.
CloudFormation is a template-based tool that helps users define and set up an entire AWS infrastructure.
It lets users describe resource configurations in a text file, then create and deploy those resources from the template.
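The declarative idea behind CloudFormation (resources described as data, with an engine turning the description into infrastructure) can be illustrated with a toy interpreter. The template shape and type names below are simplified and hypothetical; real templates are JSON/YAML with a much richer schema.

```python
# Toy illustration of a declarative template: resources are described as
# data, and an engine walks the description to "create" them in order.
# The schema here is invented for illustration, not real CloudFormation.
template = {
    "Resources": {
        "WebServer": {"Type": "EC2::Instance",
                      "Properties": {"InstanceType": "t3.micro"}},
        "AssetsBucket": {"Type": "S3::Bucket",
                         "Properties": {"Versioning": True}},
    }
}

def deploy(template):
    """Walk the template and return the planned resources in order.
    A real engine would call service APIs and resolve references here."""
    created = []
    for name, spec in template["Resources"].items():
        created.append((name, spec["Type"], spec.get("Properties", {})))
    return created

for name, rtype, props in deploy(template):
    print(f"create {rtype} '{name}' with {props}")
```

The point of the pattern is that the template, being plain text, can be versioned, reviewed, and replayed, which is what makes infrastructure reproducible.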
Amazon AWS Technical Whitepaper: AWS Cloud Adoption Framework – Platform Perspective
AWS Cloud Adoption Framework
Platform Perspective
November 2015
© 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Notices
This document is provided for informational purposes only. It represents AWS's current product offerings and practices as of the date of issue of this document, which are subject to change without notice. Customers are responsible for making their own independent assessment of the information in this document and any use of AWS's products or services, each of which is provided "as is" without warranty of any kind, whether express or implied. This document does not create any warranties, representations, contractual commitments, conditions or assurances from AWS, its affiliates, suppliers or licensors. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers.

Contents: Abstract; Introduction; Design Architecture (Conceptual Architecture Activity, Logical Architecture Activity, Considerations); Implementation Architecture (Considerations); Architecture Optimization (Cloud Design Principles and Patterns Activity, Application Migration Patterns Activity, Considerations); CAF Taxonomy and Terms; Conclusion; Notes

Abstract
The Amazon Web Services (AWS) Cloud Adoption Framework (CAF)1 provides best practices and prescriptive guidance to accelerate an organization's move to cloud computing. The CAF guidance is broken into a number of areas of focus that are relevant to implementing cloud-based IT systems. These focus areas are called perspectives. Each perspective is covered in a separate whitepaper.
This whitepaper covers the Platform Perspective, which focuses on designing, implementing, and optimizing the architecture of the AWS technology that you use in your cloud adoption initiative.

Introduction
Your organization can use the AWS Cloud Adoption Framework (CAF) guidance to explore how different departments can work together on one or more cloud adoption initiatives. Guidance is separated into the following focus areas, called perspectives: Business Perspective, Platform Perspective, Maturity Perspective, People Perspective, Process Perspective, Operations Perspective, and Security Perspective. The Platform Perspective components describe the structure and design of a cloud-based IT system, or a hybrid IT system that spans both cloud and non-cloud environments.

Figure 1: Components of the Platform Perspective

The rest of this whitepaper describes how the perspectives translate into activities that your organization can perform. This whitepaper covers design architecture and implementation architecture. You can also benefit from principles and patterns for rapidly implementing or experimenting with new solutions on the cloud, or migrating existing non-cloud solutions to the cloud, which will be covered as part of optimization.

Embracing Agility
Many organizations already use agile development to increase the velocity of their anticipated business outcomes. However, some businesses experience difficulty in achieving agility all the way through to deployment and operations. Consider embracing agility if you want to increase the velocity of achieving your anticipated business outcomes. For example, you could form a team to initiate a project and, with limited analysis, use AWS services to create a proof of concept (POC). If the POC is successful, you continue. If not, you select a different approach. The AWS platform creates a low barrier for experimentation, and allows you to rapidly deploy servers.
When you complete your POC experiment you can shut down the AWS services environment, and no longer pay for resources. When your solution is ready for end users (the minimum viable product), you can gather the users' feedback and use it to inform priorities for future feature releases. By documenting the different phases of your cloud journey as it progresses, you can create a complete picture of the IT environment. Consider storing the artifacts that you create using incremental experimentation in the source code management system that you use today for storing and revising your application code.
You can complete the process of describing a business need and transitioning it into an IT solution using an iterative approach. In addition, you can use an iterative process to provide delivery teams enough detail so that what they build provides the intended outcome. Figure 2 illustrates how an IT capability maps to the services that deliver the capability.

Figure 2: Example of Architectural Mapping from Capability to Service

When you use an iterative architectural approach, you can focus more time on business needs and goals. As business needs change and more information is surfaced, the technical architecture you use to deliver the business capability to the customer can shift to match the business need. You can also iterate faster, trying out new things to see if they work with minimal barrier to entry, due to utility pricing. The iterative approach makes it easier to roll back changes or stand up a parallel environment to test new features.
You can use a combination of AWS services to create IT capability, and use the AWS Service Catalog to centrally manage commonly deployed IT services.
You can also use AWS services that provide a specific IT capability, such as Amazon Glacier for data archiving.
There are several components to consider from the Platform Perspective:
The Design Architecture component: Look at the common design patterns used in your implementations and identify common needs and redundancies.
The Implementation Architecture component: Look at the security, data handling and retention, log aggregation, monitoring needs, and common operational patterns.
The Architecture Optimization component: Identify your optimization strategies, what tools and processes need to be changed, and what automation can be used.

Design Architecture
The Design Architecture component of the Platform Perspective promotes the engagement of stakeholders from many parts of the organization. In your cloud adoption scenario, you need to provide different views on your architecture to each stakeholder. For example, as you work with business sponsors to design a solution you can contextualize the architecture to describe how IT can be used to achieve the expected business outcome, and what the costs, returns, and risks might be.
Prior to an AWS adoption journey, your organization should consider modifying its governance and architectural principles to include AWS architectural principles. If you have not done so, then try using the iterative method described earlier to establish these principles. You can build methodologies and processes using sprints, just as you build applications. As you build, you can validate the design of your conceptual architectures against your governance and architectural principles.

Conceptual Architecture Activity
Conceptual views are technically abstract, but they should be described in a context that is familiar to business users. Use the conceptual architecture to define the business context of an IT system with business models.
This is where you balance short-, medium-, and long-term business goals and concerns for IT initiatives.
Three key components of a conceptual architecture are business vision, goals, and objectives. Use the conceptual architecture to understand which capabilities will be needed as part of the logical or functional architecture that will describe the solution. Figure 3 illustrates an example conceptual architecture that describes where AWS services are applicable.

Figure 3: Example of a Conceptual Architecture

Using AWS, the creation of a conceptual architecture can become more iterative. You can use AWS services as part of the development effort by using experimentation to validate and evolve the approach. As business capability concepts are proven, development teams can start work on delivering functions and features into production. With quicker delivery, end-user feedback can be used to verify whether business objectives and compliance requirements are being met with the current technical approach.
Implement automated testing to test your rapidly iterating conceptual architecture. This not only minimizes the introduction of bugs into your application, but also includes continuous compliance as part of continuous delivery, helping to ensure that changes to your application do not affect your organization's security posture.

Logical Architecture Activity
Logical (or functional) architectural views describe the building blocks of the IT system and their relationships without getting into the technical details of how the functionality is implemented. The logical architecture contains the data flow and capability models that relate to the business models that meet the business outcomes.
Quality attributes, dependency mapping, and plans for obsolescence can be identified, documented, and addressed as part of designing the logical architecture. A logical architecture (Figure 4) that uses AWS can make use of geographical duplication as well as the elastic nature of AWS services. Using design principles that take advantage of these characteristics will allow system capacities to expand and shrink as loads expand and contract.

Figure 4: Example of a Logical Architecture Diagram

You can use different approaches based on the type of project your organization is designing. Projects with a long duration typically are used in predictable, repeatable environments or environments where refinement of approach is not possible or recommended after decisions are made. These types of initiatives are driven with top-down control over outcome. An example of such an initiative is shutting down a corporate data center after a decision to move to the cloud.
Initiatives with a short duration are driven with bottom-up freedom over outcome. Change in direction is expected and may be encouraged for better alignment with shifting business needs.
There are also hybrid approaches to initiatives where the goal is to migrate and decompose a monolithic mission-critical solution or environment. These initiatives will combine the best aspects of heavy up-front planning with the freedom to innovate as needed to deliver optimized customer outcomes.

Considerations
∙ Do use feedback from delivered features to review and revise the conceptual architecture with the business team.
∙ Do minimize the number of architectural principles to allow the greatest flexibility in solution development.
∙ Do stay focused on customer outcomes and business objectives rather than technical solutions.
∙ Do experiment with AWS services to experience, learn, and prove that your logical architecture will achieve the desired business outcome.
∙ Do focus on short duration project scoping and iterative processes for systems of interaction where outcomes are more fluid.
∙ Do consider the practice of creating logical architecture as a dynamic process.
∙ Do limit the amount of redundant technologies to prevent "technology sprawl" and allow for focus and specialization.
∙ Do not make functional and implementation architectures dependent on a complete conceptual architecture. Consider identifying a key objective and starting design and delivery of that functionality. Use the feedback from adoption of the features as input in the evolution of the conceptual architecture.
∙ Do not attempt to create the perfect architecture up front. Consider starting with the highest risk/reward scenario and use experimentation to prove your approach.

Implementation Architecture
The Implementation Architecture component of the AWS CAF Platform Perspective describes the detailed designs within the IT system and the specific implementation components and their relationships. This architecture also defines the implementation of the system's building blocks by software or hardware elements.
The implementation architecture for an AWS environment describes the design of the technical environment. The description is broken into layers, with each layer providing information for a specific team in the organization. AWS reference architectures are available at /architecture. Figure 5 illustrates a high-level implementation architecture. This artifact works best online, where you can enable clicking on each item for more information, and you can plan for automatic updates.

Figure 5: Example of an Implementation Architecture

Describing the AWS environment and providing guidance on usage will be a critical portion of the implementation architecture development.
Describing how resources, accounts, and tagging work, and how the Amazon Virtual Private Cloud (VPC) environment is configured, provides information that will help the organization determine which resources are consumed by various systems, applications, and initiatives.

The Information Architecture should set strategies for deployment, monitoring, auditing, and logging that will give you exposure to near real-time data. Set security, data retention, gateway, and routing strategies and policies so your delivery teams have the information they need to enable control over the AWS environment as it grows.

Include taxonomy and naming conventions as part of the metrics, monitoring, and chargeback implementation. The actual running environment will change continuously and is best viewed through dashboards with near real-time information.

Dashboard information can be represented graphically or by using lists. If you use a graphical dashboard, users could click the graphic to show additional detail. If you use a list in your dashboard, users familiar with spreadsheets can find information in well-defined columns. Figure 6 shows a graphical dashboard that can provide near real-time information.

Figure 6 Example of a Graphic-based Near Real-time Dashboard

Consider prescribing a taxonomy and naming convention in the implementation architecture. Then you can implement this taxonomy as a tagging standard on AWS resources. To increase confidence and reduce risk, you can leverage the AWS environment during implementation architecture creation. When you use AWS, the environment can be created and tested for verification or certification earlier in the release cycle.
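A tagging standard like this can be checked mechanically at deployment time. A minimal sketch follows; the required keys and allowed values are hypothetical examples, not an AWS-mandated standard:

```python
# Sketch: validate resource tags against a prescribed taxonomy.
# Required keys and allowed values below are hypothetical examples.

REQUIRED_TAGS = {
    "environment": {"dev", "test", "prod"},
    "cost-center": None,          # any non-empty value accepted
    "owner": None,
}

def validate_tags(tags):
    """Return a list of violations for one resource's tag set."""
    violations = []
    for key, allowed in REQUIRED_TAGS.items():
        value = tags.get(key)
        if not value:
            violations.append(f"missing tag: {key}")
        elif allowed is not None and value not in allowed:
            violations.append(f"invalid value for {key}: {value}")
    return violations

print(validate_tags({"environment": "prod", "cost-center": "cc-42", "owner": "team-a"}))
print(validate_tags({"environment": "staging"}))
```

A check like this can run in an automated deployment pipeline so non-compliant resources are flagged before they reach production.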
Additionally, tools are available through AWS and the AWS Marketplace that can automate processes and shorten the time needed to deliver, test, and operate AWS-based environments.

Defining an operational playbook for how you are going to deploy and operate your systems will help ensure consistency and repeatability of success. This playbook should also be iterative in nature, with constructive feedback implemented in systems that did not have this capacity at the time of creation.

Considerations
∙ Do identify a network connectivity strategy for AWS services.
∙ Do outline AWS components to be used (services/features).
∙ Do define security controls (native vs. third-party tools). Greater details are available in the AWS CAF Security Perspective whitepaper.
∙ Do define data security and retention policies (encryption, backups, snapshots, third-party tools).
∙ Do create and work toward an automated deployment process to reduce the impact of human error and introduce portability.
∙ Do create an operational playbook. More information on this topic is available in the AWS CAF Operations Perspective whitepaper.
∙ Do outline a monitoring strategy.
∙ Do outline a logging strategy that validates that your logging system can manage the amount of information you decide to collect.
∙ Do create a strategy for resource tracking as part of your implementation architecture, ensuring that resources are appropriately tagged at the time of deployment. This can also be extended into cost allocation tagging.
∙ Do not let application environments form in an ad hoc fashion. Choose a strategy to organize your application environments.

Architecture Optimization

The Architecture Optimization component of the AWS CAF Platform Perspective promotes the adaptability of an architecture that uses AWS—as business needs change and as new and better technical solutions become available, your architectural decisions can be modified and adjusted.
Since physical computers are not purchased, the long lead time for procurement, staging, burn-in, and configuration is no longer necessary. Because you can continue to optimize your architecture during the design phase, this process can be completed with less up-front information; your decisions can change and be implemented as needed.

As you adopt AWS services, a key focus should be on building tacit knowledge in the organization. Creating a centralized repository with principles, patterns, best practices, a glossary, and reference architectures will help ensure the rapid growth of AWS skills in the organization. As you start an automated and agile deployment process, the centralized information repository allows systems and people who deploy applications to access the governing principles as well as the pieces and parts that they own.

Cloud Design Principles and Patterns Activity

Adherence to the software design principles and patterns that you document will improve quality and productivity and reduce risk during solution development. All delivery teams can follow these principles when designing and building solutions. A pattern is a proven approach to achieving a result. You can automate patterns that you use frequently to improve efficiency, consistency, reliability, and supportability. Consider following these best practices:
∙ Provide guidance that captures reusable approaches, leverages an infrastructure-as-code approach, and treats that code like application code (source control, code reviews, etc.).
∙ Create a baseline of language and understanding across the technical organization to ease communications. This might include creating a taxonomy and a dictionary or a glossary describing how things will be named and organized.
∙ Educate everyone to a foundational level to provide common language and understanding.
Building fluency in the language of AWS cloud adoption and explaining the taxonomy and naming conventions will help accelerate familiarity with, and the ability to use, cloud-based technologies and approaches across the organization.
∙ Use fast-track or factory principles to create common approaches with reliable results. Provide documentation that describes diagrams, naming conventions, code review guidance, and so on to provide a common language, approach, and expectations. Using wiki-based tools for documentation will allow teams to update documentation and keep it current, and will provide a single authoritative source for guidance.
∙ Create a governance process and/or team that ensures and/or audits the outcome of patterns and intended results.
∙ Provide an "Andon cord" for the deployment team to use if they see something that doesn't fit in with their understanding of patterns.

Application Migration Patterns Activity

Proven approaches for migrating IT systems to the cloud are available as migration patterns. Consider organizing applications in a way that helps identify and introduce patterns that you can use with predictable results. Two of the more commonly used pivots are business criticality and data classification. Understanding which categories of data are associated with which applications will provide valuable insight. Another useful pivot is the level of mission criticality. Depending on your needs, you could also consider organizing by systems of record versus systems of interaction, monolithic applications versus highly decomposed applications, or new applications versus applications near the end of life.

One approach is to organize your applications into groups based on the action you want to plan for each application: retiring, retaining, replacing, re-hosting, refactoring, or rewriting.
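One way to operationalize this grouping is a simple rule-based classifier over the application inventory. The sketch below is illustrative; the decision criteria and record fields are assumptions, not AWS guidance:

```python
# Sketch: bucket an application inventory by planned migration action.
# The decision rules and record fields are illustrative assumptions.

def migration_action(app):
    """Pick one of the six actions for a single application record."""
    if app.get("end_of_life"):
        return "retire"
    if app.get("compliance_hold"):
        return "retain"
    if app.get("saas_replacement_available"):
        return "replace"
    if app.get("monolithic") and app.get("business_critical"):
        return "refactor"
    if app.get("rewrite_planned"):
        return "rewrite"
    return "re-host"          # default: lift-and-shift

inventory = [
    {"name": "legacy-crm", "end_of_life": True},
    {"name": "billing", "monolithic": True, "business_critical": True},
    {"name": "intranet-wiki"},
]
for app in inventory:
    print(app["name"], "->", migration_action(app))
```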
Figure 7 illustrates this application migration pattern.

Figure 7 Graphic Representation of an Application Migration Pattern

You can also use an inventory of current data center applications and their dependencies to determine which applications to migrate and when. This could potentially allow you to avoid a costly equipment refresh, moving away from capital expenditure (CapEx) and taking advantage of AWS utility pricing.

When deciding which patterns to leverage, consider creating a Center of Excellence (CoE) team to select patterns that enable the shortest time to value. Another approach is to organize and prioritize by ease of migration effort. For example, you could decide to migrate development and test applications first, followed by self-contained applications, customer training sites, pre-sale demo portals, and trial applications. During the migration, consider prioritizing a Tier 1 application to gain visibility and endorsement from executive sponsors.

Consider developing new applications or refactoring existing applications in the AWS environment. For existing applications, you could migrate applications to the AWS cloud environment and prioritize rework or optimization initiatives. The refactoring can be enabled by the agility of deployment on AWS.

Considerations
∙ Do consider new applications for migration first.
∙ Do start development of new capabilities or rewrites of existing capabilities in the AWS environment.
∙ Do take advantage of capacity concerns as a reason to prioritize development in the cloud.
∙ Do consider using code review (both for application and infrastructure code) to provide a feedback loop that improves process and reduces technical debt.
∙ Do consider using wikis to provide access to guidance that can be updated and maintained over time.
∙ Do leverage AWS cloud adoption as a way to fast-track maturity of combined roles-and-skills thinking.
This would manifest as a developer/security/operations mindset and coding architectural models to validate the approach.
∙ Do use AWS cloud adoption to institutionalize a scalable, service-oriented architecture (SOA) approach to separate concerns, enable integration of reusable services, and limit the amount of code maintained.
∙ Do create patterns that assume failure by building in recovery code with features such as circuit breakers, caching, queuing, and exponential back-off.
∙ Do write code with an eye toward reuse through exposed API endpoints for easy discovery, integration, and reuse.
∙ Do introduce your deployment team to your development team. Empower both teams to fully appreciate the benefits of scalable infrastructure and utility pricing.
∙ Do not optimize a solution before it is well architected.
∙ Do not start migrations without operational processes defined. Consider defining backup and recovery guidance as an initial step in a migration effort.
∙ Do not manually migrate all applications. Consider using automation to scale and accelerate the migration of applications (migration factory).
∙ Do not wait to automate something. If you're deploying the same thing twice manually, invest the time in automation.

CAF Taxonomy and Terms

AWS created the Cloud Adoption Framework (CAF) to capture guidance and best practices from previous customer engagements. An AWS CAF perspective represents an area of focus relevant to implementing cloud-based IT systems in organizations. For example, when a cloud solution is to be implemented, the Platform Perspective provides guidance on designing, implementing, and optimizing the architecture of the AWS technology that you plan to use in your cloud adoption initiative.

Each CAF perspective is made up of components and activities. A component is a sub-area of a perspective that represents a specific aspect that needs attention. This whitepaper explores the components of the Platform Perspective.
Within each component, an activity provides prescriptive guidance for creating actionable plans an organization can use to move to the cloud and to operate cloud-based solutions. For example, Design Architecture is one component of the Platform Perspective, and creating logical architectural views that describe the building blocks of the IT system and their relationships may be an activity within that component.

When combined, the AWS Cloud Adoption Framework (CAF) and the Cloud Adoption Methodology (CAM) can be used as guidance during your journey to the AWS cloud.

Conclusion

Translating business outcomes into technical solutions is still a necessary step in the IT lifecycle. By adopting AWS services, you have the flexibility to change an architectural decision after more information is gathered, as assumptions are tested, and as technology advances. The Platform Perspective provides an approach to separating a complex set of ideas and decisions into manageable components.

Use the design component to facilitate discussions with business stakeholders and provide an abstract level of detail to describe how business outcomes will be accomplished.

Use the implementation component to facilitate discussions with technical teams who are responsible for creating, delivering, and maintaining solutions at a level agreed upon with the business stakeholders.

Use the architecture optimization component for approaches and patterns that provide predictable and repeatable results. For example, when you use an application migration pattern you can organize and categorize groups of applications and follow a common approach to migrating to an AWS environment. You can also create a small set of principles that all technical team members can use to help with key decisions. This ensures that a common approach to making decisions is used across the organization.

Notes

1. https:///whitepapers/aws_cloud_adoption_framework.pdf
AWS Security Infrastructure 02

AWS (Amazon Web Services) is a suite of cloud computing services offered by Amazon, spanning compute, storage, databases, and other capabilities. Security is a major consideration when using AWS. AWS provides a range of security services and features to protect user data and applications. This article introduces the foundations of AWS security architecture.

First, AWS applies security measures at multiple layers to protect user data and applications. At the physical layer, AWS data centers enforce strict physical security controls, including video surveillance, access control systems, and biometric identification. In addition, AWS uses network security equipment such as firewalls, intrusion detection systems (IDS), and intrusion prevention systems (IPS) to protect network traffic.

Second, AWS offers the Identity and Access Management (IAM) service for managing user identities and access permissions. IAM allows users to create and manage users, groups, and roles, and to control each user's access to AWS resources. With IAM, organizations can apply the principle of least privilege: each user is granted only the minimum permissions they need, which reduces potential security risk.
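A least-privilege grant is usually expressed as a narrowly scoped policy document. Below is a sketch that builds one; the bucket name is hypothetical, while the JSON shape (Version/Statement/Effect/Action/Resource) follows the IAM policy grammar:

```python
import json

# Sketch: build a least-privilege, read-only IAM policy for one S3 bucket.
# The bucket name is a hypothetical example.

def read_only_bucket_policy(bucket):
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",        # the bucket itself (for ListBucket)
                    f"arn:aws:s3:::{bucket}/*",      # objects in the bucket (for GetObject)
                ],
            }
        ],
    }

print(json.dumps(read_only_bucket_policy("example-reports"), indent=2))
```

Attaching a policy like this to a role, rather than granting broad `s3:*` permissions, is the practical form of the least-privilege principle described above.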
In addition, AWS provides the Virtual Private Cloud (VPC) service for creating and managing private virtual networks. With a VPC, users can build a network environment in the AWS cloud that resembles a traditional data center, including subnets, route tables, and security groups. Users can control network traffic with network ACLs (Access Control Lists) and security groups to strengthen network security.
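Network ACLs are stateless and evaluate their numbered rules in ascending order, applying the first rule that matches, with an implicit final deny. That evaluation model can be sketched as follows; the rule set itself is a hypothetical example:

```python
# Sketch: evaluate a stateless network ACL the way AWS documents it --
# rules are checked in ascending rule-number order, first match wins,
# and anything unmatched hits the implicit final deny.
# The rule set below is a hypothetical example.

ACL = [
    (100, "allow", "tcp", 443),
    (200, "deny",  "tcp", 22),
]

def acl_decision(protocol, port):
    for _num, action, proto, rule_port in sorted(ACL):
        if proto == protocol and rule_port == port:
            return action
    return "deny"  # implicit catch-all rule

print(acl_decision("tcp", 443))  # allow
print(acl_decision("tcp", 22))   # deny
print(acl_decision("udp", 53))   # deny (no matching rule)
```

Security groups differ in being stateful (return traffic is automatically allowed) and in evaluating all rules rather than the first match, which is why the two are often layered together.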
AWS also provides the AWS Key Management Service (KMS) for managing and protecting encryption keys. Users can generate or import keys with KMS and use them to encrypt and decrypt sensitive data. Beyond that, AWS offers multiple encryption options, including SSL/TLS (Secure Sockets Layer) for data in transit and server-side encryption (SSE) for data at rest.
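The KMS workflow described above is commonly used as "envelope encryption": data is encrypted with a one-off data key, and the data key is itself stored encrypted under a master key. The toy sketch below illustrates only the shape of that flow; the XOR keystream is a stand-in for a real cipher such as AES and is not secure:

```python
import secrets
import hashlib

# Toy sketch of the KMS envelope-encryption flow. The XOR keystream
# cipher is NOT secure -- it only stands in for AES to show the protocol.

def keystream(key, n):
    """Derive n pseudo-random bytes from a key (illustration only)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_cipher(key, data):
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

master_key = secrets.token_bytes(32)   # held by the key service
data_key = secrets.token_bytes(32)     # one-off key for this record

plaintext = b"sensitive record"
ciphertext = xor_cipher(data_key, plaintext)
wrapped_key = xor_cipher(master_key, data_key)   # stored alongside ciphertext

# Decryption: unwrap the data key with the master key, then decrypt.
recovered = xor_cipher(xor_cipher(master_key, wrapped_key), ciphertext)
print(recovered)  # b'sensitive record'
```

The design point is that bulk data never touches the master key directly; only the small data key does, which is what lets a service like KMS keep master keys inside a hardened boundary.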
Finally, AWS provides monitoring and logging services for detecting and recording security events in the system.
An In-depth Look at Amazon Web Services (AWS)

With the rapid development of information technology, cloud computing has gradually entered everyday life as a new computing model. In the cloud computing field, Amazon is the industry leader, and its AWS cloud service is highly regarded. This article analyzes AWS in detail so that readers can better understand this indispensable cloud platform.

What is AWS?

AWS, short for Amazon Web Services, is Amazon's cloud services offering. AWS provides a full range of cloud computing services, including serverless computing, compute, storage, databases, analytics, and machine learning, along with value-added services such as edge computing and business applications. AWS incorporates the latest cloud computing technology and promises scalable, secure, and efficient services; its market share and customer base lead the industry.

Advantages of AWS

AWS has long held a leading position in the cloud computing market because of several strengths that make it a first choice for countless enterprises. First, AWS is among the most scalable cloud services, able to meet the needs of organizations of different sizes; because AWS services are automatically provisioned and managed, customers can adjust or remove resources as needed. Second, AWS offers strong security: its cloud security services can block most network attacks, and backup and recovery features protect enterprise data. Third, AWS is economical; instead of spending heavily on infrastructure up front, customers pay on demand for what they use, which saves money. Finally, AWS is simple to manage and broadly applicable, providing a variety of tools and services that help users easily create, configure, and manage their applications.

Using AWS

AWS services are built on Amazon's global infrastructure; customers can choose among multiple Regions worldwide to better meet their needs, and pay on demand for what they use. AWS has many use cases. Some of the most common include: enterprises can migrate their workloads to the cloud and scale AWS services up or down as the business changes. AWS serverless computing includes a free tier of one million Lambda requests, letting enterprises develop and test at low cost.
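A Lambda function is just a handler with the standard `(event, context)` signature for Python; the event fields below are hypothetical:

```python
# Minimal sketch of an AWS Lambda handler. lambda_handler(event, context)
# is the standard Python entry-point signature; the event fields used here
# are hypothetical examples.

def lambda_handler(event, context):
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"hello, {name}"}

# Locally, the handler is an ordinary function and can be exercised directly:
print(lambda_handler({"name": "AWS"}, None))
```

Because the handler is plain code, it can be unit-tested without any cloud infrastructure, which is part of what makes serverless development and testing cheap.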
AWS can also help users build machine learning, artificial intelligence, big data, IoT, and API workloads, making it easier to adopt these new technologies.
Inside the AWS Underlying Network Architecture

• Data centers need network resilience:
• Scale intra-AZ network capacity
• Scale Internet-facing and inter-AWS-Region network capacity

Building a scalable data center

• What do we need?
• Network building blocks
• Easy to scale in right-sized increments • Strong isolation boundaries • Large network capacity

Building a global backbone network

• Extreme scrutiny of fiber paths
• End-to-end latency • Path risk • Expected time to repair after failure
• Capacity/scale
• Underlying optical transport performance
• Path diversity
• Understanding link groups that share common risk
• Low latency matters
• Best-case latency under normal conditions • Minimizing added latency during path failures
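Path diversity with respect to shared-risk link groups (SRLGs) can be checked by intersecting the risk sets of two candidate paths. A sketch follows; link names and SRLG identifiers are made up:

```python
# Sketch: check whether two candidate fiber paths are diverse with respect
# to shared-risk link groups (SRLGs). Link names and SRLG ids are made up.

SRLG = {
    "link-a": {"conduit-1"},
    "link-b": {"conduit-1", "landing-station-3"},
    "link-c": {"conduit-2"},
}

def shared_risks(path1, path2):
    """Return SRLGs both paths depend on (empty set means diverse)."""
    risks = lambda path: set().union(*(SRLG[link] for link in path))
    return risks(path1) & risks(path2)

print(shared_risks(["link-a"], ["link-b"]))  # both ride conduit-1: not diverse
print(shared_risks(["link-a"], ["link-c"]))  # disjoint risks: diverse
```

Two paths that look disjoint at the link level can still fail together if they share a conduit or landing station, which is why backbone planning reasons about risk groups rather than individual links.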
Nitro: The Next-Generation AWS Cloud Architecture
Virtualization
Boot starts untrusted and must prove that the system is trustworthy. There is unavoidable complexity due to the need to support legacy and general-purpose workloads.
UEFI Secure Boot

(Flow diagram: PK/KEK keys → early firmware → "Properly signed?" → Yes: continue boot / No: fail boot.)
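The check-then-continue idea behind that flow can be sketched as a toy verified-boot chain. Stage names, payloads, and the hash-based check are illustrative assumptions; real UEFI Secure Boot verifies signatures against the PK/KEK key hierarchy rather than comparing plain hashes:

```python
import hashlib

# Toy sketch of a verified-boot chain: each stage is checked against a
# trusted measurement before it runs, and boot fails on the first mismatch.
# Stage names and payloads are hypothetical; real Secure Boot verifies
# signatures under the PK/KEK key hierarchy, not bare hashes.

stages = {
    "early-firmware": b"firmware image bytes",
    "bootloader": b"bootloader image bytes",
    "kernel": b"kernel image bytes",
}

# Trusted measurements would normally be anchored in hardware (e.g., a
# security chip); here they are simply precomputed.
trusted = {name: hashlib.sha256(blob).hexdigest() for name, blob in stages.items()}

def boot(images):
    for name in ("early-firmware", "bootloader", "kernel"):
        if hashlib.sha256(images[name]).hexdigest() != trusted[name]:
            return f"fail boot at {name}"
    return "boot ok"

print(boot(stages))
tampered = dict(stages, bootloader=b"evil bytes")
print(boot(tampered))
```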
The Nitro Project – Next Generation AWS Infrastructure
SRD: Meaning and Selection

SRD can mean different things depending on context:

On one hand, SRD (Scalable Reliable Datagram) is a protocol introduced by AWS, designed specifically for AWS data center networks. Built on the Nitro chip, it is a high-throughput, low-latency transport protocol created to improve HPC performance. It does not preserve packet ordering; instead, it sends packets over as many network paths as possible while avoiding overloaded paths. To minimize jitter and respond as quickly as possible to fluctuations in network congestion, SRD is implemented in the AWS-designed Nitro chip. SRD is used by HPC/ML frameworks on EC2 hosts through the AWS EFA (Elastic Fabric Adapter) kernel-bypass interface.
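The multipath behavior described above can be illustrated with a toy simulation: packets of one flow are sprayed across several paths, so the receiver may see them out of order. The per-path latencies and the round-robin spraying policy below are made-up illustrations, not AWS's actual algorithm:

```python
import random

# Toy simulation of SRD-style multipath spraying: one flow's packets are
# spread over several paths instead of pinned to one, so arrival order
# differs from send order. Latencies and the round-robin policy are
# made-up illustrations, not AWS's real algorithm.

random.seed(7)
PATH_LATENCY = {0: 10, 1: 12, 2: 7, 3: 15}   # hypothetical per-path delays (ms)

def send_flow(num_packets):
    """Return sequence numbers in arrival order at the receiver."""
    arrivals = []
    for seq in range(num_packets):
        path = seq % len(PATH_LATENCY)        # spray round-robin over paths
        jitter = random.uniform(0, 2)
        arrivals.append((PATH_LATENCY[path] + jitter, seq))
    return [seq for _t, seq in sorted(arrivals)]

order = send_flow(8)
print(order)                      # not 0..7: packets arrive reordered
print(order != list(range(8)))    # True
```

Because ordering is restored (if needed) above the transport, the sender is free to use every available path, which is where the congestion-avoidance and jitter benefits come from.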
On the other hand, SRD (Short Range Device) refers to short-range wireless communication devices, typically operating between 25 MHz and 1000 MHz with transmit power below 500 mW (27 dBm). SRD devices often operate in license-free bands; because the available spectrum is limited, congestion and mutual interference are common. Many of the ISM (industrial, scientific, and medical) bands used by different services also overlap with the bands used by short-range devices.
After ten years of Amazon Elastic Compute Cloud (Amazon EC2), if we applied all of our learnings, what would a hypervisor look like?
Nitro: Two years later
What happened?
The VMM is the heart of a hypervisor. As long as a statistical majority of instructions execute natively, we call this virtualization.
Agenda
• Nitro Overview
• Evolution of Nitro
• Nitro Security Chip Deep Dive
• AWS Outposts
AWS Nitro
• Launched in November 2017
• In development since 2013
• All new launches use Nitro
• Purpose-built hardware/software
• Hypervisor built for AWS