Research on Parity Disk Failure Recovery Algorithms for Distributed Storage Systems Based on RAID 6 Codes
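University of Science and Technology of China: Declaration of Originality of the Dissertation

I declare that the dissertation submitted is the result of research I carried out under the guidance of my supervisor. Except where specifically marked and acknowledged, the dissertation contains no research results already published or written by others. The contributions of colleagues who worked with me on this research are clearly stated in the dissertation.

Author's signature: [illegible]
Date: [illegible]

University of Science and Technology of China: Authorization for Use of the Dissertation

As one of the conditions for applying for the degree, the copyright holder of the dissertation authorizes the University of Science and Technology of China to hold partial rights to use the dissertation, namely: the university may, in accordance with the relevant regulations, submit paper and electronic copies of the dissertation to the relevant national departments or institutions; allow the dissertation to be consulted and borrowed; include the dissertation in the China Dissertation Full-text Database (《中国学位论文全文数据库》) and other relevant databases for retrieval; and preserve and compile the dissertation by photocopying, reduced-size printing, scanning, or other means of reproduction. The content of the electronic document I have submitted is consistent with that of the paper dissertation. A confidential dissertation is also subject to these provisions after declassification.

☑ Public  ☐ Confidential (__ years)

Author's signature: [illegible]
Supervisor's signature: [illegible]
Date: [illegible]

Abstract (摘要)

In recent years, with the rapid development of information technology and the widespread adoption of multimedia applications, information resources have grown explosively. How to store data safely and efficiently has become a research focus in both academia and industry. Distributed storage systems use network connections to store data in a dispersed manner, which enables mass data storage, provides disaster tolerance, and effectively overcomes the drawbacks of centralized storage systems.

Compared with centralized storage, distributed storage systems increase data storage capacity and the capability for parallel data operations. However, along with this performance gain, the large number of storage devices makes node failures more common. Industry already treats node failure as a routine event rather than an exception. How to recover data quickly and efficiently when a node fails, so as to guarantee system reliability, has become a research focus.

This dissertation mainly studies fast recovery from parity disk failures in distributed storage systems. The recovery speed of a failed node can be improved in two ways: by improving the efficiency of generating the regenerated data, and by improving the efficiency of transmitting the regenerated data over the network. Accordingly, the work of this dissertation follows two directions:

1) Efficient data regeneration algorithms. When choosing an erasure-code-based coding algorithm, this dissertation selects coding schemes that can be implemented with exclusive-OR (XOR) operations only, so that data can be generated quickly and efficiently. The dissertation mainly studies coding algorithms based on the RAID 6 theoretical framework: RDP codes and EVENODD codes. RAID 6 achieves high fault tolerance with low redundancy, efficient random data access, parallel data processing, and elimination of the access bottleneck at the parity disk. These properties of RAID 6 are significant for improving the performance of the whole system.

2) Shortening the transmission time of regenerated data in the network. When the condition of the network links cannot be changed, repair time can be saved only by reducing the amount of data transmitted, that is, by reducing the repair bandwidth. Network coding changes the status quo in which traditional network nodes only store and forward data: it makes full use of the computing and coding capabilities of network nodes, lets intermediate nodes take part in encoding and decoding, increases the information carried per unit of data, and raises the throughput of the whole network. Applying network coding to failure recovery can reduce repair bandwidth and improve repair efficiency.

In addition, because a single-node failure is far more probable than the simultaneous failure of multiple nodes, this dissertation mainly studies fast recovery from single-node failures. Traditional single-node failure recovery algorithms address only the failure of original data disks; recovering a parity disk requires downloading all of the original data, so recovery efficiency is low.

For RDP codes and EVENODD codes, this dissertation proposes corresponding fast recovery algorithms for the parity disk. Based on a study of the RDP and EVENODD encoding algorithms, the algorithms make full use of the data characteristics of the row parity disk and, combined with network coding, achieve fast and efficient recovery of the diagonal parity disk. Theoretical analysis shows that, compared with traditional recovery algorithms, the proposed algorithms significantly reduce the bandwidth consumed in recovering a failed parity disk, thereby improving recovery efficiency.

Keywords: distributed storage system, RDP code, EVENODD code, parity disk failure recovery, repair bandwidth

Abstract

With the rapid development of information technology and the wide spread of multimedia applications, information resources have grown explosively in recent years. How to store data safely and efficiently has become a research focus in academia and industry. Distributed storage systems employ network connections to store data dispersedly, which makes it possible to store mass data. In this way, distributed storage systems gain the capability of disaster tolerance and also effectively overcome the disadvantages of centralized storage systems.

Compared with centralized storage systems, distributed storage systems expand data storage capacity and improve the ability of parallel operation. Unfortunately, although system performance has improved, node failure becomes more common because of the large number of storage devices. The industrial community has come to regard node failure as a daily transaction rather than an exceptional issue. As a result, how to recover data rapidly and efficiently to guarantee the reliability of the system in the case of node failure has become a popular research point.

This dissertation studies fast recovery algorithms for parity disks in distributed storage systems. To speed up the recovery of a failed node, two methods are available: improving the efficiency of data regeneration and improving the efficiency of its network transmission. The dissertation is therefore divided into the following two aspects:

1) Efficient algorithms for data regeneration. When choosing the erasure-code-based encoding algorithm, this dissertation selects codes that rely on the exclusive-OR (XOR) operation, which helps to regenerate data efficiently. It focuses on the RDP code and the EVENODD code, which are applied in RAID 6 systems. The RAID 6 technique offers lower redundancy and higher fault tolerance, and it can access data randomly in an effective way. It can also process data in parallel and eliminate access bottlenecks at the parity disks. These advantages of the RAID 6 technique promote the performance of the whole system.

2) Shortening the transmission time of regenerated data over the network. When the network link status cannot be changed, reducing the amount of transmitted data is the only way to save recovery time, that is to say, decreasing the recovery bandwidth. Network coding breaks the convention of using network nodes only to store and forward data and makes full use of the calculation and coding abilities of network nodes, so that intermediate nodes can participate in the encoding and decoding process, which increases the information per unit of data and enlarges network throughput. Applying network coding to disk failure recovery can decrease recovery bandwidth and increase repair efficiency.

Meanwhile, since a single-node failure happens more often than multiple nodes failing at the same time, this dissertation mainly studies how to recover from a single-node failure rapidly. Traditional single-node failure recovery algorithms focus on the recovery of original data disks, whereas the recovery of a parity disk needs to download all the original data. The recovery efficiency is therefore comparatively low.

This dissertation puts forward fast parity disk recovery algorithms for the RDP code and the EVENODD code. Based on a study of the RDP and EVENODD encoding algorithms, these algorithms make the best use of the data characteristics of the row parity disk; with the help of network coding, they can realize fast and efficient recovery of the diagonal parity disk.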
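The baseline that the abstract describes, in which a failed data disk is rebuilt with XOR while a failed parity disk is rebuilt by downloading all the original data, can be illustrated with a minimal Python sketch. It assumes the standard RDP layout for a prime p: p-1 data disks, one row parity disk, one diagonal parity disk, and p-1 rows per stripe. Blocks are modelled as small integers, and the function names (rdp_encode, recover_data_disk, rebuild_diag_parity_baseline) are hypothetical. The sketch shows only the traditional approach; it does not implement the dissertation's proposed algorithm, whose details the abstract does not give.

```python
from functools import reduce
from operator import xor
import random


def rdp_encode(data, p):
    """Encode a (p-1) x (p-1) grid of data blocks with RDP; p must be prime.

    Returns (row_parity, diag_parity), each a list of p-1 blocks.
    Blocks are modelled as small integers so XOR is a single operator.
    """
    rows = p - 1
    # Row parity: XOR of the data blocks in each row.
    row_parity = [reduce(xor, data[r]) for r in range(rows)]
    # Diagonal parity covers the data columns AND the row parity column.
    full = [data[r] + [row_parity[r]] for r in range(rows)]
    diag_parity = [0] * rows
    for r in range(rows):
        for c in range(p):
            d = (r + c) % p
            if d != p - 1:  # diagonal p-1 is never stored in RDP
                diag_parity[d] ^= full[r][c]
    return row_parity, diag_parity


def recover_data_disk(data, row_parity, failed):
    """Rebuild one failed data column from the surviving columns and row parity."""
    return [
        reduce(xor, (b for c, b in enumerate(row) if c != failed), row_parity[r])
        for r, row in enumerate(data)
    ]


def rebuild_diag_parity_baseline(data, p):
    """Baseline repair of the diagonal parity disk: fetch every data block, re-encode."""
    blocks_read = sum(len(row) for row in data)
    _, diag_parity = rdp_encode(data, p)
    return diag_parity, blocks_read


p = 5
random.seed(0)
data = [[random.randrange(256) for _ in range(p - 1)] for _ in range(p - 1)]
row_parity, diag_parity = rdp_encode(data, p)

# A single failed data disk is recovered row by row with XOR.
lost_column = [row[2] for row in data]
assert recover_data_disk(data, row_parity, failed=2) == lost_column

# The baseline parity repair reads all (p - 1) ** 2 data blocks.
rebuilt, blocks_read = rebuild_diag_parity_baseline(data, p)
assert rebuilt == diag_parity
print(blocks_read)  # 16 for p = 5
```

In this model the baseline reads (p-1)^2 data blocks to regenerate p-1 diagonal parity blocks. That transfer volume is the repair bandwidth that the dissertation aims to reduce.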