SmurfPDMS: A Platform for Query Processing in Large-Scale PDMS


Katja Hose, Christian Lemke, Jana Quasebarth, Kai-Uwe Sattler

Department of Computer Science and Automation, TU Ilmenau

P.O. Box 100565, D-98684 Ilmenau, Germany

Abstract: As Peer Data Management Systems (PDMS) are a focus of current research, many approaches, for example to query processing or routing, have to be evaluated. Since there is no common platform, approaches are evaluated in isolation. This is disadvantageous for research groups in two ways. First, building a simulation environment from scratch takes a huge effort. Second, it makes a direct comparison of approaches more difficult. In this paper, we present SmurfPDMS, an extensible system that is meant to provide a common platform into which researchers can easily integrate their approaches and that allows for running large simulation experiments in distributed environments such as workstation clusters or even PlanetLab.

1 Introduction

Peer Data Management Systems (PDMS), also known as schema-based P2P systems, are an important area of recent and current research. Emerging from federated database systems and applying the P2P paradigm, PDMS have to cope with the challenges that come with peer autonomy. This means that all peers are equal in terms of issuing and processing queries, each peer owns its private local data, and each peer might have a local schema that is unique in the whole network. Furthermore, each peer can only communicate with those neighbor peers to which mappings exist.

This, together with the fact that we consider unstructured P2P systems as the basis for PDMS, requires efficient distributed query processing strategies that do not need any kind of global knowledge. In contrast to structured P2P systems like Chord [SMK+01], PDMS do not have global indexes or hash functions that could help us find the data we are looking for. Since we are also not allowed to rearrange the peers' data, we have to find other ways to route queries efficiently through the network. A common means to do this are routing indexes [CGM02], which can be used to identify which neighbors hold data that matches the query.
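To make the idea of a routing index concrete, the following Python fragment is a minimal sketch, not the data structure used in SmurfPDMS or in [CGM02]. It assumes that each peer keeps, per neighbor, an estimated count of matching records for each attribute, and that a neighbor is worth contacting only if it can contribute to every attribute of the query. The class name RoutingIndex, its methods, and the min-based score are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RoutingIndex:
    """Per-peer summary of the data reachable through each neighbor (hypothetical)."""

    # neighbor id -> {attribute name -> estimated number of matching records}
    entries: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def update(self, neighbor: str, attribute: str, count: int) -> None:
        """Record the estimated number of records for one attribute behind a neighbor."""
        self.entries.setdefault(neighbor, {})[attribute] = count

    def rank_neighbors(self, query_attributes: List[str]) -> List[Tuple[str, int]]:
        """Return the neighbors that may hold matching data, most promising first."""
        ranked = []
        for neighbor, summary in self.entries.items():
            counts = [summary.get(attr, 0) for attr in query_attributes]
            # Assumption: the weakest attribute bounds how useful the neighbor is.
            score = min(counts) if counts else 0
            if score > 0:
                ranked.append((neighbor, score))
        # Sort by descending score; break ties by neighbor id for determinism.
        ranked.sort(key=lambda pair: (-pair[1], pair[0]))
        return ranked


index = RoutingIndex()
index.update("peer_b", "author", 120)
index.update("peer_b", "title", 80)
index.update("peer_c", "author", 10)   # no "title" data behind peer_c
index.update("peer_d", "author", 300)
index.update("peer_d", "title", 5)

# Only peer_b and peer_d can contribute to a query over both attributes.
print(index.rank_neighbors(["author", "title"]))
```

A peer would forward the query only to the neighbors returned by rank_neighbors, so no global index or hash function is needed. Each peer relies solely on the summaries it holds about its direct neighbors.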

Though aspects like routing indexes, query processing strategies, and dynamic behavior have a great influence on each other, they are usually considered independently by different research groups. The use of different platforms, implementations, and assumptions hampers a direct comparison of similar concepts. Additionally, many existing simulators like ns-2 (/nsnam/ns/) are too low-level for simulating PDMS appropriately. Most environments have another severe drawback: their lack of documentation and extensibility. Furthermore, they often do not have an intuitive user interface, let alone a graphical one, that might allow outsiders to configure and use the system.

In this paper, we do not present yet another PDMS in addition to systems like Piazza [TIM+03] but SmurfPDMS (SiMUlation enviRonment For PDMS), a common plat-