MuMHR:Multi-path, multi-hop hierarchical routing_2007


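Huffman Coding in Informatics Olympiads

Huffman coding is a variable-length coding scheme that uses the probability with which each character occurs to build a prefix code with the shortest average codeword length.

Huffman coding is an application of the Huffman tree. A Huffman tree is a binary tree whose leaves all carry weights and whose weighted path length is the smallest among all binary trees with those leaves.

In informatics olympiads, Huffman coding typically appears in data-compression and encoding problems.

For example, given a set of characters and their frequencies, design an encoding that minimizes the average code length per character.

Such problems can be solved with a Huffman tree as follows:
1. Build a Huffman tree from the character frequencies.

2. Label the tree: starting at the root, assign "0" to every left branch and "1" to every right branch, down to the leaves.

3. Concatenate the labels along the path from the root to each leaf; the resulting string is that character's Huffman code.

Huffman coding matters in informatics olympiads because it is an efficient compression and encoding method that reduces storage space and improves transmission efficiency.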
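The three steps above can be sketched in Python as follows. This is a minimal illustration, not a reference solution: the function name `huffman_codes`, the use of a binary heap, and the assumption that symbols are strings (not tuples) are choices made here for the example.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Return {symbol: bitstring} for a {symbol: weight} mapping.

    Assumes symbols are not tuples, because tuples mark internal nodes.
    """
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    # Step 1: repeatedly merge the two lightest nodes.
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}

    # Steps 2 and 3: label left branches "0", right branches "1",
    # and read each code off the root-to-leaf path.
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman_codes(freq)
total_bits = sum(freq[s] * len(codes[s]) for s in freq)
```

Different tie-breaking rules can produce different code tables, but the total weighted length stays the same.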

Basic Relations of Mulliken Populations

Mulliken's method is a useful and effective way to interpret molecular geometry and electronic structure.

它与傅里叶分解(Fourier decomposition)相结合,是20世纪中叶出现的重要方法之一。

马利肯原子结构方法的原理在古典与量子物理学之间,即用古典性质来描述量子物理。

该方法对解释各种氧化物分子以及其他复杂分子,均有十分重要的价值。

由此,不仅仅是理论化学家和结构物理学家十分重视,而且也被实验化学家、分子生物学家和材料科学家用于量化分析实验结果,以及解释实验结果。

马利肯集居数是马利肯原子结构方法中的一个重要结果。

它表示了相对分子轨道(MO)的具体分子结构,即电子结构的程度,它的物理含义是每个原子的电子构造结果。

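In other words, a Mulliken population describes the electronic structure attributed to a particular atom. It is used in research and engineering practice, for example to interpret reaction models of dipolar molecules and to design semiconductor materials. Robert S. Mulliken, the American scientist after whom the method is named, also proposed an electronegativity scale.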

从马利肯角度来讲,集居数实质上表示某个原子有多少电子被其余原子所锁定,它不仅可以用于给出某原子或分子在电子结构上定置情况,而且还可以用于描述原子偶极矩以及两种分子间的相对电子配对情况。

集居数可以用来解释分子结构规律,特别是当其余原子的电荷分布的分布状态不平衡时,就可以用它来分析分子的结构情况以及体系的力学稳定性。

此外,集居数具有量化模型,因此当原子上存在电子附着时,就可以用马利肯集居数作出量化计算。

在此之上,集居数迅速有效计算出电子张量的积分,可以用该模型再现分子的电子构造,马利肯的集居数原理甚至定义了一般原子的元素化学性质,得到了一个较直观的总结,可以帮助人们了解某一原子在某一分子的电子分配情况。

总的来说,马利肯集居数可以广泛地应用于化学和材料学的科学研究,不仅可以解释分子构造以及分子间的相互作用,而且还可以用于定量分析,能够更有效地探究化学反应机制,而不是变化当中的实质。

How to Use Markov Logic for Multimodal Data Fusion (III)

马尔可夫逻辑是一种基于概率的数学模型,用于描述一个系统在一定条件下从一个状态转移到另一个状态的过程。

它可以被应用于多模态数据融合,即将来自不同传感器或数据源的信息整合在一起,以获得更全面和准确的信息。

多模态数据融合在现代科学和工程领域中扮演着至关重要的角色。

例如,在智能交通系统中,我们需要整合来自交通摄像头、交通传感器和交通流量数据等多种数据,以实现准确的交通监控和预测。

在医学影像诊断中,结合来自X光、CT 扫描和MRI等多种不同的影像数据,可以提高疾病诊断的准确性。

因此,如何有效地利用马尔可夫逻辑进行多模态数据融合成为了一个重要课题。

首先,我们需要了解马尔可夫逻辑的基本原理。

马尔可夫逻辑是基于马尔可夫链的模型,马尔可夫链是一个描述随机过程的数学工具,其基本思想是当前状态只与前一个状态有关,与更早的状态无关。

在多模态数据融合中,我们可以将不同传感器或数据源的信息看作不同的状态,而它们之间的转移概率则可以描述为各种数据之间的相关性。

通过建立马尔可夫逻辑模型,我们可以利用这些转移概率来推断不同数据之间的关联关系,从而实现数据融合。

其次,马尔可夫逻辑可以用于处理不完整或噪声数据。

在实际应用中,不同传感器或数据源采集到的信息可能存在缺失或者噪声,这会导致数据融合的困难。

马尔可夫逻辑具有较强的容错性,它可以通过对缺失数据的推断和对噪声数据的抑制,从而在一定程度上提高了数据的完整性和准确性。

通过建立合适的观测模型和转移模型,我们可以利用马尔可夫逻辑来对不完整或噪声数据进行有效的补全和过滤,从而实现多模态数据的融合。

此外,马尔可夫逻辑还可以用于动态系统的建模和预测。

在多模态数据融合中,我们不仅需要对数据进行静态的整合,还需要对数据进行动态的建模和预测。

马尔可夫逻辑具有良好的动态建模能力,它可以通过对当前状态的观测和转移概率的推断,来预测系统未来的状态。

这对于实现对多模态数据融合结果的实时监控和预测具有重要意义。
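The prediction step described above, propagating the current state through the transition probabilities, can be sketched as follows. The two traffic states and the numbers in the transition matrix are invented for illustration only.

```python
def step(distribution, transition):
    """One prediction step: new[j] = sum_i distribution[i] * transition[i][j]."""
    n = len(distribution)
    return [sum(distribution[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def predict(distribution, transition, steps):
    """Propagate a state distribution a given number of steps ahead."""
    for _ in range(steps):
        distribution = step(distribution, transition)
    return distribution


# Hypothetical states: index 0 = "free-flowing", index 1 = "congested".
transition = [
    [0.9, 0.1],  # from free-flowing
    [0.5, 0.5],  # from congested
]
now = [1.0, 0.0]  # the road is currently observed to be free-flowing
two_steps_ahead = predict(now, transition, 2)
```

Each row of the transition matrix must sum to 1, so the predicted distribution also sums to 1.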

How to Use Markov Logic for Feature Crossing in Multimodal Data Fusion (X)

在当今信息爆炸的时代,多模态数据融合成为了一项重要的研究课题。

多模态数据融合是指从多个不同模态的数据源中获取信息,并将这些信息整合到一个统一的框架中进行分析和处理。

马尔可夫逻辑是一种用于建模复杂系统和数据的强大工具,其在多模态数据融合中也有着广泛的应用。

本文将探讨如何利用马尔可夫逻辑进行多模态数据融合的特征交叉。

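## Introduction to Markov logic

Markov logic is a logical framework for modeling uncertainty. Its core idea is to combine first-order logic with Markov random fields (MRFs): each logical formula carries a weight, and a possible world becomes more probable the more weighted formulas it satisfies.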

在多模态数据融合中,马尔可夫逻辑可以用于将来自不同模态的数据进行联合建模,并对其进行概率推断。

通过马尔可夫逻辑,可以实现模态之间的信息共享和交互,从而更好地利用多模态数据的丰富信息。
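As a rough illustration of joint modeling across modalities, the sketch below enumerates every truth assignment of a few ground atoms and scores each one by the total weight of the formulas it satisfies, which is how Markov logic defines the probability of a world. The atom names, the single formula, and its weight are hypothetical, and only ground (variable-free) formulas are handled.

```python
import math
from itertools import product


def world_probabilities(atoms, weighted_formulas):
    """Return a list of (world, probability) pairs.

    weighted_formulas is a list of (weight, predicate) pairs; each predicate
    takes a dict {atom: bool} and returns True when the formula holds.
    P(world) is proportional to exp(sum of weights of satisfied formulas).
    """
    worlds = [dict(zip(atoms, values))
              for values in product([False, True], repeat=len(atoms))]
    scores = [math.exp(sum(w for w, holds in weighted_formulas if holds(world)))
              for world in worlds]
    z = sum(scores)  # partition function
    return [(world, score / z) for world, score in zip(worlds, scores)]


# Two modalities reporting on the same event (names are illustrative).
atoms = ["camera_jam", "sensor_jam"]
formulas = [
    # Soft rule: the camera and the road sensor tend to agree.
    (1.5, lambda w: w["camera_jam"] == w["sensor_jam"]),
]
result = world_probabilities(atoms, formulas)
```

Worlds in which the two modalities agree receive more probability than worlds in which they disagree, but disagreement is not ruled out; that is the difference between a weighted formula and a hard logical constraint.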

## Challenges of multimodal data fusion

One of the biggest challenges in multimodal data fusion is performing feature crossing effectively.

不同模态的数据往往具有不同的特征表示,如何将这些特征进行有效的交叉,以获取更丰富和准确的信息,是多模态数据融合面临的重要问题。

传统的特征交叉方法往往只能处理单一模态的数据,对于多模态数据的特征交叉仍然具有一定的局限性。

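## Applying Markov logic to multimodal data fusion

Using Markov logic for feature crossing in multimodal data fusion exploits the associations and interactions between modalities and can therefore yield better feature fusion.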

首先,可以利用马尔可夫逻辑对不同模态的数据进行联合建模,构建一个统一的概率图模型。

在这个模型中,不同模态的数据可以相互影响和补充,从而实现了模态之间的信息共享和交互。

其次,通过马尔可夫逻辑的推断和学习方法,可以有效地实现多模态数据的特征交叉。

马尔可夫逻辑可以通过学习模型参数和推断隐藏变量的方法,自动发现不同模态数据之间的潜在关联和相互作用,从而实现了特征的有效交叉和整合。

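## Combining deep learning with Markov logic

In recent years deep learning, a powerful approach to pattern recognition and feature learning, has also been widely applied to multimodal data fusion.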

深度学习可以通过神经网络模型来学习和提取不同模态数据的特征表示,从而实现了对多模态数据的高效整合和处理。

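Merging Multi-band, Multi-system MR Data to Predict 5G Multi-band Coverage after Frequency Refarming

Tian Chao (田超), Dai Ting (戴廷)
(China United Network Communications Co., Ltd., Anhui Branch, Hefei, Anhui 230071)

DIGITCW 2024.02, Technology Analysis, p. 54

Abstract: Traditional MR coverage evaluation can only assess the coverage of an existing single-system network. It cannot assess multi-band coverage after frequency refarming, and it suffers from "survivorship bias", which leaves the evaluation data incomplete.

About the authors: Tian Chao (b. 1978), male, from Ma'anshan, Anhui; intermediate engineer, bachelor's degree; works mainly on wireless network planning and construction.

Dai Ting (b. 1983), male, from Xuzhou, Jiangsu; intermediate engineer, bachelor's degree; works mainly on wireless network operation and management.

0 Introduction

Mobile networks commonly use MR (measurement report) data to evaluate coverage. Traditional MR coverage evaluation generally uses only intra-frequency MR data, which is fairly accurate for a single-system network. In a multi-band, multi-system network, however, the camping policies for inter-band and inter-system interoperation introduce "survivorship bias": the evaluation covers only users currently camped on the network being measured, so a high camping threshold makes the result inaccurate and hides the true coverage.

In addition, traditional methods can only evaluate networks that are already built. They cannot predict coverage under 5G multi-band networking after frequency refarming, and so cannot precisely guide network planning and construction.

To address these problems, and based on the principles of MR measurement, this paper proposes a coverage evaluation method that uses multi-band, multi-system MR measurements (intra-frequency, inter-frequency, and inter-system). Merging inter-frequency and inter-system MR data compensates for the MR data that camping policies leave missing and restores the coverage of a single frequency. This avoids the survivorship bias of intra-frequency MR evaluation, improves evaluation accuracy, and makes it possible to predict the coverage of 5G multi-band networking before refarming, providing data to support network planning and construction.

The method was applied to China Unicom's L900M + L1800M network and 5G network and verified by field measurement; the measured results largely agree with the MR evaluation, which demonstrates the method's accuracy.

1 Overall scheme

(1) Collect L900M intra-frequency MR data and L1800M inter-frequency MR measurements of L900M, then perform data processing and grid data [text truncated in the source]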

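Monte Carlo Genetic Algorithm in Python

Introduction

The Monte Carlo genetic algorithm is an optimization algorithm built on probability and randomness.

It combines ideas from Monte Carlo simulation and genetic algorithms to solve complex optimization problems.

Python offers a rich set of libraries and tools that make such an algorithm straightforward to implement.

Basic principle

The algorithm estimates the optimum of a problem by simulating random events.

It searches the solution space with the evolutionary process of a genetic algorithm and evaluates each individual's fitness by Monte Carlo simulation.

1. Initialize the population: randomly generate an initial population in which each individual represents one candidate solution.

2. Evaluate fitness: compute the objective function value of every individual.

3. Selection: choose parent individuals for crossover and mutation according to their fitness.

4. Crossover: recombine the selected parents to produce offspring.

5. Mutation: mutate the offspring to introduce new gene combinations.

6. Update the population: merge offspring and parents into the new population.

7. Repeat steps 2-6 until a stopping condition is met (the maximum number of iterations is reached or a satisfactory solution is found).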

Python implementation

The following simple Python example shows how to use the algorithm on a small optimization problem.

```python
import random


# Objective: minimize x^2 + y^2
def objective_function(x, y):
    return x**2 + y**2


# Initialize the population with random points in [-10, 10] x [-10, 10]
def initialize_population(population_size):
    population = []
    for _ in range(population_size):
        x = random.uniform(-10, 10)
        y = random.uniform(-10, 10)
        population.append((x, y))
    return population


# Evaluate the objective for every individual (lower is better)
def calculate_fitness(population):
    return [objective_function(x, y) for x, y in population]


# Selection: roulette wheel. Because the problem is a minimization, the
# weight is the inverse of the objective value, so that better individuals
# are drawn more often (weights proportional to the raw value would favor
# the worst individuals).
def selection(population, fitness_values):
    weights = [1.0 / (fitness + 1e-12) for fitness in fitness_values]
    return random.choices(population, weights=weights, k=len(population))


# Crossover: single-point crossover
def crossover(parent1, parent2):
    crossover_point = random.randint(1, len(parent1) - 1)
    child1 = parent1[:crossover_point] + parent2[crossover_point:]
    child2 = parent2[:crossover_point] + parent1[crossover_point:]
    return child1, child2


# Mutation: replace a gene by a fresh random value with a small probability
def mutation(individual, mutation_rate):
    mutated_individual = []
    for gene in individual:
        if random.random() < mutation_rate:
            mutated_individual.append(random.uniform(-10, 10))
        else:
            mutated_individual.append(gene)
    return tuple(mutated_individual)


# Build the next generation from the selected parents
def update_population(selected_population, population_size, mutation_rate):
    new_population = []
    while len(new_population) < population_size:
        parent1, parent2 = random.sample(selected_population, 2)
        child1, child2 = crossover(parent1, parent2)
        new_population.append(mutation(child1, mutation_rate))
        new_population.append(mutation(child2, mutation_rate))
    return new_population[:population_size]


def main():
    population_size = 100
    max_iterations = 1000
    mutation_rate = 0.01
    population = initialize_population(population_size)
    for _ in range(max_iterations):
        fitness_values = calculate_fitness(population)
        selected_population = selection(population, fitness_values)
        population = update_population(
            selected_population, population_size, mutation_rate)
    best_solution = min(
        population, key=lambda individual: objective_function(*individual))
    print("Best solution:", best_solution)


if __name__ == "__main__":
    main()
```

Summary

The Monte Carlo genetic algorithm combines ideas from Monte Carlo simulation and genetic algorithms and can be applied to a wide range of complex optimization problems.

Multi-objective Genetic Algorithms in Python: Constraints

The multi-objective genetic algorithm (MOGA) is an optimization algorithm for multi-objective optimization problems.

在传统的优化算法中,通常只考虑一个目标函数,而在现实生活中,很多问题往往涉及到多个相互关联的目标。

MOGA通过综合考虑多个目标函数,寻找一组最优解,这些解被称为“非支配解”或“帕累托最优解”。

在MOGA中,每个解都表示为一个个体,而这些个体会根据其在目标空间中的表现进行进化。

遗传算法的基本思想是通过模拟自然界的进化过程,利用选择、交叉和变异等操作,逐步改进个体的适应度。

对于MOGA而言,需要设计适应度函数来评估每个个体在多个目标函数上的表现。

在设计适应度函数时,需要考虑多个目标函数之间的权重关系。

通常,我们可以通过设定不同的权重值来平衡不同目标之间的重要性。

例如,如果一个问题涉及到最小化成本和最大化利润两个目标,我们可以设定一个权重值来调整两个目标的相对重要性。

MOGA的核心思想是通过不断进化,逐步逼近帕累托最优解集。

帕累托最优解集是指在目标空间中不可被其他解支配的解的集合。

通过遗传算法的操作,MOGA可以在解空间中搜索到一组帕累托最优解,这些解在多个目标函数上都表现优秀。
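The notion of a non-dominated (Pareto-optimal) set can be made concrete with a short sketch. It assumes every objective is to be minimized; the function names and the sample points are illustrative only.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Each tuple is (cost, delivery_time) for one candidate solution.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
```

Here `(3, 4)` is dominated by `(2, 3)` and `(5, 5)` by `(1, 5)`, so three candidates remain. This quadratic-time filter is adequate for small populations; algorithms such as NSGA-II use faster non-dominated sorting.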

为了增加算法的多样性和避免陷入局部最优解,MOGA还引入了一些改进措施。

例如,引入种群多样性维持机制,通过选择适当的交叉和变异操作,保持种群的多样性。

另外,也可以采用多目标选择算子,来选择不同适应度等级的个体,以增加种群的多样性。

在应用MOGA解决实际问题时,需要根据具体情况进行算法的参数设置。

例如,种群规模、交叉概率、变异概率等都需要根据问题的特点来确定。

另外,也可以采用一些高级的MOGA算法,如NSGA-II、SPEA2等,来进一步提高算法的性能。

总的来说,MOGA是一种强大的优化算法,可以有效解决多目标优化问题。

通过综合考虑多个目标,MOGA能够找到一组帕累托最优解,这些解在多个目标函数上都表现优秀。

在实际应用中,我们可以根据具体问题的特点,合理设计适应度函数和参数设置,以获得更好的优化结果。

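Building a Huffman Tree from a Character Set and Occurrence Probabilities

I. Introduction

In information technology, building a Huffman tree from a character set and the occurrence probabilities of its characters is an important, basic concept.

This article examines the character set and the construction of the Huffman tree, and analyzes how the occurrence probabilities affect the construction.

II. Character set and occurrence probabilities

A character set is the collection of all characters available under a given encoding.

In computing, it usually means all the characters that can be used to represent information such as text, images, and sound.

The occurrence probability of a character is its relative frequency in a text or data set, usually expressed as a probability distribution.

Constructing a Huffman tree starts from the occurrence probability of every character in the set, because the more probable a character is, the shorter its Huffman code should be in order to improve coding efficiency.

An accurate estimate of these probabilities is therefore essential to the construction.

III. Steps for constructing a Huffman tree

1. Create the leaf nodes: for the given character set, create one leaf node per character, weighted by its occurrence probability.

Leaves of more probable characters should end up on shorter paths in the tree.

2. Create the internal nodes: following the Huffman rule, repeatedly merge the two nodes with the smallest probabilities into a new internal node whose probability is their sum.

This continues until all nodes have been merged into a single root.

3. Determine the codes: traverse the paths of the tree to obtain the code of each character, which completes the Huffman code.

IV. Effect of occurrence probabilities on the Huffman tree

Occurrence probabilities strongly affect the shape of the Huffman tree.

In practice, if the probabilities differ widely, the code lengths differ widely as well.

A sound estimate of the probabilities improves the efficiency of the Huffman code.

When the probabilities are close to one another, the tree is fairly balanced and the code lengths are similar, so the code behaves much like a fixed-length code and saves little space.

When the probabilities differ widely, the tree is unbalanced and the code lengths vary considerably: frequent characters get short codes and rare ones long codes, which is where the compression gain comes from.

V. My views and understanding

In my view, building a Huffman tree from a character set and occurrence probabilities is a very important basic concept.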

Main Path Analysis and Related Methods: A Brief Introduction

1. Local main path

Main path analysis was first proposed in 1989 by N. P. Hummon (with P. Doreian) [5].

他从一个由40个节点组成的,代表DNA学科的引文网络中提取出一些起中心作用的节点,并由这些节点之间的连接代表了DNA学科发展的主要思想变迁。

主路径分析不同于传统的诸如文献耦合,共被引分析等方法,它更关注网络中节点之间的连接而非节点本身。

而连接的重要程度一般用SPC值来衡量,这是一种最常用的测量方式[1],另外还有NNPC,SPLC,SPNP等方法[2],这里不一一详细介绍。

一个连接的SPC值可以由网络中从所有起点出发到所有终点结束所经过的所有路径中所穿过此连接的次数来衡量。

被穿过的次数越多,即SPC值越大,这个连接也就越重要。

而主路径分析就是从所有起始点出发,找出下一个连接中SPC值最大的连接,并以下一个节点为起始点,重复此一过程直到最后终点所得到的路径。

需要补充的是,如果有若干个连接同时具有最大SPC值,则同时入选。

此方法之所以被称为“局部主路径”[2],是因为这一算法的每一步只关注当前连接的最大SPC值,它强调的是知识传播过程中的局部重要性。
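The SPC weighting and the greedy local search described above can be sketched as follows. This is one reading of the procedure, under stated assumptions: the citation network is an acyclic list of arcs with no duplicates, the SPC of an arc is computed as (number of paths from any source to its tail) times (number of paths from its head to any sink), and at each step the largest SPC is taken over all arcs leaving the current frontier. The function names and the toy network are hypothetical.

```python
from functools import lru_cache


def spc_weights(edges):
    """Return {(u, v): SPC} for a DAG given as a list of (u, v) arcs."""
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)

    @lru_cache(maxsize=None)
    def paths_from_sources(n):
        # A node without predecessors is a source.
        return 1 if n not in pred else sum(paths_from_sources(p) for p in pred[n])

    @lru_cache(maxsize=None)
    def paths_to_sinks(n):
        # A node without successors is a sink.
        return 1 if n not in succ else sum(paths_to_sinks(s) for s in succ[n])

    return {(u, v): paths_from_sources(u) * paths_to_sinks(v) for u, v in edges}


def local_main_path(edges):
    """Greedy forward search that keeps, at each step, every arc tied for
    the largest SPC among the arcs leaving the current frontier."""
    spc = spc_weights(edges)
    frontier = {u for u, _ in edges} - {v for _, v in edges}  # the sources
    chosen = []
    while True:
        candidates = [e for e in edges if e[0] in frontier]
        if not candidates:
            return chosen
        best = max(spc[e] for e in candidates)
        picked = [e for e in candidates if spc[e] == best]
        chosen.extend(e for e in picked if e not in chosen)
        frontier = {v for _, v in picked}


edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"),
         ("C", "E"), ("D", "F"), ("E", "F")]
spc = spc_weights(edges)
path = local_main_path(edges)
```

In this toy network the arcs C→D and C→E tie, so both are kept and the path forks, which is the branching behavior that ties produce.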

当然,相对于局部主路径还有一种“全局主路径”[2],此方法试图在网络整体中找到一条SPC值最大的路径。

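Figure 1. Local main path (Hummon NP, 1989)

Local main path analysis is now illustrated using IT outsourcing (ITO) as the example, as shown in Figure 2.

Figure 2. ITO local main path

2. Global main path

The idea of a global main path already appears in reference [3].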

但首先提出这个说法的是在文献[2]中。

如上所述,全局主路径是在整个网络中找到一条SPC值最大的路径,这也就是在有向(有权值)图中找一条最长路径。

相比较于局部主路径,全局主路径更关注知识传播的整体重要性而非局部重要性。

文献[3]指出,一般来说全局主路径比局部主路径要长;而文献[2]得出的结果是两条路径一样长,结果还表明二者实际上更是一种优势互补关系,同时考虑二者应该比只考虑一种更有说服力。

另外,由于局部主路径的每一步都是选取具有最大SPC值的连接点,若出现具有同样大小的最大SPC值的连接点,则同时入选,因此,局部主路径上会出现分叉现象,有时候包含的点会更多一些。

Two-layer Encoding in Genetic Algorithms

遗传算法的双层编码(Two-layer Encoding)是一种在解决特定复杂优化问题时采用的编码策略,特别适用于具有层次结构或多个决策级别的问题。

在双层编码中,个体的表示被划分为两个或多个相互关联的部分,每一部分分别对应问题的不同层面。

例如,在处理一些复杂的组合优化问题如多级物流配送问题、任务调度问题或者多层次的决策结构设计问题时,可能需要将问题分解为上层和下层结构,每个层级都有其对应的决策变量:
1. 第一层编码:通常用于表达高层级的决策变量,如区域分配、主路径选择等。

2. 第二层编码:用于描述低层级的具体操作细节,比如在给定区域内的具体路线规划、任务在某个阶段的具体安排等。

在进化过程中,遗传算法不仅对整体结构进行变异和交叉,同时对每层编码也独立进行这些操作。

这样可以确保在保持整体结构合理的同时,也能细化到局部最优解的搜索。

例如,在求解MDARP(Multi-Dimensional Assignment Problem with Routing Paths)这类问题时,可能会首先通过第一层编码确定车辆与客户之间的大致分配关系,然后通过第二层编码详细指定每辆车的行驶路径。

在每次迭代中,双层编码都会同时更新两层编码的信息,以适应不断优化的过程。

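The MFISTA Algorithm

MFISTA (Monotone Fast Iterative Shrinkage-Thresholding Algorithm) is an iterative optimization algorithm for sparse representation problems. It is a variant of FISTA whose objective value never increases from one iteration to the next.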

它在L1范数正则化的框架下,能够高效地求解具有稀疏性的信号重构问题。

稀疏表示问题是指在给定一组基函数的情况下,通过寻找最少的基函数线性组合来表示一个信号。

这个问题在信号处理、图像处理、机器学习等领域都有广泛的应用。

MFISTA算法的目标就是通过迭代的方式,找到一个最优的稀疏表示。

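The core idea of MFISTA is to cast sparse representation as an optimization problem and solve it by iterative shrinkage-thresholding.

The steps are:
1. Initialization: given the signal and the basis functions, initialize the sparse coefficients.

2. Iterative update: update the coefficients until convergence. Each iteration computes the gradient of the data-fit term and takes a gradient step.

3. Shrinkage-thresholding: after the gradient step, apply soft-thresholding. The threshold is the regularization weight multiplied by the step size. The FISTA step then extrapolates from the last two iterates (a momentum term), which gives faster convergence than plain iterative shrinkage-thresholding; MFISTA additionally keeps the new point only if it does not increase the objective.

4. Stopping condition: when the coefficients no longer change significantly, the algorithm has converged and returns the final sparse representation.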
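The steps above can be sketched in plain Python for the problem min 0.5·||Ax − b||² + λ·||x||₁. This is a small sketch, not a production solver. It assumes that MFISTA here means the monotone FISTA variant of Beck and Teboulle, uses a fixed iteration count instead of a convergence test, and bounds the Lipschitz constant by the squared Frobenius norm of A; all function names are chosen for this example.

```python
import math


def soft_threshold(v, tau):
    """Shrink every entry toward zero by tau (proximal map of the L1 norm)."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


def residual(A, b, x):
    return [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]


def objective(A, b, x, lam):
    r = residual(A, b, x)
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xj) for xj in x)


def gradient(A, b, x):
    """Gradient of the smooth part: A^T (A x - b)."""
    r = residual(A, b, x)
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]


def mfista(A, b, lam, iterations=200):
    n = len(A[0])
    # Squared Frobenius norm: a safe upper bound on the Lipschitz constant.
    lipschitz = sum(a * a for row in A for a in row)
    x = [0.0] * n
    y = x[:]
    t = 1.0
    for _ in range(iterations):
        g = gradient(A, b, y)
        # Gradient step followed by soft-thresholding.
        z = soft_threshold([yj - gj / lipschitz for yj, gj in zip(y, g)],
                           lam / lipschitz)
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        x_prev = x
        # Monotone safeguard: accept z only if the objective does not increase.
        x = z if objective(A, b, z, lam) <= objective(A, b, x_prev, lam) else x_prev
        # Extrapolation (momentum) from the last two iterates.
        y = [xj + (t / t_next) * (zj - xj) + ((t - 1.0) / t_next) * (xj - pj)
             for xj, zj, pj in zip(x, z, x_prev)]
        t = t_next
    return x


A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, 0.5, -2.0]
x_hat = mfista(A, b, lam=1.0)
```

With A equal to the identity, the exact minimizer is the soft-thresholded signal, so `x_hat` should be close to `[2, 0, -1]`; the small middle entry of `b` is driven to zero, which is the sparsity effect of the L1 penalty.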

MFISTA算法相比于其他稀疏表示算法具有以下优点:1. 收敛速度快:MFISTA算法通过引入FISTA步骤,能够更快地收敛,提高算法的运行效率。

2. 稀疏性更好:MFISTA算法能够得到更加稀疏的稀疏表示结果,即使用更少的基函数来表示信号。

3. 适用性广:MFISTA算法在不同领域的稀疏表示问题中都有较好的应用效果,包括图像处理、压缩感知、信号恢复等。

4. 鲁棒性强:MFISTA算法对噪声和数据不完整性具有较好的鲁棒性,能够处理一些复杂的实际问题。

尽管MFISTA算法在稀疏表示问题中取得了较好的效果,但仍然存在一些局限性。

首先,算法的收敛性与初始化值有关,不同的初始化值可能导致不同的收敛结果。

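The Munkres Function

The Munkres algorithm, also known as the Hungarian algorithm (or Kuhn-Munkres algorithm), solves the assignment problem, that is, finding a minimum-cost (or maximum-weight) perfect matching in a weighted bipartite graph, in polynomial time.

A matching in a bipartite graph is a set of edges no two of which share a vertex; a maximum matching is a matching with as many edges as possible.

The basic idea is to find augmenting paths in the graph and augment along them repeatedly until a maximum matching is obtained.

The steps are:
1. Initialization: mark every vertex as unmatched.

2. Find an augmenting path: starting from any unmatched vertex, search with DFS or BFS until an augmenting path is found.

An augmenting path starts at an unmatched vertex, alternates between edges outside and inside the current matching, and ends at another unmatched vertex.

3. Augment: flip the edges along the path, so that unmatched edges become matched and matched edges become unmatched. This increases the size of the matching by one.

4. Repeat steps 2 and 3 until every vertex is matched or no augmenting path remains.

The time complexity of the Munkres algorithm is O(V^3), where V is the number of vertices.

There are at most V augmentations, and in the standard implementation each one costs O(V^2).

The Munkres algorithm is therefore efficient in practice and widely used for assignment and bipartite matching problems.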
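The augmenting-path loop in steps 2 to 4 can be sketched for the unweighted case as follows. Note the scope: this is the simple DFS-based augmenting-path method for maximum cardinality matching (often called Kuhn's algorithm), not the full weighted Munkres algorithm, which additionally maintains vertex potentials. The function and variable names are chosen for this example.

```python
def max_bipartite_matching(adjacency, n_right):
    """adjacency[u] lists the right-side vertices adjacent to left vertex u.

    Returns (size, match_right), where match_right[v] is the left vertex
    matched to right vertex v, or -1 if v is unmatched.
    """
    match_right = [-1] * n_right

    def try_augment(u, visited):
        # DFS for an augmenting path that starts at left vertex u.
        for v in adjacency[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or its current partner can be re-matched elsewhere.
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    size = 0
    for u in range(len(adjacency)):
        if try_augment(u, set()):
            size += 1
    return size, match_right


# Three workers (left) and three tasks (right).
adjacency = [[0, 1], [0], [1, 2]]
size, match_right = max_bipartite_matching(adjacency, 3)
```

In this example worker 0 first takes task 0, then gives it up to worker 1 and moves to task 1: that reassignment is exactly one augmentation along an alternating path.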

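Understanding Multinomial Logistic Regression

Multinomial logistic regression is a widely used classification algorithm in machine learning and statistical analysis.

It extends logistic regression to multi-class problems and predicts one of several discrete categories.

In multinomial logistic regression the target variable can take two or more discrete categories.

Unlike binary logistic regression, which has a single linear predictor, multinomial logistic regression fits one linear predictor per category (relative to a reference category) and estimates them jointly.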

模型的输出是每个类别的概率。

具体地说,对于每个类别,模型计算一个线性函数的概率,然后对这些概率进行归一化,以确保它们的总和为1。

最终,模型将观测值分配给具有最高概率的类别。

多项式逻辑斯蒂回归可以使用最大似然估计方法来估计模型参数。

估计的参数可以用于预测新的观测值的类别。

在实际应用中,多项式逻辑斯蒂回归常用于文本分类、医疗诊断、人脸识别等领域。

例如,在文本分类任务中,可以使用多项式逻辑斯蒂回归将不同的文档分配到预定义的类别中。

在医疗诊断中,可以使用多项式逻辑斯蒂回归来预测一位患者属于哪个疾病类型。

在人脸识别任务中,多项式逻辑斯蒂回归可用于将人脸图像识别为不同的人物。

尽管多项式逻辑斯蒂回归在许多领域中表现出色,但它也有一些限制。

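First, it requires the observations to be independent of one another.

Second, it assumes that the response follows a multinomial (categorical) distribution given the predictors, and that the log-odds between categories are linear in the predictors.

This limits its ability to capture nonlinear relationships unless suitably transformed features are added.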

此外,多项式逻辑斯蒂回归也可能受到过度拟合的问题。

为了提高模型的性能,可以使用一些技术和方法。

例如,特征选择可以帮助排除一些不相关的特征,以减少模型的复杂度。

此外,交叉验证可以用于评估模型的性能和避免过拟合。

另外,通过增加训练样本量和调整模型的正则化参数,也可以提高模型的鲁棒性。

总结来说,多项式逻辑斯蒂回归是一种常用的多分类算法,在机器学习和统计分析中有着广泛的应用。

它适用于文本分类、医疗诊断、人脸识别等领域。

尽管存在一些局限性,但通过使用合适的技术和方法,可以提高模型的性能和稳定性。

EMMREML Software Manual

Package‘EMMREML’October12,2022Type PackageVersion3.1Date2015-07-20Title Fitting Mixed Models with Known Covariance StructuresAuthor Deniz Akdemir,Okeke Uche GodfreyMaintainer Deniz Akdemir<****************************>Depends Matrix,statsDescription The main functions are'emmreml',and'emmremlMultiKernel'.'emm-reml'solves a mixed model with known covariance structure using the'EMMA'algo-rithm.'emmremlMultiKernel'is a wrapper for'emmreml'to handle multiple random compo-nents with known covariance structures.The function'emmremlMultivariate'solves a multivari-ate gaussian mixed model with known covariance structure using the'ECM'algorithm. License GPL-2NeedsCompilation noRepository CRANDate/Publication2015-07-2205:52:07R topics documented:EMMREML (2)emmreml (2)emmremlMultiKernel (4)emmremlMultivariate (6)Index91EMMREML Fitting mixed models with known covariance structures.DescriptionThe main functions are emmreml,and emmremlMultiKernel.emmreml solves a mixed model with known covariance structure using the EMMA algorithm in Kang et.al.(2008).emmremlMulti-Kernel is a wrapper for emmreml to handle multiple random components with known covariance structures.The function emmremlMultivariate solves a multivariate gaussian mixed model with known covariance structure using the ECM algorithm in Zhou and Stephens(2012).DetailsPackage:EMMREMLType:PackageVersion: 3.1Date:2015-07-20License:GPL-2Author(s)Deniz Akdemir,Okeke Uche GodfreyMaintainer:Deniz Akdemir<****************************>ReferencesEfficient control of population structure in model organism association mapping.Kang,Hyun Min and Zaitlen,Noah A and Wade,Claire M and Kirby,Andrew and Heckerman,David and Daly, Mark J and Eskin,Eleazar.Genetics,2008.Genome-wide efficient mixed-model analysis for association studies.Zhou,Xiang and Stephens, Matthew.Nature genetics,2012.emmreml Solver for Gaussian mixed model with known covariance structure.DescriptionThis function estimates the parameters of the modely=Xβ+Zu+ewhere y is 
the n vector of response variable,X is a nxq known design matrix offixed effects,Z isa nxl known design matrix of random effects,βis qx1vector offixed effects coefficients and u ande are independent variables with N l(0,σ2u K)and N n(0,σ2e I n)correspondingly.It also producesthe BLUPs and some other useful statistics like large sample estimates of variances and PEV.Usageemmreml(y,X,Z,K,varbetahat=FALSE,varuhat=FALSE,PEVuhat=FALSE,test=FALSE)Argumentsy nx1numeric vectorX nxq matrixZ nxl matrixK lxl matrix of known relationshipsvarbetahat TRUE or FALSEvaruhat TRUE or FALSEPEVuhat TRUE or FALSEtest TRUE or FALSEValueVu Estimate ofσ2uVe Estimate ofσ2ebetahat BLUEs forβuhat BLUPs for uXsqtestbetaχ2test statistics for testing whether thefixed effect coefficients are equal to zero.pvalbeta pvalues obtained from large sample theory for thefixed effects.We report the pvalues adjusted by the"padjust"function for allfixed effect coefficients.Xsqtestuχ2test statistic values for testing whether the BLUPs are equal to zero.pvalu pvalues obtained from large sample theory for the BLUPs.We report the pvalues adjusted by the"padjust"function.varuhat Large sample variance for the BLUPs.varbetahat Large sample variance for theβ’s.PEVuhat Prediction error variance estimates for the BLUPs.loglik loglikelihood for the model.Examplesn=200M1<-matrix(rnorm(n*300),nrow=n)K1<-cov(t(M1))K1=K1/mean(diag(K1))covY<-2*K1+1*diag(n)Y<-10+crossprod(chol(covY),rnorm(n))#training setTrainset<-sample(1:n,150)funout<-emmreml(y=Y[Trainset],X=matrix(rep(1,n)[Trainset],ncol=1),Z=diag(n)[Trainset,],K=K1)cor(Y[-Trainset],funout$uhat[-Trainset])emmremlMultiKernel Function tofit Gaussian mixed model with multiple mixed effects withknown covariances.DescriptionThis function is a wrapper for the emmreml tofit Gaussian mixed model with multiple mixed effects with known covariances.The modelfitted is y=Xβ+Z1u1+Z2u2+...Z k u k+e where y is the n vector of response variable,X is a nxq known design matrix offixed 
ef-fects,Z j is a nxl j known design matrix of random effects for j=1,2,...,k,βis nx1vec-tor offixed effects coefficients and U=(u t1,u t2,...,u tk )t and e are independent variables withN L(0,blockdiag(σ2u1K1,σ2u2K2,...,σ2ukK k))and N n(0,σ2e I n)correspondingly.The function pro-duces the BLUPs for the L=l1+l2+...+l k dimensional random effect U.The variance parameters for random effects are estimated as(ˆw1,ˆw2,...,ˆw k)∗ˆσ2u where w=(w1,w2,...,w k)are the kernel weights.The function also provides some useful statistics like large sample estimates of variances and PEV.UsageemmremlMultiKernel(y,X,Zlist,Klist,varbetahat=FALSE,varuhat=FALSE,PEVuhat=FALSE,test=FALSE)Argumentsy nx1numeric vectorX nxq matrixZlist list of random effects design matrices of dimensions nxl1,...,nxl kKlist list of known relationship matrices of dimensions l1xl1,...,l k xl kvarbetahat TRUE or FALSEvaruhat TRUE or FALSEPEVuhat TRUE or FALSEtest TRUE or FALSEValueVu Estimate ofσ2uVe Estimate ofσ2ebetahat BLUEs forβuhat BLUPs for uweights Estimates of kernel weightsXsqtestbeta Aχ2test statistic based for testing whether thefixed effect coefficients are equal to zero.pvalbeta pvalues obtained from large sample theory for thefixed effects.We report the pvalues adjusted by the"padjust"function for allfixed effect coefficients.Xsqtestu Aχ2test statistic based for testing whether the BLUPs are equal to zero.pvalu pvalues obtained from large sample theory for the BLUPs.We report the pvalues adjusted by the"padjust"function.varuhat Large sample variance for the BLUPs.varbetahat Large sample variance for theβ’s.PEVuhat Prediction error variance estimates for the BLUPs.loglik loglikelihood for the model.Examples####example#Data from Gaussian process with three#(total four,including residuals)independent#sources of variationn=80M1<-matrix(rnorm(n*10),nrow=n)M2<-matrix(rnorm(n*20),nrow=n)M3<-matrix(rnorm(n*5),nrow=n)#Relationship 
matricesK1<-cov(t(M1))K2<-cov(t(M2))K3<-cov(t(M3))K1=K1/mean(diag(K1))K2=K2/mean(diag(K2))K3=K3/mean(diag(K3))#Generate datacovY<-2*(.2*K1+.7*K2+.1*K3)+diag(n)Y<-10+crossprod(chol(covY),rnorm(n))#training setTrainsamp<-sample(1:80,60)funout<-emmremlMultiKernel(y=Y[Trainsamp],X=matrix(rep(1,n)[Trainsamp],ncol=1),Zlist=list(diag(n)[Trainsamp,],diag(n)[Trainsamp,],diag(n)[Trainsamp,]),Klist=list(K1,K2,K3),varbetahat=FALSE,varuhat=FALSE,PEVuhat=FALSE,test=FALSE)#weightsfunout$weights#Correlation of predictions with true values in test setuhatmat<-matrix(funout$uhat,ncol=3)uhatvec<-rowSums(uhatmat)cor(Y[-Trainsamp],uhatvec[-Trainsamp])emmremlMultivariate Function tofit multivariate Gaussian mixed model with with knowncovariance structure.DescriptionThis function estimates the parameters of the modelY=BX+GZ+Ewhere Y is the dxn matrix of response variable,X is a qxn known design matrix offixed effects, Z is a lxn known design matrix of random effects,B is dxq matrix offixed effects coefficients and G and E are independent matrix variate variables with N dxl(0,V G,K)and N dxn(0,V E,I n) correspondingly.It also produces the BLUPs for the random effects G and some other statistics. 
UsageemmremlMultivariate(Y,X,Z,K,varBhat=FALSE,varGhat=FALSE,PEVGhat=FALSE,test=FALSE,tolpar=1e-06,tolparinv=1e-06)ArgumentsY dxn matrix of response variableX qxn known design matrix offixed effectsZ lxn known design matrix of random effectsK lxl matrix of known relationshipsvarBhat TRUE or FALSEvarGhat TRUE or FALSEPEVGhat TRUE or FALSEtest TRUE or FALSEtolpar tolerance parameter for convergencetolparinv tolerance parameter for matrix inverseValueVg Estimate of V GVe Estimate of V EBhat BLUEs for BGpred BLUPs for GXsqtestBχ2test statistics for testing whether thefixed effect coefficients are equal to zero.pvalB pvalues obtained from large sample theory for thefixed effects.We report the pvalues adjusted by the"padjust"function for allfixed effect coefficients.XsqtestGχ2test statistic values for testing whether the BLUPs are equal to zero.pvalG pvalues obtained from large sample theory for the BLUPs.We report the pvalues adjusted by the"padjust"function.varGhat Large sample variance for BLUPs.varBhat Large sample variance for the elements of B.PEVGhat Prediction error variance estimates for the BLUPs.Examplesl=20n<-15m<-40M<-matrix(rbinom(m*l,2,.2),nrow=l)rownames(M)<-paste("l",1:nrow(M))beta1<-rnorm(m)*exp(rbinom(m,5,.2))beta2<-rnorm(m)*exp(rbinom(m,5,.1))beta3<-rnorm(m)*exp(rbinom(m,5,.1))+beta2g1<-M%*%beta1g2<-M%*%beta2g3<-M%*%beta3e1<-sd(g1)*rnorm(l)e2<-(-e1*2*sd(g2)/sd(g1)+.25*sd(g2)/sd(g1)*rnorm(l))e3<-1*(e1*.25*sd(g2)/sd(g1)+.25*sd(g2)/sd(g1)*rnorm(l))y1<-10+g1+e1y2<--50+g2+e2y3<--5+g3+e3Y<-rbind(t(y1),t(y2),t(y3))colnames(Y)<-rownames(M)cov(t(Y))Y[1:3,1:5]K<-cov(t(M))K<-K/mean(diag(K))rownames(K)<-colnames(K)<-rownames(M)X<-matrix(1,nrow=1,ncol=l)colnames(X)<-rownames(M)Z<-diag(l)rownames(Z)<-colnames(Z)<-rownames(M)SampleTrain<-sample(rownames(Z),n)Ztrain<-Z[rownames(Z)%in%SampleTrain,]Ztest<-Z[!(rownames(Z)%in%SampleTrain),]##For a quick answer,tolpar is set to1e-4.Correct this in 
practice.outfunc<-emmremlMultivariate(Y=Y%*%t(Ztrain),X=X%*%t(Ztrain),Z=t(Ztrain),K=K,tolpar=1e-4,varBhat=FALSE,varGhat=FALSE,PEVGhat=FALSE,test=FALSE)Yhattest<-outfunc$Gpred%*%t(Ztest)cor(cbind(Ztest%*%Y[1,],Ztest%*%outfunc$Gpred[1,],Ztest%*%Y[2,],Ztest%*%outfunc$Gpred[2,],Ztest%*%Y[3,],Ztest%*%outfunc$Gpred[3,])) outfuncRidgeReg<-emmremlMultivariate(Y=Y%*%t(Ztrain),X=X%*%t(Ztrain),Z=t(Ztrain%*%M), K=diag(m),tolpar=1e-5,varBhat=FALSE,varGhat=FALSE,PEVGhat=FALSE,test=FALSE)Gpred2<-outfuncRidgeReg$Gpred%*%t(M)cor(Ztest%*%Y[1,],Ztest%*%Gpred2[1,])cor(Ztest%*%Y[2,],Ztest%*%Gpred2[2,])cor(Ztest%*%Y[3,],Ztest%*%Gpred2[3,])IndexEMMREML,2emmreml,2emmremlMultiKernel,4 emmremlMultivariate,69。

Chapter 10: Boolean Algebra and Digital Logic (Boolean-related document), 43 slides

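A B C | Minterm       | Maxterm
0 0 0 | m0 = A'B'C'   | M0 = A+B+C
0 0 1 | m1 = A'B'C    | M1 = A+B+C'
0 1 0 | m2 = A'BC'    | M2 = A+B'+C
0 1 1 | m3 = A'BC     | M3 = A+B'+C'
1 0 0 | m4 = AB'C'    | M4 = A'+B+C
1 0 1 | m5 = AB'C     | M5 = A'+B+C'
1 1 0 | m6 = ABC'     | M6 = A'+B'+C
1 1 1 | m7 = ABC      | M7 = A'+B'+C'
(X' denotes the complement of X.)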
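The table below gives two Boolean functions and compares the logic circuit diagram of each.
A 1 denotes the result obtained after the Boolean operation.
[Table: truth values of the two Boolean functions; the entries could not be recovered from the source.]

Postulates of Boolean algebra
The postulates of Boolean algebra are defined as follows:
Let S be a set that contains only the two elements 0 and 1.
Identity elements
The identity element of + is 0, defined by A + 0 = 0 + A = A.
The identity element of · is 1, defined by A · 1 = 1 · A = A.

Chapter 10 Boolean Algebra and Digital Logic
10-1 Boolean functions and Boolean algebra
10-2 Introduction to logic circuits
10-3 Combinational circuits

Computer hardware components are built from many logic circuits. When designing a circuit, Boolean functions and Boolean algebra are used to express how the circuit is designed and what it does; they also make it possible to simplify the circuit and lower hardware cost. Besides simplification with Boolean algebra there is a more standardized technique, the Karnaugh map, which this chapter also covers. The last section introduces commonly used combinational circuits, including the multiplexer, demultiplexer, half adder, full adder, encoder, and decoder, and explains the function of each.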

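Huffman Trees of Degree m

Contents
1. Concept and basic properties of the Huffman tree
2. What "degree m" means for a Huffman tree
3. Constructing a Huffman tree
4. Applications in data compression and encoding
Text
A Huffman tree is a tree with the minimum weighted path length for a given set of leaf weights; it is mainly used in data compression and encoding.

It was proposed in 1952 by the American computer scientist David A. Huffman. The minimum weighted path length is unique, although the tree that achieves it generally is not, because ties can be broken in different ways.

A Huffman tree of degree m is one in which every node has at most m children; the familiar binary Huffman tree is the case m = 2.

The binary Huffman tree is constructed as follows:
1. Build the initial forest from the frequencies of the input characters.

Each character becomes a leaf node whose weight is its frequency.

2. Select the two nodes with the smallest weights and make them the left and right children of a new node whose weight is the sum of theirs.

3. Remove the two nodes from the set and add the new node.

4. Repeat steps 2 and 3 until a single node remains; this node is the root of the Huffman tree.

For degree m, each step merges the m smallest nodes instead of two. If n - 1 is not a multiple of m - 1 for n leaves, zero-weight dummy leaves are added first so that every merge takes exactly m nodes.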
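For a tree of degree m the same greedy merge applies, with m nodes merged per step. The sketch below computes only the minimum weighted path length, using the fact that it equals the sum of the weights of all internal nodes. The padding with zero-weight dummy leaves is the standard way to make every merge take exactly m nodes; the function name is chosen for this example.

```python
import heapq


def kary_huffman_cost(weights, m):
    """Minimum weighted path length of an m-ary Huffman tree (m >= 2)."""
    heap = list(weights)
    # Pad with zero-weight dummies until (n - 1) is a multiple of (m - 1),
    # so that every merge can take exactly m nodes.
    while len(heap) > 1 and (len(heap) - 1) % (m - 1) != 0:
        heap.append(0)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        merged = sum(heapq.heappop(heap) for _ in range(m))
        cost += merged  # each merge adds one level to all leaves below it
        heapq.heappush(heap, merged)
    return cost


binary_cost = kary_huffman_cost([5, 9, 12, 13, 16, 45], 2)
ternary_cost = kary_huffman_cost([1, 2, 3, 4], 3)
```

In the ternary example one dummy leaf is added; the first merge combines the dummy with weights 1 and 2, and the second merge forms the root.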

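Huffman trees are widely used in data compression and encoding.

Because a Huffman tree has the minimum weighted path length, it can be used to compress the original data.

During compression, each character is mapped to a path in the Huffman tree, and that path is its code.

During decompression, the codes are followed down the Huffman tree to recover the original data.

A Huffman code is a prefix code: no character's code is a prefix of another character's code, which guarantees that decoding is unambiguous.

In summary, a Huffman tree of degree m has the minimum weighted path length among trees of that degree with the given leaves, and building it yields a prefix code for data compression and encoding.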

Understanding Multinomial Logistic Regression

Multinomial logistic regression is a generalized linear model for multi-class classification.

在分类问题中,我们希望将输入数据分为两个或多个类别。

多项式逻辑回归是二元逻辑回归的自然扩展,它可以处理具有两个以上类别的情况。

在多项式逻辑回归中,我们使用了多个逻辑回归模型来分别处理每个类别。

每个模型都计算出该样本属于某个类别的概率,并选择概率最高的类别作为预测结果。

多项式逻辑回归通过引入多个线性模型参数和一个激活函数,将线性回归扩展到多类别分类问题。

为了计算每个类别的概率,我们使用 softmax 函数作为激活函数。

Softmax 函数将多个线性回归模型输出的结果转化为对应类别的概率。

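For each category k there is a linear model with input feature vector x, parameters β_k, and bias b_k.

The model output, denoted z_k, is computed as

$$z_k = \beta_k^{\top} x + b_k$$

The softmax function then converts the z_k into a predicted probability for each category:

$$P(y = k \mid x) = \frac{\exp(z_k)}{\sum_{i=1}^{K} \exp(z_i)}$$

where P(y = k | x) is the probability that the sample belongs to category k given input x, and K is the total number of categories.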
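The two formulas can be turned into a few lines of Python. The sketch is dependency-free; the parameter values are made up, and subtracting the largest score before exponentiating is a standard numerical safeguard that leaves the probabilities unchanged.

```python
import math


def softmax(scores):
    """Convert a list of scores z_k into probabilities that sum to 1."""
    shift = max(scores)  # avoids overflow in exp; the result is unchanged
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(x, betas, biases):
    """Class probabilities for one sample: z_k = beta_k . x + b_k, then softmax."""
    scores = [sum(b * xi for b, xi in zip(beta, x)) + bias
              for beta, bias in zip(betas, biases)]
    return softmax(scores)


# Three categories, two features (illustrative parameter values).
betas = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
probabilities = predict_proba([2.0, 1.0], betas, biases)
predicted_class = probabilities.index(max(probabilities))
```

The predicted category is the one with the largest probability, which is also the one with the largest score z_k.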

训练多项式逻辑回归模型的目标是最大化训练数据集上的似然函数,即最大化正确分类所有样本的概率乘积。

为了减少过拟合,通常还会引入正则化项。

最常用的优化算法是梯度下降法,它通过计算损失函数的梯度来更新模型参数。

多项式逻辑回归常用于处理多类别分类问题,例如手写数字识别、文本分类等。

它的优点是简单有效,并且具有较好的解释性。

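However, it also has limitations: its decision boundaries are linear in the features, so classes that are far from linearly separable call for feature engineering or a more flexible model.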

总结来说,多项式逻辑回归是一种用于多类别分类的广义线性模型。

它通过引入多个线性模型和 softmax 函数,将线性回归推广到多类别分类问题。

训练过程通过最大化似然函数来寻找最优的模型参数,常用的优化算法是梯度下降法。

The Bellman Optimality Equation in Python
贝尔曼最优公式是强化学习中的重要概念,它描述了智能体在一个马尔可夫决策过程中如何选择最优策略。

在这个过程中,智能体需要在不同状态之间做出决策,以最大化其长期累积奖励。

智能体的目标是找到一个最优策略,使得在任何给定状态下,选择的动作能够最大化预期累积奖励。

而贝尔曼最优公式正是用来计算最优策略的关键工具之一。

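The core idea of the Bellman optimality equation is that the optimal value of a state equals the maximum, over all available actions, of the expected immediate reward plus the discounted optimal value of the next state.

In other words, the optimal value function is computed recursively from the optimal values of the successor states.

In mathematical form:

$$V^*(s) = \max_a Q^*(s,a), \qquad Q^*(s,a) = \sum_{s'} P(s' \mid s,a)\,\bigl[R(s,a,s') + \gamma V^*(s')\bigr]$$

where V*(s) is the optimal value of state s, Q*(s,a) is the optimal value of taking action a in state s, P(s' | s,a) is the transition probability, R(s,a,s') is the reward, and γ is the discount factor.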

贝尔曼最优公式是强化学习中的基石,它提供了一种计算最优策略的有效方法。

通过不断迭代更新状态的最优值函数,智能体可以逐步优化策略,实现更好的决策效果。
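The iterative update mentioned above is usually called value iteration. A minimal sketch follows, assuming the model is given as a dictionary in which `transitions[s][a]` is a list of `(probability, next_state, reward)` triples; this data layout, the state names, and the rewards are hypothetical.

```python
def value_iteration(transitions, gamma=0.9, tolerance=1e-8):
    """Apply the Bellman optimality update until the values stop changing.

    transitions[s][a] is a list of (probability, next_state, reward) triples.
    A state with no actions is terminal and keeps the value 0.
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                continue
            # V(s) = max over actions of the expected reward plus the
            # discounted value of the next state.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tolerance:
            return values


transitions = {
    "A": {"go": [(1.0, "B", 0.0)], "quit": [(1.0, "end", 1.0)]},
    "B": {"finish": [(1.0, "end", 10.0)]},
    "end": {},
}
optimal_values = value_iteration(transitions)
```

In state A the immediate reward for quitting is 1, but going to B is worth 0 + 0.9 × 10 = 9, so the optimal value of A is 9: the equation trades immediate reward against discounted future reward.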

当然,贝尔曼最优公式并不仅限于强化学习领域。

在实际应用中,它也可以用于其他优化问题的求解,如路径规划、资源分配等。

贝尔曼最优公式是一个非常重要的数学工具,它为我们解决最优决策问题提供了思路和方法。

通过合理应用贝尔曼最优公式,我们可以更好地理解和解决现实中的复杂问题。

ACM-GIS 2006: A Peer-to-Peer Spatial Cloaking Algorithm for Anonymous Location-based Services

A Peer-to-Peer Spatial Cloaking Algorithm for Anonymous Location-based Services∗

Chi-Yin Chow, Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, cchow@
Mohamed F. Mokbel, Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, mokbel@
Xuan Liu, IBM Thomas J. Watson Research Center, Hawthorne, NY, xuanliu@

ABSTRACT
This paper tackles a major privacy threat in current location-based services where users have to report their exact locations to the database server in order to obtain their desired services. For example, a mobile user asking about her nearest restaurant has to report her exact location. With untrusted service providers, reporting private location information may lead to several privacy threats. In this paper, we present a peer-to-peer (P2P) spatial cloaking algorithm in which mobile and stationary users can entertain location-based services without revealing their exact location information. The main idea is that before requesting any location-based service, the mobile user will form a group from her peers via single-hop communication and/or multi-hop routing. Then, the spatial cloaked area is computed as the region that covers the entire group of peers. Two modes of operations are supported within the proposed P2P spatial cloaking algorithm, namely, the on-demand mode and the proactive mode. Experimental results show that the P2P spatial cloaking algorithm operated in the on-demand mode has lower communication cost and better quality of services than the proactive mode, but the on-demand incurs longer response time.

Categories and Subject Descriptors: H.2.8 [Database Applications]: Spatial databases and GIS
General Terms: Algorithms and Experimentation.
Keywords:Mobile computing,location-based services,lo-cation privacy and spatial cloaking.1.INTRODUCTIONThe emergence of state-of-the-art location-detection de-vices,e.g.,cellular phones,global positioning system(GPS) devices,and radio-frequency identification(RFID)chips re-sults in a location-dependent information access paradigm,∗This work is supported in part by the Grants-in-Aid of Re-search,Artistry,and Scholarship,University of Minnesota. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on thefirst page.To copy otherwise,to republish,to post on servers or to redistribute to lists,requires prior specific permission and/or a fee.ACM-GIS’06,November10-11,2006,Arlington,Virginia,USA. Copyright2006ACM1-59593-529-0/06/0011...$5.00.known as location-based services(LBS)[30].In LBS,mobile users have the ability to issue location-based queries to the location-based database server.Examples of such queries include“where is my nearest gas station”,“what are the restaurants within one mile of my location”,and“what is the traffic condition within ten minutes of my route”.To get the precise answer of these queries,the user has to pro-vide her exact location information to the database server. 
With untrustworthy servers,adversaries may access sensi-tive information about specific individuals based on their location information and issued queries.For example,an adversary may check a user’s habit and interest by knowing the places she visits and the time of each visit,or someone can track the locations of his ex-friends.In fact,in many cases,GPS devices have been used in stalking personal lo-cations[12,39].To tackle this major privacy concern,three centralized privacy-preserving frameworks are proposed for LBS[13,14,31],in which a trusted third party is used as a middleware to blur user locations into spatial regions to achieve k-anonymity,i.e.,a user is indistinguishable among other k−1users.The centralized privacy-preserving frame-work possesses the following shortcomings:1)The central-ized trusted third party could be the system bottleneck or single point of failure.2)Since the centralized third party has the complete knowledge of the location information and queries of all users,it may pose a serious privacy threat when the third party is attacked by adversaries.In this paper,we propose a peer-to-peer(P2P)spatial cloaking algorithm.Mobile users adopting the P2P spatial cloaking algorithm can protect their privacy without seeking help from any centralized third party.Other than the short-comings of the centralized approach,our work is also moti-vated by the following facts:1)The computation power and storage capacity of most mobile devices have been improv-ing at a fast pace.2)P2P communication technologies,such as IEEE802.11and Bluetooth,have been widely deployed.3)Many new applications based on P2P information shar-ing have rapidly taken shape,e.g.,cooperative information access[9,32]and P2P spatio-temporal query processing[20, 24].Figure1gives an illustrative example of P2P spatial cloak-ing.The mobile user A wants tofind her nearest gas station while beingfive anonymous,i.e.,the user is indistinguish-able amongfive users.Thus,the mobile user A has 
to look around andfind other four peers to collaborate as a group. In this example,the four peers are B,C,D,and E.Then, the mobile user A cloaks her exact location into a spatialA B CDEBase Stationregion that covers the entire group of mobile users A ,B ,C ,D ,and E .The mobile user A randomly selects one of the mobile users within the group as an agent .In the ex-ample given in Figure 1,the mobile user D is selected as an agent.Then,the mobile user A sends her query (i.e.,what is the nearest gas station)along with her cloaked spa-tial region to the agent.The agent forwards the query to the location-based database server through a base station.Since the location-based database server processes the query based on the cloaked spatial region,it can only give a list of candidate answers that includes the actual answers and some false positives.After the agent receives the candidate answers,it forwards the candidate answers to the mobile user A .Finally,the mobile user A gets the actual answer by filtering out all the false positives.The proposed P2P spatial cloaking algorithm can operate in two modes:on-demand and proactive .In the on-demand mode,mobile clients execute the cloaking algorithm when they need to access information from the location-based database server.On the other side,in the proactive mode,mobile clients periodically look around to find the desired number of peers.Thus,they can cloak their exact locations into spatial regions whenever they want to retrieve informa-tion from the location-based database server.In general,the contributions of this paper can be summarized as follows:1.We introduce a distributed system architecture for pro-viding anonymous location-based services (LBS)for mobile users.2.We propose the first P2P spatial cloaking algorithm for mobile users to entertain high quality location-based services without compromising their privacy.3.We provide experimental evidence that our proposed algorithm is efficient in terms of the response 
time, is scalable to large numbers of mobile clients, and is effective as it provides high-quality services for mobile clients without the need of exact location information.

The rest of this paper is organized as follows. Section 2 highlights the related work. The system model of the P2P spatial cloaking algorithm is presented in Section 3. The P2P spatial cloaking algorithm is described in Section 4. Section 5 discusses the integration of the P2P spatial cloaking algorithm with privacy-aware location-based database servers. Section 6 depicts the experimental evaluation of the P2P spatial cloaking algorithm. Finally, Section 7 concludes this paper.

2. RELATED WORK

The k-anonymity model [37, 38] has been widely used in maintaining privacy in databases [5, 26, 27, 28]. The main idea is to have each tuple in the table as k-anonymous, i.e., indistinguishable among other k−1 tuples. Although we aim for the similar k-anonymity model for the P2P spatial cloaking algorithm, none of these techniques can be applied to protect user privacy for LBS, mainly for the following four reasons: 1) These techniques preserve the privacy of the stored data. In our model, we aim not to store the data at all. Instead, we store perturbed versions of the data. Thus, data privacy is managed before storing the data. 2) These approaches protect the data, not the queries. In anonymous LBS, we aim to protect the user who issues the query to the location-based database server. For example, a mobile user who wants to ask about her nearest gas station needs to protect her location while the location information of the gas station is not protected. 3) These approaches guarantee the k-anonymity for a snapshot of the database. In LBS, the user location is continuously changing. Such dynamic behavior calls for continuous maintenance of the k-anonymity model. 4) These approaches assume a unified k-anonymity requirement for all the stored records. In our P2P spatial cloaking algorithm, k-anonymity is a user-specified privacy requirement which
may have a different value for each user.

Motivated by the privacy threats of location-detection devices [1, 4, 6, 40], several research efforts are dedicated to protect the locations of mobile users (e.g., false dummies [23], landmark objects [18], and location perturbation [10, 13, 14]). The closest approaches to ours are two centralized spatial cloaking algorithms, namely, the spatio-temporal cloaking [14] and the CliqueCloak algorithm [13], and one decentralized privacy-preserving algorithm [23]. The spatio-temporal cloaking algorithm [14] assumes that all users have the same k-anonymity requirements. Furthermore, it lacks scalability because it deals with each single request of each user individually. The CliqueCloak algorithm [13] assumes a different k-anonymity requirement for each user. However, since it has large computation overhead, it is limited to a small k-anonymity requirement, i.e., k is from 5 to 10. A decentralized privacy-preserving algorithm is proposed for LBS [23]. The main idea is that the mobile client sends a set of false locations, called dummies, along with its true location to the location-based database server. However, the disadvantages of using dummies are threefold. First, the user has to generate realistic dummies to prevent the adversary from guessing its true location. Second, the location-based database server wastes a lot of resources to process the dummies. Finally, the adversary may estimate the user location by using cellular positioning techniques [34], e.g., the time-of-arrival (TOA), the time difference of arrival (TDOA) and the direction of arrival (DOA).

Although several existing distributed group formation algorithms can be used to find peers in a mobile environment, they are not designed for privacy preserving in LBS. Some algorithms are limited to only finding the neighboring peers, e.g., lowest-ID [11], largest-connectivity (degree) [33] and mobility-based clustering algorithms [2, 25]. When a mobile user with a strict privacy requirement, i.e., the
value of k−1 is larger than the number of neighboring peers, it has to enlist other peers for help via multi-hop routing. Other algorithms do not have this limitation, but they are designed for grouping stable mobile clients together to facilitate efficient data replica allocation, e.g., dynamic connectivity based group algorithm [16] and mobility-based clustering algorithm, called DRAM [19]. Our work is different from these approaches in that we propose a P2P spatial cloaking algorithm that is dedicated for mobile users to discover other k−1 peers via single-hop communication and/or via multi-hop routing, in order to preserve user privacy in LBS.

Figure 2: The system architecture

3. SYSTEM MODEL

Figure 2 depicts the system architecture for the proposed P2P spatial cloaking algorithm which contains two main components: mobile clients and location-based database server. Each mobile client has its own privacy profile that specifies its desired level of privacy. A privacy profile includes two parameters, k and A_min: k indicates that the user wants to be k-anonymous, i.e., indistinguishable among k users, while A_min specifies the minimum resolution of the cloaked spatial region. The larger the value of k and A_min, the more strict privacy requirements a user needs. Mobile users have the ability to change their privacy profile at any time. Our employed privacy profile matches the privacy requirements of mobile users as depicted by several social science studies (e.g., see [4, 15, 17, 22, 29]).

In this architecture, each mobile user is equipped with two wireless network interface cards; one of them is dedicated to communicate with the location-based database server through the base station, while the other one is devoted to the communication with other peers. A similar multi-interface technique has been used to implement IP multi-homing for stream control transmission protocol (SCTP), in which a machine is installed with multiple
network interface cards, and each is assigned a different IP address [36]. Similarly, in mobile P2P cooperation environment, mobile users have a network connection to access information from the server, e.g., through a wireless modem or a base station, and the mobile users also have the ability to communicate with other peers via a wireless LAN, e.g., IEEE 802.11 or Bluetooth [9, 24, 32]. Furthermore, each mobile client is equipped with a positioning device, e.g., GPS or sensor-based local positioning systems, to determine its current location information.

4. P2P SPATIAL CLOAKING

In this section, we present the data structure and the P2P spatial cloaking algorithm. Then, we describe two operation modes of the algorithm: on-demand and proactive.

4.1 Data Structure

The entire system area is divided into a grid. The mobile clients communicate with each other to discover other k−1 peers, in order to achieve the k-anonymity requirement. The mobile client can thus blur its exact location into a cloaked spatial region that is the minimum grid area covering the k−1 peers and itself, and satisfies A_min as well. The grid area is represented by the ID of the left-bottom and right-top cells, i.e., (l, b) and (r, t). In addition, each mobile client maintains a parameter h that is the required hop distance of the last peer searching. The initial value of h is equal to one.

Algorithm 1 P2P Spatial Cloaking: Request Originator m
 1: Function P2PCloaking-Originator(h, k)
 2: // Phase 1: Peer searching phase
 3: The hop distance h is set to h
 4: The set of discovered peers T is set to {∅}, and the number of discovered peers k' = |T| = 0
 5: while k' < k−1 do
 6:     Broadcast a FORM GROUP request with the parameter h (Algorithm 2 gives the response of each peer p that receives this request)
 7:     T' is the set of peers that respond back to m by executing Algorithm 2
 8:     k' = |T'|;
 9:     if k' < k−1 then
10:         if T' = T then
11:             Suspend the request
12:         end if
13:         h ← h+1;
14:         T ← T';
15:     end if
16: end while
17: // Phase 2: Location adjustment phase
18: for all T_i ∈ T do
19:     |mT_i.p'| ← the greatest possible distance between m and T_i.p by considering the timestamp of T_i.p's reply and maximum speed
20: end for
21: // Phase 3: Spatial cloaking phase
22: Form a group with k−1 peers having the smallest |mp'|
23: h ← the largest hop distance h_p of the selected k−1 peers
24: Determine a grid area A that covers the entire group
25: if A < A_min then
26:     Extend the area of A till it covers A_min
27: end if
28: Randomly select a mobile client of the group as an agent
29: Forward the query and A to the agent

4.2 Algorithm

Figure 3 gives a running example for the P2P spatial cloaking algorithm. There are 15 mobile clients, m1 to m15, represented as solid circles. m8 is the request originator; the other black circles represent the mobile clients that received the request from m8. The dotted circles represent the communication range of the mobile client, and the arrow represents the movement direction. Algorithms 1 and 2 give the pseudo code for the request originator (denoted as m) and the request receivers (denoted as p), respectively. In general, the algorithm consists of the following three phases:

Phase 1: Peer searching phase. The request originator m wants to retrieve information from the location-based database server. m first sets h to h, a set of discovered peers T to {∅} and the number of discovered peers k' to zero, i.e., |T| (Lines 3 to 4 in Algorithm 1). Then, m broadcasts a FORM GROUP request along with a message sequence ID and the hop distance h to its neighboring peers (Line 6 in Algorithm 1). m listens to the network and waits for the reply from its neighboring peers.

Algorithm 2 describes how a peer p responds to the FORM GROUP request along with a hop distance h and a
r10:else11:h←h−1;12:Broadcast a FORM GROUP request with the parameter h 13:T p is the set of peers that respond back to p14:for all T i∈T p do15:T i.h p←T i.h p+1;16:end for17:T p←T p∪{<p,(x p,y p),v maxp ,t p,h p>};18:Send T p back to r19:end ifmessage sequence ID from another peer(denoted as r)that is either the request originator or the forwarder of the re-quest.First,p checks if it is a duplicate request based on the message sequence ID.If it is a duplicate request,it sim-ply replies r with an ACK message without processing the request.Otherwise,p processes the request based on the value of h:Case1:h= 1.p turns in a tuple that contains its ID,current location,maximum movement speed,a timestamp and a hop distance(it is set to one),i.e.,< p,(x p,y p),v max p,t p,h p>,to r(Line9in Algorithm2). Case2:h> 1.p decrements h and broadcasts the FORM GROUP request with the updated h and the origi-nal message sequence ID to its neighboring peers.p keeps listening to the network,until it collects the replies from all its neighboring peers.After that,p increments the h p of each collected tuple,and then it appends its own tuple to the collected tuples T p.Finally,it sends T p back to r (Lines11to18in Algorithm2).After m collects the tuples T from its neighboring peers, if m cannotfind other k−1peers with a hop distance of h,it increments h and re-broadcasts the FORM GROUP request along with a new message sequence ID and h.m repeatedly increments h till itfinds other k−1peers(Lines6to14in Algorithm1).However,if mfinds the same set of peers in two consecutive broadcasts,i.e.,with hop distances h and h+1,there are not enough connected peers for m.Thus, m has to relax its privacy profile,i.e.,use a smaller value of k,or to be suspended for a period of time(Line11in Algorithm1).Figures3(a)and3(b)depict single-hop and multi-hop peer searching in our running example,respectively.In Fig-ure3(a),the request originator,m8,(e.g.,k=5)canfind k−1peers via single-hop communication,so 
m8 sets h = 1. Since h = 1, its neighboring peers, m5, m6, m7, m9, m10, and m11, will not further broadcast the FORM GROUP request. On the other hand, in Figure 3(b), m8 does not connect to k−1 peers directly, so it has to set h > 1. Thus, its neighboring peers, m7, m10, and m11, will broadcast the FORM GROUP request along with a decremented hop distance, i.e., h = h−1, and the original message sequence ID to their neighboring peers.

Phase 2: Location adjustment phase. Since the peer keeps moving, we have to capture the movement between the time when the peer sends its tuple and the current time. For each received tuple from a peer p, the request originator, m, determines the greatest possible distance between them by an equation, $|mp'| = |mp| + (t_c - t_p) \times v_p^{max}$, where $|mp|$ is the Euclidean distance between m and p at time t_p, i.e., $|mp| = \sqrt{(x_m - x_p)^2 + (y_m - y_p)^2}$, t_c is the current time, t_p is the timestamp of the tuple and v_p^max is the maximum speed of p (Lines 18 to 20 in Algorithm 1). In this paper, a conservative approach is used to determine the distance, because we assume that the peer will move with the maximum speed in any direction. If p gives its movement direction, m has the ability to determine a more precise distance between them.

Figure 3(c) illustrates that, for each discovered peer, the circle represents the largest region where the peer can locate at time t_c. The greatest possible distance between the request originator m8 and its discovered peer, m5, m6, m7, m9, m10, or m11 is represented by a dotted line. For example, the distance of the line m8m11' is the greatest possible distance between m8 and m11 at time t_c, i.e., |m8m11'|.
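
The location adjustment step can be illustrated with a minimal Python sketch. The `PeerTuple` class and the function name are hypothetical names chosen here for illustration; only the tuple fields (ID, location, maximum speed, timestamp, hop distance) and the distance equation come from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class PeerTuple:
    """Reply tuple <p, (x_p, y_p), v_p^max, t_p, h_p> (class name is hypothetical)."""
    peer_id: str
    x: float
    y: float
    v_max: float   # maximum movement speed of the peer
    t: float       # timestamp of the reply
    hops: int      # hop distance h_p


def greatest_possible_distance(mx: float, my: float, peer: PeerTuple, t_now: float) -> float:
    """|mp'| = |mp| + (t_c - t_p) * v_p^max, the conservative upper bound."""
    euclidean = math.hypot(mx - peer.x, my - peer.y)   # |mp| at time t_p
    return euclidean + (t_now - peer.t) * peer.v_max


# A peer reported 5 units away two time units ago, moving at most 2 units per time unit.
example_peer = PeerTuple("p1", 3.0, 4.0, v_max=2.0, t=10.0, hops=1)
example_bound = greatest_possible_distance(0.0, 0.0, example_peer, t_now=12.0)
```

The bound grows linearly with the age of the reply, so stale tuples are pushed toward the back of the nearest-peer ordering used in Phase 3.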
Phase 3: Spatial cloaking phase. In this phase, the request originator, m, forms a virtual group with the k−1 nearest peers, based on the greatest possible distance between them (Line 22 in Algorithm 1). To adapt to the dynamic network topology and k-anonymity requirement, m sets h to the largest value of h_p of the selected k−1 peers (Line 23 in Algorithm 1). Then, m determines the minimum grid area A covering the entire group (Line 24 in Algorithm 1). If the area of A is less than A_min, m extends A, until it satisfies A_min (Lines 25 to 27 in Algorithm 1). Figure 3(c) gives the k−1 nearest peers, m6, m7, m10, and m11 to the request originator, m8. For example, the privacy profile of m8 is (k = 5, A_min = 20 cells), and the required cloaked spatial region of m8 is represented by a bold rectangle, as depicted in Figure 3(d).

To issue the query to the location-based database server anonymously, m randomly selects a mobile client in the group as an agent (Line 28 in Algorithm 1). Then, m sends the query along with the cloaked spatial region, i.e., A, to the agent (Line 29 in Algorithm 1). The agent forwards the query to the location-based database server. After the server processes the query with respect to the cloaked spatial region, it sends a list of candidate answers back to the agent. The agent forwards the candidate answers to m, and then m filters out the false positives from the candidate answers.
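
A rough Python sketch of the spatial cloaking phase follows. The text does not say how A is extended when it is smaller than A_min, so the one-cell-per-side growth used here is purely an assumption, as are the function name, the dictionary inputs, and the default 100 × 100 grid size.

```python
def cloak_region(origin_cell, peer_cells, peer_dist, k, a_min, grid=100):
    """Return (selected peers, (l, b), (r, t)) for the cloaked grid area.

    origin_cell: (col, row) of the request originator m
    peer_cells:  dict peer_id -> (col, row)
    peer_dist:   dict peer_id -> greatest possible distance |mp'|
    The extension strategy below is an assumption, not the paper's rule.
    """
    nearest = sorted(peer_cells, key=lambda p: peer_dist[p])[:k - 1]
    if len(nearest) < k - 1:
        raise ValueError("fewer than k-1 peers discovered")

    cells = [origin_cell] + [peer_cells[p] for p in nearest]
    l = min(c for c, _ in cells)
    r = max(c for c, _ in cells)
    b = min(row for _, row in cells)
    t = max(row for _, row in cells)

    step = 0
    while (r - l + 1) * (t - b + 1) < a_min:
        if (l, b, r, t) == (0, 0, grid - 1, grid - 1):
            break  # the whole grid is already covered
        side = step % 4  # grow left, right, bottom, top in turn
        if side == 0 and l > 0:
            l -= 1
        elif side == 1 and r < grid - 1:
            r += 1
        elif side == 2 and b > 0:
            b -= 1
        elif side == 3 and t < grid - 1:
            t += 1
        step += 1
    return nearest, (l, b), (r, t)


cells = {"a": (11, 10), "b": (10, 12), "c": (50, 50)}
dists = {"a": 1.0, "b": 2.0, "c": 60.0}
tight = cloak_region((10, 10), cells, dists, k=3, a_min=6)
grown = cloak_region((10, 10), cells, dists, k=3, a_min=8)
```

In the first call the bounding area of 6 cells already meets A_min; in the second the region grows by one column to 9 cells. Agent selection (Line 28) would then be a uniform random pick among the group members.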
4.3 Modes of Operations

The P2P spatial cloaking algorithm can operate in two modes, on-demand and proactive.

The on-demand mode: The mobile client only executes the algorithm when it needs to retrieve information from the location-based database server. The algorithm operated in the on-demand mode generally incurs less communication overhead than the proactive mode, because the mobile client only executes the algorithm when necessary. However, it suffers from a longer response time than the algorithm operated in the proactive mode.

The proactive mode: The mobile client adopting the proactive mode periodically executes the algorithm in background. The mobile client can cloak its location into a spatial region immediately, once it wants to communicate with the location-based database server. The proactive mode provides a better response time than the on-demand mode, but it generally incurs higher communication overhead and gives lower quality of service than the on-demand mode.

5. ANONYMOUS LOCATION-BASED SERVICES

Having the spatial cloaked region as an output from Algorithm 1, the mobile user m sends her request to the location-based server through an agent p that is randomly selected. Existing location-based database servers can support only exact point locations rather than cloaked regions. In order to be able to work with a spatial region, location-based servers need to be equipped with a privacy-aware query processor (e.g., see [29, 31]). The main idea of the privacy-aware query processor is to return a list of candidate answers rather than the exact query answer. Then, the mobile user m will filter the candidate list to eliminate its false positives and find its exact answer. The tighter the spatial cloaked region, the smaller the candidate answer list, and hence the better the performance of the privacy-aware query processor. However, tight cloaked regions may represent relaxed privacy constraints. Thus, a trade-off between the user privacy and the quality of service can be achieved [31].
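
The client-side filtering step can be sketched in a few lines of Python for a nearest-neighbor query. The function name, the object names, and the coordinates are hypothetical; the sketch only assumes that the client knows its own exact location and receives candidate objects with coordinates.

```python
import math


def filter_candidates(true_xy, candidates):
    """Pick the actual nearest object from the server's candidate list.

    candidates: dict object_id -> (x, y); every entry other than the
    returned one is a false positive for a nearest-neighbor query.
    """
    return min(candidates, key=lambda oid: math.dist(true_xy, candidates[oid]))


# Hypothetical candidate list returned for a cloaked region.
answer = filter_candidates((0.0, 0.0), {"g1": (0.0, 5.0), "g2": (1.0, 1.0), "g3": (9.0, 9.0)})
```

Because the exact location never leaves the client, the server learns only the cloaked region, at the cost of transferring and filtering a longer candidate list.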
Figure 4(a) depicts such a scenario by showing the data stored at the server side. There are 32 target objects, i.e., gas stations, T1 to T32, represented as black circles; the shaded area represents the spatial cloaked area of the mobile client who issued the query. For clarification, the actual mobile client location is plotted in Figure 4(a) as a black square inside the cloaked area. However, such information is neither stored at the server side nor revealed to the server. The privacy-aware query processor determines a range that includes all target objects that are possibly contributing to the answer given that the actual location of the mobile client could be anywhere within the shaded area. The range is represented as a bold rectangle, as depicted in Figure 4(b). The server sends a list of candidate answers, i.e., T8, T12, T13, T16, T17, T21, and T22, back to the agent. The agent next forwards the candidate answers to the requesting mobile client either through single-hop communication or through multi-hop routing. Finally, the mobile client can get the actual answer, i.e., T13, by filtering out the false positives from the candidate answers.

Figure 4: Anonymous location-based services. (a) Server Side (b) Client Side

The algorithmic details of the privacy-aware query processor are beyond the scope of this paper. Interested readers are referred to [31] for more details.

6. EXPERIMENTAL RESULTS

In this section, we evaluate and compare the scalability and efficiency of the P2P spatial cloaking algorithm in both the on-demand and proactive modes with respect to the average response time per query, the average number of messages per query, and the size of the returned candidate answers from the location-based database server. The query response time in the on-demand mode is defined as the time elapsed between a mobile client starting to search k−1 peers and receiving the candidate answers from the agent. On the other hand, the query response time in the proactive mode is defined as the time elapsed between a mobile client
starting to forward its query along with the cloaked spatial region to the agent and receiving the candidate answers from the agent. The simulation model is implemented in C++ using CSIM [35].

In all the experiments in this section, we consider an individual random walk model that is based on the "random waypoint" model [7, 8]. At the beginning, the mobile clients are randomly distributed in a spatial space of 1,000 × 1,000 square meters, in which a uniform grid structure of 100 × 100 cells is constructed. Each mobile client randomly chooses its own destination in the space with a randomly determined speed s from a uniform distribution U(v_min, v_max). When the mobile client reaches the destination, it comes to a standstill for one second to determine its next destination. After that, the mobile client moves towards its new destination with another speed. All the mobile clients repeat this movement behavior during the simulation. The time interval between two consecutive queries generated by a mobile client follows an exponential distribution with a mean of ten seconds. All the experiments consider one half-duplex wireless channel for a mobile client to communicate with its peers with a total bandwidth of 2 Mbps and a transmission range of 250 meters. When a mobile client wants to communicate with other peers or the location-based database server, it has to wait if the requested channel is busy. In the simulated mobile environment, there is a centralized location-based database server, and one wireless communication channel between the location-based database server and the mobile


MuMHR: Multi-path, Multi-hop Hierarchical Routing

Mohammad Hammoudeh∗, Alexander Kurz†, and Elena Gaura∗
∗Cogent Computing Applied Research Centre, Department of Creative Computing, Coventry University, Coventry, CV1 5FB, UK. Email: aa2792@
†Department of Computer Science, University of Leicester, Leicester, LE1 7RH, UK. Email: kurz@

Abstract: This paper proposes a self-organizing, cluster based protocol, Multi-path, Multi-hop Hierarchical Routing (MuMHR), for use in large scale, distributed Wireless Sensor Networks (WSN). With MuMHR, robustness is achieved by each node learning multiple paths and election of cluster-head backup node(s). Energy expenditure is reduced by shortening the distance between the node and its cluster-head and by reducing the setup communication overhead. This is done through incorporating the number-of-hops metric in addition to the back-off waiting time. Simulation results show that MuMHR performs better than LEACH, which is the most promising hierarchical routing algorithm to date; MuMHR reduces the total number of set-up messages by up to 65% and enhances the data delivery ratio by up to 0.83.

I. INTRODUCTION

Hierarchical routing is one of the most popular routing schemes in sensor networks [1], [2], [3], [4], [5], [6], [7]. It is a two or more tier routing scheme known for its scalability and communication efficiency. Nodes in the upper tier are called cluster-heads and act as a routing backbone, while nodes in the lower tier perform the sensing tasks. Kulkarni et al. [8] argue that multi-tier networks are scalable and offer a number of advantages over single-tier networks: lower cost, better coverage, higher functionality, and better reliability. A sink-based single tier network can lead to congestion at the gateway especially in dense sensor networks. This can cause communication delays and inadequate tracking of the sensed events. Moreover, some of the routing algorithms for such architecture are commonly not scalable. To overcome these problems, network clustering has been proposed as a possible
solution.

Low Energy Adaptive Clustering Hierarchy (LEACH) [9] is one of the most promising routing algorithms for sensor networks [1], [2], [3], [4], [5], [6], [7]. However, LEACH has been based on a number of assumptions which in the authors' opinion limit its effectiveness in a number of applications. This paper proposes modifications to LEACH which will improve the robustness of the algorithm and reduce the energy consumption in the network. The dynamic clustering brings extra overhead, such as head changes, which may diminish savings in energy consumption. A possible solution which is examined in this paper in order to reduce set-up communication overhead is the use of the back-off time for advertisement messages. Also the assumption that every node can transmit to reach the sink is relaxed by enabling multi-hop transmissions. Finally, to add reliability to the protocol, nodes transmit over multiple paths and backup cluster-heads are elected.

The rest of the paper is organized as follows. Section II briefly describes the network model and assumptions. Section III reviews the related work. Section IV describes the simulation tool used for the work in this paper. Section V exhibits the details of MuMHR. Section VI presents simulation results.
Section VII concludes the paper.

II. LEACH

The operation of LEACH is split into two phases, the set-up phase and the steady-state phase. To minimize overhead, the duration of the steady-state phase is longer than the set-up phase [9].

During the set-up phase, the clusters are created and cluster-heads are elected. This election is made by every node choosing a random number between 0 and 1. The sensor node becomes cluster-head if this random number is less than the threshold T(n) set as:

$$T(n) = \begin{cases} \dfrac{P}{1 - P \times \left(r \bmod \frac{1}{P}\right)} & \text{if } n \in G \\ 0 & \text{otherwise} \end{cases}$$

where P is the desired percentage of cluster-heads, r is the current round and G is the set of nodes that have not been selected as a cluster-head in the last 1/P rounds. The desired percentage of cluster-heads was found to be 5% of the total number of nodes in the network [9]. Note that 0 cluster-heads and 100% cluster-heads are equivalent to direct communication.

After cluster-heads are chosen, they broadcast an advertisement message to the entire network declaring that they are the new cluster-heads. Every node receiving the advertisement decides to which cluster it wishes to belong based on the signal strength of the received message. The sensor node sends a message to register with the cluster-head of its choice.
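
The LEACH election rule can be sketched in Python as follows. The function names are hypothetical, and rounding 1/P to an integer period is an assumption made here so that the modulo is well defined for values such as P = 0.05.

```python
import random


def leach_threshold(p: float, r: int, in_g: bool) -> float:
    """T(n) = P / (1 - P * (r mod 1/P)) if n is in G, otherwise 0."""
    if not in_g:
        return 0.0
    period = round(1 / p)              # assumed integer number of rounds per cycle
    return p / (1 - p * (r % period))


def elects_itself(p: float, r: int, in_g: bool, rng: random.Random) -> bool:
    """A node becomes cluster-head if its random draw falls below T(n)."""
    return rng.random() < leach_threshold(p, r, in_g)


first_round = leach_threshold(0.05, 0, True)    # start of a cycle
last_round = leach_threshold(0.05, 19, True)    # final round of the cycle
```

With P = 0.05 the threshold starts at 0.05 and rises toward 1 by the last round of each 20-round cycle, so any node still in G is almost certain to serve as cluster-head before the cycle restarts.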
Based on a Time Division Multiple Access (TDMA) approach, the cluster-head assigns each node registered in its cluster a time slot to send its data. This requires that every node must support TDMA. The cluster-head keeps a list of all nodes in the cluster to inform them of the TDMA schedule.

0-7695-2988-7/07 $25.00 © 2007 IEEE. 2007 International Conference on Sensor Technologies and Applications. DOI 10.1109/SENSORCOMM.2007.11140

During the steady-state phase, sensor nodes can start transmitting data to their respective cluster-head. The cluster-head applies aggregation functions to compress the data before transmission to the sink. After a predetermined period of time spent on the steady-state phase, the network enters the set-up phase again and starts a new round of creating clusters.

LEACH is well-suited for applications where constant monitoring is needed and data collection occurs periodically to a centralized location [10]. It increases network lifetime in two ways. First, the load is distributed to all nodes but not all at the same time. Second, there is lossless aggregation of data by the cluster-heads. The protocol is powerful and simple since nodes do not require global knowledge or location information to create clusters. LEACH is able to increase the network lifetime and it achieves a more than 7-fold reduction in energy dissipation compared to direct communication [9]. Despite the significant overall energy savings, however, the assumptions made by the protocol raise a number of issues:
1) LEACH assumes that all nodes begin with the same amount of energy and that the amount of energy a cluster-head consumes is more than that of a non-cluster node. It also assumes that the amount of energy consumed by cluster-heads in every cluster round is constant. This assumption is however not realistic. Furthermore, making adjustments for differences in energy consumption causes LEACH to be less efficient.
2) LEACH assumes that all nodes can communicate with each other and are able to reach the sink. Therefore, it is only suitable for small size networks.
3) LEACH requires that all nodes are continuously listening. This is not realistic in a random distribution of the sensor nodes, for example, where cluster-heads would be located at the edge of the network.
4) LEACH assumes that all nodes have data to send and so assigns a time slot for a node even though some nodes might not have data to transmit.
5) LEACH assumes that all nearby nodes have correlated data which is not always true.
6) Finally, there is no mechanism to ensure that the elected cluster-heads (P) will be uniformly distributed over the network. Hence, there is the possibility that all cluster-heads will be concentrated in one part of the network.

III. RELATED WORK

A number of enhancements over LEACH have been proposed previously [1], [2], [3], [4], [5], [6], [7]. Lindsey and Raghavendra devised a protocol called Power Efficient GAthering in Sensor Information Systems (PEGASIS) [5] that is an improvement over LEACH. As opposed to LEACH, PEGASIS has no clusters; instead it creates chains from sensor nodes so that each node communicates only with its closest neighbours and only one node is selected from the chain to communicate with the sink. PEGASIS has a number of drawbacks [11]:
• It requires dynamic adjustment of network topology to route data, which introduces significant overhead.
• It requires location information.
• Similar to LEACH, PEGASIS assumes that all nodes can communicate with the sink directly.
• The head of the chain can become a bottleneck and cause excessive transmission delays.
• It assumes that all nodes start with the same level of energy and consumption rates are equal.

Hierarchical-PEGASIS [2] extends PEGASIS by introducing a hierarchy in the network topology. It aims to reduce transmission delays to the sink and proposes a data gathering scheme that balances the energy and delay cost. Although Hierarchical-PEGASIS avoids the clustering overhead, it still requires dynamic topological adjustments.

Manjeshwar and Agrawal
implemented the Threshold-sensitive Energy Efficient Protocol (TEEN), that utilizes a hierarchical approach along with a data centric mechanism [3]. The same authors also developed the AdaPtive Threshold-sensitive Energy Efficient sensor Network protocol (APTEEN) [4], which enhances TEEN by capturing periodic data collection and reacting to time critical events. They demonstrated that APTEEN performance is between LEACH and TEEN in terms of energy dissipation and network lifetime. TEEN gives the best performance since it decreases the number of transmissions. Both TEEN and APTEEN protocols require additional traffic control to continually update the threshold values and complexity of forming clusters in multiple levels, implementing threshold-based functions and dealing with attribute-based naming of queries.

Smaragdakis et al. [6] address the issue of heterogeneity (in terms of energy) of nodes. Their protocol, called SEP (Stable Election Protocol), is based on random selection of cluster-heads weighted according to the remaining node energy. This approach addresses the problem of varying energy levels and consumption rates but still assumes that the sink can be reached directly by all nodes.

LEACH-C (LEACH Centralised) [1] uses a central algorithm to form clusters. This algorithm is not robust since it requires location information for all sensors in the network.

LEACH and its derivatives have been successful in reducing the energy per bit required by each node and the network as a whole to communicate from the nodes to the sink.
Nonetheless, most are built upon the inflexible assumptions that: every node is able to communicate directly with the sink; every communication path is equally likely to succeed; and, every node has the same starting energy level and uses energy at the same rate. This paper provides a protocol with the same underlying benefit as LEACH and derivatives but provides for multi-hop communication, and increases robustness by using multiple communication paths. Also, in comparison to LEACH and most derivatives, this protocol reduces the number of set-up messages required, and thus should extend network life.

IV. SENSORPLUS

In order to implement the routing algorithm proposed in this paper and study its properties, a new version of an in-house sensor network simulator called SenSorPlus was used. SenSor [12] is a realistic and scalable Python based simulator that provides a workbench for prototyping algorithms for WSN. It consists of a fixed API, with customisable internals. Each simulated sensor node runs in its own thread and communicates using the same protocols as its physical counterpart would. This enables experimentation with different algorithms for managing the network topology, simulating fault management strategies and so on, within the same simulation. SenSorPlus is an extension of SenSor with an added interface between the simulation environment and different hardware platforms, for example the Gumstix [13] platform. SenSorPlus bridges between SenSor and the Gumstix to allow applications implemented within the simulator to be ported directly on to the hardware. Sensors are modelled using a pool of concurrent, communicating threads. Individual sensors are able to:
1) Gather and process data from a model environment
2) Locate and communicate with their nearest neighbours
3) Determine whether they are operating correctly and act accordingly to alter the network topology in case of faulty nodes being detected.

Separate interfaces gather information from the network and display it on the
graph pane or the chart pane, where individual data can be plotted during the simulation. This partitioning allows us to experiment with different ways of processing individual node data into information.

V. MUMHR: MULTI-PATH, MULTI-HOP HIERARCHICAL ROUTING

In this section, the properties of the proposed routing protocol, MuMHR, are described. The main objective of this protocol is to provide substantially energy-efficient and robust communication. The energy efficiency is achieved by load balancing at two levels: (1) the network level, which involves traffic multiplexing over multiple paths; (2) the cluster level, which introduces rotation of the cluster-heads at a given time interval. This prevents the energy depletion that results from constantly using the same path for transmission or from particular nodes being cluster-heads for a long duration.

The multiple paths are used not only for load balancing but also when path failures occur. When a path fails, an alternative path can be used immediately, which allows the protocol to adapt dynamically to failures without delays or degradation in the quality of service. At cluster set-up time, one or more nodes are chosen as cluster-head backup node(s). A backup cluster-head node substitutes for the cluster-head in some failure cases, or when the current cluster-head decides to reduce its participation in the protocol because its energy level approaches a certain threshold value. For instance, if the current cluster-head decides to hand its role to the backup node, it notifies the respective node and forwards to it the necessary information, such as the backup nodes list, to avoid a complete cluster set-up phase.

In this protocol, a number-of-hops parameter is introduced. It indicates how far the cluster-head is from the sensing node. This allows a node to: (1) select the nearest cluster-head node, which saves energy and reduces the messaging needed to bridge the distance between the cluster-head and the sensor node; (2) learn the shortest path to the selected cluster-head. We have also used a
back-off waiting time similar to the one proposed in [14] to decrease the number of set-up messages and aid the formation of more geographically uniform clusters. During the back-off waiting time, sensor nodes receive advertisement messages and only consider the message with the smallest number-of-hops value received during that time. This blocks advertisement flooding at the edges of neighbouring clusters. Together, the number-of-hops metric and the back-off waiting time allow energy-efficient cluster formation.

The operation of the proposed routing protocol can be split into two phases: the set-up phase and the data transmission phase.

A. Set-up Phase

During the set-up phase, cluster-heads are selected and clusters are created. We assume that a simple addressing scheme is in place and that the sink knows the range of addresses of nodes in the network. The sink randomly selects 5% of the nodes as cluster-heads and broadcasts this information in a discovery message. Every node that receives the discovery message changes its state from “waiting” to “discovered” and examines the message to check whether it has been selected as a cluster-head. If so, it starts a new cluster by broadcasting an advertisement message. Otherwise, it forwards the message to its neighbours. Every node remembers the node from which it received the discovery message as the immediate neighbour nearest to the sink. This path is used only in cluster failure situations.

Every cluster-head creates an advertisement message that has the number-of-hops parameter set to zero and broadcasts it to its neighbours. Upon receiving an advertisement message, a sensor node does the following:
(a) If the node already belongs to a cluster, it ignores the received advertisement message;
(b) If the back-off waiting timer is still valid, it caches the received packet and waits for other possible advertisements;
(c) If the received message has a better number-of-hops metric than the stored one, the latter is deleted and the former is
retained.

When the back-off waiting time expires, the sensor node increments the number-of-hops parameter in the retained packet and broadcasts it to its neighbours. The node remembers the address of the sender, the cluster-head, and the node from which it received the message as the nearest neighbour to the cluster-head. If a sensor node receives multiple copies of the same advertisement, it selects the one(s) with the minimum number-of-hops parameter. The node then uses its available energy to calculate a value that represents its desire to be a cluster-head in the next cluster set-up round. This value is included in the registration packet that the node sends back to the chosen cluster-head. The cluster-head extracts the highest value(s), adds the corresponding sender(s) to the cluster-head backup list, and registers each sending node as a member of the cluster. Compacting different functions into a single multi-purpose message reduces set-up communication overhead and thus makes the protocol more stable and energy efficient.

When the cluster round time is over, the current cluster-head hands the master role to the first node in the backup nodes list. With a single flood to cluster members only, the new cluster-head continues its predecessor’s role without the need for further communication. The cluster-head role is also handed to the backup node when a fault occurs in the current cluster-head node. However, in the study presented here, only a limited set of faults was considered for the “hand-over”, such as internal errors and the energy level approaching a threshold. In the case of faults such as physical damage or fatal internal errors in the cluster-head, the nodes will use an alternative path until a backup node starts the process of creating a new cluster.

B. Data Transmission Phase

During the data transmission phase, sensing nodes transmit data to their cluster-head. The cluster-head aggregates the received data before transmission to the sink or immediately multiplexes messages over multiple lines in
time-critical applications. Each member node transmits data in its assigned time slot, scheduled by TDMA. Furthermore, each cluster communicates using unique Code Division Multiple Access (CDMA) codes to avoid interference with traffic generated by other clusters.

VI. SIMULATION EXPERIMENTS

Using the SenSorPlus framework, we implemented the proposed algorithm. A demonstration of the simulation is available online (footnote 1). For our simulation, we gave all the nodes an initial supply of energy and ran the protocol until it converged. We consider the energy efficiency of our routing protocol in terms of the number of set-up messages. For our experiments, we created a 100-node network in which the nodes are scattered randomly on a 600 × 600 grid, such that no two nodes share the same location. Figure 1 shows a random node distribution topology of 100 nodes, where the arrows represent communicating neighbours. The power of each sensor’s radio transmitter is set to cover all nodes within a 20 m radius. The processing delay for transmitting a message is randomly chosen between 5 s and 10 s. Using this network configuration, we ran the protocol and tracked its progress in terms of the number of messages sent and delays. The simulation results are presented in the following subsections.

Footnote 1: /cds/distributing/who/sensor/hi-cluster-routing.html

Figure 1. A 100-node WSN with random topology.

A. Efficiency and Robustness

In this subsection, we study how introducing the number-of-hops metric and the advertisement back-off waiting time affects energy-efficient cluster formation. Figure 2 shows four network topologies, each resulting from a different simulation run. In each topology, nodes are organized into five clusters (P=5). The lines indicate the borders of different clusters.

Figure 2. Geographically uniform cluster formation, panels (a) to (d).

In each experiment we studied how the back-off waiting time and the number-of-hops metric can help to form more geographically uniform clusters. Topologies (a) and (b) were generated using LEACH (the
back-off waiting time and the number-of-hops were set to zero), whereas (c) and (d) were generated using MuMHR with the back-off waiting time set to 20 s and the number-of-hops initially set to zero. Figure 3 shows the node distribution among clusters in the four network topologies. The graphs compare the distribution of nodes among clusters formed using LEACH with those formed using MuMHR. In the LEACH topology (a), the first and fifth clusters hosted over 65% of the total number of nodes, while the other three clusters hosted less than 35% of the total network population. In the second LEACH topology (b), the fifth cluster had zero nodes, while the percentage of nodes fluctuated among the other clusters between 7% and 30%. It can be clearly seen that there is no uniform distribution of node numbers amongst the clusters, which increases both the management overhead of heavy clusters and the energy consumption. In contrast, in MuMHR topologies (c) and (d), nodes were distributed much more fairly among clusters. The number of nodes in every cluster differed by at most 7% from the optimal population (20%).

Figure 3. Node distribution among clusters.

In topologies (a) and (b), the area covered by different clusters varies widely. These cluster topologies are not energy efficient because data needs to traverse a large number of nodes to reach the cluster-head. In MuMHR-generated topologies (c) and (d), the area of all clusters is almost equivalent and nodes are distributed much more fairly among the five clusters than in (a) and (b), as Figure 2 shows. This demonstrates that the back-off waiting time together with the number-of-hops metric leads to the formation of more energy-efficient clusters by shortening the routes. The back-off waiting time gives a node more time to receive a smaller number-of-hops value. This allows nodes to register with the closest cluster-head, resulting in more geographically uniform clusters. Furthermore, the number of advertisement messages is reduced, because nodes only forward the best
advertisement received during the back-off waiting time. This stops unnecessary flooding at the border of neighbouring clusters.

In Figure 4(a), the number of network set-up messages is plotted against the back-off waiting time. When the back-off waiting time is zero, the total number of sent messages is similar to that in LEACH. The figure shows that as the back-off waiting time becomes larger, the number of messages decreases until the time becomes large enough to receive advertisements from all cluster-heads. The back-off waiting time reduces the total number of set-up messages by up to 65% over LEACH. Therefore, the back-off waiting time is effective in reducing overall set-up energy consumption.

Figure 4(b) shows the network convergence time versus the back-off waiting time. A linear correspondence between the time needed to establish routes and the back-off waiting time is evident: as the back-off waiting time increases, network convergence time increases proportionally. In time-critical applications it is important that the convergence time is reduced. This means reducing the back-off waiting time to a minimal value that still captures its advantages while keeping the delay small.

Figure 4. (a) The number of sent messages versus the back-off waiting time. (b) The convergence time versus the back-off waiting time.

B. Fault-Tolerance and Reliability

Many of the proposed protocols in the field of sensor networks show poor fault tolerance in the face of frequent node failures [15]. MuMHR provides fault tolerance through a multi-path routing strategy. The multiple paths are learned by nodes during the set-up phase through redundant messages. For example, the path to the cluster-head node is learnt from the advertisement message sent by the cluster-head. Each node joins the cluster for a certain period of time $T_{nC}$ before it re-registers with the same cluster or registers with a new cluster.
The cluster also has a lifetime $T_{C1}$; a cluster-head starts a new cluster formation round by handing the cluster-head role to the backup node. In this approach, the worst-case time $T_R$ to recover from a node failure is the time taken for a node to leave a cluster or to renew its registration with the cluster-head it currently belongs to. This is written as: $T_R = T_{C1} - T_{nC}$.

If all paths that a sensor node has learned fail, the node broadcasts its data to its neighbours. Neighbours then pass the message to their cluster-head. The cluster-head may receive multiple copies of the same message and eliminates redundancy by applying aggregation.

Figure 5. DDR versus number of functioning nodes.

To measure the fault-tolerance capabilities of this protocol, we make some nodes function as faulty nodes by dropping all packets that they receive, hence affecting the communication paths. These nodes become faulty after paths are set up and recover to function correctly in the next cluster formation round. In the simulation, 10% of the nodes are forced to break down after paths are established. The fault-tolerance capabilities of this routing protocol are evaluated using the Data Delivery Ratio (DDR), a metric that measures the ability of the network to deliver packets to the sink through multiple paths. This measure is easy to obtain and comes at no extra cost with every received packet. DDR is a service level parameter that indicates the network effectiveness in transmitting offered data in one direction of a virtual connection [16]. It represents the ratio of packets successfully received to packet transmission attempts. Attempted packet transmissions are referred to as DataOffered. Successfully delivered packets are referred to as DataDelivered. The ratio can then be written as: DDR = DataDelivered / DataOffered.
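As a minimal sketch, the two quantities defined above (the worst-case recovery time and the DDR) can be computed as follows in Python, the language of the SenSor simulator. The function and parameter names are hypothetical and are not part of the SenSorPlus API, and the sample values are illustrative only, not results from the experiments. The sketch also assumes that duplicate copies of a packet have already been removed before delivered packets are counted.

```python
def worst_case_recovery_time(cluster_lifetime: float, membership_period: float) -> float:
    """Return T_R = T_C1 - T_nC, as written in the text.

    Both arguments are durations in the same unit (hypothetical names).
    """
    if cluster_lifetime < 0 or membership_period < 0:
        raise ValueError("durations must be non-negative")
    return cluster_lifetime - membership_period


def data_delivery_ratio(data_delivered: int, data_offered: int) -> float:
    """Return DDR = DataDelivered / DataOffered.

    Assumes duplicates were removed, so delivered never exceeds offered.
    """
    if data_offered <= 0:
        raise ValueError("data_offered must be positive")
    if not 0 <= data_delivered <= data_offered:
        raise ValueError("data_delivered must lie between 0 and data_offered")
    return data_delivered / data_offered


if __name__ == "__main__":
    # Illustrative values only: 3 of 4 offered packets reach the sink.
    print(data_delivery_ratio(3, 4))              # 0.75
    print(worst_case_recovery_time(100.0, 40.0))  # 60.0
```

A DDR of 1.0 would mean every offered packet reached the sink, so values below 1.0 quantify the loss caused by the faulty nodes.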
Figure 5 shows the DDR versus the number of functioning nodes. With the same configuration described in Section VI, the DDR values for ten runs, each with a different network density, were plotted. The results show that the protocol maintained an average delivery ratio of 0.733. This demonstrates that multi-path routing can be used to recover from path failures and results in a better delivery ratio. DDR increases slightly as the network density increases, since higher network density slightly counteracts the effect of dead nodes. This algorithm increases the DDR and reduces energy consumption at the same time, since packet multiplexing over duplicate paths helps in load balancing and prevents energy depletion.

VII. CONCLUSION

In this paper, we demonstrated MuMHR, which is an improvement over LEACH. MuMHR provides solutions to some of the limitations of LEACH. The number-of-hops parameter and the back-off waiting time resulted in more energy-efficient cluster formation. The algorithm uses redundant messages received from different sources to build a multi-path map, which allows auto-adaptation to path failures. MuMHR also enables multi-hop transmissions to relax LEACH’s inflexible assumption that all nodes in the network can communicate with each other. The new algorithm achieves robustness and efficiency without location information and with less energy expenditure than LEACH. Simulation results confirm the efficiency of the algorithm in terms of communication reduction, robustness and energy savings. This routing algorithm was implemented and used by Shuttleworth et al. [17] and found to easily support various computationally demanding applications for WSNs.

ACKNOWLEDGMENTS

The authors would like to thank Sarah Mount for her help and support in using SenSorPlus.

REFERENCES

[1] W. Heinzelman, “Application-specific protocol architectures for wireless networks,” PhD thesis, Massachusetts Institute of Technology, June 2000.
[2] S. Lindsey, C. Raghavendra, and K. Sivalingam, “Data gathering in sensor
networks using the energy*delay metric,” in Proceedings of the IPDPS Workshop on Issues in Wireless Networks and Mobile Computing, April 2001.
[3] A. Manjeshwar and D. Agrawal, “A protocol for enhanced efficiency in wireless sensor networks,” in Proceedings of the 1st International Workshop on Parallel and Distributed Computing Issues in Wireless Networks and Mobile Computing, April 2001.
[4] A. Manjeshwar and D. Agrawal, “Apteen: A hybrid protocol for efficient routing and comprehensive information retrieval in wireless sensor networks,” in Proceedings of the 2nd International Workshop on Parallel and Distributed Computing Issues in Wireless Networks and Mobile Computing, April 2002.
[5] S. Lindsey and C. Raghavendra, “Pegasis: Power efficient gathering in sensor information systems,” in Proceedings of the IEEE Aerospace Conference, March 2002.
[6] G. Smaragdakis, I. Matta, and A. Bestavros, “Sep: A stable election protocol for clustered heterogeneous wireless sensor networks,” in Second International Workshop on Sensor and Actor Network Protocols and Applications (SANPA 2004), August 2004.
[7] M. Ye, C. Li, G. Chen, and J. Wu, “Eecs: An energy efficient clustering scheme in wireless sensor networks,” in Performance, Computing, and Communications Conference, 2005. IPCCC 2005. 24th IEEE International, April 2005, pp. 535–540.
[8] P. Kulkarni, D. Ganesan, and P. Shenoy, “The case for multi-tier camera sensor networks,” Proceedings of the 13th annual ACM international conference on Multimedia, pp. 229–238, 2005.
[9] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan, “Energy-efficient communication protocol for wireless microsensor networks,” Proceedings of the 33rd International Conference on System Sciences, January 2000.
[10] S. Lee and T. Chung, “Data aggregation for wireless sensor networks using self-organizing map,” in Artificial Intelligence and Simulation, vol. 3397/2005. Springer Berlin/Heidelberg, 2004, pp. 508–517.
[11] M. Hammoudeh, “Robust and energy efficient routing in wireless sensor networks,” Master’s thesis, University of Leicester, 2006.
[12] S. Mount, R. Newman, and E. Gaura, “A simulation tool for system services in ad-hoc wireless sensor networks,” NSTI Nanotechnology Conference and Trade Show, vol. 3, p. 423, May 2005.
[13] “Gumstix way small computing,” 2007, [Online; accessed 26-March-2007]. [Online]. Available:
[14] L. Xia, X. Chen, and X. Guan, “A new gradient-based routing protocol in wireless sensor networks,” Springer-Verlag Berlin Heidelberg, 2005.
[15] Y. Liu and W. Seah, “A priority-based multi-path routing protocol for sensor networks,” Personal, Indoor and Mobile Radio Communications, vol. 1, pp. 216–220, Sept. 2004.
[16] J. Dunn and C. Martin, “Terminology for frame relay benchmarking,” Internet informational RFC 3133, June 2001.
[17] J. Shuttleworth, M. Hammoudeh, E. Gaura, and R. Newman, “Experimental applications of mapping services in wireless sensor networks,” in Fourth International Conference on Networked Sensing Systems, June 2007.
