Implementation and Verification of a Community Detection Algorithm for Bipartite Graphs
June 2015
Abstract
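A bipartite network is one form of complex network, and the bipartite graph is the tool used to describe it. Community detection on bipartite networks usually follows one of two approaches. The first projects the bipartite network onto a one-mode network, using either an unweighted or a weighted projection, and detects communities on the projection. Its drawback is that the projection discards part of the information in the original bipartite network, which makes the results less accurate. The second detects communities directly on the bipartite network, which avoids the error introduced by projection.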
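The information loss caused by projection can be seen on a toy example. The sketch below is illustrative only: the function `project` and the two edge lists are hypothetical and do not come from this work. It builds two different bipartite networks whose one-mode projections coincide, in both the unweighted and the weighted case.

```python
from itertools import combinations


def project(edges, weighted=False):
    """Project a bipartite edge list of (left, right) pairs onto the left nodes.

    Two left nodes are linked if they share at least one right node.
    The weight of a link is the number of right nodes they share.
    """
    members = {}
    for left, right in edges:
        members.setdefault(right, set()).add(left)
    weights = {}
    for group in members.values():
        for a, b in combinations(sorted(group), 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights if weighted else set(weights)


# Hypothetical network 1: three people all attend a single event.
one_event = [("u1", "e1"), ("u2", "e1"), ("u3", "e1")]

# Hypothetical network 2: each pair of people meets at a different event.
three_events = [
    ("u1", "e1"), ("u2", "e1"),
    ("u2", "e2"), ("u3", "e2"),
    ("u1", "e3"), ("u3", "e3"),
]

same_unweighted = project(one_event) == project(three_events)
same_weighted = project(one_event, True) == project(three_events, True)
print(same_unweighted, same_weighted)
```

Both comparisons hold, so neither projection can tell one shared event apart from three pairwise events, even though the original bipartite networks differ.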
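PageRank is the algorithm Google uses to rank web pages by importance. It scores a page from the link structure of the web, which corresponds to how likely a random surfer is to arrive at that page, and it is a typical application of random walk theory.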
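For readers unfamiliar with PageRank, the random-walk idea can be sketched with a plain power iteration. This is a minimal illustration rather than the implementation used in this work: the damping factor 0.85, the iteration count, and the toy link structure `toy_web` are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to its rank; the ranks sum to 1.
    """
    pages = sorted(set(links) | {q for out in links.values() for q in out})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Teleportation term: the surfer jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            out = links.get(p, [])
            if out:
                # Follow one of the outgoing links uniformly at random.
                share = damping * rank[p] / len(out)
                for q in out:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank uniformly over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Hypothetical four-page web.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
print(max(ranks, key=ranks.get))
```

In this toy web, page C receives links from A, B, and D and ends up with the highest rank, while D, which no page links to, keeps only the teleportation share.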
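Community detection on the nodes of one side of a bipartite network has significant practical value. Building on the transfer probability of energy in a network and on the idea of modularity, this work applies the PageRank algorithm to community discovery in bipartite social networks. Specifically, it uses the connections between nodes of the bipartite social network to construct the probability transition matrices that PageRank requires, and combines two PageRank matrices of different dimensions to partition the nodes on one side of the bipartite graph into communities and to compute the modularity value Q. The algorithm simulates how energy is transferred through the network: the energy each node receives from other nodes after the transfer is the basis for merging communities, and modularity is the criterion for judging the quality of a partition. Finally, the PR algorithm is tested on a typical network (the Southern Women network).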
Keywords: bipartite network; PR algorithm; modularity; random walk theory; community detection
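The combination of two transition matrices of different dimensions described in the abstract can be illustrated as follows. This is a minimal sketch under stated assumptions: each matrix is taken to be a row-normalized biadjacency matrix, the helper names (`row_normalize`, `transpose`, `matmul`) and the toy data are hypothetical, and the community-merging rule and the computation of Q are omitted because the abstract does not specify them.

```python
def row_normalize(matrix):
    """Divide each row by its sum so the row becomes a probability distribution."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([x / total if total else 0.0 for x in row])
    return result


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


# Hypothetical toy data: 4 people (rows) by 3 events (columns).
biadjacency = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]

# Energy moves from people to events (4 x 3), then back to people (3 x 4).
people_to_events = row_normalize(biadjacency)
events_to_people = row_normalize(transpose(biadjacency))

# The product is a 4 x 4 row-stochastic matrix over the people side only.
people_to_people = matmul(people_to_events, events_to_people)
for row in people_to_people:
    print([round(x, 3) for x in row])
```

Entry (i, j) of the product is the share of person i's energy that reaches person j after one round trip through the events. People with similar attendance exchange more energy, which is the kind of signal a merging step could use.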
Abstract
A bipartite network is one form of complex network, and the bipartite graph is the tool used to describe it. Community detection on bipartite networks usually follows one of two approaches. The first projects the bipartite network onto a one-mode network, using either an unweighted or a weighted projection, and detects communities on the projection. Its drawback is that the projection discards part of the information in the original bipartite network, which makes the results less accurate. The second detects communities directly on the bipartite network, which avoids the error introduced by projection.
PageRank is the algorithm Google uses to rank web pages by importance. It scores a page from the link structure of the web, which corresponds to how likely a random surfer is to arrive at that page, and it is a typical application of random walk theory.
Community detection on the nodes of one side of a bipartite network has significant practical value. Building on the transfer probability of energy in a network and on the idea of modularity, this work applies the PageRank algorithm to community discovery in bipartite social networks. Specifically, it uses the connections between nodes of the bipartite social network to construct the probability transition matrices that PageRank requires, and combines two PageRank matrices of different dimensions to partition the nodes on one side of the bipartite graph into communities and to compute the modularity value Q. The algorithm simulates how energy is transferred through the network: the energy each node receives from other nodes after the transfer is the basis for merging communities, and modularity is the criterion for judging the quality of a partition. Finally, the PageRank algorithm is tested on a typical network (the Southern Women network).
Keywords: bipartite network; PR algorithm; modularity; random walk theory; community detection