BIOLOGICAL NETWORKS


Summary of Chinese-English Technical Terms in Artificial Intelligence


Glossary: English terms with Chinese equivalents

abductive reasoning (溯因推理)
action recognition (行为识别)
active learning (主动学习)
adaptive systems (自适应系统)
adverse drug reactions (药物不良反应)
algorithm design and analysis (算法设计与分析)
algorithm (算法)
artificial intelligence (人工智能)
association rule (关联规则)
attribute value taxonomy (属性分类规范)
autonomous agent (自动代理)
autonomous systems (自动系统)
background knowledge (背景知识)
Bayes methods (贝叶斯方法)
Bayesian inference (贝叶斯推断)
Bayesian methods (Bayes 方法)
belief propagation (置信传播)
better understanding (内涵理解)
big data (大数据)
biological network (生物网络)
biological sciences (生物科学)
biomedical domain (生物医学领域)
biomedical research (生物医学研究)
biomedical text (生物医学文本)
Boltzmann machine (玻尔兹曼机)
bootstrapping method (拔靴法)
case based reasoning (实例推理)
causal models (因果模型)
citation matching (引文匹配)
classification (分类)
classification algorithms (分类算法)
cloud computing (云计算)
cluster-based retrieval (聚类检索)
clustering (聚类)
clustering algorithms (聚类算法)
cognitive science (认知科学)
collaborative filtering (协同过滤)
collaborative ontology development (联合本体开发)
collaborative ontology engineering (联合本体工程)
commonsense knowledge (常识)
communication networks (通讯网络)
community detection (社区发现)
complex data (复杂数据)
complex dynamical networks (复杂动态网络)
complex network (复杂网络)
computational biology (计算生物学)
computational complexity (计算复杂性)
computational intelligence (智能计算)
computational modeling (计算模型)
computer animation (计算机动画)
computer networks (计算机网络)
computer science (计算机科学)
concept clustering (概念聚类)
concept formation (概念形成)
concept learning (概念学习)
concept map (概念图)
concept model (概念模型)
concept modelling (概念模型)
conceptual model (概念模型)
conditional random field (条件随机场模型)
conjunctive queries (合取查询)
constrained least squares (约束最小二乘)
convex programming (凸规划)
convolutional neural networks (卷积神经网络)
customer relationship management (客户关系管理)
data analysis (数据分析)
data center (数据中心)
data clustering (数据聚类)
data compression (数据压缩)
data envelopment analysis (数据包络分析)
data fusion (数据融合)
data generation (数据生成)
data handling (数据处理)
data hierarchy (数据层次)
data integration (数据整合)
data integrity (数据完整性)
data intensive computing (数据密集型计算)
data management (数据管理)
data mining (数据挖掘)
data model (数据模型)
data models (数据模型)
data partitioning (数据划分)
data point (数据点)
data privacy (数据隐私)
data security (数据安全)
data stream (数据流)
data streams (数据流)
data structure (数据结构)
data visualisation (数据可视化)
data visualization (数据可视化)
data warehouse (数据仓库)
data warehouses (数据仓库)
data warehousing (数据仓库)
database management systems (数据库管理系统)
database management (数据库管理)
date interlinking (日期互联)
date linking (日期链接)
decision analysis (决策分析)
decision maker (决策者)
decision making (决策)
decision models (决策模型)
decision rule (决策规则)
decision support system (决策支持系统)
decision support systems (决策支持系统)
decision tree (决策树)
deep belief network (深度信念网络)
deep learning (深度学习)
default reasoning (默认推理)
density estimation (密度估计)
design methodology (设计方法论)
dimension reduction (降维)
dimensionality reduction (降维)
directed graph (有向图)
disaster management (灾害管理)
disastrous event (灾难性事件)
dissimilarity (相异性)
distributed databases (分布式数据库)
distributed query (分布式查询)
document clustering (文档聚类)
domain experts (领域专家)
domain knowledge (领域知识)
domain specific language (领域专用语言)
dynamic databases (动态数据库)
dynamic logic (动态逻辑)
dynamic network (动态网络)
dynamic system (动态系统)
earth mover's distance (EMD 距离)
education (教育)
efficient algorithm (有效算法)
electronic commerce (电子商务)
electronic health records (电子健康档案)
entity disambiguation (实体消歧)
entity recognition (实体识别)
entity resolution (实体解析)
event detection (事件检测)
event extraction (事件抽取)
event identification (事件识别)
exhaustive indexing (完整索引)
expert system (专家系统)
expert systems (专家系统)
explanation based learning (解释学习)
factor graph (因子图)
feature extraction (特征提取)
feature selection (特征选择)
feature space (特征空间)
first order logic (一阶逻辑)
formal logic (形式逻辑)
formal meaning representation (形式意义表示)
formal semantics (形式语义)
formal specification (形式描述)
frame based system (框为本的系统)
frequent itemsets (频繁项目集)
frequent pattern (频繁模式)
fuzzy clustering (模糊聚类)
fuzzy data mining (模糊数据挖掘)
fuzzy logic (模糊逻辑)
fuzzy set theory (模糊集合论)
fuzzy set (模糊集)
fuzzy sets (模糊集合)
fuzzy systems (模糊系统)
Gaussian processes (高斯过程)
gene expression data (基因表达数据)
gene expression (基因表达)
generative model (生成模型)
genetic algorithm (遗传算法)
genome wide association study (全基因组关联分析)
graph classification (图分类)
graph clustering (图聚类)
graph data (图数据 / 图形数据)
graph database (图数据库)
graph mining (图挖掘)
graph partitioning (图划分)
graph query (图查询)
graph structure (图结构)
graph theory (图论)
graph visualization (图形可视化)
graphical user interface (图形用户界面)
graphical user interfaces (图形用户界面)
health care (卫生保健)
heterogeneous data source (异构数据源)
heterogeneous data (异构数据)
heterogeneous database (异构数据库)
heterogeneous information network (异构信息网络)
heterogeneous network (异构网络)
heterogeneous ontology (异构本体)
heuristic rule (启发式规则)
hidden Markov model (隐马尔可夫模型)
hidden Markov models (隐马尔可夫模型)
hierarchical clustering (层次聚类)
homogeneous network (同构网络)
human centered computing (人机交互技术)
human computer interaction (人机交互)
human interaction (人机交互)
human robot interaction (人机交互)
image classification (图像分类)
image clustering (图像聚类)
image mining (图像挖掘)
image reconstruction (图像重建)
image retrieval (图像检索)
image segmentation (图像分割)
inconsistent ontology (本体不一致)
incremental learning (增量学习)
inductive learning (归纳学习)
inference mechanisms (推理机制)
inference rule (推理规则)
information cascades (信息追随)
information diffusion (信息扩散)
information extraction (信息提取)
information filtering (信息过滤)
information integration (信息集成)
information network analysis (信息网络分析)
information network mining (信息网络挖掘)
information network (信息网络)
information processing (信息处理)
information resource management (信息资源管理)
information retrieval models (信息检索模型)
information retrieval (信息检索)
information science (情报科学)
information sources (信息源)
information system (信息系统)
information technology (信息技术)
information visualization (信息可视化)
instance matching (实例匹配)
intelligent assistant (智能辅助)
intelligent systems (智能系统)
interaction network (交互网络)
interactive visualization (交互式可视化)
kernel function (核函数)
kernel operator (核算子)
keyword search (关键字检索)
knowledge reuse (知识再利用)
knowledge acquisition
knowledge base (知识库)
knowledge based system (知识系统)
knowledge building (知识建构)
knowledge capture (知识获取)
knowledge construction (知识建构)
knowledge discovery (知识发现)
knowledge extraction (知识提取)
knowledge fusion (知识融合)
knowledge integration
knowledge management systems (知识管理系统)
knowledge management (知识管理)
knowledge model (知识模型)
knowledge reasoning
knowledge representation (知识表达)
knowledge sharing (知识共享)
knowledge storage
knowledge technology (知识技术)
knowledge verification (知识验证)
language model (语言模型)
language modeling approach (语言模型方法)
large graph (大图)
life science (生命科学)
linear programming (线性规划)
link analysis (链接分析)
link prediction (链接预测)
linked data (关联数据)
location based service (基于位置的服务)
location based services (基于位置的服务)
logic programming (逻辑编程)
logical implication (逻辑蕴涵)
logistic regression (logistic 回归)
machine learning (机器学习)
machine translation (机器翻译)
management system (管理系统)
manifold learning (流形学习)
Markov chains (马尔可夫链)
Markov processes (马尔可夫过程)
matching function (匹配函数)
matrix decomposition (矩阵分解)
maximum likelihood estimation (最大似然估计)
medical research (医学研究)
mixture of Gaussians (混合高斯模型)
mobile computing (移动计算)
multi-agent systems (多智能体系统)
multiagent systems (多智能体系统)
multimedia (多媒体)
natural language processing (自然语言处理)
nearest neighbor (近邻)
network analysis (网络分析)
network formation (组网)
network structure (网络结构)
network theory (网络理论)
network topology (网络拓扑)
network visualization (网络可视化)
neural network (神经网络)
neural networks (神经网络)
nonlinear dynamics (非线性动力学)
nonmonotonic reasoning (非单调推理)
nonnegative matrix factorization (非负矩阵分解)
object detection (目标检测)
object oriented (面向对象)
object recognition (目标识别)
online community (网络社区)
online social network (在线社交网络)
online social networks (在线社交网络)
ontology alignment (本体映射)
ontology development (本体开发)
ontology engineering (本体工程)
ontology evolution (本体演化)
ontology extraction (本体抽取)
ontology interoperability (互用性本体)
ontology language (本体语言)
ontology mapping (本体映射)
ontology matching (本体匹配)
ontology versioning (本体版本)
ontology (本体论)
open government data (政府公开数据)
opinion analysis (舆情分析)
opinion mining (意见挖掘)
outlier detection (孤立点检测)
parallel processing (并行处理)
patient care (病人医疗护理)
pattern classification (模式分类)
pattern matching (模式匹配)
pattern mining (模式挖掘)
pattern recognition (模式识别)
personal data (个人数据)
prediction algorithms (预测算法)
predictive model (预测模型)
predictive models (预测模型)
privacy preservation (隐私保护)
probabilistic logic (概率逻辑)
probabilistic model (概率模型)
probability distribution (概率分布)
project management (项目管理)
pruning technique (修剪技术)
quality management (质量管理)
query expansion (查询扩展)
query language (查询语言)
query processing (查询处理)
query rewrite (查询重写)
question answering system (问答系统)
random forest (随机森林)
random graph (随机图)
random processes (随机过程)
random walk (随机游走)
range query (范围查询)
RDF database (资源描述框架数据库)
RDF query (资源描述框架查询)
RDF repository (资源描述框架存储库)
RDF storage (资源描述框架存储)
real time (实时)
recommender system (推荐系统)
recommender systems (推荐系统)
record linkage (记录链接)
recurrent neural network (递归神经网络)
regression (回归)
reinforcement learning (强化学习)
relation extraction (关系抽取)
relational database (关系数据库)
relational learning (关系学习)
relevance feedback (相关反馈)
resource description framework (资源描述框架)
restricted Boltzmann machines (受限玻尔兹曼机)
retrieval models (检索模型)
rough set theory (粗糙集理论)
rough set (粗糙集)
rule based system (基于规则系统)
rule based (基于规则)
rule induction (规则归纳)
rule learning (规则学习)
schema mapping (模式映射)
schema matching (模式匹配)
scientific domain (科学域)
search problems (搜索问题)
semantic (web) technology (语义技术)
semantic analysis (语义分析)
semantic annotation (语义标注)
semantic computing (语义计算)
semantic integration (语义集成)
semantic interpretation (语义解释)
semantic model (语义模型)
semantic network (语义网络)
semantic relatedness (语义相关性)
semantic relation learning (语义关系学习)
semantic search (语义检索)
semantic similarity (语义相似度)
semantic web rule language (语义网规则语言)
semantic web (语义网)
semantic workflow (语义工作流)
semi supervised learning (半监督学习)
sensor data (传感器数据)
sensor networks (传感器网络)
sentiment analysis (情感分析)
sequential pattern (序列模式)
service oriented architecture (面向服务的体系结构)
shortest path (最短路径)
similar kernel function (相似核函数)
similarity measure (相似性度量)
similarity relationship (相似关系)
similarity search (相似搜索)
similarity (相似性)
situation aware (情境感知)
social behavior (社交行为)
social influence (社会影响)
social interaction (社交互动)
social learning (社会学习)
social life networks (社交生活网络)
social machine (社交机器)
social media (社交媒体)
social network analysis (社会网络分析 / 社交网络分析)
social network (社交网络)
social networks (社会网络)
social science (社会科学)
social tagging system (社交标签系统)
social tagging (社交标签)
social web (社交网页)
sparse coding (稀疏编码)
sparse matrices (稀疏矩阵)
sparse representation (稀疏表示)
spatial database (空间数据库)
spatial reasoning (空间推理)
statistical analysis (统计分析)
statistical model (统计模型)
string matching (串匹配)
structural risk minimization (结构风险最小化)
structured data (结构化数据)
subgraph matching (子图匹配)
subspace clustering (子空间聚类)
supervised learning (有监督学习)
support vector machine (支持向量机)
support vector machines (支持向量机)
system dynamics (系统动力学)
tag recommendation (标签推荐)
taxonomy induction (感应规范)
temporal logic (时态逻辑)
temporal reasoning (时序推理)
text analysis (文本分析)
text classification (文本分类)
text data (文本数据)
text mining technique (文本挖掘技术)
text mining (文本挖掘)
text summarization (文本摘要)
thesaurus alignment (同义对齐)
time frequency analysis (时频分析)
time series analysis (时间序列分析)
time series data (时间序列数据)
time series (时间序列)
topic model (主题模型)
topic modeling (主题模型)
transfer learning (迁移学习)
triple store (三元组存储)
uncertainty reasoning (不精确推理)
undirected graph (无向图)
unified modeling language (统一建模语言)
unsupervised learning (无监督学习)
upper bound (上界)
user behavior (用户行为)
user generated content (用户生成内容)
utility mining (效用挖掘)
visual analytics (可视化分析)
visual content (视觉内容)
visual representation (视觉表征)
visualisation (可视化)
visualization technique (可视化技术)
visualization tool (可视化工具)
web 2.0 (网络2.0)
web forum (web 论坛)
web mining (网络挖掘)
web of data (数据网)
web ontology language (网络本体语言)
web pages (web 页面)
web resource (网络资源)
web science (万维科学)
web search (网络检索)
web usage mining (web 使用挖掘)
wireless networks (无线网络)
world knowledge (世界知识)
world wide web (万维网)
XML database (可扩展标志语言数据库)

Appendix 2: Data Mining knowledge graph (15 second-level nodes and 93 third-level nodes in total)
Columns: Domain, Second-level category, Third-level category.

BP Neural Network: Full Slide Text

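The activation function is usually required to be continuous and differentiable.
The output layer and the hidden layers may use different activation functions, and the activation functions of the individual output units may also differ from one another.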
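2 Expressive power of multilayer networks
According to Kolmogorov's theorem, any decision function can be realized by a three-layer neural network of the form given above.
That is, given a sufficient number of hidden units, suitable nonlinear functions, and suitable weights, any continuous mapping from input to output can be realized by a three-layer feedforward neural network.
Neural network computation is carried out through the network structure;
different network structures can realize different functions;
the parameters of the network structure are adjusted gradually through learning.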
(1) The basic artificial neuron model
McCulloch-Pitts neuron model
Input signals; connection strengths and the weight vector;
Signal accumulation
Activation and inhibition
The three elements of the artificial neuron model:
a set of connections, an adder, and an activation function
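The three elements listed above (a set of weighted connections, an adder, and an activation function) can be sketched as a McCulloch-Pitts style threshold unit. This is only a minimal illustration under stated assumptions: the function name `mp_neuron`, the hard-threshold activation, and the use of a negative weight to stand in for an inhibitory connection are choices made for this example and do not come from the slides.

```python
def mp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Adder: accumulate the input signals, each scaled by its connection weight.
    total = sum(w * x for w, x in zip(weights, inputs))
    # Activation function: a hard threshold (assumed here for illustration).
    return 1 if total >= threshold else 0


# Logical AND built from two excitatory connections of weight 1.
and_outputs = [mp_neuron([a, b], [1, 1], threshold=2)
               for a in (0, 1) for b in (0, 1)]

# Inhibition modeled as a negative weight (an illustrative simplification).
inhibited = mp_neuron([1, 1], [1, -1], threshold=1)
```

With a threshold of 2 the unit fires only when both inputs are active, which is one way to see how the weights and the threshold together determine the function the neuron computes.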
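➢ Dendrites: receive information from the outside
➢ Cell body: the main body of the nerve cell, where information is processed
➢ Axon: the output device of the cell; it carries signals outward and connects to many other neurons
➢ Synapse: a neuron passes signals to other neurons (their cell bodies or dendrites) through synapses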
(2) Basic characteristics of biological neurons
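Assumptions: layer $l$ is the layer currently being processed;
the computing units of the previous layer $l-1$, the current layer $l$, and the next layer $l+1$ are indexed by $i$, $j$, and $k$, respectively;
the output of the $j$-th computing unit in the current layer is $O^{l}_{j}$, $j = 1, \dots, n_{l}$;
the weight of the connection from the $i$-th unit of the previous layer to the $j$-th unit of the current layer is $w^{l}_{ij}$, $i = 1, \dots, n_{l-1}$;
the weight of the connection from the $j$-th unit of the current layer to the $k$-th unit of the next layer is $w^{l+1}_{jk}$.
Connection weights: synaptic connection strengths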
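The layer notation above can be made concrete with a short forward-pass sketch. The slides only define the symbols, so the propagation rule used here, $O^{l}_{j} = f\left(\sum_{i} w^{l}_{ij} O^{l-1}_{i}\right)$, is the standard feedforward rule and is stated as an assumption; the weight symbol $w$, the sigmoid activation, the omission of bias terms, and the function names are likewise illustrative choices.

```python
import math


def sigmoid(x):
    """Continuous, differentiable activation (assumed for this sketch)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(prev_outputs, weights, activation=sigmoid):
    """Compute O^l_j = f(sum_i w^l_ij * O^(l-1)_i) for every unit j of layer l.

    weights[i][j] is the weight from unit i of layer l-1 to unit j of layer l.
    """
    n_prev = len(prev_outputs)
    if len(weights) != n_prev:
        raise ValueError("weights must have one row per unit of the previous layer")
    n_cur = len(weights[0])
    return [
        activation(sum(weights[i][j] * prev_outputs[i] for i in range(n_prev)))
        for j in range(n_cur)
    ]


def forward(inputs, weight_matrices):
    """Propagate the inputs through the layers one after another."""
    outputs = list(inputs)
    for weights in weight_matrices:
        outputs = layer_output(outputs, weights)
    return outputs


# A 2-3-1 network with all weights zero: every unit outputs sigmoid(0).
zero_net = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [[0.0], [0.0], [0.0]]]
result = forward([1.0, 2.0], zero_net)
```

Storing the weights as `weights[i][j]` mirrors the index order of $w^{l}_{ij}$ (source unit first, target unit second), which keeps the code aligned with the notation.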

English Weekly (GZ), Grade 9, Issue 13: Answer Key

Grade 9, Issue 13, pages 2-4
Units 1-4: Basic knowledge practice
Unit 4 I. 1. exam 2. awful 3. hate 4. none 5. mess 6. polite 7. though 8. suggest 9. ashamed 10. careless
III. 1-5: ACBAD
6-10: CDBAA
IV. 1. Tom has trouble in falling asleep. 2. Her parents expect her to be a teacher. 3. It takes me at least two hours to finish my homework every day. 4. I suppose (that) I can finish the project next year.
II. 1. laugh at 2. out of place 3. heard from 4. made up his mind 5. drives me mad 6. none of our business 7. is on a diet 8. It's, of, to travel
IV. 1. to go 2. to invent 3. to eat 4. to finish 5. to do 6. drawing 7. to bring 8. to come 9. to make 10. to shake 11. to take 12. answering 13. thinking 14. to become

English-Language Literature on Protein Interactions


Protein-Protein Interactions: A Review of the Current Research

Abstract:
Protein-protein interactions (PPIs) play a critical role in various biological processes, including signal transduction, gene regulation, and cellular functions. Understanding the mechanisms and dynamics of these interactions is crucial for elucidating the complexity of cellular networks. This article provides an overview of the current research on protein-protein interactions, including the techniques used to study PPIs, the databases available for PPI data, and the computational methods employed for predicting and analyzing PPI networks. Additionally, we discuss the functional significance of PPIs in various cellular processes and their implications in disease development.

1. Introduction
Proteins are the building blocks of life, and their interactions with other proteins are essential for the proper functioning of cells. Protein-protein interactions are highly dynamic and complex, involving the formation of transient or stable complexes that govern various cellular processes. Understanding the underlying principles and dynamics of these interactions is crucial for deciphering the intricate molecular mechanisms of biological systems.

2. Techniques for studying protein-protein interactions
A variety of experimental techniques have been developed to study PPIs. These include yeast two-hybrid, co-immunoprecipitation, fluorescence resonance energy transfer, and mass spectrometry-based approaches. Each technique has its strengths and limitations, and the choice of method depends on the specific research question and the characteristics of the proteins under investigation.

3. Databases for protein-protein interaction data
Several databases have been established to collect, curate, and provide access to PPI data.
These databases, such as the Protein Data Bank, the Biomolecular Interaction Network Database, and the Human Protein Reference Database, serve as valuable resources for researchers to retrieve and analyze PPI information. These databases enable the integration and interpretation of PPI data from multiple sources, facilitating the discovery of novel interactions and the exploration of protein networks.

4. Computational methods for predicting and analyzing protein-protein interaction networks
Computational approaches have been extensively employed to predict and analyze PPI networks. These methods utilize various algorithms, including sequence-based, structure-based, and network-based approaches. They aid in the identification of potential protein interactions, the characterization of protein complexes, and the prediction of protein functions. Furthermore, computational methods allow for the visualization and analysis of PPI networks, facilitating the identification of key proteins and modules within these networks.

5. Functional significance of protein-protein interactions
PPIs play critical roles in numerous cellular processes, including signal transduction, protein localization, protein folding, and enzymatic activities. These interactions regulate protein functions, mediate protein complex assembly, and orchestrate cellular responses to external stimuli. Understanding the functional significance of PPIs is essential for deciphering the complexity of cellular networks and the underlying mechanisms of biological processes.

6. Implications of protein-protein interactions in disease development
Disruptions in protein-protein interactions can lead to the development of various diseases, including cancer, neurodegenerative disorders, and infectious diseases. Dysregulated PPI networks can contribute to abnormal cellular signaling, protein misfolding, and dysfunctional protein complexes, which can ultimately result in disease phenotypes.
Therefore, targeting PPIs has emerged as a promising therapeutic strategy for the treatment of various diseases, and the identification of small molecules and peptides that disrupt or modulate specific protein interactions holds great potential for drug development.

7. Conclusion
Protein-protein interactions are integral to cellular processes and play a vital role in the maintenance of cellular homeostasis. Advancements in experimental techniques and computational methods have significantly enhanced our understanding of PPI networks. Further research in this field will undoubtedly uncover new insights into the complexity of protein interactions and their functional implications, ultimately leading to the development of innovative therapies and interventions for various diseases.

Finding community structure in networks using the eigenvectors of matrices

M. E. J. Newman
Department of Physics and Center for the Study of Complex Systems, University of Michigan, Ann Arbor, MI 48109–1040
We consider the problem of detecting communities or modules in networks, groups of vertices with a higher-than-average density of edges connecting them. Previous work indicates that a robust approach to this problem is the maximization of the benefit function known as "modularity" over possible divisions of a network. Here we show that this maximization process can be written in terms of the eigenspectrum of a matrix we call the modularity matrix, which plays a role in community detection similar to that played by the graph Laplacian in graph partitioning calculations. This result leads us to a number of possible algorithms for detecting community structure, as well as several other results, including a spectral measure of bipartite structure in networks and a new centrality measure that identifies those vertices that occupy central positions within the communities to which they belong. The algorithms and measures proposed are illustrated with applications to a variety of real-world complex networks.
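To make the idea of a spectral approach to modularity concrete, the sketch below splits a small network in two using the sign pattern of the leading eigenvector of the modularity matrix. The abstract does not give formulas, so the definitions used here, $B_{ij} = A_{ij} - k_i k_j / (2m)$ and $Q = \frac{1}{4m} s^{T} B s$ for a division vector $s$ with entries $\pm 1$, are taken as the standard ones and should be read as assumptions of this example. The function names, the shifted power iteration, and the fixed iteration count are illustrative choices, not the paper's algorithm.

```python
import math


def modularity_matrix(adj):
    """B_ij = A_ij - k_i * k_j / (2m) for an undirected 0/1 adjacency matrix."""
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    if two_m == 0:
        raise ValueError("the graph has no edges")
    n = len(adj)
    return [[adj[i][j] - degrees[i] * degrees[j] / two_m for j in range(n)]
            for i in range(n)]


def leading_eigenvector(matrix, iterations=500):
    """Power iteration on (matrix + shift * I); the shift makes all
    eigenvalues non-negative so the most positive one dominates."""
    n = len(matrix)
    shift = max(sum(abs(x) for x in row) for row in matrix)
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-uniform start
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(n)) + shift * vec[i]
               for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec


def spectral_bisection(adj):
    """Split vertices by the sign of the leading eigenvector; return (groups, Q)."""
    b = modularity_matrix(adj)
    vec = leading_eigenvector(b)
    s = [1 if x > 0 else -1 for x in vec]
    n = len(adj)
    two_m = sum(sum(row) for row in adj)
    q = sum(b[i][j] * s[i] * s[j] for i in range(n) for j in range(n)) / (2 * two_m)
    groups = {frozenset(i for i in range(n) if s[i] == 1),
              frozenset(i for i in range(n) if s[i] == -1)}
    return groups, q


# Two triangles (0,1,2) and (3,4,5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1
groups, q = spectral_bisection(adj)
```

On this toy graph the two triangles are recovered as the two groups. For larger networks a library eigen-solver would be a more practical choice than the plain power iteration used here.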

Attack vulnerability of complex networks


Petter Holme* and Beom Jun Kim†
Department of Theoretical Physics, Umeå University, 90187 Umeå, Sweden

Chang No Yoon and Seung Kee Han
Department of Physics, Chungbuk National University, Cheongju, Chungbuk 361-763, Korea

(Received 19 December 2001; published 7 May 2002)

We study the response of complex networks subject to attacks on vertices and edges. Several existing complex network models as well as real-world networks of scientific collaborations and Internet traffic are numerically investigated, and the network performance is quantitatively measured by the average inverse geodesic length and the size of the largest connected subgraph. For each case of attacks on vertices and edges, four different attacking strategies are used: removals by the descending order of the degree and the betweenness centrality, calculated for either the initial network or the current network during the removal procedure. It is found that the removals by the recalculated degrees and betweenness centralities are often more harmful than the attack strategies based on the initial network, suggesting that the network structure changes as important vertices or edges are removed. Furthermore, the correlation between the betweenness centrality and the degree in complex networks is studied.

DOI: 10.1103/PhysRevE.65.056109
PACS number(s): 89.75.Fb, 89.75.Hc, 89.65.-s

I. INTRODUCTION

Examples of complex networks are abundant in many disciplines of science and have recently received much attention [1,2]. Many works have tried to regenerate geometrical statistics of real-world networks by generative algorithms that mimic behaviors found in real-world networks. The studies along this line have been able to model, e.g., the emergence of scale-free degree distributions [3] and the high clustering of social networks [4,5]. Another group of complex network studies aims to investigate certain dynamical problems on network topologies [5,6]. A third group of works studies how the geometric characteristics and
performances of the networks are affected by the restrictions imposed on networks. The approach taken by the present paper belongs to the third category as we study the robustness of the network subject to various attack strategies.

Originated from studies of computer networks, "attack vulnerability" [3] denotes the decrease of network performance due to a selected removal of vertices or edges. In the present study we measure the attack vulnerability of various complex network models and real-world networks. We compare different ways of attacking the network and use various ways of measuring the resulting damage. In general, this gives a measure of the decrease of network functionality under a sinister attack. The meaningful purpose for attack vulnerability studies is for the sake of protection: If one wants to protect the network by guarding or by a temporary isolation of some vertices (edges), the most important vertices (edges), breaking of which would make the whole network malfunction, should be identified. Furthermore, one can learn how to build attack-robust networks, and also how to increase the robustness of vital biological networks. Also in a large network of a criminal organization, the whole network can be made to collapse by arresting key persons, which can be identified by a similar study. However, the applicability to social networks may not be very high: acquaintance ties are to some extent subjective and time dependent [7], and when a social network structure is under attack, the dynamics would probably speed up as the organization tries to protect itself.

A topic closely related to attack vulnerability is that of the percolation in complex networks [8], where all vertices (or edges) have the equal probability of being disabled. In the network of computers, this situation corresponds to a random breakdown of computers, while in the problem of a disease spread through a network of people, it corresponds to the fact that a randomly chosen set of people are susceptible. One of the key
quantities in percolation studies, the size of the largest connected subgraph, is also used in the present paper as one of the measures of network performance.

This paper is organized as follows: In Sec. II we provide the definitions of terms and measured quantities. In Sec. III various attack strategies are explained. In Sec. IV two real-world networks and several complex network models used in the present paper are briefly described. Sections V, VI, and VII are devoted to the main results, on the relation between degrees and betweenness centralities, on the vulnerability under vertex attack, and on the vulnerability under edge attack. Finally, we summarize our results in Sec. VIII.

II. DEFINITIONS OF QUANTITIES

In general, the complex networks (networks of both randomness and structure) studied in this article can be represented by an undirected and unweighted graph $G=(V,E)$, where $V$ is the set of vertices (or nodes) and $E$ is the set of edges (or links). Each edge connects exactly one pair of vertices, and a vertex pair can be connected by maximally one edge, i.e., multiconnection is not allowed. Let, furthermore, $N$ denote the number of vertices $N=|V|$ and $L$ the number of edges $L=|E|$. For a social network [9], $V$ is a set of persons (or "actors" in the sociology parlance) and $E$ is the set of acquaintance ties that links the persons together. In computer networks $V$ represent the routers or computers and $E$ the channels for computer communication.

*Electronic address: holme@tp.umu.se
†Electronic address: kim@tp.umu.se. Present address: Department of Molecular Science and Technology, Ajou University, Suwon 442-749, Korea.
PHYSICAL REVIEW E, VOLUME 65, 056109

There are several ways of measuring the functionality of networks. One key quantity is the average geodesic length $l$, which is sometimes termed "the characteristic path length," defined by

$$ l \equiv \langle d(v,w) \rangle \equiv \frac{1}{N(N-1)} \sum_{v \in V} \sum_{w \neq v \in V} d(v,w), $$

where $d(v,w)$ is the length of the geodesic between $v$ and $w$ ($v,w \in V$), i.e., the number of edges in the shortest path connecting the two, and the factor
$N(N-1)$ is the number of pairs of vertices. If $l$ is large, the dynamics (of epidemics, information flow, etc.) is slow in the network. Social networks are known to have a very short average geodesic length, $l \propto \ln N$, with the "six degrees of separation," $l \approx 6$, of the earth's population as a celebrated example [10]. The logarithmic increase of $l$ is also characteristic of computer networks, and $l \approx 17$ has been estimated for the entire world-wide web [3]. As the number of removed vertices or edges is increased, the network will eventually break into disconnected subgraphs. The average geodesic length, by definition, becomes infinite for such a disconnected graph, and one can instead study the average inverse geodesic length,

$$ l^{-1} \equiv \left\langle \frac{1}{d(v,w)} \right\rangle \equiv \frac{1}{N(N-1)} \sum_{v \in V} \sum_{w \neq v \in V} \frac{1}{d(v,w)}, \tag{2.1} $$

which has a finite value even for a disconnected graph since $1/d(v,w)=0$ if no path connects $v$ and $w$. It should be noted that the notation $l^{-1}$ does not mean the reciprocal of $l$. The functionality of the network is then measured by $l^{-1}$: the larger $l^{-1}$ is the better the network functions.

Since subsequent attacks will disintegrate the network, the size of the largest connected subgraph is also an interesting quantity for measuring the functionality of the networks. In social networks, the largest connected subgraph is known to have a size of the order of the entire network, and accordingly is called the "giant component" [11]. Throughout the present paper, we denote the size of the giant component as $S$, which will be used together with $l^{-1}$ to study the attack vulnerability.

In addition to the logarithmically increasing average geodesic length, social networks are found to have a high local transitivity: if $\{u,v\}$ and $\{u,w\}$ are two connected pairs, then $v$ is likely to be connected to $w$ too (and if it does $\{u,v,w\}$ is called a "triad") [12]. The clustering coefficient $\gamma$ (introduced in Ref. [5]) intends to measure the average degree of the local transitivity in a network: Let $|\Gamma_v|_E$ denote the number of edges in the neighborhood $\Gamma_v$ of $v \in V$, then

$$ \gamma_v \equiv \frac{|\Gamma_v|_E}{\binom{k_v}{2}} \tag{2.2} $$

is
called the local clustering coefficient of the vertex $v$. Here the degree $k_v$ of $v$ is defined as the number of vertices in $\Gamma_v$, i.e., $k_v \equiv |\Gamma_v|$. The "clustering coefficient" is then defined as the average of $\gamma_v$,

$$ \gamma \equiv \langle \gamma_v \rangle \equiv \frac{1}{N} \sum_{v \in V} \gamma_v. \tag{2.3} $$

An alternative interpretation is that $\gamma_v$ is the fraction of the number of triads divided by the maximal number of triads. In Fig. 1, we present an illustration to explain the meaning of the local clustering coefficient: The number of edges within the neighborhood $|\Gamma_v|_E=5$ and the degree $k_v=6$ result in $\gamma_v=1/3$ in Fig. 1. Both $\gamma_v$ and $\gamma$ are strictly in the interval $[0,1]$, with $\gamma=1$ attained only for a fully connected network, where every vertex is connected to every other vertex with the total number of edges $L=N(N-1)/2$.

FIG. 1. An example of how to calculate the local clustering coefficient $\gamma_v$ of the vertex $v$. The closed black circles indicate the neighborhood $\Gamma_v$ of $v$, and the thick lines are the edges connecting two vertices within $\Gamma_v$. Since there are five such edges ($|\Gamma_v|_E=5$) and the degree $k_v=6$, we obtain $\gamma_v=1/3$ from Eq. (2.2).

Removals of important vertices may affect the network significantly. For example, in Ref. [3] only a few removals of vertices with the highest degrees has been shown to be enough to alter the behaviors of scale-free networks and the average geodesic length has been found to increase dramatically. In the studies of social networks, centrality is an important concept, which tries to capture the prominence of a person in the embedding social structure. It is natural to expect that removals of vertices with high centrality will worsen the functionality of networks more than the removals by degrees. It should be noted that the vertex with a low degree can have a high centrality (this will be shown explicitly in Sec. V) and thus attacking the network by removing vertices with high centralities may differ from that by degrees. Among many centrality measures [14] we focus on the "vertex betweenness centrality" $C_B(v)$ [15] defined for a vertex $v \in V$ as follows:

$$ C_B(v) = \sum_{w \neq w' \in V} \frac{\sigma_{ww'}(v)}{\sigma_{ww'}}, \tag{2.4} $$

where $\sigma_{ww'}$ is the number of geodesics between $w$ and $w'$ and $\sigma_{ww'}(v)$ is the number of geodesics between $w$ and $w'$ that passes $v$. Similarly, one can define the "edge betweenness centrality" $C_B(e)$ for an edge $e \in E$ as

$$ C_B(e) = \sum_{w \neq w' \in V} \frac{\sigma_{ww'}(e)}{\sigma_{ww'}}, \tag{2.5} $$

where $\sigma_{ww'}(e)$ is the number of geodesics between $w$ and $w'$ that includes the edge $e$. Throughout the present paper, we call $C_B(v)$ and $C_B(e)$ the vertex betweenness and the edge betweenness for brevity. For calculations of $C_B(v)$ and $C_B(e)$ we use the $O(NL)$ algorithm presented in Refs. [16,17].

III. ATTACK STRATEGIES

For the study of attack vulnerability of the network, the selection procedure of the order in which vertices are removed is an open choice. One may of course maximize the destructive effect at any fixed number of removed vertices (or edges). However, this requires the knowledge of the whole network structure, and pinpointing the vertex to attack in this way makes a very time-demanding computation. A more tractable choice, used in the original study of computer networks, is to select the vertices in the descending order of degrees in the initial network and then to remove vertices one by one starting from the vertex with the highest degree [3]; this attack strategy uses the initial degree distribution and thus is called "ID removal" throughout the paper. The vertices with high betweenness also play important roles in connecting vertices in the network [31]. The second attack strategy is called "IB removal" and uses the initial distribution of the betweenness. Both ID removal and IB removal use the information on the initial network. As more vertices are removed, the network structure changes, leading to different distributions of the degree and the betweenness than the initial ones. The third attack strategy called "RD removal" uses the recalculated degree distribution at every removal step, and the fourth strategy, we call it "RB removal," is based on the recalculated betweenness at every
step. RD removal has been used in Refs. [18,19]. It should be noted that ID and RD removals are local strategies, while the other two based on the betweenness are global ones, which makes the applications of ID and RD O(N) algorithms, while IB and RB are O(NL) even with the best known algorithm [16,17]. The other important difference between the degree-based and the betweenness-based strategies is that the former concentrate on reducing the total number of edges in the network as fast as possible whereas the latter concentrate on destroying as many geodesics as possible. It is not entirely clear a priori which one of these four different attack strategies should be more harmful than the others, although one can naively expect that RD and RB are more harmful than ID and IB, respectively. It should also be noted that for the strategies based on the recalculated information, the most harmful sequences for removals of N_rm vertices and N_rm' vertices (N_rm ≠ N_rm') might differ significantly even in the early stage of attacks.

One can also attack edges instead of vertices. In the network of computers, attacking edges may correspond to the cutting off of communication cables, while the attacks on vertices can be interpreted as breakdowns of servers by malicious hackers. (The opposite is of course also imaginable: a software obstruction of a communication link or a server destroyed physically.) The vulnerability of networks under edge attacks is also studied by using similar strategies (we again call them ID, IB, RD, and RB removals of edges). The concept of the edge betweenness was introduced in Sec.
II from a straightforward generalization of the vertex betweenness. On the other hand, the definition of the "edge degree" is not so clear. But still it is expected that the importance of an edge should be possible to assess by the degrees of the two vertices it connects. In this work we attempt to define the edge degree k_e from the local information of the vertex degrees in several different ways,

\[ k_e \equiv k_v k_w, \qquad (3.1a) \]
\[ k_e \equiv k_v + k_w, \qquad (3.1b) \]
\[ k_e \equiv \min(k_v, k_w), \qquad (3.1c) \]
\[ k_e \equiv \max(k_v, k_w), \qquad (3.1d) \]

where the edge e connects two vertices v and w with vertex degrees k_v and k_w, respectively. As will be discussed in Sec. V, among the above definitions, we find that Eq. (3.1a) gives a more reasonable result [a higher C_B(e) to k_e correlation] than the others, and thus the "edge degree" defined as k_e ≡ k_v k_w is used for the attack strategies ID and RD edge removals.

From the definitions, we expect that a vertex with higher degree usually should have higher betweenness in most real-world networks. However, the correlation between the edge degree and edge betweenness is less obvious. This is expected to show a larger difference between degree-based and betweenness-based attack strategies for edge attacks than for vertex attacks. The four different edge attack strategies applied to a network generated by Watts and Strogatz's small-world network model (see Sec. IV D) are shown in Fig. 2.
Quite soon the original network structure is lost and procedure-specific structures emerge. For example, the RB edge removal concentrates on edges of high betweenness, and thus edges that carry more geodesics are first lost. Consequently, it is not a surprise that the resulting network structure by RB consists of highly connected clusters and vertices with no neighbors. The RD procedure, on the other hand, removes edges connecting vertices with high vertex degrees, and, therefore, it is natural that the original network is split into many subgraphs of vertices with low (but not zero) degrees.

ATTACK VULNERABILITY OF COMPLEX NETWORKS, PHYSICAL REVIEW E 65 056109

In Secs. VI and VII, we investigate various networks subject to the above-mentioned four different attack strategies, ID, IB, RD, and RB, applied for vertex removals and edge removals. To detect the damages caused by those attacks, we measure the average inverse geodesic length l^{-1} in Eq. (2.1) as well as the size S of the giant component. As the vertex attack proceeds, both the remaining number of vertices and the average inverse geodesic length decrease, which, from the definition of l^{-1}, suggests that l^{-1} can be both increasing and decreasing, depending on how much damage is made by the removals. However, the edge removals do not change the number of vertices in the network and thus l^{-1} should be a decreasing function of the number of removed edges. Similarly, S is expected to show different behaviors for vertex and edge removals. For vertex removals, S versus the number of removed vertices should have a slope of unity in the initial attack stages since the removed vertex probably belonged to the giant component. On the other hand, the initial edge attacks should not change the size of the giant component, and thus S versus the number of removed edges should start as a horizontal line.

We conclude the section with some technical details: In any case where two or more vertices (edges) could equally be chosen by some strategy, the selection is
done randomly. For the RB strategy, if the betweenness is zero for all vertices, i.e., the vertices are either isolated or linked to exactly one neighbor, the vertices with k_v = 1 are attacked before the meaningless attack of vertices with k_v = 0.

IV. NETWORKS

To study the emergence of different geometrical properties of complex networks such as social networks, power grids, metabolic networks, computer networks, and so on, different generative algorithms have been proposed [1]. Among various existing models for generating networks similar to real ones, two generic models, the Watts-Strogatz (WS) model of the small-world networks [5] and the Barabási-Albert (BA) model of the scale-free network [3,20], have been widely studied. Both models commonly show the behavior that two arbitrarily chosen vertices are connected by a remarkably short path. More specifically, the average geodesic length has been found to scale logarithmically with the network size. On the one hand, the WS model does not exhibit the power-law distribution of degree, which many real networks show and the BA model successfully produces. On the other hand, the WS model has high clustering, as do, e.g., social networks, whereas the BA model has a clustering coefficient that scales toward zero as N → ∞. There have been attempts to revise and extend these representative models in order to produce a network model that can show a small average geodesic length, a scale-free degree distribution, and high clustering, all at the same time [4,22].

In this work, we study two real networks, a "social network" constructed from scientific collaboration data (Sec. IV A) and a "computer network" constructed from computer traffic over the Internet (Sec. IV B), as well as four model networks, the random network model by Erdős and Rényi (ER), the WS model, the BA model, and the clustered scale-free network (CSF) model suggested by two of the present authors in Ref. [4] (Secs. IV C–IV F). It should be noted that network models such as those mentioned above model
the emergence of structure in networks, structure that can be monitored by certain quantities such as degree distribution, clustering coefficient, and so on. However, they do (probably) not sample the ensemble of networks defined by specific values of these quantities uniformly. This is known to be the case for other ways of generating random graphs with structural biases that are easier to give a probability-theoretical analysis [23]. That the sampling is biased makes an inference from the scaling of quantities difficult; if, e.g., one model gives networks with the same values as another except for, say, a smaller clustering coefficient, it is not certain that a different behavior under, say, edge removal is due to the lower clustering.

A. Scientific collaboration network from the hep-lat e-print archive

To obtain a well-defined social network from real-world data, we follow Ref. [17] and construct a network of scientific collaborations from the Los Alamos preprint archives [24] in the following way: If two scientists wrote an article together, they are connected by an edge. Accordingly, the vertices in the network are scientists, and the edges represent the collaboration ties. For the attack vulnerability calculations, the whole Los Alamos e-print archive is too big to be computationally tractable [32]; instead we chose the hep-lat database, which contains preprints about lattice studies in high-energy physics, among various subcategories only for computational convenience. The network used in analysis has N = 2010 (the number of vertices) and L = 6614 (the number of edges), and the size of the giant component is S = 1412 and the clustering coefficient is γ ≈ 0.571. A discussion on the usefulness of collaboration networks as real-world social networks can be found in Ref. [17].

B. Computer network from Internet traffic

To build the network structure for the computer communications we follow Ref. [20] and use data from the National Laboratory for Applied Network Research (NLANR) [25]. Here the network is
constructed as follows: Over a period of 24 h a number of servers associated to NLANR and physically spread over the USA gather information through computer interconnections. For every connection established through a server the whole path, from the originating vertex to the requested destination, is added to the network graph. To be more specific, the servers are using the border gateway protocol (BGP) to relay connections over Internet's largest scale [26]. Vertices are computer networks, or "autonomous systems" in BGP nomenclature, interconnected by one or many BGP servers. An edge thus represents an established direct connection between two autonomous systems. The data we analyze represent a network with N = S = 2210, L = 4334, and γ ≈ 0.221.

FIG. 2. Various attack strategies for edge removals applied to a realization of the generative algorithm of the Watts-Strogatz model of small-world network (see Sec. IV D). From left to right, the evolutions of network structures are shown for the edge attack strategies based on ID (initial degree), IB (initial betweenness), RD (recalculated degree), and RB (recalculated betweenness), respectively. The initial network structure is displayed at the top left corner of each column and the subsequent structures at the next nine steps are exhibited (first four steps from top to bottom and then five more steps from top right to bottom right). For an individual subgraph the thickness of the lines is proportional to the betweenness of the corresponding edge.

C. Erdős-Rényi model of random networks

In the ER model [27], we start from N vertices without edges. Subsequently, edges connecting two randomly chosen vertices are added until the total number of edges becomes L. It generates random networks with no particular structural bias: The only restriction in the model is that no multiple edges are allowed between two vertices. In this study, we choose the average degree ⟨k⟩ ≡ 2L/N as a control parameter in the
ER model. The ER model graphs have a logarithmically increasing l, a Poisson-type degree distribution, and a clustering coefficient close to zero.

D. Watts-Strogatz model of small-world networks

In the WS model [5] one starts by constructing a regular one-dimensional network with only local connections of range r. For example, r = 2 means that each vertex is connected to its two nearest neighbors and two next nearest neighbors [see Fig. 3(a)]. Then each edge is visited once, and with the rewiring probability P, is detached at the opposite vertex and reconnected to a randomly chosen vertex forming a "shortcut." (See the illustration in Fig. 3.) For P = 0 the network is a regular local network, with high clustering, but without the small-world behavior: The average geodesic length in this case grows linearly with the network size. In the opposite limit of P = 1, where every edge has been rewired, the generated random graph has vanishing clustering, but shows a logarithmic behavior of the average geodesic length l ∝ ln N. In an intermediate range of P [typically P ∼ O(1/N)], the network generated by the WS model displays both high clustering and small-world behavior, the commonly found characteristics of real social networks.

E. Barabási-Albert model of scale-free networks

Apart from the average geodesic length and clustering, the degree distribution is a structural bias that has received much attention. Many (but not all) real networks are known to have a power-law distribution of degrees [3,28], manifesting a scale-free nature of the network. The BA model of a scale-free network [3,20] is defined by the following ingredients:

(1) Initial condition: To start with, the network consists of m_0 vertices and no edges.
(2) Growth: One vertex v with m edges is added every time step.
(3) Preferential attachment: An edge is added to an old vertex with a probability proportional to its degree. More precisely, the probability P_u for a new vertex v to be attached to u is [33]:

\[ P_u = \frac{k_u}{\sum_{w \in V} k_w}. \qquad (4.1) \]

The growth step
is iterated N − m_0 times to construct a network with size N; for each growth step the preferential attachment step is iterated m times. The above described BA model has been shown to generate scale-free networks with the logarithmically increasing average geodesic length with the size N. However, the original BA model results in networks with low clustering.

F. Clustered scale-free network model

In order to incorporate the high clustering of social networks one can modify the standard BA model by adding one additional step,

(4) Triad formation: If an edge between v and u was added in the previous step of preferential attachment, then add an edge from v to a randomly chosen neighbor w of u. This forms a triad, three vertices connected to each other. If there is no available vertex to connect within Γ_u, do a preferential attachment step instead.

For every new vertex, after an additional preferential attachment step, the triad formation step is performed with a probability P_t (and thus a preferential attachment with the probability 1 − P_t). The average number of triad formation trials per added vertex m_t ≡ (m − 1)P_t is taken as a control parameter in this CSF network model (see Fig. 4). The scale-free degree distribution of the original BA model is conserved in the CSF model whose properties have been analyzed in detail in Ref. [4]. In the limiting case of m_t = 0, the original BA network is constructed from the CSF model. The CSF model has been shown to exhibit high clustering (furthermore, the clustering coefficient is tunable by the control parameter m_t), while it still preserves the characteristics found in the BA model such as the logarithmically increasing average geodesic length and the scale-free degree distribution.

FIG. 3. The Watts-Strogatz (WS) model of small-world networks. The starting point is a regular one-dimensional lattice in (a) with the range r = 2 of the connections. Every edge is visited once and then with the rewiring probability P is rewired to the other vertex. The WS model can generate (a) the local regular network when P = 0, with high clustering but with a large average geodesic length, and (c) the fully random network when P = 1, with low clustering but with a very short geodesic length. In the intermediate region of P depicted in (b), the WS model has both high clustering and the small-world behavior (more specifically, the average geodesic length l ∝ ln N for the network with the size N).

V. CORRELATION BETWEEN DEGREE AND BETWEENNESS

For the six different networks described in Sec. IV, we seek the relation between the degree and the betweenness for vertices and edges. Both the degree and the betweenness, to some extent, measure how important the vertex (edge) is. The natural expectation is that the vertex (edge) with higher degree should also have higher betweenness. The calculation of the betweenness is based on the global information on paths connecting all pairs of vertices, while the degree, by definition, is the quantity that depends on only the local information. This implies that the identification of the relation between the degree and the betweenness can have practical importance since one can approximately estimate the betweenness from the degree.

We first show in Fig. 5 scatter plots of the vertex betweenness C_B(v) versus the vertex degree k_v. As expected, networks with the scale-free degree distributions, (a) the scientific collaboration network, (b) the computer network, (e) the BA network, and (f) the CSF network, show clear signs of correlation between the degree and the betweenness. As the scale-free network becomes more clustered (from the BA model to the CSF model), the correlation between C_B(v) and k_v becomes weaker, manifested by more scattered plots in (f) than (e). The ER and the WS models, (c) and (d), respectively, are characterized by the absence of vertices with very high degrees, which makes the correlation between C_B(v) and k_v rather
difficult to observe especially in the region of high degrees. However, notable correlations are evident even for these networks with an exponential cutoff in degree distributions.

For the study of the correlation between the edge degree k_e and the edge betweenness C_B(e), we try four different definitions of the edge degree in Eq. (3.1) with the assumption that the edge degree can be defined by only the degrees of vertices it connects. For all networks, except the scientific collaboration network, we find (at least some) correlation between k_e and C_B(e). This correlation is most evident with the definition in Eq. (3.1a), k_e ≡ k_v k_w, where k_v and k_w are degrees of the vertices v and w that the edge e connects. But the definition k_e ≡ min(k_v, k_w), Eq. (3.1c), also displays a high correlation between k_e and C_B(e). This suggests that the lower degree of the two vertices an edge connects is more important for a high edge betweenness than the greater degree of the two vertices. In other words, this illustrates the quite natural situation that an edge does not necessarily become central just because it connects to one central vertex, rather it has to be a bridge between two central vertices.

For the scientific collaboration network, it turns out that none of the definitions of edge degree manifest the correlation clearly. Figure 6 shows the scatter plots for the edge degree and the edge betweenness, corresponding to the networks in Fig. 5. Especially, the similarity between the real network and the model network is evident between the computer network and the CSF network [compare (b) and (f) in

FIG. 4. The construction of the clustered scale-free (CSF) network in Sec. IV F. (a) In the preferential attachment step for the newly added vertex v (denoted as closed black circle), the white vertex u is chosen with the probability proportional to its degree (the dashed line represents the new edge). (b) In the triad formation step an additional edge (dashed line) is added to a
randomly selected vertex w in the neighborhood Γ_u of the vertex u chosen in the previous preferential attachment step in (a). The vertices marked by × are not allowed since they are not in Γ_u. Without the triad formation step, the CSF model reduces to the original BA model of scale-free networks.

FIG. 5. Correlation between the vertex betweenness C_B(v) and the vertex degree k_v for (a) the scientific collaboration network, (b) the computer network, (c) the ER network with the size N = 10^4 and average degree ⟨k⟩ = 6, (d) the WS network with N = 10^4, r = 3, and P = 0.01, (e) the BA network with N = 10^4, m_0 = 5, and m = 3, and (f) the CSF network with N = 10^4, m_0 = 5, m = 3, and m_t = 1.8 (see Sec. IV for details of networks). All are in log-log scales except for (c) and (d).
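The degree-based vertex attacks of Sec. III can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names (`er_graph`, `giant_component_size`, `attack`) are hypothetical, self-loops are excluded as an extra assumption, and ties are broken deterministically rather than at random as the paper specifies. It builds an ER-style random graph, removes vertices either in the order of their initial degrees (ID removal) or by the degree recalculated after every removal (RD removal), and records the size S of the giant component after each step.

```python
import random
from collections import deque


def er_graph(n, n_edges, rng):
    """ER-style graph: add edges between random vertex pairs until n_edges exist.

    Multiple edges are forbidden, as in the text; excluding self-loops is an
    additional assumption of this sketch.
    """
    adj = {v: set() for v in range(n)}
    added = 0
    while added < n_edges:
        u, w = rng.sample(range(n), 2)  # two distinct vertices
        if w not in adj[u]:
            adj[u].add(w)
            adj[w].add(u)
            added += 1
    return adj


def giant_component_size(adj):
    """Size S of the largest connected component, via breadth-first search."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def remove_vertex(adj, v):
    """Delete vertex v together with all edges attached to it."""
    for w in adj.pop(v):
        adj[w].discard(v)


def attack(adj, n_remove, recalculate):
    """Return [S_0, S_1, ...] for RD removal (recalculate=True) or ID removal.

    Ties are resolved by iteration order here, not randomly as in the paper.
    """
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    initial_order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    sizes = [giant_component_size(adj)]
    for step in range(n_remove):
        if recalculate:
            target = max(adj, key=lambda v: len(adj[v]))  # current degrees
        else:
            target = initial_order[step]  # degrees of the initial network
        remove_vertex(adj, target)
        sizes.append(giant_component_size(adj))
    return sizes


if __name__ == "__main__":
    graph = er_graph(200, 600, random.Random(0))  # <k> = 2L/N = 6
    print("ID:", attack(graph, 50, recalculate=False)[-1])
    print("RD:", attack(graph, 50, recalculate=True)[-1])
```

The betweenness-based strategies (IB and RB) would follow the same loop with the ranking key replaced by a betweenness computation, which is where the O(NL) cost mentioned in Sec. III enters.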

Research on Deep Spiking Neural Networks and Their Applications


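Abstract: Deep neural networks (DNNs) are a research focus in machine learning (ML). Borrowing the region-partitioning mechanism of the biological visual cognitive system, they represent data as a series of vectors for feature learning, and they have achieved great success in computer vision.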

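Spiking neural networks (SNNs) are neural networks with biological plasticity. They pass information between neurons using time-varying spike trains, can better incorporate spatiotemporal information, and are a principal tool of "brain-inspired computing."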

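Combining the respective strengths of DNNs and SNNs, this thesis analyzes the model characteristics of existing deep spiking neural networks (DSNNs), studies spike encoding and DSNN-based learning methods, and investigates a DSNN-based fault diagnosis method for robotic arms. The specific content is as follows. First, the research background and significance of DSNNs are introduced, the state of DSNN research in China and abroad is reviewed, and the research content and technical roadmap of the thesis are set out.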

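Second, deep convolutional neural networks (DCNNs), deep belief networks (DBNs), and the development and model structure of SNNs are introduced, together with current implementation methods and learning algorithms for DSNN models, providing theoretical support for the subsequent research.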

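Third, a DCNN-based fault classification method for robotic arms is proposed. The preprocessing of the UCI robotic-arm sensor data is described in detail, and the ability of DCNNs to process one-dimensional time-series signals is analyzed.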

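The collected force and torque sensor data of the robotic arm are combined along the time and data dimensions, and 1D and 2D convolution methods are validated experimentally on a CPU (Intel Core i5-7200U) and a GPU (GFX NVIDIA GeForce GTX1060 3G).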

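The experimental results show that this way of processing the robotic arm's one-dimensional time-series data fits the DCNN model well, and that the classification accuracy is better than that of traditional classification methods.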
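As a rough illustration of what 1D convolution over a multichannel time series means, the sketch below slides a kernel along the time axis of a force/torque recording. It is a minimal pure-Python sketch under stated assumptions, not the thesis's implementation: the function name `conv1d`, the data layout (a list of time steps, each holding one value per sensor channel), and the example values are all hypothetical.

```python
def conv1d(series, kernel, bias=0.0):
    """Valid 1-D cross-correlation along the time axis.

    series: list of T time steps, each a list of C channel values
            (e.g. force and torque components; layout is an assumption)
    kernel: list of K taps, each a list of C weights
    returns: list of T - K + 1 output values (one feature map)
    """
    n_steps, n_taps = len(series), len(kernel)
    out = []
    for t in range(n_steps - n_taps + 1):
        acc = bias
        for k in range(n_taps):
            # Sum over all channels at time step t + k.
            for c, weight in enumerate(kernel[k]):
                acc += weight * series[t + k][c]
        out.append(acc)
    return out


# Hypothetical two-channel signal: channel 0 ramps up, channel 1 stays at zero.
signal = [[1, 0], [2, 0], [3, 0], [4, 0]]
# A two-tap difference kernel acting on channel 0 only.
difference = [[1, 0], [-1, 0]]
features = conv1d(signal, difference)
```

A 2D convolution would instead treat the same time-by-channel array as a single-channel image and slide the kernel along both axes, which is one way to read "combining the time and data dimensions."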

Foreign-Language Translation: An Overview of Neural Networks


Foreign-Language Original and Translation

Original Text

Neural Network Introduction

1. Objectives

As you read these words you are using a complex biological neural network. You have a highly interconnected set of some 10^11 neurons to facilitate your reading, breathing, motion and thinking. Each of your biological neurons, a rich assembly of tissue and chemistry, has the complexity, if not the speed, of a microprocessor. Some of your neural structure was with you at birth. Other parts have been established by experience.

Scientists have only just begun to understand how biological neural networks operate. It is generally understood that all biological neural functions, including memory, are stored in the neurons and in the connections between them. Learning is viewed as the establishment of new connections between neurons or the modification of existing connections.

This leads to the following question: Although we have only a rudimentary understanding of biological neural networks, is it possible to construct a small set of simple artificial “neurons” and perhaps train them to serve a useful function? The answer is “yes.” This book, then, is about artificial neural networks.

The neurons that we consider here are not biological. They are extremely simple abstractions of biological neurons, realized as elements in a program or perhaps as circuits made of silicon. Networks of these artificial neurons do not have a fraction of the power of the human brain, but they can be trained to perform useful functions. This book is about such neurons, the networks that contain them and their training.

2. History

The history of artificial neural networks is filled with colorful, creative individuals from many different fields, many of whom struggled for decades to develop concepts that we now take for granted. This history has been documented by various authors. One particularly interesting book is Neurocomputing: Foundations of Research by John Anderson and Edward Rosenfeld.
They have collected and edited a set of some 43 papers of special historical interest. Each paper is preceded by an introduction that puts the paper in historical perspective.

Histories of some of the main neural network contributors are included at the beginning of various chapters throughout this text and will not be repeated here. However, it seems appropriate to give a brief overview, a sample of the major developments.

At least two ingredients are necessary for the advancement of a technology: concept and implementation. First, one must have a concept, a way of thinking about a topic, some view of it that gives clarity not there before. This may involve a simple idea, or it may be more specific and include a mathematical description. To illustrate this point, consider the history of the heart. It was thought to be, at various times, the center of the soul or a source of heat. In the 17th century medical practitioners finally began to view the heart as a pump, and they designed experiments to study its pumping action. These experiments revolutionized our view of the circulatory system. Without the pump concept, an understanding of the heart was out of grasp.

Concepts and their accompanying mathematics are not sufficient for a technology to mature unless there is some way to implement the system. For instance, the mathematics necessary for the reconstruction of images from computer-aided tomography (CAT) scans was known many years before the availability of high-speed computers and efficient algorithms finally made it practical to implement a useful CAT system.

The history of neural networks has progressed through both conceptual innovations and implementation developments. These advancements, however, seem to have occurred in fits and starts rather than by steady evolution.

Some of the background work for the field of neural networks occurred in the late 19th and early 20th centuries.
This consisted primarily of interdisciplinary work in physics, psychology and neurophysiology by such scientists as Hermann von Helmholtz, Ernst Mach and Ivan Pavlov. This early work emphasized general theories of learning, vision, conditioning, etc., and did not include specific mathematical models of neuron operation.

The modern view of neural networks began in the 1940s with the work of Warren McCulloch and Walter Pitts [McPi43], who showed that networks of artificial neurons could, in principle, compute any arithmetic or logical function. Their work is often acknowledged as the origin of the neural network field.

McCulloch and Pitts were followed by Donald Hebb [Hebb49], who proposed that classical conditioning (as discovered by Pavlov) is present because of the properties of individual neurons. He proposed a mechanism for learning in biological neurons.

The first practical application of artificial neural networks came in the late 1950s, with the invention of the perceptron network and associated learning rule by Frank Rosenblatt [Rose58]. Rosenblatt and his colleagues built a perceptron network and demonstrated its ability to perform pattern recognition. This early success generated a great deal of interest in neural network research. Unfortunately, it was later shown that the basic perceptron network could solve only a limited class of problems. (See Chapter 4 for more on Rosenblatt and the perceptron learning rule.)

At about the same time, Bernard Widrow and Ted Hoff [WiHo60] introduced a new learning algorithm and used it to train adaptive linear neural networks, which were similar in structure and capability to Rosenblatt’s perceptron. The Widrow-Hoff learning rule is still in use today. (See Chapter 10 for more on Widrow-Hoff learning.)

Unfortunately, both Rosenblatt's and Widrow's networks suffered from the same inherent limitations, which were widely publicized in a book by Marvin Minsky and Seymour Papert [MiPa69].
Rosenblatt and Widrow were aware of these limitations and proposed new networks that would overcome them. However, they were not able to successfully modify their learning algorithms to train the more complex networks.

Many people, influenced by Minsky and Papert, believed that further research on neural networks was a dead end. This, combined with the fact that there were no powerful digital computers on which to experiment, caused many researchers to leave the field. For a decade neural network research was largely suspended.

Some important work, however, did continue during the 1970s. In 1972 Teuvo Kohonen [Koho72] and James Anderson [Ande72] independently and separately developed new neural networks that could act as memories. Stephen Grossberg [Gros76] was also very active during this period in the investigation of self-organizing networks.

Interest in neural networks had faltered during the late 1960s because of the lack of new ideas and powerful computers with which to experiment. During the 1980s both of these impediments were overcome, and research in neural networks increased dramatically. New personal computers and workstations, which rapidly grew in capability, became widely available. In addition, important new concepts were introduced.

Two new concepts were most responsible for the rebirth of neural networks. The first was the use of statistical mechanics to explain the operation of a certain class of recurrent network, which could be used as an associative memory. This was described in a seminal paper by physicist John Hopfield [Hopf82].

The second key development of the 1980s was the backpropagation algorithm for training multilayer perceptron networks, which was discovered independently by several different researchers. The most influential publication of the backpropagation algorithm was by David Rumelhart and James McClelland [RuMc86]. This algorithm was the answer to the criticisms Minsky and Papert had made in the 1960s.
(See Chapters 11 and 12 for a development of the backpropagation algorithm.)

These new developments reinvigorated the field of neural networks. In the last ten years, thousands of papers have been written, and neural networks have found many applications. The field is buzzing with new theoretical and practical work. As noted below, it is not clear where all of this will lead us.

The brief historical account given above is not intended to identify all of the major contributors, but is simply to give the reader some feel for how knowledge in the neural network field has progressed. As one might note, the progress has not always been "slow but sure." There have been periods of dramatic progress and periods when relatively little has been accomplished.

Many of the advances in neural networks have had to do with new concepts, such as innovative architectures and training. Just as important has been the availability of powerful new computers on which to test these new concepts.

Well, so much for the history of neural networks to this date. The real question is, "What will happen in the next ten to twenty years?" Will neural networks take a permanent place as a mathematical/engineering tool, or will they fade away as have so many promising technologies? At present, the answer seems to be that neural networks will not only have their day but will have a permanent place, not as a solution to every problem, but as a tool to be used in appropriate situations. In addition, remember that we still know very little about how the brain works. The most important advances in neural networks almost certainly lie in the future.

Although it is difficult to predict the future success of neural networks, the large number and wide variety of applications of this new technology are very encouraging. The next section describes some of these applications.

3. Applications

A recent newspaper article described the use of neural networks in literature research by Aston University.
It stated that "the network can be taught to recognize individual writing styles, and the researchers used it to compare works attributed to Shakespeare and his contemporaries." A popular science television program recently documented the use of neural networks by an Italian research institute to test the purity of olive oil. These examples are indicative of the broad range of applications that can be found for neural networks. The applications are expanding because neural networks are good at solving problems, not just in engineering, science and mathematics, but in medicine, business, finance and literature as well. Their application to a wide variety of problems in many fields makes them very attractive. Also, faster computers and faster algorithms have made it possible to use neural networks to solve complex industrial problems that formerly required too much computation.

The following note and Table of Neural Network Applications are reproduced here from the Neural Network Toolbox for MATLAB with the permission of the Math Works, Inc.

The 1988 DARPA Neural Network Study [DARP88] lists various neural network applications, beginning with the adaptive channel equalizer in about 1984. This device, which is an outstanding commercial success, is a single-neuron network used in long distance telephone systems to stabilize voice signals. The DARPA report goes on to list other commercial applications, including a small word recognizer, a process monitor, a sonar classifier and a risk analysis system.

Neural networks have been applied in many fields since the DARPA report was written.
A list of some applications mentioned in the literature follows.
Aerospace
High performance aircraft autopilots, flight path simulations, aircraft control systems, autopilot enhancements, aircraft component simulations, aircraft component fault detectors
Automotive
Automobile automatic guidance systems, warranty activity analyzers
Banking
Check and other document readers, credit application evaluators
Defense
Weapon steering, target tracking, object discrimination, facial recognition, new kinds of sensors, sonar, radar and image signal processing including data compression, feature extraction and noise suppression, signal/image identification
Electronics
Code sequence prediction, integrated circuit chip layout, process control, chip failure analysis, machine vision, voice synthesis, nonlinear modeling
Entertainment
Animation, special effects, market forecasting
Financial
Real estate appraisal, loan advisor, mortgage screening, corporate bond rating, credit line use analysis, portfolio trading program, corporate financial analysis, currency price prediction
Insurance
Policy application evaluation, product optimization
Manufacturing
Manufacturing process control, product design and analysis, process and machine diagnosis, real-time particle identification, visual quality inspection systems, beer testing, welding quality analysis, paper quality prediction, computer chip quality analysis, analysis of grinding operations, chemical product design analysis, machine maintenance analysis, project bidding, planning and management, dynamic modeling of chemical process systems
Medical
Breast cancer cell analysis, EEG and ECG analysis, prosthesis design, optimization of transplant times, hospital expense reduction, hospital quality improvement, emergency room test advisement
Oil and Gas
Exploration
Robotics
Trajectory control, forklift robot, manipulator controllers, vision systems
Speech
Speech recognition, speech compression, vowel classification, text to speech synthesis
Securities
Market analysis, automatic bond rating, stock trading advisory systems
Telecommunications
Image and data compression, automated information services, real-time translation of spoken language, customer payment processing systems
Transportation
Truck brake diagnosis systems, vehicle scheduling, routing systems
Conclusion
The number of neural network applications, the money that has been invested in neural network software and hardware, and the depth and breadth of interest in these devices have been growing rapidly.
4. Biological Inspiration
The artificial neural networks discussed in this text are only remotely related to their biological counterparts. In this section we will briefly describe those characteristics of brain function that have inspired the development of artificial neural networks.
The brain consists of a large number (approximately 10^11) of highly connected elements (approximately 10^4 connections per element) called neurons. For our purposes these neurons have three principal components: the dendrites, the cell body and the axon. The dendrites are tree-like receptive networks of nerve fibers that carry electrical signals into the cell body. The cell body effectively sums and thresholds these incoming signals. The axon is a single long fiber that carries the signal from the cell body out to other neurons. The point of contact between an axon of one cell and a dendrite of another cell is called a synapse. It is the arrangement of neurons and the strengths of the individual synapses, determined by a complex chemical process, that establishes the function of the neural network. Figure 6.1 is a simplified schematic diagram of two biological neurons.
Figure 6.1 Schematic Drawing of Biological Neurons
Some of the neural structure is defined at birth. Other parts are developed through learning, as new connections are made and others waste away. This development is most noticeable in the early stages of life.
For example, it has been shown that if a young cat is denied use of one eye during a critical window of time, it will never develop normal vision in that eye.
Neural structures continue to change throughout life. These later changes tend to consist mainly of strengthening or weakening of synaptic junctions. For instance, it is believed that new memories are formed by modification of these synaptic strengths. Thus, the process of learning a new friend's face consists of altering various synapses.
Artificial neural networks do not approach the complexity of the brain. There are, however, two key similarities between biological and artificial neural networks. First, the building blocks of both networks are simple computational devices (although artificial neurons are much simpler than biological neurons) that are highly interconnected. Second, the connections between neurons determine the function of the network. The primary objective of this book will be to determine the appropriate connections to solve particular problems.
It is worth noting that even though biological neurons are very slow when compared to electrical circuits, the brain is able to perform many tasks much faster than any conventional computer. This is in part because of the massively parallel structure of biological neural networks; all of the neurons are operating at the same time. Artificial neural networks share this parallel structure. Even though most artificial neural networks are currently implemented on conventional digital computers, their parallel structure makes them ideally suited to implementation using VLSI, optical devices and parallel processors.
In the following chapter we will introduce our basic artificial neuron and will explain how we can combine such neurons to form networks. This will provide a background for Chapter 3, where we take our first look at neural networks in action.
Translation
Overview of Neural Networks
1. Objectives
As you read this book, you are using a complex biological neural network.

Systems biology


From Wikipedia, the free encyclopedia
Systems biology is the study of systems of biological components, which may be molecules, cells, organisms or entire species. Living systems are dynamic and complex, and their behavior may be hard to predict from the properties of individual parts. To study them, we use quantitative measurements of the behavior of groups of interacting components, systematic measurement technologies such as genomics, bioinformatics and proteomics, and mathematical and computational models to describe and predict dynamical behavior.
Systems biology is a term often used to describe a number of trends in bioscience research, and a movement which draws on those trends. Proponents describe systems biology as a biology-based interdisciplinary field of study that focuses on complex interactions in biological systems, claiming that it uses a new perspective (holism instead of reduction). Particularly from the year 2000 onwards, the term has been used widely in the biosciences and in a variety of contexts. An often-stated ambition of systems biology is the modeling and discovery of emergent properties, that is, properties of a system whose theoretical description is only possible using techniques which fall under the remit of systems biology. These typically involve metabolic networks or cell signaling networks [1].

Modularity and community structure in networks


arXiv:physics/0602124v1 [physics.data-an] 17 Feb 2006
Modularity and community structure in networks
M. E. J. Newman
Department of Physics and Center for the Study of Complex Systems, Randall Laboratory, University of Michigan, Ann Arbor, MI 48109–1040
Many networks of interest in the sciences, including a variety of social and biological networks, are found to divide naturally into communities or modules. The problem of detecting and characterizing this community structure has attracted considerable recent attention. One of the most sensitive detection methods is optimization of the quality function known as “modularity” over the possible divisions of a network, but direct application of this method using, for instance, simulated annealing is computationally costly. Here we show that the modularity can be reformulated in terms of the eigenvectors of a new characteristic matrix for the network, which we call the modularity matrix, and that this reformulation leads to a spectral algorithm for community detection that returns results of better quality than competing methods in noticeably shorter running times. We demonstrate the algorithm with applications to several network data sets.
Introduction
Many systems of scientific interest can be represented as networks: sets of nodes or vertices joined in pairs by lines or edges. Examples include the Internet and the worldwide web, metabolic networks, food webs, neural networks, communication and distribution networks, and social networks. The study of networked systems has a history stretching back several centuries, but it has experienced a particular surge of interest in the last decade, especially in the mathematical sciences, partly as a result of the increasing availability of large-scale accurate data describing the topology of networks in the real world. Statistical analyses of these data have revealed some unexpected structural features, such as high network transitivity [1], power-law degree distributions [2], and the existence of repeated local motifs [3]; see [4, 5, 6] for reviews.
One issue that has received a considerable amount of attention is the detection and characterization of community structure in networks [7, 8], meaning the appearance of densely connected groups of vertices, with only sparser connections between groups (Fig. 1). The ability to detect such groups could be of significant practical importance. For instance, groups within the worldwide web might correspond to sets of web pages on related topics [9]; groups within social networks might correspond to social units or communities [10]. Merely the finding that a network contains tightly-knit groups at all can convey useful information: if a metabolic network were divided into such groups, for instance, it could provide evidence for a modular view of the network’s dynamics, with different groups of nodes performing different functions with some degree of independence [11, 12].
Past work on methods for discovering groups in networks divides into two principal lines of research, both with long histories. The first, which goes by the name of graph partitioning, has been pursued particularly in computer science and related fields, with applications in parallel computing and VLSI design, among other areas [13, 14]. The second, identified by names such as block modeling, hierarchical clustering, or community structure detection, has been pursued by sociologists and more recently also by physicists and applied mathematicians, with applications especially to social and biological networks [7, 15, 16].
FIG. 1: The vertices in many networks fall naturally into groups or communities, sets of vertices (shaded) within which there are many edges, with only a smaller number of edges between vertices of different groups.
It is tempting to suggest that these two lines of research are really addressing the same question, albeit by somewhat different means. There are, however, important differences between the goals of the two camps that make quite different
technical approaches desirable. A typical problem in graph partitioning is the division of a set of tasks between the processors of a parallel computer so as to minimize the necessary amount of interprocessor communication. In such an application the number of processors is usually known in advance and at least an approximate figure for the number of tasks that each processor can handle. Thus we know the number and size of the groups into which the network is to be split. Also, the goal is usually to find the best division of the network regardless of whether a good division even exists; there is little point in an algorithm or method that fails to divide the network in some cases.
Community structure detection, by contrast, is perhaps best thought of as a data analysis technique used to shed light on the structure of large-scale network datasets, such as social networks, Internet and web data, or biochemical networks. Community structure methods normally assume that the network of interest divides naturally into subgroups and the experimenter’s job is to find those groups. The number and size of the groups is thus determined by the network itself and not by the experimenter. Moreover, community structure methods may explicitly admit the possibility that no good division of the network exists, an outcome that is itself considered to be of interest for the light it sheds on the topology of the network.
In this paper our focus is on community structure detection in network datasets representing real-world systems of interest. However, both the similarities and differences between community structure methods and graph partitioning will motivate many of the developments that follow.
The method of optimal modularity
Suppose then that we are given, or discover, the structure of some network and that we wish to determine whether there exists any natural division of its vertices into nonoverlapping groups or communities, where these communities may be of any size.
Let us approach this question in stages and focus initially on the problem of whether any good division of the network exists into just two communities. Perhaps the most obvious way to tackle this problem is to look for divisions of the vertices into two groups so as to minimize the number of edges running between the groups. This “minimum cut” approach is the approach adopted, virtually without exception, in the algorithms studied in the graph partitioning literature. However, as discussed above, the community structure problem differs crucially from graph partitioning in that the sizes of the communities are not normally known in advance. If community sizes are unconstrained then we are, for instance, at liberty to select the trivial division of the network that puts all the vertices in one of our two groups and none in the other, which guarantees we will have zero intergroup edges. This division is, in a sense, optimal, but clearly it does not tell us anything of any worth. We can, if we wish, artificially forbid this solution, but then a division that puts just one vertex in one group and the rest in the other will often be optimal, and so forth.
The problem is that simply counting edges is not a good way to quantify the intuitive concept of community structure. A good division of a network into communities is not merely one in which there are few edges between communities; it is one in which there are fewer than expected edges between communities. If the number of edges between two groups is only what one would expect on the basis of random chance, then few thoughtful observers would claim this constitutes evidence of meaningful community structure. On the other hand, if the number of edges between groups is significantly less than we expect by chance (or equivalently if the number within groups is significantly more) then it is reasonable to conclude that something interesting is going on.
This idea, that true community structure in a network corresponds to a statistically surprising arrangement of edges, can be quantified using the measure known as modularity [17]. The modularity is, up to a multiplicative constant, the number of edges falling within groups minus the expected number in an equivalent network with edges placed at random. (A precise mathematical formulation is given below.)
The modularity can be either positive or negative, with positive values indicating the possible presence of community structure. Thus, one can search for community structure precisely by looking for the divisions of a network that have positive, and preferably large, values of the modularity [18].
The evidence so far suggests that this is a highly effective way to tackle the problem. For instance, Guimerà and Amaral [12] and later Danon et al. [8] optimized modularity over possible partitions of computer-generated test networks using simulated annealing. In direct comparisons using standard measures, Danon et al.
found that this method outperformed all other methods for community detection of which they were aware, in most cases by an impressive margin. On the basis of considerations such as these we consider maximization of the modularity to be perhaps the definitive current method of community detection, being at the same time based on sensible statistical principles and highly effective in practice.
Unfortunately, optimization by simulated annealing is not a workable approach for the large network problems facing today’s scientists, because it demands too much computational effort. A number of alternative heuristic methods have been investigated, such as greedy algorithms [18] and extremal optimization [19]. Here we take a different approach based on a reformulation of the modularity in terms of the spectral properties of the network of interest.
Suppose our network contains $n$ vertices. For a particular division of the network into two groups let $s_i = 1$ if vertex $i$ belongs to group 1 and $s_i = -1$ if it belongs to group 2. And let the number of edges between vertices $i$ and $j$ be $A_{ij}$, which will normally be 0 or 1, although larger values are possible in networks where multiple edges are allowed. (The quantities $A_{ij}$ are the elements of the so-called adjacency matrix.) At the same time, the expected number of edges between vertices $i$ and $j$ if edges are placed at random is $k_i k_j / 2m$, where $k_i$ and $k_j$ are the degrees of the vertices and $m = \frac{1}{2} \sum_i k_i$ is the total number of edges in the network. Summing $A_{ij} - k_i k_j / 2m$ over all pairs of vertices that fall in the same group, the modularity can be written
$$Q = \frac{1}{4m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s_i s_j = \frac{1}{4m} s^T B s, \qquad (1)$$
where $s$ is the vector whose elements are the $s_i$. The leading factor of $1/4m$ is merely conventional: it is included for compatibility with the previous definition of modularity [17].
We have here defined a new real symmetric matrix $B$ with elements
$$B_{ij} = A_{ij} - \frac{k_i k_j}{2m}, \qquad (2)$$
[...]
FIG. 2: Application of our eigenvector-based method to the “karate club” network of Ref. [23]. Shapes of vertices indicate the membership of the corresponding individuals in the two known factions of the network while the dotted line indicates the split found by the algorithm, which matches the factions exactly. The shades of the vertices indicate the strength of their membership, as measured by the value of the corresponding element of the eigenvector.
[...] groups, but to place them on a continuous scale of “how much” they belong to one group or the other.
As an example of this algorithm we show in Fig. 2 the result of its application to a famous network from the social science literature, which has become something of a standard test for community detection algorithms. The network is the “karate club” network of Zachary [23], which shows the pattern of friendships between the members of a karate club at a US university in the 1970s. This example is of particular interest because, shortly after the observation and construction of the network, the club in question split in two as a result of an internal dispute. Applying our eigenvector-based algorithm to the network, we find the division indicated by the dotted line in the figure, which coincides exactly with the known division of the club in real life.
The vertices in Fig. 2 are shaded according to the values of the elements in the leading eigenvector of the modularity matrix, and these values seem also to accord well with known social structure within the club. In particular, the three vertices with the heaviest weights, either positive or negative (black and white vertices in the figure), correspond to the known ringleaders of the two factions.
Dividing networks into more than two communities
In the preceding section we have given a simple matrix-based method for finding a good division of a network into two parts. Many networks, however, contain more than two communities, so we would like to extend our method to find good divisions of networks into larger numbers of parts. The standard approach to this problem, and the one adopted here, is repeated division into two: we use the algorithm of the previous section first to divide the network into two parts, then divide those parts, and so forth.
In doing this it is
crucial to note that it is not correct, after first dividing a network in two, to simply delete the edges falling between the two parts and then apply the algorithm again to each subgraph. This is because the degrees appearing in the definition, Eq. (1), of the modularity will change if edges are deleted, and any subsequent maximization of modularity would thus maximize the wrong quantity. Instead, the correct approach is to define for each subgraph $g$ a new $n_g \times n_g$ modularity matrix $B^{(g)}$, where $n_g$ is the number of vertices in the subgraph. The correct definition of the element of this matrix for vertices $i, j$ is
$$B^{(g)}_{ij} = A_{ij} - \frac{k_i k_j}{2m} - \delta_{ij} \left[ k_i^{(g)} - k_i \frac{d_g}{2m} \right], \qquad (4)$$
where $k_i^{(g)}$ is the degree of vertex $i$ within subgraph $g$ and $d_g$ is the sum of the (total) degrees $k_i$ of the vertices in the subgraph. Then the subgraph modularity $Q_g = s^T B^{(g)} s$ correctly gives the additional contribution to the total modularity made by the division of this subgraph. In particular, note that if the subgraph is undivided, $Q_g$ is correctly zero. Note also that for a complete network Eq. (4) reduces to the previous definition for the modularity matrix, Eq. (2), since $k_i^{(g)} \to k_i$ and $d_g \to 2m$ in that case.
In repeatedly subdividing our network, an important question we need to address is at what point to halt the subdivision process. A nice feature of our method is that it provides a clear answer to this question: if there exists no division of a subgraph that will increase the modularity of the network, or equivalently that gives a positive value for $Q_g$, then there is nothing to be gained by dividing the subgraph and it should be left alone; it is indivisible in the sense of the previous section. This happens when there are no positive eigenvalues to the matrix $B^{(g)}$, and thus our leading eigenvalue provides a simple check for the termination of the subdivision process: if the leading eigenvalue is zero, which is the smallest value it can take, then the subgraph is indivisible.
Note, however, that while the absence of positive eigenvalues is a
sufficient condition for indivisibility, it is not a necessary one. In particular, if there are only small positive eigenvalues and large negative ones, the terms in Eq. (3) for negative $\beta_i$ may outweigh those for positive. It is straightforward to guard against this possibility, however: we simply calculate the modularity contribution for each proposed split directly and confirm that it is greater than zero.
Thus our algorithm is as follows. We construct the modularity matrix for our network and find its leading (most positive) eigenvalue and eigenvector. We divide the network into two parts according to the signs of the elements of this vector, and then repeat for each of the parts. If at any stage we find that the proposed split makes a zero or negative contribution to the total modularity, we leave the corresponding subgraph undivided. When the entire network has been decomposed into indivisible subgraphs in this way, the algorithm ends.
One immediate corollary of this approach is that all “communities” in the network are, by definition, indivisible subgraphs. A number of authors have in the past proposed formal definitions of what a community is [9, 16, 24]. The present method provides an alternative, first-principles definition of a community as an indivisible subgraph.
Further techniques for modularity maximization
In this section we describe briefly another method we have investigated for dividing networks in two by modularity optimization, which is entirely different from our spectral method. Although not of especial interest on its own, this second method is, as we will shortly show, very effective when combined with the spectral method.
Let us start with some initial division of our vertices into two groups: the most obvious choice is simply to place all vertices in one of the groups and no vertices in the other. Then we proceed as follows. We find among the vertices the one that, when moved to the other group, will give the biggest increase in the modularity of the complete network, or the smallest decrease if no increase is possible. We make such moves repeatedly, with the constraint that each vertex is moved only once. When all $n$ vertices have been moved, we search the set of intermediate states occupied by the network during the operation of the algorithm to find the state that has the greatest modularity. Starting again from this state, we repeat the entire process iteratively until no further improvement in the modularity results. Those familiar with the literature on graph partitioning may find this algorithm reminiscent of the Kernighan–Lin algorithm [25], and indeed the Kernighan–Lin algorithm provided the inspiration for our method.
Despite its simplicity, we find that this method works moderately well. It is not competitive with the best previous methods, but it gives respectable modularity values in the trial applications we have made. However, the method really comes into its own when it is used in combination with the spectral method introduced earlier. It is a common approach in standard graph partitioning problems to use spectral partitioning based on the graph Laplacian to give an initial broad division of a network into two parts, and then refine that division using the Kernighan–Lin algorithm. For community structure problems we find that the equivalent joint strategy works very well. Our spectral approach based on the leading eigenvector of the modularity matrix gives an excellent guide to the general form that the communities should take and this general form can then be fine-tuned by our vertex moving method, to reach the best possible modularity value. The whole procedure is repeated to subdivide the network until every remaining subgraph is indivisible, and no further improvement in the modularity is possible.
Typically, the fine-tuning stages of the algorithm add only a few percent to the final value of the modularity, but those few percent are enough to make the difference between a method that is merely good and one that is, as we will
see, exceptional.
Example applications
In practice, the algorithm developed here gives excellent results. For a quantitative comparison between our algorithm and others we follow Duch and Arenas [19] and compare values of the modularity for a variety of networks drawn from the literature. Results are shown in Table I for six different networks, the exact same six as used by Duch and Arenas. We compare modularity figures against three previously published algorithms: the betweenness-based algorithm of Girvan and Newman [10], which is widely used and has been incorporated into some of the more popular network analysis programs (denoted GN in the table); the fast algorithm of Clauset et al. [26] (CNM), which optimizes modularity using a greedy algorithm; and the extremal optimization algorithm of Duch and Arenas [19] (DA), which is arguably the best previously existing method, by standard measures, if one discounts methods impractical for large networks, such as exhaustive enumeration of all partitions or simulated annealing.
The table reveals some interesting patterns. Our algorithm clearly outperforms the methods of Girvan and Newman and of Clauset et al. for all the networks in the task of optimizing the modularity. The extremal optimization method on the other hand is more competitive. For the smaller networks, up to around a thousand vertices, there is essentially no difference in performance between our method and extremal optimization; the modularity values for the divisions found by the two algorithms differ by no more than a few parts in a thousand for any given network. For larger networks, however, our algorithm does better than extremal optimization, and furthermore the gap widens as network size increases, to a maximum modularity difference of about 6% for the largest network studied. For the very large networks that have been of particular interest in the last few years, therefore, it appears that our method for detecting community structure may be the most effective of the methods considered here.
The modularity values given in Table I provide a useful quantitative measure of the success of our algorithm when applied to real-world problems. It is worthwhile, however, also to confirm that it returns sensible divisions of networks in practice. We have given one example demonstrating such a division in Fig. 2. We have also checked our method against many of the example networks used in previous studies [10, 17]. Here we give two more examples, both involving network representations [...]
TABLE I: Modularity Q by network for the GN, CNM and DA algorithms and for this paper. Network sizes: 34, 198, 453, 1133, 10680 and 27519 vertices.
[...] maximal value of the quantity known as modularity over possible divisions of a network. We have shown that this problem can be rewritten in terms of the eigenvalues and eigenvectors of a matrix we call the modularity matrix, and by exploiting this transformation we have created a new computer algorithm for community detection that demonstrably outperforms the best previous general-purpose algorithms in terms of both quality of results and speed of execution. We have applied our algorithm to a variety of real-world network data sets, including social and biological examples, showing it to give both intuitively reasonable divisions of networks and quantitatively better results as measured by the
modularity.
Acknowledgments
The author would like to thank Lada Adamic, Alex Arenas, and Valdis Krebs for providing network data and for useful comments and suggestions. This work was funded in part by the National Science Foundation under grant number DMS–0234188 and by the James S. McDonnell Foundation.
[1] D. J. Watts and S. H. Strogatz, Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998).
[2] A.-L. Barabási and R. Albert, Emergence of scaling in random networks. Science 286, 509–512 (1999).
[3] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, Network motifs: Simple building blocks of complex networks. Science 298, 824–827 (2002).
[4] R. Albert and A.-L. Barabási, Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002).
[5] S. N. Dorogovtsev and J. F. F. Mendes, Evolution of networks. Advances in Physics 51, 1079–1187 (2002).
[6] M. E. J. Newman, The structure and function of complex networks. SIAM Review 45, 167–256 (2003).
[7] M. E. J. Newman, Detecting community structure in networks. Eur. Phys. J. B 38, 321–330 (2004).
[8] L. Danon, J. Duch, A. Diaz-Guilera, and A. Arenas, Comparing community structure identification. J. Stat. Mech. p. P09008 (2005).
[9] G. W. Flake, S. R. Lawrence, C. L. Giles, and F. M. Coetzee, Self-organization and identification of Web communities. IEEE Computer 35, 66–71 (2002).
[10] M. Girvan and M. E. J. Newman, Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99, 7821–7826 (2002).
[11] P. Holme, M. Huss, and H. Jeong, Subnetwork hierarchies of biochemical pathways. Bioinformatics 19, 532–538 (2003).
[12] R. Guimerà and L. A. N. Amaral, Functional cartography of complex metabolic networks. Nature 433, 895–900 (2005).
[13] U. Elsner, Graph partitioning - a survey. Technical Report 97-27, Technische Universität Chemnitz (1997).
[14] P.-O. Fjällström, Algorithms for graph partitioning: A survey. Linköping Electronic Articles in Computer and Information Science 3(10) (1998).
[15] H. C. White, S. A. Boorman, and R. L. Breiger, Social structure from multiple networks: I. Blockmodels of roles and positions. Am. J. Sociol. 81, 730–779 (1976).
[16] S. Wasserman and K. Faust, Social Network Analysis. Cambridge University Press, Cambridge (1994).
[17] M. E. J. Newman and M. Girvan, Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004).
[18] M. E. J. Newman, Fast algorithm for detecting community structure in networks. Phys. Rev. E 69, 066133 (2004).
[19] J. Duch and A. Arenas, Community detection in complex networks using extremal optimization. Phys. Rev. E 72, 027104 (2005).
[20] F. R. K. Chung, Spectral Graph Theory. Number 92 in CBMS Regional Conference Series in Mathematics, American Mathematical Society, Providence, RI (1997).
[21] M. Fiedler, Algebraic connectivity of graphs. Czech. Math. J. 23, 298–305 (1973).
[22] A. Pothen, H. Simon, and K.-P. Liou, Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl. 11, 430–452 (1990).
[23] W. W. Zachary, An information flow model for conflict and fission in small groups. Journal of Anthropological Research 33, 452–473 (1977).
[24] F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, and D. Parisi, Defining and identifying communities in networks. Proc. Natl. Acad. Sci. USA 101, 2658–2663 (2004).
[25] B. W. Kernighan and S. Lin, An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal 49, 291–307 (1970).
[26] A. Clauset, M. E. J. Newman, and C. Moore, Finding community structure in very large networks. Phys. Rev. E 70, 066111 (2004).
[27] P. Gleiser and L. Danon, Community structure in jazz. Advances in Complex Systems 6, 565–573 (2003).
[28] H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A.-L. Barabási, The large-scale organization of metabolic networks. Nature 407, 651–654 (2000).
[29] H. Ebel, L.-I. Mielsch, and S. Bornholdt, Scale-free topology of e-mail networks. Phys. Rev. E 66, 035103 (2002).
[30] X. Guardiola, R. Guimerà, A. Arenas, A. Diaz-Guilera, D. Streib, and L. A. N. Amaral, Macro- and micro-structure of trust networks. Preprint cond-mat/0206240 (2002).
[31] M. E. J. Newman, The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. USA 98, 404–409 (2001).
[32] L. A. Adamic and N. Glance, The political blogosphere and the 2004 US election. In Proceedings of the WWW-2005 Workshop on the Weblogging Ecosystem (2005).

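Introduction

Chapter 1
Principles of Neural Networks
Characteristics of synaptic information transmission

1. Delay: 0.3–1 ms
2. Integration: summation over time and space
3. Type: excitatory or inhibitory
4. Pulse-to-potential conversion (D/A function)
5. Speed: 1–150 m/s
6. Refractory period (dead time): 3–5 ms
7. Irreversibility (one-way transmission)
8. Plasticity: connection strength can change, which makes learning possible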
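Characteristics of fully interconnected networks

Every unit is connected to every other unit. Examples: the Hopfield network and the Boltzmann machine.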
The core problem in ANN research: how to determine the weights (weighting coefficients)
Introduction to learning rules
Learning rules

1) Computed directly by design, e.g. a Hopfield network used for optimization
2) Obtained by learning, i.e. through training
Chapter 1: Introduction

Types of neural networks
Biological neural networks
Artificial neural networks
1.1 Biological neurons and biological neural networks
1.2 Artificial neural networks
1.3 Development of ANN
1.4 ANN and automatic control (AC)

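Common learning rules
a) Hebbian learning
$\Delta w_{ij} = \eta\, x_i x_j$
Proposed by D. Hebb in 1949: when two units are excited at the same time, the synaptic connection between them is strengthened.
b) Delta (δ) learning rule: an error-correction rule based on the gradient method (backpropagation, BP, is one example).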
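The two learning rules above can be sketched in a few lines of Python. This is only an illustration under assumptions made here: the function names, the learning rate `eta`, and the use of a single linear unit for the delta rule are not taken from the slides.

```python
def hebb_update(w, x_i, x_j, eta=0.1):
    """Hebbian rule: strengthen the weight when both units are active together."""
    return w + eta * x_i * x_j


def delta_rule_step(weights, inputs, target, eta=0.1):
    """One delta-rule (error-correction) step for a single linear unit.

    The output is the weighted sum of the inputs; each weight moves along the
    negative gradient of the squared error (target - output)^2 / 2.
    """
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    return [w + eta * error * x for w, x in zip(weights, inputs)]


if __name__ == "__main__":
    # Hebbian update: two co-active units increase their connection weight.
    print(hebb_update(0.0, 1.0, 1.0))

    # Delta rule: repeatedly correct the weights toward a target output.
    weights = [0.0, 0.0]
    for _ in range(60):
        weights = delta_rule_step(weights, [1.0, 2.0], target=3.0)
    print(weights)
```

The Hebbian update uses only the activities of the two connected units, whereas the delta rule needs a target value, which is why it counts as a supervised, error-correcting rule.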

Methods for batch lookup of genes near specific base positions

One approach to identifying genes in the vicinity of specific genomic positions is to use bioinformatic tools such as genome browsers and annotation databases. These resources provide comprehensive information about gene locations, structures, and functions.

Bioinformatic tools such as the UCSC Genome Browser or Ensembl allow users to input genomic coordinates or search for genes based on specific criteria, such as proximity to certain genomic features or known genetic variants. They display the genomic region of interest along with annotated features such as exons, introns, promoters, transcription start sites, and regulatory regions.
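Once gene coordinates have been exported from such a resource, the batch lookup itself is an interval-overlap query. The sketch below is a minimal illustration, assuming the annotation is already available as `(chromosome, start, end, name)` tuples. The function name `genes_near`, the `window` parameter, and the gene names are hypothetical and do not correspond to any browser's API.

```python
from collections import defaultdict


def genes_near(positions, genes, window=10_000):
    """Return, for each (chrom, pos), the genes overlapping pos +/- window.

    positions: iterable of (chrom, pos) tuples
    genes:     iterable of (chrom, start, end, name) tuples (hypothetical data)
    """
    # Group genes by chromosome so each query scans only the relevant genes.
    by_chrom = defaultdict(list)
    for chrom, start, end, name in genes:
        by_chrom[chrom].append((start, end, name))

    hits = {}
    for chrom, pos in positions:
        lo, hi = pos - window, pos + window
        # A gene is "near" if its interval overlaps [lo, hi].
        hits[(chrom, pos)] = [
            name for start, end, name in by_chrom[chrom]
            if start <= hi and end >= lo
        ]
    return hits


if __name__ == "__main__":
    example_genes = [
        ("chr1", 1_000, 5_000, "GENE_A"),
        ("chr1", 20_000, 30_000, "GENE_B"),
        ("chr2", 500, 900, "GENE_C"),
    ]
    queries = [("chr1", 6_000), ("chr2", 100_000)]
    print(genes_near(queries, example_genes, window=2_000))
```

A linear scan per chromosome is enough for small inputs. For genome-scale annotations, a sorted index or an interval tree would be the more usual choice.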

the spread of epidemic disease on networks


Spread of epidemic disease on networks

M. E. J. Newman
Center for the Study of Complex Systems, University of Michigan, Ann Arbor, Michigan 48109-1120
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico 87501
(Received 4 December 2001; published 26 July 2002)
Physical Review E 66, 016128 (2002)

The study of social networks, and in particular the spread of disease on networks, has attracted considerable recent attention in the physics community. In this paper, we show that a large class of standard epidemiological models, the so-called susceptible/infective/removed (SIR) models can be solved exactly on a wide variety of networks. In addition to the standard but unrealistic case of fixed infectiveness time and fixed and uncorrelated probability of transmission between all pairs of individuals, we solve cases in which times and probabilities are nonuniform and correlated. We also consider one simple case of an epidemic in a structured population, that of a sexually transmitted disease in a population divided into men and women. We confirm the correctness of our exact solutions with numerical simulations of SIR epidemics on networks.

DOI: 10.1103/PhysRevE.66.016128
PACS number(s): 89.75.Hc, 87.23.Ge, 05.70.Fh, 64.60.Ak

I. INTRODUCTION

Many diseases spread through human populations by contact between infective individuals (those carrying the disease) and susceptible individuals (those who do not have the disease yet, but can catch it). The pattern of these disease-causing contacts forms a network. In this paper we investigate the effect of network topology on the rate and pattern of disease spread.

Most mathematical studies of disease propagation make the assumption that populations are "fully mixed," meaning that an infective individual is equally likely to spread the disease to any other member of the population or subpopulation to which they belong [1–3]. In the limit of large population size this assumption allows one to write down nonlinear differential equations governing, for example, numbers of infective individuals as a function of
time, from which solutions for quantities of interest can be derived, such as typical sizes of outbreaks and whether or not epidemics occur. (Epidemics are defined as outbreaks that affect a nonzero fraction of the population in the limit of large system size.) Epidemic behavior usually shows a phase transition with the parameters of the model: a sudden transition from a regime without epidemics to one with. This transition happens as the "reproductive ratio" $R_0$ of the disease, which is the fractional increase per unit time in the number of infective individuals, passes through one.

Within the class of fully mixed models much elaboration is possible, particularly concerning the effects of age structure in the population, and population turnover. The crucial element however that all such models lack is network topology. It is obvious that a given infective individual does not have equal probability of infecting all others; in the real world each individual only has contact with a small fraction of the total population, although the number of contacts that people have can vary greatly from one person to another. The fully mixed approximation is made primarily in order to allow the modeler to write down differential equations. For most diseases it is not an accurate representation of real contact patterns.

In recent years a large body of research, particularly within the statistical physics community, has addressed the topological properties of networks of various kinds, from both theoretical and empirical points of view, and studied the effects of topology on processes taking place on those networks [4,5]. Social networks [6–9], technological networks [10–13], and biological networks [14–18] have all been examined and modeled in some detail. Building on insights gained from this work, a number of authors have pursued a mathematical theory of the spread of disease on networks [19–24]. This is also the topic of the present paper, in which we show that a large class of standard epidemiological
models can be solved exactly on networks using ideas drawn from percolation theory.

The outline of the paper is as follows. In Sec. II we introduce the models studied. In Sec. III we show how percolation ideas and generating function methods can be used to provide exact solutions of these models on simple networks with uncorrelated transmission probabilities. In Sec. IV we extend these solutions to cases in which probabilities of transmission are correlated, and in Sec. V to networks representing some types of structured populations. In Sec. VI we give our conclusions.

II. EPIDEMIC MODELS AND PERCOLATION

The most widely studied class of epidemic models, and the one on which we focus in this paper, is the class of susceptible/infective/removed or SIR models. The original and simplest SIR model, first formulated (though never published) by Lowell Reed and Wade Hampton Frost in the 1920s, is as follows. A population of $N$ individuals is divided into three states: susceptible (S), infective (I), and removed (R). In this context "removed" means individuals who are either recovered from the disease and immune to further infection, or dead. (Some researchers consider the R to stand for "recovered" or "refractory." Either way, the meaning is the same.) Infective individuals have contacts with randomly chosen individuals of all states at an average rate $\beta$ per unit time, and recover and acquire immunity (or die) at an average rate $\gamma$ per unit time. If those with whom infective individuals have contact are themselves in the susceptible state, then they become infected. In the limit of large $N$ this model is governed by the coupled nonlinear differential equations [1]:

$$\frac{ds}{dt} = -\beta i s, \qquad \frac{di}{dt} = \beta i s - \gamma i, \qquad \frac{dr}{dt} = \gamma i, \tag{1}$$

where $s(t)$, $i(t)$, and $r(t)$ are the fractions of the population in each of the three states, and the last equation is redundant, since $s + i + r = 1$ necessarily at all times. This model is appropriate for a rapidly spreading disease that confers immunity on its survivors, such as influenza. In this
paper we will consider only diseases of this type. Diseases that are endemic because they propagate on time scales comparable to or slower than the rate of turnover of the population, or because they confer only temporary immunity, are not well represented by this model; other models have been developed for these cases [3].

The model described above assumes that the population is fully mixed, meaning that the individuals with whom a susceptible individual has contact are chosen at random from the whole population. It also assumes that all individuals have approximately the same number of contacts in the same time, and that all contacts transmit the disease with the same probability. In real life none of these assumptions is correct, and they are all grossly inaccurate in at least some cases. In the work presented here we remove these assumptions by a series of modifications of the model.

First, as many others have done, we replace the "fully mixed" aspect with a network of connections between individuals [19–28]. Individuals have disease-causing contacts only along the links in this network. We distinguish here between "connections" and actual contacts. Connections between pairs of individuals predispose those individuals to disease-causing contact, but do not guarantee it. An individual's connections are the set of people with whom the individual may have contact during the time he or she is infective: people that the individual lives with, works with, sits next to on the bus, and so forth.

We can vary the number of connections each person has with others by choosing a particular degree distribution for the network. (Recall that the degree of a vertex in a network is the number of other vertices to which it is attached.) For example, in the case of sexual contacts, which can communicate sexually transmitted diseases, the degree distribution has been found to follow a power-law form [8]. By placing the model on a network with a power-law degree distribution we can emulate this effect in
our model.

Our second modification of the model is to allow the probability of disease-causing contact between pairs of individuals who have a connection to vary, so that some pairs have higher probability of disease transmission than others.

Consider a pair of individuals who are connected, one of whom $i$ is infective and the other $j$ susceptible. Suppose that the average rate of disease-causing contacts between them is $r_{ij}$, and that the infective individual remains infective for a time $\tau_i$. Then the probability $1 - T_{ij}$ that the disease will not be transmitted from $i$ to $j$ is

$$1 - T_{ij} = \lim_{\delta t \to 0} (1 - r_{ij}\,\delta t)^{\tau_i/\delta t} = e^{-r_{ij}\tau_i}, \tag{2}$$

and the probability of transmission is

$$T_{ij} = 1 - e^{-r_{ij}\tau_i}. \tag{3}$$

Some models, particularly computer simulations, use discrete time steps rather than continuous time, in which case instead of taking the limit in Eq. (2) we simply set $\delta t = 1$, giving

$$T_{ij} = 1 - (1 - r_{ij})^{\tau_i}, \tag{4}$$

where $\tau$ is measured in time steps.

In general $r_{ij}$ and $\tau_i$ will vary between individuals, so that the probability of transmission also varies. Let us assume initially that these two quantities are independent identically distributed (iid) random variables chosen from some appropriate distributions $P(r)$ and $P(\tau)$. (We will relax this assumption later.) The rate $r_{ij}$ need not be symmetric; the probability of transmission in either direction might not be the same. In any case, $T_{ij}$ is in general not symmetric because of the appearance of $\tau_i$ in Eqs. (3) and (4).

Now here is the trick: because $r_{ij}$ and $\tau_i$ are iid random variables, so is $T_{ij}$, and hence the a priori probability of transmission of the disease between two individuals is simply the average $T$ of $T_{ij}$ over the distributions $P(r)$ and $P(\tau)$, which is

$$T = \langle T_{ij} \rangle = 1 - \int_0^\infty dr\, d\tau\, P(r) P(\tau)\, e^{-r\tau} \tag{5}$$

for the continuous time case or

$$T = 1 - \int_0^\infty dr \sum_{\tau=0}^{\infty} P(r) P(\tau) (1 - r)^{\tau} \tag{6}$$

for the discrete case [23]. We call $T$ the "transmissibility" of the disease. It is necessarily always in the range $0 \le T \le 1$.

Thus the fact that individual transmission probabilities vary makes no difference whatsoever; in the population as
a whole the disease will propagate as if all transmission probabilities were equal to $T$. We demonstrate the truth of this result by explicit simulation in Sec. III E. It is this result that makes our models solvable. Cases in which the variables $r$ and $\tau$ are not iid are trickier, but, as we will show, these are sometimes solvable as well.

We note further that more complex disease transmission models, such as SEIR models in which there is an infected-but-not-infective period (E), are also covered by this formalism. The transmissibility $T_{ij}$ is essentially just the integrated probability of transmission of the disease between two individuals. The precise temporal behavior of infectivity and other variables is unimportant. Indeed the model can be generalized to include any temporal variation in infectivity of the infective individuals, and transmission can still be represented correctly by a simple transmissibility variable $T$, as above.

Now imagine watching an outbreak of the disease, which starts with a single infective individual, spreading across our network. If we were to mark or "occupy" each edge in the graph across which the disease is transmitted, which happens with probability $T$, the ultimate size of the outbreak would be precisely the size of the cluster of vertices that can be reached from the initial vertex by traversing only occupied edges. Thus, the model is precisely equivalent to a bond percolation model with bond occupation probability $T$ on the graph representing the community. The connection between the spread of disease and percolation was in fact one of the original motivations for the percolation model itself [29], but seems to have been formulated in the manner presented here first by Grassberger [30] for the case of uniform $r$ and $\tau$, and by Warren et al. [23,24] for the nonuniform case.

In the following section we show how the percolation problem can be solved on random graphs with arbitrary degree distributions, giving exact
solutions for the typical size of outbreaks, presence of an epidemic, size of the epidemic (if there is one), and a number of other quantities of interest.

III. EXACT SOLUTIONS ON NETWORKS WITH ARBITRARY DEGREE DISTRIBUTIONS

One of the most important results to come out of empirical work on networks is the finding that the degree distributions of many networks are highly right skewed. In other words, most vertices have only a low degree, but there are a small number whose degree is very high [5,7,11,31]. The network of sexual contacts discussed above provides one example of such a distribution [8]. It is known that the presence of highly connected vertices can have a disproportionate effect on certain properties of the network. Recent work suggests that the same may be true for disease propagation on networks [21,32], and so it will be important that we incorporate nontrivial degree distributions in our models. As a first illustration of our method therefore, we look at a simple class of unipartite graphs studied previously by a variety of authors [33–42], in which the degree distribution is specified, but the graph is in other respects random.

Our graphs are simply defined. One specifies the degree distribution by giving the properly normalized probabilities $p_k$ that a randomly chosen vertex has degree $k$. A set of $N$ degrees $\{k_i\}$, also called a degree sequence, is then drawn from this distribution and each of the $N$ vertices in the graph is given the appropriate number $k_i$ of "stubs" (ends of edges emerging from it). Pairs of these stubs are then chosen at random and connected together to form complete edges.
Pairing of stubs continues until none are left. (If an odd number of stubs is by chance generated, complete pairing is not possible, in which case we discard one $k_i$ and draw another until an even number is achieved.) This technique guarantees that the graph generated is chosen uniformly at random from the set of all graphs with the selected degree sequence. All the results given in this section are averaged over the ensemble of possible graphs generated in this way, in the limit of large graph size.

A. Generating functions

We wish then to solve for the average behavior of graphs of this type under bond percolation with bond occupation probability $T$. We will do this using generating function techniques [43]. Following Newman et al. [36], we define a generating function for the degree distribution thus

$$G_0(x) = \sum_{k=0}^{\infty} p_k x^k. \tag{7}$$

Note that $G_0(1) = \sum_k p_k = 1$ if $p_k$ is a properly normalized probability distribution.

This function encapsulates all of the information about the degree distribution. Given it, we can easily reconstruct the distribution by repeated differentiation

$$p_k = \frac{1}{k!} \left. \frac{d^k G_0}{dx^k} \right|_{x=0}. \tag{8}$$

We say that the generating function $G_0$ "generates" the distribution $p_k$. The generating function is easier to work with than the degree distribution itself because of two crucial properties.

Powers. If the distribution of a property $k$ of an object is generated by a given generating function, then the distribution of the sum of $k$ over $m$ independent realizations of the object is generated by the $m$th power of that generating function. For example, if we choose $m$ vertices at random from a large graph, then the distribution of the sum of the degrees of those vertices is generated by $[G_0(x)]^m$.

Moments. The mean of the probability distribution generated by a generating function is given by the first derivative of the generating function, evaluated at 1. For instance, the mean degree $z$ of a vertex in our network is given by

$$z = \langle k \rangle = \sum_k k p_k = G_0'(1). \tag{9}$$

Higher moments of the distribution can be calculated from higher
derivatives also. In general, we have

$$\langle k^n \rangle = \sum_k k^n p_k = \left[ \left( x \frac{d}{dx} \right)^{n} G_0(x) \right]_{x=1}. \tag{10}$$

A further observation that will also prove crucial is the following. While $G_0$ above correctly generates the distribution of degrees of randomly chosen vertices in our graph, a different generating function is needed for the distribution of the degrees of vertices reached by following a randomly chosen edge. If we follow an edge to the vertex at one of its ends, then that vertex is more likely to be of higher degree than is a randomly chosen vertex, since high-degree vertices have more edges attached to them than low-degree ones. The distribution of degrees of the vertices reached by following edges is proportional to $k p_k$, and hence the generating function for those degrees is

$$\frac{\sum_k k p_k x^k}{\sum_k k p_k} = x \frac{G_0'(x)}{G_0'(1)}. \tag{11}$$

In general we will be concerned with the number of ways of leaving such a vertex excluding the edge we arrived along, which is the degree minus 1. To allow for this, we simply divide the function above by one power of $x$, thus arriving at a new generating function

$$G_1(x) = \frac{G_0'(x)}{G_0'(1)} = \frac{1}{z} G_0'(x), \tag{12}$$

where $z$ is the average vertex degree, as before.

In order to solve the percolation problem, we will also need generating functions $G_0(x;T)$ and $G_1(x;T)$ for the distribution of the number of occupied edges attached to a vertex, as a function of the transmissibility $T$. These are simple to derive. The probability of a vertex having exactly $m$ of the $k$ edges emerging from it occupied is given by the binomial distribution $\binom{k}{m} T^m (1-T)^{k-m}$, and hence the probability distribution of $m$ is generated by

$$G_0(x;T) = \sum_{m=0}^{\infty} \sum_{k=m}^{\infty} p_k \binom{k}{m} T^m (1-T)^{k-m} x^m = \sum_{k=0}^{\infty} p_k \sum_{m=0}^{k} \binom{k}{m} (xT)^m (1-T)^{k-m} = \sum_{k=0}^{\infty} p_k (1 - T + xT)^k = G_0\bigl(1 + (x-1)T\bigr). \tag{13}$$

Similarly, the probability distribution of occupied edges leaving a vertex arrived at by following a randomly chosen edge is generated by

$$G_1(x;T) = G_1\bigl(1 + (x-1)T\bigr). \tag{14}$$

Note that, in our notation

$$G_0(x;1) = G_0(x), \tag{15a}$$

$$G_0(1;T) = G_0(1), \tag{15b}$$

$$G_0'(1;T) = T G_0'(1), \tag{15c}$$

and similarly for $G_1$. [$G_0'(x;T)$ here represents the derivative of $G_0(x;T)$ with respect to its first argument.]

B. Outbreak size distribution

The first quantity we will work out is the distribution $P_s(T)$ of the sizes $s$ of outbreaks of the disease on our network, which is also the distribution of sizes of clusters of vertices connected together by occupied edges in the corresponding percolation model. Let $H_0(x;T)$ be the generating function for this distribution,

$$H_0(x;T) = \sum_{s=0}^{\infty} P_s(T)\, x^s. \tag{16}$$

By analogy with the preceding section we also define $H_1(x;T)$ to be the generating function for the cluster of connected vertices we reach by following a randomly chosen edge.

Now, following Ref. [36], we observe that $H_1$ can be broken down into an additive set of contributions as follows. The cluster reached by following an edge may be: (1) a single vertex with no occupied edges attached to it, other than the one along which we passed in order to reach it; (2) a single vertex attached to any number $m \ge 1$ of occupied edges other than the one we reached it by, each leading to another cluster whose size distribution is also generated by $H_1$.

We further note that the chance that any two finite clusters that are attached to the same vertex will have an edge connecting them together directly goes as $N^{-1}$ with the size $N$ of the graph, and hence is zero in the limit $N \to \infty$. In other words, there are no loops in our clusters; their structure is entirely treelike.

Using these results, we can express $H_1(x;T)$ in a Dyson-equation-like self-consistent form thus

$$H_1(x;T) = x\, G_1\bigl(H_1(x;T);T\bigr). \tag{17}$$

Then the size of the cluster reachable from a randomly chosen starting vertex is distributed according to

$$H_0(x;T) = x\, G_0\bigl(H_1(x;T);T\bigr). \tag{18}$$

It is straightforward to verify that for the special case $T = 1$ of 100% transmissibility, these equations reduce to those given in Ref. [36] for component size in random graphs with arbitrary degree
distributions. Equations (17) and (18) provide the solution for the more general case of finite transmissibility which applies to SIR models.

Once we have $H_0(x;T)$, we can extract the probability distribution of clusters $P_s(T)$ by differentiation using Eq. (8) on $H_0$. In most cases however it is not possible to find arbitrary derivatives of $H_0$ in closed form. Instead we typically evaluate them numerically. Since direct evaluation of numerical derivatives is prone to machine precision problems, we recommend evaluating the derivatives by numerical contour integration using the Cauchy formula

$$P_s(T) = \frac{1}{s!} \left. \frac{d^s H_0}{dx^s} \right|_{x=0} = \frac{1}{2\pi i} \oint \frac{H_0(\zeta;T)}{\zeta^{s+1}}\, d\zeta, \tag{19}$$

where the integral is over the unit circle [44]. It is possible to find the first thousand derivatives of a function without difficulty using this method [36]. By this method then, we can find the exact probability $P_s$ that a particular outbreak of our disease will infect $s$ people in total, as a function of the transmissibility $T$.

C. Outbreak sizes and the epidemic transition

Although in general we must use numerical methods to find the complete distribution $P_s$ of outbreak sizes from Eq. (19), we can find the mean outbreak size in closed form. Using Eq. (9), we have

$$\langle s \rangle = H_0'(1;T) = 1 + G_0'(1;T)\, H_1'(1;T), \tag{20}$$

where we have made use of the fact that the generating functions are 1 at $x = 1$ if the distributions that they generate are properly normalized. Differentiating Eq. (17), we have

$$H_1'(1;T) = 1 + G_1'(1;T)\, H_1'(1;T) = \frac{1}{1 - G_1'(1;T)}, \tag{21}$$

and hence

$$\langle s \rangle = 1 + \frac{G_0'(1;T)}{1 - G_1'(1;T)} = 1 + \frac{T G_0'(1)}{1 - T G_1'(1)}. \tag{22}$$

Given Eqs. (7), (12), (13), and (14), we can then evaluate this expression to get the mean outbreak size for any value of $T$ and degree distribution.

We note that Eq. (22) diverges when $T G_1'(1) = 1$. This point marks the onset of an epidemic; it is the point at which the typical outbreak ceases to be confined to a finite number of individuals, and expands to fill an extensive fraction of the graph. The transition takes place when $T$ is equal to the
critical transmissibility $T_c$, given by

$$T_c = \frac{1}{G_1'(1)} = \frac{G_0'(1)}{G_0''(1)} = \frac{\sum_k k p_k}{\sum_k k(k-1) p_k}. \tag{23}$$

For $T > T_c$, we have an epidemic, or "giant component" in the language of percolation. We can calculate the size of this epidemic as follows. Above the epidemic threshold Eq. (17) is no longer valid because the giant component is extensive and therefore can contain loops, which destroys the assumptions on which Eq. (17) was based. The equation is valid however if we redefine $H_0$ to be the generating function only for outbreaks other than epidemic outbreaks, i.e., isolated clusters of vertices that are not connected to the giant component. These however do not fill the entire graph, but only the portion of it not affected by the epidemic. Thus, above the epidemic transition, we have

$$H_0(1;T) = \sum_s P_s = 1 - S(T), \tag{24}$$

where $S(T)$ is the fraction of the population affected by the epidemic. Rearranging Eq. (24) for $S$ and making use of Eq. (18), we find that the size of the epidemic is

$$S(T) = 1 - G_0(u;T), \tag{25}$$

where $u \equiv H_1(1;T)$ is the solution of the self-consistency relation

$$u = G_1(u;T). \tag{26}$$

Results equivalent to Eqs. (22)–(26) were given previously in a different context in Ref. [40].

Note that it is not the case, even above $T_c$, that all outbreaks give rise to epidemics of the disease. There are still finite outbreaks even in the epidemic regime. While this appears very natural, it stands nonetheless in contrast to the standard fully mixed models, for which all outbreaks give rise to epidemics above the epidemic transition point. In the present case, the probability of an outbreak becoming an epidemic at a given $T$ is simply equal to $S(T)$.

D. Degree of infected individuals

The quantity $u$ defined in Eq. (26) has a simple interpretation: it is the probability that the vertex at the end of a randomly chosen edge remains uninfected during an epidemic (i.e., that it belongs to one of the finite components). The probability that a vertex does not become infected via one of its edges is thus $v = 1 - T + Tu$, which is the sum of the probability $1 - T$ that the edge is
unoccupied, and the probability $Tu$ that it is occupied but connects to an uninfected vertex. The total probability of being uninfected if a vertex has degree $k$ is $v^k$, and the probability of having degree $k$ given that a vertex is uninfected is $p_k v^k / \sum_k p_k v^k = p_k v^k / G_0(v)$, which distribution is generated by the function $G_0(vx)/G_0(v)$. Differentiating and setting $x = 1$, we then find that the average degree $z_{\text{out}}$ of vertices outside the giant component is

$$z_{\text{out}} = \frac{v\, G_0'(v)}{G_0(v)} = \frac{v\, G_1(v)}{G_0(v)}\, z = \frac{u\,[1 - T + Tu]}{1 - S}\, z. \tag{27}$$

Similarly the degree distribution for an infected vertex is generated by $[G_0(x) - G_0(vx)]/[1 - G_0(v)]$, which gives a mean degree $z_{\text{in}}$ for vertices in the giant component of

$$z_{\text{in}} = \frac{1 - v\, G_1(v)}{1 - G_0(v)}\, z = \frac{1 - u\,[1 - T + Tu]}{S}\, z. \tag{28}$$

Note that $1 - S = G_0(u;T) \le u$, since all coefficients of $G_0(x;T)$ are by definition positive (because they form a probability distribution) and hence $G_0(x;T)$ has only positive derivatives, meaning that it is convex everywhere on the positive real line within its domain of convergence. Thus, from Eq. (27), $z_{\text{out}} \le z$. Similarly, $z_{\text{in}} \ge z$, and hence, as we would expect, the mean degree of infected individuals is always greater than or equal to the mean degree of uninfected ones. Indeed, the probability of a vertex being infected, given that it has degree $k$, goes as $1 - v^k = 1 - e^{-k \ln(1/v)}$, i.e., tends exponentially to unity as degree becomes large.

E. An example

Let us now look at an application of these results to a specific example of disease spreading. First of all we need to define our network of connections between individuals, which means choosing a degree distribution. Here we will consider graphs with the degree distribution

$$p_k = \begin{cases} 0 & \text{for } k = 0, \\ C k^{-\alpha} e^{-k/\kappa} & \text{for } k \ge 1, \end{cases} \tag{29}$$

where $C$, $\alpha$, and $\kappa$ are constants. In other words, the distribution is a power-law of exponent $\alpha$ with an exponential cutoff around degree $\kappa$. This distribution has been studied before by various authors [7,36,37,40]. It makes a good example for a number of reasons: (1) distributions of
this form are seen in a variety of real-world networks [7,45]; (2) it includes pure power-law and pure exponential distributions, both of which are also seen in various networks [7,11,12,31], as special cases when $\kappa \to \infty$ or $\alpha \to 0$; (3) it is normalizable and has all moments finite for any finite $\kappa$.

The constant $C$ is fixed by the requirement of normalization, which gives $C = [\mathrm{Li}_\alpha(e^{-1/\kappa})]^{-1}$ and hence

$$p_k = \frac{k^{-\alpha} e^{-k/\kappa}}{\mathrm{Li}_\alpha(e^{-1/\kappa})} \quad \text{for } k \ge 1, \tag{30}$$

where $\mathrm{Li}_n(x)$ is the $n$th polylogarithm of $x$.

We also need to choose the distributions $P(r)$ and $P(\tau)$ for the transmission rate and the time spent in the infective state. For the sake of easier comparison with computer simulations we use discrete time and choose both distributions to be uniform, with $r$ real in the range $0 \le r < r_{\max}$ and $\tau$ integer in the range $1 \le \tau \le \tau_{\max}$. The transmissibility $T$ is then given by Eq. (6).

From Eq. (30), we have

$$G_0(x) = \frac{\mathrm{Li}_\alpha(x e^{-1/\kappa})}{\mathrm{Li}_\alpha(e^{-1/\kappa})} \tag{31}$$

and

$$G_1(x) = \frac{\mathrm{Li}_{\alpha-1}(x e^{-1/\kappa})}{x\, \mathrm{Li}_{\alpha-1}(e^{-1/\kappa})}. \tag{32}$$

Thus the epidemic transition in this model occurs at

$$T_c = \frac{\mathrm{Li}_{\alpha-1}(e^{-1/\kappa})}{\mathrm{Li}_{\alpha-2}(e^{-1/\kappa}) - \mathrm{Li}_{\alpha-1}(e^{-1/\kappa})}. \tag{33}$$

Below this value of $T$ there are only small (nonepidemic) outbreaks, which have mean size

$$\langle s \rangle = 1 + \frac{T\, [\mathrm{Li}_{\alpha-1}(e^{-1/\kappa})]^2}{\mathrm{Li}_\alpha(e^{-1/\kappa}) \bigl[ (T+1)\, \mathrm{Li}_{\alpha-1}(e^{-1/\kappa}) - T\, \mathrm{Li}_{\alpha-2}(e^{-1/\kappa}) \bigr]}. \tag{34}$$

Above it, we are in the region in which epidemics can occur, and they affect a fraction $S$ of the population in the limit of large graph size. We cannot solve for $S$ in closed form, but we can solve Eqs. (25) and (26) by numerical iteration and hence find $S$.

In Fig. 1 we show the results of calculations of the average outbreak size and the size of epidemics from the exact formulas, compared with explicit simulations of the SIR model on networks with the degree distribution (30). Simulations were performed on graphs of $N = 100\,000$ vertices, with $\alpha = 2$, a typical value for networks seen in the real world, and $\kappa = 5$, 10, and 20 (the three curves in each panel of the figure). For each pair of the parameters $\alpha$ and $\kappa$ for the network, we simulated 10 000 disease outbreaks each for $(r, \tau)$ pairs with $r_{\max}$ from 0.1 to 1.0 in steps of 0.1, and $\tau_{\max}$ from 1 to 10 in steps of 1. Figure 1 shows
all of these results on one plot as a function of the transmissibility $T$, calculated from Eq. (6).

The figure shows two important things. First, the points corresponding to different values of $r_{\max}$ and $\tau_{\max}$ but the same value of $T$ fall in the same place and the two-parameter set of results for $r$ and $\tau$ collapses onto a single curve. This indicates that the arguments leading to Eqs. (5) and (6) are correct (as also demonstrated by Warren et al. [23,24]) and that the statistical properties of the disease outbreaks really do depend only on the transmissibility $T$, and not on the individual rates and times of infection. Second, the data clearly agree well with our analytic results for average outbreak size and epidemic size, confirming the correctness of our exact solution. The small disagreement between simulations and exact solution for $\langle s \rangle$ close to the epidemic transition in the lower panel of the figure appears to be a finite size effect, due to the relatively small system sizes used in the simulations.

To emphasize the difference between our results and those for the equivalent fully mixed model, we compare the position of the epidemic threshold in the two cases. In the case $\alpha = 2$, $\kappa = 10$ (the middle curve in each frame of Fig. 1), our analytic solution predicts that the epidemic threshold occurs at $T_c = 0.329$. The simulations agree well with this prediction, giving $T_c = 0.32(2)$. By contrast, a fully mixed SIR

FIG. 1. Epidemic size (top) and average outbreak size (bottom) for the SIR model on networks with degree distributions of the form (30) as a function of transmissibility. Solid lines are the exact solutions, Eqs. (25) and (22), for $\alpha = 2$ and (left to right in each panel) $\kappa = 20$, 10, and 5. Each of the points is an average result for 10 000 simulations on graphs of 100 000 vertices each with distributions of $r$ and $\tau$ as described in the text.
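As a rough numerical check of Eqs. (6), (23), (25), and (26), the following Python sketch evaluates the transmissibility, the critical transmissibility, and the epidemic size for the degree distribution of Eq. (30). It is an illustration under stated assumptions rather than the author's code: the function names, the truncation of the sums at `k_max`, the starting value `u = 0.5`, and the fixed number of iterations are all choices made here.

```python
import math


def degree_distribution(alpha, kappa, k_max=500):
    """p_k proportional to k^-alpha * exp(-k/kappa) for k >= 1, p_0 = 0 (Eq. 30).

    Assumption of this sketch: the infinite sum is truncated at k_max.
    """
    weights = [0.0] + [k ** -alpha * math.exp(-k / kappa) for k in range(1, k_max + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]


def critical_transmissibility(p):
    """T_c = sum_k k p_k / sum_k k (k - 1) p_k (Eq. 23)."""
    num = sum(k * pk for k, pk in enumerate(p))
    den = sum(k * (k - 1) * pk for k, pk in enumerate(p))
    return num / den


def G0(x, p):
    """Generating function of the degree distribution (Eq. 7)."""
    return sum(pk * x ** k for k, pk in enumerate(p))


def G1(x, p):
    """Generating function G_1(x) = G_0'(x) / z (Eq. 12)."""
    z = sum(k * pk for k, pk in enumerate(p))
    return sum(k * pk * x ** (k - 1) for k, pk in enumerate(p) if k >= 1) / z


def epidemic_size(T, p, iterations=500):
    """Iterate u = G_1(1 + (u - 1) T) (Eq. 26), then S = 1 - G_0(1 + (u - 1) T) (Eq. 25)."""
    u = 0.5  # arbitrary starting point inside (0, 1)
    for _ in range(iterations):
        u = G1(1.0 + (u - 1.0) * T, p)
    return 1.0 - G0(1.0 + (u - 1.0) * T, p)


def transmissibility_uniform(r_max, tau_max):
    """Eq. (6) for r uniform on [0, r_max) and integer tau uniform on 1..tau_max."""
    total = 0.0
    for tau in range(1, tau_max + 1):
        # Integral of (1 - r)^tau over [0, r_max] in closed form.
        total += (1.0 - (1.0 - r_max) ** (tau + 1)) / (tau + 1)
    return 1.0 - total / (r_max * tau_max)


if __name__ == "__main__":
    p = degree_distribution(alpha=2.0, kappa=10.0)
    print("T_c      =", critical_transmissibility(p))
    print("S(T=0.2) =", epidemic_size(0.2, p))
    print("S(T=0.6) =", epidemic_size(0.6, p))
    print("T(r_max=0.5, tau_max=5) =", transmissibility_uniform(0.5, 5))
```

With $\alpha = 2$ and $\kappa = 10$ the computed threshold comes out near 0.329, consistent with the value quoted in the text. Below the threshold the iteration converges to $u = 1$ and the epidemic size vanishes, while above it a nonzero $S$ is obtained.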

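Analysis and Construction of Biological Interaction Networks

1. The concept of biological interaction networks

A biological interaction network is the totality of the interactions among the many kinds of molecules within an organism, including protein–protein, protein–DNA, and protein–small-molecule interactions.

Studying biological interaction networks helps us understand basic life processes within the organism, such as cell signaling, metabolic pathways, and gene regulation.

2. Classification of biological interaction networks

Biological interaction networks can be classified by the type of molecule involved.

2.1 Protein interaction networks

Protein interaction networks are the most studied class of biological interaction networks.

They consist of the protein–protein interactions that occur within or between cells, including protein–protein binding, enzymatic reactions, and enzyme–substrate binding.

2.2 Metabolic pathway interaction networks

Metabolic pathway interaction networks describe the relationships among the metabolic reactions that take place within the cell.

They are usually built from the composition of the metabolic pathways and the relationships between their reactions.

2.3 Gene regulatory networks

Gene regulatory networks describe the interactions of genes or transcription factors within the cell.

They generally include the binding of genes or transcription factors to DNA and the interactions between transcription factors and co-regulators.

3. Construction of biological interaction networks

Constructing a biological interaction network requires a large amount of biological experimentation and data analysis.

The mainstream construction methods are high-throughput analysis, the species complementation method, and literature-based manual construction.

3.1 High-throughput analysis

High-throughput analysis applies large-scale, high-throughput experiments and data analysis to the interactions between biological molecules.

It includes protein interaction domain analysis and protein–protein interaction analysis, among others.

3.2 Species complementation method

The species complementation method is a systematic analysis across different species: by comparing the biological interaction networks of different species, it reveals what these organisms have in common and where they differ.

This method helps with questions of species specificity and compensates for the limitations of high-throughput analysis.

3.3 Literature-based manual construction

Literature-based manual construction builds biological interaction networks by hand from published literature.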

mixed membership stochastic blockmodels


Mixed Membership Stochastic Blockmodels

Mixed Membership Stochastic Blockmodels (MMSB) is a powerful statistical framework used for modeling complex relationships within networks, allowing for a nuanced understanding of the diverse connections between nodes. In this article, we will delve into the foundations, applications, and advantages of MMSB in various domains.

1. Introduction to MMSB

Mixed Membership Stochastic Blockmodels represent a class of probabilistic graphical models designed to capture intricate relationships in networks. The core idea is that each node belongs to different groups with certain probabilities, enhancing the model's ability to reflect the diversity present in real-world networks.

2. Fundamentals of MMSB

a. Model Overview: MMSB is built upon the foundation of Stochastic Block Models (SBM), a model that divides a network into blocks, where nodes within a block share similar connection probabilities. MMSB introduces mixed membership, allowing for a more flexible adaptation to the diversity observed in real-world networks.

b. Random Block Models: Understanding the basics of random block models provides insight into how MMSB partitions nodes based on shared characteristics, forming the basis for capturing complex network structures.

3. Applications of MMSB

a. Social Network Analysis:
- Diversity of Memberships: MMSB can identify diverse memberships within social networks, providing a more nuanced understanding of the complex structures present in social groups.
- Relationship Prediction: Utilizing existing node relationships, MMSB excels in predicting future connections, crucial for understanding the evolution of social networks.

b. Biological Network Applications:
- Gene Regulation Networks: MMSB aids in identifying interactions among genes in regulatory networks, shedding light on the intricate regulatory mechanisms within biological systems.
- Protein-Protein Interactions: In protein-protein interaction networks, MMSB reveals functional groups of proteins, offering valuable insights for biological research.
- Disease Association Networks: MMSB can analyze patterns of associations in disease networks, providing a new perspective for disease-related studies.

4. Advantages and Challenges of MMSB

a. Advantages:
- Flexibility: MMSB captures the diversity of relationships in networks, making it applicable to a wide range of complex systems.
- Dynamic Analysis: Beyond static structures, MMSB excels in analyzing dynamic changes within networks, adapting to the evolving nature of real-world networks.

b. Challenges:
- Parameter Estimation: Challenges exist in accurately estimating model parameters, a critical aspect of MMSB, demanding ongoing efforts for improvement.
- Computational Complexity: The computational demands of MMSB, especially for large-scale networks, pose challenges that necessitate further algorithmic enhancements.

5. Future Directions for MMSB

a. Methodological Improvements:
- Parameter Estimation Techniques: Future work may focus on refining parameter estimation methods to enhance the accuracy of MMSB in diverse applications.
- Efficiency Enhancement: Advancements in algorithms can address the computational challenges, making MMSB more accessible for large-scale networks.

b. Interdisciplinary Applications:
- Expansion into Various Fields: Integrating MMSB into domains such as finance and healthcare can broaden its applications, offering new insights into complex systems.

c. Theoretical Advancements:
- In-depth Theoretical Exploration: A deeper exploration of the theoretical foundations of MMSB can uncover its applicability in a broader spectrum of scenarios.

6. Conclusion

Mixed Membership Stochastic Blockmodels provide a versatile framework for understanding the intricate structures within networks, offering valuable applications in social and biological contexts. As advancements continue, MMSB holds the potential to contribute significantly to our understanding of complex systems.
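The core idea described above, that each node belongs to several groups with certain probabilities, may be easier to see as a generative procedure. The following is a minimal sketch, assuming the commonly used formulation in which each node draws a membership vector from a Dirichlet distribution and, for every ordered pair, sender and receiver each pick one group before the edge is drawn from a block-pair probability. The function names, the concentration values `alpha`, and the `block_matrix` entries are illustrative assumptions, not taken from the article.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) via normalized Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_mmsb(n_nodes, alpha, block_matrix, seed=0):
    """Sample membership vectors and a directed adjacency matrix."""
    rng = random.Random(seed)
    k = len(alpha)
    # Each node gets its own mixed-membership vector over the k groups.
    pi = [sample_dirichlet(alpha, rng) for _ in range(n_nodes)]
    adj = [[0] * n_nodes for _ in range(n_nodes)]
    for p in range(n_nodes):
        for q in range(n_nodes):
            if p == q:
                continue
            # For this pair only, sender and receiver each pick one group.
            z_send = rng.choices(range(k), weights=pi[p])[0]
            z_recv = rng.choices(range(k), weights=pi[q])[0]
            # The edge is a Bernoulli draw with the block-pair probability.
            if rng.random() < block_matrix[z_send][z_recv]:
                adj[p][q] = 1
    return pi, adj

pi, adj = sample_mmsb(
    n_nodes=6,
    alpha=[0.5, 0.5],                          # illustrative concentration
    block_matrix=[[0.9, 0.05], [0.05, 0.8]],   # illustrative block probabilities
)
```

Because a node re-draws its group for every interaction, it can behave like a member of one block toward some partners and of another block toward others, which is what separates this model from a plain stochastic block model. Fitting the model to data (the parameter estimation challenge noted above) is a separate and harder problem that this sketch does not attempt.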

Chapter 9: Biomolecular Networks and Pathways


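Negative regulation
Transcriptional regulatory networks (2)
Transcriptional regulatory networks: detection techniques
ChIP is a widely used experimental technique for studying the binding of transcription factors to promoters.
ChIP-on-chip, which combines ChIP with gene chips (microarrays), has been widely used for high-throughput screening of the target genes of specific trans-acting factors; ChIP-seq uses next-generation sequencing.
Basic workflow
Transcriptional regulation databases
TRANSFAC database: MATCH software
Basic concepts of networks
Network definition; directed and undirected networks; weighted and unweighted networks; bipartite networks
Paths and distances in networks
Network definition
Network definition: a network can usually be represented as a graph G = (V, E), where V is the set of nodes, each node representing a biomolecule or an environmental stimulus, and E is the set of edges, each edge representing a relationship between nodes. When an edge e1 in E exists between two nodes v1 and v2 in V, we say that e1 connects v1 and v2, or that v1 is connected to v2, or that v2 is a neighbor of v1.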
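Directed and undirected networks
Networks are classified as directed or undirected according to whether their edges have a direction, that is, whether the two nodes joined by an edge are ordered. If the edges have a direction the network is directed; otherwise it is undirected.
Whether a biomolecular network is directed depends on the relationship it represents.
For example, in a regulatory relationship the transcription factor and the regulated gene are ordered, so a transcriptional regulatory network is directed. In a gene co-expression network an edge represents high correlation between the expression of two genes across multiple experimental conditions, so the network is undirected.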
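The graph definition G = (V, E) and the directed/undirected distinction can be made concrete with a small data structure. This is only a sketch under simple assumptions: the class name `Network` and the node names are hypothetical, and edges are stored as adjacency sets.

```python
from collections import defaultdict

class Network:
    """G = (V, E) stored as adjacency sets; directed or undirected."""

    def __init__(self, directed=False):
        self.directed = directed
        self.nodes = set()            # V
        self.adj = defaultdict(set)   # for each node, the nodes its edges lead to

    def add_edge(self, u, v):
        self.nodes.update((u, v))
        self.adj[u].add(v)
        if not self.directed:
            # An undirected edge has no order, so store it both ways.
            self.adj[v].add(u)

    def neighbors(self, u):
        return set(self.adj.get(u, ()))

# Transcriptional regulation: the edge runs from the TF to its target.
regulatory = Network(directed=True)
regulatory.add_edge("TF_A", "gene_1")      # hypothetical names

# Co-expression: correlation is symmetric, so the edge is undirected.
coexpression = Network(directed=False)
coexpression.add_edge("gene_1", "gene_2")  # hypothetical names
```

In the directed case `gene_1` does not list `TF_A` among its neighbors, while in the undirected case both endpoints see each other. A weighted network could be sketched the same way by mapping each neighbor to a weight instead of keeping a plain set.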
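The human meridian network. Question: if a meridian chart is a network, what should its nodes be? What should its edges be?
The acupoints of the human body are the nodes of this network; their therapeutic functions differ and are interrelated. Meridian theory and acupuncture are the earliest documented model of a human biological network, with successful medical applications, from the formative period of network science.
History of development (2)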

stable diffusion: usage of parentheses


Stable Diffusion

Introduction

Stable diffusion is a concept in the field of physics that refers to the process by which the concentration of particles or substances in a system reaches an equilibrium state. This phenomenon occurs in various natural and artificial systems, such as gases, liquids, and even social networks. The understanding and control of stable diffusion have significant implications in fields such as chemistry, biology, and engineering.

The Basics of Diffusion

Diffusion is the net movement of particles from an area of high concentration to an area of low concentration. It is driven by a concentration gradient, which represents the difference in concentration between two points. The net effect of diffusion is to equalize the concentration across the system, reaching a stable state.

Factors Affecting Diffusion

Several factors can influence the rate and extent of diffusion:

1. Temperature
Higher temperatures generally increase the rate of diffusion because particles have more kinetic energy, leading to greater movement and collisions. This increased motion facilitates the mixing of substances and the spread of particles.

2. Concentration Gradient
The greater the difference in concentration between two areas, the more rapid diffusion will occur. This difference creates a driving force that propels the particles towards equilibrium.

3. Surface Area
A larger surface area enhances diffusion as it provides more space for particles to interact and spread out. For example, fine powders can diffuse more quickly than large chunks due to their increased surface area.

4. Molecular Weight and Size
Smaller, lighter molecules diffuse more rapidly than larger ones. This is because they can move through space more easily and encounter fewer obstacles along the way.

Mathematical Description

The process of diffusion can be mathematically described using Fick’s laws of diffusion. These laws quantify the flux of particles and the rate at which they diffuse through a medium.

Fick’s first law states that the flux of particles is proportional to the concentration gradient:

Flux = -D * (dc/dx)

where Flux is the rate of flow of particles, D is the diffusion coefficient, dc is the change in concentration, and dx is the change in distance across which diffusion occurs.

Fick’s second law extends the concept to include the change in concentration with respect to time:

∂c/∂t = D * ∂²c/∂x²

This equation describes how the concentration of particles changes over time in response to the concentration gradient and diffusion coefficient.

Applications of Stable Diffusion

The principle of stable diffusion has numerous applications across various fields:

1. Chemical Reactions
In chemical reactions, stable diffusion plays a crucial role in the distribution and mixing of reactants. It ensures that reactants are brought together effectively, increasing the chances of successful reactions and improving reaction rates.

2. Biological Systems
In biological systems, stable diffusion is essential for processes such as oxygen transport in the blood, nutrient absorption in cells, and the release of neurotransmitters in the brain. Understanding how diffusion operates in these systems can help in diagnosing and treating diseases.

3. Material Science
In material science, stable diffusion is utilized to enhance the properties of materials. For example, by controlling the diffusion of dopants into semiconductor materials, engineers can tailor the electrical conductivity and performance of electronic devices.

4. Social Networks
Stable diffusion is also observed in social networks, where ideas, trends, and information spread from one individual to another. By studying how information diffuses through social networks, researchers can predict and influence behaviors, as well as analyze the dynamics of social interactions.

Conclusion

Stable diffusion plays a fundamental role in a wide range of natural and artificial systems. Understanding the factors that influence diffusion and the mathematical principles underlying it enables scientists and engineers to control and optimize processes in fields ranging from chemistry to social networks. By harnessing the power of stable diffusion, we can develop new materials, advance medical treatments, and gain insights into complex systems.
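To make Fick’s second law concrete, here is a minimal numerical sketch. It assumes a one-dimensional grid, an explicit forward-time, centered-space finite-difference update, and no-flux boundaries; the function name `diffuse` and all parameter values are illustrative choices rather than anything given in the article.

```python
def diffuse(conc, D, dx, dt, steps):
    """Explicit finite-difference (FTCS) integration of dc/dt = D * d2c/dx2
    on a 1D grid with no-flux (reflecting) boundaries."""
    r = D * dt / dx ** 2
    if r > 0.5:
        raise ValueError("unstable step: need D*dt/dx**2 <= 0.5")
    c = list(conc)
    n = len(c)
    for _ in range(steps):
        new = c[:]
        for i in range(n):
            left = c[i - 1] if i > 0 else c[i]        # mirror at the left wall
            right = c[i + 1] if i < n - 1 else c[i]   # mirror at the right wall
            new[i] = c[i] + r * (left - 2 * c[i] + right)
        c = new
    return c

# Illustrative values: all material starts in the middle cell.
initial = [0.0] * 21
initial[10] = 1.0
final = diffuse(initial, D=1.0, dx=1.0, dt=0.2, steps=500)
```

With reflecting boundaries the total amount of material stays constant while the profile flattens toward a uniform concentration, which is the equilibrium state the text describes. The stability check reflects a known limit of this explicit scheme; implicit schemes avoid it at the cost of solving a linear system per step.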

Key genes and hub genes (from a biological network perspective)


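Preface

This post again draws on several articles and on my own notes; it discusses key genes and hub genes.

Many people assume that hub genes are key genes, and some even treat differentially expressed genes as key genes.

Before the main text, I will briefly explain how I understand the relationship among the three; if you see it differently, please leave a comment.

• Differentially expressed genes are genes that differ statistically between two groups. With microarrays, for example, perhaps only about 1,000 of tens of thousands of probes are differential (this varies greatly with the chosen threshold).
• Hub genes are genes with high degree, that is, high connectivity in the gene co-expression network; measures such as betweenness are not involved.

Hub gene selection also involves considerable arbitrariness: there is no fixed rule on whether to take the top 5% or 10%, although 5% is the usual suggestion.

In other words, it is a very loose criterion.

• Key genes: some people pick the top-ranked hubs, others pick the differentially expressed genes with the smallest p-values.

What actually counts as a key gene? Broadly, if knocking the gene down makes the phenotype clearly disappear, it is certainly a key gene.

But how do you pick one from bioinformatics analysis alone? No single method solves this directly. Here I consider only the expression-network perspective; I will later write a post on screening key genes from several angles.

The set of key genes is smaller than the set of hub genes.

A hub gene is not necessarily key, and a key gene is not necessarily a hub.

In short, in number or scope: DEGs > hubs > key genes (candidate genes)

------------------------------------------------

Now for the main text.

Hub genes

The WGCNA approach typically deals with the identification of gene modules by using the gene expression levels that are highly correlated across samples. This technique has been successfully utilized to detect gene modules in Arabidopsis, rice, maize and poplar for various biotic and abiotic stresses. Further, this approach also leads to construction of Gene Co-expression Network (GCN), a scale free network, where genes are represented as nodes and edges depict associations among genes. In such network, highly connected genes are called hub genes, which are expected to play an important role in understanding the biological mechanism of response under stresses/conditions. Identification of hub genes will also help in mitigating the stress in plants through genetic engineering. The existing approaches have mainly focused on hub gene identification, based only on gene connection degrees in the GCN. Moreover, these techniques select such genes empirically without any statistical criteria. Besides, few approaches can be found in the literature for the identification of hub nodes in a scale free network.

From this we can see that hub genes are defined within a scale-free co-expression network (the GCN) and correspond to degree.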
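The degree-based notion of a hub gene described above can be sketched in a few lines. This is a simplified illustration, not the WGCNA procedure: it assumes a hard correlation cutoff (WGCNA itself uses soft thresholding), and the function names, the cutoff of 0.8, and the toy expression values are all made up for the example. It also assumes no gene has a constant profile, since that would make the correlation undefined.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def hub_genes(expr, cor_cutoff=0.8, top_fraction=0.05):
    """Rank genes by degree in a thresholded co-expression network."""
    genes = list(expr)
    degree = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            # Connect two genes when their profiles are strongly correlated.
            if abs(pearson(expr[g], expr[h])) >= cor_cutoff:
                degree[g] += 1
                degree[h] += 1
    n_hubs = max(1, math.ceil(top_fraction * len(genes)))
    ranked = sorted(genes, key=lambda g: degree[g], reverse=True)
    return ranked[:n_hubs], degree

# Toy expression matrix (gene -> values across samples); made-up numbers.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 6.0, 8.2],
    "g3": [1.1, 1.9, 3.2, 3.9],
    "g4": [4.0, 1.0, 3.5, 0.5],
    "g5": [0.9, 2.2, 2.9, 4.1],
}
hubs, degree = hub_genes(expr)
```

The `top_fraction` parameter is exactly the arbitrary choice the post complains about: switching from 5% to 10% changes the hub list without any statistical justification, which is why degree alone is a loose criterion for calling a gene "key".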


Protein Interaction Network
Proteins in a cell

There are thousands of different active proteins in a cell acting as:

enzymes, catalysts of the chemical reactions of metabolism
components of cellular machinery (e.g. ribosomes)
regulators of gene expression
Certain proteins play specific roles in special cellular compartments. Others move from one compartment to another as “signals”.
Pathway Networks
Signaling & Metabolic Pathway Network

A Pathway can be defined as a modular unit of interacting molecules to fulfill a cellular function.
Database / Knowledge Source
Homology Modeling
Post-translational Modification
Database / Knowledge Source
From the particular to the universal
A.-L. Barabasi & Z. Oltvai, Science, 2002
Why Study Networks?
It is increasingly recognized that complex systems cannot be described in a reductionist view. Understanding the behavior of such systems starts with understanding the topology of the corresponding network. Topological information is fundamental in constructing realistic models for the function of the network.
Protein Interactions
P. Uetz, et al. Nature, 2000; Ito et al., PNAS, 2001; …
Yeast Protein Interaction Network
Nodes: proteins
Links: physical interactions (binding)

Bioinformatics
The essence of life is information (i.e. from digital code to emerging properties of biosystems.)
Bioinformatics is the study of the information content of life



Power Law Network
PREFERENTIAL ATTACHMENT on Growth: the probability that a new vertex will be connected to vertex i depends on the connectivity of that vertex:
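The formula that followed this statement did not survive extraction. As a hedged illustration, the sketch below assumes the standard linear form, in which the probability of attaching to vertex i is its degree divided by the sum of all degrees; the function name, the seed graph, and the parameter `m` (edges added per new vertex) are illustrative assumptions.

```python
import random

def preferential_attachment(n_nodes, m=2, seed=0):
    """Grow a graph in which each new node attaches to m existing nodes
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a small complete graph on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `stubs` once per unit of degree, so a uniform
    # draw from `stubs` is a degree-proportional draw over nodes.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = preferential_attachment(n_nodes=200, m=2)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
```

Redrawing until the m targets are distinct is a small departure from exact proportionality, accepted here to keep the graph simple. Early, well-connected vertices tend to accumulate the most links, which is the "rich get richer" effect behind a power-law degree distribution.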
Biological Networks’ Properties Databases Discussion STM Clustering Model

Introduction
Bioinformatics

Informatics
Its carrier is a set of digital codes and a language. In its manifestation in the space-time continuum, it has utility (e.g. to decrease entropy of an open system).

Biological Network Model

Network

A set of nodes interconnected by edges.
Node


Protein, peptide, or non-protein biomolecules.

Edges
Biological relationships: interactions, regulations, reactions, etc.
Regulatory Network
Expression Network

A network representation of genomic data, inferred from genomic data such as microarray measurements.
BIOLOGICAL NETWORK PROPERTY
A Pathway Example
Regulatory Network

A collection of DNA segments (genes) in a cell that interact with each other and with other substances in the cell, thereby governing the rates at which genes in the network are transcribed into mRNA.
Proteomics
Genomics Proteomics
Structural Proteomics
Structure Determination
Functional Proteomics
Protein-Protein Interaction & Networking Protein Expression
Interaction Network Pathway Network Regulatory Network Expression Network

Biological Networks Properties
Power law degree distribution: Rich get richer
Small World: A small average path length (mean shortest node-to-node path)
Robustness: Resilient to random failures but vulnerable to targeted attacks
Hierarchical Modularity: A large clustering coefficient (how many of a node’s neighbors are connected to each other)
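Two of the properties listed above, average path length and clustering coefficient, are straightforward to compute on a small graph. The sketch below assumes an undirected, connected graph stored as adjacency sets; the function names and the toy graph are illustrative.

```python
from collections import deque

def average_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2 * links / (k * (k - 1))

# Toy undirected graph as adjacency sets: a triangle A-B-C with a tail C-D.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
apl = average_path_length(adj)
cc_c = clustering_coefficient(adj, "C")
```

In this toy graph node C has three neighbors but only one linked pair among them (A and B), so its clustering coefficient is one third, while A, whose two neighbors are linked, has a coefficient of one.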
Genome Size
Proteome Size (PDB)
BIOLOGICAL NETWORK
Networks are found in biological systems of varying scales:
1. Evolutionary tree of life
2. Ecological networks
3. Expression networks
4. Regulatory networks - genetic control networks of organisms
5. The protein interaction network in cells
6. The metabolic network in cells
… more biological networks
Protein Interactions


Proteins perform a function as a complex rather than as a single protein. Knowing whether two proteins interact can help us discover unknown proteins’ functions: if the function of one protein is known, the function of its binding partners is likely to be related (“guilt by association”). Thus, having a good method for detecting interactions can allow us to use a small number of proteins with known function to characterize new proteins.
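The "guilt by association" idea can be sketched as a majority vote over a protein's annotated binding partners. This is one simple reading of the principle, not a method the slides specify; the function name, protein identifiers, and annotations below are hypothetical.

```python
from collections import Counter

def predict_function(protein, interactions, annotations):
    """Guess a function by majority vote over annotated binding partners."""
    partners = {b for a, b in interactions if a == protein}
    partners |= {a for a, b in interactions if b == protein}
    votes = Counter(annotations[p] for p in partners if p in annotations)
    if not votes:
        return None  # no annotated partner, so no prediction
    return votes.most_common(1)[0][0]

# Hypothetical interaction pairs and annotations, for illustration only.
interactions = [("P1", "X"), ("P2", "X"), ("P3", "X"), ("P3", "Y")]
annotations = {"P1": "DNA repair", "P2": "DNA repair", "P3": "transport"}
guess = predict_function("X", interactions, annotations)
```

Here the unannotated protein X binds two "DNA repair" proteins and one "transport" protein, so the vote favors "DNA repair". Real interaction data are noisy, so a practical version would weight partners by the reliability of each detected interaction.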
Signaling Pathway Networks

In biology a signal or biopotential is an electric quantity (voltage, current, or field strength) caused by chemical reactions of charged ions. Another use of the term lies in describing the transfer of information between and within cells, as in signal transduction. Signal transduction refers to any process by which a cell converts one kind of signal or stimulus into another.

Metabolic Pathway Networks

A metabolic pathway is a series of chemical reactions occurring within a cell, catalyzed by enzymes, resulting in either the formation of a metabolic product to be used or stored by the cell, or the initiation of another metabolic pathway.