Maximum entropy weighting of aligned sequences of proteins or DNA
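2007 C H Wang Jiaotuan (Information and Computing Science), Zhou Chaowei (Information and Computing Science), Zhou Longfei (Information Management and Information Systems)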
6 Task 3
6.1 Background
6.2 Model 5 Kidney Matching for Transplantation
6.3 Simulations
6.4 Conclusion
Regionalization of the transplant system has produced political ramifications. Doctors living in small communities who want to do a good job in transplants need continuing experience, which means performing a minimum number of transplants per year. However, the kidneys from these small communities frequently go to the hospitals in the big city.
Task 1: Build a mathematical model for the generic US transplant network(s). This model must be able to give insight into the following: Where are the potential bottlenecks for efficient organ matching? If more resources were available for improving the efficiency of the donor-matching process, where and how could they be used? Would this network function better if it were divided into smaller networks (for instance at the state level)? And finally, can you make the system more effective by saving and prolonging more lives? If so, suggest policy changes and modify your model to reflect these improvements.
Task 2: Investigate the transplantation policies used in a country other than the US. By modifying your model from Task 1, determine whether the US policy would be improved by implementing the procedures used in the other country. As members of an expert analysis team (with knowledge of public health issues and network science) hired by Congress to study these questions, write a one-page report to Congress addressing the questions and issues of Task 1 and the information and possible improvements you have discovered from your research into the other country's policies. Be sure to reference how you used your models from Task 1 to help address the issues.
A Multi-Objective, Multi-Attribute Decision-Making Method Based on Subjective and Objective Weighting (Song Dongmei)
(1. School of Geoscience, China University of Petroleum (East China), Qingdao 266580, Shandong, China; 2. Graduate School, China University of Petroleum (East China), Qingdao 266580, Shandong, China; 3. College of Science, China University of Petroleum (East China), Qingdao 266580, Shandong, China; 4. First Institute of Oceanography, State Oceanic Administration of People's Republic of China, Qingdao 266061, Shandong, China)
Abstract: To address the main defects of traditional subjective and objective weighting methods in multi-objective, multi-attribute decision-making, a new weighting approach that combines subjective and objective weighting is proposed. The subjective weighting method has the advantage of considering three different attitudes (pessimistic, neutral, optimistic) of the decision-makers. The objective weighting method is based on the CRITIC method and the entropy value method, which together account for the dispersion, correlation, and contrast intensity of the data. Finally, a linear combination rule and a multiplication operator are used to combine the subjective and objective weights. The feasibility and practicability of the proposed method are demonstrated by an experiment assessing the anti-interference ability of communication equipment.
Key words: nonstructural fuzzy number method; triangular fuzzy number method; CRITIC method; entropy value method; weight combination method
Stata commands for computing the entropy weight method
English answer:
The entropy weight method (EWM) is a widely used objective weighting method in decision-making. It uses information entropy to determine the weights of the criteria, so that the weights come from the data rather than from subjective judgment.
The EWM algorithm involves the following steps:
1. Normalize the decision matrix so that all criteria are on the same scale.
2. Calculate the entropy value for each criterion:
$$E_j = -\frac{1}{\ln m} \sum_{i=1}^{m} p_{ij} \ln p_{ij}$$
where p_ij is the normalized value of the j-th criterion for the i-th alternative (scaled so that the values of each criterion sum to 1 over the alternatives) and m is the number of alternatives.
3. Calculate the weight for each criterion:
$$W_j = \frac{1 - E_j}{\sum_{k} (1 - E_k)}$$
where k runs over all criteria.
The EWM has several advantages. It is objective and does not require subjective judgments from decision-makers. It also reflects the variation in the data: a criterion on which the alternatives differ more receives a larger weight.
However, the EWM also has some limitations. It uses no prior knowledge about how important each criterion actually is, so a criterion whose values barely vary receives a small weight even if it matters in practice. Additionally, the EWM can be sensitive to outliers in the data, which can distort the calculated weights.
Despite these limitations, the EWM remains a valuable tool for decision-making, especially when the criteria are complex and uncertain. It provides a systematic way to determine the weights of different criteria, leading to more informed and defensible decisions.
Chinese answer: The entropy weight method.
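The three EWM steps listed above can be sketched in Python as follows. This is a minimal illustration rather than the Stata commands the title refers to: the function name `entropy_weights` and the sample matrix are made up for this example, and the sketch assumes non-negative entries, positive column sums, and at least one criterion on which the alternatives differ.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for a decision matrix (rows = alternatives, columns = criteria).

    Assumes non-negative entries, a positive sum in every column, and at
    least one non-constant column (otherwise the final division is by zero).
    """
    m = len(matrix)          # number of alternatives
    n = len(matrix[0])       # number of criteria
    scale = 1.0 / math.log(m)  # keeps each entropy inside [0, 1]

    entropies = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        # p_ij: share of alternative i within criterion j
        shares = [value / total for value in column]
        # 0 * ln(0) is taken as 0, so zero shares are skipped
        e_j = -scale * sum(p * math.log(p) for p in shares if p > 0)
        entropies.append(e_j)

    divergence = [1.0 - e for e in entropies]
    total_divergence = sum(divergence)
    return [d / total_divergence for d in divergence]


if __name__ == "__main__":
    # Hypothetical scores: 3 alternatives, 3 criteria
    scores = [
        [0.5, 0.9, 3.0],
        [0.5, 0.1, 2.0],
        [0.5, 0.5, 1.0],
    ]
    print(entropy_weights(scores))
```

In the sample matrix the first criterion is identical for all alternatives, so its entropy is 1 and its weight is essentially zero; the criterion with the most uneven shares receives the largest weight.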
Maximum Entropy Models and Natural Language Processing (MaxEnt Model & NLP), 94 pages
Natural Language Processing
MaxEnt Model & NLP
laputa NLP Group, AI Lab, Tsinghua Univ.
Topics
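• The relationship between NLP and stochastic processes (background)
• Introduction to the maximum entropy model (definition of entropy, the maximum entropy model)
• Solving the maximum entropy model (nonlinear programming, duality
"学习" ("learn" / "learning") may be a verb or a noun. It can be tagged as subject, predicate, object, attributive, ...
The probability that "学习" is tagged as an attributive is small, only 0.05.
We introduce this new knowledge: p(y4) = 0.05
Apart from this, we still keep to the no-bias principle: p(x1) = p(x2) = 0.5
p(y1) = p(y2) = p(y3) = 0.95/3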
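The relationship between the known and the unknown: an example
Feature
A feature is a pair (x, y). y: the information to be determined in this feature. x: the contextual information in this feature.
Note that a tag may be the information to be determined in one situation and contextual information in another:
p(x1) + p(x2) = 1
Known:
$$\sum_{i=1}^{4} p(y_i) = 1$$
"学习" ("learn" / "learning") may be a verb or a noun. It can be tagged as subject, predicate, object, attributive, ...
The probability that "学习" is tagged as an attributive is small, only 0.05: p(y4) = 0.05
When "学习" is tagged as a verb, the probability that it is tagged as a predicate is 0.95.
We introduce this new knowledge: p(y2|x1) = 0.95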
[Coin-weighing example: the expression was garbled in extraction; only the probabilities 1/9, 1/9 and terms in log 3 and log 9 survive.]
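Coin weighing: Versions 3, 4, …, ∞
More generally: suppose a random variable x takes values in X = {x1, x2, …, xk}, and x is to be represented by n symbols y1 y2 … yn, where each symbol y has c possible values. The expected value of n is at least:
$$\sum_{i=1}^{k} p(x = x_i)\,\log_c \frac{1}{p(x = x_i)} = \sum_{i=1}^{k} p(x = x_i)\,\frac{\log \frac{1}{p(x = x_i)}}{\log c}$$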
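The lower bound above is easy to evaluate numerically. The sketch below is an illustration only: the function name `min_expected_symbols` is made up here, and the two coin distributions are hypothetical inputs chosen for this example (a three-way balance gives c = 3), not figures taken from the slides.

```python
import math


def min_expected_symbols(probabilities, c):
    """Lower bound on the expected number of c-ary symbols needed to
    identify an outcome drawn from the given distribution."""
    # Outcomes with probability 0 contribute nothing and are skipped.
    return sum(p * math.log(1.0 / p) / math.log(c) for p in probabilities if p > 0)


if __name__ == "__main__":
    # Hypothetical case 1: 9 equally likely coins, each weighing has 3 outcomes.
    print(min_expected_symbols([1 / 9] * 9, 3))
    # Hypothetical case 2: a non-uniform distribution over 5 coins.
    print(min_expected_symbols([1 / 3, 1 / 3, 1 / 9, 1 / 9, 1 / 9], 3))
```

With nine equally likely coins the bound is two weighings; making some coins more likely than others lowers it, which is the point of the later versions of the coin-weighing example.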
TOPSIS Method Based on Combined Weighting
English answer:
TOPSIS Method Based on Combined Weights.
The Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) is a widely used multi-criteria decision-making (MCDM) method. It is based on the idea of selecting the alternative that is closest to the ideal solution and farthest from the negative ideal solution.
In the traditional TOPSIS method, all criteria are assumed to have equal importance. However, in many real-world applications, different criteria have different levels of importance. To address this, a number of weighted TOPSIS methods have been proposed.
One of the most popular weighted TOPSIS methods is the combined weights method. In this method, the weights of the criteria are determined by combining the subjective weights assigned by the decision-maker and the objective weights derived from the data.
To determine the subjective weights, the decision-maker can use any of the following methods:
Simple weighting: Each criterion is assigned a weight between 0 and 1, with the weights summing to 1.
Pairwise comparison: Each criterion is compared to every other criterion, and a weight is assigned based on the relative importance of each criterion.
Analytic hierarchy process (AHP): A more sophisticated method that decomposes the problem into a hierarchy of criteria and subcriteria.
To determine the objective weights, the data can be analyzed using any of the following methods:
Entropy: The entropy of each criterion is calculated, and the weights are assigned based on the relative entropy values.
Standard deviation: The standard deviation of each criterion is calculated, and the weights are assigned based on the relative standard deviation values.
Coefficient of variation: The coefficient of variation of each criterion is calculated, and the weights are assigned based on the relative coefficient of variation values.
Once the subjective and objective weights have been determined, they are combined to form the combined weights. The combined weight for criterion i is given by:
$$w_i^c = \alpha w_i^s + (1-\alpha) w_i^o$$
where $w_i^s$ and $w_i^o$ are the subjective and objective weights of criterion i, and $\alpha \in [0, 1]$ sets the balance between them.
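A minimal Python sketch of the combined-weight formula followed by a standard TOPSIS ranking is given here for illustration. The helper names `combine_weights` and `topsis`, the vector normalization, and the sample data are assumptions made for this example; the text does not specify which normalization or which value of α to use.

```python
import math


def combine_weights(subjective, objective, alpha=0.5):
    """w_i^c = alpha * w_i^s + (1 - alpha) * w_i^o for every criterion i."""
    return [alpha * s + (1 - alpha) * o for s, o in zip(subjective, objective)]


def topsis(matrix, weights, benefit):
    """Closeness score of each alternative (rows) to the ideal solution.

    benefit[j] is True when larger values of criterion j are better.
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n = len(weights)
    # Vector normalization of every column, then weighting
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]

    ideal = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v) for j in range(n)]
    anti = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v) for j in range(n)]

    scores = []
    for r in v:
        d_pos = math.dist(r, ideal)  # distance to the ideal solution
        d_neg = math.dist(r, anti)   # distance to the negative ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores


if __name__ == "__main__":
    # Hypothetical data: criterion 0 is a benefit, criterion 1 is a cost
    w = combine_weights([0.6, 0.4], [0.4, 0.6], alpha=0.5)
    print(topsis([[9, 1], [5, 5], [1, 9]], w, [True, False]))
```

Setting `alpha` closer to 1 leans on the decision-maker's judgment, and closer to 0 leans on the data-derived weights.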
Evaluation of Traffic Resilience of Freeway Networks Based on Combined Weighting-Cloud Model
Journal of Hunan University (Natural Sciences), Vol. 50, No. 11, Nov. 2023
Evaluation of Traffic Resilience of Freeway Networks Based on Combined Weighting-Cloud Model
LI Jie1, LIU Qiuqi1, ZHANG Xinyu1, WEI Yuanyuan2, ZHANG Jingjing2†
(1. College of Civil Engineering, Hunan University, Changsha 410082, China; 2. Guangxi Transportation Science and Technology Group Co., Ltd., Nanning 530007, China)
Abstract: To develop strategies for improving the traffic resilience of freeway networks, this paper introduces a network resilience evaluation method based on the combined weighting-cloud model. First, four topological structure indicators were selected, namely structure entropy, edge betweenness, clustering coefficient, and freeway network density, as well as two traffic status indicators: the travel time index and the traffic heterogeneity index. The resilience of the freeway network was comprehensively evaluated based on the topological structure and traffic status indicators. Then, the resilience of the freeway network was graded, the boundary values of the evaluation indicators at different resilience levels were determined, and the characteristic values and certainty of the cloud parameters were estimated based on the backward cloud generator. Afterward, the indicators were weighted by combining the analytic hierarchy process and the entropy method. The membership degrees of the different resilience levels were determined by calculating the weighted average, and the resilience level of the freeway network was determined according to the maximum membership degree. Finally, a case study was made for a freeway network to compare the combined weighting-cloud model method proposed in this study with the comprehensive fuzzy method. The results of the two methods are similar. However, the combined weighting-cloud model method reflects the actual status of the freeway network more objectively because it is free from the defect of randomness found in the comprehensive fuzzy method.
Key words: transportation planning and management; traffic resilience; complex networks; combined weighting; cloud model; freeway
CLC number: U491  Document code: A
Received: 2023-02-13
Funding: National Natural Science Foundation of China (51878264); Key R&D Project of the Department of Science and Technology of Hunan Province (2022SK2096); Science and Technology Project of the Department of Communications of Henan Province (2020G11)
About the author: LI Jie (1972–), female, from Zhuzhou, Hunan; associate professor at Hunan University, Doctor of Engineering
† Corresponding author, E-mail: ****************
Article ID: 1674-2974(2023)11-0224-11  DOI: 10.16339/ki.hdxbzkb.2023141
The concept of resilience was first introduced by Holling[1] in the study of ecosystems and has since attracted wide attention and application in many fields. In 2006, Murray-Tuite[2] first brought the concept into transportation systems, proposing a definition of resilience and a way to quantify it based on traffic characteristics. Since then, scholars have studied the resilience characteristics of transportation systems and their evaluation methods from many angles.
Traffic resilience expresses the overall capability of a system and can be characterized by topological indicators and traffic-characteristic indicators. Ip et al.[3] used complex network theory to propose a resilience indicator based on reliable passageways in a rail system; node resilience is assessed from the weighted average of this indicator, and network resilience is quantified as the weighted sum of node resilience. Dunn et al.[4] used topological indicators such as the largest connected component and the average shortest path to assess the system resilience of air traffic networks. Xu Jinqiang et al.[5] combined topological and traffic indicators in a comprehensive evaluation of urban road network resilience and found that an evaluation that includes traffic-characteristic indicators reflects the actual performance of the network more objectively.
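Bocchini et al.[6] took total travel time and travel distance as system performance indicators and studied network restoration strategies that maximize road network resilience. Pratelli et al.[7] used speed as the road traffic performance indicator and defined a resilience index as the ratio, over time, of the area under the actual speed curve to the area under the speed limit. Omer et al.[8] considered the effect of three factors (travel duration, environmental impact, and travel cost) on total network travel time and analysed network-level resilience with travel time as the system performance indicator. The two kinds of indicators emphasize different aspects of traffic resilience: topological indicators, grounded in complex network theory, statically analyse the network's ability to withstand and counter shocks through its structural properties; traffic-characteristic indicators capture how system performance changes over time and reflect the functional resilience of the network.
To improve and optimize the resilience of transportation systems, scholars have studied different kinds of traffic disturbance events. Hsieh et al.[9] assessed the resilience of Taiwan's freeway network under natural disasters. Begum et al.[10] proposed recommendations and assessment criteria for improving regional highway network resilience from the perspective of climate change. Xiao et al.[11] investigated the extent of earthquake damage to transportation infrastructure. Chu et al.[12] discussed how to improve the resilience of highway bridge systems that are vulnerable to earthquakes. Zhu et al.[13] examined the resilience of New York City's transportation infrastructure systems under Hurricanes Irene and Sandy. Kasmalkar et al.[14] quantified the damage floods cause to urban transportation systems as a basis for improving their resilience. Bruyelle et al.[15] considered the impact of terrorist attacks in their study of urban rail transit resilience. Zhong et al.[16] assessed the resilience of the Guangzhou Airport Expressway under traffic accidents. Earthquakes, hurricanes, and terrorist attacks are extreme events that cause huge losses to transportation systems and have drawn the attention of many scholars.
Some scholars have tried to analyse everyday traffic disturbances from a system perspective. Tang et al.[17] studied strategies for managing road congestion from the viewpoint of transportation system resilience. Almotahari et al.[18] tested 150 constructed networks of different topology at different congestion levels to select the indicators that best characterize network resilience, and validated them on a congested urban road network. Khaghani et al.[19] characterized road network resilience with multidimensional indicators and used data from the major freeways of Los Angeles, California, to analyse the network's resistance to recurrent peak-hour congestion. Testa et al.[20] built a topological model of the New York freeway network, chose average node degree, clustering coefficient, and betweenness centrality as evaluation indicators, and analysed network resilience after random removal of nodes or links. Zhang et al.[21] used GPS data from Beijing and Shenzhen to analyse the characteristics of, and differences in, road network resilience under congestion in different cities, providing a theoretical basis for traffic management authorities. Akbarzadeh et al.[22] took the road network of Isfahan, Iran, as an example and explored the relationships among traffic flow, node centrality, and node importance, providing an important basis for urban road network planning and traffic management. Lü Biao et al.[23] proposed an urban road network resilience assessment model based on day-to-day traffic assignment, using topological indicators such as network efficiency and network accessibility to describe system resilience under disturbance events.
As interregional travel demand keeps growing, congestion has spread from urban roads to freeways. On holidays in particular, a short-term surge in local traffic volume puts further stress on freeways[24] and in turn affects system resilience. Because freeways are relatively closed, the impact of a traffic surge on the network is hard to dissipate in a short time; if the freeway network lacks resilience, system performance drops quickly and the normal functioning of society and the economy is affected. Strengthening the traffic resilience of the freeway network can therefore prevent or mitigate congestion, raising the freeway level of service and lowering travel costs.
This paper quantitatively evaluates the traffic resilience of a freeway network under the shock of holiday congestion. First, it builds an indicator system for freeway network traffic resilience, introducing traffic characteristics alongside the structural properties of the network when selecting indicators. Second, the cloud model is widely used for evaluation problems in many research fields and can convert between qualitative and quantitative descriptions of uncertain problems[25-26]. This paper proposes a combined weighting-cloud model method for evaluating freeway network resilience: the indicators are weighted by combining subjective and objective weights, the backward cloud generator gives the certainty degree of each indicator in each grade interval, and the evaluation result is obtained by the principle of maximum comprehensive certainty. Finally, an empirical analysis is carried out on the freeway network of a certain city, using traffic survey data, network topology data, and statistical yearbook data to evaluate its resilience. Comparing the proposed method with the comprehensive fuzzy evaluation method verifies its rationality and effectiveness.
1 Resilience evaluation indicators and grading for freeway networks
As research on traffic resilience has deepened, its precise definition varies between studies, but its content mainly covers two aspects. The first is the system's ability to adapt to, absorb, and resist shocks, which reflects static resilience. The second is the system's ability to return to a normal level of service after a shock, which reflects dynamic resilience. This paper defines freeway network traffic resilience as the ability of the freeway network to withstand the shock of traffic flow and to recover to its normal operating level over time.
The evaluation indicators for freeway traffic system resilience should be comprehensive and objective, and should follow the principles of independence, non-correlation, and evaluability.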
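When a freeway is disturbed, resilience is affected in two ways. On the one hand, the structural features of the network itself can accommodate and absorb part of the traffic flow, which affects system resilience. On the other hand, traffic characteristics reflect the volume and spatiotemporal distribution of flow on the network and are an important factor in short-term changes in resilience. Evaluating freeway network resilience therefore requires traffic-characteristic indicators in addition to indicators of network topology. This paper starts from structural resilience and functional resilience, selects factors such as network density and the travel time index to build the indicator system, and determines the resilience grades of each indicator with reference to previous research. The evaluation procedure is shown in Fig. 1.
Fig.1 Freeway network resilience evaluation procedure
1.1 Structural resilience indicators
The structural properties of a network affect traffic resilience and do not change with the traffic state in the short term. Based on complex network theory, this paper selects four indicators (structure entropy, edge betweenness, clustering coefficient, and network density) to analyse the topological features of the network.
1.1.1 Structure entropy
Physically, entropy measures the degree of disorder of a system. Structure entropy quantifies the stability of the network structure and thus reflects the system's capacity to resist. In a transportation system, a smaller structure entropy indicates a less stable network structure and weaker structural resilience: when traffic surges, the network has difficulty resisting and absorbing it, and the probability of congestion on local sections rises. This paper studies the node distribution of the freeway system and computes the structure entropy of the network from node topological indicators[27]. The structure entropy E is calculated as:
$$E = -\sum_{i=1}^{N} I_i \ln I_i \tag{1}$$
where $I_i$ is the importance of node i:
$$I_i = \frac{k_i}{\sum_{i=1}^{N} k_i} \tag{2}$$
where $k_i$ is the degree of node i and N is the number of nodes in the network.
1.1.2 Edge betweenness
Betweenness reflects how much of a hub a network element is, and is divided into node betweenness and edge betweenness[28]. Edge betweenness reflects the transitional and connecting role of a single road section in the network; the larger its value, the more shortest paths pass through that section. As flow from different travel paths keeps increasing, the section's ability to withstand shocks weakens and local congestion may even occur, which lowers the overall operating efficiency of the network and reduces its resilience. Edge betweenness is the proportion of the shortest paths between all node pairs that pass through the edge, as in Eq. (3):
$$B_{cd} = \frac{2}{(N-1)(N-2)} \sum \frac{b_{icdj}}{b_{ij}} \tag{3}$$
where $b_{ij}$ is the number of all shortest paths from node $n_i$ to node $n_j$; $b_{icdj}$ is the number of those shortest paths that pass through edge $l_{cd}$; and $2/((N-1)(N-2))$ is the normalization factor. The edge betweenness of the network is the arithmetic mean of the betweenness of all its edges.
1.1.3 Clustering coefficient
The clustering coefficient mainly reflects how closely neighbouring nodes inside the network are connected. The mean clustering coefficient over all nodes is the network clustering coefficient C, which describes the degree of clustering of the network[29]. The closer C is to 1, the better the nodes cluster, the stronger the connectivity between nodes, and the stronger the resilience of the network. When a local section carries too much traffic, a well-clustered network can use the alternative resources of nearby nodes to disperse the flow, so that congestion is avoided or clears quickly and the section returns to a normal level of service. The clustering coefficients of a node and of the network are:
$$C_i = \frac{M_i}{k_i(k_i-1)} \tag{4}$$
$$C = \frac{1}{N}\sum_{i=1}^{N} C_i \tag{5}$$
where $k_i$ is the degree of node i, $k_i(k_i-1)$ is the maximum number of links that can exist among these nodes, and $M_i$ is the number of links that actually exist.
1.1.4 Network density
A road network needs a suitable scale so that it can withstand the shock of traffic flow and offer travellers a certain level of service. Network density is a common indicator in traffic evaluation: a denser network has more sections available for diverting traffic when flow surges, shows stronger resistance and absorption, and is therefore more resilient. For ease of calculation, freeway network density can be expressed as the ratio of freeway mileage in the region to the area of the region:
$$v = \frac{d}{S} \tag{6}$$
where d is the total mileage of the freeway network in the study region and S is the total area of the administrative division.
1.2 Functional resilience indicators
Freeway traffic flow follows certain spatiotemporal patterns, and different patterns of change in traffic state stress network resilience to different degrees. In free-flow conditions, a slow increase in traffic has little impact on the network and stays within what its resilience can bear; as traffic increases, the uneven distribution of flow in space and time intensifies the stress, and congestion on local sections can seriously affect the resilience of the whole network. Considering the effect of traffic characteristics on functional resilience, this paper selects the travel time index and the network flow non-uniformity index as functional resilience indicators.
1.2.1 Travel time index
The Travel Time Index (TTI) is a network operating-state indicator used by the Texas Transportation Institute[30-31]. It is defined as the ratio of the actual travel time of a trip to the travel time under free-flow conditions:
$$TTI_{ij} = \frac{t_{ij}}{t_{ij}^{free}} \tag{7}$$
where $t_{ij}$ is the actual travel time and $t_{ij}^{free}$ is the travel time under free-flow conditions.
TTI is a widely adopted indicator of traffic operating state; for example, the congestion delay index used by Amap (高德地图)[32] and Baidu Maps is the travel time index. From Eq. (7), a larger TTI means a longer travel time and a more congested section.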
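As traffic keeps entering the network, the network's absorptive capacity gradually shrinks, its resistance declines, and the resilience of the traffic system weakens.
1.2.2 Network flow non-uniformity index
The network flow non-uniformity index $F_{NE}$ expresses the degree of imbalance of flow over all nodes of the network. The flow balance degree $F_i(t)$ of node i at time t is characterized by the variance between the flows on all sections connected to node i and the node's standard flow $F_i^N(t)$[33]. $F_{NE}$ reflects how flow is distributed in the network. When flow is overly concentrated on local sections, the overall resilience of the network is poor even if some sections remain free-flowing. $F_{NE}(t)$ is computed as follows:
$$F_i^N(t) = \frac{Q_i(t)}{k_i} \tag{8}$$
where $Q_i(t)$ is the total traffic flowing into and out of node i at time t, and $k_i$ is the degree of node i; the node standard flow $F_i^N(t)$ is the degree-averaged traffic volume of node i.
$$F_i(t) = \frac{1}{k_i}\sum_{j=1}^{k_i}\left[Q_i^j(t) - F_i^N(t)\right]^2 \tag{9}$$
where $Q_i^j(t)$ is the total two-way traffic at time t on the section connecting node i and node j; the flow balance degree $F_i(t)$ of node i at time t is the mean of the squared deviations of the section flows $Q_i^j(t)$ from the standard flow $F_i^N(t)$.
$$F_{NE}(t) = \frac{1}{N}\sum_{i=1}^{N}\left[F_i(t) - \frac{1}{N}\sum_{l=1}^{N}F_l(t)\right]^2 \tag{10}$$
where $F_{NE}(t)$ is the network flow non-uniformity index at time t.
1.3 Grading of network resilience
To evaluate network resilience more rigorously, this paper draws on related research and expert opinion and divides network resilience into four grades: V = {v1, v2, v3, v4} = {strong, relatively strong, medium, weak}. The range of each indicator in each grade is determined from previous research and statistical yearbooks; the values are given in Table 1.
Because the indicators do not share a common dimension, this paper uses relative analysis to determine the indicator ranges. For network density, it takes the total freeway mileage and the administrative area from the Statistical Yearbooks of China's major cities, computes the freeway network density, and uses the results as a reference for the grade boundaries. The grade boundaries of structure entropy, edge betweenness, clustering coefficient, and the network flow non-uniformity index are set with reference to existing research[27,33-35]. The grade boundaries of the travel time index follow the ranges of the Amap congestion delay index[32].
2 Combined weighting-cloud model evaluation model
2.1 Cloud model theory
The cloud model was proposed by Li Deyi et al.[36] and is suited to uncertain conversion between qualitative concepts and quantitative values. In this study, the grading of freeway network traffic resilience is the qualitative concept and the values of the evaluation indicators are the quantitative data.
2.1.1 Concept of the cloud model
Let U be a quantitative universe of discourse expressed in precise numbers and D a qualitative concept on U. If x ∈ U is a random realization of the concept D, and the certainty degree μ(x) ∈ [0,1] of x with respect to D is a random number with a stable tendency[35]:
μ(x): U → [0,1], ∀x ∈ U, x → μ(x)
then the distribution of x over U is a membership cloud, i.e. a cloud model, and x is a cloud drop.
The expectation $E_x$, entropy $E_n$, and hyper-entropy $H_e$ are the three main parameters of the cloud model. The expectation $E_x$ is the centre of the cloud distribution corresponding to each of the four resilience grades and reflects how the indicator is graded. The entropy $E_n$ represents the range of values of each grade and reflects the randomness of data collection during evaluation. The hyper-entropy $H_e$ measures the uncertainty of the entropy; it indicates how random the membership of a grade is and reveals the link between the randomness of indicator values and the fuzziness of the grades. $Z = r_{ij}(E_{x_{ij}}, E_{n_{ij}}, H_{e_{ij}})$ characterizes the qualitative concept D as a whole, i.e. the resilience grades of the six indicators in this study.
2.1.2 Cloud generators
A cloud generator is the algorithm that converts between qualitative concepts and quantitative data in the cloud model; there are forward and backward cloud generators[37]. The forward cloud generator converts a qualitative concept into quantitative data, and the backward cloud generator converts quantitative data into a qualitative concept. This paper mainly uses the backward cloud generator to compute the three numerical characteristics of the cloud model from sample cloud-drop data, converting indicator values into resilience grades[38], as shown in Fig. 2.
2.1.3 Computing the cloud model characteristic values
There are several algorithms for obtaining the characteristic values of a cloud model with the backward cloud generator[39]. This paper follows a method previously applied in traffic research[35] to compute the three characteristic values. When the universe in which the cloud drops lie has an evaluation range [$C_{min}$, $C_{max}$], the expectation $E_x$ is:
$$E_x = \frac{C_{max} + C_{min}}{2} \tag{11}$$
where $C_{max}$ and $C_{min}$ are the upper and lower boundary values (thresholds) of the grade interval.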
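For one-sided intervals such as [−∞, $C_{max}$] or [$C_{min}$, +∞], the missing boundary is set from the upper or lower limit of the measured values of the indicator.
The entropy $E_n$ is calculated as:
$$E_n = \frac{C_{max} - C_{min}}{2.355} \tag{12}$$
Tab.1 Grading standard of resilience evaluation indicators
First-level indicator | Structural resilience | Structural resilience | Structural resilience | Structural resilience | Functional resilience | Functional resilience
Second-level indicator | Structure entropy | Edge betweenness | Clustering coefficient | Network density | TTI | F_NE
Strong | [0.9,1) | (0,0.1] | [0.8,1) | [900,1 800) | [1,1.5) | [0,2)
Relatively strong | [0.4,0.9) | (0.1,0.4] | [0.4,0.8) | [500,900) | [1.5,2) | [2,4)
Medium | [0.2,0.4) | (0.4,0.8] | [0.1,0.4) | [200,500) | [2,4) | [4,8)
Weak | [0,0.2) | (0.8,1] | [0,0.1) | [0,200) | ≥4 | [8,∞)
Fig.2 The schematic diagram of backward cloud generator
The hyper-entropy $H_e$ relaxes the randomness constraint on the indicator value x into a kind of "pan-normal distribution" and measures the uncertainty of the entropy $E_n$, so a suitable constant can be chosen for $H_e$ according to the size of $E_n$, generally 0.01 ≤ $H_e$ ≤ 0.1[35].
2.2 Combined weighting
The freeway network resilience indicator system covers several structural and functional indicators whose influence on resilience differs, so they need to be weighted sensibly. Weighting methods fall into two classes: subjective and objective. Subjective methods include the analytic hierarchy process (AHP), fuzzy comprehensive evaluation, and expert opinion. AHP is a common subjective method in which domain experts quantify qualitative questions so that the resulting weights fit the actual situation better. Because experts' experience and personal preferences are subjective, the assignment may be biased, which affects the objectivity of the result. The entropy method is an objective method that determines the importance of each indicator from the variability of the data; its weights are fairly objective and free of subjective influence. However, it relies only on the variability of each indicator's data and ignores interactions between indicators, so the final result may contradict the actual situation.
To offset the shortcomings of a single weighting method, this paper combines AHP and the entropy method and computes the combined weights of the resilience indicators with Eq. (13). Combined weighting matches the degree of difference and the degree of importance of the subjective and objective weights; it keeps the weights close to reality while reducing human influence, and improves the rationality and objectivity of the evaluation.
$$\omega_i = \frac{\delta_i \varepsilon_i}{\sum_{i=1}^{6} \delta_i \varepsilon_i} \tag{13}$$
where $\omega_i$ is the combined weight of resilience indicator i, i = 1, 2, …, 6; $\delta_i$ is the subjective weight of indicator i; and $\varepsilon_i$ is the objective weight of indicator i.
2.3 Evaluation model
This paper first determines the weights of the six resilience indicators by combined weighting, and then builds a comprehensive evaluation model of freeway network traffic resilience from cloud model theory. The steps are as follows:
1) From the indicators selected above, build the indicator set U = {u1, u2, u3, …, u6} of the evaluation object, the evaluation set V = {v1, v2, v3, v4}, and the set of combined weights W = {ω1, ω2, ω3, …, ω6}.
2) Use the backward cloud generator to produce the cloud parameter matrix for the evaluation set V: $Z = r_{ij}(E_{x_{ij}}, E_{n_{ij}}, H_{e_{ij}})$.
3) Use the cloud parameters to compute the certainty degree $\mu_{ij}$ of the cloud model:
$$\mu_{ij} = \exp\left[-\frac{(x_i - E_{x_{ij}})^2}{2E_{n_{ij}}^2}\right] \tag{14}$$
4) Take the weighted average of $\mu_{ij}$ with weights $\omega_i$ to obtain the comprehensive certainty degree with which the freeway network belongs to each grade, and determine the resilience grade of the network by the principle of maximum comprehensive certainty:
$$\mu_j = \sum_i \omega_i \mu_{ij} \tag{15}$$
3 Case study
3.1 Basic freeway data
This paper takes the freeway network of a certain city as the evaluation object and computes its traffic resilience indicators from traffic survey data, network topology data, and statistical yearbook data. First, traffic survey data for the 4 days from May 1 to May 4, 2020 were selected; after checking and cleaning the data, the traffic volumes entering and leaving each toll station were computed. Second, under complex network theory, a road network topology model can be built by the Space L, Space P, or Space R method[40-41]. To reflect the real network better and preserve the integrity of its structure as far as possible, this paper uses the Space L method to build the topological model of the city's freeways: toll stations are the nodes N = {n1, n2, n3, …, nn}, the road sections connecting toll stations are the links E = {e12, e13, e14, …, eij}, and the traffic volumes between toll stations are the weights of the topological model W = {w12, w13, w14, …, wij}. In the real network, one toll station on a freeway section may have several entrances and exits; to simplify the model, this paper treats the multiple entrances and exits of one toll station as a single node.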
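Based on the vector data of the city's freeway network provided by OpenStreetMap and the city's 2020 highway traffic map, a topological model of the freeway network was built with ArcGIS, UCINET, and other software; after processing, it contains 59 toll stations and 368 road sections, as shown in Fig. 3.
From the traffic survey data and network topology data above, the resilience indicators of the city's freeway network were computed for each day from May 1 to May 4, together with the 4-day average. The results are given in Table 2.
3.2 Network resilience evaluation
Combined weighting offsets the shortcomings of a single weighting method and makes the assigned weights more reasonable and accurate. Using the processed data from Section 3.1, the subjective and objective weights of each indicator were computed with AHP and the entropy method respectively, and the combined weights were determined by Eq. (13), as shown in Table 3.
Table 3 shows that the clustering coefficient has the largest combined weight, 0.268 5. This indicator therefore has an important influence on network resilience, and improving the connectivity between neighbouring nodes inside the network is key to strengthening resilience. Among the functional resilience indicators, the network flow non-uniformity index has the larger combined weight, 0.204 5, which shows that whether traffic is evenly distributed in the network has an important influence on resilience. This result indicates that managers can ease the impact of disturbances on system resilience through control measures (such as traffic restrictions and flow diversion).
Based on the actual indicator values in Table 2 and Eqs. (11) and (12), the backward cloud generator algorithm was used to compute the cloud model characteristic values of the six indicators for each resilience grade; the results are given in Table 4.
From the characteristic values in Table 4, the standard cloud charts of each indicator were drawn in MATLAB with the forward cloud generator algorithm, as shown in Fig. 4.
Fig. 4(c) shows that at a certainty degree of 0.6, the strong grade is concentrated in [0.82,0.99], the relatively strong grade in [0.37,0.75], the medium grade in [0.10,0.38], and the weak grade in [0.02,0.09].
From the actual indicator values in Table 2 and the cloud model characteristic values in Table 4, Eq. (14) was used to compute the certainty degree with which each indicator belongs to each grade; the results are given in Table 5.
From the results in Tables 3 and 5, Eq. (15) was used to compute the comprehensive certainty degree with which the freeway network belongs to each resilience grade, for each day from May 1 to May 4 and for the 4-day average; the final evaluation results are given in Table 6.
Table 6 shows that the average resilience grade of the freeway network over the 4 days from May 1 to May 4 is medium, which indicates that holiday travel demand is high and the shock of traffic flow has a considerable effect on network resilience. The single-day results show that the network is in the medium resilience state for most of the holiday. Because freeway tolls are waived during the holiday, the traffic system is hit hard and traffic is distributed most unevenly across the network, so the evaluation results are all medium resilience; whereas in the middle of the holiday
Tab.2 Resilience evaluation indicators of the freeway network
Indicator | Structure entropy | Edge betweenness | Clustering coefficient | Network density | TTI | F_NE
May 1 | 0.929 1 | 0.206 3 | 0.378 0 | 485.520 0 | 1.422 4 | 6.148 1
May 2 | | | | | 1.574 2 | 6.737 6
May 3 | | | | | 1.443 7 | 7.969 0
May 4 | | | | | 1.524 0 | 5.906 1
4-day average | | | | | 1.462 1 | 6.690 2
Tab.3 Combination weight values of evaluation indicators
Indicator | Structure entropy | Edge betweenness | Clustering coefficient | Network density | TTI | F_NE
Subjective weight | 0.109 1 | 0.165 1 | 0.202 2 | 0.166 6 | 0.155 6 | 0.201 4
Objective weight | 0.108 5 | 0.165 2 | 0.229 5 | 0.181 5 | 0.139 8 | 0.175 5
Combined weight | 0.068 3 | 0.157 8 | 0.268 5 | 0.175 0 | 0.125 9 | 0.204 5
(a) Actual freeway network (b) Network topology
Fig.3 The freeway network and its corresponding topological structure
Tab.4 Cloud model characteristics values of resilience evaluation indicators
Indicator | Structure entropy | Edge betweenness | Clustering coefficient | Network density | TTI | F_NE
Strong | (0.95,0.04,0.01) | (0.05,0.04,0.01) | (0.90,0.08,0.01) | (1450,467.09,0.1) | (1.25,0.21,0.02) | (1.00,0.85,0.01)
Relatively strong | (0.65,0.21,0.03) | (0.25,0.13,0.02) | (0.60,0.17,0.03) | (700,169.85,0.08) | (1.75,0.21,0.02) | (3.00,0.85,0.02)
Medium | (0.30,0.08,0.02) | (0.60,0.17,0.02) | (0.25,0.13,0.02) | (350,127.39,0.06) | (3.00,0.85,0.04) | (6.00,1.70,0.02)
Weak | (0.10,0.08,0.02) | (0.90,0.08,0.01) | (0.05,0.04,0.01) | (100,84.93,0.05) | (4.27,0.23,0.02) | (11.87,3.29,0.02)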
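The evaluation steps of Section 2.3 can be traced with a short Python sketch. It applies Eq. (13), Eq. (14), and Eq. (15) to the May 1 indicator values from Table 2, the subjective and objective weights from Table 3, and the (Ex, En) pairs from Table 4. The function and variable names are made up for this illustration, the hyper-entropy He is left out because Eq. (14) does not use it, and the sketch is not the authors' MATLAB implementation.

```python
import math

GRADES = ["strong", "relatively strong", "medium", "weak"]
INDICATORS = ["structure entropy", "edge betweenness", "clustering coefficient",
              "network density", "TTI", "F_NE"]

# (Ex, En) for each indicator (rows) and grade (columns), copied from Table 4
CLOUDS = [
    [(0.95, 0.04), (0.65, 0.21), (0.30, 0.08), (0.10, 0.08)],
    [(0.05, 0.04), (0.25, 0.13), (0.60, 0.17), (0.90, 0.08)],
    [(0.90, 0.08), (0.60, 0.17), (0.25, 0.13), (0.05, 0.04)],
    [(1450, 467.09), (700, 169.85), (350, 127.39), (100, 84.93)],
    [(1.25, 0.21), (1.75, 0.21), (3.00, 0.85), (4.27, 0.23)],
    [(1.00, 0.85), (3.00, 0.85), (6.00, 1.70), (11.87, 3.29)],
]

SUBJECTIVE = [0.1091, 0.1651, 0.2022, 0.1666, 0.1556, 0.2014]  # Table 3
OBJECTIVE = [0.1085, 0.1652, 0.2295, 0.1815, 0.1398, 0.1755]   # Table 3
MAY_1 = [0.9291, 0.2063, 0.3780, 485.52, 1.4224, 6.1481]       # Table 2


def combined_weights(subjective, objective):
    """Eq. (13): normalized product of subjective and objective weights."""
    products = [d * e for d, e in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]


def certainty(x, ex, en):
    """Eq. (14): certainty degree of value x for a cloud with centre ex and entropy en."""
    return math.exp(-((x - ex) ** 2) / (2 * en ** 2))


def evaluate(values, weights, clouds=CLOUDS):
    """Eq. (15): weighted certainty per grade, plus the grade with the maximum."""
    overall = []
    for j in range(len(GRADES)):
        mu_j = sum(w * certainty(x, *clouds[i][j])
                   for i, (x, w) in enumerate(zip(values, weights)))
        overall.append(mu_j)
    best = max(range(len(GRADES)), key=overall.__getitem__)
    return GRADES[best], overall


if __name__ == "__main__":
    weights = combined_weights(SUBJECTIVE, OBJECTIVE)
    for name, w in zip(INDICATORS, weights):
        print(f"{name}: {w:.4f}")
    grade, overall = evaluate(MAY_1, weights)
    print(grade, [round(m, 3) for m in overall])
```

Recomputing the combined weights this way gives a direct check of the third row of Table 3 against Eq. (13), and the per-grade certainties show how close the second-ranked grade is to the selected one.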
Time-Space Extreme Difference Entropy Weighting Method (English)
The Time-Space Extreme Difference Entropy Weighting Method (TSWEDEM) is a multi-criteria decision-making method used to evaluate the performance of a set of alternatives against multiple criteria. The technique takes a top-down approach that aims to identify the most advantageous alternative that satisfies the preferences and constraints of the decision-maker. This paper provides a comprehensive review of the TSWEDEM, including its theoretical background, algorithm, and practical applications.
The TSWEDEM is based on two core concepts: entropy and weighting. Entropy is a measure of the uncertainty or unpredictability of a system, while weighting is a technique used to assign relative importance to different criteria or factors. In the TSWEDEM, entropy measures how much the alternatives differ in performance on each criterion, while weighting incorporates the decision-maker's preferences for each criterion.
The algorithm of the TSWEDEM can be divided into three steps: normalization of the decision matrix, determination of the weighting coefficients, and calculation of the fuzzy comprehensive appraisal. The first step standardizes the decision matrix so that no single criterion dominates. The second step determines the weighting coefficient for each criterion through expert judgment or other methods. The final step calculates the fuzzy comprehensive appraisal for each alternative, which is a weighted sum of its normalized scores on each criterion.
The advantages of the TSWEDEM include its ability to handle both quantitative and qualitative criteria, its ability to incorporate expert judgment, and its ability to provide a comprehensive evaluation of alternatives. The technique has been successfully applied in a variety of fields, including environmental management, transportation planning, and energy systems analysis.
In environmental management, the TSWEDEM has been used to evaluate the performance of different waste management strategies. In transportation planning, it has been used to select the best transportation mode for a given route. In energy systems analysis, it has been used to evaluate the performance of different renewable energy technologies.
However, the TSWEDEM has several limitations, including the subjectivity of the weighting process, the potential for inconsistency in expert judgment, and the lack of a clear theoretical foundation. Additionally, the technique can be time-consuming and computationally intensive, particularly when dealing with large and complex decision matrices.
In conclusion, the TSWEDEM is a valuable multi-criteria decision-making method that can help decision-makers evaluate the performance of alternatives against multiple criteria. Its theoretical foundation, algorithm, and practical applications have been discussed in detail. The technique has several advantages, but it also has limitations that need to be considered when applying it in practice.
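Optimization in Gaussian
Optimization, step one: determine the molecular geometry. It can be built in software such as GVIEW and CHEM3D from what you know about the molecule, but more often it is built from experimental data (for example, crystal-structure software can produce a Gaussian input file in Cartesian coordinates; the software can be downloaded from Dahua Xiyou (大话西游), and GVIEW can generate a Gaussian input file in Z-matrix form). Note that the numbering of the atoms is determined by the order in which they are entered or built, so to enter the structure with its symmetry, make sure the first atom entered is the centre of symmetry; this speeds up the calculation.
The molecules I compute are fairly large and I have never tried this; I hope friends who have done this kind of work can fill it in.
The following posts were downloaded from this forum, Dahua Xiyou, and the Hongjian company (宏剑公司) site.
Take bond lengths that are close, e.g. B12 1.08589, B13 1.08581, B14 1.08544; bond angles that are close, e.g. A6 119.66589, A7 120.46585, A8 119.36016; and dihedral angles that are close, e.g. D10 -179.82816, D11 -179.71092, and set each group to a single value. I have heard this reduces the number of variables and improves computational efficiency. Is that right? Setting certain bond lengths and angles equal in the first step or at a later stage feels the same to me.
It is just that if they are set equal in the very first step, then unless there is experimental evidence, it is purely a matter of experience.
If it is done on the basis of an earlier calculation, and you trust that calculation, then setting them equal has some justification.
Still, setting them equal always carries some risk.
For a system without symmetry, nothing should be exactly equal.
You could try this: first PM3, then B3LYP/6-31G (with certain bond lengths and angles set equal), then B3LYP/6-31G again (releasing the manually imposed equality constraints on those bond lengths and angles).
For example, bond lengths, bond angles, and whether a bond exists at all: what Gview displays is simply not precise, though it is basically fine. Constraining them may cause big problems; the energies generally differ, sometimes considerably. To reduce the number of optimization parameters, do not merely set similar parameters equal; use the same parameter wherever symmetry requires it.
For example, for the benzene molecule the molecule specification is as follows:
  C
  C 1 B1
  C 2 B2 1 A1
  C 3 B3 2 A2 1 D1
  C 4 B4 3 A3 2 D2
  C 1 B5 2 A4 3 D3
  H 1 B6 2 A5 3 D4
  H 2 B7 1 A6 6 D5
  H 3 B8 2 A7 1 D6
  H 4 B9 3 A8 2 D7
  H 5 B10 4 A9 3 D8
  H 6 B11 1 A10 2 D9

  B1 1.395160
  B2 1.394712
  B3 1.395427
  B4 1.394825
  B5 1.394829
  B6 1.099610
  B7 1.099655
  B8 1.099680
  B9 1.099680
  B10 1.099761
  B11 1.099604
  A1 120.008632
  A2 119.994165
  A3 119.993992
  A4 119.998457
  A5 119.997223
  A6 119.980770
  A7 120.012795
  A8 119.981142
  A9 120.011343
  A10 120.007997
  D1 -0.056843
  D2 0.034114
  D3 0.032348
  D4 -179.972926
  D5 179.953248
  D6 179.961852
  D7 -179.996436
  D8 -179.999514
  D9 179.989175
There are many parameters, but by applying symmetry and using dummy atoms they can be reduced to:
  X
  X 1 B0
  C 1 B1 2 A1
  C 1 B1 2 A1 3 D1
  C 1 B1 2 A1 4 D1
  C 1 B1 2 A1 5 D1
  C 1 B1 2 A1 6 D1
  C 1 B1 2 A1 7 D1
  H 1 B2 2 A1 8 D1
  H 1 B2 2 A1 3 D1
  H 1 B2 2 A1 4 D1
  H 1 B2 2 A1 5 D1
  H 1 B2 2 A1 6 D1
  H 1 B2 2 A1 7 D1

  B0 1.0
  B1 1.2
  B2 2.2
  A1 90.0
  D1 60.0
The two jobs took 57 s and 36 s, with the symmetry reported as C01 and D6H respectively; the second is clearly far better than the first.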
Multiple Sequence Alignment (MSA)
Alignment 1
x GGGCACTGCAT
y GGTTACGTC-
z GGGAACTGCAG
Alignment 2
w GGACGTACC-
v GGACCT-----
Aligning alignments/profiles
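-AGGCTATCACCTG
TAG-CTACCA---G
CAG-CTACCA---G
CAG-CTATCAC-GG
CAG-CTATCGC-GG

Profile (fraction of each symbol per alignment column):
Column | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14
A | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | .8 | 0 | 0 | 0 | 0
C | .6 | 0 | 0 | 0 | 1 | 0 | 0 | .4 | 1 | 0 | .6 | .2 | 0 | 0
G | 0 | 0 | 1 | .2 | 0 | 0 | 0 | 0 | 0 | .2 | 0 | 0 | .4 | 1
T | .2 | 0 | 0 | 0 | 0 | 1 | 0 | .6 | 0 | 0 | 0 | 0 | .2 | 0
- | .2 | 0 | 0 | .8 | 0 | 0 | 0 | 0 | 0 | 0 | .4 | .8 | .4 | 0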
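A profile like the one on this slide is just the per-column frequency of each symbol. The sketch below computes it for the five aligned sequences shown above; the function name `profile` is made up for this illustration, and the gaps are written with a plain hyphen.

```python
from collections import Counter


def profile(alignment, alphabet="ACGT-"):
    """Per-column symbol frequencies of an alignment (list of equal-length strings)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    depth = len(alignment)
    columns = []
    for column in zip(*alignment):  # iterate over alignment columns
        counts = Counter(column)
        columns.append({symbol: counts[symbol] / depth for symbol in alphabet})
    return columns


if __name__ == "__main__":
    alignment = [
        "-AGGCTATCACCTG",
        "TAG-CTACCA---G",
        "CAG-CTACCA---G",
        "CAG-CTATCAC-GG",
        "CAG-CTATCGC-GG",
    ]
    for position, column in enumerate(profile(alignment), start=1):
        nonzero = {s: f for s, f in column.items() if f > 0}
        print(position, nonzero)
```

Aligning two alignments then amounts to aligning two such profiles column by column instead of aligning individual sequences.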
Aligning alignments/profiles
SeqA GARFIELD THE LAST FAT CAT
SeqB GARFIELD THE ---- FAST CAT

SeqB GARFIELD THE FAST CAT
SeqC GARFIELD THE VERY FAST CAT

SeqA GARFIELD THE LAST FA-T CAT
SeqB GARFIELD THE FAST CAT
SeqC GARFIELD THE VERY FAST CAT
SeqD -------- THE FA-T CAT
An alignment with 3 columns: AAA, ACC, ACG, ACT
Consistency-based approaches
▪ T-Coffee
– M-Coffee & 3D-Coffee (Expresso)
Research on the multi-objective synthesis assessment methodology for the general schemes design of navy vessels
SHIP SCIENCE AND TECHNOLOGY, Vol. 43, No. 1, Jan. 2021
Research on the multi-objective synthesis assessment methodology for the general schemes design of navy vessels
HU Kai-ye1, LIU Yuan2
(1. School of Shipbuilding Engineering, Harbin Engineering University, Harbin 150001, China; 2. Poly Technologies Inc, Beijing 100010, China)
Abstract: A warship is an extremely complex integrated system. To optimize the overall performance of surface warships and arrive at a general scheme, it is necessary to weigh multiple objectives against each other and to analyse and evaluate the set of candidate schemes, which provides a theoretical basis for decision making. This paper briefly reviews the principles and characteristics of comprehensive assessment methods and summarizes their applicability, current use, and development prospects in warship assessment. On this basis, it focuses on improving the mathematical model of the entropy-weight method in light of the characteristics of warship scheme evaluation and the method's shortcomings. By combining the entropy-weight method with the ideal point method (TOPSIS), it proposes an objective entropy-weight ideal point comprehensive evaluation method suited to warship scheme evaluation, and uses an example to calculate, analyse, and verify the new method. The results show that the proposed method is feasible and effective for evaluating warship schemes.
Key words: general schemes; multi-objective synthesis assessment; entropy weighting ideal point; mathematical model
CLC number: U662.2  Document code: A  Article ID: 1672-7649(2021)01-0017-06  doi: 10.3404/j.issn.1672-7649.2021.01.003
0 Introduction
The preliminary design stage of a warship produces a large number of candidate schemes. To obtain the design with the best overall effectiveness, the set of candidate schemes must be analysed and screened to settle on the final design. For completed ships, the overall performance must be analysed and evaluated comprehensively to establish whether it meets the rules and operational requirements.
The Entropy Method (English)
Entropy-Based Weighting Method
Entropy is a fundamental concept in various fields, including information theory, thermodynamics, and decision-making. In decision-making and data analysis, the entropy-based weighting method has emerged as a useful tool for determining the relative importance or weights of different criteria or attributes. It provides a systematic and objective way to assign weights to variables, which is particularly valuable in multi-criteria decision-making problems.
The entropy-based weighting method rests on the principle that the more information a criterion or attribute provides, the more important it is in the decision-making process. The method calculates the entropy of each criterion, which reflects the degree of uncertainty or dispersion of the data associated with that criterion. The higher the entropy, the lower the information content, and consequently the lower the weight assigned to that criterion.
The first step is to construct the decision matrix, a table that represents the performance of each alternative with respect to each criterion. The decision matrix can be denoted as X = [xij], where xij is the value of the jth criterion for the ith alternative.
Next, the decision matrix is normalized so that all values lie in the range 0 to 1. This can be done with various techniques, such as linear normalization or vector normalization. The normalized decision matrix is denoted as R = [rij], where rij is the normalized value of the jth criterion for the ith alternative. For the entropy below to lie between 0 and 1, the values of each criterion must also sum to 1 over the alternatives, i.e. rij is rescaled to rij / Σi rij before the entropy is computed.
The entropy of each criterion is then calculated using the following formula:
ej = -k * Σi (rij * ln(rij))
where ej is the entropy of the jth criterion, k is a constant (usually set to 1/ln(m), where m is the number of alternatives), and rij is the normalized value of the jth criterion for the ith alternative.
The weight of each criterion is then calculated as:
wj = (1 - ej) / Σj (1 - ej)
where wj is the weight of the jth criterion and ej is the entropy of the jth criterion.
The entropy-based weighting method has several advantages over other weighting methods, such as subjective weighting or equal weighting. First, it is an objective, data-driven approach: the weights are determined by the information content of the data rather than by subjective judgments or assumptions. This is particularly useful when decision-makers have limited knowledge of or experience with the problem at hand.
Second, the method is sensitive to the degree of variation in the data. Criteria whose values vary more have lower entropy and consequently higher weights. This reflects the fact that more variable criteria are generally more informative and therefore more important in the decision-making process.
Third, the method is relatively simple to implement and can easily be automated with computer software or spreadsheet applications. This makes it a practical and accessible tool for decision-makers in a variety of contexts, from business and finance to environmental management and public policy.
Despite its advantages, the entropy-based weighting method also has some limitations. For example, it assumes that the criteria are independent and that the data is accurate and reliable. Where there are dependencies between criteria, or where the data is incomplete or uncertain, the method may not produce accurate results.
In conclusion, the entropy-based weighting method is a powerful tool for determining the relative importance of different criteria or attributes in decision-making. By using the concept of information entropy, it provides an objective, data-driven approach to weight assignment, which is particularly valuable in complex multi-criteria decision problems. While it has its limitations, it is a widely used and well-established technique in decision analysis and data analysis.
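As a small worked illustration of the two formulas (the numbers are made up for this example and are not taken from any dataset), take m = 2 alternatives and two criteria whose normalized shares are (0.5, 0.5) and (0.9, 0.1), so that k = 1/ln 2:

$$e_1 = -\frac{1}{\ln 2}\left(0.5\ln 0.5 + 0.5\ln 0.5\right) = 1, \qquad e_2 = -\frac{1}{\ln 2}\left(0.9\ln 0.9 + 0.1\ln 0.1\right) \approx 0.469$$

$$w_1 = \frac{1 - 1}{(1 - 1) + (1 - 0.469)} = 0, \qquad w_2 = \frac{1 - 0.469}{(1 - 1) + (1 - 0.469)} = 1$$

The criterion on which the two alternatives do not differ carries no information and receives zero weight; all of the weight goes to the criterion that separates them.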
Entropy-weighting Multi-view Fuzzy C-means With Low Rank Constraint

ZHANG Jia-Xu 1, WANG Jun 1,2, ZHANG Chun-Xiang 1, LIN De-Fu 1, ZHOU Ta 3, WANG Shi-Tong 1

Abstract Effectively mining both the internal consistency and the diversity of multi-view data is important for developing multi-view fuzzy clustering algorithms. In this paper, built on the Co-FKM framework, we propose a novel multi-view fuzzy clustering algorithm called entropy-weighting multi-view fuzzy c-means with low-rank constraint (LR-MVEWFCM). On the one hand, starting from the consistency between views, we introduce the nuclear norm as a low-rank constraint on the fuzzy membership matrices of the views. On the other hand, based on Shannon entropy, an adaptive view-weight adjustment strategy is introduced to handle the differences among views according to the importance of each view. The learning criterion is optimized by the alternating direction method of multipliers (ADMM). Experimental results on both artificial and UCI (University of California Irvine) datasets show the effectiveness of the proposed method.

Key words Multi-view fuzzy clustering, Shannon entropy, low-rank constraint, nuclear norm, alternating direction method of multipliers (ADMM)

Citation Zhang Jia-Xu, Wang Jun, Zhang Chun-Xiang, Lin De-Fu, Zhou Ta, Wang Shi-Tong. Entropy-weighting multi-view fuzzy C-means with low rank constraint. Acta Automatica Sinica, 2022, 48(7): 1760−1770

DOI 10.16383/j.aas.c190350

With the development of diverse information acquisition technologies, feature data of an object can be obtained from different sources or from different perspectives; such data are called multi-view data. Multi-view data contain information about the same object from different angles.
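For example, web page data contain both page content and link information; video content contains both visual and audio information; image data involve image features such as color histograms and texture features as well as text describing the image content. Multi-view learning can fuse multi-view data effectively and avoids the limited information carried by single-view data [1−4].

Multi-view fuzzy clustering is an effective unsupervised multi-view learning method [5−7]. It introduces the fuzzy membership of each sample to each cluster into the multi-view clustering process in order to describe, under each view, the uncertainty with which a sample belongs to a cluster. Classical work includes the following. Ref. [8] takes the classical single-view fuzzy C-means (FCM) algorithm as the base model, uses the complementary information between views to define a collaborative clustering criterion, and proposes the Co-FC (Collaborative fuzzy clustering) algorithm. Ref. [9] follows the collaborative idea of Ref. [8] and proposes the Co-FKM (Multiview fuzzy clustering algorithm collaborative fuzzy K-means) algorithm, which introduces a two-view membership penalty term and constructs a new unsupervised multi-view collaborative learning method. Ref. [10] draws on the two-view constraint used by Co-FKM and Co-FC, introduces view weights, fuses the fuzzy membership matrices of the views with an ensemble strategy, and proposes the WV-Co-FCM (Weighted view collaborative fuzzy C-means) algorithm. Ref. [11] reduces the differences between views by minimizing the Euclidean distance between samples and cluster centers under two views, and proposes the Co-K-means (Collaborative multi-view K-means clustering) algorithm within the K-means framework. On this basis, Ref. [12] proposes the fuzzy-partition-based TW-Co-K-means (Two-level weighted collaborative K-means for multi-view clustering) algorithm, which adds consistency weights to the two-view Euclidean distance of Co-K-means and obtains better multi-view clustering results than Co-K-means. All of these methods build their regularization terms from pairs of views in order to mine consistency and diversity information, and lack a holistic treatment of multiple views.

Consistency and diversity are two important principles in the design of multi-view clustering algorithms [10−14]. Consistency means that the clustering results of the individual views should agree as far as possible. In practice, a global partition matrix is usually built through collaboration, ensembling, or similar means in order to obtain the final clustering result [14−16]. Diversity means that each view reflects a different aspect of the object and that these pieces of information complement one another [10]; a multi-view clustering algorithm must fuse them fully.

Manuscript received May 9, 2019; accepted July 17, 2019. Supported by National Natural Science Foundation of China (61772239) and Natural Science Foundation of Jiangsu Province (BK20181339). Recommended by Associate Editor LIU Yan-Jun.
1. School of Digital Media, Jiangnan University, Wuxi 214122
2. School of Communication and Information Engineering, Shanghai University, Shanghai 200444
3. School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212100
Acta Automatica Sinica, Vol. 48, No. 7, July 2022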
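Taking both factors into account, this paper proposes a new entropy-weighting multi-view fuzzy C-means algorithm with low rank constraint (LR-MVEWFCM). Its main contributions can be summarized as follows:

1) A low-rank constraint criterion for view consistency is proposed within the fuzzy clustering framework. Most existing multi-view fuzzy clustering algorithms build their regularization terms from pairwise relations between views and ignore the overall consistency of multiple views. Starting from global view consistency, this paper introduces a low-rank regularization term into the fuzzy clustering framework and obtains a new low-rank-constrained multi-view fuzzy clustering algorithm.
2) Consistency and diversity are considered simultaneously within the fuzzy clustering framework: in addition to the low-rank constraint, a multi-view Shannon entropy weighting strategy is used to handle view diversity. During iterative optimization, the view weights are adjusted dynamically to emphasize the views with better separability, which improves clustering performance.
3) The alternating direction method of multipliers (ADMM) [15] is used, for the first time within the fuzzy clustering framework, to optimize the LR-MVEWFCM algorithm.

In this paper, N denotes the number of samples, D the sample dimension, K the number of views, C the number of clusters, and m the fuzzy index. x_{j,k} denotes the feature vector of the j-th sample under the k-th view, j = 1, …, N, k = 1, …, K; v_{i,k} denotes the i-th cluster center under the k-th view, i = 1, …, C; U_k = [μ_{ij,k}] denotes the fuzzy membership matrix under the k-th view, where μ_{ij,k} is the fuzzy membership of the j-th sample to the i-th cluster center under the k-th view, i = 1, …, C, j = 1, …, N.

Section 1 reviews the classical fuzzy C-means model FCM [17] and the multi-view fuzzy clustering model Co-FKM [9]. Section 2 combines low-rank theory with multi-view Shannon entropy theory and proposes the new method. Section 3 verifies the effectiveness of the algorithm on simulated and UCI (University of California Irvine) datasets and analyzes the experiments. Section 4 gives the conclusions.

1 Related work

1.1 Fuzzy C-means clustering algorithm FCM

Let x_1, …, x_N ∈ R^D be the samples in a single-view setting, U = [μ_{i,j}] the fuzzy partition matrix, and V = [v_1, v_2, …, v_C] the cluster centers. The objective function of FCM can be written as

\[ J_{FCM} = \sum_{i=1}^{C} \sum_{j=1}^{N} \mu_{i,j}^{m} \, \| x_j - v_i \|^2, \qquad \text{s.t. } \sum_{i=1}^{C} \mu_{i,j} = 1, \ \mu_{i,j} \in [0, 1]. \tag{1} \]

The necessary conditions for J_FCM to attain a local minimum are

\[ v_i = \frac{\sum_{j=1}^{N} \mu_{i,j}^{m} x_j}{\sum_{j=1}^{N} \mu_{i,j}^{m}}, \tag{2} \]

\[ \mu_{i,j} = \left[ \sum_{l=1}^{C} \left( \frac{\| x_j - v_i \|}{\| x_j - v_l \|} \right)^{2/(m-1)} \right]^{-1}. \tag{3} \]

Iterating Eqs. (2) and (3) drives the objective function to a local minimum and yields the fuzzy partition matrix U, which gives the membership of each sample to each cluster center.

1.2 Multi-view fuzzy clustering model Co-FKM

Building on the classical FCM algorithm, Ref. [9] introduces a collaborative regularization term that constrains the consistency information between views, and proposes the multi-view fuzzy clustering model Co-FKM. The model is subject to the membership constraints of Eq. (4), and its objective function J_Co-FKM is defined in Eq. (5), where η is the collaborative partition parameter and Δ is the view disagreement term. By Eq. (6), Δ tends to 0 as the views become consistent.

After the memberships μ_{ij,k} of each view have been obtained by iteration, Co-FKM builds a single global membership partition matrix by taking the geometric mean of the memberships over the views:

\[ \hat{\mu}_{ij} = \left( \prod_{k=1}^{K} \mu_{ij,k} \right)^{1/K}, \tag{7} \]

where \hat{\mu}_{ij} is the global fuzzy partition result.

2 Entropy-weighting multi-view fuzzy C-means with low rank constraint

To address the shortcomings of current multi-view fuzzy clustering research, this paper proposes a new method, LR-MVEWFCM. On the one hand, a low-rank constraint term is introduced into the learning objective to control the consistency of the views as a whole during clustering. On the other hand, based on Shannon entropy theory, an entropy weighting mechanism controls the diversity among views. The model is optimized with the alternating direction method of multipliers.

Let the multi-view memberships U_1, …, U_K be fused into one overall membership matrix U. The rank function of U is convexly relaxed to the nuclear norm. By imposing a low-rank constraint on U, the consistency problem across views is turned into the nuclear norm minimization problem of Eq. (8), where U = [U_1 ⋯ U_K]^T denotes the global partition matrix and ‖·‖_* the nuclear norm. Optimizing Eq. (8) enforces the low-rank constraint on the global partition matrix. The low-rank constraint remedies the limitation that most current multi-view clustering algorithms can only build constraints from pairs of views, and therefore mines the global consistency information in multi-view data more effectively.

Existing multi-view clustering algorithms usually assume by default that all views contribute equally to the clustering result [11]. In practice, however, the data of some views are poorly separable because their spatial distributions overlap.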
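To keep such views from unduly affecting the clustering result, this paper weights the views and builds a Shannon entropy regularization term that adjusts the view weights during clustering, so that views with better separability receive weights that are as large as possible. Let the view weights satisfy \sum_{k=1}^{K} w_k = 1 and w_k ≥ 0; the Shannon entropy regularization term is then given by Eq. (9).

In summary, this paper makes the following changes. First, the global fuzzy membership matrix U is subject to the proposed low-rank constraint. Second, the view weights w_k are taken into account when computing the loss function, and the Shannon entropy regularization term of the view weights is added. Let U = [U_1 ⋯ U_K]^T, and let w = [w_1, …, w_k, …, w_K] denote the view weights of the K views. The objective function of LR-MVEWFCM is given by Eq. (10), subject to the constraints of Eq. (11). The fuzzy index is set to m = 2.

2.1 ADMM-based solution

In this section, ADMM is used to minimize the objective function (11) by alternating-direction iteration. Let g(Z) = θ‖Z‖_*. Minimizing Eq. (10) can then be rewritten as a constrained optimization problem, whose solution decomposes into the following subproblems:

1) V-subproblem. Fix w and U and update V. Minimizing Eq. (15) gives a closed-form solution for v_{i,k}.
2) U-subproblem. Fix w, Q and Z and update U. Minimizing Eq. (17) gives a closed-form solution for U^{(t+1)}.
3) w-subproblem. Fix V and U and update w.
4) Z-subproblem. Fix Q and U and update Z. Introducing the soft-thresholding operator, the solution of Eq. (20) is

\[ Z^{(t+1)} = A \, S_{\theta/\rho}(\Sigma) \, B^{T}, \qquad S_{\theta/\rho}(\Sigma) = \mathrm{diag}\left( \{ \max(0, \sigma_i - \theta/\rho) \} \right), \quad i = 1, 2, \cdots, N, \]

where U^{(t+1)} + Q^{(t)} = A Σ B^T is the singular value decomposition of the matrix U^{(t+1)} + Q^{(t)}; the proximal operator of the nuclear norm is given by the soft-thresholding operator.
5) Q-subproblem. Fix Z and U and update Q.

After these iterations the objective function converges to a local extremum, and the fuzzy membership matrices under the different views are obtained. Following the ensemble strategy of Ref. [10], the view weights w = [w_1, …, w_k, …, w_K] and the fuzzy membership matrices U are used to build a global fuzzy partition matrix \tilde{U}, where w_k and U_k denote the weight and the fuzzy membership matrix of the k-th view, respectively.

The LR-MVEWFCM algorithm is described as follows:

Input. A multi-view sample set with K views, where any view k (1 ≤ k ≤ K) corresponds to the sample set X_k = {x_{1,k}, …, x_{N,k}}; the number of clusters C; the iteration threshold ε; the maximum number of iterations T.
Output. The cluster centers v^{(t)}_{i,k} of each view, the fuzzy partition matrix \tilde{U}, and the view weights w_k.
Step 1. Randomly initialize V^{(t)}, normalize U^{(t)} and w^{(t)}, and set t = 0.
Step 2. Update v^{(t+1)}_{i,k} according to Eq. (21).
Step 3. Update U^{(t+1)} according to Eq. (23).
Step 4. Update w^{(t+1)}_k according to Eq. (24).
Step 5. Update Z^{(t+1)} according to Eq. (26).
Step 6. Update Q^{(t+1)} according to Eq. (27).
Step 7. If L^{(t+1)} − L^{(t)} < ε or t > T, terminate the loop; otherwise return to Step 2.
Step 8. From the view weights w_k and the per-view fuzzy memberships U_k obtained in Step 7, compute \tilde{U} using Eq. (23).

2.2 Discussion

2.2.1 Comparison with low-rank constrained algorithms

In recent years, machine learning models based on low-rank constraints have been studied widely. Classical work includes the LRR (Low rank representation) model of Ref. [16], which convexly relaxes the rank function of a matrix to the nuclear norm and obtains an affinity matrix based on low-rank representation by solving a nuclear norm minimization problem; the low-rank tensor constrained multiview subspace clustering algorithm (LT-MSC) of Ref. [14], which computes subspace representation matrices with low-rank constraints across views; and Ref. [18], which further introduces the low-rank constraint into multi-model subspace clustering and achieves good performance. This paper combines the low-rank constraint with the multi-view fuzzy clustering framework and proposes LR-MVEWFCM, which uses the low-rank constraint to enforce consistency across multi-view data. The proposed method can be regarded as an extension of low-rank models to multi-view fuzzy clustering.

2.2.2 Comparison with the multi-view Co-FKM algorithm

Figs. 1 and 2 show the workflows of the multi-view Co-FKM algorithm and of the proposed LR-MVEWFCM algorithm, respectively.

Fig. 1 Co-FKM algorithm for multi-view clustering task (multi-view data → data of views 1, 2, …, K → pairwise constraints between views → per-view fuzzy memberships U_1, U_2, …, U_K → ensemble decision function → partition matrix Û)

Fig. 2 LR-MVEWFCM algorithm for multi-view clustering task (multi-view data → data of views 1, 2, …, K → global constraint → per-view fuzzy memberships U_1, U_2, …, U_K and view weights w_1, w_2, …, w_K → diversity-aware ensemble decision function → partition matrix Û with view diversity)

The proposed algorithm differs from the classical Co-FKM algorithm in both the consistency constraint on multi-view information and the ensemble strategy for the multi-view clustering results. For the consistency constraint, the pairwise constraints between views in Co-FKM are extended to a global multi-view consistency constraint. For the ensemble strategy, instead of simply taking the geometric mean of the membership matrices as Co-FKM does, the per-view memberships are combined with the view weights to build an ensemble decision function that reflects view diversity.

3 Experiments and analysis

3.1 Experimental setup

Experiments are carried out on simulated datasets and on real datasets from UCI. Four clustering algorithms, FCM [17], CombKM [19], Co-FKM [9] and Co-Clustering [20], are selected for comparison; the parameter settings are listed in Table 1. The experimental environment is an Intel Core i5-7400 CPU with a clock frequency of 2.3 GHz and 8 GB of memory. The programming environment is MATLAB 2015b.

Table 1 Parameter setting in the experiments
Algorithm | Description | Parameter settings
FCM | Classical single-view fuzzy clustering algorithm | Fuzzy index m = min(N, D−1) / (min(N, D−1) − 2), where N is the number of samples and D the sample dimension
CombKM | Combined K-means algorithm | -
Co-FKM | Multi-view collaborative-partition fuzzy clustering algorithm | Fuzzy index m = min(N, D−1) / (min(N, D−1) − 2); collaborative learning coefficient η ∈ (K−1)/K, where K is the number of views; step size ρ = 0.01
Co-Clustering | Collaborative clustering algorithm over the sample and feature spaces | Regularization coefficient λ ∈ {10^−3, 10^−2, …, 10^3}; regularization coefficient μ ∈ {10^−3, 10^−2, …, 10^3}
LR-MVEWFCM | Entropy-weighting multi-view fuzzy clustering algorithm with low rank constraint | View weight balance factor λ ∈ {10^−5, 10^−4, …, 10^5}; low-rank regularization coefficient θ ∈ {10^−3, 10^−2, …, 10^3}; fuzzy index m = 2
MVEWFCM | LR-MVEWFCM with low-rank regularization coefficient θ = 0 | View weight balance factor λ ∈ {10^−5, 10^−4, …, 10^5}; fuzzy index m = 2

The following two performance indices are used to evaluate the results of each algorithm.

1) Normalized mutual information (NMI) [10]

\[ \mathrm{NMI} = \frac{\sum_{i} \sum_{j} N_{i,j} \log \dfrac{N \cdot N_{i,j}}{N_i \, N_j}}{\sqrt{\left( \sum_{i} N_i \log \dfrac{N_i}{N} \right) \left( \sum_{j} N_j \log \dfrac{N_j}{N} \right)}} \]

where N_{i,j} denotes the degree of agreement between class i and cluster j, N_i the number of samples in class i, N_j the number of samples in cluster j, and N the total number of samples.

2) Rand index (RI) [10]

\[ \mathrm{RI} = \frac{f_{00} + f_{11}}{N(N-1)/2} \]

where f_00 is the number of sample pairs that have different class labels and are assigned to different clusters, f_11 is the number of sample pairs that have the same class label and are assigned to the same cluster, and N is the total number of samples. Both indices take values in [0, 1]; the closer the value is to 1, the better the clustering performance. To verify robustness, every reported value is the average over 10 runs of the algorithm.

3.2 Experiments on simulated datasets

To evaluate the clustering performance of the proposed algorithm on multi-view data, the method of Ref. [10] is used to construct a three-dimensional simulated dataset A(x, y, z). First, the normal random function normrnd in MATLAB is used to build the data subsets A_1(x, y, z), A_2(x, y, z) and A_3(x, y, z); each subset corresponds to one cluster and contains 200 samples. The first and second groups have similar values on feature z, and the second and third groups have similar values on feature x. The three groups are then merged into the set A(x, y, z), with 600 samples in total. Finally, the samples in the dataset are normalized.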
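The features x, y and z are then combined in pairs as shown in Table 2 to obtain multi-view data.

Table 2 Characteristic composition of simulated dataset
View | Features
View 1 | x, y
View 2 | y, z
View 3 | x, z

The samples under each view are visualized in Fig. 3.

Fig. 3 Simulated data under multiple views: (a) Dataset A (axes X, Y, Z); (b) View 1; (c) View 2; (d) View 3

Fig. 3 shows that the data in view 1 are well separated in space, whereas the data in views 2 and 3 overlap to some extent, which degrades clustering performance under those views. Several new datasets are generated by combining different views, as listed in Table 3, which also gives the mean and variance of the results over 10 runs of LR-MVEWFCM.

Table 3 Performance comparison of the proposed algorithms on simulated dataset
No. | Views included | NMI | RI
1 | View 1 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000
2 | View 2 | 0.7453 ± 0.0075 | 0.8796 ± 0.0081
3 | View 3 | 0.8750 ± 0.0081 | 0.9555 ± 0.0006
4 | View 1, View 2 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000
5 | View 1, View 3 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000
6 | View 2, View 3 | 0.9104 ± 0.0396 | 0.9634 ± 0.0192
7 | View 2, View 3 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000

Comparing the performance of LR-MVEWFCM on datasets 1~3, the algorithm achieves the best results on view 1 and performs better on view 3 than on view 2, which is consistent with the spatial separability of the views in Fig. 3. In addition, when the views are combined in pairs to form datasets 4~6, LR-MVEWFCM obtains results at least as good as those of the constituent single views. This indicates that using a low-rank constraint to mine the consistency in multi-view data can improve clustering performance.

Based on multi-view dataset 7, Table 4 compares the proposed algorithm with other classical clustering algorithms. Because the simulated dataset is well separated in some feature spaces, the proposed algorithm, Co-Clustering, FCM and Co-FKM all achieve very good clustering results, while CombKM performs slightly worse. The reason is that CombKM focuses on mining the information between samples but ignores the collaboration between views, whereas the proposed algorithm uses the low-rank constraint to mine the global consistency across views and therefore outperforms CombKM.

3.3 Experiments on real datasets

This section uses five UCI datasets: 1) Iris; 2) Image Segmentation (IS); 3) Balance; 4) Ionosphere; 5) Wine. Because each of these datasets contains different types of features, the features can be regrouped to construct corresponding multi-view datasets. Table 5 gives the grouping information.

The multi-view clustering algorithms are run on the multi-view datasets, and FCM is run on the original datasets. The results are summarized in Tables 6 and 7.

The NMI and RI values in Tables 6 and 7 show that Co-FKM clearly outperforms the other classical clustering algorithms. Compared with Co-FKM, LR-MVEWFCM uses a low-rank regularization term to mine the consistency between views and an adaptive multi-view entropy weighting strategy to control the diversity among views, and its clustering performance is better and more stable, with better convergence. Tables 6 and 7 also show that on the IS, Balance, Iris, Ionosphere and Wine datasets the NMI and RI indices improve by 3 ~ 5 percentage points, which again demonstrates the effectiveness of the proposed algorithm for multi-view clustering.

To further illustrate the positive role of the low-rank constraint, LR-MVEWFCM and MVEWFCM are compared experimentally; the results are shown in Fig. 4. On both the simulated datasets and the real UCI datasets, LR-MVEWFCM achieves better clustering results than MVEWFCM.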
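It follows that the low-rank constraint in the LR-MVEWFCM learning objective effectively exploits the consistency of multi-view data to improve clustering performance.

To study the convergence of the proposed algorithm, convergence experiments are run on the same 8 datasets; the changes in the objective function are shown in Fig. 5. On the real datasets the algorithm stabilizes after only about 15 iterations, which indicates that it is practical in scenarios with demanding speed requirements.

Taken together, the experimental results show that, for fuzzy clustering of data with multi-view characteristics, multi-view fuzzy clustering algorithms usually achieve better results than traditional single-view fuzzy clustering algorithms. In this paper, a low-rank constraint is introduced into multi-view fuzzy clustering to strengthen the consistency between views, and Shannon entropy is introduced to adjust the view weights and control the diversity among views, which yields better clustering results than the other multi-view clustering algorithms.

Table 4 Performance comparison of the proposed algorithms on simulated dataset 7
Dataset | Index | Co-Clustering | CombKM | FCM | Co-FKM | LR-MVEWFCM
A | NMI-mean | 1.0000 | 0.9305 | 1.0000 | 1.0000 | 1.0000
A | NMI-std | 0.0000 | 0.1464 | 0.0000 | 0.0000 | 0.0000
A | RI-mean | 1.0000 | 0.9445 | 1.0000 | 1.0000 | 1.0000
A | RI-std | 0.0000 | 0.1171 | 0.0000 | 0.0000 | 0.0000

3.4 Parameter sensitivity experiments

LR-MVEWFCM has two regularization coefficients: the view weight balance factor λ and the low-rank regularization coefficient θ. Taking the experiment on simulated dataset 7 as an example, Fig. 6 shows how the performance of the algorithm changes as the coefficients vary from 0 to 1000. When the low-rank regularization coefficient θ = 0, that is, when the term is not added, the performance is worst, which confirms the effectiveness of the low-rank term. As θ varies, the performance changes relatively little, which shows that on this dataset the algorithm is insensitive to θ and has a certain robustness. When the Shannon entropy regularization coefficient λ = 0, the performance is likewise poor, which supports the introduction of this term. As λ increases, the performance tends to improve, which shows that on this dataset the effect of this term is relatively pronounced.

4 Conclusion

Starting from the consistency and the diversity of multi-view clustering, this paper proposes an entropy-weighting multi-view fuzzy clustering algorithm with low rank constraint. The algorithm uses a low-rank regularization term to mine the consistency between views and an adaptive multi-view entropy weighting strategy to control the diversity among views, which improves its performance. Experiments on simulated and real datasets show that its clustering performance is better than that of the other multi-view clustering algorithms. The algorithm also needs few iterations and converges quickly, which makes it practical. Because it adopts the classical FCM framework and measures the difference between data objects with the Euclidean distance, it is not suitable for some high-dimensional data scenarios.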
How to design multi-view clustering algorithms for high-dimensional data will be the focus of our future research.

Table 5 Multi-view data constructed based on UCI dataset
No. | Original dataset | View description | Features per view | Samples | Views | Classes
8 | IS | Shape; RGB | 9; 9 | 2310 | 2 | 7
9 | Iris | Sepal length and Sepal width; Petal length and Petal width | 2; 2 | 150 | 2 | 3
10 | Balance | Left-arm weight and left-arm length; right-arm weight and right-arm length | 2; 2 | 625 | 2 | 3
11 | Iris | Sepal length; Sepal width; Petal length; Petal width | 1; 1; 1; 1 | 150 | 4 | 3
12 | Balance | Left-arm weight; left-arm length; right-arm weight; right-arm length | 1; 1; 1; 1 | 625 | 4 | 3
13 | Ionosphere | Each feature is used as a separate view | 1 | 351 | 34 | 2
14 | Wine | Each feature is used as a separate view | 1 | 178 | 13 | 3

Table 6 Comparison of NMI performance of five clustering methods (mean ± standard deviation, followed by the P-value)
No. | Co-Clustering | P-value | CombKM | P-value | FCM | P-value | Co-FKM | P-value | LR-MVEWFCM
8 | 0.5771 ± 0.0023 | 0.0019 | 0.5259 ± 0.0551 | 0.2056 | 0.5567 ± 0.0184 | 0.0044 | 0.5881 ± 0.0109 | 3.76×10^−4 | 0.5828 ± 0.0044
9 | 0.7582 ± 7.4015×10^−17 | 2.03×10^−24 | 0.7251 ± 0.0698 | 2.32×10^−7 | 0.7578 ± 0.0698 | 1.93×10^−24 | 0.8317 ± 0.0064 | 8.88×10^−16 | 0.9029 ± 0.0057
10 | 0.2455 ± 0.0559 | 0.0165 | 0.1562 ± 0.0749 | 3.47×10^−5 | 0.1813 ± 0.1172 | 0.0061 | 0.2756 ± 0.0309 | 0.1037 | 0.3030 ± 0.0402
11 | 0.7582 ± 1.1703×10^−16 | 2.28×10^−16 | 0.7468 ± 0.0079 | 5.12×10^−16 | 0.7578 ± 1.1703×10^−16 | 5.04×10^−16 | 0.8244 ± 1.1102×10^−16 | 2.16×10^−16 | 0.8768 ± 0.0097
12 | 0.2603 ± 0.0685 | 0.3825 | 0.1543 ± 0.0763 | 4.61×10^−4 | 0.2264 ± 0.1127 | 0.1573 | 0.2283 ± 0.0294 | 0.0146 | 0.2863 ± 0.0611
13 | 0.1385 ± 0.0085 | 2.51×10^−9 | 0.1349 ± 2.9257×10^−17 | 2.35×10^−13 | 0.1299 ± 0.0984 | 2.60×10^−10 | 0.2097 ± 0.0329 | 0.0483 | 0.2608 ± 0.0251
14 | 0.4288 ± 1.1703×10^−16 | 1.26×10^−8 | 0.4215 ± 0.0095 | 7.97×10^−9 | 0.4334 ± 5.8514×10^−17 | 2.39×10^−8 | 0.5295 ± 0.0301 | 0.4376 | 0.5413 ± 0.0364

Table 7 Comparison of RI performance of five clustering methods (mean ± standard deviation, followed by the P-value)
No. | Co-Clustering | P-value | CombKM | P-value | FCM | P-value | Co-FKM | P-value | LR-MVEWFCM
8 | 0.8392 ± 0.0010 | 1.3475×10^−14 | 0.8112 ± 0.0369 | 1.95×10^−7 | 0.8390 ± 0.0115 | 0.0032 | 0.8571 ± 0.0019 | 0.0048 | 0.8508 ± 0.0013
9 | 0.8797 ± 0.0014 | 1.72×10^−26 | 0.8481 ± 0.0667 | 2.56×10^−5 | 0.8859 ± 1.1703×10^−16 | 6.49×10^−26 | 0.9358 ± 0.0037 | 3.29×10^−14 | 0.9665 ± 0.0026
10 | 0.6515 ± 0.0231 | 3.13×10^−4 | 0.6059 ± 0.0340 | 1.37×10^−6 | 0.6186 ± 0.0624 | 0.0016 | 0.6772 ± 0.0227 | 0.0761 | 0.6958 ± 0.0215
11 | 0.8797 ± 0.0014 | 1.25×10^−18 | 0.8755 ± 0.0029 | 5.99×10^−12 | 0.8859 ± 0.0243 | 2.33×10^−18 | 0.9267 ± 2.3406×10^−16 | 5.19×10^−18 | 0.9527 ± 0.0041
12 | 0.6511 ± 0.0279 | 0.0156 | 0.6024 ± 0.0322 | 2.24×10^−5 | 0.6509 ± 0.0652 | 0.1139 | 0.6511 ± 0.0189 | 0.008 | 0.6902 ± 0.0370
13 | 0.5877 ± 0.0030 | 1.35×10^−12 | 0.5888 ± 0.0292 | 2.10×10^−14 | 0.5818 ± 1.1703×10^−16 | 4.6351×10^−13 | 0.6508 ± 0.0147 | 0.0358 | 0.6855 ± 0.0115
14 | 0.7187 ± 1.1703×10^−16 | 3.82×10^−6 | 0.7056 ± 0.0168 | 1.69×10^−6 | 0.7099 ± 1.1703×10^−16 | 8.45×10^−7 | 0.7850 ± 0.0162 | 0.5905 | 0.7917 ± 0.0353

Fig. 4 The influence of low rank constraints on the performance of the algorithm (the X-coordinate is the dataset number and the Y-coordinate is the clustering performance index): (a) RI; (b) NMI

Fig. 5 Convergence curve of LR-MVEWFCM algorithm (objective function value against the number of iterations): (a) Dataset 7; (b) Dataset 8; (c) Dataset 9; (d) Dataset 10; (e) Dataset 11; (f) Dataset 12; (g) Dataset 13; (h) Dataset 14

Fig. 6 Sensitivity analysis of parameters on simulated dataset 7

References
1 Xu C, Tao D C, Xu C. Multi-view learning with incomplete views. IEEE Transactions on Image Processing, 2015, 24(12): 5812−5825
2 Brefeld U. Multi-view learning with dependent views. In: Proceedings of the 30th Annual ACM Symposium on Applied Computing. Salamanca, Spain: ACM, 2015. 865−870
3 Muslea I, Minton S, Knoblock C A. Active learning with multiple views. Journal of Artificial Intelligence Research, 2006, 27(1): 203−233
4 Zhang C Q, Adeli E, Wu Z W, Li G, Lin W L, Shen D G. Infant brain development prediction with latent partial multi-view representation learning. IEEE Transactions on Medical Imaging, 2018, 38(4): 909−918
5 Bickel S, Scheffer T. Multi-view clustering. In: Proceedings of the 4th IEEE International Conference on Data Mining (ICDM '04). Brighton, UK: IEEE, 2004. 19−26
6 Wang Y T, Chen L H. Multi-view fuzzy clustering with minimax optimization for effective clustering of data from multiple sources. Expert Systems with Applications, 2017, 72: 457−466
7 Wang Jun, Wang Shi-Tong, Deng Zhao-Hong. Survey on challenges in clustering analysis research. Control and Decision, 2012, 27(3): 321−328
8 Pedrycz W. Collaborative fuzzy clustering. Pattern Recognition Letters, 2002, 23(14): 1675−1686
9 Cleuziou G, Exbrayat M, Martin L, Sublemontier J H. CoFKM: A centralized method for multiple-view clustering. In: Proceedings of the 9th IEEE International Conference on Data Mining. Miami, FL, USA: IEEE, 2009. 752−757
10 Jiang Y Z, Chung F L, Wang S T, Deng Z H, Wang J, Qian P J. Collaborative fuzzy clustering from multiple weighted views. IEEE Transactions on Cybernetics, 2015, 45(4): 688−701
11 Bettoumi S, Jlassi C, Arous N. Collaborative multi-view K-means clustering. Soft Computing, 2019, 23(3): 937−945
12 Zhang G Y, Wang C D, Huang D, Zheng W S, Zhou Y R. TW-Co-K-means: Two-level weighted collaborative K-means for multi-view clustering. Knowledge-Based Systems, 2018, 150: 127−138
13 Cao X C, Zhang C Q, Fu H Z, Liu S, Zhang H. Diversity-induced multi-view subspace clustering. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015. 586−594
14 Zhang C Q, Fu H Z, Liu S, Liu G C, Cao X C. Low-rank tensor constrained multiview subspace clustering. In: Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015. 1582−1590
15 Boyd S, Parikh N, Chu E, Peleato B, Eckstein J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 2011, 3(1): 1−122
16 Liu G C, Lin Z C, Yan S C, Sun J, Yu Y, Ma Y. Robust recovery of subspace structures by low-rank representation. IEEE
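The Rand index used as an evaluation metric in Section 3.1 of the paper above counts the sample pairs on which the class labels and the clustering agree. The following is a minimal Python sketch under that definition; the function name `rand_index` and the toy label vectors are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations

def rand_index(true_labels, pred_labels):
    """RI = (f00 + f11) / (N (N - 1) / 2), computed over all sample pairs."""
    pairs = list(combinations(range(len(true_labels)), 2))
    f11 = 0  # same class label and same cluster
    f00 = 0  # different class labels and different clusters
    for a, b in pairs:
        same_class = true_labels[a] == true_labels[b]
        same_cluster = pred_labels[a] == pred_labels[b]
        if same_class and same_cluster:
            f11 += 1
        elif not same_class and not same_cluster:
            f00 += 1
    return (f00 + f11) / len(pairs)

# Cluster identifiers are arbitrary, so a relabelled perfect clustering scores 1.
perfect = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Hypothetical imperfect clustering: 3 of the 6 pairs agree.
partial = rand_index([0, 0, 1, 1], [0, 0, 0, 1])
print(perfect, partial)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for datasets of the sizes listed in Table 5 but would need a contingency-table formulation for much larger data.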
Evaluation of Elderly-oriented Household Appliance Design Based on TAME-EDKT Model

WANG Chun-peng, XU Zhen-wu
(College of Art and Design, Qingdao University of Technology, Shandong Qingdao 266033, China)

PACKAGING ENGINEERING, Vol. 44, No. 8, April 2023, p. 153. Received: 2022-11-09. About the author: WANG Chun-peng (born 1970), male, master's degree, professor; main research interest: product design and its theory.

ABSTRACT: The work aims to propose a comprehensive evaluation model based on the technology acceptance model of the elderly, in order to solve three problems in the design evaluation of elderly-oriented household appliances: index sets that are not tailored to the elderly, interference from the interactions between indexes, and the one-sidedness of ignoring users' needs when determining weight values. Firstly, the evaluation indexes of the design scheme were determined based on the technology acceptance model of the elderly. Secondly, the initial weight of the evaluation indexes was determined by the entropy weight method, the interaction weight between indexes was determined by the DEMATEL method, and the KANO model was used to adjust the weights. The final weight was obtained from these three weighting methods, and the schemes were ranked and selected based on the TOPSIS method. Finally, the method was used to select the optimal design scheme for an elderly-oriented vacuum cleaner, and was compared with other methods for verification. The method can effectively establish an index set suited to the elderly for the design evaluation of elderly-oriented household appliances, better accounts for the interaction between indexes and the influence of users' needs on the index weights, and further improves the theoretical research in the field of elderly-oriented household appliance design.

KEY WORDS: design of elderly-oriented household appliances; technology acceptance model of the elderly; TAME-EDKT model; TOPSIS method

CLC number: TB472; Document code: A; Article ID: 1001-3563(2023)08-0153-08; DOI: 10.19554/ki.1001-3563.2023.08.015

According to the seventh national population census, people aged 65 and over account for 13.5% of China's population.
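The last stage of the pipeline described in the abstract above ranks design schemes with TOPSIS once the final index weights are known. The sketch below shows only that ranking step in plain Python. The weights, the scheme scores and the benefit/cost flags are hypothetical and are not taken from the paper; the entropy, DEMATEL and KANO weighting stages are assumed to have already produced `weights`.

```python
import math

def topsis(matrix, weights, benefit):
    """Return the relative closeness of each alternative to the ideal solution.

    matrix[i][j]: score of alternative i on criterion j
    weights[j]:   final weight of criterion j (assumed to sum to 1)
    benefit[j]:   True if larger is better, False for a cost criterion
    """
    n = len(weights)
    # Vector-normalize every column, then apply the weights
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    # Positive and negative ideal solutions, criterion by criterion
    ideal = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v) for j in range(n)]
    anti = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v) for j in range(n)]
    closeness = []
    for r in v:
        d_pos = math.dist(r, ideal)   # distance to the positive ideal
        d_neg = math.dist(r, anti)    # distance to the negative ideal
        closeness.append(d_neg / (d_pos + d_neg))
    return closeness

# Three hypothetical schemes scored on ease of use (benefit) and weight in kg (cost)
schemes = [[8.0, 2.0], [6.0, 4.0], [3.0, 7.0]]
closeness = topsis(schemes, weights=[0.6, 0.4], benefit=[True, False])
ranking = sorted(range(len(schemes)), key=lambda i: closeness[i], reverse=True)
print(closeness, ranking)
```

In this toy case the first scheme is best on both criteria and the third is worst on both, so their closeness values are 1 and 0 and the ranking is simply the input order.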
Tikhonov regularization

From Wikipedia, the free encyclopedia

Tikhonov regularization is the most commonly used method of regularization of ill-posed problems, named for Andrey Tychonoff. In statistics, the method is also known as ridge regression. It is related to the Levenberg-Marquardt algorithm for non-linear least-squares problems.

The standard approach to solve an overdetermined system of linear equations given as

\[ Ax = b \]

is known as linear least squares and seeks to minimize the residual

\[ \| Ax - b \|^2 \]

where \| \cdot \| is the Euclidean norm. However, the matrix A may be ill-conditioned or singular, yielding a non-unique solution. In order to give preference to a particular solution with desirable properties, the regularization term is included in this minimization:

\[ \| Ax - b \|^2 + \| \Gamma x \|^2 \]

for some suitably chosen Tikhonov matrix, Γ. In many cases, this matrix is chosen as the identity matrix Γ = I, giving preference to solutions with smaller norms. In other cases, highpass operators (e.g., a difference operator or a weighted Fourier operator) may be used to enforce smoothness if the underlying vector is believed to be mostly continuous. This regularization improves the conditioning of the problem, thus enabling a numerical solution. An explicit solution, denoted by \hat{x}, is given by:

\[ \hat{x} = (A^T A + \Gamma^T \Gamma)^{-1} A^T b \]

The effect of regularization may be varied via the scale of matrix Γ. For Γ = αI, when α = 0 this reduces to the unregularized least squares solution provided that (A^T A)^{−1} exists.

Contents
1 Bayesian interpretation
2 Generalized Tikhonov regularization
3 Regularization in Hilbert space
4 Relation to singular value decomposition and Wiener filter
5 Determination of the Tikhonov factor
6 Relation to probabilistic formulation
7 History
8 References

Bayesian interpretation

Although at first the choice of the solution to this regularized problem may look artificial, and indeed the matrix Γ seems rather arbitrary, the process can be justified from a Bayesian point of view. Note that for an ill-posed problem one must necessarily introduce some additional assumptions in order to get a stable solution. Statistically we might assume that a priori we know that x is a random variable with a multivariate normal distribution. For simplicity we take the mean to be zero and assume that each component is independent with standard deviation σ_x. Our data is also subject to errors, and we take the errors in b to be also independent with zero mean and standard deviation σ_b. Under these assumptions the Tikhonov-regularized solution is the most probable solution given the data and the a priori distribution of x, according to Bayes' theorem. The Tikhonov matrix is then Γ = αI for Tikhonov factor α = σ_b / σ_x.

If the assumption of normality is replaced by assumptions of homoskedasticity and uncorrelatedness of errors, and one still assumes zero mean, then the Gauss-Markov theorem entails that the solution is the minimal unbiased estimate.

Generalized Tikhonov regularization

For general multivariate normal distributions for x and the data error, one can apply a transformation of the variables to reduce to the case above. Equivalently, one can seek an x to minimize

\[ \| Ax - b \|_P^2 + \| x - x_0 \|_Q^2 \]

where we have used \| x \|_P^2 to stand for the weighted norm x^T P x (cf. the Mahalanobis distance). In the Bayesian interpretation P is the inverse covariance matrix of b, x_0 is the expected value of x, and Q is the inverse covariance matrix of x. The Tikhonov matrix is then given as a factorization of the matrix Q = Γ^T Γ (e.g. the Cholesky factorization), and is considered a whitening filter. This generalized problem can be solved explicitly using the formula

\[ x_0 + (A^T P A + Q)^{-1} A^T P (b - A x_0) \]

Regularization in Hilbert space

Typically discrete linear ill-conditioned problems result from the discretization of integral equations, and one can formulate Tikhonov regularization in the original infinite dimensional context. In the above we can interpret A as a compact operator on Hilbert spaces, and x and b as elements in the domain and range of A. The operator A^* A + Γ^T Γ is then a self-adjoint bounded invertible operator.

Relation to singular value decomposition and Wiener filter

With Γ = αI, this least squares solution can be analyzed in a special way via the singular value decomposition. Given the singular value decomposition of A

\[ A = U \Sigma V^T \]

with singular values σ_i, the Tikhonov regularized solution can be expressed as

\[ \hat{x} = V D U^T b \]

where D has diagonal values

\[ D_{ii} = \frac{\sigma_i}{\sigma_i^2 + \alpha^2} \]

and is zero elsewhere. This demonstrates the effect of the Tikhonov parameter on the condition number of the regularized problem. For the generalized case a similar representation can be derived using a generalized singular value decomposition. Finally, it is related to the Wiener filter:

\[ \hat{x} = \sum_{i=1}^{q} f_i \frac{u_i^T b}{\sigma_i} v_i \]

where the Wiener weights are f_i = \sigma_i^2 / (\sigma_i^2 + \alpha^2) and q is the rank of A.

Determination of the Tikhonov factor

The optimal regularization parameter α is usually unknown and often in practical problems is determined by an ad hoc method. A possible approach relies on the Bayesian interpretation described above. Other approaches include the discrepancy principle, cross-validation, L-curve method, restricted maximum likelihood and unbiased predictive risk estimator. Grace Wahba proved that the optimal parameter, in the sense of leave-one-out cross-validation, minimizes:

\[ G = \frac{RSS}{\tau^2} = \frac{\| X \hat{\beta} - y \|^2}{\left[ \mathrm{Tr}\left( I - X (X^T X + \alpha^2 I)^{-1} X^T \right) \right]^2} \]

where RSS is the residual sum of squares and τ is the effective number of degrees of freedom. Using the previous SVD decomposition, we can simplify the above expression:

\[ RSS = \left\| y - \sum_{i=1}^{q} (u_i' b) u_i \right\|^2 + \left\| \sum_{i=1}^{q} \frac{\alpha^2}{\sigma_i^2 + \alpha^2} (u_i' b) u_i \right\|^2 \]

\[ RSS = RSS_0 + \left\| \sum_{i=1}^{q} \frac{\alpha^2}{\sigma_i^2 + \alpha^2} (u_i' b) u_i \right\|^2 \]

and

\[ \tau = m - \sum_{i=1}^{q} \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2} = m - q + \sum_{i=1}^{q} \frac{\alpha^2}{\sigma_i^2 + \alpha^2} \]

Relation to probabilistic formulation

The probabilistic formulation of an inverse problem introduces (when all uncertainties are Gaussian) a covariance matrix C_M representing the a priori uncertainties on the model parameters, and a covariance matrix C_D representing the uncertainties on the observed parameters (see, for instance, Tarantola, 2004[1]). In the special case when these two matrices are diagonal and isotropic, C_M = σ_M^2 I and C_D = σ_D^2 I, the equations of inverse theory reduce to the equations above, with α = σ_D / σ_M.

History

Tikhonov regularization has been invented independently in many different contexts. It became widely known from its application to integral equations from the work of A. N. Tikhonov and D. L. Phillips. Some authors use the term Tikhonov-Phillips regularization. The finite dimensional case was expounded by A. E. Hoerl, who took a statistical approach, and by M. Foster, who interpreted this method as a Wiener-Kolmogorov filter. Following Hoerl, it is known in the statistical literature as ridge regression.

References
∙ Tychonoff, Andrey Nikolayevich (1943). "Об устойчивости обратных задач [On the stability of inverse problems]". Doklady Akademii Nauk SSSR 39 (5): 195–198.
∙ Tychonoff, A. N. (1963). "О решении некорректно поставленных задач и методе регуляризации [Solution of incorrectly formulated problems and the regularization method]". Doklady Akademii Nauk SSSR 151: 501–504. Translated in Soviet Mathematics 4: 1035–1038.
∙ Tychonoff, A. N.; V. Y. Arsenin (1977). Solution of Ill-posed Problems. Washington: Winston & Sons. ISBN 0-470-99124-0.
∙ Hansen, P. C., 1998, Rank-deficient and Discrete ill-posed problems, SIAM
∙ Hoerl AE, 1962, Application of ridge analysis to regression problems, Chemical Engineering Progress, 58, 54-59.
∙ Foster M, 1961, An application of the Wiener-Kolmogorov smoothing theory to matrix inversion, J. SIAM, 9, 387-392
∙ Phillips DL, 1962, A technique for the numerical solution of certain integral equations of the first kind, J Assoc Comput Mach, 9, 84-97
∙ Tarantola A, 2004, Inverse Problem Theory (free PDF version), Society for Industrial and Applied Mathematics, ISBN 0-89871-572-5
∙ Wahba, G, 1990, Spline Models for Observational Data, Society for Industrial and Applied Mathematics
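For the common choice Γ = αI, the explicit Tikhonov solution amounts to solving the regularized normal equations (AᵀA + α²I) x = Aᵀb. The following is a minimal standard-library Python sketch of that computation; the helper names `solve_linear` and `tikhonov` and the small test system are assumptions made for illustration, and a numerical library would normally replace the hand-written elimination.

```python
def solve_linear(M, y):
    """Solve M x = y by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [y[i]] for i, row in enumerate(M)]  # augmented matrix [M | y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def tikhonov(A, b, alpha):
    """Return x minimizing ||Ax - b||^2 + alpha^2 ||x||^2 (Gamma = alpha * I)."""
    rows, cols = len(A), len(A[0])
    # Regularized normal-equation matrix A^T A + alpha^2 I
    M = [[sum(A[k][i] * A[k][j] for k in range(rows)) + (alpha ** 2 if i == j else 0.0)
          for j in range(cols)] for i in range(cols)]
    rhs = [sum(A[k][i] * b[k] for k in range(rows)) for i in range(cols)]  # A^T b
    return solve_linear(M, rhs)

# Diagonal test system with singular values 1 and 0.001
A = [[1.0, 0.0], [0.0, 0.001]]
b = [1.0, 0.001]
x_plain = tikhonov(A, b, alpha=0.0)   # unregularized least squares
x_reg = tikhonov(A, b, alpha=0.1)     # damps the component with the small singular value
print(x_plain, x_reg)
```

Because A is diagonal here, each component of the regularized solution equals σ_i b_i / (σ_i² + α²), so the example shows directly how the component tied to the small singular value is suppressed while the other is almost unchanged.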
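A Study on Subband Analysis under the Criterion of Maximum Weighting Entropy

BAO Ming, GUAN Lu-yang, LI Xiao-dong, TIAN Jing
(Institute of Acoustics, Chinese Academy of Sciences, Beijing, 100080)

1 Introduction

Subband analysis is an important tool in signal analysis, and the choice of the subband criterion is a topic of active interest among researchers. This paper applies weighted entropy to the dynamic subband partitioning of signals. Ground target classification experiments verify that the weighted-entropy subband method has an advantage in obtaining low-dimensional classification features.

2 Subband analysis with maximum power-spectrum weighted entropy

The purpose of subband partitioning is to obtain the maximum amount of information about the signal with a limited number of bands. Maximum power-spectrum weighted entropy analysis is a dynamic subband method that maximizes the amount of information under given a priori weight coefficients. The analysis model is as follows.

The power density function of a random signal f(t) is defined as

\[ \varphi(\omega) = \lim_{T \to \infty} \frac{|F_T(\omega)|^2}{T} \tag{1} \]

With normalized frequency, the average power over 0 to 2π is

\[ Pow(W) = \frac{1}{2\pi} \int_{0}^{2\pi} \varphi(\omega) \, d\omega \tag{2} \]

Given a bandwidth Δω_i in the frequency domain, the average power of f(t) in the band W_i = [ω_i − Δω_i/2, ω_i + Δω_i/2] is

\[ Pow(W_i) = \frac{1}{\Delta\omega_i} \int_{\omega_i - \Delta\omega_i/2}^{\omega_i + \Delta\omega_i/2} \varphi(\omega) \, d\omega \tag{3} \]

Taking the ratio of the average power in the i-th band to the total average power of the signal,

\[ P_i = \frac{Pow(W_i)}{Pow(W)} \tag{4} \]

we obtain

\[ \sum_{i=1}^{N} P_i = 1 \tag{5} \]

Let the weight vector η = {η_1, η_2, …, η_N} give the weighting coefficient of each band; these coefficients can be derived from prior knowledge or obtained by learning and optimization. Following the definition of weighted entropy in information theory, the power-spectrum weighted entropy is defined as

\[ H_{\eta}(P) = -\sum_{i=1}^{N} \eta_i P_i \ln P_i \tag{6} \]

3 Cepstral-coefficient classification features from maximum power-spectrum weighted entropy subbands

Feature extraction. Given the a priori weight coefficients, the maximum power-spectrum weighted entropy subband algorithm is implemented by grouping the frequency-domain bins of the discrete FFT and optimizing the band boundaries so that the weighted entropy of the frequency-domain energy distribution is maximized. This is a combinatorial optimization problem and is solved with a genetic algorithm, as shown in Fig. 1.

Fig. 1: Block diagram of the genetic algorithm for band boundary optimization

After the L-dimensional optimized weight coefficients have been determined, the algorithm computes the maximum-weighted-entropy band boundaries d(l) of the power spectrum of the target signal and computes the classification features as follows:

\[ \hat{X}(l) = \frac{\sum_{k=d(l)}^{d(l+1)} |X(k)|^2}{d(l+1) - d(l)}, \qquad l = 1, \ldots, L \tag{7} \]

\[ c(i) = \sqrt{\frac{2}{L}} \sum_{l=1}^{L} \log_{10}\left( \hat{X}(l) \right) \cos\left( \frac{(l - 0.5) \, i \pi}{L} \right), \qquad i = 1, \ldots, L \tag{8} \]

4 Ground target classification experiments

Taking the noise-feature classification of two types of ground targets, wheeled vehicles and tracked vehicles, as an example [1], this section discusses the classification performance of the "maximum power-spectrum weighted entropy subband cepstral coefficient" features.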
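The weighted entropy of Eq. (6) is straightforward to evaluate once a power spectrum has been split into bands. The following Python sketch is illustrative only: the function name `weighted_band_entropy`, the half-open bin ranges and the toy spectrum are assumptions, and the band powers are normalized by their sum so that the P_i add up to 1 as Eq. (5) requires. A search over `boundaries` (for example the genetic algorithm mentioned in the text) would call such a function as its fitness measure.

```python
import math

def weighted_band_entropy(spectrum, boundaries, eta):
    """Weighted entropy H = -sum_i eta_i * P_i * ln(P_i) of the band powers.

    spectrum:   power per FFT bin
    boundaries: bin indices delimiting the bands; band i covers bins
                boundaries[i] .. boundaries[i+1]-1 (assumed convention)
    eta:        one a priori weight per band
    """
    band_power = []
    for lo, hi in zip(boundaries[:-1], boundaries[1:]):
        bins = spectrum[lo:hi]
        band_power.append(sum(bins) / len(bins))    # average power in the band
    total = sum(band_power)
    probs = [p / total for p in band_power]         # P_i, normalized to sum to 1
    # Bands with zero power contribute nothing (0 * ln 0 is taken as 0)
    return -sum(w * p * math.log(p) for w, p in zip(eta, probs) if p > 0)

# Hypothetical 8-bin spectrum split into three bands
spectrum = [4.0, 4.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0]
h_equal_weights = weighted_band_entropy(spectrum, [0, 2, 6, 8], [1.0, 1.0, 1.0])
# A flat spectrum with unit weights gives the maximum value ln(3) for three bands
h_flat = weighted_band_entropy([1.0] * 6, [0, 2, 4, 6], [1.0, 1.0, 1.0])
print(h_equal_weights, h_flat)
```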
Example of using the entropy weight method in R

The entropy weighting method, also known as the entropy weight method, is a multi-criteria decision-making method that uses entropy as a measure of information to assign weights to different criteria. It is commonly used in decision problems with several criteria where the importance of each criterion has to be determined.

The method starts by calculating the entropy of each criterion, which represents the degree of uncertainty or randomness in the data. The entropy is calculated from the probability distribution of the data for that criterion. A criterion with higher entropy has more evenly distributed data and therefore provides less information for decision-making.

The next step is to calculate the weight of each criterion. Because high entropy means little information, the weight is based on the degree of divergence $1 - e_j$ rather than on the entropy $e_j$ itself: $w_j = (1 - e_j) / \sum_k (1 - e_k)$. The resulting weights sum to 1 and represent the relative importance of each criterion in the decision.

To illustrate, consider choosing a vacation destination with three criteria: cost, weather, and attractions. First, we collect data on the cost, weather, and attractions of several candidate destinations and calculate the normalized entropy of each criterion. If the costs of the destinations are nearly the same, the normalized cost shares are almost uniform and the entropy of the cost criterion is high. If the costs differ widely, the shares are uneven and the entropy is low.

Next, we turn the entropy values into weights. If the entropy values for cost, weather, and attractions are 0.6, 0.4, and 0.8 respectively, the divergences are 0.4, 0.6, and 0.2, which sum to 1.2, so the weights are approximately 0.333, 0.500, and 0.167.

Finally, we use the weights to make a decision: with weights of 0.333 for cost, 0.500 for weather, and 0.167 for attractions, we calculate a weighted score for each destination. The destination with the highest score is the best choice.
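The entropy weight steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it uses the common formulation in which each weight is proportional to $1 - e_j$, the function name `entropy_weights` and the destination data are hypothetical, and the conversion of cost-type criteria to benefit-type criteria is left out.

```python
import math

def entropy_weights(matrix):
    """Return (entropies, weights) for a decision matrix.

    matrix: list of rows (alternatives), each a list of non-negative
    criterion values. Assumes at least two rows and that not every
    column is perfectly uniform.
    """
    m = len(matrix)
    n = len(matrix[0])
    entropies = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        e = 0.0
        for value in column:
            p = value / total          # share of alternative i in criterion j
            if p > 0:                  # 0 * ln(0) is taken as 0
                e -= p * math.log(p)
        entropies.append(e / math.log(m))   # normalize to [0, 1]
    divergence = [1.0 - e for e in entropies]
    scale = sum(divergence)
    weights = [d / scale for d in divergence]
    return entropies, weights

# Hypothetical data: columns are cost, weather score, attractions score.
destinations = [
    [1200, 24, 8],
    [900, 30, 5],
    [1500, 18, 9],
    [1000, 27, 3],
]
entropies, weights = entropy_weights(destinations)
print(entropies)
print(weights)
```

A criterion whose values are identical for every alternative gets entropy 1 and weight 0, which matches the intuition that it cannot help distinguish the alternatives.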
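Player contribution evaluation based on a combination model

Journal of Shandong University of Technology (Natural Science Edition), Vol. 38, No. 3, May 2024
Received: 2023-04-19. First author: DAI Haoran, male, 2196347446@; corresponding author: CAO Wenqin, female, caowenqin@. Article ID: 1672-6197(2024)03-0071-08

DAI Haoran, CAO Wenqin (School of Mathematics and Statistics, Shandong University of Technology, Zibo 255049, China)

Abstract: This paper introduces two statistical models for evaluating player contribution and, building on the advantages of model combination, proposes a combined evaluation model. Taking the 2021-2022 NBA playoffs as an example, an empirical analysis is carried out with the Friedman test from nonparametric statistics. The contributions of the players of the 2021-2022 NBA champion Golden State Warriors are explored and ranked with three models: the Topsis model improved by the entropy weight method, principal component comprehensive evaluation, and the combined model. The Friedman test shows that the conclusions of the three models are essentially consistent.
Keywords: nonparametric statistics; correlation analysis; combination model; Friedman test
CLC number: O213. Document code: A

As is well known, the NBA (National Basketball Association) represents the highest level of basketball in the world, and winning the NBA championship is the highest honor in the sport. In the 2021-2022 season the Warriors, from the Western Conference, beat the Celtics, from the Eastern Conference, 4:2 in the series to win the NBA championship. It was the Warriors' first title in four years and the seventh in franchise history. Early in the 2021-2022 season, however, the Warriors were not favored, and even after they reached the Finals outside observers did not expect them to win. This paper therefore analyzes the championship team in order to show objectively the key factors behind the title.

Because players differ in physical condition, training methods, and playing style, their competitive abilities differ. Each player's performance, help to the team, and on-court contribution fluctuate from game to game, so players ultimately differ in their contribution to winning. Several methods for evaluating player performance already exist in China. Xu Jian et al. [1] studied a player contribution evaluation system for basketball games based on information entropy theory and established the evaluation system together with a formula for the contribution value; Wu Wei et al. [2] built an evaluation model of the offensive and defensive ability of domestic CBA (China Basketball Association) players by principal component analysis; Liu Xinran [3] obtained a composite score for a team's defensive quality by linear weighting; Jing Huaiguo et al. [4] used Q-type cluster analysis to analyze the overall ability of the teams in the men's basketball tournament of the 30th Olympic Games; Li Guo et al. [5] used the Topsis (technique for order preference by similarity to an ideal solution) model to analyze the offensive and defensive indicators of the Chinese men's basketball team and its opponents. At present the CBA and NBA leagues use an efficiency rating to judge a player's contribution to winning.

This paper introduces the Topsis model improved by the entropy weight method and the principal component comprehensive evaluation model. Because each of the two models has strengths and weaknesses, a combined evaluation model is proposed to obtain better results. It uses the results of both models to discuss the contribution scores and rankings of the Golden State Warriors players, and the consistency of the three models' results is analyzed with the Friedman test.

1 Comprehensive evaluation models

1.1 Topsis model improved by the entropy weight method

Compared with the traditional Topsis model, the entropy-weight-improved Topsis model mainly improves the weighted decision matrix of the players being evaluated. The entropy weight method is an objective way of determining weights from the indicators themselves. It reflects the information hidden behind the indicators and thereby sharpens the differences among them, avoiding an unclear analysis caused by indicators that differ too little, so that all kinds of information are fully reflected [6]. Before the data matrix is built, every indicator must be converted into a benefit-type (larger-is-better) indicator. Suppose there are m objects to be evaluated and n indicators. The decision matrix formed from all the data is normalized to obtain the standardized data matrix P used in the following steps:

$$P_{ij} = \frac{a_{ij}}{\sum_{i=1}^{m} a_{ij}}, \qquad (1)$$

$$P = (P_{ij})_{m \times n}, \qquad (2)$$

where $P_{ij}$ is the proportion of the j-th indicator value of the i-th object, m is the number of objects, n is the number of indicators, and $a_{ij}$ is the value of the j-th indicator for the i-th object.

Entropy is a measure of the disorder of a system, and the entropy weight reflects the amount of useful information each indicator provides to the decision maker. The information utility of every indicator is measured with entropy, which determines its entropy weight [7]. The information entropy $e_j$ of the j-th indicator is

$$e_j = -\frac{1}{\ln m}\sum_{i=1}^{m} P_{ij}\ln P_{ij}, \qquad (3)$$

where $e_j$ ($0 \le e_j \le 1$) is the entropy of the j-th indicator and $1/\ln m$ is the information entropy coefficient. The weight $w_j$ of each indicator is determined from the entropy values:

$$w_j = \frac{1 - e_j}{\sum_{j=1}^{n}(1 - e_j)}. \qquad (4)$$

With the weights determined, the weighted decision matrix is built by taking the weight vector from equation (4) into the decision matrix: each column of the standardized matrix is multiplied by the weight of its indicator, giving the weighted normalized decision matrix $V = (v_{ij})_{m \times n}$:

$$V = \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & & \vdots \\ v_{m1} & v_{m2} & \cdots & v_{mn} \end{bmatrix} = \begin{bmatrix} r_{11}w_1 & r_{12}w_2 & \cdots & r_{1n}w_n \\ r_{21}w_1 & r_{22}w_2 & \cdots & r_{2n}w_n \\ \vdots & \vdots & & \vdots \\ r_{m1}w_1 & r_{m2}w_2 & \cdots & r_{mn}w_n \end{bmatrix}, \qquad (5)$$

where $v_{ij}$ is the weighted standardized value of the j-th indicator for the i-th object, $r_{ij}$ is the standardized value of the j-th indicator for the i-th object, and $w_j$ is the weight of the j-th indicator.

After the weighted normalized decision matrix has been computed, the positive and negative ideal solutions are found. Let $V^+$ denote the most preferred alternative (positive ideal solution) and $V^-$ the least preferred alternative (negative ideal solution):

$$V^+ = \{\max_i v_{ij} \mid j = 1, 2, \dots, n\} = \{v_1^+, v_2^+, \dots, v_n^+\}, \quad V^- = \{\min_i v_{ij} \mid j = 1, 2, \dots, n\} = \{v_1^-, v_2^-, \dots, v_n^-\}. \qquad (6)$$

The distances from each object to the positive and negative ideal solutions are then computed:

$$D_i^+ = \sqrt{\sum_{j=1}^{n}(v_{ij} - v_j^+)^2}, \quad D_i^- = \sqrt{\sum_{j=1}^{n}(v_{ij} - v_j^-)^2}, \quad i = 1, 2, \dots, m. \qquad (7)$$

Finally, the closeness $C_i$ of each object to the optimal alternative is

$$C_i = \frac{D_i^-}{D_i^- + D_i^+}, \quad 1 \le i \le m, \qquad (8)$$

where a larger $C_i$ means that the i-th object is closer to the optimal level. $C_i$ takes values in $[0, 1]$: when $C_i = 0$ the object has the worst composite score, and when $C_i = 1$ it has the best.

1.2 Principal component comprehensive evaluation model

In comprehensive evaluation the variables are often correlated, and building a linear evaluation function from such indicators easily duplicates information and distorts the result [8]. Principal component analysis replaces the original high-dimensional variables with a small number of composite variables, which reduces the time spent on indicator selection and makes the selection easier. Suppose there are m objects and n indicators. The data are first standardized:

$$y_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \quad i = 1, 2, \dots, m, \; j = 1, 2, \dots, n, \qquad (9)$$

where $x_{ij}$ is the value of the j-th indicator for the i-th object, $\bar{x}_j$ is the mean of the j-th indicator, and $s_j$ is its standard deviation. After preprocessing, the correlation matrix R of the data set is computed:

$$R = (r_{jk})_{n \times n}, \qquad (10)$$

$$r_{jk} = \frac{1}{m - 1}\sum_{i=1}^{m}\frac{(x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)}{s_j s_k}, \qquad (11)$$

where $r_{jk}$ is the correlation coefficient between the j-th and k-th indicators. The n eigenvalues and eigenvectors of R are then found and the number of principal components is determined. From the characteristic equation

$$|\lambda I_n - R| = 0 \qquad (12)$$

the n eigenvalues are obtained and sorted as $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0$.

The size of an eigenvalue describes how much the corresponding principal component matters in the evaluation. Each eigenvalue has an eigenvector, denoted $u_1, u_2, \dots, u_n$, and the principal components are

$$z_{ij} = u_j^{T} x_i, \quad x_i = (x_{i1}, x_{i2}, \dots, x_{in})^{T}. \qquad (13)$$

The eigenvalue of an extracted principal component should exceed the mean of all eigenvalues. Because the data are standardized, this mean is 1, so the components with eigenvalues greater than 1 are extracted: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_t \ge 1$, $t < n$. The cumulative variance contribution rate must also reach 80%; if it does not, the eigenvalue threshold is changed to determine the number of components:

$$E = \frac{\sum_{g=1}^{t}\lambda_g}{\sum_{j=1}^{n}\lambda_j} \ge 80\%. \qquad (14)$$

The selected components are used to compute the composite score of each object. The linear weighted value of every principal component is obtained first, and the t components are then combined in a weighted sum whose weights are their variance contribution rates:

$$w_g = \frac{\lambda_g}{\sum_{g=1}^{t}\lambda_g}. \qquad (15)$$

The composite score of each object is finally

$$F_i = \sum_{g=1}^{t} w_g z_{ig}, \quad i = 1, 2, \dots, m. \qquad (16)$$

1.3 Combined comprehensive evaluation model

Combining models is one of the important ways of improving accuracy. A single model has limited expressive power and cannot model a complex problem well. A combined model integrates the strengths of several models and removes the large bias that a single evaluation model may have, so that the model is more expressive. This paper combines the entropy-weight-improved Topsis model with the principal component comprehensive evaluation model and proposes a new combined evaluation model.

The key to a combined model is the choice of combination weights. The two composite scores have different ranges, that is, different dispersion, and subjectively the larger the score differences among players, the more clearly the differences in contribution and ranking show. Variance is precisely a measure of dispersion, so the variance of the player scores of each individual model is used as its share of the weight:

$$G_c = \frac{\sigma_c^2}{\sigma_1^2 + \sigma_2^2}, \qquad (17)$$

where $G_c$ is the weight of the c-th model, $c = 1, 2$; $\sigma_1^2$ is the variance of the player scores from the entropy-weight-improved Topsis model; and $\sigma_2^2$ is the variance of the player scores from the principal component model. The composite score of the combined model is

$$f_i = \sum_{c=1}^{2} G_c F_{ic}, \quad i = 1, 2, \dots, m, \qquad (18)$$

where $F_{ic}$ is the score given to the i-th player by the c-th model.

2 Brief introduction to the nonparametric methods

2.1 Correlation analysis

2.1.1 Pearson correlation coefficient

The Pearson correlation coefficient measures the strength of the linear relationship between two variables and helps in understanding the data and making decisions. It is computed as

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{D(X)}\sqrt{D(Y)}}, \qquad (19)$$

where $\rho_{XY}$ takes values in $[-1, 1]$. When $\rho_{XY} > 0$, X and Y are positively correlated; when $\rho_{XY} < 0$, they are negatively correlated; when $\rho_{XY} = 0$, they are linearly uncorrelated. The larger the absolute value of $\rho_{XY}$, the stronger the correlation.

2.1.2 Spearman rank correlation coefficient

The Spearman rank correlation coefficient from nonparametric statistics can measure not only linear but also monotone nonlinear relationships. It does not require the data to be normally distributed and is not very sensitive to outliers. It is computed as

$$r_s = 1 - \frac{6}{n(n^2 - 1)}\sum_{i=1}^{n}(R_i - Q_i)^2, \qquad (20)$$

where $R_i$ is the rank of X and $Q_i$ is the rank of Y. Tied values are handled with average ranks.

2.1.3 Kendall τ correlation coefficient

The Kendall τ correlation coefficient, proposed in 1938, is a test similar to the Spearman rank correlation coefficient. It checks whether two variables are correlated by asking whether they move concordantly. Concordance is defined as follows: given n pairs of observations $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, a pair of observations is concordant if $(x_j - x_i)(y_j - y_i) > 0$ and discordant otherwise. With $N_c$ the number of concordant pairs and $N_d$ the number of discordant pairs,

$$\tau = \frac{N_c - N_d}{n(n - 1)/2}. \qquad (21)$$

Tied values are handled with average ranks.

2.2 Friedman test

The Friedman test is based on the theory of complete block designs. For data from a randomized block experiment, the traditional analysis of variance requires normally distributed errors; for data that do not meet this requirement, Friedman introduced rank analysis of variance [9], which depends only on the ranks observed within each block. Suppose there are k treatments and b blocks; the data structure is shown in Table 1.

Table 1 Data structure of a complete randomized block design
Block      Treatment 1   Treatment 2   ...   Treatment k
Block 1    x_11          x_12          ...   x_1k
Block 2    x_21          x_22          ...   x_2k
...        ...           ...           ...   ...
Block b    x_b1          x_b2          ...   x_bk

The Friedman test addresses the same question as most analysis-of-variance tests, namely a hypothesis about location parameters: $H_0: \theta_1 = \theta_2 = \dots = \theta_k$ against $H_1: \exists\, i, j \in \{1, 2, \dots, k\},\ i \ne j,\ \theta_i \ne \theta_j$. The test statistic is

$$Q = \frac{12}{bk(k + 1)}\sum_{i=1}^{k} R_{i+}^2 - 3b(k + 1), \qquad (22)$$

where $R_{i+}$ is the sum of the ranks of treatment i over the blocks. Under $H_0$, Q approximately follows a $\chi^2$ distribution with $v = k - 1$ degrees of freedom. If $Q < \chi^2_{0.05}(k - 1)$, $H_0$ is accepted; otherwise it is rejected. When the data contain tied ranks, Q is corrected to

$$Q_c = Q \Big/ \left(1 - \frac{\sum_{i=1}^{g}(\tau_i^3 - \tau_i)}{bk(k^2 - 1)}\right), \qquad (23)$$

where $\tau_i$ is the length of the i-th tie and g is the number of ties.

3 Comparative analysis of the models

The three evaluation models are analyzed using the playoff performance of the Golden State Warriors in the 2021-2022 season.

3.1 Data collection and processing

Using data provided by the website (), 30 indicators were collected for the 217 players of all teams in the 2021-2022 playoffs. Taking the 48.9 min of the Warriors player with the least playing time (Anderson) as the minimum, players with less playing time were removed, leaving 161 players. In addition, with reference to the indicators given by Wang Bin et al. [10] and Zhang Xiang [11], and to the authors' own impressions from watching NBA games on Tencent Video, the NBA official website, and similar sources, the 10 indicators in Table 2 were selected for analysis and evaluation.

Table 2 Evaluation indicators
Indicator                        Explanation
Total points                     Total points scored in the playoffs; reflects scoring ability
True shooting percentage / %     Measures shooting efficiency
Three-point percentage / %       Measures perimeter shooting
Free-throw percentage / %        Measures free-throw quality
Rebounds                         Measures rebounding
Assists                          Measures passing
Turnovers                        Measures control of mistakes
Steals                           Measures defense
Blocks                           Measures defense
Personal fouls                   Measures control of mistakes

3.2 Correlation analysis

Correlation analysis is used to explore the relationships among the indicators and to judge whether serious collinearity is present. The Pearson correlation matrix is computed and visualized in Figure 1.

Figure 1 Heat map of the Pearson correlation coefficients

Figure 1 shows that total points, rebounds, assists, turnovers, steals, blocks, and personal fouls are highly associated, but the Pearson coefficient alone cannot establish that these relationships exist, because it is strongly affected by extreme values. To determine the relationships more accurately, the Spearman rank correlation coefficient and the Kendall τ correlation coefficient are also computed and visualized in Figures 2 and 3.

Figure 2 Heat map of the Spearman rank correlation coefficients
Figure 3 Heat map of the Kendall τ correlation coefficients

Figures 2 and 3 confirm that total points, rebounds, assists, turnovers, steals, blocks, and personal fouls are highly associated. These indicators are also affected by playing time, which they implicitly contain, and playing time rather directly reflects the coaching staff's judgment of a player's level. In this study all Warriors players appeared in the same number of games and playing time was determined only by the coaching staff, so it is not treated as a separate evaluation indicator.

The correlation analysis shows that some indicators are strongly correlated, which indicates a strong collinearity problem that affects the composite scores. Collinearity must therefore be taken into account when the composite scores are computed. The performance of the Warriors players is finally evaluated with the three comprehensive evaluation models.

3.3 Solving the Topsis model improved by the entropy weight method

This model determines a composite evaluation index from the distances to the best and worst solutions. It avoids the subjectivity and uncertainty of traditional weighting methods and is an improvement on the ordinary Topsis model. The weights of the 10 indicators of the 14 Warriors players are first obtained with the entropy weight method and then substituted into the Topsis model. The weights are listed in Table 3.

Table 3 Indicator weights
Indicator                   Entropy   Information utility   Weight / %
Assists                     0.782     0.218                 18.157
Total points                0.784     0.216                 17.978
Rebounds                    0.824     0.176                 14.656
Blocks                      0.843     0.157                 13.063
Steals                      0.856     0.144                 11.967
Turnovers                   0.921     0.079                 6.613
True shooting percentage    0.927     0.073                 6.033
Personal fouls              0.944     0.056                 4.659
Three-point percentage      0.958     0.042                 3.455
Free-throw percentage       0.959     0.041                 3.421

Table 3 shows that assists have the highest weight, 18.157%. This agrees with the Warriors' reliance on a pass-and-cut system, in which the team emphasizes sharing the ball so that players can score in the way that suits them best; the number of assists indirectly reflects this system. Free-throw percentage has the lowest weight, only 3.421%, which can also be explained by the team's situation: a team built on passing and cutting rarely draws many free throws, so it is reasonable for free-throw percentage to carry the smallest weight. The weights obtained with the entropy weight method therefore have practical meaning.

The weights are substituted into the weighted decision matrix to obtain the positive and negative ideal solutions. The distances from each player's evaluation vector to the two ideal solutions are then computed, and finally the closeness of each player to the optimal alternative, which serves as the player's composite score. The scores are listed in Table 4.

Table 4 Topsis model scores and rankings
Name         Distance to positive ideal   Distance to negative ideal   Composite score   Rank
Curry        0.20                         0.45                         0.70              1
Green        0.26                         0.42                         0.62              2
Wiggins      0.26                         0.37                         0.60              3
Thompson     0.25                         0.32                         0.56              4
Poole        0.27                         0.30                         0.52              5
Looney       0.35                         0.27                         0.44              6
Porter Jr.   0.38                         0.21                         0.35              7
Payton II    0.42                         0.22                         0.35              8
Moody        0.48                         0.18                         0.27              9
Kuminga      0.47                         0.15                         0.24              10
Bjelica      0.47                         0.14                         0.23              11
Iguodala     0.50                         0.14                         0.22              12
Lee          0.51                         0.13                         0.21              13
Anderson     0.51                         0.13                         0.20              14

Table 4 gives the composite score of each player under the entropy-weight-improved Topsis model; the size of the score reflects the player's contribution on the road to the title.

3.4 Solving the principal component comprehensive evaluation model

The correlation analysis in Section 3.2 found strong collinearity among the indicators, and this model addresses it with principal component analysis. The principal components with eigenvalues greater than 1 are selected first, which is usually visualized with a scree plot, shown in Figure 4. Figure 4 shows that the first three eigenvalues are greater than 1, so the first three principal components are initially selected as the three indicators of the evaluation.

Figure 4 Scree plot

Once the basic condition of eigenvalues greater than 1 is met, the cumulative variance contribution of the first three components is examined. The explained variance is given in Table 5.

Table 5 Explained variance
Component   Eigenvalue   Variance contribution / %   Cumulative contribution / %
1           5.286        52.865                      52.865
2           1.305        13.052                      65.917
3           1.058        10.581                      76.498
4           0.925        9.253                       85.751
5           0.514        5.136                       90.887
6           0.276        2.757                       93.645
7           0.238        2.384                       96.029
8           0.202        2.025                       98.054
9           0.129        1.293                       99.346
10          0.065        0.654                       100

Table 5 shows that the cumulative variance contribution of the first three components is only 76.498%, below 80%. The eigenvalue threshold is therefore lowered to 0.9 in order to increase the number of components and reach a cumulative contribution of 80%. From Figure 4 and Table 5, lowering the threshold to 0.9 adds the fourth component and raises the cumulative contribution to 85.751%, so the first four principal components are used for the evaluation.

The analysis above yields four evaluation indicators, and the final principal component composite scores are computed from equation (16). The scores are given in Table 6.

Table 6 Principal component comprehensive evaluation scores and rankings

... Green about 0.47 points; even so, these two players still rank first and second in contribution. In terms of scores, Poole, Thompson, and Wiggins played an indelible part on the road to the title. Given the depth of the Warriors' roster, the roles of Lee and Anderson were not prominent.

3.5 Solving the combined model

From the composite scores of the two models above, the variance of the entropy-weight-improved Topsis scores is 0.154 and that of the principal component scores is 1.005. Substituting into equation (17) gives the model weights $G_1 = 0.133$ and $G_2 = 0.867$. The two models are combined in a weighted sum with equation (18), and the resulting scores are given in Table 7.

Table 7 Combined model scores and rankings

Table 7 shows that Curry still leads second-placed Green by a large margin, while the order of the players ranked 3rd, 4th, and 5th became ... consistent with the comprehensive evaluation results but different from the results of the entropy-weight-improved Topsis model; Lee and Anderson remain in the last two places. Overall there are some changes relative to the first two evaluations, but whether the overall evaluations differ requires a further test.

3.6 Consistency test

Considering the composite scores and rankings from the three models, and since rankings give a more direct impression of each player's contribution to the title, the Friedman test is used to judge whether the three models differ, which makes the conclusions more rigorous. The null hypothesis of the consistency test is $H_0: \theta_1 = \theta_2 = \dots = \theta_{14}$. The test statistic Q (equation (22)) measures consistency: the larger Q is, the more consistent the three evaluation models are and the more convincing the evaluation of the players. For the Warriors there are k = 14 players and b = 3 blocks. The entropy-weight-improved Topsis model, the principal component evaluation, and the combined model are tested, with the players numbered 1 to 14 according to their rank in the principal component evaluation. The data are given in Table 8.

Table 8 Complete randomized block data
Player                 1  2  3  4  5  6  7  8  9   10  11  12  13  14
Topsis model           1  2  5  4  3  7  6  8  10  9   11  12  13  14
Principal components   1  2  3  4  5  6  7  8  9   10  11  12  13  14
Combined model         1  2  3  4  5  6  7  8  11  9   10  12  13  14

Table 8 shows that the results of the three models are not identical. Solving with RStudio gives a P value of 0.9556, so the null hypothesis is accepted and the three models are considered not to differ. The Friedman test thus increases the reliability of the conclusions and allows the contributions of the Warriors players to the 2021-2022 championship to be presented more reasonably.

4 Conclusions

Taking the players of the NBA's Golden State Warriors in the 2021-2022 season as the object of study, this paper used nonparametric correlation analysis to examine 10 indicators that affect player contribution, computed the contributions of the 14 Warriors players on the road to the title with two evaluation models and one combined model, and finally tested the consistency of the three models with the Friedman test. The conclusions are as follows.

1) The combined model takes the influences among multiple indicators into account more completely and reflects the overall performance of the evaluated objects more truthfully.

2) The three models do not differ; the Friedman test shows that their evaluations are essentially consistent.

3) The rankings and the collected data show that the Warriors players fall roughly into four tiers. Curry, Green, Poole, Thompson, and Wiggins form the first tier: players of starting caliber who receive stable playing time and produce good statistics. Porter Jr., Looney, and Payton II form the second tier: players who provide steady output but play fewer minutes than the starters. Kuminga, Moody, and Bjelica form the third tier: players who receive little playing time but contribute in certain respects. Iguodala, Lee, and Anderson are not guaranteed minutes in every game, play only a small role, and mostly appear when the outcome of the game is no longer in doubt.

4) The high pick-and-roll between Curry and Green is a major feature of the Warriors' offense. The contribution rankings of the two players reflect this well, and it is a very important factor in the Warriors' title.

5) Curry is the leader and core player of the Warriors. The comparison of the three models shows that his composite score far exceeds that of the other players, which demonstrates his importance and outstanding contribution to the team. He ultimately won the FMVP, the most valued individual award of the Finals, which shows that he is an experienced player able to lead the team to victory.

The analysis above reflects the actual situation of the players, so the models constructed in this paper are reasonable and rigorous.

References:
[1] Xu Jian, Zhou Yong, Liao Shulei, et al. Research on a player contribution evaluation system for basketball games based on information entropy theory [J]. Zhejiang Sport Science, 2021, 43(6): 86-92.
[2] Wu Wei, Fan Xin, Wang Wei. Research on the offensive and defensive ability of CBA teams based on principal component analysis [J]. Journal of Hubei Normal University (Natural Science), 2022, 42(2): 92-97.
[3] Liu Xinran. Establishment and principles of an evaluation index system for basketball defensive quality [J]. Journal of Shenyang Sport University, 2011, 30(5): 137-138, 142.
[4] Jing Huaiguo, Wang Jun. Q-type cluster analysis of the overall ability of the teams in the men's basketball tournament of the 30th Olympic Games [J]. Journal of Guangzhou Sport University, 2012, 32(6): 68-72.
[5] Li Guo, Sun Qingzhu. TOPSIS analysis of the offensive and defensive indicators of the Chinese men's basketball team and its opponents at the 30th Olympic Games [J]. China Sport Science and Technology, 2013, 49(1): 88-95.
[6] Xin Guixin, Yang Chaoxian, Yang Qingyuan, et al. Evaluating the after-effects of high-standard basic farmland construction with the entropy weight method and an improved TOPSIS model [J]. Transactions of the Chinese Society of Agricultural Engineering, 2017, 33(1): 238-249.
[7] Zhou Huicheng, Zhang Gaihong, Wang Guoli. Multi-objective decision-making method for reservoir flood control operation based on entropy weight and its application [J]. Journal of Hydraulic Engineering, 2007(1): 100-106.
[8] Huang Liwen. Application of ideal-point-based principal component analysis in comprehensive evaluation [J]. Statistics and Decision, 2021, 37(10): 184-188.
[9] Huang Xiaocheng. Comparison of commercial bank performance before and after listing: based on the Friedman test and DEA [J]. Time-honored Brand Marketing, 2022(7): 57-59.
[10] Wang Bin, Ming Yuanlin, Wang Zhifei, et al. Use and effect of the three-point shot in the NBA: based on the Warriors vs. Raptors 2018-2019 NBA Finals [J]. Journal of Zhejiang Normal University (Natural Sciences), 2023, 46(1): 96-103.
[11] Zhang Xiang. Stepwise regression analysis and comparison of the technical statistics of NBA and CBA teams [J]. Journal of Beijing Sport University, 2014, 37(1): 134-138.

(Editor: DU Qingling)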
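As a small worked illustration of the Friedman statistic in equation (22), the sketch below evaluates it on the rankings listed in Table 8, with the 14 players as treatments and the three models as blocks, as stated in Section 3.6. The function name `friedman_q` is hypothetical, no tie correction is applied because each block in Table 8 is a permutation of 1 to 14, and the P value reported in the paper came from the authors' own RStudio run and is not recomputed here.

```python
def friedman_q(blocks):
    """Friedman statistic for rank data without ties.

    blocks: list of b lists; each inner list holds the ranks 1..k that
    one block (here: one evaluation model) assigns to the k treatments
    (here: players).
    """
    b = len(blocks)
    k = len(blocks[0])
    # Rank sum of each treatment over all blocks.
    rank_sums = [sum(block[i] for block in blocks) for i in range(k)]
    q = 12.0 / (b * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * b * (k + 1)
    return q, rank_sums

# Rankings from Table 8 (players numbered by their principal component rank).
topsis = [1, 2, 5, 4, 3, 7, 6, 8, 10, 9, 11, 12, 13, 14]
principal = list(range(1, 15))
combined = [1, 2, 3, 4, 5, 6, 7, 8, 11, 9, 10, 12, 13, 14]

q, rank_sums = friedman_q([topsis, principal, combined])
print("rank sums:", rank_sums)
print("Q =", round(q, 4), "with", len(topsis) - 1, "degrees of freedom")
```

The value of Q can then be compared with the $\chi^2$ critical value for $k - 1 = 13$ degrees of freedom, following the decision rule given after equation (22).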
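(For reference only) Summary of the questions in "ABAQUS Finite Element Analysis: Frequently Asked Questions"

Part 1 Basics

Chapter 1 FAQs on basic Abaqus knowledge

1.1 Basic conventions in Abaqus
1.1.1 Definition of degrees of freedom
[FAQ 1-1] How are degrees of freedom defined in Abaqus?
1.1.2 Choosing units
[FAQ 1-2] When building a model in Abaqus, how should the units of each quantity be chosen?
1.1.3 Time in Abaqus
[FAQ 1-3] How should the concept of time in Abaqus be understood?
1.1.4 Important physical constants in Abaqus
[FAQ 1-4] Which physical constants are commonly used in Abaqus?
1.1.5 Coordinate systems in Abaqus
[FAQ 1-5] How is a local coordinate system defined in Abaqus?

1.2 File types in Abaqus and their functions
[FAQ 1-6] Abaqus generates many types of files during modeling and analysis. What is each of them for?
[FAQ 1-7] After an analysis has been submitted, which of the files generated by Abaqus should be checked?

1.3 The Abaqus documentation
1.3.1 Finding information in the documentation
[FAQ 1-8] How is the Abaqus documentation opened?
[FAQ 1-9] The Abaqus documentation is very extensive. How can the required information be found quickly and accurately?
1.3.2 Using help in Abaqus/CAE
[FAQ 1-10] Which context-sensitive help functions are available in the Abaqus/CAE interface?
[FAQ 1-11] Which help functions does the Help menu of Abaqus/CAE provide?

1.4 Changing the working directory
[FAQ 1-12] What is the default working directory in which Abaqus reads and writes files, and how can it be changed?

1.5 Common DOS commands for Abaqus
[FAQ 1-13] Which DOS commands are commonly used with Abaqus?

1.6 Setting up the Abaqus environment file
1.6.1 Insufficient disk space
[FAQ 1-14] How should the following error messages, which appear when an analysis job is submitted, be resolved?
***ERROR: UNABLE TO COMPLETE FILE WRITE. CHECK THAT SUFFICIENT DISK SPACE IS AVAILABLE. FILE IN USE AT FAILURE IS shell3.stt. (insufficient disk space)
or
***ERROR: SEQUENTIAL I/O ERROR ON UNIT 23, OUT OF DISK SPACE OR DISK QUOTA EXCEEDED. (insufficient disk space)
1.6.2 Setting memory parameters
[FAQ 1-15] How should the following error messages, which appear when an analysis job is submitted, be resolved?
***ERROR: THE SETTING FOR PRE_MEMORY REQUIRES THAT 3 GIGABYTES OR MORE BE ALLOCATED BUT THE HARDWARE IN USE SUPPORTS ALLOCATION OF AT MOST 3 GIGABYTES OF MEMORY. EITHER PRE_MEMORY MUST BE DECREASED OR THE JOB MUST BE RUN ON HARDWARE THAT SUPPORTS 64-BIT ADDRESSING. (the pre_memory value exceeds 3 GB, which is more than the hardware can allocate)
or
***ERROR: THE REQUESTED MEMORY CANNOT BE ALLOCATED. PLEASE CHECK THE SETTING FOR PRE_MEMORY. THIS ERROR IS CAUSED BY PRE_MEMORY BEING GREATER THAN THE MEMORY AVAILABLE TO THIS PROCESS. POSSIBLE CAUSES ARE INSUFFICIENT MEMORY ON THE MACHINE, OTHER PROCESSES COMPETING FOR MEMORY, OR A LIMIT ON THE AMOUNT OF MEMORY A PROCESS CAN ALLOCATE. (the pre_memory value exceeds the memory available on the computer)
or
***ERROR: INSUFFICIENT MEMORY. PRE_MEMORY IS CURRENTLY SET TO 10.00 MBYTES. IT IS NOT POSSIBLE TO ESTIMATE THE TOTAL AMOUNT OF MEMORY THAT WILL BE REQUIRED. PLEASE INCREASE THE VALUE OF PRE_MEMORY. (increase the pre_memory value)
or
***ERROR: THE VALUE OF 256 MB THAT HAS BEEN SPECIFIED FOR STANDARD_MEMORY IS TOO SMALL TO RUN THE ANALYSIS AND MUST BE INCREASED. THE MINIMUM POSSIBLE VALUE FOR STANDARD_MEMORY IS 560 MB. (the default standard_memory value is 256 MB, while the analysis needs at least 560 MB)

1.7 Factors that affect analysis time
[FAQ 1-16] How can the computation time of a finite element analysis in Abaqus be reduced?
[FAQ 1-17] After a job has been submitted, the Windows Task Manager shows that it is running, but CPU usage is very low, as if no work were being done, while disk usage is very high. What is the reason?

1.8 New features in Abaqus 6.7
[FAQ 1-18] What are the main new features of Abaqus 6.7?

1.9 Comparison of Abaqus with other finite element software
[FAQ 1-19] How does Abaqus resemble and differ from other finite element software?

Chapter 2 FAQs on the Abaqus/CAE user interface

2.1 Selecting objects with the mouse
[FAQ 2-1] When working in Abaqus/CAE, how can the desired objects (vertices, edges, faces, and so on) be selected with the mouse more conveniently and quickly?

2.2 Common tools in the Tools menu
2.2.1 Reference points
[FAQ 2-2] In which situations are reference points needed?
2.2.2 Surfaces
[FAQ 2-3] What types of surface are there, and in which situations should a surface be defined?
2.2.3 Sets
[FAQ 2-4] What kinds of set are there, and in which situations should a set be defined?
2.2.4 Datums
[FAQ 2-5] What is the main purpose of a datum, and what should be kept in mind when using one?
2.2.5 Customizing the interface
[FAQ 2-6] How can the Abaqus/CAE interface be customized?
[FAQ 2-7] The Abaqus/CAE 6.7 interface no longer has the view toolbar of earlier versions (see Figure 2-6), which is inconvenient. Can this toolbar be restored?
Figure 2-6 The view toolbar in Abaqus/CAE 6.5

Chapter 3 FAQs on the Part module

3.1 Creating, importing, and repairing parts
3.1.1 Creating parts
[FAQ 3-1] What methods are there for creating parts in Abaqus/CAE, and what are their respective scopes, advantages, and disadvantages?
3.1.2 Importing and exporting geometry
[FAQ 3-2] Which formats can be chosen when importing or exporting geometry in Abaqus/CAE?
[FAQ 3-3] When a three-dimensional CAD model in STEP format (*.stp) is imported into Abaqus/CAE, the message area at the bottom of the window shows:
A total of 236 parts have been created.
This message indicates that the CAD model was imported successfully, yet the Abaqus/CAE viewport shows only a white line and the imported parts cannot be seen. What is the reason?
3.1.3 Repairing geometry
[FAQ 3-4] Abaqus/CAE provides several geometry repair tools. What should be kept in mind when using them?
[FAQ 3-5] A three-dimensional CAD model is imported into Abaqus/CAE to create a part, and the error message shown in Figure 3-2 appears when the part is meshed. How can this be resolved?
Figure 3-2 Error message: invalid geometry; the part cannot be meshed

3.2 Relationships between features
[FAQ 3-6] Three basic concepts are often used in the Part module: base feature, parent feature, and children feature. How are they related?

3.3 Rigid bodies and display bodies
3.3.1 Definition of a rigid part
[FAQ 3-7] What is a rigid part, what are its advantages, and which types of rigid part can be created in the Part module?
3.3.2 Rigid parts, rigid body constraints, and display body constraints
[FAQ 3-8] A rigid part, a rigid body constraint, and a display body constraint can all be used to define a rigid body. How do they differ and how are they related?

3.4 Modeling example
[FAQ 3-9] How should a cube with an edge length of 100 mm be modeled if a sphere of radius 20 mm is removed from its center? (Method 1) (Method 2)

Chapter 4 FAQs on the Property module

4.1 Hyperelastic materials
[FAQ 4-1] How are hyperelastic material data for rubber defined in Abaqus/CAE?

4.2 Beam profiles, section properties, and beam orientation
4.2.1 Beam profiles
[FAQ 4-2] How are the geometric shape and dimensions of a beam cross-section defined?
[FAQ 4-3] How is the beam profile displayed in Abaqus/CAE?
4.2.2 Section properties
[FAQ 4-4] What is the difference between a section and a profile?
[FAQ 4-5] Why does the error message "elements have missing property definitions" appear in the DAT file when an analysis job is submitted?
(Example) The INP file that causes the error is as follows:
*NODE
1, 0.0, 0.0, 0.0
2, 20.0, 0.0, 0.0
*ELEMENT, TYPE=T3D2, ELSET=link
1, 1, 2
*BEAM SECTION, ELSET=link, MATERIAL=steel, SECTION=CIRC
15.0,
When the job is submitted, the following error message appears in the DAT file:
***ERROR:.80 elements have missing property definitions The elements have been identified in element set ErrElemMissingSection.
4.2.3 Beam orientation
[FAQ 4-6] How is the beam orientation defined, and what is it for?
[FAQ 4-7] How is the beam orientation defined in Abaqus?
[FAQ 4-8] Why does the following error message appear when a problem is analyzed with beam elements?
***ERROR: ELEMENT 16 IS CLOSE TO PARALLEL WITH ITS BEAM SECTION AXIS. DIRECTION COSINES OF ELEMENT AXIS 2.93224E-04 -8.20047E-05 1.0000. DIRECTION COSINES OF FIRST SECTION AXIS 0.0000 0.0000 1.0000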
Python Natural Language Processing Study Notes (55): Maximum Entropy Classifiers

6.6 Maximum Entropy Classifiers

The Maximum Entropy classifier uses a model that is very similar to the model employed by the naive Bayes classifier. But rather than using probabilities to set the model's parameters, it uses search techniques to find a set of parameters that will maximize the performance of the classifier. In particular, it looks for the set of parameters that maximizes the total likelihood of the training corpus, which is defined as:

$$P(\text{features}) = \sum_{x \in \text{corpus}} P(\text{label}(x) \mid \text{features}(x)) \qquad (10)$$

where P(label|features), the probability that an input whose features are features will have class label label, is defined as:

$$P(\text{label} \mid \text{features}) = \frac{P(\text{label}, \text{features})}{\sum_{\text{label}} P(\text{label}, \text{features})} \qquad (11)$$

Because of the potentially complex interactions between the effects of related features, there is no way to directly calculate the model parameters that maximize the likelihood of the training set. Therefore, Maximum Entropy classifiers choose the model parameters using iterative optimization techniques, which initialize the model's parameters to random values, and then repeatedly refine those parameters to bring them closer to the optimal solution. These iterative optimization techniques guarantee that each refinement of the parameters will bring them closer to the optimal values, but do not necessarily provide a means of determining when those optimal values have been reached. Because the parameters for Maximum Entropy classifiers are selected using iterative optimization techniques, they can take a long time to learn. This is especially true when the size of the training set, the number of features, and the number of labels are all large.

Note
Some iterative optimization techniques are much faster than others.
When training Maximum Entropy models, avoid the use of Generalized Iterative Scaling (GIS) or Improved Iterative Scaling (IIS), which are both considerably slower than the Conjugate Gradient (CG) and the BFGS optimization methods.

The Maximum Entropy Model

The Maximum Entropy classifier model is a generalization of the model used by the naive Bayes classifier. Like the naive Bayes model, the Maximum Entropy classifier calculates the likelihood of each label for a given input value by multiplying together the parameters that are applicable for the input value and label. The naive Bayes classifier model defines a parameter for each label, specifying its prior probability, and a parameter for each (feature, label) pair, specifying the contribution of individual features towards a label's likelihood.

In contrast, the Maximum Entropy classifier model leaves it up to the user to decide what combinations of labels and features should receive their own parameters. In particular, it is possible to use a single parameter to associate a feature with more than one label, or to associate more than one feature with a given label. This will sometimes allow the model to "generalize" over some of the differences between related labels or features.

Each combination of labels and features that receives its own parameter is called a joint-feature. Note that joint-features are properties of labeled values, whereas (simple) features are properties of unlabeled values.

Note
In literature that describes and discusses Maximum Entropy models, the term "features" often refers to joint-features; the term "contexts" refers to what we have been calling (simple) features.

Typically, the joint-features that are used to construct Maximum Entropy models exactly mirror those that are used by the naive Bayes model. In particular, a joint-feature is defined for each label, corresponding to w[label], and for each combination of (simple) feature and label, corresponding to w[f,label].
Given the joint-features for a Maximum Entropy model, the score assigned to a label for a given input is simply the product of the parameters associated with the joint-features that apply to that input and label:

$$P(\text{input}, \text{label}) = \prod_{\text{joint-features}(\text{input},\,\text{label})} w[\text{joint-feature}] \qquad (12)$$

Maximizing Entropy

The intuition that motivates Maximum Entropy classification is that we should build a model that captures the frequencies of individual joint-features, without making any unwarranted assumptions. An example will help to illustrate this principle.

Suppose we are assigned the task of picking the correct word sense for a given word, from a list of ten possible senses (labeled A-J). At first, we are not told anything more about the word or the senses. There are many probability distributions that we could choose for the ten senses, such as:

Table 6.1
        A     B     C     D     E     F     G     H     I     J
(i)     10%   10%   10%   10%   10%   10%   10%   10%   10%   10%
(ii)    5%    15%   0%    30%   0%    8%    12%   0%    6%    24%
(iii)   0%    100%  0%    0%    0%    0%    0%    0%    0%    0%

Although any of these distributions might be correct, we are likely to choose distribution (i), because without any more information, there is no reason to believe that any word sense is more likely than any other. On the other hand, distributions (ii) and (iii) reflect assumptions that are not supported by what we know.

One way to capture this intuition that distribution (i) is more "fair" than the other two is to invoke the concept of entropy. In the discussion of decision trees, we described entropy as a measure of how "disorganized" a set of labels was. In particular, if a single label dominates then entropy is low, but if the labels are more evenly distributed then entropy is high. In our example, we chose distribution (i) because its label probabilities are evenly distributed; in other words, because its entropy is high.
In general, the Maximum Entropy principle states that, among the distributions that are consistent with what we know, we should choose the distribution whose entropy is highest.

Next, suppose that we are told that sense A appears 55% of the time. Once again, there are many distributions that are consistent with this new piece of information, such as:

Table 6.2
        A     B     C     D     E     F     G     H     I     J
(iv)    55%   45%   0%    0%    0%    0%    0%    0%    0%    0%
(v)     55%   5%    5%    5%    5%    5%    5%    5%    5%    5%
(vi)    55%   3%    1%    2%    9%    5%    0%    25%   0%    0%

But again, we will likely choose the distribution that makes the fewest unwarranted assumptions: in this case, distribution (v).

Finally, suppose that we are told that the word "up" appears in the nearby context 10% of the time, and that when it does appear in the context there's an 80% chance that sense A or C will be used. In this case, we will have a harder time coming up with an appropriate distribution by hand; however, we can verify that the following distribution looks appropriate:

Table 6.3
              A      B      C      D      E      F      G      H      I      J
(vii)  +up    5.1%   0.25%  2.9%   0.25%  0.25%  0.25%  0.25%  0.25%  0.25%  0.25%
       -up    49.9%  4.46%  4.46%  4.46%  4.46%  4.46%  4.46%  4.46%  4.46%  4.46%

In particular, the distribution is consistent with what we know: if we add up the probabilities in column A, we get 55%; if we add up the probabilities of row 1, we get 10%; and if we add up the boxes for senses A and C in the +up row, we get 8% (or 80% of the +up cases). Furthermore, the remaining probabilities appear to be "evenly distributed."

Throughout this example, we have restricted ourselves to distributions that are consistent with what we know; among these, we chose the distribution with the highest entropy. This is exactly what the Maximum Entropy classifier does as well. In particular, for each joint-feature, the Maximum Entropy model calculates the "empirical frequency" of that feature, i.e., the frequency with which it occurs in the training set.
It then searches for the distribution which maximizes entropy, while still predicting the correct frequency for each joint-feature.

Generative vs Conditional Classifiers

An important difference between the naive Bayes classifier and the Maximum Entropy classifier concerns the type of questions they can be used to answer. The naive Bayes classifier is an example of a generative classifier, which builds a model that predicts P(input, label), the joint probability of an (input, label) pair. As a result, generative models can be used to answer the following questions:

1. What is the most likely label for a given input?
2. How likely is a given label for a given input?
3. What is the most likely input value?
4. How likely is a given input value?
5. How likely is a given input value with a given label?
6. What is the most likely label for an input that might have one of two values (but we don't know which)?

The Maximum Entropy classifier, on the other hand, is an example of a conditional classifier. Conditional classifiers build models that predict P(label|input), the probability of a label given the input value. Thus, conditional models can still be used to answer questions 1 and 2. However, conditional models cannot be used to answer the remaining questions 3-6.

In general, generative models are strictly more powerful than conditional models, since we can calculate the conditional probability P(label|input) from the joint probability P(input, label), but not vice versa. However, this additional power comes at a price. Because the model is more powerful, it has more "free parameters" which need to be learned. However, the size of the training set is fixed. Thus, when using a more powerful model, we end up with less data that can be used to train each parameter's value, making it harder to find the best parameter values.
As a result, a generative model may not do as good a job at answering questions 1 and 2 as a conditional model, since the conditional model can focus its efforts on those two questions. However, if we do need answers to questions like 3-6, then we have no choice but to use a generative model.

The difference between a generative model and a conditional model is analogous to the difference between a topographical map and a picture of a skyline. Although the topographical map can be used to answer a wider variety of questions, it is significantly more difficult to generate an accurate topographical map than it is to generate an accurate skyline.
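To make the entropy argument from the "Maximizing Entropy" discussion concrete, the following sketch computes the Shannon entropy (in bits) of the three candidate distributions over the ten word senses in Table 6.1. It is only an illustration of why distribution (i) is preferred; the helper name `entropy` is an assumption of this sketch and is not part of NLTK.

```python
import math

def entropy(dist):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# The three distributions over senses A-J from Table 6.1.
dist_i = [0.10] * 10
dist_ii = [0.05, 0.15, 0.00, 0.30, 0.00, 0.08, 0.12, 0.00, 0.06, 0.24]
dist_iii = [0.0, 1.0] + [0.0] * 8

for name, dist in [("(i)", dist_i), ("(ii)", dist_ii), ("(iii)", dist_iii)]:
    print(name, round(entropy(dist), 4))
```

The uniform distribution (i) attains the largest possible value for ten outcomes, log2(10) bits, while distribution (iii), which puts all its mass on one sense, has zero entropy.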
Application of the three-scale AHP-entropy optimal combination weighting method in PPP project risk evaluation

Jiang Anmin; Dong Yanchen; Wu Yang; Zhang Shuping; Ni Jia

Abstract: At present, the government strongly advocates the PPP model in infrastructure construction, and a scientific evaluation of PPP project risk is an important safeguard for the successful completion of a PPP project. This study uses the three-scale AHP method to determine the subjective weights of the risk evaluation indicators and the entropy weight method to determine their objective weights. An optimal combination weighting model is then built and solved by constructing a Lagrange function, which yields the optimal combination weights. The method is applied to evaluate the risk of a water supply and drainage PPP project in AA County; during project implementation the overall risk was close to the evaluation result. The results indicate that the three-scale AHP-entropy optimal combination weighting method is feasible and can provide reliable indicator weights for PPP project risk evaluation.

Journal: Journal of Engineering Management
Year (volume), issue: 2017 (031) 005
Pages: 6 (P62-67)
Keywords: PPP model; three-scale AHP; entropy weight method; optimal combination weighting method; grey evaluation
Authors: Jiang Anmin; Dong Yanchen; Wu Yang; Zhang Shuping; Ni Jia
Affiliations: Hunan Urban Construction College, Xiangtan 411100, Hunan; College of Civil Engineering, Hunan University, Changsha 410082, Hunan
Language of the text: Chinese
CLC number: F284

PPP (Public-Private-Partnerships) is an emerging model of cooperation between the public and private sectors. Its greatest advantages are that it broadens financing channels and relieves the government's shortage of funds, and its emergence meets the needs of social development.
Maximum Entropy Weighting of Aligned Sequences of Proteins or DNA

Graeme Mitchison

In Proc. Third Int. Conf. Intelligent Systems for Molecular Biology, Eds. C. Rawlings et al., p. 215-221, AAAI Press, 1995

Abstract

In a family of proteins or other biological sequences like DNA the various subfamilies are often very unevenly represented. For this reason a scheme for assigning weights to each sequence can greatly improve performance at tasks such as database searching with profiles or other consensus models based on multiple alignments. A new weighting scheme for this type of database search is proposed. In a statistical description of the searching problem it is derived from the maximum entropy principle. It can be proved that, in a certain sense, it corrects for uneven representation. It is shown that finding the maximum entropy weights is an easy optimization problem for which standard techniques are applicable.

1 Introduction

Consensus models made from multiple sequence alignments have proved very useful for searching databases (Taylor 1986; Gribskov, McLachlan, & Eisenberg 1987; Barton 1990; Bairoch 1993; Henikoff & Henikoff 1994; Krogh et al. 1994). A common problem, however, is that some groups of sequences dominate the multiple alignment and outweigh other groups. For instance, an alignment of a random set of known globins would contain mostly vertebrate alpha and beta chains, of which several hundred are known, whereas other families of globins, like leghemoglobins, would have many fewer representatives. Thus a search based on such a globin alignment would be more likely to pick out vertebrate alpha and beta chains than the less common globins. For this reason a weighting of the sequences that compensates for the differences in representation may be very important. A method for weighting sequences can be useful in other situations too, in the prediction of protein secondary structure from multiple alignments (Levin et al. 1993) for instance, or for use in the actual alignment procedure (Thompson, Higgins, & Gibson 1994a). Several methods exist for weighting sequences in alignments. The methods in (Felsenstein 1973; Altschul, Carroll, & Lipman 1989) can be used for sequences that are related by a known phylogenetic tree; the distances between nodes in the tree are used for

Sequence    V.&A.    VW       ME
AAAAAAAA    0.1875   0.1667   0.1667
AAAAAAAA    0.1875   0.1667   0.1667
CCCCCCCC    0.1875   0.1667   0.1667
CCCCCCCC    0.1875   0.1667   0.1667
GGGGGGGG    0.25     0.3333   0.3333

Table 1: A toy example of a multiple alignment. The first column shows the weights assigned by the method proposed in (Vingron & Argos 1989). It is seen that the last sequence obtains a weight only slightly larger than the other four sequences. It is obvious that the last sequence should be assigned twice as large a weight as the first four, so that the three different sequences are given the same weights. This is exactly what the Voronoi weights (VW) shown in the second column do. The last column shows that the maximum entropy weights are equal to the VW. We have actually cheated a little here, because ME weights only ensure that the weights of the A sequences add up to 1/3, and not that they are individually equal to 1/6 (similarly for the C sequences).

weighting scheme will be referred to as maximum entropy (ME) weighting. One of the main advantages of our weighting scheme is that it is based on theory compatible with the basic assumptions behind profile search and HMM search, whereas most other weighting schemes are based on intuitive ideas. It can be proved that our scheme corrects in a certain sense for the uneven representation. It is important to realize that there is no objectively "correct" way to weight sequences. Any weighting scheme builds on some assumptions about the structure of the space of sequences and on some goals for the weighting. Often a weighting scheme has some implicit assumptions about the probability distribution over the space of sequences. In this work these assumptions are the same as those which underpin profile and HMM search.

For simplicity we will first discuss the case of block alignments, but all the results will carry over to the more general case discussed later. We define a block alignment to be one where gaps can occur, but these gaps are treated like additional characters, i.e., no penalty is used for opening a gap. Assume the multiple alignment consists of sequences $s^n$, $n = 1, \dots, N$. Each sequence consists of characters $s^n_i$, $i = 1, \dots, L$, some of which may be gap characters ('-'). Define

$$m^n_{ij} = \begin{cases} 1 & \text{if sequence } n \text{ has character } j \text{ at position } i \ (s^n_i = j), \\ 0 & \text{otherwise.} \end{cases} \qquad (1)$$
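The indicator in equation (1) can be illustrated on the toy alignment of Table 1. The sketch below builds $m^n_{ij}$ for the five sequences and then forms weighted column frequencies $p_{ij} = \sum_n w_n m^n_{ij}$. The weighted-frequency step, the alphabet string, and the function names are assumptions made for this illustration and are not taken from the text above. The weights are the ME column of Table 1 written as exact fractions (1/6 and 1/3).

```python
# Toy alignment from Table 1: five sequences of length 8.
sequences = ["AAAAAAAA", "AAAAAAAA", "CCCCCCCC", "CCCCCCCC", "GGGGGGGG"]
alphabet = "ACGT-"  # assumed DNA alphabet, with '-' as the gap character

def indicator(seqs, symbols):
    """m[n][i][j] = 1 if sequence n has symbols[j] at position i, else 0."""
    return [[[1 if ch == a else 0 for a in symbols] for ch in s] for s in seqs]

def weighted_frequencies(m, weights):
    """p[i][j] = sum over n of weights[n] * m[n][i][j] (weights sum to 1)."""
    length = len(m[0])
    size = len(m[0][0])
    return [
        [sum(w * m[n][i][j] for n, w in enumerate(weights)) for j in range(size)]
        for i in range(length)
    ]

m = indicator(sequences, alphabet)

uniform = [1 / 5] * 5
me_weights = [1 / 6, 1 / 6, 1 / 6, 1 / 6, 1 / 3]

p_uniform = weighted_frequencies(m, uniform)
p_me = weighted_frequencies(m, me_weights)
print("uniform weights, column 1:", p_uniform[0])
print("ME weights, column 1:     ", p_me[0])
```

With uniform weights the duplicated A and C sequences each take 40% of every column, whereas with the ME weights the three distinct sequences contribute equally, which is the behavior the Table 1 caption describes.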