Schedulability-Aware Mapping of Real-Time Object-Oriented Models to Multi-Threaded Implementations
Deep Priority Local Aggregated Hashing
Vol. 48, No. 6, Jun. 2021    Journal of Hunan University (Natural Sciences)
Article ID: 1674-2974(2021)06-0058-09    DOI: 10.16339/ki.hdxbzkb.2021.06.009

Deep Priority Local Aggregated Hashing

LONG Xianzhong1,†, CHENG Cheng1,2, LI Yun1,2
(1. School of Computer Science & Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China;
2. Key Laboratory of Jiangsu Big Data Security and Intelligent Processing, Nanjing 210023, China)

Abstract: Existing deep supervised hashing methods cannot effectively utilize the extracted convolutional features, and they also ignore the role that the distribution of similarity information between data pairs plays for the hash network, which results in insufficient discrimination between the learned hash codes. To solve this problem, a novel deep supervised hashing method called Deep Priority Local Aggregated Hashing (DPLAH) is proposed. DPLAH embeds the vector of locally aggregated descriptors (VLAD) into the hash network to improve the network's ability to represent data of the same class, and it reduces the impact of the skewed similarity distribution on the hash network by imposing different weights on the data pairs. The DPLAH experiments are carried out with the PyTorch deep learning framework: a NetVLAD layer aggregates the convolutional features output by a ResNet-18 model, and hash codes are learned from the aggregated features.
Image retrieval experiments on the CIFAR-10 and NUS-WIDE datasets show that the mean average precision (MAP) of DPLAH is 11 percentage points higher than the best results of non-deep hashing algorithms using hand-crafted features or convolutional neural network features, and 2 percentage points higher than that of the asymmetric deep supervised hashing method.

Key words: deep hash learning; convolutional neural network; image retrieval; vector of locally aggregated descriptors (VLAD)
CLC number: TP391.4    Document code: A

Received: 2020-04-26
Foundation items: National Natural Science Foundation of China (61906098, 61772284); National Key Research and Development Program of China (2018YFB1003702)
Author biography: LONG Xianzhong (1985—), male, born in Xinyang, Henan; lecturer at Nanjing University of Posts and Telecommunications; Ph.D. in engineering; master's supervisor.
† Corresponding author, E-mail: *************.cn

With the continuous development of information retrieval technology, people can now easily obtain the data they are interested in from the Internet; at the same time, however, the development of information technology has led to explosive growth in data volume. Faced with massive data and extremely large-scale datasets, retrieval based on nearest neighbor search (NN) [1] can no longer deliver satisfactory retrieval quality within acceptable retrieval time. Approximate nearest neighbor search (ANN) [2] has therefore become increasingly popular in recent years: instead of returning only the single most similar item, it searches for several items that are likely to be similar, trading an acceptable loss of precision for higher retrieval efficiency. As a widely used ANN technique, hashing [3] converts data into compact binary codes (hash codes) while ensuring that similar data pairs are mapped to similar codes. Representing the original data with hash codes dramatically reduces storage and query costs, which makes retrieval over large-scale data feasible; hashing has therefore attracted more and more attention.

Current hashing methods fall into two categories, data-independent and data-dependent, depending on whether training data are needed to define the hash functions. Locality Sensitive Hashing (LSH) [4], a representative data-independent method, uses random projections that are independent of the training data as hash functions. In contrast, the hash functions of data-dependent hashing must be learned from training data, so data-dependent hashing is also called hashing learning; it usually achieves better performance. Recent research on hashing has focused mainly on hashing learning.

Depending on whether labels are used during learning, hashing learning methods can be further divided into supervised and unsupervised hashing. Typical unsupervised methods include Spectral Hashing (SH) [5], Iterative Quantization (ITQ) [6], Discrete Graph Hashing (DGH) [7], and Ordinal Embedding Hashing (OEH) [8]. Unsupervised methods learn hash functions from unlabeled data only and map the input data into hash codes. Supervised methods, in contrast, exploit supervision to learn the hash functions; because labeled data are used, supervised hashing usually achieves better accuracy than unsupervised hashing, and this paper focuses on supervised hashing. Traditional supervised methods include Supervised Hashing with Kernels (KSH) [9], Latent Factor Hashing (LFH) [10], Fast Supervised Hashing (FastH) [11], and Supervised Discrete Hashing (SDH) [12]. With the development of deep learning [13], features extracted by neural networks have gradually replaced hand-crafted features and driven progress in deep supervised hashing. Representative deep supervised hashing methods include Convolutional Neural Network Hashing (CNNH) [14], Deep Semantic Ranking Based Hashing (DSRH) [15], Deep Pairwise-Supervised Hashing (DPSH) [16], Deep Supervised Discrete Hashing (DSDH) [17], and Deep Priority Hashing (DPH) [18]. By integrating feature learning and hash-code (or hash-function) learning into one end-to-end network, deep supervised hashing can significantly outperform non-deep supervised hashing.

So far, most existing deep hashing methods adopt a symmetric strategy to learn the hash codes of the query data and of the database together with the deep hash functions. In contrast, Asymmetric Deep Supervised Hashing (ADSH) [19] treats query data and the whole database asymmetrically, which alleviates the large training cost of symmetric methods: the neural network that learns the hash function is trained only on the query data, while the hash codes of the whole database are obtained directly by optimization. The model in this paper also uses the asymmetric training strategy of ADSH. However, existing asymmetric deep supervised hashing methods do not consider the influence of the similarity distribution of the data on the hash network. As a result, data pairs whose similarity relations are easy to preserve in Hamming space tend to be trained better and better, whereas pairs whose similarity relations are hard to preserve improve little during training. Moreover, most existing deep supervised hashing methods do not make full use of the extracted convolutional features in the hash network.

This paper proposes a new deep supervised hashing method, Deep Priority Local Aggregated Hashing (DPLAH). Its contributions are threefold:
1) DPLAH handles query data and database data asymmetrically, and the network gives priority during training to the hard data pairs between queries and the database, thereby reducing the impact of the skewed similarity distribution on the hash network.
2) DPLAH designs a new deep hash network: it incorporates a local aggregated representation into the hash network, improving the network's ability to represent data of the same class, and it also exploits the effectiveness of local aggregated representations for classification tasks.
3) Experimental results on two large datasets show that DPLAH performs well in practice.

1 Related Work
This section introduces hashing learning [3], NetVLAD [20], and Focal Loss [21]. DPLAH uses NetVLAD to improve the hash network's ability to represent data of the same class, and Focal Loss to reduce the influence of the skewed similarity distribution on the hash network.
1.1 Hashing learning
The task of hashing learning [3] is to learn hash-code representations of the query data and the database data such that the neighborhood relations between the original data are consistent with the neighborhood relations between their hash codes. Specifically, a machine learning model maps all data into binary codes of the form {0,1}^r (r is the hash-code length): data points that are dissimilar in the original space are mapped to dissimilar binary codes (large Hamming distance), while similar data points are mapped to similar binary codes (small Hamming distance). For ease of computation, most hashing methods learn codes of the form {−1,1}^r, because the inner product between two {−1,1}^r codes equals the code length minus twice their Hamming distance, and {−1,1}^r codes are easily converted to {0,1}^r codes. Figure 1 illustrates hashing learning: a high-dimensional feature vector represents each original image, and the hash function h maps every image to an 8-bit hash code, e.g. h(elephant) = 10001010, h(tiger 1) = 01100001, h(tiger 2) = 01100101, so that the Hamming distance between originally similar pairs (tiger 1 and tiger 2 in the figure) is as small as possible and the Hamming distance between originally dissimilar pairs (the elephant and tiger 1) is as large as possible.

Fig. 1 Hashing learning diagram

1.2 NetVLAD
NetVLAD was proposed to solve end-to-end place recognition [20] (place recognition is treated as an instance retrieval task). It embeds the traditional vector of locally aggregated descriptors (VLAD [22]) structure into a CNN, yielding a new VLAD layer. NetVLAD can easily be plugged into any CNN architecture and optimized by back-propagation; it effectively improves the representation of images of the same category and improves classification performance. NetVLAD encodes an image in two steps: a convolutional neural network extracts convolutional features, and the NetVLAD layer aggregates them. Figure 2 shows the NetVLAD layer. In the feature-extraction stage, NetVLAD crops the convolutional features at the last convolutional layer and treats them as a dense descriptor extractor: the output of the last convolutional layer is an H × W × D map, which can be viewed as a set of D-dimensional features extracted at H × W spatial locations; this approach performs very well in instance retrieval and texture recognition tasks [23].

Fig. 2 NetVLAD layer diagram [20]

In the feature-aggregation stage, a new pooling layer, called the NetVLAD layer, aggregates the cropped CNN features:

$$V(j,k) = \sum_{i=1}^{N} a_k(x_i)\,\big(x_i(j) - C_k(j)\big) \qquad (1)$$

where x_i(j) and C_k(j) denote the j-th dimension of the i-th feature and of the k-th cluster center, respectively, and a_k(x_i) is the membership weight of feature x_i with respect to the k-th visual word. The input of NetVLAD aggregation is the N D-dimensional convolutional features cropped by NetVLAD together with K cluster centers. VLAD uses hard assignment, i.e., each feature is associated only with its nearest cluster center; this causes a large quantization error, and hard assignment cannot be embedded into a convolutional network because it does not allow parameter updates by back-propagation. NetVLAD therefore adopts soft assignment:

$$a_k(x_i) = \frac{e^{-\alpha \lVert x_i - C_k \rVert^2}}{\sum_{k'} e^{-\alpha \lVert x_i - C_{k'} \rVert^2}} \qquad (2)$$

If α → +∞, then a_k(x_i) equals 1 for the closest cluster center and 0 otherwise. a_k(x_i) can be further rewritten as

$$a_k(x_i) = \frac{e^{w_k^{\mathrm T} x_i + b_k}}{\sum_{k'} e^{w_{k'}^{\mathrm T} x_i + b_{k'}}} \qquad (3)$$

where w_k = 2αC_k and b_k = −α‖C_k‖². The final NetVLAD aggregation can then be written as

$$V(j,k) = \sum_{i=1}^{N} \frac{e^{w_k^{\mathrm T} x_i + b_k}}{\sum_{k'} e^{w_{k'}^{\mathrm T} x_i + b_{k'}}}\,\big(x_i(j) - C_k(j)\big) \qquad (4)$$

1.3 Focal Loss
Object detection methods can generally be divided into one-stage and two-stage detectors, and two-stage detectors usually perform better. Lin et al. [21] showed that the extreme imbalance between foreground and background is what keeps one-stage detectors from being satisfactory: although each easily classified background example has a low loss, background dominates the image, so the accumulated loss of easy examples is still large and drives the optimization to a poor solution. Lin et al. [21] proposed Focal Loss to address this problem; Figure 3 illustrates it. With cross-entropy as the classification loss in object detection, the loss of an easy example is low, but the data imbalance means that the summed loss of the many easy examples overwhelms the loss of the hard examples, so the hard examples are not trained effectively. Focal Loss is essentially a weighting scheme: the weight is derived from the probability p of correct classification, and the exponent γ adjusts the strength of the weighting. For asymmetric deep hashing, we want data pairs whose similarity relations are hard to preserve in Hamming space to be trained with priority; concretely, the overall DPLAH training loss is reweighted so that the loss of such hard pairs is relatively increased. Deep hashing, however, is not a classification task, so the weight cannot be designed from the probability of correct classification as in Focal Loss. Since the goal of hashing learning is to learn similarity-preserving hash codes, this paper instead designs the weights from the similarity of the hash codes of a data pair; the specific form of the weights is detailed in the model section.

Fig. 3 Focal Loss diagram [21]

2 Deep Priority Local Aggregated Hashing
2.1 Basic definitions
DPLAH adopts an asymmetric network design. Q = {q_i}_{i=1}^{n} denotes n query images and X = {x_i}_{i=1}^{m} denotes a database of m images; the labels of the query and database images are Z = {z_i}_{i=1}^{n} and Y = {y_i}_{i=1}^{m}, respectively, with z_i = [z_{i1}, …, z_{ic}], i = 1, …, n, where c is the number of classes; z_{ij} = 1 if query image q_i belongs to class j (j = 1, …, c) and z_{ij} = 0 otherwise. Using the label information, a pairwise similarity matrix S ∈ {−1,1}^{n×m} is constructed: s_{ij} = 1 means that query image q_i and database image x_j are semantically similar, and s_{ij} = −1 means that they are semantically dissimilar. The goal of deep hashing is to learn the hash codes of the query and database images: the query codes are denoted U ∈ {−1,1}^{n×r} and the database codes B ∈ {−1,1}^{m×r}, where r is the hash-code length. For feature extraction, DPLAH uses a pretrained ResNet-18 network [25]. Figure 4 shows the structure of the DPLAH network: a NetVLAD layer aggregates the convolutional features extracted by ResNet-18, and the hash codes are obtained from the VLAD encoding. Because VLAD encoding is widely used in classification, the output of the NetVLAD layer is also fed to a classification head, so the image labels supervise how the NetVLAD layer uses the convolutional features. In fact, any CNN can serve as the feature extractor, so the choice of backbone is not the focus of this paper.

Fig. 4 DPLAH structure

2.2 Objective function of DPLAH
To learn hash codes that preserve the similarity between query and database images, a common approach [9] is to relate the pairwise supervision S ∈ {−1,1}^{n×m}, the code length r, and the query and database codes u_i and b_j, i.e., to minimize the L2 loss between the supervised similarity and the inner product of the code pair. Taking the skewed similarity distribution into account, this paper weights the loss between query and database images:

$$\min_{U,B} J = \sum_{i=1}^{n}\sum_{j=1}^{m} (1 - w_{ij})\,\big(u_i^{\mathrm T} b_j - r\,s_{ij}\big)^2 \quad \text{s.t. } U \in \{-1,1\}^{n\times r},\; B \in \{-1,1\}^{m\times r},\; W \in \mathbb{R}^{n\times m} \qquad (5)$$

Inspired by Focal Loss, we want the deep hash network to give priority to image pairs whose similarity is hard to preserve. Focal Loss, however, adjusts the loss using classification results, so the weights must be redesigned here. Since the purpose of hashing learning is to preserve the similarity of images in Hamming space, this paper uses the cosine similarity of the hash codes to design the weights, whose expression is: 1 + …
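To make the aggregation in Eqs. (3)–(4) concrete, below is a minimal PyTorch sketch of a NetVLAD layer that soft-assigns each local descriptor to K centers and accumulates the residuals x_i(j) − C_k(j). The number of clusters, feature dimension, initial α, and the L2 normalizations are illustrative choices, not the DPLAH authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NetVLAD(nn.Module):
    """Minimal NetVLAD layer: soft-assignment (Eq. (3)) + residual aggregation (Eq. (4))."""
    def __init__(self, num_clusters=64, dim=512, alpha=100.0):
        super().__init__()
        centers = torch.randn(num_clusters, dim)
        self.centers = nn.Parameter(centers)                      # cluster centers C_k
        # 1x1 conv implements w_k^T x + b_k with w_k = 2*alpha*C_k, b_k = -alpha*||C_k||^2
        self.conv = nn.Conv2d(dim, num_clusters, kernel_size=1, bias=True)
        with torch.no_grad():
            self.conv.weight.copy_((2.0 * alpha * centers).view(num_clusters, dim, 1, 1))
            self.conv.bias.copy_(-alpha * centers.norm(dim=1) ** 2)

    def forward(self, x):                                          # x: (B, D, H, W) conv features
        B, D, H, W = x.shape
        soft_assign = F.softmax(self.conv(x).view(B, -1, H * W), dim=1)   # a_k(x_i): (B, K, N)
        x_flat = x.view(B, D, H * W)                                       # N = H*W descriptors
        residual = x_flat.unsqueeze(1) - self.centers.view(1, -1, D, 1)    # x_i(j) - C_k(j): (B, K, D, N)
        vlad = (soft_assign.unsqueeze(2) * residual).sum(dim=-1)           # Eq. (4): (B, K, D)
        vlad = F.normalize(vlad, p=2, dim=2)                               # intra-normalization
        return F.normalize(vlad.view(B, -1), p=2, dim=1)                   # flattened VLAD vector

# Usage sketch: aggregate a hypothetical ResNet-18 feature map of shape (batch, 512, 7, 7)
features = torch.randn(4, 512, 7, 7)
print(NetVLAD()(features).shape)          # torch.Size([4, 64*512])
```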
Survey: Representation Learning: A Review and New Perspectives
explanatory factors for the observed input. A good representation is also one that is useful as input to a supervised predictor. Among the various ways of learning representations, this paper focuses on deep learning methods: those that are formed by the composition of multiple non-linear transformations, with the goal of yielding more abstract – and ultimately more useful – representations. Here we survey this rapidly developing area with special emphasis on recent progress. We consider some of the fundamental questions that have been driving research in this area. Specifically, what makes one representation better than another? Given an example, how should we compute its representation, i.e. perform feature extraction? Also, what are appropriate objectives for learning good representations?
Germany's Industrie 4.0 (original version)
- Intense research activities in universities and other research institutions
- Drastically increasing number of publications in recent years
- Large amount of funding by the German government
Model predictive control (MPC)
- Modern, optimization-based control technique
- Successful applications in many industrial fields
- Can handle hard constraints on states and inputs
- Optimization of some performance criterion
- Applicable to nonlinear, MIMO systems
A system is strictly dissipative on a set W ⊆ Z with respect to the supply rate s if there exists a storage function λ such that for all (x, u) ∈ W it holds that λ(f(x, u)) − λ(x) ≤ s(x, u) − ρ(x) with ρ > 0.
Figure: predicted state trajectory x(k|t+1) and input trajectory u(k|t+1) over the prediction horizon k = 0, …, N, starting from the state x(t+1) at time t+1.
Basic MPC scheme
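The basic MPC scheme solves, at every sampling instant, a finite-horizon optimal control problem and applies only the first input before re-solving at the next instant. The snippet below is a minimal sketch of that receding-horizon loop on a toy linear double-integrator model with hard state and input constraints; the model matrices, weights, horizon, and the use of cvxpy are illustrative assumptions, not taken from the slides.

```python
import numpy as np
import cvxpy as cp

# Toy double-integrator x+ = A x + B u (illustrative assumption)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R = np.diag([10.0, 1.0]), np.array([[0.1]])
N = 20                                    # prediction horizon
x = np.array([2.0, 0.0])                  # current state x(t)

for t in range(50):                       # closed loop: solve, apply first input, repeat
    X = cp.Variable((2, N + 1))
    U = cp.Variable((1, N))
    cost, constr = 0, [X[:, 0] == x]
    for k in range(N):
        cost += cp.quad_form(X[:, k], Q) + cp.quad_form(U[:, k], R)   # performance criterion
        constr += [X[:, k + 1] == A @ X[:, k] + B @ U[:, k],          # model constraint
                   cp.abs(U[:, k]) <= 1.0,                            # hard input constraint
                   cp.abs(X[0, k + 1]) <= 5.0]                        # hard state constraint
    cp.Problem(cp.Minimize(cost), constr).solve()
    u0 = U.value[:, 0]                    # apply only the first optimal input
    x = A @ x + B @ u0                    # plant update (here: same model as predictor)
```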
A Sparse Greedy Randomized Kaczmarz Algorithm for Sparse Solutions of Linear Systems
… the size k̂ of the estimated support set.
② Output: x_j.
③ Initialize S = {1, …, n}, x_0 = 0, j = 0.
④ While j ≤ M, set j = j + 1.
⑤ Select a row vector a_i, i ∈ {1, …, m}, where row i is chosen with probability ‖a_i‖_2² / ‖A‖_F².
⑥ Determine the estimated support set S = supp(x_{j−1}, max{k̂, n − j + 1}).
… rows, thereby speeding up the convergence of the algorithm. Algorithm 3 presents the sparse greedy randomized Kaczmarz algorithm.
Algorithm 3: Sparse greedy randomized Kaczmarz algorithm.
① Input: A ∈ R^{m×n}, b ∈ R^m, the maximum number of iterations M, and the size k̂ of the estimated support set.
② Output: x_k.
③ Initialize S = {1, …, n}, x_0 = x_0^* = 0.
④ Set k = 0; while k ≤ M − 1:
⑤ Compute

$$\epsilon_k = \frac{1}{2}\left(\frac{1}{\lVert b - A x_k\rVert_2^2}\,\max_{1\le i_k\le m}\frac{|b_{i_k} - a_{i_k} x_k|^2}{\lVert a_{i_k}\rVert_2^2} + \frac{1}{\lVert A\rVert_F^2}\right) \qquad (2)$$

⑥ Determine the index set of positive integers

$$U_k = \left\{\, i_k \;:\; |b_{i_k} - a_{i_k} x_k|^2 \ge \epsilon_k\,\lVert b - A x_k\rVert_2^2\,\lVert a_{i_k}\rVert_2^2 \,\right\}$$
The entries of the weight vector w_j are

$$(w_j)_l = \begin{cases} 1, & l \in S \\ \dfrac{1}{j}, & l \in S^{c} \end{cases}$$

where j is the iteration index. As j → ∞, w_j ⊙ a_i → a_i^S, and therefore …
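A small numerical sketch of this iteration may help. Eq. (2), the index set U_k, and the support weight w_j (1 on S, 1/j on S^c) follow the text above; the sampling rule inside U_k, the interpretation of supp(x, s) as the indices of the s largest-magnitude entries, and the weighted projection update are standard greedy/sparse randomized Kaczmarz choices filled in here as assumptions, since the corresponding steps are truncated in this excerpt.

```python
import numpy as np

def sparse_greedy_randomized_kaczmarz(A, b, k_hat, max_iter=1000, seed=0):
    """Sketch of a sparse greedy randomized Kaczmarz iteration (see caveats in the text)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    row_norm2 = np.sum(A ** 2, axis=1)                       # ||a_i||_2^2
    fro2 = row_norm2.sum()                                   # ||A||_F^2
    for k in range(max_iter):
        r = b - A @ x                                        # residual
        if np.linalg.norm(r) < 1e-10:
            break
        # Eq. (2): greedy threshold epsilon_k
        eps = 0.5 * (np.max(r ** 2 / row_norm2) / (r @ r) + 1.0 / fro2)
        # index set U_k of rows with sufficiently large relative residual
        U = np.where(r ** 2 >= eps * (r @ r) * row_norm2)[0]
        p = r[U] ** 2 / row_norm2[U]
        i = rng.choice(U, p=p / p.sum())                     # sample within U_k (assumption)
        # estimated support: indices of the max(k_hat, n-j+1) largest |x| entries
        size = max(k_hat, n - k)
        S = np.argsort(-np.abs(x))[:size]
        w = np.full(n, 1.0 / (k + 1)); w[S] = 1.0            # weight: 1 on S, 1/j on S^c
        wa = w * A[i]                                        # w_j ⊙ a_i
        x = x + ((b[i] - A[i] @ x) / (wa @ wa)) * wa         # weighted Kaczmarz projection
    return x
```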
Sparse Operators in Compilation

A sparse operator is an operator that acts on only part of the elements, for example a sparse matrix in matrix multiplication. During compilation, handling sparse operators typically involves how to store and compute sparse matrices efficiently and how to optimize the performance of sparse-operator computation. Common approaches for handling sparse operators in a compiler include the following:

1. Compressed storage: for sparse matrices, compressed storage formats reduce memory usage, for example the triplet (coordinate) format or row-compressed storage (a sketch of compressed row storage follows this list).
2. Sparse-operator optimization: optimizing the sparse operators themselves can significantly improve performance; for example, algorithms such as the fast Fourier transform (FFT) can be used to accelerate operations such as sparse matrix multiplication.
3. Code-generation optimization: the compiler can generate optimized code based on the characteristics of the sparse operator, for example using vector instructions or parallel computation to speed up the sparse computation.
4. Memory optimization: for large sparse matrices, memory usage is also an important issue; memory-optimization techniques such as cache optimization and memory alignment can improve memory efficiency.
5. Parallel computation: large-scale sparse-matrix operations can be accelerated with parallel computing, for example by partitioning the sparse matrix into sub-matrices and processing them in parallel with multiple threads or with distributed computing.

In short, handling sparse operators during compilation requires jointly considering storage, computation, and memory, and applying a range of optimization techniques to improve computational performance and memory efficiency.
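As an illustration of point 1 above (compressed storage), the following sketch builds a compressed sparse row (CSR) representation by hand and performs a matrix-vector product that touches only the stored nonzeros. It is a didactic example, not tied to any particular compiler.

```python
import numpy as np

def dense_to_csr(M):
    """Compress a dense matrix into CSR arrays (values, column indices, row pointers)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in M:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        col_idx.extend(nz)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def csr_matvec(values, col_idx, row_ptr, x):
    """y = M @ x, iterating only over the stored nonzero entries of each row."""
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        start, end = row_ptr[i], row_ptr[i + 1]
        y[i] = values[start:end] @ x[col_idx[start:end]]
    return y

M = np.array([[0., 2., 0.], [1., 0., 0.], [0., 0., 3.]])
v, c, r = dense_to_csr(M)
print(csr_matvec(v, c, r, np.array([1., 1., 1.])))   # [2. 1. 3.]
```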
Survivability Coordinated Mapping Based on Node Centrality and Spectrum Dispersion Awareness for Virtual Optical Networks
LIU Huanlin*①, HU Huixia①, CHEN Yong②, WEN Meng①, WANG Zhanpeng①
① School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
② School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Abstract: The mapping strategy of a virtual network has an important effect on the resource availability and survivability of the elastic optical network (EON). This paper proposes a survivable virtual optical network Coordinated Mapping algorithm based on Distance and Spectrum Dispersion Awareness (CM-DSDA) between nodes. A weighted sorting strategy for physical nodes is studied that considers not only the computing resources of a node but also its location centrality in the EON topology, and a spectrum dispersion metric is designed to evaluate the degree of spectrum fragmentation of a link. During survivable mapping of a virtual link, the working and protection lightpaths adjacent to the already-mapped nodes that consume the fewest frequency slots and have the lowest spectrum dispersion are selected to map the virtual network in a coordinated way. Simulation results show that CM-DSDA can effectively increase the EON's spectrum utilization and reduce the bandwidth blocking probability.

Key words: Virtual optical network; Survivable coordinated mapping; Node centrality; Spectrum dispersion; Spectrum usage ratio
CLC number: TN929.11    Document code: A    Article ID: 1009-5896(2020)09-2166-07    DOI: 10.11999/JEIT190543

1 Introduction
With the rapid development of data applications such as cloud computing, the mobile Internet, and future networks, the exchange of massive data and the uncertainty of traffic flows pose a challenge to traditional wavelength-division-multiplexed optical networks with fixed bandwidth and a single modulation format [1].
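The abstract describes the node-ordering idea only qualitatively (computing resources plus topological location centrality). The sketch below is a hypothetical illustration of such a ranking using betweenness centrality from networkx; the normalization, the linear combination, and the weight alpha are assumptions introduced here and are not the CM-DSDA formula.

```python
import networkx as nx

def rank_physical_nodes(G, compute_resources, alpha=0.5):
    """Hypothetical ranking of physical nodes: normalized computing resources
    combined with normalized betweenness centrality (the combination is an assumption)."""
    bc = nx.betweenness_centrality(G)
    max_cpu = max(compute_resources.values()) or 1.0
    max_bc = max(bc.values()) or 1.0
    score = {v: alpha * compute_resources[v] / max_cpu + (1 - alpha) * bc[v] / max_bc
             for v in G.nodes}
    return sorted(G.nodes, key=lambda v: score[v], reverse=True)

# Toy 6-node physical topology with per-node CPU units (illustrative data only)
G = nx.cycle_graph(6)
cpu = {v: 10 + 2 * v for v in G.nodes}
print(rank_physical_nodes(G, cpu))
```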
2024 SLAM Algorithm Engineer Recruitment Written Test Questions and Reference Answers (a Fortune Global 500 Group)
2024 recruitment written test for SLAM algorithm engineers, with reference answers (a Fortune Global 500 group; answers are given at the end).

I. Single-choice questions (10 questions in this part, 2 points each, 20 points in total)

1. Which of the following is not one of the basic problems of SLAM (Simultaneous Localization and Mapping)?
A. Localization  B. Mapping  C. Navigation  D. Path planning

2. In visual SLAM, commonly used feature point detection algorithms do not include which of the following?
A. SIFT (Scale-Invariant Feature Transform)  B. SURF (Speeded Up Robust Features)  C. ORB (Oriented FAST and Rotated BRIEF)  D. BOW (Bag-of-Words)

3. What is the main purpose of the "loop closure detection" function in a SLAM system?
A. Improve map accuracy  B. Reduce the amount of computation  C. Optimize path planning  D. Enhance system stability

4. In visual SLAM, which of the following methods is typically used to extract feature points?
A. SIFT (Scale-Invariant Feature Transform)  B. SURF (Speeded Up Robust Features)  C. ORB (Oriented FAST and Rotated BRIEF)  D. All of the above

5. What is the core goal of a SLAM (Simultaneous Localization and Mapping) algorithm?
A. Autonomous navigation of driverless vehicles in unknown environments  B. Building a 3D map of the environment and updating it in real time  C. Robot path planning  D. All of the above

6. Which of the following sensors is not suitable for a SLAM system?
A. Lidar  B. Camera  C. Sonar  D. Ultrasonic sensor

7. Which of the following statements about SLAM (simultaneous localization and mapping) systems is wrong?
A. A SLAM system usually needs to perform localization and mapping in an unknown environment.
B. A SLAM system usually needs sensors to acquire information about the environment.
C. A SLAM system can generate a map in real time and update the position estimate.
D. A SLAM system does not need an initial localization step.

8. Which of the following statements about visual SLAM (visual simultaneous localization and mapping) systems is correct?
A. A visual SLAM system relies only on visual sensors for localization and mapping.
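Questions 2 and 4 refer to ORB features. The snippet below is a minimal OpenCV sketch of ORB keypoint detection and Hamming-distance matching between two frames, the typical front-end step in feature-based visual SLAM; the image file names are placeholders.

```python
import cv2

# Detect ORB keypoints and descriptors on two grayscale frames (paths are placeholders).
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)           # Oriented FAST keypoints + Rotated BRIEF descriptors
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching with Hamming distance, the usual pairing for binary descriptors.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
print(len(kp1), len(kp2), len(matches))        # keypoints per frame and matched pairs
```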
Integrated Gradients Feature Attribution — Overview and Explanation
1. Introduction
1.1 Overview
In the overview, the background and importance of the integrated gradients feature attribution method can be described from the following angles. Integrated gradients is a technique for analyzing and explaining the predictions of machine learning models. With the rapid development and wide application of machine learning, the demand for model interpretability keeps growing. Traditional machine learning models are often regarded as "black boxes": they cannot explain why a particular prediction was made. This limits their use in critical application areas such as financial risk assessment, medical diagnosis, and autonomous driving. To address this problem, researchers have proposed a variety of interpretation methods for machine learning models, among which integrated gradients is a widely followed and effective technique. Integrated gradients provides interpretable explanations of a model's predictions, revealing how much attention the model pays to different features and how much influence each feature has. By analyzing the gradient values associated with each feature in the model, one can determine the role the feature plays in the prediction and its contribution, which helps users understand the model's decision process. This matters for evaluating, optimizing, and improving models. Integrated gradients is broadly applicable: it works not only for traditional machine learning models such as decision trees, support vector machines, and logistic regression, but also for deep learning models such as neural networks and convolutional neural networks, and it provides useful information and explanations for both numerical and categorical features. This article elaborates the principle of integrated gradients, its advantages in applications, and its future development, aiming to give readers a comprehensive understanding and a usage guide. The following chapters first introduce the basic principle and algorithm of integrated gradients, then discuss its advantages and practical application scenarios, and finally summarize its importance and look ahead to its future development.
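As a concrete reference for the gradient-based attribution described above, the following is a minimal sketch of the standard integrated-gradients computation: the attribution is (x − x′) times the average gradient of the model output along the straight path from a baseline x′ to the input x, approximated here with a Riemann sum. The model, baseline choice, and number of steps are illustrative.

```python
import torch

def integrated_gradients(model, x, baseline=None, steps=50, target=None):
    """Approximate integrated gradients for a single input x (1-D or higher)."""
    if baseline is None:
        baseline = torch.zeros_like(x)                 # common (but not mandatory) baseline
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * x.dim()))
    path = baseline + alphas * (x - baseline)          # points along the straight path
    path.requires_grad_(True)
    out = model(path)
    if target is not None:                             # pick one output/class to attribute
        out = out[:, target]
    grads = torch.autograd.grad(out.sum(), path)[0]    # dF/dx at each path point
    return (x - baseline) * grads.mean(dim=0)          # per-feature attribution

# Usage sketch on a toy linear model (illustrative only)
model = torch.nn.Linear(4, 3)
x = torch.randn(4)
print(integrated_gradients(model, x, target=0))
```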
1.2 Structure of the article
The structure section gives an overview of the framework of the whole article, so that readers can clearly understand its organization and the arrangement of its content while reading. The first part is the introduction, which presents the background and significance of the article. Section 1.1 outlines the topic to be discussed and briefly introduces the basic concepts and application areas of the integrated gradients feature attribution method. Section 1.2 focuses on the structure of the article, listing the titles and content summaries of each part so that readers can quickly get an idea of the overall content.
Sparse Representation Learning Methods in Reinforcement Learning Algorithms Explained (Part 5)
Sparse Representation Learning Methods in Reinforcement Learning Algorithms

Reinforcement learning (RL) is a machine learning paradigm whose goal is to let an agent learn, through interaction with its environment, how to make optimal decisions in an unknown environment. In reinforcement learning, the agent observes the state of the environment and takes actions to obtain rewards, continuously improving its policy. Sparse representation learning, in turn, is a method for feature extraction and dimensionality reduction: by learning a sparse representation of the data, it can better capture the data's latent structure and features. This article discusses sparse representation learning methods in reinforcement learning and their applications in detail.

I. Basic principle of sparse representation learning
The basic idea of sparse representation learning is to represent data as a linear combination of basis functions while using as few basis functions as possible. Given a dataset, sparse representation learning aims to find a set of sparse coefficients such that the data can be represented linearly by them. In reinforcement learning, sparse representation learning can be used to extract features of the environment and thus help the agent better understand the environment and make decisions.

II. Applications of sparse representation learning in reinforcement learning
In reinforcement learning, the agent must continuously observe the state of the environment and make decisions. However, because of the complexity of the environment and the presence of high-dimensional features, traditional feature-extraction methods often fall short. Sparse representation learning, by learning a sparse representation of the data, captures the environment's features better and thereby helps the agent understand the environment and make decisions. For example, in deep reinforcement learning the agent usually uses a neural network to approximate the value function or the policy. Sparse representation learning can be used for feature extraction, helping the neural network learn the features of the environment better: learning sparse representations of the data captures the latent structure and features of the environment more effectively and improves the agent's decision-making ability.

III. Sparse representation learning methods
Commonly used sparse representation learning methods in reinforcement learning include dictionary learning, compressed sensing, and sparse autoencoders. All of these can be used to learn sparse representations of the data and thus help the agent better understand the environment and make decisions.

1. Dictionary learning
Dictionary learning is a commonly used sparse representation learning method whose goal is to learn a set of basis functions (a dictionary) such that the data can be represented linearly by them. In reinforcement learning, dictionary learning can be used to extract features of the environment, helping the agent better understand the environment and make decisions.
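Below is a minimal sketch of the dictionary-learning idea just described, using scikit-learn on synthetic "state observation" vectors; the data, dictionary size, and sparsity level are illustrative assumptions and are not tied to any specific reinforcement-learning pipeline.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

# Toy "state observations": 500 samples of 64-dimensional features (synthetic data)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))

# Learn a dictionary of 32 atoms; each state is then represented by a sparse code
dico = MiniBatchDictionaryLearning(n_components=32, alpha=1.0, random_state=0)
D = dico.fit(X).components_                                        # (32, 64) dictionary atoms

codes = sparse_encode(X, D, algorithm="omp", n_nonzero_coefs=5)    # (500, 32), at most 5 nonzeros per row
print(codes.shape, np.count_nonzero(codes, axis=1).mean())

# The sparse codes can then serve as compact state features for a value function
# or policy (e.g. linear Q-learning on `codes` instead of the raw observations).
```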
Sparse Vector Retrieval
I. Introduction
In the era of big data, sparse vector retrieval has become an important research area. Sparse vector retrieval means efficiently finding, in a large dataset, the vectors that are similar to a given sparse vector. The technique is widely used in recommender systems, information retrieval, machine learning, and other fields. This article discusses the methods and applications of sparse vector retrieval and the challenges it faces.

II. Methods of sparse vector retrieval
1. Approximate Nearest Neighbor Search (ANN): this approach finds similar vectors by computing approximate distances between vectors. Common approximate algorithms include hashing-based methods (such as LSH) and tree-based methods (such as Annoy). A minimal LSH sketch follows this list.
2. Density-based clustering: the high-dimensional data are clustered into multiple groups, and vectors similar to the given vector are then searched within each group. This approach handles nonlinear data and outliers well.
3. Kernel-based methods: a kernel function maps the high-dimensional data into a lower-dimensional space, and vector similarities are computed in that space. These methods perform well on high-dimensional data.
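As promised in item 1 above, here is a minimal random-hyperplane LSH sketch: vectors whose bit signatures agree in many positions tend to have high cosine similarity, so candidate neighbors are found by bucket lookup instead of a full scan. The signature length, data, and single-table design are illustrative simplifications.

```python
import numpy as np

class RandomHyperplaneLSH:
    """Minimal random-hyperplane LSH index: hash each vector to a bit signature
    given by the signs of projections onto random hyperplanes, then look up
    candidates that share the same bucket."""
    def __init__(self, dim, n_bits=16, seed=0):
        self.planes = np.random.default_rng(seed).normal(size=(n_bits, dim))
        self.buckets = {}

    def _signature(self, v):
        return tuple((self.planes @ v > 0).astype(int))

    def index(self, key, v):
        self.buckets.setdefault(self._signature(v), []).append(key)

    def query(self, v):
        return self.buckets.get(self._signature(v), [])

# Illustrative use on random vectors
lsh = RandomHyperplaneLSH(dim=128)
data = {i: np.random.default_rng(i).normal(size=128) for i in range(1000)}
for k, v in data.items():
    lsh.index(k, v)
print(lsh.query(data[42]))     # candidate set containing 42 (and any near-duplicates)
```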
III. Applications of sparse vector retrieval
1. Recommender systems: sparse vector retrieval can recommend similar content or products to a user based on the user's historical behavior and preferences.
2. Information retrieval: in search engines, sparse vector retrieval can quickly find documents or web pages relevant to a query.
3. Machine learning: sparse vector retrieval can be used for feature dimensionality reduction, outlier detection, and other machine learning tasks, improving the efficiency and accuracy of the algorithms.

IV. Challenges
1. Handling high-dimensional data: processing high-dimensional data is a major challenge for sparse vector retrieval. Data in high-dimensional spaces are typically highly sparse, and representing and processing them effectively is difficult.
2. Balancing data scale and efficiency: in a big-data environment, sparse vector retrieval must quickly find similar vectors in very large datasets; doing so while maintaining efficiency is challenging.
3. Computing semantic similarity: some applications require the semantic similarity between vectors rather than a purely distance-based similarity, and computing semantic similarity effectively is itself a challenging problem.
A Survey of RRT-Based Motion Planning Algorithms
1. Introduction
Over the past decade and more, robot motion planning has received a great deal of attention, as robots have become an important part of modern industry and daily life. The earliest formulation of the motion-planning problem considered only how to move a piano from one room to another without colliding with any object. Early work focused on complete motion-planning algorithms (completeness means that if a feasible path exists, the algorithm is guaranteed to find it in finite time), for example by representing the robot as a polygon and the obstacles as other polygons and converting the problem into an algebraic one. These algorithms, however, ran into computational complexity problems: they have exponential time complexity. In 1979, Reif proved that the piano mover's motion-planning problem is PSPACE-hard [1], so such complete planning algorithms cannot be applied in practice.

Motion-planning algorithms used in practice include cell decomposition [2], potential fields [3], and roadmap methods [4]. With well-chosen parameters, these algorithms can guarantee completeness of the planning and can bound the time spent in complex environments. However, they have many drawbacks in practice. For example, they cannot be used in high-dimensional spaces, where cell decomposition becomes computationally intractable, and potential-field methods get trapped in local minima, causing planning to fail [5], [6].

Sampling-based motion planning algorithms, proposed over roughly the last two decades, have attracted great attention. Broadly speaking, a sampling-based motion planner connects a set of points randomly sampled from the obstacle-free space, attempting to build a path from the initial state to the goal state. In contrast to complete motion planners, sampling-based methods save a large amount of computation by avoiding an explicit construction of the obstacles in the state space. Although these algorithms are not complete, they are probabilistically complete, meaning that the probability that the planner fails to return a solution, when one exists, decays to zero as the number of samples approaches infinity [7], and the decay rate is exponential.

The Rapidly-exploring Random Tree (RRT) algorithm is a sampling-based motion planning algorithm that has been widely developed and applied over the past two decades. It was proposed in 1998 by Prof. Steven M. LaValle of Iowa State University, who has long worked on improvements and applications of RRT and whose work laid the foundation of the algorithm.
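A minimal 2D sketch of the basic RRT loop described above may be useful: sample a random point, extend the nearest tree node by a fixed step toward it, and stop once the goal region is reached. The workspace, obstacle, step size, and termination tolerance are illustrative assumptions.

```python
import numpy as np

def rrt(start, goal, is_free, bounds, step=0.5, max_iter=2000, goal_tol=0.5, seed=0):
    """Minimal 2D RRT returning a list of waypoints from start to the goal region, or None."""
    rng = np.random.default_rng(seed)
    nodes = [np.asarray(start, float)]
    parent = {0: None}
    goal = np.asarray(goal, float)
    for _ in range(max_iter):
        sample = rng.uniform(bounds[0], bounds[1], size=2)                 # random free-space sample
        i = int(np.argmin([np.linalg.norm(n - sample) for n in nodes]))    # nearest tree node
        direction = sample - nodes[i]
        new = nodes[i] + step * direction / (np.linalg.norm(direction) + 1e-12)
        if not is_free(new):                       # collision check supplied by the caller
            continue
        parent[len(nodes)] = i
        nodes.append(new)
        if np.linalg.norm(new - goal) < goal_tol:  # goal region reached: walk back to the root
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j]); j = parent[j]
            return path[::-1]
    return None

# Toy workspace: unit obstacle disc at (5, 5) inside a 10x10 box (illustrative)
free = lambda p: np.linalg.norm(p - np.array([5.0, 5.0])) > 1.0
print(rrt((1, 1), (9, 9), free, (np.array([0.0, 0.0]), np.array([10.0, 10.0]))))
```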
From Data Mining to Knowledge Discovery in Databases
s Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media atten-tion of late. What is all the excitement about?This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular real-world applications, specific data-mining techniques, challenges in-volved in real-world applications of knowledge discovery, and current and future research direc-tions in the field.A cross a wide variety of fields, data arebeing collected and accumulated at adramatic pace. There is an urgent need for a new generation of computational theo-ries and tools to assist humans in extracting useful information (knowledge) from the rapidly growing volumes of digital data. These theories and tools are the subject of the emerging field of knowledge discovery in databases (KDD).At an abstract level, the KDD field is con-cerned with the development of methods and techniques for making sense of data. The basic problem addressed by the KDD process is one of mapping low-level data (which are typically too voluminous to understand and digest easi-ly) into other forms that might be more com-pact (for example, a short report), more ab-stract (for example, a descriptive approximation or model of the process that generated the data), or more useful (for exam-ple, a predictive model for estimating the val-ue of future cases). At the core of the process is the application of specific data-mining meth-ods for pattern discovery and extraction.1This article begins by discussing the histori-cal context of KDD and data mining and theirintersection with other related fields. A briefsummary of recent KDD real-world applica-tions is provided. Definitions of KDD and da-ta mining are provided, and the general mul-tistep KDD process is outlined. This multistepprocess has the application of data-mining al-gorithms as one particular step in the process.The data-mining step is discussed in more de-tail in the context of specific data-mining al-gorithms and their application. Real-worldpractical application issues are also outlined.Finally, the article enumerates challenges forfuture research and development and in par-ticular discusses potential opportunities for AItechnology in KDD systems.Why Do We Need KDD?The traditional method of turning data intoknowledge relies on manual analysis and in-terpretation. For example, in the health-careindustry, it is common for specialists to peri-odically analyze current trends and changesin health-care data, say, on a quarterly basis.The specialists then provide a report detailingthe analysis to the sponsoring health-care or-ganization; this report becomes the basis forfuture decision making and planning forhealth-care management. In a totally differ-ent type of application, planetary geologistssift through remotely sensed images of plan-ets and asteroids, carefully locating and cata-loging such geologic objects of interest as im-pact craters. Be it science, marketing, finance,health care, retail, or any other field, the clas-sical approach to data analysis relies funda-mentally on one or more analysts becomingArticlesFALL 1996 37From Data Mining to Knowledge Discovery inDatabasesUsama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth Copyright © 1996, American Association for Artificial Intelligence. All rights reserved. 
0738-4602-1996 / $2.00areas is astronomy. Here, a notable success was achieved by SKICAT ,a system used by as-tronomers to perform image analysis,classification, and cataloging of sky objects from sky-survey images (Fayyad, Djorgovski,and Weir 1996). In its first application, the system was used to process the 3 terabytes (1012bytes) of image data resulting from the Second Palomar Observatory Sky Survey,where it is estimated that on the order of 109sky objects are detectable. SKICAT can outper-form humans and traditional computational techniques in classifying faint sky objects. See Fayyad, Haussler, and Stolorz (1996) for a sur-vey of scientific applications.In business, main KDD application areas includes marketing, finance (especially in-vestment), fraud detection, manufacturing,telecommunications, and Internet agents.Marketing:In marketing, the primary ap-plication is database marketing systems,which analyze customer databases to identify different customer groups and forecast their behavior. Business Week (Berry 1994) estimat-ed that over half of all retailers are using or planning to use database marketing, and those who do use it have good results; for ex-ample, American Express reports a 10- to 15-percent increase in credit-card use. Another notable marketing application is market-bas-ket analysis (Agrawal et al. 1996) systems,which find patterns such as, “If customer bought X, he/she is also likely to buy Y and Z.” Such patterns are valuable to retailers.Investment: Numerous companies use da-ta mining for investment, but most do not describe their systems. One exception is LBS Capital Management. Its system uses expert systems, neural nets, and genetic algorithms to manage portfolios totaling $600 million;since its start in 1993, the system has outper-formed the broad stock market (Hall, Mani,and Barr 1996).Fraud detection: HNC Falcon and Nestor PRISM systems are used for monitoring credit-card fraud, watching over millions of ac-counts. The FAIS system (Senator et al. 1995),from the U.S. Treasury Financial Crimes En-forcement Network, is used to identify finan-cial transactions that might indicate money-laundering activity.Manufacturing: The CASSIOPEE trou-bleshooting system, developed as part of a joint venture between General Electric and SNECMA, was applied by three major Euro-pean airlines to diagnose and predict prob-lems for the Boeing 737. To derive families of faults, clustering methods are used. CASSIOPEE received the European first prize for innova-intimately familiar with the data and serving as an interface between the data and the users and products.For these (and many other) applications,this form of manual probing of a data set is slow, expensive, and highly subjective. In fact, as data volumes grow dramatically, this type of manual data analysis is becoming completely impractical in many domains.Databases are increasing in size in two ways:(1) the number N of records or objects in the database and (2) the number d of fields or at-tributes to an object. Databases containing on the order of N = 109objects are becoming in-creasingly common, for example, in the as-tronomical sciences. Similarly, the number of fields d can easily be on the order of 102or even 103, for example, in medical diagnostic applications. Who could be expected to di-gest millions of records, each having tens or hundreds of fields? 
We believe that this job is certainly not one for humans; hence, analysis work needs to be automated, at least partially.The need to scale up human analysis capa-bilities to handling the large number of bytes that we can collect is both economic and sci-entific. Businesses use data to gain competi-tive advantage, increase efficiency, and pro-vide more valuable services to customers.Data we capture about our environment are the basic evidence we use to build theories and models of the universe we live in. Be-cause computers have enabled humans to gather more data than we can digest, it is on-ly natural to turn to computational tech-niques to help us unearth meaningful pat-terns and structures from the massive volumes of data. Hence, KDD is an attempt to address a problem that the digital informa-tion era made a fact of life for all of us: data overload.Data Mining and Knowledge Discovery in the Real WorldA large degree of the current interest in KDD is the result of the media interest surrounding successful KDD applications, for example, the focus articles within the last two years in Business Week , Newsweek , Byte , PC Week , and other large-circulation periodicals. Unfortu-nately, it is not always easy to separate fact from media hype. Nonetheless, several well-documented examples of successful systems can rightly be referred to as KDD applications and have been deployed in operational use on large-scale real-world problems in science and in business.In science, one of the primary applicationThere is an urgent need for a new generation of computation-al theories and tools toassist humans in extractinguseful information (knowledge)from the rapidly growing volumes ofdigital data.Articles38AI MAGAZINEtive applications (Manago and Auriol 1996).Telecommunications: The telecommuni-cations alarm-sequence analyzer (TASA) wasbuilt in cooperation with a manufacturer oftelecommunications equipment and threetelephone networks (Mannila, Toivonen, andVerkamo 1995). The system uses a novelframework for locating frequently occurringalarm episodes from the alarm stream andpresenting them as rules. Large sets of discov-ered rules can be explored with flexible infor-mation-retrieval tools supporting interactivityand iteration. In this way, TASA offers pruning,grouping, and ordering tools to refine the re-sults of a basic brute-force search for rules.Data cleaning: The MERGE-PURGE systemwas applied to the identification of duplicatewelfare claims (Hernandez and Stolfo 1995).It was used successfully on data from the Wel-fare Department of the State of Washington.In other areas, a well-publicized system isIBM’s ADVANCED SCOUT,a specialized data-min-ing system that helps National Basketball As-sociation (NBA) coaches organize and inter-pret data from NBA games (U.S. News 1995). ADVANCED SCOUT was used by several of the NBA teams in 1996, including the Seattle Su-personics, which reached the NBA finals.Finally, a novel and increasingly importanttype of discovery is one based on the use of in-telligent agents to navigate through an infor-mation-rich environment. Although the ideaof active triggers has long been analyzed in thedatabase field, really successful applications ofthis idea appeared only with the advent of theInternet. These systems ask the user to specifya profile of interest and search for related in-formation among a wide variety of public-do-main and proprietary sources. 
For example, FIREFLY is a personal music-recommendation agent: It asks a user his/her opinion of several music pieces and then suggests other music that the user might like (<http:// www.ffl/>). CRAYON(/>) allows users to create their own free newspaper (supported by ads); NEWSHOUND(<http://www. /hound/>) from the San Jose Mercury News and FARCAST(</> automatically search information from a wide variety of sources, including newspapers and wire services, and e-mail rele-vant documents directly to the user.These are just a few of the numerous suchsystems that use KDD techniques to automat-ically produce useful information from largemasses of raw data. See Piatetsky-Shapiro etal. (1996) for an overview of issues in devel-oping industrial KDD applications.Data Mining and KDDHistorically, the notion of finding useful pat-terns in data has been given a variety ofnames, including data mining, knowledge ex-traction, information discovery, informationharvesting, data archaeology, and data patternprocessing. The term data mining has mostlybeen used by statisticians, data analysts, andthe management information systems (MIS)communities. It has also gained popularity inthe database field. The phrase knowledge dis-covery in databases was coined at the first KDDworkshop in 1989 (Piatetsky-Shapiro 1991) toemphasize that knowledge is the end productof a data-driven discovery. It has been popular-ized in the AI and machine-learning fields.In our view, KDD refers to the overall pro-cess of discovering useful knowledge from da-ta, and data mining refers to a particular stepin this process. Data mining is the applicationof specific algorithms for extracting patternsfrom data. The distinction between the KDDprocess and the data-mining step (within theprocess) is a central point of this article. Theadditional steps in the KDD process, such asdata preparation, data selection, data cleaning,incorporation of appropriate prior knowledge,and proper interpretation of the results ofmining, are essential to ensure that usefulknowledge is derived from the data. Blind ap-plication of data-mining methods (rightly crit-icized as data dredging in the statistical litera-ture) can be a dangerous activity, easilyleading to the discovery of meaningless andinvalid patterns.The Interdisciplinary Nature of KDDKDD has evolved, and continues to evolve,from the intersection of research fields such asmachine learning, pattern recognition,databases, statistics, AI, knowledge acquisitionfor expert systems, data visualization, andhigh-performance computing. The unifyinggoal is extracting high-level knowledge fromlow-level data in the context of large data sets.The data-mining component of KDD cur-rently relies heavily on known techniquesfrom machine learning, pattern recognition,and statistics to find patterns from data in thedata-mining step of the KDD process. A natu-ral question is, How is KDD different from pat-tern recognition or machine learning (and re-lated fields)? The answer is that these fieldsprovide some of the data-mining methodsthat are used in the data-mining step of theKDD process. KDD focuses on the overall pro-cess of knowledge discovery from data, includ-ing how the data are stored and accessed, howalgorithms can be scaled to massive data setsThe basicproblemaddressed bythe KDDprocess isone ofmappinglow-leveldata intoother formsthat might bemorecompact,moreabstract,or moreuseful.ArticlesFALL 1996 39A driving force behind KDD is the database field (the second D in KDD). 
Indeed, the problem of effective data manipulation when data cannot fit in the main memory is of fun-damental importance to KDD. Database tech-niques for gaining efficient data access,grouping and ordering operations when ac-cessing data, and optimizing queries consti-tute the basics for scaling algorithms to larger data sets. Most data-mining algorithms from statistics, pattern recognition, and machine learning assume data are in the main memo-ry and pay no attention to how the algorithm breaks down if only limited views of the data are possible.A related field evolving from databases is data warehousing,which refers to the popular business trend of collecting and cleaning transactional data to make them available for online analysis and decision support. Data warehousing helps set the stage for KDD in two important ways: (1) data cleaning and (2)data access.Data cleaning: As organizations are forced to think about a unified logical view of the wide variety of data and databases they pos-sess, they have to address the issues of map-ping data to a single naming convention,uniformly representing and handling missing data, and handling noise and errors when possible.Data access: Uniform and well-defined methods must be created for accessing the da-ta and providing access paths to data that were historically difficult to get to (for exam-ple, stored offline).Once organizations and individuals have solved the problem of how to store and ac-cess their data, the natural next step is the question, What else do we do with all the da-ta? This is where opportunities for KDD natu-rally arise.A popular approach for analysis of data warehouses is called online analytical processing (OLAP), named for a set of principles pro-posed by Codd (1993). OLAP tools focus on providing multidimensional data analysis,which is superior to SQL in computing sum-maries and breakdowns along many dimen-sions. OLAP tools are targeted toward simpli-fying and supporting interactive data analysis,but the goal of KDD tools is to automate as much of the process as possible. Thus, KDD is a step beyond what is currently supported by most standard database systems.Basic DefinitionsKDD is the nontrivial process of identifying valid, novel, potentially useful, and ultimate-and still run efficiently, how results can be in-terpreted and visualized, and how the overall man-machine interaction can usefully be modeled and supported. The KDD process can be viewed as a multidisciplinary activity that encompasses techniques beyond the scope of any one particular discipline such as machine learning. In this context, there are clear opportunities for other fields of AI (be-sides machine learning) to contribute to KDD. KDD places a special emphasis on find-ing understandable patterns that can be inter-preted as useful or interesting knowledge.Thus, for example, neural networks, although a powerful modeling tool, are relatively difficult to understand compared to decision trees. KDD also emphasizes scaling and ro-bustness properties of modeling algorithms for large noisy data sets.Related AI research fields include machine discovery, which targets the discovery of em-pirical laws from observation and experimen-tation (Shrager and Langley 1990) (see Kloes-gen and Zytkow [1996] for a glossary of terms common to KDD and machine discovery),and causal modeling for the inference of causal models from data (Spirtes, Glymour,and Scheines 1993). 
Statistics in particular has much in common with KDD (see Elder and Pregibon [1996] and Glymour et al.[1996] for a more detailed discussion of this synergy). Knowledge discovery from data is fundamentally a statistical endeavor. Statistics provides a language and framework for quan-tifying the uncertainty that results when one tries to infer general patterns from a particu-lar sample of an overall population. As men-tioned earlier, the term data mining has had negative connotations in statistics since the 1960s when computer-based data analysis techniques were first introduced. The concern arose because if one searches long enough in any data set (even randomly generated data),one can find patterns that appear to be statis-tically significant but, in fact, are not. Clearly,this issue is of fundamental importance to KDD. Substantial progress has been made in recent years in understanding such issues in statistics. Much of this work is of direct rele-vance to KDD. Thus, data mining is a legiti-mate activity as long as one understands how to do it correctly; data mining carried out poorly (without regard to the statistical as-pects of the problem) is to be avoided. KDD can also be viewed as encompassing a broader view of modeling than statistics. KDD aims to provide tools to automate (to the degree pos-sible) the entire process of data analysis and the statistician’s “art” of hypothesis selection.Data mining is a step in the KDD process that consists of ap-plying data analysis and discovery al-gorithms that produce a par-ticular enu-meration ofpatterns (or models)over the data.Articles40AI MAGAZINEly understandable patterns in data (Fayyad, Piatetsky-Shapiro, and Smyth 1996).Here, data are a set of facts (for example, cases in a database), and pattern is an expres-sion in some language describing a subset of the data or a model applicable to the subset. Hence, in our usage here, extracting a pattern also designates fitting a model to data; find-ing structure from data; or, in general, mak-ing any high-level description of a set of data. The term process implies that KDD comprises many steps, which involve data preparation, search for patterns, knowledge evaluation, and refinement, all repeated in multiple itera-tions. By nontrivial, we mean that some search or inference is involved; that is, it is not a straightforward computation of predefined quantities like computing the av-erage value of a set of numbers.The discovered patterns should be valid on new data with some degree of certainty. We also want patterns to be novel (at least to the system and preferably to the user) and poten-tially useful, that is, lead to some benefit to the user or task. Finally, the patterns should be understandable, if not immediately then after some postprocessing.The previous discussion implies that we can define quantitative measures for evaluating extracted patterns. In many cases, it is possi-ble to define measures of certainty (for exam-ple, estimated prediction accuracy on new data) or utility (for example, gain, perhaps indollars saved because of better predictions orspeedup in response time of a system). No-tions such as novelty and understandabilityare much more subjective. In certain contexts,understandability can be estimated by sim-plicity (for example, the number of bits to de-scribe a pattern). 
An important notion, calledinterestingness(for example, see Silberschatzand Tuzhilin [1995] and Piatetsky-Shapiro andMatheus [1994]), is usually taken as an overallmeasure of pattern value, combining validity,novelty, usefulness, and simplicity. Interest-ingness functions can be defined explicitly orcan be manifested implicitly through an or-dering placed by the KDD system on the dis-covered patterns or models.Given these notions, we can consider apattern to be knowledge if it exceeds some in-terestingness threshold, which is by nomeans an attempt to define knowledge in thephilosophical or even the popular view. As amatter of fact, knowledge in this definition ispurely user oriented and domain specific andis determined by whatever functions andthresholds the user chooses.Data mining is a step in the KDD processthat consists of applying data analysis anddiscovery algorithms that, under acceptablecomputational efficiency limitations, pro-duce a particular enumeration of patterns (ormodels) over the data. Note that the space ofArticlesFALL 1996 41Figure 1. An Overview of the Steps That Compose the KDD Process.methods, the effective number of variables under consideration can be reduced, or in-variant representations for the data can be found.Fifth is matching the goals of the KDD pro-cess (step 1) to a particular data-mining method. For example, summarization, clas-sification, regression, clustering, and so on,are described later as well as in Fayyad, Piatet-sky-Shapiro, and Smyth (1996).Sixth is exploratory analysis and model and hypothesis selection: choosing the data-mining algorithm(s) and selecting method(s)to be used for searching for data patterns.This process includes deciding which models and parameters might be appropriate (for ex-ample, models of categorical data are differ-ent than models of vectors over the reals) and matching a particular data-mining method with the overall criteria of the KDD process (for example, the end user might be more in-terested in understanding the model than its predictive capabilities).Seventh is data mining: searching for pat-terns of interest in a particular representa-tional form or a set of such representations,including classification rules or trees, regres-sion, and clustering. The user can significant-ly aid the data-mining method by correctly performing the preceding steps.Eighth is interpreting mined patterns, pos-sibly returning to any of steps 1 through 7 for further iteration. This step can also involve visualization of the extracted patterns and models or visualization of the data given the extracted models.Ninth is acting on the discovered knowl-edge: using the knowledge directly, incorpo-rating the knowledge into another system for further action, or simply documenting it and reporting it to interested parties. This process also includes checking for and resolving po-tential conflicts with previously believed (or extracted) knowledge.The KDD process can involve significant iteration and can contain loops between any two steps. The basic flow of steps (al-though not the potential multitude of itera-tions and loops) is illustrated in figure 1.Most previous work on KDD has focused on step 7, the data mining. However, the other steps are as important (and probably more so) for the successful application of KDD in practice. 
Having defined the basic notions and introduced the KDD process, we now focus on the data-mining component,which has, by far, received the most atten-tion in the literature.patterns is often infinite, and the enumera-tion of patterns involves some form of search in this space. Practical computational constraints place severe limits on the sub-space that can be explored by a data-mining algorithm.The KDD process involves using the database along with any required selection,preprocessing, subsampling, and transforma-tions of it; applying data-mining methods (algorithms) to enumerate patterns from it;and evaluating the products of data mining to identify the subset of the enumerated pat-terns deemed knowledge. The data-mining component of the KDD process is concerned with the algorithmic means by which pat-terns are extracted and enumerated from da-ta. The overall KDD process (figure 1) in-cludes the evaluation and possible interpretation of the mined patterns to de-termine which patterns can be considered new knowledge. The KDD process also in-cludes all the additional steps described in the next section.The notion of an overall user-driven pro-cess is not unique to KDD: analogous propos-als have been put forward both in statistics (Hand 1994) and in machine learning (Brod-ley and Smyth 1996).The KDD ProcessThe KDD process is interactive and iterative,involving numerous steps with many deci-sions made by the user. Brachman and Anand (1996) give a practical view of the KDD pro-cess, emphasizing the interactive nature of the process. Here, we broadly outline some of its basic steps:First is developing an understanding of the application domain and the relevant prior knowledge and identifying the goal of the KDD process from the customer’s viewpoint.Second is creating a target data set: select-ing a data set, or focusing on a subset of vari-ables or data samples, on which discovery is to be performed.Third is data cleaning and preprocessing.Basic operations include removing noise if appropriate, collecting the necessary informa-tion to model or account for noise, deciding on strategies for handling missing data fields,and accounting for time-sequence informa-tion and known changes.Fourth is data reduction and projection:finding useful features to represent the data depending on the goal of the task. With di-mensionality reduction or transformationArticles42AI MAGAZINEThe Data-Mining Stepof the KDD ProcessThe data-mining component of the KDD pro-cess often involves repeated iterative applica-tion of particular data-mining methods. This section presents an overview of the primary goals of data mining, a description of the methods used to address these goals, and a brief description of the data-mining algo-rithms that incorporate these methods.The knowledge discovery goals are defined by the intended use of the system. We can distinguish two types of goals: (1) verification and (2) discovery. With verification,the sys-tem is limited to verifying the user’s hypothe-sis. With discovery,the system autonomously finds new patterns. We further subdivide the discovery goal into prediction,where the sys-tem finds patterns for predicting the future behavior of some entities, and description, where the system finds patterns for presenta-tion to a user in a human-understandableform. In this article, we are primarily con-cerned with discovery-oriented data mining.Data mining involves fitting models to, or determining patterns from, observed data. 
The fitted models play the role of inferred knowledge: Whether the models reflect useful or interesting knowledge is part of the over-all, interactive KDD process where subjective human judgment is typically required. Two primary mathematical formalisms are used in model fitting: (1) statistical and (2) logical. The statistical approach allows for nondeter-ministic effects in the model, whereas a logi-cal model is purely deterministic. We focus primarily on the statistical approach to data mining, which tends to be the most widely used basis for practical data-mining applica-tions given the typical presence of uncertain-ty in real-world data-generating processes.Most data-mining methods are based on tried and tested techniques from machine learning, pattern recognition, and statistics: classification, clustering, regression, and so on. The array of different algorithms under each of these headings can often be bewilder-ing to both the novice and the experienced data analyst. It should be emphasized that of the many data-mining methods advertised in the literature, there are really only a few fun-damental techniques. The actual underlying model representation being used by a particu-lar method typically comes from a composi-tion of a small number of well-known op-tions: polynomials, splines, kernel and basis functions, threshold-Boolean functions, and so on. Thus, algorithms tend to differ primar-ily in the goodness-of-fit criterion used toevaluate model fit or in the search methodused to find a good fit.In our brief overview of data-mining meth-ods, we try in particular to convey the notionthat most (if not all) methods can be viewedas extensions or hybrids of a few basic tech-niques and principles. We first discuss the pri-mary methods of data mining and then showthat the data- mining methods can be viewedas consisting of three primary algorithmiccomponents: (1) model representation, (2)model evaluation, and (3) search. In the dis-cussion of KDD and data-mining methods,we use a simple example to make some of thenotions more concrete. Figure 2 shows a sim-ple two-dimensional artificial data set consist-ing of 23 cases. Each point on the graph rep-resents a person who has been given a loanby a particular bank at some time in the past.The horizontal axis represents the income ofthe person; the vertical axis represents the to-tal personal debt of the person (mortgage, carpayments, and so on). The data have beenclassified into two classes: (1) the x’s repre-sent persons who have defaulted on theirloans and (2) the o’s represent persons whoseloans are in good status with the bank. Thus,this simple artificial data set could represent ahistorical data set that can contain usefulknowledge from the point of view of thebank making the loans. Note that in actualKDD applications, there are typically manymore dimensions (as many as several hun-dreds) and many more data points (manythousands or even millions).ArticlesFALL 1996 43Figure 2. A Simple Data Set with Two Classes Used for Illustrative Purposes.。
Discriminatively Trained Sparse Code Gradients for Contour Detection
Discriminatively Trained Sparse Code Gradientsfor Contour DetectionXiaofeng Ren and Liefeng BoIntel Science and Technology Center for Pervasive Computing,Intel LabsSeattle,W A98195,USA{xiaofeng.ren,liefeng.bo}@AbstractFinding contours in natural images is a fundamental problem that serves as thebasis of many tasks such as image segmentation and object recognition.At thecore of contour detection technologies are a set of hand-designed gradient fea-tures,used by most approaches including the state-of-the-art Global Pb(gPb)operator.In this work,we show that contour detection accuracy can be signif-icantly improved by computing Sparse Code Gradients(SCG),which measurecontrast using patch representations automatically learned through sparse coding.We use K-SVD for dictionary learning and Orthogonal Matching Pursuit for com-puting sparse codes on oriented local neighborhoods,and apply multi-scale pool-ing and power transforms before classifying them with linear SVMs.By extract-ing rich representations from pixels and avoiding collapsing them prematurely,Sparse Code Gradients effectively learn how to measure local contrasts andfindcontours.We improve the F-measure metric on the BSDS500benchmark to0.74(up from0.71of gPb contours).Moreover,our learning approach can easily adaptto novel sensor data such as Kinect-style RGB-D cameras:Sparse Code Gradi-ents on depth maps and surface normals lead to promising contour detection usingdepth and depth+color,as verified on the NYU Depth Dataset.1IntroductionContour detection is a fundamental problem in vision.Accuratelyfinding both object boundaries and interior contours has far reaching implications for many vision tasks including segmentation,recog-nition and scene understanding.High-quality image segmentation has increasingly been relying on contour analysis,such as in the widely used system of Global Pb[2].Contours and segmentations have also seen extensive uses in shape matching and object recognition[8,9].Accuratelyfinding contours in natural images is a challenging problem and has been extensively studied.With the availability of datasets with human-marked groundtruth contours,a variety of approaches have been proposed and evaluated(see a summary in[2]),such as learning to clas-sify[17,20,16],contour grouping[23,31,12],multi-scale features[21,2],and hierarchical region analysis[2].Most of these approaches have one thing in common[17,23,31,21,12,2]:they are built on top of a set of gradient features[17]measuring local contrast of oriented discs,using chi-square distances of histograms of color and textons.Despite various efforts to use generic image features[5]or learn them[16],these hand-designed gradients are still widely used after a decade and support top-ranking algorithms on the Berkeley benchmarks[2].In this work,we demonstrate that contour detection can be vastly improved by replacing the hand-designed Pb gradients of[17]with rich representations that are automatically learned from data. 
We use sparse coding,in particularly Orthogonal Matching Pursuit[18]and K-SVD[1],to learn such representations on patches.Instead of a direct classification of patches[16],the sparse codes on the pixels are pooled over multi-scale half-discs for each orientation,in the spirit of the Pbimage patch: gray, abdepth patch (optional):depth, surface normal…local sparse coding multi-scale pooling oriented gradients power transformslinear SVM+ - …per-pixelsparse codes SVMSVMSVM … SVM RGB-(D) contoursFigure 1:We combine sparse coding and oriented gradients for contour analysis on color as well as depth images.Sparse coding automatically learns a rich representation of patches from data.With multi-scale pooling,oriented gradients efficiently capture local contrast and lead to much more accurate contour detection than those using hand-designed features including Global Pb (gPb)[2].gradients,before being classified with a linear SVM.The SVM outputs are then smoothed and non-max suppressed over orientations,as commonly done,to produce the final contours (see Fig.1).Our sparse code gradients (SCG)are much more effective in capturing local contour contrast than existing features.By only changing local features and keeping the smoothing and globalization parts fixed,we improve the F-measure on the BSDS500benchmark to 0.74(up from 0.71of gPb),a sub-stantial step toward human-level accuracy (see the precision-recall curves in Fig.4).Large improve-ments in accuracy are also observed on other datasets including MSRC2and PASCAL2008.More-over,our approach is built on unsupervised feature learning and can directly apply to novel sensor data such as RGB-D images from Kinect-style depth ing the NYU Depth dataset [27],we verify that our SCG approach combines the strengths of color and depth contour detection and outperforms an adaptation of gPb to RGB-D by a large margin.2Related WorkContour detection has a long history in computer vision as a fundamental building block.Modern approaches to contour detection are evaluated on datasets of natural images against human-marked groundtruth.The Pb work of Martin et.al.[17]combined a set of gradient features,using bright-ness,color and textons,to outperform the Canny edge detector on the Berkeley Benchmark (BSDS).Multi-scale versions of Pb were developed and found beneficial [21,2].Building on top of the Pb gradients,many approaches studied the globalization aspects,i.e.moving beyond local classifica-tion and enforcing consistency and continuity of contours.Ren et.al.developed CRF models on superpixels to learn junction types [23].Zhu ed circular embedding to enforce orderings of edgels [31].The gPb work of Arbelaez puted gradients on eigenvectors of the affinity graph and combined them with local cues [2].In addition to Pb gradients,Dollar et.al.[5]learned boosted trees on generic features such as gradients and Haar wavelets,Kokkinos used SIFT features on edgels [12],and Prasad et.al.[20]used raw pixels in class-specific settings.One closely related work was the discriminative sparse models of Mairal et al [16],which used K-SVD to represent multi-scale patches and had moderate success on the BSDS.A major difference of our work is the use of oriented gradients:comparing to directly classifying a patch,measuring contrast between oriented half-discs is a much easier problem and can be effectively learned.Sparse coding represents a signal by reconstructing it using a small set of basis functions.It has seen wide uses in vision,for example for faces [28]and recognition 
[29].Similar to deep network approaches [11,14],recent works tried to avoid feature engineering and employed sparse coding of image patches to learn features from “scratch”,for texture analysis [15]and object recognition [30,3].In particular,Orthogonal Matching Pursuit [18]is a greedy algorithm that incrementally finds sparse codes,and K-SVD is also efficient and popular for dictionary learning.Closely related to our work but on the different problem of recognition,Bo ed matching pursuit and K-SVD to learn features in a coding hierarchy [3]and are extending their approach to RGB-D data [4].Thanks to the mass production of Kinect,active RGB-D cameras became affordable and were quickly adopted in vision research and applications.The Kinect pose estimation of Shotton et. ed random forests to learn from a huge amount of data[25].Henry ed RGB-D cam-eras to scan large environments into3D models[10].RGB-D data were also studied in the context of object recognition[13]and scene labeling[27,22].In-depth studies of contour and segmentation problems for depth data are much in need given the fast growing interests in RGB-D perception.3Contour Detection using Sparse Code GradientsWe start by examining the processing pipeline of Global Pb(gPb)[2],a highly influential and widely used system for contour detection.The gPb contour detection has two stages:local contrast estimation at multiple scales,and globalization of the local cues using spectral grouping.The core of the approach lies within its use of local cues in oriented gradients.Originally developed in [17],this set of features use relatively simple pixel representations(histograms of brightness,color and textons)and similarity functions(chi-square distance,manually chosen),comparing to recent advances in using rich representations for high-level recognition(e.g.[11,29,30,3]).We set out to show that both the pixel representation and the aggregation of pixel information in local neighborhoods can be much improved and,to a large extent,learned from and adapted to input data. 
For pixel representation, in Section 3.1 we show how to use Orthogonal Matching Pursuit [18] and K-SVD [1], efficient sparse coding and dictionary learning algorithms that readily apply to low-level vision, to extract sparse codes at every pixel. This sparse coding approach can be viewed as similar in spirit to the use of filterbanks, but it avoids manual choices and thus directly applies to the RGB-D data from Kinect. We show learned dictionaries for a number of channels that exhibit different characteristics: grayscale/luminance, chromaticity (ab), depth, and surface normal. In Section 3.2 we show how the pixel-level sparse codes can be integrated through multi-scale pooling into a rich representation of oriented local neighborhoods. By computing oriented gradients on this high dimensional representation and using a double power transform to code the features for linear classification, we show that a linear SVM can be efficiently and effectively trained for each orientation to classify contour vs non-contour, yielding local contrast estimates that are much more accurate than the hand-designed features in gPb.
3.1 Local Sparse Representation of RGB-(D) Patches
K-SVD and Orthogonal Matching Pursuit. K-SVD [1] is a popular dictionary learning algorithm that generalizes K-Means and learns dictionaries of codewords from unsupervised data. Given a set of image patches Y = [y_1, ..., y_n], K-SVD jointly finds a dictionary D = [d_1, ..., d_m] and an associated sparse code matrix X = [x_1, ..., x_n] by minimizing the reconstruction error

$\min_{D,X} \|Y - DX\|_F^2 \quad \text{s.t.} \quad \forall i,\ \|x_i\|_0 \le K;\ \forall j,\ \|d_j\|_2 = 1$    (1)

where $\|\cdot\|_F$ denotes the Frobenius norm, x_i are the columns of X, the zero-norm $\|\cdot\|_0$ counts the non-zero entries in the sparse code x_i, and K is a predefined sparsity level (number of non-zero entries). This optimization can be solved in an alternating manner. Given the dictionary D, optimizing the sparse code matrix X can be decoupled into sub-problems, each solved with Orthogonal Matching Pursuit (OMP) [18], a greedy algorithm for finding sparse codes. Given the codes X, the dictionary D and its associated sparse coefficients are updated sequentially by singular value decomposition. For our purpose of representing local patches, the dictionary D has a small size (we use 75 for 5x5 patches) and does not require a lot of sample patches, and it can be learned in a matter of minutes.
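The dictionary learning and sparse coding steps described above can be approximated with off-the-shelf tools. The sketch below uses scikit-learn's MiniBatchDictionaryLearning as a stand-in for K-SVD (it optimizes a closely related sparse coding objective rather than performing the exact K-SVD updates) with OMP as the transform, on mean-subtracted 5x5 grayscale patches. The 75-atom dictionary follows the text; the random image, the number of sampled patches, and the sparsity level are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.RandomState(0)
image = rng.rand(128, 128)                 # placeholder for a grayscale BSDS image in [0, 1]

# Sample 5x5 patches and subtract the per-patch mean, as done per data channel.
patches = extract_patches_2d(image, (5, 5), max_patches=20000, random_state=0)
Y = patches.reshape(len(patches), -1)
Y -= Y.mean(axis=1, keepdims=True)

# Learn a 75-atom dictionary; OMP with a small sparsity level K encodes each patch.
dico = MiniBatchDictionaryLearning(n_components=75,
                                   transform_algorithm='omp',
                                   transform_n_nonzero_coefs=2,   # sparsity level K (assumed)
                                   random_state=0)
dico.fit(Y)

# Sparse codes for every sampled patch; at test time this is done at every pixel
# (the paper uses a convolutional batch OMP for speed, which is not replicated here).
X = dico.transform(Y)
print(X.shape, (X != 0).sum(axis=1).mean())   # (20000, 75), at most K non-zeros per patch
```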
Once the dictionary D is learned, we again use the Orthogonal Matching Pursuit (OMP) algorithm to compute sparse codes at every pixel. This can be efficiently done with convolution and a batch version of the OMP algorithm [24]. For a typical BSDS image of resolution 321x481, the sparse code extraction is efficient and takes 1~2 seconds.
Sparse Representation of RGB-D Data. One advantage of unsupervised dictionary learning is that it readily applies to novel sensor data, such as the color and depth frames from a Kinect-style RGB-D camera. We learn K-SVD dictionaries for up to four channels of color and depth: grayscale for luminance, chromaticity ab for color in the Lab space, depth (distance to camera), and surface normal (3-dim). The learned dictionaries are visualized in Fig. 2.
Figure 2: K-SVD dictionaries learned for four different channels: grayscale and chromaticity (in ab) for an RGB image (a, b), and depth and surface normal for a depth image (c, d). We use a fixed dictionary size of 75 on 5x5 patches. The ab channel is visualized using a constant luminance of 50. The 3-dimensional surface normal (xyz) is visualized in RGB (i.e. blue for frontal-parallel surfaces).
These dictionaries are interesting to look at and qualitatively distinctive: for example, the surface normal codewords tend to be more smooth due to flat surfaces, the depth codewords are also more smooth but with speckles, and the chromaticity codewords respect the opponent color pairs. The channels are coded separately.
3.2 Coding Multi-Scale Neighborhoods for Measuring Contrast
Multi-Scale Pooling over Oriented Half-Discs. Over decades of research on contour detection and related topics, a number of fundamental observations have been made, repeatedly: (1) contrast is the key to differentiating contour vs non-contour; (2) orientation is important for respecting contour continuity; and (3) multi-scale is useful. We do not wish to throw out these principles. Instead, we seek to adopt these principles for our case of high dimensional representations with sparse codes.
Each pixel is represented with sparse codes extracted from a small patch (5-by-5) around it. To aggregate pixel information, we use oriented half-discs as used in gPb (see the illustration in Fig. 1). Each orientation is processed separately. For each orientation, at each pixel p and scale s, we define two half-discs (rectangles) N^a and N^b of size s-by-(2s+1), on both sides of p, rotated to that orientation. For each half-disc N, we use average pooling on non-zero entries (i.e. a hybrid of average and max pooling) to generate its representation

$F(N) = \left[\ \frac{\sum_{i\in N}|x_{i1}|}{\sum_{i\in N}\mathbb{I}(|x_{i1}|>0)},\ \cdots,\ \frac{\sum_{i\in N}|x_{im}|}{\sum_{i\in N}\mathbb{I}(|x_{im}|>0)}\ \right]$    (2)

where x_{ij} is the j-th entry of the sparse code x_i, and $\mathbb{I}$ is the indicator function of whether x_{ij} is non-zero. We rotate the image (after sparse coding) and use integral images for fast computations (on both |x_{ij}| and $\mathbb{I}(|x_{ij}|>0)$), whose costs are independent of the size of N.
For two oriented half-discs N^a_s and N^b_s at a scale s, we compute a difference (gradient) vector D

$D(N_s^a, N_s^b) = \left|\,F(N_s^a) - F(N_s^b)\,\right|$    (3)

where |·| is an element-wise absolute value operation. We divide D(N^a_s, N^b_s) by their norms $\|F(N_s^a)\| + \|F(N_s^b)\| + \varepsilon$, where ε is a positive number. Since the magnitude of sparse codes varies over a wide range due to local variations in illumination as well as occlusion, this step makes the appearance features robust to such variations and increases their discriminative power, as commonly done in both contour detection and object recognition. The value of ε is
not hard to set, and we find that a value of ε = 0.5 is better than, for instance, ε = 0. At this stage, one could train a classifier on D for each scale to convert it to a scalar value of contrast, which would resemble the chi-square distance function in gPb. Instead, we find that it is much better to avoid doing so separately at each scale, and to combine multi-scale features in a joint representation, so as to allow interactions both between codewords and between scales. That is, our final representation of the contrast at a pixel p is the concatenation of sparse codes pooled at all the scales s ∈ {1, ..., S} (we use S = 4):

$D_p = \left[\ D(N_1^a,N_1^b),\cdots,D(N_S^a,N_S^b);\ F(N_1^a\cup N_1^b),\cdots,F(N_S^a\cup N_S^b)\ \right]$    (4)

In addition to the difference D, we also include a union term F(N^a_s ∪ N^b_s), which captures the appearance of the whole disc (union of the two half-discs) and is normalized by $\|F(N_s^a)\| + \|F(N_s^b)\| + \varepsilon$.
Double Power Transform and Linear Classifiers. The concatenated feature D_p (non-negative) provides multi-scale contrast information for classifying whether p is a contour location for a particular orientation. As D_p is high dimensional (1200 and above in our experiments) and we need to compute it at every pixel and every orientation, we prefer using linear SVMs for both efficient testing as well as training. Directly learning a linear function on D_p, however, does not work very well. Instead, we apply a double power transformation to make the features more suitable for linear SVMs:

$D_p = \left[\ D_p^{\alpha_1},\ D_p^{\alpha_2}\ \right]$    (5)

where 0 < α_1 < α_2 < 1. Empirically, we find that the double power transform works much better than either no transform or a single power transform α, as sometimes done in other classification contexts. Perronnin et al. [19] provided an intuition why a power transform helps classification: it "re-normalizes" the distribution of the features into a more Gaussian form. One plausible intuition for a double power transform is that the optimal exponent α may be different across feature dimensions. By putting two power transforms of D_p together, we allow the classifier to pick their linear combination, different for each dimension, during the stage of supervised training.
From Local Contrast to Global Contours. We intentionally only change the local contrast estimation in gPb and keep the other steps fixed. These steps include: (1) the Savitzky-Golay filter to smooth responses and find peak locations; (2) non-max suppression over orientations; and (3) optionally, we apply the globalization step in gPb that computes a spectral gradient from the local gradients and then linearly combines the spectral gradient with the local ones. A sigmoid transform step is needed to convert the SVM outputs on D_p before computing spectral gradients.
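To make the feature construction concrete, the following sketch implements Eqs. (2)-(5) for a single pixel, one (horizontal) orientation, and all scales on a dense grid of per-pixel sparse codes. It uses plain axis-aligned rectangles for the half-discs and explicit loops instead of image rotation and integral images, so it illustrates the definitions rather than the paper's efficient implementation; ε, the scales, and the exponents follow the text, while the input codes are random placeholders.

```python
import numpy as np

EPS = 0.5                      # epsilon in the normalization (value used in the paper)
ALPHAS = (0.25, 0.75)          # double power transform exponents reported in Section 4
SCALES = (2, 4, 7, 25)         # half-disc sizes used for BSDS500

def pool_F(code_vectors):
    """Eq. (2): per-dimension average of |x_ij| over its non-zero entries in a region."""
    a = np.abs(code_vectors)
    return a.sum(axis=0) / np.maximum((a > 0).sum(axis=0), 1)

def half_disc(codes, r, c, s, side):
    """s-by-(2s+1) rectangle above (side=-1) or below (side=+1) pixel (r, c), for the
    horizontal orientation; other orientations would rotate the coded image first."""
    r0, r1 = (r - s, r) if side < 0 else (r + 1, r + s + 1)
    return codes[r0:r1, c - s:c + s + 1].reshape(-1, codes.shape[-1])

def contrast_feature(codes, r, c):
    """Eqs. (3)-(4): normalized half-disc differences plus union terms over all scales."""
    feats = []
    for s in SCALES:
        Na, Nb = half_disc(codes, r, c, s, -1), half_disc(codes, r, c, s, +1)
        Fa, Fb = pool_F(Na), pool_F(Nb)
        norm = np.linalg.norm(Fa) + np.linalg.norm(Fb) + EPS
        feats += [np.abs(Fa - Fb) / norm, pool_F(np.vstack([Na, Nb])) / norm]
    return np.concatenate(feats)

def double_power(Dp):
    """Eq. (5): concatenate two element-wise power transforms of the non-negative feature."""
    return np.concatenate([Dp ** a for a in ALPHAS])

rng = np.random.RandomState(0)
codes = rng.rand(120, 120, 75) * (rng.rand(120, 120, 75) < 0.03)   # toy per-pixel sparse codes
Dp = contrast_feature(codes, 60, 60)
print(Dp.shape, double_power(Dp).shape)   # (600,) and (1200,): matches the stated feature length
```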
4 Experiments
We use the evaluation framework of, and extensively compare to, the publicly available Global Pb (gPb) system [2], widely used as the state of the art for contour detection (in this work we focus on contour detection and do not address how to derive segmentations from contours). All the results reported on gPb are from running the gPb contour detection and evaluation codes (with default parameters), and accuracies are verified against the published results in [2]. The gPb evaluation includes a number of criteria, including precision-recall (P/R) curves from contour matching (Fig. 4), F-measures computed from P/R (Tables 1, 2, 3) with a fixed contour threshold (ODS) or per-image thresholds (OIS), as well as average precisions (AP) from the P/R curves.
Benchmark Datasets. The main dataset we use is the BSDS500 benchmark [2], an extension of the original BSDS300 benchmark and commonly used for contour evaluation. It includes 500 natural images of roughly resolution 321x481, including 200 for training, 100 for validation, and 200 for testing. We conduct both color and grayscale experiments (where we convert the BSDS500 images to grayscale and retain the groundtruth). In addition, we also use the MSRC2 and PASCAL2008 segmentation datasets [26, 6], as done in the gPb work [2]. The MSRC2 dataset has 591 images of resolution 200x300; we randomly choose half for training and half for testing. The PASCAL2008 dataset includes 1023 images in its training and validation sets, roughly of resolution 350x500. We randomly choose half for training and half for testing. For RGB-D contour detection, we use the NYU Depth dataset (v2) [27], which includes 1449 pairs of color and depth frames of resolution 480x640, with groundtruth semantic regions. We choose 60% of the images for training and 40% for testing, as in its scene labeling setup. The Kinect images are of lower quality than BSDS, and we resize the frames to 240x320 in our experiments.
Training Sparse Code Gradients. Given sparse codes from K-SVD and Orthogonal Matching Pursuit, we train the Sparse Code Gradients classifiers, one linear SVM per orientation, from sampled locations. For positive data, we sample groundtruth contour locations and estimate the orientations at these locations using groundtruth. For negative data, locations and orientations are random. We subtract the mean from the patches in each data channel. For BSDS500, we typically have 1.5 to 2 million data points. We use 4 spatial scales, at half-disc sizes 2, 4, 7, 25. For a dictionary size of 75 and 4 scales, the feature length for one data channel is 1200. For full RGB-D data, the dimension is 4800. For BSDS500, we train only using the 200 training images. We modify liblinear [7] to take dense matrices (features are dense after pooling) and single-precision floats.
Figure 3: Analysis of our sparse code gradients, using average precision of classification on sampled boundaries. (a) The effect of single-scale vs multi-scale pooling (accumulated from the smallest), as a function of pooling disc size (pixels). (b) Accuracy increasing with dictionary size, for four orientation channels. (c) The effect of the sparsity level K, which exhibits different behavior for grayscale and chromaticity.
Table 1: F-measure evaluation on the BSDS500 benchmark [2], comparing to gPb on grayscale and color images, both for local contour detection as well as for global detection (i.e. combined with the spectral gradient analysis in [2]).
        BSDS500        ODS   OIS   AP
local   gPb (gray)     .67   .69   .68
local   SCG (gray)     .69   .71   .71
local   gPb (color)    .70   .72   .71
local   SCG (color)    .72   .74   .75
global  gPb (gray)     .69   .71   .67
global  SCG (gray)     .71   .73   .74
global  gPb (color)    .71   .74   .72
global  SCG (color)    .74   .76   .77
Figure 4: Precision-recall curves of SCG vs gPb on BSDS500, for grayscale and color images. We make a substantial step beyond the current state of the art toward reaching human-level accuracy (green dot).
Looking under the Hood. We empirically analyze a number of settings in our Sparse Code Gradients. In particular, we want to understand how the choices in the local sparse coding affect contour classification. Fig. 3 shows the effects of multi-scale pooling, dictionary size, and sparsity level (K). The numbers reported are intermediate results, namely the mean average precision of the four oriented gradient classifiers (0, 45, 90, 135 degrees) on sampled locations (grayscale unless otherwise noted, on
validation). As a reference, the average precision of gPb on this task is 0.878.
For multi-scale pooling, the single best scale for the half-disc filter is about 4x8, consistent with the settings in gPb. For accumulated scales (using all the scales from the smallest up to the current level), the accuracy continues to increase and does not seem to be saturated, suggesting the use of larger scales. The dictionary size has a minor impact, and there is a small (yet observable) benefit to using dictionaries larger than 75, particularly for diagonal orientations (45- and 135-deg). The sparsity level K is a more intriguing issue. In Fig. 3(c), we see that for grayscale only, K=1 (normalized nearest neighbor) does quite well; on the other hand, color needs a larger K, possibly because ab is a nonlinear space. When combining grayscale and color, it seems that we want K to be at least 3. It also varies with orientation: horizontal and vertical edges require a smaller K than diagonal edges. (If using K=1, our final F-measure on BSDS500 is 0.730.)
We also empirically evaluate the double power transform vs a single power transform vs no transform. With no transform, the average precision is 0.865. With a single power transform, the best choice of the exponent is around 0.4, with average precision 0.884. A double power transform (with exponents 0.25 and 0.75, which can be computed through sqrt) improves the average precision to 0.900, which translates to a large improvement in contour detection accuracy.
Table 2: F-measure evaluation comparing our SCG approach to gPb on two additional image datasets with contour groundtruth: MSRC2 [26] and PASCAL2008 [6].
MSRC2         ODS   OIS   AP
gPb           .37   .39   .22
SCG           .43   .43   .33
PASCAL2008    ODS   OIS   AP
gPb           .34   .38   .20
SCG           .37   .41   .27
Table 3: F-measure evaluation on RGB-D contour detection using the NYU dataset (v2) [27]. We compare to gPb using the color image only, depth only, as well as color+depth.
RGB-D (NYU v2)   ODS   OIS   AP
gPb (color)      .51   .52   .37
SCG (color)      .55   .57   .46
gPb (depth)      .44   .46   .28
SCG (depth)      .53   .54   .45
gPb (RGB-D)      .53   .54   .40
SCG (RGB-D)      .62   .63   .54
Figure 5: Examples from the BSDS500 dataset [2]. (Top) Image; (Middle) gPb output; (Bottom) SCG output (this work). Our SCG operator learns to preserve fine details (e.g. windmills, faces, fish fins) while at the same time achieving higher precision on large-scale contours (e.g. back of zebras). (Contours are shown in double width for the sake of visualization.)
Image Benchmarking Results. In Table 1 and Fig. 4 we show the precision-recall of our Sparse Code Gradients vs gPb on the BSDS500 benchmark. We conduct four sets of experiments, using color or grayscale images, with or without the globalization component (for which we use exactly the same setup as in gPb). Using Sparse Code Gradients leads to a significant improvement in accuracy in all four cases. The local version of our SCG operator, i.e. only using local contrast, is already better (F=0.72) than gPb with globalization (F=0.71). The full version, local SCG plus spectral gradient (computed from local SCG), reaches an F-measure of 0.739, a large step forward from gPb, as seen in the precision-recall curves in Fig. 4. On BSDS300, our F-measure is 0.715.
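The training setup described in "Training Sparse Code Gradients" above (one linear SVM per orientation on sampled positive and negative locations) can be sketched as follows. It uses scikit-learn's LinearSVC as a stand-in for the paper's modified liblinear, and random arrays in place of the real pooled features; the sample counts and the slight shift applied to positives are illustrative choices so the toy run finishes quickly.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import average_precision_score

rng = np.random.RandomState(0)
ORIENTATIONS = (0, 45, 90, 135)
n_pos, n_neg, dim = 5000, 5000, 1200     # ~1200-dim double-power features per channel

classifiers, scores = {}, {}
for theta in ORIENTATIONS:
    # Placeholder features: in the real pipeline these are double-power-transformed
    # multi-scale contrast features D_p sampled at groundtruth contour locations
    # (positives, oriented by groundtruth) and at random locations (negatives).
    X_pos = np.abs(rng.randn(n_pos, dim)) + 0.3
    X_neg = np.abs(rng.randn(n_neg, dim))
    X = np.vstack([X_pos, X_neg]).astype(np.float32)   # dense, single-precision
    y = np.r_[np.ones(n_pos), np.zeros(n_neg)]

    clf = LinearSVC(C=1.0, max_iter=5000)               # liblinear-style linear SVM
    clf.fit(X, y)
    classifiers[theta] = clf
    scores[theta] = average_precision_score(y, clf.decision_function(X))

# Mean average precision over the four orientation classifiers, the intermediate
# metric reported in the "Looking under the Hood" analysis (here on toy data).
print({t: round(s, 3) for t, s in scores.items()})
print("mean AP:", round(np.mean(list(scores.values())), 3))
```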
We observe that SCG seems to pick up fine-scale details much better than gPb, hence the much higher recall rate, while maintaining higher precision over the entire range. This can be seen in the examples shown in Fig. 5. While our scale range is similar to that of gPb, the multi-scale pooling scheme allows the flexibility of learning the balance of scales separately for each codeword, which may help in detecting the details. The supplemental material contains more comparison examples.
In Table 2 we show the benchmarking results for two additional datasets, MSRC2 and PASCAL2008. Again we observe large improvements in accuracy, in spite of the somewhat different natures of the scenes in these datasets. The improvement on MSRC2 is much larger, partly because the images are smaller, hence the contours are smaller in scale and may be over-smoothed in gPb.
As for computational cost, using integral images, local SCG takes ~100 seconds to compute on a single-thread Intel Core i5-2500 CPU on a BSDS image. It is slower than but comparable to the highly optimized multi-thread C++ implementation of gPb (~60 seconds).
Figure 6: Examples of RGB-D contour detection on the NYU dataset (v2) [27]. The five panels are: input image, input depth, image-only contours, depth-only contours, and color+depth contours. Color is good at picking up details such as photos on the wall, and depth is useful where color is uniform (e.g. corner of a room, row 1) or illumination is poor (e.g. chair, row 2).
RGB-D Contour Detection. We use the second version of the NYU Depth Dataset [27], which has higher quality groundtruth than the first version. A median filtering is applied to remove double contours (boundaries from two adjacent regions) within 3 pixels. For the RGB-D baseline, we use a simple adaptation of gPb: the depth values are in meters and used directly as a grayscale image in the gPb gradient computation. We use a linear combination to put (soft) color and depth gradients together in gPb before non-max suppression, with the weight set from validation.
Table 3 lists the precision-recall evaluations of SCG vs gPb for RGB-D contour detection. All the SCG settings (such as scales and dictionary sizes) are kept the same as for BSDS. SCG again outperforms gPb in all the cases. In particular, we are much better for depth-only contours, for which gPb is not designed. Our approach learns the low-level representations of depth data fully automatically and does not require any manual tweaking. We also achieve a much larger boost by combining color and depth, demonstrating that the color and depth channels contain complementary information and are both critical for RGB-D contour detection. Qualitatively, it is easy to see that RGB-D combines the strengths of color and depth and is a promising direction for contour and segmentation tasks and indoor scene analysis in general [22]. Fig. 6 shows a few examples of RGB-D contours from our SCG operator. There are plenty of cases where color alone or depth alone would fail to extract contours for meaningful parts of the scenes, and color+depth would succeed.
5 Discussions
In this work we successfully showed how to learn and code local representations to extract contours in natural images. Our approach combined the proven concept of oriented gradients with powerful representations that are automatically learned through sparse coding. Sparse Code Gradients (SCG) performed significantly better than hand-designed features that were in use for a decade, and pushed contour detection much closer to human-level accuracy as illustrated on the BSDS500 benchmark.
Comparing to hand-designed features (e.g. Global Pb [2]), we maintain the high dimensional representation from pooling oriented neighborhoods and do not collapse them prematurely (such as by computing chi-square distances at each scale). This passes a richer set of information into learning contour classification, where a double power transform effectively codes the features for linear classification. Comparing to previous learning approaches (e.g. discriminative dictionaries in [16]), our uses of multi-scale pooling and oriented gradients lead to much higher classification accuracies.
Our work opens up future possibilities for learning contour detection and segmentation. As we illustrated, there is a lot of information locally that is waiting to be extracted, and a learning approach such as sparse coding provides a principled way to do so, where rich representations can be automatically constructed and adapted. This is particularly important for novel sensor data such as RGB-D, for which we have less understanding but increasingly more need.
Distributed Dynamic Event-Triggered Optimization Algorithm Based on Periodic Sampling
第38卷第3期2024年5月山东理工大学学报(自然科学版)Journal of Shandong University of Technology(Natural Science Edition)Vol.38No.3May 2024收稿日期:20230323基金项目:江苏省自然科学基金项目(BK20200824)第一作者:夏伦超,男,20211249098@;通信作者:赵中原,男,zhaozhongyuan@文章编号:1672-6197(2024)03-0058-07基于周期采样的分布式动态事件触发优化算法夏伦超1,韦梦立2,季秋桐2,赵中原1(1.南京信息工程大学自动化学院,江苏南京210044;2.东南大学网络空间安全学院,江苏南京211189)摘要:针对无向图下多智能体系统的优化问题,提出一种基于周期采样机制的分布式零梯度和优化算法,并设计一种新的动态事件触发策略㊂该策略中加入与历史时刻智能体状态相关的动态变量,有效降低了系统通信量;所提出的算法允许采样周期任意大,并考虑了通信延时的影响,利用Lyapunov 稳定性理论推导出算法收敛的充分条件㊂数值仿真进一步验证了所提算法的有效性㊂关键词:分布式优化;多智能体系统;动态事件触发;通信时延中图分类号:TP273文献标志码:ADistributed dynamic event triggerring optimizationalgorithm based on periodic samplingXIA Lunchao 1,WEI Mengli 2,JI Qiutong 2,ZHAO Zhongyuan 1(1.College of Automation,Nanjing University of Information Science and Technology,Nanjing 210044,China;2.School of Cyber Science and Engineering,Southeast University,Nanjing 211189,China)Abstract :A distributed zero-gradient-sum optimization algorithm based on a periodic sampling mechanism is proposed to address the optimization problem of multi-agent systems under undirected graphs.A novel dynamic event-triggering strategy is designed,which incorporates dynamic variables as-sociated with the historical states of the agents to effectively reduce the system communication overhead.Moreover,the algorithm allows for arbitrary sampling periods and takes into consideration the influence oftime delay.Finally,sufficient conditions for the convergence of the algorithm are derived by utilizing Lya-punov stability theory.The effectiveness of the proposed algorithm is further demonstrated through numer-ical simulations.Keywords :distributed optimization;multi-agent systems;dynamic event-triggered;time delay ㊀㊀近些年,多智能体系统的分布式优化问题因其在多机器人系统的合作㊁智能交通系统的智能运输系统和微电网的分布式经济调度等诸多领域的应用得到了广泛的研究[1-3]㊂如今,已经提出各种分布式优化算法㊂文献[4]提出一种结合负反馈和梯度流的算法来解决平衡有向图下的无约束优化问题;文献[5]提出一种基于自适应机制的分布式优化算法来解决局部目标函数非凸的问题;文献[6]设计一种抗干扰的分布式优化算法,能够在具有未知外部扰动的情况下获得最优解㊂然而,上述工作要求智能体与其邻居不断地交流,这在现实中会造成很大的通信负担㊂文献[7]首先提出分布式事件触发控制器来解决多智能体系统一致性问题;事件触发机制的核心是设计一个基于误差的触发条件,只有满足触发条件时智能体间才进行通信㊂文献[8]提出一种基于通信网络边信息的事件触发次梯度优化㊀算法,并给出了算法的指数收敛速度㊂文献[9]提出一种基于事件触发机制的零梯度和算法,保证系统状态收敛到最优解㊂上述事件触发策略是静态事件触发策略,即其触发阈值仅与智能体的状态相关,当智能体的状态逐渐收敛时,很容易满足触发条件并将生成大量不必要的通信㊂因此,需要设计更合理的触发条件㊂文献[10]针对非线性系统的增益调度控制问题,提出一种动态事件触发机制的增益调度控制器;文献[11]提出一种基于动态事件触发条件的零梯度和算法,用于有向网络的优化㊂由于信息传输的复杂性,时间延迟在实际系统中无处不在㊂关于考虑时滞的事件触发优化问题的文献很多㊂文献[12]研究了二阶系统的凸优化问题,提出时间触发算法和事件触发算法两种分布式优化算法,使得所有智能体协同收敛到优化问题的最优解,并有效消除不必要的通信;文献[13]针对具有传输延迟的多智能体系统,提出一种具有采样数据和时滞的事件触发分布式优化算法,并得到系统指数稳定的充分条件㊂受文献[9,14]的启发,本文提出一种基于动态事件触发机制的分布式零梯度和算法,与使用静态事件触发机制的文献[15]相比,本文采用动态事件触发机制可以避免智能体状态接近最优值时频繁触发造成的资源浪费㊂此外,考虑到进行动态事件触发判断需要一定的时间,使用当前状态值是不现实的,因此,本文使用前一时刻状态值来构造动态事件触发条件,更符合逻辑㊂由于本文采用周期采样机制,这进一步降低了智能体间的通信频率,但采样周期过长会影响算法收敛㊂基于文献[14]的启发,本文设计的算法允许采样周期任意大,并且对于有时延的系统,只需要其受采样周期的限制,就可得到保证多智能体系统达到一致性和最优性的充分条件㊂最后,通过对一个通用示例进行仿真,验证所提算法的有效性㊂1㊀预备知识及问题描述1.1㊀图论令R表示实数集,R n表示向量集,R nˑn表示n ˑn实矩阵的集合㊂将包含n个智能体的多智能体系统的通信网络用图G=(V,E)建模,每个智能体都视为一个节点㊂该图由顶点集V={1,2, ,n}和边集E⊆VˑV组成㊂定义A=[a ij]ɪR nˑn为G 的加权邻接矩阵,当a ij>0时,表明节点i和节点j 间存在路径,即(i,j)ɪE;当a ij=0时,表明节点i 和节点j间不存在路径,即(i,j)∉E㊂D=diag{d1, ,d n}表示度矩阵,拉普拉斯矩阵L等于度矩阵减去邻接矩阵,即L=D-A㊂当图G是无向图时,其拉普拉斯矩阵是对称矩阵㊂1.2㊀凸函数设h i:R nңR是在凸集ΩɪR n上的局部凸函数,存在正常数φi使得下列条件成立[16]:h i(b)-h i(a)- h i(a)T(b-a)ȡ㊀㊀㊀㊀φi2 b-a 2,∀a,bɪΩ,(1)h i(b)- h i(a)()T(b-a)ȡ㊀㊀㊀㊀φi b-a 2,∀a,bɪΩ,(2) 2h i(a)ȡφi I n,∀aɪΩ,(3)式中: h i为h i的一阶梯度, 2h i为h i的二阶梯度(也称黑塞矩阵)㊂1.3㊀问题描述考虑包含n个智能体的多智能体系统,假设每个智能体i的成本函数为f i(x),本文的目标是最小化以下的优化问题:x∗=arg minxɪΩðni=1f i(x),(4)式中:x为决策变量,x∗为全局最优值㊂1.4㊀主要引理引理1㊀假设通信拓扑图G是无向且连通的,对于任意XɪR n,有以下关系成立[17]:X T LXȡαβX T L T LX,(5)式中:α是L+L T2最小的正特征值,β是L T 
L最大的特征值㊂引理2(中值定理)㊀假设局部成本函数是连续可微的,则对于任意实数y和y0,存在y~=y0+ω~(y -y0),使得以下不等式成立:f i(y)=f i(y0)+∂f i∂y(y~)(y-y0),(6)式中ω~是正常数且满足ω~ɪ(0,1)㊂2㊀基于动态事件触发机制的分布式优化算法及主要结果2.1㊀考虑时延的分布式动态事件触发优化算法本文研究具有时延的多智能体系统的优化问题㊂为了降低智能体间的通信频率,提出一种采样周期可任意设计的分布式动态事件触发优化算法,95第3期㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀夏伦超,等:基于周期采样的分布式动态事件触发优化算法其具体实现通信优化的流程图如图1所示㊂首先,将邻居和自身前一触发时刻状态送往控制器(本文提出的算法),得到智能体的状态x i (t )㊂然后,预设一个固定采样周期h ,使得所有智能体在同一时刻进行采样㊂同时,在每个智能体上都配置了事件检测器,只在采样时刻检查是否满足触发条件㊂接着,将前一采样时刻的智能体状态发送至构造的触发器中进行判断,当满足设定的触发条件时,得到触发时刻的智能体状态x^i (t )㊂最后,将得到的本地状态x^i (t )用于更新自身及其邻居的控制操作㊂由于在实际传输中存在时延,因此需要考虑满足0<τ<h 的时延㊂图1㊀算法实现流程图考虑由n 个智能体构成的多智能体系统,其中每个智能体都能独立进行计算和相互通信,每个智能体i 具有如下动态方程:x ㊃i (t )=-1h2f i (x i )()-1u i (t ),(7)式中u i (t )为设计的控制算法,具体为u i (t )=ðnj =1a ij x^j (t -τ)-x ^i (t -τ)()㊂(8)㊀㊀给出设计的动态事件触发条件:θi d i e 2i (lh )-γq i (lh -h )()ɤξi (lh ),(9)q i (t )=ðnj =1a ij x^i (t -τ)-x ^j (t -τ)()2,(10)㊀㊀㊀ξ㊃i (t )=1h[-μi ξi (lh )+㊀㊀㊀㊀㊀δi γq i (lh -h )-d i e 2i (lh )()],(11)式中:d i 是智能体i 的入度;γ是正常数;θi ,μi ,δi 是设计的参数㊂令x i (lh )表示采样时刻智能体的状态,偏差变量e i (lh )=x i (lh )-x^i (lh )㊂注释1㊀在进行动态事件触发条件设计时,可以根据不同的需求为每个智能体设定不同的参数θi ,μi ,δi ,以确保其能够在特定的情境下做出最准确的反应㊂本文为了方便分析,选择为每个智能体设置相同的θi ,μi ,δi ,以便更加清晰地研究其行为表现和响应能力㊂2.2㊀主要结果和分析由于智能体仅在采样时刻进行事件触发条件判断,并在达到触发条件后才通信,因此有x ^i (t -τ)=x^i (lh )㊂定理1㊀假设无向图G 是连通的,对于任意i ɪV 和t >0,当满足条件(12)时,在算法(7)和动态事件触发条件(9)的作用下,系统状态趋于优化解x ∗,即lim t ңx i (t )=x ∗㊂12-β2φm α-τβ2φm αh -γ>0,μi+δi θi <1,μi-1-δi θi >0,ìîíïïïïïïïï(12)式中φm =min{φ1,φ2}㊂证明㊀对于t ɪ[lh +τ,(l +1)h +τ),定义Lyapunov 函数V (t )=V 1(t )+V 2(t ),其中:V 1(t )=ðni =1f i (x ∗)-f i (x i )-f ᶄi (x i )(x ∗-x i )(),V 2(t )=ðni =1ξi (t )㊂令E (t )=e 1(t ), ,e n (t )[]T ,X (t )=x 1(t ), ,x n (t )[]T ,X^(t )=x ^1(t ), ,x ^n (t )[]T ㊂对V 1(t )求导得V ㊃1(t )=1h ðni =1u i (t )x ∗-x i (t )(),(13)由于ðni =1ðnj =1a ij x ^j (t -τ)-x ^i (t -τ)()㊃x ∗=0成立,有V ㊃1(t )=-1hX T (t )LX ^(lh )㊂(14)6山东理工大学学报(自然科学版)2024年㊀由于㊀㊀X (t )=X (lh +τ)-(t -lh -τ)X ㊃(t )=㊀㊀㊀㊀X (lh )+τX ㊃(lh )+t -lh -τhΓ1LX^(lh )=㊀㊀㊀㊀X (lh )-τh Γ2LX^(lh -h )+㊀㊀㊀㊀(t -lh -τ)hΓ1LX^(lh ),(15)式中:Γ1=diag (f i ᶄᶄ(x ~11))-1, ,(f i ᶄᶄ(x ~1n ))-1{},Γ2=diag (f i ᶄᶄ(x ~21))-1, ,(f i ᶄᶄ(x ~2n))-1{},x ~1iɪ(x i (lh +τ),x i (t )),x ~2i ɪ(x i (lh ),x i (lh+τ))㊂将式(15)代入式(14)得㊀V ㊃1(t )=-1h E T (lh )LX ^(lh )-1hX ^T (lh )LX ^(lh )+㊀㊀㊀τh2Γ2X ^T (lh -h )L T LX ^(lh )+㊀㊀㊀(t -lh -τ)h2Γ1X ^T (lh )L T LX ^(lh )㊂(16)根据式(3)得(f i ᶄᶄ(x ~i 1))-1ɤ1φi,i =1, ,n ㊂即Γ1ɤ1φm I n ,Γ2ɤ1φmI n ,φm =min{φ1,φ2}㊂首先对(t -lh -τ)h2Γ1X ^T (lh )L T LX ^(lh )项进行分析,对于t ɪ[lh +τ,(l +1)h +τ),基于引理1和式(3)有(t -lh -τ)h2Γ1X ^T (lh )L T LX ^(lh )ɤβhφm αX ^T (lh )LX ^(lh )ɤβ2hφm αðni =1q i(lh ),(17)式中最后一项根据X^T (t )LX ^(t )=12ðni =1q i(t )求得㊂接着分析τh2Γ2X ^(lh -h )L T LX ^(lh ),根据引理1和杨式不等式有:τh2Γ2X ^T (lh -h )L T LX ^(lh )ɤ㊀㊀㊀㊀τβ2h 2φm αX ^T (lh -h )LX ^(lh -h )+㊀㊀㊀㊀τβ2h 2φm αX ^T (lh )LX ^(lh )ɤ㊀㊀㊀㊀τβ4h 2φm αðni =1q i (lh -h )+ðni =1q i (lh )[]㊂(18)将式(17)和式(18)代入式(16)得㊀V ㊃1(t )ɤβ2φm α+τβ4φm αh -12()1h ðni =1q i(lh )+㊀㊀㊀τβ4φm αh ðni =1q i (lh -h )+1h ðni =1d i e 2i(lh )㊂(19)根据式(11)得V ㊃2(t )=-ðni =1μih ξi(lh )+㊀㊀㊀㊀ðni =1δihγq i (lh -h )-d i e 2i (lh )()㊂(20)结合式(19)和式(20)得V ㊃(t )ɤ-12-β2φm α-τβ4φm αh ()1h ðni =1q i (lh )+㊀㊀㊀㊀τβ4φm αh 2ðn i =1q i (lh -h )+γh ðni =1q i (lh -h )-㊀㊀㊀㊀1h ðni =1(μi -1-δi θi)ξi (lh ),(21)因此根据李雅普诺夫函数的正定性以及Squeeze 定理得㊀V (l +1)h +τ()-V (lh +τ)ɤ㊀㊀㊀-12-β2φm α-τβ4φm αh()ðni =1q i(lh )+㊀㊀㊀τβ4φm αh ðni =1q i (lh -h )+γðni =1q i (lh -h )-㊀㊀㊀ðni =1(μi -1-δiθi)ξi (lh )㊂(22)对式(22)迭代得V (l +1)h +τ()-V (h +τ)ɤ㊀㊀-12-β2φm α-τβ2φm αh-γ()ðl -1k =1ðni =1q i(kh )+㊀㊀τβ4φm αh ðni =1q i (0h )-㊀㊀12-β2φm α-τβ4φm αh()ðni =1q i(lh )-㊀㊀ðlk =1ðni =1μi -1-δiθi()ξi (kh ),(23)进一步可得㊀lim l ңV (l +1)h -V (h )()ɤ㊀㊀㊀τβ4φm αh ðni =1q i(0h 
)-16第3期㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀夏伦超,等:基于周期采样的分布式动态事件触发优化算法㊀㊀㊀ðni =1(μi -1-δi θi )ðl =1ξi (lh )-㊀㊀㊀12-β2φm α-τβ2φm αh-γ()ð l =1ðni =1q i(lh )㊂(24)由于q i (lh )ȡ0和V (t )ȡ0,由式(24)得lim l ң ðni =1ξi (lh )=0㊂(25)基于ξi 的定义和拉普拉斯矩阵的性质,可以得到每个智能体的最终状态等于相同的常数,即lim t ңx 1(t )= =lim t ңx n (t )=c ㊂(26)㊀㊀由于目标函数的二阶导数具有以下性质:ðni =1d f ᶄi (x i (t ))()d t =㊀㊀㊀㊀-ðn i =1ðnj =1a ij x ^j (t )-x ^i (t )()=㊀㊀㊀㊀-1T LX^(t )=0,(27)式中1=[1, ,1]n ,所以可以得到ðni =1f i ᶄ(x i (t ))=ðni =1f i ᶄ(x ∗i )=0㊂(28)联立式(26)和式(28)得lim t ңx 1(t )= =lim t ңx n (t )=c =x ∗㊂(29)㊀㊀定理1证明完成㊂当不考虑通信时延τ时,可由定理1得到推论1㊂推论1㊀假设通信图G 是无向且连通的,当不考虑时延τ时,对于任意i ɪV 和t >0,若条件(30)成立,智能体状态在算法(7)和触发条件(9)的作用下趋于最优解㊂14-n -1φm -γ>0,μi+δi θi <1,μi-1-δi θi >0㊂ìîíïïïïïïïï(30)㊀㊀证明㊀该推论的证明过程类似定理1,由定理1结果可得14-β2φm α-γ>0㊂(31)令λn =βα,由于λn 是多智能体系统的全局信息,因此每个智能体很难获得,但其上界可以根据以下关系来估计:λn ɤ2d max ɤ2(n -1),(32)式中d max =max{d i },i =1, ,n ㊂因此得到算法在没有时延情况下的充分条件:14-n -1φm -γ>0㊂(33)㊀㊀推论1得证㊂注释2㊀通过定理1得到的稳定性条件,可以得知当采样周期h 取较小值时,由于0<τ<h ,因此二者可以抵消,从而稳定性不受影响;而当采样周期h 取较大值时,τβ2φm αh项可以忽略不计,因此从理论分析可以得出允许采样周期任意大的结论㊂从仿真实验方面来看,当采样周期h 越大,需要的收剑时间越长,但最终结果仍趋于优化解㊂然而,在文献[18]中,采样周期过大会导致稳定性条件难以满足,即算法最终难以收敛,无法达到最优解㊂因此,本文提出的算法允许采样周期任意大,这一创新点具有重要意义㊂3㊀仿真本文对一个具有4个智能体的多智能体网络进行数值模拟,智能体间的通信拓扑如图2所示㊂采用4个智能体的仿真网络仅是为了初步验证所提算法的有效性㊂值得注意的是,当多智能体的数量增加时,算法的时间复杂度和空间复杂度会增加,但并不会影响其有效性㊂因此,该算法在更大规模的多智能体网络中同样适用㊂成本函数通常选择凸函数㊂例如,在分布式传感器网络中,成本函数为z i -x 2+εi x 2,其中x 表示要估计的未知参数,εi 表示观测噪声,z i 表示在(0,1)中均匀分布的随机数;在微电网中,成本函数为a i x 2+b i x +c i ,其中a i ,b i ,c i 是发电机成本参数㊂这两种情境下的成本函数形式不同,但本质上都是凸函数㊂本文采用论文[19]中的通用成本函数(式(34)),用于证明本文算法在凸函数上的可行性㊂此外,通信拓扑图结构并不会影响成本函数的设计,因此,本文的成本函数在分布式网络凸优化问题中具有通用性㊂g i (x )=(x -i )4+4i (x -i )2,i =1,2,3,4㊂(34)很明显,当x i 分别等于i 时,得到最小局部成本函数,但是这不是全局最优解x ∗㊂因此,需要使用所提算法来找到x ∗㊂首先设置重要参数,令φm =16,γ=0.1,θi =1,ξi (0)=5,μi =0.2,δi =0.2,26山东理工大学学报(自然科学版)2024年㊀图2㊀通信拓扑图x i (0)=i ,i =1,2,3,4㊂图3为本文算法(7)解决优化问题(4)时各智能体的状态,其中设置采样周期h =3,时延τ=0.02㊂智能体在图3中渐进地达成一致,一致值为全局最优点x ∗=2.935㊂当不考虑采样周期影响时,即在采样周期h =3,时延τ=0.02的条件下,采用文献[18]中的算法(10)时,各智能体的状态如图4所示㊂显然,在避免采样周期的影响后,本文算法具有更快的收敛速度㊂与文献[18]相比,由于只有当智能体i 及其邻居的事件触发判断完成,才能得到q i (lh )的值,因此本文采用前一时刻的状态值构造动态事件触发条件更符合逻辑㊂图3㊀h =3,τ=0.02时算法(7)的智能体状态图4㊀h =3,τ=0.02时算法(10)的智能体状态为了进一步分析采样周期的影响,在时延τ不变的情况下,选择不同的采样周期h ,其结果显示在图5中㊂对比图3可以看出,选择较大的采样周期则收敛速度减慢㊂事实上,这在算法(7)中是很正常的,因为较大的h 会削弱反馈增益并减少固定有限时间间隔中的控制更新次数,具体显示在图6和图7中㊂显然,当选择较大的采样周期时,智能体的通信频率显著下降,同时也会导致收敛速度减慢㊂因此,虽然采样周期允许任意大,但在收敛速度和通信频率之间需要做出权衡,以选择最优的采样周期㊂图5㊀h =1,τ=0.02时智能体的状态图6㊀h =3,τ=0.02时的事件触发时刻图7㊀h =1,τ=0.02时的事件触发时刻最后,固定采样周期h 的值,比较τ=0.02和τ=2时智能体的状态,结果如图8所示㊂显然,时延会使智能体找到全局最优点所需的时间更长,但由于其受采样周期的限制,最终仍可以对于任意有限延迟达成一致㊂图8㊀h =3,τ=2时智能体的状态36第3期㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀夏伦超,等:基于周期采样的分布式动态事件触发优化算法4 结束语本文研究了无向图下的多智能体系统的优化问题,提出了一种基于动态事件触发机制的零梯度和算法㊂该机制中加入了与前一时刻智能体状态相关的动态变量,避免智能体状态接近最优值时频繁触发产生的通信负担㊂同时,在算法和触发条件设计中考虑了采样周期的影响,在所设计的算法下,允许采样周期任意大㊂对于有时延的系统,在最大允许传输延迟小于采样周期的情况下,给出了保证多智能体系统达到一致性和最优性的充分条件㊂今后拟将本算法向有向图和切换拓扑图方向推广㊂参考文献:[1]杨洪军,王振友.基于分布式算法和查找表的FIR滤波器的优化设计[J].山东理工大学学报(自然科学版),2009,23(5):104-106,110.[2]CHEN W,LIU L,LIU G P.Privacy-preserving distributed economic dispatch of microgrids:A dynamic quantization-based consensus scheme with homomorphic encryption[J].IEEE Transactions on Smart Grid,2022,14(1):701-713.[3]张丽馨,刘伟.基于改进PSO算法的含分布式电源的配电网优化[J].山东理工大学学报(自然科学版),2017,31(6):53-57.[4]KIA S S,CORTES J,MARTINEZ S.Distributed convex optimization via continuous-time coordination algorithms with discrete-time communication[J].Automatica,2015,55:254-264.[5]LI Z H,DING Z T,SUN J Y,et al.Distributed adaptive convex optimization on directed graphs via continuous-time algorithms[J]. IEEE Transactions on Automatic Control,2018,63(5):1434 -1441.[6]段书晴,陈森,赵志良.一阶多智能体受扰系统的自抗扰分布式优化算法[J].控制与决策,2022,37(6):1559-1566. 
[7] DIMAROGONAS D V, FRAZZOLI E, JOHANSSON K H. Distributed event-triggered control for multi-agent systems [J]. IEEE Transactions on Automatic Control, 2012, 57(5): 1291-1297.
[8] KAJIYAMA Y C, HAYASHI N K, TAKAI S. Distributed subgradient method with edge-based event-triggered communication [J]. IEEE Transactions on Automatic Control, 2018, 63(7): 2248-2255.
[9] LIU J Y, CHEN W S, DAI H. Event-triggered zero-gradient-sum distributed convex optimisation over networks with time-varying topologies [J]. International Journal of Control, 2019, 92(12): 2829-2841.
[10] COUTINHO P H S, PALHARES R M. Codesign of dynamic event-triggered gain-scheduling control for a class of nonlinear systems [J]. IEEE Transactions on Automatic Control, 2021, 67(8): 4186-4193.
[11] CHEN W S, REN W. Event-triggered zero-gradient-sum distributed consensus optimization over directed networks [J]. Automatica, 2016, 65: 90-97.
[12] TRAN N T, WANG Y W, LIU X K, et al. Distributed optimization problem for second-order multi-agent systems with event-triggered and time-triggered communication [J]. Journal of the Franklin Institute, 2019, 356(17): 10196-10215.
[13] YU G, SHEN Y. Event-triggered distributed optimisation for multi-agent systems with transmission delay [J]. IET Control Theory & Applications, 2019, 13(14): 2188-2196.
[14] LIU K E, JI Z J, ZHANG X F. Periodic event-triggered consensus of multi-agent systems under directed topology [J]. Neurocomputing, 2020, 385: 33-41.
[15] CUI D D, LIU K E, JI Z J, et al. Periodic event-triggered distributed convex optimization for multi-agent systems [J]. Control Engineering of China, 2022, 29(11): 2027-2033. (in Chinese)
[16] LU J, TANG C Y. Zero-gradient-sum algorithms for distributed convex optimization: the continuous-time case [J]. IEEE Transactions on Automatic Control, 2012, 57(9): 2348-2354.
[17] LIU K E, JI Z J. Consensus of multi-agent systems with time delay based on periodic sample and event hybrid control [J]. Neurocomputing, 2016, 270: 11-17.
[18] ZHAO Z Y. Sample-based dynamic event-triggered algorithm for optimization problem of multi-agent systems [J]. International Journal of Control, Automation and Systems, 2022, 20(8): 2492-2502.
[19] LIU J Y, CHEN W S. Distributed convex optimisation with event-triggered communication in networked systems [J]. International Journal of Systems Science, 2016, 47(16): 3876-3887.
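The numerical example in the preceding paper (four agents, local costs g_i(x) = (x-i)^4 + 4i(x-i)^2, a periodically sampled zero-gradient-sum update with a dynamic event trigger) can be reproduced approximately with the sketch below. It is a simplified Euler discretization under assumptions the text does not fix for us: a ring communication topology with unit weights, no transmission delay, the sign convention of the standard zero-gradient-sum flow, and trigger errors measured against the latest broadcast value. The parameter values are those quoted in the simulation section; the sketch illustrates the structure of algorithm (7) and trigger (9)-(11) rather than reproducing the paper's plots.

```python
import numpy as np

n, h = 4, 3.0                                # agents, sampling period
theta, gamma, mu, delta = 1.0, 0.1, 0.2, 0.2
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)    # assumed ring topology, unit weights
d = A.sum(axis=1)

idx = np.arange(1, n + 1, dtype=float)       # local costs g_i(x) = (x - i)^4 + 4 i (x - i)^2
hess = lambda x: 12.0 * (x - idx) ** 2 + 8.0 * idx   # g_i''(x) > 0

x = idx.copy()                   # zero-gradient-sum initialization: g_i'(x_i(0)) = 0
x_hat = x.copy()                 # last broadcast states
xi = np.full(n, 5.0)             # dynamic trigger variables, xi_i(0) = 5
q_prev = np.zeros(n)

dt, T = 0.01, 600.0
steps_per_sample = int(round(h / dt))
for k in range(int(T / dt)):
    if k % steps_per_sample == 0:                            # sampling instant t = l*h
        e2 = (x - x_hat) ** 2
        trigger = theta * (d * e2 - gamma * q_prev) > xi     # condition (9) violated
        x_hat = np.where(trigger, x, x_hat)                  # broadcast only when triggered
        e2 = (x - x_hat) ** 2                                # errors reset where broadcast occurred
        q = np.array([A[i] @ (x_hat[i] - x_hat) ** 2 for i in range(n)])
        xi = xi + (-mu * xi + delta * (gamma * q_prev - d * e2))  # (11) integrated over one period
        q_prev = q
    u = A @ x_hat - d * x_hat                  # u_i = sum_j a_ij (x_hat_j - x_hat_i)
    x = x + dt * u / (h * hess(x))             # zero-gradient-sum flow

print(np.round(x, 3))    # agents converge to a common value (the paper reports x* ~ 2.935)
```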
Deep Learning Algorithm for the Resource Investment Problem
Computer Integrated Manufacturing Systems, Vol. 27, No. 6, June 2021. DOI: 10.13196/j.cims.2021.06.003
LU Zhiqiang, REN Yifei, XU Zexin (School of Mechanical Engineering, Tongji University, Shanghai 201804, China)
Abstract: For the resource investment scheduling problem, an intelligent decision-making mechanism for scheduling priority rules based on the real-time scheduling state is proposed, and a two-level iterative search algorithm embedding an artificial neural network is constructed.
The upper level of the algorithm is a heuristic resource search framework, and the lower level is an intelligent decision-making algorithm for scheduling priority rules based on the real-time scheduling state. The lower-level algorithm learns, offline, a mapping from scheduling states to scheduling priority rules with a BP neural network with two hidden layers; at each stage of the real-time scheduling process it decides the priority rule intelligently from the current scheduling data and uses it to guide job scheduling.
Finally, comparative experiments on the standard benchmark library PSPLIB verify the effectiveness of the designed algorithm.
Keywords: deep learning; two-level iterative cyclic search; resource investment problem; heuristic priority rule; scheduling
CLC number: TP29   Document code: A
Deep Learning Algorithm for Resource Investment Problem
LU Zhiqiang, REN Yifei, XU Zexin (School of Mechanical Engineering, Tongji University, Shanghai 201804, China)
Abstract: An intelligent decision-making scheme of scheduling priority rules based on real-time scheduling state was presented for resource and job scheduling of the resource investment problem, and a double-layer iterative cyclic search algorithm based on an artificial neural network was proposed. The upper stage of the algorithm was a heuristic resource search framework, and the lower stage was an intelligent decision-making algorithm of scheduling priority rules based on real-time scheduling status. The lower stage of the algorithm obtained the mapping relationship between scheduling status and scheduling priority rules through off-line learning of a double-hidden-layer BP neural network. The scheduling priority rules were decided intelligently at each stage of the real-time scheduling process, which guided job scheduling according to current scheduling data. The effectiveness of the designed algorithm was verified by comparison with other literature algorithms through experiments with PSPLIB.
Keywords: deep learning; double-layer iterative cyclic search; resource investment problem; heuristic priority rule; scheduling
0 Introduction
To meet market demand for orders and reduce aircraft production and assembly costs, major international aircraft manufacturers have abandoned the traditional fixed-station assembly mode, drawn on the Toyota flow-line production concept and lean production theory, re-engineered the final assembly line, and designed a new moving-assembly-line mode for aircraft final assembly. This has greatly shortened final assembly time, reduced manufacturing cost, improved assembly quality, and enabled continuous production on demand.
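The lower-level decision component described in the abstract above (an offline-trained network with two hidden layers that maps a real-time scheduling state to a priority rule) can be sketched as follows. The state features, the candidate rule set, and the training labels are all placeholders invented for illustration; only the overall structure, a two-hidden-layer feed-forward classifier selecting among heuristic priority rules, follows the text.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical priority rules the network chooses among at each scheduling stage.
RULES = ["LST", "MTS", "GRPW", "LFT"]   # e.g. latest start time, most total successors, ...

# Placeholder scheduling-state features (all invented): fraction of jobs scheduled,
# normalized remaining work, resource utilization, average slack, stage index.
def random_state(rng):
    return rng.rand(5)

rng = np.random.RandomState(0)
X = np.array([random_state(rng) for _ in range(2000)])
# Placeholder labels: index of the rule that performed best for that state in offline
# simulation runs (here synthesized from an arbitrary function of the features).
y = (X @ np.array([3, -2, 1, 2, -1]) > 0.8).astype(int) + (X[:, 0] > 0.7).astype(int)

# Two hidden layers, mirroring the double-hidden-layer BP network described above.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), activation="logistic",
                    solver="adam", max_iter=2000, random_state=0)
clf.fit(X, y)

# Online use: at each stage of real-time scheduling, feed the current state and apply
# the selected rule to pick the next job to schedule.
state = random_state(rng)
print("selected rule:", RULES[int(clf.predict(state.reshape(1, -1))[0])])
```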
The Sparsity Theorem of Compressed Sensing
Compressed sensing (CS) is a sampling theory based on sparse representation.
In signal processing, we usually assume that the original signal is sparse, i.e., it has only a few non-zero coefficients.
The sparsity theorem of compressed sensing states that, for a set of random or pseudo-random samples of a signal, the sparse representation of the original signal can be recovered from a small number of measurements.
The theorem can be summarized as follows: for a sparse signal satisfying certain conditions, when the sampling rate exceeds a certain threshold, the signal can be exactly reconstructed from a small number of measurements.
This condition is called the sparsity condition and is usually described with the L0 norm or the L1 norm.
The sparsity theorem of compressed sensing has broad application prospects in image processing, communication systems, medical imaging, and other fields.
It can help reduce the cost of data acquisition and transmission and improve the efficiency and accuracy of data processing.
At the same time, compressed sensing also faces challenges, such as how to choose appropriate sparse basis functions and how to design efficient reconstruction algorithms.
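A minimal numerical illustration of the recovery claim above: a k-sparse signal is measured with a random Gaussian matrix at a rate well below the signal length and then reconstructed with Orthogonal Matching Pursuit (a greedy L0-style solver; an L1 solver such as basis pursuit could be substituted). The dimensions and sparsity level are arbitrary choices for the demo.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.RandomState(0)
n, m, k = 256, 80, 8                  # signal length, number of measurements, sparsity

# k-sparse signal in the canonical basis (in practice sparsity may be in a transform basis).
x = np.zeros(n)
support = rng.choice(n, k, replace=False)
x[support] = rng.randn(k)

# Random Gaussian measurement matrix (satisfies the usual recovery conditions w.h.p.).
Phi = rng.randn(m, n) / np.sqrt(m)
y = Phi @ x                           # m << n compressed measurements

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
omp.fit(Phi, y)
x_hat = omp.coef_

print("relative reconstruction error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```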
Knowledge Distillation for Transferring Dark Knowledge
Knowledge distillation for transferring dark knowledge is an emerging machine learning technique: it lets a machine distill, from one model, the dark knowledge missing in another model, thereby achieving effective knowledge transfer.
Its purpose is to transfer the knowledge of an existing machine learning model to another model, so as to improve training efficiency and model performance.
The technique works by having the source model "distill" dark knowledge and applying that dark knowledge to the target model to realize the transfer.
In this setting, the knowledge contained in the source model can be regarded as the "dark knowledge", and the target model is its beneficiary.
The main way the technique is implemented is by integrating the parameters of the source model and applying them to the training process of the target model, thereby transferring the source model's dark knowledge to the target model.
Current research on this topic covers many approaches, including model clustering, joint model learning, self-supervised learning, weakly supervised learning, and multi-task learning.
For example, model clustering aggregates the source model's parameters into a set of parameters that is then used in training the target model, so that during training the target model acquires the dark knowledge contained in the source model.
Joint model learning is another such technique: it learns the parameters of the source and target models jointly so that the target model benefits from the source model's dark knowledge.
Self-supervised learning is also common: the source model's parameters are used to construct a self-supervised learning task, which is then used in training the target model.
Weakly supervised learning can likewise use the source model's parameters to construct weakly supervised tasks for training the target model.
Multi-task learning can use the source model's parameters to construct a multi-task learning problem for training the target model.
These techniques play an important role in many machine learning applications: they can effectively transfer the dark knowledge in a source model to a target model, improving training efficiency and model performance.
However, knowledge distillation for transferring dark knowledge still has open problems, such as how to choose a suitable distillation technique and how to transfer dark knowledge to the target model more effectively.
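The mechanism most commonly associated with transferring "dark knowledge" is matching the teacher's temperature-softened output distribution, as in Hinton-style distillation. The sketch below shows that loss in PyTorch; the temperature, the loss weight, and the tiny teacher/student networks are illustrative choices, not something specified in the text above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Weighted sum of (a) KL divergence between temperature-softened teacher and
    student distributions (the "dark knowledge" term) and (b) ordinary cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy teacher (larger) and student (smaller) on 20-dim inputs, 5 classes.
teacher = nn.Sequential(nn.Linear(20, 128), nn.ReLU(), nn.Linear(128, 5))
student = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 5))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

x = torch.randn(64, 20)
y = torch.randint(0, 5, (64,))
with torch.no_grad():
    t_logits = teacher(x)                 # the teacher stays fixed during distillation

loss = distillation_loss(student(x), t_logits, y)
loss.backward()
opt.step()
print(float(loss))
```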
Local Spectral Clustering and Mapping Algorithm Using Sparse Autoencoders
WAN Yue, CHEN Xiuhong, HE Jiajia (School of Digital Media, Jiangnan University, Wuxi 214122, China)
Journal: Transducer and Microsystem Technologies, 2018, 37(1): 145-148, 153. CLC number: TP393
Abstract: Traditional spectral clustering algorithms build the adjacency matrix directly from the original data with a Gaussian kernel and then cluster the original data, without taking into account deep features of the data or the manifold structure of the neighborhood, and they carry out only a single clustering step. In view of these three shortcomings, a local spectral clustering and mapping algorithm using sparse autoencoders (LSCMS) is proposed. After data preprocessing, LSCMS uses a sparse autoencoder to extract deep features of the original dataset, which better reflect the characteristics of the samples and replace the original data. Each data point is linearly reconstructed from its neighborhood, and the reconstruction weights replace the Gaussian kernel in building the adjacency matrix. LSCMS clusters the data and simultaneously maps them onto the cluster indicators so as to coordinate the cluster indicators. Experimental results on UCI datasets, handwritten-digit datasets, and face datasets show that the algorithm outperforms existing clustering algorithms.
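A rough sketch of the two main ingredients described in the abstract: a sparse autoencoder (L1 penalty on the hidden activations) to produce deep features, followed by spectral clustering on an affinity built from those features. This is a simplified stand-in; the full LSCMS additionally builds the affinity from local linear-reconstruction weights and learns the cluster-indicator mapping jointly, which is not reproduced here. The dataset, layer sizes, and penalty weight are illustrative.

```python
import torch
import torch.nn as nn
from sklearn.cluster import SpectralClustering
from sklearn.datasets import load_digits

X_np, y = load_digits(return_X_y=True)
X = torch.tensor(X_np, dtype=torch.float32) / 16.0   # digits features scaled to [0, 1]

class SparseAE(nn.Module):
    def __init__(self, d_in=64, d_hidden=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, d_hidden), nn.Sigmoid())
        self.dec = nn.Linear(d_hidden, d_in)
    def forward(self, x):
        h = self.enc(x)
        return h, self.dec(h)

model = SparseAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):                                  # short training loop for the demo
    h, recon = model(X)
    loss = ((recon - X) ** 2).mean() + 1e-3 * h.abs().mean()   # reconstruction + sparsity
    opt.zero_grad(); loss.backward(); opt.step()

features = model.enc(X).detach().numpy()
labels = SpectralClustering(n_clusters=10, affinity="nearest_neighbors",
                            n_neighbors=10, random_state=0).fit_predict(features)
print(labels[:20])
```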
AADL Modeling and Analysis of a Flight Management System
Computer Technology and Development (计算机技术与发展), Vol. 20, No. 3, Mar. 2010
TANG Xiaoming 1, SU Luohui 2, SONG Kepu 2
(1. School of Automation, Northwestern Polytechnical University, Xi'an 710075, China; 2. Flight Automatic Control Research Institute, Xi'an 710065, China)
Abstract: Modeling and analysis of avionics system software is an important means of guaranteeing the high reliability and high performance of military and civil aircraft, and is also an important part of model-driven software architecture. The flight management system is a key component of the avionics system; traditionally, its schedulability analysis is carried out after system design is finished, in the implementation and verification phase, which prevents an accurate analysis of the system's software and hardware requirements. Modeling the system with the advanced modeling language AADL makes schedulability analysis, reliability analysis, and communication-latency analysis of the flight management system possible, so that software and hardware requirements can be determined accurately at the requirements-analysis stage and the cost of change and re-verification can be greatly reduced. This paper first discusses the basic constituents of the AADL modeling language and their correspondence to the avionics application interface standard ARINC653; it then describes the functional composition of the flight management system and builds an AADL model of it; finally, it discusses in detail the scheduling theory, the AADL tools, and the simulation analysis of the flight management system's AADL model. The simulation analysis provides a basis for processor selection, system design, and software design and optimization of the flight management system.
Keywords: avionics system; model-driven AADL; flight management system; real-time scheduling analysis
CLC number: TP311   Document code: A   Article ID: 1673-629X(2010)03-0191-04
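The kind of schedulability check that an AADL model of the flight management system feeds into can be illustrated with classical fixed-priority response-time analysis. The task set below (periods and execution times) is entirely hypothetical and is not taken from the paper; the recurrence R_i = C_i + sum over higher-priority tasks of ceil(R_i/T_j)*C_j is the standard worst-case response-time iteration under rate-monotonic priorities.

```python
import math

# Hypothetical periodic tasks (period, WCET) in milliseconds, e.g. guidance loop,
# navigation update, display refresh; deadlines equal periods, rate-monotonic priorities.
tasks = [("guidance", 20, 4), ("navigation", 50, 12), ("display", 100, 20)]
tasks.sort(key=lambda t: t[1])          # shorter period = higher priority

def response_time(i):
    name, T_i, C_i = tasks[i]
    R = C_i
    while True:
        interference = sum(math.ceil(R / T_j) * C_j for _, T_j, C_j in tasks[:i])
        R_next = C_i + interference
        if R_next == R:
            return R
        if R_next > T_i:                # misses its deadline (= period)
            return None
        R = R_next

for i, (name, T, C) in enumerate(tasks):
    R = response_time(i)
    status = f"R = {R} <= {T}" if R is not None else "unschedulable"
    print(f"{name:12s} T={T:4d} C={C:3d} -> {status}")
```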
Abstract
The object-oriented design methods and their CASE tools are widely used in practice by many real-time software developers. However, object-oriented CASE tools require an additional step of identifying tasks from a given design model. Unfortunately, it is difficult to automate this step for a couple of reasons: (1) there are inherent discrepancies between objects and tasks; and (2) it is hard to derive tasks while maximizing real-time schedulability since this problem is a non-trivial optimization problem. As a result, in practical object-oriented CASE tools, task identification is usually performed in an ad-hoc manner using hints provided by human designers. In this paper, we present a systematic, schedulability-aware approach that can help map real-time object-oriented models to multi-threaded implementations. In our approach, a task contains a group of mutually exclusive transactions that may possess different periods and deadlines. For this new task model, we provide a schedulability analysis algorithm. We also show how the run-time system is implemented and how executable code is generated in our framework. We have performed a case study. It shows the difficulty of the task derivation problem and the utility of the automated synthesis of implementation.
real-time systems. However, object-oriented CASE tools require an additional step of identifying tasks from a given design model. Unfortunately, it is difficult to automate this step for a couple of reasons: (1) there are inherent discrepancies between objects and tasks; and (2) it is hard to derive tasks while maximizing real-time schedulability since this problem is a non-trivial optimization problem. As a result, in practical object-oriented CASE tools, task identification is usually performed in an ad-hoc manner using hints provided by human designers. However, task derivation has a significant effect on the real-time schedulability of the resultant system. Once task derivation is performed, the next step in object-oriented real-time system development is timing analysis. This can be done relatively easily using several scheduling algorithms in the literature [6, 5]. Whereas commercial object-oriented CASE tools including UML-RT have not yet provided timing analysis features and timeliness guarantees, there have been several research results on automated implementation of ROOM-based designs and associated schedulability analyses [2, 9, 10]. However, these approaches are applicable to a system design only after tasks are completely identified, and do not address the schedulability-aware mapping of real-time object-oriented models to implementations. Thus, real-time designers need rigorous methods for such mappings. In commercial object-oriented CASE tools, it is a common practice to map each individual object to a single task [12] since this is simple and natural. The other extreme practice is to map all objects in a system to a single task. For example, both RoseRT for UML-RT [13] and ObjecTime Developer [7] for ROOM map all active objects (a capsule instance in the UML-RT terminology) into a single task. RoseRT allows designers to map capsule instances to multiple tasks.
Unfortunately, this requires that the UML-RT design be changed. Designers need to convert their original design model into a dynamically configurable system by associating capsule instances with dynamically created, optional tasks and then inserting code incarnating those capsule instances in initial state transitions of fixed capsule instances. This hurts one of the advantages of object-oriented design methods: the separation of design models and implementations. Nevertheless, this approach is effective in reducing blocking time due to priority inversion. In this paper, we present a systematic, schedulability-aware method that can automatically generate an implementation from an object-oriented design model for a real-time system. We also show how the run-time system is implemented and how executable code is generated. To derive tasks from a design, we rely on the notion of a transaction. It denotes an end-to-end computation from an external input to an external output. It can be described as a sequence or chain of events flowing through the end-to-end computation. A transaction may be associated with timing constraints such as a period and a deadline as in [9]. Our approach groups mutually exclusive transactions into a task to reduce the number of tasks. As a result, a task may have multiple scheduling attributes including periods, execution times, and blocking times. Note that transactions in a task may have different periods and deadlines. We provide a new schedulability analysis algorithm that can take this task model into account. To further reduce the number of tasks, we also adopt preemption threshold scheduling presented in [15], and the notion of a non-preemptive group [11, 12]. The end result of our approach is an automated implementation method which maps a real-time object-oriented model to a multi-threaded implementation. The remainder of this paper is organized as follows. Section 2 overviews the UML-RT design model which becomes the source of our implementation model. Section 3 presents the details of our approach. Section 4 explains the inner workings and implementation details of a CASE tool that supports our approach. Section 5 presents a simple example which demonstrates the application of our approach. We conclude this paper in Section 6.
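To give a flavor of the kind of analysis such a task model calls for (this is not the paper's algorithm, which is not reproduced in this excerpt), the sketch below groups transactions with different periods and deadlines into tasks and runs a simple deadline-monotonic response-time check in which each transaction is analyzed with its own execution time, deadline, and a blocking term. The flattening of each task's transactions into independently interfering entities is a conservative simplification, and all task parameters are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Transaction:
    name: str
    period: float     # T
    deadline: float   # D <= T
    wcet: float       # C

# Hypothetical tasks, each grouping mutually exclusive transactions (different T and D).
tasks = {
    "sensor_task":  [Transaction("sample", 10, 10, 2), Transaction("calibrate", 100, 50, 5)],
    "control_task": [Transaction("control", 20, 20, 6)],
    "logger_task":  [Transaction("log", 50, 50, 8)],
}
blocking = {"sensor_task": 1.0, "control_task": 1.0, "logger_task": 0.0}   # assumed B_i

# Conservative flattening: analyze every transaction as a sporadic task under
# deadline-monotonic priorities, adding its task's blocking term.
flat = sorted(((tname, tr) for tname, trs in tasks.items() for tr in trs),
              key=lambda p: p[1].deadline)

def schedulable(idx):
    tname, tr = flat[idx]
    R = tr.wcet + blocking[tname]
    while R <= tr.deadline:
        R_next = tr.wcet + blocking[tname] + sum(
            math.ceil(R / hp.period) * hp.wcet for _, hp in flat[:idx])
        if R_next == R:
            return True, R
        R = R_next
    return False, R

for i, (tname, tr) in enumerate(flat):
    ok, R = schedulable(i)
    print(f"{tname}/{tr.name}: R={R:.1f}, D={tr.deadline}, {'OK' if ok else 'MISS'}")
```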