Hard-negative sample mining for metric learning based on linear assignment

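After a CNN extracts features from the images, the distance metric between the feature encodings is compared. Unlike hard-example mining that works by suppressing easy samples, the hard-negative sample mining method based on linear assignment uses a linear assignment problem (LAP) algorithm to construct training pairs dynamically according to how well the model has learned so far, and adjusts the order in which samples are drawn so that the model is trained sufficiently.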
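For reference, the standard form of a linear assignment problem can be written as below. The symbols are assumptions introduced for illustration and are not notation taken from the paper: $c_{ij}$ stands for the cost of pairing sample $i$ with sample $j$ (for example, a score derived from the model's distance metric), and $x_{ij} = 1$ when that pair is selected.

```latex
\min_{x} \; \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij}\, x_{ij}
\quad \text{s.t.} \quad
\sum_{j=1}^{n} x_{ij} = 1 \;\; \forall i, \qquad
\sum_{i=1}^{n} x_{ij} = 1 \;\; \forall j, \qquad
x_{ij} \in \{0, 1\}
```

The two equality constraints make the selection one-to-one: every sample receives exactly one partner and serves as a partner exactly once.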
Journal of Computer Applications (计算机应用), 2020, 40(2): 352-357
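FU Taiming (傅泰铭), CHEN Yan* (陈燕), LI Taoshen (李陶深)
(College of Computer, Electronics and Information, Guangxi University, Nanning 530004) (*Corresponding author, e-mail: gxcy@foxmail.com)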
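Abstract: Scientists rely on the shape of a whale's tail and its unique markings to identify the type of whale, but identification by eye and manual labeling is very tedious. Moreover, whale tail photo datasets are imbalanced: some classes have very few samples, sometimes only one. In addition, differences between individual samples are small and unknown classes are present, which makes it difficult to label whale identities automatically by image classification. To address the difficulty of classification with metric learning on this task, a linear assignment problem (LAP) algorithm is used, on top of a Siamese neural network (SNN), to perform hard-negative sample mining during training and thereby construct training batches dynamically. First, image feature vectors are extracted from the training samples and a similarity metric between the feature vectors is computed. Then LAP assigns sample pairs to the model, and training batches are constructed dynamically from the metric score matrix so that hard sample pairs are trained on specifically. Experimental results on a whale tail image dataset with an imbalanced data distribution and on the CUB-200-2011 dataset show that the proposed algorithm achieves good results in minority-class learning and fine-grained image classification.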
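The pair-assignment step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euclidean distance, the `BIG` penalty that forbids same-class pairs, the toy 2-D features, and all function names are hypothetical, and the LAP is solved by exhaustive search, which is exact but only feasible for a handful of samples.

```python
from itertools import permutations
from math import dist

BIG = 1e6  # assumed penalty that forbids self-pairs and same-class pairs


def cost_matrix(features, labels):
    """Pairwise Euclidean distances; same-class entries are set to BIG."""
    n = len(features)
    return [
        [BIG if labels[i] == labels[j] else dist(features[i], features[j])
         for j in range(n)]
        for i in range(n)
    ]


def solve_lap(cost):
    """Exact LAP by exhaustive search over permutations (tiny inputs only)."""
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return list(best)


def mine_hard_negatives(features, labels):
    """Return one (anchor, negative) index pair per sample.

    Minimizing the total distance pairs each anchor with a sample of a
    different class that currently lies close to it in feature space,
    while the one-to-one constraint uses every sample as a negative once.
    """
    assignment = solve_lap(cost_matrix(features, labels))
    return list(enumerate(assignment))


# Toy 2-D "feature vectors" for three classes (made-up numbers).
features = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0), (0.0, 2.0), (0.2, 2.0)]
labels = ["a", "a", "b", "b", "c", "c"]
pairs = mine_hard_negatives(features, labels)
```

In a training loop, the features would be recomputed with the current model before each round, so the assignment changes as the model learns. At realistic batch sizes a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would be a natural replacement for the exhaustive search.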
(College of Computer, Electronics and Information, Guangxi University, Nanning Guangxi 530004, China)
Abstract: Scientists identify the species of whales based on the shape and distinctive marks of their tails, but recognition by eye and manual labeling is very cumbersome. Whale tail photo datasets have an unbalanced data distribution, and some categories have very few samples or even a single sample. Besides, the samples have small individual differences and contain unknown categories, which makes automatic labeling of whale identities by image classification difficult. To address the difficulty of classification with metric learning on this task, training batches were constructed dynamically on the basis of a Siamese Neural Network (SNN) by using a Linear Assignment Problem (LAP) algorithm for hard-negative sample mining during training. Firstly, image feature vectors were extracted from the training samples, and the similarity metric between feature vectors was calculated. Then, LAP was used to assign sample pairs to the model, training sample batches were constructed dynamically according to the metric score matrix, and the difficult sample pairs were trained in a targeted manner. Experimental results on a whale tail image dataset with unbalanced data distribution and on the CUB-200-2011 dataset show that the proposed algorithm can achieve good results in learning minority classes and classifying fine-grained images.
Key words: linear assignment; hard-negative sample mining; metric learning; fine-grained image recognition; Siamese Neural Network (SNN)
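0 Introduction
In recent years, the Convolutional Neural Network (CNN) has been shown to achieve excellent performance in many application areas of computer vision, such as object detection, face recognition, and image classification. A well-trained CNN has very strong representational power and can even be used to discriminate on tasks it was not trained for.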
ISSN 1001-9081  CODEN JYIIDU
2020-02-10  http://www.joca.cn
Article ID: 1001-9081(2020)02-0352-06
DOI: 10.11772/j.issn.1001-9081.2019081403
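Keywords: linear assignment; hard-negative sample mining; metric learning; fine-grained image recognition; Siamese neural network  CLC number: TP391.41  Document code: A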
Hard-negative sample mining for metric learning based on linear assignment
FU Taiming, CHEN Yan*, LI Taoshen