Fine-Grained Fish Image Classification Based on a Bilinear Network with Spatial Transformation


Journal of Tianjin University (Science and Technology)

Vol. 52, No. 5, May 2019

Received: 2018-08-10; revised: 2018-11-04. About the author: Ji Zhong (born 1979), male, Ph.D., associate professor, jizhong@. Corresponding author: Zhang Suoping, iot323@.

Funding: Supported by the National Natural Science Foundation of China (No. 61771329) and the Natural Science Foundation of Tianjin, China (No. 17JCYBJC16300).

DOI:10.11784/tdxbz201808040

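Fine-Grained Fish Image Classification Based on a Bilinear Network with Spatial Transformation

Ji Zhong 1, Zhao Kexin 1, Zhang Suoping 2, Li Mingbing 2

(1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China; 2. National Ocean Technology Center, Tianjin 300072, China)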

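Abstract: Effectively recognizing the various fish species found under water has significant practical and theoretical value. Fish live in complex environments. Because of extreme ocean conditions, underwater fish images have low resolution, show high inter-class similarity and large intra-class variation, and are strongly affected by illumination, viewing angle, and pose. These factors make fish recognition a challenging task. To address these difficulties, a deep learning model for effective fine-grained fish image classification is proposed. The model consists of two parts, a spatial transformer network and a bilinear network. The spatial transformer network first serves as an attention mechanism: it removes complex interfering information from the image background and selects the target region of interest, which simplifies the subsequent classification. The bilinear network then extracts bilinear features of the image by fusing the feature maps of two deep networks, so that it responds strongly to discriminative locations on the target and can thereby identify the species. The model can be trained end to end. On the public F4K dataset, the model achieves the best performance, with a recognition accuracy of 99.36%, which is 0.56% higher than that of DeepFish, the best existing algorithm. In addition, a new fish image dataset, Fish100, containing 6358 images in 100 classes, is released. On Fish100, the recognition accuracy of the model is 0.98% higher than that of the BCNN algorithm. Experiments on multiple datasets verify the effectiveness and superiority of the model.

Keywords: fish classification; fine-grained classification; spatial transformation; bilinear network

CLC number: TP37  Document code: A  Article ID: 0493-2137(2019)05-0475-08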
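The attention step described in the abstract can be illustrated with a minimal sketch of the sampling part of a spatial transformer. It is not the authors' implementation: the function names `affine_crop` and `_bilinear_sample` are hypothetical, and the 2x3 affine matrix `theta` is fixed by hand here, whereas in a spatial transformer network it would be predicted by a small localisation network and learned end to end.

```python
import math


def _bilinear_sample(image, px, py):
    """Bilinearly interpolate `image` at pixel position (px, py); outside pixels count as 0."""
    in_h, in_w = len(image), len(image[0])
    x0, y0 = math.floor(px), math.floor(py)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= xx < in_w and 0 <= yy < in_h:
                weight = (1 - abs(px - xx)) * (1 - abs(py - yy))
                value += weight * image[yy][xx]
    return value


def affine_crop(image, theta, out_h, out_w):
    """Resample `image` (a list of rows of floats) through a 2x3 affine map.

    `theta` maps normalised output coordinates (x, y) in [-1, 1] to
    normalised input coordinates. Here it is an assumed, hand-set value.
    """
    in_h, in_w = len(image), len(image[0])
    out = []
    for r in range(out_h):
        y_t = -1.0 + 2.0 * r / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for c in range(out_w):
            x_t = -1.0 + 2.0 * c / (out_w - 1) if out_w > 1 else 0.0
            # Grid generator: source coordinate = theta * (x_t, y_t, 1).
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            # Convert normalised source coordinates back to pixel units.
            px = (x_s + 1.0) * (in_w - 1) / 2.0
            py = (y_s + 1.0) * (in_h - 1) / 2.0
            row.append(_bilinear_sample(image, px, py))
        out.append(row)
    return out
```

With `theta = [[0.5, 0, 0], [0, 0.5, 0]]` the sampler zooms in on the centre of the image, which is the kind of region selection the abstract attributes to the spatial transformer; an identity `theta` returns the image unchanged.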

Fine-Grained Fish Image Classification Based on a Bilinear Network with Spatial Transformation

Ji Zhong 1, Zhao Kexin 1, Zhang Suoping 2, Li Mingbing 2

(1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China; 2. National Ocean Technology Center, Tianjin 300072, China)

Abstract: Effective classification of the various fish species found under water has great practical and theoretical significance. Because of the extreme conditions of the ocean, underwater images have very low resolution. Because the living environment is highly complex, fish images show high inter-class similarity and large intra-class variation, and are strongly affected by illumination, viewing angle, and pose. These factors make fish classification a challenging task. To cope with these challenges, a deep fine-grained fish image classification model is proposed. It consists of a spatial transformer network and a bilinear network. Specifically, the spatial transformer network acts as an attention mechanism that removes the complex background and selects the region of interest in the image. The bilinear network extracts bilinear features of the image by fusing the feature maps of two deep networks, so that it responds strongly to the discriminative parts of the target. The model can be trained end to end. On the public F4K dataset, the model achieves the best performance, with a recognition accuracy of 99.36%, which is 0.56% higher than that of the DeepFish algorithm. In addition, a new dataset called Fish100, containing 6358 images in 100 categories, is released. On Fish100, the accuracy of the model is 0.98% higher than that of the bilinear convolutional neural network (BCNN) model. Experiments on several datasets verify the effectiveness and superiority of the proposed algorithm.

Keywords: fish classification; fine-grained classification; spatial transformation; bilinear network
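The fusion of two feature maps into a bilinear feature, as described in the abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `bilinear_pool` is hypothetical, each feature map is represented as a plain list of per-location feature vectors, and the signed square root followed by L2 normalisation is a post-processing step commonly used with bilinear features that the abstract itself does not specify.

```python
import math


def bilinear_pool(feat_a, feat_b):
    """Pool two feature maps into one bilinear descriptor.

    feat_a: list of per-location feature vectors (length c_a each)
    feat_b: list of per-location feature vectors (length c_b each),
            one for each of the same spatial locations as feat_a
    Returns a flat list of length c_a * c_b.
    """
    if len(feat_a) != len(feat_b) or not feat_a:
        raise ValueError("feature maps must cover the same non-empty set of locations")
    c_a, c_b = len(feat_a[0]), len(feat_b[0])
    pooled = [0.0] * (c_a * c_b)
    # Sum the outer product of the two feature vectors over all locations.
    for va, vb in zip(feat_a, feat_b):
        for i in range(c_a):
            for j in range(c_b):
                pooled[i * c_b + j] += va[i] * vb[j]
    # Assumed post-processing: signed square root, then L2 normalisation.
    pooled = [math.copysign(math.sqrt(abs(x)), x) for x in pooled]
    norm = math.sqrt(sum(x * x for x in pooled))
    return [x / norm for x in pooled] if norm > 0 else pooled
```

Because each entry of the descriptor multiplies a channel of one network with a channel of the other at the same location, the descriptor responds strongly only where both networks fire together, which is one way to read the abstract's claim about discriminative locations. The resulting vector would then be passed to a classifier.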