2D Line Filters for Vision-based Lane Detection and Tracking
一种基于消失点和颜色过滤器的车道线检测算法
第40卷第5期Vol.40㊀No.5重庆工商大学学报(自然科学版)J Chongqing Technol &Business Univ(Nat Sci Ed)2023年10月Oct.2023一种基于消失点和颜色过滤器的车道线检测算法胡昊东1,刘贵如1,王陆林1,李㊀铮21.安徽工程大学计算机与信息学院,安徽芜湖2410002.芜湖易来达雷达科技有限公司,安徽芜湖241000摘㊀要:针对经典霍夫车道线检测方法实用性较差,无法准确区分车道线和路沿与应用道路场景简单等问题,提出了一种基于消失点和颜色过滤器的车道线检测算法,不仅提高车道线检测的准确率,而且能够应用较复杂行车场景;首先,对行车视频连续五帧图像进行预处理,获取行车环境下车道线消失点位置,能够自适应选取行车环境图像的感兴趣区域(Region of Interest ,ROI );然后,对ROI 图像根据车道线颜色特征进行过滤得到二值图像,获取二值图像中所有连通区域质心和倾斜角等数据,通过结合消失点特征和角度阈值进行限制,筛选记录符合车道线特征连通区域的数据,接着分割较大区域获取更多质心点,识别漏检符合车道线特征的区域质心点;最后,对获取的质心点使用最小二乘法进行拟合并标识车道线;实验结果表明:算法能够在多场景道路上快速准确的检测出车道线,与经典霍夫算法进行仿真比较,算法具有一定的鲁棒性和实时性㊂关键词:车道线检测;RGB 阈值过滤;消失点;自适应感兴趣区域;Hough 变换中图分类号:TP391.9㊀㊀文献标识码:A ㊀㊀doi:10.16055/j.issn.1672-058X.2023.0005.004㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀收稿日期:2022-06-14㊀修回日期:2022-07-28㊀文章编号:1672-058X(2023)05-0025-09基金项目:高校优秀青年人才支持计划重点项目(GXYQZD2019052);安徽工程大学自然科学预研项目(XJKY2020123);安徽工程大学自然科学预研项目(XJKY2022146);安徽工程大学-鸠江区产业协同创新专项基金项目(2022CYXTB2).作者简介:胡昊东(1999 ),安徽淮南人,硕士研究生,从事车道线检测研究.通讯作者:刘贵如(1980 ),山西五台人,副教授,从事信号处理相关研究.Email:liuguiru@.引用格式:胡昊东,刘贵如,王陆林,等.一种基于消失点和颜色过滤器的车道线检测算法[J].重庆工商大学学报(自然科学版),2023,40(5):25 33.HU Haodong LIU Guiru WANG Lulin et al.A lane detection algorithm based on vanishing point and color filter J .Journal of Chongqing Technology and Business University Natural Science Edition 2023 40 5 25 33.A Lane Detection Algorithm Based on Vanishing Point and Color Filter HU Haodong 1 LIU Guiru 1 WANG Lulin 1 LI Zheng 21.School of Computer and Information Science Anhui Polytechnic University Anhui Wuhu 241000 China2.Wuhu Elaida Radar Technology Co. LTD. Anhui Wuhu 241000 ChinaAbstract A lane line detection algorithm based on vanishing points and color filters was proposed to address the problems of poor practicality of the classical Hough lane line detection method its inability to accurately distinguish between lane lines and road edges and the simplicity of the applied road scenes.The proposed algorithm not only improves the accuracy of lane line detection but also enables the application of more complex driving scenarios.First five consecutive frames of the driving video were pre-processed to obtain the location of the vanishing point of the lane lines in the driving environment and the Region of Interest ROI of the driving environment image could be selected adaptively.Then the ROI image was filtered according to the lane line color feature to obtain a binary image and data such as the centroid and tilt angle of all connected areas in the binary image was obtained.By combining the vanishing point feature and angle threshold to limit the data in the connected area conforming to the lane line feature was screened and recorded.The data of the area was then divided into a larger area to obtain more centroid points and the area centroid points that were missed and met the characteristics of the lane line were identified.Finally the least squares method was used to fit the obtained重庆工商大学学报(自然科学版)第40卷centroid points and identify the lane lines.The experiment shows that the algorithm can quickly and accurately detect lane lines on multi-scene paring the classical Hough algorithm the algorithm has certain robustness and real-time performance.Keywords lane detection RGB threshold filtering vanishing points adaptive region of interest Hough transform1㊀引㊀言随着科学技术的快速发展,汽车的智能驾驶能力也得到优越的发展㊂其中,车道线检测技术在车辆辅助系统中发挥着关键性作用,通过摄像头采集数据并交由计算机进行分析处理,将车道线进行正确检测与标识㊂车道线检测的准确性能够直接影响车辆的行车安全[1-2]㊂从目前国内外学者关于车道线检测研究现状来看,传统的车道线检测方法主要包括基于特征[3]和基于模型[4]的检测方法,并且除传统检测方法外,基于深度学习的检测方法也在快速的发展㊂基于特征的方法主要是在灰度图像或者彩色图像中提取车道线的特征,主要包括车道线的颜色㊁边缘㊁直方图等特征㊂基于模型的方法主要通过动态的调整车道线模型的参数实现对车道线的检测,可以采用的模型有直线模型㊁抛物线模型和B样条曲线等㊂K. Dinakaran等[5]提出一种利用先进计算机视觉技术的车道线检测算法,将RGB色彩空间转换成HLS色彩空间,图像利用逆透视变换变成鸟瞰图,不足之处在于无法确定使用相机内部参数就无法进行逆透视变换;Lee 等[6]对颜色属性和图像梯度进行研究,采用扫描线测试进行线聚类,完成车道线标识;Zhao等[7]利用基于Catmull-Rom样条曲线模型来表示弯曲的车道线,该模型对噪声比较敏感,在雨天容易产生错误检测;Wang 等[8]利用三阶B样条曲线模型来表示弯曲的车道线,但拟合过程会进行过多的迭代运算;王杰等[9]提出了基于Retinex的图像增强算法,解决在弱光环境中车道线难检测的问题,但检测环境仅局限于弱光环境,外部环境的变化对检测结果影响较大㊂目前,传统的车道线检测方法在实际应用中具有简便易操作性㊁检测性能方面能够满足高效准确的要求,仍然吸引众多学者对车道线检测进行深入研究㊂近些年,随着深度学习㊁神经网络的不断发展,越来越多的研究学者将深度学习㊁神经网络相关知识用于车道线检测研究㊂孙建波等[10]提出在Lane Net网络中使用VGG16全连接卷积化取代E-Net解码器,以此提升对车道线的检测速度;Satish Kumar Satti等[11]提出一种用于检测和跟踪道路边界车道的新方法CNN-LD,主要基于卷积神经网络来提取边缘特征㊂使用深度学习方法进行车道线检测会比传统方法表现出更好的准确率,但是需要大量的训练集进行训练,并且对计算机的计算能力要求严格,成本需求较高㊂由于实际道路环境变幻莫测,目前经典霍夫车道线检测方法在处理车道线和路沿容易出现误检情况,并且在一些复杂场景下也会出现误检漏检问题㊂现在已有的研究基础上,选用颜色过滤器直接对道路图像进行处理,获取完整车道线信息,再通过消失点等一些条件进行约束来得到车道线精确信息,采用最小二乘法拟合道路车道线㊂算法在车道线破损㊁车道标识干扰和车道线被遮挡等复杂情况也能够完成车道线检测,具有一定的鲁棒性,提高了车道线检测方法的准确率和实时性㊂2㊀车道线检测算法流程所提出的车道线检测算法的流程主要由3部分组成,包括消失点检测与感兴趣区域提取阶段,左右车道线划分及参数获取阶段和车道线拟合阶段㊂首先,对车载视频的连续五帧图像进行预处理获取边缘图像;再用Hough变化检测图像中的直线,并将直线进行相交,对所有交点坐标进行投票,在投票数最大的区域内,求区域内所有坐标的均值,通过均值坐标对图像划分获取ROI,从而降低图像车道线检测阶段的复杂度㊂然后,根据对每一帧道路图像中的车道线颜色特征,使用过滤器对白色车道线和黄色车道线进行过滤,得到其二值图像㊂接着对二值图像中连通区域依据车道线特征进行筛选,获取符合左右车道线特征区域的质心㊂最后,使用直线模型进行车道线拟合,并对车道线进行标记㊂算法的具体流程如图1和图2所示㊂图1㊀感兴趣区域范围提取流程Fig.1㊀ROI extraction process62第5期胡昊东,等:一种基于消失点和颜色过滤器的车道线检测算法每一帧图像使用R O I 范围进行裁剪基于车道线颜色特征进行过滤计算图像消失点位置获取所有连通区域相关参数结合消失点与角度阈值结合消失点与角度阈值获取左车道线数据;拟合左车道线获取右车道线数据;拟合右车道线添加漏检左车道线区域添加漏检右车道线区域左车道线候选区域右车道线候选区域存储历史车道线数据车道线最终拟合所有连通区域图2㊀车道线检测算法的主要流程Fig.2㊀The main flow of the lane detection algorithm3㊀图像预处理由于图像中会出现天空和建筑物等一些杂质,会影响图像中车道线的检测㊂为了便于检测出车道线,突出车道线特征,确保车道线检测的精度,因此通过预处理操作,获得自适应当前图像大小的ROI㊂在图像中车道线的区域几乎全在图像下半部分,通过预处理得到的自适应ROI 几乎包含全部车道线信息㊂3.1㊀灰度化图像在对图像进行处理前,需要对图像进行灰度化处理,将图像进行灰度化后可减少后续算法的计算量㊂传统常用的灰度化方法是加权平均法,加权平均法的方法,如式(1)所示:I (x )=0.299R (x )+0.587G (x )+0.114B (x )(1)通过对道路车道线的观察,车道线主要是由白色车道线和黄色车道线组成,因此为了增加道路背景与道路车道线之间的对比度,选用将RGB 通道中的红色和绿色通道进行保留的方式,对图像进行灰度化,所使用的方法[12],如式(2)所示:I (x )=R (x )+G (x )-B (x )(2)上述两式中,I (x )表示图像中任意位置的灰度值,R (x )表示红色通道的亮度值,G (x )表示绿色通道的亮度值,B (x )表示蓝色通道的亮度值㊂灰度化图像的对比效果如图3所示㊂(a )原图(b )加权平均法(c )本文方法图3㊀不同灰度化方法效果对比Fig.3㊀Comparison of the effects of different grayscale methods3.2㊀边缘检测在对图像进行灰度化后,仍然需要借助灰度图像中各个像素点的梯度幅值及方向,从而提取车道线信息㊂利用Canny 算子[13]进行边缘检测㊂Canny 算子的检测步骤主要由四部分组成,首先使用高斯滤波器对图像进行平滑操作;接着各个像素点的梯度幅值及方向利用一阶偏导有限差分方法进行计算;然后对幅值图像采用非极大值进行抑制;最后使用双阈值进行检测和连接边缘㊂Canny 边缘检测的效果图如图4所示㊂图4㊀Canny 边缘检测效果图Fig.4㊀Canny edge detection renderings72重庆工商大学学报(自然科学版)第40卷3.3㊀Hough 变换Hough 变换主要是一种特征检测[14],经常用来检测图像中的直线特征㊂Hough 变换是对两个坐标空间进行变换,将直角坐标系中的直线方程映射到极坐标空间上的一个点,从而形成峰值,将检测直线问题转成统计峰值问题㊂在直角坐标系中,对于一条直线的方程定义为y =kx +b 的形式,其中k 表示直线斜率,b 表示直线截距㊂对于过点k (x 0,y 0)的直线方程y 0=kx 0+b ,将直线方程改变形式成b =-kx 0+y 0,因此就使直角坐标系中的一条直线与参数空间中的一个点对应起来㊂当参数空间选用直角坐标系时,就会出现在特殊直线方程x =c (c 是常数),此时直线的斜率无穷大,无法在参数空间中进行表示的问题㊂因此将参数空间选用极坐标系,对直线做垂线,垂线用极坐标进行表示为ρ=x cos θ+y sin θ,其中ρ表示垂线极径,θ表示垂线极角㊂完成方程从直角坐标系到极坐标系的转换后,接着设计一个数组为2维大小的累加器,用于记录极坐标系中交点,对交点参数ρ和θ满足的单元格进行投票,直至投票结束㊂最后搜索累加器中极值点,取值极值点的ρ和θ值,求解直线方程㊂如图5所示㊂-1000-5000500100020406080-80-60-40-20ρθ(a )选取极坐标空间域中多个峰值(b )Hough 变换检测出的线段图5㊀图像的Hough 变换Fig.5㊀Hough transform of images3.4㊀消失点位置及ROI 获取通过Hough 变换可以检测出分为左侧和右侧的线段集㊂消失点定义为车道线或者道路边界在远处的交点,因此对检测到的线段进行延展,使之在远处相交,如图6(a)所示㊂按图像长宽大小创建一个20ˑ20像素大小的单元格集,组成满足允许消失点位置误差范围内的网格[15]㊂对每个交点进行判断,若交点落在某一单元格内,则将对相应单元格进行+1操作,直至所有交点判断结束㊂最后,选出具有最大投票数的单元格,并求出单元格内的投票交点的平均值,将其作为消失点的近似值,如图6(b)所示㊂利用消失点的位置对图像进行裁剪,得到包含车道线信息的ROI㊂如图6(c)和图6(d)所示㊂(a )延展检测直线(b )获取最大投票区域与消失点(c )原图82第5期胡昊东,等:一种基于消失点和颜色过滤器的车道线检测算法(d )ROI 图像图6㊀消失点预测及ROI 选取Fig.6㊀Vanishing point prediction and ROI selection4㊀车道线检测算法4.1㊀车道线颜色特征过滤器根据道路车道线的颜色特征,道路上的车道线主要由黄色车道和白色车道线两种颜色组成㊂考虑到图像是由R㊁G㊁B 3种通道组成,因此将图像的3个通道进行分离,分别对每一种通道设置阈值进行过滤㊂对图像中的黄色车道线和白色车道线分别设置不同的阈值进行过滤,最后合并两种颜色车道线的二值图像㊂对R㊁G㊁B 3个通道分离后,用三维图进行显示,寻找对于黄色车道线和白色车道线不同的过滤阈值,如图7所示㊂通过对每个通道的三维图像进行观察,选取相应的过滤阈值,对黄色车道线和白色车道线使用不同的阈值对R㊁G㊁B 进行过滤,得到对应的二值图像,如图8所示㊂(a )R 通道(b )G 通道(c )B 通道30025020015010050001002003000200400600800100012001400250200150100500R 通道黄色车道线白色车道线y 轴x 轴灰度值㊀3002502001501005000100200300200400600800100012001400250200150100500G 通道黄色车道线白色车道线y 轴x 轴灰度值㊀30025020015010050001002003000200400600800100012001400250200150100500B 通道黄色车道线白色车道线y 轴x 轴灰度值(d )R 通道三维图像(e )G 通道三维图像(f )B 通道三维图像图7㊀阈值选取Fig.7㊀Thresholdselection㊀㊀(a )原图(b )黄色车道线(c )白色车道线图8㊀二值图像Fig.8㊀Binary image㊀㊀在颜色过滤器获取的黄色车道线和白色车道线二值图像后,对不同颜色的车道线二值图像进行合并㊂由于过滤器将远处车道线的过滤较为模糊,以及会存在较小的噪声点,因此使用形态学相关算法对二值图像进一步处理㊂首先,由于二值图像中会存在干扰物等一些像素多的连通区域,因此对合并后的二值图像进行腐蚀与膨化操作㊂对像素特别少的连通区域进行线性腐蚀处理,可以达到滤除噪声的目的;接着,膨胀作为腐蚀的对偶运算,使用线性膨胀操作对腐蚀后的图像进行处理,补充之前连续残缺的车道线,便于后期获取车道线区域的质心㊂图像处理效果如图9所示㊂(a )合并二值图像(b )腐蚀操作(c )膨化操作图9㊀图像的腐蚀与膨化操作Fig.9㊀Image erosion and puffing operations92重庆工商大学学报(自然科学版)第40卷4.2㊀基于消失点选取车道线连通区域对二值图像中所有的连通区域,计算出每一个连通区域的最小矩阵㊁区域的质心和区域与x 轴的角度信息㊂车道线是图像中在一定角度范围内的线段[16],设置角度阈值筛选符合车道线角度的连通区域㊂通常取右车道线的角度范围为[20ʎ,80ʎ],左车道线的角度范围为[100ʎ,160ʎ]㊂根据图像中车道线在远处具有交点P (x 0,y 0)(同为消失点)的特性,计算出连通区域最小矩阵对角线的直线方程,求解直线与消失点同一水平位置的坐标点值B (x 1,y 1)㊂设置距离dis 阈值,对两点之间的距离进行限制,将满足dis 阈值范围内的连通区域作为车道线区域,如式(3)所示:|PB |=x 0-x 1()2+y 0-y 1()2save ㊀if(|PB |<dis )delete ㊀if(|PB |>dis )ìîíïïïï(3)通过车道线角度和消失点位置两重阈值限制,从二值图像中的连通区域中筛选出符合左车道线和右车道线的区域,并将左右车道线区域分别存储在L left 和L right 两个变量空间中㊂4.3㊀分割连通区域由于每一个区域只会存在一个质心,为了解决拟合点过少的情况,对连通区域按行分割㊂将大面积连通区域分割成小面积的连通区域,再求解出单个小连通区域的质心,增加拟合点的个数,便于拟合正确的车道线㊂设置分割区域的最小宽度split _dis 阈值,对连通区域宽度超过split _dis 阈值的区域进行分割,将分割出来的小面积连通区域再次获取区域质心,如图10所示㊂最后将分割后区域的质心重新映射到原区域,扩展原区域的标识点数量,便于正确拟合车道线㊂如图11所示㊂㊀㊀(a )裁剪图(b )分割区域获取质心图10㊀分割图像Fig.10㊀Split images(a )待分割区域(b )增加质心后的区域图11㊀处理大面积连通区域Fig.11㊀Handling large connected areas4.4㊀添加漏检区域在车道线角度和消失点约束条件下,筛选出的连通区域仍会出现漏检区域,因此检测漏检的区域,计算其质心数据,增加拟合的数据点,提高拟合车道线的精度㊂在车道线为虚线的情况下,图像中远处的车道线区域拍摄模糊,以及物体遮挡造成车道线残缺,都会影响车道线的正确检测㊂在车道线的二值图像中,远处车道线模糊和车道线残缺的情况呈现为凌乱的小面积区域,产生区域漏检现象㊂为了解决此问题,利用检测出的正确车道线区域对所有区域进行扫描㊂首先,对已经存储在L left 和L right 空间中数据点进行一次项拟合,获得两条直线方程的斜率(k left ㊁k right )与截距(b left ㊁b right )㊂接着,将每一个区域的质心C i (x i ,y i )的x i 坐标值分别代入左右直线方程,分别求解出y new 坐标值㊂最后,对左右直线方程分别求解的y new ,若符合阈值T add 的范围内,将该区域的质心数据添加到左右车道线的数据点存储空间中,如式(4)和式(5)所示㊂初次筛选出的车道线连通区域如图12(a)所示,添加漏检区域结果如图12(b)所示㊂y new =k left ˑx i +b leftL left +C i ㊀if(y new <T add ){(4)y new =k right ˑx i+b rightL right +C i ㊀if(y new <T add ){(5)(a )初次筛选区域(b )添加漏检区域图12㊀添加漏检的车道线区域Fig.12㊀Adding missed lane marking area3第5期胡昊东,等:一种基于消失点和颜色过滤器的车道线检测算法4.5㊀车道线拟合与数据存储最小二乘法是进行拟合直线常用的方法,它主要是计算最小化误差的平方和寻找最佳函数对数据点进行匹配㊂最小二乘法可以快捷简单的求得未知的数据,并使得真实数据与求得数据之间的误差的平方和达到最小,因此选用最小二乘法对获取的车道线数据进行拟合成直线㊂对于检测出的左右车道线数据点:(x 1,y 1),(x 2,y 2) (x n ,y n ),对于n 个数据点进行直线拟合时,直线方程的形式为y =a 1x +a 0,其中a 1㊁a 0为未知变量㊂如果把点(x i ,y i ),i =1,2, ,n ,分别代入直线方程,就会得到y i =a 1x i +a 0,对其形式进行变换,如式(6)所示:y i =(x i ㊀1)a 1a 0()(6)将左右车道线数据点:(x 1,y 1),(x 2,y 2) (x n ,y n )代入式(6)后的公式进行组合成矩阵的形式,如式(7)所示:y 1y 2︙y n æèçççççöø÷÷÷÷÷=x 11x 21︙︙x n1æèçççççöø÷÷÷÷÷a 1a 0()(7)将矩阵y 1y 2 y n ()T 设为Y ,x 1x 2 x n 111()T设为X ,a 1a 0()T 设为A ,则对公式进行变换求解拟合方程参数过程,如式(8)所示:Y =XA ⇒X TY =X TXA ⇒(X TX )-1X TY =(X TX )-1X TXA ⇒A =(X T X )-1X T Y(8)车道线在检测过程中会出现干扰物遮挡㊁车道线破损和车道线不明显等情况,这些情况会影响车道线的检测,造成车道线无法检测问题㊂为了解决此类问题,利用相邻两帧图像具有相似性,车道线偏离较小的特点,每次对检测到的车道线数据点进行存储,在车道线绘制之前,判断上一帧车道线拟合数据和当前帧车道线拟合数据之间的差距㊂若差距在接受范围内,则认为当前帧车道线检测准确,否则认为当前帧车道线检测错误,使用上一帧车道线检测数据,确保车道线拟合的准确性㊂最后,利用拟合的车道线数据在图像中绘制车道线,辅助车辆进行智能行驶和避险㊂车道线绘制结果如图13所示㊂(a )二值图像中绘制车道线(b )裁剪图像中绘制车道线(c )原图像中绘制车道线图13㊀车道线绘制Fig.13㊀Lane drawing5㊀实验结果与分析进行车道线检测所使用的系统配置为Intel (R)Core(TM)i5-10505CPU @3.20GHz 3.20GHz㊂为了测试车道线检测算法的有效性,选用了西安交通大学数据集和高速公路车载记录视频进行仿真测试,选用的数据集中存在光照亮度低㊁车道线被部分遮挡和路面标志干扰等一些常见行车环境㊂仿真效果如图14所示㊂从图中可以看出经过算法进行车道线检测可以在相对复杂的行车环境下准确的拟合出车道线,具有较好的鲁棒性㊂(a )正常光照情况(b )光照亮度低13重庆工商大学学报(自然科学版)第40卷(c )车道线被部分遮挡(d )中间有标识干扰(e )夜晚环境(f )存在黄色宽车道线图14㊀车道线检测仿真效果Fig.14㊀Simulation effect of lane detection由表1的ROI 占比得出,使用ROI 自适应选取方法更能准确的选取包含车道线信息的部分图像,采用预测消失点位置选取ROI 的方法,比使用固定ROI 选取比例更加过滤除车道线以上的无关信息部分,以此提高车道线检测的准确率㊂表1㊀高速公路图像ROI 占原图像的百分比Table 1㊀The percentage of highway image ROI in theoriginal image算㊀法图像大小标准ROI 区域大小检测ROI 区域大小检测ROI 占原图像的百分比/%标准ROI 占检测ROI的百分比/%经典霍夫算法1280ˑ7201280ˑ2951280ˑ36050.0081.94本文算法1280ˑ7201280ˑ2951280ˑ29841.3898.99此外,针对选用的车道线测试图像,使用经典霍夫车道线检测作为对比实验进行数据分析,表2和表3为不同算法在同一实验平台下的仿真数据结果㊂由表2和表3数据结果可得,与经典霍夫算法相比,检测方法在平均准确率上具有明显的提高㊂与经典霍夫算法的检测平均准确率86.91%相比,算法的车道线检测平均准确率达到了94.14%,平均准确率提升了7.23%,具有很好的检测效果㊂算法各个步骤的平均运行时间如表4所示,感兴趣区域范围提取阶段的用时只在整体算法最开始阶段消耗,待感兴趣区域确定后,每帧图像处理并不需要再次进入ROI 提取阶段,对图像进行车道线检测的总时长主要为车道线数据点检测时长和车道线拟合时长之和,因此算法对车道线检测的运行效率能够满足实时性的要求㊂表2㊀算法仿真数据Table 2㊀Simulation data of the algorithm数据来源帧㊀数车道线检测结果正确检测误检结果漏检结果平均准确率/%西安交通大学数据集420738902843392.46高速公路车载视频125512523099.76总㊀共546251422873394.14表3㊀经典霍夫算法仿真数据Table 3㊀Classic Hough algorithm simulation data数据来源帧㊀数车道线检测结果正确检测误检结果漏检结果平均准确率/%西安交通大学数据集420736904803787.71高速公路车载视频12551057197184.22总㊀共546247576773886.91表4㊀算法各步骤平均运行时间Table 4㊀The average running time of each stepof the algorithm算法步骤平均运行时间/ms感兴趣区域范围提取阶段570.7车道线数据点检测阶段232.1车道线拟合阶段34.2车道线检测算法330.36㊀结㊀论选用颜色过滤器直接将图像处理成二值图像,完成车道线和路沿的区分,并利用车道线消失点和角度特征进行约束,获取多场景中车道线区域的质心数据,23第5期胡昊东,等:一种基于消失点和颜色过滤器的车道线检测算法实现对不同场景图像中车道线的拟合㊂实验结果表明:在西安交通大学数据集和高速公路车载视频的仿真测试实验中,车道线检测平均准确率达到94.14%,相比于经典霍夫算法平均准确率提升7.23%,并且可在复杂场景中准确检测车道线㊂检测算法具有一定的鲁棒性和实时性,在之后的工作中,将进一步研究选用合适的模型对弯曲车道线进行标识,继续提高算法应用场景的复杂度㊂参考文献References1 ㊀PARK S K KIM B S JEONG S H et ne estimationusing lateral histogram in radar based ACC system C// European Radar Conference 2014.2 ㊀BENGLER K DIETMAYER K FARBER B et al.Threedecades of driver assistance systems review and future perspectives J .IEEE Intelligent Transportation Systems Magazine 2014 6 4 6 22.3 ㊀CHAO M MEI X.A method for lane detection based on colorclustering C//Third International Conference on Knowledge Discovery Data Mining 2010.4 ㊀武历颖余强.一种快速准确非结构化道路检测方法研究J .计算机仿真2016 33 9 174 178.WU Li-ying YU Qiang.A fast and accurate detection method of unstructured road J .Computer Simulation 2016 339174 178.5 ㊀DINAKARAN K STEPHEN A S KABILESH S K et al.Advanced lane detection technique for structural highway based on computer vision algorithm J .Materials Today Proceedings 2021 45 2 2073 2081.6 ㊀LEE C MOON J H.Robust lane detection and tracking forreal-time applications J .IEEE Transactions on Intelligent Transportation Systems 2018 19 12 4043 4048.7 ㊀ZHAO K MEUTER M NUNN C et al.A novel multi-lanedetection and tracking system C//Intelligent Vehicles Symposium IV 2012.8 ㊀WANG Y TEOH E K SHEN ne detection and trackingusing B-snake J .Image and Vision Computing 2004 224 269 280.9 ㊀王杰陈黎卿黄莉莉等.基于Retinex的弱光条件下车道线识别方法J .计算机与数字工程2019 472451 456.WANG Jie CHEN Li-qing HUANG Li-li et ne recognition method in weak light condition based on Retinex J .Computer&Digital Engineering 2019 472451 456.10 孙建波张叶常旭岭.基于改进Mask R-CNN+LaneNet的车载图像车辆压线检测J .光学精密工程2022 30 7 854 868.SUN Jian-bo ZHANG Ye CHANG Xu-ling.Vehicle pressure line detection based on improved mask R-CNN+ LaneNet J .Optics and Precision Engineering 2022 307 854 868.11 SATTI S K DEVI K S DHAR P et al.A machine learning approach for detecting and tracking road boundary lanes J . ICT Expres 2021 7 1 99 103.12 SUN T Y HUANG W C.Embedded vehicle lane-marking tracking system C//IEEE International Symposium on Consumer Electronics 2009.13 赵芳周旺辉陈岳涛等.改进的Canny算子在裂缝检测中的应用J .电子测量技术2018 41 20 107 111. ZHAO Fang ZHOU Wang-hui CHEN Yue-tao et al. Application of improved canny operator in crack detection J . Electronic Measurement Technology 2018 4120107 111.14 DUDA R O HART P e of the Hough transform to detect lines and curves in pictures.J .Communications of the ACM 1972 15 1 11 15.15 JIAO X YANG D JIANG K et al.Real-time lane detection and tracking for autonomous vehicle applications J . Proceedings of the Institution of Mechanical Engineers Part D Journal of Automobile Engineering 2019 23392301 2311.16 ANDRADE D C BUENO F FRANCO F R et al.A novel strategy for road lane detection and tracking based on a vehicle s forward monocular camera J .IEEE Transactions on Intelligent Transportation Systems 2019 2041497 1507.责任编辑:田㊀静33。
基于机器视觉的二维图像质量缺陷检测研究进展
包装工程第44卷第23期·198·PACKAGING ENGINEERING2023年12月基于机器视觉的二维图像质量缺陷检测研究进展张德海1,祝志逢1,李艳芹1,黄子帆1,马选雄1,许宸语1,刘祥2(1.郑州轻工业大学机电工程学院,郑州450002;2.中标防伪印务有限公司,北京102218)摘要:目的机器视觉图像处理技术是近年在图像处理领域发展起来的一门新兴边缘交叉学科,二维图像的质量检测是印刷行业中必不可少的环节,分析基于机器视觉的二维图像质量缺陷检测流程,探索影响基于机器视觉的二维图像质量缺陷检测精度的相关因素,为后续研究印刷品的二维图像自动化检测和质量控制提供参考。
方法在此基础上,围绕图像预处理中的灰度转换、噪声过滤、固定阈值分割、自适应阈值分割、Otsu法及边缘检测,对图像配准中的基于灰度统计信息分布配准方法、基于特征的图像配准方法进行总结,然后归纳分析图像的缺陷提取和分类。
结论以实际例子对上述研究内容进行了提炼,通过图像预处理中的噪声过滤为后续缺陷提取提供清晰图像,减少伪影干扰;通过图像预处理中的灰度变换、阈值分割、感兴趣区域提取减少系统处理时间,为实现高效的缺陷检测奠定了坚实的基础;通过图像配准消除了机械振动引起的图像位置偏移,确保后续缺陷提取的准确性;通过图像缺陷提取和分类帮助印刷企业找出生产问题,提供有针对性的改进措施,可为生产高质量产品提供支持。
关键词:机器视觉;印刷质量;缺陷检测;图像处理中图分类号:TB487 文献标识码:A 文章编号:1001-3563(2023)23-0198-10DOI:10.19554/ki.1001-3563.2023.23.024Research Progress of Two-dimensional Image Quality Defect Detection Based onMachine VisionZHANG De-hai1, ZHU Zhi-feng1, LI Yan-qin1, HUANG Zi-fan1, MA Xuan-xiong1, XU Chen-yu1, LIU Xiang2(1. School of Electrical and Mechanical Engineering, Zhengzhou University of Light Industry,Zhengzhou 450002, China; 2. ZhongBiao Anti-counterfeiting Printing Co., Ltd., Beijing 102218, China)ABSTRACT: Machine vision image processing technology is an emerging fringe cross-disciplinary discipline developed in the field of image processing in recent years, and the quality inspection of two-dimensional images is an essential link in the printing industry, analyzing the quality defect detection process of two-dimensional images based on machine vision, exploring the relevant factors affecting the accuracy of two-dimensional image quality defect detection based on machine vision, and providing references for the subsequent research and development of automated inspection and quality control of two-dimensional images of printed materials. On this basis, around the gray scale conversion, noise filtering, fixed threshold segmentation, adaptive threshold segmentation, Otsu method and edge detection in image preprocessing, the gray scale statistical information distribution based alignment method and feature based image alignment method in image alignment were summarized, and then the defect extraction and classification of images were summarized and analyzed. The above research content is refined with practical examples. Noise filtering in image preprocessing is used to provide clear images for subsequent defect extraction and reduce artifact interference. Gray scale收稿日期:2023-09-12基金项目:国家自然科学基金青年项目(52006201);国家自然科学基金面上项目(52275295);郑州轻工业大学横向项目(JDG20210045)第44卷第23期张德海,等:基于机器视觉的二维图像质量缺陷检测研究进展·199·transformation, threshold segmentation and region of interest extraction in image preprocessing are used to reduce system processing time, laying a solid foundation for efficient defect detection. The image location offset caused by mechanical vibration is eliminated by image registration to ensure the accuracy of subsequent defect extraction. Image defect extraction and classification can help printing companies find production problems and provide targeted improvement measures for the production of high-quality products, thus providing important support.KEY WORDS: machine vision; printing quality; defect detection; image processing印刷缺陷是印刷产品生产中不可避免出现的问题,其缺陷类型主要包括刀丝、划痕、漏白、漏印、飞墨、污渍、套印不准等。
LTM9011-14, LTM9010-14, LTM9009-14, LTM9008-14, LT
1dc1884afaDescriptionLTM9009-14, LTM9008-14, LTM9007-14,LTM9006-14: 14-Bit, 125/105/80/65/40/25Msps Octal ADC FamilyDC1884 supports the LTM ®9011 high speed, octal ADC family.The versions of the 1884A demo board are listed in Table 1. Depending on the required resolution and sample rate, the DC1884 is supplied with the appropriate ADC. The circuitry on the analog inputs is optimized for analog inputPARAMETERCONDITIONVALUESupply Voltage – DC1884A Depending on Sampling Rate and the A/D Converter Provided, This Supply Must Provide Up to 700mA Optimized for 3.5V[3.3V to 6V Minimum/Maximum]Analog Input Range Depending on SENSE Pin Voltage 1V P-P to 2V P-P Logic Input VoltagesMinimum Logic High 1.3V Maximum Logic Low0.6VLogic Output Voltages (Differential)Nominal Logic Levels (100Ω Load, 3.5mA Mode)350mV/1.25V Common Mode Minimum Logic Levels (100Ω Load, 3.5mA Mode)247mV/1.25V Common Mode Sampling Frequency (Convert Clock Frequency)See Table 1Encode Clock Level Single-Ended Encode Mode (ENC – Tied to GND)0V to 3.6V Encode Clock Level Differential Encode Mode (ENC – Not Tied to GND)0.2V to 3.6VResolutionSee Table 1Input Frequency Range See Table 1SFDR See Applicable Data Sheet SNRSee Applicable Data Sheetperformance summaryfrequencies from 1MHz to 70MHz. Refer to the data sheet for proper input networks for different input frequencies. Design files for this circuit board are available at http://www.linear .com/demo/DC1884AL , L T , L TC, L TM, Linear Technology and the Linear logo are registered trademarks of Linear Technology Corporation. All other trademarks are the property of their respective owners.Table 1. DC1884 VariantsDC1884 VARIANTSADC PART NUMBER RESOLUTION MAXIMUM SAMPLE RATEINPUT FREQUENCY 1884A-A LTM9011-1414-Bit 125Msps 1MHz to 70MHz 1884A-B LTM9010-1414-Bit 105Msps 1MHz to 70MHz 1884A-C LTM9009-1414-Bit 80Msps 1MHz to 70MHz 1884A-D LTM9008-1414-Bit 65Msps 1MHz to 70MHz 1884A-E LTM9007-1414-Bit 40Msps 1MHz to 70MHz 1884A-FLTM9006-1414-Bit25Msps1MHz to 70MHz(T A = 25°C)2dc1884afaQuick start proceDureDC1884 is easy to set up to evaluate the performance of the LTM9011 family of A/D converters. For proper mea-surement equipment setup, refer to Figure 1 and follow the procedure explained in the following sections.Figure 1. Test Setup of DC1884DC1884A F01ANALOGINPUTSJUMPERS SHOWN INQuick start proceDureSetupIf a DC1371 data acquisition and collection system was supplied with the DC1884, follow the DC1371 Quick Start Guide to install the required software and to connect the DC1371 to the DC1884 and to a PC.DC1884 Board JumpersThe DC1884 board should have the following jumper set-tings as default positions (as per Figure 1):JP14: PAR/SER: Selects Parallel or Serial Programming Mode. (Default: Serial)Optional Jumpers AJ3 and J6: Term: Enables/Disable Optional Output Termination. (Default: Removed)JP5: I LVDS: Selects Either 1.75mA or 3.5mA of Output Current for the LVDS Drivers. (Default: Removed)JP1 and JP2: Lane: Selects Either 1-L ane or 2-L ane Output Modes (Default: Removed)Note: The DC1371 does not support 1-Lane operation. JP9: SHDN: Enables and Disables the LTM9011. (Default: Removed)JP8: WP: Enable/Disables Write Protect for the EEPROM. (Default: Removed)Note: Optional jumper should be left open to ensure proper serial configuration.Applying Power and Signals to the DC1884The DC1371 is used to acquire data from the DC1884. The DC1371 must first be connected to a powered USB port and have 5V applied power before applying 3.5V across the pins marked V+ and “GND” on the DC1884. DC1884 requires 3.5V for proper operation.The DC1884 demonstration circuit requires up to 700mA depending on the sampling rate and the A/D converter supplied.The DC1884 should not be removed or connected to the DC1371 while power is applied.Analog Input NetworkFor optimal distortion and noise performance, the RC network on the analog inputs may need to be optimized for different analog input frequencies. For input frequen-cies above 70MHz, refer to the LTM9011 data sheet for a proper input network.In almost all cases, filters will be required on both analog input and encode clock to provide data sheet SNR. The filters should be located close to the inputs to avoid reflections from impedance discontinuities at the driven end of a long transmission line. Most filters do not present 50Ω outside the passband. In some cases, 3dB to 10dB pads may be required to obtain low distortion.If your generator cannot deliver full-scale signals without distortion, you may benefit from a medium power amplifier based on a gallium arsenide gain block prior to the final filter. This is particularly true at higher frequencies where IC based operational amplifiers may be unable to deliver the combination of low noise figure and high IP3 point required. A high order filter can be used prior to this final amplifier, and a relatively lower Q filter used between the amplifier and the demonstration circuit.Encode ClockNote: Apply an encode clock to the SMA connector on the DC1884 demonstration circuit board marked “J2 CLK+.” As a default, the DC1884 is populated to have a single-ended input.For the best noise performance, the ENCODE input must be driven with a very low jitter, square wave source. The amplitude should be large, up to 3V P-P or 13dBm. When using a sinusoidal signal generator, a squaring circuit can be used. Linear Technology also provides demo board DC1075A that divides a high frequency sine wave by four, producing a low jitter square wave for best results with the LTM9011.3dc1884afaQuick start proceDureUsing bandpass filters on the clock and the analog input will improve the noise performance by reducing the wideband noise power of the signals. In the case of the DC1884 a bandpass filter used for the clock should be used prior to the DC1075A. Data sheet FFT plots are taken with 10 pole LC filters made by TTE (Los Angeles, CA) to suppress signal generator harmonics, non harmonically related spurs and broadband noise. L ow phase noise Agilent 8644B generators are used for both the clock input and the analog input.Digital OutputsData outputs, data clock, and frame clock signals are available on J1 of the DC1884. This connector follows the VITA-57/FMC standard, but all signals should be verified when using an FMC carrier card other than the DC1371. SoftwareThe DC1371 is controlled by the PScope system software that can be downloaded from the L inear Technology website at /software/.To start the data collection software, “PScope.exe,” which is installed by default to \Program Files\LTC\PScope\, double click the PScope icon or bring up the run window under the start menu, and browse to the PScope directory and select “PScope.”If the DC1884 is properly connected to the DC1371, PScope should automatically detect the DC1884 and configure itself accordingly.If everything is hooked up properly and a powered and suitable convert clock is present, clicking the “Collect” button should result in time and frequency plots displayed in the PScope window. Additional information and help for PScope is available in the DC1371 Quick Start Guide, and in the online help feature within the PScope program itself.Serial ProgrammingPScope has the ability to program the DC1884 serially through the DC1371. There are several options available in the LTM9011 family that are only available through serially programming. PScope allows all of these features to be tested.These options are available by first clicking on the “Set Demo Bd Options” icon on the PScope toolbar (Figure 2). This will bring up the menu shown in Figure 3.Figure 2. PScope ToolbarFigure 3. Demo Board Configuration Options4dc1884afa5dc1884afaInformation furnished by L inear Technology Corporation is believed to be accurate and reliable. However , no responsibility is assumed for its use. Linear Technology Corporation makes no representa-tion that the interconnection of its circuits as described herein will not infringe on existing patent rights.Quick start proceDureThis menu allows any of the options available for the LTM9011 family to be programmed serially. The LTM9011 family has the following options:Randomizer : Enables Data Output Randomizer .• Off (Default): Disables data output randomizer . • On: Enables data output randomizer .T wo’s Complement : Enables T wo’s Complement Mode.• Off (Default): Selects offset binary mode.• On: Selects two’s complement mode.Sleep Mode : Selects Between Normal Operation and Sleep Mode.• Off (Default): Entire ADC is powered and active. • On: The entire ADC is powered down.Channel 1 Nap : Selects Between Normal Operation and Putting Channel 1 in Nap Mode.• Off (Default): Channel 1 is active.• On: Channel 1 is in nap mode.Channel 2 Nap : Selects Between Normal Operation and Putting Channel 2 in Nap Mode.• Off (Default): Channel 2 is active.• On: Channel 2 is in nap mode.Channel 3 Nap : Selects Between Normal Operation and Putting Channel 3 in Nap Mode.• Off (Default): Channel 3 is active.• On: Channel 3 is in nap mode.Channel 4 Nap : Selects Between Normal Operation and Putting Channel 4 in Nap Mode.• Off (Default): Channel 4 is active.• On: Channel 4 is in nap mode.Output Current : Selects the LVDS Output Drive Current.• 1.75mA (Default): LVDS output driver current.• 2.1mA: LVDS output driver current.• 2.5mA: LVDS output driver current.• 3.0mA: LVDS output driver current.• 3.5mA: LVDS output driver current.• 4.0mA: LVDS output driver current.• 4.5mA: LVDS output driver current.Internal Termination : Enables L VDS Internal Termination.• Off (Default): Disables internal termination.• On: Enables internal termination.Outputs : Enables Digital Outputs.• Enabled (Default): Enables digital outputs. • Disabled: Disables digital outputs.Test P attern : Selects Digital Output Test Patterns. The desired test pattern can be entered into the text boxes provided.• Off (Default): ADC input data is displayed.• On: Test pattern is displayed.Once the desired settings are selected, click “OK” and PScope will automatically update the register of the device on the DC1884 demo board.6dc1884afaLinear Technology Corporation1630 McCarthy Blvd., Milpitas, CA 95035-7417(408) 432-1900 ● FAX : (408) 434-0507 ● www.linear .comLINEAR TECHNOLOGY CORPORA TION 2012LT 0117 • PRINTED IN USADEMONSTRATION BOARD IMPORTANT NOTICELinear Technology Corporation (LTC) provides the enclosed product(s) under the following AS IS conditions:This demonstration board (DEMO BOARD) kit being sold or provided by Linear Technology is intended for use for ENGINEERING DEVELOPMENT OR EVALUATION PURPOSES ONLY and is not provided by LTC for commercial use. As such, the DEMO BOARD herein may not be complete in terms of required design-, marketing-, and/or manµFacturing-related protective considerations, including but not limited to product safety measures typically found in finished commercial goods. As a prototype, this product does not fall within the scope of the European Union directive on electromagnetic compatibility and therefore may or may not meet the technical requirements of the directive, or other regulations.If this evaluation kit does not meet the specifications recited in the DEMO BOARD manual the kit may be returned within 30 days from the date of delivery for a full refund. THE FOREGOING WARRANTY IS THE EXCLUSIVE WARRANTY MADE BY THE SELLER TO BUYER AND IS IN LIEU OF ALL OTHER WARRANTIES, EXPRESSED, IMPLIED, OR STATUTORY, INCLUDING ANY WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. EXCEPT TO THE EXTENT OF THIS INDEMNITY, NEITHER PARTY SHALL BE LIABLE TO THE OTHER FOR ANY INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES.The user assumes all responsibility and liability for proper and safe handling of the goods. Further, the user releases LTC from all claims arising from the handling or use of the goods. Due to the open construction of the product, it is the user’s responsibility to take any and all appropriate precautions with regard to electrostatic discharge. Also be aware that the products herein may not be regulatory compliant or agency certified (FCC, UL, CE, etc.).No License is granted under any patent right or other intellectual property whatsoever. LTC assumes no liability for applications assistance, customer product design, software performance, or infringement of patents or any other intellectual property rights of any kind.LTC currently services a variety of customers for products around the world, and therefore this transaction is not exclusive .Please read the DEMO BOARD manual prior to handling the product . Persons handling this product must have electronics training and observe good laboratory practice standards. Common sense is encouraged .This notice contains important safety information about temperatures and voltages. For further safety concerns, please contact a LTC application engineer.Mailing Address:Linear Technology 1630 McCarthy pitas, CA 95035Copyright © 2004, Linear Technology Corporation。
一种监控摄像机干扰检测模型
引言监控视频是犯罪案件调查中一项重要的信息来源与破案依据,但犯罪者极有可能通过对摄像机进行人为干预甚至破坏来掩盖其可疑活动,因此,如何有效地检测监控摄像机干扰事件具有重要的应用价值。
在目前的监控摄相机干扰检测方法中,特殊场景误检测依旧是巨大的挑战,如:照明变化、天气变化、人群流动、大型物体通过等。
对于一些纹理较少或无纹理背景的、黑暗或低质量的监控视频来说,大多数检测方法都会将其误识别为散焦干扰;对于使用带纹理物体进行镜头遮挡的情况,当遮挡物的灰度和亮度与图像背景非常相似时,无法将遮挡物与图像背景区分。
除此之外,物体缓慢地遮挡镜头也是一个具有挑战性的检测问题。
本文使用深度神经网络建立检测模型,利用改进的ConvGRU (Convolutional Gated Recurrent Unit )提取视频的时序特征和图像的空间全局依赖关系,结合Siamese 架构,提出了SCG 模型。
一种监控摄像机干扰检测模型刘小楠,邵培南(中国电子科技集团公司第三十二研究所,上海201808)摘要为减少监控干扰检测中因特殊场景引起的误检测,文中提出一种基于Siamese 架构的SCG(Siamese with Convolutional Gated Recurrent Unit )模型,利用视频片段间的潜在相似性来区分特殊场景与干扰事件。
通过在Siamese 架构中融合改进ConvGRU 网络,使模型充分利用监控视频的帧间时序相关性,在GRU 单元间嵌入的非局部操作可以使网络建立图像空间依赖响应。
与使用传统的GRU 模块的干扰检测模型相比,使用改进的ConvGRU 模块的模型准确率提升了4.22%。
除此之外,文中还引入残差注意力模块来提高特征提取网络对图像前景变化的感知能力,与未加入注意力模块的模型相比,改进后模型的准确率再次提高了2.49%。
关键词Siamese ;ConvGRU ;Non-local block ;相机干扰;干扰检测中图分类号TP391文献标识码A文章编号1009-2552(2021)01-0090-07DOI 10.13274/ki.hdzj.2021.01.016A surveillance camera tampering detection modelLIU Xiao -nan,SHAO Pei -nan(The 32nd Research Institute of China Electronics Technology Group Corporation ,Shanghai 201808,China )Abstract :This paper proposes an SCG model based on Siamese network to reduce the detection error caused by some special scenes in camera tampering detection.The model can use the potential similarity be⁃tween video clips to distinguish special scenes from tampering events.The improved ConvGRU network is in⁃tegrated to capture the temporal correlation between the frames of surveillance video.We embed non -local block s between GRU cells simultaneously,so the model can establish the spatial dependence of the image.The improved ConvGRU network improves model performance by 4.22%.We also add residual attention mod⁃ule to improve the perception ability of the model to the change of image foreground,this again improves the accuracy of the model by 2.49%.Key words :Siamese ;ConvGRU ;Non-local block ;camera tampering ;tampering detection作者简介:刘小楠(1994-),女,硕士研究生,研究方向为计算机视觉、深度学习。
西塘景点介绍作文英语
西塘景点介绍作文英语Title: Exploring the Charm of Xitang: A Tourist's Guide。
Xitang, a picturesque water town nestled in the heartof China's Zhejiang Province, boasts a rich tapestry of history, culture, and natural beauty. With its ancient architecture, serene waterways, and tranquil ambiance, Xitang has become a beloved destination for travelers seeking to immerse themselves in the essence of traditional Chinese life. In this guide, we will delve into the captivating attractions that make Xitang a must-visit destination.### Historical Landmarks:1. Mingtang Hall: One of Xitang's most prominent landmarks, Mingtang Hall is a magnificent structure dating back to the Ming and Qing dynasties. Adorned with intricate carvings and elegant architecture, this hall served as a gathering place for scholars and officials during ancienttimes.2. Yuehe Ancient Street: Strolling along Yuehe Ancient Street is like stepping back in time. Lined with well-preserved buildings from the Ming and Qing dynasties, this cobblestone street exudes an old-world charm. Visitors can explore quaint shops selling handicrafts, traditional snacks, and souvenirs.3. Shipi Lane: Renowned for its timeless beauty, Shipi Lane is a narrow alleyway flanked by ancient houses and winding canals. As sunlight filters through the tiled roofs and wooden lattices, it creates a mesmerizing interplay of light and shadow, perfect for photography enthusiasts.### Scenic Waterways:1. Wuzhen River: The lifeblood of Xitang, the Wuzhen River meanders through the town, offering enchanting views at every turn. Visitors can embark on leisurely boat rides along the river, admiring the historic buildings and lush greenery that line its banks.2. Eight Bridges: Spanning the tranquil waters of Xitang, the Eight Bridges are iconic symbols of the town's architectural prowess. Each bridge is unique in design, reflecting the artistic sensibilities of the craftsmen who built them centuries ago. Crossing these bridges is a delightful experience that provides unparalleled vistas of Xitang's scenic beauty.### Cultural Heritage:1. Shadow Play: Xitang is renowned for its vibrant cultural scene, with traditional art forms such as shadow play taking center stage. Performances are held regularly in the town's theaters, allowing visitors to witness this ancient storytelling tradition come to life through intricately crafted puppets and vivid storytelling.2. Tea Culture: Tea enthusiasts will delight inXitang's thriving tea culture, which dates back centuries. Visitors can participate in tea ceremonies, where skilled artisans demonstrate the art of brewing and serving teawith grace and precision. Sample local specialties such as Longjing tea, renowned for its delicate flavor and fragrant aroma.### Culinary Delights:1. Local Cuisine: Xitang's culinary scene is a feastfor the senses, with an array of delectable dishes showcasing the region's culinary heritage. From savory braised pork belly to crispy fried river fish, there's something to satisfy every palate. Don't miss the opportunity to dine at a waterfront restaurant, where you can savor delicious meals while enjoying panoramic views of the river.2. Street Food: For a taste of authentic street food, head to Xitang's bustling marketplaces, where vendors sell an array of tantalizing snacks and delicacies. Sample crispy fried dumplings, savory scallion pancakes, and sweet rice cakes, all prepared fresh and bursting with flavor.### Conclusion:In conclusion, Xitang offers a captivating blend of history, culture, and natural beauty that beckons travelers from near and far. Whether you're exploring ancient landmarks, cruising along scenic waterways, or savoring local delicacies, the charm of Xitang is sure to leave a lasting impression. Embark on a journey of discovery and immerse yourself in the timeless allure of this enchanting water town.。
基于面部多特征跨层融合网络的驾驶员疲劳检测方法
第38卷第6期2023年12月安 徽 工 程 大 学 学 报J o u r n a l o fA n h u i P o l y t e c h n i cU n i v e r s i t y V o l .38N o .6D e c .2023文章编号:1672-2477(2023)06-0064-08收稿日期:2023-06-13基金项目:安徽省高校优秀青年基金项目(2023A H 030020)作者简介:徐文奇(1991-),男,浙江宁波人,助理实验师,硕士㊂通信作者:胡耀聪(1992-),男,安徽芜湖人,讲师,博士㊂基于面部多特征跨层融合网络的驾驶员疲劳检测方法徐文奇,胡耀聪*(安徽工程大学电气工程学院,安徽芜湖 241000)摘要:针对现有驾驶员疲劳检测很大程度依赖于局部疲劳相关信息提取而导致检测准确度不足的问题,本文提出了一种基于面部多特征融合的驾驶员疲劳检测算法,能够对整体面部疲劳状态进行特征学习,从而实现更精确的驾驶员疲劳状态检测㊂提出的驾驶员人脸疲劳检测算法包含3个步骤:首先使用M T C N N 网络检测面部关键点并截取脸部㊁眼部㊁嘴部图像区域;其次设计一种面部多特征跨层融合网络,实现不同面部区域之间的信息交互与疲劳相关特征提取,进而通过多标签分类对单帧图像面部疲劳相关属性进行识别;最后使用L S TM 对长时间序列进行建模,实现最终的驾驶员疲劳状态检测㊂本文提出的驾驶员疲劳检测算法在N T HU -D D D 数据集进行了测试,对比实验验证了该方法的可行性和有效性㊂关 键 词:疲劳相关信息;多特征跨层融合;多标签分类;长时间序列中图分类号:T P 391.41 文献标志码:A伴随公共交通的快速发展和车辆数量的指数级增长,交通安全已成为世界各地亟待解决的问题㊂世界卫生组织近来调查显示,全球由交通事故导致死亡人数每24秒新增1例,每年由车祸导致死亡人数超13万[1]㊂由司机长时间驾驶或睡眠不足导致的疲劳驾驶是造成交通事故死亡的重要原因之一㊂因此,驾驶员疲劳检测的研究对智能交通系统具有重要意义[2-4]㊂计算机视觉算法是基于视频的疲劳驾驶检测系统的核心技术㊂近期研究中,学者们已经提出了几种算法来实现疲劳驾驶检测㊂通常来说,一个完整的驾驶员疲劳检测框架主要包含以下3个步骤:(1)人脸检测:通过目标检测器逐帧检测驾驶员面部并对关键点进行定位;(2)特征提取:通过传统的特征描述子[5-7]或深度学习模型[8-12]来学习与疲劳驾驶相关的信息;(3)疲劳判定:依赖帧间信息判别驾驶员疲劳程度㊂面部特征提取是驾驶员疲劳检测的关键,然而现有的方法通常仅关注局部区域的疲劳相关属性,例如眼睑闭合时间(P e r c e n t a g e o f E y e l i dC l o s u r e ,P E R C L O S )[13]㊁嘴角张合比(M o u t hA s pe c tR a t i o ,MA R )[14]等,而忽略了全局面部特征表示,致使疲劳检测精度较低㊂为了解决这个问题,本文设计了面部多特征跨层融合网络来进行精确的驾驶员疲劳检测,首先使用MT C N N 网络检测面部关键点并截取脸部㊁眼部㊁嘴部图像区域;其次设计一种面部多特征跨层融合网络,实现不同面部区域之间的信息交互与疲劳相关特征提取,进而通过多标签分类对单帧图像的面部疲劳相关属性进行识别;最后使用L S T M 对长时间序列进行建模,实现最终的驾驶员疲劳状态检测㊂1 疲劳检测算法1.1 面部关键点检测本研究采用多任务级联卷积神经网络(M u l t i -t a s kC a s c a d eC o n v o l u t i o n a lN e u r a lN e t w o r k ,MT C N N )模型[15]进行人脸关键点检测,它包含3个子网络:P -N e t ㊁R -N e t 和O -N e t,模型结构如图2所示㊂具体来说,MT C N N 模型的推理流程包含以下步骤:(1)对输入图像进行缩放操作,设定缩放因子为γ,并将原始图像以{1,γ,γ2, ,γn}的比例进行缩放,从而生成一组不同尺度的图像㊂(2)P -N e t:采用全卷积神经网络结构,用于初步标定人脸边界框㊂通过3个浅层卷积提取面部特征,粗略地搜索人脸候选区域㊂(3)R -N e t :包含3个卷积层和1个全连接层,用于进一步剔除错误和重复的人脸框,其输入为P -N e t 检测出的候选区域,采用卷积层实现特征细化,最后通过回归判定出候选区域中是否包含人脸㊁人脸中心偏移量㊁人脸关键点坐标㊂(4)O -N e t :包含4个卷积层和1个全连接层,用于输出最终的人脸关键点检测结果㊂MT C -N N 模型可以准确地检测出视频图像中的人脸区域并定位出关键点,为后续的疲劳特征提取和疲劳状态检测打下了基础㊂图1 驾驶员人脸疲劳检测算法整体框架图2 M T C N N 模型结构示意图1.2 面部疲劳属性识别疲劳状态主要表现在面部全局纹理㊁眼睑闭合程度㊁嘴角张合程度等相关属性㊂面部多特征跨层融合网络模型如图3所示㊂由图3可知,此网络模型包含脸部特征提取网络分支F -b r a n c h ㊁眼部特征提取网络分支E -b r a n c h ㊁嘴部特征提取网络分支M -b r a n c h ,分别从人脸㊁眼睛和嘴巴3个图像区域学习疲劳相关信息,对应的输入尺寸为:128×128㊁64×64㊁64×64㊂特征提取网络分支F -b r a n c h ㊁E -b r a n c h 和M -b r a n c h借鉴了M o b i l e N e t -V 2网络结构,有效地平衡了模型性能与计算复杂度㊂具体来说,它采用了深度可分离卷积运算,即将标准的3×3卷积分操作拆分为逐点卷积(P o i n t w i s eC o n v o l u t i o n )和逐通道卷积(C h a n n e l -w i s eC o n v o l u t i o n ),降低了模型的参数量和浮点计算数;其次瓶颈层采用倒置残差模块(I n v e r t e dR e s i d u -a l sB l o c k )结构,即先通道升维,后通道降维的策略,在保持网络深度的同时,增加了特征图的维度;此外,倒置残差模块的最后1个卷积层使用线性激活函数代替R e L U 激活函数,用于解决特征丢失与梯度弥散问题㊂面部多特征跨层融合网络模型能够学习不同面部区域的疲劳相关信息,具体运算过程可以定义为:F (l )f a c e =F b r a n c h (F (l -1)f a c e |θ(l )f a c e )=θ(l )f a c e ×F (l -1)f a c e ,(1)F (l )e y e s =E b r a n c h (F (l -1)e y e s |θ(l )e y e s )=θ(l )e y e s ×F (l -1)e ye s ,(2)㊃56㊃第6期徐文奇,等:基于面部多特征跨层融合网络的驾驶员疲劳检测方法F (l )m o u t h =M b r a n c h (F (l -1)m o u t h |θ(l )m o u t h )=θ(l )m o u t h ×F (l -1)m o u t h ,(3)其中,F (l )f a c e ㊁F (l )e ye s ㊁F (l )m o u t h 分别表示面部㊁眼部㊁嘴部网络分支在第l 层提取的特征图;θ(l )f a c e ㊁θ(l )e ye s ㊁θ(l)m o u t h 表示其对应的相关参数㊂图3 面部多特征跨层融合网络模型需要注意的是,为了进一步促进不同面部区域的疲劳相关信息交互,面部多特征跨层融合网络采用了跨层连接单元,用于对不同网络分支的中间层特征图进行融合,即通过1×1卷积将E -b r a n c h 与M -b r a n c h 学习的眼部区域信息与嘴部区域信息进行特征映射,接着将其与F -b r a n c h 学习的全局面部特征进行拼接,并通过1×1卷积实现维度变换㊂此外,全局均值池化层用于将F -b r a n c h ㊁E -b r a n c h 与M -b r a n c h 网络分支中的最后一层(L 层)卷积特征图进行降维,具体可以定义为:^f =A v g P o o l i n g (F (L )f a c e ⊕F (L )e ye s ⊕F (L )m o u t h ),(4)其中,A v g P o o l i n g (㊃)表示全局均值池化操作;⊕表示特征通道合并;^f 表示融合后得到的全局疲劳状态表征㊂本文采用了多标签分类方法对单帧图像的面部疲劳相关属性进行判别,主要涉及全局属性(正常/垂头)㊁眼部属性(正常/闭眼)和嘴部属性(正常/打哈欠)㊂多标签分类损失具体可以定义如下:c jk =s o f t m a x (f |θk )=e x p (θj k ㊃f )∑j'e x p (θj 'k ㊃f ),(5)L c l s =-∑Jj =1∑Kk =1δk ㊃l k ㊃l o g (c j k ),(6)其中,c jk 表示s o f t m a x 分类器计算出的第k 个疲劳相关属性被分类为第j 个类别的概率;θk 为该分类器的相关参数;L c l s 为单个样本的损失;δk 表示第k 个疲劳相关属性的权重参数;l k 表示第k 个疲劳相关属性的真实值标签㊂1.3 面部疲劳状态检测疲劳是一种连续出现的长时间面部状态,因此仅仅依赖于单帧图像表现出的面部疲劳相关属性仍然难以实现精确的疲劳状态检测㊂基于此,本文采用长短期记忆网络模型(L o n g S h o r t -t e r m N e t w o r k ,L S T N )逐帧对面部疲劳相关属性进行编码,建模长时序信息,最终输出疲劳检测结果㊂L S T N 单元的输入门i (t )用于调制输入信号z (t ),记忆单元m (t )记录了当前的记忆状态,L S T N 单元的输出h (t )由遗忘门f (t )和输出门o (t )共同决定,面部多特征跨层融合网络疲劳逐帧计算疲劳相关属性c (t ),而双向长短期记忆网络以该属性作为输入,并输出每帧图像的疲劳得分,运算过程可表示为:i (t )=σ(W i c (t )+R i h (t -1)+b i ,(7)㊃66㊃安 徽 工 程 大 学 学 报第38卷f (t )=σ(W f c (t )+R f h (t -1)+b f ,(8)o (t )=σ(W o c (t )+R o h (t -1)+b o ,(9)z (t )=φ(W z c (t )+R z h (t -1)+b z ,(10)m (t )=i (t )⊗z (t )+f (t )⊗m (t -1),(11)h (t )=o (t )⊗φ(m (t )),(12)其中,W 表示当前状态输入的权重矩阵;R 表示上一个状态输出的权重矩阵;b 表示阈值项;σ为s i g m o i d 函数;φ为双正切函数;⊗表示元素内积㊂L S T N 单元的输出取决于当前时刻的疲劳相关属性和之前时刻的疲劳相关属性,实现了长时序信息融合㊂2 算法实现与结果2.1 实验环境本文在U b u n t u18.04操作系统下,通过P y -t o r c h 开源工具构建MT C N N 模型㊁面部多特征跨层融合网络模型与L S T N 网络模型,并应用于驾驶员疲劳识别任务中,实验相关配置与具体参数如表1所示㊂表1 实验平台相关配置与具体参数相关配置具体参数主机D e l l P o w e r E d g eT 440C P U I n t e l C o r e i 7-9700G P UN V I D I A G e f o r c eR T X 3090操作系统U b u n t u16.04P yt h o n 版本P yt h o n3.8P yt o r c h 版本P yt o r c h1.132.2 实验数据集N T HU -D D D 是一个公开的驾驶员疲劳识别数据集,由中国台湾清华大学发布㊂该数据集中的所有视频都是由带有主动红外L E D 的彩色摄像头拍摄的㊂录制视频的参与者在模拟驾驶环境中进行正常驾驶和疲劳驾驶,并可以分为五种场景条件:白天不佩戴眼镜㊁白天佩戴眼镜㊁白天佩戴墨镜㊁夜晚不佩戴眼镜㊁夜晚佩戴眼镜,如图4所示㊂录制的视频分辨率为640×480,每秒30帧㊂此外,N T HU -D D D 数据集包含每一帧图像的疲劳相关信息标注,涉及全局状态(正常/垂头)㊁眼睛(正常/闭眼)和嘴巴(正常/打哈欠)㊂本实验借鉴了文献[16]的数据处理方法,即采用滑动窗口对完整视频进行截取,截取片段的帧长设置为300帧㊂N T HU -D D D 数据集中的360段完整的训练视频可以被拆分为2390个时间片段,涉及1572个正常驾驶片段和818个疲劳驾驶片段;20段完整的测试视频可以被拆分为602个时间片段,涉及348个正常驾驶片段和254个疲劳驾驶片段㊂图4 N T HU -D D D 数据集示例样本2.3 实验评价指标本文实验的评价指标包括检测率(D e t e c t i o nR a t e ,D R )㊁误报率(F a l s eA l a r m R a t e ,F A R )㊁准确率(A c c u r a c y Ra t e ,A R ),分别可以定义为:㊃76㊃第6期徐文奇,等:基于面部多特征跨层融合网络的驾驶员疲劳检测方法D R =T pT p +F n ×100%,(13)F A R =F pT n +F p×100%,(14)D R =T p +T nT p +T n +F p +F n×100%,(15)其中,T p ㊁F p ㊁T n 和F n 分别表示真阳性㊁假阳性㊁真阴性㊁假阴性的样本数量㊂2.4 实验结果比较本实验主要采用检测率㊁误报率和准确率这3个指标评估面部多特征跨层融合网络模型的性能并与现有的驾驶员疲劳识别方法和模型做了比较㊂对比方法主要涉及三类:第一类是基于规则的疲劳检测算法,例如P E R C L O S [13]㊁MA R [14]等㊂第二类是结合传统的特征描述子和机器学习算法进行驾驶员疲劳检测,例如S L A F s -R F [17]㊁L B P T O P -S V M [18]等;第三类是基于深度学习的驾驶员疲劳识别方法,例如M S T N [19]㊁D D D -I A A [20]㊁3D C N N -F F [21]㊂本文所提出的基于面部多特征融合的驾驶员疲劳检测算法及其对比模型在相同的D e l lP o w e r E d g eT 440计算平台上进行训练与测试,模型训练过程中的超参数设置如表2所示㊂实验统一采用MT C N N 模型截取驾驶员人脸区域,通过不同模型提取疲劳相关特征,并最终判定驾驶员是否处于疲劳状态㊂表3列出了不同模型在N T HU -D D D 数据集上的识别精度对比结果㊂表2 模型训练的超参数设置超参数配置具体参数初始学习率0.001学习率下降间隔纪元数100学习率调整倍数0.1迭代纪元数500优化器类型A d a m 优化器批样本数16表3 各种疲劳识别算法在N T HU -D D D 数据集上的识别精度对比方法名称N T HU -D D D D R /%F A R /%A R /%P E R C L O S [13]65.419.873.9MA R[14]55.928.764.7S L A F s -R F [17]72.022.175.4L B P T O P -S VM [18]74.020.477.2M S T N[19]85.011.287.2D D D -I A A [2]81.919.081.43D C N N -C A L [21]80.321.579.2F B r a n c h -L S T N 79.513.583.6E B r a n c h -L S T N 78.723.077.9M B r a n c h -L S T N 67.724.772.1面部多特征跨层融合网络-L S T N (本文方法)89.48.690.5文献[13]和文献[14]分别将眼睑闭合时间(P E R C L O S )和嘴角张合比(MA R )作为规则用于评判驾驶员疲劳程度,这类方法在N T HU -D D D 数据集上的表现不佳,准确率分别为73.9%㊁70.6%㊂文献[17]和文献[18]结合了传统的特征描述子和机器学习算法判定驾驶员疲劳状态,其中,文献[17]融合了梯度方向特征和关键点运动矢量,接着随机森林分类器判定驾驶员是否处于疲劳状态;文献[18]使用了三维局部二值模式L B P -T O P 描述子提取面部动态纹理特征,并通过S VM 对提取的特征进行分类,进而检测驾驶员疲劳状态;实验结果表明,基于传统特征描述子的疲劳检测算法性能优于基于规则的疲劳检测算法㊂文献[19]㊁文献[20]和文献[21]构建了深度学习模型进行端到端的疲劳特征提取和疲劳检测㊂具体来说,文献[19]提出了一种多阶段时空网络模型(M u l t i s t a g eS p a t i a l -t e m p o r a lN e t w o r k ,M S T N ),其中C N N 模型从单帧图片中提取人脸疲劳相关特征,L S T N 模型用于建模长时序信息,并输出疲劳检测结果㊂文献[20]提出了D D D -I A A 驾驶员疲劳检测框架,其中A l e x N e t ㊁F l o w I m a g e N e t 和V G G F a c e 分别用于提取全局环㊃86㊃安 徽 工 程 大 学 学 报第38卷境信息㊁帧间动作信息和面部细节轮廓信息,最后采用分数融合检测驾驶员是否处于疲劳状态㊂文献[21]中提出了一种3D C N N -C A L 的疲劳检测框架,该框架首先使用三维卷积网络提取连续时间段的疲劳相关信息,接着借助条件自适应学习获取全局场景信息,最终通过特征融合识别驾驶员疲劳状态㊂实验结果显示文献[19]提出的M S T N 模型优于其他对比方法,该算法在N T HU -D D D 数据集上的检测率㊁误报率和准确率分别为85.0%㊁11.2%和87.2%㊂本文提出的面部多特征跨层融合网络包含3个网络分支:F -b r a n c h ㊁E -b r a n c h 与M -b r a n c h ,分别从人脸㊁眼睛和嘴巴3个图像区域学习疲劳相关信息,并通过跨层连接单元实现不同分支的信息交互㊂从实验结果可以看出,单独的脸部㊁眼部或嘴部特征提取网络分支在N T HU -D D D 数据集上的检测精度不高,而将3个网络分支进行跨层融合后,模型的性能得到了显著提升,检测率㊁误报率和准确率分别达到了89.4%㊁8.6%㊁90.5%,优于其他对比模型㊂面部多特征跨层融合网络模型在五种不同场景条件下的精度表现如表4所示,这五种场景包括白天不佩戴眼睛㊁白天佩戴眼镜㊁白天佩戴墨镜㊁夜间不佩戴眼镜㊁夜间佩戴眼镜㊂实验结果表明,本文提出的算法在白天条件下的检测精度要高于夜间条件下的检测精度,不佩戴眼镜情况下的检测精度要高于佩戴眼镜或佩戴墨镜情况下的检测精度㊂表4 本文提出的疲劳识别算法在N T HU -D D D 数据集中不同场景条件下的识别精度对比场景类型D R /%F A R /%A R /%白天不佩戴眼镜98.12.997.6白天佩戴眼镜94.05.494.3白天佩戴墨镜75.018.678.6夜间不佩戴眼镜92.38.791.7夜间佩戴眼镜85.59.288.5所有场景89.48.690.5图5 疲劳检测效果示意图本文提出的疲劳检测算法的效果示意图如图5所示,从图中可以看出MT C N N 网络可以准确地定位出人脸关键点,进而对脸部㊁眼部和嘴部区域进行截取;面部多特征跨层融合网络模型能够有效地判断出单帧图片的面部疲劳相关属性,而L S T N 网络结合了当前时刻与之前时刻的疲劳相关属性,并输出最终的疲劳检测结果㊂㊃96㊃第6期徐文奇,等:基于面部多特征跨层融合网络的驾驶员疲劳检测方法㊃07㊃安 徽 工 程 大 学 学 报第38卷3 结论本文针对现有驾驶员疲劳检测很大程度依赖于局部疲劳相关信息提取而导致检测准确度不足的问题,提出了一种基于面部多特征融合的驾驶员疲劳检测算法㊂首先使用MT C N N网络检测面部关键点并截取脸部㊁眼部㊁嘴部图像区域;其次设计一种面部多特征跨层融合网络,实现不同面部区域之间的信息交互与疲劳相关特征提取,进而通过多标签分类对单帧图像的面部疲劳相关属性进行识别;最后使用L S T N 对长时间序列进行建模,实现最终的驾驶员疲劳状态检测㊂本文提出的驾驶员疲劳检测算法在N T HU-D D D数据集进行了测试,对比实验验证了该方法的可行性和有效性㊂然而如何进一步提升复杂光照环境下的驾驶员疲劳检测的精度,这将是接下来的重点研究工作㊂参考文献:[1] 胡耀聪.基于深度学习的驾驶员行为与疲劳识别方法研究[D].南京:东南大学,2021.[2] S I N G H D,MOHA NCK.D e e p s p a t i o-t e m p o r a l r e p r e s e n t a t i o n f o r d e t e c t i o n o f r o a d a c c i d e n t s u s i n g s t a c k e d a u t o e n c o d-e r[J].I E E ET r a n s a c t i o n s o n I n t e l l i g e n tT r a n s p o r t a t i o nS y s t e m s,2019,20(3):879-887.[3] K A P L A N S,G U V E N S A N M A,Y A V U Z A G,e ta l.D r i v e rb e h a v i o ra n a l y s i sf o rs a f ed r i v i n g:as u r v e y[J].I E E ET r a n s a c t i o n s o n I n t e l l i g e n tT r a n s p o r t a t i o nS y s t e m s,2015,16(6):3017-3032.[4] R AM Z A N M,K HA N H U,AWA NS M,e t a l.As u r v e y o n s t a t e-o f-t h e-a r t d r o w s i n e s s d e t e c t i o n t e c h n i q u e s[J].I E E EA c c e s s,2019(7):61904-61919.[5] 朱艳,谢忠志,于雯,等.低光照环境下基于面部特征点的疲劳驾驶检测技术[J].汽车安全与节能学报,2022,13(2):282-289.[6] 胡峰松,程哲坤,徐青云,等.基于多特征融合的疲劳驾驶状态识别方法研究[J].湖南大学学报(自然科学版),2022,49(4):100-107.[7] 陆荣秀,张笔豪,莫振龙.基于脸部特征和头部姿态的疲劳检测方法[J].系统仿真学报,2022,34(10):2279-2292.[8] 熊群芳,林军,岳伟.基于深度学习的疲劳驾驶状态检测方法[J].控制与信息技术,2018(6):91-95.[9] 郑伟成,李学伟,刘宏哲,等.基于深度学习的疲劳驾驶检测算法[J].计算机工程,2020,46(7):21-29.[10]李响.基于深度学习的面部疲劳信息检测方法研究与实现[D].长春:东北师范大学,2021.[11]A HM E D M,MA S O O DS,A HMA D M,e t a l.I n t e l l i g e n td r i v e rd r o w s i n e s sd e t e c t i o nf o r t r a f f i cs a f e t y b a s e do n m u l t iC N Nd e e p m o d e l a n d f a c i a l s u b s a m p l i n g[J].I E E ET r a n s a c t i o n s o n I n t e l l i g e n tT r a n s p o r t a t i o nS y s t e m s,2022,23(10):19743-19752.[12]Q I A N K,K O I K ET,N A K AMU R A T,e t a l.L e a r n i n g m u l t i m o d a l r e p r e s e n t a t i o n s f o rd r o w s i n e s sd e t e c t i o n[J].I E E ET r a n s a c t i o n s o n I n t e l l i g e n tT r a n s p o r t a t i o nS y s t e m s,2022,23(8):11539-11548.[13]周壮壮,陈艳,唐苏,等.基于P E R C L O S的眼部疲劳检测系统研究[J].集成电路应用,2021,38(2):48-49.[14]王霞,仝美娇,王蒙军.基于嘴部内轮廓特征的疲劳检测[J].科学技术与工程,2016,16(26):240-244.[15]Z HA N G K,Z HA N GZ,L I Z,e t a l.J o i n t f a c e d e t e c t i o n a n d a l i g n m e n t u s i n g m u l t i t a s k c a s c a d e d c o n v o l u t i o n a l n e t w o r k s[J].I E E ES i g n a l P r o c e s s i n g L e t t e r s,2016(10):1499-1503.[16]黄志亮.基于深度学习的驾驶员疲劳检测技术研究[D].南京:东南大学,2019.[17]L Y UJ,Z HA N G H,Y U A N ZJ.J o i n t s h a p ea n dl o c a l a p p e a r a n c e f e a t u r e s f o rr e a l-t i m ed r i v e rd r o w s i n e s sd e t e c t i o n[C]//A s i a nC o n f e r e n c e o nC o m p u t e rV i s i o n.T a i p e i:S p r i n g e r,2017:178-194.[18]赵磊.基于深度学习和面部多源动态行为融合的驾驶员疲劳检测方法研究[D].山东:山东大学,2018.[19]S H I H T H,H S UCT.M S T N:m u l t i s t a g e s p a t i a l-t e m p o r a l n e t w o r k f o r d r i v e r d r o w s i n e s s d e t e c t i o n[C]//A s i a nC o n f e r-e n c e o nC o m p u t e rV i s i o n.T a i p e i:S p r i n g e r,2016:146-153.[20]P A R KS,P A NF,K A N GS,e t a l.D r i v e r d r o w s i n e s s d e t e c t i o n s y s t e mb a s e d o n f e a t u r e r e p r e s e n t a t i o n l e a r n i n g u s i n g v a-r i o u s d e e p n e t w o r k s[C]//A s i a nC o n f e r e n c e o nC o m p u t e rV i s i o n.T a i p e i:S p r i n g e r,2017:154-164.[21]Y UJ,P A R KS,L E ES,e t a l.D r i v e rd r o w s i n e s sd e t e c t i o nu s i n g c o n d i t i o n-a d a p t i v e r e p r e s e n t a t i o n l e a r n i n g f r a m e w o r k[J].I E E ET r a n s a c t i o n s o n I n t e l l i g e n tT r a n s p o r t a t i o nS y s t e m s.2019,20(11):4206-4218.D r i v e rF a t i g u eD e t e c t i o n M e t h o dB a s e d o n F a c i a lM u l t i -f e a t u r eC r o s s -l a ye rF u s i o nN e t w o r k X U W e n q i ,HU Y a o c o n g*(S c h o o l o fE l e c t r i c a l E n g i n e e r i n g ,A n h u i P o l y t e c h n i cU n i v e r s i t y,W u h u241000,C h i n a )A b s t r a c t :T h i s p a p e r p r o p o s e d a d r i v e r f a t i g u e d e t e c t i o n a l g o r i t h mb a s e do n f a c i a lm u l t i -f e a t u r e f u s i o n .I t c a n l e a r n f e a t u r e s f r o mt h e o v e r a l l f a c i a l f a t ig u e s t a t e .Th e c u r r e n t d ri v e r f a t i g u e d e t e c t i o nh e a v i l y re l i e d o n e x t r a c t i n g l o c a l d r o w s i n e s s r e l a t e d i nf o r m a t i o n ,r e s u l t i ng i n i n s u f f i c i e n t d e t e c t i o n a c c u r a c y.H o w e v e r ,t h em e t h o d c a na c h i e v em o r e a c c u r a t e d r i v e r f a t i g u e s t a t e d e t e c t i o n .T h e p r o p o s e dd r i v e r f a c e f a t i g u e d e -t e c t i o na l g o r i t h mc o n s i s t s o f t h r e e s t e p s .F i r s t l y ,MT C N Nn e t w o r kw a su s e d t od e t e c t f a c i a l k e yp o i n t s a n de x t r a c t f a c i a l ,e y e ,a n d m o u t h i m a g e r e g i o n s ;S e c o n d l y ,a f a c i a lm u l t i f e a t u r e c r o s s l a y e r f u s i o nn e t -w o r kw a s d e s i g n e d t o a c h i e v e i n f o r m a t i o n e x c h a n g e a n d f a t i g u e r e l a t e d -f e a t u r e e x t r a c t i o nb e t w e e n d i f f e r -e n tf a c i a l r eg i o n s ,a n dth e nr e c o g ni z e df a c i a l f a t i g u er e l a t e da t t r i b u t e s i ns i n g l ef r a m e i m a g e st h r o u gh m u l t i -l a b e l c l a s s i f i c a t i o n ;F i n a l l y ,L S T M w a su s e dt o m o d e l t h e l o n g t i m es e r i e sa n da c h i e v e dt h e f i n a l d e t e c t i o no f d r i v e r f a t i g u e s t a t u s .T h e p r o p o s e dd r i v e r f a t i g u e d e t e c t i o n a l g o r i t h m w a s t e s t e d o n t h eN T -HU -D D Dd a t a s e t ,a n d c o m p a r a t i v e e x p e r i m e n t s v e r i f i e d t h e f e a s i b i l i t y a n d e f f e c t i v e n e s s o f t h i sm e t h o d .K e y w o r d s :d r o w s i n e s s -r e l a t e d i n f o r m a t i o n ;c r o s s -l a y e r f e a t u r e i n t e r a c t i o n ;m u l t i -l a b e l c l a s s i f i c a t i o n ;l o n g -t i m e s e qu e n c e (上接第63页)I m p r o v e m e n t o f S L A M A l go r i t h mf o rP o i n t a n dL i n eV i s i o n B a s e d o nG r a d i e n tD e n s i t yL IM i n g h a o 1,2,C H E N M e n g y u a n 1,2*,G O N GP e n g h a o 1,2,L O N G H a i ya n 3(1.S c h o o l o fE l e c t r i c a l E n g i n e e r i n g ,A n h u i P o l y t e c h n i cU n i v e r s i t y,W u h u241000,C h i n a ;2.K e y L a b o r a t o r y o fA d v a n c e dP e r c e p t i o na n d I n t e l l i g e n tC o n t r o l o fH i g h -e n dE q u i p m e n t ,A n h u i P o l y t e c h n i cU n i v e r s i t y,W u h u241000,C h i n a ;3.S c h o o l o fE l e c t r i c a l a n dE l e c t r o n i cE n g i n e e r i n g ,A n h u i I n s t i t u t e o f I n f o r m a t i o nT e c h n o l o g y,W u h u241000,C h i n a )A b s t r a c t :A i m i n g a t t h e p r o b l e m st h a t t h ev i s u a l s y n c h r o n o u s l o c a l i z a t i o na n d m a p b u i l d i n g (S L AM )m e t h o d i s l i k e l y t o c a u s e i m a g eb l u r i n t h e r a pi d c a m e r am o v e m e n t ,a n d t h a t t h e e x t r a c t i o no f c e n t e r l i n e f e a t u r e s i nd e n s es c e n e s i s l i k e l y t oc a u s e i n f o r m a t i o nr e d u n d a n c y ,a ni m p r o v e d p o i n ta n dl i n ev i s u a l S L AMa l g o r i t h mi n t e g r a t i n g g r a d i e n t d e n s i t y i s p r o p o s e d .T h e a l go r i t h mf i r s t u s e s t h e n u m b e r o f f e a t u r e p o i n t sb e t w e e n t h e f r o n t a n db a c k i m a g e f r a m e s t o f i l t e r t h e b l u r r e d i m a ge ,a n d t h e nu s e sG a u s s i a nb l u r t oo p t i m i z e t h e p r o c e s s i n g t o o b t a i nb e t t e rm a t c h i n g i m a ge f r a m e s .T h e n ,t h e p o i n t f e a t u r e i n f o r m a t i o n i s u s e d t o j u d g ew h e t h e r t h e l i n e f e a t u r e i s i n t r o d u c e d ,a n d t h e i m a g e p i x e l d e n s i t y g r a d i e n t i s i n t r o d u c e d t o o p t i m i z e t h eL S D (l i n e s e g m e n t d e t e c t i o n )l i n e f e a t u r e f r o m m u l t i -d i m e n s i o n s ,a n d t h e s t a b l e l i n e f e a t u r e i s e x t r a c t e d t o i m p r o v e t h e s u b s e q u e n tm a t c h i n gq u a l i t y .F i n a l l y,t h e e r r o r f u n c t i o n i s c o n s t r u c t e db a s e d o n t h e p o i n t a n d l i n e c h a r a c t e r i s t i c e r r o r t om i n i m i z e t h e p r o j e c t i o ne r r o r a n d i m p r o v e t h e p o s e e s t i m a -t i o na c c u r a c y .T h e a l g o r i t h mi s t e s t e d i nT UMd a t a s e t ,a n d t h e e x p e r i m e n t a l r e s u l t s s h o wt h a t t h e a l g o -r i t h mc a ne f f e c t i v e l y i m p r o v et h er o b u s t n e s so f f e a t u r ee x t r a c t i o n ,t h e r e b y i m p r o v i n g t h ea c c u r a c y of c a m e r a p o s e e s t i m a t i o na n dm a p p i n g.K e y w o r d s :s y n c h r o n o u s p o s i t i o n i n g a n dm a p p i n g ;g r a d i e n t i n f o r m a t i o n ;l i n e f e a t u r e e x t r a c t i o n ;p o i n t a n d l i n e f u s i o n ;i n f o r m a t i o ne n t r o p y㊃17㊃第6期徐文奇,等:基于面部多特征跨层融合网络的驾驶员疲劳检测方法。
纹理物体缺陷的视觉检测算法研究--优秀毕业论文
摘 要
在竞争激烈的工业自动化生产过程中,机器视觉对产品质量的把关起着举足 轻重的作用,机器视觉在缺陷检测技术方面的应用也逐渐普遍起来。与常规的检 测技术相比,自动化的视觉检测系统更加经济、快捷、高效与 安全。纹理物体在 工业生产中广泛存在,像用于半导体装配和封装底板和发光二极管,现代 化电子 系统中的印制电路板,以及纺织行业中的布匹和织物等都可认为是含有纹理特征 的物体。本论文主要致力于纹理物体的缺陷检测技术研究,为纹理物体的自动化 检测提供高效而可靠的检测算法。 纹理是描述图像内容的重要特征,纹理分析也已经被成功的应用与纹理分割 和纹理分类当中。本研究提出了一种基于纹理分析技术和参考比较方式的缺陷检 测算法。这种算法能容忍物体变形引起的图像配准误差,对纹理的影响也具有鲁 棒性。本算法旨在为检测出的缺陷区域提供丰富而重要的物理意义,如缺陷区域 的大小、形状、亮度对比度及空间分布等。同时,在参考图像可行的情况下,本 算法可用于同质纹理物体和非同质纹理物体的检测,对非纹理物体 的检测也可取 得不错的效果。 在整个检测过程中,我们采用了可调控金字塔的纹理分析和重构技术。与传 统的小波纹理分析技术不同,我们在小波域中加入处理物体变形和纹理影响的容 忍度控制算法,来实现容忍物体变形和对纹理影响鲁棒的目的。最后可调控金字 塔的重构保证了缺陷区域物理意义恢复的准确性。实验阶段,我们检测了一系列 具有实际应用价值的图像。实验结果表明 本文提出的纹理物体缺陷检测算法具有 高效性和易于实现性。 关键字: 缺陷检测;纹理;物体变形;可调控金字塔;重构
Keywords: defect detection, texture, object distortion, steerable pyramid, reconstruction
II
SLAM应用领域
SLAM的应用领域自同时定位与地图构建被意识到并提出有严密数学基础的方案起,在很多领域被成功地工程实现和应用。
比如,外星球探索[4–7],采矿自动化与安全[8–11],水下探测与深海勘探[1, 2, 12–14],无人机导航与自治[15–17]以及灾难现场搜寻与营救[18–20]。
在这些对环境缺乏先验知识的场合,同时定位与地图构建的引入是实现移动机器人自治的唯一手段,它使移动机器人能够估计自身方位和周围环境特征的方位和几何轮廓,进而合理决策与规划动作和路径。
在民用领域,同时定位与地图构建可使移动车辆在GPS无法正常工作的环境下实现定位[21–24],并追踪辨识动态车辆与行人[3, 24, 25],为实现智能避障,辅助驾驶和自导航提供了可能。
陆地自主车(ALV: Autonomous Land Vehicle)之一是移动机器人重要应用领域,2005年美国国防部高级研究计划局(DARPA)举办的Grand Challenge比赛,由著名专家S.Thrun领导团队开发的Stanley自主车[26]仅用不到7小时完成了142英里的自主行使任务。
我国对地面智能移动机器人的研究起步较晚,但是也取得了很大的进展[27]。
20世纪80年代末,国家“八六三”计划自动化领域智能移动机器人主题确定立项进行遥控驾驶的防核化侦察车的研制;几乎同时国家部分部委也在规划“八五”预研中的智能移动机器人技术研究。
真正取得突破性进展的是“八五”期间研制成功的我国第一辆样车ATB-1(Autonomous Test Bed-1),该车由南京理工大学、国防科技大学、清华大学、浙江大学和北京理工大学联合研制。
在19%年演示中,该车各项性能都达到了较高的标准;在“八五”的基础上,我国在“九五”期间研制了第二代自主地面车ATB-2;到目前为止,第三代自主车已经研制成功,并已通过鉴定,现在正在进行第四代的自主车的研制工作。
另外,国内具有代表性的系统还有: 清华大学研制的THMR-v[28],国防科技大学研制的CITA VT-IV[29]及其与中国一汽合作研究的“红旗”自主轿车,吉林大学研制的JLUIV系列实验车[30]以及西安交通大学研制的Springrobot实验车[31]等。
基于小波去噪与改进Canny算法的带钢表面缺陷检测
现代电子技术Modern Electronics TechniqueFeb. 2024Vol. 47 No. 42024年2月15日第47卷第4期0 引 言带钢是钢铁工业的主要产品之一,广泛应用于机械制造、航空航天、军事工业、船舶等行业中。
然而在带钢的生产制作过程中,由于受到原材料、生产设备、工艺流程等多种因素的影响,不可避免地会导致带钢表面出现缺陷,例如:氧化、斑块、裂纹、麻点、夹杂、划痕等。
表面缺陷不仅影响带钢的外观,更是损害了产品的耐磨性、抗腐蚀性和疲劳强度等性能,因此需要加强产品的质检,对有表面缺陷的带钢进行检测和筛查。
但传统人工检测方法采用人为判断,随机性较大、检测置信度偏低、实时性较差[1]。
卞桂平等提出一种基于改进Canny 算法的图像边缘检测方法,采用复合形态学滤波代替高斯滤波,并通过最大类间方差法选取高低阈值,最后利用数学形态学对边缘进行细化,提高了抗噪性能[2]。
刘源等提出一种DOI :10.16652/j.issn.1004⁃373x.2024.04.027引用格式:崔莹,赵磊,李恒,等.基于小波去噪与改进Canny 算法的带钢表面缺陷检测[J].现代电子技术,2024,47(4):148⁃152.基于小波去噪与改进Canny 算法的带钢表面缺陷检测崔 莹, 赵 磊, 李 恒, 刘 辉(昆明理工大学 信息工程与自动化学院, 云南 昆明 650500)摘 要: 针对带钢表面图像亮度不均匀、对比度低以及缺陷种类多、形式复杂的问题,提出一种基于小波去噪与改进Canny 算法的带钢表面缺陷检测算法。
首先通过小波变换将原始图像分解,对低频分量采用改进的同态滤波提高亮度和对比度,对高频分量采用改进的阈值函数进行去噪,并通过小波重构得到增强图像。
其次对传统Canny 算法进行改进,通过改进的自适应加权中值滤波进行平滑,并增加梯度方向模板;然后采用迭代式最优阈值选择法与最大类间方差法来求取高低阈值,提高算法的自适应性。
基于Gabor滤波的人眼定位算法_熊飞
,
但是使用单一 G abor滤波器存在一定的局限性 , 对 于一些人脸图像滤波得到的 GaborEye 模型并非很 明显。本文首先通过人眼图像水平方向的梯度复 杂度确定人眼区域的纵坐标, 再通过 Gabor 滤波器 得到的频率响应幅值投影 , 确定人眼区域的横坐标 范围。最终分割得到人眼区域。 2 . 1 纵坐标定位 由于人眼的人眼、 眼白、 眼角等部位象素灰度 反差强烈, 人眼区域图像具有水平方向灰度变化频 繁和剧烈的特点 , 而且与眉毛和鼻子区域相比人眼
图 1 人脸人眼定位及校正算法流程
的灰度变化更频繁, 因此将图像竖直方向的边缘做 水平投影可以确定人眼的纵坐标。为了去除原始 灰度图像中头发边缘等与人脸部器官图像无关的 灰度信息, 本文使用数 字形态学闭操 作 V= I B - I 其中 I 为原始灰度图像 , 大小为 H
[ 1]
针对 GaborEye 模型的缺陷和它所体现的 Ga bor小波抗干扰和非均匀光照的优势 , 本文采用了 Gabor 滤波方法定位人眼, 但是为了更加确切的分 割人眼区域 , 本文通过综合人眼区域竖直梯度复杂 程度和 Gabor 滤波结果投影确定人眼区域范围 , 并 提出投影增强算法增强投影的双峰特性, 算法流程 如图 1 所示。该算法能够精准的定位人眼 , 具有较
表 1 双眼定位概率 滤波器及融合法则 m= 2 m= 3 m= 4 左眼定位概率 79. 2% 95. 8% 88. 4% 右眼定位概率 78. 9 % 95. 1 % 87. 7 %
坐标定位算法定位人眼准确率做比较, 见表 2 。 从实验数据分析得到投影增强定位方法对定 位概率有很大的提高。经过本文算法得到的人眼 定位结果如图 4 所示。
( 2) ( 3)
2 . 2 基于 Gabor滤波的横坐标定位 使用二维 G abor 滤波器则能够计算任何方向 和频率的能量, 由于眼眉区域有明显强烈的竖直方 向的灰度变化, 因此使用水平方向的二维 Gabo r滤 波器得到眼眉部分图像特有频率的能量, 即频率响 应幅值。由于该频率为眼眉区域特有的特征, 所以 该区域的频率响应幅值较大, 可以与脸部其它区域 相区别。二维 Gabor滤波器的函数形式 G U, V ( z ) = ∃ kU, V ∃
基于智能超表面的二维相扫天线
doi:10.3969/j.issn.1003-3114.2024.02.021引用格式:于瑞涛,符道临,熊伟,等.基于智能超表面的二维相扫天线[J].无线电通信技术,2024,50(2):386-391.[YURuitao,FUDaolin,XIONGWei,etal.Two dimensionalPhase scanAntennaBasedonRIS[J].RadioCommunicationsTechnology,2024,50(2):386-391.]基于智能超表面的二维相扫天线于瑞涛1,符道临2,熊 伟1,陈 珲3(1.杭州市钱塘区信息高等研究院,浙江杭州310018;2.江苏赛博空间科学技术有限公司,江苏南京211113;3.东南大学信息科学与工程学院,江苏南京210096)摘 要:针对传统相控阵天线设计复杂、成本高昂等问题,提出了基于智能超表面(ReconfigurableIntelligentSurface,RIS)的空间馈电二维相扫天线。
该天线通过RIS技术对来自馈源的电磁波进行相位操纵,实现了天线高增益和天线波束的可重构。
与传统相控阵天线相比,该天线具有结构简单、成本低廉、剖面极低等特点。
天线原型由一块含有1024个单元的超表面阵列、驱动模块及自支撑馈源组成,其中超表面阵列通过现场可编程门阵列(FieldProgrammableGateArray,FPGA)来实时控制以满足天线波束快速切换、方向图二维实时重构等需求。
超表面单元两种调控状态在6.5~10.0GHz具有低于0.5dB的幅度损耗,在7~10GHz频带范围内反射相位差在180°±5°以内。
天线原型在7.7、8.0、8.3GHz的测试增益为24.72、24.92、25.06dBi。
天线±60°扫描增益滚降低于4dB。
关键词:智能超表面;相控阵天线;二维扫描中图分类号:TN82 文献标志码:A 开放科学(资源服务)标识码(OSID):文章编号:1003-3114(2024)02-0386-06Two dimensionalPhase scanAntennaBasedonRISYURuitao1,FUDaolin2,XIONGWei1,CHENHui3(1.HongzhouQiantangAdvancedInstituteofInformation,Hangzhou310018,China;2.JiangsuCyberspaceScienceandTechnologyCo.,Ltd.,Nanjing211113,China;3.SchoolofInformationScienceandEngineering,SoutheastUniversity,Nanjing210096,China)Abstract:Inordertosolvetheproblemsofcomplexdesignandhighcostoftraditionalphasedarrayantennas,aspace fedtwo dimensionalphase scanantennabasedonReconfigurableIntelligentSurface(RIS)wasproposed.TheantennausesRIStechnologytomanipulatetheelectromagneticincomingwavesfromthefeed,whichrealizeshighgainandreconfigurabilityoftheantennabeam.Comparedwithtraditionalphasedarrayantennas,thisantennahasthecharacteristicsofsimplestructure,lowcostandlowprofile.Theantennaprototypeconsistsofametasurfacearraycontaining1024cells,adrivermoduleandaself supportingfeed,inwhichthemetasurfacearrayiscontrolledinrealtimethroughFieldProgrammableGateArray(FPGA)tomeettherequirementofrapidantennabeamswitchingandtwo dimensionalreal timereconstructionofthepattern.Twocontrolstatesofthemetasurfaceunithaveamplitudelossoflessthan0.5dBfrom6.5GHzto10.0GHz,andthereflectedphasedifferenceiswithin180°±5°inthefrequencybandrangefrom7GHzto10GHz.Theantennaprototypetestedgainsare24.72dBi,24.92dBiand25.06dBirespectivelyat7.7GHz,8.0GHz,and8.3GHz.Theantenna±60°scanlossisdownto4dB.Keywords:RIS;phasedarrayantennas;2Dscanning收稿日期:2023-11-150 引言面对日益增长的通信、探测等需求,传统线馈型相控阵技术路线存在成本高、设计复杂等缺点,难以满足未来通信/探测低成本、低功耗、智能化等迫切需求。
基于改进BiSeNet_的实时图像语义分割
第 31 卷第 8 期2023 年 4 月Vol.31 No.8Apr. 2023光学精密工程Optics and Precision Engineering基于改进BiSeNet的实时图像语义分割任凤雷1,2,杨璐1,2*,周海波1,2,张诗雨1,2,何昕3,徐文学4(1.天津理工大学天津市先进机电系统设计与智能控制重点实验室,天津 300384;2.天津理工大学机电工程国家级实验教学示范中心,天津 300384;3.中国科学院长春光学精密机械与物理研究所,吉林长春 130033;4.天津卓越信通科技有限公司,天津 300384)摘要:为了提升图像语义分割算法的性能,使其同时满足准确性和实时性需求,本文提出了一种基于改进BiSeNet的实时图像语义分割算法。
首先,通过使双分支网络头部共享以消除BiSeNet网络结构部分通道和参数的冗余,同时有效提取图像的浅层特征;然后,将上述共享网络拆分为由细节分支和语义分支组成的双分支网络,并分别用于提取空间细节信息和语义上下文信息;此外,在语义分支尾部引入通道和空间注意力机制以增强特征表达能力,通过使用双注意力机制对BiSeNet算法进行优化以更有效地提取语义上下文特征;最后,对细节分支和语义分支的特征进行融合并通过上采样操作恢复至输入图像分辨率大小以实现图像语义分割。
本文算法在Cityscapes数据集以95.3FPS的实时性表现达到77.2% mIoU的准确性;在CamVid数据集以179.1 FPS的实时性表现达到73.8% mIoU的准确性。
实验结果表明,本文算法在实时性和准确性方面获得了很好的平衡,其语义分割性能相较于BiSeNet算法及其它现有算法得到了显著的提升。
关键词:语义分割;注意力机制;实时性;深度学习中图分类号:TP394.1 文献标识码:A doi:10.37188/OPE.20233108.1217Real-time semantic segmentation based on improved BiSeNetREN Fenglei1,2,YANG Lu1,2*,ZHOU Haibo1,2,ZHANG Shiyv1,2,HE Xin3,XU Wenxue4(1.Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control,School of Mechanical Engineering, Tianjin University of Technology, Tianjin 300384, China;2.National Demonstration Center for Experimental Mechanical and Electrical Engineering Education,Tianjin University of Technology, Tianjin 300384, China;3.Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences,Changchun 130033, China;4.Transcend Communication Technology Tianjin Co., Ltd, Tianjin 300384, China)* Corresponding author, E-mail: yanglu8206@Abstract:To improve the performance of image semantic segmentation on accuracy and efficiency for practical applications, in this study, we propose a real-time semantic segmentation algorithm based on im⁃文章编号1004-924X(2023)08-1217-11收稿日期:2022-09-12;修订日期:2022-10-01.基金项目:国家自然科学基金资助项目(No.51275209);天津市自然科学基金重点项目资助(No. 17JCZDJC30400);广东省重点领域研发计划资助项目(No. 2019B090922002)第 31 卷光学精密工程proved BiSeNet.First,the redundancy of certain channels and parameters of BiSeNet is eliminated by sharing the heads of dual branches, and the affluent shallow features are effectively extracted at the same time. Subsequently, the shared layers are divided into dual branches, namely, the detail branch and the se⁃mantic branch, which are used to extract detailed spatial information and contextual semantic information,respectively. Furthermore, both the channel attention mechanism and spatial attention mechanism are in⁃troduced into the tail of the semantic branch to enhance the feature representation; thus the BiSeNet is opti⁃mized by using dual attention mechanisms to extract contextual semantic features more effectively. Final⁃ly, the features of the detail branch and semantic branch are fused and up-sampled to the resolution of the input image to obtain semantic segmentation. Our proposed algorithm achieves 77.2% mIoU on accuracy with real-time performance of 95.3 FPS on Cityscapes dataset and 73.8% mIoU on accuracy with real-time performance of 179.1 FPS on CamVid dataset. The experiments demonstrate that our proposed se⁃mantic segmentation algorithm achieves a good trade-off between accuracy and efficiency.Furthermore,the performance of semantic segmentation is significantly improved compared with BiSeNet and other ex⁃isting algorithms.Key words: semantic segmentation; attention mechanism; real time; deep learning1 引言图像语义分割作为计算机视觉领域的一项重要技术,旨在将每一个图像像素分类为相应的语义类别,其在自动驾驶、医学检测、机器人导航、场景解析、人机交互等领域均有着极为广泛的应用[1-5]。
基于视觉的旋翼无人机地面目标跟踪(英文)
I. INTRODUCTION UAV is one of the best platforms to perform dull, dirty or dangerous (3D) tasks [1]. UAV can be used in various applications where human is impossible to intervene. It greatly expands the application space of visual tracking. Research on the technology of vision based ground target tracking for UAV has been a great concern among cybernetic experts and robotic experts, and has become one of the most active research directions in UAV applications. Currently, researchers from America, Britain, France and Sweden are on the cutting edge in this field [2]. Typical visual tracking platforms for UAV include Scan Eagle, GTMax, RQ-11, RQ-16, DragonFly, etc. Because of many advantages, such as small size, light weight, flexible, easy to carry and low cost, rotor UAV has a broad application prospect in the fields of traffic monitoring, resource exploration, electricity patrol, forest fire prevention, aerial photography, atmospheric monitoring, etc [3]. Vision based ground target tracking system for rotor UAV is such a system that gets images by the camera installed on a low-flying rotor UAV, then recognizes the target in the images and estimates the motion state of the target, and finally according to the visual information regulates the pan-tilt-zoom (PTZ) camera automatically to keep the target at the center of the camera view. In view of the current situation of international researches, the study of ground target tracking system for
基于二维小波变换的圆形算子虹膜定位算法
基于二维小波变换的圆形算子虹膜定位算法赵静【摘要】An improved iris localization algorithm of circular operator based on two-dimensional wavelet transform is proposed to im-prove the accuracy and the speed of the iris localization. Firstly,the algorithm segments the pupil area of the iris by the threshold. Second-ly it locates the iris inner edge by the edge detection operator in the pupil area. Thirdly the human eye iris image is processed by the two-dimensional wavelet transform to reduce the image resolution instead of the smoothing function in the Daugman circular operator. Finally it gets the circular edge of the sliding window by the circular edge detection operator,and compares the circle inside mean gray with the circle outside mean gray to locate the iris outer edge. The simulation results show that the algorithm locates the iris inner and outer edge with 1. 85s average time and 99. 6% accuracy rate. The algorithm has a higher practical value in the iris recognition system.% 为了提高虹膜定位的准确率和速度,提出了一种基于二维小波变换的Daugman圆形算子虹膜定位改进算法。
收集有趣英文句子帮助记忆单词
倾斜的盐过滤器交替地停下以便改造。. The wandering band abandoned her bandaged husband on Swan Island.
流浪的乐队把她那位打着绷带的丈夫遗弃在天鹅岛上。
认为季节性的海外海鲜的价格是合理的就是背叛。
. The veteran in velvet found that the diameter of the thermometer was one
公正的预算法官只不过为司法调整辩护而已。
. I used to abuse the unusual usage, but now I'm not used to doing so.
我过去常滥用这个不寻常的用法,但我现在不习惯这样做。
. The lace placed in the palace is replaced first, and displaced later.
他们在辩论关于那件不朽乐器的文献。
. However, Lever never fevers; nevertheless, he is clever forever.
无论如何,杠杆从未发烧;尽管如此,他始终机灵。
. I never mind your unkind reminding that my grindstone hinders your
我暗示说虚弱的圣徒用了一品脱油漆涂印刷机。
. At any rate, the separation ratio is accurate.
无论如何,这个分离比是精确的。
05 数字图像处理_图像滤波
直方图 原始图像 Image & Vision Lab
灰度直方图定义: h(r)
h(rk ) = nk
r nk: 灰度值等于rk的像素数量(计数值)
Image & Vision Lab
灰度映射(直方图变换 灰度映射 直方图变换) 直方图变换
用直方图变换方法进行图像增强是以概率论为基 础的。 常用的方法:
0 ≤ p( x) ≤ 1
x = −∞
∑ p( x) = 1
∞
p(x) : 概率密度函数
Image & Vision Lab
直方图均衡算法
直方图均衡化主要用于增强动态范围偏小的图像 的方差; 基本思想:把原始的直方图变换为均匀分布的形 式,这样就增加了像素灰度值的动态范围,从而 达到增强图像整体对比度的效果。
3 0.25 0.50 3 3 0.25 4 0.40 0.90 6.2 6 0.4 5 0.05 0.95 6.6 7 6 0.05 1.00 7 7 7 0 1.00 7 7 0.10
r
1 0.05 0.10 -0.2 0 0.1
2 0.15 0.25 1 1 0.15
L=8
Image & Vision Lab
Image & Vision Lab
图像间的运算——加法的应用 加法的应用 图像间的运算
f g ( x, y ) 是采集到的图像, ( x, y )是原始场景图像,
e( x, y ) 是噪声图像。
g ( x, y ) = f ( x , y ) + e( x, y )
图像间的加法运算多用来求采集的多幅相同图像 的平均值图像,利用平均值图像滤除噪声。假设 有M副图像: 可以证明 − M越大,均值图像 g ( x, y ) 越接近
伊顿 LF 63 技术数据表
Weight: approx. 5.5 lbs.Dimensions: inches Designs and performance values are subject to change.Description:In-line filters of the type LF 63 are suitable for a working pressure up to 363 PSI. Pressure peaks are absorbed with a sufficient margin of safety. It can be used as suction filter, pressure filter and return-line filter.The filter element consists of star-shaped, pleated filter material, which is supported on the inside by a perforated core tube and is bonded to the end caps with a high-quality adhesive. The flow direction is from outside to inside.For cleaning the stainless steel mesh element (see special leaflets 21070-4 and 39448-4) or changing the filter element, remove the cover and take out the element. The mesh elements are not guaranteed to maintain 100% performance after cleaning.For filtration finer than 40 μm, use the disposable elements made of microglass. Filter elements as fine as 5 μm(c) are available; finer filter elements are available upon request.Eaton filter elements are known for a high intrinsic stability and an excellent filtration capability, a high dirt-retaining capacity and a long service life.Eaton filter can be used for petroleum-based fluids, HW emulsions, water glycols, most synthetic fluids and lubrication fluids. Consult factory for specific fluid applications.The internal valve is integrated in the filter cover. After reaching the bypass pressure setting, the bypassvalve will send unfiltered partial flow around the filter. Ship classifications available upon request. Type index:Complete filter:(ordering example)series:LF = in-line filternominal size: 633filter-material:25VG, 16VG, 10VG, 6VG, 3VG microglass25API, 10API microglass according to APIfilter element collapse rating:30 = ∆p 435 PSIfilter element design:E= single end opensealing material:P= Nitrile (NBR)V = Viton (FPM)filter element specification:- = standardVA = stainless steelIS06 =for HFC application, see sheet-no. 31601process connection::UG =thread connectionprocess connection size:4 = -12 SAEfilter housing specification:- = standardpressure vessel specification:- = standard (PED 2014/68/EU)internal valve:- = withoutS1 = with bypass valve ∆p 51 PSIclogging indicator or clogging sensor:- = withoutAOR =visual, see sheet-no.1606AOC = visual, see sheet-no.1606AE = visual-electric, see sheet-no.1615VS5 = electronic, see sheet-no.1619To add an indicator/sensor to your filter, use the corresponding indicator data sheet to find the indicator details and add them to the filter assembly model code.Filter element: (ordering example)series:01NL = standard filter element according to DIN 24550, T3nominal size: 63- see type index complete filterTechnical data:operating temperature: +14 °F to +212 °Foperating medium: mineral oil, other media on requestmax. operating pressure: 363 PSItest pressure: 522 PSIprocess connection: thread connectionhousing material: aluminium-cast, steel (filter cover)sealing material: Nitrile (NBR) or Viton (FPM), other materials on requestinstallation position: verticalmeasuring connections: BSPP ¼drain connection: BSPP ½volume tank: .19 Gal.Classified under the Pressure Equipment Directive 2014/68/EU for mineral oil (fluid group 2), Article 4, Para. 3.Classified under ATEX Directive 2014/34/EU according to specific application (see questionnaire sheet-no. 34279-4).Pressure drop flow curves:Filter calculation/sizingThe pressure drop of the assembly at a given flow rate Q is the sum of the housing ∆p and the element ∆p and is calculated as follows:∆p assembly= ∆p housing+ ∆p element∆p housing = (see ∆p= f (Q) - characteristics)For ease of calculation our Filter Selection tool is available online at /hydraulic-filter-evaluationMaterial gradient coefficients (MSK) for filter elementsThe material gradient coefficients in psi/gpm apply to mineral oil (HLP) with a density of 0.876 kg/dm³ and a kinematic viscosity of 139 SUS (30 mm²/s). The pressure drop changes proportionally to the change in kinematic viscosity and density.∆p = f(Q) – characteristics according to ISO 3968The pressure drop characteristics apply to mineral oil (HLP) with a density of 0.876 kg/dm³. The pressure drop changes proportionally to the density.Symbols:without indicatorwith bypass valvewith electric indicator AE30 / AE40with visual-electricindicator AE50 / AE62with visual-electricindicatorAE70 / AE80 / AE90with visual indicator AOR/AOCwith electronicsensor VS5Spare parts:Test methods:Filter elements are tested according to the following ISO standards:ISO 2941 Verification of collapse/burst resistance ISO 2942 Verification of fabrication integrityISO 2943 Verification of material compatibility with fluids ISO 3723 Method for end load testISO 3724 Verification of flow fatigue characteristicsISO 3968 Evaluation of pressure drop versus flow characteristics ISO 16889Multi-pass method for evaluating filtration performanceNorth America 44 Apple StreetTinton Falls, NJ 07724 Toll Free: 800 656-3344 (North America only) Tel: +1 732 212-4700Europe/Africa/Middle East Auf der Heide 253947 Nettersheim, Germany Tel: +49 2486 809-0 Friedensstraße 4168804 Altlußheim, Germany Tel: +49 6205 2094-0An den Nahewiesen 2455450 Langenlonsheim, GermanyTel: +49 6704 204-0Greater China No. 7, Lane 280, Linhong RoadChangning District, 200335 Shanghai, P.R. China Tel: +86 21 5200-0099Asia-Pacific100G Pasir Panjang Road #07-08 Interlocal Centre Singapore 118523 Tel: +65 6825-1668For more information, please email us at ********************or visit /filtration© 2021 Eaton. All rights reserved. All trademarks and registered trademarks are the property of their respective owners. All information and recommendations appearing in this brochure concerning the use of products described herein are based on tests belie ved to be reliable. However, it is the user’s responsibility to determine the suitability for his own use of such products. Since the actual use by others is beyond our control, no guarantee, expressed or implied, is made by Eaton as to the effects of such use or the results to be obtained. Eaton assumes no liability arising out of the use by others of such products. Nor is the information herein to be construed as absolutely complete, since additional information may be necessary or desirable when particular or exceptional conditions or circumstances exist or because of applicable laws or government regulations.。
基于改进扫描线逼近的鱼眼图轮廓提取算法的研究
基于改进扫描线逼近的鱼眼图轮廓提取算法的研究韩迎辉【摘要】鱼眼镜头可以克服普通镜头视场小的缺点,但是鱼眼图像具有严重的桶形畸变,在利用鱼跟图像信息之前需要对鱼眼图像进行校正展开.鱼眼图像轮廓的提取是图像校正前至关重要的步骤,影响最终校正的效果.其中扫描线逼近算法计算量最小,速度最快,应用最广,但仍具有抗噪能力差的缺陷.针对此问题,提出了一种改进算法,引入扫描步长采用新的扫描方案提高算法的运行速度、采用新的阈值计算方法实现自动阈值的选取、并利用局部二值化去噪思想抑制噪声.实验结果表明在不增加算法运行时间前提下改进算法能够有效改善鱼眼图像轮廓的提取效果,有较强的实用价值.【期刊名称】《电子器件》【年(卷),期】2013(036)006【总页数】5页(P784-788)【关键词】鱼眼图像;扫描线逼近;扫描步长;自动阈值;局部二值化【作者】韩迎辉【作者单位】常州大学城常州轻工职业技术学院,江苏常州213164【正文语种】中文【中图分类】TP391.41为了获得超宽视角,鱼眼镜头被大量地应用在群组视频会议[1]、大范围监控系统、智能交系统、虚拟实景技术[2],全景浏览及球面电影等领域。
鱼眼镜头可以克服普通镜头视场小的缺点,但是鱼眼图像具有严重的桶形畸变,在利用鱼眼图像信息之前需要对鱼眼图像进行校正展开。
目前鱼眼图像轮廓提取的算法主要有最小二乘拟合法,面积统计法,区域增长法和扫描线逼近法。
最小二乘拟合法[3]提取鱼眼轮廓的计算量大并且不一定精确。
面积统计法[4]原理简单但是当有效区域内特别是靠近轮廓有大量黑色像素点存在时计算误差偏大,而且计算量相对较大,因此适用范围也有限。
区域增长法[5]计算复杂,耗时长,且不一定能够取得理想结果,更不实用。
相比之下,扫描线逼近算法[6-8]的效率最高,效果也比较好,应用广泛,但仍具有抗噪能力差的缺陷,有些轮廓提取效果还可以提升。
本文在已有的扫描线逼近算法的基础上提出了一种新的鱼眼轮廓提取算法,该算法有很强的抗噪能力,能够精确定位鱼眼图像的中心和半径,在提高鱼眼轮廓提取精度的同时能够保证不增加运行时间,具有很强的实用价值。
TFT-H050A7SVISTKN40 液晶模组说明书
正面/FRONT背面/REAR3.2接线说明Wiring instructions电源POWERGND VCI 微控制器D2[P/N]D0[P/N]D1[P/N]DCK[P/N]D3[P/N]RESX TFT moduleMODE MCULEDA LEDK背光Backlight TFT 模组LENS SHLR UPDN DISP LVDS 4 Lane/JEIDA ModeVCI VCI电源POWERGND VCI 微控制器D2[P/N]D0[P/N]D1[P/N]DCK[P/N]D3[P/N]RESX TFT moduleMODE MCULEDA LEDK背光BacklightTFT 模组LENS SHLR UPDN DISP LVDS 4 Lane/VESA ModeGND VCI电源POWERGND VCI 微控制器D2[P/N]D0[P/N]D1[P/N]DCK[P/N]D3[P/N]RESX TFT moduleMODE MCULEDA LEDK背光Backlight TFT 模组LENS SHLR UPDN DISP LVDS 3 Lane/JEIDA ModeVCI GND电源POWERGND VCI 微控制器D2[P/N]D0[P/N]D1[P/N]DCK[P/N]D3[P/N]RESX TFT moduleMODE MCULEDA LEDK背光BacklightTFT 模组LENS SHLR UPDN DISP LVDS 3 Lane/VESA ModeGND GNDGND GND2.对比度测量应在θ=0的视角和LCD 表面的中心进行。
亮度测量时,视场中的所有像素首先设置为白色,然后设置为暗(黑色)状态。
(参见图1)亮度对比度(CR)是通过数学定义的。
Contrast measurements shall be made at viewing angle of Θ=0and at the center of the LCD surface.Luminance shall be measured with all pixels in the view field set first to white,then to the dark (black)state .(see FIGUR 1)Luminance Contrast Ratio (CR)is defined mathematically.3.透射率是没有APF 和没有CG 的值。
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
Anselm Haselhoff and Anton Kummert
Abstract— In this work a vision-based lane detection system is presented. The main contribution is the application of 2D line filters for lane detection which suppress noise in the near distance without destroying lines in the fare distance. Line pairs are decided to belong to lane markings by means of parallelism is world coordinates and reasonable constraints concerning the road width. The polygon constructed from the detected lane markings is then tracked over time via a kalman filter. Results of the approach are presented in terms of the detection accuracy on a labeled video sequence.
Faculty of Electrical, Information and Media Engineering, Chair of Communication Theory, University of Wuppertal, 42097 Wuppertal, Germany, haselhoff@uni-wuppertal.de, kummert@uni-wuppertal.de
˜ the 2D summation is Finally, for the two vectors r and r defined as
˜ ≤r r
˜ ) = ∑ ∑ f (x ˜, y ˜). ∑ f (r
x ˜=0 y ˜=0
x
y
(2)
III. L ANE D ETECTION S YSTEM In general the lane detection system is based on 2D line filters to extract lane features, the linear lane model presented in [4] and finally a kalman filter to track the lanes over time. To perform the detection a monochrome video camera with a resolution of 640 × 480 is used that is mounted at the position of the rear-view mirror. The whole lane detection system is visualized in figure 1. The components of the system are described in the follow subsections.
978-1-4244-2798-7/09/$25.00 ©2009 IEEE
0
m-1
M-1
x
0
x0 r1 A
x0+m-1 r2
x
y0
n-1 N-1 y
r rmax
y0+g. 2: Flat ground assumption.
B. 2D Line or Haar Filter The use of 2D line or Haar-like filters for object detection was first introduced by Viola & Jones [6]. The advantage of these features is a very fast computation due to the use of the integral image. One rectangular filter mask that is important for line detection is shown in figure 3. The filter output is calculated as a weighted sum of gray-level values contained in the black and white rectangles. The filter in Fig. 3 is related to the discrete filter approximation of a second order derivative and thus is denoted by vertical line filter hxx (r).
I. I NTRODUCTION Today’s vehicles contain driver assistance systems which observe the environment of the vehicle to provide road traffic with more comfort, safety, and efficiency. One example are lane detection and lane departure warning systems. Amongst others, video sensors are subject of current research in the lane detection context since video sensors enable sophisticated image processing algorithms and keep the costs to an affordable limit. The video camera facilitates application of various detection algorithms. The lane detection in [1] uses the sobel operator for edge detection and to handle fine and coarse structures they use an image pyramid. In [2] the edge detection is performed by the Canny filter and a fixed Gaussian filter is used for noise suppression. The work of [3] uses a Canny filter and in addition a Laplacian of Gaussian (LoG) edge filter. In best case these approaches handle noise by a fixed low-pass filter which is the same for high and low distances in world coordinates. The drawback is that fine structures in the fare distance can get lost. A good approach to handle the noise is presented in [4], where a stereo-camera is used in combination with distance dependent 1D edge filters. The lane detected is done by means of a Dark-LightDark (DLD) transition, where a gradient pair must have similar magnitude but an opposable sign. In our method we follow a similar strategy as in [4]. The advantage of our method is the usage of the integral image for image filtering, which can be used in other applications like vehicle [5] or pedestrian detection. The line filters enable that DLD transitions can be directly indentified by the sign of the filtered image with no additional computation. The remainder of the paper proceeds as follows. After a the definitions in section II the lane detection system, including the line filters, is described in section III. Finally, the results and conclusions are presented in section IV.
II. NOTATION AND BASICS In the following sections f (r) = f (x, y) denotes an M × N image with r = (x, y)T , defined for x = 0, 1, ..., M − 1 and y = 0, 1, ..., N − 1. Furthermore special unit vectors are defined as e1 = (1, 0)T , e2 = (0, 1)T , and e3 = (1, 1)T . ˜ = (x ˜, y ˜)T the lower equal For two vectors r = (x, y)T and r sign is defined as ˜≤r⇒x r ˜ ≤ x∧y ˜ ≤ y. (1)