视网膜功能启发的边缘检测层级模型

郑程驰 1　范影乐 1

摘 要　基于视网膜对视觉信息的处理方式, 提出一种视网膜功能启发的边缘检测层级模型. 针对视网膜神经元在周期性光刺激下产生适应的特性, 构建具有自适应阈值的Izhikevich神经元模型; 模拟光感受器中视锥细胞、视杆细胞对亮度的感知能力, 构建亮度感知编码层; 引入双极细胞对给光−撤光刺激的分离能力, 并结合神经节细胞对运动方向敏感的特性, 构建双通路边缘提取层; 另外根据神经节细胞神经元在多特征调控下延迟激活的现象, 构建具有脉冲延时特性的纹理抑制层; 最后将双通路边缘提取的结果与延时抑制量相融合, 得到最终边缘检测结果. 以150张来自实验室采集和AGAR数据集中的菌落图像为实验对象对所提方法进行验证, 检测结果的重建图像相似度、边缘置信度、边缘连续性和综合指标分别达到0.9629、0.3111、0.9159和0.7870, 表明所提方法能更有效地进行边缘定位、抑制冗余纹理、保持主体边缘完整性. 本文面向边缘检测任务, 构建了模拟视网膜对视觉信息处理方式的边缘检测模型, 也为后续构建由视觉机制启发的图像计算模型提供了新思路.

关键词　边缘检测, 视网膜, Izhikevich模型, 神经编码, 方向选择性神经节细胞

引用格式　郑程驰, 范影乐. 视网膜功能启发的边缘检测层级模型. 自动化学报, 2023, 49(8): 1771−1784
DOI　10.16383/j.aas.c220574

Multi-layer Edge Detection Model Inspired by Retinal Function

ZHENG Cheng-Chi 1　FAN Ying-Le 1

Abstract　Based on the processing of visual information by the retina, this paper proposes a multi-layer model of edge detection inspired by retinal functions. Aiming at the adaptive characteristics of retinal neurons under periodic light stimulation, an Izhikevich neuron model with adaptive threshold is established; By simulating the perception ability of cones and rods for luminance and color in photoreceptors, the luminance perception coding layer is constructed; By introducing the ability of bipolar cells for separating light stimulation, and combining with the characteristics of ganglion cells sensitive to the direction of movement, a multi-pathway edge extraction layer is constructed; In addition, according to the phenomenon of delayed activation of ganglion cell neurons under multi-feature regulation, a texture inhibition layer with pulse delay characteristics is constructed; Finally, by fusing the result of multi-pathway edge extraction with the delay suppression amount, the final edge detection result is obtained. The 150 colony images from laboratory collection and AGAR dataset are used as experimental objects to test the proposed method.
The reconstruction image similarity, edge confidence, edge continuity and comprehensive indicators of the detection results are 0.9629, 0.3111, 0.9159 and 0.7870, respectively. The results show that the proposed method can better localize edges, suppress redundant textures, and maintain the integrity of subject edges. This research is oriented to the task of edge detection, constructs an edge detection model that simulates the processing of visual information by the retina, and also provides new ideas for the construction of image computing models inspired by visual mechanisms.

Key words　Edge detection, retina, Izhikevich model, neural coding, direction-selective ganglion cells (DSGCs)

Citation　Zheng Cheng-Chi, Fan Ying-Le. Multi-layer edge detection model inspired by retinal function. Acta Automatica Sinica, 2023, 49(8): 1771−1784

收稿日期 2022-07-14　录用日期 2022-11-29
Manuscript received July 14, 2022; accepted November 29, 2022
国家自然科学基金(61501154)资助
Supported by National Natural Science Foundation of China (61501154)
本文责任编委 张道强
Recommended by Associate Editor ZHANG Dao-Qiang
1. 杭州电子科技大学模式识别与图像处理实验室 杭州 310018
1. Laboratory of Pattern Recognition and Image Processing, Hangzhou Dianzi University, Hangzhou 310018
第 49 卷 第 8 期　自 动 化 学 报　Vol. 49, No. 8　2023 年 8 月　ACTA AUTOMATICA SINICA　August, 2023

边缘检测作为目标分析和识别等高级视觉任务的前级环节, 在图像处理和工程应用领域中有重要地位. 以Sobel和Canny为代表的传统方法大多根据相邻像素间的灰度跃变进行边缘定位, 再设定阈值调整边缘强度和冗余细节[1]. 虽然易于计算且快速, 但无法兼顾弱边缘感知与纹理抑制之间的有效性, 难以满足复杂环境下的应用需要. 随着对生物视觉系统研究的进展, 人们对视觉认知的过程和视觉组织的功能有了更深刻的了解. 许多国内外学者在这些视觉组织宏观作用的基础上, 进一步考虑神经编码方式与神经元之间的相互作用, 并应用于边缘检测中. 这些检测方法大多首先会选择合适的神经元模型模拟视觉组织细胞的群体放电特性, 再关联例如视觉感受野和方向选择性等视觉机制, 以不同的编码方式将输入的图像转化为脉冲信号, 经过多级功能区块处理和传递后提取出图像的边缘. 其中, 频率编码和时间编码是视觉系统编码光刺激的重要方式, 在一些计算模型中被广泛使用. 例如, 文献[2]以HH (Hodgkin-Huxley)神经元模型为基础, 使用多方向Gabor滤波器模拟神经元感受野的方向选择性, 实现神经元间连接强度关联边缘方向, 将每个神经元的脉冲发放频率作为边缘检测的结果输出, 实验结果表明其比传统方法更有效; 文献[3]在LIF (Leaky integrate-and-fire)神经元模型的基础上进行改进, 引入根据神经元响应对外界输入进行调整的权值, 在编码的过程中将空间的脉冲发放转化为时序上的激励强度, 实现强弱边缘分类, 对梯度变化幅度小的弱边缘具有良好的检测能力.
除此之外, 也有关注神经元突触间的相互作用, 通过引入使突触的连接权值产生自适应调节的机制来提取边缘信息的计算方法. 例如, 文献[4]构建具有STDP (Spike-timing-dependent plasticity)性质的神经元模型, 根据突触前后神经元首次脉冲发放时间顺序来增强或减弱突触连接, 对真伪边缘具有较强的辨别能力; 文献[5]则在构建神经元模型时考虑了具有时间不对称性的STDP机制, 再融合方向特征和侧抑制机制重建图像的主要边缘信息, 其计算过程对神经元突触间的动态特性描述更加准确. 更进一步, 神经编码也被应用于实际的工程需要. 例如, 文献[6]针对现有的红外图像边缘检测算法中存在的缺陷, 构建一种新式的脉冲神经网络, 增强了对红外图像中弱边缘的感知; 文献[7]则通过模拟视皮层的处理机制, 使用包含左侧、右侧和前向3条并行处理支路的脉冲神经网络模型提取脑核磁共振图像的边缘, 并将提取的结果用于异常检测, 同样具有较好的效果. 上述方法都在一定程度上考虑了视觉组织中神经元的编码特性以及视觉机制, 与传统方法相比, 在对复杂环境的适应性更强的同时也有较高的计算效率. 但这些方法都未能考虑到神经元自身也会随着外界刺激产生适应, 从而使活动特性发生改变. 此外, 上述方法大多也只选择了频率编码、时间编码等编码方式中的一种, 并不能完整地体现视觉组织中多种编码方式的共同作用. 事实上, 在对神经生理实验和理论的持续探索中发现, 视觉组织(以视网膜为例)在对视觉刺激的加工中就存在着丰富的动态特性和编码机制[8−9]. 视网膜作为视觉系统中的初级组织结构, 由多种不同类型的细胞构成, 共同组成一个纵横相连、具有层级结构的复杂网络, 能够针对不同类型的刺激选择相应的编码方式进行有效处理. 因此, 本文面向图像的边缘检测任务, 以菌落图像处理为例, 模拟视网膜中各成分对视觉信息的处理方式, 构建基于视网膜动态编码机制的多层边缘检测模型, 以适应具有多种形态结构差异的菌落图像边缘检测任务.

1 材料和方法

本文提出的算法流程如图1所示. 首先, 根据视网膜神经元在周期性光刺激下脉冲发放频率发生改变的特性, 构建具有自适应阈值特性的Izhikevich神经元模型, 改善神经元的同步发放能力; 其次, 考虑光感受器对强弱光和颜色信息的不同处理方式编码亮度信息, 实现不同亮度水平目标与背景的区分; 然后, 引入固视微动机制, 结合神经节细胞的方向选择性和给光−撤光通路的传递特性, 将首发脉冲时间编码的结果作为双通路的初级边缘响应输出; 随后, 模拟神经节细胞的延迟发放特性, 融入对比度和突触前后偏好方向差异, 计算各神经元的延时抑制量, 对双通路的计算结果进行纹理抑制; 最后, 整合双通路边缘信息, 将二者融合为最终的边缘检测结果.

1.1 亮度感知编码层

构建神经元模型时, 本文综合考虑对神经元生理特性模拟的合理性和进行仿真计算的高效性, 以Izhikevich模型[10]为基础构建神经元模型. Izhikevich模型由Izhikevich在HH模型的基础上简化而来, 在保留原模型对神经元放电模式描述的准确性的同时, 也具有较低的时间复杂度, 适合神经元群体计算时应用, 其表达式如下式所示

$$\frac{\mathrm{d}v}{\mathrm{d}t} = 0.04v^2 + 5v + 140 - u + I, \qquad \frac{\mathrm{d}u}{\mathrm{d}t} = a(bv - u) \tag{1}$$

其中, v为神经元的膜电位, 其初始值设置为−70; u为细胞膜恢复变量, 设置为14; I为接收的图像亮度刺激; $v_{th}$ 为神经元脉冲发放的阈值, 设置为30; a描述恢复变量u的时间尺度, b描述恢复变量u对膜电位在阈值下波动的敏感性, c和d分别描述产生脉冲发放后膜电位v的重置值和恢复变量u的增加程度, a, b, c, d这4个模型参数的典型值分别为0.02、0.2、−65和6. 若某时刻膜电位v达到 $v_{th}$, 则进行一次脉冲发放, 同时该神经元对应的v被重置为c, u被重置为u + d.

适应是神经系统中广泛存在的现象, 具体表现为神经元会根据外界的刺激不断地调节自身的性质. 其中, 视网膜能够适应昼夜环境中万亿倍范围的光照变化, 这种适应能够帮助其在避免饱和的同时保持对光照的敏感性[11].
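式(1)所示的Izhikevich模型可用如下Python代码示意(示意性实现: 采用欧拉法离散, 仿真步长与时长为本示例的假设参数, 并非论文给出的取值):

```python
def izhikevich_spikes(I, T=1000, dt=0.25, a=0.02, b=0.2, c=-65.0, d=6.0, v_th=30.0):
    """对恒定亮度刺激 I 仿真单个 Izhikevich 神经元, 返回 T 个时间步内的脉冲发放次数。"""
    v, u = -70.0, 14.0          # 膜电位与恢复变量的初始值
    fires = 0
    for _ in range(T):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
        if v >= v_th:           # 膜电位达到阈值: 记一次脉冲发放并复位
            fires += 1
            v, u = c, u + d
    return fires
```

刺激越强, 发放次数越多; 该示例仅演示基础模型的放电机制, 未包含后文的自适应阈值调节.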
研究表明, 视网膜持续接受外界周期性光刺激时, 光感受器会使神经元细胞的活动特性发生改变, 导致单个神经元的发放阈值上升, 放电频率下降; 没有脉冲发放时, 对应阈值又会以指数形式衰减, 同时放电频率逐渐恢复[12]. 因此, 本文在Izhikevich模型的基础上作出改进, 加入根据脉冲发放频率对阈值进行自适应调节的机制, 如下式所示

$$v_{th}(t) = \begin{cases} v_{th}(t-1) + \dfrac{1}{\tau_1}, & \text{产生脉冲发放} \\ v_{th}(0) + \left(v_{th}(t-1) - v_{th}(0)\right)\mathrm{e}^{-1/\tau_2}, & \text{无脉冲发放} \end{cases} \tag{2}$$

其中, $\tau_1$ 和 $\tau_2$ 分别为脉冲发放和未发放时阈值变化的时间常数, 其值越小, 阈值变化的幅度越大, 神经元敏感性变化的过程越快; 反之, 则表示阈值变化的幅度越小, 神经元敏感性变化的过程也就越慢. 生理学实验表明, 在外界持续光刺激下, 神经元对刺激产生适应导致放电频率降低后, 这种适应衰退的过程比产生适应的过程通常要长数倍[13]. 因此, 为了在准确模拟生理特性的同时保证计算模型的性能, 本文将 $\tau_1$ 和 $\tau_2$ 分别设置为20和40. 这样, 当某时刻某个神经元产生脉冲发放时, 则对应阈值 $v_{th}$ 根据 $\tau_1$ 的值升高, 神经元产生适应, 活跃度降低; 反之, 对应阈值 $v_{th}$ 根据 $\tau_2$ 的值下降, 神经元的适应衰退, 活跃度提升. 实现限制活跃神经元的脉冲发放频率, 促进不活跃神经元的脉冲发放, 改善神经元群体的同步发放能力, 减少检测目标内部冗余.

图 1　边缘检测算法原理图 (Fig. 1　Principle of edge detection algorithm)

图2显示了改进前后的Izhikevich模型对图像进行处理后目标内部冗余情况.

为了规范检测目标图像的亮度范围, 本文将输入的彩色图像Img各通路的亮度映射到0 ~ 255区间内, 如下式所示

$$I(x,y;i) = 255 \times \frac{Img(x,y;i) - \min(Img(;i))}{\max(Img(;i)) - \min(Img(;i))} \tag{3}$$

其中, Img(;i) 和 I(;i) 表示经亮度映射前和映射后的R、G、B三种颜色分量图像; max(·) 和 min(·) 分别计算对应分量图像中的最大和最小像素值.

光感受器分两类, 分别为视锥细胞和视杆细胞[14], 都能将接收到的视觉刺激转化为电信号, 实现信息的编码和传递. 其中, 视锥细胞能够根据外界光刺激的波长来分解为三个不同的颜色通道[15]. 考虑到人眼对颜色信息的敏感性能有效区分离散目标与背景, 令图像中的每个像素点对应一个神经元, 将R、G、B三种颜色分量图像分别输入上文构建的神经元模型中, 在一定时间范围内进行脉冲发放, 如下式所示

$$fires(x,y;i) = \mathrm{Izhikevich}\left(I(x,y;i)\right) \tag{4}$$

其中, fires(x,y;i) 为每个神经元的脉冲发放次数, 函数 Izhikevich(·) 表示式(2)给出的神经元模型.

视杆细胞对光线敏感, 主要负责弱光环境下的外界刺激感知. 当光刺激足够强时, 视杆细胞的感知能力达到饱和, 视觉系统转为使用视锥细胞负责亮度信息的处理[16]. 因此, 除了对颜色信息敏感外, 视锥细胞对强光也有高度辨别能力. 考虑到作为检测对象的图像中, 目标与背景具有不同的亮度水平, 本文构建一种综合视锥细胞和视杆细胞亮度感知能力的编码方法, 以适应目标与背景不同亮度对比的多种情况, 如下式所示

$$I_{base} = I\left(;\,\arg\max_{i}\, \mathrm{var}\left(I(;i)\right)\right) \tag{5}$$

$$fires_{Res}(x,y) = \begin{cases} \min\limits_{i}\, fires(x,y;i), & I_{base}(x,y) < \mathrm{ave}(I_{base}) \\ \max\limits_{i}\, fires(x,y;i), & \text{其他} \end{cases} \tag{6}$$

其中, var(·) 计算图像亮度方差; ave(·) 计算图像亮度均值. 本文取三种颜色分量图像中方差最大的一幅作为基准图像 $I_{base}$, 对于其中的像素值 $I_{base}(x,y)$, 将其中亮度低于平均亮度的部分设置为三种颜色分量脉冲发放结果的最小值, 反之设置为最大值, 最终得到模型的亮度编码结果 $fires_{Res}(x,y)$, 实现在图像局部亮度相对较低的区域由视杆细胞进行弱光感知, 亮度较高区域由视锥细胞处理, 强化计算模型对不同亮度目标和背景的区分能力, 凸显具有弱边缘的对象. 图3显示了亮度感知编码对存在弱边缘的对象的感知能力.

1.2 基于固视微动的多方向双通路边缘提取层

人眼注视目标时, 接收的图像并非是静止的, 而是眼球以每秒2至3次的微动使投射在视网膜上的图像发生持续运动, 不断地改变照射在光感受器上的光刺激[17].
本文考虑人眼的固视微动机制, 在原图像的灰度图像 $Img_{gray}$ 上构建大小为3×3的微动作用窗口temp, 使窗口接收到的亮度信息朝8个方向进行微动, 如下式所示

$$\theta_i = \arctan\!\left(\frac{q_i\, d_y}{p_i\, d_x}\right) \tag{7}$$

$$Dir(x,y) = \arg\max_{\theta_i}\left|\mathrm{sum}\left(temp_{\theta_i}\right) - \mathrm{sum}(temp)\right| \tag{8}$$

其中, $p_i$ 和 $q_i$ 是用于决定微动方向 $\theta_i$ 的参数, 其值被设置为−1、0或1, 通过计算反正切函数能够得到以45°为单位、从0°到315°的8个角度的微动方向, 对应8个微动结果窗口 $temp_{\theta_i}$; $d_x$ 和 $d_y$ 分别表示水平和竖直方向的微动尺度; Dir 为计算得到的微动方向矩阵, 其中每个像素点的值为 Dir(x,y); sum(·) 计算窗口中像素值的和. 本文取每个微动窗口前后差异最大的方向作为该点的偏好方向, 分别用数字1 ~ 8表示.

图 2　改进前后的Izhikevich模型对图像进行脉冲发放的结果对比图 ((a) 原图; (b) Izhikevich模型; (c) 改进的Izhikevich模型)
(Fig. 2　Comparison of the image processing results of the Izhikevich model before and after improvement)

视网膜存在一类负责对运动刺激编码、具有方向选择性的神经节细胞 (Direction-selective ganglion cells, DSGCs)[18]. 经过光感受器处理、转化为电信号的视觉信息, 通过双极细胞处理后传递给神经节细胞. 双极细胞可分为由光照增强 (ON) 激发的细胞和由光照减弱 (OFF) 激发的细胞[19], 分别将信号输入给光通路 (ON-pathway) 和撤光通路 (OFF-pathway) 两条并行通路[20], 传递给光运动和撤光运动产生的刺激. 而神经节细胞同样包括ON和OFF两种, 会对给光和撤光所产生的运动方向做出反应[21]. 因此, 本文构造5×5大小的对特定方向微动敏感的神经节细胞感受野窗口(以偏好方向为45°的方向选择性神经节细胞感受野窗口为例), 将其对偏好方向和反方向微动所产生的响应分别作为给光通路和撤光通路的输入. 通过上述定义, 可以形成以45°为单位、从0°到315°的8个方向的感受野窗口, 与上文 $\theta_i$ 的8个方向对应. 之后本文在亮度编码结果 $fires_{Res}$ 上构筑与感受野相同大小的局部窗口 $S_{xy}$, 根据最优方向矩阵Dir对应窗口中心点的方向, 取与其相同和相反方向的感受野窗口和亮度编码结果进行卷积运算 (本文用符号 ∗ 表示卷积运算), 分别作为ON和OFF通道的输入. 考虑到眼球微动能够将静止的空间场景转变为视网膜上的时间信息流, 激活视网膜神经元的发放, 同时ON和OFF两通路也只在光刺激的呈现和撤去的瞬时产生电位发放, 因此本文采用首发脉冲时间作为编码方式, 将 $T_{ON}$ 和 $T_{OFF}$ 定义为两通路首次脉冲发放时间构成的时间矩阵, 并作为初级边缘响应的结果. 将1个单位的发放时间设置为0.25, 当总发放时间大于30时停止计算, 此时还未进行发放的神经元即被判断为非边缘.

1.3 多特征脉冲延时纹理抑制层

视网膜神经节细胞在对光刺激编码的过程中, 外界刺激特征的变化会显著影响神经元的反应时间. 研究发现, 当刺激对比度增大时, 神经元反应延时会减小, 更快速地进行脉冲发放; 反之, 则反应延时增大, 抑制神经元的活性[22]. 除此之外, 方向差异也会影响神经元活动, 突触前后偏好方向相似的神经元更倾向于优先连接, 在受到外界刺激时能够更快被同步激活[23]. 因此, 本文引入视网膜的神经元延时发放机制, 考虑方向和对比度对神经元敏感性的影响, 构造脉冲延时抑制模型. 首先结合局部窗口权重函数计算图像对比度, 其中 $\omega(x_i, y_i)$ 为窗口权重函数, L 为亮度图像.

图 3　不同方式对存在弱边缘的菌落图像的处理结果 ((a) 原图; (b) Izhikevich模型; (c) 改进的Izhikevich模型; (d) 亮度感知编码)
(Fig. 3　Different ways to process the image of colonies with weak edges)
Con 为对比度图像, $S_{xy}$ 为以(x, y)为中心的局部窗口, $(x_i, y_i)$ 为方窗中除中心外的周边像素, ws 为局部方窗的窗长, $\mu = \sum_{x_i, y_i \in S_{xy}} \omega(x_i, y_i)$. 之后考虑局部方窗中心神经元和周边神经元方向差异, 同时用高斯函数模拟对比度大小与延时作用强度之间的关系, 构建脉冲延时抑制模型. 其中, $D_{Dir}(x,y)$ 和 $D_{Con}(x,y)$ 分别表示方向延时抑制量和对比度延时抑制量; $D(x,y)$ 为计算得到的综合延时抑制量; $\Delta Dir(x_i, y_i)$ 为突触前后神经元微动方向的差异, 被定义为 $\min\{|\theta(x_i,y_i) - \theta(x,y)|,\; 2\pi - |\theta(x_i,y_i) - \theta(x,y)|\}$; $\delta$ 用于调节对比度延时抑制量.

将上文计算得到的两个时间矩阵 $T_{ON}$ 和 $T_{OFF}$ 中进行过脉冲发放的神经元与综合延时抑制量相加, 同样设置1个单位的发放时间为0.25, 将经延时作用后总发放时间大于30的神经元设置为不发放, 即判定为非边缘, 反之则判定为边缘. 根据式(19)和式(20)得到两通道边缘检测结果 $Res_{ON}$ 和 $Res_{OFF}$. 最后, 将两通道得到的结果融合, 得到最终边缘响应结果Res, 如式(21)所示.

2 算法流程

基于视网膜对视觉信息的处理顺序和编码特性, 本文构建图4所示的算法流程, 具体步骤如下:

1) 根据视网膜在外界持续周期性光刺激下产生的适应现象, 在式(1)所示的Izhikevich模型上作出改进, 构建如式(2)所示的具有自适应阈值的Izhikevich模型.
2) 根据式(3)将作为检测目标的图像映射到0 ~ 255区间规范亮度范围, 接着分离3种通道的颜色分量, 根据式(4)输入到改进的Izhikevich模型中进行脉冲发放.
3) 根据式(5)的方差计算提取出基准图像, 再结合基准图像根据式(6)对三通道脉冲发放的结果进行亮度感知编码, 得到亮度编码结果.
4) 考虑人眼的固视微动机制, 根据式(7)和式(8)通过原图的灰度图像提取每个神经元的偏好方向, 得到微动方向矩阵, 接着根据式(9)和式(10)构筑8个方向的方向选择性神经节细胞感受野窗口.
5) 根据式(11)和式(12), 将感受野窗口与亮度编码图像作卷积运算, 并输入Izhikevich模型中得到ON和OFF通路的首发脉冲时间矩阵, 作为两通道的初级边缘响应.
6) 根据式(13) ~ 式(15), 结合局部窗口权重计算图像对比度.
7) 考虑对比度和突触前后偏好方向对脉冲发放的延时作用, 根据式(16) ~ 式(18)构建延时纹理抑制模型, 并根据式(19)和式(20)将纹理抑制模型和两通道的初级边缘响应相融合.
8) 根据式(21)将两通路纹理抑制后的结果在神经节细胞处进行整合, 得到最终边缘响应结果.

3 结果

为了验证本文方法用于菌落边缘检测的有效性, 本文选择Canny方法和其他3种同样基于神经元编码的边缘检测方法作为横向对比, 并进行定性、定量分析. 首先, 选择文献[4]提出的基于神经元突触可塑性的边缘检测方法(Synaptic plasticity model, SPM), 用于对比本文方法对弱边缘的增强效果; 其次, 选择文献[24]提出的基于抑制性突触的多层神经元群放电编码的边缘检测方法(Inhibitory synapse model, ISM), 验证本文的延时抑制层在抑制冗余纹理方面的有效性; 然后, 选择文献[25]提出的基于突触连接视通路方向敏感的分级边缘检测方法(Orientation sensitivity model, OSM), 对比本文方法在抑制冗余纹理的同时保持边缘提取完整性上的优势; 最后, 还以本文方法为基础, 选择去除亮度感知编码后的方法(No luminance coding, NLC)作为消融实验, 以验证本文方法模拟光感受器功能的亮度感知编码模块的有效性.

本文使用实验室在微生物学实验中采集的菌落图像和AGAR数据集[26]作为实验对象. 前者具有丰富的颜色和形态结构, 用于检验算法对复杂检测环境的适应性; 后者则存在更多层次强度的边缘信息, 菌落本身与背景的颜色和亮度水平也较为相近, 用于检测算法对颜色、亮度特征和弱边缘的敏感性. 本文通过局部采样生成150张512×512像素大小的测试图像, 其中38张来自实验室采集, 112张来自AGAR数据集.
然后分别使用上文的6种边缘检测算法提取图像边缘, 使每种算法得到150张边缘检测结果, 其中部分检测结果如图5所示. 定性分析图5可知, Canny、SPM和ISM方法在Colony4和Colony5等存在弱边缘的图像中往往会出现大面积的边缘丢失. OSM方法对弱边缘的敏感性强于以上3种方法, 但仍然会出现不同程度的边缘断裂, 且在调整阈值时难以均衡边缘连续性和目标菌落内部冗余. NLC方法同样丢失了Colony4和Colony5中几乎所有的边缘, 对于Colony3也只能检出其中亮度较低的菌落内部, 对于梯度变化不明显的边缘辨别力差. 与其他方法相比, 本文方法检出的边缘更加显著且完整性更高, 对于弱边缘也有很强的检测能力, 在Colony3、Colony4和Colony5等存在多层次水平强弱边缘的菌落图像中能够取得较好的检测结果.

图 4　边缘检测算法流程图 (Fig. 4　The procedure of edge detection algorithm)

图 5　Colony1 ~ Colony5的边缘检测结果 (第1行为原图; 第2行为Canny检测的结果; 第3行为SPM检测的结果; 第4行为ISM检测的结果; 第5行为OSM检测的结果; 第6行为NLC检测的结果; 第7行为本文方法检测的结果)
(Fig. 5　Edge detection results of Colony1 to Colony5 (The first line is original images; The second line is the results of Canny; The third line is the results of SPM; The fourth line is the results of ISM; The fifth line is the results of OSM; The sixth line is the results of NLC; The seventh line is the results of the proposed method))

为了对检测结果进行定量分析并客观评价各方法的优劣, 计算边缘图像重建相似度MSSIM[27]对检测结果进行重建, 并计算重建图像与原图像的相似度作为边缘定位的准确性指标. 首先对检测出的边缘图像做膨胀处理, 之后将原图像上的像素值赋给膨胀后边缘的对应位置, 得到的图像记为ET. 边缘重建利用 ET 上3×3窗口中8个方向的周边像素 $T_k$, 以及窗口中心像素点与周边像素的距离 $d_k$, 计算得到重建图像R. 重建图像的相似度指标如下式所示

$$SSIM = \frac{(2\mu_A \mu_B + C_1)(2\sigma_{AB} + C_2)}{(\mu_A^2 + \mu_B^2 + C_1)(\sigma_A^2 + \sigma_B^2 + C_2)}$$

其中, $\mu_A$ 和 $\mu_B$ 为原图像和重建图像的灰度均值, $\sigma_A$ 和 $\sigma_B$ 为其各自的标准差, $\sigma_{AB}$ 为原图像与重建图像之间的协方差, $C_1$ 和 $C_2$ 为避免分母为零的小常数. 将原图像和重建图像各自分为N个子图, 并分别计算相似度指标SSIM, 得到平均相似度指标MSSIM. 除此之外, 为了验证边缘检测方法检出边缘的真实性和对菌落内部冗余纹理的抑制能力, 本文计算边缘置信度BIdx[28], 根据边缘两侧灰度值的跃变程度判断边缘的真伪, 其中, $\sigma_{ij}$ 为边缘像素在原图像对应位置的邻域标准差, EdgeNum 为边缘像素数量. 另外, 本文进一步计算边缘连续性CIdx[29]来验证检出目标的边缘完整性. 首先将得到的边缘图像E分割为m个区域, 分别计算每个区域中的边缘像素 $E(x_i^k, y_i^k)$ 到其空间中心 $(x_i, y_i)$ 的距离 $d_i^k$, 其中, $c_i^k$ 为边缘连续性的贡献值, D 为阈值, $C_i$ 为第i个区域的像素点的连续性贡献值之和, $n_i$ 为第i个区域边缘像素点数量. 最后, 将计算得到的3个指标融合, 得到综合评价指标EIdx[21], 其中, row和col分别为原图像的行数和列数.
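上文的相似度指标SSIM可用如下Python代码示意(示意性实现; 常数C1、C2取SSIM文献中针对8位灰度图像的常用值, 并非论文给出的取值):

```python
import numpy as np

def ssim(A, B, C1=6.5025, C2=58.5225):
    """按全局均值/方差/协方差计算两幅灰度图像的SSIM相似度(不分块)。"""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    muA, muB = A.mean(), B.mean()
    sA2, sB2 = A.var(), B.var()
    sAB = ((A - muA) * (B - muB)).mean()
    return ((2 * muA * muB + C1) * (2 * sAB + C2)) / \
           ((muA ** 2 + muB ** 2 + C1) * (sA2 + sB2 + C2))
```

文中的MSSIM即把原图像与重建图像各自分为N个子图、逐块计算SSIM后取平均.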
于是, 检测图像的各项性能指标如表1 ~ 表5所示, 图像重建的结果如图6所示.

表 1　不同检测方法下的重建相似度MSSIM (Table 1　MSSIM of different methods)

Serial number   Canny    SPM      ISM      OSM      NLC      本文方法
Colony1         0.7452   0.7725   0.8357   0.9265   0.9175   0.9371
Colony2         0.7951   0.7971   0.8490   0.9528   0.9447   0.9725
Colony3         0.8576   0.8662   0.8314   0.9149   0.8337   0.9278
Colony4         0.9690   0.9827   0.9838   0.9887   0.9893   0.9972
Colony5         0.9634   0.9758   0.9780   0.9771   0.9883   0.9933

表 2　不同检测方法下的边缘置信度BIdx (Table 2　BIdx of different methods)

Serial number   Canny    SPM      ISM      OSM      NLC      本文方法
Colony1         0.4988   0.4618   0.4307   0.5801   0.5058   0.6026
Colony2         0.1821   0.1537   0.1553   0.3365   0.4615   0.4479
Colony3         0.1983   0.1510   0.1610   0.2634   0.1263   0.3257
Colony4         0.1631   0.1488   0.1906   0.1437   0.1521   0.2016
Colony5         0.1620   0.1896   0.1902   0.1882   0.1735   0.1654

表 3　不同检测方法下的边缘连续性CIdx (Table 3　CIdx of different methods)

Serial number   Canny    SPM      ISM      OSM      NLC      本文方法
Colony1         0.8377   0.8530   0.8601   0.8676   0.9749   0.9652
Colony2         0.8069   0.8655   0.8533   0.8293   0.9177   0.9518
Colony3         0.8064   0.7408   0.7293   0.8269   0.7764   0.9406
Colony4         0.8143   0.8611   0.9044   0.8430   0.9015   0.9776
Colony5         0.9047   0.8448   0.8632   0.8592   0.8709   0.9571
人工神经网络基础与应用-幻灯片(1)
4.4.2 根据连接方式和信息流向分类
反馈网络
特点
仅在输出层到输入层存在反馈, 即每一个输入节点都有可能接受来自外部的输入和来自输出神经元的反馈, 故可用来存储某种模式序列。
(网络结构示意: 输入为 x1, x2, …, xn, 输出为 y1, y2, …, yn)
应用
神经认知机, 动态时间序列过程的神经网络建模
4.4.2 根据连接方式和信息流向分类
w_ij: 从u_i到x_j的连接权值(注意其下标与方向);
s_i: 外部输入信号;
y_i: 神经元的输出
4.3.2 人工神经元的激励函数
阈值型
$$f(Net_i) = \begin{cases} 1, & Net_i \ge 0 \\ 0, & Net_i < 0 \end{cases}$$

分段线性型
$$f(Net_i) = \begin{cases} 0, & Net_i \le Net_{i0} \\ k\,(Net_i - Net_{i0}), & Net_{i0} < Net_i \le Net_{i1} \\ f_{max}, & Net_i > Net_{i1} \end{cases}$$
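上述激励函数可用如下Python代码示意(示意性实现, 函数与参数命名为本示例的假设):

```python
def threshold_fn(net):
    """阈值型: 净输入非负时输出 1, 否则输出 0。"""
    return 1.0 if net >= 0 else 0.0

def piecewise_linear(net, net0, net1, k, f_max):
    """分段线性型: 低于 net0 输出 0, 在 (net0, net1] 内线性增长, 超过 net1 饱和于 f_max。"""
    if net <= net0:
        return 0.0
    if net <= net1:
        return k * (net - net0)
    return f_max
```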
典型网络
回归神经网络(RNN)
第4.5节 人工神经网络的学习
连接权的确定方法:
(1) 根据具体要求, 直接计算出来, 如Hopfield网络作优化计算时就属于这种情况。
(2) 通过学习得到的, 大多数人工神经网络都用这种方法。

学习实质: 针对一组给定输入Xp (p=1,2,…,N), 通过学习使网络动态改变权值, 从而使其产生相应的期望输出Yd的过程。
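以最简单的单层感知器为例, "通过学习动态改变权值、使网络对给定输入产生期望输出"的过程可用如下Python代码示意(示意性实现, 学习率与迭代次数为假设参数):

```python
import numpy as np

def perceptron_train(X, Yd, epochs=100, lr=0.1):
    """感知器学习: 逐样本比较实际输出与期望输出 Yd, 按误差修正权值。
    X 的每行为一个输入样本(首列为偏置分量), Yd 为期望输出(0/1)。"""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, yd in zip(X, Yd):
            y = 1.0 if x @ w >= 0 else 0.0   # 阈值型激励
            w += lr * (yd - y) * x           # 误差驱动的权值调整
    return w
```

对线性可分的样本(如逻辑与运算), 该规则可在有限次迭代内收敛到一组正确的连接权。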
(图: 生物神经元结构示意, 包括树突、细胞体(细胞膜、细胞核)、轴突、突触、神经末梢, 以及来自其它细胞轴突的神经末梢)
4.2.1 生物神经元的结构
突触: 是神经元之间的连接接口。一个神经元, 通过其轴突的神经末梢, 经突触与另一个神经元的树突连接, 以实现信息的传递。
人工智能领域中英文专有名词汇总
名词解释中英文对比<using_information_sources> social networks 社会网络abductive reasoning 溯因推理action recognition(行为识别)active learning(主动学习)adaptive systems 自适应系统adverse drugs reactions(药物不良反应)algorithm design and analysis(算法设计与分析) algorithm(算法)artificial intelligence 人工智能association rule(关联规则)attribute value taxonomy 属性分类规范automomous agent 自动代理automomous systems 自动系统background knowledge 背景知识bayes methods(贝叶斯方法)bayesian inference(贝叶斯推断)bayesian methods(bayes 方法)belief propagation(置信传播)better understanding 内涵理解big data 大数据big data(大数据)biological network(生物网络)biological sciences(生物科学)biomedical domain 生物医学领域biomedical research(生物医学研究)biomedical text(生物医学文本)boltzmann machine(玻尔兹曼机)bootstrapping method 拔靴法case based reasoning 实例推理causual models 因果模型citation matching (引文匹配)classification (分类)classification algorithms(分类算法)clistering algorithms 聚类算法cloud computing(云计算)cluster-based retrieval (聚类检索)clustering (聚类)clustering algorithms(聚类算法)clustering 聚类cognitive science 认知科学collaborative filtering (协同过滤)collaborative filtering(协同过滤)collabrative ontology development 联合本体开发collabrative ontology engineering 联合本体工程commonsense knowledge 常识communication networks(通讯网络)community detection(社区发现)complex data(复杂数据)complex dynamical networks(复杂动态网络)complex network(复杂网络)complex network(复杂网络)computational biology 计算生物学computational biology(计算生物学)computational complexity(计算复杂性) computational intelligence 智能计算computational modeling(计算模型)computer animation(计算机动画)computer networks(计算机网络)computer science 计算机科学concept clustering 概念聚类concept formation 概念形成concept learning 概念学习concept map 概念图concept model 概念模型concept modelling 概念模型conceptual model 概念模型conditional random field(条件随机场模型) conjunctive quries 合取查询constrained least squares (约束最小二乘) convex programming(凸规划)convolutional neural networks(卷积神经网络) customer relationship management(客户关系管理) data analysis(数据分析)data analysis(数据分析)data center(数据中心)data clustering (数据聚类)data compression(数据压缩)data envelopment analysis (数据包络分析)data fusion 数据融合data 
generation(数据生成)data handling(数据处理)data hierarchy (数据层次)data integration(数据整合)data integrity 数据完整性data intensive computing(数据密集型计算)data management 数据管理data management(数据管理)data management(数据管理)data miningdata mining 数据挖掘data model 数据模型data models(数据模型)data partitioning 数据划分data point(数据点)data privacy(数据隐私)data security(数据安全)data stream(数据流)data streams(数据流)data structure( 数据结构)data structure(数据结构)data visualisation(数据可视化)data visualization 数据可视化data visualization(数据可视化)data warehouse(数据仓库)data warehouses(数据仓库)data warehousing(数据仓库)database management systems(数据库管理系统)database management(数据库管理)date interlinking 日期互联date linking 日期链接Decision analysis(决策分析)decision maker 决策者decision making (决策)decision models 决策模型decision models 决策模型decision rule 决策规则decision support system 决策支持系统decision support systems (决策支持系统) decision tree(决策树)decission tree 决策树deep belief network(深度信念网络)deep learning(深度学习)defult reasoning 默认推理density estimation(密度估计)design methodology 设计方法论dimension reduction(降维) dimensionality reduction(降维)directed graph(有向图)disaster management 灾害管理disastrous event(灾难性事件)discovery(知识发现)dissimilarity (相异性)distributed databases 分布式数据库distributed databases(分布式数据库) distributed query 分布式查询document clustering (文档聚类)domain experts 领域专家domain knowledge 领域知识domain specific language 领域专用语言dynamic databases(动态数据库)dynamic logic 动态逻辑dynamic network(动态网络)dynamic system(动态系统)earth mover's distance(EMD 距离) education 教育efficient algorithm(有效算法)electric commerce 电子商务electronic health records(电子健康档案) entity disambiguation 实体消歧entity recognition 实体识别entity recognition(实体识别)entity resolution 实体解析event detection 事件检测event detection(事件检测)event extraction 事件抽取event identificaton 事件识别exhaustive indexing 完整索引expert system 专家系统expert systems(专家系统)explanation based learning 解释学习factor graph(因子图)feature extraction 特征提取feature extraction(特征提取)feature extraction(特征提取)feature selection (特征选择)feature selection 特征选择feature selection(特征选择)feature space 特征空间first order logic 一阶逻辑formal logic 
形式逻辑formal meaning prepresentation 形式意义表示formal semantics 形式语义formal specification 形式描述frame based system 框为本的系统frequent itemsets(频繁项目集)frequent pattern(频繁模式)fuzzy clustering (模糊聚类)fuzzy clustering (模糊聚类)fuzzy clustering (模糊聚类)fuzzy data mining(模糊数据挖掘)fuzzy logic 模糊逻辑fuzzy set theory(模糊集合论)fuzzy set(模糊集)fuzzy sets 模糊集合fuzzy systems 模糊系统gaussian processes(高斯过程)gene expression data 基因表达数据gene expression(基因表达)generative model(生成模型)generative model(生成模型)genetic algorithm 遗传算法genome wide association study(全基因组关联分析) graph classification(图分类)graph classification(图分类)graph clustering(图聚类)graph data(图数据)graph data(图形数据)graph database 图数据库graph database(图数据库)graph mining(图挖掘)graph mining(图挖掘)graph partitioning 图划分graph query 图查询graph structure(图结构)graph theory(图论)graph theory(图论)graph theory(图论)graph theroy 图论graph visualization(图形可视化)graphical user interface 图形用户界面graphical user interfaces(图形用户界面)health care 卫生保健health care(卫生保健)heterogeneous data source 异构数据源heterogeneous data(异构数据)heterogeneous database 异构数据库heterogeneous information network(异构信息网络) heterogeneous network(异构网络)heterogenous ontology 异构本体heuristic rule 启发式规则hidden markov model(隐马尔可夫模型)hidden markov model(隐马尔可夫模型)hidden markov models(隐马尔可夫模型) hierarchical clustering (层次聚类) homogeneous network(同构网络)human centered computing 人机交互技术human computer interaction 人机交互human interaction 人机交互human robot interaction 人机交互image classification(图像分类)image clustering (图像聚类)image mining( 图像挖掘)image reconstruction(图像重建)image retrieval (图像检索)image segmentation(图像分割)inconsistent ontology 本体不一致incremental learning(增量学习)inductive learning (归纳学习)inference mechanisms 推理机制inference mechanisms(推理机制)inference rule 推理规则information cascades(信息追随)information diffusion(信息扩散)information extraction 信息提取information filtering(信息过滤)information filtering(信息过滤)information integration(信息集成)information network analysis(信息网络分析) information network mining(信息网络挖掘) information network(信息网络)information processing 信息处理information processing 信息处理information 
resource management (信息资源管理) information retrieval models(信息检索模型) information retrieval 信息检索information retrieval(信息检索)information retrieval(信息检索)information science 情报科学information sources 信息源information system( 信息系统)information system(信息系统)information technology(信息技术)information visualization(信息可视化)instance matching 实例匹配intelligent assistant 智能辅助intelligent systems 智能系统interaction network(交互网络)interactive visualization(交互式可视化)kernel function(核函数)kernel operator (核算子)keyword search(关键字检索)knowledege reuse 知识再利用knowledgeknowledgeknowledge acquisitionknowledge base 知识库knowledge based system 知识系统knowledge building 知识建构knowledge capture 知识获取knowledge construction 知识建构knowledge discovery(知识发现)knowledge extraction 知识提取knowledge fusion 知识融合knowledge integrationknowledge management systems 知识管理系统knowledge management 知识管理knowledge management(知识管理)knowledge model 知识模型knowledge reasoningknowledge representationknowledge representation(知识表达) knowledge sharing 知识共享knowledge storageknowledge technology 知识技术knowledge verification 知识验证language model(语言模型)language modeling approach(语言模型方法) large graph(大图)large graph(大图)learning(无监督学习)life science 生命科学linear programming(线性规划)link analysis (链接分析)link prediction(链接预测)link prediction(链接预测)link prediction(链接预测)linked data(关联数据)location based service(基于位置的服务) loclation based services(基于位置的服务) logic programming 逻辑编程logical implication 逻辑蕴涵logistic regression(logistic 回归)machine learning 机器学习machine translation(机器翻译)management system(管理系统)management( 知识管理)manifold learning(流形学习)markov chains 马尔可夫链markov processes(马尔可夫过程)matching function 匹配函数matrix decomposition(矩阵分解)matrix decomposition(矩阵分解)maximum likelihood estimation(最大似然估计)medical research(医学研究)mixture of gaussians(混合高斯模型)mobile computing(移动计算)multi agnet systems 多智能体系统multiagent systems 多智能体系统multimedia 多媒体natural language processing 自然语言处理natural language processing(自然语言处理) nearest neighbor (近邻)network analysis( 网络分析)network analysis(网络分析)network analysis(网络分析)network 
formation(组网)network structure(网络结构)network theory(网络理论)network topology(网络拓扑)network visualization(网络可视化)neural network(神经网络)neural networks (神经网络)neural networks(神经网络)nonlinear dynamics(非线性动力学)nonmonotonic reasoning 非单调推理nonnegative matrix factorization (非负矩阵分解) nonnegative matrix factorization(非负矩阵分解) object detection(目标检测)object oriented 面向对象object recognition(目标识别)object recognition(目标识别)online community(网络社区)online social network(在线社交网络)online social networks(在线社交网络)ontology alignment 本体映射ontology development 本体开发ontology engineering 本体工程ontology evolution 本体演化ontology extraction 本体抽取ontology interoperablity 互用性本体ontology language 本体语言ontology mapping 本体映射ontology matching 本体匹配ontology versioning 本体版本ontology 本体论open government data 政府公开数据opinion analysis(舆情分析)opinion mining(意见挖掘)opinion mining(意见挖掘)outlier detection(孤立点检测)parallel processing(并行处理)patient care(病人医疗护理)pattern classification(模式分类)pattern matching(模式匹配)pattern mining(模式挖掘)pattern recognition 模式识别pattern recognition(模式识别)pattern recognition(模式识别)personal data(个人数据)prediction algorithms(预测算法)predictive model 预测模型predictive models(预测模型)privacy preservation(隐私保护)probabilistic logic(概率逻辑)probabilistic logic(概率逻辑)probabilistic model(概率模型)probabilistic model(概率模型)probability distribution(概率分布)probability distribution(概率分布)project management(项目管理)pruning technique(修剪技术)quality management 质量管理query expansion(查询扩展)query language 查询语言query language(查询语言)query processing(查询处理)query rewrite 查询重写question answering system 问答系统random forest(随机森林)random graph(随机图)random processes(随机过程)random walk(随机游走)range query(范围查询)RDF database 资源描述框架数据库RDF query 资源描述框架查询RDF repository 资源描述框架存储库RDF storge 资源描述框架存储real time(实时)recommender system(推荐系统)recommender system(推荐系统)recommender systems 推荐系统recommender systems(推荐系统)record linkage 记录链接recurrent neural network(递归神经网络) regression(回归)reinforcement learning 强化学习reinforcement learning(强化学习)relation extraction 关系抽取relational database 关系数据库relational learning 关系学习relevance 
feedback (相关反馈)resource description framework 资源描述框架restricted boltzmann machines(受限玻尔兹曼机) retrieval models(检索模型)rough set theroy 粗糙集理论rough set 粗糙集rule based system 基于规则系统rule based 基于规则rule induction (规则归纳)rule learning (规则学习)rule learning 规则学习schema mapping 模式映射schema matching 模式匹配scientific domain 科学域search problems(搜索问题)semantic (web) technology 语义技术semantic analysis 语义分析semantic annotation 语义标注semantic computing 语义计算semantic integration 语义集成semantic interpretation 语义解释semantic model 语义模型semantic network 语义网络semantic relatedness 语义相关性semantic relation learning 语义关系学习semantic search 语义检索semantic similarity 语义相似度semantic similarity(语义相似度)semantic web rule language 语义网规则语言semantic web 语义网semantic web(语义网)semantic workflow 语义工作流semi supervised learning(半监督学习)sensor data(传感器数据)sensor networks(传感器网络)sentiment analysis(情感分析)sentiment analysis(情感分析)sequential pattern(序列模式)service oriented architecture 面向服务的体系结构shortest path(最短路径)similar kernel function(相似核函数)similarity measure(相似性度量)similarity relationship (相似关系)similarity search(相似搜索)similarity(相似性)situation aware 情境感知social behavior(社交行为)social influence(社会影响)social interaction(社交互动)social interaction(社交互动)social learning(社会学习)social life networks(社交生活网络)social machine 社交机器social media(社交媒体)social media(社交媒体)social media(社交媒体)social network analysis 社会网络分析social network analysis(社交网络分析)social network(社交网络)social network(社交网络)social science(社会科学)social tagging system(社交标签系统)social tagging(社交标签)social web(社交网页)sparse coding(稀疏编码)sparse matrices(稀疏矩阵)sparse representation(稀疏表示)spatial database(空间数据库)spatial reasoning 空间推理statistical analysis(统计分析)statistical model 统计模型string matching(串匹配)structural risk minimization (结构风险最小化) structured data 结构化数据subgraph matching 子图匹配subspace clustering(子空间聚类)supervised learning( 有support vector machine 支持向量机support vector machines(支持向量机)system dynamics(系统动力学)tag recommendation(标签推荐)taxonmy induction 感应规范temporal logic 时态逻辑temporal reasoning 时序推理text analysis(文本分析)text anaylsis 
文本分析text classification (文本分类)text data(文本数据)text mining technique(文本挖掘技术)text mining 文本挖掘text mining(文本挖掘)text summarization(文本摘要)thesaurus alignment 同义对齐time frequency analysis(时频分析)time series analysis( 时time series data(时间序列数据)time series data(时间序列数据)time series(时间序列)topic model(主题模型)topic modeling(主题模型)transfer learning 迁移学习triple store 三元组存储uncertainty reasoning 不精确推理undirected graph(无向图)unified modeling language 统一建模语言unsupervisedupper bound(上界)user behavior(用户行为)user generated content(用户生成内容)utility mining(效用挖掘)visual analytics(可视化分析)visual content(视觉内容)visual representation(视觉表征)visualisation(可视化)visualization technique(可视化技术) visualization tool(可视化工具)web 2.0(网络2.0)web forum(web 论坛)web mining(网络挖掘)web of data 数据网web ontology lanuage 网络本体语言web pages(web 页面)web resource 网络资源web science 万维科学web search (网络检索)web usage mining(web 使用挖掘)wireless networks 无线网络world knowledge 世界知识world wide web 万维网world wide web(万维网)xml database 可扩展标志语言数据库附录 2 Data Mining 知识图谱(共包含二级节点15 个,三级节点93 个)间序列分析)监督学习)领域 二级分类 三级分类。
低层次和高层次特征相结合的人体动作识别
下丘脑影像解剖
也许大家对前、后联合比较陌生, 它们都是连接纤维(白质); 前联合位于终板上方、穹隆前方, 图1下方小图可以看到穹隆前下方一个小小的圆形结构, 其对应的轴位和冠状位如图2; 后联合位于上丘上方, 其对应的轴位和冠状位如图3。至于漏斗, 是下丘脑下方一个"上宽下窄"、漏斗样囊样结构, 连接垂体; 而灰结节是漏斗后方微凸起的灰质结构。
图9 下丘脑重要核团的部分主要功能
下一步? 参考文献总结了部分下丘脑病变的影像, 自行戳原文, 但是原文中下丘脑核团的解剖有些许错误, 读者可自行查看[1]。注: 本文解剖结构主要参考奈特图谱《Netter's Concise Neuroanatomy Updated Edition》(Print ISBN: 9780323480918; eBook ISBN: 9780323482011; Imprint: Elsevier; Published Date: 28th September 2016), 图10为书中附图, 本文第三步中的三个平面即为图10第2、4、6个平面。欢迎大家讨论~
图2 蓝色直线交叉处显示前联合
图3 蓝色直线交叉处显示后联合
第二步:找出下丘脑边界
以前联合下缘、后联合前缘、视交叉、中脑上缘、乳头体及灰结节外缘围成的区域即为下丘脑区域, 如图4。其前界为终板, 即前联合与漏斗之间的薄板结构。
图4 下丘脑大致边界
第三步:定位下丘脑分区和重要核团
根据三个主要冠状位, 定位下丘脑分区和重要核团。教科书(八年制神经病学第三版)将下丘脑分为四区, 分别为视前区(Preoptic region)、视上区(Supraoptic region)、结节区(Tuberal region)和乳头体区(Mammillary region), 需要注意的是视前区英文为preoptic region, 而结节区也被称为anterior region, 翻译时极易混淆。
一种用于图像重构的新型贝叶斯压缩感知技术
应用传统的RVM进行信号重构往往精度非常差。为了提高精度, 文中提出了一种新的BCS技术: 粒子群贝叶斯压缩感知(PSBCS)。实验表明这种新的BCS技术在重构精度上大大超越了传统的BCS技术。
关键词: 贝叶斯压缩感知(BCS); 相关向量机(RVM); 粒子群优化; 局部最优困境; 向量选择
图像特征的选择与提取
设P(j,i)为图像的第j个像素的第i个颜色分量值, 一阶矩为:
$$\mu_i = \frac{1}{N}\sum_{j=1}^{N} P(j,i)$$
即表示待测区域的颜色均值。
二阶矩(Variance)
$$\sigma_i = \left(\frac{1}{N}\sum_{j=1}^{N}\left(P(j,i) - \mu_i\right)^2\right)^{1/2}$$
表示待测区域的颜色方差, 即不均匀性。
三阶矩(Skewness)
$$s_i = \left(\frac{1}{N}\sum_{j=1}^{N}\left(P(j,i) - \mu_i\right)^3\right)^{1/3}$$
表示待测区域颜色分布的偏度。
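前述三个颜色矩可用如下Python代码示意(示意性实现; 对三阶中心矩取立方根时保留符号, 为本示例的处理方式):

```python
import numpy as np

def color_moments(P):
    """P: N×3 数组, P[j, i] 为第 j 个像素的第 i 个颜色分量值。
    返回每个颜色分量的一阶矩(均值)、二阶矩(标准差形式)与三阶矩(偏度形式)。"""
    P = np.asarray(P, dtype=float)
    mu = P.mean(axis=0)                                  # 一阶矩: 颜色均值
    sigma = (((P - mu) ** 2).mean(axis=0)) ** 0.5        # 二阶矩: 不均匀性
    m3 = ((P - mu) ** 3).mean(axis=0)                    # 三阶中心矩
    s = np.sign(m3) * np.abs(m3) ** (1.0 / 3.0)          # 三阶矩: 保留符号的立方根
    return mu, sigma, s
```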
• 设f(i,j)是(i,j)处的像素值, (i,j)位置处的边缘强度通常用差分值或其函数来表示。简单的差分算法有:
• x方向差分值: Δxf(i,j) = f(i,j) − f(i,j−1)
• y方向差分值: Δyf(i,j) = f(i,j) − f(i−1,j)
• 边缘强度 = |Δxf(i,j)| + |Δyf(i,j)|, 或 = Δx²f(i,j) + Δy²f(i,j)
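上述差分形式的边缘强度可用如下Python代码示意(示意性实现, 图像边界处的差分按0处理为本示例的假设):

```python
import numpy as np

def edge_strength(img):
    """按 |Δxf| + |Δyf| 计算每个像素的边缘强度。img 为二维灰度数组。"""
    img = np.asarray(img, dtype=float)
    dx = np.zeros_like(img)
    dy = np.zeros_like(img)
    dx[:, 1:] = img[:, 1:] - img[:, :-1]   # x 方向差分: f(i,j) - f(i,j-1)
    dy[1:, :] = img[1:, :] - img[:-1, :]   # y 方向差分: f(i,j) - f(i-1,j)
    return np.abs(dx) + np.abs(dy)
```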
图像特征
常见的目标特征分为灰度(颜色)、纹理和几何形状特征等。其中, 灰度和纹理属于内部特征, 几何形状属于外部特征。
纹理特征
几何特征,判断凹凸
• 选取的特征应具有如下特点:
• ❖ 可区别性
• ❖ 可靠性
• ❖ 独立性好
• ❖ 数量少
• ❖ 对尺寸、平移、旋转等变换尽可能不敏感
点特征提取
• 点特征主要指图像中的明显点, 如房屋角点、圆点等。用于点特征提取的算子称为有利算子或兴趣算子。
二值图像的边缘特征提取
• 二值图像边缘特征提取的过程实际上是寻找像素灰度值急剧变化的位置的过程, 并在这些位置上将像素值置为"1", 其余位置上的像素值置为"0", 从而求出目标的边界线。二值图像的边缘特征提取是用数学算子实现的, 如Sobel、Prewitt、Kirsch、拉普拉斯等多种算子。这些算子都是以一个3×3的模板与图像中3×3的区域相乘, 得到的结果作为图像中这个区域中心位置的边缘强度。在计算出图像中每一个像素的边缘强度后, 将边缘强度大于一定值的点提取出来, 并赋以像素值"1", 其余赋以像素值"0"。
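上述"3×3模板卷积并按阈值二值化"的过程, 以Sobel算子为例可用如下Python代码示意(示意性实现, 阈值为假设参数):

```python
import numpy as np

def sobel_binary_edges(img, thresh):
    """用 Sobel 3×3 模板计算每个像素的边缘强度, 大于阈值的像素置 1, 其余置 0。"""
    img = np.asarray(img, dtype=float)
    kx = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])  # 水平方向模板
    ky = kx.T                                                      # 竖直方向模板
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            win = img[i - 1:i + 2, j - 1:j + 2]
            strength = abs((win * kx).sum()) + abs((win * ky).sum())
            out[i, j] = 1.0 if strength > thresh else 0.0
    return out
```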
基于没有交集的主成分模型下的模式识别方法-外文文献及翻译
xx工业大学毕业设计(论文)外文资料翻译
学院:　系(专业):　姓名:　学号:
外文出处: Pattern Recognition
附件: 1. 外文资料翻译译文; 2. 外文原文。
指导教师评语:　签名:　2010年6月　日

附件1: 外文资料翻译译文

基于没有交集的主成分模型下的模式识别方法
化学计量学研究组, 化学研究所, Umeå大学

摘要: 通过独立的主成分建模方法对单独种类进行模式识别, 这一方法我们已经进行了深入的研究; 主成分模型拟合了单一种类之内的所有连续变量。
所以,假如数据充足的话,主成分模型的方法可以对指定的一组样品中存在的任何模式进行识别,另外,将每一种类中样品通过独立的主成分模型作出拟合,用这种简单的方式,可以提供有关这些变量作为单一变量的相关性。
这些试样中存在着“离群”,而且不同种类间也有“距离”。
我们应用经典的Fisher鸢尾花数据作为例证。
1介绍对于挖掘和使用经验数据的规律性,已经在像化学和生物这样的学科中成为了首要考虑的因素。
在化学上一个经典的例子就是元素周期表。
当元素按渐增的原子质量排列时,化学元素特性上的规律以每8个为一个周期的出现。
相似的,生物学家也常按照植物和动物形态学上的规律才将其归类。
比如,植物的花朵和叶片的形状,动物两臂的长度和宽度以及动物不同的骨骼等等。
数据分析方法(通常叫做模式识别方法),特别的创制用以探知多维数据的规律性。
这种方法已在科学的各分支上得到了广泛的应用。
模式识别中的经典问题可系统地陈述如下: 给定一些种类, 每一类都被定义为一套样本(训练集和检验集), 以及基于每个样本的M个测度值, 那么是否有可能基于原有的M个测度值对新的样本作出分类呢? 人们已经提出了许多解决这类或相关问题的方法, Kanal等人对这些方法作过综述。
在科学的分支中,比如化学和生物中,数据分析的范围往往比仅获得一组未分类数据广泛,通常上,数据分析的目的之一仍然可说是分类,但有时我们不能确定一个样本是否属于一未知的或未辨明的类别,我们希望不仅去辨别已知种类,还有未知种类。
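这种"对每一类分别建立独立主成分模型、再按拟合残差对新样本分类"的思想(即SIMCA方法)可用如下Python代码示意(示意性实现, 主成分数k为假设参数):

```python
import numpy as np

def fit_class_model(X, k):
    """对单一种类的样本矩阵 X (n×m) 建立 k 个主成分的模型, 返回均值与载荷。"""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def residual(x, model):
    """新样本 x 对某类主成分模型的拟合残差(重构误差的平方和)。"""
    mean, P = model
    d = x - mean
    return float(((d - P.T @ (P @ d)) ** 2).sum())

def classify(x, models):
    """将 x 归入拟合残差最小的种类, 返回种类下标。"""
    return int(np.argmin([residual(x, m) for m in models]))
```

拟合残差同时给出了样本相对各种类的"距离", 可据此识别离群样本或不属于任何已知种类的新样本.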
一种基于稀疏协同原型向量重构的鼻咽癌病理细胞协同分类方法
融合柯西变异和反向学习的改进麻雀算法
融合柯西变异和反向学习的改进麻雀算法毛清华,张强+燕山大学经济管理学院,河北秦皇岛066004+通信作者E-mail:*****************摘要:针对基本麻雀搜索算法在迭代后期种群多样性减小,容易陷入局部极值的问题,提出一种融合柯西变异和反向学习的改进麻雀算法(ISSA )。
首先,采用一种映射折叠次数无限的Sin 混沌初始化种群,为全局寻优奠定基础;其次,在发现者位置更新方式中引入上一代全局最优解,提高全局搜索的充分性,同时加入自适应权重,协调局部挖掘和全局探索的能力,并加快收敛速度;然后,融合柯西变异算子和反向学习策略,在最优解位置进行扰动变异,产生新解,增强算法跃出局部空间的能力;最后,与3种基本算法和2种改进的麻雀算法进行对比,对8个基准测试函数进行仿真实验以及Wilcoxon 秩和检验,评估ISSA 的寻优性能,并对ISSA 进行时间复杂度分析。
结果表明ISSA 与其余5种算法相比,收敛速度更快,精度更高,全局寻优能力得到较大提升。
关键词:麻雀搜索算法;Sin 混沌;自适应;柯西变异;反向学习文献标志码:A中图分类号:TP301.6Improved Sparrow Algorithm Combining Cauchy Mutation and Opposition-Based LearningMAO Qinghua,ZHANG Qiang +School of Economics and Management,Yanshan University,Qinhuangdao,Hebei 066004,ChinaAbstract:Aiming at the problem that the population diversity of basic sparrow search algorithm decreases and it is easy to fall into local extremum in the late iteration,an improved sparrow search algorithm combining Cauchy variation and reverse learning (ISSA)is proposed.Firstly,this paper uses a Sin chaotic initialization population with an unlimited number of mapping folds to lay the foundation for global optimization.Secondly,this paper introduces the previous generation global optimal solution into the discoverer location-update method to enhance the sufficiency of global search.At the same time,the adaptive weight is added to coordinate the ability of local mining and global exploration,and the convergence speed is accelerated.Then,the Cauchy mutation operator and the opposition-based learning strategy are combined to perform disturbance mutation to generate new solutions at the optimal solution position,and the algorithm s ability to jump out of local space is enhanced.Finally,this algorithm is compared with 3basic algorithms and 2improved sparrow algorithms.Simulation and Wilcoxon rank and inspection are performed on 8benchmark test functions.The optimization performance of ISSA is assessed,and time complexity analysis of ISSA is carried out.The results show that ISSA has faster convergence rate and higher precision than the other 5algorithms.And the overall optimization capabilities are improved.Key words:sparrow search algorithm;Sin chaos;adaptive;Cauchy variation;opposition-based learning计算机科学与探索1673-9418/2021/15(06)-1155-10doi:10.3778/j.issn.1673-9418.2010032基金项目:国家自然科学基金(71704151)。
一种基于影像组学特征选择的模型、构建方法和应用[发明专利]
专利名称:一种基于影像组学特征选择的模型、构建方法和应用
专利类型:发明专利
发明人:牛田野,杨婧,罗辰
申请号:CN202010635185.9
申请日:20200703
公开号:CN111814868A
公开日:
20201023
专利内容由知识产权出版社提供
摘要:本发明公开了一种基于影像组学特征选择的模型、构建方法和应用,包括:利用皮尔逊相关系数分析方法去除冗余特征,然后使用序列浮动前向选择算法确定所需的特征子集。
采用逻辑回归分类器构建预测模型,通过自适应搜索策略确定皮尔逊相关系数分析方法和序列浮动前向选择算法的参数,以构建预测临床目标的最优模型。
特征选择方法和模型分类器的合理选择决定了预测临床目标的最终效果,该方法不需要预先设定参数,方法简单直接,计算效率高,是一种对不同疾病均具有参考价值的可重复方法,有潜力作为一种通用的、无创的预测工具指导不同患者的临床决策。
申请人:苏州动影信息科技有限公司
地址:215163 江苏省苏州市高新区科技城锦峰路158号13幢302-1
国籍:CN
代理机构:杭州天勤知识产权代理有限公司
代理人:曹兆霞
更多信息请下载全文后查看。
基于稀疏表示的单帧超分辨率算法
基于稀疏表示的单帧超分辨率算法王馨悦;辛志薇【摘要】目前的基于学习的超分辨率算法大都存在一个问题:图像与样本库差异较大,超分辨的结果就会变得很差.为此提出一种基于稀疏表示的单帧超分辨率算法,使用图像金字塔建立字典.同时,利用不同尺度间存在的重复块训练字典.对于彩色图像,为避免由颜色通道相关性而造成的重建图像质量的下降,在Lab颜色空间对彩色图像进行重建.实验结果表明,该算法可获得更好的视觉效果和更高的峰值信噪比.【期刊名称】《现代计算机(专业版)》【年(卷),期】2016(000)007【总页数】4页(P65-68)【关键词】超分辨;稀疏表示;图像金字塔;Lab颜色空间【作者】王馨悦;辛志薇【作者单位】四川大学计算机学院,成都 610000;四川大学计算机学院,成都610000【正文语种】中文超分辨;稀疏表示;图像金字塔;Lab颜色空间图像超分辨率(Super Resolution,SR)是指从一幅低分辨率图像(Low Resolution,LR)或者一组低分辨图像序列中重构出高分辨率图像(High Resolution,HR)。
高分辨率图像不仅能带给人们更好的视觉享受(因为分辨率越高,细节信息越丰富),还在很多领域有着至关重要的作用。
如:医学领域,更高分辨率的图像能帮助医生更好地判断病人的病情。
一般而言,单幅图像的超分辨率要获得更好的重建效果就需要依赖先验知识,而基于图像序列的超分辨率重建则更多的根据图像降质模型和多幅低分辨图像序列间存在的差异信息估计出图像的高频细节信息。
也因此,这些图像序列需要是关于同一场景且存在亚像素等级上的差异。
通常,将超分辨率重建算法分为三类:基于插值的方法、基于重建的方法和基于学习的方法。
基于插值的算法(如Bicubic插值[1]等)采用某种数学模型拟合数据,以其选中像素点的值结合相应数学公式估计出待插入位置的像素值。
这类算法实现简单,能符合实时应用的需求。
但其重建效果只有在超分辨提高因子较小时比较好。
局部弱sharp最小的几个等价条件
局部弱sharp最小的几个等价条件
李晓杰;李进馨
【期刊名称】《哈尔滨师范大学自然科学学报》
【年(卷),期】2006(022)003
【摘要】在一般Banach空间的框架下,证明了参考文献[1]中给出的法锥型必要条件也是局部弱shrp最小的充分条件,同时也给出了局部弱sharp最小的切锥型刻画.【总页数】3页(P14-16)
【作者】李晓杰;李进馨
【作者单位】哈尔滨师范大学;哈尔滨师范大学
【正文语种】中文
【中图分类】O1
【相关文献】
1.凸模糊集与模糊闭集间的弱等价条件 [J], 聂大陆;王丽媛
2.一类抽象锥不等式局部误差界的几个等价条件 [J], 于海姝
3.弱遍历自同胚映射拓扑共轭的等价条件 [J], 张莹
4.弱混合的一个等价条件 [J], 廖公夫;刘恒
5.关于无条件和弱无条件收敛级数的等价条件 [J], 韩月霞;金祥菊
因版权原因,仅展示原文概要,查看原文内容请购买。
稀疏编码学习笔记整理(一)
稀疏编码学习笔记整理(⼀)最近新⼊⼿稀疏编码,在这⾥记录我对稀疏编码的理解(根据学习进度不断更新中)⼀,稀疏编码的概述稀疏编码的概念来⾃于神经⽣物学。
⽣物学家提出,哺乳类动物在长期的进化中,⽣成了能够快速,准确,低代价地表⽰⾃然图像的视觉神经⽅⾯的能⼒。
我们直观地可以想象,我们的眼睛每看到的⼀副画⾯都是上亿像素的,⽽每⼀副图像我们都只⽤很少的代价重建与存储。
我们把它叫做稀疏编码,即Sparse Coding.1959年,David Hubel和Toresten Wiesel通过对猫的视觉条纹⽪层简单细胞的研究得出⼀个结论:主视⽪层V1区神经元的感受野能对信息产⽣⼀种“稀疏表⽰”.基于这⼀知识。
1961年,H.B.Barlow[5]提出了“利⽤感知数据的冗余”进⾏编码的理论.1969年,D.J.Willshaw和O.P.Buneman等⼈提出了基于Hebbian 学习的局部学习规则的稀疏表⽰模型.这种稀疏表⽰可以使模型之间有更少的冲突,从⽽使记忆能⼒最⼤化.Willshaw模型的提出表明了稀疏表⽰⾮常有利于学习神经⽹络中的联想.1972年,Barlow推论出在(Sparsity)和⾃然环境的统计特性之间必然存在某种联系.随后,有许多计算⽅法被提出来论证这个推论,这些⽅法都成功地表明了稀疏表⽰可以体现出在⼤脑中出现的⾃然环境的统计特性.1987年,Field提出主视⽪层V1区简单细胞的⾮常适于学习视⽹膜成像的图像结构,因为它们可以产⽣图像的稀疏表⽰.基于这个结论,1988年,Michison明确提出了神经稀疏编码的概念,然后由⽜津⼤学的E.T.Roll 等⼈正式引⽤.随后对灵长⽬动物视觉⽪层和猫视觉⽪层的电⽣理的实验报告,也进⼀步证实了视觉⽪层复杂刺激的表达是采⽤稀疏编码原则的.1989年,Field提出了稀疏分布式编码(Sparse Distributed Coding)⽅法.这种编码⽅法并不减少输⼊数据的,⽽是使响应于任⼀特殊输⼊信息的神经细胞数⽬被减少,信号的稀疏编码存在于细胞响应分布的四阶矩(即Kurtosis)中.1996年,Olshausen和Field在Nature杂志上发表了⼀篇重要论⽂指出,⾃然图像经过稀疏编码后得到的类似于V1区简单细胞的反应特性.这种稀疏编码模型提取的基函数⾸次成功地模拟了V1区简单细胞感受野的三个响应特性:空间域的局部性、时域和频域的⽅向性和选择性.考虑到基函数的超完备性(基函数⼤于输出神经元的个数),Olshausen 和Field在1997年⼜提出了⼀种超完备基的稀疏编码算法,利⽤基函数和系数的模型成功地了V1区简单细胞感受野.1997年,Bell和Sejnowski 等⼈把多维独⽴分量分析(Independent Component Analysis, ICA)⽤于⾃然图像数据分析,并且得出⼀个重要结论:ICA实际上就是⼀种特殊的稀疏编码⽅法.21世纪以来,国外从事稀疏编码研究的⼈员⼜提出了许多新的稀疏编码算法,涌现出了⼤量的稀疏编码⽅⾯的论⽂,国内研究者在稀疏编码和应⽤⽅⾯也作了⼀些⼯作],但远远落后于国外研究者所取得的成果.稀疏编码的⽬的:在⼤量的数据集中,选取很⼩部分作为元素来重建新的数据。
一种弱纹理图像特征跟踪的鲁棒方法
一种弱纹理图像特征跟踪的鲁棒方法
贾云得;Marti.,H
【期刊名称】《北京理工大学学报》
【年(卷),期】1999(19)2
【摘要】目的提出一种用于弱纹理图像的特征提取和跟踪的鲁棒方法.方法选取包含若干表面片的结构化特征,并假设结构化特征中的表面片之间的关系在图像运动时保持不变.使用基于梯度的方法和基于相关性的方法在另一幅图像中求取对应的结构化特征.利用结构化特征不变性判据,评价对应结构化特征的有效性.结果与结论该算法能有效地跟踪一般表面片特征,非常适合于在室外获取的一般图像序列和弱纹理图像序列运动分析.
【总页数】5页(P190-194)
【关键词】图像运动估计;特征跟踪;特征提取;弱纹理图像
【作者】贾云得;Marti.,H
【作者单位】北京理工大学机电工程系;卡内基-梅隆大学计算机科学学院
【正文语种】中文
【中图分类】TP391.1
【相关文献】
1.弱纹理环境双目视觉稠密视差鲁棒估计方法 [J], 杜英魁;刘成;田丹;韩晓微;原忠虎
2.一种基于鲁棒局部纹理特征的背景差分方法 [J], 金静;党建武;王阳萍;翟凤文
3.一种基于孪生网络的高鲁棒性实时单目标船舶跟踪方法 [J], 张云飞; 黄润辉; 单云霄; 周晓梅
4.复杂环境下一种基于改进核相关滤波的视觉鲁棒目标跟踪方法 [J], 何容;赖际舟;吕品;刘国辉;王博
5.高光弱纹理物体表面鲁棒重建方法 [J], 乔玉晶;张思远;赵宇航
因版权原因,仅展示原文概要,查看原文内容请购买。
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
30

Shortest Path Homography-Based Visual Control for Differential Drive Robots

G. López-Nicolás, C. Sagüés and J. J. Guerrero
Universidad de Zaragoza, Spain

1. Introduction

It is generally accepted that machine vision is one of the most important sensory modalities for navigation purposes. Visual control, also called visual servoing, is an extensive and mature field of research where many important contributions have been presented in the last decade [Malis et al., 1999, Corke and Hutchinson, 2001, Conticelli and Allotta, 2001, Tsakiris et al., 1998, Ma et al., 1999]. Two interesting surveys on this topic are [De Souza and Kak, 2002] and [Hutchinson et al., 1996]. In this work we present a new visual servoing approach for mobile robots with a fixed monocular system on board. Visual servoing is used here in the sense of homing: the desired robot position is defined by a target image taken at that position, and the images taken during navigation are used to lead the robot to the target.

A traditional approach is to perform the motion by using the epipolar geometry [Basri et al., 1999, Rives, 2000, Lopez-Nicolas et al., 2006]. These approaches have the drawback that the estimation of the epipolar geometry becomes ill conditioned with short baseline or planar scenes, which are usual in human environments. A natural way to overcome this problem is to use the homography model. In [Malis and Chaumette, 2000] a method is proposed based on the estimation of the homography matrix related to a virtual plane attached to an object; it provides a more stable estimation when the epipolar geometry degenerates. In [Benhimane et al., 2005] a visual tracking system for car platooning is presented, which estimates the homography between a selected reference template attached to the leading vehicle. A significant issue with monocular camera-based vision systems is the lack of depth information.
In [Fang et al., 2005] the asymptotic regulation of the position and orientation of a mobile robot is achieved by exploiting homography-based visual servo control strategies, where the unknown time-varying depth information is related to a constant depth-related parameter. These homography-based methods usually require the homography decomposition, which is not a trivial issue. Two examples of approaches which do not use the decomposition of the homography are [Sagues and Guerrero, 2005], which is based on a 2D homography, and [Benhimane and Malis, 2006], which presents an uncalibrated approach for manipulators. We present a novel homography-based approach by performing the control directly on the elements of the homography matrix. This approach, denoted "Shortest Path Control", is based on the design of a specific robot trajectory which consists in following a straight line towards the target. This motion planning allows us to define a control law decoupling rotation and translation by using the homography elements. The approach needs neither the homography decomposition nor depth estimation. We have developed three similar methods based on particular selections of the homography elements; each method is suitable for different situations.

The chapter is organized as follows. Section 2 presents the homography model, developing its elements as a function of the system parameters to be used in the design of the controllers. Section 3 presents the Shortest Path Control with three different methods based on the elements of the homography. Sections 4 and 5 present the stability analysis of the controllers and the experimental results, respectively.

This work was supported by projects DPI2006-07928 and IST-1-045062-URUS-STP.
Source: Vision Systems: Applications, ISBN 978-3-902613-01-1, edited by Goro Obinata and Ashish Dutta, pp. 608, I-Tech, Vienna, Austria, June 2007.
Section 6 gives the conclusions.

2. Homography Based Model

The general pinhole camera model considers a calibration matrix defined as

    K = [ αx   s   x0 ]
        [  0  αy   y0 ]                                         (1)
        [  0   0    1 ]

where αx and αy are the focal length of the camera in pixel units in the x and y directions respectively, s is the skew parameter and (x0, y0) are the coordinates of the principal point. We have αx = f mx and αy = f my, where f is the focal length and mx, my are the number of pixels per distance unit. In practice, we assume that the principal point is in the centre of the image (x0 = 0, y0 = 0) and that there is no skew (s = 0).

A 3D point in the world can be represented in the projective plane with homogeneous coordinates as p = (x, y, 1)^T. A projective transformation H exists between matched points belonging to a plane, in such a way that p2 = H p1. The homography between the current and target images can be computed from the matched points, and a robust method like RANSAC should be used to handle outliers [Hartley and Zisserman, 2004]. Taking advantage of the planar motion constraint, the homography can be computed from three correspondences instead of four, reducing the processing time.

Let us suppose two images obtained with the same camera whose projection matrices in a common reference system are P1 = K[I | 0] and P2 = K[R | −Rc], with R the camera rotation and c the translation between the optical centres of the two cameras. A homography H can be related to the camera motion (Figure 1a) as

    H = K R (I − c n^T / d) K^{−1}                              (2)

where n = (nx, ny, nz)^T is the normal of the plane that generates the homography and d is the distance between the plane and the origin of the global reference. We consider a mobile robot in planar motion (Figure 1b).
In this case the robot position is defined by the state vector (x, z, φ), and the planar motion constraint gives

    R = [  cos φ   0   sin φ ]
        [    0     1     0   ]          c = (x, 0, z)^T          (3)
        [ −sin φ   0   cos φ ]

Taking this into account, the homography corresponding to a planar motion scheme can be written as

    H = [ h11  h12  h13 ]
        [ h21  h22  h23 ]                                        (4)
        [ h31  h32  h33 ]

The second row of the matrix will be ignored in the design of the control law, as it does not give useful information. Developing expression (2), we obtain the homography elements as a function of the parameters involved:

    h11 = cos φ − (x cos φ + z sin φ) nx / d
    h12 = −(αx/αy) (x cos φ + z sin φ) ny / d
    h13 = αx (sin φ − (x cos φ + z sin φ) nz / d)
    h31 = −(sin φ + (−x sin φ + z cos φ) nx / d) / αx            (5)
    h32 = −(−x sin φ + z cos φ) ny / (αy d)
    h33 = cos φ − (−x sin φ + z cos φ) nz / d

The analysis of these homography elements will lead to the control law design. After computing the homography from the image point matches, it has to be normalized; we normalize by dividing H/h22, given that h22 is never zero due to the planar motion constraint.

Figure 1. (a) Homography from a plane between two views. (b) Coordinate system

3. Visual Servoing with Shortest Path Control

In this section the Shortest Path Control is presented. The control law design is directly based on the homography elements. Given that our system has two variables to be controlled (the velocities v and ω), we need at least two parameters of the homography to define the control law. Several possibilities appear depending on which homography elements are selected. In our approach we have developed three similar methods which are suitable for different situations. In the experimental results we show the performance of these methods as the calibration or the scene changes.

Let us suppose the nonholonomic differential kinematics to be expressed in a general way as

    dx/dt = f(x, u)                                              (6)

where x = (x, z, φ)^T denotes the state vector and u = (v, ω)^T the input vector. The particular nonholonomic differential kinematics of the robot, expressed in state space form as a function of the translation and rotation velocities (v, ω), is

    dx/dt = −v sin φ ,   dz/dt = v cos φ ,   dφ/dt = ω           (7)

In the Shortest Path Control approach, we propose decoupling rotation and translation by following a specific trajectory.
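As a quick numerical cross-check of (1)–(5), the planar-motion homography can be built directly from the pose (x, z, φ) and the plane parameters (n, d). The sketch below is illustrative only: the intrinsics and the scene plane are assumed values, not ones taken from the chapter.

```python
import numpy as np

def homography(x, z, phi, n, d, ax=640.0, ay=640.0):
    # Euclidean homography of Eq. (2) for the planar motion of Eq. (3);
    # zero skew and centred principal point are assumed, as in the text
    K = np.diag([ax, ay, 1.0])
    R = np.array([[np.cos(phi), 0.0, np.sin(phi)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(phi), 0.0, np.cos(phi)]])
    H = K @ R @ (np.eye(3) - np.outer([x, 0.0, z], n) / d) @ np.linalg.inv(K)
    return H / H[1, 1]          # normalization by h22 (never zero here)

n, d = np.array([0.3, 0.8, -0.52]), 3.0       # assumed plane normal and distance
H = homography(-3.0, -10.0, -0.5, n, d)
assert abs(H[1, 1] - 1.0) < 1e-12             # h22 = 1 under planar motion
assert np.allclose(homography(0, 0, 0, n, d), np.eye(3))   # identity at target, Eq. (10)
```

The normalization by h22 is exact here because, under planar motion, the second row of R(I − c n^T/d) is (0, 1, 0), in agreement with the constraint used in the chapter.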
Then we design a navigation scheme in such a way that the robot can correct rotation and translation in a decoupled way. The resulting path of this motion is shown in Figure 2.

Figure 2. Motion trajectory of the robot consisting in three steps

The motion can be divided in three sequential steps. In the first step the robot rotates until the camera points to the target position. In the second step the robot performs a straight translation until the target position is reached up to a rotation. Finally, the orientation is corrected in the third step. The key point is to establish what conditions have to be held during each phase of the navigation. When the motion starts, the initial homography is the general case (5). It can be seen in Figure 2 that during the second step the robot moves in a straight line with a constant angle with respect to the global reference (φ = φt). From our reference system we obtain the geometric relation x = −z tan φt. Using this expression in (5), we obtain the particular form of the homography that holds during the straight motion of the second step:

    H = [ cos φt                            0                      αx sin φt                ]
        [ 0                                 1                      0                        ]   (8)
        [ −(sin φt + (z/cos φt) nx/d)/αx    −(z/cos φt) ny/(αy d)  cos φt − (z/cos φt) nz/d ]

At the end of the second step the robot has an orientation error and no translation error (x = 0, z = 0, φ = φt). Taking this into account, the homography matrix that results at the end of the second step (i.e. at the target position up to an orientation error) is

    H = [  cos φt      0   αx sin φt ]
        [    0         1       0     ]                              (9)
        [ −sin φt/αx   0    cos φt   ]

This expression also implies that det(H) = 1. Finally, at the end of the navigation, when the robot reaches the target pose with the desired orientation, the homography is the identity matrix:

    H = I                                                           (10)

The particular expressions of the homography just deduced are related graphically with their corresponding positions in Figure 3. It can be seen that the goal of each step is to move the robot having as reference the next desired expression of the homography.

Figure 3.
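The particular forms (8)–(10) can be verified numerically: anywhere on the straight-line path x = −z tan φt with φ = φt, the first row of H reduces to (cos φt, 0, αx sin φt) regardless of z, and at zero translation det(H) = 1. A small sketch with assumed plane values:

```python
import numpy as np

def homography(x, z, phi, n, d, ax=640.0, ay=640.0):
    # planar-motion homography of Eqs. (2)-(5), normalized by h22
    K = np.diag([ax, ay, 1.0])
    R = np.array([[np.cos(phi), 0.0, np.sin(phi)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(phi), 0.0, np.cos(phi)]])
    H = K @ R @ (np.eye(3) - np.outer([x, 0.0, z], n) / d) @ np.linalg.inv(K)
    return H / H[1, 1]

n, d, ax = np.array([0.3, 0.8, -0.52]), 3.0, 640.0   # assumed scene plane
phi_t = 0.3
for z in (-10.0, -5.0, -1.0):                        # anywhere on the line of Eq. (8)
    H = homography(-z * np.tan(phi_t), z, phi_t, n, d, ax)
    assert abs(H[0, 0] - np.cos(phi_t)) < 1e-9       # h11 = cos(phi_t)
    assert abs(H[0, 1]) < 1e-9                       # h12 = 0
    assert abs(H[0, 2] - ax * np.sin(phi_t)) < 1e-6  # h13 = ax sin(phi_t)
H9 = homography(0.0, 0.0, phi_t, n, d, ax)           # end of step 2, Eq. (9)
assert abs(np.linalg.det(H9) - 1.0) < 1e-9           # det(H) = 1
```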
The number below each figure denotes the equation of the homography that holds at that position. In each step, the numbers give the homography equations at the start and at the end of the step

Now we briefly introduce the expressions used to define the controllers of the three methods of the Shortest Path Control; these are detailed in the following subsections. From the previous particular expressions of the homography, we can define the conditions that will be used in each step of the navigation to drive the robot. In the first step we want to reach the orientation φ = φt, where the robot points to the target. The forward velocity is set to zero (v = 0) and, from (8), we could use h11, h12 or h13 to set the angular velocity of the robot in a proportional control:

    ω = kω (h11 − cos φt)                                       (11)
    ω = −kω h12                                                 (12)
    ω = kω (h13 − αx sin φt)                                    (13)

In this step we have rejected elements h31, h32 and h33 because they require knowledge about the plane and the robot position, which are unknown. Each one of the expressions (11), (12) or (13) can be used to correct rotation in the first step. The selection of the expressions for each of the three methods, depending on the calibration hypothesis, is explained below: in Method I the camera calibration is supposed to be known, while Methods II and III require no specific calibration.

Once the orientation φt is gained, the second step aims to make the translation to the target equal to zero (x = z = 0), keeping the orientation constant during the motion (φ = φt). In this case we could use the parameters h31, h32 or h33 from (9) to set the robot velocity as

    v = kv (h31 + sin φt / αx)                                  (14)
    v = kv h32                                                  (15)
    v = kv (cos φt − h33)                                       (16)

In this second step we have rejected elements h11, h12 and h13 for the correction of v because their values are constant during this step. Any of the expressions (14), (15) or (16) can be used to compute v. Odometry drift and image noise appear in real situations, so the orientation is also corrected to avoid possible errors.
Thus, in the three methods, the rotation during the second step is corrected with the same control as in the first step.

In the last step the robot has zero translation error and only needs to perform a rotation in order to reach the target orientation:

    ω = −kω h13                                                 (17)
    ω = kω h31                                                  (18)

The velocity is set to zero in this step (v = 0), and the rotation can be corrected with expression (17) or (18). We have selected ω = −kω h13 for the three methods because of the robustness to noise of h13 with respect to the rest of the homography elements; the experimental results presented support this decision.

The control loop of the presented scheme is shown in the diagram of Figure 4. An image is taken at the current position in each loop of the control. The homography that links it with the target image is computed from the feature matching. Using the homography, the control performs the three steps. When the homography-based control loop finishes, the robot is at the target position, the current and target images are the same, and the homography is the identity matrix. Next, the three methods of the Shortest Path Control for homography-based visual servoing of mobile robots are presented in detail.
In step two, the orientation is corrected with the same expression to take into account odometry drift or noise. The velocity v in the second step is corrected using (16) which is combined with h11 from (9) to remove the unknown parameter t from the expression of the control. Third step is based on (17). Then, we define the Method I as(20)where kǚ and k v are the control gains.We avoid the use of the parameter t in the velocity v of the second step by using the value of h 11 from (9) as previously explained. In any case t could be computed easily when the first step is finished from (11) or (13). This method needs to know the calibration of the camera (parameter ǂx) and this is its main drawback. The next two methods proposed work without knowing this parameter and they have shown to be independent of the focal length.590Vision Systems: Applications 3.2 Method II: Uncalibrated MethodThe previous method is calibrated. In a system, the need of calibration means disadvantages in terms of maintenance cost, robustness and adaptability. In Method II the calibration camera is considered to be unknown, which has many advantages in practice. We can define the control scheme of the Method II selecting expressions where the calibration parameters do not appear explicitly. These expressions are (12), (15) and (17). Then, the control is defined as(21)where kǚand k v are the control gains. With this method the robot is controlled by using a camera without specific calibration; although we assume that the principal point is in the centre of the image, this is a good supposition in practise. Method II requires the plane inducing the homography not to be vertical respect our reference because it is needed n y 0. This is due to the direct dependence of the parameters used from the homography to n y. This could be a problem since human environments are usually full of vertical planes (walls). 
In any case the method works if we guarantee that vertical planes are not used, for example constraining to the floor [Liang and Pears, 2002] or the ceiling plane [Blanc et al., 2005].3.3 Method III: Method with ParallaxThe previous method works without specific calibration, but it requires the scene homography plane not to be vertical and this could be a problem in man-made environments, usually full of vertical planes. Method III uses the concept of parallax relative to a plane and overcomes the problem of vertical planes. Using the parallax [Hartley and Zisserman, 2004] the epipole in the current image can be easily obtained from a homography H and two points not belonging to its plane. In the first step of Method III the objective is to get orientation =t. In this position the robot points to the target, so the camera centre of the target is projected to (x0,y0) in the current image and then e c=(0,0). Given that the robot moves in a planar surface we only need the x-coordinate of the epipole (e cx). Then we define the correction of the orientation in step 1 and step 2 with a proportional control to e cx. Once e cx=0 the robot is pointing to the target position. The other expressions of the control are obtained in a similar way to the previous methods using (16) and (17). Then, we define the scheme of Method III as(22)When the robot is close to the target position and the translation is nearly zero, all the points in the scene can be related by the homography. In this situation the parallax is not useful to correct the orientation. Before this happen we change the orientation control at the end of step 2 to the expression (11). This expression needs the value of t, which can be computed previously with the same equation while the rotation is corrected with the parallax procedure. 
Here, we use neither expression (15) because vertical planes can be easily foundShortest Path Homography-Based Visual Control for Differential Drive Robots 591 in human environments nor expression (19) because it needs specific calibration. We can detect easily when the parallax is not useful to work with by measuring the parallax of the points not belonging to the plane of the homography. If the result is under a threshold, the parallax procedure is not used any more. In the simulations presented with this approach the threshold is set to 5 pixels.In the three methods presented the homography is not decomposed, and neither the robot coordinates nor the normal of the plane are computed. This approach requires the selection of the signs of some of the control gains depending on where is the initial robot position and what is the orientation of the plane detected. This can be easily done by taking advantage of the parallax relative to the plane by computing it once at the start. Thus, the sign of the gains is easily determined.4. Stability AnalysisWe define the common Lyapunov function expressing the robot position in polar coordinates (r(t),lj(t),(t)), with the reference origin in the target and ljpositive from z-axis anticlockwise, as(23) This is a positive definite function, where r G i,ljG i and G i denote the desired value of the parameter in the subgoal position for each step (i=1,2,3). Due to the designed path, the value ofljis constant during the navigation. Although in the case of noisy data the value of ljcould vary, it does not aěect the control, because the path is defined towards the target independently of the value of lj, thus Vlj= 0. After diěerentiating we obtain:(24) We analyze the derivative Lyapunov candidate function in each step to show it is strictly negative. This analysis is valid whether if the goal is behind or in front of the initial position. Step 1. Here the robot performs a rotation with v=0. 
Thus, we only need to consider .The desired orientation is G1=t . < 0 is guaranteed if ( ï G1) > 0 and then ǚ<0; or else, if (ï G1)<0 and then ǚ>0. In Method I and II, the sign of ǚis guaranteed to be correct, given that the sign of kǚ is selected as previously explained. In Method III,e cx and, when (ïG1)>0 then e cx>0 and ǚ<0, or e cx<0 and ǚ>0 when (ïG1) < 0.ǚ=–kTherefore<0.Step 2. In this step the robot moves towards the target in a straight line path and we have .The sign of (rïr G2) is always positive. Then, with cos( – lj) < 0 we have v>0 and with cos(ïlj)>0 we have v< 0. In Method II, the sign of v is guaranteed to be correct, given that the sign of k v is properly selected. In Method I and III, the velocity given by the control and with (8) is v=k v z n z/(d cos t), which gives the expected signs. Therefore r < 0.With we have the same reasoning of step 1.Step 3. Similar to the reasoning of step 1, in this case, the sign of ǚcan be easily checked taking into account that G3=0 and h 13=ǂx sin t. Therefore < 0.So, we have shown that <0 for the controllers of the three methods. We have also asymptotic stability given that is negative definite in all the steps.592Vision Systems: Applications 5. Experimental ResultsSeveral experiments have been carried out with the controllers of the three methods presented by using virtual data. The simulated data is obtained by generating a virtual planar scene consisting of a distribution of random 3D points. The scene is projected to the image plane using a virtual camera, the size of the images is 640×480 pixels. In each loop of the control, the homography between the current and target image is computed from the matched points and the control law send the velocities (v, ǚ) to the robot. In the experiments, we assume that the camera is centred on the robot pointing forwards. Figure 5 shows the resulting path from diěerent initial positions. The target is placed in (x(m),z(m),(deg))=(0,0,0°). 
The diěerent initial positions behind the target are: (ï3,ï10,ï30°), (0,ï8,ï40°) and (6,ï6,0°). The results also show that the method works properly when the target is behind the initial robot position, moving the robot backwards in that case. The diěerent initial positions used in this case are: (ï6, 4, 20°), (6, 8, 10°) and (5,2,ï50°).Figure 5. Simulations with target position at (0,0,0°) and diěerent initial positionsThe performance of the three methods is exactly the same when using perfect data and quite similar when there is image noise. In Figure 6 two simulations are compared, one without noise, and the other, adding white noise to the image points with a standard deviation of ǔ=1 pixel using Method III. The evolution along time of the robot position and the homography elements is drawn.We have tested the controllers with odometry drift and with diěerent values of image noise. Thefirst row of Figure 7 shows the resulting evolution of the robot position when there is odometry drift in rotation of 1 deg/m. As it can be seen the controllers can cope properly with the drift error. Simulations with each method have been carried out using diěerent levels of image noise. The results are shown in the second row of Figure 7 and it can be seen that the methods converge properly in spite of image noise.The control law of Method I needs the calibration parameter ǂx of the camera whereas Method II and III do not use it. In Figure 8 we show the performance of the control to calibration errors. The value of the focal length of the controllers is fixed to f=6 mm while its real value is modified to see the final position error obtained for each Method, (first row of Figure 8). Besides, we have assumed that the principal point is in the centre of the image. In the second row of Figure 8, the value of x0 used in the controllers is supposed to be zeroShortest Path Homography-Based Visual Control for Differential Drive Robots 593while its real value is changed. 
Performance of Method I is sensitive to calibration errors as expected, this is because this control law is related directly with ǂx and depends highly onits accuracy. The simulations show that Method II works properly in spite of calibration errors. Finally, results using Method III show that a rough calibration is enough for the convergence, because it is robust to focal length in accuracy and it is only aěected by calibration errors in the principal point.(a) Lateral motion (b) Forward motion (c) Robot rotationh13h12 (f)(d)h11 (e)h32 (i)h33(g)h31 (h)Figure 6. Simulation without noise (thick line) and with image white noise of ǔ=1 pixel (thinline). The initial position is (x,z,)=(ï3,ï10,ï30°) and the target (0,0,0°)The performance of the methods can be spoiled in some cases by the particular plane that generates the homography. Simulations using diěerent planes are presented in Table 1. Theplanes are defined by the normal vector n=(n x,n y,n z)T, and a list of unitary normal vectors is selected to carried out the simulations with ɠnɠ=1. The final error obtained with each method is shown. The initial position is (ï3,ï10,ï30°) and the target is (0, 0, 0°). The resultsshow that Method I and III need n z 0 to work properly. On the other hand, Method IIneeds n y 0.This is because the Methods are directly related with these parameters of n.594Vision Systems: Applications Vertical planes are usually common in human environments; besides, in our monocular system, planes in front of the robot with dominant n z will be detected more easily. Methods I and III work properly in this case. If we constraint the homography plane detected to be thefloor or the ceiling (any plane with n y 0 is enough) the Method II will also work properly.Figure 7. (First row) Simulations with odometry drif of 1 deg/m. The evolution of one simulation in x,z and is shown for each method. (Second row) Final error of different simulations varying the image noiseFigure 8. 
Final error for each method in x,z and varying the focal length (first row) and varying the principal point coordinates (second row)Shortest Path Homography-Based Visual Control for Differential Drive Robots 595Table 1. Final error for each method in x (m),z (m) and (deg) varying the normal of the plane that generates the homography: n =(n x , n y , n z )T6. ConclusionsWe have presented a new homography-based approach for visual control of mobile robots. The control design is directly based on the homography elements and deals with the motion constraints of the di ěerential drive vehicle. In our approach, called Shortest Path Control , the motion is designed to follow a straight line path. Taking advantage of this specific trajectory we have proposed a control law decoupling rotation and translation. Three di ěerent methods have been designed by choosing di ěerent homography elements. Their performance depends on the conditions of the plane or the calibration. The methods use neither the homography decomposition nor any measure of the 3D scene. Simulations shows the performance of the methods with odometry drift, image noise and calibration errors. Also, the influence of the plane that generates the homography is studied.7. ReferencesBasri, R., Rivlin, E., and Shimshoni, I. (1999). Visual homing: Surfing on the epipoles.International Journal of Computer Vision, 33(2):117–137. [Basri et al., 1999]Benhimane, S. and Malis, E. (2006). Homography-based 2D visual servoing. IEEEInternational Conference on Robotic sand Automation , pages 2397–2402. [Benhimaneand Malis, 2006]Benhimane, S., Malis, E., Rives, P., and Azinheira, J. R. (2005). Vision-based control for carplatooning using homography decomposition. In IEEE International Conference onRobotics and Automation, Barcelona, Spain, pages 2173–2178. 
[Benhimane et al.,2005] n Method I Method II Method IIIn x n y n z x z x z x z0 0 -1.000 0 -0.09-3.00-10.00-3.120 0 -0.09 -0.20 0.57 -0.800.03 -0.00 -0.09-0.00-0.00 -0.090 0 -0.09 -0.40 0.69 -0.60-0.00-0.00 -0.09-0.00-0.00 -0.09-0.00-0.00 -0.09 -0.60 0.69 -0.40-0.00-0.01 -0.09-0.00-0.00 -0.09-0.00-0.01 -0.09 -0.80 0.57 -0.20-0.10-0.34 -0.03-0.00-0.00 -0.09-0.10-0.34 -0.03 -1.00 0 0 -3.00-10.000 -3.00-10.000 -3.00-10.00 0 1.00 0 0 -3.00-10.000 -3.00-10.000 -3.00-10.00 0 0.98 -0.20 0 -3.00-10.000 -0.15-0.62 0 -3.00-10.00 0 0.92 -0.40 0 -3.00-10.000 -0.01-0.04 -0.09-3.00-10.00 0 0.80 -0.60 0 -3.00-10.000 -0.00-0.00 -0.09-3.00-10.00 0 0.60 -0.80 0 -3.00-10.000 0 -0.00 -0.09-3.00-10.00 0 0 -1.00 0 -3.00-10.000 0 0 -0.09-3.00-10.00 0 0 -1.00 0 -3.00-10.000 0 0 -0.09-3.00-10.00 0 0.57 -0.80 -0.20-0.10-0.34 -0.030 -0.00 -0.09-0.10-0.34 -0.03 0.69 -0.60 -0.40-0.00-0.01 -0.09-0.00-0.00 -0.09-0.00-0.01 -0.09 0.69 -0.40 -0.60-0.00-0.00 -0.09-0.01-0.04 -0.10-0.00-0.00 -0.09 0.57 -0.20 -0.800 0 -0.09-0.15-0.62 -0.150 0 -0.09 0 0 -1.000 0 -0.09-3.00-10.00-3.120 0 -0.09。