An adaptive coupled-layer visual model for robust visual tracking
Multi-layer Edge Detection Model Inspired by Retinal Function
ZHENG Cheng-Chi 1, FAN Ying-Le 1

Abstract: Based on how the retina processes visual information, this paper proposes a multi-layer edge detection model inspired by retinal function. To capture the adaptation of retinal neurons under periodic light stimulation, an Izhikevich neuron model with an adaptive threshold is established. By simulating the luminance perception of cones and rods in the photoreceptors, a luminance perception coding layer is constructed. By introducing the ability of bipolar cells to separate light-on and light-off stimulation, and combining it with the sensitivity of ganglion cells to the direction of motion, a dual-pathway edge extraction layer is constructed. In addition, based on the delayed activation of ganglion cell neurons under multi-feature regulation, a texture inhibition layer with spike-delay characteristics is constructed. Finally, the dual-pathway edge extraction result is fused with the delay inhibition amount to obtain the final edge detection result. The proposed method is tested on 150 colony images from laboratory collection and the AGAR dataset. The reconstruction image similarity, edge confidence, edge continuity and comprehensive index of the detection results are 0.9629, 0.3111, 0.9159 and 0.7870, respectively. The results show that the proposed method localizes edges better, suppresses redundant textures, and maintains the integrity of subject edges. This research addresses the edge detection task by constructing a model that simulates how the retina processes visual information, and it also provides new ideas for building image computing models inspired by visual mechanisms.

Key words: Edge detection, retina, Izhikevich model, neural coding, direction-selective ganglion cells (DSGCs)

Citation: Zheng Cheng-Chi, Fan Ying-Le. Multi-layer edge detection model inspired by retinal function. Acta Automatica Sinica, 2023, 49(8): 1771−1784
DOI 10.16383/j.aas.c220574
Manuscript received July 14, 2022; accepted November 29, 2022
Supported by National Natural Science Foundation of China (61501154)
Recommended by Associate Editor ZHANG Dao-Qiang
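1. Laboratory of Pattern Recognition and Image Processing, Hangzhou Dianzi University, Hangzhou 310018
ACTA AUTOMATICA SINICA, Vol. 49, No. 8, August 2023

...different coding schemes are used to convert the input image into spike signals, and the image edges are extracted after processing and transmission through multi-level functional blocks. Among these schemes, rate coding and temporal coding are important ways in which the visual system encodes light stimuli, and both are widely used in computational models. For example, [2] builds on the HH (Hodgkin-Huxley) neuron model, uses multi-orientation Gabor filters to simulate the orientation selectivity of neuronal receptive fields so that the connection strength between neurons is tied to edge orientation, and outputs the firing rate of each neuron as the edge detection result; experiments show that it is more effective than traditional methods. [3] improves the LIF (Leaky integrate-and-fire) neuron model by introducing weights that adjust the external input according to the neuron's response; during coding, spatial spike firing is converted into temporal excitation intensity, which allows strong and weak edges to be classified and gives good detection of weak edges with small gradient changes.

Other computational methods focus on the interaction between neuronal synapses and extract edge information through mechanisms that adaptively adjust synaptic connection weights. For example, [4] builds a neuron model with STDP (Spike-timing-dependent plasticity), which strengthens or weakens synaptic connections according to the order of the first spike times of the pre- and postsynaptic neurons, and discriminates well between true and false edges. [5] considers a temporally asymmetric STDP mechanism when building the neuron model, and then fuses orientation features with lateral inhibition to reconstruct the main edge information of the image; its computation describes the dynamic properties between synapses more accurately.

Neural coding has also been applied to practical engineering needs. For example, [6] addresses the shortcomings of existing infrared image edge detection algorithms by building a new spiking neural network that enhances the perception of weak edges in infrared images. [7] simulates the processing mechanism of the visual cortex and uses a spiking neural network with three parallel branches (left, right and forward) to extract the edges of brain magnetic resonance images, then uses the extracted edges for anomaly detection, also with good results.

All of the above methods take the coding properties of neurons in visual tissue and visual mechanisms into account to some extent; compared with traditional methods, they adapt better to complex environments while remaining computationally efficient. However, none of them considers that neurons themselves adapt to external stimuli and thereby change their activity characteristics. In addition, most of them use only one coding scheme, such as rate coding or temporal coding, and so cannot fully reflect the joint action of multiple coding schemes in visual tissue.

In fact, continued neurophysiological experiments and theory show that visual tissue (the retina, for example) exhibits rich dynamic properties and coding mechanisms when processing visual stimuli [8−9]. The retina, the primary structure of the visual system, consists of many different cell types that together form a complex, interconnected, hierarchical network able to select a suitable coding scheme for each type of stimulus. Therefore, this paper addresses the image edge detection task, taking colony image processing as an example: it simulates how each component of the retina processes visual information and builds a multi-layer edge detection model based on the dynamic coding mechanisms of the retina, suited to colony images with diverse morphological structures.

1 Materials and methods

The flow of the proposed algorithm is shown in Fig. 1. First, based on the change in firing rate of retinal neurons under periodic light stimulation, an Izhikevich neuron model with an adaptive threshold is built to improve the synchronous firing of neurons. Second, luminance information is encoded by considering how photoreceptors process strong light, weak light and color differently, so that targets and background at different luminance levels can be separated. Third, the fixational micro-movement mechanism is introduced and combined with the direction selectivity of ganglion cells and the transmission properties of the ON−OFF pathways; the result of first-spike-time coding is output as the primary edge response of the two pathways. Fourth, the delayed firing of ganglion cells is simulated: contrast and the difference in preferred direction between pre- and postsynaptic neurons are used to compute a delay inhibition amount for each neuron, which suppresses texture in the dual-pathway results. Finally, the edge information of the two pathways is integrated and fused into the final edge detection result.

1.1 Luminance perception coding layer

When building the neuron model, this paper weighs the physiological plausibility of the simulation against computational efficiency, and builds the neuron model on the Izhikevich model [10].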
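The Izhikevich model was derived by Izhikevich as a simplification of the HH model. It keeps the original model's accuracy in describing neuronal firing patterns while having low time complexity, which makes it suitable for computations over neuron populations. Its expression is given in Eq. (1):

$$\frac{dv}{dt} = 0.04v^2 + 5v + 140 - u + I, \qquad \frac{du}{dt} = a(bv - u), \qquad \text{if } v \ge v_{th}: \; v \leftarrow c, \; u \leftarrow u + d \tag{1}$$

where v is the membrane potential of the neuron, with initial value −70; u is the membrane recovery variable, set to 14; I is the received image luminance stimulus; $v_{th}$ is the firing threshold, set to 30; a describes the time scale of the recovery variable u; b describes the sensitivity of u to subthreshold fluctuations of the membrane potential; c and d describe the reset value of v after a spike and the increment of u, respectively. The typical values of the four parameters a, b, c and d are 0.02, 0.2, −65 and 6. If v reaches $v_{th}$ at some moment, the neuron fires once, v is reset to c, and u is reset to u + d.

Adaptation is widespread in the nervous system: neurons continuously adjust their own properties according to external stimuli. The retina can adapt to a trillion-fold range of illumination between day and night, which helps it avoid saturation while staying sensitive to light [11]. Studies show that when the retina receives sustained periodic light stimulation, the photoreceptors change the activity characteristics of the neurons, so that the firing threshold of a single neuron rises and its firing rate falls; when there is no firing, the threshold decays exponentially and the firing rate gradually recovers [12]. This paper therefore improves the Izhikevich model by adding a mechanism that adaptively adjusts the threshold according to the firing rate, as given in Eq. (2), where $\tau_1$ and $\tau_2$ are the time constants of the threshold change when the neuron fires and when it does not. The smaller their values, the larger the threshold change and the faster the neuron's sensitivity changes; conversely, larger values mean a smaller threshold change and a slower change in sensitivity. Physiological experiments show that, under sustained light stimulation, once adaptation has lowered a neuron's firing rate, the decay of that adaptation usually takes several times longer than its onset [13]. To simulate this physiological property accurately while preserving model performance, $\tau_1$ and $\tau_2$ are set to 20 and 40. Thus, when a neuron fires, its threshold $v_{th}$ rises according to $\tau_1$, the neuron adapts and its activity decreases; otherwise its threshold $v_{th}$ falls according to $\tau_2$, the adaptation decays and its activity increases. This limits the firing rate of active neurons, promotes firing in inactive neurons, improves the synchronous firing of the population, and reduces redundancy inside the detected targets. Fig. 2 shows the redundancy inside targets after processing an image with the Izhikevich model before and after the improvement.

Fig. 1 Principle of the edge detection algorithm

To normalize the luminance range of the image to be processed, the luminance of each channel of the input color image Img is mapped to the interval 0∼255, as given in Eq. (3), where Img(;i) and I(;i) denote the R, G and B color component images before and after the mapping, and max(·) and min(·) return the maximum and minimum pixel values of the corresponding component image.

Photoreceptors fall into two classes, cones and rods [14]; both convert the visual stimuli they receive into electrical signals to encode and transmit information. Cones decompose external light stimuli by wavelength into three color channels [15]. Because the human eye's sensitivity to color can effectively separate discrete targets from the background, each pixel of the image is assigned one neuron, and the R, G and B component images are fed separately into the neuron model built above, which fires over a fixed time window, as given in Eq. (4), where fires(x, y; i) is the number of spikes fired by each neuron and the function Izhikevich(·) denotes the neuron model of Eq. (2).

Rods are sensitive to light and are mainly responsible for perceiving stimuli in dim environments. When the light stimulus is strong enough, rod perception saturates and the visual system switches to cones for processing luminance information [16]. Therefore, besides being sensitive to color, cones also discriminate strong light well. Since targets and background in the images to be processed have different luminance levels, this paper builds a coding method that combines the luminance perception of cones and rods, so as to handle the various luminance contrasts between target and background, as given in Eqs. (5) and (6), where var(·) computes the luminance variance of an image and ave(·) computes its mean luminance.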
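As a rough illustration of the adaptive-threshold neuron described earlier in this subsection, the sketch below Euler-integrates a single Izhikevich neuron using the parameter values quoted in the text (a, b, c, d, the initial v and u, the base threshold of 30, and the time constants 20 and 40). The threshold update itself is an assumption: Eq. (2) is not legible in this copy, so the code simply raises the threshold by `vth_base / tau1` on each spike and lets it relax exponentially toward the base value with time constant `tau2` otherwise. The step size `dt = 0.25` and the function name `simulate` are also illustrative choices.

```python
def simulate(current, steps=400, dt=0.25, adaptive=True,
             a=0.02, b=0.2, c=-65.0, d=6.0,
             v0=-70.0, u0=14.0, vth_base=30.0, tau1=20.0, tau2=40.0):
    """Euler-integrate one Izhikevich neuron.

    Returns (spike_count, final_threshold).
    """
    v, u, vth = v0, u0, vth_base
    spikes = 0
    for _ in range(steps):
        # Standard Izhikevich dynamics, evaluated at the current state.
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
        if v >= vth:
            # Spike: reset v, bump the recovery variable.
            spikes += 1
            v = c
            u += d
            if adaptive:
                # ASSUMED rule: each spike raises the threshold by vth_base / tau1.
                vth += vth_base / tau1
        elif adaptive:
            # ASSUMED rule: exponential relaxation back toward the base threshold.
            vth -= dt * (vth - vth_base) / tau2
    return spikes, vth


silent = simulate(0.0)                       # no input current
driven = simulate(100.0)                     # strong constant input
plain = simulate(100.0, adaptive=False)      # same input, fixed threshold
```

With this particular rule the smaller `tau1` is, the larger each threshold jump, which matches the qualitative statement in the text; the actual functional form in the paper may differ.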
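The component image with the largest variance among the three color components is taken as the base image $I_{base}$. For each pixel value $I_{base}(x, y)$, pixels whose luminance is below the mean luminance are assigned the minimum of the three color-component firing results, and the remaining pixels are assigned the maximum, giving the luminance coding result $fires_{Res}(x, y)$. In this way, regions of relatively low local luminance are handled by rod-like weak-light perception and brighter regions by cones, which strengthens the model's ability to separate targets and background at different luminance levels and highlights objects with weak edges. Fig. 3 shows how luminance perception coding perceives objects with weak edges.

1.2 Multi-directional dual-pathway edge extraction layer based on fixational micro-movements

When the human eye fixates on a target, the received image is not static: the eye makes two to three micro-movements per second, so the image projected on the retina moves continuously and the light falling on the photoreceptors keeps changing [17]. Following this fixational micro-movement mechanism, a 3×3 micro-movement window temp is built on the grayscale image $Img_{gray}$ of the original image, and the luminance received by the window is shifted in 8 directions, as given in Eqs. (7) and (8), where $p_i$ and $q_i$ are the parameters that determine the micro-movement direction $\theta_i$ and take the values −1, 0 or 1. Evaluating the arctangent yields 8 micro-movement directions from 0° to 315° in steps of 45°, corresponding to 8 shifted windows $temp_{\theta_i}$; $d_x$ and $d_y$ are the horizontal and vertical micro-movement scales; Dir is the resulting micro-movement direction matrix, whose value at each pixel is Dir(x, y); sum(·) computes the sum of the pixel values in a window. For each point, the direction with the largest difference between the window before and after the micro-movement is taken as the preferred direction, denoted by the numbers 1 ~ 8.

Fig. 2 Comparison of the image processing results of the Izhikevich model before and after improvement: (a) original image; (b) Izhikevich model; (c) improved Izhikevich model

The retina contains a class of direction-selective ganglion cells (DSGCs) that encode motion stimuli [18]. Visual information, converted into electrical signals by the photoreceptors, is processed by bipolar cells and passed to the ganglion cells. Bipolar cells divide into cells excited by increasing light (ON) and cells excited by decreasing light (OFF) [19]; they feed two parallel pathways, the ON-pathway and the OFF-pathway [20], which convey the stimuli produced by light-on and light-off motion. Ganglion cells likewise include ON and OFF types, which respond to the direction of motion produced by light onset and offset [21]. This paper therefore constructs 5×5 ganglion cell receptive-field windows sensitive to micro-movements in specific directions, and uses their responses to micro-movements in the preferred direction and in the opposite direction as the inputs to the ON-pathway and OFF-pathway, respectively. Taking a direction-selective ganglion cell with a preferred direction of 45° as an example, the receptive-field window is defined by Eqs. (9) and (10).

With these definitions, receptive-field windows can be formed for 8 directions from 0° to 315° in steps of 45°, corresponding to the 8 directions $\theta_i$ above. A local window $S_{xy}$ of the same size as the receptive field is then built on the luminance coding result $fires_{Res}$. According to the direction in the direction matrix Dir at the window center, the receptive-field windows in the same and opposite directions are convolved with the luminance coding result (the symbol ∗ denotes convolution), giving the inputs to the ON and OFF pathways, as in Eqs. (11) and (12).

Eye micro-movements turn a static spatial scene into a temporal information stream on the retina and trigger the firing of retinal neurons; moreover, the ON and OFF pathways fire only at the instants when a light stimulus appears or is withdrawn. This paper therefore uses first-spike time as the coding scheme and defines $T_{ON}$ and $T_{OFF}$ as the matrices of first-spike times of the two pathways, which serve as the primary edge response. One unit of firing time is set to 0.25, and computation stops when the total firing time exceeds 30; neurons that have not fired by then are judged to be non-edge.

1.3 Multi-feature spike-delay texture inhibition layer

When retinal ganglion cells encode light stimuli, changes in the features of the stimulus strongly affect the neurons' response time. Studies have found that when stimulus contrast increases, the response delay decreases and the neuron fires sooner; conversely, the response delay increases and the neuron's activity is suppressed [22]. Direction differences also affect neuronal activity: neurons with similar preferred directions before and after the synapse tend to connect preferentially and are activated synchronously more quickly under external stimulation [23]. This paper therefore introduces the delayed-firing mechanism of retinal neurons, considers the influence of direction and contrast on neuronal sensitivity, and builds a spike-delay inhibition model.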
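The luminance perception coding rule of Section 1.1 (pick the color component with the largest variance as the base image, then assign each pixel the minimum or maximum of the three firing counts depending on whether the base pixel is below its mean) can be sketched as follows. This is only an illustration under stated assumptions: images are small nested lists, the firing counts are made-up numbers rather than outputs of the neuron model, and the names `luminance_code`, `channels` and `fires` are hypothetical.

```python
from statistics import mean, pvariance


def luminance_code(channels, fires):
    """Combine per-channel firing counts into one luminance-coded map.

    channels: dict of channel name -> 2D list of luminance values
    fires:    dict of channel name -> 2D list of spike counts (same shape)
    Returns (name_of_base_channel, coded_2d_list).
    """
    flat = {k: [p for row in img for p in row] for k, img in channels.items()}
    # Base image: the channel with the largest luminance variance.
    base_name = max(flat, key=lambda k: pvariance(flat[k]))
    base = channels[base_name]
    avg = mean(flat[base_name])
    coded = []
    for y, row in enumerate(base):
        out_row = []
        for x, pix in enumerate(row):
            counts = [fires[k][y][x] for k in fires]
            # Below-average luminance -> minimum count, otherwise maximum.
            out_row.append(min(counts) if pix < avg else max(counts))
        coded.append(out_row)
    return base_name, coded


channels = {
    "R": [[0, 255], [0, 255]],
    "G": [[100, 110], [100, 110]],
    "B": [[50, 50], [50, 50]],
}
fires = {
    "R": [[2, 7], [2, 7]],
    "G": [[4, 5], [4, 5]],
    "B": [[1, 9], [3, 3]],
}
base_name, coded = luminance_code(channels, fires)
```

In this toy input the red channel has the largest variance, so the dark left column takes the smallest count at each pixel and the bright right column takes the largest.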
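First, image contrast is computed with a local window weight function, as in Eqs. (13) ~ (15), where $\omega(x_i, y_i)$ is the window weight function, L is the luminance image, Con is the contrast image, $S_{xy}$ is the local window centered at (x, y), $(x_i, y_i)$ are the surrounding pixels of the square window excluding the center, ws is the side length of the local square window, and $\mu = \sum_{(x_i, y_i) \in S_{xy}} \omega(x_i, y_i)$.

Fig. 3 Results of different processing methods on colony images with weak edges: (a) original image; (b) Izhikevich model; (c) improved Izhikevich model; (d) luminance perception coding

Next, the direction difference between the central neuron of the local window and its surrounding neurons is considered, and a Gaussian function is used to model the relation between contrast and the strength of the delay effect. This gives the spike-delay inhibition model of Eqs. (16) ~ (18), where $D_{Dir}(x, y)$ and $D_{Con}(x, y)$ are the direction delay inhibition amount and the contrast delay inhibition amount; $D(x, y)$ is the resulting combined delay inhibition amount; $\Delta Dir(x_i, y_i)$ is the difference in micro-movement direction between pre- and postsynaptic neurons, defined as $\min\{|\theta(x_i, y_i) - \theta(x, y)|,\; 2\pi - |\theta(x_i, y_i) - \theta(x, y)|\}$; and $\delta$ adjusts the contrast delay inhibition amount.

For the neurons that fired in the two time matrices $T_{ON}$ and $T_{OFF}$ computed above, the combined delay inhibition amount is added to their firing times. One unit of firing time is again set to 0.25; neurons whose total firing time after the delay exceeds 30 are set to non-firing, that is, judged to be non-edge, and the rest are judged to be edge. Eqs. (19) and (20) give the edge detection results of the two pathways, $Res_{ON}$ and $Res_{OFF}$. Finally, the results of the two pathways are fused into the final edge response Res, as in Eq. (21).

2 Algorithm flow

Based on the order in which the retina processes visual information and on its coding properties, this paper builds the algorithm flow shown in Fig. 4. The steps are as follows:
1) Based on the adaptation of the retina under sustained periodic light stimulation, improve the Izhikevich model of Eq. (1) to obtain the Izhikevich model with adaptive threshold of Eq. (2).
2) Map the image to be processed to the 0 ~ 255 range by Eq. (3) to normalize its luminance range, separate the three color components, and feed them into the improved Izhikevich model by Eq. (4) to generate spikes.
3) Extract the base image using the variance computation of Eq. (5), then apply luminance perception coding to the three-channel firing results by Eq. (6) to obtain the luminance coding result.
4) Following the fixational micro-movement mechanism of the human eye, use Eqs. (7) and (8) to extract the preferred direction of each neuron from the grayscale image and obtain the micro-movement direction matrix; then construct the direction-selective ganglion cell receptive-field windows for 8 directions by Eqs. (9) and (10).
5) By Eqs. (11) and (12), convolve the receptive-field windows with the luminance-coded image and feed the result into the Izhikevich model to obtain the first-spike time matrices of the ON and OFF pathways, which serve as the primary edge responses of the two pathways.
6) By Eqs. (13) ~ (15), compute image contrast using the local window weights.
7) Considering the delay effect of contrast and of pre- and postsynaptic preferred direction on firing, build the delay texture inhibition model by Eqs. (16) ~ (18), and fuse it with the primary edge responses of the two pathways by Eqs. (19) and (20).
8) By Eq. (21), integrate the texture-inhibited results of the two pathways at the ganglion cells to obtain the final edge response.

3 Results

To verify the effectiveness of the proposed method for colony edge detection, the Canny method and three other edge detection methods that are also based on neuronal coding are chosen for comparison, and both qualitative and quantitative analyses are carried out.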
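First, the edge detection method based on neuronal synaptic plasticity proposed in [4] (Synaptic plasticity model, SPM) is used to compare how well the proposed method enhances weak edges. Second, the edge detection method based on multi-layer neuron population firing coding with inhibitory synapses proposed in [24] (Inhibitory synapse model, ISM) is used to verify the effectiveness of the delay inhibition layer in suppressing redundant texture. Third, the hierarchical edge detection method based on orientation sensitivity of synaptically connected visual pathways proposed in [25] (Orientation sensitivity model, OSM) is used to compare the proposed method's advantage in keeping edges complete while suppressing redundant texture. Finally, the proposed method with luminance perception coding removed (No luminance coding, NLC) is used as an ablation experiment to verify the effectiveness of the luminance perception coding module that simulates photoreceptor function.

The experiments use colony images collected in the laboratory during microbiology experiments and the AGAR dataset [26]. The former have rich colors and morphological structures and test how well the algorithm adapts to complex detection conditions; the latter contain edges at more levels of strength, with colonies whose color and luminance are close to the background, and test the algorithm's sensitivity to color, luminance features and weak edges. By local sampling, 150 test images of 512×512 pixels were generated, 38 from the laboratory collection and 112 from the AGAR dataset. The 6 edge detection algorithms above were then applied to extract image edges, giving 150 edge detection results per algorithm; some of the results are shown in Fig. 5.

Qualitative analysis of Fig. 5 shows that Canny, SPM and ISM often lose large portions of the edges in images with weak edges such as Colony4 and Colony5. OSM is more sensitive to weak edges than these three methods, but still produces broken edges to varying degrees, and it is hard to balance edge continuity against redundancy inside the target colonies when adjusting its threshold. NLC also loses almost all edges in Colony4 and Colony5, detects only the darker colony interiors in Colony3, and discriminates poorly between edges with small gradient changes. Compared with the other methods, the proposed method detects edges that are more salient and more complete, detects weak edges well, and achieves good results on colony images with strong and weak edges at multiple levels, such as Colony3, Colony4 and Colony5.

Fig. 4 The procedure of the edge detection algorithm

Fig. 5 Edge detection results of Colony1 to Colony5 (row 1: original images; row 2: Canny; row 3: SPM; row 4: ISM; row 5: OSM; row 6: NLC; row 7: the proposed method)

To analyze the detection results quantitatively and evaluate the methods objectively, the edge image reconstruction similarity MSSIM [27] is computed: an image is reconstructed from the detection result, and the similarity between the reconstructed image and the original image is used as the measure of edge localization accuracy. The detected edge image is first dilated, and the pixel values of the original image are then assigned to the corresponding positions of the dilated edges; the resulting image is denoted ET. The reconstructed image R is then computed from ET, where $T_k$ are the surrounding pixels in the 8 directions of a 3×3 window on ET and $d_k$ is the distance between the window's center pixel and each surrounding pixel.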
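The similarity index of the reconstructed image is defined in terms of the following quantities: $\mu_A$ and $\mu_B$ are the gray-level means of the original and reconstructed images, $\sigma_A$ and $\sigma_B$ are their standard deviations, and $\sigma_{AB}$ is the covariance between the original and reconstructed images. The original and reconstructed images are each divided into N sub-images, the similarity index SSIM is computed for each, and the mean similarity index MSSIM is obtained.

In addition, to verify the authenticity of the detected edges and the ability to suppress redundant texture inside colonies, the edge confidence BIdx [28] is computed, which judges whether an edge is genuine from the size of the gray-level jump across it. In the edge confidence index, $\sigma_{ij}$ is the neighborhood standard deviation at the position of an edge pixel in the original image, and EdgeNum is the number of edge pixels.

The edge continuity CIdx [29] is further computed to verify the completeness of the detected target edges. The edge image E is first divided into m regions, and for each region the distance $d_i$ from each edge pixel $(x_{ik}, y_{ik})$ to the spatial center $(x_i, y_i)$ of the region is computed. In the continuity index, $c_{ik}$ is the edge continuity contribution, D is a threshold, $C_i$ is the sum of the continuity contributions of the pixels in the i-th region, and $n_i$ is the number of edge pixels in the i-th region.

Finally, the three indices are fused into the comprehensive evaluation index EIdx [21], where row and col are the numbers of rows and columns of the original image. The performance indices of the test images are shown in Tables 1 ~ 5, and the image reconstruction results are shown in Fig. 6.

Table 1 MSSIM of different methods

Image     Canny    SPM      ISM      OSM      NLC      Proposed
Colony1   0.7452   0.7725   0.8357   0.9265   0.9175   0.9371
Colony2   0.7951   0.7971   0.8490   0.9528   0.9447   0.9725
Colony3   0.8576   0.8662   0.8314   0.9149   0.8337   0.9278
Colony4   0.9690   0.9827   0.9838   0.9887   0.9893   0.9972
Colony5   0.9634   0.9758   0.9780   0.9771   0.9883   0.9933

Table 2 BIdx of different methods

Image     Canny    SPM      ISM      OSM      NLC      Proposed
Colony1   0.4988   0.4618   0.4307   0.5801   0.5058   0.6026
Colony2   0.1821   0.1537   0.1553   0.3365   0.4615   0.4479
Colony3   0.1983   0.1510   0.1610   0.2634   0.1263   0.3257
Colony4   0.1631   0.1488   0.1906   0.1437   0.1521   0.2016
Colony5   0.1620   0.1896   0.1902   0.1882   0.1735   0.1654

Table 3 CIdx of different methods

Image     Canny    SPM      ISM      OSM      NLC      Proposed
Colony1   0.8377   0.8530   0.8601   0.8676   0.9749   0.9652
Colony2   0.8069   0.8655   0.8533   0.8293   0.9177   0.9518
Colony3   0.8064   0.7408   0.7293   0.8269   0.7764   0.9406
Colony4   0.8143   0.8611   0.9044   0.8430   0.9015   0.9776
Colony5   0.9047   0.8448   0.8632   0.8592   0.8709   0.9571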
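For reference, the symbols listed for the similarity index (means, standard deviations and covariance of the original image A and reconstructed image B, averaged over N sub-images) match the standard structural similarity definition. The form below is that standard definition, written out as a reading aid rather than recovered from the source; the stabilizing constants $C_1$ and $C_2$ are the customary ones and are an assumption here, since the text does not list them.

```latex
\mathrm{SSIM}(A,B)=
\frac{(2\mu_A\mu_B+C_1)\,(2\sigma_{AB}+C_2)}
     {(\mu_A^2+\mu_B^2+C_1)\,(\sigma_A^2+\sigma_B^2+C_2)},
\qquad
\mathrm{MSSIM}=\frac{1}{N}\sum_{j=1}^{N}\mathrm{SSIM}(A_j,B_j)
```

Under this definition SSIM equals 1 only when the two sub-images have identical mean, variance and perfect covariance, which is consistent with the MSSIM values in Table 1 approaching 1 for the best reconstructions.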
For example, suppose you want to classify gender into two classes using the pretrained resnet18 model in PyTorch:

    # coding: utf-8
    import torch
    from torchvision import models
    from torch import nn

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    # Choose the model to use
    model_conv = models.resnet18(pretrained=True)

    # resnet18 has only one fully connected layer;
    # get its number of input features via .in_features
    fc_features = model_conv.fc.in_features

    # The default number of output neurons is 1000.
    # Change it to 2 for our binary classification (man and woman)
    model_conv.fc = nn.Linear(fc_features, 2)

The model is now set up.

2. Two or more classification tasks

Here we take two tasks as an example. In the single-task case above, we only changed a parameter value of the existing model and did not change its overall structure. But when we want two tasks at once, for example classifying gender and race at the same time, we need to add new layers on top of the original code and build a new model, as in the code below:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
        """3x3 convolution with padding"""
        return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                         padding=dilation, groups=groups, bias=False, dilation=dilation)


    def conv1x1(in_planes, out_planes, stride=1):
        """1x1 convolution"""
        return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)


    class BasicBlock(nn.Module):
        expansion = 1

        def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                     base_width=64, dilation=1, norm_layer=None):
            super(BasicBlock, self).__init__()
            if norm_layer is None:
                norm_layer = nn.BatchNorm2d
            if groups != 1 or base_width != 64:
                raise ValueError('BasicBlock only supports groups=1 and base_width=64')
            if dilation > 1:
                raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
            # Both self.conv1 and self.downsample layers downsample the input when stride != 1
            self.conv1 = conv3x3(inplanes, planes, stride)
            self.bn1 = norm_layer(planes)
            self.relu = nn.ReLU(inplace=True)
            self.conv2 = conv3x3(planes, planes)
            self.bn2 = norm_layer(planes)
            self.downsample = downsample
            self.stride = stride

        def forward(self, x):
            identity = x

            out = self.conv1(x)
            out = self.bn1(out)
            out = self.relu(out)

            out = self.conv2(out)
            out = self.bn2(out)

            if self.downsample is not None:
                identity = self.downsample(x)

            out += identity
            out = self.relu(out)
            return out


    class Bottleneck(nn.Module):
        expansion = 4

        def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
                     base_width=64, dilation=1, norm_layer=None):
            super(Bottleneck, self).__init__()
            if norm_layer is None:
                norm_layer = nn.BatchNorm2d
            width = int(planes * (base_width / 64.)) * groups
            # Both self.conv2 and self.downsample layers downsample the input when stride != 1
            self.conv1 = conv1x1(inplanes, width)
            self.bn1 = norm_layer(width)
            self.conv2 = conv3x3(width, width, stride, groups, dilation)
            self.bn2 = norm_layer(width)
            self.conv3 = conv1x1(width, planes * self.expansion)
            self.bn3 = norm_layer(planes * self.expansion)
            self.relu = nn.ReLU(inplace=True)
            self.downsample = downsample
            self.stride = stride

        def forward(self, x):
            identity = x

            out = self.conv1(x)
            out = self.bn1(out)
            out = self.relu(out)

            out = self.conv2(out)
            out = self.bn2(out)
            out = self.relu(out)

            out = self.conv3(out)
            out = self.bn3(out)

            if self.downsample is not None:
                identity = self.downsample(x)

            out += identity
            out = self.relu(out)
            return out


    class ResNet(nn.Module):

        def __init__(self, block, layers, zero_init_residual=False,
                     groups=1, width_per_group=64, replace_stride_with_dilation=None,
                     norm_layer=None, gender_classes=2, race_classes=4):
            super(ResNet, self).__init__()
            if norm_layer is None:
                norm_layer = nn.BatchNorm2d
            self._norm_layer = norm_layer

            self.inplanes = 64
            self.dilation = 1
            if replace_stride_with_dilation is None:
                # each element in the tuple indicates if we should replace
                # the 2x2 stride with a dilated convolution instead
                replace_stride_with_dilation = [False, False, False]
            if len(replace_stride_with_dilation) != 3:
                raise ValueError("replace_stride_with_dilation should be None "
                                 "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
            self.groups = groups
            self.base_width = width_per_group
            self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3,
                                   bias=False)
            self.bn1 = norm_layer(self.inplanes)
            self.relu = nn.ReLU(inplace=True)
            self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            self.layer1 = self._make_layer(block, 64, layers[0])
            self.layer2 = self._make_layer(block, 128, layers[1], stride=2,
                                           dilate=replace_stride_with_dilation[0])
            self.layer3 = self._make_layer(block, 256, layers[2], stride=2,
                                           dilate=replace_stride_with_dilation[1])
            self.layer4 = self._make_layer(block, 512, layers[3], stride=2,
                                           dilate=replace_stride_with_dilation[2])
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
            # Comment out the original fully connected layer
            # self.fc = nn.Linear(512 * block.expansion, num_classes)
            # Replace it with two parallel fully connected layers
            self.gen_fc = nn.Linear(512 * block.expansion, gender_classes)
            self.race_fc = nn.Linear(512 * block.expansion, race_classes)

            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                    nn.init.constant_(m.weight, 1)
                    nn.init.constant_(m.bias, 0)

            # Zero-initialize the last BN in each residual branch,
            # so that the residual branch starts with zeros, and each residual block behaves like an identity.
            # This improves the model by 0.2~0.3% according to https:///abs/1706.02677
            if zero_init_residual:
                for m in self.modules():
                    if isinstance(m, Bottleneck):
                        nn.init.constant_(m.bn3.weight, 0)
                    elif isinstance(m, BasicBlock):
                        nn.init.constant_(m.bn2.weight, 0)

        def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
            norm_layer = self._norm_layer
            downsample = None
            previous_dilation = self.dilation
            if dilate:
                self.dilation *= stride
                stride = 1
            if stride != 1 or self.inplanes != planes * block.expansion:
                downsample = nn.Sequential(
                    conv1x1(self.inplanes, planes * block.expansion, stride),
                    norm_layer(planes * block.expansion),
                )

            layers = []
            layers.append(block(self.inplanes, planes, stride, downsample, self.groups,
                                self.base_width, previous_dilation, norm_layer))
            self.inplanes = planes * block.expansion
            for _ in range(1, blocks):
                layers.append(block(self.inplanes, planes, groups=self.groups,
                                    base_width=self.base_width, dilation=self.dilation,
                                    norm_layer=norm_layer))

            return nn.Sequential(*layers)

        def forward(self, x):
            x = self.conv1(x)
            x = self.bn1(x)
            x = self.relu(x)
            x = self.maxpool(x)

            x = self.layer1(x)
            x = self.layer2(x)
            x = self.layer3(x)
            x = self.layer4(x)

            x = self.avgpool(x)
            x = x.view(x.size(0), -1)
            # Two parallel fully connected layers
            gender = F.softmax(self.gen_fc(x), 1)
            race = F.softmax(self.race_fc(x), 1)
            return gender, race


    def resnet18Owned(**kwargs):
        """Constructs a ResNet-18 model."""
        model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
        return model


    def test():
        net = resnet18Owned(gender_classes=2, race_classes=4)
        gender, race = net(torch.randn(2, 3, 224, 224))
        print('gender :', gender.size(), gender)
        print('race :', race.size(), race)


    if __name__ == '__main__':
        test()

This is a fairly simple example: it only changes a resnet18 with one fully connected layer into a resnet18 with two parallel fully connected layers. So how do we reuse the parameters of the previously trained resnet18 in this case?

    # coding: utf-8
    import torch
    from torchvision import models
    from torch import nn

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    # Load the pretrained model to get its structure and parameters
    pretrained_resnet18 = models.resnet18(pretrained=True)
    pretrained_resnet18_dict = pretrained_resnet18.state_dict()

    # Build your own model and get its structure and parameters as well
    # (resnet18Owned is the function defined in the code above)
    model_conv = resnet18Owned(gender_classes=2, race_classes=3)
    model_conv_dict = model_conv.state_dict()

    # Keep only the layers that exist in both models, i.e. the parameters
    # of every layer except the fully connected layer
    pretrained_resnet18_dict = {k: v for k, v in pretrained_resnet18_dict.items()
                                if k in model_conv_dict}

    # Use these values to update the parameters of your own model.
    # Apart from the fully connected layers you changed, all other layers
    # now carry the pretrained parameters
    model_conv_dict.update(pretrained_resnet18_dict)

    # Then load the parameters into your model
    model_conv.load_state_dict(model_conv_dict)

Later I learned that there is a simpler way: once your own model is defined, if you only want to use the pretrained parameters where the structure is the same, just set the strict argument to False when loading.
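To make the key-matching step above concrete without downloading any weights, here is a small sketch that uses ordinary dictionaries as stand-ins for the two state dicts. The key names and string values are made up for illustration, and `merge_state` is a hypothetical helper, not a PyTorch function; it mirrors the dictionary filter and `update` call used in the code above.

```python
def merge_state(pretrained, own):
    """Copy into `own` only the entries whose key also exists in `own`.

    Returns (merged_dict, missing_keys, unexpected_keys).
    """
    shared = {k: v for k, v in pretrained.items() if k in own}
    merged = dict(own)
    merged.update(shared)
    # Keys your model has but the pretrained one lacks: keep their initial values.
    missing = [k for k in own if k not in pretrained]
    # Keys only the pretrained model has: they are simply dropped.
    unexpected = [k for k in pretrained if k not in own]
    return merged, missing, unexpected


# Stand-ins for state dicts: real ones map names to tensors.
pretrained = {
    "conv1.weight": "P_conv1",
    "layer1.0.conv1.weight": "P_layer1",
    "fc.weight": "P_fc",
    "fc.bias": "P_fc_bias",
}
own = {
    "conv1.weight": "init",
    "layer1.0.conv1.weight": "init",
    "gen_fc.weight": "init",
    "race_fc.weight": "init",
}
merged, missing, unexpected = merge_state(pretrained, own)
```

With real models, `load_state_dict(state, strict=False)` plays the role of this filter and, in recent PyTorch versions, reports the missing and unexpected keys it found. As far as I know it does not skip tensors whose names match but whose shapes differ, so a same-named layer with a different size would still raise an error; renaming the new heads, as done with `gen_fc` and `race_fc` above, avoids that.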
The abnormal windows from the first layer are passed to an XGBoost detection model based on the Bayesian optimization algorithm (BOA) for further ...
accurate to a single flow. Compared with the separate detection models,
Filter-XGBoost combines the advantages of both detection models. Compared with
other algorithms, Filter-XGBoost performs well in detection rate and false alarm rate.
Bilevel Optimization Model for High-speed Railway Train Operation Diagram Considering Multifactor Cooperation
SHI Min-han 1, LV Hong-xia 1,2,3, NI Shao-quan 1,2,3, LV Miao-miao 1,2,3
(1. School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, China; 2. National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Chengdu 611756, China; 3. National United Engineering Laboratory of Integrated and Intelligent Transportation, Chengdu 611756, China)
Journal of Transportation Engineering and Information, Vol. 20, No. 2, Jun. 2022
CLC number: U292.4+1; Document code: A; DOI: 10.19961/ki.1672-4747.2021.07.004

Abstract: The train stop plan, train operation diagram, and electrical multiple units (EMUs) circulation plan interact. Their collaborative optimization improves the satisfaction of passenger flow demand and, in addition, minimizes the transportation organization costs of the railway department, which are determined by EMU operation. Therefore, a bilevel model based on an analysis of the collaborative relationship between these three schemes was established in this study. The upper-level model was the collaborative optimization model established to achieve the highest satisfaction of passenger demand and the lowest transportation cost of railway departments. The lower-level model was the optimal EMUs operation model for the minimum number of EMUs used and the minimum connection time. The lower-level model transferred the transportation cost of the railway department, determined using the optimal EMUs operation index, to the objective function of the upper-level model, which constitutes the connection between these two levels. Combined with the double-layer characteristics of the model, a double-layer heuristic algorithm was designed to solve the problem. In the outer layer, the adaptive large-neighborhood search algorithm, with high computational efficiency and good computational effect, was adopted. The selection probability of the operators was dynamically determined based on their historical performance to obtain a feasible solution of the train stop plan and train operation diagram. The inner layer used a simulated annealing algorithm to determine the corresponding optimal EMUs connection scheme based on the outer layer, and output the index to the outer layer. Finally, the example analysis shows that a comprehensive scheme with improved indexes can be obtained within an acceptable time range using the proposed collaborative optimization method, verifying the effectiveness of the model and algorithm.

Key words: railway transportation; train stop plan; train operation diagram; EMUs circulation plan; collaborative optimization; bilevel programming

Received: 2021-07-05; Accepted: 2021-08-09; Published online: 2021-08-18; Review periods: 2021-07-05~07-14; 08-02~08-07; 08-09
Funding: National Key R&D Program of China (2017YFB1200702); National Natural Science Foundation of China (52072314); Sichuan Science and Technology Program (2020YJ0268, 2020YJ0256); Chengdu Science and Technology Project (2019-YF05-01493-SN, 2020-RK00-00036-ZF); Zhejiang Provincial Natural Science Foundation (LQ18G030012); Humanities and Social Sciences Fund of the Ministry of Education (18YJC630190)
About the authors: SHI Min-han (born 1998), female, master's student, research interest: theory and methods of transportation organization optimization. Corresponding author: LV Miao-miao (born 1986), female, Ph.D., lecturer, research interest: optimization of rail transit transportation organization.
Citation: SHI Min-han, LV Hong-xia, NI Shao-quan, et al. Bilevel Optimization Model for High-speed Railway Train Operation Diagram Considering Multifactor Cooperation[J]. Journal of Transportation Engineering and Information, 2022, 20(2): 125-135.

0 Introduction

The quality of the high-speed railway train stop plan and train operation diagram determines how convenient travel is for passengers, while the quality of the EMU connection plan determines the transportation cost.
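The abstract states that, in the outer adaptive large-neighborhood search, the selection probability of the operators is set dynamically from their historical performance. A common way to do this is roulette-wheel selection over weights that are blended with recent scores; the sketch below illustrates that idea only. The operator names, the scores, the reaction factor of 0.2 and all function names are hypothetical, and the paper's actual update rule is not given in this excerpt.

```python
import random


def select_operator(weights, rng):
    """Roulette-wheel choice: each operator is picked with probability weight / sum(weights)."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def update_weight(weights, name, score, reaction=0.2):
    """Blend the old weight with the latest score (reaction factor is an assumed value)."""
    weights[name] = (1.0 - reaction) * weights[name] + reaction * score


def probabilities(weights):
    """Current selection probability of every operator."""
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


rng = random.Random(0)
weights = {"remove_random": 1.0, "remove_worst": 1.0}
for _ in range(50):
    op = select_operator(weights, rng)
    # Pretend one operator keeps producing better solutions than the other.
    score = 3.0 if op == "remove_worst" else 0.5
    update_weight(weights, op, score)
probs = probabilities(weights)
```

Because a better-scoring operator accumulates weight, it is chosen more often in later iterations, while weaker operators keep a nonzero chance of being tried again.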
An overview of ocean renewable energy in China 中国海洋资源开发回顾
An overview of ocean renewable energy in ChinaRenewable and Sustainable Energy ReviewsFacing great pressure of economic growth and energy crisis, China pays much attention to the renewable energy. An overview of policy and legislation of renewable energy as well as status of development of renewable energy in China was given in this article. By analysis, the authors believe that ocean energy is a necessary addition to existent renewable energy to meet the energy demand of the areas and islands where traditional forms of energy are not applicable and it is of great importance in adjusting energy structure of China. In the article, resources distribution and technology status of tidal energy, wave energy, marine current energy, ocean thermal energy and salinity gradient energy in China was reviewed, and assessment and advices were given for each category. Some suggestions for future development of ocean energy were also given.Design pressure distributions on the hull of the FLOW wave energy converterThis paper presents a procedure to calculate the design pressure distributions on the hull of a wave energy converter (WEC). Design pressures are the maximum pressure values that the device is expected to experience during its operational life time. The procedure is applied to the prototype under development by Martifer Energy (FLOW—Future Life in Ocean Waves).A boundary integral method is used to solve the hydrodynamic problem. The hydrodynamic pressures are combined with the hydrostatic ones and the internal pressures of the large ballast tanks. The first step consists of validating the numerical results of motions by comparison with measured experimental data obtained with a scaled model of the WEC. The numerical model is tuned by adjusting the damping of the device rotational motions and the equivalent damping and stiffness of the power take-off system. 
The pressure distributions are calculated for all irregular sea states representative of the Portuguese Pilot Zone where the prototype will be installed and a long term distribution method is used to calculate the expected maximum pressures on the hull corresponding to the 100-year return period.海波流能量转换器压力分散设计Development of an adaptive disturbance rejection system for the rapidly deployable stable platform–part 1: Mathematical modeling and open loop response 海面作业平台系统的稳定性保证:数学建模与开环响应模拟实验A Rapidly Deployable Stable Platform (RDSP) concept was investigated at Florida Atlantic University in response to military and civilian needs for ocean platforms with improved sea-keeping characteristics. The RDSP is designed to have enhanced sea-keeping abilities through the combination of a novel hull and thruster design coupled with active control. The RDSP is comprised of a catamaran that attaches via a hinge to a spar, enabling it to transit like a trimaran and then reconfigure so that the spar lifts the catamaran out of the water, creating a stable spar platform. The focus of this research is the mathematical modeling, simulation, and response characterization of the RDSP to provide a foundation for controller design, testing, and tuning. The mathematical model includes a detailed representation of residual drag, friction drag, added mass, hydrostatic and hydrodynamic pressure, and control actuator dynamics. Validation has been performed by comparing the simulation predicted motions of the RDSP operating in waves to the measured motions of the 1/10th scale prototype measured at sea. 
Resulting from this paper is an empirical assessment of the response characteristics of the RDSP that quantifies the performance under extreme conditions and provides a solid basis for controller development and testing.Combined use of dimensional analysis and modern experimental design methodologies in hydrodynamics experiments海洋工程设计的多维度分析与现代化的实验化设计与验证的方法In this paper, a combined use of dimensional analysis (DA) and modern statistical design of experiment (DOE) methodologies is proposed for a hydrodynamics experiment where there are a large number of variables. While DA is well-known, DOE is still unfamiliar to most ocean engineers although it has been shown to be useful in many engineering and non-engineering applications. To introduce and illustrate the method, a study concerning the thrust of a propeller is considered. Fourteen variables are involved in the problem and after dimensional analysis this reduces to 11 dimensionless parameters. Then, a two-level fractional factorial design was used to screen out parameters that do not significantly contribute to explaining the dependent dimensionless parameter. With the remaining five statistically significant dimensionless parameters, various response surface methodologies (RSM) were used to obtain a functional relationship between the dependent dimensionless thrust coefficient, and the five dimensionless parameters. The final model was found to be of reasonable accuracy when tested against results not used to develop the model. The methodologies presented in the paper can be similarly applied to systems with a large number of control variables to systematically derive approximate mathematical models to predict the responses of the system economically and accurately.Progress toward autonomous ocean sampling networks海洋实验与勘测的自动取样网络化系统的设计进程The goals of the Autonomous Ocean Sampling Network (AOSN) are reviewed and progress toward those goals is assessed based on results of recent, major field experiments. 
Major milestones include the automated control of multiple, mobile sensors for weeks using spatial coverage metrics and the transition from engineering a reliable data stream to managing the complexities of decision-making based on the data and the possibilities of timely feedback.Non-uniform adaptive vertical grids for 3D numerical ocean modelsOcean Modelling采用垂直化网格表示的的三维数字化海洋模型海洋建模学报A new strategy for the vertical gridding in terrain-following 3D ocean models is presented here. The vertical grid adaptivity is partially given by a vertical diffusion equation for the vertical layer positions, with diffusivities being proportional to shear, stratification and distance from the boundaries. In the horizontal, the grid can be smoothed with respect to z-levels, grid layer slope and density. Lagrangian tendency of the grid movement is supported. The adaptive terrain-following grid can be set to be an Eulerian–Lagrangian grid, a hybrid σ–ρ or σ–z grid and combinations of these with great flexibility. With this, internal flow structures such as thermoclines can be well resolved and followed by the grid. A set of idealised examples is presented in the paper, which show that the introduced adaptive grid strategy reduces pressure gradient errors and numerical mixing significantly. The grid adaption strategy is easy to implement in various types of terrain-following ocean models. The idealised examples give evidence that the adaptive grids can improve realistic, long-term simulations of stratified seas while keeping the advantages of terrain-following coordinates.Procedures for offline grid nesting in regional ocean models海岸离散测绘网布局点与区域海洋模型图绘制仿真步骤与过程One-way offline nesting of a primitive-equation regional ocean numerical model (ROMS) is investigated, with special attention to the boundary forcing file creation process. The model has a modified open boundary condition which minimises false wave reflections, and is optimised to utilise high-frequency boundary updates. 
The model configuration features a previously computed solution which supplies boundary forcing data to an interior domain with an increased grid resolution. At the open boundaries of the interior grid (the child) the topography is matched to that of the outer grid (the parent), over a narrow transition region. A correction is applied to the normal baroclinic and barotropic velocities at the open boundaries of the child to ensure volume conservation. It is shown that these steps, together with a carefully constructed interpolation of the parent data, lead to a high-quality child solution, with minimal artifacts such as persistent rim currents and wave reflections at the boundaries.

Development of a Coupled Ocean–Atmosphere–Wave–Sediment Transport (COAWST) Modeling System

Understanding the processes responsible for coastal change is important for managing our coastal resources, both natural and economic. The current scientific understanding of coastal sediment transport and geology suggests that examining coastal processes at regional scales can lead to significant insight into how the coastal zone evolves. To better identify the significant processes affecting our coastlines and how those processes create coastal change, we developed a Coupled Ocean–Atmosphere–Wave–Sediment Transport (COAWST) Modeling System, which comprises the Model Coupling Toolkit to exchange data fields between the ocean model ROMS, the atmosphere model WRF, the wave model SWAN, and the sediment capabilities of the Community Sediment Transport Model. This formulation builds upon previous developments by coupling the atmospheric model to the ocean and wave models, providing one-way grid refinement in the ocean model, one-way grid refinement in the wave model, and coupling on refined levels. Herein we describe the modeling components and the data fields exchanged.
The modeling system is used to identify model sensitivity by exchanging prognostic variable fields between different model components during an application to simulate Hurricane Isabel during September 2003. Results identify that hurricane intensity is extremely sensitive to sea surface temperature. Intensity is reduced when coupled to the ocean model, although the coupling provides a more realistic simulation of the sea surface temperature. Coupling of the ocean to the atmosphere also results in decreased boundary layer stress, and coupling of the waves to the atmosphere results in increased bottom stress. Wave results are sensitive to both ocean and atmospheric coupling due to wave–current interactions with the ocean and wave growth from the atmosphere wind stress. Sediment resuspension at regional scale during the hurricane is controlled by shelf width and wave propagation during hurricane approach.

Contact dynamics of two floating cable-connected bodies

We consider two ship-like bodies connected by six cables and excited by waves. The cables might be under tension, or they might be slack, thus forming a unilateral system generating possible impacts. The impact forces can reach 20,000 kN and are able to cause damage to a ship. In order to avoid such large impact forces, anti-shock buffers might be adopted, but good buffer design requires knowledge of the impact forces. We have evaluated these using multi-body theory with unilateral contacts in combination with classical ship dynamics, which allows modeling of the contact dynamics of two floating bodies in an ocean. Based on an optimization algorithm, a method using an artificial neural network (NNW) has been developed to determine the combination of possible constraints at each step. The results of a numerical example compare reasonably well with experiments.
We have thus established a theoretical basis for further buffer design.

Joint modelling of wave spectral parameters for extreme sea states

Characterising the dependence between extremes of wave spectral parameters such as significant wave height (H_S) and spectral peak period (T_P) is important in understanding extreme ocean environments and in the design and assessment of marine structures. For example, it is known that mean values of wave periods tend to increase with increasing storm intensity. Here we seek to characterise joint dependence in a straightforward manner, accessible to the ocean engineering community, using a statistically sound approach.

Many methods of multivariate extreme value analysis are based on models which assume implicitly that in some joint tail region each parameter is either independent of or asymptotically dependent on other parameters; yet in reality the dependence structure in general is neither of these. The underpinning assumption of multivariate regular variation restricts these methods to estimation of joint regions in which all parameters are extreme; but regions where only a subset of parameters are extreme can be equally important for design. The conditional approach of Heffernan and Tawn (2004), similar in spirit to that of Haver (1985) but with better theoretical foundation, overcomes these difficulties.

We use the conditional approach to characterise the dependence structure of H_S and T_P. The key elements of the procedure are: (1) marginal modelling for all parameters, (2) transformation of data to a common standard Gumbel marginal form, (3) modelling dependence between data for extremes of pairs of parameters using a form of regression, (4) simulation of long return periods to estimate joint extremes. We demonstrate the approach in application to measured and hindcast data from the Northern North Sea, the Gulf of Mexico and the North West Shelf of Australia.
We also illustrate the use of data re-sampling techniques such as bootstrapping to estimate the uncertainty in marginal and dependence models and accommodate this uncertainty in extreme quantile estimation.

We discuss the current approach in the context of other approaches to multivariate extreme value estimation popular in the ocean engineering community.
(Simulation of the combined effects of multiple external forces in extreme ocean environments)

Robust diving control of an AUV

Mobile systems traveling through a complex environment present major difficulties in determining accurate dynamic models. Autonomous underwater vehicle motion in ocean conditions requires investigation of new control solutions that guarantee robustness against external parameter uncertainty.

A diving-control design, based on Lyapunov theory and back-stepping techniques, is proposed and verified. Using adaptive and switching schemes, the control system is able to meet the required robustness. The results of the control system are theoretically proven, and simulations are developed to demonstrate the performance of the solutions proposed.
(Robust control of mobile underwater vehicles)

Transient behavior of towed cable systems during ship turning maneuvers

The dynamic behavior of a towed cable system that results from the tow ship changing course from a straight-tow trajectory to one involving steady circular turning at a constant radius is examined. For large-radius ship turns, the vehicle trajectory and vehicle depth assumed, monotonically and exponentially, the large-radius steady-state turning solution of Chapman [Chapman, D.A., 1984. The towed cable behavior during ship turning manoeuvers. Ocean Engineering 11, 327–361]. For small-radius ship turns, the vehicle trajectory initially followed a corkscrew pattern with the vehicle depth oscillating about and eventually decaying to the steady-state turning solution of Chapman (1984).
The change between monotonic and oscillatory behavior in the time history of the vehicle depth was well defined and offered an alternate measure to Chapman's (1984) critical radius for the transition point between large-radius and small-radius behavior. For steady circular turning in the presence of current, there was no longer a steady-state turning solution. Instead, the vehicle depth oscillated with amplitude that was a function of the ship-turning radius and the ship speed. The dynamics of a single 360° turn and a 180° U-turn are discussed in terms of the transients of the steady turning maneuver. For a single 360° large-radius ship turn, the behavior was marked by the vehicle dropping to the steady-state turning depth predicted by Chapman (1984) and then rising back to the initial, straight-tow equilibrium depth once the turn was completed. For small ship-turning radius, the vehicle dropped to a depth corresponding to the first trough of the oscillatory time series of the steady turning maneuver before returning to the straight-tow equilibrium depth once the turn was completed. For some ship-turning radii, this resulted in a maximum vehicle depth that was greater than the steady-state turning depth. For a 180° turn and ship-turning radius less than the length of the tow cable, the vehicle never reached the steady-state turning depth.
(Cable stability of ocean survey vessels and the transmission of wave-induced vibration effects)

On the structure of Langmuir turbulence

The Stokes drift induced by surface waves distorts turbulence in the wind-driven mixed layer of the ocean, leading to the development of streamwise vortices, or Langmuir circulations, on a wide range of scales. We investigate the structure of the resulting Langmuir turbulence, and contrast it with the structure of shear turbulence, using rapid distortion theory (RDT) and kinematic simulation of turbulence.
Firstly, these linear models show clearly why elongated streamwise vortices are produced in Langmuir turbulence, when Stokes drift tilts and stretches vertical vorticity into horizontal vorticity, whereas elongated streaky structures in streamwise velocity fluctuations (u) are produced in shear turbulence, because there is a cancellation in the streamwise vorticity equation and instead it is vertical vorticity that is amplified. Secondly, we develop scaling arguments, illustrated by analysing data from LES, that indicate that Langmuir turbulence is generated when the deformation of the turbulence by mean shear is much weaker than the deformation by the Stokes drift. These scalings motivate a quantitative RDT model of Langmuir turbulence that accounts for deformation of turbulence by Stokes drift and blocking by the air–sea interface, and that is shown to yield profiles of the velocity variances in good agreement with LES. The physical picture that emerges, at least in the LES, is as follows. Early in the life cycle of a Langmuir eddy, initial turbulent disturbances of vertical vorticity are amplified algebraically by the Stokes drift into elongated streamwise vortices, the Langmuir eddies. The turbulence is thus in a near two-component state, with suppressed and . Near the surface, over a depth of order the integral length scale of the turbulence, the vertical velocity (w) is brought to zero by blocking of the air–sea interface. Since the turbulence is nearly two-component, this vertical energy is transferred into the spanwise fluctuations, considerably enhancing at the interface. After a time of order half the eddy decorrelation time, the nonlinear processes, such as distortion by the strain field of the surrounding eddies, arrest the deformation and the Langmuir eddy decays. Presumably, Langmuir turbulence then consists of a statistically steady state of such Langmuir eddies.
The analysis then provides a dynamical connection between the flow structures in LES of Langmuir turbulence and the dominant balance between Stokes production and dissipation in the turbulent kinetic energy budget, found by previous authors.

Effects of vertical variations of thickness diffusivity in an ocean general circulation model
(Ocean current circulation model)

The effects of a prescribed surface intensification of the thickness (and isopycnal) diffusivity on the solutions of an ocean general circulation model are documented. The model is the coarse resolution version of the ocean component of the National Center for Atmospheric Research (NCAR) Community Climate System Model version 3 (CCSM3). Guided by the results of Ferreira et al. (2005) [Ferreira, D., Marshall, J., Heimbach, P., 2005. Estimating eddy stresses by fitting dynamics to observations using a residual-mean ocean circulation model and its adjoint. J. Phys. Oceanogr. 35, 1891–1910.] we employ a vertical dependence of the diffusivity which varies with the stratification, N², and is thus large in the upper ocean and small in the abyss. We experiment with vertical variations of diffusivity which are as large as 4000 m² s⁻¹ within the surface diabatic layer, diminishing to 400 m² s⁻¹ or so by a depth of 2 km. The new solutions compare more favorably with the available observations than those of the control, which uses a constant value of 800 m² s⁻¹ for both thickness and isopycnal diffusivities. These include an improved representation of the vertical structure and transport of the eddy-induced velocity in the upper-ocean North Pacific, a reduced warm bias in the upper ocean, including the equatorial Pacific, and improved southward heat transport in the low- to mid-latitude Southern Hemisphere.
There is also a modest enhancement of abyssal stratification in the Southern Ocean.

Using satellite altimetry to correct mean temperature and salinity fields derived from Argo floats in the ocean regions around Australia

We present results from a suite of methods using in situ temperature and salinity data and satellite altimetric observations to obtain an enhanced set of mean fields of temperature, salinity (down to 2000-m depth) and steric height (0/2000 m) for a time-specific period (1992–2007). Firstly, the improved global sampling resulting from the introduction of the Argo program enables a representative determination of the large-scale mean oceanic structure. However, shortcomings in the coverage remain. High-variability western boundary current eddy fields, continental slope and shelf boundaries may all be below their optimal sampling requirements. We describe a simple method to supplement and improve standard spatial interpolation schemes and apply them to the available data within the waters surrounding Australia (100°E–180°W; 50°S–10°N). This region includes a major current system, the East Australian Current (EAC), complex topography, unique boundary currents such as the Leeuwin Current, and large ENSO-related interannual variability in the southwest Pacific. We use satellite altimetry sea level anomalies (SLA) to directly correct sampling errors in in situ derived mean surface steric height and subsurface temperature and salinity fields. The surface correction is projected through the water column (using an empirical model) to modify the mean subsurface temperature and salinity fields. The errors inherent in all these calculations are examined. The spatial distribution of the barotropic–baroclinic balance is obtained for the region, and a 'baroclinic factor' to convert the altimetry SLA into an equivalent in situ height is determined.
The mean fields in the EAC region are compared with independent estimates on repeated XBT sections, a mooring array and full-depth CTD transects.
(Large-scale aerial and remote-sensing detection technology for ocean development.)
Control Theory & Applications, Vol. 40, No. 9, Sep. 2023

Adaptive critic control for multi-player non-zero-sum games with asymmetric constraints

LI Meng-hua, WANG Ding, QIAO Jun-fei†
(Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China; Beijing Laboratory of Smart Environmental Protection, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing 100124, China)

DOI: 10.7641/CTA.2022.20063

Abstract: In this paper, an adaptive critic control method based on neural networks is established for multi-player non-zero-sum games with asymmetric constraints of continuous-time nonlinear systems. First, a novel nonquadratic function is proposed to deal with asymmetric constraints, and then the optimal control laws and the coupled Hamilton-Jacobi equations are derived. It is worth noting that the optimal control strategies do not stay at zero when the system state is zero, which differs from previous work. After that, only a critic network is constructed to approximate the optimal cost function for each player, so as to obtain the associated approximate optimal control strategies. Meanwhile, a new weight updating rule is developed during critic learning. In addition, the stability of the weight estimation errors of the critic networks and of the closed-loop system state is proved by utilizing the Lyapunov method. Finally, simulation results verify the
effectiveness of the method proposed in this paper.

Key words: neural networks; adaptive critic control; adaptive dynamic programming; nonlinear systems; asymmetric constraints; multi-player non-zero-sum games

Citation: LI Menghua, WANG Ding, QIAO Junfei. Adaptive critic control for multi-player non-zero-sum games with asymmetric constraints. Control Theory & Applications, 2023, 40(9): 1562–1568

Received 2022-01-21; accepted 2022-11-10. †Corresponding author. E-mail: ***************.cn. Handling editor: WANG Long. Supported by the National Key Research and Development Program of China (2021ZD0112302, 2021ZD0112301, 2018YFC1900800–5), the Beijing Natural Science Foundation (JQ19013) and the National Natural Science Foundation of China (62222301, 61890930–5, 62021003).

1 Introduction

The adaptive dynamic programming (ADP) method was first proposed by Werbos [1]. It combines dynamic programming, neural networks, and reinforcement learning, and its core idea is to use a function approximation structure to estimate the optimal cost function and thereby obtain an approximate optimal solution for the controlled system. Within the ADP framework, dynamic programming, which embodies the principle of optimality, provides the theoretical foundation; neural networks, as function approximation structures, provide the means of implementation; and reinforcement learning provides the learning mechanism. Notably, the ADP method has a strong self-learning ability and great potential for handling optimal control problems of complex nonlinear systems [2–7]. In addition,
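ADP, as a new method for approximately solving optimal control problems, has become a research hotspot in intelligent control and computational intelligence. For detailed theoretical studies of ADP and its applications, readers may refer to [8–9]. In this paper, ADP-based optimal control of dynamic systems is collectively referred to as adaptive critic control.

In recent years, differential games have received increasing attention in the control community. Differential games provide a standard mathematical framework for studying cooperation, competition, and control in multi-player systems, and include two-player zero-sum games, multi-player zero-sum games, and multi-player non-zero-sum games. In a zero-sum game, the control input tries to minimize the cost function while the disturbance input tries to maximize it. In a non-zero-sum game, each player independently chooses an optimal control strategy to minimize its own cost function. Zero-sum games have been studied extensively. In [10], the authors proposed an improved ADP method to solve the two-player zero-sum game of multi-input nonlinear continuous-time systems. An et al. [11] proposed two algorithms based on integral reinforcement learning to solve multi-player zero-sum games of continuous-time systems. Ren et al. [12] proposed a novel synchronous off-policy method for multi-player zero-sum games. However, studies of non-zero-sum games [13–14] remain scarce.

In addition, control constraints are widespread in practical applications. These constraints are usually caused by the inherent physical characteristics of actuators, such as air pressure, voltage, and temperature. Therefore, to ensure the performance of the controlled system, constrained systems need to be considered. Zhang et al. [15] developed a novel event-sampled ADP method to solve the robust optimal control problem of nonlinear continuous-time constrained systems. Huo et al. [16] studied decentralized event-triggered control for a class of nonlinear constrained interconnected systems. Yang and He [17] studied event-triggered robust stabilization for a class of nonlinear systems with mismatched disturbances and input constraints. These works all consider symmetric constraints, whereas in practice the constraints on the controlled system may also be asymmetric [18–20]. For example, in wastewater treatment, the dissolved oxygen concentration and the nitrate nitrogen concentration are controlled through the oxygen transfer coefficient and the internal recirculation flow rate, and, depending on the actual operating conditions, these two control variables must be limited to an asymmetric constraint range [20]. Asymmetric constraints are therefore one direction the authors pursue in controller design.

So far, some researchers have obtained results on differential games with control constraints [12, 21–23]. However, multi-player non-zero-sum games with asymmetric constraints have not yet been studied. Moreover, in multi-player non-zero-sum games the associated coupled Hamilton-Jacobi (HJ) equations are difficult to solve. Therefore, for multi-player non-zero-sum games with asymmetric constraints for a class of continuous-time nonlinear systems, this paper proposes an adaptive critic control method that approximately solves the coupled HJ equations and thereby obtains an approximate optimal solution for the controlled system. The main contributions of this paper are as follows: 1) asymmetric constraints are applied for the first time to multi-player non-zero-sum games of continuous-time nonlinear systems; 2) a novel nonquadratic function is proposed to handle asymmetric constraints, and the optimal control strategy is nonzero when the system state is zero, which differs from previous work; 3) during learning, a single critic network structure replaces the traditional actor–critic network structure, and a new weight updating rule is proposed; 4) the Lyapunov method is used to prove that the critic network weight approximation errors and the system state are uniformly ultimately bounded (uniformly

English essay: Ma Long and table tennis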
Table Tennis Ma Long: A Champion's Journey

Ma Long, often referred to as the "Dragon" in the world of table tennis, is a prominent figure whose journey has inspired countless athletes and enthusiasts globally. Born on October 20, 1988, in Anshan, Liaoning, China, Ma Long's remarkable skills, dedication, and achievements have solidified his place as one of the greatest table tennis players of all time.

Early Years and Beginnings in Table Tennis

Ma Long's passion for table tennis ignited at a young age. His early years were marked by intense training and a deep-seated determination to excel in the sport. Under the guidance of skilled coaches and mentors, he honed his techniques, developed his style, and gradually rose through the ranks.

Rise to Prominence
Integration Platform ω-equation
2-equation models • k-ω, BSL, SST
Transition Model • γ-Reθ model
Unsteady models • SST-SAS • SST-DES
Wall Treatment • Automatic wall treatment
e^N method (only natural transition)
– Very accurate predictions for 2D airfoils (low FSTI)
– N-S codes are not accurate enough to evaluate stability equations
– Extension to generic 3D flows very difficult (impossible?)
– Cannot account for non-linear effects (e.g. high FSTI, roughness)
– Can be used together with the DES/SAS models
E-LES: Spatially decaying turbulence
An adaptive coupled-layer visual model for robust visual tracking

Luka Čehovin, Matej Kristan, and Aleš Leonardis
Faculty of Computer and Information Science, University of Ljubljana, Slovenia
Tržaška 25, Ljubljana, Slovenia
{luka.cehovin,matej.kristan,ales.leonardis}@fri.uni-lj.si

Abstract

This paper addresses the problem of tracking objects which undergo rapid and significant appearance changes. We propose a novel coupled-layer visual model that combines the target's global and local appearance. The local layer in this model is a set of local patches that geometrically constrain the changes in the target's appearance. This layer probabilistically adapts to the target's geometric deformation, while its structure is updated by removing and adding the local patches. The addition of the patches is constrained by the global layer, which probabilistically models the target's global visual properties such as color, shape and apparent local motion. The global visual properties are updated during tracking using the stable patches from the local layer. By this coupled constraint paradigm between the adaptation of the global and the local layer, we achieve more robust tracking through significant appearance changes. Indeed, the experimental results on challenging sequences confirm that our tracker outperforms the related state-of-the-art trackers by having a smaller failure rate as well as better accuracy.

1. Introduction

Visual tracking is an important research area in computer vision. In practice, the holistic approaches [6, 11, 5, 16], which globally model the target's appearance, have proven to be very successful. However, scenarios that contain rapid structural appearance changes present such models with serious difficulties. The reason is that such visual changes lead to reduced matches and drifting, which eventually result in the tracker's failure.

To address these problems, approaches to tracking using sets of simple local parts have been proposed [15, 1, 3, 12, 9, 18]. Flock-of-features, proposed by
Kölsch and Turk [9] and later extended by Hoey [7], was one of the early attempts. In flock-of-features, a set of simple features (e.g. optical flow features) is used to independently track individual parts of the object. If a feature violates simple flocking rules based on a distance to other features, it is replaced by a new feature using a predefined fixed color distribution. Since the set of features is geometrically unconstrained, the tracker is likely to get stuck on the background and tracking fails. Yin and Collins [18] use the Harris corner detector to determine only the stable regions for tracking and enforce a single global affine transformation constraint to avoid drifting. However, the number of stable regions is highly dependent on the object texture. If the object's color is homogeneous, no stable regions will be found and the tracking will fail. To avoid such problems, Fan et al. [4] proposed to track a target with a set of kernels which are connected by a global affine transformation constraint. To enable handling slightly more involved changes in the appearance, Martinez and Binefa [15] connected multiple kernels together in triplets and constrained them with a local affine transformation. However, each kernel and the connections have to be carefully manually initialized based on the target's structural properties. This is undesirable in many tracking scenarios. Furthermore, the set of kernels is fixed and the tracker therefore cannot adapt to the target's larger appearance changes.

Figure 1. Illustration of the proposed coupled-layer visual model. The local layer is a geometrical constellation of visual parts that describe the target's local visual properties. The global layer encodes the target's global visual features in a probabilistic model.

A four-part fully-connected structure has been proposed by Badrinarayanan et al. [1] for face tracking. The visual model is composed of four patches, constrained by a flexible fully-connected graph. Because the number of parts is
low, the problem can still be solved efficiently using a particle filter; however, this approach is not suitable for larger sets of parts. Another drawback is that it requires manual initialization of positions for each patch. Chang et al. [3] used Markov random fields to encode the spatial constraints between parts. Only subsets of parts are connected in this case, making larger sets of parts easier to process. However, this approach still assumes that individual parts are manually initialized and cannot update the part set.

More flexible geometrical constraints that allow removing and adding parts during tracking have been presented by Kwon and Lee [12]. A star model connects all the parts to the center of the object. This model is simple enough that individual parts can be removed or added. The authors propose a likelihood function landscape analysis and part proximity to detect bad parts and remove them. New parts are added to the visual model using corner-like stable regions in the estimated object area. We consider this recent work to be the closest to our own research. While this approach provides a good mechanism for gradually adapting the visual model in a controlled manner, the mechanism of introducing new patches is rather nonrobust. The patch initialization fails for objects that lack a textured surface and is not directly constrained to the object. On the other hand, a rapid part removal can lead to false structural changes in the geometrical model and possible tracking failure.

In this paper we propose a coupled-layer visual model that combines the target's global and local appearance (Figure 1). The local layer $L_t$ is a geometrical constellation of visual parts (patches) that describe the target's local visual/geometrical properties. As the target's appearance changes or a part of it gets occluded, some of the patches in the visual model cease to correspond to the target's visible parts. Those are identified and gradually removed from the model. The allocation of the new
patches in the local layer is constrained by the global layer $G_t$ that encodes the target's global visual features. The global layer maintains a probabilistic model of the target's global visual features such as color, shape and apparent motion and is adapted during tracking. This adaptation is in turn constrained by focusing on the stable patches in the local layer.

The main contribution of the paper is the coupled constraint paradigm implemented within our Bayesian formulation of the two-layer model. We also integrate the proposed adaptive visual model within a Bayesian tracker that allows tracking through significant appearance changes. We argue that this robustness is achieved by the coupled-constrained updating of the visual model through the feedback loops between the global and the local layer. The experiments on the challenging sequences with significant appearance changes confirm that our tracker outperforms the state-of-the-art trackers, with a smaller failure rate and greater (statistically significant) accuracy.

The rest of the paper is organized as follows: Section 2 describes the proposed visual model and the resulting tracker. In Section 3 we perform an extensive experimental comparison with the state of the art, and in Section 4 we discuss the method and draw the conclusions.

2. A coupled-layer visual model

During tracking, the proposed coupled-layer visual model is used as follows. Starting from an initial position (predicted by the Kalman filter in our case), the local model's geometrical structure is adapted to maximally explain the visual data, thus locating the target (Section 2.1). A mechanism is used to identify and remove the patches from the local visual model that no longer correspond to the target (Section 2.2). The remaining patches are used to update the visual information of the global layer, and then the global layer is used to allocate new patches in the local layer if necessary (Section 2.3).

2.1. The local layer

The local layer $L_t$ of the target's visual model at
time-step $t$ is described by a geometrical constellation of weighted patches:

$$L_t = \{x_t^{(i)}, w_t^{(i)}\}_{i=1:N_t}, \tag{1}$$

where $x_t^{(i)}$ represents the image coordinates of the $i$-th patch and the weight $w_t^{(i)}$ represents the belief that the target is well represented by the $i$-th patch. The target's center is defined as the weighted average over the patches, i.e., $c_t = \frac{1}{W_t}\sum_{i=1}^{N_t} w_t^{(i)} x_t^{(i)}$, where $W_t = \sum_{i=1}^{N_t} w_t^{(i)}$ is a normalization factor. In the following we will denote the set of all patches at time-step $t$ by $X_t = \{x_t^{(i)}\}_{i=1:N_t}$.

During tracking, we start from an initial estimate $\hat{X}_t$ and the set of current image measurements $Y_t$, and seek the value of $X_t$ that maximizes the joint probability $p(Y_t, X_t \mid \hat{X}_t)$. By treating the local-layer visual model $L_t$ as a mixture model, in which each patch competes to explain the target's appearance, we can decompose the joint distribution into

$$p(Y_t, X_t \mid \hat{X}_t) = \sum_{i=1}^{N_t} p(z^{(i)})\, p(Y_t, X_t \mid \hat{X}_t, z^{(i)}), \tag{2}$$

where $p(z^{(i)})$ is the $i$-th patch's prior and is approximated using the corresponding weight, i.e., $p(z^{(i)}) = w_t^{(i)} / \sum_{j=1}^{N_t} w_t^{(j)}$. In our model, we assume that the position of the $i$-th patch is dependent only on its direct neighbors, and we can write

$$p(Y_t, X_t \mid \hat{X}_t, z^{(i)}) \propto p(Y_t, x_t^{(i)} \mid \varepsilon_t^{(i)}, \hat{\varepsilon}_t^{(i)}, z^{(i)}), \tag{3}$$

where $\varepsilon_t^{(i)}$ and $\hat{\varepsilon}_t^{(i)}$ denote the set of the $i$-th patch's local neighbors' positions in the new and initial constellation, respectively. In our implementation, the local neighbors are the set of patches that are directly connected with the $i$-th patch in a Delaunay triangulated mesh of the entire set of patches. The conditional joint distribution can now be further decomposed in terms of visual and geometrical models as

$$p(Y_t, x_t^{(i)} \mid \varepsilon_t^{(i)}, \hat{\varepsilon}_t^{(i)}, z^{(i)}) = p(Y_t \mid x_t^{(i)})\, p(x_t^{(i)} \mid \varepsilon_t^{(i)}, \hat{\varepsilon}_t^{(i)}), \tag{4}$$

where we have assumed that the measurement at the $i$-th patch is independent of the other patches.

The visual model of the $i$-th patch is encoded by a gray-level histogram $h_{\mathrm{ref}}^{(i)}$, which is extracted when the patch is initialized in the constellation and remains unchanged during tracking. Let $h_t^{(i)}$ be a
histogram extracted at the current location of the patch $x_t^{(i)}$. We define the visual likelihood of the $i$-th patch as

$$p(Y_t \mid x_t^{(i)}) \propto e^{-\lambda_v \rho(h_{\mathrm{ref}}^{(i)}, h_t^{(i)})}, \tag{5}$$

where $\rho(\cdot,\cdot)$ is the Bhattacharyya distance between the histograms [16]. We constrain the local geometry using an elastic deformation model

$$p(x_t^{(i)} \mid \varepsilon_t^{(i)}, \hat{\varepsilon}_t^{(i)}) \propto e^{-\lambda_g \| x_t^{(i)} - A(\varepsilon_t^{(i)}, \hat{\varepsilon}_t^{(i)})\, \hat{x}_t^{(i)} \|}, \tag{6}$$

where $A(\varepsilon_t^{(i)}, \hat{\varepsilon}_t^{(i)})$ is an affine transformation matrix computed from correspondences between the $i$-th patch's initial and current neighborhoods. Note that this geometric model assumes that the deformations of the constellation are locally approximately affine. Therefore, during adaptation of the local layer to the target's current appearance, we seek an approximately affine deformation of an initial set of patches $\hat{X}_t$ that maximizes the joint probability in (2).

We determine the unknown deformation by optimizing (2) for $X_t$ using the standard cross-entropy method [17]. However, due to the high dimensionality of the problem at hand, (2) may contain many local maxima that may cause the method to take a long time to converge. We therefore write our deformation model as a composition of a globally affine deformation $A_t^G$, which is equal for all patches, and of local perturbations $\Delta_t^{(i)}$, which may vary between the patches:

$$x_t^{(i)} = A_t^G \hat{x}_t^{(i)} + \Delta_t^{(i)}. \tag{7}$$

In our implementation we therefore first optimize (2) w.r.t.
the global affine deformation $A_t^G$. After convergence, we fix the value of $A_t^G$ and sequentially optimize the positions of each patch $x_t^{(i)}$.

2.2. Updating the local layer

Recall from (1) that there is a weight $w_t^{(i)}$ associated with each patch that reflects the relevance of the corresponding patch in the mixture of patches. After adapting the set of patches to the target's appearance, as described in the previous section, each patch is analyzed and its weight is increased or decreased by $\Delta w$ by applying the following two consistency rules:

• Visual consistency: If the Bhattacharyya distance between the patch's reference and the current histogram exceeds a threshold $T_{\mathrm{histHi}}$, then its weight is decreased; if the distance falls below a threshold $T_{\mathrm{histLo}}$, the weight is increased.

• Drift from majority: If the median of the distances from the patch to all other patches in the set is greater than a predefined threshold $T_{\mathrm{major}}$, then the patch's weight is decreased.

The weight of a patch can be interpreted as the frequency at which the patch has been selected as belonging to the object (increasing weight) minus the frequency at which the patch was selected as a possible outlier (decreasing weight). When normalized, these weights can be regarded as the probability that a patch belongs to the object. Patches with low probability (lower than $T_R$) are considered as either outdated or mispositioned and are removed from the set.
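The two consistency rules and the removal step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `update_patch_weights` is hypothetical, the weights are clamped to [0, 1] and compared directly against $T_R$ as a stand-in for the normalization (which the text does not specify in detail), and the default thresholds are the values reported in Section 3.

```python
import math
from statistics import median


def update_patch_weights(positions, weights, hist_dists,
                         delta_w=0.1, t_hist_lo=0.4, t_hist_hi=0.8,
                         t_major=40.0, t_remove=0.1):
    """Apply both consistency rules, then drop low-weight patches.

    positions  : list of (x, y) patch coordinates
    weights    : list of current patch weights
    hist_dists : Bhattacharyya distance between each patch's reference
                 histogram and its current histogram
    Returns (kept_positions, kept_weights).
    """
    updated = []
    for i, (pos, w, d) in enumerate(zip(positions, weights, hist_dists)):
        # Rule 1, visual consistency: penalize a poor histogram match,
        # reward a good one, leave the weight alone in between.
        if d > t_hist_hi:
            w -= delta_w
        elif d < t_hist_lo:
            w += delta_w
        # Rule 2, drift from majority: penalize a patch whose median
        # distance to all other patches exceeds the threshold.
        others = [math.dist(pos, q) for j, q in enumerate(positions) if j != i]
        if others and median(others) > t_major:
            w -= delta_w
        # Assumption: clamp to [0, 1] so the weight reads as a probability.
        updated.append(min(1.0, max(0.0, w)))
    # Removal: patches whose weight fell below T_R are discarded.
    kept = [(p, w) for p, w in zip(positions, updated) if w >= t_remove]
    return [p for p, _ in kept], [w for _, w in kept]
```

With these defaults a newly allocated patch (weight $2T_R = 0.2$) survives one penalty but is removed after two, which matches the idea that a patch must fail repeatedly before it is discarded.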
To allow a good coverage of the target in the image, new patches have to be added to the local layer. The patches are allocated by sampling their positions from a probability density function (pdf) that determines locations in the image which are likely to contain the target. This pdf is constructed from the global layer and is described in the next section. The weight $w_t^{(i)}$ of an allocated patch is initialized with a value of twice the threshold for patch removal, i.e., $w_0 = 2T_R$.

The remaining question is how many patches should be allocated. Let $\tilde{N}_t$ denote the number of patches in the local layer after removing the irrelevant patches. We define $N_t^{\mathrm{cap}}$ to be the local layer's capacity, i.e., the maximum number of patches allowed in the local layer at time-step $t$. To allow the number of allocated patches to grow with the target's size, we always try to allocate at most $N_t^{\mathrm{all}} \le N_t^{\mathrm{cap}} - \tilde{N}_t + 1$ new patches. To prevent sudden significant changes in the estimated capacity, we adapt it using the autoregressive scheme

$$N_{t+1}^{\mathrm{cap}} = \alpha_{\mathrm{cap}} N_t^{\mathrm{cap}} + (1 - \alpha_{\mathrm{cap}}) N_t, \tag{8}$$

where $N_t = N_t^{\mathrm{all}} + \tilde{N}_t$ and $\alpha_{\mathrm{cap}}$ is an exponential forgetting factor.

2.3. The global layer

The global layer $G_t$ captures the target's global visual properties, in particular color $C_t$, apparent motion $M_t$, and shape $S_t$,

$$G_t = \{C_t, M_t, S_t\}. \tag{9}$$

When required, this information is used to allocate new patches in the local layer. The allocation is implemented by drawing positions from the following distribution:

$$p(x \mid C_t, M_t, S_t) \propto p(C_t, M_t, S_t \mid x). \tag{10}$$

Assuming that the visual cues are independent given a position $x$, (10) factors as

$$p(x \mid C_t, M_t, S_t) \propto p(C_t \mid x)\, p(M_t \mid x)\, p(S_t \mid x). \tag{11}$$

In the following we describe the models for each of the cues.

The global color model is encoded by two HSV histograms $h_t^F$ and $h_t^B$, the first corresponding to the target and the second to the background. Let $I(x)$ be the pixel value at the position $x$ in the image. Using the histograms, the probability that a pixel corresponds to the foreground or background is $p(x \mid F) = h_t^F(I(x))$ and $p(x \mid B) = h_t^B(I(x))$, respectively. The likelihood that a pixel at the location $x$ belongs to the target is therefore

$$p(C_t \mid x) = \frac{p(x \mid F)\, p(F)}{p(F)\, p(x \mid F) + (1 - p(F))\, p(x \mid B)}. \tag{12}$$

Both histograms are updated during tracking as follows. After the local layer is fitted to the target (Section 2.1), a histogram $\hat{h}_t^F$ is extracted in the current image from the regions that correspond to the patches of the local layer. The background histogram $\hat{h}_t^B$ is extracted from a ring-shaped region defined by the convex hull of the patches in the local layer. These histograms are used to update the global color model by a simple autoregressive scheme

$$h_{t+1}^F = \alpha_F h_t^F + (1 - \alpha_F) \hat{h}_t^F, \qquad h_{t+1}^B = \alpha_B h_t^B + (1 - \alpha_B) \hat{h}_t^B, \tag{13}$$

where $\alpha_F$ and $\alpha_B$ are fixed constants that determine the rate of adaptation.

The apparent motion model is defined by the local motion model from [11]. Briefly, the local motion model [11] first determines salient points $\{x_i\}_{i=1}^{N_s}$ with sufficient texture in the image. It then computes the motion likelihood $p(x_i \mid M_t)$ at each salient point $x_i$ by comparing the local velocity of a pixel $v(x_i)$ (estimated by Lucas-Kanade optical flow [14]) with the global velocity $v_t$ estimated by the tracker. As in [11], the motion likelihood at salient point $x_i$ is defined as

$$p(x_i \mid M_t) \propto (1 - w_{\mathrm{noise}})\, e^{-\lambda_M d(v(x_i), v_t)} + w_{\mathrm{noise}}, \tag{14}$$

where $d(v(x_i), v_t)$ is the distance between two velocities and $w_{\mathrm{noise}}$ is uniform noise. Finally, to obtain a dense estimate, the set of salient points is convolved with a smoothing kernel. We therefore define the motion likelihood as

$$p(M_t \mid x) \propto \frac{1}{K} \sum_{i=1}^{N_s} p(x_i \mid M_t)\, \Phi_\Sigma(x - x_i), \tag{15}$$

where $K$ is a normalization factor, $\Phi_\Sigma(x)$ is a Gaussian kernel with covariance $\Sigma$ and $N_s$ is the number of salient points. The covariance is estimated automatically from the weighted set of salient points using multivariate Kernel Density Estimation [10].

The shape model is a weighted superposition of the past $\Delta t$ approximate object shapes. An approximate object shape at time-step $t$ is defined as an object-centered region $P_t$, which is calculated by a
convex envelope over the patches from the local layer. To maintain the growing capability we dilate each hull by the size of a local patch. We define a function $s(x, P_t) \equiv 1$ if $x \in P_t$ and 0 otherwise, and the shape likelihood model for a pixel at $x$ is thus defined as

$$p(S_t \mid x) \propto \sum_{i=0}^{\Delta t} \alpha_S^{\,i}\, s(x, P_{t-i}), \tag{16}$$

where $\alpha_S$ is a weighting factor which reduces the influence of the older shapes.

As mentioned above, (11) is used for allocating new patches in the local layer. We do not sample (11) directly, but rather discretize it first, by calculating its value for each pixel in the image. This discretized distribution is then used to draw positions for new patches from the potential target region. To make sure that the patches are allocated only in regions whose likelihood of containing the target is high enough, we set to zero those regions of the discretized distribution whose value is smaller than 30% of the maximal value of $p(x \mid C_t, M_t, S_t)$. To avoid duplicating patches in the local layer, the regions of the discretized distribution that correspond to existing patches are set to zero.

2.4. Tracking with the coupled-layer visual model

Recall that the proposed coupled-layer visual model starts from an initial estimate of the target's position and then refines its estimate by adapting to the current image as described in Section 2.1. The center of the target can then be identified as a weighted average $c_t$ of the patches' positions. During tracking we require a prediction of the local layer's patches to initialize the adaptation of the visual model. We also require an estimate of the target's velocity in the global layer's apparent motion model. We therefore apply a Kalman filter [8] with a nearly-constant velocity (NCV) dynamic model [13] to filter the estimates of the target's center $c_t$. Thus, at time-step $t$, the target's velocity $\hat{v}_t$ estimated by the Kalman filter is used to initialize the local layer patches $\hat{X}_t = \{\hat{x}_t^{(i)}\}_{i=1:N_t}$ by predicting the location [...] frame:

$$\hat{x}_t^{(i)} = x_{t-1}^{(i)} + \hat{v}_t. \tag{17}$$

[...] is manually initialized by [...] over the
target. We give no [...] and the set of patches [...] initialized in a grid pattern [...]. The weights of the patches [...] $w_0$. We summarize the relevant [...] 1.

Algorithm 1. The coupled-layer visual tracker.

Initialization:
i. Input: Place a rectangular region over a target.
ii. Distribute patches in a regular grid in the region and assign uniform weights.

Tracking: For time-step t = 1, 2, 3, ...
1. Predict the target's velocity $\hat{v}_t$ using the Kalman filter and initialize the local-layer patches with the NCV model (17).
2. Adapt the local layer patches by maximizing $p(Y_t, X_t \mid \hat{X}_t)$ (Section 2.1), recalculate the target's center $c_t$ and update the Kalman filter estimate.
3. Identify/remove irrelevant patches from the local layer (Section 2.2).
4. To maintain numerical stability (e.g., Delaunay triangulation works better if the input points are not too close to each other) and decrease redundant comparisons, merge patches in the local layer that are too close to each other.
5. Using the remaining patches, update the visual cues of the global layer (Section 2.3).
6. If required, construct a discretized distribution $p(x \mid C_t, M_t, S_t)$ and sample positions of new patches for the local layer.

3. Experimental results

We have analyzed the performance of the proposed local-global tracker (LGT) from Algorithm 1 on several examples of tracking either a nonrigid object or an object that undergoes a significant appearance change. Our tracker has been implemented in Matlab/C and runs at approximately 4 frames per second on an Intel Core 2 Duo 6600.

Figure 2. Samples from the experimental video sequences.

The parameters in our tracker were set as follows. The maximum number of iterations in the cross-entropy method was 10, with 50 samples per iteration. We set $\lambda_v = 0.1$ and $\lambda_g = 0.015$. For the adaptation of the local layer (Section 2.2) the following parameters were used: $\Delta w = 0.1$, $T_{\mathrm{histLo}} = 0.4$, $T_{\mathrm{histHi}} = 0.8$, $T_{\mathrm{major}} = 40$, $T_R = 0.1$ and $\alpha_{\mathrm{cap}} = 0.8$.
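With $\alpha_{\mathrm{cap}} = 0.8$ as listed above, the allocation bound and the capacity update (8) from Section 2.2 can be sketched as follows. This is an illustrative sketch only: the function names `allocation_limit` and `update_capacity` are hypothetical, and flooring the real-valued capacity before computing the bound is an assumption, since the text does not say how a fractional capacity is rounded.

```python
import math


def allocation_limit(n_cap, n_remaining):
    """Upper bound on new patches: N_all <= N_cap - N_remaining + 1.

    n_cap       : current (possibly fractional) capacity estimate
    n_remaining : patches left after removing irrelevant ones
    Assumption: the capacity is floored and the bound is never negative.
    """
    return max(0, math.floor(n_cap) - n_remaining + 1)


def update_capacity(n_cap, n_allocated, n_remaining, alpha_cap=0.8):
    """Autoregressive capacity update, Eq. (8).

    N_t is the patch count after allocation (allocated + remaining);
    alpha_cap close to 1 makes the capacity change slowly.
    """
    n_t = n_allocated + n_remaining
    return alpha_cap * n_cap + (1.0 - alpha_cap) * n_t
```

For example, with a capacity of 20 and 15 remaining patches, up to 6 patches may be allocated; if all 6 are allocated, the capacity grows only slightly (to about 20.2), which illustrates how the "+1" lets the model grow gradually with the target.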
To update the global layer, parameter values $\alpha_F = 0.95$, $\alpha_B = 0.5$, $\lambda_M = 1$, $w_{\mathrm{noise}} = 0.01$, $\Delta t = 7$, and $\alpha_S = 0.7$ were used. We would like to emphasize that all the parameters were kept constant for all the experiments.

We have compared our tracker, i.e., LGT, with five related state-of-the-art reference trackers, which address the problem of object appearance changes: a color-based particle filter [16] (PF), an online boosting tracker [5] (OBT), a flock-of-features tracker [9] (FOF), a piecewise-affine kernel tracker [15] (PAKT) and the basin-hopping Monte Carlo tracker [12] (BHMC). The experiments involved tracking a hand, a human body, and objects with challenging view changes (Figure 2). The basic properties of the experimental sequences are collected in Table 1¹.

Table 1. An overview of the video sequences.

Sequence   Type             Comments        Len.
hand       arti. body part  rapid motion    242
hand2      arti. body part  rapid motion    267
gymnast.   articulated      rapid motion    206
diver      articulated      rotation        214
dinosaur   rigid            elab. struct.   324
torus      rigid            empty center    262

¹ The annotated sequences, as well as a reference implementation of the tracker, are available at research/tracking/.

Figure 3. Results for the hand sequence. Results are shown for trackers FOF (first row), PF (second row) and LGT (last row).

The target was tracked in each sequence $R = 30$ times by each tracker. For comparison, we recorded the number of times each tracker failed and had to be reinitialized. We also recorded the tracked trajectories. The tracking failure was automatically determined by measuring the overlap between the ground-truth region $\Omega_t^{gt}$ and the region estimated by the tracker $\Omega_t$. The overlap was measured as $F(\Omega_t^{gt}, \Omega_t) = |\Omega_t^{gt} \cap \Omega_t| / |\Omega_t^{gt} \cup \Omega_t|$. A failure was proclaimed at time-step $t$ if $F(\Omega_t^{gt}, \Omega_t) < 0.09$. This threshold is based on our observation of the behavior of the estimated region produced by a tracker vs. the ground-truth region. To evaluate the tracker accuracy with respect to the other trackers, we have performed a one-sided standard hypothesis test [2] on the estimated
trajectories.

3.1. Results

Table 2 shows the average failure rates for each tracker. We see that the LGT is indeed superior to the reference trackers as the average failure rate is the lowest for all the sequences. Looking at the number of failures per sequence, we also see that the sequence hand2 was the most difficult to track for all trackers. Visual properties of the hand, such as color, are similar for the entire arm, making trackers that rely heavily on color more vulnerable to drifting. Furthermore, due to homogeneous color, skin contains only a few distinct local regions, which makes it difficult to reliably estimate local motions on the object. The problem of color ambiguity and background clutter was also apparent in the sequence hand in Figure 3, where the PF tracker (second row), which relies only on color information, confused the head for the hand on the third image from the sequence. Because of the difficulty of estimating the local motion from small regions, the FOF tracker (first row), which uses a set of optical flow features for tracking, failed. On the other hand, the LGT tracker succeeds in tracking (third row) since it integrates multiple cues at a global level to handle background clutter and enforces geometrical constraints at a local level to handle local ambiguity.

Table 2. Average number of failures per sequence.

Sequence   PAKT [15]  FOF [9]  PF [16]  BHMC [12]  OBT [5]  LGT
hand       22.6       10.0     4.3      29.9       10.0     0.2
hand2      40. [...] 1.9
gymnast.   3.0        3.7      4.7      9.7        4.0      0.2
diver      2.4        2.2      4.3      3.9        7.0      1.2
dinosaur   8.2 2.7 10.6 6.0 2.5 23.4 13.0 [...]

The sequences gymnastics and diver are the only two sequences that include camera motion (following the target). It is worth noting that the objects do not move much spatially in these sequences, but rather significantly change their appearance. PAKT and BHMC do not explicitly assume the object's translational motion (they do not estimate the object's velocity), but rather assume Brownian-like motion. For this reason their failure rate is somewhat lower for these two sequences in comparison to other sequences. Nevertheless, the LGT outperformed both trackers in these sequences. Figure 4 compares the BHMC tracker (first row) and LGT tracker (second row) on several frames of the gymnastics sequence, in which the target significantly changes its appearance as well as scale. We can see from the estimated bounding boxes that the size of the object is often poorly estimated by the BHMC tracker, which leads to failures (Table 2). On the other hand, the LGT successfully tracks the target through the scale change.

Figure 4. Results for the gymnastics sequence. Results are shown for trackers BHMC (first row) and LGT (second row).

The advantages of the LGT tracker are also evident in the sequences dinosaur and torus for the case of rigid objects with more complex structure that undergo rapid orientation and translation changes with respect to the camera. Even though these kinds of objects are not as deformable as a human body or a hand, the changes in the appearance are still hard to describe without a predefined geometrical model for a specific object. As seen in Figure 5, when tracking a torus, the PAKT and OBT reference trackers drift from the object several times during the sequence, while the LGT tracker successfully accomplishes the task. In the case of the PAKT tracker (first row) the problem lies in its inability to follow fast movements because of the locality of the optimization and the limited adaptation capabilities due to a fixed parts set. The OBT tracker
(second row), on the other hand, fails many times because it focuses on the more visually interesting central region, which, however, belongs to the background. The LGT tracker (third row) does not have these problems and can successfully track the object throughout the sequence.

Table 3. RMS errors with respect to the ground truth. In all cases the LGT produced smaller RMSE and (·)∗ denotes that the difference was statistically significant.

Sequence   PAKT [15]  FOF [9]  PF [16]  BHMC [12]  OBT [5]  LGT
hand       18.5∗      19.7∗    14.4∗    27.4∗      17.5∗    9.1
hand2      18.7∗      17.4∗    16.6∗    26.1∗      22.5∗    10.3
gymnast.   17.1∗      23.1∗    22.8∗    27.6∗      21.3∗    11.3
diver      18.1∗      14.6     16.5∗    21.1∗      17.1∗    13.7
dinosaur   23.6∗      19.2∗    23.3∗    35.1∗      30.3∗    11.5
torus      15.2∗      14.8∗    16.4∗    21.5∗      14.8∗    5.1

Table 3 shows the tracking accuracy in terms of the average RMSE. From the comparison of the RMSEs of the reference trackers and the proposed tracker we can conclude that the proposed tracker, LGT, outperforms all the reference trackers in accuracy at a standard significance level $\alpha = 0.05$ ($L_\alpha = 1.564$), except in one case (tracker FOF on the sequence diver) where the difference is not statistically significant. The better accuracy can be largely attributed to the two-stage optimization of the local layer, which first finds a good globally affine match for the entire set of patches and then fine-tunes the positions of individual patches to better match the target's new appearance. The difference is less significant in the cases of the gymnastics and diver sequences because the targets do not move very much spatially, which makes the drawbacks of some of the related trackers less apparent.

4. Discussion and conclusion

We have proposed a coupled two-layer visual model for efficient tracking of targets that undergo significant appearance changes. The proposed model is a coupled combination of a local and a global layer. The local layer is a set of local patches that geometrically constrain the changes in the target's appearance. The set probabilistically adapts to the target's appearance by maximizing the joint
distribution over the model's geometrical constraints and visual observations. As the target's appearance significantly changes, some of the patches in the visual model cease to correspond to the target's visible parts. Those patches are identified by the local layer and gradually removed from the model. The allocation of new patches in the local layer is constrained by the global layer, which encodes the target's global visual features. The global layer maintains a probabilistic model of the target's global visual features, such as color, shape, and apparent motion, and is adapted during tracking. This adaptation is in turn constrained by focusing on the stable patches in the local layer. We believe that it is exactly this constrained coupled updating between the layers that results in robust tracking.

We have incorporated the proposed visual model in a tracker and compared the tracker to the state of the art on several challenging sequences. The results show that our tracker outperforms the related trackers, with a lower failure rate and greater accuracy. The experiments have shown that the tracker remains reliable even when the background's color is similar to the target's. The reason is that the global layer uses many more features, such as foreground-background similarity, shape, local motion, and temporal proximity from the Kalman filter, to determine which regions in the image potentially contain the target. Therefore new patches are more likely to be initialized on the target. Only after these patches have been validated by the local layer over several frames do they start to play a stronger role in the model. Similarly, the global layer is updated only by using the stable patches from the local layer. These constrained feedbacks between the two layers allow the tracker to track the target through scale and appearance changes, as shown in the experiments. In the same respect, the tracker is ex-