Fine Hand Segmentation using Convolutional Neural Networks
Code for Acceptance of Construction Quality of Energy-Efficient (Green) Public Building Works, DBJ50-234-2016
(7) Article 16.2.10 of this code is based on the provisions of Article 5.3.5 of the national standard Technical Code for Solar Heating System, GB 50495-2009.
(8) Article 3.4.4 of this code specifies the acceptance procedure for the building-environment and integrated-resource-utilization sub-divisional works involved in green building projects.
This code is administered by the Chongqing Urban and Rural Construction Commission; the Chongqing Construction Technology Development Center (Chongqing Building Energy Efficiency Center) and the Chongqing Green Building Technology Promotion Center are responsible for interpreting its specific technical content. During implementation, all organizations are asked to collect data, summarize experience, and send opinions on needed revisions or additions, together with relevant material, to the Chongqing Construction Technology Development Center (7th Floor, 69 Shangqingsi Road, Niujiaotuo, Yuzhong District, Chongqing; postcode 400015; tel. 023-63601374; fax 023-63861277) for reference in future revisions.
Ministry of Construction filing number: J13144-2015
DB
Chongqing Engineering Construction Standard DBJ50-234-2016: Energy-Efficient (Green) Public Building Works
Code for Acceptance of Construction Quality
Code for acceptance of energy efficient public building (green building) construction
(3) Articles 1.0.4, 3.1.2, 11.2.4, 22.0.6 and 22.0.7 of this code follow, respectively, the mandatory provisions of Articles 1.0.5, 3.1.2, 11.2.3, 15.0.5 and 15.0.5 of the national standard Code for Acceptance of Energy Efficient Building Construction, GB 50411-2007.
English Vocabulary for Image Processing (Chinese-English glossary)
FFT 滤波器FFT filtersVGA 调色板和许多其他参数VGA palette and many others 按名称排序sort by name包括角度和刻度including angle and scale保持目标keep targets保存save保存和装载save and load饱和度saturation饱和加法和减法add and subtract with saturate背景淡化background flatten背景发现find background边缘和条纹测量Edge and Stripe/Measurement边缘和条纹的提取find edge and stripe编辑Edit编辑edit编辑或删除相关区域edit or delete relative region编码Code编码条Coda Bar变换forward or reverse fast Fourier transformation变量和自定义的行为variables and custom actions变量检测examine variables变形warping变形系数warping coefficients标题tile标注和影响区域label and zone of influence标准normal标准偏差standard deviation表面弯曲convex并入图像merge to image采集栏digitizer bar采集类型grab type菜单形式menu item参数Preferences参数轴和角度reference axis and angle测量measurement测量方法提取extract measurements from测量结果显示和统计display measurement results and statistics测量转换transfer to measurement插入Insert插入条件检查Insert condition checks查找最大值find extreme maximum长度length超过50 个不同特征的计算calculate over 50 different features area 撤销次数number of undo levels乘multiply尺寸size处理Processing处理/采集图像到一个新的窗口processed/grabbed image into new window 窗口window窗口监视watch window窗位window leveling创建create垂直边沿vertical edge从表格新建new from grid从工具条按钮from toolbar button从用户窗口融合merge from user form粗糙roughness错误纠正error correction错误匹配fit error打开open打开近期的文件或脚本open recent file or script打印print打印设置print setup打印预览print preview大小和日期size and date带通band pass带有调色板的8- bit带有动态预览的直方图和x, y 线曲线椭圆轮廓histogram and x, y line curve ellipse profiles with dynamic preview带阻band reject代码类型code type单步single step单一simple单帧采集snap shot导入VB等等etc.低通low pass第一帧first点point调色板预览palette viewer调试方式debug mode调用外部的DLL调整大小resize调整轮廓滤波器的平滑度和轮廓的最小域值adjust smoothness of contour filter and minimum threshold for contours定点除fixed point divide定位精度positional accuracy定义一个包含有不相关的不一致的或无特征区域的模板define model including mask for irrelevant inconsistent or featureless areas定制制定-配置菜单Customize - configure menus动态预览with dynamic preview读出或产生一个条形或矩阵码read or generate bar and matrix codes读取和查验特征字符串erify character strings断点break points对比度contrast对比度拉伸contrast stretch对称symmetry对模板应用“不关心的”像素标注apply don't care pixel mask to model 多边形polygon二进制binary二进制分离separate binary二值和灰度binary and grayscale翻转reverse返回return放大或缩小7 个级别zoom in or out 7 levels分类结果sort results分水岭Watershed分析Analysis分组视图view components浮点float腐蚀erode复合视图view composite复合输入combined with input复制duplicate复制duplicateselect all傅立叶变换Fourier transform改变热点值change hotspot values感兴趣区域ROI高级几何学Advanced geometry高通high pass格式栏formatbar更改默认的搜索参数modify default search parameters 工具Utilities工具栏toolbar工具属性tool properties工具条toolbar工作区workspace bar共享轮廓shared contours构件build构造表格construct grid关闭close和/或and/or和逆FFT画图工具drawing tools缓存buffer换算convert灰度grayscale恢复目标restore targets回放playback绘图连结connect map获得/装载标注make/load mask获取选定粒子draw selected blobs或从一个相关区域创建一个ROI or create an ROI from a relative region基线score基于校准映射的畸变校正distortion correction based on calibration mapping 极性polarity极坐标转换polar coordinate transformation几何学Geometry记录record加粗thick加法add间隔spacing间距distance兼容compatible简洁compactness剪切cut减法subtract减小缩进outdent交互式的定义字体参数包括搜索限制ine font parameters including search constraints 脚本栏script bar角度angle角度和缩放范围angle and scale range接收和确定域值acceptance and certainty thresholds结果栏result bar解开目标unlock targets精确度和时间间隔accuracy and timeout interval矩形rectangle矩形rectangular绝对差分absolute difference绝对值absolute value均匀uniform均值average拷贝copy拷贝序列copy sequence可接收的域值acceptance threshold克隆clone控制control控制controls快捷健shortcut key宽度breadth宽度width拉普拉斯Laplacians拉伸elongation蓝blue类型type粒子blob粒子标注label blobs粒子分离segment blobs粒子内的孔数目number of holes in a blob 亮度brightness亮度luminance另存为save 
as滤波器filters绿green轮廓profile overlay轮廓极性contour polarity逻辑运算logical operations面积area模板编辑edit model模板覆盖model coverage模板和目标覆盖model and target coverage 模板索引model index模板探测器Model Finder模板位置和角度model position and angle 模板中心model center模糊mask模块import VB module模块modules模式匹配Pattern matching默认案例default cases目标Targets目标分离separate objects目标评价target score欧拉数Euler number盆basins膨胀dilate匹配率match scores匹配数目number of matches平方和sum of the squares平滑smooth平均average平均averaged平均值mean平移translation前景色foreground color清除缓冲区为一个恒量clear buffer to a constant清除特定部分delete special区域增长region-growing ROI取反negate全部删除delete all缺省填充和相连粒子分离fill holes and separate touching blobs任意指定位置的中心矩和二阶矩central and ordinary moments of any order location: X, Y 锐化sharpen三维视图view 3D色度hue删除delete删除帧delete frame设置settings设置相机类型enable digitizer camera type设置要点set main示例demos事件发现数量number of occurrences事件数目number of occurrences视图View收藏collectionDICOM手动manually手绘曲线freehand输出选项output options输出选择结果export selected results输入通道input channel属性页properties page数据矩阵DataMatrix数字化设置Digitizer settings双缓存double buffer双域值two-level水平边沿horizontal edge搜索find搜索和其他应用Windows Finder and other applications 搜索角度search angle搜索结果search results搜索区域search area搜索区域search region搜索速度search speed速度speed算法arithmetic缩放scaling缩放和偏移scale and offset锁定目标lock destination锁定实时图像处理效果预览lock live preview of processing effects on images 锁定预览Lock preview锁定源lock source特定角度at specific angle特定匹配操作hit or miss梯度rank替换replace添加噪声add noise条带直径ferret diameter停止stop停止采集halt grab同步synchronize同步通道sync channel统计Statistics图像Image图像大小image size图像拷贝copy image图像属性image properties图形graph退出exit椭圆ellipse椭圆ellipses外形shape伪彩pseudo-color位置position文本查看view as text文件File文件MIL MFO font file文件load and save as MIL MMF files文件load and save models as MIL MMO files OCR文件中的函数make calls to functions in external DLL files文件转换器file converterActiveMIL Builder ActiveMIL Builder 无符号抽取部分Extract band -细化thin下一帧next显示表现字体的灰度级ayscale representations of fonts显示代码show code线line线lines相对起点relative origin像素总数sum of all pixels向前或向后移动Move to front or back向上或向下up or down校准Calibration校准calibrate新的/感兴趣区域粘贴paste into New/ROI新建new信息/ 图形层DICOM information/overlay形态morphology行为actions修改modify修改路径modify paths修改搜索参数modify default search parameters 序列采集sequence旋转rotation旋转模板rotate model选择select选择selector循环loops移动move移动shift应用过滤器和分类器apply filters and classifiers 影响区域zone of influence映射mapping用户定义user defined用基于变化上的控制实时预览分水岭转化结果阻止过分切割live preview of resulting watershed transformations with control over variation to prevent over segmentation用某个值填充fill with value优化和编辑调色板palette optimization/editor有条件的conditional域值threshold预处理模板优化搜索速度循环全部扫描preprocess model to optimize search speed circular over-scan预览previous元件数目和开始(自动或手动)number of cells and threshold auto or manual元件最小/最大尺寸cell size min/max源source允许的匹配错误率和加权fit error and weight运行run在目标中匹配数目number of modelmatches in target暂停pause增大缩进indent整数除integer divide正FFT正常连续continuous normal支持象征学supported symbologies: BC 412直方图均衡histogram equalization执行execute执行外部程序和自动完成VBA only execute external programs and perform Automation VBA only指定specify指数exponential Rayleigh中值median重复repeat重建reconstruct重建和修改字体restore and modify fonts重新操作redo重心center of gravity周长perimeter注释annotations转换Convert转换convert装载load装载和保存模板为MIL MMO装载和另存为MIL MFO装载和另存为MIL MMF状态栏status bar资源管理器拖放图像drag-and-drop images from Windows ExplorerWindows自动或手动automatic or manual自动或手动模板创建automatic or manual model creation字符产大小string size字符串string字体font最大maximum最大化maximum最大数maxima最后一帧last frame最小minimum最小化minimum最小间隔标准minimum separation 
criteria最小数minima坐标盒的范围bounding box coordinates图像数据操作Image data manipulation内存分配与释放allocation release图像复制copying设定和转换setting and conversion图像/视频的输入输出Image and video I/O支持文件或摄像头的输入file and camera based input图像/视频文件的输出image/video file output矩阵/向量数据操作及线性代数运算Matrix and vector manipulation and linear algebra routines 矩阵乘积products矩阵方程求解solvers特征值eigenvalues奇异值分解SVD支持多种动态数据结构Various dynamic data structures 链表lists队列queues数据集sets树trees图graphs基本图像处理Basic image processing去噪filtering边缘检测edge detection角点检测corner detection采样与插值sampling and interpolation色彩变换color conversion形态学处理morphological operations直方图histograms图像金字塔结构image pyramids结构分析Structural analysis连通域/分支connected components轮廓处理contour processing距离转换distance transform图像矩various moments模板匹配template matching霍夫变换Hough transform多项式逼近polygonal approximation曲线拟合line fitting椭圆拟合ellipse fitting狄劳尼三角化Delaunay triangulation摄像头定标Camera calibration寻找和跟踪定标模式finding and tracking calibration patterns 参数定标calibration,基本矩阵估计fundamental matrix estimation单应矩阵估计homography estimation立体视觉匹配stereo correspondence)运动分析Motion analysis光流optical flow动作分割motion segmentation目标跟踪tracking目标识别Object recognition特征方法eigen-methodsHMM模型HMM基本的GUI Basic GUI显示图像/视频display image/video键盘/鼠标操作keyboard and mouse handling滑动条scroll-bars图像标注Image labeling直线line曲线conic多边形polygon、文本标注text drawing梯度方向gradient directions系数coefficient空间频率spatial frequencies串级过滤cascade filtering卷积运算convolution operation有限差分近似the finite difference approximation 对数刻度logarithmic scale仿射参数affine parameters斑点Blob差距disparityAlgebraic operation 代数运算;一种图像处理运算,包括两幅图像对应像素的和、差、积、商。
3D SEGMENTATION AND LABELING USING A SELF-ORGANIZING KOHONEN NETWORK FOR VOLUMETRIC MEASUREMENT
Fig. 1. System overview for 3D brain segmentation: filtering, 3D segmentation, majority filter, 3D connected component labeling.
B. Volume Segmentation
Scale plays an important role in image understanding. An image, being a physical observable of a scene, represents the scene on a finite range of scales only. The aim of image understanding is to infer scene structures, which calls for a multi-scale description of image structure. It has been shown [6] that the only family satisfying the natural front-end vision constraints of linearity, shift-, rotation-, and scale invariance is the Gaussian family and all its partial derivatives. This operator family provides a complete representation of image structure. For the classification of these patterns, we apply a novel self-organizing network, as shown in Fig. 2. The network adapts itself such that it
(This work was supported in part by grants from the Whitaker Foundation and the NSF (CDA-9422094).)
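The post-processing stages named in Fig. 1 can be sketched with standard tools. The snippet below is a minimal illustration, not the paper's implementation: a majority filter applied to a voxel label map, followed by 26-connected 3D component labeling with SciPy, from which per-component volumes could then be measured.

```python
import numpy as np
from scipy import ndimage

def majority_filter_3d(seg, size=3):
    """Replace each voxel label by the most frequent label in its
    size^3 neighborhood (simple majority filter on a label volume)."""
    labels = np.unique(seg)
    # Local frequency of each label; the label with the highest count wins.
    counts = np.stack([
        ndimage.uniform_filter((seg == l).astype(np.float32), size=size)
        for l in labels
    ])
    return labels[np.argmax(counts, axis=0)]

def connected_components_3d(mask):
    """Label 26-connected components of a binary 3D mask."""
    structure = np.ones((3, 3, 3), dtype=bool)  # 26-connectivity
    return ndimage.label(mask, structure=structure)

# Toy volume with two separate blobs of class 1.
seg = np.zeros((32, 32, 32), dtype=np.int32)
seg[2:8, 2:8, 2:8] = 1
seg[20:26, 20:26, 20:26] = 1
seg = majority_filter_3d(seg)
comp, n = connected_components_3d(seg == 1)
print(n)  # -> 2 components, whose voxel counts give their volumes
```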
Adaptive-Threshold Image Denoising Based on the Nonsubsampled Contourlet Transform
Wei Jincheng; Wu Changdong; Jiang Hua
Abstract: The contourlet transform is a true two-dimensional image representation with directionality and anisotropy that can represent images sparsely. However, it is not shift-invariant, so pseudo-Gibbs artifacts appear when it is used for denoising. To overcome this shortcoming, the nonsubsampled contourlet transform is built on top of the contourlet transform. The image is first decomposed with the nonsubsampled contourlet transform, the coefficients are denoised with an adaptive threshold, and the inverse nonsubsampled contourlet transform yields the denoised image. Experiments show that the nonsubsampled contourlet method removes noise effectively while preserving texture detail, raises the signal-to-noise ratio, gives good visual quality, and outperforms traditional wavelet and contourlet denoising.
Journal: Science Technology and Engineering, 2013, 13(29), 4 pages (P8662-8665).
Keywords: nonsubsampled contourlet transform; adaptive threshold; image denoising.
Authors: Wei Jincheng; Wu Changdong (School of Electrical and Information Engineering, Xihua University, Chengdu 610039); Jiang Hua (Department of Computer and Communication Engineering, Emei Campus, Southwest Jiaotong University, Emei 614202).
Language: Chinese. CLC: TP391.41.
In image processing, images must be denoised so that they can be analyzed more effectively.
In recent years, the wavelet transform (WT) has achieved good results in noise suppression.
However, WT works well only for one-dimensional piecewise-smooth signals; for two-dimensional images its directions are limited, it cannot fully exploit the geometric features of the image, and it is unsatisfactory at preserving edges and details.
The contourlet transform (CT), by contrast, is a true two-dimensional image representation with directionality and anisotropy that represents 2-D images more sparsely; but because CT is built from a Laplacian pyramid decomposition and directional filter banks, it involves downsampling, lacks shift invariance, and introduces pseudo-Gibbs artifacts during denoising [1, 2].
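The pipeline the abstract describes (shift-invariant decomposition, adaptive soft threshold on the detail coefficients, inverse transform) can be illustrated with an undecimated wavelet transform as a stand-in, since off-the-shelf NSCT implementations are uncommon. The sketch below uses PyWavelets' stationary wavelet transform and a BayesShrink-style per-subband threshold; it is a schematic substitute for the authors' NSCT method, not their code.

```python
import numpy as np
import pywt

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def denoise_swt2(img, wavelet="db4", level=3):
    """Undecimated 2-D wavelet denoising with a per-subband adaptive
    (BayesShrink-like) soft threshold; shift invariance avoids the
    pseudo-Gibbs artifacts of decimated transforms.
    Image sides must be divisible by 2**level for swt2."""
    coeffs = pywt.swt2(img.astype(np.float64), wavelet, level=level)
    # Noise std estimated from the finest diagonal subband (robust MAD).
    sigma = np.median(np.abs(coeffs[-1][1][2])) / 0.6745
    out = []
    for cA, (cH, cV, cD) in coeffs:
        bands = []
        for c in (cH, cV, cD):
            # BayesShrink: threshold = sigma^2 / std of the clean signal.
            sig_x = np.sqrt(max(np.var(c) - sigma**2, 1e-12))
            bands.append(soft(c, sigma**2 / sig_x))
        out.append((cA, tuple(bands)))
    return pywt.iswt2(out, wavelet)
```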
A Fast and Accurate Plane Detection Algorithm for Large Noisy Point Clouds Using Filtered Normals
A Fast and Accurate Plane Detection Algorithm for Large Noisy Point Clouds Using Filtered Normals and Voxel Growing
Jean-Emmanuel Deschaud, François Goulette
Mines ParisTech, CAOR - Centre de Robotique, Mathématiques et Systèmes, 60 Boulevard Saint-Michel, 75272 Paris Cedex 06
jean-emmanuel.deschaud@mines-paristech.fr, francois.goulette@mines-paristech.fr

Abstract
With the improvement of 3D scanners, we produce point clouds with more and more points, often exceeding millions of points. We then need a fast and accurate plane detection algorithm to reduce data size. In this article, we present a fast and accurate algorithm to detect planes in unorganized point clouds using filtered normals and voxel growing. Our work is based on a first step in estimating better normals at the data points, even in the presence of noise. In a second step, we compute a score of local plane in each point. Then, we select the best local seed plane and, in a third step, start a fast and robust region growing by voxels that we call voxel growing. We have evaluated and tested our algorithm on different kinds of point clouds and compared its performance to other algorithms.

1. Introduction
With the growing availability of 3D scanners, we are now able to produce large datasets with millions of points. It is necessary to reduce data size, to decrease the noise and at the same time to increase the quality of the model. It is interesting to model planar regions of these point clouds by planes. In fact, plane detection is generally a first step of segmentation, but it can be used for many applications. It is useful in computer graphics to model the environment with basic geometry. It is used, for example, in modeling to detect building facades before classification. Robots do Simultaneous Localization and Mapping (SLAM) by detecting planes of the environment. In our laboratory, we wanted to detect small and large building planes in point clouds of urban environments with millions of points for modeling. As mentioned in [6], the accuracy of the plane detection is important for later steps of the modeling pipeline. We also want to be fast, to be able to process point clouds with millions of points. We present a novel algorithm based on region growing with improvements in normal estimation and growing process. Our method is generic and works on different kinds of data, like point clouds from fixed scanners or from Mobile Mapping Systems (MMS). We also aim at detecting building facades in urban point clouds, or small planes like doors, even in very large data sets. Our input is an unorganized noisy point cloud and, with only three "intuitive" parameters, we generate a set of connected components of planar regions. We evaluate our method as well as explain and analyse the significance of each parameter.
2. Previous Works
Although there are many methods of segmentation in range images, like in [10] or in [3], three have been thoroughly studied for 3D point clouds: region growing, the Hough transform from [14], and Random Sample Consensus (RANSAC) from [9].
The application of recognising structures in urban laser point clouds is frequent in the literature. Bauer in [4] and Boulaassal in [5] detect facades in dense 3D point clouds by a RANSAC algorithm. Vosselman in [23] reviews surface growing and 3D Hough transform techniques to detect geometric shapes. Tarsha-Kurdi in [22] detects roof planes in 3D building point clouds by comparing results of the Hough transform and the RANSAC algorithm; they found that RANSAC is more efficient than the first one. Chao Chen in [6] and Yu in [25] present algorithms of segmentation in range images for the same application of detecting planar regions in an urban scene. The method in [6] is based on a region growing algorithm in range images and merges results in one labelled 3D point cloud. [25] uses a method different from the three we have cited: they extract a hierarchical subdivision of the input image built like a graph where leaf nodes represent planar regions.
There are also other methods, like Bayesian techniques. In [16] and [8], they obtain smoothed surfaces from noisy point clouds, with objects modeled by probability distributions, and it seems possible to extend this idea to point cloud segmentation. But techniques based on Bayesian statistics need to optimize a global statistical model, and it is then difficult to process point clouds larger than one million points.
We present below an analysis of the two main methods used in the literature: RANSAC and region growing. The Hough transform algorithm is too time consuming for our application. To compare the complexity of the algorithms, we take a point cloud of size N with only one plane P of size n. We suppose that we want to detect this plane P, and we define n_min, the minimum size of the planes we want to detect. The size of a plane is the area of the plane. If the data density is uniform in the point cloud, then the size of a plane can be specified by its number of points.

2.1. RANSAC
RANSAC is an algorithm initially developed by Fischler and Bolles in [9] that allows the fitting of models without trying all possibilities. RANSAC is based on the probability to detect a model using the minimal set required to estimate the model. To detect a plane with RANSAC, we choose 3 random points (enough to estimate a plane) and compute the plane parameters with these 3 points. Then a score function is used to determine how good the model is for the remaining points. Usually, the score is the number of points belonging to the plane: with noise, a point belongs to a plane if the distance from the point to the plane is less than a parameter γ. In the end, we keep the plane with the best score. The probability of getting the plane in the first trial is p = (n/N)³. Therefore the probability to get it in T trials is p = 1 - (1 - (n/N)³)^T. Using equation (1) and supposing n_min/N << 1, we know the number T_min of minimal trials to have a probability p_t of getting planes of size at least n_min:

    T_min = log(1 - p_t) / log(1 - (n_min/N)³) ≈ log(1/(1 - p_t)) · (N/n_min)³.    (1)

For each trial, we test all data points to compute the score of a plane. The RANSAC algorithm complexity therefore lies in O(N (N/n_min)³) when n_min/N << 1, and T_min → 0 when n_min → N. RANSAC is thus very efficient at detecting large planes in noisy point clouds, i.e. when the ratio n_min/N is close to 1, but very slow at detecting small planes in large point clouds, i.e. when n_min/N << 1.
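A minimal sketch of this basic RANSAC plane search, including the trial count of equation (1), written in NumPy under the paper's notation (N points, minimum plane size n_min, inlier distance γ, success probability p_t); this is an illustration of the scheme analyzed above, not the authors' code.

```python
import numpy as np

def num_trials(N, n_min, p_t=0.99):
    # Equation (1): T_min ≈ log(1/(1-p_t)) * (N/n_min)^3
    return int(np.ceil(np.log(1.0 / (1.0 - p_t)) * (N / n_min) ** 3))

def ransac_plane(pts, gamma, n_min, p_t=0.99, rng=np.random.default_rng(0)):
    """Return (normal, d, inlier mask) of the best plane found."""
    best, best_inliers = None, None
    for _ in range(num_trials(len(pts), n_min, p_t)):
        p0, p1, p2 = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:               # degenerate (collinear) sample
            continue
        n /= norm
        d = -n @ p0
        inliers = np.abs(pts @ n + d) < gamma   # score = points near the plane
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best, best_inliers = (n, d), inliers
    return best[0], best[1], best_inliers
```

The cubic growth of `num_trials` in N/n_min is exactly the complexity problem discussed above: the loop count explodes when small planes must be found in large clouds.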
After selecting the best model, another step is to extract the largest connected component of each plane. Connected components mean that the minimum distance between each point of the plane and other points is smaller than a fixed parameter.
Schnabel et al. [20] bring two optimizations to RANSAC: the point selection is done locally and the score function has been improved. An octree is first created from the point cloud. Points used to estimate plane parameters are chosen locally at a random depth of the octree. The score function is also different from RANSAC: instead of testing all points for one model, they test only a random subset and find the score by interpolation. The algorithm complexity lies in O(N r 4^d N / n_min), where r is the number of random subsets for the score function and d is the maximum octree depth. Their algorithm improves the plane detection speed, but its complexity lies in O(N²) and it becomes slow on large data sets. And again we have to extract the largest connected component of each plane.

2.2. Region Growing
Region growing algorithms work well in range images, like in [18]. The principle of region growing is to start with a seed region and to grow it by neighborhood when the neighbors satisfy some conditions. In range images, we have the neighbors of each point with pixel coordinates. In case of unorganized 3D data, there is no information about the neighborhood in the data structure. The most common method to compute neighbors in 3D is to compute a Kd-tree to search the k nearest neighbors. The creation of a Kd-tree lies in O(N log N) and the search of the k nearest neighbors of one point lies in O(log N). The advantage of these region growing methods is that they are fast when there are many planes to extract, robust to noise, and extract the largest connected component immediately. But they only use the distance from point to plane to extract planes and, as we will see later, that is not accurate enough to detect correct planar regions.
Rabbani et al. [19] developed a method of smooth area detection that can be used for plane detection. They first estimate the normal of each point like in [13]. The point with the minimum residual starts the region growing. They test the k nearest neighbors of the last point added: if the angle between the normal of the point and the current normal of the plane is smaller than a parameter α, then they add this point to the smooth region. With a Kd-tree for the k nearest neighbors, the algorithm complexity is in O(N + n log N).
The complexity seems to be low but, in the worst case, when n/N ≈ 1, for example for facade detection in point clouds, the complexity becomes O(N log N).

3. Voxel Growing
3.1. Overview
In this article, we present a new algorithm adapted to large data sets of unorganized 3D points and optimized to be accurate and fast. Our plane detection method works in three steps. In the first part, we compute a better estimation of the normal in each point by a filtered weighted plane fitting. In a second step, we compute the score of local planarity in each point. We select the best seed point that represents a good seed plane and, in the third part, we grow this seed plane by adding all points close to the plane. The growing step is based on a voxel growing algorithm. The filtered normals, the score function and the voxel growing are innovative contributions of our method.
As an input, we need dense point clouds related to the level of detail we want to detect. As an output, we produce connected components of planes in the point cloud. This notion of connected components is linked to the data density. With our method, the connected components of planes detected are linked to the parameter d of the voxel grid.
Our method has 3 "intuitive" parameters: d, area_min and γ. "Intuitive" because they are linked to physical measurements. d is the voxel size used in voxel growing and also represents the connectivity of points in detected planes. γ is the maximum distance between a point of a plane and the plane model; it represents the plane thickness and is linked to the point cloud noise. area_min represents the minimum area of planes we want to keep.

3.2. Details
3.2.1 Local Density of Point Clouds
In a first step, we compute the local density of point clouds like in [17]. For that, we find the radius r_i of the sphere containing the k nearest neighbors of point i. Then we calculate ρ_i = k / (π r_i²). In our experiments, we find that k = 50 is a good number of neighbors. It is important to know the local density because many laser point clouds are made with a fixed resolution angle scanner and are therefore not evenly distributed. We use the local density in section 3.2.3 for the score calculation.

3.2.2 Filtered Normal Estimation
Normal estimation is an important part of our algorithm. The paper [7] presents and compares three normal estimation methods. They conclude that the weighted plane fitting, or WPF, is the fastest and the most accurate for large point clouds. WPF is an idea of Pauly et al. in [17]: the fitting plane of a point p must take into consideration the nearby points more than other, distant ones. The normal least square is explained in [21] and is the minimum of Σ_{i=1..k} (n_p · p_i + d)². The WPF is the minimum of Σ_{i=1..k} ω_i (n_p · p_i + d)², where ω_i = θ(‖p_i − p‖) and θ(r) = e^(−2r²/r_i²). For solving n_p, we compute the eigenvector corresponding to the smallest eigenvalue of the weighted covariance matrix C_w = Σ_{i=1..k} ω_i (p_i − b_w)(p_i − b_w)^T, where b_w is the weighted barycenter. For the three methods explained in [7], we get a good approximation of normals in smooth areas, but we have errors in sharp corners. In figure 1, we have tested the weighted normal estimation on two planes with uniform noise and forming an angle of 90°. We can see that the normal is not correct on the corners of the planes and in the red circle.
To improve the normal calculation, which improves the plane detection especially on borders of planes, we propose a filtering process in two phases. In a first step, we compute the weighted normals (WPF) of each point as described above by minimizing Σ_{i=1..k} ω_i (n_p · p_i + d)². In a second step, we compute the filtered normal by using an adaptive local neighborhood. We compute the new weighted normal with the same sum minimization, but keeping only points of the neighborhood whose normals from the first step satisfy |n_p · n_i| > cos(α). With this filtering step, we have the same results in smooth areas and better results in sharp corners. We call our normal estimation filtered weighted plane fitting (FWPF).

Figure 1. Weighted normal estimation of two planes with uniform noise and with a 90° angle between them.

We have tested our normal estimation by computing normals on synthetic data with two planes and different angles between them and with different values of the parameter α. We can see in figure 2 the mean error on normal estimation for WPF and FWPF with α = 20°, 30°, 40° and 90°. Using α = 90° is the same as not doing the filtering step. We see in figure 2 that α = 20° gives a smaller error in normal estimation when the angle between planes is smaller than 60°, and α = 30° gives best results when the angle between planes is greater than 60°. We have considered the value α = 30° as the best because it gives the smallest mean error in normal estimation when the angle between planes varies from 20° to 90°. Figure 3 shows the normals of the planes with a 90° angle and better results in the red circle (normals are at 90° with the plane).

3.2.3 The score of local planarity
In many region growing algorithms, the criterion used for the score of the local fitting plane is the residual, like in [18] or [19], i.e. the sum of the squares of distances from points to the plane. We have a different score function to estimate local planarity. For that, we first compute the neighbors N_i of a point p with points i whose normals n_i are close to the normal n_p. More precisely, we compute N_i = {p in k neighbors of i / |n_i · n_p| > cos(α)}. It is a way to keep only the points which are probably on the local plane before the least square fitting. Then, we compute the local plane fitting of point p with the N_i neighbors by least squares like in [21]. The set N'_i is the subset of N_i of points belonging to the plane, i.e. the points for which the distance to the local plane is smaller than the parameter γ (to consider the noise). The score s of the local plane is the area of the local plane, i.e. the number of points "in" the plane divided by the local density ρ_i (seen in section 3.2.1): the score s = card(N'_i)/ρ_i. We take into consideration the area of the local plane as the score function, and not the number of points or the residual, in order to be more robust to the sampling distribution.

Figure 2. Comparison of mean error in normal estimation of two planes with α = 20°, 30°, 40° and 90° (= no filtering).
Figure 3. Filtered weighted normal estimation of two planes with uniform noise and with a 90° angle between them (α = 30°).

3.2.4 Voxel decomposition
We use a data structure that is the core of our region growing method. It is a voxel grid that speeds up the plane detection process. Voxels are small cubes of length d that partition the point cloud space. Every point of data belongs to a voxel, and a voxel contains a list of points. We use the Octree Class Template in [2] to compute an octree of the point cloud. The leaf nodes of the graph built are voxels of size d. Once the voxel grid has been computed, we start the plane detection algorithm.

3.2.5 Voxel Growing
With the estimator of local planarity, we take the point p with the best score, i.e. the point with the maximum area of local plane. We have the model parameters of this best seed plane, and we start with an empty set E of points belonging to the plane. The initial point p is in a voxel v_0. All the points in the initial voxel v_0 for which the distance from the seed plane is less than γ are added to the set E. Then, we compute new plane parameters by least square refitting with the set E. Instead of growing with k nearest neighbors, we grow with voxels. Hence we test points in the 26 neighbor voxels. This is a way to search the neighborhood in constant time instead of O(log N) for each neighbor like with a Kd-tree. In a neighbor voxel, we add to E the points for which the distance to the current plane is smaller than γ and the angle between the normal computed in each point and the normal of the plane is smaller than a parameter α: |cos(n_p, n_P)| > cos(α), where n_p is the normal of the point p and n_P is the normal of the plane P. We have tested different values of α, and we empirically found that 30° is a good value for all point clouds. If we added at least one point to E for this voxel, we compute new plane parameters from E by least square fitting and we test its 26 neighbor voxels. It is important to perform plane least square fitting in each voxel addition because the seed plane model is not good enough with noise to be used in all voxel growing, but only in surrounding voxels. This growing process is faster than classical region growing because we do not compute least squares for each point added but only for each voxel added.
The least square fitting step must be computed very fast. We use the same method as explained in [18], with incremental update of the barycenter b and covariance matrix C as in equation (2). We know from [21] that the barycenter b belongs to the least square plane and that the normal of the least square plane n_P is the eigenvector of the smallest eigenvalue of C.

    b_0 = 0_(3×1),  C_0 = 0_(3×3),
    b_(n+1) = (1/(n+1)) (n b_n + p_(n+1)),
    C_(n+1) = C_n + (n/(n+1)) (p_(n+1) − b_n)(p_(n+1) − b_n)^T,    (2)

where C_n is the covariance matrix of a set of n points, b_n is the barycenter vector of a set of n points, and p_(n+1) is the (n+1)-th point vector added to the set.
This voxel growing method leads to a connected component set E, because the points have been added through connected voxels. In our case, the minimum distance between one point and E is less than the parameter d of our voxel grid. That is why the parameter d also represents the connectivity of points in detected planes.
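Equation (2) is simple to implement. The following NumPy sketch (an illustration, not the authors' code) maintains the running barycenter and covariance while points are added during voxel growing, and recovers the plane normal as the eigenvector of the smallest eigenvalue:

```python
import numpy as np

class IncrementalPlaneFit:
    """Least-squares plane maintained under point insertion (equation 2)."""
    def __init__(self):
        self.n = 0
        self.b = np.zeros(3)          # barycenter b_n
        self.C = np.zeros((3, 3))     # covariance-like matrix C_n

    def add(self, p):
        d = p - self.b
        # C_{n+1} = C_n + n/(n+1) (p - b_n)(p - b_n)^T
        self.C += (self.n / (self.n + 1.0)) * np.outer(d, d)
        self.b = (self.n * self.b + p) / (self.n + 1.0)
        self.n += 1

    def plane(self):
        """Normal = eigenvector of the smallest eigenvalue of C;
        the barycenter lies on the least-squares plane."""
        w, V = np.linalg.eigh(self.C)   # eigenvalues in ascending order
        return V[:, 0], self.b

# Sanity check on random points near the z = 0 plane.
rng = np.random.default_rng(1)
pts = rng.normal(size=(500, 3)); pts[:, 2] *= 0.01
fit = IncrementalPlaneFit()
for p in pts:
    fit.add(p)
n, b = fit.plane()
print(abs(n[2]))   # close to 1: normal close to the z axis
```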
3.2.6 Plane Detection
To get all planes with an area of at least area_min in the point cloud, we repeat these steps (best local seed plane choice and voxel growing) with all points, by descending order of their score. Once we have a set E whose area is bigger than area_min, we keep it and classify all points in E.

4. Results and Discussion
4.1. Benchmark analysis
To test the improvements of our method, we have employed the comparative framework of [12] based on range images. For that, we have converted all images into 3D point clouds. All point clouds created have 260k points. After our segmentation, we project labelled points on a segmented image and compare with the ground truth image. We have chosen our three parameters d, area_min and γ by optimizing the results of the 10 perceptron training image segmentations (the perceptron is a portable scanner that produces a range image of its environment). Best results have been obtained with area_min = 200, γ = 5 and d = 8 (units are not provided in the benchmark). We show the results of the 30 perceptron image segmentations in table 1.
GT Regions is the mean number of ground truth planes over the 30 ground truth range images. Correct detection, over-segmentation, under-segmentation, missed and noise are the mean numbers of correct, over-segmented, under-segmented, missed and noise planes detected by the methods. The 80% tolerance is the minimum percentage of points we must have detected, compared to the ground truth, to count a correct detection. More details are in [12].
UE is a method from [12]; UFPR is a method from [10]. It is important to notice that UE and UFPR are range image methods, and our method is not well suited for range images but for 3D point clouds. Nevertheless, it is a good benchmark for comparison, and we see in table 1 that the accuracy of our method is very close to the state of the art in range image segmentation.
To evaluate the different improvements of our algorithm, we have tested different variants of our method: without normals (only with distance from points to plane), without voxel growing (with a classical region growing by k neighbors), without our FWPF normal estimation (with WPF normal estimation), and without our score function (with the residual score function). The comparison is visible in table 2. We can see the difference in computing time between region growing and voxel growing. We have tested our algorithm with and without normals and found that the accuracy cannot be achieved without normal computation. There is also a big difference in the correct detection between WPF and our FWPF normal estimation, as we can see in figure 4. Our FWPF normal brings a real improvement in border estimation of planes. Black points in the figure are non-classified points.

Figure 5. Correct detection of our segmentation algorithm when the voxel size d changes.

We would like to discuss the influence of parameters on our algorithm. We have three parameters: area_min, which represents the minimum area of the plane we want to keep; γ, which represents the thickness of the plane (it is generally closely tied to the noise in the point cloud, and especially the standard deviation σ of the noise); and d, which is the minimum distance from a point to the rest of the plane. These three parameters depend on the point cloud features and the desired segmentation. For example, if we have a lot of noise, we must choose a high γ value. If we want to detect only large planes, we set a large area_min value. We also focus our analysis on the robustness of the voxel size d in our algorithm, i.e. the ratio of points vs voxels. We can see in figure 5 the variation of the correct detection when we change the value of d. The method seems to be robust when d is between 4 and 10, but the quality decreases when d is over 10. It is due to the fact that for a large voxel size d, some planes from different objects are merged into one plane.

Table 1. Average results of different segmenters at 80% compare tolerance.
| Method     | GT Regions | Correct detection | Over-seg. | Under-seg. | Missed | Noise | Duration (s) |
| UE         | 14.6       | 10.0              | 0.2       | 0.3        | 3.8    | 2.1   | -            |
| UFPR       | 14.6       | 11.0              | 0.3       | 0.1        | 3.0    | 2.5   | -            |
| Our method | 14.6       | 10.9              | 0.2       | 0.1        | 3.3    | 0.7   | 308          |

Table 2. Average results of variants of our segmenter at 80% compare tolerance.
| Our method                 | GT Regions | Correct detection | Over-seg. | Under-seg. | Missed | Noise | Duration (s) |
| without normals            | 14.6       | 5.67              | 0.1       | 0.1        | 9.4    | 6.5   | 70           |
| without voxel growing      | 14.6       | 10.7              | 0.2       | 0.1        | 3.4    | 0.8   | 605          |
| without FWPF               | 14.6       | 9.3               | 0.2       | 0.1        | 5.0    | 1.9   | 195          |
| without our score function | 14.6       | 10.3              | 0.2       | 0.1        | 3.9    | 1.2   | 308          |
| with all improvements      | 14.6       | 10.9              | 0.2       | 0.1        | 3.3    | 0.7   | 308          |

4.1.1 Large scale data
We have tested our method on different kinds of data. We have segmented urban data in figure 6 from our Mobile Mapping System (MMS) described in [11]. The mobile system generates 10k pts/s with a density of 50 pts/m² and very noisy data (σ = 0.3 m). For this point cloud, we want to detect building facades. We have chosen area_min = 10 m², d = 1 m to have large connected components, and γ = 0.3 m to cope with the noise.
We have tested our method on a point cloud from the Trimble VX scanner in figure 7. It is a point cloud of size 40k points with only 20 pts/m², with less noise because it is a fixed scanner (σ = 0.2 m). In that case, we also wanted to detect building facades and kept the same parameters except γ = 0.2 m, because we had less noise. We see in figure 7 that we have detected two facades. By setting a larger voxel size d value like d = 10 m, we detect only one plane. We choose d like area_min and γ according to the desired segmentation and to the level of detail we want to extract from the point cloud.
We also tested our algorithm on the point cloud from the LEICA Cyrax scanner in figure 8. This point cloud has been taken from the AIM@SHAPE repository [1]. It is a very dense point cloud from multiple fixed positions of the scanner, with about 400 pts/m² and very little noise (σ = 0.02 m). In this case, we wanted to detect all the little planes to model the church in planar regions. That is why we have chosen d = 0.2 m, area_min = 1 m² and γ = 0.02 m.
In figures 6, 7 and 8, we have, on the left, the input point cloud and, on the right, we only keep points detected in a plane (planes are in random colors). The red points in these figures are seed plane points. We can see in these figures that planes are very well detected even with high noise.
Table 3 shows the information on the point clouds and the results, with the number of planes detected and the duration of the algorithm. The time includes the computation of the FWPF normals of the point cloud. We can see in table 3 that our algorithm performs linearly in time with respect to the number of points. The choice of parameters has little influence on computing time. The computation time is about one millisecond per point whatever the size of the point cloud (we used a PC with a QuadCore Q9300 and 2 GB of RAM). The algorithm has been implemented using only one thread and in-core processing. Our goal is to compare the improvement of plane detection between classical region growing and our region growing with better normals for more accurate planes and voxel growing for faster detection. Our method seems to be compatible with out-of-core implementations like those described in [24] or in [15].

Table 3. Results on different data.
|                  | MMS Street | VX Street | Church   |
| Size (points)    | 398k       | 42k       | 7.6M     |
| Mean density     | 50 pts/m²  | 20 pts/m² | 400 pts/m² |
| Number of planes | 20         | 2         | 142      |
| Total duration   | 452 s      | 33 s      | 6900 s   |
| Time/point       | 1 ms       | 1 ms      | 1 ms     |

5. Conclusion
In this article, we have proposed a new method of plane detection that is fast and accurate even in presence of noise. We demonstrate its efficiency with different kinds of data and its speed on large data sets with millions of points. Our voxel growing method has a complexity of O(N), is able to detect large and small planes in very large data sets, and can extract them directly as connected components.

Figure 4. Ground truth; our segmentation without and with filtered normals.
Figure 6. Plane detection in a street point cloud generated by MMS (d = 1 m, area_min = 10 m², γ = 0.3 m).

References
[1] Aim@shape repository.
[2] Octree class template. /code/octree.html.
[3] A. Bab-Hadiashar and N. Gheissari. Range image segmentation using surface selection criterion. 2006. IEEE Transactions on Image Processing.
[4] J. Bauer, K. Karner, K. Schindler, A. Klaus, and C. Zach. Segmentation of building models from dense 3D point-clouds. 2003. Workshop of the Austrian Association for Pattern Recognition.
[5] H. Boulaassal, T. Landes, P. Grussenmeyer, and F. Tarsha-Kurdi. Automatic segmentation of building facades using terrestrial laser data. 2007. ISPRS Workshop on Laser Scanning.
[6] C. C. Chen and I. Stamos. Range image segmentation for modeling and object detection in urban scenes. 2007. 3DIM 2007.
[7] T. K. Dey, G. Li, and J. Sun. Normal estimation for point clouds: a comparison study for a Voronoi based method. 2005. Eurographics Symposium on Point-Based Graphics.
[8] J. R. Diebel, S. Thrun, and M. Brunig. A Bayesian method for probable surface reconstruction and decimation. 2006. ACM Transactions on Graphics (TOG).
[9] M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM.
[10] P. F. U. Gotardo, O. R. P. Bellon, and L. Silva. Range image segmentation by surface extraction using an improved robust estimator. 2003. Proceedings of Computer Vision and Pattern Recognition.
[11] F. Goulette, F. Nashashibi, I. Abuhadrous, S. Ammoun, and C. Laurgeau. An integrated on-board laser range sensing system for on-the-way city and road modelling. 2007. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences.
[12] A. Hoover, G. Jean-Baptiste, et al. An experimental comparison of range image segmentation algorithms. 1996. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[13] H. Hoppe, T. DeRose, T. Duchamp, J. McDonald, and W. Stuetzle. Surface reconstruction from unorganized points. 1992. International Conference on Computer Graphics and Interactive Techniques.
[14] P. Hough. Method and means for recognizing complex patterns. 1962. US Patent.
[15] M. Isenburg, P. Lindstrom, S. Gumhold, and J. Snoeyink. Large mesh simplification using processing sequences. 2003.
Translation
Ultrasonic Characterization of Porous Silicon Using a Genetic Algorithm to Solve the Inverse Problem
Keywords: genetic algorithm, inverse problem solving, ultrasonic non-destructive testing, porous silicon
Abstract: This article describes a method for the ultrasonic characterization of porous silicon in which a genetic algorithm is chosen as the optimization method for solving the inverse problem. The transmission spectrum is computed with a one-dimensional model describing the propagation of acoustic waves through a sample immersed in water. The transmitted signals are then measured in water immersion using the insertion or substitution method, and the spectrum of the sample is computed by fast Fourier transform. To obtain parameters such as thickness, longitudinal viscosity or density, a genetic-algorithm-based optimization is used. The method is validated on control experiments with two aluminum plates of different thicknesses; even when the ultrasonic signals overlap, the recovered acoustic parameters agree very well. Finally, two samples are evaluated: crystalline silicon, and porous silicon formed on a silicon surface. The measured values agree well with theory, and a few assumptions are needed to explain some small discrepancies.

1. Introduction
Analyzing the ultrasonic waves transmitted through a sample allows the acoustic parameters, and hence the mechanical parameters of the sample, to be extracted. In most cases the signals do not overlap. Time-domain and frequency-domain analysis can then be used to determine parameters such as wave velocity, attenuation or density. In some cases, the acoustic transmission coefficient in the frequency domain is used to compute these parameters. When the sample thickness is comparable to the wavelength, or when overlap occurs in multilayer samples, direct measurement of the parameters is impossible. Nevertheless, ultrasonic non-destructive characterization of materials has been widely studied for the case of thin layers. Because of the complexity of the received signal, a model must be built, and the required parameters can be obtained by solving the inverse problem. However, most optimization methods require initial values to be assumed; when the material parameters vary widely, it is difficult to set the initial values precisely enough to reach the correct solution. In this study, a genetic algorithm is proposed to limit the influence of the initial values. Indeed, this optimization method converges to the global solution and ensures its uniqueness. A one-dimensional wave propagation model is chosen to compute the spectrum of the signal transmitted through a multilayer sample. This spectrum depends on the geometric and acoustic properties of each layer, namely thickness, wave velocity and density. For validation purposes, the theoretical transmission spectrum of an immersed aluminum plate was computed and compared with experimental values to extract the acoustic parameters of the sample. Then, a sample containing a porous silicon layer formed by etching the surface was studied. The wave velocity and density of silicon are known; the porous silicon layer is assumed to be homogeneous, and its parameters are estimated by solving the inverse problem with a genetic algorithm.
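A minimal genetic algorithm of the kind described, fitting layer parameters (for example thickness, velocity, density) so that a modeled transmission spectrum matches a measured one. `model_spectrum` stands in for the one-dimensional propagation model, and all names, bounds and hyperparameters here are illustrative assumptions, not values from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(params, freqs, measured, model_spectrum):
    # Negative mismatch between modeled and measured transmission spectra.
    return -np.mean((model_spectrum(params, freqs) - measured) ** 2)

def genetic_inversion(freqs, measured, model_spectrum, bounds,
                      pop=60, gens=200, mut=0.1):
    lo, hi = np.array(bounds, dtype=float).T        # per-parameter bounds
    P = rng.uniform(lo, hi, size=(pop, len(lo)))    # random initial population
    for _ in range(gens):
        f = np.array([fitness(p, freqs, measured, model_spectrum) for p in P])
        elite = P[np.argsort(f)[::-1][: pop // 2]]  # selection: keep best half
        # Blend crossover between random elite pairs, plus Gaussian mutation.
        pairs = rng.integers(0, len(elite), size=(pop - len(elite), 2))
        a = rng.uniform(size=(len(pairs), 1))
        children = a * elite[pairs[:, 0]] + (1 - a) * elite[pairs[:, 1]]
        children += rng.normal(scale=mut * (hi - lo), size=children.shape)
        P = np.clip(np.vstack([elite, children]), lo, hi)
    f = np.array([fitness(p, freqs, measured, model_spectrum) for p in P])
    return P[np.argmax(f)]    # best (thickness, velocity, density, ...)
```

Because the population explores the whole bounded search space, no precise initial guess is needed, which is the property the article relies on.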
An Image Watermarking Algorithm Based on the Nonsubsampled Contourlet Transform
Xiong Shunqing; Zhou Weihong
Abstract: Exploiting the multiscale, multidirectional and shift-invariant properties of the nonsubsampled contourlet transform (NSCT), this paper proposes a digital watermarking algorithm combining NSCT with SVD. The image is first decomposed with the NSCT to obtain the low-frequency subband; an SVD is applied to the subband coefficients, and the watermark is embedded into the singular values. Experiments show that the algorithm resists attacks such as rotation, JPEG compression and noise well; compared with DWT-based and SVD-based algorithms, and with DWT+SVD and Contourlet+SVD algorithms under the same framework, the robustness of NSCT+SVD is markedly better.
Journal: Journal of Guangxi Normal University (Natural Science Edition), 2011, 29(2), 5 pages (P195-199).
Keywords: digital watermarking; nonsubsampled contourlet transform; singular value decomposition; robustness.
Authors: Xiong Shunqing; Zhou Weihong (School of Mathematics and Computer Science, Yunnan Minzu University, Kunming 650031, China).
Language: Chinese. CLC: TP391.4.
0 Introduction
Digital watermarking embeds copyright information into the original digital media product to protect copyright.
Digital watermarking algorithms fall into two broad classes, spatial-domain and transform-domain, the latter being more robust. Common transform-domain algorithms use the DFT, DCT and DWT; the DCT is the core of the JPEG compression standard and the DWT the core of JPEG 2000, yet all of these have difficulty resisting geometric attacks.
The contourlet transform proposed by Minh N. Do and Martin Vetterli [1-2] not only has the multiresolution and time-frequency locality of the wavelet transform, but also multidirectionality and anisotropy, so it captures image geometry well; wavelets capture only horizontal, vertical and diagonal information, so contourlet-based watermarking resists common geometric attacks better than wavelet-based watermarking.
Because the Laplacian pyramid decomposition and the directional filter banks involve downsampling, the contourlet transform is not shift-invariant; Do and Cunha [3-4] proposed the nonsubsampled contourlet transform to remedy this shortcoming.
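The SVD embedding step itself is transform-agnostic and easy to sketch. Below, `low` stands for the low-frequency subband produced by the NSCT (or any substitute transform); the additive rule on the singular values and the strength `alpha` are one common variant, assumed for illustration rather than taken from the paper:

```python
import numpy as np

def embed_svd(low, watermark, alpha=0.05):
    """Embed a watermark into the singular values of a low-frequency
    subband: S' = S + alpha * W (W flattened/repeated to match S)."""
    U, S, Vt = np.linalg.svd(low, full_matrices=False)
    w = np.resize(watermark.ravel().astype(float), S.shape)
    S_marked = S + alpha * w
    return U @ np.diag(S_marked) @ Vt, S   # keep original S for extraction

def extract_svd(low_marked, S_orig, alpha=0.05, shape=(32, 32)):
    """Non-blind extraction: recover W = (S' - S) / alpha."""
    _, S_m, _ = np.linalg.svd(low_marked, full_matrices=False)
    return np.resize((S_m - S_orig) / alpha, shape)
```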
A Hybrid Segmentation Method Based on Fuzzy Connectedness and Voronoi Diagrams
Yang Anrong, Lin Caixing, Li Hongqiang
(CIMS and Robotics Center, Shanghai University, Shanghai 200072, China)
Abstract: This paper introduces a hybrid segmentation method for medical image processing that integrates fuzzy connectedness and Voronoi-diagram-based classification algorithms. First, the fuzzy connectedness algorithm is applied to …
Because medical images are inherently fuzzy, variable and physiologically correlated, no universal method capable of solving all image segmentation problems has appeared so far. In complex segmentation tasks, a single segmentation algorithm …
… satisfying results in actual application.
Keywords: fuzzy connectedness; Voronoi diagram classification; hybrid segmentation method; volume rendering
This paper presents a hybrid segmentation method which integrates fuzzy connectedness and Voronoi diagram classification …
Research on Visual Inspection Algorithms for Defects in Textured Objects (Graduate Thesis)
Abstract
In highly competitive industrial automation, machine vision plays a pivotal role in product quality control, and its application to defect inspection has become increasingly common. Compared with conventional inspection techniques, automated visual inspection systems are more economical, faster, more efficient and safer. Textured objects are ubiquitous in industrial production: substrates used for semiconductor assembly and packaging, light-emitting diodes, printed circuit boards in modern electronic systems, and cloth and fabrics in the textile industry can all be regarded as objects with textured features. This thesis focuses on defect inspection techniques for textured objects, providing efficient and reliable algorithms for their automated inspection.
Texture is an important feature for describing image content, and texture analysis has been applied successfully to texture segmentation and classification. This work proposes a defect inspection algorithm based on texture analysis and reference comparison. The algorithm tolerates image registration errors caused by object distortion and is robust to the influence of texture. It aims to provide rich and physically meaningful descriptions of the detected defect regions, such as their size, shape, brightness contrast and spatial distribution. Moreover, when a reference image is available, the algorithm can inspect both homogeneously and non-homogeneously textured objects, and it also performs well on non-textured objects.
Throughout the inspection process we adopt steerable-pyramid texture analysis and reconstruction. Unlike traditional wavelet texture analysis, we add tolerance control in the wavelet domain to handle object distortion and texture influence, achieving robustness to both. Finally, steerable-pyramid reconstruction guarantees that the physical meaning of defect regions is recovered accurately. In the experimental stage we tested a series of images of practical value; the results show that the proposed defect inspection algorithm for textured objects is efficient and easy to implement.
Keywords: defect detection; texture; object distortion; steerable pyramid; reconstruction
Keywords: defect detection, texture, object distortion, steerable pyramid, reconstruction
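The reference-comparison idea can be sketched with a simple multiscale decomposition. The sketch below uses OpenCV Laplacian pyramids as a stand-in for the steerable pyramid of the thesis, flags defects where test and reference band-pass energies disagree beyond a robust tolerance, and all thresholds are illustrative assumptions:

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels=4):
    pyr, cur = [], img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(cur)
        up = cv2.pyrUp(down, dstsize=(cur.shape[1], cur.shape[0]))
        pyr.append(cur - up)      # band-pass residual at this scale
        cur = down
    pyr.append(cur)               # low-pass residual
    return pyr

def defect_map(test, ref, levels=4, k=3.0):
    """Mark pixels whose band-pass responses deviate from the reference
    by more than k robust standard deviations, accumulated over scales."""
    mask = np.zeros(test.shape[:2], np.float32)
    for bt, br in zip(laplacian_pyramid(test, levels)[:-1],
                      laplacian_pyramid(ref, levels)[:-1]):
        d = np.abs(bt - br)
        t = k * 1.4826 * np.median(np.abs(d - np.median(d)))  # robust sigma
        hit = (d > max(t, 1e-6)).astype(np.float32)
        mask = np.maximum(mask, cv2.resize(hit, (mask.shape[1], mask.shape[0])))
    return mask.astype(np.uint8)
```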
Image Processing Feature-Invariant Operators, Part 3: the SUSAN Operator
Image processing feature-invariant operator series, the SUSAN operator (Part 3). Author: Feiyun Houxiang. Published: September 13, 2014. Category: image feature operators. Earlier posts covered the Moravec operator (Part 1) and the Harris operator (Part 2).
Today we introduce another feature detection operator: the SUSAN operator.
SUSAN is a pleasant-sounding name, and besides the name the operator is genuinely practical and easy to use. SUSAN stands for Smallest Univalue Segment Assimilating Nucleus; the Chinese translations of this term in the literature vary widely and all feel forced, so we will simply call it SUSAN.
SUSAN is an efficient edge and corner detector that also offers structure-preserving noise reduction.
So what makes SUSAN such a powerful tool? It performs edge detection and corner detection, and provides structure-preserving denoising as well. Let me walk you through it.
1) Principle of the SUSAN operator
For the purposes of the discussion, consider the following figure: a dark area on a white background. A circular mask is moved across the image; if the gray value of a pixel inside the mask differs from that of the mask center (called the nucleus) by less than a given threshold, the pixel is considered to have the same gray value as the nucleus. The region formed by all such pixels is called the USAN (Univalue Segment Assimilating Nucleus).
Next, we analyze the USAN values of the five circular masks in the figure.
Mask e lies entirely in the white background, so by the definition above its USAN value is maximal. As masks c and d move toward the dark region, the USAN value gradually decreases. When mask b is centered on the edge line, the USAN value falls to half the maximum, and when mask a reaches the corner, the USAN value is smallest.
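A minimal NumPy rendering of the USAN computation just described. The smooth exponential similarity function and the geometric threshold g (three quarters of the maximal USAN area for corners) follow the usual SUSAN formulation; the parameter values are illustrative:

```python
import numpy as np

def susan_response(img, radius=3, t=27.0):
    """USAN area per pixel with a circular mask; small USAN => corner."""
    img = img.astype(np.float32)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    offsets = [(dy, dx) for dy, dx in zip(ys.ravel(), xs.ravel())
               if dy * dy + dx * dx <= radius * radius]
    usan = np.zeros_like(img)
    for dy, dx in offsets:
        shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
        # Smooth similarity to the nucleus (SUSAN's exponential form).
        usan += np.exp(-((shifted - img) / t) ** 6)
    g = 0.75 * len(offsets)                    # geometric threshold
    resp = np.where(usan < g, g - usan, 0.0)   # large response = corner
    return usan, resp
```

Thresholding `usan` near half the mask area gives edges, and local maxima of `resp` give corners, matching the five-mask analysis above.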
A Fingerprint Image Watermarking Algorithm in the Contourlet Transform Domain Based on Parity Quantization
Xie Jing; Wu Yiquan
Journal: Journal of Computer Applications
Year (volume), issue: 2007, 27(6)
Abstract: A fingerprint image watermarking algorithm in the contourlet transform domain based on parity quantization is proposed. After the contourlet transform, the original fingerprint image is decomposed into a series of multiscale, localized, directional subband images, and the watermark, encrypted by two-dimensional Arnold scrambling, is embedded in the low-frequency subband. Embedding modifies the low-frequency coefficients with a parity quantization rule; extraction requires no original image, i.e. it is blind. Experiments show that the algorithm resists JPEG lossy compression, additive noise, cropping and other attacks well, has good invisibility and robustness, and improves the reliability of fingerprint recognition.
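Parity (odd-even) quantization itself is a small operation: each selected coefficient is quantized to a multiple of a step Δ whose parity encodes one watermark bit, and extraction simply reads the parity back, needing no original image. The sketch below is a generic illustration with an assumed step size, not the paper's exact implementation:

```python
import numpy as np

def embed_parity(coeffs, bits, delta=8.0):
    """Quantize each coefficient to an even/odd multiple of delta
    according to the watermark bit (blind-detectable embedding)."""
    c = coeffs.ravel().copy()
    for i, bit in enumerate(bits):
        q = np.round(c[i] / delta)
        if int(q) % 2 != int(bit):          # force parity to match the bit
            q += 1 if c[i] >= q * delta else -1
        c[i] = q * delta
    return c.reshape(coeffs.shape)

def extract_parity(coeffs, n_bits, delta=8.0):
    q = np.round(coeffs.ravel()[:n_bits] / delta).astype(int)
    return (q % 2).astype(np.uint8)         # the parity is the hidden bit
```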
Pages: 3 (P1365-1367)
Authors: Xie Jing; Wu Yiquan
Affiliation: College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China
Language: Chinese
CLC: TP393.08
Related literature:
1. A color image digital watermarking algorithm in the contourlet transform domain [J], He Bing
2. A quantization watermarking algorithm in the contourlet domain based on neighborhood-mean relations [J], Gong Qu; Zhang Jian
3. A zero-watermarking algorithm based on the contourlet transform and DCT quantization [J], Zhao Jie
4. A blind watermarking algorithm for color images based on parity quantization [J], Li Juan; Wang Lijun
Research on Non-Local Means Denoising (Thesis, Pattern Recognition and Intelligent Systems)
Abstract
Research on Non-Local Means Denoising. This thesis addresses image denoising, a problem that has long troubled researchers in image processing, and studies the non-local geometric idea and its application to denoising. The main contents include: implementation of the Non-Local Means algorithm; comparison of its method noise with that of other denoising algorithms; rational selection of neighborhoods; and a Non-Local denoising algorithm based on the generalized Gaussian model.
Image denoising is a long-standing problem, and the quality of denoising directly affects the whole image processing pipeline. In natural images it is extremely difficult to distinguish fine detail from unknown noise. The basic idea of denoising is averaging, so the key is to smooth the image while preserving details, that is, the high-frequency content. A. Buades et al. introduced the concept of method noise, which changed the viewpoint on the denoising problem.
Based on method noise, this thesis derives the method-noise formulas for the Gaussian filter, the anisotropic filter, the Wiener filter, wavelet thresholding and Non-Local Means, and demonstrates the superiority of Non-Local Means experimentally. To reduce the redundancy of the Non-Local algorithm, neighborhoods are screened appropriately: pixels with high similarity are kept and pixels with small weights are discarded, which speeds up computation while preserving denoising quality. Neighborhood average gray value and gradient are both good selection criteria. The Non-Local idea is also combined with the generalized Gaussian model in the wavelet domain, applying the Non-Local Means algorithm to each subband of the decomposed image. Experiments confirm that this algorithm achieves good denoising results.
Keywords: non-local means, method noise, generalized Gaussian model, neighborhood similarity, wavelet thresholding, image denoising

A RESEARCH ON IMAGE DENOISING BY NON-LOCAL MEANS
ABSTRACT
This thesis mainly discusses image denoising, which has disturbed researchers for quite a long period. It researches the Non-Local algorithm and its application in image denoising, emphasizing the following parts: implementation of Non-Local Means, comparison of method noise among Non-Local Means and other filters, rational selection of neighborhoods, and a Non-Local Means algorithm based on the Generalized Gaussian Distribution.
Noise reduction affects the whole of image processing. It is extremely difficult to distinguish unknown noise from details and structures in natural images. The basic idea of denoising is averaging, so the key point is how to smooth while preserving details or high-frequency parts. A. Buades et al. brought forward the concept of method noise, which changed the viewpoint of the problem. Based on the above, the contributions of this work mainly focus on the following aspects:
1. Formulae of method noise for the Gaussian smoothing filter, the anisotropic filter, the Wiener filter, translation-invariant wavelet thresholding, and the Non-Local Means algorithm are deduced. The experimental results show that Non-Local Means is better than any of the mentioned filters.
2. To accelerate the Non-Local Means algorithm, filters that eliminate unrelated neighborhoods from the weighted average are introduced. These filters are based on local average gray values and gradients, pre-classifying neighborhoods, thereby reducing the original quadratic complexity to a linear one and reducing the influence of less-related areas in the denoising of a given pixel.
3. A denoising technique based on the Generalized Gaussian Distribution is addressed. The wavelet coefficients of a noised image in each subband are modeled by a GGD whose parameters are estimated using an appropriate technique. The estimated parameters are used to define a generalized Non-Local mean which allows us to restore the original image. This algorithm reduces the computational cost since the processed images are smaller.
KEY WORDS: Non-Local means, Method Noise, General Gaussian Distribution, Neighborhood Similarity, Wavelet Thresholding, Image Denoising
Shanghai Jiao Tong University Declaration of Originality: I solemnly declare that this dissertation is the product of my own independent research carried out under my supervisor's guidance.
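A compact version of non-local means with the thesis's neighborhood pre-selection: candidate patches whose mean gray value differs too much from the target patch are skipped before weighting. Patch and search sizes and the selection tolerance are illustrative; the double loop is didactic, and a practical implementation would be vectorized:

```python
import numpy as np

def nlm(img, f=3, s=10, h=10.0, mean_tol=10.0):
    """Non-local means with mean-gray-value pre-selection of neighborhoods.
    f: patch half-size, s: search half-window, h: filtering parameter."""
    img = img.astype(np.float32)
    pad = np.pad(img, f + s, mode="reflect")
    out = np.zeros_like(img)
    H, W = img.shape
    for y in range(H):
        for x in range(W):
            yc, xc = y + f + s, x + f + s
            P = pad[yc - f:yc + f + 1, xc - f:xc + f + 1]
            wsum, acc = 0.0, 0.0
            for dy in range(-s, s + 1):
                for dx in range(-s, s + 1):
                    Q = pad[yc + dy - f:yc + dy + f + 1,
                            xc + dx - f:xc + dx + f + 1]
                    if abs(P.mean() - Q.mean()) > mean_tol:
                        continue                  # pre-selection step
                    w = np.exp(-np.mean((P - Q) ** 2) / h**2)
                    wsum += w
                    acc += w * pad[yc + dy, xc + dx]
            out[y, x] = acc / wsum if wsum > 0 else img[y, x]
    return out
```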
Research on Detecting Fine Fractures in CT Images with a Global-Attention Multi-Task Network Method
Study on the detection of CT image based on multi-task network method of global attention for fine fracture / Li Ruirui¹, Yang Xiaoguang², Sun Shihao¹, Ji Shangwei³. ¹Beijing FuTong Technology Co. Ltd, Beijing 100020, China; ²Retirement Office, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China; ³Department of Trauma Orthopedics, Beijing Jishuitan Hospital, Beijing 100035, China. Corresponding author: [
[Abstract] Objective: To improve the perception of computed tomography (CT) images in detecting fine fractures through a multi-task network with global attention, to realize case-level detection of fine fracture targets through multi-tasking, and to identify and locate fractures quickly and accurately from large numbers of CT images, so as to assist doctors in timely treatment. Methods: A grouped Non-local network method was introduced to compute the long-range dependencies between positions of consecutive CT sections and between channels. A single-stage detector, the multi-object detection model 3D RetinaNet, was integrated with the medical image semantic segmentation architecture 3D U-Net, yielding an end-to-end multi-task 3D convolutional network that realizes case-level detection of fine fractures through multi-task collaboration. 600 CT scans from the RibFrac rib fracture dataset of the MICCAI 2020 challenge were selected and divided 5:1 into a training set (500 cases) and a test set (100 cases) to test the precision of the multi-task 3D convolutional network. Results: The detection precision of the multi-task 3D convolutional network was better than that of the single-task FracNet, 3D RetinaNet and 3D Retina U-Net, with average precision 7.8% and 11.4% higher than 3D RetinaNet and 3D Retina U-Net, respectively. It was also better than the two single-task detection methods 3D Faster R-CNN and 3D Mask R-CNN, with average precision 6.7% and 3.1% higher, respectively. Conclusion: Integrating the different modules of the global-attention multi-task network improves the detection performance for fine fractures, and introducing the grouped Non-local network method further improves the detection precision for fine fracture targets.
[Key words] 3D CNN; Global attention; Multi-task network; Non-local; Computed tomography (CT) image; RibFrac Dataset
Fund program: National Key R&D Programmes of China "Research and Development of Basic Scientific Research Conditions and Major Scientific Instrument Equipment" (2021YFF0704100)
[Abstract, Chinese original] Objective: To improve the perception of fine-fracture detection in CT images through a global-attention multi-task network, realize instance-level detection of fine fracture targets through multi-tasking, and quickly and accurately identify and locate fractures from large numbers of CT images to assist timely clinical treatment.
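A sketch of a grouped non-local block of the kind the paper names: the channels are split into groups, and embedded-Gaussian attention over all spatial positions is computed per group, reducing the cost of the full non-local operation. This is a 2D single-image adaptation written for illustration; the 3D multi-slice version used in the paper works analogously, and all layer sizes here are assumptions:

```python
import torch
import torch.nn as nn

class GroupedNonLocal2d(nn.Module):
    """Embedded-Gaussian non-local block applied per channel group."""
    def __init__(self, channels, groups=4):
        super().__init__()
        assert channels % groups == 0
        self.g = groups
        c = channels // groups
        self.theta = nn.Conv2d(c, c // 2, 1)   # query embedding
        self.phi = nn.Conv2d(c, c // 2, 1)     # key embedding
        self.gconv = nn.Conv2d(c, c // 2, 1)   # value embedding
        self.out = nn.Conv2d(c // 2, c, 1)

    def forward(self, x):
        b, C, h, w = x.shape
        outs = []
        for xg in torch.chunk(x, self.g, dim=1):        # per-group attention
            q = self.theta(xg).flatten(2).transpose(1, 2)   # B,HW,C'
            k = self.phi(xg).flatten(2)                     # B,C',HW
            v = self.gconv(xg).flatten(2).transpose(1, 2)   # B,HW,C'
            attn = torch.softmax(q @ k, dim=-1)             # B,HW,HW
            y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
            outs.append(xg + self.out(y))                   # residual
        return torch.cat(outs, dim=1)

x = torch.randn(1, 32, 24, 24)
print(GroupedNonLocal2d(32)(x).shape)   # torch.Size([1, 32, 24, 24])
```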
An Adaptive Watermarking Method Based on the Nonsubsampled Contourlet Transform
This article introduces an adaptive watermarking method based on the nonsubsampled contourlet transform. The method first applies the nonsubsampled contourlet transform to the original image, then selects suitable watermark embedding positions according to the energy distribution of the transform coefficients. During embedding, the image is partitioned with multiscale blocks to improve the robustness and invisibility of the watermark. During extraction, the watermark is recovered with a correlation algorithm, using the nonsubsampled contourlet coefficients of the original image and the embedding positions. Experiments show that the method has good robustness and invisibility and can resist common attacks.
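The correlation-based extraction mentioned above reduces to comparing the marked coefficients against the original ones and the watermark pattern. A minimal non-blind detector computing normalized correlation is sketched below; it is an illustration of the general idea, not the article's code:

```python
import numpy as np

def detect_watermark(coeffs_marked, coeffs_orig, pattern, threshold=0.5):
    """Decide presence of `pattern` from the coefficient difference,
    using normalized correlation in [-1, 1]."""
    d = (coeffs_marked - coeffs_orig).ravel()
    p = pattern.ravel().astype(float)
    nc = d @ p / (np.linalg.norm(d) * np.linalg.norm(p) + 1e-12)
    return nc, nc > threshold
```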
A Finger Vein Recognition Algorithm Based on Depthwise Separable Convolution
Introduction: As a biometric identification technology, finger vein recognition has received wide attention and application in recent years.
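Depthwise separable convolution, the building block named in the title, factorizes a standard convolution into a per-channel spatial convolution followed by a 1x1 pointwise convolution. A minimal PyTorch sketch with illustrative sizes:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise conv (groups=in_ch) + 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# A 3x3 standard conv costs 9*Cin*Cout multiplications per pixel; the
# separable version costs 9*Cin + Cin*Cout, roughly 8-9x fewer for Cout=64.
x = torch.randn(1, 32, 64, 64)
print(DepthwiseSeparableConv(32, 64)(x).shape)  # torch.Size([1, 64, 64, 64])
```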
Anti-Heeling + Roll Reduction: China's First!
In recent years, the No. 704 Research Institute of China State Shipbuilding Corporation has independently developed China's first ship attitude control system that combines anti-heeling and roll-reduction functions. The system features high compensation efficiency, fast response, large compensation capacity, high cabin-space utilization, a high degree of intelligence and convenient operation. Recently, the system passed its factory acceptance test (FAT); all functional indicators met the technical requirements, reflecting the strength of domestic high-end marine equipment, with system functions and equipment appearance reaching an internationally advanced level.

The sub-class similarities for the crack-type defect are 98.75%, 98.19%, 96.34%, 87.36% and 99.4%, giving a final crack similarity of 81.12%; if this is higher than the final similarity of any of the other three weld types, the image is recognized as a weld with a crack-type defect. Programmed tests on 100 images of each of the four weld defect types (cracks, lack of fusion, porosity, weld beading) show that the average recognition time per image is under 3 s and the correct recognition rate is above 97.3%.

6 Conclusion
To address the high labor intensity and low efficiency of manual inspection of hull weld defects, this paper proposed an image processing algorithm suited to hull weld inspection. To speed up the computation of feature parameters, a new method for computing the long and short axes of the weld image was proposed, and its accuracy was verified with C++ and OpenCV programming; finally, a weld defect recognition procedure based on limited samples was designed. Test results show that weld defects are recognized accurately at a rate above 97.3%, meeting the basic requirements. This paper studied recognition theory and methods only for the four defect types that occur readily in hull welds (cracks, lack of fusion, porosity, weld beading); future work will target more types of hull weld defects and explore improved algorithms and application methods to raise the weld recognition rate.

References:
[1] CHU H H, WANG Z Y. A Study on Welding Quality Inspection System for Shell-Tube Heat Exchanger.
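The long/short-axis computation of a weld region can be approximated with standard OpenCV primitives. The article used C++ and OpenCV; the sketch below is an illustrative Python equivalent of that kind of measurement (a minimum-area rotated rectangle around the largest blob), not the authors' new method:

```python
import cv2
import numpy as np

def weld_axes(binary_mask):
    """Return (long_axis, short_axis, angle) of the largest blob."""
    contours, _ = cv2.findContours(binary_mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    c = max(contours, key=cv2.contourArea)
    (cx, cy), (w, h), angle = cv2.minAreaRect(c)  # rotated bounding box
    return max(w, h), min(w, h), angle
```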
Application of the Nonsubsampled Contourlet Transform to Image Edge Detection
Li Xingmei; Yan Guoping
Abstract: Traditional edge detection cannot provide both the anisotropy and the multiscale behaviour that edge detection requires. Wavelets can, but they cannot represent multidirectionality in the sparsest way. The contourlet transform is a new analysis tool that addresses exactly these problems. Methods applying the contourlet transform to edge detection are still rare; building on the idea that an anisotropic receptive-field model works well for image high-pass filtering, this paper proposes an edge detection method using the nonsubsampled contourlet transform. Experiments show that the method performs well for image edge detection.
Journal: Computer Engineering and Applications, 2010, 46(30), 3 pages (P178-180).
Keywords: contourlet transform; image edge detection; anisotropic receptive field model.
Authors: Li Xingmei (Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074; School of Mechanical and Electronic Information, China University of Geosciences, Wuhan 430074); Yan Guoping (Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074).
Language: Chinese. CLC: TP391.
Image edge detection is an important research topic in digital image processing, image analysis and machine vision.
The wavelet transform, hailed as a mathematical microscope for signal analysis, can highlight local signal features in both time and frequency and perform multiresolution analysis; it has been applied successfully to edge detection. Multiscale wavelet edge extraction approximates the image through fine and coarse details, suppressing noise at large scales and localizing edges precisely at small scales, outperforming classical operators such as Sobel and Canny [1]. However, because the wavelet transform detects edge features only in the horizontal, vertical and diagonal directions, it cannot represent multidirectional edges effectively and cannot represent image contours and edge information in the sparsest way. The contourlet transform is a new analysis tool for two- and higher-dimensional singularities. Its main features are good directionality and anisotropy, with multiscale and multidirectional properties, so it captures image contours and details better.
1 The Contourlet Transform
The idea of the contourlet transform is to approximate the original image with segment-like basis functions, achieving a sparse representation of the image signal.
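The multiscale, multidirection edge map described above can be emulated with a simple shift-invariant decomposition: take the modulus of directional high-pass responses at several scales and orientations and combine them. The sketch below uses derivative-of-Gaussian filters as a stand-in for the nonsubsampled contourlet bands; it illustrates the principle, not the paper's method:

```python
import numpy as np
from scipy import ndimage

def multiscale_edges(img, sigmas=(1, 2, 4), n_orient=6):
    """Max over scales/orientations of oriented first-derivative responses."""
    img = img.astype(np.float32)
    edge = np.zeros_like(img)
    for s in sigmas:
        gx = ndimage.gaussian_filter(img, s, order=(0, 1))  # d/dx
        gy = ndimage.gaussian_filter(img, s, order=(1, 0))  # d/dy
        for k in range(n_orient):
            th = np.pi * k / n_orient
            resp = np.abs(np.cos(th) * gx + np.sin(th) * gy)
            edge = np.maximum(edge, resp * s)  # scale-normalized response
        # larger sigmas suppress noise; smaller sigmas localize edges
    return edge / (edge.max() + 1e-12)
```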
Segmentation of Backed-Weave Fabric Images Based on Smoothing Filters and the Watershed Algorithm
Zhou Hui; Zhang Huaxiong; Hu Jie; Kang Feng
Abstract: For fabric images containing backed weaves, a segmentation method based on yarn color is proposed. The fabric image is first converted to Lab color space and scanning noise is removed with a hybrid median filter. A smoothing filter whose Gaussian weights are modified by a color-difference tolerance then removes backed-weave shadows and the texture of same-color yarns while preserving the yarn color features. Next, the color-difference gradient of the image is extracted and the image is segmented with the watershed algorithm to obtain a region-labeled image. Finally, regions of similar color are merged to obtain a color-indexed image of the fabric. Experiments show that the proposed algorithm segments backed-weave fabric images fairly accurately.
Journal: Journal of Textile Research, 2015, 36(8), 5 pages (P38-42).
Keywords: backed-weave fabric; image segmentation; color difference; smoothing filter; watershed algorithm.
Affiliation: School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China.
Language: Chinese. CLC: TN919; TS145.4.
Image segmentation is the basis of fabric image processing and analysis; its accuracy determines the effectiveness of later processing such as weave structure extraction, content analysis and retrieval. A backed weave is formed by two or more systems of warp yarns interlacing with one system of weft yarns, or two or more systems of weft yarns interlacing with one system of warp yarns, producing a double or multiple layered weave [1]. Backed fabrics are made from yarns of different colors or materials; as the number of superimposed warp or weft systems varies, the resulting fabric shows rich colors and varied layering. However, a backed-fabric image is not an ideal planar structure, and a scanned image does not reflect the true colors of the yarns. Under scanner illumination, because a yarn is roughly cylindrical, there are transition colors between the center and the edge of the same yarn; the gaps between yarns tend to be dark, so same-color yarn regions show texture; the unevenness of the backed weave casts shadows at its edges; and since scanned files usually use lossy JPEG compression, transition colors also appear between yarns of different colors.
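The pipeline described (Lab conversion, median smoothing, color-difference gradient, watershed, region merge) maps directly onto SciPy/scikit-image primitives. A condensed sketch with illustrative parameters, omitting the final region-merge step; it shows the structure of the method, not the paper's implementation:

```python
import numpy as np
from scipy import ndimage
from skimage import color, filters, segmentation

def segment_fabric(rgb, footprint=5):
    lab = color.rgb2lab(rgb)                      # perceptual color space
    # Median smoothing per channel removes scanner noise.
    lab = np.stack([ndimage.median_filter(lab[..., i], size=footprint)
                    for i in range(3)], axis=-1)
    # Color-difference gradient: Euclidean magnitude of per-channel gradients.
    grads = [filters.sobel(lab[..., i]) for i in range(3)]
    grad = np.sqrt(sum(g ** 2 for g in grads))
    # Markers from low-gradient plateaus, then watershed on the gradient.
    markers, _ = ndimage.label(grad < np.percentile(grad, 30))
    return segmentation.watershed(grad, markers)
```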
An Image Transmission Method Based on ROI and Multiple-Description Quantization
Teaching focus and difficulties
1. Focus: the sentence patterns "be going to ..." and "be + V-ing".
2. Difficulties: (1) how to use the present progressive correctly to express the future;
(2) how to expand the content on this basis and describe a whole plan.
Teaching process
Teaching stage
Teaching activity
Design intent
Step 1
Lead-in
(1) Do you want to have a vacation? Yes? Good. Now imagine your vacation plans. What are you doing for your vacation? When are you going? Who are you going with? How long are you staying ...
(2) Extend the given outline into a full piece of writing (individual work).
(1) Clarify the train of thought, so that students know what to write and how to write it.
(2) Develop students' ability to think independently and solve problems.
Step 3
Under the teacher's guidance, students evaluate and revise their own writing and find problems. The teacher mainly reminds students of errors in tense and voice, word choice, word forms, and sentence structure.
Develop students' self-awareness.
Analysis of the teaching material
1. This unit takes "What are you doing for vacation?" as its central topic and revolves around future plans, mainly using the present progressive with future meaning to ask about someone's plans and briefly describe what they are going to do. The topic "Writing about your vacation plans" is close to students' daily life, stimulates discussion, and gives full play to their imagination.
2. Learning to talk about future plans lays a foundation for students to arrange itineraries and draw up schedules, and the writing that follows the discussion further improves their overall speaking and writing ability.
Fine Hand Segmentation using Convolutional Neural Networks
Tadej Vodopivec (1,2), Vincent Lepetit (1), Peter Peer (2)
(1) Institute for Computer Graphics and Vision, Graz University of Technology, Austria
(2) Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, 1000 Ljubljana, Slovenia
arXiv:1608.07454v1 [cs.CV] 26 Aug 2016

Abstract
We propose a method for extracting very accurate masks of hands in egocentric views. Our method is based on a novel Deep Learning architecture: in contrast with current Deep Learning methods, we do not use upscaling layers applied to a low-dimensional representation of the input image. Instead, we extract features with convolutional layers and map them directly to a segmentation mask with a fully connected layer. We show that this approach, when applied in a multi-scale fashion, is both accurate and efficient enough for real-time. We demonstrate it on a new dataset made of images captured in various environments, from the outdoors to offices.

1 Introduction
To ensure that the user perceives the virtual objects as part of the real world in Augmented Reality applications, these objects have to be inserted convincingly enough. By far, most of the research in this direction has focused on 3D pose estimation, so that the object can be rendered at the right location in the user's view [19, 8, 15]. Other works aim at rendering the light interaction between the virtual objects and the real world consistently [5, 14].
Significantly fewer works have tackled the problem of correctly rendering the occlusions which occur when a real object is located in front of a virtual one. [9] provides a method that requires interaction with a human and works only for rigid objects. [17] relies on background subtraction, but this is prone to fail when foreground and background have similar colors. Depth cameras now bring an easy solution to handling occlusions; however, they provide a poorly accurate 3D reconstruction of the occluding boundaries of the real objects, which are essential for a convincing perception. Human perception is actually very sensitive to small deviations from the actual locations in occlusion rendering, making the problem very challenging [21].
With the development of hardware such as the HoloLens, which provides precise 3D registration and crisp rendering of the virtual objects, egocentric Augmented Reality applications can be foreseen to become very popular in the near future. This is why we focus here on correct rendering of the occlusions of virtual objects by the user's hands. More exactly, we assume that the hands are always in front of the virtual objects, which is realistic for many applications, and we aim at estimating a pixel-accurate mask of the hands in real-time.
The last years have seen the development of different segmentation methods based on Convolutional Neural Networks [12, 2]. While our method also relies on Deep Learning, its architecture has several fundamental differences. It is partially inspired by Auto-Context [18]: Auto-Context is a segmentation method in which a segmenter is iterated, and the segmentation result of the previous step is used in the next iteration in addition to the original image.
The fundamental difference between our approach and the original Auto-Context is that the initial segmentation is performed on a downscaled version of the input image. The resulting segmentation is then upscaled before being passed to the second iteration. This allows us to take the context into account very efficiently. We can also obtain precise localization of
In the remainder of the paper, we first discuss related work, then describe our method, and finally present and discuss our results on a new dataset for hand segmentation.

2 Related Work

Hand segmentation is a very challenging task, as hands can be very different in shape and skin color, look very different from another viewpoint, can be closed or open, can be partially occluded, can have different positions of the fingers, can be grasping objects or other hands, etc.

Skin color is a very obvious cue [1, 7]; unfortunately, this approach is prone to fail as other objects may have a similar color. Other approaches assume that the camera is static and segment the hands based on their movement [3], use a simple or even single-color background [10], or rely on depth information obtained by an RGB-D camera [6]. None of these approaches can provide accurate masks in general conditions.

The method we propose is based on convolutional neural networks [11]. Deep Learning has already been applied to segmentation, and recent architectures tend to be made of two parts: the first part applies convolutional and pooling layers to the input image to produce a compact, low-resolution representation; the second part applies deconvolutional layers to this representation to produce the final segmentation, at the same resolution as the input image. This typically results in oversmoothed segments, which we avoid with our approach.

3 Method

In this section, we describe our approach. We first present our initial architecture based on multiscale analysis of the input. We then split this architecture in two to obtain our more efficient, final architecture. We finally detail our methodology to select the meta-parameters of this architecture.

Figure 1: The architecture for the two components of our network. We extract features with convolutional layers without using pooling layers and map them directly to the output segmentation with a fully connected layer. For clarity, we show only one convolutional layer, and both the number of feature maps n and the number of neurons in the fully connected layer m are underrepresented.

3.1 Initial Network Architecture

As shown in Figure 1, our initial network was made of three chains of three convolution layers each. The first chain is applied directly to the input image, the second one to the input image after downscaling by a factor of two, and the last one to the input image after downscaling by a factor of four. We do not use pooling layers here, which allows us to extract the fine details of the hand masks.

The outputs of these three chains are concatenated together and given as input to a fully connected logistic regression layer, which outputs for each pixel its probability of lying on a hand.
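As an illustration of this design, a minimal PyTorch sketch is given below, assuming a 3-channel input. The feature-map counts and filter sizes (32, 32, and 16 maps with 3×3, 5×5, and 7×7 filters) are taken from Section 3.3; the output resolution `out_hw` and the lazily initialized linear layer are our own assumptions, since the text does not give the dimensionality of the fully connected layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleSegmenter(nn.Module):
    """Sketch of the initial architecture: three convolutional chains
    without pooling, applied to the input at full, half, and quarter
    resolution; their features are flattened, concatenated, and mapped
    to a per-pixel hand probability by one fully connected layer."""

    def __init__(self, out_hw=(30, 47)):  # output resolution is an assumption
        super().__init__()
        def chain():
            # 32, 32, 16 feature maps with 3x3, 5x5, 7x7 filters (Section 3.3)
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.LeakyReLU(0.01),
                nn.Conv2d(32, 32, 5, padding=2), nn.LeakyReLU(0.01),
                nn.Conv2d(32, 16, 7, padding=3), nn.LeakyReLU(0.01),
            )
        self.chains = nn.ModuleList([chain() for _ in range(3)])
        self.out_hw = out_hw
        # The flattened feature size depends on the input resolution,
        # so the linear layer is inferred on the first forward pass.
        self.fc = nn.LazyLinear(out_hw[0] * out_hw[1])

    def forward(self, x):
        feats = []
        for i, chain in enumerate(self.chains):
            xi = x if i == 0 else F.interpolate(
                x, scale_factor=1.0 / 2 ** i,
                mode='bilinear', align_corners=False)
            feats.append(chain(xi).flatten(1))
        logits = self.fc(torch.cat(feats, dim=1))
        return torch.sigmoid(logits).view(-1, 1, *self.out_hw)

# Example: a 188x120 input produces a 47x30 probability map.
# net = MultiScaleSegmenter(out_hw=(30, 47))
# probs = net(torch.rand(1, 3, 120, 188))   # -> shape (1, 1, 30, 47)
```

The absence of pooling and deconvolution is the point of this design: boundary detail is preserved at the cost of a large fully connected weight matrix, which is what motivates the split described next.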
3.2 Splitting the Network in Two

The network described above turned out to be too computationally intensive for real-time use. To speed it up, we developed an approach that is inspired by Auto-Context [18]. Auto-Context is a segmentation method in which a segmenter is iterated, taking as input not only the image to segment but also the segmentation result of the previous iteration. The fundamental difference between our approach and the original Auto-Context is that the initial segmentation is performed on a downscaled version of the input image.

Figure 2: Two-part network architecture. As in Figure 1, only one convolutional layer is shown, and the number of feature maps and the number of neurons on the fully connected layer are underrepresented in both parts of the classifier to make the representation more understandable. The second part of the network receives as input the output of the first part after upscaling, but also the original image, which helps segmenting fine details.

As seen in Figure 2, the first part performs the segmentation on the original image after downscaling by a factor of 16, and outputs a result of the same resolution. Its output, along with the original image, is then used as input to the second part of the new network, which is a simplified version of the initial network and produces the final, full-resolution segmentation. The two parts of the network have very similar structures. The difference is that the second part takes as input the original, full-resolution input image together with the output of the first part after upscaling. The first output already provides a first estimate of the position of the hands; the second part uses this information in combination with the original image to effectively segment the image. An example of the feature maps computed by the first part can be seen in Figure 3.

The advantage of this split is two-fold: the first part runs on a small version of the original image, and we can considerably reduce the number of feature maps and use smaller filters in the second part without losing accuracy.

Figure 3: Original input image, its ground truth segmentation, and some of the resulting feature maps computed by the first part of the network.

Figure 4: Example of output of the first part from the final architecture. Shades of grey represent the probabilities of the hand class over the pixel locations.

3.3 Meta-Parameters Selection

There is currently no good way to determine the optimal filter sizes and numbers of feature maps, so these have to be guessed or determined by trial and error. For this reason, we trained networks with the same structure but different parameters, and compared their accuracy and running time. We first identified parameters that produced the best results while ignoring processing time, and then simplified the model to reduce processing time while retaining as much accuracy as possible.

Our input images have a resolution of 752×480 pixels; they are scaled to 188×120, 94×60, and 47×30 and input to the first part of the network. The first two layers of each chain output 32 feature maps and the third layer outputs 16 feature maps. We used filters of size 3×3, 5×5, and 7×7 pixels for the successive layers. For the second part, the first layer outputs 8 feature maps, the second layer 4 feature maps, and the third layer outputs the final probability map. We used 3×3 filters for all layers.

We used the leaky rectified linear unit as activation function [13]. We minimize a boosted cross-entropy objective function [20]. This function weights the samples with lower probabilities more. We used α = 2 as proposed in the original paper. We used RMSprop [4] for optimization.

To avoid overfitting and to make the classifier more robust, we augmented the training set using very simple geometric transformations: we used scaling by a random factor between 0.9 and 1.1, rotations of up to 10 degrees, shear of up to 5 degrees, and translations of up to 20 pixels.
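To make the two-part design concrete, the sketch below wires a coarse first stage to the refinement stage of Section 3.3, reusing the MultiScaleSegmenter sketch above as the first part. The `SecondStage` name and the convolutional (rather than fully connected) output layer are our own simplifications: a full-resolution fully connected output would be impractically large, and the text says the third layer of the second part outputs the final probability map.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SecondStage(nn.Module):
    """Sketch of the second part: three 3x3 convolution layers with
    8, 4, and 1 feature maps (Section 3.3), applied to the
    full-resolution image concatenated with the upscaled
    first-stage probability map."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # 3 RGB channels + 1 probability channel from the first part
            nn.Conv2d(4, 8, 3, padding=1), nn.LeakyReLU(0.01),
            nn.Conv2d(8, 4, 3, padding=1), nn.LeakyReLU(0.01),
            nn.Conv2d(4, 1, 3, padding=1),
        )

    def forward(self, image, coarse_prob):
        # Upscale the 1/16-resolution mask of the first part to the
        # full input resolution before concatenation.
        up = F.interpolate(coarse_prob, size=image.shape[-2:],
                           mode='bilinear', align_corners=False)
        return torch.sigmoid(self.net(torch.cat([image, up], dim=1)))

# Assumed wiring of the two stages:
# first = MultiScaleSegmenter(out_hw=(30, 47))  # from the previous sketch
# coarse = first(image)                         # 1/16-resolution mask
# fine = SecondStage()(image, coarse)           # full-resolution mask
```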
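The training objective and augmentation of Section 3.3 could look as follows; this is a sketch under assumptions. In particular, the (1 - p_t)^α weighting is only one plausible reading of "weights the samples with lower probabilities more" and may differ from the exact form in [20], and the torchvision RandomAffine call expresses the 20-pixel translation as a fraction of the 752×480 frame.

```python
import torch
from torchvision import transforms

def boosted_cross_entropy(probs, targets, alpha=2.0, eps=1e-7):
    """Cross-entropy where pixels with a low probability for the true
    class are weighted more, with alpha = 2 as in Section 3.3.
    The (1 - p_t)**alpha form is an assumption about [20]."""
    probs = probs.clamp(eps, 1.0 - eps)
    # Probability assigned to the correct class at each pixel.
    p_t = torch.where(targets > 0.5, probs, 1.0 - probs)
    return -(((1.0 - p_t) ** alpha) * torch.log(p_t)).mean()

# Geometric augmentation from Section 3.3: scaling in [0.9, 1.1],
# rotation up to 10 degrees, shear up to 5 degrees, translation up to
# 20 pixels. In practice the same sampled transform must be applied
# to the image and to its ground-truth mask.
augment = transforms.RandomAffine(
    degrees=10, translate=(20 / 752, 20 / 480),
    scale=(0.9, 1.1), shear=5)
```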
4 Results

In this section, we describe the dataset we built for training and testing our approach, and present and discuss the results of its evaluation.

4.1 Dataset

We built a dataset of samples made of pairs of images and their correct segmentation performed manually. Figure 5 shows some examples.

We focused on egocentric images, i.e., the hands are seen from a first-person perspective. Several subjects acquired these images using a wide-angle camera mounted on their heads, near the eyes. The camera was set to take periodic images of whatever was in the field of view at that time.

In total, 348 images were taken. 90% of the images were used for training, and the rest was used for testing. 191 of those images were taken in an office at 6 different locations under different lighting conditions. The remaining 157 images were taken in and around a residential building, while performing everyday tasks like walking around, opening doors, etc. The images were taken with an IDS MT9V032C12STC sensor with a resolution of 752×480 pixels.

Figure 5: Some of the images from our dataset and their ground truth segmentations.

4.2 Evaluation

Figure 6 shows the ROC curve for our method applied to our egocentric dataset. When applying a threshold of 50% to the probabilities estimated by our method, we achieve a 99.3% accuracy on our test set, where the accuracy is defined as the percentage of pixels that are correctly classified. Figure 7 shows that this accuracy can be obtained with thresholds from a large range of values, which shows the robustness of the method. Qualitative results can be seen in Figures 8 and 9.

Figure 6: ROC curve obtained with our method on our challenging dataset of egocentric images. The figure also shows a magnification of the top-left corner.

Figure 7: Accuracy of the classifier depending on the probability threshold. The best accuracy can be obtained for a large range of thresholding values.

Figure 8: Comparison between the ground truth segmentations and the predicted ones. (a) Ground truth, (b) prediction, (c) differences. Errors are typically very small, and 1-pixel thin.

Figure 9: Images and their segmentations. (a) Original image; (b) upscaled segmentation predicted by the first part of our network; (c) final segmentation; (d) composition of the final segmentation into the original image.
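The reported numbers follow directly from these definitions. Below is a minimal sketch of the evaluation, assuming NumPy arrays for the predicted probability map and the boolean ground-truth mask; the 101-step threshold sweep for the ROC curve is our choice, not the paper's.

```python
import numpy as np

def pixel_accuracy(prob_map, gt_mask, threshold=0.5):
    """Percentage of pixels correctly classified after thresholding
    the predicted hand probabilities (Section 4.2)."""
    pred = prob_map >= threshold
    return float((pred == gt_mask).mean())

def roc_points(prob_map, gt_mask, num_thresholds=101):
    """(false positive rate, true positive rate) pairs over a sweep of
    thresholds, as used to draw an ROC curve such as Figure 6."""
    positives = max(int(gt_mask.sum()), 1)
    negatives = max(int((~gt_mask).sum()), 1)
    points = []
    for t in np.linspace(0.0, 1.0, num_thresholds):
        pred = prob_map >= t
        tpr = float((pred & gt_mask).sum()) / positives
        fpr = float((pred & ~gt_mask).sum()) / negatives
        points.append((fpr, tpr))
    return points
```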
4.3 Meta-parameter Fine Selection

In total, we trained 98 networks for the reduced-resolution segmentation estimation and 95 networks for the full-resolution final classifier.

After meta-parameter fine selection for the first part of the network, we were able to achieve an accuracy of 98.3% at 16 milliseconds per image, where the first layer had 32 feature maps and 3×3 filters, the second layer 32 feature maps and 5×5 filters, and the third layer 16 feature maps and 7×7 filters. With further meta-parameter fine selection for the second part of the network, we obtained a network reaching 99.3% for a processing time of 39 milliseconds. The first layer (of the second part) had 8 feature maps, the second layer 4 feature maps, and the third layer 1 feature map. All layers used 3×3 filters.

During this fine selection process, we noticed that the best results were achieved when the number of filters was higher for earlier layers and the filter sizes were bigger at later layers. The reduced-resolution segmentation estimation already provided very good results, but it still produced some false positives, and because of the lower resolution the edges were not as smooth as desired. The full-resolution final classifier was in most cases able both to remove the false positives and to produce smoother edges.

4.4 Evaluation of the Different Aspects of the Method

4.4.1 Convolution on Full Resolution Without Pooling and Upscaling

To verify that splitting the classifier into two parts performs better than a more standard classifier, we trained a classifier to perform segmentation on full-resolution images without first calculating the reduced-resolution segmentation estimation. We used the same structure as the second part of our classifier and modified it to only use the original image. To compensate for the absence of input from the first part, we tried using more feature maps. The best trade-off we found was using 16 feature maps and a filter size of 5×5 pixels on each of the three layers, instead of 3×3 filters and 8, 4, and 1 feature maps. Because of the memory size limit of the GPU we used, we were not able to train a more complex classifier, which might produce better results. Nevertheless, the processing time per image was 185 milliseconds with an accuracy of 94.0%, significantly worse than the proposed architecture.

4.4.2 Upscaling Without the Original Image

To verify that the second part of the classifier benefited from re-introducing the original image compared to only having the results of the first part, we trained a classifier like the one suggested in this work, but this time we provided the second part of the classifier with only the results of the first part. In this experiment, the processing time was 36.7 milliseconds, compared to 39.2 milliseconds for the suggested classifier, and the accuracy fell from 99.3% to 98.6%. Processing was therefore faster, but the second part of the classifier was not able to improve the accuracy much further. The second part of the classifier was able to correct some false positives from the first part, but unable to improve accuracy along the edges between foreground and background.

4.5 Comparison to a Color-based Classification

As discussed in the introduction, segmentation based on skin color is prone to fail as other parts of the image can have similar colors. To give a comparison, we applied the method described in [16] to our test set and obtained an accuracy of 81%, which is significantly worse than any other approach we tried.
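For context, a color-based baseline of the kind dismissed here fits in a few lines. The sketch below is a generic fixed-threshold skin detector in YCbCr space, not the actual pixel classifier of [16]; the chroma bounds are common rule-of-thumb values and are assumptions.

```python
import numpy as np
import cv2

def skin_mask_ycbcr(image_bgr):
    """Naive skin-color segmentation baseline: threshold the Cr and Cb
    chroma channels with commonly used rule-of-thumb ranges. This is a
    stand-in for color-based classification, not the method of [16]."""
    ycrcb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2YCrCb)
    _, cr, cb = cv2.split(ycrcb)
    # Typical skin chroma ranges; the exact bounds are assumptions.
    mask = (cr >= 133) & (cr <= 173) & (cb >= 77) & (cb <= 127)
    return mask.astype(np.uint8)
```

A detector of this kind fires on any skin-colored background region, which is exactly the failure mode reported above.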
5 Conclusion

Occlusions are crucial for understanding the position of objects. In Augmented Reality applications, their exact detection and correct rendering contribute to the feeling that an object is a part of the world around the user. We showed that starting with a low-resolution processing of the image helps capture the context of the image, and that using the input image a second time helps capture the fine details of the foreground.

References

[1] Z. Al-Tairi, R. Rahmat, M. Saripan, and P. Sulaiman. Skin Segmentation Using YUV and RGB Color Spaces. Journal of Information Processing Systems, 10(2):283–299, 2014.

[2] V. Badrinarayanan, A. Kendall, and R. Cipolla. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. arXiv Preprint, 2015.

[3] L. Baraldi, F. Paci, G. Serra, L. Benini, and R. Cucchiara. Gesture Recognition in Ego-Centric Videos Using Dense Trajectories and Hand Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 688–693, 2014.

[4] Y. Dauphin, H. de Vries, J. Chung, and Y. Bengio. RMSProp and Equilibrated Adaptive Learning Rates for Non-Convex Optimization. In arXiv, 2015.

[5] P. Debevec. Rendering Synthetic Objects into Real Scenes: Bridging Traditional and Image-Based Graphics with Global Illumination and High Dynamic Range Photography. In ACM SIGGRAPH, July 1998.

[6] Y.-J. Huang, M. Bolas, and E. Suma. Fusing Depth, Color, and Skeleton Data for Enhanced Real-Time Hand Segmentation. In Proceedings of the First Symposium on Spatial User Interaction, pages 85–85, 2013.

[7] M. Kawulok, J. Nalepa, and J. Kawulok. Skin Detection and Segmentation in Color Images, 2015.

[8] G. Klein and D. Murray. Parallel Tracking and Mapping for Small AR Workspaces. In ISMAR, 2007.

[9] V. Lepetit and M. Berger. A Semi-Automatic Method for Resolving Occlusions in Augmented Reality. In Conference on Computer Vision and Pattern Recognition, June 2000.

[10] Y. Lew, R. A. Rhaman, K. S. Yeong, A. Roslizah, and P. Veeraraghavan. A Hand Segmentation Scheme using Clustering Technique in Homogeneous Background. In Student Conference on Research and Development, 2002.

[11] J. Long, E. Shelhamer, and T. Darrell. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[12] J. Long, E. Shelhamer, and T. Darrell. Fully Convolutional Networks for Semantic Segmentation. In Conference on Computer Vision and Pattern Recognition, 2015.

[13] A. Maas, A. Hannun, and A. Ng. Rectifier Nonlinearities Improve Neural Network Acoustic Models. In Proceedings of the International Conference on Machine Learning Workshop on Deep Learning for Audio, Speech, and Language Processing, 2013.

[14] M. Meilland, C. Barat, and A. Comport. 3D High Dynamic Range Dense Visual SLAM and Its Application to Real-Time Object Re-lighting. In International Symposium on Mixed and Augmented Reality, pages 143–152, 2013.

[15] R. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon. KinectFusion: Real-Time Dense Surface Mapping and Tracking. In International Symposium on Mixed and Augmented Reality, 2011.

[16] S. Phung, A. Bouzerdoum, and D. Chai. Skin Segmentation Using Color Pixel Classification: Analysis and Comparison. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(1):148–154, 2005.

[17] J. Pilet, V. Lepetit, and P. Fua. Retexturing in the Presence of Complex Illuminations and Occlusions. In International Symposium on Mixed and Augmented Reality, 2007.

[18] Z. Tu and X. Bai. Auto-Context and Its Applications to High-Level Vision Tasks and 3D Brain Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009.

[19] L. Vacchetti, V. Lepetit, and P. Fua. Stable Real-Time 3D Tracking Using Online and Offline Information. PAMI, 26(10), October 2004.

[20] H. Zheng, J. Li, C. Weng, and C. Lee. Beyond Cross-Entropy: Towards Better Frame-Level Objective Functions for Deep Neural Network Training in Automatic Speech Recognition. In Proceedings of the InterSpeech Conference, 2014.

[21] S. Zollmann, D. Kalkofen, E. Mendez, and G. Reitmayr. Image-Based Ghostings for Single Layer Occlusions in Augmented Reality. In International Symposium on Mixed and Augmented Reality, 2010.