Chinese-English Literature Translation: Prototyping Color-based Image Retrieval with MATLAB

English Original
Prototyping Color-based Image Retrieval with MATLAB
Abstract
Content-based retrieval from image databases has become more popular than before. Developing algorithms for this purpose requires testing and simulation tools, but no suitable commercial tools are on the market.
This paper presents a simulation environment for retrieving images from a database according to histogram similarity. The environment allows the use of different color spaces and numbers of bins. The algorithms are implemented in MATLAB, and each color system has its own m-files.
The phases of the software building process are presented, from system design to the graphical user interface (GUI). The functionality is described with snapshots of the GUI.
1. Introduction
Nowadays an image database may hold thousands or hundreds of thousands of digital images. If the user wants to find a suitable image for his/her purposes, he/she has to go through the database until the correct image has been found, or use a reference book or some “intelligent” program. Video on demand (VoD) services also require an intelligent search system for end users. The search methods of VoD systems differ slightly from those of image databases.
A reference book is a suitable option if the images are arranged by a useful method, for example: 1) categories: animals, flags, etc., 2) names (requires a good naming technique) or 3) dates. An experienced user can use such systems, as well as textual searches (for which keywords have to be inserted in a database), efficiently. There are situations in which a multi-language system has to be used. That is where the best properties of a language-independent search system can be utilized. A tool which is based on the images’ properties can be made language independent. These properties can be, for example, color, shape, texture and the spatial location of shapes.
In the MuVi project [1] a tool of this kind is under construction. It will cover the properties presented above. Research work on content-based image retrieval has been done in [2 – 6]. The system presented in this paper is a simulation environment in which MuVi’s color content-based retrieval has been developed and tested.
2. System development
MATLAB is an efficient program for processing vector and matrix data. It contains ready-made functions for matrix manipulation and image visualization, and it allows a program to have a modular structure. For these reasons MATLAB was chosen as the prototyping software.
2.1 System design
Before any m-files were written, the system design was completed. The system design for the retrieval process based on the HSV (hue, saturation and value) color system is presented in Figure 1. A similar design was made for every color system used.
Figure 1: Function chart for the HSV color space with a 27-bin histogram.
The function tesths27 is the main function for this color system and this number of bins. It calls other functions (hs27read, dif_hsv and image_pos) when needed. Each color system has a main function of its own and a variable number (2 – 3) of sub-functions. If no color space conversion is needed, the first branch of the function chart contains 2 functions; otherwise it contains 3.
The main function is called as follows:
matches=tesths27(imagen,directory,num)
The variable imagen specifies the query image’s name and path, directory is the path of the image database, and num is the desired number of retrieved images.
2.2 Functions
At present, functions are implemented for four color spaces: HSV, L*a*b*, RGB and XYZ [7]. Each color space has 2 to 4 implementations for different numbers of bins, which gives 14 main functions altogether.
For some color systems it would be possible to make these functions dynamic, i.e. to calculate histograms dynamically. However, every color system / bin combination requires its own histograms, and these can be built only with an exhaustive (pixel-by-pixel) method. On a 150 MHz Pentium, histogram calculation takes 0.5 to 5 minutes per image of approximately 320×240 pixels, depending on the complexity of the color space. Thus it is not reasonable to let the user select the number of bins freely, especially in the case of large databases.
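To make the cost argument concrete, the following back-of-the-envelope sketch multiplies the per-image times quoted above by a database size. It is written in Python purely for illustration (the paper's own code consists of MATLAB m-files), and the database size of 1000 images is a hypothetical figure, not one taken from the paper.

```python
# Rough cost of building histograms pixel by pixel.
# The per-image times (0.5 to 5 minutes) come from the text above;
# the database size of 1000 images is a hypothetical example value.

def total_hours(num_images, minutes_per_image):
    """Return the total histogram-building time in hours."""
    return num_images * minutes_per_image / 60.0

pixels_per_image = 320 * 240            # pixels visited for each image
best_case = total_hours(1000, 0.5)      # fastest color space
worst_case = total_hours(1000, 5.0)     # slowest color space

print(pixels_per_image)
print(round(best_case, 1), round(worst_case, 1))
```

Under these assumptions a single color system / bin combination already costs hours of computation, which is consistent with the decision to precompute histograms for a fixed set of bin counts.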
The functions are named so that each name indicates the color space used, the purpose of the function and the number of bins. Some functions, for example image_pos, are used by many or all main functions, and these are not named according to this scheme.
The main function checks whether the function call is correct. If the query image’s name doesn’t contain a path, the function assumes that the image is located in the database directory. In addition, the main function checks whether the query image already has a histogram in the currently used database. If the required histogram is not there, the image read function (for example hs27read) is called. This function also normalizes the pixel values and arranges the image matrix data into vector format. After that stage a color space conversion function is called, if needed. Finally a quantization function builds the histogram with the correct number of bins.
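The normalize, convert and quantize steps described above can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB code. The paper does not say how the 27 HSV bins are laid out, so the sketch assumes 3 uniform levels for each of H, S and V (3 × 3 × 3 = 27), and dividing by the pixel count is likewise an assumption. The function name hsv27_histogram is hypothetical.

```python
import colorsys

def hsv27_histogram(pixels):
    """Build a normalized 27-bin HSV histogram from 8-bit RGB pixels.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    Assumption: 27 bins = 3 uniform levels each for H, S and V.
    """
    levels = 3
    hist = [0] * (levels ** 3)
    count = 0
    for r, g, b in pixels:
        # Normalize pixel values to [0, 1] before the color conversion.
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Quantize each channel; min() keeps the value 1.0 in the top level.
        hq = min(int(h * levels), levels - 1)
        sq = min(int(s * levels), levels - 1)
        vq = min(int(v * levels), levels - 1)
        hist[(hq * levels + sq) * levels + vq] += 1
        count += 1
    # Divide by the pixel count so images of different sizes are comparable.
    return [n / count for n in hist] if count else hist
```

For example, an image consisting half of pure red and half of black pixels puts half of its mass in the bin for (low hue, high saturation, high value) and half in the bin for (low hue, low saturation, low value).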
The histogram is then saved in the database directory. If the histogram already exists there, the three previous steps are not executed. At this point the query image has been analyzed. The main function then goes through all the images in the database directory with nearly the same algorithm as for the query image. The difference is that now the difference between the query image’s histogram and the current image’s histogram is also calculated. Finally the image_pos function is used to display the query image and the desired number of best-matching images.
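The path default and the "build only if missing" behaviour described above can be sketched as below. This Python sketch is illustrative only: the helper names, the JSON storage format and the ".hist.json" suffix are all hypothetical choices, since the paper does not say how histograms are stored in the database directory.

```python
import json
import os
import tempfile

def resolve_query_path(imagen, directory):
    """Assume the database directory when imagen carries no path."""
    if os.path.dirname(imagen):
        return imagen
    return os.path.join(directory, imagen)

def load_or_build_histogram(image_path, db_dir, build):
    """Return the stored histogram for image_path, building it if missing.

    build is any callable that maps an image path to a list of numbers.
    The ".hist.json" suffix is a hypothetical naming choice.
    """
    name = os.path.basename(image_path)
    cache_path = os.path.join(db_dir, name + ".hist.json")
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    hist = build(image_path)
    with open(cache_path, "w") as f:
        json.dump(hist, f)
    return hist

# Usage with a stand-in builder that records how often it is called.
calls = []

def fake_build(path):
    calls.append(path)
    return [0.25, 0.75]

with tempfile.TemporaryDirectory() as db:
    first = load_or_build_histogram("query.jpg", db, fake_build)
    second = load_or_build_histogram("query.jpg", db, fake_build)

print(len(calls))  # the builder runs only for the first request
```

Storing the histogram next to the images means the expensive pixel-by-pixel pass is paid once per image and per color system / bin combination.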
2.3 Linking
A program cannot be used before the main function and the sub-functions are connected to each other. The main function is called from the command line or through the graphical user interface, which is presented later in this paper. In both cases the function call contains the same arguments. For multi-level searches, separate main functions have been implemented, but it is also possible to use the “normal” functions and add one parameter through which the array of best matches can be passed to a second-stage comparison function.
The main function calls an image read function with the image’s name, and the histogram is returned to the main function. If a color space conversion is needed, the conversion function is called from the read function with the r, g and b vectors. The histogram is returned to the calling function. Finally the histogram build function is called with the converted color vectors. This function returns a quantized histogram, which passes back through all the calling functions until it reaches the main function.
The main function calls the histogram difference function with two histogram vectors and gets a difference value in response. The difference function uses the Euclidean distance, but it can easily be changed to another algorithm thanks to the modularity of the program. If the difference is smaller than the largest difference in the best match table, the current result is written over the last result in the table. After that the table is sorted again in ascending order of distance. When all the images have been analyzed, the sorted best match table, the number of desired output images, the query image’s name, the search image’s path and the database path are passed to the image_pos function. These values could be packed into larger components (vectors/containers). At present the program runs faster with several separate input arguments, because there is no need to pick variables out of a container.
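The distance calculation and the best match table update described above can be sketched as follows. This is a Python illustration rather than the authors' dif_hsv m-file. The function names are hypothetical, and the handling of a table that is not yet full is an assumption, because the paper does not say how the table is initialized.

```python
import math

def euclidean_distance(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def update_best_matches(table, name, distance, num):
    """Keep the num smallest distances in table, sorted ascending.

    table is a list of (distance, name) pairs.
    """
    if len(table) < num:
        # Assumption: fill the table until it holds num entries.
        table.append((distance, name))
    elif distance < table[-1][0]:
        # Overwrite the last (largest) entry, as the text describes.
        table[-1] = (distance, name)
    else:
        return table
    table.sort()  # re-sort in ascending order of distance
    return table

# Toy database of three-bin histograms (hypothetical values).
query = [0.5, 0.5, 0.0]
database = {
    "a.jpg": [0.5, 0.5, 0.0],
    "b.jpg": [0.0, 0.5, 0.5],
    "c.jpg": [0.4, 0.6, 0.0],
    "d.jpg": [0.0, 0.0, 1.0],
}
table = []
for name, hist in database.items():
    update_best_matches(table, name, euclidean_distance(query, hist), 2)
names = [name for _, name in table]
print(names)
```

Because the distance function is passed nothing but two vectors, swapping in another measure only requires replacing euclidean_distance, which mirrors the modularity argument made in the text.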
2.4 Graphical user interface
The graphical user interface (GUI) is an important part of software development. GUI design has to address the following issues: learning time, speed of performance, rate of errors by users, retention over time, and subjective satisfaction [9]. This software is, at the moment, intended only for testing purposes. Its most important property is that the results of different test queries can be seen quickly and saved safely on disk. Thus the visual layout is not as important as it would be for a commercial software product.
Figure 2 presents the first screen of the GUI. The purposes of the buttons, menus and other components are presented later. If this software is developed into a commercial product, the menu bar will be disabled and exit and help buttons will be added to the canvas.
Figure 2: GUI before the search image selection.
Figure 3 presents the search screen just before a search is started. The user is shown the search image, and in this way he/she can be sure that the search will be made with the correct image.
Figure 3: GUI just before running a query.
The results of the query are displayed on the screen in the format shown in Figure 6.
3. Using the software
The first screen has already been presented in Figure 2. From pop-up menus (see Figure 4) the user can choose whether the search is made with one color system or as a multi-level search. In a one-level search, either a roughly quantized or a more accurate histogram is used in one loop (one color system).
Figure 4: Color system selection from a pop-up menu. The second menu is disabled because a one-level search is selected.
In a multi-level search two different color systems/histograms are used. During the first loop the roughly quantized histograms are used, and during the second loop the more accurate histograms are used for the best matches from the first loop. The color system in the second loop can be either the same as in the first loop or a different one. For one-level queries the selection of a second color system is disabled. The user can select the number of images retrieved at the final stage. The software can be linked to many image databases, and the user can select the database to which the query is directed.
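The two-loop search described above can be sketched as a coarse ranking followed by a fine ranking of the shortlist. This Python sketch is illustrative only: the function names and the first_keep parameter are hypothetical, since the paper does not state how many first-loop matches are passed to the second loop, and the Euclidean distance is carried over from Section 2.3.

```python
import math

def distance(h1, h2):
    """Euclidean distance between two histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def rank(query_hist, hists, names, keep):
    """Return the keep names closest to query_hist, best first."""
    ordered = sorted(names, key=lambda n: distance(query_hist, hists[n]))
    return ordered[:keep]

def two_level_search(query, coarse, fine, first_keep, num):
    """Rank with coarse histograms, then re-rank the shortlist with fine ones.

    query holds the query image's "coarse" and "fine" histograms;
    coarse and fine map image names to histograms.
    first_keep (size of the shortlist) is a hypothetical parameter.
    """
    shortlist = rank(query["coarse"], coarse, list(coarse), first_keep)
    return rank(query["fine"], fine, shortlist, num)

# Toy data (hypothetical values): 2 coarse bins, 4 fine bins.
coarse = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
fine = {
    "a": [0.5, 0.5, 0.0, 0.0],
    "b": [1.0, 0.0, 0.0, 0.0],
    "c": [1.0, 0.0, 0.0, 0.0],
}
query = {"coarse": [1.0, 0.0], "fine": [1.0, 0.0, 0.0, 0.0]}
result = two_level_search(query, coarse, fine, first_keep=2, num=1)
print(result)
```

In this toy data image "c" never reaches the second loop because it was eliminated by the coarse comparison. That is the trade-off of a multi-level search: the fine comparison runs on fewer images, at the risk of discarding candidates early.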
The user can select a search image either from the same database to which the query is directed (the default) or from any directory on his/her PC. The selection is made with the file-open dialog, which is presented in Figure 5. The form can be cleared with the “Reset” button. A query is executed with the “Search” button. Finally the results of the search appear on the screen in a separate window, as presented in Figure 6. Earlier [8] the software opened each image in a separate window, and evaluating/saving the results was more difficult than after the improvement. The original query image is in the top left corner. Below it the best matches are presented in descending order of similarity, from left to right and from top to bottom. The user can select suitable images for further use with the “Copy selected” or the “Print selected” buttons. The “New search” button closes this form and goes back to the original search form. The “Search similar” button executes a new search in which the query histogram is composed of the histograms of the selected images. If the user has selected a number larger than 21 as “Number of matching images”, the best matches are shown on multiple screens. The user can browse these pages with the “Previous page” and “Following page” buttons.
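The paging of results and the composed query histogram used by “Search similar” can be sketched as below. This is a Python illustration with hypothetical function names. The page size of 21 comes from the text, but the paper does not say how the histograms of the selected images are combined, so the bin-by-bin average used here is only an assumption.

```python
import math

PAGE_SIZE = 21  # matches shown per result screen, from the text

def page_count(num_matches):
    """Number of result screens needed for num_matches images."""
    return math.ceil(num_matches / PAGE_SIZE)

def page_slice(matches, page):
    """Return the matches shown on the given zero-based page."""
    start = page * PAGE_SIZE
    return matches[start:start + PAGE_SIZE]

def combined_histogram(histograms):
    """Average the selected histograms bin by bin (assumed rule)."""
    n = len(histograms)
    return [sum(bins) / n for bins in zip(*histograms)]

matches = list(range(50))            # stand-in for 50 ranked images
pages = page_count(len(matches))
last_page = page_slice(matches, pages - 1)
query_hist = combined_histogram([[0.25, 0.75], [0.75, 0.25]])
print(pages, len(last_page), query_hist)
```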
Figure 5: The query image selection dialog. The language of the dialog depends on the language of the operating system used.
Figure 6: The results of a query will be presented graphically.
4. Summary
Color content-based retrieval requires algorithms that give visually correct results. Correctly working algorithms cannot be chosen without simulations. The software presented in this paper is intended for testing purposes. Some operations will be implemented if the software is developed into a commercial product. Some modifications are under construction.
This software has been used as a testing platform for histogram quantization tests. The modularity of the program makes it possible to add new algorithms to the software in a short time. MATLAB makes quick prototyping possible. The possibility to save figures (search results) directly to disk fulfills one of the program’s requirements. After the results have been analyzed visually, the best algorithms will be included in the final software.
5. Acknowledgements
This work has been funded by the European Union – ERDF, the Technology Development Centre Tekes, Alma Media, the Helsinki Telephone Company, Nokia Research Center, the Satakunta High Technology Foundation and Ulla Tuominen’s Foundation.
Chinese Translation
Prototyping Color-based Image Retrieval with MATLAB
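Abstract
Content-based retrieval from (image) databases has become increasingly popular. Developing algorithms for this purpose requires testing and simulation tools, but there are no suitable commercial tools on the market.

This paper introduces a simulation environment that retrieves images from a database according to histogram similarity. The environment allows different color spaces and numbers of bins to be used, and the algorithms are implemented in MATLAB. Each color system has its own m-files.

The stages of the software building process are presented, from system design to the graphical user interface (GUI), and the functionality is described with snapshots of the GUI.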

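1. Introduction
Image databases today hold thousands or even hundreds of thousands of digital images. A user who wants to find an image suited to his/her purpose has to search the whole database until the right image is found, or use a reference book or some intelligent program. Video on demand services also need an intelligent search system for end users. The search methods of video on demand systems differ slightly from those of image databases.

A reference book is a good choice if the images are arranged by an effective method, for example: 1) categories: animals, flags, etc.; 2) names (which requires a good naming scheme); 3) dates. An experienced user can use such systems, as well as textual searches (for which keywords have to be entered in a database), efficiently. There are also situations in which a multi-language system has to be used. That is where the best properties of a language-independent search system can be exploited: a tool based on the properties of the images can be made language independent. These properties can be, for example, color, shape, texture and the spatial location of shapes.

In the MuVi project a tool of this kind is under construction, and it will cover the properties listed above. Research on content-based image retrieval has been carried out in [2 - 6]. The system presented in this paper is a simulation environment in which MuVi's color content-based retrieval has been developed and tested.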

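2. System development
MATLAB is an efficient program for processing vector and matrix data. It includes ready-made functions for matrix operations and image visualization, and it allows a program to have a modular structure. For these reasons MATLAB was chosen as the prototyping software.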

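2.1 System design
The system design was completed before any m-files were written. The system design for the retrieval process based on the HSV (hue, saturation, value) color system is shown in Figure 1. Similar designs were made for all the color systems used.

Figure 1: Function chart for the HSV color space with a 27-bin histogram.

The function tesths27 is the main function for this color system and this number of bins. It calls other functions (hs27read, dif_hsv and image_pos) as needed. Each color system has its own main function and a variable number (2-3) of sub-functions. If no color space conversion is needed, there are two functions on the first branch of the function chart; otherwise there are three.

The main function is called as follows:
matches=tesths27(imagen,directory,num)
The variable imagen specifies the name and path of the query image, directory is the path of the image database, and num is the desired number of retrieved images.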

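2.2 Functions
So far functions have been implemented for four color spaces: HSV, L*a*b*, RGB and XYZ. Each color space has 2 to 4 implementations for different numbers of bins, which gives 14 main functions in total.

For some color systems these functions could be made dynamic, that is, the histograms could be calculated dynamically. Every color system / bin combination needs its own histograms, and these can be built only with an exhaustive (pixel-by-pixel) method. On a 150 MHz Pentium, histogram calculation takes ½ - 5 minutes per image of approximately 320×240 pixels, depending on the complexity of the color space. It is therefore impractical to let the user choose the number of bins freely, especially with large databases.

The functions are named so that each name indicates the color space used, the purpose of the function and the number of bins. Some functions, such as image_pos, are used by many or all main functions, and these are not named according to this scheme.

The main function checks whether the function call is correct. If the name of the query image does not contain a path, the function assumes that the image is located in the database directory. In addition, the main function checks whether a histogram of the query image already exists in the current database. If the required histogram is not there, the image read function (for example hs27read) is called. This function also normalizes the pixel values and arranges the image matrix data into vector form. After this stage a color space conversion function is called, if needed. Finally a quantization function builds the histogram with the correct number of bins.

The histogram is then saved in the database directory. If the histogram already exists, the three previous steps are not executed. The analysis of the query image is now complete. The main function then goes through all the images in the database directory with nearly the same algorithm as for the query image. The difference is that now the difference between the histogram of the current image and the histogram of the query image is also calculated. Finally the image_pos function displays the query image and the requested number of best-matching images.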

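2.3 Linking
A program cannot be run before the main function and the sub-functions have been linked together. The main function is called from the command line or through the graphical user interface, which is described later in this paper. In both cases the function call contains the same arguments. Separate main functions have been implemented for multi-level searches, but it is also possible to use the ordinary functions and add one parameter through which the array of best matches can be passed to a second-stage comparison function.

The main function calls the image read function with the name of the image, and the histogram is returned to the main function. If a color space conversion is needed, the conversion function is called from the read function with the R, G and B vectors. The histogram is returned to the calling function. Finally the histogram build function is called with the converted color vectors. This function returns a quantized histogram, which passes back through all the calling functions until it reaches the main function.

The main function calls the histogram difference function with two histogram vectors and receives a difference value in response. The difference function uses the Euclidean distance, but thanks to the modularity of the program it can easily be changed to another algorithm. If the difference is smaller than the largest difference in the best match table, the result overwrites the last entry in the table. The table is then sorted again in ascending order of difference. When all the images have been analyzed, the sorted best match table, the number of desired output images, the name of the query image, the path of the search image and the database path are passed to the image_pos function. These values could be passed in larger components (vectors/containers). At present the program runs faster with several separate input arguments, because there is no need to pick variables out of a container.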

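2.4 Graphical user interface
The graphical user interface (GUI) is an important part of software development. GUI design has to address the following issues: learning time, speed of performance, rate of errors by users, retention over time, and subjective satisfaction. This software is currently intended only for testing purposes. Its most important property is that the results of different test queries can be seen quickly and saved safely on disk. The visual layout is therefore not as important as it would be for a commercial software product.

Figure 2 shows the first screen of the GUI. The purposes of the buttons, menus and other components are described later. If this software is developed into a commercial product, the menu bar will be disabled and exit and help buttons will be added to the canvas.

Figure 2: GUI before the query image is selected.

Figure 3 shows the search screen just before a search is started. The user is shown the query image, so that he/she can be sure that the search will be made with the correct image.

Figure 3: GUI just before running a query.

The results of the query are displayed in the format shown in Figure 6.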

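3. Using the software
The first screen has already been shown in Figure 2. From pop-up menus (see Figure 4) the user can choose whether the search is made with one color system or as a multi-level search. In a one-level search, either a roughly quantized or a more accurate histogram is used in one loop (one color system).

Figure 4: Color system selection from a pop-up menu. The second menu is disabled because a one-level search is selected.

In a multi-level search two different color systems/histograms are used. In the first loop the roughly quantized histograms are used, and in the second loop the more accurate histograms are used for the best matches from the first loop. The color system of the second loop can be the same as that of the first loop or a different one. For one-level searches the selection of a second color system is disabled. The user can select the number of images retrieved at the final stage. The software can be linked to many image databases, and the user can select the database to which the query is directed.

The user can select a query image either from the same database to which the query is directed (the default) or from any directory on his/her PC. The selection is made with the file-open dialog shown in Figure 5. The form can be cleared with the “Reset” button, and a query is executed with the “Search” button.

Finally the search results appear on the screen in a separate window, as shown in Figure 6. Earlier the software opened each image in a separate window, and evaluating and saving the results was more difficult than after the improvement. The original query image is in the top left corner. Below it the best matches are shown in descending order of similarity, from left to right and from top to bottom. The user can select suitable images for further use with the “Copy selected” or “Print selected” buttons. The “New search” button closes this form and returns to the original search form. The “Search similar” button executes a new search in which the query histogram is composed of the histograms of the selected images. If the user has selected a number larger than 21 as “Number of matching images”, the best matches are shown on multiple screens. The user can browse these pages with the “Previous page” and “Following page” buttons.

Figure 5: The query image selection dialog. The language of the dialog depends on the language of the operating system used.

Figure 6: The results of a query are presented graphically.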

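4. Summary
Color content-based retrieval requires algorithms that give visually correct results, and correctly working algorithms cannot be chosen without simulations. The software presented in this paper is intended for testing purposes. Some operations will be implemented if the software is developed into a commercial product, and some modifications are under way.

This software has been used as a testing platform for histogram quantization tests. The modularity of the software makes it possible to add new algorithms to it in a short time. MATLAB makes quick prototyping possible. The ability to save figures (search results) directly to disk fulfills one of the requirements of the software. After the results have been analyzed visually, the best algorithms will become part of the final software.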

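5. Acknowledgements
This work has been funded by the European Union ERDF, the Technology Development Centre Tekes, Alma Media, the Helsinki Telephone Company, Nokia Research Center, the Satakunta High Technology Foundation and Ulla Tuominen's Foundation.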
