图像识别中英文对照外文翻译文献
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
中英文对照外文翻译文献
(文档含英文原文和中文翻译)
Elastic image matching
Abstract
One fundamental problem in image recognition is to establish the resemblance of two images. This can be done by searching the best pixel to pixel mapping taking into account monotonicity and continuity constraints. We show that this problem is NP-complete by reduction from 3-SAT, thus giving evidence that the known exponential time algorithms are justified, but approximation algorithms or simplifications are necessary.
Keywords: Elastic image matching; Two-dimensional warping; NP-completeness 1. Introduction
In image recognition, a common problem is to match two given images, e.g. when comparing an observed image to given references. In that pro-cess, elastic image matching, two-dimensional (2D-)warping (Uchida and Sakoe, 1998) or similar types of invariant methods (Keysers et al., 2000) can be used. For this purpose, we can define cost functions depending on the distortion introduced in the matching and
search for the best matching with respect to a given cost function. In this paper, we show that it is an algorithmically hard problem to decide whether a matching between two images exists with costs below a given threshold. We show that the problem image matching is NP-complete by means of a reduction from 3-SAT, which is a common method of demonstrating a problem to be intrinsically hard (Garey and Johnson, 1979). This result shows the inherent computational difficulties in this type of image comparison, while interestingly the same problem is solvable for 1D sequences in polynomial time, e.g. the dynamic time warping problem in speech recognition (see e.g. Ney et al., 1992). This has the following implications: researchers who are interested in an exact solution to this problem cannot hope to find a polynomial time algorithm, unless P=NP. Furthermore, one can conclude that exponential time algorithms as presented and extended by Uchida and Sakoe (1998, 1999a,b, 2000a,b) may be justified for some image matching applications. On the other hand this shows that those interested in faster algorithms––e.g. for pattern recognition purposes––are right in searching for sub-optimal solutions. One method to do this is the restriction to local optimizations or linear approximations of global transformations as presented in (Keysers et al., 2000). Another possibility is to use heuristic approaches like simulated annealing or genetic algorithms to find an approximate solution. Furthermore, methods like beam search are promising candidates, as these are used successfully in speech recognition, although linguistic decoding is also an NP-complete problem (Casacuberta and de la Higuera, 1999). 2. Image matching
Among the varieties of matching algorithms,we choose the one presented by Uchida and Sakoe(1998) as a starting point to formalize the problem image matching. Let the images be given as(without loss of generality) square grids of size M×M with gray values (respectively node labels)from a finite alphabet &={1,…,G}. To define the
d:&×&→N , problem, two distance functions are needed,one acting on gray values
g measuring the match in gray values, and one acting on displacement differences :Z×Z→N , measuring the distortion introduced by t he matching. For these distance d
d
functions we assume that they are monotonous functions (computable in polynomial time) of the commonly used squared Euclid-ean distance, i.e
d g (g 1,g 2)=f 1(||g 1-g 2||²)and d d (z)=f 2(||z||²) monotonously increasing. Now w
e call the following optimization problem the image matching problem (let µ={1,…M} ).
Instance: The pair( A ; B ) of two images A and B of size M×M .
Solution: A mapping function f :µ×µ→µ×µ.
Measure:
c (A,B,f )=),(),(j i f ij g B A
d ∑
μμ⨯∈),(j i
+
∑⨯-⋅⋅⋅∈+-+μ}1,{1,),()))0,1(),(())0,1(),(((M j i d j i f j i f d
μ⨯-⋅⋅⋅∈}1,{1,),(M j i +
∑⋅⋅⋅⨯∈+-+1}-M ,{1,),()))1,0(),(())1,0(),(((μj i d j i f j i f d 1}-M ,{1,),(⋅⋅⋅⨯∈μj i
Goal:min f c(A,B,f).
In other words, the problem is to find the mapping from A onto B that minimizes the distance between the mapped gray values together with a measure for the distortion introduced by the mapping. Here, the distortion is measured by the deviation from the identity mapping in the two dimensions. The identity mapping fulfills f(i,j)=(i,j),and therefore ,f((i,j)+(x,y))=f(i,j)+(x,y)
The corresponding decision problem is fixed by the following
Question:Given an instance of image matching and a cost c′, does there exist a ma pping f such that c(A,B,f)≤c′?
In the definition of the problem some care must be taken concerning the distance functions. For example, if either one of the distance functions is a constant function, the problem is clearly in P (for d g constant, the minimum is given by the identity mapping and for d d constant, the minimum can be determined by sorting all possible matching for each pixel by gray value cost and mapping to one of the pixels with minimum cost). But these special cases are not those we are concerned with in image matching in general.We choose the matching problem of Uchida and Sakoe (1998) to complete the definition of the problem. Here, the mapping functions are restricted by continuity and monotonicity constraints: the deviations from the identity mapping may locally be at most one pixel (i.e. limited to the eight-neighborhood with squared Euclidean distance less than or equal to 2). This can be formalized in this approach by
choosing the functions f
1,f
2
as e.g.
f 1=id,f
2
(x)=step(x):=
⎩
⎨
⎧
.2
,
)
10
(
,2
,0
>
≤
⋅x
G
x
M
M
3. Reduction from 3-SAT
3-SAT is a very well-known NP-complete problem (Garey and Johnson, 1979), where 3-SAT is defined as follows:
Instance: Collection of clauses C={C
1,···,C
K
} on a set of variables X={x
1
, (x)
L
}
such that each c
k
consists of 3 literals for k=1,···K .Each literal is a variable or the negation of a variable.
Question:Is there a truth assignment for X which satisfies each clause c
k
, k=1,···K ?
The dependency graph D(Ф)corresponding to an instance Ф of 3-SAT is defined to be the bipartite graph whose independent sets are formed by the set of clauses C
and the set of variables X .Two vert ices c
k and x
1
are adjacent iff c
k
involves
x 1or
-
x
L
.
Given any 3-SAT formula U, we show how to construct in polynomial time an
equivalent image matching problem l(Ф)=(A(Ф),B(Ф)); . The two images of l (Ф)are similar according to the cost function (i.e.f:c(A(Ф),B(Ф),f)≤0) iff the formulaФ is satisfiable. We perform the reduction from 3-SAT using the following steps:
• From the formula Ф we construct the dependency graph D(Ф).
• The dependency graph D(Ф)is drawn in the plane.
• The drawing of D(Ф)is refined to depict the logical behaviour of Ф , yielding two images(A(Ф),B(Ф)).
For this, we use three types of components: one component to represent variables of Ф , one component to represent clauses of Ф, and components which act as interfaces between the former two types. Before we give the formal reduction, we introduce these components.
3.1. Basic components
For the reduction from 3-SAT we need five components from which we will construct the in-stances for image matching , given a Boolean formula in 3-DNF,
respectively its graph. The five components are the building blocks needed for the graph drawing and will be introduced in the following, namely the representations of connectors,crossings, variables, and clauses. The connectors represent the edges and have two varieties, straight connectors and corner connectors. Each of the components consists of two parts, one for image A and one for image B , where blank pixels are considered to be of the‘background ’color.
We will depict possible mappings in the following using arrows indicating the direction of displacement (where displacements within the eight-neighborhood of a pixel are the only cases considered). Blank squares represent mapping to the respective counterpart in the second image.For example, the following displacements of neighboring pixels can be used with zero cost:
On the other hand, the following displacements result in costs greater than zero:
Fig. 1 shows the first component, the straight connector component, which consists of a line of two different interchanging colors,here denoted by the two symbols◇and□. Given that the outside pixels are mapped to their respe ctive counterparts and the connector is continued infinitely, there are two possible ways in which the colored pixels can be mapped, namely to the left (i.e. f(2,j)=(2,j-1)) or to the right (i.e. f(2,j)=(2,j+1)),where the background pixels have different possibilities for the mapping, not influencing the main property of the connector. This property, which justifies the name ‘connector ’, is the following: It is not possible to find a mapping, which yields zero cost where the relative displacements of the connector pixels are not equal, i.e. one always has f(2,j)-(2,j)=f(2,j')-(2,j'),which can easily be observed by induction over j'.That is, given an initial displacement of one pixel (which will be ±1 in this context), the remaining end of the connector has the same displacement if overall costs of the mapping are zero. Given this property and the direction of a connector, which we define to be directed from variable to clause, we
can define the state of the connector as carrying the‘true’truth value, if the displacement is 1 pixel in the direction of the connector and as carrying the‘false’ truth value, if the displacement is -1 pixel in the direction of the connector. This property then ensures that the truth value transmitted by the connector cannot change at mappings of zero cost.
Image A image B
mapping 1 mapping 2
Fig. 1. The straight connector component with two possible zero cost mappings.
For drawing of arbitrary graphs, clearly one also needs corners,which are represented in Fig. 2.By considering all possible displacements which guarantee overall cost zero, one can observe that the corner component also ensures the basic connector property. For example, consider the first depicted mapping, which has zero cost. On the other hand, the second mapping shows, that it is not possible to construct a zero cost mapping with both connectors‘leaving’the component. In that case, the pixel at the position marked‘? ’either has a conflict (that i s, introduces a cost greater than zero in the criterion function because of mapping mismatch) with the pixel above or to the right of it,if the same color is to be met and otherwise, a cost in the gray value mismatch term is introduced.
image A image B
mapping 1 mapping 2
Fig. 2. The corner connector component and two example mappings.
Fig. 3 shows the variable component, in this case with two positive (to the left) and one negated output (to the right) leaving the component as connectors. Here, a fourth color is used, denoted by ·.This component has two possible mappings for the
colored pixels with zero cost, which map the vertical component of the source image to the left or the right vertical component in the target image, respectively. (In both cases the second vertical element in the target image is not a target of the mapping.) This ensures±1 pixel relative displacements at the entry to the connectors. This property again can be deducted by regarding all possible mappings of the two images.The property that follows (which is necessary for the use as variable) is that all zero cost mappings ensure that all positive connectors carry the same truth value,which is the opposite of the truth value for all the negated connectors. It is easy to see from this example how variable components for arbitrary numbers of positive and negated outputs can be constructed.
image A image B
Image C image D
Fig. 3. The variable component with two positive and one negated output and two possible mappings (for true and false truth value).
Fig. 4 shows the most complex of the components, the clause component. This component consists of two parts. The first part is the horizontal connector with a 'bend' in it to the right.This part has the property that cost zero mappings are possible for all truth values of x and y with the exception of two 'false' values. This two input disjunction,can be extended to a three input dis-junction using the part in the lower left. If the z connector carries a 'false' truth value, this part can only be mapped one pixel downwards at zero cost.In that case the junction pixel (the fourth pixel in the third row) cannot be mapped upwards at zero cost and the 'two input clause' behaves as de-scribed above. On the other hand, if the z connector carries a 'true' truth value, this part can only be mapped one pixel upwards at zero cost,and the junction pixel can be mapped upwards,thus allowing both x and y to carry a 'false' truth value in a zero cost mapping. Thus there exists a zero cost mapping of the clause component iff at least one of the input connectors carries a truth value.
image A
image B mapping 1(true,true,false)
mapping 2 (false,false,true,)
Fig. 4. The clause component with three incoming connectors x, y , z and zero cost mappings for
the two cases(true,true,false)and (false, false, true).
The described components are already sufficient to prove NP-completeness by reduction from planar 3-SAT (which is an NP-complete sub-problem of 3-SAT where the additional constraints on the instances is that the dependency graph is planar),but in order to derive a reduction from 3-SAT, we also include the possibility of crossing connectors.
Fig. 5 shows the connector crossing, whose basic property is to allow zero cost mappings if the truth–values are consistently propagated. This is assured by a color change of the vertical connector and a 'flexible' middle part, which can be mapped to four different positions depending on the truth value distribution.
image A
image B
zero cost mapping
Fig. 5. The connector crossing component and one zero cost mapping.
3.2. Reduction
Using the previously introduced components, we can now perform the reduction from 3-SAT to image matching .
Proof of the claim that the image matching problem is NP-complete:Clearly, the image matching problem is in NP since, given a mapping f and two images A and B ,the computation of c(A,B,f)can be done in polynomial time. To prove NP-hardness, we construct a reduction from the 3-SAT problem. Given an instance of 3-SAT we construct two images A and B , for which a mapping of cost zero exists iff all the clauses can be satisfied.
Given the dependency graph D ,we construct an embedding of the graph into a 2D pixel grid, placing the vertices on a large enough distance from each other (say
100(K+L)² ).This can be done using well-known methods from graph drawing (see e.g.di Battista et al.,1999).From this image of the graph D we construct the two images A and B , using the components described above.Each vertex belonging to a variable is replaced with the respective parts of the variable component, having a number of leaving connectors equal to the number of incident edges under consideration of the positive or negative use in the respective clause. Each vertex belonging to a clause is replaced by the respective clause component,and each crossing of edges is replaced by the respective crossing component. Finally, all the edges are replaced with connectors and corner connectors, and the remaining pixels inside the rectangular hull of the construction are set to the background gray value. Clearly, the placement of the components can be done in such a way that all the components are at a large enough distance from each other, where the background pixels act as an 'insulation' against mapping of pixels, which do not belong to the same component. It can be easily seen, that the size of the constructed images is polynomial with respect to the number of vertices and edges of D and thus polynomial in the size of the instance of 3-SAT, at most in the order (K+L)².Furthermore, it can obviously be constructed in polynomial time, as the corresponding graph drawing algorithms are polynomial.
Let there exist a truth assignment to the variables x
1,…,x
L
, which satisfies all
the clauses c
1,…,c
K
. We construct a mapping f , that satisfies c(f,A,B)=0 as
follows.
For all pixels (i, j ) belonging to variable component l with A(i,j)not of the background color,set f(i,j)=(i,j-1)if x
l
is assigned the truth value 'true' , set f(i,j)=(i,j+1), otherwise. For the remaining pixels of the variable component set A(i,j)=B(i,j),if f(i,j)=(i,j), otherwise choose f(i,j)from
{(i,j+1),(i+1,j+1),(i-1,j+1)}for x
l
'false' respectively from {(i,j-1),(i+1,j-1),(i-1,j-1)}
for x
l
'true ',such that A(i,j)=B(f(i,j)). This assignment is always possible and has zero cost, as can be easily verified.
For the pixels(i,j)belonging to (corner) connector components,the mapping function can only be extended in one way without the introduction of nonzero cost,starting from the connection with the variable component. This is ensured by the
basic connector property. By choosing f (i ,j )=(i,j )for all pixels of background color, we obtain a valid extension for the connectors. For the connector crossing components the extension is straight forward, although here ––as in the variable mapping ––some care must be taken with the assign ment of the background value pixels, but a zero cost assignment is always possible using the same scheme as presented for the variable mapping.
It remains to be shown that the clause components can be mapped at zero cost, if at least one of the input connectors x , y , z carries a ' true' truth value.For a proof we regard alls even possibilities and construct a mapping for each case. In the
description of the clause component it was already argued that this is possible,and due to space limitations we omit the formalization of the argument here.
Finally, for all the pixels (i ,j )not belonging to any of the components, we set f (i ,j )=(i ,j )thus arriving at a mapping function which has c (f ,A ,B )=0。
as all colors are preserved in the mapping and no distortion of squared Euclidean di- stance greater than 2 is introduced.
On the other hand, let there exist a mapping f with c (f ,A ,B )≤0 ,This means that c (f ,A ,B )=0 since there are only positive cost terms involved in the term for c. Furthermore, we can conclude,that
μμ⨯∈∀),(j i ,0),(),(=j i f ij g B A d
μ⨯-⋅⋅⋅∈∀}1{1,),(M j i
0)))0,1(),(()0,1(),(((=+-+j i f j i f d d
1}-M ,{1,),(⋅⋅⋅⨯∈∀μj i
0)))1,0(),(())1,0(),(((=+-+j i f j i f d d
This means that f maps all pixels onto pixels of the same color and that the difference in mapping between neighboring pixels has at most squared Euclidean distance 2.
These facts now ensure a number of basic properties:
(1) The basic elements of the components and the overall represented graph are mapped to their corresponding parts, because of the monotonicity and continuity assured by the maximum local mapping difference.
(2) All the variable components are mapped such that the leaving connectors all
carry consistent truth values (i.e. all the positive outputs carry the same truth value, and all the negative outputs carry the negated truth value).
(3) All (corner, crossing, straight) connectors are mapped such that the basic connector property of propagated truth values is fulfilled, since no neighboring pixels are mapped into opposite directions.
(4) Each clause component has at least one entering connector carrying a 'true' ruth value, as otherwise a mapping with zero cost is not possible.
Now we construct an assignment of truth values to the variables x
1,…,x
L
,
which satisfies all the clauses c
1,…,c
K
. We set x
l
to ' true ',if f(i,j)=(i,j-1)
for the pixels of the corresponding variable component, which are not background pixels.We set x
l
to 'false ', otherwise. By means of the properties stated above we
can conclude that this assignment satisfies all the clauses c
1,…,c
K
,which concludes
the proof.
4. Discuss ion and conclusion
With the completion of the proof some open questions remain. From the theoretical point of view it is interesting to understand for which distortion measure the problem becomes NP-complete, as indicated by the question marks in Table 1 and in which way the result is influenced if we allow interpolation of the discrete pixel array. Another interesting theoretical question is if the problem remains NP-complete if we allow only less than the four colors used in the proof here.
In the presented proof, the displacements of the pixels in the mapping are at most one pixel for each zero cost mapping between two constructed images. This may seem to be a restriction with respect to the general matching problem, which allows larger displacements. But obviously the general problem is NP-complete, if already the class of problems which contains only these restricted versions is NP-complete.
With the proof that the problem image matching is NP-complete, we have arrived at a fundamental result for a problem that has been studied thoroughly in the literature before. The works of Uchida and Sakoe deal with exactly the problem formalized here and they present a variety of exponential algorithms and restrictions to apply to the problem in order to make it tractable. Roneeet al. (2001) state that the computation ‘‘is in the exponential order of the image size’’, but they do not elab orate
on this statement. In the same manner, Levin and Pieraccini (1992) discuss a very similar problem and state that their algorithm is exponential in the general case, but do not go into more detail on this question. They propose to regard the lines of the images independently to keep the problem tractable. This is a similar approach to the use of pseudo 2D hidden Markov models as presented in (Kuo and Agazzi, 1994), but the drawback is that any correspondence information between the deformation of the lines is disregarded, which may not be advisable in all domains. Samaria (1994) writes for the case of face recognition that ‘‘In the general case, the HMM would need to be 2D’’, but also restricts the problem to a 1D one. Moore (1979) already discusses a similar task which is dealt with using a generalization of the 1D edit- or Levenshtein-distance,which also disregards local dependencies between certain lines. There are also works that view the image-warping problem based on a physical model of the gray value-surface that is represented by an image (Moghaddam et al., 1996) and others that use variational approaches that are similar to the computation of optical flow (Schnorr and Weickert, 2000), which is inherently linked with the problem of image registration (Fischer and Modersitzki, 2001).
We may conclude that despite of the fundamental result that the image matching problem is NP-complete there is a variety of methods to modify the basic problem such that it becomes tractable or to approximate the best solution. It is an interesting question to find out which of the possibilities is best suited for which practical application.
References
[1] M.J. Black and P. Anandan, ªThe Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields,ºComputer Vision and Image Understanding, vol. 63, no. 1, pp. 75-104, Jan. 1996.
[2] M.J. Black and A. Rangarajan, ªOn the Unification of Line Processes,
Outlier Rejection, and Robust Statistics with Applications in Early Vision,º
Int'l J. Computer Vision, vol. 19, no. 1, pp. 57-91, July 1996.
[3] C.F. Olson, ªMaximum-Likelihood Template Matching,ºProc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 52-57, 2000.
[4] D.P. Huttenlocher, G.A. Klanderman, and W.J. Rucklidge, ªComparing Images Using the Hausdorff Distance,ºIEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 15, no. 9, pp. 850-863, Sept. 1993
[5] K. Grauman and T. Darrell. Fast Contour Matching Using Approximate Earth Mover’s Distance. In Proc. CVPR,Washington D.C., June 2004.
[6] K. Grauman and T. Darrell. Pyramid Match Kernels:Discriminative Classification with Sets of Image Features Technical Report AIM-2005-007, MIT CSAIL,March 2005.
[7] P. Indyk and N. Thaper. Fast Image Retrieval via Em beddings.In 3rd Intl Workshop on Statistical and Computational Theories of Vision, Nice, France, Oct 2003.
[8] Y. Ke and R. Sukthankar. PCA-SIFT: A More Distinctive Representation for Local Image Descriptors. In Proc. CVPR,Washington, D.C., June 2004.
[9] S. Lazebnik, C. Schmid, and J. Ponce. Affine-Invariant Local Descriptors and Neighborhood Statistics for Texture Local Descriptors and Neighborhood Statistics for Texture Recognition. In Proc. ICCV, Nice, France, Oct 2003.
[10] S. Lazebnik, C. Schmid, and J. Ponce. A Sparse Texture Representation Using Affine-Invariant Regions. In Proc CVPR, Madison, WI, June 2003.
.
弹性图像匹配
摘要
图像识别的一个根本问题是建立两个图像的相似性,这个可以通过寻找最后像素到考虑了单调性和约束性的像素映射来完成。
我们发现这个问题是NP完全还原3-SAT,从而有证据表明,已知的指数时间算法是合理的,但近似算法和简化是必要的。
关键词:弹性图像匹配;二维变形;NP完全问题。
1.介绍
在图像识别中,一个共同的问题是匹配给定的两个图像。
例如,比较观察图
像,给出参考。
在这个过程中,弹性图像匹配,二维翘曲或相似类型不变量方法可以运用。
为此,我们可以通过在图像匹配中引入的失真定义成本函数,从而对给定成本函数搜索出最佳匹配。
在本文中,我们发现决定两个匹配图像间是否存在成本低于某一阀值是一个复杂算法问题。
我们还发现这个图像匹配问题是个通过从3-SAT减少的NP完全问题,这是一种常见的表明问题内在困难的方法(加里和约翰逊,1997)。
这一结果表明,这种类型的图像比较存在固有的计算困难。
而有趣的是,同样的问题在多项式时间算法中是可解的一维序列。
例如,语音识别中的时间翘曲问题(参见例如奈伊等人,1992)。
这有以下几种含义:对这一问题准确答案感兴趣的研究人员不能希望找到一个多项式时间算法,除非N=NP。
此外,人们可以得出这样的结论,被内田和撒克提出并延长的指数时间算法可以认为是有正当理由的一些图像匹配应用。
另一方面,这表明,对更快算法如模式识别感兴趣的人在寻找次优解决方案中是正确的。
一种方法是限制凯塞斯等人2000年提出的在全球变革中的局部优化和线性近似,另一种可能性是使用启发式方法,如用模拟退火和遗传算法找到一个近似解。
此外,束搜索的方法如有希望的候选人。
因为这些被成功地用于语言识别,虽然语言解码也是一个NP完全问题(看瑞波特和德拉1999年提出)。
2.图像匹配
在各种各样的匹配算法中,我们选择钗达和撒克1998年提出的作为形成图像匹配的一个起点。
在不丧失概括性的条件下,在一个有限字母表&={1,…,G}中让图像和分别为节点标签的灰度值作正方形网络大小的运动。
为了定义这个问题,需要两个距离函数,一个相当于灰度值,g d :&×&→N ,测量在灰度值中的匹配性。
另一个相当于位移的差异,d d :Z ×Z →N ,测量由匹配引起的失真,对那些距离函数,我们假定它们在常用的平方距离中是可计算多项式时间的单调函数。
即:随着f 1,f 2的单调增长,d g (g 1,g 2)=f 1(||g 1-g 2||²)且d d (z)=f 2(||z||²)。
现在我们称下面的优化问题为图像匹配问题(令µ={1,…M})。
实例:大小为M ×M 的两个图像A 和B 的匹配(A,B )
结论:A 的映射函数f :µ×µ→µ×µ.
方法:c (A,B,f )=),(),(j i f ij g B A d ∑
μμ⨯∈),(j i
+
∑⨯-⋅⋅⋅∈+-+μ}1,{1,),()))0,1(),(())0,1(),(((M j i d j i f j i f d
μ⨯-⋅⋅⋅∈}1,{1,),(M j i
+∑⋅⋅⋅⨯∈+-+1}-M ,{1,),()))1,0(),(())1,0(),(((μj i d j i f j i f d 1}-M ,{1,),(⋅⋅⋅⨯∈μj i
目的:min f c(A,B,f).
换句话说,主要问题是找到从A 到B 的映射,从而缩小映射灰度值和由映射引入的衡量失真之间的距离。
这儿的失真是通过二维平面内恒等映射的偏离来测量的,恒等映射实现f(i,j)=(i,j),因此,f((i,j)+(x,y))=f(i,j)+(x,y)
相应的判断问题可以通过以下来说明。
问题:给出一个图形匹配的例子和损耗c ′,那么是否存在一个映射f 使c(A,B,f)≤c ′?
在定义一些问题的时候,必须注意采用的距离函数。
例如,如果任何一个距离函数是连续函数,那么在映射中问题就会很明显(d g 连续,最小值用恒等映射给出,d d 连续,最小值可以通过灰度值成本和映射到一个像素中的最小成本在每个像素所有可能匹配到的排序中确定),但是一般而言,这些特殊的情况不是我们在图像匹配中所关注的。
我们选择钗达和撒克的匹配问题去完成问题的定义,这里的映射函数受连续性和单调性的限制:恒等映射中出现的误差最多可以是一个像素(即限制八邻域
的平方欧氏距离小于或等于2),用这种方法可以通过选择函数f
1,f
2
作为例子来
形成。
f 1=id,f
2
(x)=step(x):=
⎩
⎨
⎧
.2
,
)
10
(
,2
,0
>
≤
⋅x
G
x
M
M
3.降低3-SAT
3-SAT是一个非常著名的NP完全问题(加里和约翰逊,1997),在这一问题里,3-SAT定义如下:
例如:收集条款C={C
1,···,C
K
},一组变量X={x
1
, (x)
L
},这样
每个C由3个常量构成,每个文字是一个变量或一个常量。
问题:有一个X的真值,它满足每个c
k
,k=1,···K?
对应3-SAT例子的图D(&)的独立性定义为偶图,它的独立集是由一套条款
和设置的变量X构成。
两定点c
k 和x
1
相邻,当且仅当c
k
涉及x
1
或
-
x
L。
给定任何3-SAT公式Ф,我们将展示在多项式l(Ф)=(A(Ф),B(Ф))里如何构建一个相当于图像匹配问题。
l(Ф)的这两个图像相似按成本功能(i.e.f:c(A(Ф),B(Ф),f)≤0),当公式Ф满足时,我们使用一下步骤从3-SAT中实现减少:
·从公式Ф,我们构建了图D(Ф)的独立性。
·图D(Ф)的独立性绘制在平面上。
·D(Ф)绘制的精致取决于对&逻辑行为的描绘,产生(A(Ф),B(Ф))两个图像。
为此,我们使用三种类型的组成部分:一个部分代表变量Ф,一个部分代表条款Ф,以及组件作为前两个类型的接口。
在我们给予正式的减少之前,我们介绍这些组件。
3.1基本组件
为了从3-SAT中减少,我们需要从我们即将建设的实例图像匹配中拿出五个组件,在3-DNF中给出一个布尔公式。
对应各自的图,五组件是在图形描绘中所需要建设的块,随后将被连接,即表示连接器,插口,变量和条款。
连接器代表边缘有两种:直连接器和角连接器。
每个组件由两部分组成,一个图像A 和一个图像B,在图像中白像素被认为是背景颜色。
下面,我们将用指示方向位移的箭头描述可能的映射(如像素的周围位移是。