11SFM 计算机视觉 Berkeley课件
合集下载
相关主题
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
Xj
x1j x3j P1 x2j P3 Lazebnik P2
Structure from motion ambiguity
• If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same:
1 x = PX = P (k X) k
It is impossible to recover the absolute scale of the scene!
Lazebnik
Structure from motion ambiguity
• If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same • More generally: if we transform the scene using a transformation Q and apply the inverse transformation to the camera matrices, then the images do not change
Parallel Projection
Lazebnik
Affine cameras
• A general affine camera combines the effects of an affine transformation of the 3D space, orthographic projection, and an affine transformation of the image:
x = PX = PQ
Lazebnik
(
-1
)(QX)
Projective ambiguity
x = PX = PQ
Lazebnik
(
-1 P
)(Q X)
P
Projective ambiguity
Lazebnik
Affine ambiguity
Affine
x = PX = PQ
Lazebnik
– Case example with parallel optical axes – General case with calibrated cameras
• Correspondence search • The Essential and the Fundamental Matrix • Multi-view stereo
•
•
Lazebnik
Structure from motion
• Given: m images of n fixed 3D points xij = Pi Xj , i = 1, … , m, j = 1, … , n
• Problem: estimate m projection matrices Pi and n 3D points Xj from the mn correspondences xij
Today: SFM
• • • • • • SFM problem statement Factorization Projective SFM Bundle Adjustment Photo Tourism “Rome in a day:
Structure from motion
Lazebnik
Projective 15dof
A vT
t v
Preserves intersection and tangency
Affine 12dof Similarity 7dof
A t 0T 1 s R t 0T 1 R t 0T 1
C280, Computer Vision
Prof. Trevor Darrell trevor@eecs.berkeley.edu Lecture 11: Structure from Motion
Roadmap
• Previous: Image formation, filtering, local features, (Texture)… • Tues: Feature-based Alignment
• Affine projection is a linear mapping + translation in inhomogeneous coordinates
x a11 x= y = a 21
X
x a2 Lazebnik a1
a12 a22
X a13 b1 = AX + b Y + a23 b2 Z
Lazebnik
Affine structure from motion
• Centering: subtract the centroid of the image points
1 n 1 n ˆ ij = x ij − ∑ x ik = A i X j + b i − ∑ (A i X k + b i ) x n k =1 n k =1 1 n ˆ = Ai X j − ∑ Xk = Ai X j n k =1
• No classes next week: ICCV conference • Oct 6th: Stereo / ‘Multi-view’: Estimating depth with known inter-camera pose • Oct 8th: ‘Structure-from-motion’: Estimation of pose and 3D structure
(
-1 A
)(Q X)
A
Affine ambiguity
Lazebnik
Similarity ambiguity
x = PX = PQ
Lazebnik
(
-1 S
)(Q X)
S
Similarity ambiguity
Lazebnik
Hierarchy of 3D transformations
Preserves parallellism, volume ratios
Preserves angles, ratios of length
Euclidean 6dof
Preserves angles, lengths
Lazebnik
• With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction • Need additional information to upgrade the reconstruction to affine, similarity, or Euclidean
Projection of world origin
Affine structure from motion
• Given: m images of n fixed 3D points: xij = Ai Xj + bi , i = 1,… , m, j = 1, … , n • Problem: use the mn correspondences xij to estimate m projection matrices Ai and translation vectors bi, and n points Xj • The reconstruction is defined up to an arbitrary affine transformation Q (12 degrees of freedom):
A b A b −1 0 1 → 0 1 Q ,
X X 1 → Q 1
• We have 2mn knowns and 8m + 3n unknowns (minus 12 dof for affine ambiguity) • Thus, we must have 2mn >= 8m + 3n – 12 • For two views, we need four point correspondences
– Stitching images together – Homographies, RANSAC, Warping, Blending – Global alignment of planar models
• Today: Dense Motion Models
– Local motion / feature displacement – Parametric optic flow
– Factorization approaches – Global alignment with 3D point models
Last time: Stereo
• Human stereopsis & stereograms • Epipolar geometry and the epipolar constraint
• Distance from center of projection to image plane is infinite
Image World
• Projection matrix:
Lazebnik
Slide by Steve Seitz
Affine cameras
Orthographicwk.baidu.comProjection
Multiple-view geometry questions
• Scene geometry (structure): Given 2D point matches in two or more images, where are the corresponding points in 3D? Correspondence (stereo matching): Given a point in just one image, how does it constrain the position of the corresponding point in another image? Camera geometry (motion): Given a set of corresponding points in two or more images, what are the camera matrices for these views?
• For simplicity, assume that the origin of the world coordinate system is at the centroid of the 3D points • After centering, each normalized point xij is related to the 3D point Xi by
1 0 0 0 a11 [4 × 4 affine] = a P = [3 × 3 affine] 0 1 0 0 21 0 0 0 1 0 a12 a22 0 a13 b1 A b a23 b2 = 0 1 0 1
Structure from motion
• Let’s start with affine cameras (the math is easier)
center at infinity
Lazebnik
Recall: Orthographic Projection
Special case of perspective projection
x1j x3j P1 x2j P3 Lazebnik P2
Structure from motion ambiguity
• If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same:
1 x = PX = P (k X) k
It is impossible to recover the absolute scale of the scene!
Lazebnik
Structure from motion ambiguity
• If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same • More generally: if we transform the scene using a transformation Q and apply the inverse transformation to the camera matrices, then the images do not change
Parallel Projection
Lazebnik
Affine cameras
• A general affine camera combines the effects of an affine transformation of the 3D space, orthographic projection, and an affine transformation of the image:
x = PX = PQ
Lazebnik
(
-1
)(QX)
Projective ambiguity
x = PX = PQ
Lazebnik
(
-1 P
)(Q X)
P
Projective ambiguity
Lazebnik
Affine ambiguity
Affine
x = PX = PQ
Lazebnik
– Case example with parallel optical axes – General case with calibrated cameras
• Correspondence search • The Essential and the Fundamental Matrix • Multi-view stereo
•
•
Lazebnik
Structure from motion
• Given: m images of n fixed 3D points xij = Pi Xj , i = 1, … , m, j = 1, … , n
• Problem: estimate m projection matrices Pi and n 3D points Xj from the mn correspondences xij
Today: SFM
• • • • • • SFM problem statement Factorization Projective SFM Bundle Adjustment Photo Tourism “Rome in a day:
Structure from motion
Lazebnik
Projective 15dof
A vT
t v
Preserves intersection and tangency
Affine 12dof Similarity 7dof
A t 0T 1 s R t 0T 1 R t 0T 1
C280, Computer Vision
Prof. Trevor Darrell trevor@eecs.berkeley.edu Lecture 11: Structure from Motion
Roadmap
• Previous: Image formation, filtering, local features, (Texture)… • Tues: Feature-based Alignment
• Affine projection is a linear mapping + translation in inhomogeneous coordinates
x a11 x= y = a 21
X
x a2 Lazebnik a1
a12 a22
X a13 b1 = AX + b Y + a23 b2 Z
Lazebnik
Affine structure from motion
• Centering: subtract the centroid of the image points
1 n 1 n ˆ ij = x ij − ∑ x ik = A i X j + b i − ∑ (A i X k + b i ) x n k =1 n k =1 1 n ˆ = Ai X j − ∑ Xk = Ai X j n k =1
• No classes next week: ICCV conference • Oct 6th: Stereo / ‘Multi-view’: Estimating depth with known inter-camera pose • Oct 8th: ‘Structure-from-motion’: Estimation of pose and 3D structure
(
-1 A
)(Q X)
A
Affine ambiguity
Lazebnik
Similarity ambiguity
x = PX = PQ
Lazebnik
(
-1 S
)(Q X)
S
Similarity ambiguity
Lazebnik
Hierarchy of 3D transformations
Preserves parallellism, volume ratios
Preserves angles, ratios of length
Euclidean 6dof
Preserves angles, lengths
Lazebnik
• With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction • Need additional information to upgrade the reconstruction to affine, similarity, or Euclidean
Projection of world origin
Affine structure from motion
• Given: m images of n fixed 3D points: xij = Ai Xj + bi , i = 1,… , m, j = 1, … , n • Problem: use the mn correspondences xij to estimate m projection matrices Ai and translation vectors bi, and n points Xj • The reconstruction is defined up to an arbitrary affine transformation Q (12 degrees of freedom):
A b A b −1 0 1 → 0 1 Q ,
X X 1 → Q 1
• We have 2mn knowns and 8m + 3n unknowns (minus 12 dof for affine ambiguity) • Thus, we must have 2mn >= 8m + 3n – 12 • For two views, we need four point correspondences
– Stitching images together – Homographies, RANSAC, Warping, Blending – Global alignment of planar models
• Today: Dense Motion Models
– Local motion / feature displacement – Parametric optic flow
– Factorization approaches – Global alignment with 3D point models
Last time: Stereo
• Human stereopsis & stereograms • Epipolar geometry and the epipolar constraint
• Distance from center of projection to image plane is infinite
Image World
• Projection matrix:
Lazebnik
Slide by Steve Seitz
Affine cameras
Orthographicwk.baidu.comProjection
Multiple-view geometry questions
• Scene geometry (structure): Given 2D point matches in two or more images, where are the corresponding points in 3D? Correspondence (stereo matching): Given a point in just one image, how does it constrain the position of the corresponding point in another image? Camera geometry (motion): Given a set of corresponding points in two or more images, what are the camera matrices for these views?
• For simplicity, assume that the origin of the world coordinate system is at the centroid of the 3D points • After centering, each normalized point xij is related to the 3D point Xi by
1 0 0 0 a11 [4 × 4 affine] = a P = [3 × 3 affine] 0 1 0 0 21 0 0 0 1 0 a12 a22 0 a13 b1 A b a23 b2 = 0 1 0 1
Structure from motion
• Let’s start with affine cameras (the math is easier)
center at infinity
Lazebnik
Recall: Orthographic Projection
Special case of perspective projection