Problem description
I want to do some Structure-from-Motion using OpenCV. So far I have the fundamental matrix and the essential matrix. From the essential matrix, I compute an SVD to get R and T.
My problem is that I get 2 possible solutions for R and 2 possible solutions for T, which gives 4 candidate solutions for the overall pose, only one of which is correct. How can I find the correct one?
Here is my code:
private void calculateRT(Mat E, Mat R, Mat T){
    Mat w = new Mat();
    Mat u = new Mat();
    Mat vt = new Mat();
    Mat tmp = new Mat();

    // Enforce the essential-matrix constraint: singular values (1, 1, 0)
    Mat diag = new Mat(3,3,CvType.CV_64FC1);
    double[] diagVal = {1,0,0,0,1,0,0,0,0};
    diag.put(0, 0, diagVal);

    Mat newE = new Mat(3,3,CvType.CV_64FC1);

    Core.SVDecomp(E, w, u, vt);
    // Core.gemm computes alpha*src1*src2 + beta*src3,
    // so the triple product u * diag * vt needs two calls
    Core.gemm(u, diag, 1, new Mat(), 0, tmp);
    Core.gemm(tmp, vt, 1, new Mat(), 0, newE);
    Core.SVDecomp(newE, w, u, vt);

    publishProgress("U: " + u.dump());
    publishProgress("W: " + w.dump());
    publishProgress("vt:" + vt.dump());

    double[] W_Values = {0,-1,0,1,0,0,0,0,1};
    Mat W = new Mat(new Size(3,3), CvType.CV_64FC1);
    W.put(0, 0, W_Values);

    // Transpose of W (the original listing was missing a comma here)
    double[] Wt_values = {0,1,0,-1,0,0,0,0,1};
    Mat Wt = new Mat(new Size(3,3), CvType.CV_64FC1);
    Wt.put(0,0,Wt_values);

    Mat R1 = new Mat();
    Mat R2 = new Mat();

    // R1 = u * W * vt, R2 = u * Wt * vt (2 possible solutions for R)
    Core.gemm(u, W, 1, new Mat(), 0, tmp);
    Core.gemm(tmp, vt, 1, new Mat(), 0, R1);
    Core.gemm(u, Wt, 1, new Mat(), 0, tmp);
    Core.gemm(tmp, vt, 1, new Mat(), 0, R2);

    publishProgress("R1: " + R1.dump());
    publishProgress("R2: " + R2.dump());

    // +- T (2 possible solutions for T)
    Mat T1 = new Mat();
    Mat T2 = new Mat();
    // T1 = third column of u, T2 = -T1
    u.col(2).copyTo(T1);
    publishProgress("T1: " + T1.dump());
    Core.multiply(T1, new Scalar(-1.0, -1.0, -1.0), T2);

    // TODO Here I have to find the correct combination for R1 R2 and T1 T2
}
Recommended answer
There is a theoretical ambiguity when reconstructing the relative Euclidean pose of two cameras from their essential matrix. This ambiguity comes from the fact that, given a 2D point in an image, the classic pinhole camera model cannot tell whether the corresponding 3D point lies in front of the camera or behind it. To remove the ambiguity, you need one point correspondence between the images: since the two 2D points are assumed to be projections of a single 3D point lying in front of both cameras (it is visible in both images), that correspondence lets you choose the right R and T.
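One way to make the "in front of both cameras" condition concrete is to solve for the depth of the point along each viewing ray under a candidate (R, T) and look at the signs. The sketch below is only an illustration under stated assumptions: the first camera is [ I | 0 ], the second is [ R | T ], and d1, d2 are normalized image points, that is inv(K1) * x1 and inv(K2) * x2 scaled to (x, y, 1). The class and method names are hypothetical and not part of OpenCV.

```java
/** Illustrative helper (hypothetical names): depth test for one candidate pose. */
final class DepthCheck {

    /** Multiplies a 3x3 matrix by a 3-vector. */
    static double[] mul(double[][] m, double[] v) {
        double[] r = new double[3];
        for (int i = 0; i < 3; i++)
            r[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
        return r;
    }

    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    /**
     * Least-squares solution of  l2 * d2 = l1 * (R * d1) + T  for the depths
     * l1 (camera 1) and l2 (camera 2). Assumes d1 and d2 are normalized
     * image points (x, y, 1).
     */
    static double[] depths(double[][] R, double[] T, double[] d1, double[] d2) {
        double[] a = mul(R, d1);               // ray of camera 1, rotated into camera 2
        // Normal equations of  [a, -d2] * [l1; l2] = -T
        double aa = dot(a, a), ab = -dot(a, d2), bb = dot(d2, d2);
        double ra = -dot(a, T), rb = dot(d2, T);
        double det = aa * bb - ab * ab;        // zero when the two rays are parallel
        double l1 = (ra * bb - ab * rb) / det;
        double l2 = (aa * rb - ab * ra) / det;
        return new double[] { l1, l2 };
    }

    /** True when the point lies in front of both cameras for this candidate. */
    static boolean inFrontOfBoth(double[][] R, double[] T, double[] d1, double[] d2) {
        double[] l = depths(R, T, d1, d2);
        return l[0] > 0 && l[1] > 0;
    }
}
```

Calling inFrontOfBoth for each of the four (R, T) combinations with the same correspondence should leave exactly one candidate, provided the correspondence is not degenerate (for example, a point at infinity or on the baseline).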
For that purpose, one method is explained in § 6.1.4 (p. 47) of the following PhD thesis: "Geometry, constraints and computation of the trifocal tensor", by C. Ressl (PDF). The outline of this method is given below. I'll denote the two corresponding 2D points by x1 and x2, the two camera matrices by K1 and K2, and the essential matrix by E12.
i. Compute the SVD of the essential matrix E12 = U * S * V'. If det(U) < 0, set U = -U. If det(V) < 0, set V = -V.
ii. Define W = [0,-1,0; 1,0,0; 0,0,1], R2 = U * W * V' and T2 = third column of U
iii. Define M = [ R2'*T2 ]x, X1 = M * inv(K1) * x1 and X2 = M * R2' * inv(K2) * x2
iv. If X1(3) * X2(3) < 0, set R2 = U * W' * V' and recompute M and X1
v. If X1(3) < 0, set T2 = -T2
vi. Define P1_E = K1 * [ I | 0 ] and P2_E = K2 * [ R2 | T2 ]
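As a rough sketch of the bookkeeping in steps i, ii and vi (not of the sign tests in steps iii to v), the following plain-array Java builds both rotation candidates and the translation candidate from U and V, and assembles a projection matrix. All names are hypothetical. Note that OpenCV's SVDecomp returns vt = V', so V here would be the transpose of that output.

```java
/** Illustrative sketch (hypothetical names) of steps i, ii and vi with plain arrays. */
final class PoseCandidates {

    static final double[][] W = { {0, -1, 0}, {1, 0, 0}, {0, 0, 1} };

    static double det(double[][] m) {
        return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
             - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
             + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    }

    static double[][] mul(double[][] a, double[][] b) {
        double[][] c = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                for (int k = 0; k < 3; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                t[j][i] = a[i][j];
        return t;
    }

    /** Step i: negate a factor whose determinant is negative. */
    static double[][] fixSign(double[][] m) {
        if (det(m) >= 0) return m;
        double[][] n = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                n[i][j] = -m[i][j];
        return n;
    }

    /** Steps ii and iv: the two rotation candidates U*W*V' and U*W'*V'. */
    static double[][][] rotations(double[][] U, double[][] V) {
        double[][] u = fixSign(U);
        double[][] vt = transpose(fixSign(V));
        return new double[][][] {
            mul(mul(u, W), vt),
            mul(mul(u, transpose(W)), vt)
        };
    }

    /** Step ii: T2 is the third column of U (after the sign fix); the other candidate is -T2. */
    static double[] translation(double[][] U) {
        double[][] u = fixSign(U);
        return new double[] { u[0][2], u[1][2], u[2][2] };
    }

    /** Step vi: the 3x4 projection matrix K * [ R | T ]. */
    static double[][] projection(double[][] K, double[][] R, double[] T) {
        double[][] p = new double[3][4];
        for (int i = 0; i < 3; i++)
            for (int k = 0; k < 3; k++) {
                for (int j = 0; j < 3; j++) p[i][j] += K[i][k] * R[k][j];
                p[i][3] += K[i][k] * T[k];
            }
        return p;
    }
}
```

Because the sign fix makes det(U) and det(V) non-negative and det(W) = 1, both rotation candidates come out as proper rotations with determinant +1, which is why step i matters.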
The notation ' denotes the transpose, and the notation [.]x used in step iii corresponds to the skew-symmetric operator. Applying the skew-symmetric operator to a 3x1 vector e = [e_1; e_2; e_3] results in the following (see the Wikipedia article on the cross product):
[e]x = [0,-e_3,e_2; e_3,0,-e_1; -e_2,e_1,0]
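The defining property of this operator is that [e]x * v equals the cross product e × v for any 3-vector v. A minimal sketch with hypothetical helper names that lets you check this numerically:

```java
/** Illustrative sketch (hypothetical names): the skew-symmetric operator [e]x. */
final class SkewDemo {

    /** Builds [e]x = [0,-e_3,e_2; e_3,0,-e_1; -e_2,e_1,0]. */
    static double[][] skew(double[] e) {
        return new double[][] {
            {     0, -e[2],  e[1] },
            {  e[2],     0, -e[0] },
            { -e[1],  e[0],     0 }
        };
    }

    /** Multiplies a 3x3 matrix by a 3-vector. */
    static double[] mul(double[][] m, double[] v) {
        double[] r = new double[3];
        for (int i = 0; i < 3; i++)
            r[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
        return r;
    }

    /** Ordinary cross product a x b, for comparison. */
    static double[] cross(double[] a, double[] b) {
        return new double[] {
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]
        };
    }
}
```

For e = (1, 2, 3) and v = (4, 5, 6), both mul(skew(e), v) and cross(e, v) give (-3, 6, -3).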
Finally, note that the norm of T2 will always be 1, since it is one of the columns of an orthogonal matrix. This means that you won't be able to recover the true distance between the two cameras. For that, you need to know the true distance between two points in the scene and use it to compute the true distance between the cameras.
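A minimal sketch of that rescaling, assuming you have already triangulated two scene points with the unit-norm T2 and know their real-world separation (all names are hypothetical):

```java
/** Illustrative sketch (hypothetical names): recovering metric scale from one known distance. */
final class ScaleRecovery {

    /** Euclidean distance between two 3D points. */
    static double distance(double[] p, double[] q) {
        double dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    /**
     * Factor that converts reconstruction units into real-world units:
     * known true distance divided by the distance measured in the reconstruction.
     */
    static double scale(double[] reconstructedA, double[] reconstructedB, double trueDistance) {
        return trueDistance / distance(reconstructedA, reconstructedB);
    }

    /** Applies the scale factor to the unit-norm translation. */
    static double[] metricTranslation(double[] unitT, double s) {
        return new double[] { s * unitT[0], s * unitT[1], s * unitT[2] };
    }
}
```

The same factor applies to every reconstructed point, so the norm of the scaled translation is the distance between the cameras in real-world units.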