Thursday, March 14, 2013

Why can't I triangulate?

EDIT: Some progress has been made. See my next post.

I've given up trying to reach concrete results before presenting them here. That is obviously leading to a lack of blog posts. So, instead, here is the point at which I am stuck.

I've been trying to use OpenCV to triangulate stuff from my scene using the left and right images from my two PS3 Eye cameras. I've been using the image of the ping pong table to calibrate the exact locations and angles of the cameras with respect to the table, as I would like all my coordinates to be relative to the table for easy comprehension. But it just isn't working. So let me walk you through the steps.

I have a video I've taken of my half-table. The cameras are above the table, about 50cm apart, looking down the center line of the half-table. I have about 45 seconds of just the table that I intend to use for priming background subtraction. Then I have about 10 seconds of me gently bouncing 6 balls across the table.

Landmarks

I've taken a single still image from each camera to use in determining the position of the cameras. Since neither the cameras nor the table are moving, there is no need for synchronization between the eyes. Using these two images, I have manually marked the pixel for a number of "landmarks" on the table: the six line intersections on its surface, plus where the front legs hit the ground. I did this manually because I'm not quite ready to tackle the full "Where's the table?" problem. Marked by hand, each landmark should be accurate to within a pixel or two. I then measured the table (which does, indeed, match regulation specs) and its legs to get the real-world coordinates of these landmarks. Here are the two marked-up images, with green circles around the landmarks.
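In code, this boils down to two small arrays per eye. The world coordinates below are the real measurements quoted later in this post; the pixel coordinates are made-up stand-ins for the hand-clicked values, just to show the shape of the data:

```python
import numpy as np

# World coordinates (mm) of the eight landmarks: the six line
# intersections on the half-table surface, plus the two front legs.
# These match the measurements given later in the post.
object_points = np.array([
    [0.0,    0.0,    0.0],
    [762.5,  0.0,    0.0],
    [1525.0, 0.0,    0.0],
    [0.0,    1370.0, 0.0],
    [762.5,  1370.0, 0.0],
    [1525.0, 1370.0, 0.0],
    [298.5,  203.2, -746.0],
    [1226.5, 203.2, -746.0],
], dtype=np.float32)

# Hand-clicked pixel coordinates in each still. These particular numbers
# are invented for illustration; the real ones come from the green circles.
left_pixels = np.array([
    [180, 300], [320, 295], [460, 290], [150, 150],
    [320, 148], [490, 145], [210, 420], [430, 415],
], dtype=np.float32)
right_pixels = np.array([
    [150, 300], [290, 295], [430, 290], [120, 150],
    [290, 148], [460, 145], [180, 420], [400, 415],
], dtype=np.float32)
```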




Camera Calibration

I have calibrated the two cameras independently to get their effective field-of-view, optical center, and distortion coefficients. This uses OpenCV's pre-written calibration program, which finds a known pattern of polka dots as you move it around the camera's field of view. I've had no trouble with that. The two cameras give similar calibration results, which makes sense since they probably were manufactured in the same place a few minutes apart.

Here are the images with the distortion of the individual cameras removed. They look pretty normal, but are slightly different than the originals. That's easiest to see at the edges, where some of the pixels have been pushed outside the frame by the process. But the straight lines of the table are now actually straight lines.






Stereo Calibration

Using all this info (the 2D landmarks, camera matrix, and distortion coefficients for each camera, plus the 3D landmarks), I call OpenCV's stereoCalibrate function. This gives me a number of things, including the relative translation and rotation of the cameras -- where one camera is relative to the other. The angles are hard to interpret, but the translation seems to make sense -- it tells me the cameras are indeed about 50cm apart. So I felt pretty good about that result.

Epilines

With the stereo calibration done, I can draw an epiline image. As I understand it, an epiline traces the line across one eye's view on which a single point seen in the other eye's view must lie. We know the calibration worked if each epiline passes through the true matching point. Let's see them:



Amazingly, all those lines are right: each one passes through one of the landmarks. So it would seem that my stereo calibration has been successful. I don't think the epilines serve a purpose here beyond showing that my answers so far are working.

Rectify

The next step in OpenCV's workflow is to rectify the images using stereoRectify. Rectifying rotates and distorts the images such that the vertical component of an object in each image is the same. E.g. a table corner that is 100 pixels from the top of the left image is also 100 pixels from the top of the right image. This step is valuable in understanding a 3D scene because it simplifies the correspondence problem: the task of identifying points in each image that correspond to each other. I don't even have that problem yet, since I have hand-marked my landmarks, but eventually this will prove useful. Plus it's another way to show that my progress so far is correct.

Here is the pair of rectified images. They are now a single image side-by-side, because they have to be lined up accurately in the vertical. The red boxes highlight the rectangular region where each eye has valid pixels (i.e. no black filler). The lines drawn across the images highlight the vertical coordinates matching.



This is where I start to get worried. Am I supposed to get this kind of result? I copied this code from a fairly cohesive and simple example in the documentation, but I end up with shrunken images and that odd swirly ghost of the image around the edges. That looks pretty wrong to me, and it doesn't match the example images from the documentation, which show none of that swirly ghost. The silver lining is that the images are indeed rectified: the horizontal lines do connect corresponding points in the two images with fairly good accuracy.

Triangulation

Next I try to triangulate some points. I am triangulating the landmarks because I know their true 3D positions, so I can check whether the answers are correct. In the future, I would want to triangulate the ball using this same method.

To triangulate, I use OpenCV's triangulatePoints method. It takes the 2D pixel coordinates of the landmarks and the projection matrix for each eye; those projection matrices are an output of stereoRectify.

The answers simply don't work. After converting the answers back from homogeneous coordinates into 3D coordinates, they don't resemble the table they should represent. Not only are the values too large, but they don't recreate the shape of a table either. It's just a jumbled mess. So now I know that something went wrong. Here are the true points and the triangulation output (units are mm).


True 3D                 Triangulated
(0, 0, 0)               (3658.03, -1506.81, -6335.75)
(762.5, 0, 0)           (2462.99, 1025.58, 4136.15)
(1525, 0, 0)            (2620.73, 398.168, 1480.21)
(0, 1370, 0)            (323.729, 407.828, -1360.98)
(762.5, 1370, 0)        (-897.203, 594.634, -2136.74)
(1525, 1370, 0)         (-7611.69, 1850.22, -6986.95)
(298.5, 203.2, -746)    (-137.791, -5735.79, -7016.07)
(1226.5, 203.2, -746)   (5328.58, 4257.4, 5172.84)


What now?

This is very frustrating because my error is undoubtedly small. Probably something like a transposed matrix, or switching left for right, etc. Someone who knew what they were doing could fix it in a minute. But since OpenCV is an open-source project, formal support is scarce, and I've been unable to attract any help on its forums.

Since the epilines worked, I believe my error must be in the last two steps: rectifying or triangulating. That's frustrating because the intermediate results that I get are too cryptic for me to make use of, so I feel like it's either all-or-nothing with OpenCV. And either way, this task is now harder.

I've been banging my head against this roadblock off-and-on for a few weeks now, and nothing good is coming of it. And that is why I haven't been posting. No progress, no joy, no posts.

8 comments:

  1. Do you just need to divide by the 4th co-ordinate, i.e. the W co-ordinate? The output is in homogeneous co-ordinates. The same would be true for the Q matrix method.

    points4D – 4xN array of reconstructed points in homogeneous coordinates.

    ReplyDelete
    Replies
    1. A good idea, but I was doing the division to escape from homogeneous coordinates. I'm not sure, but I suspect that no linear scaling to each point could make the points form the right arrangement.

      Delete
  2. It seems like your rectified images are quite strange. I had much more normal results with or without zero disparity: http://subgroup-ash.blogspot.de/2014/01/better-stereo-using-rectifyied-left-and.html

    ReplyDelete
    Replies
    1. Yeah, they're absolutely messed up. Shrunken I could understand might be a side-effect of the long baseline between my cameras. But that spherical ghost image (which is the original image, distorted in some weird way) makes no sense at all. Unfortunately I still don't know what I was doing wrong.

      Delete
  3. I think the triangulatePoints function of OpenCV is bugged. I'm working on 3D reconstruction and my reprojected points using this function are also rubbish; I can't find any mistake elsewhere. I think I can't rely on it and will have to implement a triangulation function manually.

    ReplyDelete
  4. Hey I'm in a very similar situation, I am stuck at the triangulating step as well, with everything before it seeming OK. However, I get like e^-19 or such very small values in my coordinates. I don't understand what's wrong though ....

    ReplyDelete
  5. Were you able to find a solution to this ?

    ReplyDelete
  6. Echoing sportmaniac: I have the same problem, the 'ghost echo' on the stereorectified images. Did you ever find a solution to this?

    ReplyDelete

Be nice, remember I'm an amateur, but by all means please give me feedback!