Computer Vision 2002 Assignment 5
COMPSCI 775ST Assignment 5

Structured Lighting


Contents
Group Members

    Christian Graf
    Uli Schroeder
    YongTao Zou

Back to Contents

Presentation

  1. Introduction

      The projection of light patterns into a scene is called structured lighting. The patterns are projected onto objects that lie in the field of view of the camera, so that a light profile appears on the object surface. The distance of an object to the camera, or its location in space, can be found by analyzing these light patterns in images taken at different angles of object rotation. From this data we can reconstruct the object's 3D coordinates.

  2. Assignment description

      The assignment is about the reconstruction of an object using light stripe projection. A single line is projected onto the object, which is rotated while images are taken at predetermined angles.
      We were given 36 pictures (10-degree steps) per object. After calculating the world coordinates of the object points, we visualize the results both as projections onto 2D planes and as an interactive VRML representation.

      Structured Lighting
      The introduction above gives the basic idea of the approach.

      Structured lighting methods can be regarded as a modification of static binocular stereo. One of the cameras is replaced by a light source which projects a light pattern into the scene. The correspondence problem known from stereo vision no longer exists, since the triangulation is carried out by intersecting a camera projection ray with the light source direction. Both are well-defined mathematical constructs: rays (light beams) and planes (light stripes). For simplicity, the light rays and planes are modelled by straight-line and plane equations, so the diameter or thickness of the projected light patterns is normally not part of the mathematical model.
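
      As a concrete illustration of this triangulation step, the following sketch intersects a single camera projection ray with a light plane. It is only a minimal example under assumed values; the ray, the plane parameters and all names are illustrative and not taken from our calibration.

      import numpy as np

      def intersect_ray_plane(origin, direction, normal, d):
          # Plane: normal . p + d = 0, ray: p(t) = origin + t * direction
          denom = np.dot(normal, direction)
          if abs(denom) < 1e-9:
              return None                          # ray is parallel to the light plane
          t = -(np.dot(normal, origin) + d) / denom
          if t < 0:
              return None                          # intersection lies behind the camera
          return origin + t * direction

      # Hypothetical numbers for illustration only
      camera_center = np.array([0.0, 0.0, 0.0])
      ray = np.array([0.1, 0.0, 1.0])              # viewing ray of one image point
      point = intersect_ray_plane(camera_center, ray, np.array([1.0, 0.0, -1.0]), 500.0)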

      Other 3D shape recovery methods make less complex demands on image acquisition; there the solution is found by analyzing theoretical, mathematical and algorithmic issues. Structured lighting simplifies the task by increasing the engineering prerequisites, so the complexity of surface reconstruction is shifted to another level. The active manipulation of the scene with light patterns simplifies the 3D reconstruction task enormously. This is underlined by the fact that position and orientation can be changed or remain static during the image acquisition process. The lecture book even claims that the shape of the light patterns may change, but we cannot see how the reconstruction process should be carried out then: all the preprocessing is done with one specific light pattern, so how would the system handle a sudden change in the pattern? That remains an open question for us.

      The 3D coordinates of the scene points visible in the images are recovered by assuming a known image acquisition geometry and using triangulation. The methods of 3D object acquisition using structured lighting can be divided into methods which use simple geometric light patterns (light spots, light stripes) and methods which are based on spatial and/or temporal coding of the light patterns. In this assignment we focus on the first type of technique, in particular light stripes (light planes).

      Single light plane approach:
      The idea is to intersect the projection ray of the examined image point with the light plane. The intersection of the light plane with the object surface is visible as a light stripe in the image. Therefore a larger set of depth values can be recovered from a single image, which results in a faster reconstruction compared to the single spot technique. The distance value for every image column stays the same and can be stored in a look-up table (LUT). Before that can be done, a calibration step has to be performed; for simplicity let us assume that the basic geometric calibration (camera, ...) has already been carried out. The calibration of the light plane system is done with respect to the light projector's coordinate system. The optical axis of the camera passes through the origin of the light projector's coordinate system. The initial position of the calibration plane is chosen such that the light profile coincides with the middle column of the image. The ray to the image border point which hits the light plane first determines the minimal z-value; the corresponding point on the other side determines the maximal z-value that can occur in the scene. All other depth values are then obtained from the LUT in the following way.

      Preparing the LUT, extracting the light profile and calculating the depth values
      We were given five calibration plane ("calibplane") images, together with the z-value of each image. The problem is that in real-world experiments the light stripe is normally not just one pixel wide. This has a technical reason (the thickness of the laser is predefined) and a geometrical one (the angle between light source and camera, and the angle between the normal of the light plane and the surface normal, are not 90 degrees). Thus the light stripe forms a so-called light profile with an energy distribution similar to an inverse quadratic polynomial. Since we are interested in the point of maximal brightness, we have to search for it. This is done at the beginning of the calibration step:
      The calibration plane is first centered at the point where the camera's optical axis and the axis of the light source intersect. Then the plane is moved in the negative and positive direction along the optical axis of the camera; the light stripe moves linearly to the left or right, depending on the direction of the movement. Because of this linear behaviour we can use the formula from the notes, z = ax + by + c; since the light plane is perpendicular to the xz-plane the y-term can be dropped, leaving z = ax + b. The five calibration images give five equations of this form, from which a and b are calculated. This light stripe plane equation describes the relationship between x (image column) and z (world depth), and we build up the look-up table from the corresponding x/z values.
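
      Assuming the five calibration images yield five pairs (stripe column x_i, known plane depth z_i), a and b can be estimated by least squares and the table filled for all 256 columns. The following is only a sketch with made-up numbers, not our measured calibration data:

      import numpy as np

      # Hypothetical calibration data: stripe column x_i and known depth z_i per image
      x = np.array([40.0, 84.0, 128.0, 172.0, 216.0])
      z = np.array([560.0, 530.0, 500.0, 470.0, 440.0])

      # Least-squares solution of the five equations z = a*x + b
      A = np.column_stack([x, np.ones_like(x)])
      (a, b), *_ = np.linalg.lstsq(A, z, rcond=None)

      # Look-up table: one depth value per image column (0..255)
      lut = a * np.arange(256) + b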

      Reconstruction
      First of all we parse all input pictures and search, in each image row, for the maximum of the light stripe that occurs in the picture. This is necessary because the stripe will not be only one pixel wide; its width varies with the material and the angle between the camera and the light source. All pixels that belong to the stripe but are not maximal are set to zero, which gives us the discrete coordinates of the points where the light has hit the object.
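
      A minimal sketch of this peak search, assuming each image is an 8-bit grayscale NumPy array (the threshold value is an assumption of ours, not taken from the assignment):

      import numpy as np

      def stripe_peaks(gray, threshold=40):
          # For every image row, return the column of maximal brightness,
          # or -1 if the row contains no sufficiently bright stripe pixel.
          peaks = np.full(gray.shape[0], -1, dtype=int)
          for row in range(gray.shape[0]):
              col = int(np.argmax(gray[row]))
              if gray[row, col] > threshold:
                  peaks[row] = col
          return peaks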
      After that we go through every pixel of the 36 images and pick the pixels we are interested in (the highlighted ones). The depth is obtained by looking up the pixel's x-value in the LUT. The world X and Z coordinates are then computed by rotating the point by the angle at which the image was taken.
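
      The rotation step could look like the following sketch. It assumes that the LUT depth is measured relative to the rotation centre and that the object turns about the Y axis; the function name and the y-scaling parameter (see the results section) are our own choices, not the exact code we used:

      import numpy as np

      def to_world(x_img, y_img, angle_deg, lut, y_scale=1.0 / 3.0):
          z = lut[x_img]                           # calibrated depth for this image column
          theta = np.radians(angle_deg)            # rotation angle of this image
          x_world = z * np.sin(theta)              # rotate the point about the Y axis
          z_world = z * np.cos(theta)
          y_world = y_img * y_scale                # manual scaling of the y-axis
          return x_world, y_world, z_world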
      We use two ways to present the output: one is to project the result onto the xy-, zy- and xz-planes by simply setting the third coordinate to zero; the other is to write the result to a VRML file, which can be explored interactively.
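
      For the VRML output, a minimal sketch that writes the recovered points as a VRML 2.0 PointSet (file name and point list are placeholders):

      def write_vrml_points(filename, points):
          # points: iterable of (x, y, z) world coordinates
          with open(filename, "w") as f:
              f.write("#VRML V2.0 utf8\n")
              f.write("Shape { geometry PointSet { coord Coordinate { point [\n")
              for x, y, z in points:
                  f.write("  %.2f %.2f %.2f,\n" % (x, y, z))
              f.write("] } } }\n")

      # Example: write_vrml_points("box.wrl", [(0, 0, 0), (10, 0, 0), (0, 10, 0)])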

      Implementation (in pseudo code):

      Calibration( ){
         for( each calibplane image ){
           profile the brightest x-coordinate
         }
         build the look-up table{
           Z1=aX1+b;
           ...
           Z5=aX5+b;
           obtain a, b;
           for( each x from 0 to 255 )
             save the z-value to the corresponding x-index;
         }
      }

      Reconstruction(){
         for( every image taken ){
           for( each pixel we are interested in ){
             compute the actual x-value, y-value and z-value;
             write coordinate to VRML file;
             project on xy-plane; (ignore z)
             project on zy-plane; (ignore x)
             project on xz-plane; (ignore y)
           }
         }
         write data to projection images;
      }

  3. Experiment results

      We have processed all images and computed the 3D object points from the given data. The VRML representations are very complex (a lot of points have to be displayed) and are therefore slow to explore. Users who want to display them in a smaller window are advised to use the sphere representation (points drawn as spheres); otherwise the overall shape is hard to recognize, since the points disappear quickly at small window sizes.
      For each model (Box, Teapot, Teapot 2, Bunny) the results are shown as XY-, XZ- and YZ-projections together with VRML representations (points and spheres).

      From the results we see that most of the objects were reconstructed correctly. This was only achieved after we had scaled the y-coordinate of the image points to a reasonable degree. The scaling is necessary because only the z-dimension is calibrated; the x-values are calculated from the rotation angle at which each image was taken, while the y-axis is never considered in these first steps. So we do not know how many millimetres a y-value corresponds to: when calculating z and x with the LUT there is no corresponding mapping for the y-pixel-value (range 0 to 255), and thus no direct correlation between all three coordinates. In this case the y-coordinates have to be scaled manually. Our factor is 1/3, obtained by observation until the result looked reasonable.

      One teapot (Teapot 2) is an exception to the otherwise good reconstruction results; it is hard to recognize. This is caused by its location and orientation: in contrast to the other objects, it does not rotate around its own centre but sits at some distance from the rotation centre, which is also the point where the camera's optical axis intersects the light plane. While rotating, the light stripe does not hit the object in every image, and when it does the angle of incidence may be unfavourable (the reflected light intensity may be too low). This leads to fewer points for the reconstruction, and the result is not as clear as for the other settings.

  4. Conclusions, remarks or discussions

      By visual inspection and human intuition we can judge whether a reconstruction looks correct or not. Sometimes this assessment is tricky, because a simple projection of the 3D object points onto 2D planes can make the object appear deformed. Introducing shading or colours would be a possible improvement. We have therefore chosen to use the possibilities of VRML files to give the user a better opportunity to understand the recovered shape of the object.

      Our representation contains only object points (as was asked), no edges or faces. We would have liked to incorporate them, but without knowledge about the real topology (are all surfaces convex, or are there concave ones?) it is hard to do this correctly. We could have attempted it, but there would have been the danger of jeopardizing the result through wrong reconstructions. As the assignment asked us to find points, not faces, our result satisfies this objective.

  5. Source code

Back to Contents

Answers to Questions

  1. What assumptions have you made?

    1. The vertical axes of the light plane projector and the camera are coplanar.
    2. The optical axes are coplanar.
    3. The calibration plane moves exactly parallel to the reference plane.
    4. The object rotates around the Y axis.
    5. The calibration object is centered where the optical axis of the camera and the light plane intersect. In our case this was column 128 as given in the assignment specification.
    6. For the detection of the light profile during light plane calibration we take the pixel with the largest gray value.
    7. The "up vector" of camera and axis of rotation are coplanar.
    8. The camera and the light plane are static.

  2. What are the possible sources of error in Structured lighting?

    1. If the object surface does not reflect the light back to the camera well, the stripe does not illuminate the surface properly in the image, so it is not easy to get sufficient information to reconstruct those parts.
    2. If the calibration data is not accurate, the result changes considerably. Different values of a and b generate quite different reconstructions, some of them obviously wrong.
    3. The thickness of the light plane affects the result. The thinner the light plane, the more accurate the measurement; however, the thinner the light plane, the harder it is to detect in the image.
    4. Different illumination brightness leads to obvious changes in the output.
    5. Bad image quality leads to bad output (as always).
    6. If the object does not rotate exactly around its own centred y-axis, it can be difficult to reconstruct (see Teapot 2).

Back to Contents


References

[KSK98] R. Klette, K. Schlüns, and A. Koschan. Computer Vision: Three-Dimensional Data from Images. Springer, Singapore, 1998.

Back to Contents