Jeremy’s Status Update for 2/22/2020

This week I focused on some of the data pre-processing methods we will use for our algorithm. Note that the topics in the first two paragraphs of this report may not be used for our current method, which involves projecting a laser strip. I first looked at the geometry for converting the data returned from a depth camera into Cartesian coordinates, which involved some rudimentary trigonometry. The depth camera should return the z-distance to each pixel, but I also accounted for the case where the Intel SR305 camera instead returns the distance from the center of the camera to the pixel. We will be able to see which one it is once we actually get the camera and test it on a flat surface, such as the face of a cube. The math computed is as follows:
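As a sketch, assuming a standard pinhole camera model with focal lengths f_x, f_y and principal point (c_x, c_y) from the camera intrinsics (these symbols are my own notation, not values from the datasheet), a pixel (u, v) with z-distance d maps to

x = \frac{(u - c_x)\, d}{f_x}, \qquad y = \frac{(v - c_y)\, d}{f_y}, \qquad z = d.

If the camera instead reports the distance r from the center of the camera to the point along the pixel's ray, the z-distance can be recovered first,

z = \frac{r}{\sqrt{1 + \left(\frac{u - c_x}{f_x}\right)^2 + \left(\frac{v - c_y}{f_y}\right)^2}},

and then substituted into the equations above.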

As mentioned in the previous status update, we were considering using an ICP (Iterative Closest Point) algorithm to combine different scans for the depth camera method and to accurately determine the angle between scans. The ICP algorithm determines the transformation between two point clouds taken from different angles of the object by using least squares to match duplicate points; these point clouds would be constructed by mapping each scanned pixel and its depth to the corresponding 3D Cartesian coordinates, as shown in the math above. Similar to gradient descent, ICP works best when given a good initial estimate, both to avoid getting stuck in local minima and to save computation time. One potential issue with ICP is that shapes like spheres or upright cylinders produce point clouds that look the same from any angle; a good way to mitigate this risk is to seed the ICP algorithm with the rotational data from the stepper motor, an area Chakara researched this week. Essentially, we will take the rotational data from the stepper motor, use ICP to determine the rotational difference between the two scans more precisely, then find duplicate points and merge the two point clouds.
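A minimal sketch of how this seeding could look with PCL's IterativeClosestPoint is below; the function name, the choice of the turntable axis (Y here), and the angle parameter are assumptions for illustration, not final design decisions.

```cpp
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/registration/icp.h>
#include <Eigen/Geometry>

// Align a new scan onto the previous one, seeding ICP with the stepper angle.
// theta is the platform rotation (radians) reported by the stepper motor.
pcl::PointCloud<pcl::PointXYZ>::Ptr
alignScans(const pcl::PointCloud<pcl::PointXYZ>::Ptr &new_scan,
           const pcl::PointCloud<pcl::PointXYZ>::Ptr &prev_scan,
           float theta)
{
    // Initial guess: a pure rotation about the (assumed vertical) turntable axis.
    Eigen::Matrix4f guess = Eigen::Matrix4f::Identity();
    guess.block<3, 3>(0, 0) =
        Eigen::AngleAxisf(theta, Eigen::Vector3f::UnitY()).toRotationMatrix();

    pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
    icp.setInputSource(new_scan);
    icp.setInputTarget(prev_scan);

    pcl::PointCloud<pcl::PointXYZ>::Ptr aligned(new pcl::PointCloud<pcl::PointXYZ>);
    icp.align(*aligned, guess);  // refine the stepper-motor estimate with ICP

    // icp.getFinalTransformation() holds the refined rotation/translation;
    // 'aligned' can then be merged with prev_scan, removing duplicate points.
    return aligned;
}
```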

I also looked into constructing a mesh from a point cloud. The point cloud would likely be stored in the PCD (Point Cloud Data) file format. PCD files provide more flexibility and speed than other formats like STL/OBJ/PLY, and we can use the PCL (Point Cloud Library) in C++ to process this format as quickly as possible. PCL provides many useful functions, such as estimating normals and performing triangulation (constructing a triangle mesh from XYZ points and normals). Since our data will just be a list of XYZ coordinates, we can easily convert it to the PCD format for use with PCL. The triangulation algorithm works by maintaining a fringe list of points from which the mesh can be grown, slowly extending the mesh outward until it covers all the points. There are many tunable parameters, such as the size of the neighborhood when searching for points to connect, the maximum edge length of the triangles, and the maximum allowed difference between normals for connecting a point, which helps deal with sharp edges and outlier points. The flow of the triangulation algorithm involves estimating the normals, combining that data with the XYZ point data, initializing the search tree and other objects, then using PCL's reconstruction method to obtain the triangle mesh object. The algorithm outputs a PolygonMesh object, which PCL can save as an OBJ file, a common format for 3D printing (and one that tends to perform better than STL). There will probably be many optimization opportunities and bugs/issues to fix with this design, since it is just a basic design based on what is available in the PCL library.
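A minimal sketch of that flow using PCL's greedy projection triangulation is below; the file names and parameter values are placeholders rather than tuned choices.

```cpp
#include <cmath>
#include <pcl/point_types.h>
#include <pcl/io/pcd_io.h>
#include <pcl/io/obj_io.h>
#include <pcl/common/io.h>
#include <pcl/features/normal_3d.h>
#include <pcl/search/kdtree.h>
#include <pcl/surface/gp3.h>

int main()
{
    // Load the XYZ point cloud from a PCD file (placeholder name).
    pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::io::loadPCDFile("scan.pcd", *cloud);

    // 1. Estimate normals from each point's neighborhood.
    pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
    tree->setInputCloud(cloud);
    pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
    ne.setInputCloud(cloud);
    ne.setSearchMethod(tree);
    ne.setKSearch(20);
    pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
    ne.compute(*normals);

    // 2. Combine the XYZ data with the estimated normals.
    pcl::PointCloud<pcl::PointNormal>::Ptr cloud_with_normals(new pcl::PointCloud<pcl::PointNormal>);
    pcl::concatenateFields(*cloud, *normals, *cloud_with_normals);

    // 3. Initialize the search tree and the triangulation object.
    pcl::search::KdTree<pcl::PointNormal>::Ptr tree2(new pcl::search::KdTree<pcl::PointNormal>);
    tree2->setInputCloud(cloud_with_normals);
    pcl::GreedyProjectionTriangulation<pcl::PointNormal> gp3;
    gp3.setSearchRadius(0.025);            // max edge length of the triangles
    gp3.setMu(2.5);                        // neighborhood size relative to point spacing
    gp3.setMaximumNearestNeighbors(100);   // limit on neighbors searched per point
    gp3.setMaximumSurfaceAngle(M_PI / 4);  // max normal difference, helps with sharp edges
    gp3.setNormalConsistency(false);

    // 4. Reconstruct the triangle mesh and save it as an OBJ file.
    gp3.setInputCloud(cloud_with_normals);
    gp3.setSearchMethod(tree2);
    pcl::PolygonMesh mesh;
    gp3.reconstruct(mesh);
    pcl::io::saveOBJFile("mesh.obj", mesh);
    return 0;
}
```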

I also looked a bit into noise reduction and outlier removal for filtering the raw point cloud data. There are many papers that discuss different approaches, and some even use neural nets to estimate the probability that a point is an outlier and remove it on that basis. This will require further research and trying out different methods, as I don't completely understand what the different papers are describing just yet. There are also libraries with their own noise reduction functions, such as PCL among a few others, but it is definitely better to do more research and write our own noise reduction/outlier removal algorithm for better efficiency, unless the PCL function is already well optimized for the PCD data format.
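For comparison purposes, PCL's built-in statistical outlier removal filter could serve as a baseline against whatever we end up writing ourselves; the function name and parameter values below are illustrative, not tuned.

```cpp
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/filters/statistical_outlier_removal.h>

// Baseline outlier removal: drop points whose mean distance to their k nearest
// neighbors is far above the average distance across the whole cloud.
pcl::PointCloud<pcl::PointXYZ>::Ptr
removeOutliers(const pcl::PointCloud<pcl::PointXYZ>::Ptr &cloud)
{
    pcl::StatisticalOutlierRemoval<pcl::PointXYZ> sor;
    sor.setInputCloud(cloud);
    sor.setMeanK(50);             // neighborhood size (illustrative)
    sor.setStddevMulThresh(1.0);  // threshold in standard deviations (illustrative)

    pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
    sor.filter(*filtered);
    return filtered;
}
```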

We discussed using a projected laser strip and a CCD camera that detects how the laser strip warps around the object in order to determine depth and generate a point cloud; our updated pipeline is shown below.

By next week, I will be looking to completely understand the laser strip projection approach, as well as dive much deeper into noise reduction and outlier removal methods. I may also experiment with triangulation from point clouds and play around with the PCL library. My main focus will be on noise reduction/outlier removal, since that is independent of our scanning sensors: it just takes in a point cloud generated from sensor data and performs a series of computations and transformations on it.

 

Jeremy’s Status Update for 2/15/2020

This week I did research comparing different 3D reconstruction options, as well as a bit of research on texturing 3D scans. There are several possible scanning options, some of which were suggested by Professor Tamal. These include an RGB-D camera (which gives depth information for each pixel using coded light), a time-of-flight single point sensor (one laser point with depth from time-of-flight), a time-of-flight vertical line (like a barcode scanner), and a time-of-flight 2D depth map.

The time-of-flight single point laser scanner was the lowest-priced option, but it was difficult to find many papers that used this method, as it is very prone to mechanical errors and is rather time-costly and mechanically complex. There were a few possible ways of executing this method, including the spiral method, in which the laser point slowly moves down vertically while the object rotates. Depending on the controlling mechanism, this method would be prone to missing many scanned points, especially if the laser shudders or other mechanical errors occur. A way to make this more efficient would be to use several laser points at once, since each one is not very costly; however, the same issues would still arise.

The RGB-D camera using coded light was one idea we were very interested in, especially since the camera would already do some of the processing to get the depth data for us. It would also allow for texture mapping, something that would be missing from the time-of-flight sensors (unless we combine those with camera data). This method would be less prone to error depending on our depth camera, and among the few options we looked at, we would most likely use one that gives depth data accurate to within 1 mm. It could also potentially allow us to correct for bias using the color data. The price is also not too expensive (less than $100 for the coded-light depth camera), which fits our requirements. However, we may need to do some work in figuring out confidence intervals for the accuracy ranges and determining whether this reconstruction method can measure depth accurately enough to meet our accuracy requirements.

The laser-based approaches are still intriguing, since time-of-flight lasers can usually give micrometer accuracy by determining the exact distance from wavelength and time-of-flight data. This led to an idea from Professor Tamal to use LIDAR (1D/2D) for the scanning. The 1D LIDAR would behave like a barcode scanner, sweeping a vertical line while the object rotates, but there may be certain complexities to explore with this method, and there has not been much previous work using it. The 2D LIDAR would be even more accurate and give an accurate depth map, but it would cost quite a bit more. This method is certainly very promising and deserves extensive research to compare with the RGB-D camera method.

All of these methods would potentially require some filtering or smoothing techniques to remove noise from the data, but the RGB-D camera and the 2D LIDAR would probably give us the easiest time in managing and converting the data into a 3D point cloud. Since the data is 2D, however, we would need to cross-reference points and map several scans from different angles back onto the same object, which would be one of the main algorithmic complexities of our project. We would also be able to leverage the rotational position of the platform to help determine which pixel maps to which exact 3D point in space, as sketched below.
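As a rough sketch of that idea (assuming the camera's pose relative to the turntable is known, with rotation R_{cam} and translation t_{cam} as my own notation for the extrinsics, and that the platform rotates about the vertical axis), a point p_{cam} reconstructed in the camera frame while the platform sits at angle \theta can be mapped into a single shared object frame by undoing the platform rotation:

p_{obj} = R_y(-\theta)\,\bigl(R_{cam}\, p_{cam} + t_{cam}\bigr),

so points from every scan land in one consistent coordinate system before any duplicate matching is done.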

Thus, in the coming week, I will dive deep into researching the RGB-D and 2D LIDAR methods, do more extensive comparisons between the two, and relate their qualities back to our requirements. So far, a lot of our research has been very breadth-based, since we were considering a large variety of options, such as previously considering using two cameras and computer vision to do the scanning. However, my research goal this week is to narrow down to the specific idea we will use and justify it with the qualities we are looking for based on our requirements. I will also be doing more research on piecing together scans from different angles, as well as working out the math to figure out a 3D point given a pixel, depth, and camera position, as this will be necessary for our algorithm later on regardless of which scanning mechanism we choose (both output depth data).

Table for Comparing Possible Sensors

| Sensor | Cost | Mechanical Complexity | Pre-Filtering Complexity (estimated) | Potential Sources of Error | Texture Mapping | Algorithmic Implications |
|---|---|---|---|---|---|---|
| RGB-D camera (structured/coded light) | ~$70 | Low | High | Less accurate than laser time-of-flight approaches; noise | Possible | Color may allow better matching with ICP |
| Time-of-flight single point | ~$10 per sensor | High | Medium | High risk of mechanical errors | No | Direct computation of point cloud, no ICP |
| Time-of-flight vertical line | ~$130? | Medium | Low | Noise (but less error than 2D?) | No | Direct computation of point cloud, no ICP |
| Time-of-flight 2D depth map | ~$200? | Low | Medium | Noise | No | Direct ICP available |