There was still not much I could do this week, since we are waiting on the data from Sorell’s. However, I did manage to find some useful resources. Specifically, I found a website called Roboflow, which will let me take my labelled training data and run some preprocessing on it. This is different from the preprocessing that we plan on running in the central node, as it pertains specifically to the machine learning model.
The main advantage that Roboflow offers is that it will help me convert the image annotations from XML to the darknet format in bulk. For the initial picture of my workspace that I used to test the pipeline, I did this manually. Now that I have found Roboflow, I can do this automatically, which will save a lot of time when processing thousands of images.
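To make the conversion concrete, here is a minimal sketch of what it involves for a single annotation file. It assumes the XML follows the Pascal VOC layout (a `size` element plus one `bndbox` per `object`), which is an assumption about my labelling tool's output, and that the darknet format is one `class x_center y_center width height` line per object with coordinates normalized to the image size. The function name and the `class_names` list are hypothetical; this is not how Roboflow does it internally.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_darknet(xml_text, class_names):
    """Convert one Pascal VOC-style XML annotation to darknet label lines.

    class_names is a hypothetical list mapping class index -> class name.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        class_id = class_names.index(obj.find("name").text)
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)

        # Darknet stores the box center and size, normalized to [0, 1].
        x_center = (xmin + xmax) / 2.0 / img_w
        y_center = (ymin + ymax) / 2.0 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines
```

Each returned line would go into a `.txt` file with the same base name as the image. Doing this by hand is manageable for one picture, which is why the bulk conversion matters for thousands.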
Another advantage that Roboflow offers is increasing the size of my dataset. It lets me apply transformations like rotation, scaling and blurring to duplicates of the images in the dataset. With combinations of these transformations I could increase the size of my dataset by a factor of 3-10. One consideration is that this trades off some dataset quality, since the augmented images are derived from the originals and will be similar to them.
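As a rough illustration of what one such transformation involves, the sketch below rotates an image by 90 degrees clockwise and updates a darknet-style label to match. The 90 degree rotation is chosen only because it is easy to show without an image library; the function names are hypothetical and the image is assumed to be a plain list of pixel rows. The last helper shows the arithmetic behind the growth factor: keeping the original plus 2 to 9 augmented copies per image gives the 3-10x range.

```python
def rotate90_clockwise(image):
    """Rotate an image (list of pixel rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate90_label(label):
    """Update a darknet label (class, x_center, y_center, width, height)
    for a 90 degree clockwise rotation. Coordinates are normalized to [0, 1]."""
    class_id, xc, yc, w, h = label
    # A point (x, y) moves to (1 - y, x); box width and height swap.
    return (class_id, 1.0 - yc, xc, h, w)


def augmented_size(n_images, copies_per_image):
    """Dataset size when each original is kept and copies_per_image
    transformed duplicates are added."""
    return n_images * (1 + copies_per_image)
```

The point of the label update is that any geometric transformation has to be applied to the bounding boxes as well as the pixels; blurring, by contrast, leaves the labels unchanged.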
On a different note, one issue that Pablo brought to my attention is that some of the images taken are distorted. The bottom part of the image is cut off and replaced with vertical lines, as shown in this picture. Pablo mentioned this was due to a wiring issue, but I am also planning on solving this problem in software.
One thing to note is that the lines that cause the distortion are all perfectly vertical and they are always at the bottom of the picture. This can be detected using a Sobel filter that takes the derivative along the vertical direction. This filter acts as a high pass filter along one dimension of the image. Since there is no change vertically in the bad part of the image, along that direction there is only a DC bias. The high pass filter will remove this bias and leave the bottom part of the filtered image all zeros. After that, we simply need to compare the last few rows of the filtered image to zeros in order to detect this kind of error.
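A minimal, dependency-free sketch of this check is below. It assumes the image is a grayscale list of pixel rows, and the names `vertical_sobel`, `bottom_is_corrupted`, `rows_to_check` and `tolerance` are all hypothetical. In practice I would expect to use a library routine such as OpenCV's `cv2.Sobel` rather than the hand-written loop, and the number of rows and the tolerance would need tuning on real images.

```python
def vertical_sobel(image):
    """Sobel response for the derivative along the vertical (y) direction.

    image is a list of rows of grayscale values. Border pixels are left
    at zero because the 3x3 kernel does not fit there.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            above = image[y - 1][x - 1] + 2 * image[y - 1][x] + image[y - 1][x + 1]
            below = image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
            out[y][x] = below - above
    return out


def bottom_is_corrupted(image, rows_to_check=8, tolerance=0):
    """Return True if the last rows_to_check filtered rows are all
    (near) zero, i.e. nothing changes vertically at the bottom."""
    h = len(image)
    if h < rows_to_check + 2:
        raise ValueError("image too small for the requested number of rows")
    response = vertical_sobel(image)
    # Skip the very last row: it is a border row with no filter output.
    for y in range(h - 1 - rows_to_check, h - 1):
        if any(abs(v) > tolerance for v in response[y]):
            return False
    return True
```

One caveat with this approach is that a genuinely flat region at the bottom of a good image (for example a uniformly dark floor) would also filter to zeros, so the check may need a second condition. Real images will likely also need a non-zero `tolerance` to absorb noise.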