This week I began writing integration code and continued working toward our accuracy requirements. One thing I did was write a script that turns a directory of component images into a single YML file that can be parsed. Up through the demo, every run would redo image preprocessing and feature detection on every image in the directory; now the YML file is created once, and later runs parse it for the already-calculated feature vectors. The dataset file is bigger than I had anticipated in the design report, where I said we would expect the file to be about 500 KB for a 50 image dataset. Right now, with a 27 image dataset, the file is 1.4 MB, which means we can expect our 50 image dataset to be roughly 2.6 to 3 MB. Although this is larger than we anticipated, it is still plenty small. Part of the size is overhead from the YML metadata that lets the file be parsed like a dictionary/map lookup, so we are okay with this size tradeoff.
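As a quick sanity check on the size estimate above, here is a minimal sketch of the extrapolation. It assumes the file grows linearly with the number of images, which may not hold exactly since images can produce different numbers of features; the function and variable names are just for illustration.

```python
def estimate_dataset_size_mb(measured_mb, measured_images, target_images):
    """Linearly scale a measured file size to a different image count.

    Assumption: every image contributes about the same number of bytes.
    """
    per_image_mb = measured_mb / measured_images
    return per_image_mb * target_images


# Numbers taken from this report.
measured_mb = 1.4        # current YML file size
measured_images = 27     # images in the current dataset
target_images = 50       # planned dataset size
design_report_mb = 0.5   # 500 KB estimate from the design report

estimate_mb = estimate_dataset_size_mb(measured_mb, measured_images, target_images)
ratio = estimate_mb / design_report_mb

print(f"estimated {target_images}-image file: {estimate_mb:.1f} MB")
print(f"ratio to design-report estimate: {ratio:.1f}x")
```

Under that linear assumption the estimate comes out to about 2.6 MB, or roughly five times the design-report figure.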
I have also started testing and polishing accuracy. I ran a test on 66 component images, and 64 of them were identified correctly (~97% accuracy)! That number is a bit misleading though, because 42 of the images were of components that have an orientation associated with them (voltage and current sources, diodes), and only 24 of those were identified with the correct orientation (~57%). Besides the difficulty in classifying the correct orientation of those components, I also noticed that current sources and voltage sources often have very similar matching scores (sometimes they had the same score, and the correct one just happened to be listed as the first choice). As a result, one thing I want to experiment with is using SIFT instead of ORB for feature detection. Because orientation actually matters, it makes sense to try SIFT now, so this is definitely something I want to do next week.
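To make the gap between the two accuracy numbers concrete, here is a small sketch of the arithmetic. The counts come from the test run above; the "strict" range at the end is my own illustration and assumes that a correct orientation implies a correct component type, and that the 2 misidentified images could fall in either group.

```python
def accuracy(correct, total):
    """Fraction of correct results."""
    return correct / total


# Counts from the test run described in this report.
total_images = 66
correct_type = 64          # component type identified correctly
oriented_images = 42       # images of components that have an orientation
correct_orientation = 24   # oriented images with the right orientation

type_accuracy = accuracy(correct_type, total_images)
orientation_accuracy = accuracy(correct_orientation, oriented_images)

# Strict accuracy: type correct AND (if applicable) orientation correct.
# The report does not say which group the 2 misidentified images belong to,
# so this only gives a range (assumption, not a measured result).
unoriented_images = total_images - oriented_images
misidentified = total_images - correct_type
strict_low = accuracy(unoriented_images - misidentified + correct_orientation, total_images)
strict_high = accuracy(unoriented_images + correct_orientation, total_images)

print(f"type accuracy:        {type_accuracy:.1%}")
print(f"orientation accuracy: {orientation_accuracy:.1%}")
print(f"strict accuracy:      {strict_low:.1%} to {strict_high:.1%}")
```

So while the type-only number is about 97%, counting orientation would put the overall figure somewhere around 70% under these assumptions.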
Last week I said that I wanted to improve the node detection, but I realized in testing this week that it actually performs pretty well. I played around with some parameters and it worked consistently.
My next steps are to continue working on both the individual component accuracy and the full circuit accuracy. By the next report I want to have the complete dataset that we will use, and the accuracies will hopefully be in the ballpark of 80-90%.