Below are some of the tests that our team conducted:
- Image Classification Accuracy Tests: We compiled a dataset of 500 images covering all background types (white, normal, and blurred) and measured the system's classification accuracy on it. Each result fell into one of four categories: canned food, bananas, oranges, and apples; the system can classify into other groups if needed, but limiting the test to four categories kept the confusion matrix small. Accuracy came out to around 89%, against a target of about 95%. To close the gap, we plan a design change centered on background reduction so that each image contains only the primary object (a sketch of the evaluation harness follows this list).
- Image Classification Latency Tests: Using the same 500-image dataset, we measured classification speed. Our goal is 2-3 seconds per image; the tests averaged 4-5 seconds, which is not far from the target. Our primary focus is accuracy, but once accuracy reaches the ideal threshold, we will optimize the classification algorithm, potentially by removing layers (see the timing sketch after this list).
- Text Extraction Accuracy Tests: We combined our sample of 125 canned food images with an additional 75 pictures we took ourselves to form a 200-sample test set, focusing on two tasks: calorie extraction and product label extraction. Calorie extraction was 92% accurate, against a target of above 96%. The main issue is that the text extraction output is sometimes formatted so that the calorie amount does not follow the word "calorie"; we plan to tweak the extraction logic to handle such edge cases (see the calorie-parsing sketch after this list). Product label extraction succeeded 98% of the time, but the output contains too many extraneous characters, so we will write a parsing algorithm to make labels clear and concise for inventory documentation.
- Text Extraction Latency Tests: On the same 200 samples, the full calorie and product label extraction took 3-4 seconds. The results were skewed somewhat by a few tests that spent an exceptionally long time searching for the calorie amount; we expect this to drop to 1-2 seconds after tweaking the calorie extraction algorithm. Our goal is 2-3 seconds, which leaves buffer for adding a product label sanitization step that turns the raw text blob into a clear, concise label (see the sanitization sketch after this list).
- RPi and MySQL Connector Latency Tests: We used a set of 100 image captures from the RPi (200 in total: 100 product images and 100 scale readings) and timed each capture from upload to availability in the web application. Our target range is 10-15 seconds, but the samples averaged 22.4 seconds. We plan design changes that synchronize when the first and second cameras take their captures, and we can also shorten the polling interval used to wait for the image to stabilize, since the fastest sample came in at 16.5 seconds. We found that the product and weight stabilize quickly, so a slight alteration to the Python script should improve latency substantially (see the polling sketch after this list).
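To make the accuracy test concrete, here is a minimal sketch of the kind of evaluation harness we run over the 500-image set, assuming scikit-learn is installed; `model.classify()` is a hypothetical wrapper around our classifier, not its real interface.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

CATEGORIES = ["canned food", "bananas", "oranges", "apples"]

def evaluate(model, samples):
    """Run the classifier over (image, true_label) pairs and report
    overall accuracy plus a 4x4 confusion matrix."""
    y_true, y_pred = [], []
    for image, label in samples:
        y_true.append(label)
        y_pred.append(model.classify(image))  # hypothetical classify() wrapper
    print(f"Accuracy: {accuracy_score(y_true, y_pred):.1%}")
    print(confusion_matrix(y_true, y_pred, labels=CATEGORIES))
```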
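For the latency test, a simple timing loop like the one below is enough; again, `model.classify()` stands in for our actual inference call, and the 2-3 second goal is baked in only for reporting.

```python
import statistics
import time

def time_classification(model, images, target_s=(2.0, 3.0)):
    """Measure per-image classification latency against the 2-3 s goal."""
    timings = []
    for image in images:
        start = time.perf_counter()
        model.classify(image)  # hypothetical wrapper around our classifier
        timings.append(time.perf_counter() - start)
    mean_s = statistics.mean(timings)
    print(f"mean {mean_s:.2f}s, max {max(timings):.2f}s "
          f"(goal {target_s[0]:.0f}-{target_s[1]:.0f}s)")
```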
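The calorie edge cases could be handled with keyword-anchored regular expressions along these lines; the patterns and the 40-character search window are illustrative guesses, not our final extraction logic.

```python
import re

def extract_calories(ocr_text):
    """Find a calorie amount even when OCR scrambles the layout, e.g.
    'Calories 120', '120 Calories', or the amount on a later line."""
    # Number that appears within ~40 non-digit characters after the keyword.
    after = re.search(r"calorie\w*\D{0,40}?(\d{1,4})", ocr_text, re.IGNORECASE)
    if after:
        return int(after.group(1))
    # Fall back to a number just before the keyword ('120 Calories').
    before = re.search(r"(\d{1,4})\s{0,5}calorie", ocr_text, re.IGNORECASE)
    return int(before.group(1)) if before else None
```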
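A label sanitization pass might look like the following sketch, which strips junk characters and keeps the longest clean line on the assumption that it is usually the product name; these heuristics are placeholders for whatever we settle on.

```python
import re

def sanitize_label(ocr_text):
    """Strip OCR noise from raw product-label text: drop stray
    punctuation, collapse whitespace, and keep the longest clean line."""
    lines = []
    for line in ocr_text.splitlines():
        cleaned = re.sub(r"[^A-Za-z0-9 &'-]", " ", line)  # remove junk characters
        cleaned = re.sub(r"\s+", " ", cleaned).strip()    # collapse whitespace
        if len(cleaned) >= 3:
            lines.append(cleaned)
    return max(lines, key=len, default="").title()
```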
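Finally, the stabilization change amounts to polling until consecutive frames stop differing instead of sleeping for a fixed worst-case interval. The values below (0.25 s poll, three stable frames, mean-pixel-difference threshold) are illustrative, and OpenCV is assumed for the frame comparison.

```python
import time
import cv2
import numpy as np

def wait_until_stable(cap, poll_s=0.25, threshold=2.0, needed=3, timeout_s=10.0):
    """Poll the camera and return a frame once consecutive captures stop
    changing, instead of sleeping for a fixed worst-case interval."""
    stable, prev, frame = 0, None, None
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        ok, frame = cap.read()
        if not ok:
            continue
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None:
            diff = np.mean(cv2.absdiff(gray, prev))  # mean pixel change
            stable = stable + 1 if diff < threshold else 0
            if stable >= needed:
                return frame
        prev = gray
        time.sleep(poll_s)
    return frame  # fall back to the last capture on timeout
```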
Regarding progress, Steven debugged and fixed our ResNet-18 classification model. Integrating it into the front end surfaced many bugs: previously it produced no output at all, so he modified it to process images correctly and return classified results. He also improved our background reduction algorithm, which makes each RPi image clearer and easier to classify because the database entry contains fewer extraneous features. This raised accuracy to around 89%, roughly in line with our goals, and he will keep fine-tuning to push accuracy up to 95% (a sketch of the inference path follows).
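For reference, the inference path looks roughly like the sketch below; the checkpoint filename `resnet18_grocery.pt` is an illustrative stand-in, and the mean/std constants are the standard ImageNet normalization that ResNet-18 is usually paired with, not necessarily our exact preprocessing.

```python
import torch
from PIL import Image
from torchvision import models, transforms

CLASSES = ["canned food", "bananas", "oranges", "apples"]

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# ResNet-18 with a 4-class head; checkpoint path is hypothetical.
model = models.resnet18(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, len(CLASSES))
model.load_state_dict(torch.load("resnet18_grocery.pt", map_location="cpu"))
model.eval()

def classify(image_path):
    """Return the predicted category for one RPi image capture."""
    batch = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(batch)
    return CLASSES[int(logits.argmax(dim=1))]
```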
Grace and Steven also conducted more testing and validation of the ML components, including many unit tests against the data we compiled in previous weeks. These tests are discussed in more detail in the team status report, but they included speed and accuracy tests of the image classification and text extraction algorithms. Most were run locally first with stock images to keep iteration quick and efficient; the next step is live testing on the website with pictures from the RPi.
Surya achieved high accuracy with the seven-segment optical character recognition that establishes communication between the scale and the MySQL database: the extracted weight is forwarded to the database for storage and also used in backend calculations. This came after failed tests with Tesseract OCR, which has trouble extracting digital (seven-segment) fonts. He plans to spend the remaining time fine-tuning the algorithm so it works across a broad range of inputs (a sketch of the segment-lookup idea follows).
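Unlike Tesseract, this approach works at the segment level: threshold a digit region, test which of the seven segment areas are lit, and look the pattern up in a table. The region fractions and the 50%-lit threshold in the sketch below are assumptions, not his exact parameters; the input is a binarized (0/255) digit crop as a NumPy array.

```python
# Lit segments (top, top-left, top-right, middle,
# bottom-left, bottom-right, bottom) for each digit.
DIGIT_SEGMENTS = {
    (1, 1, 1, 0, 1, 1, 1): 0, (0, 0, 1, 0, 0, 1, 0): 1,
    (1, 0, 1, 1, 1, 0, 1): 2, (1, 0, 1, 1, 0, 1, 1): 3,
    (0, 1, 1, 1, 0, 1, 0): 4, (1, 1, 0, 1, 0, 1, 1): 5,
    (1, 1, 0, 1, 1, 1, 1): 6, (1, 0, 1, 0, 0, 1, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8, (1, 1, 1, 1, 0, 1, 1): 9,
}

def read_digit(thresh_roi):
    """Classify one binarized digit ROI by checking which of the seven
    segment regions contain mostly 'on' (white) pixels."""
    h, w = thresh_roi.shape
    dw, dh = int(w * 0.25), int(h * 0.15)
    segments = [
        ((0, 0), (w, dh)),                               # top
        ((0, 0), (dw, h // 2)),                          # top-left
        ((w - dw, 0), (w, h // 2)),                      # top-right
        ((0, h // 2 - dh // 2), (w, h // 2 + dh // 2)),  # middle
        ((0, h // 2), (dw, h)),                          # bottom-left
        ((w - dw, h // 2), (w, h)),                      # bottom-right
        ((0, h - dh), (w, h)),                           # bottom
    ]
    on = []
    for (x0, y0), (x1, y1) in segments:
        region = thresh_roi[y0:y1, x0:x1]
        on.append(1 if region.mean() > 127 else 0)  # >50% of pixels lit
    return DIGIT_SEGMENTS.get(tuple(on))
```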
Grace spent most of her time improving frontend components related to user experience, performance, and functionality. Since most effort had previously shifted toward building the database retrieval system in MySQL, she made improvements to the design and interactivity of the website, which could initially be overwhelming to navigate across all the different pages; she is considering adding instructions somewhere to help users with the functionality. Another issue that emerged after integrating the subsystems was providing enough feedback and error states/messages after user actions such as scanning a product with the camera. Clear error messages are crucial so users can immediately identify incorrect behavior and receive the feedback they need as soon as possible.
Likewise, we all worked closely to deploy the web application on Nginx and to connect the MySQL database to the RPi using MySQL Connector. This involved coordinating two USB cameras and testing with one versus two RPis. We also wrote Python scripts to capture images of both the object and the digital scale readout (a sketch of this flow follows).
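A stripped-down version of that capture-and-upload flow is sketched below, assuming the `mysql-connector-python` and OpenCV packages; the camera indices, file paths, and the `captures` table schema are all hypothetical.

```python
import time
import cv2
import mysql.connector

def capture_and_upload(db_config, product_cam=0, scale_cam=1):
    """Grab the product and scale frames back-to-back so both captures
    describe the same weighing event, then record their paths in MySQL."""
    frames = {}
    for name, index in (("product", product_cam), ("scale", scale_cam)):
        cap = cv2.VideoCapture(index)
        ok, frame = cap.read()
        cap.release()
        if not ok:
            raise RuntimeError(f"camera {index} ({name}) failed to capture")
        path = f"/tmp/{name}_{int(time.time())}.jpg"
        cv2.imwrite(path, frame)
        frames[name] = path

    conn = mysql.connector.connect(**db_config)  # host/user/password/database
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO captures (product_path, scale_path, captured_at) "
        "VALUES (%s, %s, NOW())",
        (frames["product"], frames["scale"]),
    )
    conn.commit()
    cur.close()
    conn.close()
```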