This week I continued developing the ML backbone of this project. The first set of tasks involved the soft-margin SVM. I fine-tuned it to the best of my ability, but the accuracy is still below the level we would like. As a result, I will experiment with combining multiple decision boundaries using the AdaBoost algorithm. AdaBoost assigns a weight to each of a set of weak decision boundaries and combines them into a weighted vote, which should improve classification accuracy over the single SVM formulation. It learns from misclassifications: after each round, the misclassified training examples are weighted more heavily, so the next decision boundary focuses on them. I did a lot of research on the mathematics and rationale behind this algorithm, so next week I hope to implement a rough version of it and analyze its effect on classification between canned foods and fruits.
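To make the reweighting idea concrete, here is a minimal sketch of AdaBoost that I could start from. It is only an illustration under assumptions: it uses one-feature threshold classifiers (decision stumps) as the weak learners instead of our SVM boundaries, labels are +1 and -1, and the function names and toy data are hypothetical.

```python
import math

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                # Predict `polarity` at or above the threshold, the opposite below it.
                preds = [polarity if row[j] >= t else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def stump_predict(stump, row):
    _, j, t, polarity = stump
    return polarity if row[j] >= t else -polarity

def adaboost_fit(X, y, rounds=10):
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        if err >= 0.5:
            break  # the weak learner is no better than chance
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data (hypothetical): no single threshold separates the two classes.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(ensemble, row) for row in X]
```

On this toy set a single stump always misclassifies two of the six points, while the weighted vote of three stumps fits all of them. That is the behavior I hope carries over to the canned food versus fruit data.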
Next, I looked more into the GoogLeNet model. Thanks to feedback from Professor Marios and Neha, I decided to steer away from fine-tuning the last layer. Instead, the plan is to experiment with k-nearest neighbors, classifying each real image by comparing it against the training data. I created a design tradeoff report on the usage of k-nearest neighbors. I began coding up the algorithm in Google Colab, and I will compile a testing dataset of bananas, oranges, and apples at the same time. The plan right now is to start with k = 5; however, if time permits, I plan to use 5-fold cross-validation to experiment with k = 3, 5, 10, 20, and 40.
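The k selection described above could be sketched roughly as follows. This is an assumption-heavy illustration, not the Colab code: the 2-D points stand in for whatever feature vectors we end up using, the data is synthetic, and the helper names `knn_predict` and `cross_val_accuracy` are hypothetical.

```python
import math
import random
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # On a tie, the label seen first among the nearest neighbors wins.
    return votes.most_common(1)[0][0]

def cross_val_accuracy(X, y, k, folds=5, seed=0):
    """Average accuracy of k-NN over `folds` held-out splits."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        held_out = idx[f::folds]          # every folds-th shuffled index
        held = set(held_out)
        train = [i for i in idx if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i]
                       for i in held_out)
    return correct / len(X)

# Synthetic stand-in data: three well-separated clusters, 20 points each.
rng = random.Random(0)
centers = {"banana": (0.0, 0.0), "orange": (10.0, 0.0), "apple": (0.0, 10.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(20):
        X.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)))
        y.append(label)

# Candidate values of k from the plan above.
scores = {k: cross_val_accuracy(X, y, k) for k in (3, 5, 10, 20, 40)}
best_k = max(scores, key=scores.get)
```

One thing this makes visible is that large values such as k = 40 only make sense if each class has many more than 40 training examples. Otherwise the vote is forced to include neighbors from other classes.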
The last thing I did this week was familiarize myself with the ChatGPT4 API. I tested it locally on our website using localhost, and it worked quite well. The next step is to format the inputs to the API and get structured outputs back that can be stored in the database. Its computational power and accuracy are quite remarkable, and I plan to experiment more with the features it offers. The goal is to know this API well enough to fall back on it in the event that the other ML models and algorithms we plan to employ miss the accuracy benchmarks.
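For the "formatted outputs into the database" step, one possible shape is to ask for a JSON reply and validate it before storing anything. The sketch below assumes the reply text is already in hand and does not call the API at all. The field names, the table name, and the use of SQLite are all hypothetical placeholders, not our actual schema or database.

```python
import json
import sqlite3

# Hypothetical schema: the fields we would ask the model to return.
REQUIRED_FIELDS = {"label": str, "confidence": (int, float)}

def parse_reply(reply_text):
    """Parse a JSON reply and check that the expected fields are present."""
    data = json.loads(reply_text)
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    for name, kind in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), kind):
            raise ValueError(f"missing or malformed field: {name}")
    return data["label"], float(data["confidence"])

def store_reply(conn, reply_text):
    """Validate the reply, then insert it as one row."""
    label, confidence = parse_reply(reply_text)
    conn.execute(
        "INSERT INTO classifications (label, confidence) VALUES (?, ?)",
        (label, confidence),
    )
    conn.commit()

# In-memory database as a stand-in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE classifications (label TEXT, confidence REAL)")
store_reply(conn, '{"label": "banana", "confidence": 0.93}')
```

Rejecting malformed replies before the insert keeps bad rows out of the database, which should make later debugging on the website side easier.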
Next week, I plan to work closely with the website design to integrate the various features I have been testing. So far, all of the testing has been done locally on my computer with my MacBook camera, in a contained environment. As a result, there were not many errors or difficulties in running tests and experiments. I also hope to conduct latency tests on the website using dummy data and values. The goal is to be able to retrieve values from the database and feed them as input to our ML models to produce responses. Finally, I plan to work with Surya to figure out the camera design of our system and how to accept images from the Arduino.
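A latency test with dummy data could start from a small timing harness like the one below. It is a sketch only: `dummy_pipeline` is a hypothetical stand-in for the real path from the database through the model to a response, and the number of runs is arbitrary.

```python
import statistics
import time

def measure_latency(fn, runs=50):
    """Call `fn` repeatedly and summarize wall-clock latency in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "median_s": statistics.median(samples),
        "p95_s": samples[int(0.95 * (len(samples) - 1))],  # approximate 95th percentile
        "max_s": samples[-1],
    }

def dummy_pipeline():
    # Placeholder work standing in for "read values, run model, return response".
    values = [i * 0.5 for i in range(1000)]
    return sum(values)

stats = measure_latency(dummy_pipeline)
```

Reporting the median together with a high percentile, rather than just an average, should show whether occasional slow requests are hiding behind a good typical case.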
