Things accomplished this week
This week's emphasis was on two fronts: making the YOLO model robust to newly added object types, and integrating the models with the database/web server. Significant progress was made in both areas.
For newly added object types, Precision and Recall reached the mid-80s to 90% range. (Precision is the share of the model's predicted labels that are correct; Recall is the share of actual objects that the model finds and labels.) That roughly mirrors the existing categories in the YOLO model. The key to achieving these results was first switching to a larger YOLO model (YOLOv11l vs. YOLOv11n). It takes 3-4 times as long to train, but P/R values were only in the 50-60% range before the model upgrade.
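To make the two metrics concrete, here is a minimal sketch of how Precision and Recall follow from detection counts. The function name and the counts in the usage line are hypothetical illustrations, not values or code from my training script.

```python
from __future__ import annotations


def precision_recall(true_positives: int,
                     false_positives: int,
                     false_negatives: int) -> tuple[float, float]:
    """Return (precision, recall) computed from raw detection counts."""
    predicted = true_positives + false_positives  # everything the model labeled
    actual = true_positives + false_negatives     # everything that was really there
    # Guard against division by zero when there are no predictions or no objects.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not results from this project).
p, r = precision_recall(true_positives=85, false_positives=15, false_negatives=10)
print(f"precision={p:.2f} recall={r:.2f}")
```

A model can score high on one metric and low on the other, which is why both are tracked for each object type.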
In addition, it was determined that one does indeed need many images to get high Precision/Recall: rates were similarly low when only 10-20% of the training set was used rather than the entire set.
Realizing how much training data was needed also played a huge role in determining that manually labeled data alone would not be sufficient. I therefore wrote a script that augments each image in 15 different random ways, so that different angles, zoom levels, and objects appear in each frame. Without the additional augmentations, Precision was only around 70% and Recall around 50%.
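The idea of drawing 15 random variants per image can be sketched as follows. This is not my actual script: the parameter ranges, function names, and the choice of a horizontal flip are assumptions for illustration. It only samples augmentation settings and shows how one of them (the flip) must also be applied to a YOLO-format label so the box still matches the image.

```python
import random


def sample_augmentations(n=15, seed=None):
    """Draw n random augmentation settings for one image (ranges are assumed)."""
    rng = random.Random(seed)
    return [
        {
            "angle_deg": rng.uniform(-15.0, 15.0),  # small random rotation
            "zoom": rng.uniform(0.8, 1.2),          # zoom out / zoom in
            "flip": rng.random() < 0.5,             # horizontal mirror or not
        }
        for _ in range(n)
    ]


def flip_box_horizontally(box):
    """Mirror a YOLO-format box (cx, cy, w, h), all normalized to [0, 1]."""
    cx, cy, w, h = box
    # Only the x-center changes; width and height are unaffected by a mirror.
    return (1.0 - cx, cy, w, h)


# Rotation and zoom need their own box transforms, which are omitted here.
settings = sample_augmentations(n=15, seed=0)
print(len(settings), "augmentation settings sampled")
print(flip_box_horizontally((0.25, 0.5, 0.1, 0.2)))
```

Passing a seed keeps the sampled settings reproducible between training runs, which makes it easier to compare results.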
Thus most of the code development time went into writing augmentation scripts and making sure there was no unnecessary overhead in running my Python training script, which has grown to around 1000 lines of code at this point.
Apart from the progress on YOLO model training, I worked with Gian Carlo and Swati to integrate YOLO into the database solution. All one needs to do is supply the model weights to their existing database/server code and run my prediction script for YOLO to do its magic.
Is your progress on schedule or behind?
I believe we are on schedule. We have finally integrated the various parts of the project, and at this point the remaining work is to keep labeling and training the model to make it as accurate as possible. In addition, time will likely open up to work on the voice-to-text/cosine comparison code.
What deliverables do you hope to complete in the next week?
- Arrive at a final configuration for my YOLO model and have multiple charts plotting out the relationship of various parameters with accuracy
- Begin work on the cosine comparison and audio-to-text functionality, specifically integrating it into our current MVP.