Before the weekend, we discussed “normalizing classes” so that multiple classes in the new dataset could share one label based on a common characteristic. For example, we assign paper bags, paper boxes, normal paper, etc. all to one class. This week, I wrote a script that goes through all label files of the old and new datasets and changes each class based on its original label while leaving the box coordinates unchanged. After I tested it locally, Ting ran it in our Colab workspace, and it processed all label files as we expected.
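The remapping step can be sketched roughly as follows. This is a minimal illustration, not our actual script: it assumes YOLO-style label files (one `.txt` file per image, each line being `class_id x_center y_center width height`), and the `CLASS_MAP` values and function names are hypothetical.

```python
from pathlib import Path

# Hypothetical mapping from original class IDs to normalized class IDs.
# Here, three paper-like classes (0, 1, 2) all collapse into class 0.
CLASS_MAP = {0: 0, 1: 0, 2: 0, 3: 1, 4: 2}


def remap_label_line(line, class_map):
    """Replace the class ID of one label line, keeping the box fields as-is."""
    parts = line.split()
    if not parts:
        return line  # leave blank lines alone
    old_id = int(parts[0])
    new_id = class_map[old_id]  # raises KeyError for an unmapped class
    # The coordinate fields are copied as text, so they are never re-rounded.
    return " ".join([str(new_id)] + parts[1:])


def remap_label_file(path, class_map):
    """Rewrite one label file in place with remapped class IDs."""
    lines = path.read_text().splitlines()
    remapped = [remap_label_line(line, class_map) for line in lines]
    path.write_text("\n".join(remapped) + ("\n" if remapped else ""))


def remap_dataset(label_dir, class_map):
    """Remap every .txt label file in a directory; return the file count."""
    count = 0
    for path in sorted(Path(label_dir).glob("*.txt")):
        remap_label_file(path, class_map)
        count += 1
    return count
```

Failing loudly on an unmapped class ID is a deliberate choice in this sketch, so that a label outside the mapping is noticed instead of being silently kept.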
After training with the new dataset, Ting and I did some testing with the updated weights. Our camera, very unfortunately, broke again on Wednesday, so we had to test with still images. We quickly ordered a new USB camera, and once it was set up, the model could detect items correctly and could generally tell whether an item is trash or recyclable. However, two major problems remain: 1. For real items, the output confidence value is barely over 0.85 even when the classification is correct (our threshold for a positive answer is 0.85), which leaves almost no margin. 2. Because there are multiple recyclable classes but only a single trash class, the output label is often wrong among the recyclables (for example, metal might be identified as plastic).
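To make the two problems concrete, here is a minimal sketch of how a detection could be turned into a trash/recyclable answer. The 0.85 threshold comes from our setup; the class names, the function name, and the choice to treat a confidence exactly at the threshold as positive are assumptions for illustration only.

```python
CONFIDENCE_THRESHOLD = 0.85  # our threshold for a positive answer

# Hypothetical class names; the real label set may differ.
TRASH_CLASS = "trash"
RECYCLABLE_CLASSES = {"paper", "plastic", "metal"}


def bin_decision(label, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Map one detection (label, confidence) to a coarse decision."""
    if confidence < threshold:
        return "uncertain"  # below threshold: no positive answer
    if label == TRASH_CLASS:
        return "trash"
    if label in RECYCLABLE_CLASSES:
        return "recyclable"
    raise ValueError(f"unknown class label: {label}")


# Metal misread as plastic still lands in the same coarse decision.
print(bin_decision("plastic", 0.86))
# A correct label with slightly lower confidence gives no positive answer.
print(bin_decision("metal", 0.84))
```

Under these assumptions, confusion among recyclable classes does not change the trash/recyclable answer, but a confidence that sits right at the threshold can flip a correct detection to no answer at all.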
Besides that, I worked mostly on the presentation: we revisited and condensed earlier materials and did more testing based on the presentation outline. Our schedule is slightly behind on ML training because we suspect that the mapping between label classes (numbers) and the actual classification results (words) might be off. Looking into that might take some time, and we also need more integration testing. The “FSM” has been shown to work, but more testing needs to be done before the demo.
For next week, I have three major tasks:
- Finish the slides and do the presentation
- Improve ML accuracy and do more testing
- Help build the mechanics