This week I spent time learning about computer vision, specifically object detection. I took time to understand how YOLOv8 works and how we can use it for our project. For the MVP we will use a balloon as the object to find, so I began by looking for a good dataset for training our YOLOv8 model. I looked through various datasets to identify which ones have a diverse set of images and a large amount of training data. Once I found the dataset, I realized that YOLOv8 doesn’t support the labeling format used by the dataset (VIA format). Thus, I researched methods to convert VIA format to COCO format. To verify the correctness of the dataset, I wrote a script to visualize the dataset after converting it to COCO format.
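As a rough illustration of the VIA to COCO conversion mentioned above, here is a minimal sketch. It is not the script I actually used. The function name `via_to_coco`, the `image_sizes` argument (image width and height supplied by the caller, assuming they are not available in the VIA file) and the single "balloon" category are all assumptions made for the example. It only handles polygon regions and carries them over as COCO `segmentation` entries.

```python
def via_to_coco(via, image_sizes, category_name="balloon"):
    """Convert a parsed VIA annotation dict to a COCO-style dict.

    via:         dict loaded from the VIA JSON file (one entry per image)
    image_sizes: hypothetical helper mapping {filename: (width, height)}
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": category_name}],
    }
    ann_id = 1
    for image_id, entry in enumerate(via.values(), start=1):
        filename = entry["filename"]
        width, height = image_sizes[filename]
        coco["images"].append(
            {"id": image_id, "file_name": filename, "width": width, "height": height}
        )

        # Depending on the VIA version, regions may be a dict or a list.
        regions = entry.get("regions", [])
        if isinstance(regions, dict):
            regions = list(regions.values())

        for region in regions:
            shape = region["shape_attributes"]
            if shape.get("name") != "polygon":
                continue  # this sketch only handles polygon regions
            xs, ys = shape["all_points_x"], shape["all_points_y"]
            # COCO stores a polygon as a flat list [x1, y1, x2, y2, ...]
            flat = [coord for point in zip(xs, ys) for coord in point]
            coco["annotations"].append(
                {
                    "id": ann_id,
                    "image_id": image_id,
                    "category_id": 1,
                    "segmentation": [flat],
                    "iscrowd": 0,
                }
            )
            ann_id += 1
    return coco
```

The VIA file itself can be loaded with `json.load` and the result written back out with `json.dump`.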
I then trained a model for 10 epochs to verify that training was functioning as expected. I observed the loss and the predictions on testing images, and the trend seemed correct. Then, I trained a model for 50 epochs. Once it was trained, I visualized the model’s output on testing images and noticed that the bounding boxes were not always accurate.
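For reference, a short training run like the one described above might look roughly like this with the `ultralytics` Python package. This is a sketch under assumptions, not my exact setup: the function name `train_and_check`, the dataset config path passed as `data_yaml`, and the image size of 640 are placeholders chosen for the example.

```python
def train_and_check(data_yaml, epochs=10, weights="yolov8n.pt"):
    """Fine-tune a YOLOv8 Nano model and run validation.

    data_yaml: path to a dataset config file (hypothetical, e.g. "balloon.yaml")
    epochs:    10 for a quick sanity check, 50 for the longer run
    """
    # Imported here so the module can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # start from pretrained Nano weights
    model.train(data=data_yaml, epochs=epochs, imgsz=640)
    metrics = model.val()  # evaluate on the validation split
    return model, metrics


# Example usage (assumed file name):
#   model, metrics = train_and_check("balloon.yaml", epochs=10)
```

Running it first with a small number of epochs is a cheap way to catch dataset or configuration problems before committing to a longer run.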
Going back to the dataset, I realized that it was made for image segmentation and contained segmentation masks instead of bounding boxes. Therefore, I wrote another script to convert the segmentation masks to bounding box labels. I retrained the model for 50 epochs with the converted dataset. Once tested, it gave good results. The next step is to write a program that takes in live video and tests the model in the real world.
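The mask to bounding box conversion can be sketched as follows. This is a simplified illustration rather than my actual script. It assumes each mask is stored as a flat polygon `[x1, y1, x2, y2, ...]`, and the helper names `polygon_to_bbox` and `bbox_to_yolo` are made up for the example. The second helper assumes the target is a normalized YOLO-style label (class, center x, center y, width, height), which may differ from the label format actually needed.

```python
def polygon_to_bbox(segmentation):
    """Return the tightest axis-aligned box [x_min, y_min, width, height]
    around a flat polygon [x1, y1, x2, y2, ...]."""
    xs = segmentation[0::2]  # every even index is an x coordinate
    ys = segmentation[1::2]  # every odd index is a y coordinate
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


def bbox_to_yolo(bbox, img_w, img_h, class_id=0):
    """Convert [x_min, y_min, width, height] in pixels to a normalized
    (class_id, x_center, y_center, width, height) tuple."""
    x, y, w, h = bbox
    return (
        class_id,
        (x + w / 2) / img_w,
        (y + h / 2) / img_h,
        w / img_w,
        h / img_h,
    )
```

For example, the polygon `[10, 20, 50, 20, 50, 80, 10, 80]` gives the box `[10, 20, 40, 60]`.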
Example of YOLOv8 Nano model output (noise was added to the image to make detection harder):

