Andrew Wang’s Status Report: 3/8/2025

This week, I worked on fine-tuning the pretrained YOLOv8 models for better performance. Previously, the models worked reasonably well out of the box on an out-of-distribution dataset, so I was interested in fine-tuning them on this dataset to improve the robustness of the detection model.


Unfortunately, the fine-tuning does not appear to help much so far. My first few attempts at training the model on the new dataset resulted in a model that detects no objects at all and marks everything as “background”. See below for the latest confusion matrix:

[Figure: confusion matrix from the latest fine-tuning run]

I’m honestly a little confused as to why this is happening. I did verify that the out-of-the-box model’s metrics from my last status report are reproducible, so I suspect there is a small issue with how I am retraining the model, which I am currently looking into.
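For reference, below is a minimal sketch of the kind of fine-tuning run I am setting up, assuming the ultralytics training API; the "bdd100k.yaml" dataset config, the epoch count, and the other hyperparameters are placeholders rather than my exact settings. One thing I am double-checking is whether the class mapping in that config lines up with the label files, since a mismatch there could plausibly produce the all-background behavior above.

from ultralytics import YOLO

# Start from the same pretrained checkpoint used for the baseline numbers.
model = YOLO("yolov8x.pt")

# Fine-tune on the BDD100k training split. "bdd100k.yaml" is a hypothetical
# dataset config that must list the image/label paths and map class indices
# to names; if this mapping doesn't match the label files, the model can
# collapse to predicting only background.
model.train(
    data="bdd100k.yaml",
    epochs=20,      # placeholder training budget, not from this report
    imgsz=640,
    batch=16,
    freeze=10,      # optionally freeze early backbone layers for a gentler fine-tune
)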

Due to this unexpected issue, I am currently a bit behind schedule, as I had anticipated finishing the fine-tuning by this point. However, I expect to be back on track this week once the issue is resolved, since my remaining action items are to integrate the model outputs with the rest of the components, which can be done regardless of whether the new models are ready. Additionally, I have implemented most of the pipelines for our model evaluation and training, and am slightly ahead of schedule in that regard relative to our Gantt chart.

This week, I hope to begin coordinating efforts to integrate the object detection models’ output into the navigation modules on the hardware, as well as to resolve the current issues with model fine-tuning. Specifically, I plan to start on the glue code that will be required to pass control between our modules.

Andrew Wang’s Status Report – 02/22/2025

This week, I was able to begin evaluating and fine-tuning a few out-of-the-box YOLO object detection models. More specifically, I used YOLOv8x, a large, high-performance model trained on the COCO dataset.

For evaluation, we were advised to be wary of the object detection models’ robustness on out-of-distribution data, as previous teams have run into difficulty when trying to use such models in a real-world setting. Since the model’s validation metrics on the COCO dataset are already available online, I decided to use the validation set of the BDD100k dataset to measure the performance decay on an out-of-distribution dataset and mimic real-world conditions.
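As a rough illustration, the evaluation boils down to a call like the one below, assuming the BDD100k validation split has been converted to YOLO format and described by a hypothetical "bdd100k.yaml" config; ultralytics' val() reports the mAP numbers and also writes out the confusion matrix and PR curves discussed below.

from ultralytics import YOLO

# Out-of-the-box COCO-pretrained checkpoint, evaluated as-is.
model = YOLO("yolov8x.pt")

# Validate against the BDD100k validation split; "bdd100k.yaml" is assumed
# to point at the downloaded images/labels in YOLO format.
metrics = model.val(data="bdd100k.yaml", split="val", imgsz=640)

print(metrics.box.map)    # mAP@0.5:0.95 averaged over classes
print(metrics.box.map50)  # mAP@0.5
print(metrics.box.maps)   # per-class mAP, useful for the per-label comparison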

So far, it appears that the out-of-the-box model does reasonably well on the new out-of-distribution dataset. I first generated a confusion matrix to examine how well the model does on each class. Note that our evaluation dataset only contains the first 10 labels of the YOLO model, so only the top-left block of the matrix should be considered in our evaluation:

It appears that the model mistakenly assigns a “background” label to some objects that should have been classified as another item on the road, which is especially troublesome for our use case. Besides this, the accuracy appears reasonable, with some notable off-target predictions. I also generated a precision-recall curve across the different classes:

It appears that the model struggles most with identifying traffic lights and trains. However, in our road-crossing use case, these two categories are less important to detect than the others, so I’m not personally too worried about this. Overall, the mAP metrics across the other labels seem reasonable compared to the reported mAP metrics of the same models on the COCO dataset. Considering that these models weren’t trained on the BDD100k dataset, I’m cautiously optimistic that they could perform well in our testing as is, even without extensive fine-tuning.

Finally, I generated a few example images with the model predictions overlaid to visually depict what the model is doing. Here is an example:
[Figure: example validation images with reference labels (top) and model predictions (bottom)]

The top picture shows the images with the reference labels, and the bottom picture shows the same images with our model predictions overlaid. On the top row, the second image from the left stood out to me, since our model detected trains where there weren’t any. This might be an interesting starting point for digging into why our model does so poorly at detecting trains, although given that we have established that trains aren’t as important in our use case, we may not need a detailed analysis if time is tight.
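For completeness, here is a small sketch of how these overlay images can be produced with ultralytics’ built-in plotting helper; the image paths below are placeholders, not the actual validation files.

from ultralytics import YOLO
import cv2

model = YOLO("yolov8x.pt")

# Run the model on a handful of validation images (placeholder paths) and
# save annotated copies with the predicted boxes and labels drawn on top.
results = model(["samples/val_0001.jpg", "samples/val_0002.jpg"])
for i, r in enumerate(results):
    annotated = r.plot()                     # BGR array with predictions overlaid
    cv2.imwrite(f"predictions_{i}.jpg", annotated)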

With regards to progress, I believe I am about on track per our Gantt chart; I have completed a preliminary evaluation of the object detection models, and I have also started implementing a fine-tuning pipeline so that we can train the out-of-the-box models on additional datasets.

Next week, I plan on moving into the second part of my deliverables: writing a pipeline to handle the outputs from our model for navigation. I plan to brainstorm how to make proper use of the detection model outputs, and how they should be integrated into the larger navigation module we have planned. I also plan on gathering more datasets so that I can use the fine-tuning pipeline I have already implemented to develop even better object detection models, giving us a wider array of options when we are ready to integrate the project.
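Since this integration is still at the planning stage, the following is only a rough sketch of what handing detections off to the navigation module might look like; the Detection dataclass and the function name are hypothetical, not part of our current codebase.

from dataclasses import dataclass
from ultralytics import YOLO

@dataclass
class Detection:
    label: str
    confidence: float
    box_xyxy: tuple  # (x1, y1, x2, y2) in pixels

model = YOLO("yolov8x.pt")

def detections_for_navigation(frame):
    """Run the detector on one frame and return a plain list that the
    navigation module can consume without depending on ultralytics types."""
    result = model(frame)[0]
    return [
        Detection(
            label=result.names[int(cls)],
            confidence=float(conf),
            box_xyxy=tuple(xyxy.tolist()),
        )
        for cls, conf, xyxy in zip(
            result.boxes.cls, result.boxes.conf, result.boxes.xyxy
        )
    ]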

Andrew Wang’s Status Report for 2/15/2025

Late this week, I was able to gain access to a new computing cluster with more storage and GPU availability. As such, I began downloading an open-source object detection dataset, BDD100K, from Kaggle onto the cluster for evaluation/fine-tuning. After all of the images were downloaded (the version I downloaded has 120,000+ images), I was able to start working on the implementation of the evaluation/fine-tuning pipeline, although this is still a work in progress.
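For reference, the download itself can be scripted through the Kaggle API along the lines below; the dataset slug and target directory are placeholders rather than the exact ones I used, and Kaggle credentials (~/.kaggle/kaggle.json) must already be configured on the cluster.

from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()

# Fetch and unzip the dataset directly onto the cluster's storage.
api.dataset_download_files(
    "owner/bdd100k-dataset",   # placeholder slug, not the actual dataset name
    path="/data/bdd100k",      # placeholder cluster storage path
    unzip=True,
)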

With regards to schedule, I believe I am slightly behind. Due to some issues with gaining access to the cluster and the download time required to fetch such a large dataset, I was unable to work on this until the latter half of the week, which I had not anticipated. I would have liked to finish the evaluation/fine-tuning implementation by this week, so I anticipate having to put in a bit of extra work this week to catch up and have a few different versions of the model ready to export to our Jetson Nano.

By the end of this week, I hope to have completed the evaluation/fine-tuning pipelines. More specifically, I would like to have concrete results evaluating a few out-of-the-box YOLO models on accuracy and other metrics, and hopefully to have fine-tuned a few models for evaluation as well.

Andrew Wang’s Status Report for 2/8/2025

This week, I began looking into different object detection algorithms that we can use in the first iteration of our implementation. Specifically, I installed a pre-trained YOLOv8 model from the “ultralytics” YOLO package and was able to get it working on a CMU computing cluster. Since rigorous evaluation and fine-tuning of the models will be necessary for integration, I’m planning to begin implementing a fine-tuning and evaluation pipeline in the next few days to measure model performance on unseen data, such as street-scene datasets like BDD100K, EuroCity Persons, and Mapillary Vistas. Unfortunately, these datasets are far too large to store on the clusters I currently have access to, so I am working on obtaining access to alternative computing resources, which should be approved in the next few days.
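As a quick illustration of this setup, getting the pretrained model running amounts to something like the snippet below; the checkpoint name and image path are placeholders rather than the exact ones I used.

from ultralytics import YOLO

# Pull a pretrained COCO checkpoint through the ultralytics package and run a
# quick single-image inference as a sanity check on the cluster.
model = YOLO("yolov8n.pt")            # placeholder checkpoint; downloads on first use
results = model("street_scene.jpg")  # placeholder test image
results[0].save("street_scene_pred.jpg")  # write the annotated output (no display needed)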

With regards to progress, I believe I am about on schedule. Based on our Gantt chart, we have specifically set aside the upcoming two weeks to evaluate and handle the ML side of our project, and I am optimistic that we can get this done in that time, since the models themselves can be fine-tuned to whatever degree our constraints allow.

By the end of next week, I hope to have completed the download of the image datasets and finished a preliminary evaluation of the YOLOv8 model. We may also consider using different object detection models, although this is something we will weigh more seriously once we get the first results from our YOLOv8 model.