Max Tang’s Status Report for 3/22/2025 – Team D3: The Self-driving human

The walk sign image classification model is in a near-finalized state, so I can begin to transition away from optimizing its performance. Since last week, I performed some hyperparameter optimization and tried adding layers to the ResNet model to increase its accuracy. Specifically, I tried changing the size of the dense linear layers, the number of training epochs, and the activation functions, and I tried adding extra linear and pooling layers. However, none of these helped as much as simply adding more of the training data that I’ve been continuously collecting. I also removed the validation dataset and divided its images between the training and testing datasets, since I did not find the validation dataset useful and benefited more from having more data to train and test with. Current test accuracy is around 80%, which is not as high as desired. The good news is that most of the errors occurred when the model predicted “stop” for an image that was actually “go”. This is a much safer failure than predicting “go” when the image is “stop”. I did not purposefully design the model to be more cautious when predicting “go”, and this bias appears to be a coincidence, but I have realized that it is something I could deliberately add. It would not necessarily require a change to the model and could be done in a post-processing step instead.
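A minimal sketch of what such a post-processing step might look like, assuming the model can output a probability for the “go” class. The function name `cautious_label` and the threshold value of 0.8 are hypothetical and are not taken from the actual project; the threshold would need to be tuned on real data.

```python
def cautious_label(p_go: float, go_threshold: float = 0.8) -> str:
    """Return "go" only when the model is confident, otherwise "stop".

    p_go is assumed to be the model's predicted probability of "go".
    go_threshold is an illustrative value, not a tuned one.
    """
    if not 0.0 <= p_go <= 1.0:
        raise ValueError("p_go must be a probability in [0, 1]")
    # A threshold above 0.5 makes uncertain predictions fall back to "stop",
    # the safer of the two possible mistakes.
    return "go" if p_go >= go_threshold else "stop"
```

With this rule, a prediction such as `cautious_label(0.6)` is reported as “stop” even though “go” is the more likely class, trading some missed “go” signals for fewer false “go” signals.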

The next step is to implement the logic that takes video input, feeds frames into the model at some frequency, and returns the result, using a sliding window for both the input and the output. I plan to begin working on this next week.
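One possible shape for this logic is sketched below. It is only an illustration under stated assumptions: frames are sampled every `sample_every` frames, each sampled frame is classified by a caller-supplied `classify` function that returns “go” or “stop”, and the output is smoothed with a vote over the most recent `window` predictions. All names and default values here are hypothetical, and the sketch covers only the output window, not a windowed input.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def smoothed_predictions(
    frames: Iterable[Frame],
    classify: Callable[[Frame], str],
    sample_every: int = 5,
    window: int = 5,
    min_go_votes: int = 4,
) -> Iterator[str]:
    """Yield one smoothed label per sampled frame.

    "go" is reported only when at least min_go_votes of the last
    `window` raw predictions were "go"; otherwise "stop" is reported.
    """
    recent = deque(maxlen=window)  # oldest prediction is dropped automatically
    for i, frame in enumerate(frames):
        if i % sample_every != 0:
            continue  # skip frames between sampling points
        recent.append(classify(frame))
        go_votes = sum(1 for label in recent if label == "go")
        yield "go" if go_votes >= min_go_votes else "stop"
```

Because the window starts empty, the first few outputs default to “stop” until enough “go” votes accumulate, which matches the cautious behavior described above. In practice `frames` could come from any video source that yields frames one at a time.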
