This week our group had to face some concerning realizations, chief among them that the originally planned CNN model struggles to learn. One thing I attempted was reducing the input packet from a stack of 5 images down to a single frame, so that the model was certainly being trained on one image at a time. The same issue arose as with the previous model: within the first 100 iterations it latches onto some arbitrary solution and never actually moves toward the true local minimum.

Another way I tried to solve this problem and widen the search for the true minimum was to change the optimizer from Adam to SGD. Unfortunately, neither optimizer helped the model converge to a real solution with either packet size. After these optimizer changes I was feeling pretty hopeless, because I am simply following the architecture provided in the paper; I don't have much insight into how the model is actually learning, so I cannot make any informed changes to the architecture.

Our plan moving forward is to use the model provided with the paper essentially as a library and pass our image frames to it as the paper intended. The issue with that method is that the original model architecture was written in Lua, while all of our code is written in Python since we were intending to use PyTorch. To use the model like a library, we would have to either run it entirely in post-processing or use file I/O to write each frame to and from disk. It would be easier to do this entirely in post, but the intended functionality is to use I/O to write the frames and video to the SD card.
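For reference, here is a minimal sketch of the two experiments described above: shrinking the input packet from a 5-image stack to a single frame, and swapping Adam for SGD. The layer sizes, learning rates, and the tiny placeholder network are all illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the paper's CNN; the real architecture lives in
# our training code. in_channels=1 reflects the single-frame experiment
# (it was 5 when we trained on a stack of 5 images).
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=32, kernel_size=5, stride=2),
    nn.ReLU(),
    nn.Flatten(),
    nn.LazyLinear(1),  # placeholder head; output size depends on the task
)

# The optimizer swap we tried: Adam -> plain SGD.
# Learning rates here are illustrative, not the values from our runs.
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```

And a rough sketch of what the file-I/O bridge to the Lua model might look like: Python writes each frame to disk, invokes the Lua/Torch script as a black box, and reads the outputs back. The mount paths, the `eval.lua` entry point, and its command-line flags are assumptions for illustration; only the `th` launcher is standard Torch7.

```python
import subprocess
from pathlib import Path

from PIL import Image

FRAME_DIR = Path("/mnt/sdcard/frames")    # hypothetical SD-card mount point
OUTPUT_DIR = Path("/mnt/sdcard/outputs")  # hypothetical output location

def run_lua_model(frames):
    """Write frames to disk, run the paper's Lua model as a black box,
    and collect whatever files it writes back. 'eval.lua' and its flags
    are placeholders for the paper's actual entry-point script."""
    FRAME_DIR.mkdir(parents=True, exist_ok=True)
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    for i, frame in enumerate(frames):  # frames: list of PIL Images
        frame.save(FRAME_DIR / f"frame_{i:05d}.png")
    subprocess.run(
        ["th", "eval.lua", "--input", str(FRAME_DIR), "--output", str(OUTPUT_DIR)],
        check=True,
    )
    return sorted(OUTPUT_DIR.glob("*.png"))
```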

