This status report is split into two parts: the first covers the week of 10/31-11/6, since my status report was missing for that week, and the second covers the week of 11/7-11/13.
For the week beginning on 10/31, I was extremely sick from 10/31-11/3 and could not get much work done or communicate with my teammates. When I got back on Thursday, my group and I got together to reevaluate our situation and to prepare our interim demo presentation. I was responsible for the software demonstration, which involved showing our working software model that could upscale videos from our dataset. I had several demonstrations in mind. The first would output a real-time side-by-side comparison of bicubic upscaling, CNN upscaling, and the original native-resolution video. Another involved a live webcam feed, showing an upscaled version of the webcam stream in real time using a CNN with fewer filters. The point was to demonstrate that although real-time upscaling is possible in software on a GPU, it still falls short of the CNN model with more filters, which cannot run in real time in software but can on our hardware/FPGA implementation. The webcam demonstration did not end up working, so I resorted to upscaling still images and using the difference in SSIM to show the effect of the number of filters our CNN implementation was trained with.
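For reference, here is a minimal sketch of how that side-by-side comparison could be wired up with OpenCV. The clip paths and the cnn_upscale stand-in are hypothetical; the stand-in falls back to bicubic so the script runs without our trained model in place.

```python
import cv2
import numpy as np

SCALE = 2  # upscaling factor (assumed)

def cnn_upscale(frame, size):
    """Stand-in for our trained CNN's inference call (hypothetical).
    Falls back to bicubic so the script runs; replace with the real model."""
    return cv2.resize(frame, size, interpolation=cv2.INTER_CUBIC)

cap_low = cv2.VideoCapture("low_res_clip.mp4")    # downscaled input (hypothetical path)
cap_native = cv2.VideoCapture("native_clip.mp4")  # native-resolution clip (hypothetical path)

while True:
    ok_low, low = cap_low.read()
    ok_native, native = cap_native.read()
    if not (ok_low and ok_native):
        break
    h, w = native.shape[:2]
    bicubic = cv2.resize(low, (w, h), interpolation=cv2.INTER_CUBIC)
    cnn = cnn_upscale(low, (w, h))
    # Stack the three frames horizontally: bicubic | CNN | native
    panel = np.hstack([bicubic, cnn, native])
    cv2.imshow("bicubic | CNN | native", panel)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap_low.release()
cap_native.release()
cv2.destroyAllWindows()
```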
For the week beginning on 11/7, we started the week off with our interim demo. I demonstrated a clear difference in SSIM between a standard bicubic interpolation upscaling method and our software-based CNN upscaling method on a video from our dataset. Due to throttling problems, I had to show videos I had previously upscaled instead of running my model live.
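The SSIM comparison itself is simple to reproduce. Below is a minimal sketch with scikit-image, assuming the upscaled and native still images are already the same resolution; the file names are hypothetical.

```python
import cv2
from skimage.metrics import structural_similarity

# Load the native image and the two upscaled candidates (hypothetical file names).
native = cv2.imread("native.png", cv2.IMREAD_GRAYSCALE)
bicubic = cv2.imread("bicubic_upscaled.png", cv2.IMREAD_GRAYSCALE)
cnn = cv2.imread("cnn_upscaled.png", cv2.IMREAD_GRAYSCALE)

# SSIM is computed against the native image; higher means closer to ground truth.
ssim_bicubic = structural_similarity(native, bicubic)
ssim_cnn = structural_similarity(native, cnn)
print(f"bicubic SSIM: {ssim_bicubic:.4f}")
print(f"CNN SSIM:     {ssim_cnn:.4f}")
```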
I did some research to see if a dataset of still images would work better than our initial dataset of frames extracted from videos. The reason is that our models seem to output blurrier frames and videos; this is subjectively better than the artifact-heavy output of bicubic upscaling, and objectively better in terms of SSIM, but it is still unusual. I hypothesized that because many of the frames in our dataset are blurry, the CNN may have learned to output blurry regions even where the native frames are sharp. After discussing with the team and examining our timeline, I decided to explore other datasets, ones consisting of still images that were sharp natively. We were able to fit this in because our training infrastructure was already set up on Google Colab and my local machine. I decided on a dataset used by another open-source upscaling implementation, although that one was aimed at single-image super-resolution rather than video super-resolution. Using the same hyperparameters and after 1-2 days of training, the initial results yielded higher SSIMs on still images, but when tested on our video dataset, artifacting was much more apparent and the SSIM fell short of my initial model. So I ended up dropping that idea and continuing to train our previous implementation.
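Evaluating on the video dataset amounted to averaging per-frame SSIM between an upscaled clip and the native clip. A minimal sketch of that loop follows; the clip paths are hypothetical.

```python
import cv2
from skimage.metrics import structural_similarity

def mean_video_ssim(native_path, upscaled_path):
    """Average per-frame SSIM between a native clip and an upscaled clip."""
    cap_a = cv2.VideoCapture(native_path)
    cap_b = cv2.VideoCapture(upscaled_path)
    scores = []
    while True:
        ok_a, frame_a = cap_a.read()
        ok_b, frame_b = cap_b.read()
        if not (ok_a and ok_b):
            break
        # Compare grayscale frames to keep the metric simple.
        gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
        gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
        scores.append(structural_similarity(gray_a, gray_b))
    cap_a.release()
    cap_b.release()
    return sum(scores) / len(scores) if scores else float("nan")

print(mean_video_ssim("native_clip.mp4", "cnn_upscaled_clip.mp4"))
```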
I also discussed with James the current limitations and bottlenecks of our hardware approach, in particular whether anything could be tested on the software side that would help with the runtime of the hardware model. He suggested minor tweaks to the hyperparameters, specifically the number of filters in our second CNN layer, which could massively decrease the implementation's runtime with only a very minor loss in upscaled video quality. I also discussed potential issues with the processing of each pixel, specifically the encoding involved: the paper we followed processes only one channel of data, the luminance (Y) portion of the YCbCr encoding. Other encoding choices could potentially reduce the amount of data being processed on our FPGA, which was significant because James identified it as a potential bottleneck for our system.
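To make both ideas concrete, here is a rough sketch assuming an SRCNN-style three-layer network (our actual architecture may differ). n2 is the second-layer filter count James suggested shrinking, and only the Y channel of the YCbCr-converted frame goes through the network, with Cb and Cr upscaled bicubically.

```python
import cv2
import numpy as np
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    """Three-layer SRCNN-style model; n2 is the second-layer filter count."""
    def __init__(self, n1=64, n2=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, n1, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv2d(n1, n2, kernel_size=5, padding=2), nn.ReLU(),  # shrinking n2 cuts most of the compute
            nn.Conv2d(n2, 1, kernel_size=5, padding=2),
        )

    def forward(self, y):
        return self.net(y)

def upscale_frame(frame_bgr, model, scale=2):
    """Run the CNN on the Y channel only; Cb/Cr are bicubic-upscaled."""
    h, w = frame_bgr.shape[:2]
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    # Bicubic-upscale everything first; the network refines the bicubic baseline.
    up = cv2.resize(ycrcb, (w * scale, h * scale), interpolation=cv2.INTER_CUBIC)
    y = up[:, :, 0].astype(np.float32) / 255.0
    with torch.no_grad():
        y_in = torch.from_numpy(y)[None, None]           # shape (1, 1, H, W)
        y_out = model(y_in).clamp(0, 1)[0, 0].numpy()
    up[:, :, 0] = (y_out * 255.0).astype(np.uint8)
    return cv2.cvtColor(up, cv2.COLOR_YCrCb2BGR)

model = SRCNN(n2=16)  # e.g., try a smaller second layer for the FPGA budget
```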
Minor additions: I added content to our website and got a start on our final report, so as to avoid repeating our situation with the previous design report. It was also helpful to write down exactly what changes we had put into place since the design report, to double-check that our project has stayed on track and has not deviated from our initial requirements.