Team Status Report for 12/04

James continued working on squeezing performance out of the FSRCNN model, but ran into diminishing returns. Because the weights are fixed, he was able to make some additional improvements to memory accesses. Integration with the host side introduced further slowdowns, so after considering ways to address this, he settled on a multikernel approach and began writing it. He expects to finish implementing it by the end of the week of 11/29.
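
As a rough illustration of the "fixed weights" idea, the sketch below shows one way trained weights could be quantized and emitted as compile-time constants for the kernel. This is a sketch only: the 16-bit format with 12 fractional bits, the names, and the header layout are assumptions for illustration, not the exact flow James is using.

```python
import numpy as np

FRAC_BITS = 12  # assumed fixed-point format: 16-bit signed, 12 fractional bits

def to_fixed(weights: np.ndarray) -> np.ndarray:
    """Quantize floating-point weights to signed 16-bit fixed point."""
    scaled = np.round(weights * (1 << FRAC_BITS))
    return np.clip(scaled, -32768, 32767).astype(np.int16)

def write_header(name: str, weights: np.ndarray, path: str) -> None:
    """Emit the quantized weights as a const C array the kernel can compile in."""
    fixed = to_fixed(weights).flatten()
    with open(path, "w") as f:
        f.write(f"// auto-generated: {name}, {FRAC_BITS} fractional bits\n")
        f.write(f"static const short {name}[{fixed.size}] = {{\n")
        f.write(",\n".join(str(v) for v in fixed))
        f.write("\n};\n")

if __name__ == "__main__":
    # Random values standing in for one trained FSRCNN layer (placeholder only).
    demo = np.random.uniform(-1.0, 1.0, size=(32, 5, 5)).astype(np.float32)
    write_header("feature_extract_w", demo, "feature_extract_w.h")
```

Because the weights are known at synthesis time, the tools can keep them in on-chip memory instead of fetching them from off-chip, which is the kind of memory-access improvement described above.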

Team Status Report for 11/20

This week, our group spent a lot of time getting a working version of our upscaling model onto the FPGA. Since we had several delays the previous week, we had to work hard to make up for lost time, and we are making a lot of progress each day in order to meet our deadlines.

James worked on optimizing latency on the U96 board and on improving the FPGA architecture of the CNN. He ran into several problems near the beginning of the week, but quickly caught up and is now in the middle of addressing the various limiting factors that are preventing our upscaling model from working.

Joshua addressed unexpected problems with the end-to-end latency on the U96 board by training a smaller model from scratch. He was also in charge of starting the final presentation, since he will be presenting, and continued work on the final report. He also did further research on a case for the U96 board; James had an initial design for one last week, but it was for an older generation of the board.

Kunal continued working on the I/O portion of the board, looking at the different ways frames can be passed in to speed up the implementation.
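
One possible host-side input path is sketched below, purely for illustration: OpenCV decoding and contiguous frame buffers are assumptions rather than the chosen design, and the file name is a placeholder.

```python
import cv2
import numpy as np

def frame_generator(video_path: str):
    """Yield decoded frames as contiguous uint8 arrays, ready to hand to the accelerator."""
    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = cap.read()  # BGR frame of shape (height, width, 3)
            if not ok:
                break
            yield np.ascontiguousarray(frame)
    finally:
        cap.release()

if __name__ == "__main__":
    # "input_360p.mp4" is a placeholder file name.
    for i, frame in enumerate(frame_generator("input_360p.mp4")):
        # Stand-in for the real transfer to the upscaling kernel.
        print(f"frame {i}: shape={frame.shape}, {frame.nbytes} bytes")
        if i == 2:
            break
```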

Next week, we are looking to have a final product working, even if it doesn’t fully meet our initial requirements. From the work we’ve put in over the last few months, it seems that some trade-off is inevitable, so this coming week we will pinpoint exactly which trade-offs we are willing to make and aim to have at least a semi-working demo ready for our final presentation on 11/29.

Team Status Report for 10/30

This week, I (Kunal), along with James, worked on the I/O portion of the project. James helped me with host-side programming for the U96 board and provided various resources to look at so that I could get going with the implementation.

James attempted to make further gains on the CNN kernel, but was less successful this week. However, he has worked out various strategies that should speed up our implementation and has been working on implementing them. He also worked with Joshua to research pre-trained models, and set up a Git repository to help Kunal with his portion of the project.

Joshua researched pre-trained models along with James and also attempted to get AWS up and running. Our request came back only partially fulfilled, as the vCPU limit we were granted was too low for our preferred instance type, so after testing Google Colab and purchasing Colab Pro, he decided to use that instead to speed up the training process.

In terms of the whole project, we almost have a working CNN model. Training should be done around Wednesday or Thursday this week; James and I will work extensively on CNN acceleration before then, and then take the weights/hyperparameters from our software model and implement them on the hardware over Friday and the weekend. Overall, the project is around a week behind, but we are confident about the remaining work, since we have built in enough slack time and have addressed the issues that were preventing the project from moving forward.

Team Status Report for 10/23/21

Last week we mainly focused on writing the design review report. To address the elephant in the room: we know our submission was nowhere near as polished as it could have been, and frankly was not even complete in some parts. This means we will have to pick up more work leading into the final report to make sure it is done well, and we fully intend to do so; we do not want to submit something of such sub-par quality again.

As for this week:

James focused mainly on improving CNN performance, with only marginal gains so far. More details are included in his status report.

Joshua focused on refining the software implementation of the project and ironing out bugs, as well as sorting out issues with training due to problems with AWS.

Kunal helped with improving CNN performance, and also spent time acquainting himself with some of the content from reconfig, which James is currently taking but Kunal is not.

Overall, the project is about a week behind according to the Gantt chart, but this is not a concern, since we left two extra weeks to address unexpected issues in our project’s development. Much of the delay came down to our other courses ramping up in time commitment and all members having to focus on other things, but in the end we made steady progress, and we are still on track to finish the project on time.

Team Status Report for 10/2/21

This week, we met with Byron and Joel again to discuss our project, specifically to address the feedback from our proposal presentation and to follow up on our initial meeting. During the meeting, we addressed the concerns about using VMAF as a metric for our training, as well as our dataset and some other things that weren’t fully justified during our presentation. Byron commented that we have to make sure implementing a CNN is actually better than traditional DSP methods, and that implementing something much harder is still the best choice. To that end, we benchmarked both VMAF and Anime4K, a project on Github that does similar upscaling specifically for animation, and we obtained concrete, quantitative measurements that we can elaborate on in our design presentation to fully justify our design choice.
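
For reference, this is roughly how an upscaled clip can be scored against its native-resolution original using ffmpeg’s libvmaf filter, which is the kind of measurement we collected; the file names are placeholders, and the sketch assumes an ffmpeg build with libvmaf enabled.

```python
import subprocess

def vmaf_log(distorted: str, reference: str) -> str:
    """Run ffmpeg's libvmaf filter (first input = distorted, second = reference) and return its log."""
    cmd = [
        "ffmpeg", "-i", distorted, "-i", reference,
        "-lavfi", "libvmaf", "-f", "null", "-",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stderr  # ffmpeg reports the VMAF score on stderr

if __name__ == "__main__":
    # Placeholder file names: an upscaled clip vs. the native 1080p original.
    print(vmaf_log("upscaled_1080p.mp4", "native_1080p.mp4"))
```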

Joel also raised a good point that upscaled, lower-resolution videos compared against original, native-resolution videos would always result in a lower score, and we addressed that by limiting our training to comparisons between videos that have been upscaled to the native resolution, e.g. 1080p against 1080p. We also talked about the importance of benchmarking as soon as possible, which we successfully did this week.

Although our team members were slightly overwhelmed by work from other classes throughout the week, we managed to catch up sufficiently by meeting after class hours and communicating to make sure our tasks were still completed on time. James and Kunal continued their research on I/O and calculated specific quantitative measurements to put in our design presentation, and I continued my research into VMAF as well as the model we will use for training our upscaler. Referring back to our Gantt chart/schedule, we were slightly behind on developing the Python code for training our own CNN, as we only received AWS credits Friday morning, but we used that surplus time efficiently by benchmarking locally and researching Anime4K in more detail. As per the feedback from Tamal, we are taking more seriously the risk of our CNN not working or not being developed in time on the software side; our backup plan is to use the CNN implemented in Anime4K and start implementing that on hardware if we cannot get our own working on the software side after Week 7. We’ve updated our schedule/Gantt chart to reflect that.

Looking further into the peer/instructor feedback, we saw that there were a lot of comments about the absence of justification for our FPGA selection during the proposal presentation. We have focused on elaborating on that choice much more for our design presentation, and we are similarly going into a lot more detail in our software section, as well as in our quantitative requirements.

Overall, despite some things not going as planned this week, we believe our team was very successful in overcoming the problems we encountered, and our initial planning, which allowed for slack time and small delays, proved useful. We look forward to delivering a well-prepared presentation on Monday that addresses all the feedback from our previous one, and to continued progress on our project.

Team Status Report for 9/25

This week, we followed our schedule/Gantt chart and attempted to do everything on our list. Almost every task was accomplished successfully, with details listed in our personal status reports; the only hiccup was our AWS setup – the credits will be acquired by next week, and we have already looked into which AWS instance type would be most suitable for our group.

During class time, our entire group also performed peer reviews of all our classmates. We decided that Kunal will present the second presentation, and we aim to address all of the concerns raised during the Q/A session of our first presentation, as well as all feedback we received from TAs/professors through Slack.

As a team, we further discussed the model for our project, as that is a core part of the upscaling process. Reflecting on the feedback from Slack, and following our schedule, we also decided on a specific dataset and acquired the videos from an online database that provides them.

Looking toward next week, we are on track according to our schedule and optimistic about continuing our positive trajectory. Next week, we will begin implementing I/O with our acquired hardware and preparing for our design presentation. We will also begin writing the Python code for the training part of our project.
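
As a starting point for that training code, below is a minimal sketch of an FSRCNN-style super-resolution model of the kind we are targeting. PyTorch is an assumption here rather than a settled choice, and the layer sizes (d=56, s=12, m=4) are the defaults from the original FSRCNN paper, not our final hyperparameters; like the paper, it operates on the luma channel only.

```python
import torch
import torch.nn as nn

class FSRCNN(nn.Module):
    """FSRCNN: feature extraction -> shrink -> mapping -> expand -> deconvolution."""
    def __init__(self, scale: int = 3, d: int = 56, s: int = 12, m: int = 4):
        super().__init__()
        layers = [nn.Conv2d(1, d, kernel_size=5, padding=2), nn.PReLU(d),  # feature extraction
                  nn.Conv2d(d, s, kernel_size=1), nn.PReLU(s)]             # shrinking
        for _ in range(m):                                                  # non-linear mapping
            layers += [nn.Conv2d(s, s, kernel_size=3, padding=1), nn.PReLU(s)]
        layers += [nn.Conv2d(s, d, kernel_size=1), nn.PReLU(d)]             # expanding
        self.body = nn.Sequential(*layers)
        self.upscale = nn.ConvTranspose2d(d, 1, kernel_size=9, stride=scale,
                                          padding=4, output_padding=scale - 1)

    def forward(self, x):
        return self.upscale(self.body(x))

if __name__ == "__main__":
    model = FSRCNN(scale=3)
    lowres = torch.randn(1, 1, 90, 160)   # a single low-resolution luma patch
    print(model(lowres).shape)            # torch.Size([1, 1, 270, 480])
```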

Team Status Report for 9/18/21

This week, my group met with our TA, Joel, and Byron to refine our abstract and pinpoint the finer details of our implementation. Taking on their recommendations, we decided to change our use case, specifically to security/video streaming. Being able to upscale videos on demand in real time allows for greater security and better decision-making when the user is presented with potential threats, and client-side upscaling of videos can also make up for a poor internet connection. After considering the throughput and the use case, we also decided to go with 24fps as a target instead of 60fps, as this is more realistic while still being perceivable as fluid video to the user.
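
To put that target in perspective, a quick back-of-the-envelope calculation (assuming a 1080p output, which is not yet finalized) gives the per-frame budget:

```python
WIDTH, HEIGHT, FPS = 1920, 1080, 24   # assumed 1080p output at the 24fps target

pixels_per_second = WIDTH * HEIGHT * FPS
frame_budget_ms = 1000 / FPS

print(f"{pixels_per_second / 1e6:.1f} Mpixels/s of output")  # ~49.8 Mpixels/s
print(f"{frame_budget_ms:.1f} ms per frame")                 # ~41.7 ms per frame
```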

As Byron suggested, we examined existing upscaling methods used in browsers in more depth and read up on some DSP literature that he sent us. We decided that neural-net methods were indeed more useful for our implementation, and we are in the process of figuring out how to fit such an architecture onto an FPGA.

For the coming week, we will further develop our schedule and confirm how we will procure the key components of our project. We will also set up team infrastructure such as Github to ensure we can coordinate our progress better. Overall, we are on schedule and ready to progress to next week.