Mark’s Status Report for March 14th – Team A7: Scalable Machine Learning Using FPGAs

This week, I started working on a helper function in our API that calculates the size of a user-provided model. While working on it, I realized that it would be extremely difficult to implement using PyTorch's built-in API alone. Because our hardware storage space is fairly limited (~30MB), we need to know the size of each model so that the Workload Manager does not send more models to a particular board than it can hold. Initially, I was stuck, since PyTorch does not provide an API for calculating the size of a model. Luckily, a third-party package called ‘torchsummary’ exists, and I will look into it this coming week in order to finish the size calculation function.
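To make the size check concrete, here is a minimal sketch of the kind of calculation involved. It is not our actual helper: the names `param_bytes`, `fits_on_board`, and `BOARD_CAPACITY_BYTES` are hypothetical, the 30 MiB capacity is my reading of the ~30MB figure, and the example assumes 4-byte (float32) parameters. It counts parameter storage only, so it is a lower bound on what a model really needs. With PyTorch, the shapes could be collected as `[p.shape for p in model.parameters()]`.

```python
from math import prod

# Assumption: "~30MB" of board storage is treated as 30 MiB here.
BOARD_CAPACITY_BYTES = 30 * 1024 * 1024


def param_bytes(shapes, bytes_per_element=4):
    """Bytes needed to store parameters with the given shapes.

    bytes_per_element defaults to 4, i.e. float32 (an assumption).
    """
    return sum(prod(shape) * bytes_per_element for shape in shapes)


def fits_on_board(model_bytes, used_bytes=0, capacity=BOARD_CAPACITY_BYTES):
    """True if the model still fits next to what the board already holds."""
    return used_bytes + model_bytes <= capacity


# Hypothetical two-layer MLP (784 -> 128 -> 10): weights and biases.
mlp_shapes = [(128, 784), (128,), (10, 128), (10,)]
mlp_bytes = param_bytes(mlp_shapes)
print(mlp_bytes, fits_on_board(mlp_bytes))
```

A check like `fits_on_board` is what the Workload Manager would need before assigning another model to a board; the real function will also have to account for anything beyond raw parameters that the board must store.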

Additionally, I set up a benchmarking suite for the GPU. However, due to the recent outbreak, I was unable to test it on the originally planned GPU (NVIDIA 1080) and instead used an NVIDIA GeForce RTX 2060 SUPER. Some of the models also had incorrect parameters, which resulted in an incomplete run of the benchmarking suite, so the numbers will be updated soon. I also slightly reworked the skeleton of the user code to make it easier to use.
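One way to keep a single misconfigured model from cutting a benchmarking run short is to record the failure and move on. The sketch below illustrates that idea only; it is not our suite. `run_benchmarks`, `fake_train`, and the model dictionaries are hypothetical stand-ins. When timing real GPU training with PyTorch, `torch.cuda.synchronize()` should be called before reading the timer, since CUDA work is launched asynchronously.

```python
import time


def run_benchmarks(models, train_fn):
    """Time train_fn on each named model; record failures instead of aborting."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        try:
            train_fn(model)
        except Exception as exc:
            # Keep going so one bad model does not end the whole run.
            results[name] = {"ok": False, "error": repr(exc)}
            continue
        results[name] = {"ok": True, "seconds": time.perf_counter() - start}
    return results


def fake_train(model):
    """Hypothetical stand-in for a training routine."""
    if model["hidden"] <= 0:
        raise ValueError("hidden size must be positive")


demo_models = {"mlp_small": {"hidden": 64}, "mlp_broken": {"hidden": 0}}
demo_results = run_benchmarks(demo_models, fake_train)
for name, result in demo_results.items():
    print(name, result)
```

With this structure, the models that do run still produce numbers, and the failing ones are listed with their error for later fixing.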

Overall, I would say I am on schedule.

This coming week, I plan on fixing the GPU benchmarking bug, which will allow me to train the full set of defined models and in turn provide another cost-effectiveness value that we can use to validate our final design. Additionally, I plan on finishing the helper function that calculates the ML model size and writing some additional helper functions for the Python API.
