Over the last two weeks, I worked a lot on the FPGA development, which has led to significant changes in our overall project. The first thing I realized, after many attempts, was that our pose estimation model (MediaPipe) is not easily quantizable and deployable to my FPGA's DPU. I was able to deploy a different pose estimation model whose operations were all supported by the DPU; however, since our classification model (SPOTER) requires MediaPipe landmarks from both hands and the pose, that model was not a feasible option to pursue. I therefore began optimizing and finalizing the inference of the MediaPipe model as much as I could, and then started integrating it with Neeraj's classification model. I designed the system structure and am currently finishing the integration. I hope to have the final pipeline working by tonight so that we can do all of the testing and measurements tomorrow. We are a little behind schedule, but hopefully by putting in enough time over the next couple of days we can finish the final demo.
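To make the planned integration concrete, below is a minimal sketch of how MediaPipe Holistic landmarks (pose plus both hands) could feed into the SPOTER classifier. The checkpoint name (spoter.pth), the test clip, the flattened (x, y) landmark layout, and the model's output shape are all assumptions for illustration, not our actual interface.

```python
# Sketch: MediaPipe Holistic -> per-frame landmark vectors -> SPOTER.
# Assumes a full PyTorch model object saved as "spoter.pth" (hypothetical)
# and that SPOTER accepts a (1, T, features) tensor of flattened (x, y)
# coordinates -- both are illustrative assumptions.
import cv2
import mediapipe as mp
import numpy as np
import torch

holistic = mp.solutions.holistic.Holistic(static_image_mode=False)
spoter = torch.load("spoter.pth", map_location="cpu")  # hypothetical checkpoint
spoter.eval()

def landmarks_to_vector(result):
    """Flatten pose + both-hand landmarks into one feature vector per frame."""
    parts = []
    for lm_list, count in ((result.pose_landmarks, 33),
                           (result.left_hand_landmarks, 21),
                           (result.right_hand_landmarks, 21)):
        if lm_list is not None:
            parts.extend((lm.x, lm.y) for lm in lm_list.landmark)
        else:
            parts.extend((0.0, 0.0) for _ in range(count))  # undetected part -> zeros
    return np.asarray(parts, dtype=np.float32).reshape(-1)

frames = []
cap = cv2.VideoCapture("sign_clip.mp4")  # hypothetical test clip
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV decodes as BGR.
    result = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    frames.append(landmarks_to_vector(result))
cap.release()
holistic.close()

with torch.no_grad():
    seq = torch.from_numpy(np.stack(frames)).unsqueeze(0)  # (1, T, features)
    logits = spoter(seq)  # assumed output: (1, num_classes)
    print("predicted class:", int(logits.argmax(dim=-1)))
```

In the real system the landmark extraction runs on the board while SPOTER consumes the resulting sequences, so this single-script version is only meant to show the data flow.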
What I have learned: I have learned many new tools related to FPGA development, in particular how to use Vivado and Vitis to synthesize designs and deploy them on DPUs. I also learned the basics behind machine learning models such as CNNs, alongside libraries like PyTorch. To do so, I relied heavily on online resources such as YouTube videos and research papers. Many times I also followed articles written by people who have developed small projects on Kria boards.