This week was our interim demo, and I finished two subsystems for it: HPE (human pose estimation) on the FPGA and the live viewer application for displaying the results. I have already discussed the live viewer application over the last couple of status reports. For the demo, I implemented the holistic MediaPipe model for human pose estimation (full upper-body tracking) in software on the Kria SOM's processing system. The model takes in a live video stream from a USB camera and outputs the detected keypoints in JSON format over UDP via Ethernet. This implementation is, of course, unoptimized and not yet accelerated on the fabric. Over the next two weeks I plan to accelerate the model on the fabric using the Vitis and Vivado tools and integrate it with the Jetson. I am not behind schedule, but depending on the difficulties I encounter, I may have to put in more hours. By next week, I hope to have made significant progress on getting the model onto the DPU, with perhaps a few kinks left to solve.
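The output path described above (keypoints serialized as JSON, sent one frame per UDP datagram) can be sketched with the standard library alone. The landmark values below are placeholders standing in for MediaPipe Holistic output, and the viewer address and port are assumptions, not the project's actual configuration:

```python
import json
import socket

# Assumed viewer endpoint; in the real setup this would be the
# Jetson's address on the Ethernet link.
VIEWER_ADDR = ("127.0.0.1", 9000)

def send_keypoints(sock, landmarks, frame_id, addr=VIEWER_ADDR):
    """Serialize one frame's keypoints as JSON and send a single UDP datagram.

    landmarks maps keypoint name -> (x, y, z, visibility), mirroring the
    fields MediaPipe reports per pose landmark.
    """
    packet = {
        "frame": frame_id,
        "keypoints": [
            {"name": name, "x": x, "y": y, "z": z, "visibility": v}
            for name, (x, y, z, v) in landmarks.items()
        ],
    }
    sock.sendto(json.dumps(packet).encode("utf-8"), addr)

if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Placeholder landmarks; in the real pipeline these come from
    # MediaPipe Holistic's pose_landmarks for each camera frame.
    fake = {
        "nose": (0.51, 0.22, -0.10, 0.99),
        "left_shoulder": (0.40, 0.45, 0.00, 0.97),
    }
    send_keypoints(sock, fake, frame_id=0)
```

One datagram per frame keeps the viewer stateless: a dropped packet loses only that frame's skeleton, which is acceptable for a live display.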
Verifying Subsystem (HPE on FPGA): Once I finish the implementation, there are two main ways I will verify this subsystem. The first is measuring latency: this should be straightforward benchmarking, in which I run pre-recorded and live video streams through the pipeline and measure the end-to-end delay to the model output. This will help me identify where to improve the speed of the subsystem. The second is testing the accuracy of the model using PCK (percentage of correct keypoints): I will run a pre-recorded, labeled video stream through the subsystem and check that each predicted keypoint lies within a threshold distance of the true keypoint (50% of the head bone link length).
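The PCK check described above can be sketched as follows. The keypoint names and coordinates are illustrative, and the head bone link is taken as the distance between assumed head-top and neck keypoints, with the 0.5 factor from the text:

```python
import math

def pck(pred, gt, head_top="head_top", neck="neck", alpha=0.5):
    """Percentage of correct keypoints for one frame.

    A prediction counts as correct when it lies within
    alpha * (head bone link length) of the ground-truth point.
    pred and gt map keypoint name -> (x, y) in pixels.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Threshold derived from this frame's ground-truth head segment.
    threshold = alpha * dist(gt[head_top], gt[neck])
    correct = sum(1 for k in gt if dist(pred[k], gt[k]) <= threshold)
    return correct / len(gt)

# Illustrative frame: head link is 20 px, so the threshold is 10 px.
gt = {"head_top": (100, 50), "neck": (100, 70), "left_wrist": (60, 120)}
pred = {"head_top": (102, 51), "neck": (101, 69), "left_wrist": (85, 120)}
print(pck(pred, gt))  # left_wrist is off by 25 px > 10 px, so 2 of 3 are correct
```

Averaging this per-frame score over the labeled video stream gives the overall accuracy figure for the subsystem.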