Integrating Perception Pipeline into the CSM

This week, after a successful first demo showing object tracking and object detection working independently, Ike and I started talking about integrating InFrame’s software components. This work has mainly involved building a basic understanding of how the CSM calls the Perception pipeline and verifying that the call path works as expected. The integration is currently ongoing and should be ready in time for demo 2, so that we can show the entire software stack working together as one cohesive unit.

In addition, the bulk of my work this week has been focused on improving the perception pipeline. The core problem is that even the best trackers accumulate error over time and start drifting away from the target bounding box. To counter this, I’m re-running object detection every n frames (an appropriate n still needs to be determined; I’m currently experimenting with 10) and resetting the tracker with the bounding box returned by the object detection model.

The way I tie these together is by saving the class ID of the target (e.g. human, skateboard) from the first round of object detection and then searching for that same class ID in future detection results. The drawback is that only one instance of each class may be present per frame (i.e. only one human); otherwise the tracker would arbitrarily latch onto one of several instances of the class. An improvement would be to pick the detection bounding box that is most similar to the one currently being tracked, but I’m not sure I’ll get around to implementing this, given the tight time constraints and how close the end of the semester is. I’ll play around with the first approach to see how effective it is and report back next week; a rough sketch of both ideas follows below.
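For anyone curious what this looks like in code, here is a minimal sketch of the refresh scheme, including the "most similar box" matching (via intersection-over-union) described above. To be clear about assumptions: `detect_objects(frame)` is a placeholder returning `(class_id, (x, y, w, h))` pairs, not our actual detection API, and OpenCV’s CSRT tracker stands in for whichever tracker the pipeline ends up using (in newer OpenCV builds the constructor lives under `cv2.legacy` instead of `cv2`).

```python
import cv2

REDETECT_EVERY_N = 10  # refresh interval; the right n is still being tuned


def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def track_with_redetection(frames, detect_objects):
    """Yield (ok, box) per frame, resetting the tracker from the
    detector every REDETECT_EVERY_N frames.

    detect_objects(frame) -> [(class_id, (x, y, w, h)), ...] is a
    hypothetical stand-in for the real detection model.
    """
    frames = iter(frames)
    first = next(frames)
    # Lock onto the first detection's class ID (e.g. human, skateboard).
    # Assumes the detector finds at least one object in the first frame.
    target_class, box = detect_objects(first)[0]
    tracker = cv2.TrackerCSRT_create()
    tracker.init(first, box)

    for i, frame in enumerate(frames, start=1):
        ok, box = tracker.update(frame)
        if i % REDETECT_EVERY_N == 0:
            # Re-detect and keep only boxes of the target class.
            candidates = [b for c, b in detect_objects(frame)
                          if c == target_class]
            if candidates:
                # Pick the detection most similar to the tracked box,
                # so multiple instances of the same class no longer
                # confuse the reset step.
                best = max(candidates, key=lambda b: iou(b, box))
                tracker = cv2.TrackerCSRT_create()
                tracker.init(frame, best)
                box = best
        yield ok, box
```

One design note: the sketch constructs a fresh tracker object on each reset rather than re-calling `init` on the old one, since some OpenCV tracker implementations don’t support being re-initialized in place.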

