Larry’s Status Report for 19 March 2022

This week, I wrote code for angle estimation: interfacing with the camera, identifying people in the scene, and producing an angle estimate for each person. In my last status report I hoped to complete angle estimation by the end of this week; I am substantially closer, but still not quite done.

With some help from Charlie, I have been able to use the video from the webcam to identify the pixel locations of each person. With an estimate of the camera calibration matrix, I have also produced angle estimates for the pixel locations. My main issue so far is that the angle estimates are not entirely accurate, primarily due to the strong fisheye effect of the webcam.
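To make the pixel-to-angle step concrete, here is a minimal sketch assuming an ideal pinhole model with no distortion. The function name and the intrinsic values `FX` and `CX` are hypothetical placeholders, not the calibration values I actually estimated.

```python
import math

def pixel_to_angle(u, fx, cx):
    """Horizontal angle in degrees of pixel column u, measured from the optical axis.

    fx is the focal length in pixels and cx is the principal point column,
    both taken from the camera calibration matrix.
    Positive angles are to the right of the image center.
    """
    return math.degrees(math.atan2(u - cx, fx))

# Hypothetical intrinsics for a 1280-pixel-wide image (assumed, not measured).
FX = 640.0
CX = 640.0

# A person detected at the image center and one at the right edge.
center_angle = pixel_to_angle(640, FX, CX)
edge_angle = pixel_to_angle(1280, FX, CX)
print(center_angle, edge_angle)
```

With these placeholder values the image center maps to 0 degrees and the right edge to 45 degrees; a real calibration would replace `FX` and `CX`.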

As seen in the image above, the webcam produces a greatly distorted image of a square whiteboard. While the camera calibration matrix can produce a good result for any pixel along the horizontal center of the image, it does not compensate for the distortion at the edges.
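One way to see why a pinhole-style calibration matrix breaks down away from the center is to compare it against a simple fisheye projection. The sketch below assumes an equidistant fisheye model (image radius proportional to angle); I have not verified that the webcam actually follows this model, and the focal length `F` is a hypothetical value.

```python
import math

def pinhole_angle_deg(r, f):
    """Angle implied by a pinhole model for a point r pixels from the image center."""
    return math.degrees(math.atan2(r, f))

def equidistant_angle_deg(r, f):
    """Angle implied by an equidistant fisheye model (r = f * theta)."""
    return math.degrees(r / f)

F = 640.0  # hypothetical focal length in pixels

# The two models agree near the center and diverge toward the edges.
for r in (0, 160, 320, 640):
    error = equidistant_angle_deg(r, F) - pinhole_angle_deg(r, F)
    print(f"r={r:4d} px  pinhole={pinhole_angle_deg(r, F):6.2f}  "
          f"fisheye={equidistant_angle_deg(r, F):6.2f}  error={error:5.2f}")
```

Under these assumptions the disagreement is a fraction of a degree near the center but grows to more than ten degrees at the edge, which is consistent with the behavior described above.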

I also noticed that although our webcam is advertised as having a 140-degree FOV, I measured its horizontal FOV to be at best 60 degrees. The fisheye effect gives the impression of a wide-angle camera, but in reality the FOV does not meet our design requirements. I have decided to try the camera included with the TX2, which I had initially judged to have too narrow a field of view for our project.
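For reference, a horizontal FOV can be estimated from a simple physical measurement: place the camera a known distance from a wall and measure how wide a strip of the wall is visible. The sketch below shows that geometry; the function name and the example numbers are illustrative assumptions, not my actual measurements.

```python
import math

def fov_from_measurement(visible_width, distance):
    """Horizontal FOV in degrees.

    visible_width: width of the scene visible edge to edge, at the given distance
    distance: distance from the camera to that plane (same units as visible_width)
    """
    return math.degrees(2.0 * math.atan2(visible_width / 2.0, distance))

# Illustrative numbers only: at 1.0 m from the wall, a 60-degree FOV
# corresponds to seeing about 1.15 m of wall.
width_for_60 = 2.0 * math.tan(math.radians(30.0))
print(round(width_for_60, 3), round(fov_from_measurement(width_for_60, 1.0), 1))
```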

The above image shows that the included TX2 camera (top left) has a slightly wider horizontal FOV than the webcam (bottom right). I am currently integrating the included camera with my existing code. The main obstacle at the end of this week was installing OpenCV with GStreamer support, which is needed to use the CSI camera and took many hours.
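As a rough sketch of what reading the CSI camera through OpenCV can look like once GStreamer support is built in: the pipeline below is the commonly used Jetson form and is an assumption on my part. The `nvarguscamerasrc` element depends on the installed JetPack version, and the resolution and frame rate are placeholder values. The helper names are hypothetical.

```python
def csi_pipeline(width=1280, height=720, fps=30):
    """Build a GStreamer pipeline string for a Jetson CSI camera (assumed layout)."""
    return (
        "nvarguscamerasrc ! "
        f"video/x-raw(memory:NVMM), width={width}, height={height}, "
        f"framerate={fps}/1 ! "
        "nvvidconv ! video/x-raw, format=BGRx ! "
        "videoconvert ! video/x-raw, format=BGR ! appsink"
    )

def opencv_has_gstreamer():
    """Check whether the installed OpenCV build reports GStreamer support."""
    import cv2  # imported lazily so the pipeline helper works without OpenCV
    for line in cv2.getBuildInformation().splitlines():
        if "GStreamer" in line:
            return "YES" in line
    return False

def open_csi_camera(width=1280, height=720, fps=30):
    """Open the CSI camera through OpenCV's GStreamer backend."""
    import cv2
    return cv2.VideoCapture(csi_pipeline(width, height, fps), cv2.CAP_GSTREAMER)
```

On the TX2, calling `opencv_has_gstreamer()` first is a quick way to confirm the build before debugging the pipeline itself; `open_csi_camera()` then returns a regular `VideoCapture` object whose `isOpened()` and `read()` methods can be used like those of a USB webcam.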

I believe that we are still generally on schedule, though we have lost some ground since last week. To ensure that we stay on schedule, I will focus on integrating more of the components together to allow for faster and more applicable testing. My main concern so far is how we will actually handle speech separation, so finishing the components around it should let us focus on it.

By next week, I hope to have the camera and angle estimation code completely finished. I also want to be able to overlay text onto people in a scene, and have some work done toward generating captions from audio input.

 
