Edward’s Status Update for 11/21
This week I worked on improving the video feed that is passed to the computer. It was easier than expected, since we only needed to change how our windows were instantiated. After that, I started on the code to better detect the speaker when multiple faces are in view. The algorithm is still quite naive, but to improve its correctness I considered the following things.
The first was to account for issues caused by the CV itself: it may not find the same faces in every frame, and because people move, a face will not be in exactly the same location from frame to frame. To account for this, I am using a grouping algorithm to form the best groups of faces across the frames. If it works well, each group of faces belongs to a single person.
Next I had to consider who is actually speaking among these faces. For this I decided to use a Haar cascade to detect mouths in each frame, so that I can approximate talking by the change in the shape of the mouth over the frames.
The last thing to consider is which of the detected speakers is closest to the center of the screen, since we know that the center represents the angle derived by the microphone detection.
Overall I hope this will be enough to guess a speaker, but another concern is that it may lag the video feed. I think I can make use of more threading to handle the video feed updates while the CV is processing. I’ve also already added support for running the CV processing across multiple threads to help reduce latency, but one thing I still have to test is whether reducing the number of threads lowers the overhead of creating them enough to give better results. At this point latency is my biggest concern, because I am confident that the logic being implemented is better for the project as a whole.
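To make the three steps concrete, here is a minimal sketch of how grouping, mouth movement, and the center check could fit together. It is not the project's actual code: `Detection`, `group_faces`, `mouth_motion`, and `pick_speaker` are hypothetical names, the greedy nearest-center grouping is a simple stand-in for whatever grouping algorithm is really used, and `mouth_ratio` is assumed to be the height/width of a mouth box reported by a detector such as a Haar cascade. The thresholds `max_jump` and `min_motion` are made-up values that would need tuning.

```python
from dataclasses import dataclass
from math import hypot
from statistics import pstdev
from typing import Optional


@dataclass
class Detection:
    """One face found in one frame (hypothetical structure)."""
    frame: int                           # index of the frame the face came from
    cx: float                            # face-center x in pixels
    cy: float                            # face-center y in pixels
    mouth_ratio: Optional[float] = None  # mouth height / width, None if no mouth found


def group_faces(detections, max_jump=60.0):
    """Greedily attach each face to the group whose latest face is nearest.

    A face farther than max_jump pixels from every group starts a new group,
    so a missed detection in one frame does not break a person's group.
    """
    groups = []
    for det in sorted(detections, key=lambda d: d.frame):
        best, best_dist = None, max_jump
        for group in groups:
            last = group[-1]
            if last.frame == det.frame:
                continue  # one person cannot appear twice in the same frame
            dist = hypot(det.cx - last.cx, det.cy - last.cy)
            if dist <= best_dist:
                best, best_dist = group, dist
        if best is None:
            groups.append([det])
        else:
            best.append(det)
    return groups


def mouth_motion(group):
    """Approximate talking as the spread of the mouth shape over the frames."""
    ratios = [d.mouth_ratio for d in group if d.mouth_ratio is not None]
    return pstdev(ratios) if len(ratios) >= 2 else 0.0


def pick_speaker(groups, frame_width, min_motion=0.05):
    """Among groups whose mouth is moving, return the one nearest the center."""
    center_x = frame_width / 2
    talking = [g for g in groups if mouth_motion(g) >= min_motion]
    if not talking:
        return None
    return min(talking, key=lambda g: abs(g[-1].cx - center_x))
```

With this shape, a person whose face sits at the center but whose mouth never changes is skipped in favor of the nearest person whose mouth is moving, which matches the intent of combining the mouth check with the microphone angle.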
I may need to cut some of the steps described above if the algorithms turn out to be too demanding to run within the time available for each frame.
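On the latency side, one option worth trying is a single long-lived worker thread that only ever processes the newest frame, so the display loop never waits on the CV and no threads are created per frame. The sketch below uses only the Python standard library; `LatestFrameWorker` and the `detect` callable are hypothetical names, and it assumes a single display loop is the only caller of `submit`.

```python
import queue
import threading


class LatestFrameWorker:
    """Runs a slow detector in one background thread on the newest frame only."""

    def __init__(self, detect):
        self._detect = detect                   # hypothetical detection callable
        self._frames = queue.Queue(maxsize=1)   # holds at most one pending frame
        self._lock = threading.Lock()
        self._result = None
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, frame):
        """Hand over a new frame without blocking the display loop."""
        try:
            self._frames.get_nowait()   # discard a stale frame still waiting
        except queue.Empty:
            pass
        try:
            self._frames.put_nowait(frame)
        except queue.Full:
            pass  # only possible with several producers; skip this frame

    def latest_result(self):
        """Return the most recent detection result, or None if none yet."""
        with self._lock:
            return self._result

    def stop(self):
        """Let any pending frame finish, then shut the worker down."""
        self._frames.put(None)          # sentinel tells the worker to exit
        self._thread.join()

    def _run(self):
        while True:
            frame = self._frames.get()
            if frame is None:
                break
            result = self._detect(frame)
            with self._lock:
                self._result = result
```

The display loop would call `submit(frame)` for every captured frame and draw whatever `latest_result()` returns, so the overlay may trail the video by a few frames but the feed itself keeps its full frame rate.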
For the upcoming week I plan to finish up this piece of code and then begin testing. Heather was still waiting for the prototype to dry before we could start testing, so we plan to test on Sunday afternoon and hope to begin getting some results in. With the final prototype being built, I also plan to fine-tune the movement to account for the physical build that we have now.