Team Status Report for 19 February 2022

The most significant risk to our project right now is that we may not be able to separate the speakers well enough for the speech-to-text model to produce usable captions. We spoke with Professor Sullivan about our circular microphone array, and he strongly recommended the use of a linear array for our application. There don’t seem to be any great options for prebuilt linear arrays online, as we could only find one, and it is made specifically for the Raspberry Pi. The estimated shipping time for that array is a month, so for now we plan to continue working with the UMA-8. If the UMA-8 is too small for both beamforming and STFT, we will have to try building our own array out of separate microphones. This approach would add cost and potentially take a lot more time. None of us are familiar with the steps involved in recording from multiple microphones, so we hope to avoid that complication.

One of the main changes we made from the proposal presentation is the use of a Jetson TX2 for all of the processing. We wanted to limit the amount of data movement that we would have to deal with, and the Jetson TX2 also provides more consistent processing and I/O capability than a user’s laptop, which can vary from machine to machine. Another key design choice we made was to use an HDMI to USB video capture card to transfer our final output to the user’s laptop. We based this on the iContact project from Fall 2020. Both of these changes should greatly simplify our design and allow us to focus on the sound processing.

Our schedule remains pretty much the same as the one presented in the proposal presentation. Instead of having to worry about circuit wiring, however, we now just have to deal with the video capture card.

We were able to successfully use the TX2 to interface with the webcam and UMA-8 through a USB hub. We have now started to work with the video and audio data from what we hope will be our final components.

Larry’s Status Report for 19 February 2022

This week, I worked on both the design presentation and the initial testing of the components that we received. I got the demo of Detectron2 running on the TX2 and was able to record 7 channels using the UMA-8 microphone array. Installing all the required software packages took some time, but was fairly simple all things considered. I installed Detectron2 from source and had to use Nvidia’s instructions for installing PyTorch on Jetsons with CUDA enabled. Below is a picture of the image segmentation demo from Detectron2.

Since I will be presenting for the design presentation, I put the majority of my time this week towards that.

I believe that we are definitely on schedule so far. We have the high level design figured out and have confirmed that our purchased components work together. Of course, we have yet to tackle the hardest parts of the project.

Next week, I will have completed the design presentation and will begin working on using the image segmentation data to do angle estimation. As a deliverable, I hope to produce an accurate angle for a single person in the webcam view.
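As a rough illustration of the angle estimation step, here is a minimal sketch that converts the horizontal center of a detected person's bounding box into an angle relative to the camera's optical axis. It assumes a simple pinhole camera model with no lens distortion. The function names, the 1280-pixel image width, and the 90-degree horizontal field of view are all placeholder assumptions for illustration, not measured values for our webcam.

```python
import math


def pixel_to_angle(x_pixel, image_width, hfov_deg):
    """Horizontal angle (degrees) of a pixel column from the optical axis.

    Assumes an ideal pinhole camera: negative angles are left of center,
    positive angles are right of center.
    """
    # Focal length expressed in pixels, derived from the horizontal FOV.
    focal_px = (image_width / 2) / math.tan(math.radians(hfov_deg) / 2)
    # Signed horizontal offset of the pixel from the image center.
    offset_px = x_pixel - image_width / 2
    return math.degrees(math.atan2(offset_px, focal_px))


def box_to_angle(box, image_width, hfov_deg):
    """Angle of the horizontal center of a box given as (x1, y1, x2, y2)."""
    x1, _, x2, _ = box
    return pixel_to_angle((x1 + x2) / 2, image_width, hfov_deg)


if __name__ == "__main__":
    # Hypothetical box in a 1280-pixel-wide frame with an assumed 90 degree FOV.
    angle = box_to_angle((800, 100, 1000, 600), image_width=1280, hfov_deg=90)
    print(f"Estimated angle: {angle:.1f} degrees")
```

In practice the field of view would need to be taken from the webcam's specifications or measured through calibration, and a segmentation mask centroid could stand in for the bounding box center.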

Larry’s Status Report for 12 February 2022

For the first part of this week, I helped Charlie construct his presentation. I also requested and received a Jetson TX2, which we are considering using. The biggest unknown for me is how we plan on capturing and moving around data. We proposed capturing the datastream on the Pi and sending it to a laptop for processing, but doing both on the TX2 would remove the extra step.

I attempted to reflash the TX2 using my personal computer, but I do not have the correct Ubuntu version. Fortunately, the previous group left the default password on, so we could just remove their files and add ours. Space is extremely limited right now without an SD card, so I poked around and, as a first pass, deleted about a gigabyte of files.

Since the current plan is for me to do the design presentation, I also began looking through the guidance document and structuring the presentation. I should have all of it done by next week.

Overall, I think that I am on schedule or maybe slightly behind. There are still many aspects of the design that need to be hashed out.