Larry’s Status Report for 12 February 2022

For the first part of this week, I helped Charlie construct his presentation. I also requested and received a Jetson TX2, which we are considering using. The biggest unknown for me is how we plan to capture and move data around. We proposed capturing the data stream on the Pi and sending it to a laptop for processing, but doing both on the TX2 would remove the extra step.
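To make that trade-off concrete, below is a minimal sketch of what the extra Pi-to-laptop hop might look like, assuming a plain TCP link; the address, port, and frame size are placeholders, since none of this has been decided yet.

```python
# Hypothetical sketch: stream raw audio frames from the Pi to a laptop over TCP.
# The host, port, and frame size are placeholders, not settled design decisions,
# and this expects a matching listener already running on the laptop.
import socket
import struct

LAPTOP_HOST = "192.168.1.50"   # placeholder laptop address
LAPTOP_PORT = 5005             # placeholder port
FRAME_BYTES = 4096             # bytes of raw PCM audio per frame

def stream_frames(frame_source):
    """Send length-prefixed audio frames from frame_source (an iterable of bytes)."""
    with socket.create_connection((LAPTOP_HOST, LAPTOP_PORT)) as sock:
        for frame in frame_source:
            # Length-prefix each frame so the laptop can reassemble the stream.
            sock.sendall(struct.pack("!I", len(frame)) + frame)

if __name__ == "__main__":
    # Stand-in source: silent frames; a real capture loop would read from the mic array.
    silent = (b"\x00" * FRAME_BYTES for _ in range(10))
    stream_frames(silent)
```

On the laptop side, a matching loop would read the 4-byte length prefix and then that many bytes per frame; running capture and processing together on the TX2 would remove this hop entirely.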

I attempted to reflash the TX2 using my personal computer, but I do not have the correct Ubuntu version installed. Fortunately, the previous group left the default password on, so we could simply remove their files and add ours. Space is extremely limited right now without an SD card, so I poked around and preliminarily deleted about a gigabyte of files.

Since the current plan is for me to give the design presentation, I also began looking through the guidance document and structuring the presentation. I should have all of it done by next week.

Overall, I think that I am on schedule or maybe slightly behind. There are still many aspects of the design that need to be hashed out.

Charlie’s Status Report for 12 February 2022

This week, I gave the proposal presentation on behalf of my team. In the presentation, I discussed the use case of our product and our proposed solution to the current problem. I received some very interesting questions from the other teams. One of my favorites was the possibility of using facial-recognition deep learning models to do lip reading. While I doubt that my team will adopt such techniques due to their complexity, the idea does pique my interest as a future research direction.

I also designed a simple illustration of our solution, as shown below.

[Figure: illustration of our proposed solution]

I think my illustration really helped the audience understand our solution; it is intuitive yet representative of the idea.

After the presentation on Wednesday, my team and I met up in person to work on our design. We decided that the most logical next step was to start working on the design presentation, since it will help us figure out which components we need to obtain. In the meantime, we booked a Jetson because we wanted to make sure that Larry could figure out how to use it, given that he is the embedded person in our group.

I was also concerned that we might need to upsample our 8 kHz audio to 16 kHz or 44.1 kHz before feeding it into our speech-to-text (STT) model. I tested this and confirmed that even an 8 kHz sampling rate is sufficient for the STT model to work.
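In case a future STT model does end up requiring a higher rate, here is a minimal sketch of how we could upsample offline with SciPy's polyphase resampler; the file names are placeholders, and the 16 kHz target is just the assumption from above.

```python
# Hypothetical sketch: upsample 8 kHz PCM audio to 16 kHz with a polyphase filter.
# File names are placeholders; this is only a fallback if the STT model ends up
# requiring a higher sampling rate than our 8 kHz capture.
import numpy as np
from scipy.io import wavfile
from scipy.signal import resample_poly

IN_PATH = "capture_8khz.wav"    # placeholder input recorded at 8 kHz
OUT_PATH = "capture_16khz.wav"  # placeholder upsampled output

rate_in, audio = wavfile.read(IN_PATH)
assert rate_in == 8000, f"expected 8 kHz input, got {rate_in} Hz"

# Upsample by a factor of 2 (16 kHz / 8 kHz) using a polyphase anti-imaging filter.
audio_16k = resample_poly(audio.astype(np.float32), up=2, down=1)

# Write back as 16-bit PCM for the STT front end.
wavfile.write(OUT_PATH, 16000, np.clip(audio_16k, -32768, 32767).astype(np.int16))
```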

We are currently on track with our progress, with reference to our Gantt chart.

By the end of this week, we should have a list of the items we need so that we can submit it. Stella and I will start designing the beamforming algorithm this weekend, and Larry will start working on his presentation.
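As a possible starting point for that discussion, one common approach is delay-and-sum beamforming; a minimal sketch is below, where the array geometry, mic count, and steering angle are placeholder assumptions rather than our actual design, which Stella and I still need to work out.

```python
# Hypothetical sketch: time-domain delay-and-sum beamforming for a uniform linear
# microphone array. Geometry, sampling rate, and steering angle are placeholder
# assumptions for illustration, not final design parameters.
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
FS = 8000                # Hz, matching our current capture rate
MIC_SPACING = 0.05       # m between adjacent mics (placeholder)
NUM_MICS = 4             # placeholder array size

def delay_and_sum(channels: np.ndarray, steer_deg: float) -> np.ndarray:
    """Steer a uniform linear array toward steer_deg (0 = broadside).

    channels has shape (NUM_MICS, num_samples).
    """
    steer_rad = np.deg2rad(steer_deg)
    output = np.zeros(channels.shape[1])
    for m in range(NUM_MICS):
        # Arrival delay of mic m relative to mic 0, rounded to whole samples.
        tau = m * MIC_SPACING * np.sin(steer_rad) / SPEED_OF_SOUND
        shift = int(round(tau * FS))
        # Advance the channel to undo its delay, then accumulate.
        # (np.roll wraps at the edges; fine for a sketch, a real version would pad.)
        output += np.roll(channels[m], -shift)
    return output / NUM_MICS

if __name__ == "__main__":
    # Stand-in input: white noise on every channel, just to exercise the function.
    rng = np.random.default_rng(0)
    mics = rng.standard_normal((NUM_MICS, FS))  # 1 second of 8 kHz "audio"
    beam = delay_and_sum(mics, steer_deg=30.0)
    print(beam.shape)
```

A real implementation would likely use fractional delays or work in the frequency domain instead of whole-sample shifts, but a sketch like this is enough to sanity-check array geometry.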