Charlie’s Status Report for 2 April 2022

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

This week, my team and I discussed the parts of our project that are still missing. We realised that, other than beamforming, we have all the components required to integrate our Minimum Viable Product (MVP).

As I am in charge of the deep learning components of our project, I reformatted the deep learning packages for our speech separation module. In particular, for our MVP, we plan to use SpeechBrain’s SepFormer model, which is trained on the WHAMR! dataset and therefore handles environmental noise and reverberation. Using two of the microphones on our array, I can estimate the positions of the speakers from the time difference of arrival (TDOA) between the channels. This is crucial, as SepFormer separates the speech signals but does not provide any information about where the speakers are located.
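The TDOA step can be sketched roughly as follows, using a GCC-PHAT cross-correlation between the two channels (a simplified illustration with made-up function names, not our actual pipeline code):

```python
import numpy as np

def gcc_phat_tdoa(x1, x2, fs):
    """Estimate the time difference of arrival (TDOA) between two
    microphone signals via GCC-PHAT: cross-correlate in the frequency
    domain with phase-transform weighting, then find the peak lag."""
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12            # PHAT weighting
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs                          # delay in seconds

# Toy example: one channel delayed by 32 samples (2 ms at 16 kHz)
fs = 16000
rng = np.random.default_rng(0)
src = rng.standard_normal(fs)
delay = 32
x1 = src
x2 = np.concatenate((np.zeros(delay), src[:-delay]))
tau = gcc_phat_tdoa(x1, x2, fs)                # magnitude ≈ 2 ms
```

From the estimated delay and the known microphone spacing, the direction of arrival can then be recovered with simple geometry.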

On Thursday, Stella and I got on a call with Professor Stern, because I discovered that he published a paper on speech separation with two microphones spaced 4 cm apart. After speaking with him, we identified three possible reasons why our current implementation does not work:

  1. A reverberant environment in the small music room (original recordings)
  2. Beamforming with the given array will likely lead to spatial aliasing
  3. The audio is sampled at an unnecessarily high rate (44.1 kHz)
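To see why spatial aliasing matters: a two-microphone array can only resolve interchannel phase differences unambiguously up to the frequency where the spacing equals half a wavelength, i.e. f = c / (2d). A quick check (the 4 cm figure is from Professor Stern’s paper; our own array spacing is wider, which lowers this limit further):

```python
# Spatial aliasing limit for a two-microphone array:
# phase differences are unambiguous only below f = c / (2 * d).
C = 343.0  # speed of sound in air, m/s

def max_unaliased_freq(spacing_m):
    """Highest frequency (Hz) free of spatial aliasing for a given spacing."""
    return C / (2.0 * spacing_m)

print(max_unaliased_freq(0.04))   # 4 cm spacing → 4287.5 Hz
```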

Professor Stern suggested that we make a new recording at 16 kHz in an environment with much less reverberation, such as outdoors or a larger room.
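If we want to reuse any of the existing 44.1 kHz recordings rather than re-record everything, they can be downsampled to 16 kHz. A sketch using scipy (resample_poly is our choice of tool here, not something Professor Stern prescribed):

```python
import numpy as np
from scipy.signal import resample_poly

def downsample_44k1_to_16k(x):
    """Polyphase resampling from 44.1 kHz to 16 kHz.
    16000 / 44100 reduces to 160 / 441 (gcd = 100)."""
    return resample_poly(x, up=160, down=441)

fs_in = 44100
x = np.zeros(fs_in)              # one second of audio at 44.1 kHz
y = downsample_44k1_to_16k(x)    # one second at 16 kHz (16000 samples)
```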

On Friday, Larry and I went to an office in the ECE staff lounge to make the new recording. We will test this recording with the Phase Difference Channel Weighting (PDCW) algorithm that Professor Stern published. This branch lets us continue with a more signal-processing-oriented approach to our capstone project.
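The full PDCW algorithm involves gammatone channel weighting and smoothing that I won’t reproduce here, but the core idea, masking time-frequency bins by the interaural time difference implied by the cross-channel phase, can be sketched very roughly (the threshold value and function shape are illustrative assumptions, not the published algorithm):

```python
import numpy as np

def phase_difference_mask(X1, X2, fs, n_fft, itd_threshold=5e-5):
    """Crude sketch of the phase-difference idea behind PDCW: keep
    time-frequency bins whose implied interaural time difference (ITD)
    is small (source near broadside) and zero out the rest. The real
    PDCW adds channel weighting and smoothing, omitted here.
    X1, X2: complex STFTs of shape (n_freq_bins, n_frames)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)           # Hz per bin
    phase_diff = np.angle(X1 * np.conj(X2))              # radians
    with np.errstate(divide="ignore", invalid="ignore"):
        itd = phase_diff / (2 * np.pi * freqs[:, None])  # seconds
    itd[0, :] = 0.0                                      # DC bin: no phase info
    mask = (np.abs(itd) < itd_threshold).astype(float)
    return mask * X1

# Toy check: identical-phase bins pass through, a bin with a large
# phase lag at the same magnitude is suppressed.
fs, n_fft = 16000, 512
X1 = np.ones((n_fft // 2 + 1, 1), dtype=complex)
X2 = np.ones_like(X1)
X2[100, 0] = np.exp(-1j * np.pi / 2)   # big phase lag in one bin
Y = phase_difference_mask(X1, X2, fs, n_fft)
```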

A casual conversation.

A scripted conversation.


Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

We are slightly behind, as we plan to abandon beamforming and switch to Professor Stern’s published approach. However, it is hard to call this falling behind, because we already have a contingency in place (deep learning speech separation). To catch up, we just have to collect new recordings and test them with the PDCW algorithm.

What deliverables do you hope to complete in the next week?

In the next week, Stella and I will test the PDCW algorithm on our newly collected recordings. We will also likely meet with Professor Stern again for further advice on our project, as he is very knowledgeable about binaural speech processing.
