What did you personally accomplish this week on the project? Give files or
photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).
This week, I experimented with deep learning approaches to separate mixed (or overlapping) speech. This branch, which diverges from our beamforming approach, has two purposes. First, we want a backup strategy for separating overlapping speech in case beamforming does not work as expected. Second, if beamforming does suppress other voices, we could use a deep learning approach to further improve performance. We strongly believe the second option fits our project best.
I managed to demonstrate that deep learning models can separate overlapping speech to some degree.
The following is overlapping speech between Stella and Larry.
The following is the separated speech of Stella.
The following is the separated speech of Larry.
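One possible way to put a number on the separation quality is the scale-invariant signal-to-distortion ratio (SI-SDR). The sketch below is only an illustration and not part of our pipeline: it assumes a clean reference recording of each speaker is available, uses synthetic tones as stand-ins for real speech, and uses plain Python lists so that it runs without extra dependencies (in practice the signals would be NumPy arrays). The names `si_sdr`, `reference`, `good`, and `bad` are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def si_sdr(reference, estimate, eps=1e-12):
    """Scale-invariant SDR in dB; higher means a cleaner estimate."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    # Remove the mean so a DC offset is not counted as signal.
    ref_mean = sum(reference) / len(reference)
    est_mean = sum(estimate) / len(estimate)
    ref = [r - ref_mean for r in reference]
    est = [e - est_mean for e in estimate]
    # Project the estimate onto the reference to get the target part.
    scale = dot(est, ref) / (dot(ref, ref) + eps)
    target = [scale * r for r in ref]
    # Everything left over is treated as distortion.
    noise = [e - t for e, t in zip(est, target)]
    return 10 * math.log10((dot(target, target) + eps) / (dot(noise, noise) + eps))


# Synthetic stand-ins: one second of two tones at 8 kHz (assumed values).
sr = 8000
t = [n / sr for n in range(sr)]
reference = [math.sin(2 * math.pi * 220 * x) for x in t]
interferer = [math.sin(2 * math.pi * 330 * x) for x in t]
good = [r + 0.05 * i for r, i in zip(reference, interferer)]  # little leakage
bad = [r + 0.8 * i for r, i in zip(reference, interferer)]    # heavy leakage

print(si_sdr(reference, good), si_sdr(reference, bad))
```

An estimate with less leakage from the other speaker scores higher, and rescaling the estimate does not change the score, which matters because separation models do not always preserve the original volume.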
One interesting finding was that filtering out noise before feeding the signals into the deep learning model harms performance. We believe this is because noise filtering removes critical frequencies, and because the deep learning model already has denoising ability built in.
Second, we noticed that the STT model was not able to interpret the separated speech. This could be caused either by poor enunciation from our speakers, or by the separated audio fed to the STT model not being clear enough.
Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
We are currently on schedule.
What deliverables do you hope to complete in the next week?
In the next week, I want to experiment with using the speech separated by the deep learning model to cancel the original speakers, and test whether the resulting noise-canceled speech leads to better STT predictions.
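One way this cancellation experiment could be set up is to subtract a gain-matched copy of the separated (unwanted) speaker from the original mixture. The sketch below is a minimal illustration under stated assumptions, not our implementation: it assumes the separated estimate is time-aligned with the mixture, uses synthetic tones in place of real recordings, and uses plain Python lists for self-containment. The names `cancel_speaker`, `speaker_a`, `speaker_b`, and `estimate_b` are hypothetical.

```python
import math


def cancel_speaker(mixture, estimate):
    """Subtract a least-squares gain-matched copy of `estimate` from `mixture`."""
    n = min(len(mixture), len(estimate))
    mix, est = mixture[:n], estimate[:n]
    energy = sum(e * e for e in est)
    if energy == 0.0:
        # A silent estimate gives nothing to cancel.
        return list(mix)
    # Gain that minimizes the energy of (mix - gain * est).
    gain = sum(m * e for m, e in zip(mix, est)) / energy
    return [m - gain * e for m, e in zip(mix, est)]


# Synthetic stand-ins for two speakers (assumed values, not real speech).
sr = 8000
t = [n / sr for n in range(sr)]
speaker_a = [math.sin(2 * math.pi * 220 * x) for x in t]
speaker_b = [math.sin(2 * math.pi * 330 * x) for x in t]
mixture = [a + b for a, b in zip(speaker_a, speaker_b)]

# Pretend the separation model returned speaker B at half its true level.
estimate_b = [0.5 * b for b in speaker_b]

residual = cancel_speaker(mixture, estimate_b)
max_error = max(abs(r - a) for r, a in zip(residual, speaker_a))
print(max_error)
```

The gain matching is there because the separated output may not come back at the same level as in the mixture. With real model output, any delay or phase distortion in the estimate would leave part of the unwanted speaker behind, so the residual would need to be checked by ear and with the STT model.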