Week of 9/29 – Jiahao Zhou

Since our minimum viable product hinges on successfully detecting a person’s rapping, I am working to make sure the voice tempo detection works. I recorded samples of my own voice and was able to clearly delineate the starts and stops of individual words. Even at faster tempos, the nature of human speech meant there were clear gaps between words. Using spectral energy density, these gaps can be detected as the points where the average energy of the sound input drops below a certain threshold. From initial testing, this threshold can be set extremely low when the recording is clean, so we should be able to detect individual words at a relatively high rate. Next week I plan to gather many samples of different types of rapping, both slow and fast, from my own voice as well as from online sources, to test the detection.
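
To make the thresholding idea concrete, here is a minimal sketch in Python. It is not our actual detection code: it uses the mean squared amplitude of short time-domain frames as a simpler stand-in for spectral energy density, and the function names, the 160-sample frame length, the 1e-4 threshold, and the synthetic test clip are all assumptions chosen for illustration.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def find_word_segments(samples, frame_len=160, threshold=1e-4):
    """Return (start_sample, end_sample) pairs for runs of frames whose
    energy is at or above the threshold. Frames below it count as gaps.

    frame_len and threshold are illustrative values, not tuned ones.
    """
    energies = frame_energies(samples, frame_len)
    segments = []
    seg_start = None
    for i, energy in enumerate(energies):
        if energy >= threshold and seg_start is None:
            seg_start = i * frame_len                      # a word begins
        elif energy < threshold and seg_start is not None:
            segments.append((seg_start, i * frame_len))    # a word ends
            seg_start = None
    if seg_start is not None:
        # The clip ended while a word was still in progress.
        segments.append((seg_start, len(energies) * frame_len))
    return segments


def synth_clip(sample_rate=8000):
    """Hypothetical stand-in for a clean recording: two 0.2 s tone bursts
    ("words") with 0.1 s of silence before, between, and after them."""
    def tone(n):
        return [0.5 * math.sin(2 * math.pi * 220 * t / sample_rate)
                for t in range(n)]

    def silence(n):
        return [0.0] * n

    return silence(800) + tone(1600) + silence(800) + tone(1600) + silence(800)


if __name__ == "__main__":
    for start, end in find_word_segments(synth_clip()):
        print(f"word from sample {start} to {end}")
```

On the synthetic clip the two bursts come back as two separate segments. A real recording has background noise, so the threshold would need to sit above the noise floor, which is why the samples planned for next week matter.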
