Week of 12/1 – Jiahao Zhou

After converting everything from MATLAB to Python, I expected that Python might be somewhat slower than MATLAB. Testing confirmed that on larger samples (> 5 seconds) the Python code begins to lag noticeably, and the longer the sample, the slower it runs. After timing the code, I determined that the for loops inside the Python functions are much slower than their MATLAB counterparts, which seems to be because MATLAB uses a just-in-time compiler. Changing some NumPy calls like arange() back to the built-in range() actually helped increase speed a little.
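For reference, the usual remedy for this kind of slowdown is to replace per-sample Python loops with NumPy's vectorized operations, which run in compiled code. A toy comparison (these are illustrative functions, not the project's actual code):

```python
import numpy as np

def envelope_loop(x):
    """Full-wave rectification with an explicit Python loop (slow in CPython)."""
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = abs(x[i])
    return out

def envelope_vectorized(x):
    """The same rectification as a single NumPy call (loop runs in C)."""
    return np.abs(x)

# ~5 seconds of audio at 44.1 kHz, the sample length where the lag shows up.
x = np.random.default_rng(0).standard_normal(44100 * 5)
```

On a signal of this size the vectorized version is typically one to two orders of magnitude faster, since the per-element work happens in C rather than in the Python interpreter.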

At this point in the project, however, the extra speed is not critical to our success. Since the demo is coming up next week, I turned my attention to finishing the final project report. We ran into integration problems caused by issues with the backend, so we had to wrap up the integration before finishing the paper. The rest of the week was spent on the paper; we are now putting the finishing touches on everything before the demo and making our video.

Week of 11/17 – Jiahao Zhou

Since we decided not to use the MATLAB Engine, I have been converting my code over to Python. I initially ran into some trouble getting all the libraries set up for some of the helper modules, but all of the functionality of the MATLAB algorithm has now been reimplemented in Python with NumPy, SciPy, Matplotlib, and Pandas. The next step is to integrate it with the backend and get the web app working. I will be in Pittsburgh over break, so I plan to use that time to make sure everything runs smoothly before our initial demo and presentation.

Week of 11/3 – Jiahao Zhou

Got beat detection to work better with higher thresholds. The detected onsets are now less frequent but clearer and easier to distinguish. However, we ran into a problem with the MATLAB Engine on the servers: due to some documentation errors, my partner was not able to use it, so I am switching to Python. As of now, I have converted the main voice tempo detector function to Python and am working on converting the helper modules, which will probably take me until next week. Once done, I plan to deploy it on the servers and get it running in the web app. Even though this conversion takes extra time, I finished beat detection early, and our built-in slack means I can still finish before the final demo.

Week of 10/27 – Jiahao Zhou

This week I wrapped up the voice tempo detection. There are still areas to optimize, but for the most part it can detect on-beat rapping. I rectify and smooth the audio before running it through an onset detection algorithm, then calculate where beats should fall based on the BPM given by the backing-track beat detector. Here is an example of the detector running on an amateur rap sample.

[Figure: onset detection output on an amateur rap sample]

The magenta lines indicate hit beats and the green lines indicate missed beats. You can see gaps in the hits where the rapper pauses to take breaks. In the coming weeks I plan to begin integrating this into the backend and get it working on live audio in the web app.
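The steps above — rectify, smooth, detect onsets, derive a beat grid from the backing track's BPM, and mark hits versus misses — can be sketched end to end as follows. All the function names, window lengths, thresholds, and tolerances here are illustrative assumptions, not the project's tuned values:

```python
import numpy as np
from scipy.signal import find_peaks

def detect_vocal_onsets(x, fs, win_ms=20.0):
    """Rectify and smooth the vocal, then pick peaks in the envelope as onsets."""
    rectified = np.abs(x)                          # full-wave rectification
    win = int(fs * win_ms / 1000)
    smooth = np.convolve(rectified, np.ones(win) / win, mode="same")  # moving average
    # The height threshold is an illustrative guess, not the project's tuned value.
    peaks, _ = find_peaks(smooth, height=0.2 * smooth.max(), distance=win)
    return peaks / fs                              # onset times in seconds

def expected_beat_times(bpm, duration_s):
    """Beat grid implied by the backing track's BPM."""
    return np.arange(0.0, duration_s, 60.0 / bpm)

def classify_beats(expected, detected, tol_s=0.1):
    """True where a detected onset lands within tol_s of an expected beat (a hit)."""
    nearest = np.min(np.abs(expected[:, None] - np.asarray(detected)[None, :]), axis=1)
    return nearest <= tol_s

# Synthetic test signal: decaying bursts on three beats of a 120 BPM grid.
fs = 8000
x = np.zeros(2 * fs)
burst = np.exp(-np.arange(400) / 80.0)
for beat in (0.0, 0.5, 1.5):                # the "rapper" skips the beat at 1.0 s
    n = int(beat * fs)
    x[n:n + 400] += burst

onsets = detect_vocal_onsets(x, fs)
grid = expected_beat_times(120, 2.0)        # beats at 0.0, 0.5, 1.0, 1.5 s
hits = classify_beats(grid, onsets)         # hit, hit, miss, hit
```

In a plot like the one above, `grid[hits]` would be drawn as the magenta (hit) lines and `grid[~hits]` as the green (missed) lines.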


Week of 10/20

This week our team is starting to finish our individual parts so we can put them together in time for the demo. We have run into one concern: we are not confident that all the different parts will be able to display to the UI with low enough latency. However, since this build is in preparation for our demo, we can iron out those issues later down the road.

This is an issue we expected to run into, as putting the different parts together is the most complicated step. We are also all polishing our individual parts. We see no other project-wide problems aside from those described in the individual sections.

Week of 10/20 – Jiahao Zhou

This week I optimized the rapping tempo detection program by applying better smoothing to the signal before onset detection. Here is a 30-second rap vocal track after it has been smoothed out.

I tested it on shorter signals and the beat detection seems to be much more accurate.
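The smoothing step itself is just a moving average over the rectified signal. A minimal sketch, where the 20 ms window length is an illustrative choice rather than the project's tuned value:

```python
import numpy as np

def moving_average(x, win):
    """Smooth a rectified signal with a length-win moving average."""
    return np.convolve(np.abs(x), np.ones(win) / win, mode="same")

fs = 8000
rng = np.random.default_rng(0)
x = rng.standard_normal(fs)                       # 1 s of noise as a vocal stand-in
smooth = moving_average(x, win=int(0.020 * fs))   # 20 ms window (illustrative)
```

A longer window gives a cleaner envelope but blurs closely spaced onsets together, so the window length trades off noise rejection against time resolution.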

So far, I am on track according to the new schedule. I have used up my week of slack, which means from now on I need to be very cautious with my time; I cannot fall behind. Next week I will work with Saransh to implement rapping tempo detection in the backend so we can get a working UI display up before the demo.

Week of 10/13

This week our group made more progress toward finishing our Minimum Viable Product. However, we had one team member out of town and another who was sick, so we weren't able to meet as often as we would have liked. We are still on track to finish the project on time given our slack. We are wary of last-minute bugs that may crop up as the code grows more complicated. Another aspect we have budgeted plenty of time for, in case a problem does arise, is combining application features. We can definitely see a lot of issues arising during this final integration phase, so we are keen on testing the individual units and making sure they work very well first.

Week of 10/13 – Jiahao Zhou

This week was spent optimizing the voice beat detection with different bands. Instead of treating the entire frequency spectrum equally, different frequency bands were given varying weights. I also generated more 4/4 beat test vocals; however, I was not able to finish testing because I was busy with midterms. In addition, I was sick beginning Wednesday and had to move most of the work to Friday and this coming weekend. Since I have a week of slack, I plan to shift everything back and still be on schedule.
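The band-weighting idea can be sketched by computing per-band spectral energies from the FFT and combining them with weights. The band edges and weights below are illustrative assumptions, not the values used in the project:

```python
import numpy as np

def weighted_band_energy(frame, fs, bands, weights):
    """Weighted sum of spectral energy over a list of (low, high) Hz bands."""
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    energies = [spectrum[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands]
    return float(np.dot(weights, energies))

fs = 8000
t = np.arange(1024) / fs
frame = np.sin(2 * np.pi * 200 * t)            # energy concentrated near 200 Hz
bands = [(0, 400), (400, 1600), (1600, 4000)]  # illustrative band edges
weights = [1.0, 0.5, 0.25]                     # emphasize the low band (assumption)
e = weighted_band_energy(frame, fs, bands, weights)
```

With a vocal signal, weighting the bands where rap onsets carry the most energy makes the detector respond less to energy in bands dominated by noise or bleed.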

Week of 10/6

We spent this week finishing the Design Review Document, which took up most of our time. We are on schedule given the pivot we made. There are no major concerns present as of now.

Week of 10/6 – Jiahao Zhou

This week was spent creating voice samples for testing, finishing the voice detection algorithm, and finishing the Design Review Document.

Instead of manually recording test samples and converting them in Audacity or some other software, I created a MATLAB script using audiorecorder that records short samples (4 seconds, or however long I want) and saves them directly in MATLAB. This let me create samples more easily and conveniently, since I didn't need external software and it removed the cumbersome step of manually converting and importing samples to test. I looked at both the time- and frequency-domain representations and found that it was generally easier to analyze the energy in the time domain, because the clear gaps between words produce large changes in energy over time. In the frequency domain, the rap vocals varied too much and didn't show any significant features to test on.
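The time-domain observation can be illustrated with a short-time energy curve, where the silent gaps between words show up as near-zero frames. This is a sketch in Python with an assumed 50 ms frame length, not the project's MATLAB analysis code:

```python
import numpy as np

def short_time_energy(x, frame_len):
    """Mean energy of consecutive non-overlapping frames."""
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return (frames ** 2).mean(axis=1)

fs = 8000
rng = np.random.default_rng(1)
word = rng.standard_normal(fs // 4)          # 0.25 s of "speech" (noise stand-in)
gap = np.zeros(fs // 4)                      # 0.25 s of silence between words
x = np.concatenate([word, gap, word, gap])
energy = short_time_energy(x, frame_len=fs // 20)   # 50 ms frames
```

The energy curve alternates between high values during the words and near zero during the gaps, which is exactly the large time-domain energy change described above.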

I also spent a lot of time finishing up the Design Review Document, which contains a much more detailed outline of how all the algorithms will work. Next week, I want to optimize the spectral energy density calculation I am performing on the vocals to get better accuracy, and I will adjust it based on how testing goes. So far, I am relatively on track. I am a few days behind because I was behind last week, but that is due to the pivot our group made. I still have slack time left, and the project is well within reach to finish on time.