Lin’s Status Report for 4/20

In the past two weeks I’ve been working with my teammates on integrating the audio detection with the web app. I changed the data structure of the output so that it’s no longer a dictionary but a class object that stores the note name and length. Besides the integration, I spent most of my time testing and revising the audio detection. Based on current test results, the system detects single notes with 100% accuracy. I also tested the C major and F major scales. The results are 80% accurate, and in the remaining 20% every error is within one half step (for example, the system outputs A#4 instead of A4). However, for input audio with a tempo higher than 100 BPM, the system becomes very inaccurate, and I haven’t yet found a way to solve this problem.
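To make the output format and the half-step errors concrete, here is a minimal sketch of what such a note class and an accuracy check could look like. This is not the project’s actual code: the names `Note`, `midi_number`, `semitone_error`, and `accuracy` are hypothetical, the unit of `length` is an assumption, and note names are assumed to use sharps with a single-digit octave (like A#4).

```python
from dataclasses import dataclass

# Semitone offset of each pitch class within an octave (sharps only).
_SEMITONES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
              "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


@dataclass(frozen=True)
class Note:
    """Hypothetical output record: a note name and its length."""
    name: str      # e.g. "A4" or "A#4"
    length: float  # duration; the unit (beats or seconds) is an assumption

    def midi_number(self) -> int:
        # Split "A#4" into pitch class "A#" and octave 4; A4 maps to 69.
        pitch, octave = self.name[:-1], int(self.name[-1])
        return 12 * (octave + 1) + _SEMITONES[pitch]


def semitone_error(detected: Note, expected: Note) -> int:
    """Distance in half steps between a detected and an expected note."""
    return abs(detected.midi_number() - expected.midi_number())


def accuracy(detected: list[Note], expected: list[Note]) -> float:
    """Fraction of positions where the detected note name is exactly right."""
    matches = sum(d.name == e.name for d, e in zip(detected, expected))
    return matches / len(expected)
```

With a helper like this, an A#4 reported in place of A4 shows up as a `semitone_error` of 1, which makes it easy to confirm that every miss in a test run stays within one half step.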

I’m still working on improving the overall accuracy of the audio detection by experimenting with several parameters, such as the sliding window size, the dB threshold of the input, and the tempo of the input audio. I hope to find a way to improve the performance before the final presentation.
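One simple way to organize this kind of tuning is a small grid search over the parameters. The sketch below is only an illustration under assumptions: `sweep` is a hypothetical helper, `evaluate` stands for whatever function runs the detector on a test recording and returns an accuracy score, and the candidate values in the usage example are placeholders rather than settings from the project.

```python
from itertools import product


def sweep(evaluate, window_sizes, thresholds_db):
    """Try every (window size, threshold) pair and return the best one.

    `evaluate(window_size, threshold_db)` is a caller-supplied function
    (hypothetical here) that returns a score where higher is better.
    Returns a tuple (best_score, best_params).
    """
    best = None
    for window, threshold in product(window_sizes, thresholds_db):
        score = evaluate(window, threshold)
        # Keep the first combination seen, then any strictly better one.
        if best is None or score > best[0]:
            best = (score, {"window_size": window, "threshold_db": threshold})
    return best


# Toy stand-in for a real evaluation, used only to show the call shape.
def toy_score(window, threshold):
    return -abs(window - 2048) - abs(threshold + 40)


best_score, best_params = sweep(toy_score, [1024, 2048, 4096], [-60, -40, -20])
```

Running the same sweep separately on slow and fast recordings would show whether any single setting works across tempos or whether the parameters need to depend on the tempo.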

 

While implementing the audio processor, I learned a lot about signal processing. All of my previous knowledge of signal processing came from 18290, so implementing the code for pitch detection was challenging for me. I searched for tutorials on YouTube and on online platforms such as StackExchange. There are existing discussions about how to do music transcription, and I learned from those posts. After I decided to use the Python libraries librosa and SciPy, I read through their documentation. I also looked at previous capstone projects on similar topics.
