Lin’s Status Report for 4/27

This week I worked on two main goals: improving the accuracy of the audio detection and finalizing the integration of the whole system. For the audio detection, I modified the pitch detection algorithm so that it now outputs ‘rest’ notes, which helps align the audio output with the fingering output. I tested with a recording of “Mary Had a Little Lamb” at a lower tempo, and the results improved. However, the system still has some unresolved issues. For example, if the reference note is [A3: 2s] and the audio processor outputs [A3: 1s, R: 0.2s, A3: 1s], it’s hard for the integration system to decide whether the note was played correctly. As for the integration of the whole system, Junrui and I redesigned the implementation logic for synchronizing the fingering and audio outputs.
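One possible post-processing step for this issue (a sketch, not the project's actual code; the helper name `merge_short_rests` and the 0.3 s tolerance are hypothetical) is to fold short rests back into the surrounding note before comparing against the reference:

```python
# Hypothetical sketch: merge rests shorter than a tolerance into the
# surrounding note, so [A3: 1s, R: 0.2s, A3: 1s] collapses to [A3: 2.2s].

def merge_short_rests(segments, tol=0.3):
    """segments: list of (note_name, duration_s) tuples; 'R' marks a rest."""
    merged = []
    for name, dur in segments:
        if merged and name == "R" and dur < tol:
            # Fold a short rest into the preceding note's duration.
            prev_name, prev_dur = merged[-1]
            merged[-1] = (prev_name, prev_dur + dur)
        elif merged and merged[-1][0] == name:
            # The same note resumes: extend it instead of starting a new entry.
            prev_name, prev_dur = merged[-1]
            merged[-1] = (prev_name, prev_dur + dur)
        else:
            merged.append((name, dur))
    return merged

# Example: the [A3: 1s, R: 0.2s, A3: 1s] case from above collapses to one note.
merged = merge_short_rests([("A3", 1.0), ("R", 0.2), ("A3", 1.0)])
```

With a cleaned-up sequence like this, the integration system can compare durations against the reference note directly.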

Next week, I’ll keep working to resolve this audio processor issue. I think the current pitch detection algorithm works fine, but I should add a function that post-processes the detected notes and note lengths. I’m mostly on track, but I hope to find a way to improve the accuracy of pitch detection and raise the system’s error tolerance.

Lin’s Status Report for 4/20

In the past two weeks I’ve been working with my teammates on the integration of the audio detection and the web app. I modified the data structure of the output so that it’s no longer a dictionary but a class object that stores the note name and length. In addition to the integration, I spent most of my time testing and revising the audio detection part. Based on current testing results, the system can detect single notes with 100% accuracy. I also tested the C major and F major scales. The results are 80% accurate, and the remaining 20% of errors are all within one half step (e.g., it outputs A#4 instead of A4). However, for input audio with a tempo higher than 100 BPM, the system becomes very inaccurate, and I haven’t yet come up with a way to solve this problem.
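A minimal sketch of what such a class object might look like (the name `Note` and its fields are my illustration here, not necessarily the exact shared definition we agreed on):

```python
from dataclasses import dataclass

# Hypothetical sketch of the shared output structure: a small class holding
# the note name and its length, replacing the earlier dictionary output.
@dataclass
class Note:
    name: str      # e.g. "A4", or "R" for a rest
    length: float  # duration in seconds

# Example output list from the audio processor in this representation.
notes = [Note("C4", 0.5), Note("R", 0.25), Note("D4", 0.5)]
```

A dataclass also gives value-based equality for free, which makes comparing detected notes against reference notes straightforward.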

I’m still working on improving the overall accuracy of the audio detection by trying multiple methods, such as changing the sliding window size, the threshold dB of the input, the tempo of the input audio, etc. I hope that I can find a way to improve the performance before the final presentation. 


During my implementation of the audio processor, I learned a lot about signal processing. All of my previous signal-processing knowledge comes from 18-290, so implementing the code for pitch detection was challenging for me. I searched for tutorials on YouTube and on platforms such as StackExchange, where there are discussions about music transcription that I learned from. After deciding to use the Python libraries Librosa and SciPy, I read through their documentation. I also looked at previous capstone projects with similar themes.

Lin’s Status Report for 4/6

This week I mainly worked on finalizing the audio processor. I modified the preprocessing part by changing the band-pass filter’s range and adding normalization of the input signal. I also worked on the pitch detection logic to improve its accuracy. After the demo on Wednesday, my teammates and I discussed how to integrate our parts together and came up with a common data structure. I modified my code so that it outputs that structure.
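The preprocessing described above could be sketched like this with SciPy (a sketch only: the filter order and the cutoff frequencies are illustrative placeholders, since the final range was still being tuned):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Sketch of the preprocessing step: band-pass filter the signal, then
# normalize its peak amplitude to 1. Cutoffs and order are illustrative.
def preprocess(signal, sr, low=300.0, high=2500.0, order=4):
    sos = butter(order, [low, high], btype="bandpass", fs=sr, output="sos")
    filtered = sosfiltfilt(sos, signal)  # zero-phase band-pass filtering
    peak = np.max(np.abs(filtered))
    return filtered / peak if peak > 0 else filtered

# Example: a 1 kHz tone (inside the pass band) comes out scaled to peak 1.
sr = 8000
t = np.arange(sr) / sr
clean = preprocess(np.sin(2 * np.pi * 1000 * t), sr)
```

Normalizing after filtering keeps the downstream thresholds independent of the recording level.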

I’m a little behind schedule. The main issue is that when I test my code with computer-generated music, it works perfectly with 100% accuracy, but with actual saxophone recordings played by Jordan, the accuracy decreases greatly. I’ll need to solve this problem next week and start integrating my code with the web app frontend.

Lin’s Status Report for 3/30

This week I mainly worked on finalizing the audio processing part to prepare for the upcoming demo. I kept working on the integration of rhythm and pitch. I tested an input audio recorded by Jordan and noticed some issues. The major one is that environmental noise causes some pitch detection inaccuracy, and the current band-pass filter doesn’t seem to filter it out. I’m still revising the code to reduce the effect of noise.

I am on schedule. I’ll keep working on my subpart to finalize it before the demo, and I’ll start integrating the system with the web app frontend once the demo is finished.

Lin’s Status Report for 3/23

This week I worked on the integration of rhythm processing and pitch detection. The integration step takes the rhythm output, a list of 0s and 1s, and the pitch output, a list of music notes; it then pairs each note with its rhythm slot and stores the result in a dictionary. I also worked on pre-processing the input audio by adding a band-pass filter to keep the signal in the range of 300 Hz–2500 Hz (this range is still under testing and may change).
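As a sketch, the pairing step could look like the following (the exact key/value layout of the dictionary is an assumption on my part; here each slot index maps to a note name, with "R" marking rests):

```python
# Hypothetical sketch of the pairing step: each rhythm slot gets the next
# detected note if the rhythm array flagged a peak there, and "R" otherwise.
def pair_rhythm_and_pitch(rhythm, pitches):
    result = {}
    pitch_iter = iter(pitches)
    for i, flag in enumerate(rhythm):
        result[i] = next(pitch_iter, "R") if flag == 1 else "R"
    return result

# Example: peaks in slots 0 and 2, a rest in slot 1.
paired = pair_rhythm_and_pitch([1, 0, 1], ["C4", "D4"])
```

This keeps the rhythm array as the timing backbone and consumes the pitch list only where a note onset was detected.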

I am on schedule. Next week, I’ll start testing the system with real-life saxophone input played by Jordan. I will check whether it meets the accuracy requirement we set in the proposal and adjust my code based on the results.

Lin’s Status Report for 3/16

This week I worked on the rhythm processor. I wrote code that detects peaks in the input audio signal over intervals of ⅛ of a beat. The program returns 1 if a peak is detected in an interval, indicating a note, and 0 if no peak is detected, indicating a rest. I also debugged the pitch processing by switching between different pitch detection algorithms and sliding-window sizes. The pitch processor should now be working accurately (I will do further testing next week).
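A minimal sketch of this interval-based detector, assuming an amplitude-envelope peak check per window (the function name, window math, and threshold value are my illustration, not the project's final code):

```python
import numpy as np
from scipy.signal import find_peaks

# Sketch: split the signal into 1/8-beat windows and emit 1 if a peak
# clears a height threshold in that window, else 0.
def rhythm_grid(signal, sr, tempo_bpm, threshold=0.5):
    win = int(sr * 60.0 / tempo_bpm / 8)  # samples in 1/8 of a beat
    env = np.abs(signal)                   # crude amplitude envelope
    grid = []
    for start in range(0, len(env) - win + 1, win):
        peaks, _ = find_peaks(env[start:start + win], height=threshold)
        grid.append(1 if len(peaks) else 0)
    return grid

# Example at 120 BPM, sr = 8000: one loud 1/8-beat window, then one silent one.
sr, tempo = 8000, 120
win = int(sr * 60 / tempo / 8)  # 500 samples per 1/8 beat
sig = np.concatenate(
    [np.sin(2 * np.pi * 440 * np.arange(win) / sr), np.zeros(win)]
)
grid = rhythm_grid(sig, sr, tempo)
```

A real version would likely use an onset-strength envelope rather than raw absolute amplitude, but the windowing logic is the same.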

I am on schedule. Next week, I’ll start implementing the integration of pitch and rhythm, whose output should be a dictionary or notes array containing each note and its duration. I will also work on pre-processing to remove the pitch detection inaccuracies in the first few seconds of an input audio.

Lin’s Status Report for 3/9

This week I mainly focused on revising my code for the pitch processor. I tested eight different saxophone music files as inputs, and the outputs were not very accurate: sometimes correct, sometimes very wrong. I believe there are some miscalculations in my code, and I’m still revising it. I tried modifying the code to use a single FFT instead of the STFT, but that turned out worse, so I decided to stick with the STFT. For the rhythm processor, I decided to build an offline processor rather than a real-time one, since my research showed that real-time processing requires far more work. I will finish the offline processing first and work on real-time processing if I have time left.

(For example, the output notes shown here were a mess.)

I’m slightly behind schedule this week due to the unexpected pitch processing errors. I’m working to fix the issues and expect to have them resolved by Monday. After that, I will keep working on the rhythm processor and test several saxophone pieces played by Jordan as inputs.

Lin’s Status Report for 2/24

This week, since I’m the presenter for my group, I prepared for the design review presentation on Wednesday. I practiced a lot to make sure I wouldn’t need to look at notes during the presentation. Aside from the presentation, I researched how the rhythm processor could work, since my previous work had mainly focused on pitch processing. After reading past research and projects, I decided to use a window size of 1/8 of a beat.

I also looked into Librosa to see how to find peaks in an input audio over a 1/8-beat window. I didn’t find such a function there, but I found the find_peaks function in the SciPy library, so I decided to use that. However, the code still has some bugs and cannot detect peaks correctly yet.
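For reference, scipy.signal.find_peaks returns the indices of local maxima that satisfy the given constraints, which is the behavior I’m relying on (the toy array below is just an illustration):

```python
import numpy as np
from scipy.signal import find_peaks

# Toy example: find_peaks returns indices of local maxima whose height
# clears the threshold. Samples at the array edges are never peaks.
x = np.array([0.0, 1.0, 0.2, 0.1, 0.9, 0.0])
peaks, props = find_peaks(x, height=0.5)
```

Here the maxima at indices 1 and 4 both exceed the 0.5 height threshold, so `peaks` contains exactly those indices; `props["peak_heights"]` holds their amplitudes.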

I am slightly behind schedule this week since I had both the design presentation and a midterm paper due. I will work harder next week to catch up. I want to finish debugging my code by next week and test with some input audio recorded from a real saxophone.

Lin’s Status Report for 2/17

This week my main focus was the pitch detection part of the audio processing. I have moved all my work from Matlab to Python, since Python will be the primary language of our web app’s backend. Currently, I apply a short-time Fourier transform (STFT) to the input audio to get its frequency content, then convert the detected pitch into MIDI numbers and map them to music notes. I tried several Python libraries that deal with music and audio analysis, including Librosa and SciPy. For now I’ve decided to use Librosa, mainly because it is widely recommended by StackOverflow users, but if things don’t work out later I’ll switch to SciPy. I’m able to extract pitches and notes from a 12-TET diatonic scale, as shown in the graph below.
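The pipeline above (STFT frame → dominant frequency → MIDI number → note name) can be sketched with NumPy alone; the conversion formula below is the same one that librosa.hz_to_midi implements (A4 = 440 Hz = MIDI 69), and the frame length is an illustrative choice:

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def hz_to_note(freq):
    # Standard 12-TET conversion: MIDI 69 = A4 = 440 Hz, 12 semitones/octave.
    midi = int(round(69 + 12 * np.log2(freq / 440.0)))
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)

def dominant_pitch(frame, sr):
    # One STFT column: apply a Hann window, take the FFT, and return the
    # frequency of the strongest bin.
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return freqs[np.argmax(spectrum)]

# Example: a 440 Hz sine in a 4096-sample frame should come out as A4.
sr = 22050
t = np.arange(4096) / sr
note = hz_to_note(dominant_pitch(np.sin(2 * np.pi * 440 * t), sr))
```

Picking the single strongest bin is the simplest estimator; a saxophone's strong harmonics are one reason octave and half-step errors can creep in with real recordings.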

My progress is on track this week. My code currently handles music files under 10 s, and I plan to work on longer pieces next week. Our design goal is to perform pitch detection on roughly 60 s music files, so I will start working toward that next week. I will also try some low-SNR input files to test the pre-processing part of my code.

Lin’s Status Report for 2/10

This week’s main goal was to finish the proposal presentation and familiarize ourselves with our sub-parts. Since I am in charge of developing the audio processor, I have started researching its essential steps. The audio processing will be divided into pre-processing and pitch detection. I started working on the pre-processing part in Matlab and tested several filtering algorithms on an input audio. I also researched several Python libraries for music processing, such as Librosa and SciPy.

My progress is currently on schedule, but I hope to accomplish more next week. I plan to finish the audio pre-processing next week and convert it to Python (if the web app frontend can be set up by then). I will also start researching pitch detection next week.