Team Status Report for 4/27

The biggest risk we face right now is the accuracy of the audio processor and the integration of the whole system. The audio processor's accuracy still hasn't reached the metrics we set earlier, which could in turn limit the accuracy of the whole system. Lin is currently working on improving the audio detection logic. If she can't improve the accuracy, we'll modify the integration logic instead to increase the mismatch tolerance rate. As for the integration, we are still writing the code since the implementation logic changed. We plan to finish the implementation by Sunday so that we have two days to test the system before the poster is due.

There is no change to the existing schedule. What remains is to do our best to improve the current system and finalize the implementation.

Test results for fingering detection:

Test results for audio detection. Based on the results, we found that faster tempos make pitch detection less accurate: the slower and more clearly the user plays the notes, the more accurate the result. We will keep working to improve the accuracy before the final.

Test results for the web app (not finalized yet). We will keep working on web app testing before the final.

Lin’s Status Report for 4/27

This week I worked on two main goals: improving the accuracy of the audio detection and finalizing the integration of the whole system. For the audio detection, I modified the pitch detection algorithm so that it now outputs ‘rest’ notes, which helps align the audio output with the fingering output. I tested “Mary Had a Little Lamb” at a lower tempo, and the results improved. However, the system still has some unresolved issues. For example, if the reference note is [A3: 2s] and the audio processor outputs [A3: 1s, R: 0.2s, A3: 1s], it’s hard for the integration system to decide whether the note was played correctly. As for the integration of the whole system, Junrui and I redesigned the implementation logic for synchronizing the fingering and audio.
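
One fix we’re considering is a post-processing pass that absorbs a very short rest sandwiched between two segments of the same note before the comparison happens. A rough sketch of the idea (the function name and the 0.25 s tolerance are placeholder assumptions, not our final implementation):

    def merge_short_rests(events, tolerance=0.25):
        """events: list of (note_name, seconds) tuples; 'R' marks a rest."""
        merged = []
        i = 0
        while i < len(events):
            name, dur = events[i]
            # A short rest between two segments of the same note is
            # treated as a detection artifact and absorbed into the note.
            if (name == 'R' and dur < tolerance and merged
                    and i + 1 < len(events)
                    and events[i + 1][0] == merged[-1][0]):
                merged[-1] = (merged[-1][0],
                              merged[-1][1] + dur + events[i + 1][1])
                i += 2
            elif merged and merged[-1][0] == name:
                # The same note continues: combine the two segments.
                merged[-1] = (name, merged[-1][1] + dur)
                i += 1
            else:
                merged.append((name, dur))
                i += 1
        return merged

    # merge_short_rests([('A3', 1.0), ('R', 0.2), ('A3', 1.0)])
    # -> [('A3', 2.2)]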

Next week, I’ll keep working to solve the audio processor’s issues. I think the current pitch detection algorithm works fine, but I should add a function that post-processes the detected notes and note lengths. I’m mostly on track, but I hope I can find a way to improve the pitch detection accuracy and raise the system’s error tolerance.

Lin’s Status Report for 4/20

In the past two weeks I’ve been working with my teammates on integrating the audio detection and the web app. I modified the output data structure so that it’s no longer a dictionary but a class object that stores the note name and length. In addition to the integration, I spent most of my time testing and revising the audio detection part. Based on the current testing results, the system detects single notes with 100% accuracy. I also tested the C major and F major scales. The results are 80% accurate, and the remaining 20% of errors are all within one half step (e.g., it outputs A#4 instead of A4). However, for input audio with a tempo above 100 BPM, the system becomes very inaccurate, and I haven’t come up with a way to solve this problem yet.
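
For reference, the shared data structure looks roughly like the sketch below; the field names here are placeholders rather than the exact ones in our code.

    class Note:
        """One detected note: its pitch name and how long it lasts."""
        def __init__(self, name, length):
            self.name = name      # e.g. 'A4', or 'R' for a rest
            self.length = length  # duration in seconds

        def __repr__(self):
            return f"Note({self.name!r}, {self.length})"

    # The audio processor now returns a list such as:
    # [Note('E4', 0.5), Note('D4', 0.5), Note('C4', 1.0)]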

I’m still working on improving the overall accuracy of the audio detection by trying multiple approaches, such as changing the sliding window size, the input’s dB threshold, and the tempo of the input audio. I hope to find a way to improve the performance before the final presentation.

During my implementation of the audio processor, I learned a lot about signal processing. All of my previous signal processing knowledge comes from 18290, and implementing the pitch detection code was challenging for me. I searched for tutorials on YouTube and on platforms such as StackExchange, where I learned from earlier discussions about music transcription. After I decided to use the Python libraries librosa and SciPy, I read through their documentation. I also looked at previous capstone projects with similar themes.

Lin’s Status Report for 4/6

This week I mainly worked on finalizing the audio processor. I modified the preprocessing by changing the band-pass filter’s range and adding normalization of the input signal. I also worked on the pitch detection logic to improve its accuracy. After the demo on Wednesday, my teammates and I discussed how to integrate our parts together and came up with a common data structure. I modified my code so that it outputs the data structure we agreed on.
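
In condensed form, the preprocessing now does roughly the following (the filter order and cutoff values shown are placeholders, not necessarily the exact ones in my code):

    import numpy as np
    from scipy.signal import butter, sosfiltfilt

    def preprocess(signal, sr, low_hz=300.0, high_hz=2500.0):
        # Peak-normalize so recordings at different volumes are comparable.
        signal = signal / (np.max(np.abs(signal)) + 1e-12)
        # Butterworth band-pass around the saxophone's pitch range.
        sos = butter(4, [low_hz, high_hz], btype='bandpass',
                     fs=sr, output='sos')
        return sosfiltfilt(sos, signal)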

I’m a little behind schedule. The main issue is that when I test my code with computer-generated music, it works perfectly, with 100% accuracy, but with actual saxophone recordings played by Jordan, the accuracy drops sharply. I’ll need to solve this problem next week and start integrating my code with the web app frontend.

Team Status Report for 3/30

Our major concern right now is that the subsystems are not yet finalized but the demo is next Monday. Lin is still working on the pitch detection code, and Junrui is still finishing the implementation of the web app practice page. We’ll do what we can to polish each part. Another concern is that we might not have enough time to integrate the subsystems. We’re still discussing which platform to use for web app deployment. As soon as we finish the demo, we’ll start working on integration.

We have updated our schedule as shown in the chart. There’s not much change compared to the original, except that we postponed the integration process by one week.

Lin’s Status Report for 3/30

This week I mainly worked on finalizing the audio processing part to prepare for the coming demo, and I kept working on the integration of rhythm and pitch. I tested an input audio recording made by Jordan and noticed some issues. The major one is that environmental noise causes some pitch detection inaccuracy, and the current band-pass filter doesn’t filter it out. I’m still revising the code to reduce the effect of the noise.
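
One possible approach is to gate out low-energy frames so that background noise between notes never reaches the pitch detector. Sketched here with assumed parameters (the -40 dB threshold and frame sizes are illustrative guesses, not tested values):

    import numpy as np
    import librosa

    def energy_gate(y, threshold_db=-40.0, frame_length=2048, hop_length=512):
        # Per-frame RMS energy, in dB relative to the loudest frame.
        rms = librosa.feature.rms(y=y, frame_length=frame_length,
                                  hop_length=hop_length)[0]
        db = librosa.amplitude_to_db(rms, ref=np.max)
        # Keep only frames loud enough to plausibly contain a note.
        keep = db > threshold_db
        # Stretch the frame-level mask back to sample resolution.
        mask = np.repeat(keep, hop_length)[:len(y)]
        mask = np.pad(mask, (0, len(y) - len(mask)))
        return y * mask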

I am on schedule. I’ll keep working on my subsystem to finalize it before the demo, and I’ll start integrating it with the web app frontend once the demo is done.

Lin’s Status Report for 3/23

This week I worked on integrating rhythm processing and pitch detection. The integration step takes the rhythm output array, which is a list of 0s and 1s, and the pitch output array, which is a list of music notes; it then pairs each note with its rhythm flag and stores the result in a dictionary. I also worked on preprocessing the input audio by adding a band-pass filter that keeps the signal in the range of 300 Hz to 2500 Hz (this range is still under testing and may change).
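
In simplified form, the pairing step works like this (variable names are illustrative):

    def pair_rhythm_and_pitch(rhythm, pitches):
        """rhythm: 0/1 flag per 1/8-beat interval; pitches: one note
        name per interval that was flagged 1."""
        result = {}  # interval index -> note name, or 'R' for a rest
        pitch_iter = iter(pitches)
        for i, flag in enumerate(rhythm):
            result[i] = next(pitch_iter) if flag == 1 else 'R'
        return result

    # pair_rhythm_and_pitch([1, 1, 0, 1], ['C4', 'D4', 'E4'])
    # -> {0: 'C4', 1: 'D4', 2: 'R', 3: 'E4'}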

I am on schedule. Next week, I’ll start testing the system with real-life saxophone input played by Jordan. I will check whether it meets the accuracy requirement we set in the proposal and adjust my code based on the results.

Lin’s Status Report for 3/16

This week I worked on the rhythm processor. I wrote code that detects peaks in the input audio signal over intervals of ⅛ of a beat. The program returns 1 if a peak is detected in an interval, indicating a note, and 0 if no peak is detected, indicating a rest. I also debugged the pitch processing by switching between different pitch detection algorithms and sliding window sizes. The pitch processor should now be working accurately (I will do further testing next week).
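
The per-interval check is roughly the following (the peak-height threshold is an assumed value, and scipy.signal.find_peaks stands in here for the actual peak detector):

    import numpy as np
    from scipy.signal import find_peaks

    def rhythm_flags(y, sr, tempo_bpm, min_height=0.1):
        # Number of samples in 1/8 of a beat at the given tempo.
        samples_per_interval = int(sr * 60.0 / tempo_bpm / 8)
        flags = []
        for start in range(0, len(y), samples_per_interval):
            window = np.abs(y[start:start + samples_per_interval])
            peaks, _ = find_peaks(window, height=min_height)
            flags.append(1 if len(peaks) > 0 else 0)
        return flags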

I am on schedule. Next week, I’ll start implementing the integration of pitch and rhythm, whose output should be a dictionary or note array containing both each note and its duration. I will also work on preprocessing to remove the pitch detection inaccuracies in the first few seconds of an input audio.

Team Status Report for 3/9

Part A written by Lin: Recognizing that saxophonists come from very diverse geographic locations and backgrounds, our system ensures accessibility for users worldwide. We chose to build a web app instead of a phone app so that it’s more convenient and accessible. We aim to make the web app interface intuitive and user-friendly so that people of different educational levels can all use it easily. We also want to include a learning page in the web app to help saxophone beginners worldwide learn better.

Part B written by Junrui: This project takes cultural factors into account by emphasizing the universality of music as a learning tool. Although the app is offered in English with a set selection of songs, the design is mindful of the diversity in musical practices and the educational needs of users from different cultural backgrounds. The project avoids cultural bias by focusing on the technical aspects of saxophone playing, which are common across cultures, thereby ensuring inclusivity.

Part C written by Jordan: This project focuses on helping beginning players get better at the saxophone. Doing so can also benefit the environment by reducing unpleasant noise, especially around aspiring saxophone players.

Our most significant risk right now is that the audio processor has some unfixed bugs. If the pitch detector doesn’t perform as expected, the error rate of the overall system will be unacceptably high. To resolve the issue, Lin will put in extra work in the coming days and ask the TA and professor for help if needed. Another risk is that the hardware construction for fingering collection is a little behind schedule; Jordan will work to catch up next week.

We have made a slight change to our design. We decided to focus on offline audio processing for now instead of real-time processing due to scheduling issues. If we finish the audio processor earlier than expected, we will work on converting it to real time; if not, we will stick with offline processing. Other than that, our design stays the same as in the design report.

Lin’s Status Report for 3/9

This week I mainly focused on revising my code for the pitch processor. I tested eight different saxophone music files as inputs, and the outputs are not very accurate: sometimes the output is correct, but sometimes it is very wrong. I believe there are some miscalculations in my code, and I’m still revising and repairing it. I tried modifying the code to use a single FFT instead of an STFT, but that turned out worse, so I decided to stick with the STFT. For the rhythm processor, I decided to build an offline processor instead of a real-time one, since my research showed that real-time processing requires far more work. I will finish the offline version first and then work on real-time processing if time remains.
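
For context, a heavily reduced version of the STFT approach looks like the sketch below (window and hop sizes are assumed values, not necessarily the ones in my code). A known weakness of picking the single strongest bin, as this sketch does, is that it can lock onto a harmonic instead of the fundamental, which is one class of error I’m checking my code for:

    import numpy as np
    import librosa

    def naive_stft_pitch(y, sr, n_fft=4096, hop_length=512):
        # Magnitude spectrogram: one column per time frame.
        S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop_length))
        freqs = librosa.fft_frequencies(sr=sr, n_fft=n_fft)
        notes = []
        for frame in S.T:
            # Pick the strongest non-DC frequency bin in this frame.
            idx = frame[1:].argmax() + 1
            notes.append(librosa.hz_to_note(freqs[idx]))
        return notes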

(For example, the output notes shown here are a mess.)

I’m slightly behind schedule this week due to the unexpected pitch processing errors. I’m working on the issues and expect to fix them by Monday. After that, I will keep working on the rhythm processor and test several saxophone pieces played by Jordan as inputs.