Team D5: Sonic Score Saxophonics – Carnegie Mellon ECE Capstone, Spring 2024

April 30, 2024

Jordan’s Status Report for 4/27

This week, I worked mostly on testing and on the final report and poster. I took part in system testing with the rest of the team by playing the saxophone and helping analyse the test results. I also started working on the poster and final report with the rest of the team.

There are currently no roadblocks, except maybe time, since testing is on a tight schedule now. For next week, our plan is to finish up the testing, prepare for the demo, and finish the final report.

April 28, 2024

Team Status Report for 4/27

The biggest risks we face right now are the accuracy of the audio processor and the integration of the whole system. The accuracy of the audio processor still hasn’t reached the metrics we set earlier, and this could in turn reduce the accuracy of the whole system. Lin is currently working on improving the audio detection logic. If the accuracy does not improve, we’ll instead modify the integration logic to raise the mismatch tolerance. As for the integration, we are still writing the code because the implementation logic changed. We plan to finish the implementation by Sunday so that we have two days to test the system before the poster is due.

There is no change to the existing schedule. All we need to do is try our best to improve our current system and finalize the implementation.

Test result for fingering detection:

Test result for audio detection. Based on the results, we found that tempo affects the accuracy of pitch detection: the slower and more clearly the user plays the notes, the more accurate the result. We will keep working to improve the accuracy before the final.

Test result for the web app (not finalized yet). We will continue testing the web app before the final.

April 28, 2024

Lin’s Status Report for 4/27

This week I worked on two main goals: improving the accuracy of the audio detection, and finalizing the integration of the whole system. For the audio detection, I modified the pitch detection algorithm so that it now outputs ‘rest’ notes, which helps align the audio output with the fingering output. I tested a recording of “Mary Had a Little Lamb” at a lower tempo, and the results improved. However, the system still has some unresolved issues. For example, if the reference note is [A3: 2s] and the audio processor outputs [A3: 1s, R: 0.2s, A3: 1s], it’s hard for the integration system to decide whether the note was played correctly. For the integration of the whole system, Junrui and I redesigned the logic that synchronizes the fingering and audio data.

Next week, I’ll keep working on resolving this issue in the audio processor. I think the current pitch detection algorithm works fine, but I should add a function that post-processes the detected notes and note lengths. I’m mostly on track, but I hope I can find a way to improve the accuracy of pitch detection and raise the system’s error tolerance.
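One possible shape for such a post-processing step is sketched below. It is only an illustration, not the team's implementation: the `Note` class, the `merge_short_rests` function, and the 0.25 s `max_gap` threshold are all assumptions. The idea is to fold a short rest back into the surrounding note when the same pitch appears on both sides of it.

```python
from dataclasses import dataclass


@dataclass
class Note:
    name: str      # pitch name such as "A3", or "R" for a rest (assumed convention)
    length: float  # duration in seconds


def merge_short_rests(notes, max_gap=0.25):
    """Merge 'X, short rest, X' into one X. max_gap is an assumed threshold."""
    merged = []
    i = 0
    while i < len(notes):
        cur = notes[i]
        is_short_rest = cur.name == "R" and cur.length <= max_gap
        if (is_short_rest and merged and merged[-1].name != "R"
                and i + 1 < len(notes)
                and notes[i + 1].name == merged[-1].name):
            # Absorb the rest and the repeated note into the previous note.
            prev = merged[-1]
            merged[-1] = Note(prev.name, prev.length + cur.length + notes[i + 1].length)
            i += 2
        else:
            merged.append(Note(cur.name, cur.length))
            i += 1
    return merged


detected = [Note("A3", 1.0), Note("R", 0.2), Note("A3", 1.0)]
print(merge_short_rests(detected))
```

With the example from the report, the three detected segments collapse into a single A3 of roughly 2.2 s, which can then be compared against the 2 s reference with a length tolerance. The trade-off is that a note that was genuinely re-articulated after a very short gap would also be merged, so the threshold would need tuning against real recordings.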

April 27, 2024

Junrui’s Status Report for 4/27

Since I am the presenter for our team’s final presentation, I spent the first three days of this week preparing for it. Then I discussed our system’s integration logic with Lin and modified it. Initially, when the system looped through the fingering data, it took ‘slices’ of multiple lines and calculated the most common line in each slice, which made the code inefficient. This step has now been removed, and we have decided on new mistake-tracking logic for mismatch tolerance, which should reduce the latency. However, since I have some final homework deadlines this week, I didn’t have time to test the new implementation. I will continue to work on that with Lin on Sunday.
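The report does not describe the new mistake-tracking logic in detail, so the following is only a guess at what a single-pass approach could look like. The function name `find_mistakes`, the `tolerance` parameter, and the idea of comparing per-sample expected and observed values are all assumptions made for illustration.

```python
def find_mistakes(expected, observed, tolerance=3):
    """Return start indices of mismatch runs longer than `tolerance` samples.

    A run of mismatches is ignored unless it lasts for more than
    `tolerance` consecutive samples (an assumed rule, not the team's).
    """
    mistakes = []
    run = 0  # length of the current run of consecutive mismatches
    for i, (exp, obs) in enumerate(zip(expected, observed)):
        if exp == obs:
            run = 0
        else:
            run += 1
            if run == tolerance + 1:
                # Report the index where this run of mismatches began.
                mistakes.append(i - tolerance)
    return mistakes


expected = ["A"] * 10
observed = ["A", "B", "A", "A", "B", "B", "B", "B", "A", "A"]
print(find_mistakes(expected, observed))  # [4]
```

In this example the one-sample glitch at index 1 is tolerated, while the four-sample run starting at index 4 is reported. Because the loop visits each sample once and keeps only a counter, it avoids building slices and counting the most common line in each.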

I am almost on track with the schedule in the integration phase, but the schedule is quite tight as the final report, demo, and video deadlines are approaching. Next week I plan to finish the integration, conduct more tests with real inputs from users, and start working on the report and video.

April 21, 2024

Team’s Status Report for 4/20

The biggest risk we face now is still integration. Our individual work is complete, though some improvements can still be made to increase accuracy. For the integration, however, synchronizing the fingering detection data with the audio detection data is problematic. Both the timing and the fault tolerance need refinement. The accuracy after integration currently falls short of our expectations, and because of the inaccurate synchronization the web app also incorrectly flags some minor deviations from the reference solution. To deal with this risk, our team will work together on the integration tests and assist Junrui in writing some of the integration code.

The schedule remains the same as the previous week, since the remaining time is planned for integration tests.

April 21, 2024 (updated April 27, 2024)

Junrui’s Status Report for 4/20

During the past two weeks, I have been working on integrating the web app with the other two systems. I managed to write to the serial port to notify the fingering detection system to start, and to read from the serial port to get the real-time fingering info. The backend then stores the fingering info in a buffer and generates a relative timestamp for each line according to its position in the buffer. For the audio detection system, I managed to send a request to a new API endpoint that triggers the audio detection script in the backend. In addition, I modified the logic of the practice page and added start, end, and replay buttons so that a user can only see their performance after the entire practice session has been recorded, since the audio detection process is not real-time.
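As a rough illustration of position-based timestamps, the sketch below spreads the buffered lines evenly over the recording. The function name `relative_timestamps` and the `total_duration` parameter are hypothetical, and the even spacing is an assumption; the report does not say how the backend actually converts a position into a time.

```python
def relative_timestamps(buffer, total_duration):
    """Pair each buffered line with a timestamp derived from its position.

    Assumes lines arrived at a uniform rate over `total_duration` seconds.
    """
    if not buffer:
        return []
    step = total_duration / len(buffer)  # seconds represented by one line
    return [(round(i * step, 3), line) for i, line in enumerate(buffer)]


lines = ["0000", "0001", "0001", "0011"]
print(relative_timestamps(lines, 2.0))
# [(0.0, '0000'), (0.5, '0001'), (1.0, '0001'), (1.5, '0011')]
```

If lines do not in fact arrive at a uniform rate, timestamps computed this way will drift from the true key-press times, so recording an arrival time for each line as it is read would be a possible alternative.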

I am currently on schedule. However, since the timestamps generated for the fingering data are not very accurate, the synchronized result is not ideal most of the time. At this point it’s hard for me to conduct integration tests and get results with even moderate accuracy. Next week, I plan to think of ways to improve the situation, better synchronize all three parts, and try to get better test results.

Extra question:

To successfully implement and debug my project, I had to learn JavaScript for dynamic client-side interactions in HTML. Since most of what Django can do is static, the JS sections in HTML help me a lot in constructing the web app. I also learned the Web Serial API for integrating the web app with the fingering detection system, and SVG for dynamic diagram coloring. I used interactive tutorials, official documentation, online discussion forums like Stack Overflow, and hands-on experiments to acquire this new knowledge and deepen my understanding.

April 21, 2024

Lin’s Status Report for 4/20

In the past two weeks I’ve been working on the integration of the audio detection and the web app with my teammates. I modified the data structure of the output so that it’s no longer a dictionary, but a class object that stores the note name and length. In addition to the integration, I spent most of the time testing and revising the audio detection part. Based on current testing results, the system can detect single notes with 100% accuracy. I also tested the C major and F major scales. The result is 80% accurate, and for the remaining 20% the errors are all within one half step (for example, it outputs A#4 instead of A4). However, for input audio with a tempo higher than 100, the system gets very inaccurate, and I haven’t come up with a way to solve this problem.
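A minimal sketch of such a note object, together with a helper for measuring how far off a detected pitch is, might look like the following. The class name `Note`, its field names, and the `half_steps_apart` helper are assumptions for illustration; only sharps are handled, matching the A#4 example above.

```python
from dataclasses import dataclass

# Semitone offset of each pitch class within an octave (sharps only).
SEMITONES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
             "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


@dataclass
class Note:
    name: str      # pitch name with a single-digit octave, e.g. "A4"
    length: float  # duration in seconds


def midi_number(name):
    """Convert a name like 'A#4' to its MIDI note number (A4 = 69)."""
    pitch, octave = name[:-1], int(name[-1])
    return 12 * (octave + 1) + SEMITONES[pitch]


def half_steps_apart(a, b):
    """Absolute distance between two notes in half steps."""
    return abs(midi_number(a.name) - midi_number(b.name))


print(half_steps_apart(Note("A#4", 0.5), Note("A4", 0.5)))  # 1
```

A distance function like this would let a test script separate exact matches from near misses of one half step when reporting accuracy.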

I’m still working on improving the overall accuracy of the audio detection by trying multiple methods, such as changing the sliding window size, the threshold dB of the input, the tempo of the input audio, etc. I hope that I can find a way to improve the performance before the final presentation.

During my implementation of the audio processor, I learned a lot about signal processing. All of my previous knowledge about signal processing came from 18290, so implementing the code for pitch detection was challenging for me. I searched for tutorials on YouTube and on online platforms such as StackExchange. There are some discussions there about how to do music transcription, and I learned from these earlier posts. After I decided to use the Python libraries librosa and SciPy, I read through their documentation. I also looked at previous capstone projects on similar topics.

April 21, 2024

Jordan’s Status Report for 4/20

During my work on this project, I learned how to solder a PCB and how to work with the Arduino IDE in a real project setting (before, I was just messing around with it), especially how to implement interrupts on an Arduino.

For soldering, I learned from friends who I knew had soldering experience. One friend soldered a few spots on my PCB while showing me how to do it, and after that I was able to solder the rest of the components with no issue. For the Arduino IDE, I learned by looking up information online and experimenting with existing code to see what I could take from it and apply to my own.

Normal Report Portion:

This week, we advanced further into integration. We have finalised the communication protocols between our systems, and the communications are all up and working now. Specifically for me, I changed my code so that it accounts for keys that are not pressed but whose sensors are activated. For example, when certain right-hand keys are pressed, a valve that is connected to a left-hand key also gets triggered, even if the left-hand key is not pressed. I changed the code so that the left-hand key is excluded in the web app in that case. I experimented with all 22 keys and their combinations, and finalised the rules for this. Now when I press one key, only one digit in the data packet changes.
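The exclusion rule can be pictured with a small sketch. This is written in Python for illustration rather than as Arduino code, and the key indices in `LINKED`, the packet layout (a string of 22 digits, one per key), and the function name `mask_linked_keys` are all hypothetical; the actual rules were derived from experiments on the instrument.

```python
# Hypothetical linkage table: pressing the key at index `src` also trips
# the sensor at each index in the list, even though that key is untouched.
LINKED = {15: [4], 16: [4, 5]}


def mask_linked_keys(packet):
    """Clear the digits of linked keys whose sensors fire as a side effect.

    `packet` is assumed to be a string of 22 '0'/'1' digits, one per key.
    """
    bits = list(packet)
    for src, dsts in LINKED.items():
        if packet[src] == "1":  # check the raw packet, not the edited bits
            for dst in dsts:
                bits[dst] = "0"
    return "".join(bits)


raw = ["0"] * 22
raw[15] = "1"  # key actually pressed
raw[4] = "1"   # sensor tripped through the mechanical linkage
print(mask_linked_keys("".join(raw)))
```

After masking, only the digit for the pressed key remains set. One limitation of a rule this simple is that it would also hide a genuine press of the linked key while the source key is held, so real rules may need to be more specific per combination.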

I have also assisted my teammates in their integration efforts. For example, I added an interrupt that starts and ends data sending only when requested from the web app.

In terms of testing, my individual components pass all testing requirements. The delay from the Arduino serial monitor is negligible, but measuring it precisely would be extremely difficult. I would need to record the exact time at which a key is pressed and the exact time at which the output shows up, which I am not sure how to do. However, the small delay is not noticeable to the end user, and all standard saxophone keys are confirmed to update the data packets promptly when they are pressed.

In terms of deliverables next week, the focus is integration testing, to ensure that not only each component works, but the system works as a whole, and the user experience is smooth.

April 7, 2024

Team Status Report for 4/6

Similar to last week, the biggest risk we face right now is still integration. Most of our individual work is either complete or close to complete, at least the core parts. We have 2-3 weeks left for integration, and we all have work to do to help the other people on the team.

We have agreed on a framework for communication between our three parts, and we now need to modify each part so that it can interact with the other two. The schedule is the same as last week, attached here:

April 7, 2024

Lin’s Status Report for 4/6

This week I mainly worked on finalizing the audio processor. I modified the preprocessing part by changing the band-pass filter’s range and adding normalization of the input signal. I also worked on the pitch detection logic to improve its accuracy. After the demo on Wednesday, my teammates and I discussed how to integrate our parts, and we came up with a common data structure. I modified my code so that it outputs the data structure we agreed on.

I’m a little behind schedule. The main issue is that when I test my code with computer-generated music, it works perfectly, with 100% accuracy. But with actual saxophone recordings played by Jordan, the accuracy decreases greatly. I’ll need to solve this problem next week and start integrating my code with the web app frontend.