Ben’s Status Report 12/7

The initial part of the week (up through Sunday) was largely spent getting the final presentation in order and helping Mathias plan and practice.

Other progress includes getting two audio streams from the two Scarlett device inputs to work at the same time, though with some noise issues. The remaining issue is properly interleaving the stereo channels of the piano microphone input to ensure the levels are balanced and notes are heard as accurately as possible.
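For illustration, capturing the two inputs concurrently can be done with two input streams; the sketch below uses the sounddevice library, and the device indices, sample rate, and block handling are placeholder assumptions rather than our actual configuration:

```python
# Hypothetical sketch: capture two Scarlett inputs concurrently with sounddevice.
# Device indices and sample rate are placeholder assumptions.
import queue
import sounddevice as sd

SAMPLE_RATE = 44100
piano_blocks, voice_blocks = queue.Queue(), queue.Queue()

def make_callback(q):
    def callback(indata, frames, time, status):
        if status:
            print(status)          # report over/underruns instead of failing silently
        q.put(indata.copy())       # accumulate blocks for later processing
    return callback

piano_stream = sd.InputStream(device=1, channels=2, samplerate=SAMPLE_RATE,
                              callback=make_callback(piano_blocks))   # stereo ORTF pair
voice_stream = sd.InputStream(device=2, channels=1, samplerate=SAMPLE_RATE,
                              callback=make_callback(voice_blocks))   # singer microphone

with piano_stream, voice_stream:
    sd.sleep(5000)  # record both inputs for ~5 seconds
```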

As things wrap up, I am shifting focus toward the deliverables ahead, including the final poster, video, and paper, though I hope to still spend some time optimizing the system a bit more.

In reflection, the audio processing seems to mainly pick up only the lower notes of the piano when multiple are played at the same time. Knowing this, we could have chosen a cardioid microphone instead of the ORTF pair to record the lower end of the piano with higher accuracy and less noise. Given more time, I would have liked to include active normalization of the audio so the processing was more consistent and we could choose (and know) the noise floor to better filter note detection. I also would have wanted a more robust method of demoing: as of now the system is not designed to pick out music from a noisy environment like the one the demo is set in, which will be a significant issue for getting accurate live results.
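The kind of active normalization I have in mind would look roughly like the following; the target RMS level here is an arbitrary assumption, not a tested value:

```python
import numpy as np

def normalize_rms(samples: np.ndarray, target_rms: float = 0.1) -> np.ndarray:
    """Scale the signal so its RMS level matches target_rms.

    With a known input level, the noise floor used for note detection
    can be chosen once instead of being re-tuned per recording.
    """
    rms = np.sqrt(np.mean(samples ** 2))
    if rms == 0:
        return samples  # silent input, nothing to scale
    return samples * (target_rms / rms)
```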

Mathias’ Status Report 12/7/2024

This week I focused on making the backend more robust. Previously the backend only worked when tested with correct inputs, so if I sent in an invalid input there would be undefined behavior or system crashes. To fix this, I first changed the scanning portion of the code so that if no MusicXML was produced we return an error to the caller instead of failing. I also changed the highlighting code to check that the beats lie within a specified range before attempting to highlight.
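In simplified form, the two checks look roughly like this; the function names below are illustrative stand-ins, not the actual backend code:

```python
from pathlib import Path

def check_scan_output(musicxml_path: str) -> dict:
    """Report a scanning failure to the caller instead of crashing on a missing file."""
    path = Path(musicxml_path)
    if not path.exists():
        return {"ok": False, "error": "scanning produced no MusicXML"}
    return {"ok": True, "musicxml": str(path)}

def can_highlight(beat: float, first_beat: float, last_beat: float) -> bool:
    """Only attempt to highlight beats that fall within the piece's valid range."""
    return first_beat <= beat <= last_beat
```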


This week I'll mostly be working on integration: integrating Aakash's solution into my component and getting everything running on the Raspberry Pi.

Aakash’s Status Report for 12/7/2024

This week I focused on our system tests as well as optimizing the system for our final demo. We noticed performance issues when aligning the piano sheet music to the piano audio data that aren't present with the singer data. I am working on improving the piano data parsing, since currently I parse only one stave when two are available, and on modifying the alignment algorithm to use different weights for the piano. This should lead to better performance with the piano data.
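As a rough sketch of the direction for the parsing change, the snippet below collects note events from every part of the MusicXML using music21; this is one way it could be done and is not necessarily how our actual parser is structured:

```python
from music21 import converter

def parse_all_staves(musicxml_path: str):
    """Collect (offset, MIDI pitch) events from every part, not just the first."""
    score = converter.parse(musicxml_path)
    events = []
    for part in score.parts:                # previously only the first stave was used
        for note in part.flatten().notes:
            for pitch in note.pitches:       # handles chords as well as single notes
                events.append((float(note.offset), pitch.midi))
    return sorted(events)
```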

I have also been working on complete system integration. I made changes to my subsystem so that it can be called with any file; before, the data file names were hard-coded into the script and had to be changed manually. This means Mathias' program can simply run my Python script and pass it the absolute file paths.
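The command-line interface now looks roughly like the following; the argument names and the script name in the usage comment are illustrative, not the exact ones in the repository:

```python
import argparse

def parse_args():
    parser = argparse.ArgumentParser(
        description="Align performance audio against scanned sheet music")
    parser.add_argument("musicxml_path", help="absolute path to the scanned MusicXML file")
    parser.add_argument("audio_path", help="absolute path to the recorded performance audio")
    parser.add_argument("--output", default="alignment.json",
                        help="where to write the alignment results")
    return parser.parse_args()

# Mathias' program can then invoke it as, e.g.:
#   python3 align.py /home/pi/data/song.musicxml /home/pi/data/take1.wav
```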

For early next week I want to get the complete solution running on the Raspberry Pi and start doing tests with Mathias and Ben to ensure that the system is working for the final demo and iron out any kinks. I will also be working on the final poster, final paper, and final video.

Team Status Report for 12/7/2024

Ben’s Status Report 11/30

The last two weeks have been spent sporadically making improvements since the interim demo. Tuning features like the silence threshold seems to have dramatically increased accuracy on the basic music pieces and yielded some improvement on Fly Me to the Moon compared to before.

We also had our first live functional demo with the School of Music musicians and identified some issues in the data pipeline that have since been worked on. The following changes have been made to better pre-process the audio data (a condensed sketch of these steps follows the list):

Extending rests – Some rests (denoted as midi = 0 in our program) left a gap between the start of the rest (silence) and the start of the next note. These have been fixed so that the rest takes up the entire duration between the two notes, allowing for easier and more consistent comparison in the timing algorithm.

Concatenating Silence Values – Any repeated portions of “silence” are now joined into one larger period. This removes extraneous note objects that cause desync in the timing algorithm.

Normalizing start time to zero – The output of any audio processed now begins at time 0.0000 with all other values correctly shifted to match. This ensures all data starts at the expected time since any output from the music XML side expects a start at t=0.

Stitching noise and notes – As mentioned in prior reports, there was an issue of audio spikes causing peaks in the data that were good for timing but bad for pitch. These peaks have been joined with the note they precede, allowing for a cleaner output that still maintains the consistency of the timing data.

Fixing the silence threshold at -45 dB – Previously this had been operating at Aubio's default of -90 dB, which detects most onsets. Changing this value reduced the number of onsets and cut out a significant amount of noise. Experimentation here is still ongoing, but ideally I could find a way to normalize the audio level so the noise floor isn't too low (noise gets through) or too high (notes are cut off or removed).
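The sketch below condenses a few of these pre-processing steps; the (start_time, midi) tuple format is a simplified stand-in for our actual note objects, while the aubio onset threshold call matches the -45 dB change described above:

```python
import aubio

# Each detected event is a (start_time, midi) tuple; midi == 0 denotes silence.
def preprocess(events):
    # Concatenate consecutive silence values into one larger period.
    merged = []
    for start, midi in events:
        if merged and midi == 0 and merged[-1][1] == 0:
            continue                      # extend the existing silence instead
        merged.append((start, midi))

    # Normalize start time to zero so timestamps line up with the MusicXML side.
    t0 = merged[0][0]
    return [(start - t0, midi) for start, midi in merged]

# Fix the onset silence threshold at -45 dB instead of aubio's -90 dB default.
onset = aubio.onset("default", 1024, 512, 44100)
onset.set_silence(-45.0)
```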

As usual, updates to the DSP test files and main process can be found here.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

The project is on track for the final presentation.

What deliverables do you hope to complete in the next week?

Fixing audio recording issues and performing more detailed analysis on complex music piece results.

Mathias’ Status Report 11/30

This week I focused on creating a functional UI that links to the backend. The UI has three major components: one that allows for the upload of a sheet music PDF to the backend, one that allows a user to play a song they uploaded, and one that allows a user to view their results from one of their sessions. The upload component is simply a button that lets a user select a file from their device. The play-song and view-session components are separate pages with a drop-down that lets users select a song or result respectively.

I also spent time fixing a bug in the highlighting logic. Previously the logic always assumed that a quarter note was one beat; however, this isn't always the case, so I changed it to be based on the value in the time signature.
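In rough terms, the fix derives the beat unit from the time signature denominator rather than assuming a quarter note; this is a simplified sketch, not the actual highlighting code:

```python
def beats_for_duration(quarter_lengths: float, ts_denominator: int) -> float:
    """Convert a duration in quarter-note lengths to beats for a time signature.

    In 4/4 a quarter note is one beat, but with a denominator of 8 the eighth
    note is the beat unit, so a quarter note counts as two beats.
    """
    quarters_per_beat = 4 / ts_denominator
    return quarter_lengths / quarters_per_beat
```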

In regards to testing, I was able to quantify the accuracy of the sheet music scanning more formally. For an intermediate piece such as Fly Me to the Moon, the sheet music scanning is at 99% accuracy for both timing onset and pitch detection.

In terms of new knowledge, I learned the basics of OpenCV as well as some general music knowledge. My learning method for most of the new content was to learn incrementally, meaning I would only learn about what I needed at the time. If I needed a specific piece of functionality from OpenCV or Audiveris, instead of front-loading all the learning I would just learn about that component.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

The project is on track for the final presentation.

What deliverables do you hope to complete in the next week?

Polishing up the UI and feedback as well as working to better integrate all the project components.

Team Status Report for 11/30/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risk as of now is that the system won't be performant on a Raspberry Pi and that some pieces of music won't be able to be processed. To mitigate these risks, we always have a laptop as a backup, since we know the system works well on it, and we can limit the scope of the pieces we use for the demo. This will allow us to have a fully working demo no matter what and to show off the functionality of our system without dealing with unknowns.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward? 

No changes made at this time.


Aakash’s Status Report for 11/30/2024

For the past two weeks I have mainly been working on refining the timing algorithm and working with Ben and Mathias on integration to turn the three subsystems into one complete solution.

I started setting up the Raspberry Pi and put Ubuntu on the system. The next step is to install the code and make sure I can get a working output, as well as make sure the audio interfaces connect to the Pi correctly.

I also have been working with Ben on the data output from the audio processing subsystem in order to improve my section. During the interim demo, we were able to show the system working to some degree, but we did notice that there were spikes in the audio which corresponded with a new note. We’ve been working together to combine these in order to make my timing algorithm more accurate and there has been some success so far.

I have also spent time working on the final report. I have begun doing tests on my subsystem against the quantitative requirements and worked with Ben and Mathias to determine what our testing environments are going to be. We decided to test on beginner, medium, and advanced pieces of sheet music, which we classify based on factors such as chords and speed. We will also be testing audio both in an isolated environment and in a real-world environment such as a normal room.

After this testing I can take a look at the results and see what optimizations I need to make for the final demo; it will also give us good data for the final presentation.

Overall I am very happy with how the project is progressing and I am on track to be demo ready by finals week.

For the upcoming week I plan on finishing the final presentation and continuing to work with Ben and Mathias on optimizing my algorithm and on system integration.

Some new knowledge I learned during this project includes signal processing algorithms such as dynamic time warping and how to record audio in a recording studio. These were both very foreign topics to me as I am mostly a very hardcore software person. One learning strategy I used was to first read about the theory online and then just jump in and try to do something. I noticed that by throwing myself at a problem I am able to learn much faster than by just reading or watching videos on how to do something. For dynamic time warping, watching videos was really good for learning how it works fundamentally, but by getting my hands dirty with the data, I was able to see how it worked in the real world and some of the challenges of dealing with relatively messy and unideal data.
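For context, the core of dynamic time warping reduces to a short dynamic-programming recurrence; a minimal version over two 1-D sequences might look like the sketch below, whereas our real alignment algorithm operates on richer note features and weights:

```python
import numpy as np

def dtw_cost(a, b):
    """Minimal dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = abs(a[i - 1] - b[j - 1])
            # Each cell extends the cheapest of the three possible alignments.
            cost[i, j] = dist + min(cost[i - 1, j],      # insertion
                                    cost[i, j - 1],      # deletion
                                    cost[i - 1, j - 1])  # match
    return cost[n, m]
```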