Aakash’s Status Report for 12/7/2024

This week I focused on our system tests as well as optimizing the system for our final demo. We noticed that there are performance issues when aligning the piano sheet music to the piano audio data that aren't present with the singer data. I am working on improving the piano data parsing, since I am currently only parsing one stave when two are available, and on modifying the alignment algorithm to use different weights for the piano. These changes should improve performance on the piano data.
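A minimal sketch of the stave fix, assuming the parsing is done with music21 (the reports don't name the parsing library) and that the piano's grand staff comes through as two PartStaff objects; the file name is a placeholder:

```python
# Sketch: collect notes from BOTH piano staves instead of just the first.
# music21 and the two-PartStaff grand staff are assumptions, not confirmed.
from music21 import converter, stream

score = converter.parse("piano_part.musicxml")  # placeholder path

# Previously only one stave was parsed; iterate over every PartStaff here.
piano_staves = [p for p in score.parts if isinstance(p, stream.PartStaff)]

notes = []
for staff in piano_staves:
    for n in staff.flatten().notes:      # .notes includes chords
        onset = float(n.offset)          # quarter lengths from the start
        dur = float(n.duration.quarterLength)
        for p in n.pitches:              # a Note has one pitch, a Chord has many
            notes.append((p.midi, onset, dur))

notes.sort(key=lambda t: t[1])           # merge the staves by onset time
```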

I have also been working on complete system integration. I have made changes to my subsystem so that it can be called with any file; previously, the data file names were hard-coded into the script and had to be changed manually. This means Mathias' program can simply run my Python script and pass it the absolute file paths.
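A sketch of what the new interface could look like; the argument names and entry point are my placeholders, not the script's real ones:

```python
# Placeholder CLI so the script no longer relies on hard-coded file names.
import argparse

def main():
    parser = argparse.ArgumentParser(description="Timing alignment subsystem")
    parser.add_argument("sheet_music", help="absolute path to the MusicXML file")
    parser.add_argument("singer_audio", help="absolute path to the singer audio data")
    parser.add_argument("piano_audio", help="absolute path to the piano audio data")
    args = parser.parse_args()
    run_alignment(args.sheet_music, args.singer_audio, args.piano_audio)  # hypothetical entry point

if __name__ == "__main__":
    main()
```

With this shape, Mathias' program can invoke the subsystem with something like subprocess.run(["python3", "align.py", sheet_path, singer_path, piano_path]).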

For early next week I want to get the complete solution running on the Raspberry Pi and start doing tests with Mathias and Ben to ensure that the system is working for the final demo and to iron out any kinks. I will also be working on the final post, final paper, and final video.

Team Status Report for 11/30/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risks as of now are that the system won't be performant on a Raspberry Pi and that some pieces of music won't be able to be processed. To mitigate these risks, we can always fall back to a laptop, since we know the system works well there, and limit the scope of the pieces we use for the demo. This will allow us to have a fully working demo no matter what and to show off the functionality of our system without dealing with unknowns.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward? 

No changes made at this time.

 

Aakash’s Status Report for 11/30/2024

For the past two weeks I have mainly been working on refining the timing algorithm and working with Ben and Mathias on integration to turn the three subsystems into one complete solution.

I started setting up the Raspberry Pi and installed Ubuntu on it. The next step is to install the code and make sure I can get a working output, as well as to make sure the audio interfaces can connect to the Pi correctly.

I have also been working with Ben on the data output from the audio processing subsystem in order to improve my section. During the interim demo, we were able to show the system working to some degree, but we noticed spikes in the audio data corresponding to each new note. We've been working together to combine these in order to make my timing algorithm more accurate, and there has been some success so far.

I have also spent time working on the final report. I have begun running tests on my subsystem against the quantitative requirements and worked with Ben and Mathias to determine what our testing environments are going to be. We decided to test on beginner, intermediate, and advanced pieces of sheet music, which we classify based on factors such as chords and speed. We will also test audio in both an isolated environment and a real-world environment such as a normal room.

After this testing I can look at the results and see what optimizations I need to make for the final demo, and it will give us good data for the final presentation.

Overall I am very happy with how the project is progressing and I am on track to be demo ready by finals week.

For the upcoming week I plan on finishing the final presentation and continuing to work with Ben and Mathias on optimizing my algorithm and on system integration.

Some new knowledge I learned during this project is signal processing algorithms such as dynamic time warping, as well as how to record audio in a recording studio. These were both very foreign topics to me, as I am mostly a hardcore software person. The learning strategy I used was to first read about the theory online and then just jump in and try to do something. I noticed that by throwing myself at a problem I learn much faster than by just reading or watching videos on how to do something. For dynamic time warping, watching videos was really good for learning how it works fundamentally, but by getting my hands dirty with the data, I was able to see how it works in the real world and what challenges there are in dealing with relatively messy, non-ideal data.

Team Status Report for 11/16/2024

For validation tests, we want to ensure that when the system is given the same inputs, we get the same outputs every time. This can be done by giving the system the same sheet music and audio data and ensuring the music highlighting is the same.
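A minimal sketch of this repeatability check as a pytest test; the import and function name are placeholders for the real subsystem entry point:

```python
# Determinism check: identical inputs must produce identical highlighting.
from timing_subsystem import run_alignment  # hypothetical import

def test_same_inputs_give_same_outputs():
    first = run_alignment("sheet.musicxml", "singer.dat", "piano.dat")
    second = run_alignment("sheet.musicxml", "singer.dat", "piano.dat")
    assert first == second  # the highlighted notes must match exactly
```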

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risk is that we cannot properly process complex pieces of music. This risk is being managed by slowly increasing the complexity of the music we process to ensure that we don't have any surprises as we test more complicated pieces. Our contingency plan is to stick with simpler pieces of music, since those are working so far; if there are complex pieces we have trouble processing, we will treat them as edge cases and focus our energy on polishing what we know works.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

No changes made.

 

Aakash’s Status Report for 11/16/2024

I spent a lot of time this week making sure my system is robust and that we are ready for the demo next week. We are going to demo on a simpler piece of music that doesn't have complex chords, so I am making sure the system is reliable. In its current state, the system parses the sheet music data from the MusicXML and outputs it in the form of (pitch, onset time, duration) tuples.

It then takes this data and compares it to the audio data to find the similarities between the two. This is done using a modified dynamic time warping algorithm that prioritizes onset time, then duration, then pitch when comparing two notes. It outputs a list of (sheet music index, audio index) pairs, which are the indices where the audio matches the sheet music.
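A minimal sketch of this modified DTW; the weighting follows the onset-then-duration-then-pitch priority described above, but the actual weight values are invented here:

```python
# Classic DTW over note sequences with a weighted note-to-note distance.
import numpy as np

def note_distance(a, b, w_onset=1.0, w_dur=0.5, w_pitch=0.25):
    """a and b are (pitch, onset, duration) tuples, pitch as a MIDI number."""
    return (w_onset * abs(a[1] - b[1])
            + w_dur * abs(a[2] - b[2])
            + w_pitch * abs(a[0] - b[0]))

def dtw_align(sheet, audio):
    """Return [(sheet_index, audio_index), ...] along the optimal warping path."""
    n, m = len(sheet), len(audio)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = note_distance(sheet[i - 1], audio[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1],  # one-to-one match
                                 cost[i - 1, j],      # audio note spans several sheet notes
                                 cost[i, j - 1])      # sheet note spans several audio notes
    # Trace back from the end to recover the matched index pairs.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```

For the piano-specific tuning mentioned in the 12/7 report, the same function would just be called with a different set of weights.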

This still needs a little work, because when the audio data isn't edited there is a lot of noise that can disrupt the algorithm. This can be mitigated by preprocessing the data and by allowing the algorithm to be more selective when matching indices. Currently every index has to be matched to another, but the algorithm can be modified so this is not the case; once these changes are implemented it should be much more accurate. The preprocessing will be done by applying a moving average to get rid of the erratic note anomalies, which should be relatively easy to implement. I will have to do more research on dynamic time warping to see if there is a signal processing technique I can use to help with matching the erroneous notes; if there isn't, I will manually iterate through the list and remove audio data notes that match sheet music data more than once, which should increase accuracy.
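A sketch of both cleanup ideas; the window size and the "keep the closest match" rule are my assumptions, not settled design choices:

```python
import numpy as np

def smooth(values, window=5):
    """Moving average over a 1-D array of detected note values."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="same")

def drop_duplicate_matches(path, sheet, audio):
    """When one audio note matches several sheet notes, keep only the closest pair."""
    best = {}
    for s_idx, a_idx in path:
        diff = abs(sheet[s_idx][1] - audio[a_idx][1])  # onset-time difference
        if a_idx not in best or diff < best[a_idx][1]:
            best[a_idx] = ((s_idx, a_idx), diff)
    return sorted(pair for pair, _ in best.values())
```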

I have worked on increasing the accuracy of the matching by changing parameters and fine-tuning the distance function within my custom dynamic time warping implementation. I am still learning the details of how this works mathematically in order to create an optimized solution, but I am getting there. Getting better data from Ben will certainly help with this as well, but I want to make sure the system is robust even with noise.

It does this for both the singer portion of the sheet music and the piano portion. It then finds where there are simultaneous notes in both portions in order to check whether the performers are in sync or out of sync at those points. It outputs (singer index, piano index) pairs indicating where in the sheet music data to look for shared notes.

It then compares the audio data at the shared note points and checks whether the timings are within a certain threshold, currently 0.01 seconds. It then sends the sheet music time to the frontend, which highlights that note.
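A minimal sketch of this sync check under assumed data structures (sheet notes as (pitch, onset, duration) tuples and per-note audio times indexed by sheet position, neither of which is confirmed by the report):

```python
SYNC_THRESHOLD = 0.01  # seconds, per the report

def shared_note_points(singer_sheet, piano_sheet):
    """Return (singer_index, piano_index) pairs whose sheet onsets are equal."""
    # Onsets come from the sheet music itself, so exact comparison is fine here.
    # If several piano notes share an onset, the last one wins in this sketch.
    piano_by_onset = {note[1]: i for i, note in enumerate(piano_sheet)}
    return [(i, piano_by_onset[note[1]])
            for i, note in enumerate(singer_sheet)
            if note[1] in piano_by_onset]

def out_of_sync_points(points, singer_audio_times, piano_audio_times):
    """Flag shared points whose audio timings differ by more than the threshold."""
    return [(s, p) for s, p in points
            if abs(singer_audio_times[s] - piano_audio_times[p]) > SYNC_THRESHOLD]
```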

 

Verification:

One verification test I have run so far: when processing the sheet music data, I manually compare that data to the sheet music PDF to ensure it looks correct. This isn't a perfect way to go about it, but it works well enough for now, and I don't know of an automated way to do this because I am creating the data from the MusicXML. One way to automate this verification would be to create a new MusicXML file from the data I parsed and compare the two to ensure they match.
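A sketch of what that round trip could look like, assuming music21 (again, the reports don't name the parsing library) and MIDI pitch numbers in the parsed tuples:

```python
# Rebuild a MusicXML file from the parsed (pitch, onset, duration) tuples,
# then compare it to the original score by eye or with a diff tool.
from music21 import stream, note

def tuples_to_musicxml(parsed_notes, out_path="roundtrip.musicxml"):
    s = stream.Stream()
    for midi_pitch, onset, duration in parsed_notes:
        n = note.Note(midi_pitch)
        n.duration.quarterLength = duration
        s.insert(onset, n)  # place each note back at its original offset
    s.write("musicxml", fp=out_path)
```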

Some other tests I plan on doing is

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

I am on schedule. There is still a lot of refining to do before the final demo, but the core functionality is there and it's looking very promising.

What deliverables do you hope to complete in the next week?

I want to keep working on improving the system so we can process more complicated pieces. For our demo we have it working with simple pieces, but I want to improve the chord handling and the error handling.

Aakash’s Status Report for 11/09/2024

This week I spent a lot of time making sure the MusicXML parsing is accurate. There were some bugs when parsing the piano portion of the MusicXML, as notes that played at the same time were being interpreted as sequential notes. I was able to fix this and ensure that the MusicXML is being parsed accurately by manually cross-checking the note list with the sheet music PDF.
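If the parsing works on the raw MusicXML (an assumption on my part), this bug maps to the <chord/> marker: every note of a chord after the first carries a <chord/> child element and must not advance the running onset time. A sketch, ignoring <backup> and <forward> elements for brevity:

```python
import xml.etree.ElementTree as ET

def parse_part(part):
    """Parse one <part> element into (pitch, onset, duration) tuples."""
    onset = 0        # running position in MusicXML "divisions"
    prev_start = 0
    notes = []
    for el in part.iter("note"):
        dur = int(el.findtext("duration", default="0"))
        if el.find("chord") is None:
            start = onset       # sequential note or rest: new position
            onset += dur        # advance the running time
        else:
            start = prev_start  # chord note: same onset as the note before it
        prev_start = start
        step = el.findtext("pitch/step")
        if step is not None:    # rests have no <pitch>
            octave = el.findtext("pitch/octave")
            notes.append((f"{step}{octave}", start, dur))
    return notes

tree = ET.parse("score.musicxml")    # placeholder path
piano = tree.getroot().find("part")  # first part, for illustration
print(parse_part(piano))
```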

I also produced a list of where the timing errors occur. This can be sent to Mathias in order to highlight them on the sheet music. I also worked on cleaning up the code.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

I am slightly behind schedule, as I wasn't able to test on the data produced from the audio we recorded, but this isn't a major hurdle as we still have ample time to make sure everything is working.

What deliverables do you hope to complete in the next week?

I hope to start doing system integration with Ben and Mathias so that we have a working demo and ample time to prepare for the interim demo.

Team Status Report for 11/9/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

We still have the risk of system integration: we are all having success with our individual parts of the project, but we haven't spent much time integrating them and seeing if there are any weird bugs or edge cases. This can be mitigated by starting integration soon so we have ample time before the interim and final demos to make sure our system is stable and reliable.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

No changes made.

 

Aakash’s Status Report for 11/02/2024

I have spent this week expanding on the timing algorithm. I have implemented a dynamic time warping system to correlate the sheet music to the audio data. Dynamic time warping measures the similarity between two time series and finds correspondences regardless of speed, which ensures that no matter the speed of the two sets of data it can still find a match. Attached is an example where the algorithm matched notes even though the second sequence is delayed by a random magnitude for each note; the left is the sheet music and the right is the delayed music.
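A toy reproduction of that experiment: take a sequence of onset times, delay each by a random amount, and check that DTW still pairs them up. The report doesn't say which DTW implementation was used at this stage; the off-the-shelf fastdtw package is used here just for illustration:

```python
import random
from fastdtw import fastdtw

sheet_onsets = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
delayed = [t + random.uniform(0.0, 0.3) for t in sheet_onsets]  # random per-note delay

distance, path = fastdtw(sheet_onsets, delayed, dist=lambda a, b: abs(a - b))
print(path)  # mostly (i, i) pairs despite the delays
```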

So far this has just been with the singer's music, as I am still figuring out how to handle piano chords, but I have applied the same process while handling one note at a time.

Then, to find whether they are in sync, I compared the two pieces of sheet music and found where they had shared note onset times. These shared points are then used to compare the two pieces of audio data and find the delay.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

For next week I hope to get some processed audio data from Ben to make sure the system still works and to see if there are any tweaks I need to make.

Team Status Report for 10/26/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risk that could jeopardize the success of the project is the accuracy of each subcomponent. Neither the sheet music scanning component nor the audio component will achieve 100% accuracy, meaning the timing algorithm will have to have some level of tolerance for error. For example, if we classify a note duration correctly for the pianist but incorrectly for the singer, we may incorrectly identify them as out of sync. Currently we are managing this risk by keeping the duration classification accuracy as high as possible; however, this does not completely mitigate the risk. If this ends up being a significant issue, we can allow the user to manually edit the duration of certain notes, so that if they notice they are consistently flagged as out of sync after a certain point, they can override our MusicXML with the correct duration.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

Changes made: None