Aakash’s Status Report for 11/02/2024

I have spent this week expanding on the timing algorithm. I have implemented a dynamic time warping (DTW) system to correlate the sheet music with the audio data. DTW measures the similarity between two time series and finds an optimal alignment between them regardless of differences in speed, so the two streams can still be matched even when they are played at different tempos. Attached is an example where the algorithm matched notes even though each note in the second sequence is delayed by a random amount. The left is the sheet music and the right is the delayed music.
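
For reference, here is a minimal sketch of the kind of alignment DTW produces, written as a plain dynamic-programming implementation over pitch sequences. The sequences and the distance measure below are made-up placeholders for illustration, not the project's actual data or code.

# Minimal DTW sketch: aligns two pitch sequences that may differ in timing.
# The note values here are made-up placeholders, not real project data.
import numpy as np

def dtw_align(seq_a, seq_b):
    """Return the DTW cost matrix and the optimal alignment path."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = dist + min(cost[i - 1, j],      # insertion
                                    cost[i, j - 1],      # deletion
                                    cost[i - 1, j - 1])  # match
    # Backtrack from the bottom-right corner to recover the alignment path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[1:, 1:], path[::-1]

# Sheet-music pitches vs. the same melody with per-note delays:
# DTW still pairs up the corresponding notes.
sheet = [60, 62, 64, 65, 67]
delayed = [60, 60, 62, 64, 64, 65, 67]
_, alignment = dtw_align(sheet, delayed)
print(alignment)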

So far this has only been done with the singer's music, as I am still figuring out how to handle piano chords; for now I apply the same process while handling one note at a time.

Then, to determine whether the two parts are in sync, I compared the sheet music for the two parts and found where they have shared note onset times. These shared onsets are then used to compare the corresponding pieces of audio data and find the delay between them.
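
Roughly, that comparison could look like the sketch below, assuming both parts are already in (pitch, onset, duration) event-list form. The helper name, the tolerance, and matching each shared score onset to the nearest detected onset are simplifying assumptions for illustration, not the final method.

def nearest_onset(onsets, target):
    """Return the detected onset closest to a given score onset."""
    return min(onsets, key=lambda t: abs(t - target))

def shared_onset_delays(sheet_a, sheet_b, audio_a, audio_b, tol=0.05):
    """All arguments are lists of (pitch, onset, duration) tuples.
    Assumes detected onsets have already been mapped onto score time
    (e.g. via the DTW alignment above). For every onset the sheet music
    says the two parts share, return the gap between where each
    performer actually played it."""
    onsets_b = [on for _, on, _ in sheet_b]
    shared = [on for _, on, _ in sheet_a
              if any(abs(on - ob) <= tol for ob in onsets_b)]

    played_a = [on for _, on, _ in audio_a]
    played_b = [on for _, on, _ in audio_b]
    return [nearest_onset(played_a, on) - nearest_onset(played_b, on)
            for on in shared]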

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

For next week I hope to get some processed audio data from Ben to make sure the system still works on real input and to find any tweaks I need to make as a result.

Ben’s Status Report 10/26

This week, I was able to connect a live pipe between the Scarlett audio interface and Aubio to generate a real-time stream of pitch values. From initial testing I was able to work out a mapping to understand the pitch value output: middle C (C4) maps to a value of 60, with every semitone increasing or decreasing the value by one point (i.e. MIDI note numbering). I did notice some potential issues from the live audio testing; a rough sketch of the pipeline, with these cases handled, follows the list.

  1. Any elongated or plosive consonants (“ssss”, “ttt”, “pppp”) caused a spike into the 100+ range, which is outside the bounds we are considering for expected pitch, as it would put the sound in or above the 7th octave (near the very top of a piano's range). Accounting for that is important, but since it is so high, it might be easiest to simply ignore values outside the expected range.
  2. Chords behave abnormally. When two or more notes are played simultaneously, the output value is either lower than any of the pitches played or simply reads “0.0”, which is the default no-input/error value. I believe there is a specific way to handle chords, but this requires further digging into the documentation.
  3. Speaking seems to consistently generate an output of “0.0”, which is good; however, quick transitions from speaking to singing yielded mixed results. Sometimes the pitch detection would work immediately and other times it took a second to kick in.
  4. Lastly, the pitch value has a fractional part that corresponds to how many cents the detected pitch is off from the nearest exact semitone. Accounting for notes that are within +/- 0.5 of an exact pitch value will be important. Vibrato varies from person to person, but for me at least it seemed to stay within that tolerance, which is a good thing.
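
As mentioned above, here is a rough sketch of the live pipeline with those cases handled. The buffer sizes, pitch bounds, and rounding rule are assumptions for illustration rather than the actual values in my script.

import numpy as np
import pyaudio
import aubio

SAMPLE_RATE, HOP = 44100, 512
pitch_o = aubio.pitch("yin", 2048, HOP, SAMPLE_RATE)
pitch_o.set_unit("midi")   # C4 -> 60.0, one point per semitone

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paFloat32, channels=1, rate=SAMPLE_RATE,
                 input=True, frames_per_buffer=HOP)

LOW, HIGH = 21, 96   # assumed usable range; spikes above this get discarded

while True:
    samples = np.frombuffer(stream.read(HOP), dtype=np.float32)
    value = float(pitch_o(samples)[0])
    if value == 0.0:                 # no input / speech / detection error
        continue
    if not (LOW <= value <= HIGH):   # consonant spikes, chord artifacts
        continue
    midi_note = round(value)         # fold the cents offset onto the nearest semitone
    print(midi_note, value - midi_note)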

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

I am just about on schedule. This is an ongoing learning process, but being able to sing or play live and have a pitch value output is promising.

What deliverables do you hope to complete in the next week?

I hope to refine the output into something more robust that takes the average pitch over a short duration and produces a pitch event list that is easier for the timing algorithm to parse.
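
As a sketch of what that averaging could look like (the minimum length, the half-semitone grouping rule, and the frame format are assumptions, not the final design):

def frames_to_events(frames, min_len=0.05):
    """frames: list of (time_seconds, midi_pitch); a pitch of 0.0 means silence
    or a detection error. Consecutive voiced frames that stay within half a
    semitone of where the note started are grouped into one event, whose pitch
    is the rounded average over the group."""
    events = []
    group = []

    def flush():
        if group and group[-1][0] - group[0][0] >= min_len:
            avg_pitch = sum(p for _, p in group) / len(group)
            onset = group[0][0]
            events.append((round(avg_pitch), onset, group[-1][0] - onset))

    for t, p in frames:
        if p <= 0:                                  # silence / error frame
            flush()
            group = []
        elif group and abs(p - group[0][1]) > 0.5:  # pitch moved to a new note
            flush()
            group = [(t, p)]
        else:
            group.append((t, p))
    flush()
    return events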

Team Status Report for 10/26/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risk that could jeopardize the success of the project is the accuracy of each sub-component. Neither the sheet music scanning component nor the audio component will have 100% accuracy, meaning the timing algorithm will have to tolerate some error. For example, if we classify a note duration correctly for the pianist but incorrectly for the singer, we could wrongly report that they are out of sync. Currently we are managing this risk by making the duration classification as accurate as possible, but this does not completely mitigate it. If it ends up being a significant issue, we can allow the user to manually edit the duration of certain notes, so that if they notice they consistently fall out of sync after a certain point they can override our MusicXML with the correct duration.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

Changes made: None

Aakash’s Status Report for 10/26/2024

I have spent time this week working on the ethics assignment and building out a basic prototype of the timing algorithm. I have implemented the data structure I want to use, which is a list of (pitch, onset time, duration) tuples in Python. I created a basic event list with uniform notes of the same pitch and duration but different onset times, duplicated it, and wrote a basic comparison between the two lists. I also started working on the MQTT communication and set up an ingestion pipeline that takes the sheet music data in the form of MusicXML and transforms it into an event list.
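
For illustration, here is a sketch of what the MusicXML-to-event-list step could look like using the music21 library. The choice of music21, the file name, and the decision to skip chords are assumptions made for this sketch, not necessarily the pipeline's actual implementation.

# Sketch: turn a MusicXML file into the (pitch, onset_time, duration) event
# list used by the timing algorithm. Offsets and durations are in quarter
# notes, so a tempo is still needed to convert them to seconds.
from music21 import converter

def musicxml_to_events(path):
    score = converter.parse(path)
    events = []
    for n in score.flatten().notes:
        if n.isNote:  # chords skipped for now, mirroring the one-note-at-a-time assumption
            events.append((n.pitch.midi, float(n.offset), float(n.duration.quarterLength)))
    return sorted(events, key=lambda e: e[1])

# Example: events = musicxml_to_events("song.musicxml")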

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

For next week I want to continue building up the algorithm by beginning to develop and test it using data from the MusicXML files.

Mathias’ Status Report for 10/26

Most of my time this week was spent making a prototype of the frontend web application. I familiarized myself with Expo, a React Native framework, and made an extremely crude prototype (https://github.com/mathiast270/InSyncAPp). The prototype has three major pages: the main page has buttons to navigate to the other two. The play_song page contains a dropdown to select songs (currently just hard-coded values) as well as a text field to let a user select the BPM. The other page allows a user to take a picture through the Expo camera library. Currently, none of this functionality is connected to the backend.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

On Schedule

What deliverables do you hope to complete in the next week?

Parse the XML output from Audiveris to get note locations and use that to implement the post-performance feedback component.

Ben’s Status Report 10/20

During the week my primary focus was creating a connection between the Scarlett USB device (one which I own) and linking it into the skeleton of the pitch detection script from earlier, in preparation for the similar device we are purchasing. I was able to successfully create a program using the PyAudio and pyudev libraries that detects when a Scarlett audio interface is connected and finds a path that can be used in the audio processing script with Aubio. The current difficulty is integrating these two programs, as the device detection is buggy at times and requires consistent polling to function. I also need to add edge cases for if the Scarlett device becomes disconnected during operation.
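
For reference, a sketch of what the polling-based detection could look like using only PyAudio's device enumeration; the name matching and polling interval are assumptions, and the real program also relies on pyudev events.

import time
import pyaudio

def find_scarlett_index(poll_seconds=1.0):
    """Poll PyAudio's device list until an input device whose name contains
    'Scarlett' appears, then return its index for opening the input stream."""
    while True:
        pa = pyaudio.PyAudio()
        try:
            for i in range(pa.get_device_count()):
                info = pa.get_device_info_by_index(i)
                if "Scarlett" in info.get("name", "") and info.get("maxInputChannels", 0) > 0:
                    return i
        finally:
            pa.terminate()   # re-instantiate each poll so newly connected devices show up
        time.sleep(poll_seconds)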

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

In terms of schedule, everything seems to be on track. The majority of the tasks ahead are things I have to learn, so my sense of how long they will take is purely an estimate. With that said, my hope is that an MVP could be reached in about two weeks: a working note detection program that outputs an event list to be used by the timing algorithm. I imagine most of the time over the next few weeks will be spent debugging and optimizing the systems.

What deliverables do you hope to complete in the next week?

In terms of deliverables for the end of the week I’d like to have these successfully integrated and to a point where I can begin testing the pitch and note onset detection libraries. This would keep me on the expected schedule.

Mathias Status Report for 10/20

Most of this week was spent working on the design report. I worked mostly on the web application / sheet music scanning subsystems of the report, as well as parts of the introduction and design considerations. The major thing that has changed since the last report is the OMR I will be using: instead of Mozart I will be using Audiveris. The report details many of the design considerations between the two, but in summary, Audiveris has a more robust output that gives note durations as well as the position of each note within the sheet music image. This also means the backend will change from a Flask backend to a Spring Boot backend, since the library is in Java.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

A bit behind, since I will have to redo certain parts of the backend to have it be in Spring Boot instead of Flask.

What deliverables do you hope to complete in the next week?

A backend with the full ability to scan in sheet music, as well as starting work on the ability to color in certain areas of an image for the post-performance feedback.

Team Status Report for 10/19/2024

Aakash’s Status Report for 10/19/2024

I spent most of my time working on the design document with my teammates. I worked heavily on the sections regarding my sub-system, the timing algorithm, while also making changes to our block diagram and working on all the other miscellaneous sections to ensure the document was completed on time.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

I am slightly behind schedule because midterms and the time spent on the design doc left me less time for prototyping than I expected last week. Overall it isn't a big deal, because my goals for last week were ambitious and I am still on track.

What deliverables do you hope to complete in the next week?

I hope to have a basic prototype of the timing algorithm done by the end of the week, with a working demo using MIDI files.

Ben’s Status Report for 10/5

This week, my focus was attempting to set up a connection between my existing audio interface (Scarlett Solo) and a Python script that could process the input. I found three Python libraries to support this.

Firstly, PyAudio, which is one possible route for input/output in our system. It can read existing sound files (like .wav), which is useful since we are starting by piping in recordings instead of live audio. It also has basic streaming capabilities for a possible shift to live audio input.

A second option I researched for I/O is SoundDevice. Similar to PyAudio, it can parse an existing music file and/or stream real-time audio. It works in conjunction with NumPy and has existing starter templates for various projects.

Aubio is a library with pitch detection, onset detection, duration, and note separation routines, along with audio filtering, FFT, and even beat detection. All of these features are essential components of our sound processing workflow.

Using templates from these libraries, I made a basic, untested Python script as a starting point.
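
A sketch of what such a starting script might look like, reading a .wav file with Aubio and emitting rough (pitch, onset time, duration) tuples; the parameters and the duration guess (gap to the next onset) are placeholder choices, not the actual script.

import aubio

HOP, BUF = 512, 2048

def wav_to_events(path):
    """Very rough pass: start a new event at each detected onset and guess its
    duration as the gap to the next onset. Accuracy is not the goal yet."""
    src = aubio.source(path, 0, HOP)           # 0 = use the file's own sample rate
    samplerate = src.samplerate
    onset_o = aubio.onset("default", BUF, HOP, samplerate)
    pitch_o = aubio.pitch("yin", BUF, HOP, samplerate)
    pitch_o.set_unit("midi")

    onsets, pitches = [], []
    while True:
        samples, read = src()
        if onset_o(samples):
            onsets.append(onset_o.get_last_s())
            pitches.append(float(pitch_o(samples)[0]))
        else:
            pitch_o(samples)                   # keep the pitch tracker fed between onsets
        if read < HOP:
            break

    events = []
    for i, (t, p) in enumerate(zip(onsets, pitches)):
        dur = (onsets[i + 1] - t) if i + 1 < len(onsets) else 0.0
        events.append((round(p), t, dur))
    return events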

In terms of scheduling, I am slightly behind, as I expected to be working on the filtering component starting this week. I have realized that the event list generation can and should happen first, since it goes hand in hand with creating the audio path. As such, the schedule has been changed to start event list generation this upcoming week, with filtering pushed back to week ten, since ensuring a working pipe for the MVP takes higher priority.

By the end of next week I hope to have a Python script that can take in an audio file and break it down into a tuple structure containing (pitch, onset time, and a guess at duration). For now I do not expect the generated list to be accurate; the focus is on producing one in the first place. I'd also like to have a partial method of reading input from a Scarlett audio interface, that is, a script that successfully transfers some data from the microphone into a buffer I can parse. I imagine the live connection will take more time and may need to carry over into the following week as well; the schedule has been updated to reflect this (see the group status update, Conversion Algorithm subsection).