Team Status Report for 4/12/25

Last week, we had a successful interim demo in which our transcription pipeline worked on a recording of Twinkle Twinkle Little Star. We also met with a flutist from the School of Music to get her feedback on our pipeline and obtain some sample recordings. She found the interface intuitive and easy to use, though we ran into some bugs with audio file formats and found that our note segmentation struggled a little with slurred notes.

This week, we focused on the following items:

  1. Fixing the denoising step
  2. Setting up websockets for real-time BPM adjustment
  3. Finishing the remaining frontend-backend integration for the web app (e.g., earlier we had some bugs with audio file formats and with recording audio directly via the web app)
  4. Switching from RMS to a Short-Time Energy (STE) approach for note segmentation, which helped to better account for rests/slurs in the music (a sketch of the idea follows this list)
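
Below is a minimal sketch of the STE idea, assuming a mono recording loaded with librosa; the frame size, hop size, and on/off thresholds are illustrative placeholders rather than our tuned values.

import numpy as np
import librosa

def segment_notes_ste(path, frame_len=2048, hop=512, on_thresh=0.10, off_thresh=0.05):
    y, sr = librosa.load(path, sr=None, mono=True)
    # Short-time energy: sum of squared samples in each frame
    frames = librosa.util.frame(y, frame_length=frame_len, hop_length=hop)
    ste = np.sum(frames ** 2, axis=0)
    ste = ste / (ste.max() + 1e-12)  # normalize to [0, 1]

    segments, start = [], None
    for i, e in enumerate(ste):
        if start is None and e > on_thresh:          # energy rises: note onset
            start = i
        elif start is not None and e < off_thresh:   # energy falls: note ends / rest begins
            segments.append((start * hop / sr, i * hop / sr))
            start = None
    if start is not None:
        segments.append((start * hop / sr, len(y) / sr))
    return segments  # list of (onset_sec, offset_sec) pairs

Using two thresholds (hysteresis) helps keep slurred notes from being split every time the energy dips slightly, which is where the RMS-only approach struggled.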

Later this weekend, we are meeting again with another flutist from the SoM to obtain more audio and to see if our note segmentation performs better this time. Our new audio interface and XLR cable also arrived this week, so we will hopefully be able to collect better audio samples as well. In the upcoming week, we will focus on:

  1. Polishing our STE/note segmentation
  2. Fixing the issues with making our sheet music editable via the Flat API
  3. Collecting user metrics such as their transcription history
  4. Deploying our web app
  5. Preparing our final presentation/demo
  6. Thorough testing of our pipeline

Below is our plan for verification, which we already started last week.

Shivi’s Status Report for 4/12/25

This week, I first worked on fixing the denoising step so that the note octaves would be accurate. Earlier, notes would sometimes come out an octave higher because the bandpass filter was cutting out some of the lower frequencies, so I adjusted the frequency range to prevent this. I also set up the websocket for real-time adjustment of the metronome, so the user can now adjust the tempo of the composition. Deeya and I integrated all of the web app code and have been trying to figure out how to make the generated composition editable via the Flat API; unfortunately, we have been running into a lot of issues with it, but we will continue debugging it this week. I am also adding inputs so that the user can specify a key signature and time signature. Overall, my progress is on track. Pitch detection and MIDI encoding are largely done, and in the upcoming week, I will focus on resolving the issues with editing the sheet music directly through our web app using the Flat API and on adding the key/time signatures.
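
Here is a minimal sketch of a websocket handler for this kind of real-time BPM update, using the standalone websockets package rather than our actual web framework; the handler name, port, and {"bpm": ...} message format are illustrative assumptions.

import asyncio
import json
import websockets

current_bpm = 100  # shared metronome tempo; default is a placeholder

async def bpm_handler(websocket):
    global current_bpm
    async for message in websocket:
        data = json.loads(message)
        if "bpm" in data:
            current_bpm = int(data["bpm"])  # update the metronome tempo live
            await websocket.send(json.dumps({"bpm": current_bpm}))  # confirm back to the client

async def main():
    # the front end would open ws://localhost:8765 and send tempo changes
    async with websockets.serve(bpm_handler, "localhost", 8765):
        await asyncio.Future()  # run until cancelled

if __name__ == "__main__":
    asyncio.run(main())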

Shivi’s Status Report for 3/29/25

This week, I worked on preparing for the interim demo. I refined my pitch detection to account for rests and to ensure that the generated notes were accurate (earlier, some notes were incorrectly being marked as flat/sharp instead of natural). Then, I worked with Deeya to set up the Flat.io API, as we were running into several errors with authorization and with formatting the requests and responses; we were eventually able to figure out how to send our generated MIDI files to the API for processing into sheet music. Finally, Grace and I worked on ensuring compatibility between our code, and I finished modularizing all of our existing code and integrating it into a single pipeline that is triggered from the web app and runs in the backend. Pitch detection is mostly done, and for next steps, I will be working on:

  1. Tempo detection
  2. Setting up websockets for our webapp for real-time adjustment of the metronome + assisting Deeya with making the displayed sheet music editable
  3. Working with Grace to refine audio segmentation (ex: rests and incorporating Short-Time Energy for more accurate note duration detection)

I am also finding that when I incorporate the denoising step into the pipeline, the detected pitches are thrown off a bit, so I’ll have to look more into ensuring that the denoising step does not impact the pitch detection.
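
For reference, here is a minimal sketch of the kind of bandpass filtering involved in the denoising step, assuming a Butterworth filter from scipy; the cutoff frequencies below roughly bracket the flute's range and are illustrative, not the exact values we settled on.

from scipy.signal import butter, sosfiltfilt

def bandpass_denoise(y, sr, low_hz=230.0, high_hz=2500.0, order=4):
    # Keep the band containing the flute's fundamentals; setting the low cutoff
    # too high is what pushed some detected pitches up an octave.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=sr, output="sos")
    return sosfiltfilt(sos, y)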

Deeya’s Status Report for 03/29/25

This week I was able to display sheet music on our web app based on the user's recording or uploaded audio file. Once the user clicks Generate Music, the web app calls the main.py file that integrates Shivi and Grace's latest rhythm and pitch detection algorithms, generates and stores a MIDI file, converts it into MusicXML, and makes a POST request to Flat.io to display the generated sheet music. The integration process is pretty seamless now, so whenever more changes are made to the algorithms it is easy to integrate the newest code with the web app and keep it functioning properly.
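
Here is a minimal sketch of the MIDI-to-MusicXML step, assuming music21 handles the conversion; the file paths are placeholders.

from music21 import converter

def midi_to_musicxml(midi_path="output.mid", xml_path="output.musicxml"):
    score = converter.parse(midi_path)    # read the generated MIDI file
    score.write("musicxml", fp=xml_path)  # write MusicXML for the Flat.io request
    with open(xml_path, "r", encoding="utf-8") as f:
        return f.read()                   # string used as the score data below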

On the API side, once I convert the MIDI to MusicXML, I use the API method to create a new music score in the current user account. I send a JSON payload with the title of the composition, the privacy setting (public/private), and the data, which is the MusicXML string:

new_score = flat_api.ScoreCreation(
   title='New Song',
   privacy='public',
   data=musicxml_string
)

This then creates a score and an associated score_id, which enables the Flat.io embed to place the generated sheet music into our web app:

var embed = new Flat.Embed(container, {
   score: scoreId,
   embedParams: {
      mode: 'edit',
      appId: #,
      branding: false,
      controlsPosition: 'top'
    }
});

Flat.io has a feature that allows the user to make changes to the generated sheet music, including the key and time signatures, notes, and any articulations and notations. This is what I will be working on next, which should leave a good amount of time for fine-tuning and testing our project with the SOM students.

Team Status Report for 3/29/25

This week we are working on integrating the basic functions of the code together. First, Shivi and Grace will be combining the rhythm and pitch code into a main.py, which will then serve as the backend for Deeya's web app code. Then we'll integrate it with an API to generate the sheet music UI.

While integrating, we realized that we need to modify the audio segmentation a bit to better account for periods of rest. We will look into using concave changes in the audio to compute the start of a new note. Additionally, we will need to experiment with the MIDI encoding: currently, the rhythm detection outputs the type of note (quarter, half, etc.), but the MIDI encoding takes in the start and end times of samples. We will test whether we can pass in the actual start and end times and have the MIDI API round them to the correct note values, or whether that is something we will need to do manually.
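
A minimal sketch of bridging the two representations, assuming the rhythm detection hands over note-type labels along with the BPM; the beat values and default tempo are illustrative.

NOTE_BEATS = {"sixteenth": 0.25, "eighth": 0.5, "quarter": 1.0, "half": 2.0, "whole": 4.0}

def note_types_to_times(note_types, bpm=100):
    # Turn a list like ["quarter", "eighth", ...] into (start_sec, end_sec) pairs
    beat_sec = 60.0 / bpm
    times, t = [], 0.0
    for nt in note_types:
        dur = NOTE_BEATS[nt] * beat_sec
        times.append((t, t + dur))
        t += dur
    return times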

For this week, we will be working on having a basic demo working for the interim demo, and then on adding the ability to change the sheet music within the API (manually adding and deleting notes, etc.).

Grace’s Status Report for 3/29/25

This week I worked on finishing the rhythm detection. I approached this by using the audio segmentation code that I created last week: I loop through the segments, and in each segment I use the BPM (which will come from the web app once it is integrated) to calculate the length of the notes, then classify each one as a sixteenth, eighth, quarter, etc. note with an if statement. This seems to be working with the Twinkle Twinkle Little Star audio. It may be a little buggy in how I am calculating rests, since I am just taking the remaining portion of the segment, so I will need to figure out a better algorithm for this and test it further. I will also look into using regions of interest/energy to do audio segmentation for audio with less steep increases in amplitude (slurred notes) for more precise segmentation. I am currently on schedule and am working on the interim demo presentation and on integrating Shivi's, Deeya's, and my parts.
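
A minimal sketch of the duration classification idea, assuming segments come in as (onset, offset) pairs in seconds; snapping to the nearest standard value stands in for the if-statement chain.

def classify_durations(segments, bpm):
    beat_sec = 60.0 / bpm  # length of one quarter-note beat in seconds
    note_types = {0.25: "sixteenth", 0.5: "eighth", 1.0: "quarter",
                  2.0: "half", 4.0: "whole"}
    labels = []
    for onset, offset in segments:
        beats = (offset - onset) / beat_sec                       # duration in beats
        closest = min(note_types, key=lambda b: abs(b - beats))   # snap to nearest value
        labels.append(note_types[closest])
    return labels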

Team Status Report for 03/22/2025

This week we came up with important tasks for each of us to complete to make sure we have a working prototype for the interim demo. For the demo, we are hoping that the user will be able to upload a recording, the key, and the BPM. The web app should then trigger the transcription pipeline, and the user can view the score. We are aiming to have this all finished for a rhythmically very simple music piece.

These are the assigned tasks we each worked on this week:

Grace: Amplify Audio for Note segmentation, Review note segmentation after amplification, Rhythm Detection

Deeya: Get the Metronome working, Trigger backend from the webapp

Shivi: Learn how to write to MIDI file, Look into using MuseScore API + allowing user to edit generated music score and key signature

After the demo we are hoping to have the user be able to edit the score, refine the code to better transcribe more rhythmically complex audio, do lots of testing on a variety of audio, and potentially add tempo detection. 

Also, all of our parts (microphone, XLR cable, and audio interface) have arrived, and this week we will try to get some recordings with our mic from the flutists from the SOM.

This week we completed the ethics assignment, which made us think a little bit more about plagiarism and user responsibilities when using our web app. We came to the conclusion that we might need to include a disclaimer for the user that they need to be careful and pay attention to where they are recording their piece so that someone else can't do the same. Also, after reflecting on our conversation with Professor Chang, we have decided not to pursue our stretch goal anymore. We will instead focus on making the web app interface as easy as possible to use and on letting the user customize and edit the sheet music our web app generates. Overall we are on track with our schedule.

Deeya's Status Report for 3/22/2025

This week I focused on getting the metronome to play at the various speeds the user puts in. We don't need to worry about the pitch of the metronome or about making sure it gets recorded by the microphone, because the detection algorithms use the user-inputted BPM instead. I also focused on integrating the backend with the web app. I was able to get the Python file Shivi and Grace have been working on to start running when the user clicks the Generate Music button on the web app. I first started with a hardcoded recording as the input to the file, and then was able to use the uploaded recording as the input to the detection pipeline. Overall, my progress with the web app is on schedule.
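
As a rough sketch of how the Generate Music button can hand the upload to the pipeline, assuming a Flask-style route and that main.py accepts the audio path and BPM as command-line arguments (the endpoint name and storage path are placeholders, not our real setup):

import subprocess
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/generate", methods=["POST"])
def generate_music():
    audio = request.files["recording"]   # uploaded or recorded audio file
    audio_path = "/tmp/upload.wav"       # placeholder storage location
    audio.save(audio_path)
    bpm = request.form.get("bpm", "100")
    # run the transcription pipeline on the uploaded recording
    subprocess.run(["python", "main.py", audio_path, bpm], check=True)
    return jsonify({"status": "done"})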

Shivi’s Status Report for 3/22/25

This week, I focused on writing the detected rhythm/pitch information to a MIDI file and also looked into APIs for displaying the generated MIDI information as sheet music. Using the pitch detection I did last week, I wrote another script that takes in the MIDI note numbers and note types and creates MIDI messages. Each note is associated with messages that encode its pitch, duration, and loudness, and the script generates a .mid file with all the notes and their corresponding attributes. I tested this on a small clip of Twinkle Twinkle Little Star and uploaded the generated .mid file to the music notation platform flat.io to see if it contained the correct notation. Below is the generated sheet music. All the note pitches were generated by my pitch detection script, but the notes are hard-coded as quarter notes for now since our rhythm detection is still in progress. The note segmentation -> pitch detection -> MIDI generation pipeline seems to be generating mostly correct notes for basic rhymes like Twinkle Twinkle.
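
A minimal sketch of this kind of MIDI writing, assuming mido and 480 ticks per quarter note; the note list, velocity, and default tempo are illustrative, not the script's actual values.

import mido

TICKS_PER_BEAT = 480
NOTE_BEATS = {"sixteenth": 0.25, "eighth": 0.5, "quarter": 1.0, "half": 2.0}

def notes_to_midi(notes, bpm=100, path="out.mid"):
    # notes: list of (midi_note_number, note_type) pairs, e.g. (60, "quarter")
    mid = mido.MidiFile(ticks_per_beat=TICKS_PER_BEAT)
    track = mido.MidiTrack()
    mid.tracks.append(track)
    track.append(mido.MetaMessage("set_tempo", tempo=mido.bpm2tempo(bpm)))
    for pitch, note_type in notes:
        ticks = int(NOTE_BEATS[note_type] * TICKS_PER_BEAT)
        track.append(mido.Message("note_on", note=pitch, velocity=64, time=0))
        track.append(mido.Message("note_off", note=pitch, velocity=64, time=ticks))
    mid.save(path)
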

Earlier this week, I also did some research into APIs that we could use to display the generated sheet music on our web application in a way that is similar to MuseScore, a popular music notation application. While MuseScore doesn’t have an API that we can use, flat.io has a developer guide that will allow us to display the generated sheet music. Next week, I will be looking more into the developer guide and working with Deeya to set up/integrate the Flat API onto our web app. I will also work with Grace to refine/test our note segmentation more and ensure it is accurate for other notes and rests. We will also potentially be meeting one of the flutists this week so that we can collect more audio samples as well. Overall, my progress is on schedule, and hopefully we will have our transcription pipeline working on simple audio samples for our interim demo.