Team Status Report for 4/19/25:

This week, we tested staccato and slurred compositions and scales with SOM flute students to evaluate our transcription accuracy. During the two-octave scale tests, we discovered that some higher-octave notes were still being misregistered in the lower octave, so Shivi worked on fixing this in the pitch detection algorithm. Deeya and Shivi also made progress on the web app by enabling users to view past transcriptions and input key and time signature information. Grace is working on improving our rhythm algorithm to better handle slurred compositions by using the Short-Time Fourier Transform (STFT) to detect pitch changes and identify tied notes within audio segments. However, we are still working on securing access/reimbursement for the Embed API in Flat.io, which is needed to allow users to edit their compositions. This week we are preparing for our final presentation, planning to do two more testing sessions with the flute students, and cleaning up our project.
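
As a rough illustration of the STFT idea (not Grace's exact implementation), the sketch below tracks the dominant frequency bin in each frame and flags frames where it jumps by more than a semitone; the file name and thresholds are assumptions:

import numpy as np
from scipy.io import wavfile
from scipy.signal import stft

# Hypothetical mono recording of a slurred segment
sr, y = wavfile.read('slurred_segment.wav')
y = y.astype(np.float32)

# STFT: dominant frequency per frame
freqs, times, Z = stft(y, fs=sr, nperseg=2048, noverlap=1536)
dominant = freqs[np.argmax(np.abs(Z), axis=0)]

# Flag frames where the dominant frequency moves by more than one semitone
ratio = (dominant[1:] + 1e-6) / (dominant[:-1] + 1e-6)
change_points = times[1:][np.abs(np.log2(ratio)) > 1 / 12]
print('Pitch changes (s):', change_points)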

Deeya’s Status Report for 4/12/25

This week I focused on making the web app more intuitive. Shivi had worked on the metronome aspect of our web app, which allows the tempo to change as the user adjusts the metronome value using WebSockets. I integrated her code into the web app and passed the metronome value to the backend so that the BPM changes dynamically based on the user's setting. I also tried to get the editing feature of Flat.io working, but it seems that the free version using an iframe doesn't support it. We are thinking of looking into the Premium version so that we can use the JavaScript API. The next step is to work on this and add a Past Transcriptions page.
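
A minimal sketch of how the backend can pick up the metronome value over a WebSocket is below; it assumes Flask-SocketIO, which may not match our actual stack, and the event names are illustrative:

from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)

current_bpm = 120  # default tempo

@socketio.on('set_bpm')
def handle_set_bpm(data):
    # Update the backend tempo whenever the user changes the metronome value
    global current_bpm
    current_bpm = int(data['bpm'])
    # Broadcast the new tempo so the client-side metronome stays in sync
    socketio.emit('bpm_changed', {'bpm': current_bpm})

if __name__ == '__main__':
    socketio.run(app)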

Deeya’s Status Report for 03/29/25

This week I was able to display sheet music on our web app based on the user's recording or uploaded audio file. Once the user clicks Generate Music, the web app calls the main.py file, which integrates Shivi's and Grace's latest pitch and rhythm detection algorithms, generates and stores a MIDI file, converts it into MusicXML, and makes an API POST request to Flat.io to display the generated sheet music. The integration process is fairly seamless now, so whenever more changes are made to the algorithms it is easy to integrate the newest code with the web app and keep it functioning properly.
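
For reference, the MIDI-to-MusicXML step only takes a few lines; this is just a sketch assuming the music21 library (the report does not specify which converter we use), and the paths are illustrative:

from music21 import converter

# Parse the MIDI file produced by main.py and write it back out as MusicXML
score = converter.parse('output/transcription.mid')
score.write('musicxml', fp='output/transcription.musicxml')

# Read the MusicXML text so it can be sent to Flat.io (see the API call below)
with open('output/transcription.musicxml', 'r', encoding='utf-8') as f:
    musicxml_string = f.read()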

In the API step, once I convert the MIDI to MusicXML, I use the API method that creates a new music score in the current user's account. I send a JSON payload with the title of the composition, the privacy setting (public/private), and the data, which is the MusicXML file:

import flat_api

# Score payload: title, privacy setting, and the MusicXML data
new_score = flat_api.ScoreCreation(
    title='New Song',
    privacy='public',
    data=musicxml_string
)

This creates a score with an associated score_id, which lets the Flat.io embed display the generated sheet music in our web app:

// Embed the generated score in our web app using the Flat.io Embed API
var embed = new Flat.Embed(container, {
   score: scoreId,
   embedParams: {
      mode: 'edit',
      appId: '#',              // placeholder for our Flat.io application ID
      branding: false,
      controlsPosition: 'top'
   }
});

Flat.io has a feature that allows the user to make changes to the generated sheet music, including the key and time signatures, notes, and any articulations and notations. This is what I will be working on next, which should leave a good amount of time for fine-tuning and testing our project with the SOM students.

Deeya’s Status Report for 3/8/25

I mainly focused on my parts of the design review document and editing it with Grace and Shivi. Shivi and I also had the opportunity to speak to flutists in Professor Almarza's class about our project, and we were able to recruit a few of them to help us with recording samples and providing feedback throughout the project. It was a great experience to hear their thoughts and to understand how this could be helpful to them during their practice sessions. For my parts of the project, I continued working on the website and learned how to record audio and store it in our database to be used later. I will now be putting more of my effort into the Gen AI part. I am thinking of using a Transformer-based generative model trained on MIDI sequences, so I will need to learn how to convert MIDI files into a series of token encodings of musical notes, timing, and dynamics that can be processed by the Transformer model. I will also start compiling a dataset of flute MIDI files.
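
As a first pass at the MIDI-to-token step, a rough sketch using pretty_midi is below; the token scheme and file name are purely illustrative assumptions, not a finalized design:

import pretty_midi

pm = pretty_midi.PrettyMIDI('flute_sample.mid')

# Encode each note as pitch / duration / velocity tokens
tokens = []
for note in pm.instruments[0].notes:
    duration = note.end - note.start
    tokens.append(f'PITCH_{note.pitch}')
    tokens.append(f'DUR_{round(duration, 2)}')
    tokens.append(f'VEL_{note.velocity}')

print(tokens[:12])   # e.g. ['PITCH_70', 'DUR_0.5', 'VEL_90', ...]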

 

Shivi’s Status Report for 02/08/25

This week, I worked with Grace and Deeya to finish our proposal slides, where we included a very high-level workflow of our design. Completing the proposal slides gave us a better idea of the amount of work we need to do, and the three of us met up to generate some flute audio recordings. Since I am tasked with pitch detection, as an experiment I wrote a basic Python script that performs a Fast Fourier Transform (FFT) on single notes so that we could examine the frequencies associated with a few notes from the B-flat major scale:
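
A minimal sketch of that kind of script (assuming a mono WAV recording of one note; the file name is illustrative):

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

sr, y = wavfile.read('bflat4.wav')
y = y.astype(np.float32)

# Magnitude spectrum of the recording
spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), d=1 / sr)

plt.plot(freqs, spectrum)
plt.xlim(0, 4000)   # flute fundamentals plus the first few harmonics
plt.xlabel('Frequency (Hz)')
plt.ylabel('Magnitude')
plt.show()

print('Peak frequency:', freqs[np.argmax(spectrum)], 'Hz')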

Here, we can see the fundamental frequencies/harmonics associated with each note, a property that we will leverage to determine which note is being played in the audio. After the proposal presentations, we thought about feedback from our TA (Ankit) and realized that we need to think more about software-hardware tradeoffs in our design. Initially, we were keen on having a hardware component in our project (having taken or currently taking 18341 and 18349, and having seen similar past projects do this), but it seems that it may be cleaner and more efficient to perform certain tasks purely in software. For instance, our initial design included performing the FFT on the microcontroller, but it will definitely be more efficient to perform it on a laptop CPU. These are some of my thoughts for a revised design (at least on the signal processing side) based on some independent research:

  • Signal Processing 
    • Use microphone to capture flute audio
      • Suggested mic: InvenSense ICS-43434, a MEMS microphone with digital output. Can be mounted close to the flute’s embouchure hole and does not require any sort of PCB soldering. We also have the option to 3D print a custom clip to attach it to the flute for optimal placement.
      • Send audio to microcontroller via I2S (Inter-IC sound interface)
      • Microcontroller converts PDM (Pulse Density Modulation) to PCM (Pulse Code Modulation). Some suggested microcontrollers with built-in PDM support: RPi RP2040, STM32 (more suited for high-end tasks and higher performance so might not be necessary)
    • In software, do pitch detection: 
      • Apply additional digital filtering to the PCM signal: noise suppression, bandpass filtering, adaptive filtering
      • Apply Fast Fourier Transform to detect flute frequencies, map frequencies to flute notes
      • Use a smoothing filter (e.g., a moving average or a Kalman filter) to smooth out the pitch detection estimates
    • In software, do note length detection:
      • Use Peak Tracking in Frequency Domain (more computationally expensive than methods like time-domain envelope filtering and requires harmonic filtering to avoid detecting overtones, but less sensitive to volume variations and more accurate in noisy environments)
      • Detect note length: note is considered ongoing if the peak frequency remains stable. If the peak disappears or shifts significantly, the note has ended.
    • MIDI: store the note frequencies and durations in MIDI format. Then, generate a MIDI Note On message (0x90) when a note starts and a MIDI Note Off message (0x80) when it ends. Use the duration to determine the note type (eighth, quarter, half, whole note, etc.); a small sketch of this step appears after this list
    • Use MuseScore API to upload MIDI file and display sheet music on web app
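
Below is a small sketch of the note-to-MIDI step, assuming the mido library (which we have not yet committed to); the frequency, tick values, and file name are illustrative:

import math
import mido

def freq_to_midi(freq_hz):
    # Nearest MIDI note number for a frequency (A4 = 440 Hz = note 69)
    return round(69 + 12 * math.log2(freq_hz / 440.0))

mid = mido.MidiFile()
track = mido.MidiTrack()
mid.tracks.append(track)

# One detected note: ~466 Hz (B-flat above middle C) held for one beat
# (480 ticks at mido's default 480 ticks per beat)
note = freq_to_midi(466.16)
track.append(mido.Message('note_on', note=note, velocity=64, time=0))     # 0x90
track.append(mido.Message('note_off', note=note, velocity=64, time=480))  # 0x80

mid.save('transcription.mid')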

For the coming week, we plan to flesh out the design more and work on our low-level design, including other important details such as BPM detection, the metronome, and integration with the web app. We also aim to make a list of any inventory/purchase items we will need.