This week I focused on getting the metronome to play at whatever tempo the user enters. We no longer need to worry about the metronome’s pitch or about making sure the microphone picks it up, because the detection algorithms use the user-entered BPM instead. I also focused on integrating the backend with the web app: I got the Python file Shivi and Grace have been working on to run when the user clicks the Generate Music button. I first started with a hardcoded recording as the file’s input, and then switched to using the uploaded recording as the input to the detection algorithm, as sketched below. Overall, my progress on the web app is on schedule.
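As a rough illustration of that wiring, here is a minimal Django view that hands the uploaded recording and the user-entered BPM to the detection script. The view name, the `recordings` relation, and the `detection.py` script name are all hypothetical stand-ins, since the report doesn’t name them:

```python
# views.py -- hypothetical glue between the Generate Music button and the detection script
import subprocess

from django.http import JsonResponse

def generate_music(request):
    # Assumed relation: the most recent recording this user uploaded
    recording = request.user.recordings.latest("created_at")
    bpm = request.POST.get("bpm", "60")  # detection uses the user-entered BPM

    # Run the detection script (placeholder name) on the uploaded file
    result = subprocess.run(
        ["python", "detection.py", recording.audio.path, "--bpm", bpm],
        capture_output=True, text=True, check=True,
    )
    return JsonResponse({"output": result.stdout})
```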
Deeya’s Status Report for 3/15/25
This week I finished the UI on our website for recording the flute and background separately, and I added the ability to hear a playback of the recorded audio. If a user uploads a recording, the ‘Start Recording’ button is disabled, and if a user presses ‘Start Recording’, the ‘Upload’ button is disabled. Once the user starts recording, they can stop it and then replay or redo the recording. The user can also adjust and hear the tempo of the metronome, which plays at a default of 60 BPM. There are still two modifications I need to figure out:
1. The metronome can’t be heard in the recording, because the recording only captures sound coming externally from the computer, so I am experimenting with some APIs I found that can capture both internal and external audio from a computer.
2. I need to adjust the pitch of the metronome to a value outside the range a flute can be played at.
For the Gen AI part, there is a Music Transformer implementation available online that uses the MAESTRO dataset and focuses on piano music. I am thinking of using this instead of creating the process from scratch. I downloaded the code and worked through its different parts, and I was able to take a flute MIDI file and convert it into a format the transformer can use (sketched below). I want to continue learning and experimenting with this and see if I can fine-tune the model on flute MIDI files.
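For reference, here is a minimal sketch of that MIDI conversion step, assuming the pretty_midi library; the actual Music Transformer code has its own event encoding, so this only shows the kind of note-event flattening involved:

```python
# Hypothetical sketch: flatten a flute MIDI file into (start, end, pitch, velocity)
# note events, the kind of intermediate form a transformer pipeline then tokenizes.
import pretty_midi

def midi_to_events(path):
    pm = pretty_midi.PrettyMIDI(path)
    events = []
    for instrument in pm.instruments:
        if instrument.is_drum:  # skip percussion tracks, if any
            continue
        for note in instrument.notes:
            events.append((note.start, note.end, note.pitch, note.velocity))
    return sorted(events)  # order by onset time

events = midi_to_events("flute_piece.mid")  # placeholder filename
print(events[:5])
```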
Deeya’s Status Report for 02/22/2025
This week I finished setting up the user authentication process for our website, so each user now has a profile associated with their account. This will help keep track of which transcriptions belong to which user and which to show on their respective Past Transcriptions page. I also started looking into how to record live audio through the website and store it in our database so that it can be used by the pitch and rhythm algorithms Grace and Shivi are designing. Overall I am on track with the website and should be done with its overall functionality this week. One thing I still want to figure out is how to take the most recently stored audio file in our database, whether uploaded or live-recorded, and automatically run it through the pitch and rhythm algorithms, so that integration goes smoothly; one possible shape for this is sketched below. For the Gen AI portion of the project, it looks like I might need to create a labeled dataset myself, which I will have time to focus on once I finish the website this week. This week I will also be working on my portions of the design review report.
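Here is a minimal sketch of how the stored files could be tied to user accounts and fetched for the algorithms; the model and field names are assumptions, not the actual schema:

```python
# models.py -- hypothetical schema linking recordings to user accounts
from django.conf import settings
from django.db import models

class Transcription(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    audio = models.FileField(upload_to="recordings/")  # uploaded or live-recorded file
    created_at = models.DateTimeField(auto_now_add=True)

def latest_audio_for(user):
    """Return the newest stored file, ready to hand to the pitch/rhythm algorithms."""
    return Transcription.objects.filter(user=user).latest("created_at")
```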
Deeya’s Status Report for 2/15/25
This week, I made progress on our project’s website by setting up a Django application that closely follows the UI design from our mockup using HTML and CSS. I am finishing up the user authentication process using OAuth, which will allow users to easily register and log in with their email addresses. User profile information is stored in a SQL database. I am currently on track with the website development timeline and will focus next on uploading files and storing them in our database. I will also begin working on the “Past Transcriptions” web page, which will show the user’s transcription history along with the date each transcription was created.
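One common way to wire OAuth login into a Django project is django-allauth; this configuration sketch assumes that library and a Google provider, neither of which the report confirms:

```python
# settings.py -- assumed django-allauth setup for email/OAuth login
INSTALLED_APPS += [  # appended to Django's default app list
    "django.contrib.sites",
    "allauth",
    "allauth.account",
    "allauth.socialaccount",
    "allauth.socialaccount.providers.google",  # assumed provider
]

AUTHENTICATION_BACKENDS = [
    "django.contrib.auth.backends.ModelBackend",
    "allauth.account.auth_backends.AuthenticationBackend",
]

SITE_ID = 1
ACCOUNT_EMAIL_REQUIRED = True  # users register/log in with their email address
LOGIN_REDIRECT_URL = "/"
```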
Regarding the generative AI component of the project, I am still searching for a large enough labeled dataset for training our model. I found the MAESTRO dataset of piano MIDI files, which would be ideal if a similar dataset existed for the flute. If I am unable to find a large labeled dataset within the next few days, I am planning on creating a small dataset myself as a starting point. This will allow me to start experimenting with model training and fine-tuning while continuing to look for a better dataset.
Deeya’s Status Report for 02/08/25
- I am tasked with working on the web/mobile application part of our project as well as with implementing its Gen AI component.
- We first assessed whether a web application or a mobile app would work better for our project and its use cases. We decided on a web app because it makes it easier to access, upload, and store files and to authenticate users, and because we have more experience working with Python, JavaScript, HTML, and CSS than with Swift for iOS apps.
- I designed a very basic UI for the website and will be starting a Django project that implements the UI plus basic functionality: uploading/saving files to a database and user profiles so users can log in and out.
- For the Gen AI component, the first step is to find a large enough dataset of flute music across different genres. I spoke with Professor Dueck to ask whether there is a CMU archive of flute music or any resources she recommends looking through. She suggested classicalarchives.com, specifically solo or duet flute sonatas or anything unaccompanied, and looking through this website there are a lot of flute compositions that could be useful for this project. However, I still need to figure out the best way to compile a large dataset and categorize/label each piece by its genre, tone, and pace (one possible starting point is sketched below this list). This will be a time-consuming process, so I will continue researching for more labeled flute datasets.
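Since the labels may have to be assigned by hand, a simple CSV manifest could be a workable starting point; the directory layout and label axes here are assumptions drawn from the report, not a settled design:

```python
# build_manifest.py -- hypothetical manifest for hand-labeling flute pieces
import csv
from pathlib import Path

LABELS = ["genre", "tone", "pace"]  # assumed label axes

def write_manifest(midi_dir, out_csv):
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file"] + LABELS)
        for midi in sorted(Path(midi_dir).glob("*.mid")):
            # labels start blank and get filled in by hand while reviewing each piece
            writer.writerow([midi.name, "", "", ""])

write_manifest("flute_midis", "manifest.csv")  # placeholder paths
```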