Team Status Report for 4/26/25

This week, we collectively worked on a way to store past transcriptions in SQLite on our website, as well as on letting users add, edit, or change anything from the originally generated transcription. We felt this better follows the ebbs and flows of composing, and we wanted to mimic that process. Beyond this, we are now focusing on fine-tuning and testing our program further, as well as on some of the final deliverables, like the presentation and the poster.
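
To make the storage concrete, here is a minimal sketch of the kind of SQLite schema this involves, written with Python's built-in sqlite3 module; the table and column names are illustrative placeholders rather than our exact implementation.

    import sqlite3

    # Hypothetical schema sketch for storing transcriptions and user edits;
    # names are placeholders, not our exact implementation.
    conn = sqlite3.connect("transcriptions.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS transcriptions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            title TEXT,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP,
            musicxml TEXT              -- generated sheet music, stored as text
        )
    """)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS note_edits (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            transcription_id INTEGER REFERENCES transcriptions(id),
            note_index INTEGER,        -- position of the edited note
            new_pitch TEXT,            -- e.g. 'F4'
            new_duration TEXT          -- e.g. 'quarter'
        )
    """)
    conn.commit()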

Unit Tests:
Rhythm Detection/Audio Segmentation: We tested this on versions of Twinkle Twinkle Little Star at different BPMs; on songs with tied notes, such as Ten Little Monkeys and compositions from the School of Music; and on songs containing rests, such as Hot Cross Buns and additional School of Music pieces.
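
As an illustration of what one of these unit tests can look like, below is a pytest-style sketch that checks the detected note durations for the opening phrase of Twinkle Twinkle Little Star at several BPMs; the detect_rhythm function, the transcriber module, and the test file paths are hypothetical stand-ins for our actual rhythm/segmentation code and ground-truth annotations.

    import pytest

    # First phrase of Twinkle Twinkle Little Star: six quarter notes, then a half note.
    EXPECTED_DURATIONS = ["quarter"] * 6 + ["half"]

    @pytest.mark.parametrize("bpm", [60, 90, 120])
    def test_twinkle_rhythm(bpm):
        from transcriber import detect_rhythm        # hypothetical module/function
        notes = detect_rhythm(f"tests/twinkle_{bpm}bpm.wav", bpm=bpm)
        assert [n.duration for n in notes][:7] == EXPECTED_DURATIONS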

Overall System Tests: We tested the project on songs of varying difficulty: easy songs such as nursery rhymes (Hot Cross Buns, Ten Little Monkeys, Twinkle Twinkle Little Star, etc.) and scales; intermediate pieces from YouTube as well as our own team members' playing (Telemann – 6 Sonatas for Two Flutes, Op. 2, No. 2 in E minor, TWV 40:102 – I. Largo; Mozart – Sonata No. 8 in F major, K. 13 – Minuetto I and II; etc.); and more difficult pieces such as compositions from the School of Music.

Findings and Design Changes: Across these songs, we found that our program struggled most with higher octaves (the filter would accidentally cut off those frequencies), slurred notes (they were not recognized as the onset of a new note), and rests (especially differentiating moments of taking a breath from actual rests). These findings led us to tweak how we define the start of a new note for segmentation and to adjust the boundaries of our filter.
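
For reference, the filter change amounts to widening the passband so that high-octave flute fundamentals are no longer attenuated. Below is a minimal sketch using scipy; the cutoff values are illustrative, not our final parameters.

    from scipy.signal import butter, sosfiltfilt

    # Band-pass the recording while keeping high flute fundamentals (up to ~C7, ~2093 Hz).
    # The 200 Hz / 4 kHz cutoffs are example values only.
    def bandpass(audio, sr, low_hz=200.0, high_hz=4000.0):
        sos = butter(4, [low_hz, high_hz], btype="bandpass", fs=sr, output="sos")
        return sosfiltfilt(sos, audio)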

Data Obtained:

Scale 1: F Major Scale
    Latency: 10.08 secs | Rhythm Accuracy: 100% | Pitch Accuracy: 100%
Scale 2: F Major Scale w/ Ties
    Latency: 12.57 secs | Rhythm Accuracy: 97% | Pitch Accuracy: 95%
Simple 1: Full Twinkle Twinkle Little Star
    Latency: 13.09 secs | Rhythm Accuracy: 100% | Pitch Accuracy: 100%
Simple 2: Ten Little Monkeys
    Latency: 14.46 secs | Rhythm Accuracy: 93% | Pitch Accuracy: 100%
Simple 3: Hot Cross Buns
    Latency: 11.78 secs | Rhythm Accuracy: 91% | Pitch Accuracy: 100%

Intermediate 1: Telemann – 6 Sonatas for Two Flutes, Op. 2, No. 2 in E minor, TWV 40:102 – I. Largo
    Latency: 15.16 secs | Rhythm Accuracy: 90% | Pitch Accuracy: 100%
Intermediate 2: Mozart – Sonata No. 8 in F major, K. 13 – Minuetto I and II
    Latency: 16.39 secs | Rhythm Accuracy: 87% | Pitch Accuracy: 97%
Hard 1: Phoebe SOM Composition
    Latency: 10.99 secs | Rhythm Accuracy: 91.5% | Pitch Accuracy: 100%
Hard 2: Olivia SOM Composition
    Latency: 12.11 secs | Rhythm Accuracy: 93% | Pitch Accuracy: 100%

Team Status Report for 3/29/25

This week we are working on integrating the basic functions of the code. First, Shivi and Grace will combine the rhythm and pitch code into a main.py, which will then serve as the backend for Deeya's web app. After that, we'll integrate it with an API to generate the sheet music UI.
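
As a rough sketch of this integration, assuming a Flask-style backend (an assumption on our part), the web app could call the combined pipeline roughly as follows; the transcribe entry point and the JSON shape are placeholders for whatever main.py ends up exposing.

    from flask import Flask, request, jsonify
    from main import transcribe   # hypothetical entry point combining pitch + rhythm

    app = Flask(__name__)

    @app.route("/transcribe", methods=["POST"])
    def transcribe_endpoint():
        # Save the uploaded recording, run the combined pipeline, return note data.
        audio_file = request.files["recording"]
        audio_file.save("upload.wav")
        notes = transcribe("upload.wav")   # e.g. [{"pitch": "F4", "duration": "quarter"}, ...]
        return jsonify(notes)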

While integrating, we realized that we need to modify the audio segmentation to better account for periods of rest. We will look into using concave changes in the audio to compute the start of a new note. Additionally, we will need to experiment with the MIDI encoding: the rhythm detection currently outputs the type of each note (quarter, half, etc.), but the MIDI encoding takes the start and end times of samples. We will test whether we can pass in actual start and end times and have the MIDI API round them to the nearest note values, or whether that quantization is something we will have to do manually.
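
As a sketch of the note-type-to-timing conversion we may need, the snippet below turns (MIDI pitch, note type) pairs at a fixed BPM into start/end times and writes a MIDI file; the use of pretty_midi and the 120 BPM default are assumptions rather than settled design choices.

    import pretty_midi

    # Beats per note type; a quarter note is one beat.
    BEATS = {"whole": 4.0, "half": 2.0, "quarter": 1.0, "eighth": 0.5}

    def encode(notes, bpm=120, out_path="out.mid"):
        sec_per_beat = 60.0 / bpm
        pm = pretty_midi.PrettyMIDI()
        flute = pretty_midi.Instrument(program=73)   # General MIDI flute
        t = 0.0
        for pitch, note_type in notes:               # e.g. (69, "quarter") for A4
            dur = BEATS[note_type] * sec_per_beat
            flute.notes.append(pretty_midi.Note(velocity=90, pitch=pitch, start=t, end=t + dur))
            t += dur
        pm.instruments.append(flute)
        pm.write(out_path)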

For this week, we will work on having a basic working demo ready for the interim demos, and then on adding the ability to change the sheet music within the API (manually adding and deleting notes, etc.).

Team Status Report for 3/8/25

This past week, we mostly focused on finishing our design review, ensuring that the language was specific and concise. We also focused on adding clear graphics, such as pseudocode and diagrams, to help convey the information. In addition, we met with Professor Almarza, confirmed the use case requirements, and got their opinion on the current workflow. From this, we also got three flutists to sign up for testing our project, and we will now work on getting them signed up for the conjoined mini course as well. On the project itself, after some research and discussing the concept with Professor Sullivan, we have a clearer understanding of how to implement audio segmentation and look toward finishing this portion up this next week.

Overall, we are currently on track, though we may run into some issues with audio segmentation, as it will be the most difficult aspect of our project.

Part A (Grace): Our product solution will address global factors, including users who are not as technologically savvy, by making the product as easy to use as possible. The project already significantly decreases the strain of composing your own music: it removes the need for individuals to know the exact lengths of notes, pitches, etc. while they are composing, and it shortens the time it takes to transcribe. In addition, the individual only has to upload an mp4 recording of themselves playing to receive transcribed music, since we handle all of the backend aspects. As such, even the technologically unsavvy should be able to use this application. Furthermore, we aim to make the UI user-friendly and easy to read.

In addition, we aim to make this usable in environments beyond an academic one by filtering out outside noise, so users can run the application even in noisy settings. As mentioned in our design review, we will be testing the application in multiple different settings to cover the variety of environments in which this website could be used globally.

Part B (Shivi): Our project can make a cultural impact by allowing people to pass down music that lacks standardized notation. For instance, traditional and folk tunes (such as those for the Indian bansuri or the Native American flute) are often played by ear and likely to be lost over time, but our project can help transcribe such performances, allowing them to be preserved across generations. This would also increase access to music from different cultures, promoting cross-cultural collaboration.

Part C (Deeya): Write on Cue addresses environmental factors by encouraging a more sustainable system of digital transcription over printed notation, reducing paper usage. Digital transcription also allows musicians to learn, compose, and practice remotely, reducing the need for physical travel to lessons, rehearsals, or recording sessions. By cutting transportation energy and paper consumption, it helps make our product more environmentally friendly. Also, instead of relying on large, energy-intensive AI models, we are going to use smaller, more efficient models trained specifically for flute music, which will help reduce computation time and power consumption. We will look into techniques like quantization to help speed up inference.
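
For illustration, if our models end up being PyTorch modules (an assumption at this point), dynamic quantization of their linear layers would look roughly like this:

    import torch

    # Convert Linear layers to int8 for faster, lower-power inference.
    # Illustrative only; assumes the model is a standard torch.nn.Module.
    def quantize(model):
        return torch.quantization.quantize_dynamic(
            model, {torch.nn.Linear}, dtype=torch.qint8
        )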

Team Status Report for 02/08/25

As we had proposal presentations this week, we worked hard on finishing our slides, ensuring they were done far enough in advance that Ankit, our TA, could give us feedback on our schedule. Ankit mentioned the possibility of implementing our hardware components (like the Arduino microcontroller) solely in software instead, as it would run a lot faster. We are currently considering this option: since we would ideally like to make this system real time, faster processing would be best. However, this could change how we approach tasks like rhythm detection. We plan on reaching out to Ankit again to talk this over further.

Last week, we also met with Professor Dueck and other musicians to discuss what our project looks like and how the music department could contribute to it, such as by allowing us to work in her studio to test the flutes in a relatively noiseless environment, which would be best for a bare-bones working project. Additionally, she connected us with Professor Almarza, who will be helping us find flutists to test our project.

After this, we experimented with some straight-tone flute signals and looked at how their pitch appears in Matlab. This gave us more insight toward getting a bare-bones project up and working.
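
For illustration, here is a Python analogue of that Matlab experiment: estimating the fundamental of a steady, straight-tone note from the FFT peak. The file name and the 100 Hz floor are placeholders.

    import numpy as np
    from scipy.io import wavfile

    sr, audio = wavfile.read("straight_tone.wav")      # placeholder file name
    if audio.ndim > 1:                                  # mix down to mono if needed
        audio = audio.mean(axis=1)
    audio = audio.astype(float)
    spectrum = np.abs(np.fft.rfft(audio * np.hanning(len(audio))))
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sr)
    mask = freqs > 100                                  # ignore DC / low-frequency rumble
    print("Estimated pitch: %.1f Hz" % freqs[mask][np.argmax(spectrum[mask])])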

Currently, our most significant risk is switching the project approach, hitting unforeseen consequences, and then having to backtrack to the hardware idea, which is a little more fleshed out thanks to past project references. These risks can be managed by discussing the change further with our TA and course staff, such as Professor Sullivan. The switch might also change the existing design, specifically the system spec, to help with speed. Overall, we feel that we are on track and are excited to see where our project takes us, as well as to work collaboratively with CMU musicians and get their feedback throughout the process.