Kumar's Status Report for 3/25

Since Alejandro is familiar with Bootstrap, he focused on converting our previously written CSS to Bootstrap, which helps modularize the project better.

I continued working on the project, specifically the crux of a major bug: if the length of the notes array was greater than 0, we weren't able to draw the notes using the Vexflow library, and we kept encountering an uncaught syntax error in our JavaScript file. After digging through various forums and the Vexflow documentation, we resolved it by wrapping the call in a try/catch block, which is admittedly a bit of a hard-coded fix. I then tested it with various files and arrays to make sure it no longer fails.


As this took a few days to sort out, I mainly assisted Alejandro and Aditya in higher-level discussions about the integration files we had to alter, as outlined in the team status report. I also took the lead on the ethics assignment for our group.

I’d say I’m on track, with the front-end nearly finalized; I’m contributing mostly to debugging and code-structure discussions on the backend.


Next week I’ll be taking the lead on SNR rejection and adding more front-end functionality, such as saving and downloading files, and perhaps creating user profiles.

Aditya’s Status Report for 3/25

My schedule changed this week: after we built the Note design structures, we realized we would have to implement the integrator step of the project in the front-end instead of the back-end, as discussed in our team status report. Because of this, I couldn’t use the output of the integrator in my Vexflow code as I’d originally planned.

Instead, I worked on determining how long each transcription needs to be based on the number of notes in the audio. Vexflow’s library is set up so that you have to manually instantiate each measure of the musical piece before adding notes to it. So, I wrote code that takes the length of the input audio and, using the baseline time signature of 4/4, determines how many measures (defined as Staves in the Vexflow API) the transcription requires.
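Since each note currently defaults to a quarter note in 4/4 time, the measure count reduces to a ceiling division. A minimal Python sketch of the idea (our actual implementation is in the front-end JavaScript, and the function name here is made up for illustration):

```python
import math

def count_staves(num_notes: int, beats_per_measure: int = 4) -> int:
    """Number of measures (Vexflow "Staves") needed, assuming every
    note is a quarter note in 4/4 time."""
    return max(1, math.ceil(num_notes / beats_per_measure))

print(count_staves(4))  # 4 quarter notes fit in one measure
print(count_staves(9))  # 9 notes need 3 measures
```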

Once the number of Staves was known, I could set up a 2-D array of locations to track which measure goes where on the output PDF. I chose to always have 4 Staves in each row, so the 2-D array is an N-by-4 matrix where 4*N is the total number of Staves. If the number of Staves isn’t divisible by 4, “padding” Staves are appended at the end so the output still looks neat.

Once the Staves are instantiated, I iterate through the piece and write each note (only the pitches currently, as the integrator is incomplete). The row and column of the desired Stave are determined by the index of the Note in the list received from the backend; for example, the 5th item in the list corresponds to the 2nd row, 1st Stave of the transcription.
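The row/column lookup is just integer division and remainder on the item's index. A Python sketch of the layout logic (the real code is JavaScript; function names are hypothetical):

```python
import math

def stave_grid(num_staves: int, staves_per_row: int = 4):
    """Return (rows, padding): how many rows the grid needs and how
    many "padding" Staves fill out the final row."""
    rows = math.ceil(num_staves / staves_per_row)
    padding = rows * staves_per_row - num_staves
    return rows, padding

def stave_position(index: int, staves_per_row: int = 4):
    """Map an item's index in the list received from the backend to a
    0-based (row, column) position in the transcription grid."""
    return divmod(index, staves_per_row)

print(stave_grid(6))      # (2, 2): two rows, two padding Staves
print(stave_position(4))  # (1, 0): 5th item -> 2nd row, 1st Stave
```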

Alejandro’s Status Report for 3/25

First, I had to work on the ethics assignment with my team.

After that, I decided to make the front end of the website look even better. I made sure we were using only Bootstrap and barely any CSS so that resizing the window would not affect the look of the website. I also added code so that our backend can read in the values the user sets in the form when selecting a clef, time signature, and audio file. Previously, our code could not read these values correctly; this should now be fixed. This information will be sent to the rhythm and pitch processors. I also added the copyright footer to our website.


Finally, I wrote code in the views.py file that allows the backend to send all the correct information to the front end so that we can use it with VexFlow to display the music sheet. This required making some changes to the integrator, as discussed in the team weekly status report. I moved the integrator from Python to the JavaScript part of our code, since there was a bug where sending a list containing a class instance from Python to JavaScript would not work. So now we just send the pitch and rhythm outputs to the JavaScript and call our integration function there.

I also made sure that Vexflow correctly displays the clef and the time signature in the front-end.

Finally, I made sure that the integration system was working properly. It now seems able to produce an output of notes, which means we should be able to test it next.

I would say my progress is on schedule.


Next week we will be focusing on testing the integration system as well as the other systems. We should also get started on the SNR rejection system if time allows. 


Team Weekly Status Report for 3/25


Transcription currently takes somewhat long even for short audio clips: for example, it takes 24 seconds to transcribe an audio clip of around 14 seconds containing 4 notes. This could become a bigger issue with longer audio.


Design Changes

Our design has two sub-processors, one for determining the pitch of each note in the audio and one for determining the rhythm of each note, followed by an integration engine. The sub-processors are implemented in Python, and we initially planned to implement the integrator in Python as well, generating a list of Note structs and sending them to the front-end to be transcribed. However, we found that when information was packaged within a design structure, the front-end could not effectively parse it; we would have to send the front-end information composed of primitive data types such as strings or ints. So, we decided to integrate the pitch and rhythm processors’ outputs after the information is sent to the front end: we can send the outputs of the rhythm and pitch processors separately, since they are lists of integers and strings. The method of integration is unchanged; the primary difference is that the HTTP response contains two outputs instead of one.
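Because the two processor outputs are parallel lists, the integration itself is an element-wise merge. A hedged Python sketch of the idea (the integrator now actually runs in JavaScript, and the field names here are hypothetical):

```python
def integrate(pitches, rhythms):
    """Merge the pitch processor's output with the rhythm processor's
    output into plain note records built from primitive types."""
    if len(pitches) != len(rhythms):
        raise ValueError("processor outputs must have the same length")
    return [{"pitch": p, "duration": d} for p, d in zip(pitches, rhythms)]

print(integrate(["C4", "E4"], [1, 2]))
```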



[Front-End Updated]

[Integrator Change]

Kumar's Status Report for 3/18

My role this week was mainly to assist my teammates with their tasks. Our front-end app is coming along nicely, so I transitioned to helping with the backend work.

I planned out the HTTP response containing the note information we needed, so Aditya was able to use this while starting to work with Vexflow. I tried the approaches Aditya had already outlined, downloading the library with npm and manually, but neither worked for us, so we used a script tag.

I resolved a major bug on the front-end involving the MIME type of the JavaScript: the browser was unable to read the file being sent to it when trying to display the notes in Vexflow. This was an issue with the TypeScript and JavaScript integration, and I was able to resolve it by simply moving the factory class in our JavaScript file to the bottom of the file.

I would say my progress is on schedule. Next week I am going to help Alejandro with the rhythm and pitch processor integration, and take the lead again on the CSS and look of the web app interface.

Aditya's Status Report for 3/18

This week I started implementing Vexflow in the front-end of our web-app. My team already had the format of the HTTP Response containing the note information planned out, so I was able to work on this concurrently with my teammates working on the back-end integration of the frequency and rhythm processors.

I used the API found at https://github.com/0xfe/vexflow to implement this. I experimented with downloading the library manually and via npm, but decided the simplest way was to access the library via a <script> tag linking to the unpkg source provided in the Vexflow tutorial.

The plan is for the back-end to send a list of Note objects containing the pitch and duration data for each note. I started with just the pitch information, automatically assigning each note a quarter-note duration to minimize errors. This was successful, but I ran into an obstacle: Vexflow requires us to manually create each new “Stave” (a set of 4 beats within a signal). Because of that, the app is currently limited to transcribing only 4 notes at a time. My plan for next week is to write a program that determines how many Staves are needed based on how many beats are in the input signal.

I also fixed a major bug in the pitch processor, where the math used to determine which note was played often caused rounding errors, yielding an output one half-step off of the desired note; for example, reading a D note as a D#.
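The underlying arithmetic is a log-scale conversion from frequency to semitones, where truncating instead of rounding produces exactly this half-step-off symptom. A minimal sketch of the rounding fix, assuming equal temperament with A4 = 440 Hz (our processor's actual code differs):

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq: float, a4: float = 440.0) -> str:
    """Convert a detected frequency to the nearest equal-tempered note.
    Using round() (rather than int() truncation) avoids landing a
    half-step below the intended pitch."""
    semitones = round(12 * math.log2(freq / a4))  # offset from A4
    midi = 69 + semitones                         # A4 is MIDI note 69
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)

print(freq_to_note(293.66))  # D4, not D#4
```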

Team Weekly Status Report for 3/18


There are not too many significant risks right now. One is figuring out how to integrate the rhythm and pitch processors into a single array of notes while including rests, which might be challenging in Vexflow since its documentation of rests is sparse. We plan to test with simple files and write simple scripts to find the easiest data structure to accommodate this.


Design Changes

Since rests are written a little differently from notes, we might have to change the current design of the Note data structure. However, as of right now, no major design changes are required.



UI Skeleton

Output from audio file containing a C scale:

Alejandro’s Status Report for 3/18

First of all, I restructured our Django application. We had created a lot of unnecessary Django applications for every subset of our app; for example, one for the rhythm processor, one for the pitch processor, etc. We do not need all these applications, just one for the music transcription application. Therefore, I deleted some folders and files, changed some import statements to handle the new structure, and made it look cleaner.

I also created a Django form called “AudioForm” which allows the user to select the user-defined values in the front-end: the audio file, the type of clef, and the time signature. This is how it looks…

I also made sure that the rhythm processor can handle audio files that contain more than a single channel. Since the rhythm processor is mainly looking for onset signals, I converted the multi-channel array into a 1-D array: at each sample, I iterate through the channels’ values and keep the one with the greatest absolute value, appending it to the 1-D array of audio data that will be scanned for onsets.
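That max-absolute-value downmix can be done without an explicit loop in NumPy. A sketch, assuming the audio array is shaped (samples, channels); the actual variable names in our code differ:

```python
import numpy as np

def to_mono(audio: np.ndarray) -> np.ndarray:
    """Collapse a (samples, channels) array to 1-D by keeping, at each
    sample index, the channel value with the greatest absolute value
    (sign preserved), so onsets remain prominent."""
    if audio.ndim == 1:
        return audio
    loudest = np.argmax(np.abs(audio), axis=1)
    return audio[np.arange(audio.shape[0]), loudest]

stereo = np.array([[0.1, -0.5], [0.3, 0.2]])
print(to_mono(stereo))  # [-0.5  0.3]
```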

I would say my progress is on schedule.

Next week I plan to focus on writing code for the integration of the pitch and rhythm processor, as well as adding some styling to the front end of the website since it currently contains no CSS.




Team Weekly Status Report for 3/11

Design Changes

During the Design Review process, we determined that our method of testing pitch wouldn’t be reliable to determine how well we’ve achieved our goal with the transcription. We’ve reworked our testing process to be more user-focused, by running transcriptions of common tunes (e.g. Mary Had a Little Lamb), having average musicians read the sheet music, and having them rate the accuracy of the sheet music on a scale of 1 to 10.


We started implementing the Vexflow JS library this week and found that the library’s limited internal documentation causes some complications when integrating it with our webapp. We are still working towards a solution, and we may have to reconsider our method of converting the Note object models into PDF format.




Alejandro’s Status Report for 3/11

This week I looked into which parameters work best for the rhythm processor to detect onset peaks in an audio signal. The find_peaks function from SciPy can take several parameters, and we think the most useful ones for our rhythm processor will be the peak distance parameter and the height parameter.

The peak distance parameter will be set to the size of our time-interval window so that at most one onset is found per window. Since our rhythm processor iterates through the signal with a sliding-window approach, we do not need to look for more than one peak in any given window.

The height parameter is currently set to the lowest peak value I observed when a note was played, based on multiple audio samples. However, we may have to tune this value during testing to maximize the rhythm processor’s performance across different inputs.
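Putting the two parameters together, onset detection reduces to a single find_peaks call. An illustrative sketch (the window size and height threshold below are placeholder values, not our tuned ones):

```python
import numpy as np
from scipy.signal import find_peaks

WINDOW = 2048      # sliding-window size in samples (placeholder)
MIN_HEIGHT = 0.05  # lowest observed note amplitude (placeholder)

def detect_onsets(signal: np.ndarray) -> np.ndarray:
    """Indices of onset peaks: `distance` enforces at most one peak
    per window, `height` drops peaks below the empirical threshold."""
    peaks, _ = find_peaks(signal, distance=WINDOW, height=MIN_HEIGHT)
    return peaks

demo = np.zeros(8000)
demo[1000], demo[5000] = 1.0, 0.8  # two synthetic onsets
print(detect_onsets(demo))         # [1000 5000]
```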

After that, my team and I mainly focused on writing the design document for the project. 

I think our progress might be a little behind overall, due to our team facing sick days and a heavy workload from other classes. However, our other classes should lighten up in the coming weeks, allowing us to catch up, and I think we will be fine timeline-wise.

Next week, Aditya and I will focus on integrating the rhythm and pitch processors.