Aditya Status Report March 18

This week I started implementing Vexflow in the front-end of our web-app. My team already had the format of the HTTP Response containing the note information planned out, so I was able to work on this concurrently with my teammates working on the back-end integration of the frequency and rhythm processors.

I used the API found at https://github.com/0xfe/vexflow to implement this. I experimented with downloading the library manually or via npm, but I decided the simplest way would be to access the library via a <script> tag that points to the unpkg-hosted build linked in the Vexflow tutorial.

The plan is for the back-end to send a list of Note objects containing the pitch and duration data for each note. I started with just the pitch information, automatically assigning each note a duration of a quarter note to minimize errors. This was successful, but I ran into an obstacle: Vexflow requires us to manually create each new “Stave,” i.e., a set of 4 beats within the signal. Because of that, the app is currently limited to transcribing only 4 notes at a time. My plan for next week is to write a routine that determines how many Staves will be needed based on how many beats are in the input signal.
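As a rough illustration, here is a minimal sketch of that calculation in Python (assuming 4/4 time and the quarter-note-only simplification above, so each note occupies one beat; the names below are placeholders, not our actual code):

import math

BEATS_PER_STAVE = 4  # one Stave currently holds 4 beats

def stave_count(num_beats: int) -> int:
    # Number of Staves needed to hold every beat in the input signal
    return math.ceil(num_beats / BEATS_PER_STAVE)

def group_into_staves(notes: list) -> list:
    # Split the flat note list into chunks of 4, one chunk per Stave
    return [notes[i:i + BEATS_PER_STAVE]
            for i in range(0, len(notes), BEATS_PER_STAVE)]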

I also managed to fix a major bug in the pitch processor, where the math used to determine which note was played often introduced rounding errors, producing an output that was one half-step off from the desired note; for example, reading a D note as a D#.
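For context, here is a minimal sketch of an equal-temperament frequency-to-note conversion that rounds the semitone index to the nearest integer (assuming A4 = 440 Hz; this is illustrative and not the pitch processor's actual code, but flooring instead of rounding at this step would produce exactly the half-step shifts described above):

import math

NOTE_NAMES = ['A', 'A#', 'B', 'C', 'C#', 'D',
              'D#', 'E', 'F', 'F#', 'G', 'G#']

def freq_to_note(freq_hz: float) -> str:
    # Semitones above or below A4 = 440 Hz, rounded to the NEAREST
    # whole semitone; truncating here instead of rounding biases the
    # result a half-step off, the symptom described above.
    semitones = round(12 * math.log2(freq_hz / 440.0))
    return NOTE_NAMES[semitones % 12]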

Team Weekly Status Report for 3/18

Risks

We have not identified many significant risks at this point. One is figuring out how to integrate the rhythm and pitch processors into a single array of notes while including the rests, which might be challenging in Vexflow since there is not a lot of documentation for the rests aspect of it. We plan on testing with simple files and writing simple scripts to see which data structure would be the easiest way to accommodate this.

 

Design Changes

Since rests are written a little differently than notes, we might have to change our current design for the Note data structure. However, as of right now, no major design changes are required.
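One way to keep that change small would be to fold rests into the existing Note structure instead of adding a new type; a rough sketch of the idea (the field names here are placeholder assumptions, not our final design):

from dataclasses import dataclass
from typing import Optional

@dataclass
class Note:
    duration: str                # e.g. 'q' for a quarter note
    pitch: Optional[str] = None  # e.g. 'C4'; None means this entry is a rest

    @property
    def is_rest(self) -> bool:
        return self.pitch is None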

 

Progress

UI Skeleton

Output from audio file containing a C scale:

Alejandro’s Status Report for 3/18

First of all, I restructured our djangoapp project. We had created a lot of unnecessary Django applications for every subset of our app; for example, one Django application for our rhythm processor, one for our pitch processor, and so on. We do not need all of these applications, just one for the music transcription application. Therefore, I deleted some folders and files, changed some import statements to handle the new structure, and made it look cleaner.

I also created a Django form called “AudioForm” which will allow the user to provide the user-defined values in the front-end: the audio file, the type of clef, and the time signature. This is how it looks…
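A minimal sketch of what such a form definition might look like (the field names and choice lists below are placeholder assumptions, not necessarily what AudioForm actually contains):

from django import forms

CLEF_CHOICES = [('treble', 'Treble'), ('bass', 'Bass')]
TIME_SIGNATURE_CHOICES = [('4/4', '4/4'), ('3/4', '3/4'), ('6/8', '6/8')]

class AudioForm(forms.Form):
    audio_file = forms.FileField()
    clef = forms.ChoiceField(choices=CLEF_CHOICES)
    time_signature = forms.ChoiceField(choices=TIME_SIGNATURE_CHOICES)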

I also made sure that the rhythm processor is able to handle audio files that contain more than a single channel. Since the rhythm processor is mainly looking for onset signals, I converted the multi-channel array into a 1D array: at each sample index I keep the channel value with the greatest absolute value and append it to the 1D array of audio data that will be scanned for onsets.
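A rough sketch of that per-sample reduction, assuming the audio is loaded as a NumPy array of shape (num_samples, num_channels):

import numpy as np

def to_single_channel(audio: np.ndarray) -> np.ndarray:
    # Collapse multi-channel audio to 1D by keeping, at each sample
    # index, the channel value with the greatest absolute value.
    if audio.ndim == 1:
        return audio  # already a single channel
    loudest = np.argmax(np.abs(audio), axis=1)
    return audio[np.arange(audio.shape[0]), loudest]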

I would say my progress is on schedule.

Next week I plan to focus on writing code for the integration of the pitch and rhythm processors, as well as adding some styling to the front end of the website, since it currently contains no CSS.


Team Weekly Status Report for 3/11

Design Changes

During the Design Review process, we determined that our method of testing pitch wouldn’t be reliable to determine how well we’ve achieved our goal with the transcription. We’ve reworked our testing process to be more user-focused, by running transcriptions of common tunes (e.g. Mary Had a Little Lamb), having average musicians read the sheet music, and having them rate the accuracy of the sheet music on a scale of 1 to 10.

Risks

We started implementing the Vexflow JS library this week, and found that the library’s limited internal documentation causes some complications when integrating it with our webapp. We are still working towards a solution, and we may have to reconsider our method of converting the Note object models into PDF format.

 

Deliverables

UI SKELETON W/ C-SCALE AUDIO INPUT AND NOTE LIST OUTPUT

Alejandro’s Status Report for 3/11

This week I looked into which parameters might work best for the rhythm processor to detect the onset peaks in an audio signal. The find_peaks function from SciPy can take in several parameters, and we think the most useful ones for our rhythm processor will be the peak distance parameter and the height parameter.

The peak distance parameter will be set to the length of our time-interval window so that it looks for at most one onset in the current window. Since we are iterating through the signal with a sliding-window approach in our rhythm processor, we do not need to look for more than one peak in any given window.

The height parameter is currently set to the lowest peak value I observed when a note is played, measured across multiple audio recordings. However, in the future we may have to tune this value while testing the system, to maximize the performance of the rhythm processor across different inputs.
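A minimal sketch of how those two parameters feed into the call (the window length and threshold below are placeholders, not the tuned values):

import numpy as np
from scipy.signal import find_peaks

def onset_in_window(window: np.ndarray, window_len: int,
                    min_height: float) -> bool:
    # distance = window length  -> at most one peak reported per window
    # height   = lowest amplitude observed for an actual played note
    peaks, _ = find_peaks(window, height=min_height, distance=window_len)
    return len(peaks) > 0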

After that, my team and I mainly focused on writing the design document for the project. 

I think our progress might be a little behind overall, due to our team facing sick days and heavy workloads from other classes. However, our other classes should hopefully lighten up in the coming weeks, allowing us to catch up, and I think we will be fine in terms of the timeline.

Next week, Aditya and I will be focusing on the integration of the rhythm and pitch processors.

Kumar Status Report 3/11

Admittedly, this was a bit of a slow week, as I was doing some research and was overwhelmed with all the deadlines in advance of spring break.

I continued to refine urls.py as well as the look and feel of the HTML pages we will be using, such as adding our group’s logo and links to our ECE course webpage.

While there’s not much to show, the Python code from last week’s screenshots needed debugging to make sure the registration edge cases were handled correctly.

I also watched various tutorials and reviewed the documentation for Vexflow. Now that the skeletons of the various parts of the project are coming together, I spent some time with Alejandro and Aditya discussing how the integration of the parts will happen in our views.py file, which dictates the backend functionality of a Django web app.
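As a rough sketch of the shape that integration could take in views.py (the processor imports, function names, and template names here are hypothetical placeholders; the real ones are still being decided):

from django.shortcuts import render

# Hypothetical processor functions (the real modules may be named differently):
# from transcriber.pitch import get_note_list
# from transcriber.rhythm import get_onsets

def transcribe(request):
    if request.method == 'POST' and request.FILES.get('audio_file'):
        audio_file = request.FILES['audio_file']
        # pitches = get_note_list(audio_file)   # pitch processor
        # onsets  = get_onsets(audio_file)      # rhythm processor
        notes = []  # placeholder until the processors are wired in
        return render(request, 'transcription.html', {'notes': notes})
    return render(request, 'upload.html')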

Finally, we all worked together to put significant effort into the Design Review Report, and this took the bulk of our time. I would consider myself slightly behind schedule as we shift towards the integration part of our project in the next couple of weeks, but I know I will make this up in the week after spring break.

This should have all the required aspects of the weekly status report.


Kumar Status Report 2/25

This week I worked on the front-end of the app as well as on the user feedback and testing forms.

My main goal was to start building basic HTML pages which encompass the basic functionalities of our web app: login, logout, register, file upload, and selecting file options for the recording. These all extend a base.html file.

I also started working on the backend Python scripts which will dictate these functionalities. Some screenshots are shown below; I’m currently in the process of fixing some bugs, which is why it looks as if the code has errors. I am now working on the file-upload functionality.

I also researched helpful Django libraries, such as Django CKEditor, which will help with the UI of the web app interface.

I would consider myself back on track given last week’s sickness. My deliverables will be fixing this file upload functionality and adding more to the HTML pages before spring break.

 

This should have all the required aspects of the weekly status report.

 

Alejandro’s Status Report for 2/25

I met with the team and prepared for the presentation. I practiced for the design presentation by reading over all our content, ensuring it contained all the required information, and trying to internalize everything so that I could talk to the audience while barely having to read the slides (except for maybe reading specific numbers for data).

I looked into how we can detect the SNR of our signal by doing some research online. It turns out SciPy used to have a function that calculated the SNR of a signal, but it was deprecated. However, the code for that function is still available online, so we can use it to detect the SNR in our audio input validation system (here).
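For reference, a minimal re-implementation of what that deprecated helper (scipy.stats.signaltonoise) computed, based on the old source: the mean of the signal divided by its standard deviation.

import numpy as np

def signaltonoise(a, axis=0, ddof=0):
    # Reproduces the deprecated scipy.stats.signaltonoise: the ratio of
    # the mean to the standard deviation along the given axis.
    a = np.asanyarray(a)
    m = a.mean(axis)
    sd = a.std(axis=axis, ddof=ddof)
    return np.where(sd == 0, 0, m / sd)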

I also researched how the rhythm processor should work and how we might want to implement it (paper). After reading some articles, I figured it would be best to detect the onsets in a given audio signal to mark the start of each new note. Since we will be dealing with piano music, this makes sense: the onset of a new note is essentially instantaneous.

I decided to use the find_peaks() function from SciPy. I will iterate over the input signal in the time domain, looking at sections of the signal of a certain length (as of right now we said 1/8th of a beat) and trying to find peaks there. If I find a peak in a section I append a 1 to the found-peak array, and if not I append a 0. That is the main idea of the algorithm: it returns an array of 1s and 0s indicating whether there is a peak in each section of the signal (so the value at index 0 corresponds to the first section of the signal, the value at index 1 to the second section, and so on).
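As a rough illustration of this idea (not the actual code referenced below; the window length and height threshold here are placeholder assumptions, not the parameter values we still need to agree on):

import numpy as np
from scipy.signal import find_peaks

def detect_onsets(signal: np.ndarray, window_len: int,
                  min_height: float) -> list:
    # Returns one entry per section of the signal: 1 if a peak (note
    # onset) was found in that section, 0 otherwise.
    found_peaks = []
    for start in range(0, len(signal), window_len):
        section = signal[start:start + window_len]
        peaks, _ = find_peaks(section, height=min_height,
                              distance=window_len)
        found_peaks.append(1 if len(peaks) > 0 else 0)
    return found_peaks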

The code right now looks something like this, but it is still being tested; the find_peaks() function from SciPy is a little complex, and we still need to come to an agreement on what parameter values will make it work correctly. I will attach the code I wrote to both implement the algorithm and test it.

It can also be viewed here

Team Weekly Status Report for 2/25

Risks

Implementing parts of the design has revealed many edge cases that result in incorrect output. For example, the frequency processor picks up the signal before the musical input starts and calculates notes as if music were being played. Also, the segmentation method of calculating the Discrete Fourier Transform can often involve overlapping parts of the signal, which distorts the measured fundamental frequency.

Design Changes

We’ve made changes to the testing process. In addition to the rhythm and pitch accuracy requirements, we plan to score our website based on how a musician feels about the accuracy of the song. We will have users listen to the reference signal and the tested signal and rate on a scale of 1-10 how similar the two songs sound. Our goal is for all users to give a score of 8/10 or higher.

In terms of new work that each of us will take on based on the modified schedule, the changes are not significant. Kumar will be working on the frontend of the website, as he has to catch up from last week when he was sick and unable to put in any work. Aditya and Alejandro will be working on the same tasks specified in the previous Gantt chart, and Kumar will also be writing user feedback forms and distributing them to different users. We also added an optimization phase based on user feedback that we will all be working on.

Updated Schedule

Progress

Aditya’s Status Report 2/25

My work this week primarily focused on the Software area of our project. I worked within the django app to design a method of outputting a list of notes detected within a time signal.

The initial method involved examining one segment of N frames at a time, where N is one-quarter of the sample rate. I copied the samples in the segment into their own array, then ran the SciPy fft() method on it. However, this posed a problem because the smaller segment size resulted in a less accurate output in the frequency domain.

I changed my process. Instead of creating an array the size of only one segment, I copy the entire time signal and multiply it by a box window, so that every sample is 0 outside of the desired time range. This gives the same frequency accuracy as analyzing the entire signal at once. The output below is from running this method on the C-scale audio file.
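A rough sketch of this box-window approach (illustrative only; the variable names and the use of rfft are assumptions, not the exact implementation):

import numpy as np
from scipy.fft import rfft, rfftfreq

def dominant_freq(signal: np.ndarray, sample_rate: int,
                  start: int, end: int) -> float:
    # Zero out everything outside [start, end) and run the FFT over the
    # full-length signal, so the frequency resolution matches analyzing
    # the whole recording at once.
    windowed = np.zeros(len(signal))
    windowed[start:end] = signal[start:end]   # box window
    spectrum = np.abs(rfft(windowed))
    freqs = rfftfreq(len(windowed), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])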

One major error that’s put me behind schedule is that the size of the window results in capturing portions of the signal with differing frequencies. This results in the calculated fundamental frequency being a note between the two actual notes played: for example, an F# is detected at the border between the F and G notes of the C scale. I plan to make up for this by attempting to use Alejandro’s progress with the rhythm processor. I will try to determine how the pulse-detection algorithm can be applied to the frequency processor to prevent it from calculating the FFT at the points in the signal where a new pulse is detected. As integrating the processors was already part of the plan for next week, it won’t be a serious deviation from the schedule.

>>> from audio_to_freq import *
>>> getNoteList('')
[2030]
Note detected at [261.41826923]
C
[2030]
Note detected at [261.41826923]
C
[2030]
Note detected at [261.41826923]
C
[2030]
Note detected at [261.41826923]
C
[2030]
Note detected at [261.41826923]
C
[2278]
Note detected at [293.35508242]
D
[2278]
Note detected at [293.35508242]
D
[2278]
Note detected at [293.35508242]
D
[2567]
Note detected at [330.57177198]
F
[2544]
Note detected at [327.60989011]
E
[2566]
Note detected at [330.44299451]
F
[2562]
Note detected at [329.92788462]
F
[2713]
Note detected at [349.37328297]
F#
[2713]
Note detected at [349.37328297]
F#
[2710]
Note detected at [348.98695055]
F
[2716]
Note detected at [349.75961538]
F#
[3048]
Note detected at [392.51373626]
G#
[3040]
Note detected at [391.48351648]
G
[3049]
Note detected at [392.64251374]
G#
[3039]
Note detected at [391.35473901]
G
[3045]
Note detected at [392.12740385]
G#
['C', 'C', 'C', 'C', 'C', 'D', 'D', 'D', 'F', 'E', 'F', 'F', 'F#', 'F#', 'F', 'F#', 'G#', 'G', 'G#', 'G', 'G#']


This is an example of how the signal is modified to detect the frequency at a given point. The note detected is a middle C.