Kumar Status Report 2/25

This week I worked on the front-end of the app as well as on the user feedback and testing forms.

My main goal was to start building basic HTML pages that encompass the core functionalities of our web app: login, logout, registration, file upload, and selecting file options for the recording. These all extend a base.html template.

I also started working on the backend Python scripts that will drive these functionalities. Some screenshots are shown below; I'm currently fixing some bugs, which is why the code appears to have errors. I am now working on the file-upload functionality.
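
To give a rough idea of where the file-upload view is headed, here is a minimal Django sketch; the form field, template, and URL names are placeholders rather than our actual code:

import os
from django.shortcuts import render, redirect

def upload_file(request):
    # 'recording' is a placeholder form-field name.
    if request.method == 'POST' and request.FILES.get('recording'):
        recording = request.FILES['recording']
        # Save the upload in chunks; a real version would validate the
        # file type and associate the file with the logged-in user.
        with open(os.path.join('uploads', recording.name), 'wb+') as dest:
            for chunk in recording.chunks():
                dest.write(chunk)
        return redirect('file_options')  # placeholder URL name
    return render(request, 'upload.html')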

I also researched helpful Django libraries for Python, such as Django CKEditor, which should help with the UI of the web app.

I would consider myself back on track given last week's sickness. My deliverables before spring break are fixing the file-upload functionality and adding more to the HTML pages.

 


 

Alejandro’s Status Report for 2/25

Met with the team and prepared for the presentation. I practiced for the design presentation by reading over all our content, ensuring it contained all the required information, and internalizing everything so that I could speak to the audience while barely reading the slides (except perhaps for specific numbers in the data).

I looked into how we can detect the SNR of our signal by doing some research online. It turns out SciPy used to have a function that calculated the SNR of a signal, but it was deprecated. However, the code for that function is still available online, so we can use it to measure the SNR in our audio input validation system (here).
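
For reference, the deprecated function (scipy.stats.signaltonoise) was short; this is essentially the code that used to ship with SciPy:

import numpy as np

def signaltonoise(a, axis=0, ddof=0):
    # Ratio of the mean to the standard deviation along an axis,
    # as in the removed scipy.stats.signaltonoise.
    a = np.asanyarray(a)
    m = a.mean(axis)
    sd = a.std(axis=axis, ddof=ddof)
    return np.where(sd == 0, 0, m / sd)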

I also researched how the rhythm processor should work and how we might want to implement it (paper). After reading some articles, I concluded it would be best to detect the onsets in a given audio signal to find the start of each new note. Since we will be dealing with piano music, this approach makes sense: the onset of a new note is nearly instantaneous.

I decided to use the find_peaks() function from SciPy. I will iterate over the input signal in the time domain, looking at sections of a fixed length (as of right now we said 1/8th of a beat) and trying to find peaks in each. If I find a peak in a section I append a 1 to the result array, and if not I append a 0. The algorithm therefore returns an array of 1s and 0s indicating whether each section of the signal contains a peak (index 0 corresponds to the first section, index 1 to the second, and so on).
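
A minimal sketch of that loop (the bpm-based section length, the height threshold, and the function name are placeholder assumptions, not settled values):

import numpy as np
from scipy.signal import find_peaks

def detect_onsets(signal, sample_rate, bpm, height=0.1):
    # Each section covers 1/8 of a beat, converted to samples.
    section_len = int(sample_rate * 60 / (bpm * 8))
    found = []
    for start in range(0, len(signal), section_len):
        chunk = np.abs(signal[start:start + section_len])
        # find_peaks has several parameters (height, distance, ...)
        # whose values we still need to agree on.
        peaks, _ = find_peaks(chunk, height=height)
        found.append(1 if len(peaks) > 0 else 0)
    return found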

The code currently looks something like this, but it is still being tested: the find_peaks() function from SciPy is somewhat involved, and we will need to agree on what parameter values make it work correctly. I will attach the code I wrote to both implement and test the algorithm.

It can also be viewed here

Team Weekly Status Report for 2/25

Risks

Implementing parts of the design has revealed many edge cases that result in incorrect output. For example, the frequency processor picks up the signal before the musical input starts and calculates notes as if music were being played. Also, the segmentation method of calculating the Discrete Fourier Transform can involve overlapping parts of the signal, which distorts the measured fundamental frequency.

Design Changes

We've made changes to the testing process. In addition to the rhythm and pitch accuracy requirements, we plan to score our website based on how a musician feels about the accuracy of the song. We will have users listen to the reference signal and the tested signal and rate on a scale of 1-10 how similar the two sound. Our goal is for all users to give a score of 8/10 or higher.

In terms of new work that each of us will take on based on the modified schedule, the changes are not significant. Kumar will be working on the frontend of the website, as he has to catch up from last week when he was sick and unable to put in any work. Aditya and Alejandro will be working on the same tasks as specified in the previous Gantt chart, and Kumar will also write user feedback forms and distribute them to different users. We also added an optimization section based on user feedback that we will all work on.

Updated Schedule

Progress

Aditya’s Status Report 2/25

My work this week primarily focused on the software area of our project. I worked within the Django app to design a method of outputting a list of notes detected within a time signal.

The initial method involved examining one segment of N frames at a time, where N is one-quarter of the sample rate. I copied the samples in the segment to their own array, then ran the SciPy function fft() on it. However, this posed a problem because the smaller segment gave a lower-resolution, less accurate output in the frequency domain.

I changed my process. Instead of creating an array the size of only one segment, I copy the entire time signal and multiply it by a box window, so that every sample is 0 outside the desired time range. This gives the same frequency accuracy as analyzing the entire signal at once. The listing further below shows the output for a C-scale recording.
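
In sketch form, the box-window approach looks something like this (the function name and the plain argmax peak-pick are simplifications of the real code):

import numpy as np
from scipy.fft import fft

def segment_fundamental(signal, sample_rate, start, length):
    # Zero out everything outside [start, start + length): a box
    # window over the full-length signal, which keeps the FFT bin
    # resolution of analyzing the entire signal at once.
    windowed = np.zeros(len(signal))
    windowed[start:start + length] = signal[start:start + length]
    spectrum = np.abs(fft(windowed))
    # Strongest bin in the positive-frequency half; a robust
    # fundamental-frequency estimate needs more care than this.
    peak_bin = int(np.argmax(spectrum[:len(spectrum) // 2]))
    return peak_bin, peak_bin * sample_rate / len(signal)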

One major error that has put me behind schedule is that the size of the window results in capturing portions of the signal with differing frequencies. This results in the calculated fundamental frequency being a note between the two actual notes played: for example, an F# is detected at the border between the F and G notes of the C scale. I plan to make up for this by building on Alejandro's progress with the rhythm processor. I will try to determine how the pulse-detection algorithm can be applied to the frequency processor, so that it does not calculate the FFT at the points in the signal where a new pulse is detected. As integrating the processors was already part of the plan for next week, this won't be a serious deviation from the schedule.

>>> from audio_to_freq import *
>>> getNoteList('')
[2030]
Note detected at [261.41826923]
C
[2030]
Note detected at [261.41826923]
C
[2030]
Note detected at [261.41826923]
C
[2030]
Note detected at [261.41826923]
C
[2030]
Note detected at [261.41826923]
C
[2278]
Note detected at [293.35508242]
D
[2278]
Note detected at [293.35508242]
D
[2278]
Note detected at [293.35508242]
D
[2567]
Note detected at [330.57177198]
F
[2544]
Note detected at [327.60989011]
E
[2566]
Note detected at [330.44299451]
F
[2562]
Note detected at [329.92788462]
F
[2713]
Note detected at [349.37328297]
F#
[2713]
Note detected at [349.37328297]
F#
[2710]
Note detected at [348.98695055]
F
[2716]
Note detected at [349.75961538]
F#
[3048]
Note detected at [392.51373626]
G#
[3040]
Note detected at [391.48351648]
G
[3049]
Note detected at [392.64251374]
G#
[3039]
Note detected at [391.35473901]
G
[3045]
Note detected at [392.12740385]
G#
['C', 'C', 'C', 'C', 'C', 'D', 'D', 'D', 'F', 'E', 'F', 'F', 'F#', 'F#', 'F', 'F#', 'G#', 'G', 'G#', 'G', 'G#']


This is an example of how the signal is modified to detect the frequency at a given point. The note detected is a middle C.

Aditya Status Report for 2/18

This week I developed code to detect the notes within a frequency array. The code isolates a relevant portion of the array and determines the average frequency of the segment. It then applies a mathematical formula that determines the note associated with a frequency based on the reference note, which in our case is A4 = 440 Hz.
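
That formula is the standard twelve-tone equal-temperament mapping. A minimal sketch (the function and list names are placeholders, not our actual code):

import math

NOTE_NAMES = ['A', 'A#', 'B', 'C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#']

def note_from_frequency(freq, reference=440.0):
    # Semitone distance from A4 = 440 Hz, rounded to the nearest note.
    semitones = round(12 * math.log2(freq / reference))
    return NOTE_NAMES[semitones % 12]

# e.g. note_from_frequency(261.63) returns 'C' (middle C).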

I also built the file hierarchy of a Django app and began integrating our starter code into it. I designed a structure to hold information for a Note object, to build a database of notes for the web app to access when sending information to the front-end.
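
As a rough illustration of the kind of structure I mean, a Django model for a Note might look like this sketch (the fields are hypothetical; the real schema is still being designed):

from django.db import models

class Note(models.Model):
    # Hypothetical fields, to be refined as the processors mature.
    pitch = models.CharField(max_length=3)  # e.g. 'C', 'F#'
    octave = models.IntegerField()
    start_beat = models.FloatField()        # where the note begins
    duration = models.FloatField()          # length in beats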

My plan for next week is to design and implement the process of applying the Discrete Fourier Transform to get the frequency arrays that will be processed by the detectNotes() method. That will give me a sequence of notes, each one-quarter beat long, which will be integrated with the rhythm processor to determine where each Note should begin. After these have been integrated, the Note objects will be built.

Team Weekly Status Report for 2/18

Principles of engineering, science and mathematics

  • Short-Time Fourier Transform of a time domain signal. (18-290)
  • Utilization of frequency domain vs time domain to analyze different signals for different purposes (18-290)
    • Time domain for the rhythm processor: we can detect the start of a new note by detecting a sudden drop and rise in energy. This will allow us to detect the length of each note and when one note ends and the next begins.
    • Frequency domain for the frequency processor: thanks to the Short-Time Fourier Transform we will be able to detect the average frequency over a specific interval (the length of our box in the STFT), and from this frequency we can determine which note is being played.
  • Signal-to-noise ratio (18-100, 18-290)
    • Our system will reject audio recordings that have a low SNR.
  • Object Oriented Programming (15-112)
    • We will need different objects in our code for the characteristics of the Fourier transforms, notes, audio files, etc.
  • Web Applications (17-437)
    • Our system will require a web application for the users to access the system from their laptops. 

Significant Risks 

Unfortunately, this week we fell a little behind schedule. One of our team members was sick for the whole week and therefore unable to make progress on the project. We also had trouble finding software libraries to test our frequency processor against, due to installation issues that took longer than expected. For example, we tried installing a Python library called CREPE, which returns the frequencies found in a given audio sample; we planned to compare its output with our frequency processor's. The problem is that CREPE requires TensorFlow, and installing TensorFlow gave us a lot of issues. After unsuccessfully trying to install CREPE, we moved on and found another library called Parselmouth, which was very easy to install. We will therefore use Parselmouth to test our frequency processor: it returns all the frequencies of a given audio file, so we can compare its output to the output of our own processor.

The risk I see, therefore, is not being able to finish everything on time. However, I think that by putting in more work in the upcoming weeks we should be able to get back on track. Working on the Design Review helped us reorganize and adjust our plans based on this week's setbacks.

Changes made to design 

We have not made any changes to our overall design, but we have further developed the specifics of the sub-processors and the transcription engine. We have determined a method of isolating the pitches in specific segments of the signal, a binary mapping system to determine when a new key-press occurs, and a design structure that holds this information in a form that the VexFlow library can easily consume.

Updated schedule

We have updated our schedule slightly, to make the process of building the sub-processors more concurrent and to add sections for the user-testing part of our project.

Photos of progress

Manually examining/testing pitch over time with Parselmouth, compared to SciPy (our previous method). Parselmouth is more accurate and easier to interpret.

 

Kumar Status Report 2/18

This week was a mixed week for me. From last weekend until Wednesday afternoon, I was struggling with intense gastritis and stomach issues; hence, I missed almost all my classes and wasn't really able to work at a screen. Since then, I have been playing catch-up on the front-end work. I created and installed the Django app we will be using and set up the repository, as can be seen in this picture. I am now building the HTML pages we will use, including a welcome screen, file upload, etc.

As pitch processing is the crux of the main challenge we expect to face, I also assisted Aditya and Alejandro with the Python library CREPE, which requires TensorFlow. After much struggle we all had to abandon this approach, and the other status reports explain how they moved forward.

 

Finally, I worked with my teammates on the design review slides in terms of ideating on testing methodologies, technology we will use, software implementation plan, etc.

 

I would say I'm slightly behind given the time I lost to my sickness (I had to request extensions in all my classes and received a UHS note). I plan to get back on track by redirecting all my focus to the Django front-end and reacquainting myself with models.py and databases in Django.

 

Alejandro’s Weekly Status Report for 2/18

The following courses were helpful in the design principles used in our project:

  • 18-290: Fourier Transforms, Frequency Domain vs Time Domain Signals, Signal to Noise Ratio.
  • 15-112: Object Oriented Programming.
  • 17-437: Web Applications.

During this week I mainly focused on trying to find software that will help us develop the frequency processor. Last week I found that the pitch function from MATLAB would be quite useful to test against our own frequency processor: given an audio signal, it returns all the frequencies found in that signal, in order. We can therefore compare its output to our own frequency processor's to determine how accurate ours is.

The problem is that we need to write this code in Python, since we are building our web app with Django. At first I tried a public library called CREPE. However, it was really hard to install, and I was not successful: I tried for two days, but the library requires TensorFlow, which caused too much trouble for me as well as for my teammates.

I decided to look for other libraries and found one called Parselmouth. This library was easy to install and has a pitch function to determine the different pitches in a given audio file. The software seemed to correctly detect the frequencies in a couple of files that we had, so we will be using it to test our frequency processor.
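
Basic usage looks something like the following sketch (the filename is a placeholder):

import parselmouth

# Load a recording and extract its pitch track.
snd = parselmouth.Sound("c_scale.wav")
pitch = snd.to_pitch()
# One pitch estimate (Hz) per analysis frame; unvoiced or silent
# frames come back as 0.
frequencies = pitch.selected_array['frequency']
for t, f in zip(pitch.xs(), frequencies):
    if f > 0:  # skip silent frames
        print(f"{t:.2f} s: {f:.1f} Hz")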

These are the outputs of a C-note and a C-scale: 

I have also been working with my teammates on laying out all the content for the Design Review, as well as setting up the database for our Django application.

I would say my progress is on schedule. There have been some modifications to our Gantt chart; this upcoming week I will focus on researching how to build the rhythm processor and on preparing to give the Design Review presentation.

Kumar 2/11 Progress Report

This week I focused heavily on the design presentation slides, for example working on the Gantt chart and figuring out the best division of labor.

 

I then began researching and refreshing my skills for the front-end web app that I will be designing, while Alejandro and Aditya started working on the back-end. This reflects the overall tasks and division of labor we outlined in our proposal presentation. I refreshed my skills with a basic Django tutorial from https://docs.djangoproject.com/en/4.1/intro/tutorial01/ and various YouTube sources.

 

Finally, after some exploration, I’ve attached two screenshots of the layouts and user interfaces that we roughly hope to replicate with our final app design based on the apps “Voice Recordings” and “Genius Scan”. 

Alejandro’s Status Report for 2/11

This week I helped create the proposal slides for the presentation that Aditya gave on Wednesday.

I was also supposed to think about the design of the data structure for the audio file in our code. I propose the following: the data structure would contain the length of the audio in minutes, the sampling rate in samples per second, the number of samples taken throughout the whole audio, and an object containing the Short-Time Fourier Transform of the audio.

class Audio:
    def __init__(self, length, sampling_rate, samples, stft):
        self.length = length                # length of the audio in minutes
        self.sampling_rate = sampling_rate  # samples per second
        self.samples = samples              # total number of samples
        self.stft = stft                    # Short-Time Fourier Transform contents

I also researched some helpful tools that we could utilize for our frequency processor. I found that MATLAB offers a function called pitch() that, given an audio input, outputs the frequencies found in the audio. This would be ideal for detecting which note frequencies are being played at a given time. For example, this is the output of the function for a C-scale audio:

and this is the output for a constant C note:

The only issue is that when the audio goes quiet we get inconsistent frequencies, as in the first image at the beginning and the end. Therefore, at moments of silence we will have to ignore the output of this algorithm; this might be a challenge we have to deal with in the future.
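
One simple way we could ignore those silent stretches in our own Python pipeline is to mask out pitch estimates for low-energy frames; a rough sketch (the array shapes and threshold are assumptions that would need tuning):

import numpy as np

def mask_silence(freqs, frames, threshold=0.01):
    # freqs: one pitch estimate per frame; frames: 2-D array holding
    # the raw samples of each frame. Frames whose RMS energy falls
    # below the (placeholder) threshold are treated as silence.
    rms = np.sqrt(np.mean(np.square(frames), axis=1))
    return np.where(rms >= threshold, freqs, np.nan)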

My progress is on schedule. Next week I will focus on designing the data structure for the Short-Time Fourier Transform, as well as helping my team integrate the frequency processor into the Python back-end.