Shilika’s Status Update for 12/04/20

I spent this week working on finishing touches on the web application and on testing. In the web application, I completed the “Completed Technical Interview” page under the Profile page. After the user completes a technical question on our technical interview page, their answer and the question details are saved in a database. I display the contents of this database in a user-friendly format so the user can check their progress. In addition, I fixed some minor CSS issues in the web application, such as text formatting in the navigation bars.

In addition to the web application, I worked on testing the neural network. I ran the algorithm multiple times through our automated testing to gauge the accuracy and found it to be consistently between 30% and 40%. Lastly, I ran our training and testing data through a built-in Gaussian Naive Bayes (GNB) algorithm to see whether the low accuracy of our neural network was due to the neural network itself or to our speech recognition algorithm. I found that the accuracy using GNB was 40%. This is very similar to our neural network, and we will be discussing this in our final report.
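For reference, the Gaussian Naive Bayes comparison can be reproduced along these lines with scikit-learn (presumably the "built-in" implementation used here); the feature arrays below are random stand-ins for our actual spectral features, so the accuracy is only illustrative.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical stand-ins for our spectral feature vectors and class labels;
# the real data comes from the signal processing pipeline.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 16))
y_train = rng.integers(0, 8, size=200)
X_test = rng.normal(size=(50, 16))
y_test = rng.integers(0, 8, size=50)

# Fit the Gaussian Naive Bayes baseline and score it on held-out data.
clf = GaussianNB()
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # fraction of correctly classified samples
```

Comparing this score against the neural network's accuracy on the same split is what lets us separate model problems from feature problems.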

This week, I plan on working on the video and the final report for our capstone project. I will also continue testing to get more metrics that we can display in our final report. Though the accuracy has a range, more data will lead us to a more concrete number for the final accuracy of our speech recognition algorithm.

Shilika’s Status Update for 11/20/20

This week, I successfully added a hidden layer to the neural network. I worked with Mohini to gain a solid understanding of the back-propagation algorithm. This gave me a clear understanding of how to update the three parameters – alpha, alpha2, and beta – of the new neural network: alpha is the weight matrix between the input layer and the first hidden layer, alpha2 is between the first and second hidden layers, and beta is between the second hidden layer and the output layer. Using this knowledge, I incorporated the second hidden layer. In this algorithm, the two hidden layers have the same number of hidden units, which is determined by user input.
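A minimal sketch of how the three parameter matrices might be updated by back-propagation, assuming sigmoid hidden layers and a softmax output with cross-entropy loss as in our earlier network; the layer sizes are illustrative and bias terms are omitted for brevity:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def softmax(v):
    e = np.exp(v - v.max())  # subtract max for numerical stability
    return e / e.sum()

def forward(x, alpha, alpha2, beta):
    # alpha: input -> hidden 1, alpha2: hidden 1 -> hidden 2, beta: hidden 2 -> output
    z1 = sigmoid(alpha @ x)
    z2 = sigmoid(alpha2 @ z1)
    y_hat = softmax(beta @ z2)
    return z1, z2, y_hat

def gradients(x, y, alpha, alpha2, beta):
    # Back-propagate the cross-entropy loss through all three parameter matrices
    # for a single training example (y is a one-hot label vector).
    z1, z2, y_hat = forward(x, alpha, alpha2, beta)
    d_out = y_hat - y                      # dL/d(pre-softmax activations)
    g_beta = np.outer(d_out, z2)
    d_z2 = beta.T @ d_out
    d_b = d_z2 * z2 * (1 - z2)             # through the sigmoid of hidden layer 2
    g_alpha2 = np.outer(d_b, z1)
    d_z1 = alpha2.T @ d_b
    d_a = d_z1 * z1 * (1 - z1)             # through the sigmoid of hidden layer 1
    g_alpha = np.outer(d_a, x)
    return g_alpha, g_alpha2, g_beta
```

One SGD step would then subtract a learning rate times each gradient from the corresponding parameter matrix, e.g. `alpha -= lr * g_alpha`.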

After this, I began testing the algorithm with our current training data to gauge its accuracy compared to the algorithm with one hidden layer. The training and testing data with the new neural network have accuracies of 83.5% and 86.3% respectively, while the training and testing data with the previous neural network have accuracies of 1.5% and 77.2% respectively. The results on the training data were as expected, because an additional hidden layer reduces overfitting. This would result in a higher training error, since the algorithm fits the training data less exactly. However, I was disappointed to see the testing error slightly increase; I did not expect it to improve significantly, but I expected at least a marginal improvement, because additional hidden layers bring more non-linearity to the neural network, and thus to the classification, which should let it generalize the data better.

My next steps will be to continue improving the accuracy of the neural network. I will modify variables such as the amount of training data and the number of hidden units in each hidden layer. I will continue to monitor how these changes affect the accuracy of the algorithm and try to find the optimal parameters that achieve the highest accuracy. Additionally, I will start working on the web application again to complete the profile page.

Shilika’s Status Update for 11/13/20

This week, I focused on improving the accuracy of the neural network and working towards adding an additional layer to it. As mentioned last week, I worked on applying stochastic gradient descent to the additional parameters for the additional hidden layer between the first hidden layer and the output. I realized there were flaws in my algorithm from last week, as I had skipped over some variables while computing the gradient. Because I have been having a hard time understanding how to compute this gradient matrix, I created two flow charts – one of the original neural network we had and one with the additional hidden layer – to visualize how the values propagate to the output layer.

After visualizing the neural network, I outlined the algorithm used to compute the current alpha and beta parameters. This gave me more insight into what the computation would look like for the second alpha layer. As mentioned, I am in the process of coding and debugging the new algorithm for the additional parameter and would like to have it finished early next week. In addition to the neural network, I worked on cleaning up the CSS and HTML in the technical page and the navigation bars. This included small details such as fixing the alignment of the username, positioning the side navigation bar, and altering the headings of the components in the technical page to make it more user-friendly.

Next week, I plan to continue working on the technical interview page and the completed technical interview page within the profile page. I will also continue to work on improving the accuracy of the neural network by creating more training data, as this has been shown to improve the accuracy by about 5% for every 30-40 additional training samples we create.

Shilika’s Status Update for 11/06/20

This week, I worked on the web application components and the neural network portion. For the web application, I made the CSS and HTML for the login and register pages more user-friendly and appealing. The design now properly integrates with the rest of the web pages as well.

In addition to the CSS, I continued to work on saving the completed behavioral interview videos on the web page. I have not been able to properly display the video, as a blank video appears in every web browser I have tried, such as Firefox, Safari, and Chrome.

In the neural network portion, I worked with Mohini to continue developing the code we are using for our baseline (the neural network homework code from the Machine Learning course at Carnegie Mellon). In the beginning, the neural network was predicting the same output for every training and testing data point we provided. After debugging and testing, we realized that the order in which the training data is provided affects the final predictions. Despite varying the order of the training data, the accuracy on our testing data is still low, at approximately 42%. To improve the accuracy, I decided to implement an additional hidden layer in the neural network. The changes this requires are integrating an additional hidden layer after the first hidden layer, initializing the weights associated with it, performing stochastic gradient descent to optimize the weights/parameters, and connecting the new hidden layer to the two output classes. I am currently working on applying SGD to the parameters and have been running into index-out-of-bounds bugs.
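Since the ordering of the training data affected the predictions, one generic safeguard is to re-shuffle the data at the start of every epoch before taking SGD steps. This is a small sketch of that idea, not our exact training loop:

```python
import numpy as np

def sgd_epoch_orders(num_samples, num_epochs, seed=0):
    # Yield a fresh random ordering of the training indices for each epoch,
    # so no single fixed ordering biases the learned weights.
    rng = np.random.default_rng(seed)
    for _ in range(num_epochs):
        yield rng.permutation(num_samples)
```

Inside the training loop, each epoch would iterate `for i in order:` and take one SGD step on training example `i`.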

By next week, I hope to have completed this layer and run it to test whether the accuracy has increased. I will also continue to research other methodologies to improve the accuracy. I also hope to figure out how to display completed behavioral interview videos in the Django web app. I am behind in this aspect because I intended to finish it this week. To get back on track, I will reach out to my team to help me with this portion, as I have not been able to figure it out despite trying multiple possibilities that I found through online resources.

Shilika’s Status Report for 10/30/20

This week, after finalizing the output of the signal processing, I began to review the concepts of a neural network, which will be the next technical portion of our project. I will be working with Mohini to improve the neural network that we created in a Machine Learning course we previously took. This algorithm is a single-hidden-layer neural network that uses a sigmoid activation function for the hidden layer, a softmax function on the output layer, and the cross-entropy loss function to gauge the accuracy of our model. I reviewed the concepts behind these activation functions and how the output layer is formed from the input and hidden layers.
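As a reference for the architecture described above, a minimal forward pass with a sigmoid hidden layer, a softmax output, and a cross-entropy loss might look like the following; the layer sizes are illustrative, and bias terms are omitted for brevity:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def softmax(v):
    e = np.exp(v - v.max())  # subtract max for numerical stability
    return e / e.sum()

def predict(x, alpha, beta):
    # alpha: weights from input to hidden layer, beta: hidden layer to output
    hidden = sigmoid(alpha @ x)     # sigmoid activation on the hidden layer
    return softmax(beta @ hidden)   # softmax probabilities over the output classes

def cross_entropy(y_hat, y):
    # y is a one-hot label vector; lower loss means a more confident correct prediction
    return -np.sum(y * np.log(y_hat))
```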

I additionally started working on the web application components of our project again. I worked on how to run the Java code in Django and used the “copy path” command to be able to run the code from a separate directory. I also began working on the profile page again, which is where the user will be able to save their skill set and view previously recorded behavioral interviews. I improved the CSS for the profile page to make it more user-friendly and began to look at saving the videos locally in Django.
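One common way to invoke Java code from a Django view is to shell out with Python's subprocess module. This is a rough sketch of that approach; the `Grader` class name and classpath in the commented example are hypothetical placeholders, not our actual entry point:

```python
import subprocess

def run_external(cmd, timeout=30):
    # Run an external program (e.g. compiled Java code) and return its stdout.
    # check=True raises CalledProcessError if the program exits with a
    # non-zero status, so the view can report the failure.
    proc = subprocess.run(cmd, capture_output=True, text=True,
                          timeout=timeout, check=True)
    return proc.stdout

# Hypothetical invocation from a Django view:
# output = run_external(["java", "-cp", "/path/to/classes", "Grader", "input.txt"])
```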

Next week, my goal is to be able to save the videos in Django and allow the user to upload a profile photo to the profile page. Additionally, as soon as our training data is ready, I will start implementing ways in which our neural network can be improved to classify our 8 outputs.

Shilika’s Status Update for 10/23/20

This week, I built off of the signal processing work we did in the previous weeks to create the output of the signal processing algorithm. The process after reading the original input file is as follows:

  1. We first apply a pre-emphasis on the audio input:
    1. To do this, we use the equation y(t) = x(t) – alpha*x(t-1). The alpha value is a predetermined filter coefficient, usually 0.95 or 0.97.
    2. By doing so, we improve the signal-to-noise ratio by amplifying the signal.
  2. We then frame the updated signal:
    1. Framing is useful because a signal is constantly changing over time. Doing a simple Fourier transform over the whole signal would lose these variations through time.
    2. Thus, by taking the Fourier transform of adjacent frames with overlap, we preserve as much of the original signal as possible.
    3. We are using 20 millisecond frames with a 10 millisecond overlap.
  3. With the framed signal, we apply a Hamming window:
    1. A Hamming window reduces the effects of spectral leakage that occur when performing a Fourier transform on the data.
    2. To apply it, we use a simple line of code in Python.
  4. Fourier Transform and Power Spectrum:
    1. We can now take the Fourier transform of each frame and compute the power spectrum to distinguish different audio data from each other.
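The steps above can be sketched in Python with NumPy. The 16 kHz sample rate and the 0.97 pre-emphasis coefficient here are assumptions for illustration, not necessarily our final values:

```python
import numpy as np

def power_spectrum_frames(signal, sample_rate=16000, alpha=0.97):
    # 1. Pre-emphasis: y(t) = x(t) - alpha * x(t-1)
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])

    # 2. Frame into 20 ms windows with a 10 ms step (50% overlap between frames)
    frame_len = int(0.020 * sample_rate)   # 320 samples at 16 kHz
    step = int(0.010 * sample_rate)        # 160 samples
    num_frames = 1 + (len(emphasized) - frame_len) // step
    frames = np.stack([emphasized[i * step : i * step + frame_len]
                       for i in range(num_frames)])

    # 3. Hamming window each frame to reduce spectral leakage
    frames = frames * np.hamming(frame_len)

    # 4. Fourier transform and power spectrum
    spectrum = np.fft.rfft(frames, axis=1)
    power = (np.abs(spectrum) ** 2) / frame_len
    return power  # shape: (num_frames, frame_len // 2 + 1)
```

Each row of the returned array is the power spectrum of one 20 ms frame, which is the kind of fixed-size feature vector we can feed into the neural network.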

The output will continue to be modified and enhanced to make our algorithm better, but we now have something to input into our neural network. I began looking into filter banks and MFCCs, which are two techniques used to transform the data so it better reflects how the human ear perceives sound. I will continue this next week and, if time allows, help the team with the neural network algorithm.

Shilika’s Status Report for 10/16/20

This week, I worked with Mohini on the signal processing part. We needed to research and experiment with different ways to trim our audio and scale our x-axis to make all the final outputs the same length. We decided to take a different approach and analyze the Short-Time Fourier Transform (STFT) over 20 millisecond chunks of the whole audio file. After splitting the audio file and applying the Fourier transform to each chunk, we plotted the results on a spectrogram. Unlike before, we were able to see slight similarities when we said the same letter multiple times and differences between different letters. We additionally met with a PhD student who specializes in speech recognition. He gave us tips on how to further hone our input. For example, he recommended we use a Hamming window with a 50% overlap and scale the frequency values so the numbers aren’t too small.
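A minimal sketch of this analysis using SciPy's built-in STFT, with the recommended Hamming window and 50% overlap; the 16 kHz sample rate and the random audio below are placeholders for a real recording:

```python
import numpy as np
from scipy import signal as sps

fs = 16000                      # assumed sample rate
rng = np.random.default_rng(0)
audio = rng.normal(size=fs)     # stand-in for one second of recorded audio

nperseg = int(0.020 * fs)       # 20 ms analysis window
f, t, Zxx = sps.stft(audio, fs=fs, window="hamming",
                     nperseg=nperseg, noverlap=nperseg // 2)  # 50% overlap
spectrogram = np.abs(Zxx) ** 2  # power at each (frequency, time) bin
```

Plotting `spectrogram` against `f` and `t` gives the spectrogram view we used to compare repetitions of the same letter.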

I believe I am still on schedule. The goal last week was to have an output ready so we could use it as the input for the neural network. Though the output needs more modifications, we were able to come up with a solution. This week, I hope to continue my work on the signal processing portion, add all the modifications recommended by the PhD student, and solidify the output of the signal processing algorithm.

Shilika’s Status Report for 10/09/2020

This week, I continued to work on the web application platform, questions database, and signal processing. I completed the HTML and CSS of the profile page on our website. This page allows the user to upload a picture of themselves and contains links to setting up their initial video, picking their skill set, and accessing their previous videos. These links will lead to those pages once we have our facial detection and speech processing functioning.

I also continued to work on the questions database. I completed the behavioral database, which contains approximately 100 questions that will be randomly assigned to the user. For the technical database, we are collecting a question, an example output, an output for the user to test their code with, and the correct answer for each question. Additionally, for each category (arrays, linked lists, etc.), we will have easy, medium, and hard questions. So far, I have added nine questions with examples, outputs, and output answers from LeetCode, and I will continue to add questions routinely.

Lastly, I continued the work on the signal processing portion. Building on the foundation from previous weeks, I gained an understanding of what the input into our neural network should look like. I refined and added to my previous code, which now stores the audio in a more accurate integer array, breaks the input into small chunks of audio, and outputs the values in a user-friendly format. I worked with Mohini to see if there are any patterns or similarities between each individual letter, and we were able to find commonalities in the audio signal.

I believe my progress is on schedule. Next week, I hope to continue adding to the technical database and have an input ready for our neural net. This input will go through many iterations of refinement, but my goal is to have proper, calculated values.

Shilika’s Status Report for 10/02/2020

This week, I created the navigation bars that will be used across the pages in our web application. The top navigation bar has three main components:
  1. The menu button opens the side navigation bar. It has two additional buttons, one that leads to the behavioral interview page and the other that leads to the technical interview page.
  2. The profile button leads to your profile page.
  3. The help button leads to a page in which our web application features are explained.
The side navigation bar has two buttons that lead to the behavioral and technical interview pages.

I also began creating the behavioral and technical databases. I consulted online websites and collected common questions asked in behavioral and technical interviews for software engineering roles.

Lastly, I researched the steps of our speech processing algorithm to detect the letters that the user speaks. So far, I have been able to successfully read the audio, convert it to an integer array, and graph it. These preliminary steps are the foundation of creating the data we will feed into our neural network.
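A small self-contained sketch of the read-and-convert step, assuming the audio is stored as a WAV file (`scipy.io.wavfile` is one common way to get the integer array). The synthesized tone below stands in for a real recording so the sketch runs on its own:

```python
import os
import tempfile
import numpy as np
from scipy.io import wavfile

# Write a synthetic one-second 440 Hz tone; in practice this would be the
# recorded letter audio captured from the user.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = (0.5 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
path = os.path.join(tempfile.mkdtemp(), "sample.wav")
wavfile.write(path, sample_rate, tone)

# Read the audio back: `data` is the integer array we can graph or chunk.
rate, data = wavfile.read(path)
```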

I believe that my progress is on schedule. Next week, I aim to complete the CSS and HTML for the user profile page, finish collecting questions for the databases, and get a solid understanding of how the Fourier transform can be used in Python to pre-process the audio signal we are receiving.