B2: iRecruit

iRecruit is an interview assistant intended to help software engineering jobseekers practice for the interview process. Today, candidates are challenged with navigating fully virtual interviews and learning how to conduct themselves during the two key interviews of the application process: the behavioral interview and the technical interview. Although several written guidelines and programming platforms exist, there are few opportunities to practice simulated interviews. The goal of iRecruit is to provide users with ways to practice for interviews through facial detection for behavioral interviews and speech recognition (a mix of signal processing and machine learning) for technical interviews.

Here is our final video: http://course.ece.cmu.edu/~ece500/projects/f20-teamb2/2020/12/08/final-video/.

Team’s Status Update for 12/04/20

This week, we continued working on the various components of iRecruit. Jessica worked on integrating the eye contact and screen alignment parts for option 1 of the behavioral interview options, and figured out how to store summary information for each video recording. The combined code alerts the user of both subpar eye contact and subpar screen alignment. There are setup phases for the eye detection and facial landmark detection parts, where the frame-of-reference coordinates for the center of the eyes, nose, and mouth are calculated during the first 5 seconds. If a user’s eyes, nose, or mouth drift off-center, the user is alerted with the appropriate category (eye contact or screen alignment). She also implemented the video summaries for the profile part of the behavioral interviews. A common text file keeps track of all the video summaries: for each video recording, the interview number, timestamp, number of times the user had subpar eye contact, and number of times the user had subpar screen alignment are appended on a new line in the file. Next week, she plans on integrating this text file with Django so that the video summaries show up on the profile page. She also plans on continuing to test the eye contact only option, and beginning to test the screen alignment only option and the integrated option.

Mohini worked on a couple of different things this week. Since she finished most of the implementation last week, she spent her time this week testing and refining. She refined the backend database model that keeps track of each user and their corresponding questions and answers. Then, she tested the speech recognition algorithm using both automated and manual testing: automated testing used a fixed testing data set of approximately 50 samples, whereas manual testing retrieved the testing input through the signal processing algorithm. She will continue this testing next week to determine the true accuracy of the algorithm. Lastly, Mohini started working on the final report and incorporating some of the design changes that we made throughout the project. Next week, she will start recording her parts for the final video demo.

Shilika worked on the web application and added finishing touches such as the Completed Technical Interview page and minor CSS changes. She also worked on testing the speech recognition algorithm: she ran the automated testing and collected data. Additionally, she ran the built-in Gaussian Naive Bayes (GNB) methods in Python to compare those results against the neural network. Next week, she will continue testing the performance of the speech recognition model, and work on the final video and report.

Shilika’s Status Update for 12/04/20

I spent this week working on finishing touches for the web application and on testing. In the web application, I completed the “Completed Technical Interview” page under the Profile page. After the user completes a technical question on our technical interview page, their answer and the question details get saved in a database. I display this database in a user-friendly format for the person to check their progress. In addition, I fixed some minor CSS issues in the web application, such as text formatting in the navigation bars.
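As a rough illustration, the “Completed Technical Interview” data could be stored with a Django model along these lines; the model and field names (CompletedTechnical, question_text, answer_text) are hypothetical placeholders rather than the exact schema iRecruit uses.

```python
# models.py -- hypothetical sketch of storing a completed technical answer.
from django.db import models
from django.contrib.auth.models import User

class CompletedTechnical(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)   # who answered
    question_text = models.TextField()                         # the prompt shown
    answer_text = models.TextField()                           # the user's answer
    completed_at = models.DateTimeField(auto_now_add=True)     # when it was saved

# Inside the view that handles a submitted answer, one entry would be created:
#   CompletedTechnical.objects.create(user=request.user,
#                                     question_text=question,
#                                     answer_text=answer)
```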

In addition to the web application, I worked on testing the neural network. I ran the algorithm multiple times through our automated testing to gauge the accuracy and found it to be consistently between 30% and 40%. Lastly, I ran our training and testing data through a built-in Gaussian Naive Bayes (GNB) algorithm to see whether the low accuracy of our neural network was due to the neural network itself or to our speech recognition algorithm. I found that the accuracy using GNB was 40%, which is very similar to our neural network; we will discuss this in our final report.
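For context, the GNB baseline comparison can be reproduced with scikit-learn in a few lines. The arrays below are random placeholders standing in for our actual signal-processing feature vectors and 8 word labels, so only the structure of the comparison (the same training/testing split as the neural network) is meaningful here.

```python
# Hypothetical sketch: Gaussian Naive Bayes baseline on placeholder data.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Placeholder feature vectors and labels (8 categories); in practice these
# would be the same training/testing data fed to the neural network.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 40)), rng.integers(0, 8, 200)
X_test, y_test = rng.normal(size=(50, 40)), rng.integers(0, 8, 50)

gnb = GaussianNB().fit(X_train, y_train)
print("GNB accuracy:", accuracy_score(y_test, gnb.predict(X_test)))
```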

This week, I plan on working on the video and the final report for our capstone project. I will also continue testing to get more metrics that we can present in the final report. Though the accuracy currently falls within a range, more data will give us a more concrete number for the final accuracy of our speech recognition algorithm.

Jessica’s Status Update for 12/04/2020

This week, I worked on combining the eye contact and screen alignment parts for the facial detection portion, and implemented a way to store summaries of each video recording. I was able to integrate the eye contact and screen alignment parts for option 1 so that the user is alerted of both subpar eye contact and subpar screen alignment. This required combining the two pieces of code I had written separately for options 2 and 3, so that there are setup phases for both eye detection and facial landmark detection, and the coordinates of the center of the eyes and of the nose/mouth are averaged during the initial 5 seconds. We then have separate frames of reference for the eyes, nose, and mouth. In the respective parts of the code, if the current eye, nose, or mouth coordinates are off-center, the user is alerted of the appropriate one (eye contact or screen alignment).
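A minimal sketch of this setup-and-check logic is below. It assumes a get_landmarks(frame) helper (for example, built on OpenCV/dlib facial landmark detection) that returns the eye-center, nose, and mouth coordinates for a frame; the helper name and the pixel tolerance are illustrative rather than the exact values in our code.

```python
# Hypothetical sketch of the 5-second setup phase and the off-center checks.
import time
import numpy as np

TOLERANCE = 40  # pixels a feature may drift from its reference (illustrative)

def setup_reference(cap, get_landmarks, seconds=5):
    """Average the (eye, nose, mouth) coordinates over the first few seconds."""
    samples = []
    start = time.time()
    while time.time() - start < seconds:
        ok, frame = cap.read()
        if ok:
            samples.append(get_landmarks(frame))  # ((x, y) eye, nose, mouth)
    return np.mean(np.array(samples, dtype=float), axis=0)

def check_frame(frame, reference, get_landmarks):
    """Return the alerts (if any) that should be raised for this frame."""
    eye, nose, mouth = get_landmarks(frame)
    ref_eye, ref_nose, ref_mouth = reference
    alerts = []
    if np.linalg.norm(np.subtract(eye, ref_eye)) > TOLERANCE:
        alerts.append("subpar eye contact")
    if (np.linalg.norm(np.subtract(nose, ref_nose)) > TOLERANCE or
            np.linalg.norm(np.subtract(mouth, ref_mouth)) > TOLERANCE):
        alerts.append("subpar screen alignment")
    return alerts
```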

At the beginning of our project, we were going to store the video recordings themselves and allow users to view them in the profile section. However, we decided that it would be more helpful to summarize the feedback from each video recording instead. There is a common text file (called behavioral_interview_output.txt) that stores the video summaries. We calculate the interview number by counting the number of lines in the text file, and retrieve the timestamp of when the video practice took place using the Python datetime library. We keep track of the number of times that the user had subpar eye contact and/or screen alignment during a video recording using count variables. The interview number, timestamp, subpar eye contact count, and subpar screen alignment count (for options 2 and 3, subpar screen alignment and subpar eye contact are “N/A,” respectively) are appended to the text file. This text file will be displayed on the behavioral section of the profile page for the user to access.
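The summary logic amounts to roughly the sketch below; the function name and the exact line format are illustrative, but the file name, the datetime usage, and the counts follow the description above.

```python
# Hypothetical sketch of appending one video summary line to the shared file.
from datetime import datetime

SUMMARY_FILE = "behavioral_interview_output.txt"

def append_summary(eye_contact_count, screen_alignment_count):
    # The interview number comes from counting the lines already in the file.
    try:
        with open(SUMMARY_FILE) as f:
            interview_num = sum(1 for _ in f) + 1
    except FileNotFoundError:
        interview_num = 1
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(SUMMARY_FILE, "a") as f:
        f.write(f"{interview_num}, {timestamp}, "
                f"{eye_contact_count}, {screen_alignment_count}\n")

# For options 2 and 3 the unused count is recorded as "N/A", e.g.:
# append_summary(3, "N/A")
```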

I believe that we are making good progress for the facial detection portion, as we are wrapping up the technical portions and were able to accomplish a lot of the corresponding profile part as well. Next week, I plan on integrating the text file of the video recording summaries into Django for the web application. I also plan on continuing testing for the eye contact only option, and beginning testing for the screen alignment only option and the integrated eye contact and screen alignment option. I would like to get an idea of the current accuracy of the systems. 

Mohini’s Status Report for 12/04/2020

Since I wrapped up the implementation last week, I spent most of my time this week testing and putting final touches on the web app. I updated the user model that keeps track of the user’s name and all their questions and answers. I tested this functionality thoroughly to ensure that if I log out and then log back in as a different user, the completed technical page displays the set of questions and answers corresponding to that specific user. The completed technical page is a running record of all the questions and answers that belong to the logged-in user.
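Conceptually, the per-user behavior comes down to filtering the completed questions by the logged-in user, roughly as in this sketch; it reuses the hypothetical CompletedTechnical model from Shilika's section above, and our actual view and model names may differ.

```python
# views.py -- hypothetical sketch of the per-user completed technical page.
from django.contrib.auth.decorators import login_required
from django.shortcuts import render

from .models import CompletedTechnical  # hypothetical model sketched earlier

@login_required
def completed_technical(request):
    # Only the logged-in user's questions and answers are fetched, so logging
    # in as a different user shows that user's history instead.
    entries = CompletedTechnical.objects.filter(user=request.user)
    return render(request, "completed_technical.html", {"entries": entries})
```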

Second, I formally tested the speech recognition algorithm in two different ways. First, I did automated testing with fixed testing data: I created a testing data file with approximately 50 samples representative of all 8 categories and ran the speech recognition algorithm to predict the category of each sample. The accuracy fell around 30-40% and varied significantly between different test cases. Next, I did manual testing that exercised the integration of the signal processing and machine learning components. Here, I created each testing sample by audio recording my voice and letting the signal processing algorithm decompose it. I created a spreadsheet to keep track of my manual testing; it records the true word and the predicted word, and shows a similar accuracy of 30-40% to the automated testing.
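The automated test is essentially a loop over the fixed testing file that compares each predicted category against its label, along the lines of this sketch; the predict_word function and the comma-separated file format are assumptions, not our exact interface.

```python
# Hypothetical sketch of the automated accuracy test over the fixed test set.
def evaluate(test_file, predict_word):
    correct = total = 0
    with open(test_file) as f:
        for line in f:
            *features, label = line.strip().split(",")  # features..., true word
            prediction = predict_word([float(x) for x in features])
            correct += (prediction == label)
            total += 1
    return correct / total if total else 0.0

# Example usage (names are placeholders):
# accuracy = evaluate("testing_data.txt", neural_net.predict_word)
```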

Lastly, I started working on the final report. I used our design report as a template and incorporated some of the many design changes that we made throughout the project. Some of the updates included word classification (rather than letter classification), the layout of the technical interviewing page, and the information that we stored in the profile page for the completed behavioral and technical pages. Next week, I plan on continuing to test the speech recognition algorithm through both automated and manual testing. I will also start recording my part for the final video demo. I believe we are making good progress as iRecruit is almost complete!