Team Status Report for 4/16/22

The most significant risks that would jeopardize the success of our project are the accuracy of the static model predictions, the accuracy of the dynamic models (a post-MVP concern), and minor bugs such as our start-stop recording not timing out correctly.

In regard to static model accuracy, we are managing this risk by examining each model to determine which signs are causing issues and by experimenting with different epoch counts and prediction-threshold values to see whether other combinations improve accuracy. Given that there are only two weeks left in the semester, if the accuracy does not improve we will note this in the report but keep the models in the platform as they are, since they are performing well overall.
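As a rough sketch of what this sweep looks like in spirit (the helper name, data variables, and the specific epoch/threshold values here are illustrative placeholders, not our actual training code):

```python
# Hypothetical sketch of sweeping epoch counts and prediction thresholds.
# build_model(), X_train, y_train, X_test, y_test stand in for our real
# training code and data; the values swept are placeholders.
import numpy as np

def accuracy_at_threshold(model, X_test, y_test, threshold):
    """Count a prediction as correct only if the top class matches the label
    and its probability clears the confidence threshold."""
    probs = model.predict(X_test, verbose=0)
    top = np.argmax(probs, axis=1)
    confident = probs[np.arange(len(top)), top] >= threshold
    return float(np.mean((top == y_test) & confident))

results = {}
for epochs in (25, 50, 100):
    model = build_model()  # hypothetical helper that builds a fresh Keras model
    model.fit(X_train, y_train, epochs=epochs, verbose=0)
    for threshold in (0.5, 0.7, 0.9):
        results[(epochs, threshold)] = accuracy_at_threshold(model, X_test, y_test, threshold)

best = max(results, key=results.get)
print("best (epochs, threshold):", best, "accuracy:", results[best])
```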

As for the dynamic models, this is a post-MVP portion of our project, and many of the two-handed moving signs are currently not being detected accurately. To manage this risk, based on feedback from Tamal and Isabel, we are checking the number of hands present in frame for two-handed tests and immediately marking the attempt as incorrect if only one hand is present. Additionally, we are looking into the flickering of MediaPipe landmarks that occurs when the hands blur in motion in the middle of a sign; we are considering padding or removing those blurred frames. As a contingency plan, given that there are only two weeks left, we will most likely keep the dynamic model in our platform as is if we can't improve it and address the inaccuracies in our report.
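A minimal sketch of the hand-count check we have in mind, assuming frames arrive as OpenCV BGR images (the function name and argument are illustrative):

```python
# Sketch: reject two-handed signs when fewer than two hands are visible.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def enough_hands_for_sign(frame_bgr, hands_required=2):
    """Return False if fewer hands than required are detected in the frame,
    so a two-handed sign can be marked incorrect before running the model."""
    with mp_hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
        results = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    detected = len(results.multi_hand_landmarks or [])
    return detected >= hands_required
```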

In regard to the minor bugs, like the start-stop timeout, we are using logs and online forums to try to figure out why our recording is not actually stopping. If the issue persists, we will reach out on Slack for help with the error. It should be noted that this is not a major problem, as the main purpose of our platform (having users sign and receive predictions on whether their signing is correct) is not affected by it. However, if we cannot fix the issue, we will simply instruct users on how to work around it (i.e., to record another attempt they must refresh the page instead of pressing start again).

There have not been changes to our system design or schedule.

Valeria’s Status Report for 4/16/22

This week I worked with Aishwarya to finish the integration between the machine learning model and the web application and to have the results shown on screen for the user. We accomplished this and have now moved on to refining how these results are presented. Apart from that, I have also been working on creating a test module that lets users pick what kinds of signs they want to be tested on. As of now, I have the model and the HTML template done. I'm currently trying to figure out how to send the data from the HTML to views.py through a POST request. I have also added the rest of the tips for the signs, so those can now be viewed as well.
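A minimal sketch of how the category selection might be read in views.py, assuming the HTML form posts checkboxes named "categories" (the view and template names are placeholders, not the actual code):

```python
# views.py sketch: reading the user's chosen test categories from a POST request.
from django.shortcuts import render

def start_test(request):
    if request.method == "POST":
        # getlist() collects every checked checkbox named "categories"
        categories = request.POST.getlist("categories")
        # ...build a quiz from the chosen categories...
        return render(request, "test_module.html", {"categories": categories})
    return render(request, "choose_categories.html")
```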

My progress is on schedule. For next week, I hope to finish the test module and figure out how to send that POST request data. My other goals are to finish the slides for the final presentation and to help Hinna and Aishwarya with whatever else we need to wrap up before the presentation.

Aishwarya’s Status Report for 4/16/22

This week I parsed the video data for dynamic signs (those requiring movement) and trained two models for interpreting them. There was a bug where feature extraction failed on some of the video data, and I realized this was because the beginning and end of many of the videos had no hands in frame (due to the user starting and stopping the camera). I changed the preprocessing so that these frames are padded with zeros when formatting the landmark data into arrays. I also debugged why the models were not being loaded correctly in the back end of the web app (which turned out to be an issue with how the model was being saved to a file after training). I further looked into how we could conveniently record accuracy data for models at various epochs, and changed the code so that multiple trained models are saved at intervals until the final epoch is reached.
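A rough sketch of those two changes, assuming 21 landmarks with (x, y, z) each and a 10-epoch save interval (both numbers are illustrative, and `model` is assumed to be defined by the surrounding training script):

```python
# Sketch: zero-pad frames with no detected hands, and save checkpoints
# every few epochs so accuracy can be compared across training lengths.
import numpy as np
import tensorflow as tf

NUM_LANDMARK_VALUES = 21 * 3  # 21 MediaPipe landmarks, (x, y, z) each (assumed layout)

def landmarks_or_zeros(frame_landmarks):
    """Use the extracted landmark vector if hands were found, otherwise a zero
    vector, so every video yields a fixed-length sequence of frame features."""
    if frame_landmarks is None:
        return np.zeros(NUM_LANDMARK_VALUES, dtype=np.float32)
    return np.asarray(frame_landmarks, dtype=np.float32)

# `model` is the Keras model being trained elsewhere in the script.
save_every_ten = tf.keras.callbacks.LambdaCallback(
    on_epoch_end=lambda epoch, logs: model.save(f"dynamic_model_epoch_{epoch + 1}.h5")
    if (epoch + 1) % 10 == 0 else None
)
# model.fit(X, y, epochs=100, callbacks=[save_every_ten])
```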

My progress is on schedule. The only concern I have is a bug with the timer in our code that stops the user's input feed after 5 seconds: it fires multiple times, so sometimes the user cannot record a second or third attempt at a given sign. During the next week, I hope to resolve this issue and tune the dynamic model so that its accuracy is high enough to add to the app. I also hope to continue collecting data on model accuracy while varying parameters for our final design report/documentation. Lastly, I hope to add a bit more logic to the back end of the web app where sign grading is completed (e.g., marking a sign incorrect if the user has the wrong number of hands present in frame).

Hinna’s Status Report for 4/10/22

Throughout this week I was feeling fairly sick, so I did not accomplish as much as I intended to. However, I was able to continue creating testing/training data (specifically focusing on our 15 dynamic signs) and to keep working locally with our machine learning models to see which groupings of signs are performing the worst (currently the fist model). Additionally, we did a virtual dry run-through of our interim demo (virtual because our whole group was sick), and based on TA/professor feedback, I started looking into normalizing the inputs to our model so that slight variations in distance or hand position do not affect the predictions. I will continue this research along with my other group members this week to determine whether the normalization will be feasible in the time we have left.
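A minimal sketch of the normalization idea being evaluated: re-center each hand on the wrist and rescale by a hand-size measure so position in frame and distance from the camera matter less (this is one candidate approach, not a settled implementation):

```python
# Sketch: normalize a (21, 3) array of MediaPipe hand landmarks.
import numpy as np

def normalize_landmarks(landmarks):
    pts = np.asarray(landmarks, dtype=np.float32)
    pts = pts - pts[0]              # translate so the wrist (landmark 0) is the origin
    scale = np.linalg.norm(pts[9])  # wrist-to-middle-finger-base distance as hand size
    return pts / scale if scale > 0 else pts
```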

Additionally, given that there are only 3 weeks left in the semester, I began planning for user testing by researching the best webapp user satisfaction questions to ask, reaching out to potential testers (other students at CMU), and creating a testing schedule. Given that we initially planned to start user testing last week, this portion of the project is behind schedule.

Overall, we are slightly behind schedule due to our group-wide illnesses this past week and the fact that our models are not working for every letter, which has pushed back user testing. In order to get back on track, we have scheduled two additional meetings this week (outside our normal class time and previously set weekly meetings) so that we can dedicate more focused work time to the project.

Next week, we hope to have the static models fully done and the dynamic models significantly progressing (with at least 7/15 having 80%+ prediction rates). Additionally, we plan to start implementing a quiz/testing menu option on the webapp, where users can be quizzed on a random subset of signs. Finally, we plan to have a user testing schedule finalized and ready to go so we can get some feedback in these final weeks.

Aishwarya’s Status Report for 4/10/22

This week, I refined the integration of the web app and our neural networks. Previously, for static signs, we downloaded a video and sent it to our Python code, which extracted features, ran one of the models on that input, and generated a prediction for the sign the user performed. I changed it so that feature extraction is done directly in the JavaScript back-end portion of the web app for each frame of the camera input. An array of this data is sent as part of a POST request to a Python server, which generates and sends back a prediction response. I brought 5 separate models into the backend that are loaded upon webapp start-up. This removed the additional latency I observed the week before, which came from having to load a model with its structure and weights every time a prediction needed to be made. This integration appears to work smoothly, though we still need to refine an implementation that takes a video input from the user in order to support dynamic sign prediction. In addition to this work on the web app, I continued tuning the models and created some additional video data to be used as training samples for our dynamic signs (conversational sign language that requires hand movements).
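A simplified sketch of that flow on the Python side, assuming a Django-style view; the model file names, group names, and JSON field names below are placeholders rather than the exact code:

```python
# Sketch: load the sign-group models once at start-up and serve predictions
# for landmark arrays sent in a POST request.
import json
import numpy as np
import tensorflow as tf
from django.http import JsonResponse

# Loaded once at import time (webapp start-up), not per request, which avoids
# the per-prediction model-loading latency described above.
MODEL_NAMES = ("one_finger", "two_finger", "fist", "open_hand", "thumb")  # illustrative names
MODELS = {name: tf.keras.models.load_model(f"models/{name}.h5") for name in MODEL_NAMES}

def predict(request):
    payload = json.loads(request.body)
    landmarks = np.asarray(payload["landmarks"], dtype=np.float32)[np.newaxis, :]
    probs = MODELS[payload["model"]].predict(landmarks, verbose=0)[0]
    return JsonResponse({"prediction": int(np.argmax(probs)),
                         "confidence": float(np.max(probs))})
```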

My progress is on schedule. This coming week, I hope to tune a model to support predictions for dynamic sign language. If the dynamic models have minimal issues, I also plan to help Valeria with the web app support for user video inputs during the latter half of the week.

Valeria’s Status Report for 4/10/22

This week I worked on showing the tips for how to make each sign and having the tips change every 5 seconds. I also worked on making the web application more cohesive by giving the course and module boxes the same color and font. Here is a video showing how the current web application looks. Apart from that, I recorded 15 videos for each dynamic sign to build up our training data. I also started looking into how to create an AJAX model to save the results of each test that a user takes.

Currently, my progress is on schedule in regard to the web application. However, I ran into a problem earlier this week when trying to test the model accuracy. In order to test the model accuracy, I need to run the program locally, and the problem is that TensorFlow does not work well with the Mac M1 chip. I looked on Stack Overflow and other websites for possible solutions and spent the majority of my week focusing on this. Unfortunately, these solutions did not work for me, so my progress on testing the model accuracy is behind. To address this, the team and I decided that I will use their computers to test the model accuracy when we meet in person so that I can stay involved in the process.

For next week, I have three main goals. First, I want to finish creating the AJAX model for testing and figure out a way to send a random assortment of questions when a user wants to be tested on one topic (e.g., learning, alphabet). Second, I want to change the video capture from using a Blob to using OpenCV. Third, I want to add the rest of the tips for the signs (e.g., conversation, learning, and numbers).

Team Status Report for 4/10/22

This week the most significant risk is processing the data for the dynamic models, because it will take some time to format all of the data correctly and see how well the model does. To manage this, we are dedicating more time to working with the dynamic data and making it our primary focus for this week. As for contingency plans, if we continue having issues, given that we only have 3 weeks left and that static signs were our MVP, we will most likely leave out the testing portion for dynamic signs and only include them as educational material in our platform.

A secondary risk is that the fist model (covering signs like a, m, n, s, t, etc.) is not performing as expected: it is not correctly distinguishing between the signs. To manage this, we will investigate the training data for the fist model this week to figure out why it is underperforming; currently we think the issue is due to varying hand positions, but we will confirm after looking into it more. As for contingency plans, if we are unable to figure out the fist model issue, we will distribute the fist signs among the other models so that each model only has to distinguish between dissimilar signs.

Our design has not changed over the past week nor has our schedule seen any significant changes.

Hinna’s Status Report for 4/2/22

This week, I personally worked on my portions of creating the ASL testing/training data, adjusting the instructional material (videos and text) to make sure it fits with how our model is reading hand data (i.e. making sure hands are angled so that all/most fingers are visible in frame), and locally examining the web app + ML models.

In regard to examining the web app, I have been brainstorming with Valeria on some of the suggestions we were given in last week's meeting: we are trying to decide the best way to store user statistics, include a random quiz mode that lets users select which categories to be tested on, and format/display user profile information. As for the machine learning models, I have been working with them locally to see which is performing the best (currently the 1-finger model) and to try to find holes in how they have been trained (e.g., the sign for 'd' requires a certain tilt for it to be accurately detected). After identifying some of these signs, I have been working with Aishwarya to figure out solutions for these issues in the models.

Our project is technically on schedule but is in danger of falling behind, especially because my other two group members tested positive for COVID this past week. To account for this, we are pushing some tasks (such as integration) back a week on our schedule and doing our best to collaborate virtually.

In the next week, we plan to successfully execute an interim demo with most if not all of the static sign models working, along with a webapp that can teach users the ASL terms we have chosen after they make an account on the platform.

Valeria’s Status Report for 4/2/22

This week I created and inserted the instructional videos for the signs. There was a minor setback since we found that Chrome does not accept .MOV files, so I also had to convert all of the videos from .MOV to .MP4 for them to show up in Chrome. Apart from that, I am now saving the user's hand dominance after they register and log in. Originally, I thought I could determine the user's hand dominance from MediaPipe data; however, after discussing it further with the rest of the team, we concluded that having the user explicitly state their hand dominance the first time they visit the website would be easier. I wasn't able to do much else this week since I had a midterm exam for another class and also contracted COVID.
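For reference, the batch conversion can be scripted with ffmpeg along these lines (the directory path is illustrative, and ffmpeg is assumed to be installed):

```python
# Sketch: convert every .MOV instructional video in a folder to .MP4 for Chrome.
import pathlib
import subprocess

for mov in pathlib.Path("static/videos").glob("*.MOV"):
    mp4 = mov.with_suffix(".mp4")
    subprocess.run(["ffmpeg", "-y", "-i", str(mov), str(mp4)], check=True)
```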

Because I caught COVID, my progress is slightly behind what I anticipated. Originally, I planned this week to figure out how to notify users whether they did the sign correctly, help with integration, and help with testing the model accuracy. Therefore, I am making the user UI less of a priority for now. For next week, my main priority is to test the model accuracy during the weekend and continue helping with integration. If I'm able to catch up next week, I also hope to add all of the tips that Hinna wrote for users to reference when making a sign.

Aishwarya’s Status Report for 4/2/22

I integrated the model execution with the web app (such that the user's input is parsed and passed to the model to generate a prediction). I also parsed all of the new data (which we collected to replace incorrect signs in the original training dataset) by extracting image frames from a series of videos our group made and then extracting landmarks from each image frame. I retrained all the models with this newly formatted data.
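A condensed sketch of that video-to-landmarks pipeline (paths and structure are illustrative; the real scripts handle more bookkeeping):

```python
# Sketch: read a training video frame by frame and extract MediaPipe hand
# landmarks for each frame (None when no hand is detected).
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def video_to_landmark_sequence(video_path):
    sequence = []
    cap = cv2.VideoCapture(str(video_path))
    with mp_hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                sequence.append([(lm.x, lm.y, lm.z)
                                 for hand in results.multi_hand_landmarks
                                 for lm in hand.landmark])
            else:
                sequence.append(None)
    cap.release()
    return sequence
```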

My progress is mildly hindered due to having COVID this past week, so I haven't been able to tune the models as much as I would like. The models in general have slight trouble identifying unknown signs, and the fist sign category model in particular seems to have the most difficulty identifying letters such as A and S. I hope that after recovering this next week, I can tune the models further to deal with these issues. I will have to experiment with the number of training epochs and with the model structure itself (increasing/decreasing the number of layers and the nodes within each layer).
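As an illustration of the kind of structural knobs involved (layer count and width), a generic Keras builder could look like the following; the sizes and class count are placeholders, not our actual architecture:

```python
# Sketch: a configurable dense classifier over flattened landmark features.
import tensorflow as tf

def build_model(hidden_layers=2, units=64, num_classes=8, input_dim=63):
    layers = [tf.keras.layers.Input(shape=(input_dim,))]
    for _ in range(hidden_layers):
        layers.append(tf.keras.layers.Dense(units, activation="relu"))
    layers.append(tf.keras.layers.Dense(num_classes, activation="softmax"))
    model = tf.keras.Sequential(layers)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```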

Next week, I hope to fix some of the prediction issues currently observed with the models. I also want to make the web app more smoothly integrated with the model execution service. Currently the flow requires downloading the user's video input locally, but it would be better to cut out this middle step to improve latency.