Valeria’s Status Report for 4/16/22

This week I worked with Aishwarya to integrate the machine learning model with the web application and display the results on screen for the user. We accomplished this and have moved on to refining how these results are shown. Apart from that, I have been working on a test module that lets users pick which kinds of signs they want to be tested on. As of now, I have the model and the HTML template done. I’m currently trying to figure out how to send the data from the HTML to views.py through a POST request. I have also added the rest of the tips for the signs, so those can now be viewed as well.

My progress is on schedule. For next week, I hope to finish the test module and figure out how to send that POST request data. Apart from that, my other goal is to get the slides done for the final presentation and to help out Hinna and Aishwarya with whatever else we need to finish up before the presentation.

Hinna’s Status Report for 4/10/22

Throughout this week, I was feeling fairly sick, so I did not accomplish as much as I intended to. However, I was able to continue creating testing/training data (specifically for our 15 dynamic signs) and to keep working locally with our machine learning models to see which groupings of signs perform the worst (currently the fist model). Additionally, we did a virtual dry run-through of our interim demo, since we couldn’t do it in person while our whole group was sick. Based on TA/professor feedback, I started looking into normalizing the inputs to our model so that slightly varying the distance or hand position doesn’t affect the predictions. I will continue this research along with my other group members this week to determine whether the normalization is feasible given the amount of time we have left.
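
As a rough illustration of the normalization idea, here is a minimal sketch that is not our actual pipeline. It assumes each hand is a list of (x, y) landmark points with the wrist first (as in MediaPipe's hand landmarks). The function name normalize_landmarks is hypothetical. It shifts the wrist to the origin and rescales so the farthest landmark sits at distance 1, which removes the effect of where the hand is in the frame and how far it is from the camera.

```python
import math


def normalize_landmarks(landmarks):
    """Make (x, y) landmarks invariant to hand position and distance.

    Assumption: landmarks[0] is the wrist. Names here are illustrative only.
    """
    if not landmarks:
        return []
    origin_x, origin_y = landmarks[0]
    # Translate so the wrist becomes the origin (removes hand position).
    shifted = [(x - origin_x, y - origin_y) for x, y in landmarks]
    # Scale by the farthest landmark from the wrist (removes camera distance).
    scale = max(math.hypot(x, y) for x, y in shifted)
    if scale == 0:
        return shifted
    return [(x / scale, y / scale) for x, y in shifted]
```

With this kind of preprocessing, the same hand shape held closer to the camera or in a different corner of the frame would map to (nearly) the same feature values. It does not handle hand rotation or tilt.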

Additionally, given that there are only 3 weeks left in the semester, I began planning for user testing by researching the best webapp user satisfaction questions to ask, reaching out to potential testers (other students at CMU), and creating a testing schedule. Given that we initially planned to start user testing last week, this portion of the project is behind schedule.

Overall, we are slightly behind schedule due to our group-wide illnesses this past week and the fact that our models are not working for every letter, resulting in us pushing back user testing. In order to get back on track, we have scheduled two additional meetings this week (outside our normal class time and previously set weekly meets), so that we can dedicate more focused work time to our project.

Next week, we hope to have the static models fully done and the dynamic models significantly progressing (with at least 7/15 having 80%+ prediction rates). Additionally, we plan to start implementing a quiz/testing menu option on the webapp, where users can be quizzed on a random subset of signs. Finally, we plan to have a user testing schedule planned out and ready to go so we can get some feedback in these final weeks.
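
The quiz on a random subset of signs could look something like the sketch below. This is only an assumption about the eventual design: build_quiz, the category names, and the example signs are all hypothetical placeholders.

```python
import random


def build_quiz(signs_by_category, categories, num_questions, rng=None):
    """Pick a random, non-repeating subset of signs from the chosen categories."""
    if rng is None:
        rng = random.Random()
    # Pool every sign from the categories the user selected.
    pool = [sign for category in categories for sign in signs_by_category[category]]
    # Never ask for more questions than there are signs available.
    return rng.sample(pool, min(num_questions, len(pool)))


# Hypothetical usage: quiz the user on four signs from two categories.
example_signs = {"alphabet": ["a", "b", "c"], "numbers": ["1", "2"]}
quiz = build_quiz(example_signs, ["alphabet", "numbers"], 4)
```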

Aishwarya’s Status Report for 4/10/22

This week, I refined the integration of the web app and our neural networks. Previously, for static signs, we downloaded a video and sent it to our Python code, which extracted features and ran one of the models on that input to predict the sign the user made. I changed this so that feature extraction is done directly in the JavaScript backend portion of the web app for each frame of the video camera input. An array of this data is sent as part of a POST request to a Python server, which generates and sends back a prediction response. I brought 5 separate models into the backend that are loaded upon web app start-up. This removed the additional latency I observed the week before, which came from loading a model with its structure and weights every time a prediction needed to be made. This integration appears to work smoothly, though we still need to refine an implementation that takes a video input from the user in order to support dynamic sign prediction. In addition to this work with the web app, I continued tuning the models and created some additional video data to be used as training samples for our dynamic signs (conversational sign language that requires hand movements).
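
The load-once pattern described above can be sketched as follows. This is a simplified illustration rather than the code in our server: ModelRegistry, handle_prediction_request, and the JSON field names ("model", "features", "prediction") are assumptions made for the example, and the "models" are stand-in callables instead of real networks.

```python
import json


class ModelRegistry:
    """Loads every model once at start-up and reuses it for each request."""

    def __init__(self, loaders):
        # loaders maps a model name to a zero-argument function that loads it.
        # Each loader runs exactly once here, not once per prediction.
        self._models = {name: load() for name, load in loaders.items()}

    def predict(self, name, features):
        return self._models[name](features)


def handle_prediction_request(registry, body):
    """Turn the JSON body of a POST request into a JSON prediction response."""
    payload = json.loads(body)
    label = registry.predict(payload["model"], payload["features"])
    return json.dumps({"prediction": label})
```

Because the registry is built once when the server starts, each request only pays for the prediction itself rather than for reading the model structure and weights again.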

My progress is on schedule. This coming week, I hope to tune a model to support predictions for dynamic sign language. If the dynamic models have minimal issues, I also plan to help Valeria work on the web app support for user video inputs during the latter half of the week.

Valeria’s Status Report for 4/10/22

This week I worked on showing the tips on how to make each sign and have the tips be changed every 5 seconds. I also worked on making the web application more cohesive by having the course and module boxes be of the same color and font. Here is a video showing how the current web application looks. Apart from that, I did 15 videos for each dynamic sign to help build upon our training data. I also started looking into how to create an AJAX model to save the results of each test that a user takes.

Currently, my progress is on schedule with regard to the web application. However, I ran into a problem earlier this week when trying to test the model accuracy. To test the model accuracy, I need to run the program locally, but TensorFlow does not work well with the Mac M1 chip. I looked on Stack Overflow and other websites for possible solutions and spent the majority of my week on this. Unfortunately, none of these solutions worked for me, so my progress on testing the model accuracy is behind. To address this, the team and I decided that I will use their computers to test the model accuracy when we meet in person, so that I can still be involved in the process.

For next week, I have three main goals. The first is to finish creating the AJAX model for testing and figure out a way to send a random assortment of questions when a user wants to be tested on one topic, e.g. learning, alphabet, etc. The second is to switch the video capture from Blob to OpenCV. The third is to add in the rest of the tips for the signs, e.g. conversation, learning, and numbers.

Team Status Report for 4/10/22

This week the most significant risk is processing the data for the dynamic models because it will take some time to format all the data correctly and see how well the model does. To manage this, we are dedicating more time to working with the dynamic data and making it our primary focus for this week. As for contingency plans, if we continue having issues, given that we only have 3 weeks left and that static signs were our MVP, we will most likely leave out the testing portion for dynamic signs, and only include them with educational materials in our platform.

A secondary risk is that the fist model (including signs like a, m, n, s, t, etc.) is not performing as expected: it is not correctly distinguishing between the signs. To manage this, we will investigate the training data for the fist model this week to figure out why it is not performing well. Currently we think the issue is due to varying hand positions, but we will confirm after looking into it more this week. As for contingency plans, if we are unable to figure out the fist model issue, we will separate the fist signs into the other models, so that each model only has to distinguish between dissimilar signs.

Our design has not changed over the past week nor has our schedule seen any significant changes.

Hinna’s Status Report for 4/2/22

This week, I personally worked on my portions of creating the ASL testing/training data, adjusting the instructional material (videos and text) to make sure it fits with how our model is reading hand data (i.e. making sure hands are angled so that all/most fingers are visible in frame), and locally examining the web app + ML models.

In regard to examining the web app, I have been brainstorming with Valeria on some of the suggestions given to us at the previous week’s meeting. We are trying to decide the best way to store user statistics, include a random quiz mode in which users can select which categories to be tested on, and format/display user profile information. As for the machine learning models, I have been working with them locally to see which is performing the best (currently the 1 finger model) and to try to find holes in how they have been trained (e.g. the sign for ‘d’ requires a certain tilt for it to be accurately detected). After identifying some of these signs, I have been working with Aishwarya to figure out solutions for these issues in the models.

Our progress is technically on schedule but is in danger of falling behind, especially because my other two group members tested positive for COVID this past week. To account for this, we are pushing some tasks back a week on our schedule (such as integration) and doing our best to collaborate virtually.

In the next week, we plan to successfully execute an interim demo with most if not all of the static sign models working, along with a webapp that can teach users the ASL terms we have chosen after they make an account on the platform.

Valeria’s Status Report for 4/2/22

This week I created and inserted the instructional videos for the signs. There was a minor setback when doing this since we found that Chrome does not accept .MOV files. Therefore, I also had to convert all of the videos from .MOV to .MP4 so that the videos would show up on Chrome. Apart from that, I am also saving the user’s hand dominance after they register and log in. Originally, I thought I could save the user’s hand dominance by getting MediaPipe data. However, after discussing it further with the rest of the team, it was concluded that having the user explicitly state their hand dominance the first time they visit the website would be easier. I wasn’t able to do much else this week since I had a midterm exam for another class and I also contracted COVID.

Because I caught COVID, my progress is slightly behind what I anticipated. Originally, I planned this week to figure out how to notify users whether they did the sign correctly or incorrectly, help with integration, and help with testing the model accuracy. To catch up, I am making the user UI less of a priority for now. For next week, my main priority is to test the model accuracy during the weekend and continue helping with integration. If I’m able to catch up next week, I also hope to add all of the tips that Hinna made for users to see when they are trying to make a sign.

Aishwarya’s Status Report for 4/2/22

I integrated the model execution with the web app (such that the user’s input is parsed and passed to the model for generating a prediction). I also parsed all of the new data (that we collected in order to replace incorrect signs in the original training dataset we were using), by extracting image frames from a series of videos our group made, and then extracting landmarks from each image frame. I retrained all the models with this newly formatted data.

My progress is mildly hindered due to having COVID this past week, so I haven’t been able to tune the models as much as I would like to. The models in general have slight trouble identifying unknown signs. The fist sign category model in particular seems to have the most difficulty identifying letters such as A and S. I hope that after recovering this next week, I can tune the models further in order to deal with these issues. I will have to experiment with the number of training epochs and the model structure itself (increasing/decreasing the number of layers and nodes within each layer).
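
One simple way to organize that experimentation is to enumerate every combination of settings up front and train one model per combination. The sketch below only builds the list of configurations. The function name hyperparameter_grid and the example values are hypothetical, not settings we have chosen.

```python
from itertools import product


def hyperparameter_grid(epochs, layer_counts, nodes_per_layer):
    """List every combination of epochs, layer count, and nodes per layer."""
    return [
        {"epochs": e, "layers": layers, "nodes": nodes}
        for e, layers, nodes in product(epochs, layer_counts, nodes_per_layer)
    ]


# Hypothetical search space: 2 * 2 * 3 = 12 configurations to train and compare.
configs = hyperparameter_grid([50, 100], [2, 3], [32, 64, 128])
```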

Next week, I hope to fix some of these prediction issues currently observed with the models. I also want to work on making the web app more smoothly integrated with the model execution service. Currently it requires downloading the video input from a user locally, but it would be better to cut out this middle step to improve latency.

Team Status Report for 4/2/22

The most significant risk that could currently jeopardize the success of the project is the model accuracy. At the moment, our model is very sensitive to slight hand tilts and little nuances in signs, so even when the user makes a technically correct sign, the model is unable to identify it as correct. To manage this risk, we are planning to alter some of the layers of our model, the number of training epochs, and the number of nodes to see if these adjustments result in more robust detection. Additionally, in the next week or so, we plan to consult with Professor Gormley on our model to see if he has any recommendations for improving the detection. As for contingency plans, if we are unable to make the model more flexible in its predictions, we will adjust our instructional materials to better reflect the training data, so that users sign in a way that is seen as correct by the model.

There have not been any changes to the design, but after meeting with our advisor and TA we are thinking of adding some features to the webapp, such as tracking user statistics. This change would mainly affect the user model that we currently have, adding an extra field to the user’s profile for letters that they frequently get wrong. We are making this change to make the learning experience more personalized, so that our platform can reinforce signs/terms that users consistently get incorrect through additional tests. Note that such changes will not be made a priority until after the interim demo, and more specifically, after we have addressed all the feedback we get from the demo.
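
To make the statistics idea concrete, here is a minimal sketch of how frequently missed signs could be tracked. It is a standalone illustration, not our user model: the SignStats class and the thresholds (at least 3 attempts, at least half of them wrong) are assumptions chosen for the example.

```python
from collections import Counter


class SignStats:
    """Tracks per-sign attempts and misses for one user."""

    def __init__(self):
        self.attempts = Counter()
        self.misses = Counter()

    def record(self, sign, correct):
        self.attempts[sign] += 1
        if not correct:
            self.misses[sign] += 1

    def frequently_missed(self, min_attempts=3, min_miss_rate=0.5):
        """Signs tried at least min_attempts times and missed at least min_miss_rate of the time."""
        return sorted(
            sign
            for sign, tries in self.attempts.items()
            if tries >= min_attempts and self.misses[sign] / tries >= min_miss_rate
        )
```

The list returned by frequently_missed is what an extra profile field could hold, and it could also be used to weight which signs show up in additional tests.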

Our schedule mostly remains the same. However, two of our three group members are currently sick with COVID, so we may have a much slower week in terms of progress. As a result, we may have to adjust our schedule and push some tasks to later weeks.

Aishwarya’s Status Report for 3/26/22

I trained models for 4 of our model groups (1-finger, 2-finger, 3-finger, and fist-shaped). While testing these, we noticed some unexpected behavior, particularly with the 3-finger model, and realized that the training dataset had incorrect samples for letters such as M and N. I, along with my other group members, recorded videos to create new data that would replace these samples. I wrote a script to extract frames from these videos and store them as jpg images, allowing us to generate a few thousand images for the labels that needed to have their samples replaced. Due to these issues we discovered with the datasets, I will need to reformat the training data and retrain some of the models with these newly created samples.
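
For readers curious what such a frame-extraction script might look like, below is a hedged sketch rather than the script itself. It assumes OpenCV (the opencv-python package) is installed. The function names, the file naming scheme, and the every_n parameter are all hypothetical.

```python
from pathlib import Path


def frame_filename(label, video_stem, index):
    """Build a jpg name such as 'M_clip1_00007.jpg' (naming scheme is an assumption)."""
    return f"{label}_{video_stem}_{index:05d}.jpg"


def extract_frames(video_path, label, out_dir, every_n=1):
    """Save every n-th frame of a video as a jpg and return how many were saved."""
    import cv2  # third-party dependency, imported here so the helper above works without it

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    capture = cv2.VideoCapture(str(video_path))
    saved = 0
    index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # no more frames (or the file could not be read)
                break
            if index % every_n == 0:
                name = frame_filename(label, Path(video_path).stem, index)
                cv2.imwrite(str(out_dir / name), frame)
                saved += 1
            index += 1
    finally:
        capture.release()
    return saved
```

Putting the label in the file name keeps each image tied to its sign, which makes the later landmark-extraction and retraining steps easier to organize.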

Our progress is on schedule. During this next week, I hope to integrate the web app video input with the model execution code in preparation for our interim demo. I will also complete re-parsing the data with our new samples for training and retrain the models.

The video linked is a mini-demonstration of one of my models performing real-time predictions.