Aishwarya’s Status Report for 4/30/22

This week, I worked on the final presentation with my team, taking pictures and documenting the current state of our project. I worked on adding details to our demo poster (such as system block diagrams and a discussion of the overall system structure). I have also been experimenting with further data collection for our neural networks (e.g., observing the effect of learning rate on model accuracy) to see whether these metrics could add more substance to our discussion of quantitative results and tradeoffs.
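For a rough sense of what such an experiment looks like, here is a minimal sketch of a learning-rate sweep in Keras. Everything in it is a placeholder for illustration: the architecture, the 63-feature input (21 hand landmarks x 3 coordinates), the 24 classes, and the random stand-in data are assumptions, not our actual training code.

```python
import numpy as np
import tensorflow as tf

# Random placeholder data standing in for our landmark features and one-hot labels.
x_train = np.random.rand(500, 63)
y_train = tf.keras.utils.to_categorical(np.random.randint(24, size=500), 24)
x_test = np.random.rand(100, 63)
y_test = tf.keras.utils.to_categorical(np.random.randint(24, size=100), 24)

def build_model(num_features=63, num_classes=24):
    # Illustrative architecture only, not our exact network structure.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(num_features,)),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

results = {}
for lr in [1e-2, 1e-3, 1e-4]:
    model = build_model()
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=20, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    results[lr] = acc  # test accuracy per learning rate
```

Each learning rate's final test accuracy lands in `results`, which is the kind of table a tradeoff discussion can draw on.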

My progress is on schedule. Next week I hope to complete my portion of the video demonstration (explaining the ML models and testing accuracy metrics), as well as my portions of the final design report.

Hinna’s Status Report for 4/30/22

This week, I finished working on the final presentation with my group, where I added some walkthroughs of the webapp (attached to this post). I also started working on the final poster, for which I wrote our product pitch and part of the system description and worked with my group members on the rest. Additionally, I began writing the final report, adjusting the introduction, user requirements, and project management sections based on changes we made since the design review.

Our project is on schedule as of now; we mostly need to finish the solution tradeoff analysis, user testing, and final deliverables. Over the next week, I will work with my team to finish user testing, the final poster, and the final video, and to finalize plans for the final demo. I will also work on analyzing some of our tradeoffs (e.g., epochs vs. accuracy for each of the models).

Team Status Report for 4/30/22

The most significant risk to the success of this project is model tuning, i.e., that we do not achieve the accuracy we aimed for before the final demo. To mitigate this risk, we are continuing to train our models on additional training data. As for contingency plans, we are going to leave the models as they are, since there is only a week until the demo. Also, after we talked to Professor Gormley, a machine learning professor at CMU, he suggested that we not change our neural network structures due to time constraints.

There have been no changes to the existing design of the system and to our schedule.

Hinna’s Status Report for 4/23/22

Over this past week, I created tradeoff graphs from the model accuracy metrics we collected, plotting training and testing accuracy against the number of epochs. From these graphs, we identified that the dynamic models are performing very well (93%+ accuracy), most likely because we had to create our own training data for them. On the other hand, the 1-finger and open-hand models were performing fairly poorly (60-70% accuracy). So, along with my teammates, I made more training data for those models to see if the additional data would improve their accuracy.
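As a point of reference, curves like these can be generated directly from a Keras training history, assuming the models are trained with a held-out validation set; this sketch is illustrative, not our exact plotting code.

```python
import matplotlib.pyplot as plt

def plot_accuracy_tradeoff(history, title="accuracy vs. epochs"):
    # history: the object returned by model.fit(..., validation_data=...)
    epochs = range(1, len(history.history["accuracy"]) + 1)
    plt.plot(epochs, history.history["accuracy"], label="training accuracy")
    plt.plot(epochs, history.history["val_accuracy"], label="testing accuracy")
    plt.xlabel("epochs")
    plt.ylabel("accuracy")
    plt.title(title)
    plt.legend()
    plt.show()
```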

Additionally, as the dynamic models are now integrated into the webapp, I examined how well they were doing, personally testing them at various angles, at various distances (within 2 feet), and with both hands to see how accurate they were. I found that when doing the signs quickly (within one second) the prediction was not accurate, but when signing more slowly, the accuracy improved. This finding was also reflected in some of our user test results from the 2 users who tested the platform on Friday.

Finally, I have been working with my teammates on the final presentation. I updated our schedule and project management tasks, altered our Solution Approach diagram to account for the number of neural networks we now have, adjusted our user requirements based on changes made since the design presentation (i.e., our distance requirement decreased and our model accuracy requirement increased), adjusted the testing/verification charts, and included the tradeoff curves for testing and training accuracy vs. the number of epochs.

Our project overall seems to be on schedule, with a few caveats. One is that we are ahead of schedule in terms of integration, as we finished it last week, so our initial plan of integrating until the very end of the semester no longer applies. However, our model accuracy is not yet where it needs to be for every subset of signs; given that we only have about a week left, the fact that we might not get them all to our desired accuracy of 97% makes it feel like we are a little behind. Additionally, we held user tests this past week and only 2 users signed up (our overall goal is 10 users), which means our testing is behind schedule.

As for next week, my main focuses will be getting more user tests done, finalizing the tradeoff curves once we see whether the additional training data improves model accuracy, and working on the final report, demo, and video.

Aishwarya’s Status Report for 4/23/22

This week, I completed integrating model execution with the randomized testing feature that Valeria created for the web app. The user proceeds through a set of mixed questions while the models execute on their inputs, so scores are accrued in the background and then presented to the user at the end in a scoreboard format. Further, I resolved the bug from last week where the stop action triggered by the user or the timer executed repeatedly, preventing the user from making further inputs. Now, the user can make multiple attempts at a sign without this bug hindering them.
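Conceptually, the background scoring boils down to something like the following sketch; the session dictionary and function names here are hypothetical simplifications rather than the webapp's actual code.

```python
def record_attempt(session, expected_sign, predicted_sign):
    # Accrue results in the background as the user moves through questions.
    session.setdefault("scores", []).append({
        "expected": expected_sign,
        "predicted": predicted_sign,
        "correct": predicted_sign == expected_sign,
    })

def scoreboard(session):
    # Summarize the accrued results for display at the end of the quiz.
    scores = session.get("scores", [])
    correct = sum(1 for s in scores if s["correct"])
    return {"correct": correct, "total": len(scores), "attempts": scores}
```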

I also gathered metrics for model training and testing accuracy vs. the number of training epochs. This data will be included in our final presentation next week; it also revealed that some of our models need additional data (created by us) for retraining to improve testing accuracy. Additionally, I conducted user tests with Valeria to obtain feedback about our platform, so that we may improve it further before the final demo.

My progress is on schedule. The web app and the models are fully integrated. This next week I will focus on tuning the models and gathering more data (on model testing accuracy and the execution time to generate a prediction) for our documentation of results.

Team Status Report for 4/23/22

The most significant risk that could currently jeopardize the success of our project is model accuracy. Over the past week, we have been looking at accuracy tradeoffs and started conducting user tests, and with the dynamic models we can see that when users sign quickly, our model is not able to accurately detect the dynamic signs. To fix this, we are considering creating training data with the signs performed faster, so that the model is trained on quicker iterations of each sign. As a contingency plan, we will either tell the user to sign slightly more slowly or keep the models as they are, since we are nearing the end of the semester.
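If re-recording every sign proves too time-consuming, one hypothetical supplement would be to resample our existing landmark sequences to approximate faster signing. The sketch below assumes fixed-length per-frame feature arrays and an assumed target length; it is an idea, not something we have implemented.

```python
import numpy as np

def speed_up(sequence, factor=1.5, target_len=30):
    # sequence: (num_frames, num_features) array of per-frame landmark features.
    # Keep roughly every `factor`-th frame to simulate a faster sign, then pad
    # the result back to the fixed length the model expects (target_len assumed).
    idx = np.arange(0, len(sequence), factor).astype(int)
    fast = sequence[idx]
    pad = np.repeat(fast[-1:], max(0, target_len - len(fast)), axis=0)
    return np.concatenate([fast, pad], axis=0)[:target_len]
```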

There haven’t been any changes to our system design. As for our schedule, we are going to extend our user testing weeks all the way up to the demo since we were not able to get enough users to sign up over this past week. Additionally, we plan to collect survey results at the live demo to get more user feedback to add to the final report. Also, because we have the webapp and ML models fully integrated, we are shortening the integration task on our schedule by two weeks.

Hinna’s Status Report for 4/16/22

This week, my main focuses were user testing and examining the accuracy of our static and dynamic models.

In regard to user testing, I made a Google Forms survey that asks our testers to rate different usability features of the website as well as how helpful they felt it was in teaching them ASL. I also made a step-by-step guide for users to follow when we conduct the testing, which we will use to see how intuitive the steps are to complete and to make sure users test various actions (e.g., making sure each user tries both correct and incorrect signing on the platform to see the results). Finally, as a TA for the ASL StuCo this semester, I reached out to students who are either semi-experienced or experienced in ASL to take part in our tests. We will also be reaching out to a few people who are brand new to ASL in order to get a beginner's perspective on our platform.

As for the models, I have been trying different combinations of training epochs and prediction threshold values (where the model only outputs a prediction if its confidence is above a certain value, e.g., 90%) to determine the configuration that makes the models most accurate. In these tests, I identified certain signs that consistently have trouble across the combinations, as well as some environmental factors, like contrasting backgrounds, that can influence the results. Because of this work and the feedback from our weekly meeting, I will continue trying these combinations in a more systematic way, recording accuracy data as a function of epochs and/or threshold values in order to graph the tradeoffs associated with our system. The final accuracy data and graphs will be recorded at the end of next week, to account for any training changes we make this week based on the signs identified as having consistently inaccurate predictions.
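The thresholding itself is simple to state in code; in this sketch, the model, feature vector, and label list are placeholders for illustration.

```python
import numpy as np

def predict_with_threshold(model, features, labels, threshold=0.9):
    # Only report a sign when the top softmax probability clears the threshold.
    probs = model.predict(features[np.newaxis, :], verbose=0)[0]
    best = int(np.argmax(probs))
    return labels[best] if probs[best] >= threshold else None  # None = no confident prediction
```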

Our project is on schedule at this point; however, our model accuracy is not quite at the 97% target we set at the beginning of the semester. Since we had planned to be adjusting and tuning the models up to the very end of the semester, this is not too big of a deal, but we are going to shift our focus primarily to testing as well as the final presentation/demo/report. Thus, while we are on schedule, our final implementation may not be as robust as we had planned.

Next week, I will be conducting user tests along with my teammates, focusing on factors such as hand dominance, hand size, lighting, distance from the camera, and potentially contrasting backgrounds. I will also be examining the dynamic models more in depth to identify signs that are having less successful detections. Additionally, I will be recording information on accuracy vs. threshold value and accuracy vs. epochs used in training, then using that information to make tradeoff curves that we can hopefully include in our final presentation.

Team Status Report for 4/16/22

The most significant risks that could jeopardize the success of our project are the accuracy of the static model predictions, the accuracy of the dynamic models (which is a post-MVP concern), and minor bugs such as our start/stop recording not timing out correctly.

In regard to the static model accuracy, we are managing this risk by examining each model to determine which signs are having issues and experimenting with different epoch/prediction-threshold combinations to see if they improve model accuracy. Given that there are only 2 weeks left in the semester, if the accuracy does not improve, we will simply note that in the report but keep the models as they are in the platform, since they are performing well overall.

As for the dynamic models, this is a post-MVP portion of our project, and currently many of the 2-handed moving signs are not being accurately detected. To manage this risk, based on feedback from Tamal and Isabel, we are checking the number of hands present in the frame for 2-handed tests and immediately marking the attempt as incorrect if only 1 hand is present. Additionally, we are looking into the flickering of MediaPipe landmarks, which occurs when the hands blur in motion in the middle of a sign; we are considering padding or removing those blurred frames. Again, as a contingency plan, if we cannot improve the dynamic models, we will most likely keep them in our platform as is, given that there are only 2 weeks left, and address the inaccuracies in our report.
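The hand-count check can be read straight off MediaPipe's output; here is a rough sketch, assuming the frame arrives as an RGB NumPy array (the helper name is ours).

```python
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=2)

def enough_hands(rgb_frame, hands_required=2):
    # multi_hand_landmarks is None when no hands are detected in the frame.
    results = hands.process(rgb_frame)
    detected = results.multi_hand_landmarks or []
    return len(detected) >= hands_required
```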

In regard to the minor bugs, like the start/stop timeout, we are using logs and online forums to try to figure out why our recording is not actually stopping. If the issue persists, we will reach out on Slack to get some help with the error. It should be noted that this issue is not a major problem, as the main purpose of our platform (having users sign and get predictions on whether their signing is correct) is not affected by it. However, if the issue cannot be fixed, we will simply instruct users on how to work around it (i.e., to repeat execution, they must refresh the page rather than press start again).

There have not been changes to our system design or schedule.

Hinna’s Status Report for 4/10/22

Throughout this week, I was feeling fairly sick, so I did not accomplish as much as I intended to. However, I was able to continue creating testing/training data (specifically focusing on our 15 dynamic signs) and to keep working locally with our machine learning models to see which groupings of signs are performing the worst (currently, this is the fist model). Additionally, we did a virtual dry run of our interim demo (which we could not hold in person since our whole group was sick), and based on TA/professor feedback, I started looking into normalizing the inputs to our models to make sure that slight variations in distance or hand position do not affect the predictions. I will continue this research along with my other group members this week to determine whether the normalization is feasible in the time we have left.
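One common way to do this kind of normalization (a sketch of the general idea, not necessarily the approach we will land on) is to translate the landmarks so the wrist sits at the origin, then scale by the hand's apparent size, so that frame position and camera distance drop out of the features.

```python
import numpy as np

def normalize_landmarks(landmarks):
    # landmarks: (21, 3) array of MediaPipe hand landmarks; index 0 is the wrist.
    pts = landmarks - landmarks[0]               # remove hand position in the frame
    scale = np.max(np.linalg.norm(pts, axis=1))  # approximate hand size
    return pts / scale if scale > 0 else pts     # remove distance-to-camera effect
```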

Additionally, given that there are only 3 weeks left in the semester, I began planning for user testing by researching the best webapp user satisfaction questions to ask, reaching out to potential testers (other students at CMU), and creating a testing schedule. Given that we initially planned to start user testing last week, this portion of the project is behind schedule.

Overall, we are slightly behind schedule due to our group-wide illnesses this past week and the fact that our models are not working for every letter, resulting in us pushing back user testing. In order to get back on track, we have scheduled two additional meetings this week (outside our normal class time and previously set weekly meets), so that we can dedicate more focused work time to our project.

Next week, we hope to have the static models fully done and the dynamic models significantly progressing (with at least 7/15 having 80%+ prediction rates). Additionally, we plan to start implementing a quiz/testing menu option on the webapp, where users can be quizzed on a random subset of signs. Finally, we plan to have a user testing schedule planned out and ready to go so we can get some feedback in these final weeks.

Aishwarya’s Status Report for 4/10/22

This week, I refined the integration of the web app and our neural networks. Previously, for static signs, we had been downloading a video and sending it to our Python code to extract features, run one of the models on the input data, and generate a prediction for the sign the user completed. I changed it so that feature extraction is done directly in the JavaScript backend portion of the web app for each frame of the video camera input. An array of this data is sent as part of a POST request to a Python server, which generates and sends back a prediction response. I brought 5 separate models into the backend that are loaded upon webapp start-up. This removed the additional latency I observed the week before, which came from loading a model with its structure and weights every time a prediction needed to be made. This integration appears to work smoothly, though we still need to refine an implementation that takes a video input from the user in order to support dynamic sign prediction. In addition to this work on the web app, I continued tuning the models and created some additional video data to be used as training samples for our dynamic signs (conversational sign language that requires hand movements).
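The start-up loading and request flow can be pictured roughly as follows. This is a sketch under assumptions: Flask, the model file names, and the endpoint shape are illustrative, and only the fist, 1-finger, and open-hand model groupings are named elsewhere in these reports.

```python
import numpy as np
import tensorflow as tf
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load every model once at start-up so no request pays the weight-loading cost.
# File names are placeholders for our five sign-grouping models.
MODEL_NAMES = ["fist", "one_finger", "open_hand", "static_misc", "dynamic"]
MODELS = {name: tf.keras.models.load_model(f"{name}.h5") for name in MODEL_NAMES}

@app.route("/predict", methods=["POST"])
def predict():
    # The webapp POSTs a per-frame feature array plus the name of the model to run.
    body = request.get_json()
    features = np.array(body["features"], dtype=np.float32)
    probs = MODELS[body["model"]].predict(features[np.newaxis, :], verbose=0)[0]
    return jsonify({"class_index": int(np.argmax(probs)),
                    "confidence": float(np.max(probs))})
```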

My progress is on schedule. This coming week, I hope to tune a model to support predictions for dynamic sign language. If the dynamic models have minimal issues, I also plan to help Valeria work on the web app's support for user video inputs during the latter half of the week.