Team Status Report for 2/24

Main Accomplishments for This Week

  • Design Review presentation

  • Swift language and Xcode environment setup
    • Initialization of mobile app with camera capabilities

  • Ordered and picked up purchased inventory items (battery, OLED screen, and e-ink screen)
  • Beginning of ML model training for dynamic signs 

Risks & Risk Management

  • Currently there are no significant risks for the team as a whole, but teammates encountered the following issues:
    • One teammate raised the issue that the iPhone simulated in Xcode does not implement a camera, so we are doing further research and testing to be able to use the iPhone camera in our app and integrate it with the rest of our code.
    • Another issue is the significant amount of data that will be needed to produce an accurate ML model, along with anticipated difficulties in integrating multiple datasets. We are mitigating this by taking an iterative approach to training and testing the ML model together with the CV processing.

Design Changes

  • No design changes

Schedule Changes

  • No schedule changes for now. However, if the intermediate integrations don’t go well, we will likely use spring break to work on this.

Sejal’s Status Report for 2/17/24

This week I got started on a simple ML model and combined it with Ran’s computer vision algorithm for hand detection. I trained a CNN on Kaggle’s ASL MNIST dataset. Using the trained model, I took the video stream processed by the OpenCV and MediaPipe code, predicted which character was being signed, and displayed this prediction on the webcam feed, as shown below.

(Code on GitHub: https://github.com/LunaFang1016/GiveMeASign/tree/feature-cv-ml)
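The actual implementation is in the repo linked above; the following is only a minimal sketch of that kind of pipeline, assuming a saved Keras model file (asl_mnist_cnn.h5 is a placeholder name) and an illustrative label list: MediaPipe locates the hand, the crop is classified by the CNN, and OpenCV draws the predicted letter on the frame.

```python
# Sketch: webcam loop that detects a hand with MediaPipe, crops it,
# and classifies the crop with a CNN trained on Sign Language MNIST.
# Model path and label list are placeholders for illustration.
import cv2
import numpy as np
import mediapipe as mp
import tensorflow as tf

model = tf.keras.models.load_model("asl_mnist_cnn.h5")   # hypothetical path
labels = list("ABCDEFGHIKLMNOPQRSTUVWXY")                 # static letters (no J/Z)

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0]
            mp_draw.draw_landmarks(frame, lm, mp_hands.HAND_CONNECTIONS)
            # Padded bounding box around the detected hand landmarks
            h, w, _ = frame.shape
            xs = [int(p.x * w) for p in lm.landmark]
            ys = [int(p.y * h) for p in lm.landmark]
            x1, x2 = max(min(xs) - 20, 0), min(max(xs) + 20, w)
            y1, y2 = max(min(ys) - 20, 0), min(max(ys) + 20, h)
            crop = frame[y1:y2, x1:x2]
            if crop.size:
                gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
                gray = cv2.resize(gray, (28, 28)).astype("float32") / 255.0
                probs = model.predict(gray.reshape(1, 28, 28, 1), verbose=0)[0]
                cv2.putText(frame, labels[int(np.argmax(probs))], (x1, y1 - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        cv2.imshow("Give Me A Sign", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```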

Training this simple model helped me think about the complexities required beyond it, such as incorporating both static and dynamic signs, and combining letters into words to form readable sentences. After doing further research on which neural network architecture to use, I decided to go with a combination of a CNN for static signs and an LSTM for dynamic signs, sketched below. I also gathered datasets that contain both static and dynamic signs from a variety of sources (How2Sign, MS-ASL, DSL-10, RWTH-PHOENIX-Weather 2014, Sign Language MNIST, ASLLRP).
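As a rough illustration of this hybrid (not a final architecture), the sketch below applies a small CNN to each frame of a clip via TimeDistributed and feeds the per-frame features to an LSTM; the clip length, frame size, and class count are placeholder assumptions.

```python
# Illustrative CNN + LSTM hybrid: a per-frame CNN (TimeDistributed) feeding an
# LSTM that models the motion of dynamic signs. Shapes are assumptions only.
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN, H, W, C = 30, 64, 64, 1   # assumed: 30-frame clips of 64x64 grayscale
NUM_CLASSES = 10                   # placeholder vocabulary size (e.g. DSL-10-sized)

# Small CNN applied independently to each frame to extract spatial features
frame_cnn = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(H, W, C)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
])

model = models.Sequential([
    layers.TimeDistributed(frame_cnn, input_shape=(SEQ_LEN, H, W, C)),
    layers.LSTM(128),                      # temporal modeling across frames
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```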

My progress is on track with the schedule, as I’ve been working on model testing and gathering data from existing datasets.

Next week, I hope to train the model further on the gathered datasets and to display more accurate predictions of not just letters, but words and phrases. We will also be working on the Design Review presentation and report.

Sejal’s Status Report for 2/10/24

After presenting the project proposal Monday, my group and I reflected on the questions and feedback we received and prepared to start our individual parts of the project. I started doing further research into the machine learning algorithm that will recognize ASL gestures. Since my teammate will be processing the datasets using OpenCV, I will begin with publicly available datasets that provide preprocessed images for sign language recognition tasks, for example the ASL Alphabet and Sign Language MNIST datasets on Kaggle. Since we decided to use TensorFlow and Keras, I looked into how existing projects utilize these technologies. With regard to training the neural network, I learned that convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are commonly used. 3D CNNs are also an option, especially for spatiotemporal (video) data. Hybrid models combining CNNs and RNNs might also be a good approach.
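As a rough sketch of the 3D-CNN option mentioned above (purely illustrative; the clip shape and vocabulary size are assumptions, not design decisions), the convolutions run over time as well as height and width of a short clip:

```python
# Illustrative 3D CNN for short sign clips: Conv3D kernels convolve over
# (time, height, width). Input shape and class count are placeholders.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv3D(32, (3, 3, 3), activation="relu",
                  input_shape=(16, 64, 64, 1)),   # assumed 16-frame 64x64 grayscale clips
    layers.MaxPooling3D((1, 2, 2)),               # pool only spatially at first
    layers.Conv3D(64, (3, 3, 3), activation="relu"),
    layers.MaxPooling3D((2, 2, 2)),               # then pool over time and space
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(20, activation="softmax"),       # placeholder vocabulary size
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```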

Our progress is on track relative to our schedule. During the next week, Ran and I will begin preparing a dataset. We will also allocate some time to learn ASL so we can use some of our own data. I also hope to do more research into neural network structures and consider which are best suited for our use case.