Aishwarya’s Status Report for 2/19/22

This week, I worked with TensorFlow to gain familiarity with how it lets us instantiate a model, add layers to it, and train it. I also experimented with how we would need to format the data as NumPy arrays so that it can be fed into the model. Feeding dummy NumPy data to the model, I generated a timing report to measure how long the model takes to produce a prediction for every 10 frames processed in MediaPipe (during real-time video processing), so that we could get an idea of how the model's structure impacts execution time.
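A minimal sketch of that timing experiment, assuming hypothetical input dimensions (10 frames per prediction window and 63 features per frame, e.g. 21 hand landmarks × 3 coordinates from MediaPipe; the real feature count and class count may differ):

```python
import time
import numpy as np
import tensorflow as tf

# Assumed shapes for illustration only: 10 frames per window, 63 features
# per frame, 18 output classes (numbers 0-4 plus letters a-m).
FRAMES, FEATURES, CLASSES = 10, 63, 18

# A small stand-in model: one LSTM layer followed by a dense softmax head.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(FRAMES, FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(CLASSES, activation="softmax"),
])

# Dummy data shaped the way MediaPipe features would be batched.
dummy = np.random.rand(1, FRAMES, FEATURES).astype(np.float32)

# Time a single prediction over one 10-frame window.
start = time.perf_counter()
pred = model.predict(dummy, verbose=0)
elapsed = time.perf_counter() - start
print(f"prediction shape {pred.shape}, took {elapsed * 1000:.1f} ms")
```

In practice the first `predict` call includes one-time graph setup cost, so averaging over several calls gives a more realistic per-window latency.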

Our team is also building a test data set of ASL video/image data, so I recorded 5 videos for each of the signs for the numbers 0-4 and the letters a-m and uploaded them to the git repo where we are storing them.

The exact network structure that optimizes accuracy and execution time still needs to be determined through trial and error. We will use at least one LSTM layer followed by a dense layer; the exact number of hidden layers and neurons per layer will become clear once we can measure the performance of the initial model structure and optimize from there.
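One way to support that trial-and-error search is a small parameterized builder, where layer counts and widths are the knobs to sweep. This is only a sketch with assumed defaults (layer sizes, 18 classes, 10×63 input), not our final architecture:

```python
import tensorflow as tf

def build_model(num_lstm_layers=1, lstm_units=64, dense_units=32,
                num_classes=18, frames=10, features=63):
    """Hypothetical builder: at least one LSTM layer followed by dense
    layers, with the counts/widths left as tunable hyperparameters."""
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(frames, features)))
    for i in range(num_lstm_layers):
        # Every LSTM except the last must return full sequences so the
        # next LSTM layer receives 3-D (batch, time, features) input.
        model.add(tf.keras.layers.LSTM(
            lstm_units, return_sequences=(i < num_lstm_layers - 1)))
    model.add(tf.keras.layers.Dense(dense_units, activation="relu"))
    model.add(tf.keras.layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Example: compare a 1-LSTM and a 2-LSTM candidate structure.
candidates = [build_model(num_lstm_layers=n) for n in (1, 2)]
```

Each candidate can then be trained on the same data and compared on both accuracy and the per-window prediction latency measured earlier.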

Our progress is on schedule. Next week, I hope to complete the feature extraction code with my partners (both for real-time video feed and for processing our training data acquired from external sources).
