Stephanie’s Status Report for 12/4

Last week, I helped make the final presentation slides and worked on generating graphs for our machine learning models, comparing the trade-offs and accuracies from different rounds of testing.

This week, we are mostly making adjustments to the physical components of the glove, so there isn’t much to do on the software end. I wrote a new training script since the data we collect from now on will be in a different format from before, as we added more sensors. With this, we can start model training right away after data collection.

Next week, once the sensors are all tightly attached, we’ll be meeting up to collect data and finishing up our final report and video. I have also contacted others for data collection.


Stephanie’s Status Report for 11/20

This week I worked on integrating the letter-specific models with our old classifier. The original classifier identifies the input data as one of the 26 letters. If the predicted letter is in one of the groups of similar letters that the original classifier easily confuses, notably R, U, V, W and M, N, S, T, E, the data is then passed to one of the smaller models for further classification. This gave slightly better results and produced more correct classifications.
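A minimal sketch of this two-stage routing, assuming scikit-learn-style models with a `predict` method; the function name, group sets, and argument names here are illustrative, not our exact code:

```python
import numpy as np

def predict_letter(sample, base_model, specialists):
    """Classify among all 26 letters, then refine with a group-specific
    model if the first prediction lands in a known confusable group."""
    sample = np.asarray(sample).reshape(1, -1)
    letter = base_model.predict(sample)[0]
    for group, group_model in specialists.items():
        if letter in group:  # e.g. group = frozenset("RUVW")
            return group_model.predict(sample)[0]
    return letter
```

Here `specialists` would map each confusable group (e.g. `frozenset("RUVW")`, `frozenset("MNSTE")`) to the smaller model trained only on that group's letters.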

I performed further data analysis to see if there was any other way to improve the model. I found that some data contained invalid values (negative readings from the flex sensors) that my old outlier-removal function did not pick up, so I refined it to get cleaner data. The sensor values for U and V are quite similar, so it could be hard for a classifier to distinguish between these two. R’s range of values for the middle finger is quite different from the rest of the letters, so to improve future models for this specific group of letters, I may train only on the important features, e.g., the flex sensor values.
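The refined cleaning step could look something like this sketch, assuming samples are rows of a NumPy array with the five flex-sensor readings in the first columns; the column layout and the spike threshold are assumptions for illustration:

```python
import numpy as np

def remove_invalid_samples(data, flex_cols=range(5), max_flex=5000):
    """Drop rows with physically impossible flex readings: negative
    values (which the old filter missed) or implausibly large spikes."""
    flex = data[:, list(flex_cols)]
    valid = np.all((flex >= 0) & (flex <= max_flex), axis=1)
    return data[valid]
```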

During this analysis, I also found that the ring finger’s flex sensor value for W (finger straight) is quite similar to that of R, U, and V (finger bent). Upon further testing with the glove, I found that the flex sensor on the ring finger is not working as intended and gives roughly the same value regardless of bending angle, so we are looking to replace that sensor.

I believe we are still on schedule. In the next week, I’ll be collecting new data after the flex sensor has been replaced. I’ll also be doing some real-time testing and experimenting with new ideas for the smaller models (e.g., training on only the important features, or changing hyperparameters to produce models with slightly lower accuracy on the collected data but better generalizability). After that, we’ll be working on our final presentation and report.

Stephanie’s Status Report for 11/13

This week, I mostly worked on data analysis from real-time testing, data collection, and developing models specific to certain letters. Rachel and I worked together on a script for recording per-letter accuracy in real time, and I used the results to build a confusion matrix to pinpoint which letters tend to get misclassified more than others. I have also collected data from my friends.
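The confusion matrix can be built from the paired true/predicted letters recorded during real-time testing; a minimal sketch (the 26-letter label set and the paired-list input format are assumptions):

```python
import string
import numpy as np

LETTERS = string.ascii_uppercase  # the 26 letters

def confusion_matrix(true_letters, pred_letters):
    """26x26 matrix: rows are the signed letter, columns the prediction.
    Large off-diagonal entries show which letters get confused."""
    m = np.zeros((26, 26), dtype=int)
    for t, p in zip(true_letters, pred_letters):
        m[LETTERS.index(t), LETTERS.index(p)] += 1
    return m
```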

After the interim demo on Thursday with the class TAs, we got some helpful feedback. One suggestion was to add models for classifying similar letters. Right now, we are using one model to classify all 26 letters. The poses for the letters R, U, V, W are similar to each other, and the same goes for M, N, S, T, E. Hence we have decided to make two models specifically for these two groups of letters so that our glove can distinguish between the similar letters better. I have already developed two models based on the data collected so far; however, more data would help refine them, so I’ll be updating the models as we collect more.

Next week, I’ll be working on integrating these new models with our current classification code and analyzing the new method’s performance. As a team, we will also begin working on our final presentation slides. I would say we are on track with our new schedule. All we have left is to implement our new classification method and get audio to work without lag.

Stephanie’s Status Report for 11/6

This week, I worked on implementing new code for data parsing, model training, and testing, since we have decided to take in a window of data (three data points). We decided to keep random forest as our classifier for the new data because, after running through some default models, it still had the best accuracy. We are also considering a neural network, since it can achieve >90% accuracy after hyperparameter tuning and the higher-dimensional windowed data may lead to better performance with a neural network.
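The windowing step can be sketched as concatenating each run of three consecutive samples into one flattened feature vector; the window size matches the description above, while the exact feature layout in our pipeline is assumed:

```python
import numpy as np

def make_windows(samples, window=3):
    """Stack each run of `window` consecutive samples into a single
    flattened feature vector (higher-dimensional input for the model)."""
    samples = np.asarray(samples)
    n = len(samples) - window + 1
    return np.stack([samples[i:i + window].ravel() for i in range(n)])
```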

I also performed some data analysis on older sets of data, looking at letters that the classifiers confused even though their poses were not similar. I noticed two things that may help us in the future. First, there were some big outliers in the data we collected. The flex sensors’ readings usually fall between about 50 and 300 (for the letters I analyzed), but there were a few large outliers like 27600 that may have come from sensor errors or from the data stream not being read correctly. While the data are normalized before being used for training, I did not explicitly implement any step to remove outliers, so that’s something I will add for future model training. The second discovery is that the flex sensors sometimes had similar values when they shouldn’t have, e.g., a curved finger and a straightened finger produced similar ranges of values. This possibly led to the classifier being confused, and may have occurred because our sensors were not sewn tightly to the glove. After preliminary testing on our newest set of data, we found that the classifier had much less trouble identifying the dissimilar letters it was previously confused by.

We also met with two ASL users this week, one of whom is an interpreter. We talked about ethics and the extent of our project and got some helpful advice on addressing the scope and the intended users of our work.

We have made changes to our schedule since we realized that it doesn’t make sense to do bulk data collection until our glove build and data collection process are finalized and adjustments are made based on feedback. Hence we are still on schedule. In the next week, we will be collecting more data and I will work on training/testing with other models, since our newest classifier still confuses similar letters.


Stephanie’s Status Report for 10/30

This week, I worked on researching why the neural network’s accuracy increased with the added “space” and “transition” labels. To compare accuracies across datasets, I compiled three different sets of data: one with all the labels, one with the letter and space labels, and one with only letter labels. To see whether the differences in data actually affected overall accuracy, I performed hyperparameter tuning when training each dataset’s model to find its best-performing configuration. The results showed that the best-tuned models reached 93%+ accuracy on all three datasets. (The accuracy also varies between runs due to random initialization in the model.)
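The per-dataset tuning loop can be sketched as a plain grid search over hyperparameter combinations; the grid contents and the evaluation callback are placeholders, not our actual search space:

```python
from itertools import product

def grid_search(make_model, param_grid, evaluate):
    """Try every combination in param_grid; return (best_score, best_params).
    `evaluate` should train the model and return validation accuracy."""
    best_score, best_params = float("-inf"), None
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = evaluate(make_model(**params))
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params
```

In practice `make_model` would build e.g. a neural network with the given layer sizes and learning rate, and `evaluate` would train it on one of the three datasets and score a held-out split.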

In conclusion, the added data labels didn’t actually improve the accuracy of the model. Since neural networks can have varying numbers of hidden layers and choices of activation function, it’s likely that we simply had not found the optimal parameters for a particular dataset.

Furthermore, we found that our first set of data was mostly taken with the hand facing one direction, which may significantly impair model performance. This was confirmed when I tested the model trained on the first set of data: on our two newest datasets, it only achieved 4% and 21% accuracy. Hence it’s likely that we will not be using this set of data in the future.

We also adjusted the IMU on the glove this week and just collected a new set of data. Preliminary tests on this dataset showed high accuracy (98%), but the model trained on the second dataset only reached 50% accuracy on it, which signals a large change in the overall data. We’ll be doing more real-time testing next week. I’ll also be comparing this against our older data to see why the old model did not perform as well in real-time testing.

I would say we are a little behind schedule in terms of gathering data, as we are still contacting people who use ASL. But we’re doing well with model training, testing, and analysis.


Stephanie’s Status Report for 10/23

There has been a change of plan from the last two weeks. My original plan was to collect more data for model training; however, our glove needed more fabrication work to ensure the sensors are well attached. We also plan to enhance the glove’s data collection process. Our first set of data (collected by Rachel) required pressing buttons to mark the time window in which data would be read in, so this week we are trying to integrate real-time data collection and interpretation. We are also collecting more data for hand gestures in between each letter gesture (these are mostly random, since they are just transitions from one letter to another), and a ‘space’ label is added in case the models cannot categorize the ‘random’ gestures well. The first round of testing shows promising results: with the random forest model, which has been our most accurate model so far, the accuracy for recognizing these two new labels is quite high.

I also found that with these two new labels added, the neural network’s accuracy increased by 10%. This is an interesting finding and I plan to look further into why. Before this, I had done much tweaking of the model and validation with different hyperparameters, but the network’s accuracy seemed capped around 80%; with this new dataset, it reached around 88%. Understanding why would help our future classification work.

I would say we are still on schedule since we planned a lot of time for the software implementation. Next week, I’ll be looking further into the models, working on making our data collection smoother, and, if possible, starting to collect data from others.

Stephanie’s Status Report for 10/9

This week, Sophia finished building the glove and Rachel was able to get our first set of real data. Since I had already performed validation tests on my models with the generated fake data, I used these tuned models on the real data after preprocessing it. Surprisingly, the results were overall better than on the fake data, which suggests our fake data was not well generated; one possibility is that we included too much variance when generating sensor values. However, though the accuracy metrics were quite different, the trends remained the same: the random forest classifier achieved the highest accuracy while the perceptron had the lowest. I also did some extra tuning with the neural net, but there wasn’t any significant improvement in accuracy, likely because our data isn’t high-dimensional. One caveat is that this set of real data is only from Rachel, so there is a possibility of overfitting, which would explain the high accuracy metrics.

In terms of schedule, we are actually ahead. We were able to get data from both types of sensors. We do need to work on getting consistent data and on the craftsmanship of the glove, since Rachel mentioned some parts came undone. We will need to make sure the sensors on the glove are stable before moving on to collect data from others.

Next week, I’ll be working with the team on fixing the glove and gathering more training data, starting with Sophia and me. If time and resources permit, we will try to find others who can sign for us. We will also work on finishing up the design report.

Stephanie’s Status Report for 10/2/2021

This week, my team and I worked collaboratively on the design review slides. Since we changed the number of gestures to recognize from 5 common signs to 21 ASL letters, we had to make sure to reflect our new scope. With the expanded scope, I updated my data generation algorithm to include the ASL letters. I have also examined the best-performing models in depth, such as by trying different parameters, to see if they can beat the default models.

We are a bit behind schedule because our orders did not all arrive until Friday, putting us behind on building the glove and getting real data. We may have to speed up and do some extra work next week to ensure we can get consistent data from both types of sensors. This setback is quite minor in my opinion, since we have already started on glove building, and pre-selecting ML models can save us time in the future.

Next week, we will have the glove built and will be able to get real data. I will work on processing that data to ensure it is suitable for model training. We will also sign some gestures to obtain a preliminary dataset. Using this data, I’ll test the models that performed best on the generated data to find which one does well on real data, and perform further fine-tuning to improve their accuracies.

Stephanie’s Status Report for 9/25/2021

This week Rachel and I worked together on generating fake data for the glove’s sensors to test out different machine learning models. We decided to each generate a different set of data; having data with some differences helps ensure the models generalize well and do not overfit to a particular set of data.

Each data sample consists of the bend angles of each finger from the flex sensors and nine values from the IMU’s three components (accelerometer, gyroscope, and magnetometer). The bend angles for each finger are estimated by observing the finger’s pose for each gesture, and the values are then sampled from a normal distribution with some variance to account for variations in poses. There’s also a possibility of generating completely random numbers to account for outliers. A similar procedure is used for the IMU data: values are generated by estimating a reasonable range based on the directions and orientations of the gestures, with the estimates based on an IMU data sample sheet I found online.
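The sampling procedure above can be sketched as follows; the per-finger mean angles, standard deviations, and IMU means in this example are illustrative placeholders, not our actual estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_sample(mean_bends, mean_imu, bend_std=5.0, imu_std=0.5):
    """One fake sample: 5 flex-sensor bend angles plus 9 IMU values,
    each drawn from a normal distribution around hand-estimated means."""
    bends = rng.normal(mean_bends, bend_std)  # degrees, one per finger
    imu = rng.normal(mean_imu, imu_std)       # accel/gyro/magnetometer axes
    return np.concatenate([bends, imu])
```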

I have also done some preliminary model testing to check accuracies. The models are trained and tested on the data that I generated, so there is a possibility of overfitting. The models I tested include SVM, perceptron, KNN, random forest, and neural networks. So far, most of these models meet our accuracy threshold, but more testing should be done with different datasets and more model refinement is needed.
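A simple harness for this kind of side-by-side comparison could look like the sketch below, assuming scikit-learn-style estimators (anything with `fit`/`predict`); the model names are examples:

```python
def compare_models(models, X_train, y_train, X_test, y_test):
    """Train each candidate and return {name: accuracy on the test split}."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        preds = model.predict(X_test)
        correct = sum(p == t for p, t in zip(preds, y_test))
        scores[name] = correct / len(y_test)
    return scores
```

In our case `models` would be something like `{"svm": SVC(), "knn": KNeighborsClassifier(), "random_forest": RandomForestClassifier(), ...}` with default parameters.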

I believe we are on schedule. We have placed our order for the parts needed for the glove and started researching models to use. Next week, I plan to dive deeper into each model to find the parameters that achieve the best accuracy and latency, and to test them on different datasets.

Stephanie’s Status Report for 9/18

During our mandatory lab meeting, the team collaboratively researched project design and implementation and worked on the presentation slides. We will be meeting this Sunday to finish up the slides and finalize the types of sensors that we will be using. Since most of our project must be done sequentially, the team will not be splitting up a lot of the work.

I took on the task of setting up the team’s website and writing the project overview. I also looked into some machine learning models that our project can use and established pros and cons for each.

The team is slightly behind schedule on ordering the glove and sensors, but we’ll be ordering them right away after this Sunday’s meeting.

In the coming week, I will present our proposal to the class, start looking deeper into viable machine learning models, and perform tests to see which models give higher accuracy. If possible, I will also implement a baseline code structure for classifying gestures.