This week, I implemented new code for data parsing, model training, and testing, since we have decided to take in a window of data (three data points) as input. We are keeping random forest as our classifier for the new data because, after running several models with default settings, it still has the best accuracy. We are also considering a neural network, since it can achieve >90% accuracy after hyperparameter tuning and may perform better on the high-dimensional windowed data.
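As a rough illustration of the windowing idea, here is a minimal sketch that flattens every three consecutive readings into one feature vector. The function name `make_windows`, the sliding (overlapping) window, and the rule of skipping windows that span two letters are all assumptions made for illustration, not a description of our actual parsing code.

```python
from typing import List, Sequence, Tuple


def make_windows(
    readings: Sequence[Sequence[float]],
    labels: Sequence[str],
    window: int = 3,
) -> Tuple[List[List[float]], List[str]]:
    """Flatten each run of `window` consecutive readings into one feature vector.

    Hypothetical helper: assumes one row of sensor values per data point and
    one letter label per row.
    """
    if len(readings) != len(labels):
        raise ValueError("readings and labels must have the same length")

    features: List[List[float]] = []
    targets: List[str] = []
    for start in range(len(readings) - window + 1):
        chunk_labels = labels[start:start + window]
        # Assumption: skip windows that straddle two different letters.
        if len(set(chunk_labels)) != 1:
            continue
        # Concatenate the rows so the classifier sees window * n_sensors values.
        flat = [value for row in readings[start:start + window] for value in row]
        features.append(flat)
        targets.append(chunk_labels[0])
    return features, targets
```

The flattened vectors are three times as long as a single reading, which is where the higher dimensionality comes from. The resulting `features` and `targets` lists could then be passed to whichever classifier is being trained.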
I also analyzed older sets of data for letters that the classifiers confused even though their poses are not similar. I noticed two things that may help us in the future. First, the data we collected contained some large outliers. The flex sensors’ readings usually fall roughly in the 50 to 300 range (for the letters I analyzed), but there were a few values as large as 27600, which may have come from sensor errors or from the data stream not being read correctly. While the data are normalized before being used for training, I did not explicitly implement any step to remove outliers, so I will add that for future model training. Second, the flex sensors sometimes had similar values when they shouldn’t have, e.g. a curved finger and a straightened finger produced similar ranges of values. This possibly confused the classifier, and it may have occurred because our sensors were not sewn tightly to the glove. In preliminary testing on our newest set of data, the classifier had much less trouble distinguishing the dissimilar letters it previously confused.
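One possible way to add the missing outlier step is sketched below. It drops any sample whose value in some column is far from that column's median, measured in units of the median absolute deviation (MAD). The function name `drop_outlier_rows` and the cutoff of 10 are assumptions chosen for illustration, and the filter would have to run before normalization, since a value like 27600 can distort a range-based scaling.

```python
from statistics import median
from typing import List, Sequence


def drop_outlier_rows(
    rows: Sequence[Sequence[float]],
    threshold: float = 10.0,
) -> List[List[float]]:
    """Drop rows where any value is far from its column median (MAD rule).

    Hypothetical helper: `threshold` is an assumed cutoff, not a tuned value.
    """
    if not rows:
        return []

    columns = list(zip(*rows))
    medians = [median(col) for col in columns]
    # Median absolute deviation per column: a spread estimate that the
    # outliers themselves barely affect.
    mads = [
        median([abs(value - m) for value in col])
        for col, m in zip(columns, medians)
    ]

    kept: List[List[float]] = []
    for row in rows:
        is_outlier = False
        for value, m, mad in zip(row, medians, mads):
            # If a column has zero spread, fall back to the raw deviation.
            scale = mad if mad > 0 else 1.0
            if abs(value - m) / scale > threshold:
                is_outlier = True
                break
        if not is_outlier:
            kept.append(list(row))
    return kept
```

The median and MAD are used instead of the mean and standard deviation because a single very large reading inflates the standard deviation enough to hide itself.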
We also met with two ASL users this week, one of whom is an interpreter. We talked about ethics and the extent of our project, and got some helpful advice on addressing the scope and the users of our work.
We have changed our schedule, since we realized that it doesn’t make sense to collect data in bulk until our glove build and data collection process are finalized and adjusted based on feedback. With that change, we are still on schedule. Next week, we will collect more data, and I will work on training and testing other models, since our newest classifier still gets confused on similar letters.
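To track which letters a model still mixes up when comparing classifiers, a small tally like the following sketch could help. It counts each (true letter, predicted letter) pair among the misclassified samples. The function name `most_confused_pairs` is hypothetical, and the inputs are assumed to be two equal-length lists of letter labels from a test run.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def most_confused_pairs(
    true_labels: Sequence[str],
    predicted_labels: Sequence[str],
    top: int = 5,
) -> List[Tuple[Tuple[str, str], int]]:
    """Return the most frequent (true, predicted) pairs among the errors."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    # Count only misclassifications; correct predictions are ignored.
    counts = Counter(
        (truth, guess)
        for truth, guess in zip(true_labels, predicted_labels)
        if truth != guess
    )
    return counts.most_common(top)
```

Running this on each candidate model's test predictions would show whether a new model reduces confusion between similar letters or only shifts it to different pairs.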