I spent most of this week trying to figure out what was going wrong with the audio output for our system. When I made a gesture, it would say “A” over and over, and nothing else would be output until the stream of “A”s had finished. I thought this was because the glove was continually making predictions and the audio outputs were overlapping each other. However, after trying lots of different things, I figured out that the audio files themselves were corrupted. I’m not really sure what happened, but I regenerated the audio file for each letter and they’re fine now, so the audio integration is complete. Currently, our system makes approximately 20 predictions per second, but only outputs audio when it gets 7 identical predictions in a row. This number was chosen to meet our specification of 2 gestures per second while leaving some room for transitions: 7 predictions take about 0.35 seconds at that rate, which leaves roughly 0.15 seconds of each half-second gesture window for moving between signs.
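To make the "7 in a row" rule concrete, here is a minimal sketch of that kind of filter in Python. It is not our actual script: the class name `PredictionDebouncer` and its `update` method are made up for illustration, and it assumes a letter should be spoken once per streak rather than repeated while the gesture is held.

```python
from typing import Optional


class PredictionDebouncer:
    """Report a label only after it has been predicted `required` times in a row.

    Hypothetical helper for illustration; names and behavior are assumptions.
    """

    def __init__(self, required: int = 7) -> None:
        self.required = required
        self.last: Optional[str] = None  # most recent label seen
        self.count = 0                   # length of the current streak

    def update(self, label: str) -> Optional[str]:
        """Feed one prediction; return the label if it should be spoken now."""
        if label == self.last:
            self.count += 1
        else:
            # A different label breaks the streak and starts a new one.
            self.last = label
            self.count = 1
        # Fire exactly once, on the prediction that completes the streak,
        # so holding a gesture does not trigger the audio again (assumption).
        if self.count == self.required:
            return label
        return None
```

In a prediction loop, each new model output would be passed to `update`, and the audio clip would be played only when the return value is not `None`.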
I also modified the script to use smaller models for the letters R, U, V, and W as well as M, N, S, T, and E. When our main model outputs any of those letters, the data gets put through a second model for further classification. These smaller models only choose among the letters in their group, which are easily confused with each other. We found that this gives slightly better results, and we are still doing data analysis to determine which data values to train on and whether these letters are even distinguishable based on our sensor data.
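The two-stage setup can be sketched roughly as follows. This is an illustration under assumptions, not our real code: it treats R/U/V/W and M/N/S/T/E as two separate groups with one small model each, it models every classifier as a plain function from a feature vector to a letter, and `build_router` is a hypothetical name.

```python
from typing import Callable, Dict, FrozenSet, Sequence

# A classifier is modeled as any function from sensor features to a letter.
Classifier = Callable[[Sequence[float]], str]


def build_router(
    main_model: Classifier,
    group_models: Dict[FrozenSet[str], Classifier],
) -> Classifier:
    """Wrap a main model so confusable letters get a second opinion.

    `group_models` maps each group of confusable letters to the smaller
    model trained only on that group (hypothetical structure).
    """
    # Flatten the groups into a letter -> specialist lookup table.
    letter_to_model = {
        letter: model
        for group, model in group_models.items()
        for letter in group
    }

    def classify(features: Sequence[float]) -> str:
        first_guess = main_model(features)
        specialist = letter_to_model.get(first_guess)
        if specialist is None:
            # Letter is not in a confusable group: trust the main model.
            return first_guess
        # Otherwise let the smaller model decide within its group.
        return specialist(features)

    return classify
```

For example, `build_router(main, {frozenset("RUVW"): ruvw_model, frozenset("MNSTE"): mnste_model})` would return a single function that can replace the main model in the prediction loop.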
I believe we are still on track since the functionality of our glove is complete. However, we did find that one of the flex sensors was not outputting useful data (it consistently reads around 256 degrees no matter how much it is bent), so we will need to replace that sensor and recollect data. After that, all that is left is to prepare our final presentation, make the final video, and update our design review report to turn it into the final report. After Thanksgiving, I will collect data with our fixed glove and work on the final report, video, and presentation!