Stephanie’s Status Report for 11/13

This week, I mostly worked on data analysis with real-time testing, data collection, and developing models specific to certain letters. Rachel and I worked together on writing the script for recording per-letter accuracy in real time, and I used the results to make a confusion matrix pinpointing which letters tend to be misclassified more often than others. I have also collected data from my friends.
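
As a rough illustration, the logged real-time results can be turned into a confusion matrix along these lines (the label lists below are placeholders, not our actual data):

    import string
    import matplotlib.pyplot as plt
    from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

    letters = list(string.ascii_uppercase)
    y_true = ["A", "M", "N", "R", "U"]   # placeholder ground-truth letters
    y_pred = ["A", "N", "N", "U", "U"]   # placeholder predictions from the glove

    cm = confusion_matrix(y_true, y_pred, labels=letters)
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=letters)
    disp.plot(include_values=False, xticks_rotation="vertical")
    plt.title("Real-time per-letter confusion")
    plt.show()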

After the interim demo on Thursday with the class TAs, we got some helpful feedback. One suggestion was to add models for classifying similar letters. Right now, we are using one model to classify all 26 letters. The poses for the letters r, u, v, and w are similar to each other, and the same goes for m, n, s, t, and e. Hence, we have decided to make two models specifically for these two groups of letters so that our glove can distinguish between the similar letters better. I have already developed two models based on the data collected so far; however, having more data would help refine them further, so I’ll be updating the models as we get more data.
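
A rough sketch of how the two specialist models could be used at prediction time is below. The model names are placeholders for separately trained classifiers (each specialist trained only on its group's samples), and this only shows the routing idea, not our final integration:

    # Hypothetical two-stage routing sketch for the similar-letter groups.
    GROUP_RUVW = {"r", "u", "v", "w"}
    GROUP_MNSTE = {"m", "n", "s", "t", "e"}

    def predict_letter(features, general_model, ruvw_model, mnste_model):
        """Classify one window of sensor readings, re-checking similar letters."""
        letter = general_model.predict([features])[0]
        if letter in GROUP_RUVW:        # hand off to the r/u/v/w specialist
            letter = ruvw_model.predict([features])[0]
        elif letter in GROUP_MNSTE:     # hand off to the m/n/s/t/e specialist
            letter = mnste_model.predict([features])[0]
        return letter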

Next week, I’ll be working on integrating these new models with our current classification code and analyzing the new method’s performance. As a team, we will also begin working on our final presentation slides. I would say we are on track with our new schedule. All we have left is to implement our new classification method and get audio to work without lag.

Rachel’s Status Report for 11/13

This week, I mostly worked on data collection and adapting our scripts to incorporate audio output. I found two other modules (gtts and playsound) that are able to create and play audio files relatively quickly (without noticeable delay from a user’s perspective), so we will be using those instead of pyttsx3, which had a really long delay. I added some handshaking signals between the Arduino and Python programs, which slowed the prediction and output rate down to about 0.27 gestures per second, significantly below our target of 2 gestures per second. While changing the Arduino script back, I noticed that I was sending newline characters; the Python script ignored them, but each of those lines could have carried actual data instead. After fixing that, our glove can make about 17 predictions per second. I am currently working on incorporating the audio properly, so that there isn’t a lag between the streamed-in data and the outputted audio; for reasons unknown to me at the moment, the handshaking signals I was passing around before are not working. Since the changes we plan to make in the next couple of weeks do not involve changes to what the data looks like, I also had my housemates collect data for us to train on.
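
For reference, the basic gtts + playsound path looks roughly like the sketch below (the file name and example letter are placeholders, and gTTS does require an internet connection to synthesize the audio):

    from gtts import gTTS
    from playsound import playsound

    def speak_letter(letter):
        """Synthesize the classified letter to an MP3 and play it."""
        gTTS(text=letter, lang="en").save("letter.mp3")  # needs an internet connection
        playsound("letter.mp3")                          # blocks until playback ends

    speak_letter("A")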

This week, I plan on fully integrating the audio and getting more people to collect data. I will also begin to work on the final presentation slides as well as the final report. I would say we are on track, since all that remains is collecting data and training our final model (we are near the end!). We have also ordered a Bluetooth Arduino Nano, which we will have to swap in for our current Arduino; this will also require some changes to the scripts that we have been using, but it shouldn’t become a blocker for us.

Sophia’s Status Report for 11/6

This week I arranged meetings with Daniel, an ASL user at CMU, and with Danielle, an ASL interpreter. From Daniel, we learned that in fingerspelling, double letters are usually signed with a single fluid movement rather than by signing the letter twice. This is something we might want to keep in mind when trying to parse letters together into words. Next week we plan to meet with him in person and collect data from him to train our model.

Danielle had a lot of feedback poking holes in the use case we defined for our product. She pointed out that the ASL alphabet is used just for proper nouns, so you can’t really communicate in ASL with just the alphabet. Furthermore, ASL signs have five different aspects: handshape, palm orientation, location, movement, and non-manual signals (body language and facial expression, which serve the role of tone of voice). The glove can’t really pick up the non-manual signals, so at best the glove can translate, but not interpret. She explained to us that a lot of ASL grammar is actually based on non-manual signals. She also pointed out that ASL users don’t usually like to have things obstructing their hand movement; she shared that she doesn’t like to sign, or feels she signs awkwardly, when she wears a ring on her finger. With this input, we should look into a way to measure how much resistance the glove adds to hand movement.

We are on schedule, but we had to readjust the schedule since adjustments to our product required us to recollect training data.

Team’s Status Report for 11/6

This week, we implemented our new data collection process (collecting a window of three snapshots) and collected data from each of our team members. We also integrated the real-time testing with the glove so the software system can record the accuracy of each letter tested. We found that the letters with movement (J and Z) perform much better, and the classifier recognizes them more frequently. We also found that the classifier recognizes dissimilar letters better but does get confused on similar letters (e.g., M and N). In order for the classifier to discern between them better, we will need to get more data. Overall, using a window of three data points has improved accuracy compared to our older model. The system is also recognizing about 3-4 gestures per second with the new data collection process. This rate is more suitable for actual usage, since the average rate of signing is about 2 signs per second.
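
To illustrate the windowing change, a minimal sketch of how three consecutive snapshots could be flattened into one training sample is below (the per-snapshot layout of 5 flex readings plus 6 IMU values is an assumption for the example, not our exact format):

    import numpy as np

    SNAPSHOT_LEN = 11   # assumed: 5 flex sensor readings + 6 IMU values per snapshot

    def window_to_sample(snapshots):
        """Flatten three consecutive snapshots into one feature vector."""
        assert len(snapshots) == 3
        return np.concatenate([np.asarray(s, dtype=float) for s in snapshots])

    window = [np.random.rand(SNAPSHOT_LEN) for _ in range(3)]   # fake data
    print(window_to_sample(window).shape)                       # -> (33,)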

We also met up with an ASL user and an interpreter. Both gave useful feedback pertaining to ethics, use cases, and the capabilities of the glove for our project.

In terms of schedule, we are on the right track. Next week we will be meeting with the ASL users that we have contacted to collect data from them. Ideally, we will be getting people of different hand sizes for data collection. We will also start on refining our classifier to better recognize similar letters and implementing the audio output.

Stephanie’s Status Report for 11/6

This week, I worked on implementing new code for data parsing and model training/testing since we have decided to take in a window of data (three data points). We still decided to use a random forest for our classifier with the new data because, after running through some default models, it is still the model with the best accuracy. We are also considering a neural network, since it can achieve >90% accuracy after hyperparameter tuning, and the higher-dimensional windowed data may lead to better performance with a neural network.
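
A hedged sketch of this kind of comparison is below; X and y are stand-ins for our windowed sensor features and letter labels, and the hyperparameters shown are illustrative rather than the ones we actually used:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X = np.random.rand(200, 33)                # stand-in for windowed sensor features
    y = np.random.choice(list("ABCDE"), 200)   # stand-in for letter labels

    models = {
        "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "neural net": make_pipeline(StandardScaler(),
                                    MLPClassifier(hidden_layer_sizes=(64, 32),
                                                  max_iter=1000, random_state=0)),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")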

I also performed some data analysis on older sets of data, looking at letters that the classifiers confused even though their poses are not similar. I noticed two things that may help us in the future. First, there were some big outliers in the data we collected. The flex sensors’ readings usually fall roughly between 50 and 300 (for the letters I analyzed), but there were a few large outliers like 27600 that may have occurred due to sensor errors or the data stream not being read correctly. While the data are normalized before being used for training, I did not explicitly implement any step to remove outliers, so that’s something I will add for future model training. The second discovery is that the flex sensors sometimes had similar values when they shouldn’t have; e.g., a curved finger and a straightened finger produced a similar range of values. This possibly contributed to the classifier’s confusion, and may have occurred because our sensors were not sewn tightly enough to the glove. After the preliminary testing done on our newest set of data, we found that the classifier had much less trouble identifying the dissimilar letters it had previously confused.
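
The outlier-removal step could be as simple as the sketch below (the column layout and the 0-1000 cutoff are placeholders that would be tuned against the observed 50-300 range):

    import numpy as np

    FLEX_COLS = slice(0, 5)   # assumed layout: first five columns are the flex sensors

    def drop_flex_outliers(X, y, low=0, high=1000):
        """Drop samples whose flex readings fall outside a plausible range."""
        flex = X[:, FLEX_COLS]
        keep = np.all((flex >= low) & (flex <= high), axis=1)
        return X[keep], y[keep]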

We met with two ASL users too this week, one of whom is an interpreter. We talked about ethics and the extent of our project and got some helpful advice on addressing the scope and the users of our work.

We have made changes to our schedule since we realized that it doesn’t make sense to do bulk data collection until our glove build and data collection process are finalized and adjustments are made based on feedback. With that change, we are still on schedule. In the next week, we will be collecting more data, and I will work on training/testing with other models since our newest classifier still gets confused on similar letters.

Rachel’s Status Report for 11/6

This week, I worked on some of the things we had identified to help improve our glove’s performance. I sewed the flex sensors down (so that each is connected to the glove at more than just a few points). Before, the sensors would bend in between attachment points when our fingers were too straight, and the degree of that bend was contributing to inconsistencies between our training data and our real-time data. Sewing down the sensors helped with that a lot, and now we are seeing that the glove behaves as expected: it is confused on letters that we know are similar but not on ones that we know should be distinct. For example, u and w are quite distinct letters in the ASL alphabet but were commonly misclassified in our previous iteration, while u and r were not, even though they are significantly more similar. I believe this was due to the bending issue discussed earlier; now our glove gets confused on r and u but not on u and w. We also discussed using a window of time to see if we could capture the letters that involve motion more accurately, and I adjusted the Python and Arduino scripts to do that as well. After collecting data and testing with a window of 3 consecutive time instances, we found a few benefits: the window of time is better at capturing the gestures requiring movement and also better at capturing the transitions between letters (it knows, for the most part, that those are not to be classified).
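
The Python side of the windowing change could look roughly like this sketch (the serial port, baud rate, and comma-separated line format are assumptions for the example, not necessarily our exact protocol):

    from collections import deque
    import serial   # pyserial

    WINDOW = 3
    ser = serial.Serial("/dev/ttyACM0", 115200, timeout=1)   # placeholder port/baud
    recent = deque(maxlen=WINDOW)   # always holds the three most recent snapshots

    while True:
        line = ser.readline().decode("utf-8", errors="ignore").strip()
        if not line:
            continue
        recent.append([float(v) for v in line.split(",")])
        if len(recent) == WINDOW:
            features = [v for snapshot in recent for v in snapshot]
            # features would be handed to the classifier here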

We also met with an ASL user as well as an interpreter to discuss possible improvements to our design and the ethics of our project. One important point that we discussed in both meetings is that we should re-word the scope and use case of our project and add what this can be extended to as “future work.”

I would say we are still on track. We discussed our schedule as a group and determined that our initial schedule, with most of our data collection early on and iterating later, didn’t make sense because iterating on our glove would require getting new data. We have modified our schedule so that we are still in the iterating phase, and the bulk of the data collection will happen after we settle on what we think is the best implementation we can achieve in the time remaining. This week, I will modify the Python and Arduino scripts to output audio. I experimented with this a bit this week and found that there are weird syncing issues if only the Python script is modified; I believe I need to send the Arduino a signal to wait on sending data while the audio is playing. We will hopefully also decide on our final design and gather data from various people.
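
That pause signal might look something like the sketch below; the 'P'/'R' bytes and the port are made-up placeholders, and the Arduino sketch would need matching logic to stop and resume streaming:

    import serial                    # pyserial
    from playsound import playsound

    ser = serial.Serial("/dev/ttyACM0", 115200, timeout=1)   # placeholder port/baud

    def speak_with_pause(audio_file):
        """Pause the Arduino's data stream while the audio for a letter plays."""
        ser.write(b"P")        # hypothetical "pause" byte the Arduino would listen for
        playsound(audio_file)  # blocks until playback finishes
        ser.write(b"R")        # hypothetical "resume" byte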

Team’s Status Report for 10/30

This week, we collected data from each of our team members and integrated the glove with the software system so that classification can be done in real time. We found that some letters that we expected to have high accuracy performed poorly in real time; namely, the letters with movement (J and Z) did not do well. We also found that different letters performed poorly for each of our group members.

After our meeting with Byron and Funmbi, we had a bunch of things to try out. To see whether our issue was with the data we had collected or perhaps with the placement of the IMU, we did some data analysis on our existing data and moved the IMU from the wrist to the back of the palm. We found that the gyroscope and accelerometer data for the letters with movement are surprisingly not variable; this means that when we were testing in real time, the incoming data was likely different from the training data and thus resulted in poor classification. The data from the IMU on the back of the hand gives 98% accuracy on just the data collected from Rachel; we will be testing it in real time this coming week.

We also found that our system can currently classify about 8.947 gestures per second, though this number will change when we incorporate the audio output. This rate is also far higher than needed for actual usage, since people cannot sign that fast.

We are also in contact with a couple of ASL users who the office of disabilities connected us with.

We are still on schedule. This week we will work on parsing the letters (not just continually classifying them). We are also going to take data from a variety of people, ideally with different hand sizes. We will experiment with capturing data over a time interval to see if that yields better results. We will also be improving the construction of the glove by sewing the flex sensors down more (so that they fit the glove more closely) and doing a deeper dive into our data and models to understand why they perform the way they do. We will also hopefully be able to meet with the ASL users we are in contact with.

Rachel’s Status Report for 10/30

This week during lab, our group did a lot of work together. We collected data from each of us and found a similar accuracy between the new data and the data that I had collected earlier on just myself, so we can have confidence that with a wider variety of user data, we will be able to generalize our glove. I mostly spent my time outside of lab writing the various scripts we will use for collecting data, adding to existing data, and running the glove with real-time classification. Currently, our system can make approximately 8.947 classifications per second.

There are certain letters that consistently classify poorly: letters that require movement (J, Z) and letters that are similar to others (N, T, M, X, etc.). It makes sense for similar letters to be classified incorrectly more often because they produce sensor data similar to other letters. For J and Z, we hypothesized that the poor classification is because our data collection procedure does not necessarily collect data at the same point in each movement (we are using a button press to tell the system to collect data). To investigate, I plotted some graphs of the letter J (a gesture requiring movement) and P (a gesture that is consistently correctly classified). Interestingly, the issue seems to be that certain measurements from the gyroscope and accelerometer were too consistent, so when we were using the glove in real time, the differing measurements were throwing the classification off. Here is an example of one of those measurements (accelerometer x):
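
A comparison plot like that can be produced with a short script along the following lines (the per-letter CSV file names and the column name are placeholders, not our actual export format):

    import matplotlib.pyplot as plt
    import pandas as pd

    for letter, fname in [("J", "data_J.csv"), ("P", "data_P.csv")]:   # placeholder files
        df = pd.read_csv(fname)
        plt.plot(df["accel_x"].to_numpy(), label=f"letter {letter}")   # placeholder column

    plt.xlabel("sample index")
    plt.ylabel("accelerometer x reading")
    plt.legend()
    plt.show()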

I also collected new data experimenting with the IMU on the back of the palm rather than on the wrist. The hope is that because the back of the palm will have more movement, our classification will pick up the movement from the letters J and Z better.

I would say we are on schedule. This week, I will work to gather data from various people (ideally with varying hand sizes) and search for a better text-to-speech package, since pyttsx3 has a very noticeable delay that would not allow a smooth user experience. I will also work with my team to figure out how to determine when a sign should be classified. Currently, our plan to use a classification probability threshold along with a “transition” class would not account for pauses in signing (perhaps to think of the next letter to sign).
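
For context, the thresholding idea we have in mind would look roughly like the sketch below (the threshold value is a placeholder that would need tuning, and the "transition" label name is assumed):

    import numpy as np

    CONFIDENCE_THRESHOLD = 0.8   # placeholder value; would need tuning

    def gate_prediction(model, features):
        """Return a letter only for confident, non-transition predictions, else None."""
        probs = model.predict_proba([features])[0]
        best = int(np.argmax(probs))
        label = model.classes_[best]
        if label == "transition" or probs[best] < CONFIDENCE_THRESHOLD:
            return None
        return label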

Stephanie’s Status Report for 10/30

This week, I worked on investigating why the neural network’s accuracy increased with the added labels of “space” and “transition”. To compare the accuracies of different data sets, I compiled three different sets of data: one with all the labels, one with the letter and space labels, and one with only letter labels. To see whether the differences in data affected the overall accuracy, I performed hyperparameter tuning when training each data set’s model to find its best achievable accuracy. The results showed that the best-tuned models were able to reach 93%+ accuracy on all three data sets. (The accuracy also varies between runs due to random initialization in the model.)
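
A hedged sketch of this kind of hyperparameter tuning is below; the parameter grid is illustrative rather than the exact one searched, and X and y are stand-ins for one of the compiled data sets:

    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X = np.random.rand(300, 33)                                         # stand-in features
    y = np.random.choice(list("ABCDE") + ["space", "transition"], 300)  # stand-in labels

    pipe = Pipeline([("scale", StandardScaler()),
                     ("mlp", MLPClassifier(max_iter=1000, random_state=0))])
    param_grid = {
        "mlp__hidden_layer_sizes": [(32,), (64, 32), (128, 64)],
        "mlp__activation": ["relu", "tanh"],
        "mlp__alpha": [1e-4, 1e-3],
    }
    search = GridSearchCV(pipe, param_grid, cv=3)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)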

In conclusion, the added data labels didn’t actually improve the accuracy of the model. Since neural networks can have various numbers of hidden layers and choices of activation function, it’s likely that we simply had not found the optimal parameters for a particular dataset before.

Furthermore, we found that our first set of data was mostly taken with the hand facing one direction. This may significantly impair our model performance. This was further confirmed when I tested the model trained on the first set of data: on our two newest sets of data, it showed only 4% and 21% accuracy. Hence, it’s likely that we will not be using this set of data in the future.

We also adjusted the IMU on the glove this week and just got a new set of data. Preliminary tests on this dataset showed high accuracy (98%), but the model trained on the second data set only had 50% accuracy on this set, which signifies that this is a large change to the overall data. We’ll be doing more real-time testing next week. I’ll also be working on comparing this data with our older data to see why the old model did not perform as well in real-time testing.

I would say we are a little behind schedule in terms of gathering data, as we are still contacting people who use ASL. But we’re doing well with training/testing and model analysis.

Sophia’s Status Report for 10/30

This week I reached out to the office of disabilities at CMU to get in contact with actual ASL users. There is a post-doc at CMU who uses ASL and has agreed to talk with us and provide insight into our project. There is also an ASL interpreter interested in helping out with our project.

In addition to reaching out to ASL users, I wrote up a data collection procedure that we can give to people so they can collect data more autonomously.

After our discussion with Byron, we identified the next steps we need to take to improve our prototype. Right now we are going to focus on improving the classification accuracy. I moved the IMU to the back of the glove in a makeshift way. There are a lot of wires, so it’s quite messy, but it will allow us to see if the change in IMU position helps differentiate between signs.

This upcoming week I need to restitch some of the flex sensors, because when a pose is made, the flex sensors don’t always bend in a consistent way due to the limited stitches. The embroidery thread has finally arrived, so I will stitch over each of the flex sensors this upcoming week.

I think we are on schedule, but I’m not confident we will be able to parse letters into words. However, this was not a functionality we originally defined in our requirements.