Sung’s Status Report for 04/25

HI EVERYONE

The past week, I spent countless hours trying to get OpenPose running on AWS. After about 3 full days, 8 different machine images, and 5 different machines, I finally got OpenPose working on Ubuntu 16.04. It’s unfortunate that it works on nothing else, but Ubuntu 16.04 works, so I’ll take it. With the GPU on the AWS EC2 P2 instances, I was able to run images through very quickly and enlarge the training data to about 5000 samples. I’m training on more data this week and implementing a confusion matrix to dig into the few misclassifications.
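As a rough sketch of what the confusion matrix step could look like (this assumes scikit-learn and uses placeholder gesture labels, not our actual classes):

from sklearn.metrics import confusion_matrix

# Placeholder gesture classes and example label lists; in practice these
# would come from an evaluation run of the classifier
gesture_labels = ["A", "B", "C"]
y_true = ["A", "B", "B", "C", "A"]   # ground-truth gestures
y_pred = ["A", "B", "C", "C", "A"]   # classifier predictions

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred, labels=gesture_labels)
print(cm)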

One thing I found about the speedup: we want OpenPose to run as fast as possible, and that happens when the image quality is low. An image of about 200 KB runs in about 2 seconds on AWS, whereas a 2 MB image takes about 14 seconds. Because of this, we will have to use image compression so that we can feed in lower-quality images and boost our speedup.
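Something like the following OpenCV step is what I have in mind for the compression (the paths, scale factor, and JPEG quality are placeholders to be tuned):

import cv2

# Downscale the frame and re-encode it as a lower-quality JPEG before
# sending it to OpenPose on AWS
img = cv2.imread("frame.png")
small = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)
cv2.imwrite("frame_small.jpg", small, [cv2.IMWRITE_JPEG_QUALITY, 70])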

Jeff’s Status Report For 04/25

Hello,

This week I got my computer back and life is much better. I have been able to make progress on getting the Django web application redeployed to AWS. It is still proving to be a little tricky.

As for the OpenCV hand recognizer and NN classifier, I tested the performance and realized that simply normalizing the image gave very poor accuracy (<60 percent), and as a result the classifier would also be very bad. Thus, given that Sung’s SVM is performing well, we have decided to abandon the secondary classifier.

Instead, I am now using OpenCV to generate different lighting conditions from our test images. By using gamma correction, we can artificially simulate different lighting conditions to test our classifier. I am also experimenting with using OpenCV to resize images, since smaller images give better OpenPose performance, to see how small we can go before recognition stops working.
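The gamma correction itself is roughly the following lookup-table approach (the file name and gamma values are just examples):

import cv2
import numpy as np

def adjust_gamma(img, gamma):
    # Build a lookup table mapping each pixel value [0, 255] to its
    # gamma-adjusted value, then apply it to the whole image
    inv = 1.0 / gamma
    table = np.array([((i / 255.0) ** inv) * 255 for i in range(256)]).astype("uint8")
    return cv2.LUT(img, table)

img = cv2.imread("test_gesture.jpg")  # placeholder path
darker = adjust_gamma(img, 0.5)       # simulates a dim room
brighter = adjust_gamma(img, 1.5)     # simulates strong lighting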

Next week, I plan to finally wrap up the web application deployment and hopefully smash my old computer for Marios.

Jeff’s Status Report For 04/11

This week I prepared for our demo, getting the OpenCV hand recognizer and normalizer working while working out kinks in deploying the web application. There remain problems linking Django’s channel layer, which is backed by a Redis instance on EC2.
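For reference, the channel layer wiring I am trying to get working looks roughly like this in settings.py (this assumes the channels_redis backend; the Redis host below is a placeholder for our EC2 instance):

# settings.py (sketch)
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "channels_redis.core.RedisChannelLayer",
        "CONFIG": {
            # Placeholder hostname for the Redis EC2 instance, default Redis port
            "hosts": [("redis-ec2.example.internal", 6379)],
        },
    },
}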

However, I was not able to solve these yet, as unfortunately my computer died spectacularly and I was forced to order a new one. :'(

Until then, I have just been getting my new Nano set up for OpenCV testing.

 

Claire’s Status Report for 4/11

This week, I got AWS set up, and I think we finalized how we want it running in conjunction with the Jetson device. We have scripts ready on both sides, and we now have a way of SCP-ing images from the device to the server; the server then deletes the images as they are processed. The FPS situation didn’t really end up getting resolved – it is still sampling at a much lower rate than I would like (roughly 10 fps at best). However, it can now be adjusted.
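This isn’t our exact script, but the device-side transfer loop is along these lines (the server address and directory names are placeholders):

import glob
import subprocess

# Push every captured frame from the Nano to the server; the server-side
# script deletes each image once OpenPose has processed it
SERVER = "ubuntu@<ec2-hostname>:/home/ubuntu/incoming/"  # placeholder address
for frame in glob.glob("frames/*.jpg"):
    subprocess.run(["scp", frame, SERVER], check=True)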

Next week, I will be working further on the OpenCV side and adding more commands to the list of possible testing commands. I want to ask some basic who/what/when/where/why + how questions, but I need to think about how to generate them. I also want to be able to adjust the FPS on the fly during testing while maintaining roughly the same “ratio” of signal and noise. The biggest hurdle right now is really the sampling, which I think will be difficult on a device like the Jetson Nano. I will try some more things and will have scripts running with Sung’s OpenPose software by the end of next week for sure.

Jeff’s Status Report For 4/04

Hello from home!

This week, I focused on getting AWS set up and hosting our web application, switching over from local hosting. It took a while to figure out how to get the AWS credentials, but the web application should now be hosted, and I am just verifying that all the features still work, i.e. that we can still connect via WebSockets and send data through. In addition, I’m continuing to tinker with the UI/UX, adding features like parsing the message to add visual elements such as emojis to the currently text-only output.
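The emoji parsing is nothing fancy – a keyword lookup along these lines (the mapping below is a made-up example, not our actual one):

# Hypothetical mapping from recognized words to emojis for the chat output
EMOJI_MAP = {"weather": "☀️", "music": "🎵", "time": "⏰"}

def decorate(message):
    # Append an emoji for any keyword found in the parsed message text
    extras = [emoji for word, emoji in EMOJI_MAP.items() if word in message.lower()]
    if extras:
        return message + " " + "".join(extras)
    return message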

In addition, for the OpenCV part, after doing some further investigation and talking it over with Sung, we decided that OpenCV would be better suited for normalizing images for Sung’s image neural network classifier. Even after adding different patterns and colors as suggested during last week’s meeting, we realized that OpenCV with a glove simply did not have the capability to reproduce OpenPose’s feature extraction by itself. Too often the markers were covered by the gesture itself, and without the ability to “infer” where the markers were, OpenCV could not reproduce the same feature list as OpenPose, which would have required a new classifier. Instead, we are using OpenCV just to extract the hands and normalize each image so that it can be passed in. I have worked on extracting the hands, and I am continuing to work on normalizing the image, first by using the glove with a marker to standardize the image, and hopefully eventually without a glove.
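The normalization step is roughly the following sketch, assuming the hand bounding box comes out of the OpenCV extraction step (the output size is a placeholder):

import cv2

def normalize_hand(img, bbox, size=128):
    # bbox = (x, y, w, h) for the hand region found by the extraction step
    x, y, w, h = bbox
    hand = img[y:y + h, x:x + w]
    # Resize the crop to a fixed square so every sample fed to the classifier
    # has the same dimensions
    return cv2.resize(hand, (size, size), interpolation=cv2.INTER_AREA)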

Claire’s Status Report for 4/4

Hello! This week, I did some tweaking of the testing. To start, I am now using around 20 fps for the testing video. I am still changing some parameters, but I think I am going with this particular setup.

I want to switch away from the one-gesture-straight-to-the-next approach I was using before, to better simulate the live testing that I talked about with Marios. If we go with 20 frames per second (which we might tweak depending on how good the results are, but we can go up to 28 fps if necessary), I want at least 3/4 of the frames (15 consecutive frames) to be the correct gesture for the gesture I am trying to get the Jetson to read. The five frames before or after can be either blank or another gesture. That way, there should never be an unintended run of 15 consecutive frames of the same thing anywhere else in the video. Obviously, real-life testing would have more variables, like the gesture might not be held continuously, but I think this is a good metric to start with.
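As a sketch, one second of the test video would be laid out something like this (the helper below is hypothetical, just to show the 15-of-20 rule):

import random

def one_second_block(gesture, fps=20, held=15):
    # `held` consecutive frames of the target gesture, padded out to `fps`
    # frames with blank or distractor frames split across the two ends
    pad = fps - held
    lead = random.randint(0, pad)

    def filler():
        return random.choice(["blank", "other_gesture"])

    return [filler() for _ in range(lead)] + [gesture] * held + \
           [filler() for _ in range(pad - lead)]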

Here is a clip of a video that incorporates the “empty space” between gestures.

From this video you can see the small gaps of blank space between each gesture. At this point I haven’t incorporated other error gestures into it yet (and I want to think more about that). I think this is pretty much how we would do live testing in a normal scenario – the user holds up the gesture for around a second, quickly switches to the next one, and so on.

Next week, I plan on getting the AWS situation set up. I need some help learning how to communicate with it from the Jetson. As long as I am able to send and receive things from AWS, I would consider it a success (OpenPose goes on it, but that is within Sung’s realm). I also want to test out the sampling for the camera and see if I can adjust it on the fly (i.e. some command like -fps 20 to get it sampling at 20 fps and sending frames directly to AWS).

 

Team Status Report for 03/28

We’re not sure what the purpose of the Team Status Report is now that we all have individual milestones. However, we have figured out what we each are going to do, and are currently working towards our individual goals. We have purchased more hardware to make working with the hardware feasible, and we have devised a plan to integrate our work together.

Jeff and Claire both have Jetson Nanos that they are working with, and Sung will pass the OpenPose code/classification model to both of them so that they can integrate it once they have their individual parts working.

Jeff’s Status Report For 03/28

Hello World!!!! hahaha am i funny too

This week I continued to work on refining the design of our web application. In addition, I began to focus more heavily on the OpenCV glove-tracking component. To start, rather than use all 20 markers that OpenPose has, I am focusing initially on 6: 5 on the palm knuckles and 1 on the wrist (the glove is a clear latex glove). I was able to get overall body and hand detection and the segmentation of the body and hand from the background working, although it is not perfect at this time and I will continue to work on it. Furthermore, for the markers, I experimented with a few approaches. Creating square markers and detecting their edges did not work well because the markers lacked a clear enough edge. SURF, which did a sort of blob detection, also performed poorly because the markers lacked enough contrast. The best solution so far was color detection on the markers, which are red; this worked for the knuckles but not so much for the wrist. The color of my hand is kind of pinkish, which caused less ideal results on the wrist marker. Hopefully, when the green markers I ordered arrive, performance will be much better.
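The color detection step is roughly the following (the thresholds are placeholder values I expect to tune; red wraps around the hue axis in HSV, so two ranges get combined):

import cv2
import numpy as np

img = cv2.imread("glove_frame.jpg")  # placeholder path
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Red sits at both ends of the HSV hue range, so threshold twice and combine
lower = cv2.inRange(hsv, np.array([0, 120, 70]), np.array([10, 255, 255]))
upper = cv2.inRange(hsv, np.array([170, 120, 70]), np.array([180, 255, 255]))
mask = cv2.bitwise_or(lower, upper)

# Each connected blob in the mask is treated as one marker (OpenCV 4 signature)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)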

My SOW and Gantt Chart was emailed to Emily earlier this week!

Claire’s Status Report for 03/28

This week, I managed to get the basic functionality of the randomly generated testing working. I have a Python program that can read images from a directory and create videos from them through OpenCV. I knew it was very likely that this functionality was possible in Python, but it was good to see it working.
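The core of the program is just OpenCV’s VideoWriter; a stripped-down sketch (the directory, codec, and fps here are placeholders, and it assumes all images are the same size):

import cv2
import glob

# Stitch a directory of gesture images into a test video
frames = sorted(glob.glob("gesture_images/*.jpg"))
h, w = cv2.imread(frames[0]).shape[:2]
out = cv2.VideoWriter("test.avi", cv2.VideoWriter_fourcc(*"XVID"), 20, (w, h))
for path in frames:
    out.write(cv2.imread(path))
out.release()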

Over the next two days I want to refine the script in terms of the range of possible inputs for the test. For example, I want to see what range of words I will deem “acceptable” as part of a command for Google. In particular, I think the “what is?” command (not its own designated command, but a feature we plan on fully implementing) would be the hardest to execute correctly with good inputs. For example, we want to eliminate words like “it”, “she”, “her” – special words that hold no meaning in a “what is” context. It would be nice to include some proper nouns too. These are all just for interest and proof of concept (since no one will actually be using our product, we want to show that it is as functional as a normal Google Home would be).
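The filtering itself would just be a small word-list check, something like this (the list below is only an example, not the final set):

# Hypothetical set of filler words that carry no meaning in a "what is" query
FILLER_WORDS = {"it", "she", "her", "he", "him", "they"}

def acceptable(word):
    return word.lower() not in FILLER_WORDS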

Another concern that came up is spacing. After some thought, I think it would make sense to put some “white space” between gestures, as a normal person signing would. Someone who actually signs in real life will probably do one letter and then switch to the next one in less than a second. This spacing could be important – it will help us distinguish repeated letters. I didn’t think of this before, but now that I have, I think I will put some randomly timed breaks where the video is just blank (I need to explore that in the next few days as well) to imitate that. This could greatly improve our accuracy and how well we simulate a real-world situation.

Sung’s Status Report for 03/28

Hello world!! ahahah i’m so funny

This week, I spent a lot of time trying to collect data for our project. There is a bottleneck in collecting data, as OpenPose segfaults when I run it on an image directory with more than 50 images. This means that in order to process more than 50 images (I would ideally like 100 images per gesture), I need to rerun OpenPose with a new directory of images. 50 images take around 30 minutes to finish, which means I need to check back every 30 minutes, and the fact that I have to be awake for that makes this process a bit slower than I expected.
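A sketch of how that batching could be scripted (the directory names are placeholders, and the flags are the standard OpenPose --image_dir/--write_json ones, though the exact invocation depends on the build):

import os
import shutil
import subprocess

SRC = "all_images"  # placeholder source directory
BATCH = 50

images = sorted(os.listdir(SRC))
for i in range(0, len(images), BATCH):
    batch_dir = f"batch_{i // BATCH}"
    os.makedirs(batch_dir, exist_ok=True)
    for name in images[i:i + BATCH]:
        shutil.copy(os.path.join(SRC, name), batch_dir)
    # One OpenPose run per batch to stay under the segfault threshold
    subprocess.run(["./build/examples/openpose/openpose.bin",
                    "--image_dir", batch_dir, "--hand",
                    "--write_json", batch_dir + "_json"], check=True)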

Other than data collection, I’ve been working on the classification model for our project. I am trying to integrate a pretrained network into our pipeline. I have found some examples where pretrained networks are fine-tuned on new data, so I am trying to adapt that approach to this project.
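As an illustration of the pretrained-network idea (using torchvision here purely as an example, since we haven’t committed to a framework; the class count is a placeholder):

import torch.nn as nn
from torchvision import models

NUM_GESTURES = 24  # placeholder class count

# Start from an ImageNet-pretrained backbone, freeze it, and retrain only
# the final layer on our gesture data
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_GESTURES)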

Restructured SOW Sung

Here is my restructured SOW, along with my Gantt chart for further detail.