I was not able to do anything for Capstone this week. I was hit with three assignments from 15440, and even though I spread the work out to be done by Friday, the project due on Friday was too much. I am currently using my first late day and am about to use my second to finish it. As soon as I finish, I will transition to Capstone and do the things I was supposed to do this week.
Jeff’s Status Report For 02/29
This week, I worked on the Design Report. I also made more progress on the web application, finalizing some design choices and creating a rough prototype.
The key design goal of the web application is to emulate the Google Assistant on the phone, which displays visual data for Google queries in a chat-style format. The “messages” would be the responses from the Jetson containing the information. We are still experimenting with the Google Assistant SDK to determine exactly what information is received, but at minimum it should include the verbal content that is usually spoken.
In addition, due to the nature of this application, it is important that the “messages” from the Jetson be updated in real time, i.e. without the user having to constantly refresh the page to see new messages. To do this, I decided on using Django Channels, which allows asynchronous code and handles HTTP as well as WebSockets. By creating a channel layer, consumer instances can then send information. The basic overall structure has been written, and I am now finishing up the channel layer and experimenting with simple Python scripts that send “messages” to our web application.
Sung’s Status Report
This week, I worked on the design review in preparation for the design presentation. A lot of time went into thinking about our design decisions and whether they were the best way to approach our problem/project.
I was hesitant about using OpenCV and unsure whether it could be accurate. We recognized this as a risk factor, so Jeff and I decided to use OpenPose as a backup and have it running alongside OpenCV. We realized that OpenPose takes up a lot of GPU power and would not work well on the Nano, given that the Jetson Xavier (which has about 8 times the GPU capabilities) reached only 17 fps with OpenPose video capture. We therefore decided to run OpenPose on AWS, and I am in the process of setting that up. We have received AWS credit and just need to see if AWS can meet our timing and GPU requirements.
Our initial idea revolves around a glove that we use to track joints. We were originally thinking of a latex glove with the joint locations marked with a marker, but we worried that the glove would interfere with OpenPose tracking. We tested this and found that OpenPose is not hindered by the glove, as shown in the picture below.
This week, I have to make a glove joint tracker with OpenCV. I’ve installed OpenCV and have been messing around with it, but now I will have to implement a tracker that gives me a list of joint locations. This will probably be a really challenging part of the project, so stay tuned for next week’s update!
Team Status Report for 02/22
Hello from Team *wave* Google!
This week, we worked a lot on finalizing our ideas and concepts in preparation for the design review. For the gesture recognition software, we decided to use both OpenPose and OpenCV to mitigate the risk of misclassifying a gesture or not classifying it at all. Initially, we planned to use only OpenCV and have users wear a glove to track joints. However, we weren’t sure how reliable this would be, so we added a backup: running OpenPose to get joint locations and using that data to classify gestures. With this new approach, we will have OpenCV running on the Nano and OpenPose running on AWS. Gestures captured as video will be split into frames, and those frames will be tracked with glove tracking on the Nano and OpenPose tracking on AWS. If we don’t get a classification from OpenCV on the Nano, we will use the result from OpenPose on AWS to classify the gesture.
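The fallback rule in the last sentence is simple enough to write down. This is only a sketch of the decision logic, assuming each pipeline reports either a gesture label or `None` when it could not classify the frame. The function and parameter names are hypothetical.

```python
from typing import Optional


def pick_gesture(nano_label: Optional[str], aws_label: Optional[str]) -> Optional[str]:
    """Prefer the Nano/OpenCV result; fall back to the AWS/OpenPose result.

    Returns None when neither pipeline produced a classification.
    """
    if nano_label is not None:
        return nano_label
    return aws_label


if __name__ == "__main__":
    print(pick_gesture(None, "K"))   # Nano gave nothing, so the AWS label is used
    print(pick_gesture("V", "K"))    # Nano result takes priority
```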
We wanted to see how fast OpenPose could run, since classification speed is one of our requirements. On a MacBook Pro, we achieved 0.5 fps with video, and tracking one image took around 25-30 seconds. OpenPose on the MacBook Pro was running on a CPU, whereas it would run on a GPU on the Nvidia Jetson Nano. Even so, the fact that it took 25-30 seconds on a CPU to track one image meant that OpenPose might not meet our timing requirements. As such, we decided to run OpenPose on AWS instead. This should mitigate the risk of OpenPose making classification too slow.
Another challenge we ran into was processing dynamic gestures, which would require video recognition. We researched online and found that most video recognition algorithms rely on 3D-CNNs for training and testing because of the higher accuracy that 3D-CNNs provide compared to 2D-CNNs. However, given that we need fast response and classification times, we decided not to do dynamic gestures, as we thought they would be hard to implement within our time constraints. Instead, we decided to define a set of static gestures and only do recognition on those.
We’ve also modified our Gantt chart to reflect the changes in design choices, especially in the gesture recognition aspect of our project.
Next week, we are going to run OpenPose on AWS and start the feature extraction with the results we get from OpenPose tracking so that we can start training our model soon.
Claire’s Status Report for 2/22
This week, a lot of my work was focused on developing our design review decks, as I am presenting. I think the most important thing I did was fully flesh out our set of static gestures for the MVP.
We derived these gestures from ASL fingerspelling. We had to make sure that the gestures were unique (the original set had some overlapping gestures) and, if unique, distinct enough for the camera. One example of a similar pair is K and V. While they look distinct from each other in the image, we felt that they would not be distinct enough once differences in finger length and hands from one person to another are taken into account.
Aside from the decks, I also worked on getting the Nvidia Jetson running. I successfully booted the disk, but because it lacks WiFi, I wasn’t able to get it to run anything too useful. I started a demo and tried my hand at some basic machine learning setup to prep the Nano for image recognition. I am now learning how to train networks on my personal computer using its GPU.
This was surprisingly difficult on my machine, due to some secure boot issues and missing dependencies. After a few hours of installing Python libraries, I got to a point where I was not confident in how to fix the error messages.
Aside from that, because our designated camera hasn’t arrived yet, I tried to borrow some webcams from the ECE inventory. Neither worked. One was connected through GPIO, and the other through the camera connector. Neither was detected by the Nano despite a few hours of tinkering and searching online. This could be troublesome, especially if the camera connector is broken. However, for now, it is most likely a compatibility issue with the Nano, as neither of the webcams was meant for this particular device. We just have to wait for the camera to arrive to find out.
The progress is still behind, but I feel fairly confident that it will work out. I can start looking into Google Assistant SDK while waiting for the parts to arrive, as those two tasks do not depend on each other.
As I am looking into other tasks to do while waiting for hardware, I think the best use of my time right now would be to start thinking about the Python scripts for automated testing, and to start testing out the Google Assistant SDK and possibly make my first query.
Thus, my deliverables for next week are the design review report and a small Google Assistant SDK Python program. I think even something that just takes in a line of input from the command line and outputs the response in text form would be a good enough proof of concept.
Jeff’s Status Report of 02/22
This week in preparation for our Design Review Presentation, I worked with Sung and Claire on preparing our slides and finalizing design decisions.
As a part of that, I worked with Claire on choosing the static gestures that we would use for our MVP. Following the advice from our Project Proposal presentation, we chose to modify the fingerspelling gestures, as we noticed that many of them were very similar or identical. We then remapped the similar gestures into something more distinguishable, and also remapped the dynamic gestures (e.g. Z) to unique static gestures. In addition, we made unique gestures for popular commands (e.g. what’s the weather).
Furthermore, I worked with Sung on finalizing the machine learning implementation we would use. Based on some research into another group’s results running OpenPose on the Xavier at 17 fps, we found it unlikely that it would run on the Nano, which made us pivot to using AWS combined with running OpenCV on the Nano. In addition, we prototyped a glove to use for our OpenCV implementation. The glove was simply a latex glove with the joints (i.e. knuckles) marked with a marker. We then did some testing with OpenPose to ensure that the glove would not hinder it, and to check the effect of distance and other factors (e.g. whether the face is required). We found that the glove did not interfere with OpenPose, that the face is needed for OpenPose to work, and that at roughly 3 meters OpenPose has difficulty detecting hands (using a 1.2 megapixel camera on a MacBook). We also set up OpenCV and began basic experimenting.
Finally, I continued to make some progress on the Web App, but given the Design Review, I chose to focus more on finalizing design decisions this week. This resulted in less progress on the Web App, but with the slack allocated it should be okay.
Jeff’s Status Report For 02/15
This week, I worked on designing and planning our approach to testing and validation for our design review, something we were told needed more detail. We decided on an automated approach where we would play video of various gestures at different intervals, distances, and lighting conditions. This would test the rate at which we could correctly identify gestures (our requirement was at least 1 gesture per second), as well as performance at different distances and lighting conditions. The video would be synced to the Jetson so that the final output and final time could be checked against the gesture shown, allowing us to also keep track of the accuracy and timing of individual gestures.
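A minimal sketch of how the results of such a run could be scored is below. It assumes each played gesture produces one record holding the gesture shown, the label the Jetson reported (or `None` if it missed), and the time taken. The `Trial` record, the `summarize` function, and the use of 1 second as a per-gesture time budget are illustrative assumptions and not part of our actual test harness.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Trial:
    expected: str             # gesture shown in the video
    predicted: Optional[str]  # label reported by the Jetson, None if missed
    latency_s: float          # seconds from gesture shown to label reported


def summarize(trials: List[Trial], budget_s: float = 1.0) -> Dict[str, float]:
    """Compute accuracy, on-time rate, and mean latency for a test run."""
    if not trials:
        raise ValueError("no trials to summarize")
    total = len(trials)
    correct = [t for t in trials if t.predicted == t.expected]
    # A trial counts as on time only if it is correct and within the budget.
    on_time = [t for t in correct if t.latency_s <= budget_s]
    return {
        "accuracy": len(correct) / total,
        "on_time_rate": len(on_time) / total,
        "mean_latency_s": sum(t.latency_s for t in trials) / total,
    }


if __name__ == "__main__":
    run = [
        Trial("A", "A", 0.5),
        Trial("B", "C", 0.5),   # misclassified
        Trial("C", "C", 1.5),   # correct but too slow
        Trial("D", None, 2.0),  # not classified at all
    ]
    print(summarize(run))
```

Grouping the trials by distance or lighting condition before calling `summarize` would give the per-condition breakdown described above.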
In addition, I researched different web frameworks and the best way to implement our Web Application. After doing some research, I decided to go with Django due to a lot of useful features like built-in authentication, integration with SQLite (a database included with Python), and the fact that it is written in Python, a language we are all very familiar with. After choosing Django, I began learning it and setting up a simple Web Application.
Finally, I set up the WordPress site and researched the display that we will need, both for working with the Jetson Nano and for showing visual information back to the user in our final project.
Sung’s Status Report for 02/15
This week, I worked on collecting datasets for our project and on finalizing the design of the gesture recognition portion of the project. We were told that our initial idea of creating our own skeletal tracking algorithm would be too hard, so we are planning on using OpenPose to train our model. We also plan on using OpenCV and having our users wear a glove with joint markings so that we can “imitate” skeletal tracking. For OpenPose, we needed datasets of gestures, so my task this week was collecting a number of datasets that we could potentially use to train our model.
(sample image from dataset)
I was able to acquire two different datasets. One dataset only has RGB images, while the other has a variety of images, including RGB, RGB-D, and confidence images. I am currently waiting to hear back from one author about another set of gesture datasets. This should all be done by next week.
With gesture recognition, I looked into using OpenPose. I had some trouble setting up OpenPose, as the documentation was not the best written, but I hope to fix that on Monday by talking to the Professor and/or my peers and trying to get a sample OpenPose program working. After this, Jeff and I will each implement different ways of training on our data to get started on the gesture recognition aspect of our project.
Team Status Update for 02/15
Hello from Team *wave* Google! Right now, the most significant risk is the performance of the Jetson Nano. We are not sure if the Nano has the computational power we need to complete both the feature extraction and the gesture recognition. If the Nano proves to be insufficient, we will need to quickly pivot to another piece of hardware, likely the Jetson TX1 or Jetson Xavier. We will try to get the Jetson Nano demo-ing as soon as possible in order to test if it has what it takes. We can do hardware testing and network training in parallel, as those two tasks don’t depend on each other. The gesture recognition demo on the Nano we saw online used OpenCV, but we also want to use OpenPose, and we are not yet sure if we can run it on the Nano. This could greatly complicate our project, and the only way to mitigate it is to start early.
We had to change how we implement the gesture recognition aspect of this project. We originally thought that we would be able to implement our own skeletal tracking, but after talking to the professors and reading papers on skeletal tracking, we realized that implementing it from scratch would be way too hard. Thus we have two alternative approaches. The first approach is to use OpenPose, train a model with pre-processed datasets, and use the skeletal tracking provided by OpenPose to classify gestures. The other approach is to use OpenCV and have our users possibly wear a special glove. This glove would have joints labeled, and we would use OpenCV to imitate skeletal tracking and classify gestures.
Finally, our Gantt chart and overall schedule have changed because parts are taking longer than expected to arrive. As a result, we cannot completely set up the Jetson Nano and run OpenCV and OpenPose on it in our initial time frame. Instead, we are forced to wait until our parts arrive and to run them first on a laptop. Using OpenPose on the laptop also proved more difficult than expected and will carry over into next week.
Claire’s Status Report for 02/15
This week, I worked on getting parts for our Jetson Nano. The most important piece of hardware for meeting our requirements is the camera to go with our board. After some research, I decided to go with a SoC board by e-Con Systems specifically made for the Jetson Nano. I researched and compared factors such as the connector (and thus communication protocol, which affects latency), the size (is it appropriate for an embedded system? does it look intrusive?), the resolution (how much resolution do we need per image for accurate feature extraction?), and finally, the frames per second (how much information do we need to make dynamic gestures?). Unfortunately, the camera won’t be arriving for another two weeks at least, so some parts of the testing may be delayed for now. I hope to continue on with trying out some Jetson demos by borrowing a webcam from the inventory and working with that for now. Luckily, familiarizing myself with the Nano is not a super pressing task – the next task that is dependent on it isn’t due for a few weeks.
Aside from learning camera lingo, I also made a rough draft of our block diagram for the hardware specifications. We have shopped for and submitted purchase forms for most of the hardware listed in this image. It took some time for us to find hardware that specifically works with an embedded system and looks sleek. In terms of purchasing parts, we are on time. We started our Gantt chart a little earlier than we should have (we didn’t realize purchase forms didn’t open until this week), but otherwise we are on schedule.
I also worked on collecting some data for each piece of hardware and putting it into slides for our design review in a week.
Another factor is the missing microSD card. We just placed the order for it and we can’t set up the Nano without it.
By next week, I hope to have the microSD card and start setting up the Jetson. I will also talk to a TA about possibly borrowing a webcam for now to start setting up some demos on the Nano. I will also be working on the design review slides and presentation next week, and that will be another deliverable.