Weekly Update #2 (2/17 – 2/23)

Team

With the upcoming design presentation, we knew we had to make some important decisions. We've decided to use PoseNet and create a web application, which are two major changes from our original proposal. We made this change because we discovered that our original design, which used OpenPose in a mobile application, would run very slowly. However, this will not affect the overall schedule/timeline, as it is more of a lateral move than a setback. Abandoning the mobile platform could jeopardize our project; to compensate, we decided to offload processing to a GPU, which will make our application faster than it would have been on mobile.

Kristina

This week, I worked with Brian and Umang to test the limits of PoseNet so we could decide which joint detection model to use. I also started creating the base of our web application (just a simple Hello World application for now to build off of). I haven't done any web development in a while, so creating this incredibly basic application was also a good way to review my rusty skills. Part of this was also trying to integrate PoseNet into the application, but I ran into installation issues (again… like last week. Isn't setup the worst part of any project?), so I ended up spending a lot of time just trying to get TensorFlow.js and PoseNet onto my computer. Also, since this upcoming week is going to be a bit busier for me, I made a really simple first-draft sketch of a UI design to start from. For this next week, my goals are to refine the design, create a simple application we can use to start gathering the "expert" data we need, and start collecting that data.

Simple first draft of the UI design – very artistic, right?!! I'm an aspiring stick figure artist.
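For reference, here is a minimal sketch of what the PoseNet integration might look like once the installation issues are sorted out. This is only a sketch assuming the @tensorflow-models/posenet package; the exact API surface (and the confidence threshold below) will depend on the version we end up with.

```ts
import * as posenet from '@tensorflow-models/posenet';

// Minimal sketch: load the pretrained model and estimate a single
// pose from a webcam <video> element.
async function estimatePose(video: HTMLVideoElement) {
  const net = await posenet.load(); // downloads weights on first call
  const pose = await net.estimateSinglePose(video, { flipHorizontal: true });
  // pose.keypoints is an array of { part, position: {x, y}, score };
  // keep only the joints the model is reasonably confident about.
  return pose.keypoints.filter(kp => kp.score > 0.5);
}
```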

Brian

This week I attempted to find the mobile version of OpenPose and get it running on an iPhone. Similar to last week, I ran into some issues during installation, and decided that since we already had a web version running, it was better to solidify our plan to create a web app and scrap the mobile idea.

Afterwards, I decided to get a better feel for the joint detection platform and play around with tuning some of the parameters to see which ones yielded the best accuracy. This was mainly done by manually observing the real-time detection while I performed what I can only assume were dance-like movements. I also took a look at the raw output of the algorithm and started thinking about the frame matching algorithm we would like to use to account for the difference in speed between the user and the training data. I also worked on creating the design documents. For the next week, I would like to work more with the data and see if I can get something that can detect the difference between joint positions in given frames.
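As a starting point for that comparison, something like the following sketch could measure how far each joint moved between two frames. This is hypothetical code against the kind of named-keypoint output we've seen from the detector, not a finished frame-matching algorithm.

```ts
// Hypothetical sketch: per-joint pixel distance between two frames.
// Each frame is assumed to be an array of named keypoints.
interface Keypoint { part: string; x: number; y: number; }

function jointDistances(a: Keypoint[], b: Keypoint[]): Map<string, number> {
  // Index the second frame's keypoints by joint name.
  const byPart = new Map<string, Keypoint>();
  for (const kp of b) byPart.set(kp.part, kp);

  const dists = new Map<string, number>();
  for (const kp of a) {
    const match = byPart.get(kp.part);
    if (match) {
      dists.set(kp.part, Math.hypot(kp.x - match.x, kp.y - match.y));
    }
  }
  return dists;
}
```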

Umang

This week I worked with Brian to explore the platform options for our application. We found that the mobile versions are all too slow (2-3 fps without any speed-up to the processing) for our use case. We then committed to making a web app instead. For the web version, we used a lite version of Google's pretrained PoseNet (for real-time estimation) to explore latency and estimation methods. With simple dance moves, I am able to get estimates for twelve joints; however, when twirls, squats, or other scale/orientation variations are introduced, this lite PoseNet variant loses estimates. As such, this coming week I want to explore running the full PoseNet model on a prerecorded video. If we can do the pose estimation post hoc, then I can send the recorded video to an AWS instance with a GPU for quicker processing with the full model and then send down the pose estimates.
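To give an idea of what the post hoc version could look like, here is a rough sketch that seeks through a prerecorded video and estimates a pose at fixed timestamps. The function name and sampling rate are made up for illustration; the same loop could just as well run server-side on the GPU instance.

```ts
import * as posenet from '@tensorflow-models/posenet';

// Rough sketch: sample a prerecorded video at a fixed rate and
// estimate one pose per sampled frame.
async function estimateRecordedVideo(video: HTMLVideoElement, fps = 10) {
  const net = await posenet.load();
  const poses = [];
  for (let t = 0; t < video.duration; t += 1 / fps) {
    video.currentTime = t; // jump to the sample time
    await new Promise<void>(resolve =>
      video.addEventListener('seeked', () => resolve(), { once: true }));
    poses.push(await net.estimateSinglePose(video));
  }
  return poses;
}
```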

I still need to work on the interpolation required to frame-match the user's video (or frame) with our collected ground truth. To sidestep this problem for now, we are going to work with stills of Kristina to generate a distribution over the ground truth. We can then query this distribution at inference time to see how far each of the user's joints deviates from the mean. I hope to have the theory and some preliminary results of this distributional aggregation within the next two weeks.
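Here is a first sketch of the distributional idea, assuming we treat each joint independently: fit a mean position and spread from the expert stills, then report a user's joint deviation in standard-deviation units. The names and the independence assumption are mine for illustration, not settled theory.

```ts
interface Point { x: number; y: number; }

// Fit a per-joint distribution (mean position and spread) from the
// expert stills for that joint.
function fitJointDistribution(samples: Point[]) {
  const n = samples.length;
  const mean = {
    x: samples.reduce((s, p) => s + p.x, 0) / n,
    y: samples.reduce((s, p) => s + p.y, 0) / n,
  };
  const variance =
    samples.reduce((s, p) => s + (p.x - mean.x) ** 2 + (p.y - mean.y) ** 2, 0) / n;
  return { mean, std: Math.sqrt(variance) };
}

// How many standard deviations the user's joint sits from the expert mean.
function deviation(user: Point, dist: { mean: Point; std: number }): number {
  const d = Math.hypot(user.x - dist.mean.x, user.y - dist.mean.y);
  return dist.std > 0 ? d / dist.std : d;
}
```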

Weekly Update #1 (2/10 – 2/16)

Team

The most significant risk we face is not finding a reliable implementation of an algorithm that can detect poses in real time. Though we think we have found something reliable, we still need to test in the upcoming week whether or not it will be sufficient. To manage this, we are thinking up an alternate plan that, instead of processing in "real time," allows for a buffer of a couple of seconds for the processing to complete. It would only be a slight alteration, though, and seems to be something we can deal with pretty easily. The main change made this week was the shift from mobile to web platforms. As mentioned, there may be slight alterations later when it comes to the "real time" aspect of our project, but both of these changes were things we had mentally accounted for earlier, so they will not change the overall trajectory of our project. The schedule has not changed, except for a slight shift due to unexpected errors that occurred during installation of the pose detection algorithm.

Kristina

This week, I tried getting OpenPose set up on my personal laptop and did some research into how to use OpenCV/OpenPose on mobile. After reading different articles and projects and testing some trial apps, I didn't find a way to use OpenPose with anything close to real-time latency on mobile, so we all decided to try getting OpenPose running on our laptops first. Since I have a PC and a possible Ubuntu VM to use while Brian and Umang have Macs, we tried different setups, but we all ran into the problem of not having an NVIDIA graphics card (my screenshot shows one of the error messages from trying to run a simple OpenPose Windows demo). We realized we would have to spend more time next week looking into other options, possibly changing the scope of our project, so we had to adjust the schedule a bit. We still need to figure out which joint detection we want to use in order to determine whether we can still make a mobile app or should make a web application instead. I was supposed to start the UI design at the end of this week, but I couldn't due to the OpenPose problems we encountered. The goal is that next week we'll have this figured out and a simple UI design done.

Brian

This week I focused on getting any implementation of a pose detection program running on images. At first we wanted a program that could run on mobile. However, after facing problems with that, we decided to find something to run as a web app. We had settled on OpenPose, but after a couple hours of troubleshooting installation errors, I found that the devices we have access to do not have the capability to run or even install the program. I also didn't have the necessary permissions on the external devices to download the requirements. After failing for a bit, I was able to find another implementation that ran with a pretty good framerate (~15-18 fps) on my laptop.

The authors say there is a mobile implementation, so for next week I would like to get it running there. Otherwise, we will proceed with a web app. I would also like to get access to the point data from this app and start playing around with comparing poses in different images.

Umang

This week I worked on getting a TensorFlow implementation of OpenPose running. I quickly learned that the framerate of MobileNet (https://arxiv.org/pdf/1704.04861.pdf) would not be conducive to dance pose estimation on mobile devices: 3.23 fps is too slow for real-time processing. See the figure at the end of the document for me attempting to dance. Mobilenet_fast may provide the latency we need if we run it on an NVIDIA Jetson TX2. I also started the rank aggregation framework we need to aggregate the poses Kristina gathers (this does not need to happen in real time, but we need a prioritization framework to pick the joints to correct); see the sketch below.
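As a placeholder for that framework, one simple aggregation scheme is a Borda-style count: each frame ranks the joints from worst deviation to best, and the ranks are summed into one overall priority order. This is only a sketch of one possible scheme, not the final framework.

```ts
// Sketch of Borda-count rank aggregation: `rankings` holds one ranking
// per frame, each listing joint names from worst deviation to best.
function aggregateRankings(rankings: string[][]): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((joint, i) => {
      // Joints ranked earlier (worse) earn more points.
      scores.set(joint, (scores.get(joint) ?? 0) + (ranking.length - i));
    });
  }
  // Highest total score = highest priority to correct.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([j]) => j);
}
```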

In the coming week, I hope to assess the pretrained estimation networks we can use and begin to attempt naive aggregation. I also hope to work through the rank aggregation framework for the ground truth so that once Kristina has the data, my framework can be applied.


Introduction and Project Summary

Hello, we are Team KUB, and our members are Umang Bhatt, Kristina Banh, and Brian Davis. Our goal is to build an application that can teach dancing on the go! We realize that traditional methods of learning to dance are expensive and often inefficient. To work around these issues, we are developing a platform that captures your movements and offers corrections after you complete them. We eventually hope to provide a useful alternative for people who may not have the means to learn dancing the traditional way.