Weekly Update #7 (3/31 – 4/6)

Team

After the midpoint demo this week, where we showed our correction algorithm running on stills, we started work on corrections for videos. Back at our design report and review, we had roughly sketched a method for matching a user’s video to an instructor’s video from data collection regardless of the speed of either video, but now we had to actually implement it. As before, we spent some time adjusting that original design to account for problems we encountered with the correction algorithm on stills before diving into implementation. Next week we will continue working on frame matching and on adapting the correction algorithm to videos.

Kristina

I spent some time this week gathering more data, since initially I only had data for the poses. I focused on taking videos of myself and a couple of other dancers in my dance company doing a port de bras and a plié, which are the two moves we’ve decided to implement, but I also gathered more data for our poses (fifth position arms, first arabesque tendu, and a passé, since I realized I’ve never written the specific terms on the blog). The current UI is also only set up for stills, so I spent a little bit of time redesigning it to work with videos as well. In the upcoming weeks, I hope to have a smoother version of the UI up and running.

Brian

I spent the first part of the week thinking about the best way to do corrections for videos. A couple of options came to mind, but most of them were infeasible due to the amount of time it takes to process pose estimations. In the end, we decided to correct videos by extending our image corrections to “Key Frames”: the poses within a video that we deem to be the defining poses necessary for the proper completion of the move. For example, for a push-up to be “proper”, the user must have proper form at both the top and the bottom. By isolating these frames and comparing them to the instructor’s top and bottom poses, we can correct the video.
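To make the idea concrete, here is a minimal sketch (not our actual code) of how video correction reduces to our existing still-image correction once key frames are identified. The function names here, including correct_pose and match_key_frame, are hypothetical placeholders.

```python
# Hypothetical sketch: correcting a video by correcting only its key frames.
def correct_video(user_frames, instructor_key_poses, match_key_frame, correct_pose):
    """
    user_frames          : list of pose estimates, one per frame of the user's video
    instructor_key_poses : list of instructor poses at the defining key frames
    match_key_frame      : function returning the index of the user frame that
                           best matches a given instructor key pose
    correct_pose         : the existing still-image correction routine
    """
    corrections = []
    for key_pose in instructor_key_poses:
        user_idx = match_key_frame(key_pose, user_frames)
        # Reuse the still-image correction on the matched frame only.
        corrections.append(correct_pose(user_frames[user_idx], key_pose))
    return corrections
```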

In order to do this, we need to be able to match the instructor’s key frames to the corresponding frames in the user’s video with a frame matching algorithm. I decided that it would be best to implement this matching by looking at the distance between the frame we want to match and the corresponding user pose. This week Kristina and I experimented with a bunch of different distance metrics, such as L1, L2, cosine, and max distance, and manually determined that the L2 distance aligned best with how similar Kristina judged two poses to be.
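As a rough illustration (not our exact code), the comparison looks something like the sketch below, assuming each pose is flattened into a vector of (x, y) keypoint coordinates from the pose estimator:

```python
import numpy as np

def pose_distance(pose_a, pose_b, metric="l2"):
    """Distance between two poses, each a flat vector of (x, y) keypoint coordinates."""
    a = np.asarray(pose_a, dtype=float)
    b = np.asarray(pose_b, dtype=float)
    diff = a - b
    if metric == "l1":
        return np.sum(np.abs(diff))
    if metric == "l2":
        return np.sqrt(np.sum(diff ** 2))
    if metric == "max":
        return np.max(np.abs(diff))
    if metric == "cosine":
        return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    raise ValueError(f"unknown metric: {metric}")

def match_key_frame(instructor_pose, user_frames, metric="l2"):
    # Pick the user frame whose pose is closest to the instructor's key frame.
    distances = [pose_distance(instructor_pose, frame, metric) for frame in user_frames]
    return int(np.argmin(distances))
```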

I will be using this metric to finalize the matching algorithm next week.

Umang

After a wonderful week in London, I spent this week working on the video pipeline. In particular, I ironed out a script that lets us run pose estimation on images locally, end to end, and then started a pipeline to get pose estimates from a video (*.mp4 file), which will enable Brian’s frame matching algorithm to run. Working with him to devise a scheme for identifying key frames made running pose estimation locally a smooth process; the main problem was that certain frames were estimated incorrectly (due to glitches in the estimation API) and needed to be dropped. A more pressing issue is that pose estimation is hard to run locally because it is so computationally expensive. Next week I hope to complete the video pipeline and think about ways to speed this process up.
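For a sense of the shape of the pipeline, here is a minimal sketch using OpenCV for frame extraction; estimate_pose is a hypothetical placeholder standing in for the estimation API, and the confidence-based dropping of glitched frames is an assumption about how bad estimates get filtered.

```python
import cv2

def video_to_poses(video_path, estimate_pose, min_confidence=0.3):
    """Extract frames from an .mp4 file and run pose estimation on each one.

    `estimate_pose` is a placeholder for the estimation API; frames whose
    estimate is missing or below `min_confidence` are dropped, since glitched
    frames would otherwise corrupt downstream frame matching.
    """
    capture = cv2.VideoCapture(video_path)
    poses = []
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        result = estimate_pose(frame)  # hypothetical: returns (keypoints, confidence) or None
        if result is None:
            continue
        keypoints, confidence = result
        if confidence >= min_confidence:
            poses.append(keypoints)
    capture.release()
    return poses
```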
