Brian Lane’s Status Update 11/20

This week was spent on further improvements to the model. Specifically, I performed some data augmentation in order to improve overall validation accuracy.

The model as it currently exists outputs high confidence scores for a couple of gestures, namely an open hand and a fist, when they directly face the camera. For gestures involving a specific number of fingers, the model is less certain and only predicts correctly when the hand directly faces the camera.

I am still uncertain how to improve accuracy on the more complex gestures, but the problem of hand orientation relative to the camera has a straightforward solution in data augmentation.

Using math similar to that in my previous blog post, where I explained a 2D rotation, I spent the week writing a script that performs random rotations of the training data about the X, Y, and Z axes instead of only the original Z-axis rotations.

This data augmentation involved placing the training data points into 3D space at Z=0, then selecting an angle from 0 to 2pi for the Z-axis rotation and from -pi/4 to pi/4 for the X and Y rotations. The rotations were then applied sequentially, and the result was projected back onto the XY plane by simply dropping the Z component.
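
For reference, here is a minimal Python sketch of this augmentation. The function name and the (N, 2) NumPy array layout are my own choices for illustration, not necessarily the exact layout used in the training script:

    import numpy as np

    def augment_landmarks(landmarks_2d):
        # landmarks_2d: (N, 2) array of (x, y) landmark coordinates.
        # Returns an (N, 2) array of the rotated, re-projected coordinates.

        # Lift the 2D points into 3D space at Z = 0.
        pts = np.hstack([landmarks_2d, np.zeros((landmarks_2d.shape[0], 1))])

        # Sample rotation angles: 0 to 2*pi about Z, +/- pi/4 about X and Y.
        tz = np.random.uniform(0, 2 * np.pi)
        tx = np.random.uniform(-np.pi / 4, np.pi / 4)
        ty = np.random.uniform(-np.pi / 4, np.pi / 4)

        rz = np.array([[np.cos(tz), -np.sin(tz), 0],
                       [np.sin(tz),  np.cos(tz), 0],
                       [0,           0,          1]])
        rx = np.array([[1, 0,           0],
                       [0, np.cos(tx), -np.sin(tx)],
                       [0, np.sin(tx),  np.cos(tx)]])
        ry = np.array([[ np.cos(ty), 0, np.sin(ty)],
                       [ 0,          1, 0],
                       [-np.sin(ty), 0, np.cos(ty)]])

        # Apply the rotations sequentially, then drop Z to project back onto XY.
        rotated = pts @ rz.T @ rx.T @ ry.T
        return rotated[:, :2]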

Next week is shortened by Thanksgiving break, so my time to work on this project will be reduced. Even so, I plan to add two more gesture classes to the OS interface to allow right clicking and scrolling. Time permitting, I will also begin collecting tradeoff data for our final report.

Links:

Wikipedia article on 3D rotation matrices
https://en.wikipedia.org/wiki/Rotation_matrix#In_three_dimensions

Andrew’s Status Report 11/20

This week, I looked into smoothing out the cursor motion again and started a basic implementation of the finite state machine that determines when the system is in cursor-movement mode versus precision mode, i.e. when a user is trying to be precise and hover over a specific region on screen (a rough sketch is below). While the team didn't meet this week, we will be working closely this coming week to get the system behaving more smoothly, incorporating the feedback from the interim demo report. I'm thinking of changing the mouse movement function a bit to make it more drastic, for example by scaling up the movement by some constant so the distance moved is greater and the user doesn't have to strain as much. Then, since we're implementing some form of FSM, we can change states to detect when the user is trying to be precise versus when they just want to move. We're placing an emphasis on quality of life and user likability in our project, and that's what's prompting this decision. As I said earlier, I'll be following up with the team to test this on the webcam, as they have the one we're using right now. As the interim demo showed, we have a functional project and we're just working out the little kinks. As such, I'm on schedule with my work.
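
A rough sketch of what this two-state FSM might look like in Python. The state names, gain constants, and the hand-speed transition rule here are placeholders I haven't settled on yet, not the final design:

    from enum import Enum, auto

    class CursorState(Enum):
        MOVEMENT = auto()   # large, scaled cursor motion
        PRECISION = auto()  # fine-grained motion near a target

    # Placeholder tuning constants; the real values would be chosen by testing.
    MOVE_SCALE = 2.5            # gain applied to hand motion in movement mode
    PRECISION_SCALE = 0.8       # reduced gain for precise hovering
    SLOW_SPEED_THRESHOLD = 5.0  # px/frame below which we assume the user is aiming

    def next_state(state, hand_speed):
        # Switch to precision mode when the hand slows down, and back when it speeds up.
        if state == CursorState.MOVEMENT and hand_speed < SLOW_SPEED_THRESHOLD:
            return CursorState.PRECISION
        if state == CursorState.PRECISION and hand_speed >= SLOW_SPEED_THRESHOLD:
            return CursorState.MOVEMENT
        return state

    def scaled_delta(state, dx, dy):
        # Apply the per-state gain to the raw hand displacement.
        scale = MOVE_SCALE if state == CursorState.MOVEMENT else PRECISION_SCALE
        return dx * scale, dy * scale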

Team Status Report for 11/20

This week, the team reflected on the feedback from our Interim Demo, most of which was about incorporating quality-of-life and smoothness changes into our design, as well as being better prepared to explain our system and individual responsibilities at the final demo. The team has also transitioned from pure implementation to focusing on observing tradeoffs and collecting metrics before the final presentation, and more so for the final report. The biggest risk our team faces is gathering enough information to present as tradeoffs, specifically for creating a tradeoff graph. The development of our system is still proceeding nicely as we continue to improve our gesture model accuracy and implement smoothness changes to our mouse functions, but we have yet to see how smoothly our metric collection progresses. Our design and schedule are unchanged, and we are on track to present tradeoffs and metrics during the final presentation after Thanksgiving break.

Alan’s Status Report for 11/20

This week, I finished implementing quality-of-life changes for our current mouse functions, as well as some test code for the remaining mouse functions such as right clicking and scrolling. I added a method of differentiating between clicking and holding by sampling gestures over time and deciding which action to take based on whether the click gesture was detected multiple times within a half-second window. I also experimented with other mouse functions such as right clicking (which works the same as left clicking) and scrolling (which uses mouse.wheel instead of mouse.move). mouse.wheel is interesting in that the scrolling sensitivity is determined by how often the function is called, so the change in the hand's y position is used to decide how often to call the function instead of directly assigning a "distance" to scroll. Since we have a working system, for now the team and I will focus on gathering tradeoffs and metrics to prepare for the final presentation after Thanksgiving break. I am on schedule and will turn my focus toward finishing the implementation of the other mouse functions once the gesture recognition model is more accurate, as well as noting tradeoffs and collecting metrics for the final presentation.
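
A minimal sketch of the sampling and scrolling logic described above, using the mouse package. The window length, click threshold, scroll scaling, and the exact click-versus-hold rule here are placeholder values rather than the tuned ones in our code:

    import time
    import mouse  # the same package our mouse functions are built on

    SAMPLE_WINDOW = 0.5   # seconds of gesture samples to consider (placeholder)
    CLICK_THRESHOLD = 3   # treat it as a click if the gesture repeats this often (placeholder)

    recent_gestures = []  # (timestamp, gesture_name) pairs

    def record_gesture(gesture):
        # Store a gesture sample and drop anything older than the window.
        now = time.time()
        recent_gestures.append((now, gesture))
        recent_gestures[:] = [(t, g) for t, g in recent_gestures if now - t <= SAMPLE_WINDOW]

    def click_or_hold():
        # Decide between a click and a hold from the recent samples.
        # The exact decision rule here is a guess for illustration.
        clicks = sum(1 for _, g in recent_gestures if g == "click")
        if clicks >= CLICK_THRESHOLD:
            mouse.click(button="left")
        else:
            mouse.press(button="left")  # released later when the gesture ends

    def scroll_from_dy(dy, steps_per_pixel=0.05):
        # Convert the change in hand y position into a number of wheel ticks;
        # larger hand movement results in mouse.wheel being called more times.
        ticks = int(abs(dy) * steps_per_pixel)
        for _ in range(ticks):
            mouse.wheel(delta=1 if dy > 0 else -1)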

Andrew’s Status Report 11/15

This week, I, along with the rest of the group, spent a lot of time preparing for the interim demo. We met a couple of times over the past weekend to prep for it and made a decent amount of headway with the integration. We now have a functioning system where we can move the cursor to an onscreen position and close our hand to have it recognized as a clicking gesture, and this recognition doesn't produce noticeable latency in our system. My next step is making the cursor movement run smoother, which will initially be attempted with some form of rolling average (a rough sketch is below). Since the mouse movement gets a bit jittery when you try to click on specific locations, a rolling average should smooth out the cursor motion and make the cursor much more precise at tasks that require precision. Another thing I'll be working on in tandem with Brian is a hand landmark frontalization algorithm, performed by applying a three-dimensional rotation matrix to the hand landmarks, which will hopefully make our gesture recognition more accurate. Overall, like the previous week, I'm on schedule with my tasks.
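
A minimal sketch of the rolling-average idea, assuming a small fixed window; the real window size would need to be tuned, since a longer window is smoother but lags the hand more:

    from collections import deque

    class CursorSmoother:
        # Rolling average over the last few estimated cursor positions.
        WINDOW = 5  # placeholder window size

        def __init__(self):
            self.xs = deque(maxlen=self.WINDOW)
            self.ys = deque(maxlen=self.WINDOW)

        def update(self, x, y):
            # Add the newest raw position and return the averaged one.
            self.xs.append(x)
            self.ys.append(y)
            return sum(self.xs) / len(self.xs), sum(self.ys) / len(self.ys)

    # Usage: feed each raw hand-derived position through the smoother
    # before passing it to the mouse movement function.
    # smoother = CursorSmoother()
    # smooth_x, smooth_y = smoother.update(raw_x, raw_y)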

Alan’s Status Report for 11/13

This week, most of my work was done in preparation for the Interim Demo. We managed to successfully demonstrate the mouse movement as planned, and we even made progress with integrating the gesture recognition model to allow left mouse clicking and dragging. While Brian continues to refine the model, I will work on implementing the other mouse operations such as right clicking and scrolling. Right now I am working on quality-of-life changes to the mouse functions. One example has to do with clicking versus dragging. Due to the nature of the mouse module functions, if the gesture for holding the mouse is continuously input, it becomes somewhat difficult to differentiate between a user who wants to drag something around for a short while and a user who only wants to click. I am implementing code that differentiates between these by sampling gesture inputs over a short period of time. This will be paired with a sort of "cooldown timer" on mouse clicking, sketched below. The cooldown prevents the module from accidentally sending spam click calls to the mouse cursor as it continually detects the clicking gesture, and instead allows a user to click once every half second or so. These changes are being made to ensure a smoother user experience. Currently I am on track with the schedule.
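
A rough sketch of the cooldown timer, assuming a half-second cooldown and the mouse package's click function:

    import time
    import mouse  # same package used for the other mouse functions

    COOLDOWN = 0.5   # seconds between accepted clicks (assumed value)
    _last_click = 0.0

    def click_with_cooldown():
        # Issue a left click only if the cooldown has elapsed. The gesture model
        # keeps reporting "click" every frame while the hand is closed, so without
        # this guard we would spam click events.
        global _last_click
        now = time.time()
        if now - _last_click >= COOLDOWN:
            mouse.click(button="left")
            _last_click = now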

Team Status Report for 11/13

This week, the team finished a successful Interim Demo! We were able to show off a large chunk of our system's functionality, namely mouse movement and left mouse clicking/dragging, and we received great feedback and suggestions with which to proceed. With this deadline out of the way, and looking forward to the final presentation and final report, the biggest risk in our path is testing and verification. Now that we have a working system, we can start gathering testing metrics and preparing user stories for future testing.

Our system has no design changes, although we are all now making small changes to improve its smoothness and functionality. We did, however, discover a bug in our schedule thanks to the professors! Here is an updated and fixed version of our schedule.

Schedule

Brian Lane’s Status Update for 11/13

We had interim demos this week. My group demoed the functional portions of our project to the course TAs on Monday and the professors on Wednesday, and received positive feedback as well as suggestions for UI additions and potential training and dataset augmentations.

Personally, I spent this week improving the gesture recognition model, adding robustness to its predictions by introducing random rotations to the training data. This is done by sampling a random angle theta and constructing the 2×2 rotation matrix

[ cos(theta)   -sin(theta) ]
[ sin(theta)    cos(theta) ]

This matrix is then multiplied by the 2×21 matrix containing the landmark coordinates. These transforms resulted in much better accuracy in recognizing the click gesture when users' hands were not directly vertical.
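
For illustration, here is a small NumPy sketch of this transform, assuming the landmarks are stored as a 2×21 array with x values in the first row and y values in the second (the exact storage layout in our code may differ):

    import numpy as np

    def rotate_landmarks_2d(landmarks):
        # landmarks: array of shape (2, 21), row 0 = x values, row 1 = y values.
        theta = np.random.uniform(0, 2 * np.pi)
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
        return rot @ landmarks  # still 2x21 after the multiply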

I will spend next week further refining the model. Our current implementation is very accurate at classifying open hands and closed fists, but it is still struggling somewhat with the number gestures and the other hand signs.

Andrew’s Status Report 11/16

This week, I worked more on polishing up the code for the pose estimation. We began integration this week and met several times to discuss the demo as well as to piece together the system. I'm working on a smoothing algorithm for the on-screen position estimation, since right now, when we update the mouse location with the pose data, we get slightly noisy mouse movement when trying to be precise. While we're still able to click on the smallest pixel regions we targeted with relative ease, a big part of our project, as we've stressed, is user experience, and thus getting smoother cursor movement should be somewhat of a priority. I'm currently thinking about averaging the pixel motion. Just as periodic averaging smooths a signal by acting as a low-pass filter, I'm going to test whether the same applies here. Since our camera refresh rate is pretty high, it's safe to assume we have some noise in the hand detection even when the hand is stationary, so we'll see if this averaging smooths that mild jitter out. To stress again, this module isn't a top priority right now since we're able to perform within our initial specs, but it would be nice to have. Right now, integration of the system takes top priority. I'm on schedule with everything else I have outlined in the Gantt chart.

Team Status Report for 11/6

This week, the whole team continued to work on the deliverables that we plan to show during the Interim Demo. There was more collaboration and discussion between team members this week as we started to integrate our components. The integration of pose estimation and mouse movement is already functional but can still be fine-tuned. Training of the gesture recognition model using pose estimation data has also begun and is progressing smoothly. At this point, the biggest risk is whether the implementation we have committed to can meet the requirements and quantitative metrics we set for ourselves. Hopefully the Interim Demo will give us feedback on whether any aspect of our project needs to be rescoped or whether there are other considerations we have to make. Currently the system design and schedule are unchanged, and we are working towards a successful Interim Demo.