Shaye’s Status Report for 12/07

This past week I gave the final presentation, continued integration, and focused on completing testing for my individual portions of the project. 

For integration we hit a slight blocker: I had used a handedness classifier available through Mediapipe to differentiate the hands in our videos, but that classifier was unavailable in the Blaze hand landmarks version. As a result, I had to work around it by relying on the identified landmarks themselves to tell the hands apart. For now, I'm simply labeling the hand whose middle-finger landmark is leftmost in the frame as the left hand and the one whose middle-finger landmark is rightmost as the right hand. Since this method has obvious limitations (e.g., it breaks down if the hands cross), I'll work on using other landmark relationships (e.g., the thumb's position relative to the middle finger) to tell the hands apart as well.
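
Here's a minimal sketch of that fallback, assuming each detected hand arrives as a list of (x, y) landmark tuples with index 9 as the middle-finger MCP (the exact data structures in our pipeline differ):

```python
# Hypothetical fallback for telling hands apart without a handedness classifier.
# Assumes each detected hand is a list of (x, y) landmark tuples in image
# coordinates, with index 9 being the middle-finger MCP joint.
MIDDLE_MCP = 9

def label_hands(detections):
    """Label two detected hands 'left'/'right' by the x-position of landmark 9."""
    if len(detections) != 2:
        # With zero or one hands we can't disambiguate this way; let the caller decide.
        return None
    a, b = detections
    if a[MIDDLE_MCP][0] <= b[MIDDLE_MCP][0]:
        return {"left": a, "right": b}
    return {"left": b, "right": a}
```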

For testing, I've continued with the landmark and tension-algorithm verification. For landmark verification, I've wrapped up checking that Blaze and Mediapipe produce the same angles: Jessie ran some technique clips through the Blaze pipeline that I had previously run with Mediapipe, and I plotted the angle outputs from both to compare. For tension testing, I've gone through and organized the datasets we can verify with; more on this is in the team report. Tomorrow I'll actually run the videos through the system and record whether the output matches the label given to each video, as well as any instances of false positives/negatives.
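
The angle comparison itself is simple; here's roughly what the plotting looks like, assuming both pipelines write one wrist angle per frame to a text file (the file names here are made up):

```python
# Rough sketch of the Blaze-vs-Mediapipe angle comparison. The file names and
# the format (one wrist angle per line, one line per frame) are assumptions.
import matplotlib.pyplot as plt

def load_angles(path):
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

mp_angles = load_angles("mediapipe_angles.txt")     # output from my laptop pipeline
blaze_angles = load_angles("blaze_angles.txt")      # output from the Blaze/RPi pipeline

plt.plot(mp_angles, label="Mediapipe (laptop)")
plt.plot(blaze_angles, label="Blaze (RPi)")
plt.xlabel("Frame")
plt.ylabel("Wrist angle (degrees)")
plt.legend()
plt.show()
```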

Shaye’s Status Report for 11/30

I've spent the past two weeks focused on developing the tension algorithm while prepping for our interim demo and final presentations. After finding the trend between hand span and wrist movement, I've homed in on creating something that uses the current hand span to adjust the comparison windows for tension detection. I started this process by using the technique clips to decide on the approximate ratio between hand span and the tension time window. I was able to manually adjust the hand span input to produce more accurate results on the technique videos. Next, I'll add the ability to automatically adjust the algorithm based on hand span and test it according to the test plans outlined in our final presentation.
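
The span-adaptive idea looks roughly like the sketch below; the landmark indices, baseline numbers, and scaling direction (longer windows for smaller spans) are placeholders rather than the values I've tuned on the clips:

```python
# Sketch of span-adaptive windowing. The landmark indices (4 = thumb tip,
# 20 = pinky tip), the baseline numbers, and the scaling direction are
# placeholders, not the tuned values.
import math

THUMB_TIP, PINKY_TIP = 4, 20

def hand_span(landmarks):
    """Distance between thumb tip and pinky tip, in normalized image units."""
    (x1, y1), (x2, y2) = landmarks[THUMB_TIP], landmarks[PINKY_TIP]
    return math.hypot(x2 - x1, y2 - y1)

def window_size(span, base_span=0.35, base_window=20, lo=10, hi=60):
    """Scale the tension comparison window inversely with hand span: smaller spans
    produce smaller wrist-angle changes, so we watch over a longer stretch of frames."""
    scaled = int(round(base_window * base_span / max(span, 1e-6)))
    return max(lo, min(hi, scaled))
```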

Although I've gotten more promising results with this version of the algorithm, there's still a high likelihood that the algorithm is inconclusive. In the form we sent out asking the pianists and Dr. Dueck to label tense/non-tense playing, we got conflicting responses accompanied by high confidence ratings. As such, I'm also working on recording and presenting wrist angle data to be sent to the UI as another metric users can see for their playing. I'll also include the output of the tension algorithm in these graphs and provide some more analysis of the raw angle measurements. This way, we can still give our users feedback while providing a "beta" tension-detecting system.

For the demo and final presentations, I've mostly worked on integration and on compiling my testing plans and results. I'm still in the process of testing and will finish by tomorrow so the results can go into our final slides.

In the process of creating our project, I've found a new appreciation for the open-source resources available online. I spent lots of time reading technical blogs on hand landmark models to learn how to use Mediapipe and Blaze effectively in our project, and got a lot better at distilling these blogs to find what was actually helpful. This skill also came in handy when I was researching piano hand kinematics and tension: I had to read through a lot of papers and quickly determine what was relevant to our system specifically. Over the course of the project, I've grown familiar with most hand landmark models, learned more than I expected about hand physiology, and learned to find information quickly. My main takeaway is that what I can make isn't really limited by the knowledge I already have, but by how quickly I can find and distill the knowledge that's available online.

Shaye’s Status Report for 11/16

This week my main objectives were to integrate my code onto the pi, gather more video from the pianists, and prepare for the interim demo next week. 

For integration, I ported my pre-existing pipeline to rely on an accelerated Blaze-based version of Mediapipe, using code from this repo as the base to add my pipeline to. Currently, I've included our old tension algorithm as a placeholder for the typical processing workload we'd be doing with each frame. This will allow us to get a realistic estimate of our anticipated framerate (20 fps woo!). The modularity of the code will let me update the tension algorithm independently of the system running on the RPi, making future independent work straightforward. A video of this fully integrated with the buzzer is linked here.

During our last session with Professor Dueck's students, I gathered more clips of tense/non-tense technique playing. I also gathered some videos of students playing familiar and unfamiliar pieces and split them into 10-second clips; Jessie will send out a form this weekend for the students to label these clips as tense/non-tense. This will become the data I use to verify the tension algorithm.

Lastly, now that the system is more integrated, I'll also start focusing on verifying my algorithm using the data gathered last week. My current plan is to run the system on the labeled clips, then compare the output to the labels. I'll record the number of times we match the label, along with the number of false positives and false negatives for tension. If some results are inaccurate, I'll adjust some of the algorithm's parameters and run the tests again. This would involve adding more span-based conditionals when looking for tension indicators, and adjusting the existing window size and convolution parameters from before.
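
As a small sketch of how I plan to tally those results (assuming each clip yields a predicted/labeled boolean pair, where True means tense):

```python
# Sketch of tallying verification results; assumes each clip gives a
# (predicted_tense, labeled_tense) boolean pair, where True means tense.
def tally(results):
    counts = {"match": 0, "false_positive": 0, "false_negative": 0}
    for predicted, labeled in results:
        if predicted == labeled:
            counts["match"] += 1
        elif predicted and not labeled:
            counts["false_positive"] += 1
        else:
            counts["false_negative"] += 1
    return counts

# Example: three clips, one of which is a false positive.
print(tally([(True, True), (True, False), (False, False)]))
# -> {'match': 2, 'false_positive': 1, 'false_negative': 0}
```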

Additionally, I'll use the technique clips to verify that the output from the RPi matches the output from the Mediapipe-based pipeline running on my laptop. This means running the same test video on both systems, writing the tension output to a text file on each, and comparing the text files to check that they are consistent. If they are, I'll be able to safely assume that any work on the tension algorithm on my laptop will produce the same result on the RPi, making integrated testing much easier since we'll only need to test the algorithm on my laptop.
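
The consistency check itself could be as simple as the sketch below (the one-value-per-line file format and the numeric tolerance are assumptions):

```python
# Sketch of the laptop-vs-RPi consistency check; assumes both runs write one
# numeric value per line for the same clip. The tolerance is there in case the
# two platforms round floating-point results slightly differently.
def outputs_match(path_a, path_b, tol=1e-3):
    with open(path_a) as fa, open(path_b) as fb:
        vals_a = [float(l) for l in fa if l.strip()]
        vals_b = [float(l) for l in fb if l.strip()]
    if len(vals_a) != len(vals_b):
        return False
    return all(abs(a - b) <= tol for a, b in zip(vals_a, vals_b))
```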

Shaye’s Status Report for 11/09

This week I gathered more formal piano playing videos from Professor Dueck's students and did some analysis on them. More specifically, I took tense/non-tense recordings with the pianists, then post-processed the recordings to gather wrist angle data that will help inform future iterations of the tension algorithm.

During the session, I recorded the same tense and non-tense technique playing as usual. I also recorded a side angle for extra data, in addition to our current overhead angle, in case it provided a clearer difference between tense and non-tense playing. I then trimmed the videos so they were split by exercise and sorted them into tense/non-tense groups.

After taking and processing the recordings, I wrote a script that records the wrist angle over the course of an entire clip and outputs the data as a text file to an output directory. I then automated it to run on all of the recordings. Finally, I graphed the wrist angle data for the tense/non-tense playing for each exercise and compared them to one another.
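
The automation is just a loop over the recordings directory; roughly like the sketch below, where the directory names and the wrist_angles() helper are stand-ins for the real per-clip extraction code:

```python
# Sketch of the automation: loop over every recording, run the angle extraction,
# and dump one text file per clip. The directory names and the wrist_angles()
# helper (standing in for the real per-clip extraction) are placeholders.
from pathlib import Path

def wrist_angles(video_path):
    """Placeholder for the landmark pipeline: returns a list of wrist angles per frame."""
    raise NotImplementedError

def process_all(video_dir="recordings", out_dir="angle_outputs"):
    Path(out_dir).mkdir(exist_ok=True)
    for clip in sorted(Path(video_dir).glob("*.mp4")):
        angles = wrist_angles(clip)
        (Path(out_dir) / f"{clip.stem}.txt").write_text(
            "\n".join(f"{a:.3f}" for a in angles)
        )
```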

 

Tense vs. non-tense wrist angles for different exercises

From the graphs, I was able to confirm that tense playing correlates with less change in wrist deviation over time. I was also able to confirm what I suspected earlier: that playing with a smaller hand span means smaller differences in wrist deviation between tense and non-tense playing. This explains why the previous algorithm worked for exercises involving larger hand spans (arpeggios and chords), but didn't work for exercises involving smaller hand spans (scales and Hanon).

I do have some concerns that these findings aren't 100% sound. For some of the recordings, the built-in Mediapipe hand labeling would sometimes mislabel the hands, causing angle data between hands to mix. While I don't think this has a large impact on the metrics I'm looking at, I do think having a better way to label hands would improve the overall robustness of our system. Additionally, only two students were able to make our session last Wednesday, so these findings may be specific to just those two students. I'll gather more data with the remaining students this coming Wednesday to check whether the findings hold.

Tomorrow I'll look more into how to use these findings to write a new tension algorithm or modify our old one to work for smaller-hand-span exercises. I'll also analyze the side view recordings in more depth. The remainder of my time will be spent preparing for the demo the following week and helping out with integration.

Shaye’s Status Report for 11/02

This week I worked on reorganizing the code and writing a test script that utilizes pre-recorded clips of piano playing. I also created two more versions of the tension algorithm. 

For the code reorganization, I created a hand class that stores information about wrist angles and reference vectors, then refactored the code to use that class. This organization will be extremely helpful later on when I'm integrating with the web app. The test script I wrote lets me try different versions of the algorithm on video recordings. I haven't extensively tested any of the tension algorithms yet, but I will be doing so in the next few days.
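
Roughly, the class looks something like this (the attribute names here are illustrative rather than the exact ones in the codebase):

```python
# Rough outline of the hand class; the attribute names are illustrative rather
# than the exact ones in the codebase.
from collections import deque

class Hand:
    def __init__(self, label, history_len=20):
        self.label = label                        # "left" or "right"
        self.neutral_vector = None                # reference vector from calibration
        self.angles = deque(maxlen=history_len)   # rolling window of recent wrist angles

    def set_neutral(self, vector):
        self.neutral_vector = vector

    def add_angle(self, angle):
        self.angles.append(angle)

    def window_full(self):
        return len(self.angles) == self.angles.maxlen
```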

The new versions of the algorithm use the same idea as the old algorithm, with two main variations. In one version, we take a new neutral vector after some time to give a new reference point for calculating the wrist angle. The second new version is very similar, but instead of using a neutral vector as the reference for the wrist angle, we use the vector recorded from the previous frame. These changes are intended to prevent the algorithm from getting stuck outputting "not tense" after recording for too long. Additionally, they'll hopefully give more conclusive tension results on scales and Hanon. I'll be doing testing in the next few days to help confirm this.
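
Here's a sketch of the two reference strategies; the reset interval and the state handling are placeholders, not tuned behavior:

```python
# Sketch of the two reference strategies; the reset interval and state handling
# are placeholders, not the tuned behavior.
import math

def angle_between(v1, v2):
    """Unsigned angle in degrees between two 2D vectors (dot-product formula)."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def angle_vs_periodic_neutral(frame_idx, current_vec, state, reset_every=150):
    """Variant 1: re-record the neutral vector every `reset_every` frames, then
    measure the wrist angle against that refreshed reference."""
    if state.get("neutral") is None or frame_idx % reset_every == 0:
        state["neutral"] = current_vec
    return angle_between(state["neutral"], current_vec)

def angle_vs_previous_frame(current_vec, state):
    """Variant 2: measure against the vector from the previous frame, so the output
    reflects frame-to-frame change rather than drift from the original calibration."""
    prev = state.get("prev", current_vec)
    state["prev"] = current_vec
    return angle_between(prev, current_vec)
```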

For the coming week, I'm focusing on testing the algorithms more thoroughly to determine which one works best for which exercise. Additionally, I'll be collecting feedback about what tension looks like from Professor Dueck's students and using that to further develop the algorithm. Finally, I'll start porting my script to use Blaze, an accelerated version of Mediapipe for edge devices, so that I'll be ready to integrate onto the RPi when that's set up.

I’m currently not behind schedule, but will be a little busy next week and less available to work on the project. I’ll be spending more time this weekend/ next weekend to prevent this from being an issue. 

Shaye’s Status Report for 10/26

This week I focused mostly on the tension algorithm. During our previous testing session before fall break, we noticed that our algorithm isn't great at detecting tension in scales and Hanon. In layman's terms, our algorithm doesn't work when the fingers are tucked closer to the hand and aren't reaching as far.

During our session this week, I tried adjusting values in our current algorithm to produce different results during scales and Hanon, with little success. One specific issue did come up consistently with these two exercises: the algorithm would call the playing not tense once the pianist had been playing for long enough, regardless of whether they were playing tensely or not (this issue doesn't come up with chords and arpeggios). As a result, I have been writing and rewriting several versions of the algorithm. One version I'm optimistic about automatically records a new neutral vector every so often, which should hopefully solve the issue above. I've also asked Dueck's students to come up with some features of tension for me to incorporate into other versions of the algorithm. I'll continue working on these versions this week.

I also recorded video snippets to test with during the session. I’ll be writing some test routines to help resolve the algorithm issues.

I'll also be working on reorganizing the code in the coming week. Mainly, I'll be separating out the main camera landmark routine and the tension detection. Currently, tension detection is embedded in the landmark detection loop. Separating the two would allow for multithreading, potentially letting us run the system faster. The code will also be more modular and easier to follow.
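
The split would look something like a simple producer-consumer setup; the sketch below is only illustrative, with placeholder routines standing in for the real landmark and tension code:

```python
# Illustrative producer-consumer split; the two placeholder routines stand in
# for the real landmark detection and tension detection code.
import queue
import threading
import time

landmark_queue = queue.Queue(maxsize=8)

def detect_landmarks():
    """Placeholder for the per-frame landmark routine (camera read + model)."""
    time.sleep(0.03)                 # pretend one frame takes ~30 ms
    return {"wrist_angle": 0.0}

def detect_tension(landmarks):
    """Placeholder for the per-frame tension check."""
    return landmarks["wrist_angle"] < 5.0

def landmark_worker():
    # Producer: keeps pushing landmark results as fast as the model allows.
    while True:
        landmark_queue.put(detect_landmarks())

def tension_worker():
    # Consumer: pulls results and runs tension detection independently.
    while True:
        detect_tension(landmark_queue.get())

threading.Thread(target=landmark_worker, daemon=True).start()
threading.Thread(target=tension_worker, daemon=True).start()
```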

Currently, I’m not behind on writing/ testing the CV/ tension parts. However, our group is starting to fall behind on integration—I’ll be supporting Jessie with RPi setup & integration as needed. 

Videos from this week’s testing are here.

Shaye’s Status Report for 10/20

I spent the week before fall break focusing on testing the tension detection with Professor Dueck’s students and working on the design report. 

During our last session with the musicians, I had each musician play a variety of technique exercises, including scales, chords, arpeggios, and Hanon exercises, with both relaxed and tense wrist positioning. Generally speaking, the default threshold values for tension detection worked on exercises that require more wrist movement (arpeggios and chords), but worked inconsistently on scales and Hanon. Once I tuned the values a bit, the algorithm was more accurate at identifying tension in scales and Hanon. I'll work more on finalizing these values this coming Wednesday. I'll also standardize the set of exercises for the pianists to play while we're working with them.

We did run into a minor issue with the setup while testing. At times, the musicians' heads would accidentally get in the way of the camera and block a hand, causing the program to crash. I'll add more robust error catching before this Wednesday so that we experience less crashing while testing. Additionally, we didn't have a proper connector between our gooseneck and the tripod last session; getting this connector will help steady the camera feed and may also help resolve the issue.
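
The fix is likely just a guard around the per-frame processing, along the lines of the sketch below (compute_wrist_angle() is a stand-in for the existing angle code):

```python
# Sketch of the guard I plan to add: if a hand is occluded and detection returns
# fewer hands than expected, skip the frame instead of indexing into missing data.
def compute_wrist_angle(hand_landmarks):
    """Placeholder for the existing per-hand angle calculation."""
    return 0.0

def process_frame(detections, expected_hands=2):
    if detections is None or len(detections) < expected_hands:
        # A head or arm blocked a hand this frame; just move on to the next frame.
        return None
    try:
        return [compute_wrist_angle(hand) for hand in detections]
    except (IndexError, ValueError):
        # Malformed landmarks shouldn't take the whole session down.
        return None
```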

I’m still currently on schedule. In addition to what I’ve mentioned above, I’ll also begin running the CV pipeline on an RPi as a backup and assist Jessie with porting our existing CV pipeline as needed. 

Some clips of testing from last week are in this folder.

Shaye’s Status Report for 10/05

This week I worked on fleshing out the tension detecting system I detailed last week. This was a little more involved than I thought it would be—the function ended up consisting of the following pseudocode: 

  1. Record the last x wrist angles.
  2. Compute the difference between consecutive angles in the signal.
  3. Convolve the differences with a sliding window of 1's to compute the change over each window.
  4. If the result of the convolution is smaller than a threshold, not enough wrist angle change has happened and that window is labeled tense.
  5. If the result of the convolution is larger than the threshold, that window is labeled not tense.
  6. If over half of the windows are tense, the function returns tense.
  7. Otherwise, the function returns not tense.

The adjustable thresholds in the pseudocode are the number of recorded angles, the window size, the convolution threshold, and the fraction of tense windows. After messing around for a bit, I settled on 20 recorded angles, a window size of 3, a convolution threshold of 0.5, and half of the windows being tense as my values. You can see the results of these values in the video linked here. True means tension and false means no tension.
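
Put together as code, the function looks roughly like this (taking absolute differences and the exact normalization are my reading of the steps, so the real implementation may differ slightly):

```python
# Rough Python rendering of the pseudocode with the values I settled on.
import numpy as np

def is_tense(angles, window_size=3, conv_threshold=0.5, tense_fraction=0.5):
    """Return True (tense) if the last ~20 wrist angles show too little change."""
    angles = np.asarray(angles, dtype=float)
    # Step 2: differences between consecutive wrist angles.
    diffs = np.abs(np.diff(angles))
    # Step 3: convolve with a window of ones to get the change over each window.
    change_per_window = np.convolve(diffs, np.ones(window_size), mode="valid")
    # Steps 4-5: windows with less change than the threshold are labeled tense.
    tense_windows = change_per_window < conv_threshold
    # Steps 6-7: tense overall if more than half of the windows are tense.
    return np.mean(tense_windows) > tense_fraction

# A perfectly still wrist reads as tense; a steadily moving one does not.
print(is_tense([30.0] * 20))         # True
print(is_tense(list(range(20))))     # False
```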

I’ll be testing the tension detection function on Professor Dueck’s students this coming Wednesday. I anticipate that I’ll need to tune some of the values as this will be the first time I’ll be testing this function on an actual piano. 

Additionally, I worked a little more with the webcam this week. The team report has photos of the webcam setup. Once we finalize the full setup next week I’ll work more with the webcam on the tripod configuration. 

I’m currently still on schedule—I’ve temporarily held back on conversion to give Jessie some more time to scope out the FPGA. We’re unsure if it’s necessary to fully port the pipeline or not yet. We’ll continue to progress on this in the following week. I’ve requested an RPi as a contingency if we’re unable to port the pipeline by fall break.

Shaye’s Status Report for 9/28

I started this week by continuing to work on the hand tracking pipeline, mostly adding in two-handed tracking. My main focus of the week was working with the pianists on Wednesday. The hand tracking ran smoothly with the pianists and we were able to record some footage (linked here). I made a key discovery during the session: hand tension isn't determined by a set deviation in either direction. Tension occurs when the wrist is held in one position for too long. So, I'll be changing the pipeline to look for changes in wrist angle instead of specifically looking for ulnar/radial deviation. If the wrist remains at the same angle for too long, the pipeline will signal for a buzz.

There are no new issues with the pipeline. I still need to integrate a way to tell which direction the wrist is deviated—however, since we’re now tracking wrist angle changes, this is more of a “nice to have” rather than a necessary component. 

Despite the change, I'm still on schedule. For next week, I'll focus on continuing to beautify the code I have, as well as looking into how to port Mediapipe to PyTorch. Jessie has started looking more into the KV260 and discovered that TFLite, the framework Mediapipe is based on, isn't supported by the system. We'll need to convert the model to PyTorch or TensorFlow instead. I found more online resources outlining how to port to PyTorch, so I'll start looking there first. Finally, we'll be putting in the order for our camera stand next week, so I'll also begin assembly once it arrives.

Shaye’s Status Report for 09/21

This week I worked on building out the initial pipeline for our hand posture tracking. Before starting the pipeline, I looked more into how wrist deviation is usually measured. I found that it typically relies on the middle-finger joint as a reference: first, a neutral position is recorded (the solid vertical line in the photo below). Then, as the wrist deviates, we can measure the angle between that solid line and the dotted lines to find the angle of deviation.

Image from https://www.mdpi.com/2306-5354/10/2/219

I then ran the Mediapipe hand landmark detection demo available on GitHub and used the code as a starting point to build out the rest of the pipeline. After looking at the landmarks available to me, I decided to use points 0 and 9 to form the vectors for calculating the wrist deviation angle (see image below). I wrote a function to record a neutral position, then changed the code a bit to continuously find the vector formed between points 0 and 9 and calculate the current angle of deviation. This let me print out the live deviation of the wrist once a neutral position was recorded. A video demo of the running code is included here.

Landmark map for Mediapipe hand detection
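
The core calculation is just a dot product between the neutral 0-9 vector and the current one; a rough sketch, assuming landmarks are (x, y) tuples indexed by Mediapipe's numbering (0 = wrist, 9 = middle-finger MCP):

```python
# Sketch of the deviation-angle calculation; assumes each hand's landmarks are
# (x, y) tuples indexed by Mediapipe's numbering (0 = wrist, 9 = middle-finger MCP).
import math

def hand_vector(landmarks):
    """Vector from the wrist (point 0) to the middle-finger MCP (point 9)."""
    (x0, y0), (x9, y9) = landmarks[0], landmarks[9]
    return (x9 - x0, y9 - y0)

def deviation_angle(neutral_vec, current_vec):
    """Unsigned angle in degrees between the neutral and current 0-9 vectors."""
    dot = neutral_vec[0] * current_vec[0] + neutral_vec[1] * current_vec[1]
    norm = math.hypot(*neutral_vec) * math.hypot(*current_vec)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
```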

A current issue with the pipeline is that it can't tell which direction the deviation is in. Since I'm using a dot product to calculate the angle between my neutral vector and the live recorded vector, the angles I get as outputs are only positive. I'll be adding another check to identify and label the direction of deviation using more vector operations.
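
One possible way to do this (not yet in the pipeline) is to look at the sign of the 2D cross product between the neutral and current vectors; which sign corresponds to ulnar vs. radial depends on the hand and camera orientation, so the labels below are loose:

```python
# Possible direction check: the sign of the 2D cross product between the neutral
# and current vectors says which side of neutral the wrist has rotated toward.
# Which sign means ulnar vs. radial depends on the hand and camera orientation.
def deviation_direction(neutral_vec, current_vec):
    cross_z = neutral_vec[0] * current_vec[1] - neutral_vec[1] * current_vec[0]
    if cross_z > 0:
        return "side A (e.g., radial)"
    if cross_z < 0:
        return "side B (e.g., ulnar)"
    return "neutral"
```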

Another smaller issue is that the pipeline currently only works with one hand. I’ll add in the ability to track neutral positions & deviations for both hands while I’m finalizing and cleaning up the pipeline. 

For next week, my plan is to fully flesh out the pipeline by fixing the issues identified above and restructuring the code to be more understandable and usable. We have a work session with Professor Dueck's students on 9/25; two-handed tracking will be added by then. During the session, I'll test to make sure that the pipeline still works from an overhead angle with the piano keyboard backdrop. I'll also record some video and photos to test with the following week. Finally, I'll start looking at how to port the pipeline to work on the FPGA and assist Jessie as necessary with setup.

Link to GitHub with code. A new repo will be made for the team; I'm currently using this one as a playground to experiment with hand detection before transitioning to a more formal pipeline.