Shaye’s Status Report for 12/07

This past week I gave the final presentation, continued integration, and focused on completing testing for my individual portions of the project. 

For integration we hit a slight blocker: I had used a handedness classifier available through Mediapipe to differentiate the hands in our videos, but that classifier is unavailable in the Blaze hand landmarks version. As a result, I had to work around it by relying on the identified landmarks to tell the hands apart instead. For now, I’m simply labelling the hand whose middle-finger landmark is leftmost as the left hand and the one whose middle-finger landmark is rightmost as the right hand. Since this method has obvious limitations, I’ll also work on using other landmarks (e.g., the thumb’s position relative to the middle finger) to tell the hands apart.
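Roughly, the labelling looks like the minimal sketch below. It assumes each detected hand is a list of (x, y) landmarks in image coordinates and that index 9 is the middle-finger MCP (the MediaPipe ordering); the exact indexing in our Blaze port may differ, so treat this as an illustration only.

# Hypothetical sketch: label hands by the x-coordinate of a middle-finger landmark.
# Assumes each hand is a list of (x, y) landmark tuples and that index 9 is the
# middle-finger MCP, as in the MediaPipe landmark ordering.

MIDDLE_MCP = 9

def label_hands(hands):
    """Return a dict mapping 'left'/'right' to detected hands.

    The hand whose middle-finger landmark has the smaller x is labelled 'left',
    the larger x is labelled 'right'. This breaks down if the hands cross.
    """
    if len(hands) < 2:
        return {"unknown": hands}
    sorted_hands = sorted(hands, key=lambda h: h[MIDDLE_MCP][0])
    return {"left": sorted_hands[0], "right": sorted_hands[-1]}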

For testing, I’ve continued with the landmark and tension algorithm verification. For landmark verification, I’ve wrapped up confirming that Blaze and Mediapipe produce the same angles: Jessie ran some technique clips through Blaze that I had previously run with Mediapipe, and I plotted the angle outputs from both to compare them. For tension testing, I’ve gone through and organized the datasets we can verify against; more on this is in the team report. I’ll actually run the videos through the system tomorrow and record whether the output matches the label given to each video, as well as any instances of false positives/negatives.
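The angle comparison I’m doing looks roughly like the sketch below; it assumes the two angle series are already aligned frame-for-frame, and the 5-degree tolerance is an illustrative choice rather than our actual threshold.

# Hypothetical sketch: compare per-frame wrist angles from Mediapipe and Blaze.
import matplotlib.pyplot as plt

def percent_match(mediapipe_angles, blaze_angles, tolerance_deg=5.0):
    # Fraction of frames where the two models agree within the tolerance.
    pairs = list(zip(mediapipe_angles, blaze_angles))
    matches = sum(1 for a, b in pairs if abs(a - b) <= tolerance_deg)
    return 100.0 * matches / len(pairs)

def plot_comparison(mediapipe_angles, blaze_angles):
    plt.plot(mediapipe_angles, label="Mediapipe")
    plt.plot(blaze_angles, label="Blaze")
    plt.xlabel("Frame")
    plt.ylabel("Wrist angle (degrees)")
    plt.legend()
    plt.show()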

Danny’s Status Report for 12/07

 

For this week, I continued putting the videos and their associated graphs onto the web application. The graphs are still rudimentary, but they display whether tension was detected over the course of the recording. I investigated syncing the video playback with the graph, but I wasn’t sure how to implement it and it didn’t seem essential to the web application, so I skipped it for now. I was able to fix the remaining bugs, and the videos and graphs can now be displayed on the web application. I also used an HTML video library to get a better player for our videos, since the default one lacks some features. One remaining issue is that some videos cannot be played. I am currently working with the videos produced by the model, and some of them appear to be corrupted; I am not sure whether the model is not saving them correctly or whether it is a problem with my code. I am going to test with regular videos downloaded from the internet and work with Jessie if the model is outputting corrupted or invalid videos. Additionally, I worked on improving the instructions so that our users can clearly understand what is being asked of them.
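To narrow down whether the files themselves are corrupted or my playback code is at fault, a quick check like the sketch below (using OpenCV, which we already use elsewhere in the project) can tell whether a file is even decodable; the function name and usage are just illustrative.

# Hypothetical sketch: sanity-check whether a video file can be opened and decoded.
import cv2

def video_is_readable(path):
    cap = cv2.VideoCapture(path)
    if not cap.isOpened():
        return False
    ok, _frame = cap.read()  # try to decode at least one frame
    cap.release()
    return ok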

My progress is still on schedule. I am not worried about falling behind.

For next week, I hope to fix the playback issue where some videos cannot be played. I will also begin testing the new UI with different users; these users do not need to be pianists, as they will simply follow the instructions to set up the stand. Additionally, we have one integrated test planned for the UI as well. For the UI itself, I will clean up the tension graphs associated with each video. If time allows, I will also work on improving the look of the web application, such as changing the color scheme and moving components around, but these are minor, non-essential changes.

 

Team Status Report for 12/07

General updates:

  • At the beginning of the week, the team prepared for the final presentation by finalizing slides and doing some practice runs.
  • Jessie and Shaye worked together to adjust the live feedback code so the tension for 2 hands could be identified. Previously, our tension detection algorithm only worked for 1 hand. For more information, see Shaye’s status report.
  • Jessie continued to work on testing the post-processing and average FPS of various piano-playing videos. For more information view her status report. 
  • Danny continued working on improving the web application and displaying the videos on the website. For more information view his status report.
  • At the end of the week, the team also discussed the content we want to present on our poster. 

 

Unit testing:

 

UI/UX (Clarity of instructions, stand setup/take down)

  • We tasked one of our pianists to run through the instructions on the web application. This involved setting up the stand, placing the stand behind the piano bench and adjusting the camera, navigating to the calibration and recording page for a mock recording session, and then taking down the entire system.
  • We timed how long it took the pianist to complete each step, and most steps were within our time limits. One step we underestimated was how long it takes to set up the stand, so we will be working on that.
  • We received feedback afterwards that our instructions were not the clearest, so the pianist was not always sure what they had to do at each step. They also commented that providing pictures of each component would help clarify the instructions.
  • Since the pianist was able to complete most of the tasks within time, we are not too worried about making big changes to our system. We have made our instructions clearer and are going to add pictures to help our users. We are not currently planning to change anything about the stand or how setup/takedown is done, since the instructions were the main complaint and are easier to change.

Hardware (latency, frame rate)

  • We created a small data set of videos with varying types of pieces and of varying lengths. We used this data set to test both the average frame rate and post-processing time.
  • Frame rate
    • To find the average frame rate for each video, we keep an array of captured frame-rate values and average them at the end of the video. The base code already included the frame-rate logic: it uses a ticker to track time and, after every 10 frames, divides 10 by the time that has passed. We collect and average these values (a simplified sketch of this measurement and the post-processing timing is included after this list).
    • We found that our average frame rate was about 21-22 fps, which is within the targeted range. Therefore, we made no changes to our design due to the frame rate.
  • Post-processing time
    • To find the post-processing time of each video, we record a timestamp immediately before the frames are written into a video and another immediately after.
    • We found that the post-processing time was well below our targeted processing rate of ½ (processing time is half the video length); the rate we achieved is around ⅙. However, we plan to test this more extensively with longer videos. It’s possible that our code will change if these longer videos fail, which could slow down our processing rate, but since the rate is so far below our target, we are not concerned.
    • No major design decisions were made based on these findings. However, we did change our code because post-processing did not work for longer videos: instead of storing the frames of the video in an array, the frames are now stored in a file. This likely slowed down the frame rate and post-processing time due to the extra time to read and write files; even with this change, we are still within the target range.
  • Live feedback latency
    • To test the live feedback latency of our system, we inputted clearly tense and clearly non-tense movements by waving and then abruptly stopping. Our tension algorithm detects tension from horizontal angle deviation, so waving reads as extremely non-tense and an abrupt stop followed by a pause reads as extremely tense. Over multiple iterations of waving and stopping, we recorded when the waving stopped and when the audio feedback was produced; the difference between these two timestamps is effectively our system’s latency.
    • We found that the system had a latency of around 500 ms, well below our target of 1 second. These results come from testing tension in only one hand, since we are still finishing the tension detection algorithm; we expect the latency to go up with two hands because the frame rate drops when two hands are in frame. We don’t think this drop will push us over our target, though, since it is not very large (28 fps to 21 fps). Because we are meeting our target, no changes were made based on these results.
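For reference, the sketch below shows the general shape of the two measurements referenced above (the 10-frame ticker average and the post-processing timestamps). It is a simplified illustration, not the code that runs on the RPi; the function name and fixed output frame rate are assumptions.

# Simplified illustration of the frame-rate and post-processing measurements.
import time

import cv2

def measure(video_path, out_path, process_frame, fixed_fps=30):
    cap = cv2.VideoCapture(video_path)
    fps_samples, frames = [], []
    frame_count, tick = 0, time.time()

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(process_frame(frame))  # model + landmark drawing would go here
        frame_count += 1
        if frame_count % 10 == 0:            # every 10 frames, sample the frame rate
            now = time.time()
            fps_samples.append(10.0 / (now - tick))
            tick = now
    cap.release()

    avg_fps = sum(fps_samples) / max(len(fps_samples), 1)

    # Post-processing time: timestamp before and after writing the frames to a video.
    start = time.time()
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fixed_fps, (w, h))
    for f in frames:
        writer.write(f)
    writer.release()
    post_processing_time = time.time() - start

    return avg_fps, post_processing_time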

Model (accuracy of angles)

  • We compared Mediapipe’s angle outputs to hand measurements, and Mediapipe’s outputs to Blaze’s.
  • The outputs agreed, so no changes were made.
  • Results are presented as graphs of the angles and a summary of the percent match between each pair of outputs.

 

Tension detection algorithm (comparison to ground truth)

  • We run the detection pipeline on labelled clips and compare the output to the ground-truth results.
  • We compiled two datasets from the tension form that we sent out: one where at least 4/6 respondents agreed on the label, and one where at least 5/6 agreed.
    • We sent out 21 video clips to be labelled
    • The 5/6 dataset contained 7 videos 
    • The 4/6 dataset contained 15 videos
  • Overall, the datasets are shaky and the tension detection results are inconclusive. Given the mixed ground truth and the feedback suggesting a better angle, that could have been a worthwhile approach if we had more time for this project.
  • Preliminary results are summarized below:

 

Integrated tests:

 

Tension on RPI: 

  • To test whether the tension detection algorithm maintains high accuracy when run on the RPi, we plan to use the same procedure as the tension detection testing described earlier, but on the RPi with the full system. We plan to run this soon and currently have no results for this test.

 

Overall workflow UI/UX testing: 

  • Because we are not quite finished integrating our system, we plan to do this in the near future and have yet to collect results. However, to test our full system, we plan to first have Shaye, our local pianist, run through the workflow to see if there are any major integration issues. Once this works, we will ask a pianist who is unfamiliar with our system to try and go through the workflow to see if the UI is clear.

 

Jessie’s Status Report for 12/07

What I did this week:

  • UPDATE PROCESSING DATA SET VIDEOS: I continued working to test the dataset of videos for average fps and post-processing time. I was correct in my guess that the video was playing much slower on the display because the clip had a high frame rate: since our processing rate is lower than the input frame rate, the video looks like it’s moving slower. I changed my dataset video processing script to adjust the frame rate to 30 fps.
  • UPDATE POST PROCESSING CODE: I found that my code now worked for the shorter videos (1 minute) but did not work for the 5-minute video. I thought this could be due to the large number of video frames, which I was storing in an array (in memory) before writing them to a video at the end; running out of memory could have been what caused my program to be killed prematurely. I adjusted my code to store the frames in a file on disk instead of in memory (a rough sketch of this change is included after this list). I believe this change slowed down our frame rate slightly, likely due to the extra time to execute I/O (writing to a file); however, the frame rate is still above target (close to 21-22 fps, while previously it was around 22-23 fps).
    • With this change, the 5-minute video ran successfully and I was able to collect data. I found that the post-processing time was around 1 minute (about 1/6th of the video length), which is still well below our target of ½ of the video length. This rate was lower than what I previously observed for the shorter videos; that could be because those videos were shorter, or because of the additional time to read frames from a file (instead of an array) during post-processing.
    • I tried running a longer video (11 minutes) but it seemed to crash. I’m not sure what happened and more investigation will need to be done. 
    • The updated live feedback code can be found here: https://github.com/mich-elle-xu/backintune/blob/jessie_branch/blaze/ported_live_blaze_handpose.py 
  • DEBUG LED: I noticed that the wireless LED stopped working this week. The problem seems to be that the RPi cannot connect to the LED; I cannot even ping it. I made some adjustments to the LED code to try to debug the issue, but it seems to be related to a poor WiFi connection. I plan to create a wired alternative, and debugging the wireless connection will be a low priority.
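A minimal sketch of the memory-saving change mentioned above is shown below: frames are spilled to disk as they are captured and only read back when the final video is assembled. The one-image-per-frame layout and use of cv2.imwrite are my own illustrative choices, not necessarily how the actual script stores them.

# Hypothetical sketch: write frames to disk during capture instead of holding them
# all in memory, then read them back during post-processing.
import os

import cv2

def save_frame(frame, frame_dir, index):
    # One image file per frame; avoids holding the whole session in RAM.
    cv2.imwrite(os.path.join(frame_dir, f"frame_{index:06d}.png"), frame)

def write_video(frame_dir, out_path, fps):
    names = sorted(os.listdir(frame_dir))
    first = cv2.imread(os.path.join(frame_dir, names[0]))
    h, w = first.shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for name in names:
        writer.write(cv2.imread(os.path.join(frame_dir, name)))
    writer.release()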

Schedule:

I am still roughly on schedule. The testing infrastructure is largely complete, and I just have to run it on the updated live feedback code once the tension algorithm is finalized; I’m still working with Shaye to finish detecting tension in both hands. I don’t anticipate this taking long since the infrastructure is already established. Next week I plan to work mostly on the poster, video, and report, as well as some integration finishing touches for the demo.

Next week’s deliverables:

  • Running the existing testing infrastructure on the updated code (finalized tension detection algorithm)
  • Investigate the wireless LED and create a wired backup. 

Danny’s Status Report for 11/30

For this week, I spent most of my time integrating with Jessie and the overall system. This involved cleaning up the web application and designing the UI so it resembles more of a finished product. As a team, we thought about what instructions would need to go on the website, and I was in charge of putting the instructions onto the website. Additionally, I continued interfacing with Jessie and changed the recording page to be able to pass arguments to the model. We wanted to pass in a toggle for the display, a volume control, and the rate at which the buzzer goes off. The recording page is now able to properly start the model with the options specified, and the user is also able to stop the model from the recording page. Once this is done, the model saves the recorded video into the filesystem, where it is visible to the user on the analytics page. I also began looking at how to display the videos on the website. I was asked to look into listing the videos in expandable tabs so that the UI looks cleaner; I have a rough idea of how to do it and will hopefully have it completed by the end of this week.

For some verification, we met with one of Professor Dueck’s students (the other pianists were unfortunately not able to attend), and we were able to get some rough testing and verification from them. We had them look at our website and run through the instructions provided, and we timed how long it took them to do each task. Unfortunately, it seems that our instructions were not the clearest and that there is some work to do to make our project easier to use.

Finally, I helped my team with the final slides. We drafted what we wanted on the slides beforehand, and I then helped create some of the slides, such as the block diagram and software tradeoff slides. Additionally, I created the slide relating to the UI/UX verification, which covers the setup, UI, and takedown.
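As an illustration of how the recording page could hand the user’s options to the model, here is a minimal Django-style sketch. The view names, form fields, script path, command-line flags, and URL names are all hypothetical; the real wiring lives in our recording-page views and Jessie’s launcher script.

# Hypothetical sketch of Django views that start/stop the model with user options
# (display toggle, buzzer volume, buzz interval). Names, paths, and flags are
# illustrative only.
import subprocess

from django.shortcuts import redirect

MODEL_PROCESS = {"proc": None}  # simplistic single-user process handle

def start_recording(request):
    volume = request.POST.get("volume", "50")
    buzz_interval = request.POST.get("buzz_interval", "1.0")
    show_display = request.POST.get("show_display", "off") == "on"

    cmd = ["python3", "ported_live_blaze_handpose.py",
           "--volume", volume, "--buzz-interval", buzz_interval]
    if show_display:
        cmd.append("--display")
    MODEL_PROCESS["proc"] = subprocess.Popen(cmd)
    return redirect("recording")

def stop_recording(request):
    proc = MODEL_PROCESS["proc"]
    if proc is not None:
        proc.terminate()   # the model saves the recorded video on exit
        MODEL_PROCESS["proc"] = None
    return redirect("analytics")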

 

I would say my progress is on track. I am not worried about finishing everything before the final demo.

 

For next week, I hope to finish the UI and finish integrating with the rest of the system. This involves updating the instructions and updating how the videos are displayed on the website. As a stretch goal, a more detailed analytics page will be implemented as well. 

 

I came into this project not knowing anything about web applications and how to design or create them. I had some very basic HTML and CSS knowledge from a couple of years ago, and that was about it. I had to learn how to use the Django framework for this project and how to use it to build a website. This involved learning how to interface with a database to store metadata, how to set up a web server on the RPi, and how to create a cleaner-looking frontend by working with HTML, a small amount of JavaScript, and Django. To learn these things, I read a lot of the Django documentation. I completed the small Django tutorial to set up a basic web application at the beginning of the semester, which helped solidify the ideas for me. Additionally, I read many Django forums and StackOverflow posts to answer questions about the features I had to implement.

Jessie’s Status Report for 11/30

What I did last week:

  • GROUND-TRUTH FORM: I started the week by making a Google form and corresponding spreadsheet to collect/organize ground truth data from Prof Dueck and her students. 
    • https://docs.google.com/forms/d/e/1FAIpQLSdZ1LsQ5ckyfkgTnUzCFKm7-o78PcxrrPIhwr4NxlNbCd_XEQ/viewform?usp=sf_link
    • For each clip included, we asked Professor Dueck and her students to fill out a tension identification rubric (sent by Prof Dueck), whether they thought the clips were tense in general, and their level of confidence in their response.
    • I used the video clips Shaye created by dividing the recorded footage of Prof Dueck’s students into 10-second snippets. I had to upload each of these videos to YouTube in order to link them in the form.
    • The form includes 23 clips, though we have 50 clips in total; we only used about half of them because the form was getting quite long. Additionally, some of the clips were unusable since the hands were covered by the pianist’s head.
  • CLEAN LIVE-FEEDBACK CODE: At the beginning of the week, I also cleaned up the code from the Hackster article, since it includes a lot of optional functionality that we don’t use. A more concise version will also be easier to debug in the future. This can be found in the “cleaned up code” commit on the jessie_branch.
  • DETERMINE HANDEDNESS: I then tried to figure out how to edit the tension algorithm so it would work for 2 hands (instead of just one). However, I was unable to determine the handedness (identifying right versus left hand) metric from the model. There was a variable defined in the precompiled model; however, the output seemed the same for left and right hands. Shaye said they will write an algorithm to determine the handedness since we’re not sure the model provides it. 
  • VIDEO RECORDING: I also worked to integrate video recording into the live feedback code, based on the code from the Hackster article. I wanted the video to include the visual of the model’s landmark placement so that users can see if the landmarks were incorrect, which could cause inconsistencies in our tension detection algorithm. The code from the Hackster article draws the landmarks onto each processed frame of the input video feed and outputs the result to the display. I created an array of these annotated frames, which can then be written into a video. The changes can be found in the “can record video and buzzes every second instead of all the time” commit on the jessie_branch.
    • A minor concern I have right now is that our captured frame rate is not consistent; it fluctuates. However, when I write the frames into a video, I must use a fixed frame rate, so the playback may not be synchronized with the timing at which the video was recorded. I don’t think this will affect our functionality, though it could be disorienting for users.
  • ADJUST BUZZER FEATURE: Lastly, I adjusted the buzzer feature so that instead of buzzing constantly while tension is detected, it buzzes once when tension is detected and continues to buzz at a user-specified rate if tension persists. I attempted to achieve this by using a flag for tension and creating a recursive thread with a timer. Each time tension is newly detected, a thread that buzzes continually is started; it uses a timer to wait a user-specified time before buzzing again, and if tension is no longer detected (signaled by the flag), the thread exits. These changes can also be found in the “can record video and buzzes every second instead of all the time” commit on the jessie_branch.

 

What I did this week:

  • INTERFACE LED AND ESP32: At the beginning of this week, I learned how to work with the RGB LED and ESP32 board. I first had to determine whether the RGB LEDs we had were common cathode or common anode; I didn’t have a multimeter on hand, so I tried different wirings of the LEDs to the RPi and noted when the LED lit up. Once we determined this, I wired the LED to the ESP32 board. Using the Arduino IDE and with the help of ChatGPT, I was able to download the necessary libraries and achieve the basic LED functionality. We want the LED to be wireless so that the user can place it somewhere in their line of sight and not be restricted by the RPi wiring, so we opted to use HTTP requests to communicate from the RPi to the ESP32. Code for the basic functionality can be found here.
  • INTEGRATE LED TO RPI: I then integrated the LED into the live-feedback code. The LED is turned off at the beginning of the program and at the end of the program. That way if the LED is on, it should signal that recording is occurring. Additionally, I was able to determine how many hands the model recognizes by evaluating the length of the landmarks array; I used this information to make the LED green when >= 2 hands were in frame and make the LED red otherwise (only 1 or 0 hands were recognized). 
    • I reverted this code to debug the integration of user-specified values with the web app, but the final code on the RPi can be found in this commit. I also made a copy of the code that is on the ESP32 and that can be found here.
    • When I implemented setting the LED color on every input frame, the frame rate dropped significantly (to around 7 fps). I think this is because it takes time for the ESP32 to respond to the GET request from the RPi, which blocks the program. To combat this, I limit the number of requests sent by keeping a flag for the LED’s current color and only sending a request when the color must change. I also send each request from its own thread (a rough sketch of this is included after this list).
    • Another problem I ran into was that the connection between the RPi and ESP32 would sometimes fail, so the request to change the LED color would be dropped. To handle this, I catch the connection-error exception and retry the color-change request until it succeeds.
    • A concern that I still have is that sometimes it seems like the ESP32 gets overwhelmed– it gets a bit warm and many requests will fail. Further testing will need to be done to determine how robust our system is and if this could be a problem. 
  • DEBUG BUZZER: I returned to the buzzer feature because, after more iterations, it became apparent there was a severe multi-threading problem. The buzzing often started out fairly evenly spaced, but as the ‘practice’ session progressed the buzzing became more erratic.
    • I think the unevenness was due to multiple buzzing threads existing at the same time. Multiple buzzing threads can exist if, while one thread’s timer is waiting, the state goes from tense to not tense and back to tense; the existing buzzing thread never detects that the user became not tense and continues buzzing while another buzzing thread is created. To remedy this, I used an “event” in Python to block/unblock threads, so I can detect a change in state and interrupt the thread (a sketch of the event-based version is also included after this list).
    • At first, I used this event in conjunction with the flag I previously used; however, I was still having more subtle threading issues. Print statements when a thread was created and when a thread finished confirmed that multiple threads still existed at the same time, though less frequently than before. I believe this was due to race conditions, since setting the flag and setting the event was not atomic. To fix this, I removed the flag and checked whether the event was set instead. Print statements confirmed that only one buzzing thread exists at a time.
    • There are still instances when buzzing is not evenly spaced (2 buzzes close together); however, my print statements show that this is due to the fast change in state from tense, to not tense, and back to tense as the system will immediately buzz each time tension is detected. If this proves to be a big issue in practice, we can increase the window size in the tension algorithm so that the algorithm is more robust to small changes in angle deviation. 
    • The finalized code can be found at this git commit.
  • TEST LATENCY: Once the buzzer feature was working as intended, I tested the system’s live feedback latency (the time between tension and live audio feedback). 
    • To do this, I deviated a bit from what I had previously planned. Originally, I was planning on finding the time between tension detection and audio feedback; however, when I tried doing this by putting a print statement when we detect tension, the print statement came after the audio feedback. This implies that the time between detection and outputting is extremely small. This measurement is likely inaccurate since there is a noticeable gap between when I am tense (no hand movement) and when I hear audio feedback. This implies that most of the latency is from the model processing/tension detection algorithm. 
    • To better test the system, I measured the time between when I became tense and when I heard a buzz. It is difficult to determine exactly when a piano player becomes tense; however, our algorithm detects tension when there is no horizontal hand movement and detects no tension when the hand is waving. So I decided to alternate between waving my hand and abruptly stopping, so I could precisely document the start time of tension.
    • Danny helped me record a video on his phone (our system doesn’t have a microphone) of me interacting with the system by waving my hand. I then played this video at 0.1x speed to have a finer time granularity when documenting the time of the tension and audio feedback. 
    • I used the recorded times of the tension and audio feedback to find the latency. From there I found some metrics to present in our final presentation.
    • Here is a link to the video Danny recorded that I used to measure latency time: https://drive.google.com/drive/u/0/folders/1W-ZQKiA9caTTIdh_Ydgcm5rLkRDS_jbL
    • Here is a link to a Google sheet with my results: https://docs.google.com/spreadsheets/d/1GLNH3a0HwWe3TAwUPFOv4lbOCp4f87zj3cBo2VtS-zI/edit?gid=0#gid=0
    • It is worth noting that I was only able to test our system with one hand in the frame because the handedness feature has yet to be completed (so our system only works on one hand). This is important because our system has a much lower frame rate once two hands are in frame (22 fps versus 30 fps), so the latency will likely be higher than what I recorded once the tension detection algorithm is finalized. 
  • INTERFACE USER INPUTS: I then worked with Danny to interface the RPi with the web app: starting/stopping recording, passing in the buzzer volume, the display toggle, and the time between consecutive buzzes, and outputting the video recorded by the live-feedback code.
    • I created functions that would call and quit the live-feedback program. This can be found in the file call_ported_live_blaze_handpose.py
    • I adjusted the live-feedback code to take the buzzer volume, the time between consecutive buzzes, and the toggling of the display as arguments that could be passed in. The finalized code can be found at this git commit
  • INTERFACE TENSION GRAPHS: I also started to work with Danny to interface the synced tension graphs with the video from the RPi to the web app. To do this, I created an array of tense/not tense values that are written to a csv file that corresponds to the recorded video of the same name.
  • FRAME RATE AND POST-PROCESSING VERIFICATION: Lastly, I started to work on the frame rate and post-processing time tests. 
    • The given code finds the frame rate by using a ticker to keep track of time. Each time 10 frames are processed, it will divide 10 by the amount of time that has passed to find the frame rate. I add these values to an array to be averaged at the end of the program.
    • To find the post-processing time I create a time-stamp before the frames are written to a video and after the frames have been written to the video. 
    • The script can be found here: 
    • Next week I plan to convert all the videos to a smaller size and investigate their frame rates. 
    • First I had to figure out how to input videos (instead of the webcam feed) into our system. The code from the Hackster article already had this functionality, but I discovered I was only able to input .mp4 videos (not .mov).
    • I created a set of videos which were of various lengths. I used videos we recorded from Prof Dueck’s students, which varied in piece type but were all around 1 minute. To stress-test the post-processing time, I downloaded a couple of longer videos from YouTube. I ensured all these videos were .mp4.
    • I then edited my code to record the captured fps values and post-processing time. 
    • Then, with the help of ChatGPT, I wrote a script to run the downloaded videos and gather the video length, average framerate, and post-processing time into a csv file.
    • I tried testing this script on just one video; my code seemed to work for a while but was killed prematurely. After looking at the video’s metadata, I believe this happened because the video was too big and took up too many resources. Once I made the video smaller (scaling the height/width), the code completed successfully. However, the video played on the display much slower than the webcam input, which caused my script to take a while. I still have to investigate and fix this, but I suspect it is because of the high frame rate of the input video.
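To make the LED behaviour described above concrete, here is a rough sketch of the throttled, threaded color-change request with a retry when the request fails. The ESP32 address, endpoint, and parameter name are made up for illustration; this is a sketch of the idea, not the code on the RPi.

# Hypothetical sketch of the RPi-side LED updates: only send a request when the
# color actually changes, send it from a thread so the frame loop isn't blocked,
# and retry if the request to the ESP32 fails. URL/endpoint are illustrative.
import threading

import requests

ESP32_URL = "http://192.168.0.50/color"   # made-up address
_current_color = None

def _send_color(color):
    while True:
        try:
            requests.get(ESP32_URL, params={"c": color}, timeout=2)
            return
        except requests.exceptions.RequestException:
            continue  # connection dropped or timed out; retry until it succeeds

def set_led_color(color):
    global _current_color
    if color == _current_color:
        return                      # throttle: skip redundant requests
    _current_color = color
    threading.Thread(target=_send_color, args=(color,), daemon=True).start()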
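And here is a rough sketch of the event-based buzzer described above: a single thread buzzes at the user-specified interval and exits as soon as the “tension ended” event is set, so at most one buzzing thread exists at a time. The buzz() placeholder and class structure are my own; the actual code is organized differently.

# Hypothetical sketch of the event-based buzzer.
import threading

def buzz():
    pass  # placeholder for the actual buzzer output

class TensionBuzzer:
    def __init__(self, interval_s):
        self.interval_s = interval_s
        self._stop = threading.Event()
        self._thread = None

    def tension_started(self):
        # Only start a new thread if no buzzing thread is currently alive.
        if self._thread is None or not self._thread.is_alive():
            self._stop.clear()
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()

    def tension_ended(self):
        self._stop.set()  # wakes the waiting thread so it can exit immediately

    def _run(self):
        while not self._stop.is_set():
            buzz()
            # wait() returns True (early) if tension ends during the interval.
            if self._stop.wait(self.interval_s):
                break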

 

Schedule:

I am still roughly on schedule, as the system integration is largely complete and I am almost done with testing. The only two big parts of system integration left are (1) syncing the tension graphs with the video on the web app and (2) integrating the finished tension algorithm. For (1), I have already outputted an array of tense/not-tense information for Danny to turn into a graph; for more sophisticated graphs, I am waiting for Shaye to finalize the tension detection algorithm, which (2) also depends on.

Next week’s deliverables:

  • Finish testing frame rate and post-processing times.
  • Redo latency testing on the finished tension detection algorithm (with 2 hands in frame). 

 

Learning Reflections:

I had to learn a lot for this project. 

  • I had little experience using Raspberry Pis, so even getting SSH and wifi setup was difficult. 
  • Learning to interface with all the hardware was also a learning curve: installing the necessary RPi accelerator materials, writing code to interface with the SPI display, learning how to use active/passive buzzers, learning how to program the ESP32 chip with the Arduino IDE. 
  • I also had to use many Python libraries I was unfamiliar with.

 

My learning involved many strategies. Our approach for figuring out how to accelerate the model was to look up similar applications online; for the accelerator materials, I followed the Hackster article. For other components, I followed the given documentation and materials; for example, for the SPI display setup I followed this article https://www.waveshare.com/wiki/2inch_LCD_Module#Python and for the ESP32 wiring I followed the pinout provided in the original Amazon listing. Shaye was also a big help, as they were more familiar with many of the hardware components I was working with; I would often go to them first for guidance on what I should generally do. Lastly, ChatGPT was a big help when it came to adjusting to new Python libraries, as it could generate basic code for me to build on and give me ideas for debugging problems related to unfamiliar libraries (specifically cv2 and threading).

Shaye’s Status Report for 11/30

I’ve spent the past two weeks developing the tension algorithm while prepping for our interim demo and final presentations. After finding the trend relating hand span to wrist movement, I’ve homed in on an approach that uses the current hand span to adjust the comparison windows for tension detection. I started this process by using the technique clips to decide on the approximate ratio between hand span and the tension time window, and I was able to manually adjust the hand-span input to produce more accurate results on the technique videos. Next, I’ll add the ability to automatically adjust the algorithm based on hand span and test it according to the test plans outlined in our final presentation.
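A heavily simplified sketch of the idea is below: estimate the current hand span from the landmarks and scale the comparison window accordingly. The landmark indices, baseline span, and base window size are placeholders, and the direction/ratio of the scaling is what I’m still tuning.

# Hypothetical sketch: scale the tension-detection comparison window by hand span.
import math

THUMB_TIP, PINKY_TIP = 4, 20      # MediaPipe-style landmark indices
BASE_WINDOW_FRAMES = 30           # window size at the baseline span (made up)
BASELINE_SPAN = 0.25              # normalized span used for calibration (made up)

def hand_span(landmarks):
    (x1, y1), (x2, y2) = landmarks[THUMB_TIP], landmarks[PINKY_TIP]
    return math.hypot(x2 - x1, y2 - y1)

def window_size(landmarks):
    # Larger spans get a proportionally adjusted comparison window.
    ratio = hand_span(landmarks) / BASELINE_SPAN
    return max(1, int(round(BASE_WINDOW_FRAMES * ratio)))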

Although I’ve gotten more promising results with this version of the algorithm, there’s still a high likelihood that the algorithm is inconclusive. In the form we sent out asking the pianists and Prof Dueck to label tense/non-tense playing, we got conflicting responses accompanied by high confidence ratings. As such, I’m also working on recording and presenting wrist-angle data to be sent to the UI as another metric users can see about their playing. I’ll also include the output of the tension algorithm in these graphs and provide some more analysis of the raw angle measurements. This way, we can still give our users some feedback while also providing a “beta” tension-detection system.

For the demo and final presentations, I’ve mostly worked on integration and on compiling my testing plans and results. I’m still in the process of testing and will finish by tomorrow so the results can go on our final slides.

In the process of creating our project, I’ve found a new appreciation for the open-source resources available online. I spent lots of time reading technical blogs on hand landmark models to learn how to effectively use Mediapipe/Blaze in our project, and got a lot better at distilling these blogs to find what was actually helpful. This skill also came in handy when I was researching piano hand kinematics and tension: I had to read through a lot of papers and quickly determine what was helpful for our system specifically. Over the course of the project, I’ve grown familiar with most hand landmark models, learned more than I expected about hand physiology, and learned to find information quickly. My main takeaway is that what I can make isn’t really limited by the knowledge I already have, but by how quickly I can find and distill the knowledge available online.

Team Status Report for 11/30

General updates:

  • The team worked together at the beginning of last week to finish preparing for the interim demo. 
  • The team also worked together to write an initial draft of the setup instructions on the webpage. We then met with one of Professor Dueck’s students to test the setup and take down time of our stand using our instructions. We also wrote a Google form to gather feedback on the instructions for the setup/takedown process. 
  • This week, the team worked to create slides for the final presentation; each person largely worked on the slides relevant to their contributions. Jessie wrote the slides on verifying the hardware aspects (frame rate, post-processing time, and latency) and on the design tradeoffs between the NVIDIA Jetson, AMD KV260, and an accelerated RPi. Shaye wrote the slides on verifying the software (tension algorithm, hand landmark models). Danny wrote the slides on verifying and testing the UI/UX (setup time, takedown time, and the web application).
  • Jessie continued to work on integrating features onto the RPi. She implemented the LED recording-feedback feature Joshna suggested during the interim demo, adjusted/debugged the buzzer feature, and enabled video recording in the live feedback code. She also wrote a Google form to gather ground truth on tension based on some clips we’ve recorded. Lastly, she worked on testing our system’s latency and started writing a script to find the average frame rate and post-processing time for various videos. Refer to her status report for more details. 
  • Shaye continued iterating on the tension algorithm & tracking different aspects of playing. Refer to their status report for more details. 
  • Danny continued working on integrating with the rest of the team. Time was also spent cleaning up the web application and implementing the different features that were asked for. Refer to his status report for more details.

Danny’s Status Report for 11/16

For this week, as mentioned in the team status report, I helped set up the RPi for use. One concern was that the 16 GB SD card we originally had would be too small, since we want to store video files on the RPi; we also had some slight concern that our model might be too big for the SD card. Thus, I helped transfer the contents of the 16 GB SD card to a larger 128 GB SD card. As for the web application, I worked on the front end and decided on the pages we will have in the end. This involved trimming a lot of unnecessary content that came with the template; I tried to stick with the bare minimum we needed so our users will not be overwhelmed or confused. Next, I worked with Jessie to integrate the web application with the model. I created two separate pages, one for calibration and another for recording, that are used when we interface with the model. For now, the calibration page has two buttons that start and stop the small display. The recording page similarly starts and stops a separate script that records from the webcam and stores the output video file appropriately; this script was just to test the functionality of our buttons and will be changed to start and stop the model on the RPi in our final product. I then started looking into how to store the video files within the web application. I encountered some small bugs that prevented me from finishing, but I believe I have a good idea of how to fix the issue.

 

I would say that I am on schedule now and am not worried about my progress. I have completed the work we wanted to finish before the interim demo and am on track for the final demo.

 

For next week, I hope to be able to show the video files on the website. There was a small issue with displaying them before, but I would ideally like to figure it out by the end of next week. I also want to create the instructions page and figure out how to display the post-processed graphs of the videos. Finally, I would like to set up the web server so we do not have to run a command to start the web application; I did this previously, but we ended up switching RPis, so I will have to redo it. As a stretch goal, I would also look into a better way of displaying the videos on the website; it does not look the best right now, but it is more important to get it functional first.

Verification:

We currently have not tested the web application with any of our users. As mentioned in our design report, we plan to test the web application by having users use our website and give us their feedback. This test will consist of having the users complete a series of tasks that we have outlined for them, and we will measure how long it takes each user to complete each task. The exact requirements are outlined in our design report, but each task should take no more than a couple of minutes; ideally, users would not get stuck at any step in the process. The measurements will be compared to the requirements we have set to ensure that users can use the web application without difficulty. If any step takes longer than anticipated, we will figure out how to make that specific step more accessible and easier for our users. Before and after they complete the tasks, we will ask the users for feedback on the web application. Lastly, we will most likely poll our users on our overall system, which may include more web application feedback they would like to give.

Team Status Report for 11/16

General updates:

  • We worked as a team to piece together the RPi into its case with the display wiring and new SD card. Danny transferred the data on the 16GB SD card to the larger 128GB SD card. Shaye pieced the accelerator to the RPi, ensuring that the GPIO pins were exposed. Jessie had to redo the display wiring many times in this process. 
  • In general, the team’s goal for this week was to have an integrated system for the demo on Monday/Wednesday. Individual parts of the system aren’t fully done, but the basic workflow and basic integration have been completed. We are confident we can complete integration the following week. 
  • Shaye and Jessie worked to interface the blaze model on the RPi with the tension-tracking algorithm Shaye had previously written. See Shaye’s status reports for more info on the integration. 
  • Danny and Jessie worked together to interface button clicks from the web application with starting/stopping video recording and starting/stopping calibration. Jessie wrote some placeholder code to start and stop recording to test the button-clicking mechanism (this code will not be used in the final product, but can be shown for demo purposes), as well as some code for the start/stop calibration. For more information on this code, see Jessie’s status report. For more information on the web application button-clicking integration, see Danny’s status report.

 

  • Shaye collected more video data & analyzed the video for more tension algorithms. They also cut the gathered video data into snippets for the ground-truth Google form. See Shaye’s status report for more information. 
  • Jessie worked off of the integrated blaze model with tension-tracking to add the buzzer feature for when tension is detected. See Jessie’s status report for more information. 
  • Danny has moved the code for the web application onto the RPi. He has continued working on trimming the unnecessary content and creating the functionality desired for the project. See Danny’s status report for more information.

Verification of Subsystems:

Stand Set-up/Take-down:

To test whether the user can set up and take down the camera stand within the targeted time, we plan to simply write some instructions to direct the user on how to set up and take down the camera stand. We then plan to time how long it takes the user to follow these instructions and successfully set up/take down the stand. The stand is considered successfully set up when the entirety of the piano is within the camera’s view and is parallel to the frame of the camera. The stand is considered successfully taken down when all the components are placed in the tote bag. We plan to test both new users (1st time setting up and taking down) and experienced users. This way, we can get a feel for how easy the instructions are to follow as well as get a sense of how long it would take a user to set up our system if it were a part of their daily practice routine. 

If the set-up/take-down time is too long, we plan to modify the instructions so that they are easier to follow (e.g., adding more pictures) and to fix more of the components of the system together so there is less for the user to assemble. The specific modifications we make will depend on our observations (e.g., if users often got stuck at a specific step) and on user feedback (e.g., if they thought step 2 in the instructions was poorly worded).

Tension Detection Algorithm:

We have run some informal tests to gauge the effectiveness of our tension detection algorithm. We roughly tested the algorithm by collecting data from Professor Dueck’s students: we asked them to play a list of exercises with and without tension, and we then checked the output of our system to see if it aligned with how the pianist intended to play. For more information on the results of these informal tests, see Shaye’s status reports.

To more formally test the correctness of our tension detection algorithm, we have collected more data from Professor Dueck’s students; this time we asked them to prepare a 1-minute snippet of a piece they were comfortable with and a 1-minute snippet of a piece they were still learning. Shaye has divided these recordings into 10-second snippets, which we plan to ask Professor Dueck and her students to label as tense or not; this will establish a ground truth to compare our system’s output against. The data we collected is valuable because we can run our algorithm on a variety of pianists and a variety of pieces. Additionally, we can use it to compare the output of the tension detection algorithm using the MediaPipe model and the Blaze model, to see if there was any reduction in accuracy after converting models, and then adjust the tension detection algorithm accordingly. See Shaye’s status report for more information on how the tension detection algorithm can be tweaked.
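When we run this comparison, the bookkeeping will look roughly like the sketch below: tally agreements, false positives, and false negatives between the algorithm’s per-clip output and the labels from the form. The boolean label encoding is an assumption for illustration; the real data may be encoded differently.

# Hypothetical sketch: compare per-clip algorithm output against ground-truth labels.

def score_clips(predicted, ground_truth):
    """predicted / ground_truth: dicts mapping clip name -> True (tense) / False."""
    tally = {"match": 0, "false_positive": 0, "false_negative": 0}
    for clip, truth in ground_truth.items():
        pred = predicted[clip]
        if pred == truth:
            tally["match"] += 1
        elif pred and not truth:
            tally["false_positive"] += 1
        else:
            tally["false_negative"] += 1
    tally["accuracy"] = tally["match"] / len(ground_truth)
    return tally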

System on RPi:

We want to ensure that our system can process data fast enough and provide feedback within a reasonable time. For information on how we plan to test our system’s live feedback latency/frame rate and the system’s post-processing time, see Jessie’s status report. 

Web Application:

We want to ensure that our web application is easy and intuitive for users to use. For information on how we plan to test the intuitiveness of our web application, see Danny’s status report. 

Validation:

Finally, we want to ensure that our system is meeting our users’ needs. The main way we will do this is by polling them. Currently, we are planning on polling Professor Dueck and her students on different aspects of our system. These aspects include, but are not limited to: whether our system provides accurate enough feedback to be useful, how easy it was to set up and use, whether there were any difficulties when using it and how we could improve, and whether the way they received the feedback was helpful for their purposes. Through this polling, we will validate whether our tension detection algorithm on the RPi provides accurate and helpful feedback for our users. Additionally, since our users interface with our system through the web application, our polling will help ensure that the web application is intuitive and easy to use, and will surface any suggestions for a better user experience.