Team Status Report for 12/07

General updates:

  • At the beginning of the week, the team prepared for the final presentation by finalizing slides and doing some practice runs.
  • Jessie and Shaye worked together to adjust the live feedback code so that tension in both hands could be identified. Previously, our tension detection algorithm only worked for one hand. For more information, see Shaye’s status report.
  • Jessie continued to work on testing the post-processing time and average FPS of various piano-playing videos. For more information, view her status report. 
  • Danny continued working on improving the web application and displaying the videos on the website. For more information, view his status report.
  • At the end of the week, the team also discussed the content we want to present on our poster. 

 

Unit testing:

 

UI/UX (Clarity of instructions, stand setup/take down)

  • We tasked one of our pianists to run through the instructions on the web application. This involved setting up the stand, placing the stand behind the piano bench and adjusting the camera, navigating to the calibration and recording page for a mock recording session, and then taking down the entire system.
  • We timed how long it took the pianist to complete each step, and most steps were within our time limits. One step we underestimated was how long it takes to set up the stand, so we will be working on improving that.
  • We received feedback afterwards that our instructions were not clear enough, so the pianist was not always sure what to do at each step. They also commented that providing pictures of each component would help clarify the instructions.
  • Because the pianist was able to complete most of the tasks within the time limits, we are not too worried about making big changes to our system. We have made our instructions clearer and are going to add pictures to them to help our users. Additionally, we are not currently planning to change anything about our stand or how setup/takedown is done, since changing the instructions is easier and the instructions were the main complaint we received.

Hardware (latency, frame rate)

  • We created a small data set of videos with varying types of pieces and of varying lengths. We used this data set to test both the average frame rate and post-processing time.
  • Frame rate
    • To find the average frame rate for each video, we store the captured frame-rate samples in an array and average them at the end of the video. The base code already included logic to find the frame rate: it uses a ticker to track elapsed time and, after every 10 frames, computes the frame rate by dividing 10 frames by the time that has passed. We collect and average these values (a sketch of this measurement follows this list).
    • We found that our average frame rate was about 21-22 fps, which is within the targeted range. Therefore, we made no changes to our design due to the frame rate.
  • Post-processing time
    • To find the post-processing time of each video, we take a timestamp before the frames are written out into a video file and another afterwards (a timing sketch follows this list).
    • We found that the post-processing time was well below our targeted processing rate of ½ (processing time equal to half of the video length); the rate we achieved is around ⅙. However, we plan to test this more extensively with longer videos. It’s possible that our code will change if these longer videos fail, which could slow down our processing rate, but since the rate is so far below our target, we are not concerned.
    • No major design decisions were made based on these findings. However, we did change our code because post-processing did not work for longer videos: instead of holding the video’s frames in an in-memory array, we now store them in a file. This likely slows down the frame rate and post-processing time due to the extra file reads and writes; even with this change, we are still within the target range.
  • Live feedback latency
    • To test the live feedback latency of our system, we fed it clearly tense and clearly non-tense movements by waving and then abruptly stopping. Our tension algorithm detects tension from horizontal angle deviation, so waving registers as extremely non-tense and an abrupt stop followed by a pause registers as extremely tense. Over multiple iterations of waving and stopping, we recorded when the waving stopped and when the audio feedback was produced; the difference between these two timestamps is effectively our system’s latency.
    • We found that the system had a latency of around 500 ms, which is well below our target of 1 second. These results are from testing tension in only one hand, as we are still finishing our tension detection algorithm; we expect the latency to increase with two hands, since the frame rate drops when both hands are in the frame. We don’t think this drop will push us over our target, though, since it is not very large (28 fps to 21 fps). Because we are meeting our target, no changes were made based on these results.
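The frame-rate measurement described above roughly follows the pattern below. This is a minimal sketch assuming an OpenCV capture loop like the base code’s; the function and variable names are illustrative, not our actual code.

```python
import time
import cv2  # assumes an OpenCV capture loop like the base code's

def measure_average_fps(video_source=0, frames_per_sample=10):
    """Sample the frame rate every `frames_per_sample` frames by dividing
    the frame count by the elapsed time, then average the samples."""
    cap = cv2.VideoCapture(video_source)
    fps_samples = []
    frame_count = 0
    tick = time.time()  # "ticker" that tracks elapsed time
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame_count += 1
        if frame_count == frames_per_sample:
            now = time.time()
            fps_samples.append(frames_per_sample / (now - tick))
            tick = now
            frame_count = 0
    cap.release()
    return sum(fps_samples) / len(fps_samples) if fps_samples else 0.0
```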
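Similarly, the post-processing time can be captured by timestamping around the write-out step. This sketch assumes the frames were saved to disk as individual image files (per the change described above); the names and the mp4v codec are assumptions, not our exact implementation.

```python
import time
import cv2

def time_post_processing(frame_paths, out_path="processed.mp4", fps=21.0):
    """Read frames back from disk (rather than an in-memory array),
    write them into a video file, and return how long that took."""
    start = time.time()                          # timestamp before writing
    first = cv2.imread(frame_paths[0])
    height, width = first.shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"),
                             fps, (width, height))
    for path in frame_paths:
        writer.write(cv2.imread(path))
    writer.release()
    return time.time() - start                   # timestamp afterwards
```

Dividing the returned time by the video’s length gives the processing rate we compare against the ½ target (our measured rate was around ⅙).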

Model (accuracy of angles)

  • We compared MediaPipe’s angle outputs against hand measurements, and MediaPipe’s outputs against Blaze’s. 
  • The results were acceptable, so no changes were made. 
  • We produced graphs of the angles and a summary of the percent match between each pair of outputs (a sketch of the percent-match computation is below).
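As a rough illustration of the percent-match summary, the following sketch treats two angle series as matching on a frame when they agree within a tolerance. The 5-degree tolerance and the variable names are assumptions for illustration, not our actual thresholds.

```python
import numpy as np

def percent_match(angles_a, angles_b, tolerance_deg=5.0):
    """Percent of frames where two angle series agree within a tolerance.
    The 5-degree tolerance is an illustrative assumption."""
    a = np.asarray(angles_a, dtype=float)
    b = np.asarray(angles_b, dtype=float)
    return 100.0 * np.mean(np.abs(a - b) <= tolerance_deg)

# e.g. percent_match(mediapipe_wrist_angles, blaze_wrist_angles)
```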

 

Tension detection algorithm (comparison to ground truth)

  • We ran the detection pipeline on the clips and compared its output to the ground-truth labels. 
  • We compiled two datasets from the tension-labeling form we sent out: one where at least 4/6 respondents agreed on the label, and one where at least 5/6 respondents agreed.
    • We sent out 21 video clips to be labelled
    • The 5/6 dataset contained 7 videos 
    • The 4/6 dataset contained 15 videos
  • Overall, the ground-truth labels are mixed and the tension detection results are inconclusive. Given the mixed ground truth and the feedback suggesting a better camera angle, pursuing a different camera angle is an approach we could have taken if we had more time for this project.
  • Preliminary results are as summarized: 

 

Integrated tests:

 

Tension on RPI: 

  • To test whether the tension detection algorithm maintains high accuracy when run on the RPi, we plan to use the same procedure as the tension detection testing described earlier, but run on the RPi with the full system. We plan to run this soon and currently have no results for this test.

 

Overall workflow UI/UX testing: 

  • Because we have not quite finished integrating our system, we plan to run this test in the near future and have yet to collect results. To test the full system, we plan to first have Shaye, our local pianist, run through the workflow to catch any major integration issues. Once that works, we will ask a pianist who is unfamiliar with our system to go through the workflow to see if the UI is clear.

 

Team Status Report for 11/30

General updates:

  • The team worked together at the beginning of last week to finish preparing for the interim demo. 
  • The team also worked together to write an initial draft of the setup instructions on the webpage. We then met with one of Professor Dueck’s students to test the setup and take down time of our stand using our instructions. We also wrote a Google form to gather feedback on the instructions for the setup/takedown process. 
  • This week, the team worked to create slides for the final presentation; each person largely worked on the slides relevant to their contributions. Jessie wrote the slides related to verifying the hardware aspects (frame rate, post-processing time, and latency) and the design tradeoffs between the NVIDIA Jetson, AMD KV260, and an accelerated RPi. Shaye wrote the slides relating to verifying the software (tension algorithm, hand landmark models). Danny wrote the slides relating to verifying and testing the UI/UX (setup time, takedown time, and the web application).
  • Jessie continued to work on integrating features onto the RPi. She implemented the LED recording-feedback feature Joshna suggested during the interim demo, adjusted/debugged the buzzer feature, and enabled video recording in the live feedback code. She also wrote a Google form to gather ground truth on tension based on some clips we’ve recorded. Lastly, she worked on testing our system’s latency and started writing a script to find the average frame rate and post-processing time for various videos. Refer to her status report for more details. 
  • Shaye continued iterating on the tension algorithm & tracking different aspects of playing. Refer to their status report for more details. 
  • Danny continued working on integrating with the rest of the team. Time was also spent cleaning up the web application and implementing the different features that were asked for. Refer to his status report for more details.

Team Status Report for 11/16

General updates:

  • We worked as a team to piece together the RPi into its case with the display wiring and new SD card. Danny transferred the data on the 16GB SD card to the larger 128GB SD card. Shaye attached the accelerator to the RPi, ensuring that the GPIO pins remained exposed. Jessie had to redo the display wiring many times in this process. 
  • In general, the team’s goal for this week was to have an integrated system for the demo on Monday/Wednesday. Individual parts of the system aren’t fully done, but the basic workflow and basic integration have been completed. We are confident we can complete integration the following week. 
  • Shaye and Jessie worked to interface the blaze model on the RPi with the tension-tracking algorithm Shaye had previously written. See Shaye’s status reports for more info on the integration. 
  • Danny and Jessie worked together to interface button clicks from the web application with starting/stopping video recording as well as starting/stopping calibration. Jessie wrote some code to start and stop recording so the button-clicking mechanism could be tested and displayed for demo purposes (this code will not be used in the final product). She also wrote some code for the start/stop calibration. For more information on this code, see Jessie’s status report. For more information on the web application button-click integration, see Danny’s status report. 

 

  • Shaye collected more video data & analyzed the video for more tension algorithms. They also cut the gathered video data into snippets for the ground-truth Google form. See Shaye’s status report for more information. 
  • Jessie worked off of the integrated blaze model with tension-tracking to add the buzzer feature for when tension is detected. See Jessie’s status report for more information. 
  • Danny has moved the code for the web application onto the RPi. He has continued working on trimming the unnecessary content and creating the functionality desired for the project. See Danny’s status report for more information.

Verification of Subsystems:

Stand Set-up/Take-down:

To test whether the user can set up and take down the camera stand within the targeted time, we plan to simply write some instructions to direct the user on how to set up and take down the camera stand. We then plan to time how long it takes the user to follow these instructions and successfully set up/take down the stand. The stand is considered successfully set up when the entirety of the piano is within the camera’s view and is parallel to the frame of the camera. The stand is considered successfully taken down when all the components are placed in the tote bag. We plan to test both new users (1st time setting up and taking down) and experienced users. This way, we can get a feel for how easy the instructions are to follow as well as get a sense of how long it would take a user to set up our system if it were a part of their daily practice routine. 

If the set-up/take-down time is too long, we plan to modify the instructions so that they are easier to follow (e.g., adding more pictures) and to fix more of the components of the system together so there is less for the user to assemble. The specific modifications we make will depend on our observations (e.g., if users often got stuck at a specific step) and their feedback (e.g., if they thought step 2 in the instructions was poorly worded). 

Tension Detection Algorithm:

We have run some informal tests to determine the effectiveness of our tension detection algorithm. We roughly tested it by collecting data from Professor Dueck’s students: we asked them to play a list of exercises with and without tension, then checked the output of our system to see if it aligned with how the pianist intended to play. For more information on the results of these informal tests, see Shaye’s status reports.

To more formally test the correctness of our tension detection algorithm, we have collected more data from Professor Dueck’s students; this time we asked them to prepare a 1-minute snippet of a piece they were comfortable with and a 1-minute snippet of a piece they were still learning. Shaye has divided these recordings into 10-second snippets, which we plan to ask Professor Dueck and her students to label as tense or not; this will establish a ground truth against which to compare our system’s output. The data we collected from Professor Dueck’s students is valuable because we can run our algorithm on a variety of pianists and a variety of pieces. Additionally, we can use the gathered data to compare the output of the tension detection algorithm using the MediaPipe model against the Blaze model to see if there was any reduction in accuracy after converting models, and then adjust the tension detection algorithm accordingly. See Shaye’s status report for more information on how the tension detection algorithm can be tweaked. 
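Once those labels come back, the comparison itself is straightforward. The following is a minimal sketch of how per-clip outputs could be scored against the form labels; the dictionaries and names are hypothetical, not our actual data structures.

```python
def agreement_with_ground_truth(predictions, ground_truth):
    """predictions / ground_truth: dicts mapping clip name -> True (tense)
    or False (not tense). Returns the percent of clips where they agree."""
    clips = predictions.keys() & ground_truth.keys()
    if not clips:
        return 0.0
    agree = sum(predictions[c] == ground_truth[c] for c in clips)
    return 100.0 * agree / len(clips)

# Hypothetical usage: score both model variants against the same labels.
# agreement_with_ground_truth(mediapipe_outputs, form_labels)
# agreement_with_ground_truth(blaze_outputs, form_labels)
```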

System on RPi:

We want to ensure that our system can process data fast enough and provide feedback within a reasonable time. For information on how we plan to test our system’s live feedback latency/frame rate and the system’s post-processing time, see Jessie’s status report. 

Web Application:

We want to ensure that our web application is easy and intuitive for users to use. For information on how we plan to test the intuitiveness of our web application, see Danny’s status report. 

Validation:

Finally, we want to ensure that our system meets our users’ needs. The main way we will verify this is by polling them. Currently, we plan to poll Professor Dueck and her students on different aspects of our system, including (but not limited to): whether our system provides accurate enough feedback to be useful, how easy it was to set up and use, whether there were any difficulties while using it and how we could improve, and whether the way they received the feedback was helpful. Through this polling, we will validate whether our tension detection algorithm on the RPi provides accurate and helpful feedback for our users. Additionally, since our users interface with our system through the web application, the polling will also help ensure that the web application is intuitive and easy to use, and will surface any suggestions for a better user experience. 

Team Status Report for 11/09

General updates:

  • As a team, we set up the RPi case/fans. We will have to take this apart and put it back together in the future to attach the display and buzzer. 
  • This week the team largely worked independently. 
  • Jessie continued working on setting up the accelerated RPi. She started the week by registering the RPi with CMU-DEVICE and looking at how to have a program run automatically at boot. She continued to look into putting the model on the RPi, setting up the active buzzer, and interfacing the display. To find more information, refer to Jessie’s status report. 
  • Shaye worked on using footage from this week’s session with the pianists to help inform tension algorithm rewrites. Refer to Shaye’s status report for more info. 
  • Danny continued to work on the web application. Danny decided to refer to a template and modify that for our uses instead of starting from scratch. Refer to Danny’s status report for more information.

Pictures of the RPi Accelerator and the case:

Team Status Report for 11/02

General updates:

  • This week the team largely worked independently. 
  • Jessie worked on setting up the accelerated RPi. She looked into putting the model on the RPi, interfacing UART between the 2 RPis, using the buzzer, and ordering a display. To find more information, refer to Jessie’s status report. 
  • Shaye worked on different iterations of the tension algorithm and set up unit testing with the videos from last week. They also organized the code for storing information about hand position information and helped debug buzzer RPi issues. Refer to Shaye’s status report for more info. 
  • Danny continued to work on the web application. He worked on figuring out how to store the video file paths within the database and how to display the videos through our web application. He implemented some functionality to allow users to upload videos but it will probably not be part of the final design. Refer to Danny’s status report for more information. 

Team Status Report for 10/26

General updates:

  • The team took a trip to Home Depot to buy parts to connect the webcam gooseneck to the tripod. The gooseneck’s connecting part is a wider (⅝”) unthreaded tube, while the tripod’s is a smaller (¼”) threaded screw, so it was difficult to find a connector for the two. However, with assistance from the Home Depot staff, we landed on using multiple spacers. We screwed in the tripod’s connector with the help of a heat gun to create threads in the spacers. In the future, we might have to glue the spacers to the gooseneck tube to make sure they stay in place.
  • We met with Dueck’s students and tested our newly built stand. Images are attached below. The stand worked great! It was stable (didn’t require a counterweight) and was tall enough to capture all the keys on the keyboard. Additionally, the gooseneck was able to stretch enough horizontally such that the pianist’s head did not block the view of their hands. The webcam’s USB cable was a bit short so it was inconvenient to hold the laptop close to the camera stand. Additionally, the USB cable kept bumping into the pianist’s head. We have ordered velcro for the purpose of wire management and a USB extension cable. 
    • We also gathered video of scales & Hanon specifically to test with. More details can be found in Shaye’s report. 

  • As per Joshna and Prof Bain’s suggestion, we met with Varun to ask for advice on using the KV260. More details can be found in Jessie’s report. 

Design Changes: 

Based on the feedback we received from Varun, we have decided to drop the KV260 and commit to pivoting to the RPi with an accelerator for our hand landmark model. 

Updated Schedule:

We are slightly behind our originally planned schedule of finishing the CV and FPGA integration by the coming Wednesday; additionally, our plans have shifted to using the RPi instead of the FPGA. Here is our updated schedule with the RPi:

Due 11/01 – 

  • Jessie: finish moving the model to the RPi and research how to connect the buzzer and display, and how to connect to the web-app-hosting RPi over UART 
  • Shaye: write many versions of the tension detection algorithm for RPi (see Shaye’s status report)

Due 11/06 – (next meeting with Professor Dueck’s students)

  • Shaye: finalize tension detection algorithm using output of RPi model (see Shaye’s status report for more)

Due 11/08 –

  • Jessie: finish implementing buzzer feature and set up UART connection with web app hosting RPi

(11/08-11/11) – buffer time

Due 11/11 – start full system integration

Team Status Report for 10/20

General updates:

  • Worked on and finished the design report (the vast majority of our time this week was spent here).
  • Worked with musicians on 10/9 to test out rough camera setup & tension detection algorithm—see Shaye’s status report for more info.

Product solution considerations: 

A was written by Jessie, B was written by Danny, and C was written by Shaye.

Part A: 

We hope our product will allow pianists of all skill levels, from hobbyists and beginner students to professionals, to protect themselves from hand and wrist injuries related to piano playing, since all players can benefit from injury prevention. The product, though intended for students with access to a teacher, could also be used by those who are self-taught. However, a self-taught student would also have to learn on their own how to identify correct positioning in order to properly set the initial calibration. Additionally, our system does not rely on a laptop to host it, so users with access only to a phone or a tablet can also protect themselves. The system will also have simple and intuitive features, so it does not require the user to be technologically savvy. 

Part B:

Our product will hopefully encourage people to either learn or continue playing the piano. Whether someone is completely new to the piano or has been injured in the past, our product will help them avoid positions that can cause injury. We hope that the peace of mind our product provides will lead to an increase in piano players. If the number of pianists increases because of our product, then we believe we are contributing to a growing and deeper culture. While we are targeting piano players, we believe that a more musically inclined population is able both to better appreciate the culture we currently have and to make contributions to the culture we all share. 

Part C: 

We account for environmental factors with our system in two main ways: by using less power-hungry hardware, and by decreasing overall medical interventions for wrist strain injuries. In terms of hardware, our FPGA-based system would use less power than a Jetson/GPU-based system, minimizing power consumption during longer practice sessions. If our proof-of-concept becomes more widely adopted, this difference in power consumption would have a large impact on reducing overall energy waste from our system. 

Additionally, by preventing wrist strain injury in pianists, our product will decrease the amount of medical resources spent attending to those injuries, thus reducing the environmental impact of those resources. This includes a variety of healthcare items, ranging from single-use wrist wraps to energy spent on wrist imaging. 

Team Status Report for 10/05

General update: 

  • We gave the design report presentation. This involved considering some use case requirements and making design decisions regarding our camera setup. We also fleshed out the testing for each of the different components.
  • We had a couple of practice sessions over the weekend to strengthen Danny’s presentation skills. 
  • We obtained the parts for our stand. We did a rough setup to see if the stand was able to capture the full keyboard. Our stand was tall enough that it was able to capture the entire keyboard. We plan to build a connector between the gooseneck and the stand before we meet with Professor Dueck’s students again on Wednesday. 
  • We ran through the basic setup tutorial for Vitis AI on the Kria KV260. We were able to set up a model and detect their example image. More details can be found in Danny’s status report. We also ran through the Vitis AI workflow tutorial for quantizing and compiling a model on the FPGA. More details can be found in Jessie’s Status report. 
  • We looked into how to implement kinematics on the FPGA. More details can be found in Jessie’s Status report. 
  • We worked on having the CV pipeline print out tense vs. not tense—see Shaye’s status report
  • We tested kinematics in the CV pipeline—see Shaye’s status report
  • We further looked into the different webapp examples that are found in Django. 

Risks: 

  • Weight distribution of the camera stand being off-center leading to the stand tipping over. 
    • If this happens, we plan to add a counterweight. 
  • FPGA not working—many complications in general 
    • Order an RPi to work with over break as a contingency. 
    • We may order the RPi AI Kit if we end up switching over completely.

Changes to schedule/ design

  • No changes were made; we’re still on track with our schedule.

Stand setup:

 

Screenshot of “rough setup” we did

Team Status Report for 09/28

General accomplishments:

  • We worked on the design report and slides.
  • Jessie looked into the AMD workflow for putting a model on the FPGA. We now have a more fleshed-out idea of the tasks that need to be done. More details can be found in Jessie’s status report. 
  • Danny finished the Django tutorial. 
  • We asked Dueck’s students about their preferences for different web app features. More details can be found in Danny’s status report.
  • We met with Dueck’s students to preliminarily test the system 
    • Need to track angle variation rather than consistent angle position—more details in Shaye’s status report
  • We have better specified the setup measurements and requirements based on additional measurements of Shaye playing piano. (Shaye not pictured 🙁 )

Risks: 

  • We are slightly concerned about converting the currently used MediaPipe model (TFLite) to PyTorch or TensorFlow, the only 2 formats compatible with Vitis AI. We plan to look into 
    • existing scripts that convert between the formats, or 
    • different models that are already in a compatible format 
  • We are also worried about the MediaPipe model losing accuracy once compiled for the FPGA, due to over-optimization or poorly chosen input samples during the quantization phase
    • We plan to try multiple data sets and ways to convert samples to tweak accuracy
  • We need to look into how to implement the kinematics on the FPGA; we are not sure how to interact with the output of the model.
  • Contingency: switch to RPi—will request on 10/8 if needed so we have time to work on it before fall break

Updates on the system:

  • We figured out how to port the model & communicate between the FPGA & RPi
    • Using Vitis AI workflow for model conversion 
    • Using UART for board communication 
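As a rough illustration of the UART plan, here is a minimal sketch using pyserial; the device path, baud rate, and message format are placeholder assumptions, not a settled protocol.

```python
import serial  # pyserial

# Placeholder settings: the device path and baud rate are assumptions.
link = serial.Serial("/dev/ttyAMA0", baudrate=115200, timeout=1)

def send_message(msg: str) -> None:
    """Send one newline-terminated message to the other board."""
    link.write((msg + "\n").encode("utf-8"))

def read_message() -> str:
    """Wait (up to the timeout) for one newline-terminated message."""
    return link.readline().decode("utf-8").strip()

# e.g. the web-app board might send "START_RECORDING" and the other board
# would act on it; the command names here are hypothetical.
```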

Schedule updates:

  • Still on schedule, no updates

Additional questions:

Part A is written by Jessie, Part B is written by Danny, and Part C is written by Shaye

Public health/ safety/ welfare factors:

Our product aims to reduce injury among piano players by identifying and correcting harmful playing positions. Injury among piano players is extremely common (50-70%), with many injuries related to the hand and wrist. Being injured can also have mental health impacts, as it may prevent players from practicing for extended periods. Our system will provide players with real-time and post-processed feedback on their hand position. This will help them correct their position, even without the guidance of a piano teacher. We realize that providing bad feedback could lead to more injury; thus, we are taking special care with the testing and verification metrics we’ve mentioned throughout this process to ensure system accuracy. 

Social Factors:

Our product will change the dynamic between the piano teacher and the student. Instead of having the piano teacher focus on the health of the student and harmful techniques, they can focus on music-related content. This product could also help encourage beginners to get into piano playing without the potential worry of becoming injured, lowering the barrier to entry. This could potentially lead to an increase in piano players. 

Economic factors: 

Our project will be the first product pianists can use to monitor their technique while practicing. Thus, although the total cost of our current system is high (FPGA cost, camera, stand, etc), creating a proof of concept using more general hardware will allow for more projects down the line to decrease the cost and create more accessible & commercialized products. With more specific hardware/ boards dedicated to running our system, the cost will decrease, allowing the product to be available for all piano players. For now, even just parts of this tool (CV pipeline, video saving features) can help save pianists from injury and incurring more personal costs. 

Team Status Report for 09/21

At the start of the week, we prepared slides for the proposal presentation and did a couple of practice runs over the weekend. We also placed inventory orders for a (temporary) camera and a KR260, and discussed more concrete requirements for the camera tripod. Shaye got a basic CV pipeline up. Danny looked into different backends for the web app and started the Django tutorial. Jessie began Varun’s tutorial for the Kria KR260 setup.

One risk that came up this week is camera compatibility with the FPGA. We are waiting for Varun to test compatibility with the 1080p webcam and hope to hear back by the end of this weekend. If compatibility is confirmed, we’ll decide on and order a camera next week. We are also concerned about the camera’s field of view in relation to capturing the whole keyboard. We plan to either use a taller camera tripod, place the camera in a higher position, or get a more expensive camera with a larger field of view; we will weigh this decision by the end of next week. A general worry we still have is porting to the FPGA. We’ll hold off on FPGA porting until the CV pipeline is fully finalized. We may start working on an RPi in parallel with the FPGA if we’re unable to see a path forward by October 12th. Shaye will focus on working with either an RPi 4 or RPi 5 while Jessie continues with the FPGA. If we’re unable to get the FPGA working on a basic level by mid-October, then we will give up on the FPGA.

No major changes happened—we have more concrete ideas on how to position the camera. More detail, along with the diagram, is included in Jessie’s status report.

We’re still on schedule. For next week, we want to finish up the CV pipeline and FPGA setup and hopefully start CV and FPGA integration. We will meet with Professor Dueck’s students on Wednesday, where we will test the CV pipeline’s ability to detect different angles on the keyboard. We will use a loaned camera from the inventory, held by hand temporarily, before we order one.

Link to video of current CV pipeline: link