Akul’s Status Report for 4/12

This week I worked on developing multiple comparison algorithms, with the goal of iterating on our existing algorithm, trying new features, and being intentional about the design decisions behind it. For the interim demo, we already had an algorithm that uses dynamic time warping (DTW), analyzes 30 frames every 0.1 seconds, and computes similarity for each joint individually. This week I built five distinct variations, which will let me compare how different parameters and features affect the effectiveness of our dance coach. The five variations are listed below; I will compare each against our original to help find an optimal solution:

  1. Frame-to-frame comparison: does not use dynamic time warping; it simply builds a normalization matrix between the reference video and the user's webcam input and compares joint coordinates on each frame.
  2. DTW with weighted similarity: builds on our interim-demo algorithm so that joints that matter more to a movement are weighted more heavily than others in the similarity calculation.
  3. DTW with a larger analysis window/frame buffer: builds on our interim-demo algorithm by increasing the analysis window and frame buffer, giving DTW more context for a more accurate alignment.
  4. Velocity-based comparison: similar to the frame-to-frame comparison, but computes each joint's velocity as it moves and compares those velocities against the reference video's, capturing not exactly where the joints are but how they move over time.
  5. Velocity-based plus frame-to-frame comparison: combines the velocity comparison with the frame-to-frame joint-position comparison to see whether the two together give a more accurate measure of similarity between the reference and user videos (a sketch of this combined approach follows this list).
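
To make the comparison concrete, here is a minimal sketch of how variations 4 and 5 might be structured. It assumes each frame is a NumPy array of normalized (x, y) joint coordinates and that the user and reference buffers have already been aligned to the same length; the function names and the alpha blend are illustrative rather than our final implementation:

```python
import numpy as np

def frame_similarity(user, ref):
    """Per-frame positional similarity: mean Euclidean distance
    between corresponding joints, mapped to a 0-1 score."""
    dists = np.linalg.norm(user - ref, axis=1)  # one distance per joint
    return float(np.exp(-dists.mean()))         # 1.0 = perfect match

def joint_velocities(frames, dt=0.1):
    """Finite-difference joint velocities across a buffer of frames."""
    return np.diff(frames, axis=0) / dt

def combined_score(user_frames, ref_frames, alpha=0.5, dt=0.1):
    """Variation 5: blend positional and velocity similarity.
    alpha weights position vs. velocity; both buffers are arrays
    of shape (num_frames, num_joints, 2)."""
    pos = np.mean([frame_similarity(u, r)
                   for u, r in zip(user_frames, ref_frames)])
    vel_err = np.linalg.norm(joint_velocities(user_frames, dt)
                             - joint_velocities(ref_frames, dt),
                             axis=2).mean()
    return alpha * pos + (1 - alpha) * float(np.exp(-vel_err))
```

Blending the two terms lets the positional score catch held poses while the velocity score rewards matching the motion itself, which is the motivation behind variation 5.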

I have implemented and debugged the algorithms above; starting tomorrow and continuing through the week, I will run quantitative and qualitative comparisons between them to see which is best for our use case and to find further points of improvement. Additionally, I will coordinate with Rex and Danny to make the comparison algorithm as easy as possible to integrate with the Unity side of the game. Overall, our progress is on schedule; if I can finalize the comparison algorithm within the next week while we begin integration in parallel, we will be on a good track to finish by the final demo and deadline.

There are two main parts of the comparison algorithm that I will need to test and verify. First, I aim to test the real-time processing performance of each algorithm. For example, the DTW variant with the extended analysis window may require too much computation to allow real-time comparison. On the other hand, the velocity and frame-to-frame comparison algorithms may have headroom for added complexity that improves accuracy without hurting processing performance.
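
As a starting point for the performance tests, I plan to use a simple timing harness along these lines (a sketch; the 100 ms budget matches our 0.1-second analysis cadence, and compare_fn stands in for any of the five algorithms):

```python
import time
import statistics

def profile_algorithm(compare_fn, frame_pairs, budget_ms=100.0):
    """Time one comparison call per buffered frame pair and report
    whether the algorithm stays within the real-time budget."""
    timings = []
    for user, ref in frame_pairs:
        start = time.perf_counter()
        compare_fn(user, ref)
        timings.append((time.perf_counter() - start) * 1000.0)
    print(f"{compare_fn.__name__}: mean {statistics.mean(timings):.2f} ms, "
          f"worst {max(timings):.2f} ms, "
          f"{'OK' if max(timings) < budget_ms else 'TOO SLOW'} for real time")
```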

Second, I aim to test the accuracy of each of these comparison algorithms. For each algorithm described above, I will run it on a complex video (such as a TikTok dance), a simpler video (such as myself dancing to a slower song), and a still video (such as myself holding a T-pose in front of the camera). I will record the output while actively performing the dance, then watch the recording back to see how each algorithm does. Afterward, I will build a table of both quantitative and qualitative notes on each algorithm, capturing which parts perform well and which fall short. This will give me all the data I need in front of me when deciding how to continue iterating on the algorithm.
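
A lightweight way to build that table is to append one CSV row per test run; the column names and example values below are illustrative, not recorded results:

```python
import csv

FIELDS = ["algorithm", "test_video", "avg_score", "fps", "notes"]

def log_result(row, path="comparison_results.csv"):
    """Append one test run to the results table; quantitative columns
    (avg_score, fps) sit beside free-form qualitative notes."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if f.tell() == 0:  # new file: write the header first
            writer.writeheader()
        writer.writerow(row)

# Hypothetical entry after running the weighted-DTW variant:
log_result({"algorithm": "dtw_weighted", "test_video": "tiktok_complex",
            "avg_score": 0.78, "fps": 24.1,
            "notes": "lags on fast arm movements"})
```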

With these two strategies, I believe that we will be on a good track to verify the effectiveness of our dance coach and create the best possible comparison algorithm we can to help our users.

Danny’s Status Report for 4/12

This week I focused on integrating our comparison algorithm with the Unity interface, collaborating closely with Rex and Akul. We established a robust UDP communication protocol between the Python-based analysis module and Unity. We encountered initial synchronization issues where the avatars would occasionally freeze or jump, which we traced to packet loss during high CPU utilization. We implemented a heartbeat mechanism and frame sequence numbering that improved stability significantly.
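
For reference, the Python side of that protocol looks roughly like the sketch below; the packet fields, host, and port are illustrative rather than our exact wire format:

```python
import json
import socket
import threading
import time

UNITY_ADDR = ("127.0.0.1", 5005)  # illustrative host/port

class PoseSender:
    """Sends pose/feedback packets over UDP with a monotonically
    increasing sequence number, so Unity can discard stale or
    duplicate packets, plus a periodic heartbeat so a stall is
    detectable instead of causing the avatar to freeze silently."""

    def __init__(self):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.seq = 0
        threading.Thread(target=self._heartbeat, daemon=True).start()

    def send(self, payload):
        self.seq += 1
        packet = {"seq": self.seq, "ts": time.time(), **payload}
        self.sock.sendto(json.dumps(packet).encode(), UNITY_ADDR)

    def _heartbeat(self, interval=0.5):
        # Unity treats a missing heartbeat as packet loss and holds
        # the last good pose instead of letting the avatar jump.
        while True:
            self.send({"type": "heartbeat"})
            time.sleep(interval)
```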

We then collaborated on mapping the comparison results to the Unity visualization. We developed a color gradient system that highlights body segments based on deviation severity. During our testing, we identified that hip and shoulder rotations were producing too many false positives in the error detection. We then tuned the algorithm’s weighting factors to prioritize key movement characteristics based on dance style, which improved the relevance of the feedback.
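
A simplified sketch of that mapping, including the down-weighting of hips and shoulders that reduced the false positives (the segment names and weight values here are illustrative):

```python
def severity_to_rgb(deviation, max_dev=1.0):
    """Map a normalized deviation (0 = perfect, max_dev = worst)
    onto a green -> yellow -> red gradient as an (r, g, b) tuple."""
    t = max(0.0, min(deviation / max_dev, 1.0))
    if t < 0.5:
        return (2 * t, 1.0, 0.0)          # green toward yellow
    return (1.0, 2 * (1.0 - t), 0.0)      # yellow toward red

# Rotation-noisy segments are down-weighted before coloring:
WEIGHTS = {"hips": 0.6, "shoulders": 0.6, "arms": 1.0, "legs": 1.0}

def segment_color(segment, raw_deviation):
    return severity_to_rgb(raw_deviation * WEIGHTS.get(segment, 1.0))
```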

As for the verification and validation portion, I am in charge of the CV subsystem of our project. For this subsystem specifically, my plans are as follows:

Pose Detection Accuracy Testing

  • Completed Tests: We’ve conducted initial verification testing of our MediaPipe implementation by comparing detected landmarks against ground truth positions marked by professional dancers in controlled environments.
  • Planned Tests: We’ll perform additional testing across varied lighting conditions and distances (1.5-3.5 m) to verify consistent performance across typical home environments.
  • Analysis Method: Statistical comparison of detected vs. ground truth landmark positions, with calculation of average deviation in centimeters (a sketch of this computation follows this list).
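
The deviation computation itself can be as simple as the sketch below, assuming landmarks arrive as normalized image coordinates and a per-recording scale factor converts them to centimeters:

```python
import numpy as np

def landmark_deviation_cm(detected, ground_truth, cm_per_unit):
    """Mean and standard deviation of the Euclidean distance between
    detected and ground-truth landmarks, converted to centimeters."""
    detected = np.asarray(detected)        # shape: (num_landmarks, 2)
    ground_truth = np.asarray(ground_truth)
    dists = np.linalg.norm(detected - ground_truth, axis=1) * cm_per_unit
    return dists.mean(), dists.std()
```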

Real-Time Processing Performance

  • Completed Tests: We’ve measured frame processing rates on typical hardware configurations (mid-range laptop).
  • Planned Tests: Extended duration testing (20+ minute sessions) to verify performance stability and resource utilization over time.
  • Analysis Method: Performance profiling of CPU/RAM usage during extended sessions to verify system stability over time (see the profiling sketch after this list).
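
For the extended-session profiling, a psutil-based sampler along these lines should suffice (a sketch; the sampling interval and log format are placeholders):

```python
import time
import psutil

def profile_session(duration_s=20 * 60, interval_s=5.0,
                    log_path="session_profile.csv"):
    """Sample CPU and RAM usage of this process over a long session
    so drift (e.g. a slow memory leak) shows up in the log."""
    proc = psutil.Process()
    start = time.time()
    with open(log_path, "w") as f:
        f.write("elapsed_s,cpu_percent,rss_mb\n")
        while time.time() - start < duration_s:
            cpu = proc.cpu_percent(interval=interval_s)  # blocks for interval
            rss_mb = proc.memory_info().rss / (1024 * 1024)
            f.write(f"{time.time() - start:.1f},{cpu:.1f},{rss_mb:.1f}\n")
```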

Team Status Report for 4/12

Risk Management:

Risk: Comparison algorithm slowing down Unity feedback

Mitigation Strategy/Contingency Plan: We plan to reduce the computation required by running the DTW algorithm less frequently over a larger buffer. If this does not work, we will fall back to a simpler algorithm selected from the ones we are testing now.

Design Changes:

There were no design changes this week. We have continued to execute our schedule.

Verification and Validation:

Verification Testing

Pose Detection Accuracy Testing

  • Completed Tests: We’ve conducted initial verification testing of our MediaPipe implementation by comparing detected landmarks against ground truth positions marked by professional dancers in controlled environments.
  • Planned Tests: We’ll perform additional testing across varied lighting conditions and distances (1.5-3.5 m) to verify consistent performance across typical home environments.
  • Analysis Method: Statistical comparison of detected vs. ground truth landmark positions, with calculation of average deviation in centimeters.

Real-Time Processing Performance

  • Completed Tests: We’ve measured frame processing rates on typical hardware configurations (mid-range laptop).
  • Planned Tests: Extended duration testing (20+ minute sessions) to verify performance stability and resource utilization over time.
  • Analysis Method: Performance profiling of CPU/RAM usage during extended sessions to verify system stability over time.

DTW Algorithm Accuracy

  • Completed Tests: Initial testing of our DTW implementation with annotated reference sequences.
  • Planned Tests: Expanded testing with deliberately introduced temporal variations to verify robustness to timing differences.
  • Analysis Method: Comparison of algorithm-identified errors against reference videos, with focus on false positive/negative rates (a simple tallying sketch follows this list).
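
Treating frames as the unit of comparison, the false positive/negative tally could be computed along these lines (a sketch; it assumes the algorithm output and the annotations are both sets of frame indices flagged as errors):

```python
def error_rates(flagged, annotated):
    """Compare algorithm-flagged error frames against annotated
    ground truth and report false positive / false negative rates."""
    flagged, annotated = set(flagged), set(annotated)
    fp = len(flagged - annotated)   # flagged, but the dancer was correct
    fn = len(annotated - flagged)   # real errors the algorithm missed
    return {
        "true_pos": len(flagged & annotated),
        "fp_rate": fp / len(flagged) if flagged else 0.0,
        "fn_rate": fn / len(annotated) if annotated else 0.0,
    }
```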

Unity Visualization Latency

  • Completed Tests: End-to-end latency measurements from webcam capture to avatar movement display.
  • Planned Tests: Additional testing to verify UDP packet delivery rates.
  • Analysis Method: High-speed video capture of user movements compared with screen recordings of avatar responses, analyzed frame-by-frame.

Validation Testing

Setup and Usability Testing

  • Planned Tests: Expanded testing with 30 additional participants representing our target demographic.
  • Analysis Method: Observation and timing of first-time setup process, followed by survey assessment of perceived difficulty.

Feedback Comprehension Validation

  • Planned Tests: Structured interviews with users after receiving system feedback, assessing their understanding of recommended improvements.
  • Analysis Method: Scoring of users’ ability to correctly identify and implement suggested corrections, with target of 90% comprehension rate.

Rex’s Status Report for 4/12

This week, I began by implementing more key features and refactoring critical components as part of the integration phase of our project. I modified our pose receiver to properly handle CombinedData, which now includes both raw poseData and real-time feedback from the dynamic time warping (DTW) algorithm. This integration required careful coordination with the updated pose_sender.py script, where I also addressed performance issues with laggy webcam input. Specifically, I optimized the DTW pipeline by offloading computations to a separate thread (sketched below), reducing webcam lag and improving responsiveness. Additionally, I implemented a new character-skin feature compatible with Danny’s pose_sender, allowing for a more customized and engaging user experience.
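
The thread offload follows a standard producer/consumer pattern; the sketch below shows the general shape, with a single-slot queue so the worker always sees the newest pose buffer (class and method names are illustrative, not the actual code):

```python
import queue
import threading

class DTWWorker:
    """Runs the DTW comparison on its own thread so the webcam
    capture loop never blocks on it; the capture loop drops the
    latest pose buffer into a queue and reads back the most
    recent feedback whenever it is ready."""

    def __init__(self, compare_fn):
        self.compare_fn = compare_fn
        self.in_q = queue.Queue(maxsize=1)  # keep only the newest buffer
        self.latest_feedback = None
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, pose_buffer):
        # Drop any stale buffer, then enqueue the fresh one.
        try:
            self.in_q.get_nowait()
        except queue.Empty:
            pass
        try:
            self.in_q.put_nowait(pose_buffer)
        except queue.Full:
            pass

    def _run(self):
        while True:
            self.latest_feedback = self.compare_fn(self.in_q.get())
```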

Progress is mostly on schedule for the integration phase. I plan to spend additional hours refining the feedback visualization and testing latency under different system loads. In the coming week, my goal is to complete the UX feature that highlights, in real time, which body parts are incorrectly matched during a dance session. This will significantly enhance usability and learning by making corrections more intuitive and immediate for the final demo as well.

Now that the core modules are functioning, I’ve begun transitioning into the verification and validation phase. Planned tests include unit testing each communication component (pose sender and receivers), integration testing across the DTW thread optimization, and using several short dances to test the accuracy of the real-time feedback. To verify design effectiveness, I will analyze frame-by-frame comparisons of live poses against reference poses, as well as the DTW algorithm’s window. This will let me check timing accuracy, body-part correlation, and response latency using Python timers in the code (see the sketch below), confirming they meet the timing metrics outlined in our use-case requirements. I also plan to evaluate user interaction with the feedback system through usability testing to gauge how polished the final demo can be.
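
For the Python timers, I expect to use perf_counter-based wrappers along these lines (a sketch; the stage names and millisecond budgets are placeholders for the values in our use-case requirements):

```python
import time

class LatencyTimer:
    """Context manager that times a pipeline stage and flags any
    sample that exceeds its latency budget."""

    def __init__(self, name, budget_ms):
        self.name, self.budget_ms = name, budget_ms
        self.samples = []

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        ms = (time.perf_counter() - self.start) * 1000.0
        self.samples.append(ms)
        if ms > self.budget_ms:
            print(f"[{self.name}] {ms:.1f} ms exceeds {self.budget_ms} ms budget")

# Usage inside the live loop, e.g.:
#   with LatencyTimer("pose_send", budget_ms=50):
#       sender.send(pose_packet)
```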