Danny’s Status Report for 4/12

This week I focused on integrating our comparison algorithm with the Unity interface, collaborating closely with Rex and Akul. We established a robust UDP communication protocol between the Python-based analysis module and Unity. We encountered initial synchronization issues where the avatars would occasionally freeze or jump, which we traced to packet loss during high CPU utilization. We implemented a heartbeat mechanism and frame sequence numbering that improved stability significantly.
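The sequencing idea can be sketched roughly as below. The port, packet layout, and class name are illustrative placeholders, not our actual protocol: each UDP packet carries a monotonically increasing frame index so the Unity side can discard late or duplicate packets, and a lightweight heartbeat packet is sent when no frame data is ready.

```python
import json
import socket
import time

UNITY_ADDR = ("127.0.0.1", 5065)  # placeholder address/port

class PoseSender:
    """Hypothetical sketch of the Python-side UDP sender."""

    def __init__(self, addr=UNITY_ADDR):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.addr = addr
        self.seq = 0  # frame sequence number, lets Unity drop stale packets

    def send_frame(self, landmarks):
        packet = {"seq": self.seq, "ts": time.time(), "landmarks": landmarks}
        self.sock.sendto(json.dumps(packet).encode("utf-8"), self.addr)
        self.seq += 1

    def send_heartbeat(self):
        # No landmark payload: lets Unity tell "no data yet" apart from loss.
        hb = {"seq": self.seq, "hb": True}
        self.sock.sendto(json.dumps(hb).encode("utf-8"), self.addr)
```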

We then collaborated on mapping the comparison results to the Unity visualization. We developed a color gradient system that highlights body segments based on deviation severity. During our testing, we identified that hip and shoulder rotations were producing too many false positives in the error detection. We then tuned the algorithm’s weighting factors to prioritize key movement characteristics based on dance style, which improved the relevance of the feedback.
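The severity-to-color mapping can be sketched as a simple green-to-yellow-to-red gradient. The thresholds below are illustrative placeholders, not our tuned values:

```python
def deviation_to_color(deviation, warn=0.15, max_dev=0.40):
    """Map a per-segment deviation score to an RGB triple of 0-1 floats.

    Solid green below `warn`, blending through yellow to solid red at
    `max_dev`. Threshold values here are placeholders for illustration.
    """
    if deviation <= warn:
        return (0.0, 1.0, 0.0)
    # Normalize the deviation into [0, 1] over the warn-to-max band.
    t = min((deviation - warn) / (max_dev - warn), 1.0)
    # Green -> yellow -> red: red rises over the first half, green falls
    # over the second half, so the midpoint is pure yellow.
    red = min(2.0 * t, 1.0)
    green = min(2.0 * (1.0 - t), 1.0)
    return (red, green, 0.0)
```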

As for the verification and validation portion, I am in charge of the CV subsystem of our project. For this subsystem specifically, my plans are as follows:

Pose Detection Accuracy Testing

  • Completed Tests: We’ve conducted initial verification testing of our MediaPipe implementation by comparing detected landmarks against ground truth positions marked by professional dancers in controlled environments.
  • Planned Tests: We’ll perform additional testing across varied lighting conditions and distances (1.5-3.5m) to verify consistent performance across typical home environments.
  • Analysis Method: Statistical comparison of detected vs. ground truth landmark positions, with calculation of average deviation in centimeters.
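The planned analysis reduces to a per-landmark distance comparison; a minimal sketch, assuming both landmark sets are paired lists of (x, y, z) positions in metres (the helper name is hypothetical):

```python
import math

def mean_landmark_deviation_cm(detected, ground_truth):
    """Average Euclidean distance between paired landmarks, in centimetres.

    Both inputs are equal-length lists of (x, y, z) tuples in metres.
    """
    assert len(detected) == len(ground_truth)
    total = sum(math.dist(d, g) for d, g in zip(detected, ground_truth))
    return 100.0 * total / len(detected)  # metres -> centimetres
```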

Real-Time Processing Performance

  • Completed Tests: We’ve measured frame processing rates on typical hardware configurations (a mid-range laptop).
  • Planned Tests: Extended duration testing (20+ minute sessions) to verify performance stability and resource utilization over time.
  • Analysis Method: Performance profiling of CPU/RAM usage during extended sessions to verify long-term system stability.
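A minimal stdlib sketch of this kind of session profiling is below. Note that `tracemalloc` only tracks Python-heap allocations, so in practice whole-process RSS would come from an external profiler; the function and parameter names are illustrative:

```python
import time
import tracemalloc

def profile_session(step_fn, n_frames, log_every=600):
    """Run `step_fn` once per frame, periodically sampling cumulative CPU
    time and peak Python-heap usage. Sketch only: real RSS monitoring
    needs an external tool, since tracemalloc sees Python objects only.
    """
    tracemalloc.start()
    samples = []
    t0 = time.process_time()
    for i in range(1, n_frames + 1):
        step_fn()
        if i % log_every == 0:
            _, peak = tracemalloc.get_traced_memory()
            samples.append({"frame": i,
                            "cpu_s": time.process_time() - t0,
                            "heap_peak_mb": peak / 1e6})
    tracemalloc.stop()
    return samples
```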

Team Status Report for 4/12

Risk Management:

Risk: Comparison algorithm slowing down Unity feedback

Mitigation Strategy/Contingency plan: We plan to reduce the amount of computation required by running the DTW algorithm less frequently over a larger frame buffer. If this does not work, we will fall back to a simpler algorithm selected from the ones we are currently testing.

Design Changes:

There were no design changes this week. We have continued to execute our schedule.

Verification and Validation:

Verification Testing

Pose Detection Accuracy Testing

  • Completed Tests: We’ve conducted initial verification testing of our MediaPipe implementation by comparing detected landmarks against ground truth positions marked by professional dancers in controlled environments.
  • Planned Tests: We’ll perform additional testing across varied lighting conditions and distances (1.5–3.5 m) to verify consistent performance across typical home environments.
  • Analysis Method: Statistical comparison of detected vs. ground truth landmark positions, with calculation of average deviation in centimeters.

Real-Time Processing Performance

  • Completed Tests: We’ve measured frame processing rates on typical hardware configurations (a mid-range laptop).
  • Planned Tests: Extended duration testing (20+ minute sessions) to verify performance stability and resource utilization over time.
  • Analysis Method: Performance profiling of CPU/RAM usage during extended sessions to verify long-term system stability.

DTW Algorithm Accuracy

  • Completed Tests: Initial testing of our DTW implementation with annotated reference sequences.
  • Planned Tests: Expanded testing with deliberately introduced temporal variations to verify robustness to timing differences.
  • Analysis Method: Comparison of algorithm-identified errors against reference videos, with focus on false positive/negative rates.
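The false positive/negative analysis reduces to set comparisons over flagged frames. A hypothetical sketch, assuming both the algorithm output and the reference annotation are expressed as sets of frame indices:

```python
def error_detection_rates(predicted, annotated):
    """Compare algorithm-flagged error frames against annotated ones.

    `predicted` and `annotated` are sets of frame indices flagged as
    containing a movement error; returns false-positive and
    false-negative rates relative to the annotation.
    """
    fp = len(predicted - annotated)   # flagged but not annotated
    fn = len(annotated - predicted)   # annotated but missed
    return {"false_positive": fp / max(len(predicted), 1),
            "false_negative": fn / max(len(annotated), 1)}
```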

Unity Visualization Latency

  • Completed Tests: End-to-end latency measurements from webcam capture to avatar movement display.
  • Planned Tests: Additional testing to verify UDP packet delivery rates.
  • Analysis Method: High-speed video capture of user movements compared with screen recordings of avatar responses, analyzed frame-by-frame.

Validation Testing

Setup and Usability Testing

  • Planned Tests: Expanded testing with 30 additional participants representing our target demographic.
  • Analysis Method: Observation and timing of first-time setup process, followed by survey assessment of perceived difficulty.

Feedback Comprehension Validation

  • Planned Tests: Structured interviews with users after receiving system feedback, assessing their understanding of recommended improvements.
  • Analysis Method: Scoring of users’ ability to correctly identify and implement suggested corrections, with target of 90% comprehension rate.

Team Status Report for 3/29

Risk Management:

Risk: Comparison algorithm not being able to handle depth data

Mitigation Strategy/Contingency plan: We plan to normalize the test and reference videos so that they both use absolute coordinates, allowing us to use Euclidean distance for our comparison algorithms. If this does not work, we can fall back to neglecting the relative and unreliable depth data from the CV and rely purely on the xy coordinates, which should still provide good-quality feedback for casual dancers.
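The xy-only fallback metric might look like the following sketch (function name and frame layout are illustrative; frames are lists of (x, y, z) landmark tuples with the depth channel simply ignored):

```python
import math

def frame_distance_xy(test_frame, ref_frame):
    """Fallback metric: mean Euclidean distance over paired landmarks
    using only the x/y coordinates, ignoring the less reliable depth.
    """
    dists = [math.hypot(tx - rx, ty - ry)
             for (tx, ty, _), (rx, ry, _) in zip(test_frame, ref_frame)]
    return sum(dists) / len(dists)
```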

Risk: Comparison algorithm not matching up frame by frame – continued risk

Mitigation Strategy/Contingency plan: We will attempt to implement a change to our algorithm that takes into account a constant delay between the user and the reference video. If correctly implemented, this will allow the user to not start precisely at the same time as the reference video and still receive accurate feedback. If we are unable to implement this solution, we will incorporate warnings and counters to make sure users know when to start dancing so that their footage is aligned with the reference video.

Design Changes:

There were no design changes this week. We have continued to execute our schedule.

Danny’s Status Report for 3/29

This week I focused on addressing the issues brought up at our most recent update meeting, namely the sophistication of our comparison algorithm. We ultimately decided to explore multiple ways to do time-series comparisons in real time, with me exploring a FastDTW implementation in particular.

Implementing the actual algorithm and adapting it to real time proved difficult at first, since DTW was originally designed for analyzing complete sequences. However, after some research and experimentation, I realized we could adopt a sliding-window approach: store a certain number of real-time frames in a buffer and map that buffer as a sequence onto the reference video.
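The sliding-window idea can be sketched as below. This is a simplification: the window size and per-frame distance function are placeholders, and a plain O(n·m) DTW stands in for the FastDTW variant under exploration.

```python
from collections import deque

def dtw_distance(seq_a, seq_b, dist):
    """Plain DTW between two frame sequences (stand-in for FastDTW)."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

class SlidingWindowDTW:
    """Keep the last `window` live frames in a buffer and align that
    buffer against the matching slice of the reference sequence."""

    def __init__(self, reference, dist, window=30):
        self.reference = reference
        self.dist = dist
        self.buffer = deque(maxlen=window)  # drops oldest frame when full

    def push(self, frame, ref_index):
        """Add a live frame and score the buffer against the reference
        slice ending at `ref_index`; lower is a closer match."""
        self.buffer.append(frame)
        lo = max(0, ref_index - len(self.buffer) + 1)
        ref_slice = self.reference[lo:ref_index + 1]
        return dtw_distance(list(self.buffer), ref_slice, self.dist)
```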

Then, since our feedback system in Unity has not been fully implemented yet, I chose to overlay some feedback metrics on the computer vision frames, which lets us easily digest the results from the algorithm and optimize it further.

Example of feedback overlaid on a CV frame:

Danny’s Status Report for 3/22

This week I was deeply involved in collaborative efforts with Rex and Akul to enhance and streamline our real-time rendering and feedback system. Our primary goal was to integrate various components smoothly, but we encountered several significant challenges along the way.

As we attempted to incorporate Akul’s comparison algorithm with the Procrustes analysis into Rex’s real-time pipeline, we discovered multiple compatibility issues. The most pressing problem involved inconsistent JSON formatting across our different modules, which prevented seamless data exchange and processing. These inconsistencies were causing failures at critical integration points and slowing down our development progress.

To address these issues, I developed a comprehensive Python reader class that standardizes how we access and interpret 3D landmark data. This new utility provides a consistent interface for extracting, parsing, and manipulating the spatial data that flows through our various subsystems. The reader class abstracts away the underlying format complexities, offering simple, intuitive methods that all team members can use regardless of which module they’re working on.
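A hypothetical sketch of the reader class is below. The real class handles several module-specific JSON layouts; this version assumes a single layout where frames sit under a "frames" key as lists of {"x", "y", "z"} dicts:

```python
import json

class LandmarkReader:
    """Sketch of a shared interface for 3D landmark data, so every
    module reads landmarks the same way regardless of producer."""

    def __init__(self, path):
        with open(path) as f:
            self._data = json.load(f)

    def num_frames(self):
        return len(self._data["frames"])

    def landmark(self, frame_idx, landmark_idx):
        """Return one landmark as an (x, y, z) tuple."""
        lm = self._data["frames"][frame_idx][landmark_idx]
        return (lm["x"], lm["y"], lm["z"])

    def frame(self, frame_idx):
        """Return all landmarks of a frame as (x, y, z) tuples."""
        return [(lm["x"], lm["y"], lm["z"])
                for lm in self._data["frames"][frame_idx]]
```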

This standardization effort has significantly improved our cross-module compatibility, making it much easier for our individual components to communicate effectively. The shared data access pattern has eliminated many of the integration errors we were experiencing and reduced the time spent debugging format-related issues.

Additionally, I worked closely with Akul to troubleshoot various problems he encountered while trying to adapt his comparison algorithm for real-time operation. This involved identifying bottlenecks in the video processing pipeline, diagnosing frame synchronization issues, and helping optimize certain computational steps to maintain acceptable performance under real-time constraints.

By the end of the week, we made substantial progress toward a more unified system architecture with better interoperability between our specialized components. The standardized data access approach has set us up for more efficient collaboration and faster integration of future features.

Team Status Report for 3/15

Risk Management:

Risk: Comparison algorithm not matching up frame by frame

Mitigation Strategy/Contingency plan: We will attempt to implement a change to our algorithm that takes into account a constant delay between the user and the reference video. If correctly implemented, this will allow the user to not start precisely at the same time as the reference video and still receive accurate feedback. If we are unable to implement this solution, we will incorporate warnings and counters to make sure the users know when to correctly start dancing so that their footage is matched up with the reference video

Risk: Color based feedback not meeting user expectations – continued risk

Mitigation Strategy/Contingency plan: We plan to break down our Unity mesh into multiple parts to improve the visual appeal of the feedback coloring, so that users can more immediately understand what they need to do to correct a mistake. We also plan to incorporate a comprehensive user guide to help with the same purpose.

Design Changes:

There were no design changes this week. We have continued to execute our schedule.

Danny’s Status Report for 3/15

This week, I concentrated on enhancing our integration capabilities and optimizing our data analysis framework. I collaborated with Rex to troubleshoot and resolve challenges related to data synchronization between our computer vision system and Unity 3D visualization environment. This involved identifying bottlenecks in the data pipeline and implementing solutions to ensure smooth data flow between different components of our stack.

Additionally, I made significant progress on integrating advanced mathematical transformation techniques into our real-time processing framework. This work required careful consideration of performance implications and algorithm design to balance accuracy with computational efficiency. The optimized implementation calculates reference parameters once at initialization and applies these transformations to subsequent data points, rather than recalculating for each new data frame.

This architectural improvement from last week has yielded substantial performance gains and enhanced the robustness of our system when handling variations in input data. This is especially useful when we are transitioning into full real time processing. The visualization tools we’ve implemented have provided valuable insights that continue to guide our development efforts.

Danny’s Status Report for 3/8

This week I focused on integrating my existing work with the 3D Unity framework that Rex has been working on, as well as continuing to improve the Procrustes analysis method. The Unity visualization allowed me to get a better sense of how the Procrustes analysis is working and how I could improve it.

Initially, I was running the normalization algorithm on every single frame and normalizing them to the reference frame. However, this presented a serious problem once we tested the algorithm in Unity. If the test and reference video are not the exact same length in terms of the number of frames, we would be normalizing frames that are not aligned at all. This means that we would have very little temporal distortion tolerance, which negates our premise of doing DTW analysis. It also greatly impacted processing time since a new rotational matrix needed to be computed every single frame.

To improve on this, I changed the algorithm to calculate the Procrustes parameters only once, based on frame 0, and apply them to each subsequent frame. This worked well and greatly improved our processing speed.
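The frame-0 approach can be illustrated in 2D as below. This is a simplification of the actual pipeline (which works on 3D landmarks): it uses the closed-form 2D Procrustes rotation, and all names are illustrative.

```python
import math

def procrustes_params(ref, test):
    """Compute translation, scale, and rotation aligning 2D `test`
    points to `ref` once (e.g. from frame 0). Points are (x, y) tuples.
    """
    n = len(ref)
    rcx = sum(p[0] for p in ref) / n; rcy = sum(p[1] for p in ref) / n
    tcx = sum(p[0] for p in test) / n; tcy = sum(p[1] for p in test) / n
    r = [(x - rcx, y - rcy) for x, y in ref]
    t = [(x - tcx, y - tcy) for x, y in test]
    # Closed-form optimal rotation angle for the 2D Procrustes problem.
    num = sum(tx * ry - ty * rx for (tx, ty), (rx, ry) in zip(t, r))
    den = sum(tx * rx + ty * ry for (tx, ty), (rx, ry) in zip(t, r))
    theta = math.atan2(num, den)
    scale = math.sqrt(sum(x * x + y * y for x, y in r) /
                      sum(x * x + y * y for x, y in t))
    return (tcx, tcy), (rcx, rcy), scale, theta

def apply_params(frame, params):
    """Apply the frame-0 parameters to any later frame, with no
    per-frame recomputation of the rotation."""
    (tcx, tcy), (rcx, rcy), s, th = params
    c, si = math.cos(th), math.sin(th)
    out = []
    for x, y in frame:
        x, y = x - tcx, y - tcy                            # center on test
        x, y = s * (c * x - si * y), s * (si * x + c * y)  # rotate + scale
        out.append((x + rcx, y + rcy))                     # move to reference
    return out
```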

 

Figures: Reference Footage; Test Footage (Slightly Rotated); Raw Test Data (Rotated); Normalized Test Data; Reference

 

Team Status Report for 3/8

Risk Management:

Risk: Cosine Similarity Algorithm not yielding satisfactory results on complex dances

Mitigation Strategy/Contingency plan: We will continue working on the algorithm to see if there are improvements to be made given how the CV algorithm processes landmark data. If the Cosine Similarity Method will not work properly, we will fall back to a simpler method using Euclidean distance and use that to generate immediate feedback.
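The core of the cosine similarity method can be sketched as an orientation comparison between body-segment direction vectors (e.g. shoulder-to-elbow in the reference vs. the user); names are illustrative:

```python
import math

def segment_cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two body-segment direction vectors,
    so the score depends on orientation rather than limb length:
    1.0 = same direction, 0.0 = perpendicular, -1.0 = opposite."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    return dot / (norm_a * norm_b)
```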

Risk: Color based feedback not meeting user expectations

Mitigation Strategy/Contingency plan: We plan to break down our Unity mesh into multiple parts to improve the visual appeal of the feedback coloring, so that users can more immediately understand what they need to do to correct a mistake. We also plan to incorporate a comprehensive user guide to help with the same purpose.

Design Changes:

There were no design changes this week. We have continued to execute our schedule.

Part A:

 

DanCe-V addresses the global need for accessible and affordable dance education, particularly for individuals who lack access to professional dance instructors due to financial, geographic, or logistical constraints. Traditional dance lessons can be expensive and may not be available in rural regions. DanCe-V makes dance training more accessible, since anyone with an internet connection and a basic laptop can use our application. Additionally, the system supports self-paced learning, catering to individuals with varying schedules and learning speeds. This is particularly useful in today’s fast-paced world, where flexibility in skill learning is becoming increasingly important.

 

Furthermore, as the global fitness and wellness industry grows, DanCe-V aligns with the trend of digital fitness solutions that promote physical activity from home. The system also has potential applications in rehabilitation and movement therapy, offering value beyond just dance instruction. By supporting a variety of dance styles, DanCe-V can reach users across different cultures and backgrounds, reinforcing dance as a universal form of expression and exercise.

 

Part B:

One cultural factor to consider is that dance is deeply intertwined with cultural identity and tradition. DanCe-V recognizes the diversity of dance forms worldwide and aims to support various styles, with possibilities of learning classical Indian dance forms, Western ballroom, modern TikTok dances, traditional folk dances, and more. By allowing users to upload their own reference videos rather than including only a constrained set of sample videos, the system ensures that people from different cultural backgrounds can engage with dance forms that are personally meaningful to them. Additionally, DanCe-V respects cultural attitudes toward dance and physical movement. Some cultures may have gender-specific dance norms or modesty considerations, and the system’s at-home training approach allows users to practice comfortably in a private setting.

Part C:

DanCe-V is an eco-friendly alternative to traditional dance education, reducing the need for transportation to dance studios and minimizing associated carbon emissions. By enabling users to practice from home, it decreases reliance on physical infrastructure such as studios, mirrors, and printed materials, contributing to a more sustainable learning model. Additionally, the system operates using a standard laptop webcam, eliminating the need for expensive motion capture hardware, which could involve materials with high environmental costs.

Furthermore, dance is a style of exercise that does not require extra materials, such as weights, treadmills, or sports equipment. By making dance accessible to a larger audience, DanCe-V can help reduce the production of these materials, which often have large, negative impacts on the environment.

Procrustes Analysis Normalization demo:

Figures: Before Normalization; After Normalization; Reference; Test Footage; Reference Footage

Cosine Similarity comparison results:

Danny’s Status Report for 2/22

This past week I was responsible for presenting our project during the Design Review. As a result, I spent most of the first half of the week refining the presentation and practicing my delivery. After that, since we are ahead of schedule on the CV system implementation, I focused on researching the specific algorithms and optimization methods we can use to construct our 3D comparison engine. Since we want to provide feedback in a timely manner, whether after the entire dance or in real time, computation speed is a major concern: our chosen algorithm (DTW) is extremely computationally intensive. Therefore, I spent time looking specifically into optimization methods, including papers on PrunedDTW, FastDTW, and SparseDTW.

 

Illustration of DTW:

Implementation of standard DTW:
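A minimal standard DTW, filling the full cost matrix and then backtracking the warping path, can be sketched as follows. This O(n·m) baseline is exactly what the pruned and sparse variants above try to speed up; the code is illustrative, not our production implementation.

```python
def dtw(seq_a, seq_b, dist=lambda a, b: abs(a - b)):
    """Standard DTW: fill the full (n+1) x (m+1) cost matrix, then
    backtrack from the corner to recover the optimal warping path."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist(seq_a[i - 1], seq_b[j - 1]) + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack: repeatedly step to the cheapest predecessor cell.
    i, j, path = n, m, []
    while (i, j) != (1, 1):
        path.append((i - 1, j - 1))
        moves = {(i - 1, j - 1): cost[i - 1][j - 1],
                 (i - 1, j): cost[i - 1][j],
                 (i, j - 1): cost[i][j - 1]}
        i, j = min(moves, key=moves.get)
    path.append((0, 0))
    return cost[n][m], path[::-1]
```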

Example of SparseDTW:

Al-Naymat, G., Chawla, S., & Taheri, J. (2012). SparseDTW: A Novel Approach to Speed up Dynamic Time Warping.

Olsen, N. L., Markussen, B., & Raket, L. L. (2018). “Simultaneous inference for misaligned multivariate functional data.” Journal of the Royal Statistical Society, Series C, 67(5), 1147–1176. arXiv:1606.03295. doi:10.1111/rssc.12276.