Team Status Report for 3/22

Risk Management:

Risk: Dynamic Input Integration into Unity Pipeline

Mitigation Strategy/Contingency Plan:
A newly identified risk is the uncertainty regarding how to efficiently store and process dynamic user inputs within our current UI/UX pipeline, particularly in the context of real-time performance comparison. To address this, we will undertake detailed research into Unity’s documentation and forums. Our contingency plan includes setting aside additional team time for prototype development and targeted debugging sessions, ensuring timely resolution without affecting our overall timeline.

Risk: Comparison Algorithm Synchronization Issues

Mitigation Strategy/Contingency Plan:
We continue to face potential challenges in ensuring our comparison algorithm aligns the user’s performance frame-by-frame with the reference video. To mitigate this, we’re refining the algorithm to better accommodate constant timing offsets, allowing flexibility if the user doesn’t start exactly in sync with the reference video. If this proves insufficient, we will implement clear UI warnings and countdown mechanisms to ensure proper synchronization at the start of each session.

Risk: Visual Feedback Clarity and Usability

Mitigation Strategy/Contingency Plan:
Our original plan to split the Unity mesh into multiple segments for improved visual feedback has encountered increasing complexity. As mesh segmentation proves more cumbersome than initially expected, we’re now considering the implementation of custom Unity shaders to dynamically color individual meshes. Alternatively, we may explore overlaying precise visual indicators directly onto the user’s dance pose to clearly highlight necessary corrections, ensuring usability and meeting user expectations.

Design Changes:

No substantial design changes have occurred this week. Our current implementation aligns closely with our established schedule and original design specifications. PLEASE REFER TO INDIVIDUAL STATUS REPORTS FOR SPECIFIC UPDATES/PHOTOS. However, as noted in the risks above, we are preparing for potential minor adjustments, particularly concerning visual feedback/experience and Unity integration processes.

Rex’s Status Report for 3/22

This week, I improved the game’s UI/UX pipeline to facilitate smooth selection of reference .mp4 videos. Although the initial implementation was partially completed last week, several bugs affecting the UI/UX integration were identified and resolved this week. Users can now intuitively pick and load a video directly from the game interface, simplifying the setup process and enhancing the overall user experience. Furthermore, the video analysis module was extended to handle selected reference videos robustly, effectively translating video movements into coordinates used by the avatar. This enhancement enables accurate real-time performance comparison, seamlessly integrating both live capture and pre-recorded video data.
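For reference, a minimal sketch of this kind of extraction step is shown below, using MediaPipe’s Python pose solution; the file names and the JSON layout are placeholders rather than our exact pipeline code.

```python
# Sketch: extract per-frame pose landmarks from a reference .mp4 with MediaPipe
# and store them as JSON that downstream (Unity-side) code could consume.
import json
import cv2
import mediapipe as mp

def extract_pose_json(video_path: str, out_path: str) -> None:
    pose = mp.solutions.pose.Pose(static_image_mode=False)
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV decodes frames as BGR.
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks:
            frames.append([{"x": lm.x, "y": lm.y, "z": lm.z, "visibility": lm.visibility}
                           for lm in result.pose_landmarks.landmark])
        else:
            frames.append(None)  # keep frame indices aligned when detection fails
    cap.release()
    pose.close()
    with open(out_path, "w") as f:
        json.dump(frames, f)

# Example (hypothetical file names):
# extract_pose_json("reference_dance.mp4", "reference_pose.json")
```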

This is Danny in a pre-recorded .mp4 (reference mp4) – NOT live capture.

Additionally, I successfully optimized the avatar’s leg recreation for our Unity-based OpenCV MediaPipe dance comparison game. Previously, the avatar’s leg movements experienced slight jitter and occasional lagging frames, making the visual representation less smooth, as I mentioned in last week’s report. By refining the landmark smoothing algorithm and employing interpolation techniques between key frames, the avatar’s leg animations now follow the user’s movements more closely, significantly enhancing overall realism and responsiveness. As a result, the visual feedback loop maintains an ideal frame rate, consistently hovering around 30 fps, matching our outlined design goals.
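As a rough illustration of the smoothing and interpolation idea, here is a minimal Python sketch; the exponential-smoothing form, the alpha value, and the sample landmark values are assumptions about one reasonable implementation, not the exact code running in our Unity pipeline.

```python
# Sketch: exponential smoothing of a landmark to suppress jitter, plus linear
# interpolation between keyframes to fill lagging frames.
from typing import Tuple

Point = Tuple[float, float, float]

def smooth(prev: Point, new: Point, alpha: float = 0.4) -> Point:
    """Blend the new landmark toward the previous one to suppress jitter."""
    return tuple(alpha * n + (1.0 - alpha) * p for p, n in zip(prev, new))

def lerp(a: Point, b: Point, t: float) -> Point:
    """Linearly interpolate between two keyframe landmarks (0 <= t <= 1)."""
    return tuple(pa + t * (pb - pa) for pa, pb in zip(a, b))

# Example: smooth a left-knee landmark across two frames, then synthesize a
# midpoint frame between the same two keyframes (values are illustrative).
prev_knee: Point = (0.42, 0.71, -0.05)
new_knee: Point = (0.47, 0.69, -0.04)
smoothed = smooth(prev_knee, new_knee)
midframe = lerp(prev_knee, new_knee, 0.5)
```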

Currently, our progress aligns well with our original timeline. Next week, I plan to focus on optimizing and integrating the comparison algorithm further alongside Danny and Akul. Our goal is to implement more sophisticated analytical metrics to assess player accuracy comprehensively. Deliverables targeted for completion include a refined comparison algorithm fully integrated into our Unity game pipeline, rigorous testing, and initial documentation outlining the improved analytic metrics.

Team Status Report for 3/15

Risk Management:

Risk: Comparison algorithm not matching up frame by frame

Mitigation Strategy/Contingency plan: We will attempt to implement a change to our algorithm that takes into account a constant delay between the user and the reference video. If correctly implemented, this will allow the user to receive accurate feedback even if they do not start precisely at the same time as the reference video. If we are unable to implement this solution, we will incorporate warnings and countdowns to make sure the users know when to start dancing so that their footage is matched up with the reference video.
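For illustration, a minimal sketch of one way to estimate such a constant delay is shown below: scan a window of candidate offsets and keep the one that maximizes per-frame agreement. The function names, the scoring rule, and the 60-frame search window are placeholders, not our final algorithm.

```python
# Sketch: estimate a constant frame offset between the user and reference
# sequences by scanning candidate delays and keeping the one with the highest
# fraction of matching frames. frames_match() stands in for whatever
# per-frame similarity test the comparison algorithm uses.
def best_constant_offset(user_frames, ref_frames, frames_match, max_offset=60):
    best_offset, best_score = 0, -1.0
    for offset in range(max_offset + 1):
        pairs = list(zip(user_frames[offset:], ref_frames))
        if not pairs:
            break  # offset exceeds the user recording length
        score = sum(frames_match(u, r) for u, r in pairs) / len(pairs)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score
```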

Risk: Color based feedback not meeting user expectations – continued risk

Mitigation Strategy/Contingency plan: We plan to break down our Unity mesh into multiple parts to improve the visual appeal of the feedback coloring, so that users can more immediately understand what they need to do to correct a mistake. We also plan to incorporate a comprehensive user guide to help with the same purpose.

Design Changes:

There were no design changes this week. We have continued to execute our schedule.

Akul’s Status Report for 3/15

This week I focused on developing the comparison algorithm. Now that we had the code to normalize the points based on different camera angles, we had the capability to create a more fleshed out comparison engine to see if two videos contain the same dance moves. 

I spent my time this week creating a script that takes in two videos (one reference video, one user video) and checks whether the videos match via frame-to-frame comparisons. In our actual final project, the second video will be replaced with real-time video processing, but for testing’s sake I made it possible to upload two videos. I used two videos of my partner Danny performing the same dance moves at different angles from the camera and with slightly different timing. Using these videos, I had to extract the landmarks, get the pose data, and normalize the data in case there were any differences in camera poses. After that, I parsed through the JSONs, checking whether the poses at each comparable frame are similar enough. I then created a side-by-side comparison UI that lets us tell which frames are similar and which are different. The comparison is good for the most part, but I did find some false positives, so I adjusted the thresholds, which improved the results.
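A minimal sketch of that frame-to-frame loop is shown below; normalize_pose() and frame_similar() are placeholders standing in for the normalization and single-frame comparison steps described in these reports, and the JSON layout is assumed.

```python
# Sketch: frame-to-frame comparison of two pre-extracted pose JSONs
# (reference vs. user), producing one match flag per compared frame.
import json

def compare_videos(ref_json: str, user_json: str, normalize_pose, frame_similar):
    with open(ref_json) as f:
        ref = json.load(f)
    with open(user_json) as f:
        user = json.load(f)
    results = []
    for ref_frame, user_frame in zip(ref, user):
        if ref_frame is None or user_frame is None:
            results.append(False)  # treat missed detections as non-matching
            continue
        normalized = normalize_pose(user_frame, ref_frame)
        results.append(frame_similar(normalized, ref_frame))
    return results  # one True/False flag per compared frame
```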

Overall, our progress seems to be on schedule. The next steps will be integrating this logic into the Unity side instead of just the server-side code. Additionally, I will need to change the logic to take input from a webcam and a reference video instead of two uploaded videos, but this should be straightforward. Beyond that, the biggest task will be testing our system more thoroughly with more data points and videos. Next week, we will work on that more thorough testing as well as begin work on our DTW post-video analysis engine.

I couldn’t upload a less blurry picture due to maximum file upload size constraints, so apologies for any blurriness in the following images.

Match

No Match

Rex’s Status Report for 3/15

This week, my most important progress was getting the real-time and reference .mp4 video coordinates working with the avatar. I also spent time optimizing the physics of joint movements and improving the accuracy of how OpenCV MediaPipe’s 2D coordinates are mapped onto the 3D avatar. This was a crucial step in making the motion tracking system more precise and realistic, ensuring that the avatar’s movements correctly reflect the detected body positions. Additionally, I worked on expanding the GUI functionality within Unity, specifically implementing the ability to select a reference .mp4 video for analysis. This feature allows users to load pre-recorded videos, which the system processes by extracting JSON-based pose data derived from the .mp4 file. As a result, the dance coach can now analyze both live webcam input and pre-recorded dance performances, significantly enhancing its usability as mentioned before. I have attached a real-time demo video below to showcase the system’s current capabilities in tracking and analyzing movements. Debugging and refining the motion tracking pipeline took considerable effort, but this milestone was essential to ensuring the system’s core functionality is robust and scalable.

(THIS LINK TO DEMO VIDEO EXPIRES ON MONDAY 3/15, please let me know if new link needed)

I am on track with the project timeline, as this was a major development step that greatly improves the system’s versatility. Debugging the .mp4 processing workflow and ensuring proper synchronization between the extracted JSON pose data and Unity’s animation system were key challenges this week that I successfully addressed. Going forward, I primarily want to focus on refining the coordinate transformations, adjusting the physics-based joint movements for smoother tracking, and enhancing the UI experience. My goal for the upcoming week is to refine the UI pipeline further so that it becomes a polished, standalone application. This includes improving the user interface for seamless video selection, enhancing the visualization of movement analysis, and optimizing performance for smooth real-time feedback. With these improvements, the project is on a solid path toward completion, and I am confident in achieving the remaining milestones on schedule.

Danny’s Status Report for 3/15

This week, I concentrated on enhancing our integration capabilities and optimizing our data analysis framework. I collaborated with Rex to troubleshoot and resolve challenges related to data synchronization between our computer vision system and Unity 3D visualization environment. This involved identifying bottlenecks in the data pipeline and implementing solutions to ensure smooth data flow between different components of our stack.

Additionally, I made significant progress on integrating advanced mathematical transformation techniques into our real-time processing framework. This work required careful consideration of performance implications and algorithm design to balance accuracy with computational efficiency. The optimized implementation calculates reference parameters once at initialization and applies these transformations to subsequent data points, rather than recalculating for each new data frame.

This architectural improvement from last week has yielded substantial performance gains and enhanced the robustness of our system when handling variations in input data. This is especially useful as we transition into full real-time processing. The visualization tools we’ve implemented have provided valuable insights that continue to guide our development efforts.

Rex’s Status Report for 3/8

This week, I spent a lot of time refining the dance coach’s joint rotations and movements in Unity, making sure they feel as natural and responsive as possible, which involves using physics, physiology, and rotation logic. One of the focuses this week was adding logic to recolor the avatar’s mesh based on movement accuracy, giving users clear visual feedback on which parts of their body need adjustment. I also worked with Danny and Akul on integrating the comparison algorithm, which evaluates the user’s pose against the reference movements. A major challenge was optimizing the frame rate while ensuring that the physics and physiological equations accurately represent real-world motion. It took a lot of trial and error to fine-tune the balance between performance and accuracy, but it’s starting to come together. I collaborated closely with them to test and debug these changes, ensuring that they work correctly for basic movements.

Overall, I’d say progress is on schedule, but some of the optimization work took longer than expected. The biggest slowdown was making sure the calculations didn’t introduce lag while still maintaining accurate movement tracking. I also believe that there is more improvement to be made on the rotations of some of the joints, especially the neck, to model the movement more accurately. To stay on track, I plan to refine the physics model further and improve computational efficiency so the system runs smoothly even with more complex movements. Next week, I hope to finalize the avatar recoloring mechanism, refine movement accuracy detection, and conduct more extensive testing with a wider range of dance poses. The goal is to make the feedback system more intuitive and responsive before moving on to more advanced features.

Attached below are demo videos showing the current state of the dynamic CV-to-Unity avatar; the physics movements will need to be further tweaked for advanced movement. (Note: speeds are not the same for both GIFs.)

Danny’s Status Report for 3/8

This week I focused on integrating my existing work with the 3D Unity framework that Rex has been working on, as well as continuing to improve the Procrustes analysis method. The Unity visualization allowed me to get a better sense of how the Procrustes analysis is working and how I could improve it.

Initially, I was running the normalization algorithm on every single frame, normalizing each one to the reference frame. However, this presented a serious problem once we tested the algorithm in Unity. If the test and reference videos are not exactly the same length in terms of the number of frames, we end up normalizing frames that are not aligned at all. This means that we would have very little temporal distortion tolerance, which negates our premise of doing DTW analysis. It also greatly impacted processing time, since a new rotation matrix needed to be computed every single frame.

To improve upon this, I changed the algorithm to calculate the Procrustes parameters only once based on frame 0, and to apply the calculated parameters to each frame afterwards. This solution worked well and greatly improved our processing speed.
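A minimal NumPy-based sketch of that once-only fit is shown below; it assumes (num_joints, 3) landmark arrays and omits details such as reflection handling, so it illustrates the idea rather than our exact implementation.

```python
# Sketch: fit Procrustes parameters (rotation, scale, translation) on frame 0
# only, then reuse the same transform for every later frame.
import numpy as np

def fit_procrustes(test0: np.ndarray, ref0: np.ndarray):
    """Fit a similarity transform mapping frame-0 test landmarks onto the reference."""
    mu_t, mu_r = test0.mean(axis=0), ref0.mean(axis=0)
    t0, r0 = test0 - mu_t, ref0 - mu_r
    # Optimal rotation from the SVD of the cross-covariance matrix
    # (reflection correction omitted for brevity).
    u, s, vt = np.linalg.svd(t0.T @ r0)
    rot = u @ vt
    scale = s.sum() / (t0 ** 2).sum()
    return rot, scale, mu_t, mu_r

def apply_procrustes(frame: np.ndarray, rot, scale, mu_t, mu_r) -> np.ndarray:
    """Apply the frame-0 parameters to any later frame's landmarks."""
    return scale * (frame - mu_t) @ rot + mu_r
```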

 

Reference Footage
Test Footage (Slightly Rotated)
Raw Test Data (Rotated)
Normalized Test Data
Reference

 

Team Status Report for 3/8

Risk Management:

Risk: Cosine Similarity Algorithm not yielding satisfactory results on complex dances

Mitigation Strategy/Contingency plan: We will continue working on the algorithm to see if there are improvements to be made given how the CV algorithm processes landmark data. If the cosine similarity method does not work properly, we will fall back to a simpler method using Euclidean distance and use that to generate immediate feedback.
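For illustration, a minimal sketch of both scoring options is shown below; the input layout and the idea of averaging across joints are assumptions, not our finalized metric.

```python
# Sketch: cosine similarity across corresponding joint/limb vectors, with the
# simpler Euclidean-distance check as the fallback mentioned above.
# Inputs are assumed to be (num_joints, 3) NumPy arrays.
import numpy as np

def cosine_score(user_vecs: np.ndarray, ref_vecs: np.ndarray) -> float:
    """Mean cosine similarity across corresponding vectors (1.0 = identical direction)."""
    dots = (user_vecs * ref_vecs).sum(axis=1)
    norms = np.linalg.norm(user_vecs, axis=1) * np.linalg.norm(ref_vecs, axis=1)
    return float(np.mean(dots / np.clip(norms, 1e-8, None)))

def euclidean_score(user_pts: np.ndarray, ref_pts: np.ndarray) -> float:
    """Fallback: mean distance between corresponding landmarks (lower is better)."""
    return float(np.linalg.norm(user_pts - ref_pts, axis=1).mean())
```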

Risk: Color based feedback not meeting user expectations

Mitigation Strategy/Contingency plan: We plan to break down our Unity mesh into multiple parts to improve the visual appeal of the feedback coloring, so that users can more immediately understand what they need to do to correct a mistake. We also plan to incorporate a comprehensive user guide to help with the same purpose.

Design Changes:

There were no design changes this week. We have continued to execute our schedule.

Part A:

 

DanCe-V addresses the global need for accessible and affordable dance education, particularly for individuals who lack access to professional dance instructors due to financial, geographic, or logistical constraints. Traditional dance lessons can be expensive and may not be available in rural regions. DanCe-V makes dance training more accessible, as anyone with an internet connection and a basic laptop is able to use our application. Additionally, the system supports self-paced learning, catering to individuals with varying schedules and learning speeds. This is particularly useful in today’s fast-paced world where flexibility in skill learning is becoming more and more important.

 

Furthermore, as the global fitness and wellness industry grows, DanCe-V aligns with the trend of digital fitness solutions that promote physical activity from home. The system also has potential applications in rehabilitation and movement therapy, offering value beyond just dance instruction. By supporting a variety of dance styles, DanCe-V can reach users across different cultures and backgrounds, reinforcing dance as a universal form of expression and exercise.

 

Part B:

One cultural factor to consider is that dance is deeply intertwined with cultural identity and tradition. DanCe-V recognizes the diversity of dance forms worldwide and aims to support various styles, with possibilities of learning classical Indian dance forms, Western ballroom, modern TikTok dances, traditional folk dances, and more. By allowing users to upload their own reference videos rather than limiting them to a constrained set of sample videos, the system ensures that people from different cultural backgrounds can engage with dance forms that are personally meaningful to them. Additionally, DanCe-V respects cultural attitudes toward dance and physical movement. Some cultures may have gender-specific dance norms or modesty considerations, and the system’s at-home training approach allows users to practice comfortably in a private setting.

Part C:

DanCe-V is an eco-friendly alternative to traditional dance education, reducing the need for transportation to dance studios and minimizing associated carbon emissions. By enabling users to practice from home, it decreases reliance on physical infrastructure such as studios, mirrors, and printed materials, contributing to a more sustainable learning model. Additionally, the system operates using a standard laptop webcam, eliminating the need for expensive motion capture hardware, which could involve materials with high environmental costs.

Furthermore, dance is a style of exercise that does not require extra materials, such as weights, treadmills, or sports equipment. By making dance accessible to a larger audience, DanCe-V can help reduce the production of these materials, which often have large, negative impacts on the environment.

Procrustes Analysis Normalization demo:

Before Normalization
After Normalization
Reference
Test Footage
Reference Footage

Cosine Similarity comparison results:

Akul’s Status Report for 3/8

Over the past week, I focused on two main components: the design report and our single-frame comparison. For the design report, I spent time developing the quantitative design requirements, the design trades, and the system implementation. As part of that, I researched the requirements for our design, identifying concrete reasons behind each design decision. For example, to mitigate latency within our system, we decided to use only a certain subset of points in the MediaPipe output, decreasing latency while maintaining accuracy. I decided to go with just 17 points, as many of the remaining points crowd around the user’s head and toes, which isn’t necessary for our specific use case. Additionally, we had an idea of how we would implement our system, but I spent time creating block diagrams to put all of our thoughts together for each aspect of the system. Consequently, throughout the rest of the semester, we will have these diagrams to refer to and to adapt if we make any changes, so that both we and outside readers can better understand our system. For the design trade study, I focused on making sure that all of our decisions about algorithms, libraries, and protocols were fully intentional. I explored the tradeoffs between the options and provided concrete reasoning as to why we chose one over the other.
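As an illustration of that reduction, a minimal sketch is shown below; the specific landmark indices (nose, shoulders, elbows, wrists, hips, knees, ankles, heels, and foot tips from MediaPipe’s 33-point output) are a plausible 17-point choice rather than our confirmed list.

```python
# Sketch: keep a reduced subset of MediaPipe's 33 pose landmarks to cut
# per-frame work. The indices below are an illustrative 17-point choice,
# not necessarily the exact subset we settled on.
KEY_LANDMARKS = [0,        # nose
                 11, 12,   # shoulders
                 13, 14,   # elbows
                 15, 16,   # wrists
                 23, 24,   # hips
                 25, 26,   # knees
                 27, 28,   # ankles
                 29, 30,   # heels
                 31, 32]   # foot tips

def filter_landmarks(frame_landmarks):
    """Return only the landmarks used for comparison, preserving their order."""
    return [frame_landmarks[i] for i in KEY_LANDMARKS]
```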

This week, we also set the goal of getting a working MVP of the single-frame comparison, where we can take a user input and a reference video and see whether their dances are similar when doing a frame-to-frame comparison. We split up the work into normalizing the points, doing the actual comparison given the normalized points, and providing the Unity output. My task was to determine whether or not a frame was similar based on two provided JSONs that represent the user input and the reference input for that frame.

The overall algorithm that I used is fairly simple. I first created a helper function to find the Euclidean distance between two points in space, which are given in the JSON inputs. Then, I loop through each of the points in the JSONs, computing the distance between each corresponding pair. If the distance is less than a certain threshold (0.05 for now), the similarity for that point is true. I do this for each joint that we are comparing, and if 80% of the joints are “similar” enough, the overall output for that frame is true. The metrics I chose are fairly arbitrary, and I think we will first need to fully integrate the code and test these metrics to get a better idea of what we need for our final project.
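A minimal sketch of that single-frame check is shown below; the 0.05 distance threshold and 80% joint-agreement ratio come from the description above, while the JSON layout (a list of x/y/z dicts per frame) is an assumption.

```python
# Sketch: single-frame comparison as described above. Assumes each frame is a
# list of {"x", "y", "z"} dicts for the joints being compared.
import math

def joint_distance(a: dict, b: dict) -> float:
    """Euclidean distance between two landmark dicts."""
    return math.sqrt((a["x"] - b["x"]) ** 2
                     + (a["y"] - b["y"]) ** 2
                     + (a["z"] - b["z"]) ** 2)

def frame_similar(user_frame, ref_frame, dist_thresh=0.05, joint_ratio=0.8) -> bool:
    """True if at least 80% of joints are within the distance threshold."""
    close = sum(joint_distance(u, r) < dist_thresh
                for u, r in zip(user_frame, ref_frame))
    return close / len(ref_frame) >= joint_ratio
```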

Our progress is currently on schedule. By the end of spring break, we will have an MVP of our real-time feedback system, and once that is complete we will begin work on our multi-frame analysis and on integrating the overall system.