Team Status Report for 2/22

Team Status Report

Risk Management:

Risk: While our proposed solution may achieve accurate comparison on a technical level, our feedback system design risks not being exactly what our target users want or need.

Mitigation Strategy/Contingency plan: We plan to reach out to a variety of potential users of this system, including serious dancers, TikTok influencers who record dances regularly, and casual users who may record a dance or two once in a while. We will then use the feedback gathered from these potential users to inform how we design and generate our feedback.

Design Changes:

There were no design changes this week. We have continued to execute our schedule.

Danny’s Status Report for 2/15

This past week, as outlined on the schedule, I primarily focused on processing reference video inputs with OpenCV. I spent time exploring both MediaPipe and OpenPose as different ways to process and label the reference input video. After spending a substantial amount of time experimenting with both, we as a team decided that MediaPipe was a better fit for our needs. I then tested the MediaPipe pipeline with video inputs, initially with just a simple recording of myself. This initial test yielded unsatisfactory results, prompting me to continue fine-tuning the MediaPipe settings and the OpenCV capture.

The MediaPipe library comes with several base models. It also has a variety of options, including:

  • min_pose_detection_confidence (0.0-1.0):
    • Controls how confident the model needs to be to report a pose detection
    • Higher values reduce false positives but might miss some poses
    • Lower values catch more poses but may include false detections
  • min_pose_presence_confidence (0.0-1.0):
    • Threshold for considering a pose to be present
    • Affects how readily the model reports pose presence
  • min_tracking_confidence (0.0-1.0):
    • For video mode, controls how confident the tracker needs to be to maintain tracking
    • Lower values make tracking more stable but might track incorrect poses
    • Higher values are more precise but might lose tracking more easily
  • num_poses:
    • Maximum number of poses to detect in each frame
    • Increasing this will detect more poses but use more processing power
    • Default is 1
  • output_segmentation_masks:
    • Boolean to enable/disable segmentation mask output
    • Disabling can improve performance if you don’t need masks
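As a minimal sketch of how the options above might be set, the snippet below groups them in a small settings class and passes them to MediaPipe's Tasks API (`PoseLandmarkerOptions`). The `PoseSettings` class and `create_landmarker` helper are hypothetical names introduced for illustration, video running mode is an assumption, and the defaults shown are MediaPipe's documented defaults rather than the values we settled on.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PoseSettings:
    """Hypothetical container for the options listed above (MediaPipe defaults)."""
    min_pose_detection_confidence: float = 0.5
    min_pose_presence_confidence: float = 0.5
    min_tracking_confidence: float = 0.5
    num_poses: int = 1
    output_segmentation_masks: bool = False

    def __post_init__(self):
        # Confidence thresholds must stay inside the documented 0.0-1.0 range.
        for name in ("min_pose_detection_confidence",
                     "min_pose_presence_confidence",
                     "min_tracking_confidence"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")
        if self.num_poses < 1:
            raise ValueError("num_poses must be at least 1")


def create_landmarker(model_path, settings):
    """Build a video-mode PoseLandmarker from a PoseSettings instance."""
    # Imported lazily so the settings class works without MediaPipe installed.
    from mediapipe.tasks import python as mp_python
    from mediapipe.tasks.python import vision

    options = vision.PoseLandmarkerOptions(
        base_options=mp_python.BaseOptions(model_asset_path=model_path),
        running_mode=vision.RunningMode.VIDEO,  # assumption: video input
        **asdict(settings),
    )
    return vision.PoseLandmarker.create_from_options(options)
```

A call such as `create_landmarker("pose_landmarker.task", PoseSettings(min_pose_detection_confidence=0.7))` would then build a detector; both the model file name and the 0.7 threshold are placeholders, not our tuned values.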

After experimentation, I found that the parameters that affected our detection the most were min_pose_detection_confidence and min_pose_presence_confidence. After fine-tuning these parameters, I was able to achieve much better tracking on not just my own simple test video, but also a relatively complex YouTube dancing short. As we continue to work on this algorithm and integrate the systems together, I will also continue to experiment with the options to optimize performance while keeping tracking confidence as high as possible.
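One rough way to compare parameter settings is to count the frames in which a pose is detected. The sketch below is an illustration under assumptions, not the exact script used for the tests above: `count_poses_in_video` and `detection_rate` are hypothetical helpers, and `landmarker` is assumed to be a MediaPipe `PoseLandmarker` created in video running mode.

```python
def detection_rate(poses_per_frame):
    """Fraction of frames in which at least one pose was detected."""
    if not poses_per_frame:
        return 0.0
    detected = sum(1 for count in poses_per_frame if count > 0)
    return detected / len(poses_per_frame)


def count_poses_in_video(video_path, landmarker):
    """Run a video-mode PoseLandmarker over a file; return poses found per frame."""
    # Lazy imports: OpenCV and MediaPipe are only needed when a video is processed.
    import cv2
    import mediapipe as mp

    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unreported
    counts = []
    frame_index = 0
    try:
        while True:
            ok, frame_bgr = capture.read()
            if not ok:
                break
            # OpenCV decodes frames as BGR; MediaPipe expects RGB.
            frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
            image = mp.Image(image_format=mp.ImageFormat.SRGB, data=frame_rgb)
            # Video mode requires timestamps that increase from frame to frame.
            timestamp_ms = int(frame_index * 1000 / fps)
            result = landmarker.detect_for_video(image, timestamp_ms)
            counts.append(len(result.pose_landmarks))
            frame_index += 1
    finally:
        capture.release()
    return counts
```

Because video mode expects increasing timestamps, a fresh landmarker should be created for each video and each parameter setting being compared. Detection rate alone does not capture landmark accuracy, so visual inspection of the overlay is still needed.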

 

Testing with recorded footage from webcam:

Testing with YouTube Shorts dancing video:

Team Status Report for 2/15

Team Status Report

Risk Management:

Risk: Losing movement details in the transition from MediaPipe to the Unity inputs. This is something we are noticing after running some initial experiments this week in trying to push simple movements through MediaPipe into Unity.

Mitigation Strategy/Contingency plan: Unity offers several joint constraint options (Two-Bone Inverse Kinematic Constraint, Multi-Aim Constraint, Damped Transform, and Rotation Constraint). We will test these four options and choose the one that looks the most natural and is most coherent with our MediaPipe data.
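To help separate detail lost in the hand-off from detail lost in the Unity constraints, one option is to pass every landmark coordinate and its visibility score through unchanged. The sketch below assumes frames are exchanged as JSON, which is an assumption and not a decided design; the function name and field names are hypothetical. It relies only on the fact that MediaPipe's normalized y coordinate grows downward from the top of the image.

```python
import json


def landmarks_to_unity_frame(landmarks, frame_index):
    """Package one frame of (x, y, z, visibility) tuples as a JSON string."""
    joints = []
    for index, (x, y, z, visibility) in enumerate(landmarks):
        joints.append({
            "id": index,
            "x": x,
            # MediaPipe's normalized y grows downward; flip it so up is positive.
            "y": 1.0 - y,
            "z": z,  # left unchanged; the sign convention needs checking in Unity
            "visibility": visibility,
        })
    return json.dumps({"frame": frame_index, "joints": joints})
```

Keeping the visibility score lets the Unity side ignore or smooth low-confidence joints instead of treating every point as equally reliable.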

Design Changes:

  1. Specific Design Updates:
  • Change: Selecting MediaPipe as our library of choice instead of OpenPose
    • Why: More detailed documentation, ease of use, and a better match for the level of detail we require
  • Change: 3D Comparative Analysis Engine to be done in Unity
    • Why: Unity’s detailed avatar rigging allows us to display the dance moves with accuracy and compare the webcam footage with the reference video with sufficient detail

 

  2. Cost Impact and Mitigation:

– No direct costs incurred; these changes were part of the planned exploratory stage in our schedule.
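To illustrate the kind of comparison the analysis engine in the design updates above could perform, the sketch below measures the angle at a joint from three 3D points and reports how far the user's angle is from the reference. This is not the Unity implementation: joint-angle comparison is one common technique chosen here for illustration, and the function names are hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b between segments b->a and b->c (3D points)."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("joint points must not coincide")
    cosine = sum(p * q for p, q in zip(v1, v2)) / (norm1 * norm2)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


def angle_difference(user_points, reference_points):
    """Absolute difference in degrees between the user and reference joint angle."""
    return abs(joint_angle(*user_points) - joint_angle(*reference_points))
```

For example, passing shoulder, elbow, and wrist positions for the user and for the reference frame gives the elbow-angle error for that frame. Angles have the advantage of being independent of body size and camera distance.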

Updated Schedule:

Part A was written by Danny Cui, Part B was written by Rex Kim, Part C was written by Akul Singh

It is possible, though probably unusual, that the answer to a particular question would be “does not apply.” In such a case, please describe what you have considered to ensure that it does not apply.

Please write a paragraph or two describing how the product solution you are designing will meet a specified need…

 

Part A: … with respect to considerations of public health, safety or welfare. Note: The term ‘health’ refers to a state of well-being of people in both a physiological and psychological sense. ‘Safety’ is the absence of hazards and/or physical harm to persons. The term ‘welfare’ relates to the provision of the basic needs of people.

  • From a physical health perspective, the system promotes regular exercise through dance, which improves cardiovascular fitness, flexibility, coordination, and muscle strength. The feedback mechanism helps users maintain proper form and technique, reducing the risk of dance-related injuries that could occur from incorrect movements or posture. This is particularly valuable for individuals who may not have access to in-person dance instruction or cannot afford regular dance classes.
  • From a psychological health and welfare standpoint, the system creates a safe, private environment for users to learn and practice dance without the anxiety or self-consciousness that might arise in group settings. Dance has been shown to reduce stress, improve mood, and boost self-esteem, benefits that become more accessible through this technology. The immediate feedback loop also provides a sense of accomplishment and progression, fostering motivation and sustained engagement in physical activity. Additionally, the system addresses safety concerns by allowing users to learn complex dance moves at their own pace in a controlled environment, with guidance that helps prevent overexertion or dangerous movements. This is especially important for beginners or those with physical limitations who need to build up their capabilities gradually.

 

Part B: … with consideration of social factors. Social factors relate to extended social groups having distinctive cultural, social, political, and/or economic organizations. They have importance to how people relate to each other and organize around social interests.

 

  • Our computer vision-based dance-coaching game makes dance training more accessible and engaging. Traditional dance lessons can be hard to find, especially in remote areas, and costly for anyone who does not want to pay for classes on an ongoing basis. Our game removes these barriers by letting users practice at home with just a camera and computer setup. Using MediaPipe and Unity, it analyzes an input video and compares the user’s movements to an ideal reference. Real-time feedback helps users improve without needing an in-person instructor. This makes dance education more available to people who may not have the resources or opportunities to attend formal classes.
  • Beyond accessibility, our game also fosters cultural exchange and social engagement. Dance is deeply tied to cultural identity, and by incorporating a variety of dance styles from different traditions, the game can serve as an educational tool that promotes appreciation for diverse artistic expressions. Users can learn and practice traditional and contemporary dance forms, helping preserve cultural heritage while making it more interactive and engaging for younger generations. Additionally, the game can create virtual dance communities, encouraging users to share their performances, participate in challenges, and interact with others who share their interests.

 

Part C: … with consideration of economic factors. Economic factors are those relating to the system of production, distribution, and consumption of goods and services.

  • Since our application relies only on a webcam and computer processing, its economic impact is primarily related to accessibility, affordability, and potential market reach. Unlike traditional dance classes, which require ongoing payments for instructors/studio rentals, our application offers a cost-effective alternative by enabling users to practice and improve their dance skills from home. This affordability makes dance education more accessible to individuals who may not have the financial means to attend in-person lessons, thus reducing economic barriers to learning a new skill.
  • Additionally, our application aligns with current technological trends in society, where software-based fitness and entertainment solutions generate revenue through app sales, subscriptions, or advertisements. The fact that danCe-V only requires a computer webcam also reduces the financial burden on users, as they do not need specialized equipment beyond a standard webcam and computer. This makes it an economically sustainable option for both consumers and potential business models, allowing the platform to reach a broad audience while keeping costs low.

 

Images:

Testing from video input: 

Testing from direct webcam footage:

Danny’s Status Report for 2/8

Starting from this past week, I mostly acted in the role of a “Project Manager” for our project as a whole, managing all the tasks that needed to be done and putting a schedule together. This has been made more difficult by our second project pivot from making a dancing game to a dancing coach instead. As a part of that pivot, we decided to scrap our idea of incorporating a haptic feedback device to increase player immersion for our game. Since I had originally been designated the main person responsible for creating this haptic feedback device, I had been mostly conducting research on how I wanted to put this device together, which components to buy, and how the integration could work. Additionally, I had been preparing for our equipment procurement, which was scheduled to happen next week.

Because of our pivot, my efforts will now be redirected towards the CV part of the project. After helping the team create a renewed schedule that reflects our pivot and all the new tasks that need to be done, I’ve started researching how to implement computer vision, since I’m relatively inexperienced in the field. I will then begin work on processing the input reference video and the webcam video with teammate Akul Singh.