This week I worked on developing several distinct comparison algorithms, with the goal of iterating on our existing algorithm, trying new features, and being intentional about the design decisions behind it. For the interim demo, we already had an algorithm that uses dynamic time warping, analyzes 30 frames every 0.1 seconds, and computes similarity for each joint individually. This week I focused on building 5 variations of the comparison algorithm, so I can measure how different parameters and features affect the effectiveness of our dance coach. The following are the 5 variations I created; I will compare them against our original to help find an optimal solution:
- Frame-to-frame comparisons: does not use dynamic time warping; instead, it computes a normalization matrix between the reference video and the user's webcam input and compares joint coordinates frame by frame.
- Dynamic time warping with weighted similarity: builds on our interim-demo algorithm so that certain joints count more heavily than others in the similarity calculation.
- Dynamic time warping with a larger analysis window/frame buffer: builds on our interim-demo algorithm by increasing the analysis window and frame buffer for a more accurate DTW alignment.
- Velocity-based comparisons: similar to the frame-to-frame comparisons, but computes each joint's velocity over time and compares it against the reference video's velocities, detecting not exactly where the joints are but how they move over time.
- Velocity-based plus frame-to-frame comparisons: combines the velocity comparisons with the frame-to-frame joint-position comparisons to see whether the two together provide a more accurate measurement between the reference and user input videos.
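The frame-to-frame variant can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: the joint indices, the hip/torso-based normalization, and the distance-to-score mapping are all assumptions standing in for whatever normalization matrix we compute in practice:

```python
import numpy as np

def normalize_pose(pose, hip_idx=0, shoulder_idx=1):
    """Translate so the hip is at the origin and scale by torso length,
    so reference and webcam skeletons are comparable regardless of
    camera distance. The joint indices here are placeholders."""
    centered = pose - pose[hip_idx]
    torso = np.linalg.norm(centered[shoulder_idx])
    return centered / torso if torso > 0 else centered

def frame_similarity(ref_pose, user_pose):
    """Mean per-joint Euclidean distance between normalized poses,
    mapped into a 0..1 similarity score (1.0 = identical)."""
    ref = normalize_pose(ref_pose)
    user = normalize_pose(user_pose)
    dist = np.linalg.norm(ref - user, axis=1).mean()
    return 1.0 / (1.0 + dist)
```

Because there is no time alignment, this variant is cheap per frame but penalizes a dancer who is slightly ahead of or behind the reference.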
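The weighted-DTW variant layers per-joint weights on top of a classic DTW recurrence. A rough sketch, where the weight values and the four-joint layout are placeholders (in practice, joints that carry more of the choreography would get larger weights):

```python
import numpy as np

# Hypothetical per-joint weights; the values are illustrative only.
JOINT_WEIGHTS = np.array([0.5, 1.0, 2.0, 2.0])

def weighted_frame_cost(ref_pose, user_pose, weights=JOINT_WEIGHTS):
    """Weighted mean of per-joint distances for one frame pair."""
    d = np.linalg.norm(ref_pose - user_pose, axis=1)
    return float(np.average(d, weights=weights))

def dtw_distance(ref_seq, user_seq, cost=weighted_frame_cost):
    """Classic O(n*m) dynamic time warping over two pose sequences."""
    n, m = len(ref_seq), len(user_seq)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(ref_seq[i - 1], user_seq[j - 1])
            # Extend the cheapest of the three allowed alignments.
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

The O(n·m) table is also why the larger-window variant is a performance concern: doubling the analysis window quadruples this loop's work.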
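The velocity-based variant differences consecutive poses instead of comparing positions directly. A minimal sketch, assuming the 0.1 s analysis step mentioned above:

```python
import numpy as np

def joint_velocities(poses, dt=0.1):
    """Finite-difference velocity of each joint between consecutive
    frames; dt = 0.1 s matches our analysis step (an assumption here)."""
    return np.diff(poses, axis=0) / dt

def velocity_similarity(ref_poses, user_poses, dt=0.1):
    """Compare how the joints move rather than where they are: mean
    distance between the two velocity fields, mapped to a 0..1 score."""
    v_ref = joint_velocities(ref_poses, dt)
    v_user = joint_velocities(user_poses, dt)
    dist = np.linalg.norm(v_ref - v_user, axis=-1).mean()
    return 1.0 / (1.0 + dist)
```

Because `np.diff` cancels any constant offset, a user standing off-center from the reference still scores perfectly as long as the motion matches, which is exactly the property this variant is after.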
I have implemented and debugged the algorithms above; starting tomorrow and continuing throughout the week, I will run quantitative and qualitative comparisons between them to see which is best for our use case and to find further points of improvement. Additionally, I will coordinate with Rex and Danny on making it as easy as possible to integrate the comparison algorithm with the Unity portion of the game. Overall, our progress seems to be on schedule: if I can finalize the comparison algorithm within the next week while we begin integration in the meantime, we will be on a good track to finish by the final demo and deadline.
There are two main parts I will need to test and verify for the comparison algorithm. First, I aim to test the real-time processing performance of each algorithm. For example, the DTW algorithm with the extended analysis window may require too much computation to allow real-time comparison. On the other hand, the velocity and frame-to-frame algorithms may have room for added complexity to improve accuracy without hurting processing performance.
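A rough way to check that budget is to time each comparison call against the 0.1 s analysis step. This is a sketch, where `compare_fn` is a stand-in for any of the five algorithms:

```python
import time

def time_per_call(compare_fn, ref_seq, user_seq, n_runs=50):
    """Average wall-clock time of one comparison call."""
    start = time.perf_counter()
    for _ in range(n_runs):
        compare_fn(ref_seq, user_seq)
    return (time.perf_counter() - start) / n_runs

# We analyze every 0.1 s, so that is the per-comparison budget:
# anything slower cannot keep up in real time.
BUDGET_S = 0.1

def fits_real_time(compare_fn, ref_seq, user_seq):
    return time_per_call(compare_fn, ref_seq, user_seq) < BUDGET_S
```

In practice I would time each algorithm on sequences the same size as the live frame buffer, since DTW's cost grows with the window.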
Second, I aim to test the accuracy of each comparison algorithm. For each algorithm described above, I will run it on a complex video (such as a TikTok dance), a simpler video (such as myself dancing to a slower song), and a still video (such as me holding a T-pose on camera). I will record the output while I actively perform the dance, so I can watch the recording back and see how each algorithm does. Afterwards, I will build a table of both quantitative and qualitative notes on each algorithm, identifying which parts are lacking and which perform well. This will put all the data I need in front of me when deciding how to continue iterating on the algorithm.
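The quantitative half of that table could be assembled from recorded scores roughly like this. The score values below are made-up placeholders, not real measurements; the real numbers would come from running each algorithm on the three recordings:

```python
from statistics import mean, stdev

# Hypothetical per-frame scores keyed by (algorithm, test video);
# every number here is a placeholder for illustration only.
results = {
    ("frame-to-frame", "tiktok dance"): [0.61, 0.58, 0.64],
    ("frame-to-frame", "slow dance"):   [0.72, 0.70, 0.75],
    ("velocity",       "tiktok dance"): [0.55, 0.59, 0.57],
}

def summarize(results):
    """One (algorithm, video, mean, std) row per cell, ready to sit
    next to the qualitative notes in the comparison table."""
    return [(algo, video, round(mean(s), 3), round(stdev(s), 3))
            for (algo, video), s in sorted(results.items())]

for algo, video, m, s in summarize(results):
    print(f"{algo:15s} {video:13s} mean={m:.3f} std={s:.3f}")
```

A mean/std pair per cell makes it easy to spot not just which algorithm scores higher, but which one is jittery frame to frame, which matters for user-facing feedback.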
With these two strategies, I believe that we will be on a good track to verify the effectiveness of our dance coach and create the best possible comparison algorithm we can to help our users.