Shiheng's Status Report for 10/28/2023

My work this week focused mainly on the comparison algorithm. Using the JSON data generated from Openpose thanks to Eric, I was able to write my Python script for comparing two different postures. There was some confusion at the beginning because the output contained more fields than I had expected, but it was cleared up after communicating with Eric and reading the Openpose documentation. The keypoints were narrowed down, and points representing positions like the eyes and ears were eliminated to improve the accuracy of determining overall body posture.

Only points 0-14 are used in judging posture right now, for efficiency; I'll see how certain keypoints (on the feet) affect the correctness of posture in the tests that follow the implementation.

Using packages like numpy and json, I was able to pull files from the locally designated folder and compare the sets of data within. I added a few processing steps, including the keypoint reduction mentioned above as well as reforming vectors from the given limb datapoints. The reformed 'limbs' were then passed into the cosine similarity comparison. The algorithm responds quickly, and the results are easy to understand. How we want to display the results, and through what mechanism, is still to be determined, so below is a raw output from my algorithm showcasing one example: a comparison of two trials of the same image, where I made some minor changes to the position of the upper limb in the second image.
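
For illustration only, here is a minimal sketch of this pipeline; the file names, the limb list, and the keypoint indices are assumptions (based on the BODY_25 layout restricted to points 0-14), not the exact script:

import json
import numpy as np

# Illustrative limb pairs (keypoint indices within points 0-14 of the
# BODY_25 layout); the real script may use a different list.
LIMBS = [(1, 2), (2, 3), (3, 4), (1, 5), (5, 6), (6, 7),
         (1, 8), (8, 9), (9, 10), (8, 12), (12, 13)]

def load_keypoints(path):
    # Each keypoint is an (x, y, confidence) triple in the Openpose JSON.
    with open(path) as f:
        data = json.load(f)
    pts = np.array(data["people"][0]["pose_keypoints_2d"]).reshape(-1, 3)
    return pts[:, :2]  # keep (x, y), drop the confidence column

def limb_similarities(pts_a, pts_b):
    sims = []
    for i, j in LIMBS:
        u = pts_a[j] - pts_a[i]  # limb vector in the first pose
        v = pts_b[j] - pts_b[i]  # limb vector in the second pose
        sims.append(float(np.dot(u, v) /
                          (np.linalg.norm(u) * np.linalg.norm(v))))
    return sims

# Hypothetical file names for the two trials being compared.
print(limb_similarities(load_keypoints("trial1.json"),
                        load_keypoints("trial2.json")))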

The next steps will be to connect the comparison algorithm with the TTS engine to invoke vocal instructions, and to move on to the integration steps accordingly.

Ray’s Status Report for 10/28/2023

For the past week, I have been learning and writing KV language as well as adjusting our current prototype framework. I reconfigured many widgets so that they are now aligned according to the size of the application window, and I also got the settings page to work.

My next important task is to create the training page, whose design our group has agreed on. We want to display the user's footage in the top left corner, the reference posture in the top right corner, and the two skeletons at the bottom of the application window. The skeletons will be drawn with the same body part lengths, but the angles between body parts will depend on the reference pose and the user's attempt at imitating it. We believe this will help users see how to correct their mistakes in the most intuitive way. A rough layout sketch is shown below.
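
Here is a minimal Kivy sketch of that layout, written in Python rather than KV for brevity; the widget names and proportions are placeholder assumptions, not the final design:

from kivy.uix.floatlayout import FloatLayout
from kivy.uix.image import Image
from kivy.uix.widget import Widget

class TrainingPage(FloatLayout):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # User footage in the top-left quarter of the window.
        self.user_view = Image(size_hint=(0.5, 0.5),
                               pos_hint={"x": 0, "top": 1})
        # Reference posture in the top-right quarter.
        self.ref_view = Image(size_hint=(0.5, 0.5),
                              pos_hint={"right": 1, "top": 1})
        # Drawing area for the two skeletons across the bottom half.
        self.skeletons = Widget(size_hint=(1, 0.5),
                                pos_hint={"x": 0, "y": 0})
        for w in (self.user_view, self.ref_view, self.skeletons):
            self.add_widget(w)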

Also, with help from Eric, Openpose is finally running on my laptop! While the framerate is very low, it should be enough to let me start testing the Openpose API from the official documentation. I plan to try it out next week and see how things work for our overall integration.

My teammates and I also looked into the ethics aspect of our project. We met this week to discuss whether any extra design choices are necessary to make our project ethically robust. It turns out that the parameter adjustment option we are offering users on the settings page already resolves some of the most important ethical issues concerning our project.

I am currently a little behind schedule, since the integration process is expected to start next week, but I believe I can catch up early next week, as I have clearly identified and prepared for the tasks left in the frontend part of our project.

Team Status Report for 10/28/2023

During the past week, our team has been diligently working on various aspects of our project and collaborating to bring it all together.

Ray and Jerry are working on the UI design in Kivy, covering functions like image uploading and display. They will continue to cooperate with the other members of the team to integrate functions like Openpose and the voice engine into the application. This will play a pivotal role in ensuring a user-friendly and visually appealing experience, compared to the more naïve implementation in Tkinter.

Eric has successfully integrated Openpose into our system, enabling it to accept image inputs and generate JSON outputs for Shiheng's comparison algorithm. This is a pivotal step in our project, as it provides the foundation for the comparison algorithm and for more detailed instructions on body postures. We'll look at how we want to display Openpose feedback, integrated with camera images, to the user in the following weeks.

Shiheng has implemented a cosine similarity algorithm for comparing the key points that represent body posture. This algorithm will allow us to quantify the similarity between different body postures, providing a strong basis for evaluation and for issuing instructions. We can analyze and compare the alignment and positioning of individual limbs, offering a more detailed assessment of body posture.

We also discussed ethics after specifying more project details at our Friday meeting: we first exchanged opinions about the assigned reading and the answers we gave about the project, then debated privacy, autonomy, and the potential for misuse of the project.

Our team is making solid progress, and we're on track to deliver our comprehensive Taichi instructor system. We'll continue working on these components, address the feedback from our design review report, and aim to achieve our project milestones in the coming weeks.

Ray’s Status Report for 10/21/2023

For the past two weeks, my team and I worked on the design review of our project to finalize the application structure and implementation design. We wrote the design review document together and discussed how each subcomponent of our project should work. I worked specifically on finalizing the user interface section.

To accomplish the above task, I shifted my focus from Tkinter to Kivy, the new package we chose for implementing our application interface. I learned the basics of initiating and configuring widgets and wrote KV language files to simplify the design process. Using the skills I learned, I created the prototype pages for the main menu and the pose selection page. Below are the prototypes I created. (Consider this my answer to the ABET question.)

In order to make the application run as expected, I have to get used to Kivy's screen manager functionality for switching between pages. I am currently working on it, and I plan to have a functional application frame next week. The live camera embedded in the training page is also something I need to look into, and I will start working on it as soon as the previous task is mostly completed. A minimal example of the screen manager usage is sketched below.
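
For reference, a minimal sketch of Kivy's ScreenManager for page switching; the screen names and classes are placeholders, not our actual pages:

from kivy.app import App
from kivy.uix.screenmanager import ScreenManager, Screen

# Placeholder screens; the real pages would subclass Screen the same way.
class MainMenu(Screen):
    pass

class PoseSelection(Screen):
    pass

class TaichiApp(App):
    def build(self):
        sm = ScreenManager()
        sm.add_widget(MainMenu(name="menu"))
        sm.add_widget(PoseSelection(name="select"))
        sm.current = "menu"  # switch pages by assigning a screen's name
        return sm

if __name__ == "__main__":
    TaichiApp().run()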

I am still trying to configure Openpose on Windows; if necessary, I will ask Eric about the build process and make sure Openpose runs on my computer by next week.

I am overall on schedule for this week. Still, starting next week, I need to focus more on cooperating with my teammates, since the integration process for our application should begin soon.

Team Status Report for 10/21/2023

For the week prior to the fall break, our group mainly focused on finishing and polishing the Design Review document. We split up the work evenly, with every team member taking on part of the writing and filling in the subsystem they are responsible for in the upcoming development. We were able to find existing journals and research to justify our design requirements, provide details for our system specs, and think of alternative approaches in case some parts of the system go wrong. Writing the design document let us explain many details that our design review presentation omitted due to time constraints (we had to fit the contents of two presentations into one, since we redesigned our project after the proposal presentation), giving faculty a better picture of the project and helping clarify details among group members.

While enjoying our fall break, we also made progress implementing some of the subsystems.

Thanks to Hongzhe (Eric), the Openpose posture detection is now working with imported images (high-resolution frames handpicked from online video sources), which we are going to use as reference postures and as the baseline for evaluating user poses. As described in our design documents, these images will be transformed into JSON files containing 2D posture keypoints, which will be passed into the comparison algorithm to calculate the differences.

Shiheng worked on the Text-to-Speech engine, which takes in text instructions generated by the comparison algorithm and passes them into the voice engine to generate real-time instructions. The time required for the voice engine to generate output files is low (< 0.5 seconds for an average-length instruction), and the output voices are clear and easy to understand. We will continue to look into voice engine outputs and determine the best way to present these vocal instructions to users.

On the frontend, Jerry evaluated several Python packages for a better UI and user experience. He determined that the Kivy package provides many widgets that can be easily implemented and a much better interface than Tkinter, which is what we originally planned to use.

Ray focused on learning the KV language and creating the prototype frontend application. He created prototype pages for the main menu and the pose selection page. He is also working on Kivy's screen manager feature to support page switching in the application and experimenting with interfacing Openpose data with the Kivy UI.

Hongzhe’s Status Report for 10/21/2023

For the past two weeks, including the fall break, we developed our ideas in more depth and detail through the design review document. Each of us is also making progress on the technology we are respectively responsible for.

Personally, I made the outline of the design review document, listing out the key points we discussed with the faculty so that the document and its content are well structured. I was also in charge of filling in certain portions of the document, mostly the overall architecture and the summary.

I have also pushed the OpenPose work forward. I succeeded in using the compiled OpenPose executable to process reference image files of Tai Chi poses and generate JSON output, and I will iterate on that next week to process all reference images. Below is a sample JSON output, followed by a rough sketch of the invocation. I was also trying to enable the Python feature of OpenPose; while the Python support builds, the sample Python program cannot be executed. I will dig into this issue more in the future, and we always have the backup option of using the executable from the C++ compilation instead of the Python OpenPose library.

ABET: The new tool I learned is, without question, OpenPose. I learned it from scratch, from understanding its overall architecture to setting up the current environment on Windows. Given that the software has not been updated for years, I also learned a lot about gathering resources from the internet and the GitHub page to resolve missing components and incompatible module versions.

{
    "version": 1.3,
    "people": [
        {
            "person_id": [
                -1
            ],
            "pose_keypoints_2d": [
                411.349,
                275.523,
               … (ignored for view length)
                875.764,
                0.543042
            ],
            "face_keypoints_2d": [],
            "hand_left_keypoints_2d": [],
            "hand_right_keypoints_2d": [],
            "pose_keypoints_3d": [],
            "face_keypoints_3d": [],
            "hand_left_keypoints_3d": [],
            "hand_right_keypoints_3d": []
        }
    ]
}
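
As a rough illustration of how this output is produced (the paths below are hypothetical, though the flags are standard OpenPoseDemo options), the batch processing planned for next week could be driven from Python like this:

import subprocess

# Hypothetical paths; --display 0 and --render_pose 0 skip the GUI and
# rendering so the run is faster when we only need the JSON keypoints.
subprocess.run([
    r"bin\OpenPoseDemo.exe",
    "--image_dir", r"reference_images",
    "--write_json", r"output_json",
    "--display", "0",
    "--render_pose", "0",
], check=True)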

Shiheng's Status Report for 10/21/2023

I mainly contributed to the Design Requirements and Tradeoffs sections of the design review document. Starting from what we had in the design review slides and the notes shared among us, I was able to quantify and justify the requirements we encountered in our project. The tradeoff section also plays a vital role in the design document, as it explains the decisions we made in the past weeks and why we think they are in our best interest.

For the past week, I mostly researched my part, the TTS engine, and its application to our project. During the implementation, I discovered many compatibility problems involving the Python version, package dependencies, and output support. After trying out a few different Python versions and attempting to install the TTS package on my laptop, I determined that Python 3.10 was the best fit, as it supports all the packages the TTS engine needs, with most of them up to date. Other versions either have reached the end of their life cycle or had issues supporting the latest packages the TTS engine requires.

With the package successfully installed and all requirements fulfilled, I generated a few .wav files for demonstration purposes. The .wav files sound fine and can be generated locally from the command prompt using the default voice engine. I'll continue to research different voice engines to make sure the best among them is picked, and to account for users wanting to pick different voices while being instructed. I will continue to work on this part and begin to integrate the voice function into our project once our pipeline is mostly done.

Here's a sample message using the default engine:

“Raise your arm by 20 degrees”
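
Generating such a file might look like the sketch below, assuming the Coqui TTS package (one candidate engine; the model name is an assumption, and the engine we finally pick may differ):

from TTS.api import TTS

# Assumed default model; Coqui TTS downloads it on first use.
tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC")
# Synthesize the instruction text straight to a local .wav file.
tts.tts_to_file(text="Raise your arm by 20 degrees",
                file_path="instruction.wav")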

ABET #7 Question:

I have looked into tools like Text-to-Speech engines and tried to understand the logic and algorithms behind their different voice models.

Additional knowledge includes understanding pronunciation and syllables in English, and what speed the voice engine should use when speaking the words. Those parameters need to be adjusted to meet our design requirements and provide efficient feedback to the user.

I also looked into various online forums and design documents during the installation and learned about similar problems other users previously had when installing older versions of the package. I learned a lot more about Python packages related to machine learning and voice generation.

Ray’s Status Report for 09/30/2023

We started our implementation process this week, and everyone got to work on their respective sections. I started learning and writing the UI for our system with Tkinter.

I got through many of the UI elements over the week, and I plan to approach the image display, video display, and database functionalities tomorrow. Below is the video I will refer to.

I also tried to get Openpose to work on my laptop, but I am still getting errors related to CMake versions. My plan is to get Openpose running on my laptop as soon as possible, hopefully within the next two days.

I also looked through the whole_body_from_image.py and keypoints_from_image.py examples in the official Openpose repository. Based on the whole-body example, displaying a sequence of poses in order can be realized by reading each image, feeding the pose datapoints into Shiheng's cost functions, and then waiting for the user to perform a correct pose. The timing could be an issue to implement, and I might need to look into Python functionality to realize it in our system. Meanwhile, static pose evaluation can be realized with the following code as a reference:
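
A minimal sketch in the spirit of keypoints_from_image.py, assuming a working pyopenpose build and a local models/ folder (both configuration-dependent):

import cv2
from openpose import pyopenpose as op

# Point OpenPose at the model folder; path is an assumption.
params = {"model_folder": "models/"}
wrapper = op.WrapperPython()
wrapper.configure(params)
wrapper.start()

datum = op.Datum()
datum.cvInputData = cv2.imread("pose.jpg")  # hypothetical input image
wrapper.emplaceAndPop(op.VectorDatum([datum]))

# poseKeypoints has shape (people, keypoints, 3): x, y, confidence.
print(datum.poseKeypoints)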

Still, through discussion of the design review we presented this week, we noticed aspects of our project that require more detailed consideration. In particular, the way we convey our pose detection results needs to be intuitive enough for our users. Also, the time interval between verbal instructions should be different from the interval between pose detections. I will take these into account when designing the user interface next week and reflect them in the design report that is due next week.

My progress for this week is still on schedule, though the work for next week might be heavier than in regular weeks. I plan to work some during the fall break if next week's schedule turns out to be too packed, but I will prioritize planning out how the API and UI should communicate and deciding on a design for the UI.

Team Status Report for 10/07/2023

Week of 10/7/2023: continuing to revise the design and beginning initial project development.

On Wednesday, we finished our design review presentation, and we continued the discussion of our design during our Thursday night meeting. We were joined by Professor Tamal, who gave us valuable feedback on our current project and on how we could improve our MVP framework to include more functionality. One idea we adopted was evaluating users' performance over a short period of time, acknowledging that they need to set up and transition into a position. This eliminates the chance that a user doing a completely different posture accidentally gets a high score because of the particular frames our system grabs, and it provides an internal evaluation/scoring system for body postures. We still need to polish these ideas before next Friday, when the design report is due; by then we need to quantify these thresholds and reflect the changes in our report.

On Friday morning, we received detailed feedback on our design presentation thanks to the hardworking faculty. We believe we gave a detailed presentation, but it seems we still need to clarify some issues that were not showcased clearly in it.

The team worked throughout the week to perfect the details and narrow down how we want to approach this project, and we determined that minor changes need to be made to our schedule. We have begun coding and incorporating systems gradually into our project; the Openpose pipeline is already working ahead of time thanks to the extra effort Hongzhe (Eric) put in to configure and run it.

We'll continue to work with faculty members in the following week on our design report, start data collection using poses of Taichi professionals handpicked online, integrate the Openpose API, and start developing the comparison algorithm. We currently have no plans to order any hardware equipment. Please stay tuned for our next update and the design documentation.

ABET question:

  1. Engineering ethics: We highly respect the privacy of users, given the necessity of using cameras to capture body posture. Because the camera could collect more personal information than posture, we plan to make this a local application with no need to exchange information with a cloud server. All collected data will be stored and evaluated locally without an internet connection, and users can choose to turn off the camera while using the application whenever they feel it necessary (e.g., reviewing posture, attending to other business, or not feeling like practicing).
  2. Math model for comparing body postures: Using cosine similarity, we compare body postures captured from users against the standard presets provided by Taichi professionals (see the formula after this list). To account for differences in people's heights and shapes, we measure a person's joint angles rather than approximating similarity from the absolute positions of joints. A normalization step can also be applied to postures to account for magnitude differences in the vectors caused by varied body sizes.
  3. Machine learning model in OpenPose: In the posture recognition application OpenPose, we use a trained convolutional neural network to recognize core body coordinates from the input video/image. Convolutional neural networks are built on batch processing, matrix dot products, and techniques such as pooling and regularization to handle outliers and avoid overfitting. They rely on mathematical formulas, such as the exponentials and logarithms used in activation and loss functions, and on differential calculus to train the network parameters during backpropagation.
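
For reference on item 2, the cosine similarity between two limb vectors \(\mathbf{u}\) and \(\mathbf{v}\) is

\[ \mathrm{sim}(\mathbf{u},\mathbf{v}) = \cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{\lVert \mathbf{u}\rVert\,\lVert \mathbf{v}\rVert}, \]

which depends only on the angle between the vectors, not on their magnitudes. This is why it compares joint angles independently of body size, and why the separate normalization and absolute-position checks are needed to rule out entirely different postures whose limb angles happen to match.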

Shiheng's Status Report for 10/07/2023

This week, I focused on implementing the comparison algorithm and on normalizing body sizes for better results that account for differences in body size. Following our Gantt chart, here is my progress:

Progress:

Since our pipeline from OpenPose is still under development, I constructed some datasets myself (purely random) to test my comparison algorithm using cosine similarity. Cosine similarity measures the angular similarity between vectors, making it ideal for assessing body orientation. Additionally, I will explore techniques to normalize body sizes to enhance the accuracy of these posture comparisons in the following week.

To facilitate comparison, each body posture is transformed into a vector within a multi-dimensional space. Each dimension within this space corresponds to a specific key point (in our case, a joint) detected by OpenPose. For instance, if the OpenPose output consists of 18 absolute positions, each posture is represented as an 18-dimensional vector.

The implementation requires packages including numpy, a normal Python environment, and VS Code for development. I used Python 3.11.5 in this case, since 3.12 was released just a few days ago and could have package compatibility issues. I'll make sure to keep targeting the latest version that the packages support and are optimized for.

Implementation of absolute position (planned for next week):

To account for differences in body sizes and variations in the distances between body joints, it is imperative to normalize the posture vectors. My current idea is to normalize every person to my body size, which is around 5'10" and 170 lbs (to be justified next week with the other members of the group). This will be an add-on to the cosine comparison to determine the absolute positions of users; using absolute position eliminates the possibility that a user doing a completely different posture scores a high similarity due to the nature of cosine similarity. The normalization process involves dividing the coordinates of key points by a reference length, typically the distance between two consistent points (the two shoulders, the ratio between upper and lower body, the ratio between calf and thigh). This scales all joints proportionally to the reference length, enabling a reasonably standard comparison. A sketch of the idea is shown below.
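
A minimal sketch of this normalization, assuming keypoints is an (N, 2) array of absolute joint positions and that indices 2 and 5 are the right and left shoulders (as in the OpenPose BODY_25 layout):

import numpy as np

def normalize_pose(keypoints, ref_a=2, ref_b=5):
    # Shoulder-to-shoulder distance serves as the reference length.
    keypoints = np.asarray(keypoints, dtype=float)
    ref_len = np.linalg.norm(keypoints[ref_a] - keypoints[ref_b])
    # Center on the shoulder midpoint, then scale by the reference
    # length so poses from different body sizes become comparable.
    center = (keypoints[ref_a] + keypoints[ref_b]) / 2.0
    return (keypoints - center) / ref_len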