Shiheng’s Status Report for 12/09/2023

This past week was highly productive, as I dedicated a significant portion of my time to preparing for an upcoming presentation. Our discussion centered on the quantitative results derived from the extensive test trials we conducted. Through a thorough analysis of the gathered data, we drew valuable insights that formed the backbone of our presentation.

One key aspect of our discussions revolved around potential areas of improvement, which we identified through meticulous examination of user feedback obtained during the testing phase. This feedback proved invaluable in shaping our understanding of user experiences and guiding us towards refining the functionalities of the project.

Simultaneously, my attention was devoted to finalizing the backend processes of our project. Collaborating closely with Hongzhe, we delved into intricate details to ensure the seamless integration and optimal performance of the backend. Several optimizations were implemented, enhancing the overall efficiency and responsiveness of the system.

In addition to the technical aspects, I engaged in detailed discussions with Hongzhe regarding the design elements of our project. Fine-tuning the design for optimal user experience was a priority, and we worked collaboratively to address any remaining concerns on the backend.

Shiheng’s Status Report for 12/02/2023

This week I mostly focused on fixing minor bugs and integration issues that surfaced while testing the application with the rest of the group. Guidance was improved for users who do not fit inside the camera frame and for correcting their positions. I worked with Ray on issues I found when comparing skeletons, namely missing joints and the scaling of the reference picture. Eric also helped me rerun the OpenPose script on cropped pictures for better performance in the application.

We also found issues inside our voice module that caused unintended termination of the application, which we traced to the pygame module. Over the weekend and the upcoming week, I will prepare for the upcoming presentation and, together with Eric, test my backend module for justification and evaluation purposes.
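Since a playback failure could terminate the whole application, one defensive pattern is to isolate playback behind a wrapper that catches and logs errors. This is a minimal sketch, not our actual fix; `play_fn` stands in for whatever pygame.mixer call we end up using, and the names are hypothetical:

```python
def safe_play(play_fn, *args):
    """Run a playback callable (e.g. a pygame.mixer call) and catch any
    exception it raises, so a failed sound cannot kill the application."""
    try:
        play_fn(*args)
        return True
    except Exception as exc:
        # In the real module this would go to a logger instead of stdout.
        print(f"voice playback failed: {exc}")
        return False
```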

Shiheng’s Status Report for 11/18/2023

My work this week covered backend integration with Ray and continued progress on implementing vocal instructions.

I have been actively driving progress in the project, specifically focusing on enabling pose selection when multiple people are in frame. My work involves extensive Python scripting to build a system that automatically chooses among the detected poses and picks out the Taichi practitioner. Users will then be able to train in less private environments (e.g., gyms) without needing the room cleared to keep the system from capturing other people's bodies.
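The practitioner-selection idea can be sketched as similarity scoring against a reference pose: score each detected person's keypoint vector and keep the best match. This is an illustrative sketch under that assumption, not our exact implementation; the flattened pose vectors are made up:

```python
import math

def cosine(u, v):
    """Cosine similarity between two flattened pose vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def pick_practitioner(candidates, reference):
    """Return the index of the detected person whose pose best matches
    the reference, so bystanders in a gym are ignored."""
    scores = [cosine(c, reference) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)
```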

Additionally, I have taken on the responsibility of building the backend support for skeleton drawing. By passing angles relative to a reference frame, I enabled Ray to pinpoint and draw the vectors for the reference skeleton. The user skeleton follows similar logic, and I have laid down the foundation for comparison by passing a boolean list for verification purposes and creating visual cues on the front end.
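The boolean verification list could be produced by a per-limb angle check along these lines; the 15-degree tolerance here is an assumed placeholder, not a decided value:

```python
def limb_angles_ok(user_angles, ref_angles, tol_deg=15.0):
    """Compare per-limb angles (in degrees) against the reference frame.
    Returns one boolean per limb: True where the limb is within tolerance,
    which the frontend can turn into visual cues."""
    return [abs(u - r) <= tol_deg for u, r in zip(user_angles, ref_angles)]
```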

I am still working on and researching good vocal instructions, keeping the sequence and the priority of body parts in mind; this should be done by the end of this weekend and ready for testing. In the following weeks, I will focus on testing and fixing bugs in our final product once all the functions are done.

Shiheng’s Status Report for 11/11/2023

This week I continued developing the backend part of our software.
There was some back and forth between the group members on project details, but it was soon resolved and we were quickly back on track.

Following Monday's interim demo session, I worked with Eric on hand posture detection, as our demo did not detect hands. We had oversimplified the model, which made the comparison inaccurate.
Fortunately, it was resolved that same afternoon, and we are now able to detect the user's hand posture. Users will be reminded when their hand is not in the correct posture, e.g., making a fist instead of a flat hand.

I also worked on more backend functionality this week, including dividing and identifying body parts into ‘head’, ‘torso’, ‘arms’, ‘legs’, and ‘feet’, and communicating with Ray and Jerry about the feedback needed on the frontend. The functionality is mostly finished for normal pictures (i.e., the user's full body in frame with a clear reference picture).

The plan for next week and the following weeks is to handle the cases where unclear pictures are provided or the user is not posing inside the frame. I have been working closely with Eric's test cases to identify potential issues in incoming rendered images and JSON files. Functions that still need to be implemented are:
Calibration for one's body and identifying missing keypoints -> Prompt the user to change their posture and include their full body in the frame.
Multiple people in frame -> Use similarity scoring to identify the actual user.
Instruction wording -> Give clear, concise commands to the user, and prioritize lower body parts in training.
Potential integration issues -> Fix them as the project progresses.
I also need to coordinate with the members working on the frontend about progress and implementation.
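For the calibration item above, OpenPose reports each keypoint with a confidence score, so missing keypoints can be found by thresholding that score. A minimal sketch under that assumption (the 0.1 threshold is a placeholder):

```python
def missing_keypoints(keypoints, conf_threshold=0.1):
    """Given OpenPose-style keypoints [(x, y, confidence), ...], return the
    indices that were not detected reliably, so we can prompt the user to
    bring their full body into the frame."""
    return [i for i, (_, _, conf) in enumerate(keypoints) if conf < conf_threshold]
```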

ABET question:
1. Unit testing on backend code: Lots of pre and postconditions are currently implemented for unit testing and debugging purposes
2. Image testing: Test on the go, using user inputs and reference images handpicked to test the functionality of the code
3. User feedback testing: Voice instructions will be played to user which will be evaluated on their quality and clarity
4. Edge case testing: Cases where improper images are uploaded? User not/partly in frame? User doing a completely different pose?

Shiheng’s Status Report for 10/28/2023

My work this week mainly focused on comparison algorithms. Using the JSON data generated from OpenPose thanks to Eric, I was able to craft my Python script for comparing two different postures. There was some confusion at the beginning because there were more outputs than I had expected, but it was sorted out after communicating with Eric and reading the OpenPose documentation. The keypoints were narrowed down, and points representing positions like the eyes and ears were eliminated to improve the accuracy of determining overall body posture.

Only points 0-14 are used in judging posture right now, for efficiency; I'll see how certain keypoints (on the feet) affect the correctness of posture in testing after the implementation.

Using packages like numpy and json, I was able to pull files from the locally designated folder and compare the sets of data within. I added a few processing steps, including the reduced datapoints mentioned above, as well as re-forming vectors from the given limb datapoints. The re-formed ‘limbs’ were then passed into the cosine similarity comparison. The algorithm responded quickly, and the results were easy to understand. How we want to display the results, and the mechanism for doing so, is still to be determined, so below I will display raw output from my algorithm showcasing one example comparison: two trials of the same image, where I made some minor changes to the position of the upper limb in the second image.
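The limb-vector comparison described above can be sketched roughly as follows; the limb index pairs are illustrative, not the actual OpenPose keypoint mapping we use:

```python
import math

# Hypothetical limb definition: pairs of keypoint indices (e.g. shoulder -> elbow).
LIMBS = [(1, 2), (2, 3), (1, 5), (5, 6)]

def limb_vectors(points):
    """Re-form limb direction vectors from (x, y) keypoint coordinates."""
    return [(points[b][0] - points[a][0], points[b][1] - points[a][1])
            for a, b in LIMBS]

def cosine_sim(u, v):
    """Cosine similarity between two 2-D limb vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    nu, nv = math.hypot(*u), math.hypot(*v)
    return dot / (nu * nv) if nu and nv else 0.0

def compare_postures(user_pts, ref_pts):
    """Average cosine similarity over all limb vectors; 1.0 means every
    limb points in the same direction as the reference."""
    sims = [cosine_sim(u, r)
            for u, r in zip(limb_vectors(user_pts), limb_vectors(ref_pts))]
    return sum(sims) / len(sims)
```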

The next steps will be to connect the comparison algorithm with the TTS engine to invoke vocal instructions and move on to the integration steps accordingly.
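One way to bridge the comparison output to the TTS engine is a small formatter that turns a measured angle difference into a sentence like "Raise your arm by 20 degrees". The function name and wording are illustrative, not settled:

```python
def instruction_text(body_part, angle_diff_deg):
    """Format a spoken instruction from a measured angle difference.
    Positive differences mean the limb should be raised, negative lowered."""
    verb = "Raise" if angle_diff_deg > 0 else "Lower"
    return f"{verb} your {body_part} by {abs(round(angle_diff_deg))} degrees"
```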

Shiheng’s Status Report for 10/21/2023

I mainly contributed to the Design Requirement and Tradeoff parts of the design review document. Starting from what we had in the design review slides and the notes shared among us, I was able to quantify and justify the requirements we encountered in our project. The tradeoff part also plays a vital role in the design document, as it explains the decisions we made in the past weeks and why we think they are in our best interest.

For the past week, I mostly researched my part, the TTS engine, and its application to our project. During implementation, many compatibility problems were discovered with Python versions, package compatibility, and output support. After trying a few different Python versions and attempting to install the TTS package on my laptop, I determined that Python 3.10 was the best fit, as it supports all the necessary packages for the TTS engine with most packages up to date. Other versions had either reached the end of their life cycle or had issues supporting the latest packages that the TTS engine requires.

With the package successfully installed and all requirements fulfilled, I generated a few .wav files for demonstration purposes. The .wav files sound fine and play correctly when run locally from the command prompt using the default voice engine. I'll continue to research different voice engines to make sure the best among them is picked, and to consider that users might want to pick different voices while being instructed. I will continue to work on this part and begin to integrate the voice function into our project once our pipeline is mostly done.

Here’s a Sample message using default engine:

“Raise your arm by 20 degrees”


ABET #7 Question:

I have looked into tools like text-to-speech engines and tried to understand the logic and algorithms behind different voice models.

Additional knowledge includes understanding pronunciation and syllables in English, and what speed the voice engine should use to speak the words. Those parameters need to be adjusted to meet our design requirements and provide efficient feedback to the user.

I also looked into various online forums and design documents for the installation, and learnt about similar problems other users previously had when installing older versions of the package. I learnt a lot more about Python packages related to machine learning and voice generation.

Shiheng’s Status Report for 10/07/2023

This week, I focused on implementing the comparison algorithm and on normalizing body sizes for better results that account for differences in body size. Following our Gantt chart, here's my progress:

Progress:

Since our pipeline from OpenPose is still under development, I constructed some datasets myself (purely random) to test my comparison algorithm using cosine similarity. Cosine similarity measures the angular similarity between vectors, making it ideal for assessing body orientations. Additionally, I will explore techniques to normalize body sizes to enhance the accuracy of these posture comparisons in the following week.

To facilitate comparison, each body posture is transformed into a vector within a multi-dimensional space. Each dimension within this space corresponds to a specific keypoint (in our case, a joint) detected by OpenPose. For instance, if the OpenPose output I receive consists of 18 absolute positions, each posture is represented as an 18-dimensional vector.

The implementation requires packages such as numpy, a standard Python environment, and VS Code for development. I used Python 3.11.5 in this case, since 3.12 was released only a few days ago and could have compatibility issues with package support. I'll make sure to keep targeting the latest version for optimization and package support.


Implementation of absolute position (planned for next week):

To account for differences in body sizes and variations in the distances between body joints, it is imperative to normalize the posture vectors. The idea I have now is to normalize every person to my body size, which is around 5'10 and 170 lbs (to be justified next week with the other members of the group). This will be an add-on to the cosine comparison idea to determine users' absolute positions. Using absolute position eliminates the possibility that the user is doing a completely different posture yet scores a high similarity due to the nature of cosine similarity. The normalization process involves dividing the coordinates of keypoints by a reference length, typically the distance between two consistent points (the two shoulders, the ratio between upper and lower body, the ratio between calf and thigh). This scales all joints proportionally to the reference length, facilitating a relatively standard comparison.
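The reference-length normalization could look like the sketch below, using the shoulder-to-shoulder distance as the reference length; the keypoint indices are placeholders, not the real OpenPose mapping:

```python
import math

def normalize_pose(points, left_shoulder=2, right_shoulder=5):
    """Scale all (x, y) keypoints by the shoulder-to-shoulder distance so
    that bodies of different sizes map onto a comparable scale."""
    (x1, y1), (x2, y2) = points[left_shoulder], points[right_shoulder]
    ref = math.hypot(x2 - x1, y2 - y1)
    return [(x / ref, y / ref) for x, y in points]
```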


Shiheng’s Status Report for 09/30/2023

During the week of 9/30, my primary focus was on researching and understanding critical design tradeoffs related to our project. This entailed two key aspects: evaluating text-to-speech (TTS) options within the Python development environment and gaining insights into the implementation of cosine similarity-based posture analysis. Each option had its unique set of pros and cons, with considerations such as internet accessibility, voice quality, and language support. Furthermore, I delved into the idea of cosine similarity and its application in posture analysis, with a keen eye on setting an appropriate similarity threshold. These endeavors paved the way for informed design decisions in the upcoming phases of our project.


In the realm of Python, I examined three TTS solutions: the gTTS API, pyttsx3, and Mozilla TTS. The gTTS API offers flexibility in preprocessing and tokenizing text, supporting multiple languages with customizable accents; however, it requires internet access due to its API nature. Conversely, pyttsx3 provides greater customization but lacks the naturalness of gTTS. Mozilla TTS, while high-quality and offline-capable, requires research into voice training and manual selection of a voice engine. These assessments have given us a comprehensive understanding of TTS tools, and I determined that Mozilla TTS is the best option among them. I also made backup plans for the C++ case and found TTS engines that fit that approach.

In parallel, I delved into the mathematical underpinnings of cosine similarity for posture analysis. It offers scale invariance and angular similarity comparison, making it apt for body pose analysis. The critical decision to set a similarity threshold, possibly ranging from 80% to 90%, emerged as a key design consideration. This threshold will be pivotal in assessing whether two postures are sufficiently similar. By thoroughly understanding these design tradeoffs, we are better equipped to make informed choices in developing our posture analysis system, balancing accuracy and flexibility to accommodate varying body sizes and orientations.
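The threshold decision itself reduces to a one-line predicate; the 0.85 default below is simply the midpoint of the 80-90% band under discussion, not a final choice:

```python
def postures_match(similarity, threshold=0.85):
    """Decide whether a cosine-similarity score counts as a match."""
    return similarity >= threshold
```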

The comprehensive evaluation of TTS tools has provided insights into the advantages and disadvantages of each option, enabling us to make an informed choice aligning with our project’s goals. These efforts represent significant progress toward the successful execution of our project, empowering us to make well-informed design decisions moving forward.

ABET: For cosine similarity, I drew the concept from 18-290 and 18-202, in terms of vector comparison and algebra.

From a Python coding perspective, I drew on previous CS courses I have taken and my personal experience with Python. I researched TTS packages this week by reading GitHub development docs and developer websites covering those concepts.

Shiheng’s Status Report for 09/23/2023

After an intense discussion with Professor Tamal during the Friday meeting, we discovered that the justification for our initial plan, in terms of cost and applicability, was not strong or viable. Understanding the difficulties we would face using OpenPose for patient behavior detection, I contributed to the change of topic to a Taiji instructor. Taking the yoga instructors available online as a reference, we decided to follow a similar approach but carry it out differently, since Taiji concentrates more on the flow of body motion than on static data points. We can control the cost more easily with a justified setup: a single camera monitoring the larger motions of Taiji through OpenPose and providing real-time evaluation of body posture. The original plan on patients, by contrast, ignored the class of patients who are bedridden and make continuous minor movements rather than the large motions healthy people would make.

Gathering and organizing data has been my main progress in the days since the project change. Using Ray's reference body positions, I identified some body positions that are clear enough for beginners to learn and can plausibly be identified by the OpenPose algorithm. I reckon these positions will be clearly identified by low-cost cameras and will meet the requirements of the Raspberry Pi we intend to use in the project.

Though we are behind schedule, we have advanced greatly in the data collection and classification work that hindered us on the previous project. I believe we can catch up in the following weeks and make strong progress on our project.