Ray’s Status Report for 09/30/2023

This week, a new member joined our team. We held two or three meetings to discuss the division of labor and the technical details of the project, and we made solid progress on our implementation design. For my part, I researched the API structure of OpenPose and the packages available for implementing our user interface.

The OpenPose API supports both Python and C++ for data transfer. After discussing with my teammates (specifically Jerry), we decided to use Python because it offers more functionality for implementing the user interface. The most time-consuming computation in our system happens inside OpenPose, while the API is only responsible for reading its outputs, so the choice of Python should not cost us much runtime efficiency.

The package I plan to use for the user interface is Tkinter. From my research, it has great potential as the tool for implementing our system: it supports graphics and offers an extensive set of GUI elements. (Below is the tutorial I’m watching.)

I was briefly taught how OpenPose works in Computer Graphics (15-462) and had the chance to read about it in the past. However, I have not integrated the model into a full system before, so this past week I read about OpenPose’s API usage on its GitHub page. The official OpenPose repository has some helpful tutorials to refer to. I read through some of them but have not yet had the chance to run them. I plan to install and build the OpenPose project on my laptop this weekend.

I am back on schedule this week. My plan for next week is to get familiar with the Tkinter package and the OpenPose API so that I can start creating a prototype UI for our system. I look forward to seeing how it turns out!

 

Team Status Report for 09/30/2023

Week of 09/30/2023: A week of Research and Findings

A new team member, Jerry Feng, joined us after our new proposal was approved on Monday. Our main focuses this week were researching different parts of the project and reintroducing the project to all group members.

During the Monday and Wednesday meetups, Shiheng, Ray, and Eric (Hongzhe) gave Jerry a proper introduction to our Taichi project, discussed its background, and explained why we chose it. In addition, Ray and Eric discussed the OpenPose algorithm and the existing pipeline from our original plan with Jerry in depth, while Shiheng focused on the comparison algorithm and on how the voiceover should be implemented in parallel with the development of the posture pipelines. While meeting with Professor Bryon and Eshita, we brainstormed how Jerry could integrate into our existing framework and decided that having him develop an alternative pipeline that allows customization would best let Jerry and the rest of the team work in parallel.

After transitioning smoothly into our new workflow, we spent most of our time researching our own parts of the project. We ran into various compatibility issues when setting up environments on our own laptops, but fortunately they were all resolved by the end of the week through team discussions and online research, and we all now have a good understanding of how to implement the project. Everyone on the team has done solid research to narrow down the ideal programming language, packages, and algorithms, and each choice can be justified on grounds including, but not limited to, compatibility, offline support, efficiency, and ease of use.

For the rest of the week, we spent most of our time working on the design review slides and quantitatively replanning our project around the new pipeline that Jerry owns. We refined our proposal slides to include more quantitative performance metrics, specified measures for dealing with pipeline failures, and brainstormed test cases for future verification. Additionally, we created a new Gantt chart that includes the current work and reorganizes tasks, since Ray and Jerry now have some overlapping work they can collaborate on during the semester.

The Gantt chart is attached below to show our progress:

Shiheng’s Status Report for 09/30/2023

During the week of 9/30, my primary focus was on researching and understanding critical design tradeoffs for our project. This involved two key tasks: evaluating text-to-speech (TTS) options within the Python development environment and learning how cosine similarity-based posture analysis can be implemented. Each TTS option had its own pros and cons, with considerations such as internet accessibility, voice quality, and language support. I also studied cosine similarity and its application to posture analysis, paying particular attention to setting an appropriate similarity threshold. This work lays the groundwork for informed design decisions in the upcoming phases of our project.

 

In Python, I examined three TTS solutions: the gTTS API, pyttsx3, and Mozilla TTS. The gTTS API offers flexibility in preprocessing and tokenizing text and supports multiple languages with customizable accents. However, it requires internet access because it relies on an online API. Conversely, pyttsx3 provides greater customization options but lacks the naturalness of gTTS. Mozilla TTS, while high-quality and offline-capable, requires research into voice training and a manual choice of voice engine. After these assessments, I determined that Mozilla TTS is the best option of the three. I also made backup plans in case we use C++ and found TTS engines that fit that approach.

In parallel, I studied the mathematical underpinnings of cosine similarity for posture analysis. It is scale-invariant and compares vectors by angle, which makes it well suited to body pose analysis. Setting a similarity threshold, possibly in the range of 80% to 90%, emerged as a key design decision, since this threshold determines whether two postures count as sufficiently similar. Understanding these tradeoffs leaves us better equipped to make informed choices in developing our posture analysis system, balancing accuracy and flexibility to accommodate varying body sizes and orientations.
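As a rough illustration of the idea above, here is a minimal sketch of comparing one limb between a user pose and a reference pose with cosine similarity. The helper names (`limb_vector`, `limbs_match`), the sample joint coordinates, and the 0.85 threshold (picked from the 80% to 90% range under consideration) are assumptions for illustration only, not our final design.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def limb_vector(joint_a, joint_b):
    """Vector pointing from joint_a to joint_b, each given as (x, y)."""
    return (joint_b[0] - joint_a[0], joint_b[1] - joint_a[1])


def limbs_match(user_limb, reference_limb, threshold=0.85):
    """True if the two limbs point in sufficiently similar directions.

    The default threshold is a hypothetical value, not a tuned one.
    """
    return cosine_similarity(user_limb, reference_limb) >= threshold


# Hypothetical joint coordinates in pixels (shoulder -> elbow).
reference_arm = limb_vector((100, 100), (150, 100))    # horizontal arm
taller_user_arm = limb_vector((200, 200), (300, 200))  # same direction, twice as long
raised_arm = limb_vector((200, 200), (200, 100))       # perpendicular to the reference

same_direction_score = cosine_similarity(reference_arm, taller_user_arm)
perpendicular_score = cosine_similarity(reference_arm, raised_arm)
```

Because the score depends only on the angle between the vectors, the longer arm of a taller user still scores as a match, which is the scale-invariance property mentioned above.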

The comprehensive evaluation of TTS tools has provided insights into the advantages and disadvantages of each option, enabling us to make an informed choice aligning with our project’s goals. These efforts represent significant progress toward the successful execution of our project, empowering us to make well-informed design decisions moving forward.

ABET: For cosine similarity, I drew on the concepts of vector comparison and algebra from 18-290 and 18-202.

For the Python coding, I drew on previous CS courses and my personal experience with Python. I researched TTS packages this week by reading their GitHub developer docs and developer websites.

Jerry’s Status Report for 09/30/2023

1. During this week, I was only integrated into the team on Wednesday (9/27), so I spent Wednesday discussing with everyone how I would fit into the team and reworking the division of labor within the team. I also researched my team’s options for file storage and decided we would go with a simple directory system and use JSON files to store the user’s pose coordinates and the reference pose coordinates, for ease of interfacing with OpenPose, which already uses a JSON format in its APIs. I also looked into running OpenPose on my local machine and found that I would have to run it on my CPU rather than my graphics card, since my graphics card unfortunately does not support CUDA. Additionally, I made an official system diagram for our presentation and contributed to the design presentation slides.

2. I believe that our progress on the project is on schedule, as I am in a good position to finish the design presentation slides today and then move forward with running Openpose and getting into the meat of the project.

3. Next week, I plan to get OpenPose up and running on my computer. I also plan to help write the design report and to coordinate with Ray on some of the specifics of the UI design so that users can easily upload and access custom image sequences for their own training purposes.

4. Using JSON files and working with APIs is something I learned over the summer at my internship, and our idea of breaking the project into modules for ease of organization and implementation was heavily emphasized to me in Structure and Design of Digital Systems (18-240).
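The JSON-based storage described in item 1 could be sketched roughly as follows. This assumes the layout OpenPose produces when writing JSON output, where each detected person carries a flat `pose_keypoints_2d` list of x, y, confidence values. The directory name `reference`, the pose name `opening_form`, the sample coordinates, and the helper functions are all hypothetical placeholders rather than our actual file layout.

```python
import json
import tempfile
from pathlib import Path


def extract_keypoints(openpose_frame):
    """Group the first person's flat [x, y, c, x, y, c, ...] list into (x, y, c) tuples."""
    people = openpose_frame.get("people", [])
    if not people:
        return []  # no person detected in this frame
    flat = people[0]["pose_keypoints_2d"]
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def save_pose(directory, pose_name, keypoints):
    """Write one pose to <directory>/<pose_name>.json and return the file path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{pose_name}.json"
    with path.open("w", encoding="utf-8") as f:
        json.dump({"pose": pose_name, "keypoints": keypoints}, f)
    return path


def load_pose(path):
    """Read a pose file back into a list of (x, y, c) tuples."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    return [tuple(k) for k in data["keypoints"]]


# Made-up frame with only two keypoints, to keep the example short.
sample_frame = {
    "people": [
        {"pose_keypoints_2d": [320.5, 110.0, 0.93, 318.0, 180.2, 0.88]}
    ]
}

keypoints = extract_keypoints(sample_frame)
with tempfile.TemporaryDirectory() as root:
    saved_path = save_pose(Path(root) / "reference", "opening_form", keypoints)
    restored = load_pose(saved_path)
```

Keeping user poses and reference poses in separate subdirectories with the same file format would let the comparison step load both through the same `load_pose` helper.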

Hongzhe’s Status Report for 09/23/2023

For this week, we had a major change in the project plan, and here is the work I did. 

For the previous project on patient monitoring, I interviewed some doctors and nurses for background information to support the usability of the project. The interviews gathered symptoms, especially behavioral ones such as chest pain, vomiting, increased breathing rate, and seizures. The interviews with nursing assistants focused more on the medical care system in hospitals and nursing homes, attempting to show that the usual alarm system is not quick enough given the multiple layers of notification on the way to doctors. Please note that the interviews were conducted with workers in the Chinese medical system rather than the US one, so there may be discrepancies with how US hospitals work.

After extensive communication with the course staff, we decided to change the project to a Taiji instruction application based on OpenPose, since this idea is more solid and comes with more accessible online resources and static gesture references.

My team and I are indeed slightly behind schedule since we switched projects, but we will catch up in the coming weeks by accelerating the schedule. At the same time, since the Taiji context has more gesture resources online, we can skip the raw data collection process we had planned before, giving us more time to work on the technical portion.

Next week, I hope to successfully run OpenPose on my own machine so that we can start using it to process some Taiji gestures.

Team Status Report for 09/23/2023

Our team experienced drastic changes this week, as we changed our project theme completely from a hospital setting to a posture detector and instructor for traditional Chinese Taiji. After discussions with course staff over the past few days, we recognized some usability and cost issues that were not apparent to us initially. These issues cost us substantial time and effort in trying to justify our work. Thus, by the end of the week, we decided to switch gears and move on to our current topic, the Taiji Pose Recognition Project: Taichine.

We adopted the Taiji instructor idea because it is not a major change from our original idea. Most of the hardware and software will stay the same, but the overall Taiji setting is much more manageable for a semester-long project. We still plan to use the camera-RasPi-backend setup as before, and the hardware should be capable of detecting, classifying, and evaluating body posture in real time, since Taiji pose recognition has a higher tolerance for recognition speed. More online resources will be available to us, and fewer privacy concerns should arise. We will spend less time monitoring personal data and instead focus on specific actions and routines so that we can offer clear evaluations in short periods of time.

It has been a week of transitions and chaos, but we successfully identified valuable online resources on Taiji postures and their recognition, which we could apply as training sets for our OpenPose algorithm. Next week, we will focus on finalizing our hardware/software quantifications and formulating an optimal design for our system. If time allows, we plan to start data collection for our machine learning process. We will make the necessary changes to our schedule and make sure we do not fall behind in the upcoming weeks while working on the design review and training our models.

Shiheng’s Status Report for 09/23/2023

After an intense discussion with Professor Tamal during the Friday meeting, we realized that the justification for our initial plan was not strong enough in terms of cost and applicability. Understanding the difficulties we would face using OpenPose for patient behavior detection, I contributed to the change of topic to a Taiji instructor. With reference to the yoga instructors available online, we decided to follow a similar approach but carry it out differently, since Taiji concentrates more on the flow of body motion than on static data points. We can keep costs under control by using OpenPose with a single camera to monitor the larger motions of Taiji and provide real-time feedback and evaluation of body posture. Our original plan for patients, by contrast, overlooked bedridden patients, who make continuous minor movements rather than the large motions typical of healthy people.

Gathering and organizing data was my main progress over the past few days after the project change. Using Ray’s reference body positions, I identified some positions that are simple enough for beginners to learn and could plausibly be identified by the OpenPose algorithm. I expect these positions to be clearly identifiable with low-cost cameras and to meet the requirements of the Raspberry Pi we intend to use in the project.

Though we are behind schedule, we have advanced greatly in data collection and classification, which had hindered us on the previous project. I believe we can catch up in the following weeks and make strong progress on our project.

Ray’s Status Report for 09/23/2023

This week, I presented our team’s original idea, “Patient Monitor System,” to the class. After the presentation, my teammates and I received feedback from the instructors and, upon reconsideration, we decided to switch to the more applicable idea of designing a Taiji training monitor system. To build on this new idea, I searched for relevant papers and research on pose recognition. I found and skimmed the paper “3D Human Pose Estimation on Taiji Sequence” [1], which is very relevant to our new project idea even though it targets a different goal. In particular, the part on converting 2D joints to 3D joints using a neural network is intriguing and worth looking into.

My progress, along with our team’s schedule, is a bit behind due to our change in plan. To catch up to our original schedule, we will spend the first few days of next week reworking our abstract to establish a reliable foundation for our new idea, and then try to finish the tasks originally planned for next week once we have a clear plan for the system we want to design.

In particular, I want to lay out how our system will recognize and evaluate Taiji poses. The poses we plan to work with come from the 24-form Taiji sequence, which has clear ground-truth poses for machine learning. We also have the idea of letting users input Taiji videos of their choice. We also want to find a good way for users to know how good their Taiji poses are; some possible solutions are an LED screen or a programmable LED unit. I also hope to make plenty of progress on data collection next week, since data collection for our current idea is considerably simpler than for our previous idea.

[1] https://etda.libraries.psu.edu/files/final_submissions/17625