Shengxi’s Status Report for March 8th

What did you personally accomplish this week on the project?
This week, I worked on integrating my reconstruction pipeline onto the Jetson and worked out the most efficient approach for the rendering pipeline. I also focused on motion tracking to ensure that the rendering stays aligned with the user’s face with minimal drift. Specifically, I refined the alignment of the 3D facial landmarks with the face model and calibrated the coordinate transformations using OpenCV and AprilTag. Additionally, I implemented real-time head motion tracking using PnP to estimate both rigid and non-rigid transformations, ensuring that AR filters remain correctly positioned.
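For the rigid part of this tracking, the pose estimate boils down to a single solvePnP call on corresponding model/image landmarks. The sketch below shows that call in isolation; the landmark containers, intrinsics, and the 6-point minimum check are illustrative assumptions rather than our exact pipeline code.

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

// Sketch: estimate the rigid head pose (rotation + translation) from corresponding
// 3D face-model landmarks and their 2D detections in the current RGB frame.
// Camera intrinsics/distortion come from the AprilTag/OpenCV calibration step.
bool estimateHeadPose(const std::vector<cv::Point3f>& modelLandmarks, // points on the 3D face model
                      const std::vector<cv::Point2f>& imageLandmarks, // detections in the current frame
                      const cv::Mat& cameraMatrix,                    // 3x3 intrinsic matrix
                      const cv::Mat& distCoeffs,                      // lens distortion coefficients
                      cv::Mat& rvec, cv::Mat& tvec)                   // output pose (Rodrigues rotation + translation)
{
    // The iterative solver needs at least 6 well-spread, non-coplanar correspondences.
    if (modelLandmarks.size() < 6 || modelLandmarks.size() != imageLandmarks.size())
        return false;
    // Reusing the previous frame's rvec/tvec as an initial guess keeps the estimate
    // stable between frames and reduces drift.
    return cv::solvePnP(modelLandmarks, imageLandmarks, cameraMatrix, distCoeffs,
                        rvec, tvec, /*useExtrinsicGuess=*/!rvec.empty(),
                        cv::SOLVEPNP_ITERATIVE);
}
```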

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
My progress is on schedule. The foundation for rendering and motion tracking is in place, and next week, I will move on to implementing the OpenGL rendering.

What deliverables do you hope to complete in the next week?
Next week, I plan to complete the OpenGL-based rendering pipeline. This includes implementing real-time texture blending using OpenGL shaders to seamlessly overlay AR effects onto the user’s face. Additionally, I will refine motion tracking by further improving the hybrid approach for rigid and non-rigid motion estimation, ensuring robustness against rapid movements and partial occlusions.
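As a starting point for that shader work, below is a minimal fragment-shader sketch (GLSL embedded as a C++ string) for alpha-blending a makeup texture over the camera frame; the uniform names and the single intensity control are placeholders for illustration, not the final shader design.

```cpp
// Minimal blending fragment shader, kept as a C++ raw string so it can be compiled
// at startup. Sampler/uniform names below are placeholders.
static const char* kBlendFragmentShader = R"GLSL(
#version 330 core
in vec2 vUV;                       // texture coordinates from the vertex shader
out vec4 fragColor;

uniform sampler2D uCameraFrame;    // live camera image
uniform sampler2D uMakeupTexture;  // RGBA makeup layer (alpha marks painted regions)
uniform float uBlendAmount;        // global filter intensity, 0..1

void main() {
    vec4 base   = texture(uCameraFrame, vUV);
    vec4 makeup = texture(uMakeupTexture, vUV);
    // Blend only where the makeup layer has coverage, scaled by the intensity control.
    float a = makeup.a * uBlendAmount;
    fragColor = vec4(mix(base.rgb, makeup.rgb, a), 1.0);
}
)GLSL";
```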

Team’s Status Report for March 8

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

One of the most significant risks is environment compatibility. Since each of us is currently coding independently, some members are working from the Mac ecosystem, and running our software on the Jetson may present unforeseen challenges such as driver conflicts or performance limitations.
Mitigation: We will rotate access to the Jetson among team members so that each module can be verified on it before full-system testing.

Another, more minor risk is performance bottlenecks: 3D face modeling and gesture recognition involve computationally expensive tasks, which may slow real-time performance.
Mitigation: We are each exploring different optimizations, such as SIMD vectorization, and evaluating the trade-off between accuracy and efficiency to get the best quality within the required frame-rate bounds.

One risk we already faced was uploading code to the Arduino. We had anticipated that buying the materials as instructed would be enough to start coding and uploading. However, the Arduino Pro Mini has no onboard USB interface and cannot be programmed directly, so with our leftover budget we bought two USB-to-serial adapters for around $9 so that we can upload the code.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

The overall block diagram remains unchanged; only a few details within the software implementation of the pipeline are still being tested (e.g., exactly which OpenCV technique we use in each module).

However, we will meet next week to go through our requirements again and make sure that the performance bottlenecks described above can be mitigated within our test requirements, or else loosen the requirements slightly to ensure a smooth user experience with the computing power we have.

Updates & Schedule change

So far we are on track with the schedule. Some changes have been made: UI development has been pulled forward since the hardware parts have not arrived yet. In terms of progress, we are confident that we will reach our system integration deadline in time. We will also synchronize between modules weekly to prevent any delays in final integration.

Anna’s Status Report for March 8

 

  • What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).  

I am working on the setup for the UI, more specifically generating the build files and building the project (step 5, the last step: https://github.com/kevidgel/usar-mirror). So far, I was able to complete all the previous steps (steps 1-4) and verified that OpenPose (https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/installation/0_index.md#compiling-and-running-openpose-from-source) runs successfully, as shown in the image below.

Right now, I am having trouble with the CMakeLists.txt configuration, which assumes CUDA is installed. I have confirmed with Steven that we will not be using CUDA, so I will ask him how to build without it, since the build still fails without CUDA even after silencing the flags.

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

I am slightly behind in that I still need to solder my PCB with the parts I received the day before spring break, and I am still waiting on my USB-to-serial adapter to upload my code to the Arduino. I am also a little behind on setting up the UI, as I plan to work on the UI after assembling the camera rig; in parallel, I will write and test the Arduino code.

  • What deliverables do you hope to complete in the next week?

I hope to at least build my camera rig and finish setting up my environment so that I can get started on the UI. Then I plan on writing and testing my Arduino code and integrating it with the gesture recognition; a rough sketch of that motor-control code is included below.
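As a placeholder for that Arduino work, here is one possible shape for the motor-control sketch: it reads single-character commands over serial (eventually sent by the gesture-recognition module) and steps the camera motor accordingly. The pin numbers, step counts, delays, and command protocol are all assumptions for illustration, not the final design.

```cpp
// Hypothetical Arduino sketch for the camera rig: step the motor left/right based on
// single-character serial commands. Pins, step size, and timing are placeholders.
const int STEP_PIN = 3;
const int DIR_PIN  = 4;
const int STEPS_PER_COMMAND = 50;
const int STEP_DELAY_US = 800;   // controls speed; tune to keep motion smooth

void stepMotor(bool clockwise, int steps) {
  digitalWrite(DIR_PIN, clockwise ? HIGH : LOW);
  for (int i = 0; i < steps; i++) {
    digitalWrite(STEP_PIN, HIGH);
    delayMicroseconds(STEP_DELAY_US);
    digitalWrite(STEP_PIN, LOW);
    delayMicroseconds(STEP_DELAY_US);
  }
}

void setup() {
  pinMode(STEP_PIN, OUTPUT);
  pinMode(DIR_PIN, OUTPUT);
  Serial.begin(9600);            // commands will arrive from the gesture-recognition side
}

void loop() {
  if (Serial.available() > 0) {
    char cmd = Serial.read();
    if (cmd == 'L') stepMotor(false, STEPS_PER_COMMAND);  // pan camera left
    if (cmd == 'R') stepMotor(true,  STEPS_PER_COMMAND);  // pan camera right
  }
}
```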

Steven’s Status Report for March 8

What did you personally accomplish this week on the project?

I worked on integrating the C++ API for OpenPose into our application and did some fine-tuning for performance and accuracy. Keypoints are now available in our application for eye tracking and gesture control. I also did some research on gesture recognition algorithms. I think a good starting point is detection based purely on keypoint velocity (e.g., the left hand moving quickly to the right).
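To make that idea concrete, here is a rough sketch of a velocity-based swipe check on a single keypoint; the keypoint struct, thresholds, and the two-frame comparison are illustrative assumptions rather than a finished algorithm.

```cpp
#include <cmath>

// Hypothetical 2D keypoint for one body part in one frame (e.g., the left wrist
// from OpenPose), with its detection confidence.
struct Keypoint {
    float x, y, confidence;
};

// Rough velocity-based detector: report a "swipe right" if the keypoint moved
// rightward fast enough between two consecutive frames. Thresholds are placeholders.
bool isSwipeRight(const Keypoint& prev, const Keypoint& curr, float dtSeconds,
                  float minConfidence = 0.3f, float minSpeedPxPerSec = 800.0f)
{
    if (prev.confidence < minConfidence || curr.confidence < minConfidence || dtSeconds <= 0.0f)
        return false;
    const float vx = (curr.x - prev.x) / dtSeconds;   // horizontal velocity, pixels/second
    const float vy = (curr.y - prev.y) / dtSeconds;
    // Require fast, mostly-horizontal motion to the right.
    return vx > minSpeedPxPerSec && std::fabs(vx) > 2.0f * std::fabs(vy);
}
```

In practice this would likely be smoothed or required to persist over several frames, but a per-frame velocity check is a reasonable first pass.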

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Roughly on schedule. I think that with OpenPose now integrated into the application, developing gesture control should be straightforward.

What deliverables do you hope to complete in the next week?

Complete the gesture control algorithm. I also have yet to compile the project on the Jetson, so that is on the list as well.

Anna’s Status Report for Feb 22

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).  

Files/photos: https://docs.google.com/document/d/1AsT0dXenHnLb7vWtu7ljc6i2_zY0NIXrijEcWs9SGDA/edit?usp=sharing

This week, I focused on setting up the user interface (UI) and preparing everything needed to start coding. I spent time looking at UIs from makeup and glasses apps like YouCam Makeup, which helped me get an idea of what the UI should look like. I also checked out some tutorials for Dear ImGui to understand how to implement the UI elements.

Steven shared the GitHub repo with the ImGui backend set up, so I just need to call the library functions in the code to create the UI elements. However, I’ve been having some trouble with generating build files and running the build process. Steven is helping me troubleshoot, and we’re hoping to get everything set up so I can start coding the UI on Monday.
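Once the build works, the UI code itself should be fairly small. Below is a rough sketch of what a per-frame Dear ImGui panel could look like, assuming it is called from the existing render loop between ImGui::NewFrame() and the render call; the window title, widget labels, filter options, and ranges are placeholders, not the final design.

```cpp
#include "imgui.h"

// Sketch of the per-frame UI: called once per render-loop iteration.
// Labels, filter options, and ranges below are placeholders.
void drawMirrorUI(int& selectedFilter, float& cameraAngleDeg)
{
    ImGui::Begin("UsAR Mirror Controls");

    // Filter menu: the selection is stored in selectedFilter and read by the rendering side.
    const char* filters[] = { "None", "Lipstick", "Blush", "Glasses" };
    ImGui::Combo("Filter", &selectedFilter, filters, IM_ARRAYSIZE(filters));

    // Camera angle: would eventually be forwarded to the motorized camera rig.
    ImGui::SliderFloat("Camera angle (deg)", &cameraAngleDeg, -45.0f, 45.0f);

    ImGui::End();
}
```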

Another part of the project I’m responsible for is the motorized camera control system. I ordered the parts last week, so I’m still waiting for them to arrive. Once I get the parts, I can start assembling and programming the system.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

I’m a little behind schedule due to the delays in receiving the parts for the motorized camera control system and the issues I’ve had with building the Dear ImGui project. That said, I’ve been working closely with Steven to resolve the build problems, and I expect to be able to move forward with coding the UI soon. To catch up, I’ll focus on fixing the build issue and getting everything set up so I can start coding the UI by next week. Once the camera control system parts arrive, I’ll focus on assembling and programming it, so I stay on track with both tasks.

What deliverables do you hope to complete in the next week?

I hope to begin assembling the motorized camera control system and start the initial programming once the parts arrive. I also hope to begin coding the UI elements (like the camera angle and filter menus) using Dear ImGui, starting with the basic UI elements and getting them integrated into the project. 

Team Status Report for Feb 22

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

One of the most significant risks is environment compatibility. Since each of us is currently coding independently, some members are working from the Mac ecosystem, and running our software on the Jetson may present unforeseen challenges such as driver conflicts or performance limitations.
Mitigation: We will rotate access to the Jetson among team members so that each module can be verified on it before full-system testing.

Another, more minor risk is performance bottlenecks: 3D face modeling and gesture recognition involve computationally expensive tasks, which may slow real-time performance.
Mitigation: We are each exploring different optimizations, such as SIMD vectorization, and evaluating the trade-off between accuracy and efficiency to get the best quality within the required frame-rate bounds.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

The overall block diagram remains unchanged; only a few details within the software implementation of the pipeline are still being tested (e.g., exactly which OpenCV technique we use in each module).

However, we will meet next week to go through our requirements again and make sure that the performance bottlenecks described above can be mitigated within our test requirements, or else loosen the requirements slightly to ensure a smooth user experience with the computing power we have.

Updates & Schedule change

So far we are on track with the schedule. Some changes have been made: UI development has been pulled forward since the hardware parts have not arrived yet. In terms of progress, we are confident that we will reach our system integration deadline in time. We will also synchronize between modules weekly to prevent any delays in final integration.

Steven’s Status Report for Feb 22

What did you personally accomplish this week on the project?

I worked on the code for eye tracking and gesture recognition. I managed to build the C++ API for OpenPose and integrate it into our project, so we can use the cameras to track facial keypoints and body keypoints (such as arms and hands). I have also started working on the software foundation for our project, creating the main render loop for our application.
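For reference, the per-frame keypoint extraction follows the shape of OpenPose's asynchronous C++ tutorial examples, roughly as sketched below; the snippet assumes an already configured and started op::Wrapper, and the conversion macro and configuration details may differ from what we end up using.

```cpp
#include <openpose/headers.hpp>
#include <opencv2/core.hpp>

// Sketch of per-frame keypoint extraction in the style of OpenPose's asynchronous
// C++ tutorial API. Assumes opWrapper has already been configured and started.
op::Array<float> extractBodyKeypoints(op::Wrapper& opWrapper, const cv::Mat& frame)
{
    const op::Matrix input = OP_CV2OPCONSTMAT(frame);   // wrap the cv::Mat for OpenPose
    auto processed = opWrapper.emplaceAndPop(input);     // run the OpenPose pipeline on this frame
    if (processed != nullptr && !processed->empty())
        return processed->at(0)->poseKeypoints;          // indexed as [person][part][x, y, score]
    return op::Array<float>{};                           // no detection this frame
}
```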

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

I am a little behind. I will use some slack time over spring break to work on the gesture algorithm. Facial keypoint tracking and body tracking are essentially done, as they are provided by the C++ API.

What deliverables do you hope to complete in the next week?

(1) Get the project running on the Jetson (as opposed to my laptop) to make sure everything works with the hardware. (2) Research gesture recognition algorithms using body keypoints, trying to find libraries that do this or coming up with an algorithm of my own.

Shengxi’s Status Report for Feb 22nd

What did you personally accomplish this week on the project?
This week, I implemented a pipeline for generating a 3D face model from RGB and depth images. The input per frame is an RGB image and a depth image; I use dlib to detect facial landmarks in the RGB image and project them onto the depth image. I then convert the corresponding depth pixels to 3D world coordinates to extract 3D face landmarks.
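The depth-pixel-to-3D step is the standard pinhole back-projection; a minimal sketch with OpenCV types is below, assuming a 16-bit depth image in millimeters that is already registered to the RGB frame (so the same pixel indexes both images). The function name and units are illustrative.

```cpp
#include <opencv2/core.hpp>
#include <cstdint>

// Back-project pixel (u, v) with its depth value into camera-space 3D coordinates
// using pinhole intrinsics (fx, fy, cx, cy). Assumes a 16-bit depth image in
// millimeters, registered to the RGB frame.
cv::Point3f depthPixelTo3D(int u, int v, const cv::Mat& depthMm,
                           float fx, float fy, float cx, float cy)
{
    const float z = depthMm.at<std::uint16_t>(v, u) * 0.001f;  // millimeters -> meters
    if (z <= 0.0f)
        return {0.0f, 0.0f, 0.0f};                             // missing/invalid depth
    const float x = (u - cx) * z / fx;
    const float y = (v - cy) * z / fy;
    return {x, y, z};
}
```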

In parallel (this takes a little more time to process), I am also able to transform the entire depth image into a full point cloud and integrate it with the 3D face landmarks to form a smooth surface for the 3D face model if more landmarks need to be mapped.

This is a display of the reconstructed 3D face point cloud along with detected landmarks in 3D.

In terms of compute, extracting only a constant number of landmarks (14 in this case) takes around 1-2 seconds. This can be accelerated by using lower-resolution RGB and depth images, and I want to run an accuracy study on that trade-off once the rendering section is implemented. I have also been experimenting with SIMD to accelerate the calculation here, which I will continue next week.
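The SIMD experiment targets exactly this back-projection loop: several pixels can be converted per instruction. Since the Jetson's CPU cores are ARM, one plausible shape is a NEON version of the inner loop, as sketched below; the buffer layout, the precomputed 1/fx, and the omitted validity/tail handling are illustrative assumptions, not the code I am currently running.

```cpp
#include <arm_neon.h>   // ARM NEON intrinsics (the Jetson's CPU cores are ARM)

// Illustrative SIMD sketch: compute the X coordinate for 4 depth pixels at a time
// along one image row. Inputs are pre-converted float buffers (depth in meters and
// pixel u-coordinates); depth-validity masking and the leftover tail are omitted.
void backprojectRowX(const float* z, const float* u, float* outX,
                     int count, float cx, float invFx)
{
    const float32x4_t vcx    = vdupq_n_f32(cx);
    const float32x4_t vinvFx = vdupq_n_f32(invFx);
    for (int i = 0; i + 4 <= count; i += 4) {
        float32x4_t vu = vld1q_f32(u + i);
        float32x4_t vz = vld1q_f32(z + i);
        // x = (u - cx) * z / fx, with the division replaced by a precomputed 1/fx
        float32x4_t vx = vmulq_f32(vmulq_f32(vsubq_f32(vu, vcx), vz), vinvFx);
        vst1q_f32(outX + i, vx);
    }
}
```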

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I am on schedule, having completed the 3D modeling component on time. Before leaving for spring break, I also integrated my work with the Jetson and prepared for the next phase.

After returning from spring break, I will start to work on wrapping a texture map onto the 3D face model using captured landmarks.

What deliverables do you hope to complete in the next week?

Continue testing and refining for better visualization and performance.
By the next milestone, I aim to have a functional prototype with texture mapping implemented.

Shengxi’s Status Report for Feb 15th

What did you personally accomplish this week on the project?

 

From last week’s progress, I noticed that using Dynamic Fusion to iteratively find the most compact 3D reconstruction of the user is probably not the most efficient approach to the problem.

Since we are specifically concerned with the human face, rather than reconstructing a full volumetric model dynamically, we can leverage pre-defined facial priors and structured depth-based 3D face models for a more efficient solution.

To refine this approach, I explored alternative methods for 3D face modeling and rendering that use depth maps combined with facial priors rather than relying solely on iterative fusion. Additionally, I investigated texture blending techniques that adjust AR overlays based on lighting conditions, ensuring realistic makeup application from different angles.

Ultimately, this would allow me to simplify my 3D reconstruction process and target the output specifically to provide necessary information for texture blending with a pre-defined makeup texture map.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

So far, I have found a paper that maps 3D input from the Kinect depth sensor onto the Candide-3 model, which I am looking to implement with the RealSense camera in order to test the refined depth-based 3D face model.

I have started the implementation but have not yet finished it, so I am a little behind schedule, as I also had to work on the Design Presentation this week.

The texture blending work will be the focus of the next task after coming back from spring break: the AR Overlay Rendering Prototype.

What deliverables do you hope to complete in the next week?

Finalize the 3D face modeling framework using depth-based priors for improved performance. Hopefully, I will be able to get a 3D reconstruction of my own face once I receive all the parts I need (I am still waiting for the RealSense cable to arrive).

 

Team Status Report for Feb 15

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

One of the most significant risks is jitter from the motorized camera. This would degrade the overall experience, since users would not be able to see their side profiles and other parts of their face clearly. To mitigate this, we are implementing a PID control loop to ensure smooth motor movement and reduce vibrations, and we are testing different mounting and damping mechanisms to isolate vibrations from the motor assembly. A rough sketch of the PID update is included below.
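This is a minimal sketch of the PID update we have in mind; the gains, the feedback signal (e.g., commanded versus measured position), and how the output maps to a motor speed command are all still to be determined, so treat it as illustrative only.

```cpp
// Illustrative PID update for smoothing the camera motor's motion.
// Gains and the feedback source are placeholders to be tuned on the real rig.
struct PIDController {
    float kp, ki, kd;        // proportional, integral, derivative gains
    float integral  = 0.0f;  // accumulated error
    float prevError = 0.0f;  // error from the previous update

    // error = target position - measured position; dt = seconds since the last update.
    // Returns a control output (e.g., a motor speed command).
    float update(float error, float dt) {
        integral += error * dt;
        const float derivative = (dt > 0.0f) ? (error - prevError) / dt : 0.0f;
        prevError = error;
        return kp * error + ki * integral + kd * derivative;
    }
};
```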

Contingency plans include having a backup stepper motor with finer resolution and smoother torque, as well as a manual override mode for emergency situations.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

The design for the camera control system changed, and the new design is cheaper. We changed it due to the cost and complexity of the previous design. The one we are going with requires fewer 3D-printed parts, which cuts our cost roughly in half, and it will integrate well with the display. The change also simplifies assembly and reduces the overall weight of the system, improving portability.

The cost incurred is minimal, primarily for redesigning and reprinting certain parts. To mitigate these costs, we are using readily available components and not printing parts that we don’t need (like the stepper motor case).

Provide an updated schedule if changes have occurred.

We are behind schedule since we haven’t received the materials and equipment yet. Once we get the materials, we plan on catching up with the schedule by allocating more time for assembly and testing. We’ve also added buffer periods for unforeseen delays and assigned team members specific tasks to parallelize the work.

A: Public Health, Safety, or Welfare Considerations (written by Shengxi)

Our system prioritizes user well-being by incorporating touch-free interaction, eliminating the need for physical contact and reducing the spread of germs, particularly in shared or public spaces. By maintaining proper eye-level alignment, the system helps minimize eye strain and fatigue, preventing neck discomfort caused by prolonged unnatural viewing angles. Additionally, real-time AR makeup previews contribute to psychological well-being by boosting user confidence and reducing anxiety related to cosmetic choices. The ergonomic design further enhances comfort by accommodating various heights and seating positions, ensuring safe, strain-free interactions for all users.

B: Social Factor Considerations (written by Steven)

Because our product is a display with mirror-like capabilities, we aim to pay close attention to how it affects body image and self-perception. We plan to make the perspective transforms accurate and the image filters reasonable, so we don’t unintentionally reinforce unrealistic beauty norms or contribute to negative self-perception. This will be achieved through user testing and accuracy testing of our reconstruction algorithms. Also, one of the goals of this project is to keep the cost lower than competitors’ (enforced by our limited budget of ~$600) so that lower-income communities have access to this technology.

C: Economic Factors (written by Anna)

UsAR mirror provides a cost-efficient and scalable solution, as our mirror costs no more than $600. For production, the UsAR mirror has costs in hardware, software, and maintenance/updates. It uses affordable yet high-quality cameras such as the RealSense depth camera and webcams. The RealSense depth camera allows filters to be properly aligned to a 3D reconstruction of the face, maximizing the experience while minimizing the cost. The camera control system has an efficient yet simple design that doesn’t require many materials or incur much cost. As for the software, there’s no cost: it uses free, open-source libraries like OpenCV, Open3D, OpenGL, and OpenPose, and the Arduino code that controls the side-mounted webcams is developed at no cost.

For distribution, the mirror is lightweight and easy to handle and install. The mirror is a display that’s only 23.8 inches, so it is easy to carry and use, as well as easy to package and ship. For consumption, the UsAR mirror will be valuable to retailers, who can save money on sample products and the time customers spend trying on all kinds of glasses. Moreover, because customers are able to try on makeup and glasses efficiently, they are less likely to come back to return products, making the shopping experience and the business on the retail end more convenient. These days, customers are looking for a more personalized and convenient way of shopping, and the UsAR mirror addresses this demand.