Shengxi’s Status Report for April 12th

What did you personally accomplish this week on the project?
This week, I worked on integrating and rendering 3D objects—such as glasses—on top of a human face using OpenGL. I aligned the 3D model with facial landmarks and refined the rendering pipeline to ensure the virtual object appears naturally overlaid in real time. This required deeper integration with face tracking data and improvements to pose stability and object placement.
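
For reference, here is a minimal sketch (not the project's exact code) of how the glasses can be anchored to the tracked head pose: it assumes the tracker provides a rotation vector and translation (e.g., from cv2.solvePnP) and composes a 4x4 OpenGL model matrix; the bridge_offset parameter is a hypothetical adjustment for where the glasses mesh origin sits relative to the nose bridge.

```python
# Sketch: build an OpenGL model matrix for the glasses overlay from a head pose.
# Assumes rvec/tvec come from the face tracker (e.g., cv2.solvePnP) and that the
# glasses mesh is modeled with its bridge near the origin.
import numpy as np
import cv2

def glasses_model_matrix(rvec, tvec, bridge_offset=(0.0, 0.0, 0.0)):
    """Compose a 4x4 model matrix (returned ready for column-major upload)."""
    R, _ = cv2.Rodrigues(np.asarray(rvec, dtype=np.float64).reshape(3, 1))
    t = np.asarray(tvec, dtype=np.float64).reshape(3)

    M = np.eye(4)
    M[:3, :3] = R
    # Anchor the glasses at the tracked nose-bridge position, applying any
    # mesh-origin offset in the head's local frame.
    M[:3, 3] = t + R @ np.asarray(bridge_offset, dtype=np.float64)
    # OpenGL expects column-major data; transposing a row-major NumPy matrix
    # gives the right memory layout for e.g. glUniformMatrix4fv(loc, 1, GL_FALSE, M_gl).
    return M.T.astype(np.float32)
```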

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I’m on schedule. However, GPU performance and integration continue to be a challenge—especially on the Jetson platform, where driver and shader issues have impacted rendering stability. Given that this is a prototype, I’ll be making compromises on some graphical fidelity and fallback features to prioritize real-time responsiveness and ensure the demo remains functional and stable.

What deliverables do you hope to complete in the next week?
Next week, I plan to finalize the alignment and positioning of the glasses overlay under varying head poses. I will also test the robustness of the pipeline under different motion and lighting conditions. If time permits, I will begin generalizing the overlay system to support additional 3D elements (e.g., hats or earrings).

Verification and Validation Plan
Verification (individual, subsystem-level):
Pose accuracy: Comparing the rendered glasses position against known synthetic inputs or manually annotated frames to ensure consistent alignment with the eyes and nose bridge.
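
A minimal sketch of how this comparison could be scripted, assuming the projected overlay anchor points and the manual annotations are available as (N, 2) pixel arrays; the 5-pixel tolerance in the usage comment is illustrative, not a project requirement.

```python
# Sketch: mean pixel error between the overlay's projected anchor points
# (eye corners, nose bridge) and manually annotated ground-truth positions.
import numpy as np

def mean_alignment_error(projected_pts, annotated_pts):
    """Mean Euclidean pixel error between projected and annotated anchors."""
    p = np.asarray(projected_pts, dtype=np.float64)  # shape (N, 2)
    a = np.asarray(annotated_pts, dtype=np.float64)  # shape (N, 2)
    return float(np.mean(np.linalg.norm(p - a, axis=1)))

# Example usage (threshold is illustrative):
# errors = [mean_alignment_error(proj[i], gt[i]) for i in range(num_frames)]
# failures = [i for i, e in enumerate(errors) if e > 5.0]  # >5 px misalignment
```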

Latency: Measuring rendering latency from input frame to final render output using timestamped profiling to ensure the pipeline meets real-time constraints (<100ms end-to-end).
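
A minimal timestamped-profiling sketch, assuming a single process_frame callable stands in for the capture, tracking, and render stages; the 95th-percentile check in the comment is just one way to evaluate the <100ms budget.

```python
# Sketch: per-frame latency measurement from input frame to final render output.
import time

def timed_frame(process_frame, frame):
    t0 = time.perf_counter()
    result = process_frame(frame)          # input frame -> final rendered output
    latency_ms = (time.perf_counter() - t0) * 1000.0
    return result, latency_ms

# Collect latencies over a test run and check the real-time budget, e.g.:
# latencies.sort()
# assert latencies[int(0.95 * len(latencies))] < 100.0  # 95th percentile under 100 ms
```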

Robustness: Testing the system under varied conditions—such as rapid head movement, occlusions, and lighting changes—to ensure the overlay does not jitter or drift significantly.
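
A minimal sketch of one possible jitter metric, assuming the overlay's screen-space anchor point is logged per frame while the head is held still; the anchor definition and any pass/fail threshold are assumptions.

```python
# Sketch: quantify overlay jitter as the spread of frame-to-frame anchor movement.
import numpy as np

def jitter_px(anchor_positions):
    """Std dev of frame-to-frame displacement of the overlay anchor, in pixels."""
    p = np.asarray(anchor_positions, dtype=np.float64)   # shape (T, 2)
    deltas = np.linalg.norm(np.diff(p, axis=0), axis=1)  # per-frame movement
    return float(np.std(deltas))
```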

Shengxi’s Status Report for March 29th

What did you personally accomplish this week on the project?
This week, I focused on preparing a working prototype for the interim demo. I finalized the integration of motion tracking data with the OpenGL rendering pipeline on the Jetson Orin Nano, which now supports stable AR overlays in real time. I implemented basic camera motion smoothing to reduce jitter and improve alignment between the virtual and real-world content. On the performance side, I began profiling GPU usage under different scene conditions, identifying a few bottlenecks in the current shader code.
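
As a rough illustration of the smoothing idea (the actual filter in the pipeline may differ), here is an exponential-moving-average sketch over the tracked pose vector; alpha is a tunable placeholder.

```python
# Sketch: exponential moving average to damp per-frame jitter in the tracked pose.
import numpy as np

class PoseSmoother:
    def __init__(self, alpha=0.3):
        self.alpha = alpha      # lower alpha = smoother but laggier response
        self.state = None

    def update(self, pose_vec):
        pose_vec = np.asarray(pose_vec, dtype=np.float64)
        if self.state is None:
            self.state = pose_vec
        else:
            self.state = self.alpha * pose_vec + (1.0 - self.alpha) * self.state
        return self.state

# smoother = PoseSmoother(alpha=0.3)
# smoothed_t = smoother.update(tvec)   # feed the raw per-frame translation
```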

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I’m on schedule for the interim demo, though there’s still polishing left. Some experimental features like dynamic resolution scaling are still pending, but the core functionality for the demo is working. To stay on track, I’m prioritizing stability and responsiveness, and will continue optimizing shader code and system performance after the demo.

What deliverables do you hope to complete in the next week?
Next week, I plan to polish the demo for the presentation, focusing on smoother motion tracking, more consistent AR overlays, and better GPU performance. I also want to finish porting over the refined blending strategies from macOS to the Jetson, and begin experimenting with fallback rendering techniques to handle lower-performance scenarios gracefully.

Shengxi’s Status Report for March 22nd

What did you personally accomplish this week on the project?
This week, I continued working on the OpenGL rendering pipeline for the Jetson Orin Nano. After resolving the environment setup issues from last week, I successfully got real-time shader compilation and execution working on the device. I debugged and finalized the initial implementation of real-time texture blending using OpenGL shaders, and now the system renders correctly with basic blending. I also started integrating the rendering pipeline with motion tracking data and ran performance tests to evaluate shader throughput. On my Mac, I continued experimenting with advanced blending strategies for more realistic AR overlays, which I plan to port over to the Jetson once stable.
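
The blending itself runs in GLSL on the Jetson, but as a reference for the per-pixel math, here is a NumPy sketch of straight alpha compositing that can be used for offline comparison; the blend used in the actual shaders may be more involved.

```python
# Sketch: CPU reference for the alpha-compositing math the blending shader performs.
import numpy as np

def alpha_blend(camera_rgb, overlay_rgba):
    """Composite an RGBA overlay (rendered AR content) onto the camera frame."""
    cam = camera_rgb.astype(np.float32) / 255.0                 # H x W x 3
    over = overlay_rgba[..., :3].astype(np.float32) / 255.0     # H x W x 3
    alpha = overlay_rgba[..., 3:4].astype(np.float32) / 255.0   # H x W x 1
    out = alpha * over + (1.0 - alpha) * cam                    # same formula as the shader
    return (out * 255.0).astype(np.uint8)
```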

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I’m now mostly back on schedule. Getting past the earlier OpenGL setup challenges unblocked a lot of progress this week. Some shader optimization tasks may still spill into next week; I did not manage to implement everything I had planned due to other coursework, but I am optimistic about getting fully back on track. To stay on track, I will continue to test and iterate quickly on the Jetson, focusing on resource-efficient rendering and responsive performance.

What deliverables do you hope to complete in the next week?
Next week, I plan to complete integration of motion tracking with the rendering pipeline and refine the AR overlay stability. I’ll also work on final shader optimizations and begin profiling GPU usage to ensure low-latency rendering on the Jetson. If time permits, I’ll explore dynamic resolution scaling or other adaptive techniques to maintain performance under varying scene complexity.

Shengxi’s Status Report for March 15th

What did you personally accomplish this week on the project?
This week, I worked on integrating OpenGL rendering on the Jetson Orin Nano. A significant portion of my time was spent solving environment setup issues, including driver compatibility, OpenGL extensions, and ensuring shader compilation worked correctly on the platform. I also began implementing real-time texture blending using OpenGL shaders, but I am still debugging that part. Additionally, I continued researching and experimenting with different ways to handle AR overlay rendering efficiently (on my Mac).

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I am slightly behind schedule due to the unexpected challenges with the OpenGL setup on the Jetson Orin Nano; debugging and configuring the environment took longer than expected. However, now that this is out of the way, I expect to catch up next week by focusing on implementing and optimizing shader performance and refining the rendering pipeline (this may also carry over into the following week, which is near the end of my allotted work for this component).

What deliverables do you hope to complete in the next week?
Next week, I plan to finalize the OpenGL rendering pipeline by optimizing shader execution and ensuring smooth integration with motion tracking. I will also work on improving texture blending techniques to enhance AR overlay realism. Additionally, I aim to implement an efficient resource management strategy to handle real-time rendering on the Jetson without excessive latency.

Shengxi’s Status Report for March 8th

What did you personally accomplish this week on the project?
This week, I worked on integrating my reconstruction pipeline onto the Jetson and on framing the most efficient approach for the rendering pipeline. I also focused on motion tracking to ensure that the rendering stays aligned with the user’s face with minimal drift. Specifically, I refined the alignment of 3D facial landmarks with the face model and calibrated the coordinate transformations using OpenCV and AprilTag. Additionally, I implemented real-time head motion tracking using PnP to estimate both rigid and non-rigid transformations, ensuring that AR filters remain correctly positioned.
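
A minimal sketch of the rigid part of this step, assuming 2D landmark detections, their corresponding 3D positions on the face model, and known camera intrinsics; the function and variable names are placeholders rather than the project's actual code.

```python
# Sketch: estimate a rigid 6DoF head pose from 2D-3D landmark correspondences
# using OpenCV's solvePnP.
import numpy as np
import cv2

def estimate_head_pose(model_pts_3d, image_pts_2d, camera_matrix, dist_coeffs=None):
    """Return (rvec, tvec) mapping face-model coordinates into the camera frame."""
    if dist_coeffs is None:
        dist_coeffs = np.zeros((4, 1))
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(model_pts_3d, dtype=np.float64),
        np.asarray(image_pts_2d, dtype=np.float64),
        np.asarray(camera_matrix, dtype=np.float64),
        dist_coeffs,
        flags=cv2.SOLVEPNP_ITERATIVE,
    )
    if not ok:
        raise RuntimeError("PnP failed for this frame")
    return rvec, tvec
```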

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
My progress is on schedule. The foundation for rendering and motion tracking is in place, and next week, I will move on to implementing the OpenGL rendering.

What deliverables do you hope to complete in the next week?
Next week, I plan to complete the OpenGL-based rendering pipeline. This includes implementing real-time texture blending using OpenGL shaders to seamlessly overlay AR effects onto the user’s face. Additionally, I will refine motion tracking by further improving the hybrid approach for rigid and non-rigid motion estimation, ensuring robustness against rapid movements and partial occlusions.

Team Status Report for Feb 22nd

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

One of the most significant risks is environment compatibility. Each of us is currently coding independently, and some members are working in the Mac ecosystem, so running our software on the Jetson may present unforeseen challenges such as driver conflicts or performance limitations.
Mitigation: We will rotate access to the Jetson among team members to ensure smooth integration before full-system testing.

Another, more minor risk is performance bottlenecks: 3D face modeling and gesture recognition involve computationally expensive tasks, which may slow real-time performance.
Mitigation: We are each trying different techniques to optimize computation, such as SIMD, and evaluating trade-offs between accuracy and efficiency to ensure the best performance within the required frame-rate bounds.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

The overall block diagram remains unchanged, and a few details within the software implementation of the pipeline are still being tested (e.g., which exact OpenCV technique we use in each module).

However, we will meet next week to go through our requirements again and make sure that the previously described performance bottlenecks can be mitigated within our test requirements, or to loosen the requirements slightly to ensure a smooth user experience with the computing power we have.

Updates & Schedule change

So far we are on schedule. Some changes have been made: UI development has been pulled forward since the hardware parts have not arrived yet. In terms of progress, we are confident that we will reach our system integration deadline in time. We will also ensure weekly synchronization between modules to prevent any delays in final integration.

Shengxi’s Status Report for Feb 22nd

What did you personally accomplish this week on the project?
This week, I implemented a pipeline for generating a 3D face model from RGB and depth images. The per-frame input is an RGB image and a depth image; I use dlib to detect facial landmarks in the RGB image and project them onto the depth image, then convert the corresponding depth pixels to 3D world coordinates to extract 3D face landmarks.
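
A minimal sketch of this step, assuming aligned RGB/depth frames and pinhole intrinsics (fx, fy, cx, cy); it uses the standard 68-point dlib predictor for illustration (my pipeline currently uses a smaller subset of landmarks), and the model path is a placeholder.

```python
# Sketch: detect 2D landmarks with dlib and back-project them through the
# aligned depth map to get 3D face landmarks.
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # placeholder path

def landmarks_3d(rgb, depth_m, fx, fy, cx, cy):
    """Detect 2D landmarks on the RGB image and back-project via the depth map (meters)."""
    faces = detector(rgb, 1)
    if not faces:
        return None
    shape = predictor(rgb, faces[0])
    pts = []
    for i in range(68):
        u, v = shape.part(i).x, shape.part(i).y
        if not (0 <= v < depth_m.shape[0] and 0 <= u < depth_m.shape[1]):
            continue                       # landmark fell outside the depth image
        z = float(depth_m[v, u])           # depth at the landmark pixel
        if z == 0.0:                       # missing depth: skip this landmark
            continue
        pts.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return np.array(pts)
```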

In parallel (this takes a little more time to process), I can also transform the depth image into a full point cloud and integrate it with the 3D face landmarks to form a smooth surface for the 3D face model when more landmarks need to be mapped.

(Figure: the reconstructed 3D face point cloud along with the detected landmarks in 3D.)

In terms of compute, extracting only a constant number of landmarks (14 in this case) takes around 1–2 seconds. This can be accelerated by using lower-resolution RGB and depth inputs, and I want to run a quality study on the accuracy trade-off once the rendering section is implemented. I have also been experimenting with SIMD to accelerate the calculation, which I will continue next week.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I am on schedule, having completed the 3D modeling component on time. Before leaving for spring break, I also integrated my work with Jetson and prepared for the next phase.

After returning from spring break, I will start to work on wrapping a texture map onto the 3D face model using captured landmarks.

What deliverables do you hope to complete in the next week?

Continue testing and refining for better visualization and performance.
By the next milestone, I aim to have a functional prototype with texture mapping implemented.

Shengxi’s Status Report for Feb 15th

What did you personally accomplish this week on the project?

 

From last week’s progress, I have noticed that using Dynamic Fusion to iteratively find the most compact 3D reconstruction of the user is probably not the most efficient approach to the problem.

Since we are specifically concerned with the human face, rather than reconstructing a full volumetric model dynamically, we can leverage pre-defined facial priors and structured depth-based 3D face models for a more efficient solution.

To refine this approach, I explored alternative methods for 3D face modeling and rendering that use depth maps combined with facial priors rather than relying solely on iterative fusion. Additionally, I investigated texture blending techniques that adjust AR overlays based on lighting conditions, ensuring realistic makeup application from different angles.

Ultimately, this would allow me to simplify my 3D reconstruction process and target the output specifically to provide necessary information for texture blending with a pre-defined makeup texture map.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

So far, I have found a paper that maps 3D input from the Kinect depth sensor onto the Candide-3 model, which I am looking to implement with the RealSense camera in order to test the refined depth-based 3D face model.

I have started the implementation but have not yet finished, so I am a little behind schedule, as I also had to work on the Design Presentation this week.

Texture blending will be the focus of the next task after coming back from Spring Break, the AR Overlay Rendering Prototype section.

What deliverables do you hope to complete in the next week?

Finalize the 3D face modeling framework using depth-based priors for improved performance. I should then be able to get a 3D reconstruction of my own face once I receive all the parts I need (still waiting for the RealSense cable to arrive).

 

Shengxi’s Status Report for Feb 8th

Accomplishments This Week:

  • Planning and Research:

The flow diagram illustrates the high-level process for a 3D reconstruction pipeline using the Intel RealSense and side webcams. The process begins with a one-time synchronization and calibration of the RealSense camera with the webcams, so that we know the relative positions of the webcams with respect to the RealSense and can render the resulting images from their perspectives. This step involves ensuring that all cameras are spatially and temporally aligned using calibration techniques such as checkerboard or AprilTag patterns. The goal is to establish a unified coordinate system across all devices to facilitate accurate data capture and reconstruction.
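
As a reference for the checkerboard half of this step, here is a minimal per-camera intrinsic calibration sketch with OpenCV; the board dimensions and square size are placeholders, and the AprilTag path and the cross-camera extrinsic solve are not shown.

```python
# Sketch: gather checkerboard corner detections from one camera's frames and
# solve for its intrinsics with OpenCV.
import cv2
import numpy as np

BOARD = (9, 6)     # inner corners per row/column (assumed board)
SQUARE = 0.025     # square size in meters (assumed)

objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

def calibrate(gray_frames):
    obj_pts, img_pts = [], []
    for gray in gray_frames:
        found, corners = cv2.findChessboardCorners(gray, BOARD)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
    # Returns RMS reprojection error, camera matrix, distortion, and per-view extrinsics.
    return cv2.calibrateCamera(obj_pts, img_pts, gray_frames[0].shape[::-1], None, None)
```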

Following calibration, the depth image capture phase begins, where the RealSense camera captures depth. The captured data is then processed using Dynamic Fusion, a technique that performs non-rigid surface reconstruction. This step updates and refines the 3D face model in real time, accommodating facial deformations and changes.

Once the face model is processed, the pipeline evaluates whether the 3D face model contains holes or incomplete areas. If gaps are detected, the system loops back to Dynamic Fusion processing to refine the model further and fill in the missing parts. If the model is complete, the pipeline proceeds to facial movement tracking, where the system shifts focus to monitoring the user’s facial movements and angles across frames. This is not too demanding, since the face moves largely rigidly at this stage, so only one 6DoF transform is needed per frame.

Finally, the reconstructed model is aligned with the side webcams during the 3D model alignment step using the 6DoF transform we computed. This ensures the resulting AR overlay stays accurately anchored to the user’s face.
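
A minimal sketch of this composition, assuming both the per-frame head pose (model to RealSense frame) and the calibrated RealSense-to-webcam extrinsic are available as rigid transforms; the names are illustrative.

```python
# Sketch: chain the per-frame 6DoF head transform with the calibrated
# RealSense-to-webcam extrinsic so the model can be rendered from a side webcam.
import numpy as np

def to_homogeneous(R, t):
    """Pack a rotation matrix and translation into a 4x4 rigid transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(t).reshape(3)
    return T

def model_in_webcam_frame(T_model_to_realsense, T_realsense_to_webcam):
    """Express the face model in the side webcam's coordinate frame."""
    return T_realsense_to_webcam @ T_model_to_realsense
```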

  • Dynamic Fusion Investigation:
    • Discovered an OpenCV package for DynamicFusion, which can be utilized for non-rigid surface tracking in our system.
    • Began reviewing relevant documentation and testing standalone implementations.
  • Calibration Code Development:
    • Wrote the AprilTag and checkerboard-based calibration code.
    • I cannot fully test it yet, as the RealSense hardware is required for debugging and system implementation (still waiting for the RealSense camera).

Project Progress:

    • Slightly behind due to missing hardware required for implementation (I need the RealSense in order to know the input format of the depth information and to integrate Dynamic Fusion), but I was able to complete the design and most of the implementation.
    • I have tried using Blender to create scenes with depth maps and RGB images to simulate RealSense data for use in my program; I currently have a working version that is slower than the requirement (around a few seconds per frame).
    • Completed the camera calibration code, but I also cannot test whether it works due to the lack of hardware.
    • Another concern is that OpenCV’s DynamicFusion package works locally, but the environment setup is quite challenging and might be hard to migrate to the Jetson.
  • Actions to Catch Up:
    • Continue refining the non-rigid transform calculation process using DynamicFusion.
    • The RealSense is ready for pickup; I will integrate RealSense measurements with my code.
    • I also need to check how necessary DynamicFusion is, since it adds a significant delay to the 3D reconstruction processing pipeline.
  • By next week, I hope to be able to run the RealSense with my code to validate initial results (a minimal capture sketch follows below). By the next status report, I should be able to attach images of the 3D reconstructed model of my face using the pipeline I wrote.
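
As a reference for that first step, here is a minimal pyrealsense2 capture sketch that grabs a depth frame aligned to the color frame; the stream resolutions and frame rate are placeholders.

```python
# Sketch: capture one aligned RGB/depth pair from the RealSense and convert
# depth to meters, ready to feed into the reconstruction pipeline.
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

profile = pipeline.start(config)
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()
align = rs.align(rs.stream.color)   # align depth pixels to the color image

try:
    frames = align.process(pipeline.wait_for_frames())
    depth_raw = np.asanyarray(frames.get_depth_frame().get_data())
    color = np.asanyarray(frames.get_color_frame().get_data())
    depth_m = depth_raw.astype(np.float32) * depth_scale   # z16 units -> meters
finally:
    pipeline.stop()
```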