Anna’s Status Report for Apr 26
What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).
I finished setting up the UI framework so that I can begin working on the UI itself.
Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I am a little behind. I will finish the UI this week.
What deliverables do you hope to complete in the next week?
I hope to finish my UI.
Shengxi’s Status Report for April 26th
What did you personally accomplish this week on the project?
I completed the majority of the testing, and most performance metrics were successfully met as outlined in the final presentation. We are currently finalizing the integration between the gesture recognition UI and my code. Due to build issues with OpenPose on Ubuntu 22.04, we decided to switch to another device where package compatibility is better. Since we are approaching the final deadline, we agreed it would be more efficient to continue development on a computer instead.
Is your progress on schedule or behind?
We are slightly behind schedule but very close to completion.
If you are behind, what actions will be taken to catch up to the project schedule?
We are prioritizing final integration tasks and adjusting development platforms to avoid technical delays. I am actively working on completing the integration to ensure we meet the project deadline.
What deliverables do you hope to complete in the next week?
- Final integration of the gesture recognition UI and my modules.
- Final polish and debugging to ensure a complete, working product.
Unit Tests and Overall System Tests Conducted:
3D Face Model Generation Delay Test:
Objective: Measure the time taken to generate a sparse 3D facial model from detected 2D landmarks and depth input.
Method: After facial landmark detection, the 3D point cloud was reconstructed using depth information. Timing measurements were taken from the start of reconstruction to the output of the 3D model.
Result: Approximately 20 milliseconds delay for generating a model with sparse landmarks.
Expectation: Less than or equal to 50 milliseconds for acceptable responsiveness.
Outcome: Passed — face model generation is fast and within real-time constraints.
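For reference, a rough sketch of how this timing could be instrumented, assuming a depth map in millimeters registered to the color frame and known camera intrinsics; the names below are illustrative, not our exact code.

#include <chrono>
#include <cstdint>
#include <vector>
#include <opencv2/core.hpp>

// Back-project detected 2D landmarks into a sparse 3D point cloud using the depth frame.
// Assumes a CV_16U depth map in millimeters aligned to the color image.
std::vector<cv::Point3f> backproject(const std::vector<cv::Point2f>& landmarks2d,
                                     const cv::Mat& depthFrame,
                                     float fx, float fy, float cx, float cy) {
    std::vector<cv::Point3f> points3d;
    points3d.reserve(landmarks2d.size());
    for (const auto& p : landmarks2d) {
        float z = depthFrame.at<uint16_t>(cvRound(p.y), cvRound(p.x)) * 0.001f;  // mm -> m
        points3d.emplace_back((p.x - cx) * z / fx, (p.y - cy) * z / fy, z);
    }
    return points3d;
}

// Timing wrapper: start of reconstruction to output of the sparse 3D model (~20 ms observed).
double timedGenerationMs(const std::vector<cv::Point2f>& landmarks2d, const cv::Mat& depthFrame,
                         float fx, float fy, float cx, float cy) {
    auto t0 = std::chrono::steady_clock::now();
    auto model = backproject(landmarks2d, depthFrame, fx, fy, cx, cy);
    (void)model;  // the model would be handed to the pose estimator here
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}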
6DoF Head Pose Estimation Test:
Objective: Validate the speed and accuracy of estimating head pose (rotation and translation) from 2D–3D correspondences.
Method: The solvePnP function was used with the reconstructed sparse 3D landmarks to estimate pose. Timing was measured per frame.
Result: Approximately 2 milliseconds per pose estimation using 68 landmarks.
Expectation: Less than or equal to 150 milliseconds to ensure responsiveness.
Outcome: Passed — pose estimation is extremely fast, even with sparse or noisy data.
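A minimal sketch of the per-frame pose estimation step being timed here, assuming calibrated intrinsics and the 68 reconstructed landmarks; names are illustrative.

#include <chrono>
#include <vector>
#include <opencv2/calib3d.hpp>

// Estimate 6DoF head pose from 2D-3D correspondences and report the per-frame cost (~2 ms measured).
bool estimatePose(const std::vector<cv::Point3f>& objectPoints,   // 68 sparse 3D landmarks
                  const std::vector<cv::Point2f>& imagePoints,    // matching 2D detections
                  const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
                  cv::Mat& rvec, cv::Mat& tvec, double& elapsedMs) {
    auto t0 = std::chrono::steady_clock::now();
    bool ok = cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs,
                           rvec, tvec, /*useExtrinsicGuess=*/false, cv::SOLVEPNP_ITERATIVE);
    auto t1 = std::chrono::steady_clock::now();
    elapsedMs = std::chrono::duration<double, std::milli>(t1 - t0).count();
    return ok;
}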
AR Filter Rendering Frame Rate Test:
Objective: Ensure that the AR rendering pipeline operates at a real-time frame rate.
Method: The rendering frame rate was measured during typical operation with overlays (e.g., glasses or makeup masks) being applied based on face tracking.
Result: Frame rate of about 160 FPS under standard conditions.
Expectation: Minimum 15 FPS for real-time interaction.
Outcome: Passed — rendering performance is well above the required threshold.
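The frame rate was read off a simple counter around the render loop; a sketch of such a counter follows (illustrative, not our exact instrumentation).

#include <chrono>
#include <cstdio>

// Counts rendered frames and prints the measured FPS roughly once per second.
struct FpsCounter {
    int frames = 0;
    std::chrono::steady_clock::time_point last = std::chrono::steady_clock::now();

    void tick() {
        ++frames;
        auto now = std::chrono::steady_clock::now();
        double sec = std::chrono::duration<double>(now - last).count();
        if (sec >= 1.0) {
            std::printf("FPS: %.1f\n", frames / sec);
            frames = 0;
            last = now;
        }
    }
};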
Movement Drift and Jitter Test:
Objective: Evaluate the stability of 3D tracking over natural head movements.
Method: Users were asked to move their heads smoothly across a reasonable range (side-to-side, up/down). The stability of the projected landmarks and overlays was recorded and analyzed.
Result: Minor visible jitter, approximately 3–5 pixels deviation.
Expectation: Drift and jitter less than or equal to 5 pixels.
Outcome: Passed — slight but acceptable drift observed.
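The jitter figure comes from comparing projected landmark positions against a short running mean; a sketch of that metric, assuming a small window of recent frames is kept (names are illustrative).

#include <cmath>
#include <deque>
#include <vector>
#include <opencv2/core.hpp>

// Average pixel deviation of each projected landmark from its mean position over the window.
double meanJitterPx(const std::deque<std::vector<cv::Point2f>>& history) {
    if (history.size() < 2 || history.front().empty()) return 0.0;
    const size_t nLandmarks = history.front().size();
    double total = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < nLandmarks; ++i) {
        cv::Point2f mean(0.f, 0.f);
        for (const auto& frame : history) mean += frame[i];
        mean *= (1.0f / history.size());
        for (const auto& frame : history) {
            total += std::hypot(frame[i].x - mean.x, frame[i].y - mean.y);
            ++count;
        }
    }
    return count ? total / count : 0.0;   // ~3-5 px observed during smooth head motion
}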
Pose Estimation Accuracy Test:
Objective: Measure the accuracy of 6DoF head pose estimation under varying head movements and angles.
Method: Estimated pose was compared to known or visually aligned poses for validation. Deviation in landmark projection was used as a proxy for pose error.
Result: Pose errors were around 10–15 pixels in some cases.
Expectation: Less than or equal to 5 pixels for optimal accuracy.
Outcome: Partially Passed — some instability observed especially under extreme angles or occlusions, highlighting an area for improvement.
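A minimal sketch of the reprojection-error proxy used here: project the sparse 3D landmarks with the estimated pose and compare against the detected 2D landmarks (names are illustrative).

#include <cmath>
#include <vector>
#include <opencv2/calib3d.hpp>

// Mean pixel distance between projected and detected landmarks, used as a proxy for pose error.
double meanReprojectionErrorPx(const std::vector<cv::Point3f>& objectPoints,
                               const std::vector<cv::Point2f>& detected,
                               const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
                               const cv::Mat& rvec, const cv::Mat& tvec) {
    std::vector<cv::Point2f> projected;
    cv::projectPoints(objectPoints, rvec, tvec, cameraMatrix, distCoeffs, projected);
    double total = 0.0;
    for (size_t i = 0; i < detected.size(); ++i)
        total += std::hypot(projected[i].x - detected[i].x,
                            projected[i].y - detected[i].y);
    return detected.empty() ? 0.0 : total / detected.size();  // 10-15 px under extreme angles
}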
Steven’s Status Report for Apr 26
What did you personally accomplish this week on the project?
Testing is essentially finished, and most of the metrics were met as described in the final presentation. We are still working on final integration between the gesture recognition UI and Catherine’s AR code. We have decided against using the Jetson, as we ran into issues with OpenPose either not running at all or running too slowly on it. Since we are nearing the final deadline, we decided it would be easier to run everything on a computer instead.
Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
Behind schedule, but we are almost done.
What deliverables do you hope to complete in the next week?
Final product integration.
Unit tests (unchanged from my previous status report):
- To test latency, I measured the time it takes for the gesture recognition pipeline to receive an image and then produce pose estimates (essentially model evaluation time). This was as simple as inserting “stopwatches” into the codebase and averaging the measurements. The end-to-end latency was about 55 ms, which meets our target.
- For input accuracy, I made “test inputs” for each of the inputs we wanted to test. Since the inputs are now location based, these tests essentially consisted of holding my hand over the button I wanted to click on the screen. Overall, this test went very well, since the buttons were fairly large and OpenPose wasn’t noisy enough to “miss” them. Out of the 20 trials I did for a button, 19-20 of them registered an input (>= 95% accuracy). However, input accuracy changed under different lighting conditions: if the room was too bright or too dark, accuracy dropped to around 18-20 successes out of 20 (>= 90% accuracy), which is still good enough for our purposes.
- For pose estimation, I held my hand relatively stationary and measured the standard deviation and max deviation from the mean (a sketch of this metric follows below). For a 1080p camera, this deviation was about 35 px, which is a little more than we’d like (<20 px) but is still good enough for our purposes. Note that this metric is bound by the model we choose (in this case OpenPose).
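A rough sketch of the stationary-hand deviation metric from the last bullet, assuming the per-frame 2D positions of a single OpenPose keypoint have been collected into a non-empty list (names are illustrative).

#include <algorithm>
#include <cmath>
#include <vector>
#include <opencv2/core.hpp>

struct DeviationStats { double stddevPx; double maxPx; };

// Standard deviation and max deviation (in pixels) of a tracked keypoint from its mean position.
DeviationStats keypointDeviation(const std::vector<cv::Point2f>& samples) {
    cv::Point2f mean(0.f, 0.f);
    for (const auto& p : samples) mean += p;
    mean *= (1.0f / samples.size());

    double sumSq = 0.0, maxDev = 0.0;
    for (const auto& p : samples) {
        double d = std::hypot(p.x - mean.x, p.y - mean.y);
        sumSq += d * d;
        maxDev = std::max(maxDev, d);
    }
    return { std::sqrt(sumSq / samples.size()), maxDev };   // ~35 px max observed at 1080p
}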
Design changes:
- Switched the codebase platform from the Jetson to a laptop. There were issues with building some of the required libraries during integration, as well as some runtime issues (OpenPose wouldn’t run). Since we are strapped for time, we switched to running our software system on a laptop instead.
Team Status Report for Apr 26
What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?
The primary risk remains hardware and software compatibility, specifically getting OpenPose and its dependencies to run reliably in our target environment. We experienced this first-hand on the Jetson platform, where build failures and performance bottlenecks threatened our integration timeline. To manage this, we have migrated development and testing to a standard laptop environment where package compatibility is well tested.
Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?
Yes. The system architecture was updated to replace the Jetson-based compute node with a standard x86-64 laptop. This change was necessary because OpenPose on Ubuntu 22.04 either failed to build or ran at sub-real-time speeds on the Jetson, jeopardizing our ability to meet real-time gesture recognition requirements. The costs of this change include additional porting effort (approximately 1–2 person-days) and potential increases in power consumption and hardware expense.
Provide an updated schedule if changes have occurred.
Same schedule since this is the final week.
Shengxi’s Status Report for April 19
What did you personally accomplish this week on the project?
This week, I worked on integrating the 3D face tracking pipeline with real-time AR rendering, including syncing 6DoF head pose estimation using depth-based landmarks and OpenGL model placement. I also optimized overlay performance and began evaluating pixel-level drift across different detectors.
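As part of syncing the head pose with OpenGL model placement, the solvePnP pose has to be converted into a model-view matrix. A minimal sketch of that conversion, assuming double-precision rvec/tvec as returned by solvePnP; this is illustrative rather than our exact code.

#include <array>
#include <opencv2/calib3d.hpp>

// Convert an OpenCV rvec/tvec pose into a column-major OpenGL model-view matrix.
// OpenCV's camera looks down +Z with +Y down, OpenGL down -Z with +Y up, so Y and Z rows are flipped.
std::array<float, 16> poseToModelView(const cv::Mat& rvec, const cv::Mat& tvec) {
    cv::Mat R;
    cv::Rodrigues(rvec, R);                       // 3x3 rotation from the rotation vector

    cv::Mat view = cv::Mat::eye(4, 4, CV_64F);
    R.copyTo(view(cv::Rect(0, 0, 3, 3)));
    tvec.copyTo(view(cv::Rect(3, 0, 1, 3)));

    cv::Mat cvToGl = cv::Mat::eye(4, 4, CV_64F);  // flip Y and Z for OpenGL conventions
    cvToGl.at<double>(1, 1) = -1.0;
    cvToGl.at<double>(2, 2) = -1.0;
    view = cvToGl * view;

    std::array<float, 16> m{};                    // OpenGL expects column-major order
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            m[col * 4 + row] = static_cast<float>(view.at<double>(row, col));
    return m;
}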
Is your progress on schedule or behind?
I’m slightly behind schedule. Integration is taking longer than expected, but I plan to finish it within the next few days.
What actions will be taken to catch up?
I will prioritize completing the C++ pipeline and debugging edge cases to ensure model rendering correctly follows facial motion.
What deliverables do you hope to complete in the next week?
We aim to complete full system integration and deliver a minimally functional final product. I will also focus on testing and quantifying system latency, drift, and pose accuracy to support our performance analysis.
Team’s Status Report for 4/19
What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?
One major risk is the unreliability of gesture recognition, as OpenPose struggles with noise and time consistency. To address this, the team pivoted to a location-based input model, where users interact with virtual buttons by holding their hands in place. This approach improves reliability and user feedback, with potential refinements like additional smoothing filters if needed.
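For context, a minimal sketch of how such a location-based virtual button can be implemented as a dwell check on a tracked hand keypoint; the class, region, and dwell time below are illustrative assumptions rather than our exact implementation.

#include <chrono>
#include <opencv2/core.hpp>

// A virtual button fires once a hand keypoint has stayed inside its screen region for a dwell period.
class DwellButton {
public:
    DwellButton(cv::Rect region, std::chrono::milliseconds dwell)
        : region_(region), dwell_(dwell) {}

    // Call once per frame with the tracked hand keypoint; returns true when the button fires.
    bool update(cv::Point2f hand, std::chrono::steady_clock::time_point now) {
        if (!region_.contains(cv::Point(cvRound(hand.x), cvRound(hand.y)))) {
            inside_ = false;
            return false;
        }
        if (!inside_) { inside_ = true; enteredAt_ = now; }
        if (now - enteredAt_ >= dwell_) {
            inside_ = false;                      // reset so the button does not re-fire immediately
            return true;
        }
        return false;
    }

private:
    cv::Rect region_;
    std::chrono::milliseconds dwell_;
    bool inside_ = false;
    std::chrono::steady_clock::time_point enteredAt_;
};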
Additionally, GPU performance issues could affect real-time AR overlays. Ongoing shader optimizations prioritize stability and responsiveness, with fallback rendering techniques as a contingency if improvements are insufficient.
Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?
No changes have been made.
Provide an updated schedule if changes have occurred.
This week, the team is doing full system integration, finalizing input event handling, and testing eye-tracking.
Anna’s Status Report for 4/19
What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).
I finished building the 2nd camera rig. I also finished setting up the UI so that I can now work on it.
Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I am on track. I will finish the UI this week.
What deliverables do you hope to complete in the next week?
I hope to finish my UI.
Steven’s Status Report for Apr 19th
What did you personally accomplish this week on the project?
I continued working on testing my pose recognition UI and generating test metrics to see if we meet the design requirements. I also worked on the final presentation.
Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
Behind schedule. We still need to integrate everything together. Catherine’s code should, in theory, integrate with my codebase, and I’ve verified that Anna’s code works with mine. So, pairwise, the systems have been integrated, but we still need to bring everything together as a final product.
What deliverables do you hope to complete in the next week?
Integration. We hope to have a (barely) working final product by the end of this week.
Testing:
To test my subsystem, I’ve created tests for each of the design requirements we are targeting: (1) latency, (2) input accuracy, and (3) pose estimation accuracy.
- To test latency, I measured the time it takes for the gesture recognition pipeline to receive an image and then produce pose estimates (essentially model evaluation time). This was as simple as inserting “stopwatches” into the codebase and averaging the measurements (a rough sketch of this harness appears after this list). The end-to-end latency was about 55 ms, which meets our target.
- For input accuracy, I made “test inputs” for each of the inputs we wanted to test. Since the inputs are now location based, these tests essentially consisted of holding my hand over the button I wanted to click on the screen. Overall, this test went very well, since the buttons were fairly large and OpenPose wasn’t noisy enough to “miss” them. Out of the 20 trials I did for a button, 19-20 of them registered an input (>= 95% accuracy). However, input accuracy changed under different lighting conditions: if the room was too bright or too dark, accuracy dropped to around 18-20 successes out of 20 (>= 90% accuracy), which is still good enough for our purposes.
- For pose estimation, I held my hand relatively stationary and measured the standard deviation and max deviation from the mean. For a 1080p camera, this deviation was about 35 px, which is a little more than we’d like (<20 px) but is still good enough for our purposes. Note that this metric is bound by the model we choose (in this case OpenPose).
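A minimal sketch of the “stopwatch” harness from the first bullet, timing the pipeline call per frame and averaging; runPipeline is a stand-in for the real gesture recognition call, not a function from our codebase.

#include <chrono>
#include <functional>

// Times the gesture recognition pipeline (image in -> pose estimates out) over many frames
// and returns the average latency in milliseconds (~55 ms measured end to end).
double averageLatencyMs(const std::function<void()>& runPipeline, int numFrames) {
    double totalMs = 0.0;
    for (int i = 0; i < numFrames; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        runPipeline();   // stand-in for the OpenPose evaluation on the current frame
        auto t1 = std::chrono::steady_clock::now();
        totalMs += std::chrono::duration<double, std::milli>(t1 - t0).count();
    }
    return totalMs / numFrames;
}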
Shengxi’s Status Report for April 12th
What did you personally accomplish this week on the project?
This week, I worked on integrating and rendering 3D objects—such as glasses—on top of a human face using OpenGL. I aligned the 3D model with facial landmarks and refined the rendering pipeline to ensure the virtual object appears naturally overlaid in real time. This required deeper integration with face tracking data and improvements to pose stability and object placement.
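A minimal sketch of how the glasses model can be anchored to the landmarks: place it at the nose bridge and scale it by the interocular distance. The 68-point indices below (36-41 left eye, 42-47 right eye, 27 nose bridge) are an assumption about our landmark layout, not a confirmed detail.

#include <cmath>
#include <vector>
#include <opencv2/core.hpp>

struct GlassesPlacement { cv::Point3f anchor; float scale; };

// Compute where to place the glasses mesh and how much to scale it, given the sparse 3D landmarks.
GlassesPlacement placeGlasses(const std::vector<cv::Point3f>& landmarks3d,
                              float modelEyeSpan /* eye-to-eye width of the glasses mesh */) {
    auto eyeCenter = [&](int begin, int end) {
        cv::Point3f c(0.f, 0.f, 0.f);
        for (int i = begin; i <= end; ++i) c += landmarks3d[i];
        return c * (1.0f / (end - begin + 1));
    };
    cv::Point3f leftEye  = eyeCenter(36, 41);   // assumed 68-point indices
    cv::Point3f rightEye = eyeCenter(42, 47);
    cv::Point3f d = rightEye - leftEye;
    float interocular = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    return { landmarks3d[27], interocular / modelEyeSpan };   // anchor at the nose bridge
}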
Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
I’m on schedule. However, GPU performance and integration continue to be a challenge—especially on the Jetson platform, where driver and shader issues have impacted rendering stability. Given that this is a prototype, I’ll be making compromises on some graphical fidelity and fallback features to prioritize real-time responsiveness and ensure the demo remains functional and stable.
What deliverables do you hope to complete in the next week?
Next week, I plan to finalize the alignment and positioning of the glasses overlay under varying head poses. I will also test the robustness of the pipeline under different motion and lighting conditions. If time permits, I will begin generalizing the overlay system to support additional 3D elements (e.g., hats or earrings).
Verification and Validation Plan
Verification (individual, subsystem-level):
Pose accuracy: Comparing the rendered glasses position against known synthetic inputs or manually annotated frames to ensure consistent alignment with the eyes and nose bridge.
Latency: Measuring rendering latency from input frame to final render output using timestamped profiling to ensure the pipeline meets real-time constraints (<100ms end-to-end).
Robustness: Testing the system under varied conditions—such as rapid head movement, occlusions, and lighting changes—to ensure the overlay does not jitter or drift significantly.
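A minimal sketch of the timestamped profiling described under Latency, assuming each captured frame carries its capture time; the struct and the 100 ms budget check below are illustrative.

#include <chrono>
#include <cstdio>

// Each captured frame carries its capture timestamp so end-to-end latency can be checked at render time.
struct TimedFrame {
    // ... image data ...
    std::chrono::steady_clock::time_point capturedAt;
};

// Called right after the final render output for this frame.
void logEndToEndLatency(const TimedFrame& frame) {
    auto now = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(now - frame.capturedAt).count();
    if (ms > 100.0)   // end-to-end budget from the verification plan
        std::fprintf(stderr, "Latency budget exceeded: %.1f ms\n", ms);
}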