Steven’s Status Report for Apr 26

What did you personally accomplish this week on the project?

Tests are basically finished, and most of the metrics were met, as described in the final presentation. We are still working on final integration between the gesture recognition UI and Catherine’s AR code. We have decided against using the Jetson, as we ran into issues with OpenPose either not running at all or running too slowly on it. Since we are nearing the final deadline, we decided it would be easier to run the system on a laptop instead.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Behind schedule, but we are almost done.

What deliverables do you hope to complete in the next week?

Final product integration.

Unit tests: (This is the same as my old status report)

  • To test latency, I’ve simply measured the time it takes for the gesture recognition pipeline to receive the image and then provide pose estimates (essentially model evaluation time). This was as simple as inserting “stopwatches” into the codebase and averaging the measurements (see the timing sketch after this list). The end-to-end latency was about 55 ms, which meets our target.
  • For input accuracy, I made “test inputs” for each of the inputs we wanted to test. Since the inputs are now location based, these tests essentially consisted of holding my hand over the button I wanted to click on the screen. Overall, this test went very well, since the buttons were fairly large and OpenPose wasn’t noisy enough to “miss” them. Out of the 20 trials I did per button, 19-20 would “pass” (register an input), giving >= 95% accuracy. However, input accuracy changed under different lighting conditions: if the room was too bright or too dark, accuracy dropped to around 18-20 successes out of 20 (>= 90%), which is still good enough for our purposes.
  • For pose estimation, I held my hand relatively stationary and measured the standard deviation and the maximum deviation from the mean of the reported keypoint position. For a 1080p camera, this deviation was about 35 px, which is a little more than we’d like (<20 px) but still good enough for our purposes. Note that this metric is bound by the model we choose (in this case OpenPose).
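
As a reference for the latency measurement above, here is a minimal sketch of the kind of “stopwatch” timing I mean, using std::chrono. The runInference() placeholder stands in for the OpenPose evaluation step; the function name and loop count are illustrative, not our exact codebase.

    #include <chrono>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    // Hypothetical stand-in for the OpenPose evaluation step (image in, keypoints out).
    void runInference() {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));  // placeholder work
    }

    int main() {
        std::vector<double> samples_ms;
        for (int i = 0; i < 20; ++i) {
            auto start = std::chrono::steady_clock::now();
            runInference();
            auto end = std::chrono::steady_clock::now();
            samples_ms.push_back(
                std::chrono::duration<double, std::milli>(end - start).count());
        }
        double mean = std::accumulate(samples_ms.begin(), samples_ms.end(), 0.0)
                      / samples_ms.size();
        std::cout << "mean latency: " << mean << " ms over "
                  << samples_ms.size() << " runs\n";
    }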

Design changes:

  • Switching the codebase platform from the Jetson to a laptop. There were issues with building some of the required libraries during integration, as well as some runtime issues (OpenPose wouldn’t run). Since we are strapped for time, we switched to running our software system on a laptop instead.

Team Status Report for Apr 26

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The primary risk remains hardware and software compatibility, specifically getting OpenPose and its dependencies to run reliably in our target environment. We experienced this first-hand on the Jetson platform, where build failures and performance bottlenecks threatened our integration timeline. To manage this, we’ve migrated development and testing to a standard laptop environment where package compatibility is well tested.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

Yes. The system architecture was updated to replace the Jetson-based compute node with a standard x86-64 laptop. This change was necessary because OpenPose on Ubuntu 22.04 either failed to build or ran at sub-real-time speeds on the Jetson, jeopardizing our ability to meet real-time gesture recognition requirements. The costs of this change include additional porting effort (approximately 1-2 person-days) and potential increases in power consumption and hardware expense.

Provide an updated schedule if changes have occurred.

Same schedule since this is the final week.

Steven’s Status Report for Apr 19th

What did you personally accomplish this week on the project?

I continued working on testing my pose recognition UI and generating testing metrics to see if we meet the design requirements. I also worked on the final presentation.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Behind schedule. We still need to integrate everything together. Catherine’s code should, in theory, integrate with my codebase, and I’ve verified that Anna’s code works with mine. So, pairwise, the systems have been integrated, but we still need to integrate everything into a final product.

What deliverables do you hope to complete in the next week?

Integration. We hope to have a (barely) working final product by the end of this week.

Testing:

To test my subsystem, I’ve created tests for each of the design requirements we are trying to target: (1) latency, (2) input accuracy, and (3) pose estimation accuracy.

  1. To test latency, I’ve simply measured the time it takes for the gesture recognition pipeline to receive the image and then provide pose estimates (essentially model evaluation time). This was as simple as inserting “stopwatches” into the codebase and averaging the measurements. The end-to-end latency was about 55 ms, which meets our target.
  2. For input accuracy, I made “test inputs” for each of the inputs we wanted to test. Since the inputs are now location based, these tests essentially consisted of holding my hand over the button I wanted to click on the screen. Overall, this test went very well, since the buttons were fairly large and OpenPose wasn’t noisy enough to “miss” them. Out of the 20 trials I did per button, 19-20 would “pass” (register an input), giving >= 95% accuracy. However, input accuracy changed under different lighting conditions: if the room was too bright or too dark, accuracy dropped to around 18-20 successes out of 20 (>= 90%), which is still good enough for our purposes.
  3. For pose estimation, I held my hand relatively stationary and measured the standard deviation and the maximum deviation from the mean of the reported keypoint position. For a 1080p camera, this deviation was about 35 px, which is a little more than we’d like (<20 px) but still good enough for our purposes. Note that this metric is bound by the model we choose (in this case OpenPose).
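
For reference, here is a minimal sketch of how the deviation numbers above can be computed from a recording of a nominally stationary keypoint. The sample positions here are made up; in practice they come from OpenPose while the hand is held still.

    #include <algorithm>
    #include <cmath>
    #include <iostream>
    #include <vector>

    struct Point { double x, y; };

    int main() {
        // Made-up samples of a nominally stationary keypoint, in pixels.
        std::vector<Point> samples = {
            {960, 540}, {963, 538}, {958, 545}, {962, 541}, {955, 539}};

        // Mean position.
        Point mean{0, 0};
        for (const auto& p : samples) { mean.x += p.x; mean.y += p.y; }
        mean.x /= samples.size();
        mean.y /= samples.size();

        // Standard deviation and maximum deviation of the distance from the mean.
        double sumSq = 0.0, maxDev = 0.0;
        for (const auto& p : samples) {
            double d = std::hypot(p.x - mean.x, p.y - mean.y);
            sumSq += d * d;
            maxDev = std::max(maxDev, d);
        }
        double stddev = std::sqrt(sumSq / samples.size());
        std::cout << "stddev: " << stddev << " px, max deviation: " << maxDev << " px\n";
    }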

Steven’s Status Report for April 12th

What did you personally accomplish this week on the project?

I worked on testing my pose recognition UI and making it more robust to the noise caused by inaccuracies in OpenPose. I also worked on adding more UI components for all the features we want in the final product. Finally, I worked on serial output to the Arduino (part of the camera rig), mapping inputs from my gesture/location control to serial commands for Anna’s camera rig; a rough sketch of this mapping is below.
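
The sketch below illustrates the general idea under some assumptions: the command bytes, the enum of input events, and the device path are hypothetical rather than the exact protocol Anna’s rig uses, and it assumes a Linux host with a POSIX serial port.

    #include <fcntl.h>
    #include <termios.h>
    #include <unistd.h>
    #include <cstdio>

    // Hypothetical UI input events produced by the gesture/location controller.
    enum class InputEvent { TiltUp, TiltDown, Stop };

    // Map an input event to a single command byte (hypothetical protocol).
    char toCommand(InputEvent e) {
        switch (e) {
            case InputEvent::TiltUp:   return 'U';
            case InputEvent::TiltDown: return 'D';
            default:                   return 'S';
        }
    }

    int main() {
        // Device path varies by machine; /dev/ttyACM0 is a common default for Arduinos on Linux.
        int fd = open("/dev/ttyACM0", O_RDWR | O_NOCTTY);
        if (fd < 0) { perror("open"); return 1; }

        termios tty{};
        if (tcgetattr(fd, &tty) != 0) { perror("tcgetattr"); return 1; }
        cfmakeraw(&tty);
        cfsetospeed(&tty, B9600);   // baud rate must match the Arduino sketch
        cfsetispeed(&tty, B9600);
        tcsetattr(fd, TCSANOW, &tty);

        char cmd = toCommand(InputEvent::TiltUp);
        write(fd, &cmd, 1);  // send one command byte to the camera rig
        close(fd);
        return 0;
    }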

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Behind schedule. I’ve integrated my part of the project with Anna’s, so the application can now control the camera rig. However, I am behind on trying to build the program on the Jetson.

What deliverables do you hope to complete in the next week?

Continue refining the UI, and attempt to integrate Catherine’s part of the project.

Verification

I verify my system by (1) making sure that my location-based input system reaches the desired accuracy: I make a series of intended inputs (for example, hovering my hand over the left-swipe button) and record how many of these test inputs succeed or fail; and (2) making sure that the eye-tracking system (with the camera rig controller) corrects the eye level: I set different eye levels on the screen to “calibrate to” and check that the camera rig corrects to within the desired percentage of the target screen eye level.

Team Status Report for March 29th

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

One major risk is the unreliability of gesture recognition, as OpenPose struggles with noise and time consistency. To address this, the team pivoted to a location-based input model, where users interact with virtual buttons by holding their hands in place. This approach improves reliability and user feedback, with potential refinements like additional smoothing filters if needed.
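
If additional smoothing does turn out to be needed, one simple option (a sketch only, not something currently in the codebase) is an exponential moving average over the keypoint positions:

    #include <optional>
    #include <utility>

    // Exponential moving average smoother for a 2D keypoint position.
    // alpha in (0, 1]: closer to 1 tracks faster, closer to 0 smooths more.
    struct EmaSmoother {
        float alpha;
        std::optional<std::pair<float, float>> state;

        std::pair<float, float> update(float x, float y) {
            if (!state) {
                state.emplace(x, y);          // first sample initializes the filter
                return *state;
            }
            float newX = alpha * x + (1.0f - alpha) * state->first;
            float newY = alpha * y + (1.0f - alpha) * state->second;
            state.emplace(newX, newY);
            return *state;
        }
    };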

System integration is also behind schedule due to incomplete subsystems. While slack time allows for adjustments, delays in dependent components remain a risk. To mitigate this, the team is refining individual modules and may use mock data for parallel development if necessary.

The camera rig needs a stable stand and motion measurement features. A second version is in progress, and if stability remains an issue, alternative mounting solutions will be explored.

Finally, GPU performance issues could affect real-time AR overlays. Ongoing shader optimizations prioritize stability and responsiveness, with fallback rendering techniques as a contingency if improvements are insufficient.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

Gesture-based input has been replaced with a location-based system due to unreliable pose recognition. While this requires UI redesign and new logic for button-based interactions, it improves usability and consistency. The team is expediting this transition to ensure thorough testing before integration.

Another key change is a focus on GPU optimization after identifying shader inefficiencies. This delays secondary features like dynamic resolution scaling but ensures smooth AR performance. Efforts will continue to balance visual quality and efficiency.

Provide an updated schedule if changes have occurred.

This week, the team is refining motion tracking, improving GPU performance, stabilizing the camera rig, and finalizing the new input system. Next week, focus will shift to full system integration, finalizing input event handling, and testing eye-tracking once the camera rig is ready. While integration is slightly behind, a clear plan is in place to stay on track.

Steven’s Status Report for March 29th

 

What did you personally accomplish this week on the project?

Gesture recognition seems too unintuitive for the application, and the inputs we received from the pose recognition model were too noisy for reliable velocity estimation for gestures. So I pivoted to a location-based input model: instead of making gestures, the user moves their hand over a virtual button on the screen, and an input is registered if the user “holds” their hand over that button for a period of time. This is a better solution, since estimating position was a lot more reliable than estimating velocities (I don’t think OpenPose is temporally consistent). Visual buttons on the screen also provide better feedback to the user. A minimal sketch of this dwell-based activation logic is below.
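
The sketch assumes a per-frame update with the hand keypoint in screen coordinates; the button geometry, field names, and one-second dwell time are illustrative rather than the exact UI code.

    #include <chrono>
    #include <optional>

    using Clock = std::chrono::steady_clock;

    // A virtual on-screen button that fires after the hand dwells over it.
    struct Button {
        float x, y, w, h;                            // top-left corner and size, in pixels
        std::optional<Clock::time_point> enterTime;  // when the hand first entered the button

        bool contains(float px, float py) const {
            return px >= x && px <= x + w && py >= y && py <= y + h;
        }

        // Call once per frame with the current hand keypoint.
        // Returns true exactly once when the dwell threshold is crossed.
        bool update(float px, float py, double dwellSeconds = 1.0) {
            if (!contains(px, py)) {
                enterTime.reset();                   // hand left the button; restart the timer
                return false;
            }
            auto now = Clock::now();
            if (!enterTime) { enterTime = now; return false; }
            if (std::chrono::duration<double>(now - *enterTime).count() >= dwellSeconds) {
                enterTime.reset();                   // fire once, then require re-entry
                return true;
            }
            return false;
        }
    };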

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Behind. Currently we’re supposed to be doing system integration, but I’m waiting on my partners’ subsystems and am still finishing up my own. We have a little slack time for this, so I am not too worried.

What deliverables do you hope to complete in the next week?

I hope to write input events/buttons for all the input events we have planned as features for our project. I also hope to get started on testing the eye-tracking system (i.e. correcting eye level) once Anna finishes her camera rig, through serial command inputs from the program to Arduino via usb.

Steven’s status report for March 22

 

What did you personally accomplish this week on the project?

The pose tracking doesn’t suck anymore (https://github.com/kevidgel/usar-mirror/commit/5f6e604137110f6559df3144245f885c7efa9c0f). I also worked on making the gesture recognition algorithm more robust to missing/implausible data; a rough sketch of that kind of filtering is below.
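
As a rough illustration of the robustness idea (the thresholds, struct fields, and class name are hypothetical, not the actual commit): reject keypoints whose confidence is too low or that jump an implausible distance between frames, and hold the last accepted value instead.

    #include <cmath>
    #include <optional>

    struct Keypoint { float x, y, confidence; };

    // Keeps the last accepted keypoint and rejects missing/implausible updates.
    class KeypointFilter {
    public:
        KeypointFilter(float minConfidence, float maxJumpPx)
            : minConf_(minConfidence), maxJump_(maxJumpPx) {}

        // Returns the filtered keypoint, or the previous one if the new sample is rejected.
        std::optional<Keypoint> update(const Keypoint& kp) {
            bool missing = kp.confidence < minConf_;   // low/zero confidence suggests a missed detection
            bool implausible = last_.has_value() &&
                std::hypot(kp.x - last_->x, kp.y - last_->y) > maxJump_;
            if (!missing && !implausible) {
                last_ = kp;                            // accept the new sample
            }
            return last_;                              // otherwise hold the last good value
        }

    private:
        float minConf_, maxJump_;
        std::optional<Keypoint> last_;
    };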

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Behind. But I will spend some extra time this week to catch up.

What deliverables do you hope to complete in the next week?

If I have time, I may get to compile it on the Jetson, but mainly I will be testing gesture control.

 

Team Status Report for March 15th

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risk currently is system integration failing. So far, everyone has been working on their tasks fairly separately (software for gesture recognition/eye tracking, software for the AR overlay, hardware for the camera rig + UI). It looks like everyone has made significant progress on their tasks and is close to the testing stage for the individual parts. However, not much testing/design has gone into how these subprojects will interface. We will discuss this in the coming weeks. Moreover, we have set aside time in the schedule for integration, which gives us ample time to make sure everything works.

Another risk is performance: the software’s compute requirements are high, and the Jetson may not be able to handle them. But this was already mentioned in our last team status report, and we are currently working on it.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

No changes currently.

Updates & Schedule change

The team seems to be roughly on schedule.

 

Steven’s Status Report for Mar 15th

What did you personally accomplish this week on the project?

Worked on the gesture control algorithm. I implemented a left/right swipe recognition input based on keypoint velocity, but because OpenPose is fairly slow and the tracked keypoints are fairly noisy, the accuracy isn’t very good. Potential solutions include switching from a velocity-based algorithm to something more positional, such as detecting whether the hand is to the right or left of the shoulder, or writing a filter for more stable keypoints (and therefore more stable velocities). Regarding eye-tracking, I managed to extract eye keypoints in screen-space coordinates. Since the camera rig isn’t finished and the webcam I currently have is fixed, I simply wrote a function to output the signed distance between the average y-position of the eyes and a target height (in (-1, 1) viewport coordinates). This will be the input to a feedback algorithm that adjusts our camera rig; a rough sketch of this signed-distance computation is below.
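
The sketch assumes the eye keypoints have already been converted to (-1, 1) viewport coordinates; the struct and function names are illustrative, and the sign convention depends on whether y increases upward or downward in the chosen viewport convention.

    // Eye keypoints in (-1, 1) viewport coordinates.
    struct Eyes { float leftY, rightY; };

    // Signed distance between the average eye height and a target height.
    // The sign tells the feedback loop which way the camera rig should move.
    float eyeLevelError(const Eyes& eyes, float targetY) {
        float avgY = 0.5f * (eyes.leftY + eyes.rightY);
        return avgY - targetY;
    }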

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

According to the schedule, I’m supposed to be doing gesture recognition algorithm testing right now, but I’ve been strapped for time and haven’t gotten to it. More refinement needs to be done to the algorithm to make it robust enough for our application. To catch up, I will allocate more time in my schedule for this project.

What deliverables do you hope to complete in the next week?

I plan to refine the algorithm to be more robust against inaccurate readings from OpenPose, and to do some performance fine-tuning so that keypoint collection happens faster and the algorithm can be more accurate. The inputs it detects are also pretty limited; we will have to discuss which gestures we want. I also don’t have the Jetson currently, and I know I’ve been putting this off every week since the second week, but I have yet to build OpenPose on the Jetson.

Steven’s Status Report for March 8

What did you personally accomplish this week on the project?

I worked on integrating the C++ API for OpenPose into our application, and did some fine-tuning for performance and accuracy. Keypoints are now available for use in our application for eye tracking and gesture control. I also did some research on gesture recognition algorithms. I think a good starting point is basing recognition purely on the velocity of a keypoint (e.g., the left hand moving quickly to the right); a rough sketch of that idea is below.
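
A minimal sketch of the velocity-based idea, under a few assumptions: the x-position comes from a single OpenPose keypoint (e.g., the left wrist) each frame, the threshold is illustrative, and the class name is hypothetical.

    #include <optional>

    enum class Swipe { None, Left, Right };

    // Detects a quick horizontal motion of a single keypoint.
    class SwipeDetector {
    public:
        explicit SwipeDetector(float threshold) : threshold_(threshold) {}

        // x: keypoint x-position this frame; dt: seconds since the last frame.
        Swipe update(float x, float dt) {
            Swipe result = Swipe::None;
            if (lastX_.has_value() && dt > 0.0f) {
                float vx = (x - *lastX_) / dt;        // horizontal velocity estimate
                if (vx > threshold_)  result = Swipe::Right;
                if (vx < -threshold_) result = Swipe::Left;
            }
            lastX_ = x;
            return result;
        }

    private:
        float threshold_;
        std::optional<float> lastX_;
    };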

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Roughly on schedule. With OpenPose now integrated into the application, developing gesture control should be straightforward.

What deliverables do you hope to complete in the next week?

Complete gesture control algorithm. Also, I have yet to compile the project on the Jetson.