Steven’s Status Report for Feb 8

What did you personally accomplish this week on the project?

This week I worked on finding libraries for eye tracking and gesture recognition (~4 hrs) and started on the eye-tracking implementation. One library I found that can handle both tasks is OpenPose, which can also be built on a Jetson. Because we don’t have most of our parts yet, I started developing the code locally. A lot of time (~5 hrs) went into getting OpenPose to build locally (no prebuilt binaries for Unix/Linux 🙁); several dependencies (e.g., Caffe) didn’t build on an M1 Mac, so I had to switch to my x86 Linux machine. I was able to get a demo program tracking facial and body keypoints running on the CUDA backend at ~5 fps. More work will have to be done on optimizing latency/fps.
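
Since one of the next steps is moving from the prebuilt demo to our own code, below is a minimal sketch of what an equivalent capture-and-track loop could look like through OpenPose’s Python bindings. This is an assumption-heavy sketch rather than our final code: the model path, camera index, and resolution flag are placeholders, and the exact emplaceAndPop call varies slightly between OpenPose versions.

```python
# Minimal sketch of a capture-and-track loop through OpenPose's Python API.
# Assumes pyopenpose was built with the Python bindings enabled and is on
# PYTHONPATH; the model path, camera index, and resolution are placeholders.
import cv2
import pyopenpose as op

params = {
    "model_folder": "openpose/models/",  # placeholder path to the OpenPose models
    "face": True,                        # also run the 70-point face detector
    "net_resolution": "-1x368",          # lower this to trade accuracy for fps
}

opWrapper = op.WrapperPython()
opWrapper.configure(params)
opWrapper.start()

cap = cv2.VideoCapture(0)                # placeholder camera index
while True:
    ok, frame = cap.read()
    if not ok:
        break
    datum = op.Datum()
    datum.cvInputData = frame
    # Older OpenPose builds use opWrapper.emplaceAndPop([datum]) instead.
    opWrapper.emplaceAndPop(op.VectorDatum([datum]))
    # datum.poseKeypoints: (people, 25, 3); datum.faceKeypoints: (people, 70, 3)
    cv2.imshow("OpenPose demo", datum.cvOutputData)
    if cv2.waitKey(1) == 27:             # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```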

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Progress is a little behind because we don’t have the Jetson Nano yet. Until we get one, a possible route would be to do development inside something like a Docker container. However, given the current scope of the project, I think it is fine to continue developing natively on Linux.

What deliverables do you hope to complete in the next week?

One goal is to build OpenPose on the Jetson and have at least the demo program running. Another goal is to figure out the Python API for OpenPose so I can extract keypoints for the eyes and implement the gesture recognition algorithm.
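
As a rough sketch of what that extraction might look like, here are hypothetical helpers for pulling eye keypoints out of OpenPose’s 70-point face model and doing a trivial “hand raised” check on the BODY_25 pose output. The keypoint indices follow OpenPose’s documented output formats, but the function names, thresholds, and the specific gesture are placeholders for illustration only.

```python
# Hypothetical helpers for next week's goals: extract eye keypoints from the
# 70-point face model and run a trivial "hand raised" check on BODY_25 output.
# Keypoint indices follow OpenPose's documented formats; names are placeholders.
import numpy as np

RIGHT_EYE = list(range(36, 42))   # face keypoints 36-41
LEFT_EYE = list(range(42, 48))    # face keypoints 42-47
PUPILS = [68, 69]                 # face keypoints 68-69

def eye_keypoints(face_keypoints, person=0, conf_thresh=0.3):
    """Return (x, y) eye/pupil points for one person, dropping low-confidence ones."""
    pts = face_keypoints[person][RIGHT_EYE + LEFT_EYE + PUPILS]
    return pts[pts[:, 2] > conf_thresh, :2]

def hand_raised(pose_keypoints, person=0, conf_thresh=0.3):
    """True if either wrist (BODY_25 index 4/7) is above its shoulder (index 2/5)."""
    kp = pose_keypoints[person]
    for wrist, shoulder in ((4, 2), (7, 5)):
        if kp[wrist][2] > conf_thresh and kp[shoulder][2] > conf_thresh:
            if kp[wrist][1] < kp[shoulder][1]:   # image y grows downward
                return True
    return False
```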

Team Status Report for Feb 8

 

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

 

Our system will feature a camera control mechanism that adjusts each camera’s position based on the user’s movements. The control system consists of three camera rigs, each combining a linear-motion axis with panning and tilting. A RealSense camera will be mounted at the top of the display, capable of horizontal movement along with panning and tilting. Additionally, two webcams with a similar setup will be responsible for vertical movement while also supporting panning and tilting.

To achieve precise control over the cameras, we will use an Arduino to interface with the motorized actuators. The Arduino will receive real-time data on the user’s position, movement, and angles from the computer vision and tracking algorithms (processed on the Jetson). Based on this data, the Arduino will adjust the cameras accordingly, ensuring that the virtual overlays remain properly aligned with the user’s face.
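
To illustrate the intended data flow, here is a minimal Jetson-side sketch that turns a tracked face offset into pan/tilt targets and sends them to the Arduino over serial. The serial port, the “P/T” message format, and the field-of-view constants are placeholders rather than a finalized protocol.

```python
# Jetson-side sketch: convert a normalized face offset from the tracker into
# pan/tilt corrections and send them to the Arduino over serial. The
# "P<pan> T<tilt>" message format and the port name are placeholders.
import serial

FOV_DEG = (69.0, 42.0)   # approximate RealSense RGB field of view (horizontal, vertical)

def offsets_to_angles(dx_norm, dy_norm):
    """Map normalized image offsets in [-1, 1] to pan/tilt corrections in degrees."""
    return dx_norm * FOV_DEG[0] / 2.0, dy_norm * FOV_DEG[1] / 2.0

ser = serial.Serial("/dev/ttyACM0", 115200, timeout=0.1)  # port is a guess

def send_target(pan_deg, tilt_deg):
    """Send one pan/tilt target to the Arduino, which drives the motors."""
    ser.write(f"P{pan_deg:.1f} T{tilt_deg:.1f}\n".encode())
```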

One of the most significant risks in our project is ensuring that the camera dimensions are compatible with the premade rig design, particularly for the pan and tilt mechanism. Since the rig has many moving parts, even slight misalignments could lead to unstable movement (especially jittery motion) or poor tracking. To mitigate this, I will adjust the CAD files and verify all measurements before printing the parts. Additionally, I will test the motors beforehand to ensure they function smoothly. To reduce jittery movements, I will implement controlled speed adjustments and include a brief resting period after movement to allow the motors to stabilize.
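
A sketch of what the controlled speed adjustments and resting period could look like in the control loop is below; the dead-band, step, and settle constants are guesses that will need tuning on the actual rig.

```python
# Rough sketch of the anti-jitter policy: ignore tiny corrections (dead band),
# cap how far an angle can move per update, and pause briefly after each move
# so the motor settles. All constants are guesses to be tuned on the real rig.
import time

DEAD_BAND_DEG = 1.0   # ignore corrections smaller than this
MAX_STEP_DEG = 2.0    # largest change allowed per update
SETTLE_S = 0.05       # resting period after each move

def step_toward(current_deg, target_deg):
    """Return the next commanded angle, moving toward the target in limited steps."""
    error = target_deg - current_deg
    if abs(error) < DEAD_BAND_DEG:
        return current_deg                           # within dead band: hold still
    step = max(-MAX_STEP_DEG, min(MAX_STEP_DEG, error))
    time.sleep(SETTLE_S)                             # let the motor stabilize
    return current_deg + step
```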

Another risk is ensuring that the motors respond accurately to the Arduino’s commands. Before integrating the motors into the camera system, I will perform basic functionality tests to confirm their responsiveness. I will also take advantage of Arduino’s built-in motor control libraries to fine-tune movements for precision.

To ensure proper synchronization between the camera movement and the AR system, we will conduct individual component testing before proceeding with full system integration. If issues arise, debugging will be more manageable since we will already know which part of the system requires improvement.

Since unforeseen problems could still occur, we have built buffer time into our project schedule to accommodate troubleshooting and necessary modifications.

 

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

 

We are replacing the Kinect depth camera with an Intel RealSense camera plus two webcams, one on each side. This change is necessary because Kinect cameras are no longer in production and are difficult to obtain. The RealSense offers the same functionality at a slightly higher cost ($300 instead of $200). The change won’t affect the overall functionality of the project, but it does require extra coding to integrate and process data from the new camera setup, so the hardware swap comes at the expense of additional development time for software adjustments. To manage this, we’ll focus on optimizing the code for depth and vision processing, making use of existing libraries and frameworks to streamline integration, and we’ll conduct thorough testing to ensure the new setup maintains the required accuracy and performance.
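
To give a sense of the extra integration code, here is a minimal capture sketch for the new setup using pyrealsense2 and OpenCV. The stream resolutions, frame rates, and webcam device indices are assumptions and will change once the hardware arrives.

```python
# Minimal capture sketch for the new setup: RealSense depth + color plus two
# USB webcams. Resolutions, frame rates, and device indices are assumptions.
import cv2
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)                 # align depth to the color frame

webcams = [cv2.VideoCapture(i) for i in (2, 4)]   # side webcams (indices TBD)

try:
    frames = align.process(pipeline.wait_for_frames())
    depth = np.asanyarray(frames.get_depth_frame().get_data())   # uint16 depth units
    color = np.asanyarray(frames.get_color_frame().get_data())   # BGR color image
    side_views = [cam.read()[1] for cam in webcams]
finally:
    pipeline.stop()
    for cam in webcams:
        cam.release()
```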

Provide an updated schedule if changes have occurred.

 

 

We are behind schedule since we haven’t received the materials and equipment yet. Once we get them, we plan on catching up. Steven pushed the eye-tracking implementation back a week, and Anna pushed the camera control system assembly back a week since she could not get the materials and parts yet.


 

 

Shengxi’s Status Report for Feb 8th

Accomplishments This Week:

  • Planning and Research:

The flow diagram illustrates the high-level process for the 3D reconstruction pipeline using the Intel RealSense and side webcams. The process begins with a one-time synchronization and calibration of the RealSense camera with the webcams, so that we know the relative position of each webcam with respect to the RealSense and can render the resulting images from their perspectives. This step involves ensuring that all cameras are spatially and temporally aligned using calibration techniques such as checkerboard or AprilTag patterns. The goal is to establish a unified coordinate system across all devices to facilitate accurate data capture and reconstruction.
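
Below is a sketch of what the checkerboard-based extrinsic calibration between the RealSense color camera and one side webcam could look like, assuming the intrinsics of both cameras are already known; the board size, square size, and function names are placeholders.

```python
# Sketch of checkerboard-based extrinsic calibration between the RealSense
# color camera and one side webcam. Assumes both sets of intrinsics are known;
# the board geometry and all names are placeholders.
import cv2
import numpy as np

BOARD = (9, 6)       # inner corners per row/column (placeholder)
SQUARE_M = 0.025     # checkerboard square size in meters (placeholder)

# Board corner positions in the board's own coordinate frame.
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE_M

def find_corners(gray):
    """Detect and refine checkerboard corners in one grayscale image."""
    ok, corners = cv2.findChessboardCorners(gray, BOARD)
    if ok:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    return ok, corners

def calibrate_pair(obj_pts, rs_pts, cam_pts, K_rs, d_rs, K_cam, d_cam, size):
    """Extrinsics from frame pairs where the board is visible in both views."""
    _, _, _, _, _, R, T, _, _ = cv2.stereoCalibrate(
        obj_pts, rs_pts, cam_pts, K_rs, d_rs, K_cam, d_cam, size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    return R, T   # maps points from the RealSense frame into the webcam frame
```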

Following calibration, the depth capture phase begins, where the RealSense camera captures depth frames of the user. The captured data is then processed using Dynamic Fusion, a technique for non-rigid surface reconstruction. This step updates and refines the 3D face model in real time, accommodating facial deformations and changes.

Once the face model is processed, the pipeline evaluates whether the 3D face model contains holes or incomplete areas. If gaps are detected, the system loops back to Dynamic Fusion processing to refine the model further and fill in the missing parts. If the model is complete, the pipeline proceeds to facial movement tracking, where the system monitors the user’s facial movements and angles across frames. This step is not too demanding, since the head moves rigidly and only one 6DoF transform needs to be estimated per frame.
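
One standard way to recover that single per-frame 6DoF transform is a least-squares rigid fit (the Kabsch/SVD solution) between corresponding 3D face points from a reference frame and the current frame; the sketch below assumes the correspondences are already given.

```python
# Per-frame rigid fit: given corresponding 3D face points from a reference
# frame and the current frame, recover one 6DoF transform (R, t) with the
# standard SVD (Kabsch) solution. Correspondences are assumed to be given.
import numpy as np

def rigid_transform(src, dst):
    """Least-squares R, t such that dst ~= src @ R.T + t; src, dst are (N, 3)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:   # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t
```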

Finally, the reconstructed model is aligned with the side webcams in the 3D model alignment step, using the 6DoF transform we computed. This ensures the resulting AR overlay stays accurately anchored to the user’s face.
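
As a sketch of how the alignment step could chain the transforms together: apply the per-frame head pose to the reconstructed model, re-express it in a webcam’s frame using the extrinsics from calibration, and project it for the overlay. The argument names and shapes are assumptions.

```python
# Alignment sketch: move the reconstructed model by the per-frame head pose,
# re-express it in a side webcam's frame via the calibration extrinsics
# (R_cam, t_cam), and project it for the AR overlay. Shapes are assumptions.
import cv2
import numpy as np

def project_model(model_pts, R_head, t_head, R_cam, t_cam, K_cam, dist_cam):
    """(N, 3) model-space points -> (N, 2) pixel coordinates in the webcam."""
    world_pts = model_pts @ R_head.T + t_head        # apply per-frame head pose
    rvec, _ = cv2.Rodrigues(R_cam)                   # extrinsics as rvec/tvec
    pix, _ = cv2.projectPoints(world_pts.astype(np.float64), rvec,
                               t_cam.astype(np.float64), K_cam, dist_cam)
    return pix.reshape(-1, 2)
```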

  • Dynamic Fusion Investigation:
    • Discovered an OpenCV package for DynamicFusion, which can be utilized for non-rigid surface tracking in our system.
    • Began reviewing relevant documentation and testing standalone implementations.
  • Calibration Code Development:
    • Wrote the AprilTag and checkerboard-based calibration code.
    • Cannot fully test it yet, as the RealSense hardware is required for debugging and system implementation (still waiting on the RealSense camera).

Project Progress:

    • Slightly behind due to missing hardware required for implementation (I need the RealSense in order to know the input format of the depth information and to integrate Dynamic Fusion), but I was able to complete the design and most of the implementation.
    • I am currently using Blender to render scenes with a depth map and RGB image to simulate RealSense data for my program; I have a working version, but it is slower than the requirement (around a few seconds per frame). A sketch of one way the simulated depth can be lifted into a point cloud appears after this list.
    • Completed the camera calibration code, but I also cannot test whether it works due to the lack of hardware.
    • Another concern is that while OpenCV’s package works locally, the environment setup is quite challenging and might be hard to migrate to the Jetson.
  • Actions to Catch Up:
    • Continue refining the non-rigid transform calculation process using DynamicFusion.
    • The RealSense was ready for pickup; I will integrate RealSense measurements into my code.
    • I also need to check how necessary DynamicFusion is, since it adds a significant delay to the 3D reconstruction pipeline.
  • By next week, I hope to run the RealSense with my code and validate initial results. By the next status report, I should be able to attach images of a 3D reconstruction of my face produced by the pipeline I wrote.
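
As referenced above, here is a sketch of one way the Blender-rendered depth maps can be lifted into a point cloud to stand in for RealSense input; the pinhole intrinsics must match the Blender camera settings and are placeholders here.

```python
# Back-project a Blender-rendered depth map into a point cloud to stand in for
# RealSense input. fx, fy, cx, cy and the depth units must match the Blender
# camera settings; values here are placeholders.
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """depth: (H, W) array in meters -> (M, 3) points in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]   # drop pixels with no valid depth
```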

Anna’s Status Report for Feb 8

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).  

     This week, I worked on designing the control system for the camera’s rotation, tilt, and vertical movement. At first, I considered building a custom camera rig from scratch using servos for rotation and tilt and linear actuators for vertical movement. However, I realized that integrating these components with the Arduino could be tricky, and that an unpolished design of my own might lead to wasted time and money. To make the process more efficient, I decided to research existing designs that could be adapted for our project so that we start from something known to work.

     I came across several options, but most were either too complex or lacked clear instructions. One design stood out because it allowed for pan, tilt, and vertical movement, which is exactly what we need for our augmented reality mirror. I would have three of these set up on the mirror: one mounted horizontally at the top for the RealSense camera and two mounted vertically on the sides for the webcams. However, the design required a lot of material prep, had minimal step-by-step guidance, and involved a more complicated assembly.
Reference: https://www.youtube.com/watch?v=hEBjbSTLytk 

     I also spent time researching different motor options for controlling the cameras, focusing on cost and ease of implementation. After looking at multiple designs, I chose one that is similar to the first but includes 3D-printable parts, making it much easier to put together. This design also provides a full list of required parts, estimated costs, and dimensions, which helped me confirm that it would work with our webcams and budget. I will have to adjust the dimensions of the parts to fit the RealSense camera (which is longer), and I made sure the design can be smoothly integrated into the overall project.
Reference: https://www.instructables.com/Automatic-Arduino-Powered-Camera-Slider-With-Pan-a/

     I decided to use an Arduino as the main controller because it’s easy to program, has strong support for motor control, and works well with both servo motors and linear actuators. Its built-in libraries make it simple to create precise movements, allowing for smooth camera adjustments in all directions. Plus, it leaves room for future improvements, like adding user controls or automating movement based on environmental data.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

     My progress is a little behind. This week we had our proposal presentation, and I started designing the camera control system when I was supposed to start assembling it. I researched all the necessary materials and assessed whether the design is feasible for our project given the fixed budget and time frame. I will go to TechSpark on the 9th to see which materials I can borrow and fill out the purchase form so that I can get the remaining materials as soon as possible. Once I have all the materials, I can start building this week, so that I can work on programming the control system and integrating it into the mirror later on.

What deliverables do you hope to complete in the next week?

     Next week, I hope to have my digitally fabricated parts ready so that I can assemble them the week after. I will have to adjust the dimensions in the STL files so that our cameras fit the designs. I also hope to have my other materials ordered and delivered so that everything is ready for assembly the week after next.