Caroline’s Status Report for 04/06

I continued to work on both the web application and the projector interface. Both components are mostly functional but still need more work. For example, the web application currently runs on hard-coded data, so I am working on loading recipe data dynamically from files on the device. I also worked on integrating the calibration step into the user flow. For the demo, we assumed that an old calibration was already saved, so I am now adding the option for the user to start a new calibration and wait for the script to finish in the backend.
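
Below is a minimal sketch of the dynamic recipe loading I have in mind, assuming recipes are stored as JSON files in a directory on the device (the directory name and the "name" field are placeholders, not the final schema):

```python
import json
from pathlib import Path

# Placeholder location for recipe files on the device; the real path and
# schema may differ.
RECIPE_DIR = Path("recipes")

def load_recipes(recipe_dir: Path = RECIPE_DIR) -> list[dict]:
    """Load every *.json recipe file so the web app no longer needs hard-coded data."""
    recipes = []
    for path in sorted(recipe_dir.glob("*.json")):
        with path.open() as f:
            recipes.append(json.load(f))
    return recipes

if __name__ == "__main__":
    for recipe in load_recipes():
        # Assumes each recipe has a "name" field; adjust to the real schema.
        print(recipe.get("name", "unnamed recipe"))
```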

I am on schedule.

This upcoming week, I will install the wifi card on the AGX and make sure that the network interface is set up properly. I will also try to reduce the latency of the running processes that we noticed during the demo. I want to focus on these things this coming week and then work on the rest of the UI if I have time.

Now that you have some portions of your project built and are entering the verification and validation phase of your project, provide a comprehensive update on what tests you have run or are planning to run. In particular, how will you analyze the anticipated measured results to verify that your contribution to the project meets the engineering design requirements or the use case requirements?

One module that I am in charge of is the voice commands module. In the design review, we outlined tests that I was planning to run to verify that the voice commands work. One aspect is the latency of the voice commands. During the demo, I noticed that the latency was higher than expected, so we will run tests where we say each command and measure how long it takes for the command to be recognized by the script. It should only take 2-3 seconds to register a command. This test is important for figuring out whether the latency problem lies within the voice command module itself or arises when it is integrated. Another test checks the accuracy of the commands themselves: we will say commands and count how many the system correctly identifies. Additionally, we will repeat the test with people talking in the background to measure accuracy in that environment.
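
A minimal sketch of how these latency trials could be logged, assuming the voice module exposes a blocking recognition call (the recognize() name and trial format are placeholders):

```python
import time
import statistics

def time_command_trials(recognize, commands, trials_per_command=10):
    """Measure how long it takes for each spoken command to be registered.

    `recognize` stands in for the voice module's blocking recognition call;
    in practice we would speak the command aloud and time from the end of the
    utterance to the moment the script registers it.
    """
    results = {}
    for command in commands:
        latencies = []
        for _ in range(trials_per_command):
            start = time.monotonic()
            recognize()  # blocks until a command is detected
            latencies.append(time.monotonic() - start)
        results[command] = {
            "mean_s": statistics.mean(latencies),
            "max_s": max(latencies),
        }
    return results

# Target: each command registers within 2-3 seconds.
```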

Another module that I am in charge of is the web application. I will test each possible navigation path to make sure there are no errors as a user traverses the website. I will also test the display on different devices, both physically and in simulators, to confirm that the interface spacing and component sizes are styled correctly on different screen sizes.

To verify the projector interface, I will check that all the required elements are present and function correctly together, and that different components, such as the video and the timer, work simultaneously.

Sumayya’s Status Report for 4/06

Progress Update:

From Last Week:

  • Set up camera with system – done
  • Test buttons and hand tracking with livestream video – done
  • Test reID with livestream video – done
  • Integrate object tracking with camera / livestream – not done
    • Re-define how much to track and what user will see
  • Start processing the recipe – not done

I was able to demo gesture tracking mapped to button and swipe actions during the interim demo. This included setting up the external camera and testing gesture recognition/tracking with the projector screen. There was a large learning curve in figuring out how to use Meta’s API functions for a livestream, but I was able to run my code with a live video feed.

Something I noticed was that there was a lot of latency in recognizing the gestures. I need to see whether this was because of distance, image quality, or too much processing happening at once.

I also implemented part of the calibration script that looks at the projected image and determines each button’s region and each swipe region. This was tested with a video input and worked very well, but it is harder with a projection due to lighting and distance.
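
One way the button-region detection could be prototyped with OpenCV, assuming the projected buttons show up as bright regions against a darker background (the threshold and area values are illustrative, not the exact approach in the calibration script):

```python
import cv2

def find_button_regions(frame_bgr, min_area=2000):
    """Return bounding boxes of bright projected regions that could be buttons.

    Illustrative threshold-and-contour approach; the real calibration script
    may rely on different cues (color, known layout, markers).
    """
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    regions = []
    for contour in contours:
        if cv2.contourArea(contour) >= min_area:
            regions.append(cv2.boundingRect(contour))  # (x, y, w, h)
    return regions
```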

Schedule:

Slightly behind: Need to make more progress on object tracking since reID is complete.

Next Week Plans: 

  • Improve accuracy and latency of detecting a hand gesture
  • Add object tracking with live video
  • Set up Arducam camera with AGX (we were using the Etron camera, but it has too much of a fisheye effect and its fps is not compatible with our projector)
  • Help with recipe processing

Verification Plans:

Gesture Accuracy

  • Description: for each gesture, attempt to execute it on the specified region and note whether the system recognizes it correctly
  • Gestures:
    • Start Button
    • Stop Button
    • Replay Button
    • Prev Swipe
    • Next Swipe
  • Goal: 90% accuracy

 

Gesture Recognition Latency

  • Description: for each gesture, attempt to execute it on the specified region and measure how long the system takes to recognize the gesture
  • Goal: 3 seconds

Gesture Execution Latency

  • Description: for each gesture, attempt to execute it on the specified region and measure how long the system takes to execute the gesture once it has been recognized (a simple trial-logging harness for these gesture tests is sketched below)
  • Goal: 1 second
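
A minimal sketch of how the gesture accuracy and latency trials above could be tallied, assuming each trial records the intended gesture, what the system recognized, and the two latencies (all names here are placeholders):

```python
from dataclasses import dataclass

@dataclass
class GestureTrial:
    intended: str         # e.g. "start_button", "next_swipe"
    recognized: str       # what the system reported, or "" if nothing
    recognition_s: float  # time from performing the gesture to recognition
    execution_s: float    # time from recognition to the action firing

def summarize(trials):
    """Report accuracy and latency against the 90% / 3 s / 1 s goals."""
    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    correct = [t for t in trials if t.recognized == t.intended]
    return {
        "accuracy": len(correct) / len(trials),                          # goal: >= 0.90
        "mean_recognition_s": mean([t.recognition_s for t in correct]),  # goal: <= 3 s
        "mean_execution_s": mean([t.execution_s for t in correct]),      # goal: <= 1 s
    }
```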

Single Object Re-ID Detection Accuracy

  • Description: how accurately a single object is detected in a frame. An image of the object will first be taken, and the system must be able to detect this object again using the reference image.
  • Goal: 90% accuracy

 

Single Object Tracking Accuracy 

  • Description: single object can be smoothly tracked across the screen
  • Goal: given a set of continuous frames, the object should be tracked in 80-90% of the frames.

 

Multi Object Tracking Accuracy 

  • Description: multiple objects can be smoothly tracked across the screen
  • Goal: given a set of continuous frames, all intended objects should be tracked in 80-90% of the frames (a frame-fraction sketch follows below).
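
A minimal sketch of how the tracked-frame fraction could be computed for both the single- and multi-object cases, assuming we hand-label ground-truth boxes for a short clip and compare them to the tracker’s boxes with an IoU threshold (the threshold and box format are assumptions):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    x1, y1 = max(ax, bx), max(ay, by)
    x2, y2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def tracked_fraction(predicted, ground_truth, iou_threshold=0.5):
    """Fraction of frames where the tracker's box overlaps the labeled box.

    `predicted` holds the tracker's per-frame box (or None when the object is
    lost); `ground_truth` holds the hand-labeled box for the same frames.
    Goal: 0.8-0.9 over a continuous clip, per tracked object.
    """
    hits = sum(
        1 for pred, gt in zip(predicted, ground_truth)
        if pred is not None and iou(pred, gt) >= iou_threshold
    )
    return hits / len(ground_truth)
```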

 

Tahaseen’s Status Report for 03/30

This week, I worked on translating the warp homography code to Flutter and tried it on the actual planned projector setup. Previously, I had been using a smaller projector for testing purposes. There have been several snags in designing the calibration for the initial warping. Since the current method requires user input for the homography, the process is harder and less user friendly than intended. The checkerboard is meant to make this easier but is still under quality testing. Integrating with Flutter has been more difficult than anticipated given the different hierarchical structure. Furthermore, I think it may be necessary to design a lip for the new projector stand to ensure the projector sits comfortably on the stand while optimizing the projection on the table.
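
Since the port itself is in Flutter, the following is only a rough Python/OpenCV sketch of the checkerboard idea for reference; the board size, image sources, and specific OpenCV calls are assumptions rather than the code we are shipping:

```python
import cv2

# Assumed 9x6 inner-corner checkerboard; the actual board may differ.
PATTERN = (9, 6)

def checkerboard_homography(camera_img, reference_img):
    """Map points in the camera image to the projected reference image.

    Detecting the same checkerboard in both images lets us fit the homography
    automatically, so the user does not have to click correspondences by hand.
    """
    found_cam, corners_cam = cv2.findChessboardCorners(
        cv2.cvtColor(camera_img, cv2.COLOR_BGR2GRAY), PATTERN)
    found_ref, corners_ref = cv2.findChessboardCorners(
        cv2.cvtColor(reference_img, cv2.COLOR_BGR2GRAY), PATTERN)
    if not (found_cam and found_ref):
        return None  # calibration failed; ask the user to retry
    H, _ = cv2.findHomography(corners_cam, corners_ref, cv2.RANSAC)
    return H
```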

I am on track this week, but plan to spend significant time on streamlining the calibration procedure.

Team Status Report for 3/30

Our biggest risk this week is the hardware setup and the projector homography/calibration. Upon testing with our new projector mount, we found that we have to set up the projector to the side of the user rather than across from the user. This means the projection homography logic needs to be recalculated for the new rotation. Additionally, we realized that there are many more factors impacting the ratio and size of the projection internal to the projector, such as the keystone value. Both the projector setup and the homography calculations need more testing for better results.

No changes to schedule. We are on track.

Sumayya’s Status Report for 3/30

Progress Update:

From Last Week:

  • Implement button actions using gesture recognition – done
  • Implement Swipe Gesture only in a region of interest – done
    • currently this has only been implemented for functionality and not integrated into TC

I completed the integration for gestures and hand tracking. Now my feature can take in a button location on the frame and then check whether that button is being pressed. It can also check whether there is a swipe motion in a specific region to indicate a next or prev action. I added a publisher/subscriber implementation that bridges communication between the main controller module and the CV module. Essentially, once the CV module detects a gesture, it sends a message with the corresponding command to the CV topic.
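
A minimal sketch of what that publishing side could look like, assuming the ZeroMQ pub/sub setup described in Caroline’s report below (the port number, topic string, and message format are placeholders):

```python
import zmq

context = zmq.Context()
publisher = context.socket(zmq.PUB)
publisher.bind("tcp://*:5556")  # placeholder port

def publish_gesture(command: str) -> None:
    """Publish a detected gesture (e.g. "start", "next", "prev") on the CV topic."""
    publisher.send_string(f"cv {command}")
```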

Video of Button Press:

https://drive.google.com/file/d/1UqzU4HLCSh_pWWE0EWWv2mxO4PwzIQHu/view?usp=share_link

Schedule: On Track!

Next Week Plans:

  • Set up camera with system
  • Test buttons and hand tracking with livestream video
  • Test reID with livestream video
  • Integrate object tracking with camera / livestream
    • Re-define how much to track and what user will see
  • Start processing the recipe

 

 

Caroline’s Status Report for 03/30

I worked on backend integration. I was able to build a web page for the web app user interface, launch it on a server using Python, and create a socket that connects the user interface interactions with the rest of the system. I also implemented a pub/sub model instead of a queue because it is an easier and more efficient way to send messages: multiple modules need to be able to see values pushed from another module. I used ZeroMQ in Python to do this and successfully implemented it in the web app user interface. Additionally, I have been working with my team on integrating the system at large.
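
As a counterpart to the CV-side publisher sketched earlier, here is a minimal sketch of the subscribing side (same placeholder port and topic; not the exact code in our web app):

```python
import zmq

context = zmq.Context()
subscriber = context.socket(zmq.SUB)
subscriber.connect("tcp://localhost:5556")         # placeholder port
subscriber.setsockopt_string(zmq.SUBSCRIBE, "cv")  # only listen to the CV topic

while True:
    topic, command = subscriber.recv_string().split(" ", 1)
    # Forward the command to the UI / main controller as appropriate.
    print(f"received {command} on topic {topic}")
```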

On schedule.

Next week, I will continue to integrate with my team by implementing the pub/sub model in other modules and also polishing the UI for demo day.

Team Status Report for 3/23

There are two significant risks. One is the projector, which has been an ongoing risk: the risk of knocking over or dropping the projector is still a concern because we have not finished this portion of the project. However, we have figured out a way to somewhat mitigate this risk, which will be discussed in the system design changes paragraph. The other risk is the object re-identification. It might not work as well once fully implemented due to lighting conditions. Right now, the implementation uses the UI wireframe, which has fewer uncertainties than a real kitchen. We will have to test to see how much of a problem this will be, if it is one at all. To manage this, we will experiment with different lighting conditions.

One change that we have made is simplifying the projector mount. Instead of having the user set different angles during calibration, we will instead fit the projection to any angle – within reason (the projector still must be pointed at the table). We are doing this by computing a homography between a sample image taken by the camera and the image on the screen. Now, we do not need to make the projector mount rotate up and down. The change will make the mount cost less because we do not need a rotation mechanism.

No changes to schedule. We are on track.

Caroline’s Status Report for 03/23

I completed the user interface for the web application where the user chooses the recipe. This involved designing the user interface in Figma and then implementing it in Flutter.

On schedule.

Next week, I will work on connecting this UI to the backend and also work on the network interface on the AGX.

Tahaseen’s Status Report for 03/23

This week, I spent some time working with the Xavier and tried to connect the camera to it so that we could do some preliminary testing. However, there were a few roadblocks that should be resolved this week as we work to onboard the Xavier. On the other hand, I revisited the projection method so that it utilizes the camera and prevents the user from having to mess with the projector setup. This involves another step in the calibration process that creates the homography based on the existing camera and projector setup. Furthermore, I have designed the first iteration of the hardware mount, which will be 3D printed this week unless an existing metal/wood solution is found. There is some concern that the filament will not hold up.

I am on track. This week, I plan to work on completing the calibration procedure and the mount, and sorting out our central compute.

Sumayya’s Report for 3/23

Progress Update:

I finished implementing the Object Re-Identification logic. This feature can detect the ingredients grid in the UI and then see which cells are occupied. Once it identifies the occupied cell locations, it parses the JSON file that contains the list of ingredients that should be present in the grid. Each item has a property called “cell_pos” with the expected row and column of the ingredient, along with properties such as “name”, “filename”, and “filepath”. This allows the program to retrieve the ingredient name and label the image of the ingredient captured from the grid.

As a result of this feature, I created modular functions that can easily add and remove ingredients from the JSON file and scan a single cell for an ingredient versus scanning the entire grid at once.
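
A minimal sketch of the kind of JSON lookup described above, assuming the file is a list of ingredient entries with the “cell_pos”, “name”, “filename”, and “filepath” properties (the file name and the exact cell_pos encoding are illustrative):

```python
import json

# Placeholder file name; the real ingredient list may live elsewhere.
INGREDIENTS_FILE = "ingredients.json"

def load_ingredients(path: str = INGREDIENTS_FILE) -> list[dict]:
    """Read the ingredient list that the re-ID logic checks against."""
    with open(path) as f:
        return json.load(f)

def ingredient_at(ingredients: list[dict], row: int, col: int) -> dict | None:
    """Return the ingredient expected at a given grid cell, if any."""
    for item in ingredients:
        if item["cell_pos"] == [row, col]:  # assumed [row, col] encoding
            return item  # also carries "name", "filename", "filepath"
    return None
```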

I used the following template of the UI (made by Caroline) to do my testing and added images of ingredients to reID. This is what the camera will see projected onto the table:

Here is a video demonstrating the reID process:

https://drive.google.com/file/d/1fGxNk6h5AN5JJqAoFRP-cgjNKqWtU1hy/view?usp=sharing

Schedule: On track!

Next Week Plans:

  • Implement button actions using gesture recognition
  • Implement Swipe Gesture only in a region of interest
    • currently this has only been implemented for functionality and not integrated into TC
  • Consider adding a cursor feature that follows user’s finger tip (beyond MVP)