Diya’s Status Report for 02/22/2024

This past week, I was catching up on work since I was sick the previous week and also had a midterm on Thursday. Despite that, I made significant progress on the project. I worked on the design presentation slides and presented them on Monday. I have also been working on the OpenCV gesture recognition setup and getting it to run locally on my computer. The setup is now complete, and I am currently testing the accuracy of the model. With gesture recognition working locally, the project is back on schedule, and I am ready to move forward with the next steps.
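
For reference, the local pipeline looks roughly like the sketch below: OpenCV grabs webcam frames and MediaPipe's pretrained gesture recognizer classifies each one. The option names follow the MediaPipe Tasks Python API as I understand it, and the gesture_recognizer.task model file is assumed to be downloaded locally, so treat this as a sketch rather than the final code.

    import cv2
    import mediapipe as mp
    from mediapipe.tasks import python as mp_tasks
    from mediapipe.tasks.python import vision

    # Load the pretrained gesture recognizer (model file assumed downloaded locally).
    options = vision.GestureRecognizerOptions(
        base_options=mp_tasks.BaseOptions(model_asset_path="gesture_recognizer.task"))
    recognizer = vision.GestureRecognizer.create_from_options(options)

    cap = cv2.VideoCapture(0)  # default webcam
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        result = recognizer.recognize(mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb))
        if result.gestures:
            top = result.gestures[0][0]  # top prediction for the first detected hand
            cv2.putText(frame, f"{top.category_name} ({top.score:.2f})", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("gesture recognition", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()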

For the upcoming week, I plan to:

  1. Continue testing the accuracy of the gesture recognition model (a rough testing sketch is included below).
  2. Work on the Figma design for the website interface.
  3. Start working on the networking portion of the project for the web app.
  4. Begin drafting the design review report for submission.
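
For item 1, the accuracy testing I have in mind is roughly the harness below: run the recognizer over a folder of labeled test images and count how often the top prediction matches the label. The test_images/<label>/ folder layout is just an assumption for illustration, not our actual dataset.

    import pathlib
    import mediapipe as mp
    from mediapipe.tasks import python as mp_tasks
    from mediapipe.tasks.python import vision

    options = vision.GestureRecognizerOptions(
        base_options=mp_tasks.BaseOptions(model_asset_path="gesture_recognizer.task"))
    recognizer = vision.GestureRecognizer.create_from_options(options)

    correct = total = 0
    # Assumed layout: test_images/<gesture_label>/<image>.jpg
    for image_path in pathlib.Path("test_images").glob("*/*.jpg"):
        expected = image_path.parent.name
        result = recognizer.recognize(mp.Image.create_from_file(str(image_path)))
        predicted = result.gestures[0][0].category_name if result.gestures else "None"
        correct += int(predicted == expected)
        total += 1
    print(f"accuracy: {correct}/{total} = {correct / max(total, 1):.2%}")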

Diya’s Status Report for 02/15/2024

This week was quite challenging, as I was sick for most of it. Last week I was recovering from a bacterial infection, and unfortunately this week I came down with the flu, which led to a visit to urgent care. Despite that, I was still able to contribute to the project, particularly in refining our approach to hand gesture recognition and pivoting my role so I can contribute more effectively.

Initially, I had misunderstood the gesture recognition task, thinking I needed to find a dataset and train a model myself. However, after further research, I realized that MediaPipe provides a pretrained gesture recognition model with 90% accuracy, meaning I could integrate it directly without training a new model. This required a shift in my focus, and I pivoted to handling the networking aspect of the project to add complexity and depth to my contribution.

Beyond that, I have been actively involved in facilitating group meetings, translating our use case requirements into quantitative design requirements, and preparing for the design review presentation this week.

Given my health issues, my progress is slightly behind where I initially wanted to be, but I have taken steps to ensure that I am back on track. Since the gesture recognition aspect is now streamlined with MediaPipe, I have moved focus to the networking component, which is a new responsibility. I am catching up by working on setting up the foundational pieces of the social network feature in our web app.

Next week, I plan to make significant progress on the networking component of the project. Specifically, I aim to set up user authentication so users can create accounts, implement user profiles (including cooking level, past recipe attempts, and preferences), and develop a basic social network feature where users can add friends and view each other's cooking activity.
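
We have not locked in the web framework yet, but to make this concrete, below is a rough sketch of what the profile and friendship data model could look like in a Django-style setup, where Django's built-in auth would handle account creation and login. The model and field names (Profile, RecipeAttempt, cooking_level, etc.) are placeholders, not a final schema.

    from django.conf import settings
    from django.db import models

    class Profile(models.Model):
        # One profile per authenticated user; accounts come from Django's built-in auth.
        user = models.OneToOneField(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
        cooking_level = models.CharField(max_length=32, default="beginner")
        preferences = models.TextField(blank=True)  # e.g. dietary restrictions, cuisines
        # Simple symmetric friendship for the basic social feature.
        friends = models.ManyToManyField("self", blank=True)

        def __str__(self):
            return f"{self.user.username} ({self.cooking_level})"

    class RecipeAttempt(models.Model):
        # Past recipe attempts shown on a user's profile and to their friends.
        profile = models.ForeignKey(Profile, on_delete=models.CASCADE, related_name="attempts")
        recipe_name = models.CharField(max_length=200)
        completed_at = models.DateTimeField(auto_now_add=True)
        notes = models.TextField(blank=True)

A symmetric ManyToManyField keeps the first version simple; friend requests and approval flows would be a later addition if we want them.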


Diya’s Status Report for 02/08/2024

This week my primary focus was researching gesture recognition algorithms and setting up the environment needed to begin implementation. Since I am relatively new to this field, I dedicated a significant amount of time to understanding the different approaches I could use for real-time gesture recognition and evaluating their feasibility for integration into the CookAR glasses.

I have detailed some algorithms I have researched below: 

  1. Google MediaPipe – MediaPipe Hand Tracking 
    1. MediaPipe Hand Tracking offers real-time hand tracking and provides 21 3D hand landmarks, which allows us to determine hand position, orientation, and gesture.
    2. It has >90% real-time accuracy and is designed to be lightweight, so it can run on a microcontroller with limited processing power.
    3. For the environment setup, I am using Python, specifically the MediaPipe Python package.
    4. Next steps include defining the gestures we want the algorithm to recognize (e.g., swipe left/right for next, open palm for pause) and then recording the landmark data for each gesture. After that, I will extract the relevant features from the landmark data, such as the distances between key joints, the angles between fingers, and the velocity of the hand movement.
    5. For gesture classification, I am planning to start with a simple rule-based approach that applies thresholds to those distances/angles (a rough sketch of this is included after this list). I also looked into more robust models, such as training a machine learning classifier on the extracted features, where TensorFlow Lite could run the model efficiently on the microcontroller. I will start with the rule-based approach and pivot to the more robust model if needed.
    6. Since we are using Unity for the AR display, I also have to create a script that receives the gesture data and updates the AR elements accordingly. This is something I am still looking into; a sketch of how the Python side could send gesture data to Unity is also included after this list.
  2. Hidden Markov Models for Dynamic Gestures: 
    1. HMMs recognize sequences of movements, so they would be ideal for dynamic gestures that involve many different hand positions over time.
    2. Training requires a dataset of recorded gesture sequences. I found some preliminary resources:
      1. https://www.visionbib.com/bibliography/contentspeople.html#Face%20Recognition,%20Detection,%20Tracking,%20Gesture%20Recognition,%20Fingerprints,%20Biometrics
      2. An American Sign Language dataset for recognizing basic gestures
    3. This could be implemented using TensorFlow, but it would need gesture sequence data for effective training.
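
To make the feature extraction and rule-based classification steps above concrete, here is a rough sketch of rule-based classification on top of MediaPipe's 21 hand landmarks: a finger counts as extended if its tip is farther from the wrist than its middle joint, and the count of extended fingers maps to a gesture. The rules and gesture names here are placeholders I would still need to tune against recorded landmark data, and swipe gestures would additionally need per-frame velocity, which this static sketch does not handle.

    import math
    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands

    def dist(a, b):
        # Euclidean distance between two normalized landmarks.
        return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))

    def classify(landmarks):
        """Very simple rule-based classifier over the 21 hand landmarks.
        A finger counts as extended if its tip is farther from the wrist
        than its middle (PIP) joint. The rules below are placeholders."""
        wrist = landmarks[0]
        tips, pips = [8, 12, 16, 20], [6, 10, 14, 18]  # index..pinky
        extended = sum(dist(landmarks[t], wrist) > dist(landmarks[p], wrist)
                       for t, p in zip(tips, pips))
        if extended >= 4:
            return "open_palm"   # e.g. pause
        if extended == 0:
            return "fist"
        return "unknown"

    cap = cv2.VideoCapture(0)
    with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                label = classify(results.multi_hand_landmarks[0].landmark)
                cv2.putText(frame, label, (10, 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
            cv2.imshow("rule-based gestures", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    cap.release()
    cv2.destroyAllWindows()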
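
For the Unity piece, I have not settled on how the Python gesture code will talk to Unity, but one lightweight option would be sending the recognized gesture label over a local UDP socket that a Unity script listens on. The port number and JSON message format below are placeholders.

    import json
    import socket
    import time

    # Placeholder address/port for a Unity UDP listener running on the same machine.
    UNITY_ADDR = ("127.0.0.1", 5065)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send_gesture(label, score):
        # Small JSON payload the Unity side can parse and map to AR UI updates.
        payload = {"gesture": label, "score": score, "timestamp": time.time()}
        sock.sendto(json.dumps(payload).encode("utf-8"), UNITY_ADDR)

    # Example: this would be called from the recognition loop whenever a gesture fires.
    send_gesture("open_palm", 0.93)

On the Unity side, a small C# script would read these messages and update the AR overlay; that part is still something I need to look into.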

Technical Challenges 

  1. I need to gather data in different lighting conditions and against different backgrounds to make sure the testing is robust.
  2. I can also synthetically create more training data by adding noise, varying the lighting, and rotating hand images (a small augmentation sketch is below).
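
A sketch of the kind of synthetic augmentation I have in mind, using OpenCV and NumPy; the noise level, brightness factors, and rotation angles are arbitrary values I would tune:

    import cv2
    import numpy as np

    def augment(image, angle=15, brightness=1.2, noise_std=10):
        """Create a synthetic variant of a hand image: rotate, adjust lighting, add noise."""
        h, w = image.shape[:2]
        # Rotate around the image center.
        rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        out = cv2.warpAffine(image, rot, (w, h))
        # Simulate different lighting by scaling pixel intensities.
        out = np.clip(out.astype(np.float32) * brightness, 0, 255)
        # Add Gaussian noise.
        out = out + np.random.normal(0, noise_std, out.shape)
        return np.clip(out, 0, 255).astype(np.uint8)

    # Example: generate a few variants of one captured hand image.
    img = cv2.imread("hand_sample.jpg")
    variants = [augment(img, angle=a, brightness=b)
                for a in (-15, 0, 15) for b in (0.7, 1.0, 1.3)]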

Setting up the Development Environment

Since this is my first time working with gesture recognition, I spent time getting the necessary tools and dependencies installed: 

  • Installed the necessary libraries: OpenCV, MediaPipe, and TensorFlow 
  • Configured Jupyter Notebook for testing different models and algorithms 
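
As a quick sanity check that everything installed correctly, I can import each library and print its version:

    import cv2
    import mediapipe as mp
    import tensorflow as tf

    print("OpenCV:", cv2.__version__)
    print("MediaPipe:", mp.__version__)
    print("TensorFlow:", tf.__version__)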

Progress Update

I would say that I am slightly behind schedule in terms of actual implementation but on track in terms of understanding the concepts and setting up the groundwork. The research and initial setup phase took longer than expected, but now that I have a better understanding of the algorithms and their implementation, I should be able to move forward with writing actual code.

To catch up, I plan to: 

  1. Run and analyze sample gesture recognition models in Python 
  2. Begin experimenting with CNN models for static gesture classification (a minimal sketch is included below) 
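
For item 2, a minimal Keras CNN along the lines below is what I have in mind as a starting point. The input size, number of gesture classes, and layer sizes are placeholders until we finalize the gesture set and dataset.

    import tensorflow as tf
    from tensorflow.keras import layers

    NUM_GESTURES = 5          # placeholder: swipe left/right, open palm, fist, none
    IMG_SIZE = (96, 96)       # placeholder input resolution

    model = tf.keras.Sequential([
        layers.Input(shape=(*IMG_SIZE, 3)),
        layers.Rescaling(1.0 / 255),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_GESTURES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()
    # Training would use a labeled dataset of static gesture images, e.g. via
    # tf.keras.utils.image_dataset_from_directory("gesture_data", image_size=IMG_SIZE)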

Next Week’s Deliverables: 

By the end of the week, I aim to have: 

  • A working MediaPipe hand tracking prototype capturing and displaying hand keypoints (see the sketch below) 
  • A basic CNN model for static gesture classification
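
For the first deliverable, the keypoint prototype should be close to the standard MediaPipe Hands pattern: capture webcam frames with OpenCV, run hand tracking, and draw the 21 landmarks on each frame. A rough sketch of what I am aiming for:

    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands
    mp_draw = mp.solutions.drawing_utils

    cap = cv2.VideoCapture(0)
    with mp_hands.Hands(max_num_hands=2, min_detection_confidence=0.7) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe works on RGB images; OpenCV gives BGR.
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                for hand_landmarks in results.multi_hand_landmarks:
                    # Draw the 21 keypoints and their connections on the frame.
                    mp_draw.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)
            cv2.imshow("hand keypoints", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    cap.release()
    cv2.destroyAllWindows()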