Diya’s Status Report for 02/22/2024

This past week, I was catching up on work since I was sick the previous week and also had a midterm on Thursday. Despite that, I made significant progress on the project. I worked on the design presentation slides and presented them on Monday. I have also been working on the OpenCV gesture recognition setup and getting it to run locally on my computer. The setup is now complete, and I am currently testing the accuracy of the model. With gesture recognition working locally, the project is back on schedule, and I am ready to move forward with the next steps.
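
For reference, the local pipeline looks roughly like the sketch below: OpenCV grabs webcam frames and MediaPipe's pretrained gesture recognizer classifies each one. The option names follow the MediaPipe Tasks Python API as I understand it, and the gesture_recognizer.task model file is assumed to be downloaded locally, so treat this as a sketch rather than the final code.

    import cv2
    import mediapipe as mp
    from mediapipe.tasks import python as mp_tasks
    from mediapipe.tasks.python import vision

    # Load the pretrained gesture recognizer (model file assumed downloaded locally).
    options = vision.GestureRecognizerOptions(
        base_options=mp_tasks.BaseOptions(model_asset_path="gesture_recognizer.task"))
    recognizer = vision.GestureRecognizer.create_from_options(options)

    cap = cv2.VideoCapture(0)  # default webcam
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        result = recognizer.recognize(mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb))
        if result.gestures:
            top = result.gestures[0][0]  # top prediction for the first detected hand
            cv2.putText(frame, f"{top.category_name} ({top.score:.2f})", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("gesture recognition", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()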

For the upcoming week, I plan to:

  1. Continue testing the accuracy of the gesture recognition model (a rough testing sketch is included below).
  2. Work on the Figma design for the website interface.
  3. Start working on the networking portion of the project for the web app.
  4. Begin drafting the design review report for submission.
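
For item 1, the accuracy testing I have in mind is roughly the harness below: run the recognizer over a folder of labeled test images and count how often the top prediction matches the label. The test_images/<label>/ folder layout is just an assumption for illustration, not our actual dataset.

    import pathlib
    import mediapipe as mp
    from mediapipe.tasks import python as mp_tasks
    from mediapipe.tasks.python import vision

    options = vision.GestureRecognizerOptions(
        base_options=mp_tasks.BaseOptions(model_asset_path="gesture_recognizer.task"))
    recognizer = vision.GestureRecognizer.create_from_options(options)

    correct = total = 0
    # Assumed layout: test_images/<gesture_label>/<image>.jpg
    for image_path in pathlib.Path("test_images").glob("*/*.jpg"):
        expected = image_path.parent.name
        result = recognizer.recognize(mp.Image.create_from_file(str(image_path)))
        predicted = result.gestures[0][0].category_name if result.gestures else "None"
        correct += int(predicted == expected)
        total += 1
    print(f"accuracy: {correct}/{total} = {correct / max(total, 1):.2%}")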

Diya’s Status Report for 02/15/2024

This week was quite challenging, as I was sick for most of it. Last week I was recovering from a bacterial infection, and unfortunately this week I came down with the flu, which led to a visit to urgent care. Despite that, I was still able to contribute to the project, particularly in refining our approach to hand gesture recognition and pivoting my role so I can contribute more effectively.

Initially, I had misunderstood the gesture recognition task, thinking I needed to find a dataset and train a model myself. However, after further research, I realized that MediaPipe provides a pretrained gesture recognition model with 90% accuracy, meaning I could integrate it directly without training a new model. This required a shift in my focus, and I pivoted to handling the networking aspect of the project to add complexity and depth to my contribution.

Beyond that, I have been actively involved in facilitating group meetings, translating our use case requirements into quantitative design requirements, and preparing for the design review presentation this week.

Given my health issues, my progress is slightly behind where I initially wanted to be, but I have taken steps to ensure that I am back on track. Since the gesture recognition aspect is now streamlined with MediaPipe, I have moved focus to the networking component, which is a new responsibility. I am catching up by working on setting up the foundational pieces of the social network feature in our web app.

Next week, I plan to make significant progress on the networking component of the project. Specifically, I aim to set up user authentication so users can create accounts, implement user profiles (including cooking level, past recipe attempts, and preferences), and develop a basic social network feature where users can add friends and view each other's cooking activity.
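
We have not locked in the web framework yet, but to make this concrete, below is a rough sketch of what the profile and friendship data model could look like in a Django-style setup, where Django's built-in auth would handle account creation and login. The model and field names (Profile, RecipeAttempt, cooking_level, etc.) are placeholders, not a final schema.

    from django.conf import settings
    from django.db import models

    class Profile(models.Model):
        # One profile per authenticated user; accounts come from Django's built-in auth.
        user = models.OneToOneField(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
        cooking_level = models.CharField(max_length=32, default="beginner")
        preferences = models.TextField(blank=True)  # e.g. dietary restrictions, cuisines
        # Simple symmetric friendship for the basic social feature.
        friends = models.ManyToManyField("self", blank=True)

        def __str__(self):
            return f"{self.user.username} ({self.cooking_level})"

    class RecipeAttempt(models.Model):
        # Past recipe attempts shown on a user's profile and to their friends.
        profile = models.ForeignKey(Profile, on_delete=models.CASCADE, related_name="attempts")
        recipe_name = models.CharField(max_length=200)
        completed_at = models.DateTimeField(auto_now_add=True)
        notes = models.TextField(blank=True)

A symmetric ManyToManyField keeps the first version simple; friend requests and approval flows would be a later addition if we want them.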


Diya’s Status Report for 02/08/2024

This week my primary focus was researching gesture recognition algorithms and setting up the environment needed to begin implementation. Since I am relatively new to this field, I dedicated a significant amount of time to understanding the different approaches I could use for real-time gesture recognition and evaluating their feasibility for integration into the CookAR glasses.

I have detailed some algorithms I have researched below: 

  1. Google MediaPipe – MediaPipe Hand Tracking 
    1. MediaPipe Hand Tracking offers real-time hand tracking and provides 21 3D hand landmarks, which allows us to determine hand position, orientation, and gesture.
    2. It has >90% real-time accuracy and is designed to be lightweight, so it can run on a microcontroller with limited processing power.
    3. For the environment setup, I am using Python, specifically the MediaPipe Python package.
    4. Next steps include defining the gestures we want the algorithm to recognize (e.g., swipe left/right for next, open palm for pause) and then recording the landmark data for each gesture. After that, I will extract the relevant features from the landmark data, such as the distances between key joints, the angles between fingers, and the velocity of the hand movement.
    5. For gesture classification, I am planning to start with a simple rule-based approach that applies thresholds to those distances/angles (a rough sketch of this is included after this list). I also looked into more robust models, such as training a machine learning classifier on the extracted features, where TensorFlow Lite could run the model efficiently on the microcontroller. I will start with the rule-based approach and pivot to the more robust model if needed.
    6. Since we are using Unity for the AR display, I also have to create a script that receives the gesture data and updates the AR elements accordingly. This is something I am still looking into; a sketch of how the Python side could send gesture data to Unity is also included after this list.
  2. Hidden Markov Models for Dynamic Gestures: 
    1. HMMs recognize sequences of movements, so they would be ideal for dynamic gestures that involve many different hand positions over time.
    2. Training requires a dataset of recorded gesture sequences. I found some preliminary resources:
      1. https://www.visionbib.com/bibliography/contentspeople.html#Face%20Recognition,%20Detection,%20Tracking,%20Gesture%20Recognition,%20Fingerprints,%20Biometrics
      2. An American Sign Language dataset for recognizing basic gestures
    3. This could be implemented using TensorFlow, but it would need gesture sequence data for effective training.
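
To make the feature extraction and rule-based classification steps above concrete, here is a rough sketch of rule-based classification on top of MediaPipe's 21 hand landmarks: a finger counts as extended if its tip is farther from the wrist than its middle joint, and the count of extended fingers maps to a gesture. The rules and gesture names here are placeholders I would still need to tune against recorded landmark data, and swipe gestures would additionally need per-frame velocity, which this static sketch does not handle.

    import math
    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands

    def dist(a, b):
        # Euclidean distance between two normalized landmarks.
        return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))

    def classify(landmarks):
        """Very simple rule-based classifier over the 21 hand landmarks.
        A finger counts as extended if its tip is farther from the wrist
        than its middle (PIP) joint. The rules below are placeholders."""
        wrist = landmarks[0]
        tips, pips = [8, 12, 16, 20], [6, 10, 14, 18]  # index..pinky
        extended = sum(dist(landmarks[t], wrist) > dist(landmarks[p], wrist)
                       for t, p in zip(tips, pips))
        if extended >= 4:
            return "open_palm"   # e.g. pause
        if extended == 0:
            return "fist"
        return "unknown"

    cap = cv2.VideoCapture(0)
    with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                label = classify(results.multi_hand_landmarks[0].landmark)
                cv2.putText(frame, label, (10, 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
            cv2.imshow("rule-based gestures", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    cap.release()
    cv2.destroyAllWindows()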
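
For the Unity piece, I have not settled on how the Python gesture code will talk to Unity, but one lightweight option would be sending the recognized gesture label over a local UDP socket that a Unity script listens on. The port number and JSON message format below are placeholders.

    import json
    import socket
    import time

    # Placeholder address/port for a Unity UDP listener running on the same machine.
    UNITY_ADDR = ("127.0.0.1", 5065)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send_gesture(label, score):
        # Small JSON payload the Unity side can parse and map to AR UI updates.
        payload = {"gesture": label, "score": score, "timestamp": time.time()}
        sock.sendto(json.dumps(payload).encode("utf-8"), UNITY_ADDR)

    # Example: this would be called from the recognition loop whenever a gesture fires.
    send_gesture("open_palm", 0.93)

On the Unity side, a small C# script would read these messages and update the AR overlay; that part is still something I need to look into.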

Technical Challenges 

  1. I need to gather data in different lighting conditions and against different backgrounds to make sure the testing is robust.
  2. I can also synthetically create more training data by adding noise, varying the lighting, and rotating hand images (a small augmentation sketch is below).
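
A sketch of the kind of synthetic augmentation I have in mind, using OpenCV and NumPy; the noise level, brightness factors, and rotation angles are arbitrary values I would tune:

    import cv2
    import numpy as np

    def augment(image, angle=15, brightness=1.2, noise_std=10):
        """Create a synthetic variant of a hand image: rotate, adjust lighting, add noise."""
        h, w = image.shape[:2]
        # Rotate around the image center.
        rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        out = cv2.warpAffine(image, rot, (w, h))
        # Simulate different lighting by scaling pixel intensities.
        out = np.clip(out.astype(np.float32) * brightness, 0, 255)
        # Add Gaussian noise.
        out = out + np.random.normal(0, noise_std, out.shape)
        return np.clip(out, 0, 255).astype(np.uint8)

    # Example: generate a few variants of one captured hand image.
    img = cv2.imread("hand_sample.jpg")
    variants = [augment(img, angle=a, brightness=b)
                for a in (-15, 0, 15) for b in (0.7, 1.0, 1.3)]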

Setting up the Development Environment

Since this is my first time working with gesture recognition, I spent time getting the necessary tools and dependencies installed: 

  • Installed the necessary libraries: OpenCV, MediaPipe, and TensorFlow 
  • Configured Jupyter Notebook for testing different models and algorithms 
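
As a quick sanity check that everything installed correctly, I can import each library and print its version:

    import cv2
    import mediapipe as mp
    import tensorflow as tf

    print("OpenCV:", cv2.__version__)
    print("MediaPipe:", mp.__version__)
    print("TensorFlow:", tf.__version__)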

Progress Update

I would say that I am slightly behind schedule in terms of actual implementation but on track in terms of understanding the concepts and setting up the groundwork. The research and initial setup phase took longer than expected, but now that I have a better understanding of the algorithms and their implementation, I should be able to move forward with writing actual code.

To catch up, I plan to: 

  1. Run and analyze sample gesture recognition models in Python 
  2. Begin experimenting with CNN models for static gesture classification (a minimal sketch is included below) 
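
For item 2, a minimal Keras CNN along the lines below is what I have in mind as a starting point. The input size, number of gesture classes, and layer sizes are placeholders until we finalize the gesture set and dataset.

    import tensorflow as tf
    from tensorflow.keras import layers

    NUM_GESTURES = 5          # placeholder: swipe left/right, open palm, fist, none
    IMG_SIZE = (96, 96)       # placeholder input resolution

    model = tf.keras.Sequential([
        layers.Input(shape=(*IMG_SIZE, 3)),
        layers.Rescaling(1.0 / 255),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_GESTURES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()
    # Training would use a labeled dataset of static gesture images, e.g. via
    # tf.keras.utils.image_dataset_from_directory("gesture_data", image_size=IMG_SIZE)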

Next Week’s Deliverables: 

By the end of the week, I aim to have: 

  • A working MediaPipe hand tracking prototype capturing and displaying hand keypoints (see the sketch below) 
  • A basic CNN model for static gesture classification
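
For the first deliverable, the keypoint prototype should be close to the standard MediaPipe Hands pattern: capture webcam frames with OpenCV, run hand tracking, and draw the 21 landmarks on each frame. A rough sketch of what I am aiming for:

    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands
    mp_draw = mp.solutions.drawing_utils

    cap = cv2.VideoCapture(0)
    with mp_hands.Hands(max_num_hands=2, min_detection_confidence=0.7) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe works on RGB images; OpenCV gives BGR.
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                for hand_landmarks in results.multi_hand_landmarks:
                    # Draw the 21 keypoints and their connections on the frame.
                    mp_draw.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)
            cv2.imshow("hand keypoints", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    cap.release()
    cv2.destroyAllWindows()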