Diya’s Status Report for 3/22/25

I am currently on track with the project schedule. The gesture recognition system is now fully functional on my computer display with all of the defined gestures. This week I focused on building the recipe database: I successfully scraped recipe data from Simply Recipes and structured it into JSON, where each recipe entry includes fields for title, image, ingredients, detailed step-by-step instructions, author, and category. The scraping and debugging process was somewhat tedious, as I had to manually inspect each page’s HTML tags to accurately locate and extract the necessary data.

In our use case requirements, we specified that each step description should be under 20 words, but many of the scraped steps exceed that limit, so they will need additional post-processing and cleanup. Additionally, some scraped content includes unnecessary footer items such as “Love the recipe? Leave us stars and a comment below!” and unrelated tags like “Dinners,” “Most Recent,” and “Comfort Food” that need to be removed before display; a sketch of the cleanup I have in mind follows the example recipe at the end of this report.

My current focus is integrating the recipe JSON database into our Django web app framework. I am also going to start working on generating recipe titles for display in Pygame on the Raspberry Pis. Next steps are completing the integration of the recipe data with the Django web app and refining the display logic for recipe titles on the Raspberry Pi setup.
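To make the integration concrete, here is a minimal sketch of how the scraped JSON could map onto Django models, with a one-off import loop. The model, field, and function names are placeholders I chose for illustration, not our final schema:

import json
from django.db import models

class Recipe(models.Model):
    title = models.CharField(max_length=200)
    image = models.CharField(max_length=300, blank=True)
    author = models.CharField(max_length=100, blank=True)
    category = models.CharField(max_length=100, blank=True)
    ingredients = models.JSONField(default=list)  # list of ingredient strings

class Step(models.Model):
    recipe = models.ForeignKey(Recipe, related_name="steps",
                               on_delete=models.CASCADE)
    order = models.PositiveIntegerField()
    description = models.TextField()
    image = models.CharField(max_length=300, null=True, blank=True)

def load_recipes(path="recipes.json"):
    # One-off import, e.g. run from `python manage.py shell`
    with open(path) as f:
        for entry in json.load(f):
            recipe = Recipe.objects.create(
                title=entry["title"], image=entry["image"],
                author=entry["author"], category=entry["category"],
                ingredients=entry["ingredients"])
            for i, step in enumerate(entry["steps"]):
                Step.objects.create(recipe=recipe, order=i,
                                    description=step["description"],
                                    image=step["image"])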

Example structure of a scraped recipe:

{
  "title": "One-Pot Mac and Cheese",
  "image": "images/Simply-Recipes-One-Pot-Mac-Cheese-LEAD-4-b54f2372ddcc49ab9ad09a193df66f20.jpg",
  "ingredients": [
    "2tablespoonsunsalted butter",
    "24 (76g)Ritz crackers, crushed (about 1 cup plus 2 tablespoons)",
    "1/8teaspoonfreshlyground black pepper",
    "Pinchkosher salt",
    "1tablespoonunsalted butter",
    "1/2teaspoonground mustard",
    "1/2teaspoonfreshlyground black pepper, plus more to taste",
    "Pinchcayenne(optional)",
    "4cupswater",
    "2cupshalf and half",
    "1teaspoonkosher salt, plus more to taste",
    "1poundelbow macaroni",
    "4ouncescream cheese, cubed and at room temperature",
    "8ouncessharp cheddar cheese, freshly grated (about 2 packed cups)",
    "4ouncesMonterey Jack cheese, freshly grated (about 1 packed cup)"
  ],
  "steps": [
    {
      "description": "Prepare the topping (optional):Melt the butter in a 10-inch Dutch oven or other heavy, deep pot over medium heat. Add the crushed crackers, black pepper, and kosher salt and stir to coat with the melted butter. Continue to toast over medium heat, stirring often, until golden brown, 2 to 4 minutes.Transfer the toasted cracker crumbs to a plate to cool and wipe the pot clean of any tiny crumbs.Simply Recipes / Ciara Kehoe",
      "image": null
    },
    {
      "description": "Begin preparing the mac and cheese:In the same pot, melt the butter over medium heat. Once melted, add the ground mustard, pepper, and cayenne (if using). Stir to combine with the butter and lightly toast until fragrant, 15 to 30 seconds. Take care to not let the spices or butter begin to brown.Add the water, half and half, and kosher salt to the butter mixture and stir to combine. Bring the mixture to a boil over high heat, uncovered.Simply Recipes / Ciara KehoeSimply Recipes / Ciara Kehoe",
      "image": null
    },
    {
      "description": "Cook the pasta:Once boiling, stir in the elbow macaroni, adjusting the heat as needed to maintain a rolling boil (but not boil over). Continue to cook uncovered, stirring every minute or so, until the pasta is tender and the liquid is reduced enough to reveal the top layer of elbows, 6 to 9 minutes. The liquid mixture should just be visible around the edges of the pot, but still with enough to pool when you drag a spatula through the pasta. Remove from the heat.Simple Tip!Because the liquid is bubbling up around the elbows, it may seem like it hasn\u2019t reduced enough. To check, pull the pot off the heat, give everything a stir, and see what it looks like once the liquid is settled (this should happen in seconds).Simply Recipes / Ciara KehoeSimply Recipes / Ciara Kehoe",
      "image": null
    },
    {
      "description": "Add the cheeses:Add the cream cheese to the pasta mixture and stir until almost completely melted. Add the shredded cheddar and Monterey Jack and stir until the cheeses are completely melted and saucy.Simply Recipes / Ciara KehoeSimply Recipes / Ciara KehoeSimply Recipes / Ciara Kehoe",
      "image": null
    },
    {
      "description": "Season and serve:Taste the mac and cheese. Season with more salt and pepper as needed. Serve immediately topped with the toasted Ritz topping, if using.Leftover mac and cheese can be stored in an airtight container in the refrigerator for up to 5 days.Love the recipe? Leave us stars and a comment below!Simply Recipes / Ciara KehoeSimply Recipes / Ciara Kehoe",
      "image": null
    },
    {
      "description": "Dinners",
      "image": null
    },
    {
      "description": "Most Recent",
      "image": null
    },
    {
      "description": "Recipes",
      "image": null
    },
    {
      "description": "Easy Recipes",
      "image": null
    },
    {
      "description": "Comfort Food",
      "image": null
    }
  ],
  "author": "Kayla Hoang",
  "category": "Dinners"
}
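As a first pass at the cleanup described above, something like the following sketch should work. The noise lists are just the junk strings observed so far, and the 20-word handling is a placeholder flag rather than a final policy:

import json

NOISE_STEPS = {"Dinners", "Most Recent", "Recipes", "Easy Recipes", "Comfort Food"}
FOOTER_PHRASES = ["Love the recipe? Leave us stars and a comment below!",
                  "Simply Recipes / Ciara Kehoe"]  # photo credit varies per recipe

def clean_recipe(recipe):
    kept = []
    for step in recipe["steps"]:
        text = step["description"]
        if text in NOISE_STEPS:            # drop navigation/tag residue
            continue
        for phrase in FOOTER_PHRASES:      # strip footer asks and photo credits
            text = text.replace(phrase, "")
        step["description"] = text.strip()
        if len(step["description"].split()) > 20:
            step["needs_split"] = True     # flag for manual shortening later
        kept.append(step)
    recipe["steps"] = kept
    return recipe

with open("recipes.json") as f:
    recipes = [clean_recipe(r) for r in json.load(f)]

(A likely culprit for the glued-together ingredient strings like “2tablespoonsunsalted butter” is extracting text across nested quantity/unit spans without a separator; in BeautifulSoup, get_text(" ", strip=True) keeps the spaces.)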


Team Status Report for March 15, 2025

Project Risks and Mitigation Strategies

Since all three of us are very involved in Carnival (booth and buggy), we have decided to set a deadline of Friday, March 28th for completing the work necessary for the interim demo. This way, we can ensure that our capstone project is as complete and demo-ready as possible before week-of-Carnival responsibilities descend upon us all, as both booth and buggy become extraordinarily time-consuming that week in particular. Worst case, if we are a little behind, we can use the weekend before Carnival to finish up any last-minute work.

Changes to System Design

One change we decided to make after talking to Gloria was to add a feature for the user on the glasses: an additional gesture that prompts the HUD to display a definition or elaboration of a step that may contain field-specific terminology or other words with which beginners may not be familiar. For instance, if the user asks for elaboration on the step “dice two onions,” the display would return new text explaining that dicing means cutting the onion into uniform quarter-inch cubes, possibly along with a technique recommendation. This should not be too hard to implement on either the backend or the display, and we felt it would be a good way to further utilize the gesture language and better serve our users’ requirements.
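As a rough illustration of the backend half of this feature (the glossary contents and function name below are hypothetical placeholders, not a final design):

GLOSSARY = {
    "dice": "Cut into uniform cubes, roughly 1/4 inch; rock the knife through "
            "the food while steadying it with curled fingertips.",
    "fold": "Combine gently with a spatula, turning the mixture over on itself "
            "so it stays airy.",
}

def elaborate(step_text):
    # Return the first glossary term found in the current step, or a fallback
    for term, definition in GLOSSARY.items():
        if term in step_text.lower():
            return f"{term}: {definition}"
    return "No elaboration available for this step."

The display side would then render the returned string on the HUD in place of the step text until the user gestures to dismiss it.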

Schedule Progress

Independent schedule progress is addressed in our individual reports. While the order of some tasks has shuffled, our work is roughly on track.

Rebecca’s Status Report for March 15, 2025

Report

Changing the WiFi on a Raspberry Pi without entirely rewriting the OS (using the imager) turns out to be a relatively straightforward task, assuming you have current access to the OS. Changing the WiFi on a Raspberry Pi without that access, i.e., when you’re on the opposite side of the state from the network it’s set up for, is virtually impossible. It didn’t use to be, though: on previous versions of Raspberry Pi OS, pre-Bookworm, it was just a matter of creating a specific file in the boot directory of the SD card and putting the network information there in a specific format. And since so many people in the Rasppi community simply do not like to call out by name the version of the OS they’re working with, it took a frankly unreasonable amount of time to figure out that that method had been deprecated on the version I’m using, and that’s why it wasn’t working. (In fairness, I suppose, the new version of the OS is only a few months old at time of writing, so the vast majority of the discussion out there predates it.)

CMU-DEVICE requires registration using the device’s hardware address, which is easily identifiable with arp -a in my laptop’s terminal, given that it and the Rasppi are on the same network, I know the Rasppi’s IP address, and the two have recently been in contact. What I ended up doing was flashing my second SD card with the WiFi information for my cellphone’s hotspot, connecting both it and my laptop to that hotspot, using an IP scanner to identify the Rasppi’s IP address, pinging it, and then calling arp to get the MAC address. Success! My device is registered with the WiFi. Now how do I get onto the WiFi? It’s no longer something stored on the hardware; I need to modify the SD card with all of my work on it from last week without destroying it.

There’s no good way, it turns out. I ended up changing the login information of my phone’s hotspot to spoof my home network so the Rasppi would connect to it, then sshing in on that network to use raspi-config to update the information. It felt very silly, but it worked, so sure! Alright! In retrospect, if I had started by spoofing the old network I could have skipped using the other SD card entirely, so if I have to change the information again going forward, that’s the way I’ll do it.

My week has been… nothing short of insane, on account of one specific project in another class that ate me alive, so I haven’t gotten a chance to sit down in front of a monitor that takes HDMI or to wire up the Rasppis to talk to each other. I’ve done a good bit of research and am pretty sure I know how to make the I2C, HDMI, and AV out work, so, RF project willing, I’ll be sitting down early this upcoming week to get at least temporary wires running between the Rasppis. I’ll probably have to solder them in, since the boards don’t have pins, but I’m going to do the lightest-weight job I can, since I’ll have to take it out and redo it eventually. I also realized that I need to get my hands on another USB Micro cable, since I only have the one but will have to test-power both boards at once pretty soon. Gonna ask around this weekend to see if anyone I know has one lying around that I can borrow, then just order one on Amazon early next week if not.

Progress Schedule

Unfortunately the radiofrequency project that came out of left field (I knew it was coming; I just didn’t expect it to be nearly so insanely difficult) has put me on the back foot with regard to literally everything else. I might have to abbreviate some of the HDMI work I was planning, since the displays’ delivery date is approaching pretty quickly. Gonna be playing catch-up this week.

Next Week’s Deliverables

I need to get the boards talking to each other, which may happen early this week or late, depending on whether I can get my hands on another USB Micro cable quickly. I also want to get HDMI out working, since that was supposed to happen this week and ended up falling by the wayside.

Diya’s Status Report for 3/15/25

This week, I worked on our ethics assignment, completing the required tasks and addressing ethical considerations related to our project.

I also spent considerable time researching and learning how to handle specific tasks, such as creating .task model files for the Raspberry Pi and implementing web scraping techniques. After discussions with Rebecca, we realized that integrating gesture recognition onto the Raspberry Pi is more challenging than initially anticipated, mainly due to compatibility issues with running our .py scripts there. I have begun developing a .task file to resolve this and plan to test it with Rebecca next week.
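For reference, this is roughly what loading and running a .task bundle looks like with the MediaPipe Tasks API, as I currently understand it; the file names and confidence threshold are placeholders to be tuned during testing with Rebecca:

import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

# Load the exported gesture model (path is a placeholder)
options = vision.GestureRecognizerOptions(
    base_options=python.BaseOptions(model_asset_path="gestures.task"),
    min_hand_detection_confidence=0.5)  # to be tuned on the Pi
recognizer = vision.GestureRecognizer.create_from_options(options)

# Run on a single test frame; on the Pi this would come from the camera
image = mp.Image.create_from_file("test_frame.jpg")
result = recognizer.recognize(image)
if result.gestures:
    print(result.gestures[0][0].category_name)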

Additionally, I’ve been exploring web scraping to automate building the recipe database and avoid manually entering 100 recipes. I’m currently writing a script for this task and plan to test it this weekend.
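The script is shaped roughly like the sketch below; the URL and CSS selectors are illustrative stand-ins, since the real tags have to be found by inspecting each page’s HTML:

import json
import requests
from bs4 import BeautifulSoup

url = "https://www.simplyrecipes.com/one-pot-mac-and-cheese-recipe"  # placeholder
soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

recipe = {
    "title": soup.find("h1").get_text(strip=True),
    # Selectors below are made up for illustration; the real ones differ
    "ingredients": [li.get_text(" ", strip=True)
                    for li in soup.select("li.ingredient")],
    "steps": [{"description": step.get_text(" ", strip=True), "image": None}
              for step in soup.select("div.step")],
}
print(json.dumps(recipe, indent=2))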

Looking ahead, my primary focus for next week will involve testing these implementations. Given the complexity of the integration, I want to ensure that I have enough time for the integration phase to address any blockers that I might run into.

Charvi’s Status Report for 3/15/25

This week, I worked on the ethics assignment and further worked on the webapp.

I have the login and registration completed from before, and this week I also worked a lot on the backend and planned out the overall website and backend structure.

I have been focusing primarily on the recipe selection page, as well as adding features to the profile page to reflect the changes we made last week.

In addition, after a conversation with staff, we decided to add a feature in which users can request further elaboration on a cooking term in a step, with the elaboration showing up on the display. I’ve accounted for this feature in the webapp structure.

I did not have much time to work on capstone this week, as I landed from my flight Monday night and had a large assignment due for another class on Thursday. So I am behind: I was not able to complete the functionality of the entire website, which was the goal I set for myself last week. However, I do think that was an unrealistic goal, and I have more time than I initially thought to work on my end of the project, since integration will be happening later than initially planned. In addition, I will have a lot more time to work on capstone this coming week.

This week, I absolutely must complete at least the recipe selection and recipe running pages, with steps showing, to be on track. I am confident that I can get this done, since I have the backend set up and more time this week. Once that is done, I will continue working on functionality for the rest of the webapp.

Diya’s Status Report for 3/8/25

Last week, I focused heavily on the design report, contributing significantly to refining the software details and web application requirements. I worked on structuring and clarifying key aspects of our system to ensure that our implementation aligns with our project goals. A major portion of my work involved ironing out details related to gesture recognition, particularly ensuring it aligns with our defined gesture language. This included adjusting parameters and troubleshooting inconsistencies to improve accuracy. I have attached a photo of an example of the gesture recognition for the defined gesture language in the design report.

In the upcoming week, my main focus will be on improving the accuracy of gesture recognition. This will involve fine-tuning detection thresholds, reducing latency, and optimizing the system for different environmental conditions to ensure robustness. I will also continue working on refining the design report if needed and contribute to the integration of the gesture system into the broader application.
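One error-correction idea I am considering (an assumption at this stage, not a settled design) is to only accept a gesture when it wins a majority vote over the last few frames, which should suppress single-frame misclassifications:

from collections import Counter, deque

WINDOW = 5                    # frames of history; to be tuned
recent = deque(maxlen=WINDOW)

def debounce(raw_label):
    # Return a gesture label only once it dominates the recent window
    recent.append(raw_label)
    label, count = Counter(recent).most_common(1)[0]
    return label if count > WINDOW // 2 else None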

Charvi’s Status Report for 3/8/25

This week, I primarily worked on the wireframes for the website. They are in the design report.

We did a lot of rework of the web application, such as deciding which networking features to include, so the wireframe was very necessary to get our entire group on the same page about what we have to design and what exactly our web app is used for. It will also be a good way to keep everyone aligned if we add or change anything later.

I am behind schedule: I did not meet my goal of having the website working with basic functionality, since we redefined our scope for the webapp. I have to spend a lot of time this week actually getting the website coded (especially with all the changes just added).

As mentioned above, this week I aim to have basic functionality of the entire site working, or at least the most important pages: recipe selection and the recipe running page.

Team Status Report for March 8, 2025

Project Risks and Mitigation Strategies

At this point the only hardware delay we’re waiting on is the displays. The original displays we ordered were cancelled, so similar alternate parts had to be identified and purchased. These come from sellers with much more recent activity, and before ordering we asked for and received confirmation that the parts were still in stock, so we are much more hopeful that they will arrive when expected. We then, of course, have to hope they work: they fill the same technical niche as the original part, as small, lightweight displays with optical lenses meant for heads-up displays, but since they are slightly different, we no longer have proof that the deconstruction we have planned will work exactly as expected.

Beyond hardware, another key area of risk is ensuring that the gesture-based interactions function smoothly and intuitively. As part of the work on the gesture system, I have been refining the detection algorithms and ensuring that they align well with user expectations. The primary risk is potential latency or inconsistency in recognizing gestures, especially in dynamic environments, so I am looking at sensitivity tuning and error correction methods to mitigate this.

Changes to System Design

No changes to the system design were made this week.

Schedule Progress

Independent schedule progress is addressed in our individual reports. While the order of some tasks has shuffled, our work is roughly on track.


Meeting Specific Needs


Part A was written by Diya, part B was written by Rebecca, and part C was written by __.

Part A: … with consideration of global factors. 

The product we are designing aims to meet the need for intuitive, hands-free interaction in augmented reality environments. By incorporating a gesture-based input system, users can interact naturally without relying on physical controllers, improving accessibility and ease of use. The gesture recognition system is designed to be responsive and adaptive to users’ input.

Part B: … with consideration of cultural factors.

Food is, of course, a foundational part of just about every culture everywhere on Earth. It’s one of the first things people may turn to when trying to reconnect as an adult with a culture they may have missed out on, for various reasons, in childhood, or lost track of somewhere along the line- or when reaching out to friends and family of different cultures. But starting to learn how to cook, or learning how to cook in a style one is entirely unfamiliar with, can be a daunting undertaking; by lowering this barrier of entry, we hope that people will be more encouraged to attempt this particular path to connecting with a new culture, be it their own or their loved ones’.

Part C: … with consideration of environmental factors.


We are trying to design for as low an environmental cost as we can. Our physical frame is designed to last through long use by being as sturdy as possible. We plan on using as small an EC2 instance as needed to deploy our web app and store our databases. And cooking at home is inherently better for the environment: fewer single-use plastics are used for packaging, there are fewer delivery emissions, and food waste is lessened when someone can more easily cook the food they want to eat when they want it. The recipes we include are all simple ones that aim to use ingredients the user will probably already have, further reducing food waste. Overall, home cooking is more environmentally friendly than ordering takeout, which is what many people who don’t feel comfortable cooking at home end up doing; CookAR hopes to bridge that gap and get people to start home cooking. Though there are additional environmental costs associated with creating a new physical glasses product and running a website, we aim to be intentional about what we design and what resources we use, in a way that is as environmentally conscious as possible.

Rebecca’s Status Report for March 8, 2025

Report

I have learned that, despite being supposedly a very mainstream device, the Raspberry Pi is… remarkably unintuitive. I’m using Raspberry Pi OS Lite to run the Rasppi headless and ssh into it, though for an as-yet-unclear reason my computer does not seem to be able to resolve the Rasppi’s hostname and I have to use the IP address directly. This has only worked for this week’s development because I have direct access to my router and its IP address assignments at home, and I will immediately have to resolve this issue upon returning to campus. Figuring out how to get into the Rasppi took far, far too long, because every single tutorial and answered question and Guide To Headless Rasppis that I could find online assumed that you could resolve the hostname, which is a very reasonable assumption, and simply bizarrely untrue in my case. I don’t know.

The Raspberry Pi OS Imager also doesn’t tell you the name of the OS version you’re flashing; even on the main website it’s just kind of… a throwaway inline parenthetical comment, despite being the main thing the entire community uses to refer to the major versions of the OS, and despite so many things changing between them. This was a conscious decision. Why would you do it this way?

After figuring out the issue and getting into the board, getting it to talk to the camera was relatively simple (though I had the cable in upside down for a bit, which was deeply frustrating to discover after an hour and a half of debugging. So it goes). I’m using the native Raspberry Pi Camera Module, which is, you know, supposed to be the native camera and therefore straightforward to use, but you would just not believe the number of problems I have had because I’m using a native Pi camera instead of a USB camera.

First photograph captured from the Pi camera! It’s blurry and poorly exposed because I’ve left the protective plastic tab over the lens, since it still has to travel back to Pittsburgh. I expect the quality to be better once I take that off.

I also discovered that OpenCV’s primary image capture method VideoCapture(camera_id) is not compatible with libcamera, the regular Raspberry Pi camera library, because of course it isn’t. Surely nobody would ever want to use OpenCV straightforwardly on a minimal Raspberry Pi. Surely that couldn’t be an extremely common desire and mainstream goal. Can’t imagine.

However, Picamera2, the Bookworm-era Python wrapper for libcamera, is configurable enough to be more or less compatible with MediaPipe itself.

(As an aside: all of the libraries I used this week were accessible via pip, which also seems to be the simplest way to install MediaPipe, except for Picamera2, which was only available through apt; I set the include-system-site-packages flag in my pyvenv.cfg to true to be able to use it from inside the venv.)

This is the MediaPipe on Raspberry Pi tutorial I started from. It doesn’t work on its own, because it relies on the OpenCV capture method that doesn’t work, but I used it and the tutorials it links to set up the Python environment (sigh. why did it have to be Python) and the MediaPipe installation.

I found this document, which describes exactly what I want to do, with the sole caveat that it’s ten years out of date. Picamera has since been displaced by Picamera2, which has been significantly streamlined, so the translation isn’t 1:1, and I’m not familiar enough with either library to do a quality translation myself. Sigh.

I ended up scavenging bits and pieces from this document and from the Picamera2 repo examples to make a trial script which captures images off the camera and streams them via OpenCV (in this case over my ssh tunnel, which was very slow, but I hope some of that is the ssh streaming and it will speed up when I cut that out).

I was then able to graft the working Picamera2 image-capture script onto the MediaPipe script provided in the first tutorial. I’m just using a generic model right now, not our own custom gesture language, but it is proof that the software works on the hardware, if only just barely. At this point it ran extraordinarily slowly: there was an untenable amount of lag between my hand motions and what I saw on the screen, and even more between the motion of the frames on the screen and the MediaPipe overlay. Making it run faster became a critical priority.

Image capture of the MediaPipe hand tracker running on the Raspberry Pi.
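For posterity, the grafted script is shaped roughly like this; it’s simplified from memory, so treat it as a sketch rather than the exact working code:

from picamera2 import Picamera2
import mediapipe as mp

picam2 = Picamera2()
picam2.configure(picam2.create_preview_configuration(
    main={"size": (640, 480), "format": "RGB888"}))
picam2.start()

hands = mp.solutions.hands.Hands(max_num_hands=1)  # generic model, not our gestures
while True:
    frame = picam2.capture_array()  # numpy array straight off the camera
    # NB: Picamera2's format names can be misleading about channel order;
    # flip with frame[:, :, ::-1] if the colors come out swapped
    results = hands.process(frame)  # MediaPipe expects RGB
    if results.multi_hand_landmarks:
        print("hand detected")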

I modified the camera configuration to tell the software reading the camera both the resolution I wanted out of it (which was already there) and the raw native resolution of the camera. This seemed to fix my zoom problems: the camera’s field of view was far smaller than I had expected or wanted, and it seemed to have just been cropping a 640×480 box out of the center of the FOV. With access to the native resolution, it appears to bin the pixels down to the desired resolution much more cleanly. Additionally, I fixed the framerate, which had previously been left at “whatever the software can handle.” Pinning it at 1.5fps sped up MediaPipe’s response time greatly, improved its accuracy, and made all of the lag functionally disappear (even while still streaming the output). It also kept the board from getting so dang hot; Raspberry Pis since the 3 underclock when they hit 60°C, and according to my temp gun that’s about where I was hanging before I fixed the framerate, so that was probably also contributing to the lag.

Image capture of the MediaPipe hand tracker working on the Raspberry Pi.
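In Picamera2 terms, the configuration change looks roughly like this (values as I remember them; 666667 microseconds per frame pins the camera at about 1.5fps):

from picamera2 import Picamera2

picam2 = Picamera2()
config = picam2.create_video_configuration(
    main={"size": (640, 480)},
    # Declaring the sensor's full native resolution lets the pipeline bin
    # down to 640x480 instead of cropping a box out of the center of the FOV
    raw={"size": picam2.sensor_resolution},
    # FrameDurationLimits is (min, max) in microseconds per frame
    controls={"FrameDurationLimits": (666667, 666667)})
picam2.configure(config)
picam2.start()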

1.5fps is a little lower than I wanted, though. I switched the framerate and recognition outputs to feed a print line instead and turned off the streaming, and was able to trivially double my framerate to 3fps. This hits the spec requirement!

If possible, I’d like to pull OpenCV entirely out of the script (with the possible exception of its streaming feature, for debugging purposes), since Picamera2 seems to cover all of the OpenCV functionality I’m using in a much more lightweight, Raspberry Pi-native library. I believe this may improve MediaPipe’s responsiveness, and it will certainly make the script cleaner, with fewer redundant, overkill tools. However, since it works fine as is, this is not a high priority.

Progress Schedule

I’ve shuffled my tasks slightly, accelerating the MediaPipe work while pushing off the HDMI output, so I’m ahead on one section while behind on another. I’ve also had to put off measuring the Rasppi’s power consumption until the recognition model was working; in retrospect, I don’t know why measuring power consumption was scheduled ahead of getting the most power-hungry algorithm running. I’m not particularly worried about the lead time on the battery, so I’m fine with it being estimated and selected a bit later than expected.

Next Week’s Deliverables

Originally next week was meant to be the MediaPipe recognition week, while this week was for HDMI out, but these have been flipped; I plan on working on the code which will generate the display images next week. Additionally, I’ll have to figure out how to log into the Rasppi on the school’s network when I don’t know its IP address directly, which may take a nontrivial amount of time.

Team Status Report for February 22, 2025

Project Risks and Mitigation Strategies

  • A key bottleneck in the project is the delay in receiving the Raspberry Pis, which won’t arrive until Monday. This impacts our ability to test power consumption and system performance. To mitigate this delay, Rebecca has already obtained the SD cards and flashed the OS onto them, so the boards can be booted immediately upon arrival. This will let us start testing early in the week.
  • If there are additional delays with hardware setup, we will proceed with software-side testing on local machines and simulate hardware behavior to continue development.
  • Charvi has already figured out the profile, following, and registration functionality, and she is integrating and debugging these components.
  • Diya has set up gesture recognition locally and is currently testing its accuracy. If accuracy issues arise, we will adjust the model parameters, consider alternative gesture recognition models, or refine preprocessing techniques.
  • Rebecca has drafted the headset CAD so a base exists for the mount points and, as mentioned above, prepped the SD cards and found instructions for installing and running OpenCV on a Raspberry Pi, to jumpstart our work on this.

Changes to System Design

  • No changes to the system design were made this week.

Schedule Progress

  • A few of the Rasppi-testing-related tasks expected to be done this week have been pushed to next week on account of the boards not arriving. No other changes have been made. Some of next week’s tasks may be pushed into spring break, on account of this delay and our possibly having underestimated the time the design report will take to write, but our slack time should absorb all of it.