Steven’s Status Report for Feb 23 2025

For this week, I focused on expanding our existing dataset of annotated fridge images. I identified several relevant datasets online and used them to train and update our existing YOLOv5 model. I also experimented with data augmentation techniques (e.g., rotations and occlusions) to improve the robustness of our model. In addition, I spent time researching the different YOLOv5 model variants and their expected accuracy/latency for our design report, in order to determine which model is best suited for our use case.
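As a rough illustration of the augmentations I experimented with, the rotation and occlusion transforms can be sketched in plain NumPy (the `augment` function and patch sizes here are my own toy example, not taken from a particular augmentation library):

```python
import numpy as np

def augment(image, rng):
    """Apply a random rotation and a random occlusion patch to an HxWx3 image."""
    # Random 90-degree rotation (0, 90, 180, or 270 degrees).
    image = np.rot90(image, k=rng.integers(0, 4)).copy()
    # Random rectangular occlusion: zero out a patch up to ~20% of each side,
    # simulating items partially hidden behind others in the fridge.
    h, w = image.shape[:2]
    ph, pw = rng.integers(1, h // 5 + 1), rng.integers(1, w // 5 + 1)
    y, x = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)
    image[y:y + ph, x:x + pw] = 0
    return image

rng = np.random.default_rng(0)
sample = np.full((64, 48, 3), 255, dtype=np.uint8)  # stand-in for a fridge image
out = augment(sample, rng)
```

In practice the real pipeline applies these on the fly during training, so each epoch sees slightly different views of the same annotated images.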

In terms of progress, I am currently on schedule: I have completed the development of the training pipeline in PyTorch, and am now training the model on our datasets.
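In skeleton form, the training loop inside the pipeline follows the standard PyTorch pattern (shown here with a toy linear model on random data purely to illustrate the loop shape, not the actual YOLOv5 architecture or loss):

```python
import torch
import torch.nn as nn

# Toy stand-in: a tiny classifier on random data, just to show the loop structure.
model = nn.Linear(16, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(32, 16)
targets = torch.randint(0, 4, (32,))

losses = []
for epoch in range(5):
    optimizer.zero_grad()       # clear gradients from the previous step
    loss = loss_fn(model(inputs), targets)
    loss.backward()             # backpropagate
    optimizer.step()            # update weights
    losses.append(loss.item())
```

The real pipeline swaps in the YOLOv5 model, its multi-part detection loss, and a DataLoader over our annotated fridge images.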

For next week, I will explore training with the YOLOv5x model with further hyperparameter tuning, with the aim of pushing our detection accuracy beyond 90%. I will also compare inference timings across the model variants and explore model quantization for Raspberry Pi optimization, in order to identify the model that best meets our requirements.
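As a sketch of the quantization direction, PyTorch's dynamic quantization converts weights to int8 with a one-line call (demonstrated here on a toy stand-in model rather than our trained YOLOv5 weights):

```python
import torch
import torch.nn as nn

# Toy stand-in for a model; the real target would be our trained YOLOv5 weights.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

# Dynamic quantization stores Linear weights as int8, shrinking the model
# and often speeding up CPU inference (e.g., on a Raspberry Pi).
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 128)
with torch.no_grad():
    y_fp32 = model(x)
    y_int8 = quantized(x)
```

The key question for next week is whether the accuracy drop from quantization is small enough to stay above our 90% target while meeting Raspberry Pi latency limits.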

Steven’s Status Report for February 16, 2025

For this week, I focused on creating a prototype of the model we will use for detection and classification of grocery items. To accomplish this, I trained an initial YOLOv5 model on a small preliminary dataset of grocery items. I made use of a diverse dataset, in order to ensure basic object detection and classification functionality.

After training, I ran some initial tests using an image of a cake inside the fridge we are using, in order to evaluate the performance of the prototype model. The results are below.

As can be seen, the preliminary model was already quite successful in identifying the bounding box of the item. Although the classification was not completely accurate, this is understandable since it was a niche item (a cake from Giant Eagle).

I am currently making good progress with respect to our Gantt chart, and have started training the model slightly ahead of schedule. In the following weeks, I aim to continue data collection by finding more datasets of fridge images, as well as exploring training with annotated images of groceries taken within our own fridge. I also aim to fine-tune the model by adjusting hyperparameters and enlarging the training set through data augmentation to increase accuracy. In addition, I will experiment with different variations of YOLOv5 models to see whether a larger model yields better accuracy without major latency trade-offs, and I will measure latency for both local and cloud inference to determine which better suits our cost and latency requirements.


Steven’s Status Report for February 9, 2025

For this week, I focused on data collection and annotations, as well as getting started on the prototype of our model.

I’ve set up our PyTorch environment and completed an initial YOLOv5 prototype. I’ve developed a basic pipeline in which images are pre-processed with OpenCV and then fed to the YOLOv5 model, and conducted some preliminary training and testing using small annotated datasets of grocery items. Furthermore, I have been sourcing and noting down relevant datasets on Kaggle and Roboflow.

Currently, I am working on integrating the online datasets with the YOLOv5 model, and conducting some initial tests on accuracy as well as inference speed. I aim to test the inference speed locally on a Raspberry Pi, as well as in the cloud, to get a measure of the latency of either set-up. I will also experiment with image processing methods using images taken from our fridge, in order to try to improve detection accuracy.
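A simple timing harness for the local-vs-cloud comparison might look like the following; the `run_inference` stub is a placeholder of my own for the actual model call (locally, the YOLOv5 forward pass; for the cloud case, an HTTP request to the inference endpoint):

```python
import time

def run_inference(frame):
    """Placeholder for model inference; sleeps to simulate ~1 ms of work."""
    time.sleep(0.001)
    return []

def mean_latency_ms(fn, frame, warmup=3, runs=20):
    """Average per-call latency in milliseconds, excluding warm-up iterations."""
    for _ in range(warmup):
        fn(frame)                      # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn(frame)
    return (time.perf_counter() - start) / runs * 1000.0

latency = mean_latency_ms(run_inference, frame=None)
```

Warm-up iterations matter here because the first few calls (model loading, cache effects, connection setup) are not representative of steady-state latency.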

Looking to the future, I will need to obtain the necessary hardware (a Raspberry Pi) in order to test the effectiveness of our model when run locally. I will also need to work on integrating the model with our peripheral device as well as the mobile application.

Steven’s Status Report for February 02, 2025

For this week, I focused on researching the relevant CV libraries and frameworks that we can use to build our model. I explored multiple libraries (OpenCV, PyTorch) for handling image preprocessing and deep learning, and compared various object detection frameworks to determine which one best fits our requirements.

We decided to employ OpenCV in our computer vision pipeline, due to it being the industry-standard library with a large active user community and extensive documentation. OpenCV offers a wide variety of image transformations and integrates extremely well with deep learning frameworks such as PyTorch, allowing it to be easily implemented within our pipeline. The use of OpenCV for data processing will be essential in integrating our CV model with our overall system. We have also decided to use a YOLOv5 model for object detection, due to its extensive documentation and active community, and its suitability for real-time applications such as ours. Based on proven real-world performance, we are confident that the YOLOv5 model can achieve our accuracy goal of 90%.

The following is an initial outline of how the CV pipeline will operate:

  1. Image Capture and Basic Preprocessing (resizing, normalization with OpenCV)
  2. Inference using a YOLOv5 model
  3. Post-Processing to handle classifications and quantity counts
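Step 3 above can be sketched as a small post-processing function: given raw detections, filter by a confidence threshold and tally quantities per item. The `(label, confidence)` pair format here is an assumption about how we will structure the model output:

```python
from collections import Counter

def count_items(detections, conf_threshold=0.5):
    """Tally item quantities from (label, confidence) detection pairs,
    dropping detections below the confidence threshold."""
    return Counter(label for label, conf in detections if conf >= conf_threshold)

# Hypothetical detections from one fridge frame.
detections = [("milk", 0.91), ("egg", 0.88), ("egg", 0.84), ("cake", 0.32)]
counts = count_items(detections)  # the low-confidence "cake" is filtered out
```

In the full pipeline, non-maximum suppression would run before this step so that overlapping boxes for the same physical item are not double-counted.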

In terms of my next steps, I will gather datasets of representative fridge ingredients, with variety in lighting, item arrangement, and packaging. A diverse dataset of relevant images is necessary in order to achieve our accuracy goals. I will also set up a basic prototype using YOLOv5 on a small subset of test images, to evaluate initial detection accuracy, speed, and resource usage. I aim to run initial benchmarks on local vs. cloud inference, in order to confirm latency and throughput, as well as to identify optimization techniques, if needed, to meet our timing constraints.

Overall, I am on track with researching the relevant CV libraries, laying out model architecture, and investigating how to integrate the detection pipeline within our system. For the next week, I will focus on data gathering and prototyping of the CV model.