Team’s Status Report – 2 Apr 22

This week we made a great deal of progress as a group in preparation for demo day!

We each got our individual components working AND integrated together:

Individual Components

  • Samuel: Completed CV scanning system, allowing for accurate detection of scan/removal and prediction on stable image
  • Alex: Trained NN and got websockets coded and working/integrated with the API.
  • Oliver: Made a lot of progress this week. Almost fully integrated all available backend API endpoints with the front-end as well as the CV system. Product works end-to-end as a result. Also implemented “live-streaming” of CV predictions to all front-ends

Overall workflow

  1. User places fruit under camera in CV system
  2. CV system detects movement and waits for stable image
  3. CV system predicts on image and sends the top 5 predictions to backend via JSON API
  4. Back-end stores the CV predictions and their relative ranks, and emits a “newTransaction” event to client front-ends.
  5. Front-end receives the event from the API and presents a prompt for the user to confirm the CV’s predictions and commit the transaction to the back-end database.
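The data flowing through steps 3–5 can be sketched as follows. The “newTransaction” event name is from our design; the JSON field names and payload shape here are illustrative assumptions, not our actual schema:

```python
import json

# Step 3 (hypothetical shape): the CV system's top-5 predictions, ranked by
# confidence, posted to the back-end as JSON. Field names are illustrative.
cv_payload = {
    "predictions": [
        {"rank": 1, "label": "apple",  "confidence": 0.91},
        {"rank": 2, "label": "tomato", "confidence": 0.05},
        {"rank": 3, "label": "orange", "confidence": 0.02},
        {"rank": 4, "label": "pear",   "confidence": 0.01},
        {"rank": 5, "label": "banana", "confidence": 0.01},
    ],
}

# Step 4: the back-end stores the predictions with their relative ranks and
# emits a "newTransaction" event to all connected front-ends.
event = {"type": "newTransaction", "data": cv_payload["predictions"]}

# Step 5: the front-end would render event["data"] as a confirmation prompt.
print(json.dumps(event["type"]))
```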

We are happy with our current progress and product: everyone on the team worked hard on their individual parts, and we were able to integrate everything together smoothly.

With a working integrated MVP, our team is on track (in fact slightly ahead of schedule) to deliver all that was promised in earlier stages, and it is even possible that we can achieve some of our stretch goals, such as a door sensor to tell whether it is an item addition or removal.

The immediate tasks at hand (in order of priority) are:

  1. Item removal – System currently only supports item addition, but needs an interface to support removal. Should be completed quickly.
  2. Hardware integration – The system currently runs off a laptop, but we want it running on our Jetson, which currently needs its motherboard fixed. Once the new one arrives, integration should be pretty fast.
  3. Chassis/Platform – Currently we prop the camera up on white chairs or monitors facing a white table (or a table with a piece of paper), but eventually it needs to be mounted on a proper chassis and platform. However, the algorithm still works despite the different environmental conditions, indicating its robustness!
  4. Data collection – Currently the algorithm can robustly detect 14 classes of fruit (including apples, oranges, tomatoes, bananas and pears), but some of these are not commonly bought (e.g. pomegranate, kiwi, starfruit). We would need to collect some data of our own for vegetables like carrots, bell peppers, broccoli, cauliflower etc.
  5. Recipes – Was originally a stretch goal, but now something we can possibly do!

Samuel’s Status Report – 26 Mar 22

During last week’s status report, I mentioned how we needed to find a dataset which exposed the network to a wider variety of fruit.

Samuel’s Status Report – 19 Mar 22

This week was a relatively productive one: I managed to train the network on the “silvertray” dataset (https://www.kaggle.com/datasets/chrisfilo/fruit-recognition), which produced relatively robust results on test data that the network had never seen before (a green label indicates accurate detection; here we had 100% accuracy on test data).

Of course, the test data also involved the same silver trays that the algorithm trained on, so a high accuracy is expected.

I then moved on to making it perform detection on real-world data, in our C++ application with our webcam, and these are the results!

As visible in the images above, the NN is able to detect the fruits accurately in a real-world situation (including a noisy non-white background WITHOUT segmentation applied). That being said, there are some inaccuracies/misdetections, such as with the orange above, despite the frames being very similar. I describe needed improvements below.

With this, we are currently on track towards a working prototype, although we could probably train the network to handle more classes with self-collected or web-procured data.

Next week, we will begin integration of the various components together, and I will work on several improvements to the current CV algorithm/setup:

  1. Include a “change detection” algorithm that will detect when significant changes appear; this will allow us to tell when a fruit needs to be scanned.
  2. Normalization of the image before processing; this will help reduce issues with random lighting changes, but might require that the network be retrained
  3. Build the actual rig with a white background and test the algorithm on that
  4. If necessary, change to using a silver tray or silver-colored background similar to the network’s training set, and/or collect our own datasets.
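The change-detection idea in item 1 can be sketched with plain frame differencing: flag a change when consecutive frames differ a lot, then wait for a run of quiet frames before calling the image stable. The thresholds here are illustrative guesses, not tuned values from our rig:

```python
import numpy as np

DIFF_THRESHOLD = 12.0  # mean absolute pixel difference counted as "change" (illustrative)
STABLE_FRAMES = 5      # consecutive quiet frames before the image is considered stable

def frame_changed(prev, curr, threshold=DIFF_THRESHOLD):
    """True if the scene changed significantly between two grayscale frames."""
    diff = np.abs(curr.astype(np.float32) - prev.astype(np.float32)).mean()
    return float(diff) > threshold

def wait_for_stable(frames, stable_needed=STABLE_FRAMES):
    """Return the index of the first frame preceded by `stable_needed`
    consecutive quiet frames (i.e. a stable image to predict on), or None."""
    quiet = 0
    for i in range(1, len(frames)):
        if frame_changed(frames[i - 1], frames[i]):
            quiet = 0  # scene still moving: reset the quiet counter
        else:
            quiet += 1
            if quiet >= stable_needed:
                return i
    return None

# Simulated sequence: empty scene, then a fruit appears and sits still.
still = [np.zeros((4, 4))] * 3
placed = [np.full((4, 4), 100.0)] * 7
stable_at = wait_for_stable(still + placed)  # stabilizes a few frames after the fruit appears
```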

Team Status Report – 26 Mar 22

This week was a relatively productive week; we were able to get our individual parts working, and are quickly moving on towards integrating our various components into a working prototype:

    • Samuel: Successfully got a model working, with our C++ program able to detect and classify fruits relatively accurately under real-world conditions (real fruit, webcam etc.)
    • Alex: Worked with Oliver on integrating the front end and back end during the week.
    • Oliver: The real fun has finally begun – and we’re now watching our product come to life. I began the integration process between the front-end and the back-end, ironing out the kinks and paving the way for further integration ahead. I also continued work on the back-end API, adding logic for when items are returned to the fridge. This ensures that the item count and expiry dates remain correct, even when the same item is removed and placed back into the fridge.

We are currently on track in our timeline, but need to speed up for our integration components, as we suspect that this will be the part that will cause the most problems. Next week, besides working on integration, we will also be continuing to iron out issues in the individual components; in particular:

  • Samuel: Will work on background-change detection to see when fruits are coming in; attempt various preprocessing techniques to make network more robust.
  • Alex: Will finish integrating the front end with the backend API
  • Oliver: Coming up, another key integration task is to ensure that the Jetson and the CV system are integrated with the API. I will also continue working with Alex to bring other aspects of the front-end, such as editing previous transactions, to fruition.

19 Mar 2022 – Team Status Report

We mostly worked individually on our own responsibilities for the project this week:

  • Samuel: Fixed issues with neural network training, completed training on Fruits360 dataset with ResNet50 architecture. Currently working on finding better datasets to train on.
  • Oliver: Brought API online onto a live server connected to the Internet, and continued to implement API endpoints needed as part of the plan. We now have enough API endpoints to begin the integration process meaningfully
  • Alex: Worked with Samuel on researching new datasets and potential fixes for the dataset, as well as running the neural network training and fixing server issues.


For next week, we will also be continuing to work on our individual parts but will begin integration through our APIs, especially with the front end.

  • Samuel: Will work on finding (or creating our own) and training network on new datasets. Will work on background segmentation (as needed for new datasets or white background).
  • Oliver: Will work on integrating the back-end with the front-end created by Alex, so that Alex can be unblocked on that front and can continue work on the front-end. Will also continue with remaining back-end API endpoints
  • Alex: Working with Oliver on integrating the front-end and back-end

Samuel’s Status Report – 19 Mar 22

This week, I focused heavily on getting the neural network to work properly. At the beginning of the week, I successfully trained the neural network on the new ResNet18 architecture (as opposed to the old one that did not work). After I realized that it didn’t work as well as expected on real data, I swapped to the more advanced ResNet50 architecture, but that did not seem to help either.

It was then that I began to suspect something else was wrong besides the network itself, because the networks kept reporting 90+% validation accuracy, yet the predictions were wrong whenever I tested the code, even on training images. This hinted at a problem with my testing code/script. Eventually, I realized that during the network training process we were passing in normalized images, and the network was training on those; once I changed my test/evaluation script to feed normalized images into the network, everything worked very well!
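The fix boils down to applying the exact normalization used at training time in the evaluation script as well. A minimal sketch of that preprocessing step; the ImageNet mean/std values here are an assumption for illustration, and the real pipeline must reuse whatever the training transforms used:

```python
import numpy as np

# Channel-wise mean/std the network was trained with. These ImageNet values
# are an assumed example; mismatching them at eval time reproduces the bug.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image_uint8):
    """Scale an HxWx3 uint8 image to [0, 1] and apply the training-time
    normalization. Skipping this step at evaluation is exactly the
    train/test mismatch that made a 'working' network predict garbage."""
    x = image_uint8.astype(np.float32) / 255.0
    return (x - MEAN) / STD
```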

However, as I began testing the network on various images, we realized that the network was not very robust on external data:

After scrutinizing the dataset, we realized it was not good enough, with some major flaws that made the network susceptible to overfitting. Most importantly, each category was a 360-degree shot of a single fruit, so even though there were many images per fruit, the network saw only one physical example from each category, making it hard for the network to generalize based on colour, shape etc.

To resolve this problem, I would need to search for more datasets, parse them, and train our network on them. This will be my focus for next week. Currently, I have found several datasets; however, they each have their own issues. The most promising one I have found so far is very similar to our use-case, with images of fruits taken from a top-down view, but has a reflective silver tray background which is very hard to segment away. Some pictures also have groups of fruit:

I will first try training the network on center-cropped and resized images, and if that does not work, I will try algorithms like Otsu thresholding on the saturation channel, or GrabCut, to segment away the background.
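As a rough sketch of the Otsu option, here is the method applied to a synthetic saturation channel (the pixel values are made up for illustration; a real channel would come from an HSV conversion of the webcam image). The idea is that vivid fruit pixels are more saturated than the dull silver tray, so a threshold maximizing between-class variance separates the two:

```python
import numpy as np

def otsu_threshold(channel):
    """Otsu's method on a uint8 channel: choose the threshold that maximizes
    the between-class variance of the two resulting pixel classes."""
    hist = np.bincount(channel.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    omega = np.cumsum(p)                # probability of the "<= t" class
    mu = np.cumsum(p * np.arange(256))  # cumulative mean up to each level
    mu_t = mu[-1]                       # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    # Endpoints divide 0/0; zero them out before taking the argmax.
    return int(np.argmax(np.nan_to_num(sigma_b)))

# Hypothetical saturation channel: 200 dull background pixels, 100 vivid fruit pixels.
sat = np.concatenate([np.full(200, 30, np.uint8), np.full(100, 200, np.uint8)])
mask = sat > otsu_threshold(sat)  # True where the (assumed) fruit pixels are
```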

Team Status Report – 5 Mar 2022

This week, we primarily focused on our design review report, which was more work than expected. Thankfully, the deadline was extended, and most of the content was already thought through or covered in the presentation; we just needed to spend time writing it out. In particular, we decided to make our block diagram a lot more detailed:

Old Block Diagram

Our old block diagram used in the slides was primarily meant as a summary for visual purposes, and therefore lacked the detail needed for the report.

New Block Diagram

Our new block diagram is a lot more detailed, with specifics regarding algorithms, APIs and data transfer; however, this would have been too confusing for a presentation.


Most of our time this week was spent on the design report, and not much was done on the implementation side. However, we are still quite comfortably ahead of schedule since we began implementation early. With regards to the design report, we split up the roles equally, with each team member taking care of the architecture and implementation components related to their specialization:

  • Samuel:
    • Architecture, Implementation, Testing (CV + Attachment System)
    • Introduction, Use-Case
    • Trade studies (CV-related)
    • Related work
  • Alex:
    • Architecture, Implementation (Front-end/UI)
    • Use-Case, Design requirements
    • Trade Studies (misc)
    • Risk-mitigation
  • Oliver:
    • Architecture, Implementation, Testing (Databases/Back-End, APIs)
    • Project management (Schedule, Responsibilities, Materials)
    • Risk-mitigation
    • Summary

Samuel’s Status Report – 5 Mar 2022

This week, we spent most of our time writing up our design review report. Although a lot of the main content had been covered in our design review presentation, the devil was in the details for this report. In particular, we needed to make our block diagrams a lot more detailed since the ones we used in the slides were merely summaries.

We were very thankful that the deadline for the report got extended as that reduced the amount of stress we had, and allowed us to write a more polished report.

On the implementation side, I am slightly “behind schedule” in the sense that I was not able to get as much work done on the new neural network implementation as I had hoped to, because I was focusing on the report instead. However, we are still ahead of schedule since we already have an implementation going.

Next week, I will focus on implementing and training the ResNet18 network, and then testing the accuracy on a self-collected dataset of various fruits.

Team Status Report – 26 Feb 2022

This week, we completed our design review presentation, which we think went quite well. Our main focus for next week will be the design review report, due Wednesday next week. After Wednesday, we will continue working individually on our various responsibilities to implement the CV, UI and back-end systems, for Samuel, Alex and Oliver respectively.

Currently, we have made good progress on the implementation side, and are slightly ahead of schedule in this sense (see our individual reports for more information):

  • Samuel: Completed C++ testing for CV. Discovered that the CNN network found from Medium does not work well and has major flaws with the customized architecture. Will begin work next week training and testing a new ResNet18 or AlexNet model.
  • Oliver: Enforced a rigorous common standard in the back-end code-base by integrating automatic linting, type-checking, and even bug catching tools. Brought the code-base to strict, 100%, type-safe standards, setting up the back-end for seamless and co-operative development regardless of each team member’s individual style, and ensuring that code pushed meets high levels of rigor. Will deliver a core set of APIs next week built upon this level of rigor for integration with front-end
  • Alex: Completed most of basic UI. Will start collaborating with Oliver once API side is complete. Helped Samuel with the training of the classification algorithm

Samuel’s Status Report – 26 Feb 2022

The highlight of this week for me was delivering the Design Review presentation. I think I did quite well, with my teammates and the Professor commenting that the presentation was polished, with good graphics and content. We are currently working on the design report, and will use the content of the design presentation to write it.

On the technical side, I was able to make significant progress on the PyTorch port for the CV application, which was coded in C++ so that it can run optimized code on the Jetson, for maximum speed. The C++ application is now able to run the trained PyTorch model and spit out a set of predictions for the top 5 highest probability classes.
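That top-5 step can be sketched in Python (the quick-testing twin of the C++ app mentioned below); the class list and logits here are purely illustrative, not our trained model’s outputs:

```python
import numpy as np

# Illustrative subset of class labels; the real model has its own label set.
CLASSES = ["apple", "banana", "orange", "pear", "tomato", "kiwi"]

def top5(logits, classes=CLASSES):
    """Softmax the raw network outputs and return the 5 most probable
    (label, probability) pairs, highest first."""
    z = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    probs = z / z.sum()
    order = np.argsort(probs)[::-1][:5]
    return [(classes[i], float(probs[i])) for i in order]

# Example with made-up logits: "apple" has the largest raw score.
predictions = top5(np.array([4.0, 1.0, 2.5, 0.5, 0.0, -1.0]))
```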

There were several challenges with the porting process, including the lack of good PyTorch documentation, which made it very difficult to figure out how to properly convert between C++ formats (cv2 to torch Tensor, for example), along with important considerations in ensuring that the model can be serialized properly for running in C++ (in particular, no nested functions etc.). This is a lesson learnt on the importance of good documentation, and the pain of having to pore through various forums and articles as a result.

However, after training and testing the network, I began to realize big problems with the trained model. Most notably, the model failed to produce correct predictions. After consulting with Prof Mario’s PhD student, we realized that we were using a highly customized model that was not designed properly, and was not even a proper “ResNet” (lacking fully residual layers). To this end, he advised us to use other preexisting models like ResNet18 or AlexNet. This is a lesson learnt not to blindly copy code from the internet.

Next week, I will focus on trying to train either ResNet18 or AlexNet on our data, as well as testing it in the new C++ classifier. (There is also a Python one for quick testing, in case the C++ one still has bugs.) Hopefully I will be able to train a network that achieves our desired accuracy of 85% (the network itself should reach about 95% validation accuracy).

Fortunately, despite this setback, we are currently still on schedule, because we were previously ahead of schedule with the successful porting of the C++ application.

Team Status Report – 19 Feb 22

This week, we worked as a team primarily on finalizing and documenting our design as part of the design review presentation and report. In particular, we finalized our overall architecture and the way in which everything will integrate together. Notably, our original design of a front facing camera with a vertical platform got changed to one where the camera faces downward on a flat white surface. This was to facilitate a more intuitive and non-intrusive scanning process.

Architectural Overview:

> Architecture (Original Handwritten)

The following is a summary of our individual work (for more details, visit the individual status reports):

  • Samuel was able to write the code needed to train the neural network classifier (we now have a pretrained model with 98% accuracy), and is working on a C++ application that can use it to classify an image.
  • Oliver began work on the backend, and defined the database schema and column relations in Prisma. He is now working on implementing the API endpoints defined in the architecture.
  • Alex was able to complete a significant chunk of the front-end design, and get it running at https://capstone.astrasser.com:2096/

Next week, we will continue working independently on our individual portions of developing the CV, UI and back-end (more details in the individual reports). As a group, we will also start building our fake fridge and scanning platform.