Oliver’s Status Report – 30 Apr 2022

This is the last status report for the project! As we step into finals week, I crafted and delivered the final presentation, and it seems to have turned out well!

The CV system is now also capable of making quantity guesses. I updated the API specification on the backend to support this new capability, allowing quantity information to be sent from the CV system to the backend, and then on to the frontend for display in the confirmation dialog. This required a mini-rewrite of the schema as well as the relevant API endpoints, but it was not too troublesome and I was able to deliver it in just 2 days.
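As an illustration of the change (not our actual schema), the prediction payload now carries a quantity alongside the candidate labels. The endpoint path and field names in this Python sketch are hypothetical:

    # Hypothetical sketch of the updated CV -> backend payload; the real
    # endpoint and field names in our API differ.
    import requests

    payload = {
        "predictions": [                # candidate labels, most probable first
            {"label": "apple", "probability": 0.91},
            {"label": "peach", "probability": 0.05},
        ],
        "quantity": 3,                  # the new field carrying the CV's count
    }
    requests.post("http://localhost:3000/api/detections", json=payload)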

In preparation for the final presentation as well as the poster, I updated the overall architecture diagram, consolidated our test results, and compared them against our use case requirements. All of our explicitly defined use case requirements have been achieved (see the presentation)! This is a testament to our good work, both individually and as a team.

Coming up, there are some stretch goals and “nice-to-haves” that we can try to implement. Of higher priority are the final report and demo, and with those we will wrap up our project, on time and on schedule!

Alex’s Status Report 4/30

This week, I finished the final project presentation and watched everyone else’s presentations. For our project, I also worked on attaching the mechanism to the fridge and training a new network based on the data Samuel collected. The new network still seems to be struggling on the custom data we collected (as the last one we tried did), with a validation accuracy of only 30%. If we aren’t able to figure out the bug, we will fall back on our old network, which seems to be working well.

Besides attaching the mechanism to the fridge, I got the system to auto-boot on power-up and to be accessible remotely over SSH in case we need to reach it over the CMU-DEVICE network.

Next week, I want to work with Oliver to finish off push notifications for expiring items, as well as recipe recommendations on the front end. This should not be too hard with the infrastructure I already have in place. We will also work on finishing up the poster tomorrow and then film the final video. There’s a fair bit of wrap-up work to do, but overall we are in great shape!

30 Apr 22 – Samuel’s Status Report

Since I completed quantity detection last week, there was not much left to do on the CV side of the project besides testing and experimenting with model training and data collection, which I did. I also helped out with the installation of the system onto an actual fridge.

During testing, I found some edge and potential failure cases, and added more robustness checks. In particular, by using my white background detection, I fixed an issue where the FSM would move into the wrong state if the user removed fruit from the platform one by one, or added more fruit after a prediction was done. The final FSM is shown in the figure below.
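To give a rough idea of how the white background check guards the transitions, here is a simplified sketch (the state names and guard conditions are stand-ins, not the actual FSM code):

    # Simplified sketch of one guarded FSM step; state names and guard
    # conditions are illustrative stand-ins for the real CV code.
    def next_state(state, frame_changed, platform_is_white, prediction_done):
        if state == "BACKGROUND" and frame_changed and not platform_is_white:
            return "PREDICTION"       # something was placed on the platform
        if state == "PREDICTION" and prediction_done:
            return "WAIT_REMOVAL"     # prediction shown; wait for pickup
        if state == "WAIT_REMOVAL" and platform_is_white:
            return "BACKGROUND"       # platform is fully clear again
        return state                  # partial removals or re-additions stay put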

I also trained the model again with some new fake fruits, but the validation accuracy was still poor (although training accuracy was good). From the graph shown below, it seemed that the CNN was not learning the underlying structure of the fruits, but was instead overfitting to the training data. Most likely, this was the result of not having enough data to learn from for the new classes, thus creating confusion between fruits.

Next week, the focus will be on final testing of the integrated system (although we have actually tested quite a bit already, and the system seems fairly robust), and preparation for the final video and demo. CV-wise, we are definitely ahead of schedule (basically done), since increasing the number of known fruit/vegetable classes was somewhat of a reach goal anyway.

Oliver’s Status Report – 23 Apr 2022

It turned out that the fix applied last week on the backend, relating to items not being properly decremented upon removal, was incomplete, and there were still edge cases. Thankfully, this was uncovered during further testing and integration this week, well before the final demo. The bug has now been fixed (and verified through the same rigorous testing process) via a complete rewrite of the item quantity computation logic. In short, instead of returning a single result after one pass through the transactions, it now makes two passes in order to properly account for item returns. Both approaches run in O(n) time, so there should be no significant difference in backend latency as a result of the change. In exchange, not only is the bug fixed, but the logic and code are also much clearer and more readable, which will significantly help any future efforts to expand the backend.
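The real backend is written in TypeScript, so the following is only a hypothetical Python sketch of the two-pass idea (including the floor-of-zero clamp described below); the names and exact semantics are invented:

    # Hypothetical sketch of the two-pass quantity computation; the real
    # backend is TypeScript and its details differ.
    def current_quantity(deltas):
        """deltas: chronological list of signed quantity changes for one item."""
        # Pass 1: total up everything that was ever removed.
        removed = sum(-d for d in deltas if d < 0)
        # Pass 2: consume additions first-in-first-out against the removals,
        # so items that are returned later are counted correctly.
        remaining = 0
        for d in deltas:
            if d > 0:
                consumed = min(d, removed)
                removed -= consumed
                remaining += d - consumed
        return max(remaining, 0)  # clamp at zero rather than going negative

Both passes are single loops over the transaction list, hence the O(n) claim above.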

I have also made the backend more robust against incorrect quantity data, e.g. “removing” more apples than there are in the fridge. In essence, the backend now computes quantities with a floor of 0 instead of going into negative numbers, which would not be logical. We initially passed over this aspect of the project as we prioritized features suitable for the demo, but with the product maturing, the time was right to implement it.

My main priority this weekend is to produce and complete the final presentation for our project. I recorded several video demos that I intend to incorporate into the slides, and collected performance and benchmarking data to compare against our use case requirements (good news: we’re meeting them!).

Coming up, the largest priority is to wrap up the project as the semester comes to a close, in the form of the final presentation, poster, and final demo. We also intend to improve user-friendliness and productivity even further by allowing the user to select multiple fruits in one confirmation, based on our discovery that the CV system is capable of predicting all 3 different types of fruits when all 3 of them are placed on the platform.

We’re comfortably ahead of schedule! The project, including the hardware, is now all in place, integration tests are nearly complete, and the semester is coming to a close. This will mark a great end to our CMU careers!

23 Apr 22 – Samuel’s Status Report

This week, we focused on integration and testing. I also made some minor improvements to the CV algorithm and attempted to collect more data for training.

Integration with Jetson

The Jetson was surprisingly annoying and difficult to set up, and I spent at least 10 hours just trying to get my CV code to run properly on it. In particular, installing dependencies like PyTorch and OpenCV took a long time; we needed to compile many of them from source (which came with its own host of errors) because the Jetson is an ARM aarch64 system, incompatible with the x86_64 binaries that most packages ship precompiled. The various issues were compounded by the fact that the Jetson is slightly old (an older version of Ubuntu, and limited RAM and storage).

Even after I/we* managed to get the code up and running on the Jetson, we had significant problems with the speed of the system. I/we at first tried various fixes, including turning off visual displays and killing processes. Eventually, we realized that the bottleneck was … RAM!!!

What we discovered was that the Jetson took 1-2 minutes to make its first prediction, but ran relatively quickly (~135 ms) after that, compared to my computer, which runs a single prediction in ~30 ms. When Alex was debugging with a display of system resources, we pinpointed the issue to a lack of sufficient RAM when loading and using the model for the first time: the model was simply too big to fit into RAM, and a major bottleneck came from having to move some of that memory into swap. Once that was done, the algorithm ran relatively quickly. However, because it still relies on swap accesses instead of the faster RAM, predictions on the Jetson remain slower than on my computer. Nonetheless, it runs fast enough (~135 ms) after this initial “booting” stage, which has now been integrated into the “loading” phase of my CV code.
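The integration simply pays that first-prediction cost up front. A minimal sketch of the warm-up idea (the stand-in model and input size below are assumptions, not our real setup):

    # Sketch of the warm-up now folded into the CV "loading" stage; the
    # stand-in model and input size are assumptions, not our real setup.
    import time
    import torch
    import torchvision

    model = torchvision.models.mobilenet_v2()  # stand-in for our classifier
    model.eval()

    start = time.time()
    with torch.no_grad():
        model(torch.zeros(1, 3, 224, 224))     # first call pages the weights in
    print(f"warm-up: {time.time() - start:.2f}s; later calls are much faster")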

*While I was in charge of/did most of the debugging, my teammates were also instrumental in helping me get it up and running (it is, after all, Alex’s Jetson), so credit should be given where it is due 🙂

CV Training

While trying to fix/install dependencies on the Jetson, I had also, in parallel, attempted to collect some more data with the new fake fruits that came in, including for a new “Lemon” class. However, our model could not converge properly. I believe this was because some of the fake fruits/vegetables, like the peach and pear, were not very high quality and looked fairly different from the ones in the original dataset (and in real life), so the model performed poorly when validated against our original test images. Next week, I aim to try training on only the fake fruits/vegetables that look realistic enough (like the apple, lemons, and eggplant). That being said, the algorithm already performs very well on some of the semi-realistic fake fruit, like the starfruit and banana shown in Figure 1 below.

During testing, I was pleasantly amazed by the neural network’s ability to detect multiple fruits despite being a classifier that simply outputs a probability for each class. As can be seen in Figure 1 below, the fruits captured are Apple, Banana, and Starfruit, and these appear as the top 3 probabilities on the screen, as detected by the network.

Figure 1: Multiple Fruits Detection
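Reading off the top 3 classes from a classifier is straightforward with torch.topk; a small illustrative sketch (the class list and logits below are made up):

    # Illustrative sketch of extracting the top-3 predictions; the class
    # list and logits are stand-ins for the real classifier's output.
    import torch

    CLASSES = ["apple", "banana", "starfruit", "peach", "eggplant"]
    logits = torch.randn(1, len(CLASSES))     # stand-in for model(frame)
    probs = torch.softmax(logits, dim=1)[0]

    top = torch.topk(probs, k=3)              # three most probable classes
    for p, idx in zip(top.values, top.indices):
        print(f"{CLASSES[idx]}: {p.item():.2%}")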

Minor Improvements – White Background Detection

After spray painting the platform with Alex, we now had a good white background to work with. Using this, I wrote some simple (and efficient) code that detects whether the background is mostly white by converting the image to the HSL representation and checking how many pixels are above a certain lightness threshold.
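A minimal sketch of the check (the thresholds here are assumed tuning values, not necessarily the ones in my code):

    # Minimal sketch of the white background check; thresholds are assumed.
    import cv2
    import numpy as np

    def background_is_white(frame_bgr, lightness_thresh=200, frac_thresh=0.85):
        """Return True if most pixels in the frame are near-white."""
        hls = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HLS)
        lightness = hls[:, :, 1]                  # L channel, 0-255
        white_frac = np.mean(lightness > lightness_thresh)
        return white_frac > frac_thresh

(OpenCV calls this color space HLS; it is the same HSL representation described above.)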

Since my algorithm currently uses motion (i.e. pixel changes between frames) to switch between states (background, prediction, waiting for the user to take their fruit off) in my internal FSM, this white background detection adds an important layer of robustness against unforeseen changes, like an accidental hand swipe, lighting changes, or extreme jerks to the camera. Without it, the CV system might accidentally go into a state it is not supposed to be in, such as awaiting fruit removal when there is no fruit there, and confuse/frustrate the user.

Future Work

We are currently far ahead of schedule in terms of what we originally wanted to do (robust CV algorithm, fruit + vegetable detection), but there are a few things left to do/try:

  1. Quantity detection: This can be done by using white background segmentation (since I already have a basic algorithm for that) + floodfill to get a rough count of the number of fruits on the platform (see the sketch after this list). Right now, our algorithm is robust to multiple fruits, and there is already an API interface for quantity.
  2. Adding more classes/training: As mentioned above, I could try retraining the model on new classes using the fake fruits/vegetables, plus perhaps some actual ones from the supermarket. Sadly, my real bell peppers and apple are already inedible at this point 🙁
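For item 1, a hypothetical sketch of the counting idea, using connected components over the non-white mask in place of explicit floodfill (the thresholds are assumed values):

    # Hypothetical sketch of quantity detection: count connected blobs of
    # non-white pixels on the platform. Thresholds are assumed values.
    import cv2
    import numpy as np

    def count_fruits(frame_bgr, lightness_thresh=200, min_area=500):
        hls = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HLS)
        fruit_mask = (hls[:, :, 1] < lightness_thresh).astype(np.uint8)
        # Each connected blob of non-white pixels is (roughly) one fruit.
        n, _, stats, _ = cv2.connectedComponentsWithStats(fruit_mask)
        areas = stats[1:, cv2.CC_STAT_AREA]    # label 0 is the background
        return int(np.sum(areas >= min_area))  # ignore blobs too small to be fruit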

Alex’s Status Report 4/23

This week, I worked on finishing some UI fixes and integrating everything together. First, I finished the nutrition information popup.

Then, Samuel and I spray painted the platform for better visibility. The computer vision algorithm improved drastically with the nicer white background.

Next, I set up the Jetson for the computer vision processing. We had a lot of issues with performance, but I discovered that disabling all of the GUI processes improved performance slightly. We wanted to improve performance further using TensorRT, but I found it very hard to compile (a resource-limited ARM processor) and no precompiled binaries were available. Fortunately, disabling the GUI was enough, but there appears to be an issue with TorchScript that makes the first detection extremely slow. After “warming up” it performs much better, so we will account for this in the code. My hypothesis is that the system needs to work out what goes into the swapfile, since the process has to use about 1 GB of swap, which is significantly slower than RAM.

Finally, I tested the performance of the algorithm, and we worked out a few bugs in the API and made sure that we are in good shape for the demo. I also collected detection times for our performance evaluation, then worked on putting all of this data into the final presentation.

Oliver’s Status Report – 16 Apr 22

This week I have been working on testing and ironing out the bugs that have been identified. One bug that cropped up during the demo presentation itself was in the backend logic, resulting in items not being properly decremented after removal. This stemmed from our transaction-oriented database design: instead of a running total, a list of transaction deltas is stored in the database. A bug in the logic that collapses these transactions (one loop was reading from the front while an inner loop was reading from the back) caused the unexpected behavior. Fixing it was relatively nasty, as it meant having to store more information for the computation – the original quantity of each addition. This bit of logic is especially complicated due to the “first-in-first-out with replacement” nature of our application.
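To illustrate the transaction-oriented design with invented numbers: rather than storing “2 apples”, the database stores the per-event deltas, and the current quantity is computed by collapsing them:

    # Invented example of the transaction-delta model: each entry is a
    # signed change to one item's quantity, in chronological order.
    apple_deltas = [+3, -1, +2, -2]   # add 3, remove 1, add 2, remove 2

    print(sum(apple_deltas))          # naive collapse: 2 apples in the fridge
    # Expiry tracking, however, needs to know *which* additions remain
    # (first-in-first-out with replacement), which is where the subtle
    # front-vs-back loop bug described above lived.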

Additionally, I have also begun work to expand the database to support presenting nutritional information, a new feature we are planning to introduce. I will be expanding the schema with new tables and creating new API endpoints for the feature. I expect this to take around 1-2 weeks.

Overall, we are on track and I am personally confident in our ability to finish this project satisfactorily.

Alex’s Status Report 4/16

Last week, we used the saws and equipment from Carnival to cut our wood into usable pieces. This week, we finally got the chance to assemble them into a folding platform to assist our computer vision algorithm. This is necessary to create a nice, light background that is easy to mask out and detect changes against. For example, when someone walks by, there won’t be a significant change in what the camera sees, since it won’t see the foot, just small changes in lighting. The platform also helps to get more regular sizing of fruit: users will know to hold items at a consistent height (next to the platform), which makes classification and segmentation much easier.

The folding feature is a small one, added so that the fridge takes up less square footage when scanning is not in use.

This week I finished construction of the folding platform for the camera, using the pieces we had cut last week. We found various wood screws in TechSpark to finish attaching the foldable hinges. The wood screws were slightly long, so I used hot glue to cover up anything dangerous that could catch.

We also purchased white spray paint for the platform to lower the barriers to detection even further, but when Samuel ran the computer vision it appeared to work fine with just the light-colored wood. I may still paint it next week for aesthetic purposes.

On the front end side, I have begun work on a nutrition information system in place of the recipe recommendation system. The recipe recommendation system was proving very difficult to integrate and test, since its timers run for very long periods and required extra frontend and backend integration. Instead, clicking on an item in the interface will display nutritional information about it; for an apple, a popup shows nutrition facts (sourced from Montana State University’s MSU Extension page on apples).

I’m still in the process of integration and will work to finish this up next week. Overall, we’re in very good shape. I just need to get the front end up and running on the tablet next week.

Samuel’s Status Report – 16 Apr 22

This week I mostly helped Alex build the platform for the camera setup, and then attempted some basic tests on the setup. I am also currently training a model based on the data collected in previous weeks (and some from this week, after getting the platform up).


Figure 1: Example of Fully-Constructed Platform

Figure 2: Example of self-collected data (here, of yellow squash) taken on new platform


More notably, I have been trying to improve the robustness of the change detector, since I predict issues with the CV algorithm falsely detecting a change due to illumination changes from the fridge door opening, a person walking around, etc.

To this end, I tried using division normalization, where we divide the image by a very blurred version of the same image, as per https://stackoverflow.com/a/64357685/3141253
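A minimal sketch of the technique (the blur strength is an assumed tuning value):

    # Minimal sketch of division normalization, following the linked
    # Stack Overflow answer; the blur sigma is an assumed tuning value.
    import cv2

    def division_normalize(gray):
        """Flatten illumination by dividing out a heavily blurred copy."""
        blurred = cv2.GaussianBlur(gray, (0, 0), 33)
        return cv2.divide(gray, blurred, scale=255)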

While this made the algorithm very robust against illumination changes, it also reduced its sensitivity significantly. One particularly problematic case occurred because my skin color was similar to the birch wood color, so my hand was “merged” into the background. A red apple produced a better response, but not always enough to trigger a change. With this in mind, we have plans to paint the platform white, in the hope that this creates a larger differential between natural colors, like those of skin and fruit, and the artificially white background.

Another alternative is to play around with the saturation channel of the HSV representation. Or, since the main issue is falsely detecting a change that moves the FSM into the background state, we could potentially check whether an extra object is truly present using a Harris corner detector.

Next week, in addition to trying the aforementioned items, I will also be helping to integrate the CV algorithm on the Jetson. We are currently on schedule, but might need to speed up the integration component so we can spend more time on robustness testing.

Oliver’s Status Report – 2 Apr 22

This week was an incredibly productive one for me: the primary blockers, such as the development environment and library issues, had been tackled in the preceding weeks, leaving me free to implement the meat of the backend swiftly. I delivered multiple API endpoints and worked closely with Alex and Samuel to integrate the API with their frontend and CV system respectively, producing a cohesive product very rapidly.

Another thing I achieved this week was refactoring the database schema to better support Samuel’s idea of sending a list of items the CV system has predicted, in descending order of probability, and presenting it to the user. The previous schema did consider such a use case, but was not well designed for it, and did not store the CV’s predictions in the database at all. By swiftly refactoring (made much easier by the Prisma ORM and TypeScript!), I managed to deliver the results we wanted for demo week and beyond.

Going by my personal Gantt chart, we are now exactly 1 day ahead of schedule. That is a fantastic sign, and I believe that at this pace, we can move even faster. For the following weeks, I have identified several improvements the API could use, such as coalescing multiple transactions within a single day, as well as fixes for some security vulnerabilities.