Lucky’s Status Report for April 30, 2022

This week I did the following tasks:

  • Integrated motion and object detection for live testing
    • Completed a functional live setup
      • Used a webcam rather than the Raspberry Pi to ensure functionality prior to dealing with the increased complexity of the Raspberry Pi
        • Used a single webcam to test out the pivot we recently decided on: using a single camera for both motion and item detection
      • Physical parameters of setup
        • Camera placed 5 ft away from the front of the shelf and 3 ft above the ground
        • Items Tested:
          • protein powder, quick oats, empty milk container, cereal, chia seeds, peanut butter, brownie mix, tuna, ziploc box, vegetable oil, paper towels
      • Recorded a video clip of the testing, and incorporated that into our final presentation
        • System was able to detect motion
        • Finite state control mechanism functioned properly to trigger the object detection (a sketch of this trigger flow follows this list)
        • Object detection worked accurately – presence detection and bounding boxes were correct – for all of the items except the following:
          • empty milk container, vegetable oil
  • Continued working on the video script
  • Worked on the project poster
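To make the trigger flow concrete, here is a minimal sketch of the motion-to-detection hand-off described above, assuming OpenCV's MOG2 background subtractor; the pixel threshold and the run_object_detection() hook are hypothetical placeholders, not our actual values or API:

```python
import cv2

# Hypothetical sketch: IDLE -> MOTION (shopper in aisle) -> trigger detection -> IDLE
backsub = cv2.createBackgroundSubtractorMOG2()
MOTION_PIXEL_THRESHOLD = 5000      # assumed tuning value, not from our tests

def run_object_detection(frame):   # placeholder for the detection subsystem
    pass

cap = cv2.VideoCapture(0)          # the webcam standing in for the Pi camera
state = "IDLE"
while True:
    ok, frame = cap.read()
    if not ok:
        break
    moving = cv2.countNonZero(backsub.apply(frame))
    if state == "IDLE" and moving > MOTION_PIXEL_THRESHOLD:
        state = "MOTION"             # motion detected in the aisle
    elif state == "MOTION" and moving <= MOTION_PIXEL_THRESHOLD:
        run_object_detection(frame)  # aisle settled: trigger object detection
        state = "IDLE"
cap.release()
```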

In terms of schedule, my biggest worry is setting up the wireless system with the Raspberry Pis and integrating control and file transfer with the website.

What Next

  • Establish working system with wireless raspberry pi camera setup
  • Establish file transfer and control, i.e. functional integration between the Raspberry Pi and the website

Lucky’s Status Report for April 23, 2022

This week I did the following tasks:

  • Worked towards integration
    • To make integration more seamless, I decided to include the object detection source code with the web app source code
      • This required transitioning the source code from a script meant to run on the Raspberry Pi camera subsystem to an internal, object-oriented library of functions that the web app can call
        • This enables the web app to more easily supply the desired aisle and items, and to receive the desired JSON object, without much hassle over transfer protocols
          • The web app will create a class instance for each aisle by calling the object-oriented library and supplying the necessary information, i.e. the items expected in that aisle
          • Each class instance has its own set of detection functions, allowing distributed detection across multiple aisles if needed (a sketch of this structure follows this list)
      • Also created a testing script for the library
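As an illustration of this structure, here is a minimal sketch of what the per-aisle, object-oriented library might look like; the names (AisleDetector, detect, _match) are hypothetical stand-ins, not our actual source:

```python
import json

class AisleDetector:
    """One instance per aisle, created by the web app."""

    def __init__(self, aisle_id, expected_items):
        self.aisle_id = aisle_id
        self.expected_items = expected_items   # items expected in that aisle

    def detect(self, shelf_image):
        # Run the matching routine for each expected item (stubbed here)
        results = {item: self._match(item, shelf_image)
                   for item in self.expected_items}
        # Hand the web app the JSON object it wants; no transfer protocol needed
        return json.dumps({"aisle": self.aisle_id, "items": results})

    def _match(self, item, shelf_image):
        return False   # placeholder for the actual matching logic

# Usage from the web app, one instance per aisle:
aisle_3 = AisleDetector(3, ["cereal", "peanut butter", "chia seeds"])
# report = aisle_3.detect(latest_capture)
```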
  • Began planning the functionality of displaying an image of the detected shelf (a cool feature idea from Prof. Savvides that we will try after completing MVP)
    • The overall programmatic design I am working on is as follows (sketched after this list):
      • Produce empty shelf image base layer
      • As the sliding window moves through during detection, store the presence of items along with the x and y coordinates
      • For each item that was detected as present, use the x and y coordinates to add an additional image layer in the location where it was detected
        • The images will be preprocessed stored images for our global item set
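A minimal sketch of this layering idea, assuming Pillow and hypothetical file names, with the preprocessed item images assumed to have transparent backgrounds:

```python
from PIL import Image

def render_shelf(base_path, detections, item_image_paths):
    """Paste preprocessed item images onto the empty-shelf base layer."""
    shelf = Image.open(base_path).convert("RGBA")
    for item, (x, y) in detections:            # (x, y) from the sliding window
        layer = Image.open(item_image_paths[item]).convert("RGBA")
        shelf.paste(layer, (x, y), layer)      # third arg keeps transparency
    return shelf

# Hypothetical usage with coordinates stored during detection:
# detections = [("cereal", (120, 40)), ("tuna", (300, 210))]
# render_shelf("empty_shelf.png", detections,
#              {"cereal": "cereal.png", "tuna": "tuna.png"}).save("shelf_view.png")
```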
  • Live testing and prep for demo
    • Finalized workable live capture code
    • Acquired groceries for testing and demo day
    • Began testing of live capture object detection
  • Final Presentation
    • Put together tests and visualizations for the presentation
  • Started working on the video script


In terms of schedule, I feel a bit pressured: I am basically hoping things go as planned while preparing for when they do not, which is almost always the case for engineering projects. My biggest worry is the trigger mechanism between the background subtraction subcomponent and the object detection subcomponent, because it requires communication between multiple devices.


What Next

  • Test, test, and test some more, because we need to produce quantitative data and documentation of the decisions we have made thus far
  • Finalize integrated system
  • Work on setup for demo
    • Cameras and shelf locations / distances
    • Script / overview of our demonstration
  • Work on video script
  • Storyboard poster components

Lucky’s Status Report for April 16, 2022

This week I did the following tasks:

  • Deeper research into more complex segmentation approaches
    • Upon further research into and design of the sliding window, I found some disadvantages to the approach
      • A fixed window size or aspect ratio can result in windows that fail to capture full objects, or in expensive computation
    • I learned about modern techniques that were developed to tackle that issue, e.g. R-CNN, Fast R-CNN, Faster R-CNN, and more
    • Therefore, I compared K-means, Mean Shift, and graph-based segmentation (a K-means sketch follows the comparison)
      • K-Means: predefined number of clusters; randomly set a centroid for each cluster (the centroid will be the mean, i.e. the average of the colors in a cluster); assign each point to the nearest mean, thereby creating clusters; then recompute the means and repeat until the means stop moving, i.e. convergence
        • Pros:
          • Not as complex as the other two approaches
          • Very fast
          • Multiple initialization methods can be used
        • Cons:
          • Manually have to set number of clusters
          • Sensitive to initialization
          • Sensitive to outliers
      • Mean Shift: for each pixel, first compute the mean of the pixels in a surrounding window, then move the pixel's assigned centroid to the new mean location; repeat until the mean converges to the nearest mode, i.e. the local extremum, a hill of a normalized density feature mapping. Each hill is the centroid of the pixels around it, thus separating the pixels into clusters
        • Pros:
          • Simple algorithm
          • No initialization required
          • Robust to outliers
        • Cons:
          • Computationally expensive
          • Sensitive to window size parameter
      • Graph-Based Segmentation: each pixel is a vertex; an edge is produced between neighboring pixels; each edge is assigned a weight based on the affinity / similarity between the pixels/vertices; then the graph is partitioned into subgraphs such that pairs of pixels in the same subgraph have high affinity and pairs in different subgraphs have low affinity, i.e. a min-cut
        • Pros:
          • Complex segmentation capabilities
        • Cons:
          • There is a bias to cut into smaller segments
            • Requires normalizing cuts to favor larger subgraphs
              • No known polynomial time solution for minimizing the cost of a normalized cut, but there are approximations
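For reference, here is a minimal sketch of the K-means approach described above using OpenCV's cv2.kmeans; the k value and termination criteria are illustrative assumptions (we did not end up choosing this approach, as noted below):

```python
import cv2
import numpy as np

def kmeans_segment(image, k=4):
    """Cluster pixel colors with K-means and paint each pixel its cluster mean."""
    pixels = image.reshape(-1, 3).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(pixels, k, None, criteria,
                                    attempts=5, flags=cv2.KMEANS_RANDOM_CENTERS)
    segmented = centers[labels.flatten()].astype(np.uint8)
    return segmented.reshape(image.shape)

# segmented = kmeans_segment(cv2.imread("shelf.jpg"), k=4)
```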
  • In comparing all of the solution approaches, I used the following constraints:
    • Low computational expense and latency to meet user requirements
    • The camera is going to be fixed and the shelf is of a fixed size and distance from the camera
      • Results in images of fixed aspect ratio
      • Results in items having a constrained placement on shelf rows
    • We constrained the size of items for the scope of this project
  • Therefore, I decided that, given the constraints and scope of the project, an implementation of sliding window would be the most feasible design solution
  • Completed a functional sliding-window matching algorithm (a minimal sketch follows this list)
  • Put together the shelf we bought
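For illustration, a minimal sketch of the sliding-window scan; the window size and step are hypothetical, since they depend on the fixed camera distance discussed under "What Next":

```python
def sliding_windows(image, win_w=120, win_h=160, step=40):
    """Yield (x, y, crop) for each window position across the shelf image."""
    h, w = image.shape[:2]
    for y in range(0, h - win_h + 1, step):
        for x in range(0, w - win_w + 1, step):
            yield x, y, image[y:y + win_h, x:x + win_w]

# Each crop is then fed to the SIFT matcher against the item templates, and
# the (x, y) of any window that clears the match threshold is recorded.
```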


In terms of schedule, I think I am around where I expected to be. I would like to move faster this next week, though, to give some more leeway for hiccups as we get our project to something worthy of a decent final demo.


What Next

  • Set up the webcam and shelf with the following specifications to begin live testing, such that the frame is composed of only the shelf; this helps speed and accuracy since there would be minimal sliding-window matching outside the shelf area
    • Distance from shelf to webcam:
    • Orientation of webcam:
      • Using SIFT matching, which is supposed to be rotation invariant, but the algorithm rotates the test image for the sake of user visualization while testing
    • Height from floor of webcam:
      • Centered on the shelf to allow for the most forward-facing camera angle of the complete shelf
    • Settle on a fixed aspect ratio for the original captured image and adjust the sliding window accordingly

Lucky’s Status Report for April 10, 2022

This week I did the following tasks:

  • Presented interim demo
  • Considered solution approaches for image processing
    • Researched segmentation for image processing
      • Researched sliding window segmentation
    • Began implementation design of sliding window segmentation
      • This includes testing considerations
        • The test images must be of equal size to the image that will be taken of the aisle shelf from the fixed distance; otherwise the sliding window will be either too small or too large
  • Ordered shelf to begin building for integrated testing


In terms of schedule, I think I am on pace, as long as the segmentation code I am in the process of producing is finished by the end of this upcoming Monday's lab session.


What Next

  • Pick up shelf and set up camera at fixed distance that fits the shelf
    • Determine the image dimensions
    • Formulate the constraints for the sliding window
  • Complete sliding window segmentation and test
  • Proceed to integration

Lucky’s Status Report for April 2, 2022

This week I did the following tasks:

  • Completed a testable phase 2 object detection subsystem 
  • Ran Tests with SIFT
    • Lowe’s Ratio Test (all else held constant)
      • I tweaked Lowe's ratio in increments of 5% from 0% to 100%
    • Match Threshold Test (all else held constant)
      • I tweaked the threshold value in increments of 10 from 0 to 200
    • Combination test (tweaked Lowe's ratio and the match threshold together to look for the best-accuracy parameters; a sketch of the sweep follows this list)
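A minimal sketch of how such a sweep can be structured; accuracy_for(ratio, threshold) is a hypothetical helper that would run the SIFT matcher over the labeled test images and return an accuracy score:

```python
import numpy as np

def accuracy_for(ratio, threshold):
    """Hypothetical: run SIFT matching on the test set with these parameters."""
    return 0.0   # placeholder; would return measured accuracy

# Sweep Lowe's ratio in 5% increments and the match threshold in steps of 10,
# then keep the best-scoring combination.
best = max(
    ((r, t, accuracy_for(r, t))
     for r in np.arange(0.0, 1.01, 0.05)
     for t in range(0, 201, 10)),
    key=lambda result: result[2],
)
print("best (ratio, threshold, accuracy):", best)
```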


In terms of schedule, I had to miss the in-person meeting on Wednesday this week, but I worked as soon as I could that day to make up for the missed time. In addition, I was able to make more progress on object detection and ran some tests.


What Next

  • Prepare to present for the initial interim demo
  • Detector selection process
    • Compare the detectors with one another and select the detector with the best combination scores that also falls within the user latency requirements
  • Phase 3 object detection
    • Integrate my subcomponent with the others' and the camera module

Lucky’s Status Report for March 26, 2022

This week I did the following tasks:

  • I continued to work on the detection script
    • I had to reconsider and adjust the code I have been working on, given the concerns explained in the following section, “Things I ran into worth considering”
  • Raised my concerns and spoke to my team about the design choices


Things I ran into worth considering:

  • User item customization
    • Cons:
      • Can reduce accuracy of system
      • Can institute ethical issues
    • Pros:
      • Enables users to customize to their store needs
      • Faster system that does not go through a large set of products that aren’t even available in store
      • A system that has all of the store’s available products
    • Possible solutions:
      • Allow store owners to capture new images from their live feed i.e. add new items to dataset from live feed
        • Cons:
          • Can add ethical issues
            • Misuse: unsolicited images taken of customers, i.e. invasive / private images
            • If images are added globally
              • Need additional infrastructure to protect against things such as predatory images that intentionally mismatch, inappropriate images, etc.
          • Requires a separate local data storage abstraction / subsystem if images are not uploaded globally
          • Will most likely add complexity that could be avoided with different design choice
          • Can reduce accuracy of system
      • Have a universal set of items that we have tested and pre-uploaded, with data on our side controlled by us, then allow local systems to select from those pre-validated items
        • Cons:
          • We may not have all items that a store needs
          • We will have a large data need that may include more items than users even use, i.e. what if our customer base is mostly produce, but we have numerous items in home improvement => inefficient data storage
          • We have to add the necessary back-end infrastructure and integration with the detection system to handle that => may be something to consider after MVP
        • Pros:
          • No security concerns of uploaded items / images from users
          • No accuracy concerns of uploaded items / images from users
          • No ethical concerns of uploaded items / images from users


In terms of schedule, depending on how testing goes upon completion of this second phase of object detection, I think I should be fine. If testing goes unfavorably, I will have to step up my pacing.


What’s Next:

  • Now that we have solidified an approach for item customization, I am going to move forward with those design choices in mind and finish the object detection version I am working on.

Lucky’s Status Report for March 19, 2022

This week I did the following tasks:

  • I finished producing a functional SIFT matching script that detected matches and produced results between some test images
  • I ported all of the object detection code I’ve been working on to a version control system to prep for collaboration and integration
  • I began a new python file/script to run continuously on a system in order to prepare something that can function in production
    • I cleaned up the code, and modularized sections for readability and testing of subcomponents
      • This included modularizing the SIFT detection portion so we can more easily swap the detection algorithm for other algorithms when benchmarking the best one for our use case
      • I organized global variables to make the threshold / customizable portions of the code easily accessible for benchmarking and optimization (a sketch of this structure follows this list)
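A minimal sketch of that modular structure; the factory dict and default parameter values are illustrative assumptions, not our actual settings (SIFT may require opencv-contrib-python depending on the OpenCV version):

```python
import cv2

# Swappable detector factory for benchmarking different algorithms
DETECTORS = {
    "sift": cv2.SIFT_create,
    "brisk": cv2.BRISK_create,
    "orb": cv2.ORB_create,
}

# Tweakable globals gathered in one place for benchmarking / optimization
LOWE_RATIO = 0.75       # assumed default, not our tuned value
MATCH_THRESHOLD = 50    # assumed default, not our tuned value

def make_detector(name="sift"):
    """Return a fresh feature detector; swap `name` to benchmark another."""
    return DETECTORS[name]()
```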


In terms of schedule, it felt like I finally began catching up on the slack I had built up in the weeks prior. I now have a clearer trajectory of what I want to get done soon.


What Next

  • I want to finalize the transition to an almost production ready subsystem of the object detection algorithm for our MVP
    • This entails finishing a functional main script that runs the detection and matching continuously with prompts from the terminal
      • The terminal prompts will serve as a trigger mechanism to simulate the trigger from the motion detection subsystem
      • There is also a capture simulation to simulate capturing new images from the shelf camera for detection / matching (both are sketched below)
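A minimal sketch of that main loop, with hypothetical file names and a stubbed detection entry point:

```python
import cv2

def run_detection(frame):
    pass   # placeholder for the detection / matching library

def main():
    while True:
        cmd = input("Enter = simulate motion trigger, q = quit: ")
        if cmd.strip().lower() == "q":
            break
        # Capture simulation: a stored image stands in for a live shelf capture
        frame = cv2.imread("simulated_capture.jpg")
        run_detection(frame)

if __name__ == "__main__":
    main()
```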

Lucky’s Status Report for February 26, 2022

This week I did the following tasks:

  • I finished a BRISK algorithm version
    • Last week I ran into an error using SURF and SIFT in which OpenCV kept giving runtime errors stating that they were patent-protected / copyrighted, so I proceeded with BRISK
    • Finished the BRISK implementation, which extracted features from images of cereal and compared them to images containing cereal
      • I tried tweaking the filtering algorithm I used (Lowe's ratio test) and some other parameters to see if I could detect one cereal box from a group of cereal boxes, but the process did not seem productive (a matching sketch appears after this list)
        • For now, I plan to instead have the system detect whether cereal is present rather than, for example, Honey Nut Cheerios vs. regular Cheerios, and then branch to more detailed filtering in the future if possible, since it is not part of our MVP
  • Attended the presentation, asked questions and took notes
    • One important takeaway was that another group actually was able to use SIFT
      • We had previously pivoted from SIFT to BRISK because when I tried to use SIFT, OpenCV kept giving me an error stating that SIFT is patented / copyrighted software, so I could not use it; however, the other team said they were able to use it
    • I asked for their requirements file because I suspect the patent issue can be avoided by using older versions of OpenCV and/or Python
      • I began attempting to develop a similar detection algorithm with SIFT as I did with BRISK
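For reference, a minimal sketch of BRISK feature matching with Lowe's ratio test as described above; the file names, ratio value, and presence threshold are illustrative assumptions:

```python
import cv2

template = cv2.imread("cereal_template.jpg", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("shelf_scene.jpg", cv2.IMREAD_GRAYSCALE)

brisk = cv2.BRISK_create()
kp1, des1 = brisk.detectAndCompute(template, None)
kp2, des2 = brisk.detectAndCompute(scene, None)

# BRISK descriptors are binary, so match with Hamming distance
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(des1, des2, k=2)

# Lowe's ratio test: keep a match only if it clearly beats the runner-up
good = [pair[0] for pair in matches
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]

present = len(good) > 30   # assumed presence threshold
print("cereal present:", present)
```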


In terms of schedule, it felt like I did more work than weeks prior, but that could be because I began shifting from research to implementation. As always, I have room to improve on my time management, but I do not feel any more behind than last week.


The main way I intend to catch up and do better this upcoming week is the following:

  • Finish a SIFT detection implementation comparable to what I did with BRISK
  • Begin designing and implementing a process to run the detection over a pool of images, simulating the algorithm looking for multiple items rather than fixed test items / images

Lucky’s Status Report for February 19, 2022

This week I did the following tasks:

  • Attended the mandatory meeting, asked questions and took notes
    • One important takeaway was understanding the nature of the course as one where we build a recipe, testing which ingredients work best where, rather than producing all the raw ingredients
    • Another takeaway was to consider the scarcity of Raspberry Pis and to begin thinking of alternatives
  • Began some actual code for the detection algorithms
    • Created a Jupyter notebook on Google Colab to test SIFT, SURF, BRISK, and color detection
    • Wrote some code and read through various docs
    • Learned about matching and feature extraction processes
      • Introduced to FLANN and Brute Force
    • Learned more about ways to speed up detection
      • In addition to grayscale, learned about converting to HSV
        • This isolates one channel (H, hue) for the colors; I'm thinking this could speed up color detection (sketched below)
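A minimal sketch of that idea; the hue bounds are made-up values for one item's color band:

```python
import cv2

frame = cv2.imread("shelf_scene.jpg")            # hypothetical test image
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)     # OpenCV loads images as BGR
hue = hsv[:, :, 0]                               # single H channel (0-179 in OpenCV)

# Color detection can then threshold a narrow hue band for an item's color,
# e.g. an assumed orange-ish band:
mask = cv2.inRange(hsv, (5, 100, 100), (20, 255, 255))
```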


In terms of schedule, I did some more work, but not as much as I anticipated, with a couple of hiccups like finding that SIFT / SURF are patented algorithms. I also need to improve my time management, if I am honest.


The main way I intend to catch up and do better this upcoming week is the following:

  • Continue testing the open source options such as BRISK
  • See if we can find a workaround, i.e. use older versions of OpenCV / Python, to implement SIFT
  • Break my schedule down to a more micro scale for the week, with smaller milestones, to manage my time better and avoid going down rabbit trails of non-essential research / information


This next week, I hope to complete a testable version of BRISK object detection for various grocery items. I also hope to begin shifting to interconnectivity requirements (i.e. Raspberry Pi, camera, computer, cloud) and the location of processing as we become more familiar with and have tested some algorithms.

Lucky’s Status Report for February 12, 2022

This week I did the following tasks:

  • Presented project proposal
  • Continued research on the SIFT algorithm to get a better understanding of how to incorporate it into the computer vision component of our project
  • Researched the differences between SIFT, SURF, and ORB to gain a better understanding of the alternatives and ensure we are pursuing the best algorithm to meet our proposed user requirements
  • Did some preliminary research on background subtraction, referenced by the professor, to consider it for our detection of motion in an aisle, because I had not heard of it before
  • Looked into some comparisons of popular methods to track and count people, e.g. HOG vs. YOLO
  • Did some preliminary research on which database we would use
    • Compared and contrasted SQL and NoSQL
    • Looked into popular databases used for image classification applications, e.g. Apache Cassandra, MLDB, MySQL, and MongoDB
  • Reconsidered where we would perform the image processing, on the cloud or on the local machine, and began looking into the pros and cons of both


In terms of schedule, I feel slightly behind, if I am honest, due to uncertainty about which steps to take moving forward. I feel as though I am in a state of choice overload for technologies to choose from.


The main way I intend to overcome this hiccup is to go over the options with my teammates in our first meeting this upcoming week, to make a final decision on what we will attempt first before moving on to testing and implementation.


This next week, I hope to perform the next steps as outlined in the team status report's next tasks. In terms of individual deliverables, I hope to establish the mechanism we will use to communicate between the different CV components and to design an updated model of our system with those decisions in mind.