Brian’s Status Report for 4/27

Accomplishments 

This week, I went to Salem’s with Simon and Shubhi early in the week to get approval to test our full system, but we weren’t able to do so then because of availability issues with the IT lead. Simon and I went back later in the week, got approval to test, and collected some data for testing our fullness calculation and line counting modules. I also went to Giant Eagle this week and took pictures of a few different cart configurations (mostly very full carts), like so:

Because the surroundings introduce noise, I cropped the images to the bounding boxes displayed when running the shopping cart model alone and tested relative fullness on them, but I found that extremely full carts are, for some reason, hard to detect with the current implementation. I will need to refine the implementation over the next few days to hopefully improve accuracy, but the module seems to work fine with carts that aren’t as full. Testing the relative fullness module on the footage we got from Salem’s gives very similar results: carts with much less in them are counted far more accurately than carts that are nearly full. I tried decreasing the amount of Gaussian blur to increase accuracy, and while the returned fullness values are somewhat closer, they are still quite inaccurate (returning 65% fullness for the above picture on the left), which is concerning.

I also worked on testing the upload and download speeds for videos of different lengths, and I’ve created graphs of the relationship between video length and upload/download time. My internet is a bit spotty, so I am not sure these measurements are a completely accurate depiction of that relationship. Here is my code along with the resulting data:

Duration      Upload time (avg. of 10 trials)   Download time (avg. of 10 trials)
1 second      0.61 s                            0.32 s
2 seconds     0.92 s                            0.47 s
3 seconds     1.35 s                            0.53 s
4 seconds     1.52 s                            0.55 s
5 seconds     1.61 s                            0.57 s
6 seconds     1.65 s                            0.59 s
7 seconds     1.80 s                            0.68 s
8 seconds     1.92 s                            0.72 s
9 seconds     2.21 s                            0.76 s
10 seconds    2.3 s                             0.88 s
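A minimal sketch of the timing harness behind these numbers (the bucket name and clip file names are placeholders, and it assumes boto3 credentials are already configured on the laptop):

import time
import boto3

BUCKET = "checkout-footage-test"   # placeholder bucket name
s3 = boto3.client("s3")

def time_transfer(path, key, trials=10):
    """Average S3 upload and download times for one clip over several trials."""
    up, down = 0.0, 0.0
    for _ in range(trials):
        start = time.time()
        s3.upload_file(path, BUCKET, key)
        up += time.time() - start

        start = time.time()
        s3.download_file(BUCKET, key, "downloaded_" + key)
        down += time.time() - start
    return up / trials, down / trials

# Clips of 1 s through 10 s, named clip_1s.mp4 ... clip_10s.mp4 (hypothetical names).
for secs in range(1, 11):
    up_avg, down_avg = time_transfer(f"clip_{secs}s.mp4", f"clip_{secs}s.mp4")
    print(f"{secs:2d} s clip: upload {up_avg:.2f} s, download {down_avg:.2f} s")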

Progress 

In terms of progress, we are still behind schedule by quite a bit. We should be testing the fully integrated system by now, but we ran into a few roadblocks with testing at Salem’s because some employees were uncomfortable with the cameras. However, since we have figured out workarounds to these issues, we should be able to finish testing by tomorrow, or Monday in the worst case. I will work on trying to increase fullness calculation accuracy over the next day or two while we continue to gather testing data.



Brian’s Status Report for 4/20

Accomplishments

This week, I went to Salem’s multiple times to try to get testing data. On Monday, Simon and I went twice (the second time with Shubhi as well) to get WiFi on the RPis up and running and to debug an error with S3 bucket downloading/uploading. We ran into an issue in the morning: the public WiFi presents an “OK” prompt, which effectively blocks connections on a Raspberry Pi. However, Salem’s was kind enough to give us access to their private network, so we were able to set up WiFi on all of the Raspberry Pi 4s. In the downtime between our trips to Salem’s, I worked with Simon to make the detection of the number of people in a line more robust so that it at least works somewhat. All three of us then went on Thursday to set up our full system and let it run, but we ran into issues with debugging (since we needed to copy code to all of the RPis and then change minute details so each one uploads different data) and with fixing the camera angles, since the RPi camera modules have a very short connector to the RPi itself. Today we went again, but we weren’t able to collect data from the full system running because a few of the employees there today were uncomfortable with being recorded. Instead, we took pictures of people’s shopping carts and used an empty checkout lane to record videos for testing throughput calculations. I also started working on the final presentation slides, since that’s coming up.
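As a rough sketch of the line-counting idea (not our exact implementation), the simplest version counts YOLOv8 “person” detections whose box centers fall inside a fixed lane region; the lane coordinates, weights file, and clip name below are placeholders rather than our actual calibration:

import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")        # pretrained COCO weights; class 0 is "person"
LANE = (200, 100, 600, 700)       # x1, y1, x2, y2 of the checkout-lane region (made up)

def count_people_in_lane(frame):
    """Count person detections whose box centers fall inside the lane region."""
    result = model(frame, classes=[0], verbose=False)[0]
    count = 0
    for x1, y1, x2, y2 in result.boxes.xyxy.tolist():
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        if LANE[0] <= cx <= LANE[2] and LANE[1] <= cy <= LANE[3]:
            count += 1
    return count

cap = cv2.VideoCapture("salems_line_test.mp4")   # hypothetical test clip
while True:
    ok, frame = cap.read()
    if not ok:
        break
    print("people in line:", count_people_in_lane(frame))
cap.release()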

Progress

In terms of progress, I am now a few days behind the day-to-day schedule that I created, mostly because setting up the integrated system at Salem’s had so many setbacks preventing us from running the full system for data collection. To catch up this upcoming week, we need to test our integrated system completely. I’ll help finalize individual component testing, and after that we can try to go on a day when the employees at Salem’s won’t be uncomfortable with their hands being recorded for throughput.

New Tools

In terms of new tools and technologies that I learned over the course of this semester, I was quite unfamiliar with most of the technologies used in our project: I didn’t know how to use YOLO, I had never used AWS S3 buckets, and I hadn’t done much programming with Raspberry Pis prior to this semester. I learned these tools mainly by browsing their documentation, and checking various forum posts while debugging was vital to figuring out how to implement different parts of this project. Sometimes the forum posts led to pitfalls (e.g., with the Raspberry Pis, one of the features for adding additional WiFi connections was only supported on a legacy OS, so I used up a lot of time trying to implement the wrong solution).

Brian’s Status Report for 4/6

Accomplishments 

For this week, prior to Wednesday, I spent a lot of time integrating the software modules we have written in order to prepare for our interim demo. I was able to make the main system run for a single counter using test videos from a trip to Salem’s, but I ran into an issue where OpenCV would not display any of the model predictions (with images) due to an environment issue. I was not able to fix this before the interim demo, but I looked into it afterward, and it seems that cv2.imshow, which is used to display images in the backend, is not thread-safe; running two separate models for cart counting and item counting therefore meant we could not see visually what was going on. In the meantime, I worked on setting up a Raspberry Pi, which I was able to do, and on our chosen method of uploading and fetching image/video data from the cameras. We are using an S3 bucket to store real-time footage captured from our RPis, then fetching that data and using it to compute results on our laptops. I set that up and have code that works for uploading generic text and Python files, but I haven’t been able to get videos or images recorded from the RPi camera module to upload, because my program errors out when I import the Raspberry Pi camera module package.
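One possible workaround for the display issue (not something we have implemented yet) is to keep the two models in worker threads but funnel their annotated frames through a queue so that cv2.imshow is only ever called from the main thread. A sketch of that idea, with hypothetical weights and clip names:

import queue
import threading
import cv2
from ultralytics import YOLO

display_queue = queue.Queue()

def run_model(window_name, video_path, weights):
    """Worker: run a YOLOv8 model on a clip and hand annotated frames to the main thread."""
    model = YOLO(weights)
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        annotated = model(frame, verbose=False)[0].plot()  # draw predicted boxes
        display_queue.put((window_name, annotated))
    cap.release()
    display_queue.put((window_name, None))  # signal that this worker is done

workers = [
    threading.Thread(target=run_model, args=("carts", "cart_test.mp4", "cart_model.pt")),
    threading.Thread(target=run_model, args=("items", "item_test.mp4", "item_model.pt")),
]
for w in workers:
    w.start()

finished = 0
while finished < len(workers):
    window_name, frame = display_queue.get()
    if frame is None:
        finished += 1
        continue
    cv2.imshow(window_name, frame)  # only the main thread touches the GUI
    cv2.waitKey(1)

cv2.destroyAllWindows()
for w in workers:
    w.join()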

Here is my code and the S3 bucket after running the code a few times on different files: 

 

I also wrote some code for downloading the files from the S3 bucket onto a laptop, which I haven’t tested yet:
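As a rough sketch of both directions (assuming boto3 with credentials already configured; the bucket and file names are placeholders, and the download half mirrors the untested code mentioned above):

import boto3

BUCKET = "checkout-footage"          # placeholder bucket name
s3 = boto3.client("s3")

# Upload side: this is the part that currently works for generic text/Python files.
s3.upload_file("notes.txt", BUCKET, "notes.txt")

# Download side (untested): pull every object in the bucket onto the laptop.
for obj in s3.list_objects_v2(Bucket=BUCKET).get("Contents", []):
    s3.download_file(BUCKET, obj["Key"], obj["Key"])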

Progress 

We have a new day-by-day schedule with many tasks so we can finish our project in time. I am mostly keeping up with it for now and hope to soon finish the code for uploading real-time footage to the S3 bucket (and fetching it). Today and tomorrow I will need to expand the current code so that it continuously collects real-time footage instead of recording from the camera module for only 3 seconds, and so that it also handles USB camera footage.

Testing 

In terms of testing the throughput calculations, I plan to use video footage: measure the time between the cashier beginning and finishing checking out a batch of items, manually calculate the throughput by dividing the number of items in that batch by the time elapsed from start to finish, and then look at the difference between that manual value and the throughput reported by the module.
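A toy example of that manual check (all numbers made up):

# Ground truth from watching the clip: 12 items checked out between t = 5 s and t = 53 s.
items_in_batch = 12
start_time, end_time = 5.0, 53.0
manual_throughput = items_in_batch / (end_time - start_time)   # = 0.25 items/second

module_throughput = 0.22   # hypothetical value reported by the throughput module
percent_error = abs(module_throughput - manual_throughput) / manual_throughput * 100
print(f"manual: {manual_throughput:.2f} items/s, module: {module_throughput:.2f} items/s, "
      f"error: {percent_error:.1f}%")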

Brian’s Status Report for 3/30

Accomplishments 

My model finished training, and this past Sunday Simon and I went to Giant Eagle to film some videos for testing purposes. However, running my throughput code on the video made me realize that the model performs very poorly on live video, not recognizing even a single item. I then trained on another dataset for 100 epochs, and the results are below:

  

Even after training on the second dataset, the accuracy on live video is quite poor. Giant Eagle also has advertisements and a printed design on their conveyor belts, which could have caused the accuracy to drop. The image below shows only one thing being recognized as an object, which is far too inaccurate for what we want to do.

Therefore, I started working on edge detection to see if there was a way to calculate throughput and count items that wasn’t so inaccurate. Below are my results from trying to get rid of noise using Gaussian blur and Canny edge detection. Words on the belt still show up in the edge map, which makes it seem as though it will be very difficult to continue with this approach as well.
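A minimal sketch of that denoise-then-edge-detect experiment (the blur kernel size and Canny thresholds below are guesses rather than tuned values):

import cv2

frame = cv2.imread("conveyor_frame.jpg")       # hypothetical still from the Giant Eagle clip
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)    # smooth out belt texture and printed ads
edges = cv2.Canny(blurred, 50, 150)            # low/high hysteresis thresholds
cv2.imwrite("conveyor_edges.jpg", edges)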

In terms of interim demo progress, I will pick up an RPi and cameras as they become available and work on getting the RPi running with two cameras.

Progress

I am still a bit behind schedule. I will try tomorrow to get the integration on the software side done so that we can have something to demo. As for throughput, if I can’t make significant progress by training for more epochs or with edge detection, I will need to pivot to adding pictures of items on conveyor belts at Salem’s Market to the training dataset.



Brian’s Status Report for 3/23

Accomplishments 

This week, I managed to finish training my model after some delay (a 5000-image dataset quickly got me booted off Google Colab for hitting GPU usage limits). I trained it for 100 epochs total, but I had to split this into two 50-epoch sessions because of those limits: I trained for 50 epochs and then trained the resulting model for another 50 epochs. The resulting model seems to have serviceable precision, reaching about 75%-80%, and the validation batches seem to detect grocery items relatively well.
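Roughly how the two 50-epoch sessions can be chained together (the dataset YAML name and output path below are placeholders using the Ultralytics defaults, and may differ from my actual Colab paths):

from ultralytics import YOLO

# Session 1: start from pretrained weights and train for the first 50 epochs.
model = YOLO("yolov8n.pt")
model.train(data="freiburg_groceries.yaml", epochs=50, imgsz=640)

# Session 2 (after the Colab GPU limit resets): reload the best weights from
# session 1 and train for another 50 epochs.
model = YOLO("runs/detect/train/weights/best.pt")
model.train(data="freiburg_groceries.yaml", epochs=50, imgsz=640)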

With the newly trained model plugged in, the implementation of the throughput calculation module is now complete. Simon and I will go to a grocery store tomorrow (Giant Eagle or Aldi), take a video of items moving on a conveyor belt, and test the throughput module.

Progress

I am still behind schedule, but I am currently working with Simon to fully flesh out the line detection module. This next week, we will try to implement this fully and test it for the interim demo, and start helping out with the relative fullness module since that is very important. 



Brian’s Status Report for 3/16

Accomplishments

This week, I didn’t complete as much as I had hoped. I wanted to finish the implementation and testing of the throughput calculating module, but I spent a lot of time this week trying to use YOLOv3 to accomplish it. Much of my time was spent figuring out how to train a model on the Freiburg grocery store dataset available online (http://aisdatasets.informatik.uni-freiburg.de/freiburg_groceries_dataset/), which has 25 different classes of objects typically found in grocery stores. However, a significant issue that I dealt with up until Thursday/Friday was that on Google Colab I was getting many errors trying to run Darknet (the backend for YOLOv3), which I figured out was due to the CUDA and OpenCV versions on Google Colab differing from what Darknet expects. These errors stem mainly from YOLOv3’s age as a legacy version of the framework, and my attempts to fix them cost a large amount of time. I finally decided to pivot to YOLOv8, which was much less time-consuming and allowed me to make some progress.

Currently, I have written a rudimentary algorithm for determining the throughput of a cashier: the software module takes in a frame every ⅕ of a second, checks how many new items appear on the left side of the frame (where the bagging area is), and adds those processed items to a running total that is divided by the time elapsed. Pictures of my current code are below. Since the model is pretrained, it doesn’t recognize grocery-store items, so I tested whether the code would calculate throughput from a video stream at all using generic objects (cell phone, wallet, computer mice, chapstick):

Throughput is being calculated and updated 5 times a second, but I will need to experiment to figure out whether this is an optimal update rate. A huge benefit I noticed from pivoting to YOLOv8 is that my code is much more readable and straightforward. My Google Colab GPU access was restricted for part of the week due to overuse while I was figuring out the YOLOv3 issues, so I am only starting to train on the dataset as of tonight (3/16); I will check back on it every few hours to see when it is finished.

Progress

I am still behind schedule, but tomorrow and Monday I will be working on testing the code after the model is done training, and after that I’ll be able to help Shubhi and Simon with their tasks, as relative fullness and line detection are the most important parts of our project. 

 

YOLOv8 code: 
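A minimal sketch of the approach described above (not the exact code): sample a frame roughly every 0.2 seconds, count detections whose box centers fall in the left “bagging area” strip, and treat any increase as newly processed items. The camera index, model weights, and the 30% boundary are placeholders:

import time
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")         # pretrained model for now; grocery model later
BAGGING_FRACTION = 0.3             # left 30% of the frame treated as the bagging area (a guess)

cap = cv2.VideoCapture(0)          # webcam test with generic objects
start = time.time()
processed_total = 0
prev_in_bagging = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    boundary = frame.shape[1] * BAGGING_FRACTION
    boxes = model(frame, verbose=False)[0].boxes.xyxy.tolist()
    in_bagging = sum(1 for x1, y1, x2, y2 in boxes if (x1 + x2) / 2 < boundary)

    # Any increase over the previous sample is counted as newly processed items.
    processed_total += max(0, in_bagging - prev_in_bagging)
    prev_in_bagging = in_bagging

    elapsed = time.time() - start
    print(f"throughput: {processed_total / elapsed:.2f} items/s")
    time.sleep(0.2)                # sample roughly five times per second
cap.release()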

Outdated YOLOv3 snippet for reference: 

 

Brian’s Status Report for 3/9

Accomplishments

The week before spring break, I mainly focused on the design report because we ended up making some major changes to our design. Dropping the FPGA changed many things about our proposed solution, so I updated our Gantt chart to reflect the new tasks. During the week of spring break, I didn’t make much progress on a working implementation of the throughput algorithm, but we decided to change our programming language of choice from C++ to Python, since the latter has much less development overhead and simpler syntax. The main issue I ran into while trying to finish my C++ implementation was figuring out when to start and end the throughput calculation: if I count the number of items in the bagging area and average how many arrive there per second, it is difficult to accurately determine when to start and when to stop calculating. I have yet to figure this out.

 

Progress

I am still a bit behind schedule, as I have not yet fully implemented a working algorithm for detecting when someone picks up and puts down an item. I am currently converting my current (not working) algorithm to Python so I can test it more easily. This week I hope to work with Simon, if necessary, to get this algorithm working, and simultaneously work with Shubhi on her edge detection algorithm, as it is the most important part of our proposed solution.



Brian’s Status Report for 2/24

Accomplishments

This week, I mostly worked on preparing for the design presentation that happened this Wednesday, since I was the presenter this time around. After that, I got some rudimentary work done on the software side of our project, getting a branch set up on our team’s GitHub repository. I am now attempting to detect when a hand is in contact with an item based on webcam footage, but I haven’t found a working solution yet. I am currently using YOLO for object detection because I realized it might be more promising than trying to figure out when a hand is grabbing an item with a custom algorithm. I’ve also read a little of the HPS OpenCL guide to make myself more familiar with the framework, as I will be working with Simon after break on the FPGA side of the project, but I am mainly focusing on software for now.
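One idea I am considering (not a working solution yet) is to flag “contact” whenever a detected hand/person bounding box overlaps an item’s bounding box. A tiny sketch of that overlap check, with made-up box coordinates:

def boxes_overlap(a, b):
    """a and b are (x1, y1, x2, y2) boxes; return True if they intersect."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

hand_box = (120, 80, 260, 220)     # hypothetical detection of a hand/person
item_box = (240, 150, 400, 300)    # hypothetical detection of a grocery item
print("contact" if boxes_overlap(hand_box, item_box) else "no contact")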

 

Progress

I am still a bit behind schedule, as I have not yet fully implemented a working algorithm for detecting when someone picks up and puts down an item. I will very likely work on the software implementation during spring break should I still be behind next week, as I want to make sure we can integrate our system afterward with no delays. To catch up next week, I will read more of the YOLO/OpenCV documentation to see if there are things I can try to get hand contact with items recognized by the algorithm, and attempt to implement those solutions. I’ve already searched a bit for what types of algorithms I could use but haven’t found an idea that seems truly feasible or efficient yet.



Brian’s Status Report for 2/17

Accomplishments

This week, I mostly worked on the design presentation slides with Simon, since I am the presenter for next week. Together we created block diagrams for our system’s algorithmic design and a rough diagram of the physical layout. I also picked up the DE10-Standard FPGA that we will be using for hardware acceleration and looked into videos and documentation on how to work with the board. I also modified our Gantt chart after discussion with my teammates, because there is now a dependency in our schedule. I am currently working through some open-source OpenCL code to better understand how the framework works so that I can help Simon convert OpenCV Python code to C/C++ in later weeks for use with OpenCL.

 

Progress 

The DE10-Standard has a lot of features, and because of this, I am still quite unfamiliar with how to use it fully. In terms of CV algorithm implementation, I am also slightly behind schedule because of our redefinition of scope/design after our weekly meeting. To combat this, I hope to get through the open-source OpenCL code within the next few days, talk with Shubhi for more clarification on the software side of the project, and make progress on implementing a CV algorithm for tracking a cashier checking out items.



Brian’s Status Report for 2/10

Accomplishments

Simon and I researched FPGA hardware acceleration and briefly discussed our findings in order to better understand how it works, and we also deliberated on which FPGA would be good to use for our project. While we saw multiple articles online where Xilinx FPGAs are used for accelerating ML algorithms, we concluded that the exact type of FPGA doesn’t really matter, and we decided we could use an Intel Altera board (currently planning on the DE10-Standard; we can order the DE10-Lite if the DE10-Standard is taken by another team). We also found that it would be better to use a Cyclone V FPGA rather than a Cyclone IV FPGA like the DE0-Nano, because Cyclone V boards support OpenCL, a framework we are heavily considering for hardware acceleration.

 

Furthermore, I came across an article that seems very relevant to our project (https://www.researchgate.net/publication/338481306_Hardware_Acceleration_of_Computer_Vision_and_Deep_Learning_Algorithms_on_the_Edge_using_OpenCL). The article implements hardware acceleration for computer vision algorithms using OpenCL, a high-level synthesis framework that converts kernel code to RTL. Even more relevant, this was all done on a Cyclone V Altera board, which is the family we are planning to use by requesting a DE10-Standard from the project inventory. I also edited the Gantt chart to better reflect our workflow during spring break.

 

Project Schedule and Progress

Fortunately, my progress is on schedule, and during Monday’s mandatory lab meeting, Simon and I will discuss our findings in more detail and start working on the design of our RTL/kernel code implementation, if time allows. This portion of the project is extremely important because our approach for hardware acceleration is vital for our final product. 

 

Goals for the Next Week

In the next week, Simon and I hope to finish designing our RTL/kernel implementation and to get an FPGA. I also hope to complete a significant portion of the cashier time detection algorithm in that time, and we hope to get the cameras delivered so we can test our CV algorithms going forward.