Simon’s Status Report for 4/27

Accomplishments

I presented this week. I also went to Salem’s three separate times to try to get footage to test with, but was only successful on the third visit (see Brian’s status report for more details). On my first visit, I discussed alternative camera angles for our cashier throughput camera with the manager and the cashiers (so the trip wasn’t wasted), and I realized that I would need to train another model for the new angle (this time a generalized one, using the EgoHands dataset). That model seems to perform rather well, but we need to confirm tomorrow with further testing at Salem’s.

On the third visit, I got some data to replace the footage lost last week, and I realized that my shopping cart detection model was also failing on the new camera angle for the small shopping carts (which was surprising, because it worked from both overhead and side views; I suspect there was insufficient training data for angles in between the two). I annotated our new cart images, added them to the old dataset, and retrained the model, which now performs somewhat better. However, I’m not confident that it predicts carts with high enough confidence (around 0.5), so we will need to test this as well to see if it ends up being reliable enough. If not, I will probably try to get more footage and simply add more data, at the risk of overfitting the model to Salem’s.
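As a quick sanity check on those ~0.5-confidence cart detections, something like the following sketch (the weights path and test clip name are placeholders) would let us compare detection counts at a few confidence thresholds before tomorrow’s testing:

```python
from ultralytics import YOLO

# Hypothetical paths: wherever the retrained cart weights and the new
# Salem's test footage actually live.
model = YOLO("runs/detect/cart_retrain/weights/best.pt")

# Compare how many carts get detected as the confidence cutoff drops below
# the ~0.5 scores we observed, to see whether lowering it helps or adds noise.
for conf in (0.3, 0.4, 0.5):
    results = model.predict(source="salems_cart_test.mp4", conf=conf, stream=True)
    n_boxes = sum(len(r.boxes) for r in results)
    print(f"conf={conf}: {n_boxes} cart detections across the clip")
```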

Progress

Obviously, we are still very behind. We need to go to Salem’s tomorrow and finalize testing by Monday so that we can prepare for the final demo. As such, I haven’t bothered with the pose estimation for line detection that I mentioned last week, since failing to detect shopping carts/throughput is a much more pressing issue. Also, we need to connect the RPis to CMU Wi-Fi for the final demo (which is awkward with CMU-DEVICE), but we can just use a phone hotspot instead, so this isn’t much of a concern.

Simon’s Status Report for 4/20

Accomplishments

In the past two weeks, Brian and I worked together on most of the tasks. To start, we got uploading from our Raspberry Pis to S3 working, and then downloading as well. After that, we came up with a very simple synchronization method: uploading a .txt marker file to indicate that an upload was complete, and then downloading that .txt file frequently to check whether an upload had occurred (since the file is tiny, polling it costs essentially nothing).
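A minimal sketch of that scheme with boto3 (the bucket and key names are placeholders, and credentials are assumed to come from the default AWS configuration) looks roughly like this:

```python
import time
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
BUCKET = "grocery-footage"          # hypothetical bucket name
MARKER = "uploads/latest_done.txt"  # tiny marker object, cheap to poll

def upload_clip(local_path: str, key: str) -> None:
    """Upload the video, then write the marker to signal completion."""
    s3.upload_file(local_path, BUCKET, key)
    s3.put_object(Bucket=BUCKET, Key=MARKER, Body=key.encode())

def wait_for_clip(poll_seconds: int = 5) -> str:
    """Poll the marker until it exists, then return the key it points to.
    (In practice we also compare against the last marker value we saw,
    so the same upload isn't processed twice.)"""
    while True:
        try:
            body = s3.get_object(Bucket=BUCKET, Key=MARKER)["Body"].read()
            return body.decode()
        except ClientError:
            time.sleep(poll_seconds)
```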

Brian and I then worked on line detection, for which we used YOLOv8’s pretrained person detector, tracked each person’s location through their bounding box, and decided whether people should be considered in line based on the distance and angle between them. To make it more robust, we tried to count people as in line only if they had stood still for a few seconds, which should eliminate people just passing by; however, we weren’t able to get this working reliably. I was thinking of implementing pose detection instead, so we could filter out passersby by checking whether they are facing forward (in line) or to the side (passing by), but we will do this next week if we have the chance.
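The grouping logic is roughly the sketch below; the distance and angle thresholds are placeholder values rather than our tuned numbers:

```python
import math
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # pretrained COCO weights; class 0 is "person"

MAX_GAP_PX = 150    # placeholder: max pixel distance between neighbours in line
MAX_ANGLE_DEG = 40  # placeholder: max change in direction along the line

def person_centers(frame):
    """Return the (x, y) box center of every detected person in one frame."""
    result = model.predict(frame, classes=[0], verbose=False)[0]
    return [((x1 + x2) / 2, (y1 + y2) / 2)
            for x1, y1, x2, y2 in result.boxes.xyxy.tolist()]

def chain_line(centers, front):
    """Greedily chain people starting from the front of the line, accepting the
    nearest neighbour only while the gap and the turn angle stay small."""
    line, remaining = [front], [c for c in centers if c != front]
    current, direction = front, None
    while remaining:
        nxt = min(remaining, key=lambda c: math.dist(current, c))
        if math.dist(current, nxt) > MAX_GAP_PX:
            break
        angle = math.degrees(math.atan2(nxt[1] - current[1], nxt[0] - current[0]))
        if direction is not None:
            diff = abs(angle - direction) % 360
            if min(diff, 360 - diff) > MAX_ANGLE_DEG:
                break
        remaining.remove(nxt)
        line.append(nxt)
        current, direction = nxt, angle
    # The dwell-time filter would additionally track these centers across frames
    # and only count someone once they have stayed put for a few seconds.
    return line
```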

Lastly, we spent some time testing at Salem’s this week. We spent Monday trying to set up and resolve Wi-Fi issues between our RPis and Salem’s network. We went back on Thursday and managed to get some good footage of an actual checkout lane in use to test our individual components. We also went earlier today, but unfortunately were not allowed to test this time.

Progress

The main issue is that we didn’t get as much testing done as I would have liked (and also not enough to build much of a presentation around). I’m planning to go back to Salem’s tomorrow and try again at a time that works better for them. Less importantly, I was also going to try to implement pose detection to improve our line detection.

New Knowledge

Since I hadn’t done anything with computer vision previously, I had to learn how to use OpenCV, which I did by completing a short boot camp from their website. After that, I mostly used YOLOv8’s documentation, along with a few articles I found online, to learn how to train a model and run inference with it. I then used some more online articles to figure out how to set up a headless RPi. For accessing the camera module, I used the picamera2 documentation and some of the examples included in it. Lastly, I relied on forums quite frequently when I ran into setup issues (things like installing packages and changing the Wi-Fi settings).

Simon’s Status Report for 4/6

Accomplishments

On Wednesday, I flashed Raspberry Pi OS onto a microSD card and set it up as a headless RPi. On Thursday, I attached the RPi camera module and made sure that I could record and save a video using the picamera2 module (it took me unnecessarily long to realize that I needed picamera2 and not picamera). On Friday, I also made sure that I could record and save a video from the USB camera. On Saturday, I got the camera module and the USB camera to capture and save videos simultaneously, so Brian and I can put our work together tomorrow and see if we can record and then upload/download from S3 successfully. I also spent some time trying to stream from my RPi to a website, but I couldn’t figure out why I wasn’t able to connect, so I think I’ll just scrap that approach (plus, it’s probably a bad idea, since anyone who can view the website could view all the footage).

Here’s the code with the camera module and USB camera recording simultaneously:
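Roughly, the dual-recording setup looks like the following sketch; the output filenames, clip length, and the assumption that the USB camera shows up as OpenCV device 0 are all placeholders:

```python
import time
import cv2
from picamera2 import Picamera2
from picamera2.encoders import H264Encoder

DURATION_S = 10  # placeholder clip length

# CSI camera module: record H.264 using picamera2's hardware encoder.
picam2 = Picamera2()
picam2.configure(picam2.create_video_configuration())
picam2.start_recording(H264Encoder(bitrate=10_000_000), "module.h264")

# USB camera: grab frames with OpenCV and write them out in parallel.
usb = cv2.VideoCapture(0)  # device index is an assumption; depends on enumeration order
fps = usb.get(cv2.CAP_PROP_FPS) or 30
w = int(usb.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(usb.get(cv2.CAP_PROP_FRAME_HEIGHT))
writer = cv2.VideoWriter("usb.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))

end = time.time() + DURATION_S
while time.time() < end:
    ok, frame = usb.read()
    if ok:
        writer.write(frame)

picam2.stop_recording()
usb.release()
writer.release()
```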

Progress

I’m keeping up with the new day-by-day schedule that Brian created, but I was hoping to be further ahead by now, because the RPi setup should not have taken as long as it did. For next week, it should be easy enough to combine Brian’s work and mine tomorrow, and from there I will just continue following the schedule outlined in Brian’s status report.

Testing

For what I’ve worked on and plan to finish, which is shopping cart detection and line detection, I could test by going to Salem’s, setting up a camera at each checkout line for 10 minutes (since there are 6 checkout lines, this will total an hour or so), and marking down what percentage of the shopping carts are correctly detected (correctly meaning detected by the time they enter the bounding box). For line detection, I will count how many people are actually in line in the camera footage, check whether the system detects the correct number, and keep track of how many people we are off by.
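Scoring those two checks comes down to a couple of simple metrics; a sketch is below, with the tallies coming from manually reviewing the footage:

```python
def cart_detection_rate(carts_detected: int, carts_total: int) -> float:
    """Fraction of carts detected by the time they entered the bounding box."""
    return carts_detected / carts_total

def line_count_error(true_counts: list[int], detected_counts: list[int]) -> float:
    """Average number of people the line count is off by across sampled frames."""
    return sum(abs(t - d) for t, d in zip(true_counts, detected_counts)) / len(true_counts)
```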

Simon’s Status Report for 03/30/24

Accomplishments

At the start of the week, Brian and I took videos at Giant Eagle to test our models, and my model for the shopping cart detection works reasonably well on the footage that we have. However, I realized that the accuracy would probably be improved by putting the camera higher up than originally planned.

To improve the accuracy, I also tried to split the larger video into two 640×640 videos, because the model is trained on 640×640 images, but I couldn’t get the model to predict on two videos without some kind of error, despite running the processes in separate threads. I don’t think it will be necessary for the interim demo, and there might be better ways to improve accuracy (such as the higher camera angle and simply collecting more data from Salem’s Market, where we plan to deploy our system), so I will put this aside for now and run inference on the original large video instead.
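If we come back to this, one way to avoid the threading errors is to crop each frame into two 640×640 tiles and run them through the same model one after the other; a rough sketch is below, where the weights path, clip name, and tile positions are placeholders and the frame is assumed to be at least 640 pixels tall:

```python
import cv2
from ultralytics import YOLO

model = YOLO("runs/detect/cart/weights/best.pt")  # placeholder path to the cart model

cap = cv2.VideoCapture("giant_eagle_test.mp4")    # placeholder test clip
while True:
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    # Two 640x640 tiles from the left and right edges of the frame, run
    # sequentially on the same model instance instead of in separate threads.
    tiles = [frame[0:640, 0:640], frame[0:640, w - 640:w]]
    for tile in tiles:
        result = model.predict(tile, imgsz=640, verbose=False)[0]
        print(len(result.boxes), "carts detected in this tile")
        # (boxes would still need to be offset back into full-frame coordinates)
cap.release()
```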

Lastly, I changed the method for counting carts. Instead of a single static bounding box that counts carts as they enter and leave, I now use a static box only to find the first cart, then look for the next cart in a small region just past it, and so on until the last cart. This should minimize errors from carts that pass by the line without entering it.
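Written out, the chaining looks something like the sketch below; the step size and slack are placeholder pixel values, and the line is assumed to run along the +x direction for our camera angle:

```python
def count_carts_in_line(boxes, first_region, step=200, slack=80):
    """Chain carts down the line: start from the cart inside the fixed
    first_region, then repeatedly look for the next cart in a small window
    just past the previous one. boxes are (x1, y1, x2, y2) detections."""
    def center(b):
        return ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)

    def inside(pt, region):
        x, y = pt
        rx1, ry1, rx2, ry2 = region
        return rx1 <= x <= rx2 and ry1 <= y <= ry2

    remaining = list(boxes)
    current = next((b for b in remaining if inside(center(b), first_region)), None)
    count = 0
    while current is not None:
        count += 1
        remaining.remove(current)
        cx, cy = center(current)
        # Search window just behind the current cart along the line direction;
        # carts passing by outside this window are ignored.
        window = (cx, cy - slack, cx + step, cy + slack)
        current = next((b for b in remaining if inside(center(b), window)), None)
    return count
```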

Progress

I plan to collect some real data from Salem’s Market this week and retrain my model for higher accuracy. If any component is still missing for us to have a complete end-to-end system, I will also work on that (for example, if our throughput calculations aren’t quite done).

Simon’s Status Report for 3/23/2024

Accomplishments

I realized I had to retrain my model because I accidentally set the image size to 64×64 in the original training run, which made the model fail to produce accurate predictions on higher-resolution images. Scaling images/video down to 64×64 worked, but with lower confidence on predictions than I would have liked. With a new model trained at an image size of 640×640, the model works much better on higher-resolution images and predicts correctly with higher confidence than the previous model did on the scaled-down images.
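The retraining call itself is short; something like this sketch, where the dataset path and epoch count are placeholders and the imgsz change is the part that matters:

```python
from ultralytics import YOLO

# Start from the small pretrained checkpoint and retrain on the cart dataset.
model = YOLO("yolov8n.pt")
model.train(
    data="shopping_cart/data.yaml",  # placeholder path to the Roboflow export
    epochs=50,                       # placeholder epoch count
    imgsz=640,                       # the fix: 640x640, not the accidental 64x64
)
```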

With the new model, I used YOLOv8’s object counting to count the number of shopping carts in a certain region of an image. I’ve tested it with stock footage to verify that it works on an extremely simple case, and Brian and I will go to grocery stores tomorrow to take video footage from our desired camera angle so that we can see if it works correctly.
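YOLOv8 ships this as a built-in counting solution; the equivalent logic written out by hand, so the region check is explicit, looks roughly like the sketch below (the lane polygon and weights path are placeholders):

```python
import numpy as np
import cv2
from ultralytics import YOLO

model = YOLO("runs/detect/cart/weights/best.pt")  # placeholder cart weights
# Placeholder polygon around one checkout lane, in pixel coordinates.
LINE_REGION = np.array([[100, 250], [900, 250], [900, 650], [100, 650]],
                       dtype=np.int32).reshape(-1, 1, 2)

def carts_in_region(frame) -> int:
    """Count detected carts whose box center lies inside the lane polygon."""
    result = model.predict(frame, verbose=False)[0]
    count = 0
    for x1, y1, x2, y2 in result.boxes.xyxy.tolist():
        center = (float((x1 + x2) / 2), float((y1 + y2) / 2))
        if cv2.pointPolygonTest(LINE_REGION, center, False) >= 0:
            count += 1
    return count
```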

Progress

I’m hoping I’m close to being done with shopping cart detection (I still need to verify that it works and make changes accordingly), and if all goes well I should be somewhat caught up. For next week, I will try to make sure the shopping cart detection is working, and once I’m confident that it is, I can help Shubhi with relative fullness detection.

Simon’s Status Report for 03/16/2024

Accomplishments

For this week, I found a few datasets to train my model on.

The first one can be found here: https://universe.roboflow.com/furkan-bakkal/shopping-cart-1r48s

I used YOLOv8 to train a model and the results were quite good at first glance, with the best model having roughly 90% accuracy.

However, after looking through some of the validation data, it doesn’t look promising. Here’s one of the validation batches:

The image is pretty blurry, but the important thing to note is that the model performs poorly on realistic backgrounds (as opposed to plain white ones). I think it’s inaccurate enough that tuning the hyperparameters probably won’t improve the accuracy significantly.

I used another dataset to train a model after deciding that the first one probably wouldn’t be good enough. Link here: https://universe.roboflow.com/roboflow-demo-projects/retail-nxvjw

The results are similar at first glance:

Looking through the validation batches, however, the new model seems to perform much better than the first one. Because this dataset includes far fewer images with white backgrounds, the new model looks to be significantly better than the old one at detecting shopping carts against noisy backgrounds. I think I’ll try tuning some of the hyperparameters (such as the number of training epochs and the image size) to see if the accuracy can be improved further.

I found one more, much larger dataset that I’m currently training on, but training isn’t finished at the time of this status report, so I’ll check the model in a few hours.

Lastly, I learned how to use OpenCV to apply Gaussian blur, perform edge detection, and find contours, which is hopefully a good starting point for detecting how full shopping carts are, since we plan on not using a CV model for this part of the system. This is what an example currently looks like:
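The pipeline behind that example is roughly the following sketch, where the input crop, blur kernel, Canny thresholds, and the contour-area “fullness” proxy are all placeholder choices:

```python
import cv2

# Rough fullness heuristic without a CV model: blur, find edges, then measure
# how much of the cart crop is covered by item contours.
img = cv2.imread("cart_crop.jpg")  # placeholder cropped cart image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Fraction of the crop covered by contour area, as a crude fullness proxy.
covered = sum(cv2.contourArea(c) for c in contours)
fullness = covered / (img.shape[0] * img.shape[1])
print(f"estimated fullness: {fullness:.0%}")
```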

As it stands right now, this could be much improved. I will be looking into how I can use OpenCV to perform image alignment this upcoming week.

Progress

I think I’m still quite behind where I need to be at this point. For next week, I hope to refine my current model a little further by tuning some hyperparameters and training again. Then, I need to see how it performs by feeding it actual camera footage of shopping carts from one of our cameras at the angle we plan on using. As for the object tracking of carts that I mentioned in my last status report, I no longer think it will be necessary, since we should only need to know how many carts are in a line at a given time, but I will check with my teammates to confirm. Lastly, I hope to figure out how to implement image alignment correctly, so that the shopping carts are consistently aligned and our cart fullness estimates become more accurate.

Simon’s Status Report for March 9, 2024

Accomplishments

I don’t have much to report this time. The week before spring break, I spent the majority of my time working on the design report. Since we have shifted responsibilities around and I am now working on the checkout line detection (and I had no previous experience with OpenCV), I spent some time over spring break working through OpenCV’s boot camp and learning how to do object tracking and detection, so I think I’m ready to start implementing the checkout line detection this upcoming week.

Progress

Since we have decided to no longer use an FPGA, my previous progress (which was not much) is now irrelevant. As such, I’m pretty far behind where I’d like to be in terms of my new responsibilities, but I plan to spend extra time this week to implement the line detection algorithm. Hopefully, I will be able to start testing this week as well. I plan to go to a grocery store and ask to take a few pictures from our proposed camera angle for some initial testing data since I couldn’t find any suitable images online.

Simon’s Status Report for 02/24/2024

Accomplishments

I made little progress this week. I didn’t manage to finish the HPS guide I mentioned last week because I don’t have an Ethernet router or an Ethernet port on my laptop, both of which are needed. As such, I am unable to use the UART-to-USB communication I was originally planning on until I get the Ethernet components, and I have not yet figured out an alternative that doesn’t use the HPS. I also didn’t get to experiment with OpenCL this week because I didn’t have the required microSD card and adapter, so I am waiting on those to be delivered this coming Monday (note to self: in the future, read all the required materials first thing). As a result, much of my time this week was spent planning the OpenCL implementation, since I was unable to make progress on the other deliverables.

Progress:

I am significantly behind schedule, since I am supposed to start implementing the CV algorithm using OpenCL next week. To resolve this, I plan to spend some extra time over spring break (which was originally scheduled to have no work) to catch back up, since I doubt I will be back on schedule by the end of this week.

Simon’s Status Report for 02/17/2024

Accomplishments:

I started off this week by looking into sensors we could use instead of cameras, after receiving feedback on our proposal presentation. However, we decided that sensors on shopping carts would be economically unfeasible and easy to break or lose, and that sensors at checkout lines would be less accurate than cameras.

I also worked on the design presentation slides with Brian. We refined our use case requirements, decided on which components we would use, and created a block diagram for our system.

Lastly, I familiarized myself with the DE10-Standard FPGA by reading through the manual and system CD, downloading the necessary software, and working through the getting started guide and the tutorials for the FPGA. I am currently working through the HPS guide and I plan to finish that and the OpenCL guide in the next few days.

Progress:

I underestimated the amount of work needed to familiarize myself with the DE10-Standard, so I am a little behind schedule. Also, I am still unsure of the details of the CV algorithm, so Brian and I have not gotten around to designing the RTL datapath. However, the schedule for this week only entails getting communication between the FPGA and my laptop working, so hopefully I can still complete this task on time and catch back up. For next week, I hope to have an initial version of the datapath done and also be able to send and receive data between my laptop and the FPGA.

Simon’s Status Report for 02/10/2024

This week, I worked on the proposal presentation slides, mostly contributing to the use-case requirements, technical challenges, and scope. I also researched FPGA-based hardware acceleration to gain a general understanding of how it applies to machine learning and computer vision, so that Brian and I can design our RTL implementation next week. From our reading, Brian and I decided on a DE10-Standard FPGA, with a DE10-Lite as a backup if the DE10-Standard is not available in the inventory. We are also planning to use the OpenCL framework for our FPGA development. (See Brian’s status report for more details.)

In terms of progress, I feel like I am slightly behind schedule because of uncertainty over which component(s) of the CV algorithm to speed up using the FPGA. Consequently, I haven’t figured out how to start designing the datapath, which Brian and I are scheduled to complete this upcoming week. However, we plan to have a team meeting on Sunday or Monday where we will discuss potential CV algorithms and how an FPGA can be incorporated. This should give Brian and me a good idea of what needs to be included in the datapath design.

For next week, I hope to have a completed plan for our datapath and to begin looking into how to send and receive data between the FPGA and the CPU.