Mehar’s Status Report for 10/1

This past week I focused on implementing and testing various neural network-based object detection architectures and working on the design documentation. My goals this week were to get a fully functional Faster R-CNN model running for testing, lay out the rough CV pipeline, and study OpenCV further.

The bulk of my time was spent getting the models running and performing preliminary testing. In my research, I found a promising object detection library, Facebook's Detectron2, with support for various Faster R-CNN architectures. Mask R-CNN briefly became a consideration, since the object masks could help with object occlusion in our use case (e.g., a table covering chairs), but I ultimately decided against it: Faster R-CNN would work sufficiently well, and the masks would add significant overhead when labeling training data.

I tested a number of Faster R-CNN architectures on some test images we took after class on Monday. Ultimately, I found the larger Faster R-CNN ResNet-101 architectures had higher accuracy and were able to detect more objects. During this testing, I also tried Ultralytics YOLOv5. Surprisingly, YOLOv5 performed similarly to the larger Faster R-CNN architectures despite its smaller model size and faster computation time. For this reason, I decided to work with YOLOv5 instead of Faster R-CNN.

Ultralytics YOLOv5 on iPhone Image
Detectron2's Faster R-CNN w/ ResNet-50 on iPhone Image
Detectron2's Faster R-CNN w/ ResNet-101 on iPhone Image
Detectron2's Faster R-CNN w/ ResNeXt-101-32x8d on iPhone Image
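To give a sense of how I've been running YOLOv5 in testing: loading a pretrained model via `torch.hub` is Ultralytics' documented route, and from there we only care about a few classes above a confidence cutoff. This is a rough sketch; `load_yolov5` and `filter_detections` are illustrative names, and the detection-dict shape below is my own simplification, not YOLOv5's raw output format.

```python
def load_yolov5(variant="yolov5s"):
    """Load a pretrained YOLOv5 model via torch.hub (first call downloads
    the weights; import is deferred so the filter below runs without torch)."""
    import torch
    return torch.hub.load("ultralytics/yolov5", variant)

def filter_detections(detections, wanted=("chair", "person"), conf_threshold=0.5):
    """Keep detections of the classes we care about above a confidence cutoff.

    `detections` is a list of dicts like
    {"name": "chair", "confidence": 0.83, "box": (x1, y1, x2, y2)};
    this shape is illustrative, not YOLOv5's native results object.
    """
    return [d for d in detections
            if d["name"] in wanted and d["confidence"] >= conf_threshold]
```

The confidence threshold of 0.5 is just a starting point; we'd tune it once we see how the model behaves on images from the actual room.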

From there, I spent some time determining the rough overall CV pipeline, discussing with Chen how to translate the object detection output into seat occupation data. I added the final pipeline to the Design Review PowerPoint slides.

Design Review Presentation Computer Vision
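The core of the translation step Chen and I discussed is going from bounding boxes to per-seat occupancy. A minimal sketch of one way to do it, assuming we predefine a region per seat in the camera frame (the seat-region dict and function names here are hypothetical, not finalized design):

```python
def box_center(box):
    """Midpoint of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def seat_occupancy(seat_regions, person_boxes):
    """Map detections to seat occupancy.

    seat_regions: dict of seat_id -> (x1, y1, x2, y2) region in the frame.
    person_boxes: list of (x1, y1, x2, y2) boxes for detected people.
    A seat counts as occupied if any person-box center falls in its region.
    """
    centers = [box_center(b) for b in person_boxes]
    return {
        seat_id: any(x1 <= cx <= x2 and y1 <= cy <= y2 for cx, cy in centers)
        for seat_id, (x1, y1, x2, y2) in seat_regions.items()
    }
```

Using box centers rather than full-box overlap keeps the rule simple, though we may need something smarter once we see real occlusion cases.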

I didn't get to research OpenCV as much as planned; I fell sick during the week, which cost me some of the time I had set aside for it. Based on what I was finding during model testing, though, I mainly need to look into noise reduction, contrast enhancement, and possibly some image segmentation for preprocessing.
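The denoise-then-boost-contrast idea could look roughly like the sketch below. The `preprocess` function uses real OpenCV APIs (`cv2.fastNlMeansDenoisingColored`, `cv2.createCLAHE`), but the pipeline itself and its parameter values are assumptions I'd still need to validate; the percentile contrast stretch is a simple NumPy stand-in for comparison.

```python
import numpy as np

def stretch_contrast(gray, lo=2, hi=98):
    """Percentile-based contrast stretch on a grayscale uint8 array;
    a simple baseline next to OpenCV's CLAHE."""
    p_lo, p_hi = np.percentile(gray, (lo, hi))
    if p_hi <= p_lo:
        return gray.copy()
    out = (gray.astype(np.float32) - p_lo) * (255.0 / (p_hi - p_lo))
    return np.clip(out, 0, 255).astype(np.uint8)

def preprocess(bgr):
    """Hypothetical pipeline: denoise, then CLAHE on the lightness channel.
    Parameter values (h=10, clipLimit=2.0, 8x8 tiles) are untuned guesses.
    """
    import cv2  # deferred so stretch_contrast works without OpenCV installed
    den = cv2.fastNlMeansDenoisingColored(bgr, None, 10, 10, 7, 21)
    lab = cv2.cvtColor(den, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
```

Applying CLAHE only to the LAB lightness channel avoids shifting colors, which matters if we later filter chairs by color.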

Next week, I'll research the OpenCV functionality needed for the image preprocessing layers to catch up in that area, and will start putting the preprocessing together with the model. Beyond that, next week's goals include deciding what training data to collect and beginning to collect it using the camera setup.

Aditi’s Status Report 10/1

I ordered the parts late Sunday night and was able to pick them up during the week. A lot of our energy this week went into the design presentation. I was in charge of the hardware implementation slide, the testing and validation slides, and the implementation plans, and I also redefined the use-case requirements. This took a while, as I needed to figure out how we would implement some of our testing and come up with a new test case for one of our use-case requirements.

I attempted to get the Jetson TX2 up and running, but I was not able to download the SDK onto the ECE machines due to user-privilege restrictions. I also realized that the TX2 was too large and would be difficult to mount. After some research, I found that the Jetson Nano meets most of our requirements, though we would have to switch to USB cameras. I did a trade study comparing average accuracy and FPS across some of the other options, such as a smaller YOLO variant, a Raspberry Pi, and support for multiple cameras. The Nano also has a simpler headless setup, so hopefully I can get things moving this week. Finally, I reviewed all of our feedback and gave each person on the team their relevant portion.

Team Status Report 10/1

The implementation of our project changed slightly again this week, and some more details were worked out. Instead of an NVIDIA Jetson TX2, we will use an NVIDIA Jetson Nano: the TX2's form factor was much larger than we expected, and the Nano has enough compute power for our use case. We have also decided on YOLOv5 for object detection after testing on preliminary images. There is some risk that we won't be able to detect chairs when people are sitting on them; in that case, we may simply ignore those objects and identify only empty chairs, as this still meets our use-case requirements. Identifying occluded chairs may also be difficult, and we may try to preprocess the image by filtering on the known colors of the chairs in the room. We have changed the delegation of tasks slightly: Chen (instead of Mehar) will work on counting the number of chairs and interpolating the middle of each bounding box output by YOLOv5.
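The color-filtering fallback for occluded chairs would amount to masking pixels within a known HSV range for the room's chairs. A minimal sketch of the idea (OpenCV's `cv2.inRange` does this directly; here a NumPy equivalent for illustration, and the actual color bounds would have to be measured from the room):

```python
import numpy as np

def chair_color_mask(hsv, lower, upper):
    """Boolean mask of pixels whose HSV values fall within a known
    chair-color range. `hsv` is an (H, W, 3) array; `lower`/`upper`
    are 3-element bounds, which are placeholders until measured."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    return np.all((hsv >= lower) & (hsv <= upper), axis=-1)
```

The resulting mask could then feed into preprocessing or be used to double-check detections in regions where chairs tend to be hidden under tables.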


Here is a picture of YOLOv5 working on an image we took of a study room we hope to use for the MVP: