This week I started exploring object detection algorithms. One promising model I found is YOLOv5, a lightweight object detection model that performs well and doesn’t require significant computing power to run. I tested the yolov5s variant locally, using my webcam as the input video feed, and achieved about 8 FPS on the CPU of my M2 MacBook Pro. Below is a test image; the model is able to consistently identify me as a person.
Next week I want to look into training on a custom dataset. In our design, the camera is placed above the library door and looks down at a slight angle, so most of our footage will show people from above. We want the model to detect people even when they are not facing the camera directly, so that the data we extract is accurate. I would also like to research and test tracking algorithms locally. I think development of the CV part is on schedule so far, since I originally planned 3 weeks for object detection and tracking. Here is the local testing code I used: GitHub repo.
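As a rough illustration of the two things I checked in local testing (throughput and whether a person is detected), here is a minimal sketch. It is not the code from the repo. The helper names `keep_people` and `measure_fps`, the `min_conf` threshold of 0.5, and the injectable `clock` parameter are all assumptions made for this example. It assumes detections arrive as rows of `[x1, y1, x2, y2, confidence, class_id]`, which is the layout YOLOv5 uses for `results.xyxy[0]`, and that class 0 is "person" as in the COCO labels the pretrained weights use.

```python
import time

# Assumption: pretrained YOLOv5 weights use COCO labels, where class 0 is "person".
PERSON_CLASS_ID = 0


def keep_people(detections, min_conf=0.5):
    """Keep only person detections at or above a confidence threshold.

    Each detection is a row [x1, y1, x2, y2, confidence, class_id].
    The 0.5 default is an illustrative choice, not a tuned value.
    """
    return [
        d for d in detections
        if int(d[5]) == PERSON_CLASS_ID and d[4] >= min_conf
    ]


def measure_fps(process_frame, frames, clock=time.perf_counter):
    """Run process_frame on every frame and return the average frames per second.

    `clock` can be swapped out (hypothetical hook) so the timing is testable.
    """
    start = clock()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = clock() - start
    # Guard against a zero-length interval (e.g. no frames processed).
    return count / elapsed if elapsed > 0 else 0.0
```

In a real run, `frames` would come from the webcam (for example, frames read with OpenCV's `cv2.VideoCapture(0)`), and `process_frame` would call the model, which can be loaded with `torch.hub.load("ultralytics/yolov5", "yolov5s")`. Those pieces are left out here so the sketch runs with only the standard library.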
I learned these tools by reading the documentation for libraries such as OpenCV and by researching different object detection algorithms online. I also found some existing guides and YouTube videos and used them as a reference point for building my local test project.