Last week, I finished building the model using my custom dataset. I ran into a major problem when trying to use the YOLO model, which is what we had originally planned. Initially, I had thought that an object detection model like YOLO would be ideal, since it is trained to pick out different objects in a scene. However, one shortcoming I did not foresee was that the images our system collects are unlike typical natural images, because they are taken from an aerial view. The following two pictures show this disparity.
In the first image, the picture is taken from the side. The second image is taken from the top. A human can easily tell whether other people are present from either angle, but a machine learning model trained on side-view images does not automatically generalize to a top-down view.
To fix this problem, I decided to use my own machine learning model, without any pretrained weights. The advantage is that it focuses solely on the data we feed it, so it does not depend on natural images. The disadvantage is that the model has to be simpler, since less data is available. For this reason I changed the approach: instead of the model finding the location of the seats, I specify the seat locations myself. The algorithm then crops out each seat, resizes the crops, and classifies whether there is a person in each seat. I think this is a fair compromise, given that the seats in Pablo’s dining room are in relatively fixed positions, and so is the camera we set up. If we were to take this project further, there would be an extra cost to configure the seat locations at each new installation, but this cost would be negligible compared to the effort of mounting the camera and central nodes at the location.
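The crop, resize, and classify steps can be sketched roughly as follows. This is only an illustration, not the actual implementation: the seat coordinates in `SEAT_BOXES`, the function names, and the brightness-based `mean_brightness_classifier` (a stand-in for the trained model) are all hypothetical, and the image is a plain nested list to keep the example free of dependencies. A real pipeline would likely use an image library instead.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive

# Hypothetical seat regions for a fixed camera; real values would be
# measured by hand from a reference picture of the room.
SEAT_BOXES: Dict[str, Box] = {
    "seat_0": (0, 0, 4, 4),
    "seat_1": (4, 0, 8, 4),
    "seat_2": (0, 4, 4, 8),
    "seat_3": (4, 4, 8, 8),
}


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Resize with nearest-neighbor sampling so every crop has the same shape."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def mean_brightness_classifier(patch: Image) -> bool:
    """Placeholder for the trained model: 'occupied' if the patch is bright."""
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels) > 127


def seat_occupancy(
    image: Image,
    classify: Callable[[Image], bool],
    size: Tuple[int, int] = (2, 2),
) -> Dict[str, bool]:
    """Crop each predefined seat, resize it, and classify it independently."""
    return {
        name: classify(resize_nearest(crop(image, box), *size))
        for name, box in SEAT_BOXES.items()
    }
```

Because the classifier is passed in as a function, the placeholder can be swapped for the trained model without touching the cropping logic.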
Additionally, I worked on the frontend of the website. When we were planning to set up in Sorrell’s Library, I did not know the layout of the seats, especially since we were not sure where we could mount the cameras. Once I got a few pictures of Pablo’s dining room, I could understand the layout and set up the website to mirror it. Right now, the website can read four occupancy bits and display an HTML page that colors each seat red or green based on availability, with each seat drawn at a position matching its place at the dining table.
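As a rough sketch of how four occupancy bits could be turned into such a page, the snippet below builds the HTML as a string. The seat labels, the percentage positions in `SEAT_LAYOUT`, the `render_page` name, and the convention that a bit of 1 means occupied (red) are all assumptions made for illustration; the actual website may be structured differently.

```python
from html import escape
from typing import List, Sequence, Tuple

# Hypothetical layout: (label, left %, top %) for each seat around the table.
SEAT_LAYOUT: List[Tuple[str, int, int]] = [
    ("Seat 1", 20, 10),
    ("Seat 2", 60, 10),
    ("Seat 3", 20, 70),
    ("Seat 4", 60, 70),
]


def render_page(bits: Sequence[int]) -> str:
    """Return an HTML page with one colored box per seat.

    Assumes bit == 1 means occupied (red) and bit == 0 means free (green).
    """
    if len(bits) != len(SEAT_LAYOUT):
        raise ValueError(f"expected {len(SEAT_LAYOUT)} bits, got {len(bits)}")

    seats = []
    for bit, (label, left, top) in zip(bits, SEAT_LAYOUT):
        color = "red" if bit else "green"
        seats.append(
            f'<div style="position:absolute; left:{left}%; top:{top}%; '
            f'width:20%; height:20%; background:{color};">{escape(label)}</div>'
        )

    body = "\n".join(seats)
    return (
        "<!DOCTYPE html>\n"
        "<html><body>\n"
        '<div style="position:relative; width:400px; height:300px;">\n'
        f"{body}\n"
        "</div>\n"
        "</body></html>"
    )
```

For example, `render_page([1, 0, 0, 1])` would draw the first and last seats in red and the middle two in green.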