Feb 23: Team Status Update

We want to minimize the latency of our computer vision and ML algorithms so that the door can open for a valid cat as it walks up, without making the cat wait. We estimate that the cat will be within range of the camera for a total of 1.2 seconds.
Through our research, we determined that a Raspberry Pi would let us process around 1 frame per second. That is too slow: we might receive only one image during the 1.2-second window, and that image might not give a good indication of whether the animal is valid. We also looked into the Odroid, a board similar to the Raspberry Pi but much more powerful, which would likely yield 2-3 frames per second. Still, we are unsure whether this frame rate is fast enough, and we want to be sure of getting at least one good image for our algorithms.
We then looked into GPUs, which are processors designed for highly parallel workloads such as image processing. Nvidia makes the most commonly used and best-documented GPUs. In addition, one of our group members has experience with Nvidia GPUs. We found the Jetson family, which are embedded modules built around Nvidia GPUs. Specifically, we chose the Jetson TX2, which has 256 CUDA cores, because based on our research we expect to be able to process 15 frames per second. Furthermore, Nvidia has a library called TensorRT, which complements TensorFlow. It can be used in conjunction with TensorFlow to optimize ML computation for Nvidia GPUs. We will be using it to reduce the latency of our algorithm.
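To make the comparison concrete, here is a rough sketch of how many frames each board could process while the cat is in range. It uses only the frame rates and the 1.2-second window quoted above, and it assumes processing starts the moment the cat enters the camera's range, which is optimistic. The names WINDOW_MS, PLATFORM_FPS, and frames_in_window are ours, made up for this illustration.

```python
WINDOW_MS = 1200  # estimated time the cat is in camera range (1.2 s)

# Frame rates quoted in this update, in frames per second.
PLATFORM_FPS = {
    "Raspberry Pi": 1,
    "Odroid (low estimate)": 2,
    "Odroid (high estimate)": 3,
    "Jetson TX2": 15,
}


def frames_in_window(fps, window_ms=WINDOW_MS):
    """Number of complete frames processed within the window.

    Integer milliseconds are used to avoid floating-point rounding.
    """
    return (fps * window_ms) // 1000


for name, fps in PLATFORM_FPS.items():
    print(f"{name}: {frames_in_window(fps)} frame(s)")
```

Under these assumptions the Raspberry Pi gets a single frame, the Odroid two or three, and the Jetson TX2 eighteen, which gives us far more chances at one good image.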

To implement motion detection, we first store a weighted average of previous frames and call this our “background frame.” Because it is a weighted average, the background adapts dynamically, even as the lighting changes with the time of day. We then subtract the background frame from the current frame. If the difference exceeds a certain threshold, we have detected motion as a substantial change in the image. Since we know where in the frame the motion occurred, we can crop out that part of the image. Tracking, on the other hand, involves comparing adjacent frames to figure out what moved where, so even if there are two moving objects in a frame, we can still work out which object went where.
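The motion detection steps can be sketched roughly as follows. This is a minimal illustration, not our actual script: it treats a grayscale frame as a plain list of rows of pixel values so that it runs without any libraries, and the function names, the blending weight alpha, and the threshold of 25 are all placeholder assumptions. A real version would most likely use an image library for speed.

```python
def update_background(background, frame, alpha=0.1):
    """Blend the new frame into the running weighted average.

    A larger alpha makes the background adapt faster to lighting changes.
    """
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def detect_motion(background, frame, threshold=25):
    """Return the bounding box (top, left, bottom, right) of all pixels
    that differ from the background by more than the threshold,
    or None if no pixel does."""
    rows, cols = [], []
    for r, (b_row, f_row) in enumerate(zip(background, frame)):
        for c, (b, f) in enumerate(zip(b_row, f_row)):
            if abs(f - b) > threshold:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1


def crop(frame, box):
    """Cut the bounding box out of the frame."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


# Tiny example: a flat 4x6 background and a frame with two bright pixels.
background = [[10] * 6 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][2] = 200
frame[2][3] = 200

box = detect_motion(background, frame)
if box is not None:
    moving_region = crop(frame, box)
background = update_background(background, frame)
```

Note that this sketch puts one box around every changed pixel, so two separate moving objects would end up in the same crop. Telling them apart is the job of the tracking step.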

Instead of using an ultrasonic sensor, we will mount a PIR sensor at the bottom of the door on the indoor side to detect when the door needs to open for a cat wanting to exit the house. The camera will be mounted at the top of the outdoor side, angled downwards, and will be used to determine when the door needs to open for a cat wanting to enter the house. A door switch will tell us when the servo needs to lock after a cat has finished entering or exiting the house.
Passive infrared (PIR) sensors detect changes in infrared radiation. All objects with a temperature above absolute zero emit heat energy in the form of radiation, so a PIR sensor can be used to sense the movement of people, animals, or other objects.
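One way to picture how these parts fit together is as a small state machine. The sketch below is only our current mental model, not finished control code. It assumes the door switch reports whether the door is closed, and the state names and the next_state function are hypothetical. It also leaves out a timeout for relocking when a cat walks away without going through, which a real version would need.

```python
from enum import Enum


class DoorState(Enum):
    LOCKED = "locked"      # servo holds the door shut
    UNLOCKED = "unlocked"  # servo released, door has not moved yet
    PASSING = "passing"    # door is open, cat is going through


def next_state(state, pir_motion, valid_cat_seen, door_closed):
    """Compute the next door state from the current sensor readings.

    pir_motion:     indoor PIR sensor sees movement (cat wants out)
    valid_cat_seen: outdoor camera pipeline accepted the animal (cat wants in)
    door_closed:    door switch reports the door is shut (assumed meaning)
    """
    if state is DoorState.LOCKED:
        if pir_motion or valid_cat_seen:
            return DoorState.UNLOCKED
        return DoorState.LOCKED
    if state is DoorState.UNLOCKED:
        if not door_closed:
            return DoorState.PASSING
        return DoorState.UNLOCKED
    # PASSING: lock again once the door switch says the door has shut.
    if door_closed:
        return DoorState.LOCKED
    return DoorState.PASSING
```

For a cat leaving the house, the sequence would be LOCKED, then UNLOCKED when the PIR sensor fires, then PASSING when the door swings open, and back to LOCKED once the door switch closes.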