Work Done
I managed to get a faster object detection model (found in this Git repo). It's a YOLOv3 model, so it's a bit less accurate than YOLOv4, but it works for our purposes and is less computationally intensive. This model can also detect pedestrians, so the algorithms for these components of our project have been finalized. It appears to take less than a second to process each frame, but I'll need to test it on the RPi to make sure the timing remains consistent there. Kaitlyn and I also set up the SUMO simulation on the RPi this week, and it runs seamlessly (i.e., no noticeable latency or lag when simulation parameters are changed).
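For reference, here is a minimal sketch of how the model is loaded and timed per frame, assuming standard Darknet yolov3.cfg/yolov3.weights files and OpenCV's DNN module (the file names here are placeholders, not necessarily what the repo uses):

```python
import time
import cv2

# Placeholder file names; the actual config/weights come from the Git repo.
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
out_layers = net.getUnconnectedOutLayersNames()

cap = cv2.VideoCapture("demo_intersection.mp4")  # prerecorded demo footage
while True:
    ok, frame = cap.read()
    if not ok:
        break
    start = time.perf_counter()
    # YOLOv3 expects a square input blob with pixel values scaled to [0, 1].
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    detections = net.forward(out_layers)
    print(f"frame processed in {time.perf_counter() - start:.3f} s")
cap.release()
```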
Unfortunately, contrary to what the vendor told me, I found out (after asking some setup questions) that the camera we bought does not support RTSP. In fact, there likely aren't any battery-powered IP cameras on the market that do, probably because, to save power, they only record the live feed when motion is detected or when the user prompts them through their companion apps. Next week I will order a wired IP camera along with a portable battery; based on video tutorials I've found online, I know of several models that definitely support RTSP. At this point, though, I'm not sure it makes sense to demo the finalized object detection model on anything other than prerecorded footage, so the purpose of the live camera would just be to show that the object detection model can run on a live IP camera feed.
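Once the wired camera arrives, reading the live feed should amount to pointing OpenCV at the camera's RTSP URL. A sketch, with a hypothetical address (the real URL and credentials depend on the camera model):

```python
import cv2

# Hypothetical RTSP address; the real URL/credentials depend on the camera.
stream = cv2.VideoCapture("rtsp://user:pass@192.168.1.64:554/stream1")
if not stream.isOpened():
    raise RuntimeError("could not open RTSP stream")

ok, frame = stream.read()  # frames then go through the same detection pipeline
stream.release()
```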
Schedule
There is still work to do on the object detection code: currently, every vehicle detected anywhere in the frame is included in our count of vehicles on that side of the intersection. I found some lane detection algorithm resources in this article (using Canny edge detection and Hough transforms), but because those methods are very lighting-dependent, I'm not sure they will be accurate enough. If they don't give me the results I want, I'll hardcode lane boundaries based on the video dimensions. There is also the issue of cars crossing the intersection and blocking the camera's view of the cars that are waiting (which is the number we actually want); in that case, if a bounding box exceeds a certain size, I will hold the previous vehicle count until the oversized bounding box is no longer present. Similarly, I will only count pedestrians whose coordinates fall within the sidewalk region for each side of the intersection.
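A rough sketch of the counting logic I have in mind, with hardcoded lane boundaries and the oversized-bounding-box check (all coordinates and thresholds below are placeholders I'd tune against the actual footage):

```python
FRAME_W, FRAME_H = 1280, 720             # placeholder video dimensions
LANE = (400, 300, 880, 700)              # hardcoded lane region (x1, y1, x2, y2)
MAX_BOX_AREA = 0.25 * FRAME_W * FRAME_H  # "too big to be a waiting car" threshold

def box_center_in_region(box, region):
    """True if the bounding box's center falls inside the region."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    return region[0] <= cx <= region[2] and region[1] <= cy <= region[3]

def count_waiting_vehicles(boxes, prev_count):
    """Count boxes inside the lane region, freezing the count during occlusion."""
    # A very large box likely means a crossing car is blocking the view,
    # so keep the previous count until it clears.
    if any((b[2] - b[0]) * (b[3] - b[1]) > MAX_BOX_AREA for b in boxes):
        return prev_count
    return sum(box_center_in_region(b, LANE) for b in boxes)
```

Pedestrian counting would reuse the same center-in-region check against a hardcoded sidewalk region instead of LANE.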
Next week, we will need to take concurrent video of all 4 sides of the intersection in order to retrieve the object counts that would feed the optimization algorithm. (We will actually use simulated object counts for the optimization itself, because its outputs, the light state changes, will not be reflected in real-life conditions. The assumptions our algorithm makes, e.g., that cars go when there is a green light, therefore would not hold, and the algorithm would not actually optimize anything. However, we want to demo the object detection in a way that shows that, if the light state changes were reflected in real life, the two systems could be integrated without issues.) The finalized object detection code will be demonstrated on these 4 videos simultaneously, with accuracy tests conducted on the vehicle and pedestrian counts for all 4 sides of the intersection.
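Running the demo on all 4 videos simultaneously could be as simple as stepping the captures in lockstep, one frame from each per iteration. A sketch, where the file names and the count_objects stub are placeholders for the real pipeline:

```python
import cv2

def count_objects(frame):
    """Stub for the per-frame YOLOv3 detection + region filtering above."""
    return 0, 0  # (vehicles, pedestrians)

caps = [cv2.VideoCapture(f"side_{i}.mp4") for i in range(4)]  # one video per side
while True:
    reads = [cap.read() for cap in caps]
    if not all(ok for ok, _ in reads):
        break
    counts = [count_objects(frame) for _, frame in reads]
    print(counts)  # per-side (vehicle, pedestrian) counts for this timestep
for cap in caps:
    cap.release()
```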
Testing Plan
The parts of my subsystem that I need to test are latency (how long the object detection model takes per frame, which needs to be under 5 seconds minus the optimization algorithm's runtime) and accuracy (how accurate the detected vehicle and pedestrian counts are, which should be at least 90% and 80%, respectively).
I will test latency by starting a timer at the beginning of frame processing and stopping it at the end. I will test accuracy by compiling ~100 frames (after the intersection videos are taken) and comparing the detected vehicle and pedestrian counts to the actual counts, determined by eye.
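For the accuracy comparison, one straightforward metric is the fraction of labeled frames where the detected count exactly matches the hand-labeled count (the per-frame lists below are made-up examples; the real ones come from the ~100 labeled frames):

```python
def count_accuracy(detected, actual):
    """Fraction of frames whose detected count matches the hand-labeled count."""
    return sum(d == a for d, a in zip(detected, actual)) / len(actual)

# Made-up example counts for illustration only.
vehicle_detected, vehicle_actual = [4, 3, 5, 2], [4, 3, 4, 2]
ped_detected, ped_actual = [1, 0, 2, 1], [1, 0, 1, 1]

print(f"vehicle accuracy: {count_accuracy(vehicle_detected, vehicle_actual):.0%}")    # target >= 90%
print(f"pedestrian accuracy: {count_accuracy(ped_detected, ped_actual):.0%}")         # target >= 80%
```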
Deliverables
By Monday, 4/15, I will:
- Get the object detection running on the RPi
- Retrieve accurate object counts for the demo video I’ve been working with (using lane detection algorithms or hardcoded values)
- Order the wired IP camera and portable battery and, if they arrive, try to access the live video feed
- Take the concurrent videos of all 4 sides of the intersection (preferably with some assistance)
- Help Kaitlyn and Zina integrate the RPi and the Arduino so that the traffic light PCB reflects the state changes determined by the optimization algorithm