Colin’s Status Report for 10/8

This week I worked on figuring out the compatibility of the parts we want to use on the project. Initially, we thought about using two RPis, dedicating one to the front end and one to the back end. The only reason for this would be to make the GPS and cellular data card easier to hook up to the system. However, the increased development time and complexity of having two RPis communicate data with each other is not worth it. I did some research on the data card and the GPS and have determined that we can hook both of them up to one RPi. Since we aren't going to be anywhere near the computational limits of the RPi, this seems like the most logical route to take. The GPS unit has the ability to change its I2C address, so we can run both the GPS unit and the cellular data card on the same I2C lines. An alternative, if problems arise, would be to communicate with the cellular data card via I2C and with the GPS via UART.
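As a quick sanity check once the hardware arrives, a minimal bus scan like the sketch below could confirm that both devices respond at distinct addresses on the shared I2C lines. This assumes the smbus2 package and I2C bus 1 (the RPi default); 0x42 is the usual u-blox GPS default, and the cellular card address shown is only a placeholder.

```python
from smbus2 import SMBus

# Addresses we expect to see; the cellular card address is a placeholder.
EXPECTED = {0x42: "GPS (u-blox default)", 0x10: "cellular data card (placeholder)"}

def scan_bus(bus_num=1):
    """Return every I2C address that ACKs on the given bus."""
    found = []
    with SMBus(bus_num) as bus:
        for addr in range(0x03, 0x78):
            try:
                bus.read_byte(addr)   # a device ACKs a read if present
                found.append(addr)
            except OSError:
                pass                  # no device at this address
    return found

if __name__ == "__main__":
    for addr in scan_bus():
        print(f"0x{addr:02x}: {EXPECTED.get(addr, 'unknown device')}")
```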

I also did research on the environment we will be running our tasks in. I originally contemplated scheduling three separate Python processes on the RPi, one for GPS data gathering and filtering, another for the backend, and another for the audio output; however, communicating data to and from each process is not simple. An easier approach would be a single Python process using asyncio to perform cooperative multitasking across the three components. Since we are not bound by computational power, we do not need to utilize all of the cores on the RPi, and this makes data communication between the tasks much simpler. We also do not have hard real-time requirements, so we do not need preemption if we work out a simple cooperative scheduling scheme. Any development time saved on the environment that can instead go towards the main direction algorithm of the project will be very useful for us.
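A minimal sketch of what that single-process asyncio layout could look like, with three cooperative tasks passing data through queues. The task names, poll rate, and placeholder bodies are assumptions, not the final design:

```python
import asyncio

async def gps_task(fix_queue: asyncio.Queue):
    """Poll the GPS, filter the fix, and hand it to the backend."""
    while True:
        fix = {"lat": 0.0, "lon": 0.0}        # placeholder for a real I2C read
        await fix_queue.put(fix)
        await asyncio.sleep(1.0)              # poll rate is a guess

async def backend_task(fix_queue: asyncio.Queue, speech_queue: asyncio.Queue):
    """Turn position fixes into the next spoken direction."""
    while True:
        fix = await fix_queue.get()
        direction = "Continue straight"       # placeholder routing logic
        await speech_queue.put(direction)

async def audio_task(speech_queue: asyncio.Queue):
    """Speak whatever the backend decides."""
    while True:
        text = await speech_queue.get()
        print("TTS:", text)                   # placeholder for the TTS engine

async def main():
    fix_q, speech_q = asyncio.Queue(), asyncio.Queue()
    await asyncio.gather(gps_task(fix_q),
                         backend_task(fix_q, speech_q),
                         audio_task(speech_q))

asyncio.run(main())
```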

I am doing okay in terms of the schedule. I accomplished a lot of what I wanted to in terms of figuring out exactly what parts to use and how they will communicate with each other. I have ordered the RPi from the ECE inventory and have figured out what we will be running on it in terms of software. Something I would have liked to get done was to actually receive the RPi last week; I was not able to, so I will be picking it up Monday morning.

Next week I need to get a few things done. The first is to set up Python on the RPi and start on the framework for all of our tasks to communicate with each other. The most important goal for next week is to order all of the extra parts that we will need for the project: the GPS/IMU, the cellular data card, antennas for both of those parts, and some wires to hook the devices up to the RPi.

Team Status Report 10/08/2022

Some risks that could jeopardize the success of our project: we have to make sure that when our device is mounted on a user, it does not move or tilt significantly with respect to the user, as that could affect the coordinates given by our IMU/GPS. As such, we need to come up with a way to design our device so that it is stable on the user's body. Another problem, which we discussed during the presentation, is detecting whether a person is deviating from the route that the device has given them. A significant challenge here is that someone's distance from their next checkpoint increasing does not necessarily mean that they are deviating from the path (e.g., the road could be curved). There are several ways we could approach this issue; one is to detect whether a person's distance to their next checkpoint has increased over the past n epochs, and if it has, assume they are deviating and reroute them.
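One way the "distance increasing over the past n epochs" check could be expressed is sketched below. The window size n and the choice of a strict monotonic-increase test are assumptions to be tuned, not a settled design:

```python
from collections import deque

class DeviationDetector:
    """Flags a deviation if the distance to the next checkpoint has increased
    at every one of the last n position updates (epochs)."""

    def __init__(self, n: int = 5):
        self.history = deque(maxlen=n)

    def update(self, distance_to_checkpoint: float) -> bool:
        self.history.append(distance_to_checkpoint)
        if len(self.history) < self.history.maxlen:
            return False                      # not enough data yet
        pts = list(self.history)
        # deviation only if every step moved the user farther from the checkpoint
        return all(b > a for a, b in zip(pts, pts[1:]))
```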

Although we haven’t made any explicit changes in our design, we are thinking about adding an input system (probably auditory) that will prompt the user to give directions to where they want to go. This will make our system more complete and give a more well-rounded user experience. We will discuss this in the upcoming week and change our design accordingly.

Zachary’s status report for 10/08/2022

This week I worked on setting up a Google Cloud account and gaining access to the Google Maps APIs. After doing that, I was able to play around with the API and read more documentation to gain insight into how it works. Initially, I thought that we would need another API (maybe Overpass) to be able to geocode intersections; however, I found that Google Maps actually has this capability, which is great. I also did research on communication latency with the Google Maps API. I expect the average to be around 200 ms, with a maximum of roughly 500 ms. While this is fine, I want to implement a cache so we do not have to repeatedly ping Google Maps (this should also save on API costs).
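A minimal sketch of what that cache could look like, assuming the Geocoding web endpoint and an in-memory lru_cache; the key, cache size, and timeout values are placeholders rather than decided parameters:

```python
import functools
import requests

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"
API_KEY = "YOUR_KEY_HERE"                      # placeholder

@functools.lru_cache(maxsize=1024)
def geocode(address: str):
    """Geocode an address or intersection string, caching repeat queries
    so identical requests never hit Google Maps twice."""
    resp = requests.get(GEOCODE_URL,
                        params={"address": address, "key": API_KEY},
                        timeout=2.0)           # keep latency bounded
    resp.raise_for_status()
    loc = resp.json()["results"][0]["geometry"]["location"]
    return loc["lat"], loc["lng"]

# Example: the second call is served from the cache.
# geocode("Forbes Ave & Morewood Ave, Pittsburgh, PA")
# geocode("Forbes Ave & Morewood Ave, Pittsburgh, PA")
```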

Additionally, I worked on the design presentation slides that Colin presented on Wednesday. One particular point of feedback that I thought was important was that we currently have no form of input to our system. While Colin and I had discussed this before our presentation, we did not have time to come up with a solution because of the pivot. However, I think an audio input with an on-off switch (i.e., a button) could be a viable way of approaching this problem. I will talk more with Colin about this.
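For a rough sense of how little glue code the button side would need, here is a minimal sketch assuming the gpiozero library and a button on GPIO 17; the pin number and the record_command() helper are hypothetical:

```python
from gpiozero import Button
from signal import pause

button = Button(17)                  # GPIO pin is a placeholder

def record_command():
    """Placeholder: start capturing audio for the destination request."""
    print("Listening for a destination...")

button.when_pressed = record_command
pause()                              # keep the script alive waiting for presses
```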

Referencing our Gantt chart, I will start implementing the backend this week. I think we are still slightly behind schedule given our late pivot, but hopefully we will be able to catch back up over the next two weeks. I will also be writing the design report with Colin.

Zachary’s status report for 10/1/2022

Due to Eshita dropping the course, Colin and I have decided to quickly pivot our project after meeting with Prof. Mukherjee yesterday. As our team status report indicates, we've pivoted towards a route-planning project, which helps visually impaired people navigate from point A to point B while helping them avoid “unfriendly” crosswalks.

Because Eshita only told us she was dropping yesterday, I've only had time today to do research on our new project. Since a lot of the focus of this project will be on route planning and identification of crosswalks, I searched through potential APIs that could be useful for this. In particular, I spent a lot of time going through the Google Maps API and looking through its capabilities.

In addition, the identification of street intersections/crosswalks is my biggest concern right now. As far as I know, the Google Maps API does not have the capability to give information on things like “the nearest intersection from point X”, or whether a coordinate is an intersection or not. A potential solution I've found so far is the Overpass API, which can give the location of an intersection given two street names as input.
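A rough sketch of that two-street-names query, using the public Overpass endpoint and an Overpass QL pattern that returns the nodes shared by two named ways (the shared nodes are the intersection). The city and street names here are placeholders:

```python
import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Nodes that belong to both named streets are the intersection point(s).
QUERY = """
[out:json];
area["name"="Pittsburgh"]->.city;
way(area.city)["highway"]["name"="Forbes Avenue"];
node(w)->.a;
way(area.city)["highway"]["name"="Morewood Avenue"];
node(w)->.b;
node.a.b;
out;
"""

resp = requests.post(OVERPASS_URL, data={"data": QUERY}, timeout=10)
resp.raise_for_status()
for element in resp.json().get("elements", []):
    print(element["lat"], element["lon"])
```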

Due to this unforeseen circumstance, I am currently behind schedule. However, Colin and I are prepared to work hard in the upcoming weeks to get back on track. For next week, I want to read more into the Google Maps and Overpass APIs and start interacting with them, and also talk more with Colin so we can flesh out the details of the design.

Colin’s Status Report for 10/1

This week our team altered our project to now provide directions along blind-friendly routes to aid the visually impaired. Due to Eshita dropping the class, Zach and I lack the machine learning knowledge to be able to proceed with the prior design.

I will now be focusing on the front end of our system. I will be using a Raspberry Pi to gather data from a GPS unit to determine the user's location. The SparkFun GPS-RTK Dead Reckoning pHAT board appears to be a good unit for the project. The unit attaches to an RPi 4 through the 40-pin header and is easily interfaced with over I2C. The unit contains a GPS receiver and an IMU to provide more accurate position readings when GPS signal is lost. The unit has a heading accuracy of within 0.2 degrees; however, it does not contain a magnetometer. It achieves this accuracy by relying on the GPS moving, combined with accelerometer readings. This may be a potential problem for us given that our user may be standing still for a long period of time, and the heading reading will be prone to drift without the user moving in a direction. A solution would be to add a secondary board with a magnetometer to tell direction; however, this may not be necessary and would significantly increase the complexity of the unit, because we would no longer be able to use the pHAT 40-pin connector for the GPS and would have to connect both boards to the RPi, sharing the header.

I will also be taking commands from the back-end Pi to give directions to the user via audio. I will be using a text-to-speech engine to tell the user where to go and to give various status updates from the back-end Pi. The RPi 4 comes with a 3.5 mm audio jack capable of outputting audio to a wired earbud, from which the user will be able to hear the directions.
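We have not picked a TTS engine yet; one option would be something like pyttsx3, which wraps espeak on the Pi and plays through whatever audio output is configured as the default (the 3.5 mm jack in our case). A minimal sketch under that assumption, with the rate value as a guess:

```python
import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 150)     # words per minute; value to be tuned
engine.say("In two hundred feet, turn left onto Forbes Avenue.")
engine.runAndWait()                 # blocks until the phrase finishes playing
```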

I am currently behind schedule given that our team is re-designing the project; however, I am very happy about the new direction of the project. In the past day we have been focusing heavily on the project and will continue to do so in order to have a good design done by Sunday for the design review.

Team Status Report for 10/1

This week Eshita decided to drop the class due to an overwhelming workload. The two of us remaining thought about continuing the project in the same direction but realized that we lacked the machine learning knowledge to confidently proceed. We thought about how we could still aid the visually impaired without the use of machine learning and decided to go towards direction and navigation instead. Our new project will tell a user how to get from point A to point B while avoiding blind-unfriendly crosswalks.

The system will be composed of two Raspberry Pis communicating with each other over a wired network. The front-end Pi will gather location data using a GPS and IMU and will communicate that data to the back-end Pi. The back-end Pi will take the location data and interface with the Google Directions and Roads APIs via a cellular data chip to determine where the user should go next to reach the destination. Information such as the distance to the next intersection, the direction to turn at the intersection, and the ETA will be periodically reported via a speaker and a text-to-speech engine running on the front-end Pi.

Our team is behind schedule at this point considering that we have to restart most of the research and design; however, we are working hard to catch up to where we should be. The skills needed for the project suit our areas of specialization very well, and we should be able to dedicate most of our time towards development as opposed to research.

Zachary’s status report for 9/24

This week, I was mainly focused on editing the slides and preparing for the presentation, which I presented on Monday. I appreciated the feedback and questions that we received from the instructors and classmates, particularly the pushback on the false positive rate, which I feel is a valid concern. As an aside, I felt that some of the feedback saying that I did not know the material well, or that I was underprepared, was unwarranted, as I had spent a substantial amount of time on the presentation. However, perhaps due to being too soft-spoken and having technical difficulties during the presentation, I was not able to convey that.

Additionally, I have also spent a bit of time doing research on object detection algorithms for the implementation of the walk sign detection.

I am currently on schedule, as our team has set aside the first four weeks of our schedule specifically to do research and flesh out our design (as well as prepare for presentations) before we start implementing.

Since I have limited experience with ML, I really want to get a head start on the material and implementation. In this upcoming week, I will be doing more research, as well as working with Eshita to find or create a dataset of walk signs for the ML model that I will implement. Additionally, I hope to set up a GitHub repo and start writing some code, if possible. Lastly, I will talk with my teammates to see if AWS may be needed for model training, and if so, talk with the TAs and professor to set that up.

Eshita’s Status Report for 9/24

Personal reflection

I looked at all the feedback this week for our team and wrote down some ideas that Colin, Zach, and I were already brainstorming. Generally, I feel that I should not have interrupted Zach in the presentation; he knew his material, and I was embarrassed to see in the feedback that peers/faculty thought otherwise. It is an attitude change I will uphold as a teammate moving forward. I also think I got a little carried away with predictions about the statistics of our application. I'm used to classes and assignments where the stakes are quite different, and even compared to the other CV projects within the class, ours stands out in its real user base. I am excited to work on what we're doing, but worried about how much we can achieve. It is not only a change of a number (say from 1% to 0.05%); we have to follow through on actually achieving it.

Some design ideas

We do plan to keep the camera mounted on the shoulder using some sort of strap mount made for GoPros. I have a GoPro, and there seems to be some vague documentation on connecting and streaming video to a Jetson. (An example is here.)

Another idea would be to keep the Jetson and the camera in a self-contained unit (maybe 3D-printed), but I do not know how much heat it would give off and whether that would be harmful to the user. I wouldn't want to wear something hot on my chest or near my body.

Algorithm work

I also researched the background of crosswalk detection itself. I wrote down some potential steps based on my research into the edge detection algorithms I think I would use; a rough OpenCV sketch of these steps follows the list below. More research is needed on how to actually track objects in video, which is not something I've delved into much before. Here are the steps I came up with:

  1. Start by processing a frame and filtering for all of the white I notice in that frame.
  2. Erode the frame using a structuring element. This is necessary to remove noise and identify some finer points in the image. (A link on why this is necessary)
  3. Identify contours using cv.findContours, and draw tracking lines on each of the contours.
  4. Find the median points of the contours. I have some ideas for the actual linear regression, which could be a simpler model, but I also found a specific technique for image alignment called RANSAC: https://learnopencv.com/tag/ransac/
  5. After finding the median points, use them along with the contours to create a middle intersection point. This will be used to calculate the direction vector from the camera to that point.
  6. This alignment vector will help decide the walking direction for the user.
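Here is the rough OpenCV sketch of these steps. The threshold value, kernel size, and iteration counts are guesses to be tuned on real crosswalk footage, and cv2.fitLine with a robust (Huber) cost is used here as a stand-in for the RANSAC-style fit mentioned in step 4:

```python
import cv2
import numpy as np

def crosswalk_direction(frame_bgr):
    """Sketch of steps 1-6: isolate white stripes, clean them up, find their
    contours, take each contour's center, and fit a line through the centers
    to get a direction vector for the user."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, white = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)      # step 1
    kernel = np.ones((5, 5), np.uint8)
    eroded = cv2.erode(white, kernel, iterations=2)                  # step 2
    contours, _ = cv2.findContours(eroded, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)          # step 3
    centers = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] > 0:                                             # skip degenerate contours
            centers.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))  # step 4
    if len(centers) < 2:
        return None                                                  # not enough stripes found
    pts = np.array(centers, dtype=np.float32)
    vx, vy, x0, y0 = cv2.fitLine(pts, cv2.DIST_HUBER, 0, 0.01, 0.01).ravel()
    return (vx, vy), (x0, y0)                                        # steps 5-6: direction vector + point
```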

Next week?

I am a little behind on project work due to attending career fairs and interviews this past week. I feel good about my algorithm idea and want to start a GitHub repo in the coming week and sketch out some outline code for it. I also want to, if not complete the pseudocode, find images and videos of crosswalks to test my algorithm on, as well as do research on how video tracking would work. I have a direction vector and a camera's center point, but how do these two connect? I also hope to help Zach with collecting pictures of walk signs, and I'm trying to take pictures of as many as I can while walking to and from campus! I also let the professors know I would need AWS, but I've always trained my ML algorithms locally, even for classes, so I would like to play around with how that works!

Team Status Report for 9/24

At the moment, the most significant risk that could jeopardize the success of the project is the accuracy of our object detection algorithms. We do not want to tell a blind person to cross the road when they are not supposed to. We are currently looking into options to mitigate this risk; one option may be to reduce the scope of the project to just crosswalk detection or just walk sign detection, allowing us to focus more time on one of the algorithms and hopefully make it better. We are also targeting a false positive rate of less than 1%; the metric we are considering is around 0.5% for whichever detection task we pick, or for both if we pursue both. The design of the system is unchanged, but we are looking into how to get the false positive rate as close to 0% as possible.

Colin’s Status Report for 9/24

This week I did research about all aspects of the hardware for the project. I wanted to tie all of the components together at a high level and see how they would interact in the project. In particular, I have decided to go with a BMA180 accelerometer to feed data into the Jetson to determine whether the user is stationary or walking. I can use a Python library for this particular accelerometer to get the data, or I can write a small C program to gather the data and run some algorithms to determine the state of the user. I figured it would be nice to be able to easily gather the data using Python, given that we will be using Python for the object detection portion of the project and that the accelerometer data must be communicated to it. I believe that doing both of these in the same Python code would significantly increase both robustness and speed of development.

I have also been looking into cameras that can stream data to the Jetson, and I believe that the SainSmart IMX219 would work well with a Jetson Nano, which is what we plan on using. Currently, I am on track according to the schedule, given that for now all of us are working towards the design proposal, and the work that I have done this week all has to do with the hardware side of the design.

My primary goal next week is to look into the options for communicating to the user via audio whether or not they should cross the street and what direction to go. I would also like to receive a Jetson Nano within the next week and start installing Python/OpenCV on it. When installing Python, I would also like to look into building a multi-threaded Python program that can get the accelerometer data at regular intervals and communicate that data to the thread that decides whether to run walk sign detection or crosswalk detection.
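One simple way the stationary-versus-walking decision could work is to look at the variance of the acceleration magnitude over a short sliding window. The sketch below assumes a read_accel() placeholder for the real BMA180 I2C read, and the window size, variance threshold, and sample period are guesses to be tuned against real data:

```python
import time
from collections import deque
import statistics

def read_accel():
    """Placeholder for the real BMA180 I2C read; returns (ax, ay, az) in g."""
    return (0.0, 0.0, 1.0)

def monitor_motion(window=20, threshold=0.02, period=0.1):
    """Classify the user as walking or stationary from the variance of the
    acceleration magnitude over a sliding window of recent samples."""
    mags = deque(maxlen=window)
    while True:
        ax, ay, az = read_accel()
        mags.append((ax * ax + ay * ay + az * az) ** 0.5)
        if len(mags) == window:
            state = "walking" if statistics.pvariance(mags) > threshold else "stationary"
            print(state)          # placeholder: hand this state to the detection thread
        time.sleep(period)
```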