Charles’s Status Report for 4/12/2025

This week I spent my time working with the Jetson, trying to get our image processing code to run on it. This mostly involved a lot of package building and library downloading, which ended up being much more involved than I initially expected.

We ordered a WiFi adapter for the Jetson so that we could use WiFi instead of Ethernet. This let us use the Jetson at home and start installing all the necessary packages.

Installing the packages ended up being pretty complex. The simple libraries like numpy were relatively painless to install. The ML libraries, however, were much more of a hassle. PyTorch and Torchvision both took an extremely long time to install and build, because the standard wheels hosted on pip are not built for the Jetson's architecture. Using this link, I was able to install and build versions of PyTorch and Torchvision that are compatible with the Jetson. The build took multiple hours to compile, so I had to leave it running overnight.
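
As a quick sanity check once the overnight build finished, a few lines like these (not part of our pipeline, just a check) confirm the versions and that CUDA is visible to PyTorch:

# Quick sanity check that the Jetson builds of PyTorch/Torchvision imported correctly
import torch
import torchvision

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())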

After finally getting all the necessary libraries installed and our image processing code running, we ran into another problem. As we can see here, the board has 2GB of RAM plus a 9GB swap file. The swap file is essentially virtual RAM for the board, because the physical RAM is so small. The problem is that, even when loading smaller models, the system runs out of memory on almost any input.
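
For debugging, a small sketch like this (it just parses /proc/meminfo on Linux, it is not part of our pipeline) can print how much RAM and swap are actually free right before a model loads:

# Sketch: print free RAM and swap from /proc/meminfo (Linux-only)
def meminfo():
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":")
            info[key] = int(value.split()[0])  # values are reported in kB
    return info


m = meminfo()
print(f"MemAvailable: {m['MemAvailable'] / 1024:.0f} MB")
print(f"SwapFree:     {m['SwapFree'] / 1024:.0f} MB")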

Because the process runs out of memory, it ends up being killed by the OS, resulting in this message:
[Sat Apr 12 19:12:14 2025] Killed process 10399 (python3) total-vm:9650160kB, anon-rss:0kB, file-rss:537696kB, shmem-rss:0kB

This essentially means that we have completely run out of memory, and the process is therefore unable to continue running.

For the upcoming week, I am going to figure out whether there is anything we can do to run the image processing on the Jetson, and, if that's not possible, whether there are any viable alternatives.

For verification and validation, I am going to start testing the camera and the image processing in combination. We will do multiple room layouts to test the robustness of the mask generation. We will most likely still do the processing on our laptops because it will be significantly faster, but we will use the camera that we have recently bought.

Charles’s Status Report for 3/29

This week I spent some more time on the obstacle detection pipeline along with Talay. Talay got a proof of concept working where his phone communicates back with our laptops. We then tried adding his phone to the pipeline: we captured a frame from the live feed of his camera and ran our obstacle detection pipeline on the captured image. I didn't capture an image of the process, but a snippet along these lines gives a good idea of what is being done:
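
Roughly, the capture step looks something like this (a sketch; the stream URL is a placeholder and run_obstacle_detection is a stand-in for our actual SAM-based pipeline):

import cv2


def run_obstacle_detection(image):
    # Stand-in for our SAM-based pipeline (see the 3/22 report); here it just
    # returns the image unchanged so the sketch stays self-contained.
    return image


# Placeholder address -- Talay's phone exposes the live feed at some stream URL.
STREAM_URL = "http://PHONE_IP:PORT/video"

cap = cv2.VideoCapture(STREAM_URL)
ok, frame = cap.read()  # grab a single frame from the live feed
cap.release()

if ok:
    cv2.imwrite("captured_frame.jpg", frame)
    result = run_obstacle_detection(frame)
else:
    print("Could not read a frame from the stream")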

Once we got this image to work, we realized that some of the images we were getting from the obstacle detection looked off. This is an example of a problematic result:

The white in this picture denotes floor while the black denotes obstacles. You can see pretty clearly that it is mislabeling the table and the chair as floor. We figured out that this was just a processing mistake, and Talay's post has the corrected processing. However, in order to fix the issue we had to step back to a stage of the pipeline with no labelling of floor and obstacles, to see exactly what the SAM model was doing.

We can see here that the SAM model is segmenting everything in the room correctly; the only issue is with our labelling algorithm. The problem ultimately turned out to be a bug in the processing code.

Next week, I am going to talk with the rest of my group members about a very important point of the project: how the user will select destinations. This can be done with a variety of approaches, most of them requiring at least some sort of UI. We also need to prepare for the demo coming up next week.

Charles’s Status Report for 3/22

This week I spent a lot of time on the CV part of our project. The first challenge was finding a model that could effectively recognize objects. This immediately came with some shortcomings: a lot of object recognition models are not trained on images taken from a top-down point of view. This led to some very questionable predictions, and large portions of the image were often omitted entirely. I'm still not really sure whether those parts of the image were not inferred on at all or whether the model simply had no guess for an object there. The picture below shows one of the inferences:

There wasn't much of a solution to this problem, as other models also gave questionable results. I then moved on to a segmentation-only model, which lowered the complexity since the model is only used to segment the image. This ended up working much better for pure obstacle detection than the YOLO model. I used the Segment Anything Model (SAM) from Facebook, which returns masks of the segmented objects. I then heuristically chose the largest continuous segment as the floor, since this is usually the case. The result of this code is below:

In this image, the green is considered floor and all the red is considered an obstacle. As we can see, the results are actually quite accurate. I also converted this image into an occupancy matrix, preparing the data for pathfinding afterwards.
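
In rough form, the floor heuristic and the occupancy-matrix conversion look something like this (a sketch assuming Facebook's segment-anything package and a downloaded ViT-B checkpoint; the cell size and threshold are illustrative, not our tuned values):

import cv2
import numpy as np
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Assumes a ViT-B SAM checkpoint has been downloaded; the path is a placeholder.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("room.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # list of dicts with "segmentation" and "area"

# Heuristic: assume the largest continuous segment is the floor.
floor = max(masks, key=lambda m: m["area"])["segmentation"]  # boolean HxW array

# Convert to an occupancy matrix: split the mask into cells and mark a cell
# free only if most of its pixels are floor. Cell size is an illustrative choice.
CELL = 20
h, w = floor.shape
grid = np.ones((h // CELL, w // CELL), dtype=np.uint8)  # 1 = obstacle
for r in range(h // CELL):
    for c in range(w // CELL):
        cell = floor[r * CELL:(r + 1) * CELL, c * CELL:(c + 1) * CELL]
        if cell.mean() > 0.9:  # mostly floor -> free
            grid[r, c] = 0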

Assuming that this heuristic approach will work for our future rooms, I have started to move towards path planning/finding. I am working together with Talay to hopefully implement a working A*/D* algorithm. We hope to first simulate the algorithm with a controllable character and a prompt telling us the intended route to the target.
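
As a starting point for that simulation, a minimal A* over the occupancy matrix could look like the sketch below (4-connected grid with a Manhattan heuristic; this is illustrative, not our final implementation):

import heapq


def astar(grid, start, goal):
    """A* on a 2D occupancy grid (0 = free, 1 = obstacle); returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance heuristic
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_set = [(h(start), 0, start)]
    came_from = {}
    g_score = {start: 0}
    while open_set:
        f, g, cur = heapq.heappop(open_set)
        if cur == goal:
            # Walk back through parents to reconstruct the path
            path = [cur]
            while cur in came_from:
                cur = came_from[cur]
                path.append(cur)
            return path[::-1]
        if g > g_score.get(cur, float("inf")):
            continue  # stale queue entry
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_score.get(nxt, float("inf")):
                    g_score[nxt] = ng
                    came_from[nxt] = cur
                    heapq.heappush(open_set, (ng + h(nxt), ng, nxt))
    return None  # no path exists


# Tiny example: route around a wall in a 4x4 grid
grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
print(astar(grid, (0, 0), (3, 3)))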

Charles’s Status Report for 3/15

This week I spent most of my time working on the CV component of the product. I found a very rudimentary way to locate obstacles: Canny edge detection. There are a lot of preprocessing factors applied to the image before it undergoes the algorithm that can skew what is or isn't considered an edge. I followed some online resources that gave a very rudimentary look at getting edge detection to work through OpenCV in Python. This is what some of my preliminary results look like.

Original Image:

Edge Detection:

Here it's pretty clear that the algorithm is effectively doing edge detection, but it becomes kind of hard to determine which edges are relevant and which are not. It does a good job of marking the plant, lamp, sofa, and tables; however, the rug is still considered an obstacle.
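
For reference, the core of the pipeline is only a few lines; the blur kernel and the 50/150 hysteresis thresholds below are common tutorial defaults rather than tuned values:

import cv2

# Load the room image and convert to grayscale, since Canny works on intensity
image = cv2.imread("room.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Light Gaussian blur to suppress fine texture (e.g. the rug) before edge detection
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Hysteresis thresholds control which gradients count as edges
edges = cv2.Canny(blurred, 50, 150)

cv2.imwrite("edges.jpg", edges)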

For the next week, I want to see if there are any more accurate edge detection approaches, as this one has some minor flaws in detection, and to see whether combining the results of edge detection with something like YOLO might make it easier to differentiate between something like a rug and a legitimate obstacle like a sofa.

Charles’s Status Report for 3/8/2025

I spent the beginning of this week looking at some of the OpenCV modules that could be useful for our project. I found that OpenCV has a couple of algorithms that can be used for edge detection, the most popular being the Canny edge detection algorithm. I started writing some testing code to see what inputs and outputs the function takes. I spent the rest of the week preparing for a surgery and recovering.

I expect to spend some time next week continuing to recover from surgery and hopefully continue my testing with OpenCV and our obstacle detection portion of the project.

Team Status Report for 3/8/2025

A was written by Talay, B was written by Kevin, C was written by Charles

A:

BlindAssist addresses the global need for independent indoor navigation for visually impaired individuals by utilizing technology that does not rely on specific geographies or existing infrastructure. Traditional aids like white canes struggle with indoor navigation, as they only focus on obstacle avoidance. By using ultra-wideband (UWB) localisation and stereo vision, we offer a solution that only requires a one-time setup and can be applied to any geographical area. If we used GPS instead, we would be limited to areas where GPS signals could reach. The haptic feedback belt is easy to use for non-tech-savvy users, since the user can simply press a button to request navigation and follow the haptic feedback. This also overcomes any language or literacy barriers.

The setup of BlindAssist is also minimalist, requiring only ceiling-mounted cameras and UWB anchors. This means that the system can be adopted in diverse global settings without modifying the existing infrastructure. This scalability is crucial to our targeted areas, which are hospitals, universities, airports, etc. The discreet belt form also allows the user to walk around without attracting too much attention. The 3-5 hour battery life allows users to use the product for an extended period of time without needing a power source close by. Thus, BlindAssist offers a product that could help address the global need for indoor navigation beyond localized contexts.

B:

One cultural aspect that I think our wearable design addresses is the desire for discreetness. In many cultures, discretion is preferred, especially when it comes to assistive technology. We prioritize discreetness by using tactile feedback over auditory feedback and a belt that can be hidden under clothes. Furthermore, our haptic feedback and button system eliminates the need to accommodate various languages, since it removes vocal input and auditory cues, which makes it adaptable across many cultures.

C:

Our product uses a lot of technology and sensors that inevitably consume natural resources. Although our device is not as resource intensive as something like a smartphone, given that we expect our product to scale to larger areas, it will require a linear growth in cameras, processing systems, and UWB sensors. Although we don't expect these technologies to be as resource demanding as some other technological commodities, it is important to note that increasing the coverage area of our product comes at a cost in resources. Because of this, we have chosen to keep all our sensors and our processing system relatively lightweight. All of the haptic feedback sensors are very small in scale and require the fewest resources compared to the other guidance systems we ideated. The Jetson Nano is also a relatively lightweight computer used for processing. Because we are using a smaller computer, we can limit the amount of energy required for the product to operate. Additionally, the stereo camera that we are using is very bare-bones, essentially as lightweight as a camera can get. By keeping our solution minimal while still maintaining functionality, we think we lessen the impact that our product has on the environment.

Team Status Report for 2/15/2025

The most significant risks involve the fact that a lot of the parts we want to use have limited examples online and are unfamiliar to us. For example, the UWB boards seem tricky, as we will have to figure out the documentation on our own. We tried to mitigate this by selecting parts that seem to have more documentation and similar examples, but some learning and trial-and-error will inevitably be necessary. Furthermore, we selected parts that are more general purpose. For example, in the case that we can't find an effective solution utilizing the Bluetooth feature of the UWB board, we can still wire it directly to the belt's Raspberry Pi, which should have Bluetooth capabilities on its own.
One change we made was to add a magnetometer. This was necessary as we needed the rotational orientation of the user to navigate them properly. The additional cost is both the additional hardware component and needing to learn the interface of this tool, but we plan on keeping this portion of the design simple. Furthermore, we introduced a Raspberry Pi on the wearable belt. This was necessary because we are realizing that a lot of communication/processing stems from the wearable, but we plan on selecting a lightweight and efficient Pi to minimize weight and power consumption.
Otherwise, our schedule remains mostly the same.

Kevin did part A. Charles did part B. Talay did part C.

Part A.
Our project’s initial motivation was to improve accessibility, which I believe strongly addresses safety and welfare.
Providing navigation for the visually impaired can provide an additional level of physical safety by navigating users around obstacles to avoid collisions/hazards. Traditional tools, such as a white cane, have limitations. For example, white canes are not reliable for detecting obstacles that aren’t at ground level and may not provide the granularity needed to navigate around an unfamiliar environment. Using a stereo camera from an overhead view, our device should be able to detect the vast majority of obstacles in the indoor space, and safely navigate the user around such obstacles without having to approach them.
Furthermore, public welfare is addressed, as accessibility enables users to navigate and enjoy public spaces. We anticipate this project being applicable in settings such as office spaces and schools. Take Hamerschlag Hall as an example: a visually impaired person visiting the school, or perhaps frequenting a work space, would struggle to navigate to their destinations. With lab chairs frequently disorganized and hallways splitting from classrooms into potentially hazardous staircases, this building would be difficult to move around without external guidance. This ties hand-in-hand with public health as well; providing independence and the confidence to explore unfamiliar environments would improve the quality of life for our target users.

Part B

For blind individuals, our product will help them gain more freedom. Right now, many public spaces aren't designed with their needs in mind, which can make everyday activities stressful or even isolating. Our project aims to make spaces like airports, malls, and office buildings more accessible and welcoming. It means blind individuals can navigate these places on their own terms, without always needing to rely on others for help. This independence opens up opportunities for them to participate more fully in social events, explore new places, or even just move through their daily routines with less stress. This will have huge social impacts for the visually impaired and will allow them to engage more fully in social spaces.

Part C
BlindAssist will help enhance the independence and mobility of blind people in indoor spaces such as offices, universities, or hospitals. This reduces the need for external assistance in public institutions such as universities and public offices, which could help reduce the cost of hiring caregivers or making expensive adaptations to buildings. BlindAssist offers an adaptable, scalable system that many institutions could rapidly adopt. With a one-time setup cost, the environment becomes "blind-friendly" and can accommodate many blind people at once. With economies of scale, the technology to support this infrastructure becomes cheaper to produce, allowing more places to adopt it. This could reduce accessibility costs in most environments even further. A possible concern is the reduction in jobs for caregivers; however, these caregivers could spend their time caring for other people who currently do not have the technical infrastructure to support them autonomously.

Charles’ Status Report for 2/15/2025

This week I spent time with the team talking more about the details of implementation. I spent some time investigating and researching how we plan to detect obstacles from our image. I figured that edge detection would be the most lightweight and functional tool, so I started looking at frameworks that support edge detection. Some of the libraries I found were OpenCV, a popular computer vision framework, and PyTorch with TorchVision. Both have a lot of existing documentation and examples of how to use them. I can see these being very helpful in creating the 2D occupancy array that we can later run a pathfinding algorithm on, like D*. I also found a somewhat robust library for object recognition called YOLO. Although YOLO doesn't have the greatest accuracy for everyday objects (~57%), the underlying model should be helpful in our use case, as there isn't such a wide variety of objects in indoor shared spaces.

For the next week, we are going to pick up the camera we ordered, and I want to start experimenting with it to see what kind of recognition/detection results we can get. This will probably require some setup time to get the camera working with my laptop, and further work to get it working with the NVIDIA Jetson we plan to use.

Charles’ Status Report for 2/8

I spent this week thinking more about our project idea and the functionalities that we want to change, drop, or add. I met with the team to talk through the ideal use cases for our project and what we are ultimately trying to accomplish. We came to the conclusion that our initial project had too many complexities and that it would be quite difficult to get a working and accurate product. Instead, we talked about different alternatives that could fulfill a similar use case and ideated several new ideas. I did some individual research on the frameworks and pre-existing libraries that could help in making the product we want to build.

Charles’ Status Report for 2/1

This week, I met with the team to discuss our final ideas and direction for the project. We decided on something that slightly differed from our original idea, but we realized that we were a lot more enthusiastic about this idea than our original one. We talked a lot about our vision for the project and the scope of the features we wanted to incorporate. We worked together on a project introduction/summary, and we also began work on our project proposal presentation. For the upcoming week, I want to start thinking more about the exact parts we will need and start fleshing out a timeline for the project.