Team D3: The Self-driving human – Carnegie Mellon ECE Capstone, Spring 2025: Max Tang, William Shaw, Andrew Wang

April 27, 2025

Weekly Status Report for 4/26/25

The main remaining work is the audio component, which involves connecting the earbuds over Bluetooth (for example, by calling a command-line tool through Python’s “subprocess” module) and then using “pygame” to play an audio file. We already have the logic for when each audio file should be played. The compass component also remains, and it should be quick as well.
The final design for the chest mount has been completed. The final revision adds a front plate, neoprene padding, and uses black and smoke grey acrylic.
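
A minimal sketch of the audio step described above, assuming the earbuds are already paired and that the Jetson has the `bluetoothctl` command available. The cue names, file paths, and function names are hypothetical placeholders, not the team's actual code.

```python
import subprocess

# Hypothetical cue names and file paths; the report does not list the real ones.
AUDIO_CUES = {
    "walk_sign": "audio/walk_sign.wav",
    "obstacle_ahead": "audio/obstacle_ahead.wav",
    "veer_left": "audio/veer_left.wav",
    "veer_right": "audio/veer_right.wav",
}


def bluetooth_connect_command(mac_address):
    """Build the bluetoothctl command that connects an already-paired device."""
    return ["bluetoothctl", "connect", mac_address]


def connect_earbuds(mac_address, timeout_s=15):
    """Try to connect the earbuds; return True if bluetoothctl exits cleanly."""
    try:
        result = subprocess.run(
            bluetooth_connect_command(mac_address),
            capture_output=True, text=True, timeout=timeout_s,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def play_cue(name):
    """Play one cue with pygame and block until playback finishes."""
    import pygame  # imported lazily so the rest of the module works without it

    if not pygame.mixer.get_init():
        pygame.mixer.init()
    pygame.mixer.music.load(AUDIO_CUES[name])
    pygame.mixer.music.play()
    while pygame.mixer.music.get_busy():
        pygame.time.wait(50)
```

In the main loop, the existing if/else logic would call something like `play_cue("walk_sign")` once `connect_earbuds(...)` has returned True. Blocking until playback finishes keeps the sketch simple, but a real loop may prefer to keep processing frames while a cue plays.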

Unit tests:
YOLO model (90 ms inference)

The performance was good enough and the speed was fast enough to not have to make any design changes. We also observe that in practice, the model does well enough to identify obstacles in the field of view:

For the initial version of the navigation, we provided feedback to the user based on how close a detected obstacle was to the camera. However, this logic is insufficient in common scenarios where a person is simply overtaking the user, as in the example image below from our testing at a crosswalk on campus:

To address this, we needed to update our navigation submodule to use slightly more nuanced logic for deciding whether or not to alert the user to a specific obstacle. We will also need to include this in our system-wide testing, to make sure that the navigation behaves as anticipated as part of the larger system.
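
One way such a rule could look is sketched below. This is an illustration under our own assumptions, not the submodule's actual logic: it assumes each obstacle is tracked across frames, uses bounding-box height as a rough proxy for distance, and treats a shrinking box as a sign that the obstacle is moving away (as an overtaking pedestrian would). All names and thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One tracked obstacle in one frame (all fields are illustrative)."""
    x_center: float   # horizontal box center, 0.0 (left) to 1.0 (right)
    height: float     # box height as a fraction of the frame height


def should_alert(previous, current, near_height=0.5, path_half_width=0.2,
                 growth_margin=0.02):
    """Alert only for obstacles that are near, in the path, and not receding."""
    is_near = current.height >= near_height
    in_path = abs(current.x_center - 0.5) <= path_half_width
    # A shrinking box suggests the obstacle is pulling away, e.g. someone
    # overtaking the user and walking ahead in the same direction.
    receding = previous is not None and (
        current.height < previous.height - growth_margin
    )
    return is_near and in_path and not receding
```

With these placeholder thresholds, a close person whose box shrinks from one frame to the next is ignored, while a close obstacle that stays the same size or grows still triggers an alert.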

ResNet model (30 ms inference)

Again, the performance was good enough and the speed was fast enough to not have to change the model or size.

For the overall system test, no numbers have been collected as of yet. However, we plan to run it as follows:
First, we wear the device (without being blindfolded). Next, we wait at a crosswalk, ensuring that the “Walk Sign” cue is played only when there is a valid walk signal. Once the light displays “WALK”, we begin crossing the road. We will ensure that the device functions properly in three cases: a clear crossing with no deviation, a crossing with objects in the path, and a crossing where we deviate from the path. At the end of the crosswalk, we also ensure that the system resets to the original Walk Sign Detection model (our idle state).
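
The mode switching that this test exercises can be pictured as a two-state machine. The sketch below is only an assumption about how the transitions might be organized; the state names, the `next_state` function, and the cue string are hypothetical.

```python
from enum import Enum


class State(Enum):
    WALK_SIGN_DETECTION = "walk_sign_detection"  # idle state
    CROSSING = "crossing"


def next_state(state, walk_signal_seen=False, crossing_finished=False):
    """Return (new_state, cue) for one step of the loop; cue may be None."""
    if state is State.WALK_SIGN_DETECTION and walk_signal_seen:
        return State.CROSSING, "walk_sign"
    if state is State.CROSSING and crossing_finished:
        return State.WALK_SIGN_DETECTION, None
    return state, None
```

Each test case above then maps to a check on this function: the cue is produced only on the idle-to-crossing transition, and finishing the crossing returns the system to the idle state.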

April 27, 2025

William Shaw’s Status Report for 04/26/2025

This week, the final design for the actual mount was completed. I did a final revision, adding a front plate, the neoprene padding, and using black and smoke grey acrylic (to make it look less suspicious). I also found a way to mount the Wifi+Bluetooth module, although it is not the most secure way of doing it (though necessary due to space constraints).

The IMU integration needs to be finalized, with resync signals based on drift and finding the new walk sign. This should not be too bad, but it requires us to find accurate angles based on the detected traffic lights from the Object Detection ML model, which involves some math based on the bounding box location and the camera’s field of view (FOV).
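
As a rough sketch of that math, assuming an ideal pinhole camera with no lens distortion and a known horizontal FOV: the focal length in pixels is (W/2) / tan(FOV/2), and the bearing to the box center is atan(offset / focal length). The function name and parameters are hypothetical.

```python
import math


def bearing_from_bbox(x_min, x_max, image_width, horizontal_fov_deg):
    """Angle in degrees from the optical axis to the box center (right is positive)."""
    x_center = (x_min + x_max) / 2.0
    # Focal length in pixels for an ideal pinhole camera with this FOV.
    focal_px = (image_width / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    offset_px = x_center - image_width / 2.0
    return math.degrees(math.atan2(offset_px, focal_px))
```

A box centered in the frame gives 0 degrees, and a box centered on the right edge gives half the FOV. A wide-angle lens with noticeable distortion would need a calibration step before this approximation is trustworthy.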

Regarding the schedule, I am in sync with the Gantt Chart. I need to finalize some things with the IMU, but it’s good for the most part. Another component is finalizing the audio signals. We are currently using a Python text-to-speech library, but it may also be better to have some pre-recorded audio signals to just play (including beeps and whatnot).

April 27, 2025

Max Tang Status Report for 4/26/25

This week I worked on writing the main program that would run the entire system, which involved integrating the various ML submodules. I have tested that both models work when running together. All we have left to do is add the audio component, which simply involves calling a Python library to connect the earbuds and play an audio file in certain if/else blocks in our code. Other than that, the device is pretty much in a finalized state.

April 27, 2025

Andrew Wang’s Status Report 4/26/25

This week, I spent some time drafting and refining my implementation of a text-to-speech feedback loop in the navigation submodule. I had to account for various cases, such as when the people surrounding the user are all going in the same direction, which shouldn’t alert the user to move out of the way. However, I ended up being extremely busy this week and wasn’t able to port this onto the Jetson, so I am slightly behind schedule in this regard. I don’t anticipate this being very time consuming, since we set up Docker containers to run our code successfully last week, so it is mostly a matter of coming to lab and porting the code over for testing. Since this is also the last thing I currently need to do, I believe I’ll be able to get all of it done by the end of the weekend, in time for testing and the presentations next week.

With regards to progress, I believe that I am very slightly behind schedule. Once again, I am not currently too worried about not getting our deliverables completed, and next week, the majority of our time will likely center around some light user testing, and preparing the materials for our final submissions (poster, slides, writeup).

April 20, 2025

Team Status Report for 4/19/2025

Last week, we managed to get both the object detection model and the walk sign model onto the board. To make sure that the object detection model could use the GPU and decrease the inference time, we had to create a Docker container for it to run in. However, due to Python dependency issues between TensorFlow and PyTorch, we are currently trying to change the walk sign image classification model to use only PyTorch. We have tried a variety of methods, such as converting the .h5 TensorFlow model to a .pth PyTorch model and rewriting everything in PyTorch, but both have had issues. We are still exploring solutions to this problem.

Regarding hardware, we have finally mounted all our components to the chest mount. The power bank also arrived, which fits nicely into a fanny pack. We have tested running our system fully attached to a person, and it works as expected. The device is comfortable, and does not impede user motion or weigh too much. We are still improving the mount, but it’s in a very good state right now.

Update on 4/20/2025: Regarding the issue with converting the model from TensorFlow to PyTorch, we have successfully recreated the model in PyTorch. We now have three different models trained on the much larger dataset collected last week, using ResNet-34, ResNet-101, and ResNet-152. The performance and confusion matrix for each are pretty similar. This is the confusion matrix for the ResNet-152 model:

The test accuracy is 90.54%, but it’s possible that the class imbalance is skewing this accuracy. In any case, it’s better for the model to be more cautious when predicting “walk” than “don’t walk”, and we see that the error rate for the “don’t walk” class is very low.
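
To see how class imbalance can affect the headline number, per-class recall and balanced accuracy can be computed from the confusion matrix. The sketch below uses placeholder counts chosen purely for illustration; they are not our measured results, and the function name is hypothetical.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j] = count of samples with true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    recalls = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        recalls[label] = confusion[i][i] / row_total if row_total else 0.0
    return {
        "accuracy": correct / total,
        "recall": recalls,
        "balanced_accuracy": sum(recalls.values()) / len(labels),
    }


# Placeholder counts for illustration only; these are NOT the team's results.
example = [[95, 5],    # true "don't walk"
           [5, 15]]    # true "walk"
metrics = per_class_metrics(example, ["dont_walk", "walk"])
```

When one class dominates the test set, overall accuracy mostly reflects that class, so the balanced accuracy (the mean of the per-class recalls) can come out noticeably lower than the plain accuracy.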

April 20, 2025

William Shaw’s Status Report for 04/19/25

This week, I focused more on fine-tuning the chest mount. We also continued working on integration and testing all our components. Last week’s changes to the chest mount were well received. We tested the chest mount on a few women, all of whom stated that the mount was secure and comfortable. This resolved our worries about non-flat chests being incompatible with our mounting design. The standoff components arrived, and I had to make some tweaks to the mounting dimensions to get them to work properly. After the modifications, however, everything works perfectly. Currently, the Jetson Orin Nano, camera, and IMU are nicely and securely mounted onto the chest piece. One small issue was that I didn’t consider mounting the WiFi module, along with its antennas. These were connected to the original case. I had assumed that we would have our stuff running offline, so I thought I wouldn’t need to mount these. I will look into fully mounting these, but for now, they are safely tucked into the IMU standoffs. Lastly, I have begun testing the addition of neoprene padding for comfort.

The power bank also worked well, providing more than enough power to last the whole day (around 11 hours). With the components all nicely mounted to the chest mount, we were much more easily able to gather data. Previously, data collection needed to be done with two people, as we had to hold all the components together carefully. Now the device is basically hands-free, save for the laptop needed to SSH in and run the scripts. Hopefully, this can be replaced with a startup script once we finalize integrating all our software components together.

Regarding the schedule, I am in sync with the Gantt Chart. I aim to continue working to get all the components mounted to the chest mount. I expect to work through some issues with screw hole dimensions, so I have blocked out a decent chunk of time for iterating through designs.

As I designed, implemented, and debugged my project, I had to use some tools to fix issues with the camera. Specifically, I had to use nvarguscamerasrc, Argus daemon, GStreamer, and v4l2-ctl for debugging. Most of my knowledge for this was gained by using Google and the NVIDIA Developer Forums for people with similar issues. I also had to debug many issues by checking versioning issues with packages. This issue is still in progress, as we have some conflicts with what packages are needed.
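
For reference, a commonly used GStreamer pipeline for reading a CSI camera through nvarguscamerasrc is sketched below. The resolution, frame rate, and function name are assumptions rather than our final settings, and the pipeline assumes the camera is attached over CSI.

```python
def csi_camera_pipeline(width=1280, height=720, fps=30, flip_method=0):
    """Build a GStreamer pipeline string for a CSI camera via nvarguscamerasrc."""
    return (
        "nvarguscamerasrc ! "
        f"video/x-raw(memory:NVMM), width={width}, height={height}, "
        f"framerate={fps}/1 ! "
        f"nvvidconv flip-method={flip_method} ! "
        "video/x-raw, format=BGRx ! "
        "videoconvert ! video/x-raw, format=BGR ! "
        "appsink drop=true"
    )
```

If OpenCV is built with GStreamer support, the string can be passed as `cv2.VideoCapture(csi_camera_pipeline(), cv2.CAP_GSTREAMER)`. Running `v4l2-ctl --list-devices` first is a quick way to confirm that the camera is visible to the system at all.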

April 20, 2025

Andrew Wang’s Status Report 4/19/2025

This week, I worked on integrating the ML object detection with the hardware components that Will has set up. Most of the work that needed to be done was in the context of porting the model to the Jetson, then ensuring that the models themselves ran on the Orin GPU. This was a lot trickier than I anticipated, since Jetson Nano configurations with the underlying CUDA kernels aren’t as straightforward as using GPUs on the computing cluster that I had developed and initially evaluated the models on.

I was eventually able to get the object detection models running in a Docker container, a custom modular build that enables the Jetson-specific GPU functionality the object detection models need. Once this was working, I observed a drop in inference time from 8 seconds/frame to 90 milliseconds/frame, so I am reasonably sure that the models are now running on the GPU.
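
A small timing harness along the following lines is one way to take such a measurement. It is a sketch with hypothetical names: `infer` stands for whatever callable runs the model on a frame.

```python
import time


def mean_latency_ms(infer, frame, warmup=3, runs=20):
    """Average wall-clock time per call of infer(frame), in milliseconds."""
    for _ in range(warmup):      # early calls often include one-time setup cost
        infer(frame)
    start = time.perf_counter()
    for _ in range(runs):
        infer(frame)
    return (time.perf_counter() - start) * 1000.0 / runs


def speedup(before_ms, after_ms):
    """How many times faster the 'after' measurement is."""
    return before_ms / after_ms
```

Going from 8 seconds to 90 milliseconds per frame corresponds to `speedup(8000, 90)`, roughly an 89x improvement. Because GPU work can be asynchronous, `infer` should only return once the result is actually available (in PyTorch, for example, by calling `torch.cuda.synchronize()` before returning), or the timing will look faster than it is.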

I was also able to draft an implementation of a text-to-speech feedback loop in the navigation submodule, and I’ll be able to fully test it in the next few days to make sure the feedback is clear and timely.

With regards to progress, I believe that I am about on schedule. I am not particularly concerned about not being able to complete the necessary deliverables on my end, and in the coming weeks, I anticipate spending time working on the integration, possibly helping troubleshoot the other components when I can, which I started doing this week.

Next week, I also anticipate refining the navigation logic further and getting the text-to-speech feedback loop tested on the Jetson, as mentioned before.

April 20, 2025

Max Tang Status Report 4/19/2025

This week I worked on the final integration of the walk sign image classification model into the board and Docker environment. We first further optimized the walk sign model. I trained it on two different datasets: one that included a lot of non-crosswalk images labeled as “don’t walk”, and one that only included crosswalk images. I also implemented the code logic for transitioning from walk sign image classification to crosswalk object detection. Initially, the model worked fine on the board, but after uploading the object detection model, we realized there were Python dependency conflicts. This is still an issue, as there have been challenges in trying to convert the TensorFlow model to a PyTorch model. One attempt involved first saving the walk sign model in the .tf format, then converting it to .onnx using (python -m tf2onnx.convert --saved-model walksignmodel --output walksignmodel.onnx), and then converting that to .pt. However, this ran into many other Python dependency issues with the ONNX libraries too. My plan for this weekend is to resolve this issue as soon as possible.

Update on 4/20/2025: The PyTorch issue has been resolved; see the team status report for the update.

April 13, 2025

Team Status Report for 4/12/2025

We conducted a live test of the walk sign image classifier using the actual camera and with the model running on the Jetson. The performance did not meet our specifications, so we are now working on improving that model, as detailed in Max’s status report for this week. The good news is that the camera quality is decent, and despite some initial issues with linking the video stream to the model, everything seems to work smoothly.

The object detection models all seem to work reasonably well on the images sampled from the live video feed. Not only do the bounding boxes accurately center on the relevant objects, but the video feed is clearly high-resolution enough for the model to work correctly. As such, it has been straightforward to visually confirm that the object detection component is working as well as it did during our extensive testing phase, and we can focus our efforts on refining the navigation submodule logic.

Concerning hardware, finishing touches for the chest mount are arriving this weekend. This includes standoff components and the padding. The primary material of the chest mount has been changed from 6mm plywood to 3mm acrylic. This decision was made for weight, flexibility, compatibility (with mounting hardware), and aesthetics. We also have obtained our power supply, which reads an estimated 11-hour battery life. This should be more than enough to cover the day-to-day usage of most users.

April 13, 2025

William Shaw’s Status Report for 04/12/2025

This week, I focused on the refinement of the chest mount, as well as some integration and finalization. For the chest mount, I worked on calibrating the designs for comfort. Now, the main part of the chest mount is made of 3mm acrylic (vs the 6mm plywood previously used). This allows for much smoother contact with straps, as the previous design had issues with fraying fabric. It is also much lighter and more flexible, making the design more comfortable to wear. The standoff components are arriving tomorrow, which will let me mount the components to the chest piece.

In other news, the power bank arrived this week. This took a while, as I had to ensure that the battery could consistently output the required wattage over the ports (an issue we ran into with existing power supplies). This allowed us to take the system out for testing, albeit awkwardly, as we held all the components, laptops, and wires by hand. The results from this testing were not great, as the classifiers could not detect the correct signals most of the time. As such, we recorded a bunch of new data, which my teammates are using to retrain the models to make them more accurate.

Another issue was that my default settings for the camera were lackluster. I believe that with some fiddling, I can get a better picture quality for the models. However, this higher resolution may not help or may even cause the models to take longer to run. Ultimately, more testing needs to be done for this. On a more positive note, the red tint issues from last week have been completely resolved using a .isp file I found online.

Regarding the schedule, I am more or less in sync with the Gantt Chart (which was updated during demos). I aim to continue working tomorrow to get all the components mounted to the chest mount. I expect to work through some issues with screw hole dimensions, so I have blocked out a decent chunk of time for iterating through designs.