Nathan’s Status Report for 4/27/24

This week, I personally accomplished two main tasks. First, I edited the depth pipeline so that it finally utilizes the camera's ColorCamera capability. Specifically, using the same MobileNet framework, I made the two StereoDepth inputs come from the color cameras and used a color camera both as the input to the spatial detection network and for display. The updated code can be seen here: https://github.com/njzhu/HomeRover/blob/main/depthai-tutorials-practice/1-hello-world/depth_script_color.py
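For reference, below is a minimal sketch of how this kind of pipeline can be wired up in depthai. It is an approximation of the linked script rather than a copy of it; the blob path, sensor resolution, and preview size are assumptions.

import depthai as dai

pipeline = dai.Pipeline()

# Left and right color cameras (the Oak D SR has no dedicated mono sensors)
cam_left = pipeline.create(dai.node.ColorCamera)
cam_left.setBoardSocket(dai.CameraBoardSocket.CAM_B)
cam_left.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)
cam_right = pipeline.create(dai.node.ColorCamera)
cam_right.setBoardSocket(dai.CameraBoardSocket.CAM_C)
cam_right.setResolution(dai.ColorCameraProperties.SensorResolution.THE_800_P)

# Stereo depth computed from the two color streams
stereo = pipeline.create(dai.node.StereoDepth)
cam_left.isp.link(stereo.left)
cam_right.isp.link(stereo.right)

# MobileNet-based spatial detection network fed by one color camera's preview
nn = pipeline.create(dai.node.MobileNetSpatialDetectionNetwork)
nn.setBlobPath("mobilenet-ssd.blob")  # placeholder path to the compiled model
nn.setConfidenceThreshold(0.5)
cam_left.setPreviewSize(300, 300)     # MobileNet-SSD input size
cam_left.setInterleaved(False)        # NN input expects planar frames
cam_left.preview.link(nn.input)
stereo.depth.link(nn.inputDepth)

# Outputs: the color frames (for display) and the detections
xout_rgb = pipeline.create(dai.node.XLinkOut)
xout_rgb.setStreamName("rgb")
nn.passthrough.link(xout_rgb.input)
xout_det = pipeline.create(dai.node.XLinkOut)
xout_det.setStreamName("detections")
nn.out.link(xout_det.input)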

Additionally, a photo demonstrating this color functionality is shown here; we can see a bounding box around Hayden, demonstrating that the pipeline is able to make detections. A closer look at the code can be seen here: https://drive.google.com/file/d/1MTEEsWslS3K_79CAxSCOV2DoH274E5bl/view?usp=sharing

From this photo, we can see how the pipeline links together, starting from the color cameras, going through stereo depth, and ending at the spatial detection output.

The second main task I accomplished was putting a live camera feed onto a locally hosted website. To achieve this, I followed and adapted a Flask tutorial to set up the website, HTML pages, and routing actions. We plan on having the controller Pi access this website and show it on the display. One challenge was similar to a previous issue: multithreading and accessing variables across different files. To overcome this, I combined my depth-pipeline code with the locking and unlocking required to update the frame displayed on the website. A video showing the live camera feed updating on the website can be found here: https://drive.google.com/file/d/1kDW8dthBEhdgHHi2DJDfmURngqmv2ULZ/view?usp=sharing
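As a reference for the approach described above, here is a minimal sketch of the standard Flask MJPEG-streaming pattern with a lock around the shared frame. The camera loop is a stand-in (a plain OpenCV capture) rather than the actual depth pipeline, and the route names, port, and index.html template are assumptions.

import threading
import cv2
from flask import Flask, Response, render_template

app = Flask(__name__)
output_frame = None            # most recent frame produced by the camera loop
frame_lock = threading.Lock()  # guards output_frame across threads

def camera_loop():
    """Stand-in for the depth-pipeline loop; it must update output_frame under the lock."""
    global output_frame
    cap = cv2.VideoCapture(0)
    while True:
        ok, frame = cap.read()
        if not ok:
            continue
        with frame_lock:
            output_frame = frame.copy()

def generate():
    """Yield the latest frame as an MJPEG stream."""
    while True:
        with frame_lock:
            if output_frame is None:
                continue
            ok, buf = cv2.imencode(".jpg", output_frame)
        if not ok:
            continue
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + buf.tobytes() + b"\r\n")

@app.route("/")
def index():
    return render_template("index.html")  # assumed template embedding the stream in an <img> tag

@app.route("/video_feed")
def video_feed():
    return Response(generate(), mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    threading.Thread(target=camera_loop, daemon=True).start()
    app.run(host="0.0.0.0", port=8000, threaded=True)

The controller Pi's browser would then simply load the index page, which pulls frames from the /video_feed route.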

After this week, I believe my progress is on schedule. A point of worry from last week was the ability to broadcast the video and live camera feed to the user, but a major component of that functionality was figured out today.

In this coming week, I hope to fully finalize the rover as a culmination of all of our combined effort over the semester. In addition, together with the team, I hope to finalize all the requisite documents as well.

Nathan’s Status Report for 4/20/24

These past two weeks, I did a lot of collaborative group work integrating all the parts required for a full implementation of the rover. For instance, I helped solder the third pump for our suction arm, as seen here. I also contributed to the group's effort in investigating multiprocessing in Python.

In addition, I helped set up a testing suite for analyzing the coordinates and bounding-box dimensions in our object detection pipeline, as well as for working out what these different variables actually refer to. The testing suite can be seen below. I took detection.y, ymin, ymax, and other similar coordinates and measured spatially with a ruler to determine how the camera is calibrated.
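As an illustration of the kind of logging this involves (a hypothetical helper, not the actual testing suite), depthai's spatial detections expose both normalized bounding-box corners and spatial coordinates in millimeters, which can be printed and compared against the ruler measurements:

def log_detections(detections):
    """Print bounding-box corners and spatial coordinates for each detection.
    `detections` is the .detections list from a spatial detection network message."""
    for d in detections:
        print(f"bbox (normalized): xmin={d.xmin:.2f} ymin={d.ymin:.2f} "
              f"xmax={d.xmax:.2f} ymax={d.ymax:.2f}")
        print(f"spatial coords (mm): x={d.spatialCoordinates.x:.0f} "
              f"y={d.spatialCoordinates.y:.0f} z={d.spatialCoordinates.z:.0f}")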

I also investigated potentially abandoning a neural network altogether and instead using OpenCV thresholding, contouring, and rectangle-drawing functions on the depth color map to extract an object that is close enough. However, this idea fell short: as we got closer to an object, the floor would also fall within our desired z-distance threshold, meaning the heatmap showed the floor as just as close as the object. If I were to manually create a bounding box, it would also include the floor; there would be no way to differentiate the two. In addition, as the video frame displayed, I saw small blips on the heatmap indicating closeness that weren't part of the object. These blips could cause issues if I thresholded the frame, since they wouldn't be greyed out.
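For completeness, here is a rough sketch of the OpenCV-only idea that was abandoned, assuming depth_frame_mm is a uint16 depth frame in millimeters; the cutoff value is a placeholder.

import cv2
import numpy as np

def box_closest_region(depth_frame_mm, z_cutoff_mm=400):
    """Return (x, y, w, h) of the largest region closer than z_cutoff_mm, or None."""
    # Pixels closer than the cutoff become white; invalid (zero) depth stays black.
    mask = np.where((depth_frame_mm > 0) & (depth_frame_mm < z_cutoff_mm), 255, 0).astype(np.uint8)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    return cv2.boundingRect(largest)

This is exactly where the approach breaks down: once the floor falls under the cutoff, the largest contour merges the floor and the object into one region.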

My progress is flip-flopping between on schedule and a little behind, mainly because problems arise during integration tests that did not appear during unit tests, such as the lower accuracy of the object detection pipeline. Since we budgeted this time, I would consider progress to be on schedule.

The main deliverable I hope to complete in the next week is to tweak and adjust the object detection pipeline so that when we perform validation tests, we meet the accuracy threshold described in our use case requirements. Additionally, I hope to work with Hayden to successfully transmit the camera feed back to the monitor.

Throughout the semester, acquiring new knowledge was a particularly salient undertaking because I had to learn and get used to the depthai module, its features, and its programming paradigm. To accomplish these tasks and use the module successfully, I needed to learn quickly, which I tackled by sifting through the wide variety of examples created by Luxonis. I used their depthai-examples repository, which has example programs for every feature of the camera, from spatial detection and object tracking to depth. Seeing example programs let me learn and get acclimated to the module quickly. In addition, when searching the internet to resolve my error messages, I found forum posts and discussions that really helped my debugging process, which was especially valuable because the camera is so new and constantly changing compared to other versions and models.

Team Status Report for 4/20/24

For the past two weeks, the main thing we as a team have been working on revolves around integration and connecting all our separate subsystems. When connecting our object detection/depth map code to the code that instructs the motors to turn and move in a specific direction, we ran into difficulty parallelizing the two processes, using either Python's multiprocessing or socket modules. The two control loops were blocking each other and preventing us from progressing in either program, but as stated in Varun's status report, the end program is an overarching one acting as a state machine.

After fully assembling the rover and running integration tests that involve the full pipeline of navigation -> detection -> pickup, the most significant risk lies on the detection side. Earlier in the week, we ran into an issue with the coordinates that the object detection pipeline gave for the bounding box it created: the y-coordinate it outputted wasn't necessarily the top of the object but the center relative to some part of the camera, causing our suction to undershoot because of inaccurate kinematics. After investigating the various dimensions and y-values in the code and comparing them to our hand measurements, we found that the detection.y value did reflect the top of the object, but its magnitude reflected the distance from the bottom of the camera. To mitigate and manage this risk, as well as improve our kinematics in the process, we plan on fine-tuning a hard offset applied to all y-values to ensure that we hit the top of the object, as well as tuning a variable offset based on the bounding-box dimensions. We have started doing this, but plan on performing many more trials next week.

Another risk involves the accuracy of the object detection, which falls below the standards defined in our design and use case requirements. A potential cause is that mobilenetSSD has a small label database, so there's a chance it doesn't encompass all the objects we want to pick up. Since we don't necessarily need identification, just detection, a potential risk mitigation strategy is to lower the confidence threshold of the detection pipeline to hopefully improve detection accuracy.
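To illustrate the y-offset mitigation, here is a hypothetical sketch, not our actual kinematics code; the constants and signs are placeholders to be tuned over next week's trials.

HARD_OFFSET_MM = 20   # fixed correction applied to every detection.y value (placeholder)
BBOX_SCALE = 0.5      # variable correction as a fraction of the bounding-box height (placeholder)

def corrected_top_y(detection_y_mm, bbox_height_mm):
    """Estimate the y-coordinate of the top of the object for the pickup kinematics."""
    return detection_y_mm + HARD_OFFSET_MM + BBOX_SCALE * bbox_height_mm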

One change made to the design of the system was adding a second layer to our rover to house our electronics due to space constraints. The change does not incur major costs since the material and machinery were both provided by Roboclub. This is a quick, lasting fix, with no further costs needing mitigation going forward.

As stated in our individual status reports, we are fairly on track, with no major changes to the schedule needed.

A full run-through of our system can be found here, courtesy of Varun: https://drive.google.com/file/d/1vRQWD-5tSb0Tbd89OrpUtsAEdGx-FeZF/view?usp=sharing

Nathan’s Status Report for 4/6/24

This week on the project, I helped integrate the Raspberry Pi program that successfully receives button presses serially from the controller and simultaneously transmits the results serially to the Arduino Nano that controls the turning of the motors. In addition, I integrated the camera onto the rover and did minor calibrations to the camera control loop based on realizations after mounting the camera to the rover.

After demo day, I saw that some other groups had computer-vision, follower-based ML systems, and one group was using the YOLOv8 architecture. I heavily considered using a YOLO framework over the MobileNetSSD framework due to the increased variety of objects it can detect (number of labels) as well as potentially higher accuracy from using BGR frames rather than grayscale mono frames. This meant I had to figure out the longstanding issue of the ColorCamera never being recognized. I finally found the solution: in various example programs, the default camera configuration uses a socket called "RGB", which is not present on the Oak D SR. To fix it, I had to manually set the socket to CAM_B or CAM_C, and RGB functionality finally worked.

However, I had trouble setting up a YOLO architecture because it required an RGB socket in use as well as left and right stereo cameras. Originally the stereo pair used MonoCamera nodes, but this conflicted because we needed three cameras while only two sockets are available. Thus, I researched how the StereoDepth feature works on the Oak D SR and found a workaround: use RGB ColorCameras for the left and right cameras needed for depth detection, and "align depth map to the perspective of RGB camera, on which inference is done". The tentative code that I adapted can be found here: HomeRover GitHub
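A minimal sketch of the fix, assuming the depthai v2 API: each ColorCamera is pinned explicitly to CAM_B or CAM_C (since the default "RGB" socket does not exist on the Oak D SR), and the depth map is aligned to the camera used for inference.

import depthai as dai

pipeline = dai.Pipeline()

cam_left = pipeline.create(dai.node.ColorCamera)
cam_left.setBoardSocket(dai.CameraBoardSocket.CAM_B)   # instead of the default RGB socket
cam_right = pipeline.create(dai.node.ColorCamera)
cam_right.setBoardSocket(dai.CameraBoardSocket.CAM_C)

stereo = pipeline.create(dai.node.StereoDepth)
cam_left.isp.link(stereo.left)
cam_right.isp.link(stereo.right)
stereo.setDepthAlign(dai.CameraBoardSocket.CAM_B)      # align depth to the inference camera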

The GitHub also includes the rover control Arduino/Python code written for intersystem communication.

This photo shows the OpenCV frame that captures an RGB image output, something that was so elusive to me for so long, along with the corresponding depth heatmap.

My progress is on schedule despite the revamping of our object detection system. It is aligned with the developments made to the kinematics scheme in that communication between the camera control loop and the arm control loop still needs to be established, which is what I hope to complete this coming week. Since there is no cross-device communication involved, I hope it will be easier than the serial communication between different computers from a few weeks ago.

The tests I am planning to run are twofold, involving testing my individual subsystem and integration testing. For testing my individual subsystem, I aim to keep in mind the big concepts that define our use case and design requirements: versatility and accuracy. I hope to gather a variety of objects and, with the camera mimicking its position on the rover (lifted off the ground with a similar background), achieve accuracy up to our requirements both for object detection and for transmitting accurate signals to direct the rover. Further accuracy tests will be made in conjunction with incorporating the arm, since passing and translating coordinates constitutes an important part of our system. In addition, I hope to perform latency tests with my control loop code running on the RPi, using slow-motion cameras to measure the object detection reaction time to an object appearing on screen and to make sure it falls within our requirements for appearing instantaneous.
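In addition to the slow-motion camera test, a small in-code check could report per-frame latency using the device timestamps. This is a hedged sketch that assumes a pipeline with a "detections" output queue, built as in the earlier sketches.

import depthai as dai

# `pipeline` is assumed to be the spatial detection pipeline built elsewhere.
with dai.Device(pipeline) as device:
    det_q = device.getOutputQueue(name="detections", maxSize=4, blocking=False)
    while True:
        msg = det_q.get()
        # Difference between host "now" and the device-side frame timestamp
        latency_ms = (dai.Clock.now() - msg.getTimestamp()).total_seconds() * 1000
        print(f"pipeline latency: {latency_ms:.1f} ms")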

Team Status Report for 3/30/24

When we were all working together on Friday, one issue we noticed after switching to the 5V driver from the Arduino Nano was that the motors were not spinning at the same speed. This is a significant risk: if there is a disparity between the motor speeds, then because of our belt setup it would affect the way our robot turns, making it undependable and unreliable. To mitigate this risk, we have two potential avenues to pursue. The first is tuning the commands given by the microcontroller to make sure the robot can indeed drive straight, allowing us to maintain the same effective motor speeds manually. The second is using rear-wheel drive only and switching to casters on the front wheels. The belt tension is putting undue force on the motor shaft, causing it to spin slower; converting to rear-wheel drive removes the need for a belt in the first place.

A change made to the existing design of the system is the switch from a Raspberry Pi Pico to an Arduino Nano. This is necessary because it allows us to drive 5V logic as opposed to 3.3V logic. The change does not incur any additional cost because the Arduino Nano was provided free of charge.

For an updated schedule, we are targeting this week to be able to drive the rover and control the servos for the arm, even if it's just a basic program to test functionality.

This video link showcases our currently assembled rover (sans camera), with the motors successfully wired up and spinning!

https://drive.google.com/file/d/1zICyOJkQBSxv6ApgS1hE1o7wqdp9SjWX/view?usp=sharing

Nathan’s Status Report for 3/30/24

This week saw significant progress towards the main control loop of the pickup mechanism. After making the plan last week for how we align the arm to pick up items, I implemented it this week. Specifically, our system works like this: when I receive the array of detections from the camera, I check whether the z-distance is within our desired pickup range to see if we are in range of an item. Then, I do a series of x-coordinate checks to ensure that the object we are near is within a threshold that has not yet been decided. If the object is on the left of the frame, meaning our camera is leaning right, we print "Turn Right", and vice versa for an object on the right. This print statement can later be adapted to send a signal to the arm controller code. That connection hasn't been set up yet, but the underlying infrastructure is there, which will hopefully make it easier. Additionally, the threshold I mentioned earlier will be calibrated once we do rover tests with the camera mounted in its intended location.
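A condensed sketch of this logic is below; the threshold values are placeholders, and the sign convention for x (negative to the left of center) is an assumption.

PICKUP_Z_MM = 500      # placeholder: how close an object must be before we act on it
X_THRESHOLD_MM = 30    # placeholder: how centered the object must be for the arm

def alignment_command(detections):
    """Return a turn command string, 'Aligned', or None if nothing is in range."""
    for det in detections:
        coords = det.spatialCoordinates
        if coords.z > PICKUP_Z_MM:
            continue                      # object too far away to consider
        if abs(coords.x) <= X_THRESHOLD_MM:
            return "Aligned"              # within threshold; hand off to the arm
        # Following the convention above: object on the left of the frame -> "Turn Right"
        return "Turn Right" if coords.x < 0 else "Turn Left"
    return None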

A video showcasing this functionality is linked here: https://drive.google.com/file/d/1XJyA2q35H8Kpg9TzOHVndv2On-Wu5Cji/view?usp=sharing

The photos that show this functionality can be seen here:

Additionally, I ran this code off of the Raspberry Pi as well, and the program did not seem to suffer a performance hit after transitioning to the RPi. Thus, I am increasingly confident that our system will work successfully on the Raspberry Pi.

After this week's implementation, my progress is back on schedule. To keep pace with the project schedule in this coming week, I hope to integrate with the rover to determine the proper threshold amount to coordinate with the arm. In addition, I hope to write the communication protocol with the arm controller code and potentially receive inputs in an arm-controller program.

Nathan’s Status Report for 3/23/24

This week, together with Varun, I finalized the control loop and main function for how the rover should detect and position itself so that the kinematics succeed. To reiterate, it involves a series of four steps:

1) The user drives the rover in front of the object so it is in frame

2) The rover detects whether an object is within the pickup distance

3) The rover tracks the object until the x-distance from the object to the camera matches the offset between the camera and the arm

4) Then we can pick it up

For step 2, this would involve either sweeping the frame to find a block that meets our minimum distance and then adjusting the x-value from there, or using an object detection algorithm combined with depth measurements to automatically detect the object on screen. In the latter case, classification is not necessary, since there will only be one object in our immediate radius as defined by our scope.

The progress I have made on the code can be found at github.com/njzhu/HomeRover

My progress is aligned with the completion of the rover fabrication and the development of the kinematics scheme. To keep to the project schedule, I will coordinate closely with my teammates, developing clear, explicit interfaces for passing information to ensure integration goes smoothly.

In the next week, I hope to get a preliminary version of the control loop working, outputting the necessary information that the other modules require.

Team Status Report for 3/16/24

After a post-spring-break evaluation, the most significant risks that could jeopardize the success of the project revolve around the camera, both in terms of its function and the dependencies that require its outputs, such as the kinematics calculation module. Despite having a pipeline that can detect depth and display a live camera feed smoothly on my (Nathan's) laptop, the resulting feed was extremely slow and laggy when using X11 forwarding. Our plan to manage and mitigate this risk is to get our RPi monitor as soon as possible to test actual usage, and to look for any opportunities to lower bandwidth and latency. The Luxonis documentation has benchmarks for these values, so we can analyze whether we have any shortcomings. Another risk that stems from the first is that we are behind on our timeline. However, we placed orders for PCBs this week, so for these fabrication tasks we have controlled what we can in terms of timing. This week saw a lot of fabrication of parts, so our next weeks will see an abundance of integration and in-person meeting time.

No changes were made to the existing design of the system and it is consistent with the change made after spring break.

Although no official changes to the schedule have been made, we are systematically pulling future tasks forward and trying our best to gain days wherever we can. Here are some parts Hayden and Varun made!

Nathan’s Status Report for 3/16/24

My progress this week was threefold. The first part involved trying to boot the Raspberry Pi and SSH into it headlessly, since that is how it will operate on the rover. However, I ran into trouble earlier in the week when I broke a microSD card while attempting the first boot, so progress on this front was delayed. Towards the end of the week, I successfully managed to SSH into the Pi and register our device's MAC address with CMU-DEVICE.

The middle of the week saw me spend pretty much all of Thursday on the ethics assignment, and I found both readings extremely insightful.

The last part of my week involved installing dependencies and running the depthai software on the Raspberry Pi remotely through SSH. After researching the ColorCamera issue described in previous status reports, I might have found my issue: when the device is connected to the Pi and reconnects back to the host after performing inference, it is using a USB2 connection instead of a USB3 connection. One potential way to solve this is to reduce the data stream, but as of now I am unsure how to do this.
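Two small things I can try are sketched below, assuming the depthai v2 API: confirm the negotiated USB speed from the script, and lower the camera FPS to reduce the data stream (the FPS value is a guess).

import depthai as dai

pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setFps(15)   # lower the frame rate to reduce bandwidth if we are stuck on USB2

with dai.Device(pipeline) as device:
    # UsbSpeed.HIGH corresponds to USB2; UsbSpeed.SUPER corresponds to USB3
    print("Negotiated USB speed:", device.getUsbSpeed())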

Currently, I am working on two different pipelines: a MonoCamera -> object detection pipeline, and a depth pipeline that uses a colormap, from which I am trying to extract coordinate data and potentially link it with the object detection pipeline.
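A sketch of the colormap side of that depth pipeline is below, assuming depth_frame is the uint16 depth frame (in millimeters) pulled from a StereoDepth "depth" output queue; it colorizes the frame for display and extracts the closest valid depth value and its pixel location.

import cv2
import numpy as np

def colorize_and_find_closest(depth_frame):
    """Return (heatmap, (x, y), closest_mm) for a uint16 depth frame in millimeters."""
    # Normalize to 8 bits and apply a colormap for the on-screen heatmap.
    norm = cv2.normalize(depth_frame, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    heatmap = cv2.applyColorMap(norm, cv2.COLORMAP_JET)
    # Ignore zero pixels (invalid depth) when searching for the closest point.
    valid = np.where(depth_frame > 0, depth_frame, np.iinfo(depth_frame.dtype).max)
    y, x = np.unravel_index(np.argmin(valid), valid.shape)
    return heatmap, (int(x), int(y)), int(depth_frame[y, x])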

Currently, progress is behind because there are still issues with the camera being laggy on the Pi side, as well as continued development needed on my two pipelines. With no exams this coming week, I hope to catch up to the project schedule by using the week's free time effectively.

In the next week, I hope to develop a pipeline that takes the closest depth value in an image as well as find a method to reduce data rate by potentially editing a bandwidth or FPS setting.

Nathan’s Status Report for 3/9/24

For the past couple of weeks, my primary focus on the project was twofold: the design report and the camera. A significant part of the week leading up to spring break, around 10-12 hours, was spent on the design report. My sections, the introduction, tradeoffs, testing & verification, and project management, required a lot of research on current statistics about individuals facing mobility challenges as well as existing solutions. Additionally, I researched tradeoffs between Wi-Fi and Bluetooth.

For the camera aspect of the project, I read and investigated the source code of the Luxonis camera repository, specifically the depthai-viewer tool. I wanted to see why RGB capability was possible when using their tool but not when I manually wrote a pipeline that utilizes it. In addition, I am investigating an example that utilizes a pointcloud to see what useful data arises there, but the bottleneck still lies with the color camera issue. Because the depthai-viewer registers the cameras as a default stereo pair, it is possible there is a configuration I am missing.

Now that the design report, which took a majority of my time for the week, is complete, I can fully focus my energy on both setting up the Raspberry Pi and getting to the bottom of the camera issue. I started booting the Raspberry Pi and setting up SSH, but I accidentally broke the SD card, so progress was temporarily halted on this front. Thus, my progress is slightly behind, but I hope to catch up with this narrowed focus on my tasks.

In the next week, I hope to successfully set up the Raspberry Pi in a headless fashion, set up SSH on the Pi, and be able to run my tutorial that utilizes the ColorCamera feature on the Oak D SR.