Nathan’s Status Report for 4/27/24

This week, I personally accomplished two main tasks. First, I edited the depth pipeline so that it finally uses the camera's ColorCamera capability. Specifically, using the same MobileNet framework, I made the two StereoDepth inputs come from the color cameras and used a color camera as the input to the spatial detection network as well as for display. The updated code can be seen here: https://github.com/njzhu/HomeRover/blob/main/depthai-tutorials-practice/1-hello-world/depth_script_color.py

Additionally, a photo demonstrating this color functionality is shown here: we can see a bounding box around Hayden, showing that the pipeline is able to make detections. A closer look at the code can be seen here: https://drive.google.com/file/d/1MTEEsWslS3K_79CAxSCOV2DoH274E5bl/view?usp=sharing

From this photo, we can see how the pipeline is linked together, starting from the color cameras, going through stereo depth, and ending at the spatial detection output.
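As a rough illustration of that linking, below is a minimal sketch of such a pipeline in depthai. It assumes the OAK-D SR's two color sensors sit on CAM_B and CAM_C and that a compiled MobileNet-SSD blob is available locally (the blob path is a placeholder); the actual implementation lives in the depth_script_color.py file linked above.

```python
import depthai as dai

pipeline = dai.Pipeline()

# Two ColorCamera nodes take the place of the usual mono pair; on the OAK-D SR
# the color sensors sit on CAM_B and CAM_C.
left = pipeline.create(dai.node.ColorCamera)
left.setBoardSocket(dai.CameraBoardSocket.CAM_B)
left.setInterleaved(False)
left.setPreviewSize(300, 300)           # MobileNet-SSD input size

right = pipeline.create(dai.node.ColorCamera)
right.setBoardSocket(dai.CameraBoardSocket.CAM_C)

# Stereo depth computed from the two color streams, aligned to the inference camera
stereo = pipeline.create(dai.node.StereoDepth)
stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
stereo.setDepthAlign(dai.CameraBoardSocket.CAM_B)
left.isp.link(stereo.left)
right.isp.link(stereo.right)

# Spatial detection network fed by the color preview and the depth map
nn = pipeline.create(dai.node.MobileNetSpatialDetectionNetwork)
nn.setBlobPath("mobilenet-ssd.blob")    # placeholder path to a compiled MobileNet-SSD blob
nn.setConfidenceThreshold(0.5)
left.preview.link(nn.input)
stereo.depth.link(nn.inputDepth)

# Stream the color preview and the detections back to the host for display
xout_rgb = pipeline.create(dai.node.XLinkOut)
xout_rgb.setStreamName("rgb")
nn.passthrough.link(xout_rgb.input)

xout_det = pipeline.create(dai.node.XLinkOut)
xout_det.setStreamName("detections")
nn.out.link(xout_det.input)
```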

The second main task I accomplished was putting a live camera feed onto a locally hosted website. To achieve this, I followed and adapted a Flask tutorial to set up the website, HTML pages, and routing actions. We plan on having the controller Pi access this website to show the feed on the display. One challenge was similar to a previous issue with multithreading and accessing variables across different files. To overcome this, I combined my depth-pipeline code with the locking and unlocking required to update the frame displayed on the website. A video showing the live camera feed updating on the website can be found here: https://drive.google.com/file/d/1kDW8dthBEhdgHHi2DJDfmURngqmv2ULZ/view?usp=sharing
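For reference, here is a minimal sketch of the Flask streaming pattern I adapted. The route names, template name, and the `latest_frame` variable are illustrative assumptions; in the real code the depth-pipeline thread writes the frame while holding the same lock.

```python
# Minimal sketch: stream the latest frame as MJPEG from a Flask app, assuming a
# background thread (the depth pipeline) updates `latest_frame` under `frame_lock`.
import threading

import cv2
from flask import Flask, Response, render_template

app = Flask(__name__)
frame_lock = threading.Lock()
latest_frame = None  # most recent BGR frame written by the depth-pipeline thread


def generate_mjpeg():
    """Yield the latest frame as an MJPEG stream for the <img> tag on the page."""
    while True:
        with frame_lock:
            frame = None if latest_frame is None else latest_frame.copy()
        if frame is None:
            continue
        ok, jpeg = cv2.imencode(".jpg", frame)
        if not ok:
            continue
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpeg.tobytes() + b"\r\n")


@app.route("/")
def index():
    # hypothetical template containing <img src="/video_feed">
    return render_template("index.html")


@app.route("/video_feed")
def video_feed():
    return Response(generate_mjpeg(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

The controller Pi can then simply open the site in a browser pointed at the host's address to show the stream on the display.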

After this week, I believe my progress is on schedule: a point of worry from last week was the ability to broadcast the live camera feed to the user, and a major component of that functionality was figured out today, which is good.

In this coming week, I hope to fully finalize the rover as a culmination of our combined effort over the semester. Together with the team, I also hope to finalize all the requisite documents.

Nathan’s Status Report for 4/20/24

These past two weeks, I did a lot of collaborative group work integrating all the parts required for a full implementation of the rover. For instance, I helped solder the third pump for our suction arm, as seen here. I also contributed to the group's foray into multiprocessing in Python.

In addition, I helped set up a testing suite for analyzing the coordinates and bounding box dimensions in our object detection pipeline, as well as what these different variables actually refer to. The testing suite can be seen below. I took the various detection.y, ymin, ymax, and other similar coordinates and measured them spatially with a ruler to determine how the camera is calibrated.
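A hedged sketch of the kind of logging this testing suite performs is below. The function name and parameters are my own placeholders, assuming the detections come from the spatial detection network with normalized bounding-box coordinates and spatial coordinates in millimetres.

```python
def log_detection_geometry(detections, frame_width, frame_height):
    """Print pixel bbox corners and spatial coordinates for each detection (sketch).

    Each detection is assumed to carry normalized xmin/ymin/xmax/ymax plus
    spatialCoordinates (x, y, z) in millimetres, as the spatial network returns.
    """
    for det in detections:
        x1, x2 = int(det.xmin * frame_width), int(det.xmax * frame_width)
        y1, y2 = int(det.ymin * frame_height), int(det.ymax * frame_height)
        print(f"bbox px ({x1},{y1})-({x2},{y2})  "
              f"spatial mm x={det.spatialCoordinates.x:.0f} "
              f"y={det.spatialCoordinates.y:.0f} z={det.spatialCoordinates.z:.0f}")
```

Comparing these printed values against ruler measurements is how I checked what each coordinate actually corresponds to physically.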

I also investigated abandoning a neural network altogether and using OpenCV thresholding, contouring, and rectangle-drawing functions on the depth color map to extract an object that is close enough. However, this idea fell short because, as we get closer to an object, the floor also falls within our desired z-distance threshold, meaning the heatmap shows the floor as being just as close as the object. If I were to manually create a bounding box, it would also include the floor; there would be no way to differentiate the two. In addition, as the video frames displayed, I saw small blips on the heatmap indicating closeness that were not part of the object. These blips could cause issues if I thresholded the frame, since they would not be greyed out.
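For illustration, here is a minimal sketch of the thresholding and contouring approach I tried, assuming a raw depth frame in millimetres; the threshold value and minimum contour area are placeholders.

```python
import cv2
import numpy as np


def close_object_boxes(depth_mm: np.ndarray, z_threshold_mm: float = 600.0):
    """Threshold a raw depth frame (mm) and return bounding boxes of nearby regions.

    Sketch of the approach described above. The floor and small depth blips also
    pass the z-distance threshold, which is why this approach was abandoned.
    """
    close_mask = ((depth_mm > 0) & (depth_mm < z_threshold_mm)).astype(np.uint8) * 255
    # Morphological opening suppresses the small blips, but not the floor
    close_mask = cv2.morphologyEx(close_mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(close_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 500]
```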

My progress is flip-flopping between on schedule and a teeny bit behind, mainly because problems arise during integration tests that did not appear during unit tests, such as the lower accuracy of the object detection pipeline. Since we budgeted this time, I would consider progress to be on schedule.

The main deliverable I hope to complete in the next week is tweaking and adjusting the object detection pipeline so that, when we perform validation tests, we meet the accuracy threshold described in our use case requirements. Additionally, I hope to work with Hayden to successfully transmit the camera feed back to the monitor.

Throughout the semester, acquiring new knowledge was a particularly salient undertaking because I had to learn and get used to the depthai module, its features, and its programming paradigm. To accomplish these tasks and use the module successfully, I needed to learn quickly, which I tackled by sifting through the wide variety of examples created by Luxonis, the makers of depthai. I used their depthai-examples repository, which has example programs for every feature of the camera, from spatial detection and object tracking to depth. Seeing example programs allowed me to learn and get acclimated to the module quickly. In addition, when searching the internet to resolve my error messages, I found forum posts and discussions that really helped my debugging process, which was especially valuable because the camera is so new and constantly changing compared to other versions and models.

Nathan’s Status Report for 4/6/24

This week on the project, I helped integrate the Raspberry Pi program that receives button presses serially from the controller and simultaneously transmits the results serially to the Arduino Nano that controls the turning of the motors. In addition, I integrated the camera onto the rover and made minor calibrations to the camera control loop based on realizations after mounting the camera to the rover.

After demo day, I saw that some other groups had computer-vision follower-based ML systems, and one group was using the YOLOv8 architecture. I heavily considered using a YOLO framework over the MobileNet-SSD framework because of the increased variety of objects it can detect (number of labels) as well as potentially higher accuracy from using BGR frames rather than black-and-white frames. This meant I had to figure out the longstanding issue of the ColorCamera never being recognized. I finally found the solution: in various example programs, the default camera configuration uses a socket called "RGB", which is not present on the OAK-D SR. To fix it, I had to manually set the socket to CAM_B or CAM_C, and RGB functionality finally worked. However, I had trouble setting up a YOLO architecture because it requires an RGB socket in use as well as left and right stereo cameras. Originally those were MonoCamera nodes, but this conflicted because we needed three cameras and only have two sockets. I therefore researched how the StereoDepth feature works on the OAK-D SR and found a workaround: use RGB ColorCameras as the left and right cameras needed for depth detection, and "align depth map to the perspective of RGB camera, on which inference is done". The tentative code that I adapted can be found here: HomeRover Github
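A minimal sketch of the socket fix, based on the description above (the full pipeline lives in the linked repo):

```python
import depthai as dai

pipeline = dai.Pipeline()

cam = pipeline.create(dai.node.ColorCamera)
# The OAK-D SR does not expose the default "RGB" socket that the examples assume,
# so the socket is set explicitly to one of the two color sensors.
cam.setBoardSocket(dai.CameraBoardSocket.CAM_B)  # or CAM_C for the other sensor
```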

The GitHub repo also includes the rover-control Arduino/Python code written for intersystem communication.

This photo shows the OpenCV frame capturing an RGB image output, something that was so elusive to me for so long, along with the corresponding depth heatmap.

My progress is on schedule despite the revamping of our object detection system. It is aligned with the developments made to the kinematics scheme in that communication between the camera control loop and the arm control loop still needs to be established, which is what I hope to complete this coming week. Since there is no cross-device communication involved, I hope it will be easier than the serial communication between different computers from a few weeks ago.

The tests I am planning to run are twofold, involving testing my individual subsystem and integration testing. For my individual subsystem, I aim to keep in mind the big concepts that define our use-case and design requirements: versatility and accuracy. I hope to gather a variety of objects and, with the camera mimicking its position on the rover (lifted off the ground with a similar background setting), achieve accuracy up to our requirements for both object detection and transmission of accurate signals for directing the rover. Further accuracy tests will be made in conjunction with incorporating the arm, since passing and translating coordinates constitutes an important part of our system. In addition, I hope to perform latency tests on my control-loop code running on the RPi, using slow-motion cameras to measure the object detection reaction time when an object appears on screen, to make sure it falls within our requirement for appearing instantaneous.

Nathan’s Status Report for 3/30/24

This week saw significant progress towards the main control loop of the pick-up mechanism. After making the plan last week for how we align the arm to pick up items, I implemented it this week. Specifically, our system works like this: when I receive the array of detections from the camera's onboard network, I check whether the z-distance is within our desired pick-up range, i.e., whether we are in range of an item. Then I do a series of x-coordinate checks to ensure the object is within a threshold that is not yet decided. If the object is to the left of the frame, meaning our camera is leaning right, we print "Turn Right", and vice versa for an object to the right. This print statement can later be adapted to send a signal to the arm controller code. That connection hasn't been set up yet, but the underlying infrastructure is there, which will hopefully make it easier. Additionally, the threshold I mentioned will be calibrated once we run rover tests with the camera mounted in its intended location.
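A hedged sketch of that decision logic is below. The threshold values are placeholders to be calibrated on the rover, and the left/right mapping follows the description above.

```python
def steering_hint(det, pickup_z_mm=500.0, x_threshold=0.1):
    """Return a steering hint from one spatial detection (sketch; thresholds are placeholders).

    Mapping as described in the report: an object on the left of the frame yields
    "Turn Right", an object on the right yields "Turn Left".
    """
    if det.spatialCoordinates.z > pickup_z_mm:
        return None  # object not yet within the desired pick-up range
    x_center = (det.xmin + det.xmax) / 2.0  # normalized 0..1 across the frame
    if x_center < 0.5 - x_threshold:
        return "Turn Right"
    if x_center > 0.5 + x_threshold:
        return "Turn Left"
    return "Aligned"
```

The return value is what would eventually be sent to the arm controller code instead of being printed.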

A video showcasing this functionality is linked here: https://drive.google.com/file/d/1XJyA2q35H8Kpg9TzOHVndv2On-Wu5Cji/view?usp=sharing

The photos that show this functionality can be seen here:

Additionally, I ran this code on the Raspberry Pi, and the program did not seem to suffer a performance hit after transitioning to the RPi. Thus, I am increasingly confident that our system will work successfully on the Raspberry Pi.

After this week's implementation work, my progress is back on schedule. To further catch up to the project schedule this coming week, I hope to integrate with the rover to determine the proper threshold for coordinating with the arm. In addition, I hope to write the communication protocol with the arm controller code and potentially receive inputs in an arm-controller program.

Nathan’s Status Report for 3/23/24

This week, Varun and I finalized the control loop and main function for how the rover should detect and position itself so that the kinematics can succeed. To reiterate, it involves a series of four steps:

1) The user drives the rover in front of the object so that the object is in frame

2) The rover detects whether an object is within the pick-up distance threshold

3) The rover tracks the object until its x-offset from the camera matches the offset between the camera and the arm

4) Then the arm can pick it up

For step 2, this would involve either sweeping the frame to find a block that meets our minimum distance and then adjusting the x-value from there, or using an object detection algorithm combined with depth measurements to automatically detect the object on screen. In the latter case, classification is not necessary, since there will only be one object in our immediate radius as defined by our scope. A sketch of the frame-sweep option is shown below.
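This is a minimal sketch of the frame-sweep option, assuming a raw depth frame in millimetres where zero marks invalid pixels; the function name is a placeholder.

```python
import numpy as np


def closest_point(depth_mm: np.ndarray):
    """Sweep a depth frame for the nearest valid pixel (sketch of the non-NN option)."""
    depth = depth_mm.astype(np.float32)
    depth[depth == 0] = np.inf            # zero depth is treated as "no data" and ignored
    idx = np.unravel_index(np.argmin(depth), depth.shape)
    return idx, float(depth[idx])         # (row, col) of nearest pixel and its distance in mm
```

The returned column index would then drive the x-value adjustment described in step 3.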

The progress I have made on the code can be found at github.com/njzhu/HomeRover

My progress is aligned with the completion of the rover fabrication and the development of the kinematics scheme. To keep to the project schedule, I will stay in close coordination with my teammates, developing clear, explicit interfaces for passing information to ensure integration goes smoothly.

In the next week, I hope to get a preliminary version of the control loop working, outputting the necessary information that the other modules require.

Nathan’s Status Report for 3/16/24

My progress this week was threefold. The first aspect involved trying to boot up the Raspberry Pi and SSH into it in a headless fashion, since that is how it will operate on the rover. However, I ran into some trouble early in the week when I broke a microSD card while attempting the first boot, so progress on this front was delayed until the end of the week. Towards the end of the week, I successfully managed to SSH into the Pi and register our device's MAC address with CMU-DEVICE.

The middle of the week saw me spend pretty much all of Thursday doing the ethics assignment, and I found both readings extremely insightful.

The last part of my week involved installing dependencies and running the depthai software on the Raspberry Pi remotely through SSH. After researching the ColorCamera issue mentioned in previous status reports, I may have found the cause: when the device is connected to the Pi and reconnects back to the host after performing inference, it uses a USB2 connection instead of USB3. One potential way to solve this is to reduce the data stream, but as of now I am unsure how to do that.

Currently, I am working on two different pipelines: a MonoCamera -> object detection pipeline, and a depth pipeline that uses a colormap, from which I am trying to extract coordinate data and potentially link it with the object detection pipeline.
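For context, here is a rough sketch of what the MonoCamera-to-detection pipeline could look like, modeled on Luxonis' mono-camera MobileNet examples; the blob path is a placeholder and the exact node configuration on the OAK-D SR may differ.

```python
import depthai as dai

pipeline = dai.Pipeline()

mono = pipeline.create(dai.node.MonoCamera)

# MobileNet-SSD expects a 300x300 BGR frame, so the grayscale stream is resized
# and converted by an ImageManip node before reaching the network.
manip = pipeline.create(dai.node.ImageManip)
manip.initialConfig.setResize(300, 300)
manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)
mono.out.link(manip.inputImage)

nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
nn.setBlobPath("mobilenet-ssd.blob")   # placeholder path to a compiled blob
nn.setConfidenceThreshold(0.5)
manip.out.link(nn.input)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("detections")
nn.out.link(xout.input)
```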

Currently, progress is behind because there are still issues with the camera being laggy on the Pi side, as well as continued development needed on my two pipelines. With no exams this coming week, I hope to catch up to the project schedule by using the week's free time effectively.

In the next week, I hope to develop a pipeline that takes the closest depth value in an image, as well as find a method to reduce the data rate, potentially by editing a bandwidth or FPS setting.

Nathan’s Status Report for 3/9/24

For the past couple of weeks, my primary focus on the project was twofold: the design report and the camera. A significant part of the week leading up to spring break, around 10-12 hours, was spent on the design report. My sections, the introduction, tradeoffs, testing & verification, and project management, required a lot of time researching current statistics about individuals facing mobility challenges as well as existing solutions. Additionally, I researched tradeoffs between Wi-Fi and Bluetooth.

For the camera aspect of the project, I read and investigated the source code of the Luxonis camera repository, specifically the depthai-viewer tool. I wanted to see why the RGB capability works when using their tool but not when I manually write a pipeline that uses it. In addition, I am investigating an example that uses a pointcloud to see what useful data arises there, but the bottleneck still lies with the color camera issue. Because the depthai-viewer registers the cameras as a default stereo pair, it is possible there is a configuration step I am missing.

Now that the design report is complete, which took the majority of my time for the week, I can fully focus my energy on both setting up the Raspberry Pi and getting to the bottom of the camera issue. I started booting the Raspberry Pi and setting up SSH, but I accidentally broke the SD card, so progress was temporarily halted on this front. Thus, my progress is slightly behind, but I hope to catch up with this narrowing of focus on my tasks.

In the next week, I hope to successfully set up the Raspberry Pi in a headless fashion, set up SSH on the Pi, and be able to run my tutorial that uses the ColorCamera feature on the OAK-D SR.

Nathan’s Status Report for 2/24/24

For the first half of this week, from Sunday to Wednesday morning, I practiced my design proposal presentation. I wrote down the meaningful talking points I wanted to hit on each slide and made sure the slide transitions were smooth, without pausing or awkwardness. In addition, I practiced my timing in front of a live audience (my roommates). For the latter half of the week, I dove into the Luxonis DepthAI documentation to figure out which frameworks and functions would be useful for our application. I read numerous examples from Luxonis' depthai-experiments repo to find the relevant depth processing we need. Alongside this experimentation, I worked through the nuances of the dependencies needed to install the required packages. Currently, I'm facing an issue where I am unable to perform RGB ColorCamera capture; it only runs without crashing for MonoCamera applications, which is odd. I've tried troubleshooting package versions and I'm still investigating this issue. The photos below show the depth application examples I got working that do not involve RGB color capture.

In addition, I made a repo to get started on a hello_world.py file, following a walkthrough of how to create a pipeline and then start image capture and image processing. The GitHub link to this repo is https://github.com/njzhu/HomeRover
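For reference, below is a minimal sketch of the kind of hello-world pipeline that tutorial builds: a ColorCamera preview streamed back to the host over XLink and displayed with OpenCV. This is essentially the capture that is currently crashing for me, so the sketch reflects the intended behavior rather than my working code.

```python
import cv2
import depthai as dai

pipeline = dai.Pipeline()

cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(300, 300)
cam.setInterleaved(False)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("preview")
cam.preview.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue("preview", maxSize=4, blocking=False)
    while True:
        frame = q.get().getCvFrame()   # fetch the next preview frame from the device
        cv2.imshow("preview", frame)
        if cv2.waitKey(1) == ord("q"):
            break
```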

My progress is slightly behind because of this issue with the RGB camera, but once I figure it out, I hope to understand the examples and apply depth capture to objects in our surrounding areas and the rover's vicinity. Since the examples are made by Luxonis, they should be extremely helpful and informative.

In the next week, I hope to get my hello_world.py up and running successfully and to do basic depth capture using the stereo depth perception technology on the Luxonis camera.

Nathan’s Status Report for 2/17/24

For the first half of this week, I managed purchase and rental requests for our project's equipment with ECE Receiving. We initially put in a request for the Intel RealSense L515, but in order to let other teams use that equipment in the interest of fairness, we ended up purchasing the OAK-D Short Range camera, which arrived yesterday. Before the new camera came, I spent most of my time doing in-depth research and making an onboarding plan for the camera, which included finding setup instructions and specific tutorials and examples to start depth perception. In addition, I started researching how to translate our camera output (coordinates, depth) into kinematic motion and instructions for the arm, which involved preliminary research on kinematics. The second task I did while my teammates were doing CAD design was to start putting together our design review presentation. I incorporated ideas and inspiration from previous projects in determining what to include.

Once the camera came, I briefly started the onboarding process, installing packages and getting acquainted with the software. The camera has depthai-viewer, a GUI for viewing the camera output with a preinstalled neural network. The output is shown below:

Since the camera arrived recently and I spent the majority of my time on the presentation, I am a little behind on playing around with the camera. Without the task of preparing the presentation, I should be able to dedicate most of my time to catching up to the project schedule.

Next week, I hope to complete the deliverable of working with Hayden to establish the embedded-software side of the kinematics. In addition, I hope to be able to output coordinates and a depth value and finalize the nominal interface between the camera and the robotic arm.

Nathan’s Status Report for 2/10/24

At the beginning of this week, the majority of my work went into preparing the proposal presentation slides and meeting with Hayden to finalize the presentation and offer feedback. Personally, I made the use-case and problem-description slides, created the Gantt chart, and inputted the tasks we need to achieve into Asana, our project-management tool of choice.

After the presentation and after reviewing our TA feedback, I started researching the required technologies for my end: the depth camera or a LiDAR camera. Upon discussing with the team, we decided to acquire one of each, since the ECE inventory already has a LiDAR camera we can access without cutting into our budget. I filled out a request with ECE Receiving, and I hope to receive the camera sometime next week so I can start playing with it. In the meantime, I am researching how to interface with the camera and any technologies required for it to function properly. Since my area of focus is mostly software, I am also starting to research how to translate what the camera sees and the information it gathers into the kinematic motion of the arm. This is still very much a work in progress.

Currently, my progress is on schedule since I am waiting for the technologies to arrive before I can actually start digging into experimentation and research.

In the next week, I hope to receive the materials and play around with them. I hope to write extremely basic code for the camera and RPi and perform basic setup tasks for the two, so we can establish a good basis for the coming weeks as further experimentation occurs.