Josiah’s Status Report for 11/16

Accomplishments

This past week, I became more familiar with grbl and G-codes. Using UGS (Universal G-Code Sender), I progressively tuned the firmware settings so that acceleration and feed rate (essentially the maximum velocity) were safely maximized without stalling the stepper motors; if the feed rate is set too high, the motors can't handle the strain and simply stop rotating. I also created a G-code test file that translates the cup from the origin out 10 cm in the eight cardinal and diagonal directions and back again. This will be used for the demo next week. Further details on validation are in the team report.
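For reference, a sketch of how such a test file could be generated is below. This is a minimal illustration, not the actual demo file: the use of rapid G0 moves, millimetre units (G21), and full 10 cm travel along the diagonals are assumptions.

```python
import math

# Hypothetical generator for an 8-direction demo G-code file.
# Assumes grbl works in millimetres with the cup's rest position at (0, 0).
DIST_MM = 100  # 10 cm excursion from the origin

lines = ["G21", "G90"]  # millimetre units, absolute positioning
for i in range(8):
    angle = math.radians(45 * i)
    x = round(DIST_MM * math.cos(angle), 3)
    y = round(DIST_MM * math.sin(angle), 3)
    lines.append(f"G0 X{x} Y{y}")  # rapid move out to the target
    lines.append("G0 X0 Y0")       # return to the origin

with open("demo_moves.gcode", "w") as f:
    f.write("\n".join(lines) + "\n")
```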

Progress

I hope to purchase a different set of limit switches that are more physically compatible with the XY robot. I had misunderstood how limit switches function: they have to be physically mounted at the ends of the gantry rails so that their levers are depressed when the cup reaches the end of travel, sending a signal to the firmware that says "stop moving, you've hit the end!" I mistakenly purchased a set of limit switches designed for a particular 3D printer model, so I'll get a different set to mount and wire up. After that, homing can be enabled and we'll be able to guarantee the location of the cup at all times, even right after boot.
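Once the new switches are mounted and wired, enabling homing in grbl should come down to flipping a couple of settings and running the homing cycle. A rough sketch of what that might look like over serial is below; the port name, baud rate, and the choice to also enable hard limits are assumptions to double-check against our setup.

```python
import serial  # pyserial
import time

# Hedged sketch of enabling grbl homing once the limit switches are wired up.
ser = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)  # port name is a guess
time.sleep(2)                 # let grbl finish its reset banner
ser.reset_input_buffer()      # discard the banner text

for cmd in ("$21=1",          # hard limits: halt if a switch is triggered mid-move
            "$22=1",          # enable the homing cycle
            "$H"):            # run homing so (0, 0) is known even right after boot
    ser.write((cmd + "\n").encode())
    print(cmd, "->", ser.readline().decode().strip())  # expect "ok" (or an error code)
```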

Team Status Report 11/16

For the camera / estimation and tracking pipeline on Jimmy's end, the decision was made to stop using the OpenCV Kalman filter library in favor of an in-house implementation, which yielded good results. This sets us up well: the 2D visual pipeline is now done and can serve as a contingency plan in case our 3D pipeline does not give good results. The remaining risk for this subsystem lies mainly in the accuracy of the 3D model that is going to be implemented. As for validation, the test for object detection and trajectory estimation has two components: first, making sure that the estimates are accurately predicted and tracked; second, ensuring that the calculation can be done in a timely manner. The overall goal of the camera and detection/prediction pipeline is to generate a coordinate to send to the gantry so that it can catch the ball in time. Making sure these conditions are met in the verification of this subsystem will ensure that our overall project goal can be achieved.
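For context, a minimal sketch of the kind of in-house 2D filter described here is shown below. This is illustrative only, not Jimmy's actual code: the state layout, fixed time step, and noise values are all assumptions.

```python
import numpy as np

# Minimal 2D (x, z) Kalman filter with a projectile motion model.
# State: [x, z, vx, vz]; gravity enters as a known control term on z.
class BallKalman2D:
    def __init__(self, dt, g=9.81):
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)        # constant-velocity transition
        self.B = np.array([0, -0.5 * g * dt**2, 0, -g * dt])   # gravity contribution
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)          # we only measure x and z
        self.Q = np.eye(4) * 1e-3    # process noise (placeholder)
        self.R = np.eye(2) * 5e-3    # measurement noise (placeholder)
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def predict(self):
        self.x = self.F @ self.x + self.B
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x

    def update(self, measurement):                              # measurement = [x, z]
        y = np.asarray(measurement, float) - self.H @ self.x    # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)                # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x
```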

For the Raspberry Pi hardware integration side, a lot of work went into getting past the setup stage and doing functional work on the Pi. Jimmy has been developing the Kalman prediction code against a recorded video that he passes into his pipeline. Once I finalized the setup and installed the last few dependencies, I was able to run the camera on the Pi. The DepthAI package that the camera came with also includes many example files that can be run to showcase some of the camera's capabilities. On Monday I focused on learning how the file system is structured and looked through many of the camera examples to see what different pieces of code could make the camera do. The next step was to adapt Jimmy's detection function to take live camera input instead of a recorded video. That was done successfully, but the resulting video ran at only 15 FPS, and it struggled to detect the ball. We had a few ideas for improving that; the details of how we implemented them are elaborated in Gordon's status report, but essentially we identified that making the camera window smaller greatly increased the FPS, as shown in the table below.

Camera with 2D Ball Detection FPS Experiments

Window Size (all 1080p resolution; can't do worse) | Resulting FPS | Comments
camRgb.setVideoSize(1920, 1080) | 13-15 | Default screen size; FPS is far too slow for reliable ball detection. This is without any code optimizations, which we are looking into.
1200 × 800 | 33-35 | Ball detection is pretty smooth; more integration is needed to see if it's enough.
800 × 800 | 46-50 | This is probably as small as the window should go, as it starts to inhibit the ability to capture the ball's trajectory. Ball detection is very smooth, however, so a balance between window size and FPS should be studied.
700 × 800 | ~55 | Same as 800 × 800.
700 × 700 | 60 | Achieved 60 FPS, but again the window size is likely too small to be of use.

The comments provide a baseline overview of what I concluded from this simple verification test. From what I've gathered, I believe this frame rate and window-size manipulation will be a good tactic for increasing the precision and effectiveness of our tracking and trajectory prediction. There are further tactics we haven't tried yet, and we plan to test them in these last few weeks if any design requirements are still unmet. There is still a good amount of work left to add the third dimension and to convert the output into instructions for the XY robot. We will work on these last bits of integration as a team as much as we can, since we still need to figure out how to structure the whole system to give real-world coordinates. In general, our team has done a lot of verification within our individual subsystems, and it is almost time to start the validation process of integrating all of them together. The RPi and Jimmy's camera trajectory work are already relatively well connected and working, but as mentioned, connecting to the XY robot will require work from everyone.

XY Robot:

As described in the design report, there is only a single, though important, test to be run. The big picture is that the robot is agile in motion and precise in positioning, such that it can support the rest of the system. Thus, we have the following validation test:

  1. Supply a series of valid, arbitrary real-world coordinates to the Arduino running grbl firmware through G-Code serial communications. For each coordinate, validate that the robot:
    1. Moves the cup to within 0.5 cm of the target position
    2. Completes the move within 1.0 s

The following conditions should also be met:

  • We define a valid and arbitrary coordinate to be any position within a 10cm XY radius around the origin of the cup.
    • Thus, the cup will only have to move AT MOST 10cm at a time.
  • After each movement, allow the robot to reset to (0, 0).
  • Ideally, configure the firmware to make the robot as fast as possible.
    • We want to give as much leeway to the other systems (computer vision/trajectory prediction) as possible.

For the demo, I've made fairly good progress towards most of these goals. The firmware is set close to the maximum workable XY acceleration and feed rate, though I think it can be tuned further. The demo entails moving the cup 10 cm in the eight cardinal and diagonal directions and back to (0, 0), which suggests the robot should also handle any instruction calling for less than 10 cm of translation. While unrigorous, I've also timed these translations, and they seem to fall within 0.3-0.4 s, which is fairly decent.

 

Future testing would involve setting up a Python script that sends G-codes directly to the Arduino through the serial port, which I believe should be possible and which could also generate random valid coordinates automatically. I also need to purchase a different set of limit switches. I had a misunderstanding of how they function, so to enable homing I must mount four of them, at the ends of each rail.
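A rough sketch of what that script could look like is below. The serial port name, the number of trials, and the trick of using a "G4 P0" dwell (whose "ok" only comes back once buffered motion has finished) as an end-of-move timestamp are all assumptions that still need to be verified against grbl's actual behaviour.

```python
import math
import random
import time
import serial  # pyserial

PORT, RADIUS_MM = "/dev/ttyUSB0", 100   # 10 cm working radius around the origin (port is a guess)

def random_target():
    r = RADIUS_MM * math.sqrt(random.random())   # uniform over the disc, not just the rim
    theta = random.uniform(0, 2 * math.pi)
    return round(r * math.cos(theta), 2), round(r * math.sin(theta), 2)

def send(ser, cmd):
    ser.write((cmd + "\n").encode())
    return ser.readline().decode().strip()       # grbl replies "ok" or "error:N"

with serial.Serial(PORT, 115200, timeout=2) as ser:
    time.sleep(2)
    ser.reset_input_buffer()                     # drop grbl's reset banner
    send(ser, "G21"); send(ser, "G90")           # millimetres, absolute coordinates
    for _ in range(20):
        x, y = random_target()
        t0 = time.time()
        send(ser, f"G0 X{x} Y{y}")
        send(ser, "G4 P0")                       # acknowledged only after motion completes
        print(f"({x}, {y}) reached in {time.time() - t0:.2f} s")
        send(ser, "G0 X0 Y0"); send(ser, "G4 P0")   # reset to the origin between trials
```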

Gordon’s Status Report for 11/16

A lot of progress was made on getting things working on the Pi + camera setup. A few extra dependencies were set up on the Pi, and I was able to run all the example files that came with the camera's DepthAI library. There was an issue connecting to the camera for certain examples because we ran into overcurrent errors. After some research, I found that the OAK camera actually comes with a Y-adapter, which splits the camera's connection in two so that the camera can receive data from the Pi while being powered from an external supply. This let us bypass the RPi's 1.2 A output limit by powering the camera separately.

I took some time to understand how the code behind the camera works, looking into the cv2 library and the DepthAI library, both areas I had never worked with before. This was done under Jimmy's guidance, as he explained how his code was set up and the pipeline he was following. Currently we have a black-box model in which Jimmy develops the trajectory prediction with the Kalman filter off of a prerecorded video, assuming the camera will be able to connect. I therefore took some time to understand how Jimmy runs his code and adapted it to use camera input instead of the prerecorded video. I also added a feature that lets us see the FPS of any given video.
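Roughly, the camera-input adaptation looks like the sketch below. This is simplified: the stream name, queue sizes, and the 800×800 crop are placeholders rather than the exact values in our code.

```python
import time
import cv2
import depthai as dai

# Build a minimal DepthAI pipeline that streams RGB frames to the host.
pipeline = dai.Pipeline()
camRgb = pipeline.create(dai.node.ColorCamera)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
camRgb.setVideoSize(800, 800)             # smaller window -> higher FPS (see table)
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("video")
camRgb.video.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue(name="video", maxSize=4, blocking=False)
    frames, start = 0, time.time()
    while True:
        frame = q.get().getCvFrame()      # BGR frame, same shape the detector expects
        frames += 1
        fps = frames / (time.time() - start)
        cv2.putText(frame, f"{fps:.1f} FPS", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("oak-d", frame)        # the detection call would go here instead
        if cv2.waitKey(1) == ord("q"):
            break
```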

It was successful in that we were able to run the ball tracking with input from the camera, but the resulting FPS was a measly 15, too low to reliably track the ball. Our initial idea for tackling this was to make the program multi-threaded, with one thread handling the camera sending in frames while another thread processed those frames. We thought this would help because, in the current code, each frame has to be fully processed before the next frame can come in, so all the time spent processing each frame (i.e., doing the ball identification) was dragging the FPS down. We ran that code, but then the Pi soft-locked: it would get stuck on the boot screen, which was a major cause for concern. I ended up having to re-flash a new SD card with a blank OS and restart the setup process, including all the dependency installation. Thankfully, the second time around was much easier since I knew what to do, and our code was backed up, so we could get back to where we were relatively easily.
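For the record, the multi-threaded idea was essentially a producer/consumer split along the lines of the sketch below; get_frame and detect_ball are placeholders for the existing capture and detection code, and this shows the concept rather than the exact script that soft-locked the Pi.

```python
import queue
import threading

frame_q = queue.Queue(maxsize=4)   # small buffer between capture and processing

def capture_loop(get_frame):
    """Producer: only pulls frames, never does heavy work."""
    while True:
        frame = get_frame()              # e.g. videoQueue.get().getCvFrame()
        try:
            frame_q.put_nowait(frame)
        except queue.Full:
            pass                         # drop frames rather than let capture fall behind

def process_loop(detect_ball):
    """Consumer: runs the slow per-frame detection on its own."""
    while True:
        detect_ball(frame_q.get())

def run(get_frame, detect_ball):
    threading.Thread(target=capture_loop, args=(get_frame,), daemon=True).start()
    process_loop(detect_ball)
```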

I also set up remote VNC forwarding, which essentially lets me run the Pi's desktop on my computer, eliminating the need for a monitor; it is slightly laggy, though, so a monitor is still preferred. Once we were back to the point where the program could track the ball at 15 FPS, I dug around the code and the provided example files to see if there were other ways to improve the FPS. One thing I found that made a big impact was the camera's window size. It defaulted to a 1920×1080 window, which is actually bigger than we need. Decreasing the window size significantly improved the FPS, with data and comments shown in the table below.

Camera with 2D Ball Detection FPS Experiments

Window Size (all 1080p resolution; can't do worse) | Resulting FPS | Comments
camRgb.setVideoSize(1920, 1080) | 13-15 | Default screen size; FPS is far too slow for reliable ball detection. This is without any code optimizations, which we are looking into.
1200 × 800 | 33-35 | Ball detection is pretty smooth; more integration work is needed to see if 35 FPS at this window size is enough.
800 × 800 | 46-50 | This is probably as small as the window should go, as it starts to inhibit the ability to capture the ball's trajectory. Ball detection is very smooth, however, so a balance between window size and FPS should be studied.
700 × 800 | ~55 | Same as 800 × 800.
700 × 700 | 60 | Achieved 60 FPS, but again the window size is likely too small to be of use.

 

This gives us a lot of confidence that the tracking will work in a 2D sense. There are still quite a few next steps: I haven't added the Kalman filter portion to the live video yet, and we are still missing the third dimension. Another unknown is the connection between the RPi and the instructions for the XY robot, as we haven't looked into how the whole system will physically connect or how to construct a coordinate system that all subparts will recognize. The next few weeks will be crucial for project integration, but I'm confident that we are on track to successfully integrate everything.

Team Status Report 11/09

For the camera pipeline on Jimmy's end, a lot of progress was made this week on a proof-of-concept implementation of the Kalman filter, which is able to project estimates of where the ball is flying toward (a demo can be seen in the individual report). However, a few issues came up in the process: accuracy, some oscillation in the estimates, and the fact that the filter still needs to be implemented over X, Y, Z coordinates rather than the current X, Z implementation. Overall, though, this is strong physical evidence that a Kalman filter will work and give us a better estimate than the physics model alone. One pivot was scrapping the YOLO model, which was too resource-intensive and would lower the FPS of the camera video output, in favor of a filter / colour-tracking approach (discussed further in the individual report). There are some drawbacks to this approach, as we will need to fine-tune the HSV colour ranges depending on the lighting conditions and the camera's colour range. Additionally, since we have pivoted to the Raspberry Pi, Jimmy and Gordon will spend some time in the following week making sure the bring-up of the Raspberry Pi goes as smoothly as possible and that the camera pipeline, object detection, and Kalman filter all run on the RPi.

After consulting with Tamal and Nathan and receiving their approval for switching from the KRIA to the Raspberry Pi (RPi), we were able to place an order for an RPi 5 and officially move to doing everything on the RPi. We repartitioned our roles, as without the KRIA Gordon has less work to do in terms of hardware integration. Gordon will still handle the connections from the OAK-D to the RPi and from the RPi to the robot, but now has a little more liberty to help Jimmy or Josiah with other tasks.

For Gordon's RPi setup and other hardware integration, we received the RPi and could start the setup immediately. My personal status report goes into much more detail, but essentially we successfully set up the RPi after getting through a few unexpected bumps, and we can now SSH into the Pi directly from anyone's laptop. Here are the steps to SSH:

  1. Connect Pi to CMU-DEVICE
  2. Connect laptop to CMU-SECURE
  3. In a terminal on the laptop, SSH in with the Pi's username and enter its password

With Jimmy's help, we also used Conda to set up the DepthAI library and the other dependencies needed to run the code on the Pi. For next steps, Gordon will research how the code runs and look into how we can transfer data from the camera output into the robot system. Specifically, we want to find a way to remove the Arduino that Josiah is currently using to test the robot; if we can cut out that middleman, we save on latency.

On Josiah's end, construction of the robot is practically complete. The stepper motors can be controlled with GRBL firmware loaded on the Arduino and a Universal G-Code Sender hosted on a laptop. It's promising to see it all put together at last, despite some frustrating circuit-level debugging. It turns out that the motor drivers mounted on the CNC shield are highly sensitive to electrical interference, and just a light touch can grind proper motor rotation to a halt. Additionally, a second cup mount was designed and printed to accommodate a belt bearing that stuck out and made the previous cup mount incompatible with the system. Future work is further familiarization with G-codes and figuring out "inverse kinematics." I place this term in quotes because the task is far simpler than in more sophisticated robotics applications.

Jimmy’s Status Report 11/09

Accomplishments

A lot of progress was made on the Kalman filter, including a prototype of a reasonably accurate model that can track the ball in flight and estimate its landing position from visual input. At first, this was done on the YOLO + background-subtraction (MOG, Gaussian mixture) pipeline; however, upon integrating it with the OAK-D camera, I realized this would cap the camera's video output at only 15 FPS, much lower than what we need to generate an accurate estimate with the Kalman filter. To combat this, I decided to pivot to a new object detection approach: a colour detection model that looks for a specific range of HSV values, combined with contour detection for the curvature of the ball. This yielded good tracking results and should also help the camera achieve a higher frame rate, since it will no longer compete for resources on the camera side. Regarding the Kalman filter, I refined it further to produce the good results shown in the video. Although I still need a 3D approach, I spent the time refining the 2D model before feeling confident enough to extend the Kalman filter to three dimensions. At the end of the week I also helped Gordon with the initial bring-up of the Raspberry Pi and getting the initial Python camera libraries set up on the board.
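Conceptually, the colour-plus-contour detector is along the lines of the sketch below. The HSV bounds are placeholders (roughly an orange ball) and will need retuning to the actual ball colour and lighting, and the minimum-radius cutoff is an assumption.

```python
import cv2
import numpy as np

LOWER = np.array([5, 120, 120])    # placeholder HSV lower bound
UPPER = np.array([20, 255, 255])   # placeholder HSV upper bound
KERNEL = np.ones((5, 5), np.uint8)

def find_ball(frame_bgr):
    """Return (x, y, radius) of the largest in-range blob, or None."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER, UPPER)                  # keep only in-range pixels
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, KERNEL)  # knock out small speckles
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    (x, y), radius = cv2.minEnclosingCircle(largest)       # fit a circle to the blob
    return (int(x), int(y), int(radius)) if radius > 3 else None
```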

Schedule and Deliverables

I have made a lot of progress on getting the Kalman filter working, so I am currently on schedule. Work remains to migrate the ball detection and tracking from a 2D approach to a 3D approach, so this will be a top priority for next week. At the same time, I will need to find techniques to refine the estimation accuracy of the 2D Kalman filter, as it is giving some inaccurate results. I have also observed an oscillation in the filter's predictions, so I will either have to debug this or mitigate it with a smoothing function or by integrating PID control when passing the coordinates to the gantry system. As Gordon is working on the Raspberry Pi bring-up, some work next week will go into integrating the camera pipelines, installing dependencies, and making sure the pipeline can run on the board.
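One candidate mitigation for the oscillation is simple exponential smoothing of the predicted landing coordinate before it is sent onward, sketched below. The smoothing factor is a guess, and whether this is sufficient (versus a fix inside the filter itself or PID on the gantry side) still has to be tested.

```python
# Hypothetical exponential smoothing of the filter's predicted landing coordinate.
class LandingSmoother:
    def __init__(self, alpha=0.3):
        self.alpha = alpha      # higher alpha = trusts new predictions more
        self.value = None

    def update(self, prediction):
        if self.value is None:
            self.value = prediction
        else:
            self.value = self.alpha * prediction + (1 - self.alpha) * self.value
        return self.value
```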

Josiah’s Status Report for 11/9

Accomplishments

The robot moves! With the timing belt and pulleys in place, I mounted the Arduino, CNC shield, and motor drivers to control the steppers. After some difficulties while testing the stepper motors, I came to several conclusions. The first is that the motor drivers (which plug into the CNC shield, which itself plugs into the Arduino) are HIGHLY sensitive to electrical interference: even lightly touching a driver can stall the motors. Online resources also suggested unplugging the 12 V 2 A power supply before changing any connections, rather than working with live wires. I was very confused, and at times exasperated, when the motors would grind to a halt seemingly arbitrarily, and I suspect this was a result of the two factors combined. I got GRBL (a free, open-source firmware that controls the motion of CNC machines) uploaded to the Arduino, and I can now send G-codes through an external Universal G-Code Sender to control the stepper motors in tandem. While I didn't get much time to play with it, it appears to be very powerful.

On another note, I had to make a small modification to the cup mount to accommodate a belt bearing. The bearings stuck out slightly, so I made a small cutout in the existing part. As a bonus, I nailed the design of the cup mount: the cup is held just slightly above the surface of the table!

Progress

I expect to familiarize myself better with the XY robot controls and see if I can send a single command to move the cup to a specific XY location. I saw a "set origin" button in the G-Code Sender, so I believe this will be possible. In the future, I also want to procure a soft material to line the cup mount, since the stepper motors vibrate the entire system quite loudly; a liner will dampen the cup rattling on the mount.

 

Gordon’s Status Report for 11/9

After making the monumental decision last week to shift away from the KRIA to the Raspberry Pi, on Monday we confirmed our ideas and plan with Tamal and Nathan and could proceed with figuring out the details of what to do next. After getting approval, I ordered the Raspberry Pi 5 from the ECE 500 inventory and started doing more research into its capabilities and whether it would suit exactly what we needed. We also found that it connects to the OAK-D camera with ease, as Nathan had used that combination in his own project, and Luxonis publishes a guide on connecting a Raspberry Pi to the OAK-D, which I spent some time reading. The guide links to a few Raspberry Pi OS images with DepthAI and other dependencies pre-installed, so I spent some time trying to flash those onto our spare SD cards. My laptop has a microSD slot, but I had previously been misusing it: I thought I had to hold the card in with my finger for it to be read, when I really just had to press it further in with a nail so it would lock in place. Knowing that earlier would have saved my finger a lot of pain, but at least I figured it out eventually. The Luxonis OS installer is not the standard Raspberry Pi Imager, and there were some issues with the one they provided, so I ended up installing the Raspberry Pi Imager as well and had multiple copies of Raspberry Pi OS on a few SD cards.

On Wednesday, the Pi arrived, and luckily the kit it came in had all the cables I needed to plug into a monitor. I tried connecting it up to see what would happen, but plugging it in and sliding in the microSD card only gave me a black screen. This was concerning, as I wasn't sure whether the OS image from the Luxonis guide was good, and I also didn't know whether the issue was the monitor or the Pi itself. I ended up trying every combination of my installed OS images and a few different monitors in the 1300 wing, and nothing worked. I then realized the Pi was displaying an error code through its LED, and through that found out there was an SPI EEPROM error. This meant a corrupted bootloader image on the Pi, which was something I had no control over during setup. I solved the issue by following a guide on the Raspberry Pi forum, and the Pi was then able to display on a monitor. Here's a link to a picture of it successfully displaying: (Too big for upload on here)

I was now able to test the different OS images, and found that the Luxonis-provided ones were either corrupted during flashing or not compatible with the RPi 5. I therefore stuck with the default Raspberry Pi OS and started looking into how to install the required dependencies and libraries.

 

At the same time, I started to set up the SSH access from our computers that we need in order to run our files on the RPi. This required my computer and the RPi to be on the same network while the RPi was connected to a monitor. I had to request that the Pi be added as a new device on the CMU-DEVICE Wi-Fi; when my computer is on CMU-SECURE and the Pi is on CMU-DEVICE, we are effectively on the same network and SSH is possible.

Once we managed to connect via SSH, we could unplug the RPi from the monitor, connect only power, and SSH into it, which makes operating and working on it much easier. While SSH'ed in, I was able to set up the DepthAI library and the other dependencies our code needs to run. This was done with Jimmy's help; he showed me how to set up the Conda environment.

We made pretty good progress with setting up the RPi, so next steps include figuring out how the current camera code works, because we now need to interface camera data with the robot. If we can get rid of the middleman Arduino that Josiah is currently using to test the robot, we should be able to save on latency.

Jimmy’s Status Report 11/02

Accomplishments

This week, I worked more on the camera integration, working to produce an output suitable for the Kalman filter to run on. I integrated the depth sensor with the object detection model, so it can now generate an X, Y, Z coordinate position of the ball as it is detected and tracked in real time. I also worked on improving the accuracy of the Kalman filter, looking at several different implementations using OpenCV and reading some papers to understand the background in case I need to write my own filter. Finally, to help the camera vision pipeline, I added a background-subtraction filter to make it easier for the YOLO model to track fast-moving objects. This gave good results: running video playback through the detection model, it was able to detect the ball in every single frame of the video.
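The background-subtraction step is conceptually similar to the sketch below; the MOG2 parameters are placeholders rather than the values actually used, and the masked frame is simply whatever gets handed to the downstream detector.

```python
import cv2

# Gaussian-mixture background subtraction: moving pixels (the ball) survive, the static scene drops out.
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25,
                                                detectShadows=False)

def foreground_only(frame_bgr):
    mask = subtractor.apply(frame_bgr)          # 0 = background, 255 = moving pixels
    mask = cv2.medianBlur(mask, 5)              # suppress single-pixel noise
    return cv2.bitwise_and(frame_bgr, frame_bgr, mask=mask)   # keep only moving regions
```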

Schedule and Deliverables

With a lot of work done in the camera pipeline, I feel good about being on schedule overall. However, the Kalman filter remains my biggest concern, as it is difficult to get a working implementation without good documentation or background. My main goal next week is to get the new implementation of the Kalman filter working and to test a proof of concept. It will be my top priority, because the results of this POC will determine whether we need to update our design for the camera location and camera angle.

 

Team Status Report 11/02

Monday's ethics lecture gave us a lot of new perspectives and was interesting to sit in on. After that, we continued work in our own areas. As Gordon continued to set up the KRIA and Jimmy continued to work through the camera, we were thinking through the feedback we received last week about the overlapping functionality between the camera and the KRIA. Jimmy has done great work setting up the CV model with our OAK-D camera, and even got the chance to run a Kalman filter on it. Josiah has been doing great with the robot, getting the chassis built. Gordon was working next to Jimmy during this, which sparked a conversation about what would be left for the KRIA to do if the camera is capable of running the Kalman filter as well.

 

Originally, we had planned for the KRIA to potentially house the CV, run the Kalman filter and trajectory calculation models, and use its on-board FPGA for hardware acceleration, while the camera would send in video frames or ball positions. We hadn't really dived into the exact division of labor between camera and KRIA, because we weren't sure of the exact capabilities of either system. We knew generally what they could do, but not how well each component would work in a technical sense: with what accuracy the camera's model can detect the ball, the latency of sending data between systems, or how complex the Kalman filter code is and the feasibility of writing HLS to use the hardware acceleration. Thus we had originally proceeded knowing there would potentially be questions about the camera/KRIA interface and a possible overlap in functionality.

 

We didn't expect the camera to be so powerful, capable not only of depth sensing but also of running the CV and the Kalman filter. With the camera doing so much, all the KRIA can offer is its on-board FPGA to accelerate calculations before sending the landing coordinates to the robot. Even that would require writing our own Kalman filter in C in order to use Vitis HLS for hardware acceleration, which is a very challenging task (the difficulty was confirmed by Varun, the FPGA TA guru). It would also be nice to use the KRIA's SoC to run the CV models instead of a laptop, but something like a Raspberry Pi could get the same job done with far fewer technical difficulties and less setup overhead. The difficulties of the KRIA, and what exactly is "difficult to set up," are elaborated in Gordon's solo status report.

 

Because of all this, more discussions about the role of the KRIA were held, and Gordon met with Varun the FPGA guru to get his input as well. Overall, it was decided that next steps could follow this plan:

 

  1. Research whether a Raspberry Pi will be able to handle everything necessary, keeping the KRIA as a backup plan. Consult the table below for the pros and cons of the KRIA vs. the Raspberry Pi.
  2. Benchmark how long the CV + trajectory calculation takes running on a Raspberry Pi, hopefully interface with the robot, and determine whether we can get away with just using the Pi.
  3. In parallel with #2, if the Kalman filter is currently written only in Python, try porting it to C.
    1. There are two reasons. First, C programs run faster than Python programs, so with latency being even more crucial now, this could be worthwhile.
    2. Second, in case we need to fall back to the KRIA, HLS needs a C file describing what we want converted to RTL for hardware acceleration.
      1. It is also possible that people have already written similar Kalman filters for HLS, so we should research the availability and feasibility of existing code online.

 

KRIA
  • Pros:
    • Potentially the fastest overall computation, although unsure exactly by how much
    • More powerful SoC, potentially able to run programs faster than a Raspberry Pi
    • Already have the board
  • Cons:
    • Difficult to set up and get running; would require a lot of research into how it interfaces
    • Would take a lot more work to get the Kalman filter working in HLS
    • Potential speedup maybe not worth the massive knowledge and setup overhead

Raspberry Pi
  • Pros:
    • Significantly easier to use and set up
    • Effectively the same function as the KRIA without the FPGA
  • Cons:
    • Potentially more latency
    • No hardware acceleration
 

Admittedly, timeline-wise this is later in the semester than we hoped to be making design changes, but given our workloads, other events over the past month, and the fact that we already have a lot done, we are confident we can pull through with these changes. In fact, changing to the Raspberry Pi eliminates the complicated KRIA work and makes this aspect of the project more feasible to complete. We will look into where to get the Raspberry Pi ASAP and begin work to catch up.

Regarding the robot, assembly is well underway and looks to be completed this coming week. Check out the photos for progress!

For the camera, a lot of work was done integrating the detection and tracking pipelines, and we are now able to fully detect and track the coordinates of a moving ball with good accuracy, thanks to the addition of the background-subtraction filter. A big risk that remains is the Kalman filter, with details described in Jimmy's status report.

Josiah’s Status Report for 11/2

Accomplishments

Robot construction is finally commencing! I made significant headway into assembling the various parts while following the online guide. One speed bump I ran into was the diameter of the holes on some of the printed parts: for several mounts, the 8 mm smooth rods (on which the XY motion relies to slide) were too large to fit, so I had to spend time sanding the insides of the holes to properly fit the rods. The silver lining is that the rods can then be press-fit and don't require extra fasteners. Additionally, due to the nature of 3D printing, the small screw holes are not printed exactly to size, so it took time to drive the screws through each hole to cut grooves and widen the holes.

Progress

At this point, the bulk of the frame is put together, and now comes the task of adding the control mechanisms, such as the belt, pulleys, and stepper motor wiring to the Arduino. Following this, calibration and testing of the system can commence.