Team Status Report 12/07

This week (and the last week) was all hands on deck with integration. We focused on mapping out the system and setting up the coordinate grid, implemented the translation from camera coordinates to real-world distances, and moved our system into the 1200 wing for testing. Our backdrop arrived, and we set it up in the back of 1207. Since the system needs to capture the full arc of the throw, we worked out a camera configuration that keeps the robot, the backdrop, and the entire arc of the throw in frame.

Once setup was complete, the three of us worked on the complete pipeline for the 2D version. The third dimension was supposed to come from the camera's depth output, which we had to put on hold (more on that in Gordon's individual status report). We wrote a single script in which all the components ran together: camera capture with detection and Kalman filtering, the connection to the Pi, and translation into G-code for the Arduino. Testing showed that the Raspberry Pi was not powerful enough to run all of it at once, so we switched to running the pipeline on Jimmy's laptop, where it performed very well. We were able to get the full pipeline working end to end, and we have multiple recordings of us throwing the ball and the system catching it.

For next steps, we are working to integrate a second camera to provide the missing dimension. We gave up on the depth capability of the depth camera and moved to a second-camera setup only after meticulously attempting to debug it and find workarounds. Even using the laptop instead of the Pi, there simply was not enough processing power to get enough frames to reliably capture the ball's depth coordinate. Specifically, we could track the ball's location, move the Region of Interest (ROI) to those coordinates, and request the depth of the ROI, but in the short time it took for that request to be fulfilled, the ball had already moved out of the region. We tried all sorts of methods to move the ROI around or make it bigger, but everything led to a buggy implementation where the depth coordinate simply could not be generated reliably. We also tried getting rid of the ROI entirely and querying the depth at a specific point, but even that was unsuccessful. We could get the ball's depth coordinate when it was moving at slower speeds, but for it to matter the camera needed to capture depth coordinates of an in-flight ball, which it could not do.
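For context, the ROI-chasing pattern we were attempting follows the DepthAI SpatialLocationCalculator approach, roughly sketched below. This is a simplified illustration based on the Luxonis examples, not our exact pipeline; the placeholder ball position (bx, by) would come from the colour detection in practice.

    import depthai as dai

    pipeline = dai.Pipeline()

    # Stereo depth nodes feeding the spatial location calculator.
    monoLeft = pipeline.create(dai.node.MonoCamera)
    monoRight = pipeline.create(dai.node.MonoCamera)
    stereo = pipeline.create(dai.node.StereoDepth)
    monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
    monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)
    monoLeft.out.link(stereo.left)
    monoRight.out.link(stereo.right)

    slc = pipeline.create(dai.node.SpatialLocationCalculator)
    xoutSpatial = pipeline.create(dai.node.XLinkOut)
    xinConfig = pipeline.create(dai.node.XLinkIn)
    xoutSpatial.setStreamName("spatialData")
    xinConfig.setStreamName("spatialCalcConfig")

    # Initial ROI in normalized (0-1) image coordinates.
    cfgData = dai.SpatialLocationCalculatorConfigData()
    cfgData.roi = dai.Rect(dai.Point2f(0.45, 0.45), dai.Point2f(0.55, 0.55))
    slc.initialConfig.addROI(cfgData)

    stereo.depth.link(slc.inputDepth)
    slc.out.link(xoutSpatial.input)
    xinConfig.out.link(slc.inputConfig)

    with dai.Device(pipeline) as device:
        spatialQ = device.getOutputQueue("spatialData", maxSize=4, blocking=False)
        configQ = device.getInputQueue("spatialCalcConfig")
        while True:
            for loc in spatialQ.get().getSpatialLocations():
                xyz = loc.spatialCoordinates  # x, y, z of the current ROI, in mm
            # Re-center the ROI on the latest detected ball position (placeholder values here).
            # In practice the ball had already left this region by the time the new config
            # took effect, which is the latency problem described above.
            bx, by = 0.5, 0.5
            cfgData.roi = dai.Rect(dai.Point2f(bx - 0.05, by - 0.05), dai.Point2f(bx + 0.05, by + 0.05))
            newCfg = dai.SpatialLocationCalculatorConfig()
            newCfg.addROI(cfgData)
            configQ.send(newCfg)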

We have tested with a second camera positioned facing the throw, and we have good reason to believe we can adapt the code from the first camera to integrate it. The only force acting on the x and z axes of the throw is air resistance, so the detection and Kalman models we already have for the x axis should convert easily to the z axis. Jimmy wrote up the code, and preliminary testing showed it working well (more details in Jimmy's individual report). We are right at the end: once the second camera is integrated, we will have the missing dimension and we will be done.

 

List all unit tests and overall system test carried out for experimentation of the system. List any findings and design changes made from your analysis of test results and other data obtained from the experimentation.

For the camera, we first tested whether detection and the Kalman filter would work on a 2D plane using recorded video, then tested how well those same functions ran on a live camera feed. After getting 2D working, we ran the same tests in 3D to see whether the FPS and functionality were still adequate. In 2D testing the Pi showed promise and detected the ball well, but once we switched to 3D the results on the Pi were not good enough, which led us to abandon the Pi for a laptop, as explained above and in Gordon's individual status report. Further 3D testing showed that the single depth camera also did not meet our standards, so we made the design change to use two cameras, as also explained above.

For the robot, we tested its movement across the full coordinate range and verified that it can cover the whole range. We also timed how quickly it could move from one end to the other, and tested whether it could receive multiple movement commands in rapid succession. The robot passed all of these tests to our standards.

Jimmy’s Status Report 12/07

Accomplishments

Over this week, I fully integrated the camera detection pipeline, the tracking and Kalman filter implementation that makes a series of predictions, and the link to the gantry system. The result is a system that can predict the user's throws along a single axis. Gordon and I then tried to debug the 3D depth detection on the Luxonis camera, where we eventually concluded that depth was a no-go for the reasons outlined in the team report. We switched gears to using a second camera, and I prototyped code that successfully predicts the horizontal position where the ball will land from a front-facing camera placed behind the gantry system, which gets a full view of the user throwing the ball so that this prediction can replace the original depth measurement. It uses the same Kalman prediction pipeline as the side-view camera and an adjusted detection filter (adding MOG background subtraction and using moments-based centroid capture rather than contour extraction). Here are the results:
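As a side note, a rough sketch of the adjusted front-view detection described above could look like the following (assuming OpenCV; the HSV bounds and MOG parameters are placeholders, not the tuned values actually used):

    import cv2
    import numpy as np

    # Background subtractor to isolate the moving ball against the static scene.
    mog = cv2.createBackgroundSubtractorMOG2(history=100, varThreshold=50, detectShadows=False)

    def detect_ball_front(frame_bgr, lower_hsv=(5, 120, 120), upper_hsv=(20, 255, 255)):
        """Return the ball centroid (cx, cy) in pixels, or None if nothing is found."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        colour_mask = cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv))
        motion_mask = mog.apply(frame_bgr)
        mask = cv2.bitwise_and(colour_mask, motion_mask)  # moving AND ball-coloured
        # Moments-based centroid instead of contour extraction.
        m = cv2.moments(mask, binaryImage=True)
        if m["m00"] == 0:
            return None
        return (m["m10"] / m["m00"], m["m01"] / m["m00"])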

Schedule

Since we have pivoted to a much simpler approach using a second camera, we are back on schedule and the implementation details are fully complete; we are waiting for the second tripod to arrive so that we can test the full system. What remains is balancing the poster/video/final report submissions with the final integration sprint that needs to be completed in time for the final demo.

 

Deliverables

Testing of the final integrated product will need to be completed once the second tripod has arrived. While waiting for the tripod, I will finish the code that integrates the side-view and front-view camera detection/prediction. I will also finish the logic that only allows the front-view Kalman pipeline to run once the side-view camera has detected the ball in its frame, so as to prevent the front-view camera from erroneously starting to predict before the user has even thrown the ball. A sketch of this gating idea is below.
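One possible shape for that gating logic, as a minimal sketch; the callables are placeholders standing in for the detection and Kalman code described above:

    import threading

    ball_in_flight = threading.Event()  # set by the side-view thread, read by the front-view thread

    def side_view_loop(get_side_detection):
        # get_side_detection() is a placeholder for the side-view detection call.
        while True:
            if get_side_detection() is not None:
                ball_in_flight.set()  # the ball has entered the side camera's frame

    def front_view_loop(get_front_detection, kalman_update):
        while True:
            detection = get_front_detection()
            # Only feed the front-view Kalman filter once the side view has seen the ball,
            # so the front view cannot start predicting before the throw actually happens.
            if detection is not None and ball_in_flight.is_set():
                kalman_update(detection)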

Jimmy’s Status Report 11/30

Accomplishments

Over the past two weeks, I was able to complete the 3D implementation of the Kalman filter, which is good news because it means all the coding for my ball detection, tracking, and prediction is done. I have also completed the integration of the camera pipeline onto the Raspberry Pi, which means it is able to feed coordinates to the x-y gantry system. One obstacle I ran into when integrating the pipeline was a frame rate problem: when camera frames were taken in the main while loop, the detection and Kalman code took a non-trivial amount of time, which caused the framerate of the overall pipeline to drop to about 15 FPS, not enough for our needs. The solution we discussed was multithreading: one thread takes in camera frames, runs the detection filter, and pushes the detected ball's x-y coordinates into a synchronized queue; a second thread pops coordinates off the queue and runs the Kalman filter to output predicted coordinates, which are fed into the serializer module for the Arduino. This solution proved extremely successful, increasing the framerate to 60 FPS, which exceeded our expectations.
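A minimal sketch of that producer/consumer structure; the callables are placeholders standing in for our detection, Kalman, and serializer code:

    import queue
    import threading

    coord_q = queue.Queue(maxsize=64)  # synchronized queue between the two threads

    def capture_and_detect(get_frame, detect):
        # Thread 1: grab camera frames, run the detection filter, push pixel coordinates.
        while True:
            frame = get_frame()
            if frame is None:
                break
            ball_xy = detect(frame)
            if ball_xy is not None:
                coord_q.put(ball_xy)

    def predict_and_send(kalman_step, send_to_arduino):
        # Thread 2: pop detections, run the Kalman filter, forward the prediction downstream.
        while True:
            ball_xy = coord_q.get()
            predicted_xy = kalman_step(ball_xy)
            send_to_arduino(predicted_xy)

    # Hypothetical wiring (the four callables come from the rest of the pipeline):
    # threading.Thread(target=capture_and_detect, args=(get_frame, detect), daemon=True).start()
    # threading.Thread(target=predict_and_send, args=(kalman_step, send_to_arduino), daemon=True).start()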

Schedule

Currently, we are a bit behind schedule, as we still need to fully test and verify that our system is correct. However, we are using the slack time we allocated ourselves for this task, so we are confident that we will complete all of these tasks by the final demo day.

Deliverables

I need to finish testing the 3D Kalman filter on real-time data once Gordon has fully tuned the depth information from the camera and the white backdrop has arrived (so that we can reliably detect the ball in all lighting conditions). I will also need to retune the Kalman filter's parameters, since the gravity parameter for real-time calculations is different from the one used for the initial testing videos. Finally, I will help Josiah and Gordon make sure that whatever needs to be done is completed before our final demo.

Jimmy’s Status Report 11/16

Accomplishments

After experimenting with the existing Kalman filter, I decided to scrap that implementation and write my own version of the filter using numpy. This yielded really good results, as it let me tune specific parameters that the other model would not expose. By visual inspection it predicts very accurately, although testing is still needed to confirm that the predictions are consistent across different throws and with many variables involved. Also this week, Gordon and I worked together on bringing up the OAK-D camera for use on the RPi. There were some issues with the camera's frame rate, which I aim to resolve with multithreading so that the detection and prediction processing runs concurrently with pulling frames from the camera feed.
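For reference, a minimal numpy sketch of this kind of self-written, constant-gravity filter. The state layout mirrors what is described here, but the noise values and the landing projection are illustrative assumptions, not the tuned parameters actually used:

    import numpy as np

    class BallKalman2D:
        """Constant-gravity Kalman filter over the state [x, y, vx, vy] (y measured upward)."""

        def __init__(self, dt, g=9.81, meas_noise=5e-4, process_noise=1e-3):
            self.g = g
            # State transition for projectile motion; gravity enters as a known control input.
            self.F = np.array([[1, 0, dt, 0],
                               [0, 1, 0, dt],
                               [0, 0, 1, 0],
                               [0, 0, 0, 1]], dtype=float)
            self.u = np.array([0.0, -0.5 * g * dt**2, 0.0, -g * dt])
            self.H = np.array([[1, 0, 0, 0],
                               [0, 1, 0, 0]], dtype=float)  # we only measure position
            self.Q = process_noise * np.eye(4)
            self.R = meas_noise * np.eye(2)
            self.x = np.zeros(4)
            self.P = np.eye(4)

        def step(self, z):
            # Predict.
            self.x = self.F @ self.x + self.u
            self.P = self.F @ self.P @ self.F.T + self.Q
            # Update with the measured (x, y) position z.
            y = np.asarray(z, dtype=float) - self.H @ self.x
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)
            self.x = self.x + K @ y
            self.P = (np.eye(4) - K @ self.H) @ self.P
            return self.x

        def predict_landing_x(self, ground_y=0.0):
            # Project the current state forward under gravity to the plane y = ground_y.
            x, y, vx, vy = self.x
            disc = vy**2 + 2 * self.g * (y - ground_y)
            if disc < 0:
                return None
            t = (vy + np.sqrt(disc)) / self.g  # positive time until the ball reaches ground_y
            return x + vx * t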

Schedule

Although I have finished the 2D Kalman filter to a very satisfactory level, work on the 3D Kalman filter still needs to be done. Things should speed up now that Gordon will be handling bringing the camera up on the RPi, so I can focus all my attention on the algorithmic portion. However, I feel comfortable with where our progress is, because the 2D Kalman filter can always be used as a fallback in case anything goes wrong with the 3D version.

 

Deliverables

Effort will now go toward transitioning to the 3D Kalman filter. I will need to order the white background sheet and stands so that we can replicate the favourable conditions used in the original test video and our camera system can accurately detect and predict the ball. Once the background arrives, I can start recording videos to be used for the verification stages of the project (see below).

 

Testing for detection pipeline

  • Setup: 10 different mp4 videos of throws of the orange ball travelling in a semi-parabolic arc, using the same background and similar lighting conditions (the trajectories of the throws will range from low to high arcs).
  • Analyzing: Making sure that the detection percentage is above 90% (frames in which the ball is detected / total frames), since a detection rate below this threshold will affect the accuracy of the Kalman filter.

 

Testing for the prediction pipeline (kalman filter)

  • Same setup as above, using the detection pipeline output
  • Analyzing: Making sure that the predicted landing position is accurate and no longer fluctuating after 25% of the ball's flight time. This matters because we need to provide accurate measurements quickly enough to give the gantry system time to move into position, and the estimates must not oscillate, to prevent wasted gantry movement. Accuracy of the landing position is obviously required since we want to catch the ball; the leeway allowed for inaccuracy is proportional to the radius of the catching cup. (A sketch of how both of these checks could be computed follows below.)
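A sketch of how both checks could be computed over the recorded test videos (thresholds follow the criteria above; the helper names are placeholders):

    import numpy as np

    def detection_rate(per_frame_detections):
        """Fraction of frames in which the ball was detected (target: > 0.90).
        `per_frame_detections` holds one entry per frame, with None for missed frames."""
        return sum(d is not None for d in per_frame_detections) / len(per_frame_detections)

    def prediction_settled(predicted_landing_xs, total_flight_frames, cup_radius, true_landing_x):
        """After 25% of the flight, the landing prediction should stay within the cup
        radius of the true landing point and should not oscillate beyond that leeway."""
        settled = np.asarray(predicted_landing_xs[total_flight_frames // 4:], dtype=float)
        accurate = np.all(np.abs(settled - true_landing_x) <= cup_radius)
        stable = np.ptp(settled) <= cup_radius  # peak-to-peak spread as a simple oscillation check
        return bool(accurate and stable)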

Team Status Report 11/16

For the camera / estimation and tracking pipeline on Jimmy's end, the decision was made to stop using the OpenCV Kalman filter library in favour of an in-house implementation, which yielded good results. This sets us up well, as the 2D visual pipeline is now done and can serve as a contingency plan in case the 3D pipeline does not give good results. The remaining risks of this subsystem lie mostly in the accuracy of the 3D model that is going to be implemented. As for validation, the test for object detection and trajectory estimation has two components: first, making sure the estimates are accurately predicted and tracked; second, ensuring the calculation is done in a timely manner. The overall goal of the camera and detection/prediction pipeline is to generate a coordinate to send to the gantry so that it can catch the ball in time, and making sure these conditions are met during verification of this subsystem will ensure that the overall project goal can be achieved.

For the Raspberry Pi hardware integration side, a lot of work went into getting past the setup stage and doing functional work on the Pi. Jimmy has been developing Kalman prediction code that runs on a recorded video he can pass into it. Once I finalized the setup and installed the remaining dependencies, I was able to run the camera on the Pi. The DepthAI package that the camera came with also includes many example files that showcase some of the camera's capabilities. On Monday I focused on learning how the file system is structured and looked through many camera examples to see what different pieces of code could make the camera do. The next step was to adapt Jimmy's detection function to take live camera input instead of a recorded video. That was done successfully, but the resulting video was only 15 FPS and it struggled to detect the ball. We had a few ideas for improving that (the details of how we implemented them are in Gordon's status report), but essentially we identified that making the camera window smaller greatly increases the FPS, as shown in the FPS experiments below.

Camera with 2D Ball Detection FPS Experiments

All experiments at 1080p resolution (the camera can't go lower than that); the window size is the video output size.

  • camRgb.setVideoSize(1920, 1080): 13-15 FPS. Default window size; FPS is way too slow for reliable ball detection. This is without any code optimizations, which we are looking into.
  • 1200 x 800: 33-35 FPS. Ball detection is pretty smooth; more integration is needed to see whether it's enough.
  • 800 x 800: 46-50 FPS. This is probably as small as the window should go, since it starts to inhibit the ability to capture the full ball trajectory. Ball detection is super smooth, however, so the balance between window size and FPS should be studied.
  • 700 x 800: ~55 FPS. Same comment as above.
  • 700 x 700: 60 FPS. Achieved 60 FPS, but again the window size is likely too small to be of use.
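For reference, the window size is set on the DepthAI ColorCamera node roughly as follows (simplified from the Luxonis examples; the exact node setup we ran may differ):

    import depthai as dai

    pipeline = dai.Pipeline()
    camRgb = pipeline.create(dai.node.ColorCamera)
    camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
    camRgb.setVideoSize(1200, 800)  # one of the window sizes from the experiments above
    camRgb.setFps(60)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("video")
    camRgb.video.link(xout.input)

    with dai.Device(pipeline) as device:
        q = device.getOutputQueue("video", maxSize=4, blocking=False)
        while True:
            frame = q.get().getCvFrame()  # BGR frame handed to the detection code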

The comments give a baseline overview of what I concluded from this simple verification test. From what I've gathered, I believe this framerate and window-size manipulation will be a good tactic for increasing the precision and effectiveness of our tracking and trajectory prediction. There are more tactics we haven't tried yet, which we plan to test in these last few weeks if any design requirements are still unmet. There is still a good amount of work to do to add the third dimension and to convert the output into instructions for the XY robot. We will work on these last bits of integration as a team as much as we can, since we still need to figure out how to structure the whole system to produce real-world coordinates. In general, I think our team has done a lot of verification within each of our subsystems, and it is almost time to start the validation process of integrating them all together. The RPi and Jimmy's camera trajectory work are already relatively connected and working, but as mentioned above, connecting to the XY robot will require work from everyone.

XY Robot:

As described in the design report, there is only a single, though important, test to be run. The big picture is that the robot must be agile in motion and precise in positioning, such that it can support the rest of the system. Thus, we have the following validation test:

  1. Supply a series of valid, arbitrary real-world coordinates to the Arduino running grbl firmware through G-Code serial communications. For each coordinate, validate that the robot:
    1. Moves the cup within <0.5cm of the position
    2. Moves the cup to the position within <1.0s

The following conditions should also be met:

  • We define a valid and arbitrary coordinate to be any position within a 10cm XY radius around the origin of the cup.
    • Thus, the cup will only have to move AT MOST 10cm at a time.
  • After each movement, allow the robot to reset to (0, 0).
  • Ideally, configure the firmware to make the robot as fast as possible.
    • We want to give as much leeway to the other systems (computer vision/trajectory prediction) as possible.

For the demo, I've made fairly good progress toward most of these goals. The firmware is set up with what may be the maximum XY acceleration/feed rate configuration, though I think it can be tuned further. The demo entails moving the cup 10cm in eight directions (the four cardinal directions plus diagonals) and back to (0, 0), which leads me to believe the robot should handle any instruction calling for less than 10cm of translation as well. While not rigorous, I've also timed these translations, and they fall within roughly 0.3-0.4s, which is fairly decent.

 

Future testing would be to set up a Python script that sends G-code directly to the Arduino through the serial port, which I believe should be possible and which could also automate generating random valid coordinates. I also need to purchase a different set of limit switches; I had a misunderstanding about how they function, and to enable homing I must set up four of them, at the ends of each rail.
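A sketch of what that script could look like with pyserial; the port name, delays, and unit handling are assumptions that would need adjusting for the actual setup:

    import random
    import time

    import serial

    # Assumed port name; 115200 is grbl's default baud rate.
    ser = serial.Serial("/dev/ttyACM0", 115200, timeout=2)
    time.sleep(2)              # give grbl a moment to reset after the port opens
    ser.reset_input_buffer()

    def send_gcode(line):
        ser.write((line + "\n").encode())
        return ser.readline().decode().strip()  # grbl replies "ok" or "error:<n>"

    # Random valid coordinates within the 10cm radius around the cup's origin
    # (units assumed to match how grbl is configured).
    for _ in range(20):
        x = random.uniform(-10.0, 10.0)
        y_max = (100.0 - x**2) ** 0.5
        y = random.uniform(-y_max, y_max)       # stay inside the circle
        print(send_gcode(f"G0 X{x:.2f} Y{y:.2f}"))
        time.sleep(1.5)                         # allow the move to complete
        print(send_gcode("G0 X0 Y0"))           # reset to the origin between moves
        time.sleep(1.5)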

Team Status Report 11/09


For the camera pipeline on Jimmy's end, a lot of progress was made this week on a proof-of-concept implementation of the Kalman filter, which can project out estimates of where the ball is flying (a demo can be seen in the individual report). A few issues did come up in the process: accuracy, some oscillation in the estimates, and the fact that this still needs to be implemented with X, Y, Z coordinates rather than the current X, Z implementation. Overall, though, this is really good evidence that a Kalman filter will work and give us a better estimate than the physics model alone. One pivot made was scrapping the YOLO model, which was too resource intensive and lowered the FPS of the camera video output, in favour of a filter / colour tracking approach (discussed more in the individual report). This approach has drawbacks, as we will need to fine-tune the HSV colour ranges depending on the lighting conditions and the camera's colour range. Additionally, since we have pivoted to the Raspberry Pi, Jimmy and Gordon will spend some time in the following week making sure the Raspberry Pi bring-up goes as smoothly as possible and that the camera pipeline, object detection, and Kalman filter all run on the RPi.

After consulting with Tamal and Nathan and receiving their approval for switching from the KRIA to a Raspberry Pi (RPi), we placed an order for an RPi 5 and officially moved to doing it all on the RPi. We repartitioned our roles, since without the KRIA Gordon has less work to do in terms of hardware integration. Gordon will still handle the connections from the OAK-D to the RPi and from the RPi to the robot, but now has a little more liberty to help Jimmy or Josiah with other tasks.

For Gordon's RPi setup and other hardware integration, we received the RPi and could start the setup immediately. My personal status report goes into a lot more detail, but essentially we successfully set up the RPi after getting through a few unexpected bumps, and we can now SSH into the Pi directly from anyone's laptop. The steps to SSH are:

  1. Connect Pi to CMU-DEVICE
  2. Connect laptop to CMU-SECURE
  3. In a terminal on the laptop, run ssh with the Pi's username and address (for example, ssh <username>@<pi-address>), then enter the password

With Jimmy's help, we also used Conda to set up the DepthAI library and the other dependencies needed to run the code on the Pi. For next steps, Gordon will research how the code runs and look into how we can transfer data from the camera output into the robot system; specifically, whether we can remove the Arduino that Josiah is currently using to test the robot, since cutting out that middleman would save latency.

On Josiah's end, construction of the robot is practically complete. Controlling the stepper motors is possible with GRBL firmware loaded on the Arduino and Universal G-Code Sender hosted on a laptop. It's promising to see it all put together at last, despite some frustrating circuit-level debugging. It turns out that the motor drivers mounted on the CNC shield are highly sensitive to electrical interference, and just a light touch can grind proper motor rotation to a halt. Additionally, a second cup mount was designed and printed to accommodate a belt bearing that stuck out and made the previous cup mount incompatible with the system. Future work is further familiarizing myself with G-code and figuring out the "inverse kinematics." I put that term in quotes because the task is far simpler than in more sophisticated robotics applications.

Jimmy’s Status Report 11/09

Accomplishments

A lot of progress was made on the Kalman filter, including a prototype of an accurate model that can track the ball in flight and estimate its landing position from visual input. At first this was done with the YOLO + MOG (Gaussian) background subtraction pipeline; however, upon integrating it with the OAK-D camera, I realized it would cap the camera's video output to only 15 FPS, much lower than what we need to generate an accurate estimate with the Kalman filter. To combat this, I pivoted to a new object detection approach: colour detection over a specific range of HSV values, combined with contour detection for the curves of the ball. This yielded good tracking results and should also help the camera sustain a higher framerate, since it is no longer competing for resources on the camera side. Regarding the Kalman filter, I refined it further to produce the good results shown in the video. Although I need to move to a 3D approach, I spent the time refining the 2D model before feeling confident enough to work on the Kalman filter in three dimensions. At the end of the week I also helped Gordon with the initial bring-up of the Raspberry Pi and getting the initial Python camera libraries set up on the board.
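A minimal OpenCV sketch of this HSV colour + contour approach; the HSV bounds are placeholders that would need the tuning described above:

    import cv2
    import numpy as np

    # Illustrative HSV range for an orange ball; must be tuned to lighting and camera colour response.
    LOWER_HSV = np.array([5, 120, 120])
    UPPER_HSV = np.array([20, 255, 255])

    def detect_ball_hsv(frame_bgr):
        """Return (x, y, radius) of the largest ball-coloured blob, or None."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        largest = max(contours, key=cv2.contourArea)
        (x, y), radius = cv2.minEnclosingCircle(largest)  # fits the ball's curved outline
        return (x, y, radius)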

Schedule and Deliverables

I have made a lot of advances on getting the Kalman filter working, so I am currently on schedule. Work still needs to be done migrating the ball detection and tracking from a 2D approach to a 3D approach, so that will be a top priority for next week. At the same time, I will need to find techniques to refine the estimation accuracy of the 2D Kalman filter, as it is giving some inaccurate results. I have also observed oscillation in the filter's predictions, so I will either have to debug that issue or find a mitigation through a smoothing function or by integrating PID control when passing the coordinates on to the gantry system. As Gordon is also working on the bring-up of the Raspberry Pi, some work will be done next week integrating the camera pipelines, installing dependencies, and making sure that the pipeline can run on the board.
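For the oscillation issue, one simple mitigation of the kind mentioned above would be exponential smoothing of successive predictions (a sketch; the weighting is illustrative):

    def smooth_prediction(previous, new, alpha=0.3):
        """Exponentially smooth successive landing predictions to damp oscillation.
        alpha closer to 1 tracks changes faster; closer to 0 smooths harder."""
        return new if previous is None else alpha * new + (1 - alpha) * previous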

Jimmy’s Status Report 11/02

Accomplishments

This week I worked more on the camera integration, aiming to produce an output suitable for the Kalman filter to run on. I integrated the depth sensor with the object detection model, so it can now generate an X, Y, Z coordinate position of the ball as it is detected and tracked in real time. I also worked on improving the accuracy of the Kalman filter, looking at multiple different OpenCV implementations, and read some papers to understand the background concepts in case I need to write my own filter. Finally, to help the camera vision pipeline, I added a background subtraction filter to make it easier for the YOLO model to track fast-moving objects. This gave good results: running the detection model on recorded video playback, it detected the ball in every single frame.

Schedule and deliverables

With a lot of work done on the camera pipeline, I feel good about being on schedule overall. However, the Kalman filter remains my biggest concern, as it is difficult to get a working implementation without good documentation or background. My main goal next week is to get the new implementation of the Kalman filter working and test a proof of concept. It will be my top priority, as the results of this POC will determine whether we need to update our design for the camera location and angle.

 

Team Status Report 11/02

Monday's ethics lecture gave us a lot of new perspectives and was interesting to sit in on. After that, we continued work in our own areas. As Gordon continued setting up the KRIA and Jimmy continued working on the camera, we were thinking through the feedback we received last week about the overlapping functionality between the camera and the KRIA. Jimmy has done great work setting up the CV model with our OAK-D camera, and even got the chance to run a Kalman filter on it. Josiah has been doing great with the robot, getting the chassis built. Gordon was working next to Jimmy during this, which sparked a conversation about what would be left for the KRIA to do if the camera is capable of running the Kalman filter as well.

 

Originally, we had planned for the KRIA to potentially house the CV, run the Kalman filter and trajectory calculation models, and utilize the on-board FPGA for hardware acceleration, while the camera would send in video frames or the ball position. We hadn't really dug into the exact division of labor between the camera and the KRIA, because we weren't sure of the exact capabilities of either system. We knew generally what they could do, but not how well each component would work in a technical sense: with what accuracy the camera's model can detect the ball, the latency of sending data between systems, or how complex the Kalman filter code is and the feasibility of writing HLS to use the hardware acceleration. Thus we had proceeded knowing there would potentially be questions about the camera/KRIA interface and a possible overlap in functionality.

 

We didn't expect the camera to be so powerful, capable of not only depth sensing but also running the CV and the Kalman filters. With the camera doing so much, all the KRIA can offer at this point is the on-board FPGA to accelerate calculations before sending the landing coordinates to the robot. Even that would require writing our own Kalman filter in C in order to use Vitis HLS for hardware acceleration, which is a very challenging task (the difficulty of doing this was confirmed by Varun, the FPGA TA guru). It could also be nice to use the KRIA's SoC to run the CV models instead of a laptop, but something like a Raspberry Pi could do the same job with far fewer technical difficulties and much less setup overhead. The difficulties of the KRIA and what exactly is "difficult to set up" are elaborated on more in Gordon's solo status report.

 

Because of all this, more discussions about the role of the KRIA were held, and Gordon met with Varun the FPGA guru to get his input as well. Overall, it was decided that next steps could follow this plan:

 

  1. Research whether a Raspberry Pi will be able to handle everything necessary, keeping the KRIA as a backup plan. See the pros and cons of the KRIA vs. the Raspberry Pi below.
  2. Benchmark how long the CV + trajectory calculation takes running on a Raspberry Pi, ideally interface with the robot, and determine whether we can get away with just using the Pi.
  3. In parallel with #2, if the Kalman filter is currently written only in Python, try porting it to C.
    1. Two reasons: first, C programs run faster than Python programs, so with latency even more crucial now, this could be worthwhile.
    2. Second, in case we need to fall back to the KRIA, HLS requires a C source for whatever we want converted to RTL for hardware acceleration.
      1. Although it's possible that people have written similar Kalman filters for HLS already, research the availability and feasibility of other people's code online.

 

KRIA vs. Raspberry Pi: Pros and Cons

KRIA
  Pros:
    • Potentially the fastest overall computation, although unsure exactly by how much
    • More powerful SoC, potentially able to run programs faster than the Raspberry Pi
    • Already have the board
  Cons:
    • Difficult to set up and get running; would require a lot of research into how it interfaces
    • Would take a lot more work to get the Kalman filter working with HLS
    • The potential speedup is maybe not worth the massive knowledge and setup overhead

Raspberry Pi
  Pros:
    • Significantly easier to use and set up
    • Effectively the same function as the KRIA without the FPGA
  Cons:
    • Potentially more latency
    • No hardware acceleration
 

Admittedly, timeline-wise this is later in the semester than we hoped to be making design changes, but given our workloads and other events over the past month, and the fact that we already have a lot done, we are confident we can pull through with these changes. In fact, switching to the Raspberry Pi eliminates the complicated KRIA work and makes it more feasible to complete this aspect of the project. We will look into where to get a Raspberry Pi ASAP and begin work to catch up.

Regarding the robot, assembly is well underway, and is looking to be completed this next week. Check out the photos for progress!

For the camera, a lot of work went into integrating the detection and tracking pipelines, and we can now detect and track the coordinates of a moving ball with good accuracy thanks to the addition of the background subtraction filter. A big risk that remains is the Kalman filter, with details described in Jimmy's status report.

Jimmy’s Status Report 10/26

Accomplishments

Following this week's meeting, a big issue was raised about which camera angle would work best to get accurate data for making predictions. As such, this week was mainly spent getting the custom YOLO model to run inference on the camera, since the pre-built recognition models are not robust enough for our application. I successfully compiled the detection model for the camera, converting the .pt weight file to ONNX and then to a BLOB format. I was also able to get some depth data output by following the examples in the Luxonis documentation.
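For reference, the ONNX-to-BLOB step can be done with Luxonis's blobconverter package, roughly as below (paths and shave count are placeholders; the .pt-to-ONNX export is assumed to have been done with the model's own export tooling):

    import blobconverter

    # Convert an ONNX model to a .blob the OAK-D can run (FP16 on the Myriad X).
    blob_path = blobconverter.from_onnx(
        model="ball_detector.onnx",  # placeholder path to the exported ONNX model
        data_type="FP16",
        shaves=6,
    )
    print(blob_path)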

Schedule and deliverables

I have caught up to schedule by making a lot of progress on the camera API side. More work still needs to be done on the algorithmic side for the Kalman filter, though. Since integration of the Kalman filter and the camera system is still in the works, I can work on the two components independently: generating some simulated points in space and time to feed the Kalman filter, while working on the point generation on the camera side at the same time.
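A small sketch of how such simulated points could be generated (illustrative numbers, with units in metres and seconds):

    import numpy as np

    def simulate_throw(v0x=2.0, v0y=4.5, y0=1.0, g=9.81, fps=60, noise_std=0.01):
        """Generate noisy (t, x, y) samples of a projectile throw to feed the Kalman filter."""
        t_land = (v0y + np.sqrt(v0y**2 + 2 * g * y0)) / g   # time until the ball hits the ground
        t = np.arange(0.0, t_land, 1.0 / fps)
        x = v0x * t
        y = y0 + v0y * t - 0.5 * g * t**2
        noise = np.random.normal(0.0, noise_std, size=(2, t.size))  # fake measurement noise
        return t, x + noise[0], y + noise[1]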