Team Status Report for April 27, 2024

The most significant risks that could jeopardize the success of the project concern the environment of our setup on demo day. Specifically, our system relies on consistent, moderately bright lighting to function optimally; non-ideal lighting conditions would result in faulty detection of the cue ball or other objects. To manage and mitigate this risk, we made very specific requests for the area where we want our project to be located, and we modified our code so that we can make on-the-fly parameter adjustments based on the given lighting conditions. No large changes were made to the existing design of the system; most of this week's work was testing, verification, tweaking, small optimizations, and cleanup. No change to the schedule is needed – we are on track and proceeding as planned.
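As a rough illustration of the on-the-fly lighting adjustment mentioned above, the sketch below shifts an HSV value-channel threshold band toward a frame's mean brightness. The function name, baseline thresholds, and damping factor are all illustrative assumptions, not our tuned production values.

```python
import numpy as np

# Baseline HSV "value" band for masking (illustrative numbers only).
BASE_V_MIN, BASE_V_MAX = 60, 255

def adjust_value_band(frame_hsv: np.ndarray, target_brightness: float = 128.0):
    """Shift the V-channel threshold band toward the frame's mean brightness.

    Dark frames lower the band so dim objects still pass the mask;
    bright frames raise it so glare does not flood the mask.
    """
    mean_v = float(frame_hsv[..., 2].mean())
    offset = int(round((mean_v - target_brightness) * 0.5))  # damped correction
    v_min = int(np.clip(BASE_V_MIN + offset, 0, 255))
    v_max = int(np.clip(BASE_V_MAX + offset, v_min, 255))
    return v_min, v_max
```

In practice this kind of correction would run once per frame before any color masking, so a sunlit or dim table shifts the thresholds rather than breaking detection outright.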

Unit Tests:

Cue Stick Detection System:

  • Cartesian-to-polar
  • Checking image similarity (pixel overlap)
  • Frame history queue insertion, deletion, dequeuing
  • Computing slope
  • Polar-to-cartesian point
  • Polar-to-cartesian line
  • Walls and pool table masking
  • Extracting point pairs (rectangle vertices) from boxPoints
  • Primary findCueStick function
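The coordinate-conversion helpers exercised by the tests above can be sketched as follows; these are generic textbook conversions, and our real helpers' signatures may differ.

```python
import math

def cartesian_to_polar(x: float, y: float) -> tuple:
    """Return (r, theta), with theta in radians measured from the +x axis."""
    return math.hypot(x, y), math.atan2(y, x)

def polar_to_cartesian(r: float, theta: float) -> tuple:
    """Inverse of cartesian_to_polar: recover (x, y) from (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)
```

Unit tests for these functions mostly amount to round-trip checks: converting to polar and back should reproduce the original point to within floating-point tolerance.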

Ball Detection System:

  • getCueBall function
  • Point within wall bounds
  • Green masking of pool table
  • Creating mask colors
  • getBallsHSV
  • Finding contours of balls after HSV mask applied
  • Remove balls within pockets
  • Point-to-line distance
  • Ball collision/adjacent to walls
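The "green masking" step tested above can be sketched as an HSV band filter that keeps table-felt pixels (the equivalent of OpenCV's `cv2.inRange`). The bounds below are a generic guess at felt green, not our calibrated values.

```python
import numpy as np

# Illustrative HSV bounds for pool-table green (OpenCV-style hue range 0-179).
GREEN_LO = np.array([35, 60, 40])
GREEN_HI = np.array([85, 255, 255])

def green_mask(frame_hsv: np.ndarray) -> np.ndarray:
    """Boolean mask of pixels whose HSV values fall inside the green band."""
    return np.all((frame_hsv >= GREEN_LO) & (frame_hsv <= GREEN_HI), axis=-1)
```

Ball detection then operates on the inverse of this mask: anything on the table that is not felt-colored is a candidate ball contour.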

Physics Engine/System:

  • Normal slope
  • Point-to-line distance
  • Checking if point out of pool table bounds
  • Line/trajectory intersection with wall
  • Reflected points, aim at wall
  • Finding intersection points of two lines
  • Extrapolate output line of trajectory
  • Point along trajectory line (or within some distance d of it)
  • Find new collision point on trajectory line
  • Intersection of trajectory line and ball
  • Main run_physics
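The wall-reflection step used for bank-shot trajectories (the "reflected points, aim at wall" test above) can be sketched as mirroring a point across the wall segment treated as an infinite line. The function name is illustrative.

```python
def reflect_point(px, py, x1, y1, x2, y2):
    """Reflect (px, py) across the line through (x1, y1) and (x2, y2)."""
    dx, dy = x2 - x1, y2 - y1
    # Parametric position of the perpendicular foot on the wall line
    t = ((px - x1) * dx + (py - y1) * dy) / (dx * dx + dy * dy)
    fx, fy = x1 + t * dx, y1 + t * dy  # foot of the perpendicular
    return 2 * fx - px, 2 * fy - py    # mirror the point through the foot
```

Aiming at the reflection of the target across a wall produces the bank-shot line, since the ideal (spinless) bounce obeys angle-in equals angle-out.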

System Tests:

Cue Stick Detection System:

  • Isolated stick, no balls
  • Stick close, next to cue ball
  • Stick among multiple balls
  • Random configurations (10x)
  • Full-length stick
  • Front of stick, at table edges (5x)
  • Different lighting conditions
  • IMU accelerometer request-response

Ball Detection System:

  • Random ball configurations (20x)
  • Similarly colored balls close together (e.g. cue ball + yellow stripe + yellow ball)
  • Balls near pockets
  • Balls near walls
  • Different lighting conditions

Physics Engine/System:

  • Kiss-shot: cue ball – ball – ball – pocket (20x)
  • Bank shot: cue ball – ball – wall (20x)
  • Normal shot: cue-ball – ball – pocket (20x)

Webapp System:

  • Spin shot location request-response
  • End-to-end latency (processing each frame)

Andrew’s Status Report for April 27, 2024

This week I was mostly wrapping up the project. I worked on the spin physics again, refining it and writing additional unit tests to verify its functionality. I helped debug some of the frontend CSS issues we were facing (resizing issues, running on different laptops). Additionally, I restructured some of the system’s calibration to handle errors more robustly and to be more user-friendly to debug. I also ran various end-to-end tests to verify the accuracy of the system and its trajectory prediction. This was in part for the final presentation, and it served as additional verification that we were able to meet our use-case requirements. Our progress is on schedule, and in the coming week I will be working on the final demo as well as the poster. I also plan on doing one last pass at the cue stick detection, squeezing out as much stability and precision as possible before the final demo.

Andrew’s Status Report for April 20, 2024

This week consisted of a lot of optimization, improvement, and cleanup on my end, primarily in two areas: 1) cue stick detection, and 2) incorporating spin into our physics engine and displaying it. For cue stick detection, we realized that the stick trajectory was not very stable, and I tried many different ways of improving the detection. Ultimately, what worked best was a combination of color masking, Gaussian blur, contour detection, and minimum-enclosing-rectangle math. The cue stick detection is now significantly more stable and accurate than before; this was huge for our system, as cue stick detection is a crucial component – if the stick detection is off, the usefulness of the system decreases significantly. The second part Tjun Jet and I worked on was incorporating spin into our physics engine. Specifically, we took a deep dive into the physics of pool ball spin and incorporated it into the engine’s handling of both ball-wall and ball-ball collisions. Further, we also take in information about the user’s strike (location of strike + speed) and feed it into our physics engine, which uses this input to modify the predicted trajectory in real time. By having the web application interface link directly to the physics engine, the user is able to see in real time how spin will affect the ball’s trajectory.
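As a heavily simplified illustration of how strike input could perturb a predicted wall bounce, the sketch below skews the ideal reflection angle by a side-spin term. The linear model, function name, and gain constant are assumptions for illustration only – not the actual spin physics we derived.

```python
def bounce_direction_with_spin(incoming_deg: float, side_spin: float,
                               spin_gain: float = 10.0) -> float:
    """Mirror the incoming angle about the wall, then skew it by side spin.

    incoming_deg: angle of approach relative to the wall, in degrees.
    side_spin: normalized english in [-1, 1]; 0 means no spin.
    """
    reflected = -incoming_deg              # ideal (spinless) reflection
    return reflected + spin_gain * side_spin
```

Feeding the web app's strike selection (spin location and speed) through a model of this shape is what lets the predicted trajectory update in real time as the user changes their intended shot.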

Our progress is on schedule. In the coming week, we will be looking to finish our Final Presentation, present it, and make some last-minute touch-ups to our project. On a practical level, I got a very hands-on introduction to computer vision applied to a specific problem; on the theoretical side, I had to refresh my physics and take a deep dive into the physics of pool. I knew almost nothing about computer vision coming into this project, and I didn’t have enough time to fully understand the theory behind everything by reading textbooks or taking a course. Instead, I found research papers, projects, videos, etc. that overlapped with what we wanted to do, and I consumed that content. This was the learning strategy I adopted to acquire the new knowledge, and I realized how important it is to limit the scope of what you are learning in order to accomplish the task at hand. If I had resorted to a traditional textbook or course, it would not have been possible to finish our system in time; much of the learning was done on the fly, hands-on.

Andrew’s Status Report for April 06, 2024

This week I modified our cue stick subsystem to use AprilTags. This affected accuracy both positively and negatively. On the positive side, the cue stick detection itself was more accurate and much more consistent. On the negative side, the large AprilTags made our other computer vision subsystems behave unexpectedly; most detrimentally, our cue ball detection subsystem occasionally started to mistake the AprilTags themselves for cue balls. Additionally, detecting the cue stick now required both AprilTags to be within frame, which fails for some edge-case shots near the pool table walls. As such, we decided to revert to the previous polygon-approximation method for now, and I am working on relying more on color detection for the cue stick. The idea is to use some very bright, unusual color (something like bright pink) as an indicator for the cue stick subsystem to pick up on.

Since our schedule is currently focused on integration, I am not too far behind schedule. I caught a pretty bad case of food poisoning Tuesday night and could not do much work until the weekend. However, I’m using the time now to catch up as well as to work on newer tasks. In the coming week, I will be looking into integrating ball spin from our web application input into our physics engine and backend.

For verification, the most important subsystem I need to verify and improve is the cue stick detection system. The tests I’m planning to run for it are fairly straightforward: they consist of various frames/videos of taking shots as well as lining up a variety of different shots. Verifying the subsystem is not difficult – I go frame by frame, output the detected cue stick, and verify it visually. To back up these results, I also added unit tests that verify the smaller functional components required for the entire subsystem to operate successfully. This was repeated for parts of the calibration subsystem as well.

Team Status Report for March 30, 2024

The most significant risk that could jeopardize the success of the project is inconsistent CV detection (usually due to varying lighting conditions). Since we don’t use machine learning, much of our computer vision detection relies on more “deterministic” algorithms like HoughLines, contour detection, polygon approximation, etc. On one of the previous days we were testing, there was a lot of sunlight in HHA104, which confused both the cue stick detection model and the cue ball detection model. This poses a serious potential issue with our system. To manage the risk for cue stick detection, we are considering attaching two AprilTags to the cue stick. We already have AprilTag detection infrastructure integrated into our system, so this would not be a hard change at all. Our only hesitation is that using AprilTags for the cue stick would result in a poorer user experience, since the user would have to hold the cue stick so that the AprilTags always face up. If worst comes to worst, though, we would go with this option. As for the cue ball detection issue, it really only arises between the yellow ball and the cue ball. We have two contingency options: color the yellow ball a darker shade of yellow, or make the “cue ball” whichever ball the cue stick is pointing at. The second option does not align as well with how pool is actually played, but since this is more of a training system, enforcing the actual rules of pool isn’t as necessary for our use case.

There were some minor changes made to the existing design of the system. For one, the Flask backend had to be integrated with the primary backend; this made it much easier to do the image processing and physics computation as well as stream the results to our frontend, and it is really helpful for debugging the project. The change does not incur much cost – only a small part of the backend has to be rewritten. Another change is the addition of a “Calibration” class, which calibrates our backend when the system starts; this involves things like detecting the walls and pockets and cropping each frame to avoid noise. We also implemented a sort of “memory” for the system: the Calibration class stores data from previous frames. If we aren’t able to detect certain objects, or detections are inconsistent frame-to-frame, we use this class to either take the union of all detected objects in the past n frames (when not enough objects are detected per frame) or find the “median” objects (when objects are inconsistent frame-to-frame).
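The frame-"memory" idea described above can be sketched as a bounded history of per-frame detections, with a union fallback when a single frame misses objects. The class and method names below are hypothetical; our actual Calibration class carries more state (walls, pockets, crop region).

```python
from collections import deque

class Calibration:
    """Keeps the last n frames' detections to smooth over missed frames."""

    def __init__(self, n: int = 5):
        # Each entry is the set of object ids detected in one frame.
        self.history = deque(maxlen=n)

    def record(self, detections: set):
        self.history.append(set(detections))

    def union_of_recent(self) -> set:
        """Union of everything seen in the last n frames."""
        merged = set()
        for frame in self.history:
            merged |= frame
        return merged
```

The `deque(maxlen=n)` automatically evicts the oldest frame, so the memory window slides forward with no extra bookkeeping.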

Andrew’s Status Report for March 30, 2024

This week was again about integration and refining the cue stick detection module. For the integration, I merged our primary backend (CV detection, physics engine, etc.) with the Flask server that streams the video frames to our frontend. Getting this to work was a little tricky, since the frames have to be streamed to the frontend in real time. For the initial demo, I changed the cue stick detection module to use polygon approximation to detect the end of the cue stick and identify two points that represent the stick. It is not 100% accurate, but it is serviceable for the demo. There were alternate ways to implement this; however, most of them were not consistent or did not detect the cue stick often enough to be real-time. Also, since we last discussed being able to do ball spin, I implemented a basic selection system on the frontend that lets the user select the kind of shot they want to take, with various spin types.
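The frame-streaming pattern described above is commonly done with an MJPEG-style multipart response: each encoded JPEG is wrapped in a boundary-delimited chunk and yielded to a Flask `Response` with mimetype `multipart/x-mixed-replace; boundary=frame`. The sketch below shows just the chunking generator (Flask wiring omitted; function names are illustrative).

```python
def mjpeg_chunk(jpeg_bytes: bytes) -> bytes:
    """Wrap one encoded JPEG frame in a multipart chunk."""
    return (b"--frame\r\n"
            b"Content-Type: image/jpeg\r\n\r\n" + jpeg_bytes + b"\r\n")

def frame_generator(frames):
    """Yield multipart chunks for an iterable of encoded JPEG frames.

    In Flask this generator would be passed to
    Response(frame_generator(frames),
             mimetype="multipart/x-mixed-replace; boundary=frame").
    """
    for jpeg in frames:
        yield mjpeg_chunk(jpeg)
```

Because the response is a generator, the server pushes each frame as soon as the backend finishes processing it, which is what keeps the frontend view close to real time.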

This week I am on schedule. We are wrapping up the MVP of our project and will be ready to show at the interim demo. In the coming week, I hope to demo the project on Monday and continue to improve the cue stick detection. I’m also aiming to improve the frontend UI and incorporate spin into the backend.

Andrew’s Status Report for March 23, 2024

This week I spent a massive amount of time debugging, testing, and integrating the entire system together. The debugging was specifically for the cue stick detection model. After rigorous testing, I realized that there were a number of edge cases where the cue stick detection completely failed. Most of these depend on how the user grips the cue stick: varying grips cover different parts of the stick that the detection model relies on, which makes it hard to get consistent results. Further, skin tone also affects the rate of detection, with skin tones closer to the color of the cue stick confusing the model more.

I cycled through a number of different methods for detecting the cue stick – HoughLines, polygon approximation, background subtraction, and many others. Unfortunately, none of these approaches covered all of the edge cases, and none was reliable enough. I also incorporated a form of “memory” for the cue stick detection system, which takes the union of detections across multiple frames to make up for the inconsistencies. Despite this, we still often failed to detect the cue stick. My progress this week is slightly behind since the cue stick detection is not working properly. There is an easier way to address this issue – attaching AprilTags to the cue stick. However, this is a last resort, and as a team we collectively wanted to avoid it in order to have a better user experience: if we use AprilTags to detect the cue stick, the user must hold the stick in a specific position for it to be detected. I will put in at least another half-week trying to fix the cue stick detection without AprilTags.

In the coming week, I aim to fix the cue stick detection model and continue with end-to-end integration of the system.

Andrew’s Status Report for March 16, 2024

This week I wasn’t able to get much done. I was supposed to extend the web application by building a user-friendly display for the accelerometer and gyroscope data and integrating the camera feed into the web application, but I had three midterms plus other homework due this week. I made some progress on the display, but I was not able to finish it or the camera feed integration. I’m aiming to use the entirety of this Sunday (tomorrow) to finish the work, and our team plans on fully integrating everything tomorrow as well.

Something to note, however, is that after discussions with Professor Kim, we realized that the raw gyroscope and accelerometer data would not be very accurate to display. We thought of doing some sort of meter or “relative” display in order to use this data, but we need to pivot to a better way of utilizing the data to display user recommendations. For now, we were advised not to spend too much time on getting the IMU to work precisely.

For plans in the coming week, we hope to be able to integrate everything together and demo a working version of our project. This will probably involve a lot of debugging and potential modifications to the pool table/metal frame in order to get the computer vision components working as best as possible.

Team Status Report for March 09, 2024

This week we were on schedule for our goals. Work was done on the web application as well as on refining the cue stick system. Building out the web application involved a frontend (React) and backend (Flask/SQLite), though we are considering phasing out the backend if the WiFi is fast enough to go directly from the ESP32 to the frontend. Having the Arduino Nano was a convenient way to test the software with a wired connection; we have phased out the Nano and are just using the ESP32 module, which acts as a server and exposes API endpoints to get the IMU’s gyroscope and accelerometer data. We also calibrated our projector, tested the camera, and adjusted the height of various components. The projector needed to be high enough to display over the entire pool table, and the camera needed to be high enough for the table to fit almost exactly within its FOV. After testing, we decided that the projector and camera needed to be much higher, which meant shifting the actual pool table down. For our structure, this is extremely convenient, since the metal shelf frame has notches: we shifted the pool table down to lower notches to increase the vertical distance between the table and the camera/projector.

In addition to this, we spent a good chunk of time this week writing the Design Review report. This was a significant effort, as the report spanned 11 pages; we split up the work evenly. Debrina wrote the introduction, use case requirements, and design requirements; Andrew wrote the abstract, design trade, testing & verification, and part of the risk mitigation; Tjun Jet wrote the rest. This was a good checkpoint for us as well, since it forced us to document in detail all the work we’ve finished so far and assess next steps.

A risk that could jeopardize the project is detection of the cue stick. Detecting the stick without any machine learning is proving to be a hard task, and the results are not always consistent using just contour detection, color detection, etc. The detection can be further improved with more layers and more sophisticated heuristics. However, in the event that we still cannot detect the cue stick with high accuracy, we will opt for a more reliable solution like AprilTags. The infrastructure for AprilTag detection is already built into our system, so in the worst case we could attach AprilTags to the cue stick so that our camera detects it easily. This is our main contingency plan for this problem.

Part A (written by Andrew):

The production solution we are designing will meet the need for an intuitive pool training system. Taking into account global factors, pool itself is a game played by people all over the world, with origins dating back to the 1300s. As such, we need a system that is culture- and language-agnostic: it should work for someone in the United States as well as someone in France, Italy, etc. This requires our product to use minimal country-specific information or context – a concern we thought carefully about. Since we are used to living in the United States, we had to think critically about our implicit biases. For instance, we planned our product to rely heavily on visual feedback instead of text; the trajectory prediction in particular is country-, culture-, and language-agnostic and so meets worldwide contexts and factors. Additionally, we built the system to be as simple as possible, requiring minimal effort on the part of the user. This accounts for variations in different forms of pool and also aids players who are not as technologically savvy; if the primary method of feedback were more complicated or convoluted, it would be a detriment to those users.

Part B (written by Debrina):

The production solution we are designing meets the need for a pool training system that has an intuitive user interface. The consideration of cultural factors is similar to that of global factors. Our training system makes no assumptions on which languages are spoken by our users. Nor does it make any assumptions regarding the level of language proficiency our users have. The feedback provided by our pool training system is purely visual, which makes its interface easily understood by users of all cultures and backgrounds. Our product solution also has the potential to spread the accessibility and popularity of billiards in cultures where the game is not as widespread. Currently, the game of billiards is more popular in some countries and cultures than others (the United States, United Kingdom, and Philippines, to name a few). Our product solution’s intuitive user interface would be able to promote the game of billiards in other cultures. The versatility of our product solution provides people of all cultures an opportunity to learn to play billiards. 

Part C (written by Tjun Jet):

CueTips considers various environmental factors in the selection of our material, power consumption, and the modularity and reusability of our product. When selecting the material to use to build the frame for our pool table, we not only considered the biodegradability of the material used, but also the lifespan of the material. We bought a frame that consisted of wooden boards to hold the pool table, and metal frames as the stand. We initially considered using plywood planks to build our own frames, but we decided against it when we realized that prototyping and building the frame over and over again could lead to a lot of material wastage. Furthermore, our frame is modular and reusable, meaning it is easy to take apart and rebuild. For instance, if a consumer is moving the location of the pool table, it will be easy for them to take it apart and build it in another area, without the need for large transportation costs. To ensure appropriate lighting for the camera detection, we also lined the frame with LED Neopixels. LED neopixels are generally energy-efficient compared to traditional lighting. When used efficiently, this minimizes unnecessary power consumption. Thus, by carefully selecting biodegradable material, choosing energy efficient lighting, and making our entire product easily transportable, modular, and reusable, CueTips aims to provide a pleasant yet environmentally friendly pool learning experience.