Lohith’s Status Report for 22nd February 2025

Since last week, I have started sketching the chamber that will house the camera and Jetson device. We need a stable container that can hold the camera at any angle we desire (we will likely need to tilt the camera down to see the table cards) while also hiding the internals of our product from the consumer. My current idea is a box with a lid that can be frozen in place by a handle on the hinge: tightening the handle holds the lid at the chosen angle. I will also need small attachments on the front side to keep the camera from simply falling out, and the Jetson should be secured at the bottom of the chamber. It is important that the wires exit only at the bottom of the back side, so that our product stays portable. I also started working with Joe on the processing, specifically on how much should be bet when the count reaches a given value. I am still on schedule. Next week, I should finalize the chamber plans (likely in SolidWorks), work on a possible CUDA kernel (though I first need to work with Nicholas to determine whether that is needed to meet our latency goals), and continue working through bet processing.
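As a starting point for the bet-sizing discussion, here is a minimal sketch of the kind of rule Joe and I have been talking through; the linear spread, the unit size, and the cap are illustrative assumptions, not final values:

```python
def suggested_bet(true_count: float, unit: float = 5.0, max_units: int = 8) -> float:
    """Map the true count to a bet size with a simple linear spread.

    Bet one unit until the count turns favorable, then scale linearly
    with the true count, capped at max_units. The spread and unit size
    here are placeholders, not our final betting rule.
    """
    units = max(1, min(int(true_count) - 1, max_units))
    return units * unit


print(suggested_bet(4.0))  # with the defaults: 3 units -> 15.0
```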


Nicholas’ Status Report for February 22nd

I have finalized the ML model, so now we are moving on to setting up the training environment and getting the data downloaded onto the ECE clusters so we can train the model over Spring Break. We are using YOLOv11 Large, so I set up a working environment on the ECE clusters with TensorFlow and downloaded the data we will fine-tune on. I am less familiar with TensorFlow than PyTorch, so resolving library conflicts in the virtual environment took longer than desired. Despite the setup friction, I got everything downloaded, loaded YOLOv11 Large onto one of the GPUs, and ran a sample detection on a static image. The next step will be to put all of these components together and actually fine-tune the model, which I will report on next week.
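For reference, the sample detection only takes a few lines with the Ultralytics Python API; the weights filename below is the name Ultralytics publishes for the large checkpoint, and the test image path is a placeholder:

```python
from ultralytics import YOLO

# Load the large YOLOv11 checkpoint (Ultralytics distributes it as yolo11l.pt).
model = YOLO("yolo11l.pt")

# Run a single detection on a static test image (path is a placeholder).
results = model.predict("test_image.jpg", conf=0.25)

# Print each detected box with its class label and confidence.
for box in results[0].boxes:
    label = results[0].names[int(box.cls)]
    print(label, float(box.conf), box.xyxy.tolist())
```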

Joe’s Status Report for Feb 22, 2025

There were two main objectives for this week. First, I have been debugging the website and making sure that all the functionality I outlined in the design presentation works properly. I placed some dummy information in the database and verified that the pipeline between the web app and the DB works end to end. There were some bugs here and there, so I am still working on making this as airtight as possible.
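For reference, the dummy-data round trip amounts to a test like the following sketch; I am assuming a SQLite-style store here, and the table name and columns are hypothetical stand-ins for the real schema:

```python
import sqlite3

# Hypothetical schema; the real app's tables and columns differ.
conn = sqlite3.connect("app.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS hands ("
    "id INTEGER PRIMARY KEY, running_count INTEGER, action TEXT)"
)

# Insert a dummy row, then read it back to confirm the round trip.
conn.execute("INSERT INTO hands (running_count, action) VALUES (?, ?)", (3, "hit"))
conn.commit()
row = conn.execute("SELECT running_count, action FROM hands").fetchone()
assert row == (3, "hit"), "web app <-> DB pipeline mismatch"
conn.close()
```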

The other objective was to formalize the mathematics of our processing algorithm and start implementing it. However, the paper had more material than I initially anticipated, so I have not managed to complete this, although I have worked through a substantial portion of it. I hope to finish next week, before Spring break, and then start implementing the basic components of our processing algorithm over the break and after.

Nicholas’ Status Report for February 15th

This week has been devoted to cementing the choice of ML model for our project, as well as addressing the potential of a purely computer vision approach for our card detection module. Our advisors proposed the following pipeline:

  1. Use a traditional feature descriptor to identify cards.
  2. Take a snapshot of the table’s state.
  3. Wait 300 ms and perform steps 1-2 again.
  4. Given these two images, apply a convolutional filter to each.
  5. Pass a gradient filter over both of these images.
  6. Subtract the gradients of the images.

The idea is that this pipeline should allow us to cheaply and efficiently detect cards being placed on the table; once we are sure a new card has entered the frame, we can simply use the output of the feature detector to keep track of the cards. However, this algorithm makes too many simplifying assumptions to work in the context of Blackjack, and will most likely not be a good fit, for several reasons.
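For concreteness, here is a minimal sketch of the change-detection half of the pipeline (steps 3-6) using OpenCV; I am reading step 4 as a smoothing convolution before the gradient pass, and the change threshold is an illustrative placeholder:

```python
import time
import cv2
import numpy as np

def grad_magnitude(gray: np.ndarray) -> np.ndarray:
    """Steps 4-5: smooth, then take the Sobel gradient magnitude."""
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)   # assumed smoothing filter
    gx = cv2.Sobel(blurred, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(blurred, cv2.CV_32F, 0, 1, ksize=3)
    return cv2.magnitude(gx, gy)

def new_card_appeared(cap: cv2.VideoCapture, threshold: float = 1e5) -> bool:
    """Grab two frames 300 ms apart and compare their gradient images."""
    _, frame1 = cap.read()
    time.sleep(0.3)                                # step 3: wait 300 ms
    _, frame2 = cap.read()
    g1 = grad_magnitude(cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY))
    g2 = grad_magnitude(cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY))
    diff = np.abs(g2 - g1)                         # step 6: subtract gradients
    return float(diff.sum()) > threshold           # threshold is a placeholder
```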

Firstly, we note that the best feature descriptor for this task is ORB (Oriented FAST and Rotated BRIEF), which provides fast, scale- and rotation-invariant features for object detection. However, ORB is known to fail when the object is heavily occluded, and it is not robust to lighting changes; both are realistic issues for us. For occlusions, note that since cards are dealt one over another, the first card dealt to a player’s hand will be largely occluded from the camera’s view. ORB is also less robust to lighting changes than other, paid feature descriptors. One of our user requirements is that there be no arduous or rigid setup for users to follow, so constraining camera position or lighting is not a trade-off we are willing to make. Given the nature of Blackjack, then, ORB will face issues as a feature descriptor, making step 1 of this pipeline already problematic.
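For reference, extracting ORB features is only a few lines in OpenCV; this sketch just pulls keypoints and descriptors from a single card image (the file path and feature budget are placeholders):

```python
import cv2

# Load a card image in grayscale (path is a placeholder).
img = cv2.imread("card.jpg", cv2.IMREAD_GRAYSCALE)

# Detect ORB keypoints and compute their binary descriptors.
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(img, None)
print(f"{len(keypoints)} keypoints detected")

# Matching against a reference card image would use Hamming distance,
# since ORB descriptors are binary strings.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
```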

There would also be noise issues with this pipeline that would go against our user requirements. We already mentioned that this pipeline would have shaky accuracy with ORB, which is a red flag, since we need at least 90% accuracy to guarantee a smaller deviation from the count than a typical professional card counter. We would also require a fixed camera angle with this pipeline: using only gradients and edge detection, we cannot discern a playing card from any other rectangular object with numbers or letters on it. Because ORB is a hand-crafted feature descriptor that does not undergo any further processing, it is more susceptible to adversarial inputs. We could easily place a book in frame, and its edges and the letter “K” would be detected as a King! The only way around this would be to fix the camera angle so that we know what size the cards should be, but again, this goes against a user requirement.

Finally, we would also spend a lot of time on this pipeline for little gain. Even with libraries that use BLAS under the hood, such as NumPy, we are CPU-bound on critical (and highly parallel!) operations such as the convolution. We would have to write our own CUDA kernel for the convolutions, a large time expenditure to develop and debug for minimal savings in either time or memory, while still retaining the issues above. All in all, this computer vision pipeline seemed like a sensible idea, but at the end of the day the numbers do not justify using it.
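To make the CPU-bound point concrete, this is how one might time the convolution step on the CPU; the frame and kernel sizes are illustrative:

```python
import time
import numpy as np
from scipy.signal import convolve2d

# Illustrative sizes: a 1080p grayscale frame and a 5x5 box kernel.
frame = np.random.rand(1080, 1920).astype(np.float32)
kernel = np.ones((5, 5), dtype=np.float32) / 25.0

start = time.perf_counter()
out = convolve2d(frame, kernel, mode="same")
print(f"CPU convolution took {(time.perf_counter() - start) * 1e3:.1f} ms")
```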

For the machine learning side of things, we finally found the exact YOLO model we want to use: we will be using YOLOv11 Large for the foreseeable future, because it has a good accuracy-speed trade-off for us. We need accuracy far more than speed, and we can get 300 ms-500 ms inference times with this model, with far better guarantees on the ability to detect cards. This lines up well with our user requirements, since we will not be streaming data or performing real-time inference.

This keeps us on schedule and on track to meet our requirements. By next week, we hope to have compiled a slightly larger dataset and to have begun setting up a training environment on the ECE clusters for fine-tuning.

Joe’s Status Report for February 15th, 2025

This week, I have more or less finished finalizing the functionality my application will have. The application will provide the following information: 1) running count, 2) true count, 3) optimal action, and 4) deviations. The user will also be able to interact with the application by providing the history of bets they have placed and the action + outcome that just occurred. As of now, I have set up the web server on my own IP just to test the application’s functionality, and I have set up the database for the app along with the basic features mentioned above. I intend to spend the next week setting up the server to run on the Jetson Nano and consume the information provided by the ML pipeline. I have not managed to work much on the processing algorithm this week, since I have been working on the web app instead, but now that a prototype of the web server is close to running, I will formalize and implement the algorithm next week.
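For reference, the first two quantities are standard card-counting bookkeeping. Here is a minimal sketch assuming the Hi-Lo counting system (the per-rank values are the standard Hi-Lo assignments; the six-deck shoe is an illustrative assumption):

```python
# Standard Hi-Lo values: 2-6 count +1, 7-9 count 0, 10/J/Q/K/A count -1.
HI_LO = {r: +1 for r in ("2", "3", "4", "5", "6")}
HI_LO.update({r: 0 for r in ("7", "8", "9")})
HI_LO.update({r: -1 for r in ("10", "J", "Q", "K", "A")})

def running_count(seen_ranks: list[str]) -> int:
    """Sum the Hi-Lo value of every card seen so far."""
    return sum(HI_LO[r] for r in seen_ranks)

def true_count(rc: int, cards_seen: int, total_decks: int = 6) -> float:
    """Normalize the running count by the number of undealt decks."""
    decks_remaining = max(total_decks - cards_seen / 52, 0.5)
    return rc / decks_remaining

seen = ["K", "5", "2", "6", "9"]
rc = running_count(seen)                 # -> 2
print(rc, round(true_count(rc, len(seen)), 2))
```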

Lohith’s Status Report for February 15th, 2025

After the items were picked up, I wanted to set up the camera and the Jetson together to make sure they work, since initializing devices often surfaces compatibility problems. The first step was to flash the SD card with the image for the Linux operating system. Then I inserted the microSD into the Jetson and connected the Jetson to a monitor. There were several issues with setup on the first attempt, related to running “sudo apt-get update”; these required me to re-flash the microSD and skip the update. I downloaded a demo program using “DepthAI” (which we likely will not use in our first attempt at writing code) and connected the camera. The result was (somewhat) functional object detection. I am currently on track, since I anticipated that setup would take extra debugging effort. Now we can begin to run some smaller sample computer vision programs to determine the accuracy of card detection. I can also start profiling different methods of card detection, such as machine learning (which can usually deliver higher accuracy) and classical computer vision (which can be simpler, but with no guarantee of accuracy).
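For reference, the DepthAI demos build a small on-device pipeline and stream frames back to the host. A minimal sketch using the depthai v2 Python API (the stream name and preview size are illustrative choices):

```python
import cv2
import depthai as dai

# Build a minimal pipeline: color camera -> host output stream.
pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(640, 480)              # illustrative preview size
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("rgb")
cam.preview.link(xout.input)

# Connect to the device, pull frames, and display them until 'q' is pressed.
with dai.Device(pipeline) as device:
    queue = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
    while True:
        frame = queue.get().getCvFrame()
        cv2.imshow("preview", frame)
        if cv2.waitKey(1) == ord("q"):
            break
```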

Team’s Status Report for 15th February 2025

As a team, we made final decisions on the camera and Jetson, then placed orders for them. We are ready to start building a card detection program, as well as working through the math required for card counting. We are trying to find an easy way to develop without moving the hardware around too much; for example, it would be nice if we could develop significant parts of a computer vision program without needing to actually connect the camera. We will work on making this process easy. We do not have any significant schedule or block diagram changes. The most significant risk is a hardware device breaking, which could delay progress while we order a replacement. We still have a contingency plan to develop a CUDA kernel for additional performance if accuracy and performance suffer with computer vision or a Python ML model.
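One simple way to get that decoupling is to hide the frame source behind a generator, so the detection code does not care whether frames come from the live camera or a pre-recorded clip; a minimal sketch (the video filename is a placeholder):

```python
import cv2

def frame_source(use_camera: bool, video_path: str = "sample_game.mp4"):
    """Yield frames from a live camera or from a recorded video file.

    Reading from a file lets most of the computer vision pipeline be
    developed without the camera attached. The video path is a placeholder.
    """
    cap = cv2.VideoCapture(0 if use_camera else video_path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield frame
    finally:
        cap.release()
```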


Nicholas wrote part A, Lohith wrote part B, and Joe wrote part C.

(A) The product does not strictly meet a specified need in public health, safety, or welfare; rather, it is designed to ensure these factors are not infringed upon. The product is meant to teach people how to count cards when playing Blackjack at a casino, as well as to provide a fun experience for groups of friends learning together. However, if a product like this gained widespread adoption without proper boundaries, public welfare and health could arguably decline. Gambling addiction is a real problem that must be properly addressed and mitigated when making a product such as ours, so we have designed it to give constant reminders about safety and proper risk management when playing Blackjack. Since we are teaching players how to play Blackjack optimally, we are lowering their chance of losing money and causing themselves financial harm, but we still need to be aware of the real risks and issues associated with gambling addiction. Through occasional warnings during gameplay and a warning when starting the system, we hope to mitigate possible issues related to public health, safety, and welfare.

(B) This product meets needs with consideration of social factors. Blackjack can be, and often is, a social activity. Many people go to the casino with friends or family, and playing blackjack is often a bonding experience: a game involving both luck and skill will usually prompt discussion among the table, with feelings of jubilation (winning a hand), disappointment (a correct play leading to a loss), or surprise (a misplay or a lucky card). Our product, which aims to help a beginner make the right decisions at the table, will also help the player take part in these discussions and therefore increase their social interaction with friends and family. Even someone who goes alone can start interacting with other players (and often the dealer), and maybe make some friends along the way.

People also play blackjack with friends at home, and a beginner who isn’t confident enough in their skills to bet money may feel alienated. This product can help that beginner start making the right moves, build their confidence, and join in on the betting and the fun with their friends.

(C) This product does not have many economic considerations, since it is an educational application that helps with making accurate probabilistic decisions. Indirectly, the application does help teach individuals how to manage their finances and make prudent financial decisions. From a strictly macroeconomic perspective, however, there are few relevant considerations for this application, and thus the design of the product can more or less set this area aside.

Team’s Status Report for 8th February 2025

It is possible that with the base Jetson Nano setup, we may not be able to meet the latency requirements. Our contingency plan is to use the CUDA kernels given to us and write optimized code to meet the requirements. We also plan to use an FPGA to accelerate some of the ML operations on the Jetson, but if the I/O speeds prove too slow and we meet specs without the FPGA, then we will simply not use it. We did not make any major modifications to our initial ideas this week; however, we did brainstorm additional functionality that we might add if time and resources permit. The schedule has not been updated.

Lohith’s Status Report for February 8, 2025

Over the past week, I have been exploring options for camera models. The camera is an integral piece of our project, and since we have a goal of 90% accuracy for card identification, the choice of camera is very important. One option I found was the Intel RealSense camera series; these cameras have sub-millimeter accuracy, which is sufficient for us, since the camera will sit on a short tripod close to the hands of cards. Out of the many options, I settled on the D435, a camera that records at up to 90 frames per second, with RGB capabilities and a USB-C connector. I have also been researching videos on how to interface the Jetson Nano with the camera; this is a common pairing, so there is support available if we run into issues. I am currently on schedule. Early next week, we will place the order for the camera (unless the ECE department has one we can borrow), and then we can start testing data transfer with the Jetson Nano using a small sample program. Then we can start developing the ML model for identifying cards.
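For reference, grabbing frames from a RealSense camera goes through Intel’s pyrealsense2 Python bindings. A minimal capture sketch (the 640x480 @ 30 fps color profile is an illustrative choice, not our final configuration):

```python
import numpy as np
import pyrealsense2 as rs

# Configure a color stream (profile values are illustrative).
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

try:
    # Block until a frameset arrives, then pull out the color frame.
    frames = pipeline.wait_for_frames()
    color = frames.get_color_frame()
    image = np.asanyarray(color.get_data())   # HxWx3 uint8 BGR array
    print("captured frame:", image.shape)
finally:
    pipeline.stop()
```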

Nicholas’ Status Report for 8th February 2025

I have been researching which ML model we should use for our project, as well as confirming my belief that there are no suitable alternative computer vision techniques. In my research, I reviewed edge-detection techniques and color-analysis systems for detecting cards. However, as I initially believed, these systems are not robust to different conditions, which is a user requirement for us. Furthermore, these systems do not perform the further feature processing and refinement that the ML models we will be using can do; this lack of further processing means that a pure computer vision approach would not meet the user requirements. Once I had fully confirmed my suspicions, I began to research ResNet-50 and the YOLO family of models in more depth. We have currently decided to use a YOLO model: after further research, we realized ResNet is a common backbone for most object detection pipelines but would require further scaffolding on our part, whereas YOLO streamlines this process. Next week, we will decide which YOLO model to use and begin preparing the model and dataset for training.