Jeremy’s Status Report for May 1, 2021

This week, I worked with Sid to achieve and verify our design requirements for classification accuracy and latency using the completed PCB.

I improved the convolutional neural network training by adding data augmentation to the training set. Since the camera occasionally shifts slightly from day to day, we added an augmentation that takes randomly shifted crops of the input image to model those physical movements. I applied this augmentation only to the training set.
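
Concretely, the augmentation amounts to padding each image and randomly cropping back to the input size. Here is a minimal torchvision sketch; the pad amount and normalization constants are placeholders, not our tuned values:

    import torchvision.transforms as T

    DATASET_MEAN, DATASET_STD = 0.5, 0.25  # placeholders; ours come from the training set

    train_transform = T.Compose([
        T.RandomCrop((148, 200), padding=8),  # randomly shifted crop models camera movement
        T.ToTensor(),
        T.Normalize(mean=[DATASET_MEAN], std=[DATASET_STD]),
    ])

    # Validation/test images skip RandomCrop so evaluation stays deterministic.
    eval_transform = T.Compose([T.ToTensor(),
                                T.Normalize(mean=[DATASET_MEAN], std=[DATASET_STD])])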

Sid and I captured and labelled 2724 new captures with the LEDs from the PCB. Combined with the new data augmentation and a learning rate scheduler, we reached 99.1% validation accuracy and 98.1% test accuracy. This is the best validation accuracy we have achieved, so we are using this model for the MVP.
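
For reference, a plateau-based learning rate scheduler in PyTorch looks roughly like the sketch below; the scheduler type and its settings are illustrative rather than our exact configuration, and train_one_epoch and evaluate are hypothetical helpers:

    import torch

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # model defined elsewhere
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="max", factor=0.5, patience=5)          # illustrative settings

    for epoch in range(num_epochs):
        train_one_epoch(model, optimizer)   # hypothetical training helper
        val_acc = evaluate(model)           # hypothetical validation helper
        scheduler.step(val_acc)             # cut the LR when validation accuracy stalls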

We also measured the classification latency. The design requirements specify a 2-second latency from capture to web update. We dealt 52 cards over a 52-second period and measured the following per-card latency statistics:
min: 0.162s
mean: 0.184s
median: 0.177s
max: 0.224s
The maximum latency of 0.224s is far less than our design requirement of 2 seconds.
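
For reference, the measurement harness amounts to something like this sketch, where wait_for_web_update is a hypothetical stand-in for our actual instrumentation:

    import statistics
    import time

    latencies = []
    for _ in range(52):                     # one measurement per dealt card
        t0 = time.monotonic()               # the infrared trigger fires
        wait_for_web_update()               # hypothetical: blocks until the web UI shows the card
        latencies.append(time.monotonic() - t0)

    print(f"min={min(latencies):.3f}s  mean={statistics.mean(latencies):.3f}s  "
          f"median={statistics.median(latencies):.3f}s  max={max(latencies):.3f}s")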

I have now completed the imaging and machine learning system. We have some final testing to carry out, but so far, we have achieved every design requirement that we have tested.

Jeremy’s Status Report for 4/24/21

For the past two weeks, I have worked with Sid to collect a dataset and train different convnets to classify the images.

For training, we have the following data:
Training images: 3481
Validation images: 497
Testing images: 996

The camera returns a 1280×720 image. We crop it to a fixed region of interest and downsample by 4 to obtain a 200×148 image. We normalize the training and validation datasets to mean 0 and standard deviation 1. That normalized 200×148 image is then passed to the network.
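
A minimal sketch of that preprocessing, assuming OpenCV; the ROI coordinates and normalization constants below are placeholders:

    import cv2

    TRAIN_MEAN, TRAIN_STD = 110.0, 40.0     # placeholders; computed from the training set

    def preprocess(frame):
        """Crop a fixed ROI from the 1280x720 capture, downsample by 4, normalize."""
        roi = frame[40:632, 200:1000]       # hypothetical 800x592 region of interest
        small = cv2.resize(roi, (200, 148), interpolation=cv2.INTER_AREA)  # /4 downsample
        return (small.astype("float32") - TRAIN_MEAN) / TRAIN_STD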

The network architecture is based on the LeNet-5 network described in LeCun et al., “Gradient-Based Learning Applied to Document Recognition.” It contains 4 convolutional layers with 5×5 kernels, one fully connected layer to output the feature vector, and two separate fully connected heads to output the rank and suit probability distributions. Each convolutional layer is followed by batch normalization and 2×2 max-pooling.
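
A rough PyTorch sketch of this architecture follows; the channel counts, 128-wide feature vector, and ReLU activations are illustrative choices rather than the exact network:

    import torch
    import torch.nn as nn

    class CardNet(nn.Module):
        """LeNet-5-style two-head classifier; sizes here are illustrative."""
        def __init__(self):
            super().__init__()
            chans = [1, 8, 16, 32, 64]
            layers = []
            for cin, cout in zip(chans, chans[1:]):
                layers += [nn.Conv2d(cin, cout, kernel_size=5, padding=2),
                           nn.BatchNorm2d(cout),
                           nn.ReLU(),
                           nn.MaxPool2d(2)]
            self.features = nn.Sequential(*layers)
            self.fc = nn.Linear(64 * 9 * 12, 128)   # 148x200 input -> 9x12 after 4 pools
            self.rank_head = nn.Linear(128, 13)
            self.suit_head = nn.Linear(128, 4)

        def forward(self, x):
            x = self.features(x).flatten(1)
            x = torch.relu(self.fc(x))
            return self.rank_head(x), self.suit_head(x)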

So far, our best network achieves 98.0% validation accuracy and 97.5% test accuracy. A card is classified correctly if both the rank and suit are correct. This network takes approximately 50ms to classify a single image on the Jetson Nano, so we have plenty of headroom to achieve the 2s latency requirement.

Because the PCB is delayed, we still cannot test imaging and classification on the final prototype. However, Sid and I are on track to hit the 98% accuracy requirement, and we will continue building the classification subsystem on our current prototype until the hardware is finished.

Here are details from the training process:

Validation metrics
Card accuracy (suit and rank): 0.979879
Suit accuracy: 0.993964
Rank accuracy: 0.985915

Test metrics
Card accuracy (suit and rank): 0.974900
Suit accuracy: 0.996988
Rank accuracy: 0.976908

This week, Sid and I will take more training images to increase our dataset size. With more data, we can experiment with larger networks to hit the 98% test accuracy requirement.

Jeremy’s Status Report for 4/10/2021

This week, I mostly finalized the imaging system and prepared it for the demo. On Monday, we found that the camera was broken (likely due to transportation and repeated connecting/reconnecting). I reordered a camera, and it arrived Friday. Because of that hiccup, I am one week behind, since I could not collect a dataset for machine learning without a functioning camera. I updated our Gantt chart and used one of our two weeks of slack time to account for this.

On Sunday, before this hardware issue, I collected a dataset of ~200 captures to quickly prototype some classifiers. While this is not nearly enough data to train a classifier that generalizes well (~200 captures ≈ 16 captures per rank), it let us bring up our SVM classification code. As expected, validation accuracy was insufficient, so our next step is to acquire a sufficiently large dataset.
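
The SVM bring-up itself is only a few lines with scikit-learn. A toy-scale sketch, where X_train, y_train, X_val, and y_val are hypothetical arrays of preprocessed images and labels:

    from sklearn.svm import SVC

    # Flatten each preprocessed binary image into a feature vector and fit
    # a multi-class SVM; kernel and C are illustrative starting points.
    clf = SVC(kernel="rbf", C=1.0)
    clf.fit(X_train.reshape(len(X_train), -1), y_train)
    print(clf.score(X_val.reshape(len(X_val), -1), y_val))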

When a card moves over the trigger, the imaging system now returns two images: the unprocessed 8-bit black-and-white capture and a cropped, thresholded binary image that contains the rank and suit. This represents successful software integration of the infrared sensor’s ADC and the camera drivers. After examining the classification results, I expect to change only the rectangle that crops a fixed region of interest out of each capture, to zoom in on the rank and suit. The cards can move horizontally, so occasionally the rank and suit are shifted within that ROI. Otherwise, I do not expect to make significant changes to the imaging system.

This week, my first priority is to obtain a sufficiently large dataset for training classifiers. I will work with Sid on SVM and neural network training.

Jeremy’s Status Report for 4/3/2021

This week, I worked on bringing up the trigger and syncing the camera captures with it. I implemented I2C communications with the ADC evaluation board to sample the analog voltages from the infrared sensor.

The ADC is currently set to 1600 samples/s, but we can increase that if necessary. Right now, we believe a 0.625ms sample period (1600 samples/s) is adequate given that the camera shutter time is 16.667ms. The main Python loop polls the ADC to respond quickly to card triggers.
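
Conceptually, that polling loop is just a tight threshold check. In the sketch below, read_adc and trigger_capture are hypothetical stand-ins for the actual I2C and camera code:

    THRESHOLD = 500                 # ADC counts; the card’s white edge dips below this

    def read_adc():
        # Hypothetical wrapper around one I2C read from the ADC evaluation board.
        raise NotImplementedError

    while True:                     # main Python loop
        if read_adc() < THRESHOLD:  # new sample every 0.625 ms at 1600 samples/s
            trigger_capture()       # hypothetical: start pulling camera frames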

Since the sensor measures infrared reflectance, the measured signal depends on the surface’s color. This means the trigger signal is slightly different for face cards, since contrasting colors (white, black, red) pass over the sensor. However, the white edge of the card that passes over the trigger first always creates a voltage that is consistently below the threshold, so this should not be an issue for our project. The graph below shows the trigger signal when we deal an ace of clubs, five of hearts, nine of spades, jack of spades, and king of hearts. Note that the signal includes spikes for cards with a black rank, but it consistently falls below 500 at the beginning. Perhaps a future revision could learn characteristics of the trigger signal to use as priors when classifying the card…but that’s out of the scope of our project.

The second graph shows the trigger values when we place a finger directly over the imaging stage, an expected mistake during usage. A finger does not dip the signal below our trigger threshold. We found that placing a phone flashlight within a few inches directly over it will trip the sensor, but we have not yet saved those signals.

Before we integrated the trigger with the ADC, we used an Arduino to perform A/D conversions. With that prototype, we imaged an entire deck once to obtain a toy dataset for machine learning. Luckily, we found that frame 12 (where frame 0 is the first continuous capture when triggered) consistently contains the rank and suit in one image. This video loops through each frame-of-interest for each card. This second video shows those frames after preprocessing (cropping and Otsu’s thresholding). This preprocessing is not yet robust enough, since it misses some digits.

Both defocus and motion blur are an issue. The camera’s listed minimum object distance of 3cm still gives images that are out of focus. The motion blur is due to the 60Hz framerate limit, but it only blurs the images in one direction. We can overcome defocus with thresholding, but motion blur is trickier. The “1” in cards of rank “10” is often blurred by motion, leaving little contrast in the image. The current global threshold misses that single digit, so I may experiment with adaptive thresholding to see if that makes the preprocessing more sensitive.
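
For comparison, the two thresholding variants in OpenCV look like this, where roi is an 8-bit grayscale crop and the adaptive block size and offset are illustrative starting points:

    import cv2

    # Global Otsu threshold: one threshold chosen automatically for the whole image.
    _, otsu = cv2.threshold(roi, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Adaptive threshold: a local threshold per 31x31 neighborhood, which may
    # recover low-contrast strokes like the motion-blurred "1".
    adaptive = cv2.adaptiveThreshold(roi, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                     cv2.THRESH_BINARY, 31, 5)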

I was unable to experiment with the lighting system this week since we do not yet have a PCB. While we will continue working without consistent lighting, Ethan plans to work on that next week. I have also not yet finalized the edge detection and cropping to separate the rank and suit, but I do not expect that to take very long. Because of this, I am slightly behind schedule. Now that we have the trigger working, I hope to get back on schedule this week and obtain a larger dataset for Sid to work with.

Jeremy’s Status Report for 3/27/21

In the past two weeks, we have made good progress on the prototype. I have been working with Ethan to position the camera to frame the photos while balancing defocus blur and resolution. I also built the preprocessing routine to segment the regions of interest (rank and suit) from black-and-white captures. We now have some test captures that I used to build the preprocessing and segmentation. This preprocessing is incredibly fast since it outputs a binary image and uses aggressive downsampling, which will help us hit our latency target.

One issue we ran into was camera defocus. While the lens claims that 3cm is the minimum object distance, we found the edges to be out of focus. Thankfully, thresholding on intensity removes any blurred boundaries without any artifacts.

Secondly, I discovered that the Nvidia Jetson Nano’s camera interface connector cannot support 120Hz captures (even though the camera sensor can). As such, I have been working with 60Hz captures. I have not noticed any issues with motion blur in the captures but will update if that becomes a concern.

This imaging system relies heavily on an accurate trigger. We need the trigger to immediately identify the capture where the rank and suit are within the image boundaries. I am working with Ethan to fine-tune the trigger positioning and timing. I hope to avoid identifying the image with the rank and suit in software, since that would add significant processing time.

I am currently on schedule. Since the trigger is far more critical to the product than we initially realized, we may be delayed in building the final prototype next week. I will update in next week’s status report.

Attached are some example captures and their preprocessed outputs. We will keep the largest binary blobs, which correspond to the suit and rank; those blobs will be the inputs to our classifier. Note these preprocessed images are currently ~40×15px after downsampling.
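
A sketch of that blob extraction using OpenCV’s connected-components analysis; the helper name and k=2 are illustrative:

    import cv2
    import numpy as np

    def largest_blobs(binary, k=2):
        """Keep the k largest connected components (the rank and the suit).
        `binary` is an 8-bit thresholded image."""
        n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
        # stats[:, cv2.CC_STAT_AREA] holds each component's pixel count;
        # label 0 is the background, so skip it when ranking by area.
        areas = stats[1:, cv2.CC_STAT_AREA]
        keep = 1 + np.argsort(areas)[::-1][:k]
        return [(labels == i).astype(np.uint8) * 255 for i in keep]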

Jeremy’s Status Report for 3/13/2021

Last week, we ordered the first batch of parts to prototype our system. Shipping took a week longer than we planned for, but we got the parts on Thursday. While we waited, I researched lens distortion correction in case it is necessary for the system. Starting today (Friday), I have set up the Jetson Nano and am currently working to get the camera drivers working and bring up a Python script that streams images from the camera. Once that is done, I will experiment with different camera poses and lighting. Ethan proposed angling the camera so it can sit recessed in the card shoe. That benefits the physical design, so I will also explore homographies to warp images if the captures include a perspective projection.
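
If the angled camera does introduce perspective, the warp itself is straightforward in OpenCV. This is only a sketch; the corner coordinates below are placeholders that would come from a one-time calibration capture:

    import cv2
    import numpy as np

    src = np.float32([[210, 80], [1060, 95], [1100, 640], [180, 620]])  # card corners in capture
    dst = np.float32([[0, 0], [800, 0], [800, 592], [0, 592]])          # rectified rectangle

    H = cv2.getPerspectiveTransform(src, dst)              # homography from 4 point pairs
    rectified = cv2.warpPerspective(frame, H, (800, 592))  # frame: a camera capture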

Because shipping took longer than we expected, my task list may slip if the camera drivers pose an extended issue. I did not include a “camera bring-up” task in my Gantt chart, but I will update my schedule in next week’s status report either way.

Jeremy’s Status Report for 3/6/21

This week, we stayed our course by ordering Nvidia Jetson Nanos, two camera evaluation boards, card shoes, and card decks.  Once they arrive (hopefully this weekend), I will bring up my Jetson Nano and start taking photos.

We purchased two camera modules with different sensor resolutions and framerates.  This will let us experiment with different resolutions without waiting another week for shipping.  The cameras go up to 180fps, and we estimate we need a high framerate to avoid motion blur during quick card movements.

The primary camera I am interested in has the following specs:

  • Up to 1280×800@120fps
  • 30mm minimum object distance
  • 75° horizontal field of view

With the camera 3cm away from the playing card, this images a 46mm (1.81in) area with 0.058mm horizontal resolution at the full 120fps.  This is the closest distance that stays in focus.  This resolution should be more than sufficient for card classification when the rank and suit are on the imaging plane.
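
For a quick check of those numbers: the covered width follows from w = 2·d·tan(fov/2), and the snippet below reproduces the 46mm figure and the per-pixel resolution (dividing by the sensor’s 800-pixel dimension):

    import math

    d_mm, fov_deg = 30.0, 75.0
    width_mm = 2 * d_mm * math.tan(math.radians(fov_deg / 2))
    print(f"covered width: {width_mm:.1f} mm")        # ~46.0 mm, i.e. about 1.81 in
    print(f"resolution: {width_mm / 800:.3f} mm/px")  # ~0.058 mm per pixel over 800 px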

This brings me to another challenge for the project: image selection before classification.  When the card trips the sensor, the camera will spam photos to the Jetson Nano.  Since we will likely use a camera with a small imaging plane (e.g., 1.81in tall), we will need to choose a valid image to classify.  I hope to choose this with priors: from the sensor, we will know how long it took to move the card over the camera, and using prior knowledge of Bicycle Standard cards, we can estimate approximately which images contain the rank and suit by assuming constant velocity.  While I hope this solution will work, I will have to examine it once we have prototyped the imaging system.
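
A rough sketch of the idea; every geometry constant below is a placeholder, and the point is only the constant-velocity estimate:

    CARD_LEN_MM = 88.9         # Bicycle Standard card length (3.5 in)
    PIP_OFFSET_MM = 15.0       # rough distance of the rank/suit from the leading edge
    PLANE_MM = 46.0            # extent of the imaged area along the card's motion
    FPS = 120.0                # capture rate while triggered

    def frames_of_interest(trigger_s):
        """Estimate which frame indices show the rank and suit, assuming the
        card moved at constant velocity while it covered the trigger."""
        v = CARD_LEN_MM / trigger_s                  # mm/s over the sensor
        t_enter = PIP_OFFSET_MM / v                  # rank/suit enters the imaging plane
        t_exit = (PIP_OFFSET_MM + PLANE_MM) / v      # ...and leaves it
        return range(int(t_enter * FPS), int(t_exit * FPS) + 1)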

I adjusted our schedule to account for ordering parts on Thursday.  I began exploring lens distortion correction methods, but I’ll need the camera in hand to actually implement them.  I am otherwise on schedule.

Jeremy’s Status Report for 2/27/21

This week, I explored camera options that are cheap and have an evaluation board available.  We need the evaluation board so Sid and I can start prototyping the imaging pipeline before the final product’s PCB is ready.  Here are some options from which I will choose one to order:

OpenMV Cam H7 Plus

Pros: Easy integration with a Jetson Nano over SPI.  The lens is interchangeable on an M12 mount, which could be very convenient if we change the geometry of the final product.

Framerate/resolution: up to 120fps at 320×240px. Price: $80

IMX477 Sensor Board

Pros: Higher quality sensor, convenient hardware interface.

Cons: Must purchase a lens separately.

NVIDIA also provides a list of supported camera hardware.  I’m still inspecting those options, but many of them are lab-grade cameras that are far beyond our budget.  Ethan is also looking at those.

I am still on schedule with the camera and hope to order an evaluation board in the coming days.  After our design presentation, I pushed back some initial tasks on the Gantt chart to give a more realistic timeline.

Jeremy’s Status Report for 2/20/21

This week, I spent most of my time meeting with our team and TAs to refine our project’s scope.  We transitioned from two disjoint projects, a custom RFID poker-chip tracker and a card imager, to a single deliverable: a playing card holder that images and classifies playing cards as the dealer retrieves them.  We framed the project so that there are clear individual contributions from each member.  I will focus primarily on building the imaging system.

Before our design review, I need to quantify the optical and electrical requirements for a camera.  We require a camera whose optics provide reasonable-resolution photos of the suit and rank at close range.  We want to avoid fisheye projections that add non-linear transformations to the images, which would make the classifiers more difficult to train.  Secondly, I need to determine a lower bound on the framerate based on how fast cards are dealt.  Finally, we will prefer cameras with stable Linux drivers.

Before our design review, I will also plan the lighting more concretely.  Perhaps we could image playing cards under different colors of illumination to obtain higher-contrast captures.  As we narrow the scope of the project, we will settle these design questions so Ethan has time to make the hardware changes.

Our group decided to add a hardware switch that triggers when the dealer moves a card over the camera.  This will avoid the unnecessary complexity of having the camera determine when a card is being dealt.

We are on schedule to complete our proposal in time, and I will solidify the details above in the coming week.