Jong Woo’s Status Report for 12/10/2022

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours): 

         This week, all of the clamps on our apparatus were replaced with screws and glue. Final height adjustments for the initial cropping were tweaked accordingly, and our team started drafting the final paper as well. Our team decided to create a poster to go along with the in-person demo, and I prepared for the in-person demo presentation. 

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?:

         Pre-processing has been fully implemented to its original scope, so my progress is on schedule. 

What deliverables do you hope to complete in the next week?:

        The final report will be completed, along with the in-person demo.

Team Status Report for 12/10/2022

  1. What are the most significant risks that could jeopardize the success of the project?
  • Pre-processing: Pre-processing is complete, meets all design requirements, and also has a machine-learning-based alternative with a longer latency.
  • Classification: Classification is more-or-less complete and meets the requirements set forth in our design review. We do not foresee any risks in this subsystem.
  • Hardware/integration: We are still in the process of measuring the latency of the entire system, but we know that we are within around 5 seconds on the AGX Xavier, which is a big improvement over the Nano. We will continue to measure and optimize the system, but we are at some risk of compromising our latency requirement. (A sketch of a simple timing harness follows this list.)
  • Report: We are beginning to outline the contents of our report and video. It is too early to say if any risks jeopardize our timeline.
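As a minimal sketch of how such end-to-end latency can be measured (the pipeline callable and image argument here are hypothetical stand-ins for our capture-to-audio path, not actual module names):

```python
import time

def measure_latency(pipeline, image, runs=10):
    # Warm up once so one-time costs (model load, CUDA context init)
    # do not inflate the steady-state numbers.
    pipeline(image)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        pipeline(image)   # pre-process -> classify -> post-process
        samples.append(time.perf_counter() - start)
    return sum(samples) / len(samples)
```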

2. How are these risks being managed? 

Nearly everything has been completed as planned. 

3. What contingency plans are ready? 

Post-Processing: At this point, no contingency plans are necessary given how everything is coming together. 

4. Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)?

Since last week, we have been able to measure the performance of our system on the AGX Xavier, and have chosen to pivot back to the Xavier, as we had originally planned in our proposal and Design Review. 

5. Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward? 

This change was necessary to more capably meet our latency requirements in the classification subsystem, where we were able to perform inferences 7x faster. This also improved the overall latency of the system.

6. Provide an updated schedule if changes have occurred. 

We are on schedule to present our solution at the final demo and make final measurements for our final report without making any schedule changes.

Jong Woo’s Status Report for 12/3/2022

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours): 

         This week, our team’s testing apparatus was built in TechSpark’s woodshop, and pre-processing has been fully implemented to its original scope. More specifically, further accuracy was attained through non_maximum_suppression, which locates the coordinates of individual dots and removes most of the pre-existing redundancies among the green dots drawn on top. Therefore, when an image is taken, it is properly pre-processed, cropped into 150–800 individual braille character images, and then handed over to the ML classification pipeline. 

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?:

             Pre-processing has been fully implemented to its original scope, so my progress is on schedule. 

What deliverables do you hope to complete in the next week?:

         Final modifications and tweaks will be made while preparing for the in-person demo. 

Jong Woo’s Status Report for 11/19/2022

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours): 

         Currently, I am primarily relying on the stats values (the left and top coordinates of individual braille dots, as well as the widths and heights of neighboring dots) from the cv2.connectedComponentsWithStats() function. I have checked the exact pixel locations in the output matrices against the original image and confirmed that the values are in fact accurate. My current dot redundancies come from an inherent flaw of connectedComponentsWithStats(), and I need to remove the redundant dots sporadically distributed in nearby locations using non_max_suppression. There is a small issue with it, and since I do not want to write the whole function myself, I am looking for ways to fix it; as long as this gets done, I am nearly done with the pre-processing procedures. Currently, I am looking more into how the NMS filter removes redundant detections, and I plan to write the NMS filter function myself if it cannot be debugged in time. I was not able to make progress today (Saturday) on the non_maximum_suppression filter, so further work is required tomorrow. 
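For reference, here is a minimal sketch of how the two pieces could fit together, assuming a binarized single-channel input; the IoU-based suppression and its 0.3 threshold are illustrative stand-ins for the pre-defined filter I am debugging, not our final parameters:

```python
import cv2

def box_iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def detect_dots(binary_img, iou_thresh=0.3):
    # stats rows are (left, top, width, height, area); row 0 is background.
    n, _, stats, centroids = cv2.connectedComponentsWithStats(binary_img)
    boxes = [(x, y, x + w, y + h, area)
             for x, y, w, h, area in stats[1:]]
    # Greedy NMS: keep the largest components, drop near-duplicates.
    boxes.sort(key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(box_iou(b[:4], k[:4]) < iou_thresh for k in kept):
            kept.append(b)
    return [k[:4] for k in kept]
```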

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?:

             The pre-defined non_max_suppression is not yet fully functional, and I am in the process of debugging it so that it properly removes redundant dots. However, once this is taken care of, the pre-processing subsystem is nearly done, so I am hoping to have it solved by around Thanksgiving. 

What deliverables do you hope to complete in the next week?:

         The current non_max_suppression filter will either be tweaked or rewritten to properly process the redundant dots sporadically present near the target locations. Furthermore, I have confirmed that the center coordinates of each braille dot, as well as their corresponding widths and heights, are accurate based on a direct pixel-by-pixel comparison with the original image, so cropping should be completed with relative ease as well. 

Team Status Report for 11/12/2022

  1. What are the most significant risks that could jeopardize the success of the project?
  • Pre-processing:
    • Currently, I am relying on OpenCV’s cv2.connectedComponentsWithStats() function, which outputs various statistical values for the original input image, including the left and top coordinates as well as the width, height, and area of the most commonly appearing object (braille dots, in our case). However, depending on the lighting or the quality of the original image, the accuracy of this stats function needs to be further tested in order to determine what modifications are required. 
  • Classification:
    • On the classification side, one new risk that was introduced when testing our neural network inference on the Jetson Nano was latency. Since each character has around a 0.1s latency, if we were to process characters sequentially, an especially long sentence could produce substantial latency.
  • Hardware:
    • The Jetson Nano hardware also presented some challenges due to its limited support as a legacy platform in the Jetson ecosystem. Missing drivers and slow package build times make bring-up particularly slow. This is, however, a one-time cost which should not have any significant impact on our final product.
  • Post-processing:
    • Another hardware related risk to our final product is the audio integration capabilities of the Nano. Since this is one of the last parts of integration, complications could be critical. 


2. How are these risks being managed? 


  • Pre-processing:
    • At the primary level, a pixel-by-pixel comparison between the image and the matrices printed to the terminal will be performed to understand the current accuracy level and to further tweak the parameters. Furthermore, cv’s non_max_suppression() function is being investigated to mitigate some of the inaccuracies that can arise from the initial connectedComponentsWithStats(). 
  • Classification:
    • To address possible latency issues resulting from per-character inference latency, we are hoping to convert our model from the mxnet framework to NVIDIA’s TensorRT, which the Jetson can use to run the model on a batch of images in parallel. This should reduce the sequential bottleneck that we are currently facing (see the batching sketch after this list).
  • Hardware:
    • Since hardware risks are a one-time cost, as mentioned above, we do not feel that we will need to take steps to manage them at this time. However, we are considering using a docker image to cross-compile larger packages for the Jetson on a more powerful system.
  • Post-processing:
    • After finishing camera integration, we will work on interacting with audio through the usb port. We have a stereo adapter ready to connect to headphones.
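To illustrate only the batching idea (this is a framework-agnostic sketch with a hypothetical model callable, not our actual mxnet or TensorRT code), the point is to replace the per-character loop with one forward pass over a stacked batch:

```python
import numpy as np

def classify_sequential(model, crops):
    # One inference per character: latency grows linearly (~0.1 s each).
    return [model(c[np.newaxis]) for c in crops]

def classify_batched(model, crops):
    # Stack all character crops into a single (N, H, W, C) batch and
    # run one forward pass, letting the GPU parallelize across N.
    batch = np.stack(crops)
    return model(batch)
```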

3. What contingency plans are ready? 

  • Classification:
    • If the inference time on the Jetson Nano is not significantly improved by moving to TensorRT, one contingency plan we have in place is to migrate back to the Jetson AGX Xavier, which has significantly more computing power. While this comes at the expense of portability and power efficiency, it is within the parameters of our original design.
  • Post-Processing:
    • There is a sound-board input/output PCB that we could attach to the Nano to play sound. This comes with added expense and complexity, but it seems more likely to prove effective. 

4. Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)?

Integrating each of our individual components into our overall software pipeline did not introduce any obvious challenges, so we did not think it necessary to make any significant changes to our software system. However, in response to interim demo feedback, we are looking to create more definitive testing metrics for deciding on certain algorithms or courses of action. This will allow us to justify our choices moving forward and give our final report clarity. In addition to the testing, we are considering a more unified interaction between classification and post-processing that helps create a more deterministic view of which characters are wrong more often. 

5. Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward? 

The minor changes that we are making to the individual subsystems are crucial for the efficiency and effectiveness of our product. They also ensure that we stay on top of the optimal decisions and the advice given by our professors and TAs. 

Jong Woo’s Status Report for 11/12/2022

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours): 

         The former half of this week was dedicated to the preparation and execution of the interim demo, as well as further debugging and parameter tweaking of the issues that were acknowledged. More specifically, I am relying on OpenCV’s cv2.connectedComponentsWithStats() function, which outputs various statistical values for the original input image, including the left and top coordinates as well as the width, height, and area of the most commonly appearing object (braille dots, in our case). However, depending on the lighting or the quality of the original image, the accuracy of this stats function needs to be further tested in order to determine what modifications are required. Therefore, I am currently performing a pixel-by-pixel comparison between the original image and the printed matrices containing the (x, y) coordinates of the common objects (braille dots), along with their corresponding widths and heights, in an attempt to understand the current accuracy level and identify the further modifications required for accurate acquisition of the core values. Furthermore, I am currently working on cv’s non_max_suppression() function to mitigate some of the inaccuracies that can arise from the initial connectedComponentsWithStats(). 
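As a minimal sketch of this verification step (drawing choices are illustrative), the reported boxes and centroids can be drawn back onto the original image so the printed values can be checked against the actual pixels:

```python
import cv2

def overlay_stats(original, binary):
    # Re-detect components and draw their boxes/centroids on a copy of
    # the original image for visual, pixel-level verification.
    n, _, stats, centroids = cv2.connectedComponentsWithStats(binary)
    out = original.copy()
    for i in range(1, n):                       # row 0 is the background
        x, y, w, h, _area = map(int, stats[i])
        cx, cy = centroids[i]
        cv2.rectangle(out, (x, y), (x + w, y + h), (0, 255, 0), 1)
        cv2.circle(out, (int(cx), int(cy)), 2, (0, 0, 255), -1)
        print(f"dot {i}: left={x} top={y} w={w} h={h}")
    return out
```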

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?:

             There was a bit of an unexpected delay due to my health after my fourth COVID booster shot, but on a general level progress is on schedule, and the upcoming week will be primarily focused on tweaking the parameters of cv’s connectedComponentsWithStats() and non_max_suppression() functions. 

What deliverables do you hope to complete in the next week?:

         Through the pixel-by-pixel comparison between the image and the matrices printed from the connectedComponentsWithStats() function, I hope to build an accuracy table for the current parameters so that they can be tuned to yield accurate (x, y) coordinates for the center of each braille dot, along with the corresponding widths and heights. The accuracy of these values is critical for the final step of pre-processing: cropping via numpy slicing.

Jong Woo’s Status Report for 11/5/2022

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours): 

This week, I put the finishing touches on the pre-processing filters that will be used to train our ML model. For thresholding, Otsu, median, and Gaussian thresholding with various threshold boundaries were investigated. Because the thresholded image is then eroded (to reduce small noise by shrinking the existing dots) and dilated (extended to fill in gaps and form more complete circles), the parameters were tweaked multiple times before being fed into the erosion and dilation process, and the individual results were visually compared to choose the better pre-processing output. For now, a Gaussian adaptive threshold with boundary parameters of 21 and 4 exhibits the best preliminary thresholding. Below is an image of the various thresholding parameters and their corresponding result images. 
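As a minimal sketch of the pipeline being compared, assuming a grayscale input (the 21 and 4 mirror the parameters above; the kernel size and iteration counts are illustrative, not our tuned values):

```python
import cv2

def preprocess(gray):
    # Gaussian adaptive threshold with blockSize=21 and offset C=4,
    # matching the boundary parameters discussed above.
    thresh = cv2.adaptiveThreshold(gray, 255,
                                   cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY_INV, 21, 4)
    # Erode to shrink away small noise, then dilate to grow the dots
    # back out into more complete circles.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    eroded = cv2.erode(thresh, kernel, iterations=1)
    return cv2.dilate(eroded, kernel, iterations=1)
```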


Similarly, Canny edge filters, erosion, and dilation were all tested with various parameters to reach reasonable pre-processing results. Below are the code and the corresponding comparison image, which also includes the original braille image (img) as well as the final processed images (final1, final2). 

         Furthermore, the camera was integrated this week, and due to resolution and lighting variations, the masking filters will need to be tweaked accordingly to continuously produce reasonable pre-processing results. Below are the initially captured images with varying color contrast. 

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?:

             Progress is on schedule, and the upcoming week will be focused primarily on finishing vertical and horizontal segmentations that would lead into final cropping. 

What deliverables do you hope to complete in the next week?:

I hope to refine and finish the current horizontal segmentation, finish the remaining vertical segmentation, and lead into cropping.

Jong Woo’s Status Report for 10/29/2022

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours): 

         This past week, I focused on further applying the Canny edge detection filter and using non-maximum suppression to draw circles on top of the pre-existing braille characters. Canny edge detection, supported by OpenCV, allows various structural information to be extracted from an image while dramatically reducing the amount of data to be processed. The result of the Canny edge filter is then used for non-maximum suppression, which selects a single entity (dot) out of many overlapping entities and draws an individual colored circle on top. 
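A minimal sketch of the edge-extraction step, assuming a hypothetical input file braille.jpg (the hysteresis thresholds are illustrative, not our tuned values):

```python
import cv2

# Load the braille image as grayscale and extract edges. The two
# thresholds are Canny's hysteresis bounds.
img = cv2.imread("braille.jpg", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)
cv2.imwrite("braille_edges.jpg", edges)
```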

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?:

             Progress is on schedule, and the upcoming week will be primarily focused on the group work regarding camera integration. 

What deliverables do you hope to complete in the next week?:

        Camera integration is the primary goal to be completed by next week. It is essential for the upcoming interim demo on Nov 16th, and I believe that as long as camera integration is completed in time, our team will have a tangible deliverable by the interim demo date. 

Jong Woo’s Status Report for 10/22/2022

Note: This weekly status report covers any work performed during the week of 10/15 as well as Fall Break

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours): 

         The week before the mid-semester break was primarily invested in crafting and reviewing the design report. Our team was able to efficiently divide the workload for the design report based on the subsystems we were individually working on, and we performed a thorough and considered revision to finalize it.

          I spent some time this week looking into how the vertically and horizontally segmented images would be cropped and stored in a folder as separate image files. Since OpenCV has no specific function for cropping, the numpy array slicing method will be adopted: every pre-processed image file can be read in and stored as a 2D array per color channel, so pixel ranges for the height and width of an area can be specified to crop specific regions of the image, as sketched below. 
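A minimal sketch of the idea (function and directory names are illustrative; the boxes would come from the earlier dot-detection step):

```python
import os
import cv2

def crop_regions(image, boxes, out_dir="crops"):
    # Each box is (x1, y1, x2, y2) in pixel coordinates. Numpy slicing
    # indexes rows (height) first, then columns (width).
    os.makedirs(out_dir, exist_ok=True)
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        crop = image[y1:y2, x1:x2]
        cv2.imwrite(f"{out_dir}/char_{i:04d}.jpg", crop)
```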

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?:

             Progress is on schedule, and given that this past week was the mid-semester break, our team will focus on getting things rolling again and combining our parallelized work in preparation for the approaching interim demo.

What deliverables do you hope to complete in the next week?:

                  The Canny edge filters, along with the non-maximum suppression method for the final edge contrast, will be implemented. 

Jong Woo’s Status Report for 10/08/2022


  • What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

          The first half of this week was spent on the preparation and rehearsal of the Design Presentation, as I was the presenter for team B1-Awareables.

          The latter half of this week was primarily invested in further research on segmentation and a partial implementation of horizontal segmentation. After I obtain the region of interest (ROI) from the initial pre-processing, the next step is vertical and horizontal segmentation: dividing the original ROI into vertical and horizontal crop segments in order to produce the final cropped images of individual braille characters. 

            After some research, I found that the approaches for horizontal and vertical segmentation should differ slightly. More specifically, for horizontal segmentation there are two approach options: 1) using Hough transform lines, or 2) manually grouping individual rows by label, assigning labels to each row based on whether they follow each other (i.e., diff > 1), and then finding the mean row value associated with each label to draw each row line for horizontal segmentation. For vertical segmentation specifically, since the spacing between dots depends on the letters following each other, an approach similar to 2) will have to be adapted using the Hough transformation, which will be explored further in the coming weeks. A sketch of approach 2) appears below. 
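A minimal sketch of approach 2), assuming a binary image in which braille dots are nonzero (the grouping mirrors the diff > 1 rule described above):

```python
import numpy as np

def horizontal_boundaries(binary_img):
    # Indices of rows that contain at least one dot pixel.
    rows = np.where(binary_img.any(axis=1))[0]
    if rows.size == 0:
        return []
    # Split wherever consecutive occupied rows are more than 1 apart,
    # so each group is one horizontal band of dots.
    groups = np.split(rows, np.where(np.diff(rows) > 1)[0] + 1)
    # The mean row of each group marks the line for that dot row.
    return [int(g.mean()) for g in groups]
```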

For the horizontal segmentation, this is the current result: 

Next steps would be to crop individual rows of the ROI and save them as individual segmented rows, which will then be further vertically segmented using Hough transforms:

EX) [images: individual segmented rows of the ROI]

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

       Things that were due this week were 1) starting work on vertical and horizontal segmentation of the currently pre-processed images to attain cropped individual JPEGs of braille alphabets, and 2) research on non-max suppression methods. This week I made some progress with horizontal segmentation and studied in depth how non-max suppression would be applied to our final filter. All goals were met, and my progress is currently on schedule. 

  • What deliverables do you hope to complete in the next week?

         By the end of next week I plan to accomplish the following: 1) keep working on vertical and horizontal segmentation of the currently pre-processed images to attain cropped individual JPEGs of braille alphabets, and 2) further research and apply non-max suppression filters.