Josh’s Status Report for 4/27/2024

Accomplishment (Updated on 04/29)

  • Prepared for the final presentation
  • Worked with Shakthi to deploy the headless device configuration by changing Ubuntu’s startup application, so that the Jetson Nano automatically runs main.py, which combines the OR model with the speech module and button integration. 
  • During the process, while trying out different permission-change commands, the speech module broke. As a result, we had to reboot the Jetson and reinstall all the programs needed to run the speech and OR modules.
  • I reinstalled Python 3.8.0 on the Jetson along with OpenCV 4.8.0 built with the GStreamer option enabled to allow video streaming. The same swap-memory technique as before was used to get through the huge OpenCV build, whose build folder was about 8 GB. 
  • (Update) Reduced the data latency of the OR model to an average of 1.88s by using multithreading to run the OR model concurrently with the GStreamer video capture in OpenCV (see the sketch below). The OR model detects the closest object in one frame while the camera thread concurrently updates that frame. Strictly speaking this is a race condition, since multiple threads access the same global variable, but a read can at worst return the previous frame, which is at most one frame behind real time. That satisfies our use case requirement and is not noticeable to the user, so it does not affect the navigation experience. For this reason, we decided not to guard the global variable with a mutex or other lock, which could create a bottleneck and increase the data-transfer latency.
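A minimal sketch of this pattern, assuming a CSI-camera GStreamer pipeline string and a placeholder run_or_model() for our detector (the real main.py differs in its details):

```python
import threading
import cv2

# Hypothetical pipeline string; the exact caps depend on the camera sensor.
GST_PIPELINE = (
    "nvarguscamerasrc ! video/x-raw(memory:NVMM),width=1280,height=720,"
    "framerate=30/1 ! nvvidconv ! video/x-raw,format=BGRx ! "
    "videoconvert ! video/x-raw,format=BGR ! appsink"
)

latest_frame = None  # shared between the two threads, intentionally unlocked

def run_or_model(frame):
    return None  # placeholder for the YOLOv5 + distance-estimation inference

def capture_loop():
    global latest_frame
    cap = cv2.VideoCapture(GST_PIPELINE, cv2.CAP_GSTREAMER)
    while cap.isOpened():
        ok, frame = cap.read()
        if ok:
            # Rebinding one name is atomic in CPython, so the reader sees
            # either the old frame or the new one, never a torn value.
            latest_frame = frame

threading.Thread(target=capture_loop, daemon=True).start()

while True:
    frame = latest_frame
    if frame is None:
        continue  # camera not warmed up yet
    closest = run_or_model(frame)  # at worst one frame behind real time
```

Skipping the lock is a deliberate trade: the worst-case staleness is one frame, while a mutex shared with a 30 fps capture thread could stall inference.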

Progress

  • Fixed the speech module by rebooting the Jetson to a clean default setting. 
  • The audio error on the Jetson, and the rebooting and reinstalling it required, hindered our work on the headless deployment. We are behind schedule on this step and will run unit tests once the deployment is finished. 

Projected Deliverables

  • By next week, we will finish deploying the Jetson headless, so that we can test out the OR model by walking around an indoor environment. 
  • By next week, we will conduct user testing of the overall device functionality.

Josh’s Status Report for 4/20/2024

Accomplishment:

  • Enabled the GStreamer option for opencv-python on the Jetson to allow real-time capture. The stock OpenCV 4.8.0 package did not have GStreamer enabled, so a manual build of OpenCV with that option was necessary. Because the OpenCV build folder is very large, around 8 GB, I used swap memory on the Jetson to temporarily increase the available memory. The build and install were run after the download.
  • Worked on integrating the OR module, speech module, and proximity module on the NVIDIA Jetson alongside Shakthi. The speech and proximity modules were integrated within the OR model's loop, so that for each frame the Jetson identifies which button is pressed and which object is detected to produce the desired output (see the sketch after this list). 
  • Added “cat” and “cellphone” to the indoor object options in the OR model and DE feature. 
  • Tested the OR model with a test file I created. It stores the detected results alongside the actual objects and compares them to yield accuracy data. I tested 40 images: 6 cat, 6 cellphone, 10 chair, 6 couch, and 12 person images. The model correctly detected the closest object in 38 of them, an accuracy of 95%. The failures were caused by several objects overlapping in one image; for example, the model misidentified the closest object in an image with a cat directly beneath a person. 
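A minimal sketch of that per-frame loop, assuming Jetson.GPIO for the buttons and placeholder helpers for the camera, detector, vibration, and speech calls (pin numbers and polarity are illustrative, not our actual wiring):

```python
import Jetson.GPIO as GPIO

BUTTON_A, BUTTON_B = 18, 22  # illustrative BOARD pin numbers

GPIO.setmode(GPIO.BOARD)
GPIO.setup(BUTTON_A, GPIO.IN)
GPIO.setup(BUTTON_B, GPIO.IN)

def get_frame():
    return None          # placeholder for the GStreamer capture

def detect_closest(frame):
    return "chair", 1.2  # placeholder for the OR model + DE call

def vibrate(distance_m):
    pass                 # placeholder for the proximity/vibration module

def speak(text):
    pass                 # placeholder for the speech module

try:
    while True:
        frame = get_frame()
        label, distance = detect_closest(frame)
        if GPIO.input(BUTTON_A) == GPIO.HIGH:  # button A pressed
            vibrate(distance)
        if GPIO.input(BUTTON_B) == GPIO.HIGH:  # button B pressed
            speak(f"{label}, {distance:.1f} meters")
finally:
    GPIO.cleanup()
```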

  • Conducted unit testing on the buttons and speech module integrated with the OR model. I pressed button A for the vibration module and button B for the speech module in sequence to test the functionality. Both modules had 100% accuracy.
  • Performed distance estimation testing under four conditions for detecting a person: standing about 1.8m from the camera, about 1.2m, up close at about 0.2m, and about 2.2m. 
    • The results: a true distance of 1.8m was measured as 1.82m, 1.2m as 0.89m, 0.2m as 0.38m, and 2.2m as 1.94m, an average uncertainty of 21.5%. Since the DE feature works from reference images, some calibration is required. We will conduct more tests to find the calibration that yields the most accurate distances; a sketch of the method follows. 
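The calibration follows the triangle-similarity method from the open-source DE project we referenced; a minimal sketch with illustrative numbers (our actual reference widths and distances differ):

```python
# Calibration: a reference photo of an object of known real width, taken at
# a known distance, gives the camera's effective focal length in pixels.
def focal_length_px(ref_width_px, known_distance_m, known_width_m):
    return ref_width_px * known_distance_m / known_width_m

# Estimation: with f fixed, distance follows from the detection's
# bounding-box width in pixels.
def estimate_distance_m(f_px, known_width_m, width_px):
    return known_width_m * f_px / width_px

# Illustrative: a 0.45 m wide chair photographed at 1.0 m spanning 300 px
# gives f ~ 667 px; a new detection spanning 150 px is then ~2.0 m away.
f = focal_length_px(300, 1.0, 0.45)
print(estimate_distance_m(f, 0.45, 150))  # 2.0
```

Retaking the reference photos with the actual test-environment objects pins known_width_m to the true widths, which is exactly the recalibration described above.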

Progress:

  • I made progress on implementing the OR model on the Jetson Nano and enabling the camera to send real-time data to the model for object detection. 
  • We need to work on making the device headless, so that it can run without a monitor and Wi-Fi. 
  • While moving the device to headless operation, the speech module broke, so we will need to work on that module again. 

Projected Deliverables:

  • By next week, we will finish deploying the Jetson headless, so that we can test out the OR model by walking around an indoor environment. 
  • By next week, we will conduct more testing on the Jetson OR to find the most accurate calibration for the distance of the closest object. 
  • By next week, we will integrate the speech module again. 

As you’ve designed, implemented and debugged your project, what new tools or new knowledge did you find it necessary to learn to be able to accomplish these tasks? What learning strategies did you use to acquire this new knowledge?

I learned a lot about machine learning frameworks and techniques by integrating an OR model and developing a distance estimation feature. I learned how to train an OR model on my own dataset with PyTorch, modify training parameters such as the number of epochs to yield different training weights, and display and compare detection results with TensorBoard. To acquire this knowledge, I spent a lot of time reading research papers, navigating GitHub communities, and scanning many tutorials. It was very challenging to find online resources describing my exact issue because every user's system is configured differently. I also realized how much a post's recency matters: the technology evolves rapidly, and in many cases an issue traced back to outdated sources. 

Furthermore, I gained experience deploying modules on the Jetson Nano. I learned a new skill, swap memory, which allowed me to temporarily increase the Jetson's memory when I needed to build or import a huge module such as OpenCV. I also realized how difficult it is to work with hardware modules and why we need to leave sufficient slack time toward the end of a project. For example, the OR model's detection rate was much slower on the Jetson than on my computer; if I had not spent the slack time adjusting the model's weights, I would not have been able to deploy the module and produce detection results with acceptable latency. Through multiple occasions where a deployed model did not behave as expected, I acquired this learning strategy.

Josh’s Status Update for 4/6/2024

Accomplishment:

  • Implemented a Python program that tests the OR model + DE feature against test images. It retrieves the closest object detected in each image and verifies accuracy against the image filename. For example, if the filename is “person_test5.jpg”, the actual closest object in the image is a person; the program extracts “person” from the filename and compares it with the detected closest object (see the sketch after this list). 
  • The program was run against 8 chair, 6 couch, and 5 person images. The result was 100% accuracy.
  • Started working on deploying the OR module to the Jetson. I transferred the Python files and reference images from my computer to the Jetson.  
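A minimal sketch of that test program, with detect_closest() standing in for the OR model + DE call (the filename convention is the one described above):

```python
import glob
import os

def detect_closest(path):
    # Placeholder for running the OR model + DE feature on one image and
    # returning the label of the closest detected object.
    return "person"

paths = glob.glob("test_images/*.jpg")
correct = sum(
    detect_closest(p) == os.path.basename(p).split("_")[0]  # "person_test5.jpg" -> "person"
    for p in paths
)
if paths:
    print(f"accuracy: {correct}/{len(paths)} = {correct / len(paths):.0%}")
```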

Progress:

I failed to meet the schedule due to the system settings of the Nvidia Jetson. Importing the torch module on the Jetson is taking longer than expected because of unexpected errors, so the schedule is pushed back a few days. Installing the appropriate modules on the Jetson is a critical component of the project, so I will make it the highest priority and resolve the issue as quickly as possible. 

Projected Deliverables:

By next week, I will finish deploying the OR model to Jetson, so that we can start testing the interaction between several subsystems. 

Now that you have some portions of your project built, and entering into the verification and validation phase of your project, provide a comprehensive update on what tests you have run or are planning to run. In particular, how will you analyze the anticipated measured results to verify your contribution to the project meets the engineering design requirements or the use case requirements?

I have implemented a Python program that tests the OR model + DE feature against test images. It retrieves the closest object detected in each image and verifies accuracy against the image filename. For example, if the filename is “person_test5.jpg”, the actual closest object is a person; the program extracts “person” from the filename and compares it with the detected closest object. The program was run against 8 chair, 6 couch, and 5 person images and was 100% accurate, far exceeding the use case requirement of 70% accuracy. If time permits, I plan to include more indoor objects, so that the model can cover a wider range of objects while maintaining high accuracy. 

After the OR model is deployed to the Jetson, I plan to use the same test file to run tests on images taken from the Jetson camera and produce an accuracy report. In this case, since the images are sent to the model in real time from the Jetson, we cannot name the files after the actual objects. Instead, I will use the live output of detected closest objects from the Jetson and manually check whether each detection is accurate.

Team Status Report for 3/30/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

  • This week we connected the camera module to the Jetson and captured a few images. The camera lens causes slight distortion, and the images are lower resolution than those from the laptop camera we have been using to test the OR model. The associated risk is lower model accuracy, which we may have to mitigate by preprocessing the images before sending them to the OR model.
  • The accuracy of the DE for a detected object is a risk we are currently facing. Although we can correctly determine which object is closer to the camera, the numerical distance in meters is inaccurate. This is due to the difference in width between the chair in the lab and the chair in the reference image. The inaccuracy barely affects the relative ordering the model outputs, but it is an undeniable factor in the accuracy of the DE feature. We plan to mitigate this risk by taking reference images of the objects that will be used in the test environment. That way, the widths of the respective indoor objects match (i.e., all chairs have the same width, all sofas have the same width, etc.). 

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

There was one change to the design of the system: the OR model will use Yolov5 instead of Yolov4. This change increases the accuracy and improves the data latency of the model. Although programming the DE feature took more development time than planned, the result will be better for it. 

Provide an updated schedule if changes have occurred.

A few more relevant objects will be added to the DE feature. Deployment of the OR model on the Jetson will begin next week. 

Here is our updated Gantt chart as of the interim demo (4/1/2024):

 

Gantt Chart – Timeline 1


Josh’s Status Update for 3/30/2024

Accomplishment: 

For this week, I successfully implemented the Yolov5 OR model + DE feature. I used classes for easier handling of the reference images and filtered the detections so that the model only outputs selected indoor objects, such as a couch, person, mobile phone, and chair (a sketch of the filtering step appears below). I took several reference images with my laptop camera from a known distance and compared them pairwise with images from online sources to verify that the OR model recognizes specific indoor objects and outputs their relative distances (which object is closer to the camera). After several successful outputs, I ran an image captured by the Jetson camera through the model. The image and the model's output follow. 

As shown in the image, the model successfully outputs several chairs and their distances from the camera. However, although the ordering of the distances makes sense, the numerical estimates are too high. This is because the chair in the reference image has a different width than the chair in the test image. To resolve this, I plan to take reference images of the objects that will be used in the test environment to increase the precision of the DE feature. 
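A sketch of the class-filtering step, using the Ultralytics torch.hub interface with the small pretrained weights as an example (our code wraps this in classes alongside the reference images):

```python
import torch

# COCO class names we treat as relevant indoor objects.
INDOOR = {"couch", "person", "cell phone", "chair"}

model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # pretrained on COCO

def indoor_detections(frame):
    results = model(frame)
    # columns: xmin, ymin, xmax, ymax, confidence, class, name
    df = results.pandas().xyxy[0]
    df = df[df["name"].isin(INDOOR)].copy()
    df["width_px"] = df["xmax"] - df["xmin"]  # bounding-box width feeds the DE formula
    return df
```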

Progress

I have successfully added relevant objects (couch, chair) to the DE feature and done some testing. However, it is also important to test the OR model with more images from the Jetson camera to ensure accuracy, so I will need to do more testing with images and videos taken from the Jetson camera. 

Projected Deliverables

For next week, I will finish deploying the OR model to the Jetson. At the same time, I will include more relevant objects, such as a table, to ensure sufficient coverage of indoor objects. The test results with the Jetson camera will be documented for the final report.

Josh Joung’s Status Report for 03/23/2024

Accomplishment: 

I worked on integrating a DE feature into Yolov9, but it was unsuccessful. I noticed that recent versions of the OR model restrict our ability to manipulate the detection output, which prevents us from using that data to estimate distance. 

Therefore, I decided to step back to Yolov4 with the DE feature, considering the project's time constraints. The current open-source library only provides the DE feature for a person and a mobile phone. To test my understanding of the source, I added several reference images and measured the width of a sofa to estimate its distance. As shown in the screenshot of the running model, the distance is displayed along with the detected object. 

The accuracy of the distance will be improved by taking each reference image at a constant, known distance between the object and the camera. 

Progress

Because we are stepping back to Yolov4 with the DE feature, I am catching up on progress. I still need to add reference images of several more indoor objects to test the DE feature. 

Projected Deliverables

By next week, I will add bench, cat, dog, backpack, handbag, suitcase, bottle, wine glass, cup, fork, knife, spoon, bowl, chair, sofa, potted plant, bed, dining table, tv monitor, laptop, mouse, keyboard, cell phone, microwave, oven, toaster, sink, refrigerator, book, and scissors to the DE feature. I will also finish the testing and begin deploying the model to the Nvidia Jetson. It will be a long week of preparing for the interim demo and ensuring that the model runs well on the Jetson. 

Josh’s Status Report for 3/16/2024

Accomplishment:

For this week, I worked on training Yolov9 with our own indoor-object dataset and analyzed the training results. I also started implementing a DE feature. During the process, I encountered several points of consideration. First, the pre-trained Yolov9 model identifies indoor objects very well, even better than the model trained on our own dataset. 

This is the image of all objects identified by yolov9-e.pt, the weights with the highest precision among the Yolov9 checkpoints. The image was chosen randomly from the Internet for a simple test, and the OR model successfully identifies most indoor objects. 

The following screenshot shows the output of Yolov9 using a laptop camera. As seen in the screenshot, the model successfully recognizes most indoor objects. 

On the other hand, the trained model needs some adjustments to the training method, because it shows decreasing precision and increasing validation loss when trained for too long with high epoch counts. An ideal graph should look like the following:

The left 6 boxes show decreasing losses and the right 4 boxes show increasing precision. However, our training result is as follows:

Evidently, the box_loss and class_loss on the validation dataset increased, and the precision in the right 4 boxes decreased, after around 22 epochs. 
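This is the classic overfitting signature, and the usual remedy is to stop training once the validation loss stops improving; recent Yolo trainers expose this as a patience setting. The mechanism is simple enough to sketch generically, with train_one_epoch and validate as stand-ins for the framework's own loops:

```python
def fit(train_one_epoch, validate, max_epochs=100, patience=10):
    """Stop once validation loss has not improved for `patience` epochs."""
    best_loss, best_epoch = float("inf"), -1
    for epoch in range(max_epochs):
        train_one_epoch()
        val_loss = validate()
        if val_loss < best_loss:
            best_loss, best_epoch = val_loss, epoch  # checkpoint would go here
        elif epoch - best_epoch >= patience:
            print(f"stopping at epoch {epoch}; best was epoch {best_epoch}")
            break
```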

The image above shows the confusion matrix for the trained model. Based on the matrix, chair, keyboard, table, trash bin, and tv monitor show relatively high precision, while book, bottle, cup, laptop, and window show relatively low precision. Because the likely obstacles are most commonly chairs, tables, or trash bins, the trained model shows the desired behavior. 

While implementing the DE feature, I found that the open-source Yolov4 + DE project, which I plan to use as a reference, uses a reference image taken at a known distance as its tool for estimating the distance of a specific object. In that project, pictures of a human and a cell phone taken at known distances are used to estimate the distance to a human or cellphone. I will extend this method to common indoor objects, such as a chair, table, door, TV, etc. However, a potential risk is that, because the model estimates distance based on the reference image, the actual distance may be incorrect. Furthermore, the sizes of indoor objects usually differ across indoor settings, which may yield inaccurate estimates. 
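That risk can be quantified: with triangle similarity, the estimated distance scales linearly with the assumed real-world width, since d_est = (W_assumed × f) / w_pixels = (W_assumed / W_actual) × d_actual. As an illustrative example, a chair assumed to be 45 cm wide that is actually 55 cm wide would read about 18% closer than it really is.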

Progress

I reached the milestone of training Yolov9 with our own dataset, though some adjustment will be needed to raise the accuracy. However, I failed to integrate the DE feature into Yolov9 this week. The implementation is expected to take longer because reference images must be collected and reference distances determined. 

Projected Deliverables

By next week, I expect to finish re-training the model with our dataset and to test the OR model with both the pre-trained and the re-trained weights to determine which to use. I will also finish implementing the DE feature, so that we can start integrating the components together on the Nvidia Jetson. 

Josh’s Status Report for 3/9/2024

Accomplishment: 

- Compared the specifications of Yolov5 and Yolov8, the most recent version of the OR model, and decided to continue working with Yolov5. Yolov5 is built directly on the PyTorch framework, which makes it easy to modify and extend with a distance estimation feature. 

- Forked the Yolov5 GitHub repository so that our Google Colab can use our version of the OR model to train a dataset. There was an issue with np.int, so I changed it to the builtin int to avoid errors (see the snippet below). 
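For context, NumPy 1.24 removed the long-deprecated np.int alias, which older Yolov5 code still used; the fix is a mechanical replacement (illustrative line, the exact occurrences vary):

```python
import numpy as np

boxes = np.array([1.9, 2.7])
# indices = boxes.astype(np.int)  # fails on NumPy >= 1.24 with
#   AttributeError: module 'numpy' has no attribute 'int'
indices = boxes.astype(int)       # the builtin int is the drop-in fix
```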

- Currently looking into an open-source Yolov9 + distance estimation project to upgrade the OR model version. It would increase the accuracy and reduce the latency while also reducing the development time needed to add a distance estimation feature. 

Progress: 

I am currently working on integrating Yolov5 with the distance estimation feature, so I am behind schedule. However, since I have found an open-source Yolov9 + distance estimation project, I will be back on schedule and able to test the OR model by 03/16. To do so, I will need to retrain the Yolov9 model with an indoor-object dataset, which will be done by 03/11. The data processor will be implemented by 03/15, leaving a day for testing the OR module. 

Projected Deliverables: 

By 03/11, I will finish retraining the Yolov9 + distance estimation model. Then, by 03/15, I will finish implementing the data processor that outputs the closest object, so that testing can be done by 03/16.

Josh’s Status Report for 2/24/2024

Accomplishment: This week, I trained the object recognition model on an indoor-object dataset I found on the Internet. I created a Google Colab notebook so that my teammates can also run or view the training and download the trained model after running all the steps. The model changed from Yolov4 to Yolov5 because of a dependency issue in Yolov4: it uses Darknet, which is now outdated and no longer maintained by its development team, so I switched to Yolov5, which is developed by Ultralytics. This version is backed by a team with strong maintenance practices, so the upgrade may be a huge advantage for our product.

However, it also means that I have to port the distance estimation feature from Yolov4 to the Yolov5 model, which may take some development time.

I also collected an indoor-object dataset from the Internet. It has 558 images for training, 52 images for validation, and 30 images for testing. 

The challenge in training the OR model is that the dataset must also be annotated, so collecting a larger dataset may take more time. My plan is to search for more data online or perhaps use tools for annotating the images. I have found that Kaggle has some useful datasets, so I will look there for more data.

Progress: I am on schedule because I have already collected an indoor-object dataset and trained the model with it. However, I plan to collect more data to increase accuracy, and I need time to add a distance estimation feature to the Yolov5 model. Before spring break ends, I plan to finish training the OR model with the distance estimation feature.

Projected Deliverables: By next week, I will collect around 500 more annotated images and train the model with them. I will also begin implementing the distance estimation feature for the Yolov5 model by cross-referencing the original Yolov4 + distance estimation source.

Josh’s Status Report for 2/17/2024

Accomplishment: 

This week, I downloaded and tested several object recognition models with distance estimation features to see how much implementation and add-on work is needed to make a viable model. The two models I tested first are both based on the Yolov4 object recognition module. Their ML techniques are similar, and while going through the code I found room to adapt the model into a viable one for our project.

During testing, I also realized that their pre-trained models detect only a few objects (human, cellphone, etc.) that are largely irrelevant to indoor scenes. Therefore, I plan to find a new dataset and train the model myself. I then tested the Yolov7 model by installing NVIDIA CUDA and checking whether the model accurately detects a specific object it was trained on. Comparing the accuracies reported in the Yolov4 and Yolov7 research papers, I determined that Yolov4 with distance estimation is sufficient for our object recognition model. 

I also updated the software flow diagram, adding more specific details about the inputs, outputs, and features of the data processor. 

Progress:

I am on track with testing object recognition models, but because I added a few more objectives around training the chosen model on a suitable indoor-object dataset, the planned date for testing the finalized object recognition model has been pushed back a week. 

Projected Deliverables:

By next Tuesday, I will finish searching for a dataset of common indoor objects. If time permits, I will include some partial images of the respective objects to account for identifying an object even when it is too close. By February 26th, I will finish training the model on the dataset collected by Tuesday.