Team Status Report for 3/9/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The major risks remain the same as last week: the weight of the device, the PCB connection between Jetson and peripherals, and the identification of partial frames of objects.

A new risk that could jeopardize the success of the project is the longevity of support for the Yolov5 model. Although it is actively supported and updated by Ultralytics, we cannot guarantee that this version will still be supported in a few years. To mitigate this risk, we plan to upgrade to the Yolov9 model, the most recent version (still under active development). The reason for this approach is that we have also found an open-source project that has already integrated a distance estimation feature into the OR model. Therefore, we can reduce development time and focus on training the model and creating a data processor to manage its output. If this development runs into issues because the project is still being actively deployed by its developers, we plan to stick with the Yolov5 model and meet the MVP.
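For reference, the Yolov5 fallback path is low-effort because the model loads directly through PyTorch Hub. A minimal sketch, assuming placeholder file names ("indoor_best.pt" and "room.jpg" are ours, not part of the Ultralytics API):

```python
import torch

# Load a pretrained Yolov5 model through PyTorch Hub (the documented
# Ultralytics entry point). Our retrained indoor-object weights would
# be loaded with the "custom" entry point instead.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
# model = torch.hub.load("ultralytics/yolov5", "custom", path="indoor_best.pt")

results = model("room.jpg")      # inference on a single image
print(results.pandas().xyxy[0])  # boxes, confidence, class names
```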

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

One change has been made to the existing design of the system: the upgrade from Yolov5 to Yolov9. This change is necessary to raise the accuracy of object recognition and to mitigate the risk of the module losing support in the future. It also reduces the development time of integrating a distance estimation feature, since we can refer to an open-source project that combines this model with that feature.

Provide an updated schedule if changes have occurred.

By 03/11, retraining of the Yolov9 + Dist. Est. model will be complete. Then, by 03/15, the data processor will be implemented, so that testing can be done by 03/16.

Please write a paragraph or two describing how the product solution you are designing will meet a specified need…

Part A (written by Josh): … with consideration of global factors.

The product solution focuses on its usefulness in a global setting. Our navigation aid device is designed to be easily worn with a simple neck-wearable structure. There are only two buttons on the device to control all the alert and object recognition settings, so visually impaired people can use the device without any technological expertise or visual ability. The only requirement is learning the function of each button, and we delegate those instructions to the user's helper.

Another global factor considered is that the device outputs results in English. Because English is the most widely used language, the product can serve not only users from the United States but also anyone who knows the English terms for common indoor objects.

Part B (written by Shakthi): … with consideration of cultural factors. 

The product solution addresses cultural factors by covering the indoor items common to many cultures. That is, the design accounts for items like a sofa, table, chair, trash bin, and shelf, which can be found in most indoor settings. Furthermore, as mentioned in Part A, English is used to identify the items, so cultures with English as a first or secondary language can easily use the device.

Most importantly, the device aims to positively influence the community of visually impaired people. Its goal is to give them the confidence to move around indoor settings without safety concerns. Based on interviews with several blind people, the design addresses the common struggles and challenges they face in daily life. We hope that our product can strengthen the relationship between people with visual needs and those without.

Part C (written by Meera): … with consideration of environmental factors. 

The product solution considers environmental factors by helping users dispose of waste properly. A trash bin is one of the indoor objects in the dataset, so users can know when a bin is in front of them. This design encourages visually impaired people to put trash into the bin.

Furthermore, this navigation device uses a rechargeable battery, which reduces the waste generated over the product's lifetime. In addition, we are connecting the sensor, vibration motor, and logic level converter to the PCB using headers and jumper wires instead of soldering them onto the PCB, so that we can reuse them if the PCB needs to be redesigned. We are avoiding disposable components as much as possible to reduce harm to the environment.

Josh’s Status Report for 3/9/2024

Accomplishment: 

-Compared the specifications of Yolov5 and Yolov8, the most recent version of the OR model, and decided to continue working on Yolov5. Yolov5 is built directly on the PyTorch framework, which makes it easy to implement and adjust in order to add a distance estimation feature.

-Forked the Yolov5 GitHub repository so that our Google Colab notebook can use our version of the OR model to train on a dataset. There was an issue with the deprecated np.int alias, so it was changed to int to avoid errors (see the sketch after this list).

-Currently looking into an open-source Yolov9 + Dist. Est. project to upgrade the OR model. It should increase accuracy and reduce latency while also cutting the development time needed to add a Dist. Est. feature.
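On the np.int issue above: NumPy 1.24 removed the long-deprecated np.int alias, which older Yolov5 code still referenced. A minimal sketch of the kind of one-word fix involved (the actual variable and file names in the Yolov5 codebase differ):

```python
import numpy as np

# Before (raises AttributeError on NumPy >= 1.24, where the deprecated
# np.int alias was removed):
#   grid = np.zeros((3, 4), dtype=np.int)

# After: use the builtin int (or an explicit np.int32 / np.int64):
grid = np.zeros((3, 4), dtype=int)
print(grid.dtype)  # platform-default integer, e.g. int64
```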

Progress: 

I am still working on integrating the Dist. Est. feature into Yolov5, so I am behind schedule. However, since I have found an open-source Yolov9 + Dist. Est. project, I expect to be back on schedule and able to test the OR model by 03/16. To do so, I will need to retrain the Yolov9 model on an indoor-object dataset; this will be done by 03/11. The data processor will be implemented by 03/15, leaving a day for testing the OR module.

Projected Deliverables: 

By 03/11, I will finish retraining the Yolov9 + Dist. Est. model. Then, by 03/15, I will finish implementing the data processor that outputs the closest detected object, so that testing can be done by 03/16.
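A minimal sketch of what that data processor might look like, assuming each detection arrives as a (label, confidence, distance) record; the actual output format of the Yolov9 + Dist. Est. fork may differ:

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class Detection:
    label: str         # recognized object class, e.g. "chair"
    confidence: float  # model confidence in [0, 1]
    distance_m: float  # estimated distance in meters

def closest_object(
    detections: Iterable[Detection], min_confidence: float = 0.5
) -> Optional[Detection]:
    """Return the nearest sufficiently confident detection, or None."""
    candidates = [d for d in detections if d.confidence >= min_confidence]
    return min(candidates, key=lambda d: d.distance_m, default=None)

# Example: the chair at 1.2 m wins over the table at 2.5 m.
frame = [Detection("table", 0.81, 2.5), Detection("chair", 0.77, 1.2)]
print(closest_object(frame))
```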

Team Status Report for 02/24/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The major risks remain the same as last week: the weight of the device, the PCB connection between Jetson and peripherals, and the identification of partial frames of objects.

A new risk that could jeopardize the success of the project is the dependency on the object recognition model. We have realized that training the Yolov4 model with our own dataset is no longer feasible because darknet, the framework that Yolov4 is built on, is no longer maintained. Therefore, we have changed our plan and upgraded the model to Yolov5, which is more recent than Yolov4 and is maintained by a more reliable team, Ultralytics. The risk of such a dependency can be mitigated by upgrading one version at a time as time permits. Our reach goal is to upgrade to Yolov7, which is relatively new, and attach a distance estimation module to the new version.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

No major changes have been made to our design. Using the suggestions from the LAMP advisory board, we are focusing on datasets for the OR model that incorporate hallways, stairs, doors, trash cans, and/or pets, since these are obstacles they identified as common and necessary to recognize. One design change is that the OR model has moved from Yolov4 to Yolov5 due to the outdated dependency and unsupported module.

Provide an updated schedule if changes have occurred.

Because the OR model has been upgraded to version 5, a new distance estimation feature needs to be integrated. Therefore, we have postponed testing the image recognition model by a few days and added some time to work on integrating the feature into the upgraded model.

Josh’s Status Report for 2/24/2024

Accomplishment: This week, I trained the object recognition model with an indoor-object dataset I found on the Internet. I created a Google Colab notebook so that my teammates can also run or view the training progress and download the trained model after running all the steps. The model has changed from Yolov4 to Yolov5 because of a dependency issue in Yolov4: it uses darknet, which is now outdated and no longer maintained by its development team, so I switched to Yolov5, which is developed by Ultralytics. This version is backed by a team with a strong maintenance record, so the upgrade should be a significant advantage for our product.

However, it also means that I have to port the distance estimation feature from the Yolov4 model to Yolov5, which may take some development time.

I have also collected a dataset of indoor objects from the Internet. This dataset has 558 images for training, 52 images for validation, and 30 images for testing.
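For reference, this is roughly what the training step looks like in the Colab notebook; the dataset config name and the hyperparameters below are illustrative, not our final values:

```python
# Colab cell, run from inside our Yolov5 fork. "indoor.yaml" is a small
# dataset config pointing at the train/val/test image folders above and
# listing the class names. Image size, batch size, and epoch count are
# placeholders to be tuned.
!python train.py --img 640 --batch 16 --epochs 100 \
    --data indoor.yaml --weights yolov5s.pt --name indoor_objects
```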

The challenge in training the OR model is that the dataset needs to be annotated as well, so collecting a larger dataset may take more time. My plan is to search for more data online or use tools to annotate the images. I have found that Kaggle has some useful datasets, so I will look there for more data.

Progress: I am on schedule because I have already collected an indoor-object dataset and trained the model with it. However, I plan to collect more data to increase accuracy, and I need time to add a distance estimation feature to the Yolov5 model. Before spring break ends, I plan to finish training the OR model with the distance estimation feature.

Projected Deliverables: By next week, I will collect around 500 more annotated images and train the model with them. I will also begin implementing the distance estimation feature in the Yolov5 model by cross-referencing the original Yolov4 + distance estimation source.

Team Status Report for 02/17/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

  • Weight of the device:

We initially estimated a weight of 200 grams for the whole device, which in retrospect was a vast underestimate. Our on-board computer (NVIDIA Jetson Nano) alone comes to 250 grams, alongside other heavy components such as the PCB/Arduino, the rechargeable battery pack, and other sensors. We also intend to 3D print a case to improve the overall look and feel of the device. Given all of this, the total weight is going to be around 400-450 g. We now run the risk of the device being too bulky, uncomfortable, and impractical for our use case. Although we will certainly make efforts along the way to reduce weight where we can, our backup plan is to offload the battery pack, and potentially the Jetson, to the user's waist so that the weight is distributed and less of a disturbance.

  • Connection to peripherals:

We plan to connect the peripherals (buttons, sensor, and vibration motor) to the GPIO pins of the Jetson, with a custom PCB in between to manage the voltage and current levels. A risk with this approach is that custom PCBs take time to order, and there may not be enough time to redesign a PCB if there are bugs. We plan to manage this risk by first breadboarding the PCB circuit to ensure it is adequate for safely connecting the peripherals before we place the PCB order. Our contingency plan in case the PCB still has bugs is to replace the PCB with an Arduino, which will require us to switch to serial communication between the Jetson and Arduino and will cause us to reevaluate our power and weight requirements.
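If we do fall back to the Arduino, the Jetson side of the serial link could be as simple as the sketch below; the port name, baud rate, and one-byte command protocol are assumptions we would finalize during integration:

```python
from typing import Optional

import serial  # pyserial

# USB serial link to the Arduino. Port name, baud rate, and the one-byte
# command protocol here are assumptions, not a finalized design.
ser = serial.Serial("/dev/ttyACM0", baudrate=9600, timeout=1)

def set_vibration(on: bool) -> None:
    """Send a one-byte command; the Arduino sketch maps it to the motor pin."""
    ser.write(b"V" if on else b"v")

def read_distance_cm() -> Optional[float]:
    """Request an ultrasonic reading; the Arduino replies with a line
    such as "123" (centimeters)."""
    ser.write(b"D")
    line = ser.readline().strip()
    return float(line) if line else None
```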

  • Recognition of partial frame of an object 

Although we plan to use an indoor-object dataset that includes some partial images of objects, recognition of an image cropped by close distance can still be inaccurate. To mitigate this risk, we plan to implement a history referral system that tracks the history of recognized objects and falls back on it when the current accuracy is below a chosen threshold. Then, even when a user walks so close to an object that the product can no longer recognize it, the device can still produce a result by using the history.
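A minimal sketch of the history referral idea; the window size and confidence threshold are placeholders to be tuned during testing:

```python
from collections import deque

HISTORY_LEN = 10      # frames of history to keep -- placeholder
CONF_THRESHOLD = 0.5  # below this, fall back on history -- placeholder

history = deque(maxlen=HISTORY_LEN)

def resolve_label(label: str, confidence: float) -> str:
    """Return the current label, or the last confident one when this
    frame is too uncertain (e.g. the object is partially out of frame)."""
    if confidence >= CONF_THRESHOLD:
        history.append(label)
        return label
    return history[-1] if history else label
```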

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

  • Switch from RPi to NVIDIA Jetson Nano:

In our previous report, we mentioned taking a network-based approach that offloads the bulk of the processing from the RPi to a server hosted on the cloud. This raised the issue of making the device Wi-Fi dependent, and we quickly decided against that approach, as we wanted our device to be accessible and easy to use rather than an added chore for the user. After some research, we found that switching from the RPi to the NVIDIA Jetson Nano as our on-board computer made the most sense for our project, resolving the problem of overexerting the RPi or relying on a network connection to a server. The NVIDIA Jetson has a higher-performance, more powerful GPU, which makes it better suited to run our object recognition model on board. Here is an updated block diagram:

As for changes in the cost, we have been able to get an NVIDIA Jetson Nano from the class inventory and so there is no additional cost. However, we have had to place a purchase order for a Jetson-compatible camera as the ones in the inventory were all taken. This was $28 out of our budget, which we believe we can definitely afford, and we don’t foresee any extra costs due to this switch.

  • Extra device control (from 1 button to 2 buttons):

Our design prior to this modification was such that the vibration module that alerts the user of objects in their way would be the default mode for our device, and there would be one button that worked as follows: single-press for single object identification, double-press for continuous speech identification. As we discussed this implementation further, we realized that having the vibration module on by default during the speech settings may be uncomfortable and possibly distracting for the user. To avoid overstimulating the user, we decided to make both the vibration and speech modules controllable via buttons, allowing the user to choose the combination of modes they want. This change is reflected in the block diagram above, which now shows buttons A and B.

The added cost for this change should be fairly minimal as buttons cost around $5-10 and will greatly improve user experience.
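Distinguishing single from double presses on the Jetson's GPIO could look roughly like the sketch below; the pin number and double-press window are placeholders, and we are assuming the Jetson.GPIO library, which mirrors the RPi.GPIO API:

```python
import Jetson.GPIO as GPIO

BUTTON_A = 18          # board pin number -- placeholder
DOUBLE_PRESS_MS = 400  # double-press window -- placeholder, to be tuned

GPIO.setmode(GPIO.BOARD)
GPIO.setup(BUTTON_A, GPIO.IN)  # external pull-up on our PCB assumed

def wait_for_press_kind() -> str:
    """Block until button A is pressed; classify it as single or double."""
    GPIO.wait_for_edge(BUTTON_A, GPIO.FALLING, bouncetime=50)
    # A second falling edge inside the window means a double press;
    # wait_for_edge returns None when the timeout expires.
    second = GPIO.wait_for_edge(
        BUTTON_A, GPIO.FALLING, bouncetime=50, timeout=DOUBLE_PRESS_MS
    )
    return "double" if second is not None else "single"
```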

  • Custom PCB:

Since we have switched to using the Jetson and plan to configure the GPIO pins for connecting peripherals, we now need to design and order a custom PCB for voltage conversion and current limiting. This change was necessary because the operating voltage of peripherals and the GPIO pin tolerances are different, and require a circuit in between to ensure safe operation without damaging any of the devices.

The added cost of this change is the cost of ordering a PCB as well as shipping costs associated with this. Since we are using a Jetson from ECE inventory, and the rest of our peripherals are fairly inexpensive, this should not significantly impact our remaining budget.

Provide an updated schedule if changes have occurred.

The hardware development schedule has changed slightly since we are now ordering a custom PCB. The plan for hardware development this week was to integrate the camera and sensor with the Jetson, but since these devices haven’t been delivered yet, we will focus on PCB design this week and will push hardware integration to the following week.

During testing of a pre-trained object recognition and distance estimation model, we realized that the model detects only a few object classes, which are largely irrelevant to indoor settings. Therefore, we have decided to train the model ourselves using a dataset of common indoor objects. The workload of searching for a suitable dataset and training the model has been added to the schedule, which has pushed back the object recognition model testing stage by around a week.


Please write a paragraph or two describing how the product solution you are designing will meet a specified need.

Part A: … with respect to considerations of public health, safety or welfare.

Our product aims to protect visually impaired people from unnoticed dangers that cannot be detected with a cane alone. Not only does the product tell the user what an object is, but it also alerts them that an obstacle exists right in front of them. We are projecting the safety distance to be 2 meters, so that the user has time to avoid an obstacle in their own way.

If the product is successfully implemented, it can also benefit blind people psychologically. They no longer need to worry about running into an obstacle and getting hurt, which can significantly reduce their anxiety when walking in an unfamiliar environment. In addition, the user has the option to switch the device to a manual setting, in which they press a button to identify the object in front of them. This alleviates the stress of hearing recognized objects announced every second.

Part B: … with consideration of social factors. 

The visually impaired face significant challenges when it comes to indoor navigation, often relying on assistance from those around them or guide dogs. To address this, our goal is to have our device use technology to provide an intuitive and independent navigation experience. We use a combination of depth sensors, cameras, object recognition algorithms and speech synthesis to hopefully achieve this objective. The driving factor for this project is to improve inclusivity and accessibility in our society, aiming to empower individuals to participate freely in social activities and navigate public spaces with autonomy. Through our collaboration with the Library of Accessible Media, Pittsburgh, we also hope to involve our target audience in the developmental stages, as well as testing during the final stages of our project. 

Part C: … with consideration of economic factors.

Guide dogs are expensive to train and care for, and can cost tens of thousands of dollars for the visually impaired dog owner. Visually impaired people may also find it difficult to care for their guide dog, making them inaccessible options for many people. Our device aims to provide the services of guide dogs without the associated costs and care. Our device would reach a much lower price point and would be available for use immediately, while guide dogs require years of training. This makes indoor navigation aid more financially accessible to visually impaired people.

Josh’s Status Report for 2/17/2024

Accomplishment: 

This week, I downloaded and tested several object recognition models with distance estimation features to see how much implementation and add-on work is needed to make a viable model. What I realized about the first two models I tested is that they are both based on the Yolov4 object recognition module. Their ML techniques are similar, and while going through the code, I found room to adjust the model into one viable for our project.

During testing, I also realized that their pre-trained models detect only a few object classes (human, cellphone, etc.), which are irrelevant to indoor navigation, so I plan to find a new dataset and train the model myself. I then tested the Yolov7 model by installing NVIDIA CUDA and running a test to determine whether the model accurately detects a specific object it was trained on. Comparing the accuracy reported in the Yolov4 and Yolov7 research papers, I determined that the Yolov4 model with distance estimation is sufficient for our object recognition model.

I have also updated the software flow diagram by adding more specific details on inputs, outputs, and features that go with the data processor. 

Progress:

I am on track with testing object recognition models, but because I have added a few more objectives around training the chosen model on a suitable indoor-object dataset, the planned date for testing the finalized object recognition model has been pushed back by a week.

Projected Deliverables:

By next Tuesday, I will be done searching for a dataset of common indoor objects. If time permits, I will include some partial images of those objects so the model can identify an object even when it is too close. By February 26th, I will be done training the model with the dataset collected by Tuesday.

Josh’s Status Report for 2/10/2024

Accomplishment: Throughout the week, I researched three popular object recognition models. I evaluated the pros and cons of each model and created a checklist to determine which model fits our project best. I then participated in a weekly progress meeting with the team to discuss the feedback from the proposal and the contingency plan for a significant risk to the product. I also created a brief software module flow diagram to visualize the process.

Progress: I am slightly behind schedule due to the changes made to mitigate the significant risk of the Raspberry Pi overheating. I will get back on track by spending more time researching an OR model that also has a distance estimation feature. Downloading and setting up the modules to test the models can also take a long time, so I will leave extra time for this objective.

Projected Deliverables: By Monday, I will be done researching possible models that have both an OR module and a distance estimation module. Then, by Thursday, I will be done testing the models on my laptop and reviewing them against the checklist I created, to decide which model to work with.

Team Status Report for 02/10/2024

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

RPi may overheat with constant use, and constant use also requires a large battery capacity, which increases device weight.

Plan: Have the RPi only be responsible for pre-processing the camera data and compiling it into a package with any necessary metadata to be sent to a server. This makes it so that the RPi is not doing the bulk of the processing and only acts as a channel to the server. The implementation will require us to set up a server on the cloud that runs our object recognition model and have it listen for incoming packages from the RPi. The RPi will periodically (every 1 second, for example) send an image to the server. This allows for real-time detection while minimizing the load on the RPi.
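A rough sketch of the RPi-side loop under this plan; the server URL, endpoint, and frame path are placeholders, since the protocol was never finalized before we moved away from this approach:

```python
import time

import requests

SERVER_URL = "http://example-server:8000/frames"  # placeholder endpoint

def send_frame(path: str = "/tmp/frame.jpg") -> None:
    """POST the latest camera frame plus minimal metadata to the server."""
    with open(path, "rb") as f:
        requests.post(
            SERVER_URL,
            files={"image": f},
            data={"timestamp": time.time()},
            timeout=2,
        )

while True:
    send_frame()     # assumes the camera pipeline refreshes /tmp/frame.jpg
    time.sleep(1.0)  # roughly one frame per second, as described above
```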

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

There is a change to the existing design regarding the functionality of the ultrasonic sensor. To help reduce data latency when the user confronts an obstacle, we plan to connect the sensor directly to the vibration module rather than integrating the sensor with the recognition model. The cost will remain the same, but the role of the ultrasonic sensor has been reduced from estimating an object's distance and detecting it to simply detecting that an object exists in front of the user. There will then be a change in plan to include a distance estimation model based on a single camera source. Preferably, we aim to integrate a single model that can take on both roles, to avoid the workload of integrating two different models.
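For single-camera distance estimation, a common approach in open-source YOLO + distance projects is the pinhole similar-triangles formula, distance = (real width × focal length in pixels) / bounding-box width in pixels. A sketch, where the focal length and per-class widths are calibration placeholders:

```python
from typing import Optional

FOCAL_PX = 700.0  # camera focal length in pixels -- from calibration

KNOWN_WIDTH_M = {  # typical real-world widths -- placeholder values
    "chair": 0.45,
    "door": 0.90,
    "trash_bin": 0.35,
}

def estimate_distance_m(label: str, bbox_width_px: float) -> Optional[float]:
    """Similar triangles: distance = (real_width * focal_px) / pixel_width."""
    width_m = KNOWN_WIDTH_M.get(label)
    if width_m is None or bbox_width_px <= 0:
        return None
    return (width_m * FOCAL_PX) / bbox_width_px

# Example: a chair spanning 150 px is about (0.45 * 700) / 150 = 2.1 m away.
print(estimate_distance_m("chair", 150))  # 2.1
```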

Provide an updated schedule if changes have occurred.

Updated schedule to accommodate design changes and the need for further research:

2/10-2/16: Continue testing detection models to decide which ones to integrate

2/10-2/14: Continue research into camera modules, TOF cameras, and depth detection without ultrasonic sensors

2/12: Contact LAMP and set up a time to get feedback and suggestions for features

2/14: Order parts this week (RPi and potentially camera modules)