Progress report for work done until 09/28
Summary
- Collected the hardware and set up the Raspberry Pi with the camera to capture images
- Did preliminary research on how the camera synchronization can be done using the MST algorithm
- Successfully set up and deployed 2 databases, with a 3rd in the works [after researching the multiple options we had]
- DynamoDB
- A simple csv file for testing purposes
- Amazon RDS [in case we want a server-supported option]
- Set up an nginx + gunicorn webserver and began light testing on it. Testing includes:
- Serving webpages
- Transferring files
- Executing code on the server
- Completed various tests of how to train YOLO to classify additional classes
- Notably, attempts were made to train on top of existing models but accuracy on existing classes suffered greatly.
- It was decided to train YOLOv10m from scratch due to its high accuracy and relatively quick training time. Here, "from scratch" means borrowing relevant classes and photos from COCO and then splicing in existing online datasets of things like pencils.
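The dataset splicing described above can be sketched as follows. This is a minimal illustration, not the team's actual pipeline: it assumes both sources use the standard YOLO text label format (`class_id x_center y_center width height` per line), and the class names other than pencil, as well as the helper names `build_class_map` and `remap_label_line`, are hypothetical.

```python
def build_class_map(*class_lists):
    """Merge several per-dataset class lists into one unified list.

    Returns (unified_names, maps), where maps[i] translates dataset i's
    original class ids into ids in the unified list.
    """
    unified = []
    index = {}
    maps = []
    for names in class_lists:
        mapping = {}
        for old_id, name in enumerate(names):
            if name not in index:
                # First time this class name appears: give it the next free id.
                index[name] = len(unified)
                unified.append(name)
            mapping[old_id] = index[name]
        maps.append(mapping)
    return unified, maps


def remap_label_line(line, mapping):
    """Rewrite the class id of one YOLO label line; box values are untouched."""
    parts = line.split()
    parts[0] = str(mapping[int(parts[0])])
    return " ".join(parts)


# Hypothetical class lists: a subset borrowed from COCO and an extra dataset.
coco_subset = ["cell phone", "book", "scissors"]
extra_dataset = ["pencil", "book"]

unified, maps = build_class_map(coco_subset, extra_dataset)
remapped = remap_label_line("0 0.5 0.5 0.2 0.1", maps[1])
```

Classes that share a name across sources (here "book") collapse to a single id, so labels from both datasets stay consistent after merging.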
What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?
Ensuring that the vision model achieves high accuracy on the new classes is critical. The contingency plans include combining multiple YOLO models and using GPT Vision as a cross-reference in case the YOLO-only solution doesn’t pan out. [Ethan]
It is also essential to acquire the necessary hardware so that training can go as quickly as possible. Otherwise, there simply won’t be enough time to train the whole model. More lightweight models are being considered to reduce this risk. [Swati] In addition, laptop GPUs are currently being used for training. [Ethan]
We still need to hash out additional details of the webserver and the interfaces between the various components. We need to understand how everything fits together as soon as possible so that issues can be caught early through testing. [Giancarlo]
Meeting Minutes:
Actionables:
High priority: DUE: Sep 30, 2024
- Ensure that missing components from initial orders are found [Swati Anshu]
- Raspberry Pi was missing from the container
- SD card was missing from the Jetson
- Extend object detection so that it also works robustly on the new classes [emuchnik@andrew.cmu.edu]
- Initial work will be adding several new classes to the dataset
- In addition, there should be a good amount of exploration into how GPT Vision can be used in tandem with YOLO to keep track of objects
- Database [Swati Anshu]
- Ensure the database has a functioning interface with the others’ submodules, such as the vision model
- Ensure data is collected from the Raspberry Pi camera and sent to the cloud
- Create benchmarking code for ML models [Giancarlo Zaniolo]
- Note that benchmarking code already created by YOLO or 3rd parties will likely be available for reuse
- Remember to update website (Week 2)!!!
- Group update
- Order: 1. Swati Anshu 2. Ethan Muchnik 3. Giancarlo Zaniolo
- User update @everyone
- Group update
- Design Presentation Work @everyone
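As a starting point for the benchmarking action item above, here is a rough sketch of a precision/recall check for object detections. It is an illustration under stated assumptions, not the planned benchmark: boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates, matching is a simple greedy pass that ignores confidence scores, and the function names and sample data are hypothetical. Existing YOLO or third-party tooling may well replace it.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy matching of (label, box) predictions to (label, box) truths.

    A prediction counts as a true positive if it has the same label as an
    unmatched ground-truth box and their IoU is at least iou_threshold.
    """
    matched = set()
    true_positives = 0
    for label, box in predictions:
        best_index, best_score = None, iou_threshold
        for j, (gt_label, gt_box) in enumerate(ground_truth):
            if j in matched or gt_label != label:
                continue
            score = iou(box, gt_box)
            if score >= best_score:
                best_index, best_score = j, score
        if best_index is not None:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical sample: one good pencil detection, one badly placed book.
truth = [("pencil", (0, 0, 10, 10)), ("book", (20, 20, 40, 40))]
preds = [("pencil", (1, 1, 10, 10)), ("book", (100, 100, 120, 120))]
precision, recall = precision_recall(preds, truth)
```

Running the same check per class, before and after adding new classes, would show whether accuracy on the existing classes degrades.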
Additional Sections:
A was written by [emuchnik@andrew.cmu.edu]. B was written by [Giancarlo Zaniolo]. C was written by [Swati Anshu].
A. Our solution will arguably help the most when it comes to health. Because it helps people locate objects, it should improve their mental health in general, as they will be less likely to lose things that are important to them for an extended period of time. We anticipate this being most helpful to seniors who have memory problems and whose quality of life is negatively affected by them. Seniors may also misplace things like medicine, which clearly has health effects beyond the psychological: a senior who misplaces their medicine could easily end up hospitalized. Thus, a major goal of Forget-Me-Not is to prevent most of the hospitalizations and deaths related to misplacing such vital objects.
B. As previous research has shown, machine learning models can be affected by unintentional biases in their training data. In our case, the machine learning model we are using tracks a list of household objects. This list must be predefined by the architecture of our model, and depending on our initial training data, we may unintentionally prioritize objects that are most likely to be used by the demographics we belong to. However, when designing with social factors in mind, we also need to take the opinions of others into account. A comprehensive survey of what every demographic would like to keep track of may be out of scope for this capstone project, as surveying a diverse population is itself a difficult task. However, we do hope to structure our ML training pipeline so that the functionality of our model is extensible. In addition, we have thought about adding a suggestion box to our website that would allow users to list the items they would like added to the detected object list in the future. We hope that by making it easy to add objects to be detected, and easy for users to suggest them, we can address any societal biases introduced in our original model and make our product better for people from all walks of life.
C. In terms of economic feasibility, our product is designed to be accessible to a wide range of consumers. The use of affordable hardware components, such as a Raspberry Pi and standard cameras, keeps initial costs low. Cloud computing services like AWS offer scalable solutions, making the system financially viable for both small-scale personal use and potential enterprise applications. As our system evolves, optimizations in both hardware and software can reduce costs, potentially allowing us to offer the product at lower prices as demand increases and production scales. For businesses or institutions that deal with misplaced equipment or tools in offices, classrooms, or warehouses, this solution could lead to significant savings by automating the tracking process and ensuring that valuable time isn’t spent searching for misplaced assets. Because the system uses modular components and cloud services, users can opt for different pricing tiers based on their needs and pay only for what they use. This flexibility in production and distribution offers an economic advantage by lowering entry costs for small users while allowing for expansion without large upfront investments. The product could also be integrated with existing home technologies such as Alexa and HomePod, making it an ideal business product in terms of production, distribution, and consumption.