Opalina’s Status Report 04/26/25

This week, I created a quantized YOLO model and tuned it to increase inference speed on the Pi. I also rewrote the integration script, adding OCR interval scheduling and threading to reduce per-frame inference time when running the end-to-end pipeline on the Pi.
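
As a rough illustration of the interval-plus-threading idea (a sketch, not our exact script), the snippet below runs detection on every frame but only queues sign crops for OCR every Nth frame, with OCR handled on a worker thread. It assumes OpenCV, Ultralytics YOLO, and pytesseract; the OCR_INTERVAL value, queue size, and weight path are placeholders.

```python
# Sketch of the OCR-interval + threading optimization (illustrative values).
import queue
import threading

import cv2
import pytesseract
from ultralytics import YOLO

OCR_INTERVAL = 10                    # OCR every 10th frame (placeholder)
ocr_jobs = queue.Queue(maxsize=2)    # small queue so OCR never backs up

def ocr_worker():
    """Consume cropped sign regions and OCR them off the main thread."""
    while True:
        crop = ocr_jobs.get()
        if crop is None:             # sentinel to stop the worker
            break
        print("OCR:", pytesseract.image_to_string(crop).strip())

threading.Thread(target=ocr_worker, daemon=True).start()

model = YOLO("best.pt")              # fine-tuned sign detector (placeholder path)
cap = cv2.VideoCapture(0)
frame_idx = 0

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)    # detection runs on every frame
    if frame_idx % OCR_INTERVAL == 0:        # OCR only at intervals
        for box in results[0].boxes.xyxy:
            x1, y1, x2, y2 = map(int, box)
            try:
                ocr_jobs.put_nowait(frame[y1:y2, x1:x2])
            except queue.Full:
                pass                         # drop the crop if OCR is behind
    frame_idx += 1

cap.release()
ocr_jobs.put(None)                           # shut the worker down
```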

For unit tests, I ran the video processing script on a set of manually recorded videos (using printed signs) as well as a pre-existing image dataset (117 airport signs), reaching an accuracy of approximately 92% with <100 ms of preprocessing and inference time per frame on the Mac. After these optimizations, the new script yielded a latency of <2 s on the Raspberry Pi, which met our initial use-case requirements.
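
A simple harness along these lines is one way to collect those per-frame latency numbers; process_frame here is a hypothetical stand-in for our preprocessing-plus-inference path.

```python
# Hedged sketch of per-frame latency measurement on a recorded video.
import statistics
import time

import cv2

def time_video(path, process_frame):
    """Return (mean, max) per-frame latency in seconds for one video."""
    cap = cv2.VideoCapture(path)
    latencies = []
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        t0 = time.perf_counter()
        process_frame(frame)                 # preprocessing + inference
        latencies.append(time.perf_counter() - t0)
    cap.release()
    return statistics.mean(latencies), max(latencies)
```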

Team Status Report 04/26/25

We’re working on integration and trying to get everything running smoothly on the Pi. We’ve run many individual tests, and so far we’re meeting our specification requirements: the model reached 90% accuracy on the test image dataset, battery life came in at 4–5 hours, and the device weighs under 1 kg. We’re currently running at about 10 FPS, which meets our 2 s latency spec; however, we have more integration to do and hope we can continue to meet the specification.

Krrish’s Status Report 04/26/25

I’m working on integration and optimising the script to run locally on the Pi. We’re running into some issues, as OCR is very computationally heavy; I’m testing techniques like threading and running OCR at fixed intervals. I also tested the depth accuracy of the camera system.
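
For the depth check, something along these lines compares the camera’s reported depth at a centre patch against a tape-measured ground truth. The depth_frame input is whatever depth map the camera SDK hands back (assumed here to be a numpy array in millimetres); the helper and patch size are hypothetical.

```python
# Hedged sketch of a depth-accuracy check against a known distance.
import numpy as np

def depth_error(depth_frame: np.ndarray, true_mm: float, patch: int = 10):
    """Median depth of a centre patch vs. the tape-measured distance (mm)."""
    h, w = depth_frame.shape
    cy, cx = h // 2, w // 2
    region = depth_frame[cy - patch:cy + patch, cx - patch:cx + patch]
    measured = float(np.median(region[region > 0]))  # ignore invalid zeros
    error = measured - true_mm
    return error, 100.0 * abs(error) / true_mm       # mm error, % error
```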

Krrish’s Status Report 04/19/25

I’ve been working on the AI accelerator and trying to compile our custom models to run on it. I’ve got the software and drivers installed but am struggling with the optimisation and compilation steps involved. I’ve also been working on integrating the model with the Raspberry Pi.
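
Accelerator toolchains like this one typically ingest an ONNX graph before their own optimisation and compilation passes, so a likely first step is exporting the fine-tuned weights. This sketch uses the Ultralytics export API; the weight path and image size are placeholders for our actual configuration.

```python
# Export the fine-tuned detector to ONNX as input for the accelerator toolchain.
from ultralytics import YOLO

model = YOLO("best.pt")                 # retrained sign detector (placeholder)
model.export(format="onnx", imgsz=640)  # writes best.onnx next to the weights
```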

One thing I’ve learnt is that we need constant communication and we need to start integrating early. Learning to communicate and give feedback within a team is important to keep everyone on track. It also ensures everyone is constantly working on integration, so it isn’t left to the end.

Opalina’s Status Report 04/19/25

This week, I achieved significant milestones on the project. I integrated the entire software subsystem, which included retraining the YOLO model to recognize “gate text”. I also developed a new video processing script and integrated it with the speech-to-text interface. A pivotal improvement was moving the speech-to-text functionality to OpenAI’s Whisper model, which performed noticeably better than the initial VOSK model.
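
A minimal sketch of the Whisper swap-in, using the openai-whisper API as published; the model size and audio filename are illustrative.

```python
# Transcribe a spoken gate query with Whisper (replaces the VOSK path).
import whisper

model = whisper.load_model("base")           # small enough to test locally
result = model.transcribe("gate_query.wav")  # e.g. "take me to gate B12"
print(result["text"])
```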

Throughout this semester, I have deepened my understanding of machine learning models, their implementation, and their application in real-world projects. This experience has provided me with valuable insights into how different disciplines within Electrical and Computer Engineering come together to create cohesive projects—a perspective that was less emphasized in my earlier coursework. My learning approach evolved to prioritize online documentation, videos, and research papers over traditional textbooks and lecture slides, enhancing my ability to tackle complex technical challenges effectively.

Team Status Report 04/12/25

The new camera arrived this week, which meant we had to start hardware integration from scratch. The new AI hat, while easy to set up, is posing compilation issues with the custom ML models. While the text-to-speech and sign detection seem to be performing acceptably, this week we hope to make more progress on integrating the camera with the software. Once we have a basic level of functionality, we hope to start user and battery tests to ensure the device holds up to the initially outlined use-case requirements (accuracy, latency, power, depth perception, etc.).

Opalina’s Status Report 04/12/25

This week, I used new training and testing datasets to fine-tune the YOLO model we used for our interim demo. The new model looks for an added set of features in order to reduce the number of OCR passes needed on each image, thereby reducing our latency. I ran a few standard tests to verify the functionality of this new model and found a slight drop in accuracy, which I am currently working to recover. In the coming week, I plan to build new custom test datasets to properly identify these points of error.
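
For reference, the fine-tuning run looks roughly like this with the Ultralytics training API; the base weights, dataset YAML, and hyperparameters shown are placeholders, not our actual configuration.

```python
# Sketch of fine-tuning the detector on the new training/testing datasets.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")              # base weights (placeholder)
model.train(data="airport_signs.yaml",  # new train/test splits (placeholder)
            epochs=50, imgsz=640)
metrics = model.val()                   # quantify the accuracy drop noted above
```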

Opalina’s Status Report 03/29/25

This week, I managed to run and fully test YOLO and OpenCV in conjunction with OCR to extract and interpret all the relevant information from an airport sign. This script now only needs to be run with the TTS component to complete the end-to-end software subsystem. In the coming weeks, we hope to use camera footage to run the models on the Pi.
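
In outline, the single-image path looks like the sketch below: YOLO finds sign regions, OpenCV crops them, pytesseract reads the text, and a regex pulls out gate tokens. The weight path, image name, and gate-format regex are all assumptions for illustration.

```python
# Hedged sketch of the still-image pipeline: detect -> crop -> OCR -> parse.
import re

import cv2
import pytesseract
from ultralytics import YOLO

GATE_RE = re.compile(r"\b[A-F]\d{1,2}\b")    # e.g. "B12" (assumed gate format)

model = YOLO("best.pt")                      # placeholder weight path
img = cv2.imread("sign.jpg")                 # placeholder test image
for box in model(img)[0].boxes.xyxy:
    x1, y1, x2, y2 = map(int, box)
    text = pytesseract.image_to_string(img[y1:y2, x1:x2])
    print("gates found:", GATE_RE.findall(text))
```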

Krrish’s Status Report 03/29/25

I’m running experiments with the camera and YOLO model. The camera is struggling, running quite slowly with significant delays. I’ve put together a list of new camera options and have narrowed it down to two. I will place an order for the one I think will work best.

Team Status Report 03/29/25

We have OCR and the YOLO model working independently. We are working on integrating the two and on testing with videos rather than still frames. Our goal is to have this working on our laptops and then integrate it with the Pi. Simultaneously, we are testing the camera on the Pi with the YOLO model to see how fast it runs. Currently it looks like the camera is running too slowly on the Pi, so we’re looking at different options. We’ve narrowed it down to three and will place an order today.