This week, I reached several milestones in my project. I integrated the entire software subsystem, which included retraining the YOLO model to recognize “gate text”. I also wrote a new video processing script and connected it to the speech-to-text interface. A key improvement was switching the speech-to-text functionality to OpenAI's Whisper model, which performed better than the VOSK model I used initially.
Throughout this semester, I have deepened my understanding of machine learning models, how they are implemented, and how they are applied in real-world projects. This experience showed me how different disciplines within Electrical and Computer Engineering come together in a single cohesive project, a perspective that was less emphasized in my earlier coursework. My learning approach also shifted: I now rely on online documentation, videos, and research papers more than traditional textbooks and lecture slides, which has improved my ability to work through complex technical challenges.