Team’s Status Report for 3/23/24 – Team E6: TransLingualVisionary

Overall, apart from the risks and mitigations discussed in the previous reports about the new spoter classification model and the FPGA integration, the new risk we found was the ability to determine the end of sentences. As of right now, our classification model simply outputs words or phrases based on the input video stream; however, it is unable to analyze the stream to determine end of sentence. In order to mitigate this problem, our current solution is to run a double query to the LLM while maintaining a log of previous responses to determine the end of sentence. If this idea does not work, we will introduce timeouts and ask the users to take a second of pause between sentences. No big design changes have been made until now since we are still building and modifying both the pose estimation and classification models. We are a bit behind schedule due to the various problems with the model and FPGA, and have thus updated the schedule below. We hope to have at a partially integrated product by next week before our interim demo.

Leave a Reply Cancel reply