Team Status Report 2/26 – Team C0: CodeSwitch

This week we focused on finalizing our design and system specifications for the design document due next week. The document still needs formatting work, but most of our metrics, testing, and explanations are already in a shared document that we will use to populate the final template. On the language model side, we have obtained our AWS credits and uploaded our training dataset. The LID model has been initialized, and Nick is conducting his first training runs on the dataset with a smaller version of our intended final architecture, using only cross-entropy loss. The architecture itself remains stable.
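For reference, the cross-entropy objective used in these first runs can be sketched in a few lines. This is a minimal, framework-free illustration and not our training code: the function names, the two-class setup, and the per-frame labeling are assumptions made for the example.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(frame_logits, frame_labels):
    """Mean negative log-likelihood of the correct language label over all frames.

    frame_logits: list of per-frame score lists, one score per language class (hypothetical layout).
    frame_labels: list of integer class indices, one per frame.
    """
    losses = []
    for logits, label in zip(frame_logits, frame_labels):
        probs = softmax(logits)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)

# Illustrative call: two frames, two hypothetical language classes (0 and 1).
loss = cross_entropy([[2.0, 0.5], [0.1, 1.7]], [0, 1])
```

The loss approaches zero as the model assigns more probability to the correct language on each frame, and equals log(2) for a two-class model that is completely undecided.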

On the web app side, we researched deployment strategies that support our requirements, mainly continuous push of audio chunks from client to server and instantiation of the model service at server boot time. We have decided to follow the same approach used for deploying a Flask API, but because our app is built with Django, we still need further research and code modifications before deployment is finished.
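The two requirements above can be sketched as follows. This is a simplified, framework-independent illustration under our own assumptions, not our deployed code: `ModelService`, `ChunkBuffer`, and the `loader` callable are hypothetical names. In a Django project, one common place to trigger a one-time load like this is an `AppConfig.ready()` method, though we have not settled on that.

```python
import threading

class ModelService:
    """Holds one model instance that is created once and shared by all requests."""
    _instance = None
    _lock = threading.Lock()

    def __init__(self, loader):
        # loader is any zero-argument callable that returns the loaded model (hypothetical).
        self.model = loader()

    @classmethod
    def get(cls, loader):
        # The lock ensures the model is loaded only once even under concurrent calls.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(loader)
        return cls._instance

class ChunkBuffer:
    """Accumulates audio chunks pushed by each client session."""

    def __init__(self):
        self._chunks = {}

    def push(self, session_id, chunk):
        # Append the new bytes to whatever this session has sent so far.
        self._chunks.setdefault(session_id, bytearray()).extend(chunk)

    def drain(self, session_id):
        # Return the buffered audio for the session and clear it.
        return bytes(self._chunks.pop(session_id, b""))
```

Calling `ModelService.get(...)` during server startup pays the model loading cost once, so later requests only push chunks and read predictions.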

Nothing has changed in our schedule. On the modeling side, we will ramp up training this week. One of our risks has been limited training time, so it is essential that the language model be fully ready and trainable so that it can run for hours at a time overnight over the next two weeks. We need to understand and characterize initial results within the next week so that we can adjust training as needed to achieve our best performance. On the web app side, this week and next will focus on deployment, with the goal of having the deployed app run a pretrained ASR model and return the predicted text to the client frontend.
