Nick’s Status Report for 04/02

This week’s work focused on further fleshing out the LID model and how it will interact with the rest of the components of our system. Currently a version of the model can be pulled from a cloud repository, loaded, and run on raw speech utterances to produce a sequence of classifications. I’ve added methods which allow my partners to preemptively load the model into system and CUDA memory, so that we can minimize loading times when actually running a transcription request, since only the data then needs to be moved into and out of memory. I also exposed a method for actually making the forward call through the network. I anticipate that the interface to the backend language model will continue to be just a simple class which can be called from the software API level which Tom’s been working on. Integration and testing will continue to be our focus for the next couple of weeks. There is work to be done to set up testing frameworks for both accuracy and noise tolerance. In this respect I feel a little behind, but I plan on spending much of the next 3 or 4 days working on this. Delivering our first demo is the next major milestone as a team, so we will need to continue meeting in person or over live Zoom calls to flesh out that integration.
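To make the preload-then-forward interface concrete, here is a minimal sketch of what such a class might look like. This is not our actual code: the names `LIDModel`, `preload`, `forward`, and `dummy_loader` are hypothetical, and the loader is a stand-in for the real step that pulls the weights from the cloud repository and moves them into CUDA memory.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical type: a loaded model is any callable that maps raw samples
# to one classification label per sample.
Model = Callable[[Sequence[float]], List[str]]


class LIDModel:
    """Hypothetical wrapper: load once up front, then call forward per request."""

    def __init__(self, loader: Callable[[str], Model], device: str = "cpu") -> None:
        # The loader is assumed to fetch the weights and place them on `device`.
        self._loader = loader
        self._device = device
        self._model: Optional[Model] = None

    def preload(self) -> None:
        """Pay the loading cost ahead of time; repeated calls are no-ops."""
        if self._model is None:
            self._model = self._loader(self._device)

    def forward(self, utterance: Sequence[float]) -> List[str]:
        """Classify one raw utterance, loading lazily if preload() was skipped."""
        self.preload()
        assert self._model is not None
        return self._model(utterance)


def dummy_loader(device: str) -> Model:
    """Stand-in for the real loader; it ignores `device` and loads nothing."""

    def model(samples: Sequence[float]) -> List[str]:
        # Toy rule, not a real classifier: label each sample by its sign.
        return ["lang_a" if s >= 0 else "lang_b" for s in samples]

    return model
```

With something like this, the API layer would call `preload()` once at startup and then `forward(...)` per transcription request, so only the audio data has to move in and out of memory at request time.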
