This week I focused on refining the design requirements, numeric goals, and testing plans for the audio transfer module, the web frontend text display module, and the backend .wav generation API module. After drafting detailed documentation of the design metrics, I continued researching audio transfer tools whose performance can meet our design requirements, and I finished implementing an audio transfer module that periodically sends requests with embedded audio data to the server.
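To illustrate the idea of periodically sending requests with embedded audio data, here is a minimal Python sketch. It is not the actual module: the sample rate, the 16-bit mono PCM format, the 2-second chunk length, the JSON field names, and the `send` callback are all assumptions made for illustration. It also walks over a pre-recorded buffer, whereas a live client would emit one chunk per interval as audio is captured.

```python
import base64
import json
from typing import Callable, Iterator

# All constants below are illustrative assumptions, not project settings.
SAMPLE_RATE_HZ = 16_000   # assumed sample rate
BYTES_PER_SAMPLE = 2      # assumed 16-bit mono PCM
CHUNK_SECONDS = 2.0       # assumed length of audio carried by each request


def chunk_pcm(pcm: bytes, seconds: float = CHUNK_SECONDS) -> Iterator[bytes]:
    """Split raw PCM bytes into fixed-duration chunks (the last may be shorter)."""
    size = int(SAMPLE_RATE_HZ * seconds) * BYTES_PER_SAMPLE
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


def build_request_body(chunk: bytes, seq: int) -> str:
    """Embed one audio chunk in a JSON body (hypothetical field names)."""
    return json.dumps({
        "seq": seq,
        "sample_rate": SAMPLE_RATE_HZ,
        "audio_b64": base64.b64encode(chunk).decode("ascii"),
    })


def send_all_chunks(pcm: bytes, send: Callable[[str], None]) -> int:
    """Pass one request body per chunk to `send`; return the number sent.

    `send` stands in for whatever transport the real module uses,
    for example an HTTP POST to the server.
    """
    count = 0
    for seq, chunk in enumerate(chunk_pcm(pcm)):
        send(build_request_body(chunk, seq))
        count += 1
    return count
```

For example, `send_all_chunks(b"\x00" * 160_000, print)` would emit three request bodies under these assumptions: two full 2-second chunks and one 1-second remainder.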
To test whether the audio transfer module meets our design requirement for .wav generation latency, I logged the server time when the audio data was received and the time when generation of the corresponding .wav file finished at the web frontend. The .wav file generation time exceeds our intended 50 ms but stays under 300 ms. We will tolerate this result until we have further tested the client-to-server transmission latency (after our server is deployed) and the ML model's average run time on a 2-second audio clip.
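The measurement described above can be sketched roughly as follows. This is a simplified stand-in, not our server code: it assumes 16 kHz, 16-bit mono PCM input, builds the .wav in memory with Python's standard `wave` module, and uses hypothetical function names. The 50 ms and 300 ms thresholds come from the numbers above; how the exact boundary values are classified is an assumption.

```python
import io
import time
import wave

TARGET_MS = 50.0      # intended .wav generation time
TOLERATED_MS = 300.0  # upper bound we currently tolerate


def write_wav(pcm: bytes, sample_rate: int = 16_000) -> bytes:
    """Wrap raw PCM in a .wav container (assumed 16-bit mono)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def timed_wav_generation(pcm: bytes) -> tuple[bytes, float]:
    """Return the .wav bytes and the elapsed generation time in milliseconds."""
    received = time.perf_counter()   # moment the audio data is "received"
    wav_bytes = write_wav(pcm)
    elapsed_ms = (time.perf_counter() - received) * 1000.0
    return wav_bytes, elapsed_ms


def classify(elapsed_ms: float) -> str:
    """Compare a measured latency against the target and tolerated bounds."""
    if elapsed_ms <= TARGET_MS:
        return "within target"
    if elapsed_ms < TOLERATED_MS:
        return "tolerated"
    return "too slow"
```

Calling `classify` on the elapsed time returned by `timed_wav_generation` gives a quick pass/fail style label for each logged request. `time.perf_counter` is used instead of wall-clock time because it is monotonic and better suited to measuring short intervals.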
Next week, I will deploy an AWS server with a GPU so that we can load a pre-trained ASR model and experiment with network latency and model run time. These measurements will let us better estimate how practical our current design metrics are.