Looked into audio transcription in C++, which made us realise that audio transcription is I/O bound. This confirmed our suspicion that a signal-processing-based approach is the only way forward.
Set up the time sync infrastructure with Eugene and fixed numerous bugs across the 2 Python scripts. We also read through parts of the PyAudio source code to understand why some of our programs weren’t working as expected.
With a better understanding of PyAudio, Eugene and I were able to reduce the lower-bound latency further (around 100 ms).
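One contributor to this kind of lower bound is the buffer size: a stream cannot hand over a buffer until every frame in it has been captured. The sketch below is only the arithmetic for that one term. The function name `buffer_latency_ms` is hypothetical, and the frame counts and 44100 Hz sample rate are illustrative assumptions, not the settings in our scripts. The two inputs correspond to the `frames_per_buffer` and `rate` arguments of PyAudio's `open`.

```python
def buffer_latency_ms(frames_per_buffer: int, sample_rate_hz: int) -> float:
    """Return the time needed to fill one audio buffer, in milliseconds."""
    if frames_per_buffer <= 0 or sample_rate_hz <= 0:
        raise ValueError("frames_per_buffer and sample_rate_hz must be positive")
    return 1000.0 * frames_per_buffer / sample_rate_hz


# Illustrative values only (assumed, not taken from our scripts).
for frames in (256, 1024, 4096):
    print(frames, round(buffer_latency_ms(frames, 44100), 2))
```

Smaller buffers lower this term but increase the risk of dropped frames, and the total latency also includes driver and hardware delays that this sketch does not model.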
Looked into MFCCs (mel-frequency cepstral coefficients) with Spencer, but we were unable to come up with an accurate way of comparing them across 2 different recordings. We are meeting with Professor Stern on Monday to get clarity on this.
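For reference, one widely used way to compare feature sequences of different lengths is dynamic time warping (DTW). The sketch below is an assumption about what such a comparison could look like, not the method we used or one we have validated. It treats each recording as a list of per-frame feature vectors (for us these would be MFCC vectors, computed elsewhere), and the names `frame_distance` and `dtw_distance` are hypothetical.

```python
import math
from typing import Sequence

Frame = Sequence[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a: Sequence[Frame], seq_b: Sequence[Frame]) -> float:
    """Cumulative cost of the cheapest monotonic alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # seq_b frame j repeated
                cost[i][j - 1],      # seq_a frame i repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Toy one-dimensional "features": the second sequence is a stretched copy.
slow = [[0.0], [1.0], [1.0], [2.0]]
fast = [[0.0], [1.0], [2.0]]
print(dtw_distance(slow, fast))
```

DTW tolerates differences in timing between recordings, but the raw cost grows with sequence length, so it would need normalising before being compared across pairs of recordings.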