Created a basic audio-to-text proof-of-concept pipeline using a speech recognition module in Python.
Measured compiled vs. interpreted Python and found no noticeable difference in performance. The pipeline itself is slow: it consistently takes more than 1 second to run.
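A minimal sketch of how the per-run latency could be measured repeatably, using only the standard library. The names `time_pipeline` and `fake_transcribe` are hypothetical; `fake_transcribe` is a stand-in that only sleeps, and would be swapped for the real audio-to-text call.

```python
import statistics
import time


def time_pipeline(fn, *args, repeats=5):
    """Call fn(*args) `repeats` times; return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        durations.append(time.perf_counter() - start)
    return durations


def fake_transcribe(audio_bytes):
    """Hypothetical stand-in for the real audio -> text step (assumption, not the actual pipeline)."""
    time.sleep(0.01)
    return "placeholder transcript"


if __name__ == "__main__":
    runs = time_pipeline(fake_transcribe, b"\x00" * 16000, repeats=5)
    print(f"median: {statistics.median(runs):.3f} s, max: {max(runs):.3f} s")
```

Reporting the median and max over several runs, rather than a single timing, makes it easier to tell whether the > 1 second figure is a steady cost or an occasional spike.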
Next steps: investigate signal processing techniques to improve the response time of the basic pipeline. Example: extracting MFCCs (mel-frequency cepstral coefficients) may be faster than full audio-to-text transcription.
Possible library to look at: sonopy (https://github.com/MycroftAI/sonopy)
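For reference, a rough sketch of what MFCC extraction involves, written in plain NumPy. This is not sonopy's API; the function name `mfcc` and all parameter defaults (25 ms frames and a 10 ms hop at 16 kHz, 20 mel filters, 13 coefficients) are assumptions chosen for illustration.

```python
import numpy as np


def mfcc(signal, sample_rate, frame_len=400, hop=160, n_fft=512, n_mels=20, n_coeffs=13):
    """Return an (n_frames, n_coeffs) array of MFCCs for a 1-D float signal."""
    # 1. Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)

    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # 3. Triangular mel filterbank spanning 0 Hz to the Nyquist frequency.
    def hz_to_mel(hz):
        return 2595.0 * np.log10(1.0 + hz / 700.0)

    def mel_to_hz(mel):
        return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / (right - center)

    # 4. Log of the mel-band energies (small offset avoids log(0)).
    log_mel = np.log(power @ fbank.T + 1e-10)

    # 5. DCT-II (unnormalized) over the mel bands; keep the first n_coeffs.
    n = np.arange(n_mels)
    basis = np.cos(np.pi / n_mels * (n[None, :] + 0.5) * np.arange(n_coeffs)[:, None])
    return log_mel @ basis.T


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)  # one second of a 440 Hz test tone
    print(mfcc(tone, sr).shape)
```

The output is a small matrix of features per frame rather than text, so whether it is actually faster end to end depends on what consumes those features; that comparison still needs to be measured.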