I mainly spent this week writing the code for the query workflow. We now have preliminary code that uses the CMU PocketSphinx speech-to-text engine to transcribe a spoken query and send it from the device to the server hosting the model. When the query result comes back, the device reads it aloud through the speaker using eSpeak. I am currently on track; next I will work on making this pipeline more robust and start on dockerizing the model.
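As a rough illustration of the workflow described above, the device-side logic can be sketched in a few small pieces: package the transcribed query, POST it to the server, and hand the answer to eSpeak. This is a minimal sketch, not our actual code; the endpoint URL, the JSON field names (`query`, `answer`), and the `espeak` invocation are all assumptions for illustration.

```python
# Hypothetical sketch of the on-device query workflow.
# SERVER_URL and the JSON schema are placeholders, not the real API.
import json
import subprocess
import urllib.request

SERVER_URL = "http://example.local:8000/query"  # placeholder address


def build_request(query_text: str, url: str = SERVER_URL) -> urllib.request.Request:
    """Package the transcribed query as a JSON POST request."""
    payload = json.dumps({"query": query_text}).encode("utf-8")
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )


def speak(text: str) -> None:
    """Read the answer aloud through the speaker via the espeak CLI."""
    subprocess.run(["espeak", text], check=True)


def handle_query(query_text: str) -> None:
    """Send a query to the model server and speak the returned answer."""
    with urllib.request.urlopen(build_request(query_text)) as resp:
        answer = json.loads(resp.read())["answer"]
    speak(answer)
```

Keeping the request-building step separate from the network call makes the JSON packaging easy to unit-test without a live server or speaker attached.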
Kemdi Emegwa’s Status Report for 2/8
This week I primarily worked on researching the mechanisms we are going to use for text-to-speech and speech-to-text. Python has many established libraries for this purpose, but we have the added constraint that whatever model we use will have to run directly on the Raspberry Pi 4. I was able to find a lightweight, open-source model developed by CMU researchers called PocketSphinx. This will likely work well for our use case because the model is small enough to run locally on limited hardware. We are currently on schedule, and for the upcoming week I plan to finish the Python code for the Raspberry Pi so we can start using speech-to-text on the device.
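For a sense of how small the on-device footprint can be, PocketSphinx's Python bindings expose a streaming helper that listens on the default microphone and yields one utterance at a time. The sketch below assumes the `pocketsphinx` package and its default English acoustic model are installed on the Pi; the helper names and the normalization step are illustrative, not our final design.

```python
# Hedged sketch: streaming microphone audio through PocketSphinx.
# Assumes the `pocketsphinx` package and a default acoustic model are
# installed; LiveSpeech usage here is illustrative.

def normalize_hypothesis(text: str) -> str:
    """Lowercase a raw hypothesis and collapse extra whitespace."""
    return " ".join(text.lower().split())


def listen_forever() -> None:
    # Imported lazily so the rest of the module loads without the package.
    from pocketsphinx import LiveSpeech
    for phrase in LiveSpeech():  # blocks, yielding one utterance at a time
        print(normalize_hypothesis(str(phrase)))


if __name__ == "__main__":
    listen_forever()
```

Running recognition fully on-device like this avoids shipping raw audio over the network, which is part of why a small local model is attractive for this project.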