Shiheng’s Status Report for 10/21/2023

I mainly contributed to the Design Requirements and Tradeoffs sections of the design review document. Starting from our design review slides and the notes shared among us, I quantified and justified the requirements we encountered in our project. The tradeoff section also plays a vital role in the design document, as it explains the decisions we made over the past weeks and why we think they are in our best interest.

For the past week, I mostly researched my part, the TTS engine, and how it applies to our project. During implementation, I ran into many compatibility problems involving the Python version, package dependencies, and output support. After trying out a few different Python versions while attempting to install the TTS package on my laptop, I determined that Python 3.10 was the best fit, as it supports all the packages the TTS engine needs, with most of them up to date. Other versions had either reached the end of their life cycle or had issues supporting the latest packages that the TTS engine requires.
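To make the version constraint explicit, a setup script could check the interpreter before importing the TTS package. The following is only a minimal sketch: the names `SUPPORTED_MAJOR_MINOR`, `is_supported_python`, and `describe_python` are hypothetical, and treating 3.10 as the only supported version is an assumption based on what worked on my laptop, not a documented requirement of the package.

```python
import sys

# Assumption: Python 3.10 is the only version verified for this setup.
SUPPORTED_MAJOR_MINOR = (3, 10)


def is_supported_python(version=None):
    """Return True if the (major, minor) part of `version` matches the verified one.

    `version` is any sequence like (3, 10, 12); it defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) == SUPPORTED_MAJOR_MINOR


def describe_python(version=None):
    """Build a short, human-readable status line for the given version."""
    if version is None:
        version = sys.version_info
    status = "supported" if is_supported_python(version) else "untested"
    return f"Python {version[0]}.{version[1]}: {status} for the TTS setup"


if __name__ == "__main__":
    print(describe_python())
```

Running the check first gives a clear message instead of a confusing dependency error later in the install.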

With the package successfully installed and all requirements fulfilled, I generated a few .wav files for demonstration purposes. The .wav files sound fine, and generation runs locally from the command prompt using the default voice engine provided. I’ll continue researching different voice engines to make sure we pick the best one, keeping in mind that users might want to choose different voices while they are being instructed. I will begin integrating this voice function into our project once our pipeline is mostly done.
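Besides listening to the files, a quick sanity check on each generated .wav file can be done with Python's standard `wave` module. This is a rough sketch, not part of the TTS engine: the helper name `describe_wav` is hypothetical, and it assumes the engine writes uncompressed PCM audio, which is the format the `wave` module reads.

```python
import wave


def describe_wav(path):
    """Read the header of a PCM .wav file and return its basic properties."""
    with wave.open(path, "rb") as wav_file:
        frame_rate = wav_file.getframerate()
        frame_count = wav_file.getnframes()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": frame_rate,
            # Duration follows from the number of frames and frames per second.
            "duration_s": frame_count / frame_rate,
        }
```

A duration close to zero or an unexpected sample rate would be an early hint that generation or playback support is misconfigured.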

Here’s a sample message using the default engine:

“Raise your arm by 20 degrees”

 

ABET #7 Question:

I have looked into tools like text-to-speech engines and tried to understand the logic and algorithms behind them across different voice models.

Additional knowledge I needed includes understanding pronunciation and syllables in English, and what speed the voice engine should use to speak the words. Those parameters need to be adjusted to meet our design requirements and provide efficient feedback to the user.
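To reason about speaking speed, the time a message takes can be roughly estimated from its word count and a speaking rate. This is only a sketch under stated assumptions: the function name `estimate_speech_seconds` is hypothetical, the default of 150 words per minute is an assumed typical rate rather than a measured setting of our engine, and counting words ignores syllable length, so it is a coarse proxy.

```python
def estimate_speech_seconds(text, words_per_minute=150.0):
    """Estimate how long `text` takes to speak at the given rate.

    Words are counted by splitting on whitespace, so "20" counts as one word.
    """
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    word_count = len(text.split())
    return word_count / words_per_minute * 60.0


if __name__ == "__main__":
    message = "Raise your arm by 20 degrees"
    for rate in (120.0, 150.0, 180.0):
        seconds = estimate_speech_seconds(message, rate)
        print(f"{rate:.0f} wpm: about {seconds:.1f} s")
```

For the six-word sample message, 150 words per minute works out to about 2.4 seconds, which gives a starting point for judging whether feedback arrives quickly enough for the user.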

I also looked into various online forums and design documents for help with the installation, and learned about similar problems that other users previously had when installing older versions of the package. I learned a lot more about Python packages related to machine learning and voice generation.
