I spent this week doing a lot of different things mostly pertaining to the device, in addition we did an end to end test.
Firstly, as I mentioned last week we decided to upgrade back to a Raspberry pi 5 from the Raspberry pi 4 we had been using because the Raspberry pi 4 was not delivering the performance we wanted. I spent a bit of time configuring the new Raspberry pi 5 to work with our code and with the speakerphone we bought last week. Once I got it working, I test the Text to Speech models which were the reason we made the switch in the first place. Piper TTS, which was what I had wanted to move forward with previously was a lot faster, but still had a noticeable delay even when streaming the output. I plan on doing more research on faster text to speech models, but for right now we are using Espeak, which provides realtime tts, albeit at worse quality.
In addition, I started to think about how a user would approach setting up the device when they first get it. Operating under the assumption that the device was not already connected to wifi, the user needed a way to access the frontend. This posed a challenge. I did some research and found a solution: Access point mode.
Wifi devices can operate in 1 of 2 modes at any given time. Client mode And access point mode. Typically for our devices we use client mode and routers use access point mode, but by leveraging access point mode we are able to allow the user to access the device frontend without a wifi connection.
How this works is that when the device starts up and detects it is not connected to wifi it will activate access point mode. This emits a network that the user can connect to by going into the wifi settings and inputting a password. They then just have to go their browser and input the ip address/port where the flask server is being hosted. I have written scripts to automate starting up access point mode and turning it off, but more needs to be done to allow the device to detect it is not connected to wifi and use those scripts.
In the same vain of user experience, I configured the raspberry pi so that our scripts run on startup/reboot. This eliminates the need of a monitor to run the code, which is the way we envision users interacting with our project anyways.
Lastly, as a group we did a full end to end test with a cloud hosted open source model. We were able to test all our core functionalities, including regular LLM queries, music playback and alarms.
I don’t for see any challeneges upcoming and I believe we are on track. This upcoming week will be spent researching more TTS models, allowing the device to detect that it is not connected, and error handling.