February 2025 – Team E0: VoiceVault

February 25, 2025 / March 9, 2025

Team status report for 3/8

The most significant risk to the success of our project is getting everything to run on the board. As explained below, we are adding two new features and moving part of the website onto the board, which means we will need more resources on the board itself. To mitigate this risk, we will get an MVP up and running as soon as possible so we can test it on the board. That way we can see whether our Raspberry Pi 4 with 4 GB of RAM is enough or whether we need to upgrade to a different board.

This week we decided to make some pretty substantial changes based on the design presentation feedback:

Add 2 new features to increase the scope of our project: (1) a timer feature and (2) an MP3-player-like feature where users can upload songs to their device and play them from it.
We decided to move part of the website from our own hosting onto the device. The website had 2 main purposes: (1) configuring your device by passing in your VM endpoint and (2) everything else, such as setup instructions, data privacy policies, and Docker containers. Everything in (2) is strictly static, meaning the same content serves every user, so we will continue to host that part. However, we decided to move (1) onto each user’s own device. Doing so eliminates all third parties, including ourselves. If we had kept (1) on our end, either we or a third party would have had to store user info (we chose Clerk, which is a third party, but implementing auth ourselves would have meant storing user info ourselves), which goes against what we want to do. With configuration done on the device, no third-party system is involved and the user owns every part of the system, which is our intended goal.
We held a meeting with David Brumley, who gave us advice on how to define privacy. Based on his feedback, we will add a terms of service agreement to our website to give users visibility into how their data is used. We want to make it explicit that neither we nor any other third party has access to their data or information at any point: their data is not stored, it is not reused to train any models, and so on. We need to make it very clear that they own all aspects of the user experience.
We also decided to have users set up SSL encryption on their VMs (with instructions from us). This protects the endpoint requests between the devices and the VMs against man-in-the-middle attacks, which was a point of concern. Once this is done, we can ensure that privacy is maintained for all data transfers.

Based on these changes, this is a rough estimate of what we want our schedule to look like in the upcoming weeks:
Week 1-2: finalize the design report, remove auth and move configuration to a separate device-hosted website, start working on the Docker containers, start testing speech to text and text to speech.
Week 3-4: polish the UI and have it fully working, finish the ToS, have the Docker containers ready and on the website with instructions, test individual components.
Week 5-6: test everything end to end, write the setup instruction guidebook.
Week 7: slack time to finish anything we didn’t finish before. This accounts for unforeseen circumstances and pivots.

Week specific report:
Part A (David): Our product affects one main global factor: it ensures privacy for the user. In the US this is mainly a data privacy issue, as our government is most likely not actively using our data to control us. In other parts of the world, however, it could help protect users against more controlling governments and therefore offer physical protection. In areas with little or no freedom of speech, a voice assistant in the house can be a danger if its data is sent to an unknown location.
This product could also help protect government officials or anyone else with confidential information. Because most voice assistant companies are in the US, people in other countries may mistrust them, especially people who discuss confidential information in their homes. Our product ensures that all data stays within the user’s control, so people across the world can feel free to say anything they want without it being sent to a US server.

Part B (Kemdi Emegwa): Voice Vault aims to preserve the right to privacy and the right to consent while still allowing users to leverage state-of-the-art artificial intelligence. By letting users host their own model, whether in the cloud or on their own local server, it gives them the ability to dictate how their data is stored and used. At a time when AI/ML advancements have been at odds with user concerns about data privacy, our lightweight solution can bridge the gap.

Part C (Justin Ankrom): Voice Vault minimizes environmental impact by reducing energy consumption and electronic waste. It runs on a low-power Raspberry Pi 4 board with 4 GB of RAM and has local storage via microSD, which reduces dependence on external servers for storage and lowers the system’s carbon footprint. Its customizable design extends hardware lifespan by allowing upgrades instead of full replacements, reducing electronic waste. By supporting self-hosted LLMs, Voice Vault eliminates reliance on large-scale data centers, further decreasing energy consumption.
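To illustrate the SSL decision in this report, here is a minimal sketch of how the device could refuse to talk to a VM endpoint that is not served over HTTPS. It uses only the Python standard library. The function names and the example URL are hypothetical and are not taken from our codebase.

```python
import ssl
import urllib.request
from urllib.parse import urlparse


def validate_endpoint(url: str) -> str:
    """Accept only https:// URLs with a hostname (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"endpoint must be an https:// URL, got: {url!r}")
    return url


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context.

    ssl.create_default_context() turns on certificate verification and
    hostname checking, which is what guards against man-in-the-middle attacks.
    """
    return ssl.create_default_context()


def open_endpoint(url: str, timeout: float = 10.0):
    """Open a verified HTTPS connection to the user's VM endpoint."""
    return urllib.request.urlopen(
        validate_endpoint(url), timeout=timeout, context=make_tls_context()
    )
```

If a user sets up their VM with a self-signed certificate, the default context would reject it. One option in that case is to pass the certificate file to `ssl.create_default_context(cafile=...)` rather than disabling verification.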

February 23, 2025

Team status report for 2/22

This week we spent a lot of time developing with the device, now that we have the Raspberry Pi on hand. David researched how we will begin integrating our other hardware components, such as the speaker and microphone, into our solution. Kemdi worked on allowing the device to begin communicating with first-party hosted models. Meanwhile, Justin continued to lead development of the website in line with the strategic pivot we made a couple of weeks ago. After that pivot, we don’t foresee any more changes to our plan. The main risk right now is integrating our hardware with our software properly. This will likely be the most time-consuming task.

February 23, 2025

Justin Ankrom’s Status Report for 2/22

This week I worked on setting up user board configuration through our software, which we store in Clerk metadata. Progress is on schedule. Next week, I hope to refine the website so that it is fully functional and reflects the changes from our pivot, and to do some preliminary testing of the website and board configurations.

February 23, 2025

David Herman’s Status Report for 2/22

This week I did some more work on setting up the Raspberry Pi. I also researched how we will integrate all the peripherals with the board, as the parts have just come in. We hope to have the physical hardware (board, microphone, speaker) set up by next week.

February 23, 2025

Kemdi Emegwa’s Status report for 2/22

This past week, I spent some time testing the code I wrote last week, which allows the device to send queries to a model hosted by a third party. Additionally, I started writing code that allows the device to target a first-party model hosted in a Docker container. We are currently ahead of schedule, and I will spend next week finalizing this code.

February 16, 2025

Justin Ankrom’s Status Report for 2/15

This week I added authentication to our website. We chose Clerk as our authentication provider, since we get up to 10k active users for free and it makes auth very easy to set up without our having to build it ourselves. A user now has to be authenticated to enter our website. My progress is on schedule. Next week I will add functionality for users to link their account with their physical board, so we can ensure that only they can access and modify their board through our software.

February 16, 2025

Team Status Report for 2/15

We don’t have any significant risks at the moment that could jeopardize the success of our project. We did, however, make changes to the design of our system. Originally we had planned to use 2 different types of models: a realtime model and a non-realtime model. However, we decided to pivot and remove the realtime model because we believe it would compromise the safety and privacy of the user, which is our biggest requirement and goal. This pivot does not change our schedule. This week we got authentication working, set up our board, and got text to speech and speech to text working.

Status Report 2 specific questions:
Part A (Justin Ankrom): Voice Vault is designed to enhance public health, safety, and welfare by prioritizing user privacy. In terms of health, we support users’ psychological well-being by ensuring their interactions with the assistant remain entirely private, mitigating concerns about data surveillance and unauthorized access. Voice Vault operates on self-hosted infrastructure, preventing leaks that could expose sensitive personal information. We ensure that their interactions are never exposed to third-party entities. This design is particularly valuable for individuals handling sensitive information.

Part B (David Herman): Voice Vault can affect social and political factors through its protection against data leaks and corrupt companies. Data leaks happen frequently, even when the company storing the data is not actively sharing it. Data can also be misused, which has happened often before and still happens now. Information such as what a person asks a voice assistant in their own home is very sensitive and is best kept to oneself. Our device offers a private solution to users who do not trust where their data goes or who holds it.

Part C (Kemdi Emegwa): Voice Vault is well positioned economically. It has a reasonably low cost basis, so if it were coupled with high-volume production, the economics of the project would be very favorable. The cost of producing our prototype is expected to be under $150, which would allow us to price the product extremely competitively if we wanted to sell it. Additionally, since users can host their own model, they have complete control over their cloud costs.

February 16, 2025

Kemdi Emegwa’s Status Report for 2/15

I mainly spent this week writing the code for the query workflow. We now have preliminary code that uses the CMU PocketSphinx speech-to-text engine to transcribe the query and sends it from the device to the server hosting the model. It then receives the query result and uses eSpeak to read the result aloud through the speaker. I am currently on track and will be looking to make this more robust, and I will start working on the dockerization of the model.
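The workflow above can be sketched roughly as follows. This is an illustrative outline and not the actual project code. The speech-to-text step is passed in as a plain callable so that PocketSphinx (or any other engine) can be plugged in. The JSON request and response shape (`{"query": ...}` and `{"response": ...}`) and all function names are assumptions made for the example. The only eSpeak feature used is the `espeak "<text>"` command-line call.

```python
import json
import subprocess
import urllib.request
from typing import Callable


def query_model(endpoint: str, text: str, timeout: float = 30.0) -> str:
    """POST the transcribed query to the model server and return its reply.

    The JSON field names used here are hypothetical.
    """
    body = json.dumps({"query": text}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]


def speak(text: str) -> None:
    """Read text aloud through the speaker using the espeak CLI."""
    subprocess.run(["espeak", text], check=True)


def handle_utterance(
    transcribe: Callable[[], str],
    ask: Callable[[str], str],
    say: Callable[[str], None],
) -> str:
    """Run one pass of the workflow: speech to text, query, text to speech."""
    text = transcribe().strip()
    if not text:
        # Nothing was recognized, so skip the network call and stay silent.
        return ""
    answer = ask(text)
    say(answer)
    return answer
```

On the device, `transcribe` would wrap the PocketSphinx decoder, `ask` would be something like `lambda q: query_model(endpoint, q)`, and `say` would be `speak`. Keeping the three steps as separate callables also makes each one easy to test or swap out on its own.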

February 15, 2025

David’s Status Report for 2/15

This week I ordered a replacement microphone, as the one previously ordered from the 18-500 inventory was too expensive and needed to be reserved for other potential groups. I then configured our Raspberry Pi by downloading the OS on my laptop and plugging the SD card into the board. I also configured the board to connect to CMU-SECURE.

February 9, 2025

Kemdi Emegwa’s Status Report for 2/8

This week I primarily worked on researching the mechanisms we are going to use for text-to-speech and speech-to-text. Python has many established libraries for this purpose, but we have the added constraint that whatever model we use will have to run directly on the Raspberry Pi 4. I was able to find a lightweight, open-source model called PocketSphinx that was developed by CMU researchers. It will likely work well for our use case because the model is small enough to run locally on limited hardware. We are currently on schedule, and for the upcoming week I plan to finish the Python code for the Raspberry Pi so we can start using speech-to-text on the device.