Team Status Report for 4/27

Risk Mitigation

The main risk of our system is the possible failure in parsing the voice commands. To resolve this issue, we listed out the supported commands that are guaranteed to work and integrated these instructions in the “help” command. The user can simply say “help” to get sample commands and start exploring more from there.

 

Design Changes

There is no change in our design.

 

Updated Schedule

The updated schedule is as reflected in the final presentation.

 

List of all tests

Unit tests

Test | Expected Performance | Actual Performance
Latency (voice command) | 4 seconds to render the page | Average 4.52 seconds to render the page
Battery life | Consumes less than 50% of power with monitor on for 1 hour | Power drops from 100% to 77%, consuming 23% of total power
Portability | 500 grams | 500 grams
Accessibility | 32% of right half of screen | 35.45% of right half of screen
Noise reduction | 90% accuracy of integral test under 70dB environment | 9 out of 10 commands work as expected, 90% accuracy
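
As a rough reading of the battery-life result, the measured drop can be extrapolated to a full-charge runtime. The sketch below assumes the drain stays linear over the whole discharge, which the test did not verify, and the function name is only illustrative.

```python
def estimated_runtime_hours(percent_used: float, hours_measured: float) -> float:
    """Extrapolate full-charge runtime, assuming linear drain (an assumption)."""
    if percent_used <= 0:
        raise ValueError("percent_used must be positive")
    return 100.0 * hours_measured / percent_used


# Measured in the unit test: 23% consumed in 1 hour with the monitor on.
runtime = estimated_runtime_hours(23, 1)
print(f"Estimated runtime: {runtime:.1f} hours")
```

Under that assumption, a full charge would last a little over four hours with the monitor on.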

 

Test | Expected Performance | Actual Performance
Audio to text | Less than 20% of word error rate (WER) | 98.3% accuracy, edge case exists
Text to command (NLP) | Identify verb with 100% accuracy; identify item name, money, date, and number with 95% accuracy; NLP process takes less than 3s | Identify verb, money, date, and number with 100% accuracy; identify item name with 96% accuracy; NLP process takes about 2.5s
Item classification (GloVe) | 90% of item names correctly classified | 18 out of 20 item names are correctly classified, 90% accuracy
Voice response | All audio requests should be assisted with a voice response | The app will read out the content of the response (entries, report, etc.)
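
The audio-to-text target is stated as a word error rate, while the result is reported as accuracy. For reference, WER is conventionally the word-level edit distance between the reference and the recognized text, divided by the number of reference words. The sketch below is a generic implementation of that definition, not the script used for the test, and the sample command is made up. The report does not state whether the 98.3% figure was computed as 1 - WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical command with one misrecognized word ("for" heard as "four").
wer = word_error_rate("enter apple for five dollars",
                      "enter apple four five dollars")
print(f"WER: {wer:.0%}")
```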

System tests

User Experience Test: Each of the 5 volunteers has half a day to interact with our product without our interference. Volunteers are expected to give feedback about what they like and dislike about the product.

Overall Latency: Test different commands (e.g. record/change/delete entry, view entry, generate report, etc.) Expect <15s from pressing the button to rendering the page for all commands.

Overall Accuracy: Expect >95% accuracy for the whole system.

Test Findings

Some improvements could be made to the user interface. The font could be larger for normal users. We support the “remove” command but not the “delete” command, while the interface has delete buttons, which could be misleading. The audio instruction is too long for some users to remember which commands are supported. It might be necessary to tell users to follow the instructions strictly, or the accuracy of commands will be very low. We will make the corresponding improvements before the final demo.
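
One possible way to handle the “remove” versus “delete” mismatch is to map synonymous verbs to a single canonical command before dispatching. This is only a sketch: the alias table and the function name are hypothetical and are not taken from our implementation.

```python
# Hypothetical alias table: spoken verb -> canonical command name.
VERB_ALIASES = {
    "remove": "remove",
    "delete": "remove",
    "erase": "remove",
    "enter": "enter",
    "add": "enter",
    "record": "enter",
}


def canonical_verb(verb: str):
    """Return the canonical command for a spoken verb, or None if unsupported."""
    return VERB_ALIASES.get(verb.strip().lower())


print(canonical_verb("Delete"))
print(canonical_verb("dance"))
```

An unsupported verb returns None, which the caller could use to trigger the “help” response.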

Team Status Report for 04/20

Risk Mitigation

Our team collaborated to finish all the required functionalities of our application this week. While implementing the web application's request handler functions, we found that some verbs may not be distinguished correctly, which leads to bad performance. We therefore applied a newly trained spaCy model to deal with the edge cases. We also constructed specific handlers for potential failures found during testing.

We also worked together to prepare for the final presentation. We conducted unit testing on each module separately and did some preliminary integration testing on the web application. After that, we designed the final presentation slides.

Design Changes

There is no design change made.

Updated Schedule

There is no schedule update this week.

Team Status Report for 04/06/2024

Risk Mitigation

The main task this week was the interim demo. We received valuable advice from the professor and TA. word2vec works on our computers, but installing it on the Raspberry Pi caused compatibility problems, so we will switch to GloVe. We ran simple scripts with GloVe on the Raspberry Pi and it works well, giving results similar to word2vec. We now have all the parts needed to add entries (web app, speech recognition, parsing, and classification model). All we need to do next is combine them.

Design Changes

Change word2vec to GloVe due to Raspberry Pi compatibility.
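
Classification with word embeddings typically assigns an item name to the category whose vector is most similar by cosine similarity. The sketch below illustrates that idea only. It uses tiny hand-made 3-dimensional vectors in place of real GloVe embeddings, and the categories, vectors, and function names are all made up for illustration.

```python
import math

# Toy vectors standing in for real GloVe embeddings (an assumption).
TOY_VECTORS = {
    "food": [0.9, 0.1, 0.0],
    "transport": [0.0, 0.9, 0.1],
    "apple": [0.8, 0.2, 0.1],
    "bus": [0.1, 0.8, 0.2],
}
CATEGORIES = ["food", "transport"]  # hypothetical category names


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def classify(item):
    """Pick the category whose vector is closest to the item's vector."""
    vec = TOY_VECTORS[item]
    return max(CATEGORIES, key=lambda c: cosine(vec, TOY_VECTORS[c]))


print(classify("apple"))
print(classify("bus"))
```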

Updated Schedule

Our schedule does not change since last week.

Specific tasks we need to finish by next week:

  • Finish spaCy
  • Further refine GloVe classification
  • Integrate spaCy with webapp

Week’s Special

  • User Experience Test

We will invite 5 volunteers to use our product without our interference. They will explore the app on their own, as they would after downloading a new app on their phone. We will give them half a day to play with the app and then invite them to offer feedback, listing what they like and dislike about our product.

  • Latency Test

We will test different commands and record the time our whole system takes to process each one. For example, we will measure the time it takes for our system to display the added entries after the user requests them.
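
A latency test like this could be scripted with a small timing harness. The following is a minimal sketch, assuming the whole pipeline can be called as a single function. `fake_handler` is a placeholder for the real pipeline and the command strings are examples, not our exact command set.

```python
import time
from statistics import mean


def measure_latency(handler, commands, repeats=3):
    """Return the average wall-clock seconds the handler takes per command."""
    results = {}
    for command in commands:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            handler(command)
            samples.append(time.perf_counter() - start)
        results[command] = mean(samples)
    return results


def fake_handler(command):
    # Placeholder for the real pipeline (speech recognition, parsing, rendering).
    time.sleep(0.01)


latencies = measure_latency(fake_handler, ["view entries", "generate report"])
for command, seconds in latencies.items():
    print(f"{command}: {seconds:.3f} s")
```

Averaging over several repeats reduces the effect of one-off delays on the reported number.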

Team Status Report for 3/30

Risk Mitigation

This week our team integrated all components and scripts onto the Raspberry Pi to make sure that they are compatible. There was an issue when downloading spaCy, which was solved by upgrading the Raspbian OS from 32-bit to 64-bit. We successfully installed spaCy, PyAudio, and all related modules on the RPi, and we also set up Django and were able to run our web app on the RPi. We incorporated the microphone and made sure SpeechRecognition worked fine on the RPi.

Design Changes

There is no design change made.

Updated Schedule

The main functions left to implement include text-to-speech conversion, the audio input feature integrated into the web app, and report generation in the web app. Below is an updated schedule for the rest of the project.

Team Status Report for 03/23/2024

Risk Mitigation

Setting up the Raspberry Pi and downloading all required libraries took more time than we expected, because we had to work through several issues with the environment setup. To run pip on the RPi, we modified some built-in files and were then able to download all the required libraries.

A potential risk is running the trained spaCy model on the RPi. Our group is still training the model, so we have not yet been able to test how it works on the RPi. However, other signal processing features should be able to run successfully.

Design Changes

A major change in our design is the date information input for the View Entry page. We decided to align this input with the “Year-Month” selection of the “Financial Report” page. This improves user experience by standardizing the input information. Also, according to preliminary tests of speech recognition, a year and a month are recognized more reliably, while a full date is harder to transcribe correctly.
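
To illustrate the standardized input, a recognized phrase containing a month name and a four-digit year could be normalized into a “Year-Month” value as sketched below. The function name, the accepted phrasing, and the output format are assumptions for illustration, not our actual parser.

```python
import re

MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}


def parse_year_month(text):
    """Return 'YYYY-MM' for phrases like 'March 2024', or None if incomplete."""
    words = re.findall(r"[a-z]+|\d{4}", text.lower())
    month = next((MONTHS[w] for w in words if w in MONTHS), None)
    year = next((int(w) for w in words if w.isdigit()), None)
    if month is None or year is None:
        return None
    return f"{year:04d}-{month:02d}"


print(parse_year_month("View entries for March 2024"))
```

A phrase missing either the month or the year returns None, so the app could ask the user to repeat the command.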

Updated Schedule

Since we are generally on time, we will keep the current schedule for the week.

The RPi with the environment fully set up is attached.

Team Status Report for 03/16/2024

Risk Mitigation

The web app part of our system is larger and requires more effort than we expected. Therefore, Lynn may help Yuxuan to make sure that part is finished in time.

In addition, one risk for our project is that we have not tested any code on the Raspberry Pi. We will download the code to the Raspberry Pi next Monday (our ethics lecture is on Wednesday). Since we still have a week to prepare for the first demo, we will have enough time to figure it out.

Design Changes

We did not make significant design changes this week. We mainly specified details of decisions we had already made. For example, we previously decided to use imperative sentences for commands. This week, we specified which words (generate, enter, …) we will use.

Updated Schedule

The schedule will stay the same as we mentioned last week.

Things we need to finish by next week:

  1. Get things running on the Raspberry Pi monitor (the web app should show up, and adding an entry should work)
  2. Finish the ethics assignment (attend lecture + Slack response)

Team Status Report for 3/9

Risk Mitigation

We started assembling the Raspberry Pi and the monitor last week in case any changes need to be made before we write the design report. The integration worked as expected, and we are now able to program on the Raspberry Pi with the touch screen and a keyboard. Next week we will also integrate the microphone (it should be available once plugged in) and test the built-in speaker and the on-screen keyboard to make sure all hardware components work together.

Design Changes

We originally planned to use a 10000mAh power bank, but after recalculating the power consumption, we might only need a 3500mAh power bank, a change reflected in the design report.

Updated Schedule

We are a bit behind schedule due to the time spent on the design report. The updated schedule is attached below.

This Week Special

Part A was written by Yixin, Part B was written by Yuxuan, and Part C was written by Lynn (Tianyi).

Part A

In considering global factors, this app addresses the fundamental need for financial management across different demographics. People all over the world, not only students and not only people in Pittsburgh, need to track their spending. By leveraging voice recognition technology, the app significantly lowers the barrier to entry for users. It helps people with limited literacy or visual impairments in particular, and it also improves the experience for general users. This inclusivity means that people, regardless of their technological proficiency or physical capabilities, can manage their finances efficiently.

Our current app’s focus on English-speaking users is designed to refine and perfect the user experience, ensuring that the core functionalities—such as expense tracking, report generation, and voice recognition—are robust and user-friendly. This strategic approach allows us to cater effectively to a significant portion of the global population, providing them with a powerful tool for financial management.

Part B

Our target users are mainly visually impaired people and the elderly group, and our design takes two main cultural factors into consideration.

One factor is social inclusion. Both visually impaired and elderly people tend to feel marginalized by society. Specifically, many money tracking tools on the market are applications on computers or mobile phones, to which these minority groups might have no access. Our product, however, provides cheap access to money tracking for these groups. By supporting audio input and output, we enable these groups to use a money tracking tool like everyone else, supporting their sense of belonging in society.

The other factor is simple operation. The elderly and the visually impaired might have trouble interacting with a complex system because they cannot see the page or understand its components. Therefore, we designed a simple UI so that users can interact with the app without effort and almost hands-free. This enhances the user experience for the targeted groups while keeping the essential functionalities of a money tracking app.

Part C

A major design consideration in our product is power consumption. While a portable power bank is attached to the device for use in environments without an accessible outlet, it is generally preferred to plug the device into a stable charger to guarantee that the system functions properly. Continuous charging may result in excessive power consumption. To avoid potential power waste, customers are encouraged to disable HDMI when they are not using the device.

Another concern is the screen radiation emitted from the touchscreen monitor in the device. Long-term exposure to such an environment may cause diseases. However, the average screen time for our design is estimated to be under 30 minutes per day, so such radiation problems are unlikely to occur.

Team Status Report for 02/24

Risk Mitigation

According to both the feedback from the design review presentation and our own evaluation, the most significant risk is the NLP process. To further guarantee usability for visually impaired groups, we will add two features: an audio instruction that helps customers use the product more efficiently, and “start recording” and “end recording” audio notifications when the button is pressed, so that users know when they can start giving commands.

After the design review presentation, our group met again and went over the feedback together. Based on the feedback, we distributed the tasks by assigning each part of the design report to different group members. 

Design Changes

An audio instruction will be added to the web app to guarantee that the users can know the standard commands in advance. A “help” command is introduced in all pages of the web app so that the users could trigger the audio helper at any time. 

A “start recording” and “end recording” audio notification should be added each time the button is pressed. This ensures that the visually impaired groups could know when they can input vocal commands and when to stop. More details of the implementation will be included in the design report. 

Updated Schedule

The primary focus for next week is the design report. Each section of the report is assigned to one or more group members and should be reviewed before submission.

Team Status Report for 02/17/2024

Risk Mitigation

We chose spaCy as our NLP model for parsing the text and word2vec for classification. Our team began learning to use these libraries to see whether they can give us the desired accuracy.

In addition, we have a basic design of the UI, which turned out to be more difficult than we expected. Therefore, we may begin working on that part earlier. This problem can be solved if we spend more time on it.

For the hardware part, we need to power our system. Therefore, when the hardware arrives, we may need to use a battery to power our system.

Design Changes

  1. We will use a Raspberry Pi monitor for display (we previously planned to use phones/laptops).
  2. We decided to use PyAudio for translating speech to text.
  3. The UI design has become more specific.

Updated Schedule

We began looking at the UI earlier than planned (UI design starts on 2.18 in the schedule, but we began looking at it last week). In addition, we ordered our hardware earlier than planned. Next week, we will begin to run the NLP model, construct the web app UI, and assemble our hardware.

This Week Special

A was written by Yuxuan, B was written by Tianyi and C was written by Yixin.

Section A:

Public health:
Our product is an easy-to-use tool for money tracking. It allows the user to view a detailed list and an analysis of their expenses and can thus eliminate the user’s anxiety of potential overspending. With a healthy spending habit, people are well-prepared for financial emergencies since they can save money by carefully planning their spending based on their needs. All the benefits can be achieved with little effort due to the audio input functionality and portability of our product, so the users are more likely to form a habit of money tracking, and the records on the web app can reflect their true spending more accurately.

Safety:
Our product is safe to use in terms of user privacy. We use Raspberry Pi as our web server so that the web application does not connect to the internet, preventing potential attacks or hacking. User data is stored in a local database so that only the owner of the product can access the stored information. In addition, we can configure the Raspberry Pi so that the user can lock screen and unlock with their own password. If the product is lost, it cannot be unlocked by others, which prevents potential information leakage.

Welfare:
Our product fulfills the need of people who want to stick to their budget and form good spending habits. It helps people cut unnecessary expenses and focus more on certain categories of spendings. With clear financial goals and strategies, people can better identify and achieve their own needs, which are crucial indirect benefits our product provides.

Section B:

One of the major purposes of our design is to benefit visually impaired people who need to track their money daily. To guarantee that the design can be used effortlessly without vision, audio input and output assistants are integrated within the money-logging application. The design would also enhance the user experience for common customers because they could manage their money more conveniently by voice. That is to say, our design not only takes the special needs of minorities into account but also aims to improve the financial management experience for the general public.

Section C:

The production cost of our app would be low. One reason is that we use a Raspberry Pi, which is much cheaper than a smartphone. Another is that we have a local web app, so we do not need to purchase licenses.

Although we are trying to lower the price of our product, and our product does have many features (e.g. voice input) that other money tracker apps do not have, we still cannot ensure customers would buy our product. One of the biggest strengths of existing apps is that they are free. Therefore, we may need to run a survey on how much customers would be willing to pay for a better user experience.

Team Status Report for 2/10

Risk Mitigation

For now, we focus on selecting the most suitable NLP model for parsing the text and classifying the item name, which is critical for correctly generating the entry, a key feature of our product. We compared across models for matching commands, identifying elements, and classifying words and settled on spaCy and word2vec for these tasks. There are also backup libraries if the selected ones don’t work out. We will start working on spaCy and word2vec next week to make sure they work as expected before the design presentation.

Design Changes

We replaced the random forest model with word2vec for classifying item names because random forest requires feature engineering and is thus more suitable for classifying whole sentences or paragraphs. We will reflect this change in our block diagram for the design presentation/report.

Updated Schedule

No updates to the schedule. Next week we will address the comments on our proposal presentation provided by TAs and peers, discuss hardware order requests with faculty, design MVC framework for our Django web application, and run tests for every model/library we want to use as part of the speech recognition/NLP process on our laptop before the design presentation.