Team Status Report for 03/16/2024

Risk Mitigation

The web app portion of our system is larger and requires more effort than we expected. Therefore, Lynn may help Yuxuan to make sure that part is finished in time.

In addition, one risk for our project is that we have not tested any code on the Raspberry Pi. We will load the code onto the Raspberry Pi next Monday (our ethics lecture is on Wednesday). Since we still have a week to prepare for the first demo, we should have enough time to work through any problems.

Design Changes

We did not make significant design changes this week. We mainly specified details of things we mentioned before. For example, we previously decided to use imperative sentences for commands; this week, we specified which words (generate, enter, …) we would use.

Updated Schedule

The schedule will stay the same as we mentioned last week.

Things we need to finish by next week:

  1. Get things running on the Raspberry Pi monitor (the web app should show up, and adding an entry should work)
  2. Finish the ethics assignment (attend the lecture + Slack response)

Yixin’s Status Report for 03/16/2024

  • What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

Work with Team:

  1. Further finalized the commands we will use. The notes from this discussion helped me rewrite the test cases for the spaCy model.
  2. Worked with the team on the ethics assignment.
  3. Discussed with Lynn what the prices will look like in the generated scripts (will the script give “five dollars” or “5 dollars”?). This will help me train the spaCy model.
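One way to make the "five dollars" vs. "5 dollars" question less critical is to normalize the transcript before it reaches the spaCy model. The snippet below is only a minimal sketch of that idea: the function name normalize_price and the small number-word table are hypothetical and not part of our scripts, and it covers whole-dollar amounts up to ten only.

```python
import re

# Hypothetical vocabulary for illustration: whole numbers zero to ten only.
_NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

_PATTERN = re.compile(
    r"\b(" + "|".join(_NUMBER_WORDS) + r")\s+(dollars?)\b",
    flags=re.IGNORECASE,
)


def normalize_price(text):
    """Rewrite '<number word> dollar(s)' as '<digits> dollar(s)'."""
    def replace(match):
        value = _NUMBER_WORDS[match.group(1).lower()]
        return f"{value} {match.group(2)}"

    return _PATTERN.sub(replace, text)
```

With this in place, the training data would only need to contain the digit form of each price.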

Individual work:

The main task for me is still the spaCy model.

The function to identify the action verbs works well. In our example run, actions.txt contains some commands as input, and running get_action.py returns the action verbs we need.
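To illustrate the input/output shape of this step without spaCy, here is a standard-library stand-in. It is not the code in get_action.py: it simply takes the first word of each imperative command and checks it against a verb list. The names KNOWN_ACTIONS, get_action, and get_actions are hypothetical, and the verb list is partial (only the two verbs named in the team report).

```python
# Partial, hypothetical verb list: only the verbs named in the team report.
KNOWN_ACTIONS = {"generate", "enter"}


def get_action(command):
    """Return the leading verb of an imperative command, or None if unknown."""
    words = command.strip().lower().split()
    if not words:
        return None
    first = words[0].strip(".,!?")
    return first if first in KNOWN_ACTIONS else None


def get_actions(lines):
    """Map each non-blank command line to its action verb (or None)."""
    return [get_action(line) for line in lines if line.strip()]
```

A check like this could also serve as a fallback when the POS tagger mislabels the first word of a command.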

As I mentioned in the previous status report, I want to train the spaCy model on larger datasets. The base_config.cfg file stays the same, but I changed the preprocessor file. Previously, I entered commands manually, which took a lot of time. With the new file, I can supply txt files and let the code read them, which makes the training process easier.

I have also prepared the txt file for training (it contains about 100 commands), but I have not started the new training yet.
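The file-reading part of the preprocessor can be sketched roughly as follows. This assumes one command per line in each txt file; the function name load_commands is hypothetical, and the real preprocessor also has to attach labels before training, which is not shown here.

```python
from pathlib import Path


def load_commands(paths):
    """Read one command per line from each text file, skipping blank lines."""
    commands = []
    for path in paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line:
                commands.append(line)
    return commands
```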

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

on schedule

  • What deliverables do you hope to complete in the next week?
  1. train the spaCy model & test accuracy
  2. run the code on the Raspberry Pi monitor
  3. prepare for the demo

Lynn’s Status Report for 03/16

Progress

I worked on finalizing the speech recognition pipeline and did basic testing of the subsystem this week. To simplify the script and save memory, I decided not to temporarily save the audio inputs as .wav files. Instead, I chose to feed the byte frames directly into the noisereduce and speech recognition methods.

After researching and doing some preliminary testing on the first version of the script, I realized that it is not necessary to include both a “start” and an “end” event to manually control audio recording. With the modified version of the audio recording and speech recognition pipeline, the recording process terminates automatically after a specific time, and the current session ends if nothing is heard from the speaker for another set time period.
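As a rough illustration of skipping the .wav step, the recorded chunks can be joined and decoded in memory. The sketch below assumes 16-bit mono PCM chunks in native byte order (an assumption about the stream format) and uses only the standard library; the name frames_to_samples is hypothetical. Since noisereduce operates on NumPy arrays, the actual script would likely do the equivalent conversion with NumPy instead.

```python
from array import array


def frames_to_samples(frames):
    """Join raw 16-bit PCM chunks and decode them into signed integer samples."""
    samples = array("h")  # 'h' = signed 16-bit integers, native byte order
    samples.frombytes(b"".join(frames))
    return samples
```

Nothing is written to disk here, so memory use is bounded by the length of one recording.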
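The stopping logic can be sketched as a loop over audio chunks that tracks elapsed time and continuous silence. This is a simplified model, not the actual script: the function names, the RMS-based silence check, and all thresholds are assumptions for illustration, and the chunks are assumed to be 16-bit PCM bytes.

```python
import math
from array import array


def rms(chunk):
    """Root-mean-square level of one chunk of 16-bit PCM bytes."""
    samples = array("h")
    samples.frombytes(chunk)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def record_session(chunks, chunk_s, max_record_s, max_silence_s, threshold):
    """Consume chunks until a time limit or a silence timeout is reached.

    Returns (kept_chunks, reason), where reason is 'time_limit',
    'silence', or 'end_of_input'.
    """
    kept, elapsed, silent = [], 0.0, 0.0
    for chunk in chunks:
        kept.append(chunk)
        elapsed += chunk_s
        # Reset the silence timer whenever a chunk is loud enough.
        silent = 0.0 if rms(chunk) >= threshold else silent + chunk_s
        if silent >= max_silence_s:
            return kept, "silence"
        if elapsed >= max_record_s:
            return kept, "time_limit"
    return kept, "end_of_input"
```

In the real pipeline the chunks would come from the microphone stream rather than from a list.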

The current script can recognize the standard commands with acceptable accuracy.

A major focus of the speech recognition process is the price. Currently, the price is recognized accurately if the “dollar” keyword is included. On the other hand, if the speaker gives a vague command such as “four-sixty”, the recognizer converts it directly to “460”, which differs from the expected value. We may need further discussion on how to deal with this.
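One option we could discuss is to flag suspicious prices instead of guessing. The sketch below is only a possible policy, not something we have decided on: the function name parse_price and the rule "a bare whole number of 100 or more without the dollar keyword is ambiguous" are both hypothetical.

```python
import re


def parse_price(transcript):
    """Return (amount, is_ambiguous) for the first number in a transcript."""
    match = re.search(r"\d+(?:\.\d+)?", transcript)
    if match is None:
        return None, False
    number = match.group()
    amount = float(number)
    has_unit = re.search(r"\bdollars?\b", transcript, re.IGNORECASE) is not None
    # A bare whole number such as "460" may really be a spoken "four-sixty".
    ambiguous = (not has_unit) and ("." not in number) and amount >= 100
    return amount, ambiguous
```

When the flag is set, the app could ask the user to confirm the amount before saving the entry.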

Schedule

I am a little behind schedule on testing the scripts on the RPi, but I will catch up next week.

Next Step

I will test the pipeline on the RPi in both quiet environments and crowded environments. Also, I will work with Yuxuan to implement the web app and connect the front-end buttons to the speech recognition pipeline.

Yuxuan’s Status Report for 3/16

Progress

My main focus this week was the web application. I wrote HTML files for the 3 main pages: new entry, spending list, and financial report. I also implemented a navigation bar to jump between pages. I initialized the 6 categories in a migration file so that users can select from the preset categories in a dropdown when creating an entry. Some buttons are placeholders for now, and the CSS styling still needs refinement.

Schedule

I am on schedule.

Next Steps

Next week I will continue working on the web app, improving the styles of the components and completing the functionality. I aim to make the app functional with manual input by the interim demo.

Lynn’s Status Report for 03/09

Progress

The primary focus for the week was the design report, and I spent time composing the Architecture and/or principle of operation, System Implementation, and some parts of the trade studies. To better clarify the design details and hierarchy, I added several new diagrams for both the hardware and software systems. I also formatted the design report for the team after the contents were finished.

I spent the rest of the week diving deeper into the signal processing section. After successfully recording audio using PyAudio, I started to write scripts for noise reduction and tested the different noise reduction libraries using audio inputs recorded in louder environments. I also wrote scripts to test the speech recognition function.

Schedule

I am on schedule

Next Step

After the preliminary test passes for noise reduction, I am planning to test the functionality on the RPi, since we have set up the Raspberry Pi and monitor. I will also start to construct the whole pipeline for signal processing.

Yixin’s Status Report for 03/09/2024

  • What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

Last week, I set up the hardware with my group. The Raspberry Pi, monitor, and microphone are connected, and our hardware setup works fine. The touch screen of the monitor is responsive, and we could run some Python programs on it. For next week, we need to set up the on-screen keyboard for the monitor (connect a wired keyboard first, download the packages needed for the on-screen keyboard, then disconnect the wired keyboard).

As before, my main task is to work with spaCy. I adjusted the strategy for training spaCy a little because we refined the requirements for our app. Before this week, I planned to train the model to assign labels to the action verbs we want. This week we refined the design of our app so that users are required to use imperative sentences. Therefore, I can use the POS attributes of spaCy to identify the verbs directly. I have tested the new design, and it has high accuracy.

In addition, I still want to train the model to make sure it can assign labels (items & prices) to the corresponding words in the sentence. I have written the script to train the labels and already trained it with a small dataset (10 words for each label). This seems to work. I will train it with larger datasets next week to test the accuracy.

I also wrote many parts of the design report, and our team met to discuss some design details.

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

on schedule

  • What deliverables do you hope to complete in the next week?
  1. download on-screen keyboard
  2. train the spaCy model with larger datasets

Yuxuan’s Status Report for 3/9

Progress

This week I focused on composing the design review document. Specifically, I worked on the introduction, the word2vec part of system implementation, the word2vec and database parts of the trade studies, and the testing, project management, and summary sections. Since I was responsible for the word2vec part, I tested a pretrained word2vec model provided by gensim on my laptop. It took me a long time to set up the library and the model because gensim does not support the latest versions of Python. I was able to get the vector representation of a word and the similarity between two words. Now that I have familiarized myself with the library and customized the model for our use, installing and using the required modules on the Raspberry Pi, which is the next step, should go smoothly.
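For reference, the similarity between two word vectors is a cosine similarity, which can be written out by hand as below. This is only an illustration of the measure using plain Python lists; it does not load the pretrained model, and in practice gensim computes this for us.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

A value near 1 means the two words point in nearly the same direction in the embedding space, which is what we would rely on when matching an item name to a category.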

Schedule

I am behind schedule because the design report took me longer than expected. I will focus on web app development next week as I have slack time after next week based on the original schedule.

Next steps

Next week I will implement the basic functions of the app (not including the audio input function) and continue integrating components on the Raspberry Pi with my teammates.

Team Status Report for 3/9

Risk Mitigation

We started assembling the Raspberry Pi and the monitor last week in case any changes needed to be made before we wrote the design report. The integration worked as expected, and we are now able to program on the Raspberry Pi with the touch screen and a keyboard. Next week we will also integrate the microphone (it should be available once plugged in) and test the built-in speaker and the on-screen keyboard to make sure all hardware components work together.

Design Changes

We originally planned to use a 10000mAh power bank, but after recalculating the power consumption, we might only need a 3500mAh power bank, a change reflected in the design report.
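The kind of estimate behind this sizing can be sketched as below. The formula is the usual capacity = current × time, scaled by a conversion efficiency; the function name, the default efficiency, and any numbers passed in are placeholders, not our measured values (those are in the design report).

```python
def required_capacity_mah(avg_current_ma, hours, efficiency=0.8):
    """Estimate the power-bank capacity (mAh) needed for a given runtime.

    efficiency is a hypothetical allowance for conversion losses,
    expressed as a fraction in the range (0, 1].
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return avg_current_ma * hours / efficiency
```

For example, a placeholder load of 500 mA for 2 hours at 80% efficiency would need about 1250 mAh.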

Updated Schedule

We are a bit behind schedule due to the time spent on the design report. The updated schedule is attached below.

This Week Special

Part A was written by Yixin, Part B was written by Yuxuan, and Part C was written by Lynn (Tianyi).

Part A

In considering global factors, this app addresses the fundamental need for financial management across different demographics. People all over the world, not only students and not only people in Pittsburgh, need to track their spending. By leveraging voice recognition technology, the app significantly lowers the barrier to entry for users. It would especially help people with limited literacy or visual impairments, but it would also improve the experience for general users. This inclusivity ensures that people, regardless of their technological proficiency or physical capabilities, can manage their finances efficiently.

Our current app’s focus on English-speaking users is designed to refine and perfect the user experience, ensuring that the core functionalities—such as expense tracking, report generation, and voice recognition—are robust and user-friendly. This strategic approach allows us to cater effectively to a significant portion of the global population, providing them with a powerful tool for financial management.

Part B

Our target users are mainly visually impaired people and the elderly group, and our design takes two main cultural factors into consideration.

One factor is social inclusion. Both visually impaired and elderly people tend to feel marginalized by society. Specifically, many money tracking tools on the market are applications on computers or mobile phones, to which these minority groups might have no access. Our product, however, provides affordable access to money tracking for these groups. By supporting audio input and output, we enable these groups to use a money tracking tool like everyone else, supporting their sense of belonging to society.

The other factor is simple operation. The elderly and the visually impaired might have trouble interacting with a complex system because they cannot see the page or understand the components. Therefore, we designed a simple UI so that users can interact with the app with little effort and almost hands-free. This enhances the user experience for the targeted groups while keeping the essential functionality of a money tracking app.

Part C

A major design consideration in our product is power consumption. While a portable power bank is attached to the device for use in environments without an accessible outlet, it is generally preferred to plug the device into a stable charger to guarantee that the system functions properly. Continuous charging may result in excessive power consumption. To avoid potential power waste, customers are encouraged to disable HDMI when they are not using the device.

Another concern is the screen radiation emitted from the touchscreen monitor in the device. Long-term exposure to such an environment may cause health problems. However, the average screen time for our design is estimated to be under 30 minutes per day, so such radiation problems are unlikely to occur.

Yixin’s Status Report for 02/24/2024

  • What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

I tested the PyAudio speech recognition part with Lynn, and this part works well. It can clearly identify many instructions that we may use (e.g. generate the report for last week, buy a book with five dollars, etc.). We also received the hardware. My groupmate has tested whether the microphone works. We would test the speech recognition

I have also prepared a small dataset for the classification model. I grabbed item names from a shopping website and assigned them different tags (the 6 different tags). I am waiting for the word2vec model to be set up so that I can test the model.

Last week, I learned the attributes spaCy has. This week I learned how to train the spaCy model, and I may begin training it next week. I may first train the spaCy model to identify the money value in the sentence.

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

on schedule

  • What deliverables do you hope to complete in the next week?
  1. set up the hardware with the group (especially Raspberry Pi)
  2. revise the design report based on comments received
  3. train the spaCy model

Lynn’s Status Report for 02/24

During Monday’s meeting, I gave the design review presentation for our team. As the speaker, I spent the majority of my time in the first half of the week preparing for the speech. Following the job breakdown, my focus for the design review was still Solution Approach and Implementation Plan, so I made the slides for these parts and finalized the presentation script with my teammates. I also updated the block diagram accordingly.

After the presentation, I started to write Python scripts for the audio input with the microphone connected. Based on the design, the PyAudio library is used to start and end speech recording for user input commands. The input stream is then fed to the PyPI noisereduce algorithms.

Schedule

I am on schedule

Next Step

During the team meeting, different parts of the design report were discussed and assigned. Therefore, I will focus on writing my part of the design report, Architecture and/or principle of operation, System Implementation, and some parts of the trade studies, next week. I will also write more scripts for audio input analysis.