yiningd – Page 2 – Team D3: Meal By Words

Nina Duan’s Status Report For 3/18/2023

Personal Accomplishment

In addition to completing the ethics assignment, I integrated our database module and preliminary NLP module with Lisa and modified the microphone & speech recognition system provided by Python’s SpeechRecognition library.

After integration, our system is now able to extract menu items and quantities from simple sentences, add them to an Order object, and upload that object to the database. However, there are still flaws with this simple system because we have yet to implement the checkout portion of the NLP module.

The open-source SpeechRecognition library provides basic real-time speech recognition that can be used with an external microphone. This process, however, leaves no room for noise reduction. Therefore, I explored the library’s source code, determined where the microphone’s input is read, and extended it to apply a noise reduction algorithm. For now, it uses a simple, deterministic noise cancellation algorithm that attempts to cancel out low-amplitude samples by mixing them with the signal’s inversion. By slightly altering this visualization tool, I was able to visualize the difference. This is what it looks like when I speak at conversational volume from a distance of ~0.7m, with a restaurant ambience noise YouTube video playing in the background (graphs are in time domain; top = raw microphone input, bottom = filtered input):
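The idea of cancelling low amplitudes by mixing with the inverted signal can be sketched roughly as follows. This is not the code that was added to the library: the function name, the threshold value, and the use of 16-bit integer samples are all assumptions made for illustration.

```python
from array import array


def cancel_low_amplitudes(samples, threshold=500):
    """Return a copy of `samples` in which quiet samples are cancelled.

    A sample whose absolute amplitude is below `threshold` (an assumed,
    hand-picked value) is mixed with its own inversion, s + (-s), which
    sums to zero. Louder samples pass through unchanged.
    """
    out = array("h")  # signed 16-bit samples, as in typical PCM audio
    for s in samples:
        inverted = -s
        out.append(s + inverted if abs(s) < threshold else s)
    return out


raw = array("h", [120, -300, 4000, -5200, 80, 2500])
print(list(cancel_low_amplitudes(raw)))  # [0, 0, 4000, -5200, 0, 2500]
```

A fixed threshold like this also removes quiet parts of speech, so a real filter would likely need a threshold tuned to the room.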

Speech with noise, with amplitude decreased.

About Schedule

I have caught up to the schedule. The microphone has been set up, and preliminary signal processing code has been written.

Plans for Next Week

I will continue to work with Lisa to improve our NLP & database modules, as this is the core part of our system. In addition, I will start installing the necessary dependencies on the microcontroller (RPi 4) and transferring our code to it.

Team Status Report For 3/18/2023

Risks

The greatest risk that we are currently facing is the low performance of the speech recognition system. As we started writing the speech recognition code, we realized that although the Python SpeechRecognition library usually returns coherent sentences, which helps our NLP system parse the input, its recognition speed and accuracy are not very promising. We will test how tuning certain parameters in the SpeechRecognition library affects accuracy, and in the worst case, we can switch to another speech recognition library compatible with Python.
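As a rough sketch of what such tuning could look like: the SpeechRecognition library's `Recognizer` exposes attributes such as `energy_threshold`, `dynamic_energy_threshold`, and `pause_threshold`. The specific values below, the helper names, and the choice of `recognize_google` as the backend are assumptions, not the settings the team ended up with.

```python
def tune_recognizer(recognizer, energy_threshold=400, pause_threshold=0.6):
    """Apply hand-picked (assumed) tuning values to a Recognizer-like object."""
    # Use a fixed energy threshold instead of letting the library adapt it.
    recognizer.dynamic_energy_threshold = False
    # Minimum audio energy to be treated as speech; higher ignores more noise.
    recognizer.energy_threshold = energy_threshold
    # Seconds of silence that mark the end of a phrase; lower responds faster.
    recognizer.pause_threshold = pause_threshold
    return recognizer


def listen_once():
    """Hypothetical helper: record one phrase and return the recognized text."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    recognizer = tune_recognizer(sr.Recognizer())
    with sr.Microphone() as source:
        # Sample the background for one second before listening.
        recognizer.adjust_for_ambient_noise(source, duration=1)
        audio = recognizer.listen(source, phrase_time_limit=5)
    # Assumed backend; needs internet access.
    return recognizer.recognize_google(audio)
```

Calling `listen_once()` requires a working microphone, so it is left uncalled here.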

Design Changes

We may be able to use fewer infrared sensors. We conducted testing this week and found that the system was still able to detect relatively short human figures accurately with just one or two sensors.

To accommodate inflexibilities in our current speech recognition and NLP modules, we decided that checkout can only be triggered by certain keywords (“checkout,” “finish,” and “done”). We will also be taking item orders one by one, so a sample interaction would look like this:

Kiosk: “Welcome! Please order your first item!”

Customer: “One hamburger, please.”

Kiosk: “You’ve ordered one hamburger. Is this correct?”

Customer: “Yes.”

Kiosk: “One hamburger, confirmed. Please order your next item, or say ‘finish’ to checkout.”

Customer: “Checkout.”

Kiosk: “Are you ready to checkout?”

Customer: “Yes.”

Kiosk: “Checkout successful! Your total is $XX.XX. Your order number is XX. You will be called when your order is ready. Thank you for using Meal By Words!”
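The keyword-triggered checkout in this interaction could be detected with something like the sketch below. Only the three checkout keywords come from the design above; the function names and the set of accepted confirmations are hypothetical.

```python
import re

# The three trigger words listed in the design change above.
CHECKOUT_KEYWORDS = frozenset({"checkout", "finish", "done"})
# Assumed set of confirmations; the sample interaction only shows "Yes."
CONFIRM_WORDS = frozenset({"yes"})


def words(utterance):
    """Lowercase the utterance and split it into alphabetic words."""
    return re.findall(r"[a-z]+", utterance.lower())


def is_checkout_request(utterance):
    """True if any checkout keyword appears as a whole word."""
    return any(w in CHECKOUT_KEYWORDS for w in words(utterance))


def is_confirmation(utterance):
    """True if the customer confirmed the kiosk's question."""
    return any(w in CONFIRM_WORDS for w in words(utterance))


print(is_checkout_request("Checkout."))               # True
print(is_checkout_request("One hamburger, please."))  # False
```

Note that a transcript of "check out" as two words would not match; handling that would need an extra rule.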

Schedule

There is no schedule change this week. Everyone is on track with our plan.

Team Status Report For 3/11/2023

New Tools

To properly access and control our cloud database, Redis, we need to use RedisLabs, an online platform for viewing and manipulating database settings, and RedisInsight, a desktop application that visualizes and allows manually changing database data. These tools will also allow us to view statistics about the database, such as latency and number of accesses, which may help with testing speed of service in the future.

In addition, we are planning to use some jQuery libraries to write JavaScript faster and more easily. jQuery also works across multiple browsers, so our code stays compatible regardless of which features a given browser supports.

Risks

As we developed the natural language processing algorithm further, we realized that it relies heavily on the grammatical structure of input sentences to capture the necessary information. The most significant risk is that if our speech recognition system fails to generate grammatically coherent sentences, it will be difficult to integrate the speech recognition and natural language processing subsystems. To mitigate this risk, we are ready to use the token matcher on top of the dependency matcher to capture keywords in the sentences instead of grammatical structures.
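To illustrate the keyword-based fallback, here is a minimal sketch that ignores grammar entirely and looks only for menu item names and the word directly before them. It uses plain regular expressions as a stand-in rather than the team's actual matcher; the function name, the menu subset, and the number-word table are assumptions.

```python
import re

# A small subset of the menu, for illustration only.
MENU_ITEMS = ("cheeseburger", "hamburger", "fries", "hot dog")
# Assumed mapping from spoken quantity words to numbers.
NUMBER_WORDS = {"a": 1, "an": 1, "one": 1, "two": 2, "three": 3,
                "four": 4, "five": 5}


def extract_items(sentence):
    """Return (item, quantity) pairs found by keyword lookup alone."""
    text = sentence.lower()
    found = []
    for item in MENU_ITEMS:
        # Optionally capture the word right before the item name, and
        # allow a trailing plural "s" (e.g. "hot dogs").
        pattern = rf"(?:\b(\w+)\s+)?\b{re.escape(item)}s?\b"
        for match in re.finditer(pattern, text):
            word = match.group(1) or ""
            if word.isdigit():
                quantity = int(word)
            else:
                # Unknown or missing quantity words default to one item.
                quantity = NUMBER_WORDS.get(word, 1)
            found.append((item, quantity))
    return found


print(extract_items("I want one hamburger"))  # [('hamburger', 1)]
```

Because nothing here depends on sentence structure, a garbled transcript such as "hamburger two uh fries" still yields both items, although the quantity for the hamburger would fall back to one.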

The risk with regard to the UIs is that some of the Bootstrap templates we are currently using are unstable. They are hosted on remote servers, so their reliability depends on how well those servers are maintained: some templates always work when the pages are loaded, while others may fail. Therefore, we are considering using static styling (CSS, SCSS, and JavaScript) only, but the decision is not finalized yet.

Design Changes

Our design has not changed from our design review report, but we solidified a few design details.

First, we finalized our menu and constructed an immutable dictionary for future use:

cheeseburger      $7.99
hamburger         $6.99
veggie burger     $7.49
chicken burger    $7.49
beef sandwich     $8.99
chicken sandwich  $8.99
hot dog           $4.99
corn dog          $5.99
taco              $6.99
donut             $3.99
fries             $2.99
onion rings       $4.99
fountain drink    $1.29
coffee            $3.29
ice cream         $2.99

For the MVP, we do not plan on allowing customizations or size selections.
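One way to express the menu as an immutable dictionary in Python is a read-only `MappingProxyType` view, sketched below. The prices are the ones listed above; the variable and function names, and the use of `Decimal` for prices, are assumptions rather than the team's actual representation.

```python
from decimal import Decimal
from types import MappingProxyType

# Read-only view of the menu; item assignment raises TypeError.
MENU = MappingProxyType({
    "cheeseburger": Decimal("7.99"),
    "hamburger": Decimal("6.99"),
    "veggie burger": Decimal("7.49"),
    "chicken burger": Decimal("7.49"),
    "beef sandwich": Decimal("8.99"),
    "chicken sandwich": Decimal("8.99"),
    "hot dog": Decimal("4.99"),
    "corn dog": Decimal("5.99"),
    "taco": Decimal("6.99"),
    "donut": Decimal("3.99"),
    "fries": Decimal("2.99"),
    "onion rings": Decimal("4.99"),
    "fountain drink": Decimal("1.29"),
    "coffee": Decimal("3.29"),
    "ice cream": Decimal("2.99"),
})


def order_total(quantities):
    """Sum price * quantity for a mapping of item name to quantity."""
    return sum((MENU[item] * qty for item, qty in quantities.items()),
               Decimal("0"))


print(order_total({"hamburger": 1, "fries": 2}))  # 12.97
```

`Decimal` avoids the rounding surprises that binary floats can introduce when adding prices.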

Second, the cloud database and the staff-side module will maintain a server-client-like relationship. When the staff-side module’s subscriber thread receives notification of a new order (sent by the customer-side module when a customer checks out), it requests the order’s information from the database by spawning a child thread. This eliminates the need to constantly poll the cloud database for new data.

Third, we may change how we conduct speech interactions (e.g. near real-time parsing vs. letting customers speak one sentence and then parse) based on how well the noise-reduction and the speech recognition libraries work together.

Schedule

We moved the integration between the database and NLP to the week after spring break since the MVP versions of the two subsystems have only just been completed. As a result, the tasks following database and NLP integration have been pushed back as well.

Nina Duan’s Status Report For 3/11/2023

Personal Accomplishment

Other than completing the design review report with my teammates, I also worked on several tasks.

1. Voice Synthesizer Script

To assist Shiyi with developing an accessible UI, I created a voice synthesizer script using the open source library Google Text-to-Speech (gTTS). The script allows the user to synthesize any English text from both an input prompt and the command line:
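A minimal sketch of what such a script might look like is shown below, using gTTS's `gTTS(...).save(...)` call. The function names, the output file name, and the prompt text are hypothetical and not taken from the actual script.

```python
import sys


def get_text(argv, prompt=input):
    """Use the command-line words if given, otherwise ask via a prompt."""
    if len(argv) > 1:
        return " ".join(argv[1:])
    return prompt("Text to synthesize: ")


def synthesize(text, path="speech.mp3"):
    """Save `text` as spoken English to an MP3 file (assumed file name)."""
    from gtts import gTTS  # third-party: pip install gTTS; needs internet

    gTTS(text=text, lang="en").save(path)
    return path


def main(argv=None):
    """Entry point: synthesize text from the command line or a prompt."""
    argv = sys.argv if argv is None else argv
    print("Saved", synthesize(get_text(argv)))
```

Calling `main()` from a script run as `python tts.py Welcome to Meal By Words` would synthesize the command-line text, while running it with no arguments would fall back to the input prompt.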

2. Database and customer-side model for orders, items, and the menu

I finalized the representations of orders, items, and the menu both on the cloud database and in local storage:

The design review report goes into detail about the models and how they interact with each other, so I won’t repeat it here. The important thing to note is that, by design, the local copy won’t be uploaded to the cloud until the customer finishes ordering by calling checkout().
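The "local until checkout()" behavior can be sketched as follows. Only the `checkout()` name comes from the description above; the class layout, the `order:<num>` key format, and the `orders` channel name are assumptions. `client` is expected to be a redis-py client such as `redis.Redis(...)`, whose `hset` and `publish` methods are used here.

```python
class Order:
    """Local copy of a customer's order; nothing is uploaded until checkout()."""

    def __init__(self, order_num):
        self.order_num = order_num
        self.items = {}  # item name -> quantity, kept in local memory only

    def add_item(self, name, quantity=1):
        """Add to the local copy without touching the database."""
        self.items[name] = self.items.get(name, 0) + quantity

    def checkout(self, client, channel="orders"):
        """Upload the order as a Redis hash, then announce its order number."""
        key = f"order:{self.order_num}"  # assumed key format
        # redis-py rejects an empty mapping, so an order needs at least one item.
        client.hset(key, mapping=dict(self.items))
        # Subscribers on the (assumed) channel learn which order to fetch.
        client.publish(channel, self.order_num)
        return key
```

Passing the client into `checkout()` keeps the class easy to unit-test with a stand-in object that records the calls.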

I have tested the flow and successfully added sample orders into the cloud database:

3. Staff-side model for orders

I designed a model to represent orders for the staff-side as well:

This object will automatically be generated when a subscriber to the Redis pub/sub channel receives a new orderNum. It allows the staff to view order items, cross out prepared items (using finishItem()), and remove completed orders from the cloud database (using removeOrder()).
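A rough sketch of such a staff-side object is given below. The method names `finishItem()` and `removeOrder()` mirror the ones mentioned above; everything else, including the class name, the `order:<num>` key format, and the `isComplete()` helper, is hypothetical. `client` is assumed to be a redis-py client, whose `delete` method removes a key.

```python
class StaffOrder:
    """Staff-side view of one order fetched from the cloud database."""

    def __init__(self, order_num, items):
        self.order_num = order_num
        self.remaining = dict(items)  # item name -> quantity still to prepare
        self.finished = []            # item names that have been crossed out

    def finishItem(self, name):
        """Cross out a prepared item."""
        if name not in self.remaining:
            raise KeyError(name)
        del self.remaining[name]
        self.finished.append(name)

    def isComplete(self):
        """True once every item has been crossed out."""
        return not self.remaining

    def removeOrder(self, client):
        """Delete the completed order from the cloud database."""
        client.delete(f"order:{self.order_num}")  # assumed key format
```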

4. Redis pub/sub and fetching orders from the database

The Redis pub/sub channel is shared by the customer-side modules (publishers) and the staff-side modules (subscribers). Once the customer-side order publishes its orderNum, the staff-side subscriber thread will receive a message containing the orderNum and spawn a child thread to fetch that orderNum’s information from the database.
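The subscriber side of this flow could look roughly like the sketch below, which uses redis-py's `pubsub()`, `subscribe()`, `listen()`, and `hgetall()` calls. The function names, the `order:<num>` key format, and the shared `results` dictionary are assumptions made for illustration.

```python
import threading


def fetch_order(client, order_num, results):
    """Child thread: fetch one order's hash from the database."""
    results[order_num] = client.hgetall(f"order:{order_num}")  # assumed key


def subscriber_loop(client, channel, results):
    """Subscriber thread: wait for order numbers and spawn a fetch per order.

    `client` is expected to be a redis-py client, e.g.
    redis.Redis(..., decode_responses=True) so that message data are strings.
    """
    pubsub = client.pubsub()
    pubsub.subscribe(channel)
    workers = []
    # With a real Redis client, listen() blocks and yields messages forever.
    for message in pubsub.listen():
        if message["type"] != "message":
            continue  # skip subscribe confirmations
        worker = threading.Thread(
            target=fetch_order, args=(client, message["data"], results)
        )
        worker.start()
        workers.append(worker)
    # Only reached if the message stream ends (e.g. with a test double).
    for worker in workers:
        worker.join()
```

Since the loop only reacts to published messages, the staff side never has to poll the database for new orders.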

I have implemented this functionality as well, but it still requires more testing.

About Schedule

Since all of us are slightly behind, the database and NLP integration hasn’t happened yet. I am fairly confident that the database component is functionally complete, and it has been unit-tested. Therefore, once we meet again next week, Lisa and I will be able to start using the database with data from the NLP module.

Plans for Next Week

Our microphone and infrared sensor are set to arrive next week. Therefore, I will shift gears and start programming the microphone with our RPi 4.

Lisa and I will also try to integrate our NLP modules and database modules during the mandatory lab meetings.

Nina Duan’s Status Report For 2/25/2023

Personal Accomplishment

On Wednesday, I gave my group’s design review presentation and received valuable feedback.

After following up on our request for AWS credit, I was told that we should use a free, open source database instead of AWS DynamoDB. As a result, I spent some time experimenting with Redis and with Replit, the database an instructor recommended. However, the free version of Replit only allows us to create public repositories, which could result in academic integrity issues, so I ended up choosing Redis. While Redis is built to support storage of complex data structures, it also works perfectly well with the small-scale, simple key-value pairs we are planning on storing. In addition, because it is an open source database, there are many sample projects and usages that we can draw inspiration from.

As of now, I have finished setting up the cloud database and written skeletal Python code for simple data insertion, removal, and modification.

I will follow up with a more detailed storage model design in my next status report and our design review report.

About Schedule

Since we switched to Redis in the middle of the week, I have fallen behind schedule. However, because our project only relies on a few basic functionalities that are common among most NoSQL cloud databases, this switch won’t require a drastic change in our design.

Plans for Next Week

Other than crafting the design review report, I will create object classes representing customer orders and related subcategories in Python, which will match how they are stored in the cloud database. Completing this will allow us to integrate the cloud database with our NLP algorithm, which Lisa is still in the process of fine tuning.

Team Status Report For 2/25/2023

Teaming

This week, we created a GitHub repository for our project’s code files.

Each member has made progress on some of the assigned tasks, which will be explained in detail in everyone’s status reports. A simple version of the natural language processing system has been created; it is able to detect menu items based on basic sentence structures (such as “I want one hamburger” / “A cheeseburger, please”), and we are still in the process of debugging and determining the ideal approach to process more complicated grammar structures and tackle edge cases.

While working on our individual tasks next week, we will write the design review report together.

Risks

The most significant risk is falling behind the schedule for our project, since most of the work is taking longer than expected. We will make sure to allocate enough slack time before the final deadline to accommodate potential schedule changes. We also have decided to continue working on the project over spring break to make more progress.

Design Changes

Since it’s unlikely that we will be able to get AWS credit through the capstone course, we plan on switching our cloud database to Redis. We also considered Replit, which an instructor suggested. However, the free version only allows us to create public repositories, which doesn’t satisfy one of our basic requirements. Fortunately, this change doesn’t affect our design much, as our project only relies on a few basic functionalities that are common among most NoSQL cloud databases. In addition, since we have a few spare infrared sensors, we might use multiple sensors to detect the presence of a customer in order to increase detection accuracy.

Schedule

We have updated our schedule according to the current week’s progress.

The setup of the infrared sensor is shifted to an earlier date since the Raspberry Pi has already arrived.

The natural language processing system is taking longer than expected to program, so we have extended its timeline by a week and may still be polishing it while integrating the database and NLP system.

Nina Duan’s Status Report For 2/18/2023

Principles of Engineering, Science, and Mathematics – Relevant Courses

Many courses touched on the importance of ethical considerations in engineering. For example, both 18-100 and 18-500 had slides dedicated to the societal/economic/environmental impact of engineering.

Modularity is also emphasized in many ECE and CS courses. Project-heavy courses such as 18-341, 18-349, 15-445, 17-214 especially focused on this, since modularity makes a large project more testable and maintainable.

Personal Accomplishment

For the first half of the week, I focused on researching microcontrollers and databases.

In the end, my teammates and I decided to use a Raspberry Pi for speech recognition because it can interface with sensors and microphones and has CPU and memory powerful enough to drive a speech recognition algorithm. I also found some sample projects that use a Raspberry Pi for signal processing:

I focused my database research on comparing Amazon DynamoDB and Redis:

	DynamoDB	Redis (Remote Dictionary Server)
General	Commercial system (paid)	Open-source; can be used for commercial purposes
Storage Model	Key-value; document model	Key-value; secondary database models: document store, graph DBMS, and spatial DBMS
Partitioning	Sharding	Sharding
Performance	20+ million requests/sec; fast reads and writes regardless of table size	In-memory database (requires a large amount of memory to run quickly); optimized for complicated data structures
Durability & Availability	3 separate zones; data still available even if one zone goes offline	Open-source version not very durable (diskless DB)
Security	Encryption	No encryption
Use Cases	Applications that require high-speed data writing and reading	Session cache, chat, messaging, and queues; geospatial data, live streams, and real-time analytics
Pricing	On-demand mode: based on number of accesses	Free, open-source

For now, we are planning on using DynamoDB because our project requires fast insertions and deletions but not complex data structures. The final decision will, of course, also depend on whether we’re able to get AWS credit through this course.

For the second half of the week, I worked on preparing for the design review presentation.

About Schedule

I am on track with our schedule. In fact, we were able to get a Raspberry Pi 4 and start playing with it ahead of time.

Plans for Next Week

I will be presenting our design in class.

In addition, I will discuss the potential of getting AWS credits with the instructors and start familiarizing myself with DynamoDB’s APIs.

Once we review design review feedback, my teammate and I will also place orders for the hardware components.

Team Status Report For 2/18/2023

Principles of Engineering, Science, and Mathematics

Modularity – We broke our design down into smaller chunks that each manage a cohesive group of tasks. For example, the program that runs on the Raspberry Pi consists of two modules: one monitors the infrared sensor and wakes up the main backend loop; the other manages the heavy-lifting for speech parsing and recognition. These modules can further be broken down into submodules such as signal processing, speech-to-text translation, and text parsing (NLP).
Ethicality – One of the main goals of our project is to improve the welfare of fast-food restaurant employees. We believe that the success of our system will alleviate the burden of kitchen staff, enabling them to focus only on preparing food. Our infrared sensor and ordering station will also accommodate customers in wheelchairs as well as children.

Risks

Since we are still in the design phase of our project, the most significant risk that could jeopardize its success is failing to consider important design requirements, which would lead to fundamental flaws in our design. To mitigate this risk, we will carefully review feedback from our design presentation and discuss potential problems with our instructors.

Design Changes

We finalized our design for the design review presentation and created a system diagram for the current design:

We have already requested and received a Raspberry Pi 4 with 8GB memory from the ECE inventory. Once we present our design and receive feedback, we will start ordering the hardware components (infrared sensor, microphone, and sound shield).

Schedule

We reformatted our schedule and took spring break into consideration.

Here’s the updated version:

Nina Duan’s Status Report For 2/11/2023

Personal Accomplishment

This week, I mainly focused on conducting research for our proposal use-case requirements and some components necessary to achieve them.

To properly quantify our project’s service expectations, I looked into research on service times at existing fast food restaurants and found this 2016 study by QSR Magazine particularly interesting. Although it covers drive-thru service specifically, the data does suggest that customers expect an average service time of about 200 seconds. A news report from 2020 claims that drive-thru service has been slowing down in recent years, which means customers’ expectations nowadays could potentially be even lower.

To achieve our use-case requirements (see proposal), we need one or more directional microphones that can receive verbal inputs from a distance of 0.5m to 1.0m. They will be driven by a Raspberry Pi or an Arduino, which requires USB or I2S connectivity. Here’s a list of some options I’ve found so far:

WM8960 I2S Microphone
- Raspberry Pi connectivity, compatible with Raspberry Pi Zero/Zero W/Zero WH/2B/3B/3B+
- Comes with demo and development guide in Python
MP34DT01 I2S Microphone
- More compatible with Arduino, includes device-specific library
- CircuitPython module (in Python, C)
Samson Go USB Mic
- Compatible with Raspberry Pi and laptops (Mac & Windows)

In addition, I took a look at available commercial databases. In our proposal, we chose to use a NoSQL cloud database, which leaves us with two prominent options:

AWS DynamoDB: fast insertions and deletions, but less customizable and structured
Redis: supports secondary database models like the document store, graph DBMS, and spatial DBMS

About Schedule

Since we have scheduled two weeks for preliminary research (i.e. to be completed by 2/19/2023, the due date for our design presentation), I am on track with our schedule.

Plans for Next Week

I hope to finalize our microphone, microcontroller, and database selection by next Wednesday, which will allow us to finish our design and start gathering chosen components. I will also be our group’s presenter, so the rest of the week will be spent on polishing our presentation slides and preparing for Monday’s presentation.

Team Status Report For 2/11/2023

Our project includes considerations for customer convenience, employee welfare, and restaurant cost reduction. Our system will provide an alternative ordering approach to fast food restaurant customers, and reduce the number of cashiers required. This could also improve existing employees’ working conditions, as they no longer need to shuffle between the counter and the kitchen and can focus on food preparation.

This week, we updated our Gantt Chart to increase slack time at the end of the project timeline. This time will allow us to conduct more end-to-end tests if necessary and fix unexpected issues with our final product. We also created preliminary designs for our whole system, separating the system into hardware, front-end software, and back-end software components. Use-case requirements and testing metrics were updated based on our research about existing fast-food services and hardware systems. Next week, we will finalize our design, prepare for the upcoming design presentation, and start gathering necessary project components.