Nina Duan’s Status Report For 4/8/2023

Verification and Validation Plan

In addition to running through the entire order workflow (from ordering the first item to checking out) without the UI, which we showed during the interim demo, I plan to conduct the following tests:

  1. Using Python’s built-in function for reading the current system time, measure the difference between the time an order is uploaded to the database (after a customer confirms checkout) and the time the staff-side UI’s backend is notified of its existence (see the sketch after this list). Ideally, this difference should be less than 0.5s (500ms), which leaves the staff-side UI enough time to fetch the data from the database and display the new order within our 1s latency requirement.
  2. Verify that the order fetched from the database matches the order the customer placed. All of the following parameters should match: order number, order time, items ordered, and total price.
  3. Find audio clips with different levels of background noise and play them to the microphone. Speech recognition accuracy should remain above 85% so that our NLP module can recognize menu items most of the time.
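As a minimal sketch of the first test, here is how the timing could be captured. The queue below is a stand-in for the database’s pub-sub channel, and upload_order and staff_listener are placeholders for our real database calls:

    import queue
    import threading
    import time

    channel = queue.Queue()  # stands in for the database's pub-sub channel

    def upload_order(order):
        """Customer side: stamp the order and 'upload' it."""
        order["uploaded_at"] = time.monotonic()
        channel.put(order)

    def staff_listener(results):
        """Staff side: record how long the notification took."""
        order = channel.get()
        results.append(time.monotonic() - order["uploaded_at"])

    results = []
    listener = threading.Thread(target=staff_listener, args=(results,))
    listener.start()
    upload_order({"order_number": 1, "total": 7.99})
    listener.join()
    print(f"notification latency: {results[0] * 1000:.2f} ms (target: < 500 ms)")

In the real test, the first timestamp would be taken right after the database write on the customer side and the second inside the staff-side backend’s notification callback.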

After completing each individual test, we will get together as a group and perform some integration tests, preferably with volunteers with different speech habits or from different cultural backgrounds.

Personal Accomplishment

With Lisa’s NLP support, I was able to add “confirm” functionality to our checkout process. Now, instead of directly checking the customer out when they say “checkout,” the system will ask the customer to review their order. If the customer says “yes,” the system will check them out through the same process as before. Otherwise, the system will return to the previous state, where the customer can add more items or remove existing items.
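A minimal sketch of the confirm step, with ask, listen, and checkout as hypothetical stand-ins for our real voice synthesizer, speech recognizer, and checkout path:

    def ask(msg):            # placeholder for the voice synthesizer
        print("KIOSK:", msg)

    def listen():            # placeholder for speech recognition + NLP
        return input("CUSTOMER: ")

    def checkout(order):     # placeholder for the existing checkout path
        print("Order", order["number"], "uploaded to the database.")

    def handle_checkout_request(order):
        """Ask the customer to confirm before actually checking out."""
        ask(f"Your total is ${order['total']:.2f}. Say yes to confirm checkout.")
        if listen().strip().lower().startswith("yes"):
            checkout(order)
            return "DONE"
        return "ORDERING"    # anything else returns to the previous state

    print(handle_checkout_request({"number": 42, "total": 12.98}))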

I also fixed a bug in our system that allowed customers to remove items they didn’t order. Before, the system would respond to a “remove” request with “you have removed …” without checking whether the order contained said item. Now, the system will only say so when the customer has, indeed, ordered the item they want to remove.
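The fix amounts to checking the order before acknowledging the removal. A sketch, assuming the order is a simple item-to-quantity mapping:

    def remove_item(order, item, qty=1):
        """Only acknowledge a removal the order can actually satisfy."""
        if order.get(item, 0) < qty:
            return f"Sorry, your order does not contain {qty} {item}(s)."
        order[item] -= qty
        if order[item] == 0:
            del order[item]
        return f"You have removed {qty} {item}(s)."

    order = {"cheeseburger": 2}
    print(remove_item(order, "fries"))         # rejected: never ordered
    print(remove_item(order, "cheeseburger"))  # succeeds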

To better support Shiyi’s frontend design, I created a separate thread for indicating when the customer should speak and when they should stop speaking. This thread will be used to control a microphone icon on the customer-side UI. When the system is listening for customer speech, the microphone icon will flash green to invite the customer to speak; otherwise, the icon will let the customer know that the system currently cannot hear what they are saying. This long-running thread terminates when the customer confirms checkout, so it can also be used to detect when the checkout process is complete and, therefore, to control when the customer-side UI navigates to the “order complete” page.
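A sketch of the indicator thread, assuming two threading.Event flags set by the real speech-recognition loop (the FakeUI class is a placeholder for the actual customer-side UI):

    import threading
    import time

    listening_event = threading.Event()  # set while the recognizer is listening
    done_event = threading.Event()       # set when the customer confirms checkout

    class FakeUI:                        # placeholder for the customer-side UI
        def set_mic_state(self, state): print("mic icon:", state)
        def navigate(self, page): print("navigating to", page)

    def indicator_thread(ui):
        """Mirror the recognizer's listening state onto the microphone icon."""
        last = None
        while not done_event.is_set():
            state = "listening" if listening_event.is_set() else "muted"
            if state != last:
                ui.set_mic_state(state)
                last = state
            time.sleep(0.05)
        ui.navigate("order_complete")    # thread exit doubles as checkout signal

    ui = FakeUI()
    t = threading.Thread(target=indicator_thread, args=(ui,))
    t.start()
    listening_event.set(); time.sleep(0.2)    # system starts listening
    listening_event.clear(); time.sleep(0.2)  # system stops listening
    done_event.set(); t.join()                # customer confirmed checkout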

About Schedule

I am on track with the schedule.

Plans for Next Week

I will work with Shiyi to integrate the customer-side UI with the newly edited backend. I will also work with Lisa to integrate the preliminary staff-side UI with the database’s pub-sub functionality. At the same time, I will conduct the tests mentioned in the “Verification and Validation Plan” section.

Team Status Report For 4/8/2023

Risks

The most significant risk, given our current progress, is the sensitivity of the sensors. Since we are ordering new distance sensors to replace the infrared sensor we had, we will have to test whether they can accurately detect incoming customers for our purposes. If the new distance sensors perform unsatisfactorily, we will either fall back to our original infrared sensor or use OpenCV for human detection.

A minor risk is the previously mentioned latency of our speech recognition and NLP system. Although we were able to greatly reduce the lag, the system still takes 1s to 5s to respond when the customer says a menu item. This risk is no longer as damaging as it was, because the lag has been reduced to a range that isn’t very noticeable. However, as we start testing the system with volunteers, we may need to optimize further if the lag hurts the user experience.

Design Changes

Through further testing with the RPi 4 this past week, we’ve found that it cannot drive the sensors, the microphone, the speech recognition and NLP loop, and the customer-side UI all at the same time. Therefore, we decided to use one of our laptops as the main CPU. The microphone will be plugged into the laptop, which runs the customer-side UI and the backend server that supports it. The sensors will be driven by an Arduino Uno Rev3.

We’ve also decided to switch from PIR sensors to HC-SR04 ultrasonic distance sensors, because they provide detailed information about our customers (exactly how far away they are from the kiosk and the microphone) rather than just whether someone is present.
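As a sketch of how the laptop could consume the new sensors’ readings, assuming the Arduino sketch prints one HC-SR04 distance (in cm) per line over USB serial at 9600 baud (the port name and trigger distance below are placeholders):

    import serial  # pyserial

    PORT = "/dev/ttyACM0"  # placeholder; depends on the host machine
    TRIGGER_CM = 80        # assumed wake-up distance for an approaching customer

    with serial.Serial(PORT, 9600, timeout=1) as arduino:
        while True:
            line = arduino.readline().decode("ascii", errors="ignore").strip()
            try:
                distance_cm = float(line)
            except ValueError:
                continue               # skip empty or malformed lines
            if distance_cm < TRIGGER_CM:
                print(f"Customer detected at {distance_cm:.0f} cm; waking up kiosk")
                break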

Schedule

We are on track with our previous schedule, having finished all our assigned tasks before the interim demo. However, because we switched from the RPi 4 to an Arduino, we need to adjust our sensor code accordingly. We’ve updated our Gantt chart to reflect the additional work required.

Shiyi Zhang’s Status Report For 4/1/2023

Personal Accomplishments

This week, I’ve been working on integrating the backend code, the frontend code, and the hardware components to prepare for the interim demo. Our voice-operated system is now capable of detecting the presence of customers, responding to their speech, and placing orders. However, a few minor bugs still pop up occasionally due to unhandled edge cases. For example, a particular thread occasionally crashes unexpectedly on the Raspberry Pi, an issue that does not occur on a Mac. We plan to address these issues next week. Overall, we’re in good shape for the demo.

In addition to the integration, I’ve been researching whether three extra PIR sensors (right now we are using just one) or a combination of a webcam and OpenCV could provide a more detailed picture of a customer’s presence. I’ve narrowed my focus to multiple sensor fusion: combining sensor data, or data derived from disparate sources, such that the resulting information has less uncertainty than when those sources are used individually (see the sketch below). While OpenCV is a viable option, I’m leaning towards the more “hardware” approach of using sensors, as this is an ECE project.
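As a minimal sketch of the idea (on made-up numbers, not our real sensor data), an inverse-variance weighted average is one standard way to fuse noisy estimates so the result is less uncertain than any single sensor:

    def fuse(readings):
        """readings: list of (estimate, variance) pairs from different sensors."""
        weights = [1.0 / var for _, var in readings]
        fused = sum(w * v for w, (v, _) in zip(weights, readings)) / sum(weights)
        return fused, 1.0 / sum(weights)  # fused variance < every input variance

    # Three sensors estimate a customer's distance (cm) with different noise.
    estimate, variance = fuse([(82.0, 25.0), (78.0, 9.0), (85.0, 16.0)])
    print(f"fused estimate: {estimate:.1f} cm, variance {variance:.1f}")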

On schedule

Yes, my progress is on schedule.

Next week

I’ll be collaborating with Nina to make sure that our code is compatible with the Raspberry Pi. Additionally, I’ll be working on developing code to implement Multiple Sensor Fusion and see if it works better.

Nina Duan’s Status Report For 4/1/2023

Personal Accomplishment

1. Voice Generation

I modified the voice-synthesizing script I wrote earlier in the semester to support mass generation of constant messages (e.g., “Welcome to Meal by Words,” “please order your next item,” etc.) from the command line. I also created helper functions so that the script’s logic can be invoked at runtime to generate messages that haven’t been prepared in advance.
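A minimal sketch of the batch-generation idea, assuming the pyttsx3 offline TTS engine (the actual script may use a different synthesizer):

    import os
    import sys
    import pyttsx3  # assumed TTS backend

    def synthesize(messages, out_dir="prompts"):
        """Generate one audio file per constant message."""
        os.makedirs(out_dir, exist_ok=True)
        engine = pyttsx3.init()
        for i, text in enumerate(messages):
            engine.save_to_file(text, os.path.join(out_dir, f"msg_{i}.wav"))
        engine.runAndWait()

    if __name__ == "__main__":
        # e.g. python synth.py "Welcome to Meal by Words" "Please order your next item"
        synthesize(sys.argv[1:])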

2. Order Interaction Workflow

I laid out the entire backend workflow of the order interaction in code (a condensed sketch follows the list). The interaction is as follows:

  1. (After the back end has been woken up by an infrared sensor) Play the synthesized welcome message.
  2. Ask the customer to order the first item. The system does support ordering and/or removing multiple items at a time, but, to maintain a relatively high item detection rate, we are limiting it to one item (with quantity) at a time for the MVP.
  3. Parse customer speech and detect menu items.
  4. If an item is detected, repeat the item and its quantity back to the customer. Otherwise, the system will ask the customer to repeat their order item after 15 seconds.
  5. Ask the customer to order the next item. They can also start the sentence with “remove” to remove a certain amount of an item, or say “checkout” to checkout.
  6. Repeat steps 3 to 5 until the customer says “checkout.”
  7. Upload the order to the database and give the customer their order number.
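Condensed into code, the loop looks roughly like the sketch below; speak, listen, and the item matching are placeholders for our real voice and NLP modules, and the 15-second repeat timer is omitted:

    def speak(msg): print("KIOSK:", msg)              # placeholder synthesizer
    def listen(): return input("CUSTOMER: ").lower()  # placeholder recognizer

    def order_loop(menu):
        order = {}
        speak("Welcome. Please order your first item.")  # steps 1-2
        while True:
            utterance = listen()                      # step 3: parse speech
            if utterance.startswith("checkout"):      # step 6 exit condition
                speak("Uploading your order now.")    # step 7
                return order
            removing = utterance.startswith("remove")
            item = next((m for m in menu if m in utterance), None)
            if item is None:
                speak("Sorry, please repeat your order item.")  # step 4 fallback
                continue
            order[item] = max(0, order.get(item, 0) + (-1 if removing else 1))
            speak(f"You now have {order[item]} {item}(s). "     # step 4 echo
                  "Order your next item, or say checkout.")     # step 5

    print(order_loop(["cheeseburger", "fries", "taco"]))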

In the future, we are also planning on adding a confirm feature for checkout, so the customer will be asked to confirm their order (displayed on customer UI and/or spoken out loud by voice synthesizer) before step 7.

About Schedule

I am on track with the schedule.

However, it’s important to note that because we are still having trouble with migrating our code to the microcontroller, some of the completed tasks may need to be reevaluated.

Plans for Next Week

I will work with Lisa to add a confirm feature to our system. This will require support from both the NLP module and the overall order interaction workflow. I will also fine-tune parameters such as the energy-level threshold of our speech recognition system and the amplitude of our noise cancellation filter to better accommodate the RPi environment. However, if we do need to replace the microcontroller with some other back end controller, we will also make the decision in the coming week.
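As an illustration of the kind of tuning involved, assuming the common SpeechRecognition package (our recognizer may expose different knobs):

    import speech_recognition as sr

    r = sr.Recognizer()
    r.dynamic_energy_threshold = False  # pin the threshold so tuning is reproducible
    r.energy_threshold = 300            # candidate value; raise for noisier rooms

    with sr.Microphone() as source:
        # Alternative: calibrate from ambient noise instead of hand-tuning.
        # r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source, phrase_time_limit=10)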

If time permits, I will start integrating the cloud database with a preliminary, command-line-based staff-side UI.

Team Status Report For 4/1/2023

Risks

The biggest risk we are facing is the difference between our development environment (macOS) and the Raspberry Pi’s (Raspberry Pi OS). While our prototype backend runs relatively smoothly on macOS, it behaves in unexpected ways when migrated to the Raspberry Pi, and we are still trying to find the root cause. Our fallback is to run everything on a laptop instead, which would mean switching to an Arduino for the sensors; the hardware would then no longer include a Raspberry Pi. However, this is not a finalized decision, and we will continue debugging on the Raspberry Pi this weekend.

Another risk is the overall speech-processing speed. The time it takes for our system to listen to user input, convert it to text, parse it into entries, and add them to a local order object is longer than our ideal goal of 1 second. Because we cannot correctly determine the end of speech every time, the speech recognition module sometimes keeps listening after the customer has finished speaking (see the sketch below).
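If we again assume the SpeechRecognition package, the relevant knobs are the silence window that ends a phrase and a hard cap on listening time:

    import speech_recognition as sr

    r = sr.Recognizer()
    r.pause_threshold = 0.5  # end the phrase after 0.5 s of silence (default 0.8 s)

    with sr.Microphone() as source:
        # phrase_time_limit bounds the worst case even if silence is never detected.
        audio = r.listen(source, phrase_time_limit=8)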

Design Changes

We modified our menu to accommodate some NLP edge cases. The current menu is:

cheeseburger        $7.99
hamburger           $6.99
chicken burger      $7.49
beef sandwich       $8.99
chicken sandwich    $8.99
hot dog             $4.99
corn dog            $5.99
taco                $6.99
donut               $3.99
fries               $2.99
onion rings         $4.99
cheesecake          $5.99
fountain drink      $1.29
coffee              $3.29
ice cream           $2.99

We are planning to add three more infrared sensors to increase detection accuracy. We might also use OpenCV as an assist to better detect customers the sensors might miss, such as children or people sitting in wheelchairs (see the sketch below).
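For the OpenCV option, a sketch using the library’s built-in HOG person detector on a single webcam frame (the camera index is machine-dependent, and a more specialized model may be needed for seated customers):

    import cv2

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    cap = cv2.VideoCapture(0)  # webcam index; depends on the machine
    ret, frame = cap.read()
    if ret:
        boxes, weights = hog.detectMultiScale(frame, winStride=(8, 8))
        print(f"{len(boxes)} person(s) detected")
    cap.release()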

Schedule

We have pushed back the design of the staff-side UI, since integration, testing, and revisions will take up all of the time before our interim demo. An up-to-date schedule has been attached.