Nina Duan’s Status Report For 4/8/2023

Verification and Validation Plan

Other than running through the entire order workflow (from ordering the first item to checking out) without the UI, which we showed during the interim demo, I’m planning on conducting the following tests:

  1. Using Python’s built-in function for getting the current system time, measure the difference between the time an order is uploaded to the database after a customer confirms checkout and the time the staff-side UI’s backend is notified of its existence. Ideally, this should take less than 0.5 s (500 ms), leaving the staff-side UI enough time to fetch the data from the database and display the new order within our anticipated 1 s latency requirement.
  2. Verify that the order fetched from the database matches the order the customer placed. All of the following parameters should match: order number, order time, items ordered, and total price.
  3. Find audio clips with different levels of background noise and play them near the microphone. Speech recognition accuracy should remain above 85%, which will allow our NLP module to recognize menu items most of the time.
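Test 1 could be instrumented roughly as follows; `upload_order` and `wait_for_notification` are hypothetical stand-ins for our database client and the staff-side pub-sub listener, and the sleep merely simulates the notification delay:

```python
import time

def measure_notification_latency(upload_order, wait_for_notification):
    """Return the seconds between uploading an order and the staff-side
    backend being notified of it."""
    start = time.monotonic()
    upload_order()
    wait_for_notification()  # blocks until the staff side sees the order
    return time.monotonic() - start

# Stand-in run: simulate a 100 ms notification delay.
latency = measure_notification_latency(lambda: None, lambda: time.sleep(0.1))
assert latency < 0.5, f"latency {latency:.3f}s exceeds the 500 ms budget"
```

In the real test, `wait_for_notification` would block on the staff-side backend’s pub-sub callback instead of sleeping.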

After completing each individual test, we will get together as a group and perform some integration tests, preferably with volunteers with different speech habits or from different cultural backgrounds.

Personal Accomplishment

With Lisa’s NLP support, I was able to add a “confirm” functionality to our checkout process. Now, instead of directly checking the customer out when they say “checkout,” the system will ask the customer to review their order. If the customer says “yes,” the system will check them out through the same process as before. Otherwise, the system will return to the previous state, where the customer can add more items or remove existing items.

I also fixed a bug in our system that allowed customers to remove items they didn’t order. Before, the system would respond to a “remove” request with “you have removed …” without checking whether the order contained said item. Now, the system will only say so when the customer has, indeed, ordered the item they wish to remove.

To better support Shiyi’s frontend design, I created a separate thread for indicating when the customer should speak and when they should stop speaking. This thread will be used to control a microphone icon on the customer-side UI. When the system is listening for customer speech, the microphone icon will flash green and invite the customer to speak. Otherwise, the icon will let the customer know that the system is currently unable to hear what they are saying. This long-running thread terminates when the customer confirms to check out, so it can also be used to detect when the checkout process is complete. Therefore, it can also control when the customer-side UI navigates to the “order complete” page.
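A sketch of how such an indicator thread could look, using `threading.Event` objects; the event names and the `on_change` callback are illustrative, not our exact interfaces:

```python
import threading
import time

class ListeningIndicator(threading.Thread):
    """Mirror the microphone state so the frontend can update its icon;
    exits once the customer confirms checkout."""
    def __init__(self, listening, checkout_done, on_change):
        super().__init__(daemon=True)
        self.listening = listening          # set while the mic is live
        self.checkout_done = checkout_done  # set when checkout completes
        self.on_change = on_change          # e.g. toggles the icon color

    def run(self):
        last = None
        while not self.checkout_done.is_set():
            state = self.listening.is_set()
            if state != last:
                self.on_change(state)
                last = state
            self.checkout_done.wait(0.02)   # poll without busy-waiting

states = []
listening, done = threading.Event(), threading.Event()
t = ListeningIndicator(listening, done, states.append)
t.start()
listening.set()    # backend starts listening: icon flashes green
time.sleep(0.2)
listening.clear()  # backend stops listening
time.sleep(0.2)
done.set()         # checkout confirmed: thread terminates
t.join(timeout=1)
```

Because the thread only exits when `checkout_done` is set, its termination doubles as the “order complete” signal for the UI, as described above.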

About Schedule

I am on track with the schedule.

Plans for Next Week

I will work with Shiyi to integrate the customer-side UI with the newly edited backend. I will also work with Lisa to integrate the preliminary staff-side UI with the database’s pub-sub functionality. At the same time, I will conduct the tests mentioned in the “Verification and Validation Plan” section.

Team Status Report For 4/8/2023

Risks

The most significant risk at our current stage of progress is the sensitivity of the sensors. Since we are ordering new distance sensors to replace the infrared sensor we had, we will have to test whether they can accurately detect incoming customers for our purposes. If the new distance sensors perform unsatisfactorily, we will either fall back to our original infrared sensor or use OpenCV for human detection.

A minor risk is the previously mentioned latency of our speech recognition and NLP system. Although we were able to greatly reduce the lag, the system still takes 1s to 5s to respond to the customer when they say a menu item. This risk is no longer as damaging as it was because the lag has been reduced to a range that isn’t very noticeable. However, as we start testing our system with volunteers, we may need to further optimize the system if the lag causes bad experiences.

Design Changes

Through testing with the RPi 4 some more this past week, we’ve found that it is insufficient to drive the sensors, the microphone, the speech recognition and NLP loop, and the customer-side UI all at the same time. Therefore, we decided to use one of our laptops as the main CPU. The microphone will be plugged into the laptop, which runs the customer-side UI and the backend server that supports it. The sensors will be driven by an Arduino Uno Rev3. 

We’ve also decided to switch from using PIR sensors to using ultrasonic module HC-SR04 distance sensors, because they could provide detailed information about our customers (exactly how far away they are from the kiosk and the microphone) rather than just whether they are detected.
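For reference, the HC-SR04 reports distance as the width of an echo pulse; the conversion, which our Arduino firmware will actually perform, looks like this in Python:

```python
SPEED_OF_SOUND_CM_PER_US = 0.0343  # speed of sound at roughly 20 °C

def echo_to_distance_cm(echo_pulse_us: float) -> float:
    """Convert an HC-SR04 echo pulse width (microseconds) to distance in cm.
    The pulse covers the round trip, so divide by two."""
    return echo_pulse_us * SPEED_OF_SOUND_CM_PER_US / 2

# A ~5.83 ms echo corresponds to roughly one meter.
print(round(echo_to_distance_cm(5830), 1))  # → 100.0
```

This is also why the HC-SR04 gives us richer data than the PIR sensor: the same reading that detects a customer tells us how far away they are.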

Schedule

We are on track with our previous schedule, having finished all our assigned tasks before the interim demo. However, because we switched from the RPi 4 to an Arduino, we need to adjust our sensor code accordingly. We’ve updated our Gantt chart to reflect the additional work required.

Shiyi Zhang’s Status Report for 04/01/2023

Personal Accomplishments

This week I’ve been working on integrating the backend code, the frontend code, and the hardware components to prepare for the interim demo. Our voice-operated system is now capable of detecting the presence of customers, responding to their speech, and placing orders. However, a few minor bugs still pop up occasionally due to unhandled edge cases. For example, one particular thread occasionally crashes unexpectedly on the Raspberry Pi, whereas this issue does not occur on a Mac. We plan to address these issues next week. Overall, we’re in good shape for the demo.

In addition to the integration, I’ve been researching whether three extra PIR sensors (right now we are using just one) or a combination of a webcam and OpenCV could provide a more detailed understanding of a customer’s presence. I’ve narrowed my approach down to Multiple Sensor Fusion: combining sensor data, or data derived from disparate sources, such that the resulting information has less uncertainty than when those sources are used individually. While OpenCV is a viable option, I’m leaning toward the more “hardware” approach of using sensors, as this is an ECE project.
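For concreteness, one standard formulation of Multiple Sensor Fusion, under the assumption of independent readings with known noise variances, is the inverse-variance weighted average; this is an illustrative sketch, not a final design:

```python
def fuse(measurements):
    """Inverse-variance weighted average of (value, variance) pairs.
    The fused estimate has lower variance than any single sensor."""
    weights = [1.0 / var for _, var in measurements]
    value = sum(v * w for (v, _), w in zip(measurements, weights)) / sum(weights)
    variance = 1.0 / sum(weights)
    return value, variance

# Two hypothetical distance readings (cm): a precise sensor and a noisy one.
value, variance = fuse([(100.0, 4.0), (106.0, 12.0)])
print(value, variance)
```

The fused variance (3.0 here) is smaller than either input variance, which is exactly the property that motivates using multiple sensors.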

On schedule

Yes, my progress is on schedule.

Next week

I’ll be collaborating with Nina to make sure that our code is compatible with the Raspberry Pi. Additionally, I’ll be working on developing code to implement Multiple Sensor Fusion and see if it works better.

Nina Duan’s Status Report For 4/1/2023

Personal Accomplishment

1. Voice Generation

I modified the voice synthesizing script I wrote earlier in the semester to support mass generation of constant messages (e.g. “Welcome to Meal by Words,” “please order your next item,” etc.) from the command line. I also created some helper functions so that the logic of the script can be called in real time to generate messages that haven’t been prepared already.
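The batch-generation logic is roughly as follows; `synthesize` is stubbed out here, since the sketch only illustrates the pre-generation of constant messages, not our actual TTS backend:

```python
from pathlib import Path

CONSTANT_MESSAGES = {
    "welcome": "Welcome to Meal by Words",
    "next_item": "Please order your next item",
}

def synthesize(text: str) -> bytes:
    # Stand-in for the real voice-synthesis call; returns fake audio bytes.
    return text.encode("utf-8")

def generate_all(out_dir: str) -> list:
    """Mass-generate audio files for all constant messages, as the
    command-line mode of the script might do."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    written = []
    for name, text in CONSTANT_MESSAGES.items():
        path = out / f"{name}.wav"
        path.write_bytes(synthesize(text))
        written.append(path.name)
    return written

print(generate_all("tts_cache"))
```

The real-time helper functions mentioned above would call `synthesize` directly for messages that aren’t already in the cache.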

2. Order Interaction Workflow

I laid out the entire backend workflow of the order interaction in code. The interaction is as follows:

  1. (After the back end has been woken up by an infrared sensor) Play the synthesized welcome message.
  2. Ask the customer to order the first item. The system does support ordering and/or removing multiple items at a time, but, to maintain a relatively high item detection rate, we are limiting it to one item (with quantity) at a time for the MVP.
  3. Parse customer speech and detect menu items.
  4. If an item is detected, repeat the item and its quantity back to the customer. Otherwise, the system will ask the customer to repeat their order item after 15 seconds.
  5. Ask the customer to order the next item. They can also start the sentence with “remove” to remove a certain amount of an item, or say “checkout” to checkout.
  6. Repeat steps 3 to 5 until the customer says “checkout.”
  7. Upload the order to the database and give the customer their order number.
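The loop above can be sketched as follows; `fake_parse` stands in for Lisa’s NLP module and is not our real parser:

```python
def run_order_session(utterances, parse_item):
    """Drive the ordering loop over a scripted list of customer
    utterances. `parse_item` returns (name, quantity) or None."""
    order = {}
    for speech in utterances:
        if speech == "checkout":
            return order  # step 7: upload order, return order number
        if speech.startswith("remove"):
            name, qty = parse_item(speech)
            order[name] = max(0, order.get(name, 0) - qty)
            continue
        parsed = parse_item(speech)
        if parsed is None:
            continue  # step 4: ask the customer to repeat their item
        name, qty = parsed
        order[name] = order.get(name, 0) + qty
    return order

def fake_parse(speech):
    """Toy parser: finds one known item and a quantity of one or two."""
    words = speech.split()
    qty = 2 if "two" in words else 1
    for item in ("cheeseburger", "fries"):
        if item in words:
            return item, qty
    return None

print(run_order_session(
    ["one cheeseburger", "two fries", "remove fries", "checkout"],
    fake_parse,
))
```

In the real system, the per-step voice prompts (steps 1, 2, and 4) would be played between iterations of this loop.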

In the future, we are also planning on adding a confirm feature for checkout, so the customer will be asked to confirm their order (displayed on customer UI and/or spoken out loud by voice synthesizer) before step 7.

About Schedule

I am on track with the schedule.

However, it’s important to note that because we are still having trouble with migrating our code to the microcontroller, some of the completed tasks may need to be reevaluated.

Plans for Next Week

I will work with Lisa to add a confirm feature to our system. This will require support from both the NLP module and the overall order interaction workflow. I will also fine-tune parameters such as the energy-level threshold of our speech recognition system and the amplitude of our noise cancellation filter to better accommodate the RPi environment. However, if we do need to replace the microcontroller with some other back end controller, we will also make the decision in the coming week.

If time permits, I will start integrating the cloud database with a preliminary, command-line-based staff-side UI.

Team Status Report For 4/1/2023

Risks

The biggest risk we are facing is the difference between our current development environment (macOS) and that of the Raspberry Pi (Raspberry Pi OS). While our prototype backend runs relatively smoothly on macOS, it behaves in unexpected ways when migrated to the Raspberry Pi. We are still trying to find the root cause. Our fallback is to run everything on a laptop instead, which means we might switch to using an Arduino; the hardware components would then no longer include a Raspberry Pi. However, this is not a finalized decision, and we will continue debugging on our Raspberry Pi this weekend.

Another risk is the overall speech-processing speed. The time it takes for our system to listen to the user input, convert to text, parse into entries, and add to a local order object is longer than our ideal goal of 1 second. Because we are unable to correctly determine the end of speech every time, sometimes the speech recognition module keeps listening after the customer has finished speaking.
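One way to bound this over-listening is a simple energy-based end-of-speech check; the sketch below operates on hypothetical per-chunk RMS values and an assumed threshold, not our tuned parameters:

```python
def speech_ended(energies, threshold=300, trailing_silence=3):
    """Return True once the last `trailing_silence` chunks all fall
    below the energy threshold, i.e. the customer stopped talking."""
    if len(energies) < trailing_silence:
        return False
    return all(e < threshold for e in energies[-trailing_silence:])

stream = [50, 800, 900, 750, 120, 60, 40]  # made-up per-chunk RMS values
assert not speech_ended(stream[:5])  # still mid-sentence
assert speech_ended(stream)          # three quiet chunks: stop listening
```

Cutting the recording off as soon as this returns True would shave the dead time off the end of every utterance.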

Design Changes

We modified our menu to accommodate some NLP edge cases. The current menu is:

cheeseburger        $7.99
hamburger           $6.99
chicken burger      $7.49
beef sandwich       $8.99
chicken sandwich    $8.99
hot dog             $4.99
corn dog            $5.99
taco                $6.99
donut               $3.99
fries               $2.99
onion rings         $4.99
cheesecake          $5.99
fountain drink      $1.29
coffee              $3.29
ice cream           $2.99
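In code, the menu is naturally a mapping from item to price, which also gives us order totals; the helper below is a sketch, not our exact data model:

```python
MENU = {
    "cheeseburger": 7.99, "hamburger": 6.99, "chicken burger": 7.49,
    "beef sandwich": 8.99, "chicken sandwich": 8.99, "hot dog": 4.99,
    "corn dog": 5.99, "taco": 6.99, "donut": 3.99, "fries": 2.99,
    "onion rings": 4.99, "cheesecake": 5.99, "fountain drink": 1.29,
    "coffee": 3.29, "ice cream": 2.99,
}

def order_total(order):
    """Total price for an order given as {item: quantity}."""
    return round(sum(MENU[item] * qty for item, qty in order.items()), 2)

print(order_total({"taco": 2, "coffee": 1}))
```

Keeping the menu in one mapping also gives the NLP module a single source of truth for the item names it must recognize.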

We are planning to add three more infrared sensors to increase detection accuracy. We might also use OpenCV to help detect people with special needs, such as children or people sitting in wheelchairs.

Schedule

We have pushed back the design of the staff UI, since integration, testing, and revisions will take up all the time before our interim demo. An up-to-date schedule has been attached.

Lisa Xiong’s Status Report For 4/1/2023

Personal Accomplishments

I have changed one of our menu items from “veggie burger” to “salad” and then to “cheesecake” due to NLP parsing difficulties. Usually, a compound menu item is parsed with the structure shown in the following screenshot, where the quantity modifies the anchor noun.

[INSERT IMG]

However, for “veggie burger,” spaCy recognizes the following structure instead, where the quantity is a child of the adjective “veggie.” There are possible ways to handle this unusual structure, but we decided it would be easier to change the menu item instead.

[INSERT IMG]

The updated item “salad” faced the same issue. Adding salads to the order worked perfectly, yet when we tried to remove them, the phrase structure again became an outlier. Usually the quantity directly modifies the anchor noun, but here it becomes a child of the verb. To make matters worse, when I tried to add a special dependency rule just for the removal of salads, I found that no existing spaCy matcher operator can describe the relationship between the quantity and “salad” in this case. Although I expected them to be siblings according to the tree, both of the sibling operators “$++” and “$--” fail to match the phrase. That is when we decided to change the item again, to “cheesecake.”

[INSERT IMG]

Additionally, I enabled the handling of order confirmation keywords, and debugged some edge cases that occurred during testing. An interesting edge case was that “a beautifully packaged cheeseburger” could be detected while “a beautifully package cheeseburger” (a mishearing) failed to generate an order. I solved the issue by changing some operators and tags for dependency matching.

Schedule

I am on track with our updated schedule (see the team status report section). I pushed the design of the staff UI further back, since the integration of the other subsystems has not been completed yet. Noise reduction has also been temporarily removed from my schedule, as the current method can already handle classroom-level noise. We may do more with noise reduction after the interim demo.

Plans for Next Week

I will work with my teammates to complete the interim demo and debug more NLP edge cases, such as finding a way to still parse the order entry when the speech recognition system recognizes our quantity “two” as “to”. It might require adding new rules to our dependency matcher to recognize the specific sentence structures where the “to” should be changed to “two”. I will also start designing the staff UI mid-week.
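One possible rule for the “to”/“two” mishearing, sketched here with an abbreviated menu list, is to rewrite “to” as “two” only when it directly precedes a menu item:

```python
import re

MENU_WORDS = ("cheeseburger", "fries", "taco", "coffee")  # abbreviated

def fix_misheard_quantity(text: str) -> str:
    """Rewrite 'to' as 'two' when it directly precedes a menu item,
    a common speech-recognition mishearing."""
    pattern = re.compile(r"\bto\b(?=\s+(?:%s))" % "|".join(MENU_WORDS))
    return pattern.sub("two", text)

print(fix_misheard_quantity("I want to order to tacos"))
```

A pre-processing pass like this would run on the transcription before it reaches the dependency matcher, leaving legitimate uses of “to” (as in “to order”) untouched.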

Shiyi Zhang’s Status Report for 03/25/2023

Personal Accomplishments

This week, I’ve been working on a few different things for our project. Firstly, I’ve been downloading the necessary packages and test-running the parts that we’ve completed so far. In addition to this, I’ve also been working on connecting our microphone to the Pi.

One particular challenge I encountered was downloading the en_core_web_sm package for spaCy. Although I had no trouble installing spaCy itself, downloading this specific package kept failing. It turned out that the install was looking for a variable in a system-level C file that was undefined on the 32-bit Raspberry Pi OS we were using. I tried installing the 64-bit Raspberry Pi OS instead, but unfortunately that didn’t work either. After spending a day searching online for a solution, I came across a modified version of the 64-bit Raspberry Pi OS created by an online user that might do the trick. I installed it, and thankfully it worked.

I was unable to connect our Neat Bumblebee II microphone to the Pi because it requires both power and audio data transmission over a single USB-C to USB-A cable. Unfortunately, the Pi’s only USB-C port is reserved for the power supply and does not support audio data transmission, and the Pi’s USB-A ports do not support power delivery (PD). As an alternative, I tried connecting a Bluetooth wireless headphone with a built-in microphone to the Pi. While the Pi was able to detect the headphone, it did not recognize it as an audio device, which means it may require a driver. My next task for the upcoming week will be to troubleshoot this issue.

On schedule

Yes, my progress is on schedule.

Next week

This weekend, my top priority will be to find a solution to connect our Neat Bumblebee II microphone to the Pi or identify an alternative microphone that is compatible. Once the microphone is set up, I will focus on creating additional pages for the frontend. As we have decided not to ask customers to state all of the items they want to order in one sentence, I will work on creating a page where customers state one item at a time.

Lisa Xiong’s Status Report For 3/25/2023

Personal Accomplishments

After some unit testing of the integrated speech recognition, NLP and database subsystems, I implemented more features to my NLP system based on the results.

The first new feature is the checkout process. When a keyword (“check out”) is detected, the checkout function written by Nina will be called so that the order is uploaded to the database and sent to the kitchen side. This is achieved by adding a phrase matcher to the NLP system.

The second new feature is the detection of menu items with quantifiers. Previously, the NLP system was unable to recognize user input with quantifiers, such as “a cup of coffee”. I added another set of patterns to the dependency parsers so that the phrase structure with quantifiers can be processed as normal input.

Another feature I implemented is the removal of all mentioned items if the user does not specify the quantity in an item-deleting situation. When the user input is “no hamburgers”, all hamburgers in the existing order will be removed.
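The deletion behavior can be sketched as follows; this is a hypothetical helper illustrating the rule, not our exact code:

```python
def remove_items(order, item, quantity=None):
    """Remove `quantity` of `item` from an {item: count} order; with no
    quantity (e.g. 'no hamburgers'), drop every instance of the item."""
    if item not in order:
        return order
    if quantity is None or order[item] <= quantity:
        del order[item]
    else:
        order[item] -= quantity
    return order

order = {"hamburger": 3, "fries": 1}
remove_items(order, "hamburger")   # "no hamburgers": removes all three
remove_items(order, "fries", 1)    # "remove one fries"
print(order)
```

The same helper covers both phrasings, so the NLP layer only needs to decide whether a quantity was spoken.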

Schedule

I am on track with our updated schedule (described in detail in the team status report).

Plans for Next Week

I will keep revising the NLP system based on testing results and work on improving the noise-reduction abilities of our microphone. Some new NLP features I plan to add are the handling of order-confirmation keywords and the case where the quantity appears in a different part of the sentence than the menu item (e.g., “I want some coffee, um, 2 please.”).

Nina Duan’s Status Report For 3/25/2023

Personal Accomplishment

I worked with Lisa to integrate the microphone and speech recognition modules with the NLP module. For now, the system is able to correctly find the desired microphone, listen for and transcribe speech, parse long sentences like “I’d like one burger and two fries and three veggie burgers” in the background (in another thread), and store the parsed items under a single local Order object.

The first line in the screenshot is transcribed from speech.

About Schedule

I am on track with the schedule.

Plans for Next Week

I will continue to work with Lisa to improve our microphone, speech recognition, and NLP modules, as the usability of our system depends heavily on these parts. In addition, I will start creating ways of handling errors. For example, what should the system do if it fails to parse the customer’s speech? How should the system react if it times out?

Team Status Report For 3/25/2023

Risks

Although we have successfully integrated our microphone, speech recognition, and NLP modules, the functionality is still rather limited. For now, the system only achieves an accuracy of ~50% when transcribing speech to text. In addition to exploring more noise-cancellation algorithms, we will find ways to limit how long a customer can speak. For example, we will ask customers to order items one by one instead of placing the entire order in one sentence. We will also repeat the detected item and quantity back to the customer and ask them to confirm. In addition, at any time during the ordering process, the customer can say “remove XX item” to remove an item from the order. Hopefully, these measures will be enough to ensure that we don’t mistakenly order unwanted items for our customers.

In addition, we are currently having trouble installing the required spaCy package on our Raspberry Pi due to operating-system incompatibility. We have tried the 32-bit RPi OS as well as the 64-bit RPi OS but have had no luck so far. This weekend we will try Ubuntu. In the worst case, we might use sockets to request and fetch NLP and speech recognition results from another computer. Another fallback option is simply running our backend modules on a laptop, as we have already tested them on macOS.

Design Changes

For the NLP system, we have changed the way order deletion is processed. Previously, when the user input included a deletion keyword but no quantity, we chose not to process the request. Now, in this situation, the NLP system considers it a request to delete all of the mentioned menu items from the order, since that is the more intuitive intention. For example, when the customer says “no cheeseburgers,” we should remove all cheeseburger entries from the current order.

In addition, for our web application, we switched back to using Django from pure Python because it provides better support for client-server integration. 

Schedule

We broke some larger tasks down into smaller chunks to better track everyone’s progress.

One major schedule change we made is pushing the staff UI design to early April. As this is a post-MVP feature, we will work on it after all other subsystems have been integrated.

Currently, everyone is on track with the new schedule.