March 2023 – Team D3: Meal By Words

Lisa Xiong’s Status Report For 4/1/2023

Personal Accomplishments

I changed one of our menu items from “veggie burger” to “salad” and then to “cheesecake” because of NLP parsing difficulties. For menu items that are compound words, spaCy usually produces the structure shown in the following screenshot, where the quantity modifies the anchor noun.

[INSERT IMG]

However, for “veggie burger”, spaCy produces the following structure, where the quantity is a child of the adjective “veggie”. There are possible ways to handle this unusual structure, but we decided that it would be easier to change the menu item instead.

[INSERT IMG]

The updated item “salad” faced a similar issue. Adding salads to the order worked perfectly, but when we tried to remove them, the phrase structure again became an outlier. Usually the quantity directly modifies the anchor noun, but here it becomes a child of the verb. To make matters worse, when I tried to add a special dependency rule just for the removal of salads, I found that no existing spaCy matcher operator could describe the relationship between the quantity and “salad” in this case. Although I expected them to be siblings according to the tree, both sibling operators, “$++” and “$--”, failed to detect the phrase. That is when we decided to change the item again, this time to “cheesecake”.

[INSERT IMG]

Additionally, I enabled the handling of order confirmation keywords, and debugged some edge cases that occurred during testing. An interesting edge case was that “a beautifully packaged cheeseburger” could be detected while “a beautifully package cheeseburger” (a mishearing) failed to generate an order. I solved the issue by changing some operators and tags for dependency matching.

Schedule

I am on track with our updated schedule (see the team status report). I pushed the design of the staff UI further back since the integration of the other subsystems has not been completed yet. Noise reduction has also been temporarily removed from my schedule, as the current method can already handle classroom-level noise. We may choose to do more with noise reduction after the interim demo.

Plans for Next Week

I will work with my teammates to complete the interim demo and debug more NLP edge cases, such as finding a way to still parse the order entry when the speech recognition system recognizes our quantity “two” as “to”. It might require adding new rules to our dependency matcher to recognize the specific sentence structures where the “to” should be changed to “two”. I will also start designing the staff UI mid-week.
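As a rough illustration of the “to”/“two” problem, the sketch below rewrites “to” as “two” only when it sits directly before a menu item. This is a plain-Python pre-processing stand-in, not the dependency-matcher rules mentioned above; the `MENU_NOUNS` set and the function name are hypothetical.

```python
# Hypothetical menu vocabulary, used only for this illustration.
MENU_NOUNS = {"cheeseburger", "cheeseburgers", "fries", "coffee", "coffees",
              "cheesecake", "cheesecakes"}


def fix_misheard_two(text: str) -> str:
    """Replace "to" with "two" when the next word is a menu item."""
    tokens = text.split()
    fixed = []
    for i, token in enumerate(tokens):
        # Look at the following word, ignoring trailing punctuation.
        if i + 1 < len(tokens):
            next_word = tokens[i + 1].lower().strip(".,!?")
        else:
            next_word = ""
        if token.lower() == "to" and next_word in MENU_NOUNS:
            fixed.append("two")
        else:
            fixed.append(token)
    return " ".join(fixed)


print(fix_misheard_two("I want to cheeseburgers"))
print(fix_misheard_two("I want to order fries"))
```

The second call is left unchanged because “to” is followed by a verb rather than a menu item. A rule this simple would still miss cases such as “to large coffees”, which is where sentence-structure rules would be needed.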

Shiyi Zhang’s Status Report for 03/25/2023

Personal Accomplishments

This week, I’ve been working on a few different things for our project. Firstly, I’ve been downloading the necessary packages and test-running the parts that we’ve completed so far. In addition to this, I’ve also been working on connecting our microphone to the Pi.

One particular challenge I encountered was downloading spaCy’s en_core_web_sm package. Although I had no trouble downloading spaCy itself, downloading this specific package was unsuccessful. It turned out that the installation was looking for a variable in a system-level C file, which was undefined on the 32-bit Raspberry Pi OS we were using. I tried installing a different OS, the 64-bit Raspberry Pi OS, but unfortunately, that didn’t work either. After spending a day searching online for a solution, I came across a modified version of the 64-bit Raspberry Pi OS created by an online user that might do the trick. I installed it, and thankfully it worked.

I was unable to connect our Neat Bumblebee II microphone to the Pi because it needs both power and audio data transmission over a single USB-C to USB-A cable. Unfortunately, the Pi’s only USB-C port is reserved for power supply and does not support audio data transmission. Additionally, the USB-A ports on the Pi do not support power delivery (PD). As an alternative solution, I tried connecting a pair of Bluetooth headphones with a built-in microphone to the Pi. While the Pi was able to detect the headphones, it did not recognize them as an audio device, which suggests that a driver may be required. My next task for the upcoming week will be to troubleshoot this issue.

On schedule

Yes, my progress is on schedule.

Next week

This weekend, my top priority will be to find a solution to connect our Neat Bumblebee II microphone to the Pi or identify an alternative microphone that is compatible. Once the microphone is set up, I will focus on creating additional pages for the frontend. As we have decided not to ask customers to state all of the items they want to order in one sentence, I will work on creating a page where customers state one item at a time.

Lisa Xiong’s Status Report For 3/25/2023

Personal Accomplishments

After some unit testing of the integrated speech recognition, NLP, and database subsystems, I added more features to my NLP system based on the results.

The first new feature is the checkout process. When a keyword (“check out”) is detected, the checkout function written by Nina will be called so that the order is uploaded to the database and sent to the kitchen side. This is achieved by adding a phrase matcher to the NLP system.
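A minimal sketch of the keyword check, written in plain Python as a stand-in for the spaCy phrase matcher. The phrase list, `is_checkout_request`, and the `on_checkout` callback are hypothetical names; the real checkout function is the one written by Nina.

```python
import re

# Assumed keyword list for illustration; the real matcher may use other phrases.
CHECKOUT_PHRASES = ("check out", "checkout")


def is_checkout_request(transcript: str) -> bool:
    """Return True if the transcript contains a checkout phrase as whole words."""
    # Lowercase, turn punctuation into spaces, and collapse repeated spaces.
    normalized = re.sub(r"[^a-z ]+", " ", transcript.lower())
    padded = " " + " ".join(normalized.split()) + " "
    return any(f" {phrase} " in padded for phrase in CHECKOUT_PHRASES)


def handle_transcript(transcript: str, on_checkout) -> bool:
    """Call the checkout callback when a checkout phrase is detected."""
    if is_checkout_request(transcript):
        on_checkout()
        return True
    return False


handle_transcript("I'd like to check out, please.",
                  lambda: print("uploading order"))
```

Padding the normalized text with spaces keeps the match on word boundaries, so a word that merely contains a keyword does not trigger checkout.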

The second new feature is the detection of menu items with quantifiers. Previously, the NLP system was unable to recognize user input with quantifiers, such as “a cup of coffee”. I added another set of patterns to the dependency matcher so that phrases with quantifiers are processed like normal input.

Another feature I implemented is the removal of all mentioned items if the user does not specify the quantity in an item-deleting situation. When the user input is “no hamburgers”, all hamburgers in the existing order will be removed.
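The removal rule can be sketched as follows. The `Order` class here is a hypothetical simplification for illustration, not the project’s actual Order object; a missing quantity is represented as `None`.

```python
from typing import Dict, Optional


class Order:
    """Hypothetical, simplified order: maps item name to quantity."""

    def __init__(self) -> None:
        self.items: Dict[str, int] = {}

    def add(self, name: str, quantity: int = 1) -> None:
        self.items[name] = self.items.get(name, 0) + quantity

    def remove(self, name: str, quantity: Optional[int] = None) -> None:
        """Remove `quantity` of an item, or all of it when no quantity is given."""
        if name not in self.items:
            return
        if quantity is None or quantity >= self.items[name]:
            # "no hamburgers": drop every hamburger in the order.
            del self.items[name]
        else:
            self.items[name] -= quantity


order = Order()
order.add("hamburger", 3)
order.add("fries")
order.remove("hamburger", 1)   # two hamburgers remain
order.remove("hamburger")      # no quantity given: all hamburgers removed
print(order.items)
```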

Schedule

I am on track with our updated schedule (described in detail in the team status report).

Plans for Next Week

I will keep revising the NLP system based on testing results, and work on improving the noise reduction abilities of our microphone. Some new NLP features I plan to add are the handling of order confirmation keywords and the case where the quantity is in a different part of the sentence than the menu item (e.g., “I want some coffee, um, 2 please”).

Nina Duan’s Status Report For 3/25/2023

Personal Accomplishment

I worked with Lisa to integrate the microphone and speech recognition modules with the NLP module. For now, the system is able to correctly find the desired microphone, listen for and transcribe speech, parse long sentences like “I’d like one burger and two fries and three veggie burgers” in the background (in another thread), and store the parsed items under a single local Order object.
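The background-parsing arrangement can be sketched with a worker thread and a queue. Everything here is an assumption made for illustration: `parse_items` is a toy placeholder for the real NLP module, and a plain dictionary stands in for the Order object.

```python
import queue
import threading

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}


def parse_items(sentence):
    """Toy placeholder parser: the real system uses the spaCy-based NLP module."""
    items = []
    for chunk in sentence.lower().split(" and "):
        words = chunk.split()
        for i, word in enumerate(words):
            if word in NUMBER_WORDS and i + 1 < len(words):
                # Everything after the number word is treated as the item name.
                items.append((" ".join(words[i + 1:]), NUMBER_WORDS[word]))
                break
    return items


def parser_worker(transcripts, order, lock):
    """Consume transcripts until a None sentinel arrives."""
    while True:
        sentence = transcripts.get()
        if sentence is None:
            break
        for name, quantity in parse_items(sentence):
            with lock:
                order[name] = order.get(name, 0) + quantity


def parse_in_background(sentences):
    """Parse sentences on a worker thread and return the combined order."""
    transcripts = queue.Queue()
    order = {}
    lock = threading.Lock()
    worker = threading.Thread(target=parser_worker,
                              args=(transcripts, order, lock), daemon=True)
    worker.start()
    for sentence in sentences:
        transcripts.put(sentence)
    transcripts.put(None)  # tell the worker to stop
    worker.join()
    return order


print(parse_in_background(
    ["I'd like one burger and two fries and three veggie burgers"]))
```

In a running kiosk the listening loop would keep putting transcripts on the queue while the worker parses earlier ones, so parsing does not block the microphone.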

The first line in the screenshot is transcribed from speech.

About Schedule

I am on track with the schedule.

Plans for Next Week

I will continue to work with Lisa to improve our microphone, speech recognition, and NLP modules, as the usability of our system depends heavily on these parts. In addition, I will start creating ways of handling errors. For example, what should the system do if it fails to parse the customer’s speech? How should the system react if it times out?

Team Status Report For 3/25/2023

Risks

Although we have successfully integrated our microphone, speech recognition, and NLP modules, the functionality is still rather limited. For now, the system only has an accuracy of ~50% when transcribing speech to text. In addition to exploring more noise cancellation algorithms, we will also find ways to limit how long a customer can speak. For example, we will ask customers to order items one by one instead of placing the entire order in one sentence. We will also repeat the detected item and quantity to the customer and ask them to confirm. In addition, at any time during the ordering process, the customer can say “remove XX item” to remove an item from the order. Hopefully, these measures are enough to ensure that we don’t mistakenly order unwanted items for the customers.

In addition, we are currently having trouble downloading the required spaCy package to our Raspberry Pi due to operating system incompatibility. We have tried 32-bit RPi OS as well as 64-bit RPi OS but have had no luck so far. This weekend we will try Ubuntu. In the worst case, we might use sockets to request and fetch NLP and speech recognition results from another computer. Another fallback option is to simply run our backend modules on a laptop, as we have already tested them on macOS.

Design Changes

For the NLP system, we have changed the way order deletion is processed. Previously, when the user input included a deletion keyword but no quantity, we chose not to process the request. Now, in this situation, the NLP system treats it as a request to delete all of the mentioned menu items in the order, since that is the more intuitive interpretation. For example, when the customer says “no cheeseburgers”, we should be able to remove all cheeseburger entries in the current order.

In addition, for our web application, we switched back to using Django from pure Python because it provides better support for client-server integration.

Schedule

We broke some larger tasks down into smaller chunks to keep better track of everyone’s progress.

One major schedule change we made is pushing the staff UI design to early April. As this is a post-MVP feature, we will work on it after all other subsystems have been integrated.

Currently, everyone is on track with the new schedule.

Shiyi Zhang’s Status Report for 03/18/2023

Personal Accomplishments

Frontend

This past week, my main focus has been on making adjustments to our frontend code. Specifically, I’ve been transitioning it from Django to Tkinter. We made this decision because our current project priorities lie elsewhere and we need the frontend to be operational as quickly as possible. However, we may switch back to Django once we’re nearing the completion of the project, since it provides us with more styling options.

Currently, the pages have the capability to wait for output variables from the backend, such as the system response, and to then display the text like a typewriter. Additionally, the pages are able to disable audio inputs while waiting for responses from the backend or while still in the process of type-writing.

Sensor

I have installed the operating system, a fan, and some heat sinks onto our Raspberry Pi. The code for the sensor has also been transferred to the RPi and is functioning correctly.

On schedule

Yes, my progress is on schedule.

Next week

I will fetch Lisa’s NLP code and incorporate it into my workspace, connecting it to my frontend. Additionally, I will also integrate the sensor code with the frontend.

Lisa Xiong’s Status Report For 3/18/2023

Personal Accomplishments

I collaborated with Nina to finish our NLP and database integration this week. The output of my NLP algorithm can now feed directly into the database to be an Order object. I started writing the speech recognition code with Nina and the system is able to convert user input into text in near real-time. The speech recognition system is still weak on noise reduction (as described in Nina’s status report) and speech conversion speed, which we plan to improve soon. I also fixed some bugs in my NLP algorithm while making it account for more use cases, such as when a user orders and then deletes an item in the same sentence. Another change I made in the NLP is that when we cannot detect the quantity of a menu item in the sentence, we will default it to 1.

Schedule

I am on track with our schedule this week, since the NLP and database integration is almost finished and the speech recognition work has started.

Plans for Next Week

I will continue to work with Nina to improve our NLP and database integration, and tune the speech recognition to our microphone so that it can convert user speech faster and more accurately. If I have enough time left, the integration between speech recognition and NLP systems will be the next task I am going to work on.

Nina Duan’s Status Report For 3/18/2023

Personal Accomplishment

In addition to completing the ethics assignment, I integrated our database module and preliminary NLP module with Lisa and modified the microphone & speech recognition system provided by Python’s SpeechRecognition library.

After integration, our system is now able to extract menu items and quantities from simple sentences, add them to an Order object, and upload that object to the database. However, there are still flaws with this simple system because we have yet to implement the checkout portion of the NLP module.

The open-source SpeechRecognition library provides a basic real-time speech recognition functionality that can be used with an external microphone. This process, however, doesn’t allow room for noise reduction. Therefore, I explored the source code of the library, determined where the microphone’s input is read, and extended it to utilize a noise reduction algorithm. For now, it uses a simple, deterministic noise cancellation algorithm that attempts to cancel out low amplitudes by mixing with the signal’s inversion. By slightly altering this visualization tool, I was able to visualize the difference. This is what it looks like when I speak at conversational volume from a distance of ~0.7m, with a restaurant ambience noise YouTube video playing in the background (graphs are in time domain; top = raw microphone input, bottom = filtered input):

Speech with noise, with amplitude decreased.
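One possible reading of “cancel out low amplitudes by mixing with the signal’s inversion” is a simple amplitude gate: samples below a threshold are summed with their own inversion, which silences them, while louder samples pass through. The sketch below is an interpretation under that assumption, not the code added to the library; the function name and threshold value are hypothetical.

```python
from typing import List


def cancel_low_amplitudes(samples: List[int], threshold: int) -> List[int]:
    """Silence samples quieter than `threshold` by mixing in their inversion."""
    filtered = []
    for sample in samples:
        if abs(sample) < threshold:
            inverted = -sample
            # A sample plus its inversion sums to zero, removing quiet noise.
            filtered.append(sample + inverted)
        else:
            # Louder samples (assumed to be speech) are kept unchanged.
            filtered.append(sample)
    return filtered


raw = [5, -8, 1200, -30, -2500]
print(cancel_low_amplitudes(raw, threshold=100))
```

A fixed threshold like this is deterministic and cheap, but it also removes quiet parts of speech, so the threshold would need tuning against the microphone and the room.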

About Schedule

I have caught up to the schedule. The microphone has been set up, and preliminary signal processing code has been written.

Plans for Next Week

I will continue to work with Lisa to improve our NLP & database modules, as this is the core part of our system. In addition, I will start installing the necessary dependencies on the Raspberry Pi 4 and transferring our code to it.

Team Status Report For 3/18/2023

Risks

The greatest risk that we are currently facing is the low performance of the speech recognition system. As we started writing the speech recognition code, we realized that although the Python SpeechRecognition library usually returns coherent sentences, which helps our NLP system parse the input, the speed and accuracy of recognition are not promising. We will test how changing certain settings in the SpeechRecognition library affects output accuracy, and in the worst case, we can switch to other speech recognition algorithms compatible with Python.

Design Changes

We may be able to use fewer infrared sensors. We conducted testing this week and found that the system was still able to detect relatively short human figures accurately with just one or two sensors.

To accommodate inflexibilities in our current speech recognition and NLP modules, we decided that checkout can only be triggered by certain keywords (“checkout,” “finish,” and “done”). We will also be taking item orders one by one, so a sample interaction would look like this:

Kiosk: “Welcome! Please order your first item!”

Customer: “One hamburger, please.”

Kiosk: “You’ve ordered one hamburger. Is this correct?”

Customer: “Yes.”

Kiosk: “One hamburger, confirmed. Please order your next item, or say ‘finish’ to check out.”

Customer: “Checkout.”

Kiosk: “Are you ready to check out?”

Customer: “Yes.”

Kiosk: “Checkout successful! Your total is $XX.XX. Your order number is XX. You will be called when your order is ready. Thank you for using Meal By Words!”
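The interaction above can be modeled as a small state machine. The sketch below is a hypothetical simplification: state names, the `KioskDialogue` class, and the retry prompts are assumptions, the replies are shortened, and item parsing is reduced to echoing the customer’s words.

```python
CHECKOUT_KEYWORDS = {"checkout", "finish", "done"}


class KioskDialogue:
    """Hypothetical state machine: ORDERING, CONFIRM_ITEM, CONFIRM_CHECKOUT, DONE."""

    def __init__(self) -> None:
        self.state = "ORDERING"
        self.pending = None   # item waiting for a yes/no confirmation
        self.confirmed = []   # items the customer has confirmed

    def handle(self, utterance: str) -> str:
        text = utterance.lower().strip(" .!?")
        if text.endswith(", please"):
            text = text[: -len(", please")]

        if self.state == "ORDERING":
            if text in CHECKOUT_KEYWORDS:
                self.state = "CONFIRM_CHECKOUT"
                return "Are you ready to check out?"
            self.pending = text
            self.state = "CONFIRM_ITEM"
            return f"You've ordered {text}. Is this correct?"

        if self.state == "CONFIRM_ITEM":
            if text == "yes":
                self.confirmed.append(self.pending)
                reply = f"{self.pending.capitalize()}, confirmed."
            else:
                # Assumed behavior: discard the item and ask again.
                reply = "Okay, please order your item again."
            self.pending = None
            self.state = "ORDERING"
            return reply

        if self.state == "CONFIRM_CHECKOUT":
            if text == "yes":
                self.state = "DONE"
                return "Checkout successful!"
            self.state = "ORDERING"
            return "Please order your next item."

        return "This order is already complete."


kiosk = KioskDialogue()
for line in ["One hamburger, please.", "Yes.", "Checkout.", "Yes."]:
    print(kiosk.handle(line))
```

Keeping checkout behind its own confirmation state means a misheard “done” costs the customer one extra “no” rather than a prematurely submitted order.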

Schedule

There is no schedule change this week. Everyone is on track with our plan.

Shiyi Zhang’s Status Report for 03/11/2023

Personal Accomplishments

During Spring break, I continued working on the client-side UI and now have two pages: one that appears when user speech is detected, and another that displays our menu and added items.

Page #1

This page is supposed to be voice-operated, but as we have not yet received the microphone, I have decided to use a button that, when clicked, listens to the laptop’s microphone for now. By using the Web Speech API and its JavaScript functions, the page is capable of displaying real-time transcribed text in the provided text area.

Page #2

This is where the customer views the menu and reviews their order before checkout.

Schedule

The client-side UI is close to completion, but it’s currently not talking to any other subsystem, such as the Django backend, so my progress is a bit behind schedule. I don’t think it’s too much of a problem since the mic/tool kit will arrive next week, and utilizing the outputs from the sensors should not take too long.

Next week

Next week I will be working on making the sensors and the mic work with the backend and, if I have time, with the frontend as well. I will work with Lisa on the mic part since she is responsible for language parsing.