Tests
For the parts I am responsible for (distance sensors, user interface, and possibly a camera), I have run unit tests on the PIR sensors and the user interface. Our distance sensors have not arrived yet, so their tests are delayed; we expect the sensors to arrive next week.
- User interface: The timeout given to the speech transcription (audio input to text) directly affects how complete the transcribed text is. I tested timeouts of 1, 2, and 3 seconds. At 3 seconds the text was rarely cut off, but the longer delay hurt the user experience. At 1 second the interface felt more responsive, but the customer has to speak quickly with no pauses or risk having their speech cut off. After weighing these options, I decided to go with a timeout of 1 second. I also tested three edge cases: receiving no speech for longer than 30 seconds (the system should go into INACTIVE mode and delete the current, incomplete order), checking out (the system should submit the order), and receiving unrecognizable speech (the system should wait). All three work as intended. I have not yet tested the UI with the sensors and the camera installed. For that, I plan to test whether the UI can show the number of people waiting in line, remind the customer to move closer to the mic, and switch to the appropriate page when no customer is around.
- Sensors and the camera: I plan to experiment with tilting the sensors to find the best angle for detecting people within a specific distance range while ignoring those beyond it. Factors to consider include where the sensors are placed and how to distinguish an individual from a large crowd. Once the sensors are installed and calibrated, I will evaluate their accuracy and speed: how accurately the system counts the people in line (actual number of people vs. the number we calculate), and how fast the camera/OpenCV pipeline processes the data (how many seconds it takes to count the people).
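The accuracy and speed checks planned above could be scored with a small helper along these lines. This is only a sketch: the function names `count_accuracy` and `timed` are made up for illustration, and the choice of metrics (exact-match rate and mean absolute error) is an assumption, not something we have settled on.

```python
import time
from statistics import mean


def count_accuracy(actual_counts, measured_counts):
    """Compare ground-truth head counts with the counts the system reported.

    Returns (exact_match_rate, mean_absolute_error) over paired trials.
    """
    if len(actual_counts) != len(measured_counts) or not actual_counts:
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(actual_counts, measured_counts))
    exact_rate = sum(a == m for a, m in pairs) / len(pairs)
    mae = mean(abs(a - m) for a, m in pairs)
    return exact_rate, mae


def timed(count_fn, *args):
    """Run a counting function once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = count_fn(*args)
    return result, time.perf_counter() - start
```

For example, with made-up trial data (not measurements), `count_accuracy([0, 1, 2, 3], [0, 1, 3, 3])` gives an exact-match rate of 0.75 and a mean absolute error of 0.25.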
Personal Accomplishments
This week, my focus has been on integrating the backend and the frontend. Nina added flag variables and a new interface, which the frontend now uses to read the status of the speech recognition and natural language processing parts of the system. As a result, the frontend can now tell when it’s time for the customer to speak and when the system is processing and won’t accept any audio input. Additionally, I implemented code that transcribes speech to text and displays it on the screen. This will enable customers who are hard of hearing to view their order.
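A minimal sketch of how the frontend might combine the backend status flag with the 30-second inactivity rule from the Tests section. The names `Frontend`, `Mode`, and `backend_busy` are hypothetical and do not match our actual interface. It also assumes that unrecognizable speech does not reset the inactivity timer and that time spent processing does not count as silence.

```python
from dataclasses import dataclass, field
from enum import Enum

INACTIVITY_LIMIT_S = 30.0  # no speech for longer than this -> INACTIVE


class Mode(Enum):
    INACTIVE = "inactive"      # no customer; the incomplete order is discarded
    LISTENING = "listening"    # the customer may speak
    PROCESSING = "processing"  # backend busy; audio input is not accepted


@dataclass
class Frontend:
    last_speech_at: float = 0.0
    mode: Mode = Mode.LISTENING
    order: list = field(default_factory=list)
    transcript: str = ""

    def on_transcript(self, text, now):
        """Handle one transcription result; text is None if unrecognizable."""
        if self.mode is Mode.PROCESSING or text is None:
            return  # ignore input while busy; keep waiting on unrecognizable speech
        self.mode = Mode.LISTENING
        self.last_speech_at = now
        self.transcript = text  # text shown on screen

    def update(self, backend_busy, now):
        """Refresh the mode from the backend flag and the inactivity timer."""
        if backend_busy:
            self.mode = Mode.PROCESSING
            self.last_speech_at = now  # assumption: processing is not silence
        elif now - self.last_speech_at > INACTIVITY_LIMIT_S:
            self.mode = Mode.INACTIVE
            self.order.clear()  # delete the current, incomplete order
        else:
            self.mode = Mode.LISTENING
        return self.mode
```

Passing `now` in explicitly, instead of reading the clock inside the class, keeps the timeout logic easy to unit test.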
Aside from the frontend work, I’ve also been working on the hardware aspect of the project. Since the distance sensors have not arrived yet, I have been exploring the use of OpenCV to better understand what the customer is doing. As a result, the system can now detect the number of people waiting in line, as well as identify if a person is present.
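One way to get a people count and a presence flag out of OpenCV is sketched below. It uses OpenCV's built-in HOG pedestrian detector, which is an assumption made for illustration and not necessarily the detector in our code. The helper names and the `min_score` threshold are hypothetical, and the threshold would need tuning on real footage.

```python
def build_detector():
    """Create OpenCV's stock HOG + linear SVM pedestrian detector."""
    import cv2  # imported lazily so line_status() works without OpenCV installed

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
    return hog


def detect_people(hog, frame):
    """Return [((x, y, w, h), score), ...] for people found in a BGR frame."""
    import numpy as np

    boxes, scores = hog.detectMultiScale(frame, winStride=(8, 8))
    scores = np.asarray(scores).ravel()  # shape differs across OpenCV versions
    return [(tuple(int(v) for v in box), float(s)) for box, s in zip(boxes, scores)]


def line_status(detections, min_score=0.5):
    """Summarize detections as a head count and a presence flag."""
    kept = [box for box, score in detections if score >= min_score]
    return {"count": len(kept), "present": bool(kept)}
```

In use, `build_detector()` would be called once at startup and `line_status(detect_people(hog, frame))` once per captured frame.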
Schedule
My progress has been slightly delayed because the distance sensors haven’t arrived yet, and we have just switched from a Raspberry Pi to an Arduino. However, to avoid any further delays, I’ve implemented a backup solution using OpenCV and a camera. This should keep us on track even if the sensors never arrive. We also have a second backup plan, which uses PIR sensors. I have already written the code for this option, so we are prepared.
Next Week
Once the distance sensors arrive, my plan is to install them on our Arduino and then debug the code I have prepared for them. Additionally, I intend to integrate these sensors with the camera, using the OpenCV library, so that the OpenCV part knows when and when not to check the surroundings.
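The sensor-to-camera hand-off could look roughly like the sketch below: the distance readings decide whether the OpenCV step should run at all. `DetectionGate`, `max_range_cm`, and `hold_s` are hypothetical names and values, and how the readings get from the Arduino to this code (for example, over a serial link) is left out.

```python
class DetectionGate:
    """Decide when the camera pipeline should run, based on distance readings."""

    def __init__(self, max_range_cm=150.0, hold_s=5.0):
        # Hypothetical defaults; both would need tuning on the real setup.
        self.max_range_cm = max_range_cm  # farthest reading that counts as "someone is here"
        self.hold_s = hold_s              # keep the camera active this long after the last hit
        self._last_hit = None

    def update(self, distance_cm, now):
        """Feed one reading (None means no echo); return True if OpenCV should run."""
        if distance_cm is not None and 0 < distance_cm <= self.max_range_cm:
            self._last_hit = now
        return self._last_hit is not None and now - self._last_hit <= self.hold_s
```

The hold time is there so that a single missed reading does not switch the camera off while a customer is still standing at the kiosk.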