Mahlet’s Status Report for 12/07/2024

This week mainly consisted of debugging my audio localization solution and making the necessary changes to the hardware of SBB.

Hardware

Based on the decision to switch from a servo motor to a stepper motor, I had to change the mechanism mounting the robot’s head to the body. I was able to reuse most of the components from the previous version, but had to make the mounting stand slightly longer to stay in line with our use-case requirement. The robot can now move its head smoothly and consistently.

My work on audio localization and its integration with the neck rotation mechanism has made significant progress, though some persistent challenges remain. Below is a detailed breakdown of my findings and ongoing efforts.

To evaluate the performance of the audio localization algorithm, I conducted simulations using a range of true source angles from 0° to 180°. The algorithm produced estimated angles that closely align with expectations, achieving a mean absolute error (MAE) of 2.00°. This MAE was calculated by comparing the true angles with the estimated angles and provides a clear measure of the algorithm’s accuracy. The result confirms that the algorithm performs well within the intended target of a ±5° margin of error.
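
For reference, the MAE computation itself is short; below is a minimal sketch, with the angle grid and the noise model standing in for the real simulation outputs:

    import numpy as np

    # true source angles swept in the simulation (the 10-degree grid is an assumption)
    true_angles = np.arange(0, 181, 10).astype(float)

    # stand-in for the algorithm's estimates; the real values come from the localizer
    rng = np.random.default_rng(0)
    est_angles = true_angles + rng.normal(0, 2.5, size=true_angles.shape)

    mae = np.mean(np.abs(est_angles - true_angles))
    print(f"MAE: {mae:.2f} degrees")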

To measure computational efficiency, I used Python’s time library to record the start and end times for the algorithm’s execution. Based on these measurements, the average computation time for a single audio cue is 0.0137 seconds. This speed demonstrates the algorithm’s capability to meet real-time processing requirements.
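
A sketch of such a timing harness is below; estimate_direction is a hypothetical stand-in for the localization routine, and time.perf_counter is the usual choice in the time library for short intervals:

    import time

    def estimate_direction(samples):
        # placeholder for the real localization routine
        return 90.0

    start = time.perf_counter()
    angle = estimate_direction([0.0] * 1024)
    elapsed = time.perf_counter() - start
    print(f"computation time: {elapsed:.4f} s")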

In integrating audio localization with the neck rotation mechanism, I observed both promising results and challenges that need to be addressed.

For audio cue detection, I tested the microphones to identify claps as valid signals. These signals were successfully detected when they exceeded an Arduino ADC threshold of 600. Upon detection, these cues are transmitted to the Raspberry Pi (RPi) for angle computation. However, the integration process revealed inconsistencies in serial communication between the RPi and the Arduino.
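
On the RPi side, the cue hand-off can be read with pyserial. A minimal sketch, assuming the Arduino streams newline-terminated ADC readings over /dev/ttyACM0 at 115200 baud (port, baud rate, and line format are all assumptions):

    import serial  # pyserial

    THRESHOLD = 600  # Arduino ADC threshold from the tests above

    with serial.Serial("/dev/ttyACM0", 115200, timeout=1) as ser:
        while True:
            line = ser.readline().decode(errors="ignore").strip()
            if not line:
                continue
            try:
                value = int(line)
            except ValueError:
                continue  # skip malformed lines
            if value > THRESHOLD:
                print("clap detected:", value)
                # hand the surrounding samples to the angle computation here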

While the typical serial communication latency is 0.2 seconds or less, occasional delays ranging from 20 to 35 seconds have been observed. These delays disrupt the system’s responsiveness and make it difficult to collect reliable data. The root cause could be the Arduino’s continuous serial write operation, which conflicts with its role in receiving data from the RPi. The data received on the RPi appears to be handled correctly, but I will validate it side by side against the Arduino’s output to make sure the values are accurate. Attempts to visualize the data on the computer side were too slow for the 44 kHz sampling rate, leaving gaps in real-time analysis.
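
To pin down when the stalls happen, the gap between successful reads can be logged on the RPi; a sketch under the same serial assumptions as above:

    import time
    import serial

    ser = serial.Serial("/dev/ttyACM0", 115200, timeout=60)
    last = time.monotonic()
    while True:
        if ser.readline():  # only time the gaps between actual data
            now = time.monotonic()
            gap = now - last
            last = now
            if gap > 1.0:  # flag anything far beyond the typical 0.2 s latency
                print(f"stall: {gap:.1f} s between reads")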

To address hardware limitations, I have temporarily transitioned testing to a laptop due to USB port issues with the RPi. However, this workaround has not resolved the latency issue entirely.

Despite these challenges, the stepper motor has performed within expectations. The motor’s rotation from 0° to 180° was measured at 0.95 seconds, which meets the target of under 3 seconds, assuming typical latency.

Progress is slightly behind schedule; the contingency plan for this is described in the Google Sheets document in the team weekly report.

Next Steps

Resolving the serial communication latency is my highest priority. I will focus on optimizing the serial read and write operations on both the Arduino and the RPi to prevent delays. Addressing the RPi’s USB port malfunction is another critical task, as it will let me move testing back to the intended hardware; otherwise, I will resort to the contingency plan of using the web app to compute the data. I will also finalize all the tests I need for the report and complete integration with my team over the final week.

Shannon’s Status Report 12/7/24

This week, since I was done with everything I was in charge of, I focused on helping Jeffrey catch up on his parts. Jeffrey had gotten the Study Session partially working: when a user clicks start session on the WebApp, the robot display screen starts a timer. However, he was unable to get pause/resume or end session to work, so I sat down with him and we worked on these features together.

I noticed that he had a lot of debugging statements on the robot, which was good, but he wasn’t really logging anything on the WebApp. Following my advice, he added logging there, and we realized from the logs that the event was being emitted from the RPi but was not being received on the WebApp. I also noticed that he logged the WebSocket connection on the start page but not on the in-progress page, so I recommended he add that. Once he did, we saw that the connection log was slow to appear on the in-progress page; we waited until it appeared, tested the pause/resume button, and it worked. We had thus narrowed the issue down to latency, and we resolved it by switching the transport from polling to WebSockets (a sketch of this transport setting is shown below), which fixed the latency issue.

We then debugged why the End Session event from the WebApp was not being received on the robot end. Once again, based on the logs, I deduced that the events were being emitted incorrectly, so the robot’s event handlers could not catch them. We changed the robot to listen for the event the WebApp was actually sending, and it worked: we finished a standard study session! This was done on Wednesday. After this, I took over all TTS and Study Session related work, while Jeffrey will work on the last part of hardware integration with the pause/resume buttons.
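
For reference, forcing the WebSocket transport in python-socketio looks roughly like this; the URL and event name are placeholders for the project’s actual ones:

    import socketio

    sio = socketio.Client()

    @sio.event
    def connect():
        print("connected, transport:", sio.transport())

    @sio.on("pause_session")
    def on_pause(data):  # event name is illustrative
        print("pause received:", data)

    # skipping the long-polling transport avoids the upgrade latency we hit
    sio.connect("http://localhost:8000", transports=["websocket"])
    sio.wait()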

On Thursday, I integrated my TTS code and ran into issues with the RPi trying, and failing, to play audio from a “default” device. After some quick troubleshooting, I wrote a config file to set the default device to the port the speaker was plugged into, and it worked (a sketch of such a config follows this paragraph)! The text-to-speech feature worked well, with barely any detectable difference in latency between the audio playing on the user’s computer and on the RPi through the speakers. I tested it further with a variety of .txt files, and once I was satisfied with the results, I moved on to integration with the Study Session feature.

I worked on making sure TTS runs only while a study session is in progress (not paused). I wrote and tested some code, but it was buggy and I could not get the redirection to work as I desired. The intended behavior: if no Study Session has been created yet, clicking the text-to-speech feature on the WebApp should redirect the user to create a new study session; if a Study Session exists but is paused, it should redirect them to it. I also wanted to handle the case where the goal duration is reached while the user is using the text-to-speech feature: going back to the study session should still trigger the pop-up asking whether they would like to continue or stop the session, since the target duration has been reached.
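
The config sketch mentioned above: on Linux, the default ALSA output device is typically pinned with a small config file such as ~/.asoundrc. The card index below is an assumption; the actual value can be checked with `aplay -l`:

    pcm.!default {
        type hw
        card 1      # index of the USB speaker, per `aplay -l`
        device 0
    }

    ctl.!default {
        type hw
        card 1
    }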

On Friday, I continued working on these features and managed to debug them successfully. The text-to-speech feature worked well alongside the Study Session and exhibited the desired behavior. I then worked on the Pomodoro Study Session feature, which gives the user built-in breaks based on what they set at the start of the study session. I worked on both the WebApp and the RPi, ensuring that the study and break intervals were sent correctly. To make the intervals work, I had to create a second, background clock running alongside the display clock: the standard study session only needs a display clock because the user pauses and resumes sessions manually, but the Pomodoro session has to pause and resume automatically while playing a break-reminder audio, so it requires its own clock (see the sketch below). I wrote and debugged this code and it worked well. This is also when I discovered that the text-to-speech audio and the break-reminder audio could not play at the same time, so I made a slight design change to prevent them from overlapping.
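
A minimal sketch of such a background clock, assuming a thread that alternates study and break intervals; the class and callback names are illustrative, not the project’s actual API:

    import threading
    import time

    class PomodoroClock:
        # Alternates study and break intervals on a background thread.
        def __init__(self, study_s, break_s, on_break, on_resume):
            self.study_s = study_s
            self.break_s = break_s
            self.on_break = on_break      # e.g. play the break-reminder audio
            self.on_resume = on_resume
            self._stop = threading.Event()

        def start(self):
            threading.Thread(target=self._run, daemon=True).start()

        def stop(self):
            self._stop.set()

        def _run(self):
            while not self._stop.is_set():
                if self._stop.wait(self.study_s):   # study interval elapses...
                    return                          # ...unless the session ends
                self.on_break()
                if self._stop.wait(self.break_s):   # break interval
                    return
                self.on_resume()

    # demo with short intervals; a real session would use minutes
    clock = PomodoroClock(study_s=5, break_s=2,
                          on_break=lambda: print("break time"),
                          on_resume=lambda: print("back to studying"))
    clock.start()
    time.sleep(12)
    clock.stop()

Using Event.wait for the sleeps means stop() takes effect immediately instead of waiting out the remainder of an interval.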

Today, I will work on cleaning up the WebApp code, help Jeffrey with any RPS issues that he may face, and start on the final poster.

According to the Gantt chart and the newly linked final week schedule, I am on target.

In the next week I will work on:

  • Final overall testing with my group
  • Final poster, video, demo and report

Mahlet’s Status Report for 11/30/2024

As we approach the final presentation of our project, my main focus has been preparing for the presentation, as I will be presenting in the coming week.

In addition to this, I have assembled the robot’s body, and made necessary modifications to the body to make sure every component is placed correctly. Below are a few pictures of the changes so far. 

I have modified the robot’s face so that it can encase the display screen; previously, the head was a solid box. The servo-to-head mount is now properly assembled, and the head is well balanced on the stand that the motor is mounted to. This leaves space to place the Arduino, speaker, and Raspberry Pi accordingly. I have also mounted the microphones to the corners as desired.

Before picture: 

After picture: 

Mounted microphones on to the robot’s body

Assembled Body of the robot

Assembled body of the robot including the display screen


I have been able to detect a clap cue with the microphones by identifying a volume threshold for a sufficiently loud clap. I do this processing on the Raspberry Pi: once the RPi detects the clap, it runs the signal through the direction-estimation function, which outputs the angle. This angle is then sent to the Arduino to drive the motor that turns the robot’s head (a sketch of this hand-off follows). Due to the late arrival of our motor parts, I haven’t been able to test the integration of the motor with the audio input. This put me a little behind, but using the slack time we allocated, I plan to finalize this portion of the project within the coming week.
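
The RPi-to-Arduino hand-off of the angle can be as simple as a newline-terminated integer over serial; a sketch, with the port, baud rate, and message format all assumptions:

    import serial  # pyserial

    def send_angle(ser, angle_deg):
        # assumed protocol: newline-terminated integer degrees
        ser.write(f"{int(angle_deg)}\n".encode())

    with serial.Serial("/dev/ttyACM0", 115200, timeout=1) as ser:
        send_angle(ser, 135)  # e.g. a clap localized at 135 degrees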

Another thing I worked on is implementing the software aspect of the RPS game, and once the keypad inputs are appropriately detected, I will meet with Jeffrey to integrate these two functionalities. 

I briefly worked with Shannon to make sure the audio output for the TTS through the speaker attached to the RPi works properly. 


Next week: 

  1. Finalize the integration and testing of audio detection + motor rotation
  2. Finalize the RPS game with keypad inputs by meeting with the team. 
  3. Finalize the overall integration of our system with the team. 

Some new things I learned during this capstone project are how to use serial communication between an Arduino and a Raspberry Pi, drawing on online Arduino resources that teach this clearly. I also learned how to perform signal analysis on audio inputs to localize the source of a sound within a range, using the concept of time difference of arrival to get my system working. I relied on online resources about signal processing and discussed with my professors to clarify any misunderstandings in my approach. I also learned from online resources, Shannon, and Jeffrey how a WebSocket works; even though my focus was not really on the web app to RPi communication, it was good to learn how their systems work.

Team’s Status Report for 11/30/2024

For this week, one risk that we are taking on is switching from the DSI display touchscreen to the keypad for inputs. We want to complete the keypad-to-RPi5-to-WebApp pipeline. The WebApp and RPi5 connection is currently working well, using Socket.IO to maintain low-latency communication; the next step is to take inputs from the keypad instead of the DSI display touchscreen while maintaining the low-latency requirement. While it is possible that we will have some difficulties with a smooth integration process, we do not foresee any huge errors or bugs. Nevertheless, should we get stuck, the mitigation plan is to fall back on the DSI display’s touchscreen.
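
A sketch of one keypad key feeding this pipeline, assuming gpiozero for the pin and python-socketio for the WebApp link; the pin number, URL, and event name are placeholders:

    import socketio
    from gpiozero import Button

    sio = socketio.Client()
    sio.connect("http://localhost:8000", transports=["websocket"])

    key = Button(17)  # one key of the keypad, wired to GPIO17

    def on_press():
        sio.emit("keypad_input", {"key": "1"})  # forward the press to the WebApp

    key.when_pressed = on_press
    sio.wait()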

Another minor but potential risk concerns the demo: since our project assumes a quiet study environment, the audio detection relies on a volume threshold to identify the double-clap audio cue. In a relatively noisy environment, there is a risk of interference with the audio detection mechanism. One way to mitigate this risk is to raise the audio threshold in a noisy environment, or to perform the demo in the kind of quiet environment the project assumes.

One major design change concerns the audio input mechanism. Since the Raspberry Pi does not have an analog-to-digital converter, we use an Arduino to digitize the microphone signals for audio localization. This did not affect the schedule, as it fit easily into the integration portion of the system, and it imposed no budget constraints because we used an Arduino we had from a previous course. Other than that, we haven’t made any changes to the existing system design, and we are mainly focused on moving from risk mitigation steps to the final project implementation, ensuring our use cases are addressed in each system.

The schedule remains the same, with no updates to the current one. 

Overall, our team has accomplished:

  1. WebApp implementation
  2. Audio Response feature (position estimation within a ±5° margin of error)
  3. Partial Study Session implementation (WebApp to RPi communication completed)
  4. Partial RPS game implementation
  5. TTS Feature (able to play audio on the robot’s speaker) 

More details on each feature can be found in the individual reports.

Mahlet’s Status Report 11/09/2024

This week, I worked on the audio localization mechanism, servo initialization through the RPi, and ways of mounting the servo to the robot head for seamless rotation of the head.

Audio localization: 

I have a script that records audio for a specified duration (in our case, 1.5-second windows). It takes the input audio and isolates the clap sound from the surroundings using a bandpass filter. The audio input from each mic is then passed into the function that performs the direction estimation by cross-correlating each pair of microphones.
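
A sketch of the band-pass stage with SciPy; the clap band edges are assumptions:

    import numpy as np
    from scipy.signal import butter, filtfilt

    FS = 44100  # sampling rate, Hz

    def bandpass(x, low_hz=2000.0, high_hz=8000.0, fs=FS, order=4):
        # Butterworth band-pass, applied forward-backward for zero phase shift
        b, a = butter(order, [low_hz / (fs / 2), high_hz / (fs / 2)], btype="band")
        return filtfilt(b, a, x)

    window = np.random.randn(int(1.5 * FS))  # stand-in for a 1.5 s recording
    clap_band = bandpass(window)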

I have finalized the mathematical approach using the four microphones. After calculating the time difference of arrival between each pair of microphones, I have been able to get close to the actual arrival-time differences, with slight variations. These variations cause very unstable direction estimates, with a margin of error of up to 30 degrees. In the coming week, I will work on cleaning up this error to ensure a smaller margin of error and a more stable output.
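
For one microphone pair, the delay estimate and its mapping to a bearing look roughly like this; the speed of sound and mic spacing are the only physical inputs, and the sign convention is illustrative:

    import numpy as np

    FS = 44100   # sampling rate, Hz
    C = 343.0    # speed of sound, m/s
    D = 0.08     # mic spacing, taken from the robot's 8 cm dimension

    def tdoa(sig_a, sig_b, fs=FS):
        # lag of the cross-correlation peak, converted to seconds
        corr = np.correlate(sig_a, sig_b, mode="full")
        lag = int(np.argmax(corr)) - (len(sig_b) - 1)
        return lag / fs

    def bearing(tau, d=D, c=C):
        # far-field model: tau = d*sin(theta)/c; the clip guards numeric noise
        return np.degrees(np.arcsin(np.clip(c * tau / d, -1.0, 1.0)))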

I also did some testing using only three of the microphones, placed at (0, 0), (0, x), and (y, 0), as an alternative approach, where x and y are the dimensions of the robot (x = 8 cm, y = 7 cm). This yielded slightly less accurate results. I will keep fine-tuning the four-microphone setup and, as needed, modify the microphone positions to get the most accurate audio localization result.

Servo and the RPi: 

The Raspberry Pi comes with a preinstalled library, python3-rpi.gpio, for initializing and controlling its GPIO pins. The servo motor connects to power, ground, and a GPIO pin that receives the control signal. The signal wire should connect to a PWM-capable GPIO pin to allow precise control over the signal sent to the servo; on the Pi, hardware PWM is available on GPIO12 or GPIO13.

After this, I configure the pin as an output and initialize it. I then use the set_servo_pulsewidth function (provided by the pigpio library) to set the servo’s pulse width based on the angle from the audio localization output.
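
A sketch of the angle-to-pulse mapping with pigpio; the standard 500 to 2500 microsecond hobby-servo range is an assumption:

    import pigpio

    SERVO_GPIO = 12  # hardware-PWM-capable pin (GPIO12 or GPIO13)

    pi = pigpio.pi()  # connect to the pigpio daemon (start it with `sudo pigpiod`)

    def set_angle(angle_deg):
        # map 0-180 degrees onto a 500-2500 microsecond pulse width
        pulse = 500 + (angle_deg / 180.0) * 2000
        pi.set_servo_pulsewidth(SERVO_GPIO, pulse)

    set_angle(90)  # center the head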

Robot Neck to servo mounting solution: 

I designed a bar to mount the robot’s head to the servo motor while it’s housed in the robot’s body. 

The CAD for this design is as follows.

By next week, I plan to debug the audio triangulation and minimize the margin of error. I will also 3D print the mount and integrate it with the robot, and begin integration testing of these systems.