Jeffrey’s Status Report for 10/26/2024

This week was focused on the ethics assignment as well as the weekly meeting, where we were able to discuss our ideas more in depth with Professor Bain and Ella. Our goal was to figure out ways to integrate the microphone and WebSockets with the RPi.


On Thursday, the group met up and we worked on implementing the WebSockets. Our goal was to set up the RPi for the first time and write some Python code with the RPi acting as a server and the web app acting as a client: the server sends messages to the client, and the client confirms that each message has been received (we tested this by having the web app change its displayed text when a message arrived from the server). The next step is to have the server serve information to the web app, and to have the web app store information from the RPi, such as study sessions completed or games won/lost in RPS.
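
A minimal sketch of the server side we have in mind (assuming the Python websockets library; the port and handler name are placeholders, not our final code):

    import asyncio
    import websockets

    async def handle_client(websocket):
        # Send a test message to the WebApp client...
        await websocket.send("hello from the RPi")
        # ...and wait for the client to confirm it was received.
        ack = await websocket.recv()
        print(f"Client confirmed: {ack}")

    async def main():
        # Serve on all interfaces so the WebApp can reach the RPi over WiFi.
        async with websockets.serve(handle_client, "0.0.0.0", 8765):
            await asyncio.Future()  # run forever

    asyncio.run(main())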

Since we are still waiting for parts, we are a bit behind on the actual construction of the robot. Mahlet and I still have to build the robot base in acrylic at TechSpark. We plan to first create the base of the robot and then drill holes as needed to add internal components such as microphones/speakers. We want the robot built so we can ensure that all the components we want to use fit within the base. I have also been working on fine-tuning the GPIO pin logic. Using the GPIO library, button presses can be read as inputs by the RPi and processed accordingly.


The biggest upcoming goal is testing the GPIO library with buttons wired directly to the RPi. I will also work on the RPS logic that was intended to be completed last week. My primary goal there is to make sure that the algorithm can randomly select an option out of R/P/S and then, depending on win/loss/tie, output the result accordingly.
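
A rough sketch of the RPS logic I am planning (the function name and return format are placeholders):

    import random

    OPTIONS = ("rock", "paper", "scissors")
    BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

    def play_round(user_choice):
        # Robot randomly selects an option out of R/P/S.
        robot_choice = random.choice(OPTIONS)
        if user_choice == robot_choice:
            result = "tie"
        elif BEATS[user_choice] == robot_choice:
            result = "win"   # the user's choice beats the robot's
        else:
            result = "loss"
        return robot_choice, result

    print(play_round("rock"))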

Mahlet’s Status Report for 10/26/2024

This week I worked on the forward audio triangulation method with the real-life scale in mind. I limited the bounds of the audio source to 5 feet from each side of the robot's base and placed the microphones at a closer distance, keeping all values in consistent units to make the approximation possible. Using this, and knowing the sound source location, I was able to pinpoint the source of the audio cue. I swept over the grid dimensions with a smaller step size to get a closer approximation, which keeps the error low in the direction the robot turns toward.

I randomly generate the audio source location, and below are some of the simulations for this. The red circles denote the microphone positions and the cross indicates the audio source.
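
A small NumPy sketch of the forward setup behind these simulations (a Python transcription of the MATLAB logic; the microphone spacing is illustrative):

    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s
    FT_TO_M = 0.3048

    # Four microphones in a small square around the robot base
    # (the 10 cm spacing here is illustrative).
    mics = np.array([[-0.05, -0.05], [0.05, -0.05],
                     [-0.05, 0.05], [0.05, 0.05]])

    # Random audio source within 5 feet of the base on each side.
    rng = np.random.default_rng()
    source = rng.uniform(-5 * FT_TO_M, 5 * FT_TO_M, size=2)

    # Known source location -> exact arrival delay at each microphone.
    delays = np.linalg.norm(mics - source, axis=1) / SPEED_OF_SOUND
    print(source, delays - delays.min())  # time differences of arrival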

After this, I pivoted from audio triangulation and focused on tasks such as setting up the Raspberry Pi, running TTS tests with Shannon, and learning about the WebSocket connection methodology. I joined Shannon and Jeffrey's session when they discussed the WebSocket approach and learned about it.

While setting up the Raspberry Pi, I ran into some issues trying to SSH into it. Setting up folders and the basics, however, went well. One task for next week is to reach out to the department to get more information about prior registrations of this device. It is already connected to the CMU-SECURE and CMU-DEVICE networks; however, it doesn't seem to be working on the CMU-DEVICE network. I tried registering the device to CMU-DEVICE, but it seems it had already been registered prior to this semester. I aim to figure out the SSH issue over the next week. However, we can still work with the RPi using a monitor, so this is not a big issue.

After this, I worked on Text-To-Speech along with Shannon, using the pyttsx3 library. We intended for the WebApp to read various texts back to back through the text/file input mechanism. The library works by initializing a text engine and calling engine.say() to read the text input. This works when running the app for the first time; however, from the second input onwards, it gets stuck in a loop. The built-in engine.stop() function requires re-initializing the text engine multiple times, which causes the WebApp to lag. As a result, Shannon and I have decided to look into other TTS libraries for Python, and we will also try testing the TTS directly on the RPi instead of the WebApp first.
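
A minimal sketch of the pattern that triggers the issue (simplified from the WebApp code; as a standalone script this often runs fine, but inside our WebApp's request handling the second call failed):

    import pyttsx3

    engine = pyttsx3.init()  # initialize the text engine once

    def speak(text):
        engine.say(text)      # queue the text
        engine.runAndWait()   # block until it has been spoken

    speak("first input")   # works fine on the first request
    speak("second input")  # inside our WebApp, the second request raised
                           # "run loop has already started"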

My progress is on track; the only setback is the late arrival of ordered parts. As described in the team weekly report, I will use the slack time to make progress on assembling the robot and integrating systems.

Next week I will work on finalizing the audio triangulation, work with Shannon to find the optimal TTS functionality, and work with Jeffrey to build the hardware.

Shannon’s Status Report for 10/26/2024

This week, I focused on making WebSockets communication between our WebApp and the RPi work. When we met up on Thursday afternoon, Mahlet and Jeffrey helped to set up the RPi (registering the device, connecting it to WiFi, etc.). Once we were able to download VSCode on the RPi, I coded up a short script to test whether communication worked. I wrote a simple script in JavaScript on the RPi, then wrote a similar one with some extra UI features on the WebApp and tested it out. Theoretically, when I clicked a button on the WebApp, the RPi should receive the message and print it out. Initially, this wasn't working because the RPi and the WebApp were on different ports, producing a CORS (Cross-Origin Resource Sharing) error: the WebApp was trying to send a request to a different domain than the server hosting it. To debug this, I included some CORS settings on the RPi side to allow requests from the WebApp's origin. This worked, and the RPi was able to display a message when a button on the WebApp was clicked.
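
Our actual RPi script was in JavaScript, but a Python sketch of the same CORS fix (assuming Flask-SocketIO; the WebApp origin, port, and event name are placeholders) would look like:

    from flask import Flask
    from flask_socketio import SocketIO

    app = Flask(__name__)
    # Allow the WebApp's origin so its cross-origin requests are not rejected.
    socketio = SocketIO(app, cors_allowed_origins="http://localhost:3000")

    @socketio.on("button_press")
    def on_button_press(data):
        print(f"Button clicked on WebApp: {data}")

    if __name__ == "__main__":
        socketio.run(app, host="0.0.0.0", port=5000)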

On the WebApp:


On the RPi:



I also spent quite some time trying to incorporate TTS on the WebApp itself this week. Unfortunately, the pyttsx3 library that we were trying to use does not seem to work well with our website. After coding up some simple logic to call the library's TTS function when a user input is received, Mahlet and I tested it to see if it was successful. When we first input some text into the textbox and click the read button, it works well and the laptop speakers play the correct audio with little to no delay. However, when we try to send more text, we get the error "run loop has already started", which indicates that the previously queued text-to-speech command had not finished. We were confused and spent quite some time trying to debug this by looking up solutions from other users who have encountered this issue, but none worked for us. We looked through the documentation for the TTS library itself and tried out various functions, but nothing seemed to work. Thus, Mahlet and I are looking into other TTS libraries to see if we can find a solution. I am considering gTTS (Google Text-to-Speech), which is not as ideal as pyttsx3 because it requires an internet connection, but it should be well-documented enough to reduce the chances of running into similar issues.
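
A sketch of how the gTTS fallback might look (the function and file names are placeholders; gTTS synthesizes to an audio file, which the WebApp would then play back):

    from gtts import gTTS

    def speak(text):
        tts = gTTS(text=text, lang="en")
        tts.save("tts_output.mp3")  # write the synthesized speech to a file
        return "tts_output.mp3"     # the WebApp can then play this file back

    speak("Time for a five minute break!")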

In the next week, I’ll be working on:

  • Building the robot with my team
  • Researching different solutions for TTS with Mahlet
  • RPS game function on the WebApp with WebSockets

Team Status Report for 10/26/2024

The most significant risk is currently the delayed arrival of parts. Despite submitting the parts order prior to fall break, we still haven't received them, pushing our parts integration and testing timeline backward. We do have the hardboard wood, purchased from TechSpark, so once the parts are in our possession, we can start assembling the robot base. Timeline-wise, the assembly should only take a couple of hours. Our contingency plan is to make good use of our assigned slack time to catch up to our schedule and finish building the robot base.

Another aspect of the design that poses some risk is the overall integration. We want our Study Session and TTS features to be able to run at the same time (i.e., while studying, a student should be able to use the TTS feature and have the robot read to them without pausing or ending the session). We want the WebSocket communications and the TTS processes to integrate seamlessly, but there may be threading issues to account for, which may cost us additional integration time. Our contingency plan for this is to use multithreading on the RPi. There is also a chance that the TTS feature takes up too much processing power on the RPi, affecting the study session; in that case, our contingency plan is to run TTS directly on the user's computer through the WebApp.
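
A minimal sketch of the multithreading contingency (the TTS call is a placeholder print, and the timer loop stands in for the study-session/WebSocket logic):

    import queue
    import threading
    import time

    tts_queue = queue.Queue()

    def tts_worker():
        # Speak queued requests one at a time without blocking session logic.
        while True:
            text = tts_queue.get()
            print(f"(speaking) {text}")  # placeholder for the real TTS call
            tts_queue.task_done()

    threading.Thread(target=tts_worker, daemon=True).start()

    # The main loop keeps the study-session timer responsive while TTS runs.
    tts_queue.put("Chapter one. It was a bright cold day in April.")
    for minutes_left in range(3, 0, -1):
        print(f"Study session: {minutes_left} min left")
        time.sleep(1)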

No further changes have been made to the overall design of the system. We are keeping a close eye on our project to make sure we adjust if necessary when any further issues arise, but as of now, we foresee no issues with our current plan.

Jeffrey’s Status Report for 10/19/2024

Since last week, my main focus was finishing the design report and preparing for the week 6 tasks on the Gantt chart. Over fall break, I worked on preparing for three tasks: the servo motors, the speakers, and the GPIO pins that connect the RPi to the buttons on the robot base.

For the GPIO pins, I plan to use Python with the GPIO library and have written preliminary code:

    import RPi.GPIO as GPIO
    import time

    # Use Broadcom pin-numbering scheme
    GPIO.setmode(GPIO.BCM)

    # Set up the GPIO pin (e.g., pin 17) as input with internal pull-down resistor
    button_pin = 17
    GPIO.setup(button_pin, GPIO.IN, pull_up_down=GPIO.PUD_DOWN)

    def button_callback(channel):
        print("Button was pressed! 'X' has been selected.")

    # Add event detection for button presses
    GPIO.add_event_detect(button_pin, GPIO.RISING, callback=button_callback, bouncetime=200)

    try:
        # Keep the program running to detect button presses
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        # Clean up the GPIO setup on exit
        GPIO.cleanup()

For testing and validation, I will look into ways to ensure that the latency of button inputs is under 250 ms. I will also look into methods to test debouncing, to ensure that a single physical press doesn't trigger multiple unintended inputs.
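
One idea for these tests, sketched below (the pin number is a placeholder): timestamp each detected edge in the callback; a single physical press should produce exactly one timestamp, and the gap between the press and its timestamp bounds the input latency.

    import time
    import RPi.GPIO as GPIO

    BUTTON_PIN = 17
    press_times = []

    def on_press(channel):
        press_times.append(time.monotonic())  # timestamp each detected edge

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(BUTTON_PIN, GPIO.IN, pull_up_down=GPIO.PUD_DOWN)
    GPIO.add_event_detect(BUTTON_PIN, GPIO.RISING,
                          callback=on_press, bouncetime=200)

    # Press the button once during this window: one timestamp means the
    # debouncing worked; multiple timestamps mean bounce got through.
    time.sleep(5)
    GPIO.cleanup()
    print(press_times)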


For the speakers, Mahlet put in the order for a USB speaker. This goes along with one of my tasks: the hello/goodbye greetings when the robot is powered on or off. From button presses, the GPIO pins act as an input to the RPi, which can call the Python TTS library functions to trigger the greetings. Since the speaker is USB, wiring it to the RPi should be trivial, but we want to ensure that latency won't be an issue and that the RPi's inputs produce the correct corresponding output from the speakers.
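
A sketch of how a button press could trigger the greeting (the pin number and phrase are placeholders, and I am assuming pyttsx3 as the TTS library for now):

    import time
    import pyttsx3
    import RPi.GPIO as GPIO

    POWER_BUTTON = 27  # placeholder pin
    engine = pyttsx3.init()

    def greet(channel):
        engine.say("Hello! Ready to study?")  # placeholder greeting
        engine.runAndWait()

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(POWER_BUTTON, GPIO.IN, pull_up_down=GPIO.PUD_DOWN)
    GPIO.add_event_detect(POWER_BUTTON, GPIO.RISING,
                          callback=greet, bouncetime=200)

    try:
        # Keep the script alive so presses can be detected.
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        GPIO.cleanup()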


Finally, we have the servo motors, which come with a bracket mount. I looked into the specs of the servo motors we bought, which are 1.6 inches, so we would need a <5 inch bracket mount that we can then connect the DCI display to.


For the upcoming week, my goal is to simulate the rotation in MATLAB, to ensure that the X- and Y-axis rotation is achievable. Furthermore, Mahlet and I will be meeting in TechSpark early next week to work on the acrylic base. Once we have the base, we can more easily measure dimensions for bracket mounts and buttons, which will then let us test the features that we have implemented.

I am currently behind on the Gantt chart for testing the speakers since we haven't acquired them yet. I have prepared the Python code necessary to test 2-3 fixed phrases.

Shannon’s Status Report for 10/12/2024 or 10/19/2024

This week, I focused heavily on writing the Design Report. Starting Monday night, I typed up the Introduction paragraph, detailing our Use Case, Use-Case requirements, and Design Requirements in a shared document so that my teammates could refer to it and write their separate parts (I essentially generated the structure and some points to talk about, and they could elaborate on them). On Thursday and Friday nights, I dedicated time to completing my individual parts. The early half of Saturday, I covered some parts that were delegated to other members but were not finished by the Friday-night deadline we had set, and Saturday night was spent editing any jarring errors. I also spent some time detailing everything that had to be done to ensure our group did not miss any important information in the Design Report, as well as writing Monday and Wednesday morning agendas for our meetings to make sure we were on track. Overall, on top of the actual work I created for the Design Report, such as research on the various requirements, most of the diagrams in the report (all flow charts and mock-ups; I also sketched out the software part of the block diagram, which Mahlet created), and the actual typing up of the document, I put in a lot of mental labor to keep everyone on top of things. In total, I spent more than 12 hours on the Report alone.

Moving on to our project itself, I have also finished most of the coding for the pages (RPS page, TTS page, StudySession page) this week, and the website now resembles the mock-ups a lot more. I still have to work on some CSS elements and make the website UI more user-friendly, but everything seems to be working well. I am slightly behind schedule, as I wanted to have some Text-to-Speech functionality sorted out on the RPi, along with WebSockets, but since these are areas that require me to work with Mahlet and Jeffrey respectively, I will put my all into them once fall break is over. Included below are some images of the website that I have working thus far:

In the next week, I’ll be working on:

  • Cleaning up the UI on the individual pages
  • Error handling for the pages
  • Building the robot with my team
  • Setting up TTS on the RPi/robot

Team Status Report for 10/12/2024 / 10/19/2024

The most significant risk as of now is that our team is slightly behind schedule and should be working on completing the build of the robot base and individual component testing, along with the implementation using the RPi. To manage these risks, we will use some of the delegated slack time to catch up on these tasks and ensure that our project is still on track overall. Following the completion of the design report, we were able to map the trajectory of each individual task. Some minor changes were also made to the design: the todo-list feature was removed from the WebApp because it felt non-essential and was a one-sided feature, and the neck of the robot now has only rotational motion along the x-axis in response to audio cues, plus y-axis (up and down) translation for a win during the RPS game. We decided on this change to reduce the range of motion for the servo horn that connects the servo mount bracket to the DCI display. By focusing on the specified movements, our servo motor system will be more streamlined and even more precise in turning towards the direction of audio cues.

Part A is written by Shannon Yang

The StudyBuddyBot (SBB) is designed to meet the global need for accessible, personalized learning by being a study companion that can help structure and regulate study sessions and incorporate tools like text-to-speech (TTS) for auditory learners. The accompanying WebApp ensures that the robot can be accessed globally by anyone with an internet connection, without requiring users to download or install complex software or pay exorbitant fees. This accessibility helps make SBB a universal solution for learners from different socioeconomic backgrounds.

With the rise of online education platforms and global initiatives to support remote learning, tools like the StudyBuddyBot fill a crucial gap by helping students manage their time and enhance focus regardless of geographic location. If something similar to the pandemic were to happen again, our robot would allow students to continue learning and studying from the comfort of their home while mimicking the effect of them studying with friends. 

Additionally, as mental health awareness grows worldwide, the robot’s ability to suggest breaks can help to address the global issue of burnout among students. The use of real-time interaction via WebSockets allows SBB to be responsive and adaptive, ensuring it can cater to students across different time zones and environments without suffering from delays or a lack of interactivity.

Overall, by considering factors like technological accessibility, global learning trends, and the increasing focus on mental health, SBB can address the needs of a broad, diverse audience.

Part B is written by Mahlet Mesfin

Every student has different study habits, and some struggle to stay focused and manage their break times, making it challenging to balance productivity and relaxation. Our product, StudyBuddyBot (SBB), is designed to support students who face difficulties in maintaining effective study habits. With features such as timed study session management, text-to-speech (TTS) for reading aloud, a short and interactive Rock-Paper-Scissors game, and human-like responses to audio cues, SBB will help motivate and engage students. These personalized interactions keep students focused on their tasks, making study sessions more efficient and enjoyable. In addition, SBB uses culturally sensitive dialogue for its greeting features, ensuring that interactions are respectful and inclusive.

Study habits vary across different cultures. For example, some cultures prioritize longer study hours with fewer breaks, while others value more frequent breaks to maintain focus. To accommodate these differences, SBB offers two different session styles. The first is the Pomodoro technique, which allows users to set both study and break intervals, and the second is a “Normal” session, where students can only set their study durations. Throughout the process, SBB promotes positive moral values by offering encouragement and motivation during study sessions. Additionally, the presence of SBB creates a collaborative environment, providing a sense of company without distractions. This promotes a more focused and productive study atmosphere.

Part C was written by Jeffrey Jehng

The SBB was designed to minimize its environmental impact while still being an effective tool for users. We focus on SBB’s impact on humans and the environment, as well as how its design promotes sustainability. 

The design was created to be modular, so a part that wears out can be replaced instead of replacing the whole SBB. Key components, such as the DCI display screen and the microcontroller (RPi), were selected for their low power consumption and long life spans, reducing the need for replacement parts. To be even more energy efficient, we will implement conditional sleep states in the SBB to ensure that power is used only when needed.

Finally, we emphasize using recyclable materials, such as acrylic for the base and eco-friendly plastics for the buttons, to reduce the carbon footprint of the SBB. By considering modularity, energy efficiency, and sustainability of parts, the SBB can be effective at assisting users while balancing its functionality with these environmental concerns.

Mahlet’s Status Report for 10/12/2024

This week, I focused mainly on the design report. After my team and I met, we split up the different sections of the report fairly and proceeded to work on the deliverables.

I worked mainly on the audio triangulation, robot neck motion (components included), and the robot base design. In the design report, I worked on the use-case requirements of the audio response and the robot base, and I made the final block diagram for our project. After this, I worked on the design requirements for the robot's dimensions and the audio cue response mechanism. I then worked on the essential tradeoffs behind some of our component choices, such as the Raspberry Pi, the servo response, the material for our robot's body, and the microphone for audio input. Finally, I worked on the system implementation for the audio cue response and the unit and integration testing of all these components.

Following our discussion, I finalized the bill of materials and provided risk mitigation plans for our systems and components.

In addition to this, I was able to spend some time discussing the audio response methodology and approach with Professor Bain. Having implemented the forward audio detection system (i.e., knowing the location of the audio source and the locations of the microphone receivers), my goal was to work backwards, without knowing the location of the audio source. From this meeting and further research, I settled on the following approach, which I will be working on in the coming week. More detail on this implementation can be found in the design report.

The system detects a double clap by continuously recording and analyzing audio from each microphone in 1.5-second segments. It focuses on a 1-second window, dividing it into two halves and performing a correlation to identify two distinct claps with similar intensities. A bandpass filter (2.2 kHz to 2.8 kHz) is applied to eliminate background noise, and audio is processed in 100ms intervals.

Once a double clap is detected, the system calculates the time difference of arrival (TDOA) between microphones using cross-correlation. With four microphones, it computes six time differences to triangulate the sound direction. The detection range is limited to 3 feet, ensuring the robot only responds to nearby sounds. The microphones are synchronized through a shared clock, enabling accurate TDOA calculations, allowing the robot to turn its head toward the detected clap.
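
A Python sketch of the two processing steps described above (assuming SciPy; the sampling rate is a placeholder, and the band edges come from the design):

    import numpy as np
    from scipy.signal import butter, lfilter, correlate

    FS = 44100  # assumed sampling rate

    def bandpass(audio, low=2200.0, high=2800.0):
        # Suppress background noise outside the 2.2-2.8 kHz clap band.
        b, a = butter(4, [low, high], btype="bandpass", fs=FS)
        return lfilter(b, a, audio)

    def tdoa(sig_a, sig_b):
        # Time difference of arrival between two mics via cross-correlation.
        corr = correlate(sig_a, sig_b, mode="full")
        lag = np.argmax(corr) - (len(sig_b) - 1)
        return lag / FS  # seconds

    # With four microphones, apply tdoa() to all six pairs and use the
    # resulting time differences to triangulate the clap's direction.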

I am a little behind schedule as parts are not here yet. I will work on building the robot base with Jeffrey once the parts arrive, and test the audio triangulation with microphones and the RPi after performing the necessary preparations within the coming week.

Shannon’s Status Report for 10/5/2024

This week, I focused on nailing down the details for the Design Report with my team. I engaged in an in-depth discussion on Thursday with our TA Ella regarding our use cases, use-case requirements, and design requirements, making sure to clarify and really narrow down the specifics. From the discussions we had, I defined the features for 5 areas of use case and their corresponding requirements (Power On, RPS Game, TTS, Studying Session, Audio Cue Response). Below, I'll illustrate the features for Power On and RPS Game to give an idea of what this process was like and what I accomplished at the end of it.

  • Power On: Initial greeting of a user (power on; display screen lights up, then robot face is displayed)
    • UC Requirement: Power on and display in under 3s
      • Design Requirement: DCI display latency
  • RPS Game
    • UC: Starting a Game
      • UC Requirement: Start game on WA and robot responds in under 1s
      • Design Requirements: DCI display latency, WS latency
      • UC Requirement: Robot prompts user to start, user presses OK and robot responds in under 250ms
      • Design Requirement: GPIO Latency
    • UC: Playing a Game
      • UC Requirement: User presses input at the end of “Rock, Paper, Scissors, SHOOT!” countdown and robot responds with win/lose output in under 1s
        • Design Requirement: DCI display latency, RPi state calculation latency
    • UC: Ending a Game
      • UC Requirement: At the end of the initially set no. of rounds, the robot indicates Game Over! on display and reverts back to Study mode in under 1s
        • Design Requirement: DCI display latency

…and so on for all 5 areas of use case. We will be basing our design report on the outline that I have created, and I made sure to really think about the details and define everything fully so that this could be applicable to our final report as well.

Following this discussion, I realised that we had far too many elements in our robot: on top of these 5 areas, we wanted a photoresistor for a niche use case (see team report for more details) and an ultrasonic sensor for user presence detection, for a total of 7. As such, I led the team in discussing whether we should prune some features, and ultimately we did end up removing those two from the design.

I have also worked on the study session feature this week; it is still slightly buggy, but I am confident I can fix the issues. I have also finished creating all the necessary pages for our WebApp. According to the Gantt chart, I am slightly behind schedule, since I had also aimed to finish the Todo-List feature this week, but since it is a simpler feature than the Study Sessions, I prioritised the latter instead. To catch up, I plan on dedicating extra time next week to finishing these features, and since all other WebApp features left for next week involve the robot, I will enlist help from my team members to ensure we remain on target.

In the next week, I’ll be working on:

  • Completing the Design Report
  • Completing the Study Session and Todo-List features
  • Coding up RPS Game Page
  • Coding up TTS Page
  • Researching TTS with Mahlet to test a few short inputs

Mahlet’s Status Report for 10/05/2024

This week, my primary focus was gathering data for the design report and completing tasks related to audio localization.

For the audio localization, I used MATLAB to simulate and pinpoint an audio source on a randomly generated 10×10 grid. I arranged the microphones in a square (2×2) configuration and randomized the location of the audio source. By calculating the distance between each microphone and the audio source, and considering the speed of sound (approximately 343 m/s), I determined the time delays relative to each microphone.

I applied the Time Difference of Arrival (TDOA) method. For each pair of microphones, the difference in the time it takes for sound to reach the two microphones constrains the source to a hyperbola. I repeated this process for every microphone pair, and the intersection of these hyperbolas provided a reasonable estimate of the audio source's location. In MATLAB, I looped over the integer grid points and, using Euclidean distances, predicted the corresponding TDOA at each point from the speed of sound. By comparing the predicted TDOAs with the actual time delays, I estimated the error in the localization process.
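
A NumPy sketch of the same grid search (a Python transcription of the MATLAB loop; the microphone layout and source location are illustrative):

    import numpy as np

    C = 343.0  # speed of sound, m/s

    # Microphones at the corners of the 10x10 grid (illustrative layout).
    mics = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
    source = np.array([3.0, 7.0])  # hidden ground truth for the simulation

    # "Measured" time delays, taken relative to the first microphone.
    true_t = np.linalg.norm(mics - source, axis=1) / C
    measured_tdoa = true_t - true_t[0]

    # Loop over integer grid points, as in the MATLAB version, and keep the
    # point whose predicted TDOAs best match the measured ones.
    best, best_err = None, np.inf
    for x in range(11):
        for y in range(11):
            t = np.linalg.norm(mics - np.array([x, y], dtype=float), axis=1) / C
            err = np.sum((t - t[0] - measured_tdoa) ** 2)
            if err < best_err:
                best, best_err = (x, y), err
    print(best)  # estimated source location, here (3, 7)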

The following figure illustrates the results, where ‘X’ represents the audio source, and ‘O’ marks the microphone positions. Additionally, I will include the relevant equations that informed this approach.


Currently, I am facing an issue with pinpointing the exact location of the source. To address this, I plan to refine the grid resolution by using smaller iterations, which should allow for greater accuracy. I will also calculate and display the approximate error in the final results. So far, I have a general idea of the audio source’s location, as indicated by a dark blue line, and I will continue working to pinpoint the exact position. Once I achieve this, I will conduct further simulations and eventually test the system using physical microphones, which will introduce additional challenges.

I am slightly behind on the project schedule. By next week, I aim to finalize the audio localization section of the design report, along with the remaining parts of the report, in collaboration with my team. I had also aimed to set up the robot neck rotation servos by this week; this hasn't been done either. We will be finalizing the bill of materials this week, and I will start working with the components as soon as they arrive. To make up for lost time, I will spend some time over fall break working on this.

According to the Gantt chart, Jeffrey and I had planned on building the robot by the end of this week. This hasn't been completed yet, but the CAD design is done. This week we will meet to discuss space constraints and make decisions accordingly.