Shannon’s Status Report 12/7/24

This week, since I had finished everything I was in charge of, I focused on helping Jeffrey catch up on his parts. Jeffrey had gotten the Study Session partially working: when a user clicks start session on the WebApp, the robot's display screen starts a timer. However, he was unable to get pause/resume or end session to work, so I sat down with him and we worked on these features together. I noticed that he had a lot of debugging statements on the robot, which was good, but he wasn't really logging anything on the WebApp. Once he added logging there on my advice, we realised from the logs that the event was being emitted from the RPi but never received on the WebApp. I also noticed that he logged the WebSocket connection on the start page but not on the in-progress page, so I recommended he add that as well. With that log in place, we saw that the connection message did not appear right away on the in-progress page; once we waited for it to appear, the pause/resume button worked. From this we deduced that the issue was latency: the WebSocket connection on the in-progress page was taking too long to establish. We resolved this by switching the transport from polling to WebSockets, which fixed the latency issue.

Next, we worked to debug why the End Session event from the WebApp was not being received on the robot end. Once again based on the logs, I deduced that the event was probably being emitted under the wrong name, so the robot's event handlers could not catch it. We changed the robot to listen for the event the WebApp was actually sending, and it worked; we finished a standard study session! This was done on Wednesday. After this, I took over all TTS and Study Session related work, while Jeffrey will work on the last part of hardware integration with the pause/resume buttons.
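
For reference, forcing the transport is a one-line change on the client. Below is a minimal python-socketio sketch of the idea; the host, port, and event names are placeholders rather than our actual ones, and the browser client has an equivalent transports option:

    # Minimal python-socketio client sketch: force the WebSocket transport
    # instead of starting on HTTP long-polling. Host/port and event names
    # here are placeholders, not the project's real ones.
    import socketio

    sio = socketio.Client()

    @sio.event
    def connect():
        # Logging the connection was the key debugging step: events emitted
        # after this point are actually delivered.
        print("connected to the WebApp's Socket.IO server")

    @sio.on("pause_session")
    def on_pause(data):
        print("pause/resume event received:", data)

    sio.connect("http://webapp.example:8000", transports=["websocket"])
    sio.emit("session_started", {"duration_min": 50})
    sio.wait()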

On Thursday, I integrated my TTS code and ran into some issues with the RPi trying (and failing) to play audio from a "default" device. After some quick troubleshooting, I wrote a config file to set the default device to the port that the speaker was plugged into, and it worked! The text-to-speech feature was working well, and there was barely any detectable difference in latency between the audio playing on the user's computer and on the RPi through the speakers. I tested it some more with various .txt files, and once I was satisfied with the result, I moved on to integration with the Study Session feature. I worked on making sure TTS only runs while a study session is in progress (not paused). I wrote and tested some code, but it was buggy and I could not get the redirection to work the way I wanted. The desired behavior is: if no Study Session has been created yet, clicking the text-to-speech feature on the WebApp redirects the user to create a new study session; if a Study Session exists but is paused, it redirects them to that session. I also wanted to handle the case where the goal duration is reached while the user is on the text-to-speech page: when they return to the study session, the pop-up asking whether they would like to continue or stop the session (since the target duration has been reached) should still appear.
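
For context, the audio fix boiled down to a few lines of ALSA configuration. A sketch of the idea, assuming the speaker shows up as card 1 (the actual card and device numbers come from aplay -l, and the real file may differ):

    # Sketch of the ALSA config idea (e.g. in /etc/asound.conf or ~/.asoundrc):
    # point the "default" device at the output the speaker is plugged into.
    # Card/device numbers below are placeholders; `aplay -l` lists the real ones.
    defaults.pcm.card 1
    defaults.pcm.device 0
    defaults.ctl.card 1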

On Friday, I continued working on these features and managed to debug them successfully. The text-to-speech feature worked well alongside the Study Session and exhibited the desired behavior. I then worked on the Pomodoro Study Session feature, which gives the user a built-in break based on the intervals they set at the start of the study session. I worked on both the WebApp and the RPi, first ensuring that the study and break intervals were sent over correctly, and then creating a second, background clock that runs alongside the display clock. The standard study session only needs a display clock, since the user is in charge of pausing and resuming; the Pomodoro session, however, has to keep track of when to pause and resume automatically while playing a break-reminder audio, so it needs its own clock. I wrote and debugged this code and it works well. This is also when I discovered that the text-to-speech feature could not run at the same time, because its audio would clash with the break reminder, so I made a slight design change to prevent the two from playing simultaneously.
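
To make the background-clock idea concrete, here is a simplified sketch; the class and callback names are illustrative, not the actual RPi code, which also has to update the display and notify the WebApp:

    # Simplified sketch of the background clock: a thread that ticks alongside
    # the display clock and switches between study and break automatically,
    # firing a reminder callback whenever a break starts.
    import threading
    import time

    class PomodoroClock:
        def __init__(self, study_min, break_min, on_break_start):
            self.study_s = study_min * 60
            self.break_s = break_min * 60
            self.on_break_start = on_break_start  # e.g. play the break-reminder audio
            self.running = False

        def _run(self):
            elapsed, studying = 0, True
            while self.running:
                time.sleep(1)
                elapsed += 1
                if elapsed >= (self.study_s if studying else self.break_s):
                    studying = not studying
                    elapsed = 0
                    if not studying:
                        self.on_break_start()

        def start(self):
            self.running = True
            threading.Thread(target=self._run, daemon=True).start()

    # Hypothetical usage: 25-minute study intervals with 5-minute breaks.
    PomodoroClock(25, 5, lambda: print("break time")).start()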

Today, I will work on cleaning up the WebApp code, helping Jeffrey with any RPS issues he may face, and starting on the final poster.

According to the Gantt chart and the newly linked final week schedule, I am on target.

In the next week I will work on:

  • Final overall testing with my group
  • Final poster, video, demo and report

Shannon’s Status Report for 11/30/24

This week, I focused on finishing up the TTS feature on the robot. Since the feature works well on the WebApp, I decided to integrate it fully with the robot's speakers. I first ensured that the user's input text could be sent properly via WebSockets to the robot. Once this was achieved, I used the Google Text-to-Speech (gTTS) library on the RPi to convert the text into an mp3 file, and then tried to have the mp3 file play through the speakers. On my personal computer (a MacBook), the line to play audio is os.system(f"afplay {audio_file}"). Since the RPi is a Linux system, this does not work, so I tried os.system(f"xdg-open {audio_file}") instead. This played the audio file, but it also opened a command terminal for the VLC media player, which is not what I wanted, since the user would not be able to play further audio files without quitting that terminal first. I therefore looked up other ways to play the audio file, which led me to os.system(f"mpg123 {audio_file}"). It worked well and played the audio with no issues. I timed the latency, and it was mostly under 3 s for a 50-word chunk of text. If the text is longer and is broken into 50-word chunks, the first chunk takes slightly longer, but the subsequent chunks are mostly under 2.5 s, which is in line with our use-case and design requirements. With this, the text-to-speech feature is mostly finished.

There is still a slight issue: for a better user experience, I wanted the WebApp to display when a chunk of text was done being read, but it currently cannot. After some debugging, I found that this is because the WebApp tries to display the message before the WebSocket callback function has returned. Since the function is asynchronous, I would have to use threading on the WebApp if I still want this display to appear. I might not keep this detail, because introducing threading could cause other issues, and the user can tell from the audio itself when a chunk of text is done being read. Nevertheless, the text-to-speech feature now works on the robot: the user can input a .txt file, the robot reads out the first x words, and when the user clicks continue, it reads out the next x words, and so on. I think this feature is final-demo ready.
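
For reference, the core of the working pipeline is only a few lines. A rough sketch (the file path and sample text are placeholders, and error handling is omitted):

    # gTTS writes the text out as an mp3, then mpg123 plays it on the RPi.
    import os
    from gtts import gTTS

    def speak(text, audio_file="/tmp/tts_out.mp3"):
        gTTS(text=text, lang="en").save(audio_file)  # convert text to mp3
        os.system(f"mpg123 {audio_file}")            # play through the speaker

    speak("Hello, this is the robot reading your notes aloud.")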

 

According to the Gantt chart, I am on target. 

 

In the next week, I’ll be working on:

  • Helping Jeffrey finish up the Study Session feature
  • Finishing up any loose ends on the WebApp (deployment, code clean-up, etc.)

 

For this week’s additional question:

I had to learn how to use the TTS libraries pyttsx3 and gTTS. I thoroughly reviewed their respective documentation at https://pypi.org/project/pyttsx3/ and https://pypi.org/project/gTTS/ to understand how to configure their settings and integrate them into my project. When debugging issues, I relied on online forums like Stack Overflow, which provided insights from others who had encountered similar problems. For example, when I ran into the run loop error, I searched for posts describing similar scenarios and experimented with the suggested solutions. It was there that I saw someone recommend gTTS instead, pointing out that this issue would be avoided because, unlike pyttsx3, gTTS does not maintain a speech engine; it converts the text to an mp3 file first and then plays it, rather than synthesizing and playing as it goes. This led me to switch over to gTTS, which is what we used in the end.

I also had to learn WebSockets for real-time communication between the RPi and the WebApp. I read through the documentation online at https://socket.io/docs/v4/, which was great for understanding how the communication process works. It also taught me how to set up a server and client, manage events, and handle acknowledgments. For debugging, I used tools that I had previously learnt in other classes, such as the Chrome browser developer tools console and the VSCode debugger with breakpoints and logpoints, which allowed me to diagnose CORS issues and verify from the logs/errors displayed whether events were being emitted and whether the emitted events were being received.

Shannon’s Status Report 11/16/24

This week, I worked on WebSockets with Jeffrey during our Tuesday and Thursday meet-ups. Initially, I spent some time helping Jeffrey set up his virtual environment, making sure he had access to our GitHub repository, and ultimately getting him to the point where he could run our WebApp on his computer so that he could test the DSI display showing the correct information based on the WebApp inputs. Jeffrey later ran into some git commit issues that I also helped him resolve: he had accidentally committed the virtual environment folder to our GitHub repository, resulting in more than a million lines being committed and leaving him unable to git pull due to the sheer volume of content.

Unfortunately, as of right now, we are still running into issues using Socket.IO to have the WebApp communicate with the RPi and display. Previously, we were able to communicate with just the RPi itself; however, when trying to draw the DSI display using Tkinter, Jeffrey ran into issues with communication between the WebApp and the RPi. WebApp-to-RPi communication works, but the other direction does not, and he is still working to resolve this. As such, although I was hoping to test the WebApp display based on RPi-sent information, I was unable to do so while the RPi-to-WebApp communication remains buggy. Hopefully Jeffrey can resolve this issue in the next week, and I will be able to test the code I have written more thoroughly.

 

I have also worked on improving the latency of the TTS feature. Previously, upon a large file upload, the TTS would take a long time to process the text before speaking. As such, I have changed the TTS interface to include an option for the user to choose how many words they want in a "part". If a user inputs 50 words, then when they click "Start Reading!", the input .txt file is processed, the text is split into 50-word parts, and the first part is read. After reading, the website displays "Part 1 read successfully" and a new button appears, saying "Continue?". If the user clicks it, the next 50-word part is read. Once all parts have been read, a message reading "Finished reading input text, upload new .txt file to read new input." appears, and the "Continue?" button disappears.
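
The splitting itself is straightforward. A hedged sketch of the kind of helper involved (the function and file names are illustrative, not the exact WebApp code):

    # Split the uploaded text into fixed-size word chunks so that each click
    # of "Continue?" reads exactly one chunk.
    def split_into_parts(text, words_per_part=50):
        words = text.split()
        return [" ".join(words[i:i + words_per_part])
                for i in range(0, len(words), words_per_part)]

    # A 50-word file with the limit set to 25 yields two parts, matching the
    # screenshots described below.
    parts = split_into_parts(open("notes.txt").read(), words_per_part=25)
    print(len(parts))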

User can upload a .txt file; the default word count is set to 50 (latency of ~2 s).

Reading a 50-word text with the maximum word limit set to 25 words (read in two parts).

After the first part (25-word chunk) is read successfully, the "Continue?" button appears:

After the second part is read successfully, the "Continue?" button disappears and the finished message appears.

Lastly, this week I also worked on a slight UI improvement for the website. When a user ends a study session, there is now star confetti! (Similar to Canvas submissions).

According to the Gantt chart, I am on target and have finished all individual components that I am responsible for. All other tasks that I am involved in are collaborative, with either Jeffrey (WebSockets – RPi receiving and sending info) or Mahlet (TTS on the robot) in charge. Although I am slightly behind on what I initially planned for the interim demo (the Study Session is not fully working), everything else I had planned to talk about is working.

In the next week, I’ll be working on:

  • Helping Jeffrey finish up the Study Session feature
  • Helping Jeffrey to start RPS Game feature
  • Implementing TTS feature on the Robot with Mahlet

Shannon’s Status Report 11/9/2024

This week, I worked on ensuring that Study Session information could be sent via WebSockets, and I succeeded in doing so. The WebApp can successfully send over information when the user creates a Study Session, and it can send over the information that the user has ended a Study Session. As for robot-to-WebApp communication, because the robot's pause button for the Study Session has not yet been implemented and tested, I have not yet been able to verify that the code I wrote for handling that input over WebSockets works. Theoretically, when the pause button is pressed, the RPi should send a message via WebSockets with something like socket.emit("Session paused"), and upon receiving it, the WebApp display page should show "Study Session on Break" instead of "Study Session in progress". Ideally, I want to test this with the actual pause button being pressed on the robot, but if Jeffrey runs into issues implementing that in time, I will test it by having the RPi send the message 10 seconds after it receives the start-session information by default, just to see whether the code I have written actually works. In conclusion, WebApp-to-robot communication is working (Figure 1); robot-to-WebApp communication still needs testing on the WebApp end.

Figure 1: RPi receiving Study Session information from the WebApp.
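
To make the direction of that pause message concrete, here is a hedged sketch of the robot side in python-socketio; only the event name is taken from the description above, and the URL and wiring are assumptions. The WebApp-side handler that swaps "in progress" for "on Break" lives in the display page's code.

    import socketio
    import threading

    sio = socketio.Client()
    sio.connect("http://webapp.example:8000")

    def on_pause_button_pressed():
        # To be called from the GPIO handler for the robot's pause button.
        sio.emit("Session paused")

    # Temporary stand-in while the hardware button is not ready: send the
    # pause message 10 seconds after start, as described above.
    threading.Timer(10, on_pause_button_pressed).start()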

I also worked on the WebSocket code for the RPS Game this week. Unlike the Study Sessions, there is significantly less communication between the WebApp and the robot, so I only worked on this after I was confident that WebSockets were working for our Study Session feature. For the RPS Game, all the WebApp has to do is send over, at the start of the game, the number of rounds the user wishes to play; all gameplay then occurs on the robot with the RPi, the DSI display, and the RPS game buttons. When the game ends, the robot sends back game statistics via WebSockets, which get displayed on the WebApp. I am able to send the RPS Game information to the robot with no issue, but I have yet to test the receiving of the statistics and the display that should occur, which I will focus on next week. As before, WebApp-to-robot communication is working; robot-to-WebApp communication needs testing.
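
The protocol here is small: one message in each direction. A hedged sketch of its shape (event names, fields, and the URL are my guesses, not the actual code):

    import socketio

    sio = socketio.Client()

    @sio.on("rps_stats")                   # robot -> WebApp when the game ends
    def on_stats(data):
        print("wins:", data["wins"], "losses:", data["losses"], "ties:", data["ties"])

    sio.connect("http://robot.example:5000")
    sio.emit("rps_start", {"rounds": 3})   # WebApp -> robot at game start
    sio.wait()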

For the TTS feature, I didn't have as much time to work on it this week, but I managed to implement reading from a .txt file instead of just a text field! A user can now upload a .txt file, and gTTS is able to read out its contents.

Lastly, this week I also worked on the overall UI of our page to make sure that everything looks neater and more visually appealing. Previously, the links were all connected together and messy, but I have separated the header bar into individual sections with buttons and so the overall UI looks more professional. I will continue to work on improving the overall style of our website, but now it more closely resembles the mock-ups that I drew up in the design report.

According to the Gantt chart, I am on target.

In the next week, I’ll be working on:

  • Finishing up the Study Session feature
  • Finishing up TTS feature on the WebApp
  • Testing RPS Game statistics display on the WebApp
  • All Interim demo goals are listed under Team Report!

Shannon’s Status Report for 11/2/24

This week, I focused on the TTS feature of our robot. I spent some time trying to use gTTS (Google Text-to-Speech) on our WebApp, and it worked! We were able to take in input text on the WebApp and have it read out by the user's computer. However, gTTS has a significant issue: latency. The gTTS library works by converting all of the text to an mp3 file, which the WebApp then plays for the user. The problem arises when a long piece of text is used as input. The specific details are also in the team weekly report, but essentially the delay can be as long as 30 s before a long piece of text is read, which is definitely a concern for our project. Our previous library, pyttsx3, synthesizes speech as it processes the text, so it does not have this scaling latency. Mahlet and I have agreed that we will still try to get pyttsx3 to work to avoid gTTS's significant latency issue; if we still can't get it working by the end of next week, we will switch to gTTS and possibly split the input text into 150-200-word chunks, generating multiple mp3 files that are played back-to-back.

I also worked on the WebSocket code for the Study Sessions this week. Following our success in having the RPi communicate with and respond to our WebApp, I wrote code for the WebApp's Study Session feature to send over the Study Session's information when it is created, so that the RPi receives actual Study Session data and not just a notification that a button was clicked. Unfortunately, I have not had a chance to test this on the RPi yet, but I am confident it will work. I have also written some code, to be added to the RPi, for checking whether the WebApp can receive information about paused Study Sessions; I plan to transfer it over the next time I am able to work on the RPi. Ideally, by the end of next week, communication between the RPi and the WebApp will be working well enough to simulate a study session from start to finish.


In the next week, I’ll be working on:

  • Researching a solution for pyttsx3 with Mahlet
  • Study Sessions communications between the WebApp and RPi through WebSockets
  • Starting Study Session timing code on the RPi

Shannon’s Status Report for 10/26/2024

This week, I focused on making WebSocket communication between our WebApp and the RPi work. When we met up on Thursday afternoon, Mahlet and Jeffrey helped set up the RPi (registering the device, connecting it to WiFi, etc.). Once we had VSCode on the RPi, I coded up a short script to test whether communication worked: I wrote a simple script in JavaScript on the RPi, wrote a similar one with some extra UI features on the WebApp, and tested them together. In theory, when I clicked a button on the WebApp, the RPi should receive the event and print out a message. Initially, this wasn't working because the RPi and the WebApp were on different ports. There was a CORS (Cross-Origin Resource Sharing) error, since the WebApp was sending requests to a domain different from the one serving it, so to fix this I added CORS settings on the RPi side to allow the WebApp's requests. This worked, and the RPi was able to display a message when a button on the WebApp was clicked.
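
For reference, the fix is essentially one server-side setting. Here is a sketch of the same idea in python-socketio (our actual RPi test script was JavaScript, so the syntax differed, and the origin and port below are placeholders for the WebApp's real host):

    # Tell the Socket.IO server which origin the WebApp is served from so
    # that its cross-origin requests are accepted.
    import eventlet
    import socketio

    sio = socketio.Server(cors_allowed_origins=["http://localhost:8000"])
    app = socketio.WSGIApp(sio)

    @sio.on("button_clicked")
    def handle_click(sid, data):
        print("WebApp button clicked:", data)

    if __name__ == "__main__":
        eventlet.wsgi.server(eventlet.listen(("", 5000)), app)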

On the WebApp:


On the RPi:



I also spent quite some time this week trying to incorporate TTS into the WebApp itself. Unfortunately, the pyttsx3 library we were trying to use does not seem to work well with our website. After coding up some simple logic to call the library's TTS function when user input is received, Mahlet and I tested it. The first time we input text into the textbox and click the read button, it works well: the laptop speakers play the correct audio with little to no delay. However, when we try to send more text, we get the error "run loop has already started", which indicates that the previously queued text-to-speech command has not finished. We spent quite some time trying to debug this, looking up solutions online from other users who had encountered the same issue, but none of them worked for us. We also looked through the documentation for the TTS library itself and tried out various functions, but nothing seemed to help. Thus, Mahlet and I are looking into other TTS libraries to see if we can find a solution. I am considering gTTS (Google Text-to-Speech), which is less ideal than pyttsx3 because it requires an internet connection, but it should be well-documented enough to reduce the chances of running into similar problems.
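
For context, the failing pattern is tiny. A hedged sketch of roughly what our code does (illustrative, not the exact WebApp code; the WebApp calls this when the user submits text):

    # Run as a standalone script this tends to behave; called repeatedly from
    # our web request handling, the second request is where runAndWait()
    # raised "run loop has already started" for us.
    import pyttsx3

    engine = pyttsx3.init()   # one engine shared by the app

    def read_aloud(text):
        engine.say(text)
        engine.runAndWait()   # blocks until speaking finishes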

In the next week, I’ll be working on:

  • Building the robot with my team
  • Researching different solutions for TTS with Mahlet
  • RPS game function on the WebApp with WebSockets

Shannon’s Status Report for 10/12/2024 or 10/19/2024

This week, I focused heavily on writing the Design Report. Starting Monday night, I typed up the Introduction paragraph and detailed our Use Case, Use-Case Requirements, and Design Requirements in a shared document so that my teammates could refer to it while writing their separate parts (I essentially generated the structure and points to talk about, and they could elaborate on them). On Thursday and Friday nights, I dedicated time to completing my individual parts. The early half of Saturday, I covered some parts that had been delegated to other members but were not finished by the Friday-night deadline we had set, and Saturday night was spent editing out any jarring errors. I also spent time detailing everything that had to be done, so that our group would not miss any important information in the Design Report, and drew up Monday and Wednesday morning agendas for our meet-ups to make sure we stayed on track. Overall, on top of the work I created and typed up for the Design Report (research on the various requirements; most of the diagrams in the report, including all flow charts and mock-ups, plus a sketch of the software part of the block diagram that Mahlet then created; and the actual writing of the document), I think I also put in a lot of the mental labor of keeping everyone on top of things. In total, I spent more than 12 hours on the report alone.

Moving on to our project itself, I have also finished most of the coding for the pages (RPS page, TTS page, Study Session page) this week, and the website now resembles the mock-ups much more closely. I still have to work on some CSS elements and on making the website UI more user-friendly, but everything seems to be working well. I am slightly behind schedule, as I wanted to have some text-to-speech functionality sorted out on the RPi, along with WebSockets, but since these are areas that require me to work with Mahlet and Jeffrey respectively, I will put my all into them once fall break is over. Included below are some images of the website as it works so far:

In the next week, I’ll be working on:

  • Cleaning up the UI on the individual pages
  • Error handling for the pages
  • Building the robot with my team
  • Setting up TTS on the RPi/robot

Shannon’s Status Report for 10/5/2024

This week, I focused on nailing down the details for the Design Report with my team. On Thursday, I had an in-depth discussion with our TA Ella about our use cases, use-case requirements, and design requirements, making sure to clarify and really narrow down the specifics. From those discussions, I worked on defining the features across five use-case areas and their corresponding requirements (Power On, RPS Game, TTS, Studying Session, Audio Cue Response). Below, I illustrate the features for Power On and the RPS Game to give an idea of what this process was like and what I had at the end of it.

  • Initial greeting of a user (power on, display screen lights up, then the robot face is displayed)
    • UC Requirement: Power on and display in under 3 s
      • Design Requirement: DSI display latency
  • RPS Game
    • UC: Starting a Game
      • UC Requirement: Game is started on the WebApp and the robot responds in under 1 s
        • Design Requirements: DSI display latency, WebSocket latency
      • UC Requirement: Robot prompts the user to start, the user presses OK, and the robot responds in under 250 ms
        • Design Requirement: GPIO latency
    • UC: Playing a Game
      • UC Requirement: The user presses an input at the end of the "Rock, Paper, Scissors, SHOOT!" countdown and the robot responds with a win/lose output in under 1 s
        • Design Requirements: DSI display latency, RPi state calculation latency
    • UC: Ending a Game
      • UC Requirement: At the end of the initially set number of rounds, the robot displays "Game Over!" and reverts to Study mode in under 1 s
        • Design Requirement: DSI display latency

…and so on for all five use-case areas. We will be basing our design report on the outline I created, and I made sure to think through the details and define everything fully so that it can carry over to our final report as well.

Following this discussion, I realised that we had far too many elements in our robot: on top of these five areas, we wanted a photoresistor for a niche use case (see the team report for more details) and an ultrasonic sensor for user presence detection, for a total of seven. As such, I led the team in discussing whether we should prune some features, and ultimately we did end up removing those two from the design.

I have also worked on the study session feature this week; it is still slightly buggy, but I am confident I can fix the issues. I have also finished creating all the pages we need for our WebApp. According to the Gantt chart, I am slightly behind schedule, since I had also aimed to finish the Todo-List feature this week; it is a simpler feature than the Study Sessions, so I prioritised the Study Sessions instead. To catch up, I plan on dedicating extra time next week to finishing these features, and since all of the remaining WebApp features for next week involve the robot, I will enlist help from my team members to make sure we stay on target.

In the next week, I’ll be working on:

  • Completing the Design Report
  • Completing the Study Session and Todo-List features
  • Coding up RPS Game Page
  • Coding up TTS Page
  • Researching TTS with Mahlet to test a few short inputs

Shannon’s Weekly Report 9/28/2024

This week, I focused on narrowing down the specifics of the robot and the WebApp with my team. We wanted a clear idea of exactly what our robot and the WebApp would look like. We discussed in depth what the robot's dimensions should be and concluded that it should be roughly 12-13 inches tall to sit at eye level on a desk. Since the LCD display will be around 5 inches, the base will be about 7 inches tall. We also discussed the feet dimensions, which came out to 2.5 inches wide to account for the three rock-paper-scissors buttons and 1 inch in height to account for the buttons sticking out. Then, I led the discussion on what the WebApp should look like, what pages we should have, and what each page should do. We decided on four main pages:

  • a Home page displaying the most recent study sessions and todo lists,
  • a Timer page that allows the user to set timers for tasks and a stopwatch to time how long they take to do tasks,  
  • a Focus Time/Study Session page where the user can start, pause, and end a study session, and view statistics/analyze their study sessions,
  • a Rock-Paper-Scissors page, where the user can start a game with the robot.

Following our discussion, I have started working on the Timer page for our WebApp. I have finished the basic timer and stopwatch features, so a user can now start a timer and start and stop a stopwatch. Attached is a screenshot of this. I also plan on adding a feature where previous timer and stopwatch timings are recorded, with tags the user can attach to each past activity.
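
For that planned recording feature, the data involved is small. A hypothetical Django model sketch of what might back it (all names are placeholders, nothing is settled yet):

    from django.db import models

    class TimedActivity(models.Model):
        tag = models.CharField(max_length=100, blank=True)
        duration_seconds = models.PositiveIntegerField()
        is_stopwatch = models.BooleanField(default=False)  # stopwatch vs. countdown timer
        created_at = models.DateTimeField(auto_now_add=True)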

 

According to the Gantt chart, I am on target. 

In the next week, I’ll be working on:

  • Completing the Timer Page
  • Coding up the Focus Time/Study Session Page
  • Fully finalizing a plan on how to integrate the robot with the WebApp

Shannon’s Status Report for 9/21/2024

This week, I focused on researching various components for our proposal presentation. To better define our use case, I looked into a few research papers with a similar end goal to ours and found good support that there is a need for our robot (because it can fulfill the psychological needs of students, motivate learners, and improve their learning output). I also properly scoped out the six main features of our robot (three for studying and three for interaction) and what problems each feature addresses with regard to our use case. I also worked on defining our technical challenges more concretely and coming up with possible risk mitigation strategies, doing some simple research to see which options were available and appropriate for our project. For example, for fast real-time communication between the app and the robot, there were a few options, including a REST API with polling (too high latency), but I ultimately decided on WebSockets because it is a commonly used, lightweight protocol well suited to our project.

I have also started creating the frontend pages for our WebApp using Django. I have not fully fleshed out all the pages (timer pages, studying session pages) and will work on that starting next week, but for now there is some basic navigation (login, logout, register, and a home page). I will discuss further with my team to settle on a design for the wireframes and then update the frontend views for our WebApp accordingly.
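
The navigation so far amounts to a handful of URL patterns. A rough sketch of the wiring (the view names are assumptions rather than our actual code; the login/logout views come from django.contrib.auth):

    from django.contrib.auth import views as auth_views
    from django.urls import path

    from . import views  # assumed module with home and register views

    urlpatterns = [
        path("", views.home, name="home"),
        path("login/", auth_views.LoginView.as_view(), name="login"),
        path("logout/", auth_views.LogoutView.as_view(), name="logout"),
        path("register/", views.register, name="register"),
    ]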

According to the Gantt chart, I am on target. 

In the next week, I’ll be working on:

  1. Design Presentation Slides
  2. Finalising our parts ordering list with my team
  3. Finishing up the WebApp frontend pages
  4. Starting work on timers, focus times, and todo list features on the WebApp