Author: jtraiger

Team Status Update for May 2

This week we worked on finalizing our presentation, which Ashika delivered on Monday. In the middle of the week we all worked on updating our design report to reflect changes in the design and descriptions of our final project. At the end of the week

Jade’s Status Update for May 2

This week I worked on getting audio latency metrics for our presentation on Monday, as well as updating our paper and recording segments for our final demo video. On Sunday I realized I had to remeasure the system latency because

Jade’s Status Report for Apr 25

On Monday I tested our code ahead of our demo and then participated in the demo.

On Wednesday I wrote testing code to measure speech recognition accuracy. The code pulled random words from the thousand most common English words, prompted the user to say each one, and compared the speech recognition output to the original prompt word. Abha and I ran it on the Raspberry Pi with the microphone hardware setup so that we could get a good picture of the full system's accuracy.
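The comparison logic can be sketched roughly like this; the word list here is a placeholder stand-in (the real test drew from the full thousand-word list), and the function names are just for illustration:

```python
import random

# Placeholder stand-in for the real list of the thousand most
# common English words.
COMMON_WORDS = ["time", "people", "water", "world", "school", "music"]

def pick_prompt(rng=random):
    """Choose a random prompt word for the user to say."""
    return rng.choice(COMMON_WORDS)

def score_trial(prompt, recognized):
    """1 if the recognized word matches the prompt (case/space-insensitive)."""
    return int(prompt.strip().lower() == recognized.strip().lower())

def accuracy(trials):
    """trials: list of (prompt, recognized) pairs -> fraction correct."""
    if not trials:
        return 0.0
    return sum(score_trial(p, r) for p, r in trials) / len(trials)
```

On the pi, the `recognized` value for each trial would come from running speech recognition on the recorded attempt.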

Person   Date/time    Correct   Total   Output file
Jade     4/22 12:20   81        100
Abha     4/22 12:41   91        100
Jade     4/22 12:45   89        100     op2.txt
Abha     4/22 12:48   80        100
Jade     4/22 12:52   91        100     op3.txt
Jade     4/22 1:03    91        100     op4.txt

Average: 87.2% (523/600)

So far, over 6 tests, the system averages 87% accuracy, which is slightly above our metric of 85% accuracy.

I also had the program output a text file with the exact word errors so that we could analyze them.

On Friday I wrote testing code to measure system latency between the pi and the laptop. It measures the time from when the user finishes giving their input to when the system says the next line of dialog. I tested this on only one laptop with text inputs and found that the latency was within our 4 to 6 second range. We still need to test latency for the whole system with speech recognition included.
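A minimal version of that timing harness, with the dialog pipeline stubbed out as a callable (the names here are assumptions, not our exact code):

```python
import time

def measure_latency(respond, user_input):
    """Time from when the user input is complete to when the system's
    next line of dialog is ready. `respond` stands in for the full
    pipeline (ML generation plus text-to-speech)."""
    start = time.monotonic()
    reply = respond(user_input)
    elapsed = time.monotonic() - start
    return reply, elapsed
```

With speech recognition included, the clock would instead start the moment the recognizer finishes consuming the user's audio.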

I am on schedule.

Next week I will be working on finishing measurements and writing the final paper.

Jade’s Status Update for Apr 18

This week I worked on improving the pi’s main function as well as cleaning up the socket code. Specifically, I worked on making the program close its sockets when done running and not hang if something happened to a socket
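One way to get both behaviors (sockets always closed, no indefinite hangs) is a timeout on every socket plus context-manager cleanup; this is a sketch of the idea, not our exact code, and the host/port values are placeholders:

```python
import socket
from contextlib import closing

def connect_with_timeout(host, port, timeout=5.0):
    """Open a client connection that fails fast instead of hanging.
    The timeout also bounds every later recv/send on the socket."""
    sock = socket.create_connection((host, port), timeout=timeout)
    sock.settimeout(timeout)
    return sock

# Usage: `closing(...)` guarantees the socket is closed even if a
# send/recv raises socket.timeout partway through, e.g.:
# with closing(connect_with_timeout("raspberrypi.local", 5000)) as sock:
#     sock.sendall(b"hello")
```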

Jade’s Status Report for Apr 11

On Monday I worked on cleaning up the socket interface code. I wrote read and write functions for both the server and the client. On Wednesday I worked on integrating the socket code into our program; this consisted of creating a new main file for the

Jade’s Status Report for April 4

On Sunday I decided to tackle the choppy audio quality that I was getting from the audio playback. I was using pydub’s AudioSegment to play the sound back, but it was choppy, so I investigated a few other audio processing packages, including pyglet, pygame, and tkSpeak, as well as just running mpg123 on the command line. After testing each, I found that I couldn’t get pyglet, pygame, or tkSpeak to either install or work on the pi. So, I went with playing audio with mpg123. This worked surprisingly well and provided clean audio.
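Shelling out to mpg123 from Python only takes a few lines; `-q` suppresses mpg123's console output, and splitting out the command builder is just for clarity (mpg123 must be installed, e.g. via apt on Raspberry Pi OS):

```python
import subprocess

def mpg123_cmd(path, quiet=True):
    """Build the mpg123 argv for a given mp3 file."""
    cmd = ["mpg123"]
    if quiet:
        cmd.append("-q")
    cmd.append(path)
    return cmd

def play_mp3(path):
    """Play an mp3 through mpg123; blocks until playback finishes."""
    subprocess.run(mpg123_cmd(path), check=True)
```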

On Monday I found that the audio input wasn’t working very well on the Raspberry Pi. Originally I was using speech_recognition’s Microphone class to record audio; however, it would hang or never stop recording, even when I supplied a timeout. So, I decided to instead record 3 seconds of audio from the microphone after prompting the user, and pass the resulting audio file into speech recognition. This worked really well and allowed the pi to take audio input.
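A sketch of that fixed-length approach, recording with ALSA's arecord and deferring the speech_recognition import so the snippet loads even without the package installed; the specific arecord flags and sample rate are assumptions:

```python
import subprocess

def arecord_cmd(path, seconds=3, rate=16000):
    """Build an arecord command for a fixed-length 16-bit mono WAV clip."""
    return ["arecord", "-d", str(seconds), "-f", "S16_LE",
            "-r", str(rate), "-c", "1", path]

def record_clip(path, seconds=3):
    """Record a clip of fixed length instead of relying on
    speech_recognition's Microphone class, which hung on the pi."""
    subprocess.run(arecord_cmd(path, seconds), check=True)

def transcribe(path):
    """Run speech recognition over the saved file."""
    import speech_recognition as sr
    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio)
```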

On Wednesday I worked on integrating Ashika’s new ML code changes with the audio code on my laptop. This involved a lot of back and forth between the two of us figuring out errors.

On Friday I got the audio input/output processing code working on both the pi and my laptop. The ML + audio system works fully on my laptop. We realized that if we could install all the ML packages onto the pi, then we wouldn’t need to run the ML code on the laptop. So, I worked on installing ML packages on the pi. Unfortunately I ran into a lot of issues and plan to try just a bit more before giving up and working on writing code to let the laptop and pi communicate.

So far I am a little behind because I should have started working on phase vocoding; however, I’ve decided to table that until we have a completely working ML and pi communication system.

Next week I want to try and work on getting the ML installs working on the pi. If I can’t then I will be working on cleaning up the sockets code between the laptop and pi.

Jade’s Status Update for Mar 28

On Monday I cleaned up my audio input/output code in order to work with Ashika on integration. I turned my code into easy-to-call modules, documented it, and documented all the installs needed to run it. I also started writing code that will set up

Jade’s Status Report for Mar 21

This week I worked on getting the simple fill-in-the-blank interface working on the Raspberry Pi. I have the audio input portion working, but have not gotten the audio output portion working yet. This is mostly because I have been having trouble figuring out

Jade’s Status Report for Mar 7

During the earlier parts of this week I worked on writing the design report.

After the design report was submitted, I talked with Ashika to figure out how the ML algorithm would output dialog to the text-to-speech portion of KATbot. I then worked on writing a program that reads a templated story line by line, prompts the user for input whenever needed, and takes that input and saves it. The program finishes by asking the user if they want to hear the story that they made while interacting with the program. The program works moderately well; however, there are latencies of around 1 to 2 seconds from when the user responds to a query.
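The core loop can be sketched with the story template as a list of lines and the blank marker assumed to be `___` (the real template format may differ):

```python
def run_story(template_lines, ask):
    """Walk a templated story line by line. Wherever a line contains
    the blank marker, call `ask` to get the user's word (on the real
    system this comes from speech recognition) and splice it in.
    Returns the completed story for playback at the end."""
    BLANK = "___"
    completed = []
    for line in template_lines:
        while BLANK in line:
            # Replace one blank at a time so each gets its own prompt.
            line = line.replace(BLANK, ask(line), 1)
        completed.append(line)
    return completed
```

In the live program, `ask` would speak the prompt via text-to-speech and return the recognized spoken word, which is where most of the 1 to 2 second latency comes from.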

Below is a line by line text view of everything the TTS algorithm says. Right now, “generated” is being used to fill in words that the ML algorithm would provide.

Currently I am a little behind where I would like to be because I need to start working on the pitch shifting algorithm, as well as getting the program I wrote running on the pi.

I will not be working this upcoming week because it is spring break and our group planned slack time around spring break. However, for the week after spring break, the main things I plan on working on are getting the program I wrote running on the pi and working on the phase vocoding algorithm. If I have time I will work with Ashika to integrate the short storytelling interface I wrote with the ML algorithm that she has been working on.

Jade’s Status Report for Mar 1

This week I primarily spent time working on our design presentation. I created a new system diagram for us to use which is displayed below. I also spent a decent amount of time practicing for the presentation. As for what I’ve accomplished regarding the project,