Cambrea’s Status Report for May 1

This week I finished all tests for the audio networking and made more formal output files with the results. Our latency is ~30 ms and our dropped packet rate is ~0.3%; both of these meet our requirements.

latencyTestMic1

latencyTestMic2

droppedPacketTestMic1

droppedPacketTestMic2

We also finished the slides for our final presentation.

Today we are working on our final video outline and we are filming some clips of our meeting setup.

My progress is on schedule. Next week, after our presentation, we will be working more on the video and the final report.

Cambrea’s Status Report for April 24

Last weekend we finished the migration to AWS, so we are now running our server code on the EC2 instance instead of on a local computer.

Since we are now running our system on the server, we started our final standard tests. This week I completed the two tests for the networking between the audio devices and the AWS server.

I first wrote the latency test. This test sends packets to the server, which should echo each packet back. I capture the timestamp when a packet is sent and compare it to the timestamp when the packet arrives back at the Raspberry Pi, so that the same clock is used for both ends of the latency calculation. After sending and receiving 500 packets, we calculate the system's average round-trip latency. We are getting 27-80 ms average latency, which is below our requirement of 150 ms.
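To make the timing concrete, here is a minimal sketch of the idea, not the actual ClientTest.py code: the server address and packet count are placeholders, it assumes the server echoes each packet back, and a real run would also handle receive timeouts.

    import socket
    import time

    SERVER_ADDR = ("server-hostname", 9000)  # placeholder address
    N = 500

    def latency_sketch():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(2.0)
        total = 0.0
        for i in range(N):
            sent_at = time.monotonic()       # same clock for send and receive
            sock.sendto(str(i).encode(), SERVER_ADDR)
            sock.recvfrom(4096)              # wait for the server's echo
            total += time.monotonic() - sent_at
        print(f"average round-trip latency: {1000 * total / N:.1f} ms")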

The second test is the dropped packet test. It sends packets to the server for a fixed amount of time and counts the number of packets received back from the server; we calculate the dropped packet rate as (packets sent - packets received) / packets sent. We ran this test for 2 minutes and for 10 minutes and found dropped packet rates of 0.2% and 0.1%, both under 1%. This also meets our requirement of <5% dropped packets in our system.
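A similar sketch for the dropped packet idea, again with placeholder names, counting each receive timeout as a drop:

    import socket
    import time

    SERVER_ADDR = ("server-hostname", 9000)  # placeholder address
    DURATION = 120                           # seconds; we ran 2- and 10-minute tests

    def dropped_packet_sketch():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(0.5)
        sent = received = 0
        end = time.monotonic() + DURATION
        while time.monotonic() < end:
            sock.sendto(b"ping", SERVER_ADDR)
            sent += 1
            try:
                sock.recvfrom(4096)
                received += 1
            except socket.timeout:
                pass                         # no echo: count as dropped
        print(f"dropped packet rate: {(sent - received) / sent:.2%}")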

These two tests are in ClientTest.py, which works with the following arguments:

-b  run the basic connection test: send and receive one packet to make sure the connection exists

-l  run the full latency test

-d run the full dropped packet test

The tests are implemented in the corresponding functions testLatency and testDroppedPacket; a skeleton of the argument handling follows the link below.

ClientTest
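The argument handling could look roughly like this; this is a hypothetical skeleton, and testBasicConnection is an assumed name for the -b helper:

    import argparse

    parser = argparse.ArgumentParser(description="audio device to server network tests")
    parser.add_argument("-b", action="store_true",
                        help="basic connection test: send and receive one packet")
    parser.add_argument("-l", action="store_true", help="run the full latency test")
    parser.add_argument("-d", action="store_true", help="run the full dropped packet test")
    args = parser.parse_args()

    if args.b:
        testBasicConnection()
    if args.l:
        testLatency()
    if args.d:
        testDroppedPacket()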

My progress is on schedule; we will continue testing into next week as well.

Next week I will be working on the final presentation slides and the final report.

Cambrea’s Status Report for April 10

Last week I completed the streaming code and AWS server code responsible for sending and receiving audio over the network. The ReSpeaker offers a capability of detecting whether a user is speaking, via its is_voice() parameter. I tested this capability over the last weekend and found that output audio gated on this information is too choppy to be intelligible to the user. We are currently testing whether, after we tag the packets as voice and feed those to the transcript, these packets have enough data to create the transcript.
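For context, the voice tagging looks roughly like this, assuming the Tuning interface from Seeed's usb_4_mic_array examples; the USB IDs are the ones from their wiki, and tag_packet is an illustrative name:

    import usb.core
    from tuning import Tuning  # tuning.py from Seeed's usb_4_mic_array repo

    dev = usb.core.find(idVendor=0x2886, idProduct=0x0018)  # ReSpeaker USB IDs
    mic_tuning = Tuning(dev)

    def tag_packet(audio_bytes):
        # attach the VAD flag so voice-only packets can be routed
        # to the transcript generator
        return (audio_bytes, bool(mic_tuning.is_voice()))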

This week we started integration of each of the subsystems, so we were working on campus on HH D level. On Monday and Tuesday I added the Raspberry Pis to the CMU-DEVICE wifi. We were having issues connecting the devices, so we reflashed the OS onto the Raspberry Pis' SD cards and reconfigured the wifi; the devices now connect to CMU-DEVICE.

On Wednesday we finished the integration between the audio streaming on the audio devices and the transcript generation on the server. For this integration we are currently using Ellen's computer as the server, so that we can complete the integration for the demo before migrating to the AWS server. We are also developing the speaker identification further to make sure that it recognizes different speakers.

This week we will start the tests for transcript accuracy, prepare for the demo, and also start the migration to AWS.

Cambrea’s Status Report for March 27

For this week, I worked on fixing some audio issues with the audio device code. The issue was that when I went to play the output audio on the speaker (this should be the output audio from the users), it was not the correct audio and was instead just computer noise. I was trying to play raw audio using PyAudio like this:

    import pyaudio

    # RESPEAKER_* and CHUNK are the ReSpeaker stream constants defined
    # elsewhere in the file; data holds the raw audio bytes to play.
    p = pyaudio.PyAudio()
    stream = p.open(
        rate=RESPEAKER_RATE,
        format=p.get_format_from_width(RESPEAKER_WIDTH),
        channels=RESPEAKER_CHANNELS,
        output=True,
        frames_per_buffer=CHUNK,
        input_device_index=RESPEAKER_INDEX,  # note: an output-only stream would
                                             # normally take output_device_index
    )
    stream.write(data)
This stream.write(data) call was not working with raw audio, but it worked when I instead wrote the audio to a .wav file and then called stream.write() with the data read back from that file.
I think the raw audio data was not in the correct format or type to be played through the stream, but when I print the type of the raw data it is just "bytes", which is ambiguous; the stream documentation says it accepts bytes-like objects.
Going forward I chose to write the raw data to a .wav file and then play the .wav file through the stream, since that gives clear audio.
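The workaround, roughly, using the standard library wave module; the RESPEAKER_* constants are the same stream settings as above:

    import wave

    def play_via_wav(stream, data, path="chunk.wav"):
        # dump the raw bytes to a .wav with the stream's parameters...
        wf = wave.open(path, "wb")
        wf.setnchannels(RESPEAKER_CHANNELS)
        wf.setsampwidth(RESPEAKER_WIDTH)
        wf.setframerate(RESPEAKER_RATE)
        wf.writeframes(data)
        wf.close()

        # ...then read it back and play that; this comes out clear
        rf = wave.open(path, "rb")
        stream.write(rf.readframes(rf.getnframes()))
        rf.close()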
This week I also received the equipment for the second audio device and assembled it. I have downloaded the necessary code onto the device and am currently testing with my computer as the server (by this I mean I am running basic server code on my computer with both devices connected to it), checking that the devices send and receive the correct data to each other.
I am currently on track with my progress. Next week I will work on using the AWS server for the audio devices and on integrating this code with Ellen's speech-to-text code.

Cambrea’s Status Report for March 20

This week I wrote the code to handle all of the components on the Raspberry Pi.

These components are the microphone IO, the audio processing, and the client code. The RaspberrySystem file acts as the "top" module on the Raspberry Pi and is used to run the entire system. In this file I start the mic IO, audio processing, and client threads; a rough sketch of this structure follows the file link below.

RaspberrySystem
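A minimal sketch of what the top module does, with illustrative names: startIO is the MicrophoneIO entry point described below, while processAudio and runClient stand in for the other two components.

    import threading
    import queue

    def main():
        processing_q = queue.Queue()  # mic IO -> audio processing
        outgoing_q = queue.Queue()    # audio processing -> client
        playback_q = queue.Queue()    # client -> speaker output

        threads = [
            threading.Thread(target=startIO, args=(processing_q, playback_q)),
            threading.Thread(target=processAudio, args=(processing_q, outgoing_q)),
            threading.Thread(target=runClient, args=(outgoing_q, playback_q)),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    if __name__ == "__main__":
        main()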

I also added the MicrophoneIO file. In this file, the startIO function is called by the RaspberrySystem file. This function starts the audio stream that listens on the microphone. When audio is detected on the microphone, it is put into the queue for the audio processing component, along with its direction of arrival. This happens in the callback function.

When audio is ready to be played on the speaker (this is the output audio coming from the second set of users in a different room), the listening stream is stopped to prevent a feedback loop while the output audio is played. A sketch of this flow follows the link below.

MicrophoneIO
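Sketching that flow with hypothetical names and stream settings (the real file's parameters differ):

    import pyaudio

    def startIO(processing_q, playback_q, get_doa):
        p = pyaudio.PyAudio()

        def callback(in_data, frame_count, time_info, status):
            # queue the captured chunk together with its direction of arrival
            processing_q.put((in_data, get_doa()))
            return (None, pyaudio.paContinue)

        in_stream = p.open(rate=16000, channels=1, format=pyaudio.paInt16,
                           input=True, stream_callback=callback)
        out_stream = p.open(rate=16000, channels=1, format=pyaudio.paInt16,
                            output=True)

        while True:
            data = playback_q.get()   # output audio from the other room's users
            in_stream.stop_stream()   # stop listening while it plays...
            out_stream.write(data)
            in_stream.start_stream()  # ...then resume, avoiding the feedback loop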

Lastly, I added to the client code: I made a text file that holds the audio device's ID number, which is sent to the server so that the server can differentiate the audio devices.

I also ordered a ReSpeaker, a micro SD card, a Raspberry Pi case, and a Raspberry Pi power supply cable to create the second audio device.

I am currently on track with my progress.

Next week I will configure the new audio device when the parts arrive. Ellen and I will also start integrating the audio streaming work that I have done with the transcript generator that she has completed.

Cambrea’s Status Report for March 13

This week I completed the server/client code to send and receive audio data between the Raspberry Pi and the server. The server code, here as AWSServer,

receives data from the two separate audio devices and places each input into a queue to be sent out to the other audio device. For example, device 1 sends a packet to the server; the packet is put into the queue and is sent to device 2.
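A rough sketch of that forwarding logic; the port is a placeholder and unpack_device_id stands in for however the device ID is read out of the packet:

    import queue
    import socket
    import threading

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 9000))                # placeholder port

    out_queues = {1: queue.Queue(), 2: queue.Queue()}
    addresses = {}                              # device ID -> last known address

    def sender(dev_id):
        while True:
            packet = out_queues[dev_id].get()
            if dev_id in addresses:
                sock.sendto(packet, addresses[dev_id])

    for dev_id in out_queues:
        threading.Thread(target=sender, args=(dev_id,), daemon=True).start()

    while True:
        data, addr = sock.recvfrom(65535)
        dev_id = unpack_device_id(data)         # assumed helper
        addresses[dev_id] = addr
        other = 2 if dev_id == 1 else 1
        out_queues[other].put(data)             # device 1's audio goes to device 2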

The client code is here: RPiClient

This code packages the audio information into a list, which is then turned into a byte string using the Python library Pickle so that it can be sent to the server. Pickle is used on the server side as well to unpackage the data. Incoming audio data is also placed on a queue to be played on the speaker for the users.
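The packaging step, roughly, assuming the packet is the [micID, audio, DOA] list described above:

    import pickle

    def package(mic_id, audio_bytes, doa):
        # list -> byte string for the socket
        return pickle.dumps([mic_id, audio_bytes, doa])

    def unpackage(packet):
        # byte string -> list, on the server side
        mic_id, audio_bytes, doa = pickle.loads(packet)
        return mic_id, audio_bytes, doa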

This week I also worked on the design report document. I have finished a draft of my sections; on Sunday I will revise this writing into a final draft so that we can discuss the document on Monday.

I am on schedule this week. Next week I will make the "Top" file for the Raspberry Pi to connect the client code and the audio I/O functions for the ReSpeaker, and I will continue testing the code as I write it.

Cambrea’s Status Report for March 6

This week I finished the design for the network connection between the Raspberry Pi and the AWS server. We are still planning on using a UDP connection to transmit packets; each packet will be a struct of the micID, the audio bytes, and the direction-of-arrival information for the audio. The micID is assigned to a single Raspberry Pi audio device by the server when the connection is first made: the AWS server receives a message that the audio device wants to connect and returns the mic ID, which is resent until the Raspberry Pi audio device receives the micID or the connection times out. On the server, listen and send will each need to run on separate threads to make sure that the server is always listening, and there will also be a separate thread for each audio device's connection to the server.
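A sketch of the handshake from the device's side under this design; the address, retry count, and message formats are placeholders:

    import socket

    SERVER_ADDR = ("server-hostname", 9000)  # placeholder address
    MAX_RETRIES = 10

    def request_mic_id():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(1.0)
        for _ in range(MAX_RETRIES):
            sock.sendto(b"HELLO", SERVER_ADDR)   # "this audio device wants to connect"
            try:
                reply, _ = sock.recvfrom(1024)
                return int(reply)                # the server's assigned micID
            except socket.timeout:
                continue                         # keep resending
        raise TimeoutError("connection timed out before receiving a micID")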

I set up simple server code to test the connection between the AWS server and the Raspberry Pi. Since the AWS server is remote, on a different network, I had to set up port forwarding so that the Raspberry Pi can reach the server code on the AWS server. AWS has its own way for users to set up port forwarding, using AWS Systems Manager; I have been following this tutorial: https://aws.amazon.com/blogs/mt/amazon-ec2-instance-port-forwarding-with-aws-systems-manager/
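Following that tutorial, the forwarding session is started with a command along these lines, with a placeholder instance ID and ports:

    aws ssm start-session \
        --target i-0123456789abcdef0 \
        --document-name AWS-StartPortForwardingSession \
        --parameters '{"portNumber":["80"],"localPortNumber":["9999"]}'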

We also received the AWS credits, so we upgraded our EC2 instance to m5a.large, and we are tracking our usage of the server.

I am currently on schedule. In the next week I am going to complete the code to handle the networking from the Raspberry Pi to the AWS server; this will include the initial handshake between the Raspberry Pi audio device and the AWS server, and the transmission of our packets.

Cambrea’s Status Report for Feb. 27

This week I gave the project proposal presentation on Monday, and I ordered a ReSpeaker to start configuring the hardware. I already have a Raspberry Pi 3 B+, so I spent some time downloading the libraries we will need onto it, to test downloading and running code on the Raspberry Pi.

I also started working on the design for the code that will run on the Raspberry Pi. This code will need to handle listening for audio on the microphone, sending this audio in real time to be processed, and then transmitting the audio to the AWS server. For this, I found we need to listen for audio on the microphone on its own thread, separate from audio processing; we also need to run the server listen and accept on its own thread. I found two ways we can do this in Python: the thread library or the asyncio library. I am leaning toward the asyncio library since it is more helpful for creating IO routines and for networking. One main challenge is that we will potentially need to tell the ReSpeaker to output audio during the time that it is listening to audio; I will test this functionality on the ReSpeaker when it arrives.

I also researched how we can send the input audio to be processed: we can use a Python numpy array or a .wav file, so this format depends on how Mitchell will filter the audio and what format his script needs. For the server code, I am planning on using the Python socket library and a UDP connection. The packets will contain compressed audio and metadata with the direction of arrival (DOA) of the audio.
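As a feel for the asyncio option mentioned above, a minimal UDP endpoint might look like this; the port is a placeholder and the echo stands in for real packet handling:

    import asyncio

    class AudioProtocol(asyncio.DatagramProtocol):
        def connection_made(self, transport):
            self.transport = transport

        def datagram_received(self, data, addr):
            # runs inside the event loop, so listening never blocks other IO
            self.transport.sendto(data, addr)  # echo back, as a placeholder

    async def main():
        loop = asyncio.get_running_loop()
        await loop.create_datagram_endpoint(
            AudioProtocol, local_addr=("0.0.0.0", 9000))  # placeholder port
        await asyncio.Event().wait()  # serve forever

    asyncio.run(main())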

My progress is on schedule this week. For Monday I will create a system diagram of the software on the Raspberry Pi and continue working on the design presentation slides. Next week I will write the server code that goes on the AWS server and start to test the Raspberry Pi's connection to the server.

Cambrea’s Status Report for Feb. 20

This week we worked on creating an outline document with our research. I added my research about hardware: why we are using a Raspberry Pi, which ReSpeaker we are using, and which AWS server we should use. I also added research about how to transmit audio data at the application layer to the AWS server and how to compress the audio into packets to send.

From this research document we created the slides, and since I am giving the proposal presentation I have been reviewing what I will say. I also talked through each slide with my group.

The most significant risk we could have at this point is incorrectly laying out our work on the Gantt chart and not getting the timing right, since we are just estimating how long each task will take us. To mitigate this, we will update the Gantt chart as we learn more about how long each piece of the project will take.

We have mostly worked on designing the system this week, so we don't have any changes to report. We also created the schedule this week, which is linked in our team report.