Cambrea’s Status Report for March 27

This week I worked on fixing some audio issues in the audio device code. The issue was that when I went to play the output audio on the speaker (this should be the output audio from the other users), what came out was not the correct audio and instead was just computer noise. I was trying to play the raw audio using pyaudio like this:

    import pyaudio

    p = pyaudio.PyAudio()

    # Open a playback stream with the ReSpeaker's parameters
    # (output_device_index, not input_device_index, selects the playback device)
    stream = p.open(
        rate=RESPEAKER_RATE,
        format=p.get_format_from_width(RESPEAKER_WIDTH),
        channels=RESPEAKER_CHANNELS,
        output=True,
        frames_per_buffer=CHUNK,
        output_device_index=RESPEAKER_INDEX,
    )
    stream.write(data)
This stream.write(data) call was not working with the raw audio, but it did work when I first wrote the audio to a wav file and then called stream.write() on the data read back from that file.
I think the raw audio data was not in the correct format or type to be played through the stream, but when I print the type of the raw data it shows up as “bytes”, which is ambiguous; the stream documentation says write() is also meant to take byte-like objects.
Going forward, I chose to write the raw data to a wav file and then play the wav file through the stream, since that gives clear audio.
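As a small sketch of that workaround (the constants like RESPEAKER_RATE and CHUNK are the ones already defined in my device code; the file name is just a placeholder): the raw bytes are written into a wav file with the device’s parameters, and the file is then streamed back out through pyaudio.

    import wave
    import pyaudio

    # Dump the raw bytes into a wav file using the ReSpeaker's parameters
    wf = wave.open("output.wav", "wb")
    wf.setnchannels(RESPEAKER_CHANNELS)
    wf.setsampwidth(RESPEAKER_WIDTH)
    wf.setframerate(RESPEAKER_RATE)
    wf.writeframes(data)
    wf.close()

    # Read the wav back and play it through the output stream
    wf = wave.open("output.wav", "rb")
    p = pyaudio.PyAudio()
    stream = p.open(rate=wf.getframerate(),
                    format=p.get_format_from_width(wf.getsampwidth()),
                    channels=wf.getnchannels(),
                    output=True)
    frames = wf.readframes(CHUNK)
    while frames:
        stream.write(frames)
        frames = wf.readframes(CHUNK)
    stream.stop_stream()
    stream.close()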
This week I also received the equipment for the second audio device and assembled it. I have downloaded the necessary code onto the device and am currently testing with my computer as the server (by this I mean I am running basic server code on my computer with both devices connected to it). I am testing that the devices send and receive the correct data to and from each other.
I am currently on track with my progress. Next week I will work on using the AWS server for the audio devices and on integrating this code with Ellen’s speech-to-text code.

Mitchell’s Status Report for March 27

This week I continued moving forward on my tasks in the order I assigned them to myself. I had some family issues over the weekend, so I was not in a state to work then. I wrote half of the ethics assignment on Monday. Over the rest of the week, I worked on integrating transcript streaming; I took over this task, originally scheduled for a later date, from Cambrea. It uses channels, webhooks, and channel_asgi to update the transcript from the database.
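As a rough sketch of the streaming piece (the class and group names here are placeholders, not our exact code), a Django Channels consumer can push new transcript lines out to the browser whenever the backend broadcasts them to a group:

    import json
    from channels.generic.websocket import AsyncWebsocketConsumer

    class TranscriptConsumer(AsyncWebsocketConsumer):
        async def connect(self):
            # One group per meeting would work; a fixed name keeps the sketch simple
            self.group_name = "transcript_updates"
            await self.channel_layer.group_add(self.group_name, self.channel_name)
            await self.accept()

        async def disconnect(self, close_code):
            await self.channel_layer.group_discard(self.group_name, self.channel_name)

        # Handles events broadcast as {"type": "transcript.line", "text": "..."}
        async def transcript_line(self, event):
            await self.send(text_data=json.dumps({"text": event["text"]}))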

Schedule-wise, I believe that I am on schedule. Some items got shifted around: I worked on a later task this week, and I will create the housing next week, since we got our second set of parts and I can work on it without impeding other group members. I plan to use VMware Horizon to access the Windows lab cluster machines, but if it is too slow, then I will go to a campus computer lab and work there.

Team Status Report for March 27

This past week, our team continued to work on phases 2 and 3. Some of the members worked on the ethics assignment. Ellen set up the microphone identification backend. Cambrea continued to work on the Raspberry Pi code, making sure that it is robust. Mitchell worked on transcript streaming. We also assembled our second device.

In terms of upcoming risks, on Thursday Ellen and Mitchell will be assigned a large project in another class; it may be a large time commitment, but we do not think it will be a problem. We have not changed our design. In terms of schedule changes, Mitchell is working early on transcript streaming, which was originally assigned to Cambrea.

Ellen’s Status Report for March 27

This week I continued moving forward on my tasks in the order I assigned them to myself. First, I finished the ethics assignment over the weekend. Also over the weekend, I finished adding functionality to the buttons and forms on the website.

Then, over the week, I worked on the mic setup process. When a mic first joins a meeting, it’s in setup mode, where audio is used to assign names to speakers. In this process the system compiles names, initial locations, and vocal samples for all speakers. Then, when the web user says the process is done, audio starts getting routed to the actual transcript file. I finished coding up this process in a way that meshes with the current speaker ID system (non-ML) but will be easy to use in the ML system as well.
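In condensed form, the routing logic looks roughly like this (the helper names are placeholders for the real speaker ID and transcript code, not our exact implementation):

    class MicSession:
        def __init__(self):
            self.in_setup = True
            self.speakers = []          # (name, initial location, vocal sample)

        def handle_audio(self, chunk, location):
            if self.in_setup:
                # Setup mode: use the audio to enroll a speaker
                name = enroll_speaker(chunk)              # placeholder helper
                self.speakers.append((name, location, chunk))
            else:
                # Normal mode: route audio to the transcript pipeline
                route_to_transcript(chunk, location)      # placeholder helper

        def finish_setup(self):
            # Called when the web user ends the setup process
            self.in_setup = False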

I’m a few days ahead of schedule. I was supposed to start speaker ID ML on Monday, but I’ll start it on Saturday. Speaker ID ML is my last solo task.

Speaker ID ML doesn’t have to be finished by the end of next week, but it should be most of the way there; I should have at least one option working. The other thing I’m going to do is rework my data storage so that everything lives inside the database. Hopefully that’ll be done by later today.

Cambrea’s Status Report for March 20

This week I wrote the code to handle all of the components on the Raspberry Pi.

These components are the microphone IO, the audio processing, and the client code. The RaspberrySystem file acts as the “top” module on the Raspberry Pi and is used to run the entire system. In this file I start the threads for the mic IO, audio processing, and client components.

RaspberrySystem
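In outline, the top module just wires up shared queues and starts one thread per component (the run_* functions below stand in for the real MicrophoneIO, processing, and client code):

    import threading
    import queue

    processing_queue = queue.Queue()    # mic IO -> audio processing
    outgoing_queue = queue.Queue()      # audio processing -> client/server

    threads = [
        threading.Thread(target=run_microphone_io, args=(processing_queue,), daemon=True),
        threading.Thread(target=run_audio_processing, args=(processing_queue, outgoing_queue), daemon=True),
        threading.Thread(target=run_client, args=(outgoing_queue,), daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()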

I also added the MicrophoneIO file. In this file, the startIO function is called by the RaspberrySystem file. This function starts the audio stream that listens on the microphone. When audio is detected on the microphone, it is put into the queue for the audio processing component, along with the direction of arrival for that particular audio. This happens in the callback function.

When audio is ready to be played on the speaker (this is the output audio coming from the second set of users in a different room), the listening stream is stopped to prevent a feedback loop while the output audio is played.

MicrophoneIO
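Roughly, the callback wiring looks like this (the direction-of-arrival helper name is a placeholder; the constants are the ones from my device code):

    import pyaudio

    def callback(in_data, frame_count, time_info, status):
        # Each block of mic audio goes onto the queue with its direction of arrival
        doa = get_direction_of_arrival()               # placeholder DOA helper
        processing_queue.put((in_data, doa))
        return (None, pyaudio.paContinue)

    p = pyaudio.PyAudio()
    stream = p.open(rate=RESPEAKER_RATE,
                    format=p.get_format_from_width(RESPEAKER_WIDTH),
                    channels=RESPEAKER_CHANNELS,
                    input=True,
                    input_device_index=RESPEAKER_INDEX,
                    frames_per_buffer=CHUNK,
                    stream_callback=callback)
    stream.start_stream()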

Lastly, I added to the client code. Here I made a text file that holds the audio device’s ID number. This number is sent to the server so that the server can differentiate between the audio devices.

I also ordered a ReSpeaker, a micro SD card, a Raspberry Pi case, and a Raspberry Pi power supply cable to create the second audio device.

I am currently on track with my progress.

Next week I will configure the new audio device when the parts arrive. Ellen and I will also start integrating the audio streaming work that I have done with the transcript generator that she has completed.

Team Status Report for March 20

This week the team finished the design report. Ellen and Mitchell worked on setting up the website using the Django framework and on connecting to the AWS server to access the database. Cambrea worked on the Raspberry Pi system, which sets up the threads for audio IO, audio processing, and audio streaming to the server on the Raspberry Pi.

Our biggest risk now is that our denoising will have a negative impact on the audio stream, hurting both the clarity of the audio and the transcript generation. We are planning to use the background noise in the room (when there are no voices) to create a noise file that we will use to denoise the signal. Our current approach is to create this file during the meeting using the extraneous noise from the beginning of the meeting. There could be a problem with this approach, since the noise at the beginning wouldn’t necessarily represent the noise throughout the meeting, but we are starting with it for now and will test the result. Alternatively, we will focus more on filtering the audio to remove noise.
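As a rough sketch of what that denoising step could look like (this is simple spectral subtraction with numpy/scipy, not necessarily the exact method we will end up using): the noise file gives an average noise spectrum, which is subtracted from the meeting audio frame by frame.

    import numpy as np
    from scipy.signal import stft, istft

    def denoise(meeting_audio, noise_clip, rate=16000):
        # Average noise magnitude per frequency bin, from the voice-free clip
        _, _, noise_spec = stft(noise_clip, fs=rate)
        noise_mag = np.abs(noise_spec).mean(axis=1, keepdims=True)

        # Subtract the noise floor from the meeting audio, clamping at zero
        _, _, audio_spec = stft(meeting_audio, fs=rate)
        clean_mag = np.maximum(np.abs(audio_spec) - noise_mag, 0.0)

        # Re-apply the original phase and invert back to a waveform
        clean_spec = clean_mag * np.exp(1j * np.angle(audio_spec))
        _, clean_audio = istft(clean_spec, fs=rate)
        return clean_audio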

We haven’t changed the design of our project.

A change in the schedule is that the task of mic initialization backend was reassigned to Ellen.

Gantt Chart – combined

Ellen’s Status Report for March 20

I worked on a couple of different things this week. Each day up to Wednesday we were making tweaks and additions to the design report paper, which we submitted on Wednesday.

I finished the (initial) meeting management module. After a fistfight with Python import statements, I was able to interface it properly with the Django models that represent microphone objects.

I also wrote a script, intended to be run after the Django server starts, that will start threads for Cambrea’s UDP server and my transcript generator. The two will communicate with a queue object. New threads will break off to “serve” every item the UDP server puts in the queue.
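In outline (the function names are stand-ins for the real server and generator code), the script looks something like this:

    import threading
    import queue

    audio_queue = queue.Queue()

    def transcript_generator(q):
        while True:
            item = q.get()
            # Break off a new thread to "serve" each item the UDP server queued
            threading.Thread(target=handle_item, args=(item,), daemon=True).start()

    threading.Thread(target=run_udp_server, args=(audio_queue,), daemon=True).start()
    threading.Thread(target=transcript_generator, args=(audio_queue,), daemon=True).start()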

Then for the rest of the week I worked on setting up the web pages for getting around the website. I configured the URLs, wrote the HTML, and made the Django “views” (controlling what HTTP response gets sent, parsing POSTed forms, and that sort of thing). I did that for a bunch of pages, but there are still a couple of pages that need to be added.
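Each page follows roughly the same pattern (the form, template, and URL names here are placeholders): a URL entry plus a view that renders the page on GET and processes the POSTed form.

    from django.shortcuts import render, redirect
    from django.urls import path

    def meeting_view(request):
        if request.method == "POST":
            form = MeetingForm(request.POST)      # placeholder form class
            if form.is_valid():
                form.save()
                return redirect("meeting")
        else:
            form = MeetingForm()
        return render(request, "meeting.html", {"form": form})

    urlpatterns = [
        path("meeting/", meeting_view, name="meeting"),
    ]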

I’m on track in terms of scheduling. I was supposed to start the web page work today (Friday), but I started on Wednesday.

None of my current tasks have due dates before the next status report. Nevertheless, I hope to have all my web pages set up with buttons/forms that perform the intended effect on the backend as well. I will also start the mic initialization backend work, which is a task I took over from Mitchell since it’s so heavily connected to what I’ve already done. And I’ll also be doing the ethics assignment this upcoming week.

Mitchell’s Status Report for March 13

This week I worked on adding audio filtering to the Raspberry Pi, developing the website, and finishing the Design Document. The current version of audio filtering does file I/O; I am looking to upgrade to real-time processing using streams and pyaudio. For the Design Document, I filled out the sections that I implemented and proofread the sections that were written by Ellen.
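As a sketch of the current file-based version (the cutoff frequencies and file names are illustrative, and it assumes a mono recording), the filtering reads a wav, applies a Butterworth band-pass over the speech band, and writes the result back out:

    from scipy.io import wavfile
    from scipy.signal import butter, lfilter

    rate, audio = wavfile.read("input.wav")
    # 4th-order band-pass roughly covering the speech band
    b, a = butter(4, [300, 3400], btype="bandpass", fs=rate)
    filtered = lfilter(b, a, audio)
    wavfile.write("filtered.wav", rate, filtered.astype(audio.dtype))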

Schedule-wise, I am on schedule but need to rework some items. Risk-wise, I need to communicate better with my group members, as I made assumptions and did not make an interface that would connect well. Next week, I am looking to rework the audio filtering on Monday and spend the rest of the time on development. I will also work on connecting components together.

Cambrea’s Status Report for March 13

This week I completed the server/client code to send and receive audio data between the Raspberry Pi and the server. This server code, linked here as AWSServer,

receives data from two separate audio devices and places the input into a queue to be sent out to the other audio device. So, for example, device 1 sends a packet to the server; this packet is put into the queue and is sent to device 2.
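In condensed form (socket handling omitted), the relay logic is just an outgoing queue per device, with each incoming packet dropped onto the other device’s queue:

    import queue

    outgoing = {1: queue.Queue(), 2: queue.Queue()}   # one outgoing queue per device

    def handle_packet(device_id, packet):
        other = 2 if device_id == 1 else 1
        outgoing[other].put(packet)                   # forward to the other device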

The client code is here: RPiClient

This code packages the audio information into a list, which is then turned into a byte string using the Python pickle library so that it can be sent to the server. Pickle is used on the server side as well to unpackage the data. Input audio data is also placed on a queue to be played on the speaker for the users.
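A condensed sketch of the packaging step (the exact fields in the list are illustrative): the audio chunk and its metadata go into a list, the list is pickled to bytes, and the bytes are sent over the socket; the server calls pickle.loads on the other end.

    import pickle
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send_audio(device_id, doa, chunk, server_addr):
        payload = pickle.dumps([device_id, doa, chunk])   # list -> byte string
        sock.sendto(payload, server_addr)

    # Server side mirrors it: device_id, doa, chunk = pickle.loads(payload)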

This week I also worked on the design report document. I have finished a draft of my sections; on Sunday I will revise this writing into a final draft so that we can discuss the document on Monday.

I am on schedule this week. Next week I will make the “Top” file for the Raspberry Pi to connect the client code and the audio I/O functions for the ReSpeaker, and I will continue testing the code as I write it.

Team Status Report for March 13

This week the team members were busily taking care of our individual programming responsibilities. Cambrea was working on networking, Mitchell was working on audio processing and website stuff, and Ellen was working on transcription. We also worked towards completing the design report document that’s due on Wednesday.

The risk we’ve been talking about recently is mismatched interfaces. While we write our separate modules, we have to be aware of what the other members might require from them. We have to discuss the integration of the individual parts and, if we discover that something different is required, we have to be ready to jump in and change the implementation. For example, Ellen made the transcript output a single text file per meeting. However, when Cambrea starts writing the transcript streaming, she might discover that she wants it in a different format; so we just have to recognize that risk and be prepared to modify the code.

Our schedule hasn’t changed, aside from our progress through its tasks.