Team Status Report for 3/6

Our biggest risk remains that sending audio over a socket connection may either not work or not lower latency enough to be used in a recording setting. To manage this risk, we are focusing most of our research efforts on sockets (Jackson and Christy) and synchronization (Ivy). As a contingency plan, our app can still work without the real-time monitoring using a standard HTML form data upload, but it will be significantly less interesting this way.

In our research, we found that other real-time audio communication tools achieve minimal latency using peer-to-peer connections, instead of or in addition to a web server. This makes sense, since going through a server increases the number of transactions, which in turn increases the time it takes for data to be sent. Since a peer-to-peer connection seems to be the only way to get latency as low as we need it to be, we decided on a slightly different architecture for the app. This is detailed in our design review, but the basic idea is that audio will be sent from performer to performer over a socket connection. The recorded audio is only sent to the server when one of the musicians hits a “save” button on the project.
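
As a rough sketch of this idea, assuming WebRTC data channels end up being the peer-to-peer transport (signaling and connection setup are omitted, and the channel name is a placeholder):

```javascript
// Sketch of sending audio peer-to-peer, assuming a WebRTC data channel.
// Connection setup (offer/answer signaling) is omitted for brevity.
const peer = new RTCPeerConnection();
const audioChannel = peer.createDataChannel("audio"); // placeholder channel name

// Once the channel is open, recorded audio chunks go directly to the other
// performer as they arrive; nothing touches the server during the take.
function sendChunk(float32Samples) {
  if (audioChannel.readyState === "open") {
    audioChannel.send(float32Samples.buffer);
  }
}
// Only when a musician hits "save" would the finished recording be uploaded
// to the server (see the upload sketch later in these reports).
```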

Because of this small change in plans, we have a new schedule and Gantt chart, which can be found in our design review slides. The high-level change is that we need more time to work on peer-to-peer communication.

Jackson’s Status Report for 3/6

This week, I spent a lot of time looking into websockets and how they can be integrated with Django. I updated the server on our git repo to work with the “channels” library, a Python websockets interface for use with Django. This required changing the files to behave as an asynchronous server gateway interface (ASGI), rather than the default web server gateway interface (WSGI). The advantage this provides is that the musicians using our application can receive audio data from the server without having to send requests out at the same time. As a result, the latency can be lowered quite a bit.
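
As a rough illustration of what this enables on the browser side, the client can open a single websocket and simply receive audio whenever the server pushes it, with no polling or per-chunk HTTP requests. The `/ws/audio/` route and the `playChunk` helper below are placeholders, not our actual code:

```javascript
// Sketch of a browser client receiving pushed audio over a websocket.
const socket = new WebSocket(`ws://${window.location.host}/ws/audio/`); // placeholder route
socket.binaryType = "arraybuffer";

socket.onmessage = (event) => {
  // Chunks arrive whenever the server-side consumer pushes them.
  const samples = new Float32Array(event.data);
  playChunk(samples); // hypothetical helper that queues the chunk for playback
};

socket.onopen = () => socket.send(JSON.stringify({ join: "session-1" }));
```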

Additionally, I worked pretty hard on our design review presentation (to be uploaded 3/7), which included a lot more research on the technologies we plan to use. In addition to research on websockets, I looked specifically at existing technology that does what we plan to do. One example is an application called SoundJack, which is essentially an app for voice calling with minimal latency. While it doesn’t deal with recording or editing at all, Ivy and I were able to talk to each other on SoundJack with latency around 60ms, far lower than we thought was possible. It does this by sending tiny packets (default is 512 samples) at a time using a peer-to-peer architecture.

We are still on schedule to finish in time. Per the updated Gantt chart, my main deliverable this week is a functional audio recording and playback interface on the web.

Christy’s Status Report for 2/27

After realizing that Django might not be the proper framework for implementing real-time audio streaming, I have been looking at Node.js. Node.js seems to have many npm plugins that support real-time audio sending. I have been learning the basics of Node.js, since I do not have any experience with it.

Ivy’s Status Report for 2/27

This week I presented our project proposal and did further research into the synchronization issue. This remains my biggest concern: being able to sync in real time vs. only syncing after the tracks are recorded will greatly affect how our project is constructed. I want to know the advantages and technological limits of both approaches ASAP, so we can decide which one to focus on moving forward.

That said, I’ve found a partial solution in the web app SoundJack. The application can control the speed and number of samples that are sent over to other users, which allows users to have some control over the latency and greatly stabilize their connection. It calculates and displays the latency to the user, so they can make the appropriate adjustments to decrease it. Users can then assign multiple channels to mics and choose what audio to send to each other via buses.

One coincidental advantage of this is that, because we will be taking care to reduce latency during recording, the finished tracks will not need many adjustments to be completely on-beat. Still, where this solution falls short is that the latency will either have to be compounded with multiple users in order for real time to keep up with digital time, or other users will hear an ‘echo’ of themselves playing. Additionally, the interfaces of all the programs (SoundJack, Audiomovers) I’ve looked into are pretty complicated and hard to understand. One common complaint I’ve seen in comments on YouTube guides is that they make sound recording more engineering-focused than music-making-focused. Perhaps our algorithm could do these speed and sample adjustments automatically, to take the burden off of the user.

Furthermore, in these video guides, the users rely on some sort of hardware device so that they are not dependent on a wifi connection, unlike our project, which assumes users will be on wifi. So far, I’ve only read documentation and watched video guides. Since SoundJack is free software, I want to experiment with it in our lab sessions on Monday and Wednesday.

I completed the Django tutorial and have started on the metronome portion of the project. I have a good idea of what I want this part of our project to look like; however, I have less of an idea of what exactly the metronome’s output should be. One thing I know for sure is that, in order to mesh with our synchronization output, there needs to be a defined characteristic in the wave where the beat begins. I also think that, because some people may prefer to have a distinguishing beat at the beginning of each measure, we need to take that into account when synchronizing.
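
To make this concrete, here is a rough sketch of one possible metronome output: a one-measure buffer with a sharp onset at every beat and a louder, higher-pitched accent on beat one. The frequencies, durations, and amplitudes are placeholders, not final design choices:

```javascript
// Sketch: build one measure of metronome audio with an accented downbeat.
function makeMeasureBuffer(audioCtx, bpm = 120, beatsPerMeasure = 4) {
  const samplesPerBeat = Math.round(audioCtx.sampleRate * 60 / bpm);
  const buffer = audioCtx.createBuffer(1, samplesPerBeat * beatsPerMeasure, audioCtx.sampleRate);
  const data = buffer.getChannelData(0);
  const clickLength = Math.round(audioCtx.sampleRate * 0.005); // ~5 ms click

  for (let beat = 0; beat < beatsPerMeasure; beat++) {
    const start = beat * samplesPerBeat;
    const freq = beat === 0 ? 1760 : 880; // accented downbeat an octave higher (placeholder pitches)
    const amp = beat === 0 ? 1.0 : 0.6;
    for (let i = 0; i < clickLength; i++) {
      // The abrupt jump from silence to a short sine burst gives each beat
      // a well-defined onset that a synchronization algorithm can latch onto.
      data[start + i] = amp * Math.sin(2 * Math.PI * freq * i / audioCtx.sampleRate);
    }
  }
  return buffer; // loop this with an AudioBufferSourceNode to play the metronome
}
```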

Team Status Report for 2/27

Since last week, we have gained familiarity with audio processing on the web, and we know how we’re going to implement all in-browser operations. The biggest remaining challenge is streaming the audio data to and from the server. The most significant risk is that we will not be able to do this in real-time, and thus the recording musicians will not be able to hear each other. As we said last week, this is something we would love to have, but if we can’t figure it out, there are other solutions (e.g. uploading the audio after the whole track is recorded, or asynchronously but in larger chunks than would be workable for real-time). We need to decide on this early on, as it will affect many other facets of our project going forward.

We haven’t made any big changes in our implementation, but we’re able to be more specific now about how things are going to get done. For example, the click track can be created using an audio buffer similar to the white noise synthesizer in Jackson’s status report. This way, the clicks can be played to recording musicians the entire time they are recording. Better yet, no server storage is needed for this. A buffer containing a click sound can be looped at a certain tempo (i.e. every quarter note), so only a small buffer is needed.
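
As a rough sketch of that idea (the tempo, click length, and click shape below are placeholders rather than final design choices):

```javascript
// Sketch: a looped one-beat buffer produces a steady click track with no server storage.
function startClickTrack(audioCtx, bpm = 120) {
  const secondsPerBeat = 60 / bpm;
  const buffer = audioCtx.createBuffer(1, Math.round(audioCtx.sampleRate * secondsPerBeat), audioCtx.sampleRate);
  const data = buffer.getChannelData(0);

  // The click itself only fills the first ~10 ms of the buffer; the rest is silence.
  const clickLength = Math.round(audioCtx.sampleRate * 0.01);
  for (let i = 0; i < clickLength; i++) {
    data[i] = (Math.random() * 2 - 1) * (1 - i / clickLength); // decaying noise burst as the "click"
  }

  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.loop = true; // looping the one-beat buffer yields a click every quarter note
  source.connect(audioCtx.destination);
  source.start();
  return source; // call .stop() on this to end the click track
}
```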

Since there aren’t any big changes to report, our initial schedule is still valid. That said, it is possible we will get the in-browser recording and click track aspects to work much sooner than anticipated, in which case we’ll have far more time to work on the challenges mentioned above (uploading and downloading audio data chunks in real-time). Research into websockets is still needed, and remains a critical part of our desired implementation.

Jackson’s Status Report for 2/27

This week, I familiarized myself with the Web Audio API. This is crucial to our project since it deals with in-browser audio processing, and none of us have experience with that sort of thing. Though I’ve done a lot of audio DSP, I’ve never used the web APIs meant to do that. What I’ve learned is that it’s actually quite simple, and fairly similar to Nyquist, though a lot less elegant. Audio elements such as sound sources, oscillators, filters, gain controls, etc. are all built into your browser already, and all you have to do is route them all properly.

Following a number of different tutorials, I made a few different small in-browser audio applications. I ran them on a Django server I created with a single view pointing to a url, which loads an html file. This html file loads some javascript, which is where I did most of my work. This post contains some code snippets of interest that make up a small portion of code that I wrote this week.

Firstly, I created a small synthesizer, which generates white noise on a button press with a simple for loop:
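
Roughly, the core of it looks like the sketch below (the button ID, buffer length, and gain value are placeholders):

```javascript
// Minimal white noise synthesizer sketch using the Web Audio API.
const audioCtx = new (window.AudioContext || window.webkitAudioContext)();

document.querySelector("#noise-button").addEventListener("click", () => {
  // Two seconds of mono audio at the context's sample rate.
  const buffer = audioCtx.createBuffer(1, audioCtx.sampleRate * 2, audioCtx.sampleRate);
  const channelData = buffer.getChannelData(0);

  // Fill the buffer with random samples in [-1, 1): white noise.
  for (let i = 0; i < channelData.length; i++) {
    channelData[i] = Math.random() * 2 - 1;
  }

  // Route: buffer source -> gain -> speakers.
  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  const gainNode = audioCtx.createGain();
  gainNode.gain.value = 0.25;
  source.connect(gainNode);
  gainNode.connect(audioCtx.destination);
  source.start();
});
```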

The channelData is connected to an audio buffer called “buffer”, which is routed to a gain node before being sent to your computer’s audio output. Generating a click track (which will be part of our final product) can be done in a similar way.

The white noise can be filtered in real time by inserting a biquad filter in between the buffer and the gain node. The syntax for this is very simple as well:
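
A sketch of that routing, reusing the source and gain node from above (the filter type, cutoff, and Q are just example values):

```javascript
// Insert a biquad filter between the noise source and the gain node.
const filter = audioCtx.createBiquadFilter();
filter.type = "lowpass";        // lowpass, highpass, bandpass, etc.
filter.frequency.value = 1000;  // cutoff frequency in Hz
filter.Q.value = 1;             // resonance

// New route: buffer source -> filter -> gain -> speakers.
source.connect(filter);
filter.connect(gainNode);
gainNode.connect(audioCtx.destination);
```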

With some Web Audio API basics down, I moved on to the main thing our application needs to do in-browser, which is audio recording and playback. I created a small recorder, which saves CD quality audio in chunks (in real time), and after a set amount of time, it stops recording, creates a blob containing the recorded audio, and plays the audio back to the user. This recording interface was my deliverable this week, so I would say we are still on schedule.
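
A rough sketch of that record/stop/play-back flow is below, here using MediaRecorder; the actual recorder may instead capture raw PCM through the Web Audio API for CD-quality audio, and the chunk interval and duration are placeholders:

```javascript
// Sketch: record from the microphone in chunks, then play the result back.
async function recordAndPlayBack(seconds = 5) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks = [];

  // Audio data arrives chunk by chunk while recording is in progress.
  recorder.ondataavailable = (e) => chunks.push(e.data);

  recorder.onstop = () => {
    // Combine the chunks into a single blob and play it back to the user.
    const blob = new Blob(chunks, { type: recorder.mimeType });
    const audio = new Audio(URL.createObjectURL(blob));
    audio.play();
  };

  recorder.start(250);                        // emit a chunk every 250 ms
  setTimeout(() => recorder.stop(), seconds * 1000); // stop after a set amount of time
}
```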

For next week, I will integrate this recorder with the Django server we’re using for the final product, and hopefully get the recorded audio sent to the server. More than likely, I will write a new one from scratch, with the information I’ve learned this week.

Christy’s Status Report for 2/20

For this week, our team worked on creating the proposal presentation. I created a GitHub repo for our team where we can organize and update our code.

Our main concern for this project is how we are going to send audio in real time using the Django web application platform. I searched the internet for websites that use real-time audio streaming technology and found one ( https://livevoice.io/en ). I tested its functionality to get a sense of how real-time audio streaming works in practice.

Another concern is how we are going to display audio streaming from multiple users at the same time. Streaming by a single user is common in web development; for example, YouTube allows one user to stream live video that is shared with other users. However, it is difficult to allow multiple users to stream live video/audio on a website. Streaming by multiple users on one screen is usually found in desktop software applications, such as Zoom, Skype, and Facebook Messenger. I researched the feasibility of live-streaming audio on a website, but I did not get a clear picture of how we will do this.

The plan for next week is to do more research on the real-time functionality of our web app, and hopefully to do a basic setup of our website.

Team Status Report for 2/20

One of the most significant risks in our project is our reliance on WebSockets to deliver packets of audio, and our expectation that Django will work kindly with the packets. We know from experience that Django has built-in measures to handle file uploads in very particular ways (for security purposes), and sending audio data packets with WebSockets to a Django web app server is not something there is much information about online. To add to this, none of us have prior experience using WebSockets. In order to mitigate this risk, we plan to do some experimentation with WebSockets before we fully specify the design for our project, to determine if this approach is feasible. As a contingency plan, if sockets cannot be used, we know for sure that the audio can be recorded in-browser and sent afterwards to the server as a complete file in a multipart form-data upload. However, there are many drawbacks to this approach. It would require a large file upload all at once (CD-quality WAV files get large very quickly), but perhaps more importantly, there is no possibility for audio to be streamed “live” to other musicians or spectators.
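
For reference, the browser side of that contingency upload is straightforward. A minimal sketch (the URL and form field name are placeholders, and a real Django view would also require a CSRF token header):

```javascript
// Contingency sketch: upload the finished recording as a multipart form-data POST.
async function uploadRecording(blob) {
  const form = new FormData();
  form.append("audio", blob, "recording.wav"); // placeholder field and file names
  const response = await fetch("/upload/", {   // placeholder URL; Django would also expect a CSRF token
    method: "POST",
    body: form,                                // the browser sets the multipart boundary automatically
  });
  return response.ok;
}
```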

The only major structural changes made to our design this week have been narrowing the scope of the project from our initial abstract as we created our project proposal slides. We have set more specific guidelines for the project, which can be found on slide 2 of the proposal slides. Our schedule has not changed, since the proposal slides contain the first draft of our timeline (slides 11 and 12).

Aside from that, we have each done our share of research for our specific areas of the project: Jackson in audio processing and file uploads, Ivy in synchronization and Django, and Christy in UI and audio visualization.

Jackson’s Status Report for 2/20

This week, I worked on the Project Proposal slides, set up the WordPress site with the required pages and formatting, and familiarized myself with some of the technologies we plan to use for our project.

Specifically, I found a JavaScript audio processing interface called Web Audio API, which we plan to use to handle all in-browser audio operations. This includes recording, basic DSP, and some UI components as well like displaying waveforms and spectrograms. I’ve followed a few tutorials on the Web Audio API, since I’m fairly familiar with audio DSP, but not as much with web programming.

In addition to the Web Audio API, I’ve also started experimentation with Python audio processing libraries from this tutorial which will help with any necessary audio manipulation on the server side. Since the main challenges involved in our project are timing-related, server side audio processing will likely not be as important as in-browser manipulation, but some basic processing will of course be necessary.

Right now we are on schedule, though the project is still in the design stage. We need to familiarize ourselves with the specific technologies (like web audio programming in my case) before we can reasonably plan out exactly how the project will be done, and I have made good progress with that.

In the next week, I hope to have some kind of working audio recording app in-browser using the Web Audio API, which can be converted into a Django app for our final project. We also will have received feedback on our project proposal, so we will likely have a more concrete idea of exactly what our project needs to do, and we’ll make any necessary adjustments.

Ivy’s Status Report for 2/20

During the week, we worked on the Project Proposal presentation as a team, and set up the WordPress site and Gantt chart to lay out our workflow and track our progress.

Furthermore, I wrote up an outline for the oral presentation of our proposal here. One of the major concerns we have going into this project is figuring out a method to synchronize individual tracks. In our initial research, we’ve come across some papers (Carnegie Mellon Laptop Orchestra, Dannenberg 2007) and programs (ReWire, Audiomovers) that aim to do something similar. The former gives us an idea of how to sync performances to a click track, but we hope to sync performances as they are being played live as well. I will look further into the commercially available options this weekend.

We will be using Django to build our website. Since I have not used that before, I’ve been following a small tutorial that’ll hopefully get me familiarized with its functionality and interface.

Our design hasn’t changed much from our abstract, but we’ve added some more methods of testing our final product’s viability, including testing for security, website traffic, and qualitative feedback from its users.