Team Status Report for 3/27

This week, we’ve made a lot of progress on the monitoring aspect of our project. As outlined in Jackson’s status report, websocket connections between each user and the server are working, as well as WebRTC peer-to-peer connections. Though audio is not sent from browser to browser quite yet, simple text messages work just fine, and sending audio will be completed by next week. In addition, the click track is working, and the user interface has seen big improvements.

There are a few major changes to our design this week:

  1. To allow multiple asynchronous websocket connections to be open at once, we had to add a Redis server. This runs alongside the Django server and communicates with it over port 6379 (the default Redis port). This setup was based on the tutorial in the Channels documentation, though that tutorial uses Docker to run the Redis server, while I just run Redis directly in a terminal. This change doesn’t have much of a trade-off; it’s simply a necessary addition that allows asynchronous websocket channels to access the database.
  2. We have decided to use WebRTC for peer-to-peer connections. In our design review, we planned to use TCP, since it gives a virtually 0% packet loss rate. The cost of using TCP is significant, though: its retransmissions and ordering guarantees add latency, which makes it impractical when timing matters as much as it does with music. WebRTC is designed for real-time communication, and particularly for media streaming between users. It sends data over UDP with the lowest possible latency, and it’s built into the user’s browser. The only real cost is that we now have to worry about packet loss. But for something like music, where timing is so critical, we’ve decided that meeting the latency requirement (<100 ms) is far more important than the packet loss requirement (<5%).

Since the WebRTC connection is working, the risk of ending up with no monitoring at all is largely behind us. However, we do have a number of other big risks. As I see it, our biggest risk right now is that none of us really knows how to build the DAW user interface we showcased in our design review. Visualizing audio as you record, and editing the way you can in industry DAWs like Pro Tools, Audacity, or Ableton, is going to be a challenge. To mitigate this risk, we will put extra work into the UI in the coming weeks, and if that falls short, we can still expect to deliver a good low-latency rehearsal tool for musicians even without the editing functionality.


Jackson’s Status Report for 3/27

I worked a lot on many different parts of the project in the past two weeks, and since there was no status report due last week, I’ll write about both weeks. Forgive me, it’s a long one!

Last week, I worked a lot on the server code, setting up Django models for groups, tracks, and recordings. A logged-in user can create a group and add other users to the group, so all group members can edit/record with their group. In each group, members will be able to add audio tracks, which can consist of multiple recordings. To help with group creation, I made a simple homepage:

The forms check the validity of the user’s inputs and display error messages if necessary. For example, if you don’t enter a group name, you get this message at the top of the screen:

After you create a group, the server creates a group model for you, and sends you to the corresponding group page that I made:

It looks a little rough right now, but it’s mainly for testing features at the moment. Eventually, this will contain the DAW interface we had in our design review. But this screen is enough to explain what I’ve been working on this week:

At the top, the online users are displayed, along with a “Monitor” button. Clicking this button will let you hear the other users in real time. That hasn’t been fully implemented yet, but I think I got it most of the way there. Here’s what actually happens right now (a rough JavaScript sketch of the whole flow follows the list):

  1. Before the button is even clicked, as soon as the page is loaded, a websocket connection is opened with the Django (ASGI) server. Simultaneously, your browser creates an SDP (Session Description Protocol) “offer” to connect with other users, which contains all of your computer’s public-facing IP address/port combinations (aka ICE candidates) that other computers can use to connect to you. This is needed so that peer-to-peer connections can be established, since your private IP address/port can’t be reached directly from outside your local network.
  2. When the button is clicked, your SDP offer from step 1 gets sent to the server over the websocket connection, and the server relays this offer to every other online user in the group via their own websocket connections.
  3. When you receive an SDP offer from another user via your websocket connection, your browser generates an SDP “answer,” which is very similar to the SDP offer from step 1. The answer is then sent automatically back to the server via your websocket connection, and then forwarded to the user who sent you the offer via their websocket connection.
  4. When you receive an SDP answer back from a user you’ve requested to connect with, a peer-to-peer connection is finally established! I chose WebRTC, which is essentially a very low-latency way to send data over UDP, intended for media streaming.
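To make these steps concrete, here’s a rough JavaScript sketch of the signaling flow as I understand it. This is not our actual code: the websocket URL, the message format, and all the names are placeholders, ICE candidate exchange is simplified to a single relay message, the offer is created on click rather than at page load to keep the example short, and connecting more than two users would need one RTCPeerConnection per peer.

```javascript
// Hypothetical sketch of the offer/answer handshake over a websocket.
// URL, message shapes, and names are illustrative only.
const signalingSocket = new WebSocket("wss://example.com/ws/group/some-room-key/");
const peerConnection = new RTCPeerConnection({
  iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
});

// The data channel used for the simple text messages that work today.
const dataChannel = peerConnection.createDataChannel("chat");
dataChannel.onmessage = (e) => console.log("peer says:", e.data);
// If the other side created the channel, we receive it here instead.
peerConnection.ondatachannel = (e) => {
  e.channel.onmessage = (msg) => console.log("peer says:", msg.data);
};

function send(message) {
  signalingSocket.send(JSON.stringify(message));
}

// Relay ICE candidates through the server as they are discovered.
peerConnection.onicecandidate = (e) => {
  if (e.candidate) send({ type: "candidate", candidate: e.candidate });
};

// Steps 1 and 2: create an SDP offer and send it when "Monitor" is clicked.
async function onMonitorClick() {
  const offer = await peerConnection.createOffer();
  await peerConnection.setLocalDescription(offer);
  send({ type: "offer", sdp: offer });
}

// Steps 3 and 4: answer incoming offers, and accept incoming answers.
signalingSocket.onmessage = async (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === "offer") {
    await peerConnection.setRemoteDescription(msg.sdp);
    const answer = await peerConnection.createAnswer();
    await peerConnection.setLocalDescription(answer);
    send({ type: "answer", sdp: answer });
  } else if (msg.type === "answer") {
    await peerConnection.setRemoteDescription(msg.sdp); // connection established
  } else if (msg.type === "candidate") {
    await peerConnection.addIceCandidate(msg.candidate);
  }
};
```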

Right now, handshaking works perfectly, connections can be established successfully between 2 users, and you can send string messages from one browser to the other. To ensure that it works, I connected two different browsers (using the monitor button) and then shut down the server completely. Even after the server gets shut down, the browsers can still send messages back and forth over the WebRTC data channel! All that remains on the monitoring front is sending an audio stream rather than text messages, and connecting more than 2 users at once. These two things are the deliverables I plan to have completed by next week’s status report.
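For the audio piece, my current plan (sketched below, not yet working code) is to capture the microphone with getUserMedia and attach its tracks to the same peer connection, then play whatever track the peer sends back:

```javascript
// Planned next step (not yet implemented): send mic audio over the
// existing peer connection instead of text messages.
async function startMonitoringAudio(peerConnection) {
  // Capture the microphone; disabling processing keeps the signal predictable.
  const localStream = await navigator.mediaDevices.getUserMedia({
    audio: { echoCancellation: false, noiseSuppression: false },
  });

  // Adding tracks after the connection is up triggers renegotiation,
  // so another offer/answer round trip is needed.
  for (const track of localStream.getAudioTracks()) {
    peerConnection.addTrack(track, localStream);
  }

  // Play the remote user's audio as soon as it arrives.
  peerConnection.ontrack = (event) => {
    const audioElement = new Audio();
    audioElement.srcObject = event.streams[0];
    audioElement.play();
  };
}
```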

This probably goes without saying, but I didn’t have any idea how websockets, handshaking, or peer-to-peer connections worked two weeks ago. I’ve learned quite a lot from many online tutorials, I’ve made a number of practice Django apps, and I’ve been working well over 12 hours/week, probably closer to 24. Despite all this, I am a little bit behind, because I didn’t know just how complicated sending audio over a peer-to-peer connection would be. To catch up, I’ll continue putting a lot of time into this, since it’s probably the main functionality of our capstone project and has to get done. The recording aspect will either have to be pushed back, or I will need help from my other group members. That said, I’m more familiar with how recording works, and I don’t think it’s quite as difficult as the monitoring aspect.

Team Status Report for 3/20

This week, we have been working on implementing the main page and the group page. The main page is where a user decides to create a new group or join an existing group. Jackson worked on both the front end and the back end of this page. We decided to make each group accessible by a private room key, stored as a UUID, which the user can get from the group’s URL. Only people who know the room key can access the group and start recording together.

Once people join the room, concurrent recording comes into play: members will record together, in sync. We are still working on implementing the UI and back end of this page.

Christy’s Status Report for 3/20

This week, I have been working on group page features. Based on the Group model created by Jackson, the group page shows who the members of the group are. In addition, I implemented a basic UI for the metronome, which lets users figure out an appropriate BPM to sing at.

There are still things to be determined for the group page. We first need to assign a group leader, who will be responsible for adjusting the tempo and leading the recording process.

Christy’s Status Report for 3/13

This week, I focused on implementing the group formation functionality, which enables a user to create a group or join an existing group.

For next week, I will focus on implementing the basic UI.

Jackson’s Status Report for 3/13

This week I gave the design review presentation, and worked on code for recording, storing, and playing back audio on the web.

The design review process included writing detailed notes for myself with talking points for each slide, and after receiving the feedback, I’ve begun thinking about ideas for what to add to our design review paper, which is a main deliverable for this coming week. In particular, we were told that our requirements are not use case driven. When we drafted our requirements, I thought we just had to detail what the final product for the class would need in order to be considered successful, and I think we did that. However, after receiving our feedback, it seems I may have had a bit of a misunderstanding about what exactly counts as a requirement. I’m still not quite sure what exactly makes a requirement use case driven, so this will be a point to talk about in our meetings this week.

For our actual project, I finished code this week which uses the browser’s built-in audio APIs to grab audio from any audio input on the computer (it even works with my USB interface!), record it in chunks using a JavaScript MediaRecorder object, store the recording under its own URL, and play it back to the user using an HTMLMediaElement object. To accomplish this, I created an FSM-like structure that switches between three modes: STOPPED, RECORDING, and PLAYING.
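To give a sense of the structure, here’s a stripped-down sketch of that state machine; the element lookup, chunk size, and names are illustrative rather than the exact code:

```javascript
// Rough sketch of the STOPPED / RECORDING / PLAYING state machine.
// Names and the chunk interval are illustrative, not the exact code.
let mode = "STOPPED";
let mediaRecorder = null;
let chunks = [];
const player = document.querySelector("audio"); // an HTMLMediaElement

async function startRecording() {
  if (mode !== "STOPPED") return;
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  chunks = [];
  mediaRecorder = new MediaRecorder(stream);
  mediaRecorder.ondataavailable = (e) => chunks.push(e.data); // record in chunks
  mediaRecorder.onstop = () => {
    // Store the recording under its own URL and hand it to the player.
    const blob = new Blob(chunks, { type: "audio/webm" });
    player.src = URL.createObjectURL(blob);
  };
  mediaRecorder.start(100); // emit a chunk every 100 ms
  mode = "RECORDING";
}

function stopRecording() {
  if (mode !== "RECORDING") return;
  mediaRecorder.stop();
  mode = "STOPPED";
}

function playBack() {
  if (mode !== "STOPPED" || !player.src) return;
  player.onended = () => (mode = "STOPPED");
  player.play();
  mode = "PLAYING";
}
```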

My biggest deliverables for this week were the design review and a working recording interface, and since those are both done, I am on schedule. My deliverables for next week will be the design review paper and monitoring over websockets.

Team Status Report for 3/13

This week, we began working on the basic functionality of our website, implementing the recording function, some basic UI, and the click track generator.

Because we know little about networking, our biggest concern is still implementing the real-time monitoring. A good question about this was raised during our Design Review presentation, where someone asked about alternatives to our socket solution in case it does not meet our latency requirement. This is a valid concern we had not considered; even though the other tools we tried out could reduce the latency below 100 ms, we might be limited by websockets and might not be able to do the same. One alternative is requiring the use of Ethernet, which might speed up a user’s connection enough to meet this requirement, but we are not sure that alone would be enough.

Ivy’s Status Report for 3/13

This week I finished implementing the click track. I settled on creating the click track in Python. To do this, I used the playsound module and a sleep call to play a short .wav file after a delay based on the tempo and beats per measure.

This raised some issues, however, as the sleep call is not very accurate. While testing, I found that the beats consistently had up to 150 ms of extra delay. To improve on this, I created a separate clock which I can initialize to run based on the inputted tempo.

The meter/beats-per-measure-dependent click track UI was much harder than I thought. I only knew some basic HTML going into this, so it took a while to figure out how to fetch variables from other elements on the webpage. Even now I’m not so sure it will fit with the rest of the UI; since I’m unsure of the dimensions of our actual site, I built it out of <div>s. I’m a little behind right now, as I have yet to merge my code with the current version on GitHub, but I will get that done by our lab meeting on Monday (or just after, should I end up with questions), and will begin working on track synchronization then.
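For what it’s worth, the part that took the longest, fetching values from other elements on the page, boils down to a few lines of JavaScript like the following (the element IDs here are placeholders, not our actual markup):

```javascript
// Read the click track settings out of other elements on the page.
// Element IDs are placeholders, not our actual markup.
function getClickTrackSettings() {
  const beatsPerMeasure = parseInt(document.getElementById("beats-per-measure").value, 10);
  const beatValue = parseInt(document.getElementById("beat-value").value, 10);
  const tempo = parseFloat(document.getElementById("tempo").value); // beats per minute
  return { beatsPerMeasure, beatValue, tempo };
}
```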

Our biggest concern is the networking aspect of our project. We are not too knowledgeable about networking, and as the concern was raised to us during the Design Review presentation, we aren’t too sure if our proposed socket solution will even meet our requirements.


Ivy’s Status Report for 3/6

This week, I worked on the design review presentation with the rest of the team. I created this tentative design of the UI for our web app’s main page, where users will be recording and editing their music together.

In the beginning of the week, Jackson and I tested out SoundJack and found we could communicate with one another with a latency of about 60 ms. This was much better than either of us was expecting, so using its approach (adjusting the packet size to reduce latency, and the number of packets sent/received to improve audio quality) as a basis for our user-to-user connection seems like a good idea. But instead of manual adjustments, which can become really complicated with more than two people, I will be creating an automatic function that takes all the users’ connectivity into account and sets the buffer parameters based on that.

We have settled a major concern of our project: we will focus on reducing the real-time latency so that users can hear each other live, and then synchronize their recordings afterwards. We have updated our Gantt chart to reflect this.

My first task will be to create the click track generator. To begin, I created an HTML/CSS form which will send the beats-per-measure, beat value, and tempo variables to the server when the user sets them and clicks the ‘test play’ button. A function will then generate a looped click sound from this information and play it back to the user. For the latter, I’m still not sure whether the sound should be created with a Python DSP library or the Web Audio API. Further research is needed, but I imagine the two implementations won’t be too different, so I should be able to get the click track generator functioning by 3/9, the planned due date for this deliverable.
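If we go the Web Audio API route, a minimal in-browser sketch of the generator could look like the following. This is an assumption rather than settled code: the function name and parameters are hypothetical, and it schedules clicks against the AudioContext clock, which is much more precise than timer-based delays.

```javascript
// Hypothetical Web Audio click track: schedules short oscillator "clicks"
// against the AudioContext clock rather than relying on timer delays.
function playClickTrack(tempo, beatsPerMeasure, measures = 4) {
  const ctx = new AudioContext();
  const secondsPerBeat = 60 / tempo;

  for (let beat = 0; beat < beatsPerMeasure * measures; beat++) {
    const osc = ctx.createOscillator();
    const gain = ctx.createGain();
    // Accent the downbeat of each measure with a higher pitch.
    osc.frequency.value = beat % beatsPerMeasure === 0 ? 1000 : 800;
    osc.connect(gain);
    gain.connect(ctx.destination);

    const start = ctx.currentTime + beat * secondsPerBeat;
    gain.gain.setValueAtTime(1, start);
    gain.gain.exponentialRampToValueAtTime(0.001, start + 0.05); // short decay, no pop
    osc.start(start);
    osc.stop(start + 0.05);
  }
}

playClickTrack(120, 4); // e.g. 120 BPM in 4/4
```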