This week, my task was measuring and possibly improving latency and packet loss rate.
Latency
Measuring latency, like every other part of this project so far, is a much more complex task than I initially thought. For one, the term “latency” is a bit ambiguous. There are multiple different measurements this could mean:
- “End-to-end delay” (E2E) or “one-way delay”
 This is the time it takes from the moment data is sent by one client to the moment that data is received by another. Since this relies on perfect synchronization between the clocks of each client, this could be difficult to measure.
- “Round-trip time” (RTT)
 This is the time it takes from the moment data is sent by one client to the moment that data is received back by the same client. This measurement only makes sense if the remote computer returns the data it receives.
These measurements are certainly not meaningless, but in an audio system, neither one is all that great. To explain this, it’ll be helpful to first go over the signal path again. This is what it looks like when client A is sending audio to client B (very simplified for the purposes of this discussion):
1. Client A makes a sound.
2. The sound is sampled and converted to a digital signal.
3. The browser converts this signal to a MediaStreamTrack object.
4. WebRTC groups samples into packets.
5. These packets are sent to client B.
6. On client B’s computer, WebRTC collects these packets and places them into a jitter buffer.
7. Once the jitter buffer has accumulated enough samples to make smooth playback possible, the samples in the jitter buffer are played back, allowing client B to finally hear client A’s sound.
Steps 1-5 take place on client A’s computer, and steps 6-7 take place on client B’s computer.
The first thing to notice is that this communication is one-way, so RTT doesn’t make sense here. E2E delay must be approximated. We can do this as described in the design review. Client A sends a timestamp to client B. Client B then compares this timestamp to their current time. The difference between these times is a good approximation for E2E delay, provided the clocks of the two computers are synchronized very closely. Luckily, JavaScript provides a function, performance.now(), which returns the elapsed time in milliseconds since the time origin. Time origins can be synchronized by using performance.timing.navigationStart. With that, the E2E delay is fairly easy to get.
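The timestamp comparison above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the helper names absoluteNowMs and estimateOneWayDelayMs are made up, and it uses performance.timeOrigin (the newer counterpart of performance.timing.navigationStart) as an assumption. The result is only as accurate as the clock synchronization between the two machines.

```javascript
// Hypothetical helpers, not taken from the project.
// Assumption: performance.timeOrigin stands in for
// performance.timing.navigationStart as the time origin.

// Absolute time in milliseconds: time origin plus elapsed time since it.
function absoluteNowMs(perf = performance) {
  return perf.timeOrigin + perf.now();
}

// Client B calls this with the timestamp that client A sent.
// The difference approximates the one-way (E2E) delay, provided
// both clocks are closely synchronized.
function estimateOneWayDelayMs(sentAtMs, receivedAtMs = absoluteNowMs()) {
  return receivedAtMs - sentAtMs;
}
```

Client A would send the value of absoluteNowMs() along with its data, and client B would pass that value to estimateOneWayDelayMs() as soon as it arrives.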
But as you can see, E2E delay only measures the time from step 5 to step 6. Intuitively, we want to measure the time from step 1 to step 7, that is, the amount of time from the moment a sound is made by client A to the moment that sound is heard by client B. This is the real challenge. The browser doesn’t even have access to the time from step 1 to step 3, since these happen in hardware or in the OS, so these are out of the picture. Steps 3 to 5 are done by WebRTC with no reliable access to the internals, since these are implemented differently in every browser, and poorly documented at that. As mentioned, we can approximate step 5 to step 6, the E2E delay. All that’s left is step 6 to step 7, and luckily, WebRTC gives this to us through its (also poorly documented) getStats API. After some poking around, I was able to find the size of the jitter buffer in seconds. This time can be added to the E2E delay, giving us the time from step 5 to step 7, and this is probably the best we can do.
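For illustration, here is one way the jitter buffer time could be read from a getStats report. The post doesn't say which field was actually used, so this sketch assumes the standard inbound-rtp fields jitterBufferDelay (cumulative seconds) and jitterBufferEmittedCount, and divides one by the other to get an average. The function names are hypothetical.

```javascript
// Hypothetical helpers, not taken from the project.
// Assumption: the jitter buffer time comes from the "inbound-rtp" audio
// stats, as jitterBufferDelay (total seconds spent in the buffer)
// divided by jitterBufferEmittedCount (samples that left the buffer).
function averageJitterBufferDelayMs(report) {
  for (const stats of report.values()) {
    if (
      stats.type === "inbound-rtp" &&
      stats.kind === "audio" &&
      stats.jitterBufferEmittedCount > 0
    ) {
      const seconds = stats.jitterBufferDelay / stats.jitterBufferEmittedCount;
      return seconds * 1000; // convert to milliseconds
    }
  }
  return null; // no usable audio stats in this report
}

// Step 5 to step 7: network delay plus time spent in the jitter buffer.
function estimateStep5To7Ms(oneWayDelayMs, jitterBufferMs) {
  return oneWayDelayMs + jitterBufferMs;
}
```

In the browser, the report would come from `await pc.getStats()`, where pc is the receiving RTCPeerConnection.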
So is the latency good enough? Well, maybe. On my local computer, communicating only from one browser to another, latency times calculated in this way are around 5ms, which is very good (as a reminder, our minimum viable product requires latency below 100ms). But this isn’t very useful without first deploying to the cloud and testing between different machines. Cloud deployment is not something I am tasked with, so I’m waiting on my teammates to do this. Per our initial Gantt chart, this should have happened weekly for the past few weeks. As soon as it does, we can interpret the latency test I implemented this week.
Packet Loss Rate
Unlike latency, packet loss rate is actually very simple to get. The number of packets sent and the number of packets dropped can both be found using the WebRTC getStats API, the same way I got the size of the jitter buffer. Interpretation of this is again dependent on cloud deployment, but locally the packet loss rate is almost 0%, and I never measured anything above 5% (our threshold for the minimum viable product).
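As a rough sketch, a loss rate could be computed from a getStats report like this. It assumes the receiver-side inbound-rtp fields packetsLost and packetsReceived, which may not be the exact counters used in the project, and the function name is made up.

```javascript
// Hypothetical helper, not taken from the project.
// Assumption: loss is computed on the receiving side from the
// "inbound-rtp" audio stats (packetsLost and packetsReceived).
function packetLossRate(report) {
  for (const stats of report.values()) {
    if (stats.type === "inbound-rtp" && stats.kind === "audio") {
      const expected = stats.packetsLost + stats.packetsReceived;
      if (expected <= 0) return null; // nothing received yet
      // packetsLost can be reported as negative when duplicates arrive,
      // so clamp the rate at zero.
      return Math.max(0, stats.packetsLost / expected);
    }
  }
  return null; // no audio stats in this report
}
```

A return value of 0.05 would correspond to the 5% threshold mentioned above.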
Schedule and Deliverables
As with last week, I am pretty clearly behind, despite working well over 12 hours again this week. At this point, we have used most of our slack time, and it may be smart to think about shrinking our minimum viable product requirements, particularly those dealing with the UI. If we don’t have a DAW-like UI by the end of the course, we will at the very least have a low-latency multi-peer audio chatting app that musicians can use for practicing.
For next week, I will work on polishing what we have before the interim demo, completing the ethics assignment, and possibly UI and cloud deployment if needed. Most of the remainder of our project relies on cloud deployment getting done, and a better UI, so it’s hard to pick a better next step.
One Reply to “Jackson’s Status Report for 4/10”