HACKER Q&A
📣 singerislonely

Low-latency, ensemble-oriented video conferencing?


It seems all popular video conferencing software has really noticeable lag, sometimes a round trip of multiple seconds.

I'm in a large community of choral singers, and many groups have been hoping to sing together remotely, live. Of course, with Zoom this is hopeless: not only does the lag make it impossible to synchronize, but Zoom only sends you one or two channels of audio at a time, so four-part harmony is out of the question, especially with multiple singers on each part.

This has me thinking: Is there software that solves this problem? Specifically, software that optimizes for extremely low latency between my singing and your hearing me, _and_ multiplexes many separate audio feeds together on the fly?

I assume this might require a network protocol other than TCP?


  👤 jefftk Accepted Answer ✓
It's not a software problem: packet round trips are just too slow for singing. A good ping time in gaming is 15ms to the gaming server, achievable with good internet and a wired connection. Double that, because both you and the other singer need to talk to the central server, and we're at 30ms. Then if your system is using 128-sample buffers at 44.1kHz (tight but doable) there's ~3ms each for ADC on your end, DAC on their end, ADC on their end, and DAC on your end, which adds another 12ms. This best case is a total of 42ms, which is still too high.
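For anyone who wants to plug in their own numbers, here is a rough sketch of that budget in Python. The function names are made up for illustration, and the model only counts the two things mentioned above (network ping and four buffer-sized conversions); real systems add codec, jitter-buffer, and OS delays that are not modeled here.

```python
def buffer_ms(samples: int, sample_rate_hz: float) -> float:
    """Time to fill (or drain) one audio buffer, in milliseconds."""
    return samples / sample_rate_hz * 1000.0


def round_trip_budget_ms(
    ping_to_server_ms: float,
    buffer_samples: int,
    sample_rate_hz: float,
    conversions: int = 4,  # ADC + DAC on each end, as counted above
) -> float:
    """Best-case singer-to-singer round trip through a central server."""
    # Each singer has their own ping to the server, so the network part doubles.
    network_ms = 2 * ping_to_server_ms
    # One buffer of delay per ADC/DAC stage.
    audio_ms = conversions * buffer_ms(buffer_samples, sample_rate_hz)
    return network_ms + audio_ms


total = round_trip_budget_ms(ping_to_server_ms=15.0, buffer_samples=128, sample_rate_hz=44_100)
print(f"{total:.1f} ms")  # roughly the 42ms quoted above
```

Halving the buffer to 64 samples only saves about 6ms in this model, so the network term dominates.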

Singing rounds does work, though, and https://www.cockos.com/ninjam/ is an open source tool for it. You could also imagine some sort of bucket brigade approach for larger groups (https://www.jefftk.com/p/series-singing) but I don't know of any implementations.

Really old-style circuit-switched landlines get down to the speed of electricity in copper (~0.7c https://en.wikipedia.org/wiki/Velocity_factor#Typical_veloci...) over the distance (to your local exchange and then to your neighbor), which would be fast enough if you lived nearby (c is about 186 miles per ms). But today it's all packet-switched, which adds latency in exchange for much higher capacity.
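To put a rough number on the landline case, here is a small sketch that assumes the ~0.7c velocity factor above and a hypothetical 10-mile copper path via the exchange. It models pure propagation delay only and ignores any switching equipment along the way.

```python
C_KM_PER_S = 299_792.458   # speed of light in vacuum
KM_PER_MILE = 1.609344


def propagation_ms(distance_miles: float, velocity_factor: float = 0.7) -> float:
    """One-way signal propagation time over a cable, in milliseconds."""
    speed_km_per_s = velocity_factor * C_KM_PER_S
    return distance_miles * KM_PER_MILE / speed_km_per_s * 1000.0


# Hypothetical path: 10 miles of copper between two neighbors via the exchange.
one_way = propagation_ms(10.0)
print(f"{one_way:.3f} ms one way")
```

Under these assumptions the one-way delay is well under a tenth of a millisecond, which is negligible next to the audio buffering.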


👤 traderj0e
Do existing video apps use TCP? I'll Wireshark next time I use Zoom. Seems like TCP is never going to perform well for low latency applications.

If you have bandwidth to spare, you can optimize for latency by using UDP and sending a lot of duplicate packets, or you can do something smarter with error-correcting codes if bandwidth conservation becomes an issue. The danger is sending too much and incurring queueing delay somewhere.
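A minimal sketch of the duplicate-packet idea in Python, using only the standard `socket` and `struct` modules. The 4-byte sequence-number header, the `copies=3` default, and the names `pack_frame`, `send_redundant`, and `Deduplicator` are all assumptions made up for illustration, not anything Zoom or another app is known to do. It ignores sequence wraparound and pacing.

```python
import socket
import struct

# Assumed framing: 4-byte big-endian sequence number, then the audio payload.
HEADER = struct.Struct("!I")


def pack_frame(seq: int, payload: bytes) -> bytes:
    """Prefix a payload with its sequence number."""
    return HEADER.pack(seq) + payload


def send_redundant(sock: socket.socket, addr, seq: int, payload: bytes, copies: int = 3) -> None:
    """Send the same datagram several times so one lost copy does not matter."""
    datagram = pack_frame(seq, payload)
    for _ in range(copies):
        sock.sendto(datagram, addr)


class Deduplicator:
    """Receiver side: pass each sequence number through once, drop repeats."""

    def __init__(self, window: int = 1024):
        self.window = window
        self.seen = set()
        self.newest = -1

    def accept(self, datagram: bytes):
        """Return the payload for a new frame, or None for a duplicate or stale one."""
        (seq,) = HEADER.unpack_from(datagram)
        if seq in self.seen or seq <= self.newest - self.window:
            return None
        self.seen.add(seq)
        self.newest = max(self.newest, seq)
        # Forget sequence numbers that have fallen out of the window.
        self.seen = {s for s in self.seen if s > self.newest - self.window}
        return datagram[HEADER.size:]
```

The sender would call `send_redundant(sock, addr, seq, chunk)` on a `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` for each audio chunk, and the receiver would feed every datagram from `recvfrom` through `Deduplicator.accept`. Every extra copy multiplies the bandwidth used, which is where the queueing-delay risk mentioned above comes from.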


👤 gianpaj
so it's "physically" impossible, even with only audio?