HACKER Q&A
📣 vidcurious

Good overview of live streaming audio/video client/server architectures?


I'm curious how something like Clubhouse or Streamyard works and am having a bit of a hard time putting the pieces together, because most of what Google turns up is about setting up live streaming as a user rather than as a developer.

My main question is what happens on the server to accept live streams, mix them together, add overlays, etc.

My guess is something like this:

- RTMP (or WebRTC) stream from the client to the server

- The server runs something like GStreamer, which can decode the stream into frames, apply overlays, and re-encode to an RTMP stream (a rough sketch of this step follows)

- Send that to nginx and expose it to clients
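To make the middle step concrete, here's a minimal sketch of that kind of relay using Node to drive ffmpeg (in place of GStreamer): pull an RTMP ingest, burn a PNG overlay into the video, and push the result back out as RTMP. The URLs, stream names, and overlay file are made-up placeholders; it assumes ffmpeg is on PATH and something like nginx-rtmp is accepting both streams.

    // Hypothetical relay: RTMP in -> overlay -> RTMP out.
    import { spawn } from "node:child_process";

    const ffmpeg = spawn("ffmpeg", [
      "-i", "rtmp://localhost/live/ingest",            // incoming stream from the client (placeholder URL)
      "-i", "overlay.png",                             // static overlay image (placeholder file)
      "-filter_complex", "[0:v][1:v]overlay=10:10[v]", // draw the overlay at (10,10)
      "-map", "[v]", "-map", "0:a",                    // composited video + the original audio
      "-c:v", "libx264", "-preset", "veryfast",        // re-encode the outgoing video
      "-c:a", "aac",
      "-f", "flv",                                     // RTMP carries FLV
      "rtmp://localhost/live/out",                     // nginx-rtmp re-exposes this to viewers
    ]);

    ffmpeg.stderr.on("data", (d) => process.stderr.write(d));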

Does AWS have this encapsulated as a service yet, or are most people operating something like this at the compute level?


  👤 motoboi Accepted Answer ✓
I'm no expert either, but it's not very difficult.

- They use WebRTC to set up a peer-to-peer connection between all the participants (the Streamyard server itself joins the call as a participant). The browser takes care of all the hard parts (encoding, noise reduction, echo suppression) and just sends the video and audio streams via RTP. (A rough browser-side sketch follows this list.)

- The local and remote cameras are objects that you can arrange on the screen using JavaScript. This is how they build the layout options.

- On the server, they implement a WebRTC client which applies the same layout. I would bet they use something like the ffmpeg libraries to composite a final video stream and send it to the destination via RTMP (YouTube, Facebook, and such). (A rough server-side sketch follows this list.)
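For the browser side, here's roughly what the first two bullets look like in TypeScript: grab camera and mic, offer them to a peer, and drop each remote participant's stream into a video element that JS/CSS can then position for the layout. The signaling transport (sendToSignalingServer) is a made-up placeholder; WebRTC leaves signaling entirely up to you.

    // Hypothetical join flow for one peer connection.
    declare function sendToSignalingServer(msg: unknown): void;

    async function join(): Promise<RTCPeerConnection> {
      const pc = new RTCPeerConnection({
        iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
      });

      // Local camera/mic; the browser handles encoding, echo cancellation, etc.
      const local = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
      local.getTracks().forEach((t) => pc.addTrack(t, local));

      // Each remote track becomes a stream we attach to a movable <video> tag.
      pc.ontrack = (ev) => {
        const video = document.createElement("video");
        video.srcObject = ev.streams[0];
        video.autoplay = true;
        document.getElementById("stage")?.appendChild(video); // layout via CSS/JS
      };

      pc.onicecandidate = (ev) => {
        if (ev.candidate) sendToSignalingServer({ candidate: ev.candidate });
      };

      const offer = await pc.createOffer();
      await pc.setLocalDescription(offer);
      sendToSignalingServer({ sdp: pc.localDescription });
      return pc;
    }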
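And for the server side, a sketch of the final mixdown, assuming the media server has already turned each participant's WebRTC feed into something ffmpeg can read (here: two local RTMP feeds; the URLs and stream key are made up, and hstack assumes both videos are the same height). hstack puts the two videos side by side, amix mixes the audio, and the result goes to the destination via RTMP.

    // Hypothetical mixdown: two participant feeds -> one composited RTMP stream.
    import { spawn } from "node:child_process";

    const out = "rtmp://a.rtmp.youtube.com/live2/STREAM_KEY"; // placeholder key

    const mixer = spawn("ffmpeg", [
      "-i", "rtmp://localhost/live/participant1",  // placeholder feed URLs
      "-i", "rtmp://localhost/live/participant2",
      "-filter_complex",
      "[0:v][1:v]hstack=inputs=2[v];[0:a][1:a]amix=inputs=2[a]", // side-by-side video, mixed audio
      "-map", "[v]", "-map", "[a]",
      "-c:v", "libx264", "-preset", "veryfast", "-g", "60",
      "-c:a", "aac", "-ar", "44100",
      "-f", "flv",
      out,
    ]);

    mixer.stderr.on("data", (d) => process.stderr.write(d));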

Unfortunately, I'm not aware of any AWS solution for this, but there are several implementations of the server part (such as Kurento and https://openvidu.io/).