If you haven’t heard of WebRTC, or you have but aren’t really sure how it works, it’s a real-time, in-browser solution for peer-to-peer communication with audio, video and data. The key here is peer-to-peer: browsers talk directly to each other with no server in between. This all sounds simple, but in practice it’s more complicated than that: before peers can communicate they have to be able to reach each other, and users are often behind NAT, so it’s not that easy.

ICE

That’s where the Interactive Connectivity Establishment (ICE) protocol comes in. It’s a way to figure out all the possible ip:port combinations (ICE candidates) that can be used to reach a peer. Once both peers finish gathering their ICE candidates, they exchange them and use them to reach each other. Here’s an example of how it looks:

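var config = {"iceServers": [...]}
var peer = new RTCPeerConnection(config)

// Log any failure from the callback-based calls below
function onError(err) {
  console.log('error:', err)
}

navigator.getUserMedia({audio: true, video: true}, function(stream) {
  peer.addStream(stream)
  // Create the initial offer and apply it locally; it has no candidates yet
  peer.createOffer(function(offer) {
    peer.setLocalDescription(offer)
  }, onError)
}, onError)

peer.onicecandidate = function(evt) {
  // Once gathering is complete, regenerate the offer so it carries the candidates
  if (evt.target.iceGatheringState === 'complete') {
    peer.createOffer(function(offer) {
      signal.send({offer: offer})
    }, onError)
  }
}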

Applying the initial offer with setLocalDescription kicks off the ICE candidate gathering process, and only once it’s complete do we send out a regenerated offer, this time with the ICE candidates in it. But how do we exchange ICE candidates when the two parties can’t reach each other yet?

That’s where a Signaling Server comes in. The WebRTC spec doesn’t specify how to do signaling, and this is intentional: it allows reuse of, and easy integration with, existing mechanisms. This gives you flexibility, as you can choose which protocol to use on your Signaling Server. It can be XHR + SSE, XHR long-polling, WebSocket, SIP over WebSocket or even email. Basically, the role of a Signaling Server is to introduce peers to each other; this is where the exchange of ICE candidates happens. There are several free Signaling Servers out there, and I’m writing one as well just for the fun of it. Once peers can reach each other, the role of the Signaling Server is done.
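
As a rough illustration of how little a Signaling Server has to do, here is a minimal in-memory sketch of the relay logic, with no real transport behind it. SignalingHub and its join, leave and send methods are made-up names for this example, not part of WebRTC, and a real server would sit behind one of the transports listed above:

```javascript
// Hypothetical in-memory signaling hub: it only forwards messages between
// registered peers and never looks inside the offers or candidates.
function SignalingHub() {
  this.peers = Object.create(null)
}

// Register a peer under an id, with a callback that receives its messages.
SignalingHub.prototype.join = function(id, onmessage) {
  this.peers[id] = onmessage
}

// Forget a peer, e.g. once it is connected directly and signaling is done.
SignalingHub.prototype.leave = function(id) {
  delete this.peers[id]
}

// Forward a message to another peer. Returns false if the recipient is unknown.
SignalingHub.prototype.send = function(from, to, msg) {
  var deliver = this.peers[to]
  if (!deliver) return false
  deliver({from: from, data: msg})
  return true
}

// Usage: two peers join, then one passes an (illustrative) offer to the other.
var hub = new SignalingHub()
var bobInbox = []
hub.join('alice', function(msg) {})
hub.join('bob', function(msg) { bobInbox.push(msg) })
hub.send('alice', 'bob', {offer: 'sdp-placeholder'})
```

The hub treats every message as an opaque blob, which is why the choice of protocol can be left entirely up to you.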

Trickle ICE

Going back to ICE candidates: while discovering ip:port combinations sounds straightforward, it involves NAT traversal by talking to a STUN and/or TURN server and making sure the candidates actually work. A STUN server is used to return the public IP address and port of the peer, and if that doesn’t work, a TURN server is used, which acts as a relay between the peers. So waiting for all the candidates may turn out to be not real-time at all. That’s where Trickle ICE comes in: it’s an optimization of the ICE behaviour described above. Instead of waiting for all the candidates, you send each candidate as soon as it’s discovered, so there’s no waiting and peers can discover each other faster. Here’s how it looks:

var config = {"iceServers": [...]}
var peer = new RTCPeerConnection(config)

// Log any failure from the callback-based calls below
function onError(err) {
  console.log('error:', err)
}

navigator.getUserMedia({audio: true, video: true}, function(stream) {
  peer.addStream(stream)
  peer.createOffer(function(offer) {
    peer.setLocalDescription(offer)
    // Send the offer right away; candidates follow separately
    signal.send({offer: offer})
  }, onError)
}, onError)

peer.onicecandidate = function(evt) {
  if (evt.candidate) {
    signal.send({candidate: evt.candidate})
  }
}
...
signal.onmessage = function(msg) {
  if (msg.candidate) {
    peer.addIceCandidate(msg.candidate)
  }
}

This time we send the offer even though it has no candidates yet, and then send the candidates one by one as they are discovered. This is important, since a peer won’t entertain the other peer if it hasn’t even expressed an interest (an offer). No sir, some peers don’t like making the first move. Then, further down, any candidate we receive gets registered, and that’s how peers discover each other. The Signaling Server just acts as a matchmaker: it sets them up on a date and after that they’re on their own. :)
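
One practical consequence of trickling is that a candidate can reach the other side before the offer has been applied there, and it can’t be registered until then. Here is a minimal sketch of one way to handle that by holding early candidates back. CandidateBuffer and its method names are hypothetical, and addCandidate stands in for whatever actually registers a candidate, such as a wrapper around peer.addIceCandidate:

```javascript
// Hypothetical helper: holds remote candidates until the remote offer is applied.
// `addCandidate` is a function that registers a single candidate.
function CandidateBuffer(addCandidate) {
  this.addCandidate = addCandidate
  this.ready = false
  this.pending = []
}

// Call for every candidate that arrives over signaling.
CandidateBuffer.prototype.push = function(candidate) {
  if (this.ready) {
    this.addCandidate(candidate)
  } else {
    this.pending.push(candidate)
  }
}

// Call once the remote offer has been applied; flushes in arrival order.
CandidateBuffer.prototype.markReady = function() {
  this.ready = true
  while (this.pending.length > 0) {
    this.addCandidate(this.pending.shift())
  }
}
```

On the receiving side this would presumably be wired up by calling push from signal.onmessage for each candidate, and markReady after setRemoteDescription has succeeded with the offer.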