Encrypted video calls (technical)

This page documents a reference architecture for E2EE WebRTC media using a signaling relay (Node.js + Socket.IO), an RTCPeerConnection for transport, and Insertable Streams for media encryption where supported. Examples are safe to publish and omit production secrets and app-specific identifiers.

System overview

  • Signaling server: relays SDP/ICE + E2EE public material; does not see media.
  • Peers: perform key agreement, derive session keys, and encrypt encoded frames.
  • Transport: WebRTC SRTP (plus app-layer E2EE), ICE, STUN/TURN.
  • Threat model: server/cloud provider is untrusted for content.
  • Goal: confidentiality + integrity of media frames end-to-end.
  • Non-goals: hiding all metadata (timing, IP-level routing).

Signaling: what the server does (and does not do)

The server should be a dumb relay for connection setup and E2EE public handshakes. It must never request, generate, or store private keys.

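// server/signaling-relay.js (safe sample)
// Assumes an auth middleware set socket.userId and that each socket joined a
// room named after its user ID, so io.to(to) reaches the intended peer.
io.on("connection", (socket) => {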
  socket.on("rtc:signal", ({ to, payload }) => {
    io.to(to).emit("rtc:signal", { from: socket.userId, payload });
  });

  socket.on("e2ee:pub", ({ to, pub }) => {
    io.to(to).emit("e2ee:pub", { from: socket.userId, pub });
  });

  socket.on("e2ee:ready", ({ to, sessionId }) => {
    io.to(to).emit("e2ee:ready", { from: socket.userId, sessionId });
  });
});

Tip: rate-limit signaling events and validate envelope sizes to reduce abuse.
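
The tip above could be sketched as a per-socket token bucket plus an envelope check. This is a minimal illustration, not the relay's actual code: the names createBucket, takeToken, isValidEnvelope and the 16 KB cap are assumptions chosen for the example.

```javascript
// server/signal-guard.js (hypothetical sketch; limits are illustrative)
const MAX_ENVELOPE_BYTES = 16 * 1024; // assumed cap; tune to your SDP sizes

// Token bucket: `capacity` events may burst, refilled at `refillPerSec`.
function createBucket(capacity, refillPerSec, now = Date.now()) {
  return { capacity, refillPerSec, tokens: capacity, last: now };
}

function takeToken(bucket, now = Date.now()) {
  const elapsedSec = Math.max(0, now - bucket.last) / 1000;
  bucket.tokens = Math.min(bucket.capacity, bucket.tokens + elapsedSec * bucket.refillPerSec);
  bucket.last = now;
  if (bucket.tokens < 1) return false;
  bucket.tokens -= 1;
  return true;
}

// Accept only { to: string, payload: object } envelopes under the size cap.
function isValidEnvelope(msg) {
  if (!msg || typeof msg !== "object") return false;
  if (typeof msg.to !== "string" || msg.to.length === 0 || msg.to.length > 128) return false;
  if (!msg.payload || typeof msg.payload !== "object") return false;
  let size;
  try {
    size = Buffer.byteLength(JSON.stringify(msg.payload), "utf8");
  } catch {
    return false; // circular or otherwise unserializable payload
  }
  return size <= MAX_ENVELOPE_BYTES;
}
```

In the relay, one bucket would be created per socket on connection, and a handler would forward a message only when both takeToken(bucket) and isValidEnvelope(msg) return true.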

WebRTC negotiation: stable patterns

For production, prefer the Perfect Negotiation pattern to handle glare and restarts cleanly. Below is the safe conceptual flow.

// client/perfect-negotiation-concept.js (safe conceptual snippet)
let makingOffer = false;
let ignoreOffer = false;
const polite = true; // role flag: exactly one of the two peers is polite; the other sets false

pc.onnegotiationneeded = async () => {
  try {
    makingOffer = true;
    await pc.setLocalDescription(await pc.createOffer());
    socket.emit("rtc:signal", { to: peerId, payload: { desc: pc.localDescription }});
  } catch (e) {
    console.warn("negotiation error", e);
  } finally {
    makingOffer = false;
  }
};

socket.on("rtc:signal", async ({ from, payload }) => {
  const { desc, candidate } = payload;

  try {
    if (desc) {
      const offerCollision =
        desc.type === "offer" && (makingOffer || pc.signalingState !== "stable");

      ignoreOffer = !polite && offerCollision;
      if (ignoreOffer) return;

      await pc.setRemoteDescription(desc);
      if (desc.type === "offer") {
        await pc.setLocalDescription(await pc.createAnswer());
        socket.emit("rtc:signal", { to: from, payload: { desc: pc.localDescription }});
      }
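    } else if (candidate) {
      try {
        await pc.addIceCandidate(candidate);
      } catch (e) {
        // Candidates that belong to an offer we ignored are expected to fail.
        if (!ignoreOffer) throw e;
      }
    }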
  } catch (e) {
    console.warn("signal error", e);
  }
});

Note: SDP and ICE candidates reveal IP addresses and media capabilities. Keep them out of production logs, or gate them behind a debug log level.
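
Perfect Negotiation only works when the two peers hold opposite roles. One simple way to assign them without an extra round trip is to compare the two user IDs, as sketched here. The helper name isPolite is hypothetical, and the sketch assumes both peers know each other's ID from signaling and that IDs are unique strings.

```javascript
// client/polite-role.js (hypothetical helper)
// Both peers compare the same two IDs, so they always reach opposite answers.
function isPolite(myId, peerId) {
  if (myId === peerId) throw new Error("peer IDs must differ");
  return myId < peerId; // the lexicographically smaller ID yields on collisions
}
```

Any deterministic rule works as long as both sides apply the same one; the caller/callee distinction is a common alternative.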

ICE / STUN / TURN: connectivity notes

  • STUN discovers reflexive candidates; works for many NAT types.
  • TURN relays encrypted RTP when direct paths fail (corporate NAT/firewalls, symmetric NAT).
  • Security: TURN credentials should be short-lived (REST TURN) and never hardcoded.

// client/ice-config-sample.js (safe sample; no real credentials)
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: ["stun:stun.example.net:3478"] }
    // { urls: ["turns:turn.example.net:5349"], username: "...", credential: "..." }
  ],
  iceTransportPolicy: "all"
});
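
Short-lived TURN credentials are usually minted server-side and handed to the client just before the call. The sketch below follows the "TURN REST API" convention supported by servers such as coturn (shared-secret mode). The function name, the 10-minute default lifetime, and the way the secret reaches the server are assumptions for illustration.

```javascript
// server/turn-credentials.js (hypothetical sketch; the secret comes from server config)
const crypto = require("node:crypto");

// Username is "<expiry-unix-seconds>:<userId>"; the credential is
// base64(HMAC-SHA1(sharedSecret, username)). The TURN server recomputes the
// HMAC and rejects usernames whose expiry timestamp has passed.
function makeTurnCredentials(sharedSecret, userId, ttlSec = 600, nowMs = Date.now()) {
  const expiry = Math.floor(nowMs / 1000) + ttlSec;
  const username = `${expiry}:${userId}`;
  const credential = crypto.createHmac("sha1", sharedSecret).update(username).digest("base64");
  return { username, credential, ttlSec };
}
```

The client would place the returned username and credential into its iceServers entry for the TURN URL; the shared secret itself never leaves the server.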

E2EE media: Insertable Streams architecture

With Insertable Streams, you can encrypt encoded frames (audio/video) on the sender after encoding and before packetization, and decrypt them on the receiver before decoding. The transport remains WebRTC; your app-layer E2EE protects content.

// client/e2ee-insertable-streams.js (safe sample)
function attachSenderTransform(sender, encryptFrame) {
  if (!sender?.createEncodedStreams) return false;
  const { readable, writable } = sender.createEncodedStreams();
  readable
    .pipeThrough(new TransformStream({
      async transform(frame, controller) {
        frame.data = await encryptFrame(frame.data, frame);
        controller.enqueue(frame);
      }
    }))
    .pipeTo(writable);
  return true;
}

function attachReceiverTransform(receiver, decryptFrame) {
  if (!receiver?.createEncodedStreams) return false;
  const { readable, writable } = receiver.createEncodedStreams();
  readable
    .pipeThrough(new TransformStream({
      async transform(frame, controller) {
        try {
          frame.data = await decryptFrame(frame.data, frame);
          controller.enqueue(frame);
        } catch (e) {
          // Drop frames that fail authentication instead of erroring the stream.
        }
      }
    }))
    .pipeTo(writable);
  return true;
}

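Operational note: browser support varies. createEncodedStreams() is the Chromium-specific form (in Chromium it also needs encodedInsertableStreams: true in the RTCPeerConnection configuration); other browsers expose RTCRtpScriptTransform, which runs the transform in a worker. Feature-detect and provide fallbacks.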
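
Feature detection might look like the following sketch. The helper name detectE2eeSupport and its return values are hypothetical; the scope parameter exists only so the check can be exercised outside a browser.

```javascript
// client/e2ee-support.js (hypothetical helper)
// `scope` defaults to globalThis; passing it in keeps the check testable.
function detectE2eeSupport(scope = globalThis) {
  // Standard API: a transform object assigned to sender.transform / receiver.transform.
  if (typeof scope.RTCRtpScriptTransform === "function") return "script-transform";
  // Legacy Chromium API, the one used by the attach*Transform samples on this page.
  const proto = scope.RTCRtpSender && scope.RTCRtpSender.prototype;
  if (proto && typeof proto.createEncodedStreams === "function") return "encoded-streams";
  return "none"; // caller decides: refuse the call, or fall back to transport-only encryption
}
```

When the result is "none", the safest default for an E2EE product is to tell the user the call cannot be end-to-end encrypted rather than silently downgrading.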

Key agreement & derivation: X25519 + HKDF (concept)

Peers exchange public keys (via signaling). Each derives a shared secret locally, then derives per-direction AEAD keys via HKDF.

// client/key-schedule-concept.js (safe pseudocode)
const sharedSecret = X25519(myPriv, theirPub);

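// Both peers derive the same two keys. The labels are read from the
// initiator's point of view, so the responder swaps them: one side's send key
// must be the other side's receive key. (isInitiator is an assumed role flag.)
const k1 = HKDF_SHA256(sharedSecret, salt, "icallu:e2ee:send:v1", 32);
const k2 = HKDF_SHA256(sharedSecret, salt, "icallu:e2ee:recv:v1", 32);
const [sendKey, recvKey] = isInitiator ? [k1, k2] : [k2, k1];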

Use explicit context strings and per-session salts to avoid key/nonce reuse across sessions.
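
To make the key schedule concrete, here is a runnable sketch using Node's built-in crypto module (a browser build would use WebCrypto where X25519 is available). The function name deriveSessionKeys, the "example:..." labels, and the isInitiator flag are assumptions for illustration; how the per-session salt is agreed is out of scope here.

```javascript
// key-schedule-node.js (hypothetical sketch using Node's built-in crypto)
const crypto = require("node:crypto");

function deriveSessionKeys(myPrivateKey, theirPublicKey, salt, isInitiator) {
  // X25519 key agreement: both sides obtain the same 32-byte secret.
  const shared = crypto.diffieHellman({ privateKey: myPrivateKey, publicKey: theirPublicKey });
  const hkdf = (info) => Buffer.from(crypto.hkdfSync("sha256", shared, salt, info, 32));
  // Labels are hypothetical and name a direction, not a local role.
  const initToResp = hkdf("example:e2ee:i2r:v1");
  const respToInit = hkdf("example:e2ee:r2i:v1");
  return isInitiator
    ? { sendKey: initToResp, recvKey: respToInit }
    : { sendKey: respToInit, recvKey: initToResp };
}

// Usage: each peer generates a key pair and shares only the public half.
const alice = crypto.generateKeyPairSync("x25519");
const bob = crypto.generateKeyPairSync("x25519");
const salt = crypto.randomBytes(32); // per-session; must be identical on both sides
const a = deriveSessionKeys(alice.privateKey, bob.publicKey, salt, true);
const b = deriveSessionKeys(bob.privateKey, alice.publicKey, salt, false);
console.log(a.sendKey.equals(b.recvKey), a.recvKey.equals(b.sendKey)); // prints "true true"
```

Naming the labels by direction means a frame encrypted by one peer can never be decrypted with that same peer's receive key, which also keeps the two nonce spaces separate.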

AEAD for frames: nonce, AAD, replay protection

Each encoded frame should be protected with authenticated encryption (AEAD). You must ensure nonce uniqueness per key, and protect against replays.

// client/aead-frame-envelope-concept.js (safe sample; AES-GCM shown)
let counter = 0n; // per-key frame counter; never reset while the same key is in use
const prefix = crypto.getRandomValues(new Uint8Array(4)); // random per-sender prefix

function nextNonce12() {
  const ctr = new Uint8Array(8);
  new DataView(ctr.buffer).setBigUint64(0, counter++);
  const nonce = new Uint8Array(12);
  nonce.set(prefix, 0);
  nonce.set(ctr, 4);
  return nonce;
}

Production notes: nonce uniqueness is mandatory; avoid reusing a key+nonce pair. Consider per-sender counters and key rotation thresholds.
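
The pieces above (nonce, AAD, replay protection) could fit together roughly as follows. This is a sketch under stated assumptions: the envelope layout (12-byte nonce followed by ciphertext and tag), the helper names, and the strictly-increasing replay rule are choices made for the example, and it assumes frames are processed one at a time in order. It uses WebCrypto as shipped in Node so it can run outside a browser.

```javascript
// aead-frame-sketch.js (hypothetical sketch; WebCrypto via Node)
const { webcrypto } = require("node:crypto");
const subtle = webcrypto.subtle;

function importAesKey(rawKey32) {
  return subtle.importKey("raw", rawKey32, "AES-GCM", false, ["encrypt", "decrypt"]);
}

// Sender state: one random prefix and one counter per key.
function createSender(key) {
  return { key, counter: 0n, prefix: webcrypto.getRandomValues(new Uint8Array(4)) };
}

// Envelope layout (an assumption of this sketch): nonce (12 bytes) || ciphertext+tag.
async function sealFrame(sender, plaintext, aad) {
  const nonce = new Uint8Array(12);
  nonce.set(sender.prefix, 0);
  new DataView(nonce.buffer).setBigUint64(4, sender.counter++);
  const ct = await subtle.encrypt(
    { name: "AES-GCM", iv: nonce, additionalData: aad }, sender.key, plaintext);
  const out = new Uint8Array(12 + ct.byteLength);
  out.set(nonce, 0);
  out.set(new Uint8Array(ct), 12);
  return out;
}

// Receiver state: highest counter accepted so far (-1n means none yet).
function createReceiver(key) {
  return { key, lastCounter: -1n };
}

async function openFrame(receiver, envelope, aad) {
  const nonce = envelope.subarray(0, 12);
  const counter = new DataView(nonce.buffer, nonce.byteOffset, 12).getBigUint64(4);
  if (counter <= receiver.lastCounter) throw new Error("replayed or stale frame");
  // decrypt() rejects if the tag, nonce, or AAD does not match.
  const pt = await subtle.decrypt(
    { name: "AES-GCM", iv: nonce, additionalData: aad }, receiver.key, envelope.subarray(12));
  receiver.lastCounter = counter; // advance only after authentication succeeds
  return new Uint8Array(pt);
}
```

The AAD is a natural place for bytes that must stay readable but tamper-evident, such as a key ID or the unencrypted part of a codec header. A production receiver would likely replace the strict counter rule with a sliding window if frames can arrive out of order.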

Session identity & verification (SAS concept)

To mitigate MITM via compromised signaling, many systems add verification: SAS, QR, or fingerprint confirmation.

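// client/sas-concept.js (safe conceptual snippet)
// Hash both public keys in a canonical (sorted) order so the two devices
// compute identical words; hashing only the remote key would differ per side.
// concat and sortBytes are placeholder helpers.
const fingerprint = SHA256(concat(sortBytes([myPubKeyBytes, theirPubKeyBytes])));
const sas = humanReadableWords(fingerprint.slice(0, 10));
showToUsers("Verify both devices show the same words:", sas);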
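
A runnable version of the idea, assuming Node's crypto module, might look like this. The function name deriveSas and the 16-word list are placeholders: a list that small carries only 4 bits per word, so a real deployment would want a much larger list or more words.

```javascript
// sas-sketch.js (hypothetical sketch; the word list is a tiny placeholder)
const crypto = require("node:crypto");

const WORDS = ["amber", "birch", "coral", "delta", "ember", "fjord", "grove", "haze",
               "iris", "jade", "kelp", "lotus", "maple", "north", "opal", "pine"];

function deriveSas(myPubKeyBytes, theirPubKeyBytes, wordCount = 5) {
  // Sort the two keys so both devices hash them in the same order.
  const [first, second] = [Buffer.from(myPubKeyBytes), Buffer.from(theirPubKeyBytes)]
    .sort(Buffer.compare);
  const digest = crypto.createHash("sha256").update(first).update(second).digest();
  // One word per digest byte, reduced modulo the list length.
  return Array.from(digest.subarray(0, wordCount), (b) => WORDS[b % WORDS.length]).join(" ");
}
```

If a signaling-level attacker substitutes either public key, the two devices hash different inputs and the words stop matching, which is what the users are asked to check.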

Operational pitfalls checklist

  • Glare handling: use Perfect Negotiation and stable transceivers when toggling tracks.
  • Key lifecycle: rotate keys and wipe them on end; avoid stale sessions after refresh.
  • ICE restarts: handle network changes; don't break transforms during renegotiation.
  • Backpressure: transforms must be efficient; avoid async stalls per-frame.
  • Compatibility: feature-detect Insertable Streams; ensure fallback paths.
  • Logging: never log SDP/ICE or key material at info level in production.
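
For the key lifecycle item, a rotation and wipe policy could be sketched as below. The thresholds and helper names are illustrative assumptions, not recommendations from this page; pick limits that match your AEAD's usage bounds and your session lengths.

```javascript
// client/key-lifecycle.js (hypothetical sketch; thresholds are illustrative)
const MAX_FRAMES_PER_KEY = 2 ** 24;     // assumed policy
const MAX_KEY_AGE_MS = 60 * 60 * 1000;  // assumed policy: one hour

function createKeyState(rawKey, now = Date.now()) {
  return { rawKey, frames: 0, createdAt: now };
}

// The sender increments state.frames once per encrypted frame.
function needsRotation(state, now = Date.now()) {
  return state.frames >= MAX_FRAMES_PER_KEY || now - state.createdAt >= MAX_KEY_AGE_MS;
}

// Overwrite raw key bytes on hangup. This is best effort: it cannot clear copies
// the runtime has made, and non-extractable CryptoKey objects can only be dropped.
function wipeKey(state) {
  state.rawKey.fill(0);
  state.rawKey = null;
}
```

On rotation, both peers need to agree on when the new key takes effect, for example by carrying a key ID in the frame envelope so frames in flight still decrypt with the old key.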
