Encrypted video calls (technical)
This page documents a reference architecture for end-to-end encrypted (E2EE) WebRTC media. It uses a signaling relay (Node.js + Socket.IO), RTCPeerConnection for transport, and Insertable Streams for media encryption where supported. The examples are safe to publish: they omit production secrets and app-specific identifiers.
System overview
- Signaling server: relays SDP/ICE + E2EE public material; does not see media.
- Peers: perform key agreement, derive session keys, and encrypt encoded frames.
- Transport: WebRTC SRTP (plus app-layer E2EE), ICE, STUN/TURN.
- Threat model: server/cloud provider is untrusted for content.
- Goal: confidentiality + integrity of media frames end-to-end.
- Non-goals: hiding all metadata (timing, IP-level routing).
Signaling: what the server does (and does not do)
The server should be a dumb relay for connection setup and E2EE public handshakes. It must never request, generate, or store private keys.
// server/signaling-relay.js (safe sample)
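// Assumes `io` is a Socket.IO server, that auth middleware has set socket.userId,
// and that every socket has joined a room named after its own user ID.
io.on("connection", (socket) => {
  // Relay SDP descriptions and ICE candidates without inspecting them.
  socket.on("rtc:signal", ({ to, payload } = {}) => {
    io.to(to).emit("rtc:signal", { from: socket.userId, payload });
  });
  // Relay E2EE public keys only; private keys never leave the peers.
  socket.on("e2ee:pub", ({ to, pub } = {}) => {
    io.to(to).emit("e2ee:pub", { from: socket.userId, pub });
  });
  // Relay the peer's E2EE readiness notification for the given session.
  socket.on("e2ee:ready", ({ to, sessionId } = {}) => {
    io.to(to).emit("e2ee:ready", { from: socket.userId, sessionId });
  });
});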
Tip: rate-limit signaling events and validate envelope sizes to reduce abuse.
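The tip above can be sketched as a small guard that runs before each relay. This is a minimal sketch, not part of the original sample: the names `TokenBucket` and `isEnvelopeAcceptable` and the 16 KiB limit are illustrative assumptions, and real limits depend on your SDP sizes and traffic.

```javascript
// Hypothetical helpers; names and limits are illustrative assumptions.
const MAX_ENVELOPE_BYTES = 16 * 1024;

class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.last = now;
  }

  // Returns true if one event may pass at time `now` (milliseconds).
  tryTake(now = Date.now()) {
    const elapsedSeconds = Math.max(0, now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Accepts only envelopes with a non-empty string recipient and a bounded serialized size.
function isEnvelopeAcceptable(envelope, maxBytes = MAX_ENVELOPE_BYTES) {
  if (!envelope || typeof envelope.to !== "string" || envelope.to.length === 0) return false;
  let json;
  try {
    json = JSON.stringify(envelope);
  } catch {
    return false; // not serializable (for example, circular references)
  }
  return new TextEncoder().encode(json).length <= maxBytes;
}
```

One way to use it is to create one bucket per socket on connection and have each handler relay only when `bucket.tryTake() && isEnvelopeAcceptable(message)` holds.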
WebRTC negotiation: stable patterns
For production, prefer the Perfect Negotiation pattern to handle glare and restarts cleanly. Below is the safe conceptual flow.
// client/perfect-negotiation-concept.js (safe conceptual snippet)
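let makingOffer = false;
let ignoreOffer = false;
// Must be true on exactly one peer and false on the other
// (decide deterministically, for example by comparing user IDs).
const polite = true;

pc.onnegotiationneeded = async () => {
  try {
    makingOffer = true;
    // With no argument, setLocalDescription() creates the description that fits
    // the current signaling state, which avoids races with incoming offers.
    await pc.setLocalDescription();
    socket.emit("rtc:signal", { to: peerId, payload: { desc: pc.localDescription } });
  } catch (e) {
    console.warn("negotiation error", e);
  } finally {
    makingOffer = false;
  }
};

// Trickle local ICE candidates to the remote peer.
pc.onicecandidate = ({ candidate }) => {
  if (candidate) socket.emit("rtc:signal", { to: peerId, payload: { candidate } });
};

socket.on("rtc:signal", async ({ from, payload }) => {
  const { desc, candidate } = payload;
  try {
    if (desc) {
      const offerCollision =
        desc.type === "offer" && (makingOffer || pc.signalingState !== "stable");
      // The impolite peer ignores a colliding offer; the polite peer accepts it
      // and setRemoteDescription() rolls back its own pending offer implicitly.
      ignoreOffer = !polite && offerCollision;
      if (ignoreOffer) return;
      await pc.setRemoteDescription(desc);
      if (desc.type === "offer") {
        await pc.setLocalDescription();
        socket.emit("rtc:signal", { to: from, payload: { desc: pc.localDescription } });
      }
    } else if (candidate) {
      try {
        await pc.addIceCandidate(candidate);
      } catch (e) {
        // Candidates that belong to an ignored offer are expected to fail.
        if (!ignoreOffer) throw e;
      }
    }
  } catch (e) {
    console.warn("signal error", e);
  }
});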
Note: avoid leaking internal ICE/SDP logging in production (log levels matter).
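One way to follow the note above is to log only a summary of each signaling payload instead of the payload itself. The sketch below is an assumption of this rewrite, not part of the original flow: `summarizeSignal` is a hypothetical helper that keeps the description type, the SDP length, and the ICE candidate type, and drops addresses, ports, and the SDP body.

```javascript
// Hypothetical helper: reduces a signaling payload to non-sensitive metadata for logging.
function summarizeSignal(payload) {
  if (payload?.desc) {
    const sdp = typeof payload.desc.sdp === "string" ? payload.desc.sdp : "";
    return { kind: "desc", type: payload.desc.type, sdpLength: sdp.length };
  }
  if (payload?.candidate) {
    // A candidate line contains "... <ip> <port> typ <host|srflx|prflx|relay> ...";
    // only the type after "typ" is kept.
    const line = String(payload.candidate.candidate ?? "");
    const match = / typ (\w+)/.exec(line);
    return { kind: "candidate", candidateType: match ? match[1] : "unknown" };
  }
  return { kind: "unknown" };
}
```

A call such as `console.debug("rtc:signal", summarizeSignal(payload))` then records that a relay or host candidate arrived without recording where it points.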
ICE / STUN / TURN: connectivity notes
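- STUN discovers server-reflexive candidates; this is enough for most NATs, but not for symmetric NAT.
- TURN relays the still-encrypted media when direct paths fail (restrictive corporate firewalls, symmetric NAT).
- Security: TURN credentials should be short-lived (for example, issued per call by your backend using the TURN REST API pattern) and never hardcoded.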
// client/ice-config-sample.js (safe sample; no real credentials)
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: ["stun:stun.example.net:3478"] },
    // { urls: ["turns:turn.example.net:5349"], username: "...", credential: "..." }
  ],
  // "all" allows direct and relayed paths; "relay" forces all media through TURN.
  iceTransportPolicy: "all"
});
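The short-lived credential recommendation above is commonly implemented with a secret shared between your backend and the TURN server (for example, coturn's `use-auth-secret` mode). The following is a minimal server-side sketch under that assumption; `issueTurnCredentials` and its parameters are hypothetical names, and the secret would come from configuration, never from source code.

```javascript
const { createHmac } = require("node:crypto");

// Issues time-limited TURN credentials from a secret shared with the TURN server.
// username   = "<expiry as Unix seconds>:<user id>"
// credential = base64(HMAC-SHA1(sharedSecret, username))
function issueTurnCredentials(sharedSecret, userId, ttlSeconds = 3600, nowMs = Date.now()) {
  const expiry = Math.floor(nowMs / 1000) + ttlSeconds;
  const username = `${expiry}:${userId}`;
  const credential = createHmac("sha1", sharedSecret).update(username).digest("base64");
  return { username, credential, ttl: ttlSeconds };
}
```

The client would fetch this object over an authenticated endpoint just before the call and pass `username` and `credential` into its `iceServers` entry; once the expiry passes, the TURN server rejects the credentials.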
E2EE media: Insertable Streams architecture
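With Insertable Streams (standardized as WebRTC Encoded Transform), you can encrypt and decrypt encoded audio/video frames after encoding and before packetization. The transport remains WebRTC with its usual SRTP protection; your app-layer E2EE protects the content from any relay in between.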
// client/e2ee-insertable-streams.js (safe sample)
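// createEncodedStreams() is the legacy Chromium API. It is only available when the
// RTCPeerConnection was created with { encodedInsertableStreams: true }.
// Other browsers expose the worker-based RTCRtpScriptTransform instead.
function attachSenderTransform(sender, encryptFrame) {
  if (!sender?.createEncodedStreams) return false;
  const { readable, writable } = sender.createEncodedStreams();
  readable
    .pipeThrough(new TransformStream({
      async transform(frame, controller) {
        // frame.data is an ArrayBuffer; encryptFrame must return an ArrayBuffer.
        frame.data = await encryptFrame(frame.data, frame);
        controller.enqueue(frame);
      }
    }))
    .pipeTo(writable)
    .catch((e) => console.warn("sender transform stopped", e));
  return true;
}

function attachReceiverTransform(receiver, decryptFrame) {
  if (!receiver?.createEncodedStreams) return false;
  const { readable, writable } = receiver.createEncodedStreams();
  readable
    .pipeThrough(new TransformStream({
      async transform(frame, controller) {
        try {
          frame.data = await decryptFrame(frame.data, frame);
          controller.enqueue(frame);
        } catch (e) {
          // Drop frames that fail authentication instead of erroring the whole stream.
        }
      }
    }))
    .pipeTo(writable)
    .catch((e) => console.warn("receiver transform stopped", e));
  return true;
}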
Operational note: browser support varies. Feature-detect and provide fallbacks.
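Feature detection for the operational note above might look like the sketch below. `detectE2eeSupport` is a hypothetical helper; it takes the global scope as a parameter only so that it can be exercised outside a browser.

```javascript
// Reports which encoded-frame API is available:
// "script-transform" for the worker-based RTCRtpScriptTransform,
// "encoded-streams" for the legacy createEncodedStreams(), or "none".
function detectE2eeSupport(scope = globalThis) {
  if (typeof scope.RTCRtpScriptTransform === "function") return "script-transform";
  const proto = scope.RTCRtpSender?.prototype;
  if (proto && typeof proto.createEncodedStreams === "function") return "encoded-streams";
  return "none";
}
```

When the result is "none", a reasonable fallback is to tell the user the call cannot be end-to-end encrypted and let them decide, rather than silently continuing with transport-only encryption.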
Key agreement & derivation: X25519 + HKDF (concept)
Peers exchange public keys (via signaling). Each derives a shared secret locally, then derives per-direction AEAD keys via HKDF.
// client/key-schedule-concept.js (safe pseudocode)
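const sharedSecret = X25519(myPriv, theirPub);
// Label keys by role, not by "send"/"recv": both peers run this code, so one peer's send key must equal the other's recv key.
const keyI2R = HKDF_SHA256(sharedSecret, salt, "icallu:e2ee:i2r:v1", 32); // initiator -> responder
const keyR2I = HKDF_SHA256(sharedSecret, salt, "icallu:e2ee:r2i:v1", 32); // responder -> initiator
const sendKey = isInitiator ? keyI2R : keyR2I;
const recvKey = isInitiator ? keyR2I : keyI2R;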
Use explicit context strings and per-session salts to avoid key/nonce reuse across sessions.
AEAD for frames: nonce, AAD, replay protection
Each encoded frame should be protected with authenticated encryption (AEAD). You must ensure nonce uniqueness per key, and protect against replays.
// client/aead-frame-envelope-concept.js (safe sample; AES-GCM shown)
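// 96-bit nonce layout: 4-byte random per-sender prefix + 8-byte big-endian frame counter.
// The counter must never repeat under the same key; rotate the key long before it could wrap.
let counter = 0n;
const prefix = crypto.getRandomValues(new Uint8Array(4));

function nextNonce12() {
  const ctr = new Uint8Array(8);
  new DataView(ctr.buffer).setBigUint64(0, counter++); // big-endian by default
  const nonce = new Uint8Array(12);
  nonce.set(prefix, 0);
  nonce.set(ctr, 4);
  return nonce;
}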
Production notes: nonce uniqueness is mandatory; avoid reusing a key+nonce pair. Consider per-sender counters and key rotation thresholds.
Session identity & verification (SAS concept)
To mitigate man-in-the-middle attacks via a compromised signaling server, many systems add out-of-band verification: a short authentication string (SAS), a QR code scan, or a fingerprint comparison.
// client/sas-concept.js (safe conceptual snippet)
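// Hash both public keys in a canonical order; hashing only theirPubKeyBytes would give each device a different value.
const [first, second] = sortLexicographically(myPubKeyBytes, theirPubKeyBytes);
const fingerprint = SHA256(concat(first, second));
const sas = humanReadableWords(fingerprint.slice(0, 10));
showToUsers("Verify both devices show the same words:", sas);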
Operational pitfalls checklist
- Glare handling: use Perfect Negotiation and stable transceivers when toggling tracks.
- Key lifecycle: rotate keys and wipe them on end; avoid stale sessions after refresh.
- ICE restarts: handle network changes; don't break transforms during renegotiation.
- Backpressure: transforms must be efficient; avoid async stalls per-frame.
- Compatibility: feature-detect Insertable Streams; ensure fallback paths.
- Logging: never log SDP/ICE or key material at info level in production.