webrtc
introduction to webrtc: https://hpbn.co/webrtc/
conclusion
Browsers are crap for this. For a low latency e2e encrypted group video call thing in a browser, your options are:
1. if you want low latency / something that works over a crap network:
- implement your own (low latency) encoder/decoder (there are webassembly implementations of vp8) + rtp stack (adjusting encoder settings based on send buffers / receiver reports), or implement an integrated codec and realtime protocol like salsify
- implement some crypto (webcrypto can probably be used for that) that encrypts the packets generated by salsify and forwards them to your sfu over a webrtc datachannel (unreliable unordered mode; you can look at the datachannel send buffer levels, so it should be possible to do low latency stuff)
- implement video layering (SVC) and FEC (send multiple quality streams, the receiver can choose which of the layers it wants), which is supported in vp8, which salsify is based on; alternatively send out a very low quality stream and a higher quality stream
- set up an sfu (selective forwarding unit, allows each client to choose what to receive); janus videoroom can be used for that, because it supports datachannels (might need some patches to support binary stuff over datachannel)
- write the stuff that will make sure that under crappy network conditions, you switch to the lower quality layers, or when it gets really bad, switch off video completely so at least the audio will work properly
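The switching logic from the last bullet can be sketched roughly like this. It is a minimal sketch, not a tuned controller: the type names, the bitrates and the thresholds are all made-up assumptions, and the inputs are assumed to come from something like the datachannel's bufferedAmount plus whatever loss / throughput numbers your own receiver reports give you.

```typescript
// Hypothetical layer names; "off" means audio only.
type Layer = "off" | "low" | "high";

interface NetworkSample {
  bufferedBytes: number; // e.g. read from RTCDataChannel.bufferedAmount
  lossFraction: number;  // 0..1, e.g. taken from your own receiver reports
  estimatedKbps: number; // sender-side throughput estimate
}

// Illustrative numbers only, not measured values.
const AUDIO_KBPS = 40;
const LOW_LAYER_KBPS = 150;
const HIGH_LAYER_KBPS = 1200;
const MAX_BUFFERED_BYTES = 64 * 1024;
const MAX_LOSS_FRACTION = 0.2;

function chooseVideoLayer(s: NetworkSample): Layer {
  // A backed-up send buffer or heavy loss means we are already sending
  // too much: drop video completely so the audio keeps working.
  if (s.bufferedBytes > MAX_BUFFERED_BYTES || s.lossFraction > MAX_LOSS_FRACTION) {
    return "off";
  }
  // Reserve bandwidth for audio first, video gets what is left.
  const videoBudgetKbps = s.estimatedKbps - AUDIO_KBPS;
  if (videoBudgetKbps >= HIGH_LAYER_KBPS) return "high";
  if (videoBudgetKbps >= LOW_LAYER_KBPS) return "low";
  return "off";
}
```

In a real client you would probably add some hysteresis so the layer does not flap on every sample.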
2. just encrypted videocalls (without taking network conditions and latency into consideration)
- record video using the MediaRecorder api (MediaStream Recording)
- Packetize it and encrypt the packets (key passed via url)
- send the packets over webrtc datachannel to janus videoroom
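The packetizing step could look something like the sketch below. It only covers splitting a recorded chunk into datachannel-sized packets and putting them back together; encryption (e.g. with webcrypto, using the key from the url) would be applied to each packet payload and is not shown. The 8 byte header layout and the function names are invented for illustration, they are not part of any spec.

```typescript
// Hypothetical header: chunkId (u32) + fragment index (u16) + fragment count (u16).
const HEADER_BYTES = 8;

// Split one recorded chunk into packets of at most maxPayload payload bytes.
function packetize(chunkId: number, data: Uint8Array, maxPayload: number): Uint8Array[] {
  const count = Math.max(1, Math.ceil(data.length / maxPayload));
  if (count > 0xffff) throw new Error("chunk too large for a 16-bit fragment count");
  const packets: Uint8Array[] = [];
  for (let i = 0; i < count; i++) {
    const payload = data.subarray(i * maxPayload, (i + 1) * maxPayload);
    const packet = new Uint8Array(HEADER_BYTES + payload.length);
    const view = new DataView(packet.buffer);
    view.setUint32(0, chunkId);
    view.setUint16(4, i);
    view.setUint16(6, count);
    packet.set(payload, HEADER_BYTES);
    packets.push(packet);
  }
  return packets;
}

// Rebuild a chunk from its packets (any order). Returns null if a fragment
// is missing, which will happen on an unreliable datachannel.
function reassemble(packets: Uint8Array[]): Uint8Array | null {
  const parts: (Uint8Array | undefined)[] = [];
  let chunkId = -1;
  let count = 0;
  for (const p of packets) {
    if (p.length < HEADER_BYTES) continue; // truncated packet, ignore it
    const view = new DataView(p.buffer, p.byteOffset, p.byteLength);
    const id = view.getUint32(0);
    if (chunkId === -1) {
      chunkId = id;
      count = view.getUint16(6);
    }
    if (id !== chunkId) continue; // packet belongs to another chunk
    const index = view.getUint16(4);
    if (index < count) parts[index] = p.subarray(HEADER_BYTES);
  }
  if (chunkId === -1 || count === 0) return null;
  let total = 0;
  for (let i = 0; i < count; i++) {
    const part = parts[i];
    if (part === undefined) return null;
    total += part.length;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (let i = 0; i < count; i++) {
    const part = parts[i] as Uint8Array;
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

Something like packetize(chunkId, chunk, 1192) keeps every packet at or below 1200 bytes, which is an assumed safe-ish message size, not a number taken from janus.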
3. better version of 2
- 2, but with simulcasting (sending a low res version of your video + a high res version)
- display the high res version of the person currently talking
- clients on a crappy connection can choose to only receive and send the low res version
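The receive side of this can be sketched as a small selection function: pick the loudest participant as the current speaker and ask the sfu for their high res stream, low res for everybody else. This is a rough sketch under assumptions: the Participant shape, the audio level scale (0..1) and the silence threshold are hypothetical, and how you actually tell the sfu which stream you want depends on the sfu.

```typescript
type Quality = "low" | "high";

// Hypothetical participant record; audioLevel is assumed to be 0..1.
interface Participant {
  id: string;
  audioLevel: number;
}

// The loudest participant above a silence threshold, or null if all are quiet.
function pickActiveSpeaker(participants: Participant[], minLevel = 0.05): string | null {
  let best: Participant | null = null;
  for (const p of participants) {
    if (p.audioLevel >= minLevel && (best === null || p.audioLevel > best.audioLevel)) {
      best = p;
    }
  }
  return best === null ? null : best.id;
}

// Decide which simulcast version to receive for each participant.
// On a crappy connection everything stays on the low res version.
function pickSubscriptions(participants: Participant[], crappyConnection: boolean): Record<string, Quality> {
  const speaker = crappyConnection ? null : pickActiveSpeaker(participants);
  const result: Record<string, Quality> = {};
  for (const p of participants) {
    result[p.id] = p.id === speaker ? "high" : "low";
  }
  return result;
}
```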
4. don't use a browser
- all the things that are an issue in a browser (control over encoder / rtp stack / crypto) are not an issue if you have native code
- server side is not an issue, there are things like janus (and a bunch of others) that can do the SFU thing
- there are a bunch of rtp stacks (webrtc is just rtp with limitations) in different languages:
- gstreamer (gobject, can be used from a bunch of languages)
- pjsip
- pion (webrtc stack in go, has some examples where media is done in gstreamer) https://github.com/pion/webrtc
- libwebrtc (google webrtc reference? implementation) https://github.com/aisouard/libwebrtc
- libjitsi (java)
- https://www.gnu.org/software/ccrtp/
to get the webrtc examples working in chrome:
- disable this, otherwise you can't connect to yourself: chrome://flags/#enable-webrtc-hide-local-ips-with-mdns
- chromium --use-fake-device-for-media-stream, so you don't need to connect a webcam
set up a janus server for testing
- docker pull mcroth/docker-janus
end to end crypto in webrtc
Webrtc cannot be used (without doing messy shit) for end2end encrypted group calls via a selective forwarding unit. There used to be browser apis that would let you build such a thing (use static keying instead of dtls), but they took them out.
There have been proposals to fix this in browsers for more than 8 years:
2011 PERC:
- https://www.w3.org/2011/04/webrtc/wiki/images/2/2f/E2EE_for_conference_calls_In_WebRTC.pdf
- https://tools.ietf.org/html/draft-roach-perc-webrtc-00
- https://github.com/agouaillard/perc-webrtc
(non-existing) browser apis:
2011 (used to be supported, taken out) https://tools.ietf.org/id/draft-ohlsson-rtcweb-sdes-support-00.html https://bugzilla.mozilla.org/show_bug.cgi?id=825515
2018 Issue 9681: Integrate Per Frame Encryption Interface Into WebRTC https://bugs.chromium.org/p/webrtc/issues/detail?id=9681
PERC
insertable streams: https://www.chromestatus.com/feature/6321945865879552 https://github.com/alvestrand/webrtc-media-streams
2020 Sframes: https://webrtcbydralex.com/index.php/2020/03/30/secure-frames-sframes-end-to-end-media-encryption-with-webrtc-now-in-chrome/
- changing codec parameters: https://github.com/WICG/web-codecs/blob/master/explainer.md
Expose an explicit set/get low-latency versus "smoothing" MSE API https://github.com/w3c/media-source/issues/21
webgpu https://github.com/WebAssembly/WASI/issues/53
- https://github.com/gpuweb/gpuweb/wiki/Implementation-Status
existing browser apis
- Streams https://streams.spec.whatwg.org/
- TransformStreams https://streams.spec.whatwg.org/#ts-class
- MediaStreamTracks https://www.w3.org/TR/mediacapture-streams/#dom-mediastreamtrack
- (this assumes control over encoder) https://stackoverflow.com/questions/55887980/how-to-use-media-source-extension-mse-low-latency-mode
- https://w3c.github.io/mediacapture-record/MediaRecorder.html
- https://developers.google.com/web/updates/2016/10/capture-stream
how others solve it:
- zoom is running its own rtp stack and encoders in webassembly, and uses datachannels (with websocket fallback) to communicate https://webrtchacks.com/zoom-avoids-using-webrtc/
webrtc implementations / libraries
- python https://github.com/aiortc/aiortc/
- peerjs browser abstraction: https://github.com/peers
random links
- https://github.com/WICG/web-codecs/blob/master/explainer.md
- https://github.com/medooze/libdatachannels/blob/master/specs.md
- https://developer.mozilla.org/en-US/docs/Web/API/SourceBuffer
- https://developer.mozilla.org/en-US/docs/Web/API/SourceBuffer/appendBuffer
- https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API
- https://www.w3.org/TR/webrtc/#dom-rtccertificate
- https://developer.mozilla.org/en-US/docs/Web/API/RTCDataChannel
- https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/generateCertificate
- https://developer.mozilla.org/en-US/docs/Web/API/RTCConfiguration#Using_certificates
- https://developer.mozilla.org/en-US/docs/Web/API/MediaKeySession
- https://developer.mozilla.org/en-US/docs/Web/API/Encrypted_Media_Extensions_API
- https://developer.mozilla.org/en-US/docs/Web/API/Crypto
- https://www.html5rocks.com/en/tutorials/eme/basics/
- https://github.com/samdutton/simpl/blob/gh-pages/eme/clearkey/js/main.js
- https://www.youtube.com/watch?v=BYtNI4esj1I&list=PL4_h-ulX5eNdJljPgBWAgD8l49YOjeMoc&index=3
- https://testrtc.com/happens-webrtc-shifts-turn-tcp/
stuff built with webrtc
- https://github.com/pion/rtwatch
- https://github.com/Uninett/webrtcdatamedia
- janus videoroom ( can do datachannel rooms as well ) https://janus.conf.meetecho.com/
codecs
SVC stands for Scalable Video Coding.
SVC is a technique that allows encoding a video stream once in multiple layers. The layers in SVC are akin to the layers of an onion: they can be “peeled off” while the video stays decodable, with quality dropping as each layer is removed.
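A toy model of the peeling, to show why this is nice for an sfu: the sfu does not have to re-encode anything, it just drops the packets above the layer a receiver asked for. This is a simplified sketch with a single layer index per packet (0 = base layer, higher layers assumed to depend on all lower ones); real codecs have separate spatial / temporal layer ids and more complicated dependencies. The LayeredPacket shape and the function names are made up.

```typescript
// Hypothetical packet description: which frame it belongs to, which
// layer it carries (0 = base layer) and how big it is.
interface LayeredPacket {
  frame: number;
  layer: number;
  bytes: number;
}

// Keep only the layers a receiver can afford; everything above is peeled off.
function peelLayers(packets: LayeredPacket[], maxLayer: number): LayeredPacket[] {
  return packets.filter((p) => p.layer <= maxLayer);
}

// Total size of a set of packets, to compare the cost of each choice.
function totalBytes(packets: LayeredPacket[]): number {
  let sum = 0;
  for (const p of packets) sum += p.bytes;
  return sum;
}
```

Every frame is still present after peeling (the base layer is always forwarded), only the amount of data per frame goes down.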
- https://github.com/excamera/alfalfa
- https://github.com/excamera/alfalfa/blob/master/src/net/packet.cc
- https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/fouladi
- https://snr.stanford.edu/salsify/
- http://ex.camera/nsdi17/
wasm links
- https://i.cs.hku.hk/fyp/2019/fyp19038/
- python in webassembly: https://webassembly.sh/?run-command=python
- https://fsan.github.io/post/emscripten_and_webassembly/
- webm encoder: https://github.com/GoogleChromeLabs/webm-wasm
- another webm encoder: https://github.com/antimatter15/whammy
- https://github.com/Kagami/ffmpeg.js
- https://github.com/GoogleChromeLabs/wasm-av1
- ffmpeg https://github.com/AlexVestin/WasmVideoEncoder
- https://hacks.mozilla.org/2017/02/webassembly-will-ease-collaboration-on-next-generation-video-codecs/
- https://medium.com/hackernoon/av1-bitstream-analyzer-d25f1c27072b
- webrtc stack in webassembly: https://webrtchacks.com/webassembly-experiments/
https://webrtchacks.com/zoom-avoids-using-webrtc/
https://www.babylonjs.com/