WebCodecs - Sending and Receiving

Introduction

Hello! 😎

In this tutorial I will show you how to use the WebCodecs API to both send and receive video.

First, let's get coding the server.

Setting Up The Server

In order to send and receive packets between peers we will need a WebSocket server.

For this we will create a very basic server using Node.js. First initialize the project:

npm init -y

Then install the required modules:

npm i ws express

Next create a new file called "index.js" and populate it with the following code:

// index.js
const WebSocket = require('ws');
const express = require('express');

const app = express();
const port = 3000;
const connectedClients = new Set();

app.use(express.static(__dirname + '/public'));

const wss = new WebSocket.Server({ noServer: true });

wss.on('connection', ws => {
  console.log('new connection');
  connectedClients.add(ws);

  ws.on('message', message => {
    connectedClients.forEach(client => {
      if (client !== ws && client.readyState === WebSocket.OPEN) {
        client.send(message);
      }
    });
  });

  ws.once('close', () => {
    connectedClients.delete(ws);
    console.log('connection closed');
  });
});

const server = app.listen(port, () => {
  console.log(`server running on port ${port}`);
});

server.on('upgrade', (request, socket, head) => {
  wss.handleUpgrade(request, socket, head, (ws) => {
    wss.emit('connection', ws, request);
  });
});

Nothing too complicated. The above code serves the public directory and handles the WebSocket connections, relaying every message it receives to all the other connected peers. 😸
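The relay rule in the message handler (send to everyone except the sender, and only to sockets that are still open) can be tried out without a network. The following is a minimal sketch, not part of the server above: relay and makeFakeClient are hypothetical names, and the fake clients only imitate the readyState and send members of a real socket.

```javascript
// OPEN mirrors the numeric value of WebSocket.OPEN (1).
const OPEN = 1;

// Hypothetical helper: forward a message to every open client except the sender.
// Returns how many clients the message was delivered to.
function relay(clients, sender, message) {
  let delivered = 0;
  for (const client of clients) {
    if (client !== sender && client.readyState === OPEN) {
      client.send(message);
      delivered += 1;
    }
  }
  return delivered;
}

// Stand-in client objects that record what they were sent.
function makeFakeClient(readyState = OPEN) {
  const received = [];
  return { readyState, received, send: (msg) => received.push(msg) };
}

const alice = makeFakeClient();
const bob = makeFakeClient();
const closing = makeFakeClient(2); // 2 is the value of WebSocket.CLOSING
const clients = new Set([alice, bob, closing]);

const count = relay(clients, alice, 'frame-1');
console.log(count, bob.received, alice.received.length, closing.received.length);
```

Here only bob receives the message: alice is the sender and the third client is no longer open.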

Next we will handle the sender part, but first create a new directory called "public":

mkdir public

Creating The Sender

The first front-end file we will create is the one that does the broadcasting. Create a new file called "sender.html" under public and populate it with the following HTML:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8"/>
    <meta name="viewport" content="width=device-width, initial-scale=1.0"/>
    <title>Sender</title>
    <style>
      video, canvas {
        width: 640px;
        height: 480px;
        border: 2px solid black;
        margin: 10px;
      }
    </style>
  </head>
  <body>
    <video id="video" autoplay playsinline></video>
    <canvas id="canvas" width="640" height="480"></canvas>

    <script>
      const videoElement = document.getElementById('video');
      const canvas = document.getElementById('canvas');
      const ctx = canvas.getContext('2d');
      let videoEncoder;
      let socket;

      const initWebSocket = () => {
        socket = new WebSocket('ws://localhost:3000');

        socket.onopen = () => console.log('WebSocket connected');
        socket.onerror = error => console.error('WebSocket error:', error);
      };

      const initEncoder = () => {
        videoEncoder = new VideoEncoder({
          output: (encodedChunk) => {
            const chunkData = new Uint8Array(encodedChunk.byteLength);
            encodedChunk.copyTo(chunkData);

            if (socket.readyState === WebSocket.OPEN) {
              socket.send(chunkData.buffer);
            }
          },
          error: (error) => console.error('Encoding error:', error)
        });

        videoEncoder.configure({
          codec: 'vp8',
          width: 640,
          height: 480,
          bitrate: 1_000_000,
          framerate: 30
        });
      };

      navigator.mediaDevices.getUserMedia({ video: true })
        .then((stream) => {
          videoElement.srcObject = stream;
          const videoTrack = stream.getVideoTracks()[0];
          const processor = new MediaStreamTrackProcessor(videoTrack);
          const reader = processor.readable.getReader();

          const processFrames = async () => {
            while (true) {
              const { value: videoFrame, done } = await reader.read();
              if (done) break;

              ctx.drawImage(videoFrame, 0, 0, canvas.width, canvas.height);
              // Request a key frame every time: the receiver labels each incoming chunk as 'key'.
              videoEncoder.encode(videoFrame, { keyFrame: true });
              videoFrame.close();
            }
          };

          processFrames();
        })
        .catch((error) => console.error('Failed to get camera', error));

      initEncoder();
      initWebSocket();
    </script>
  </body>
</html>

Here is a breakdown of what the code does.

HTML Structure
- The video element displays the live video from the user's camera.
- The canvas element shows each frame as it is handed to the encoder, which gives a visual preview of what is being encoded.
JavaScript code
- The initWebSocket function connects to the WebSocket server. This connection carries the encoded chunks to the receiver.
- The initEncoder function creates a VideoEncoder object. Its output callback runs every time the encoder produces a new chunk, copies the chunk into a Uint8Array, and sends it over the WebSocket.
- The videoEncoder.configure() call sets the codec to VP8 at 640x480, with a target bitrate of 1 Mbps and a frame rate of 30 FPS.
- The getUserMedia call requests access to the camera. The stream is assigned to the video element, and a MediaStreamTrackProcessor exposes the video track as a readable stream of frames so each one can be processed in real time.
- The processFrames function reads frames from that stream, draws them on the canvas element, and passes each one to videoEncoder.encode() as a key frame. The frame is closed afterwards to release its memory.
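Encoding every frame as a key frame keeps things simple, because each chunk can be decoded on its own, but key frames are much larger than delta frames. A possible refinement is to request a key frame only every N frames and to skip frames while the encoder has a backlog (VideoEncoder exposes this as encodeQueueSize). The following is only a sketch under assumptions: createFramePlanner, keyFrameInterval and maxQueueSize are hypothetical names and values, and the receiver would also have to be told each chunk's real type before delta frames could be decoded.

```javascript
// Hypothetical helper, not part of the tutorial code.
// keyFrameInterval and maxQueueSize are assumed tuning values.
function createFramePlanner({ keyFrameInterval = 60, maxQueueSize = 2 } = {}) {
  // Start at the interval so the first encoded frame is a key frame.
  let framesSinceKey = keyFrameInterval;

  return (encodeQueueSize) => {
    // Skip this frame when the encoder is already behind.
    if (encodeQueueSize > maxQueueSize) {
      return { encode: false, keyFrame: false };
    }
    const keyFrame = framesSinceKey >= keyFrameInterval;
    framesSinceKey = keyFrame ? 1 : framesSinceKey + 1;
    return { encode: true, keyFrame };
  };
}

// Inside processFrames() this could be used roughly like:
//   const decision = plan(videoEncoder.encodeQueueSize);
//   if (decision.encode) videoEncoder.encode(videoFrame, { keyFrame: decision.keyFrame });
//   videoFrame.close();

const plan = createFramePlanner({ keyFrameInterval: 3, maxQueueSize: 2 });
const decisions = [0, 0, 5, 0, 0].map((queueSize) => plan(queueSize));
console.log(decisions);
```

Counting encoded frames rather than captured frames means a skipped frame never causes a key frame to be missed.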

Phew! Hopefully that was understandable to you. Next we will create the file that will receive the stream. 👀

Creating The Receiver

This file receives the encoded video chunks via WebSocket, decodes them, and displays them on a canvas element.

Create a new file under the public directory called "receiver.html" and populate it with the following:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0"/>
    <title>Receiver</title>
    <style>
      canvas {
        width: 640px;
        height: 480px;
        border: 2px solid black;
        margin: 10px;
      }
    </style>
  </head>
  <body>
    <canvas id="canvas" width="640" height="480"></canvas>

    <script>
      const canvas = document.getElementById('canvas');
      const ctx = canvas.getContext('2d');
      let videoDecoder;

      const initWebSocket = () => {
        const socket = new WebSocket('ws://localhost:3000');
        socket.binaryType = 'arraybuffer';

        socket.onmessage = event => {
          decodeFrame(event.data);
        };
        socket.onerror = error => console.error('WebSocket error:', error);
      };

      const initDecoder = () => {
        videoDecoder = new VideoDecoder({
          output: (videoFrame) => {
            ctx.drawImage(videoFrame, 0, 0, canvas.width, canvas.height);
            videoFrame.close();
          },
          error: (error) => console.error('Decoding error:', error)
        });

        videoDecoder.configure({
          codec: 'vp8',
          width: 640,
          height: 480
        });
      };

      const decodeFrame = (encodedData) => {
        const chunk = new EncodedVideoChunk({
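          // Every chunk is labelled a key frame, matching the sender's keyFrame: true.
          type: 'key',
          // WebCodecs timestamps are integers in microseconds; performance.now() returns milliseconds.
          timestamp: Math.round(performance.now() * 1000),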
          data: new Uint8Array(encodedData)
        });

        videoDecoder.decode(chunk);
      };

      initDecoder();
      initWebSocket();
    </script>
  </body>
</html>

To break down the above file:

HTML
- The canvas element is the primary display area for the decoded video frames. It has a fixed width, height, and border, same as the sender page.
JavaScript
- The initWebSocket function creates a new WebSocket connection, receiving encoded frames from the sender and passing them to decodeFrame() for decoding.
- initDecoder initializes a VideoDecoder object configured for the VP8 codec. The decoder outputs each frame to the canvas.
- decodeFrame takes the encoded data, wraps it in an EncodedVideoChunk (as a key frame with the current timestamp), and decodes it via videoDecoder.decode(). Each frame is displayed on the canvas in real time.
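Because only the raw bytes travel over the WebSocket, the receiver has to guess the chunk type and timestamp. One way around that, sketched here as an assumption rather than anything defined by WebCodecs, is to prefix each packet with a small header. The 9-byte layout and the names packChunk and unpackChunk are hypothetical.

```javascript
// Hypothetical wire format (an assumption, not defined by WebCodecs):
//   byte 0     : 1 for a key frame, 0 for a delta frame
//   bytes 1..8 : timestamp in microseconds as a big-endian float64
//   bytes 9..  : the encoded payload
const HEADER_BYTES = 9;

// Sender side: combine type, timestamp and payload into one ArrayBuffer.
function packChunk({ type, timestamp, data }) {
  const packet = new Uint8Array(HEADER_BYTES + data.byteLength);
  const view = new DataView(packet.buffer);
  view.setUint8(0, type === 'key' ? 1 : 0);
  view.setFloat64(1, timestamp);
  packet.set(data, HEADER_BYTES);
  return packet.buffer;
}

// Receiver side: split the ArrayBuffer back into the fields
// that the EncodedVideoChunk constructor expects.
function unpackChunk(buffer) {
  const view = new DataView(buffer);
  return {
    type: view.getUint8(0) === 1 ? 'key' : 'delta',
    timestamp: view.getFloat64(1),
    data: new Uint8Array(buffer, HEADER_BYTES)
  };
}

const packed = packChunk({ type: 'delta', timestamp: 33333, data: new Uint8Array([1, 2, 3]) });
const roundTripped = unpackChunk(packed);
console.log(packed.byteLength, roundTripped.type, roundTripped.timestamp, Array.from(roundTripped.data));
```

In the sender's output callback the fields would come from encodedChunk.type and encodedChunk.timestamp, and the receiver could then call videoDecoder.decode(new EncodedVideoChunk(unpackChunk(event.data))).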

Phew! Now that we have all the pieces needed, let's actually run it! 🤓

Running The Code

To start the server, run the following command:

node index.js

Then point your browser to http://localhost:3000/sender.html and allow access to your camera. After that, open another tab at http://localhost:3000/receiver.html

You should see the stream from the sender being displayed on the receiver page.
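Both pages hard-code ws://localhost:3000, so they only work when the browser and the server run on the same machine. If you want to open the pages from another host, one option is to derive the WebSocket URL from the page's own address. This is a small sketch, and webSocketUrlFor is a hypothetical helper name. Keep in mind that browsers only allow camera access in a secure context, which means localhost or HTTPS.

```javascript
// Hypothetical helper: build the WebSocket URL from a location-like object.
// Pages served over https: must use wss:, otherwise ws: is used.
function webSocketUrlFor(location) {
  const scheme = location.protocol === 'https:' ? 'wss:' : 'ws:';
  return `${scheme}//${location.host}`;
}

// In the browser: socket = new WebSocket(webSocketUrlFor(window.location));
const localUrl = webSocketUrlFor({ protocol: 'http:', host: 'localhost:3000' });
const secureUrl = webSocketUrlFor({ protocol: 'https:', host: 'example.com' });
console.log(localUrl, secureUrl);
```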

Conclusion

In this tutorial I have shown how to get access to the camera, encode it, send the chunks over WebSocket and decode and display them on the receiver side. I hope this tutorial was of use to you. 🥳

As always, you can get the code from my GitHub:
https://github.com/ethand91/webcodec-stream

Happy Coding! 😎
