Lesson 05: WebRTC Advanced Media Processing

Audio and Video Codecs

Common Codecs

WebRTC supports multiple audio and video codecs, each with distinct characteristics and use cases:

Video Codecs:

  • VP8: An open-source codec developed by Google, robust against packet loss, ideal for real-time communication.
  • VP9: Successor to VP8, offering higher compression efficiency, reducing bitrate by ~30% at the same quality.
  • H.264: Widely compatible codec with extensive hardware support, but requires patent licensing.

Audio Codecs:

  • Opus: Designed for the internet, spanning narrowband speech (8 kHz) through fullband audio (48 kHz), with bitrates from roughly 6 to 510 kbps.
  • G.711: Traditional telephony codec using simple logarithmic companding (PCM), low latency but a fixed 64 kbps bitrate.

Codec Characteristics Comparison:

| Codec | Compression Efficiency | Latency  | Hardware Support | Patent Status   |
|-------|------------------------|----------|------------------|-----------------|
| VP8   | Medium                 | Low      | Wide             | Royalty-free    |
| VP9   | High                   | Low      | Moderate         | Royalty-free    |
| H.264 | High                   | Low      | Very Wide        | Royalty-bearing |
| Opus  | Very High              | Very Low | Moderate         | Royalty-free    |
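As a rough illustration of VP9's ~30% efficiency gain over VP8 noted earlier (the 1500 kbps 720p baseline and the 0.30 factor are illustrative assumptions, not measured values):

```javascript
// Estimate the VP9 bitrate needed for quality comparable to a given VP8
// bitrate, assuming a fixed fractional saving.
function estimateVp9Bitrate(vp8BitrateKbps, savings = 0.30) {
  return Math.round(vp8BitrateKbps * (1 - savings));
}

console.log(estimateVp9Bitrate(1500)); // 1050
```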

Codec Selection and Configuration

WebRTC automatically selects the optimal codec based on network conditions and device capabilities, but developers can also configure preferences:

// Create the RTCPeerConnection; codec preferences are applied afterwards,
// either per transceiver or by reordering the SDP
const pc = new RTCPeerConnection({
  iceServers: [/* STUN/TURN servers */]
});

// Set codec priority in SDP by moving the preferred codec's payload type(s)
// to the front of the m-line
function preferCodec(sdp, codecType, codecName) {
  const lines = sdp.split('\r\n');

  // Find the m-line for the requested media type (e.g. "m=video ...")
  const mLineIndex = lines.findIndex(line => line.startsWith(`m=${codecType}`));
  if (mLineIndex === -1) return sdp;

  // Collect the payload types mapped to the preferred codec name from
  // rtpmap lines, e.g. "a=rtpmap:98 VP9/90000"
  const preferredPayloads = [];
  const rtpmapRegex = new RegExp(`^a=rtpmap:(\\d+) ${codecName}/`, 'i');
  for (const line of lines) {
    const match = line.match(rtpmapRegex);
    if (match) preferredPayloads.push(match[1]);
  }
  if (preferredPayloads.length === 0) return sdp;

  // Reorder the m-line: "m=video 9 UDP/TLS/RTP/SAVPF <payload types...>"
  const parts = lines[mLineIndex].split(' ');
  const header = parts.slice(0, 3);
  const payloads = parts.slice(3);
  const reordered = [
    ...payloads.filter(pt => preferredPayloads.includes(pt)),
    ...payloads.filter(pt => !preferredPayloads.includes(pt))
  ];
  lines[mLineIndex] = [...header, ...reordered].join(' ');

  return lines.join('\r\n');
}

// Example: Prefer VP9
const preferredSdp = preferCodec(originalSdp, 'video', 'VP9');
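Modern browsers also expose RTCRtpTransceiver.setCodecPreferences(), which avoids SDP munging entirely. The reordering itself is a pure list operation and can be factored out (the browser usage at the bottom assumes a `transceiver` variable from your connection setup):

```javascript
// Reorder a codec capability list so the preferred MIME type comes first.
// Pure helper: in the browser, pass the result to
// transceiver.setCodecPreferences() before creating the offer/answer.
function reorderCodecs(codecs, preferredMimeType) {
  const preferred = codecs.filter(c => c.mimeType === preferredMimeType);
  const others = codecs.filter(c => c.mimeType !== preferredMimeType);
  return [...preferred, ...others];
}

// Browser usage (sketch):
// const { codecs } = RTCRtpSender.getCapabilities('video');
// transceiver.setCodecPreferences(reorderCodecs(codecs, 'video/VP9'));
```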

Dynamic Codec Switching

WebRTC supports dynamic codec switching during a call:

// Switch codec based on network conditions
function adaptCodecBasedOnNetwork(quality) {
  if (quality === 'excellent') {
    // Plenty of bandwidth: use the most efficient codec
    switchToCodec('VP9');
  } else if (quality === 'good') {
    // Balanced choice
    switchToCodec('VP8');
  } else {
    // Constrained devices/networks: prefer the codec with the widest
    // hardware decoding support
    switchToCodec('H264'); // MIME subtype is "H264" (no dot)
  }
}

// Implementation of codec switching via setCodecPreferences
async function switchToCodec(codecName) {
  if (!pc) return;

  // 1. Find the video transceiver
  const transceiver = pc.getTransceivers()
    .find(t => t.sender.track && t.sender.track.kind === 'video');
  if (!transceiver || !transceiver.setCodecPreferences) return;

  // 2. Move the requested codec to the front of the preference list
  //    (mimeType subtypes: "VP8", "VP9", "H264")
  const { codecs } = RTCRtpSender.getCapabilities('video');
  const preferred = codecs.filter(
    c => c.mimeType.toLowerCase() === `video/${codecName}`.toLowerCase()
  );
  if (preferred.length === 0) return; // Codec not supported on this device
  const others = codecs.filter(c => !preferred.includes(c));
  transceiver.setCodecPreferences([...preferred, ...others]);

  // 3. Renegotiate so the new preference takes effect
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  // 4. Send the new offer via the signaling server
  signalingChannel.send({
    type: 'offer',
    sdp: pc.localDescription,
    roomId: currentRoomId
  });
}

Hardware Acceleration

WebRTC supports various hardware acceleration methods:

  1. GPU Acceleration:
    • Video encoding/decoding handled by GPU.
    • Significantly reduces CPU usage.
    • Enabled by default in modern browsers.
  2. DSP Acceleration:
    • Audio processing handled by dedicated digital signal processors.
    • Particularly useful for echo cancellation and noise suppression.
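One portable way to check for hardware support before a call starts is the Media Capabilities API: a `powerEfficient` result usually indicates a hardware decoder. A sketch, assuming a browser that supports the `'webrtc'` decoding type (the resolution and bitrate values are illustrative):

```javascript
// Query whether a given WebRTC decode configuration is power-efficient,
// which generally implies hardware acceleration. Returns null when the
// Media Capabilities API is unavailable.
async function isDecodeHardwareAccelerated(contentType) {
  if (typeof navigator === 'undefined' || !navigator.mediaCapabilities) {
    return null; // Not available in this environment
  }
  const info = await navigator.mediaCapabilities.decodingInfo({
    type: 'webrtc',
    video: {
      contentType,      // e.g. 'video/VP9'
      width: 1280,
      height: 720,
      bitrate: 1500000, // bps
      framerate: 30
    }
  });
  return info.supported && info.powerEfficient;
}
```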

Detecting Hardware Acceleration Status:

// Check whether the video encoder implementation looks hardware-accelerated
function checkHardwareAcceleration() {
  if (!pc || !pc.getStats) return;

  pc.getStats().then(stats => {
    stats.forEach(report => {
      if (report.type === 'outbound-rtp' && report.encoderImplementation) {
        console.log('Encoder implementation:', report.encoderImplementation);
        // Software encoders report library names such as "libvpx" or
        // "OpenH264"; other values (e.g. "ExternalEncoder") generally
        // indicate a platform/hardware encoder
      }
    });
  });
}

// Periodic check
setInterval(checkHardwareAcceleration, 10000);

Codec Performance Optimization

Key strategies for optimizing codec performance:

  1. Adaptive Bitrate (ABR):
    • Dynamically adjusts bitrate based on network conditions.
    • Prevents congestion and packet loss.
  2. Keyframe Requests:
    • Requests keyframes after severe packet loss.
    • Quickly restores video quality.
  3. Forward Error Correction (FEC):
    • Adds redundant packets.
    • Enhances packet loss resilience.
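The FEC idea in item 3 can be illustrated with a toy XOR parity scheme: one parity packet per group lets the receiver rebuild any single lost packet. Real WebRTC FEC (e.g. FlexFEC) is considerably more elaborate, but the recovery principle is the same:

```javascript
// Build an XOR parity packet over a group of media packets.
function xorParity(packets) {
  const len = Math.max(...packets.map(p => p.length));
  const parity = new Uint8Array(len);
  for (const p of packets) {
    for (let i = 0; i < p.length; i++) parity[i] ^= p[i];
  }
  return parity;
}

// Recover a single lost packet by XOR-ing the survivors with the parity.
function recoverLost(receivedPackets, parity) {
  return xorParity([...receivedPackets, parity]);
}
```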

Implementation Example:

// Implement simple network adaptation
let lastBytesSent = 0;
let lastTime = Date.now();
let currentBitrate = 1000; // Initial bitrate (kbps)

function monitorNetworkAndAdjust() {
  const now = Date.now();
  const elapsed = (now - lastTime) / 1000; // Seconds

  pc.getStats().then(stats => {
    let bytesSent = 0;
    stats.forEach(report => {
      if (report.type === 'outbound-rtp' && report.bytesSent) {
        bytesSent += report.bytesSent;
      }
    });

    // Calculate current bitrate
    const currentBitrateKbps = (bytesSent - lastBytesSent) * 8 / elapsed / 1000;

    // Simple adjustment strategy
    if (currentBitrateKbps < currentBitrate * 0.8) {
      // Bitrate drops by over 20%, reduce target bitrate
      currentBitrate = Math.max(300, currentBitrate * 0.8);
    } else if (currentBitrateKbps > currentBitrate * 0.9) {
      // Bitrate stable, try increasing
      currentBitrate = Math.min(3000, currentBitrate * 1.1);
    }

    // Update records
    lastBytesSent = bytesSent;
    lastTime = now;

    // Apply new bitrate settings
    applyBitrateSetting(currentBitrate);
  });
}

function applyBitrateSetting(bitrate) {
  if (!pc) return;

  // Apply the bitrate cap via RTCRtpSender.setParameters()
  const sender = pc.getSenders().find(s => s.track && s.track.kind === 'video');
  if (!sender) return;

  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) {
    params.encodings = [{}];
  }
  params.encodings[0].maxBitrate = bitrate * 1000; // kbps -> bps

  sender.setParameters(params)
    .catch(err => console.error('Failed to apply bitrate:', err));
}

// Periodically monitor network
setInterval(monitorNetworkAndAdjust, 2000);

Audio Processing Techniques

Acoustic Echo Cancellation (AEC)

Echo cancellation is one of the most critical audio processing techniques:

AEC Implementation in WebRTC:

  • Adaptive linear filtering against the far-end reference signal.
  • Delay estimation and alignment.
  • Nonlinear suppression of residual echo.
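The adaptive-filter item above can be sketched as a single LMS (least mean squares) update step. This is a toy illustration, not the browser's actual AEC:

```javascript
// One LMS step: estimate the echo from the far-end reference samples,
// subtract it from the microphone sample, and nudge the filter weights.
function lmsStep(weights, reference, micSample, mu = 0.1) {
  const echoEstimate = weights.reduce((sum, w, i) => sum + w * reference[i], 0);
  const error = micSample - echoEstimate; // Echo-cancelled output sample
  const newWeights = weights.map((w, i) => w + mu * error * reference[i]);
  return { output: error, weights: newWeights };
}
```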

Configuration Options:

// Request echo cancellation via getUserMedia audio constraints
const localStream = await navigator.mediaDevices.getUserMedia({
  audio: {
    echoCancellation: true
  }
});
// Note: Most AEC parameters are managed internally by the browser

// Check whether the browser actually applied AEC to the track
function checkAecStatus(stream) {
  const audioTrack = stream.getAudioTracks()[0];
  if (audioTrack && audioTrack.getSettings) {
    console.log('Echo cancellation active:', audioTrack.getSettings().echoCancellation);
    // true means the browser is applying echo cancellation to this track
  }
}

Noise Suppression (NS)

Noise suppression techniques eliminate background noise:

NS Implementation in WebRTC:

  • Spectral subtraction.
  • Wiener filtering.
  • Machine learning methods.
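Spectral subtraction, the first method above, reduces to a per-bin subtraction of an estimated noise magnitude from the signal's magnitude spectrum (a toy sketch on magnitude arrays; real implementations add smoothing and a noise floor):

```javascript
// Subtract an estimated noise magnitude from each frequency bin,
// flooring negative results at zero.
function spectralSubtract(magnitudes, noiseEstimate) {
  return magnitudes.map((m, i) => Math.max(0, m - noiseEstimate[i]));
}
```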

Configuration Example:

// While most NS parameters are managed by the browser, constraints can influence behavior
const constraints = {
  audio: {
    noiseSuppression: true,  // Enable noise suppression
    echoCancellation: true,  // Echo cancellation often works with noise suppression
    autoGainControl: true    // Automatic gain control also aids noise suppression
  }
};

navigator.mediaDevices.getUserMedia(constraints)
  .then(stream => {
    // Process media stream
  });

Automatic Gain Control (AGC)

AGC automatically adjusts audio gain to maintain stable output volume:

AGC Working Principle:

  • Short-term gain adjustment.
  • Long-term gain adjustment.
  • Peak limiting.
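The gain-adjustment and peak-limiting steps above can be sketched as one toy processing pass over a block of samples (the targetRms and limit values are illustrative; the browser's AGC is adaptive over much longer time scales):

```javascript
// Scale a block of samples toward a target RMS level, then hard-limit
// the peaks so no sample exceeds the limit.
function agcStep(samples, targetRms = 0.1, limit = 0.95) {
  const rms = Math.sqrt(samples.reduce((s, x) => s + x * x, 0) / samples.length);
  const gain = rms > 0 ? targetRms / rms : 1;
  return samples.map(x => Math.max(-limit, Math.min(limit, x * gain)));
}
```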

Detecting AGC Status:

function checkAgcStatus(stream) {
  const audioTrack = stream.getAudioTracks()[0];
  if (audioTrack && audioTrack.getSettings) {
    console.log('Automatic gain control active:', audioTrack.getSettings().autoGainControl);
    // true means the browser is applying AGC to this track
  }
}

Audio Mixing and Routing

WebRTC supports mixing and routing of multiple audio streams:

Multi-Stream Audio Processing:

// Add multiple audio tracks to connection
const audioStream1 = await navigator.mediaDevices.getUserMedia({ audio: true });
const audioStream2 = await navigator.mediaDevices.getUserMedia({ audio: true });

// Add first audio track
pc.addTrack(audioStream1.getAudioTracks()[0], audioStream1);

// Add second audio track
pc.addTrack(audioStream2.getAudioTracks()[0], audioStream2);

// Note: Each added track is sent as a separate RTP stream rather than mixed
// by the sender; received audio tracks are mixed at playback. To control mix
// ratios, mix with the Web Audio API before adding a single combined track.
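Sender-side mixing with controllable ratios can be sketched with the Web Audio API (browser-only; the stream arguments and gain values are illustrative):

```javascript
// Mix two microphone streams with independent gains and return a single
// MediaStream whose one audio track can be passed to pc.addTrack().
function mixStreams(audioContext, streamA, streamB, gainA = 1.0, gainB = 0.5) {
  const destination = audioContext.createMediaStreamDestination();

  const sourceA = audioContext.createMediaStreamSource(streamA);
  const nodeA = audioContext.createGain();
  nodeA.gain.value = gainA;
  sourceA.connect(nodeA).connect(destination);

  const sourceB = audioContext.createMediaStreamSource(streamB);
  const nodeB = audioContext.createGain();
  nodeB.gain.value = gainB;
  sourceB.connect(nodeB).connect(destination);

  return destination.stream; // Single mixed MediaStream
}

// Browser usage (sketch):
// const mixed = mixStreams(new AudioContext(), audioStream1, audioStream2);
// pc.addTrack(mixed.getAudioTracks()[0], mixed);
```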

Audio Routing Control:

// Control audio output routing on supported devices
if (navigator.mediaDevices && navigator.mediaDevices.enumerateDevices) {
  navigator.mediaDevices.enumerateDevices()
    .then(devices => {
      const audioOutputDevices = devices.filter(device => device.kind === 'audiooutput');

      if (audioOutputDevices.length > 1) {
        console.log('Available audio output devices:', audioOutputDevices);
        // Switch the output device with HTMLMediaElement.setSinkId()
        // (supported in Chromium-based browsers; requires a secure context)
        const audioElement = document.querySelector('audio');
        if (audioElement && audioElement.setSinkId) {
          audioElement.setSinkId(audioOutputDevices[1].deviceId)
            .catch(err => console.error('Failed to switch audio output:', err));
        }
      }
    });
}

Audio Quality Assessment

Common metrics and methods for assessing audio quality:

Objective Quality Metrics:

  • PESQ (Perceptual Evaluation of Speech Quality): MOS-style scores from roughly 1 to 4.5, higher is better.
  • POLQA (Perceptual Objective Listening Quality Analysis): Successor to PESQ (ITU-T P.863), adding support for super-wideband and fullband audio.

Audio Quality Metrics in WebRTC:

function monitorAudioQuality() {
  if (!pc || !pc.getStats) return;

  pc.getStats().then(stats => {
    stats.forEach(report => {
      if (report.type === 'inbound-rtp' && report.kind === 'audio') {
        console.log('Packets lost:', report.packetsLost);
        console.log('Jitter (s):', report.jitter);
        // Average jitter buffer delay per emitted audio sample
        if (report.jitterBufferEmittedCount > 0) {
          console.log('Avg jitter buffer delay (s):',
            report.jitterBufferDelay / report.jitterBufferEmittedCount);
        }
      }

      if (report.type === 'remote-inbound-rtp' && report.kind === 'audio') {
        console.log('Round-trip time (s):', report.roundTripTime);
        console.log('Fraction lost:', report.fractionLost);
        // Lower and more stable values indicate better network conditions
      }
    });
  });
}

// Periodic monitoring
setInterval(monitorAudioQuality, 3000);
