Lesson 07-WebSocket Message Format and Serialization

The WebSocket protocol defines only how data is framed on the wire; the message format and serialization method used by the application are left entirely to developers. The choice of format and serialization approach affects application performance, development efficiency, and maintainability.

WebSocket Message Structure Basics

Relationship Between WebSocket Frames and Messages

The WebSocket protocol carries data in frames, and one or more frames make up a logical message. Most WebSocket libraries, and the browser API, reassemble the frames and hand the application complete messages. This layered design has the following characteristics:

  • Frames: Wire-level units handled by the protocol stack (fragmentation, masking, optional compression, etc.). Encryption is provided by TLS underneath, not by the frames themselves.
  • Messages: Logical units seen by the application, composed of one or more frames.

[Frame 1][Frame 2][Frame 3]...[Frame N] → [Complete Application Message]

Message Boundary Handling

WebSocket handles message boundaries through the following mechanisms:

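  1. FIN Flag: Marks the final frame of a message; the message is complete once a frame with FIN set arrives.
  2. Opcode: Identifies the message type (text or binary) on the first frame; continuation frames use opcode 0x0.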
  3. Fragmentation Mechanism: Allows large messages to be split into multiple frames.
// Sketch of message reassembly from frames.
// Assumes frame = { fin, opcode, payload } with payload as a Node.js Buffer,
// and that processCompleteMessage is defined by the application.
let fragments = [];   // Payload chunks of the message being assembled
let isBinary = false; // Type of the message being assembled

function handleFrame(frame) {
  if (frame.opcode === 0x1 || frame.opcode === 0x2) {
    // Text (0x1) or binary (0x2) frame: starts a new message
    isBinary = frame.opcode === 0x2;
    fragments = [frame.payload];
  } else if (frame.opcode === 0x0) {
    // Continuation frame: append to the current message
    fragments.push(frame.payload);
  } else {
    // Control frames (close, ping, pong) are handled elsewhere
    return;
  }

  if (frame.fin) {
    // Final frame: the message is complete.
    // Decode text only after joining, since a UTF-8 character may span frames.
    const payload = Buffer.concat(fragments);
    fragments = [];
    processCompleteMessage(isBinary ? payload : payload.toString('utf8'));
  }
}
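
The FIN flag and opcode that drive this logic sit in the first byte of every frame header (RFC 6455, section 5.2). The following is a minimal sketch of reading them. `parseFrameHeader` is a hypothetical helper name, and extended payload lengths and the masking key are deliberately left out.

```javascript
// Hypothetical helper: decode the first two bytes of a WebSocket frame header.
// Layout per RFC 6455: byte 0 = FIN(1) RSV(3) OPCODE(4), byte 1 = MASK(1) LEN(7).
function parseFrameHeader(bytes) {
  if (bytes.length < 2) {
    throw new RangeError('a frame header needs at least 2 bytes');
  }
  return {
    fin: (bytes[0] & 0x80) !== 0,     // final fragment of the message?
    opcode: bytes[0] & 0x0f,          // 0x0 continuation, 0x1 text, 0x2 binary
    masked: (bytes[1] & 0x80) !== 0,  // client-to-server frames are masked
    lengthField: bytes[1] & 0x7f      // 0-125 literal; 126/127 signal an extended length
  };
}

// Headers of a text message "Hello" split into two fragments ("Hel" + "lo")
const first = parseFrameHeader(Uint8Array.of(0x01, 0x03));
const last = parseFrameHeader(Uint8Array.of(0x80, 0x02));
console.log(first.fin, first.opcode); // false 1
console.log(last.fin, last.opcode);   // true 0
```

The first fragment carries the text opcode with FIN cleared, and the last one is a continuation frame with FIN set, which is the pattern the reassembly logic relies on.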

Comparison of Common Message Formats

Text Format (JSON)

JSON Format Example:

{
  "type": "chat",
  "content": "Hello, WebSocket!",
  "sender": "user123",
  "timestamp": 1634567890123
}

Advantages:

  • Human-readable, easy for debugging.
  • Widely supported, with JSON libraries available in nearly all languages.
  • Suitable for structured data.

Disadvantages:

  • Redundant characters (quotes, commas, etc.) increase message size.
  • Text parsing is typically slower than decoding a binary format.
  • No schema or built-in type checking, and no native binary or 64-bit integer type.

Performance Data (indicative only; actual figures depend on payload shape, library, and runtime):

  • Average message size: Approximately 30-50% larger than binary formats.
  • Parsing speed: 2-5 times slower than binary formats.

Binary Formats (Protobuf/MessagePack)

Protobuf Example:

syntax = "proto3";

message ChatMessage {
  string type = 1;
  string content = 2;
  string sender = 3;
  int64 timestamp = 4;
}
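
Part of Protobuf's compactness comes from sending field numbers instead of field names. As a rough illustration, the sketch below hand-encodes a single string field following the documented wire format (a tag byte, a length, then the UTF-8 bytes). `encodeStringField` is a hypothetical helper, restricted to cases where the tag and length each fit in one byte; real code would use a Protobuf library.

```javascript
// Sketch: encode one Protobuf string field by hand.
// Limited to field numbers 1-15 and values under 128 bytes, so the tag and
// the length each fit in a single byte (no multi-byte varints needed).
function encodeStringField(fieldNumber, text) {
  const value = new TextEncoder().encode(text);
  if (fieldNumber < 1 || fieldNumber > 15 || value.length > 127) {
    throw new RangeError('outside the range this sketch supports');
  }
  const LENGTH_DELIMITED = 2;                        // wire type used for strings
  const tag = (fieldNumber << 3) | LENGTH_DELIMITED; // field number + wire type
  return Uint8Array.of(tag, value.length, ...value);
}

const toHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join(' ');

console.log(toHex(encodeStringField(1, 'chat')));    // 0a 04 63 68 61 74
console.log(toHex(encodeStringField(3, 'user123'))); // 1a 07 75 73 65 72 31 32 33
```

The name "type" never appears on the wire; only the field number 1 does, which is why both sides need the same schema.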

MessagePack Example:

// Object before encoding
{
  type: 'chat',
  content: 'Hello, WebSocket!',
  sender: 'user123',
  timestamp: 1634567890123
}

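// Encoded binary (hexadecimal representation, 71 bytes)
84                                                        // map with 4 entries
a4 74 79 70 65  a4 63 68 61 74                            // "type": "chat"
a7 63 6f 6e 74 65 6e 74                                   // "content":
b1 48 65 6c 6c 6f 2c 20 57 65 62 53 6f 63 6b 65 74 21     //   "Hello, WebSocket!"
a6 73 65 6e 64 65 72  a7 75 73 65 72 31 32 33             // "sender": "user123"
a9 74 69 6d 65 73 74 61 6d 70                             // "timestamp":
cf 00 00 01 7c 93 d6 a4 cb                                //   1634567890123 (uint 64)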
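
To show where such bytes come from, the following sketch hand-encodes a small subset of MessagePack: fixmap, fixstr, and non-negative integers. `encodeValue` is a hypothetical function written for illustration, not the API of any MessagePack library, and it always uses the 8-byte integer form for values of 128 and above, whereas real encoders pick the narrowest width.

```javascript
// Minimal MessagePack encoder sketch: small maps, short strings,
// and non-negative safe integers only.
const utf8 = new TextEncoder();

function encodeValue(value, out) {
  if (typeof value === 'string') {
    const bytes = utf8.encode(value);
    if (bytes.length > 31) throw new RangeError('sketch supports fixstr only');
    out.push(0xa0 | bytes.length, ...bytes);          // fixstr: 101xxxxx + UTF-8 bytes
  } else if (Number.isSafeInteger(value) && value >= 0) {
    if (value < 0x80) {
      out.push(value);                                // positive fixint
    } else {
      const view = new DataView(new ArrayBuffer(8));
      view.setBigUint64(0, BigInt(value));            // big-endian by default
      out.push(0xcf, ...new Uint8Array(view.buffer)); // uint 64
    }
  } else if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
    const entries = Object.entries(value);
    if (entries.length > 15) throw new RangeError('sketch supports fixmap only');
    out.push(0x80 | entries.length);                  // fixmap: 1000xxxx
    for (const [key, item] of entries) {
      encodeValue(key, out);
      encodeValue(item, out);
    }
  } else {
    throw new TypeError('unsupported type in this sketch');
  }
  return out;
}

const bytes = encodeValue({ type: 'chat', sender: 'user123' }, []);
const hex = bytes.map((b) => b.toString(16).padStart(2, '0')).join(' ');
console.log(hex);
// 82 a4 74 79 70 65 a4 63 68 61 74 a6 73 65 6e 64 65 72 a7 75 73 65 72 31 32 33
```

Each string costs one prefix byte plus its UTF-8 content, and the map costs a single header byte, so the quotes, colons, commas, and braces of JSON disappear. Key names are still transmitted, which is the main difference from Protobuf.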

Advantages:

  • Compact binary format, smaller size.
  • Fast parsing speed.
  • Strict type system (especially with Protobuf).
  • Supports forward/backward compatibility (in Protobuf, through numbered fields that older and newer readers can skip or default).

Disadvantages:

  • Not human-readable, difficult to debug.
  • Protobuf requires a predefined schema shared by both sides; MessagePack is schemaless.
  • Limited support in some languages.

Performance Data (indicative only; actual figures depend on payload shape, library, and runtime):

  • Average message size: 50-70% smaller than JSON.
  • Parsing speed: 5-10 times faster than JSON.

Custom Binary Format

Design Considerations for Custom Formats:

  1. Field Order: Fixed field order reduces parsing overhead.
  2. Data Types: Choose the most compact representation.
  3. Variable-Length Encoding: Use techniques like Varint for numbers.
  4. Alignment: Consider aligning multi-byte fields to their natural boundaries (typically 4 or 8 bytes); CPU cache lines themselves are usually 64 bytes.
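
The variable-length encoding in item 3 can be sketched as follows. This is the unsigned base-128 varint scheme that Protobuf uses: seven payload bits per byte, with the high bit signalling that more bytes follow. The function names are illustrative, and BigInt is used so that values above 2^53 stay exact.

```javascript
// Unsigned base-128 varint (LEB128), least significant group first.
function encodeVarint(value) {
  let n = BigInt(value);
  if (n < 0n) throw new RangeError('sketch handles unsigned values only');
  const out = [];
  do {
    let byte = Number(n & 0x7fn); // low 7 bits
    n >>= 7n;
    if (n > 0n) byte |= 0x80;     // continuation bit: more bytes follow
    out.push(byte);
  } while (n > 0n);
  return Uint8Array.from(out);
}

function decodeVarint(bytes) {
  let result = 0n;
  let shift = 0n;
  for (const byte of bytes) {
    result |= BigInt(byte & 0x7f) << shift;
    if ((byte & 0x80) === 0) return result; // last byte reached
    shift += 7n;
  }
  throw new RangeError('truncated varint');
}

console.log(encodeVarint(1));                          // bytes: 01
console.log(encodeVarint(300));                        // bytes: ac 02
console.log(decodeVarint(Uint8Array.of(0xac, 0x02)));  // 300n
console.log(encodeVarint(1634567890123).length);       // 6
```

A millisecond timestamp fits in 6 bytes instead of the 8 a fixed 64-bit field would take, and small values such as type codes take a single byte.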

Example: Simple Custom Binary Message Format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type (1 byte) |           Payload Length (3 bytes)            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                    Payload (variable)                         |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
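
A possible encoder and decoder for this layout is sketched below. The byte order of the length field is not stated in the diagram, so big-endian (network order) is assumed here, and the `CHAT` type code is a made-up value for the example.

```javascript
// Sketch of the header layout: 1-byte type, 3-byte length, then payload.
// Assumption: the length is stored big-endian (network byte order).
const HEADER_SIZE = 4;
const MAX_PAYLOAD = 0xffffff; // largest value a 3-byte length can hold

function encodeMessage(type, payload) {
  if (!Number.isInteger(type) || type < 0 || type > 0xff) {
    throw new RangeError('type must fit in 1 byte');
  }
  if (payload.length > MAX_PAYLOAD) {
    throw new RangeError('payload too large for a 3-byte length');
  }
  const out = new Uint8Array(HEADER_SIZE + payload.length);
  out[0] = type;
  out[1] = (payload.length >>> 16) & 0xff;
  out[2] = (payload.length >>> 8) & 0xff;
  out[3] = payload.length & 0xff;
  out.set(payload, HEADER_SIZE);
  return out;
}

function decodeMessage(bytes) {
  if (bytes.length < HEADER_SIZE) throw new RangeError('incomplete header');
  const length = (bytes[1] << 16) | (bytes[2] << 8) | bytes[3];
  if (bytes.length < HEADER_SIZE + length) throw new RangeError('incomplete payload');
  return {
    type: bytes[0],
    payload: bytes.subarray(HEADER_SIZE, HEADER_SIZE + length)
  };
}

const CHAT = 0x01; // hypothetical type code
const encoded = encodeMessage(CHAT, new TextEncoder().encode('Hello'));
console.log(encoded); // bytes: 01 00 00 05 48 65 6c 6c 6f
const decoded = decodeMessage(encoded);
console.log(decoded.type, new TextDecoder().decode(decoded.payload)); // 1 Hello
```

Because a WebSocket message already has a known length, the explicit length field mainly helps when several such records are packed into one message or when the same format is reused over a raw stream.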

Serialization/Deserialization Implementations

JSON Serialization Implementation

JavaScript Implementation:

// Serialize
function serializeToJson(message) {
  return JSON.stringify(message);
}

// Deserialize
function deserializeFromJson(jsonString) {
  try {
    return JSON.parse(jsonString);
  } catch (e) {
    console.error('JSON parsing error:', e);
    return null;
  }
}

// Usage example
const chatMessage = {
  type: 'chat',
  content: 'Hello, WebSocket!',
  sender: 'user123',
  timestamp: Date.now()
};

// Send message
const jsonStr = serializeToJson(chatMessage);
websocket.send(jsonStr);

// Receive message
websocket.onmessage = (event) => {
  const message = deserializeFromJson(event.data);
  if (message) {
    console.log('Received message:', message);
  }
};
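
`JSON.parse` only confirms that the text is syntactically valid JSON; it says nothing about whether the result has the fields the application expects. One way to cover that gap is a small runtime check, sketched below. `isChatMessage` and `parseChatMessage` are hypothetical helpers for the chat message shape used in this lesson.

```javascript
// Hypothetical validator for the chat message shape used in this lesson.
function isChatMessage(value) {
  return (
    value !== null &&
    typeof value === 'object' &&
    value.type === 'chat' &&
    typeof value.content === 'string' &&
    typeof value.sender === 'string' &&
    Number.isSafeInteger(value.timestamp)
  );
}

// Returns the parsed message, or null if the text is not valid JSON
// or does not have the expected shape.
function parseChatMessage(jsonString) {
  let parsed;
  try {
    parsed = JSON.parse(jsonString);
  } catch {
    return null;
  }
  return isChatMessage(parsed) ? parsed : null;
}

const ok = parseChatMessage(
  '{"type":"chat","content":"Hi","sender":"user123","timestamp":1634567890123}'
);
console.log(ok.sender);                               // user123
console.log(parseChatMessage('{"type":"chat"}'));     // null (missing fields)
console.log(parseChatMessage('not json'));            // null (syntax error)
```

For larger protocols a schema validation library may be a better fit than hand-written checks, but the principle is the same: treat every incoming message as untrusted input.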

Performance Optimization Techniques:

  1. Reuse JSON parser instances (supported by some libraries).
  2. Avoid unnecessary fields (reduce message size).
  3. Use shorter property names (trade readability for smaller size).

Protobuf Serialization Implementation

Node.js Implementation:

// 1. Define proto file (chat.proto)
/*
syntax = "proto3";

message ChatMessage {
  string type = 1;
  string content = 2;
  string sender = 3;
  int64 timestamp = 4;
}
*/

// 2. Load the .proto definition at runtime with protobuf.js
//    (this approach needs no protoc code-generation step)
const protobuf = require('protobufjs');
// loadSync is Node.js only; top-level await is not available with require()
const root = protobuf.loadSync('chat.proto');
const ChatMessage = root.lookupType('ChatMessage');

// Serialize
function serializeToProtobuf(message) {
  const errMsg = ChatMessage.verify(message);
  if (errMsg) throw Error(errMsg);

  const payload = ChatMessage.create(message);
  return ChatMessage.encode(payload).finish();
}

// Deserialize
function deserializeFromProtobuf(buffer) {
  const message = ChatMessage.decode(buffer);
  return ChatMessage.toObject(message, {
    longs: String, // Convert int64 to string
    enums: String, // Convert enums to string
    bytes: String, // Convert bytes to base64 string
    defaults: true, // Include default values
    arrays: true, // Include empty arrays
    objects: true, // Include empty objects
    oneofs: true // Include oneof fields
  });
}

// Usage example
const chatMessage = {
  type: 'chat',
  content: 'Hello, WebSocket!',
  sender: 'user123',
  timestamp: Date.now()
};

// Send message
const buffer = serializeToProtobuf(chatMessage);
websocket.send(buffer);

// Receive message
websocket.binaryType = 'arraybuffer';
websocket.onmessage = (event) => {
  const message = deserializeFromProtobuf(new Uint8Array(event.data));
  console.log('Received message:', message);
};
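
The `longs: String` option matters because a Protobuf `int64` can hold values that a JavaScript `Number` cannot represent exactly (anything beyond 2^53 - 1). Millisecond timestamps are well within the safe range, so converting them back to numbers is fine, but it is worth checking rather than assuming. `int64ToNumber` below is a hypothetical helper illustrating that check.

```javascript
// Convert an int64 received as a decimal string into a Number,
// refusing values that would silently lose precision.
function int64ToNumber(decimal) {
  const value = BigInt(decimal);
  if (
    value > BigInt(Number.MAX_SAFE_INTEGER) ||
    value < BigInt(Number.MIN_SAFE_INTEGER)
  ) {
    throw new RangeError(`${decimal} cannot be represented exactly as a Number`);
  }
  return Number(value);
}

console.log(int64ToNumber('1634567890123'));  // 1634567890123
console.log(Number(9007199254740993n));       // 9007199254740992 (precision lost)

try {
  int64ToNumber('9007199254740993');
} catch (err) {
  console.log(err instanceof RangeError);     // true
}
```

Values that may exceed the safe range are better kept as strings or BigInt all the way through the application.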

Browser Implementation:

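// Parsing a .proto file in the browser needs the full protobuf.js build;
// the light build ('protobufjs/light') omits the .proto parser and only
// accepts JSON descriptors
import * as protobuf from 'protobufjs';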

// Dynamically load proto definition
protobuf.load('chat.proto', (err, root) => {
  if (err) throw err;

  const ChatMessage = root.lookupType('ChatMessage');

  // Serialization and deserialization functions same as above...
});

MessagePack Implementation

JavaScript Implementation:

const msgpack = require('@msgpack/msgpack');

// Serialize
function serializeToMsgPack(message) {
  return msgpack.encode(message);
}

// Deserialize
function deserializeFromMsgPack(buffer) {
  return msgpack.decode(buffer);
}

// Usage example
const chatMessage = {
  type: 'chat',
  content: 'Hello, WebSocket!',
  sender: 'user123',
  timestamp: Date.now()
};

// Send message
const buffer = serializeToMsgPack(chatMessage);
websocket.send(buffer);

// Receive message
websocket.binaryType = 'arraybuffer';
websocket.onmessage = (event) => {
  const message = deserializeFromMsgPack(new Uint8Array(event.data));
  console.log('Received message:', message);
};
