Multimedia elements
audio element
<audio> element: used to embed audio files, supporting multiple audio formats such as MP3, WAV, Ogg Vorbis, etc. You can specify the URL of the audio file through the src attribute, and control the behavior of the player through a series of attributes (such as controls, loop, autoplay).
<audio controls>
<source src="my-song.mp3" type="audio/mpeg">
<source src="my-song.ogg" type="audio/ogg">
Your browser does not support the audio element.
</audio>
The main attributes and usage include:
- src: specifies the URL of the audio file.
- controls: adds standard playback controls (such as play/pause buttons, progress bars, etc.).
- autoplay: automatically starts playing audio after the page is loaded.
- loop: restarts playback after the audio ends.
- muted: mutes the audio by default.
- preload: hints whether the browser should preload audio data when the page loads (possible values are “none”, “metadata”, “auto”).
- volume: the playback volume (range 0.0 to 1.0). Note that volume is a JavaScript property rather than an HTML attribute, so it must be set from script.
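Because the volume property only accepts values between 0.0 and 1.0, a small helper can keep inputs in range before assigning them. This is a minimal sketch; clampVolume and the element id "player" are hypothetical names, not part of any standard API:

```javascript
// Hypothetical helper: clamp a requested volume into the 0.0–1.0 range
// that a media element's `volume` property accepts.
function clampVolume(value) {
  return Math.min(1, Math.max(0, value));
}

// Browser usage (assumes an <audio id="player" controls> exists on the page):
// const audio = document.getElementById("player");
// audio.volume = clampVolume(1.5); // out-of-range input becomes 1.0
```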
video element
<video> element: used to embed video files, supporting formats such as MP4, WebM, Ogg Theora, etc. You can also set attributes such as src, controls, loop, autoplay, and specify multiple <source> child elements to adapt to different browsers’ support for video formats.
<video controls width="320" height="240">
<source src="my-video.mp4" type="video/mp4">
<source src="my-video.webm" type="video/webm">
<source src="my-video.ogv" type="video/ogg">
Your browser does not support the video tag.
</video>
The main attributes and usage are similar to <audio>, including src, controls, autoplay, loop, muted, and preload. In addition, the attributes specific to video are:
- width and height: set the dimensions of the video player.
- poster: specifies the URL of the cover image displayed before the video plays.
Because different browsers may support different audio and video formats, multiple <source> child elements are usually provided inside the <audio> and <video> elements, each specifying an alternative source and the corresponding MIME type. The browser tries the sources in list order until it finds a supported format.
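This selection logic can be sketched as a small function. pickSource is a hypothetical helper; the canPlayType argument is injected so the logic can run outside a browser (in a page you would pass audio.canPlayType.bind(audio)):

```javascript
// Sketch: pick the first source whose MIME type the browser reports it may
// be able to play, mirroring how the browser walks the <source> list.
// canPlayType() returns "probably", "maybe", or "" (empty = unsupported).
function pickSource(sources, canPlayType) {
  for (const source of sources) {
    if (canPlayType(source.type) !== "") {
      return source.src;
    }
  }
  return null; // no supported format; the fallback text would be shown
}
```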
Multimedia events and JavaScript interaction
HTML5 multimedia elements also support a series of events, allowing developers to respond to changes in playback status, user operations, etc. through JavaScript programming. Common multimedia events include:
- Playback control events
  - play: Triggered when audio or video starts or resumes playback. Can be used to update UI state, record playback statistics, etc.
  - pause: Triggered when audio or video is paused. Can be used to update the UI for the paused state, stop a timer, etc.
  - ended: Triggered when audio or video has finished playing. Can be used to loop playback, switch to the next video, display recommended content, etc.
- Playback status change events
  - canplay: Triggered when the browser is able to start playing the audio or video (enough data has been loaded to begin, but smooth playback to the end is not guaranteed). Can be used to start playback automatically or to enable a previously disabled play button.
  - canplaythrough: Triggered when the browser estimates that enough data is available to play to the end without pausing. Often used to decide when to start autoplay or to show a “loading complete” prompt.
  - waiting: Fired when the browser is waiting for more data before playback can continue. Can be used to show a loading indicator or tell the user that the media is buffering.
  - playing: Fired when playback actually begins or resumes, after having been paused or delayed for buffering. Used to update the UI, start a timer, etc.
  - stalled: Fired when the browser is trying to fetch media data but the data is not arriving. May indicate a network or server problem.
  - suspend: Fired when the browser has intentionally paused downloading media data, not because of an error but to save bandwidth or processing resources.
- User interaction events
  - seeking: Fired when the user starts fast-forwarding, rewinding, or jumping to a different position in the video. Can be used to show a progress-bar animation or temporarily disable other controls.
  - seeked: Fired when a seek operation completes. Used to update the UI to reflect the new playback position.
  - timeupdate: Fired whenever the current playback position changes. Used to update progress bars, time displays, etc. in real time.
- Error event
error: Triggered when an error occurs during audio or video loading, parsing, or playback. The specific error type can be determined by checking event.target.error.code.
<video id="myVideo" src="my-video.mp4" controls></video>
<script>
  const video = document.getElementById("myVideo");

  // Listen for playback start events
  video.addEventListener("play", () => {
    console.log("Video started playing.");
    // Update UI or execute other logic
  });

  // Listen for pause events
  video.addEventListener("pause", () => {
    console.log("Video paused.");
    // Update UI or execute other logic
  });

  // Listen for playback end events
  video.addEventListener("ended", () => {
    console.log("Video has ended.");
    // Automatically play the next video, reset the player state, etc.
  });

  // Listen for playback position change events
  video.addEventListener("timeupdate", () => {
    const currentTime = video.currentTime;
    const duration = video.duration;
    console.log(`Current time: ${currentTime.toFixed(2)} / Duration: ${duration.toFixed(2)}`);
    // Update the progress bar, time display, etc.
  });

  // Listen for error events
  video.addEventListener("error", () => {
    console.error("An error occurred while loading or playing the video.");
    const errorCode = video.error.code;
    switch (errorCode) {
      case MediaError.MEDIA_ERR_ABORTED:
        console.error("User aborted the video playback.");
        break;
      case MediaError.MEDIA_ERR_NETWORK:
        console.error("A network error caused the video download to fail.");
        break;
      case MediaError.MEDIA_ERR_DECODE:
        console.error("The video file could not be decoded.");
        break;
      case MediaError.MEDIA_ERR_SRC_NOT_SUPPORTED:
        console.error("The video format is not supported by the browser.");
        break;
      default:
        console.error("An unknown error occurred.");
    }
    // Display an error message, provide a backup playback source, etc.
  });
</script>
By listening for these events and using the element's methods and properties (such as .play(), .pause(), .currentTime, and .volume), more complex interaction and control logic can be implemented.
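For instance, a custom play/pause button typically wraps .paused, .play(), and .pause() in a toggle. togglePlayback below is a hypothetical helper written against just that subset of the element's interface:

```javascript
// Toggle between playing and paused, returning the new state so the caller
// can update button labels or icons. Works with any object exposing the
// MediaElement subset it uses: paused, play(), pause().
function togglePlayback(media) {
  if (media.paused) {
    media.play();
    return "playing";
  }
  media.pause();
  return "paused";
}

// Browser usage (assumes a <button id="toggle"> next to the #myVideo element):
// const btn = document.getElementById("toggle");
// btn.addEventListener("click", () => {
//   btn.textContent = togglePlayback(document.getElementById("myVideo"));
// });
```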
Audio and Video Tracks
The <track> element is used to add text tracks to audio or video, such as subtitles, closed captions (CC), chapter titles, descriptions, etc. It appears as a child of the <audio> or <video> element and contains the following attributes:
- kind: Specifies the track type (such as “subtitles”, “captions”, “descriptions”, etc.).
- src: A URL pointing to a WebVTT file containing the track data.
- srclang: Specifies the language of the track.
- label: Provides a user-visible name for the track.
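For reference, the WebVTT file that src points to is a plain-text file of timed cues. A minimal example (the cue text and timings are purely illustrative):

```
WEBVTT

1
00:00:00.000 --> 00:00:03.000
Hello, and welcome.

2
00:00:03.500 --> 00:00:06.000
Let's get started.
```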
<video src="my-video.mp4" controls>
<track kind="subtitles" src="subtitles-en.vtt" srclang="en" label="English Subtitles">
</video>
Media API
HTML5 Media API is a set of JavaScript interfaces closely associated with the <audio> and <video> elements. They provide developers with advanced control capabilities for audio and video content, including but not limited to play, pause, volume adjustment, playback speed control, time positioning, buffer status query, event handling, etc.
1. MediaElement API
MediaElement API is the basic media interface, which defines the properties, methods and events common to all HTML media elements (such as <audio> and <video>). Through the MediaElement API, developers can operate the playback status, volume, duration, current position and other attributes of the media element, and respond to various playback-related events.
Key properties and methods:
- paused: Boolean property indicating whether the media is paused.
- currentTime: Gets or sets the current playback position (in seconds).
- duration: Gets the total duration of the media (NaN if not available).
- volume: Gets or sets the media volume (in the range 0.0 to 1.0).
- play(): Starts or resumes playback.
- pause(): Pauses playback.
- load(): Reloads the media resource.
Key events:
play, playing, pause, ended, timeupdate, error: Introduced in detail in the previous multimedia events section.
const myAudio = document.getElementById('my-audio');
myAudio.play();
myAudio.volume = 0.5;
myAudio.addEventListener('ended', function() {
  console.log('Audio has finished playing.');
});
2. MediaSource Extensions (MSE)
The MediaSource Extensions (MSE) API allows JavaScript to build media streams dynamically, supporting streaming and adaptive bitrate playback. With MSE, media data is divided into small chunks (usually fragmented MP4 or WebM), fed into a MediaSource object chunk by chunk, and the MediaSource is then attached to a <video> element for playback. MSE is widely used in HTTP-based streaming, adaptive streaming (such as DASH and HLS), and real-time communication scenarios.
Key interfaces and methods:
- MediaSource: The core interface, representing a media data source that can be used by a <video> element.
- SourceBuffer: Represents a media data buffer that can receive and process media data chunks.
- MediaSource.isTypeSupported(type): Detects whether the browser supports a specific MIME type.
- mediaSource.addSourceBuffer(type): Creates a new SourceBuffer with the specified MIME type.
- sourceBuffer.appendBuffer(data): Appends a media data chunk to a SourceBuffer.
- sourceBuffer.remove(start, end): Removes the specified time range from the SourceBuffer.
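A hedged sketch of the flow these interfaces describe: fetch a single media segment and feed it into a <video> element. The MIME string and segment URL are illustrative placeholders; a real player appends many segments and manages buffering and seeking:

```javascript
// Minimal MSE sketch (browser-only glue): feed one fetched segment into a
// <video> element. The codec string and URL below are assumptions.
const MSE_MIME = 'video/webm; codecs="vp9"';

function playViaMSE(videoElement, segmentUrl) {
  if (!("MediaSource" in window) || !window.MediaSource.isTypeSupported(MSE_MIME)) {
    console.warn("MSE or this codec is not supported in this browser.");
    return;
  }
  const mediaSource = new window.MediaSource();
  videoElement.src = URL.createObjectURL(mediaSource);
  mediaSource.addEventListener("sourceopen", async () => {
    const sourceBuffer = mediaSource.addSourceBuffer(MSE_MIME);
    const response = await fetch(segmentUrl);
    const data = await response.arrayBuffer();
    sourceBuffer.addEventListener("updateend", () => {
      // Signal end-of-stream once the single segment has been appended.
      if (!sourceBuffer.updating && mediaSource.readyState === "open") {
        mediaSource.endOfStream();
      }
    });
    sourceBuffer.appendBuffer(data);
  });
}
```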
3. Encrypted Media Extensions (EME)
The Encrypted Media Extensions (EME) API provides support for protected media content (such as DRM-encrypted video) to ensure compatibility and security of digital rights management systems. EME works with various DRM service providers (such as Widevine, PlayReady, FairPlay) to handle the decryption and playback of encrypted media.
Key interfaces and methods:
- MediaKeys: Represents a set of keys used to decrypt protected media content.
- window.navigator.requestMediaKeySystemAccess(): Requests access to a specific DRM system.
- mediaKeySystemAccess.createMediaKeys(): Creates a MediaKeys instance.
- mediaElement.setMediaKeys(mediaKeys): Associates a MediaKeys instance with a <video> element.
- mediaKeys.createSession(): Creates a MediaKeySession for handling encrypted sessions.
4. Web Audio API
HTML5’s Web Audio API provides a powerful audio processing and synthesis system, independent of the MediaElement API, suitable for creating complex audio applications such as game audio, music synthesizers, and real-time effect processors. It is built around audio nodes (such as source nodes, processing nodes, and mixing nodes) connected within an audio context (AudioContext), and supports audio routing, filtering, mixing, sample-rate conversion, audio synchronization, and other functions.
Key interfaces and methods:
- AudioContext: The global entry point for audio processing, responsible for managing the connections between audio nodes and the flow of audio data.
- AudioBuffer: Stores in-memory audio data and is used to create an AudioBufferSourceNode.
- AudioBufferSourceNode: Plays audio data from an audio buffer.
- GainNode: Adjusts the gain (volume) of the audio signal.
- DelayNode, BiquadFilterNode, ConvolverNode, etc.: provide various audio processing effects.
- AnalyserNode: Analyzes audio signals for visualization or other analysis purposes.
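As a small illustration, the sketch below builds the classic source → gain → destination chain. dbToGain is a hypothetical helper applying the standard decibel-to-linear-gain conversion; the oscillator part is browser-only and shown as comments:

```javascript
// Helper: convert a decibel value to the linear gain a GainNode expects
// (the standard conversion: gain = 10^(dB / 20)).
function dbToGain(db) {
  return Math.pow(10, db / 20);
}

// Browser usage sketch (assumes a user gesture has resumed the context):
// const ctx = new AudioContext();
// const osc = ctx.createOscillator();   // source node
// const gain = ctx.createGain();        // processing node
// gain.gain.value = dbToGain(-6);       // roughly half amplitude
// osc.connect(gain).connect(ctx.destination);
// osc.start();
// osc.stop(ctx.currentTime + 1);        // play a 1-second tone
```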
Canvas and WebGL support for multimedia
Canvas and multimedia
Canvas is an HTML element based on 2D graphics rendering, providing drawing functions through the JavaScript API (CanvasRenderingContext2D). Its support for multimedia is mainly reflected in the following aspects:
- Image processing and synthesis: Canvas can directly draw the content of <img>, <video>, or <canvas> elements through the drawImage() method, achieving real-time display of images or video frames. This allows developers to use Canvas to implement multimedia applications such as image filters, dynamic poster generation, and video frame analysis.
- Video frame drawing: Using the <video> element as a drawing source, you can draw its content onto the Canvas as each video frame plays, for real-time video processing or special-effects overlays. Examples include green-screen keying, real-time video filters, and AR (augmented reality) effects.
- Animation and visual feedback: Canvas can create complex animation effects, synchronize with audio or video content, and provide visual feedback. For example, a music rhythm game can draw animations that follow the beat, or display waveforms and spectrograms while audio plays.
- Combined with the Web Audio API: Although Canvas itself does not process audio, it can be used together with the Web Audio API to synchronize visuals with sound. For example, the visual elements drawn on the Canvas can be adjusted dynamically based on audio analysis results, or Canvas rendering can be updated when audio events (such as beat detection) fire.
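The per-frame video-processing pattern above can be sketched as follows. The grayscale transform is a pure function that runs on any RGBA pixel buffer; the drawing loop (in comments) is the browser glue, and the element ids are assumptions:

```javascript
// Convert an RGBA pixel buffer to grayscale in place by averaging the
// R, G, and B channels of each pixel; the alpha channel is untouched.
function toGrayscale(pixels) {
  for (let i = 0; i < pixels.length; i += 4) {
    const avg = (pixels[i] + pixels[i + 1] + pixels[i + 2]) / 3;
    pixels[i] = pixels[i + 1] = pixels[i + 2] = avg;
  }
  return pixels;
}

// Browser loop (assumes <video id="v"> and <canvas id="c"> exist):
// const video = document.getElementById("v");
// const canvas = document.getElementById("c");
// const ctx = canvas.getContext("2d");
// function render() {
//   ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
//   const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
//   toGrayscale(frame.data);
//   ctx.putImageData(frame, 0, 0);
//   requestAnimationFrame(render);
// }
// video.addEventListener("play", render);
```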
WebGL and Multimedia
WebGL is a 3D graphics API based on the OpenGL ES specification that provides hardware-accelerated 3D rendering capabilities on Canvas elements. WebGL’s support for multimedia is mainly reflected in:
- 3D video playback: By mapping the content of the <video> element as a texture onto a WebGL 3D model, video playback in a 3D environment can be achieved, such as panoramic video or a 360° video player.
- Video processing and special effects: Using WebGL’s shader programming capabilities, video frames can be processed in real time on the GPU to achieve complex visual effects, such as depth-of-field simulation, color space conversion, and non-photorealistic rendering.
- Interactive multimedia applications: WebGL is suitable for building highly interactive multimedia applications, such as 3D games, virtual reality (VR), augmented reality (AR) applications, etc., which contain rich audio and video content and are closely integrated with user interaction.
- Combined with Web Audio API: Similar to Canvas, WebGL can work with Web Audio API to achieve 3D animation or games with synchronized audio and video. For example, drive the animation of a 3D model according to the audio frequency, or update the 3D scene when an audio event is triggered.
Common features
- Performance optimization: Both Canvas and WebGL use GPU to accelerate rendering, which can provide higher performance than pure JavaScript when processing large amounts of image data (such as high-resolution video frames) or computationally intensive visual effects.
- Cross-platform compatibility: As part of the HTML5 standard, Canvas and WebGL have good cross-platform compatibility in modern browsers that support these technologies, and can provide users with a consistent multimedia experience on desktop and mobile.
- Programming flexibility: Through JavaScript programming, developers can flexibly control the rendering logic of multimedia content and implement highly customized multimedia applications, including real-time interaction, dynamic response to user input, etc.
Other related technologies
Media Fragments URI: By adding time ranges or tracking information to media file URLs, partial content of media resources can be directly requested and played.
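For example, appending a time-range fragment such as #t=10,20 to the src asks the browser to play only seconds 10 through 20 (the file name here is illustrative):

```html
<video src="my-video.mp4#t=10,20" controls></video>
```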
Picture Element and Srcset Attribute: Optimize image loading and select the most appropriate image resource based on device characteristics (such as viewport size, resolution, etc.).
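A sketch of both mechanisms together, where srcset offers resolution variants and <picture> switches sources by viewport width (file names and the breakpoint are illustrative):

```html
<picture>
  <source media="(min-width: 800px)" srcset="hero-large.jpg">
  <img src="hero-small.jpg"
       srcset="hero-small.jpg 1x, hero-small@2x.jpg 2x"
       alt="Hero image">
</picture>
```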



