Ambisonics 101

Table of Contents

What Is 3D Audio?
What Is Ambisonics?
Ambisonics vs. Stereo
What Is B-Format?
Signal Flow at a Glance
Typical Setups
Headphones vs. Loudspeakers
How Is Ambisonics Different from Immersive Audio, Dolby Atmos, and Spatial Audio?
Where To Start at ICST
Ambisonic Microphones — A Practical Introduction
Ambisonics Glossary (Quick Reference)

1. What Is 3D Audio?

3D audio is the general term for sound that is perceived not only left and right, but also in front, behind, above, below, and in depth. Instead of a flat stereo image, it creates the impression of a surrounding acoustic space.

Ambisonics is one specific way to produce and store 3D audio. Other approaches include binaural audio for headphones and object-based formats such as Dolby Atmos. In that sense, 3D audio is the broader category, and Ambisonics is one method within it.

2. What Is Ambisonics?

Ambisonics is a format-agnostic method for describing a spatial 3D sound field. Instead of mixing directly for a fixed loudspeaker layout, you work with a spatial representation that can later be rendered for different playback systems.

3. Ambisonics vs. Stereo

Stereo is familiar: two channels, left and right. It creates the illusion of sounds positioned along a horizontal line between two speakers. Add a centre channel and surrounds, and you get 5.1 or 7.1 — but each time you change the loudspeaker layout, you have to re-mix from scratch.

Ambisonics takes a different approach. Instead of mixing directly for a speaker layout, you first encode the spatial sound field as B-format (see section 4 below). That representation captures where sound comes from across the full 3D sphere: left, right, front, back, above, below. The decoding to actual speakers happens later, and the same file can be decoded for completely different setups without touching the mix.

	Stereo	Ambisonics
Channels	2 (L / R)	4 – 64+ (B-format)
Spatial range	Left–right line	Full sphere (360° × 180°)
Speaker dependency	Fixed to layout at mix time	Decoded to any layout later
Re-use	New mix per setup	One B-format file → many setups
Typical use	Music, broadcast, everyday listening	Art, research, installation, live, film

When does stereo make more sense? For most music distribution, podcasts, and broadcast, stereo remains the standard — it is compatible with every playback system and requires no special tools. Ambisonics pays off when the spatial dimension of sound matters artistically or technically, or when you need a single master file that can serve multiple playback contexts.

4. What Is B-Format?

B-format is the core signal format in Ambisonics and carries spatial information. Sources are encoded into B-format and then decoded for a target setup, such as headphones, stereo, or different loudspeaker arrays.

It describes the sound field at a single listening point using one pressure component and several directional components. In first order, this means:

W is the omnidirectional component: the sound pressure at the listening point, independent of direction.
X, Y, and Z are directional (figure-of-eight) components along three axes (front-back, left-right, up-down); together with W they indicate where sound arrives from.

In the strict classical sense, “B-format” refers to this four-channel first-order format (W, X, Y, Z). In an extended sense, the term is also used for higher-order Ambisonics, where all Ambisonics coefficients up to a given order are represented as separate audio channels.
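
As a rough illustration of what the four first-order channels contain, the following Python sketch encodes one mono sample into W, X, Y, Z. It assumes the traditional B-format convention (azimuth measured counter-clockwise from the front, elevation measured upward from the horizontal plane, W attenuated by 1/√2). The function name `encode_first_order` is hypothetical and is not part of the ICST plugins.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into classical first-order B-format.

    Assumed convention (not taken from the ICST tools):
    azimuth 0 = front, positive to the left; elevation 0 = horizontal,
    positive upward; W is scaled by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                  # omnidirectional pressure
    x = sample * math.cos(az) * math.cos(el)     # front-back
    y = sample * math.sin(az) * math.cos(el)     # left-right
    z = sample * math.sin(el)                    # up-down
    return w, x, y, z

# A source straight ahead puts its directional energy into X only.
front = encode_first_order(1.0, 0.0, 0.0)
# A source directly above puts it into Z only.
above = encode_first_order(1.0, 0.0, 90.0)
```

Other normalisations (for example SN3D, as used by ambiX) scale the channels differently, so the convention always needs to be known before decoding.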

This format can then be decoded to different target systems such as headphones, stereo, or loudspeaker arrays.

5. Signal Flow at a Glance

From source to speaker — this is how Ambisonics works in REAPER with the ICST plugins:

Ambisonics Signal Flow — Single Source

Single source: mono track → AmbiEncoder (az/el/dist) → B-Format Bus → AmbiDecoder → Speakers or Binaural.

Ambisonics Signal Flow — Multi-Source (ICST MultiEncoder)

Multi-source: up to 64 sources feed simultaneously into the ICST MultiEncoder → shared B-Format Bus → decoded once.
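
The encode, bus, decode chain described above can be sketched in a few lines of Python. This is a simplified, horizontal-only, first-order model and not the algorithm used by the ICST AmbiEncoder or AmbiDecoder: the function names are hypothetical, the encoding assumes the classical W weighting of 1/√2, and the decoder is a basic projection decoder that is only meaningful for a regular ring of loudspeakers.

```python
import math

def encode(sample, azimuth_deg):
    """Horizontal first-order encode of one mono sample: returns (W, X, Y)."""
    az = math.radians(azimuth_deg)
    return (sample / math.sqrt(2.0), sample * math.cos(az), sample * math.sin(az))

def mix_to_bus(encoded_sources):
    """Sum all encoded sources channel by channel into one shared bus."""
    return tuple(sum(channel) for channel in zip(*encoded_sources))

def decode_ring(bus, speaker_azimuths_deg):
    """Basic projection decode of (W, X, Y) to a regular horizontal ring."""
    w, x, y = bus
    n = len(speaker_azimuths_deg)
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        directional = x * math.cos(az) + y * math.sin(az)
        feeds.append((math.sqrt(2.0) * w + 2.0 * directional) / n)
    return feeds

# Two sources (amplitude, azimuth in degrees) encoded and summed once ...
sources = [(1.0, 45.0), (0.5, -90.0)]
bus = mix_to_bus([encode(amp, az) for amp, az in sources])

# ... then the same bus is decoded for two different layouts.
square = [45.0, 135.0, -135.0, -45.0]
hexagon = [0.0, 60.0, 120.0, 180.0, -120.0, -60.0]
square_feeds = decode_ring(bus, square)
hexagon_feeds = decode_ring(bus, hexagon)
```

The point of the sketch is the structure: sources never see the loudspeaker layout, and the layout only enters at the final decoding step.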

6. Typical Setups

Common setups range from small studio rings and dome-like height configurations to custom arrays in composition studios. The same Ambisonics material can be adapted to each of these setups through decoding.

7. Headphones vs. Loudspeakers

Headphones use binaural rendering and are practical for editing, checking translation, and remote collaboration. Loudspeakers provide a physical spatial field in the room and remain essential for composition, depth perception, and artistic evaluation.

8. How Is Ambisonics Different from Immersive Audio, Dolby Atmos, and Spatial Audio?

Immersive audio is a broad term for any 3D audio approach that places sound around — and above — the listener rather than just left and right. Ambisonics, Dolby Atmos, and Apple Spatial Audio all pursue this goal, but they do so in fundamentally different ways.

Ambisonics is field-based. The sound field is encoded as a mathematical representation (B-format) that is independent of any specific speaker layout. The same B-format file can be decoded later for a studio ring, a concert dome, headphones, or stereo. The playback system does not need to be decided at production time.

Dolby Atmos and Apple Spatial Audio are object-based. Individual sound sources are stored as audio objects with position metadata. A licensed renderer (Dolby Atmos Renderer, Apple Music infrastructure) places them into a target playback system — whether a cinema, a home theater, or headphones — at delivery time.

	Ambisonics	Dolby Atmos
Spatial approach	Field-based (B-format)	Object-based (audio + metadata)
Speaker independence	Yes — one file, many layouts	No — render per target system
Hardware dependency	Free, open, any speaker array	Requires licensed Dolby renderer
Headphone playback	Binaural decoder (free tools)	Dolby binaural renderer
Typical tools	ICST Plugins, IEM, ATK	Pro Tools + Dolby Renderer, Logic, Nuendo
Cost	Free, open source	Commercial licensing for distribution
Typical use	Art, research, installation, archiving, live	Film, streaming music, gaming, consumer media
Archivability	High — B-format is format-agnostic	Medium — tied to Dolby ecosystem

When to use which: Ambisonics is the better choice when speaker-layout independence, open archiving, or research and artistic use matter. Dolby Atmos is the standard for commercial streaming delivery (Tidal, Apple Music, Amazon Music) and film — if you need to reach those channels, Atmos is the practical requirement.

The two are not mutually exclusive: some workflows produce Ambisonics for archiving and artistic use, and separately deliver a Dolby Atmos render for streaming.

9. Where To Start at ICST

For Beginners: Quick Start
Start with the ICST Ambisonics Plugins for DAW workflows.
Use the ICST Ambisonics Tools for Max/MSP workflows.
Explore Ascolta for listening practice and references.
Continue with tutorials and articles in the Blog & Tutorials.

Ambisonics 101: Ten Essential Questions Answered

10. Ambisonic Microphones — A Practical Introduction

An Ambisonic microphone captures the full-sphere sound field in a single recording. Unlike standard stereo or surround microphones, a first-order model uses four capsules in a tetrahedral arrangement (higher-order models place more capsules on a sphere) and outputs a raw format called A-format, which must be converted to B-format before use in your DAW.

A-format and the encoding step

Most tetrahedral microphones output A-format: four raw capsule signals in a tetrahedral geometry. These signals must be encoded into B-format (W, X, Y, Z for first order) before you can work with the recording in Ambisonics. Encoding is usually handled by dedicated software from the manufacturer, for example the SoundField by Rode plugin, Zylia Studio, or the Sennheiser A-B Ambisonics plugin, or by third-party tools such as Harpex.

Some microphones (e.g. the Zoom H3-VR) handle this internally and output B-format directly.
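
To give an idea of what the A-to-B step does at its core, here is a minimal Python sketch of the textbook sum-and-difference matrix for an idealised tetrahedral microphone. It assumes the capsule naming front-left-up (FLU), front-right-down (FRD), back-left-down (BLD), and back-right-up (BRU), and a hypothetical overall gain of 0.5. Real converters additionally apply capsule calibration and frequency-dependent correction filters, and capsule order differs between manufacturers, so this is not a replacement for the manufacturer's software.

```python
def a_to_b_format(flu, frd, bld, bru, gain=0.5):
    """Convert one frame of idealised A-format capsule samples to W, X, Y, Z.

    Capsule order and the gain value are assumptions for illustration.
    """
    w = gain * (flu + frd + bld + bru)   # sum of all capsules: omnidirectional
    x = gain * (flu + frd - bld - bru)   # front minus back
    y = gain * (flu - frd + bld - bru)   # left minus right
    z = gain * (flu - frd - bld + bru)   # up minus down
    return w, x, y, z

# Equal signal on both front capsules only: energy appears in W and X.
frame = a_to_b_format(1.0, 1.0, 0.0, 0.0)
```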

Common microphones

Microphone	Order	Capsules	Output	Notes
Zoom H3-VR	1st	4	A- or B-format	Entry-level, integrated encoder, good for field recording
Sennheiser Ambeo VR Mic	1st	4	A-format	Widely used, requires Sennheiser A-B Ambisonics plugin for encoding
Rode NT-SF1	1st	4	A-format	SoundField by Rode software included
Core Sound TetraMic	1st	4	A-format	Long-established, widely used in field recording and research
Zylia ZM-1	3rd	19	A-format	Higher-order, includes Zylia Studio software, good spatial resolution
mh acoustics Eigenmike em32	4th	32	A-format	Professional/research grade, very high spatial resolution

In the ICST workflow

Any B-format recording — whether from a first-order or HOA microphone — can be loaded directly into a REAPER session and decoded with the ICST AmbiDecoder. For HOA recordings, make sure the Ambisonics order in the decoder matches the recording order.
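
A quick way to check which order a recording has is to look at its channel count: a full-sphere Ambisonics signal of order N has (N + 1)² channels. The helper names in this Python sketch are hypothetical and only illustrate the arithmetic.

```python
import math

def channels_for_order(order):
    """Number of channels in a full-sphere Ambisonics signal of a given order."""
    return (order + 1) ** 2

def order_from_channels(channel_count):
    """Infer the Ambisonics order from a channel count, or raise ValueError."""
    if channel_count < 1:
        raise ValueError("channel count must be positive")
    root = math.isqrt(channel_count)
    if root * root != channel_count:
        raise ValueError(
            f"{channel_count} channels is not a full-sphere Ambisonics layout"
        )
    return root - 1

# A 16-channel file from a third-order microphone:
recording_order = order_from_channels(16)
```

If the result does not match the order set in the decoder, the channel routing should be checked before anything else.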

11. Ambisonics Glossary (Quick Reference)

A-format — Raw signal from a tetrahedral Ambisonics microphone: four capsule signals in tetrahedral geometry. Must be encoded to B-format before use in a DAW. → Wikipedia: Ambisonics
B-format — The Ambisonics carrier signal: encodes the spatial sound field as spherical harmonics. First order = 4 channels (W, X, Y, Z); seventh order = 64 channels. → Wikipedia: Ambisonics | ICST Wiki
ambiX — Standardised Ambisonics file format (ACN channel ordering, SN3D normalisation); the de-facto standard for HOA exchange and archiving. → ambiX Specification (IEM)
Ambisonics Order — Spatial resolution level: 1st order = 4 channels, 3rd = 16, 7th = 64. Higher order means finer localisation and more channels. → Wikipedia: Higher-order Ambisonics | ICST AmbiDecoder
Encoder — Converts a mono/stereo source with position data (azimuth, elevation, distance) into B-format. → ICST AmbiEncoder
Decoder — Renders B-format to a target playback system: loudspeaker array or binaural. → ICST AmbiDecoder
Channel Count — Number of channels in the Ambisonics signal path; must remain consistent across the entire routing. → ICST Wiki
Speaker Layout — Physical loudspeaker geometry to which the decoder renders B-format. → Wikipedia: Ambisonic reproduction systems
Binaural / HRTF — Headphone rendering via Head-Related Transfer Functions (HRTFs): simulates spatial directional cues without loudspeakers. Enables Ambisonics monitoring on any headphone. → Wikipedia: Binaural recording | Wikipedia: HRTF
Dolby Atmos — Object-based 3D audio format: sound sources are stored as audio objects with position metadata; a licensed renderer places them in the target system (cinema, home theatre, streaming services). → dolby.com | Wikipedia: Dolby Atmos
OSC (Open Sound Control) — Network protocol (UDP/IP) for real-time control of spatial parameters. → opensoundcontrol.stanford.edu | ICST AmbiEncoder
Yaw / Pitch / Roll — Rotation axes in 3D space: Yaw = horizontal (left/right), Pitch = vertical (up/down), Roll = tilt. → Wikipedia: Euler angles
Azimuth / Elevation — Polar coordinates for source direction: Azimuth = horizontal angle (0°–360°), Elevation = vertical angle (−90° to +90°). → Wikipedia: Horizontal coordinate system
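
As a supplement to the ambiX and B-format entries, the following Python sketch shows how ACN channel indices are computed and how a classical first-order W, X, Y, Z frame (often called FuMa ordering and weighting) could be rearranged into ambiX order. It covers first order only, the function names are hypothetical, and it assumes the classical W channel is attenuated by 1/√2.

```python
import math

def acn_index(order, degree):
    """ACN channel index for spherical-harmonic order n and degree m."""
    return order * order + order + degree

def fuma_to_ambix_first_order(w, x, y, z):
    """Reorder and rescale one classical W, X, Y, Z frame to ambiX.

    ambiX first order uses ACN ordering (W, Y, Z, X) and SN3D
    normalisation, so W is raised by sqrt(2) and X, Y, Z keep their gain.
    """
    return (w * math.sqrt(2.0), y, z, x)

# ACN places Y (m = -1), Z (m = 0), X (m = +1) at indices 1, 2, 3.
first_order_indices = [acn_index(1, m) for m in (-1, 0, 1)]

# A classical frame for a source straight ahead:
ambix_frame = fuma_to_ambix_first_order(1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0)
```

For higher orders the conversion involves per-channel weights beyond this simple case, so a dedicated converter should be used there.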