Topics covered in this article:
- Spatial coding
- Working with stems
- Loudness measurement with Dolby Atmos
- Dolby Atmos master formats
- Delivery specifications
Before diving into the workflow, it is important to understand some additional key Dolby Atmos concepts and how Dolby Atmos is delivered to consumers.
The Dolby Atmos Renderer
The Dolby Atmos Renderer can record up to 128 inputs of audio. The first ten inputs are reserved for Bed audio, and the remaining 118 inputs can be used for Object audio or for additional Beds.
Before Dolby Atmos is delivered to consumer playback devices via streaming services or on Blu-ray, additional processes are employed to reduce the data rate, file size, and complexity of a full Atmos master while preserving the artistic intent of the original mix and providing an immersive audio experience. These processes are spatial coding and the use of delivery codecs (covered in the appendix).
Spatial coding provides a way to reduce a full Dolby Atmos presentation to a smaller data set for transmission.
Spatial coding reduces the Atmos presentation:
- From: Up to 128 tracks with OAMD for up to 118 Objects
- To: 12, 14, or 16 elements and OAMD
Spatial coding is a process that dynamically groups nearby audio from Beds and Objects using loudness and positional algorithms into “elements” (sometimes called clusters) that contain their own OAMD. The elements themselves can move over time, and the Bed and Object audio can move between elements to more accurately reflect their position and trajectory. Below is a graphical representation of the spatial coding process.
While there may be up to 128 tracks in a Dolby Atmos presentation, the tracks are rarely all active at the same time. Even with complex and frenetic mixes, the dynamic elements produced by the spatial coding process provide the spatial resolution for the OAR to recreate an immersive sound field. With the reference mix speaker configuration of 7.1.4 and common home theater speaker configurations up to 9.1.6 and beyond, the spatial coding process is transparent for most content.
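The grouping idea described above can be illustrated with a toy sketch. This is not Dolby's spatial coding algorithm, which is dynamic over time and is not described in detail here. It is a simple greedy merge, under the assumption that each source has a fixed position and a positive loudness weight. The `Source` class and `group_into_elements` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
import math


@dataclass
class Source:
    name: str
    pos: tuple        # (x, y, z) position, arbitrary room units
    loudness: float   # positive linear weight (assumption, not a Dolby unit)


def centroid(group):
    """Loudness-weighted average position of a group of sources."""
    total = sum(s.loudness for s in group)
    return tuple(sum(s.pos[i] * s.loudness for s in group) / total for i in range(3))


def group_into_elements(sources, max_elements):
    """Greedily merge the two closest groups until max_elements remain."""
    groups = [[s] for s in sources]
    while len(groups) > max_elements:
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda p: math.dist(centroid(groups[p[0]]), centroid(groups[p[1]])),
        )
        groups[i].extend(groups[j])  # i < j, so deleting j keeps i valid
        del groups[j]
    return [(centroid(g), [s.name for s in g]) for g in groups]


demo_sources = [
    Source("A", (0.0, 0.0, 0.0), 1.0),
    Source("B", (0.1, 0.0, 0.0), 3.0),
    Source("C", (1.0, 1.0, 0.0), 1.0),
    Source("D", (0.9, 1.0, 0.5), 1.0),
]
demo_elements = group_into_elements(demo_sources, max_elements=2)
for position, names in demo_elements:
    print(names, position)
```

In this toy example the louder source pulls the merged position toward itself, which loosely mirrors the idea that both loudness and position influence how audio is grouped into elements.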
Spatial Coding Emulation
The spatial coding process takes place as part of the encoding process, downstream from mixing. While spatial coding is most often transparent, it can be audible with some content, depending on the number of elements used.
Spatial Coding Emulation is a feature of the Dolby Atmos Renderer that allows the mixer to audition what spatial coding sounds like prior to the encoding process. As spatial coding is part of the delivery of Dolby Atmos to the home, it is important that the mixer be satisfied with the results and make adjustments to the mix if needed. Spatial coding emulation should be turned on as the mix comes together.
Spatial coding can be emulated with 12, 14, or 16 elements. It is important to understand the final delivery method(s) in order to monitor appropriately, as the number of elements that can be included will vary depending on the codec and bit rate. Most Dolby Atmos Music streaming services encode using 16 elements. See appendix A for a more in-depth discussion of Dolby codecs used for the delivery of Dolby Atmos to consumers.
Note that spatial coding is not used in binaural rendering.
In addition to recording to the Dolby Atmos Master Fileset and exporting to other mastering formats, the Dolby Atmos Renderer can also be used to output channel-based deliverables. These 're-renders' are generated by the OAR. Re-renders can be output in real time (assigned to specific hardware outputs) for recording back into a DAW or exported offline.
Re-renders can range in width from Stereo to 9.1.6 and can also be output in Ambisonic formats. For Dolby Atmos music, only Stereo re-renders are utilized. Re-renders can contain the full mix or can be derived from custom input groupings to create stems.
Working with Stems
The first 10 inputs to the Dolby Atmos Renderer are reserved for Bed inputs. Bed audio can range in width from Stereo to 7.1.2. The remaining 118 inputs to the Renderer can be used for either Bed audio or Object audio.
Within a DAW, mixers often group similar audio together into stems. In music production, stems could include instrument groups (drums, guitars, keyboards, vocals) as well as multichannel mix elements such as reverb returns, delay returns, and the LFE. Depending on the DAW, mixers can use one or more Bed tracks for each stem.
Bed tracks can be summed/combined to create a single composite Bed that is output from the DAW and feeds Renderer inputs 1-10, leaving the rest of the Renderer inputs available for Object audio. In Dolby Atmos music production, the Bed is often used just for reverb returns, delay returns and the LFE, with other stems created and grouped together as audio Objects.
This allows separate stereo re-renders created from these stems, including the combined LFE, reverb and delay returns, to be provided for stereo mastering if needed.
If Bed audio is needed for other stems, those beds can be created, but will reduce the number of inputs available for Object audio.
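The input budget described in this section can be worked through as simple arithmetic. The sketch below is illustrative only. The function names are hypothetical, and it assumes that each additional Bed occupies one Renderer input per channel (so a 7.1.2 Bed uses 10 inputs).

```python
TOTAL_INPUTS = 128          # Renderer inputs, per the text above
RESERVED_BED_INPUTS = 10    # inputs 1-10 are reserved for Bed audio


def channel_count(layout):
    """Number of channels in a layout name such as '5.1' or '7.1.2'."""
    return sum(int(part) for part in layout.split("."))


def object_inputs_available(extra_beds):
    """Inputs left for Objects after additional Beds are allocated.

    Assumption: every extra Bed consumes one input per channel.
    """
    used = sum(channel_count(bed) for bed in extra_beds)
    remaining = TOTAL_INPUTS - RESERVED_BED_INPUTS - used
    if remaining < 0:
        raise ValueError("extra Beds exceed the 118 inputs after the reserved Bed")
    return remaining


print(object_inputs_available([]))                # composite Bed only
print(object_inputs_available(["7.1.2", "5.1"]))  # two additional Beds
```

With only the composite Bed on inputs 1-10, all 118 remaining inputs stay available for Objects. Adding a 7.1.2 Bed and a 5.1 Bed for other stems would use 16 of them.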
A more thorough explanation of multiple Bed workflows is covered in our self-guided online post production training.
Loudness Measurement with Dolby Atmos
Loudness measurement is not performed on a full Dolby Atmos mix but instead on a 5.1 re-render. This is done for two reasons. First, there isn’t an effective way to measure the loudness of an entire Dolby Atmos presentation. Second, and more importantly, this practice ensures loudness continuity between Dolby Atmos content and content that is not mixed or presented in Dolby Atmos.
Loudness measurement can be performed using both the real-time and offline loudness measurement built into the Dolby Atmos Renderer. Alternatively, a 5.1 re-render can be generated and measured in loudness apps and DAW plug-ins.
Delivery specifications vary, but mixes are not to exceed -18 LKFS for Dolby Atmos Music.
True Peak limits for the 5.1 re-render are commonly specified in delivery requirements when working in Dolby Atmos.
True Peak (dBTP) targets are difficult to achieve with Dolby Atmos content, even if limiting is used in the DAW session. Rendering to 5.1 involves complex summing. Additionally, True Peak measurement is interpolative by nature, meaning the values estimate peaks that fall between samples.
Because of the nature of True Peak measurement, specifications should be considered a suggested target value, not an absolute value. As long as the dBTP measurements aim for -2 dBTP and do not exceed -0.1 dBTP, the limiter that is used in the encoding process will be sufficient to prevent audible clipping.
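To make the interpolative nature of True Peak measurement concrete, the sketch below compares the highest sample value of a signal with an estimate of the peak between samples. It is a minimal illustration, not the measurement used by the Dolby Atmos Renderer or any standard meter. It assumes a Hann-windowed sinc interpolator with 4x oversampling, and the function names are hypothetical. The test signal is a sine at one quarter of the sample rate, phased so that its crests fall exactly between samples.

```python
import math


def sample_peak(x):
    """Largest absolute sample value."""
    return max(abs(v) for v in x)


def true_peak_estimate(x, oversample=4, half_width=16):
    """Estimate the inter-sample peak with a Hann-windowed sinc interpolator."""
    def kernel(d):
        if abs(d) >= half_width:
            return 0.0
        sinc = 1.0 if d == 0 else math.sin(math.pi * d) / (math.pi * d)
        window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
        return sinc * window

    peak = 0.0
    # Only evaluate positions where the full kernel fits inside the signal.
    for n in range(half_width, len(x) - half_width):
        for k in range(oversample):
            t = n + k / oversample
            y = sum(x[m] * kernel(t - m)
                    for m in range(n - half_width + 1, n + half_width + 1))
            peak = max(peak, abs(y))
    return peak


def to_db(value):
    return 20.0 * math.log10(value)


# Full-scale sine at fs/4, offset by 45 degrees so no sample lands on a crest.
signal = [math.sin(math.pi * n / 2 + math.pi / 4) for n in range(128)]

sample_pk = sample_peak(signal)
true_pk = true_peak_estimate(signal)
print(f"sample peak: {to_db(sample_pk):.2f} dBFS")
print(f"estimated true peak: {to_db(true_pk):.2f} dBTP")
```

Every sample of this signal sits roughly 3 dB below full scale, yet the reconstructed waveform reaches approximately full scale between samples. This is why a mix that looks safely limited at the sample level can still report a higher True Peak value after rendering.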
If True Peak limits must be met for stereo re-renders, the re-renders will need to be limited as an additional processing step in the DAW.
The loudness measurement tool in the Dolby Atmos Renderer has a soft clip limiter inline that mimics the encoding process. If using external loudness measurement applications or plug-ins, use the “loudness” 5.1 re-render that has this limiter applied for consistent measurements.
Loudness measurement is also performed during the encoding process and used to write metadata called dialnorm (shorthand for dialog normalization) into the bitstream. This is used to ensure that there are no jumps in volume during playback between different pieces of content. Despite using the word dialog, the dialnorm concept is still applicable for music.
Note that Binaural loudness is measured separately from the 5.1 re-render.
Dolby Atmos Master Formats
The Dolby Atmos Renderer records up to 128 inputs of Bed and Object audio; OAMD as well as Binaural, Downmix, and Trim metadata; and Input and Re-render configurations. These other types of metadata are discussed in upcoming modules and appendixes.
These are recorded to a Dolby Atmos Master File set (DAMF), the format native to the Dolby Atmos Renderer. A DAMF is recorded as a set of three files:
- .atmos — An XML file containing information about the Dolby Atmos presentation and index information about the other files in the file set. The .atmos file includes the number of inputs used as Beds or Objects, frame rate, file start, first frame of action, the number of elements used in spatial coding, downmix, and trim metadata.
- .atmos.metadata — An XML file containing dynamic positional and size OAMD for each Object, along with binaural metadata settings.
- .atmos.audio — A Core Audio Format (CAF) file of up to 128 tracks of interleaved audio.
The .atmos and .atmos.metadata files can be opened for inspection with a text editor. However, direct editing of these files is not recommended, as the file set can become corrupted.
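Because a DAMF is only usable when all three files travel together, a quick completeness check can be helpful before handing off a master. The sketch below is one possible approach, assuming the three files share a common base name and sit in the same folder. The function name and the `demo_master` file name are hypothetical. It only checks that the files exist and does not open or modify them.

```python
from pathlib import Path
import tempfile

# The three extensions listed above for a Dolby Atmos Master File set.
DAMF_SUFFIXES = (".atmos", ".atmos.metadata", ".atmos.audio")


def missing_damf_files(folder, base_name):
    """Return the names of DAMF files that are absent from the folder."""
    folder = Path(folder)
    return [
        base_name + suffix
        for suffix in DAMF_SUFFIXES
        if not (folder / (base_name + suffix)).is_file()
    ]


# Demonstration with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for suffix in (".atmos", ".atmos.audio"):
        (Path(tmp) / ("demo_master" + suffix)).touch()
    demo_missing = missing_damf_files(tmp, "demo_master")

print(demo_missing)  # lists the file that was not created in the demo
```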
While a new master is always recorded as a DAMF, two other formats are used for distribution, encoding, mastering, or further editing:
- ADM BWF – The Audio Definition Model Broadcast Wave Format (ADM BWF) is an alternative Dolby Atmos master format. With ADM BWF (sometimes referred to as ADM BWAV), all other information included in the .atmos and .atmos.metadata files is included in a data chunk in the header of the wav file. The audio payload itself is up to 128 tracks of interleaved audio. ADM BWF has several advantages:
  - ADM BWF is a single file instead of three files in a folder, making it easy to interchange with other facilities.
  - ADM BWF can be imported into some DAWs. All Bed and Object audio tracks can then be recreated along with all the panning metadata, which allows for subsequent editing prior to remastering.
  - ADM BWF can be encoded to Dolby TrueHD, Dolby Digital Plus JOC, and Dolby AC-4 IMS and is the primary deliverable for mastering engineers, streaming operators, and Blu-ray authoring.
- IMF IAB – Immersive Audio Bitstream (IAB) is a mezzanine format for IMF (Interoperable Master Format). IAB is considered a mezzanine format rather than a master format, as OAMD is quantized. IAB.MXF is used by third-party IMF packaging tools to create a delivery container for both Dolby Atmos and video (including Dolby Vision). IMF IAB is not used for Dolby Atmos Music.
While the Dolby Atmos Renderer natively records in the .atmos format only, it can convert to and export ADM BWF and IAB.MXF. The entire file can be exported, or basic top/tail (specified range) edits can be performed.
The Dolby Atmos Renderer can also open ADM BWF and IAB.MXF files as master files for playback, QC, basic top/tail editing, conversion (between the two formats), and re-export. However, some restrictions apply when ADM BWF and IAB.MXF files are opened. Punch-ins and other metadata editing are not permitted with ADM BWF and IAB.MXF. Conversion to .atmos from ADM BWF and IAB.MXF is not permitted.
The Dolby Atmos Conversion Tool (DACT) is a companion application to the Dolby Atmos Renderer and is required to convert from ADM BWF and IAB.MXF to .atmos, perform format and frame-rate conversions, as well as perform complex editing operations on master files. The Dolby Atmos Conversion Tool is a free utility.
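The conversion paths described above can be summarized as a small lookup table. This is only a restatement of the text in code form, not an interface to either application. The table lists the tool the text names for each path, and the `tool_for` function is a hypothetical helper. Frame-rate conversion and complex editing, which the text assigns to the Dolby Atmos Conversion Tool, are not modeled.

```python
RENDERER = "Dolby Atmos Renderer"
DACT = "Dolby Atmos Conversion Tool"

# (source format, target format) -> tool named in the text for that path
CONVERSION_TOOL = {
    (".atmos", "ADM BWF"): RENDERER,
    (".atmos", "IAB.MXF"): RENDERER,
    ("ADM BWF", "IAB.MXF"): RENDERER,
    ("IAB.MXF", "ADM BWF"): RENDERER,
    ("ADM BWF", ".atmos"): DACT,
    ("IAB.MXF", ".atmos"): DACT,
}


def tool_for(source, target):
    """Return the tool the text associates with a format conversion."""
    if source == target:
        raise ValueError("source and target formats are the same")
    try:
        return CONVERSION_TOOL[(source, target)]
    except KeyError:
        raise ValueError(f"no conversion path listed for {source} -> {target}")


print(tool_for(".atmos", "ADM BWF"))
print(tool_for("ADM BWF", ".atmos"))
```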
Technical Delivery Specifications
The deliverables required by streaming services are spelled out in technical delivery specifications. These vary in terms of loudness and peak target, the number and format of master files and channel-based deliverables, as well as naming conventions and more. Some specifications ask for Pro Tools sessions along with ADM BWF for archival purposes. Being aware of the deliverables required is crucial to achieving an efficient workflow.