Definitive Guide to Audiobook Submission Specifications

6 May 2024

Audiobooks submitted to ACX and other vendors must adhere to strict quality standards to ensure they offer the best possible listening experience and meet the expectations of a wide audience. This guide consolidates the crucial technical and content-related specifications your audiobook needs to satisfy before it can be sold on major retail platforms through ACX and other vendors.

See also the ACX audio submission requirements: https://help.acx.com/s/article/acx-audio-submission-requirements

Quality and Consistency

Your audiobook must exhibit consistency in sound, tone, and formatting to enhance listener engagement and satisfaction. Maintaining uniform audio levels and pronunciation across all files is essential to prevent jarring transitions. Each audiobook segment should blend seamlessly, promoting a professional and enjoyable listening experience.

Technical Specifications

Audio Files: Each audiobook must consist of high-quality audio files free from unwanted sounds like plosives, mic pops, and mouse clicks. These extraneous noises can distract listeners, potentially leading to negative reviews and impacting sales.

Credits and Metadata: Each audiobook must include opening and closing credits that correspond accurately with the title's cover art and metadata:

  • Opening Credits: The title, author(s), and narrator(s) must be included.
  • Closing Credits: At a minimum, these must state "The End."

Audio Samples: A retail audio sample between one and five minutes long must be provided. This sample should be representative of the audiobook’s quality and should not contain explicit material, music, or opening credits.

File and Format Requirements

  • Chapter Separation: Each audio file must contain only one chapter or section. The opening and closing credits must also be in separate files, ensuring easy navigation for listeners.
  • Room Tone: Each file should include no more than 5 seconds of room tone at the beginning and end, providing a clear auditory signal of the start or end of a section.
  • Section Headers: Each file must accurately reflect its content with a header, like "Prologue" or "Chapter 1," ensuring clarity and continuity for the listener.

Audio Levels

  • RMS: The root mean square (RMS) levels of each file should measure between -23dB and -18dB, maintaining a consistent volume that doesn't require listeners to adjust their playback settings.
  • Peak Values: The peak sound levels must not exceed -3dB to avoid distortion and ensure high-quality audio.
  • Noise Floor: The background noise level, or noise floor, must not be higher than -60dB RMS, minimizing distractions from the core audio content.
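The three level limits above can be checked with a few lines of arithmetic. The following is a minimal sketch, not an official ACX tool: it assumes the audio has already been decoded into floating-point samples scaled to the range -1.0 to 1.0 (so the dB values are relative to full scale), and that a stretch of room tone has been isolated to measure the noise floor. The function names are illustrative.

```python
import math

def to_db(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")

def rms_db(samples):
    """RMS level of a list of samples, in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))

def peak_db(samples):
    """Highest absolute sample value, in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples to measure")
    return to_db(max(abs(s) for s in samples))

def check_levels(samples, room_tone):
    """Compare a file against the limits listed above.

    samples:   the whole file (assumed decoded to floats in -1.0..1.0)
    room_tone: a silent stretch of the same file, used for the noise floor
    """
    return {
        "rms_ok": -23.0 <= rms_db(samples) <= -18.0,
        "peak_ok": peak_db(samples) <= -3.0,
        "noise_floor_ok": rms_db(room_tone) <= -60.0,
    }
```

A file passes only when all three values in the returned dictionary are true. Dedicated metering tools may weight or window the measurement differently, so treat results near a limit with caution.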

Encoding and File Size

  • MP3 Format: All files must be encoded as MP3 at 192kbps or higher, with a Constant Bit Rate (CBR) and a sample rate of 44.1kHz. Meeting this specification helps the files pass through the encoding process without errors. Higher bit rates such as 256kbps or 320kbps are permitted, but they generally do not provide a noticeable quality improvement for the listener.
  • Channel Format: Files must be uniformly mono or stereo across the entire audiobook. Mono is recommended for consistency and simplicity in encoding.
  • File Length: No file should exceed 120 minutes. If a chapter exceeds this limit, it should be divided appropriately with clear section headers indicating parts like "Chapter 2 continued."
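The encoding and length rules above lend themselves to an automated pre-submission check. The sketch below assumes the properties of each file (codec, bit rate, sample rate, channel count, duration) have already been read with whatever inspection tool you prefer; the `FileInfo` record and `find_problems` function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass

MAX_MINUTES = 120       # per-file length limit from this guide
MIN_BITRATE_KBPS = 192  # minimum MP3 bit rate from this guide
SAMPLE_RATE_HZ = 44100  # 44.1kHz

@dataclass
class FileInfo:
    """Properties of one audio file, assumed to be gathered elsewhere."""
    name: str
    codec: str
    bitrate_kbps: int
    is_cbr: bool
    sample_rate_hz: int
    channels: int       # 1 = mono, 2 = stereo
    duration_s: float

def find_problems(files):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    for f in files:
        if f.codec.lower() != "mp3":
            problems.append(f"{f.name}: not an MP3 file")
        if f.bitrate_kbps < MIN_BITRATE_KBPS:
            problems.append(f"{f.name}: bit rate below {MIN_BITRATE_KBPS}kbps")
        if not f.is_cbr:
            problems.append(f"{f.name}: not constant bit rate")
        if f.sample_rate_hz != SAMPLE_RATE_HZ:
            problems.append(f"{f.name}: sample rate is not 44.1kHz")
        if f.duration_s > MAX_MINUTES * 60:
            problems.append(f"{f.name}: longer than {MAX_MINUTES} minutes")
    # Mono/stereo must be uniform across the whole audiobook.
    if len({f.channels for f in files}) > 1:
        problems.append("audiobook mixes mono and stereo files")
    return problems
```

Running the check over the full set of files, rather than one at a time, is what allows the mono/stereo consistency rule to be verified.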

Additional Guidelines

  • Human Narration: Audiobooks must be narrated by a human unless specifically authorized otherwise.
  • File Naming: Use standard US alphabetical/numeric characters for file names, avoiding symbols or non-standard characters to prevent file processing errors.
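The file naming rule can be enforced with a simple pattern match before upload. This is a sketch under stated assumptions: the guide does not say whether separators are allowed, so the pattern below assumes that underscores and hyphens are acceptable alongside unaccented letters and digits, and that every file ends in `.mp3`. Tighten the pattern if your vendor is stricter.

```python
import re

# Assumption: unaccented letters, digits, underscores, and hyphens only,
# starting with a letter or digit and ending in ".mp3".
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*\.mp3")

def is_safe_name(filename):
    """Return True if the file name uses only the assumed safe characters."""
    return SAFE_NAME.fullmatch(filename) is not None
```

For example, `01_Chapter1.mp3` passes, while `Chapter 1 (final).mp3` fails because of the spaces and parentheses.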

By meticulously following these specifications, you can ensure your audiobook meets ACX standards and provides an engaging and satisfying experience for listeners, ultimately reflecting positively on sales and listener reviews.


Differences Between ACX and Other Vendors

  1. File Length Limit:

    • The other vendors' specs suggest a maximum file length of 77 minutes to stay safely within platform limits.
    • The ACX specs raise this maximum to 120 minutes, allowing for longer sections while still requiring breaks in very long chapters.
  2. Mono vs. Stereo Files:

    • ACX specifically recommends mono files to ensure smooth encoding and consistent audio quality across all playback devices.
  3. Credit Requirements:

    • The other vendors' specs include a detailed script for opening and closing credits and stress the importance of matching these with the book's metadata.
  4. Technical Specifications for Encoding:

    • ACX details requirements for a constant bit rate (CBR) at 192kbps or higher and a sample rate of 44.1kHz, noting that while higher rates like 256kbps or 320kbps are allowed, they are generally unnecessary.
  5. Chapter Continuation Protocol:

    • The other vendors' specs suggest keeping tracks under 77 minutes and splitting longer chapters, without specifying how to label continuations.
    • ACX requires clear labeling for continuations, e.g., "Chapter 2 continued," to maintain listener navigation.

Summary of Specs

Here is a concise summary of the key specifications for submitting audiobooks to ACX:

  1. Audio Quality and Consistency:

    • Maintain consistent sound, tone, and formatting across all files.
    • Ensure files are free from extraneous sounds like plosives, mic pops, and mouse clicks.
  2. Credits and Metadata:

    • Include opening credits with the book title, author, and narrator. Closing credits should minimally state "The End."
    • Ensure all metadata matches the cover art and provided credits.
  3. Audio Samples:

    • Provide a retail audio sample between one and five minutes long that showcases the performance and production quality. Avoid explicit content and opening credits in the sample.
  4. Technical Requirements:

    • Files must be in MP3 format with a 192kbps or higher bitrate, 44.1kHz sample rate, and constant bit rate (CBR).
    • Maintain RMS levels between -23dB and -18dB.
    • Ensure peak audio levels do not exceed -3dB.
    • Keep the noise floor no higher than -60dB RMS.
  5. File Formatting:

    • Each file must contain only one chapter or section.
    • Files must include no more than 5 seconds of room tone at the beginning and end.
    • Ensure consistency in mono or stereo format throughout the audiobook. Mono is recommended.
  6. File Length and Naming:

    • Each audio file should not exceed 120 minutes. If necessary, longer chapters should be split appropriately.
    • Use standard US alphabetical/numeric characters for file names, avoiding special characters.
  7. Human Narration:

    • The audiobook must be narrated by a human unless an alternative is explicitly authorized.