Creating An Audiobook with AI


Recording Your Audiobook Using AI Voices (ElevenLabs/Audible)

Creating an audiobook using AI voice tools is a cost-effective and accessible way to bring your book to life without needing to record it yourself. This guide walks you through ElevenLabs (currently the major player in AI-generated audiobooks for mass distribution) and then explains what to expect when publishing on Audible via ACX.

ELEVENLABS

Step 1 – Sign Up
Visit www.elevenlabs.io and create a free account.

Step 2 – Prepare Your Manuscript
Upload your manuscript as a clean PDF, DOCX, or ePub file. Make sure it’s well formatted with clear chapter breaks, and prepare the opening and closing credits as separate files. This will be important at the editing stage.
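If you want to double-check your chapter breaks before uploading, a short script can help. The sketch below is only an illustration: it assumes you have saved a plain-text copy of your manuscript and that every chapter starts with a line beginning with the word "Chapter". The function name split_chapters is hypothetical and is not part of ElevenLabs or any other tool mentioned in this guide.

```python
import re

# Assumption: each chapter starts on its own line with "Chapter" plus a number or word,
# for example "Chapter 1" or "Chapter Two: The Journey".
CHAPTER_HEADING = re.compile(r"^chapter[ \t]+\w+.*$", re.IGNORECASE | re.MULTILINE)


def split_chapters(text):
    """Split manuscript text into (heading, body) pairs at each chapter heading."""
    matches = list(CHAPTER_HEADING.finditer(text))
    chapters = []
    for i, match in enumerate(matches):
        # A chapter body runs from the end of its heading to the start of the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters.append((match.group().strip(), text[match.end():end].strip()))
    return chapters
```

Running split_chapters on your text and printing each heading gives you a quick list to compare against your table of contents. Note that an ordinary sentence that happens to start a line with "Chapter" would be picked up as a heading too, so treat the result as a check rather than a guarantee.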

Step 3 – Select a Voice
Choose from ElevenLabs’ range of AI voices. You can assign different voices to different characters or sections.

Step 4 – Generate Your Audio
Click to convert your manuscript into audio. This usually takes a few minutes, depending on the length of the manuscript.

Step 5 – Download Your Files Ready for Editing
Download your audio in WAV (44.1 kHz, 16-bit) format. Export each chapter as a separate file, along with the opening credits and closing credits.

Step 6 – Open the Files in Your Editing Software (GarageBand/Audacity)
See the VoomVox guide "Get your Audiobook ready for VoomVox" for the adjustments required before you upload to our studio. We'll add the essential finishing touches to ensure that your audiobook is at its best and meets all submission requirements.

Step 7 – Export the Files from GarageBand or Audacity
Export your audio in WAV (44.1 kHz, 16-bit) format. This format is supported by VoomVox for mastering.
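If you are comfortable running a small script, you can confirm that each exported file really is 44.1 kHz, 16-bit before sending it on. The following is a minimal sketch using Python's built-in wave module. The function name check_wav_format is our own illustration, and it assumes your files are standard uncompressed (PCM) WAV files.

```python
import wave

EXPECTED_RATE = 44100      # 44.1 kHz
EXPECTED_SAMPLE_WIDTH = 2  # 16-bit audio uses 2 bytes per sample


def check_wav_format(path):
    """Return a list of format problems found in a WAV file (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE} Hz")
    if width != EXPECTED_SAMPLE_WIDTH:
        problems.append(f"bit depth is {width * 8}-bit, expected 16-bit")
    return problems
```

Call check_wav_format once for each chapter file and for the opening and closing credits. An empty list means the file matches the format described above.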

Important: These files must be checked and adjusted to meet distributor standards. VoomVox will help ensure they are mastered and submission-ready.

AUDIBLE (Using Third-Party AI Narration via ACX)

If you're using an AI voice tool like ElevenLabs, Descript, or Speechki to generate your audiobook narration, you can publish on Audible, but there’s a catch.

Audible accepts AI-generated audiobooks only via its independent publishing platform, ACX, which sets its own technical submission requirements. These requirements are non-negotiable, and raw AI output rarely meets them.


ElevenLabs, Descript, and Speechki: Pros and Cons

ELEVENLABS
✔ Realistic AI voice generation
✔ Multiple stock voices
✔ Multilingual support
✔ Emotion control (Pro plan only)
✔ Voice cloning (via upload or training)
✔ Assign different voices to different sections
✔ Chapter-by-chapter generation
✔ WAV export
✘ Built-in audio editor

DESCRIPT
✔ AI voice generation (stock + Overdub)
✔ Voice cloning (with consent/training)
✔ Text-based audio editing
✔ Multitrack editor (audio + video)
✔ Chapter structuring via script editing
✔ WAV/MP3 export
✘ Advanced emotion control
✘ Large voice library (fewer than ElevenLabs)

SPEECHKI
✔ AI audiobook production (B2B service)
✔ 300+ voices, 70+ languages
✔ Human/AI hybrid production option
✔ Mastering done by Speechki team
✔ WAV/MP3 delivery
✘ Self-serve voice editor
✘ On-platform editing tools
✘ Real-time voice control or cloning


WHAT VOOMVOX WILL DO

  • We check and adjust your audio files to meet platform standards (loudness, peaks, noise floor, etc.)
  • We supply you with submission-ready files for distribution
  • We supply a professionally designed cover to upload with your audio, free of charge
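
For the curious, here is a rough idea of what measuring loudness and peaks involves. This is a simplified sketch, not the VoomVox mastering process: it assumes a 16-bit PCM WAV file, uses only Python's standard library, and the function names measure_levels and _to_dbfs are hypothetical. It reports levels in dBFS (decibels relative to full scale) and does not apply any distributor's thresholds, which you should take from the platform's own specification.

```python
import array
import math
import sys
import wave

FULL_SCALE = 32768.0  # largest magnitude a 16-bit sample can have


def _to_dbfs(value):
    """Convert a 0..1 amplitude to dBFS; digital silence maps to -infinity."""
    return 20 * math.log10(value) if value > 0 else float("-inf")


def measure_levels(path):
    """Return (peak_dbfs, rms_dbfs) for a 16-bit PCM WAV file.

    All channels are measured together, so a stereo file yields one pair of values.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is stored little-endian
    if not samples:
        raise ValueError("file contains no audio")
    peak = max(abs(s) for s in samples) / FULL_SCALE
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / FULL_SCALE
    return _to_dbfs(peak), _to_dbfs(rms)
```

The peak value shows how close the loudest moment comes to clipping, while the RMS value gives a rough sense of overall loudness. Measuring the noise floor works the same way, applied to a stretch of the recording with no speech.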

You bring the words, we’ll help you get them heard.