The Invisible Goldmine of the Audiobook Market

It may seem surprising that audiobook production has much in common with gold mining, but it does. The audiobook market is full of money, just as a mine can be full of gold. To see how technology is changing audiobook production, and why we compare it to gold, read on.

Hidden gold

In 1848, gold was found in California on the land of a man named John Sutter. He should have become rich in no time. But there was a problem: Sutter had no way to mine the gold under his property quickly and efficiently enough.

Soon he was overrun with squatters and thieves, who made off with gold and exploited what was under the land that he owned. If only he had had a fast, efficient way to extract all that gold for himself, he would have been unimaginably wealthy. He was literally sitting on a pile of gold. But he died poor.

In the publishing industry, the gold rush is already underway. The gold is not mineral gold, but the huge backlog of unrecorded books that could be turned into audiobooks. A publisher or rights holder who converts fewer than 100% of their books into audio is sitting on land where gold is waiting to be discovered, just as John Sutter was, and is not mining it.

Publishers currently record only 5% of the unique titles produced each year as audiobooks. Given that publishers and independent authors worldwide create a combined 2.2 million unique titles per year, the untapped supply of content is massive. The books that do get recorded are like the gold that can be picked up from the surface. The real riches are beneath the earth. And sound-based content has never been more popular.
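To put those two figures side by side, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above; the variable names are ours, and the result is a rough estimate rather than audited market data.

```python
# Figures quoted in the article; everything else is simple arithmetic.
TITLES_PER_YEAR = 2_200_000  # unique titles created worldwide each year
RECORDED_PERCENT = 5         # share currently recorded as audiobooks

# Integer arithmetic keeps the result exact.
recorded = TITLES_PER_YEAR * RECORDED_PERCENT // 100
unrecorded = TITLES_PER_YEAR - recorded

print(f"Recorded as audiobooks each year: {recorded:,}")
print(f"Left unrecorded each year:        {unrecorded:,}")
```

On these numbers, roughly 110,000 titles a year get an audio edition, and more than two million do not.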

Tools to mine

You are only as good as your tools. Just as you can’t mine gold with your bare hands, you can’t take advantage of this mother lode of unrecorded material without the right tools. The traditional way to work through a back catalogue in the audiobook world is, of course, to rent a studio and hire a voice actor.

But that’s the equivalent of mining gold with a pick and shovel. It’s traditional, and everyone is used to it, but this does not mean that it is the best and most effective way.

Modern technology offers publishers the audiobook-production equivalent of a mining extractor. And that mining extractor is thirty times faster and far cheaper than recording an audiobook with a human narrator.

In this mining analogy, different languages are like different types of soil. They can all be mined for resources, but each requires different tools. Speechki offers versatile tools for ‘mining’ your audiobook resources in a wide range of major languages, including English, Spanish, German, French, Portuguese, and Italian.

So what exactly is this digital equivalent of a mining extractor? Let’s take a look at the technology that is used to create AI-generated audiobooks.

We hear about the development of AI technologies everywhere: AI-powered self-driving cars, AI in healthcare, AI-targeted advertising, AI disaster prediction, AI financial advisors. The list goes on. In book publishing, though, AI is commonly believed to be something far off. That is not at all true!

AI is already here and has its place in publishing. For example, AI is used to create short summaries of audiobooks and to translate content, and it has driven significant progress in professional translation automation tools. AI also powers recommendation systems that keep readers supplied with interesting books, so that they renew their subscriptions.

As recently as five years ago, AI neural voices were thought to sound creepy, like a robot. This is no longer true. Over the past two years, speech synthesis technology has made a huge leap forward thanks to machine learning. Computer voices, a.k.a. text-to-speech voices, which once seemed robotic, monotonous, and lifeless, have been transformed into natural-sounding, realistic voices. They are not only capable of mimicking human speech but can also generate full-length, near-perfect audio narrations.

👉 Check out these ten audio files. Just sort them into the correct category — human or robotic.

Examples of AI-generated audio

Let’s listen to some samples so that we can get a sense of how AI voice technology has evolved and improved over the years.

Finally, let’s listen to how it sounds now so that we can compare that with how it used to sound. This is the next generation of synthetic voices. Clearly this sounds ultra-realistic — close to an actual human voice.


Speechki’s platform was designed specifically to be easy to use, because Speechki wants to break down the barrier to entry that prevents people from getting started with creating machine-generated audiobooks from existing texts. In fact, the process of creating AI-narrated audiobooks with Speechki looks remarkably similar to the traditional way, only vastly streamlined and optimized for speed.

How to use Speechki

Here is a short step-by-step guide to using our service, even from home.

1. Upload your book to Speechki’s system. The interface is extremely intuitive and should look familiar to anyone who has used Google Docs or Microsoft Word.
2. Choose which of seventy languages the book is written in, and decide on one of twenty ultra-realistic AI narrator voices to read your book.
3. Press START.

Next, Speechki’s AI-powered backend system synthesizes the audiobook. While that happens, you don’t need to do anything. Sit back and relax for a little while: roughly fifteen minutes for an audiobook of about eight hours.
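How does that compare with the running time of the book itself? A rough sketch using only the two figures quoted above (the variable names are ours, and real processing times will vary from book to book):

```python
# Figures quoted in the article: about eight hours of audio in about fifteen minutes.
AUDIOBOOK_LENGTH_MINUTES = 8 * 60  # running time of the finished audiobook
SYNTHESIS_MINUTES = 15             # approximate time the synthesis takes

# How many minutes of finished audio are produced per minute of waiting.
speedup = AUDIOBOOK_LENGTH_MINUTES / SYNTHESIS_MINUTES

print(f"About {speedup:.0f} minutes of audio per minute of synthesis")
```

That works out to about 32 times faster than real time. Assuming a human narrator needs at least the running time of the book to record it, this is in line with the "thirty times faster" figure mentioned earlier.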

When the audiobook is ready, you can ‘proof-listen’ through the book to catch any errors the AI might have made. Even human narrators stumble sometimes, and machine-generated narrators are no different. This is also the stage at which you can add sound effects or music cues, if you think the book might be enhanced by the sound of gulls on the beach or some moody strings in the background.

If you don’t have time to proof-listen to the audiobook and finalize it yourself, it needn’t be an obstacle. For an additional fee, Speechki can take care of the proof-listening process with its own in-house team.

Sounds easy, doesn’t it?

Final word

As you may have observed, modern AI-powered voice-generation technologies have very user-friendly interfaces, similar to mainstream word processors. You make the changes you want in an easy-to-use system, and the system handles all the complex changes itself without ever showing source code to the proofer. There are no client-side servers and no specialist equipment required, just a browser, headphones, a keyboard, and a mouse.

Let’s get back to the gold mining analogy. If you’re a publisher and you convert less than 80%, or even less than 50%, of your titles into audio, you are like John Sutter. You may say, ‘I’m OK with our current number of audiobooks’. We would respond, ‘You will be ruined by modern squatters and thieves’.

You know that the audio market is growing, right? But if consumers can’t find the books they want in audio format, they will switch to podcasts, radio, or music. That means a loss of audience, and therefore of income, for the whole publishing market. It would be a worst-case scenario for everyone in the industry, including you personally. The producers of that other content would get rich instead of you, even though your own content is sitting there, ready to be converted to audio! And podcast producers are already sharpening their picks and shovels. Just think about it.