AI Training Licenses: When Publishers Ask to Use Your Backlist — What to Say

Publishers are already cutting AI training deals — and the default revenue split they offer authors is insultingly low.

By Vlada Matusova

Wiley, one of the world's largest academic publishers, has already struck deals to license books for AI model training — and they are far from alone. If you're an indie author with even a modest backlist, the knock on your door is coming. Maybe it's from a distributor who's partnered with an AI company. Maybe it's buried in a contract amendment you're asked to sign. Either way, the question isn't whether your work will be sought for AI training. It's whether you'll recognize the ask when it arrives and know what your words are actually worth.

Here's the fact that should anchor every decision you make on this topic: under most existing publishing agreements, AI training rights were never granted to your publisher. They remain with you, the author. The Authors Guild has published a formal statement confirming this interpretation, and even the Big Five — Penguin Random House, HarperCollins, Hachette, Simon & Schuster, and Macmillan — have so far proceeded accordingly, seeking author permission before licensing books for AI training purposes. This is critical. It means that no matter who distributed or published your book, the default legal position is that your consent is required. If anyone tries to tell you otherwise, they are either misinformed or testing your boundaries.

So what happens when consent is actually requested? This is where indie authors need to pay close attention to the numbers. The Authors Guild recommends that authors receive 75 to 85 percent of any AI training license revenue, a split that reflects the foundational reality that the book is yours and the publisher is merely a conduit. In practice, however, publishers are offering 50-50 splits. Let that sink in. A publisher who did not create the work, who may have already earned back its advance, and who holds no contractual claim to AI rights is asking for half. If you self-published through a platform like KDP or IngramSpark and retained all rights from day one, accepting anything close to a 50-50 split with a third party would be an act of self-sabotage. You own the asset. Price it like you do.
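To see what those percentages mean in money, here is a minimal sketch that compares the splits on a single license fee. The $1,000 fee is a made-up number for illustration only, not a figure from any real deal, and the function name `author_payout` is my own.

```python
def author_payout(license_fee_cents: int, author_share_percent: int) -> int:
    """Return the author's share of a license fee in whole cents (rounded down)."""
    if not 0 <= author_share_percent <= 100:
        raise ValueError("author_share_percent must be between 0 and 100")
    return license_fee_cents * author_share_percent // 100


# Hypothetical fee of $1,000.00 per title; an assumption, not a real offer.
HYPOTHETICAL_FEE_CENTS = 100_000

for share in (50, 75, 85):
    payout = author_payout(HYPOTHETICAL_FEE_CENTS, share)
    print(f"{share}% split: author receives ${payout / 100:,.2f}")
```

On that assumed fee, the gap between a 50 percent split and the low end of the Authors Guild range is $250 per title, and it grows with every book in your backlist.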

The legal landscape makes this even more urgent because almost nothing is settled. Courts have not yet definitively ruled on whether AI training on lawfully acquired books constitutes fair use. Early rulings have gone in different directions — one federal judge found that AI model training qualifies as fair use, but with significant caveats that leave the door wide open for future challenges. Meanwhile, unresolved questions are stacking up: How will market harm be measured when AI-generated content competes directly with your books? What are your rights when an AI company simply purchases a copy of your book and trains on it without any license at all? And how will publishing contracts evolve as these technologies mature? The honest answer is that nobody knows yet. But legal uncertainty is not a reason to give away rights cheaply — it's a reason to hold them tightly until their full value becomes clear.

My position is straightforward: say no to any AI training license that doesn't meet three conditions. First, explicit opt-in language — not a blanket clause buried in a contract update, but a standalone agreement you sign with full knowledge of what's being licensed and to whom. Second, a revenue split of at least 75 percent in your favor, consistent with the Authors Guild's recommendation and reflective of the fact that you created the work. Third, a time-limited term with a clear expiration date. Perpetual AI training licenses are a trap. The value of training data will only increase as models improve, and you should retain the ability to renegotiate as the market matures. If a publisher or platform cannot meet all three conditions, the answer is no — politely, firmly, and in writing.
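If you track offers in a spreadsheet or a script, the three conditions can be written down as a simple checklist. This is only a sketch of my own rule of thumb; the `LicenseOffer` fields and the `unmet_conditions` function are hypothetical names, and no code can replace reading the actual agreement.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_AUTHOR_SHARE_PERCENT = 75  # the floor named in the text above


@dataclass
class LicenseOffer:
    standalone_opt_in: bool        # a separate agreement, not a buried clause
    author_share_percent: float    # the author's share of license revenue
    term_years: Optional[float]    # None models a perpetual license


def unmet_conditions(offer: LicenseOffer) -> List[str]:
    """Return the conditions an offer fails; an empty list means all three are met."""
    problems = []
    if not offer.standalone_opt_in:
        problems.append("no explicit standalone opt-in")
    if offer.author_share_percent < MIN_AUTHOR_SHARE_PERCENT:
        problems.append("author share below 75 percent")
    if offer.term_years is None or offer.term_years <= 0:
        problems.append("no time-limited term")
    return problems


# Example: a 50-50 perpetual offer presented as a standalone agreement.
offer = LicenseOffer(standalone_opt_in=True, author_share_percent=50, term_years=None)
print(unmet_conditions(offer))
```

Any non-empty result means the answer is no.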

Here is one thing you can do right now: open every publishing contract, distribution agreement, and platform terms-of-service document you've signed in the last three years. Search for the words "artificial intelligence," "machine learning," "training," and "data." If any of those terms appear in a rights grant you didn't explicitly agree to, flag it, screenshot it, and consult an intellectual property attorney before your next royalty cycle. Your backlist is not just a catalog — it is a licensing portfolio, and the AI industry is betting you won't treat it like one.
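If your contracts are saved as plain text, a short script can do the first pass of that search for you. The sketch below assumes you have already pasted or exported the contract into a string; the function name `flag_lines` and the sample clause are invented for illustration. "Training" and "data" are broad words, so expect false positives and read every hit yourself.

```python
import re
from typing import List, Tuple

# The four search terms from the paragraph above.
SEARCH_TERMS = ("artificial intelligence", "machine learning", "training", "data")

# Whole-word, case-insensitive match, so "database" does not count as "data".
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in SEARCH_TERMS) + r")\b",
    re.IGNORECASE,
)


def flag_lines(contract_text: str) -> List[Tuple[int, str, str]]:
    """Return (line number, matched term, line text) for every hit."""
    hits = []
    for number, line in enumerate(contract_text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((number, match.group(1).lower(), line.strip()))
    return hits


# Invented sample text, not taken from any real contract.
sample = (
    "1. Grant of rights.\n"
    "2. Licensee may use the Work for Machine Learning and model training.\n"
    "3. Royalties are paid quarterly."
)

for number, term, text in flag_lines(sample):
    print(f"line {number}: '{term}' in: {text}")
```

A phrase that is split across two lines of the file will be missed, so treat a clean result as a starting point, not a clearance.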