Do You Own The Book You Wrote?
Copyright with Dickens, Twain, Harper, Unions, and our new LLM friends
Last week’s post on Men, Substack, and Novels is now no longer paywalled.
Three slots left in my book proposal course, starting July 7.
In my two previous newsletters on copyright laws in America (here and here), I discussed how the lack of them before the nation was founded was possibly revolutionary, a populist good that led colonists to rebel. Since ideas were freely circulated, and books sold cheaply, the citizenry became more literate, informed, and educated. When copyright laws were passed in the 1790s, they protected only American authors; tariffs on paper and the machinery for presses levied around the same time helped create an American publishing industry. But works by foreign authors continued to be legally pirated for another century.
Many authors today would decry these lax laws, even though one could argue they were a net public good. The imperative to preserve the sanctity of copyright is something we hear often, given tech companies, LLM training, and AI chatbots. But I’m always surprised at how easily people forget Aaron Swartz, debates over Open Access, Napster, sampling, and bricolage, the entire corpus of literary works which arguably steal from previous ones, the brute truth that copyright usually benefits companies, not individuals, and of course the sloggy decades I spent reading postmodern and poststructuralist literary theory, in which authors and originality and individuals were killed. And now I also think of those early rebellious colonists.
Copyright laws have gradually become more restrictive over the ensuing decades, and for those who are happy about this, Mark Twain shows up as a hero. But anyone who has wished they could use the image of Batman more freely has him to blame as well.
Before I dig into this more, I want to mention a few things I think many people may be confused about:
Copyright is not the same as publishing rights.
If an LLM is trained using your book, the result will not be AI or chatbots “publishing” that book in the same form.
For much of the 19th century, British literature was much cheaper in the US than it was in England, given the laws outlined above. Everything a Brit wrote was basically immediately in the public domain stateside. This, of course, could be seen both ways: great for Americans, for literacy, for inexpensive access; bad for British authors who didn’t get a dime from those cheap editions. But it all led to a phenomenon that determined much of the academic study of English since its birth: British authors were seen as better than American ones, partially because they were simply better known, as it was easier and cheaper to read their work. American authors may have been given more rights than Brits, but that also meant their books were more expensive. Thus, in part, Brits have been privileged over Americans in the literary canon.
And this situation benefitted American publishers, if not authors, who were publishing those cheap reprints. Without international copyright laws, readers and publishers win; authors lose. Particularly British authors.
In the 1840s, it would cost you $2.50 to buy a Dickens novel in London. In New York, that same novel would cost you six cents. This is partially why Dickens decided to tour the United States in 1842: to urge Americans to support international copyright laws, so he would be paid for the copies of his novels Americans bought. What he discovered was that a key reason the tour was so successful (it was packed, night after night, with eager readers), and a key reason Dickens was so famous stateside, was that his books were so cheap and plentiful. When he returned home, frustrated at his inability to convince Americans, he wrote a book about his experiences. It was (of course) pirated and reprinted in the US. It sold 50,000 copies in three days. Dickens received $0 for those sales.
Twain was also being pirated—Canadians were publishing his work without compensating him, for instance. So for decades he would spend a weekend in Canada and apply for a Canadian copyright that would work throughout the British Empire. Twain also tried to trademark his pen name so other publishers couldn’t use it. He did so by calling upon the idea of “originality,” even though he often admitted all writers were thieves (let us all take a moment to think about the literary sensation of the past year: James by Percival Everett, a retelling of an “original” work by Twain). Of course, Twain was also a publisher, and thus his views of copyright are as informed by his role as businessman as they are by his role as novelist.
What helped change this situation that was arguably robbing authors of money they were owed? When publishers started to feel the pinch.
More and more new publishers were starting up with business plans focused on cheap reprints. They refused to submit to the “gentleman’s agreement” or “courtesy of the trade” or collusion—however you want to describe it—of the established publishers not to reprint the same books or authors. Henry Holt (the person, not the company—this will become a theme in this history!) expressly used this “gentlemanly” argument in the courts, when he defended the informal practice between publishers not to put out the same pirated editions (only one of them would get to publish Dickens, for instance, even though, legally, all could). (Informal practice or collusion is something we’ve seen publishers discuss in court cases in the 21st century as well!)
Here’s the case: Holt had published Thomas Hardy in the US. But then Harper Brothers brought out their own edition of a Hardy novel. To avoid Lippincott (publisher) doing the same, Holt wrote to Lippincott (person): “We of course claim Hardy as our man as we have introduced him to the American public and when we add that we have published all his works by direct arrangement with the author, we trust that you will withdraw in our favor.” (Holt did pay Hardy, but was not legally required to.) Lippincott deferred, and didn’t publish Hardy. But then a Chicago publisher decided to put out a series of major British works and sell them for ten to twenty cents each. It included Hardy. Other publishers started similar series, all with Hardy volumes. By 1877, there were fourteen such series.
These publishers were all outside New York, not in the “gentleman’s club” of the Harpers, Holt, and Lippincott, so not abiding by (or invited into) the “courtesy of the trade.” (Keen readers will know that eventually what was originally Harper and Brothers (company) would buy what was then Lippincott (company); Holt (company) would be bought by Macmillan.)
Eventually the gentlemen gave up and joined in: Harper would become the most prominent publisher of these cheap book series. Their Library of Select Novels and Franklin Square Library were stocked in bookstores (as opposed to mail order and newsstands, which is where the other cheap books were sold). By the 1880s, the nation was even more flooded with classic reprints: a soap company gave out a volume with each bar sold. Then, given the rhythm of the copyright terms of that era, more and more books by American authors began to enter the public domain as well, and this increased the number of inexpensive reprints even more. Prices flattened even further.
Now no one was winning—well, I take that back—fewer authors and publishers were winning, if by winning we mean making money. Arguably, American readers were winning! So many great books, so plentiful and affordable!
So authors and publishers hurt by this movement organized: the Authors Club changed its name to the American Copyright League. Twain was an active member. The owner of Publishers Weekly joined. George Putnam (person) joined. Alcott, Whitman, Harte, and Whittier signed petitions. Twain spoke to the Senate, lobbying for stricter laws.
The opposition kept it up, too: Henry Carey Baird, who published reprints, argued that these expressions belonged to the public, and were not the property of the author. Brits should just become American citizens if they want American protections, he said. Also on the side of pirating—aka lax copyright laws—were the unions, who benefitted from the increased work, and protectionists, who sided with the pirates, because they were good for American industry.
The impasse was finally resolved by the printers, who flipped their position. Originally on the side of the pirates and no international copyright law, they saw that, as book prices kept going down, profits were getting ever thinner, and the cheap books used worse materials. In addition, the reprint publishers tended to be based in places where printers’ unions were weak, and those shops started hiring nonwhite men and women, who were okay with printing and binding for terrible pay. Union men were getting replaced. So they switched sides and came out in favor of copyright laws. In 1891, Benjamin Harrison signed an international copyright act into law.
Twain had won; now, as he had previously told Congress, “a day [had] come when, in the eyes of the law, literary property will be as sacred as whiskey, or any other of the necessaries of life.”
He wasn’t done, though: Twain still wanted copyright to cover his works for longer terms. By 1909 he and others succeeded in extending the copyright term to twenty-eight years, renewable for another twenty-eight. It’s only grown longer and more Twain-friendly since: the most recent extension, in 1998, stretched it to seventy years beyond the life of the author.
Because the term has changed so often, it can be difficult to know when a book will go into the public domain, no longer protected by copyright. Here’s a handy guide.
Now, to return to the points I made up top: none of this discussion has to do with rights to publish books. It only has to do with who, if anyone, holds copyright, and for how long. So here’s a brief overview of how things stand now:
When you write something—on your phone, on a napkin—it’s yours. Your idea has been expressed. You have the copyright automatically. You don’t need to register it. It just shows up, swooping in like a little ghost. Congrats! (This newsletter is now in my copyright. Oops also that sentence. Hands off! It’s mine until I’ve been dead for 70 years.) But what good does that do me, really? It’s when an author transfers some of those rights that things become interesting.
Think of copyright as an umbrella term with lots of subdivisions. One of those subdivisions is publishing rights. When you sign a contract with a publisher, you grant someone else one of those rights that lie within your larger copyright umbrella: the right to publish the book. Usually it’s called “Grant of Rights” and is right up top of a publishing contract. Currently, those rights are usually divided into two types: primary (usually print and ebook; note that the ebook right has obviously been added only recently) and subsidiary (translations, audio, film, etc.). The contract will spell out which rights the publisher is asking for (usually the primary ones), what the publisher will give you for those rights (royalties), and how you might be able to get those rights back if you wanted to, or if the publisher ceases to exist (reversion of rights).
So: copyright is yours before, during, and after, but by signing a contract with a publisher you give them some of the rights that fall under that umbrella. Now you cannot go to your neighborhood reprint company and say, “Hey, wanna make some cheaper editions of this book?” You don’t have that right any longer.
So, to scroll back up to the title of this newsletter: do you own the book you wrote? Yes (copyright), and no (if you’ve granted someone else publishing rights by signing a contract).
Okay, now to LLMs. Let’s set aside the current legal issues about fair use and copyright here. Let’s just stick with tech companies feeding the contents of books to train LLMs. The right to allow a book to be used to train an LLM should, I think, be one of those rights that fall under copyright, and should be included in the rights clauses of publisher contracts as a subsidiary right, not a primary one, requiring the consent of the author (and authors could opt out or strike that clause before any contract is signed). (Here’s the full list of subsidiary rights that could be sold, beyond print and ebook, in Belt contracts: Reprint, Book Club, Textbook, Anthology, First Serial, Second Serial, Performance, Translation, Foreign English Language, Commercial/Merchandising, Audio, Nondramatic Readings, Braille. One could see “LLM Training” added to this list.)
Obviously this is an exceedingly complicated issue, but I make this point because I think some authors have an idea that if their book is used to train an LLM, it is somehow parallel to those cheap pirated editions I describe above, that one result might be the digital version of the Hard Times edition in the image—something anyone with access to a chatbot could just grab (“Claude, send me the full text of that book PRH published last year”). That’s not true. LLMs may be pirating, or stealing, or violating copyright, but they are not disseminating the books in the same form they were in when they were fed in. That does, however, describe exactly what American publishers were doing with foreign books for half of the lifetime of this country.
I think that’s enough for one week! I’ll have you know I cut out an entire section on whether copyright derives from natural law or is a governmental right, how copyright aids censorship, and legal theories as to whether or not your book is like that cabin in the woods you bought in the 80s. Next up in this occasional series is the reason why Creative Commons and Open Access were developed in the 21st century, as well as other efforts to loosen copyright restrictions. Strict copyright laws helped Twain, and would have helped Dickens, but they are also a reason why Warner Brothers, Marvel, and Disney have so much power and money. Copyright as polycule: in it there are always strange bedfellows.