Should AI be able to steal books? Are they stealing them?
Unwinding this issue back to the beginning of copyright
Controversies over AI and copyright are in the courts, and debated everywhere else. I have long found copyright, when it comes to creative works—books, specifically—to be a confoundingly complicated issue, and one that resists any easy left/right political divide. Should we prioritize compensating authors at the risk of stifling creativity? Do authors ever create anything anyway, or are authorship, originality, and the individual self constructs we should abandon? Do we want people to have access to as much information and art as possible, or do we want to prioritize protecting and encouraging original writing?
I’m going to back into the current controversies over copyright and AI by looking at the roots of copyright in America. Because, as I’ve written before, the entire American publishing industry is based upon piracy and theft. Few today would lament that Americans were able to buy cheap versions of Emma soon after it was published, but Austen received no royalties from sales of her novel over here. As I wrote then:
Since there was no international copyright law, any book could be reprinted by any American and published under their name. So they would race to boats arriving from England to get the first copies of British original editions—of, say, Sir Walter Scott’s Ivanhoe—take them to their print shops and meticulously recreate the text by placing each letter, one by one, backwards, into a compositor stick, then setting plates for each page, page after page, before sending the completed text block to a bindery to be stitched inside a leather cover, and sold….
One of the British novels that Mathew Carey pirated and published under his name would go on to become the bestselling American novel for a century. In 1794, Carey pirated a novel by Susanna Rowson, Charlotte: A Tale of Truth, the story of a young girl who is brought to America by a nefarious British officer who then abandons her, leaving her sick, pregnant, and alone. It was published in 1791 by William Lane of the Minerva Press… Carey chose to recreate it in his print shop and sell it to Americans. He decided upon an initial print run of 1,000 copies—a fairly standard number for the time—and retitled it Charlotte Temple…. Since the book was also free to republish, Carey could put out his own edition without paying Rowson anything for copyright or royalties. In 1801, seven years after his edition was published, Carey sent Rowson a check for twenty dollars along with twenty copies of the book “as a small acknowledgment for the copy right of Charlotte.” …Charlotte Temple blew up. Eighteen years later, in 1812, Carey wrote to Rowson about its success:
“Charlotte Temple is by far the most popular & in my opinion the most useful novel ever published in this country …[its sales] must far exceed 50,000 copies; & the sale still continues”
To translate this into contemporary terms: stealing books, and not compensating authors, is a practice as old as American publishing.
Those are examples of British books; there would be no copyright on those books for another century.2 But debates over copyrighting American books started as soon as the colonies became a nation.
Before independence, colonial printers (who were also publishers and booksellers) were beholden to royal authority as to what they could and could not print. According to John Tebbel, this worked more like the now oft-invoked phrase “obeying in advance,” or, as he puts it, “a chain of fear: the printer feared the governor and his council, as well as the colonial assembly; and the governor feared the king and his ministers. The people feared all authority, although in diminishing measure as time went by, and sometimes they took out their fear and hostility on the printer. Then, as passions rose before the Revolution, the printer faced censorship not only by authority but from the mob.” 3
There was an early attempt at copyright in 1673, but it protected printers (slash publishers/booksellers), not authors. It would be another century before an author in America could claim legal ownership of their work. This was true in Britain as well, where the first copyright laws were passed in 1518, but meant to protect royal printers, not authors; authors didn’t receive that thing we now call “royalties” there either.4 There has been no time in the history of America when copyright laws for authors were clear, uncontested in the courts, and free from intense debate.
In 1770 William Billings sought to prevent his New-England Psalm-Singer from being pirated, so he petitioned the Massachusetts House of Representatives to grant him an exclusive license to the work for seven years. At that time, the term in England was fourteen years, but, as we will see later from Thomas Jefferson’s argument that a shorter term was better, they changed it to fourteen. But it was just for this one book.
In 1781, Andrew Law also published a music book, Collection of Best Tunes, that was being pirated. He appealed to the Connecticut Assembly, stating that it had cost him 500 pounds to compile, engrave, and print the book, which had then been stolen by other publishers. (Here we have another reminder that self-publishing has always been a common practice.) Law wanted five years to be the only one to print and sell the book, because “the works of Art ought to be protected in this Country.”
In 1782, four authors would campaign to create an actual copyright law: Jeremy Belknap, Thomas Paine, Joel Barlow, and Noah Webster. Webster was then the author of the most popular book in America, and everyone was already pirating his dictionary. They built upon Law’s phrase—protecting works of art—and Webster in particular traveled about campaigning for a copyright law. He succeeded in 1783, when Connecticut passed an “Act for the encouragement of Literature and Genius,” followed by a federal law in 1790 which gave American authors “the sole Liberty of printing, publishing, and vending” a book they created for fourteen years.
But the law prohibited authors from copyrighting ideas—they had to have a book to protect. They were also not allowed to set the prices for those books higher than what would be considered reasonable.
Copyright would be included in Article 1, Section 8 of the Constitution, which gives Congress the ability to “promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.”
However, as we have seen, this only applied to works by Americans. “Nothing in this act shall be construed to extend to prohibit the importation or vending, reprinting or publishing within the United States any map, chart, book or books, written, printed, or published by any person not a citizen of the United States.”
And here we run into another contemporary issue: protectionism! Jane Austen and, really, most authors Americans read during this period had no rights. Only Americans did. And, since anyone could print books by non-Americans, there were multiple simultaneous editions of their books, which also meant books by foreigners were much cheaper than those made by Americans. That is one reason foreign literature was more highly praised and much more widely read than the work of “native” authors.
**
While fine American writers were off campaigning for legislation to protect the property of their fellow authors, Thomas Jefferson was arguing for more lax protections for authors. And no one was arguing that an author’s work was their property.
American copyright law did not give authors a right, nor consider creative works to be property. The phrase “intellectual property” is now often used interchangeably with copyright, which is misleading. Authors can transfer control of copyright to others. For instance, an author controls the copyright on their work, but when they sign a contract with a publisher, they transfer those rights to publish and distribute that work. It was understood more as a privilege than a right. A privilege is something the law grants you: “Copyright is a ‘deal’ that the American people, through its Congress, made with the writers and publishers of books. Authors and publishers would get a limited monopoly for a short period of time, and the public would get access to those protected works and free use of the facts, data, and ideas within them.”5
The writers of the Constitution wanted to encourage Americans to create, but they also felt that the public would benefit from using previous works in new ones. So they wanted a public domain in which the works became common property of all. They saw giving authors this monopoly as a tax on the public, and that tax should be limited. There has always been a fundamental tension between the interests of authors and those of readers and future authors. How long the public should be “taxed” by authors continues to change, as we see with today’s very extended length of copyright. So does the question of whether any monopoly is a good thing for the public.
James Madison didn’t think copyright was about property but the ideals of “progress” and “learning”: it encouraged new expressions and ideas. George Washington believed mainly in free and easy access to information, so the public could educate itself, all the better to resist tyranny.
It was Jefferson who had the most concerns with copyright, because he abhorred any form of monopoly. He wanted the Constitution to contain a prohibition on the monopolies authors would receive over their works under copyright. He suggested the Constitution adopt restrictive language and a very short term. He lost, but continued to grouse, as he wrote in 1813:
“If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of everyone,”
Ideas, Jefferson said, don’t work on a scarcity model, as does property.
“Its peculiar character… is that no one possesses the whole of it. He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening mine.”
At the end of the debate, copyright was the law for a 14-year term. In 1831 it was extended to 28 years, renewable for another 14 years. In 1909 it was extended to 28 years, renewable for another 28 years. In 1978, it was extended to the life of the author plus 50 years. In 1998 that was extended to life of the author plus 70 years. The forces that have had the most sway in extending copyright have not been a union of authors; they have been companies that want to create higher and higher walls around their commercial goods. They have been concerned with protecting the money they get from, and creating monopolies out of, Mickey Mouse and Batman.
I realize this newsletter may not be answering questions like “so how should we oppose AI pirating of books?” In the writer/author community I hang out in, opposing AI scraping of copyrighted books is the de facto position. And that may well be mine as well. But I also always hear the voice that wonders about the beauty of the public domain, that resists tightening laws around private property, and that balks at the control and commercialism that often lie at the heart of the logic of those who support strong copyright controls (even if they are not aware that this is the logic they are indeed drawing upon). I’m not speaking here of the specific companies mining data for AI, or the energy issues around it—I am firmly opposed to both of those! But the question of who owns, controls, and/or has a right to use books is one that is perennially complicated. And the 18th-century origins of copyright in the US reveal that complexity.
1. The first American novel to be copyrighted.
2. Also Carey didn’t pirate or steal those books, as there was no law preventing him from doing so.
3. A History of Book Publishing in the United States, Volume 1 by John Tebbel (Bowker 1972).
4. That we call it a “royalty” is not coincidental. American authors didn’t receive royalties until later in the 19th century; before then, publishers would pay authors a flat sum to publish, or a share of profits. This is when authors and publishers entered into the sorts of partnerships we now consider key to “traditional” publishing.
5. This quote, and the research following, from Copyrights and Copywrongs by Siva Vaidhyanathan.
I get where you're coming from here.....but for me, as someone who has a couple of books in the database that Meta's been using to train its AI, the current situation feels like the worst of both worlds. I don't love seeing a pirated version of one of my books showing up on a sketchy-looking ebook site (which has happened), but at least that means that someone's.....reading them. Maybe. I don't love the normalizing of not paying for art - I have too many musician friends for that - but I get it, up to a point.
Whereas in the case of Meta, there's literally a large corporation saying that it needs access to everything everyone has written so that it can train a product it's put absurd amounts of money into, but also that the value of that writing is zip, zilch, nada....
I don't like it, is what I'm saying.
Thanks Anne. What a circus the industry was. And still is, in a different way : ) In copyright, the thing to remember is this: Only your exact words in the exact order are absolutely protected. The rest is a gray area. Plots may well be vulnerable, so watch it, writers. If I may link my own post, I wrote about this kind of theft here https://richarddonnelly.substack.com/p/authors-own-it Thanks again Anne for your fine overview.