Can You Copyright AI-Generated Content?
Depending on the tools you use, you might not like the answer!
There's no doubt that artificial intelligence can be confusing. If you peek into any online forum about AI, you'll see a lot of heated discussions about what it is or isn't, and what it could or would never be able to do. Who owns the things that AI creates is also a matter of debate. And when it comes to content creation, writing does not get a pass. It too suffers from a copyright complex brought on by AI writing tools that don't respect or protect writers' rights.
To understand why, you have to know a little about how AI writing tools work.
In general, AI writing tools present writers with some form of text to use as part of the entire writing process. That text could be a phrase placed into a tooltip or a full-blown article displayed in a browser. A key part of the answer to whether you can copyright any of that text, therefore, lies in the source of that text.
Why?
Some AI Writing Tools Use Copyrighted Material without Permission
GPT, one of the Internet's most popular AI "text engines" at the moment, is the source of content for the writing tools that use it. It's an engine developed by OpenAI, which apparently built GPT with data from the Common Crawl dataset, a collection of copyrighted articles, internet posts, web pages, and books scraped from 60 million domains over a period of 12 years.
According to The Verge, GPT is used in more than 300 different apps already. And should you decide to use any one of them for your writing, it could put you at risk of not only plagiarizing, but also breaking the law.
Using Copyrighted Material without Permission Is Plagiarism
Fortunately, this copyrighted material has not gone unnoticed. Here, a Reddit user expresses concerns about plagiarism after he witnessed a GPT writing app spew out the company names of real businesses:
And here, a Reddit user expresses concerns about a GPT writing app giving accurate instructions (blocked out) on how to make an explosive:
The conversation that ensued devolved from whether OpenAI should even allow such a thing to whether these types of instructions already existed on the Internet -- with the general consensus being that yes, they did indeed already exist on the Internet. Further, these other folks seem to revel in the fun they had generating GPT content with J. K. Rowling's copyrighted text. (Click here for over 30 more GPT plagiarism examples.)
And Yes -- It Gets Even Worse!
The United States Copyright Office just ruled that non-human expression is ineligible for copyright protection. So for those who wish to simply copy and paste their way to publication, this new ruling means everything they auto-generate and auto-publish from GPT and GPT-similar tools becomes auto-public domain material.
THINK ABOUT THAT FOR A MINUTE
People who use a GPT program for content not only pay money to plagiarize; they also pay money to put other people's stolen content into the public domain. That's content that (1) didn't belong to them or GPT and (2) everyone now has 100% rights to use any way they wish without recognition or payment! I don't know about you, but to me, that takes the criminal act of copyright infringement to a whole new level!
Solutions
It's really disheartening to learn that by using GPT, there's a significant chance that writers will plagiarize someone else's work. On the one hand, they'll risk a DMCA defeat, risk having to pay royalties or profits that the original creator lost, or risk having to spend time in jail. On the other hand, they'll lose all rights and any claim to everything they publish. Looks like either way -- from service fees to court fees -- a GPT product will make them pay one way or another. That's why it's important to follow these three simple steps before deciding to use any AI writing tool (including my own, First Draft).
- Avoid GPT-based writing tools. That includes GPT-2, GPT-3, and any form of it that consults a dataset. As shown above, content generated from any GPT text generator tool creates a significant risk for plagiarism.
- Learn where a tool's text comes from. The data behind GPT isn't the only dataset that some AI writing tools use. Others might use databases from A.I. Wiki, the datasets mentioned above, or similar websites, which illegally distribute copyrighted content from:
- Reuters
- Penn
- Amazon
- Yelp
- Yahoo!
- A petabyte-scale crawl of the web
- Google Books
Content generated from any of the datasets at that site will create a risk for plagiarism as well.
- Check the sample output for plagiarism. When looking for an AI writing tool, check the sample output (if available) for plagiarism. Copy a random sentence or an "unusually circumstantial phrase" (like that one) and paste it into a Google search to see if it exists on a bunch of other websites. If it does, the risk for plagiarism, again, increases significantly.
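If you'd rather not copy and paste by hand, the sample-output check from the steps above can be roughed out in a few lines of Python. This is only a sketch under my own assumptions: the function names are hypothetical, the sentence splitting is a crude heuristic, and the search link assumes Google's usual https://www.google.com/search?q= address with the phrase wrapped in quotes for an exact-match search. It only builds the link; you still open it yourself and judge the results.

```python
import random
import re
from urllib.parse import quote_plus


def split_sentences(text):
    """Split text on ., ! or ? followed by whitespace (a rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def pick_sentence(text, min_words=6, rng=None):
    """Pick a random sentence that has at least min_words words."""
    rng = rng or random.Random()
    candidates = [s for s in split_sentences(text) if len(s.split()) >= min_words]
    if not candidates:
        raise ValueError("no sentence is long enough to search for")
    return rng.choice(candidates)


def exact_match_search_url(phrase):
    """Build a search URL with the phrase in double quotes (assumed URL format)."""
    return "https://www.google.com/search?q=" + quote_plus('"' + phrase + '"')


if __name__ == "__main__":
    # Hypothetical sample output pasted from an AI writing tool.
    sample = "Short one. This is a much longer sample sentence from a tool. Tiny."
    sentence = pick_sentence(sample)
    print(sentence)
    print(exact_match_search_url(sentence))
```

The min_words cutoff is there because very short sentences turn up on lots of websites by coincidence, so longer, more unusual sentences make for a more meaningful check.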
Another solution is to use an alternative AI writing tool like First Draft. First Draft is a seven-year-old, offline, Windows AI writing program that doesn't use GPT or any other sources of copyrighted text. All of the text that First Draft generates, in fact, comes from the developer, the dictionary, and you (the writer). Come get to know it if you've never heard of it before.