Arkansas State University

Plagiarism: Plagiarism and AI

Learn the basics of plagiarism.

Citing AI

If you use AI-generated work in your papers or research, you need to acknowledge it and most likely need to cite it!

Check out our AI Citation Guide!

Reusing AI Work IS Plagiarism

As you may remember from the first tab, plagiarism is using someone else's work without crediting the source. Even if you aren't technically stealing the content yourself, reusing AI-generated work without crediting it or noting that it came from AI still meets this definition of plagiarism.

Sample Generative AI Programs

ChatGPT

DALL-E

Bard

Bing Chat

HuggingChat

Perplexity AI

YouChat

Artificial Intelligence

We use artificial intelligence every day, often without knowing it. Artificial intelligence (AI) is computer programming that simulates human intelligence and learning. Search engines like Google, virtual assistants like Siri, self-driving cars, and automated stock trading are all examples of AI.

Recently, a new form of AI, called generative AI, has been developed. Generative AI can take raw data and make something new out of it. The most famous examples of generative AI are ChatGPT for text-based applications and DALL-E for image-based applications. When used as a tool, AI can be amazing. Students can use it to come up with paper topics, highlight relevant content, test their critical thinking skills, and make existing content more accessible for those with disabilities. The key is making sure you're using AI as a tool and not reusing AI-generated work for your assignments.

Plagiarism and Artificial Intelligence

While generative AI programs are incredibly innovative, they also come with some serious concerns. We'll only cover two here.

  • They may cite made-up resources. ChatGPT will readily create reading lists and works-cited pages for your papers, but it is limited to the data it has been trained on (meaning no library resources) and does not contain up-to-date information. To get around this, ChatGPT will completely make up citations to "support" the text it generates. Here in our library, we've already encountered multiple instances of faculty and students looking for books and articles that don't exist.
     
  • They may plagiarize resources. So where do ChatGPT, DALL-E, and other AI chatbots get their data? Generative AI programs require text and images to be "scraped" from existing resources and entered into their training datasets. In some cases, companies are transparent about where they get this material. In many cases, though, we don't know, and evidence shows that scraped resources may come from copyrighted works without the original creators' permission or knowledge. And it's not just major works: anything freely found online is targeted, including product reviews and public comments and posts. Further, these programs lack any way for authors or artists to have their works removed from the dataset. In fact, when you use these programs, they often scrape your queries into their datasets as well.


You may be wondering why plagiarism is a problem if generative AI is supposed to be reshaping scraped text and images into something new. The problem is that it's easy to purposely or accidentally prompt these programs into reusing other people's works, including creating art that imitates another artist's distinctive style or pulling large blocks of text from copyrighted sources, such as entire pages of books. This means that the "generative" and "new" works these AI programs create may actually be someone else's work. In addition, even when AI-generated works don't directly plagiarize, they often reuse ideas and core concepts from copyrighted works without crediting the original creators. This is still considered plagiarism.

Examples of Copyright Infringement in AI Datasets

In December 2023, The New York Times filed a lawsuit against OpenAI, the parent company of ChatGPT. Included in the lawsuit are 100 examples of ChatGPT "generating" responses nearly identical to New York Times articles. In the example below, the red text (approximately 97% of both articles) is text ChatGPT created that directly matches an article in The New York Times.

Screenshot showing ChatGPT-generated content nearly identical to a New York Times article.


In January 2024, Gary Marcus and Reid Southen demonstrated multiple instances of generative AI programs creating images that include, or are extremely similar to, copyrighted characters. For example, the prompt "yellow 3d cartoon character with goggles and overalls" in Midjourney created these Minion-like characters:

Screenshot of 4 yellow characters that appear extremely similar to Minions.