
AI and Copyright
Most AI tools rely on training data made up of vast quantities of human-generated content. Such content is intellectual property protected by copyright, an area of law that governs its use, reuse, and distribution, as well as the right to profit from it. Many of these tools use training data collected via web scraping, which extracts content from websites and databases published online. Datasets can also be built from content that an individual or organization owns or licenses.
Because copyrighted material is integral to the training and operation of LLMs and other AI tools, copyright has so far been the main area of law applied to govern their use. Since scholarship is itself a form of intellectual property, we should also be concerned with the implications of publishing materials composed, edited, or otherwise altered using AI.