OpenAI has responded to the recent lawsuit filed by The New York Times over the use of its articles to train AI. According to the company, the lawsuit came as a surprise. In an official statement, OpenAI set out its position in four points, one of which is devoted entirely to its relations with the publication. In it, the company says the lawsuit is without merit and suggests that the NYT manipulated AI tools to produce its examples.
The company says its negotiations with the publication had been constructive and centered on a partnership around ChatGPT: the NYT would gain a new way to connect with its readers, and OpenAI would gain access to its reporting. According to the statement, OpenAI had also explained to the publication that its content would not be impactful enough to matter for future training. The lawsuit, filed on December 27, came as a “surprise and disappointment” to the company. OpenAI recalls that The New York Times had mentioned seeing some of its content reproduced but declined to share examples, despite the company's commitment to investigate and fix any issues. OpenAI also points out that in June (2023 – ed.) it disabled a ChatGPT feature after learning that it could reproduce online content in unintended ways.
OpenAI notes that the reproduced passages reported by the NYT may come from years-old articles that have spread across multiple third-party websites. The company suggests that the publication either instructed the model to repeat the content or cherry-picked its examples from a large number of attempts. OpenAI also says it is continually making its systems more resistant to attacks aimed at extracting training data, and that it has already made progress.
Although it considers the lawsuit groundless, OpenAI says it still hopes for a constructive partnership with the NYT and for further cooperation with the media. The statement also addresses several other questions about training AI on publishers' content.
According to the statement, OpenAI aims to support a healthy news ecosystem, be a good partner and create mutually beneficial opportunities. The company is developing technologies to support news media, and its representatives meet with leading news organizations to explore opportunities, discuss concerns and propose solutions. The statement also says that OpenAI has already established partnerships with news organizations, including the Associated Press, Axel Springer and the American Journalism Project.

In the second point, OpenAI argues that training AI models on publicly available material is fair use. The company considers this critical for US competitiveness and notes that some countries have passed laws that permit training on copyrighted content. Even so, OpenAI offers publishers an opt-out process that prevents its tools from accessing their sites. The statement notes that The New York Times adopted the opt-out in August 2023.

The statement also explains a failure that can arise from training: memorization, in which a model repeats content it was trained on. According to OpenAI, this happens when particular content appears in the training data more than once. The company says it takes measures to limit inadvertent memorization, but it also expects users to act responsibly and points out that deliberately manipulating models is an inappropriate use of AI technology.
OpenAI says that, just as people receive a broad education in order to learn to solve new problems, it wants its AI models to observe the full range of the world's information, from every language, culture and industry. It adds that the aggregate of human knowledge is so large that no single source, including The New York Times, matters much for training the model.
What preceded it
The New York Times is suing OpenAI and Microsoft for copyright infringement. According to the publication, its articles were used to train artificial intelligence. The NYT seeks to hold the companies responsible for the “illegal copying and use of uniquely valuable works”. Its other demands include the destruction of chatbot models and training data that use its materials. At the same time, media reports said that OpenAI was offering some news outlets between $1 million and $5 million in exchange for the use of their news to train the AI models that underlie ChatGPT.