Terrible mistake. AI image generators trained on child pornography – study
December 22, 00:25
Significant problems were found in the LAION-5B dataset (Photo: Ales Nesetril / Unsplash)
Researchers at the Stanford Internet Observatory examined datasets used to train artificial-intelligence image-generation tools and found thousands of items of sensitive material.
The dataset in question is LAION-5B, a machine-learning dataset published by the non-profit organization LAION. It has been used to train Stable Diffusion and other companies' artificial-intelligence products.
In total, it contains more than 5 billion links to images along with their accompanying alt text. Most institutions in the United States are prohibited from viewing material that may contain child sexual abuse imagery (CSAM), even for verification purposes. To study the LAION-5B package, the researchers therefore relied on perceptual and cryptographic hash-based detection methods.
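For illustration, here is a minimal sketch of how such hash-based matching can work. The blocklist values and file names below are placeholders invented for this example; real systems such as PhotoDNA use proprietary algorithms and vetted hash lists maintained by child-protection organizations.

```python
import hashlib
from PIL import Image  # Pillow; assumed to be installed

def average_hash(path: str, hash_size: int = 8) -> int:
    """Perceptual hash: downscale to grayscale, threshold each pixel at the mean.
    Visually similar images (re-encoded, resized) yield similar bit patterns."""
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (p > mean)
    return bits

def hamming_distance(a: int, b: int) -> int:
    """Count differing bits; a small distance suggests a visual match."""
    return bin(a ^ b).count("1")

def sha256_of_file(path: str) -> str:
    """Cryptographic hash: matches only byte-identical files."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# Hypothetical blocklist entries (placeholder values, not real data):
KNOWN_BAD_PHASH = 0x0F1E2D3C4B5A6978
KNOWN_BAD_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

def check_image(path: str) -> None:
    """Flag an image if it matches the blocklist perceptually or exactly."""
    if hamming_distance(average_hash(path), KNOWN_BAD_PHASH) <= 5:
        print(path, "is a likely perceptual match")
    if sha256_of_file(path) == KNOWN_BAD_SHA256:
        print(path, "is an exact byte-level match")
```

The two methods complement each other: a cryptographic hash catches only exact duplicates of a known file, while a perceptual hash can still match copies that have been resized or re-compressed, which is why research of this kind typically combines both.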
According to the Stanford Internet Observatory report, the researchers found that the LAION-5B dataset contains “millions of pornographic images, images of violence and child nudity, racist memes, hate symbols, copyrighted art and works taken from private company websites.”
In total, 3,226 entries in the dataset were identified as suspected child sexual abuse content; 1,008 of them had already been flagged as CSAM by sources such as the Canadian Centre for Child Protection and the PhotoDNA filtering system. The researchers highlight that the presence of CSAM in a dataset can allow artificial-intelligence models trained on that data to create new, and even realistic, instances of CSAM.
Stability AI trained its Stable Diffusion AI model using LAION-5B. A Stability AI spokesperson told Bloomberg that the company used a filtered subset of the data from this set. In addition, the representative claims that the company's policies prohibit the use of its products to create or edit CSAM or for other illegal purposes.
At the same time, the publication notes that only the newer version of Stability AI's image-generation tool was trained on significantly filtered data. The earlier Stable Diffusion 1.5, which is still available online, did not have the same protections, the researchers found.
LAION told 404 Media that it is temporarily taking these datasets offline to ensure they are safe and will then republish them.