
Horror: The 12th season of American Horror Story has a premiere date
Everyone had assumed the wait for the twelfth season of American Horror Story would last until 2024. (Popcorn, Disney)
Excel is still widely used to prepare data. But there is another, simpler way: the Python library Pandas. We offer help getting started. A guide by Antony Ghiroz (Software, Python)
Looking for an ergonomic office chair? The most popular desk chair is currently on sale and can be bought with a double discount. (Technology/Hardware, Amazon)
Older Flip and Fold models are getting features of the current models via an update. Samsung is also bringing new features to older smartwatches. (Samsung, Smartphone)
Several garden products are currently on special offer at Amazon. There is a particular discount on a pair of Gardena garden shears. (Technology/Hardware)
Manufacturing equipment from China cannot replace the missing foreign machines. The most important manufacturer apparently has to scale back its most advanced process. (Semiconductor manufacturing, Economy)
After Apple showed the Vision Pro, the competition has not been idle. Samsung wants to build a similar product for less money. (VR, Apple)
Danish anti-piracy group Rights Alliance has taken down the prominent “Books3” dataset, which was used to train high-profile AI models including Meta’s. A takedown notice sent on behalf of publishers prompted “The Eye” to remove the 37GB dataset of nearly 200,000 books, which it had hosted for several years. Copies continue to show up elsewhere, however.
From: TF, for the latest news on copyright battles, piracy and more.
Generative AI models such as ChatGPT have captured the imaginations of millions of people, offering a glimpse of what an AI-assisted future might look like.
There is little doubt that generative AI will lead to new breakthroughs, some with the potential to revolutionize many aspects of day-to-day life. At the same time, AI is causing grave concerns within the copyright industries.
The copyright angle is the topic of many debates and has already made its way to court in a few cases. It’s high on the agendas of governments around the world, which are poised to accommodate generative AI within copyright legislation.
While lawyers and lawmakers are working hard to explore this novel area, anti-piracy agencies are taking concrete action. A few weeks ago we reported that the RIAA had taken down datasets used to create voice models, for example.
This week, Rights Alliance entered the arena with one of the most high-profile takedowns thus far. The Danish anti-piracy outfit sent a DMCA takedown notice to The Eye, targeting the “Books3” training dataset.
Books3 doesn’t sound as exciting as ‘The Lord of the Rings’ or ‘A Song of Ice and Fire’, but these titles are likely included in the plaintext collection of 196,640 books, which is nearly 37GB in size.
The dataset, which contains all books from the pirate site Bibliotik, was first published on The Eye in late 2020 and since then has been used to train several AI models, including Meta’s.
The notion that AI models are trained on pirated books isn’t new. According to a recent lawsuit, which also mentions Books3, OpenAI also used books datasets that rightsholders believe were sourced from shadow libraries such as LibGen, Z-Library and Sci-Hub.
For several years, The Eye managed to keep the Books3 dataset online, but it recently removed the archive following Rights Alliance’s takedown notice.
The anti-piracy group acted on behalf of Danish book publishers whose works were featured in the database. They see this as an important step to limit access to unauthorized AI training materials, which can be exploited by commercial AI initiatives.
“It is absolutely crucial that we can prevent AI from being trained on illegal content,” Rights Alliance Director Maria Fredenslund says, commenting on the takedown.
“We have a big task ahead of us in detecting and taking down illegal training datasets like Books3, but also in dealing with AI that has already been trained on illegal content and is now spreading on the internet.”
Rights Alliance stresses that it should be up to rightsholders to control how their works are used, so the crackdown on unauthorized datasets will continue.
While the original and most widely circulated Books3 download link is offline now, the dataset hasn’t completely disappeared from the web. The file is still backed up by the Internet Archive’s Wayback Machine and alternative download links are also being shared.
Shawn Presser, who first shared the Books3 dataset on X years ago, points out that it is still available elsewhere. For example, Books3 is part of ‘The Pile’, an AI training dataset compiled by EleutherAI. A torrent for this dataset is still hosted on The Eye at the time of writing.
In addition, the Books3 dataset is also available from direct download sources. In this sense, it’s not much different from traditional pirated books and movies, which are hard to take down permanently.
This shows that AI doesn’t just promise new technological breakthroughs; it also adds a new task to the roster of anti-piracy groups.
Cybercriminals have increasingly been targeting LinkedIn accounts of late. Search queries made on Google confirm this trend. (LinkedIn, Microsoft)
The Ukrainian security service has openly admitted for the first time that sea drones were used to attack Russia's Kerch Bridge. (Military, Politics)