A group led by RSS co-creator Eckart Walther has launched a new protocol designed to standardize and scale licensing of online content for AI training. Backed by publishers like Reddit, Quora, Yahoo, and Medium, Really Simple Licensing (RSL) combines machine-readable terms in robots.txt with a collective rights organization, aiming to do for AI training data what ASCAP did for music royalties. However, it remains to be seen whether AI labs will agree to adopt it. TechCrunch reports: According to RSL co-founder Eckart Walther, who also co-created the RSS standard, the goal was to create a training-data licensing system that could scale across the internet. "We need to have machine-readable licensing agreements for the internet," Walther told TechCrunch. "That's really what RSL solves."
For years, groups like the Dataset Providers Alliance have been pushing for clearer collection practices, but RSL is the first attempt at a technical and legal infrastructure that could make it work in practice. On the technical side, the RSL Protocol lays out specific licensing terms a publisher can set for their content, whether that means AI companies need a custom license or to adopt Creative Commons provisions. Participating websites will include the terms as part of their "robots.txt" file in a prearranged format, making it straightforward to identify which data falls under which terms.
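To make the robots.txt mechanism concrete, here is a minimal Python sketch of how a crawler might look for licensing terms before collecting a page. The article does not give the actual syntax, so everything specific here is an assumption for illustration: the `License:` field name, the idea that it points to a separate terms file, the example URL, and the helper name `find_license_urls`.

```python
def find_license_urls(robots_txt: str) -> list[str]:
    """Return the values of every 'License:' line in a robots.txt body.

    The 'License' field name is an assumption made for this sketch;
    the article does not specify the prearranged format.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comment lines, and lines without a field separator.
        if not line or line.startswith("#") or ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "license" and value.strip():
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt content, not taken from any real site.
SAMPLE = """\
User-agent: *
Allow: /
# hypothetical pointer to machine-readable licensing terms
License: https://example.com/license.xml
"""

if __name__ == "__main__":
    print(find_license_urls(SAMPLE))
```

A crawler built along these lines would fetch the referenced terms and decide whether it may use the content. A file with no such line would simply yield an empty list.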
On the legal side, the RSL team has established a collective licensing organization, the RSL Collective, that can negotiate terms and collect royalties, similar to ASCAP for musicians or MPLC for films. As in music and film, the goal is to give licensees a single point of contact for paying royalties and provide rights holders a way to set terms with dozens of potential licensees at once. A host of web publishers have already joined the collective, including Yahoo, Reddit, Medium, O'Reilly Media, Ziff Davis (owner of Mashable and CNET), Internet Brands (owner of WebMD), People Inc., and The Daily Beast. Others, like Fastly, Quora, and Adweek, are supporting the standard without joining the collective.
The RSL Collective includes some publishers that already have licensing deals -- most notably Reddit, which receives an estimated $60 million a year from Google for use of its content as training data. There's nothing stopping companies from cutting their own deals within the RSL system, just as Taylor Swift can set special terms for licensing while still collecting royalties through ASCAP. But for publishers too small to negotiate their own deals, RSL's collective terms are likely to be the only option.