This specifically talks about AI data scrapers being an issue, and some general issues that are frankly not exclusive to open access info.
Exploitative companies are always a problem, whether it’s AI or not. But someone who uses the Wikipedia text torrents as a dataset, for example, isn’t doing any of what that article describes.
How does a model trained on an open dataset undermine free access? The dataset is still accessible, no?
“Wait, not like that”: Free and open access in the age of generative AI