Reddit Is Angry at AI Companies
Reddit CEO Steve Huffman is taking a stand, calling out tech giants Microsoft, Anthropic, and Perplexity AI for using Reddit's data to train their AI models without permission or compensation.
Huffman didn't mince words, saying these companies treat online content as if it's "free for them to use" and calling the situation "a real pain in the ass" for Reddit.
Yolk of the Matter
- Reddit CEO's Criticism: Steve Huffman criticizes Microsoft, Anthropic, and Perplexity AI for using Reddit's data without permission or payment.
- Blocking Unauthorized Crawlers: Reddit is blocking unauthorized crawlers, while Google has signed a $60 million annual deal to use Reddit's data.
- Demand for Agreements: Huffman insists that companies negotiate terms for data access so Reddit keeps control over how its data is used.
- Financial Impact: Experts warn that paying for data could cost AI companies "hundreds of billions of dollars" in royalties.
The Great Data Scrape Showdown
Reddit has been working to block unauthorized web crawlers that siphon data from its site. While Google has played by the rules, signing a $60 million annual deal for Reddit's data, Microsoft, Anthropic, and Perplexity haven't followed suit. In fact, Reddit claims these companies have ignored its robots.txt file, the plain-text file a site publishes to tell crawlers which pages they may not access.
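For readers curious how a robots.txt check works in practice, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt contents, the crawler names (`LicensedBot`, `OtherBot`), and the `example.com` URL are all hypothetical and are not taken from Reddit's actual file; the helper names `build_parser` and `may_fetch` are likewise illustrative.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for illustration only.
# This is NOT Reddit's real file: one named crawler is allowed,
# every other crawler is asked to stay out entirely.
EXAMPLE_ROBOTS_TXT = """\
User-agent: LicensedBot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser object (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# A well-behaved crawler runs a check like this before every request.
for agent in ("LicensedBot", "OtherBot"):
    allowed = may_fetch(parser, agent, "https://example.com/r/python/")
    print(agent, "allowed" if allowed else "blocked")
```

Nothing in robots.txt enforces these rules; compliance is voluntary on the crawler's side. That is why a site that believes its file is being ignored has to fall back on actively blocking crawlers, as Reddit says it is doing.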
Drawing a Line in the Sand
Reddit isn't backing down. Huffman made it clear that any company wanting access to Reddit’s treasure trove of data needs to negotiate terms. He stressed the importance of controlling how Reddit's data is used and displayed, pointing out that without agreements, Reddit is left in the dark about where its data ends up. This lack of control has pushed Reddit to block companies that refuse to play by its rules.
The Bigger Picture
As the tug-of-war over data usage and copyright heats up in the AI industry, experts are warning that paying for data could cost companies "hundreds of billions of dollars" in royalties. The burning question now is: will AI firms change their ways to meet the demands of platforms like Reddit, or will they continue to scrape data without a second thought?